Shouldn't it be quite simple to protect the 'jar' file during production? i.e.:
Create the archive file as 'something.x'
When finished, rename to '00001-00999.jar' (fsfs uses jar in preference to normal rev files)
Delete 00001-00999 ye olde rev files
This has to be better than keeping tens of thousands of small files in individual folders?
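A rough sketch of those three steps, in Python purely for illustration. The function name, the zip-format archive, and the assumption that each rev file is named by its revision number are all made up here, not anything fsfs does today. It also does nothing about readers that still have an old rev file open when the delete happens.

```python
import os
import zipfile

def pack_revs(rev_dir, first, last):
    """Pack rev files first..last into one archive, then delete the originals.

    Hypothetical helper: assumes each rev file in rev_dir is named by its
    revision number (e.g. "17").
    """
    final_name = os.path.join(rev_dir, "%05d-%05d.jar" % (first, last))
    temp_name = final_name + ".x"  # the 'something.x' work-in-progress name
    rev_paths = [os.path.join(rev_dir, str(r)) for r in range(first, last + 1)]

    # Step 1: build the archive under a temporary name, so nothing looking
    # for the final name can ever see a half-written file.
    with zipfile.ZipFile(temp_name, "w", zipfile.ZIP_STORED) as archive:
        for path in rev_paths:
            archive.write(path, arcname=os.path.basename(path))

    # Step 2: rename into place in a single step once the archive is complete.
    os.replace(temp_name, final_name)

    # Step 3: remove the old individual rev files.
    for path in rev_paths:
        os.remove(path)

    return final_name
```

Something like pack_revs("db/revs", 1, 999) would then leave a single 00001-00999.jar in place of the 999 individual files.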
-----Original Message-----
From: Greg Hudson [mailto:ghudson@MIT.EDU]
Sent: Fri 22/10/2004 20:28
To: JS.staff
Cc: dev@subversion.tigris.org
Subject: Re: possible win32 fsfs performance[Scanned]
On Fri, 2004-10-22 at 13:44, JS.staff wrote:
> Aren't old rev files read-only? Only altered when that particular revision is committed?
>
> I'm not sure about the 'filesystem that doesn't suck' stuff... Presumably even Linux (Peace Be Upon It) suffers from inefficiencies when storing tens of thousands of small files (compared to fewer big ones).
>
> My idea was to make the 'jar'-ing an occasional svnadmin routine, not something done daily or anything. Just for very old revisions.
Well, we have to be careful in case anyone is accessing the repository
while this agglomeration is occurring. Since we can't be certain when
the reader is done accessing a rev file, we don't know when it's safe to
delete them. (We could use the repository layer's recovery lock, which
is currently not used for anything, but I'm actually thinking of
throwing that out for FSFS repositories because it means
locking-crippled underlying filesystems can't be used for read-only
access.)
Also, later revs contain a lot of <rev, offset> pointers to earlier
revs. Updating those pointers at collection time would be a nightmare,
so those pointers need to remain valid. So we would need a fast way of
mapping a rev to a collection. I'm not sure how to do that without
imposing artificial constraints on what revs can be collected. (For
instance, if we say that all collections must be of revs numbered
10*N..10*N+9, then we could quickly figure out what collection a rev
belongs to.)
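
To illustrate the fixed-range constraint, here is a minimal sketch in Python. The collection size of 10 comes from the example above; the function names and the archive naming scheme are hypothetical and only there to show that the lookup becomes plain integer arithmetic, with no index to consult.

```python
COLLECTION_SIZE = 10  # assumed fixed size, matching the 10*N..10*N+9 example

def collection_for_rev(rev, size=COLLECTION_SIZE):
    """Return (first, last) of the fixed-size collection that holds rev."""
    first = (rev // size) * size
    return first, first + size - 1

def collection_name(rev, size=COLLECTION_SIZE):
    """Hypothetical archive name for the collection that holds rev."""
    first, last = collection_for_rev(rev, size)
    return "%05d-%05d.jar" % (first, last)
```

With this scheme, a <rev, offset> pointer to rev 1234 would resolve to the collection covering 1230..1239 without the pointer itself ever being rewritten. The cost is the constraint itself: a collection can only be made once every rev in its range exists.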
Received on Fri Oct 22 22:23:37 2004