On Nov 27, 2008, at 11:49 AM, Julian Foad wrote:
> On Thu, 2008-11-27 at 03:17 -0800, Blair Zajac wrote:
>> Hyrum K. Wright wrote:
>>> Hi all.
>>>
>>> As of r34446, the implementation of packing on fsfs is functionally
>>> complete on the fsfs-pack branch. For those that don't know, packing
>>> consists of mushing all the individual rev files in a completed shard
>>> into one file, thus eliminating the inode penalty for that entire
>>> shard. Packing a trunk-generated copy of the ASF repository saved
>>> about 1 GB on a 24 GB repo. There may be additional performance
>>> benefits in dealing with a much smaller set of rev files (OS caching,
>>> etc.), but I haven't yet investigated that.
>>>
>>> This comes at a cost: the offsets of revisions in the pack file are
>>> stored separately, and thus require an additional open/seek to get
>>> that information. Determining whether a revision is stored in a pack
>>> file or not also requires additional I/O. I think that most of this
>>> can be eliminated with caching and heuristics, but those haven't yet
>>> been implemented.
>>>
>>> I'm not currently planning on including this functionality in 1.6,
>>> as it's kind of a biggish feature, the optimizations aren't yet in
>>> place, and I feel like merging right before we branch 1.6.x could be
>>> a bit destabilizing. However, I could easily be talked into it. :)
>>>
>>> Anyway, I'm soliciting feedback on the implementation and usage of
>>> this feature. Comments welcome.
>>
>> This is something I'll definitely need. We're going to be having
>> multiple repositories each with a million revisions, so having fsfs
>> packing will make the repository much easier to work with. Also, any
>> fsck's will be much faster :)
>>
>> So +1 for merging into trunk from me.
>
> Do we have any cost/benefit numbers to demonstrate the "much easier to
> work with", or a qualitative description that you could point me to? I
> couldn't find anything by searching for "pack" or "inode" in the mail
> archives.

No, not directly. It's just well known that anything that doesn't need
to look inside the files (hot backups, tars, etc.) runs faster on a few
large files than on many small ones.

Blair
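
For readers unfamiliar with the idea, here is a minimal sketch of the
scheme Hyrum describes above: concatenate a shard's rev files into one
pack file and keep the starting offsets in a separate manifest. It is
not the fsfs-pack branch's code or on-disk format; the function names
(pack_shard, read_packed_rev), the one-offset-per-line manifest, and
the file layout are all assumptions made for illustration.

```python
import os


def pack_shard(shard_dir, revs, pack_path, manifest_path):
    """Concatenate the rev files named by `revs` into a single pack file.

    Hypothetical layout: each rev file is shard_dir/<rev>; the manifest
    holds one decimal start offset per line, in the same order as `revs`.
    """
    offsets = []
    with open(pack_path, "wb") as pack:
        for rev in revs:
            # Record where this revision starts inside the pack file.
            offsets.append(pack.tell())
            with open(os.path.join(shard_dir, str(rev)), "rb") as rev_file:
                pack.write(rev_file.read())
    with open(manifest_path, "w") as manifest:
        for offset in offsets:
            manifest.write(f"{offset}\n")
    return offsets


def read_packed_rev(pack_path, manifest_path, index):
    """Return the bytes of the index-th revision stored in the pack."""
    # This extra open/read of the manifest is the additional I/O that
    # the post suggests could be hidden by caching.
    with open(manifest_path) as manifest:
        offsets = [int(line) for line in manifest]
    start = offsets[index]
    # The last revision runs to end of file; read(-1) reads to EOF.
    if index + 1 < len(offsets):
        length = offsets[index + 1] - start
    else:
        length = -1
    with open(pack_path, "rb") as pack:
        pack.seek(start)
        return pack.read(length)
```

Reading a packed revision costs one manifest lookup plus one seek,
instead of a single open of a per-revision file. In exchange, the
whole shard occupies one inode rather than one per revision.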
Received on 2008-11-27 23:45:33 CET