On Feb 20, 2008 9:39 AM, Branko Čibej <brane_at_xbc.nu> wrote:
> Philipp Marek wrote:
> > Hello everybody,
> > AFAIK there's currently no delta for directories (because of performance
> > reasons).
> > Now I'm looking for cheap solutions to make the repository smaller.
> > Would it be possible to make a directory in the repository be a list of
> > blocks, so that only the blocks that have changed entries in them would have
> > to be duplicated?
> Doing that could be a problem IMHO because you'd suddenly have another
> indirection for every block; not optimal. There's another problem in
> that, IIRC, our directories aren't sorted, so every lookup is a linear
> search; these add up.
We do print hashes (such as directories) in sorted order. However, we
always parse the whole thing into an in-memory hash. This means that
each initial read of a directory takes linear time to parse, but
subsequent accesses to children take constant time.
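
A minimal Python sketch of that access pattern, for illustration only. The one-entry-per-line "name id" format and the function names are assumptions made up for this example; this is not the real FSFS on-disk format or any Subversion API.

```python
# Sketch only: a simplified, made-up "name id" line format (ids contain no
# spaces). This is NOT the real FSFS directory representation.

def serialize_directory(entries):
    """Write the entries out in sorted order, one per line."""
    return "".join(f"{name} {entries[name]}\n" for name in sorted(entries))


def parse_directory(serialized):
    """Parse every entry up front: linear in the number of entries."""
    entries = {}
    for line in serialized.splitlines():
        if not line:
            continue
        name, node_id = line.rsplit(" ", 1)
        entries[name] = node_id
    return entries


listing = serialize_directory({"trunk": "id-3", "branches": "id-1", "tags": "id-2"})
parsed = parse_directory(listing)   # one linear pass over the whole listing
print(parsed["tags"])               # later child lookups are hash lookups
```

The sorted order on output is never exploited on input here: the parser reads everything regardless, which is why the first read is linear.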
There is a cache of directory contents, but in my (not fully
benchmarked) opinion it is too small: it contains at most one
directory per revision. FSFS in general could have more in-memory
caching; it's such a pain to do cache management with APR though. I
keep planning to write an svn_cache (or apr_cache) module...
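
As a rough sketch of what a larger directory cache could look like: a small LRU cache that holds several parsed directories instead of one. The class name `DirCache`, the `loader` callback, and the string directory ids are all hypothetical; nothing here is an existing svn or APR interface.

```python
from collections import OrderedDict


class DirCache:
    """Tiny LRU cache of parsed directories, keyed by a (hypothetical) directory id."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader              # callable: dir_id -> dict of entries
        self._entries = OrderedDict()
        self.misses = 0

    def get(self, dir_id):
        if dir_id in self._entries:
            self._entries.move_to_end(dir_id)    # mark as most recently used
            return self._entries[dir_id]
        self.misses += 1
        parsed = self.loader(dir_id)             # the expensive linear parse
        self._entries[dir_id] = parsed
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)    # evict the least recently used
        return parsed


cache = DirCache(capacity=2, loader=lambda dir_id: {"README": dir_id + "/README"})
cache.get("r1/trunk")
cache.get("r1/trunk")    # served from the cache, no second parse
print(cache.misses)
```

The hard part alluded to above, tying the lifetime of cached objects to pools, is exactly what this sketch leaves out, since Python's garbage collector handles it.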
> Here's my suggestion: sort the directory entries in the repository, and
> switch on delta compression for directories. That should be a relatively
> easy change that doesn't require new logic in the server, and may
> actually be backwards compatible (not sure, but that should be easy to
> find out). Then, of course, do some performance tests and observe lookup
> times and memory usage.
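
To make the suggestion concrete, here is a hedged Python sketch of the two pieces: binary search over sorted entries, and a rough measure of how small a delta between two sorted listings can be. The helper names are made up, and `difflib` only stands in for a real delta algorithm; it is not what Subversion uses.

```python
import bisect
import difflib


def lookup(sorted_entries, name):
    """Binary search over (name, id) pairs sorted by name: O(log n)."""
    i = bisect.bisect_left(sorted_entries, (name,))
    if i < len(sorted_entries) and sorted_entries[i][0] == name:
        return sorted_entries[i][1]
    return None


def changed_entries(old, new):
    """Count entries a delta would have to carry (anything not matched as equal)."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


old = sorted((f"file{i:03d}", f"id-{i}") for i in range(100))
new = sorted(old + [("file050a", "id-new")])

print(lookup(new, "file050a"))
print(changed_entries(old, new))
```

With a stable sorted order, adding one entry to a 100-entry directory leaves the rest of the listing byte-for-byte identical, which is the property that would let deltas stay small.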
I would be horribly surprised if this change made performance better;
I would assume it would make it worse. Sure, it might make
repositories a little *smaller*, but disks are big; eh.
David Glasser | firstname.lastname@example.org | http://www.davidglasser.net/
Received on 2008-02-20 18:54:16 CET