
Re: Making the repository smaller

From: David Glasser <glasser_at_davidglasser.net>
Date: Wed, 20 Feb 2008 09:54:04 -0800

On Feb 20, 2008 9:39 AM, Branko Čibej <brane_at_xbc.nu> wrote:
> Philipp Marek wrote:
> > Hello everybody,
> >
> > AFAIK there's currently no delta for directories (because of performance
> > reasons).
> > Now I'm looking for cheap solutions to make the repository smaller.
> >
> >
> > Would it be possible to make a directory in the repository be a list of
> > blocks, so that only the blocks that have changed entries in them would have
> > to be duplicated?
> >
>
> Doing that could be a problem IMHO because you'd suddenly have another
> indirection for every block; not optimal. There's another problem in
> that, IIRC, our directories aren't sorted, so every lookup is a linear
> search; these add up.

We print hashes (such as directories) in sorted order. However, we
always parse the whole thing. This means that each initial read of a
directory takes linear time to parse, but subsequent accesses to
children take constant time.
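
A rough Python sketch of that read pattern, for illustration only. The
"kind id name" line format and the function names are made up for this
example and are not the real FSFS representation or API:

```python
def dump_directory(entries):
    """Write entries sorted by name, one 'kind id name' line each."""
    return "".join(f"{kind} {node_id} {name}\n"
                   for name, (kind, node_id) in sorted(entries.items()))


def parse_directory(text):
    """Parse every line into a dict; cost is linear in the entry count."""
    entries = {}
    for line in text.splitlines():
        if not line:
            continue
        # The name comes last so that it may itself contain spaces.
        kind, node_id, name = line.split(" ", 2)
        entries[name] = (kind, node_id)
    return entries


listing = dump_directory({
    "trunk": ("dir", "2.0.r5"),
    "README": ("file", "1.0.r1"),
    "branches": ("dir", "3.0.r4"),
})
parsed = parse_directory(listing)   # whole directory is parsed once
print(parsed["README"])             # later child lookups are hash lookups
```

The sorted output order plays no part in the lookup here; it only
matters for how the listing is written.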

There is a cache of directory contents, but in my (not fully
benchmarked) opinion it is too small: it contains at most one
directory per revision. FSFS in general could have more in-memory
caching; it's such a pain to do cache management with APR though. I
keep planning to write an svn_cache (or apr_cache) module...
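
To show the kind of thing such a module might do, here is a minimal LRU
sketch in Python. It is not an existing svn_cache or apr_cache
interface; the DirCache class, its (revision, path) key, and fake_load
are all assumptions for the example:

```python
from collections import OrderedDict


class DirCache:
    """Tiny LRU cache of parsed directories, keyed by (revision, path)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()
        self.misses = 0

    def get(self, revision, path, load):
        key = (revision, path)
        if key in self._items:
            self._items.move_to_end(key)     # mark as most recently used
            return self._items[key]
        self.misses += 1
        entries = load(revision, path)       # the expensive full parse
        self._items[key] = entries
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used
        return entries


def fake_load(revision, path):
    # Hypothetical stand-in for reading and parsing a directory from disk.
    return {"README": ("file", f"1.0.r{revision}")}


cache = DirCache(capacity=2)
cache.get(5, "/trunk", fake_load)
cache.get(5, "/branches", fake_load)
cache.get(5, "/trunk", fake_load)   # hit: two directories of r5 both fit
print(cache.misses)                 # 2
```

With room for more than one directory per revision, the third call
above is served from memory instead of triggering another parse.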

> Here's my suggestion: sort the directory entries in the repository, and
> switch on delta compression for directories. That should be a relatively
> easy change that doesn't require new logic in the server, and may
> actually be backwards compatible (not sure, but that should be easy to
> find out). Then, of course, do some performance tests and observe lookup
> times and memory usage.

I would be horribly surprised if this change made performance better;
I would assume it would make it worse. Sure, it might make
repositories a little *smaller*, but disks are big; eh.
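
A toy Python sketch of that trade-off, under loud assumptions: a naive
per-entry forward delta chain over in-memory dicts, not svndiff or
anything Subversion actually does, and a made-up workload that changes
one entry per revision:

```python
def make_delta(old, new):
    """Naive delta: names removed, plus entries added or changed."""
    removed = sorted(name for name in old if name not in new)
    changed = {name: value for name, value in new.items()
               if old.get(name) != value}
    return removed, changed


def apply_delta(old, delta):
    """Rebuild the newer directory from the older one plus a delta."""
    removed, changed = delta
    result = {name: value for name, value in old.items()
              if name not in removed}
    result.update(changed)
    return result


# r1 has 1000 entries; each of r2..r5 changes exactly one of them.
revisions = [{f"file{i:04d}": "r1" for i in range(1000)}]
for rev in range(2, 6):
    newer = dict(revisions[-1])
    newer[f"file{rev:04d}"] = f"r{rev}"
    revisions.append(newer)

deltas = [make_delta(a, b) for a, b in zip(revisions, revisions[1:])]

# Space: full copies store every entry of every revision.
full_entries = sum(len(r) for r in revisions)
delta_entries = len(revisions[0]) + sum(len(r) + len(c) for r, c in deltas)

# Time: reading the newest revision now means replaying each delta.
head = revisions[0]
for delta in deltas:
    head = apply_delta(head, delta)
print(full_entries, delta_entries, head == revisions[-1])
```

In this toy the delta form stores far fewer entries, but every read of
the newest directory does extra reconstruction work on top of the
parse, which is the cost being weighed against the disk savings.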

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
Received on 2008-02-20 18:54:16 CET

This is an archived mail posted to the Subversion Dev mailing list.
