
Re: Making the repository smaller

From: Branko Čibej <brane_at_xbc.nu>
Date: Wed, 20 Feb 2008 18:39:49 +0100

Philipp Marek wrote:
> Hello everybody,
>
> AFAIK there's currently no delta for directories (because of performance
> reasons).
> Now I'm looking for cheap solutions to make the repository smaller.
>
>
> Would it be possible to make a directory in the repository be a list of
> blocks, so that only the blocks that have changed entries in them would have
> to be duplicated?
>

Doing that could be a problem IMHO because you'd suddenly have another
indirection for every block; not optimal. There's another problem in
that, IIRC, our directories aren't sorted, so every lookup is a linear
search; these costs add up.
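
To illustrate the lookup cost, here is a minimal Python sketch comparing a
linear scan over an unsorted entry list with a binary search over a sorted
one. The entry names, node ids and function names are hypothetical and say
nothing about how the repository actually stores directories.

```python
import bisect


def linear_lookup(entries, name):
    """Scan a list of (name, node_id) pairs; O(n) comparisons per lookup."""
    for entry_name, node_id in entries:
        if entry_name == name:
            return node_id
    return None


def sorted_lookup(names, node_ids, name):
    """Binary-search parallel lists sorted by name; O(log n) per lookup."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return node_ids[i]
    return None


# Hypothetical directory with 302 entries (made-up names and node ids).
entries = [(f"file{i:03d}.conf", f"node-{i}") for i in range(302)]

# Build the sorted representation once; lookups then need no full scan.
sorted_entries = sorted(entries)
names = [n for n, _ in sorted_entries]
node_ids = [v for _, v in sorted_entries]

found_linear = linear_lookup(entries, "file150.conf")
found_sorted = sorted_lookup(names, node_ids, "file150.conf")
```

Both calls return the same node id; the difference is only in how many
entries have to be compared on the way there.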

The performance problem dates from before we had the delta combiner,
before we had skip deltas, and before we had xdelta+svndiff1. I
strongly suspect that these three changes make a significant difference.

Here's my suggestion: sort the directory entries in the repository, and
switch on delta compression for directories. That should be a relatively
easy change that doesn't require new logic in the server, and may
actually be backwards compatible (not sure, but that should be easy to
find out). Then, of course, do some performance tests and observe lookup
times and memory usage.
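
A rough Python sketch of why sorting helps the delta: if entries are always
serialized in the same order, changing one entry leaves long identical runs
that a delta can copy from the previous version. This uses difflib's line
matching as a stand-in; it is not xdelta or svndiff, and the serialization
format, names and ids are made up for illustration.

```python
import difflib


def serialize(entries):
    """Render a directory as one 'name node-id' line per entry, sorted by name."""
    return [f"{name} {node_id}\n" for name, node_id in sorted(entries.items())]


def line_delta(old_lines, new_lines):
    """Return copy/insert instructions that rebuild new_lines from old_lines."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        else:
            # replace/insert carry the new lines; delete carries an empty list
            ops.append(("insert", new_lines[j1:j2]))
    return ops


def apply_delta(old_lines, ops):
    """Rebuild the new version from the old lines plus the instructions."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(old_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def literal_bytes(ops):
    """Count the bytes that must be stored literally (not copied)."""
    return sum(len(line.encode()) for op in ops if op[0] == "insert" for line in op[1])


# Hypothetical directory with 302 entries; one entry changes between versions.
old = {f"file{i:03d}.conf": f"r1/{i}" for i in range(302)}
new = dict(old)
new["file150.conf"] = "r2/0"

old_lines, new_lines = serialize(old), serialize(new)
ops = line_delta(old_lines, new_lines)
full_size = sum(len(line.encode()) for line in new_lines)
delta_literal = literal_bytes(ops)
```

Here only the one changed line has to be stored literally; everything else is
a copy from the previous version. Real numbers for lookup time and memory
would still have to come from the performance tests mentioned above.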

> These blocks could additionally be stored compressed - although I'm not sure
> whether that's necessary (or good).
>
>
> I just tested with my /etc; this has 302 entries in it, and if I change one
> file in a subdirectory (where the delta in the revision file consists of ~40
> bytes) the full revision has ~14kB, most of them the directory entries
> in /etc.
>
> Additionally it would be nice to get some compression for often used
> properties, i.e. properties where name and value are shared between a lot of
> entries.
>

That's a different issue and would require some kind of content-indexed
approach. I really don't think it's worthwhile to try that with the
current repository schema; the necessary changes would be huge.

-- Brane

Received on 2008-02-20 18:40:15 CET

This is an archived mail posted to the Subversion Dev mailing list.
