
Deltifying directories on the server

From: Hyrum K Wright <hyrum_at_hyrumwright.org>
Date: Mon, 31 Jan 2011 22:29:16 -0600

Philip and I had an interesting conversation with some users this
evening, and I'm just archiving my brain dump here.

These users have a large repository with a large number of branches in
the /branches directory (~35k). We described the well-known
phenomenon in which directories aren't deltified on commit, and thus
cause the repository to have very large revisions, even when the
actual content changes are fairly small. This is due to bubble-up: a
change anywhere below /branches creates a new revision of each parent
directory, so the entire entry list of the /branches directory has to
be rewritten.

Philip recalled a time several years ago when he enabled directory
deltification, but the performance was awful, and we've never released
it. In our discussion, we mentioned that directory deltification may
perform better now, especially in light of the imminent merge of
the diff-bytes-optimizations branch. In the case of a bubble-up
directory modification, the prefix and suffix matching would simplify
the problem space, leaving a very small diff.
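
To make the prefix/suffix argument concrete, here is a rough sketch in
Python. It is not Subversion code: the "name id" line format, the
serialize() and trim_common() helpers, and the node ids are all made up
for illustration, and the entries are assumed to be written in sorted
order.

```python
def trim_common(old: bytes, new: bytes):
    """Return (prefix_len, suffix_len) of the bytes shared by old and new."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    # Stop before the suffix overlaps the prefix already matched.
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix


def serialize(entries: dict) -> bytes:
    # Hypothetical format: one "name id" line per entry, sorted by name.
    return "".join(f"{name} {entries[name]}\n" for name in sorted(entries)).encode()


# Roughly the size of the /branches directory described above.
branches = {f"branch-{i:05d}": f"node-{i}.r1" for i in range(35000)}
old = serialize(branches)

# A commit inside one branch bubbles up and changes only that entry's id.
branches["branch-17000"] = "node-17000.r2"
new = serialize(branches)

prefix, suffix = trim_common(old, new)
middle = len(new) - prefix - suffix
print(f"listing: {len(new)} bytes, left to diff after trimming: {middle} bytes")
```

Under these assumptions, only the bytes between the common prefix and
the common suffix would need to go through the real delta algorithm.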

The one catch in the above theory: if directory entry lists are stored
in a hash and serialized in an unordered manner, that would negate any
benefit prefix-scanning provides (and might be what caused the
horrific delta performance in the first place).
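
A quick way to see why ordering matters, again as a hypothetical Python
sketch rather than anything taken from the Subversion code: the same
set of entries is serialized twice, first in two different arbitrary
orders (standing in for hash iteration order), then in sorted order.
The line format and helper names are assumptions.

```python
import random


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Count the leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def serialize(entries: dict, order) -> bytes:
    # Hypothetical format: one "name id" line per entry, in the given order.
    return "".join(f"{name} {entries[name]}\n" for name in order).encode()


entries = {f"branch-{i:04d}": f"node-{i}.r1" for i in range(1000)}
names = list(entries)

# Two arbitrary orders stand in for two unordered hash serializations.
rng = random.Random(0)
order_a = names[:]
rng.shuffle(order_a)
order_b = names[:]
rng.shuffle(order_b)

unordered = common_prefix_len(serialize(entries, order_a),
                              serialize(entries, order_b))
ordered = common_prefix_len(serialize(entries, sorted(names)),
                            serialize(entries, sorted(names)))
print(f"common prefix, unordered: {unordered} bytes; sorted: {ordered} bytes")
```

Even with identical contents, the two unordered serializations share
almost nothing at the start, so there is nothing for a prefix scan to
trim. Sorting the entries before writing them out restores the match.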

Anyway, that was the kernel of our discussion. I haven't dug around
in the code to determine how much of it is true or not, but if anybody
wants something to do, this might be interesting.

-Hyrum
Received on 2011-02-01 05:29:56 CET

This is an archived mail posted to the Subversion Dev mailing list.
