[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Skip-deltas, for review

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2002-07-27 05:59:21 CEST

(Replies to several messages here.)

Branko and Glenn are making a mountain out of a molehill on commit
costs. The average number of redeltifications is 1.75, up from 1. The
largest number is log(N), but you get averaging over files as well as
over commits (unless you commit a lot of stuff precisely in sync with
each other).

And the first 32 commits to a file don't do skip-deltas, so you don't
approach that 1.75 average for a long time.

And redeltifications are only one of several things which go on during a
commit. As I observed in my timing tests, even without the noise of WC
overhead (i.e. using "svn load"), commit time only goes up by about 5%
with our repository, and no higher than about 17% in the extreme case
where we get massive benefits.

This just isn't going to be a problem.

> William Uther wrote:
> >Note that I've moved the "prevnode = node" inside the outer loop.

But "count" is not reset either. So we do 1 2 4 8 16 32 back. I'll add
a comment.

>> Remember that the average slowdown in commit speed is O(log(log(N))
>> - we do log N re-deltifications every log N steps. That's REALLY
>> small.

No, it's a constant factor.

>> One strange thing: I though skip-lists would cost more in terms of
>> space. I was thinking about a factor of 2 (hmm - or should that be N
>> logN?). Any thoughts on why you aren't seeing this?

  * This isn't a skiplist, since the only goal is to get from any given
node-revision to the head (or other plaintext). We only have to keep
the longest arrow pointing out of a node-revision.

  * The average arrow crosses 1.75 nodes (would be two, except I'm
skipping half of the second level).

  * A lot of files stabilize by the time they reach 32 node-revisions,
and those files experience no cost.

On Fri, 2002-07-26 at 23:31, Branko ╚ibej wrote:
> The second series makes more sense -- you don't want to redeltify
> successive revisions (1 and 2 in the first series).

A small issue. I meant to skip the second level altogether instead of
just half; this would reduce the 1.75 figures down to 1.5. I'll make
that change.

> However, I'd like to point out that there's a simpler way than skip
> deltas to speed up access to old versions, and keep commit times
> reasonably fast. Simply store every N'th revision of a file as a
> self-referent delta -- actually, a vdelta-compressed fulltext.

This is not as space-efficient as skip-deltas (by order of growth).
It's particularly bad for growing files like ChangeLogs.

To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Jul 27 05:59:58 2002

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.