[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Whither delta combiner

From: Branko Čibej <brane_at_xbc.nu>
Date: 2002-02-11 20:14:57 CET

Greg Hudson wrote:

>>The practical question then is, how would we change the repository
>>format to store skiplist deltas?
>I'm sorry, I was thinking about the wrong set of operations. We don't
>need skiplist deltas, and we don't need to change the repository format,
>and this should be an easy job. We go with Branko's "re-deltifying
>against the head" idea, but instead of doing it at constant intervals,
>we do it at exponential intervals. (I don't know which Branko had in
What I had in mind was doing this transformation lazily, when we access
a node. Once we have the node's plaintext, we can decide -- based on the
number of deltas we had to apply, the ratio between the size of the
current delta and the delta to head, or maybe some other heuristic --
whether or not to replace the node's contents. That way, the cost of a
commit isn't proportional to the number of the head node's predecessors.

> This gives us all the performance advantages I cited at less of
>a space penalty. I'll call this idea "skip-deltas."
>Right now, when we commit node-revision 100.N, we commit it as plaintext
>and deltify 100.(N-1) to point to it. We'll do the same thing except
>we'll potentially deltify some more node-revisions: 100.(N-2),
>100.(N-4), and so on. The number of additional node-revisions we modify
>is the number of 0s at the end of the binary representation of N
>(assuming we number N starting from 0; if we number starting from 1, use
>N-1). Following this rule, a chain of nine node-revisions should look
> 0 --------------------------------------------> 8
> 4 --------------------> 8
> 2 --------> 4 6 --------> 8
> 1 --> 2 3---> 4 5 --> 6 7 --> 8
>It takes three deltas to get from node-rev 1 to node-rev 8, which is the
>logarithmic behavior we were going for. We still suffer a space
>penalty, because those long arrows are generally more expensive than
>short ones should be, but it's much simpler than dealing with full
>skiplists. (And the space penalty is only a factor of 2, I think, since
>the average arrow length is 2.)
>Some notes on node branches
>I'm not sure we're doing a good job of dealing with branches right now.
>Based on the code Daniel was modifying, I think when we branch we are
>blindly re-deltifying the predecessor node to point into the branch.
>Which works, but doesn't necessarily yield good performance. What RCS
>does looks like:
> 1.1 --> 1.2 --> 1.3 --> 1-4 --> 1.5
> 1.3 <-- <-- --<
>Switching to backwards-branch behavior is pretty easy with or without
>skip-deltas. In the current world, if we noted that we were on a
>branch, then instead of deltifying the predecessor node to point at us,
>we would deltify ourself to point at the predecessor node. With
>skip-deltas, instead of deltifying a bunch of predecessors to point at
>us, we deltify ourself to point at an ancestor node some distance back,
>the distance being 2 to the power of the number of 0s at the end of the
>binary representation of the last component of our ID.
>Unfortunately, there is a performance pitfall with backwards-branch
>behavior in Subversion, because the "trunk" of a node is determined
>simply by whoever modifies it first. If I copy a directory, modify a
>few files in the copy, and never touch them again but instead continue
>to modify them heavily in the original, then those files' plaintexts
>will get further and further away from the "real" head revision. Not
>sure what to do about that. In the average case we'd probably still do
>better than the current behavior.
>To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
>For additional commands, e-mail: dev-help@subversion.tigris.org

Brane Čibej   <brane_at_xbc.nu>   http://www.xbc.nu/brane/
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:37:06 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.