[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Whither delta combiner

From: Karl Fogel <kfogel_at_newton.ch.collab.net>
Date: 2002-02-09 05:29:13 CET

Hmmm. Okay, so skiplists are a normal space-time tradeoff? We store
more delta data, but we reach older revisions faster.

This sounds similar to a couple of other ideas that have been floating
around:

   1. "Checkpoint fulltexts" -- every N node revisions, store a
      fulltext instead of a delta, so that the maximum number of
      undeltifications to reach any given fulltext is never greater
      than N. Various change-sensitive heuristics can be used to set
      N; it doesn't have to be the same for every node, or even within
      every node. (I think I first pushed for this, primarily because
      I like saying "checkpoint fulltext" over and over and waiting
      for someone to ask what I mean by it.)

   2. Redeltify old node revisions against head, as they get further
      from head. (I think Branko first proposed this, probably
      because he got sick of me saying "checkpoint fulltext" over and
      over.)

Both proposals trade delta space for faster retrieval of old
revisions; haven't done a formal proof, but I'm pretty sure both
methods end up being exactly the same tradeoff, and I think it's about
the same as the skiplist tradeoff, too. (I haven't thought as much
about the skiplists, so this is just an off-the-cuff reaction.)

I don't see any particular reason to prefer one method over the
others, as far as performance goes, but checkpoint fulltexts (did I
mention how much I *love* saying that?) are probably the most natural
to implement. Easier to simply not make a delta, than to make more
deltas :-), and undeltification is the same algorithm as we currently
use -- namely, keep going until you hit a fulltext.

-Karl

Greg Hudson <ghudson@MIT.EDU> writes:
> Skiplist deltas, you mean? I'll give a simple description. Let's say
> we have a node with eight node-revisions (and no branching), and we only
> want to have one plaintext of that file--the most recent. Right now, I
> assume the node looks something like this (where "1" should really be
> "N.1" where N is the node number, but I'll abbreviate):
>
> 1 <-- 2 <-- 3 <-- 4 <-- 5 <-- 6 <-- 7 <-- 8
>
> To get the plaintext of rev 1, we need to apply seven deltas. With
> skiplist deltas, the node would look like:
>
> 4 <-------------------- 8
> 2 <-------- 4 <-------- 6 <-------- 8
> 1 <-- 2 <-- 3 <-- 4 <-- 5 <-- 6 <-- 7 <-- 8
>
> Now to get the plaintext of rev 1, we need to apply only three deltas (8
> -> 4 -> 2 -> 1), and there shouldn't be more than five deltas between
> any two revs (no more than four, looks like).

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:37:05 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.