[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Whither delta combiner

From: Daniel Berlin <dan_at_dberlin.org>
Date: 2002-02-11 18:42:08 CET

On 11 Feb 2002, Karl Fogel wrote:

> Greg Hudson <ghudson@MIT.EDU> writes:
> > Skiplist deltas: The bottom row of N adjacent-rev deltas takes D*N
> > space, of course. The next row up also takes D*N space; there are half
> > as many deltas but they are each twice as big. And so on; since there
> > are lg(N) rows, we need DNlg(N) space.
>
> It's the part about there being lg(N) rows that puzzles me. Sorry to
> be so dense, I feel as though I must be missing something here...
>
> Earlier, you wrote
>
> > Skiplist deltas, you mean? I'll give a simple description. Let's say
> > we have a node with eight node-revisions (and no branching), and we only
> > want to have one plaintext of that file--the most recent. Right now, I
> > assume the node looks something like this (where "1" should really be
> > "N.1" where N is the node number, but I'll abbreviate):
> >
> > 1 <-- 2 <-- 3 <-- 4 <-- 5 <-- 6 <-- 7 <-- 8
> >
> > To get the plaintext of rev 1, we need to apply seven deltas. With
> > skiplist deltas, the node would look like:
> >
> > 4 <-------------------- 8
> > 2 <-------- 4 <-------- 6 <-------- 8
> > 1 <-- 2 <-- 3 <-- 4 <-- 5 <-- 6 <-- 7 <-- 8
> >
> > Now to get the plaintext of rev 1, we need to apply only three deltas (8
> > -> 4 -> 2 -> 1), and there shouldn't be more than five deltas between
> > any two revs (no more than four, looks like).
>
> If the number of rows were truly lg(N), then the minimum number of
> delta applications required to travel back a distance X would grow
> without bound... Oh, wait, I see it! You jump from row to row,
> narrowing down to the granularity you need at any given point. D'oh,
> I knew that. :-)
>
> Is it that we force the number of skiplist rows per node to be lg(N),
> and then within each row, space the skips evenly (or as evenly as
> integrally possible), such that the skip distance in each row is half
> or twice that of the rows above and below, respectively? Of course,
> one needs to make sure that the "destination" nodes coincide as often
> as possible from row to row.
>
> Okay, with my head cleared, I reread your explanation below and it
> makes sense, even though somehow counterintuitive to me. Thanks.
>

One of the more interesting things about skiplists is that you never need
to jump backwards, like you would in a regular binary search.
Only forwards and down.

> The practical question then is, how would we change the repository
> format to store skiplist deltas?

You store a level number, and <level number> "pointers".

However, checkpoint plaintexts was trivial by comparison.

Here's a checkpoint plaintext hack (note hack, not patch, as i'm sure this
isn't the right place to do it, just the only place that seemed to
intersect everything). Note the overloaded use of is_dir so that we
don't execute the while loop for dirs. Hopefully that kind of obsfucation
prevents anyone from thinking this is more than a hack.

Index: ./deltify.c
===================================================================
--- ./deltify.c
+++ ./deltify.c Sun Feb 10 18:41:25 2002
@@ -393,13 +393,31 @@
 {
   svn_fs_id_t *predecessor_id = svn_fs_predecessor_id (id, trail->pool);
   int is_dir = 0;
+ int dist = 0;
   dag_node_t *node;
-
+ svn_fs_id_t *root_id = id;
   SVN_ERR (svn_fs__dag_get_node (&node, fs, id, trail));
   is_dir = svn_fs__dag_is_directory (node);
+ while (!is_dir)
+ {
+ if (svn_fs_predecessor_id (root_id, trail->pool) != NULL)
+ root_id = svn_fs_predecessor_id (root_id, trail->pool);
+ else
+ break;
+ }
+
+ dist = svn_fs_id_distance (root_id, id);
+ if (!is_dir)
+ printf ("Distance to root is %d\n", dist);
+ if (predecessor_id != NULL && (is_dir || dist % 10 != 0))
+ {
+ SVN_ERR (deltify (predecessor_id, id, fs, is_dir ? 1 : 0,
trail));
+ }
+ else if (!is_dir && predecessor_id != NULL && (dist % 10 == 0))
+ {
+ SVN_ERR (undeltify (id, fs, trail));
+ }

- if (predecessor_id != NULL)
- SVN_ERR (deltify (predecessor_id, id, fs, is_dir ? 1 : 0, trail));

   return SVN_NO_ERROR;
 }

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat Oct 21 14:37:06 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.