[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: subversion's storing / diffing techniques

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2004-05-22 00:34:32 CEST

On Fri, 2004-05-21 at 17:51, Michael Bartholomäus wrote:
> Is there any documentation about the way subversion computes what has
> changed in a file?
> When requesting an old version of a file - how is it reconstructed?
> How does subversion reduce the amount of data it has to store?
> What is the actual storing technique used by subversion?
> What information is stored in the underlying Berkley DB?

We use an algorithm called vdelta (created by Phong Vo at AT&T, I
believe) to compute the differences between one file and another. We
apply the vdelta algorithm to successive corresponding "windows" of the
source and target file. The output of the algorithm is a sequence of
instructions which say how to reconstruct the target window by copying
ranges of bytes from the source window or from the already-constructed
part of the target window, or by inserting bits of "new data." The
instructions are then encoded in a format called svndiff.

That only answers part of the question, of course. Given that we have a
binary delta algorithm (which can also be used as a compression
algorithm, by taking the delta between the empty stream and the target),
how do we store the contents of a file across various changes? We use a
technique called "skip-deltas" so that it only takes O(lg(N)) deltas to
reconstruct any version of a file which has gone through N changes. I
happen to have just written a long piece of mail about that to the users
list, which is presumably stuck in moderation; check the archives
periodically for that. (Subject: "Re: Revision Differences").

Perhaps the easiest-to-retrieve documentation of the vdelta algorithm
lives in a long comment in
http://svn.collab.net/repos/svn/trunk/subversion/libsvn_delta/vdelta.c,
around line 125.

The svndiff format we use to encode vdelta instructions is documented in
http://svn.collab.net/repos/svn/trunk/notes/svndiff.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Sat May 22 00:35:14 2004

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.