Re: Best practise for long term repository size management

From: Daniel Berlin <dberlin_at_dberlin.org>
Date: 2003-05-14 06:53:51 CEST

On Tuesday, May 13, 2003, at 10:06 PM, Daniel Patterson wrote:

> On Wed, 2003-05-14 at 11:54, cmpilato@collab.net wrote:
>> Switch to RTF instead of native Word docs? At least you have a
>> fighting chance of worthwhile deltification. :-)
>
> *sigh* that's kind of what I figured. Has anyone investigated using
> xdelta/xdelta2 to do binary diffs (although I'm not sure that it'd help
> for most binary formats)....

It won't really help for any format.
Our encoding is the output of a vdelta algorithm, written in what is
basically a VCDIFF subset.
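
For anyone unfamiliar with this family of encodings, here is a toy
sketch of the general copy/insert idea that vdelta, xdelta and VCDIFF
all share. It is not Subversion's vdelta code and not the svndiff
format: the names make_delta and apply_delta, the tuple layout of the
operations, and the 4-byte block size are all assumptions made up for
illustration.

```python
def make_delta(source: bytes, target: bytes, block: int = 4):
    """Greedy copy/insert delta. A toy, not Subversion's vdelta."""
    # Index every block-sized window of the source by its content,
    # keeping the first offset at which each window appears.
    index = {}
    for i in range(len(source) - block + 1):
        index.setdefault(source[i:i + block], i)

    ops = []                # ("copy", offset, length) or ("insert", data)
    pending = bytearray()   # literal bytes not yet emitted
    pos = 0
    while pos < len(target):
        start = index.get(target[pos:pos + block])
        if start is None:
            # No match in the source: this byte must be sent literally.
            pending.append(target[pos])
            pos += 1
            continue
        # Extend the match for as long as source and target agree.
        length = block
        while (start + length < len(source)
               and pos + length < len(target)
               and source[start + length] == target[pos + length]):
            length += 1
        if pending:
            ops.append(("insert", bytes(pending)))
            pending.clear()
        ops.append(("copy", start, length))
        pos += length
    if pending:
        ops.append(("insert", bytes(pending)))
    return ops


def apply_delta(source: bytes, ops) -> bytes:
    """Rebuild the target from the source plus the delta operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)
```

The delta is only small when the target shares long byte runs with the
source. Formats that compress or reshuffle their contents on every save
leave few such runs, so the delta degenerates into mostly "insert"
operations no matter which matching algorithm is used.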

In fact, xdelta3 uses an xdelta-style algorithm that will do better
(though I can't imagine more than maybe 10% better) but will be slower,
and it outputs a real VCDIFF-based encoding, which will be a bit more
compact.

There is an issue tracking svndiff version 1, which I wrote. It made
up just about all of the difference by doing the VCDIFF-style address
encoding and range-encoding compression of the strings data (it's been
so long that I might not remember the exact details of what we are
range encoding anymore). The remaining VCDIFF encoding pieces aren't
worth the cost for our purposes, and they would complicate our code
considerably.
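
As a rough illustration of what VCDIFF-style address encoding buys,
here is a minimal sketch based on my reading of the VCDIFF spec (RFC
3284), not on the svndiff 1 code. It covers only the two simplest
address modes, SELF (absolute) and HERE (distance back from the current
position), and leaves out the "near" and "same" caches. The function
names varint and encode_address are made up for the example.

```python
def varint(n: int) -> bytes:
    """Base-128 integer, most significant group first.

    The high bit is set on every byte except the last one.
    """
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))


def encode_address(addr: int, here: int):
    """Pick whichever of SELF or HERE encodes the copy address in fewer bytes.

    Assumes 0 <= addr < here, i.e. the copy refers to data that comes
    before the current position in the combined address space.
    """
    self_bytes = varint(addr)          # SELF: the address itself
    here_bytes = varint(here - addr)   # HERE: distance back from `here`
    if len(here_bytes) < len(self_bytes):
        return ("HERE", here_bytes)
    return ("SELF", self_bytes)
```

A copy from just behind the current position in a large file costs one
byte in HERE mode instead of three or four in SELF mode, which adds up
when a delta is mostly short copies.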

Algorithmically, you aren't likely to get more than 10% smaller diffs
using the xdelta algorithm. I remember running all kinds of tests
against it and vdelta when working on svndiff 1.

Received on Wed May 14 06:54:42 2003
