Re: Bigger --deltas dump with 1.7.5 than with 1.6.17

From: Stefan Fuhrmann <eqfox_at_web.de>
Date: Fri, 22 Jun 2012 00:18:50 +0200

Vincent Lefevre wrote:
> On 2012-06-20 02:48:23 +0200, Vincent Lefevre wrote:
>> On 2012-06-19 19:41:51 +0300, Daniel Shahaf wrote:
>>> I assume that the binary svndiff chunks are different, right?
>> Yes, the binary svndiff chunks are different and have the declared
>> size. But why is 1.6.17 better than 1.7.5?

One point is not obvious to users: deltification
uses a binary "xdelta" algorithm, not "diff". It does
not produce minimal deltas, but it is very fast at
creating reasonably small ones.

In 1.7, the code was simplified and tuned for speed.
That may affect the compression ratio, but the
difference should be only a few bytes, if there is
any impact at all.

In 1.8, the code received a few improvements that
save a few bytes without hurting performance.
The result may be a smaller dump than in 1.6.
If you have the opportunity, please run the /trunk
code and post your results here.

> And the problem I've already mentioned concerning previous versions
> is still there:
>
> $ svn diff -c3876 file:///home/vlefevre/private/svn-mpfr > mpfr-diff
> $ svnadmin dump --incremental --deltas ~/private/svn-mpfr -r3876 > mpfr-dump
> * Dumped revision 3876.
> $ ll mpfr-diff mpfr-dump
> -rw-r--r-- 1 vlefevre vlefevre 2590 2012-06-21 15:03:53 mpfr-diff
> -rw-r--r-- 1 vlefevre vlefevre 9333 2012-06-21 15:03:54 mpfr-dump
>
> where the dump contains data that haven't changed, thus are useless
> in a dump.
>
> Or you can try directly:
>
> $ svn diff -c3876 svn://scm.gforge.inria.fr/svnroot/mpfr > mpfr-diff
> $ svnrdump dump --incremental -r3876 svn://scm.gforge.inria.fr/svnroot/mpfr > mpfr-dump
> * Dumped revision 3876.
> $ ll mpfr-diff mpfr-dump
> -rw-r--r-- 1 vlefevre vlefevre 2590 2012-06-21 15:13:54 mpfr-diff
> -rw-r--r-- 1 vlefevre vlefevre 9210 2012-06-21 15:13:24 mpfr-dump

That effect is easy to explain. Again, xdelta is not diff,
so a size difference is not in itself an indication of
a bug.

xdelta uses fixed-size 100 kByte deltification windows.
The Changelog file in question is >400k, i.e. 4+ windows.
You insert about 2k at the beginning of the file, shifting
the older content by a similar distance. At the beginning
of each delta window, those 2+k have no deltification
partner, because the old data they would match now belongs
to the previous window. Expected delta size: > 4 x 2 kBytes.
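
The window arithmetic can be reproduced with a small Python
sketch. This is a toy model, not the Subversion code: it
assumes each target window is only matched against the source
bytes at the same offsets, reads "100kByte" as 100 * 1024, and
uses a hypothetical 16-byte block matcher (BLOCK,
literal_bytes and windowed_literals are made-up names). The
file content is random data of roughly the sizes mentioned
above, so no zlib-style compression of the literals is modeled.

```python
import random

WINDOW = 100 * 1024  # assumed reading of the "100kByte" window size
BLOCK = 16           # hypothetical match granularity of this toy matcher


def literal_bytes(source, target):
    """Count target bytes that cannot be copied from the source window."""
    # Index the source window by block-aligned chunks.
    index = {source[o:o + BLOCK]: o
             for o in range(0, len(source) - BLOCK + 1, BLOCK)}
    literals = 0
    pos = 0
    while pos < len(target):
        src = index.get(target[pos:pos + BLOCK])
        if src is None:
            # No copy source for this byte: it must be stored literally.
            literals += 1
            pos += 1
            continue
        # Extend the match for as long as both windows agree.
        while (pos < len(target) and src < len(source)
               and target[pos] == source[src]):
            pos += 1
            src += 1
    return literals


def windowed_literals(old, new):
    """Per-window literal counts, pairing windows at equal offsets."""
    return [literal_bytes(old[o:o + WINDOW], new[o:o + WINDOW])
            for o in range(0, len(new), WINDOW)]


rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(420_000))    # ">400k" file
inserted = bytes(rng.getrandbits(8) for _ in range(2_000)) # "about 2k"
new = inserted + old

per_window = windowed_literals(old, new)
print(per_window, sum(per_window))
```

In this model every window, including the short last one,
reports about 2k of literal data, so the total scales with the
number of windows rather than with the size of the change.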

-- Stefan^2.
Received on 2012-06-22 02:19:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.
