
Size of revs file when deleting lines in a big text file

From: Martin Scharrer <mailinglists_at_madmarty.de>
Date: 2006-12-10 03:48:19 CET

Hi list,

I already posted this on the users list, but without a final result, and I now
think this can't be solved by a non-developer.
For reference: my users-list post was message #59447, with the
subject 'Size of revs file when deleting lines in a big text file - Bug?'.

Now the issue:
I observed the following using svn 1.4.2 (r22196) with FSFS under Linux:
A text file of about 5 MB, an mbox with mails, is already in Subversion.
I then deleted one email, i.e. a couple of hundred lines, located in the first
quarter of the file. The diff is about 16 kB. After checking in this and
other small changes, I noticed that the file in the 'db/revs' dir in the
repository is over 3 MB in size.
A 'svn diff -rN:M | wc -c' showed me less than 0.5 MB.
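
The comparison above can be sketched in Python roughly as follows. This is
only an illustration: the helper names are made up for this example, and the
'db/revs/<N>' path assumes the unsharded FSFS layout used by svn 1.4.

```python
import os
import subprocess


def rev_file_size(repo_path, rev):
    """Size in bytes of the FSFS revision file db/revs/<rev>.

    Assumption: unsharded layout, one file per revision directly in db/revs.
    """
    return os.path.getsize(os.path.join(repo_path, "db", "revs", str(rev)))


def diff_size(working_copy, old_rev, new_rev):
    """Byte count of 'svn diff -r OLD:NEW', the number 'wc -c' would report."""
    result = subprocess.run(
        ["svn", "diff", "-r", f"{old_rev}:{new_rev}"],
        cwd=working_copy,
        check=True,
        capture_output=True,
    )
    return len(result.stdout)
```

Calling rev_file_size(repo, M) and diff_size(wc, N, M) for the same commit
gives the two numbers compared above.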

I then tested this with a test repository, where I repeated this case several
times by checking in a copy with this deletion and then a copy without it.
I repeated this about 50 times on the same repository using a script.

The result shows that the 'delete' revs are very large, much
bigger than the resulting diff (e.g. a 500k rev for a 16k diff), and that this
size depends on the position of the change. The size is bigger for a deletion
nearer to the start of the file and smaller for a deletion nearer to the end,
e.g. 1.7 MB directly at the beginning, 660 kB at the 50% mark, 2k at the end.
BUT the revs where the lines get added again are _all_ very small, about 2k.
The sizes should be about the same, since the diffs are.

I tested different orders of deletion/addition to make sure it's not because of
skip-deltas, as mentioned by Rob Hubbard on the users list.

I have now written a test script in Perl so you can reproduce this easily. The
script generates a test repository and a test file, then makes a couple of
check-ins and prints the resulting sizes.
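
The script itself does not appear in this message. As a rough illustration of
the same kind of experiment, here is a Python sketch (not the original Perl
script). The file name, line count, block size and all function names are
assumptions made for this example; it needs the 'svn' and 'svnadmin' command
line tools to actually run measure().

```python
import os
import pathlib
import subprocess
import tempfile

LINE_COUNT = 100_000   # assumption: 100,000 lines of 50 bytes, about 5 MB
BLOCK_LINES = 300      # assumption: "a couple of hundred lines"


def make_lines(count=LINE_COUNT):
    """Generate unique, fixed-width (50 byte) text lines."""
    return [f"line {i:08d} " + "x" * 35 + "\n" for i in range(count)]


def delete_block(lines, fraction, block=BLOCK_LINES):
    """Copy of lines with `block` lines removed, starting at `fraction` of the file."""
    start = min(int(len(lines) * fraction), max(len(lines) - block, 0))
    return lines[:start] + lines[start + block:]


def write(path, lines):
    with open(path, "w", newline="") as fh:
        fh.writelines(lines)


def run(cmd):
    subprocess.run(cmd, check=True, capture_output=True)


def rev_size(repo, rev):
    """Size of the rev file; searching db/revs also covers sharded layouts."""
    for root, _dirs, files in os.walk(os.path.join(repo, "db", "revs")):
        if str(rev) in files:
            return os.path.getsize(os.path.join(root, str(rev)))
    raise FileNotFoundError(f"no rev file for r{rev}")


def measure(fractions=(0.0, 0.5, 1.0)):
    """Commit delete/restore pairs; return [(fraction, delete_size, restore_size)]."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        repo = os.path.join(tmp, "repo")
        wc = os.path.join(tmp, "wc")
        run(["svnadmin", "create", "--fs-type", "fsfs", repo])
        run(["svn", "checkout", pathlib.Path(repo).as_uri(), wc])
        target = os.path.join(wc, "testfile.txt")
        original = make_lines()
        write(target, original)
        run(["svn", "add", target])
        run(["svn", "commit", "-m", "add testfile", wc])
        rev = 1  # the initial add is r1
        for fraction in fractions:
            write(target, delete_block(original, fraction))
            run(["svn", "commit", "-m", "delete block", wc])
            rev += 1
            deleted = rev_size(repo, rev)
            write(target, original)
            run(["svn", "commit", "-m", "restore block", wc])
            rev += 1
            results.append((fraction, deleted, rev_size(repo, rev)))
    return results
```

Calling measure() returns one tuple per position, so the size of each
'delete' rev can be compared directly with the rev that adds the lines back.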

thanks,
Martin


Received on Sun Dec 10 03:53:52 2006

This is an archived mail posted to the Subversion Dev mailing list.