
Um, why do we strip the SVNDIFF header for local writing

From: Daniel Berlin <dan_at_dberlin.org>
Date: 2002-02-14 05:25:46 CET

I just made the svndiff encoding more space-efficient by incorporating
the caches from VCDIFF in a backwards-compatible way.
To select which svndiff version to use, we pass a new version
argument to svn_txdelta_to_svndiff.
However, while this works great remotely (since we can just grab the
char after 'SVN' and do the right thing depending on the version), we
strip the 'SVN\0' header locally in rep-strings.c, and send 'SVN\0'
through when we read back the string to fake out the stream.

This completely defeats the purpose of the version number in there, and
guarantees that we can't easily change svndiff formats without destroying
the data in the local repository. You have to resort to shifting tricks
in the offset or something in your actual new algorithm, rather than just
using a different number after the 'SVN' part.
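
To make the problem concrete, here is a minimal sketch in Python (not the
actual C code in rep-strings.c). The function names are hypothetical, and
the only thing assumed about the format is what is described above: the
stream starts with 'SVN' followed by one version byte.

```python
SVNDIFF_MAGIC = b"SVN"


def split_svndiff_header(data):
    """Return (version, payload) for a stream starting with 'SVN' + version byte."""
    if len(data) < 4 or data[:3] != SVNDIFF_MAGIC:
        raise ValueError("not an svndiff stream")
    return data[3], data[4:]


def fake_header_on_read(stored_payload):
    """Mimic the local behaviour: the header was stripped on write,
    so a fixed 'SVN\\0' is prepended again when the string is read back."""
    return SVNDIFF_MAGIC + b"\x00" + stored_payload


# A stream written with the new (version 1) encoding.
original = b"SVN\x01" + b"delta-bytes"

# Remote case: the version byte survives, so the reader can dispatch on it.
version, payload = split_svndiff_header(original)

# Local case: strip the header, store only the payload, fake 'SVN\0' on read.
round_tripped = fake_header_on_read(payload)
seen_version, _ = split_svndiff_header(round_tripped)
```

After the local round trip the reader sees version 0 even though the data
was encoded as version 1, which is exactly the information loss I am
complaining about.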

If the answer to "why is there no version number on the repo" is "svndiff
has a header with a version number in it", then wouldn't it make sense to
actually stop stripping it?

In a DB with 3300 revisions on 804 files, there are 21k key/data pairs in
the string database.

Even if we assume every single data item was actual delta data (not even
close to true), this is what, 80k?
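
The arithmetic behind that estimate, as a small sketch. It assumes "21k"
means roughly 21,000 items and that each one would carry the full 4-byte
'SVN' + version header; the variable names are made up for illustration.

```python
KEY_DATA_PAIRS = 21_000   # approximate count of entries in the string database
HEADER_BYTES = 4          # 'S', 'V', 'N', plus one version byte

# Worst case: every single entry is delta data and keeps its header.
worst_case_bytes = KEY_DATA_PAIRS * HEADER_BYTES
worst_case_kb = worst_case_bytes / 1000
```

That comes to about 84k in the worst case, in line with the rough 80k
figure above.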

Considering the data is big enough that we already have 2800 overflow
pages, the likelihood that 4 bytes per data item won't fit somewhere in an
existing overflow page is absurdly small.

I'm all for optimization, but this is not only a micro-optimization with
almost 0 effect, it's specifically removing very useful data that we
ourselves put there.
Heck, if you are all that concerned about 4 bytes per data item, we could
reduce it to 1 by just storing the version number.

Of course, if we do that, we wouldn't be able to detect the old format,
where the 'SVN' header is missing, in existing data.
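
A rough sketch of why the one-byte variant loses that ability. Both reader
functions are hypothetical and written in Python purely for illustration;
the fallback to version 0 for headerless data is an assumption, not what
the current code does.

```python
def read_with_full_header(stored):
    """Stored data keeps 'SVN' + version byte; anything else is treated
    as old headerless data (assumed here to be version 0)."""
    if len(stored) >= 4 and stored[:3] == b"SVN":
        return stored[3], stored[4:]
    # Heuristic: old data could in principle begin with b"SVN" by accident,
    # but three matching bytes make that far less likely than one.
    return 0, stored


def read_with_version_byte(stored):
    """Stored data keeps only a single leading version byte."""
    return stored[0], stored[1:]


# An old record written without any header; its first byte is just delta data.
old_record = b"\x01abc"

full = read_with_full_header(old_record)
short = read_with_version_byte(old_record)
```

With the full header, the old record is recognised as headerless and passed
through untouched. With only a version byte, its first data byte is
mistaken for a version number and the payload is shifted by one.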

Or we could just change the rep or something.

But I really don't think getting rid of the version number on svndiff data
before writing it out (I don't care whether we keep it *with* the data
or not) is right.

Received on Sat Oct 21 14:37:07 2006

This is an archived mail posted to the Subversion Dev mailing list.
