Re: Proposed svndiff format

From: Karl Fogel <kfogel_at_galois.collab.net>
Date: 2000-10-06 20:40:57 CEST

Greg Hudson <ghudson@mit.edu> writes:
> Like vcdiff, an svndiff consists of a four-byte identifying string
> followed by a series of windows, until the data runs out. The
> four-byte identifier will be { 'S', 'V', 'N', 0 }, where 0 is a
> version number. I could be convinced to nuke the identifying string
> (gets rid of one state in the parsing machine), but I think it's a
> little antisocial to define a binary format without a predictable
> beginning.

Agree.
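
For illustration only, here is a minimal sketch of how a reader might check that four-byte identifier before handing the rest of the stream to a window parser. The names SVNDIFF_PREFIX, SUPPORTED_VERSION and check_header are hypothetical and not taken from any Subversion code; the only facts assumed from the proposal are the bytes 'S', 'V', 'N' followed by a version byte of 0.

```python
# Hypothetical names; only the byte layout comes from the proposal.
SVNDIFF_PREFIX = b"SVN"
SUPPORTED_VERSION = 0


def check_header(stream: bytes) -> bytes:
    """Validate the four-byte identifier and return the bytes after it."""
    if len(stream) < 4:
        raise ValueError("stream too short to hold the svndiff identifier")
    if stream[:3] != SVNDIFF_PREFIX:
        raise ValueError("not an svndiff stream: bad identifying string")
    version = stream[3]
    if version != SUPPORTED_VERSION:
        raise ValueError(f"unsupported svndiff version: {version}")
    # Everything after the identifier is window data.
    return stream[4:]
```

Rejecting a bad prefix up front is the "predictable beginning" benefit Greg mentions: a parser can fail immediately on a file that is not an svndiff instead of misreading it as windows.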

> Also like vcdiff, integers are generally encoded in a variable-length
> fashion. The high bit of each byte is a continuation bit and the
> other seven bits are data. Higher-order bytes come earlier. So 129
> would be encoded as two bytes, { 0b10000001, 0b00000001 }.
> Implementations must be able to handle numbers up to 32 bytes, which
                                                          ^^^^^
s/bytes/bits/

> means you can safely use source and target views up to 4GB in size.

Should be sufficient. :-)
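
To make the integer encoding concrete, here is a small sketch of an encoder and decoder following the description above: seven data bits per byte, the high bit set on every byte except the last, higher-order groups first. The function names and the max_bits parameter are assumptions made for this example, not part of the proposal; the 32-bit limit reflects the bytes/bits correction above.

```python
def encode_int(n: int) -> bytes:
    """Encode a non-negative integer, most significant 7-bit group first."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    groups = [n & 0x7F]
    n >>= 7
    while n:
        groups.append(n & 0x7F)
        n >>= 7
    groups.reverse()
    # Set the continuation bit on every byte except the final one.
    return bytes(g | 0x80 for g in groups[:-1]) + bytes([groups[-1]])


def decode_int(data: bytes, pos: int = 0, max_bits: int = 32):
    """Decode one integer starting at pos; return (value, next position)."""
    value = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated integer")
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if value >> max_bits:
            raise ValueError("integer does not fit in max_bits bits")
        if not byte & 0x80:
            return value, pos
```

With this sketch, encode_int(129) yields the two bytes 0x81 0x01, matching Greg's example, and the largest 32-bit value, 2**32 - 1, takes five bytes. That upper bound is what makes views of up to 4GB addressable.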

> [...]
>
> That's it. We won't be able to get deltas quite as compact as in
> vcdiff because there are no games to play with address caches,
> addressing modes, or tailored instruction code sets. But it's still
> pretty compact.

It looks fine. Do you have any quantitative information on the
relative compactness vs. vcdiff? If you say it's close enough, I trust
you; I'm just curious. :-)

And if it's a problem, there's nothing preventing us from adding true
vcdiff support later, as we're going to do with the GNU diff format.

-Karl
Received on Sat Oct 21 14:36:10 2006

This is an archived mail posted to the Subversion Dev mailing list.