Greg Hudson <ghudson@mit.edu> writes:
> Like vcdiff, an svndiff consists of a four-byte identifying string
> followed by a series of windows, until the data runs out. The
> four-byte identifier will be { 'S', 'V', 'N', 0 }, where 0 is a
> version number. I could be convinced to nuke the identifying string
> (gets rid of one state in the parsing machine), but I think it's a
> little antisocial to define a binary format without a predictable
> beginning.
Agree.
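For concreteness, here is roughly how I picture a reader checking that
beginning. This is only a sketch of my reading of the proposal, not code
from the tree; the names SVNDIFF_MAGIC and parse_header are made up for
illustration, and I am assuming the fourth byte is the raw version number
as described above.

```python
SVNDIFF_MAGIC = b"SVN"  # hypothetical constant name; the bytes come from the proposal


def parse_header(data):
    """Split an svndiff stream into (version, remaining_bytes).

    Assumes the layout described above: 'S', 'V', 'N', then one version byte.
    Raises ValueError if the stream does not start with the identifier.
    """
    if len(data) < 4 or data[:3] != SVNDIFF_MAGIC:
        raise ValueError("not an svndiff stream")
    version = data[3]  # indexing bytes yields an int in Python 3
    return version, data[4:]
```

The windows would then be parsed out of the remaining bytes until the
data runs out.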
> Also like vcdiff, integers are generally encoded in a variable-length
> fashion. The high bit of each byte is a continuation bit and the
> other seven bits are data. Higher-order bytes come earlier. So 129
> would be encoded as two bytes, { 0b10000001, 0b00000001 }.
> Implementations must be able to handle numbers up to 32 bytes, which
                                                          ^^^^^
                                                          s/bytes/bits/
> means you can safely use source and target views up to 4GB in size.
Should be sufficient. :-)
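To check that I understand the integer encoding, here is a small sketch
of it. The function names encode_int and decode_int are my own invention,
and the 32-bit ceiling follows my correction above rather than anything
settled; treat it as an illustration, not a proposed implementation.

```python
def encode_int(n):
    """Encode a non-negative integer, seven data bits per byte.

    Higher-order bytes come first; every byte except the last has its
    high (continuation) bit set.
    """
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    out = [n & 0x7F]  # final byte: continuation bit clear
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(out))


def decode_int(data, pos=0, max_bits=32):
    """Decode one integer starting at data[pos].

    Returns (value, next_pos). max_bits=32 is an assumed limit, taken
    from the "numbers up to 32 bits" reading of the proposal.
    """
    value = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated integer")
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if value >> max_bits:
            raise ValueError("integer exceeds %d bits" % max_bits)
        if not byte & 0x80:  # continuation bit clear: this was the last byte
            return value, pos
```

With this, encode_int(129) gives the two bytes from the example,
{ 0b10000001, 0b00000001 }, and the largest 32-bit value takes five bytes.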
> [...]
>
> That's it. We won't be able to get deltas quite as compact as in
> vcdiff because there are no games to play with address caches,
> addressing modes, or tailored instruction code sets. But it's still
> pretty compact.
It looks fine. Do you have any quantitative information on the
relative compactness vs. vcdiff? If you say it's close enough, I trust
you; I'm just curious. :-)
And if it's a problem, there's nothing preventing us from adding true
vcdiff support later, as we're going to do with the GNU diff format.
-Karl
Received on Sat Oct 21 14:36:10 2006