
Re: SVNDIFF1 is ready to merge

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: 2005-12-13 00:39:57 CET

Daniel Berlin wrote:
> So svndiff1 is ready to merge.

Can you give us a summary of what this branch does? The summary of the major
commit on the branch (r17225) is:

   Add new svndiff1 diff encoding.

... which is a bit on the brief side :-) I seem to recall you gave a good
description of it in an email a while ago; perhaps you could retrieve and
update that, probably making it just a paragraph or two for the log message for
the merge to trunk. Of course the format details will be in 'notes/svndiff',
so all this log message needs to say is basically why and what is the effect of
this enhancement.

> On an average repository, the savings are about 30% disk space with
> roughly no time cost (I haven't seen the zlib encoding/decoding on the
> profiles).

That certainly sounds good!

> There are some questions exactly what we should be outputting and
> handling in mod_dav_svn/neon in terms of headers, and i'm happy to
> address any concerns, because I have no real opinions/etc about whether
> we should be using Accept/Accept-Encoding, what the names, etc, we
> should use there are, or anything of the sort :).

Do you mean there are some questions that people have already asked that
haven't yet been answered conclusively, or that you have made some arbitrary
choices in the design which we might like to consider? Can you list these
questions? (You mentioned one there: "Accept" or "Accept-Encoding".)

(I'm just asking to try to stimulate others to discuss or review; this isn't my

> The internal bdb and fsfs formats have been bumped, but we still support
> the "old" format simultaneously. I've added an option to svnadmin to
> create repositories with the old format (IE svndiff0 only), for
> backwards compatibility.

That information should probably go into the log message for the merge to trunk
if nowhere else.

> Index: svndiff-v1
> ===================================================================
> --- svndiff-v1 (revision 0)
> +++ svndiff-v1 (revision 17225)
> @@ -0,0 +1,222 @@
> +This email is long, so i sectionized it into "Introduction and
> +justification", "Changes I made", "Sparkly numbers", "Time Costs and
> +Backwards Compatibility"
> +I obviously have no plans to commit this until we work out these issues,
> +and revise the patch. Thus, style nits, etc, are not necessary. I'm
> +aware it's non-perfect :)
> +
> +--Dan
> \ No newline at end of file

Ah - that looks like the body of your descriptive email that I referred to
above. It contains some interesting information, but I don't think we need a
verbatim copy of it in the "notes" directory. I'm not sure if you added it on
purpose: it's not mentioned in any of the log messages.

Where relevant, such as in the merge-to-trunk log message, you can drop in a
reference to the email itself as it exists in one of the archives
(http://svn.haxx.se/dev/archive-2005-10/1054.shtml), and this is better because
the rest of the thread can be traced from there. Any firm technical details
that may be in this email should go into the 'svndiff' file, but most of the
email is discussing the design process and rationale rather than the specification.

> Index: svndiff
> ===================================================================
> --- svndiff (revision 17224)
> +++ svndiff (revision 17225)
> @@ -28,13 +28,21 @@
> The target view length
> The length of the instructions in bytes
> The length of the new data in bytes

Those two lines need to say "... length of the instructions section ..." and
"... length of the new data section ..." respectively, since the sections are
now going to be shorter than the instructions/data that they contain.

> - The window's instructions
> - The window's new data (as raw data)
> + The window's instructions section
> + The window's new data (as raw data) section

Remove "(as raw data)" now that it is no longer always true.

> -Integers (including the first five items listed above) are encoded
> -using a variable-length format. The high bit of each byte is used as
> -a continuation bit; 1 indicates that there is more data and 0
> -indicates the final byte. The other seven bits of each byte are data.
> +In svndiff version 1, the instructions, and new data
> +sections may be compressed by zlib. In order to determine the
> +original size, an integer is appended to the beginning of each of the
> +sections. If the original size matches the encoded size (minus the
> +length of the original size integer)from the header, the data is not
> +compressed. If the original size is different than the encoded size
> +from the header, the remaining data in the section.

There's something missing from that last sentence.

This description wrongly implies that both formats 0 and 1 have the extra
length field.
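
To make the intended version 1 layout concrete, here is a minimal Python sketch
of one section as I read the quoted description: the original size comes first
as a variable-length integer, followed by either the raw bytes or the zlib
output. The function names are hypothetical and are not Subversion's API, and
the rule "store the zlib output only when it is strictly smaller" is my
assumption; the quoted text says only that the sections "may be" compressed.

```python
import zlib


def encode_varint(value):
    """Encode a non-negative integer, 7 data bits per byte, high-order first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    groups.reverse()
    # Every byte except the last carries the continuation bit (0x80).
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_varint(data, pos=0):
    """Return (value, position just past the integer)."""
    value = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated integer")
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def encode_section(raw):
    """Prefix the original size, then store raw or zlib-compressed bytes.

    Assumption: the compressed form is used only when strictly smaller, so
    a compressed body can never have the same length as the original.
    """
    compressed = zlib.compress(raw)
    body = compressed if len(compressed) < len(raw) else raw
    return encode_varint(len(raw)) + body


def decode_section(section):
    """Undo encode_section: compare the stored size with the remaining length."""
    original_size, pos = decode_varint(section)
    body = section[pos:]
    if len(body) == original_size:
        return body  # sizes match, so the data was stored uncompressed
    raw = zlib.decompress(body)
    if len(raw) != original_size:
        raise ValueError("decompressed size does not match the stored size")
    return raw
```

The length recorded in the window header would then be len(encode_section(raw)),
i.e. the length of the section rather than of the data it contains, which is
why the header wording above needs to say "section".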

> +
> +Integers (including the integers described bove) are encoded using a


Since the "offset" and "length" fields in the header haven't yet been referred
to as "integers", it isn't clear that this parenthesis refers to them. Perhaps
"including the offset and all of the lengths ..." would be better.

> +variable-length format. The high bit of each byte is used as a
> +continuation bit; 1 indicates that there is more data and 0 indicates
> +the final byte. The other seven bits of each byte are data.
> Higher-order bits are encoded before lower-order bits. As an example,
> 130 would be encoded as two bytes, 10000001 followed by 00000010.
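
For anyone checking the example in that paragraph, the integer format can be
sketched in a few lines of Python. This is only an illustration of the
description quoted above; the function names are hypothetical and this is not
the code in libsvn_delta.

```python
def encode_varint(value):
    """Encode a non-negative integer as described in the quoted text.

    Seven data bits per byte, higher-order groups first; the high bit is 1
    on every byte except the final one.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    groups.reverse()
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_varint(data, pos=0):
    """Decode one integer starting at pos; return (value, next position)."""
    value = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated integer")
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # continuation bit clear: final byte
            return value, pos


# The example from the text: 130 becomes 10000001 followed by 00000010.
print([format(b, "08b") for b in encode_varint(130)])
```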

- Julian

Received on Tue Dec 13 00:40:58 2005

This is an archived mail posted to the Subversion Dev mailing list.