RE: Why do we check the base checksum so often?

From: Bert Huijben <bert_at_qqmail.nl>
Date: Sat, 4 Feb 2012 07:34:36 -0800

> -----Original Message-----
> From: Hyrum K Wright [mailto:hyrum.wright_at_wandisco.com]
> Sent: zaterdag 4 februari 2012 6:04
> To: Subversion Development
> Subject: Why do we check the base checksum so often?
>
> The Ev2 shims get in the way of how text deltas are transmitted, by
> reconstituting the full text, and then just streaming that to the
> receiver via svn_txdelta_send_stream(). I've got a patch which
> actually starts reporting the base checksum---which with the shims
> will always be the "empty" checksum---and it turns out that such a
> patch breaks the World.
>
> The reason for this breakage is that there are several places in both
> the FS and the WC that we check the delta editor's reported base
> checksum against some other value we have on hand which we *think*
> should be the base. Until now, these checks have always passed, since
> there was an implicit understanding about what the delta editor would
> use as its base.
>
> However, I think that these checks are wrong. They rely upon an
> implementation detail ("is the delta editor sending a text delta
> against the base we think it ought to?") rather than the result ("did
> we end up with the content we expected to end up with?") It should be
> noted that we *already* check the result of the text delta application
> against what the delta editor provides in close_file().

You can say that all the expected text bases are wrong if you only send deltas against the empty stream. (libsvn_ra_serf did this for quite some time during development of 1.7). If you send deltas that way you get any result you want, but you might just have transformed a working copy to an invalid state.
You told it the new base checksum, and it applied.... but should it really be the new checksum, or did we forget that the node was switched or mixed revision somewhere?

My guess is that these base checksums -at least during normal update processing- are right and that there is some other hidden problem. During 1.7 development I added a few missing checksums in the protocol processing so at least users of all ra apis were able to detect the same breakage. (serf and neon both missed checksums in certain scenarios)

If they were broken, we probably had noticed this years ago. I wouldn't just assume they are broken without verifying all th new code a few more times.

It is not impossible that they were broken, but that is no reason to break the safety net and assume our experimental code knows better than the code used by millions of users over the years.

It's pretty sure that we break backwards compatibility in some place if we have to remove the safety net in this eary stage of switching...

> I've got a patch to remove this superfluous checking, but since the
> code in question is of Ancient Date, I wanted to make sure the above
> reasoning was correct before committing it. We catch a greater class
> of errors through the checking of checksums we already do based upon
> the *result* of the text delta, so such a change does not impact our
> ability to detect corruption.
>
> Thoughts?

If you already have the patch, please send it. Maybe somebody else can help identifying the problem.

Bert
>
> -Hyrum
>
>
> --
>
> uberSVN: Apache Subversion Made Easy
> http://www.uberSVN.com/
Received on 2012-02-04 16:35:21 CET

This message: [ Message body ]
Next message: Julian Foad: "Re: Why do we check the base checksum so often?"
Previous message: Hyrum K Wright: "Why do we check the base checksum so often?"
In reply to: Hyrum K Wright: "Why do we check the base checksum so often?"
Next in thread: Julian Foad: "Re: Why do we check the base checksum so often?"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]