[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Why do we check the base checksum so often?

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: Mon, 6 Feb 2012 15:41:33 -0500

On 02/06/2012 03:21 PM, Hyrum K Wright wrote:
> On Mon, Feb 6, 2012 at 1:26 PM, Greg Stein <gstein_at_gmail.com> wrote:
>> On Mon, Feb 6, 2012 at 12:49, Hyrum K Wright <hyrum.wright_at_wandisco.com> wrote:
>>> ...
>>>> Yeah, sounds like we're in a tough spot here. The checksums in Ev1 exist
>>>> only as sanity checks -- they definitely are NOT the primary selection
>>>> mechanism for the editor implementation's base text.
>>
>> Right. This is a key point, and Hyrum's earlier emails adopted a tone
>> of using the base checksum as a key. The real semantic behind that
>> checksum is, "I'm going to send you a delta against some text base
>> that we've negotiated or assumed through an out-of-band communication,
>> and to verify that our communication is correct, that text base should
>> have CHECKSUM."
>>
>> Ev2 gets rid of all that hand-wave magic and leaves deltafication to
>> other layers (notably, RA layers sending stuff over the wire; ra_local
>> will not need to deltify at all).
>>
>>>> I assume we still have a valid checksum to pass to close_file() (the
>>>> checksum of the post-edit fulltext for that file), right? It's just the
>>>> pre-edit base checksum that's the problem?
>>>
>>> Correct.
>>>
>>> r1241097 relaxes the checks by special-casing the checksum of the
>>> empty stream (as discussed elsethread). This addresses the immediate
>>> issue, and I think the generalized case can be punted toward the
>>> individual ra-layers long-term.
>>
>> Note that this will break third-party Ev1 receivers (since they assume
>> something other than the empty stream). You happened to fix *ours*,
>> but not theirs.
>
> Receivers (such as ours) are making unreasonable assumptions, in my
> opinion. The docs for apply_textdelta() read thusly:
>
> * @a base_checksum is the hex MD5 digest for the base text against
> * which the delta is being applied; it is ignored if NULL, and may
> * be ignored even if not NULL. If it is not ignored, it must match
> * the checksum of the base text against which svndiff data is being
> * applied; if it does not, @c apply_textdelta or the @a *handler call
> * which detects the mismatch will return the error
> * SVN_ERR_CHECKSUM_MISMATCH (if there is no base text, there may
> * still be an error if @a base_checksum is neither NULL nor the hex
> * MD5 checksum of the empty string).
>
> All the docs guarantee is that this checksum must match the base
> checksum against which the delta is being applied. They don't assume
> that the receiver has the content, nor that it will even pay attention
> to the checksums themselves. If the receiver has done some kind of
> backdoor secret handshake that's outside the scope of the delta
> editor, and it should (reasonably) not care.
>
> I maintain that it is still legal (though perhaps not very useful) for
> the base checksum supplied by apply_textdelta() to be whatever
> apply_textdelta() wants it to be.

Sorry, Hyrum, but I sincerely believe you are trying to bend this API to
your will just to work around a tough engineering task. :-)

If you read the whole docstring for apply_textdelta, you find that it begins
like so:

  /** Apply a text delta, yielding the new revision of a file.
   *
   * @a file_baton indicates the file we're creating or updating, and the
   * ancestor file on which it is based; it is the baton set by some
   * prior @c add_file or @c open_file callback.

Now, can you *really* convince yourself that anyone ever intended for the
"base text" to be something other than the text associated with the
"ancestor file on which [this new revision of the file] is based"? What
value would there be in even passing along a base_checksum that doesn't
match that of the file_baton's to-be-changed text?

-- 
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Received on 2012-02-06 21:42:09 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.