On 02/06/2012 03:21 PM, Hyrum K Wright wrote:
> On Mon, Feb 6, 2012 at 1:26 PM, Greg Stein <gstein_at_gmail.com> wrote:
>> On Mon, Feb 6, 2012 at 12:49, Hyrum K Wright <hyrum.wright_at_wandisco.com> wrote:
>>>> Yeah, sounds like we're in a tough spot here. The checksums in Ev1 exist
>>>> only as sanity checks -- they definitely are NOT the primary selection
>>>> mechanism for the editor implementation's base text.
>> Right. This is a key point, and Hyrum's earlier emails adopted a tone
>> of using the base checksum as a key. The real semantic behind that
>> checksum is, "I'm going to send you a delta against some text base
>> that we've negotiated or assumed through an out-of-band communication,
>> and to verify that our communication is correct, that text base should
>> have CHECKSUM."
>> Ev2 gets rid of all that hand-wave magic and leaves deltafication to
>> other layers (notably, RA layers sending stuff over the wire; ra_local
>> will not need to deltify at all).
>>>> I assume we still have a valid checksum to pass to close_file() (the
>>>> checksum of the post-edit fulltext for that file), right? It's just the
>>>> pre-edit base checksum that's the problem?
>>> r1241097 relaxes the checks by special-casing the checksum of the
>>> empty stream (as discussed elsethread). This addresses the immediate
>>> issue, and I think the generalized case can be punted toward the
>>> individual ra-layers long-term.
>> Note that this will break third-party Ev1 receivers (since they assume
>> something other than the empty stream). You happened to fix *ours*,
>> but not theirs.
> Receivers (such as ours) are making unreasonable assumptions, in my
> opinion. The docs for apply_textdelta() read thusly:
> * @a base_checksum is the hex MD5 digest for the base text against
> * which the delta is being applied; it is ignored if NULL, and may
> * be ignored even if not NULL. If it is not ignored, it must match
> * the checksum of the base text against which svndiff data is being
> * applied; if it does not, @c apply_textdelta or the @a *handler call
> * which detects the mismatch will return the error
> * SVN_ERR_CHECKSUM_MISMATCH (if there is no base text, there may
> * still be an error if @a base_checksum is neither NULL nor the hex
> * MD5 checksum of the empty string).
> All the docs guarantee is that this checksum must match the base
> checksum against which the delta is being applied. They don't assume
> that the receiver has the content, nor that it will even pay attention
> to the checksums themselves. If the receiver has done some kind of
> backdoor secret handshake that's outside the scope of the delta
> editor, and it should (reasonably) not care.
> I maintain that it is still legal (though perhaps not very useful) for
> the base checksum supplied by apply_textdelta() to be whatever
> apply_textdelta() wants it to be.
Sorry, Hyrum, but I sincerely believe you are trying to bend this API to
your will just to work around a tough engineering task. :-)
If you read the whole docstring for apply_textdelta, you find that it begins
/** Apply a text delta, yielding the new revision of a file.
* @a file_baton indicates the file we're creating or updating, and the
* ancestor file on which it is based; it is the baton set by some
* prior @c add_file or @c open_file callback.
Now, can you *really* convince yourself that anyone ever intended for the
"base text" to be something other than the text associated with the
"ancestor file on which [this new revision of the file] is based"? What
value would there be in even passing along a base_checksum that doesn't
match that of the file_baton's to-be-changed text?
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on 2012-02-06 21:42:09 CET