
ra_serf: API and HTTP protocol changes to support delta streaming (was: Re: Proposal: new fsfs.conf properties)

From: Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>
Date: Fri, 28 Jul 2017 19:21:54 +0300

Julian Foad <julianfoad_at_gmail.com> writes:

> Hi Evgeny. Daniel Shahaf just noticed that r1803143 extended the
> svn_delta_editor_t interface. So I took a look and see in the log
> message that also:
>
>> This requires a minor tweak to the Subversion's HTTP protocol, and
>> it's the reason why streaming would only work against new servers.
>
> It would have been nice to announce these API and protocol changes
> before or after making them, to give devs a better chance to review. I
> don't watch commits closely and hadn't seen this at all.

Thank you for the reminder; I have now moved this to a separate thread.

Adding support for streaming svndiff deltas over ra_serf required a minor
extension of the svn_delta_editor_t interface and of Subversion's HTTP
protocol. Both changes were implemented in a backward-compatible manner, and
they don't alter the existing behavior for users of the API or for 3rd-party
users of the protocol.

Please see the changeset at https://svn.apache.org/r1803143 for additional
technical details. Here is a recap for convenience:

 (1) The new svn_delta_editor_t.apply_textdelta_stream() callback was added.

     It works in an inverted way, compared to apply_textdelta(), by allowing
     the editor driver to set a callback (svn_txdelta_stream_open_func_t)
     that will be called when the implementation requires the txdelta stream.

     This is what makes streaming deltas possible in the implementation of
     ra_serf's commit editor.

     A potential alternative would be to implement this optimization in
     Ev2, without altering the Ev1 interface. But as long as the work on
     Ev2 isn't finalized, this relatively small and compatible extension to
     the existing Ev1 interface allows us to solve the long-standing issue
     of commits working slower than expected and requiring disk space to
     store temporary files.

 (2) The HTTP protocol was extended to report the result checksum via the
     "X-SVN-Result-Fulltext-MD5" header in the response to a successful
     PUT request. This capability is advertised in the response to the
     initial OPTIONS request.

     An extended explanation of why this is required is the following:
     [[[
     Currently, all PUT requests include a special header that contains the
     result checksum, which is used by the server to validate the integrity
     of the result after it applies the delta received over the wire. While
     this approach works fine if the client first creates a temporary file
     with the delta and only then starts sending it to the server (the result
     checksum is calculated while preparing the temporary file), it can't
     be used in the stream approach, as with it we'd need to know the result
     checksum _before_ we start sending data.

     So we turn the existing scheme inside out, and teach mod_dav_svn to send
     the result checksum in the responses to PUT requests. Then, the client
     would check if the received checksum matches what it calculated, and,
     possibly, return a checksum mismatch error (thus aborting the edit and
     the transaction).
     ]]]
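
To illustrate the inversion described in (1), here is a minimal toy model in
Python. It is not the Subversion API: PushEditor, PullEditor, send_request()
and the byte-string "windows" are hypothetical stand-ins. It only shows the
difference between the driver pushing windows into a handler returned by the
editor, and the driver handing over an opener that the editor calls once it
actually needs the stream.

```python
from typing import Callable, Iterator, List, Optional

Window = bytes  # hypothetical stand-in for a txdelta window


def windows() -> Iterator[Window]:
    """Produce the delta lazily, one window at a time."""
    yield b"window-1"
    yield b"window-2"


class PushEditor:
    """Toy analogue of apply_textdelta(): the driver pushes windows."""

    def __init__(self) -> None:
        self.sent: List[Window] = []

    def apply_textdelta(self) -> Callable[[Window], None]:
        # The editor returns a handler; the driver decides when to call it.
        return self.sent.append


class PullEditor:
    """Toy analogue of apply_textdelta_stream(): the editor pulls windows."""

    def __init__(self) -> None:
        self.sent: List[Window] = []
        self._open_func: Optional[Callable[[], Iterator[Window]]] = None

    def apply_textdelta_stream(
        self, open_func: Callable[[], Iterator[Window]]
    ) -> None:
        # Only remember how to open the stream; nothing is read yet.
        self._open_func = open_func

    def send_request(self) -> None:
        # Hypothetical point where the implementation needs the stream,
        # e.g. while writing a request body.
        assert self._open_func is not None
        for window in self._open_func():
            self.sent.append(window)


def drive_push(editor: PushEditor) -> None:
    handler = editor.apply_textdelta()
    for window in windows():
        handler(window)


def drive_pull(editor: PullEditor) -> None:
    editor.apply_textdelta_stream(windows)
```

In this sketch nothing is generated for the pull-style editor until
send_request() runs, so the delta never has to be staged anywhere in full
before it is sent.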

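The checksum scheme from (2) can be sketched in the same toy style. The
header name is the one given above; everything else is an assumption made for
illustration: fake_server_put() stands in for mod_dav_svn, the chunks are
treated as fulltext rather than svndiff data, and the header value is assumed
to be a hex digest. The client computes the checksum while streaming and
compares it only after the response arrives.

```python
import hashlib
from typing import Callable, Dict, Iterable, Iterator

RESULT_HEADER = "X-SVN-Result-Fulltext-MD5"  # header name from the text above


class ChecksumMismatch(Exception):
    """Raised when the server's result checksum differs from the client's."""


def fake_server_put(body: Iterable[bytes], corrupt: bool = False) -> Dict[str, str]:
    """Hypothetical server: consume the body, report the result checksum."""
    md5 = hashlib.md5()
    for chunk in body:
        md5.update(chunk)
    if corrupt:
        md5.update(b"\x00")  # simulate a result that differs from what was sent
    return {RESULT_HEADER: md5.hexdigest()}


def streaming_put(
    chunks: Iterable[bytes],
    put: Callable[[Iterable[bytes]], Dict[str, str]] = fake_server_put,
) -> str:
    """Stream chunks to the server and verify the checksum it reports."""
    local = hashlib.md5()

    def body() -> Iterator[bytes]:
        # The checksum is updated as data is sent, so it is only known
        # once the whole body has gone out.
        for chunk in chunks:
            local.update(chunk)
            yield chunk

    headers = put(body())
    expected = local.hexdigest()
    if headers[RESULT_HEADER] != expected:
        # In the scheme described above, this aborts the edit.
        raise ChecksumMismatch(
            f"server reported {headers[RESULT_HEADER]}, client computed {expected}"
        )
    return expected
```

With the old scheme the client would have needed `expected` before calling
put(), which is exactly what streaming rules out.
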
I hope that I didn't miss anything subtle here, and I think that the result
is worth these changes.

Regards,
Evgeny Kotkov
Received on 2017-07-28 18:22:22 CEST

This is an archived mail posted to the Subversion Dev mailing list.
