On Wed, Jul 10, 2013 at 7:50 AM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> On Wed, Jul 10, 2013 at 1:24 PM, Philip Martin
> <philip.martin_at_wandisco.com> wrote:
>> Johan Corveleyn <jcorvel_at_gmail.com> writes:
>>
>>> Can someone explain again why ra_serf / serf can't just resend
>>> requests, with C-L, whenever it has received a 411? So (after we know
>>> it's HTTP/1.1) send every request optimistically assuming chunked
>>> encoding, and fallback whenever we get a 411 (and perhaps remember the
>>> fallback, so we downgrade that entire connection / session).
>>>
>>> Why is that not possible? Is it because (a) it's too hard to implement
>>> (lots of code paths all over the place), (b) we can't resend requests
>>> because we don't have them available in the right place anymore after
>>> sending, (c) it would potentially break some ordering things (really?
>>> why?), (d) it would require serf 1.4 anyway, or (e) something else?
>>
>> Partly a), mostly c). It is relatively easy to do when ra_serf is
>> processing requests syncronously. It gets much more difficult when
>> ra_serf starts using pipelined requests as there may be outstanding
>> requests when the 411 is received. I posted a patch that keeps track of
>> outstanding requests and retries, it works in lots of cases but won't
>> work in all cases.
>
> Is pipelining optional within (ra-)serf's logic? I.e. can the "start
> pipelining" switch simply be flipped a little bit later, once we're
> sure that chunking works?
Yes. And funny you mention it because I was saying the same thing to
Ben on IRC last night.
However: we cannot do that in a simple fashion, such that I would be
comfortable suggesting a backport to 1.8.x.
Consider if we adopted the "auto" approach and generally sent an extra
OPTIONS request. In the 1.9.x series, we could start to optimize that
away by using the next synchronous request instead. Or when we hit
your question (b), if a body is hard to reproduce, then we hold that
for a moment and send the OPTIONS. And we could disable pipelining for
one request, while we probe the connection's level of support.
> That way we won't do an extra (synchronous,
> but useless) request just for detecting chunkness, but we just
> continue working synchronously with our normal (useful) requests a
> tiny bit longer, before switching to pipelined mode. Still a small
> performance hit in general, but it might be negligible.
Right. And something very doable for 1.9.x. I also want to implement
request compression. That will help some during an import (where we
self-diff the imported file, rather than sending a diff-against-BASE).
An analysis by ghudson long ago showed that our self-diff is actually
pretty good compared to gzip, but I'd prefer to use optimized gzip at
both ends instead of our algorithm.
> Totally making abstraction of implementation complexity, of course ...
>
> Of course, if you could make your patch work in all cases, Philip,
> that would be even better :-).
The ra_serf stuff isn't too bad. Much of the request setup/handling is
in util.c. But we have about three different run loops right now
(one-shot/synchronous, update, and (iirc) replay). They optimize the
connection(s) in different ways.
When we switch to Ev2, there will be some great optimizations in
commit. In particular, alter_file(CONTENTS) means that our
callback-from-serf can read the stream. No transient spillbuf/file
necessary (compared to Ev1, where apply_textdelta pushes content
*into* the RA commit process).
Within 1.9.x, we also can switch more temp-file usage over to use the
spillbuf bucket (ref: sb_bucket.c). That'll keep small items in memory
rather than forced-to-disk.
But back to the original point: what is the smallest patch that we're
comfortable with now? From a user standpoint, and a backport-safety
standpoint. The "auto" suggestion is likely our best solution for now.
Ben put together a branch, but I think it's more complicated than
necessary. That's fixable though :-)
Cheers,
-g
Received on 2013-07-10 16:32:01 CEST