
Re: faster client pre-1.0: neon prefetching, multithreading

From: Greg Stein <gstein_at_lyra.org>
Date: 2003-02-11 23:00:37 CET

On Tue, Feb 11, 2003 at 02:15:51PM -0500, Greg Hudson wrote:
> I had no idea that ra_dav was using appreciably more connections than
> ra_svn. Every time I've looked at the code, it looks like it opens two
> sessions in ra_lib->open and continues to reuse them. Is the repeated
> reconnecting being done inside neon?

solo is mistaken. You're correct: we open one, maybe two connections to the
repository and run with it. Conceivably, Apache might decide to close them
(e.g. every 10,000 requests to shut down the child), but they'll normally
stay open for the duration of the operation.

> If that's not the source of the problem, I have an idea for narrowing it
> down. Do a tcpdump of a Subversion operation with timestamps. Using
> the timestamps, find out whether most of the time is spent in the client
> or in the server, and if there are particular steps of the operation
> which require lots of time.

We make some needless PROPFIND requests. It kind of peeves me that ra_dav
has slowly grown this stuff. Calls are made to functions to retrieve some
data, without regard to the fact that (underneath) it has to hit the server,
sometimes needing multiple requests. For example, just do 'svn cat HACKING'
and watch the PROPFINDs blow by. It does something like *four* sets of
triple PROPFINDs before it even bothers to GET the file.

As I recall, the checkout behavior isn't much better. It does a PROPFIND on
the directory to get the whole bunch of props for the dir and its files, but
then promptly ignores that and does a PROPFIND for each file fetched.
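
To illustrate the kind of reuse that would avoid the per-file requests, here
is a minimal sketch. It is not ra_dav or Neon code: PropCache, fake_propfind,
and the paths are hypothetical stand-ins, and the "server" is just a dict.
The idea is only that the result of one directory-level PROPFIND is kept and
consulted before going back to the server for each file.

```python
# Hypothetical sketch: reuse a directory-level PROPFIND result instead of
# issuing one PROPFIND per file. Nothing here is real ra_dav or Neon API.

FAKE_TREE = {
    "/trunk": {"kind": "dir"},
    "/trunk/HACKING": {"kind": "file"},
    "/trunk/README": {"kind": "file"},
}


def fake_propfind(path, depth):
    """Stand-in for a server round trip; returns {path: props}."""
    if depth == 0:
        return {path: FAKE_TREE[path]}
    prefix = path.rstrip("/") + "/"
    # Depth 1: the resource itself plus its immediate children.
    return {
        p: props
        for p, props in FAKE_TREE.items()
        if p == path or (p.startswith(prefix) and "/" not in p[len(prefix):])
    }


class PropCache:
    def __init__(self, propfind):
        self._propfind = propfind
        self._cache = {}
        self.requests = 0  # number of simulated server round trips

    def prefetch_dir(self, dir_path):
        """One depth-1 request fills the cache for the dir and its files."""
        self.requests += 1
        self._cache.update(self._propfind(dir_path, 1))

    def props(self, path):
        """Hit the server only if the props are not already cached."""
        if path not in self._cache:
            self.requests += 1
            self._cache.update(self._propfind(path, 0))
        return self._cache[path]
```

With the directory prefetched, asking for the props of HACKING and README
costs no further requests; without the prefetch, each file costs one.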


The second thing is that Neon is not capable of doing HTTP pipelining. That
is: sending a second (or third or fourth) request while still waiting for the
response to the first. Without pipelining, every request pays a full round
trip of latency, since it cannot be sent until the previous response has
arrived. With pipelining, you keep both directions of the TCP session full
(well, one side will be the bottleneck, but at least it can be fully
utilized).
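
A rough latency model makes the difference concrete. This is only a sketch:
the function names and the numbers (100 requests, 50 ms round trip, 5 ms of
server work per request) are made up for illustration, and it ignores
bandwidth, TCP slow start, and whatever Neon or Apache actually do.

```python
# Back-of-the-envelope model of serial vs. pipelined requests on one
# connection. All names and numbers are illustrative assumptions.


def serial_time(n_requests, rtt, service):
    """Each request waits for the previous response before it is sent,
    so every request pays a full round trip plus the server's work."""
    return n_requests * (rtt + service)


def pipelined_time(n_requests, rtt, service):
    """Requests are sent back to back, so the round trip is paid roughly
    once and the server's per-request work becomes the bottleneck."""
    return rtt + n_requests * service


if __name__ == "__main__":
    n, rtt, service = 100, 0.050, 0.005  # seconds
    print("serial:    %.2f s" % serial_time(n, rtt, service))
    print("pipelined: %.2f s" % pipelined_time(n, rtt, service))
```

Under these assumptions the serial case is dominated by round trips, which is
why many small PROPFINDs hurt much more on a high-latency link.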

The "serf" project (at the ASF) is an HTTP client which was intended to
solve the pipelining issue, along with a number of other things. It has been
rather, um, slow going :-) [damn, I wish I had three clones]

> On Tue, 2003-02-11 at 13:37, Branko Cibej wrote:
> > Besides, sending deltas for checkout and import is total nonsense.
> It's not total nonsense; it's a space versus time issue. If we're
> spending 10% of import time in vdelta(), then eliminating checkout and
> import deltas will only speed up imports by about 10%, and at some cost
> in bandwidth. And computers will get faster more quickly than bandwidth
> will get cheaper.

Agreed [with Greg].
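
The arithmetic behind that 10% estimate can be checked with a few lines. This
is a sketch only: the 10% figure is the hypothetical from the quoted text, not
a measurement, and the function names are made up.

```python
# If a fraction f of the total time goes to one step, removing that step
# entirely saves at most f of the total. Names and inputs are illustrative.


def time_saved_fraction(step_fraction):
    """Fraction of total run time saved by eliminating the step."""
    if not 0.0 <= step_fraction < 1.0:
        raise ValueError("step_fraction must be in [0, 1)")
    return step_fraction


def speedup_factor(step_fraction):
    """Old time divided by new time after eliminating the step."""
    return 1.0 / (1.0 - time_saved_fraction(step_fraction))


if __name__ == "__main__":
    f = 0.10  # hypothetical share of import time spent in vdelta()
    print("time saved: %.0f%%" % (100 * time_saved_fraction(f)))
    print("speedup:    %.2fx" % speedup_factor(f))
```

So even a free replacement for the delta step would make such an import only
about 1.1 times faster, before counting the extra bytes on the wire.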


Greg Stein, http://www.lyra.org/
Received on Tue Feb 11 22:57:03 2003

This is an archived mail posted to the Subversion Dev mailing list.