
Re: Status of ra_serf

From: Phillip Susi <psusi_at_cfl.rr.com>
Date: 2006-02-17 22:25:16 CET

Justin Erenkrantz wrote:
> For those really busy servers, they are likely a bit beefier than the
> client. For example, svn.apache.org's repository is on a RAID-5'd
> dual-processor machine sitting on a big fat pipe. If we were that
> gun-shy, the client would be sitting idle when there's no need to do
> so as the server can keep up.
> I'm not personally worried about opening up 4 concurrent connections.
> Again, that has been the common strategy of every web browser for
> years and is a decent precedent for us to follow. (In return, we're
> doing much simpler requests.)

Yes, but for years web browsers have been rather dumb and opened 4
connections because they only issue one request per connection. Beefy
or not, servers don't appreciate having to handle 4000 simultaneous
connections when there are only 1000 clients.

> Right. I don't know what the 'best' value is yet. I think it's over
> 100 (absolutely over 10!), but I'm beginning to think 1000 might be
> too high. One of the outputs of ra_serf may very well be, "Hey, you
> server admins, tune your Apache config with these values."

Is there any reason not to leave it completely unlimited? I can't see
one. Why kick the client off and make them reconnect for no good reason?

> At the very least, I'll ensure that svn.apache.org (and hopefully
> svn.collab.net) are tuned for ra_serf. So, I'll be personally happy.
> ;-)
> My short-term goal with ra_serf is not to focus on server-side
> changes. I want ra_serf to work with any 1.0+ server. If we can
> later add code to optimize the server, all the better. But, ra_serf
> will have zero appeal if it only works against 'newer' Subversion
> servers.


> This also keys the performance goals: ra_serf should be competitive in
> most cases with ra_dav or no one will be interested in using it. If
> ra_serf is a couple of percentage points behind ra_dav but the
> server load drops by half, then that might be enough to convince folks
> to switch our default to ra_serf. But, an average performance penalty
> of more than a few percentage points is going to be a showstopper.
> Certainly, if we can beat ra_dav in most cases, it'll be really easy
> to convince people to switch. ;-)

True, but doesn't quadrupling the connections to the server increase the
load on it quite a bit? That's what I'm concerned about. It also isn't
as helpful for checkout times as keeping one connection pipelined (or
two, if you know the first will be blocked for a while doing processing
on the server).


> Again, the 'fetching' connection won't live very long in the default
> case. That's why we have to multiplex connections. -- justin

Yes, the server will hang up after x requests, but when you are issuing
x requests at the same time on 4 connections, they will compete with
each other and prevent their TCP windows from opening fully. For very
low values of x (1-10), 4 connections might give an improvement,
because they are all so short-lived anyhow that they can't reach fully
open windows, and you get rid of the reconnect latency. For x >= 100,
though, I think 2 connections would give better results. Should make an
interesting test...

Received on Fri Feb 17 22:26:53 2006

This is an archived mail posted to the Subversion Dev mailing list.