
Re: Status of ra_serf

From: Justin Erenkrantz <justin_at_erenkrantz.com>
Date: 2006-02-17 23:39:59 CET

On 2/17/06, Phillip Susi <psusi@cfl.rr.com> wrote:
> Yes, but for years web browsers have been rather dumb and opened 4
> connections because they only issue one request per connection. Beefy
> or not, servers don't appreciate having to handle 4000 simultaneous
> connections when there are only 1000 clients.

They certainly do keepalive, but not pipelining. Firefox has long had
pipelining as a config option, but there's *just* enough quirkiness
that it's not on by default for a general-purpose HTTP client. ra_serf
doesn't need to talk to every web server in the universe - so we can
assume that pipelining more or less will work. (Some proxies may get
in the way, sure.)
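The pipelining being discussed can be sketched in a few lines of
stdlib Python -- a toy local HTTP/1.1 server (hypothetical, standing in
for Apache; this is not serf's API) and one raw socket that writes
several requests before reading any responses:

```python
# Sketch of HTTP/1.1 pipelining: several requests go out on ONE
# connection before any response is read. The local server here is a
# stand-in for illustration, not Apache or ra_serf.
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep-alive is required for pipelining

    def do_GET(self):
        body = self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example's output quiet

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

sock = socket.create_connection((host, port))
# Pipelining: write all three requests up front, read nothing yet.
for path in ("/a", "/b", "/c"):
    sock.sendall(f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode())

# The server answers them in order on the same connection.
data = b""
while data.count(b"200 OK") < 3:
    data += sock.recv(4096)
sock.close()
server.shutdown()
print(data.count(b"200 OK"))  # 3 responses, one connection
```

The "quirkiness" mentioned above is exactly what this sketch glosses
over: proxies and some servers mishandle buffered back-to-back
requests, which is why a general-purpose browser can't assume this
always works but ra_serf, talking only to known SVN servers, can.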

> > Right. I don't know what the 'best' value is yet. I think it's over
> > 100 (absolutely over 10!), but I'm beginning to think 1000 might be
> > too high. One of the outputs of ra_serf may very well be, "Hey, you
> > server admins, tune your Apache config with these values."
> >
>
> Is there any reason not to leave it completely unlimited? I can't see
> one. Why kick the client off and make them reconnect for no good reason?

Remember that we may be able to truly parallelize the requests if the
server has multiple CPUs, etc.

Also, if we try to 'flood' the server with an unlimited pipeline
depth, we'll take up more memory on the client side than needed, as we
have to 'manage' more concurrent requests. Some recent commits to
serf changed the API to allocate memory only when we're about to write
the request, not when we create it. That cut the memory
consumption by half.
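The allocate-on-write idea can be illustrated with a small sketch
(hypothetical names; this is the concept, not the actual serf API):
queue a lightweight descriptor per request, and build the full wire
bytes only when a request is about to hit the socket.

```python
# Sketch of deferred request allocation (illustrative, not serf's API):
# while a request sits in the pipeline queue it is just a tiny
# descriptor; the full serialized buffer is built only at write time.
class Request:
    def __init__(self, path):
        self.path = path    # small descriptor held while queued
        self.buffer = None  # wire bytes, allocated lazily

    def serialize(self, host):
        # Build the request bytes only when we're about to write it.
        if self.buffer is None:
            self.buffer = (f"GET {self.path} HTTP/1.1\r\n"
                           f"Host: {host}\r\n\r\n").encode()
        return self.buffer

queue = [Request(f"/file{i}") for i in range(1000)]
# Nothing serialized yet: 1000 queued requests cost ~1000 small objects,
# not 1000 full buffers.
assert all(r.buffer is None for r in queue)

# Only the request being written gets its buffer allocated.
wire = queue[0].serialize("example.invalid")
assert wire.startswith(b"GET /file0")
assert queue[1].buffer is None
```

With an unlimited pipeline depth every queued request would eventually
carry a full buffer anyway, which is the client-side memory cost the
paragraph above is arguing against.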

> True, but doesn't quadrupling the connections to the server increase the
> load on it quite a bit? That's what I'm concerned about.

No, it doesn't - compared to the large REPORT response ra_dav fetches.
The server isn't having to do any base-64 encoding - that's the real
opportunity ra_serf has, since we're just not going to win on the
number of connections alone. Therefore, breaking the activity into
smaller chunks helps ease the computational load on the server
(provided the disk I/O can keep up).
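To put a number on the base-64 point: beyond the CPU spent encoding,
base64 inflates every payload by a third before it even leaves the
server. A two-line check with the stdlib:

```python
# base64 expands 3 input bytes to 4 output bytes, so skipping it saves
# ~33% of the bytes on the wire in addition to the encoding CPU time.
import base64

raw = b"\x00" * 3000          # stand-in for 3000 bytes of file data
enc = base64.b64encode(raw)   # 4000 bytes after encoding
print(len(enc), len(raw))     # 4000 3000 -> 4/3 overhead
```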

> It also isn't
> as helpful for checkout times as keeping one ( or two if you know the
> first will be blocked for a while doing processing on the server )
> connections pipelined.

Again, it's a function of what resources the server has to offer.
Ideally, we'd like to make the bottleneck the *client* disk I/O, not
the network or the server disk I/O.

> Yes, the server will hang up after x requests, but when you are issuing
> x requests at the same time on 4 connections, they will compete with
> each other and prevent their tcp windows from opening fully. For very
> low values of x ( 1-10 ) then 4 connections might give an improvement
> because they are all so short lived anyhow that they can't reach full
> open windows, and you get rid of the reconnect latency. For values of x
> >= 100 though, I think 2 connections would give better results. Should
> make an interesting test...

There's a distinct need to profile our behaviors to ensure we're being
as optimal as we can be. ;-) -- justin

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Feb 17 23:40:26 2006

This is an archived mail posted to the Subversion Dev mailing list.
