
Re: Status of ra_serf

From: Phillip Susi <psusi_at_cfl.rr.com>
Date: 2006-02-18 17:11:45 CET

Justin Erenkrantz wrote:
> No. Serf allocates a pool to the request when it decides to deliver
> it. If we decided to not use a pool strategy, it would make my life
> (and any one coding at serf's level) suck. So, that means each
> request requires 8KB of memory at a minimum. For 1000 requests,
> that's 8KB*1000 = ~8MB. If we remove a pipeline limit and write 5000
> requests, we're talking ~40MB of overhead. Feasible, but there's a
> steep cost for memory allocation - reducing our memory footprint makes
> us faster than if we tried to be cute and write 5000 requests when we
> know that we're not going to need them all. (Also, by delaying our
> memory allocation, we get the benefits of pools such that we stabilize
> our memory usage very quickly and it plateaus.)

I see. Then yes, it makes sense to limit the depth of the pipeline. My
point, though, wasn't that you should send 5000 requests at once, but
that there is no reason you can't send them in batches of, say, 25, and
just keep sending more as the first requests complete, keeping the
pipeline full the whole time. Why should the server close the socket
and require you to reconnect to send another batch of requests?
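
To make the "keep the pipeline full" idea concrete, here is a rough sketch
of a sliding window over a queue of requests. It is only a simulation, not
serf code: the `pipeline` function, the `window` parameter, and the
assumption that responses complete strictly in request order are mine. The
8KB figure is the per-request pool size Justin quoted.

```python
from collections import deque

POOL_BYTES = 8 * 1024  # per-request pool size quoted in the thread


def pipeline(requests, window=25):
    """Simulate keeping at most `window` requests in flight on one connection.

    Hypothetical helper for illustration; it does no network I/O.
    """
    pending = deque(requests)
    in_flight = deque()
    completed = []
    peak = 0
    while pending or in_flight:
        # Top up the pipeline until the window is full again.
        while pending and len(in_flight) < window:
            in_flight.append(pending.popleft())
        peak = max(peak, len(in_flight))
        # Assumption: pipelined responses come back in request order,
        # so the oldest outstanding request finishes first.
        completed.append(in_flight.popleft())
    return completed, peak


done, peak = pipeline(range(5000), window=25)
print(len(done), peak, peak * POOL_BYTES)
```

With a window of 25, the pool overhead under this model stays near
25 * 8KB = 200KB for the whole run, instead of the ~40MB needed to write
all 5000 requests up front.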

> In order to stick the full-text in the XML response to the REPORT for
> ra_dav, mod_dav_svn base-64 encodes every file such that it will be
> valid XML parsable by expat. Base-64 encoding is moderately expensive
> and increases the overall space; however, using mod_deflate lets us
> recover some of the space at the cost of even more CPU.

Ouch. That kind of sucks. I wonder if the REPORT couldn't just refer
you to the files you could fetch in binary with GET rather than embed
them in the XML response.
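
For a feel of the space cost being discussed, a quick sketch using Python's
standard library: base-64 turns every 3 bytes into 4 characters, and a
deflate pass can win part of that back. The random 30000-byte payload is my
stand-in for file contents, and `zlib` is only an approximation of what
mod_deflate does on the wire; real files will compress differently.

```python
import base64
import os
import zlib

raw = os.urandom(30000)            # stand-in for a file's full-text (assumption)
encoded = base64.b64encode(raw)    # what would be embedded in the XML response
deflated = zlib.compress(encoded)  # rough analogue of mod_deflate's output

# Base-64 emits 4 output characters for every 3 input bytes.
print(len(raw), len(encoded), len(deflated))
```

The encoded form is a third larger than the raw bytes (40000 versus 30000
here), and the deflated form is smaller than the encoded one, at the price
of the extra CPU Justin mentions.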

>> Also to be clear, you are saying that the increased multiple extra get
>> connections increase the load on the server, but that increase is
>> negligible compared to the load from the report request?
> Correct.
>> I think that in the vast majority of cases, the bottleneck is going to
>> be the network, not the local disk. Most hard disks these days can
>> handle 10 MB/s easily, but not many people have >= 100 Mbps connections
>> to a svn server. Given that, splitting the data stream into multiple
>> tcp sockets tends to lower the total throughput due to the competition,
>> which will increase overall checkout times.
> We'll need to let the data trump any hypotheses. =) -- justin

Received on Sat Feb 18 17:12:29 2006

This is an archived mail posted to the Subversion Dev mailing list.