Status of ra_serf

From: Justin Erenkrantz <justin_at_erenkrantz.com>
Date: 2006-02-17 11:26:05 CET

Here's a quick progress report as to where I am with ra_serf as of r18502.

Checkouts/updates are mostly working (a checkout of svn's trunk from
svn.collab.net works fine). No svn:externals, auth, or other
'goodies' yet. 'svn log' should work; a bug may have crept in when
cleaning up memory after an 'svn up' completes. No, I'm not even
close to being able to run regression tests yet...

We're using the bare-bones update-report and GET/PROPFIND in order to
fetch the actual content. We're utilizing less CPU than neon overall,
but we do take a little bit longer to complete as of r18502. Our
memory usage is generally competitive with ra_dav and, in most cases,
remains constant for the duration of the checkout. (If the pipeline
gets long enough, more requests are in flight at once, which
increases our memory usage accordingly.)

As of r18500, we now open 4 connections to the server in order to do
the checkout. Part of the reason for this is that as we parse the
REPORT response, we can start to fetch the actual content. While we
can keep it to one connection (and have done so until just now), we'll
essentially be blocked for the length of the REPORT response: that's
not acceptable to our users, so we're going to spin up three more
connections to do the GET/PROPFINDs. (Why 4? Well, that's
IE/Firefox/etc's default per-server connection limit and is generally
considered the accepted maximum on the Web today.)
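
To make the connection split concrete, here is a minimal sketch of how
content requests could be spread over the three extra connections while
the REPORT response is still being parsed on the first one. This is not
ra_serf's actual code: the function name, the round-robin policy, and
the PROPFIND-then-GET ordering per file are all assumptions made for
illustration.

```python
from collections import deque


def assign_requests(paths, num_connections=3):
    """Spread per-file requests over the content connections.

    Hypothetical helper: each path discovered in the REPORT response
    gets a PROPFIND and a GET, both queued on the same connection, and
    connections are picked round-robin.
    """
    queues = [deque() for _ in range(num_connections)]
    for index, path in enumerate(paths):
        queue = queues[index % num_connections]
        # Keep both requests for one file on one connection so their
        # responses arrive in a predictable order.
        queue.append(("PROPFIND", path))
        queue.append(("GET", path))
    return queues


if __name__ == "__main__":
    for number, queue in enumerate(assign_requests(["a", "b", "c", "d"])):
        print(number, list(queue))
```

Round-robin is just the simplest policy to show; picking the connection
with the fewest outstanding requests would be another reasonable choice.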

Even with the multiple connections, I have a feeling that ra_serf is
still not using the network as efficiently as possible. My to-do for
tomorrow is to go through network traces and see what's going on and
if there's any room for more improvements. But, with the multiple
connections, we're usually within 10-15% of neon's checkout time,
generally at less CPU time.

Informal comparison numbers (not scientific, but to give you a
ballpark of where we stand):

Extracted httpd 2.2.0 tarball from a remote machine on Mac OS X:
ra_serf checkout: 9.31s user 15.68s system 32% cpu 1:16.90 total
ra_dav checkout: 12.39s user 14.10s system 40% cpu 1:05.80 total

Checkout of httpd trunk from svn.apache.org on Linux:
ra_serf checkout: 3.98s user 5.11s system 10% cpu 1:22.89 total
ra_dav checkout: 3.67s user 2.06s system 8% cpu 1:04.02 total

Checkout of svn trunk from svn.collab.net on Mac OS X:
ra_serf checkout #1: 8.64s user 11.08s system 19% cpu 1:38.93 total
ra_serf checkout #2: 11.89s user 13.73s system 36% cpu 1:10.05 total
ra_dav checkout #1: 11.84s user 9.60s system 18% cpu 1:54.99 total
ra_dav checkout #2: 11.72s user 9.79s system 17% cpu 2:03.88 total

Hmph. Okay, I wasn't expecting ra_serf to be twice as fast; but I
swear I'm not manipulating the data! Hee-hee. As you can see, I just
ran that one twice. There could also be load spikes on svn.collab.net;
too much variation in the times for my taste... BTW, given the
broken Apache httpd on svn.collab.net, ra_serf could end up writing
parts of a checked-out file twice - this is on my to-do list for
tomorrow to fix as well (as long as libsvn_wc supports 'whoops, let me
start that file over again' we should be okay).
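
For anyone who wants to compare the timing lines above without doing
the arithmetic by hand, here is a small sketch that parses one line in
that 'time' format and reports total CPU seconds and wall-clock
seconds. The function names and the regular expression are my own
assumptions; it only handles the "M:SS.ss total" form that appears in
this message, not times with an hours field.

```python
import re

# Matches e.g. "9.31s user 15.68s system 32% cpu 1:16.90 total".
TIME_RE = re.compile(
    r"(?P<user>[\d.]+)s user (?P<sys>[\d.]+)s system \d+% cpu "
    r"(?:(?P<min>\d+):)?(?P<sec>[\d.]+) total"
)


def parse_time_line(line):
    """Return CPU seconds (user + system) and wall-clock seconds."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError("not a recognised timing line: %r" % line)
    cpu = float(match.group("user")) + float(match.group("sys"))
    wall = int(match.group("min") or 0) * 60 + float(match.group("sec"))
    return {"cpu": cpu, "wall": wall}


def relative_wall_time(serf_line, dav_line):
    """Fraction by which the first run's wall time exceeds the second's."""
    serf = parse_time_line(serf_line)
    dav = parse_time_line(dav_line)
    return serf["wall"] / dav["wall"] - 1.0


if __name__ == "__main__":
    serf = "ra_serf checkout: 9.31s user 15.68s system 32% cpu 1:16.90 total"
    dav = "ra_dav checkout: 12.39s user 14.10s system 40% cpu 1:05.80 total"
    print(parse_time_line(serf), parse_time_line(dav))
    print("relative wall time: %.1f%%" % (100 * relative_wall_time(serf, dav)))
```

A negative result from relative_wall_time() means the first run
finished sooner than the second.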

Daniel, Peter: I'll address your concerns on the commits you replied
to earlier in the morning. I'm dead tired right now.

Comments welcomed.

Thanks. -- justin

Received on Fri Feb 17 11:31:00 2006
