
Re: http protocol very slow for moderate-sized data sets

From: Anders J. Munch <ajm_at_flonidan.dk>
Date: 2005-04-28 10:38:39 CEST

From: Chermside, Michael [mailto:mchermside@ingdirect.com]
> I'm setting up a subversion repository, to be accessed via http
> through apache. We have what I consider a moderately (but not
> excessively) large amount of data under version control... about 350
> MB. I am finding the performance to be unacceptably slow.
>
> When I try accessing the repository using the file:// protocol on the
> local machine with a test block of ~150MB it reliably takes about 30
> seconds to check out a fresh working copy. When I try the same query
> from the local machine via the http:// protocol, it takes over 2
> minutes.
>
> Question 1: Why the discrepancy between the http:// and file://
> protocols? What would produce a factor of 4 overhead?

http:// goes through the whole network protocol stack, plus Apache.
Even when testing on localhost, your data is bound to be copied around
a few times. file:// goes straight to the repository files.

Do you really need Apache? svn[+ssh]:// is generally faster, so if
you don't need specific Apache features such as WebDAV or fine-grained
access control, use svnserve.
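
To compare the access methods on your own data, you could time a fresh
checkout over each one. The following is only a rough sketch: it assumes
the svn command-line client is on your PATH, and the function names and
the URLs in the comment are made up for illustration.

```python
import subprocess
import tempfile
import time
from pathlib import Path


def checkout_command(url, dest):
    """Build the argument list for a quiet `svn checkout` of url into dest."""
    return ["svn", "checkout", "--quiet", url, str(dest)]


def time_checkout(url, run=subprocess.run):
    """Check out url into a fresh temporary directory; return elapsed seconds.

    `run` is injectable so the timing logic can be exercised without svn.
    """
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        run(checkout_command(url, Path(tmp) / "wc"), check=True)
        return time.perf_counter() - start


# Example use (hypothetical URLs, substitute your own repository):
#   for url in ("file:///var/svn/repos/trunk",
#               "svn://localhost/repos/trunk",
#               "http://localhost/svn/repos/trunk"):
#       print(url, round(time_checkout(url), 1), "s")
```

Run each URL more than once; the first checkout warms the disk cache and
tends to flatter whichever method is measured second.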

>
> Question 2: Are these timings reasonable? Is subversion this slow?

1-5 MB/s is slow to you? You must have one heck of a workstation.
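
That range comes straight from your own figures. A small sketch of the
arithmetic, taking "over 2 minutes" as exactly 120 seconds (so the http
rate shown is an upper bound):

```python
def throughput_mb_per_s(size_mb, seconds):
    """Average transfer rate for a checkout of size_mb taking `seconds`."""
    return size_mb / seconds


file_rate = throughput_mb_per_s(150, 30)    # file://, about 30 s
http_rate = throughput_mb_per_s(150, 120)   # http://, assumed 120 s

print(f"file://: {file_rate:.2f} MB/s")          # file://: 5.00 MB/s
print(f"http://: {http_rate:.2f} MB/s")          # http://: 1.25 MB/s
print(f"ratio:   {file_rate / http_rate:.1f}x")  # ratio:   4.0x
```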

Yes, the timings are reasonable. As for whether Subversion is this
slow: yes and no. Subversion _checkout_ is this slow, because checkout
is comparatively slow in order to make other things fast. Don't take
checkout speed to be indicative of Subversion's general performance.
Subversion strives to optimise network usage, and as long as your
server is localhost you don't really see the benefits of that.

Two copies of everything are stored locally, the one you work on and a
copy for reference in .svn. Also, the Subversion client uses lots of
little files for administration. But then given that file:// is so
much faster, network overhead is probably more of an issue for you
than local file system speed.

>
> Question 3: If the answer to 2 is no, then does anyone have
> suggestions of what to do to troubleshoot my installation?

Is a full checkout really a common operation for you? For most of us
checkouts are not frequent enough to be performance-critical.

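If the bottleneck is network bandwidth, perhaps you can get some
mileage out of compressing the responses? On Apache 2, which
Subversion requires, that means mod_deflate rather than mod_gzip.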
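
Whether compressing the HTTP responses can help depends on how
compressible your data is. As a rough proxy only (the server does not
necessarily send whole files as they sit on disk), this sketch measures
how well zlib shrinks the files in a working copy. The function names
are made up for illustration.

```python
import zlib
from pathlib import Path


def compression_ratio(data, level=6):
    """Return compressed size / original size for a bytes object (1.0 if empty)."""
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)


def tree_compression_ratio(root, level=6):
    """Overall ratio for all regular files under root, skipping .svn directories."""
    original = compressed = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and ".svn" not in path.parts:
            data = path.read_bytes()
            original += len(data)
            compressed += len(zlib.compress(data, level))
    return compressed / original if original else 1.0
```

A ratio near 1.0 (typical for images, archives and other already-compressed
data) suggests compression will mostly cost CPU time. It also cannot help
much when client and server are the same machine.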

If the bottleneck is the local file system, strive to use a modern
file system that deals well with lots of small files. Microsoft FSs
are said to be bad at this.
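
To get a feel for how your file system copes with lots of small files,
a crude benchmark along these lines may help. It is a sketch, not a
rigorous measurement: the default count and size are arbitrary, the
function name is made up, and OS caching will influence the result.

```python
import tempfile
import time
from pathlib import Path


def time_small_files(count=2000, size=512, parent=None):
    """Create `count` files of `size` bytes each; return seconds spent creating.

    Files go into a temporary directory under `parent` (system default if
    None), which is removed afterwards; deletion time is not included.
    """
    payload = b"x" * size
    with tempfile.TemporaryDirectory(dir=parent) as tmp:
        start = time.perf_counter()
        for i in range(count):
            (Path(tmp) / f"f{i:05d}").write_bytes(payload)
        return time.perf_counter() - start
```

Pass `parent` as a directory on the volume that holds your working
copies, so that the right file system is being measured.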

- Anders

Received on Thu Apr 28 10:42:25 2005

This is an archived mail posted to the Subversion Users mailing list.
