RE: http protocol very slow for moderate-sized data sets

From: Chermside, Michael <mchermside_at_ingdirect.com>
Date: 2005-04-28 15:35:03 CEST

[Anders Munch replied to my questions about http access being slow]

Thank you for taking the time to respond in detail to my questions.

> > Question 1: Why the discrepancy between the http:// and file://
> > protocols? What would produce a factor of 4 overhead?

> http goes through the whole network protocol stack, plus Apache. Even
> when testing on localhost, your data is bound to be copied around
> a few times.
> file:// goes straight to the files.

Still, a factor of 4 surprises me, but if you say this is to be expected,
then I suppose I'll have to put up with it.

> Do you really need Apache? svn[+ssh]:// is generally faster, so if
> you don't need specific Apache features such as WebDAV or fine-grained
> access control, use svnserve.

Unfortunately, yes, I need it. The key piece is access control... the
access needs to be managed through Active Directory, which is easily
done via Apache. Unfortunately, allowing anyone other than the
security team to control access is likely to be unacceptable in our
organization, and asking the security team to learn a different
system is also unlikely.

> > Question 2: Are these timings reasonable? Is subversion this slow?

> 1-5 MB/s is slow to you? You must have one heck of a workstation.

Well actually, yes. We're intending for it to support ~20 developers
and about 40 other people who do less frequent checkouts of
configuration data and occasional updates. And we expect an automated
server to be doing clean checkouts, builds, and unit tests on a
frequent basis (hourly, or perhaps triggered by checkins).
Unfortunately, most of my scaling estimates had been done using
svnserve, and I had not expected such a dramatic difference.

> Yes, the timings are reasonable. Yes and no, subversion _checkout_ is
> this slow. In order to make other things fast, checkout is
> comparatively slow. Don't take checkout speed to be indicative of
> Subversion's general performance. Subversion strives to optimise
> network usage, and as long as your server is localhost you don't
> really see the benefits of that.

I only used localhost while testing the problem, to avoid the chance
that network issues were contributing. We expect to see a big advantage
for developers. And the developers can live with slow initial checkouts
since they do that rarely. I'm more concerned about the users who just
store config files (although those are much smaller than 350 MB) and
the continuous integration server, as both of those need to do clean
checkouts each time.
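
As a rough sanity check on what that means for the continuous integration
server, here is a back-of-the-envelope sketch. It assumes the large data set
is the 350 MB mentioned above and that throughput stays within the 1-5 MB/s
range quoted earlier; the function name is purely illustrative, and fixed
per-checkout overhead is ignored.

```python
def checkout_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time to transfer size_mb at a sustained rate, ignoring fixed overhead."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_s


SIZE_MB = 350  # assumed size of the large data set (figure from this thread)

for rate in (1, 5):  # the 1-5 MB/s range quoted in this thread
    secs = checkout_seconds(SIZE_MB, rate)
    print(f"{rate} MB/s: {secs:.0f} s ({secs / 60:.1f} min)")
```

At those rates a single clean checkout takes somewhere between roughly one
and six minutes, which is why an hourly (or per-checkin) clean checkout is
the case that matters most here.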

> > Question 3: If the answer to 2 is no, then does anyone have
> > suggestions of what to do to troubleshoot my installation?

> Is a full checkout really a common operation for you? For most of us
> checkouts are not frequent enough to be performance-critical.

Yes, see above.

> If the bottleneck is network bandwidth, perhaps you can get some
> mileage out of mod_gzip?

But since I'm seeing the factor-of-4 slowdown on localhost, it can't
be bandwidth limited, right? On the other hand, it's certainly easy
enough to try this... thanks for the suggestion.
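
One way to check whether a change such as compression makes any difference
is to time the same clean checkout through each access method. The sketch
below is only an illustration: the repository URLs are supplied on the
command line (none are assumed here), the helper names are made up, and it
relies on the `svn` command-line client being on the PATH.

```python
import subprocess
import sys
import tempfile
import time


def time_command(cmd: list[str]) -> float:
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def checkout_command(url: str, dest: str) -> list[str]:
    """Build the argument list for a quiet, clean checkout of url into dest."""
    return ["svn", "checkout", "--quiet", url, dest]


if __name__ == "__main__":
    # Each URL given on the command line is checked out into its own
    # fresh temporary directory so that every run is a clean checkout.
    for url in sys.argv[1:]:
        with tempfile.TemporaryDirectory() as dest:
            elapsed = time_command(checkout_command(url, dest))
        print(f"{url}: {elapsed:.1f} s")
```

Running it with the http:// and file:// URLs of the same repository, before
and after a configuration change, gives directly comparable numbers.
Repeating each run a few times helps separate real differences from disk
cache effects.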

> If the bottleneck is the local file system, strive to use a modern
> file system that deals well with lots of small files. Microsoft FSs
> are said to be bad at this.

Now that might well be an issue: this IS running on an NTFS file system.
There IS some possibility of running a unix OS if we know it will
improve performance significantly.
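
Before committing to a different OS, the small-file cost can be measured
directly on each candidate file system. This is a minimal sketch under
stated assumptions: the file count and size are arbitrary illustrative
values, not measurements of a real working copy, and the function name is
hypothetical.

```python
import os
import tempfile
import time


def time_small_file_writes(directory: str, count: int = 2000, size: int = 1024) -> float:
    """Create `count` files of `size` bytes in `directory`; return elapsed seconds."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"f{i:06d}.tmp"), "wb") as fh:
            fh.write(payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The temporary directory lives on whatever file system hosts the
    # default temp location; pass dir=... to test a specific volume.
    with tempfile.TemporaryDirectory() as tmp:
        elapsed = time_small_file_writes(tmp)
        print(f"2000 files of 1 KiB: {elapsed:.2f} s")
```

Comparing the result on the NTFS volume with the same run on a unix file
system on similar hardware would show whether the file system alone accounts
for a meaningful share of the checkout time.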

-- Michael Chermside

Received on Thu Apr 28 16:22:25 2005

This is an archived mail posted to the Subversion Users mailing list.