caching proxies and SVN network perf

From: Greg Stein <gstein_at_lyra.org>
Date: 2000-10-23 18:51:01 CEST

More thoughts from the conference and my conversations with people...

A number of people have asked whether Subversion [and its use of HTTP / DAV]
will be as fast as CVS. My answer has always been a resounding "yes". Why?

Let's consider the case where SourceForge uses SVN for the projects. SF is
located in California. Now, let's turn to the European developers trying to
get stuff out of SF. What should they do? Point their SVN client at a
caching proxy located in Europe. John checks out some GNOME project, which
loads it into the cache. (Note: versions are immutable, so the cache will
hold onto that particular version until the cache's FIFO strategy tosses the
file out; it won't ever expire.) Now, Jane comes along and checks out the
same project. Hey! She gets it directly from the cache. No transatlantic
checkout. Of course, caching proxies at business network edges also provide
the same benefits.
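
A minimal sketch of the idea above, assuming a toy in-memory cache keyed by
(path, revision). The class name, the origin callback, and the sample path
are made up for illustration; a real caching proxy works at the HTTP level
and is far more involved.

```python
from collections import OrderedDict


class FifoVersionCache:
    """Toy proxy cache keyed by (path, revision); entries never go stale."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin  # hypothetical origin callback
        self.entries = OrderedDict()
        self.origin_fetches = 0

    def get(self, path, revision):
        key = (path, revision)
        if key in self.entries:
            # Hit: a given version is immutable, so no revalidation is needed.
            return self.entries[key]
        # Miss: one trip to the origin, then remember the result.
        body = self.fetch_from_origin(path, revision)
        self.origin_fetches += 1
        self.entries[key] = body
        if len(self.entries) > self.capacity:
            # FIFO eviction: drop the entry that was inserted first.
            self.entries.popitem(last=False)
        return body


def origin(path, revision):
    # Stand-in for the far-away repository server.
    return f"contents of {path} at r{revision}"


cache = FifoVersionCache(capacity=2, fetch_from_origin=origin)
john = cache.get("/gnome/foo.c", 7)  # miss, goes to the origin
jane = cache.get("/gnome/foo.c", 7)  # hit, served from the cache
```

Only John's request reaches the origin; Jane's is answered locally, and the
entry stays valid until FIFO eviction pushes it out.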

Also, our SVNDIFF format is quite good. HTTP can also (transparently) add
GZIP encoding on top of that. The GZIP will squeeze down not only our diffs,
but original checkouts, too!
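
To get a rough feel for what GZIP does to source-like text, here is a small
sketch using Python's standard gzip module. The payload is invented and very
repetitive, so the ratio it prints says nothing about real SVNDIFF data.

```python
import gzip

# Made-up, repetitive source text standing in for a checkout or diff body.
payload = ("int main(void)\n{\n    return 0;\n}\n" * 200).encode("ascii")

compressed = gzip.compress(payload)
restored = gzip.decompress(compressed)

print(len(payload), "bytes before,", len(compressed), "bytes after gzip")
```

The round trip is lossless, which is what lets a server or proxy apply the
encoding without the client's data changing.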

HTTP request pipelining allows us to shove a dozen GET and PROPFIND requests
at the server all at once. We then sit back and wait for each request to be
returned (rather than send/wait/send/wait).
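
A back-of-the-envelope model of why that matters, assuming a fixed round-trip
time and a fixed per-request service time. Both functions and all the numbers
are illustrative assumptions, not measurements of any real server.

```python
def sequential_time(n_requests, rtt, service_time):
    """Send one request, wait for its response, then send the next."""
    return n_requests * (rtt + service_time)


def pipelined_time(n_requests, rtt, service_time):
    """Send all requests at once; pay the round trip only once."""
    return rtt + n_requests * service_time


# Illustrative numbers only: 12 requests, 150 ms round trip, 5 ms per request.
seq = sequential_time(12, 0.150, 0.005)
pipe = pipelined_time(12, 0.150, 0.005)
```

With these made-up numbers the sequential case takes about 1.86 seconds and
the pipelined case about 0.21 seconds; the gap grows with the round-trip time.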

We also happen to be using one of the most tuned network servers out there
(Apache). In addition, our repository is based on the Berkeley DB rather
than files scattered all over the filesystem. And we can just yank a file
from the repository and send it... no RCS file parsing or watching out for
@@ codes in the body of the file.

Similar to the caching proxy concept, it is also possible for a site such as
SF to install a number of caching reverse proxies in front of their SVN
repository. As requests are made, they are load-balanced across a number of
servers which are fulfilling the requests.
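
One simple way to picture the load balancing is plain round-robin, sketched
below. The host names are hypothetical, and a real deployment might balance
on load or on the request URL instead.

```python
from itertools import cycle

# Hypothetical reverse-proxy host names.
backends = cycle(["cache1.example.org", "cache2.example.org", "cache3.example.org"])


def pick_backend():
    """Return the next reverse proxy in round-robin order."""
    return next(backends)


assigned = [pick_backend() for _ in range(4)]
```

The fourth request wraps around to the first proxy again.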

make a guess that it isn't going to be an issue from a complete system
standpoint. With all the other factors weighing in our favor, the extra 100
bytes in a request/response just don't seem like a problem.

Of course, we can run our tests later, but I am quite confident that SVN is
really going to beat CVS quite handily. And when we take advantage of the
HTTP infrastructure (e.g. caching proxies, web farms, ...) we will *really*
cream CVS. Heck, things like CVSup or rsync'ing CVS repositories will simply
disappear in favor of caches.

Fun fun...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:12 2006

This is an archived mail posted to the Subversion Dev mailing list.