Re: Performance (Subversion vs. CVS)

From: Ben Collins-Sussman <sussman_at_collab.net>
Date: 2004-09-14 06:04:11 CEST

On Mon, 2004-09-13 at 21:37, Greg Ward wrote:

> Here are some timing results from CVS; all times are in seconds, each
> measurement repeated 5 times (too lazy to record two decimal places on
> the first test, got keen after that):
>
> initial co              4.1   3.6   3.9   5.4   5.0
> up (no-op)              1.00  0.92  0.87  0.86  1.46
> up -r BETA (from -A)    4.24  3.36  4.21  2.99  4.14
> up -A (from -r BETA)    1.96  2.61  2.12  1.80  2.53
> co -r <recent-branch>   6.83  5.85  5.05  6.31  5.48
>

So how many files were changing every time you did a branch switch?
It's hard to get a sense of how 'busy' these updates were. Saying that
there are 1100 files in the working copy doesn't mean much if only 2
files change when updating to different revisions and branches. ;-)

> Now here are the same activities with Subversion 1.0.6:
>
> co .../trunk            49.3  47.5  46.8
> up (no-op)              1.39  0.54  0.41  2.53  1.17
> switch .../tags/BETA    17.8  19.3  18.0
> switch .../trunk        34.3  35.6  35.7
>

We certainly haven't seen times like this, nothing so severe. Our
experiments have shown svn about 1.5-2x slower. I'm not sure why you're
seeing such different results.
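
To put the quoted numbers side by side with that 1.5-2x figure, here is
a rough sketch that averages each row and prints the svn/cvs ratio. The
timings are copied from the quoted tables; the pairing of CVS rows with
SVN rows (for example 'up -r BETA' with 'switch .../tags/BETA') and the
short operation labels are assumptions made for illustration only.

```python
from statistics import mean

# Timings in seconds, copied from the quoted message.
# The operation labels and the CVS/SVN row pairing are assumptions.
cvs = {
    "checkout": [4.1, 3.6, 3.9, 5.4, 5.0],
    "no-op update": [1.00, 0.92, 0.87, 0.86, 1.46],
    "switch to BETA": [4.24, 3.36, 4.21, 2.99, 4.14],
    "switch to trunk": [1.96, 2.61, 2.12, 1.80, 2.53],
}
svn = {
    "checkout": [49.3, 47.5, 46.8],
    "no-op update": [1.39, 0.54, 0.41, 2.53, 1.17],
    "switch to BETA": [17.8, 19.3, 18.0],
    "switch to trunk": [34.3, 35.6, 35.7],
}


def slowdown(cvs_times, svn_times):
    """Return mean(svn) / mean(cvs) for one operation."""
    return mean(svn_times) / mean(cvs_times)


ratios = {op: slowdown(cvs[op], svn[op]) for op in cvs}

for op, ratio in ratios.items():
    print(f"{op:16s} svn/cvs = {ratio:5.1f}x")
```

On these numbers the no-op update is close to parity, while checkout
and both switches come out several times slower, well outside the
1.5-2x range.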

Try re-running some similar CVS and SVN tests over a network, using 'cvs
-z3' for a fair comparison. The svn client is unconditionally
compressing all data it sends to the server, and the server is
decompressing as it receives. And the same story going from server to
client. This even happens when working over file:/// urls. (Yes, svn
needs to grow a switch that lets users toggle compression, to choose
between optimizing for network traffic or CPU.)
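
The network-versus-CPU trade-off can be felt with a small sketch like
the one below. It is not how svn or cvs compress anything internally:
zlib, the compression levels, and the made-up payload are all
assumptions, chosen only to show that compressing shrinks the data at
the price of CPU time, which is wasted when no network is involved.

```python
import time
import zlib


def compress_cost(payload: bytes, level: int):
    """Return (compressed_size, seconds) for one zlib pass at the given level."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


# Hypothetical payload: repetitive text standing in for source files.
payload = b"int main(void) { return 0; }\n" * 50_000

# Level 0 stores the data uncompressed; 3 mirrors the '-z3' idea; 9 is the maximum.
for level in (0, 3, 9):
    size, seconds = compress_cost(payload, level)
    print(f"level {level}: {len(payload)} -> {size} bytes in {seconds:.4f}s")
```

Timings will vary by machine, so treat the output as a feel for the
trade-off rather than a benchmark.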

Another thing to admit is that Subversion's working copy code
(libsvn_wc) is doing VERY complex, time-consuming things. It's dropping
locks in every .svn/ area. Every .svn/entries file is XML that must be
parsed into memory, then unparsed to save back to disk.

And the biggest difference of all: every change made to the working
copy is journaled first in an xml file, then the xml journal is parsed
and executed. The benefit is that you can interrupt an svn working-copy
operation, and it's able to get itself back into a consistent state by
running 'svn cleanup'. The disadvantage is huge, slow overhead. Lots
of tiny files being created, moved, destroyed in the .svn/ area, just to
update a single file.
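
The journal-then-execute idea can be sketched roughly as follows. This
is a toy model, not libsvn_wc's actual code or log format: the file
name "log", the <journal> and <mv> elements, and every function name
here are hypothetical. It only shows why writing the plan down first
lets an interrupted update be finished later by replaying the journal.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

LOG_NAME = "log"  # hypothetical journal file name


class Interrupted(Exception):
    """Stands in for the user hitting Ctrl-C mid-update."""


def write_journal(admin_dir, moves):
    """Record the intended file moves as XML before touching anything."""
    root = ET.Element("journal")
    for src, dst in moves:
        ET.SubElement(root, "mv", src=src, dst=dst)
    ET.ElementTree(root).write(os.path.join(admin_dir, LOG_NAME))


def run_journal(admin_dir, interrupt_after=None):
    """Parse the journal and execute it. Safe to re-run after an interruption."""
    log_path = os.path.join(admin_dir, LOG_NAME)
    if not os.path.exists(log_path):
        return  # nothing pending
    for done, op in enumerate(ET.parse(log_path).getroot()):
        if interrupt_after is not None and done >= interrupt_after:
            raise Interrupted()
        # Idempotent step: if src is gone, the move already happened.
        if os.path.exists(op.get("src")):
            os.replace(op.get("src"), op.get("dst"))
    os.remove(log_path)  # journal fully applied


def demo():
    """Interrupt an update halfway, then recover by replaying the journal."""
    with tempfile.TemporaryDirectory() as wc:
        moves = []
        for name in ("a.c", "b.c"):
            tmp = os.path.join(wc, name + ".tmp")
            with open(tmp, "w") as f:
                f.write("new contents of " + name)
            moves.append((tmp, os.path.join(wc, name)))
        write_journal(wc, moves)
        try:
            run_journal(wc, interrupt_after=1)
        except Interrupted:
            pass  # working copy is now half-updated
        run_journal(wc)  # the 'cleanup' step: replay what is left
        return sorted(os.listdir(wc))


print(demo())
```

Even in this toy version, updating two files means creating, parsing,
and deleting an extra file plus two temporary files, which hints at
where the overhead comes from.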

We're well aware of libsvn_wc's shortcomings in performance -- we have a
vague, long-term goal to rewrite it for svn 2.0. But it's also doing a
heck of a lot more work than CVS, and it's not clear how much we can
remedy that. A lot of it is intentional, or unavoidable.

Another thing to ponder: think of 'performance' as something amortized
over many weeks of use. Okay, it might take longer to switch to a
branch. But if you tag a huge tree constantly (you have 2400 tags),
think about how much time is saved via O(1) svn tagging. Think about
time saved when you commit a tiny change to a huge binary file (the
client sends only binary diffs). There are other wins which may balance
out over time.

And finally, there's the 'qualitative' perception of performance. Try
using svn for a while. It definitely doesn't "feel" slower, spread out
over the course of a day's work. If the performance difference were
annoying and noticeable, we wouldn't have released 1.0 in the first
place.

Just some food for thought here... my late-night ramblings before going
to bed. Reading back, I sound like some sort of salesman trying to
promote a product. Luckily, we have tons of testimonials on this page.
Read it for warm fuzzies, if you have doubts:

     http://subversion.tigris.org/propaganda.html

Received on Tue Sep 14 06:06:44 2004

This is an archived mail posted to the Subversion Users mailing list.
