Re: Is sqlite fast enough?

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Thu, 4 Mar 2010 14:04:16 +0100

On Mon, Feb 22, 2010 at 12:50 PM, Matthew Bentham <mjb67_at_artvps.com> wrote:
> On 22/02/2010 11:42, Matthew Bentham wrote:
>>
>> On 22/02/2010 11:13, Philip Martin wrote:
>>>
>>> Matthew Bentham<mjb67_at_artvps.com> writes:
>>>
>>>> For me on CYGWIN_NT-6.0-WOW64 brahe 1.7.1(0.218/5/3) 2009-12-07 11:48
>>>> i686 Cygwin
>>>
>>> Thanks for testing!
>>>
>>>> $ svn --version
>>>> svn, version 1.6.9 (r901367)
>>>>
>>>> Create the test repo using the shell script, repeat "$ time svn
>>>> status" a few times:
>>>> real 0m37.303s
>>>> real 0m15.754s
>>>> real 0m15.832s
>>>
>>> I know Subversion is slow on Windows but that is extreme, it's about
>>> the same as my Linux machine when the cache is cold and it has to wait
>>> for real disk IO; once in memory its an order of magnitude faster. I
>>> suspect the cygwin layer might be contributing to that (15 seconds of
>>> CPU). Would it be possible for you to try "svn status" with a
>>> non-cygwin client?
>>>
>>
>> Sure:
>>
>> /cygdrive/c/Program\ Files\ \(x86\)/CollabNet\ Subversion/svn.exe
>> --version
>> svn, version 1.6.5 (r38866)
>> compiled Aug 21 2009, 21:38:11
>>
>> time /cygdrive/c/Program\ Files\ \(x86\)/CollabNet\ Subversion/svn.exe
>> status
>>
>> real 0m8.569s
>> real 0m8.599s
>> real 0m8.611s
>>
>> Quite a bit faster :-) Not as fast as your 1.1s on Debian though :-(
>> The machine is a 2.5Ghz Core2 Quad running Vista 64.
>>
>> Matthew
>>
>
> MMm, those latter tests were also done within a cygwin bash shell so that I
> could use "time", but I've just tried in a Windows cmd shell using my
> wristwatch and got the same times.
>

Just adding my .02 to this oldish thread ...

First, great effort in trying to get a more objective, measured idea
of what kind of performance improvement might be expected ... so
thanks for that.

Second, when I'm looking at client-side svn performance, I always test
with a cold cache on the client. IMHO, that's the situation that's
most similar to "real life" usage. A user will almost never run "svn
status" two times consecutively. Same for update, merge, ... all the
actions that require a lot of I/O on the client.

At least not in our company. Most users only update once a day, so
that's always straight from disk. I think the same applies to "status"
(user does lots of stuff in IDE, surfs the 'net a little, consults
some documentation, compiles, tests; oh now it's time to commit some
of this, so let's get a status). So in my book those numbers where the
svn client can find everything in disk cache are not relevant at all.

Just to add some more real-life info (I've done a lot of tests lately
with our 1.6 FSFS repo and 1.6 Windows (SlikSVN) clients on
different hardware, in an effort to estimate where the bottlenecks
are, where we should invest first, ...):

"Real-life" working copy with 3581 dirs with 35260 files in them.
Performing svn status (clean WC, nothing modified, freshly checked out).

1) "old" desktop pc - Windows XP - 7200 rpm HDD - Pentium D 2.8 GHz - 4 GB RAM
- With cold cache (after reboot, and waiting a couple of minutes)
$ time svn status
real 2m26.156s
real 2m0.421s

- With hot cache (second time svn status after the first one)
real 0m4.156s
real 0m4.512s

2) "new" desktop pc - Windows Vista - 10k rpm HDD - Core 2 Duo 3.0 GHz -
   4 GB RAM
- With "cold" cache [1]
real 0m42.822s
real 0m38.844s
real 0m30.030s
real 0m17.066s
real 0m15.272s

- With hot cache
real 0m8.908s
real 0m4.774s
real 0m4.712s
real 0m4.773s
real 0m4.758s

3) SSD desktop pc - Windows Vista - SSD Intel X25-M - Core 2 Duo 3.0 GHz -
   4 GB RAM
- With "cold" cache [1]
real 0m14.788s
real 0m7.378s
real 0m6.583s
real 0m6.456s

- With hot cache
real 0m4.758s
real 0m4.758s
real 0m4.743s
real 0m4.738s
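
For anyone who wants to repeat this kind of measurement without a bash
"time" builtin (for example from a plain Windows cmd shell), here is a
minimal Python sketch. The helper names time_command and format_real are
made up for this illustration. The script only measures wall-clock time;
it cannot make the cache cold by itself, so the first run only counts as
"cold" if it is started after a reboot as described above.

```python
import subprocess
import time


def time_command(argv, runs=3, cwd=None):
    """Run argv `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the output so that terminal rendering does not distort
        # the measurement; a non-zero exit status is ignored on purpose.
        subprocess.run(argv, cwd=cwd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings


def format_real(seconds):
    """Format seconds like the `real` line printed by bash's `time`."""
    minutes, rest = divmod(seconds, 60)
    return f"real {int(minutes)}m{rest:.3f}s"
```

Assuming svn is on the PATH and "path/to/wc" stands for the working copy,
time_command(["svn", "status"], runs=3, cwd="path/to/wc") returns three
timings: the first one is the cold-cache candidate, the later ones are
the hot-cache runs. Each can be printed with format_real.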

[1] I don't really understand why these numbers go down (status gets
faster) even with reboots in between. Then again, I don't know enough
about Windows' caching strategy. Maybe someone else can comment on
this, but my supposition is that there is some built-in indexing
functionality that makes Windows pull files into RAM right after boot,
because it knows those files are read a lot, so they have a high chance
of being read again etc ...

As you can see, the numbers for "hot cache" are almost always the
same, regardless of the storage system (the one exception I'll
attribute to my not having waited long enough after reboot; maybe some
Windows services were still starting). Those hot-cache numbers are just
not realistic; they're certainly not what our users experience. Going
down from 2 minutes (7200 rpm HDD) to 17-ish seconds (10k rpm) to
7-ish seconds (SSD) seems closer to reality to me.
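
To put the cold versus hot comparison in numbers, here is a small Python
sketch that parses the "real" timings listed above and computes a
cold/hot ratio per machine. The timings are copied from this message;
the helper names (parse_real, summarize) and the choice of the median as
the summary statistic are assumptions made for this illustration, not
something the measurements themselves prescribe.

```python
import re
from statistics import median


def parse_real(text):
    """Convert a bash `time` value such as '2m26.156s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"not a 'real' timing: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Timings copied from the measurements in this message.
COLD = {
    "7200 rpm HDD": ["2m26.156s", "2m0.421s"],
    "10k rpm HDD": ["0m42.822s", "0m38.844s", "0m30.030s",
                    "0m17.066s", "0m15.272s"],
    "SSD": ["0m14.788s", "0m7.378s", "0m6.583s", "0m6.456s"],
}
HOT = {
    "7200 rpm HDD": ["0m4.156s", "0m4.512s"],
    "10k rpm HDD": ["0m8.908s", "0m4.774s", "0m4.712s",
                    "0m4.773s", "0m4.758s"],
    "SSD": ["0m4.758s", "0m4.758s", "0m4.743s", "0m4.738s"],
}


def summarize(cold, hot):
    """Return {machine: (median cold s, median hot s, cold/hot ratio)}."""
    rows = {}
    for machine in cold:
        c = median(parse_real(t) for t in cold[machine])
        h = median(parse_real(t) for t in hot[machine])
        rows[machine] = (c, h, c / h)
    return rows


for machine, (c, h, ratio) in summarize(COLD, HOT).items():
    print(f"{machine:>12}: cold {c:7.3f}s  hot {h:6.3f}s  ratio {ratio:4.1f}x")
```

With the median as the summary, the hot-cache column stays within a few
tenths of a second across all three machines, while the cold/hot ratio
drops from roughly 30x on the 7200 rpm disk to about 1.5x on the SSD.
Picking the last cold run instead of the median would give values closer
to the "17-ish" and "7-ish" seconds mentioned above.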

Being optimistic, I'm hoping that WC-NG will bring sub-5-second
statuses for this working copy to the masses, i.e. to all those
developers stuck with laptops and desktops with 7200 rpm HDDs :).

Thanks for all the continued efforts!

Johan
Received on 2010-03-04 14:04:53 CET

This is an archived mail posted to the Subversion Dev mailing list.