
Re: svn list -R of medium-size repository takes 10 hours.

From: Carsten Koch <Carsten.Koch_at_icem.com>
Date: 2005-07-27 18:22:03 CEST

Barry Scott wrote:
> I think you need to take this up on the svn dev list.

OK.

I have now looked at a much smaller test case with ethereal.
This test case does a 'svn list -R -v' on a directory
that contains only 16 files and one subdirectory with
another 8 files. So 'svn list -R -v' returns a total
of 1465 bytes in 25 lines and 'svn list -R' returns a total
of 440 bytes in 25 lines.

ethereal tells me that 60 k bytes in 113 packets were exchanged
in 7.32 seconds over a 64 kbit/s ISDN line.

If I am doing my math correctly, the ISDN line is running at
almost full speed, which means that my test case is slow due to
bandwidth, not due to latency.
So the only problem, at least in this test case, is that
1465 bytes (or 440 bytes) of end result are packed into
60 k bytes of protocol, resulting in over 4000%
(or over 13000%) protocol overhead.
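
As a rough check of the arithmetic above, here is a minimal Python sketch.
It assumes that "60 k bytes" means 60,000 bytes and that the line rate is
a nominal 64,000 bit/s; the function and constant names are only
illustrative.

```python
# Figures quoted from the ethereal capture; "k" is assumed to mean 1000.
WIRE_BYTES = 60 * 1000       # bytes exchanged on the wire (rounded figure)
ELAPSED_SECONDS = 7.32       # duration of the exchange
LINE_RATE_BPS = 64 * 1000    # nominal ISDN rate in bit/s (assumption)
PAYLOAD_VERBOSE = 1465       # bytes printed by 'svn list -R -v'
PAYLOAD_PLAIN = 440          # bytes printed by 'svn list -R'


def throughput_bps(wire_bytes, seconds):
    """Average bits per second observed on the wire."""
    return wire_bytes * 8 / seconds


def blowup_percent(wire_bytes, payload_bytes):
    """Wire traffic expressed as a percentage of the useful payload."""
    return 100.0 * wire_bytes / payload_bytes


if __name__ == "__main__":
    rate = throughput_bps(WIRE_BYTES, ELAPSED_SECONDS)
    print("throughput: %.0f bit/s, %.0f%% of nominal line rate"
          % (rate, 100.0 * rate / LINE_RATE_BPS))
    print("wire/payload with -v:    %.0f%%"
          % blowup_percent(WIRE_BYTES, PAYLOAD_VERBOSE))
    print("wire/payload without -v: %.0f%%"
          % blowup_percent(WIRE_BYTES, PAYLOAD_PLAIN))
```

Since the byte count is a rounded figure, the computed rate lands right
around the nominal line rate, which is consistent with the transfer being
bandwidth-bound. The wire traffic comes out at roughly 41 times the
verbose output and roughly 136 times the plain output.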

I looked at the data transmitted and found all kinds of XML
data that even with -v is never displayed: Apparently the data
contains names and values of properties, md5-checksums,
repository-uuids, etc. The dump even contains the string
http://subversion.tigris.org 55 times. I have no idea why
that could be useful.

Of course I fully understand that the protocol is very general
and satisfies more needs than just those of 'svn list'.
I also understand that transmitting uncompressed XML is
both very flexible and easy. But 7 seconds to transmit 25 lines
of listing? One must be very patient to like that. ;-)

My question is: Am I the only one suffering from terribly
slow "svn list -R" performance?
If this is not of general interest, I could create a
quick-and-dirty local solution, maybe based on "svnlook tree"
in the post-commit hook.
If this is of general interest, would somebody be willing
to fix it, so that "svn list -R" becomes up to 130 times
faster?

Btw: "svnlook tree" takes about 14 seconds to list the
repository of my test case below (the one that takes 10
hours over ISDN and 45 minutes locally with "svn list -R").
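
For illustration, the quick-and-dirty local solution could look roughly
like the Python sketch below. It is only a sketch under assumptions: it
relies on the post-commit hook receiving the repository path and the new
revision as its first two arguments, on "svnlook tree" accepting
--full-paths and -r, and on a hypothetical cache file name
(tree-cache.txt) that clients would then fetch instead of running
"svn list -R".

```python
import os
import subprocess
import sys
import tempfile

CACHE_NAME = "tree-cache.txt"  # hypothetical file name, not an svn convention


def build_command(repos, rev):
    """Command line that lists every path in the given revision."""
    return ["svnlook", "tree", "--full-paths", "-r", str(rev), repos]


def write_atomically(path, text):
    """Write to a temp file in the same directory, then rename over the
    target so that readers never see a half-written listing."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.replace(tmp_path, path)


def refresh_cache(repos, rev, cache_path, run=subprocess.check_output):
    """Regenerate the cached listing; 'run' can be swapped out for testing."""
    listing = run(build_command(repos, rev), text=True)
    write_atomically(cache_path, listing)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Invoked by the hook as: post-commit REPOS REV
    refresh_cache(sys.argv[1], sys.argv[2],
                  os.path.join(sys.argv[1], CACHE_NAME))
```

Note that mkstemp creates the file readable by its owner only, so a real
hook would probably need to adjust the permissions and put the cache
somewhere the clients can reach.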

Thanks for any insight and Cheers,

Carsten.

>
> Barry
>
> On Jul 21, 2005, at 10:02, Carsten Koch wrote:
>
>> I have written a python script (using pysvn) that needs to know
>> all file/directory names under a certain URL.
>> The script works fine, but it takes ages to complete.
>> The underlying problem is that "svn list -R" is extremely slow.
>>
>> Does anybody know a workaround against the performance problem
>> of "svn list -R"?
>>
>> Our repository was created just a few months ago, so it is
>> not really big yet, but an "svn list -R" already takes 9 hours,
>> 53 minutes and 56 seconds if run over my 128 kbit/s ISDN line:
>>
>> /usr/bin/time -v svn list -R http://svn/svn | wc -l
>> Command being timed: "svn list -R http://svn/svn"
>> User time (seconds): 757.42
>> System time (seconds): 6.58
>> Percent of CPU this job got: 2%
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 9:53:56
>> Average shared text size (kbytes): 0
>> Average unshared data size (kbytes): 0
>> Average stack size (kbytes): 0
>> Average total size (kbytes): 0
>> Maximum resident set size (kbytes): 0
>> Average resident set size (kbytes): 0
>> Major (requiring I/O) page faults: 0
>> Minor (reclaiming a frame) page faults: 99632
>> Voluntary context switches: 206892
>> Involuntary context switches: 613
>> Swaps: 0
>> File system inputs: 0
>> File system outputs: 0
>> Socket messages sent: 0
>> Socket messages received: 0
>> Signals delivered: 0
>> Page size (bytes): 4096
>> Exit status: 0
>> 139529
>>
>> (This test was run at night with no other load on the ISDN line
>> and almost no other load on either machine)
>>
>> Needless to say, this makes "svn list -R" completely useless for
>> me. ;-)
>>
>> Even when run directly on the svn server, the same "svn list -R"
>> command takes about 45 minutes:
>>
>> /usr/bin/time -v svn list -R http://svn/svn | wc -l
>> Command being timed: "svn list -R http://svn/svn"
>> User time (seconds): 960.05
>> System time (seconds): 32.64
>> Percent of CPU this job got: 36%
>> Elapsed (wall clock) time (h:mm:ss or m:ss): 44:58.62
>> Average shared text size (kbytes): 0
>> Average unshared data size (kbytes): 0
>> Average stack size (kbytes): 0
>> Average total size (kbytes): 0
>> Maximum resident set size (kbytes): 0
>> Average resident set size (kbytes): 0
>> Major (requiring I/O) page faults: 6096
>> Minor (reclaiming a frame) page faults: 115157
>> Voluntary context switches: 0
>> Involuntary context switches: 0
>> Swaps: 0
>> File system inputs: 0
>> File system outputs: 0
>> Socket messages sent: 0
>> Socket messages received: 0
>> Signals delivered: 0
>> Page size (bytes): 4096
>> Exit status: 0
>> 133250
>>
>>
>> I know that this is going to be fixed by issue 1809, see
>> http://subversion.tigris.org/issues/show_bug.cgi?id=1809
>>
>> Is there anything that I can do in the meantime?
>> Would a series of non-recursive lists be faster than the
>> recursive list?
>> Do I have to implement some kind of cache in the post-commit
>> hook?
>>
>> Any hints will be appreciated.
>>
>> Thanks and Cheers,
>>
>> Carsten.

Received on Wed Jul 27 18:29:00 2005

This is an archived mail posted to the Subversion Dev mailing list.