Howdy,
I've got a program that needs to check out specific files at specific
versions. In this particular case a branch does not make sense. I have
found that the performance of svn+ssh for this is very bad.
I run the rough equivalent of:
svn update -r 2 file1 file2 file3 file4 file5
svn update -r 3 file6 file7 file8 file9 file10
Overall I have about 100 such files and two svn update calls. I've
accomplished this with an xargs frontend to svn so as not to overrun the
command line.
If I use file:/// as the protocol, it runs in 3 seconds.
If I use svn+ssh:// as the protocol, it takes 53 seconds.
If I run an svn update -r 3 with no files listed, it takes about 2 seconds.
I wrote a program against the svn API directly to accept the file lists,
perform the authentication a single time, and then call
svn_client_update3. This still runs very slowly, still around 53 seconds.
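
In case it helps to see it, here is a minimal sketch of roughly what my
program does per revision batch (untested as pasted here; it assumes the
1.6 client library's svn_client_update3 and a ctx that already carries the
auth baton; names and error handling are simplified):

    #include <apr_pools.h>
    #include <apr_tables.h>
    #include <svn_client.h>
    #include <svn_opt.h>

    /* Push every working-copy path into one array and hand the whole
     * batch to a single svn_client_update3 call at the given revision. */
    static svn_error_t *
    update_batch(const char **files, int nfiles, svn_revnum_t rev,
                 svn_client_ctx_t *ctx, apr_pool_t *pool)
    {
      apr_array_header_t *paths =
        apr_array_make(pool, nfiles, sizeof(const char *));
      apr_array_header_t *result_revs;
      svn_opt_revision_t revision;
      int i;

      for (i = 0; i < nfiles; i++)
        APR_ARRAY_PUSH(paths, const char *) = files[i];

      revision.kind = svn_opt_revision_number;
      revision.value.number = rev;

      /* One call for the whole batch; internally this still seems to
       * walk the paths one at a time. */
      return svn_client_update3(&result_revs, paths, &revision,
                                svn_depth_infinity,
                                FALSE /* depth_is_sticky */,
                                FALSE /* ignore_externals */,
                                FALSE /* allow_unver_obstructions */,
                                ctx, pool);
    }
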
I suspect the problem is that each individual file is called out,
locked, etc. Is there a way to batch these locks together or otherwise
improve performance, e.g. cause the ssh channel / RA session to be reused?
Perusing the source code suggests that svn_client__update_internal will
be called for each element in my list of paths. Since an individual file
lock / .svn directory write does not seem to be particularly costly on its
own, I suspect the problem lies in the svn_client__open_ra_session_internal
and svn_ra_do_update2 calls made from svn_client__update_internal. Is the
Subversion code opening a new RA session for each of these files, at the
cost of a fresh ssh + svnserve on the remote end each time? Is there a way
to force a single RA session across all the files at the API level without
writing my own svn_client__update_internal?
Thoughts here?
thanks!
--eric