Performance of svn+ssh vs. file for multiple files

From: Eric Peers <eric_at_missinglinktools.com>
Date: Tue, 06 Jul 2010 12:17:31 -0600

Howdy,

I've got a program that needs to check out specific files at specific
versions. In this particular case a branch does not make sense. I have
found that svn+ssh performs very badly for this.

I run the rough equivalent of:
svn update -r 2 file1 file2 file3 file4 file5
svn update -r 3 file6 file7 file8 file9 file10

Overall I have about 100 such files and 2 svn update calls. I've
accomplished this with an xargs frontend to svn so as not to overrun the
command line.
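
As a minimal sketch of that batching step, the snippet below groups files by pinned revision and splits each group into bounded argument lists, one `svn update` invocation per batch. The names `chunked`, `build_update_commands` and `max_files` are hypothetical, and a fixed file count per batch is a simplification: xargs limits by command-line length, not by count.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_update_commands(files_by_rev, max_files=50):
    """Return one `svn update -r REV files...` argv list per batch.

    `files_by_rev` maps a revision number to the list of files
    that should be updated to that revision (hypothetical layout).
    """
    commands = []
    for rev in sorted(files_by_rev):
        for batch in chunked(files_by_rev[rev], max_files):
            commands.append(["svn", "update", "-r", str(rev)] + batch)
    return commands


if __name__ == "__main__":
    pins = {
        2: ["file1", "file2", "file3", "file4", "file5"],
        3: ["file6", "file7", "file8", "file9", "file10"],
    }
    # Only print the command lines; running them (for example with
    # subprocess.run) is left to the caller.
    for argv in build_update_commands(pins, max_files=5):
        print(" ".join(argv))
```

Batching this way only reduces the number of svn processes; it does not change how svn handles each target inside a single invocation.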

If I use file:/// as the protocol, it runs in 3 seconds.
If I use svn+ssh:// as the protocol, it takes 53 seconds.
If I run svn update -r 3 with no file arguments, it takes about 2s.
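
A rough back-of-the-envelope reading of those timings, assuming the file:/// run is a fair baseline for the local working-copy work and that the file count is exactly 100 (it is only "about 100"). The function name `per_file_overhead` is hypothetical.

```python
def per_file_overhead(remote_total_s, local_total_s, n_files):
    """Estimate the extra seconds each file costs over the remote protocol.

    Assumes the local (file:///) run captures all non-network work,
    so the difference is attributed entirely to the remote access layer.
    """
    return (remote_total_s - local_total_s) / n_files


if __name__ == "__main__":
    # 53 s over svn+ssh, 3 s over file:///, roughly 100 files.
    print(per_file_overhead(53, 3, 100))
```

Under those assumptions the extra cost comes out to about half a second per file, which is the kind of figure one might expect if some fixed setup cost were paid per target rather than per invocation.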

I wrote a program directly against the svn API that accepts the file
lists, authenticates a single time, and then calls svn_client_update3.
This is still very slow: around 53s.

I suspect the problem is that each individual file is called out,
locked, etc. Is there a way to batch these locks together or otherwise
improve performance? Can the ssh channel/ra session be reused?

Perusing the source code suggests that svn_client__update_internal will
be called for each element in my paths. Since an individual file
lock/svn directory write does not seem particularly costly, I suspect
the problem is in the svn_client__open_ra_session_internal +
svn_ra_do_update2 calls made from svn_client__update_internal. Is the
subversion code opening a new ra_session for each of these files, at the
cost of a new ssh + svnserve on the remote end? Is there a way to force
a single RA session across all the files at the API level without
writing my own svn_client__update_internal?

Any thoughts?

thanks!
    --eric
Received on 2010-07-06 20:18:45 CEST

This is an archived mail posted to the Subversion Users mailing list.
