Howdy,
I've got a program that needs to check out specific files at specific
versions. In this particular case a branch does not make sense. I have
found that the performance of svn+ssh for this is very bad.
I run the rough equivalent of:
svn update -r 2 file1 file2 file3 file4 file5
svn update -r 3 file6 file7 file8 file9 file10
Overall I have about 100 such files and two svn update calls. I've
accomplished this with an xargs frontend to svn so as not to overrun the
command line.
If I use file:/// as the protocol, it runs in 3 seconds.
If I use svn+ssh:// as the protocol, it takes 53 seconds.
If I run an svn update -r 3 with no files listed, it takes about 2 seconds.
I wrote a program against the svn API directly to accept the file lists,
perform the authentication a single time, and then call
svn_client_update3. This still runs very slowly, still around 53 seconds.
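
In case it helps to see it, here is a minimal sketch of roughly what my
program does per revision batch (untested as pasted here; it assumes the
1.6 client library's svn_client_update3 and a ctx that already carries the
auth baton; names and error handling are simplified):

    #include <apr_pools.h>
    #include <apr_tables.h>
    #include <svn_client.h>
    #include <svn_opt.h>

    /* Push every working-copy path into one array and hand the whole
     * batch to a single svn_client_update3 call at the given revision. */
    static svn_error_t *
    update_batch(const char **files, int nfiles, svn_revnum_t rev,
                 svn_client_ctx_t *ctx, apr_pool_t *pool)
    {
      apr_array_header_t *paths =
        apr_array_make(pool, nfiles, sizeof(const char *));
      apr_array_header_t *result_revs;
      svn_opt_revision_t revision;
      int i;

      for (i = 0; i < nfiles; i++)
        APR_ARRAY_PUSH(paths, const char *) = files[i];

      revision.kind = svn_opt_revision_number;
      revision.value.number = rev;

      /* One call for the whole batch; internally this still seems to
       * walk the paths one at a time. */
      return svn_client_update3(&result_revs, paths, &revision,
                                svn_depth_infinity,
                                FALSE /* depth_is_sticky */,
                                FALSE /* ignore_externals */,
                                FALSE /* allow_unver_obstructions */,
                                ctx, pool);
    }
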
I suspect the problem is that each individual file is called out,
locked, etc. Is there a way to batch these locks together or otherwise
improve performance, e.g. cause the ssh channel / RA session to be reused?
Perusing the source code suggests that svn_client__update_internal will
be called for each element in my list of paths. Since an individual file
lock / .svn directory write does not seem to be particularly costly on its
own, I suspect the problem lies in the svn_client__open_ra_session_internal
and svn_ra_do_update2 calls made from svn_client__update_internal. Is the
Subversion code opening a new RA session for each of these files, at the
cost of a fresh ssh + svnserve on the remote end each time? Is there a way
to force a single RA session across all the files at the API level without
writing my own svn_client__update_internal?
Thoughts here?
thanks!
--eric