
Re: Performance of svn+ssh vs. file for multiple files

From: Eric Peers <eric_at_missinglinktools.com>
Date: Tue, 06 Jul 2010 14:46:11 -0600

Good suggestion, Daniel. While this does markedly improve performance, it
does so at the expense of changing the underlying protocol.
Unfortunately, I'm not at liberty to change the underlying protocol: my
customers define the protocol, not me. So my "program" needs
to access their repos using their protocols.

But the results:
ssh port forwarding to an active svnserve takes about 2.5s.
pure svnserve takes roughly 2s

svnserve -d --listen-port 8000
ssh epeers@localhost -L 3690:localhost:8000
...then run my svn update commands...
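
A minimal sketch of the setup above as a script, in case anyone wants to reproduce the timing. The user, host, and ports come from the commands above. The function names, the -f/-N ssh flags, and the RUN_TUNNEL guard variable are my own assumptions for illustration. It also assumes the working copy already points at an svn://localhost/ URL rather than svn+ssh://.

```shell
#!/bin/sh
# Sketch only: values are taken from the commands in this message;
# helper names and the RUN_TUNNEL guard are hypothetical.
REMOTE="epeers@localhost"
SERVE_PORT=8000   # port svnserve listens on at the far end of the tunnel
LOCAL_PORT=3690   # default svn:// port, forwarded on the client side

# Print the command that starts svnserve as a daemon on SERVE_PORT.
svnserve_cmd() {
    printf 'svnserve -d --listen-port %s\n' "$SERVE_PORT"
}

# Print the ssh command that forwards LOCAL_PORT to SERVE_PORT.
# -N: run no remote command; -f: go to the background after authentication.
tunnel_cmd() {
    printf 'ssh -f -N -L %s:localhost:%s %s\n' "$LOCAL_PORT" "$SERVE_PORT" "$REMOTE"
}

# Nothing is executed unless RUN_TUNNEL=1 is set in the environment.
if [ "${RUN_TUNNEL:-0}" = "1" ]; then
    ssh "$REMOTE" "$(svnserve_cmd)"   # start svnserve on the remote host
    eval "$(tunnel_cmd)"              # open the forwarded port
    # ...then run the svn update commands against svn://localhost/...
fi
```

With RUN_TUNNEL unset the script only defines the helpers, so the commands can be inspected with tunnel_cmd and svnserve_cmd before anything is run.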

    --eric

On 07/06/2010 12:52 PM, Daniel Shahaf wrote:
> Have you tried using SSH port forwarding instead of svn+ssh://?
>
> Daniel
> (perhaps one of the other devs will address the points you made; I'm
> myself not familiar with that part of the code)
>
> Eric Peers wrote on Tue, 6 Jul 2010 at 21:17 -0000:
>
>> Howdy,
>>
>> I've got a program that needs to check out specific files at specific versions.
>> In this particular case a branch does not make sense. I have found that the
>> performance of svn+ssh in this case is very bad.
>>
>> I run the rough equivalent of:
>> svn update -r 2 file1 file2 file3 file4 file5
>> svn update -r 3 file6 file7 file8 file9 file10
>>
>> Overall I have about 100 such files, and 2 svn update calls. I've accomplished
>> this with an xargs frontend to svn so as not to overrun the command line.
>>
>> If I use file:/// as a protocol, it runs in 3 seconds.
>> If I use svn+ssh:// as a protocol, it takes 53 seconds.
>> If I run an svn update -r 3 with no files, it takes about 2s.
>>
>> I wrote a direct svn API program to accept the file lists, authenticate
>> a single time, and then call svn_client_update3. This still runs very
>> slowly: around 53s.
>>
>> I suspect the problem is that each individual file is called out, locked,
>> etc. Is there a way to batch these locks together or improve performance,
>> or to cause the ssh channel/RA session to be reused?
>>
>> Perusing the source code suggests that svn_client__update_internal will be
>> called for each element in my paths. Since an individual file lock/svn
>> directory write does not seem overly costly, I suspect the problem is in
>> the svn_client__open_ra_session_internal + svn_ra_do_update2 calls from
>> svn_client__update_internal. Is the Subversion code opening a new
>> ra_session for each of these files, at the cost of an ssh+svnserve on the
>> remote end? Is there a way to force a single RA session across all the files
>> at an API level without writing my own svn_client__update_internal?
>>
>> thoughts here?
>>
>> thanks!
>> --eric
>>
Received on 2010-07-06 22:47:25 CEST

This is an archived mail posted to the Subversion Users mailing list.
