Um, have you tried ra_serf yet? It's our new pipelining HTTP client for svn.
On Jan 13, 2008 6:23 PM, Marc-Antoine Ruel <maruel_at_gmail.com> wrote:
> Hi
>
> Use case:
> I'm syncing a lot of small files, many hundreds of megabytes in total.
> I'm on Wi-Fi + VPN + far, far away, so ping is > 150 ms. Checkout is
> rather slow even though the pipe is > 5 Mbit/s.
>
> Hypothesis:
> The sync is limited by the latency to the server, not by the actual
> bandwidth. If the client synced many files concurrently, in a
> pipeline, it would inherently go much faster.
>
> Testing:
> I tried a reduced case to see if there is immediate room for
> improvement. So I took a directory which contained only dir1 and dir2.
> This was on XP over https with svn 1.5.0; I don't know the exact
> trunk version (sorry).
>
> dir1 261 File(s) 41 Dir(s) 13334007 bytes
> dir2 1728 File(s) 770 Dir(s) 30763364 bytes
>
> As you can see, the directories aren't evenly balanced, but that is
> still sufficient to show my point. I wanted the two to hold different
> files to be sure I wasn't affected by any kind of duplicate detection.
>
> So my tests are:
> - Updating the directory that contains both subdirectories: 322 seconds
> - Updating both directories, one after the other: 323 seconds (101 for
>   dir1 and 222 seconds for dir2; process start-up latency + initial
>   https connection result in ~1 second of overhead)
> - Updating both subdirectories at the same time: 89 seconds (dir1) and
>   216 seconds (dir2)
>
> I couldn't believe that it was faster to run two checkouts at once
> than one at a time, so I repeated the second and third tests in
> reverse order.
>
> Updating both subdirectories at the same time: 111 seconds (dir1) and
> 206 seconds (dir2).
> Updating both directories, one after the other: 341 seconds. (116 for
> dir1 and 225 seconds for dir2)
>
> Analysis:
> So as you can see, the measurement error is very high (>30 seconds!),
> but nevertheless it's possible to see that each of two checkouts
> running in parallel is about as fast as it is running alone, which means:
> - Neither the server, the client, nor the bandwidth is the limiting factor.
> - The limiting factor is something else: the latency to get each file.
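
A rough sanity check of the analysis above, using only the figures quoted in this message. It is a back-of-envelope sketch, not a description of what svn does on the wire: it assumes one round trip per file and that the bytes transferred equal the on-disk sizes, and `in_flight` is a hypothetical knob for illustration, not an svn option.

```python
# Figures taken from the quoted message.
FILES = 261 + 1728                  # dir1 + dir2
BYTES = 13_334_007 + 30_763_364     # dir1 + dir2
RTT_S = 0.150                       # "> 150 ms ping", used as a lower bound
BANDWIDTH_BPS = 5_000_000           # "> 5 mbit/s", used as a lower bound
OBSERVED_S = 322                    # update of the parent directory


def transfer_time(num_bytes, bits_per_second):
    """Seconds needed if bandwidth were the only cost."""
    return num_bytes * 8 / bits_per_second


def latency_time(num_files, rtt_s, in_flight=1):
    """Seconds spent waiting on round trips.

    Assumption: one round trip per file, with up to `in_flight`
    requests overlapped (in_flight=1 means strictly serial).
    """
    batches = -(-num_files // in_flight)  # ceiling division
    return batches * rtt_s


bandwidth_floor = transfer_time(BYTES, BANDWIDTH_BPS)
serial_latency = latency_time(FILES, RTT_S)
overlapped_latency = latency_time(FILES, RTT_S, in_flight=8)

print(f"bandwidth only:        {bandwidth_floor:6.1f} s")
print(f"serial round trips:    {serial_latency:6.1f} s")
print(f"8 requests in flight:  {overlapped_latency:6.1f} s")
print(f"observed:              {OBSERVED_S:6.1f} s")
```

Under these assumptions bandwidth alone accounts for roughly 70 of the 322 seconds observed, while one serial round trip per file comes to roughly 298 seconds, which is of the same order as the measurement and consistent with latency being the dominant cost.
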
>
> Conclusion:
> By pipelining the checkout, i.e. requesting many files at a time, svn
> would reduce the effect of the latency.
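
To illustrate the conclusion above, here is a toy simulation of why keeping several requests in flight hides latency. Each "request" is only a sleep standing in for one round trip; `fetch`, `LATENCY_S` and `in_flight` are hypothetical names, and this is neither ra_serf's code nor real HTTP pipelining on a single connection, just overlapping waits under the assumption that the server and the link can keep up.

```python
import asyncio
import time

LATENCY_S = 0.01  # hypothetical round trip, scaled down from 150 ms


async def fetch(name):
    # Placeholder for one request/response exchange; no real network I/O.
    await asyncio.sleep(LATENCY_S)
    return name


async def fetch_serial(names):
    # One request at a time: total wait is about len(names) * LATENCY_S.
    return [await fetch(n) for n in names]


async def fetch_overlapped(names, in_flight):
    # At most `in_flight` requests are waiting on the network at once.
    limit = asyncio.Semaphore(in_flight)

    async def bounded(n):
        async with limit:
            return await fetch(n)

    return await asyncio.gather(*(bounded(n) for n in names))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


names = [f"file{i}" for i in range(20)]
serial_result, serial_s = timed(fetch_serial(names))
overlap_result, overlap_s = timed(fetch_overlapped(names, in_flight=10))

print(f"serial:     {serial_s:.3f} s")
print(f"overlapped: {overlap_s:.3f} s")
```

Both variants return the same files in the same order; only the time spent waiting changes, since the overlapped version pays for roughly two batches of round trips instead of twenty.
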
>
> Thanks
>
> Marc-Antoine
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
> For additional commands, e-mail: dev-help_at_subversion.tigris.org
Received on 2008-01-14 01:52:31 CET