
Re: Feature request: pipelining checkout and update

From: Marc-Antoine Ruel <maruel_at_gmail.com>
Date: Sun, 13 Jan 2008 20:20:02 -0500

I didn't know about this. My bad. I looked at my svn build and it didn't
have serf included. I'll retest with that.

Thanks for the insight.

M-A

2008/1/13, Ben Collins-Sussman <sussman_at_red-bean.com>:
>
> Um, have you tried ra_serf yet? It's our new pipelining HTTP client for
> svn.
>
> On Jan 13, 2008 6:23 PM, Marc-Antoine Ruel <maruel_at_gmail.com> wrote:
> > Hi
> >
> > Use case:
> > I'm syncing a lot of small files, many hundreds of megabytes' worth.
> > I'm on Wifi + VPN + far far away = > 150 ms ping. Checkout is rather
> > slow even though the pipe is > 5 Mbit/s.
> >
> > Hypothesis:
> > The sync is limited by the latency to the server, not the
> > actual bandwidth capacity. If the client synced many files
> > concurrently, in a pipeline, it would inherently go much faster.
> >
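A rough back-of-envelope model of this hypothesis, as a sketch only. It assumes exactly one round trip per file, a 150 ms round-trip time, and a 5 Mbit/s link (the message only gives these as lower bounds), and it uses the file count and byte size reported for dir2 further down. The function names and the pipeline depth of 8 are hypothetical, chosen for illustration.

```python
import math

def transfer_seconds(total_bytes, bandwidth_bps):
    """Time to push the raw bytes through the link, ignoring latency."""
    return total_bytes * 8 / bandwidth_bps

def sequential_seconds(n_files, total_bytes, rtt_s, bandwidth_bps):
    """Assumed model: each file costs one full round trip before its data arrives."""
    return n_files * rtt_s + transfer_seconds(total_bytes, bandwidth_bps)

def pipelined_seconds(n_files, total_bytes, rtt_s, bandwidth_bps, depth):
    """Assumed model: `depth` requests are in flight, so round-trip waits overlap."""
    return math.ceil(n_files / depth) * rtt_s + transfer_seconds(total_bytes, bandwidth_bps)

# dir2 figures from the message; RTT and bandwidth are the stated lower bounds.
N_FILES, TOTAL_BYTES = 1728, 30763364
RTT_S, BANDWIDTH_BPS = 0.150, 5_000_000

if __name__ == "__main__":
    print("transfer only:", round(transfer_seconds(TOTAL_BYTES, BANDWIDTH_BPS), 1), "s")
    print("sequential   :", round(sequential_seconds(N_FILES, TOTAL_BYTES, RTT_S, BANDWIDTH_BPS), 1), "s")
    print("pipelined x8 :", round(pipelined_seconds(N_FILES, TOTAL_BYTES, RTT_S, BANDWIDTH_BPS, 8), 1), "s")
```

Under these assumptions the round-trip term dominates the sequential estimate, which is the shape of the argument made here. The model is not meant to reproduce the measured times exactly.
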
> > Testing:
> > I tried a reduced case to see if there is immediate room for
> > improvement. So I took a directory which only contained dir1 and dir2.
> > It was on XP over https with svn 1.5.0, I don't know the exact
> > trunk version (sorry).
> >
> > dir1 261 File(s) 41 Dir(s) 13334007 bytes
> > dir2 1728 File(s) 770 Dir(s) 30763364 bytes
> >
> > As you can see, the directories aren't well balanced, but that
> > is still sufficient to show my point. I wanted to have different files
> > to be sure I wasn't affected by any kind of duplicate detection.
> >
> > So my tests are:
> > Updating the directory that contains both subdirectories: 322 seconds
> > Updating both directories, one after the other: 323 seconds (101 for
> > dir1 and 222 seconds for dir2; process startup latency + initial https
> > connection result in ~1 second overhead)
> > Updating both subdirectories at the same time: 89 seconds (dir1) and
> > 216 seconds (dir2).
> >
> > I couldn't believe that it was faster when running two checkouts than
> > one at a time, so I tried the second and third tests again, in reverse
> > order.
> >
> > Updating both subdirectories at the same time: 111 seconds (dir1) and
> > 206 seconds (dir2).
> > Updating both directories, one after the other: 341 seconds. (116 for
> > dir1 and 225 seconds for dir2)
> >
> > Analysis:
> > So as you can see, the absolute error is very high (>30 seconds!), but
> > nevertheless, it's possible to see that each of two checkouts running
> > in parallel is as fast as when run one at a time, which means:
> > - Neither the server, the client, nor the bandwidth is the limiting
> > factor.
> > - The limiting factor is something else: the latency to get each file.
> >
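The comparison behind this analysis can be tabulated with a few lines of Python. This is only a sketch over the timings reported in the tests above. It assumes both parallel updates were started at the same moment, so that wall-clock time is the slower of the two, and it reads the second parallel timing as belonging to dir2.

```python
# Timings in seconds, copied from the reported test results.
# Assumption: the second parallel timing in each run is for dir2.
RUNS = [
    {"sequential": {"dir1": 101, "dir2": 222}, "parallel": {"dir1": 89, "dir2": 216}},
    {"sequential": {"dir1": 116, "dir2": 225}, "parallel": {"dir1": 111, "dir2": 206}},
]

def wall_clock(run):
    """Return (sequential, parallel) wall-clock seconds for one run."""
    sequential = sum(run["sequential"].values())
    # Assumption: both parallel updates started together, so the slower one decides.
    parallel = max(run["parallel"].values())
    return sequential, parallel

if __name__ == "__main__":
    for i, run in enumerate(RUNS, 1):
        seq, par = wall_clock(run)
        print(f"run {i}: sequential {seq} s, parallel {par} s, speedup {seq / par:.2f}x")
```

With these figures the parallel runs finish in roughly two thirds of the sequential wall-clock time, even though each individual update is barely faster than when run alone.
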
> > Conclusion:
> > By pipelining the checkout, i.e. requesting many files at a time, svn
> > would reduce the effect of the latency.
> >
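To illustrate the effect of requesting many files at a time, here is a toy simulation, not svn or serf code. Each "fetch" just sleeps for a fixed latency. All names (`fetch`, `fetch_sequential`, `fetch_pipelined`, `timed`) and the depth and latency values are hypothetical. Strictly speaking it models bounded concurrency rather than HTTP pipelining on one connection, but the overlap of round-trip waits is the same idea.

```python
import asyncio
import time

async def fetch(name, latency_s):
    """Stand-in for one request: wait one simulated round trip, return the name."""
    await asyncio.sleep(latency_s)
    return name

async def fetch_sequential(names, latency_s):
    """Issue one request at a time; every request pays the full latency."""
    return [await fetch(n, latency_s) for n in names]

async def fetch_pipelined(names, latency_s, depth):
    """Keep up to `depth` requests in flight so their waits overlap."""
    gate = asyncio.Semaphore(depth)

    async def bounded(n):
        async with gate:
            return await fetch(n, latency_s)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(n) for n in names))

def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    names = [f"file{i}" for i in range(40)]
    _, t_seq = timed(fetch_sequential(names, 0.15))
    _, t_pip = timed(fetch_pipelined(names, 0.15, depth=8))
    print(f"sequential: {t_seq:.2f} s, depth 8: {t_pip:.2f} s")
```

In this toy model the sequential time grows with the number of files times the latency, while the overlapped version grows with the number of files divided by the depth.
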
> > Thanks
> >
> > Marc-Antoine
> >
Received on 2008-01-14 02:20:14 CET

This is an archived mail posted to the Subversion Dev mailing list.