
Re: Performance regression with reverse merge

From: Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de>
Date: Sun, 8 Mar 2009 17:44:36 +0100

Paul Burba wrote:
> Stefan Fuhrmann wrote:
> >> > Paul Burba wrote:
> >> >> but it does tell
> >> >> us something we already knew: A significant part of merge's slowdown
> >> >> in 1.5.0+ is due to the need to walk the working copy looking for
> >> >> explicit subtree mergeinfo.  This is not something we can skip for
> >> >> mergeinfo aware merges, we need to know about these subtrees.  Though
> >> >> the ongoing WCNG work will probably make it a *lot* faster.
> >> >
> >> > Hm. Looking at the measured data tells a different story:
> >> > the client is responsible for less than 1% of the runtime
> >> > (<3s of 350s total). The very constant flow of data between
> >> > client and server is an indication for a similar situation on
> >> > the server.
>
> Hi Stefan,
>
> When I said "make it a *lot* faster" the "it" I was referring to was
> the time to perform the walk to find all the subtree mergeinfo. In my
> testing the time spent doing the walk for subtree mergeinfo was
> approximately 3 minutes of the 9 minute merge -- hence "a significant
> part of merge's slowdown" (at least for me! YMMV).

Perhaps we are missing each other's point. I don't question
that the WC walk accounts for a major part of the run-time.

However, neither disk I/O nor CPU is used more than marginally
during that time. It is the network I/O that sees a constant trickle
of small packets (4k in, 4k out per second). My conclusion is
that during the WC walk, the client spends most of its time waiting
for the server.

Unless the number of C/S interactions is reduced by the WCNG
design, I expect no significant performance improvement.
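
As a rough sanity check on the figures quoted in this thread (about
350s total and a 187ms ping), the sketch below estimates how many
strictly sequential round trips would fit into that time. The function
names and the `in_flight` parameter are hypothetical and only serve the
illustration; the model ignores server processing time and bandwidth.

```python
import math


def round_trips(wait_seconds: float, rtt_seconds: float) -> float:
    """Number of strictly sequential round trips that fit into the wait time."""
    return wait_seconds / rtt_seconds


def latency_bound_time(requests: int, rtt_seconds: float, in_flight: int = 1) -> float:
    """Wall-clock time if `in_flight` requests can share one round trip.

    Assumption: pure latency model, no server processing or bandwidth cost.
    """
    return math.ceil(requests / in_flight) * rtt_seconds


# Figures taken from the thread: ~350s total, 187ms ping.
estimated = round_trips(350.0, 0.187)
print(f"estimated round trips: {estimated:.0f}")

# Hypothetical comparison: the same requests, one at a time vs. ten at a time.
print(f"sequential: {latency_bound_time(1870, 0.187):.1f}s")
print(f"10 in flight: {latency_bound_time(1870, 0.187, in_flight=10):.1f}s")
```

The estimate lands inside the 1500 .. 2000 range mentioned in the
thread, and the second call shows why interleaving requests matters
more than a faster WC when latency dominates.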

> >> > That means, the most time is spent on the network with
> >> > 1500 .. 2000 roundtrips (given a 187ms ping). So, you are
> >> > right that the whole WC is crawled for mergeinfo. But the
> >> > real problem is that for every 'relevant' node, there is an
> >> > individual communication with the server. A faster WC
> >> > implementation alone will have no effect here.
>
> Agreed, but again, for me the crawl is a significant chunk of time.
>
> > IMHO, there are two approaches to speeding things up:
> >
> > * stream / interleave C/S communication
> > (would serf be of any help here?)
>
> It seems so yes. Could you try the merge with serf and see what you
> find? I saw a pretty dramatic improvement using serf over neon:
>
> 1.6.0.RC3.RA_NEON>C:\SVN\TSVN>timethis svn merge -r 15456:15455
> http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
>
> TimeThis : Command Line : svn merge -r 15456:15455
> http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
> TimeThis : Start Time : Wed Mar 04 10:02:29 2009
>
> --- Reverse-merging r15456 into
> 'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp':
> U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp
> --- Reverse-merging r15456 into
> 'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h':
> U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h
>
> TimeThis : Command Line : svn merge -r 15456:15455
> http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
> TimeThis : Start Time : Wed Mar 04 10:02:29 2009
> TimeThis : End Time : Wed Mar 04 10:11:36 2009
> TimeThis : Elapsed Time : 00:09:07.140
>
> 1.6.0.RC3.RA_SERF>C:\SVN\TSVN>timethis svn merge -r 15456:15455
> http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
>
> TimeThis : Command Line : svn merge -r 15456:15455
> http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
> TimeThis : Start Time : Wed Mar 04 10:12:32 2009
>
> --- Reverse-merging r15456 into
> 'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp':
> U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp
> --- Reverse-merging r15456 into
> 'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h':
> U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h
>
> TimeThis : Command Line : svn merge -r 15456:15455
> http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
> TimeThis : Start Time : Wed Mar 04 10:12:32 2009
> TimeThis : End Time : Wed Mar 04 10:16:36 2009
> TimeThis : Elapsed Time : 00:04:04.421
>
> Still pokey, but it's something. And as I said, for me almost 3
> minutes of this time is spent walking the target tree looking for
> mergeinfo. WCNG should drop that to a fraction of the time, so then
> we could be looking at about a minute.

Here we go (SVN 1.6.0-RC3, i686 LINUX):

$ time ./svn co http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk ~/TSVN -r 15533 --ignore-externals > /dev/null

        serf        neon
real    2m41.073s   2m15.268s
user    0m54.107s   0m6.840s
sys     1m25.297s   0m3.672s

$ time ./svn merge -r 15456:15455 http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk ~/TSVN

        serf        neon
real    5m29.002s   6m43.831s
user    0m1.992s    0m2.124s
sys     0m0.824s    0m0.664s

Hm. Merge is about 20% faster over Serf, but c/o is about 20% slower.
More importantly, c/o is CPU-bound with Serf (user + sys account for
most of the wall-clock time)!
That is really unexpected.
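
For reference, the percentages above can be re-derived from the `time`
output with a few lines of Python. This is only a sketch: the helper
names are made up for the illustration, and "CPU fraction" is taken to
mean (user + sys) / real.

```python
import re


def to_seconds(text: str) -> float:
    """Convert a `time`-style value such as '2m41.073s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unexpected time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_fraction(real: str, user: str, sys: str) -> float:
    """Share of wall-clock time spent on the CPU: (user + sys) / real."""
    return (to_seconds(user) + to_seconds(sys)) / to_seconds(real)


def relative_change(serf_real: str, neon_real: str) -> float:
    """Serf wall-clock time relative to neon; negative means serf is faster."""
    return to_seconds(serf_real) / to_seconds(neon_real) - 1.0


# Values copied from the measurements above.
co_change = relative_change("2m41.073s", "2m15.268s")
merge_change = relative_change("5m29.002s", "6m43.831s")
co_cpu_serf = cpu_fraction("2m41.073s", "0m54.107s", "1m25.297s")
co_cpu_neon = cpu_fraction("2m15.268s", "0m6.840s", "0m3.672s")

print(f"c/o serf vs neon:   {co_change:+.1%}")
print(f"merge serf vs neon: {merge_change:+.1%}")
print(f"c/o CPU fraction:   serf {co_cpu_serf:.0%}, neon {co_cpu_neon:.0%}")
```

Both differences come out at roughly 19%, and the c/o over Serf spends
most of its wall-clock time on the CPU, while the c/o over neon stays
below a tenth.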

> Beyond that, improvements are going to be harder to come by in this
> example because http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
> has so much explicit subtree mergeinfo, *226* paths in fact! I assume
> this is what you are referring to as "every 'relevant' node". Do you
> know how all of this mergeinfo came about? Do you do 'subtree
> merges'? (i.e. a merge directly targeting a subtree of a branch rather
> than its root).

TSVN developers use SVN close to HEAD. It seems
that most of this merge info was added during 1.5
development (around Oct 2007), i.e. the result
of alpha-quality code.

From what I can tell, most of that merge info can
be deleted because it refers to revisions that did
not change the source node. Is it safe to do that, or
will I screw up the current merge tracking logic?

> Anyhow, I'll be looking at the code again to see where improvements can be
> made.

Thanks!

Since larger projects will probably use local
merges, they could also produce a larger number
of local svn:mergeinfo properties.

-- Stefan^2.
Received on 2009-03-08 17:42:47 CET

This is an archived mail posted to the Subversion Dev mailing list.
