
Re: Performance regression with reverse merge

From: Paul Burba <ptburba_at_gmail.com>
Date: Thu, 5 Mar 2009 11:21:05 -0500

On Thu, Mar 5, 2009 at 4:24 AM, Stefan Fuhrmann
<stefanfuhrmann_at_alice-dsl.de> wrote:
> Mark Phippard wrote:
>> On Wed, Mar 4, 2009 at 5:29 AM, Stefan Fuhrmann
>> <stefanfuhrmann_at_alice-dsl.de> wrote:
>> > I did some more tests and found that the runtime for
>> > 'ordinary' merges between branches is roughly the same
>> > (4+ minutes instead of 6+ but the sub-tree in question was
>> > significantly smaller on that branch).
>> >
>> >> but it does tell
>> >> us something we already knew: A significant part of merge's slowdown
>> >> in 1.5.0+ is due to the need to walk the working copy looking for
>> >> explicit subtree mergeinfo.  This is not something we can skip for
>> >> mergeinfo aware merges, we need to know about these subtrees.  Though
>> >> the ongoing WCNG work will probably make it a *lot* faster.
>> >
>> > Hm. Looking at the measured data tells a different story:
>> > the client is responsible for less than 1% of the runtime
>> > (<3s of 350s total). The very constant flow of data between
>> > client and server is an indication for a similar situation on
>> > the server.

Hi Stefan,

When I said "make it a *lot* faster" the "it" I was referring to was
the time to perform the walk to find all the subtree mergeinfo. In my
testing the time spent doing the walk for subtree mergeinfo was
approximately 3 minutes of the 9 minute merge -- hence "a significant
part of merge's slowdown" (at least for me! YMMV).

>> > That means, the most time is spent on the network with
>> > 1500 .. 2000 roundtrips (given a 187ms ping). So, you are
>> > right that the whole WC is crawled for mergeinfo. But the
>> > real problem is that for every 'relevant' node, there is an
>> > individual communication with the server. A faster WC
>> > implementation alone will have no effect here.

Agreed, but again, for me the crawl is a significant chunk of time.
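
As a rough sanity check on the round-trip figures quoted above, the following Python sketch multiplies the round-trip count by the ping time. It assumes the round trips are strictly sequential and that each one costs exactly one ping; the function name is hypothetical and only the 1500..2000 range, the 187ms ping and the 350s total come from the thread.

```python
# Back-of-the-envelope estimate: N sequential round trips at a fixed latency.
# Assumption: no pipelining and no server processing time, only network latency.

def estimate_network_seconds(roundtrips, ping_ms):
    """Return the total wait in seconds for `roundtrips` sequential requests."""
    return roundtrips * ping_ms / 1000.0


if __name__ == "__main__":
    ping_ms = 187            # ping reported in the thread
    measured_total = 350     # total runtime in seconds reported in the thread
    for roundtrips in (1500, 2000):
        estimate = estimate_network_seconds(roundtrips, ping_ms)
        share = estimate / measured_total
        print(f"{roundtrips} round trips: {estimate:.1f}s "
              f"({share:.0%} of the measured {measured_total}s)")
```

Under those assumptions the estimate lands between roughly 280 and 374 seconds, which brackets the 350s total and is consistent with latency dominating that run.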

>> I wonder if this is now where you are running into the optimization
>> that went into 1.5.  Before you did the merge, did you run update so
>> that your WC was at a single revision for the entire WC?  This has an
>> enormous impact on performance and I think this is what you were
>> running into.  Paul can describe it better, but if the WC is not at a
>> single revision then the code has to do more roundtrips to discover
>> mergeinfo that might exist in the repository.
>
> Those WCs were fresh check-outs. Paul, I guess, also just
> made a check out and started his timings.

That is correct.

> The svn:externals used by TSVN don't seem to be the culprit
> either, because the timing was only proportionally better
> when I used a smaller sub-tree with no externals in it.
>
> Back to your question whether this is a release blocker: No.
> Performance got much worse in 1.5 but remained stable in 1.6.
> So 1.6 performance is just as acceptable as the previous version.
>
> I provided the reproduction recipe with some public, real-world
> repo so that Paul and others could analyze it thoroughly and
> make it faster.

That is great; having a real-world example to work with is extremely helpful.

> IMHO, there are two approaches to speeding things up:
>
> * stream / interleave C/S communication
> (would serf be of any help here?)

It seems so, yes. Could you try the merge with serf and see what you
find? I saw a pretty dramatic improvement using serf over neon:

1.6.0.RC3.RA_NEON>C:\SVN\TSVN>timethis svn merge -r 15456:15455
http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk

TimeThis : Command Line : svn merge -r 15456:15455
http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
TimeThis : Start Time : Wed Mar 04 10:02:29 2009

--- Reverse-merging r15456 into
'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp':
U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp
--- Reverse-merging r15456 into
'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h':
U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h

TimeThis : Command Line : svn merge -r 15456:15455
http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
TimeThis : Start Time : Wed Mar 04 10:02:29 2009
TimeThis : End Time : Wed Mar 04 10:11:36 2009
TimeThis : Elapsed Time : 00:09:07.140

1.6.0.RC3.RA_SERF>C:\SVN\TSVN>timethis svn merge -r 15456:15455
http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk

TimeThis : Command Line : svn merge -r 15456:15455
http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
TimeThis : Start Time : Wed Mar 04 10:12:32 2009

--- Reverse-merging r15456 into
'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp':
U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.cpp
--- Reverse-merging r15456 into
'src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h':
U src\TortoiseProc\RevisionGraph\RevisionGraphDlg.h

TimeThis : Command Line : svn merge -r 15456:15455
http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
TimeThis : Start Time : Wed Mar 04 10:12:32 2009
TimeThis : End Time : Wed Mar 04 10:16:36 2009
TimeThis : Elapsed Time : 00:04:04.421

Still pokey, but it's something. And as I said, for me almost 3
minutes of this time is spent walking the target tree looking for
mergeinfo. WCNG should drop that to a fraction of the time, so then
we could be looking at about a minute.
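
The arithmetic behind that estimate can be checked with a short Python sketch. It parses the two TimeThis elapsed times shown above and subtracts the working copy walk. The 180 second walk is only the "almost 3 minutes" approximation, and treating the walk as dropping to nearly zero is an optimistic assumption; the function names are hypothetical.

```python
from datetime import timedelta


def parse_elapsed(text):
    """Convert a TimeThis 'HH:MM:SS.mmm' elapsed string to seconds."""
    hours, minutes, seconds = text.split(":")
    delta = timedelta(hours=int(hours), minutes=int(minutes),
                      seconds=float(seconds))
    return delta.total_seconds()


def remaining_after_walk(total_seconds, walk_seconds):
    """Time left if the subtree mergeinfo walk cost nothing (optimistic)."""
    return total_seconds - walk_seconds


if __name__ == "__main__":
    neon = parse_elapsed("00:09:07.140")   # ra_neon run from the timings above
    serf = parse_elapsed("00:04:04.421")   # ra_serf run from the timings above
    walk = 180.0                           # assumption: "almost 3 minutes"

    print(f"neon: {neon:.1f}s, serf: {serf:.1f}s, ratio: {neon / serf:.2f}x")
    print(f"serf minus walk: {remaining_after_walk(serf, walk):.1f}s")
```

With these inputs the serf run is a bit more than twice as fast as the neon run, and removing the walk leaves a little over a minute.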

Beyond that, improvements are going to be harder to come by in this
example because http://tortoisesvn.tigris.org/svn/tortoisesvn/trunk
has so much explicit subtree mergeinfo, *226* paths in fact! I assume
this is what you are referring to as "every 'relevant' node". Do you
know how all of this mergeinfo came about? Do you do 'subtree
merges'? (i.e. a merge directly targeting a subtree of a branch rather
than its root).

Anyhow, I'll be looking at the code again to see where improvements can be made.

Paul

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1272752
Received on 2009-03-05 17:21:29 CET

This is an archived mail posted to the Subversion Dev mailing list.
