[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Incorrect handling of svn:mergeinfo during merging on the svnpatch-diff branch

From: Paul Burba <ptburba_at_gmail.com>
Date: Tue, 16 Sep 2008 13:23:54 -0400

On Thu, Sep 11, 2008 at 12:26 PM, Paul Burba <ptburba_at_gmail.com> wrote:
> On Thu, Sep 11, 2008 at 11:03 AM, Julian Foad
> <julianfoad_at_btopenworld.com> wrote:
>> On Thu, 2008-09-11 at 13:51 +0100, Julian Foad wrote:
>>> On Wed, 2008-09-10 at 11:40 -0400, Paul Burba wrote:
>>> > On Tue, Sep 9, 2008 at 4:32 PM, Paul Burba <ptburba_at_gmail.com> wrote:
>>> > > Now for the specifics: What's going wrong here is that when we perform
>>> > > the reverse merge of -c-32625,
>>> > > libsvn_subr/merge.c:filter_self_referential_mergeinfo() is filtering
>>> > > out /trunk:1-31375' portion of svnpatch-diff's mergeinfo. As
>>> > > mentioned above it should only filter out '/trunk:1-28961', so that is
>>> > > one bug. But I can't see any reason to ever do filtering of this type
>>> > > during a *reverse* merge. Currently the svn_wc_diff_callbacks3_t
>>> > > callback merge.c:merge_props_changed() calls
>>> > > filter_self_referential_mergeinfo() without consideration of the
>>> > > merge's direction.
>>> > >
>>> > > So we need to do the following:
>>> > >
>>> > > 1) Stop filtering self-referential mergeinfo during reverse merges
>>> > > (easier said than done as we don't have enough context in the callback
>>> > > to determine merge direction right now). If you or anything else can
>>> > > see a reason this filtering needs to be done in reverse merges please
>>> > > let me know!
>>>
>>> This looks OK, but I struggled to fully understand both the motivation
>>> and the correctness.
>>
>> Paul explained to me on IRC...
>>
>>> Note that the "filtering", although currently implemented as a single
>>> function call, is performing two different logical functions:
>>>
>>> 1. Avoid adding any self-referential mergeinfo in the current merge.
>>> 2. Clean up any self-referential mergeinfo that may have existed.
>>
>> I was completely wrong about that. It only does (1), and (1) is what's
>> under discussion. That means it makes sense to avoid calling this
>> filtering function in a reverse merge. It still should not matter if we
>> did call it,
>
> I think it still matters as regards merge performance. Given how much
> slower merge performance has become in general with 1.5.x, I find any
> performance improvements (especially simple ones like this) to be at
> least a little compelling.
>
> FWIW I ran Arfrever's reverse merge of 'svn merge -c-32625
> ^/branches/svnpatch-diff' three times with filtering 'on' (trunk) for
> reverse merges and three times with it 'off' (my patch): The average
> time for the merge dropped from 00:05:52 to 00:04:01, a 31% reduction.
>
>> except there is currently a bug in it.
>
> Yes, and this bug also can affect forward merges. The new test I
> added in r33013, merge_tests.py 110 'natural history filtering permits
> valid mergeinfo', demonstrates this problem with a forward merge --
> basically merge.c:filter_self_referential_mergeinfo() filters too
> much, removing mergeinfo that is not part of the target's natural
> history.
>
> I am working on a fix for this problem, but since this looks like it
> will require a substantial re-write of
> merge.c:filter_self_referential_mergeinfo() I am also trying to
> address another issue related to this function; its performance. As
> David Glasser mentioned a long time back, this function's performance
> is poor, see http://svn.haxx.se/dev/archive-2008-02/0063.shtml.
>
>> Sorry for the confusion.
>>
>> Paul is following up with more details.
>
> So there are three somewhat separate issues here:
>
> 1) Should we ever filter mergeinfo property changes for reverse merges
> - it seems we agree that no, we should not. My latest patch is
> attached.

#1 Was already addressed elsewhere in this thread.

> 2) The underlying bug in filter_self_referential_mergeinfo, that
> Arfrever noted while reverse merging to the svnpatch-diff branch and
> that merge_tests.py 110 demonstrates, needs to be fixed.
>
> 3) Any performance improvements to
> merge.c:filter_self_referential_mergeinfo() are a good thing and might
> as well be done if that code is being revamped.

The attached patch addresses #2 and #3.

It implements the performance improvement David suggested in
http://svn.haxx.se/dev/archive-2008-02/0206.shtml by using
svn_client__get_history_as_mergeinfo() *when possible*, rather than
svn_client__repos_locations() in
merge.c:filter_self_referential_mergeinfo().

In some cases this won't make much or any difference to merge
performance, but in one common use case it will speed things up a lot:

1) Merging into an up-to-date WC.
2) The merge source is adding mergeinfo with many discrete revision
ranges for each source path.

We can find a good examples of this in our own repository any time we
synch a branch up with trunk.

For a quick test I checked out
http://svn.collab.net/repos/svn/branches/fs-rep-sharing@33000 and
merged -r33000:33083 from trunk three times with the patch and three
times with trunk (r33081).

          With Patch Trunk
          ---------- ----------
real 5m51.614s 10m17.726s
user 0m0.015s 0m0.015s
sys 0m0.030s 0m0.015s

real 6m2.703s 10m13.016s
user 0m0.031s 0m0.015s
sys 0m0.000s 0m0.015s

real 5m57.865s 10m13.884s
user 0m0.031s 0m0.015s
sys 0m0.000s 0m0.015s
          --------- ----------
Avg real 5m57s 10m15s

A lovely 42% reduction in the time it takes to complete the merge.

This patch also fixes merge_tests.py 110 'natural history filtering
permits valid mergeinfo'.

Running the test one final time after some last second improvements
and hope to commit later today.

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org

Received on 2008-09-16 19:25:14 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.