[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Incorrect handling of svn:mergeinfo during merging on the svnpatch-diff branch

From: Paul Burba <ptburba_at_gmail.com>
Date: Thu, 11 Sep 2008 12:26:21 -0400

On Thu, Sep 11, 2008 at 11:03 AM, Julian Foad
<julianfoad_at_btopenworld.com> wrote:
> On Thu, 2008-09-11 at 13:51 +0100, Julian Foad wrote:
>> On Wed, 2008-09-10 at 11:40 -0400, Paul Burba wrote:
>> > On Tue, Sep 9, 2008 at 4:32 PM, Paul Burba <ptburba_at_gmail.com> wrote:
>> > > Now for the specifics: What's going wrong here is that when we perform
>> > > the reverse merge of -c-32625,
>> > > libsvn_subr/merge.c:filter_self_referential_mergeinfo() is filtering
>> > > out /trunk:1-31375' portion of svnpatch-diff's mergeinfo. As
>> > > mentioned above it should only filter out '/trunk:1-28961', so that is
>> > > one bug. But I can't see any reason to ever do filtering of this type
>> > > during a *reverse* merge. Currently the svn_wc_diff_callbacks3_t
>> > > callback merge.c:merge_props_changed() calls
>> > > filter_self_referential_mergeinfo() without consideration of the
>> > > merge's direction.
>> > >
>> > > So we need to do the following:
>> > >
>> > > 1) Stop filtering self-referential mergeinfo during reverse merges
>> > > (easier said than done as we don't have enough context in the callback
>> > > to determine merge direction right now). If you or anything else can
>> > > see a reason this filtering needs to be done in reverse merges please
>> > > let me know!
>>
>> This looks OK, but I struggled to fully understand both the motivation
>> and the correctness.
>
> Paul explained to me on IRC...
>
>> Note that the "filtering", although currently implemented as a single
>> function call, is performing two different logical functions:
>>
>> 1. Avoid adding any self-referential mergeinfo in the current merge.
>> 2. Clean up any self-referential mergeinfo that may have existed.
>
> I was completely wrong about that. It only does (1), and (1) is what's
> under discussion. That means it makes sense to avoid calling this
> filtering function in a reverse merge. It still should not matter if we
> did call it,

I think it still matters as regards merge performance. Given how much
slower merge performance has become in general with 1.5.x, I find any
performance improvements (especially simple ones like this) to be at
least a little compelling.

FWIW I ran Arfrever's reverse merge of 'svn merge -c-32625
^/branches/svnpatch-diff' three times with filtering 'on' (trunk) for
reverse merges and three times with it 'off' (my patch): The average
time for the merge dropped from 00:05:52 to 00:04:01, a 31% reduction.

> except there is currently a bug in it.

Yes, and this bug also can affect forward merges. The new test I
added in r33013, merge_tests.py 110 'natural history filtering permits
valid mergeinfo', demonstrates this problem with a forward merge --
basically merge.c:filter_self_referential_mergeinfo() filters too
much, removing mergeinfo that is not part of the target's natural
history.

I am working on a fix for this problem, but since this looks like it
will require a substantial re-write of
merge.c:filter_self_referential_mergeinfo() I am also trying to
address another issue related to this function; its performance. As
David Glasser mentioned a long time back, this function's performance
is poor, see http://svn.haxx.se/dev/archive-2008-02/0063.shtml.

> Sorry for the confusion.
>
> Paul is following up with more details.

So there are three somewhat separate issues here:

1) Should we ever filter mergeinfo property changes for reverse merges
- it seems we agree that no, we should not. My latest patch is
attached.

2) The underlying bug in filter_self_referential_mergeinfo, that
Arfrever noted while reverse merging to the svnpatch-diff branch and
that merge_tests.py 110 demonstrates, needs to be fixed.

3) Any performance improvements to
merge.c:filter_self_referential_mergeinfo() are a good thing and might
as well be done if that code is being revamped.

The following comments regard only 1) and answer some still
outstanding questions Julian had about my patch:

> [[[
>> Don't try to filter natural history from mergeinfo during reverse
>> merges.
>>
>> If a repository somehow gets self-referential mergeinfo in it, it's
>> nice to clean this up, but a reverse merge is not the place to do it.
>
>
>> A reverse merge should simply return mergeinfo to a previous state.
>> See
>> http://subversion.tigris.org/servlets/ReadMsg?listName=dev&msgNo=142777.
>
> Hmm. Does "a reverse merge" mean "a reversal of a merge recorded in this
> thing's merge history" or "a merge of URL1_at_REV1:URL2_at_REV2 in which
> REV2<REV1"? It looks like these two definitions are supposed to come to
> the same thing.

As I've written my patch it means both, but in the context of
merge.c:filter_self_referential_mergeinfo() it means only the former,
since that function is a no-op if we aren't honoring mergeinfo. If
mergeinfo is being honored then we only attempt to reverse merge
revisions that are part of the target's natural history or are
recorded in it's mergeinfo -- see merge.c:filter_merged_revisions().

>> This change also improves reverse merge performance as it eliminates some
>> unnecessary communication with the repos.
>>
>> * subversion/libsvn_client/merge.c
>> (merge_cmd_baton_t): Add 'is_rollback' member.
>> (merge_props_changed): Don't try to filter natural history for a reverse
>> merge.
>> (do_merge): Set merge_cmd_baton_t's new 'is_rollback' member to reflect
>> the current merge source being merged.
> ]]]
>
> [[[
>> Index: subversion/libsvn_client/merge.c
>> ===================================================================
>> --- subversion/libsvn_client/merge.c (revision 33012)
>> +++ subversion/libsvn_client/merge.c (working copy)
>> @@ -264,6 +264,13 @@
>> svn_ra_session_t *ra_session1;
>> svn_ra_session_t *ra_session2;
>>
>> + /* Whether the merge being performed is a reverse or forward merge.
>> + Note: Since combinations of forward and reverse merges are supported
>> + by svn_client_merge_peg3() the value of this field may change and
>> + applies only to the merge_source_t *currently* being merged by
>> + do_merge(). */
>> + svn_boolean_t is_rollback;
>
> This field serves the immediate purpose. I did notice before, when
> looking at storing info about conflicts, that it would be useful to have
> the merge source (URL1_at_REV1, URL2_at_REV2) stored in the baton, where
> currently we have only URL2 (called "url") and now the information (REV2
> < REV1).
>
> See the attached patch to replace "url" with a "merge_source" structure.
>
> Dou you think that would be better?

For the purposes of my patch it is more than is needed, but if for
conflict resolution it would be helpful I see no harm in including it
now. Though I'd like to be explicit about how these values change as
different merge sources are processed (similar to my comment for
IS_ROLLBACK), something like:

  /* The left and right URLs and revs. The value of this field changes to
     reflect the merge_source_t *currently* being merged by do_merge(). */
  merge_source_t merge_source;

>> /* Pool which has a lifetime limited to one iteration over a given
>> merge source, i.e. it is cleared on every call to do_directory_merge()
>> or do_file_merge() in do_merge(). */
>> @@ -564,9 +571,11 @@
>> TRUE, -1, ctx->cancel_func,
>> ctx->cancel_baton, subpool));
>>
>> - /* Don't add mergeinfo from PATH's own history. */
>> - SVN_ERR(filter_self_referential_mergeinfo(&props, path, merge_b,
>> - adm_access, subpool));
>> + /* If this is a forward merge then don't add mergeinfo from
>> + PATH's own history. */
>
> The wording can be read as implying that we might want to add such
> mergeinfo during a reverse merge. Can we make the intent more explicit:
>
> If this is a forward merge then don't add mergeinfo from
> PATH's own history, and take this opportunity to remove any
> that might already be present.
>
> If this is a reverse merge then we can't be adding any, and
> we don't want to filter out any that might already be
> present because we want the result in simple cases to be
> exactly as if the forward merge had never been performed.

As mentioned above, since filter_self_referential_mergeinfo does not
affect the WC's existing mergeinfo most of this is moot. Though I did
make it a bit more explicit:

+ /* If this is a forward merge then don't add new mergeinfo to
+ PATH that is already part of PATH's own history. */

>> + if (!merge_b->is_rollback)
>> + SVN_ERR(filter_self_referential_mergeinfo(&props, path, merge_b,
>> + adm_access, subpool));
>>
>> err = svn_wc_merge_props2(state, path, adm_access, original_props, props,
>> FALSE, merge_b->dry_run, ctx->conflict_func,
>> @@ -5867,6 +5876,7 @@
>> merge_cmd_baton.paths_with_new_mergeinfo = NULL;
>> merge_cmd_baton.ra_session1 = ra_session1;
>> merge_cmd_baton.ra_session2 = ra_session2;
>> + merge_cmd_baton.is_rollback = (rev1 > rev2);
>>
>> /* Populate the portions of the merge context baton that require
>> an RA session to set, but shouldn't be reset for each iteration. */
> ]]]
>
> In summary, I've no problem with you committing this if you could give
> some sort of tweak to the comment I commented on, and all the rest of my
> thoughts are just for your consideration!

Seems that thought still holds, so I'll commit this later today
(baring any objections of course).

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org

Received on 2008-09-11 18:26:42 CEST

This is an archived mail posted to the Subversion Dev mailing list.