Re: Implicit keep-alive after reintegrate merge
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Mon, 16 Jan 2012 18:35:51 +0000 (GMT)
Here's an updated patch, which is both simpler (now that I learnt more about how the existing code works) and contains tests that demonstrate the functionality (merge_reintegrate_tests.py 18 and 19).
I think merge_reintegrate_tests.py 19 demonstrates it better than 1000 words so I'll leave it at that for today.
I'll respond to your last response (below) maybe tomorrow.
----- Original Message -----
> From: Paul Burba <ptburba_at_gmail.com>
> To: Julian Foad <julianfoad_at_btopenworld.com>
> Cc: "dev_at_subversion.apache.org" <dev_at_subversion.apache.org>
> Sent: Wednesday, 11 January 2012, 19:57
> Subject: Re: Implicit keep-alive after reintegrate merge
> On Wed, Jan 11, 2012 at 7:43 AM, Julian Foad <julianfoad_at_btopenworld.com>
>> Hi Paul. Thanks for your thoughts.
>> Paul Burba wrote:
>>> Julian Foad wrote:
>>>> We can say for sure that when we reintegrated B to A (in A:40),
>>>> will have added new mergeinfo on A describing merges from B.
>>>> if change A:40 had instead been a different merge into A, let's
>>>> from C, it is still possible that merge might have brought along
>>>> new mergeinfo describing merges from B, because of the way
>>>> is propagated from branch to branch. Therefore, if we find that
>>>> change A:40 adds new mergeinfo about merges from B, we cannot
>>>> say that A:40 describes a reintegrate merge from B.
>>> Absolutely! There's also the similar case where 'B' is
>>> to 'A', additional edits are made to 'A', and then the
> whole thing is
>>> committed as r40. Now r40 represents both the reintegrate change and
>>> unrelated edits -- but of course the smallest unit mergetracking works
>>> with is a revision, so how does it classify r40?
>> Actually, I don't think we need to be concerned about the "mixture
> of merges and new changes" scenario, and here's why. As we know,
> merging is not a deterministic operation, and requires user input in general
> . When the user does a merge and commits the resulting working copy, the
> result is tracked as a merge. There's no conceptual way Subversion could
> distinguish the two kinds of change without the user telling it. The way to
> tell Subversion about two separate changes is by putting them in two separate
> revisions .
> Hi Julian,
> I agree that users making changes in addition to a merge in a single
> commit are asking for problems. But the underlying problem of
> detection doesn't seem any different from what you described as:
> "However, if change A:40 had instead been a different merge into A,
> let's say from C, it is still possible that merge might have brought
> along some new mergeinfo describing merges from B, because of the way
> mergeinfo is propagated from branch to branch. Therefore, if we find
> that change A:40 adds new mergeinfo about merges from B, we cannot
> simply say that A:40 describes a reintegrate merge from B. We need to
> look more closely. That's what I'm currently working on."
> Maybe I need to wait to see what your current work yields before
> commenting much more :-) Nah, more comments below...
>> So, "Don't commit new changes with a merge" is a rule we have
> to embrace if we want to make progress in defining more useful merge behaviour.
> Or rather the rule is a bit softer: "You shouldn't commit new changes
> with a merge, because Subversion won't track the new changes separately but
> will assume they are part of the merge conflict resolution."
>> In fact the only cases in which Subversion's behaviour will actually be
> affected by this rule are multi-path or cyclic merges, that is the cases where
> we want to avoid merging a change to a branch that already has "it".
>> The only form of cyclic merge we've supported up till now has been the
> reintegrate merge, and that is a special case because it is limited to cases
> where it doesn't need to track individual changes, it simply compares at the
> entire states of the two branches. The effect is that it doesn't matter
> whether the user included new changes along with any merges into the source
> (child) branch.
>> In a multi-path merge, such as the third step of A->B, A->C, B->C,
> we don't yet support automatically skipping any changes coming via B that
> have already been merged directly from A to C. Subversion will still try to
> merge that whole change from B to C, and the already-merged part of it will
> conflict (logically if not physically). The user's options are to avoid
> merging that specific revision from B to C (perhaps by record-only merging it),
> or to deal with that conflict. If a new change had been committed along with
> that A->B merge, neither option successfully merges that new change to C.
>> So the reintegrate merge doesn't care if you have mixed a merge with a
> new change, and for the multi-path scenario the user already needs to keep
> merges separate from new changes.
>> Now, what about the case I'm addressing? That is, the sync merge into
> a branch that has already been reintegrated. If the user wants to carry on
> using the branch in this way, up till now it has been the user's
> responsibility to tell Subversion to ignore the reintegrate merge commit, by the
> record-only merge known as the "keep-alive dance". If there had been
> a new change mixed up with that reintegration, the user would be telling
> Subversion to ignore it. So here again we already had the rule that you have to
> keep new changes separate.
>>  Of course we have automatic text merge tools which give the user a good
> guess at a likely
>> answer, which often turns out to need no further editing in simple
>>  I don't believe it would be productive to invent a mechanism for
>> recording two separate changes within one revision, and if we did the
>> user would still have to tell Subversion about them.
>>>> We need to look more closely. That's what I'm currently
> working on.
>>> This runs up against issue #2897 'Reflective merges are
> faulty'. You
>>> seem to be acknowledging this with your comments in the patch itself:
>>> "Note that with all these improvements, the result might
> approach a more
>>> flexible merge system in which any reflected or cyclic changes are
>>> *BUT WOULD NOT COPE* (emphasis mine) with the case where a revision
>>> in the source branch contains a mixture of changes merged from the
>>> branch and other changes."
>>> But if we can't cope with that case how can we do anything
> that's an
>>> improvement over what we do today (i.e. just merge r40)? We can't
>>> skip r40 based on mergeinfo alone because there might be other
>>> legitimate changes.
>> There are two potential kinds of "other" changes. New changes
> mixed in with the merge are against the rules, as explained above. Other merges
> mixed in with that merge are, I think, legitimate.
> Again, I agree that the first case is really one where the user is
> asking for trouble, so I can live without solving that. Thing is, we
> both agree the second type of change should be detected. What puzzles
> me is that if we can reliably detect the second type it seems we could
> detect the first type too...
>> If we can calculate that r40 contains nothing but changes merged directly
> from branch B, then the Right Thing is to skip it. If it contains other changes
> (or if we can't determine for sure), then we can tell the user what's
> happening (that it would be wrong to try to merge this change because it
> contains "...") and we can either skip it or stop the merge.
>> My contention is that that would be better than what we do today. Note
> that what we do today is unconditionally *attempt* to merge r40 even though
> (with my new code) we could know for sure that it will conflict logically (and
> will probably but not necessarily conflict physically).
> I agree, sorry if I came across as opposed to improvement in this
> space, I'm not.
>> Another way of looking at it is that when we merge a revision that will
> logically conflict with the target, today the default action is to attempt to
> merge it anyway, and my proposed action is to tell the user what's happening
> and allow them to take corrective action in advance. The user could choose a
> different merge source, or perhaps realize that they were trying to merge into
> the wrong target branch, or something else.
>> Does that make sense?
> A random thought about your patch: The skip/stop logic kicks in when
> performing a cherry pick merge. It might be better if it only applies
> when doing sync merges. Reasoning: If a user explicitly states the
> revisions she wants to merge, I think we should assume she knows what
> she wants. Also, if our detection method isn't foolproof we need a
> way for the user to merge a revision we want to skip (or we error out
> on), but we still want mergetracking active so using --ignore-ancestry
> isn't an option. If the user is relying on Subversion to select the
> appropriate revisions (i.e. a sync merge) then your patch's behavior
> makes more sense.
This is an archived mail posted to the Subversion Dev mailing list.