Re: Implicit keep-alive after reintegrate merge

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Mon, 16 Jan 2012 18:35:51 +0000 (GMT)

Hi Paul. Here's an updated patch, which is both simpler (now that I learnt more about how the existing code works) and contains tests that demonstrate the functionality (merge_reintegrate_tests.py 18 and 19). I think merge_reintegrate_tests.py 19 demonstrates it better than 1000 words so I'll leave it at that for today. I'll respond to your last response (below) maybe tomorrow. - Julian ----- Original Message ----- > From: Paul Burba <ptburba_at_gmail.com> > To: Julian Foad <julianfoad_at_btopenworld.com> > Cc: "dev_at_subversion.apache.org" <dev_at_subversion.apache.org> > Sent: Wednesday, 11 January 2012, 19:57 > Subject: Re: Implicit keep-alive after reintegrate merge > > On Wed, Jan 11, 2012 at 7:43 AM, Julian Foad <julianfoad_at_btopenworld.com> > wrote: >> Hi Paul. Thanks for your thoughts. >> >> Paul Burba wrote: >> >>> Julian Foad wrote: >>>> We can say for sure that when we reintegrated B to A (in A:40), > that >>>> will have added new mergeinfo on A describing merges from B. > However, >>>> if change A:40 had instead been a different merge into A, let's > say >>>> from C, it is still possible that merge might have brought along > some >>>> new mergeinfo describing merges from B, because of the way > mergeinfo >>>> is propagated from branch to branch. Therefore, if we find that >>>> change A:40 adds new mergeinfo about merges from B, we cannot > simply >>>> say that A:40 describes a reintegrate merge from B. >>> >>> Absolutely! There's also the similar case where 'B' is > reintegrated >>> to 'A', additional edits are made to 'A', and then the > whole thing is >>> committed as r40. Now r40 represents both the reintegrate change and >>> unrelated edits -- but of course the smallest unit mergetracking works >>> with is a revision, so how does it classify r40? >> >> Actually, I don't think we need to be concerned about the "mixture > of merges and new changes" scenario, and here's why. As we know, > merging is not a deterministic operation, and requires user input in general > [1]. When the user does a merge and commits the resulting working copy, the > result is tracked as a merge. There's no conceptual way Subversion could > distinguish the two kinds of change without the user telling it. The way to > tell Subversion about two separate changes is by putting them in two separate > revisions [2]. > > Hi Julian, > > I agree that users making changes in addition to a merge in a single > commit are asking for problems. But the underlying problem of > detection doesn't seem any different from what you described as: > > "However, if change A:40 had instead been a different merge into A, > let's say from C, it is still possible that merge might have brought > along some new mergeinfo describing merges from B, because of the way > mergeinfo is propagated from branch to branch. Therefore, if we find > that change A:40 adds new mergeinfo about merges from B, we cannot > simply say that A:40 describes a reintegrate merge from B. We need to > look more closely. That's what I'm currently working on." > > Maybe I need to wait to see what your current work yields before > commenting much more :-) Nah, more comments below... > >> So, "Don't commit new changes with a merge" is a rule we have > to embrace if we want to make progress in defining more useful merge behaviour. > Or rather the rule is a bit softer: "You shouldn't commit new changes > with a merge, because Subversion won't track the new changes separately but > will assume they are part of the merge conflict resolution." >> >> In fact the only cases in which Subversion's behaviour will actually be > affected by this rule are multi-path or cyclic merges, that is the cases where > we want to avoid merging a change to a branch that already has "it". >> >> The only form of cyclic merge we've supported up till now has been the > reintegrate merge, and that is a special case because it is limited to cases > where it doesn't need to track individual changes, it simply compares at the > entire states of the two branches. The effect is that it doesn't matter > whether the user included new changes along with any merges into the source > (child) branch. >> >> In a multi-path merge, such as the third step of A->B, A->C, B->C, > we don't yet support automatically skipping any changes coming via B that > have already been merged directly from A to C. Subversion will still try to > merge that whole change from B to C, and the already-merged part of it will > conflict (logically if not physically). The user's options are to avoid > merging that specific revision from B to C (perhaps by record-only merging it), > or to deal with that conflict. If a new change had been committed along with > that A->B merge, neither option successfully merges that new change to C. >> >> So the reintegrate merge doesn't care if you have mixed a merge with a > new change, and for the multi-path scenario the user already needs to keep > merges separate from new changes. >> >> Now, what about the case I'm addressing? That is, the sync merge into > a branch that has already been reintegrated. If the user wants to carry on > using the branch in this way, up till now it has been the user's > responsibility to tell Subversion to ignore the reintegrate merge commit, by the > record-only merge known as the "keep-alive dance". If there had been > a new change mixed up with that reintegration, the user would be telling > Subversion to ignore it. So here again we already had the rule that you have to > keep new changes separate. >> >> [1] Of course we have automatic text merge tools which give the user a good > guess at a likely >> answer, which often turns out to need no further editing in simple >> cases. >> >> [2] I don't believe it would be productive to invent a mechanism for >> recording two separate changes within one revision, and if we did the >> user would still have to tell Subversion about them. >> >> >>>> We need to look more closely. That's what I'm currently > working on. >>> >>> This runs up against issue #2897 'Reflective merges are > faulty'. You >>> seem to be acknowledging this with your comments in the patch itself: >>> >>> "Note that with all these improvements, the result might > approach a more >>> flexible merge system in which any reflected or cyclic changes are > skipped, >>> *BUT WOULD NOT COPE* (emphasis mine) with the case where a revision >>> in the source branch contains a mixture of changes merged from the > target >>> branch and other changes." >>> >>> But if we can't cope with that case how can we do anything > that's an >>> improvement over what we do today (i.e. just merge r40)? We can't >>> skip r40 based on mergeinfo alone because there might be other >>> legitimate changes. >> >> There are two potential kinds of "other" changes. New changes > mixed in with the merge are against the rules, as explained above. Other merges > mixed in with that merge are, I think, legitimate. > > Again, I agree that the first case is really one where the user is > asking for trouble, so I can live without solving that. Thing is, we > both agree the second type of change should be detected. What puzzles > me is that if we can reliably detect the second type it seems we could > detect the first type too... > >> If we can calculate that r40 contains nothing but changes merged directly > from branch B, then the Right Thing is to skip it. If it contains other changes > (or if we can't determine for sure), then we can tell the user what's > happening (that it would be wrong to try to merge this change because it > contains "...") and we can either skip it or stop the merge. >> >> My contention is that that would be better than what we do today. Note > that what we do today is unconditionally *attempt* to merge r40 even though > (with my new code) we could know for sure that it will conflict logically (and > will probably but not necessarily conflict physically). > > I agree, sorry if I came across as opposed to improvement in this > space, I'm not. > >> Another way of looking at it is that when we merge a revision that will > logically conflict with the target, today the default action is to attempt to > merge it anyway, and my proposed action is to tell the user what's happening > and allow them to take corrective action in advance. The user could choose a > different merge source, or perhaps realize that they were trying to merge into > the wrong target branch, or something else. >> >> Does that make sense? > > A random thought about your patch: The skip/stop logic kicks in when > performing a cherry pick merge. It might be better if it only applies > when doing sync merges. Reasoning: If a user explicitly states the > revisions she wants to merge, I think we should assume she knows what > she wants. Also, if our detection method isn't foolproof we need a > way for the user to merge a revision we want to skip (or we error out > on), but we still want mergetracking active so using --ignore-ancestry > isn't an option. If the user is relying on Subversion to select the > appropriate revisions (i.e. a sync merge) then your patch's behavior > makes more sense. > > Paul >

application/octet-stream attachment: merge-avoid-reflective-3.patch

Received on 2012-01-16 19:36:32 CET

This message: [ Message body ]
Next message: Daniel Shahaf: "Re: svn commit: r1232085 - /subversion/trunk/contrib/server-side/mod_setlocale/mod_setlocale.c"
Previous message: C. Michael Pilato: "Re: [PATCH] repro-template.bat: Avoid three "File not found" errors when run for the first time"
In reply to: Paul Burba: "Re: Implicit keep-alive after reintegrate merge"
Next in thread: Julian Foad: "Re: Implicit keep-alive after reintegrate merge"
Reply: Julian Foad: "Re: Implicit keep-alive after reintegrate merge"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]