Re: Problems Reintegrating the fsfs-format7 branch
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Tue, 5 Mar 2013 16:37:39 +0000 (GMT)
Paul Burba wrote:
What I was thinking of here is to detect the most general condition that isn't handled properly. Let me explain this part, although I'm sure you're familiar with it.
Whenever the direction of merge is from A to B and the automatic merge code doesn't detect that a reintegrate merge is required, and yet one or more cherry-picked revision ranges and/or subtree merges have been merged from B to A, Subversion currently doesn't notice those merges from B to A, and proceeds to merge "everything" from A to B including those changes that were the result of B-to-A merges. That is what the plain "merge" from A to B has always done -- it has always ignored any mergeinfo describing merges from B to A.
In the example at hand, there has been a B-to-A merge of everything from B (except some non-operative revisions), and so we get loads of conflicts by ignoring this when we merge the other way. The intelligent user thinks, "This is clearly a reintegrate scenario."
On the other hand, if there has been a B-to-A merge of not quite everything from B (leaving an operative unmerged part at the beginning or somewhere in the middle), then an old "reintegrate" merge from A to B would error out, whereas an old non-reintegrate merge would just do the merge and wrongly re-merge all the B-to-A changes back to B. The new automatic merge, if it determines (based on the merging of the branch root node) to do a non-reintegrate, currently does the latter.
At some point, maybe not for 1.8, if we can detect any such condition, it would be a good to warn the user:
A clean merge from A to B is not possible because some but not all
We could add some suggestions to the warning:
If you are expecting this to be a reintegrate-like merge, check that
If you are expecting a sync merge, note that there have been some
We could add more specific details of what's been found, and more specific suggestions related to the details, such as the case at hand where all operative revs have been merged B-to-A, if we think it's worth the effort.
A basic implementation would not be cheap if there is a significant amout of subtree mergeinfo to check. And I have carefully avoided the issue of whether we should error out or proceed with the merge, as there's no point bike-shedding about that unless we really decide to do this.
So let's take a closer look at what we could do for the specific case at hand.
> To explicitly use
So, you're thinking more about detecting the very specific kind of scenario that Stefan encountered -- where everything except some leading non-operative revisions had been merged B-to-A, and a reintegrate would give the "correct" result.
In an immediate sense, certainly we could "just do it", special-case this particular scenario (have you thought about what the exact condition should be?) and automatically choose the reintegrate code path in that case.
The trouble is, that doesn't resolve the situation completely. It gives the correct result in terms of the changes merged, but it doesn't straighten out the mergeinfo. That means we have to make sure the rest of the merge subsystem consistently handles this case too: the "sync"-style merge, merge graphing tools that determine what kind of merge was done and what kinds of merge are eligible, and so on. See below.
> I filed an issue for this:
I agree it's atypical (or rather unintended) usage, and I also acknowledge that if Stefan used merge like this, then sometimes other people will do too.
However, I don't want us to put any more special cases in the code, especially without putting them into the *design*. If the old "--reintegrate" accepts a missing leading no-op revision range in considering whether "everything" has been merged, that should now be considered a bug. The new rule is that if every revision on B is *recorded* as merged to A, then we can reintegrate from A to B, else we can't.
The main alternative would be to adopt some principle such as "When determining what's been merged and what needs to be merged, we only notice operative revisions, aka non-null changes". I considered that during the design of automatic merge, and was strongly drawn to the idea and started prototyping around it, but that turned out to be a rat-hole involving potentially exhaustive traversal of the whole mergeinfo graph (all branches) to determine whether a given bit of mergeinfo represented an operative or an inoperative change. And the benefits are near zero, because both sync and reintegrate merges were already filling in any gaps in the mergeinfo in the direction of the merge so the situation can only arise if you use manual merges.
The sensible choice in the end was to require that revision ranges are explicitly recorded whether operative or not.
Having found a case that we suspect may occur frequently enough to cause annoyance, I believe the best thing to do is to admit that we don't handle it automatically. Just because the old reintegrate handles this particular case doesn't mean we're making a mistake if the new automatic merge doesn't. This behaviour change is a small adjustment towards overall more consistent merge behaviour, and consistency is key to large-scale usability, much more so than special casing is.
So, what to do exactly? Options seem to be:
0) leave it as it is
1) detect this specific case and warn or error out
2) detect this specific case and do a reintegrate
3) detect the general case and warn or error out
4) detect the general case and merge properly (hah! that's for 1.9)
I would say 0/1/2 are achievable and 3/4 are too difficult, and 2 is a bad idea.
This is an archived mail posted to the Subversion Dev mailing list.