Re: Subversion merge creates bogus tree conflicts

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Tue, 15 Jan 2013 17:06:20 +0000 (GMT)

David Moon wrote:
> On Jan 15, 2013, at 8:38 AM, Julian Foad wrote:
>> I think the main point is you need to understand that
>>
>> "svn merge" doesn't calculate the union of two sets of nodes, it combines
>> two sets of *changes* that have been made since a common starting point.
>>
>> You might want to read the section on merging in The Book
>> <http://svnbook.red-bean.com/>.
>
> I do understand that, and I did read that.

OK David, I'm sorry for the wrong assumption.

Now, having read through the whole of this email, let me first summarize, and then respond to some particular points.

I agree Subversion's merge should generally work better than it does in scenarios similar to the one you presented. There's a lot of work to do in this area (merging, tree conflicts, conflict resolution). It is my main focus these days.

Your test case attempts to use an automatic merge command to merge unrelated branches (see below re. "unrelated"), and this is not expected to work, and in 1.8 it will in fact throw an "unrelated branches" error. That led Ben and me to conclude your report is invalid. However, by adding a revision range to the merge command we can request a non-automatic merge, which can be appropriate in cases similar to yours. But when I try that with your case, I still get the unexpected conflict -- see below.

Finally, migration from ClearCase is notoriously difficult to do right. The output of such a migration tool cannot be considered /a priori/ to be valid or reasonable Subversion usage.

Now some detailed responses.

> In this case the common starting point is revision zero, i.e. the initial empty
> state of the repository. I do understand that most Subversion merge use cases
> have a non-empty common starting point, but this case does not.

For the purpose of automatic merging -- that is, using merge tracking to figure out what to merge when you don't specify the revisions -- Subversion assumes there is a common ancestor of the *node* (file or directory) that you're merging. The empty state of the repository (the root directory at revision zero) doesn't serve this purpose, because (in your test case) the directories you are merging are independently created *subdirectories* of the root directory, not *branches* of it.

The nodes I'm talking about are the root nodes of the merge: that is, in your test case, the directories 'B1' (the source branch) and 'B2' (the target branch).

You didn't state what nodes were already present at the start of your test case; I'm assuming 'test1' was not present and so 'B1' was created by the first commit shown, and 'B2' originated as 'T1' which was created in the second commit shown.

(If, however, 'B1' and 'T1' did already exist at the beginning of the test case, and had a common ancestor, then an automatic merge would be valid and the analysis would be a bit different.)
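To illustrate what "common ancestor of the node" means here, the following is a toy sketch in Python, not Subversion's actual data structures or algorithm. It assumes, as above, that 'B1' and 'T1' were each created from scratch and that 'B2' is a copy of 'T1'. The path 'test1/branches/B2' and the names `copied_from`, `lineage` and `common_ancestor` are made up for the illustration.

```python
# Toy model of node ancestry (an illustration, not how Subversion stores history).
# Each path maps to the path it was copied from, or None if it was created
# from scratch (for example by "svn mkdir").
copied_from = {
    "test1/branches/B1": None,             # created independently in r1
    "test1/tags/T1": None,                 # created independently in r2
    "test1/branches/B2": "test1/tags/T1",  # assumption: B2 is a copy of T1
}


def lineage(path):
    """Return the path followed by every path it was (transitively) copied from."""
    chain = []
    while path is not None:
        chain.append(path)
        path = copied_from[path]
    return chain


def common_ancestor(a, b):
    """Return the nearest node in a's lineage that is also in b's lineage, or None."""
    b_line = set(lineage(b))
    for node in lineage(a):
        if node in b_line:
            return node
    return None


print(common_ancestor("test1/branches/B2", "test1/tags/T1"))      # test1/tags/T1
print(common_ancestor("test1/branches/B1", "test1/branches/B2"))  # None
```

In this model the repository root never appears in any lineage, because neither 'B1' nor 'T1' was copied from it. That is the sense in which the empty root at revision zero does not count as a common ancestor.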

> You can excuse Subversion's behavior by saying the two directories named D2
> were created independently, so they can't be merged, but that doesn't
> make the current behavior useful. Anyway the two directories D1 in my testcase
> were also created independently, yet Subversion was able to merge them. Is it
> really right that independently created directories with identical sets of child
> names can be merged, but independently created directories with disjoint sets of
> child names cannot be merged?

Like I said, we're not merging directories, we're merging changes. The distinction is important. More precisely, in this kind of merge, we're merging a change into an existing tree. What change is Subversion trying to merge into the directory D1 in the target branch, and what change into D2? If I run your commands starting with an empty repository, I can see exactly what happens:

[[[
+ svn mkdir --parents $REPOS/test1/branches/B1/D1/D2/A -m xx
Committed revision 1.
+ svn mkdir --parents $REPOS/test1/tags/T1/D1/D2/B -m xx
Committed revision 2.

...

+ cd B2
+ svn merge -r0:HEAD $REPOS/test1/branches/B1
--- Merging r2 through r5 into '.':
C D1/D2/A
--- Recording mergeinfo for merge of r2 through r5 into '.':
U .
Summary of conflicts:
Tree conflicts: 1
]]]

The "--- Merging r2 through r5" line shows that Subversion decides the first change it needs to merge into the target is r2. Notably, it does not try to merge r1, which included the *creation* of 'B1' and of 'A'. The first change within B1 that it tries to merge is a change *inside* A (the creation of 'test1.txt'), and this change can't be merged into the target because the target has no 'A' inside which to make that change.

That is why you see a tree conflict described as 'local delete, incoming edit'. The directory 'A', which was expected to exist in the target before change r2 could be merged, is missing from the target and so (because we assume you started from a common state) appears to have been 'deleted'.
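The difference between merging directories and merging changes can be shown with a toy sketch in Python. It is a deliberately simplified model, not Subversion's merge algorithm: a tree is just a set of paths, a change is an ("add", path) pair, and the helper names `parent` and `apply_changes` are invented for the example. The second scenario, where the creation of 'A' is included, is hypothetical and only shows what the model does, not what Subversion would do.

```python
# Toy model: a tree is a set of paths and a change is an (action, path) pair.
# This only illustrates "merging changes"; it is not Subversion's algorithm.

def parent(path):
    """Return the parent directory of path, or '' for a top-level path."""
    return path.rsplit("/", 1)[0] if "/" in path else ""


def apply_changes(target, changes):
    """Apply changes to a copy of target; return (new_tree, conflicts)."""
    tree = set(target)
    conflicts = []
    for action, path in changes:
        if action == "add":
            par = parent(path)
            if par and par not in tree:
                # The change assumes a parent that the target lacks, so the
                # model records a conflict on that missing parent.
                conflicts.append(par)
            else:
                tree.add(path)
    return tree, conflicts


# Target branch B2, which came from T1: it has D1/D2/B but no D1/D2/A.
target = {"D1", "D1/D2", "D1/D2/B"}

# What gets merged when r1 (which created A) is left out.
merged = [("add", "D1/D2/A/test1.txt")]
_, conflicts = apply_changes(target, merged)
print(conflicts)  # ['D1/D2/A']

# Hypothetical: the same merge if the creation of A were included.
with_creation = [("add", "D1/D2/A"), ("add", "D1/D2/A/test1.txt")]
tree, conflicts = apply_changes(target, with_creation)
print(conflicts)     # []
print(sorted(tree))  # A and test1.txt now sit next to the existing B
```

In the second run, adding 'A' next to the existing 'B' causes no conflict in the model, since the two names are different entries in the same directory.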

I'm not entirely sure whether Subversion should be choosing to merge a bit more -- perhaps "all of revision r1 except for the creation of the branch-root 'B1'" -- but at this point I still think the use case is invalid. Or, put another way, the use case is certainly something people might want to do from time to time, but the "merge" command is not the tool to do it. Rather, it requires some manual steps to set up a starting point from which "merge" can then do the rest.

> On Jan 14, 2013, at 10:03 PM, Ben Reser wrote:
>> Looks to me like you're using the merge command wrong.
>>
>> 1) You're using the sync merge format of the 1.7.x merge command,
>> however the two branches you're merging (B1 and B2) have no common
>> ancestor. Since B2 was copied from T1 which you independently
>> created, there's really no way for the command to know what to do.
>> The fact that it does anything can be considered a bug
>
> I would consider it a bug that svn merge didn't recognize that the initial
> empty state of the repository was the common ancestor, so it needed to merge all
> the changes on both branches. You might disagree.

As explained above, Subversion has a stricter definition of node ancestry.

>> 2) If what you're really intending to do here is a cherry-pick merge
>> of the changes that were made on B1 (adding A and test1.txt) then....
>
> Isn't it strange to you that you tried a bunch of variations and some of
> them worked and others didn't? It seems strange to me. Maybe that's
> because I don't understand Subversion well enough to see a pattern in what
> works and what doesn't.

Ben's demonstration didn't seem strange to me; it was spot on as an aid to understanding (rather than as a practical solution). He was showing the kind of steps that are needed to go from a pair of unrelated branches to a point where you can use the "merge" command to do the rest. It might make more sense to you now in retrospect if you've followed what I wrote above.

>> Your example use case seems so far from any realistic
>> scenario that it's hard to envision what you're actually doing here.
>
> I didn't want to distract you by providing information about anything other
> than the boiled-down test case, but maybe that was a mistake. The real-life
> situation involves migration from ClearCase using ClearVision's migrate2svn
> tool. Because of the way that tool works, elements which were the same object
> in ClearCase can be created multiple times independently in Subversion, without
> telling Subversion that they are related, other than having the same name in
> different branches. In addition, that tool can create partially populated
> branches which then need to be merged together to create a fully populated
> branch. It was that final merge that failed with tree conflicts. So it really
> is a realistic scenario if you think migration from ClearCase to Subversion is
> realistic.

It is the migration tool's job to convert the concepts of branches and everything else from ClearCase to Subversion. This is notoriously hard for CC->SVN in particular, and most of the available tools are woefully inadequate in general. I spoke recently to someone who, to the best of my knowledge, is at the state of the art in CC->SVN tooling, having solved most of these kinds of issues in many large migrations; if you're interested, ask about that in a separate email thread and I'll try to find his details.

>> I can understand the argument that maybe we should handle this
>> directory creation conflict a little more gracefully, but it does seem
>> to be a legitimate tree conflict.
>
> If one of the two things being merged created a directory and the other deleted
> the same directory, I would call that a tree conflict. But I don't see how
> two adds to a directory can be a conflict; if the names being added are
> different, there is no conflict. If the names being added are the same, merge
> the children. However, I don't understand Subversion's definition of
> tree conflict. http://svnbook.red-bean.com/en/1.7 doesn't define it, and
> the example it gives involves rename vs. edit conflict, not add vs. add
> conflict.

Two adds of different names to the same directory do not conflict. The conflict you saw is on attempting to add a file into a directory ('A') that didn't exist on the target.

> I don't really want to have an argument. I am just trying to point out that
> Subversion's behavior is not useful in this case. I am also not sure
> Subversion merge is behaving consistently in all cases. It's the Subversion
> development community's choice whether to improve the behavior or leave it
> like it is.

OK. I am (and we are) certainly aware that there is a lot of inconsistency and lack of simple useful "just do the right thing" behaviour in this area. We are working on it.

Thank you for taking the time to provide this feedback.

- Julian

> And before I forget, thank you both for the quick and thoughtful responses.
>
> --Dave Moon
Received on 2013-01-15 18:06:56 CET
