Re: Symmetric Merge -- Algorithm

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 20 Apr 2012 18:05:33 +0100 (BST)

Cristian Amarie wrote:

>> [...] You are talking about any graph consisting of a sequence
>> of 'complete' merges[1] between the
>> two branches A and B. Is this a concrete example, where n=2 and m=3?
>>
>>     / -- p ---- q ----- A1 -- s ----- A2
>> O   \      \      /      \    /
>>     \ --- B1 --- B2 -- r ----- B3 -- t
>>
>> Here, p/q/r/s/t means a change that is not a merge. The p/q/r/s/t states
>> are the base candidates, not Ab and Bb.
>
> Ah... my mistake. But that is the idea, yes.
>
>> Did you mean, "(merge B into A produces Ax, where 1 <= x <= n)" and
>> "(merge A into B produces By, where 1 <= y <= m)"?
>
> Yes. B into A will produce in this example A3,

No, B into A in this example will be a no-op (no operation) merge, because all of the changes from B are already included in A. There will be no change in the working copy, so if you try to commit nothing will happen. There will be no A3.

> A into B will produce B4

No, again it will be a no-op. I think we need some more changes on the graph, like this:

    / --- p --- q ----- A1 ---- s -- j -- A2 -- v
O      \    \      /      \      /
    \ h -- B1 -- B2 -- r -- i -- B3 ---- t -- u

Now, merging B into A would produce a new node A3:

    / --- p --- q ----- A1 ---- s -- j -- A2 -- v -- A3
O      \    \      /      \      /      /
    \ h -- B1 -- B2 -- r -- i -- B3 ---- t -- u -----

while merging A into B would produce a new node B4:

    / --- p --- q ----- A1 ---- s -- j -- A2 -- v
O      \    \      /      \      /       \
    \ h -- B1 -- B2 -- r -- i -- B3 ---- t -- u -- B4

> and my goal is to find bases which will have at least a way
> (path) for which A3 == B4 (symmetry).

OK, I think it will help if I try to explain the 'big picture'.

The reason I called this merge algorithm 'symmetric' is because, unlike
the existing (sync and reintegrate) merges, it behaves in the same way
no matter which branch was created first (A copied from B, or B copied
from A) and no matter which direction the previous merge went (A to B,
or B to A).

It is also true that this merge algorithm should produce the same basic
result no matter which direction you merge (A to B, or B to A). A3 and
A4 might not be textually identical, because in each merge there
are potentially some conflicts and the user (with the help of a tool)
might resolve the conflicts in slightly different ways. But
essentially, logically, yes, A3 == B4.

I have described (in the Wiki) how the algorithm will look for a base on A and base on B and will then choose the later one. However, we do not really need to find two different bases; that is just an implementation detail, an artifact of how the existing (sync and reintegrate) merge code works. All we really need is to find the latest one ('t' in these examples). The *same* base ('t') is the correct base for a merge from A to B and the correct base for a merge from B to A.

We could choose *any* node as the base for our 3-way merge, and the merge would still work, in the sense that it would combine the correct sets of logical changes, and A->B and B->A would give the same result. The only trouble with choosing a poor base is that the 3-way merge will find more conflicts.

In graphs like the ones above (no criss-cross merges), the best base is the latest of the ones named p/q/r/s/t, which is 't'. To show why it is 'best', we can decompose each side of that merge (t -> v and t -> u) into a sequence of logical changes:

t -> v is: (A2 v)
is: (j v)

t -> u is: (u)

So the 3-way merge has 't' as its base (origin), and one side has changes 'j' and 'v' relative to that base, and other side has changes 'u' relative to that base.

If we choose the older base 's' as the base for our 3-way merge, we can decompose the two sides of that merge as:

s -> v is: (j A2 v)
is: (j s:B3 t v)
is: (j i t v)

s -> u is: (s:B3 t u)
is: (i t u)

Now the 3-way merge has 's' as its base (origin), and one side has changes (j i t v) relative
to that base, and other side has changes (i t u) relative to that base.

The 3-way merge of the change-sequences (j i t v) and (i t u) will produce the same result as the 3-way merge of the change-sequences (j v) and (u), but with much more potential for conflicts -- and clutter for the user, if it is displayed in a GUI. Remember that the logical changes 'i' might not be textually identical how it appears in branch A and how it appears in branch B.

Also, the merges will even 'work' (but again with more conflicts than necessary) if we choose something like A2 as the base. In that case, one path will traverse *backwards* along the A2 -> t edge, and so the decomposed set of changes seen will include *reverse changes*, which makes it even more confusing for the user looking at the 3-way merge, although algorithmically it is still correct.

Let me know if that doesn't make sense to you.

[...]
> You got my idea exactly (despite the clumsy description...).
> Also
> B2 -> r -> A1 -> B3 -> s -> t -> A2 ==> A3 (end in A)
> B2 -> r -> A1 -> B3 -> s -> A2 -> t ==> B4 (end in B)
> can be possible paths, if A3 == B4 is verified.

Those last two paths are strange: they do not strictly follow the 'merge arrows' nor the 'natural history'. A1 -> B3 and s -> t are not simple 'edges' of the graph, and A2 -> t is an edge but you are following it backwards.

- Julian
Received on 2012-04-20 19:06:09 CEST

This message: [ Message body ]
Next message: Julian Foad: "Re: Symmetric Merge -- switched subtrees [was: no local mods or switched subtrees by default]"
Previous message: Paul Burba: "Re: Symmetric Merge -- switched subtrees [was: no local mods or switched subtrees by default]"
In reply to: Cristian Amarie: "Re: Symmetric Merge -- Algorithm"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]