[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Tree conflicts - thoughts on use cases, merging, and tests

From: Nico Schellingerhout <nico.schellingerhout_at_philips.com>
Date: Tue, 11 Mar 2008 01:48:09 +0100

Julian Foad <julianfoad_at_btopenworld.com> wrote on 03/10/2008 09:33:23 PM:

> Stephen,
>
> I'm thinking about all the different things a user can do (including
renames
> and such like) and how the user can do these things in one branch, do
other
> such things in another branch, and try to merge the two together. Then
I'm
> thinking about how Subversion maps these many different user-level
> changes down
> onto a smaller set of delta-editor operations under the hood, and how
that's
> the place where we're trying to detect conflicts having lost the
information
> about the user's original intention.
>
> This email doesn't have a conclusion, it's just me conveying my
> thought process
> in the hope that it will help us to communicate.
>
>
> USE CASES
>
> I have started by categorising all the user actions I can think of into
four
> categories that reflect whether they delete and/or create an object.This
is a
> list of all possible kinds of change to the file object at a particular
path
> (or, for creation, about to live at that path). Listed below each kind
of
> change are non-exhaustive examples of user-level actions that cause
> that change.
>
> I'm not entirely sure that these four categories are the best way
tosplit up
> the possibilities, but they are serving a useful purpose for me for
> the time being.
>
> Kinds of change, and user-level actions that cause them:
>
> CREATE
> add (a new object without history)
> copy(+modify?) (from any object in any revision)
> move(+modify?) (from any object in the head revision)
>
> GO-AWAY
> delete
> move(+modify?) (to another name and/or into another dir)
>
> REPLACE
> replace with different kind (DELETE by any means, and CREATE byany
means
> an object of the same name)
> replace with same kind (DELETE by any means, and CREATE by any
means an
> object of the same name)
> revert (delete, and copy(+modify?) from this object's history -a
special
> case of replace)
>
> MODIFY
> modify (any text and/or property mods including creation and
deletion of
> properties)
>
>
> Next I thought briefly about the behaviour I would expect when merging a
pair
> of changes, and tabulated the results according to those four
categories.
>
> Desired behaviour for merging two modifications that each start fromthe
same
> base (as in "svn update"):
>
> (For files)
>
> Action ... merged onto ... "Reason"
> | |
> v | CREATE GO-AWAY REPLACE MODIFY
> ------- + ------- ------- ------- -------
> CREATE | C X X X
> |
> GO-AWAY | X C C C
> |
> REPLACE | X C C C
> |
> MODIFY | X C C ok
>
> C = conflict (unless same change to same thing and policy allows it)
> X = can't happen if branch was synchronised (so flag an error or
> a conflict)
> ok = merge in the obvious way
>
>
> Summary of the kinds of change covered by the use cases 1 to 6:
>
> (For files)
>
> Action ... merged onto ... "Reason"
> | |
> v | CREATE GO-AWAY REPLACE MODIFY
> ------- + ------- ------- ------- -------
> CREATE |
> |
> GO-AWAY | 3,,6 2,,5
> |
> REPLACE | (7)
> |
> MODIFY | 1,,4 (7)
>
> #,#,# = use case number for {update,switch,merge}

Julian, Steven,

Thanks for the write ups. I have interpreted the tables as "what should
happen if an action of type 1 is merged onto an action of type 2".
However, I don't think there is a difference between GO-AWAY and REPLACE
when used purely in this way. The difference between the two is slightly
more subtle: it involves taking the history on a branch into account.

>
> We talked about a "use case 7" involving replacement and modification;
that
> needs splitting into separate use cases and formalising.

Actually, I think UC7, which I raised earlier off-list, and that we
discussed
earlier over the phone is different from what you indicate:

UC7 can be read in your notation as:

"MODIFY merged onto a file that has been REPLACEd in an not-yet merged
change on either branch"

As far as I can see this case does not present new problems for update:
1. There has been a REPLACE and a MODIFY in two different revisions on
the archive. Updating should pull these changes in consecutively and
there should not be a conflict.
2. There has been a REPLACE on the archive, and a local MODIFY. Update
reduces to UC2 (merge a move onto a modified file).
3. There has been a MODIFY on the archive, and a local REPLACE. Update
reduces to UC1 (merge a modification on to a file being moved away).

However, there are problems for merging between branches:
1. There has been a REPLACE and a MODIFY in two different revisions on
the source branch. Merging should pull these changes in consecutively and
there should not be a conflict.
2a. There has been a REPLACE on the source branch, and a MODIFY on the
target.
Merging the REPLACE onto the source reduces to UC5.
2b. There has been a REPLACE and a MODIFY on the source branch.
Merging the MODIFY is a new case:

          r1 r2 r3 r4
             x->a edit x merge r2:3
source: x(x) a(x) a(x')
          a(a)

target: x(x) x(x) x(x) x(x') OR? x(x)
          a(a) a(a) a(a) a(x) a(x')

Notation: f(x) is a file with name "f" and contents "x". Files with the
same "identity" are written on the same line.

It is not easy to determine the right behaviour here. (I would favour
the first of the two options, because one could merge r1:2 later on and
would end up with two identical branches, which would be a good thing.)
The user could be warned to make the right call.

3. There has been a MODIFY on the source branch, and a REPLACE on target.
Merging the MODIFY onto the target is a new use case:

          r1 r2 r3 r4
                     edit a merge r2:3
source: x(x) x(x) x(x)
          a(a) a(a) a(a')

             x->a
target: x(x) a(x) a(x) a(x) OR? a(a')
          a(a)

It is not easy to determine the right behaviour here. (I would favour the
first of the two options, because merging r1:2 from target to source
would bring both branches into sync.)
The user could be warned to make the right call.

The problem with cases 2b and 3 is that the detection that works for all
other cases (seeing that a modify is to be merged onto a file that is
scheduled for deletion or is not there at all, or merging a delete on a
file that differs from the file being deleted), does not work here:
this is the perfectly "normal" case of a textual change being
merged on top of a file that does exist.

The only way to determine that these cases are actually conflicts (or at
the very least require special treatment) is to take the complete history
of the pair of files to be merged into account and determine if they share
a common history.

In case 2b a(x') on source and a(a) on target do not share a history,
because they started out as x(x) and a(a), respectively.

In case 3 a(a') on source and a(x) on target do not share a history,
because they started out as a(a) and x(x), respectively.

Some final remarks on UC4-6:
----------------------------

UC4:

----
In my opinion UC4 is obviously a tree conflict: you are trying to merge
a change onto a file that does not exist in the target branch. Something
weird must be going on (a tree conflict).
UC5: (move merged onto a file that has some changes in the target branch)
---
The condition for a conflict here could be that the file in the target
branch differs from the file in the source branch just before the delete.
However, this condition is only necessary because Subversion treats such
a merge as a copy from source to target, completely ignoring the
content of the original file on the target branch.
A better merge algorithm would be:
1. If the corresponding file on target exists (and shares a history),
perform the move on the target branch (instead of modeling it as a copy
from the source branch).
2. Merge the optional changes that may be in the same revision onto the
moved file on the target branch.
Example:
source/x(x) -> source/y(x') (arrow indicates a D/A+ pair)
Should be merged onto x(x'') as follows:
target/x(x'') -> target/y(x''') [where x''' stands for the textual merge 
of
delta(x,x') onto x'']
Instead of the current behaviour:
source/y(x') -> target/y(x')
Note that this requires the merge algorithm to "understand" how to
translate source/x(x) -> source/y(x) to target/x(x) -> target/y(x).
This is trivial if it is just a rename in the same directory, but more
complex if it is a move across directories (the destination directory
on the target may differ from the destination directory on the source).
Again, the notion of sharing history is crucial here, but here the
identity of the target parent directory is involved.
UC6: again, obviously always a tree conflict.
---
[...]
> TESTS
> 
> I was looking at the Python tests in the tree-conflicts branch. It's
> great that 
> you've already written tests.
> 
> I noticed that the test function "set_up_tree_conflicts()" claims to
> be setting 
> up for use cases 1, 2 and 3, but it does so by using "delete" 
> commands whereas 
> the use cases call for "move". I know that a "move" and a "delete" will 
both 
> get handled by the same code path in your implementation because they 
both 
> contain a delta-editor "delete-file" command, but I think we should 
write the 
> use-case tests to describe as accurately as possible the actual use 
cases. In 
> addition, we can have tests for other similar and different actions that 
we 
> think ought to trigger the conflict detection as well.
I agree, a move should be treated as a first class citizen: for detecting
conflicts it doesn't matter, but for resolving them (semi)automatically, 
it
does.
Just my 2 cts.
- Nico
Received on 2008-03-11 01:47:24 CET

This is an archived mail posted to the Subversion Dev mailing list.