RE: "svn diff" and "svn merge" sittin' in a tree

From: Sander Striker <striker_at_apache.org>
Date: 2002-02-15 13:39:05 CET

> From: Karl Fogel [mailto:kfogel@newton.ch.collab.net]
> Sent: 15 February 2002 06:08

> Philip Martin <philip@codematters.co.uk> writes:
>> Greg Hudson <ghudson@MIT.EDU> writes:
>>> I hate to pull the emergency brake this late in the design, but after
>>> the recent discussion on using diff3 in "svn update", I think we're
>>> wasting our time designing an "svn merge" which simply takes the diffs
>>> from "svn diff" and applies them to the working copy. It will work, but
>>> it will never be as satisfying as CVS is.
>>>
>>> What we really want to do is take the same plaintexts as "svn diff" uses
>>> to make its diffs, and do a three-way-merge with them and the working
>>> copy.
>>
>> Absolutely, a three-way diff is required. That in itself is not a
>> problem, changing the "diff" callback to take three files instead of
>> two is quite simple.
>
> Okay, I confess to being completely confused here.
>
> Every so often, someone says "we can't just apply diffs to the working
> copy to do merge, we have to do a 3-way diff."
>
> I don't understand how the proposed 3-way diff helps us, or, if it
> does help us, how exactly it differs from what we've been discussing
> for "svn merge" all along.

Comments further down :)

> Above, Greg implies that what CVS does is good (presumably 3-way) and
> what we're going to do is bad (presumably not 3-way), while others
> have implied that "cvs merge" is not a good model because it is not
> 3-way :-). So below, without using the word "3-way", I'm going to
> describe how CVS's merge works, and how I've always assumed
> Subversion's merge is going to work, and show how they're the same
> method, and how I think they relate to what the `diff3' program does
> (and thus relate to 3-way merging?). If I miss something big, someone
> please make a noise.
>
> CVS's merge takes the differences between two revisions of a file F,
> and applies them to some third revision of F. In normal usage, the
> first two revisions are on one branch, and the third revision is in
> your working copy and is based on a different branch (note that the
> mainline is considered just another branch for these purposes). [I
> suppose theoretically you could find some way to merge changes from
> unrelated things to an unrelated thing, but the basic process
> described below would still be the same.]
>
> Here's CVS merge:
>
> cvs update -j TAG1 -j TAG2 foo.c
> vi foo.c ### inspect it for conflicts, resolve any
> cvs commit -m "Merged TAG1-->TAG2 changes into foo.c."
>
> which is essentially equivalent to:
>
> cvs up -p -r TAG1 foo.c > TAG1-foo.c
> cvs up -p -r TAG2 foo.c > TAG2-foo.c
> diff TAG1-foo.c TAG2-foo.c > patchfile
> patch foo.c < patchfile
> vi foo.c ### inspect it for conflicts, resolve any
> cvs commit -m "Merged TAG1-->TAG2 changes into foo.c."
>
> ... ignoring options to diff and patch, and ignoring the fact that
> `patch' doesn't actually put conflict markers into foo.c, so you'd
> have to examine a .rej file and all that, but you get the idea. Heck,
> you could also do
>
> cvs diff -r TAG1 -r TAG2 foo.c | patch foo.c
>
> It's all the same. Of course, real CVS *does* put conflict markers
> into the file; I believe it uses diff3 internally, to do that. But
> it's not doing anything fundamentally beyond the power of regular diff
> and patch, except for the nice markup (which not everyone agrees is
> nice, to my great surprise when we took an informal poll on this
> list).
>
> Okay, so what about "svn merge"?
>
> Same thing, basically. I'm improvising the command syntax here, so
> please bear with me. The fully general command is:
>
> svn merge -R<rev1>:path1 -R<rev2>:path2 foo.c
> ### inspect for conflicts, resolve any
> svn commit -m "merged changes from blah to blah, blah blah."
>
> But of course, it's more common to do something like this:
>
> svn merge -r<rev1>:<rev2> PATH-OR-URL <args optional, implied `.'>
> ### inspect for conflicts, resolve any
> svn commit -m "merged changes from blah to blah, blah blah."
>
> This would do exactly the same thing as CVS: take the differences
> between the specified paths and merge them into the right files in
> your working copy. (Not going to talk about taking previous merge
> history into account yet; that's an interesting but separate problem.)
>
> I won't bother to give the series of equivalent "svn diff" and patch
> commands, they're pretty obvious.
>
> Thus, both CVS merge and SVN merge involve three files: two "sources"
> and one "target". You could call the first source a "common
> ancestor", but that's stretching a bit.

Actually no, that's not stretching it. There is a common ancestor.
What you have is:

BASE, REPOS(HEAD), WC

REPOS and WC both come from editing the BASE.

> In a merge, we don't usually
> want *all* the changes between some genuine common ancestor and its
> descendant (cousin to our target) applied to our target. We want a
> certain subset of those changes, a range, and this is the ability CVS

You want the non-conflicting changes, and the ability to choose between
the conflicting ones.

> provides (and what I always presumed SVN would provide, but with
> better memory, to help the user get the right range[s]). Note that
> the left side of a given range is just as much a cousin of the target
> as the right side is, but for some reason in merge terminology it
> seems to be called the "common ancestor" of both the right side and of
> the target.
>
> Okay, on to `diff3'. This is from the `diff3' documentation:
>
> `diff3' can incorporate changes from two modified versions into a
> common preceding version. This lets you merge the sets of changes
> represented by the two newer files. Specify the common ancestor
> version as the second argument and the two newer versions as the
> first and third arguments, like this:
>
> diff3 MINE OLDER YOURS
>
> You can remember the order of the arguments by noting that they are
> in alphabetical order.
>
> You can think of this as subtracting OLDER from YOURS and adding
> the result to MINE, or as merging into MINE the changes that would
> turn OLDER into YOURS. This merging is well-defined as long as
> MINE and OLDER match in the neighborhood of each such change. This
> fails to be true when all three input files differ or when only
> OLDER differs; we call this a "conflict". When all three input
> files differ, we call the conflict an "overlap".
>
> Is this what is meant by "3-way merging"? I've always assumed it is.

The merging rules are quite simple (assume that changes are on
'synchronized' locations in the following):

  - The parts where there are no changes between OLDER, YOURS and
    MINE are copied verbatim.

  - If there is a change between OLDER and YOURS, but not between
    OLDER and MINE, the change is non-conflicting and goes in (the
    change made between OLDER and YOURS, that is).

  - If there is a change between OLDER and MINE, but not between
    OLDER and YOURS, the change is non-conflicting and goes in (the
    change made between OLDER and MINE, that is).

  - If there is a change between OLDER and YOURS _and_ the exact same
    change is present between OLDER and MINE, the change goes in
    (since it is non-conflicting).

  - If there is a change between OLDER and YOURS and there is a
    different change between OLDER and MINE, we have a conflicting
    change that will have to be resolved by the user (this is where
    conflict markers come in).
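
To make the rules above concrete, here is a minimal sketch in Python
that applies them to one already-synchronized region at a time. The
function name `merge_region` and the marker labels are made up for
illustration; this is not how diff3 or Subversion is implemented, and
it deliberately skips the hard part, which is finding the synchronized
regions in the first place.

```python
def merge_region(older, mine, yours):
    """Merge one synchronized region; each argument is a list of lines.

    Returns (lines, conflict). conflict is True when the user has to
    resolve the region by hand.
    """
    if mine == yours:
        # Either nobody changed the region, or both made the same change.
        return list(mine), False
    if mine == older:
        # Only YOURS changed: take the OLDER -> YOURS change.
        return list(yours), False
    if yours == older:
        # Only MINE changed: keep the OLDER -> MINE change.
        return list(mine), False
    # Different changes on both sides: wrap them in conflict markers.
    marked = (["<<<<<<< MINE"] + list(mine) +
              ["======="] + list(yours) +
              [">>>>>>> YOURS"])
    return marked, True
```

Note that every branch compares against OLDER; that comparison is the
information a plain two-file diff does not carry.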

> So the $64,000 question is, does `diff3' pay attention to the
> differences between MINE and OLDER? In other words, does it *really*
> treat OLDER like a common ancestor? Since OLDER is not actually a
> common ancestor, in most real-life use cases of merge, it would be
> problematic if `diff3' cared about differences between OLDER and MINE.
> Suppose the same particular change appears in both OLDER and YOURS,
> but MINE does not have this change, for example.
>
> Let's get concrete. Here are older.txt and yours.txt:
[...]

> And how does diff3 behave?
>
> $ diff3 -m -E mine.txt older.txt yours.txt
> HAN (cynically):
> Hokey religions and ancient
> weapons are no match for a
> good blaster at your side,
> kid.
>
> <<<<<<< mine.txt
> LUKE (suspiciously):
> =======
> LUKE (upset):
> >>>>>>> yours.txt
> You don't believe in the
> Force, do you?
>
> HAN:
> Kid, I've flown from one side
> of this galaxy to the other.
> I've seen a lot of strange
> stuff, but I've never seen
> anything to make me believe
> there's one all-powerful
> force controlling everything.
> There's no mystical energy
> field that controls my...
> HEY, THAT'S NO MOON!
>
> Looks suspiciously like CVS merge, doesn't it? :-)
>
> Note that Han is *still* not languorous. In other words, diff3 behaved
> basically the same as diff+patch would, except it nicely handled the
> conflict markers for us (nicely in my opinion, anyway; like I
> mentioned, apparently not everyone likes conflict markers :-) ).
>
> Note that if you move the "[langorously]" line from older.txt to
> mine.txt and rerun diff3, that line will also be present in the
> output. So not only does diff3 not truly treat OLDER as a common
> ancestor (in the most literal sense), but it Does The Right Thing with
> respect to already-applied changes.
>
> Is this just an artifact of the options I chose to diff3? Am I
> missing something fundamental here?
>
> I'm all for using diff3 if it gets us something we want, such as
> optional conflict markers. But aside from that, I don't see any
> fundamental difference between using diff3 and diff+patch. Whatever
> "3-way merge" means, either both methods are doing it, or neither is.
> Please, someone tell me what I'm missing? I feel like it's something
> big, since so many intelligent people are gung-ho that we must do
> 3-way merge, but for the life of me I can't see what the difference
> is. (Perhaps an explanation with very concrete examples would help me
> understand?)

You have more context in a direct diff3. With diff and patch, patch
can fail to find the 'synchronization' points: if the working copy has
a change near the change being applied, the context lines in the patch
no longer match the working copy.
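
A small Python sketch of that failure mode, under simplifying
assumptions: `apply_hunk` is a hypothetical helper that places a hunk
by exact context match only. Real patch is more forgiving (it can
search at an offset and apply a fuzz factor), so this shows the
principle rather than patch's actual behavior.

```python
def apply_hunk(target, context_before, old, new, context_after):
    """Apply one hunk to target (a list of lines) by exact context match.

    Returns the patched list, or None if the hunk cannot be placed.
    """
    pattern = context_before + old + context_after
    n = len(pattern)
    for i in range(len(target) - n + 1):
        if target[i:i + n] == pattern:
            start = i + len(context_before)
            return target[:start] + new + target[start + len(old):]
    return None


older = ["a", "b", "c", "d", "e"]
mine = ["a", "b", "c", "D", "e"]   # working copy: "d" edited locally

# Hunk taken from OLDER -> YOURS, where YOURS changed "b" to "B".
hunk = dict(context_before=["a"], old=["b"], new=["B"],
            context_after=["c", "d"])

on_older = apply_hunk(older, **hunk)  # context matches, hunk applies
on_mine = apply_hunk(mine, **hunk)    # context line "d" is gone: None
```

The hunk is refused on the working copy even though the local edit
touches a different line. A merge that also has OLDER in hand can see
that "d" differs only because MINE changed it, and keep both changes.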

> Philip's other comments about merging (in which he describes how
> ClearCase handles repeated and conflicting merges) are not addressed
> here. Those are interesting problems that we need to solve, but I
> don't see (?) that they're related to the 3-way question, or
> non-question, as the case may be.

If you add a merge callback that takes three arguments (base file,
repos file and working copy file), we can plug in our own merge library
once that library is done (assuming we can get it to work, which I
still think we can).
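
Roughly the shape I have in mind, sketched in Python for brevity. All
names here (`MergeCallback`, `whole_file_merge`, `merge_file`) are
hypothetical and are not Subversion's API; the placeholder
implementation only decides per file, which is exactly the part a real
merge library would replace.

```python
from typing import Callable, List, Tuple

Lines = List[str]
# (base, repos, working copy) -> (merged lines, had_conflict)
MergeCallback = Callable[[Lines, Lines, Lines], Tuple[Lines, bool]]


def whole_file_merge(base: Lines, repos: Lines, wc: Lines) -> Tuple[Lines, bool]:
    """Placeholder merge that decides per file, not per region."""
    if wc == base:
        return list(repos), False  # only the repository side changed
    if repos == base or repos == wc:
        return list(wc), False     # only the wc changed, or both agree
    return list(wc), True          # both changed differently: conflict


def merge_file(base: Lines, repos: Lines, wc: Lines,
               merge: MergeCallback = whole_file_merge) -> Tuple[Lines, bool]:
    """Hand all three texts to whichever merge implementation is plugged in."""
    return merge(base, repos, wc)
```

The caller never needs to know which implementation it got, so a
diff3 wrapper could sit behind the same three-argument signature until
the library is ready.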

> -Karl

Sander

