
Re: Fixing merge - Subtree, Cyclic, and Tree Change cases

From: Folker Schamel <schamel23_at_spinor.com>
Date: Mon, 08 Aug 2011 14:34:20 +0200

> I do not think that the reverse merge strategy is the same as doing
> multiple forward merges, and I do not think it covers enough cases. For
> example, in your original example, you proposed merging into branch
> Feature, where
> Feature = F+(R+S+RSFix)
> You proposed "unmerging" (or even reverting) to get F, then merging in
> all of the potentially redundant changes.
>
> Let's consider the case where we make a new change F2, so
> Feature = F+(R+S+RSFix)+F2
>
> Now, F2 is an edit on one of files or code blocks that is added or
> changed in R or S. This is a high-probability case, because people are
> most likely to change new code. In this case, there is no way to reverse
> merge and still keep F2. However, we can forward merge T and RSTFix, and
> still keep F2.

We don't have to keep F2 in place during the reverse merge. For example,
in your case we can also reverse merge (R+S+RSFix)+F2, then merge the
whole trunk, then merge F2 again.

(Indeed, in general, your example seems to be a good reason
not to use a minimal reverse merge, as I originally had in mind,
but instead to always reverse merge the whole contiguous block backwards
as far as needed, including F2, then merge the other line, and then merge
back in the changes, like F2, that were reverse-merged unnecessarily.)
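As a rough sketch of that ordering (a toy model only, not Subversion code:
the function name plan_remerge and the representation of a branch as a
plain list of change names are my assumptions for illustration), the plan
could be computed like this:

```python
def plan_remerge(feature_history, trunk_history):
    """Plan un-merge/merge steps for merging trunk into a feature branch.

    Both histories are lists of change names in the order they were
    applied. Hypothetical helper for illustration only.
    """
    trunk_changes = set(trunk_history)

    # Position of the earliest change on the feature branch that came from trunk.
    first = next(
        (i for i, change in enumerate(feature_history) if change in trunk_changes),
        None,
    )
    if first is None:
        # Nothing was merged before, so a plain forward merge is enough.
        return [("merge", change) for change in trunk_history]

    # The whole contiguous block from that point on, including local
    # changes such as F2 that were made on top of the merged ones.
    block = feature_history[first:]

    steps = [("reverse-merge", change) for change in reversed(block)]
    steps += [("merge", change) for change in trunk_history]
    # Merge back in the local changes that were reverse-merged along the way.
    steps += [("merge", change) for change in block if change not in trunk_changes]
    return steps


feature = ["F", "R", "S", "RSFix", "F2"]
trunk = ["R", "S", "RSFix", "T", "RSTFix"]
plan = plan_remerge(feature, trunk)
```

For the example above, the plan reverse merges F2, RSFix, S and R (in that
order), merges the whole trunk, and finally merges F2 again. Whether this
particular order minimizes conflicts in practice is exactly the open
question.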

Certainly there are many variants of which revisions to un-merge and
merge in which order. They all give the same result if there are no
conflicts, but they may differ significantly in the probability of
creating conflicts.
But my main point is:
Clean merging is possible using un-merges.
And, by choosing a good merge order, I still think it can be done with
the same non-conflict quality as when storing some fixes separately.
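To make this concrete, here is a minimal toy model of the cyclic case
quoted further down (a change on A, a change on B, and B's adapted merge of
A's change). Everything in it is an assumption for illustration: a branch
is a dict, a change replaces one value and conflicts if the old value is
not what it expects, and the names Change, Conflict, apply_change and
apply_all are made up. It is not how Subversion represents changesets.

```python
from dataclasses import dataclass


class Conflict(Exception):
    """Raised when a change does not apply cleanly."""


@dataclass(frozen=True)
class Change:
    key: str  # which item (think: a code block) the change touches
    old: str  # content the change expects to find
    new: str  # content the change leaves behind

    def inverse(self):
        """The reverse merge of this change."""
        return Change(self.key, self.new, self.old)


def apply_change(tree, change):
    """Return a new tree with the change applied, or raise Conflict."""
    if tree.get(change.key) != change.old:
        raise Conflict(f"{change.key!r}: expected {change.old!r}, found {tree.get(change.key)!r}")
    result = dict(tree)
    result[change.key] = change.new
    return result


def apply_all(tree, changes):
    """Apply the changes in order."""
    for change in changes:
        tree = apply_change(tree, change)
    return tree


base = {"greeting": "hello", "name": "x"}
c100 = Change("greeting", "hello", "hello world")       # made on A
c101 = Change("name", "x", "y")                         # made on B
c102 = Change("greeting", "hello", "hello world of y")  # c100 merged into B, adapted to c101

branch_a = apply_all(base, [c100])
branch_b = apply_all(base, [c101, c102])

# Un-merge c100 on A first, then merge c101 and c102 from B.
merged_a = apply_all(branch_a, [c100.inverse(), c101, c102])
```

In this model, applying c101 and c102 directly to branch_a raises Conflict
on c102, and skipping c102 as "already merged" would lose the adaption.
Reverse merging c100 first lets both changes apply cleanly, and merged_a
ends up identical to branch_b.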

The reason why I'm not really convinced by this idea of storing fixes is:
What exactly is RSTFix? How is it stored? Can the user display it
somehow? How can we forward merge just T and then RSTFix, if merging
T already causes a conflict, which was the reason for RSTFix?
I guess you already thought about all this and found solutions.
It just sounds complex to me: it introduces a new elementary concept of
fixes which must be calculated, stored, managed, and applied.
That basically means implementing a new logic of changes and merging
in parallel to the existing merging logic in Subversion.
And I think this is not necessary.

>
> On 7/19/2011 3:47 PM, Folker Schamel wrote:
>>> On 7/18/2011 4:37 PM, Folker Schamel wrote:
>>>> Hi Andy,
>>>>
>>>> two thoughts about cyclic merges:
>>>>
>>>> 1. Merging should not skip cyclic merges (like this old
>>>> svnmerge tool), but must subtract (reverse-merge) the original
>>>> change first, and then add (merge) the cyclic merge, in order
>>>> to not lose adaptions of changes.
>>> I made a different proposal to solve the same problem. Following your
>>> example, let's say we are merging
>>> Trunk = (R+S+RSFix) + (T + RSTfix)
>>> --- where RSTFix is changes to resolve a merge conflict
>>> into Feature = (R+S+RSFix) + F
>>>
>>> In your proposal, you "unmerge" (R+S+RSFix) to get F. Then, having
>>> separated the stuff that is duplicate from the stuff that is new, you
>>> can do the "one big diff" style merge from Trunk.
>>>
>>> In my proposal, we save RSTfix in our expandable merge_history file, and
>>> then we can in many cases apply T and RSTFix separately, without any
>>> duplicates.
>>>
>>> Do you think that might be easier?
>>
>> In the end, your proposal also requires a reverse-merge to calculate
>> RSTfix. So the difference is basically whether to calculate RSTfix
>> implicitly on the fly when needed, or in advance and store it.
>> Which one is easier and/or faster - good question.
>>
>> The idea behind the on-the-fly reverse-merge approach is
>> a) to operate purely on existing revisions (no need to store changes
>> like RSFix separately), and
>> b) (at least in theory) a simple merge algorithm, which basically
>> just says: "Merge everything over, but reverse-merge existing old
>> changesets before", solving this RSTfix adaption issue on the fly
>> automatically implicitly in a robust way, without having to deal
>> with adaptions like RSTfix explicitly (at least in theory).
>> See http://svn.haxx.se/dev/archive-2007-12/0137.shtml
>> (Note that this algorithm assumes "correct" merge info,
>> not the current subversion merge info.)
>>
>> Cheers,
>> Folker
>>
>>>> For example, suppose you have two branches A and B.
>>>> c100 is a change in A.
>>>> c101 is a change in B.
>>>> B merges c100 into B (maybe with or without conflict),
>>>> but has to adapt this change to make it compatible with c101,
>>>> resulting in c102.
>>>> Now, A merges all changes from B to A.
>>>> Then just merging c101 would lose the adaptions made in c102.
>>>> So the correct behavior is to subtract c100 and then add c101 and c102.
>>>> Note that if the changesets are not overlapping, the order
>>>> of the reverse-merges and merges does not matter.
>>>> But if the changesets are overlapping, then the correct order of
>>>> reverse-merges and merges can matter.
>>>>
>>>> 2. Supporting cyclic merges correctly requires that merge-info
>>>> only records the direct merge info without carrying over
>>>> existing merge info.
>>>>
>>>> See for example
>>>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=948427
>>>>
>>>>
>>> I completely agree that "svn newmerge" should be able to handle the case
>>> that you posted in that message.
>>>
>>> I also agree that some of the problems in this merge case come from the
>>> inadequate data in mergeinfo. As you point out, we can read mergeinfo
>>> carefully and still not even know the common ancestor of two branches
>>> being merged. If you know the common ancestor - which is often not that
>>> far back in this workflow - you can ignore everything before that point.
>>> Why isn't there a record dropped in with every merge that says "We
>>> merged X (server + branch + revision) and (all of this other merge
>>> history from there) at time Y"? This would get dragged into the next
>>> branch it gets merged with. You could read back through the merge
>>> tree/graph to find common ancestors. We could be saving those records in
>>> a new and expanded merge_history.
>>>
>>>>
>>>> Cheers,
>>>> Folker
>>>>
>>>>> To start the discussion, I will refer to this blog article by Mark
>>>>> Phippard:
>>>>>
>>>>> http://blogs.collab.net/subversion/2008/07/subversion-merg/
>>>>>
>>>>> I found the article to be a good overview of the issues. I think that we
>>>>> need help from Mark. On the other hand, I have seen that Mark sometimes
>>>>> makes discouraging comments. My work is apparently “hand wavey” and
>>>>> “proprietary”. I’m used to this treatment because I have 25 developers
>>>>> who work for me who often think that I am full of crap. However, it might
>>>>> have a discouraging effect on other contributors. For example, you can
>>>>> see in this great ticket thread -
>>>>> http://subversion.tigris.org/issues/show_bug.cgi?id=2897 - he states “I
>>>>> do not think it is possible in this design.... I think we need to accept
>>>>> the limitations of the current design and work towards doing the best we
>>>>> can within that design.” Apparently that was enough to kill progress. I
>>>>> think we should keep a more open mind going forward.
>>>>>
>>>>> I’m going to make some claims that some problems have “straightforward”
>>>>> solutions. That doesn’t mean they are simple solutions. Handling all of
>>>>> the merge cases is going to be hard. However, they are straightforward in
>>>>> the sense that we can discuss the strategy at the high level used in
>>>>> Mark’s article.
>>>>>
>>>>> Let’s consider three issues: subtree mergeinfo, cyclic merge, and tree
>>>>> change operations.
>>>>>
>>>>> SUBTREE MERGEINFO
>>>>>
>>>>> Mark notes that reintegrate does not work if you have subtree mergeinfo.
>>>>> The subtrees potentially make the top-level mergeinfo inaccurate. So,
>>>>> basically everyone that has looked at merge problems in the past four
>>>>> years, including Mark, has tried to get rid of subtree mergeinfo. It’s
>>>>> amazing that Subversion still tries to support this feature. It can’t be
>>>>> supported in NewMerge.
>>>>>
>>>>> In the following sections, we will also see that the mergeinfo data is
>>>>> too sparse, and we need to replace it with something bigger and more
>>>>> extensible.
>>>>>
>>>>> CYCLIC MERGE
>>>>>
>>>>> The case where we merge back and forth between a development or
>>>>> deployment branch, and trunk, is the base case for merge. It should be
>>>>> supported. Subversion only supports it with special instructions. This is
>>>>> the “cyclic merge” problem.
>>>>>
>>>>> It seems that we have two basic ways to do a merge. We can grab all of
>>>>> the changes that we are trying to merge in one big diff between the
>>>>> branch we are merging from and the branch we are merging into - the
>>>>> reintegrate merge as described in Mark’s article. Or, we can sequentially
>>>>> apply or “replay” each of the changes that we want to merge into our
>>>>> working copy - the “recursive” strategy that is the default for git.
>>>>>
>>>>> It seems to me that the “one big diff” and the replay strategy are
>>>>> closely related. When you are replaying, you grab all of the changes in
>>>>> any sequence of revisions that doesn’t include a merge as one big
>>>>> diff. So, the current “one big diff” strategy is a special case of the
>>>>> replay strategy that applies when there are no intermediate merges from
>>>>> other branches or cherrypicks.
>>>>>
>>>>> But wait! According to this article, we can’t use the replay strategy
>>>>> because we are missing part of the replay. We lose information that was
>>>>> used to resolve a merge when composing merge commits. If we had that
>>>>> information, we could replay individual merges, and handle a higher
>>>>> percentage of the cyclic merge cases.
>>>>>
>>>>> This problem seems to have a straightforward solution. When we commit the
>>>>> merge, we can stuff the changeset that represents the difference between
>>>>> the merge, and the commit, into the merge_history. We just need an
>>>>> extensible merge_history format to hold it.
>>>>>
>>>>> It’s totally not clear to me why you need to say “reintegrate” when you
>>>>> merge to trunk, and why you need to update the branch after you do a
>>>>> reintegrate merge from it. The computer should be able to remember the
>>>>> history of merges and it should be obvious which things have been merged
>>>>> and which revisions have been committed on both branches. The only reason
>>>>> that I can think of is that the mergeinfo is so sparse that the
>>>>> computer doesn’t remember enough about the merge history. Would a bigger
>>>>> and more extensible data format give us a straightforward way to solve
>>>>> that problem?
>>>>>
>>>>> TREE CHANGE
>>>>>
>>>>> We can identify tree changes by pattern matching. This is the same tactic
>>>>> that git uses, without any other tree change tracking. We can identify
>>>>> when this match is successful because the match is applied, examined by
>>>>> the merger, and then the merge is committed. In this case we could write
>>>>> the tree map into the merge_history so that we can map changes
>>>>> bi-directionally during future merges without guessing again. This is
>>>>> another case of saving information that we need to replay a merge.
>>>>>
>>>>> I think we could get a similar effect by generating a move operation
>>>>> (normal copy & delete form) as part of the merge. I think that this
>>>>> mapping would need to be done by updates as well as by explicit merges.
>>>>>
>>>>>
>>>>> EXPERTISE
>>>>> Who on this list knows enough about the core algorithm used in merge to
>>>>> critique these suggestions and point to places in the code or
>>>>> documentation?
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
Received on 2011-08-08 14:34:58 CEST

This is an archived mail posted to the Subversion Dev mailing list.
