
Re: It's time to fix Subversion Merge

From: Paul Burba <ptburba_at_gmail.com>
Date: Tue, 12 Jul 2011 13:10:43 -0400

On Mon, Jul 11, 2011 at 1:57 PM, Andy Singleton <andy_at_assembla.com> wrote:
>  I received a lot of good comments, and I will batch up my responses in this
> note.
> From Stefan, essentially "Can you improve the existing merge"?  Yes, I think
> that we can start with the existing merge code.
> However, I also think that any implementation that uses subtree mergeinfo,
> and does not have extensible mergeinfo, is doomed.  Too much effort goes into
> fixing up the subtree merge feature, and it makes the tree change problems
> insoluble.  So, we need to decisively cut off the subtree options and move
> to a bigger and more extensible data structure.  That's why I proposed
> adding a new command, "newmerge".  The existing code won't be destabilized.
> Paul notes that we need test cases. Yes, exactly.  The first step in this
> project is to make some test cases, and see how they perform with the
> existing merge, and describe what users report as the problem with these
> cases.  This will settle the debate about whether the existing merge is good
> enough.  We can classify an alternate merge implementation according to how
> many additional cases it handles correctly.  I think a test case is more
> than a patch.  It is a series of commit and merge operations.

Hi Andy,

To be clear: I mean new tests for our test suite that demonstrate the
problems you allude to (assuming we don't already have a test). I'm
not sure what you mean by "I think a test case is more than a patch.
It is a series of commit and merge operations." Could you clarify?


> Mark and C. Michael Pilato raise the most serious issue.  Subversion merge
> problems come from the core architecture and have persisted over many years.
>  A complete fix may require a more radical change. And, it is possible that
> SVN needs a bigger redesign even to meet the goals I put out today.  You
> have more experience with that than I do.  We will see.  At this point, I
> think that merge can be significantly improved for the existing server
> architecture.
> Yes, the "cyclic merge" problem is a big one, and along with the tree change
> problem, it accounts for most of the frustrating behavior of Subversion
> merge - http://subversion.tigris.org/issues/show_bug.cgi?id=2837
> I believe that cyclic merges can be handled with a bigger merge_history /
> mergeinfo file. When you do a merge, you make some edits to resolve problems.
>  Then, you commit the changes - all of the merged changesets, plus the
> edits.  You also write the instructions for resolving this merge into the
> merge_history / mergeinfo file.  The next time you go to do a merge, you can
> replay any of the changes that you need. The new merge_history will be a big
> file with a complete history.
> This won't be a simple implementation, but the inside of a merge is never
> simple.  We need to add intelligence to the merge so that it looks simple to
> the user.  This intelligence can be incrementally improved through test
> cases and the open source process.
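
For illustration only, the "bigger merge_history" idea quoted above could be
sketched as an append-only log with one record per merge. Nothing like this
exists in Subversion today; the record fields, the JSON-lines layout, and the
names MergeRecord, append_record and already_merged are all assumptions made
up for this sketch.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List, Set


@dataclass
class MergeRecord:
    """One hypothetical merge_history entry (field names are assumptions)."""
    source_branch: str
    merged_revisions: List[int]
    # Manual edits made while resolving the merge, kept so they can be replayed.
    resolutions: List[Dict[str, str]] = field(default_factory=list)
    # Free-form slot so new kinds of data can be added later ("extensible").
    extra: Dict[str, str] = field(default_factory=dict)


def append_record(history_text: str, record: MergeRecord) -> str:
    """Return the history with one more JSON line appended."""
    return history_text + json.dumps(asdict(record), sort_keys=True) + "\n"


def already_merged(history_text: str, source_branch: str) -> Set[int]:
    """Collect every revision recorded as merged from source_branch."""
    merged: Set[int] = set()
    for line in history_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry["source_branch"] == source_branch:
            merged.update(entry["merged_revisions"])
    return merged
```

A later merge would consult already_merged() to skip revisions it has seen,
and could read the stored resolutions to replay earlier edits.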
> New architecture might be required for handling moved and renamed paths.
>  This is a problem that comes up frequently in merges.  However, it also
> comes up in normal updates.  From a merge point of view, moved files should
> actually move and drag their changes with them, rather than appear as new
> files with copy+delete.
> * After we map to new files (manually, or with an algorithm) in an update or
> a merge, we should remember the change in the merge_history.  That's why we
> make the history extensible.
> * To automate this process, I think that moved files should be identified by
> filename and tree structure, not by file ID.  Yes, this is a change in the
> way that Subversion thinks, but it is clearly a problem that needs to be
> fixed.  Other SCM systems like git use an algorithm that makes a best guess
> on tree matches.  As noted by Greg, git doesn't do any other type of move
> tracking, and git merge works well.
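
A rough sketch of what a "best guess" match between deleted and added paths
might look like. This is not git's actual algorithm and not anything in
Subversion; it pairs paths by content similarity with a small bonus for a
matching file name, and the function name guess_moves, the 0.6 threshold and
the 0.1 bonus are arbitrary assumptions.

```python
import difflib
import posixpath


def guess_moves(deleted, added, threshold=0.6):
    """Pair each deleted path with its most similar added path.

    deleted and added map path -> file text. Returns {old_path: new_path}
    for pairs whose score exceeds the (arbitrary) threshold.
    """
    moves = {}
    unclaimed = dict(added)  # added paths not yet matched to a deletion
    for old_path, old_text in sorted(deleted.items()):
        best_path, best_score = None, threshold
        for new_path, new_text in unclaimed.items():
            score = difflib.SequenceMatcher(None, old_text, new_text).ratio()
            # Small bonus when the file name itself is unchanged.
            if posixpath.basename(old_path) == posixpath.basename(new_path):
                score = min(1.0, score + 0.1)
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            moves[old_path] = best_path
            del unclaimed[best_path]
    return moves
```

A merge could then apply incoming changes for old_path to new_path instead of
treating the pair as an unrelated delete and add.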
> The work noted by Stefan on truMerge is a good example of this strategy.  We
> can do the same thing - http://trumerge.open.collab.net/ . I completely
> agree with the major points in this implementation:
> 1) It uses "heuristics" to map trees together
> 2) "All merges are done at the root of the branch" and "All merges are
> complete (no merges in sparse working copies, etc.)"
> You can see that getting rid of the subtree merges is a necessary and
> probably sufficient step for fixing the tree change problems.
> Mark asks where we get the GUID/UUID for foreign merges.  It already exists,
> because we have a server UUID, as Daniel wrote:
> <repository_UUID-revision_number>.  We just need to keep track of it.
> In systems like git, if the user wants to cherrypick, the user must enter
> the complete GUID/UUID.  However, it is probably not relevant for
> Subversion.  You can only cherrypick complete commits from the source, not
> from other sources.  So, you can leave out the UUID and just specify the
> revision number.  You can get complete merge commits with this technique.
>  Unfortunately, you are not guaranteed to have access to individual commits
> that were inside the merge. Because of this, changesets inside merge commits
> will be vulnerable to "conflation": you will have to sort through cases
> where you already have some but not all of the changes that were in a merge
> commit you are merging, and you won't be able to cherrypick inside the merge
> commit.  I need to think more about this case, and whether we should track
> individual commits that were merged.  That could be an extension.
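
To make the conflation case above concrete, here is a small sketch that keys
changesets by <repository_UUID-revision_number> and classifies a merge commit
by how much of it the target already has. It assumes the individual
changesets inside a merge commit are tracked, which is exactly the open
question; ChangesetId and classify_merge_commit are hypothetical names.

```python
from typing import NamedTuple, Set


class ChangesetId(NamedTuple):
    """Changeset identity as <repository_UUID-revision_number>."""
    repo_uuid: str
    revision: int

    def __str__(self) -> str:
        return f"{self.repo_uuid}-{self.revision}"


def classify_merge_commit(contained: Set[ChangesetId],
                          already_merged: Set[ChangesetId]) -> str:
    """Classify a merge commit by its overlap with what the target has.

    "new"            none of its changesets are in the target yet
    "already-merged" every one of them is already in the target
    "partial"        some but not all are present (the conflation case)
    """
    overlap = contained & already_merged
    if not overlap:
        return "new"
    if overlap == contained:
        return "already-merged"
    return "partial"
```

Only the "partial" result needs the manual sorting-out described above; the
other two cases can be applied or skipped as a whole.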
> On 7/11/2011 12:51 PM, C. Michael Pilato wrote:
>> On 07/11/2011 11:46 AM, Andy Singleton wrote:
>>> Many developers are moving from Subversion to other SCM systems that have
>>> better merge capabilities. I have posted an article with a proposal to fix
>>> this problem, here:
>>> http://blog.assembla.com/assemblablog/tabid/12618/bid/58122/It-s-Time-to-Fix-Subversion-Merge.aspx
>> [...]
>>> I think that we can build a newmerge prototype by stripping down the
>>> existing merge to remove the subtree options, and moving to the extensible
>>> mergeinfo format. It will be useful to get advice about this from
>>> experienced team members.
>> Your optimism is lovely (and welcome, even!), but I am not as convinced as
>> you that the reason why Subversion's merge functionality is subpar is as
>> superficial as the items you call out (and which are implied by your
>> prototyping plan above).
>> Very little (if anything) about your proposal touches on the *real*
>> problems, such as Subversion's handling of moved/renamed objects, tree
>> conflict detection/handling/resolution, changeset conflation (caused by the
>> fundamental diff+patch approach Subversion takes to merges rather than
>> first-class changeset support), etc.  These real problems with merging were
>> documented many years before the merge tracking feature was ever conceived,
>> and neither that feature nor its skin-deep-only warts you aim to address
>> made a dent in solving those very real problems.
>> I don't aim to discourage -- far from it!  On the contrary, I want to
>> encourage a deeper review of the situation.  It's entirely possible that, in
>> doing so, you will find solutions for the deeper core problems here, and
>> obviously the Subversion community (devs and users alike) would love that!
>> -- C-Mike
>> [1] I'll grant that in your blog post, you at least acknowledge the tree
>> changes problem and place great stock in your extensible merge tracking
>> format toward some future solution.
> --
> Andy Singleton
> Founder/CEO, Assembla Online: http://www.assembla.com
> Phone: 781-328-2241
> Skype: andysingleton
Received on 2011-07-12 19:11:17 CEST

This is an archived mail posted to the Subversion Dev mailing list.