[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Repeated merge strategies

From: Nathanael Nerode <neroden_at_twcny.rr.com>
Date: 2002-12-23 09:04:32 CET

Greg Hudson wrote:
> Okay, with some help from Bill Tutt, I think have deciphered the
> nature of the conflict between Branko and Nathanael Nerode on repeated
> merge strategies. We don't necessarily have to solve this issue
> before 1.0, but I thought it might be nice to describe what's going in
> as concise a matter as I can manage.
Thank you for your calm, clear, voice of reason, and sorry about being
irritable previously. :-)

> Here are the two general schemes:
> The Most Recent Ancestor approach (MRA) [advocated by Nathanael]
> ---------------------------------------
> In this scheme, you record an optional set of merge sources in each
> node-revision. When asked to do a merge with only one source (that
> is, just "svn merge URL", with no second argument), you compute the
> most recent ancestor and do a three-way merge between the common
> ancestor, the given URL, and the WC.
> To compute the most recent ancestor, you chain off the immediate
> predecessors of each node-revision. The immediate predecessors are
> the direct predecessor (the most recent node-revision within the node)
> and the merge sources. I guess the most efficient algorithm is an
> interleaved breadth-first search.
> The Ancestry Set approach (AS) [advocated by Branko]
> ------------------------------
> This is nicely documented in
> http://svn.collab.net/repos/svn/trunk/doc/programmer/design/model.texi
> under "Merging and Ancestry", but I'll give a capsule summary.
> In this scheme, you record the full ancestry set for each
> node-revision--that is, the set of all changes which are accounted for
> in that node-revision. (How you store this ancestry set is
> unimportant; the point is, you need a reasonably efficient way of
This is the part which I hypothesized would require serious
architectural work on Subversion.

> determining it when asked.) If you are asked to "svn merge URL", you
> apply the changes present in URL's ancestry but absent in WC's
> ancestry. Note that this is not a single three-way merge; you may
> have to apply a large number of disjoint changes to the WC.
> Comparisons and arguments
> -------------------------
> AS allows you to merge changes from a branch out of order, without
> doing any bookkeeping. MRA requires you to merge changes from a
> branch in order.
> We'll need to consider what other version control systems do. I'm
> told AS is similar to how Bitkeeper and Arch work, but I don't know
> how true that is. If everyone does AS and we do MRA, we'll stand out
> as needlessly different.
This is, of course, a very good argument for the AS method.

> Branko argued that MRA would not reasonably handle the case of a
> commit which reverts a previous change. I guess he was assuming that
> in the future Subversion will have some mechanism for recording that
> fact; right now a commit which reverts a previous change looks exactly
> the same as an original delta that happens to restore the previous
> state. To me the situation seems reversed; MRA (with no augmentation
> for recording reversions) would attempt to apply a reversion delta,
> while AS (augmented to record reversions by removing the reverted
> delta from the ancestry set) would seem to never apply a reversion.
> (For example, if I create a branch at rev 200, and on the branch I
> revert rev 100, and then I merge that branch into the trunk, the AS
> merge algorithm would seem to do nothing, since the trunk's ancestor
> set is a superset of the branch's ancestor set.)
> MRA may be simpler to implement, since it results in a single
> three-way diff.
This was my primary argument in favor of it.

> MRA may be easier for users to understand, even though AS is probably
> simpler to a mathematician. I have no experience with either system,
> so I'm not sure.
> MRA may break down faster if the merging topology becomes
> non-hierarchical, but I'm having trouble imagining a reasonable use
> case for a non-hierarchical merging topology.
I'm sure there is one, but I seriously doubt it's at all common.

I want to add one comment: the schemes are not actually exclusive. If
one is implemented as the default for 'svn merge', it is quite feasible
to implement the other as 'svn alternate-merge' at some later point, or
to relegate the original one to 'alternate-merge' and make the new one
the default 'merge'.

> An Open Question (applying to both schemes)
> ----------------
> If a user asks to merge a directory, should we apply MRA or AS to each
> subdirectory and file to determine what ancestor(s) to use? Or should
> we apply MRA or AS just once, to the directory itself? The latter
> approach seems simpler and more efficient, but will break down quickly
> if the user wants to merge in subdirectories of a branch in advance of
> merging in the whole thing.

The latter approach is what I was thinking of. With MRA I thought there
might be gotchas in the first approach, with MRAs turning out to be from
different revisions for different files... though the more I think about
it, the more it seems like that actually doesn't cause any problems.

Well, whatever y'all decide, thanks for listening. :-)

To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Dec 23 09:06:24 2002

This is an archived mail posted to the Subversion Dev mailing list.