Re: Symmetric Merge -- Algorithm

From: Cristian Amarie <cristian.amarie_at_gmail.com>
Date: Fri, 20 Apr 2012 00:10:43 +0300

I've been splitting between been a svn baby/trying to swim into svn
code and reading the wiki plus mails regarding symmetric merge.
(Please be patient if I'll bump into some inherent beginner
stupidities, I'm still into the crash course phase).

> 1. Find the latest rev of A synced to B and of B synced to A.
> 2. Choose a base.
For me 2 this is an important point regarding obtaining of the same
result.

Having
/ A1 ... Ax ... An-1 -> An (merge B into A produces An)
O
\ B1 ... By ... Bm-1 -> Bm (merge A into B produces Bm)
(Ab and Bb BASE candidates)

we get the possible paths of walking Ax->An and By->Bm.

Is it feasible to have the condition An == Bm as predicate and select
the base candidates which verifies that there is a path (at least one?)
for each of Ax>An and By>Bm and the result is the same?

Now I suppose this cannot be know in advance, but perhaps we can
backtrack if a pair does not satisfy? (Say for a node Ax we have a set
of B revisions synced to A, By being the latest, but if Ax By cannot
get An == Bm, maybe can be picked another pair (not both the latest
revA synced to B and the other one). Again, I'm not seeing the big
picture here - judging only from the "symmetric" point of view.

> 3. Identify cherry-picks.
...
> [TODO] In what cases do the selection methods give different
> results? Only after a criss-cross merge?
If we have candidate BASE being A1 and B1, and criss-cross A2>B3
and B2>A3

/ A1 ... A2 A3
O
\ B1 ... B2 B3

maybe the new BASE candidates can be A3 and B3 and start the
algorithm over from here?
I mean, A3 is the latest revA synced to B and B3 latest revB synced
to A, right ?

- Cristian

On Mon, Apr 16, 2012 at 7:25 PM, Julian Foad <julian.foad_at_wandisco.com> wrote:
> I have written out how I think a large part of the symmetric merge
> algorithm should go, in more detail. Review and other feedback would
> be welcome.
>
> We need to write out the algorithm in this level of detail and a
> little more detail, and then write several new functions for
> implementing parts of it. I had until recently been thinking that I
> would be able to re-work most of the existing functions to accomodate
> what we need, and I have made some improvements along the way, but
> that is proving too difficult. (I started making some notes in
> <http://wiki.apache.org/subversion/MergeCodeImprovements> about things
> that I think we should be looking to change in the existing merge
> code.)
>
> At this point I haven't included processing of subtree mergeinfo in
> the algorithm description, though I have that in mind too.
>
> I hope your mail viewers preserve indentation and line breaks, as I've
> used indentation (space characters only) for formatting.
>
>
> Symmetric Merge Algorithm
>
> 1. Find the latest rev of A synced to B and of B synced to A.
> 2. Choose a base.
> 3. Identify cherry-picks.
> 4. Break into 3-way merges, skipping the cherry-picks.
> 5. Perform the 3-way merges and mergeinfo addition.
>
>
> In more detail.
>
> 1. Find the latest revision of A synced to B and the latest revision of
> B synced to A.
>
> This means, for the A to B direction, the youngest revision 'rN'
> on A at which all revisions of A, up to and including rN, are now
> recorded as having been merged into B. And "now recorded" means
> recorded in the version of B that is being used as the merge
> target. Usually the result corresponds to the most recent merge
> into B from A, but not necessarily, because later revisions from
> A might previously have been cherry-picked into B.
>
> We consider only direct A <-> B mergeinfo. (Consideration of a
> merge graph extending to other branches is future work. Since we
> record mergeinfo transitively, considering only direct A<->B
> mergeinfo already suffices for some 3-branch scenarios too, but
> not all such scenarios.)
>
> In:
> A_tip: pathrev
> B_wc: wc-abspath
> (Note: B is a WC "actual" tree including any local mods.)
>
> Out:
> A_base: pathrev
> B_base: pathrev
> (Note: B_base can't possibly be B if B has local mods.)
>
> Implementation:
> Currently:
> find_base_on_source()
> find_base_on_target()
> TODO: fill in the details here and re-visit the implementation.
>
>
> 2. Choose a base.
>
> Choose the latest revision of A synced to B or the latest revision
> of B synced to A.
>
> Each candidate base is a common ancestor of the source and target,
> if ancestry means following either the direct line of descent or
> the arrows that indicate complete merges.
>
> Choose the 'more recent' one in some sense -- be it time of commit,
> or number of changes since then, or some other measure.
>
> [TODO] In what cases do the selection methods give different
> results? Only after a criss-cross merge?
>
> In:
> A_base: pathrev
> B_base: pathrev
>
> Out:
> base: pathrev
>
> Implementation:
> Currently, in find_symmetric_merge():
> (A_base->rev > B_base->rev) ? A_base : B_base
>
>
> 3. Identify cherry-picks.
>
> Find changes along the source side of the merge that are already
> 'in' the target.
>
> Look for merges in both directions (from current source to current
> target ("forward") and from current target to current source
> ("backward")). For each merge:
> Break down the merge into its component logical changes.
> If the merge consists entirely of logical changes that are not
> in the target, or consists entirely of logical changes that
> are in the target, treat it as a unit -- a cherry pick.
> Otherwise, fall back to the user: report to the user which
> logical changes that merge includes, and suggest that the user
> run more specific merges that pull only the subset of logical
> changes they want.
>
> 3a. Create an (implicit or explicit) list of indivisible changes
> along each side of the merge, since the BASE. For example:
>
> Source side Target side
> B_at_10:A_at_15 B_at_10:B_at_11 (B_at_10 is BASE)
> A_at_15:A_at_20 B_at_11:B_at_12
> ... ...
> A_at_26:A_at_27 B_at_29:B_at_30
> B_at_30:B_wc (B_at_30 is B_wc's origin)
>
> In:
> base
> A_tip
> B_wc
>
> Out:
> A_changes: a list of (URL1_at_REV1 : URL2_at_REV2) pairs?
> Or, more compactly, BASE plus a list of REVs
> that index into the branch history?
> B_changes: the same
>
> Note: The first change in one of the lists might be a cross-branch
> change (from 'BASE' on the target branch to 'MID' on the source
> branch), whereas all subsequent changes are changes along a
> branch.
>
> Note: The B_changes output needs to be able to represent B_wc as
> its end point, implicitly or explicitly.
>
> Note: These lists need to reference changes in such a way that we
> can fetch and examine the changes, at least in terms of their
> mergeinfo and whether they're operative.
>
> TODO: Should the outputs identify each individual change (revision)
> as operative or inoperative, or is it acceptable to output
> revisions (or ranges of revs) that are not all known to be
> operative?
>
> 3b. Discover the changes from A that have been merged to B since BASE.
>
> In:
> A_changes
> B_changes
>
> Out:
> A_changes_to_skip: a subset of A_changes
> A_changes_to_warn: a subset of A_changes
>
> Implementation phase 1:
>
> This detects direct cherry-picks in the same direction as the
> present merge, which is all that Subversion 1.7 supports.
>
> Fetch mergeinfo on B_wc; note any changes from A that are recorded
> as merged to B (since BASE, and directly) as "already on B".
>
> Implementation phase 2:
>
> This adds detection of cherry-picks in the opposite direction,
> which is new functionality.
>
> For each change in A_changes:
> Fetch the mergeinfo change.
> If the mergeinfo did change (it represents a merge into A):
> If that merge was a direct merge from B, and from nowhere
> else, add this A change onto A_changes_to_skip. If that is,
> instead, a merge from both B and other sources, then add this
> A change onto A_changes_to_warn.
>
> Add this to the phase 1 results for A->B.
>
> Note: If the first change is a BASE:MID cross-branch change, it
> must first be turned into the equivalent series of source-side
> changes, which is a non-trivial exercise.
>
> Implementation phase 3:
>
> This detects both direct and indirect (that is, via a third
> branch) cherry-pick merges, in both directions. This is more
> complex, and a simple implementation of it may run slowly in
> some scenarios.
>
> For each change in A_changes:
> Fetch the mergeinfo change.
> If the mergeinfo did change (it represents a merge into A),
> break down that merge into its component logical changes,
> else take this change on A as the logical change to test.
> See if those logical changes are already on B.
> If all of those logical changes are on B, add this A change
> onto A_changes_to_skip; if some of them are on B, add this
> A change onto A_changes_to_warn.
>
>
> 4. Break into 3-way merges, skipping the cherry-picks.
>
> Implementation:
>
> # Define a merge as (base, tip, target).
>
> merges = [(BASE, A_tip, B_wc)]
> for change in A_changes_to_skip:
> m = find change.rev in merges
> replace [m] with [
> (m.base, change-1, m.target),
> (change, m.tip, m.target) ]
> # elide either or both of those if they are no-ops
>
>
> 5. Perform the 3-way merges and mergeinfo addition.
>
> Implementation:
>
> for m in merges:
> three_way_merge(m.base, m.tip, m.target)
> if m.base is on A:
> do a normal mergeinfo addition
> else:
> mergeinfo += (all source revs from YCA(?) up to m.tip)
>
> - Julian
Received on 2012-04-19 23:11:18 CEST

This message: [ Message body ]
Next message: Julian Foad: "Re: Symmetric Merge -- switched subtrees [was: no local mods or switched subtrees by default]"
Previous message: Philip Martin: "Re: svn commit: r1327979 - in /subversion/trunk/subversion: include/private/svn_repos_private.h libsvn_fs/fs-loader.c libsvn_repos/fs-wrap.c mod_dav_svn/repos.c"
In reply to: Julian Foad: "Symmetric Merge -- Algorithm"
Next in thread: Julian Foad: "Re: Symmetric Merge -- Algorithm"
Reply: Julian Foad: "Re: Symmetric Merge -- Algorithm"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]