
Re: [RFC] An element-based 'svn merge'

From: Stefan Fuhrmann <eqfox_at_web.de>
Date: Thu, 17 Dec 2015 13:13:58 +0100

On 17.12.2015 11:10, Julian Foad wrote:
> For the first practical application of element-based move tracking and
> merging, I propose to write an element-based alternative to 'svn
> merge'. This version of 'svn merge':
>
> * matches tree elements by path, like today...
> * ... except where the user specifies a different matching (aka a 'move')
> * uses the element-based merge algorithm (which handles moves)

That part is somewhat unclear to me. Possible interpretations include:

* New behaviour must be enabled by e.g. a "--move" option.
* New behaviour only kicks in when it encounters a "move" operation.
* Always operates based on element IDs but the effect is only
   visible with moves.

> This is like the implementation in 'svnmover' except it operates on a
> real repository and WC, and has to be given the matching data
> manually.
>
> Most of the required pieces are in place now, in svnmover.

Excellent! Good to see all the hard work you put into move tracking
over the years come to fruition.

> The benefits are:
>
> * a merge involving moves will Just Work, without the moves causing
> any conflicts to be reported.

Yay! Do you think it is realistic to complete this within (less than)
a year in time for 1.10?

> The initial implementation will work like this, as a minimum, with the
> following limitations:
>
> * input: a requested merge from known, single-revision source trees,
> either discovered by 'automatic merge' or specified by user
> * input: a set of user-specified element matchings, each of the form
> (src-left-path, src-right-path, target-path)
> * assign EIDs (in memory) to the source-left, source-right and target trees
> * perform a tree merge, creating a temporary result (in memory)
> * check for (new style) conflicts in the result; if any, bail out
> * convert the result to a series of WC edits and apply those to the WC
> * forget all the EID metadata
>
> Work needed:
>
> 1. A UI and data structure to input the user-specified matchings.
> (Straightforward.)
>
> 2. A routine that assigns EIDs to the three trees based on the
> user-specified matchings and path matching. (Straightforward.)

A while ago I was thinking about how to store the EID assignments
in FSFS and FSX. Adding (root, branchID, EID) -> (noderev, parent)
and (root, path) -> EID mappings should be easy enough. My guess
is that it would take 1 week to write the code + another one to get it
functional. And then X amount of time to test & stabilize.

Is repo-side support for branches and EIDs even useful at this point?
Do we know enough of what their semantics would be? If so, I'd be
happy to post a design sketch.
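
To make the two mappings a bit more concrete, here is a minimal in-memory
sketch in Python. It only illustrates the lookup directions; it says nothing
about how FSFS or FSX would store them on disk. The names EidIndex and
ElementLocation, the use of plain strings for root, branch ID and noderev,
and the sample values are all assumptions made up for illustration.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class ElementLocation(NamedTuple):
    noderev: str               # hypothetical node-revision identifier
    parent_eid: Optional[int]  # None for the root element of a branch


class EidIndex:
    """Toy index holding both mappings side by side (illustration only)."""

    def __init__(self) -> None:
        # (root, branchID, EID) -> (noderev, parent)
        self._by_element: Dict[Tuple[str, str, int], ElementLocation] = {}
        # (root, path) -> EID
        self._by_path: Dict[Tuple[str, str], int] = {}

    def add(self, root: str, branch_id: str, eid: int, path: str,
            noderev: str, parent_eid: Optional[int]) -> None:
        # Record one element under both keys so either lookup works.
        self._by_element[(root, branch_id, eid)] = ElementLocation(noderev, parent_eid)
        self._by_path[(root, path)] = eid

    def lookup_element(self, root: str, branch_id: str,
                       eid: int) -> Optional[ElementLocation]:
        return self._by_element.get((root, branch_id, eid))

    def lookup_eid(self, root: str, path: str) -> Optional[int]:
        return self._by_path.get((root, path))
```

A caller would resolve a path to an EID first and then use the EID to find
the noderev and parent, which is the only access pattern this sketch covers.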

> 3. Fetching the contents of the source-side trees -- easiest would be
> to fetch each tree separately in its entirety. A big performance
> improvement would be to fetch src_left and diff(src_left:src_right),
> and construct src_right from those. Better still, also construct
> src_left from target and diff(target, src_left).
>
> 4. Conversion of the element-based result to a series of WC edits. The
> code in branch_compat.c doesn't quite do this, as it assumes an Ev1
> output (with only a 'copy' operation) whereas the WC API has a 'move'
> operation that we probably need to use. In general it will need to
> insert temporary moves e.g. to swap X and Y it may need to start by
> moving X to temporary name Z. Unless the WC API moves can also be set
> up using just 'copy' operations, in which case the approach in
> branch_compat.c is on the right track although it is currently buggy.
>
> Items 1 and 2 are straightforward, and 3 can also be straightforward
> for an initial, crude implementation, and 4 is probably the hardest
> part.

I agree, 4 sounds messy - just because edit state is inherently
complex. OTOH, once you know where each element is and where it
must end up, rebuilding the tree should not be too hard unless you
try to be clever:

     for-each path recursively do:
         if final-eid(path) != current-eid(path)
             if (current-exists(path))
                 move-to-temp(current-contents(path))
             if (final-exists(path))
                 if (final-is-copy(path))
                     copy-to-path(final-contents(path))
                 else
                     move-to-path(final-contents(path))
             update-locations-recursively(path)
     clear-temp

So, you would always know where a specific piece of content is and
never overwrite or delete anything until the very end. Implementation
details not shown are whether you use trees or maps, how many
of them, and whether to key them by path or EID.

I'm not even sure that this should be mapped onto the current
working copy editor. If the whole thing is meant to be transactional,
then the following sequence is safer and adds less wc.db overhead:

     for-each path recursively do:
         if appears-somewhere-else(path)
             move-to-temp(path)

     for-each path recursively do:
         if to-be-deleted(path)
             if missing(path)
                 schedule-metadata-removal(path)
             else
                 schedule-deletion(path)
         else
             if missing(path)
                 schedule-install(path)

     run-workqueue
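
One way to read the two passes above, again as a flat path -> EID sketch in
Python rather than anything resembling the real wc.db work queue: the first
pass parks every element whose final path differs from its current one, the
second pass only schedules work items, and nothing is executed until the
queue is run. The function name plan_transactional, the item names and the
"__temp/eid-N" naming are assumptions. The last branch (a path whose element
is replaced by a different one) is not in the pseudocode; treating it as
delete + install is my guess.

```python
from typing import Dict, List, Tuple


def plan_transactional(current: Dict[str, int],
                       final: Dict[str, int]) -> Tuple[List[Tuple[str, str]],
                                                       List[Tuple]]:
    """Return (temp_moves, work_queue) for turning `current` into `final`."""
    final_path_of = {eid: path for path, eid in final.items()}
    on_disk = dict(current)

    # Pass 1: anything that appears somewhere else in the final tree
    # is moved out of the way first.
    temp_moves: List[Tuple[str, str]] = []
    for path in sorted(current):
        eid = current[path]
        target = final_path_of.get(eid)
        if target is not None and target != path:
            temp_moves.append((path, f"__temp/eid-{eid}"))
            del on_disk[path]

    # Pass 2: schedule work items only; nothing is executed here.
    queue: List[Tuple] = []
    for path in sorted(set(current) | set(final)):
        want = final.get(path)      # EID that must end up at this path, if any
        have = on_disk.get(path)    # EID still present there after pass 1
        if want is None:            # to-be-deleted(path)
            if have is None:        # missing(path): content was parked in pass 1
                queue.append(("remove-metadata", path))
            else:
                queue.append(("delete", path))
        elif have is None:          # missing(path)
            queue.append(("install", path, want))
        elif have != want:          # replacement: not covered by the pseudocode
            queue.append(("delete", path))
            queue.append(("install", path, want))
    return temp_moves, queue
```

For the X/Y swap this gives two temp moves followed by two installs, so the
queue itself never has to reason about ordering between the two paths.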

My 2¢; I'm not a working copy expert.

-- Stefan^2.
Received on 2015-12-17 13:13:13 CET

This is an archived mail posted to the Subversion Dev mailing list.
