Re: Ev2 as a move-aware editor

From: Branko ÄŒibej <brane_at_wandisco.com>
Date: Mon, 24 Jun 2013 17:52:52 +0200

On 24.06.2013 17:22, Julian Foad wrote:
> For move tracking we need a move-aware editor. svn_editor_t ("Ev2") is designed to support moves. However, I'm not clear how its move support is meant to work, on a fairly basic level:
>
> * What are the ordering rules for moves?
>
> * Are the move operations to be interprested in series or in parallel?
>
> * When an operation refers to a node that is involved in a move, how are the paths to be interpreted?

I find that answeres to all of these questions can be inferred from the
Once Rule. The way I interpret that, there's no restriction on the order
of any operation, as long as the Once Rule is obeyed. I have not been
able to come up with a case yet where the available operations would not
cover all possible tree mutations while still adhering to that rule.

That IMO makes the question of serial vs. parallel immaterial.

> == Cyclic moves ==
>
> We need to accommodate arbitrary sets of moves, including cycles. For an example, we'll use a simple cycle, swapping two siblings A and B. This ability is needed for two reasons:
>
> * Ev1 supports this (del A, copy A from B_at_OLD, del B, copy B from A_at_OLD).

This is a cyclic copy, not a cyclic move. EV2 has the rotate operation
that's intended to cover the effect of

    $ svn rename A tmp;
    $ svn rename B A
    $ svn rename tmp B

Note that the above is the sequence of operations in the working copy,
but the end result, as far as tree mutations are concerned, is rotate(A,
B) -- regardless of how or if you want to represent intermediate states.

> == Swap or Rotate ==
>
> swap(A,B) == { A->B || B->A }
>
> is a primitive operation. ('||' indicates 'in parallel'.)
>
> rotate(A,B,...N) == { A->B || B->... || ...->N || N->A }
>
> is non-primitive, since any rotate can be composed from multiple sequential swaps. For example, rotate(A,B,C) == { swap(B,C); swap(A,B) }. It is no more useful to rotate more than two paths in one operation (rotate(A,B,C)) than to move more than two paths in one operation (move(A,B,C) == { A->B || B->C }).
>
> So if we have a sequential scheme and we don't want to use temporary paths, we should include the 'swap' primitive rather than a 'rotate'.

I don't understand why you want to devolve to more primitive operations.
Most of the EV2 ops are not primitive; the add_* and alter_* can all be
viewed as a combination of simpler operations. Furthermore, replacing
rotate with swap would place the burden of representing a rotation with
a series of swaps on every edit driver, which doesn't make much sense.
Rotate is just a generalized form of swap.

Given how moves are currently described in the working copy, rotate fits
more naturally into the model anyway -- all you have to do is follow the
moved-from links until you find a cycle, and your rotation is known. I
can't think of a good reason to break that down into swaps.

> == Ev2 ==
>
> One of Ev2's goals is to parallelize text modifications, in order to take advantage of Serf's parallelism. What about tree changes, and specifically what are the semantics of the 'move' operation? The Ev2 documentation indicates sequential requirements for the 'add' operation (parent before children), and also in this rule:
>
> * - The ancestor of an added, copied-here, moved-here, rotated, or
> * modified node may not be deleted. The ancestor may not be moved
> * (instead: perform the move, *then* the edits).
>
> The doc string for 'svn_editor_move' says:
>
> * Move the node at @a src_relpath to @a dst_relpath.
> *
> * @a src_revision specifies the revision at which the receiver should
> * expect to find this node. That is, @a src_relpath at the start of
> * the whole edit and @a src_relpath at @a src_revision must lie within
> * the same node-rev (aka history-segment). This is just like the
> * revisions specified to svn_editor_delete() and svn_editor_rotate().
> * ...
>
> The discussion implies that "the node at src_relpath" refers to a node in the initial tree state. In that respect, it's a parallel operation: the source reference doesn't depend on previous moves. But look at the 'move_cb' implementation in libsvn_fs/editor.c:
>
> # src_root == *initial* tree state
>
> svn_fs_copy(src_root, src_fspath, root, dst_fspath, scratch_pool)
> /* Notice: we're deleting the src repos path from the dst root. */
> # root == *current* tree state (txn)
> svn_fs_delete(root, src_fspath, scratch_pool)
>
> The delete of src_relpath within the current tree state is potentially at odds with the copy from src_relpath relative to the initial state. The delete would delete the wrong thing if the original node at this path has already been replaced, or would try to delete a non-existent node if it had already been deleted or moved away. Could it have been replaced or deleted or moved away, if not itself then perhaps by a copy/move/add/delete of a parent directory?

This is what the once rule prevents. One of the reasons for it, as far
as I could determine, is to be able to detect tree incompatible tree
changes (i.e., changes that will make a commit fail) early in the edit
instead of having to wait for the commit-time rebase to fail.

>
> Let's try the following scenario:
>
> A->A2 || A/B->B2 # move a child of a moved parent
>
> Try with Ev2:
>
> move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1)
> move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1)

You're breaking the once rule here.

And the case you're describing can never occur. You cannot have a
working copy that describes what you're doing. Tree mutations can only
be parallelized across distinct subtrees, which isn't the case in your
example; where the operations interact, they must be sequential or they
don't mean anything.

Your case is either:

A->A2; A2/B -> B2

A/B -> B2; A -> A2

which happen to be the two simplest sequences of working copy changes
that'll generate your end result.

> The implementation shown above would fail in the second move when trying to delete 'A/B', because that path no longer exists in the transaction after the first move deleted the whole subtree at 'A'.
>
> The other way works fine:
>
> move(src_relpath='A/B', src_rev=100, dst_relpath='B2', replaces_rev=-1)
> move(src_relpath='A', src_rev=100, dst_relpath='A2', replaces_rev=-1)
>
> So these moves are not parallelizable with this implementation. (Is that implementation wrong?)
>
> Ev2 also documents (as the first Driving Order Restriction) that there must be an alter_directory call for the parent of moved-away node 'A/B'. What does this look like?
>
> alter_dir('A', ...)
>
> or
>
> alter_dir('A2', ...)
>
> ? Can the alter_dir come before or after the move of 'A' to 'A2'? Is the path it references 'A' in either case, or 'A2' in either case, or 'A' if it comes before the move and 'A2' if it comes after the move?
>
> Can we start with a clear consensus of whether we're trying to deliver a sequential or a parallel editor?

Both :)

However I am a bit worried that the "once rule", as stated, is always
broken in the case you describe.

-- Brane

-- 
Branko ÄŒibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane_at_wandisco.com

Received on 2013-06-24 17:53:30 CEST

This message: [ Message body ]
Next message: Branko ÄŒibej: "Re: Ev2 as a move-aware editor"
Previous message: Julian Foad: "Re: [PATCH] Don't validate properties during deletion."
In reply to: Julian Foad: "Ev2 as a move-aware editor"
Next in thread: Branko ÄŒibej: "Re: Ev2 as a move-aware editor"
Reply: Branko ÄŒibej: "Re: Ev2 as a move-aware editor"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]