[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Ev2 using move-away and move-here

From: Greg Stein <gstein_at_gmail.com>
Date: Mon, 12 Aug 2013 13:46:11 -0500

In the wiki page you started, I thought we already solved the move issue by
using the original state for the source. (and remove rotate)

I didn't read this treatise, cuz I think it has an incorrect starting
assumption. You state early about sequential edits, when we relaxed that a
bit due to needing to retain the original state. Does it respond to the
proposal from your wiki page?

Thx,
-g
On Aug 12, 2013 10:43 AM, "Julian Foad" <julianfoad_at_btopenworld.com> wrote:

> TLDR: Ev2 move support doesn't work with the "move" and "rotate"
> instructions as currently defined, and I propose that it could work instead
> with separate "move-away" and "move-here" instructions.
>
>
> == SUMMARY ==
>
> Ev2 was designed with an intention to support “move” semantics, but the
> current design of its move() method does not satisfy the requirements. Here
> I explain why, and present a solution.
>
> The following principles constrain or guide Ev2 design:
>
> * An edit must be able to express any change that can be constructed in
> a WC and committed.
>
> * An edit must be able to express any change that can be constructed in
> a repo across multiple revisions.
>
> * Ev2 uses sequential description: a succession of incremental edits
> applied to an initial state gives a final state.
>
> * Ev2 should not require the driver to use temporary paths to express a
> change.
>
> Given the constraints,
>
> * Not all moves can be expressed by a “move source to destination”
> operation, with or without a “rotate” operation.
>
> * Moves can be expressed, in general, using separate move-away and
> move-here operations.
>
> While other designs are also possible, a move-away and move-here design
> might be the best fit for the shape of the editor. Such a design is
> detailed below, but not yet finished.
>
>
> == WHY IS THIS NECESSARY? ==
>
> === Express Any Change ===
>
> Since the purpose of the editor is to communicate a change originating in
> a WC when it is committed, and a change originating in a repository when
> the WC is updated, then it must be able to express any such change. This
> includes updates across multiple revisions, and from a mixed-revision state
> to a single revision.
>
> Through a series of simple steps of the form “move X to Y”, some quite
> funky overall changes can be created. For example, starting from this state:
>
> |
> +-- A
> |
> +-- B
>
> the following sequence:
>
> move /A/B /X
> move /A /X/B
> move /X /A
>
> results in swapping A with B. If those steps are performed in a WC and the
> result committed all at once, the editor needs to be able to handle it. If
> those steps are committed separately, and then a working copy is updated,
> the editor needs to be able to handle it.
>
> More examples are given later, some of which involve other operations such
> as “make directory” as well as moves. All of this emergent complexity
> results from the introduction of a simple “move” primitive, and there does
> not seem to be any acceptable way to further constrain the basic move that
> would simplify the derived complexity.Temporary Paths
>
> Paths in the Ev2 editor operations refer to the current state of the tree
> being edited, at that moment in the sequence of edit operations. (The sole
> exception is the copy-from path, which is relative to a given committed
> revision.)
>
> A natural and compact way of expressing moves would be as a mapping from
> original-state paths to final-state paths. However, that paradigm does not
> fit well with the desire to express the edit as a sequence of incremental
> steps. If we are going to include move functionality as steps in a sequence
> of edit operations, it makes sense to use paths that are relative to the
> current state.
>
> Ev2 should not require the driver to express a change as a sequence of
> operations that can include moving a node to a temporary path and then
> later moving it again to the final path. The end result of the edit is the
> important part, and there are unlimited potential sequences that lead to
> that result, and it does not make sense to require an edit driver to
> construct such a sequence arbitrarily, if there is an alternative method
> that specifies the result uniquely. The receiver may in fact need to employ
> temporary paths in its implementation, but then it knows this and it is in
> a better position to construct such paths when needed, and it will know
> that they are temporary which may be important.
>
> There are advantages to placing a node in its final position exactly once.
> It was claimed that Google's BigTable implementation of Svn's back-end
> would have benefited from knowing that once a node has been edited then it
> is in its final state. Ev2 aims to provide that guarantee, where Ev1 could
> not.
>
> === Sequential Description ===
>
> Ev2 (svn_editor_t) intends to express a new state in terms of an old state
> and a description of the parts of the new state that differ from the old
> state, or, in other words, a description of the changes against the old
> state. It uses a sequential, incremental description: for example, “add
> directory X, then copy file X/Y from somewhere, then edit the contents and
> properties of file X/Y”.
>
> Only certain parts of the description are incremental. Ev2 aims to allow
> nodes to be visited in arbitrary order, subject to a small number of
> restrictions. The parts where ordering matters are:
> * tree changes before content changes
>
> * alter/add/copy a directory (if required) before touching its
> (immediate) children
>
> === A Move Event is Not Adequate ===
>
> Moves, in general, cannot be expressed as “move from path X to path Y”
> events in such a sequential description without introducing temporary
> paths. This is because some cases require the source path of the move to be
> moved away, then some other steps, and then the destination path of the
> move can be populated. Some classic examples are:
>
> Example 1: Insert a directory level
>
> | |
> +--A mv--\ (add) +--A
> \ |
> \--> +--B
>
> The add cannot happen before the move-away, but must happen before the
> move-here.
>
> (The mirror of Example 1, which would be removing a directory level, is
> not actually problematic:
>
> | |
> +--A (del) /--> +--A
> | /
> +--B mv--/
>
> The move-away must (in principle) happen before the delete, while the
> move-here cannot (in principle) happen before the delete. However, Ev2
> defines that a deletion which is to be replaced shall occur implicitly when
> the replacement is put in place, and so the move-here can happen
> simultaneously with the delete.)
>
> Example 2: Swap two siblings
>
> | |
> +--A mv--\ /--> +--A
> | X |
> +--B mv--/ \--> +--B
>
> Neither of the moves can be completed before doing the move-away part of
> the other one.
>
> Example 3: Swap two directory levels
>
> | |
> +--A mv--\ /--> +--A
> | X |
> +--B mv--/ \--> +--B
> Neither of the moves can be completed before doing the move-away part of
> the other one.
>
> === A Rotate Event is Not Adequate ===
>
> At one time there was an idea that the addition of a “rotate PATH1 PATH2 …
> PATHn” event would complete the semantics and allow arbitrary moves to be
> supported.
>
> While this does enable Example 2 (swap two siblings) and Example 3 (swap
> two directory levels) and many other cases, it does not help with inserting
> a directory level (Example 1), and it has been shown [1] to be incapable of
> resolving other more involved cases involving swapping or rotation. One
> specific example is swapping A with A/B/C [2]:
>
> | |
> +-- A mv--\ /--> +-- A
> | \ / |
> +-- B mv--- X ---> +-- B
> | / \ |
> +-- C mv--/ \--> +-- C
>
> [1] Email thread on dev@, “Ev2 as a move-aware editor”, started on
> 2013-06-24, e.g. <http://svn.haxx.se/dev/archive-2013-06/0560.shtml> or <
> http://mail-archives.apache.org/mod_mbox/subversion-dev/201306.mbox/%3C1372087321.76009.YahooMailNeo%40web186101.mail.ir2.yahoo.com%3E
> >.
>
> [2] An example problem in thread [1], of swapping A with A/B/C: <
> http://svn.haxx.se/dev/archive-2013-06/0684.shtml> or <
> http://mail-archives.apache.org/mod_mbox/subversion-dev/201306.mbox/%3C87bo6rewwp.fsf%40ntlworld.com%3E
> >.
>
>
> == MOVE-AWAY AND MOVE-HERE ==
>
> One solution is to describe the two halves of each move separately:
>
> move-away SOURCE-PATH
> ...
> move-here DESTINATION-PATH
>
> We can then solve Example 1 in the following way: issue the “move-away A”,
> then create a new directory at path A which replaces that source of the
> move, and then finally issue the “move-here A/B” which relies on that
> replacement directory A having been created.
>
> The consumer must be able to put the node aside for an indeterminate
> amount of time until the “move-here” is received.
>
> Of course there needs to be a way to link each move-away with the
> corresponding move-here. Remembering that each edit step refers to the
> current state in a sequence of states, we cannot simply specify the path
> corresponding to the other end of the move like this:
>
> move-away SOURCE-PATH to DESTINATION-PATH
> ...
> move-here DESTINATION-PATH from SOURCE-PATH
>
> because the problem cases are when the destination path does not yet exist
> at the time of a move-away, or the source path no longer exists at the time
> of a move-here. What we can do is use some other unique reference that is
> unique within the edit, like this:
>
> move-away SOURCE-PATH as identifier ID1
> ...
> move-here DESTINATION-PATH from identifier ID1
>
> The reference could perhaps be the destination path as it will finally
> exist at the end of the edit, or just and arbitrary number or string. We
> will just specify the identifier as an “id” and not specify how it is
> generated.
>
> === Ordering Restrictions ===
>
> The ordering rules regarding move-away and move-here should include:
> * mv-away must come before the matching mv-here
>
> - The edit should provide a sequential procedure that the consumer
> must be able to follow without having to buffer an arbitrary amount of
> state.
>
> * mv-here & cp & add must be in nesting order: create (or put in place)
> the parent before its children
>
> * mv-away must come before deleting a parent
>
> - Receiver needs to know that it must preserve this path when we
> delete its parent.
>
> * mv-away must come before mv-away of a parent
>
> - If we allowed “mv-away A; …; mv-away A/B” then the child path “A/B”
> would have to be specified not relative to the current state of the edit,
> as all other operative paths are, but in some other way, because the parent
> has gone into temporary namespace, and has perhaps been replaced so that
> “A/B” now refers to some other node.
>
> - There is a general rule that all edits within a moved directory “A”
> must come after A is moved its destination, but a mv-away of a subtree of A
> is not considered an edit for this purpose.
>
> === Examples Solved ===
>
> Example 1:
>
> | |
> +--A mv--\ add--> +--A
> \ |
> \--> +--B
>
> 1. alter-dir / (children={A})
> 2. mv-away A (id=“original A”)
> 3. add-directory A (children={B})
> 4. mv-here A/B (id=“original A”)
>
> Example 2: Swapping two siblings
>
> | |
> +--A --\ /--> +--A
> | X |
> +--B --/ \--> +--B
>
> 1. alter-dir / (children={A,B})
> 2. mv-away A (id=“original A”)
> 3. mv-away B (id=“original B”)
> 4. mv-here A (id=“original B”)
> 5. mv-here B (id=“original A”)
>
> Example 3 can also be solved in this way, except for some ordering
> restriction issues that are discussed below.The Need for Alter-directory
> before Tree Changes
>
> Why do we have this ordering rule?
>
> If any path is added (with add_*) or deleted/moved/rotated, then
> an svn_editor_alter_directory() call must be made for its parent
> directory with the target/eventual set of children.
>
> It can't be quite right. Inside an add-directory, all the children have to
> be added, and that would imply we need an alter-directory as well as the
> add-directory, violating the Once Rule. It omits “copied” – just an
> oversight? It does not specify when the call must occur – presumably it
> must be before any such modifications to the children.
>
> To remedy those three initial problems, I suppose something like this is
> intended:
>
> Either alter_directory() or add_directory() must be called on a
> directory, declaring its final set of children, before calling
> delete(), move_away(), move_here(), copy(), or add_*() on any child.
>
> (For delete() or move_away(), it must be alter_directory(), as
> the children of add_directory() must not be deleted or moved away.)
>
> But there is still a problem. If we require alter_directory() before a
> move_away(), it leads to a violation of the Once Rule as shown in the
> following example.
>
> Example 3: Swapping two directory levels
>
> | |
> +--A --\ /--> +--A
> | X |
> +--B --/ \--> +--B
>
> 1. alter-dir A (children={}) ### Needed?
>
> 2. mv-away A/B (id=”original A/B”)
>
> 3. alter-dir / (children={A})
>
> 4. mv-away A (id=”original A”)
>
> 5. mv-here A (id=”original A/B”)
>
> 6. alter-dir A (children={B}) ### Second alter-dir on same path, if (1)
> was needed.
>
> 7. mv-here A/B (id=”original A”)
>
> There are two potential problems here:
>
> * We make an edit within subtree A (the “move-away A/B”) before moving A
>
> * The “alter-dir A” is performed twice
>
> The first point violates the assumed constraint that we should disallow
> edits within a subtree before moving the subtree.
>
> The second point at face value violates the Once Rule. We can probably
> resolve is by making the Once Rule apply per node rather than per path –
> see below.
>
> === What If We Allow Edit Before Move? ===
>
> We were assuming that we should disallow edits within a subtree before
> moving the subtree. [Why?]
>
> One solution might be to drop that requirement and let the subtree be
> moved (move-away) part way through editing it, allowing all editing
> operations. To accommodate such a change, some of the other rules that
> currently refer to “a path” probably need to be reformulated to refer to “a
> path relative to such a subtree” instead.
>
> If we allow edits before and after moving, should we also allow edits
> after the move-away and before the move-here? Not sure. It seems like that
> may be more problematic for certain consumer architectures and so probably
> should not be allowed. But is there a better way to decide?
>
> === What if the Once Rule is Per Node? ===
>
> The path-based Once Rule was written something like this:
>
> A path should never be referenced more than once by the add_*, alter_*,
> and delete
> operations (the "Once Rule"). The source path of a copy (and its
> children, if a directory)
> may be copied many times, and are otherwise subject to the Once Rule.
> The destination path
> of a copy [or move_here] may have alter_* operations applied, but not
> add_* or delete. If
> the destination path of a copy or move is a directory, then its children
> are subject to the
> Once Rule. The source path of a move_away() (and its child paths) may be
> replaced using
> add_*() or copy() or move_here() (where these new or copied nodes are
> subject to the Once Rule).
>
> Perhaps the Once Rule should not apply per path as it was stated, but
> rather per node. If a directory is altered and then moved away, we should
> be able to create a replacement directory, being a different node at the
> same path, and then alter that.
>
> * The Once Rule applies per node, rather than per path. The definition
> of a “node” is such that “move” moves an existing node (or nodes) to a new
> path and does not create a new node, while “add” creates a new node and
> “copy” creates a set of new nodes, each new node being different from all
> other nodes that are or were in the tree even if it replaces an existing
> node at the same path.
>
> * One of the following actions can be applied, just Once, directly to
> each node:
>
> - create (only if it does not exist in the initial tree)
>
> - remove (only if it exists in the initial tree or is brought in as
> a child of a copy)
>
> - modify (only if it exists in the initial tree or is brought in as
> a child of a copy)
>
> * A node may be created by one of:
>
> - add_*()
>
> - copy(), optionally followed by alter_*()
>
> No other operations may be applied to such a node during the entire
> edit. Any children may be edited after (not before) the node is created.
> When a node is created by copy(), its children (recursively) are brought
> into the tree, and are then subject to the Once Rule as existing nodes.
>
> * A node may be removed, along with all its children recursively, by one
> of:
>
> - delete()
>
> - add_*() replacing this node
>
> - copy() replacing this node
>
> - move_here() replacing this node
>
> No other operations may be applied to such a node, nor to any children
> (recursively), during the entire edit, except for nodes that have been
> moved away.
>
> * A node may be modified by any combination from the following:
>
> - move_away() followed by move_here(), at most once
>
> - alter_*(), at most once
>
> - (if a directory) edit a child
>
> These can come in any order except that alter_*() is required before
> editing any children, and neither of those can come between move-away and
> move-here except for move-away of a deeper child [and/or editing of such
> prior to move-away?].
>
> [### This seems more complex than it could be. It would be much easier
> if alter_directory() were not required before adding/moving/editing
> children.]
>
> * The source of a copy operation may be a node in the tree being edited.
> Such a node (and its children, if a directory) may be copied many times, in
> addition to being subject to the Once Rule as existing nodes.
>
> [### I'm not sure about the Copy operation: the doc string implies
> it's copying from something in the current state, but it should be able to
> copy from any path-rev.]
>
>
> == CONCLUSION ==
>
> With a bit more work on the ordering restrictions, especially the
> requirement about calling alter-dir, it looks like this design may fit the
> requirements.
>
> Thoughts?
>
> - Julian
>
>
Received on 2013-08-12 20:46:50 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.