Re: [PATCH] NODES table presence values

From: Greg Stein <gstein_at_gmail.com>
Date: Mon, 20 Sep 2010 16:05:02 -0400

On Mon, Sep 20, 2010 at 15:36, Greg Stein <gstein_at_gmail.com> wrote:
>...
> Erik and I talked further on IRC...
>
> I believe the right approach is a simple boolean "prior-deleted",
> meaning "the nodes visible just under *this* layer have been deleted".
> Examining the root node's moved_to column can refine how the subtree
> was deleted/moved-away.
>
> I dislike the concept of modifying prior layers (preferring to see
> them as inviolate/readonly until I revert recent layers). If I say
> "delete", then it should be a new layer describing which nodes got
> deleted.
>
> Several more things came up in conversation:
>
> * a simple rule for "is this revertable?" is "does the node's op_depth
> match its path component count?"
> * adds have variant op_depth values in a subtree. thus, each node is
> revertable (and implicitly reverting its subtree)
> * deletes have a single op_depth value, making only the root
> revertable. when deleting a node, all previously-deleted children will
> need their op_depth updated
>
> And I identified one more problem here [in discussion with Erik now]:
>
> * svn mv A B ; svn add A ; svn add A/foo
>
> The op_depth of A and A/foo (assuming the latter existed in the
> original A) has the value 1. After the adds, A has 1, and A/foo has 2.
> Thus, we lose the op_depth defined for the move. We can certainly scan
> upwards to find that root (tho we've been wanting to skip these kinds
> of scans).

Okay. We've identified a possible solution here. But first let me back
up a bit to clarify the exact problem space.

1. a node (and its children) is deleted.
2. a new subtree is added, copied-here, or moved-here.

The problem arises *only* in the "added" case since the op_depth
values within an added subtree will vary. For a copied/moved-here
subtree, the op_depth values will specify the root and that root *must*
be the same as the deletion root (if the deletion root was an ancestor,
then you could not copy/move here because a parent is missing; if the
deletion root was a descendant, then you haven't cleared the way for
the copy/move to happen).

Thus, we are only speaking to an added subtree. Nodes in that subtree
are marked with "a prior node was deleted before this node could be
added here". But we don't know where the root of that deletion is,
without scanning for it. Our op_depth values were rejiggered to
specify each added node as its own root.
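
To make that concrete, here is a minimal sketch in Python (not
Subversion code; the function names are made up for illustration). It
assumes op_depth is simply the number of path components of the
operation's root, which is what the "svn mv A B" example above implies:

```python
def path_depth(relpath):
    """Number of components in a wc-relative path ('' is the wc root)."""
    return len(relpath.split("/")) if relpath else 0


def is_op_root(relpath, op_depth):
    """The 'is this revertable?' rule: op_depth matches the component count."""
    return op_depth == path_depth(relpath)


def root_from_op_depth(relpath, op_depth):
    """The operation root is the first op_depth components of the path."""
    return "/".join(relpath.split("/")[:op_depth])


# After "svn mv A B": one deletion layer rooted at A (op_depth 1).
after_move = {"A": 1, "A/foo": 1}

# After "svn add A ; svn add A/foo": each added node is its own root.
after_adds = {"A": 1, "A/foo": 2}
```

With after_move, root_from_op_depth("A/foo", 1) gives "A", the deletion
root. With after_adds, A/foo carries op_depth 2, so the same lookup
gives "A/foo" and the deletion root is no longer recoverable from
op_depth alone.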

And recall: I don't think that we want to modify prior layers. We'd
probably just end up in a similar kind of "something got overwritten"
position, and I'd prefer each operation to define a new layer so that
we have a 1:1 linkage between operations and layers (with the caveat
that "delete+something" is a single replacing operation).

On IRC, I proposed to Erik that we switch that boolean to a column
named something like "deletion_op_depth". When you first delete a
subtree, then all nodes' op_depth and deletion_op_depth are set to the
value based on the root. If you later delete an ancestor, then both
columns are rewritten as proposed earlier. But as you add nodes, with
variant op_depths, you leave the deletion_op_depth alone.

You can then recover the deletion root by examining deletion_op_depth.
If you revert an add, then you can (re)set its op_depth to the
deletion_op_depth to incorporate that node back into the original
deletion operation. (and set presence to 'deleted', of course).

This allows us to avoid scanning for the root, and allows us to
properly restore the fields on a revert.
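
A rough sketch of that proposal in Python, under some assumptions: one
row per path, op_depth equal to the component count of the operation
root, and "normal" as the presence of an added node. The Node class and
the three helpers are hypothetical, not anything in wc_db:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    op_depth: int
    presence: str
    deletion_op_depth: Optional[int] = None  # the proposed column


def depth(relpath: str) -> int:
    return len(relpath.split("/")) if relpath else 0


def delete_subtree(nodes: Dict[str, Node], root: str) -> None:
    """Set both columns from the deletion root. Deleting an ancestor
    later simply rewrites both columns again."""
    d = depth(root)
    for path, node in nodes.items():
        if path == root or path.startswith(root + "/"):
            node.op_depth = d
            node.deletion_op_depth = d
            node.presence = "deleted"


def add_node(nodes: Dict[str, Node], path: str) -> None:
    """An add becomes its own root but leaves deletion_op_depth alone."""
    prior = nodes.get(path)
    nodes[path] = Node(
        op_depth=depth(path),
        presence="normal",  # assumed presence value for an add
        deletion_op_depth=prior.deletion_op_depth if prior else None,
    )


def revert_add(nodes: Dict[str, Node], path: str) -> None:
    """Fold the node back into the original deletion, if there was one."""
    node = nodes[path]
    if node.deletion_op_depth is None:
        del nodes[path]
    else:
        node.op_depth = node.deletion_op_depth
        node.presence = "deleted"
```

Delete A, add A/foo back, and A/foo ends up with op_depth 2 but
deletion_op_depth 1; reverting the add puts it back at op_depth 1 with
presence 'deleted'.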

One last point: in a copied-here/moved-here subtree, if a child is
deleted (and maybe later replaced), then it will have a whole new
op_depth (and deletion_op_depth) and will create a new layer. We're
still good with respect to mixing these operations across subtrees.

In terms of scanning:

* for an added node, we have to scan for the root of the added subtree
because we do not have that information
* for an added node, if it is replacing a prior node, then we know its
root from deletion_op_depth
* for copied/moved-here nodes, we know their root, where we can find the
source information
* for a deleted node, we know its root from op_depth. we cannot
determine "deleted" vs "moved-away" without examining that root
* for moved-away nodes, we need to look in the root to determine where they went
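
The difference between "we know the root" and "we have to scan" can be
sketched like this in Python. Both helpers are hypothetical, and
added_paths stands in for "the set of paths that are plain adds", which
a real implementation would have to read from the table row by row:

```python
def parent_of(relpath):
    """Parent of a wc-relative path ('' for a top-level path)."""
    return relpath.rpartition("/")[0]


def root_from_op_depth(relpath, op_depth):
    """Direct lookup: the root is the first op_depth path components.
    Works for deletes and copied/moved-here nodes, or with
    deletion_op_depth for an add that replaced a deleted node."""
    return "/".join(relpath.split("/")[:op_depth])


def scan_for_add_root(added_paths, relpath):
    """Upward scan: climb while the parent is also a plain add.
    Needed because each added node's op_depth only names itself."""
    current = relpath
    while "/" in current and parent_of(current) in added_paths:
        current = parent_of(current)
    return current
```

The first helper is a string operation on data already in the row; the
second needs one lookup per ancestor, which is the kind of scan we have
been trying to avoid.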

------

Whoops. I just realized something:

Given that children within an added subtree have variant op_depth
values, that also means they define separate layers. That means the
nodes describing the original deletion are *still* in the layer
defined by the deletion.

We might not need deletion_op_depth at all! The later adds are simply
shadowing those deletions.
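
A small Python sketch of the shadowing idea, assuming rows are keyed by
(path, op_depth) and the row with the highest op_depth is the visible
one. The helper names and the "normal" presence are assumptions for
illustration. It only covers children below the deletion root; the root
itself gets the same op_depth for the delete and the add, which this
sketch does not address:

```python
from typing import Dict, Iterable, Optional, Tuple

Rows = Dict[Tuple[str, int], str]  # (relpath, op_depth) -> presence


def depth(relpath: str) -> int:
    return len(relpath.split("/")) if relpath else 0


def delete_subtree(rows: Rows, root: str, paths: Iterable[str]) -> None:
    """Write one 'deleted' row per node, all in the layer of the root."""
    d = depth(root)
    for path in paths:
        if path == root or path.startswith(root + "/"):
            rows[(path, d)] = "deleted"


def add_node(rows: Rows, path: str) -> None:
    """An add writes its own, deeper layer; prior rows are untouched."""
    rows[(path, depth(path))] = "normal"  # assumed presence for an add


def visible(rows: Rows, path: str) -> Optional[Tuple[int, str]]:
    """The row with the highest op_depth shadows the ones below it."""
    layers = [(d, pres) for (p, d), pres in rows.items() if p == path]
    return max(layers, key=lambda layer: layer[0]) if layers else None


def revert_add(rows: Rows, path: str) -> None:
    """Drop the add's layer; the deletion row underneath shows again."""
    del rows[(path, depth(path))]
```

After deleting A and adding A/foo, the ("A/foo", 1) deletion row is
still there underneath the ("A/foo", 2) add, so reverting the add needs
no extra column to find its way back to the deletion.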

... okay. I'm not going to edit this email, but leave it all here for
posterity. A little reset is necessary, and I'll start a new
thread/document :-P

Cheers,
-g
Received on 2010-09-20 22:05:58 CEST