Re: fourth tree: "INHERITED"

From: Greg Stein <gstein_at_gmail.com>
Date: Tue, 13 Apr 2010 09:50:55 -0400

On Tue, Apr 13, 2010 at 06:45, Philip Martin <philip.martin_at_wandisco.com> wrote:
> Greg Stein <gstein_at_gmail.com> writes:
>
>> After some further discussion on IRC, and some thought...
>>
>> I think this may be more of a representational problem, and might not
>> be a "true" fourth tree. Especially because supporting the revert
>> scenario actually implies N trees. Bert tried to describe this a while
>> back, but I didn't understand his description (too many "A" nodes).
>> Consider the following:
>>
>> $ svn cp A X # copies A/Y/Z/file
>> $ svn cp B X/Y # copies B/Z/file
>> $ svn cp C X/Y/Z # copies C/file
>> $ svn cp file X/Y/Z/file
>
> Just to be clear, the second, third and fourth copies need the
> destination to be deleted first.

Ah. True, yes.

>> We have four operation roots, and four layers of "file". Reverting
>> each op-root will reveal the previous layer.
>>
>> In 1.6, we probably had just one layer, but if we're going to solve
>> this, then let's do it right.
>
> The current three tree model can support the creation of all those
> copies, it's only the step-by-step revert that is a problem. The
> current wc-ng only really allows the revert of all the copies in one
> go.

Right.

And you demonstrated where 1.6 could do one level of revert.
Therefore, we should be able to do (at least) one level in 1.7.

>> I propose that we create a new table called NODE_DATA which is keyed
>> by <wc_id, local_relpath, op_depth>. The first two are the usual, and
>> op_depth is the "operation depth". In the above example, we have four
>> WORKING_NODE rows, each establishing an operation root, with
>> local_relpath values of [X, X/Y, X/Y/Z, X/Y/Z/file]. In the NODE_DATA
>> table, we have the following four rows:
>>
>> <1, X/Y/Z/file, 1> # from the X op-root
>> <1, X/Y/Z/file, 2> # from the X/Y op-root
>> <1, X/Y/Z/file, 3> # from the X/Y/Z op-root
>> <1, X/Y/Z/file, 4> # from the X/Y/Z/file op-root
>>
>> Essentially, op_depth = oproot_relpath.count('/') + 1
>>
>> We can record BASE node data as op_depth == 0.
>>
>> Looking up the data for "file" is a query like this:
>>
>> SELECT * from NODE_DATA
>> WHERE wc_id = ?1 AND local_relpath = ?2
>> ORDER BY op_depth DESC
>> LIMIT 1;
>>
>> That provides the "current" file data.
>>
>> Some of the common columns between BASE_NODE and WORKING_NODE move to
>> this new NODE_DATA table. I think they are:
>>
>> kind, [checksum], changed_*, properties
>
> I think NODE_DATA needs more or less everything that is in the current
> WORKING_NODE. When a layer is reverted to uncover the layer below all
> the old columns need to be available. As far as I can see we need to
> remove the WORKING_NODE tree and replace it with the NODE_DATA tree,
> or to put it another way we need to add the op_depth column to
> WORKING_NODE.

I don't think so.

As you note later, operations on a node cannot be layered/stacked. You
can modify the operation at a node, but you can't layer over it.
Columns like copyfrom_* and moved_* are about the operation, rather
than the node's data. They say *how* the node got there, rather than
describing the node itself.

The BASE_NODE table is "what", not "how", so it more closely resembles
the suggested NODE_DATA table.
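
To make the "what" versus "how" split concrete, here is a rough sketch
using Python's sqlite3 module. It is not the real wc-ng schema: the
column lists are trimmed, copyfrom_path stands in for the copyfrom_*
columns, changed_rev stands in for changed_*, and record_copy() is a
hypothetical helper. It also skips the delete that has to happen
before a replacing copy. One operation gives one WORKING_NODE row,
while every node it brings along gets a NODE_DATA row at that
operation's depth.

```python
import sqlite3

# Assumed, simplified schema: WORKING_NODE holds the operation ("how"),
# NODE_DATA holds per-node data ("what"), one row per layer.
SCHEMA = """
CREATE TABLE WORKING_NODE (
    wc_id          INTEGER NOT NULL,
    local_relpath  TEXT NOT NULL,
    copyfrom_path  TEXT,              -- stand-in for copyfrom_*
    PRIMARY KEY (wc_id, local_relpath)
);
CREATE TABLE NODE_DATA (
    wc_id          INTEGER NOT NULL,
    local_relpath  TEXT NOT NULL,
    op_depth       INTEGER NOT NULL,  -- 0 would be BASE
    kind           TEXT,
    checksum       TEXT,
    changed_rev    INTEGER,           -- stand-in for changed_*
    properties     BLOB,
    PRIMARY KEY (wc_id, local_relpath, op_depth)
);
"""


def record_copy(conn, wc_id, op_root, copyfrom_path, nodes):
    """Record one copy: a single operation row plus one data row per node.

    nodes is a list of (local_relpath, kind) pairs covered by the copy.
    """
    op_depth = op_root.count("/") + 1
    conn.execute(
        "INSERT INTO WORKING_NODE (wc_id, local_relpath, copyfrom_path) "
        "VALUES (?, ?, ?)",
        (wc_id, op_root, copyfrom_path),
    )
    conn.executemany(
        "INSERT INTO NODE_DATA (wc_id, local_relpath, op_depth, kind) "
        "VALUES (?, ?, ?, ?)",
        [(wc_id, relpath, op_depth, kind) for relpath, kind in nodes],
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# svn cp A X      (brings in X, X/Y, X/Y/Z, X/Y/Z/file)
record_copy(conn, 1, "X", "A",
            [("X", "dir"), ("X/Y", "dir"), ("X/Y/Z", "dir"),
             ("X/Y/Z/file", "file")])
# svn cp B X/Y    (brings in X/Y, X/Y/Z, X/Y/Z/file)
record_copy(conn, 1, "X/Y", "B",
            [("X/Y", "dir"), ("X/Y/Z", "dir"), ("X/Y/Z/file", "file")])
```

After the two copies there are two operation rows but seven data rows,
and X/Y/Z/file appears once per layer.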

>> Those columns, plus the key, may be about it. I don't know that this
>> table needs a presence column, as the "visible" state is determined by
>> the BASE and WORKING trees. This is why I suggest that maybe we're
>> looking more at how to represent (in the database) the WORKING tree,
>> than truly adding a new "tree".
>
> One thing that occurs to me is that this layering always occurs on
> deleted children of copied parents, it never occurs on roots of
> operations (be they adds, deletes, copies or moves).

I can copy/move subtrees into another copied-subtree without
replacement. But you're right: all the resulting nodes are disjoint.
No true layering occurs.

> Roots can never
> lie one on top of the other. I wonder if we should make WORKING_NODE
> only hold roots, and have a different node type for children. The
> child node would not need the columns that are inherited from the
> parent,

That is how NODE_DATA is defined :-) ... the WORKING_NODE table just
defines operations and all the data for the nodes lives over in
NODE_DATA.

> but it would have a column that defined how many generations
> the child is from the root.

Using op_depth allows you to find all the children for a given
operation. Using a descending sort on op_depth still allows you to use
LIMIT 1 to fetch the most-recent/current node.
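
A minimal sketch of both queries, again with Python's sqlite3 and an
assumed, cut-down NODE_DATA table (only the key plus kind). The helper
names are made up for illustration. Since op_depth alone does not
identify an operation (two roots can sit at the same depth), the
sketch also filters on the op-root's path. Reverting an op-root is
modelled as simply deleting its layer, which uncovers the one below.

```python
import sqlite3


def make_db():
    """Build an in-memory NODE_DATA table with the four layers of X/Y/Z/file."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE NODE_DATA ("
        " wc_id INTEGER NOT NULL,"
        " local_relpath TEXT NOT NULL,"
        " op_depth INTEGER NOT NULL,"
        " kind TEXT,"
        " PRIMARY KEY (wc_id, local_relpath, op_depth))"
    )
    for root in ["X", "X/Y", "X/Y/Z", "X/Y/Z/file"]:
        conn.execute(
            "INSERT INTO NODE_DATA VALUES (1, 'X/Y/Z/file', ?, 'file')",
            (root.count("/") + 1,),
        )
    return conn


def current_op_depth(conn, wc_id, relpath):
    """Return the op_depth of the visible (topmost) layer, or None."""
    row = conn.execute(
        "SELECT op_depth FROM NODE_DATA"
        " WHERE wc_id = ? AND local_relpath = ?"
        " ORDER BY op_depth DESC LIMIT 1",
        (wc_id, relpath),
    ).fetchone()
    return row[0] if row else None


# Matches the op-root itself or anything below it, at the op-root's depth.
_IN_OPERATION = (
    " WHERE wc_id = ? AND op_depth = ?"
    " AND (local_relpath = ? OR substr(local_relpath, 1, ?) = ?)"
)


def _operation_params(wc_id, op_root):
    return (wc_id, op_root.count("/") + 1, op_root,
            len(op_root) + 1, op_root + "/")


def nodes_of_operation(conn, wc_id, op_root):
    """List the paths whose data was brought in by the given op-root."""
    rows = conn.execute(
        "SELECT local_relpath FROM NODE_DATA" + _IN_OPERATION
        + " ORDER BY local_relpath",
        _operation_params(wc_id, op_root),
    )
    return [r[0] for r in rows]


def revert_operation(conn, wc_id, op_root):
    """Drop one layer; the next-lower layer becomes the current one."""
    conn.execute("DELETE FROM NODE_DATA" + _IN_OPERATION,
                 _operation_params(wc_id, op_root))


conn = make_db()
print(current_op_depth(conn, 1, "X/Y/Z/file"))   # layer from X/Y/Z/file
revert_operation(conn, 1, "X/Y/Z/file")
print(current_op_depth(conn, 1, "X/Y/Z/file"))   # layer from X/Y/Z
```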

> Selecting a node's data then involves
> looking in WORKING_CHILD_NODE, WORKING_NODE and BASE_NODE.
>
> SELECT * from WORKING_CHILD_NODE
> where wc_id = ?1 AND local_relpath = ?2
> ORDER BY generation
> LIMIT 1
>
> If a WORKING_CHILD_NODE is found then the generation column allows
> easy access to the related WORKING_NODE root, if it is not found then
> look in WORKING_NODE directly for a root (and if not found there then
> look in BASE_NODE).

op_depth also provides this quick access to the root.

A/B/C/D/file, op_depth=2 means that A/B is the operation root. I think
it is a bit clearer than the generation version.
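
In code, the mapping in both directions is just path arithmetic. A
small sketch follows (the function names are hypothetical, and it
assumes non-empty, '/'-separated relpaths with op_depth 0 reserved for
BASE):

```python
def op_depth_of(oproot_relpath):
    """op_depth for an operation rooted at the given relpath."""
    return oproot_relpath.count("/") + 1


def op_root(local_relpath, op_depth):
    """Return the operation root for a node at the given op_depth.

    Returns None for op_depth 0, which is taken to mean BASE.
    """
    if op_depth == 0:
        return None
    parts = local_relpath.split("/")
    if op_depth < 0 or op_depth > len(parts):
        raise ValueError("op_depth does not name an ancestor of the path")
    # The root is simply the first op_depth path components.
    return "/".join(parts[:op_depth])


print(op_root("A/B/C/D/file", 2))
```

With the generation scheme you would count upwards from the child
instead; here the root falls out of the first op_depth components of
the path.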

Cheers,
-g
Received on 2010-04-13 15:51:25 CEST
