
Re: svn commit: r1001677 - /subversion/trunk/notes/wc-ng/nodes

From: Greg Stein <gstein_at_gmail.com>
Date: Mon, 27 Sep 2010 13:30:45 -0400

NICE!!

On Mon, Sep 27, 2010 at 07:40, <ehu_at_apache.org> wrote:
> Author: ehu
> Date: Mon Sep 27 11:40:18 2010
> New Revision: 1001677
>
> URL: http://svn.apache.org/viewvc?rev=1001677&view=rev
> Log:
> Add NODES design considerations document in notes/wc-ng/nodes.
>
> Added:
>    subversion/trunk/notes/wc-ng/nodes
>
> Added: subversion/trunk/notes/wc-ng/nodes
> URL: http://svn.apache.org/viewvc/subversion/trunk/notes/wc-ng/nodes?rev=1001677&view=auto
> ==============================================================================
> --- subversion/trunk/notes/wc-ng/nodes (added)
> +++ subversion/trunk/notes/wc-ng/nodes Mon Sep 27 11:40:18 2010
> @@ -0,0 +1,159 @@
> +
> +Description of the NODES table
> +==============================
> +
> +
> + * Introduction
> + * Inclusion of BASE nodes
> + * Rows to store state
> + * Ordering rows into layers
> + * Visibility of multiple op_depth rows
> + * Restructuring the tree means adding rows
> + *
> +
> +
> +Introduction
> +------------
> +
> +The entire original design of wc-ng revolves around the notion that
> +there are a number of states in a working copy, each of which needs
> +to be managed.  All operations - excluding merge - operate on three
> +trees: BASE, WORKING and ACTUAL.
> +
> +For an in-depth description of what each means, the reader is referred
> +to other documentation, also in the notes/ directory.  In short, BASE
> +is what was checked out from the repository; WORKING includes
> +modifications made with Subversion commands, while ACTUAL also includes
> +changes made with non-Subversion-aware tools (rm, cp, etc.).
> +
> +The idea that there are three trees works - mostly. There is no need
> +for more trees outside the area of the metadata administration and even
> +then three trees got us pretty far.  The problem starts when one realizes
> +tree modifications can be overlapping or layered. Imagine a tree with
> +a replaced subtree.  It's possible to replace a subtree within the
> +replacement.  Imagine that happened and that the user wants to revert
> +one of the replacements.  Given a 'flat' system, with just enough columns
> +in the database to record the 'old' and 'new' information per node, a single
> +revert can be supported.  However, in the example with the double
> +replacement above, that would mean it's impossible to revert one of the
> +two replacements: either there's not enough information in the deepest
> +replacement to execute the highest level replacement or vice versa
> +- depending on which information was selected to be stored in the "new"
> +columns.
> +
> +The NODES table is the answer to this problem: instead of having a single
> +row per node in a table of WORKING nodes, with just enough columns to
> +record (as per the example) a replacement, the solution is to record
> +different states by having multiple rows.
> +
> +
> +
> +Inclusion of BASE nodes
> +-----------------------
> +
> +The original technical design of wc-ng included a WORKING_NODE and a
> +BASE_NODE table.  As described in the introduction, the WORKING_NODE
> +table was replaced with NODES.  However, the BASE_NODE table stores
> +roughly the same state information that WORKING_NODE did.  Additionally,
> +in a number of situations, the system isn't interested in the type of
> +state it gets returned (BASE or WORKING) - it just wants the latest.
> +
> +As a result the BASE_NODE table has been integrated into the NODES
> +table.
> +
> +The main difference between the WORKING_NODE and BASE_NODE tables was
> +that the BASE_NODE table contained a few caching fields which are
> +not relevant to WORKING_NODE.  Moving those to a separate table was
> +determined to be wasteful because the primary key of that table
> +would be much larger than any information stored in it in the first
> +place.
> +
> +
> +
> +Rows to store state
> +-------------------
> +
> +Rows of the NODES table store state of nodes in the BASE tree
> +and the layers in the WORKING tree.  Note that these nodes do not
> +need to exist in the working copy presented to the user: they may
> +be 'absent', 'not-present' or just removed (rm) without using
> +Subversion commands.
> +
> +A row contains information linking to the repository, if the node
> +was received from a repository.  For copied or moved nodes, this
> +reference may be a link to the original nodes; for rows designating
> +BASE state, it refers to the repository location that was checked
> +out.
> +
> +Additionally, the rows contain information about local modifications
> +such as copy, move or delete operations.
> +
> +
> +
> +Ordering rows into layers
> +-------------------------
> +
> +Since the table might contain more than one row per (wc_id, local_relpath)
> +combination, an ordering mechanism needs to be added.  To that effect
> +the 'op_depth' value has been devised.  The op_depth is an integer
> +indicating the depth of the operation which modified the tree in order
> +for the node to enter the state indicated in the row.
> +
> +Every row for the (wc_id, local_relpath) combination must have a unique
> +op_depth associated with it.  The value of op_depth is related to the
> +top-most node being modified in the given tree-restructuring
> +operation (operation root or oproot).  E.g. upon deletion of a subtree,
> +all nodes in the subtree will have rows in the table with the same
> +op_depth.
> +
> +The op_depth is calculated by taking the number of path components in
> +the local_relpath of the oproot. The unmodified tree (BASE) is identified
> +by rows with an op_depth value 0.
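As a rough illustration of the calculation described above, here is a minimal Python sketch. The function name `op_depth_for` is hypothetical, and the sketch assumes that local_relpath uses '/' as separator and that the working copy root is the empty string; neither detail is stated in the notes.

```python
def op_depth_for(oproot_relpath: str) -> int:
    """Return the number of path components in the oproot's local_relpath.

    Assumption: components are separated by '/' and the working copy
    root is represented by the empty string.
    """
    if oproot_relpath == "":
        # Zero components. Note that op_depth 0 identifies BASE rows, so
        # how an operation rooted at the working copy root would be
        # recorded is not covered by this sketch.
        return 0
    return oproot_relpath.count("/") + 1


# A subtree deleted at A/B: every row written for that deletion,
# including the one for A/B/C, would share the oproot's op_depth.
print(op_depth_for("A/B"))
```

Under these assumptions the depth depends only on the operation root, not on the path of the node whose row is being written.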
> +
> +When multiple restructuring operations affect the same path in a modified
> +subtree (most notably replacements), the table may end up with multiple
> +rows for that path with an op_depth greater than 0.
> +
> +
> +
> +Visibility of multiple op_depth rows
> +------------------------------------
> +
> +As stated in the introduction, there's no need to leak the concept of
> +multiple op_depth rows out of the metadata store - apart from the BASE
> +and WORKING trees.
> +
> +As described before, the BASE tree is defined by op_depth == 0. WORKING as
> +visible outside the metadata store maps back to those rows where
> +op_depth == MAX(op_depth) for each (wc_id, local_relpath) combination.
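The two selections described above could be sketched against SQLite roughly as follows. This is only an illustration under assumptions: the table is cut down to the three columns named in these notes, and the helper names `base_tree` and `working_tree` are hypothetical, not part of any real API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for NODES: only the columns discussed in the notes.
conn.execute(
    "CREATE TABLE nodes ("
    " wc_id INTEGER, local_relpath TEXT, op_depth INTEGER,"
    " PRIMARY KEY (wc_id, local_relpath, op_depth))"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        (1, "A", 0), (1, "A/B", 0), (1, "A/B/C", 0),  # BASE rows
        (1, "A/B", 2), (1, "A/B/C", 2),               # A/B replaced
        (1, "A/B/C", 3),                              # A/B/C replaced again
    ],
)


def base_tree(conn, wc_id):
    """Paths of the BASE tree: rows with op_depth == 0."""
    cur = conn.execute(
        "SELECT local_relpath FROM nodes"
        " WHERE wc_id = ? AND op_depth = 0 ORDER BY local_relpath",
        (wc_id,),
    )
    return [row[0] for row in cur]


def working_tree(conn, wc_id):
    """(path, op_depth) pairs where op_depth is the maximum for that path."""
    cur = conn.execute(
        "SELECT n.local_relpath, n.op_depth FROM nodes AS n"
        " WHERE n.wc_id = ? AND n.op_depth ="
        "  (SELECT MAX(m.op_depth) FROM nodes AS m"
        "   WHERE m.wc_id = n.wc_id AND m.local_relpath = n.local_relpath)"
        " ORDER BY n.local_relpath",
        (wc_id,),
    )
    return cur.fetchall()


print(base_tree(conn, 1))
print(working_tree(conn, 1))
```

Read literally, the MAX(op_depth) rule returns the op_depth 0 row for a path that has no other rows, as happens for A in this example.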
> +
> +
> +
> +Restructuring the tree means adding rows
> +----------------------------------------
> +
> +The basic idea behind the NODES table is that every tree restructuring
> +operation causes rows to be added to the table in order to best support
> +the reversal process: a revert then simply means deleting those rows
> +and bringing the subtree back into sync with the metadata.
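To make the "revert means deleting rows" idea concrete, here is a hedged sketch on the same kind of simplified table. The names `make_example`, `revert_layer` and `top_layers` are hypothetical, the '/' separator is an assumption, and the sketch touches only the metadata: bringing the on-disk subtree back into sync is left out.

```python
import sqlite3


def make_example():
    """Build a simplified NODES stand-in with a double replacement."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes ("
        " wc_id INTEGER, local_relpath TEXT, op_depth INTEGER,"
        " PRIMARY KEY (wc_id, local_relpath, op_depth))"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)",
        [
            (1, "A", 0), (1, "A/B", 0), (1, "A/B/C", 0),  # BASE rows
            (1, "A/B", 2), (1, "A/B/C", 2),               # A/B replaced
            (1, "A/B/C", 3),                              # A/B/C replaced again
        ],
    )
    return conn


def revert_layer(conn, wc_id, oproot):
    """Delete the rows recorded for the operation rooted at oproot."""
    depth = oproot.count("/") + 1  # path components of the oproot
    conn.execute(
        "DELETE FROM nodes WHERE wc_id = ? AND op_depth = ?"
        " AND (local_relpath = ? OR substr(local_relpath, 1, ?) = ?)",
        (wc_id, depth, oproot, len(oproot) + 1, oproot + "/"),
    )


def top_layers(conn, wc_id):
    """Map each path to its highest remaining op_depth."""
    return dict(conn.execute(
        "SELECT local_relpath, MAX(op_depth) FROM nodes"
        " WHERE wc_id = ? GROUP BY local_relpath",
        (wc_id,),
    ))


conn = make_example()
revert_layer(conn, 1, "A/B/C")  # undo the inner replacement only
print(top_layers(conn, 1))
```

Because each replacement has its own layer of rows, the inner one can be removed while the rows of the outer replacement stay intact, which is the case the 'flat' design from the introduction could not handle.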
> +
> +There's one exception: When a delete is followed by a copy or move to
> +the deleted location - causing a replacement - a pre-existing (due to the
> +delete) row with the right op_depth exists and needs to be modified. On
> +revert, the modified nodes need to be restored to 'deleted' state, which
> +itself can be reverted during the next revert.
> +
> +### EHU: The statement above probably means that *all* nodes in the subtree
> +  need to be rewritten: they all have a deleted state with the affected
> +  op_depth, meaning they probably need a 'replaced/copied-to' state with
> +  the same op_depth...
> +
> +
> +
> +
> +
> +
> +TODO:
> + * Explain the role of the 'deleted-below' columns
> + * Document states of the table and their meaning (including values
> +    of the relevant columns)
> \ No newline at end of file
>
>
>
Received on 2010-09-27 19:31:26 CEST

This is an archived mail posted to the Subversion Dev mailing list.
