NODE_DATA discussions

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Wed, 18 Aug 2010 21:49:16 +0100

Hyrum, Eric H., Philip M. and I met up at WANdisco's office in Sheffield
(England) yesterday and today. One of the things we discussed was
NODE_DATA.

We discussed several sub-topics, of which here are two. I've written
these up including my further thoughts which weren't part of the
discussion, so it's biased.

-----------------------

Observation:

The new NODE_DATA table can completely subsume the old BASE_NODE and
WORKING_NODE tables.

BASE_NODE => NODE_DATA(op_depth==0),
WORKING_NODE => NODE_DATA(op_depth==max).

A few of the columns do not make sense in every op_depth.
translated_size and last_mod_time make sense only on the top-most node.
But putting these columns into this table seems better than keeping them
in a separate table.

  BASE_NODE                NODE_DATA              WORKING_NODE
  -------------------      ---------------        ---------------
  Indexing             =>  yes + op_depth     <=  Indexing
  presence             =>  yes                <=  presence
  Node-Rev             =>  yes: original_*    <=  copyfrom_*
  Content              =>  yes                <=  Content
  Last-Change          =>  yes                <=  Last-Change
  translated_size      =>  TODO               <=  translated_size
  last_mod_time        =>  TODO               <=  last_mod_time
  file_external        x   no - obsolete
  dav_cache            =>  TODO
  incomplete_children  x   no - obsolete
                           no - not ready?    <=  moved_here
                           no - not ready?    <=  moved_to
                           no - obsolete      x   keep_local

By "TODO" I mean not yet in wc-metadata.sql.

By "not ready?" I mean we're not ready to fully define and use the
'moved_*' columns so it would be better to insert them with a WC format
upgrade when we are ready.
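
To make the observation concrete, here is a minimal sketch using Python's
sqlite3 module. The schema is hypothetical and heavily simplified (just
local_relpath, op_depth and presence); it is not the real wc-metadata.sql.
It also assumes that a WORKING row means "highest op_depth greater than 0",
which is one reading of "op_depth==max" above.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE node_data (
           local_relpath TEXT    NOT NULL,
           op_depth      INTEGER NOT NULL,
           presence      TEXT    NOT NULL,
           PRIMARY KEY (local_relpath, op_depth)
       )"""
)
conn.executemany(
    "INSERT INTO node_data VALUES (?, ?, ?)",
    [
        ("A1", 0, "normal"),
        ("A1/f", 0, "normal"),
        ("A1/f", 1, "base_deleted"),  # working-layer row shadowing BASE
    ],
)


def base_node(path):
    """BASE_NODE equivalent: the op_depth == 0 row, or None."""
    return conn.execute(
        "SELECT presence FROM node_data"
        " WHERE local_relpath = ? AND op_depth = 0",
        (path,),
    ).fetchone()


def working_node(path):
    """WORKING_NODE equivalent: the highest op_depth > 0 row, or None."""
    return conn.execute(
        "SELECT presence FROM node_data"
        " WHERE local_relpath = ? AND op_depth > 0"
        " ORDER BY op_depth DESC LIMIT 1",
        (path,),
    ).fetchone()
```

With this layout, a path that has only a BASE row simply has no WORKING
equivalent, so no second table is needed to express that.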

-----------------------

[RFC] Instead of recording the "deleted" state as a presence value in
the topmost operative layer, consider whether to record it by means of a
flag in the next-nearest layer *beneath*.

This initially sounded appealing, but I'm not so sure now. It could
just be a premature optimisation side-track. It's certainly a way down
the list of Important Things.

Example sequence of operations, showing representation in both schemes:

Operation: clean checkout

  WC paths       op_depth=0  op_depth=1      op_depth=0  op_depth=1
  ------------   ----------  ----------      ----------  ----------
  A1/            norm                        norm
   +- f.old      norm                        norm
   +- f          norm                        norm
  B1/            norm                        norm
   +- f          norm                        norm
   +- f.new      norm                        norm

Operation: delete ./A1

  WC paths       op_depth=0  op_depth=1      op_depth=0  op_depth=1
  ------------   ----------  ----------      ----------  ----------
  A1/            norm  Del!                  norm        base_deleted
   +- f.old      norm  Del!                  norm        base_deleted
   +- f          norm  Del!                  norm        base_deleted
  B1/            norm                        norm
   +- f          norm                        norm
   +- f.new      norm                        norm

Operation: copy ^/B1 to ./A1, replacing ^/A1

  WC paths       op_depth=0  op_depth=1      op_depth=0  op_depth=1
  ------------   ----------  ----------      ----------  ----------
  A1/            norm  Del!  norm            norm        norm
   +- f.old      norm  Del!                  norm        base_deleted
   +- f          norm  Del!  norm            norm        norm
   +- f.new                  norm                        norm
  B1/            norm                        norm
   +- f          norm                        norm
   +- f.new      norm                        norm

Flag in layer beneath doesn't require a final base_deleted row in the
topmost layer. The flag is required for (and only for) paths where a
row exists in the previous layer, so it seems more space-efficient to
store it there.
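
As a rough illustration of the space argument, the sketch below models the
"delete ./A1" step in both schemes with plain Python dicts and counts the
resulting rows. The row layout and the function names are made up for this
example and are not taken from the WC code; "normal" stands for "norm" in
the tables above.

```python
def checkout():
    """Clean checkout: one op_depth=0 row per path."""
    paths = ["A1", "A1/f.old", "A1/f", "B1", "B1/f", "B1/f.new"]
    return {(p, 0): {"presence": "normal", "deleted": False} for p in paths}


def in_subtree(path, root):
    return path == root or path.startswith(root + "/")


def delete_with_flag(rows, root):
    """Flag scheme: set a flag on the existing rows; add no rows."""
    for (path, depth), row in rows.items():
        if depth == 0 and in_subtree(path, root):
            row["deleted"] = True


def delete_with_presence(rows, root, op_depth=1):
    """Presence scheme: add one base_deleted row per deleted path."""
    for path, depth in list(rows):  # snapshot: we add keys while looping
        if depth == 0 and in_subtree(path, root):
            rows[(path, op_depth)] = {"presence": "base_deleted",
                                      "deleted": False}


flag_rows = checkout()
delete_with_flag(flag_rows, "A1")

presence_rows = checkout()
delete_with_presence(presence_rows, "A1")

print(len(flag_rows), len(presence_rows))
```

In this toy model the flag scheme keeps the original six rows, while the
presence scheme adds one row for each of the three deleted paths.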

Flag in layer beneath allows simpler reverting of the "add" half of a
replacement: (remove all op_depth=N rows that are children of this op)
rather than (convert all these rows to a different presence value that
depends on their presence in layer N-1). That's not considered an
important UI feature, but the fact that it can be done by a logically
simple operation is likely to result in simpler, less buggy code.
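
A sketch of the two revert procedures, starting from the A1 subtree as it
stands after the replacement in the last table. Everything here is
hypothetical: the dict layout, the function names, and in particular my
reading of "convert to a different presence value" as "base_deleted if the
path has a row in layer N-1, otherwise drop the row".

```python
def in_subtree(path, root):
    return path == root or path.startswith(root + "/")


# Flag scheme: (path, op_depth) -> (presence, deleted_flag)
flag_rows = {
    ("A1", 0): ("normal", True),
    ("A1/f.old", 0): ("normal", True),
    ("A1/f", 0): ("normal", True),
    ("A1", 1): ("normal", False),
    ("A1/f", 1): ("normal", False),
    ("A1/f.new", 1): ("normal", False),
}

# Presence scheme: (path, op_depth) -> presence
presence_rows = {
    ("A1", 0): "normal",
    ("A1/f.old", 0): "normal",
    ("A1/f", 0): "normal",
    ("A1", 1): "normal",
    ("A1/f.old", 1): "base_deleted",
    ("A1/f", 1): "normal",
    ("A1/f.new", 1): "normal",
}


def revert_add_flag_scheme(rows, root, n):
    """Drop every op_depth=n row under root; nothing else changes."""
    return {key: row for key, row in rows.items()
            if not (key[1] == n and in_subtree(key[0], root))}


def revert_add_presence_scheme(rows, root, n):
    """Rewrite each op_depth=n row depending on layer n-1."""
    result = {}
    for (path, depth), presence in rows.items():
        if depth != n or not in_subtree(path, root):
            result[(path, depth)] = presence
        elif (path, n - 1) in rows:
            # Path exists beneath: it must stay recorded as deleted.
            result[(path, depth)] = "base_deleted"
        # Otherwise the path existed only in the added layer: drop it.
    return result


flag_after = revert_add_flag_scheme(flag_rows, "A1", 1)
presence_after = revert_add_presence_scheme(presence_rows, "A1", 1)
```

Both results correspond to the "delete ./A1" state shown earlier; the flag
scheme gets there with a single unconditional delete, while the presence
scheme needs a per-row lookup in the layer beneath.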

Flag in layer beneath makes reverting the whole operation harder: two
layers need to be modified.

Flag in layer beneath is redundant when overridden by a higher layer, in
other words when a node is present in the WC at this path.

Flag in layer beneath copes with intentional deletion, but what about
'absent' and 'excluded' - it would be wise to be able to support them
too, and maybe we need 'not-present'? If so, it's a presence value
rather than a Boolean flag.

-----------------------

- Julian
Received on 2010-08-18 22:49:59 CEST
