Hyrum, Eric H., Philip M. and I met up at WANdisco's office in Sheffield
(England) yesterday and today. One of the things we discussed was
NODE_DATA.
We discussed several sub-topics; here are two of them. I've written
these up together with my further thoughts, which weren't part of the
discussion, so this account is biased.
-----------------------
Observation:
The new NODE_DATA table can completely subsume the old BASE_NODE and
WORKING_NODE tables.
BASE_NODE => NODE_DATA(op_depth==0),
WORKING_NODE => NODE_DATA(op_depth==max).
A few of the columns do not make sense at every op_depth:
translated_size and last_mod_time make sense only on the top-most node.
But putting these columns into this table seems better than keeping them
in a separate table.
  BASE_NODE                  NODE_DATA             WORKING_NODE
  ----------------           --------------        --------------
  Indexing             =>    yes + op_depth    <=  Indexing
  presence             =>    yes               <=  presence
  Node-Rev             =>    yes: original_*   <=  copyfrom_*
  Content              =>    yes               <=  Content
  Last-Change          =>    yes               <=  Last-Change
  translated_size      =>    TODO              <=  translated_size
  last_mod_time        =>    TODO              <=  last_mod_time
  file_external        x     no - obsolete
  dav_cache            =>    TODO
  incomplete_children  x     no - obsolete
                             no - not ready?   <=  moved_here
                             no - not ready?   <=  moved_to
                             no - obsolete     x   keep_local
By "TODO" I mean not yet in wc-metadata.sql.
By "not ready?" I mean we're not ready to fully define and use the
'moved_*' columns so it would be better to insert them with a WC format
upgrade when we are ready.
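
To make the mapping above concrete, here is a minimal sketch using Python's
sqlite3 module. The table below is a hypothetical, heavily cut-down
NODE_DATA (only indexing columns, op_depth and presence); the column
names, the presence strings and the helper names base_node() and
working_node() are assumptions for illustration, not what is in
wc-metadata.sql. It also assumes that a "working" row means the highest
op_depth greater than zero.

```python
import sqlite3

# Hypothetical, simplified NODE_DATA; the real schema has many more columns.
SCHEMA = """
CREATE TABLE node_data (
  wc_id          INTEGER NOT NULL,
  local_relpath  TEXT NOT NULL,
  op_depth       INTEGER NOT NULL,
  presence       TEXT NOT NULL,
  PRIMARY KEY (wc_id, local_relpath, op_depth)
);
"""

def base_node(db, wc_id, relpath):
    """Old BASE_NODE lookup: the row at op_depth == 0, if any."""
    return db.execute(
        "SELECT op_depth, presence FROM node_data"
        " WHERE wc_id = ? AND local_relpath = ? AND op_depth = 0",
        (wc_id, relpath)).fetchone()

def working_node(db, wc_id, relpath):
    """Old WORKING_NODE lookup: the row with the highest op_depth > 0."""
    return db.execute(
        "SELECT op_depth, presence FROM node_data"
        " WHERE wc_id = ? AND local_relpath = ? AND op_depth > 0"
        " ORDER BY op_depth DESC LIMIT 1",
        (wc_id, relpath)).fetchone()

# Illustrative data: A1 has been deleted locally, B1 is untouched.
db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany("INSERT INTO node_data VALUES (1, ?, ?, ?)", [
    ("A1", 0, "normal"),
    ("A1", 1, "base_deleted"),
    ("B1", 0, "normal"),
])
```

With this data, base_node(db, 1, "A1") finds the op_depth 0 row and
working_node(db, 1, "B1") finds nothing, which is the case where the old
schema had no WORKING_NODE row.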
-----------------------
[RFC] Instead of recording the "deleted" state as a presence value in
the topmost operative layer, consider whether to record it by means of a
flag in the next-nearest layer *beneath*.
This initially sounded appealing, but I'm not so sure now. It could
just be a premature optimisation side-track. It's certainly a way down
the list of Important Things.
Example sequence of operations, showing representation in both schemes:
Operation: clean checkout
                 Flag in layer beneath       Presence in topmost layer
  WC paths       op_depth=0   op_depth=1     op_depth=0   op_depth=1
  ------------   ----------   ----------     ----------   ----------
  A1/            norm                        norm
   +- f.old      norm                        norm
   +- f          norm                        norm
  B1/            norm                        norm
   +- f          norm                        norm
   +- f.new      norm                        norm
Operation: delete ./A1
                 Flag in layer beneath       Presence in topmost layer
  WC paths       op_depth=0   op_depth=1     op_depth=0   op_depth=1
  ------------   ----------   ----------     ----------   ----------
  A1/            norm Del!                   norm         base_deleted
   +- f.old      norm Del!                   norm         base_deleted
   +- f          norm Del!                   norm         base_deleted
  B1/            norm                        norm
   +- f          norm                        norm
   +- f.new      norm                        norm
Operation: copy ^/B1 to ./A1, replacing ^/A1
                 Flag in layer beneath       Presence in topmost layer
  WC paths       op_depth=0   op_depth=1     op_depth=0   op_depth=1
  ------------   ----------   ----------     ----------   ----------
  A1/            norm Del!    norm           norm         norm
   +- f.old      norm Del!                   norm         base_deleted
   +- f          norm Del!    norm           norm         norm
   +- f.new                   norm                        norm
  B1/            norm                        norm
   +- f          norm                        norm
   +- f.new      norm                        norm
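
The final state above can also be written down as data, which makes it
easy to check that both schemes describe the same working copy. This is
only a sketch: the dictionary encodings, the status strings and the
helper names (topmost, status_flag, status_presence) are made up for
illustration and do not correspond to anything in libsvn_wc.

```python
# State after "copy ^/B1 to ./A1", keyed by (path, op_depth).
# Flag scheme rows are (presence, deleted_flag); encodings are illustrative.
FLAG_SCHEME = {
    ("A1", 0): ("norm", True),
    ("A1", 1): ("norm", False),
    ("A1/f.old", 0): ("norm", True),
    ("A1/f", 0): ("norm", True),
    ("A1/f", 1): ("norm", False),
    ("A1/f.new", 1): ("norm", False),
    ("B1", 0): ("norm", False),
    ("B1/f", 0): ("norm", False),
    ("B1/f.new", 0): ("norm", False),
}
# Presence scheme rows are just the presence value.
PRESENCE_SCHEME = {
    ("A1", 0): "norm",
    ("A1", 1): "norm",
    ("A1/f.old", 0): "norm",
    ("A1/f.old", 1): "base_deleted",
    ("A1/f", 0): "norm",
    ("A1/f", 1): "norm",
    ("A1/f.new", 1): "norm",
    ("B1", 0): "norm",
    ("B1/f", 0): "norm",
    ("B1/f.new", 0): "norm",
}

def topmost(rows, path):
    """Return the row with the highest op_depth for path, or None."""
    depths = [d for (p, d) in rows if p == path]
    return rows[(path, max(depths))] if depths else None

def status_flag(path):
    """Flag scheme: a path is deleted if its topmost row carries the flag."""
    row = topmost(FLAG_SCHEME, path)
    if row is None:
        return "unversioned"
    _presence, deleted = row
    return "deleted" if deleted else "present"

def status_presence(path):
    """Presence scheme: a path is deleted if its topmost row says so."""
    row = topmost(PRESENCE_SCHEME, path)
    if row is None:
        return "unversioned"
    return "deleted" if row == "base_deleted" else "present"

PATHS = sorted({p for (p, _d) in PRESENCE_SCHEME})
```

Both status functions agree on every path in PATHS. In this small
example the flag scheme needs 9 rows against 10, because A1/f.old has no
op_depth=1 row; the flags on A1 and A1/f at op_depth=0 are the ones made
redundant by the rows above them.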
Flag in layer beneath doesn't require a final base_deleted row in the
topmost layer. The flag is required for (and only for) paths where a
row exists in the previous layer, so it seems more space-efficient to
store it there.
Flag in layer beneath allows simpler reverting of the "add" half of a
replacement: (remove all op_depth=N rows that are children of this op)
rather than (convert all these rows to a different presence value that
depends on their presence in layer N-1). That's not considered an
important UI feature, but the fact that it can be done by a logically
simple operation is likely to result in simpler, less buggy code.
Flag in layer beneath makes reverting the whole operation harder: two
layers need to be modified.
Flag in layer beneath is redundant when overridden by a higher layer, in
other words when a node is present in the WC at this path.
Flag in layer beneath copes with intentional deletion, but what about
'absent' and 'excluded' - it would be wise to be able to support them
too, and maybe we need 'not-present'? If so, it's a presence value
rather than a Boolean flag.
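
As a rough illustration of the point about reverting the "add" half of a
replacement, here is a sketch of both variants on the A1 subtree from
the example. Everything here is hypothetical (the dictionary layout, the
function names, and in particular the rule used for the presence scheme,
which only distinguishes "a row exists in layer N-1" from "no row
exists"); a real implementation would have to look at the actual
presence beneath.

```python
def is_under(path, root):
    """True if path is root itself or a descendant of root."""
    return path == root or path.startswith(root + "/")

def revert_add_flag(rows, root, depth):
    """Flag scheme: drop every row of this op; the layer beneath is untouched."""
    return {key: row for key, row in rows.items()
            if not (key[1] == depth and is_under(key[0], root))}

def revert_add_presence(rows, root, depth):
    """Presence scheme: each row of this op is rewritten or dropped,
    depending on whether the path has a row in layer depth - 1."""
    out = {}
    for (path, d), presence in rows.items():
        if d == depth and is_under(path, root):
            if (path, depth - 1) in rows:
                out[(path, d)] = "base_deleted"  # still hides the node beneath
            # otherwise the path existed only because of the add: drop the row
        else:
            out[(path, d)] = presence
    return out

# A1 subtree after "copy ^/B1 to ./A1", keyed by (path, op_depth).
FLAG_ROWS = {
    ("A1", 0): ("norm", True),
    ("A1", 1): ("norm", False),
    ("A1/f.old", 0): ("norm", True),
    ("A1/f", 0): ("norm", True),
    ("A1/f", 1): ("norm", False),
    ("A1/f.new", 1): ("norm", False),
}
PRESENCE_ROWS = {
    ("A1", 0): "norm",
    ("A1", 1): "norm",
    ("A1/f.old", 0): "norm",
    ("A1/f.old", 1): "base_deleted",
    ("A1/f", 0): "norm",
    ("A1/f", 1): "norm",
    ("A1/f.new", 1): "norm",
}

flag_after = revert_add_flag(FLAG_ROWS, "A1", 1)
presence_after = revert_add_presence(PRESENCE_ROWS, "A1", 1)
```

Both results match the A1 rows of the "delete ./A1" state shown earlier:
the flag variant gets there with a plain filter, while the presence
variant needs a per-row decision.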
-----------------------
- Julian
Received on 2010-08-18 22:49:59 CEST