[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PROPOSAL] WC-NG: merge NODE_DATA, WORKING_NODE and BASE_NODE into a single table (NODES)

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: Mon, 6 Sep 2010 10:30:51 +0200

Given all the responses in the thread, I'd say we're moving to the single
table for BASE and WORKING node recording. There was a flurry of activity
from me yesterday and this morning regarding NODE_DATA: that was just me
flushing my queue of patches.

The work isn't completely irrelevant, as it identifies the spots where the
NODES table will be introduced, just like NODE_DATA had to.

Today, I'll go to draw up the NODES table and move over the queries which
had already been modified for NODE_DATA over to the NODES design. I hope to
get a very long way today already. If it's not done today, then I expect it
to be able to finish it this week.

Anyone wanting to join in: let's chat on IRC.



On Thu, Sep 2, 2010 at 11:34 PM, Erik Huelsmann <ehuels_at_gmail.com> wrote:

> As described by Julian earlier this month, Julian, Philip and I observed
> that the BASE_NODE, WORKING_NODE and NODE_DATA tables have many fields in
> common. Notably, by introducing the NODE_DATA table, most fields from
> BASE_NODE and WORKING_NODE already moved to a common table.
> The remaining fields (after switching to NODE_DATA *and* SINGLE-DB) on the
> side of WORKING_NODE are the 2 cache fields 'translated_size' and
> 'last_mod_time'. Apart from those two, there are the indexing fields wc_id,
> local_relpath and parent_relpath.
> In the end we're storing *lots* of bytes (wc_id, local_relpath and
> parent_relpath) to store 2 64-bit values.
> On the side of BASE_NODE, we end up storing dav_cache, repos_id, repos_path
> and revision. The NODE_DATA table already has the fields original_repos_id,
> original_repos_path and original_revision. When op_depth == 0, these are
> guaranteed to be empty (null), since they are for working nodes with
> copy/move source information. Renaming the three fields in NODE_DATA to
> repos_id, repos_path and revision, generalizing their use to include
> op_depth == 0 [ofcourse nicely documented in the table docs], BASE_NODE
> would be reduced to a store of the dav_cache, translated_size and
> last_mod_time fields.
> By subsuming translated_size and last_mod_time into NODE_DATA, neither
> WORKING_NODE nor BASE_NODE will need to store these values anymore. This
> eliminates the entire reason of existence of WORKING_NODE. BASE_NODE then
> only stores dav_cache. Here too, it's probably more efficient (in size) to
> store dav_cache in NODE_DATA to prevent repeated storage of wc_id,
> local_relpath and parent_relpath in BASE_NODE.
> In addition to the eliminated storage overhead, we'd be making things a
> little less complex for ourselves: UPDATE, INSERT and DELETE queries would
> be operating only on a single table, removing the need to split updates
> across multiple statements.
> This week, I was discussing this change with Greg on IRC. We both have the
> feeling this should work out well. The proposal here is to switch
> (WORKING_NODE, NODE_DATA, BASE_NODE) into a single table --> NODES.
> Comments? Fears? Enhancements?
> Bye,
> Erik.
Received on 2010-09-06 10:31:51 CEST

This is an archived mail posted to the Subversion Dev mailing list.