RE: '@BASE keyword' vs. 'BASE database-tree' vs 'BASE conceptual-tree' (Was: RE: '@BASE' vs. 'BASE tree' -- was: Re: svn_wc__db_base_get_info() vs. svn_wc__db_read_info() ?)

From: Bert Huijben <bert_at_qqmail.nl>
Date: Fri, 29 Jan 2010 13:23:50 +0100

Julian asked me to write down some things about these trees and here is my attempt:

This doesn't make things easier, but I'm adding a third variant to the comparison:
* @BASE keyword
* BASE tree (in-db)
* conceptual BASE tree (in-documentation).

I started by replying to the mail quoted below, and then I try to explain how I look at the three trees on the different levels. I'm not sure this e-mail follows a logical path, but I hope it helps others form some ideas, which they can hopefully write down in a better way.

> -----Original Message-----
> From: Julian Foad [mailto:julianfoad_at_btopenworld.com]
> Sent: vrijdag 29 januari 2010 12:12
> To: Neels J Hofmeyr
> Cc: Greg Stein; dev_at_subversion.apache.org; Bert Huijben
> Subject: Re: '@BASE' vs. 'BASE tree' -- was: Re: svn_wc__db_base_get_info()
> vs. svn_wc__db_read_info() ?
>
> Neels J Hofmeyr wrote:
> [...]
> > I'm trying to say: Independently from wc-ng or its naming, I think that the
> > commandline keyword "@BASE" should mean "the thing I checked out".
> >
> > But using 'svn cat' and 'svn diff', I see that "@BASE" currently means
> > "the thing you checked out, unless this is a copy, in which case the thing
> > you copied". (!!!)

<snip>
> I'm not 100% sure which way I really want the "@BASE" notation to go.
> Before making up my mind I'd like to catalogue the current behaviour of
> all client commands and see if there is a clear majority.

As mentioned before:
Before Subversion 1.4 we didn't store the originally checked-out version of nodes that were replaced with history at all. And even in 1.6 we move these files to a specific location that is only used by the revert and commit code paths.

In my eyes, and most likely in the eyes of those who invented this notion at least 5 years ago, a normal user is not interested in the historic data of the file/directory he just replaced.

E.g. such a user does a
$ svn rm document
$ svn mv other-file document

<start editing file>
$ vim document

<do-other-work>

$ svn diff document

What does he expect to see on 'svn diff'?

What he gets now is a diff of 'document' compared to the checked-out version of the file at its original location ('other-file').

This is in all cases (1.0 through 1.6.x) exactly equivalent to:
svn cat file > a.now
svn cat file -r BASE > a.old
diff -u a.old a.now

And this is how most third-party tools implement the per-file diff command in a visual way (either by calling svn, or via the APIs, sometimes skipping one of the exports).
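
As a rough sketch of what such a tool might do once it has both texts in hand: the snippet below assumes the BASE text and the working text have already been obtained (for example via the two exports above) and only shows the comparison step. The function name `per_file_diff` and the labels are purely illustrative; nothing here is Subversion API.

```python
import difflib

def per_file_diff(base_text, working_text, name="file"):
    """Return a unified diff between the BASE text and the working text."""
    old_lines = base_text.splitlines(keepends=True)
    new_lines = working_text.splitlines(keepends=True)
    return "".join(difflib.unified_diff(
        old_lines, new_lines,
        fromfile=name + " (BASE)", tofile=name + " (working copy)"))

# Hypothetical contents standing in for a.old and a.now.
print(per_file_diff("hello\nworld\n", "hello\nsubversion\n"))
```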

I don't think we can just change this now merely because we like the new tree concept better than the old concept of just looking at the original version. (-r BASE was introduced long before we even talked about 1.0.)

By thinking about changes in presence instead of thinking as a user of the svn commands (aka thinking the WC-NG way) we start to see things differently, but I don't think our users are really switching with us here.

And this brings us back to explaining the difference between the three trees in WC-NG. (And like all our current documentation, this will be incomplete again... but maybe someone can integrate the relevant portions into the documentation later.)

Within WC-NG we have three trees, which all live both in our database and at the conceptual level:

* BASE - The checked out version of everything
* WORKING - Overlays applied over BASE. This tree contains certain aspects normally recorded in BASE for added nodes and local transformation.
* ACTUAL - Contains local changes.

In our wc.db database you really see these three as tables storing the real data, but in our current WC-NG documentation we talk more about the conceptual trees, which also contain information derived from the data in the tables above them.

When you work on a portion of our code you usually look at the trees with a limited view.

* svn update/checkout/switch
These commands only look at the BASE tree (and update your working copy and ACTUAL with 'relevant changes').

* svn cp/mv/add/rm
These commands look at the current version of the working copy (based on BASE overlaid with WORKING) and apply changes to WORKING (and update your working copy and ACTUAL with 'relevant changes').

* svn propset/changelist/resolve
These commands only update the ACTUAL tree.

* svn commit
This command collapses the ACTUAL and WORKING tree + your 'relevant' working copy state back into the BASE tree.

* svn merge
This is a combination of the cp and propset cases. It updates only WORKING and ACTUAL.
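
The list above can be summarized as a small lookup table. This is only a sketch of my reading of the list, not something taken from the WC-NG code: the names `WRITES` and `commands_modifying` are made up for illustration, and the entries for update (which modifies BASE itself) and commit (which empties WORKING and ACTUAL as it collapses them) are interpretations.

```python
# Trees each command group modifies, following the list above.
# Assumption: update/checkout/switch also modify BASE itself, and commit
# modifies all three trees because it collapses WORKING and ACTUAL into BASE.
WRITES = {
    "update/checkout/switch": {"BASE", "ACTUAL"},
    "cp/mv/add/rm": {"WORKING", "ACTUAL"},
    "propset/changelist/resolve": {"ACTUAL"},
    "commit": {"BASE", "WORKING", "ACTUAL"},
    "merge": {"WORKING", "ACTUAL"},
}

def commands_modifying(tree):
    """Return the command groups that modify the given tree, sorted by name."""
    return sorted(cmd for cmd, trees in WRITES.items() if tree in trees)

print(commands_modifying("WORKING"))
```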

To determine what the 'relevant changes' are, we always need information from the WORKING tree and/or the BASE tree. The WORKING tree describes whether we have to look further than just the added node (which only lives there) or whether we have enough information by just looking at WORKING.

The documentation and implementation of the trees differ in that conceptually all relevant information of the BASE tree is also available in WORKING.

In the conceptual case the WORKING tree is the BASE data overlaid with WORKING. In the physical database and our API implementation you just have to look a bit further.
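
A minimal sketch of that "look a bit further" step, using the rm/mv example from earlier in this mail. The dictionaries stand in for the BASE and WORKING tables; the function name `read_node` and the field names and values (`presence`, `copied_from`, `deleted`) are illustrative assumptions, not the real wc.db schema.

```python
def read_node(path, base, working):
    """Conceptual WORKING view: a WORKING row wins, otherwise fall back to BASE."""
    if path in working:
        return working[path]
    return base.get(path)  # None if the path is in neither tree

# State after: svn rm document; svn mv other-file document
base = {
    "document": {"presence": "normal", "revision": 10},
    "other-file": {"presence": "normal", "revision": 10},
    "unchanged": {"presence": "normal", "revision": 10},
}
working = {
    "document": {"presence": "normal", "copied_from": "other-file"},
    "other-file": {"presence": "deleted"},
}

for path in ("document", "other-file", "unchanged"):
    print(path, read_node(path, base, working))
```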

Another way of looking at WORKING is that it describes the tree changes needed to transform BASE into what is really in your working copy.

A consequence is that you can describe all the real tree conflict cases on update by just looking at WORKING when updating BASE. (Edited is the only exception, as that doesn't have to be recorded in WORKING. So you have to check ACTUAL for that case). The merge tree conflicts are between WORKING and ACTUAL.

And following up on yet another mail...
Maybe we should call the conceptual WORKING tree (the BASE tree overlaid with WORKING) the PRISTINE tree.

After all, that is the tree you usually update yourself by using the working copy commands (add/cp/rm).

That would also map the existing @BASE keyword exactly to this 'conceptual' tree ;-)

One last thing... Somebody suggested using the term 'Schedule-tree'. I don't think that 'schedule' catches how things work.

1. Within wc-1.0 we tried to encapsulate all tree changes in a single 'schedule' enum value. This didn't work without adding more booleans (copied, deleted), and even then we were missing state (e.g. the node kind before replacing a file with a directory).

2. In the WC-NG trees we describe the current state/presence (BASE: before, WORKING: transformation, ACTUAL: local-changes) and we calculate the changes from there.

The schedule is VERY 'real' in the WC-1.0 entries store, but only a calculated value in WC-NG. 'Schedule' could maybe apply to the conceptual overlay, but not really to the in-database tree that stores presence information.
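
To illustrate what "only a calculated value" could mean, here is a small sketch that derives a wc-1.0 style schedule from what BASE and WORKING record for one path. The function name `derive_schedule`, the `presence` field and its values are assumptions made for this example; the real calculation has to cover more cases.

```python
def derive_schedule(base_node, working_node):
    """Derive a wc-1.0 style schedule for one path from its BASE and WORKING rows."""
    if working_node is None:
        # Nothing recorded in WORKING: no tree change for this path.
        return "normal"
    if working_node["presence"] == "deleted":
        return "delete"
    # A present WORKING node either replaces a BASE node or is a plain add.
    return "replace" if base_node is not None else "add"

print(derive_schedule({"presence": "normal"}, {"presence": "normal"}))
```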

Bert
Received on 2010-01-29 13:24:29 CET
