
Re: Working Copy (NG) Data Model

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Tue, 16 Dec 2008 13:06:34 +0000


While taking a bath this morning, a thought crystallized in my mind on
this subject.

In WC1, when a WC node is scheduled R+ (replacement with history), the
concept of "Base" is being hijacked to provide a place to store the
pristine copy of the replacement, and the real base is RENAMED to
"Revert Base". That's all wrong: the concept of "the base" should ALWAYS
mean "the base that we would revert to".

The extra thing being stored in an R+ situation is a pristine copy of
the copy-from source. It is not "the base". We stored it as if it were
"the base" so that we would not have to change the APIs or the client in
order to make the default client's default "diff" command perform a diff
against this copy.

So the new WC should store, for each node:

  * Base:
    - repos URL@REV
    - cached text and props (if caching enabled)

  * Copy-from source (present only on a node scheduled A+/R+):
    - repos URL@REV
    - cached text and props (if caching enabled)
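
To make the proposed per-node storage concrete, here is a minimal sketch. All names (Pristine, NodeRecord, revert_target, diff_reference) are hypothetical and are not taken from wc_db.h or any existing Subversion API. It also assumes that a node added without replacing anything has no base at all, which the list above does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pristine:
    """A repository location plus optionally cached content (hypothetical)."""
    url: str
    rev: int
    text: Optional[bytes] = None   # None when caching is disabled
    props: Optional[dict] = None   # None when caching is disabled


@dataclass
class NodeRecord:
    """Per-node storage following the list above (illustrative names only)."""
    base: Optional[Pristine]              # assumption: None for a plain add
    copy_from: Optional[Pristine] = None  # set only when scheduled A+ or R+

    def revert_target(self) -> Optional[Pristine]:
        # "The base" always means what revert restores, even under R+.
        return self.base

    def diff_reference(self) -> Optional[Pristine]:
        # The default diff compares against the copy-from source when one
        # exists, otherwise against the base.
        return self.copy_from if self.copy_from is not None else self.base
```

With this shape, an R+ node keeps its real base untouched and the copy-from pristine sits next to it, so nothing needs to be renamed to "Revert Base".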

Basically I am saying it does not make sense to use the label "working"
for copy-from sources, and we should define:

  * the "base" tree is the tree that we would revert to;

  * the "copy-from" tree is sparsely populated, not really a "tree" as
such but just an (often empty) set of nodes;

  * the "working" tree is what the user works with, using whatever
combination of Subversion commands and OS commands the client requires.
I think this corresponds to the current 'svn_opt_revision_working'.
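
As a rough illustration of how the three definitions interact, the sketch below models each tree as a plain mapping from path to content and shows what a revert would do under them. The revert() helper and the sample paths are hypothetical, and this is not how any existing WC code is written.

```python
from typing import Dict


def revert(path: str,
           base: Dict[str, str],
           copy_from: Dict[str, str],
           working: Dict[str, str]) -> None:
    """Put `path` back to its base state (hypothetical helper)."""
    copy_from.pop(path, None)          # the copy-from entry no longer applies
    if path in base:
        working[path] = base[path]     # the base is what we revert to
    else:
        working.pop(path, None)        # no base: the node was a pure addition


# a.c has been replaced by a copy of b.c (R+) and then edited by the user.
base = {"a.c": "original a", "b.c": "original b"}
copy_from = {"a.c": "original b"}      # sparse: only the one A+/R+ node
working = {"a.c": "original b, edited", "b.c": "original b"}

revert("a.c", base, copy_from, working)
```

After the call, working["a.c"] is back to the base content and the copy-from set is empty again, which is the "often empty" case.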

(Previously when I said "we only need two trees" I was ignoring the
copy-from sources, thinking that the WC should not need to be involved
in presenting them as a "tree" concept at all.)

- Julian

On Wed, 2008-12-10 at 16:42 -0800, Greg Stein wrote:
> After some further reflection, and a bit of IRC conversation with
> hwright, I've also concluded the three tree model is a systemic
> implementation artifact. ACTUAL only exists because we expect non-svn
> commands to edit file contents, and that our expectations can be
> monkeyed by OS-level commands like "rm".
>
> As a result, only certain changes can be made to these trees (which
> shows as a diff between them).
>
> First, let me note that I meant "directory contents" below, rather
> than just structure. It also includes moving files.
>
> So. First, you cannot modify WORKING file contents or properties.
> Those are pristine versions. Either the same as BASE, or whatever is
> given by an add-with-history. The only changes are to the directory
> contents, as performed by "svn rm" and the like.
>
> Between WORKING and ACTUAL, you cannot modify directories. If you do,
> this usually results in things like missing or obstructed items. Only
> file contents can be changed thru editing, or props via svn commands.
>
> The commit process rolls these all together.
>
> The wc_db API will see quite a few changes, based on this.
>
> Cheers,
> -g
>
> On 2008-12-08, Greg Stein <gstein_at_gmail.com> wrote:
> > Hey all,
> >
> > In short: Julian was right. He just couldn't explain/justify why :-)
> >
> > Longer answer:
> >
> > Today, we have been looking at the WC as comprised of three trees,
> > each holding the following information:
> >
> > BASE (the pristine stuff we got from the server)
> > contains: directory structure, file contents, properties
> >
> > WORKING (changes made in WC (administratively))
> > contains: directory structure, properties
> >
> > ACTUAL (edits or non-svn changes)
> > contains: directory structure, file contents
> >
> > Above is both the model, and a reflection of how we store/manage the
> > various trees. Julian argued that all the trees should contain all
> > three items, but since *empirically* we built the code as above, it
> > was hard to justify why that was anything more than being pedantic;
> > it seemed we didn't need to look at the trees as all fully formed.
> >
> > However, we *do* have code for all parts, for all three trees. It was
> > just hard to *see* because it kind of falls under different names.
> >
> > So. First, let's just say that "svn propset" and friends modify
> > properties in the ACTUAL tree, and ignore the fact that they're stored
> > in administrative bits. Or that you can't really use non-svn commands
> > to modify them. That shifts the properties down from WORKING into
> > ACTUAL. So now WORKING just has "directory structure", as modified by
> > things like "svn add/delete/copy/move". Let's now add file contents
> > and properties to WORKING, with the understanding that they are
> > *usually* the same as the values in BASE.
> >
> > Now think about what happens when a file or tree is in "replaced"
> > state. The file contents in WORKING need to correspond to the *new*
> > node's pristine text. In the current code, we call that the "revert
> > base". Same applies to "revert properties".
> >
> > The file contents in WORKING are part of the crazy code we have right
> > now. If you add with history, then we store the contents as if they
> > were part of BASE. But if there *was* a base (the add is replacing),
> > then only in that case do we switch to storing as the "revert base".
> > This flow of concepts and storage between the BASE and WORKING trees
> > is one of the things making the code all wonky.
> >
> > So with this new realization, I think it will be important to revisit
> > the wc_db.h APIs and the SQL schema. Make sure that we can properly
> > model the three trees and how they get used and stored. I'm still
> > messing with some temp file stuff and path assumptions in the code, so
> > I'm not worrying about it just yet. Will probably be important for
> > hwright's properties work though.
> >
> > Cheers,
> > -g

Received on 2008-12-16 14:08:59 CET

This is an archived mail posted to the Subversion Dev mailing list.