Re: Understanding clones and the file system

From: Ben Collins-Sussman <sussman_at_collab.net>
Date: 2001-08-01 22:55:52 CEST

<peter.westlake@arm.com> writes:

> 1. Name of the root directory
>
> Does the root directory (node) have a name?

Yes, it has a name: "/".

It's unremovable, and Revision 0 is defined to contain nothing but
this root directory.

> /trunk/paint/Makefile
>              canvas.c
>              brush.c
>        write/Makefile
>              document.c
>              search.c
>
> So, if this was in a repository called R, and I checked out everything,
> would I get R/trunk/...?

In the example, 'trunk' is a directory that is an immediate child of
'/'.

If the repository exists at http://foo.com/svn/repo, and I do a

$ svn checkout http://foo.com/svn/repo -d wc

I would get a new working copy called 'wc':

     wc/trunk
     wc/trunk/paint
     ...

If I were to check out a subdirectory like this:

$ svn checkout http://foo.com/svn/repo/trunk/paint -d wc

My working copy would be

     wc/Makefile
     wc/canvas.c
     ...

If I were to leave off the '-d' argument to checkout, the working copy
would take its name from the last component of the URL:

$ svn checkout http://foo.com/svn/repo/trunk/paint

     paint/Makefile
     paint/canvas.c
     ...
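
To make the three cases above concrete, here is a minimal sketch in
Python of how a repository path maps to a working-copy path. The
function names (working_copy_root, working_copy_path) are made up for
this illustration and are not part of svn; the sketch only restates the
naming rule shown in the examples.

```python
import posixpath
from urllib.parse import urlparse


def working_copy_root(url, dest=None):
    """Return the directory name a checkout would create.

    Hypothetical helper: an explicit destination (the '-d' argument)
    wins; otherwise the last component of the URL path is used.
    """
    if dest is not None:
        return dest
    path = urlparse(url).path.rstrip("/")
    return posixpath.basename(path)


def working_copy_path(url, repo_path, dest=None):
    """Map a path below the checked-out URL to a path in the working copy."""
    return posixpath.join(working_copy_root(url, dest), repo_path)


# The three checkouts from the examples above.
print(working_copy_path("http://foo.com/svn/repo", "trunk/paint", dest="wc"))
print(working_copy_path("http://foo.com/svn/repo/trunk/paint", "Makefile", dest="wc"))
print(working_copy_path("http://foo.com/svn/repo/trunk/paint", "Makefile"))
```

This prints wc/trunk/paint, wc/Makefile and paint/Makefile, matching
the listings above.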

> 2. Branching
>
> Is the following a correct summary?
>
> Suppose you make a clone, like T in the tuna example. If you want to use it
> as a branch, check out T like any other directory, make the changes, and
> check in. The bubble-up algorithm moves up the node DAG along a path
> specified by the working copy. In other words, if F is changed, a new copy
> of F is made. Then a new copy of tuna is made, pointing to the new F. Then
> a new copy of fish is made, pointing to the new tuna, and a new root is
> made, pointing to the new fish. At this point both A and T in the old root
> point to the original fish, and we need to decide what values they should
> have in the new root. Because the parent of fish in the working copy is T,
> the entry for T in the new root points to the new fish. Because A is not on
> the path from F to the root of the working copy, the entry for A is
> unchanged. In effect, this works by keeping the bubble-up algorithm
> blissfully ignorant of the fact that its tree is really a DAG, which
> strikes me as particularly elegant in some way I can't quite put my finger
> on :-)

Yep, your explanation seems correct. Careful about terminology,
though: 'F' is a file node, which is named 'tuna' within a particular
directory. It might be named something else in a different version of
that parent directory. Names don't exist on file nodes, they exist in
directory entries... which is why those diagrams might be a bit
confusing at first. The big square blocks represent either directory
('D') or file ('F') nodes. The names of those nodes live in the
parent. The topmost node in the diagram is the '/' node.

Jim Blandy is the one who deserves the credit for hiding the DAG
behind a bubble-up tree interface. We think it's pretty neat too. :-)
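
For anyone who prefers code to diagrams, here is a rough Python sketch
of the bubble-up idea. It is not how the filesystem is implemented; the
File and Dir classes, the bubble_up function, and the exact layout
(root entries A and T sharing one directory, which contains 'fish',
which contains 'tuna') are assumptions made for the illustration. The
points it tries to show are that names live in directory entries and
that only the nodes along the committed path get copied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    content: str


@dataclass(frozen=True)
class Dir:
    # Names live here, in the parent's entries, not on the child nodes.
    entries: tuple = ()

    def get(self, name):
        return dict(self.entries)[name]

    def with_entry(self, name, node):
        """Return a copy of this directory with one entry replaced or added."""
        updated = dict(self.entries)
        updated[name] = node
        return Dir(tuple(updated.items()))


def bubble_up(directory, path, new_node):
    """Return a new directory in which `path` leads to `new_node`.

    Only the directories along `path` are copied; every other node is
    shared with the old tree.
    """
    name, rest = path[0], path[1:]
    if not rest:
        return directory.with_entry(name, new_node)
    child = directory.get(name)
    return directory.with_entry(name, bubble_up(child, rest, new_node))


# Assumed layout: A and T are two entries pointing at the same node.
old_file = File("original")
fish = Dir((("tuna", old_file),))
shared = Dir((("fish", fish),))
old_root = Dir((("A", shared), ("T", shared)))

# Commit a change made in a working copy checked out from T.
new_root = bubble_up(old_root, ("T", "fish", "tuna"), File("changed"))

assert new_root.get("A") is shared        # A is off the path: untouched
assert new_root.get("T") is not shared    # T now points at the new copy
assert old_root.get("T") is shared        # the old root is unchanged
print(new_root.get("T").get("fish").get("tuna").content)
```

Note that bubble_up only ever walks one path of names, so it never
needs to know that two entries share a node.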

> Because both the original entry and the clone start out pointing to the
> same node, there is little (no?) difference between them. This means there
> isn't a notion of "trunk" inherent in the file structure. The only place it
> shows up is in the node revision ids. Is that right?

True. There are no branches or tags; only cheaply cloned directories.
If you clone a directory and don't write to it, it's semantically
equivalent to a "tag". If you clone a directory and write to it, it's
semantically equivalent to a "branch". But there's no inherent system
of "branch" hierarchy in the tree structure. From the filesystem's
point of view, everything is just a directory, plain and simple.
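
As a toy illustration of that point (in Python, with plain dicts
standing in for directory nodes, and with made-up names such as
'release-1.0' and 'experiment'), a "tag" and a "branch" are created by
exactly the same clone operation. The only difference is whether
anybody writes to the clone afterwards. The clone and write functions
here are hypothetical, and the dicts are treated as immutable: every
change builds a new dict.

```python
def clone(root, src, dst):
    """Return a new root with entry `dst` pointing at the node `src` points to."""
    new_root = dict(root)
    new_root[dst] = root[src]   # no copying of the subtree, just a new entry
    return new_root


def write(root, name, filename, content):
    """Return a new root where `name/filename` holds `content` (copy-on-write)."""
    new_dir = dict(root[name])
    new_dir[filename] = content
    new_root = dict(root)
    new_root[name] = new_dir
    return new_root


r1 = {"trunk": {"Makefile": "all: paint"}}
r2 = clone(r1, "trunk", "release-1.0")   # used as a "tag": never written
r3 = clone(r2, "trunk", "experiment")    # used as a "branch": written below
r4 = write(r3, "experiment", "Makefile", "all: paint write")

assert r4["release-1.0"] is r4["trunk"]      # the tag still shares trunk's node
assert r4["experiment"] is not r4["trunk"]   # the branch has diverged
assert r1["trunk"]["Makefile"] == "all: paint"
```

Nothing in the data says which entry is the tag and which is the
branch; that is purely a matter of how people use them.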

Node revision ids, as you say, are the magic means of determining
whether nodes are 'related' to one another. When we clone a node
(during the bubble-up process), we choose a new node-rev-id that
indicates it's a descendant.

> 3. Where are clones created?
>
> In the example, the clone entry T appeared in the same node as its original
> A, making it a sibling directory of A. Is this the only place T could have
> been put, or could it go anywhere that wouldn't create a cycle?

Exactly -- it could go anywhere at all, as long as it stays acyclic.
T is just a plain old directory entry that points to a specific,
immutable version of some directory node. This stuff works because
we depend heavily on the *immutability* of all nodes once they're
part of a revision tree.
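
Here is a small Python sketch of the "anywhere, as long as it stays
acyclic" rule. It is only an illustration: the dicts are mutable
stand-ins for directory nodes, strings stand in for file nodes, and the
reaches and add_entry functions are invented for this example rather
than taken from the filesystem code.

```python
def reaches(node, target):
    """True if `target` is `node` itself or is reachable below it."""
    if node is target:
        return True
    if isinstance(node, dict):
        return any(reaches(child, target) for child in node.values())
    return False


def add_entry(parent, name, node):
    """Add a directory entry, refusing any entry that would form a cycle."""
    if reaches(node, parent):
        raise ValueError("entry would make the directory graph cyclic")
    parent[name] = node


fish = {"tuna": "F"}
a = {"fish": fish}
root = {"A": a}

# A clone can sit next to its original...
add_entry(root, "T", a)
assert root["T"] is root["A"]

# ...but not underneath it, since 'fish' is reachable from 'a'.
try:
    add_entry(fish, "up", a)
    cycle_refused = False
except ValueError:
    cycle_refused = True
assert cycle_refused
```

The check walks down from the node being linked and looks for the
would-be parent, which is all "acyclic" means here.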

Look at the very last section of the "Future" chapter in the design
doc. The issue of how to present clones to the user is tricky; we've
been putting it off for a year, and it's the Big Issue we need to
tackle when we move from Milestone 3 to Alpha. Very soon!

One proposal on the table is to "encourage" administrators to use a
policy of laying out the repository like this:

   /trunks/proj1
          /proj2
   /branches/proj1
            /proj2
   /tags/proj1
        /proj2

Or possibly, they could have a structure like this:

   /proj1/trunk/
          branches/
          tags/
   /proj2/trunk/
          branches/
          tags/

It might be a Good Thing to encourage such policies. Otherwise,
you'll have to pray that users are *really good* about choosing
descriptive names for directories that are 'clones' and meant to
behave as branches or tags. The weight of interpreting a directory as
a "tag" or "branch" rests on the users' shoulders, because there ain't
no such things from the filesystem's point of view. :-)

Just food for thought.

Received on Sat Oct 21 14:36:34 2006
