> Our starting point was that there's nothing wrong with the CVS model
> (in which directories are indeed first-class objects in the
> repository, although not versioned enough unfortunately).
They are indeed first-class. In some sense my question is whether this is a
bug or a feature. I can make a case both ways, but I claim that in the
majority of situations the containership model isn't what I want to use to
capture the *intent* of the workspace.
> CVS is -- well, wants to be -- a system for versioning directory
> trees. This may be less general than versioning arbitrary namespaces,
> but who's to say we'd make the right "cuts" if we tried to cover many
> models instead of one model... Versioning directory trees is a very
> useful thing, even if not the only thing.
This argument sounds a bit backwards to me. It seems to me that if I want
confidence that I'm making the right "cuts" then I want to have a general
model for versioning namespaces in which the case of explicit directories
is merely one case. I might or might not implement the general model, but
I'd start there.
> We did consider a scheme exactly like the one you describe, where the
> directory hierarchy is implicit in the names, and much pattern
> matching happens. We decided against it for efficiency reasons.
That's interesting, because I *adopted* the approach I did for efficiency
reasons. The pattern matching case is exceptionally rare. When it does come
up, it is almost always of the form "take things matching this leading
substring from that branch and version", and even that is almost never done.
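As an illustration of the leading-substring case, here is a minimal sketch. It assumes a configuration is a flat mapping from full path names to object ids; the function name `select_by_prefix`, the sample paths, and the ids are hypothetical and not taken from DCMS.

```python
def select_by_prefix(config, prefix):
    """Return the entries of a flat name -> object-id mapping whose
    names start with `prefix`. No directory objects are consulted;
    the hierarchy is implicit in the names."""
    return {name: oid for name, oid in config.items()
            if name.startswith(prefix)}


# Hypothetical configuration for one branch: names map to content ids.
branch = {
    "src/main.c": "a1",
    "src/util/str.c": "b2",
    "doc/README": "c3",
}

# "Take things matching this leading substring from that branch."
picked = select_by_prefix(branch, "src/")
```

The selection is a single pass over the member list of one configuration, which is why the rare pattern-matching case need not cost anything in the common case.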
The reason I did NOT do directories as first class objects is that I wanted
an updated version of a file to result in modification to only one "content"
object in the repository (as opposed to a metadata object such as the object
that holds the members of a given configuration). Part of the reason is that
I wanted the system to be robust in the face of connection failure. In DCMS,
it should be possible to get halfway through the upload of a commit
consisting of several new file states, lose the connection before making it
to the closing transaction (actually a compare and swap), and restart
without needing to upload the successfully transferred files again. If
directories need to be updated, then we face the problem of not knowing what
state they are in when this line loss occurs, given which it seems that we
sort of have to assume that they are all wrong and upload them all again. I
decided that directories were emergent rather than essential, and ditched
them.
It's clear that DCMS is going to need a container model. It's just not clear
how widely it should be used in the versioning logic.
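The restartable commit described above can be sketched roughly as follows. This is an assumption-laden illustration, not DCMS code: the `Repo` class, naming content objects by SHA-1 hash, and the `fail_after` knob used to simulate a dropped line are all hypothetical.

```python
import hashlib


class Repo:
    """Hypothetical in-memory stand-in for the repository server."""

    def __init__(self):
        self.objects = {}   # object id -> bytes
        self.head = None    # id of the current configuration object

    def has(self, oid):
        return oid in self.objects

    def put(self, data):
        oid = hashlib.sha1(data).hexdigest()
        self.objects[oid] = data
        return oid

    def compare_and_swap(self, expected, new):
        """Publish `new` only if the head is still `expected`."""
        if self.head != expected:
            return False
        self.head = new
        return True


def commit(repo, files, expected_head, fail_after=None):
    """Upload each new file state, then publish with one compare and swap.
    Returns the number of content objects actually transferred."""
    uploaded = 0
    members = {}
    for name, data in sorted(files.items()):
        oid = hashlib.sha1(data).hexdigest()
        if not repo.has(oid):            # already transferred? skip it
            if fail_after is not None and uploaded >= fail_after:
                raise ConnectionError("line lost")  # simulated failure
            repo.put(data)
            uploaded += 1
        members[name] = oid
    # The only metadata object: the member list of the configuration.
    config_id = repo.put(repr(sorted(members.items())).encode())
    if not repo.compare_and_swap(expected_head, config_id):
        raise RuntimeError("head moved; retry against the new head")
    return uploaded
```

Because each file state touches exactly one content object, a restart only has to ask which objects are already present. With directory objects in the picture, the client would also have to work out which directory states had been written before the line dropped.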
Where do you see the efficiency issue? I'm clearly missing something.
Received on Sat Oct 21 14:36:05 2006