Bill, I'm sort of losing track of exactly what's being proposed and
what its goals are, at this point.
A *really* helpful thing would be a fresh mail, that begins with a
list of the problems you're trying to solve, including a small
concrete example for each problem. These should only be problems that
exist in the current filesystem implementation, not problems in prior
incarnations of various proposals. Describe solutions only after the
problems are presented.
For example, you seem to want to treat branches as different from
copies, but that's only something I've inferred slowly from your
mails. Maybe there's a reason to do so, but right now branches and
copies are just two words for exactly the same thing in Subversion.
There's no difference, and distinguishing between them would be a
major design change [read: unlikely before 1.0 :-), though if failure
to distinguish leads to horrible problems, then of course we need to
rethink]. Until we understand whether or not this is a goal, and if
it is then why, it's hard to evaluate or even think clearly about the
proposals.
Just to be clear: I really, really appreciate that you're bringing
years of db experience and schema design to bear on Subversion's
problems, and I want us to benefit from it. But without clear
communication, there's no way we'll grok the wonderful things going on
in your head :-).
Btw: I know it's much easier for you to speak in generic RDB language,
and that you look forward to the day when Subversion uses a real
relational database backend... But since we are using Berkeley DB right
now, and will be in 1.0 as well, it would help a lot to talk about how
things would be implemented in bdb specifically.
Right now, the goals (as I understand them) of any new scheme are,
roughly in order of priority:
1. To avoid unbounded NodeID length on certain directories.
See http://subversion.tigris.org/issues/show_bug.cgi?id=573
2. To eliminate the pre-commit stabilization walk that currently
unsets the mutability flags of new nodes. This is the same as
saying "make commit be a one-row insert" in RDB-speak, I guess.
3. To preserve the ability to ask is_related(NodeA, NodeB) and
is_ancestor(NodeA, NodeB) and get the answer in a reasonable
amount of time; oh, and the same with get_predecessor(NodeA),
except that there might be multiple predecessors and we need to
think more on that.
4. To preserve the ability to get_copy_history(NodeA). Yes, this
is possible in the current scheme, even though the actual copy
history property is set only on the top node of the copied tree and
only in the revision that the copy occurred in. We record a
small bit of information in a single place, but it affects
answers about many other places. Right now,
"get_copy_history()" means the same as "get_branch_history()".
5. To avoid the silly non-deterministic migration of NodeID portions
depending on which side of a copy gets modified first. Think of
this as a maintainability fix -- we need things to be
predictable.
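To make goal 3 concrete, here is a rough sketch of the three queries, assuming (purely for illustration) that each node simply records a list of its immediate predecessors. The dictionary layout, the function names, and the argument order of is_ancestor (first argument is the candidate ancestor) are all assumptions, not the current filesystem's schema. It also walks history naively, so it says nothing about whether the answer arrives "in a reasonable amount of time".

```python
from typing import Dict, List, Set

# Hypothetical model: node id -> ids of its immediate predecessors.
# A node may have more than one predecessor, as noted in goal 3.
Predecessors = Dict[str, List[str]]


def get_predecessors(preds: Predecessors, node: str) -> List[str]:
    """Return the immediate predecessors of node (empty if it has none)."""
    return preds.get(node, [])


def ancestors(preds: Predecessors, node: str) -> Set[str]:
    """Return every node reachable by following predecessor links."""
    seen: Set[str] = set()
    stack = list(get_predecessors(preds, node))
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(get_predecessors(preds, cur))
    return seen


def is_ancestor(preds: Predecessors, a: str, b: str) -> bool:
    """True if a is a strict ancestor of b (assumed argument order)."""
    return a in ancestors(preds, b)


def is_related(preds: Predecessors, a: str, b: str) -> bool:
    """True if a and b are the same node or share any common history."""
    line_a = ancestors(preds, a) | {a}
    line_b = ancestors(preds, b) | {b}
    return bool(line_a & line_b)
```

With preds = {"n2": ["n1"], "n3": ["n1"]}, nodes n2 and n3 are related through n1, but neither is an ancestor of the other.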
Note that throughout this mail, "branch" means branch in the
user-visible sense, not in the fs-internal-nodeID sense.
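The point in goal 4, that one small record on the top node of a copy answers questions about everything beneath it, can be sketched like this. The record table keyed by (revision, path) and the function name find_copy_source are hypothetical, made up for the example; it only looks within the revision where the copy happened and is not how the fs actually stores or walks copy history.

```python
import posixpath
from typing import Dict, Optional, Tuple

# Hypothetical table: (revision, path of copy root) -> (source revision, source path).
# Only the top node of a copied tree gets an entry.
CopyRecords = Dict[Tuple[int, str], Tuple[int, str]]


def find_copy_source(records: CopyRecords, rev: int,
                     path: str) -> Optional[Tuple[int, str]]:
    """Find where path came from if it, or any parent directory, was copied in rev."""
    cur = path
    suffix = ""  # the part of path below the directory currently being checked
    while True:
        rec = records.get((rev, cur))
        if rec is not None:
            src_rev, src_path = rec
            # Re-attach the part of the path that lies below the copy root.
            return src_rev, (src_path.rstrip("/") + suffix) or "/"
        if cur in ("/", ""):
            return None  # reached the root without finding a copy record
        suffix = "/" + posixpath.basename(cur) + suffix
        cur = posixpath.dirname(cur)
```

For example, with a single record saying /branches/b1 was copied from /trunk at revision 4, asking about /branches/b1/src/main.c in the copy revision yields (4, "/trunk/src/main.c"), even though nothing was recorded on that file itself.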
-Karl
Received on Wed May 8 17:54:59 2002