Re: Maintaining NodeID sanity

From: Karl Fogel <kfogel_at_newton.ch.collab.net>
Date: 2002-05-08 17:54:36 CEST

Bill, I'm sort of losing track of exactly what's being proposed and
what its goals are, at this point.

A *really* helpful thing would be a fresh mail, that begins with a
list of the problems you're trying to solve, including a small
concrete example for each problem. These should only be problems that
exist in the current filesystem implementation, not problems in prior
incarnations of various proposals. Describe solutions only after the
problems are presented.

For example, you seem to want to treat branches as different from
copies, but that's only something I've inferred slowly from your
mails. Maybe there's a reason to do so, but right now branches and
copies are just two words for exactly the same thing in Subversion.
There's no difference, and distinguishing between them would be a
major design change [read: unlikely before 1.0 :-), though if failure
to distinguish leads to horrible problems, then of course we need to
rethink]. Until we understand whether or not this is a goal, and if
it is then why, it's hard to evaluate or even think clearly about the

Just to be clear: I really, really appreciate that you're bringing
years of db experience and schema design to bear on Subversion's
problems, and want to benefit from this. But without clear
communication, there's no way we'll grok the wonderful things going on
in your head :-).

Btw: I know it's much easier for you to speak in generic RDB language,
and that you look forward to the day when Subversion uses a real
relational database backend... But we are using Berkeley DB right
now, and will be in 1.0 as well, so it would help a lot to talk about
how things would be implemented in bdb specifically.

Right now, the goals (as I understand them) of any new scheme are,
roughly in order of priority:

   1. To avoid unbounded NodeID length on certain directories
      See http://subversion.tigris.org/issues/show_bug.cgi?id=573

   2. To eliminate the pre-commit stabilization walk that currently
      unsets the mutability flags of new nodes. This is the same as
      saying "make commit be a one-row insert" in RDB-speak, I guess.

   3. To preserve the ability to ask is_related(NodeA, NodeB) and
      is_ancestor(NodeA, NodeB) and get the answer in a reasonable
      amount of time; oh, and the same with get_predecessor(NodeA),
      except that there might be multiple predecessors and we need to
      think more on that.

   4. To preserve the ability to get_copy_history(NodeA). Yes, this
      is possible in the current scheme, even though the actual copy
      history property is set only on the top node of the copied tree
      and only in the revision that the copy occurred in. We record a
      small bit of information in a single place, but it affects
      answers about many other places. Right now,
      "get_copy_history()" means the same as "get_branch_history()".

   5. To avoid the silly non-deterministic migration of NodeID portions
      depending on which side of a copy gets modified first. Think of
      this as a maintainability fix -- we need things to be

Note that throughout this mail, "branch" means branch in the
user-visible sense, not in the fs-internal-nodeID sense.
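
To make goal 3 concrete, here is a rough sketch of what is_ancestor()
and is_related() could mean if every node simply recorded its direct
predecessors. Everything in it is an assumption for illustration: the
PREDECESSORS table, the node names, and the choice to treat a copy
source as a predecessor are hypothetical, and none of it reflects the
actual bdb schema or any proposal in this thread.

```python
# Hypothetical model: each node maps to a list of its direct predecessors.
# A list is used because a node might have more than one predecessor.
PREDECESSORS = {
    "A1": [],
    "A2": ["A1"],
    "A3": ["A2"],
    "B1": ["A2"],  # assumed: B1 was copied from A2
    "C1": [],      # unrelated node with its own history
}


def get_predecessors(node):
    """Return the direct predecessors of a node (possibly several)."""
    return PREDECESSORS[node]


def ancestors(node):
    """Collect every node reachable by following predecessor links."""
    seen = set()
    stack = list(get_predecessors(node))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(get_predecessors(current))
    return seen


def is_ancestor(node_a, node_b):
    """True if node_a appears somewhere in node_b's predecessor history."""
    return node_a in ancestors(node_b)


def is_related(node_a, node_b):
    """True if the two nodes share any point in their histories."""
    history_a = ancestors(node_a) | {node_a}
    history_b = ancestors(node_b) | {node_b}
    return bool(history_a & history_b)
```

Note that this naive walk takes time proportional to the length of
the history it has to traverse, so it only illustrates what the
questions mean, not how to answer them quickly.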


Received on Wed May 8 17:54:59 2002

This is an archived mail posted to the Subversion Dev mailing list.