RE: Re: Maintaining NodeID sanity

From: Bill Tutt <rassilon_at_lyra.org>
Date: 2002-05-09 19:55:35 CEST

> From: Karl Fogel [mailto:kfogel@newton.ch.collab.net]
>
> Bill, I'm sort of losing track of exactly what's being proposed and
> what its goals are, at this point.
>
> A *really* helpful thing would be a fresh mail, that begins with a
> list of the problems you're trying to solve, including a small
> concrete example for each problem. These should only be problems that
> exist in the current filesystem implementation, not problems in prior
> incarnations of various proposals. Describe solutions only after the
> problems are presented.
>

Ok. That's good, your summary batched up the entire list of issues for
me, so I'll add examples. I've also re-ordered them a little bit to
restack their priorities as I see them.

> Just to be clear: I really, really appreciate that you're bringing
> years of db experience and schema design to bear on Subversion's
> problems, and want to benefit from this. But without clear
> communication, there's no way we'll grok the wonderful things going on
> in your head :-).
>

I totally hear that, I'm trying to get the concepts out of my head as
quickly as I can because I know that you want to do the work now, and I
do have to accomplish the work I get paid for, so please allow me to
continue apologizing from writing fragmented and disjointed emails.

*grovels before everyone to ask forgiveness for doing fragmented brain
dumps*

> Right now, the goals (as I understand them) of any new scheme are,
> roughly in order of priority:
>
> 1. To avoid unbounded NodeID length on certain directories
> See http://subversion.tigris.org/issues/show_bug.cgi?id=573
>

Indeed, this is the case. The issue explains the problem very precisely.
i.e.:
Revision histories are unbounded, and if our BDB Key in our NodeRevision
table is our entire revision history that makes our Key be unbound in
size, which is a very bad situation.

> 2. To eliminate the pre-commit stabilization walk that currently
> unsets the mutability flags of new nodes. This is the same as
> saying "make commit be a one-row insert" in RDB-speak, I guess.
>
I think this might be an orthogonal, but still very important issue.
It's more of a "make the SVN transaction commit BDB transaction shorter"
kind of a thing. The longer the transaction lasts, the more time you're
preventing other people from acquiring locks on the data you care about.

> 3. To preserve the ability to ask is_related(NodeA, NodeB) and
> is_ancestor(NodeA, NodeB) and get the answer in a reasonable
> amount of time; oh, and the same with get_predecessor(NodeA),
> except that there might be multiple predecessors and we need to
> think more on that.
>

Yep. We definitely want to preserve as much of the quickness of access
of the ancestry information as we can.

I'd also break the performance needs into a bunch of various categories,
and treat them separately.

Relatedness: Needs to be very fast. O(1)
Immediate Predecessor: Needs to be very fast, but could be slightly
slower than the Relatedness check if necessary. O(1)
Ancestry check: Needs to be efficient, but by necessity is going to be
more expensive then the immediate predecessor check. O(n)

In the face of multiple predecessors the Ancestry check performance item
might need to be broken down into two separate checks:
Direct Line Ancestry check: This is what we normally think of as Node
ancestry information. (i.e. the NodeID never varies at all)

This should needs to execute just as quickly as the Ancestry check
defined above.

Unbounded Ancestry check:
An Unbounded Ancestry check looks up each possible predecessor's
ancestry chain in addition to the direct line ancestry predecessors.

This kind of check is very expensive, and should be avoided as much as
possible.

Currently all of the above checks (except for Unbounded Ancestry Checks
which aren't yet possible) are very easy to calculate since the
NodeRevisionID embeds all of the necessary information and we always
have the key of our BDB row very available and handy.

However, the current is_related() check is implemented by performing an
Ancestry check. So, the current performance of is_related() is currently
O(n) as opposed to what it hopefully should be: O(1).

The is_related(NodeA, NodeB) check is just as expensive with just
(NodeID, ChangeSetID) as the BDB key for a NodeChange as explained in
more detail in the response to issue #4.

The is_related() check with a (NodeID, BranchID, ChangeSetID) key
suddenly becomes O(1) since NodeIDs have been stabilized.

> 4. To avoid the silly non-determistic migration of NodeID portions
> depending on which side of a copy gets modified first. Think of
> this as a maintainability fix -- we need things to be
> predictable.
>

This is definitely an issue. Analyzing and generating reports on the
repository is much easier to accomplish when NodeIDs function in a
deterministic manner. Having a stable NodeID also allows us to have a
much cheaper is_related() check, which as I explained above is very
important for performance.

i.e. it prevents the following weird and counter-intuitive NodeID
assignments:

Prexisting state:
/foo/main (NodeIDMain, ChangeSetIDMain)
/foo/main/A (NodeIDA, ChangeSetIDA)

The user performs: "svn cp /foo/main /foo/alt"

This produces:
/foo/alt (NodeID alt, ChangeSetAlt)

Now there are two possibilities of what can occur here if somebody
submits a change to A. (in terms of generating a new NodeChange for A)

Possibility 1: (in time order)
The user modifes "/foo/main/A"
Some other user modifies "/foo/alt/A"

This produces:
/foo/main/A (NodeIDA, ArbitraryChangeSetID)
/foo/alt/A (NewNodeID, ArbitrryChangeSetID+1)

Nothing weird happens here, the NodeID is always the same under a given
repository path for the file when a new revision is made to replace the
existing "lazyily copied" data.

Possibility 2: (again in time order)
The user modifies "/foo/alt/A"
Some other modifies "/foo/main/A"

This produces:
/foo/alt/A (NodeIDA, ArbitraryChangeSetID)
/foo/main/A (NewNodeID, ArbitraryChangeSetID + 1)

The fact that after a O(1) "svn copy" /foo/main/A doesn't retain NodeIDA
is weird and massively complicates the time cost of the ancestry
relatedness checks. That is, the is_related() check described earlier
ends up being a O(n) check because it can't be short circuited based on
the NodeID being useful enough to use as the relatedness check.

> 5. To preserve the ability to get_copy_history(NodeA). Yes, this
> is possible in the current scheme, even though the actual copy
> history property is set only on the top node of copied tree and
> only in the revision that the copy occurred in. We record a
> small bit of information in a single place, but it affects
> answers about many other places. Right now,
> "get_copy_history()" means the same as "get_branch_history()".
>

Well, to preserve the current abilities of the existing schema is always
a good thing. I think I'd rephrase this as:

To more efficiently know which branch is the owner of a given repository
path.

It's very expensive now to answer the question: "Which branch does
repository path <Path> belong to?"

Say you have the following repository:
e.g.:

/foo/main/A
/foo/alt/A

where /foo/alt is a branch of /foo/main.

You'd like to know that /foo/alt/A is a member of the "/foo/alt/A"
branch.
In the currently implemented schema, we have no efficient mechanism to
answer this question.

And quite frankly, this "goal" is more of one of the handy side-effects
that easily fall out from wanting to solve issues #3 and #4.

I'll leave as an implemention argument whether or not branched files
should retain their de-normalized ancestral history or not since each
NodeChange (after adding the BranchID) would now know very efficiently
where it would need to look for more information.

Additional discussion points that come out of adding a "BranchID" to the
NodeChange key:

* NodeID aren't only completely stable now, they can also not change
when the file branches.

This allows you to perform queries that users expect you to be able to
perform based on concepts of how branching should work.

* Since NodeIDs can stay the same after a "branch/copy" you need to
answer the following question:
Should the repository support a "branch" operation only, or should the
repository also support a "copy" operation where the NodeID actually
changes, and the ancestry information is truncated.

I think if we can get away with it that we should only allow branches.
The schema change does allow you to support copies, but I think that
unnecessarily complicates the user interface. If the user wants the
result of a copy, they can just make the copy on their own and add the
file(s) back into the repository under a different repository path on
their own.

That pretty much sums up the goals of adding a BranchID, and stabilizing
NodeIDs, but I want to comment on one more semi-related topic Karl
brought up recently.

--------------------- Unbounded ancestry storage
---------------------------

Karl, you were also concerned about the unbounded size of complete
ancestry information even if it wasn't in the key of the NodeChange.
There are various ways to handle this issue. Each of which has their own
pros and cons. Let's look at some of those alternatives for a second.

* Just move the ancestry information from the key, and insert it into
the value of your BDB row.
* Create a secondary table to store the ancestry information
There are two sub-options to this option.
   * Option 1: Have a row in this secondary table for each NodeChange
ancestor, and insert the immediate Predecessor NodeChange key into the
NodeChange BDB value.
   * Option 2: Just use this secondary table as the place to stick the
ancestry information in the value, and have the normal BDB key be the
key in this table as well.
   * Option 3: Only store the immediate predecessor in the BDB value in
your NodeChange row.

There isn't much difference between option #1 and #2, except possibly
from a BDB implementation point of view. The major difference occurs
between #1/#2 and #3. #3 is the most normalized way of storing the data.
(i.e. no data redundancies) However, #3 suffers from a performance
issue. When you need to perform your ancestry check you don't know how
far back in the ancestry you need to go, and the predecessor NodeChanges
might be in wildly differing physical data store pages. Options #1 and
#2 try to reduce that problem by de-normalizing the entire ancestry
information locally.

Given the ways in which we want to access ancestry information Option #1
seems to look more favorable.

The reason for that is that we don't always need all of the ancestry
information, we really (95% of the time) just want enough ancestry
information to answer our question. Failure cases will occur, but that
is very infrequently.

If each piece of ancestry information is stored separately you can stop
reading BDB rows when you've answered your current ancestry question.

Additionally, the only place we call the predecessor API directly is
when we're doing stabilization (and/or deltification).

Does that help clarify things?

Bill

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu May 9 19:56:39 2002

This message: [ Message body ]
Next message: Philip Martin: "Re: [PATCH] external editor, does this work on Windows..."
Previous message: cmpilato_at_collab.net: "Re: [PATCH] external editor, does this work on Windows..."
In reply to: Karl Fogel: "Re: Maintaining NodeID sanity"
Next in thread: Karl Fogel: "Re: Maintaining NodeID sanity"
Reply: Karl Fogel: "Re: Maintaining NodeID sanity"

Contemporary messages sorted: [ By Date ] [ By Thread ] [ By Subject ] [ By Author ] [ By messages with attachments ]