Ben just committed a workaround for our last remaining M4/M5 issue
(#530), and now Subversion finally passes all the test suites again,
both locally and over DAV. We're rolling the tarball, updating the
status pages, etc., so there's still some administrivia to do, but M4
and M5 are done and the next target is... Alpha! :-)
Those who were on irc.openprojects.net#svn last night already know a
lot of the following. Issue #530 turned out to be a larger and more
complex problem than expected, related to directory versioning in
general, so we're going with a workaround for now. As far as a
permanent solution goes, there are several available, all of them
related to the commit system rewrite as described in issue #463. We
should discuss them before choosing one.
So first I'll describe the workaround, then the problem that led to
it. See also issue #530 for more details.
The Workaround:
===============
The client will insist that a directory be at head before property
changes can be committed on that dir -- just run "svn up" before doing
"svn propset" on a directory and you'll be fine :-). Note that this
restriction applies only to the dir itself, not to files in the dir.
Although there is still theoretically a tiny window between when the
client learns the youngest rev and when the commit finishes, during
which another prop change could be committed on that dir, in practical
terms this workaround is safe enough to use while the commit system
gets reworked.
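
To make the rule concrete, here is a minimal Python sketch of the kind
of client-side check the workaround implies. It is not the actual
client code: `CommitItem`, `dirs_blocked_by_workaround`, and the field
names are hypothetical, and the youngest revision is assumed to be
already known to the client.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CommitItem:
    """Hypothetical description of one commit candidate."""
    path: str
    is_dir: bool
    base_rev: int
    props_modified: bool


def dirs_blocked_by_workaround(items: List[CommitItem],
                               youngest_rev: int) -> List[str]:
    """Return the directories whose prop changes would be refused.

    Only directories are restricted; files with prop changes pass.
    """
    return [
        item.path
        for item in items
        if item.is_dir
        and item.props_modified
        and item.base_rev != youngest_rev
    ]


# Hypothetical working copy state: youngest revision is 9.
items = [
    CommitItem("trunk", is_dir=True, base_rev=7, props_modified=True),
    CommitItem("trunk/README", is_dir=False, base_rev=7, props_modified=True),
    CommitItem("branches", is_dir=True, base_rev=9, props_modified=True),
]
blocked = dirs_blocked_by_workaround(items, youngest_rev=9)
print(blocked)
```

The sketch treats `youngest_rev` as a snapshot, so it has the same small
race window described above: another commit can land between the check
and the end of the commit.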
The Problem and Solutions:
==========================
The root of this problem has to do with how the client and server
negotiate up-to-dateness checks when committing changes. When ra_dav
starts a commit, mod_dav_svn first creates a txn based on head. This
txn will eventually become the next revision -- as ra_dav sends
changes, they are incorporated into the txn, with new directory nodes
being created due to the "bubble up" process as necessary.
Of course, the client can't commit something based on an out-of-date
revision. For files, this is easy: if the text or props on the file
in the txn have changed since the client's base revision of that file,
then the file is out-of-date and the commit is aborted. To do this
check, mod_dav_svn compares the node rev id of the base revision
(ra_dav effectively reports this node rev id as part of the vsn rsrc
url in a CHECKOUT request, not to be confused with the unrelated "svn
checkout") with the node rev id of that file in the txn. If they're
not the same, the commit fails.
For directories, things are more complex, for a couple of different
reasons. First, the out-of-date check is relaxed for directories,
because we don't want to prohibit non-conflicting file changes just
because the client's directory was at an old base-revision.
Otherwise, you couldn't commit the same file twice in a row without
running "svn update ." between them!
But second of all, many directories in the txn will be different
(newer) nodes than those of the client's base simply due to the
bubble-up effect. These new nodes don't reflect changes to the
directory itself, only to things underneath the directory -- the only
reason the fs had to make a new dir node was because the entries list
now points to different (newer) object(s).
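
The bubble-up effect is easy to see in a toy model. The following is
only an illustrative sketch, not the real filesystem code: the `Node`
class, the integer ids, and the "owner" property are all made up for
the example, and real node rev ids look different.

```python
import itertools
from typing import Dict, Optional, Tuple

_next_id = itertools.count(1)  # opaque stand-in for node rev ids


class Node:
    """Immutable-by-convention node: a file (text) or a dir (entries)."""

    def __init__(self, props: Dict[str, str],
                 entries: Optional[Dict[str, "Node"]] = None,
                 text: Optional[str] = None) -> None:
        self.node_id = next(_next_id)
        self.props = dict(props)
        self.entries = dict(entries or {})
        self.text = text


def commit_file(directory: Node, path: Tuple[str, ...],
                new_text: str) -> Node:
    """Return a new dir node in which the file at `path` has `new_text`.

    Every directory along the path is cloned, because its entries list
    must point at the new child. Props are copied unchanged.
    """
    name, rest = path[0], path[1:]
    child = directory.entries[name]
    if rest:
        new_child = commit_file(child, rest, new_text)
    else:
        new_child = Node(child.props, text=new_text)
    new_entries = dict(directory.entries)
    new_entries[name] = new_child
    return Node(directory.props, entries=new_entries)


readme = Node({}, text="old")
notes = Node({}, text="notes")
trunk = Node({"owner": "ben"}, entries={"README": readme, "NOTES": notes})
root = Node({}, entries={"trunk": trunk})

new_root = commit_file(root, ("trunk", "README"), "new")
new_trunk = new_root.entries["trunk"]

print(new_trunk.node_id != trunk.node_id)   # the dir node is new...
print(new_trunk.props == trunk.props)       # ...but its props are not
print(new_trunk.entries["NOTES"] is notes)  # untouched sibling is shared
```

In this model a plain id comparison on "trunk" reports a difference
even though nothing about the directory itself changed, which is
exactly why the file-style check is too strict for directories.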
So very often, the directory in the txn will look misleadingly
different from the one on the client, even if you check the node rev
id. How does mod_dav_svn deal with this? It has to, because in the
case of a directory prop change we need to know that the directory is
up-to-date (or at least that its properties have not been changed
since the base rev; at the moment, we check that by ensuring that the
dir itself is up-to-date).
Currently, mod_dav_svn just checks that the node rev id of the txn dir
is an immediate successor of the client's dir's node rev id. This
check is not as easy as you might think -- in fact, it's impossible to
do it with 100% reliability, due to the possibility of holes in the
node rev id space. It turns out there are other problems, too, with
the result that if you try this:

    $ echo "new file" > new1
    $ svn ci new1
    $ echo "another new file" > new2
    $ svn ci new2

...the second commit will fail. You'd have to put an update between
them to make it succeed (and I'd like to point out that Greg Stein,
during our irc discussions last night, correctly predicted the above
reproduction recipe without ever having seen the bug himself :-) ).
So we're faced with the question: how do we reliably determine what
has changed in a directory since the client's base revision?
Greg made the excellent suggestion that we just openly admit what's
been going on for a while now, by declaring that a directory's base
revision refers only to its properties -- the objects inside the dir
should always record their own revisions (as, indeed, they do
already).
That takes care of the working copy side. What about the server?
When a dir's base revision (or rather, base node rev id) B is
transmitted to the server, followed by a prop change request on the
dir, mod_dav_svn has to deduce whether or not the dir's properties
have changed since B. If they have, the request fails, else it
succeeds.
Well, there are a couple of ways to do this. Every node revision in
the filesystem has a created rev (CR) field, indicating the revision
number in which that node rev was created. We could:

   1. Change the meaning of the CR field on directories, such that it
      only gets "bumped" when the directory receives a prop change,
      and otherwise persists from one revision to the next (i.e., the
      old CR field would be preserved through bubble-up and add/del
      entry changes on that dir, instead of being set to the new rev
      every time).

   2. Retrieve the property rep key from the dir's node rev in the
      base revision, compare it to the prop rep key of that dir in the
      txn, and see if they're the same.
The big disadvantage of (1) is that it loses some information -- it is
no longer possible to look at a dir's node rev and see, in constant
time, which revision this precise node revision appeared in. And if
we lose that information, then we've lost the ability to (say) walk
through a given revision tree in log(N) time and know precisely,
without reference to any other revision tree, what changes
happened in that revision. This may or may not be a big deal, not
sure. It's hard to say how valuable the current meaning of CR on
directories is, since we haven't implemented "svn log" nor any more
sophisticated reports yet. I must admit to nervousness at the thought
of making CR mean something different for directories than for files.
Implementation note for (1): since mod_dav_svn never actually gets the
base revision, but rather gets the node rev id, I guess we'd first
retrieve the CR of that old node rev, then compare it to the CR of the
dir's latest incarnation.
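
As a rough sketch of what that comparison might look like under option
(1), assuming CR on a directory has been redefined to mean "revision of
the last prop change". The `DirNodeRev` type, its fields, and the id
strings are hypothetical and do not reflect the real node rev layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirNodeRev:
    """Hypothetical directory node revision under option (1)."""
    node_rev_id: str
    created_rev: int  # here: revision of the last prop change on the dir


def props_changed_since_base(base: DirNodeRev, latest: DirNodeRev) -> bool:
    """Under option (1), differing CRs mean a prop change in between."""
    return base.created_rev != latest.created_rev


base = DirNodeRev("dir-a", created_rev=3)
# Entries changed via bubble-up: new node rev, CR preserved.
after_bubble_up = DirNodeRev("dir-b", created_rev=3)
# A prop change was committed in revision 6: CR bumped.
after_prop_change = DirNodeRev("dir-c", created_rev=6)

print(props_changed_since_base(base, after_bubble_up))
print(props_changed_since_base(base, after_prop_change))
```

Note that the check never looks at the node rev ids themselves, only
at the two CR values.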
What about solution (2)? Well, under the upcoming new commit system
[see issue #463], commits are supposed to involve no undeltification.
At first I was afraid plan (2) would violate this, because it has to
go retrieve the prop rep key of the dir's old (base) revision. But
because we're being handed the node rev id itself, we can get that key
without any undeltification -- the intermediate directories are
bypassed, and node revs themselves are never deltified (their content
reps are, but the prop rep key is part of the node rev header). Thus
we can compare the old prop rep key to the latest one in constant time
and know if this dir's properties have changed since the revision the
client's change is based on. I don't see any disadvantages to this
solution yet, but am probably missing something...
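
For comparison, here is a minimal sketch of the option (2) check. It
assumes that a prop change always gives the dir a new prop rep key and
that unchanged props keep the same key; the `NodeRevHeader` type, the
in-memory table, and all ids and keys are hypothetical stand-ins for
the real node rev storage.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class NodeRevHeader:
    """Hypothetical node rev header; only the keys matter here."""
    prop_key: str  # key of the property representation
    data_key: str  # key of the content (entries) representation


def dir_props_changed(node_revs: Dict[str, NodeRevHeader],
                      base_id: str, txn_id: str) -> bool:
    """Compare prop rep keys of two node revs looked up directly by id.

    Two table lookups and one comparison; no representation contents
    are read, so nothing needs to be undeltified.
    """
    return node_revs[base_id].prop_key != node_revs[txn_id].prop_key


# Made-up node revs for one directory.
node_revs = {
    "base": NodeRevHeader(prop_key="p1", data_key="d1"),
    "bubbled": NodeRevHeader(prop_key="p1", data_key="d2"),  # entries only
    "propset": NodeRevHeader(prop_key="p2", data_key="d2"),  # props changed
}

print(dir_props_changed(node_revs, "base", "bubbled"))
print(dir_props_changed(node_revs, "base", "propset"))
```

Unlike the CR approach, this leaves the meaning of CR alone and relies
only on information already present in the node rev header.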
Would like to hear other people's thoughts on this problem and
possible solutions.
-K
---------------------------------------------------------------------
Received on Sat Oct 21 14:36:45 2006