It's a known problem that Subversion repositories can get wedged, and
that running db_recover -e on them fixes things. The db_recover
program, part of the Berkeley DB distribution, is basically a wrapper
around a single call to DBENV->open, which is given the DB_RECOVER
flag. Since recovery is fast when the repository was shut down
properly, there's no reason Subversion couldn't do this itself. The
FS API already provides for this, through svn_fs_berkeley_recover.

In fact, the code in libsvn_repos looks like it's trying to do this,
but it doesn't. I'm having a hard time discerning the intent here.

There is some locking code in libsvn_repos/repos.c, but
svn_open_repos uses it in an odd order: it calls svn_fs_open_berkeley,
and only *then* acquires its lock on db.lock. Since no shared
resources are acquired *after* we get the lock, it's hard to see
what that lock could reliably exclude.

I think the logic in svn_open_repos needs to be:

- get a shared lock on locks/db.lock
- try calling svn_fs_open_berkeley
- if it fails with DB_RUN_RECOVERY, then:
  - release shared lock on db.lock
  - get an exclusive lock on db.lock
  - call svn_fs_berkeley_recover
  - release exclusive lock
  - retry from the top

In this arrangement, we only try to recover when we have an exclusive
lock, and we never return an opened filesystem object unless we have a
shared lock.
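
To make the ordering concrete, here is a minimal sketch of that loop.
It is not the libsvn_repos code: BSD flock() stands in for whatever
locking repos.c really uses, the open_fn/recover_fn callbacks are
hypothetical stand-ins for svn_fs_open_berkeley and
svn_fs_berkeley_recover (whose real signatures differ), and
OPEN_NEEDS_RECOVERY stands in for DB_RUN_RECOVERY. The max_attempts
bound and the fake_* helpers are my own additions for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical status codes; OPEN_NEEDS_RECOVERY plays DB_RUN_RECOVERY. */
enum open_status { OPEN_OK, OPEN_NEEDS_RECOVERY, OPEN_FAILED };

/* Hypothetical callbacks standing in for the real FS calls. */
typedef enum open_status (*open_fn)(const char *fs_path, void *baton);
typedef int (*recover_fn)(const char *fs_path, void *baton); /* 0 = ok */

/* On success, returns the lock file descriptor with a shared lock
   still held; the caller keeps it for as long as the filesystem is
   open.  Returns -1 on any failure. */
int open_with_recovery(const char *lock_path, const char *fs_path,
                       open_fn try_open, recover_fn recover,
                       void *baton, int max_attempts)
{
    assert(try_open != NULL && recover != NULL);

    int fd = open(lock_path, O_RDWR | O_CREAT, 0666);
    if (fd < 0)
        return -1;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        /* Shared lock first, so the open never races a recovery. */
        if (flock(fd, LOCK_SH) != 0)
            break;
        enum open_status st = try_open(fs_path, baton);
        if (st == OPEN_OK)
            return fd;                 /* shared lock stays held */

        flock(fd, LOCK_UN);            /* release shared lock */
        if (st != OPEN_NEEDS_RECOVERY)
            break;                     /* some other error: give up */

        /* Recovery runs only under the exclusive lock. */
        if (flock(fd, LOCK_EX) != 0)
            break;
        int rc = recover(fs_path, baton);
        flock(fd, LOCK_UN);            /* release exclusive lock */
        if (rc != 0)
            break;
        /* Fall through: retry from the top. */
    }
    close(fd);
    return -1;
}

/* Demo stand-ins: the "filesystem" is wedged until recovery has run. */
struct fake_fs { int wedged; int recoveries; };

static enum open_status fake_open(const char *fs_path, void *baton)
{
    (void)fs_path;
    struct fake_fs *fs = baton;
    return fs->wedged ? OPEN_NEEDS_RECOVERY : OPEN_OK;
}

static int fake_recover(const char *fs_path, void *baton)
{
    (void)fs_path;
    struct fake_fs *fs = baton;
    fs->wedged = 0;
    fs->recoveries++;
    return 0;
}
```

Releasing the shared lock before asking for the exclusive one
matters: two processes that both hold the shared lock and both wait
to upgrade would otherwise deadlock. The retry bound is there because
recovery might not actually fix anything, in which case an unbounded
loop would spin forever.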

I understand there have been problems with svn_fs_berkeley_recover
itself returning DB_RUN_RECOVERY. That's pretty confusing: a recovery
routine that itself demands recovery is kind of useless. But that's
not a problem that can be swept under the rug, say, by declaring
Issue 403 resolved; it's a central part of the problem.
Received on Tue Jul 23 00:28:28 2002