
Re: Wedged repositories

From: Karl Fogel <kfogel_at_newton.ch.collab.net>
Date: 2002-07-23 05:37:01 CEST

"Sander Striker" <striker@apache.org> writes:
> Anyhow, I'm CC'ing in Keith Bostic (Hi Keith, hope you don't mind),
> hoping he can shed some light on this*.
>
> Keith: for the record, one of my repositories locked up one day.
> First I tried to run 'db_recover -v -h ${REPOS}/db'. This didn't
> unlock my repos. Then I tried 'db_recover -ve -h ${REPOS}/db' and
> it did unlock my repos. Any ideas?

Indeed, it might be very beneficial (for us) if Keith Bostic read the
whole thread, starting from Jim Blandy's first message. It's pretty
short, and Jim's questions are clear and pointed.

On the off chance that he's willing to do this, I've included below
the three messages that Keith hasn't seen yet, with some minor edits
for brevity. Here they are:

-----------------
Jim's first post:
-----------------

   From: Jim Blandy <jimb@red-bean.com>
   Subject: Wedged repositories
   To: dev@subversion.tigris.org
   Date: Mon, 22 Jul 2002 17:27:59 -0500
   
   It's a known problem that Subversion repositories can get wedged, and
   that running db_recover -e on them fixes things. The db_recover
   program, part of the Berkeley DB distribution, is basically a wrapper
   around a single call to DBENV->open, which is given the DB_RECOVER
   flag. Since recovery is fast when the repository was shut down
   properly, there's no reason Subversion couldn't do this itself. The
   FS API provides for this.
   
   In fact, the code in libsvn_repos looks like it's trying to do this,
   but it doesn't. I'm having a hard time discerning the intent here.
   There is some locking stuff in libsvn_repos/repos.c, but
   svn_open_repos uses it in a weird way: it calls svn_fs_open_berkeley,
   and *then* it acquires its locks on db.lock. Since there are no
   shared resources acquired *after* we get the lock, it's hard to see
   what that lock could reliably exclude.
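
To illustrate the point about ordering: an advisory lock excludes only those processes that take it before touching the shared resource. The following is a minimal sketch in Python, assuming POSIX flock-style locks; the helper `try_lock` is hypothetical and is not part of Subversion or Berkeley DB.

```python
import fcntl
import os


def try_lock(path, mode):
    """Try to take `mode` (fcntl.LOCK_SH or fcntl.LOCK_EX) without blocking.

    Returns an open file descriptor holding the lock, or None if another
    holder's lock conflicts.  Closing the descriptor releases the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o666)
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

Any number of openers can hold the shared lock at once, while an exclusive request is refused as long as any shared holder remains. A process that opens the environment first and locks afterwards is never refused anything at the moment that matters, which is the weakness described above.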
   
   I think the logic in svn_open_repos needs to be:
   
   - get a shared lock on locks/db.lock
   - try calling svn_fs_open_berkeley
   - if it fails with DB_RUN_RECOVERY, then:
     - release shared lock on db.lock
     - get an exclusive lock on db.lock
     - call svn_fs_berkeley_recover
     - release exclusive lock
     - retry from the top
   
   In this arrangement, we only try to recover when we have an exclusive
   lock, and we never return an opened filesystem object unless we have a
   shared lock.
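
The sequence above can be sketched as follows. This is an illustration in Python rather than C, assuming POSIX flock-style locks on db.lock. The callables `open_fs` and `recover` stand in for svn_fs_open_berkeley and svn_fs_berkeley_recover, `RunRecovery` stands in for the DB_RUN_RECOVERY error, and the `max_attempts` bound is an added assumption (the steps above simply say "retry from the top"). None of these names is Subversion's actual API.

```python
import fcntl
import os


class RunRecovery(Exception):
    """Hypothetical stand-in for Berkeley DB's DB_RUN_RECOVERY error."""


def open_repos(lock_path, open_fs, recover, max_attempts=3):
    """Open the filesystem under a shared lock; recover under an exclusive one.

    On success returns (fs, lock_fd).  The caller must keep lock_fd open for
    as long as fs is in use, because closing it releases the shared lock.
    """
    for _ in range(max_attempts):
        fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o666)
        try:
            fcntl.flock(fd, fcntl.LOCK_SH)   # get a shared lock
            try:
                return open_fs(), fd         # success: shared lock still held
            except RunRecovery:
                pass                         # fall through to recovery
            fcntl.flock(fd, fcntl.LOCK_UN)   # release the shared lock
            fcntl.flock(fd, fcntl.LOCK_EX)   # block until we are the only holder
            recover()                        # recover while nobody else is in
            fcntl.flock(fd, fcntl.LOCK_UN)   # release the exclusive lock
        except BaseException:
            os.close(fd)                     # never leak the lock on failure
            raise
        os.close(fd)                         # retry from the top
    raise RunRecovery("filesystem still needs recovery after retries")
```

If `recover` itself raises `RunRecovery`, the sketch lets the error propagate instead of hiding it, in keeping with the concern below about svn_fs_berkeley_recover returning DB_RUN_RECOVERY.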
   
   I understand there have been problems with svn_fs_berkeley_recover
   itself returning DB_RUN_RECOVERY. That's pretty confusing; it's kind
   of useless if it does that. But that's not a problem which can be
   swept under the rug, say, by declaring Issue 403 resolved; it's a
   central part of the problem.

-----------------------
Sander responds to Jim:
-----------------------

   From: "Sander Striker" <striker@apache.org>
   Subject: RE: Wedged repositories
   To: "Jim Blandy" <jimb@red-bean.com>, <dev@subversion.tigris.org>
   Date: Tue, 23 Jul 2002 00:41:22 +0200
   
   Jim Blandy wrote:
   > In this arrangement, we only try to recover when we have an exclusive
   > lock, and we never return an opened filesystem object unless we have a
   > shared lock.
   
   I am sooo +1 on this approach.
    
   > I understand there have been problems with svn_fs_berkeley_recover
   > itself returning DB_RUN_RECOVERY. That's pretty confusing; it's kind
   > of useless if it does that. But that's not a problem which can be
   > swept under the rug, say, by declaring Issue 403 resolved; it's a
   > central part of the problem.
   
   It seems that 'db_recover -e' works more often than 'db_recover'. The
   -e flag is to retain the environment, which could be influenced by
   DB_CONFIG. So, we need to remove the DB_PRIVATE flag from the recover
   routine in repos.c
   
   Sander

----------------------------------
Jim responds to Sander's response:
----------------------------------

   From: Jim Blandy <jimb@red-bean.com>
   Subject: Re: Wedged repositories
   To: "Sander Striker" <striker@apache.org>
   CC: dev@subversion.tigris.org
   Date: 22 Jul 2002 17:48:51 -0500
   
   "Sander Striker" <striker@apache.org> writes:
   > It seems that 'db_recover -e' works more often than 'db_recover'. The
   > -e flag is to retain the environment, which could be influenced by
   > DB_CONFIG. So, we need to remove the DB_PRIVATE flag from the recover
   > routine in repos.c
   
   This doesn't jibe with the Berkeley DB docs, though:
   
       http://www.sleepycat.com/docs/api_c/env_open.html
   
   If we've made sure we're the only process accessing the DB
   environment, it shouldn't matter. We should be able to let the
   recovery process do whatever it pleases. If it doesn't work, then we
   don't really understand what's happening.
   
   I'd like to have a solid explanation for what's going on before we start
   flipping flags on and off based on what "works more often." These are
   the repository's low-level mechanisms we're dealing with here; we
   really should try to understand what we're doing.

...And that's all.

Keith, if you've read this far, THANK YOU for your time!

-Karl

Received on Tue Jul 23 05:50:12 2002

This is an archived mail posted to the Subversion Dev mailing list.
