auto recovery (was: Checkpoint less frequently)

From: Greg Stein <gstein_at_lyra.org>
Date: 2003-02-22 06:17:07 CET

On Fri, Feb 21, 2003 at 01:20:55PM -0500, Greg Hudson wrote:
> On Fri, 2003-02-21 at 13:03, Branko Cibej wrote:
> [When the monitor fails to keep a process from hitting a stale lock:]
> > So we wait for a bit, then kill it.
>
> If the monitor process is started automatically, then it may have been
> started by a different user than the one whose process hung. So we
> can't kill it.

Not to mention all the other crap that can happen when you arbitrarily whack
processes. In the DAV case, this would be shooting down an Apache process,
and that could mean leaving a bunch of shared memory stuff sitting
around. Yes, Apache does try to clean up in cases like that, but let's not
plan to make it work too hard.

Just say "no" to killing processes :-)

> The following discipline would seem to work, without the need for a
> monitor process:
>
> * Wrap a guard file around the database, per my earlier idea.
> (fcntl-locked, read-locked for normal access, write-locked for
> recovery.)
>
> * Set the lock timeout (at db creation time).

Ah! Key item. Yes, this solves the whole ball of wax.

> * If we time out on a lock, fail the transaction, grab a write lock on
> the guard file, run recovery, and retry.

Well, we can change this a bit:

    * If we time out on a lock:
      - retry the transaction (maybe there are other reasons for a timeout,
        such as the database simply being *busy*)
      - if we get DB_RUNRECOVER, then:
        - fail the transaction (well, fail the *trail*, right?)
        - grab a write lock on REPOS/lock/db.lock
        - run recovery
        - unlock the guard
        - retry if we haven't exhausted the retry count
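
For illustration only, here is a rough Python sketch of that discipline. It is
not Subversion or Berkeley DB code: LockTimeout and RunRecovery are
hypothetical exceptions standing in for a lock timeout and DB_RUNRECOVER, and
txn_body / recover are callables supplied by the caller. Only the fcntl calls
are real.

```python
import fcntl
import os


class LockTimeout(Exception):
    """Hypothetical stand-in for a Berkeley DB lock-timeout error."""


class RunRecovery(Exception):
    """Hypothetical stand-in for Berkeley DB reporting DB_RUNRECOVER."""


def run_with_recovery(guard_path, txn_body, recover, max_retries=3):
    """Run txn_body under a read lock on the guard file.

    A timeout is simply retried (the database may just be busy). Only a
    RunRecovery error makes us take the write lock and call recover().
    """
    fd = os.open(guard_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as guard:
        for _attempt in range(max_retries + 1):
            needs_recovery = False
            fcntl.lockf(guard, fcntl.LOCK_SH)      # normal access: read lock
            try:
                return txn_body()
            except LockTimeout:
                pass                               # maybe just busy: retry
            except RunRecovery:
                needs_recovery = True              # this attempt has failed
            finally:
                fcntl.lockf(guard, fcntl.LOCK_UN)  # drop the read lock
            if needs_recovery:
                fcntl.lockf(guard, fcntl.LOCK_EX)  # recovery: write lock
                try:
                    recover()
                finally:
                    fcntl.lockf(guard, fcntl.LOCK_UN)  # unlock the guard
        raise RuntimeError("retry count exhausted")
```

The read lock is released before the write lock is requested rather than
upgraded in place, so two processes that both want to recover do not block
each other while each still holds a read lock. Note that fcntl locks are
per-process, so this does nothing to serialize threads inside one process.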

> But it may be inefficient in some cases:
>
> * If we erroneously time out on a lock, we will still succeed
> eventually, but it may take much longer than it would if we had
> waited. But that problem should be rare.

Berkeley DB should be able to tell us when recovery is needed (DB_RUNRECOVER),
so we can just look for that instead of assuming the need on every timeout.

> * If multiple processes hit the stale lock, they will all run
> recovery. We could avoid that by putting a timestamp in the guard
> file saying when recovery was last run, or we could hypothesize that
> N recoveries doesn't take much longer than one recovery.

The timestamp would be nice. Each process could record the time at which it
starts trying to acquire the write lock. When it finally gets the lock, it
reads the file. If it sees that a recovery finished *after* the time it
recorded, it just releases the write lock and retries the operation.
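
A minimal sketch of the timestamp idea, again in Python and again only an
illustration. The function name, the file format (a single float written as
text), and the injectable clock are all assumptions of mine; recover is
whatever actually runs recovery.

```python
import fcntl
import os
import time


def recover_if_needed(guard_path, recover, clock=time.time):
    """Run recover() unless someone else finished one while we waited.

    Returns True if this call ran recovery, False if it was skipped.
    """
    requested_at = clock()                 # noted *before* we block on the lock
    fd = os.open(guard_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as guard:
        fcntl.lockf(guard, fcntl.LOCK_EX)  # write lock on the guard file
        try:
            text = guard.read().strip()
            last_recovery = float(text) if text else 0.0
            if last_recovery > requested_at:
                return False               # recovery finished while we waited
            recover()
            guard.seek(0)
            guard.truncate()
            guard.write(repr(clock()))     # record when recovery finished
            guard.flush()                  # make it visible before unlocking
            return True
        finally:
            fcntl.lockf(guard, fcntl.LOCK_UN)
```

This compares wall-clock times taken on the same machine, so it assumes the
clock does not jump backwards between the two readings.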

> I also wonder how many of these problems go away if you instruct
> Berkeley DB to use fcntl locks. (That's possible, right?) And what the
> cost is in everyday performance, of course.

Hmm. Interesting, but I think the timeout is key, and should be able to get
us what we need.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Received on Sat Feb 22 06:12:48 2003
