Re: [RFC] Conceptual locking procedure for database access [#11511]

From: Branko Čibej <brane_at_xbc.nu>
Date: 2004-12-23 17:04:05 CET

Keith Bostic wrote:

>Here's the pseudo-code to acquire a DB_ENV handle:
>
> Open/create the DB_REGISTER file
> If the DB_REGISTER file is 0-length
> Write identifying string in the first line
>
>
There's a race here. It would be better to acquire the exclusive lock
first, then check and/or write the identifying string.

Yes, fcntl can lock byte ranges beyond EOF, in every implementation I've
heard of.
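
A minimal Python sketch of the lock-first ordering, for illustration only.
The identifying string, the function names, and the choice of byte 0 as the
"whole file" lock are assumptions of this sketch, not part of the proposal:

```python
import fcntl
import os

MAGIC = b"DB_REGISTER\n"  # hypothetical identifying string


def open_register(path):
    """Open/create the register file and return an fd holding the file lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    # Take the exclusive lock first (blocking). Byte 0 stands in for the
    # "whole file" lock here; that is an assumption of this sketch.
    fcntl.lockf(fd, fcntl.LOCK_EX, 1, 0, os.SEEK_SET)
    # Only now look at the length, so two creators cannot both see an
    # empty file and both write the identifying string.
    if os.fstat(fd).st_size == 0:
        os.pwrite(fd, MAGIC, 0)
    return fd  # the caller still holds the exclusive lock


def lock_beyond_eof(fd, offset):
    """Lock one byte at `offset` (NOWAIT), which may lie past end-of-file."""
    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, offset, os.SEEK_SET)
    return os.fstat(fd).st_size  # locking a range does not extend the file
```

The sketch locks a single byte rather than the whole file because, with
fcntl, a lock covering the entire file would conflict with the per-slot locks
that other live processes hold.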

> Acquire exclusive lock on the file (WAIT)
> For every allocated process ID slot {
> Acquire lock on the process slot (NOWAIT)
> If acquire was successful {
> Release process slot lock
> Recovery is needed
> Break out of loop
> }
> }
>
> If recovery is needed
> Mark all slots in the DB_REGISTER file empty
>
> Find an empty process slot {
> Acquire lock on the process slot (NOWAIT)
> if acquire was successful {
> Overwrite the slot with our process ID
> Break out of loop
> }
> }
>
>
Instead of marking all the slots empty, wouldn't it be better to mark
only those that are marked used but aren't locked? That means you always
have to walk the whole list in the first loop, but that's the expected
case anyway, and there's no point in short-circuiting the error case.
It also lets you merge the second loop into the first. Live processes
will release their slots anyway when they panic out of the environment.

Like this:

    For every process ID slot {
        if we don't have a slot yet and the slot is empty {
            Acquire lock on the process slot (NOWAIT)
            if acquire was successful {
                Overwrite the slot with our process ID
                we have a slot now
            }
            if acquire failed {
                ERROR; this can't happen
            }
        }
        else if the slot is used {
            Acquire lock on the process slot (NOWAIT)
            if acquire was successful {
                // test for race with the DB_ENV release, which doesn't
                // acquire the exclusive lock on the whole file
                if the slot is still used {
                    Recovery is needed
                    Mark the slot empty
                }
                Release process slot lock
            }
        }
    }
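
The same loop as a rough Python sketch, assuming the caller already holds
the exclusive file lock. The fixed-width slot layout, the slot count, and all
names below are made up for illustration. The sketch uses elif so that the
slot we have just claimed is not re-examined as a "used" slot:

```python
import fcntl
import os

HEADER = 16   # bytes reserved for the identifying string (assumed layout)
SLOT = 16     # width of one process ID slot (assumed)
NSLOTS = 8    # fixed slot count, for the sketch only
EMPTY = b" " * (SLOT - 1) + b"\n"


def try_lock(fd, offset):
    """NOWAIT lock on one slot; False if another process holds it."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, SLOT, offset, os.SEEK_SET)
        return True
    except OSError:
        return False


def unlock(fd, offset):
    fcntl.lockf(fd, fcntl.LOCK_UN, SLOT, offset, os.SEEK_SET)


def slot_used(fd, offset):
    # A slot past EOF reads back as b"" and therefore counts as empty.
    return os.pread(fd, SLOT, offset).strip(b" \n\x00") != b""


def register(fd, pid):
    """Walk every slot once; return (our slot offset or None, recovery needed)."""
    ours, recovery = None, False
    for i in range(NSLOTS):
        off = HEADER + i * SLOT
        if not slot_used(fd, off):
            if ours is None:
                if not try_lock(fd, off):
                    raise RuntimeError("empty slot is locked; this can't happen")
                os.pwrite(fd, str(pid).ljust(SLOT - 1).encode() + b"\n", off)
                ours = off
        elif try_lock(fd, off):
            # Re-check: the owner may have emptied the slot and released
            # its lock between our first read and our lock attempt.
            if slot_used(fd, off):
                recovery = True
                os.pwrite(fd, EMPTY, off)
            unlock(fd, off)
    return ours, recovery
```

Like the pseudo-code, this does not reuse a slot it has just marked empty, so
it can return no slot at all when every slot belonged to a dead process.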

I think this covers all cases, and it guarantees that we can't have a
slot that's marked as empty and is locked at the same time. That's a
good sanity check, IMO.

> If recovery is needed
> Recover the database environment
>
> Release exclusive lock
>
>Here's the pseudo-code to discard a DB_ENV handle:
>
> Mark our process ID slot empty.
> Release process slot lock
>
>
The rest is fine.
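
The order of the two discard steps matters: if the slot is emptied before the
lock is dropped, anyone who then manages to lock the slot will find it empty
on re-reading it. A small Python sketch, assuming fixed-width 16-byte slots
(the layout and the function name are hypothetical):

```python
import fcntl
import os

SLOT = 16                            # assumed slot width
EMPTY = b" " * (SLOT - 1) + b"\n"    # assumed "empty" marker


def discard_handle(fd, offset):
    """Give up our process ID slot without taking the exclusive file lock."""
    # 1. Mark the slot empty while we still hold its lock.
    os.pwrite(fd, EMPTY, offset)
    # 2. Only then release the slot lock.
    fcntl.lockf(fd, fcntl.LOCK_UN, SLOT, offset, os.SEEK_SET)
```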

>=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>Q1: Why isn't an exclusive lock necessary to discard a DB_ENV
>handle?
>
>We mark our process ID slot empty before we discard the process
>slot lock, and threads of control reviewing the register file
>ignore any slots which they can't lock.
>
>Q2: Can there be processes still running in the existing
>environment when we recover it?
>
>Yes. However, performing recovery immediately panics (and
>removes) the existing environment, so the window of
>vulnerability is small. (We check the panic flag in the DB API
>methods, when spinning on a mutex, and whenever about to write
>to disk). The only window of corruption is if the write check
>of the panic were to complete, the region subsequently be
>recovered, and then write continue. That's very, very unlikely
>to happen. Note this vulnerability already exists in Berkeley
>DB, and we've never heard of a problem.
>
>
Wouldn't that be because the documentation says that you must run
recovery only when no other processes are using the environment? That
is, you don't hear problem reports because people aren't exercising
this case very often? I'm concerned that with the new DB_REGISTER flag,
the opportunity for this race to happen would be much greater.
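
To make the window concrete, here is a toy model of the interleaving in
Python. Nothing in it is Berkeley DB code: Region, checked_write and recover
are hypothetical stand-ins, and the generator exists only so the caller can
pause between the panic check and the write:

```python
class Region:
    """Toy stand-in for a shared environment region."""

    def __init__(self):
        self.panic = False


def checked_write(region, log, record):
    """Check the panic flag, pause, then write the record to the log."""
    if region.panic:
        raise RuntimeError("environment panicked")
    yield  # the window: recovery may run between the check and the write
    log.append(record)


def recover(old_region, log):
    """Panic the old region, keep only committed records, return a new region."""
    old_region.panic = True
    log[:] = [r for r in log if r.startswith("committed")]
    return Region()
```

Stepping through it by hand shows the stale write landing after recovery:

```python
log = ["committed:1", "uncommitted:x"]
old = Region()
writer = checked_write(old, log, "stale:2")
next(writer)               # panic check passes
new = recover(old, log)    # recovery runs inside the window
next(writer, None)         # the write continues anyway
assert log == ["committed:1", "stale:2"]
```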

>Q3: Can there be processes still running in the old environment
>after we're up and running with the new one?
>
>Yes. However, those processes can't corrupt anything (as they
>won't be able to write anything into the log or database files
>after the panic of the environment), and those processes will
>hopefully notice the panic flag eventually.
>
>
But what happens to any open transactions these processes hold? Do they
get rolled back automatically? If not, how can the process do a rollback
if it can't write to the log file?

>Q4: Is this design portable?
>
>I'm using fcntl(2) locking and file offsets; that's the only way
>I can think of to get lots of locks, where the locks will go
>away on process death. (We could do the same thing with flock
>locking instead, but flock would require a separate file for
>each process of control in the database environment, which isn't
>going to be pleasant.) Windows supports fcntl-style locking
>using a different API (except for Win/9X and Win/ME). This
>feature won't be as portable as Berkeley DB in general, and will
>need to auto-configure itself off where locking isn't available.
>
>
I think we can safely ignore Win9x. As far as Subversion is concerned,
we don't support BDB repositories on those systems anyway.

-- Brane

Received on Thu Dec 23 17:04:09 2004
