
Re: Subversion's use of Berkeley DB [#11511]

From: Keith Bostic <bostic_at_abyssinian.sleepycat.com>
Date: 2004-12-08 21:10:07 CET

> From: Justin Erenkrantz <justin@erenkrantz.com>
> --On Wednesday, December 8, 2004 12:38 PM -0500 Keith Bostic
> <bostic@abyssinian.sleepycat.com> wrote:
>> As a springboard for that discussion, I propose we find a
>> serialization point for all threads of control using a
>> Subversion repository so we can determine if a thread of
>> control is the first thread of control entering the database
>> environment after a possible application or system failure.
> Well, for mod_dav_svn access (WebDAV), we can have an Apache hook run on
> initialization before httpd starts serving pages. So, the only thing we'd
> need to do is figure out if some other process (or system) crash occurred
> that left it in a potentially goofy (but non-detectable??) state.

I didn't realize that was possible -- my understanding was there
was no way to get Apache to tell a module that it was being called
for the first time. Is that not true?

> Don't I recall you (or someone else) advising that we always run
> recovery on process initialization? Would that work here?

Yes, Subversion should always run recovery on process initialization.
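For standalone processes (ra_svn, SSH tunneling) where no parent server can run a start-up hook, one conceivable serialization point is a lock file next to the environment. The following is only a sketch under stated assumptions, not anything Subversion or Berkeley DB provides: the function names and the lock-file path are hypothetical. The first process in obtains an exclusive fcntl() lock, runs recovery, then downgrades to a shared lock; later processes block on the shared lock until recovery is finished. Because the kernel drops the locks when a process dies, the next process in after everyone has exited or crashed becomes "first" again.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Apply a whole-file advisory lock of the given type (F_RDLCK or F_WRLCK). */
static int set_lock(int fd, short type, int cmd)
{
    struct flock fl = { .l_type = type, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };
    return fcntl(fd, cmd, &fl) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: returns 1 if the caller is the first process in
 * (exclusive lock held, caller should run recovery), 0 if another process
 * is already in (shared lock held), -1 on error.  On success *fd_out must
 * stay open for as long as the process uses the environment.
 */
int lockfile_probe_first(const char *lock_path, int *fd_out)
{
    int fd = open(lock_path, O_RDWR | O_CREAT, 0666);
    if (fd == -1)
        return -1;

    /* Non-blocking attempt: succeeds only if nobody else holds any lock. */
    if (set_lock(fd, F_WRLCK, F_SETLK) == 0) {
        *fd_out = fd;
        return 1;
    }

    /* Someone else is in (or recovering): wait for a shared lock. */
    if ((errno != EACCES && errno != EAGAIN) ||
        set_lock(fd, F_RDLCK, F_SETLKW) == -1) {
        close(fd);
        return -1;
    }
    *fd_out = fd;
    return 0;
}

/* Called by the first process once recovery is complete: downgrade the
 * exclusive lock to a shared one so waiting processes can proceed. */
int lockfile_recovery_done(int fd)
{
    return set_lock(fd, F_RDLCK, F_SETLK);
}
```

Two limitations of this sketch: fcntl() locks belong to the process, so they say nothing about threads within one process, and there is a window in which the last lock holder exits just after a newcomer's exclusive attempt has failed, leaving the newcomer alone without having run recovery.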

> How did George resolve this for mod_db4? (This doesn't directly help
> ra_svn or SSH tunneling though, but can provide us with some insights.)

Here's some email that he and I exchanged:

    From: George Schlossnagle <george@omniti.com>
    Subject: Re: Subversion [#11511]
    Date: Tue, 30 Nov 2004 13:22:49 -0500

    On Nov 30, 2004, at 1:03 PM, Keith Bostic wrote:

    > Hi, George -- Keith Bostic here.
    >
    > As you may know, the Subversion tool uses Berkeley DB. We've
    > had periodic problems with corruption in the database
    > environments used by Subversion. After some discussion with
    > Mike Pilato (copied on this email), I believe the problem is
    > that the database environment is not being properly recovered
    > in all instances.
    >
    > The problem is Subversion is itself a library, with different
    > top-layer interfaces, Apache and standalone administrative
    > programs among them. To solve this problem we're going to need
    > to find a way for the Subversion library to know if a thread of
    > control entering Subversion code is the first thread of control
    > to access the Berkeley DB database environment so it can run
    > recovery as it opens the database environment.
    >
    > This is exactly the problem you had to solve for the Apache
    > mod_db4 module. Is your solution written up anywhere? If
    > not, could you describe it for Mike and me?

    I use a shared-memory hash (the mm_hash.[ch] implementation which sits
    on top of libmm, but which could sit on something else) that tracks the
    reference count on the file. My wrapper around the open() function in
    the DB_ENV struct then looks like this:

    static int new_db_env_open(DB_ENV *dbenv, const char *db_home,
                               u_int32_t flags, int mode)
    {
        int ret = 666;
        DB_ENV *cached_dbenv;

        flags |= DB_INIT_MPOOL;
        /* if global ref count is 0, open for recovery */
        if (global_ref_count_get(db_home) == 0) {
            flags |= DB_RECOVER;
            flags |= DB_INIT_TXN;
            flags |= DB_CREATE;
        }
        /* reuse an environment this process has already opened */
        if ((cached_dbenv = retrieve_db_env(db_home)) != NULL) {
            memcpy(dbenv, cached_dbenv, sizeof(DB_ENV));
            ret = 0;
        }
        else if ((ret = old_db_env_open(dbenv, db_home, flags, mode)) == 0) {
            /* ... (body missing from the archived message) */
        }
        return ret;
    }

    If you have a single DBM file for a given subversion instance (I don't
    know how svn exactly works internally), you can also just use a sysv
    semaphore. The reason I didn't use that in mod_db4 is that it needed
    to be able to support simultaneously managing an arbitrary number of

    I hope that helps, and I'm happy to participate further in the
    discussion if that didn't fully answer your question.
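The SysV-semaphore alternative George mentions might look roughly like the sketch below. Everything named here (env_sem_open, env_enter, env_recovery_done, env_leave, the two-semaphore layout) is hypothetical and not taken from mod_db4 or Subversion. One semaphore is a gate that serializes entry and stays closed while the first process runs recovery; the other is the reference count. SEM_UNDO asks the kernel to roll back a process's contribution if it dies. The sketch also assumes a freshly created semaphore set starts at zero, which Linux provides but POSIX does not promise.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical layout of the two-semaphore set. */
enum { ENV_SEM_GATE = 0, ENV_SEM_REFS = 1 };

/* One semaphore set per environment directory, keyed by its path. */
int env_sem_open(const char *db_home)
{
    key_t key = ftok(db_home, 'S');
    if (key == (key_t)-1)
        return -1;
    return semget(key, 2, IPC_CREAT | 0600);
}

/* Block until the gate is free (value 0), then take it (value 1).
 * Both operations are applied atomically by semop(). */
static int gate_lock(int semid)
{
    struct sembuf ops[2] = {
        { .sem_num = ENV_SEM_GATE, .sem_op = 0, .sem_flg = 0 },
        { .sem_num = ENV_SEM_GATE, .sem_op = 1, .sem_flg = SEM_UNDO },
    };
    return semop(semid, ops, 2);
}

static int gate_unlock(int semid)
{
    struct sembuf op = { .sem_num = ENV_SEM_GATE, .sem_op = -1,
                         .sem_flg = SEM_UNDO };
    return semop(semid, &op, 1);
}

/*
 * Returns 1 if the caller is the first process in (reference count was 0);
 * the gate is then still held and the caller must run recovery and call
 * env_recovery_done().  Returns 0 if other processes are already in,
 * -1 on error.
 */
int env_enter(int semid)
{
    struct sembuf inc = { .sem_num = ENV_SEM_REFS, .sem_op = 1,
                          .sem_flg = SEM_UNDO };
    int refs;

    if (gate_lock(semid) == -1)
        return -1;
    refs = semctl(semid, ENV_SEM_REFS, GETVAL);
    if (refs == -1 || semop(semid, &inc, 1) == -1) {
        gate_unlock(semid);
        return -1;
    }
    if (refs > 0)
        return gate_unlock(semid) == -1 ? -1 : 0;
    return 1;   /* first one in: gate stays closed during recovery */
}

/* Reopen the gate once recovery has finished. */
int env_recovery_done(int semid)
{
    return gate_unlock(semid);
}

/* Drop this process's reference on a clean shutdown. */
int env_leave(int semid)
{
    struct sembuf dec = { .sem_num = ENV_SEM_REFS, .sem_op = -1,
                          .sem_flg = SEM_UNDO | IPC_NOWAIT };
    return semop(semid, &dec, 1);
}
```

A caller would do something like: if env_enter() returns 1, open the environment with DB_RECOVER and then call env_recovery_done(); if it returns 0, open it normally. As with the reference count above, this only detects the case where every process has gone away; it does not notice a single crashed process while others are still running.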



Keith Bostic bostic@sleepycat.com
Sleepycat Software Inc. keithbosticim (ymsgid)
118 Tower Rd. +1-781-259-3139
Lincoln, MA 01773 http://www.sleepycat.com

Received on Wed Dec 8 21:11:54 2004
