
Re: #739: Ensuring ACID in Subversion (aka watcher processes are fun)

From: mark benedetto king <bking_at_inquira.com>
Date: 2002-09-20 19:39:27 CEST

On Fri, Sep 20, 2002 at 11:29:12AM +0100, Philip Martin wrote:
>
> One thing we need to do is ensure that the BDB recovery process is
> robust. The documentation requires that no other process is using the
> database when you run recover. At the moment we don't have a way to
> ensure that. What we need is a filesystem lock in the db directory,
> such that when it is present svn_repos_open fails. Then the recovery
> process is
>
> Lock the repository so that new processes fail to open it
> $ svnadmin lock /path/to/repos
>
> Now check for existing processes that are using the DB
> $ ps
> $ lsof
> $ kill xxxx
> $ kill -9 xxxx
>
> Now run BDB recovery and clear the lock
> $ svnadmin recover /path/to/repository

What happens when svnadmin crashes after obtaining a lock?

You've got a stale lock file.

If you handle stale lock files by rm'ing them, we're back
into a lock-stealing scenario (how do you really know the
lock is stale?)

A user can know that no one else is mucking around in his WC.

An administrator frequently isn't quite so sure that no one
else is working on *exactly the same problem*.

We'd need a reference-counted lock file, which means we'd need
svnadmin not to exit.

So, you'd have something like:

$ svnadmin lock /path/to/repos
bash(svnadmin)> ps
bash(svnadmin)> lsof
bash(svnadmin)> kill xxxx
bash(svnadmin)> kill -9 xxxx
bash(svnadmin)> svnadmin recover /path/to/repository
bash(svnadmin)> exit

Then, if we're really careful about POSIX flock() semantics,
we could guarantee that

    1.) no two svnadmins are running at the same time
    2.) no new connections are created after the svnadmin runs

This would probably be a lot less effort than a "NetBDB" implementation,
and obviously would not adversely affect performance, etc. Also, this
functionality would be required for recovery after system crashes.
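
To make the flock() idea concrete, here is a minimal sketch in Python (Unix only, via the standard fcntl module). It is an illustration under stated assumptions, not anything svnadmin does today: the file name admin.lock and both function names are hypothetical. The point it shows is that the lock belongs to an open file rather than to the file's existence, so the kernel drops it when the holder exits or crashes and no stale lock is left behind.

```python
import fcntl
import os

LOCK_NAME = "admin.lock"  # hypothetical file name inside the repository's db directory


def acquire_admin_lock(db_dir):
    """Take the exclusive admin lock, or raise BlockingIOError if another admin holds it."""
    fh = open(os.path.join(db_dir, LOCK_NAME), "a")  # "a": create if missing, never truncate
    try:
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        fh.close()
        raise
    return fh  # the lock lasts only as long as this file stays open


def admin_lock_held(db_dir):
    """Probe for ordinary openers: True if some admin currently holds the lock."""
    with open(os.path.join(db_dir, LOCK_NAME), "a") as fh:
        try:
            fcntl.flock(fh, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        return False  # leaving the with-block closes fh and drops the shared lock
```

In this sketch the administrator's long-lived process would call acquire_admin_lock() and keep the returned file open while the subshell runs, and something like svn_repos_open would refuse to proceed while admin_lock_held() is true. Note that the probe only blocks new openers; a process that passed the probe just before the lock was taken still has to be found with ps/lsof as above.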

>
> That provides a secure recovery process, in the face of Subversion
> clients and servers. Obviously a user could write a program that

It's a secure recovery process, but it's a manual recovery process.
Personally, I don't want to have to run the command sequence above
every time someone hits ^C on their client. I'd much rather the
recovery process only be needed in the case of power-outage.

> bypasses svn_repos_open if they have sufficient OS/filesystem access,
> but then they can also use raw BDB calls, normal stdio, or a normal
> editor! If you are concerned about such cases, they are handled by
> the usual OS security measures.

Anyone who does those things deserves what they get. Actually, they
deserve worse. :-)

> At this stage I would argue that we have a "completely and utterly
> robust" system. Once a transaction has been committed it is secure,
> it will never be lost.

From the Jargon File:

    robust:

    Said of a system that has demonstrated an ability to recover
    gracefully from the whole range of exceptional inputs and situations
    in a given environment. One step below bulletproof. Carries the
    additional connotation of elegance in addition to just careful
    attention to detail. Compare smart, opposite: brittle.

I think that requiring manual locking, ps-ing, kill-ing, recovering, etc.
does not meet this definition of robustness.

>
> Now suppose you also want to run BDB recovery automatically. I
> probably would not do that myself, but no matter. Can we use Apache
> to do this? I'm not an Apache expert, but it does have a controlling
> process that remains in communication with its children. Could we
> provide an Apache module, or a mod_dav_svn directive, that causes
> Apache to detect children that disappear by dumping core, or children
> that hang and become unresponsive? Apache could then lock the
> repository to block new children, terminate any existing mod_dav_svn
> children and finally run repository recovery.
>
> Then, to have a system that automatically recovers a locked database
> you run Apache and only allow access through ra_dav.
>
> Is this possible? Does it satisfy your ACID requirements?
>

IANAAE, E. :-)

If we only allow access through ra_dav and all of those things can
be accomplished reliably with apache, then yes, I think it satisfies
them.
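
For what "detect children that disappear by dumping core" could look like in general, here is a rough sketch in Python. It is not Apache's mechanism and makes no claim about what Apache or mod_dav_svn provide; the names died_abnormally and watch and the recover callback are hypothetical. It only covers children killed by a signal (a core dump shows up that way), not children that hang.

```python
import subprocess


def died_abnormally(returncode):
    """True if a child was killed by a signal (Popen reports that as -signum)."""
    return returncode is not None and returncode < 0


def watch(children, recover):
    """Wait for every child; call recover() once if any was killed by a signal.

    children: iterable of subprocess.Popen objects.
    recover:  hypothetical callback that would lock the repository,
              stop the remaining workers and run recovery.
    Returns the pids of the children that died abnormally.
    """
    crashed = [c.pid for c in children if died_abnormally(c.wait())]
    if crashed:
        recover()
    return crashed
```

A real watcher would react as soon as one child dies rather than waiting for all of them, and would need a timeout or heartbeat to catch the unresponsive case.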

--ben

Received on Fri Sep 20 19:46:47 2002

This is an archived mail posted to the Subversion Dev mailing list.