
Re: #739: Ensuring ACID in Subversion (aka watcher processes are fun)

From: mark benedetto king <bking_at_inquira.com>
Date: 2002-09-20 18:46:45 CEST

On Fri, Sep 20, 2002 at 10:57:47AM +0100, Philip Martin wrote:
> A standard Unix fork/exec server.


> > The code for this child process is extremely thoroughly QAed
> That's a red herring, all Subversion code is reviewed. Expecting, or
> relying on, one part to be "thoroughly QAed" doesn't really help.

You're right, but it would still be nice if the server were small
enough to be easy to review completely, so that it is unlikely
to crash on its own. In particular, apache + mod_dav +
mod_dav_svn + libsvn_fs do not really meet this requirement, IMO.

> > so that it, itself, is unlikely to fail without releasing the
> > locks that it holds. Further, the listener mentioned in (2)
> > can detect when this (extremely rare) failure happens, and DTRT.
> >

This feature means that even if the small, simple server does
crash, things will be cleaned up. We're not, as you suggest,
"relying on" the stability of the service.

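A minimal sketch of the detect-and-clean-up idea, under stated
assumptions: the listener holds the read end of a pipe to the server,
so the kernel reports end-of-file when the server goes away for any
reason, including a crash. The names watch() and recover() are
hypothetical, and what "DTRT" actually involves (releasing locks,
running recovery) is left to the caller here.

```python
import subprocess
import sys


def watch(command, recover):
    """Run `command` and call `recover()` if it exits abnormally.

    The pipe stands in for the listener's connection to the server:
    read() returns only once the child has closed its end, which
    happens on a clean exit and on a crash alike.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    proc.stdout.read()       # blocks until the child is gone
    proc.stdout.close()
    status = proc.wait()     # collect the exit status
    if status != 0:
        recover()            # hypothetical cleanup hook
    return status


if __name__ == "__main__":
    # A child that dies without any cleanup of its own.
    crashing = [sys.executable, "-c", "import os; os._exit(3)"]
    watch(crashing, lambda: print("server died, cleaning up"))
```

The point of the sketch is only that the cleanup does not depend on
the server behaving well on its way out.
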
> > I think this may be the easiest safe strategy (though it does
> > require a daemon) (which might be auto-started, as mentioned
> > before).
> What worries me about this proposal is the performance impact on
> mod_dav_svn. We already have a sophisticated server, Apache, where
> the fork/exec stuff has been abstracted into the MPM modules. Adding
> this new server imposes the fork/exec model again. Will this degrade
> (Windows?) servers?

This is a very valid concern, and one I would expect from anyone with
experience with apache. However, my belief is that the two services
(HTTP and "NetBDB") are very different in *connection lifetime*.

HTTP connection lifetime: milliseconds (hopefully)
    (ignoring keep-alive optimizations)

    This fact is why HTTP *must* have an MPM infrastructure.

NetBDB connection lifetime: lifetime of client program.
    ra_local: seconds (or maybe tenths of a second, one day)
    ra_dav: lifetime of apache instance

    This fact is why NetBDB can probably get away without one.

In effect, ra_dav -> NetBDB would *leverage* the MPM features of
apache; the fewer times apache forks, the fewer NetBDB services
would be required.
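
The lifetime argument above can be sketched as follows. This is an
illustration only: FakeNetBDB and ApacheChild are hypothetical names,
and the in-memory dictionary stands in for a real service. Each
long-lived child opens its connection once and reuses it for every
HTTP request it handles, so the number of services tracks the number
of children, not the number of requests.

```python
class FakeNetBDB:
    """Hypothetical stand-in for a connection to one NetBDB service."""

    opened = 0  # how many services have been started in total

    def __init__(self):
        FakeNetBDB.opened += 1
        self._rows = {"rev": "42"}

    def get(self, key):
        return self._rows.get(key)


class ApacheChild:
    """Stands in for one long-lived apache child process."""

    def __init__(self, connect=FakeNetBDB):
        self._connect = connect
        self._conn = None

    def handle_request(self, key):
        # Connect on the first request only; later requests reuse it.
        if self._conn is None:
            self._conn = self._connect()
        return self._conn.get(key)


if __name__ == "__main__":
    children = [ApacheChild(), ApacheChild()]
    for _ in range(500):
        for child in children:
            child.handle_request("rev")
    print(FakeNetBDB.opened)  # one service per child
```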

Let me reiterate that this is *exactly* the model that Oracle uses;
that doesn't make it right, but they seem to be doing reasonably well
so far.

> Aside from the multi-processing issue, I'm also concerned about memory
> usage. Everything into and out of the database now requires memory to
> be allocated in both Apache and the new server. There is also the
> overhead of sending the requests over the local socket.

Yes; this is the price you pay for isolation. I'm not sure that, from
a performance standpoint, it makes sense to not pre-fetch additional
records, but, at least in theory, the service would only need enough
memory to hold the largest record in any of the tables. It could
conceivably even stream *this* record to the client. Practically,
I think it makes sense to start with the easiest implementation; with
an API established, the service and the protocol between the client
and service can be optimized at our leisure.
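
As a rough sketch of the streaming idea: if the service copies a
record to the client in fixed-size chunks, its memory use is bounded
by the chunk size and not by the record size. The function name
stream_record() and the 4096-byte default are assumptions made for
this example; nothing here reflects an actual NetBDB protocol.

```python
import io


def stream_record(source, sink, chunk_size=4096):
    """Copy `source` to `sink` holding at most `chunk_size` bytes.

    Both arguments are file-like objects; returns the bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:          # empty read means end of record
            break
        sink.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    record = io.BytesIO(b"x" * 10000)   # pretend this is a large record
    client = io.BytesIO()               # pretend this is the socket
    print(stream_record(record, client))
```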

> Now, ra_pipe will require some sort of svnd server, but that should be
> handling ra requests.

Actually, ra_pipe itself doesn't require any daemons (though typically
it will be used with ssh, which requires sshd).

Typically, ra_pipe runs a command like "ssh user@remotehost svnpipe".
Then ra_pipe communicates with this sub-process. All of the
daemon bits are taken care of by sshd.
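
A minimal sketch of that arrangement, assuming a line-oriented
request/response exchange (the real wire protocol is not described in
this thread). FAKE_SVNPIPE is a hypothetical local stand-in so the
example runs without a remote host; in practice the command would be
something like ["ssh", "user@remotehost", "svnpipe"].

```python
import subprocess
import sys

# Hypothetical stand-in for the remote program: it answers each
# request line with the same line prefixed by "ok ".
FAKE_SVNPIPE = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('ok ' + line)\n"
    "    sys.stdout.flush()\n",
]


def open_pipe(command):
    """Start the sub-process and attach to its stdin and stdout."""
    return subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def request(proc, line):
    """Send one request line and read back one response line."""
    proc.stdin.write(line + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


def close_pipe(proc):
    """Close our end; the child sees end-of-file and exits."""
    proc.stdin.close()
    status = proc.wait()
    proc.stdout.close()
    return status


if __name__ == "__main__":
    proc = open_pipe(FAKE_SVNPIPE)
    print(request(proc, "get-latest-rev"))
    close_pipe(proc)
```

Nothing in the client listens on a port or detaches from a terminal;
whatever starts the remote command owns those concerns.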


Received on Fri Sep 20 18:54:09 2002

This is an archived mail posted to the Subversion Dev mailing list.
