
Re: Command-line option for changing block-on-recover behavior considered unnecessary.

From: <kfogel_at_collab.net>
Date: 2004-08-30 21:51:17 CEST

> It's all good. But *don't wimp out on listing the lockers*.

IMHO the proposal is overly complex, and would be more maintenance burden
than it's worth. I wouldn't veto it, if the other developers also
think it's a good idea. But that little script has to get installed
somewhere, it has to be documented, we have to make sure it does not
cause problems on systems where it won't be invoked, make sure it
invokes lsof or whatever in the right way for the particular system
it's on, etc. The estimate of "two, three hours' work at the outside"
seems very low to me. Of course, that's your time you're
volunteering, so the estimate doesn't really matter -- I'm mainly
concerned about the ongoing maintenance and comprehensibility burden.

And for how much benefit? Suppose it's httpd, or svnserve being run
from inetd? In the former case, httpd is probably a child process.
Killing it might not help -- Apache will start up another one when the
next access comes in, which might well happen between the time when
lsof is called and the admin-invoked kill is issued. Unpredictably,
then, our helpful techniques might fail anyway. Sometimes they'll
work, sometimes they won't. That'll be hard to explain. Far better
for the admin to *understand* what is going on, and restart apache
after editing the appropriate conf file. Similar arguments apply to
svnserve.

I think the best balance of our effort vs admin convenience would be
to print out a message that says what's going on, speculates about the
sorts of daemon-style processes that might be responsible (i.e., tells
the admin to check Apache and/or svnserve), and then lets the admin Do
The Right Thing.

I don't deny that it's a burden on admins to ask that they know what
"The Right Thing" is for so many different systems, of which
Subversion is only one. But all my instincts are saying that we
really can't solve this well anyway, and that we'd spend far too much
time trying.

Best,
-Karl

"Eric S. Raymond" <esr@thyrsus.com> writes:
> kfogel@collab.net <kfogel@collab.net>:
> > We can't portably determine the other locking processes; attempting to
> > do so would be too much extra complexity.
>
> The demand for perfect portability is a red herring in this case.
> You're a Unix programmer; think like one and apply the following
> rules:
>
> 1. Never write what you can reuse.
> 2. Better to do 85% of the job than do 0% of it because you can't handle 15%.
> 3. When in doubt, use brute force.
>
> Here's the brute-force solution. You know the list of files with
> critical locks on them. When svnadmin recover detects that the
> database is locked, it should shell out to a dumb little script --
> we'll call it "list-all-lockers" -- that does this:
>
> 1. Determine if fuser is in a command directory. If so, shell out to
> fuser called on the critical file list. It will either generate
> the right report to stdout or an error status. On error status,
> complain. You're done.
>
> 2. Determine if lsof is in a command directory. If so, shell out to
> lsof called on the critical file list. It will either generate
> the right report to stdout or an error status. On error status,
> complain. You're done.
>
> 3. If neither of these works, punt with a complaint message. You're done.
>
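
The three steps quoted above can be illustrated with a minimal sketch.
It is written in Python rather than the shell the proposal calls for,
and the function name, its parameters, and the return convention are
all assumptions made for illustration; nothing like it ships with
Subversion. It also follows the quoted rule literally ("on error
status, complain"), even though both fuser and lsof typically exit
non-zero when no process has the files open, which a real script would
need to tell apart from a genuine failure.

```python
import shutil
import subprocess


def list_all_lockers(critical_files, which=shutil.which, run=subprocess.run):
    """Report which processes hold any of critical_files open.

    Hypothetical helper: tries fuser first, then lsof, then gives up.
    Returns (status, text): 0 with the tool's report, 1 if the tool
    that was found exited with an error, 2 if neither tool is installed.
    """
    for tool in ("fuser", "lsof"):
        path = which(tool)
        if path is None:
            continue  # not in any command directory; try the next tool
        result = run([path, *critical_files], capture_output=True, text=True)
        if result.returncode != 0:
            # Error status: complain and stop, as the quoted steps say.
            return 1, f"{tool} failed: {result.stderr.strip()}"
        return 0, result.stdout
    # Step 3: neither tool is available, so punt with a complaint.
    return 2, "neither fuser nor lsof found; cannot list lockers"
```

A caller would pass in whatever list of critical files svnadmin
already knows about and print the returned text. The `which` and `run`
parameters exist only so the lookup and the shell-out can be replaced
in tests.
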
> This is the patch I was going to write if nobody else got moving. Two
> dozen lines of shell. Maybe three lines of C. Two, three hours' work
> at the outside.
>
> This addition would handle Linux, Mac OS X and other Unixes, all of which
> can run bash scripts. svnadmin runs on the *repository* machine, of
> course, so the only people it won't help are people hosting
> repositories on Windows boxes -- and they'd be no worse off than they
> were before.
>
> > Plus, some people do run
> > repositories over very reliable remote filesystems, in which case
> > lsof/fuser is no help anyway (thanks to ghudson for pointing this out
> > in IRC).
>
> So lsof/fuser sez "Oops! Boss, I can't stat that file!", returns an
> error status, and you're at case 3. Yes, this will happen
> occasionally. Is it a good reason not to help the vast majority of
> admins for whom the simple shellout to fuser/lsof will work?
>
> Heck, no.
>
> Here's why it's our responsibility. *The admin doesn't know what the
> critical files are*. So even if he groks fuser/lsof, he can't know
> the proper mystic incantation to generate to find out what the locking
> processes are.
>
> Our choices, therefore, are simple:
>
> 1. Punt the problem. But I've flamed at length about why this would be
> wrong, and the list seems to have taken that critique to heart.
>
> 2. Expose svn's guts by uttering a list of the critical files so the
> admin can run fuser/lsof by hand.
>
> 3. Delegate the job of finding the lockers to plugins that may vary
> depending on the host OS (what I've suggested).
>
> Really, is there any reason the distribution shouldn't include
> a `list-all-lockers` script for each port?
>
> > Detecting and auto-killing competing processes, for --force,
> > would be even harder and more error-prone. "I ran 'svnadmin recover'
> > and _what_ died??"
>
> Granted. I had more or less come to the conclusion that this wasn't
> practical myself.
>
> > But we can make 'svnadmin recover' much friendlier than it is now (and
> > IMHO, friendly enough) without going into such complexity:
> >
> > By default, 'svnadmin recover' fails immediately if it can't get a
> > lock, otherwise takes the lock and begins recovering, in
> > non-interruptible mode.
> >
> > But when passed the '--wait' option, if it can't get the lock, it
> > prints out a message saying:
> >
> > "Waiting for lock. Currently, at least one other process has the
> > repository open; recovery will start when the repository is
> > released. Press ^C to give up."
> >
> > Then when it gets the lock, it prints out a second message:
> >
> > "Lock acquired, starting recovery. This is non-interruptible."
>
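
The default and '--wait' behaviors quoted above can be sketched with
an advisory file lock. This is a minimal sketch, assuming a Unix host
and Python's fcntl.flock; the function name, the `wait` flag, and the
choice of flock are illustrative assumptions and say nothing about how
svnadmin actually takes its lock. The sketch prints the second message
only after it has had to wait, and it does not attempt the
non-interruptible recovery step itself.

```python
import fcntl

WAIT_MESSAGE = (
    "Waiting for lock. Currently, at least one other process has the\n"
    "repository open; recovery will start when the repository is\n"
    "released. Press ^C to give up."
)
ACQUIRED_MESSAGE = "Lock acquired, starting recovery. This is non-interruptible."


def acquire_recovery_lock(lock_file, wait=False, log=print):
    """Take an exclusive lock on the open file lock_file.

    Hypothetical helper. Returns True once the lock is held, or False
    if the lock is busy and wait is not set.
    """
    try:
        # Non-blocking attempt: succeeds at once or raises if busy.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        if not wait:
            return False  # default behavior: fail immediately
    log(WAIT_MESSAGE)
    # Blocking attempt; ^C surfaces here as KeyboardInterrupt.
    fcntl.flock(lock_file, fcntl.LOCK_EX)
    log(ACQUIRED_MESSAGE)
    return True
```

Trying the non-blocking lock first keeps the default path free of any
output, so the waiting message appears only when another process
really does hold the lock.
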
> It's all good. But *don't wimp out on listing the lockers*.
> --
> Eric S. Raymond

Received on Mon Aug 30 23:31:19 2004

This is an archived mail posted to the Subversion Dev mailing list.
