
RE: Repository "hung"

From: Peter Howard <pjh_at_coastal.net.au>
Date: 2003-01-06 08:38:23 CET

> -----Original Message-----
> From: Peter Howard [mailto:pjh@coastal.net.au]
> Sent: Thursday, December 19, 2002 7:07 AM
> To: Brandon Ehle
> Cc: Subversion Dev list
> Subject: RE: Repository "hung"
>
>
>
>
> > -----Original Message-----
> > From: Brandon Ehle [mailto:azverkan@yahoo.com]
> > Sent: Thursday, December 19, 2002 4:18 AM
> > To: Peter Howard
> > Cc: Subversion Dev list
> > Subject: Re: Repository "hung"
> >
> >
> > >
> > >
> > >Some 10 minutes later with nothing more, I ctrl-c the job. Think. DAMN!
> > >apache is still running. Stop apache. Now try to recover. Same result
> > >"Please stand by . . ." then nothing. Checking the process usage, svnadmin
> > >is sitting at 10-15% of CPU.
> > >
> > >
> > Verify that ALL the "httpd" processes are gone, as shutting down apache
> > normally when something wedges the repository rarely results in all
> > apache processes exiting. Then manually remove the leaked semaphores
> > with "ipcs" and "ipcrm -s". Any semaphores under the user/group
> > "apache" that are still alive after apache has exited need to be removed;
> > if not, your machine will eventually run out of semaphores and need a
> > reboot.
> >
> > >Check filesystem. It's full. Bother. Remove heaps of stuff. Filesystem
> > >now down to 60%. Try recover again. Same result "Please stand by . . ."
> > >then nothing more after 5-10 minutes.
> > >
> > >
> > Before running recover, go to the repository/db directory and run "lsof
> > *". If you see anything accessing the files, kill it. Then run
> > recover. While recover is running you can use "lsof *" to watch its
> > progress (in a catastrophic recovery, it will walk through the log files
> > a few at a time).
>
> Tried to do that, but it had fixed itself :-) I shut the server down last
> night in frustration and only started it again now (9 hours later) and the
> recover took about 2 seconds. Remote access works now. I _had_ done a
> reboot last night but that didn't get things working. There were also a
> couple of files with access permissions in the db dir, but again I had fixed
> that as I went last night, so why now?
>
> So that leaves me moderately bemused as to the exact problem. But if it
> happens again, I'll check the semaphore situation.

Guess what? It did it to me again today. This time I checked the
semaphore state before "fixing" anything (which I have so far failed to
magically do). No semaphores, one shared memory segment. If I leave
svnadmin recover running long enough, it dies with a seg fault.
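
For anyone repeating the semaphore check suggested earlier in the thread, a
rough Python sketch of it follows. It assumes the Linux "ipcs -s" column
layout (key, semid, owner, perms, nsems); the function name and the sample
text are made up for illustration and are not output from my server.

```python
def leaked_semaphores(ipcs_output, owner="apache"):
    """Return the semaphore ids in `ipcs -s` text that belong to `owner`."""
    semids = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows look like: key semid owner perms nsems (key starts with 0x).
        if len(fields) >= 5 and fields[0].startswith("0x") and fields[2] == owner:
            semids.append(int(fields[1]))
    return semids


# Hypothetical sample in the assumed `ipcs -s` layout.
SAMPLE = """\
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 32768      apache     600        1
0x00000000 65537      apache     600        1
0x0052e2c1 98306      postgres   600        17
"""

# Print the removal commands instead of running them, so they can be reviewed.
for semid in leaked_semaphores(SAMPLE):
    print("ipcrm -s", semid)
```

The sketch only prints "ipcrm -s" commands; it should be run against real
"ipcs -s" output only after every httpd process has exited.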

I had lsof running on a 20 second loop (+r 20) while svnadmin was running.
The final loop listed the following files being accessed:

./db/log.0000000001
./db/nodes
./db/revisions
./db/transactions
./db/copies
./db/changes
./db/representations
./db/strings

There are 165 revisions in the repository, but recover never got past the
first logfile.
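
To make it easier to see whether recover ever moves on to a later log file,
the names lsof reports can be reduced to just the log.* entries. This is only
a sketch: it assumes lsof's field output ("lsof -Fn", where file-name lines
start with "n"), and the function name, the /repo/db path and the pid in the
sample are hypothetical.

```python
import os


def open_log_files(lsof_field_output):
    """Return the sorted log.* file names found in `lsof -Fn` style text."""
    logs = set()
    for line in lsof_field_output.splitlines():
        # In lsof field output, file-name lines start with the letter "n".
        if line.startswith("n"):
            name = os.path.basename(line[1:])
            if name.startswith("log."):
                logs.add(name)
    return sorted(logs)


# Hypothetical field output, modelled on the file list above.
SAMPLE = """\
p1234
n/repo/db/log.0000000001
n/repo/db/nodes
n/repo/db/strings
"""

print(open_log_files(SAMPLE))
```

If the list printed on each pass of the lsof loop never changes, recover is
stuck on the same log file.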

Suggestions? Ideas? Note: I am still running 0.15, so does this sound like
anything that has since been fixed?

Thanks

PJH

Received on Mon Jan 6 08:39:15 2003

This is an archived mail posted to the Subversion Dev mailing list.