
Re: svn.collab.net intermittent failures

From: <kfogel_at_collab.net>
Date: 2003-04-12 21:18:38 CEST

Branko Čibej <brane@xbc.nu> writes:
> 99% here. Can anyone remember when the repo started failing? Did you
> make any changes on morbius?

Yes -- the failures started exactly when we upgraded to BDB 4.1.25.

(This also meant a dump/load of the entire repository, of course, so I
suppose BDB wasn't the only variable...)

Here is the original mail I sent, telling the chronology:

   From: Karl Fogel <kfogel@newton.ch.collab.net>
   To: SVN Developers Mailing List <dev@subversion.tigris.org>
   Cc: Blair Zajac <blair@orcaware.com>
   Subject: svn.collab.net intermittent failures
   Date: 11 Apr 2003 14:54:16 -0500
   
   Sorry about the downtime, folks.
   
   We upgraded svn.collab.net to use Berkeley DB 4.1.25 recently, because
   we were seeing problems whereby bdb log files weren't being released.
   Certain patches and ChangeLog entries at sleepycat.com indicated that
   upgrading to 4.1.25 might solve the problem. (We left Apache at
   2.0.44 and Subversion at 0.18.1, however.)
   
   After that, we saw different problems. Mike Pilato was doing all
   this, so I can't describe the problems with complete precision, but
   basically requests in the `nodes' table kept causing DB_RUNRECOVERY to
   be returned. Recovery itself would go smoothly, and then the
   repository would be okay until the next time, ten minutes later.
   
   He next tried upgrading Subversion to 0.20.1, and then brought Apache
   to HEAD, then went down to the released Apache 2.0.45. Finally, he
   brought the Subversion server code to HEAD. The same problems were
   occurring at all the intermediate stages.
   
   That's where the repository is right now: Apache 2.0.45 and Subversion
   HEAD. So far, it appears to be okay. But if there are more problems,
   I'll take it back to BDB 4.0.14 (and decide at that time whether to leave
   the server at HEAD, or put it back to 0.20.1).
   
   Reverting BDB would not be ideal, because then we'd be back to the
   accumulating logfiles problem. But that's liveable, at least, and
   gives us time to figure out what the heck happened with 4.1.25.
   
   Sorry for the inconvenience. I'll be watching the repository this
   weekend, surely not alone :-).
   
   Blair, I don't know whether this is related to your problems (quoted
   below), but I notice that you also are using BDB 4.1.25, and Apache
   HEAD.
   
   -Karl
   
   Blair Zajac <blair@orcaware.com> writes:
> > Whoa. No -- not even any reason to think this is the same problem as
> > the ones we're experiencing with svn.collab.net right now, really.
> >
> > What version of server? Of Apache? What OS? What exact version of
> > Berkeley? Was the database created with the same Berkeley that you
> > later ran db_dump with? That error you saw
>
> OK, here you go.
>
> RedHat 9 with all patches.
> Apache/2.0.46-dev (Unix) mod_ssl/2.0.46-dev OpenSSL/0.9.7a DAV/2 SVN/0.20.1+
> svn rev 5592
> apr/apr-util/httpd from CVS on 2003-04-08.
> Berkeley DB 4.1.25.
>
> Yes, the same binaries and libraries are used.
>
> > > $ /opt/i386-linux/db-4.1/bin/db_dump changes > z
> > > $ /opt/i386-linux/db-4.1/bin/db_load changes.1 < z
> > > db_load: Length improper for fixed length record 4087
> > > db_load: Invalid argument
> >
> > ...is quite disturbing :-). Can you examine `z' and identify the bad
> > record?
>
> Hmmm, how do I do that?
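
One rough way to look for the bad record is sketched below. It is an
illustration, not a tested recipe. It assumes `z' is in db_dump's default
output format: header lines up to HEADER=END, then one hex-encoded item per
line, each starting with a single space, up to DATA=END. The function names
are made up for the example. It prints the line number and decoded length of
every item whose length differs from re_len (if the header has one) or
otherwise from the most common length. Whether the 4087 in the db_load error
is a length or a record number is not clear from the message, so the sketch
only reports lengths.

```python
import sys
from collections import Counter


def scan_dump(lines):
    """Return (header, records); records are (line_number, byte_length) pairs."""
    header = {}
    records = []
    in_data = False
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not in_data:
            if line == "HEADER=END":
                in_data = True
            elif "=" in line:
                name, _, value = line.partition("=")
                header[name] = value
            continue
        if line == "DATA=END":
            in_data = False
            continue
        # Assumption: a data line is one space followed by pairs of hex digits,
        # so the decoded length is half the number of characters after the space.
        records.append((lineno, len(line[1:]) // 2))
    return header, records


def odd_records(header, records):
    """Return the records whose length differs from the expected length."""
    if not records:
        return []
    if "re_len" in header:
        expected = int(header["re_len"])
    else:
        # No fixed length declared: treat the most common length as the norm.
        expected = Counter(n for _, n in records).most_common(1)[0][0]
    return [(lineno, n) for lineno, n in records if n != expected]


if __name__ == "__main__":
    with open(sys.argv[1]) as f:
        header, records = scan_dump(f)
    for lineno, n in odd_records(header, records):
        print(f"line {lineno}: {n} bytes")
```

Saved as, say, scan_dump.py (a hypothetical name), it would be run as
`python scan_dump.py z'. If the file was dumped with -p (printable format),
the hex assumption does not hold and the lengths it reports will be wrong.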

Received on Sat Apr 12 22:02:18 2003

This is an archived mail posted to the Subversion Dev mailing list.