Michel Jouvin <email@example.com> writes:
> We started a production Subversion server a couple of months ago. We are
> now running Apache 2.0.52 + Subversion 1.1.3 + Db 4.2.52 on Tru64 Unix 5.1B.
> We quite frequently experience repository database corruption on all of our
> repositories (7, with very different sizes). In previous versions of
> Subversion (before 1.1.2 I would say), we were generally able to fix these
> corruptions with svnadmin recover. We are now experiencing more and more
> corruptions that can't be fixed (svnadmin recover fails), where the only
> solution is a repository restore from backup.
> The first corruptions we experienced generally occurred during commit,
> especially on large repositories. When we looked at possible causes for
> these corruptions, we found that one reason was we were running 2 Apache
> servers on 2 different nodes in a cluster configuration (cluster file
> system, no NFS involved). We shut down one of the servers and it more or
> less solved the corruption during commits. This remains strange, as the
> cluster file system has pure local file system semantics and we never
> experienced such problems with other databases or other Db usage.
> Now we are experiencing corruptions not related to any repository write. We have
> log files showing successful repository access through HTTP GET followed by
> a GET failure due to database corruption without any repository
> modification in between and without any Apache problem/restart. We
> suspected that these corruptions were related to an Apache restart during a
> transaction but we now have evidence that corruption can occur at any time
> without any repository modification. We have Apache log files and corrupted
> repository copies.
> Generally svnadmin recover fails on these corruptions. Sometimes we were
> able to fix corruptions by recover + verify as documented in a note. We
> also have a directory that we restored from backup and needed to repair
> before having it accessible again. In this case we had to use recover +
> verify. And verify + recover definitely corrupts the repository.
> Could you please let us know whether this is a known problem (I saw a couple of
> issue entries describing similar problems, but it is unclear whether they are
> really the same) and whether there is any workaround? Is FSFS an alternative
> to consider?
> Thanks in advance for any help. Let us know what materials we could provide to
> help in troubleshooting, if this seems necessary.
Hmmm. I don't know why you're having these problems, but they are not
unfamiliar to us. Maybe it has something to do with being on a 64-bit
system, though that's just a wild guess; I have nothing to back it up.
Yes, I suggest using FSFS at least for now. We're working on
improving Subversion's usage of Berkeley DB (the problems are with how
we use it, not with BDB itself). Very few people have problems as
severe as you are experiencing, and these problems have been hard for
us to reproduce reliably. You sound like you can reproduce them
pretty reliably, though, so if you want to resend your description to
the firstname.lastname@example.org list, there might (can't promise) be a
developer interested in using you as a reproduction environment, if
you're willing. (I wish I could, but my personal stack is full right now.)
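For anyone following the FSFS suggestion above, a dump/load migration could be sketched roughly as follows. This is only a minimal sketch and not part of the original exchange: the function names and repository paths are hypothetical, and it assumes the Berkeley DB repository is still readable (for example, a copy restored from backup) and that svnadmin 1.1 or later is on the PATH.

```python
import subprocess


def fsfs_migration_steps(old_repo, new_repo):
    """Return the svnadmin command lines for a dump/load move to FSFS.

    old_repo and new_repo are hypothetical paths supplied by the caller.
    """
    return [
        ["svnadmin", "dump", old_repo],                         # writes the dump to stdout
        ["svnadmin", "create", "--fs-type", "fsfs", new_repo],  # new, empty FSFS repository
        ["svnadmin", "load", new_repo],                         # reads the dump from stdin
    ]


def migrate(old_repo, new_repo, dump_file):
    """Run the three steps, passing the dump through dump_file on disk."""
    dump_cmd, create_cmd, load_cmd = fsfs_migration_steps(old_repo, new_repo)
    with open(dump_file, "wb") as out:
        subprocess.run(dump_cmd, stdout=out, check=True)
    subprocess.run(create_cmd, check=True)
    with open(dump_file, "rb") as src:
        subprocess.run(load_cmd, stdin=src, check=True)
```

Calling fsfs_migration_steps() on its own only builds the command lines, so they can be reviewed before migrate() actually runs anything. Hook scripts and configuration files are not carried over by dump/load and would need to be copied separately.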
Sorry for the troubles. I hope the situation improves for you,
P.S. By the way, we try not to say "corruption" if data has not been
corrupted. The issue is that your data is not accessible, but
it has not been corrupted, from your description.
Received on Fri Feb 18 18:26:47 2005