Thanks for the suggestion. We'll follow up with the fsfsverify guy to see
if he can help.
Interestingly, today there has been a change in the problem. A new commit of
the file that was having the checksum error seems to have:
A) worked (surprising us, as we assumed it would try to compute a delta and
fail)
B) masked the earlier problem
That is to say, an update or checkout of the repository as a whole is back to
working now without any errors. Running "svnadmin dump" or "svnadmin verify"
still fails with the same error, so the earlier revision is still corrupted;
it's just somehow ignored now. Presumably SVN has some way to determine when
it doesn't need to look at an earlier revision to compute the latest copy of
the file, and that has now kicked in.
So it looks like the patch tool we needed was to run "svn commit" again in
some particular way :)
Of course, this still leaves us with a partially corrupted repository and
no way to run a dump/restore, but at least it's semi-functional again.
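A rough sketch of how one might narrow down which revisions are affected:
dump each revision on its own with "svnadmin dump -r N --incremental" and
note the ones that fail. This is only an illustration. The function names
are made up, it assumes svnadmin is on the PATH, and a later revision that
is stored as a delta against the corrupt one may also be reported as bad.

```python
import subprocess


def dump_ok(repo_path, rev):
    """Return True if a single-revision `svnadmin dump` exits cleanly.

    Assumes `svnadmin` is on the PATH; the dump output is discarded.
    """
    result = subprocess.run(
        ["svnadmin", "dump", repo_path, "-r", str(rev),
         "--incremental", "--quiet"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_bad_revisions(repo_path, youngest, check=dump_ok):
    """Return the revisions in 0..youngest for which `check` reports failure.

    `check` is a hypothetical hook so the loop can be exercised without a
    real repository; by default it shells out to svnadmin.
    """
    return [rev for rev in range(youngest + 1) if not check(repo_path, rev)]


# Example call (the path and revision number are placeholders):
# bad = find_bad_revisions("/path/to/repo", 1234)
```

The youngest revision number can be read with "svnlook youngest REPOS_PATH".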
If any of the SVN devs would like a copy of the corrupt revision or any
other info about the repro case here, I'm happy to pass it along.
Doug
-----Original Message-----
From: Ryan Schmidt [mailto:subversion-2007b@ryandesign.com]
Sent: Monday, July 09, 2007 12:50 AM
To: Douglas Pearson
Cc: users@subversion.tigris.org
Subject: Re: Serious error in repository--help needed
On Jul 8, 2007, at 22:56, Douglas Pearson wrote:
> That's very disappointing to learn that we're really sunk here. Anyone
> else have any other ideas beyond a complete rollback of the repository
> and presumably days of struggling to find all changes and get them back
> into the repository?
>
> As to the cause, the description on the "fsfsverify" page
> http://www.szakmeister.net/blog/?page_id=16 (and the existence of that
> patch tool, which BTW is a year and a half old) seem to describe our
> situation pretty accurately. Committing a large set of changes (this
> revision is about 40MB) through apache2 can lead to an invalid
> revision on disk. As I say, we've seen that pattern several times,
> but only when working with large commits and never consistently.
>
> However, there's something different about this corruption than the
> earlier ones as the fsfsverify script fails to correct this problem.
>
> As to whether there should be recovery tools, well frankly that seems
> like a no-brainer to me. If SVN can cause corruption under normal
> usage (not some weird power outage or disk failure case here--just a
> normal commit) then it should be able to recover from those
> corruptions. It may be the error is inside Apache, but even if that's
> true SVN needs to be able to get back to a working state.
As I understand it, fsfsverify exists because the Subversion developers have
not yet been able to properly track down what causes these corruptions to
occur. They have only been able to write this script which corrects the
corruptions after the fact. They have presumably not run into the problem
themselves, else the problem would be easier for them to solve. The
corruption you have now experienced, which could not be corrected by
fsfsverify, could be significant. You should contact the developer of
fsfsverify and provide him with as much information as you can. If this
corruption is indeed similar to that already handled by fsfsverify, perhaps
fsfsverify can be enhanced to also handle your type of corruption.
And perhaps this will even lead them one step closer to finding the cause of
the problem in the first place, and fixing it once and for all in
Subversion.
Received on Mon Jul 9 23:59:46 2007