
FSFS corruption in 1.4.2

From: Ryan Kuester <rkuester_at_kspace.net>
Date: 2007-03-23 20:04:09 CET

First of all, a big thank you to the Subversion community for a
wonderful version control system. The concepts, utilities, and
documentation are all first rate. It has become a core component of
the business at both my former and present employer, and a strongly
positive example of open source software.

I've just hit upon what might be another example of FSFS corruption,
using 1.4.2 through Apache. Operations of various kinds (remote and
local svn co, svnadmin {dump,verify}) requiring a particular
revision's data cause the invoked utility to crash due to SIGABRT
with no error output other than an `Abort' from the shell.
Operations along other file tree paths or on earlier revisions
involving the affected file tree path are fine. svnadmin dump and
verify both work their way up to this particular revision and then crash.
Incrementally dumping this particular revision crashes, while dumping
the surrounding revisions incrementally succeeds.
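
The revision-by-revision probing described above could be scripted roughly as
follows. This is only a sketch: find_bad_revisions and dump_revision are
made-up names, and it assumes svnadmin is on the PATH. The one real command
it runs is the same `svnadmin dump --incremental -r N` shown in the strace
below. Python reports a process killed by a signal as a negative return
code, so a SIGABRT shows up as -6.

```python
import subprocess


def dump_revision(repo, rev):
    """Dump one revision incrementally, discard the output, return the exit status."""
    proc = subprocess.run(
        ["svnadmin", "dump", "--incremental", "-r", str(rev), repo],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A negative value means the process died from a signal (-6 is SIGABRT).
    return proc.returncode


def find_bad_revisions(repo, first, last, probe=dump_revision):
    """Return (revision, status) pairs for every revision whose dump fails.

    `probe` is a hypothetical hook so the loop can be exercised without a
    real repository; by default it shells out to svnadmin.
    """
    bad = []
    for rev in range(first, last + 1):
        status = probe(repo, rev)
        if status != 0:
            bad.append((rev, status))
    return bad
```

Calling find_bad_revisions("production", 1, youngest), with youngest taken
from `svnlook youngest production`, would list every revision that cannot be
dumped on its own.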

Using strace on svnadmin dump shows:

$ strace svnadmin dump --incremental -r 1078 production
[....snipped....]
read(3, "\346r\300SX\315/c\317\35\316I\271oF2\363\32\343\267\271"..., 4096) = 4096
read(3, "\276??\225\f*\36\260\310B\27\221_\27Z\223\225.\fO\242\v"..., 4096) = 4096
mmap2(NULL, 2147491840, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2147622912, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0xb7845000
munmap(0xb7845000, 765952) = 0
munmap(0xb7a00000, 282624) = 0
mprotect(0xb7900000, 135168, PROT_READ|PROT_WRITE) = 0
mmap2(NULL, 2147491840, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mprotect(0xb7921000, 2147356672, PROT_READ|PROT_WRITE) = -1 ENOMEM (Cannot allocate memory)
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
tgkill(4629, 4629, SIGABRT) = 0
--- SIGABRT (Aborted) @ 0 (0) ---
+++ killed by SIGABRT +++

I guess that's just a byproduct of the corruption in the revision
file, but the utilities should probably handle that more gracefully.
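
For scale, the failing requests in the trace can be compared against 2 GiB.
The sizes below are copied from the strace output; the suggestion that a
length read from the damaged file ends up as an allocation size is only a
guess, not something the trace proves.

```python
TWO_GIB = 2 ** 31  # 2147483648 bytes

# Sizes of the failing mmap2/mprotect requests, copied from the trace.
requested = [2147491840, 2147622912, 2147356672]


def offset_from_two_gib(size):
    """Return how far a request size lies above (+) or below (-) 2 GiB."""
    return size - TWO_GIB


for size in requested:
    print(size, offset_from_two_gib(size))
# 2147491840 8192
# 2147622912 139264
# 2147356672 -126976
```

Every request sits within about 136 KiB of 2 GiB, which a process with a
32-bit address space (note the 0xb7... addresses) has little chance of
satisfying.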

I'm not sure how to provide any more debugging help. I'd be happy to
forward the corrupted revision file to any developers. The only
unusual thing about our svn server is that it runs in a VMware ESX
Server virtual machine. This *could* be just plain vanilla file
system corruption, though I can find no other evidence of that
possibility. I saw a fsfsverify.py script discussed on the list in
early 2006, but it's not applicable to 1.4.x.

What I could really use is some advice on recovering the repository.
We didn't notice the problem until a few days after the fact, so many
changes along other paths have been committed since the faulty
revision. Either help on a direct way to fix that revision file, or
advice on a good way to recreate the repository despite this revision
would be appreciated. I'm going to go play with dump/load now.
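
One possible shape for the dump/load route is to dump everything before the
bad revision, dump everything after it incrementally, and load both streams
into a fresh repository. The sketch below only builds the command lines and
runs nothing. plan_recovery, the new repository name, and the youngest
revision number in the usage note are assumptions for illustration. Whatever
was committed in the skipped revision is lost, the later revisions are
renumbered one lower, and the load may fail if a later revision depends on
paths that only the skipped revision created.

```python
def plan_recovery(repo, new_repo, bad_rev, youngest):
    """Build svnadmin command lines that copy a repository around one revision.

    `youngest` is the newest revision number, as reported by
    `svnlook youngest REPO`. The output of each dump command is meant to be
    piped into `svnadmin load NEW_REPO`, in the order returned.
    """
    if not 0 < bad_rev <= youngest:
        raise ValueError("bad_rev must lie between 1 and youngest")
    plan = [["svnadmin", "create", new_repo]]
    # Everything up to, but not including, the unreadable revision.
    plan.append(["svnadmin", "dump", "-r", "0:%d" % (bad_rev - 1), repo])
    if bad_rev < youngest:
        # --incremental makes the first revision of this range a delta
        # against its predecessor instead of a full copy of the tree.
        plan.append(["svnadmin", "dump", "--incremental",
                     "-r", "%d:%d" % (bad_rev + 1, youngest), repo])
    return plan
```

For example, plan_recovery("production", "production-new", 1078, 1100)
yields a create command, a dump of 0:1077, and an incremental dump of
1079:1100.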

Kind of a scary problem if it's really FSFS at fault. Let me know
what I can do to help debug this.

--
Ryan
Received on Fri Mar 23 20:04:18 2007

This is an archived mail posted to the Subversion Users mailing list.
