[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Captured FSFS corruption bug in-the-wild was Re: Fixing the FSFS corruption bug

From: Justin Erenkrantz <justin_at_erenkrantz.com>
Date: 2007-08-27 04:11:49 CEST

On 9/5/06, Malcolm Rowe <malcolm-svn-dev@farside.org.uk> wrote:
> I think I have an idea about what's causing the FSFS corruption problem.
>
> I've managed to produce a reproduction program that successfully commits
> a corrupt transaction, and contrary to what I was expecting, it doesn't
> involve threaded races, disk write errors, or ignoring API error returns.
>
> Instead, it simply avoids closing the representation write stream,
> and continues to write a new representation, like so:
>
> open transaction
> get representation write stream
> write some of the representation
> open transaction
> get representation write stream
> write all of the representation
> close representation write stream
> commit transaction
...
> Importantly, I _haven't_ yet proved that mod_dav_svn (or any other caller)
> is behaving in this way, but the corruption I'm getting from my program
> _exactly_ matches the type of corruption we've seen reported.
>
> I hypothesise that something is causing mod_dav_svn to stop writing the
> representation (perhaps the client connection is terminated?) and then
> the representation is resent (somehow), causing the problem shown above.

We just saw this FSFS corruption bug over at apache.org. I don't have
a foggy clue if you ever confirmed this hypothesis, but I think I can
provide some evidence of it by way of log files.

Note the two PUTs - note that the retry of the PUT happens *before*
the first connection is even cleared/cleaned up by httpd. Weird.

My understanding is that this set of patches is only in 1.5, right?
So, in order to protect ourselves, we have to upgrade to 1.5?

Anyway, sanitized log file below. We do have mod_log_forensic output
as well. -- justin

svn.apache.org 1.2.3.4 - - [27/Jan/2007:18:19:10 -0800] "OPTIONS
/repos/tck/private HTTP/1.1" 401 408 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 122c5:45bc081e:244 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:10 -0800] "OPTIONS
/repos/tck/private HTTP/1.1" 200 147 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 122c5:45bc081e:245 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:10 -0800] "MKACTIVITY
/repos/tck/!svn/act/035b1d41-b17b-124b-98e1-87eb3948e263 HTTP/1.1" 201
406 "-" "SVN/1.4.2 (r22196) neon/0.26.1" 122c5:45bc081e:246 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:10 -0800] "PROPFIND
/repos/tck/private HTTP/1.1" 207 268 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 122c5:45bc081e:247 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:10 -0800] "PROPFIND
/repos/tck/!svn/vcc/default HTTP/1.1" 207 246 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 122c5:45bc081e:248 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:11 -0800] "CHECKOUT
/repos/tck/!svn/bln/1817 HTTP/1.1" 201 423 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 122c5:45bc081f:249 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:11 -0800] "PROPPATCH
/repos/tck/!svn/wbl/035b1d41-b17b-124b-98e1-87eb3948e263/1817
HTTP/1.1" 207 357 "-" "SVN/1.4.2 (r22196) neon/0.26.1"
122c5:45bc081f:24a 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:11 -0800] "PROPFIND
/repos/tck/private HTTP/1.1" 207 258 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 122c5:45bc081f:24b 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:11 -0800] "CHECKOUT
/repos/tck/!svn/ver/1703/private HTTP/1.1" 201 446 "-" "SVN/1.4.2
(r22196) neon/0.26.1" 122c5:45bc081f:24c 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:11 -0800] "PROPFIND
/repos/tck/private/file-doc.zip HTTP/1.1" 404 336 "-" "SVN/1.4.2
(r22196) neon/0.26.1" 122c5:45bc081f:24d 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:11 -0800] "PROPFIND
/repos/tck/private/file.zip HTTP/1.1" 404 333 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 122c5:45bc081f:24e 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:19:20 -0800] "PUT
/repos/tck/!svn/wrk/035b1d41-b17b-124b-98e1-87eb3948e263/private/file.zip
HTTP/1.1" 500 721 "-" "SVN/1.4.2 (r22196) neon/0.26.1"
122c5:45bc0828:24f 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:31:53 -0800] "PUT
/repos/tck/!svn/wrk/035b1d41-b17b-124b-98e1-87eb3948e263/private/file.zip
HTTP/1.1" 204 - "-" "SVN/1.4.2 (r22196) neon/0.26.1"
120df:45bc0b19:617 443
[Sat Jan 27 18:31:58 2007] [error] [client x.x.x.x] (54)Connection
reset by peer: Could not get next bucket brigade [500, #0]
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:46:02 -0800] "PROPPATCH
/repos/tck/!svn/wrk/035b1d41-b17b-124b-98e1-87eb3948e263/private/file.zip
HTTP/1.1" 207 511 "-" "SVN/1.4.2 (r22196) neon/0.26.1"
120df:45bc0e6a:618 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:46:03 -0800] "PUT
/repos/tck/!svn/wrk/035b1d41-b17b-124b-98e1-87eb3948e263/private/file-doc.zip
HTTP/1.1" 201 456 "-" "SVN/1.4.2 (r22196) neon/0.26.1"
120df:45bc0e6b:619 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:46:49 -0800] "PROPPATCH
/repos/tck/!svn/wrk/035b1d41-b17b-124b-98e1-87eb3948e263/private/file-doc.zip
HTTP/1.1" 207 515 "-" "SVN/1.4.2 (r22196) neon/0.26.1"
120df:45bc0e99:61a 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:46:49 -0800] "MERGE
/repos/tck/private HTTP/1.1" 200 381 "-" "SVN/1.4.2 (r22196)
neon/0.26.1" 120df:45bc0e99:61b 443
svn.apache.org 1.2.3.4 - user [27/Jan/2007:18:46:49 -0800] "DELETE
/repos/tck/!svn/act/035b1d41-b17b-124b-98e1-87eb3948e263 HTTP/1.1" 204
- "-" "SVN/1.4.2 (r22196) neon/0.26.1" 120df:45bc0e99:61c 443

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Aug 27 04:09:08 2007

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.