Re: FSFS Issue...

From: Malcolm Rowe <malcolm-svn-dev_at_farside.org.uk>
Date: 2005-12-03 21:19:19 CET

On Sat, Dec 03, 2005 at 06:16:41AM -0500, John Szakmeister wrote:
> > Yep. I actually noted this on the dev@ list a while back. This particular
> > file was extremely weird. You can also find a large repeated block of data
> > in the svndiff too. Here's the gory details:
> > http://svn.haxx.se/dev/archive-2005-09/0409.shtml
> >
> > Note the size of the repeated block. Does your head hurt yet? :-)
>

I tried to look at that, but my head hurt too much, so I probably didn't
understand it. I did notice the repeated data block, but I couldn't
quite tell: is it an exact repeat of the data at the start of the file?

> FWIW, I just tried copying the first 1690 bytes in front of the repeated
> block. Without touching the actual noderev, the new stream appears to
> consume the correct amount of data.
>

Interesting, so it looks like the second delta rep has actually been
overwritten by the first. Normally I'd be thinking about simultaneous
writes from two different threads here, but it looks like the transaction
isolation code is fairly bulletproof (it just creates directories),
so I can't see two threads getting the same transaction id, much less
trying to create the same transaction.

Ok, here's a scenario I've just thought of:

We create a new transaction, and start writing the delta rep for the
updated file. During the write, we hit a problem - a disk read or write
error, perhaps, it doesn't really matter - and we throw an error back
to the caller.

For whatever reason (handwave, handwave), we don't abort the transaction.
We start writing the delta rep again, appending to the end of the proto
rev file. This time, it completes without error (the fact that it sits
after the partially-written earlier delta won't actually matter, since
nothing will ever reference the partial delta).

Now, before the proto rev file is moved into place, we run pool cleanup
on the pool that contains the allocation for the original file open
(the one that errored). That closes the file, and as part of the close,
*flushes the partially-filled buffer to disk*, overwriting the start of
the 'real' delta rep.

What do you think? In our case, the second delta-rep (where the text
rep points) starts 19 bytes past a 4k boundary (where 19 bytes is the
size of the DELTA line, and 4k is the APR buffer size). And yes!
The get_file_offset() after the write of the DELTA line will call
apr_file_seek(), which, because we're writing, will flush the buffer to
disk, so we'd expect the last 4k buffered write from the first delta-rep
to finish exactly where we see it: at 4k+19 bytes.
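
The offset arithmetic can be checked with a toy model. This is a sketch
only: TinyBufferedFile is a hypothetical stand-in, not APR, and the
first rep starting at offset 0, the two full buffers, the 500 leftover
bytes and the filler bytes are all made-up numbers. The only values
taken from above are the 19-byte header line and the 4k buffer.

```python
import os
import tempfile

BUF_SIZE = 4096  # the "4k" buffer size mentioned above


class TinyBufferedFile:
    """Hypothetical buffered handle: writes go out in BUF_SIZE chunks."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        self.buf = bytearray()

    def write(self, data):
        self.buf += data
        while len(self.buf) >= BUF_SIZE:
            os.write(self.fd, bytes(self.buf[:BUF_SIZE]))
            del self.buf[:BUF_SIZE]

    def flush(self):
        if self.buf:
            os.write(self.fd, bytes(self.buf))
            self.buf.clear()

    def seek(self, whence):
        # Seeking while writing flushes the buffer first.
        self.flush()
        return os.lseek(self.fd, 0, whence)

    def close(self):
        self.flush()
        os.close(self.fd)


def replay(header_len=19, full_buffers=2, leftover=500):
    """Return (offset where the second rep starts, final file contents)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        first = TinyBufferedFile(path)
        first.write(b"H" * header_len)   # placeholder for the DELTA line
        first.seek(os.SEEK_CUR)          # offset query flushes the header
        first.write(b"1" * (full_buffers * BUF_SIZE + leftover))
        # Error here: `leftover` bytes stay buffered, handle stays open.

        second = TinyBufferedFile(path)
        rep_start = second.seek(os.SEEK_END)
        second.write(b"h" * header_len)
        second.seek(os.SEEK_CUR)
        second.write(b"2" * 1000)
        second.close()

        first.close()                    # stale flush lands at rep_start

        with open(path, "rb") as f:
            return rep_start, f.read()
    finally:
        os.unlink(path)
```

With these numbers the second rep starts at 2 * 4096 + 19, and the stale
500 bytes end up written over the start of it, which matches the shape
of what we see in the broken revision file.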

That sounds feasible, but why aren't we killing the transaction if we
hit an error? And when are we running pool cleanup?

I think I might try to see if I can do some fault injection via an
LD_PRELOAD hack or something: it might help to reproduce the problem.

Regards,
Malcolm

Received on Sat Dec 3 21:20:24 2005
