Re: FSFS Issue...

From: John Szakmeister <john_at_szakmeister.net>
Date: 2005-12-03 12:16:41 CET

On Saturday 03 December 2005 03:58, John Szakmeister wrote:
> On Friday 02 December 2005 12:39, Malcolm Rowe wrote:
> > Hi John,
> >
> > I've been looking at this issue for the past few days.
> >
> > I really can't see what's going wrong. I've audited the FSFS write
> > code, and it looks fine. Critically, I can't see how to drive
> > libsvn_fs_fs/fs_fs.c in such a way as to get the rev files we're seeing.
> >
> > The rev file is only written once (when the proto-rev file is moved into
> > place), and the proto-rev file is only ever appended to (and as a stream:
> > so there's no possibility of parts of it being overwritten).
>
> You found the very same things that I did when I audited the code. I don't
> see how it could have happened either, at least not on the surface (i.e.,
> in what we're doing).
>
> > It's possible that there's a problem re-reading and writing the noderev
> > hash, but if there is, I can't see it (and it would have to affect
> > transaction noderevs only). Similarly, there could be a
> > problem in the code that rewrites a mutable (rev -1) noderev hash to
> > point it to the current revision number - I've not checked that, but it
> > doesn't seem particularly likely.
>
> /me agrees.
>
> > However, there _must_ be something wrong in the FSFS code, because I
> > can't explain how the 'text' line is written out with the wrong
> > rep-offset. That offset is only set in rep_write_get_baton(), just before
> > we write out the DELTA line (and we're not getting a repeated DELTA).
>
> Yep.
>
> > I've looked at pool management, and I can't see anything wrong in
> > that area. I've not yet run valgrind over it though, so I could have
> > easily missed something. (Could someone point me towards instructions
> > for enabling pool debugging please?)
>
> Philip has pointed me to --enable-pool-debug for APR in the past.
> Admittedly, I haven't had the time lately to go and do it though.
>
> > However, the numbers don't really look random enough for this to be
> > simple memory corruption: as you point out, the 'text' offset always
> > points inside the delta rep (whereas I'd expect it to point into infinity
> > if it was just a random overwrite).
>
> Right.
>
> > I think there must (also?) be something wrong in the svndiff code, since,
> > as you pointed out, the windows are repeating. This is a dump from your
> > r94 revision:
>
> [snip lots of window output]
>
> It looks like we're both decoding the same thing. :-)
>
> > (For this file, if the expanded size is correct, I'd expect 0x62 full
> > size (target size 0x19000) windows, and one smaller window).
> >
> > The first 0x60 windows look completely regular (I assume the source
> > window dropping to zero is because the source stream emptied; and I
> > assume the large data sizes in the final windows are because the source
> > window was not useful for a diff in those cases).
> >
> > Window 0x61 appears completely broken: it's moved backward in the source
> > stream, it's got a zero target length, a zero instruction size, and an
> > enormous data size. My hacky program doesn't think there's enough data
> > left in the rep to get to the next window. (Could you double-check
> > my decoding?) It's possibly worth noting that the 'text' offset points
> > inside window 0x60, which is the last one that looks good, so window
> > 0x61 might be just junk.
> >
> > It's also interesting that if we subtract the 'text' size (1347483) from
> > the ENDREP offset (2596801), we get 1249318, which is 19 bytes after the
> > 'text' offset (1249299), which is exactly the correct number of bytes
> > for the DELTA line, implying that b->rep_offset and b->delta_start _both_
> > had to be wrong, but wrong in the correct way, if you see what I mean.
>
> Yep. I actually noted this on the dev@ list a while back. This particular
> file was extremely weird. You can also find a large repeated block of data
> in the svndiff too. Here are the gory details:
> http://svn.haxx.se/dev/archive-2005-09/0409.shtml
>
> Note the size of the repeated block. Does your head hurt yet? :-)
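
The offset arithmetic quoted above is easy to re-check. A minimal sketch, using only the three numbers given in the thread; the function name delta_gap is made up for illustration and is not anything in fs_fs.c:

```python
# Numbers quoted in the thread for the corrupted r94 rep.
TEXT_OFFSET = 1249299    # rep offset recorded on the noderev 'text' line
TEXT_SIZE = 1347483      # rep size recorded on the noderev 'text' line
ENDREP_OFFSET = 2596801  # position of the ENDREP line in the rev file


def delta_gap(text_offset, text_size, endrep_offset):
    """Return the number of bytes between the recorded rep offset and the
    point where the delta data would have to start for the size to fit."""
    data_start = endrep_offset - text_size
    return data_start - text_offset


if __name__ == "__main__":
    # Malcolm reports this gap as matching the length of the DELTA line.
    print(delta_gap(TEXT_OFFSET, TEXT_SIZE, ENDREP_OFFSET))
```

If the gap matches the length of the DELTA line, the recorded offset and size agree with each other even though they point at the wrong place in the file.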

FWIW, I just tried copying the first 1690 bytes in front of the repeated
block. Without touching the actual noderev, the new stream appears to
consume the correct amount of data.
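
For anyone who wants to double-check the window decoding, here is a rough sketch of reading one svndiff window header. It assumes the svndiff version 0 layout as I understand it: each window starts with five variable-length integers (source view offset, source view length, target view length, instructions length, new data length), each stored 7 bits per byte, most significant group first, with the high bit set on every byte except the last. The function names are hypothetical and this is not the libsvn_delta code.

```python
def read_varint(buf, pos):
    """Decode one svndiff integer starting at buf[pos].

    Returns (value, position of the byte after the integer)."""
    value = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated integer")
        byte = buf[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # high bit clear: this was the last byte
            return value, pos


def read_window_header(buf, pos):
    """Read the five header fields of one window (assumed layout)."""
    names = ("sview_offset", "sview_len", "tview_len", "ins_len", "data_len")
    header = {}
    for name in names:
        header[name], pos = read_varint(buf, pos)
    return header, pos


def window_fits(header, pos, rep_len):
    """True if the instruction and data sections fit in the remaining rep."""
    return pos + header["ins_len"] + header["data_len"] <= rep_len


if __name__ == "__main__":
    # Synthetic header, not taken from the corrupted rev file:
    # offset 0, source length 0, target length 0x19000, 5 and 10 bytes.
    sample = bytes([0x00, 0x00, 0x86, 0xA0, 0x00, 0x05, 0x0A])
    header, pos = read_window_header(sample, 0)
    print(header, pos, window_fits(header, pos, rep_len=22))
```

A window like 0x61, with an enormous data size, would fail the window_fits check as soon as its data length exceeds what is left before ENDREP.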

I emailed Ranjit to see if he still has his corrupted repository. If so, I'll
have him try a dump and load with the new rev file.

-John

Received on Sat Dec 3 12:17:24 2005

This is an archived mail posted to the Subversion Dev mailing list.