
Re: fsverify.py unable to fix invalid svndiff header

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Wed, 18 May 2011 10:53:13 +0200

[ CC += julianf ]

What came out of this thread? Is this one of the known corruption kinds?

Is this a case of a data block being written partially in one place
and fully in another, or a case of a corrupt or truncated data block?

-- 
Daniel
(at hackathon room)
Steinar Bang wrote on Sat, May 14, 2011 at 19:54:12 +0200:
> >>>>> Stefan Sperling <stsp_at_elego.de>:
> 
> > The script probably took a wrong guess.
> 
> > Hopefully this is the known corruption problem with a duplicate block of
> > data in the revision file.
> 
> > Can you check if the original revision file (i.e. not modified by
> > fsfsverify.py) somewhere contains a data block which contains data
> > that matches the data around byte offset 1916?
> 
> "offset 1916", is that "byte number 1916 in the 683 ref file"?
> Is that 1916 decimal, or hexadecimal?   I'm assuming decimal for now. 
> 
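
On the decimal-or-hex question: a small sketch, assuming the 16-bytes-per-row layout that hexl-mode shows in the dumps quoted in this message. It maps a decimal offset to the row label and column you would expect to land on. The helper name `hexl_position` is made up for illustration.

```python
def hexl_position(offset):
    """Map a decimal byte offset to (row label, column) in a hex dump
    that shows 16 bytes per row, as in the dumps quoted in this thread."""
    row, col = divmod(offset, 16)
    return "%08x" % (row * 16), col

# Decimal 1916 is 0x77c, decimal 2247 is 0x8c7.
print(hexl_position(1916))
print(hexl_position(2247))
```

Read as decimal, 1916 lands in row 00000770 at column 12 and 2247 in row 000008c0 at column 7, which agrees with the cursor positions described in the rest of this message.
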
> > Usually the spot where the corruption appears (offset 1916 in your case)
> > contains an incomplete representation, but the representation data in the
> > duplicated block is good.
> 
> > One way of locating the duplicate block is to open the file in a
> > hex editor and search the entire file for hex strings that occur
> > around or after 1916.
> 
> Ok, opening the file in emacs hexl mode:
>  `M-x hexl-find-file /tmp/svnrepo/svn/db/revs/0/683 RET'
> 
> > Try to locate boundaries of representations, which look as follows:
> 
> > https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
> >   A representation begins with a line containing either "PLAIN\n" or
> >   "DELTA\n" or "DELTA <rev> <offset> <length>\n", where <rev>, <offset>,
> >   and <length> give the location of the delta base of the representation
> >   and the amount of data it contains (not counting the header or
> >   trailer).  If no base location is given for a delta, the base is the
> >   empty stream.  After the initial line comes raw svndiff data, followed
> >   by a cosmetic trailer "ENDREP\n".
> 
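
The boundary markers described in that excerpt can also be listed mechanically. The following is only a sketch, assuming the three header forms and the "ENDREP\n" trailer are exactly as quoted; the function name is hypothetical, and because the svndiff data is binary, some hits may be accidental matches inside representation data rather than real boundaries.

```python
import re

# Header forms as quoted from the structure document:
# "PLAIN\n", "DELTA\n", or "DELTA <rev> <offset> <length>\n".
REP_HEADER = re.compile(rb"(?:PLAIN|DELTA(?: \d+ \d+ \d+)?)\n")
REP_TRAILER = b"ENDREP\n"

def find_rep_boundaries(data):
    """Return sorted (offset, kind, text) tuples for every candidate
    representation header or trailer found in `data` (a bytes object)."""
    hits = []
    for m in REP_HEADER.finditer(data):
        text = m.group().rstrip(b"\n").decode("ascii")
        hits.append((m.start(), "header", text))
    pos = data.find(REP_TRAILER)
    while pos != -1:
        hits.append((pos, "trailer", "ENDREP"))
        pos = data.find(REP_TRAILER, pos + 1)
    return sorted(hits)

# Usage (path taken from this thread; work on a copy of the file):
#   data = open("/tmp/svnrepo/svn/db/revs/0/683", "rb").read()
#   for offset, kind, text in find_rep_boundaries(data):
#       print(offset, kind, text)
```

Offsets are printed in decimal, so they can be compared directly with 1916 and 2247.
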
> `M-j 1916 RET', takes me here:
> 00000770: 4e44 5245 500a 4445 4c54 410a 5356 4e01  NDREP.DELTA.SVN.
> 00000780: 0000 8c0c 823d 8431 823b 8140 4c00 8844  .....=.1.;.@L..D
> 00000790: 4ca8 404e 0048 8100 bf45 820d 9945 820d  L.@N.H...E...E..
> ...
> 
> The cursor is positioned on the "5" starting "5356" in the first line,
> and on the "S" of "SVN". 
> 
> Does that make sense?
> 
> So I should search for "ENDREP.DELTA.SVN"?  The 683 revfile contains 15
> instances of that string, but I have no idea of which ones are relevant
> or not.
> 
> > So if you find a duplicated block of data you should be able to fix this
> > problem by copying representation data from the duplicate block to the
> > corrupted location.
> 
> So what I'm looking for isn't exactly "ENDREP.DELTA.SVN", but what
> follows this text...?
> 
> I tried searching for "x^.Rmo.@", but the one at the cursor is the only
> occurrence in the file.  At least the only one aligned so that search will
> find it.  Doesn't look like hexl-mode has the possibility to search for
> a byte sequence.  Maybe I should get myself a proper hex editor?
> 
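
Searching for a raw byte sequence does not strictly need a hex editor. A minimal sketch, assuming the whole revision file fits in memory: take the bytes that follow the suspect offset and look for other places where the same bytes occur. The function name and the default of 16 bytes are arbitrary choices for illustration; the thread does not say how many bytes are enough.

```python
def find_duplicates(data, offset, length=16):
    """Return every offset, other than `offset` itself, at which the
    `length` bytes starting at `offset` occur again in `data`."""
    needle = data[offset:offset + length]
    if not needle:
        raise ValueError("offset is past the end of the data")
    hits = []
    pos = data.find(needle)
    while pos != -1:
        if pos != offset:
            hits.append(pos)
        # Step by one byte so overlapping occurrences are found too.
        pos = data.find(needle, pos + 1)
    return hits

# Usage (path taken from this thread; work on a copy of the file):
#   data = open("/tmp/svnrepo/svn/db/revs/0/683", "rb").read()
#   print(find_duplicates(data, 2247))
```

Unlike a text search in the ASCII column of a hex dump, this is not affected by row alignment or by non-printable bytes shown as dots.
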
> > DO NOT change any byte offsets in the file while doing this. If you
> > cannot squeeze the data in because it would overlap with subsequent
> > data you're out of luck but I've never seen this happen.  Usually
> > there is enough room to fit the data, but you might have to add
> > padding. Any dummy byte will do, I usually use 0x42.
> 
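
The "do not change any byte offsets" rule can be enforced in code rather than by eye. This is a sketch only, assuming you have already decided by hand where the corrupt region starts (`dest`), how many bytes are available there (`room`), and which bytes from the duplicate block are good (`good`). All three names are hypothetical, and where padding may safely go relative to the representation data is not spelled out in this thread.

```python
PAD = 0x42  # the dummy byte mentioned above; any value would do

def patch_in_place(data, dest, room, good):
    """Return a copy of `data` in which the `room` bytes starting at
    `dest` are replaced by `good`, padded with PAD to exactly `room`
    bytes, so that nothing after the patched region moves."""
    if len(good) > room:
        raise ValueError("replacement does not fit; later offsets would shift")
    if dest < 0 or dest + room > len(data):
        raise ValueError("patched region lies outside the data")
    patched = bytearray(data)
    patched[dest:dest + room] = good + bytes([PAD]) * (room - len(good))
    # The total length must be unchanged, otherwise offsets have moved.
    assert len(patched) == len(data)
    return bytes(patched)
```

Write the result to a new file and keep the original revision file untouched until the repaired copy verifies.
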
> The meaning of life, the universe and everything?  I thought that was 42
> decimal...? :-)
> 
> > Another possibility is that offset 2247 is wrong. In this case the
> > expected svndiff data is probably located elsewhere and the offset
> > in the representation header should be adjusted.
> 
> Right... that's the first error that fsfsverify.py tries to fix?
> 
> `M-j 2247 RET' takes the cursor over the "7" in "5978" in the first
> line: 
> 000008c0: 8550 8b57 8585 5978 5e1d 526d 6fda 400c  .P.W..Yx^.Rmo.@.
> 000008d0: fe1c ff0a 17f8 00d5 12fa a27d 4145 2a90  ...........}AE*.
> 000008e0: 8466 0da4 2361 6353 2414 ee0e b871 89a3  .f..#acS$....q..
> 
> That wasn't as easily recognizable as the 1916 one, though.  Not as
> recognizable as a boundary at least.
> 
> What are the things I should look for a duplicate of?  The bytes
> following the troublesome position?  And how many?
> 
> > This is of course not an easy task and it is unfortunate that people
> > keep running into this problem. The source of the problem is not yet
> > known :(  If you have any further questions just ask. If you cannot
> > get it fixed at all but can share the revision file privately I will
> > have a go at it.
> 
> I think I need help with this one.  I'll send you the revision file
> privately.
> 
> Thanks!
> 
> 
> - Steinar
> 
Received on 2011-05-18 10:54:07 CEST

This is an archived mail posted to the Subversion Users mailing list.
