Edward Ned Harvey wrote on Wed, Nov 10, 2010 at 00:28:48 -0500:
> > From: Daniel Shahaf [mailto:d.s_at_daniel.shahaf.name]
> >
> > Can you compare the contents of /path/to/file/foo/bar between the master
> > and mirror, as of the last revision successfully synced to the mirror?
>
> The latest rev which synced without reporting any error was 5045. It was
> trying to go from 5045 to 5046 when it triggered the checksum failure.
>
> I checked the history of the file in question, and it was changed in ~200
> different revs. But the revs of interest are: in 4390, it synced to the
> slave without reporting any error, however, from 4390 onward, if I checkout
> from the slave and master, the two files differ. And the next rev where
> this file was changed was 5046, which is when svnsync notices the checksum
> mismatch, and dies.
>
Okay.
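
A minimal sketch of how that comparison could be repeated for several revisions: it fetches the file from both repositories at the same revision with `svn cat` and compares SHA-1 digests. The helper names are hypothetical, and the URLs passed in are assumed to point at /path/to/file/foo/bar in the master and in the slave.

```python
import hashlib
import subprocess


def svn_cat(url, rev):
    """Return the file's bytes as of revision `rev`, using `svn cat`."""
    # The @REV peg revision asks svn to look the path up as it existed in REV.
    return subprocess.run(
        ["svn", "cat", "-r", str(rev), f"{url}@{rev}"],
        check=True,
        stdout=subprocess.PIPE,
    ).stdout


def same_at_revision(master_url, mirror_url, rev, fetch=svn_cat):
    """True if both repositories return identical bytes for the file at `rev`.

    `fetch` is injectable so the comparison can be exercised without a
    working copy or network access.
    """
    digests = [
        hashlib.sha1(fetch(url, rev)).hexdigest()
        for url in (master_url, mirror_url)
    ]
    return digests[0] == digests[1]
```

Calling same_at_revision() for the revision just before 4390, for 4390 itself, and for 5045 would show whether the two copies first diverge at 4390, as described above.
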
> It would seem, all of this behavior could be explained by a simple
> undetected hardware error. During sync of 4390, the slave wrote some bits
> to disk, which got written wrongly. It is known that disks will do this
> rarely. This is one of the huge arguments in favor of ZFS and BTRFS and
> filesystem checksumming in general. Such filesystems detect and correct
> data corruption which would have otherwise passed silently... Which seems
> to be what happened in my case.
>
Yes, the question is whether what this thread describes is just
a hardware error, or something deeper.

> All servers and clients are running 1.6.12. However, at the time when 4390
> was committed... The master was 1.6.12, but the slave was probably 1.5.7
>
>
> > If you create a fresh mirror and svnsync it, from r0 to that revision,
> > does the file /path/to/file/foo/bar in the fresh mirror differ from the
> > one in the master?
>
> No problems. Although ... I didn't let it sync from rev 0. (That would be
> impossibly time consuming... weeks....) I did as mentioned before.
> Transferred a backup of the master to the slave, and used it as the "seed"
> for the sync, so I only needed to sync the last 100 revs or something like
> that...
>
That would mean that the "last changed revision", r4390, is
contained in the seed and wasn't re-svnsync'd. If we suspect that
svnsync committed a bogus r4390 to the slave, we'd better start with
a slave that /doesn't/ already have a known-good r4390...

Of course, you can take that backup and use it to produce a repository
whose youngest revision is earlier than r4390.
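
A rough sketch of that last step, assuming the backup is an ordinary repository directory: dump r0 through the revision just before r4390 and load the result into a new, empty repository. The paths and function names are hypothetical. It is also an assumption on my part that the svn:sync-* revision properties on r0 of the new repository (svn:sync-last-merged-rev in particular) have to be checked and set to match before svnsync is pointed at it.

```python
import subprocess


def truncate_commands(backup_repo, new_repo, last_rev):
    """Build the svnadmin commands that copy r0..last_rev into new_repo."""
    create = ["svnadmin", "create", new_repo]
    dump = ["svnadmin", "dump", "--quiet", "-r", f"0:{last_rev}", backup_repo]
    load = ["svnadmin", "load", "--quiet", new_repo]
    return create, dump, load


def truncate(backup_repo, new_repo, last_rev):
    """Create new_repo and fill it with revisions 0..last_rev of backup_repo."""
    create, dump, load = truncate_commands(backup_repo, new_repo, last_rev)
    subprocess.run(create, check=True)
    # Pipe the dump stream straight into the load, as a shell pipeline would.
    dumper = subprocess.Popen(dump, stdout=subprocess.PIPE)
    subprocess.run(load, stdin=dumper.stdout, check=True)
    dumper.stdout.close()
    if dumper.wait() != 0:
        raise RuntimeError("svnadmin dump failed")
```

With last_rev set to 4389, svnsync would then have to transfer r4390 itself, and the file can be compared against the master right after that revision lands.
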
Received on 2010-11-10 18:00:51 CET