On 07/22/2010 11:33 AM, Stefan Sperling wrote:
> On Thu, Jul 22, 2010 at 10:56:11AM -0400, C. Michael Pilato wrote:
>> The error you show above happens deep inside the FS implementations' DAG
>> layer, as a sanity check on a text-base before apply a delta to it. It's
>> probably comparing the checksum stored in the FS for a given text-base
>> against the checksum coming across the wire as the delta-base checksum. The
>> FS layer doesn't accept untrusted checksums. That is, you can't tell the FS
>> what the checksum for a file's contents are -- you can only tell it the
>> file's contents, and you can tell it what you expect the checksum to be
>> after you store new file contents. I suspect the customer had actual
>> repository corruption that was only detectable by svnsync after the fact
>> because svnsync had nothing to do (directly, at least) with that corruption
>> in the first place.
>
> And what does the code do if the content stored doesn't match the
> expected checksum? As far as I understand, the sequence of events
> if everything works fine is something like this:
>
> - syncsync wants to commit r40
> [content: 'foo'
> expected checksum: 0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33 ]
- filesystem writes 'foo' into transaction, and calculates the content
checksum as it does so.
- filesystem verifies that calculated checksum matches the "expected
checksum" above. If it doesn't, the commit is failed.
- filesystem now has 'foo' stored in transaction, and checksum('foo')
stored in the transaction, too.
- all else works out for the commit, so the transaction is promoted
to a new revision. (In BDB this is a single table tweak. In FSFS,
doesn't this result in the transaction file being rewritten as a
new revision file with updated node-rev-ids? I don't know.)
- SOME BAD EVENT OCCURS
> - svnsync tries committing r41:
> [delta: s/^f/b/
> expected checksum: 78b371f0ea1410abc62ccb9b7f40c34288a72e1a
> expected base checksum: 0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33 ]
>
> Now, as r41 is committed, the filesystem sees the base checksum
> matches, the diff is applied, and the content becomes 'boo' in r41,
> again matching the expected checksum.
>
> It's probably more complicated than the above in reality.
> But if this is the gist of it, what went wrong at the customer site?
> Did the wrong content get stored in r40? Or was the expected base checksum
> transmitted wrongly when the commit of r41 was attempted? And how can we tell?
If wrong content was stored for r40, then that wrong content's checksum
matched the expected one for r40. Which seems unlikely.
Bogus expected base checksum transmitted for r41? That's possible.
> That's a good point. If comparing the content is sufficiently cheap,
> we don't need to bother with checksums. Should I try making the loader
> perform such checks on transactions before commit (and also make it do
> similar checks on the content if checksums were provided by the dump file)?
The loader already does checksum validation for text contents when the
checksums are present. It doesn't do so directly, but via the exact same
mechanisms I mentioned above that all code writing file contents to the FS
go through.
The problem we have is that the FS layer doesn't do any validation of
property contents *at all*, so the loader can't rely on the FS to do this
work for it. If you want to add such code to the loader, that's fine with
me. But do it with a comment that indicates that this validation really
ought to be happening in the FS someday, please.
>> (And I will further suggest that we fix the dump.c code to pass FALSE for
>> the 'force' parameter of svn_fs_file_checksum() calls, thus restoring the
>> pre-Hyrums-change behavior.)
>
> Would this mean that we will never output checksums for some files?
> How can we make sure checksums are stored for files which don't have
> them yet, so that the dumps will include checksums even if force == FALSE?
>
> If I understood correctly, for BDB we don't have to do anything special.
> But we'll need to shout curses and maybe hush some magic spells at FSFS
> until it generates checksums which are missing?
Yes, this means the loader will never *generate* a checksum (MD5, SHA1, or
otherwise). It will only provide the checksums that the FS have stored
already for that body of file contents. Remember, we're trusting our dump
stream here. If the file contents in the repository are wrong and don't
have a checksum, there's no value in generating a perfectly valid checksum
for those wrong contents just to make ourselves feel better about it.
--
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on 2010-07-22 17:51:44 CEST