On Thu, Jul 22, 2010 at 10:56:11AM -0400, C. Michael Pilato wrote:
> svnsync doesn't do any checksum verification. That work is done by deeper
> levels. But you provide no support for the implication that Subversion
> didn't attempt to verify checksums at all the reasonable times.
Looks like I need to spend some time in the backend code to get
a better understanding of it.
> The error you show above happens deep inside the FS implementations' DAG
> layer, as a sanity check on a text-base before apply a delta to it. It's
> probably comparing the checksum stored in the FS for a given text-base
> against the checksum coming across the wire as the delta-base checksum. The
> FS layer doesn't accept untrusted checksums. That is, you can't tell the FS
> what the checksum for a file's contents are -- you can only tell it the
> file's contents, and you can tell it what you expect the checksum to be
> after you store new file contents. I suspect the customer had actual
> repository corruption that was only detectable by svnsync after the fact
> because svnsync had nothing to do (directly, at least) with that corruption
> in the first place.
And what does the code do if the content stored doesn't match the
expected checksum? As far as I understand, the sequence of events
if everything works fine is something like this:
- syncsync wants to commit r40
expected checksum: 0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33 ]
- filesystem now has 'foo' stored in transaction. Expected checksum
matches the content read back from disk.
Everything looks fine so we can commit the transaction.
(Are transactions being validated before they are committed?)
- svnsync tries committing r41:
expected checksum: 78b371f0ea1410abc62ccb9b7f40c34288a72e1a
expected base checksum: 0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33 ]
Now, as r41 is committed, the filesystem sees the base checksum
matches, the diff is applied, and the content becomes 'boo' in r41,
again matching the expected checksum.
It's probably more complicated than the above in reality.
But if this is the gist of it, what went wrong at the customer site?
Did the wrong content get stored in r40? Or was the expected base checksum
transmitted wrongly when the commit of r41 was attempted? And how can we tell?
Note again that removing r40 fixed the problem.
> Okay. It seems (if I'm reading you correctly) that we agree we shouldn't be
> using checksums to verify the integrity of the dump stream. That the value
> to be had here is in verifying that target repository, once loaded, has the
> right data.
> So may I suggest that, when dealing with property lists, it's just a lot
> simpler (especially with our APIs) to compare the before-and-after property
> lists than to trade in the currency of checksums? I mean, for file contents
> it's arguably cheaper to do a checksum compare than a byte-for-byte
> comparison (which isn't even possible in --deltas mode), and that mostly
> because the checksum is pre-computed. But for properties? We don't have
> the checksums handy (so there's extra cost to compute them), and we hold
> those things en totale in memory all the time, and have handy routines for
> finding diffs. I dunno. The property checksums just seem unnecessary in
> the whole scheme of things to me.
That's a good point. If comparing the content is sufficiently cheap,
we don't need to bother with checksums. Should I try making the loader
perform such checks on transactions before commit (and also make it do
similar checks on the content if checksums were provided by the dump file)?
> (And I will further suggest that we fix the dump.c code to pass FALSE for
> the 'force' parameter of svn_fs_file_checksum() calls, thus restoring the
> pre-Hyrums-change behavior.)
Would this mean that we will never output checksums for some files?
How can we make sure checksums are stored for files which don't have
them yet, so that the dumps will include checksums even if force == FALSE?
If I understood correctly, for BDB we don't have to do anything special.
But we'll need to shout curses and maybe hush some magic spells at FSFS
until it generates checksums which are missing?
Received on 2010-07-22 17:34:08 CEST