Daniel Shahaf wrote:
> Julian Foad (Jira) wrote on Mon, 01 Jun 2020 11:20 +0000:
>> Two clues strongly suggest the corruption was originally caused by a bug rather than hardware corruption:
>> * the checksum on the node-revision must account for the corrupted data, otherwise a checksum error is thrown instead;
>
> _Which_ checksum? Do you mean the checksum of the directory
> representation wherein the dangling (off-by-four) pointer was found?
Yes: in my test an error is thrown for the checksum of that
representation if that checksum isn't corrected or nulled.
>> * although an off-by-4 can sometimes be a 1-bit error, this particular one (41271 vs. 41275) is not a 1-bit error.
>
> Is there any chance that this _was_ originally a 1-bit error and then
> some offset got added/subtracted to both the id in the noderev header
> and the id in the directory rep?
Technically I expect that's possible, but it seems less likely. It would
have to have happened at the original commit time, during transaction
construction and commit, whereas I suspect (without research) that
a bit error is much more likely to occur in data at rest on disk for years.
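
As a quick check of the "not a 1-bit error" arithmetic, here is a minimal sketch. It assumes the two offsets are compared as plain binary integers; if the offsets were compared in some textual encoding, the comparison would have to be done byte by byte instead. The helper names and the second pair of offsets (41264 vs. 41268) are hypothetical, chosen only to show an off-by-4 that is a single-bit flip.

```python
def differing_bits(a: int, b: int) -> int:
    """Return the number of bit positions in which a and b differ."""
    # XOR leaves a 1 in every position where the two values disagree.
    return bin(a ^ b).count("1")


def is_single_bit_flip(a: int, b: int) -> bool:
    """True if b can be obtained from a by flipping exactly one bit."""
    return differing_bits(a, b) == 1


# The pair from this thread: the values differ by 4, but in two bit positions.
print(differing_bits(41271, 41275))       # 2
print(is_single_bit_flip(41271, 41275))   # False

# A hypothetical off-by-4 pair that does differ in exactly one bit.
print(is_single_bit_flip(41264, 41268))   # True
```

Under that assumption, 41271 XOR 41275 is 12 (binary 1100), so turning one value into the other takes two bit flips, not one.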
>> I do not know which revision contains the bad reference nor what version of svn committed it.
>
> Was CONFIG_OPTION_VERIFY_BEFORE_COMMIT enabled at the time the revision
> containing the dangling pointer was committed? (You may be able to
> answer this even without running grep -aR to find the dangling pointer.)
I very much expect not, because I am not aware of it being used in any
production environments that I have been connected with.
> It's curious that the wrong offset points directly to the value of the
> first field. However, having glanced at svn_fs_fs__write_noderev(),
> I guess that's just a coincidence.
>
> Assuming the node-rev headers _are_ synthesized by svn_fs_fs__write_noderev()
> in this user's environment, that is.
I'm confident they were synthesized by standard FSFS code.
- Julian
Received on 2020-06-02 14:05:27 CEST