On 30.01.2017 20:09, Stefan Fuhrmann wrote:
> On 29.01.2017 22:27, Julian Foad wrote:
>> Stefan, is this an edge case bug?
>>
>> rep_read_contents() in libsvn_fs_fs/cached_data.c_at_1780363 is quoted
>> below.
>>
>> rep_read_contents() potentially reads one or more times from the
>> fulltext cache
>> and then finds the fulltext cache is no longer available. I suppose
>> this could
>> happen because the entry could be evicted because other entries are
>> being more
>> recently/frequently accessed even though we haven't finished calling
>> rep_read_contents() for this particular rep.
>>
>> In this case, rep_read_contents() calls skip_contents() to skip-read
>> from the
>> window stream, to catch up that stream to the point we'd already
>> reached in the
>> fulltext cache before it became unavailable.
>>
>> Then we read the requested LEN bytes from the window stream, and
>> update the
>> calculated checksum with those LEN bytes ... but not with the skipped
>> bytes.
>>
>> Won't that lead to a checksum mismatch when we reach the end of the rep?
>
> Yes, that looks like a bug. I'll fix it next weekend.
>
> The problem is not f7-specific and will only be transient
> b/c the next reader will miss the cache and reconstruct
> the content from the first window on.
>
> Also, using bulk updates should reduce the incident rate
> of this issue.
Turns out that there is indeed a minor bug but that
one prevents the checksum from ever being checked
in the case that you described. So, this should not
be the source for the checksum mismatch that you
are seeing.
In r1781650, I added a regression test reproducing
the above conditions and it does not fail.
The thing is that while we don't run the checksum
over the "catch-up" part of the data, we also don't
advance the OFF element either. Hence, it will never
match the representation length to trigger the checksum
completion and comparison.
I'll fix this today (probably).
-- Stefan^2.
Received on 2017-02-04 12:48:17 CET