
Re: pristine store database -- was: pristine store design

From: Mark Mielke <mark_at_mark.mielke.cc>
Date: Tue, 16 Feb 2010 15:29:44 -0500

On 02/16/2010 03:24 PM, Mark Mielke wrote:
> On 02/16/2010 08:54 AM, Neels J Hofmeyr wrote:
>> They are merely half-checks for validity. During normal operation, size and
>> mtime should never change, because we don't open write streams to pristines.
>> If anyone messes with the pristine store accidentally, we would pick it up
>> with the size, or if that stayed the same, with the mtime. But we can pick
>> up all cases of bitswaps/disk failure *only* by verifying *full checksum
>> validity*!
>>
>> So, while checking size and mtime gives a sense of basic sanity, it is
>> really just a puny excuse for not checking full checksum validity. If we
>> really care about correctness of pristines, *every* read of a pristine
>> should verify the checksum along the way. (That would include always reading
>> the complete pristine, even if just a few lines along the middle are needed.)
>
> Checking size and mtime gives huge benefits over checking contents.
> Size and mtime can be picked up with a single stat(), whereas a
> checksum requires open()/read()/.../close(). The data for stat() is
> usually stored in the inode which is read in either situation, and
> often small enough to be easily cached. For large work spaces,
> especially those with multi-Kbyte files, doing checksum tests on most
> operations would result in unacceptable performance.
>
> I think it's fine to compare checksums on any files that are noticed to
> have changed (size/mtime), but if the file looks unchanged, assuming
> that it *is* unchanged is a fine compromise for the performance gains.
>
> If you want a "--compare-checksum" option which does the full check
> optionally - it might be of use to some people. I suspect most people
> would avoid using it once they see how much more expensive it is...
>
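
The stat-first compromise described in the quoted text can be sketched roughly as follows. This is a minimal Python illustration, not Subversion's code: the names `file_sha1` and `is_modified`, and the idea that size, mtime and SHA-1 were recorded earlier, are assumptions made for the example.

```python
import hashlib
import os


def file_sha1(path, chunk_size=65536):
    """Expensive check: open()/read()/close() over the whole file."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_modified(path, recorded_size, recorded_mtime_ns, recorded_sha1):
    """Cheap stat() first; fall back to the checksum only if stat() differs.

    The recorded_* values are hypothetical metadata saved when the file
    was last known to be unmodified.
    """
    st = os.stat(path)  # a single stat() call, no file contents read
    if st.st_size == recorded_size and st.st_mtime_ns == recorded_mtime_ns:
        # Looks unchanged: assume it is unchanged and skip the full read.
        return False
    # Size or mtime differs: confirm with the full-content checksum.
    return file_sha1(path) != recorded_sha1
```

Under this scheme a file whose mtime was merely touched costs one full read and is still reported as unmodified, while a file that looks unchanged to stat() is never opened at all.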

I just realized perhaps you are talking about size/mtime of the pristine
and not of the working copy. If so, ignore the above. I see no reason to
check the sanity of the pristine during normal operation, presuming
there is some sort of transactional model that guarantees that
Subversion itself will not corrupt the pristine during normal operation
in the case of an expected failure (control-C by user, network failure,
...). :-)
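
One common way to get that kind of guarantee is to write to a temporary file and rename it into place. The sketch below is a minimal Python illustration under that assumption; it does not describe what Subversion does, and `install_pristine` and the flat store layout keyed by SHA-1 are hypothetical.

```python
import hashlib
import os
import tempfile


def install_pristine(store_dir, data):
    """Store data under its SHA-1 name without ever exposing a partial file.

    Hypothetical helper: the content is written to a temporary file in
    the same directory, flushed to disk, then renamed to its final name.
    """
    sha1 = hashlib.sha1(data).hexdigest()
    final_path = os.path.join(store_dir, sha1)
    if os.path.exists(final_path):
        return final_path  # already installed; nothing to write

    fd, tmp_path = tempfile.mkstemp(dir=store_dir, prefix="tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the contents to disk before renaming
        # Atomic rename: readers see either no file or the complete file.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Interrupted (e.g. control-C): drop the temporary file, if any.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

If the process is interrupted before the rename, the final path is simply absent, so an existing pristine is either complete or not there.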

Cheers,
mark
Received on 2010-02-16 21:30:26 CET

This is an archived mail posted to the Subversion Dev mailing list.