
Re: rep-cache sanity check on commit

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Tue, 31 Jul 2012 10:26:34 +0100

Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:

> On Mon, Jul 30, 2012 at 12:40 PM, Philip Martin
> <philip.martin_at_wandisco.com> wrote:
>
>> When the commit process finds a representation in the rep-cache the only
>> sanity check that happens is that the revision must be less than or
>> equal to HEAD. We don't check that the offset is valid:
>>
>> echo foo > foo
>> svnadmin create repo
>> svn import -mm foo file://`pwd`/repo/A/f
>> sqlite3 repo/db/rep-cache.db "update rep_cache set offset = 4"
>> svn import -mm foo file://`pwd`/repo/A/g
>>
>> or that the checksum at that offset matches:
>>
>> echo foo > foo
>> echo bar > bar
>> svnadmin create repo
>> svn import -mm foo file://`pwd`/repo/A/f
>> sqlite3 repo/db/rep-cache.db "update rep_cache set
>> hash='e242ed3bffccdf271b7fbaf34ed72d089537b42f'"
>> svn import -mm bar file://`pwd`/repo/A/g
>>
>> In both cases corruption in the rep-cache leads to corruption in the
>> revision files but that corruption is not detected by the commit process
>> even though subsequent checkouts will fail.
>>
>
> Has that kind of corruption been observed in the wild?

I'm not aware of any reports. I did start looking at this because of a
repository with an incorrect offset referring to an older revision file
but that now appears to be a different problem.

>> Should we do more sanity checking? We are using rep-cache to discard
>> data supplied by the client on the basis that it is already present in
>> the repository. Should we check that the offset really is a representation
>> with the expected checksum?
>>
>
> The full verification would look like this:
> * recursively enumerate all noderevs in the rep's revision
> * check that at least one uses the rep
> * read the rep and verify the checksum
>
> This seems quite costly to do during commit - in particular during
> imports and similar mass commit operations.

On commit we would only want to verify that a revision/offset obtained
from the cache really was a representation. I suppose we might want to
construct the full-text corresponding to the representation to verify
the checksum. We do not need to verify the whole revision or the whole
rep-cache.
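
As a rough illustration of that narrower check, here is a minimal Python
sketch. It is not Subversion code. It assumes the FSFS convention that a
representation in a revision file starts with a header line of the form
"PLAIN", "DELTA" or "DELTA <rev> <offset> <size>", and that the rep-cache
hash is the SHA-1 of the full-text. The function names and the in-memory
rev_data argument are hypothetical. The checksum part covers only PLAIN
representations; a DELTA representation would need its svndiff data
expanded first, which is left out here.

```python
import hashlib
import re

# Assumed FSFS header forms: "PLAIN", "DELTA" or "DELTA <rev> <offset> <size>".
# No "^" anchor: Pattern.match(data, pos) already anchors the match at pos.
_REP_HEADER = re.compile(rb"(?:PLAIN|DELTA(?: \d+ \d+ \d+)?)\n")


def looks_like_rep(rev_data: bytes, offset: int) -> bool:
    """Return True if a representation header starts at offset (hypothetical helper)."""
    if offset < 0 or offset >= len(rev_data):
        return False
    return _REP_HEADER.match(rev_data, offset) is not None


def plain_rep_matches(rev_data: bytes, offset: int, size: int, sha1_hex: str) -> bool:
    """Check a PLAIN representation against the cached SHA-1 (hypothetical helper).

    DELTA representations are not handled; they return False here.
    """
    header = b"PLAIN\n"
    if offset < 0 or not rev_data.startswith(header, offset):
        return False
    start = offset + len(header)
    body = rev_data[start:start + size]
    # A short read means the cached size runs past the end of the file.
    return len(body) == size and hashlib.sha1(body).hexdigest() == sha1_hex


if __name__ == "__main__":
    # Synthetic stand-in for a revision file, not real repository data.
    data = b"PLAIN\nfoo\nENDREP\n"
    print(looks_like_rep(data, 0))
    print(looks_like_rep(data, 4))
    print(plain_rep_matches(data, 0, 4, hashlib.sha1(b"bar\n").hexdigest()))
```

The first function corresponds to the bad-offset example and the second to
the bad-hash example. Both only look at the one revision/offset pair
returned by the cache, not at the rest of the revision or the rep-cache.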

Received on 2012-07-31 11:27:15 CEST

This is an archived mail posted to the Subversion Dev mailing list.
