
Re: Checksums (Re: [PATCH]: svndiff version 1)

From: Daniel Berlin <dan_at_dberlin.org>
Date: 2002-02-26 01:48:48 CET

On Mon, 25 Feb 2002, Zack Weinberg wrote:

> This isn't strictly related to your changes, but since you're digging
> around in this stuff anyway: I noticed not so long ago that the MD5
> checksums stored in each diff representation were always all-bytes-zero.

I noticed they are weird too. Sometimes they contain words, strings that
seem to come from the file, and so on.
Nothing that makes sense as a digest.

I was going to track it down this week. I suspect we end up with a pointer
to random memory rather than to the digest, and write whatever it points at.

> I'm sure this is a straightforward bug to find and fix.
> On a larger scale, I'd like to put forward the proposal that
> *everything* we store in the database should have a checksum attached
> to it. MD5 is probably overkill for everything but the strings table
> however, I want Subversion to detect and report immediately when a
> disk block has gone south, causing a chunk of data in the middle of a
> storage pool to be overwritten with NUL bytes.

Verification of the MD5s on svndiffs is next on my list, and something I
wanted to take care of for this patch.
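
As a rough illustration of the kind of verification meant here, a minimal
sketch in Python rather than Subversion's C. The names store_rep and
load_rep are hypothetical and are not Subversion functions; the point is
only that the digest is recomputed on read and compared with the stored one:

```python
import hashlib

def store_rep(data):
    """Return (digest, data) as it would be written out.

    Hypothetical helper: the digest is computed from the data itself,
    never taken from an unrelated buffer.
    """
    return hashlib.md5(data).digest(), data

def load_rep(digest, data):
    """Recompute the MD5 on read and refuse data that does not match."""
    actual = hashlib.md5(data).digest()
    if actual != digest:
        raise ValueError(
            "checksum mismatch: stored %s, computed %s"
            % (digest.hex(), actual.hex())
        )
    return data
```

With a check like this on the read path, an all-zero or garbage stored
digest would be reported the first time the representation is read back,
instead of going unnoticed.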

db_verify should be able to confirm that the database structures are okay,
and it can detect corruption of keys in most cases. Since it's a B-tree
database, the keys have to appear in a certain order, and db_verify will
tell you if they are out of order. For the nodes database, ignore those
messages, since we use our own comparison function. For the other
databases, out-of-order key messages mean something is corrupt.

> This is not a hypothetical problem; it has happened repeatedly to
> GCC's CVS repository. Because RCS doesn't validate ,v files, it can
> go unnoticed for years, and then good luck finding a backup to restore
> that block from.

Look at ChangeLog,v.BAD in the repo, fer instance.

> Perhaps this is already taken care of by BDB, but then why do we have
> checksums in svndiff representation entries?

Nah, it can only detect certain types of corruption (for instance, I
could corrupt the keys, and as long as I don't change the ordering, it
won't detect it).
We need to use/verify checksums.
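
To make the difference concrete, a toy sketch in Python. This is a
simplified model of an ordering check, not what db_verify actually does
internally, and make_record, keys_in_order and bad_checksums are
hypothetical names. It shows a key corruption that preserves sort order
slipping past the ordering check while a per-record MD5 catches it:

```python
import hashlib

def make_record(key, value):
    """Build (key, value, digest), with the digest covering key and value."""
    return (key, value, hashlib.md5(key + value).digest())

def keys_in_order(records):
    """Structural check only: keys must be strictly ascending."""
    keys = [key for key, _, _ in records]
    return all(a < b for a, b in zip(keys, keys[1:]))

def bad_checksums(records):
    """Return the keys of records whose stored digest no longer matches."""
    return [
        key
        for key, value, digest in records
        if hashlib.md5(key + value).digest() != digest
    ]

records = [
    make_record(b"a1", b"x"),
    make_record(b"b2", b"y"),
    make_record(b"c3", b"z"),
]

# Corrupt the middle key: b"b9" still sorts between b"a1" and b"c3",
# so the ordering check alone has nothing to complain about.
_, value, digest = records[1]
corrupted = [records[0], (b"b9", value, digest), records[2]]

print(keys_in_order(corrupted))   # ordering check passes
print(bad_checksums(corrupted))   # checksum check flags the damaged record
```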


Received on Sat Oct 21 14:37:09 2006

This is an archived mail posted to the Subversion Dev mailing list.