
Re: svn commit: r911804 - /subversion/trunk/subversion/libsvn_wc/wc-metadata.sql

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Mon, 22 Feb 2010 11:03:51 +0000

Greg Stein wrote:
> On Fri, Feb 19, 2010 at 08:13, <julianfoad_at_apache.org> wrote:
> >...
> > +++ subversion/trunk/subversion/libsvn_wc/wc-metadata.sql Fri Feb 19 13:13:09 2010
> > @@ -172,7 +172,9 @@
> > and ACTUAL_NODE tables.
> > */
> > CREATE TABLE PRISTINE (
> > - /* ### the hash algorithm (MD5 or SHA-1) is encoded in this value */
> > + /* The SHA-1 checksum of the pristine text. This is a unique key. The
> > + SHA-1 checksum of a pristine text is assumed to be unique among all
> > + pristine texts referenced from this database. */
> > checksum TEXT NOT NULL PRIMARY KEY,
>
> That comment is now redundant with the PRIMARY KEY attached to that column.

Not quite. Perhaps someone can write this in better words for me. What I
wanted to say was:

"Look, this is an assumption on which the model depends. Don't
'discover' it for yourself and flame us about it. We know that there is
a theoretical possibility of a clash, but it is so much less likely than
many other kinds of problem that we can treat it as a unique key for
practical purposes. If texts have been specially constructed so as to
have the same SHA-1 checksum, as might be done in cryptography research,
that would defeat this assumption, but everyone else stands far more
chance of being hit by a meteorite."

Such an explanatory note would probably be better in some higher-level
place, such as in the PRISTINE table's main doc string or in a different
document, rather than on that particular column where I put it. How
about I move it to the table's main doc string and change the wording
to:

(Note: The PRISTINE table is indexed by the SHA-1 checksum of the
pristine text. A cryptography researcher might have different texts that
are specially constructed so as to have the same SHA-1 checksum, but for
anyone else the chance of ever having a clash is vanishingly small.)

?
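To put a rough number on "vanishingly small", here is a back-of-the-envelope sketch in Python. It assumes SHA-1 outputs behave like uniformly random 160-bit values for texts that were not deliberately constructed to collide, and it uses the simple union bound over all pairs of texts. The function name and the figure of a billion texts are illustrative assumptions, not anything taken from the Subversion code.

```python
from fractions import Fraction


def collision_probability_bound(num_texts: int, hash_bits: int = 160) -> Fraction:
    """Upper bound on the chance that any two of num_texts random hashes clash.

    Union bound: (number of pairs) / 2**hash_bits. This assumes hash outputs
    are uniformly random, which does NOT hold for texts deliberately
    constructed to collide.
    """
    pairs = num_texts * (num_texts - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)


# Hypothetical working copy database referencing a billion pristine texts.
bound = collision_probability_bound(10 ** 9)
print(float(bound))
```

Even at that deliberately exaggerated size, the bound stays below one in 10^30, which is the sense in which the checksum can be treated as a unique key for practical purposes.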

> > /* ### enumerated values specifying type of compression. NULL implies
> > @@ -189,7 +191,8 @@
> > refcount INTEGER NOT NULL,
> >
> > /* Alternative MD5 checksum used for communicating with older
> > - repositories. */
> > + repositories. Not guaranteed to be unique among table rows.
>
> pfft. riiiiiight.

Likewise. What I wanted to say was something like:

"The MD5 checksum, like the SHA-1 checksum, is considered distinctive
enough for all practical purposes (except cryptography research).
However, as some clashes have been reported in the world, it would be
best if the code did not assume this is a unique key."

Hmmm... parentheses and "strictly" will help. How about I tone it down
to the following:

  /* Alternative MD5 checksum used for communicating with older
     repositories. (This is not strictly guaranteed to be a unique
     key, although in practice it nearly always will be.)
     NULL if not (yet) calculated. */
  md5_checksum TEXT

?
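For illustration only, here is a minimal sketch of what "don't assume md5_checksum is a unique key" could mean for calling code. It uses Python's sqlite3 module and a cut-down PRISTINE table containing only the columns visible in the diff above; the helper name find_by_md5 and the placeholder checksum strings are hypothetical, and this is not the libsvn_wc implementation.

```python
import sqlite3

# Simplified stand-in for the PRISTINE table: only the columns shown in the
# quoted diff, with the SHA-1 checksum as primary key and a nullable,
# non-unique MD5 column.
SCHEMA = """
CREATE TABLE PRISTINE (
  checksum TEXT NOT NULL PRIMARY KEY,
  refcount INTEGER NOT NULL,
  md5_checksum TEXT
)
"""


def find_by_md5(conn, md5):
    """Return every SHA-1 key whose row carries this MD5 (possibly several)."""
    rows = conn.execute(
        "SELECT checksum FROM PRISTINE WHERE md5_checksum = ? ORDER BY checksum",
        (md5,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
insert = "INSERT INTO PRISTINE (checksum, refcount, md5_checksum) VALUES (?, ?, ?)"

# Placeholder strings stand in for real hex digests.
conn.execute(insert, ("sha1-aaa", 1, "md5-xxx"))
conn.execute(insert, ("sha1-bbb", 1, "md5-xxx"))  # same MD5: allowed
conn.execute(insert, ("sha1-ccc", 1, None))       # MD5 not (yet) calculated

# A second row with an existing SHA-1 key is rejected by the PRIMARY KEY.
try:
    conn.execute(insert, ("sha1-aaa", 1, "md5-yyy"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected, find_by_md5(conn, "md5-xxx"))
```

The point of returning a list is that a lookup by MD5 has to cope with zero, one, or several matching rows, whereas a lookup by the SHA-1 key can rely on at most one.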

- Julian

> > + NULL if not (yet) calculated. */
> > md5_checksum TEXT
> > );
Received on 2010-02-22 12:04:30 CET

This is an archived mail posted to the Subversion Dev mailing list.
