[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1794632 - /subversion/trunk/notes/sha1-advisory.txt

From: Stefan Sperling <stsp_at_apache.org>
Date: Wed, 10 May 2017 13:34:07 +0200

On Wed, May 10, 2017 at 09:11:50AM +0000, Daniel Shahaf wrote:
> > Summary:
> > ========
> >
> > Subversion repositories can be corrupted by committing two files
> > which have different content, yet produce the same SHA1 checksum.
>
> I don't think we should call this "corruption": the on-disk data
> structures are intact, both syntactically and semantically. The problem
> is in the libraries' assumption that sha1 has no collisions.
>
> I'm afraid I don't have a good suggestion; perhaps "Distinct files that
> have equal sha1 checksums cannot be checked out"?

I think we should call it corruption simply because it looks like
that to our users when it happens (see webkit).

This is a user-facing text. We want users to take action and upgrade so
they won't run into the problem. The purpose of this text is to raise
awareness. It is not to communicate technical details of the problem,
which can be obtained by other means (reading code, mailing lists, etc.)

I expect "corruption" will turn on people's alarm bells more than your
suggested wording which is very exact but also sounds less dramatic.

> > Details:
> > ========
> >
> > In February 2017 a group of researchers released two PDF files which have
> > different content but produce the same SHA1 checksum. This was the first
> > publicly known SHA1 collision ever produced.
> >
> > If both of these files are committed to a Subversion repository, Subversion
> > de-duplicates content based on the SHA1 checksum and only the content of
>
> Missing qualifiers: only for FSFS and FSX and only if rep-sharing is
> enabled. (I see the "Recommendations" section says that, but I think
> they belong here.)

Ack, fixed.

> > one of the files ends up being stored in the repository. However, meta data
> > stores the MD5 checksums of both files, and these MD5 checksums differ.
> > This causes problems when Subversion eventually uses the MD5 checksum of
> > the content which was not in fact stored. For example, updates and commits
> > may no longer be possible due to an apparent checksum error.
>
> It'd be shorter and clearer to say "may fail with a checksum error".

Yes. I like this much better, too!

> Moreover, the error is not spurious; on the contrary: it functions
> exactly as designed, and prevents the wrong file from being used. Let's
> say this in the advisory?

Too much detail, I'd say.

> The problem is that in a few weeks, when the 'shattered' exploit code is
> released, it'll be affordable to create files that collide sha1 and md5
> simultaneously, so the md5 checksum error will no longer happen. At
> that point we may have to revise this advisory.

Well, this section describes what happens with these specific two PDF files.
I think it's fine as it is. Once people have installed our patch they don't
need to worry about future collisions anyway.

And further in the text, we do explain that there is a general reliance on
SHA1 on the client side as well. Smart cookies will infer the rest.

> > Recommendations:
> > ================
> >
> > We recommend all users update to Subversion 1.9.6 which will reject any
> > commit that would create a SHA1 collision.
> (Nitpick: s/update/upgrade/)
> > Note that this fix only works if the "representation-sharing" feature is
> > enabled (it is enabled by default). If the file db/fsfs.conf inside the
> > repository contains 'enable-rep-sharing = false', this option must be
> > set to 'true' after upgrading to 1.9.6.
> >
> > One solution is just to delete the second file. This will resolve this
> > problem for normal SVN client usage, but it will not work for tools like
> > svnsync or git-svn which try to replay every revision in the repository.
> > This will run into an error on the revision where the content was committed
> > and the tool will not be able to proceed.
>
> s/This/they/; s/the tool//

Sure.

>
> > A second solution would be to remove the problematic revision with svnadmin.
> > svnadmin dump can be used to dump the repository up to the revision that
>
> Quote the command name 'svnadmin dump'?

Sure.

>
> > introduced the problem. This dump file can be loaded into a new repository.
> > If there were more commits after the problematic revision then dump and load
> > all of these subsequent revisions as well.
>
> Mention 'svndumpfilter exclude'?

That may be a third possibility. But has anyone tested it?

> > Another option is to create a Subversion permission rule (authz) that blocks
> > access to the one or both of the files. This will work with tools like
> > svnsync and git-svn as the server will not send the colliding content.
>
> I suggest to give an example; people might not realise that it's
> possible to write authz rules for single files (as opposed to
> directories). E.g.,
>
> [/trunk/tests/data/shattered1.pdf]
> * =
> [/trunk/tests/data/shattered2.pdf]
> * =

Indeed, I agree.

I have committed all the aforementioned changes in r1794707.

Thanks!
Received on 2017-05-10 13:36:10 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.