
Re: svn commit: r1794632 - /subversion/trunk/notes/sha1-advisory.txt

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Wed, 10 May 2017 09:11:50 +0000

[ Reviewing the whole file as of this revision. ]

stsp_at_apache.org wrote on Tue, May 09, 2017 at 19:07:09 -0000:
> Author: stsp
> Date: Tue May 9 19:07:09 2017
> New Revision: 1794632
>
> URL: http://svn.apache.org/viewvc?rev=1794632&view=rev
> Log:
> * notes/sha1-advisory.txt: wording tweak
>
> Modified:
> subversion/trunk/notes/sha1-advisory.txt

> Apache Subversion is unable to store SHA1 collisions
>
> Summary:
> ========
>
> Subversion repositories can be corrupted by committing two files
> which have different content, yet produce the same SHA1 checksum.

I don't think we should call this "corruption": the on-disk data
structures are intact, both syntactically and semantically. The problem
is in the libraries' assumption that sha1 has no collisions.

I'm afraid I don't have a good suggestion; perhaps "Distinct files that
have equal sha1 checksums cannot be checked out"?

> Details:
> ========
>
> In February 2017 a group of researchers released two PDF files which have
> different content but produce the same SHA1 checksum. This was the first
> publicly known SHA1 collision ever produced.
>
> If both of these files are committed to a Subversion repository, Subversion
> de-duplicates content based on the SHA1 checksum and only the content of

Missing qualifiers: only for FSFS and FSX and only if rep-sharing is
enabled. (I see the "Recommendations" section says that, but I think
they belong here.)
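
For readers who want to check whether a given FSFS repository has rep-sharing enabled, here is a minimal sketch in Python. It assumes the option sits in a [rep-sharing] section of db/fsfs.conf and that an absent or commented-out option means "enabled" (the default the advisory mentions). The function name is hypothetical, and Python's configparser only approximates Subversion's own config parser.

```python
import configparser


def rep_sharing_enabled(fsfs_conf_text):
    """Return True unless the config text explicitly disables rep-sharing.

    Assumption: the option is 'enable-rep-sharing' in a '[rep-sharing]'
    section, and it defaults to true when missing or commented out.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(fsfs_conf_text)
    return parser.getboolean("rep-sharing", "enable-rep-sharing", fallback=True)
```

One would pass it the contents of db/fsfs.conf read from inside the repository directory.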

> one of the files ends up being stored in the repository. However, meta data
> stores the MD5 checksums of both files, and these MD5 checksums differ.
> This causes problems when Subversion eventually uses the MD5 checksum of
> the content which was not in fact stored. For example, updates and commits
> may no longer be possible due to an apparent checksum error.

It'd be shorter and clearer to say "may fail with a checksum error".

Moreover, the error is not spurious; on the contrary: it functions
exactly as designed, and prevents the wrong file from being used. Let's
say this in the advisory?

The problem is that in a few weeks, when the 'shattered' exploit code is
released, it'll be affordable to create files that collide sha1 and md5
simultaneously, so the md5 checksum error will no longer happen. At
that point we may have to revise this advisory.
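
To make the mechanism concrete, here is a toy model in Python. It is not Subversion's code: ToyRepo, weak_key and find_colliding_pair are hypothetical names, and the sharing key is deliberately truncated to one byte of SHA1 so that a collision can be found by brute force. It only illustrates the two cases discussed above: a de-duplicating store whose separately recorded checksum catches the wrong content, and the same store when that checksum collides as well.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when stored content does not match the recorded checksum."""


class ToyRepo:
    """Toy de-duplicating store; NOT Subversion's actual implementation."""

    def __init__(self, share_key, verify_sum):
        self.share_key = share_key    # stands in for SHA1 (rep-sharing key)
        self.verify_sum = verify_sum  # stands in for MD5 (recorded per file)
        self.reps = {}                # share key -> stored content
        self.nodes = {}               # path -> (share key, recorded checksum)

    def commit(self, path, content):
        key = self.share_key(content)
        # De-duplicate: only the first content seen for a key is stored.
        self.reps.setdefault(key, content)
        self.nodes[path] = (key, self.verify_sum(content))

    def read(self, path):
        key, recorded = self.nodes[path]
        content = self.reps[key]
        # The verification checksum is what notices the substituted content.
        if self.verify_sum(content) != recorded:
            raise ChecksumMismatch(path)
        return content


def weak_key(data):
    # Deliberately weak: one byte of SHA1, so collisions are trivial to find.
    return hashlib.sha1(data).digest()[:1]


def md5_sum(data):
    return hashlib.md5(data).hexdigest()


def find_colliding_pair(key):
    """Brute-force two distinct byte strings that share the same key."""
    seen = {}
    i = 0
    while True:
        data = b"file-%d" % i
        k = key(data)
        if k in seen:
            return seen[k], data
        seen[k] = data
        i += 1
```

With ToyRepo(weak_key, md5_sum), committing a colliding pair and then reading the second file raises ChecksumMismatch, which is the "functions as designed" case. With ToyRepo(weak_key, weak_key), reading the second file silently returns the first file's content, which is the case we would face once simultaneous sha1 and md5 collisions become affordable.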

> Recommendations:
> ================
>
> We recommend all users update to Subversion 1.9.6 which will reject any
> commit that would create a SHA1 collision.

(Nitpick: s/update/upgrade/)

> Note that this fix only works if the "representation-sharing" feature is
> enabled (it is enabled by default). If the file db/fsfs.conf inside the
> repository contains 'enable-rep-sharing = false', this option must be
> set to 'true' after upgrading to 1.9.6.
>
> One solution is just to delete the second file. This will resolve this
> problem for normal SVN client usage, but it will not work for tools like
> svnsync or git-svn which try to replay every revision in the repository.
> This will run into an error on the revision where the content was committed
> and the tool will not be able to proceed.

s/This/They/; s/the tool //

> A second solution would be to remove the problematic revision with svnadmin.
> svnadmin dump can be used to dump the repository up to the revision that

Quote the command name 'svnadmin dump'?

> introduced the problem. This dump file can be loaded into a new repository.
> If there were more commits after the problematic revision then dump and load
> all of these subsequent revisions as well.

Mention 'svndumpfilter exclude'?

> Another option is to create a Subversion permission rule (authz) that blocks
> access to the one or both of the files. This will work with tools like
> svnsync and git-svn as the server will not send the colliding content.

I suggest giving an example; people might not realise that it's
possible to write authz rules for single files (as opposed to
directories). E.g.,

     [/trunk/tests/data/shattered1.pdf]
     * =
     [/trunk/tests/data/shattered2.pdf]
     * =

Cheers,

Daniel
Received on 2017-05-10 11:12:01 CEST

This is an archived mail posted to the Subversion Dev mailing list.