
Re: Files with identical SHA1 breaks the repo

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Wed, 1 Mar 2017 12:18:02 +0000

Stefan Sperling wrote on Wed, Mar 01, 2017 at 11:01:40 +0100:
> On Tue, Feb 28, 2017 at 10:17:34PM -0600, Greg Stein wrote:
> > I really like this idea.
> >
> > And we could take a copy of APR's sha1 code, and rejigger it to perform
> > *both* hashes during the same scan of the raw bytes. I would expect the
> > time taken to extend by (say) 1.1X rather than a full 2X. The inner loop
> > might cost a bit more, but we'd only scan the bytes once. Very handy, when
> > you're talking about megabytes in a stream-y environment.
> >
> > (and medium-term, push this dual-sha1 computation back into APR)
>
> The idea is nice and I would support its application.

I would not.

Using two variants of sha1 is fine for supporting webkit's use-case, or
to protect admins from disgruntled employees who commit the known
collision to DoS their employer.

However, when the attacker model is a competent, resourceful attacker,
using two variants of sha1 increases the attack's complexity only by
a factor of about 2⁷ ([1, §4.1]), and might in fact make collisions
_easier_ to find (since additional state is output).
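
One way to read the "about 2⁷" figure, as a rough sketch: it assumes
both variants are iterated 160-bit hashes and that the multicollision
argument of [1] applies, so that roughly n/2 = 80 successive collisions
in the first hash yield 2^80 colliding messages, enough for a birthday
match in the second. The function name and the cost model below are
illustrative assumptions, not taken from the paper or from this thread.

```python
import math

def concat_overhead_bits(n_bits: int = 160) -> float:
    """log2 of the extra work factor for colliding two concatenated
    n-bit iterated hashes, relative to colliding a single one.

    Assumed model: n/2 successive single-hash collisions build a
    2^(n/2)-way multicollision in the first hash, so the overhead
    factor is about n/2.
    """
    return math.log2(n_bits / 2)

# For 160-bit digests the factor is 80, i.e. a little over 2^6,
# which rounds up to the "about 2^7" quoted above.
print(f"overhead ~ 2^{concat_overhead_bits():.2f}")
```

Compare that with the factor of 2^80 one might naively expect from
doubling the output length.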

So let's not add a variant of sha1. Instead, if we're concerned about
disgruntled employees, we should detect collisions; and if we're
concerned about competent attackers, we should use a stronger/longer
industry-standard hash function.

(Note that we already have both md5 and sha1, so we check (or can check)
both of them. And there's always full byte-by-byte comparisons...)
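
A minimal sketch of that kind of check, in Python rather than
Subversion's own C code: both existing digests are computed in one pass
over the data, and two texts whose sha1 matches are only treated as
identical if md5 and the bytes themselves agree too. The helper names
(`digests_one_pass`, `is_sha1_collision`) are made up for illustration
and do not correspond to any Subversion or APR API.

```python
import hashlib
import io

def digests_one_pass(stream, chunk_size=64 * 1024):
    """Return (md5_hex, sha1_hex), reading the stream only once."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Feed the same chunk to both hashes so the data is scanned once.
        md5.update(chunk)
        sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def is_sha1_collision(a: bytes, b: bytes) -> bool:
    """True if a and b share a sha1 digest but are different content."""
    md5_a, sha1_a = digests_one_pass(io.BytesIO(a))
    md5_b, sha1_b = digests_one_pass(io.BytesIO(b))
    if sha1_a != sha1_b:
        return False  # different sha1: not a collision at all
    # Same sha1: an md5 mismatch already proves the contents differ;
    # otherwise fall back to a full byte-by-byte comparison.
    return md5_a != md5_b or a != b
```

In this sketch the md5 comparison is only a cheap early signal; the
byte-by-byte comparison is what actually decides the answer.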

Cheers,

Daniel

[1] https://www.iacr.org/archive/crypto2004/31520306/multicollisions.pdf

> Note however that it does not help with fixing current releases.
> We would need to store this second hash somewhere, which implies a format
> and/or protocol change, depending on where the idea is applied (rep-cache,
> ra-serf, pristine store, ...)
>
> For now, we should focus on solutions that can be backported because that's
> what our users need most. Our current formats and protocols will only
> store/send MD5 and SHA1 of the full content so for 1.8 and 1.9 we will
> have to find something that works within these restrictions.
> One option would be to disable affected features. But some features can't
> just be disabled, such as the pristine store.
>
> In theory the existing system should work as it is as long as only one side
> of the collision is allowed to survive. We will need format changes only to
> allow storing both PDFs. We could delay 1.10 a bit to gain time for working
> out long-term solutions which imply format changes.
Received on 2017-03-01 13:22:48 CET

This is an archived mail posted to the Subversion Dev mailing list.