
Re: Re: dangerous implementation of rep-sharing cache for fsfs

From: Mark Phippard <markphip_at_gmail.com>
Date: Fri, 25 Jun 2010 09:00:02 -0400

On Fri, Jun 25, 2010 at 8:45 AM, <michael.felke_at_evonik.com> wrote:
> 4. You underestimate the error introduced by misusing mathematical methods.
>   As I already said in my first e-mail, SHA-1 was developed
>   to detect random and willful data manipulation.
>   It's a cryptographic hash, so there is a low chance of
>   guessing or calculating a derived data sequence
>   that generates the same hash value as the original data.
>   But this is the only thing it ensures.
>   There is no evidence that the hash values are
>   equally distributed over the data sets, which is important for
>   the use of a hashing method in data fetching.
>   In fact, as it's a cryptographic hash,
>   you should not be able to calculate it,
>   because this would mean that you are able
>   to calculate sets of data resulting in the same hash value.
>   So you can't conclude from the low chance of
>   guessing or calculating a derived data sequence
>   a low chance of hash collisions in general.
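
For a sense of scale, here is a rough sketch of the accidental-collision chance under the birthday approximation. It assumes SHA-1 outputs behave like uniformly random 160-bit values, which is exactly the assumption questioned in the quoted text, so treat it as an illustration rather than a proof. The function name `collision_probability` and the sample count are made up for this example.

```python
import math


def collision_probability(n_items: int, hash_bits: int = 160) -> float:
    """Approximate chance that at least two of n_items hash values collide.

    Assumption: hash values are independent and uniformly distributed
    over 2**hash_bits possible outputs (not proven for SHA-1).
    """
    space = 2.0 ** hash_bits
    # Birthday approximation: p ~= 1 - exp(-n * (n - 1) / (2 * N))
    exponent = -n_items * (n_items - 1) / (2.0 * space)
    # expm1 keeps precision when the exponent is extremely close to zero
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical repository holding one trillion distinct representations
    print(collision_probability(10**12))
```

Under that assumption, a trillion distinct representations give a chance on the order of 10^-25. The estimate says nothing about deliberately constructed collisions.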

I am in favor of making our software more reliable; I just do not want
to see us handicap ourselves by programming against a problem that is
unlikely to ever happen. If this is so risky, then why are so many
people using Git? Isn't it built entirely on this concept of using
SHA-1 hashes to identify content? If you Google for
this you can find plenty of flame wars over this topic with Git, but I
also notice blog posts like this one:


We are already performance-challenged. Doing extra hash calculations
for a problem that is not going to happen does not seem like a sound

Mark Phippard
Received on 2010-06-25 15:00:42 CEST

This is an archived mail posted to the Subversion Dev mailing list.