On Fri, Jun 25, 2010 at 8:45 AM, <michael.felke_at_evonik.com> wrote:
> 4. You underestimate the error made by misusing mathematical methods.
>
> As I already said in my first e-mail, SHA-1 was developed
> to detect random and willful data manipulation.
> It's a cryptographic hash, so there is a low chance of
> guessing or calculating a derived data sequence
> that generates the same hash value as the original data.
> But this is the only thing it ensures.
> There is no evidence that the hash values are
> equally distributed over the data sets, which is important for
> the use of a hashing method in data fetching.
> In fact, as it's a cryptographic hash,
> you should not be able to calculate it,
> because this would mean that you are able
> to calculate sets of data resulting in the same hash value.
> So you can't conclude from the low chance of
> guessing or calculating a derived data sequence
> that there is a low chance of hash collisions in general.
I am in favor of making our software more reliable; I just do not want
to see us handicap ourselves by programming against a problem that is
unlikely to ever happen. If this is so risky, then why are so many
people using Git? Isn't it built entirely on this concept of using
SHA-1 hashes to identify content? If you Google for this you can find
plenty of flame wars over this topic with Git, but I also notice blog
posts like this one:
http://theblogthatnoonereads.davegrijalva.com/2009/09/25/sha-1-collision-probability/
We are already performance-challenged. Doing extra hash calculations
for a problem that is not going to happen does not seem like a sound
decision.
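
For a rough sense of scale, here is a minimal sketch of the usual
birthday-bound estimate. It assumes that SHA-1 outputs behave like
uniformly random 160-bit values, which is exactly the assumption the
quoted message disputes, so treat the numbers as illustrative only.
The function name collision_probability and the item counts are
made up for this example.

```python
import math


def collision_probability(n_items: int, hash_bits: int = 160) -> float:
    """Approximate chance that at least two of n_items share a hash.

    Uses the birthday-bound approximation
        p ~= 1 - exp(-n * (n - 1) / (2 * 2**hash_bits))
    ASSUMPTION: hash values are uniformly distributed over the
    2**hash_bits possible outputs.
    """
    space = 2.0 ** hash_bits
    exponent = -n_items * (n_items - 1) / (2.0 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical repository sizes, chosen only for illustration.
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>15,d} items -> p ~= {collision_probability(n):.3e}")
```

Under that uniformity assumption, the estimate only approaches an even
chance once the number of distinct items is on the order of 2**80.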
--
Thanks
Mark Phippard
http://markphip.blogspot.com/
Received on 2010-06-25 15:00:42 CEST