
Re: fs-rep-sharing branch

From: David Glasser <glasser_at_davidglasser.net>
Date: Wed, 22 Oct 2008 09:37:26 -0700

On Wed, Oct 22, 2008 at 9:32 AM, David Glasser <glasser_at_davidglasser.net> wrote:
> On Wed, Oct 22, 2008 at 5:34 AM, Greg Stein <gstein_at_gmail.com> wrote:
>> On Wed, Oct 22, 2008 at 4:57 AM, Alan Barrett <apb_at_cequrux.com> wrote:
>>> On Tue, 21 Oct 2008, Greg Stein wrote:
>>>> The simple fact is that we're going to be running around with md5
>>>> checksums in hand for a long while. OR we double-compute, and I'm not
>>>> willing to burn that much CPU to satisfy somebody's misguided
>>>> preconception about md5 collisions.
>>>
>>> What "misguided preconception" did you see in David Glasser's
>>> description of the problem? It seems like quite a real problem to me.
>>
>> That unintended collisions can occur or an attacker is going to
>> somehow cause problems for an svn repository by using md5 collisions
>> against it. That's how the discussion started (somewhat on the mailing
>> list, and definitely on IRC).
>>
>> Later, the idea of researchers surfaced, and that they'd be trying to
>> store file pairs with the same hash. I grant your scenario is valid.
>>
>> We *still* have all the problems that md5 is fully-intertwined in our
>> code. I'm still not willing to do double-checksums and kill millions
>> of coders for a few researchers who could simply tar their candidate
>> pairs together, or gzip them. Yes, that's the brutal truth :-P ... the
>> researchers need to use workarounds, and the millions get a fast
>> product.
>
> It's still a regression vs 1.5. Can you at least acknowledge that?
>
> Also, if double-hash-computation is such a horrible thing, then why is
> it OK for fs_base to use it but not fs_fs?

In any case, if speed is the real concern, I find it difficult to
believe (but haven't actually done benchmarks) that "doing a second
hash computation at the same time as the first for each file during a
commit" is going to be noticeable compared to "do three SQLite queries
for each file in a commit".
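
To make "at the same time" concrete, here is a minimal sketch of a
single-pass double checksum. It is illustrative Python, not Subversion
code: the name dual_digest is hypothetical, and SHA-1 as the second hash
is an assumption. Each chunk is read once and fed to both hashers, so the
extra cost is one more in-memory hash update per chunk, with no second
read of the file.

```python
import hashlib
import io


def dual_digest(stream, chunk_size=64 * 1024):
    """Return (md5_hex, sha1_hex) for a binary stream, reading it only once.

    Hypothetical helper for illustration; SHA-1 as the second hash is an
    assumption, not something the thread specifies.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        # Both hashers consume the same in-memory chunk.
        md5.update(chunk)
        sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


if __name__ == "__main__":
    data = b"example file contents\n" * 1000
    md5_hex, sha1_hex = dual_digest(io.BytesIO(data))
    print("md5 :", md5_hex)
    print("sha1:", sha1_hex)
```

Whether that extra update is actually negligible next to the SQLite
queries would still need a benchmark; the sketch only shows that no
additional I/O is involved.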

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
Received on 2008-10-22 18:37:40 CEST

This is an archived mail posted to the Subversion Dev mailing list.