
Re: fs-rep-sharing branch

From: David Glasser <glasser_at_davidglasser.net>
Date: Tue, 21 Oct 2008 20:36:57 -0700

Er, to be clear. I am *not* talking about changing any use of md5s as
*checksums*, like in the editor interface, etc.

I'm talking about the use of md5s as *keys*.

An md5 checksum collision just means that corruption might go
unnoticed. An md5 key collision means that there are realistic
repositories that simply cannot exist.

--dave
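
The checksum-versus-key distinction above can be sketched with a toy content-addressed store. This is only an illustration, not the FSFS code: `SharedRepStore`, `weak_key`, `find_collision`, and `verify_checksum` are hypothetical names, and `weak_key` truncates the MD5 digest to one byte purely so that a collision is cheap to find.

```python
import hashlib


def weak_key(data: bytes) -> str:
    # Hypothetical stand-in for a collision-prone hash: keep only the
    # first byte (two hex digits) of the MD5 digest, so there are 256 keys.
    return hashlib.md5(data).hexdigest()[:2]


def verify_checksum(data: bytes, expected: str) -> bool:
    # Checksum use: a collision here only means corrupted data could pass.
    return weak_key(data) == expected


class SharedRepStore:
    """Toy store that shares representations purely by hash key."""

    def __init__(self, key_func):
        self.key_func = key_func
        self.reps = {}

    def put(self, data: bytes) -> str:
        key = self.key_func(data)
        # Rep-sharing assumption: equal key means equal content,
        # so an existing representation is reused, never replaced.
        self.reps.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self.reps[key]


def find_collision(key_func):
    # Brute-force two distinct inputs with the same key. With 256
    # possible keys this terminates within 257 attempts.
    seen = {}
    i = 0
    while True:
        data = b"file-%d" % i
        key = key_func(data)
        if key in seen:
            return seen[key], data
        seen[key] = data
        i += 1


a, b = find_collision(weak_key)
store = SharedRepStore(weak_key)
key_a = store.put(a)
key_b = store.put(b)

# Checksum view: b passes a's checksum, so swapping them goes unnoticed.
print(verify_checksum(b, key_a))
# Key view: b was never stored; reading it back returns a's content.
print(a != b, key_a == key_b, store.get(key_b) == a)
```

In the checksum case the worst outcome is an undetected substitution. In the key case the second file is silently replaced by the first, so the store cannot hold both.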

On Tue, Oct 21, 2008 at 8:34 PM, David Glasser <glasser_at_davidglasser.net> wrote:
> Did you miss the "I have real experience doing support for Subversion
> repositories for cryptographic researchers who would in fact be trying
> to make these collisions"? md5 has known collisions. sha1 is still
> solid, for today. Most other open source version control systems
> using content-addressable stores use sha1. *fs_base* uses sha1. Why
> not FSFS?
>
> --dave
>
> On Tue, Oct 21, 2008 at 6:50 PM, Greg Stein <gstein_at_gmail.com> wrote:
>> There is a HUGE difference between constructing two files with the
>> same md5 in order to falsify a signature, and that of two files in a
>> repository having the same md5 hash by accident.
>>
>> Sit down and look at the odds. 1 in 2^128. If I understand my powers
>> of two properly, I believe that means the earth is more likely to
>> spontaneously explode, than for two files to have the same hash key.
>>
>> Cheers,
>> -g
>>
>> On Tue, Oct 21, 2008 at 3:57 PM, David Glasser <glasser_at_davidglasser.net> wrote:
>>> As far as I can tell from reading the source, this (at least in FSFS)
>>> assumes that reps sharing the same md5 are the same file. (BDB seems
>>> to use sha1.)
>>>
>>> This means that you cannot store two files with the same md5 in the
>>> same repository. While obviously all hashes have collisions in
>>> theory, md5 has collisions in practice: there are known instances.
>>> And you know, cryptography researchers use Subversion! (At one point
>>> I tried to help fix Ron Rivest's corrupted svn repo...) I do not
>>> think that this limitation is appropriate for Subversion; I would
>>> highly advise against releasing this without changing FSFS to use SHA
>>> as well. (I can't find a mailing-list discussion of this choice; my
>>> apologies if I missed one; I have admittedly not been paying as much
>>> attention as I'd like to Subversion development recently.)
>>>
>>> --dave
>>>
>>> On Mon, Oct 6, 2008 at 8:59 PM, Hyrum K. Wright
>>> <hyrum_wright_at_mail.utexas.edu> wrote:
>>>> The fs-rep-sharing branch is functionally complete, and I'd like to get the
>>>> branch merged to trunk soon. These are the stats for various copies of our
>>>> repository for the different branch/backend combinations.
>>>>
>>>> BDB:   1.5:          1.4GB
>>>>        trunk:        627MB
>>>>        reps-shared:  490MB
>>>>
>>>> FSFS:  1.5:          586MB
>>>>        trunk:        578MB
>>>>        reps-shared:  523MB
>>>>
>>>> The effect is quite pronounced on BDB, with around a 20% space savings compared
>>>> with our current trunk (and over 67% compared with 1.5!) FSFS doesn't show as
>>>> much improvement, partly due to the size of the index required to enable
>>>> rep-sharing, partly due to decreased sharing opportunities in same-revision and
>>>> parallel revision objects, and mostly due to the absolute floor on repo size due
>>>> to inode usage.
>>>>
>>>> We may be able to tune the FSFS implementation just a bit. For instance,
>>>> directory content representations may be unlikely to be shared, in
>>>> which case we shouldn't bother
>>>>
>>>> The remaining issue is the failing blame tests. Blame tests 10 and 11, which
>>>> test 'blame -g', both fail for both backends. Before the recent commits to add
>>>> rep-sharing to fsfs, the tests only failed for bdb. I'm slightly puzzled here
>>>> because 'blame -g' should be FS-agnostic. If anybody has some insight, I
>>>> welcome it.
>>>>
>>>> [Note: Because SQLite is still not an official dependency, to compile the
>>>> rep-sharing stuff with FSFS, you'll need to add -DENABLE_SQLITE_TESTING to the
>>>> CPPFLAGS when configuring.]
>>>>
>>>> -Hyrum
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
>>> For additional commands, e-mail: dev-help_at_subversion.tigris.org
>>>
>>>
>>
>

Received on 2008-10-22 05:37:11 CEST

This is an archived mail posted to the Subversion Dev mailing list.