
Re: fs-rep-sharing branch

From: Greg Stein <gstein_at_gmail.com>
Date: Tue, 21 Oct 2008 18:50:25 -0700

There is a HUGE difference between constructing two files with the
same md5 in order to falsify a signature, and two files in a
repository having the same md5 hash by accident.

Sit down and look at the odds: 1 in 2^128 for any given pair of files.
If I understand my powers of two properly, that means the earth is more
likely to spontaneously explode than two files are to have the same
hash key.
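
To put rough numbers on the accidental case, here is a minimal sketch of
the standard birthday (union) bound. It is not anything from the
Subversion source; the function name is made up for illustration, and it
assumes md5 digests behave like uniformly random 128-bit values. It says
nothing about deliberately constructed collisions.

```python
from fractions import Fraction

HASH_BITS = 128  # size of an md5 digest in bits


def accidental_collision_bound(n_items: int, bits: int = HASH_BITS) -> Fraction:
    """Upper bound on the chance that any two of n_items distinct inputs
    share a digest, assuming digests are uniformly random (hypothetical
    helper, for illustration only)."""
    # Number of unordered pairs among n_items inputs.
    pairs = n_items * (n_items - 1) // 2
    # Union bound: each pair collides with probability 1 / 2**bits.
    return Fraction(pairs, 2 ** bits)


if __name__ == "__main__":
    for n in (2, 10**6, 10**9):
        p = accidental_collision_bound(n)
        print(f"{n:>13,} reps: p <= {float(p):.3e}")
```

With only two files the bound is exactly 1 in 2^128; it grows roughly
with the square of the number of representations, but stays tiny even
for a billion of them.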

Cheers,
-g

On Tue, Oct 21, 2008 at 3:57 PM, David Glasser <glasser_at_davidglasser.net> wrote:
> As far as I can tell from reading the source, this (at least in FSFS)
> assumes that reps sharing the same md5 are the same file. (BDB seems
> to use sha1.)
>
> This means that you cannot store two files with the same md5 in the
> same repository. While obviously all hashes have collisions in
> theory, md5 has collisions in practice: there are known instances.
> And you know, cryptography researchers use Subversion! (At one point
> I tried to help fix Ron Rivest's corrupted svn repo...) I do not
> think that this limitation is appropriate for Subversion; I would
> highly advise against releasing this without changing FSFS to use SHA
> as well. (I can't find a mailing-list discussion of this choice; my
> apologies if I missed one, I have admittedly not been paying as much
> attention as I'd like to Subversion development recently.)
>
> --dave
>
> On Mon, Oct 6, 2008 at 8:59 PM, Hyrum K. Wright
> <hyrum_wright_at_mail.utexas.edu> wrote:
>> The fs-rep-sharing branch is functionally complete, and I'd like to get the
>> branch merged to trunk soon. These are the stats for various copies of our
>> repository for the different branch/backend combinations.
>>
>> BDB:
>>   1.5:          1.4GB
>>   trunk:        627MB
>>   reps-shared:  490MB
>>
>> FSFS:
>>   1.5:          586MB
>>   trunk:        578MB
>>   reps-shared:  523MB
>>
>> The effect is quite pronounced on BDB, with around a 20% space savings compared
>> with our current trunk (and over 67% compared with 1.5!) FSFS doesn't show as
>> much improvement, partly due to the size of the index required to enable
>> rep-sharing, partly due to decreased sharing opportunities in same-revision and
>> parallel revision objects, and mostly due to the absolute floor on repo size due
>> to inode usage.
>>
>> We may be able to tune the FSFS implementation just a bit. For instance,
>> directory content representations may be unlikely to be shared, in which
>> case we shouldn't bother.
>>
>> The remaining issue is the failing blame tests. Blame tests 10 and 11, which
>> test 'blame -g', both fail for both backends. Before the recent commits to add
>> rep-sharing to fsfs, the tests only failed for bdb. I'm slightly puzzled here
>> because 'blame -g' should be FS-agnostic. If anybody has some insight, I
>> welcome it.
>>
>> [Note: Because SQLite is still not an official dependency, to compile the
>> rep-sharing stuff with FSFS, you'll need to add -DENABLE_SQLITE_TESTING to the
>> CPPFLAGS when configuring.]
>>
>> -Hyrum
>>
>>
>>
>
>
>
> --
> David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
> For additional commands, e-mail: dev-help_at_subversion.tigris.org
>
>

Received on 2008-10-22 03:50:42 CEST

This is an archived mail posted to the Subversion Dev mailing list.
