
Re: Antwort: Re: Re: dangerous implementation of rep-sharing cache for fsfs

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Fri, 25 Jun 2010 22:34:25 +0300 (Jerusalem Daylight Time)

michael.felke_at_evonik.com wrote on Fri, 25 Jun 2010 at 19:33 -0000:
> Hello,
>
> Martin got my point:
> >> It's not the probability which concerns me, it's what happens when
> >> a file collides. If I understood the current algorithm right the
> >> new file will be silently replaced by an unrelated one and there
> >> will be no error and no warning at all. If it's some kind of
> >> machine verifiable file like source code the next build in
> >> a different working copy will notice. But if it's something else
> >> like documents or images it can go unnoticed for a very long time.
> >> The work may be lost by then. <<
>
> The data checked into the repository is exactly like this!
> It's mostly data generated by measurements, produced once,
> normally never changed or regenerated, and
> left untouched after being used once or twice.
> But then, suddenly and unexpectedly, someone comes and wants to see the
> data again,
> in the worst case to check it because of a lawsuit.
> Then it's too late to realize the data is wrong and
> the original has been silently dropped by the repository.
>

Then commit to the repository PGP signatures, or sha512's, or rot13's, or
base64's, or gzip's, of your data files, and set up a cron job to check out
fresh working copies nightly and manually verify the integrity.
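
As a rough illustration of such a nightly check, here is a minimal Python
sketch. It assumes a manifest file named SHA512SUMS (a hypothetical name) is
committed next to the data, with one "<hex digest>  <relative path>" entry
per line, and that a cron job has already checked out a fresh working copy.
The function names are made up for this example.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest_name="SHA512SUMS"):
    """Return the manifest entries whose file is missing or has changed.

    `manifest_name` is a hypothetical file in the working copy at `root`;
    each non-empty line holds a hex digest, whitespace, and a relative path.
    """
    root = Path(root)
    failures = []
    for line in (root / manifest_name).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(None, 1)
        name = name.lstrip("*").strip()  # tolerate the " *name" binary marker
        target = root / name
        if not target.is_file() or sha512_of(target) != expected.lower():
            failures.append(name)
    return failures
```

An empty list from verify_manifest() means every listed file still matches
its recorded digest; anything else is worth a loud e-mail from the cron job.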

> The major role of Subversion in our lab is to ensure that data and
> programs haven't changed over time without registration, and to provide
> the ability to reproduce the original data.

... or, at least, to alert you very loudly when it's unable to do that.

>
> So I would be very glad if someone would help me implement the check.

If you have specific questions about FSFS internals, you can ask them on
this list.

As I said, though: in Subversion 1.7, the *working copy* will also rely
on SHA-1 being collision-free. Doesn't that mean for you that you
cannot use Subversion >=1.7 clients?

> I have already started investigating the Subversion source code
> for a way to implement this.
> Briefly, I think it would be a C function, called by
> rep_write_contents_close() only in the if (old_rep) case, that
> 1. finds the data of the old_rep in the repository
> 2. reconstructs its full text
> 3. gets/finds the full text of the file to be committed
> 4. compares them byte by byte
> 5. returns the result of the comparison as a boolean
>

6. if the comparison failed:
6a. refuse the commit
6b. tell the world you found a SHA-1 collision[1]
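
To make steps 1-6 concrete, here is a toy Python sketch of the check. It is
not FSFS code: RepCache and HashCollisionError are invented names, the
"repository" is an in-memory dict, and a collision is reported by raising an
exception rather than by returning a boolean as in step 5. The digest
function is a parameter only so that a collision can be simulated.

```python
import hashlib


class HashCollisionError(Exception):
    """Raised when two different contents share the same digest."""


class RepCache:
    """Toy in-memory stand-in for a rep-sharing cache (hypothetical)."""

    def __init__(self, digest=lambda data: hashlib.sha1(data).hexdigest()):
        self._digest = digest
        self._reps = {}  # digest -> full text of the stored representation

    def store(self, new_text):
        """Store `new_text`; return (key, shared).

        `shared` is True when an identical representation already existed
        and was reused.
        """
        key = self._digest(new_text)
        old_text = self._reps.get(key)  # steps 1-2: find the old full text
        if old_text is None:
            self._reps[key] = new_text  # no old rep: nothing to compare
            return key, False
        if old_text != new_text:        # steps 3-4: byte-wise comparison
            # steps 6a/6b: refuse the commit and report the collision
            raise HashCollisionError("digest %s collides" % key)
        return key, True
```

In a real implementation the expensive part is reconstructing the old full
text, which this sketch sidesteps by keeping every text in memory.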

> Greetings
>
> P.S. I am on weekend now, so please excuse that I will answer any
> e-mails on Monday.
>
>

[1] apparently, no SHA-1 collisions have been found to date. (see
#svn-dev log today)
Received on 2010-06-25 21:34:16 CEST

This is an archived mail posted to the Subversion Dev mailing list.