
Re: Antwort: Re: Re: dangerous implementation of rep-sharing cache for fsfs

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Fri, 25 Jun 2010 22:34:25 +0300 (Jerusalem Daylight Time)

michael.felke_at_evonik.com wrote on Fri, 25 Jun 2010 at 19:33 -0000:
> Hello,
> Martin got my point:
> >> It's not the probability which concerns me, it's what happens when
> >> a file collides. If I understood the current algorithm right the
> >> new file will be silently replaced by an unrelated one and there
> >> will be no error and no warning at all. If it's some kind of
> >> machine verifiable file like source code the next build in
> >> a different working copy will notice. But if it's something else
> >> like documents or images it can go unnoticed for a very long time.
> >> The work may be lost by then. <<
> The data checked in the repository is exactly like this!
> It's mostly data generated by measurements, produced once,
> normally never changed or regenerated and
> untouched after using it once or twice.
> But then, suddenly and unexpectedly, someone comes and wants to see the data
> again,
> in the worst case to check it because of a lawsuit.
> Then it's too late to realize the data is wrong and
> that the original has been silently dropped by the repository.

Then commit to the repository PGP signatures, or sha512's, or rot13's, or
base64's, or gzip's, of your data files, and set up a cron job to checkout
fresh working copies nightly and manually verify the integrity.
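
A minimal sketch of what such a nightly check could look like, assuming the
checksums are committed as a manifest in the usual `sha512sum` text layout
(`<hex digest>  <relative path>` per line). The names `sha512_of` and
`verify_manifest` and the manifest file itself are hypothetical; nothing here
is part of Subversion.

```python
import hashlib
from pathlib import Path
from typing import List


def sha512_of(path: Path) -> str:
    """Return the hex SHA-512 digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha512()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(root: Path, manifest: Path) -> List[str]:
    """Return the names of files under `root` that are missing or whose
    digest differs from the one recorded in `manifest` (assumed layout:
    '<hex digest>  <relative path>' per line)."""
    bad = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(None, 1)
        name = name.strip()
        target = root / name
        if not target.is_file() or sha512_of(target) != expected.lower():
            bad.append(name)
    return bad
```

A cron job could run this against a freshly checked-out working copy and send
mail whenever the returned list is non-empty.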

> The major role of Subversion in our lab is to ensure that data and
> programs haven't changed over time without registration, and to
> provide the ability to reproduce the original data.

... or, at least, to alert you very loudly when it's unable to do that.

> So I would be very glad if someone would help me implement the check.

If you have specific questions about FSFS internals, you can ask them on
this list.

As I said, though: in Subversion 1.7, the *working copy* will also rely
on SHA-1 being collision-free. Doesn't that mean for you that you
cannot use Subversion >=1.7 clients?

> I already started investigating the Subversion source code
> for a way to implement this.
> Briefly, I think it would be a C function called by rep_write_contents_close(),
> as an additional step taken only if (old_rep), that
> 1. finds the data of the old_rep in the repository
> 2. reconstructs the full text of it
> 3. gets/finds the full text of the file to be committed
> 4. compares them byte for byte
> 5. returns the result of the comparison as a boolean

6. if the comparison failed:
6a. refuse the commit
6b. tell the world you found a SHA-1 collision[1]
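
To illustrate steps 1-6 outside of FSFS, here is a toy sketch of a
hash-keyed store that compares full texts on a cache hit instead of trusting
the digest. It is not Subversion code: `RepCache`, `store` and
`HashCollisionError` are hypothetical names, the "repository" is an in-memory
dict, and the digest function is a parameter only so that a collision can be
simulated with a deliberately weak hash.

```python
import hashlib
from typing import Callable, Dict


class HashCollisionError(Exception):
    """Raised when two different full texts map to the same digest."""


def _sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


class RepCache:
    """Toy stand-in for a rep-sharing cache: digest -> stored full text."""

    def __init__(self, digest: Callable[[bytes], str] = _sha1_hex) -> None:
        self._digest = digest
        self._reps: Dict[str, bytes] = {}

    def store(self, fulltext: bytes) -> str:
        """Store `fulltext`, sharing an existing representation only after
        a byte-for-byte comparison confirms the contents are identical."""
        key = self._digest(fulltext)
        old_rep = self._reps.get(key)
        if old_rep is None:
            # No existing representation: write a new one.
            self._reps[key] = fulltext
        elif old_rep != fulltext:
            # Steps 4-6: same digest, different bytes. Refuse the write.
            raise HashCollisionError(f"digest {key} collides")
        return key
```

With the default SHA-1 digest, storing the same bytes twice returns the same
key; constructing `RepCache(digest=lambda b: str(len(b)))` and storing two
different texts of equal length raises `HashCollisionError`.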

> Greetings
> P.S. I'm away for the weekend now, so excuse that I won't answer any e-mails until Monday.

[1] apparently, no SHA-1 collisions have been found to date. (see
#svn-dev log today)
Received on 2010-06-25 21:34:16 CEST

This is an archived mail posted to the Subversion Dev mailing list.