michael.felke_at_evonik.com wrote on Fri, 25 Jun 2010 at 19:33 -0000:
> Hello,
>
> Martin got my point:
> >> It's not the probability which concerns me, it's what happens when
> >> a file collides. If I understood the current algorithm right the
> >> new file will be silently replaced by an unrelated one and there
> >> will be no error and no warning at all. If it's some kind of
> >> machine-verifiable file like source code, the next build in
> >> a different working copy will notice. But if it's something else
> >> like documents or images it can go unnoticed for a very long time.
> >> The work may be lost by then. <<
>
> The data checked in the repository is exactly like this!
> It's mostly data generated by measurements, produced once,
> normally never changed or regenerated and
> untouched after using it once or twice.
> But then, suddenly and unexpectedly, someone comes and wants to see the
> data again, in the worst case to check it because of a lawsuit.
> Then it's too late to realize the data is wrong and
> the original has been dropped silently by the repository.
>
Then commit to the repository PGP signatures, or sha512's, or rot13's, or
base64's, or gzip's, of your data files, and set up a cron job to check out
fresh working copies nightly and manually verify the integrity.
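
A minimal sketch of what such a nightly check could look like, assuming the
digests are committed alongside the data in a manifest with one
"<hex sha512>  <relative path>" entry per line. The manifest layout and the
names sha512_of() and verify_manifest() are assumptions made for
illustration; nothing here is part of Subversion.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path, root):
    """Return the relative paths that are missing or whose digest differs.

    Assumed manifest format (hypothetical): "<hex sha512>  <relative path>".
    """
    mismatches = []
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(None, 1)
        target = Path(root) / rel_path
        if not target.is_file() or sha512_of(target) != expected.lower():
            mismatches.append(rel_path)
    return mismatches
```

A cron job would check out a fresh working copy, call verify_manifest() on
it, and mail the result whenever the returned list is not empty.
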
> The major role of Subversion in our lab is to ensure that data and
> programs haven't changed over time without registration, and to preserve
> the ability to reproduce the original data.
... or, at least, to alert you very loudly when it's unable to do that.
>
> So I would be very glad if someone would help me implement the check.
If you have specific questions about FSFS internals, you can ask them on
this list.
As I said, though: in Subversion 1.7, the *working copy* will also rely
on SHA-1 being collision-free. Doesn't that mean for you that you
cannot use Subversion >=1.7 clients?
> I have already started investigating the Subversion source code
> for a way to implement this.
> Briefly, I think it would be a C function, called additionally by
> rep_write_contents_close() only if (old_rep), that
> 1. finds the data of the old_rep in the repository
> 2. reconstructs the full text of it
> 3. gets the full text of the file to be committed
> 4. compares the two byte for byte
> 5. returns the result of the comparison as a boolean
>
6. if the comparison failed:
6a. refuse the commit
6b. tell the world you found a SHA-1 collision[1]
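
The steps above can be sketched outside of FSFS as a toy content-addressed
store. This is not the Subversion implementation: RepStore,
HashCollisionError and the hash_fn parameter are hypothetical names, and the
in-memory dict only stands in for the repository. hash_fn exists so that a
collision can be simulated with a deliberately weak hash.

```python
import hashlib


class HashCollisionError(Exception):
    """Raised when different contents map to the same hash key."""


class RepStore:
    """Toy store that shares representations by hash (illustrative only)."""

    def __init__(self, hash_fn=None):
        # Default to SHA-1; a weaker hash can be injected to simulate collisions.
        self._hash = hash_fn or (lambda data: hashlib.sha1(data).hexdigest())
        self._reps = {}  # hash key -> full text (bytes)

    def write(self, data):
        """Store `data`; return (key, shared), where shared means it was reused."""
        key = self._hash(data)
        old_rep = self._reps.get(key)       # step 1: find the old rep, if any
        if old_rep is None:
            self._reps[key] = bytes(data)   # nothing to share: store as new
            return key, False
        # Steps 2-5: both full texts are available here; compare byte for byte.
        if old_rep != data:
            # Step 6: refuse the write and report the collision loudly.
            raise HashCollisionError("hash collision on key " + key)
        return key, True
```

In this sketch an identical second write reuses the stored representation,
while a write whose hash matches but whose content differs raises instead of
being silently replaced.
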
> Greetings
>
> P.S. I'm on weekend now, so please excuse that I will answer any e-mails on Monday.
>
>
[1] apparently, no SHA-1 collisions have been found to date. (see
#svn-dev log today)
Received on 2010-06-25 21:34:16 CEST