
Re: svn commit: r1515088 - in /subversion/branches/log-addressing/subversion: include/private/svn_subr_private.h include/svn_checksum.h libsvn_subr/checksum.c libsvn_subr/fnv1a.c libsvn_subr/fnv1a.h tests/libsvn_subr/checksum-test.c

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Tue, 20 Aug 2013 11:00:47 +0200

On Sun, Aug 18, 2013 at 2:27 PM, Branko Čibej <brane_at_wandisco.com> wrote:

> On 18.08.2013 14:25, Branko Čibej wrote:
> > On 18.08.2013 14:15, stefan2_at_apache.org wrote:
> >> Author: stefan2
> >> Date: Sun Aug 18 12:15:01 2013
> >> New Revision: 1515088
> >>
> >> URL: http://svn.apache.org/r1515088
> >> Log:
> >> On the log-addressing branch: For low-overhead checksumming,
> >> add standard FNV-1a and a faster modified version of that to
> >> the list of checksum kinds supported with svn_checksum_t.
> >>
> >> We will use this new checksum to secure parts of FSFS (and
> >> later FSX) that are not directly covered by MD5/SHA1 content
> >> checksums. That will help to localize corruptions much quicker
> >> and more accurately but it will not eliminate the need to run
> >> a full content verification.
> > If you're using this for detecting corruption, rather than key
> > distribution, why not instead use a 64-bit or even 32-bit CRC? It should
> > be much faster than any kind of multiply-with-prime hash.
>

CRC happens to be slower than even standard FNV-1a
(6 clk/byte vs. 4 clk/byte) on recent machines (less than 10 years old).

Since I want the low-level verification to run at linear-read
disk speeds of >1 GB/s, we need a checksum that costs 2 clk/byte
or less (at 1 GB/s, 2 clk/byte already takes about 2 GHz of a
single core). The interleaved fnv1a_32x4 variant gets down to
~1 clk/byte. CRC would require a similar optimization and would
still be at least 50% slower than the current code.
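
To illustrate why interleaving helps, here is a minimal sketch, not the
code from r1515088. Standard FNV-1a has one dependent multiply per byte,
so the multiplier latency bounds throughput. Splitting the input into
four independent lanes lets the CPU overlap those multiplies. The names
fnv1a_32_continue and fnv1a_32x4_sketch, and the way the four lanes are
folded into one value at the end, are assumptions made for this example;
the result is not meant to match the branch's fnv1a_32x4 output.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV1A_OFFSET_32 0x811c9dc5u  /* standard 32-bit FNV offset basis */
#define FNV1A_PRIME_32  0x01000193u  /* standard 32-bit FNV prime */

/* Continue a standard FNV-1a hash from state HASH over LEN bytes. */
static uint32_t
fnv1a_32_continue(uint32_t hash, const unsigned char *data, size_t len)
{
  size_t i;

  for (i = 0; i < len; ++i)
    hash = (hash ^ data[i]) * FNV1A_PRIME_32;

  return hash;
}

/* Standard FNV-1a, 32 bits: each step depends on the previous one. */
uint32_t
fnv1a_32(const void *data, size_t len)
{
  return fnv1a_32_continue(FNV1A_OFFSET_32, data, len);
}

/* Hypothetical 4-way interleaved variant.  Byte i of each 4-byte group
   goes to lane i, so the four multiplies per group are independent.
   The fold step below is an assumption of this sketch. */
uint32_t
fnv1a_32x4_sketch(const void *data, size_t len)
{
  const unsigned char *p = data;
  uint32_t lane[4] = { FNV1A_OFFSET_32, FNV1A_OFFSET_32,
                       FNV1A_OFFSET_32, FNV1A_OFFSET_32 };
  unsigned char folded[16];
  uint32_t hash;
  size_t i, k;

  /* Main loop: four independent FNV-1a steps per iteration. */
  for (i = 0; i + 4 <= len; i += 4)
    {
      lane[0] = (lane[0] ^ p[i])     * FNV1A_PRIME_32;
      lane[1] = (lane[1] ^ p[i + 1]) * FNV1A_PRIME_32;
      lane[2] = (lane[2] ^ p[i + 2]) * FNV1A_PRIME_32;
      lane[3] = (lane[3] ^ p[i + 3]) * FNV1A_PRIME_32;
    }

  /* Serialize the lane states big-endian so the result does not
     depend on host byte order. */
  for (k = 0; k < 4; ++k)
    {
      folded[4 * k]     = (unsigned char)(lane[k] >> 24);
      folded[4 * k + 1] = (unsigned char)(lane[k] >> 16);
      folded[4 * k + 2] = (unsigned char)(lane[k] >> 8);
      folded[4 * k + 3] = (unsigned char)(lane[k]);
    }

  /* Fold the 16 state bytes, then the 0..3 leftover input bytes,
     with plain FNV-1a. */
  hash = fnv1a_32_continue(FNV1A_OFFSET_32, folded, sizeof(folded));
  return fnv1a_32_continue(hash, p + i, len - i);
}
```

The interleaved function produces a different value than fnv1a_32 for
the same input, so the two have to be treated as distinct checksum kinds.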

> Or you could even use the Adler-32 implementation that we already use in
> the xdelta code.
>

Adler-32 is relatively weak, and we already use it implicitly
for our zlib-encoded data. Using a different and stronger
checksum should add more verification strength than
applying an existing one twice.

-- Stefan^2.
Received on 2013-08-20 11:01:23 CEST

This is an archived mail posted to the Subversion Dev mailing list.