Justin Erenkrantz writes:
> On 3/23/06, Peter N. Lundblad <peter@famlundblad.se> wrote:
> > Nope, but that's not in APR 0.9 anyway AFAICT.
>
> I would really like to avoid adding a custom data structure if we
> don't have to do so. If it means that only users with APR 1.0+ see
> the performance benefit, so be it as it means that the code stays much
> simpler. -- justin
>
OK, folks. I've been fooling around with this some more the last days.
I tried Justin's suggestion of a custom hash function and it gave less
than half the speed improvement.
Then, I discovered an interesting fact about the adler32 checksum
algorithm (*). In my initial tests, I used big .wav files, which
turns out to be pretty "random". I decided to try some more realistic
data and used uncompressed .tar files of source code (svn 1.2.0, 1.2.3
and 1.3.0). This shows a dramatic slowdown compared with the original
code.
The problem is that the adler32 checksum doesn't make use of all bits
for small block sizes. Some tests I did indicate that the 11th to
16th bit of the checksum are not well distributed and since my first
hash function typically used the 12 low-order bits, this didn't work
out very well.
I've cured this using the following formula:
h = sum ^ (sum >> 12)
to include some higher-order bits in the mess.
The end result is that my patch gives about 15% speed improvements for
some differnt kinds of test data and Justin's suggestion gives us 5%.
(To be fair, I changed my implementation of Justin's suggestion to use
the same hash function that I changed to, but that lead to an
improvement of less than 1%.)
I think the small amount of increased complexity is justified by the
performance win, so I'm going to commit this unless someone
objects.
Thanks,
//Peter
*) I don't claim to be the first who discover this fact. See RFC 3309
for an explanation.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Mar 27 22:08:08 2006