On 21.05.2012 12:38, Julian Foad wrote:
> Stefan Fuhrmann wrote:
>> Julian Foad wrote:
>>>> Introduce private API functions that wrap apr_hash_make_custom
>>>> and return hash tables that are 2 to 4 times faster than the
>>>> APR default.
>>> Would it be sensible to propose these (the hash-functions) for
>>> inclusion in APR itself?
>> Certainly. The question would be whether Apache is
>> meant to run on CPUs without a decent MUL.
> I don't understand why that question is relevant.
APR's implementation uses 33 as the multiplier, which
can conveniently be implemented as shift & add.
My code uses factors up to 33^4 where that
optimization / workaround would no longer be
useful. A non-pipelined MUL operation may take
as much as 40 ticks (i386) instead of 2 .. 6 ticks for
I don't know of any popular CPUs that have this
problem but OTOH, I don't know all exotic platforms /
embedded devices that Apache is being run on.
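
To illustrate the difference, here is a minimal sketch. It is not the
actual APR or Subversion code; the function names hash33_shift_add and
hash33_unrolled are made up for this example. The first variant multiplies
by 33 using only a shift and an add. The second processes four bytes per
step and needs real multiplications by 33^4, 33^3 and 33^2, which is where
a slow MUL would hurt. Both should return the same value for the same key.

```c
#include <stddef.h>
#include <stdint.h>

/* "Times 33" hash, one byte per step. The multiplication by 33 is
 * written as shift & add: h * 33 == (h << 5) + h. */
static uint32_t hash33_shift_add(const unsigned char *key, size_t len)
{
    uint32_t h = 0;
    size_t i;

    for (i = 0; i < len; i++)
        h = (h << 5) + h + key[i];
    return h;
}

/* Same hash value, but four bytes per step. Expanding four rounds of
 * h = h * 33 + k gives factors up to 33^4, which no longer reduce to
 * a single shift & add. */
static uint32_t hash33_unrolled(const unsigned char *key, size_t len)
{
    uint32_t h = 0;
    size_t i = 0;

    for (; i + 4 <= len; i += 4)
        h = h * (33u * 33u * 33u * 33u)
          + key[i]     * (33u * 33u * 33u)
          + key[i + 1] * (33u * 33u)
          + key[i + 2] * 33u
          + key[i + 3];

    /* Remaining 0..3 bytes, one at a time. */
    for (; i < len; i++)
        h = h * 33u + key[i];
    return h;
}
```

The four multiplications inside the unrolled loop do not depend on each
other, so a pipelined multiplier can overlap them; that is why throughput
matters more than latency here.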
>>>> Modified: subversion/trunk/subversion/libsvn_subr/hash.c
>>>> --- subversion/trunk/subversion/libsvn_subr/hash.c (original)
>>>> +++ subversion/trunk/subversion/libsvn_subr/hash.c Thu May 3 07:16:11 2012
>>>> +/*** Optimized hash functions ***/
>>>> +/* Optimized version of apr_hashfunc_default. It assumes that the CPU has
>>>> + * 32-bit multiplications with high throughput of at least 1 operation
>>>> + * every 3 cycles. Latency is not an issue. Another optimization is a
>>>> + * mildly unrolled main loop.
>>> Such specific details should at least refer to a specific version
>>> of apr_hashfunc_default(). Perhaps also (for the "1 op per 3 cycles"
>>> part, in particular) a specific system architecture or compiler.
>> r1340601 explains why that is a reasonable assumption.
> What's missing is a statement that this is an optimized version of
> "apr_hashfunc_default in APR 1.4.5".
Added the version info in r1341271.
Received on 2012-05-22 23:01:48 CEST