[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Fwd: hash in subversions vdelta.c

From: Branko Čibej <brane_at_xbc.nu>
Date: 2002-08-22 11:10:33 CEST

Mikkel Fahnøe Jørgensen wrote:

>>BTW, did you do any performance comparisons between
>>the current code and
>>the mix hash? I'd be very interested in seeing some
>>numbers. (Our
>>randm-test.c could probably be used for that.)
>>
>>
>
>No. A long time ago I replaced a hash not unlike yours
>with the hash in the Compiler Dragon book for use in a
>symbol table. It dramatically improved performance due
>to reduced conflicts. Yet the classic Dragon book hash
>is not considered to be very efficient at avoiding
>conflicts. This is for variable length strings and
>your case is for 32bit values, so it cannot be
>compared directly. It is cheap to test conflicts for
>example.
>
Well, the hash table in vdelta.c is a bit special in that it expects
conflicts, because it can hold duplicate keys. In fact, without that
feature, the VDelta algorithm wouldn't work at all. :-)

>If you are doing a modulo prime on the hash result,
>
We aren't doung a modulo prime. We're doing modulo the table size to
convert the hash value into an index, that's all.

>the mix hash could well turn out to be faster because
>it only requires an And operation to clamp the hash
>key into range. Otherwise it clearly has more
>instructions than unrolling 4 shift/add. My best guess
>is that the mix hash will at the latest perform better
>when conflicts become significant. However, depending
>on the avarage shared run in the data, the fact that
>only 4 chars are being hashed may generate the bulk of
>conflicts, independently of the hash function in use.
>
>If VDelta is using a 64K window (or whatever) you get
>an efficient upper limit on the hashtable which
>affects the choice of indexing method. In particular
>you can oversize a hash table to reduce conflicts and
>you do not need to grow the hash nor risc conflicts
>with virtual memory.
>
>
Take a look at the hash table implementation, and the commentary at the
top of vdelta.c that describes it. It's not a general-purpose table, as
you seem to expect.

-- 
Brane Čibej   <brane_at_xbc.nu>   http://www.xbc.nu/brane/
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu Aug 22 11:11:11 2002

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.