xavier roche's homework

random thoughts and sometimes pieces of code

Coucal, Cuckoo-hashing-based hashtable with stash area C library

Greater Coucal Centropus sinensis

Coucal ?

According to Wikipedia,

A coucal is one of about 30 species of birds in the cuckoo family. All of them belong in the subfamily Centropodinae and the genus Centropus. Unlike many Old World cuckoos, coucals are not brood parasites.

If you read my previous blog entry on HTTrack’s new hashtables, you probably already know that cuckoo hashing is a rather new (2001) hashing technique to be used for efficient hashtables.

The idea is to have a smarter open-addressing strategy based on kicking an existing entry if a collision occurs. This is why the name Cuckoo hashtable was chosen.

Note that the name is a bit misleading: a typical cuckoo will not only kick an existing foe, but will consequently kill it (nature, you scary!). Here, we only relocate it, reason why I choose the Coucal, a nicer cuckoo member who does not destroy existing entries - I mean, existing eggs!

This implementation has the following features:

  • guaranteed constant time (Θ(1)) lookup
  • guaranteed constant time (Θ(1)) delete or replace
  • average constant time (O(1)) insert
  • one large memory chunk for table (and one for the key pool)
  • simple enumeration

This implementation has been tested since July 2013 in HTTrack, and so far, no bugs were reported (the only issues spotted were bugs in upstream code that was corrupting existing keys)

Please have a look at this library implementation: https://github.com/xroche/coucal

TL;DR: cuckoo hashtables are both efficient and smart!

Comments