# Cuckoo Hashing

### From LiteratePrograms

## Contents |

# Cuckoo Hashing

Cuckoo hashing is a great way of making a hash table without resorting to chaining to resolve collisions. The basic idea is to use two or more hash functions for every key. This means that for every thing in the table there are two or more places in the table it could be.

Given here is a simple implementation of Cuckoo hashing using the Haskell programming language.

<<CuckooHashing>>=moduleMainwhereimportData.ArrayimportData.ListimportSystem.Random

## Types, Eggs and Tables

Every thing in our hash table will store a value and a list of hash-functions used to place it in the table. For simplicity we're only going to insert whole numbers into the hash table. We'll call these things eggs.

<<CuckooHashing>>=typeValue = InttypeIndex = InttypeHashFunction =(Value -> Index)typeEgg =(Value,[HashFunction])

The Cuckoo hashtable will store a lot of these eggs. We'll use an array to make the table and we'll say that every location may (or may not) contain an egg.

<<CuckooHashing>>=typeCuckooTable = Array Index(Maybe Egg)

To make our empty table we make a big array which contains Nothing.

<<CuckooHashing>>=createCuckooTable :: Int -> CuckooTable createCuckooTable i = listArray(1, i)$ repeat Nothing

To make an egg we take a value and a list of hash functions. We repeat the hash functions in an endless cycle so we know the order to use them in.

<<CuckooHashing>>=createEgg :: Value ->[HashFunction]-> Egg createEgg v hs =(v, cycle hs)

## Adding Eggs

When we insert an egg into our table we'll take the first hash function from the list inside the egg, and use it to find the place the egg belongs.

<<CuckooHashing>>=value :: Egg -> Value value = fst hashEgg :: Egg ->(Index, Egg)hashEgg egg@(v, h:hs)=(h v,(v,hs))

If the place the egg should go is empty then we'll place the egg there. If it isn't empty then we'll kick-out the egg currently in its place and place the new-egg there instead. We'll try and place the kicked-out egg using its next hash function and repeat this until either we find an empty slot, or we reach a limit on the number of kick-outs we allow ourselves to do.

If we go over the number of kick-outs we are allowed we will have an error.

<<CuckooHashing>>=typeError = String addEgg :: CuckooTable -> Egg -> Either Error CuckooTable addEgg = addEgg' kickOutsAllowed

We will make the kick-outs allowed a fixed number to make it simpler.

<<CuckooHashing>>=wherekickOutsAllowed = 20

<<CuckooHashing>>=addEgg' :: Int -> CuckooTable -> Egg -> Either Error CuckooTable addEgg' 0 _ _ = Left "We have kicked-out too many eggs" addEgg' kickouts table egg =let(index, egg')= hashEgg egg place = table ! indexincaseplaceof

In the case the place is empty we'll just put the egg there.

<<CuckooHashing>>=Nothing -> Right $ table //[(index, Just egg')]

If there is another egg there and it has the same value as the egg we are trying to insert we can ignore it.

<<CuckooHashing>>=Just otherEgg ->ifvalue egg == value otherEggthenRight table

If it isn't the same egg then we kick-out the other egg, and put our new egg in its place. We can try and add the other egg again, if our kickout limit isn't too low.

<<CuckooHashing>>=elselettable' = table //[(index, Just egg')]inaddEgg'(kickouts-1)table' otherEgg

## Testing the Table

To show how well the Cuckoo hashtable works we will try add random numbers to the table and how many eggs we can add before we get hit our kick-out limit.

<<CuckooHashing>>=testTilFailure ::[Egg]-> CuckooTable -> Integer -> Integer testTilFailure(e:es)t n =case(addEgg t e)ofLeft err -> n Right t' -> testTilFailure es t'(1+n)testCuckooTable ::(RandomGen g)=>[HashFunction]-> g -> Integer testCuckooTable hashes gen =leteggs = map(flip createEgg $ hashes)$ randoms genintestTilFailure eggs(createCuckooTable tableSize)0

To test the table we'll use a size of 8243: a prime number. We'll also make three hash functions for the testing.

<<CuckooHashing>>=tableSize = 8243 hash1 :: HashFunction hash1 x = 1+((58193 + 1258207*y + 2808193*y^2)`mod` 308383)`mod` tableSizewherey = abs x hash2 :: HashFunction hash2 x = 1+((711929 + 1810057*y + 9767*y^2)`mod` 608459)`mod` tableSizewherey = abs x hash3 :: HashFunction hash3 x = 1+((459257 + 808481*y + 2958259*y^2)`mod` 1358333)`mod` tableSizewherey = abs x

We can run our tests like this. We'll repeat them a million times and take the average so we get good data.

<<CuckooHashing>>=testNumber = 1000000 average ::[Integer]-> Integer average ns = sum `div` lengthwheresum = foldl1'(+)ns length = foldl'(\n _ -> 1+n)0 ns main :: IO()main =dogen <- newStdGenlettest = replicate testNumber $ testCuckooTable[hash1,hash2,hash3]gen print $ average test

## Results

- Using three hash functions: 6076 insertions.
- Using two hash functions: 3993 insertions.
- Using one hash function: 95 insertions.

Download code |