Aps Collision Handling Schemes

Post on 22-Oct-2014


CS 225 Lab #10 – Hash Tables

Hash Tables

A Hash Table (or Dictionary) is a data structure

designed for O(1) average-case add, remove,

and find operations when a search key is

known

If many keys “collide” (hash to the same index),

the worst-case running times of all these operations

may degenerate to O(n)

Made up of an array of key-value pairs, a

hash function, and a collision-handling scheme


http://research.cs.vt.edu/AVresearch/hashing/openhash.php

Hash Function

Used to map the search key to the index of a slot in

the table where the corresponding record is

supposedly stored.

Mapping may consist of 2 conversions

map input type to integer

input type could be any primitive or user defined type

can always map binary data to an integer

map integer to valid table index

someInteger % tableSize
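The second conversion can be sketched in C++ as follows. The name toIndex is hypothetical (it is not from the lab handout). The integer is converted to an unsigned type first because % applied to a negative int in C++ gives a negative result, which is not a valid index.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the lab handout): maps any integer to a
// valid table index in the range [0, tableSize - 1].
std::size_t toIndex(int someInteger, std::size_t tableSize) {
    assert(tableSize > 0);  // an empty table has no valid index
    // Convert to unsigned first: in C++, -7 % 5 evaluates to -2, which
    // could not be used as an array index.
    return static_cast<unsigned int>(someInteger) % tableSize;
}
```

For example, toIndex(314, 10) returns 4.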

Hash Function (2)

Hash Function – maps each key k in our hash table to

an index, an integer in the range [0,N-1], where N is

the size of the hash table.

For this assignment, we will be using the following

hash functions:

Summing Components

Cyclic Shift

Polynomial Hash

Summing Components

This function maps strings to integers

Algorithm

sum the ASCII values of the characters of a string

Example:

hash(“dog”) = 'd' + 'o' + 'g' = 100 + 111 + 103 = 314

hash(“god”) = 'g' + 'o' + 'd' = 103 + 111 + 100 = 314

Regardless of the table size, these two keys will collide

using this hash function!
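A minimal sketch of this algorithm, assuming the key is a std::string. The name sumHash is chosen here for illustration only.

```cpp
#include <string>

// Sketch of the summing-components hash: adds up the character codes.
// The name sumHash is an assumption for illustration.
unsigned int sumHash(const std::string& key) {
    unsigned int h = 0;
    for (unsigned char c : key) {
        h += c;  // addition is commutative, so character order is lost
    }
    return h;
}
```

With this sketch, sumHash("dog") and sumHash("god") both return 314, so the two keys map to the same index for every table size.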

Cyclic Shift

This function maps strings to integers

Algorithm

same as summing components, but perform a 5-bit cyclic

shift on the running sum before adding each character's

ASCII value

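// Assumes a 32-bit unsigned int and a separate hash(int) overload
// (not shown here) that maps the integer to a table index.
int hash(string const & key) {
    unsigned int h = 0;
    for (size_t i = 0; i < key.size(); ++i) {
        h = (h << 5) | (h >> 27);      // 5-bit cyclic (left) shift
        h += (unsigned int) key[i];    // add the character's ASCII value
    }
    return hash((int) h);
}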
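To see that the shift makes the hash depend on character order, here is a self-contained variant that can be compiled on its own. It is a sketch, not the lab's code: the name cyclicShiftHash is hypothetical, std::uint32_t makes the 32-bit assumption explicit, and the raw value is returned without reducing it to a table index.

```cpp
#include <cstdint>
#include <string>

// Self-contained variant of the cyclic-shift hash. It returns the raw
// 32-bit value instead of reducing it to a table index; the name
// cyclicShiftHash is hypothetical.
std::uint32_t cyclicShiftHash(const std::string& key) {
    std::uint32_t h = 0;
    for (unsigned char c : key) {
        h = (h << 5) | (h >> 27);  // rotate left by 5 bits
        h += c;                    // then add the character code
    }
    return h;
}
```

Under this sketch "dog" hashes to 106055 and "god" to 109124, so the two anagrams no longer share a hash value (they can still collide after the % tableSize step).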

Polynomial Hash

This function maps strings to integers

Algorithm:

a polynomial in some constant a (a != 0 and a != 1) whose coefficients are the components (x[0],x[1],...,x[k-1]) of the key. This can be represented mathematically as:

h(x) = x[0]*a^(k-1) + x[1]*a^(k-2) + ... + x[k-2]*a + x[k-1]

where k is the length of x

Example:

Let a = 2. hash(“man”) = 'm'*a^(2) + 'a'*a + 'n'

= 109*4 + 97*2 + 110 = 436 + 194 + 110 = 740
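The polynomial can be evaluated without computing powers of a by using Horner's rule, which is one possible way to implement it (not necessarily the lab's). In this sketch the name polyHash and the default a = 2 are assumptions.

```cpp
#include <string>

// Sketch of the polynomial hash evaluated with Horner's rule:
// h = (...((x[0]*a + x[1])*a + x[2])...)*a + x[k-1].
// The name polyHash and the default a = 2 are assumptions; unsigned
// arithmetic wraps around on overflow for long keys.
unsigned int polyHash(const std::string& key, unsigned int a = 2) {
    unsigned int h = 0;
    for (unsigned char c : key) {
        h = h * a + c;
    }
    return h;
}
```

With a = 2, polyHash("man") returns 740, matching the example above.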

Collision-Handling Schemes

Collision – when two keys hash to the same table

index

Collision-Handling Scheme – a technique that allows

multiple entries whose keys hash to the same index

to exist in the table at the same time

For this lab we'll describe two simple schemes:

Separate Chaining

Linear Probing

Separate Chaining

Separate Chaining – a

simple collision-handling

scheme where each table

cell contains a linked list

of entries.

When a collision occurs,

the new entry can be

added to the linked list.

http://www.isr.umd.edu/~austin/ence200.d/java-examples.html
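A minimal sketch of separate chaining with string keys and int values. ChainedTable and its member names are hypothetical, the table size is assumed to be greater than zero, and the summing-components hash is used only to keep the example short.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Minimal separate-chaining sketch with string keys and int values.
// ChainedTable and its members are hypothetical names.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t tableSize) : cells(tableSize) {}

    // Adds the pair, or overwrites the value if the key is already stored.
    void insert(const std::string& key, int value) {
        for (auto& entry : cells[index(key)]) {
            if (entry.first == key) { entry.second = value; return; }
        }
        cells[index(key)].push_back({key, value});  // collision: the list grows
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        for (const auto& entry : cells[index(key)]) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

private:
    // Summing-components hash reduced to a table index.
    std::size_t index(const std::string& key) const {
        unsigned int h = 0;
        for (unsigned char c : key) h += c;
        return h % cells.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> cells;
};
```

Inserting "dog" and "god" into the same table places both entries in one cell's list, and find still returns the right value for each.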

Linear Probing

Linear Probing – simple collision-handling scheme

where each table cell contains only one element, but

collisions are handled by performing a linear search

for the next empty cell.

Each step of this search is called a probe, beginning

with the 0th probe.

Linear Probing (2)

http://codeidol.com/java/javagenerics/Maps/Implementing-Map/

Linear Probing (3)

The ith probe with key x is defined by the following function:

H(x,i) = (h(x)+f(i)) % tableSize

where

h(x) is the hashing function

f(i) is the probing function

The probing function in Linear Probing is a linear function:

f(i) = c * i + d

for some positive integer constants c and d
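The probe function can be sketched as follows for a table that stores string keys only. ProbingTable and its member names are hypothetical, c = 1 and d = 1 are arbitrary choices of positive constants, h(x) is the summing-components hash, and removal is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Minimal linear-probing sketch that stores string keys only.
// ProbingTable and its members are hypothetical names.
class ProbingTable {
public:
    explicit ProbingTable(std::size_t tableSize)
        : keys(tableSize), used(tableSize, false) {}

    // Returns false if no empty cell is found within tableSize probes.
    bool insert(const std::string& key) {
        for (std::size_t i = 0; i < keys.size(); ++i) {
            std::size_t slot = probe(key, i);
            if (used[slot] && keys[slot] == key) return true;  // already stored
            if (!used[slot]) {
                keys[slot] = key;
                used[slot] = true;
                return true;
            }
        }
        return false;
    }

    bool contains(const std::string& key) const {
        for (std::size_t i = 0; i < keys.size(); ++i) {
            std::size_t slot = probe(key, i);
            if (!used[slot]) return false;  // an empty cell ends the search
            if (keys[slot] == key) return true;
        }
        return false;
    }

    // H(x, i) = (h(x) + f(i)) % tableSize with f(i) = c * i + d.
    std::size_t probe(const std::string& key, std::size_t i) const {
        const std::size_t c = 1, d = 1;  // assumed constants
        return (h(key) + c * i + d) % keys.size();
    }

private:
    // Summing-components hash, used here as h(x).
    static std::size_t h(const std::string& key) {
        std::size_t sum = 0;
        for (unsigned char ch : key) sum += ch;
        return sum;
    }

    std::vector<std::string> keys;
    std::vector<bool> used;
};
```

With a table of size 8, the 0th probe for both "dog" and "god" is cell 3, so whichever key is inserted second lands in cell 4 on its 1st probe.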

Hash Table Exercise

For the lab assignment you'll be

implementing parts of separate chaining and linear

probing collision-handling schemes

comparing the running times of different combinations of

hash functions and collision handling schemes

use the 'time' command to measure running times