Date post: | 22-Oct-2014 |
Upload: | pradeep-shockerzz |
CS 225 Lab #10 – Hash Tables
Hash Tables
A Hash Table (or Dictionary) is a data structure
designed for O(1) average-case add, remove,
and find operations when the search key is
known
If many keys “collide” (hash to the same value),
the running times of these operations may
degenerate to O(n)
Made up of an array of key-value pairs, a
hash function, and a collision-handling scheme
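As a rough illustration of the add, find, and remove operations, the sketch below uses the C++ standard library's std::unordered_map, which is itself a hash table. The function name demoDictionary and the sample keys are made up for this example and are not part of the lab code.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical demo: exercises add, find, and remove on a standard hash table.
// Returns true if every operation behaved as expected.
bool demoDictionary() {
    std::unordered_map<std::string, int> table;

    table["dog"] = 1;                                    // add
    table["cat"] = 2;                                    // add

    bool foundDog = table.find("dog") != table.end();    // find (key present)
    table.erase("dog");                                  // remove
    bool dogGone = table.find("dog") == table.end();     // find (key absent)

    return foundDog && dogGone && table.size() == 1;
}
```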
http://research.cs.vt.edu/AVresearch/hashing/openhash.php
Hash Function
Used to map the search key to the index of a slot in
the table where the corresponding record is
expected to be stored.
Mapping may consist of 2 conversions
map input type to integer
input type could be any primitive or user defined type
can always map binary data to an integer
map integer to valid table index
someInteger % tableSize
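The two conversions can be sketched as two small functions. The Point type and the names keyToInteger and integerToIndex are hypothetical, chosen only for this illustration, and the way the fields are mixed is an arbitrary choice rather than anything the lab prescribes.

```cpp
#include <cstddef>

// Hypothetical user-defined key type, used only for this illustration.
struct Point {
    int x;
    int y;
};

// Conversion 1: map the key to an integer.
// Mixing the fields with a small multiplier is an arbitrary choice for this sketch.
unsigned int keyToInteger(Point const & p) {
    return 31u * static_cast<unsigned int>(p.x) + static_cast<unsigned int>(p.y);
}

// Conversion 2: map the integer to a valid table index in [0, tableSize - 1].
// Assumes tableSize > 0.
std::size_t integerToIndex(unsigned int code, std::size_t tableSize) {
    return code % tableSize;
}
```

The intermediate value is kept unsigned on purpose: in C++ the % operator gives a negative result when its left operand is a negative int, which would not be a valid index.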
Hash Function (2)
Hash Function – maps each key k in our hash table to
an index, an integer in the range [0,N-1], where N is
the size of the hash table.
For this assignment, we will be using the following
hash functions:
Summing Components
Cyclic Shift
Polynomial Hash
Summing Components
This function maps strings to integers
Algorithm
sum the ASCII values of the characters of the string
Example:
hash(“dog”) = 'd' + 'o' + 'g' = 100 + 111 + 103 = 314
hash(“god”) = 'g' + 'o' + 'd' = 103 + 111 + 100 = 314
Regardless of the table size, these two keys will collide
using this hash function!
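A minimal sketch of summing components, assuming the key is a std::string. The name sumHash is chosen for this example; the lab's own function may be named and typed differently.

```cpp
#include <string>

// Summing components: add up the character codes of the key.
unsigned int sumHash(std::string const & key) {
    unsigned int h = 0;
    for (unsigned char c : key) {
        h += c;
    }
    return h;
}
```

Both sumHash("dog") and sumHash("god") evaluate to 314, so the two keys land in the same cell for every table size.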
Cyclic Shift
This function maps strings to integers
Algorithm
same as summing components, but perform a 5-bit cyclic
shift on the running sum before adding each character's ASCII
value
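#include <cstddef>
#include <string>
int hash(int key);  // integer-to-index overload, assumed to be defined elsewhere in the lab code
int hash(std::string const & key) {
    unsigned int h = 0;
    for (std::size_t i = 0; i < key.size(); ++i) {
        h = (h << 5) | (h >> 27);     // 5-bit cyclic shift (assumes 32-bit unsigned int)
        h += (unsigned int) key[i];   // add the character's ASCII value
    }
    return hash((int) h);             // reduce the integer to a table index
}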
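To see why the shift helps, the sketch below computes only the raw integer, before it is reduced to a table index. The name cyclicShiftCode is hypothetical, and the code assumes a 32-bit unsigned int, as the 5/27 shift pair implies.

```cpp
#include <string>

// Raw cyclic-shift code, before it is reduced to a table index.
// Assumes unsigned int is 32 bits wide.
unsigned int cyclicShiftCode(std::string const & key) {
    unsigned int h = 0;
    for (unsigned char c : key) {
        h = (h << 5) | (h >> 27);   // rotate left by 5 bits
        h += c;                     // then add the character code
    }
    return h;
}
```

Because each character is shifted a different number of times, the order of the characters matters: cyclicShiftCode("dog") is 106055 while cyclicShiftCode("god") is 109124, so these two keys no longer produce the same integer.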
Polynomial Hash
This function maps strings to integers
Algorithm:
evaluate a polynomial in some constant a (a != 0 and a != 1) whose coefficients are the components (x[0],x[1],...,x[k-1]) of the key x. This can be represented mathematically as:
h(x) = x[0]*a^(k-1) + x[1]*a^(k-2) + ... + x[k-2]*a + x[k-1]
where k is the length of x
Example:
Let a = 2. hash(“man”) = 'm'*a^(2) + 'a'*a + 'n'
= 109*4 + 97*2 + 110 = 436 + 194 + 110 = 740
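The polynomial can be evaluated without computing any powers by using Horner's rule. The sketch below is one possible way to write it; the name polyHash and the choice to pass a as a parameter are assumptions made for this example.

```cpp
#include <string>

// Polynomial hash evaluated with Horner's rule:
// ((x[0]*a + x[1])*a + x[2])*a + ... + x[k-1]
// Unsigned arithmetic wraps around on overflow for long keys.
unsigned int polyHash(std::string const & key, unsigned int a) {
    unsigned int h = 0;
    for (unsigned char c : key) {
        h = h * a + c;
    }
    return h;
}
```

With a = 2, polyHash("man", 2) steps through 109, then 109*2 + 97 = 315, then 315*2 + 110 = 740, matching the example above.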
Collision-Handling Schemes
Collision – when two keys hash to the same table
index
Collision-Handling Scheme – a technique that allows
multiple entries whose keys hash to the same index
to coexist in the table
For this lab we'll describe two simple schemes:
Separate Chaining
Linear Probing
Separate Chaining
Separate Chaining – a
simple collision-handling
scheme where each table
cell contains a linked list
of entries.
When a collision occurs,
the new entry can be
added to the linked list.
http://www.isr.umd.edu/~austin/ence200.d/java-examples.html
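A minimal sketch of separate chaining, assuming string keys and int values. The class name ChainedTable and its interface are hypothetical and are not the lab's actual code; the summing-components hash is used deliberately so that collisions are easy to trigger.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical separate-chaining table mapping strings to ints.
// Assumes tableSize > 0.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t tableSize) : cells(tableSize) {}

    // Add a new entry, or overwrite the value if the key is already present.
    void insert(std::string const & key, int value) {
        std::list<std::pair<std::string, int>> & chain = cells[indexOf(key)];
        for (auto & entry : chain) {
            if (entry.first == key) { entry.second = value; return; }
        }
        chain.push_back(std::make_pair(key, value));   // empty cell or collision: append
    }

    // Return a pointer to the stored value, or nullptr if the key is absent.
    int const * find(std::string const & key) const {
        for (auto const & entry : cells[indexOf(key)]) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

private:
    // Summing-components hash reduced to a table index (so "dog" and "god" collide).
    std::size_t indexOf(std::string const & key) const {
        unsigned int h = 0;
        for (unsigned char c : key) h += c;
        return h % cells.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> cells;
};
```

Inserting "dog" and then "god" puts both entries in the same cell's list, and find walks that list comparing keys until it reaches the right one.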
Linear Probing
Linear Probing – a simple collision-handling scheme
where each table cell contains only one element, and
collisions are handled by performing a linear search
for the next empty cell.
Each step of this search is called a probe, beginning
with the 0th probe.
Linear Probing (2)
http://codeidol.com/java/javagenerics/Maps/Implementing-Map/
Linear Probing (3)
The ith probe with key x is defined by the following function:
H(x,i) = (h(x)+f(i)) % tableSize
where
h(x) is the hashing function
f(i) is the probing function
The probing function in Linear Probing is a linear function:
f(i) = c * i + d
for some constant integers c > 0 and d >= 0; the simplest choice, c = 1 and d = 0, gives f(i) = i
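The probe formula can be sketched as a small table of strings. This is an illustration under stated assumptions, not the lab's implementation: the class name ProbingTable is hypothetical, the probing function is fixed to f(i) = i (c = 1, d = 0), h(x) is the summing-components hash, and removal is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical linear-probing table of strings with f(i) = i.
// Assumes tableSize > 0.
class ProbingTable {
public:
    explicit ProbingTable(std::size_t tableSize)
        : keys(tableSize), used(tableSize, false) {}

    // i-th probe position: H(x, i) = (h(x) + f(i)) % tableSize
    std::size_t probe(std::string const & key, std::size_t i) const {
        return (hashCode(key) + i) % keys.size();
    }

    // Store the key in the first empty cell along its probe sequence.
    // Returns false if the table is full.
    bool insert(std::string const & key) {
        for (std::size_t i = 0; i < keys.size(); ++i) {
            std::size_t idx = probe(key, i);
            if (used[idx] && keys[idx] == key) return true;   // already stored
            if (!used[idx]) { keys[idx] = key; used[idx] = true; return true; }
        }
        return false;
    }

    // Follow the same probe sequence until the key or an empty cell is found.
    bool contains(std::string const & key) const {
        for (std::size_t i = 0; i < keys.size(); ++i) {
            std::size_t idx = probe(key, i);
            if (!used[idx]) return false;
            if (keys[idx] == key) return true;
        }
        return false;
    }

private:
    // Summing-components hash, so "dog" and "god" start at the same cell.
    std::size_t hashCode(std::string const & key) const {
        std::size_t h = 0;
        for (unsigned char c : key) h += c;
        return h;
    }

    std::vector<std::string> keys;
    std::vector<bool> used;
};
```

With a table of size 7, both "dog" and "god" have h(x) = 314, so their 0th probe is cell 6; whichever key is inserted second moves on to its 1st probe, which wraps around to cell 0.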
Hash Table Exercise
For the lab assignment you'll be
implementing parts of separate chaining and linear
probing collision-handling schemes
comparing the running times of different combinations of
hash functions and collision handling schemes
using the 'time' command to measure running times