Design and Analysis of Algorithms

Hash Tables

Haidong Xue

Summer 2012, at GSU

Dictionary operations

• INSERT: very likely O(1), worst case Θ(1)
• DELETE: very likely O(1), worst case O(1)
• SEARCH: very likely O(1), worst case Θ(n)

“A hash table is an effective data structure for implementing dictionaries” (textbook, page 253)

Direct-address tables

Direct-addressing: use keys as addresses

Keys: 2, 3, 6, 1, 7, 5

Direct-address table: slots 1 to 10, with key k stored in slot k

• SEARCH(S, 6): O(1)
• INSERT(S, 4): O(1)
• DELETE(S, 7): O(1)

What’s the problem here?

Storage requirement = |U|, where U is the universe of keys

When the keys range over [1, 30000], the table needs 30000 slots, even if only a handful of keys are ever stored.
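A minimal Python sketch of the direct-address idea, assuming integer keys in a small known range. The class name `DirectAddressTable` and the parameter `universe_size` are hypothetical names chosen for illustration, not taken from the slides.

```python
class DirectAddressTable:
    """Direct-address table: key k is stored in slot k.

    Assumes keys are integers in [0, universe_size).
    """

    def __init__(self, universe_size):
        # One slot per possible key, so storage grows with |U|,
        # not with the number of keys actually stored.
        self.slots = [None] * universe_size

    def insert(self, key, value=True):
        self.slots[key] = value      # O(1): the key is the address

    def delete(self, key):
        self.slots[key] = None       # O(1)

    def search(self, key):
        return self.slots[key]       # O(1); None means "not present"


# Slots 0..10 cover the keys 1..10 used on the slide (slot 0 stays unused).
table = DirectAddressTable(universe_size=11)
for k in [2, 3, 6, 1, 7, 5]:
    table.insert(k)

table.insert(4)
table.delete(7)
found_6 = table.search(6) is not None
found_7 = table.search(7) is not None
```

With keys in [1, 30000] the same class would allocate 30001 slots up front, which is the storage problem the slide points at.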

Hash tables

• Can we have O(1) INSERT, DELETE and SEARCH with less storage? Yes!

Keys: 2, 3, 6, 1, 7, 5

Hash table: slots 0, 1, 2

Hash function: h(x) = x mod 3

h(2) = 2 mod 3 = 2

h(3) = 3 mod 3 = 0

h(6) = 6 mod 3 = 0

h(1) = 1 mod 3 = 1

h(7) = 7 mod 3 = 1

h(5) = 5 mod 3 = 2

Multiple elements in one slot: collision!
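The slot assignments for h(x) = x mod 3 can be reproduced with a short Python sketch. The names `h`, `slots` and `collisions` are illustrative choices.

```python
from collections import defaultdict


def h(x, m=3):
    """Hash function from the slide: h(x) = x mod 3."""
    return x % m


keys = [2, 3, 6, 1, 7, 5]

# Group keys by the slot they hash to.
slots = defaultdict(list)
for k in keys:
    slots[h(k)].append(k)

# Any slot holding more than one key is a collision.
collisions = {slot: ks for slot, ks in slots.items() if len(ks) > 1}
```

Six keys share three slots, so by the pigeonhole principle at least one slot must hold more than one key. Here all three slots do.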

Hash tables

A common method is to put colliding elements into a linked list, i.e. chaining.

Hash table:

Slot 0: 3 → 6
Slot 1: 1 → 7
Slot 2: 2 → 5

• SEARCH(S, 6): h(6) = 6 mod 3 = 0, so SEARCH in the slot-0 linked list: O(1) + 2 (2 is the length of the linked list)
• INSERT(S, 4): h(4) = 4 mod 3 = 1, so INSERT in the slot-1 linked list: O(1) + O(1) = O(1)
• DELETE(S, 7): h(7) = 7 mod 3 = 1, so DELETE in the slot-1 linked list: O(1) + O(1) = O(1)

What is the upper bound on the list length? What is the average length?
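A minimal sketch of chaining in Python. It makes two simplifying assumptions: Python lists stand in for the linked lists, and `delete` looks the key up itself, so it costs time proportional to the chain length rather than O(1). The class name `ChainedHashTable` is hypothetical.

```python
class ChainedHashTable:
    """Hash table with chaining, using h(k) = k mod m."""

    def __init__(self, m=3):
        self.m = m
        # One chain per slot; Python lists stand in for linked lists.
        self.chains = [[] for _ in range(m)]

    def _h(self, key):
        return key % self.m

    def insert(self, key):
        # O(1): compute the slot, then append to its chain.
        self.chains[self._h(key)].append(key)

    def search(self, key):
        # O(1) to find the slot, plus a scan of that slot's chain.
        return key in self.chains[self._h(key)]

    def delete(self, key):
        # Scans the chain for the key; removes it if present.
        chain = self.chains[self._h(key)]
        if key in chain:
            chain.remove(key)


S = ChainedHashTable(m=3)
for k in [2, 3, 6, 1, 7, 5]:
    S.insert(k)

found = S.search(6)   # scans the chain in slot 0
S.insert(4)           # goes to slot 1
S.delete(7)           # removed from slot 1
```

The cost of a search is dominated by the chain length, which is why the questions about the upper bound and the average length matter.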

Analysis of hash tables

Hash table: m slots, numbered 0, 1, 2, 3, 4, …, m-1, holding n elements in total

Load factor: α = n/m

Uniform hashing: “each key is equally likely to hash to any of the m slots”

Analysis of hash tables

Hash table: slots 0, 1, 2, 3, 4, …, m-1, each with a chain of average length α

With the assumption of uniform hashing:

Theorem 11.1, unsuccessful search: Θ(1 + α)

Theorem 11.2, successful search: Θ(1 + α)

α = n/m, so T(n) = Θ(1 + n/m)

If n = O(m), T(n) = Θ(1 + O(m)/m) = O(1)

How to get uniform hashing?
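To make the load factor concrete, the sketch below counts chain lengths for two hypothetical key sets under h(k) = k mod m. The helper names and the key sets are illustrative assumptions, not data from the slides.

```python
def chain_lengths(keys, m):
    """Number of keys landing in each of the m slots under h(k) = k mod m."""
    lengths = [0] * m
    for k in keys:
        lengths[k % m] += 1
    return lengths


def load_factor(n, m):
    """alpha = n / m."""
    return n / m


m = 10

# Hypothetical well-spread keys: 1, 2, ..., 30.
spread_keys = list(range(1, 31))
spread = chain_lengths(spread_keys, m)
alpha = load_factor(len(spread_keys), m)
average_length = sum(spread) / m

# Hypothetical worst case: every key is a multiple of 10.
clustered_keys = [10 * i for i in range(1, 31)]
clustered = chain_lengths(clustered_keys, m)
longest_chain = max(clustered)
```

The average over all slots is always n/m, whatever the hash function does. What uniform hashing adds is that the chain a searched key lands in has that expected length. Without it, a single chain can grow to length n, as the clustered keys show.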

Hash functions

How to get uniform hashing?

Uniform hashing: “each key is equally likely to hash to any of the m slots”

To achieve this goal, many hashing methods have been proposed:

• Division hashing
• Multiplication hashing
• Universal hashing

Hash functions – division hashing

• h(k) = k mod m, where k is the value of the key and m is the number of slots
• E.g.:
  – Final grades of all my students, with a hash table of 10 slots
  – Items in grocery stores, with a hash table of 10 slots. What if we still use 10 slots?
    • 99 cents, large soda
    • $1.99, ground beef
    • $6.99, lamb

What’s the problem here?

Hash functions – division hashing

• h(k) = k mod m
• Choose m as a prime number
• 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, …
• It is sometimes not very convenient to implement

What’s the problem here?

e.g.: 99 mod 7 = 1, 199 mod 7 = 3, 699 mod 7 = 6
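The grocery example can be checked directly. This sketch assumes prices are stored in cents as integer keys. `division_hash` and `prices_in_cents` are illustrative names.

```python
def division_hash(k, m):
    """Division hashing: h(k) = k mod m."""
    return k % m


# Prices from the slide, converted to cents (an assumption about the key format).
prices_in_cents = {"large soda": 99, "ground beef": 199, "lamb": 699}

# With m = 10, only the last digit of the price matters.
slots_m10 = {item: division_hash(p, 10) for item, p in prices_in_cents.items()}

# With the prime m = 7, the same keys spread over different slots.
slots_m7 = {item: division_hash(p, 7) for item, p in prices_in_cents.items()}
```

With 10 slots every price ending in 9 lands in slot 9, so the three items collide. With 7 slots they land in slots 1, 3 and 6.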

Hash functions – multiplication hashing

• h(k) = floor(m (kA mod 1)), where m is the number of slots and A is a constant in (0, 1)
• E.g.: A = 0.123, m = 10
  – 99 × 0.123 = 12.177, so h(99) = floor(10 × 0.177) = 1
  – 199 × 0.123 = 24.477, so h(199) = floor(10 × 0.477) = 4
  – 699 × 0.123 = 85.977, so h(699) = floor(10 × 0.977) = 9
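A short Python sketch of multiplication hashing that reproduces the three values above. It uses `fractions.Fraction` for A so that the fractional part is computed exactly. That is a choice made for this example, since a practical implementation would more likely use integer arithmetic or floats.

```python
from fractions import Fraction
from math import floor


def multiplication_hash(k, A, m):
    """Multiplication hashing: h(k) = floor(m * (k*A mod 1))."""
    fractional_part = (k * A) % 1     # keep only the part after the decimal point
    return floor(m * fractional_part)


A = Fraction(123, 1000)   # A = 0.123, represented exactly
m = 10

hashes = [multiplication_hash(k, A, m) for k in (99, 199, 699)]
```

Unlike division hashing with m = 10, the three prices now land in three different slots even though m is not prime.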

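Hash functions – universal hashing

• H is a set of hash functions
• At the beginning of each execution, randomly choose a hash function h from H
• Universal: Pr[h(k) = h(l)] ≤ 1/m, where k and l are distinct keys and m is the number of slots
• If k is not in the table, the expected length of the list that k hashes to is at most α (Theorem 11.3)
• If k is in the table, the expected length of the list that k hashes to is at most 1 + α (Theorem 11.3)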
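The slides do not name a concrete universal family. As an assumption, the sketch below uses one standard construction, h(k) = ((a·k + b) mod p) mod m with p a prime larger than every key, 1 ≤ a < p and 0 ≤ b < p. The function names and the small parameters used for the exhaustive check are hypothetical.

```python
import random


def make_hash(a, b, p, m):
    """One member of the family: h(k) = ((a*k + b) mod p) mod m."""
    def h(k):
        return ((a * k + b) % p) % m
    return h


def random_universal_hash(m, p, rng=random):
    """Pick a and b at random, i.e. choose one function from the family."""
    a = rng.randrange(1, p)   # a in {1, ..., p-1}
    b = rng.randrange(0, p)   # b in {0, ..., p-1}
    return make_hash(a, b, p, m)


# Exhaustive check on small, hypothetical parameters: p = 17 (prime), m = 5.
p, m = 17, 5
k, l = 3, 11                  # two distinct keys, both smaller than p
family = [make_hash(a, b, p, m) for a in range(1, p) for b in range(p)]
colliding = sum(1 for h in family if h(k) == h(l))

# Typical use: draw one function at the start of an execution.
h = random_universal_hash(m=10, p=10007)   # 10007 is prime and exceeds the keys below
slot = h(699)
```

For the pair checked here, the number of functions in the family that collide stays within |H|/m, which is the universal property in counting form.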

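Another method to deal with collisions: Open Addressing

• No linked list
• Hash functions include the probe number i: h(k, i)
  • Linear probing: h(k, i) = (h'(k) + i) mod m
  • Quadratic probing: h(k, i) = (h'(k) + c1·i + c2·i²) mod m
  • Double hashing: h(k, i) = (h1(k) + i·h2(k)) mod m
• When h(k, i) does not work, use h(k, i+1)

Expected number of probes for an unsuccessful search is at most 1/(1 − α)

Expected number of probes for a successful search is at most (1/α) ln(1/(1 − α))

Open addressing: a table with slots 0 to 9, with keys stored directly in the slots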
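For a feel of how the probe counts grow with the load factor, the sketch below evaluates the textbook bounds for open addressing, 1/(1 − α) for an unsuccessful search and (1/α) ln(1/(1 − α)) for a successful one. These bounds assume uniform hashing and α < 1. The function names and the sample load factors are illustrative.

```python
from math import log


def max_probes_unsuccessful(alpha):
    """Upper bound on expected probes for an unsuccessful search: 1 / (1 - alpha)."""
    return 1 / (1 - alpha)


def max_probes_successful(alpha):
    """Upper bound on expected probes for a successful search:
    (1 / alpha) * ln(1 / (1 - alpha))."""
    return (1 / alpha) * log(1 / (1 - alpha))


# Evaluate both bounds for a half-full and a 90%-full table.
bounds = {
    alpha: (max_probes_unsuccessful(alpha), max_probes_successful(alpha))
    for alpha in (0.5, 0.9)
}
```

A half-full table needs at most about 2 probes for an unsuccessful search, while a 90%-full table needs about 10, so keeping α well below 1 matters for open addressing.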

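Another method to deal with collisions: Open Addressing

h'(k) = k mod 3

h(k, i) = (h'(k) + i) mod 10

Insert 2: h(2, 0) = ((2 mod 3) + 0) mod 10 = 2, slot 2 is free

Insert 3: h(3, 0) = ((3 mod 3) + 0) mod 10 = 0, slot 0 is free

Insert 6: h(6, 0) = ((6 mod 3) + 0) mod 10 = 0, slot 0 is taken by 3

h(6, 1) = ((6 mod 3) + 1) mod 10 = 1, slot 1 is free

Insert 1: h(1, 0) = ((1 mod 3) + 0) mod 10 = 1, slot 1 is taken by 6

h(1, 1) = ((1 mod 3) + 1) mod 10 = 2, slot 2 is taken by 2

h(1, 2) = ((1 mod 3) + 2) mod 10 = 3, slot 3 is free

Resulting table: slot 0 holds 3, slot 1 holds 6, slot 2 holds 2, slot 3 holds 1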
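The probe sequence above can be traced with a small linear-probing table. This is a sketch under the slide's parameters, h'(k) = k mod 3 and 10 slots. The class name `LinearProbingTable` is hypothetical, and deletion is left out because it needs extra care in open addressing.

```python
class LinearProbingTable:
    """Open addressing with linear probing: h(k, i) = (h'(k) + i) mod m."""

    def __init__(self, m=10):
        self.m = m
        self.slots = [None] * m   # keys are stored directly in the slots

    def _probe(self, key, i):
        # h'(k) = k mod 3, as on the slide.
        return ((key % 3) + i) % self.m

    def insert(self, key):
        # Try h(k, 0), h(k, 1), ... until a free slot is found.
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None:
                self.slots[j] = key
                return j
        raise OverflowError("hash table is full")

    def search(self, key):
        # Follow the same probe sequence; an empty slot means the key is absent.
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None:
                return None
            if self.slots[j] == key:
                return j
        return None


table = LinearProbingTable()
positions = [table.insert(k) for k in (2, 3, 6, 1)]
```

Inserting 2, 3, 6, 1 in that order places them in slots 2, 0, 1 and 3. Key 1 needs three probes because slots 1 and 2 are already taken.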
