linked lists and hash tablesaebnenas/teaching/fall2007/cs5321/lectures/… · single and multi...

Linked Lists and Hash Tables

Jon Woods

CS 5321

Stacks and Queues

Stack – LIFO

Queue - FIFO

[Figure: a stack and a queue, each drawn as an array with cells 1 through 9]

Push – O(1)

Pop – O(1)

Stack Empty - O(1)

[Figure: stack example with elements 3, 2, 1]

Enqueue – O(1)

Dequeue - O(1)

[Figure: queue example with elements 1, 2, 3]
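The operations listed above can be sketched in Python as follows. This is only an illustration, assuming a built-in list as the stack and `collections.deque` as the queue; the variable names are not from the slides.

```python
from collections import deque

# Stack (LIFO): push and pop both work on the same end.
stack = []
for value in (1, 2, 3):
    stack.append(value)        # Push, O(1) amortized
popped = stack.pop()           # Pop, O(1); removes the last value pushed
stack_empty = len(stack) == 0  # Stack-Empty, O(1)

# Queue (FIFO): enqueue at the tail, dequeue from the head.
queue = deque()
for value in (1, 2, 3):
    queue.append(value)        # Enqueue, O(1)
dequeued = queue.popleft()     # Dequeue, O(1); removes the first value enqueued
```

A `deque` is used for the queue because removing from the front of a plain Python list takes time proportional to its length.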

Linked List

Singly Linked List

Doubly Linked List

Circularly Linked List

[Figure: linked list with a Head pointer and nodes 1, 2, 3]

List Search

List-Search(L, k)
    x = head[L]
    while x != NIL and key[x] != k
        do x = next[x]
    return x

List Search = Θ(n) in the worst case

List Insert

List-Insert(L, x)
    next[x] = head[L]
    if head[L] != NIL
        then prev[head[L]] = x
    head[L] = x
    prev[x] = NIL

List Insert = O(1)

List Delete

List-Delete(L, x)
    if prev[x] != NIL
        then next[prev[x]] = next[x]
        else head[L] = next[x]
    if next[x] != NIL
        then prev[next[x]] = prev[x]

List Delete = O(1) or Θ(n)? Why?
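The three list routines can be sketched in Python as below. The class and method names (`Node`, `DoublyLinkedList`, `keys`) are illustrative assumptions, not part of the lecture. The sketch also suggests an answer to the question: `delete` is O(1) when it is handed the node itself, but deleting by key first requires a Θ(n) `search`.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None

    def search(self, k):
        # Walk from the head until the key is found or the list ends.
        x = self.head
        while x is not None and x.key != k:
            x = x.next
        return x

    def insert(self, x):
        # Splice x onto the front of the list in O(1).
        x.next = self.head
        if self.head is not None:
            self.head.prev = x
        self.head = x
        x.prev = None

    def delete(self, x):
        # Unlink node x in O(1); x must already be a node of this list.
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.head = x.next
        if x.next is not None:
            x.next.prev = x.prev

    def keys(self):
        # Helper for inspection only: collect keys from head to tail.
        out = []
        x = self.head
        while x is not None:
            out.append(x.key)
            x = x.next
        return out
```

For example, inserting nodes with keys 1, 2, 3 in that order leaves the list as 3, 2, 1, and `lst.delete(lst.search(2))` then leaves 3, 1.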

Single and Multi Arrays

Multi array implementations represent linked lists with three arrays: key, next, and prev.

Single array implementations represent linked lists as a single array, with key, next, and prev stored as sequential values within that array.

Multi Array Implementation

[Figure: the key, next, and prev arrays over indices 1 through 8]

The variable L represents the index of the head, 7 in this case.

Single Array Implementation

[Figure: single-array representation of the list; the head L is at index 19]

What are the advantages of using this implementation? Disadvantages?

Allocate and Free

Allocate-Object()
    if free != NIL
        then x = free
             free = next[x]
             return x

Free-Object(x)
    next[x] = free
    free = x

These functions both take O(1) time.

Allocate

[Figure: the list and the free list stored in arrays over indices 1 through 8]

Allocate-Object() returns 4 (the head of the free list), and we then call List-Insert(L, 4).

The new head of the free list is 8.

Free

[Figure: the same arrays over indices 1 through 8 after object 5 is freed]

After calling List-Delete(L, 5), we call Free-Object(5).

Object 5 now becomes the new head of the free list.
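The multi-array representation and its free list can be sketched as follows. This is a minimal illustration under stated assumptions: indices are 0-based (the slides use 1-based indices), `NIL` is represented by -1, and the class name `ArrayList` and its method names are hypothetical.

```python
NIL = -1  # assumption: -1 plays the role of NIL because indices here are 0-based


class ArrayList:
    def __init__(self, capacity):
        self.key = [None] * capacity
        self.next = [NIL] * capacity
        self.prev = [NIL] * capacity
        self.head = NIL
        # Initially every slot is on the free list, chained through next[].
        for i in range(capacity - 1):
            self.next[i] = i + 1
        self.free = 0 if capacity > 0 else NIL

    def allocate_object(self):
        # O(1): pop the head of the free list.
        if self.free == NIL:
            raise MemoryError("out of space")
        x = self.free
        self.free = self.next[x]
        return x

    def free_object(self, x):
        # O(1): push slot x onto the head of the free list.
        self.next[x] = self.free
        self.free = x

    def insert(self, k):
        # Allocate a slot, then splice it onto the front of the list.
        x = self.allocate_object()
        self.key[x] = k
        self.next[x] = self.head
        self.prev[x] = NIL
        if self.head != NIL:
            self.prev[self.head] = x
        self.head = x
        return x

    def delete(self, x):
        # Unlink slot x from the list, then return it to the free list.
        if self.prev[x] != NIL:
            self.next[self.prev[x]] = self.next[x]
        else:
            self.head = self.next[x]
        if self.next[x] != NIL:
            self.prev[self.next[x]] = self.prev[x]
        self.free_object(x)
```

As on the Free slide, a slot that has just been freed becomes the head of the free list, so it is the first one handed out by the next allocation.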

Direct Address Table

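[Figure: direct-address table T. U is the universe of keys and K is the set of actual keys; each slot of T points to an element holding a key and its satellite data.]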

Direct Address Table

DIRECT_ADDRESS_SEARCH(T, k)
    return T[k]

DIRECT_ADDRESS_INSERT(T, x)
    T[key[x]] = x

DIRECT_ADDRESS_DELETE(T, x)
    T[key[x]] = NIL

All functions are O(1)
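A direct-address table is little more than an array indexed by key. The sketch below assumes integer keys in the range 0 to `universe_size - 1` and stores each element as a `(key, data)` tuple; the class and parameter names are illustrative.

```python
class DirectAddressTable:
    def __init__(self, universe_size):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * universe_size

    def search(self, k):
        return self.slots[k]

    def insert(self, key, data):
        self.slots[key] = (key, data)

    def delete(self, key):
        self.slots[key] = None
```

Every operation is a single array access, which is why all three run in O(1) time; the cost is memory proportional to the size of the universe U rather than to the number of actual keys K.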

Collisions and Chaining

[Figure: hash table with chaining; one chain holds k2, k5, k7]

h(k1) = h(k4), h(k2) = h(k5) = h(k7), h(k6) = h(k8)

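Analysis of Chaining

$$
\begin{aligned}
E\left[\frac{1}{n}\sum_{i=1}^{n}\left(1+\sum_{j=i+1}^{n}X_{ij}\right)\right]
&= \frac{1}{n}\sum_{i=1}^{n}\left(1+\sum_{j=i+1}^{n}\frac{1}{m}\right) \\
&= 1+\frac{1}{nm}\sum_{i=1}^{n}(n-i) \\
&= 1+\frac{1}{nm}\left(\sum_{i=1}^{n}n-\sum_{i=1}^{n}i\right) \\
&= 1+\frac{1}{nm}\left(n^{2}-\frac{n(n+1)}{2}\right) \\
&= 1+\frac{n-1}{2m}
\end{aligned}
$$

where $X_{ij} = 1$ if $h(k_i) = h(k_j)$ and $0$ otherwise.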

During a successful search for x, we examine 1 more than the number of elements that precede x in its chain.

Assuming simple uniform hashing, P{h(k_i) = h(k_j)} = 1/m for any two distinct keys k_i and k_j.

Thus the expected number of collisions between any one pair of keys is 1/m.

If the number of slots is proportional to the number of elements in the table, then n = O(m).

Since α = n/m, we get α = O(m)/m = O(1), so a search takes O(1) time on average.
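Chaining can be sketched as a list of per-slot chains. The sketch below is an illustration only: it assumes integer keys, uses the division hash `k % m`, and uses Python lists for the chains, so inserting at the front costs time proportional to the chain length (a real linked list would make it O(1)). The class and method names are hypothetical.

```python
class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.n = 0
        self.slots = [[] for _ in range(m)]  # one chain per slot

    def _h(self, k):
        return k % self.m

    def insert(self, k, value):
        # New elements go to the front of the chain, as in the analysis:
        # a later search for an older key must pass the newer ones first.
        self.slots[self._h(k)].insert(0, (k, value))
        self.n += 1

    def search(self, k):
        for key, value in self.slots[self._h(k)]:
            if key == k:
                return value
        return None

    def delete(self, k):
        chain = self.slots[self._h(k)]
        for i, (key, _) in enumerate(chain):
            if key == k:
                del chain[i]
                self.n -= 1
                return True
        return False

    def load_factor(self):
        # alpha = n / m, the average chain length.
        return self.n / self.m
```

With m = 4, the keys 1, 5, and 9 all land in slot 1, so searching for 1 walks past 9 and 5 first.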

Hash Functions

Division: h(k) = k mod m

Multiplication: h(k) = ⌊m (k A mod 1)⌋, where 0 < A < 1

We should choose a power of 2 for m in the multiplication hashing scheme, but NOT for

the division scheme. Why?
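Both schemes are short enough to try directly. In the sketch below the function names and the default A = (√5 − 1)/2 are assumptions for illustration; floating-point arithmetic is used for brevity, whereas a production version would use fixed-width integer arithmetic.

```python
import math


def hash_division(k, m):
    # Division method: the remainder of k divided by m.
    return k % m


def hash_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    # Multiplication method: take the fractional part of k*A, scale by m, floor.
    return math.floor(m * ((k * A) % 1))
```

One way to see the answer to the question: with m = 2^p, `hash_division` returns only the low p bits of k, so keys such as 16, 32, and 48 all hash to 0 when m = 16 but spread out when m = 13. For the multiplication method the value of m is not critical, and a power of 2 lets the floor and scaling be done with shifts.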

Universal Hashing

Choosing the hash function at random from a suitable family gives probabilistic guarantees that hold for any set of keys.

This ensures good average case performance.

With universal hashing, we can achieve Θ(1 + α) expected search time without making assumptions about the keys.

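Universal Hashing

$$E[Y_k] \le \sum_{l \in T,\ l \ne k} \frac{1}{m}$$

If $k \notin T$:

$$n_{h(k)} = Y_k, \qquad \left|\{l : l \in T \wedge l \ne k\}\right| = n, \qquad E[n_{h(k)}] = E[Y_k] \le \frac{n}{m} = \alpha$$

If $k \in T$:

$$n_{h(k)} = Y_k + 1, \qquad \left|\{l : l \in T \wedge l \ne k\}\right| = n - 1, \qquad E[n_{h(k)}] = E[Y_k] + 1 \le \frac{n-1}{m} + 1 = 1 + \alpha - \frac{1}{m}$$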

Let Y_k be the number of keys other than k that hash to the same slot as k.

As before, a single pair of distinct keys collides with probability at most 1/m.

If the key k is not in the table, then the number of keys in the same slot as k equals Y_k. The number of keys in T that are not equal to k is n, so an unsuccessful search for k examines at most α keys in expectation.

If the key k is in the table, then the chain containing k holds k itself plus Y_k other keys. The number of keys in T that are not equal to k is n-1, so a successful search for k examines fewer than 1 + α keys in expectation.

Designing a Universal Hash Function

We choose a prime number p such that every possible key is in the range 0 to p-1.

We choose a at random from the range 1 to p-1 and b at random from the range 0 to p-1.

h(k) = ((ak + b) mod p) mod m
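A minimal sketch of this construction follows. The function names and the `rng` parameter are assumptions for illustration; the caller is responsible for supplying a prime p larger than every key.

```python
import random


def universal_hash(k, a, b, p, m):
    # h(k) = ((a*k + b) mod p) mod m
    return ((a * k + b) % p) % m


def random_parameters(p, rng):
    # a must be nonzero; b may be any residue mod p.
    a = rng.randrange(1, p)
    b = rng.randrange(0, p)
    return a, b
```

For instance, with the parameters used later in these slides (a = 3, b = 42, p = 101, m = 9), `universal_hash(75, 3, 42, 101, 9)` evaluates ((225 + 42) mod 101) mod 9 = 65 mod 9 = 2.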

Open Addressing

Instead of following pointers, we compute a probe sequence that determines which slots to examine, and every element is stored in the table itself.

Because no memory is spent on pointers, the same memory can hold more slots, which may yield fewer collisions and faster retrieval.

Truly uniform hashing requires m! distinct probing sequences.

Linear and Quadratic Probing

Linear probing: h(k, i) = (h'(k) + i) mod m

Suffers from primary clustering

Only offers m distinct probing sequences

Quadratic probing: h(k, i) = (h'(k) + c1 i + c2 i^2) mod m

Suffers from secondary clustering

Also offers only m distinct probing sequences
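The two probe functions can be compared with a short sketch. The function names, the auxiliary hash h'(k) = k mod m, and the constants c1 = 1, c2 = 3 are assumptions chosen for illustration.

```python
def linear_probe(h0, i, m):
    # h(k, i) = (h'(k) + i) mod m, where h0 = h'(k)
    return (h0 + i) % m


def quadratic_probe(h0, i, m, c1, c2):
    # h(k, i) = (h'(k) + c1*i + c2*i^2) mod m, where h0 = h'(k)
    return (h0 + c1 * i + c2 * i * i) % m


def linear_sequence(k, m):
    h0 = k % m  # assumed auxiliary hash h'(k) = k mod m
    return [linear_probe(h0, i, m) for i in range(m)]


def quadratic_sequence(k, m, c1=1, c2=3):
    h0 = k % m  # same assumed auxiliary hash
    return [quadratic_probe(h0, i, m, c1, c2) for i in range(m)]
```

In both schemes the whole sequence is determined by the starting slot h'(k), so two keys with the same h'(k) (for example 3 and 14 with m = 11) follow identical sequences. That is why each scheme offers only m distinct probing sequences.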

Double Hashing

h(k, i) = (h1(k) + i h2(k)) mod m

h1(k) = k mod 13

h2(k) = 1 + (k mod 11)

h1(14) = 1, h2(14) = 4

In double hashing, we calculate two hashes, one for the initial position and one for the offset should that position be full.

In this example, we use the hash functions h1 and h2 shown above with m = 13. After inserting 5 values into the table, we try to insert 14.

Position 1 is full, so we increase by the offset 4. Position 5 is also full, so we put our data into position 9.

Double hashing offers m^2 distinct probing sequences.
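The insertion of 14 can be traced with a short sketch. The five keys inserted first (79, 69, 72, 98, 50) are an assumption: the slides do not list them, and they are chosen here only so that positions 1 and 5 are occupied as described. The function names are likewise illustrative.

```python
M = 13  # table size, matching h1(k) = k mod 13


def h1(k):
    return k % 13


def h2(k):
    return 1 + (k % 11)


def probe(k, i):
    # h(k, i) = (h1(k) + i * h2(k)) mod m
    return (h1(k) + i * h2(k)) % M


def insert(table, k):
    # Try probe positions in order until an empty slot is found.
    for i in range(M):
        slot = probe(k, i)
        if table[slot] is None:
            table[slot] = k
            return slot
    raise OverflowError("hash table is full")


table = [None] * M
for key in (79, 69, 72, 98, 50):  # assumed keys, not taken from the slides
    insert(table, key)
slot_for_14 = insert(table, 14)   # probes 1, then 5, then 9
```

Because h2(k) depends on the key, two keys that start at the same slot usually step by different offsets, which is what avoids the clustering seen with linear and quadratic probing.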

Analysis of Open Addressing

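$$E[X] = \sum_{i=1}^{\infty} \frac{n}{m}\cdot\frac{n-1}{m-1}\cdots\frac{n-i+2}{m-i+2}$$

(for $i = 1$ the product is empty and equals 1)

$$E[X] \le \sum_{i=1}^{\infty} \alpha^{i-1}$$

$$E[X] \le \sum_{i=0}^{\infty} \alpha^{i} = \frac{1}{1-\alpha} \text{ probes}$$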

The expected number of probes needed to find an empty slot equals the sum, over i, of the probability that at least i probes are needed, that is, that the first i - 1 probes all hit occupied slots.

Since each factor in that probability is at most n/m = α, we can bound the expected number of probes by a geometric series.

Thus, we expect at most 1/(1-α) probes on average, where α = n/m < 1.
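The bound can be checked numerically by summing the probabilities directly. The sketch below assumes uniform hashing and n < m; the function names are illustrative.

```python
def expected_probes_exact(n, m):
    # Sum Pr{X >= i} for i = 1 .. n+1, where
    # Pr{X >= i} = (n/m) * ((n-1)/(m-1)) * ... * ((n-i+2)/(m-i+2)).
    total = 0.0
    prob_at_least_i = 1.0  # Pr{X >= 1} = 1: the first probe always happens
    for i in range(1, n + 2):
        total += prob_at_least_i
        # Probe i hits an occupied slot with probability (n-i+1)/(m-i+1).
        prob_at_least_i *= (n - (i - 1)) / (m - (i - 1))
    return total


def expected_probes_bound(n, m):
    alpha = n / m
    return 1.0 / (1.0 - alpha)
```

For a table with m = 3 slots and n = 2 keys, the exact sum is 1 + 2/3 + (2/3)(1/2) = 2, while the bound gives 1/(1 - 2/3) = 3.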

Perfect Hashing

With a static set of keys and two levels of universal hashing, we can construct a structure with no collisions at the second level and O(1) worst-case search time.

Why is this better than other hash schemes?

Perfect Hashing

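h(k) = ((ak + b) mod p) mod m

a = 3, b = 42, p = 101, m = 9

[Figure: two-level perfect hash table. Occupied slots 0, 2, 5, and 7 of T each point to a secondary table S_j with its own parameters m_j, a_j, b_j. S7 (m7 = 16, a7 = 23, b7 = 88) holds 40, 52, 22, 37; S2 holds 60, 72, 75.]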

h(75) = 2, so 75 hashes to slot 2 of table T.

h'(75) = 7, so 75 hashes to slot 7 of secondary hash table S2.
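The two-level scheme can be sketched as below. The first-level parameters (a = 3, b = 42, p = 101, m = 9) and the keys come from the slide; everything else is an assumption for illustration, including the function names, the choice m_j = n_j^2 for each secondary table, and the retry loop that redraws a_j and b_j until a secondary table has no collisions. The secondary parameters it finds will generally differ from those in the figure.

```python
import random

P = 101  # prime from the slide; every key must be smaller than P


def uhash(k, a, b, m):
    return ((a * k + b) % P) % m


def build_perfect_table(keys, a, b, m, rng):
    # First level: distribute the static key set into m buckets.
    buckets = [[] for _ in range(m)]
    for k in keys:
        buckets[uhash(k, a, b, m)].append(k)

    table = []
    for bucket in buckets:
        if not bucket:
            table.append(None)
            continue
        mj = len(bucket) ** 2  # assumed sizing: square of the bucket size
        while True:
            # Redraw secondary parameters until no two keys collide.
            aj = rng.randrange(1, P)
            bj = rng.randrange(0, P)
            slots = [None] * mj
            collision = False
            for k in bucket:
                s = uhash(k, aj, bj, mj)
                if slots[s] is not None:
                    collision = True
                    break
                slots[s] = k
            if not collision:
                break
        table.append((aj, bj, mj, slots))
    return table


def lookup(table, k, a, b, m):
    # Two hash evaluations and one comparison: O(1) in the worst case.
    entry = table[uhash(k, a, b, m)]
    if entry is None:
        return False
    aj, bj, mj, slots = entry
    return slots[uhash(k, aj, bj, mj)] == k
```

A possible usage: `table = build_perfect_table([22, 37, 40, 52, 60, 72, 75], 3, 42, 9, random.Random(0))`, after which `lookup(table, 75, 3, 42, 9)` is true and `lookup(table, 25, 3, 42, 9)` is false even though 25 also hashes to slot 7 at the first level.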

This man owns the patent on linked lists

Linked List - Patent No. 10260471

Patent Issued April 11, 2006 to LSI Logic Corporation

“A computerized list is provided with auxiliary pointers for traversing the list in different sequences. One or more auxiliary pointers enable a fast, sequential traversal of the list with a minimum of computational time. Such lists may be used in any application where lists may be reordered for various purposes.”

Abhi Talwalkar, CEO LSI Logic
