data structures and algorithms lecture notes 7 prepared by İnanç tahrali
![Page 1: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/1.jpg)
DATA STRUCTURES AND ALGORITHMS
Lecture Notes 7
Prepared by İnanç TAHRALI
![Page 2: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/2.jpg)
REVIEW
We have investigated the following ADTs:
• Lists
– Array
– Linked list
• Stacks
• Queues
• Trees
– Binary trees
– Binary search trees
– AVL trees
What about their running times?
![Page 3: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/3.jpg)
Running times of important operations

| | insertion | deletion | find |
|---|---|---|---|
| Array | O(n) | O(n) | O(n) |
| Linked list | O(1) | O(n) | O(n) |
| Tree (balanced, e.g. AVL) | O(log n) | O(log n) | O(log n) |

Can we decrease the running times further?
![Page 4: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/4.jpg)
ROAD MAP
HASHING
• General Idea
• Hash Function
• Separate Chaining
• Open Addressing
• Rehashing
![Page 5: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/5.jpg)
Hashing
• Hashing: implementation of hash tables
• Hash table: an array of elements of fixed size TableSize
• Search is performed on a part of the item: the key
• Each key is mapped into a number in the range 0 to TableSize-1
– Used as the array index
• The mapping is done by a hash function, which should
– be simple to compute
– ensure that any two distinct keys get different cells
• How can insert, delete and find be performed in O(1) time?
![Page 6: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/6.jpg)
An ideal hash table
• Each key is mapped to a different index!
– Not always possible: many keys, finite indexes
– Aim for an even distribution
• Considerations:
– Choose a hash function
– Decide what to do when two keys hash to the same value
– Decide on the table size
![Page 7: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/7.jpg)
Hash function
If keys are integers:
• The hash function returns Key mod TableSize
• Ex: TableSize = 10, Keys = 120, 330, 1000
– All three keys hash to cell 0
• TableSize should be prime
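As a rough illustration of why a prime TableSize helps, the sketch below applies Key mod TableSize to the three example keys. The function name `hashInt` and the prime 13 are assumptions chosen for this example; they do not come from the slides.

```cpp
#include <cassert>

// Hypothetical helper: Key mod TableSize for non-negative integer keys.
int hashInt( int key, int tableSize )
{
    assert( key >= 0 && tableSize > 0 );
    return key % tableSize;
}
```

With TableSize = 10, the keys 120, 330 and 1000 all land in cell 0. With the assumed prime size 13 they land in cells 3, 5 and 12.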
![Page 8: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/8.jpg)
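Hash function
If keys are strings:
• Add the ASCII values of the characters
• Problem: if TableSize is large and the number of characters is small, most of the table is never used
– Ex: TableSize = 10000 and the number of characters in a key = 8
– An ASCII value is at most 127, so the hash value is at most 127 * 8 = 1016 < 10000

    // Sum the character codes of the key, then reduce modulo the table size.
    int hash( const string & key, int tableSize )
    {
        int hashVal = 0;
        for( int i = 0; i < key.length( ); i++ )
            hashVal += key[ i ];
        return hashVal % tableSize;
    }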
![Page 9: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/9.jpg)
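Hash function
If keys are strings:
• Use all characters: $\sum_{i=0}^{KeySize-1} Key[KeySize-i-1] \cdot 32^i$
• For long keys the early characters do not count: repeated multiplication by 32 shifts them out of the integer when it overflows
– Use only some number of characters
– Use characters in odd positions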
![Page 10: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/10.jpg)
Hash function
If keys are strings:
• Use the first three characters: 729 * key[2] + 27 * key[1] + key[0]
• If the keys are not random, some part of the table is not used.

    // Uses only the first three characters; assumes the key has at least three.
    int hash( const string & key, int tableSize )
    {
        return ( key[ 0 ] + 27 * key[ 1 ] + 729 * key[ 2 ] ) % tableSize;
    }
![Page 11: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/11.jpg)
A good hash function

    int hash( const string & key, int tableSize )
    {
        // Unsigned arithmetic wraps around on overflow, so the result
        // is well defined and never negative.
        unsigned int hashVal = 0;

        for( int i = 0; i < key.length( ); i++ )
            hashVal = 37 * hashVal + key[ i ];

        return hashVal % tableSize;
    }
![Page 12: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/12.jpg)
Collision
• The main programming detail is collision resolution
• If an element, when inserted, hashes to the same value as an already inserted element, there is a collision.
• There are several methods to deal with this problem:
– Separate chaining
– Open addressing
![Page 13: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/13.jpg)
Separate Chaining Hash Table
• Keep a list of all elements that hash to the same value
• TableSize = 10 is not a good choice because it is not prime
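The idea of one list per cell can be sketched with the standard library. This is a minimal illustration, not the lecture's implementation: the class name `ChainedTable`, the restriction to non-negative `int` keys, and the use of `std::list` are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <vector>

// Minimal separate-chaining table for non-negative ints (illustrative only).
class ChainedTable
{
  public:
    explicit ChainedTable( int size = 11 ) : lists( size ) { assert( size > 0 ); }

    // True if x is stored in the list of its home cell.
    bool contains( int x ) const
    {
        const std::list<int> & l = lists[ x % lists.size( ) ];
        return std::find( l.begin( ), l.end( ), x ) != l.end( );
    }

    // Insert x at the front of its list; duplicates are ignored.
    void insert( int x )
    {
        if( !contains( x ) )
            lists[ x % lists.size( ) ].push_front( x );
    }

    void remove( int x ) { lists[ x % lists.size( ) ].remove( x ); }

    // Number of elements that hashed to the given cell.
    int chainLength( int cell ) const
    {
        return static_cast<int>( lists[ cell ].size( ) );
    }

  private:
    std::vector<std::list<int> > lists;
};
```

For instance, inserting the hypothetical keys 1 and 81 into a table of size 10 puts both in the list of cell 1, while cell 2 stays empty.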
![Page 14: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/14.jpg)
Type declaration for separate chaining hash table

    template <class HashedObj>
    class HashTable
    {
      public:
        explicit HashTable( const HashedObj & notFound, int size = 101 );
        HashTable( const HashTable & rhs )
          : ITEM_NOT_FOUND( rhs.ITEM_NOT_FOUND ), theLists( rhs.theLists ) { }

        const HashedObj & find( const HashedObj & x ) const;

        void makeEmpty( );
        void insert( const HashedObj & x );
        void remove( const HashedObj & x );

        const HashTable & operator=( const HashTable & rhs );

      private:
        vector<List<HashedObj> > theLists;   // The array of Lists
        const HashedObj ITEM_NOT_FOUND;
    };

    int hash( const string & key, int tableSize );
    int hash( int key, int tableSize );
![Page 15: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/15.jpg)

    // Construct the hash table.
    template <class HashedObj>
    HashTable<HashedObj>::HashTable( const HashedObj & notFound, int size )
      : ITEM_NOT_FOUND( notFound ), theLists( nextPrime( size ) )
    {
    }

    // Make the hash table logically empty.
    template <class HashedObj>
    void HashTable<HashedObj>::makeEmpty( )
    {
        for( int i = 0; i < theLists.size( ); i++ )
            theLists[ i ].makeEmpty( );
    }

    // Deep copy.
    template <class HashedObj>
    const HashTable<HashedObj> &
    HashTable<HashedObj>::operator=( const HashTable<HashedObj> & rhs )
    {
        if( this != &rhs )
            theLists = rhs.theLists;
        return *this;
    }
![Page 16: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/16.jpg)

    // Remove item x from the hash table.
    template <class HashedObj>
    void HashTable<HashedObj>::remove( const HashedObj & x )
    {
        theLists[ hash( x, theLists.size( ) ) ].remove( x );
    }

    // Find item x in the hash table.
    // Return the matching item, or ITEM_NOT_FOUND if it is not present.
    template <class HashedObj>
    const HashedObj & HashTable<HashedObj>::find( const HashedObj & x ) const
    {
        ListItr<HashedObj> itr;
        itr = theLists[ hash( x, theLists.size( ) ) ].find( x );
        if( itr.isPastEnd( ) )
            return ITEM_NOT_FOUND;
        else
            return itr.retrieve( );
    }
![Page 17: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/17.jpg)

    // Insert item x into the hash table.
    // If the item is already present, do nothing.
    template <class HashedObj>
    void HashTable<HashedObj>::insert( const HashedObj & x )
    {
        List<HashedObj> & whichList = theLists[ hash( x, theLists.size( ) ) ];
        ListItr<HashedObj> itr = whichList.find( x );

        if( itr.isPastEnd( ) )
            whichList.insert( x, whichList.zeroth( ) );
    }
![Page 18: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/18.jpg)
Analysis
• Let λ be the load factor of a hash table: λ = number of elements / TableSize
• λ is the average length of a list
• Successful find: about λ/2 comparisons + time to evaluate the hash function
• Unsuccessful find and insert: λ comparisons + time to evaluate the hash function
• Good choice: λ ≈ 1
• The disadvantage of separate chaining is the cost of allocating and deallocating memory
![Page 19: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/19.jpg)
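Open Addressing
• On a collision, try alternate cells h0(x), h1(x), h2(x), … in turn until an empty cell is found
• hi(x) = (hash(x) + F(i)) mod TableSize, with F(0) = 0
• All elements are stored in the table itself, so λ < 1 is required
• Good choice: λ < 0.5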
![Page 20: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/20.jpg)
Linear Probing
• F is a linear function of i
– F(i) = i
• Insert keys {89, 18, 49, 58, 69} into a table of size 10
• When 49 is inserted, a collision occurs
– It is put into the next available spot, 0
• 58 collides with 18, 89 and 49 before an empty cell is found
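The insertion sequence above can be traced with a short simulation. This is a sketch under stated assumptions: the function name `insertLinear`, the use of -1 as the empty marker, hash(x) = x mod TableSize, and the restriction to non-negative keys in a table that is not full are all choices made for this example.

```cpp
#include <cassert>
#include <vector>

// Insert key with linear probing, F(i) = i, and return the cell used.
// Assumes non-negative keys, -1 marks an empty cell, and the table is not full.
int insertLinear( std::vector<int> & table, int key )
{
    int size = static_cast<int>( table.size( ) );
    assert( size > 0 && key >= 0 );

    int pos = key % size;
    while( table[ pos ] != -1 )       // collision: try the next cell, wrapping around
        pos = ( pos + 1 ) % size;

    table[ pos ] = key;
    return pos;
}
```

Starting from `std::vector<int> table( 10, -1 )`, the keys 89, 18, 49, 58 and 69 end up in cells 9, 8, 0, 1 and 2.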
![Page 21: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/21.jpg)
Linear Probing
• Problem: it is not easy to delete an element
– The element may have caused a collision before, so later elements may have been placed past it
– Mark the element as deleted instead of removing it
• Problem: primary clustering
![Page 22: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/22.jpg)
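Linear Probing
Analysis
Expected number of probes for an unsuccessful search and for an insert (U&I):
$$\frac{1}{2}\left(1 + \frac{1}{(1-\lambda)^2}\right)$$
Expected number of probes for a successful search (S):
$$\frac{1}{2}\left(1 + \frac{1}{1-\lambda}\right)$$
Problem: Primary Clustering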
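To get a feel for these estimates, here is a worked evaluation of the standard linear probing formulas for two load factors. The values λ = 0.5 and λ = 0.9 are chosen only for illustration.

```latex
\begin{align*}
\text{U\&I}(\lambda) &= \tfrac{1}{2}\left(1 + \frac{1}{(1-\lambda)^2}\right), &
\text{S}(\lambda) &= \tfrac{1}{2}\left(1 + \frac{1}{1-\lambda}\right) \\[4pt]
\text{U\&I}(0.5) &= \tfrac{1}{2}\left(1 + \frac{1}{0.25}\right) = 2.5, &
\text{S}(0.5) &= \tfrac{1}{2}\left(1 + 2\right) = 1.5 \\[4pt]
\text{U\&I}(0.9) &= \tfrac{1}{2}\left(1 + \frac{1}{0.01}\right) = 50.5, &
\text{S}(0.9) &= \tfrac{1}{2}\left(1 + 10\right) = 5.5
\end{align*}
```

The sharp growth of the insert cost as λ approaches 1 is the effect of primary clustering.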
![Page 23: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/23.jpg)
Quadratic Probing
• F(i) is a quadratic function
• Ex: $F(i) = i^2$
![Page 24: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/24.jpg)
Quadratic Probing
• When 49 collides with 89, the next position attempted is one cell away
• 58 collides at position 8. The cell one away is tried, and another collision occurs. It is inserted into the cell $2^2 = 4$ away
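The same trace can be reproduced in code. The sketch below is illustrative only: the name `insertQuadratic`, the -1 empty marker, hash(x) = x mod TableSize, a table of size 10 and the key set {89, 18, 49, 58, 69} are assumptions for this example, and the loop assumes an empty cell is reachable.

```cpp
#include <cassert>
#include <vector>

// Insert key with quadratic probing, F(i) = i * i, and return the cell used.
// Assumes non-negative keys, -1 marks an empty cell, and a free cell is reachable.
int insertQuadratic( std::vector<int> & table, int key )
{
    int size = static_cast<int>( table.size( ) );
    assert( size > 0 && key >= 0 );

    int home = key % size;
    int pos = home;
    for( int i = 1; table[ pos ] != -1; i++ )
        pos = ( home + i * i ) % size;    // i-th probe is i^2 cells from the home cell

    table[ pos ] = key;
    return pos;
}
```

With these assumptions 49 goes to cell 0 (one away from 9), 58 goes to cell 2 (four away from 8), and 69 goes to cell 3.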
![Page 25: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/25.jpg)
Quadratic Probing
• Solves the primary clustering problem
• Not all empty cells may be reachable
– A loop around full cells may happen
– The hash table is not full, but an empty cell is not found
• Theorem: If the table size is prime and λ < 0.5, a new element can always be inserted.
• Problem: secondary clustering
![Page 26: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/26.jpg)
Type declaration for open addressing hash table

    template <class HashedObj>
    class HashTable
    {
      public:
        explicit HashTable( const HashedObj & notFound, int size = 101 );
        HashTable( const HashTable & rhs )
          : ITEM_NOT_FOUND( rhs.ITEM_NOT_FOUND ),
            array( rhs.array ), currentSize( rhs.currentSize ) { }

        const HashedObj & find( const HashedObj & x ) const;

        void makeEmpty( );
        void insert( const HashedObj & x );
        void remove( const HashedObj & x );

        const HashTable & operator=( const HashTable & rhs );

        enum EntryType { ACTIVE, EMPTY, DELETED };
![Page 27: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/27.jpg)
Type declaration for open addressing hash table (continued)

      private:
        struct HashEntry
        {
            HashedObj element;
            EntryType info;

            HashEntry( const HashedObj & e = HashedObj( ), EntryType i = EMPTY )
              : element( e ), info( i ) { }
        };

        vector<HashEntry> array;
        int currentSize;
        const HashedObj ITEM_NOT_FOUND;

        bool isActive( int currentPos ) const;
        int findPos( const HashedObj & x ) const;
        void rehash( );
    };
![Page 28: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/28.jpg)

    // Construct the hash table.
    template <class HashedObj>
    HashTable<HashedObj>::HashTable( const HashedObj & notFound, int size )
      : ITEM_NOT_FOUND( notFound ), array( nextPrime( size ) )
    {
        makeEmpty( );
    }

    // Make the hash table logically empty.
    template <class HashedObj>
    void HashTable<HashedObj>::makeEmpty( )
    {
        currentSize = 0;
        for( int i = 0; i < array.size( ); i++ )
            array[ i ].info = EMPTY;
    }
![Page 29: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/29.jpg)

    // Find item x in the hash table.
    // Return the matching item, or ITEM_NOT_FOUND if it is not present.
    template <class HashedObj>
    const HashedObj & HashTable<HashedObj>::find( const HashedObj & x ) const
    {
        int currentPos = findPos( x );
        if( isActive( currentPos ) )
            return array[ currentPos ].element;
        else
            return ITEM_NOT_FOUND;
    }

    // Method that performs quadratic probing resolution.
    // Return the position where the search for x terminates.
    template <class HashedObj>
    int HashTable<HashedObj>::findPos( const HashedObj & x ) const
    {
        int collisionNum = 0;
        int currentPos = hash( x, array.size( ) );

        while( array[ currentPos ].info != EMPTY &&
               array[ currentPos ].element != x )
        {
            currentPos += 2 * ++collisionNum - 1;   // Compute the i-th probe
            if( currentPos >= array.size( ) )
                currentPos -= array.size( );
        }
        return currentPos;
    }
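The update `currentPos += 2 * ++collisionNum - 1` in `findPos` reaches the cell hash(x) + i² without computing a square. A short derivation of why this works, writing $h_i$ for the i-th probe position before the wrap-around is applied:

```latex
\begin{align*}
h_i - h_{i-1} &= \bigl(\mathrm{hash}(x) + i^2\bigr) - \bigl(\mathrm{hash}(x) + (i-1)^2\bigr) = 2i - 1 \\
h_i &= \mathrm{hash}(x) + \sum_{k=1}^{i} (2k - 1) = \mathrm{hash}(x) + i^2
\end{align*}
```

Each step adds less than 2 · TableSize as long as the table is at most half full, which is presumably why a single subtraction of `array.size( )` is used in place of a mod operation.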
![Page 30: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/30.jpg)

    // Return true if currentPos exists and is active.
    template <class HashedObj>
    bool HashTable<HashedObj>::isActive( int currentPos ) const
    {
        return array[ currentPos ].info == ACTIVE;
    }

    // Remove item x from the hash table (lazy deletion: the cell is only marked).
    template <class HashedObj>
    void HashTable<HashedObj>::remove( const HashedObj & x )
    {
        int currentPos = findPos( x );
        if( isActive( currentPos ) )
            array[ currentPos ].info = DELETED;
    }

    // Insert routine with quadratic probing.
    // If the item is already present, do nothing.
    template <class HashedObj>
    void HashTable<HashedObj>::insert( const HashedObj & x )
    {
        int currentPos = findPos( x );
        if( isActive( currentPos ) )
            return;
        array[ currentPos ] = HashEntry( x, ACTIVE );
    }
![Page 31: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/31.jpg)

    // Deep copy.
    template <class HashedObj>
    const HashTable<HashedObj> &
    HashTable<HashedObj>::operator=( const HashTable<HashedObj> & rhs )
    {
        if( this != &rhs )
        {
            array = rhs.array;
            currentSize = rhs.currentSize;
        }
        return *this;
    }
![Page 32: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/32.jpg)
Double Hashing
• Use a second hash function: F(i) = i * hash2(x)
• Poor example:
– hash1(x) = X mod 10
– hash2(x) = X mod 9
– TableSize = 10
• If X = 99, what happens? hash2(99) = 0, so every probe lands on the same cell
• Requirement: hash2(x) ≠ 0 for any X
![Page 33: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/33.jpg)
Double Hashing
• Good choice: hash2(x) = R - (X mod R)
• R is a prime smaller than TableSize
![Page 34: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/34.jpg)
Double Hashing
hash2(x) = 7 – (X mod 7)
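A small simulation shows this second hash function at work. It is a sketch under assumptions that are not on this slide: a table of size 10, hash1(x) = x mod 10, the key set {89, 18, 49, 58, 69} reused from the linear probing example, -1 as the empty marker, and the helper names `hash2` and `insertDouble`.

```cpp
#include <cassert>
#include <vector>

// Second hash function with R = 7; the result is in 1..7, never 0.
int hash2( int x )
{
    return 7 - ( x % 7 );
}

// Insert key with double hashing, F(i) = i * hash2(x), and return the cell used.
// Assumes non-negative keys, -1 marks an empty cell, and a free cell is reachable.
int insertDouble( std::vector<int> & table, int key )
{
    int size = static_cast<int>( table.size( ) );
    assert( size > 0 && key >= 0 );

    int pos = key % size;
    int step = hash2( key );
    while( table[ pos ] != -1 )           // each collision moves one more step away
        pos = ( pos + step ) % size;

    table[ pos ] = key;
    return pos;
}
```

Under these assumptions 49 has step 7 and lands in cell 6, 58 has step 5 and lands in cell 3, and 69 has step 1 and lands in cell 0. For X = 99 this function gives 6, whereas X mod 9 would give a step of 0.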
![Page 35: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/35.jpg)
Analysis
• Random collision resolution
– Probes are independent
– No clustering problem
• Unsuccessful search and insert
– Number of probes until an empty cell is found
– Fraction of cells that are empty = 1 - λ
– Expected number of probes = 1 / (1 - λ)
• Successful search
– P(X) = number of probes when the element X is inserted
– The average, (1/N) ∑ P(X), is approximately
$$\frac{1}{\lambda}\int_0^{\lambda}\frac{1}{1-x}\,dx = \frac{1}{\lambda}\ln\frac{1}{1-\lambda}$$
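The two estimates for random collision resolution, 1 / (1 - λ) for an unsuccessful search or insert and (1/λ) ln(1 / (1 - λ)) for a successful search, are easy to evaluate numerically. The function names below are hypothetical and the code is only a calculator for those formulas.

```cpp
#include <cassert>
#include <cmath>

// Expected probes for an unsuccessful search or an insert: 1 / (1 - lambda).
double unsuccessfulProbes( double lambda )
{
    assert( lambda >= 0.0 && lambda < 1.0 );
    return 1.0 / ( 1.0 - lambda );
}

// Expected probes for a successful search: (1 / lambda) * ln( 1 / (1 - lambda) ).
double successfulProbes( double lambda )
{
    assert( lambda > 0.0 && lambda < 1.0 );
    return std::log( 1.0 / ( 1.0 - lambda ) ) / lambda;
}
```

At λ = 0.5 these give 2 probes and about 1.39 probes; at λ = 0.9 they give 10 probes and about 2.56 probes.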
![Page 36: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/36.jpg)
Rehashing
• If λ gets large, the number of probes increases.
– Operations start taking too long and insertions might fail
• Solution: rehashing with a larger TableSize (usually about twice as large)
• When to rehash:
– if λ > 0.5
– if an insertion fails
![Page 37: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/37.jpg)
Rehashing Example
• Elements 13, 15, 24 and 6 are inserted into an open addressing hash table of size 7
• H(X) = X mod 7
• Linear probing is used to resolve collisions
![Page 38: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/38.jpg)
Rehashing Example
• If 23 is inserted, the table is over 70 percent full.
• A new table is created
• 17 is the first prime that is at least twice as large as the old table size, so Hnew(X) = X mod 17
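The example can be replayed in code: fill the size-7 table with linear probing, then rebuild it with 17 cells. This is a sketch, not the lecture's `rehash` routine; the names `placeLinear` and `rebuild`, the -1 empty marker and the restriction to non-negative keys are assumptions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear-probing insert with H(X) = X mod table size; -1 marks an empty cell.
// Assumes non-negative keys and a table that is not full.
void placeLinear( std::vector<int> & table, int key )
{
    int size = static_cast<int>( table.size( ) );
    assert( size > 0 && key >= 0 );

    int pos = key % size;
    while( table[ pos ] != -1 )
        pos = ( pos + 1 ) % size;
    table[ pos ] = key;
}

// Build a table with newSize cells and reinsert every key,
// scanning the old table from cell 0 upward.
std::vector<int> rebuild( const std::vector<int> & old, int newSize )
{
    std::vector<int> bigger( newSize, -1 );
    for( std::size_t i = 0; i < old.size( ); i++ )
        if( old[ i ] != -1 )
            placeLinear( bigger, old[ i ] );
    return bigger;
}
```

After inserting 13, 15, 24, 6 and 23, the old table holds 6, 15, 23, 24 in cells 0 to 3 and 13 in cell 6. Rebuilding with 17 cells places 6, 23, 24, 13 and 15 in cells 6, 7, 8, 13 and 15.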
![Page 39: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/39.jpg)
Rehashing
• Rehashing is an expensive operation
– Running time is O(N)
• Rehashing frees the programmer from worrying about table size
• Amortized analysis: averaged over N operations
– Each operation takes O(1) time
![Page 40: DATA STRUCTURES AND ALGORITHMS Lecture Notes 7 Prepared by İnanç TAHRALI](https://reader035.vdocument.in/reader035/viewer/2022081516/56649e875503460f94b8b8f4/html5/thumbnails/40.jpg)

    // Insert routine with quadratic probing.
    // Rehash when the table becomes more than half full.
    template <class HashedObj>
    void HashTable<HashedObj>::insert( const HashedObj & x )
    {
        int currentPos = findPos( x );
        if( isActive( currentPos ) )
            return;
        array[ currentPos ] = HashEntry( x, ACTIVE );

        if( ++currentSize > array.size( ) / 2 )
            rehash( );
    }

    // Expand the hash table.
    template <class HashedObj>
    void HashTable<HashedObj>::rehash( )
    {
        vector<HashEntry> oldArray = array;

        // Create a new, empty table about twice as large
        array.resize( nextPrime( 2 * oldArray.size( ) ) );
        for( int j = 0; j < array.size( ); j++ )
            array[ j ].info = EMPTY;

        // Copy the active elements over
        currentSize = 0;
        for( int i = 0; i < oldArray.size( ); i++ )
            if( oldArray[ i ].info == ACTIVE )
                insert( oldArray[ i ].element );
    }
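The constructors and `rehash` call `nextPrime`, which is not shown in these notes. One possible version is sketched below using trial division; the helper `isPrime` and the exact behaviour (returning the smallest prime that is not less than n) are assumptions, not the lecture's code.

```cpp
#include <cassert>

// Trial-division primality test; adequate for typical table sizes.
bool isPrime( int n )
{
    if( n < 2 )
        return false;
    for( int d = 2; d <= n / d; d++ )     // d <= n / d avoids overflow of d * d
        if( n % d == 0 )
            return false;
    return true;
}

// Assumed behaviour: return the smallest prime that is not less than n.
int nextPrime( int n )
{
    assert( n >= 0 );
    while( !isPrime( n ) )
        n++;
    return n;
}
```

With this version, `nextPrime( 2 * 7 )` yields 17, matching the rehashing example, and the default size 101 is returned unchanged because it is already prime.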