![Page 1: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/1.jpg)
A Knowledge Sharing Session on
Unit IV: Tables (DSPS)
1
![Page 2: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/2.jpg)
Syllabus:
Symbol Tables: Static and dynamic tree tables, AVL trees, AVL Tree Implementation, Algorithms and analysis of AVL Tree.
Hash Tables: Basic Concepts, Hash Function, Hashing methods, Collision resolution, Bucket hashing, Dynamic Hashing.
Tables |Unit IV of DSPS (SE-Comp)
2
![Page 3: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/3.jpg)
Part I: Symbol Tables
Symbol Tables: Static and dynamic tree tables, AVL trees, AVL Tree Implementation, Algorithms and analysis of AVL Tree.
Part II: Hash Tables
Hash Tables: Basic Concepts, Hash Function, Hashing methods, Collision resolution, Bucket hashing, Dynamic Hashing.
3
![Page 4: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/4.jpg)
Symbol Table Examples
AVL Tree
AVL Implementation
AVL Algorithm Analysis
Symbol Table | Why Symbol Table
What Does a Compiler Do?
• Lexical analysis
  – Detects inputs with illegal tokens
  – e.g.: main$ ();
• Parsing
  – Detects inputs with ill-formed parse trees
  – e.g.: missing semicolons
• Semantic analysis
  – Last “front end” phase
  – Catches all remaining errors
Symbol Table
4
![Page 5: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/5.jpg)
Symbol Table | Why Symbol Table
Typical Semantic Errors
• multiple declarations: a variable should be declared (in the same region) at most once.
• undeclared variable: a variable should not be used before being declared.
• type mismatch: type of the left-hand side of an assignment should match the type of the right-hand side.
• wrong arguments: methods should be called with the right number and types of arguments.
5
![Page 6: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/6.jpg)
Symbol Table | Aim of Symbol Table
Purpose of Symbol Table
– keep track of names declared in the program
– names of
  • variables, classes, fields, methods
6
![Page 7: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/7.jpg)
Symbol Table | Symbol Table Stores
What it Contains
associates a name with a set of attributes, e.g.:
• kind of name (variable, class, field, method, etc)
• type (int, float, etc)
• nesting level
• memory location (i.e., where will it be found at runtime).
7
![Page 8: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/8.jpg)
Symbol Table | Symbol Table Revisit
In short:
• During lexical analysis: symbols are found and added to the symbol table.
• During syntactic analysis: information about each symbol is filled in.
• During semantic analysis: the table is used for type checking.
8
![Page 9: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/9.jpg)
Symbol Table | Symbol Table Important?
Info Provided by Symbol Table
• Given an identifier, which name is it?
• What information is to be associated with a name? (Actual characters of the name, type, storage allocation info (number of bytes), line number where declared, lines where referenced, scope.)
• How do we access this information?
• How do we associate this information with a name?
9
![Page 10: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/10.jpg)
Symbol Table | Reminder on Symbol Table
Note,
• A name can represent
  – Variable
  – Type
  – Constant
  – Parameter
  – Record
  – Record Field
  – Procedure
  – Array
  – Label
  – File
10
![Page 11: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/11.jpg)
Symbol Table
Operations on Symbol Table
• determining whether a string has already been stored
• inserting an entry for a string
• deleting a string when it goes out of scope
This requires three functions:
1. lookup(s): returns the index of the entry for string s, or 0 if there is no entry
2. insert(s): adds a new entry for string s and returns its index
3. delete(s): deletes s from the table (or, typically, hides it)
11
Symbol Table |Operations on Symbol Table
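The three operations above can be sketched over a plain unordered array. This is only an illustration, not the implementation used in the session: the names `st_lookup`, `st_insert` and `st_delete`, the capacity limits and the `hidden` flag are all assumptions made for this sketch. Slot 0 is left unused so that 0 can mean "no entry", as the slide specifies.

```c
#include <string.h>

#define MAX_SYMBOLS 64   /* hypothetical capacity */
#define MAX_NAME    32   /* hypothetical maximum name length, including '\0' */

struct symbol {
    char name[MAX_NAME];
    int  hidden;         /* 1 once the name has gone out of scope */
};

/* Slot 0 is never used, so index 0 can signal "no entry". */
static struct symbol table[MAX_SYMBOLS + 1];
static int count = 0;

/* lookup(s): index of the most recently inserted visible entry for s, or 0. */
int st_lookup(const char *s)
{
    for (int i = count; i >= 1; i--)
        if (!table[i].hidden && strcmp(table[i].name, s) == 0)
            return i;
    return 0;
}

/* insert(s): appends an entry for s and returns its index
   (0 if the table is full or the name is too long). */
int st_insert(const char *s)
{
    if (count == MAX_SYMBOLS || strlen(s) >= MAX_NAME)
        return 0;
    count++;
    strcpy(table[count].name, s);
    table[count].hidden = 0;
    return count;
}

/* delete(s): hides the most recent visible entry for s and returns its
   index, or 0 if there is none. */
int st_delete(const char *s)
{
    int i = st_lookup(s);
    if (i != 0)
        table[i].hidden = 1;
    return i;
}
```

Searching from the newest entry backwards means an inner declaration shadows an outer one with the same name, and hiding it on scope exit makes the outer one visible again. Each lookup is a linear scan, so it costs O(n).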
![Page 12: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/12.jpg)
Symbol Table
Example
    01 PROGRAM Main
    02 GLOBAL a,b
    03 PROCEDURE P (PARAMETER x)
    04 LOCAL a
    05 BEGIN {P}
    06 …a…
    07 …b…
    08 …x…
    09 END {P}
    10 BEGIN {Main}
    11 Call P(a)
    12 END {Main}
12
Symbol Table | Symbol Table Examples
![Page 13: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/13.jpg)
Symbol Table Unsorted List
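| Name | Characteristic Class | Scope | Other Attributes: Declared | Other Attributes: Referenced | Other Attributes: Other |
|------|----------------------|-------|----------------------------|------------------------------|-------------------------|
| Main | Program | 0 | Line 1 | | |
| a | Variable | 0 | Line 2 | Line 11 | |
| b | Variable | 0 | Line 2 | Line 7 | |
| P | Procedure | 0 | Line 3 | Line 11 | 1, parameter, x |
| x | Parameter | 1 | Line 3 | Line 8 | |
| a | Variable | 1 | Line 4 | Line 6 | |

Look up Complexity: O(n)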
13
![Page 14: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/14.jpg)
Symbol Table Sorted List
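Look up Complexity: O(log n)

| Name | Characteristic Class | Scope | Other Attributes: Declared | Other Attributes: Referenced | Other Attributes: Other |
|------|----------------------|-------|----------------------------|------------------------------|-------------------------|
| a | Variable | 0 | Line 2 | Line 11 | |
| a | Variable | 1 | Line 4 | Line 6 | |
| b | Variable | 0 | Line 2 | Line 7 | |
| Main | Program | 0 | Line 1 | | |
| P | Procedure | 0 | Line 3 | Line 11 | 1, parameter, x |
| x | Parameter | 1 | Line 3 | Line 8 | |

Worst Case: O(n)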
14
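The O(log n) lookup on a sorted list comes from binary search. A minimal sketch, assuming the names are kept in an array sorted by `strcmp` order (which, unlike the slide's table, places upper-case names before lower-case ones); `sorted_lookup` is a name chosen for this illustration.

```c
#include <string.h>

/* Binary search over names[0..n-1], which must be sorted in strcmp order.
   Returns the index of name, or -1 if it is not present. */
int sorted_lookup(const char *names[], int n, const char *name)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        int cmp = strcmp(name, names[mid]);
        if (cmp == 0)
            return mid;
        if (cmp < 0)
            hi = mid - 1;   /* name sorts before names[mid] */
        else
            lo = mid + 1;   /* name sorts after names[mid] */
    }
    return -1;
}
```

Each step halves the remaining range, so a lookup needs at most about log2(n) comparisons. Keeping the array sorted is the expensive part: an insertion may have to shift up to n entries.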
![Page 15: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/15.jpg)
Two issues:
1. Interface: how to use symbol tables
2. Implementation: how to implement it.
15
![Page 16: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/16.jpg)
Basic Implementation Techniques
Considerations:
Number of names
Storage space
Retrieval time
16
![Page 17: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/17.jpg)
<1> unordered list (linked list/array)
<2> ordered list
  » binary search on arrays
  » expensive insertion
  (+) good for a fixed set of names (e.g. reserved words, assembly opcodes)
<3> binary search tree
  » On average, searching takes O(log(n)) time.
  » However, names in programs are not chosen randomly.
<4> AVL
<5> Hash table: most common
  (+) constant time
17
![Page 18: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/18.jpg)
Static Tree Table
If symbols are known in advance:
• No insertion and deletion allowed
• Cost of searching symbols of higher frequency should be small.
• Huffman tree and OBST
Fig: Optimal Search Tree when frequency of symbols is specified (nodes: if, do, Read, while)
Fig: Huffman Tree (leaves a, b, c, d, e; branches labelled 0 and 1)
18
![Page 19: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/19.jpg)
Dynamic Tree Tables
• Symbols are inserted as and when they come
• Deletion is also possible
• AVL
Fig: BST with keys 50, 32, 60, 20, 45, 55, 68
19
![Page 20: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/20.jpg)
Part I: Symbol Tables
Symbol Tables: Static and dynamic tree tables, AVL trees, AVL Tree Implementation, Algorithms and analysis of AVL Tree.
Part II: Hash Tables
Hash Tables: Basic Concepts, Hash Function, Hashing methods, Collision resolution, Bucket hashing, Dynamic Hashing.
20
![Page 21: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/21.jpg)
Where Will Hashing Be Used?
1. docDict
2. Database
3. Compilers
4. Network Routers and Servers
5. Substring Search
6. Cryptography
Hash Table| Motivation
21
![Page 22: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/22.jpg)
Motivation
Hashing Methods
Collision Resolution
Symbol Table | Why Hash Table
A Problem?
• We have to store some records and perform the following:
  – add new record
  – delete record
  – search a record by key
Find a way to do these efficiently!
Hashing
22
![Page 23: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/23.jpg)
Use an array to store the records, in unsorted order
1. add - add the record as the last entry: fast, O(1)
2. delete a target - slow at finding the target, fast at filling the hole (just take the last entry): O(n)
3. search - sequential search: slow, O(n)
Hash Table| Unsorted Array
23
![Page 24: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/24.jpg)
Use an array to store the records, keeping them in sorted order
1. add - insert the record in its proper position; much record movement: slow, O(n)
2. delete a target - how to handle the hole after deletion? Much record movement: slow, O(n)
3. search - binary search: fast, O(log n)
Hash Table| Sorted Array
24
![Page 25: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/25.jpg)
Store the records in a linked list (unsorted)
1. add - fast if one can insert the node anywhere: O(1)
2. delete a target - fast at disposing of the node, but slow at finding the target: O(n)
3. search - sequential search: slow, O(n) (if we only use a linked list, we cannot use binary search even if the list is sorted)
Hash Table| Linked List
25
![Page 26: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/26.jpg)
What is the Solution then?
The following approaches have better performance but are more complex:
1. Hash table
2. Tree (BST, Heap, …)
Hash Table| More Approaches
26
![Page 27: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/27.jpg)
Array as table?
Hash Table| More Approaches
27
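| studid | name | score |
|---------|---------|-------|
| 9903030 | tushar | 73 |
| 9802020 | manali | 100 |
| 9801010 | peter | 20 |
| 0056789 | david | 56.8 |
| 0012345 | sandy | 81.5 |
| 0033333 | bubli | 90 |
| ... | ... | ... |
| 9908080 | Namrata | 49 |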
![Page 28: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/28.jpg)
Hash Table| Array as table?
28
Fig: array A with indices 0..9999999 and columns name and score; A[12345] holds (andy, 81.5), A[33333] holds (betty, 90), A[56789] holds (david, 56.8) and A[9908080] holds (bill, 49)
One ‘stupid’ way is to store the records in a huge array (index 0..9999999). The index is used as the student id, i.e. the record of the student with studid 0012345 is stored at A[12345]
![Page 29: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/29.jpg)
Hash Table| Whats Wrong Then?
29
Consider this problem. We want to store 1,000 student records and search them by student id.
One ‘stupid’ way is to store the records in a huge array (index 0..9999999). The index is used as the student id, i.e. the record of the student with studid 0012345 is stored at A[12345]
![Page 30: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/30.jpg)
1. Keys may not be nonnegative integers.
2. Gigantic Memory hog
Hash Table| What's Wrong Then?
30
![Page 31: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/31.jpg)
1. Keys may not be nonnegative integers.
   Solution: Prehash
2. Gigantic memory hog
   Solution: Direct Hash Table (reduce universe of all keys to reasonable size)
Hash Table| What's Wrong Then?
31
![Page 32: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/32.jpg)
• Each slot, or position, corresponds to a key in U.
• If there’s an element x with key k, then T [k] contains a pointer to x.
• Otherwise, T [k] is empty, represented by NIL.
Hash Table| Direct Hashing Table
32
![Page 33: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/33.jpg)
Store the records in a huge array where the index corresponds to the key
• add - very fast O(1)
• delete - very fast O(1)
• search - very fast O(1)
Hash Table| Direct Hashing Table
33
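A direct-address table along these lines might look as follows. This is a sketch under assumptions: the universe size `UNIVERSE`, the `struct record` layout and the `da_` function names are invented for the example, and callers are expected to pass keys inside 0..UNIVERSE-1.

```c
#include <stddef.h>

#define UNIVERSE 10000   /* hypothetical key universe: 0..9999 */

struct record {
    int    key;
    double score;
};

/* T[k] points to the record with key k, or is NULL (NIL) if there is none.
   Static storage starts out as all NULL. */
static struct record *T[UNIVERSE];

/* add: O(1), the key itself is the index. */
void da_insert(struct record *x)
{
    T[x->key] = x;
}

/* delete: O(1), the slot is simply emptied. */
void da_delete(int k)
{
    T[k] = NULL;
}

/* search: O(1), returns NULL when no record has key k. */
struct record *da_search(int k)
{
    return T[k];
}
```

All three operations are a single array access, which is where the O(1) bounds come from. The cost is memory: the array needs one slot per possible key, whether or not that key is ever used.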
![Page 34: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/34.jpg)
Hash Table| Hash function
34
function Hash(key: KeyType): integer;
Imagine that we have such a magic function Hash. It maps the keys (studid) of the 1000 records into the integers 0..999, one to one. No two different keys map to the same number.
H(‘0012345’) = 134
H(‘0033333’) = 67
H(‘0056789’) = 764
…
H(‘9908080’) = 3
![Page 35: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/35.jpg)
Hash Table| Hash Table
35
Fig: hash table with indices 0..999 and columns studid, name and score; slot 3 holds (9908080, bill, 49), slot 67 holds (0033333, betty, 90), slot 134 holds (0012345, andy, 81.5) and slot 764 holds (0056789, david, 56.8)
To store a record, we compute Hash(stud_id) for the record and store it at the location Hash(stud_id) of the array. To search for a student, we only need to peek at the location Hash(target stud_id).
![Page 36: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/36.jpg)
Ex: key mod size, e.g. 2201 mod 1000 = 201
Hash Table| Division Method
36
h(k) = k mod m
![Page 37: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/37.jpg)
Different keys map to the same index, i.e. h(k1) = h(k2) = i with k1 != k2.
Ex: 5 mod 11 and 27 mod 11 both give index 5.
Hash Table| Collision
37
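The division method and the collision example can be checked with a few lines of C. The function names `hash_div` and `collides` are chosen for this sketch only; keys are assumed to be non-negative and the table size m greater than zero.

```c
/* Division method: h(k) = k mod m. */
unsigned hash_div(unsigned k, unsigned m)
{
    return k % m;
}

/* Returns 1 when two different keys map to the same index under
   hash_div, and 0 otherwise. */
int collides(unsigned k1, unsigned k2, unsigned m)
{
    return k1 != k2 && hash_div(k1, m) == hash_div(k2, m);
}
```

With these definitions `hash_div(2201, 1000)` is 201, and `collides(5, 27, 11)` is 1 because both keys land on index 5.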
![Page 38: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/38.jpg)
Hashing
• Widely useful technique for implementing dictionaries
• Constant time per operation (on the average)
• Best Case O(1)
• Worst Case O(n)
Fig: a function f() maps the key of a (Key, Record) pair to an address in a table with slots 0 to 5
38
![Page 39: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/39.jpg)
Characteristics of a Hash Function
• Quick computation
• It should spread keys evenly: uniform distribution
• Avoid collision
  – Very rare cases
  – E.g. birthday paradox
39
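The birthday paradox shows why collisions are hard to avoid even with a uniform hash function. As a worked illustration (the function name `collision_probability` is an assumption of this sketch), the probability that at least two of n keys share a slot among m equally likely slots is 1 - (m/m)((m-1)/m)...((m-n+1)/m):

```c
/* Probability that at least two of n keys share a slot when each key is
   placed independently and uniformly at random into one of m slots. */
double collision_probability(int n, int m)
{
    double all_distinct = 1.0;   /* probability that no two keys collide */
    for (int i = 0; i < n; i++)
        all_distinct *= (double)(m - i) / m;
    return 1.0 - all_distinct;
}
```

For n = 23 and m = 365 the result is a little over 0.5, which is the classic birthday result: a table only about 6% full already has an even chance of containing a collision.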
![Page 40: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/40.jpg)
Hash Functions
• Direct hashing
• Digit extraction
• Modulo-division method
• Mid-square method
• Folding method
40
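Three of the listed methods can be sketched as below. The slides do not fix the details, so the choices here are assumptions made for illustration: digit extraction keeps the 1st, 3rd and 4th digits of a six-digit key, mid-square keeps three digits from the middle of the square, and folding adds up three-digit groups.

```c
/* Digit extraction: keep the 1st, 3rd and 4th digits (from the left)
   of a six-digit key, e.g. 379452 gives 394. */
unsigned hash_digits(unsigned long key)
{
    unsigned d1 = (unsigned)((key / 100000) % 10);
    unsigned d3 = (unsigned)((key / 1000) % 10);
    unsigned d4 = (unsigned)((key / 100) % 10);
    return d1 * 100 + d3 * 10 + d4;
}

/* Mid-square: square the key, drop the last two digits of the square and
   keep the next three, e.g. 1234 * 1234 = 1522756 gives 227. */
unsigned hash_midsquare(unsigned long key)
{
    unsigned long long sq = (unsigned long long)key * key;
    return (unsigned)((sq / 100) % 1000);
}

/* Folding: split the key into three-digit groups from the right, add the
   groups and reduce modulo the table size m,
   e.g. 123456789 gives (789 + 456 + 123) mod 1000 = 368. */
unsigned hash_fold(unsigned long key, unsigned m)
{
    unsigned long sum = 0;
    while (key > 0) {
        sum += key % 1000;
        key /= 1000;
    }
    return (unsigned)(sum % m);
}
```

All three produce an index below 1000 in these examples, so they would suit a table of 1000 slots; a different table size needs a different number of extracted digits.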
![Page 41: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/41.jpg)
1. Hashing with Separate Chaining (open hashing): unlimited space
2. Hashing with Open Addressing (closed hashing)
Hash Table|-Collision Resolution DS
41
![Page 42: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/42.jpg)
Hash Table|-Collision Resolution Strategies
42
• Separate chaining
• Open addressing
  – Linear probing (LP)
    · LP without chaining
    · LP with chaining (LPWC)
      LPWC without replacement
      LPWC with replacement
  – Quadratic probing
  – Double hashing
![Page 43: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/43.jpg)
Hash Table| Chained Hash Table
43
Fig: array with entries up to index HASHMAX; each entry is either nil or points to a linked list of records, e.g. Key: 9903030, name: tom, score: 73
One way to handle collision is to store the collided records in a linked list. The array now stores pointers to such lists. If no key maps to a certain hash value, that array entry points to nil.
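A chained table of this kind could be sketched as follows. The table size, the `struct node` layout and the `chain_` names are assumptions for this example (here `HASHMAX` is used as the number of slots), the hash function is the division method, and keys are assumed to be non-negative.

```c
#include <stdlib.h>

#define HASHMAX 11   /* hypothetical number of slots */

struct node {
    int          key;
    double       score;
    struct node *next;   /* next record that hashed to the same slot */
};

/* chain[i] is NULL (nil) when no key maps to i; static storage starts as NULL. */
static struct node *chain[HASHMAX];

/* Adds a record at the head of its chain.
   Returns 0 on success, -1 if memory could not be allocated. */
int chain_insert(int key, double score)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->score = score;
    n->next = chain[key % HASHMAX];
    chain[key % HASHMAX] = n;
    return 0;
}

/* Walks the chain for key's slot; returns the node, or NULL if absent. */
struct node *chain_search(int key)
{
    for (struct node *n = chain[key % HASHMAX]; n != NULL; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}
```

Keys 5 and 27 both hash to slot 5 with this table size, so they end up in the same list and a search for either walks that list only. The table never fills up, but a long chain degrades a lookup towards O(n).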
![Page 44: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/44.jpg)
Rehashing is required:
• When the table is completely full
• With quadratic probing, when the table is half full
• When insertion fails due to overflow
• Size gets doubled after rehashing
• Mod value is changed to the new size
* Very costly: a new table must be created and every entry of the old table reinserted using the new hash function.
Hash Table| Rehashing
44
![Page 45: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/45.jpg)
Rehashing is more efficient when the load factor is >= 70%,
where the load factor l is
l = h/t, where h is the total number of mapped locations
and t is the total number of locations.
Hash Table| Rehashing
45
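The rehashing rule can be sketched with a small open-addressed table that doubles once the load factor l = h/t reaches 70%. Everything named here (`struct table`, `EMPTY`, `table_init`, `rehash`, `table_insert`) is an assumption of the sketch; it uses linear probing and assumes non-negative keys and an initial size of at least 1.

```c
#include <stdlib.h>

#define EMPTY (-1)   /* hypothetical marker for an unused slot */

struct table {
    int *slot;   /* slot[i] == EMPTY when location i is unused */
    int  size;   /* t: total locations */
    int  used;   /* h: mapped (occupied) locations */
};

/* Allocates size empty slots. Returns 0 on success, -1 on failure. */
int table_init(struct table *t, int size)
{
    t->slot = malloc((size_t)size * sizeof *t->slot);
    if (t->slot == NULL)
        return -1;
    for (int i = 0; i < size; i++)
        t->slot[i] = EMPTY;
    t->size = size;
    t->used = 0;
    return 0;
}

/* Linear-probing placement; the caller guarantees a free slot exists. */
static void place(struct table *t, int key)
{
    int j = key % t->size;
    while (t->slot[j] != EMPTY)
        j = (j + 1) % t->size;
    t->slot[j] = key;
    t->used++;
}

/* Doubles the table and reinserts every key using the new mod value. */
int rehash(struct table *t)
{
    struct table bigger;
    if (table_init(&bigger, 2 * t->size) != 0)
        return -1;
    for (int i = 0; i < t->size; i++)
        if (t->slot[i] != EMPTY)
            place(&bigger, t->slot[i]);
    free(t->slot);
    *t = bigger;
    return 0;
}

/* Inserts key, rehashing first if the load factor h/t has reached 70%. */
int table_insert(struct table *t, int key)
{
    if (10 * t->used >= 7 * t->size && rehash(t) != 0)
        return -1;
    place(t, key);
    return 0;
}
```

Starting from 10 slots, the first seven inserts fit; the eighth finds the load factor at 70%, so the table grows to 20 slots and all seven old keys are reinserted before the new one is placed. That full reinsertion is the cost the slide warns about.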
![Page 46: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/46.jpg)
Types of Linear Probing (with chaining): with and without replacement
Note: Try to solve all the examples taken in class on transparencies and on the board; you can also take them from the book.
46
![Page 47: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/47.jpg)
Extendible Hashing
• All techniques so far are used for small data
• When data becomes bulky there will be too many disk accesses
• So in that case use extendible hashing
• This uses binary (disk) coding to map the locations to binary values.
  – Hash table of size 4 with 4 slots
  – 00
  – 01
  – 10
  – 11
47
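Only the addressing step of extendible hashing is sketched here: picking a directory slot from the binary form of the key. Bucket splitting and directory doubling are left out. Using the low-order bits of the key is an assumption of this sketch (some presentations use the leading bits of a hash value instead), and `dir_index` and `dir_label` are hypothetical names.

```c
/* Directory slot for key when the directory has 2^depth entries:
   the low-order `depth` bits of the key. Assumes depth < 32. */
unsigned dir_index(unsigned key, unsigned depth)
{
    return key & ((1u << depth) - 1u);
}

/* Writes the slot label as `depth` binary digits, most significant
   first, e.g. index 2 at depth 2 gives "10".
   buf must have room for depth + 1 characters. */
void dir_label(unsigned index, unsigned depth, char *buf)
{
    for (unsigned b = 0; b < depth; b++)
        buf[b] = ((index >> (depth - 1 - b)) & 1u) ? '1' : '0';
    buf[depth] = '\0';
}
```

With depth 2 there are the four slots 00, 01, 10 and 11 from the slide; key 6 (binary 110) goes to slot 10. Raising the depth to 3 doubles the directory to eight slots without changing how a key is mapped.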
![Page 48: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/48.jpg)
Implementation:
• The following are some examples of how to create the structure and apply a hash function to it…
1. Linear probing with store and search
2. Double hashing
3. Quadratic probing
48
![Page 49: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/49.jpg)
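Linear Probe

    // Stores key (despite the name); T[j] == 1 marks slot j as occupied.
    // Returns the slot used, or -1 if the table is full.
    int search_LP(int hashtable[], int key, int T[])
    {
        int i, j;
        j = key % MAX;              // mapped loc
        for (i = 0; i < MAX; i++) {
            if (T[j] == 0) {        // found empty
                hashtable[j] = key;
                T[j] = 1;
                return j;
            }
            j = (j + 1) % MAX;      // next loc in circular way
        }
        return -1;
    }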
49
![Page 50: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/50.jpg)
Search in LP
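Only the if condition inside the loop changes:

    for (i = 0; i < MAX; i++) {
        if (T[j] == 1 && hashtable[j] == key)   // occupied slot holding the key
            return j;
        j = (j + 1) % MAX;                      // next loc in circular way
    }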
50
![Page 51: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/51.jpg)
Double hashing
51
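    // f1() and f2() are the two hash functions, assumed to be defined elsewhere.
    // f2(key) must not be a multiple of MAX, or the probe never moves.
    int search_DH(int hashtable[], int key, int T[])
    {
        int i, j, start, u;
        start = f1(key) % MAX;      // 1st mapped loc
        u = f2(key);                // u is used as the increment
        for (i = 0; i < MAX; i++) {
            j = (start + i * u) % MAX;
            if (T[j] == 0) {        // found empty
                hashtable[j] = key;
                T[j] = 1;
                return j;
            }
        }
        return -1;
    }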
![Page 52: A Knowledge Sharing Session on](https://reader035.vdocument.in/reader035/viewer/2022081516/568134e5550346895d9c195b/html5/thumbnails/52.jpg)
Quadratic hashing
52
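    // Stores key using quadratic probing; returns the slot used, or -1
    // if no empty slot is found within MAX probes.
    int search_QP(int hashtable[], int key, int T[])
    {
        int i, j, start;
        start = key % MAX;          // 1st mapped loc
        for (i = 0; i < MAX; i++) {
            j = (start + i * i) % MAX;
            if (T[j] == 0) {        // found empty
                hashtable[j] = key;
                T[j] = 1;
                return j;
            }
        }
        return -1;
    }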
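The three open-addressing schemes differ only in which slot they try on the i-th probe. The sketch below isolates that step so the sequences can be compared side by side. The table size 11 and the second hash function `1 + key % (MAX - 1)` are assumptions made for this example, not taken from the slides; keys are assumed to be non-negative.

```c
#define MAX 11   /* hypothetical table size */

/* Slot tried on the i-th probe (i = 0, 1, 2, ...) with linear probing. */
int probe_linear(int key, int i)
{
    return (key % MAX + i) % MAX;
}

/* Slot tried on the i-th probe with quadratic probing. */
int probe_quadratic(int key, int i)
{
    return (key % MAX + i * i) % MAX;
}

/* Slot tried on the i-th probe with double hashing. The step u comes from
   a second hash function that is never 0, so the probe always moves. */
int probe_double(int key, int i)
{
    int u = 1 + key % (MAX - 1);
    return (key % MAX + i * u) % MAX;
}
```

For key 27 all three start at slot 5. Linear probing then tries 6 and 7, quadratic probing tries 6 and 9, and double hashing (step 8 for this key) tries 2 and 10. Keys that collide at slot 5 follow the same path under the first two schemes but usually different paths under double hashing, because the step depends on the key.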