Representing Sets: CSC 172, Spring 2002, Lecture 21
Post on 20-Dec-2015
REPRESENTING SETS
CSC 172
SPRING 2002
LECTURE 21
Representations
List: simple, O(n) dictionary operations
Binary search trees: O(log n) average time; support range queries and sorting
Characteristic vector: O(1) dictionary ops, but limited to small sets
Hash table: O(1) average time for dictionary ops
Characteristic Vectors
Boolean strings whose positions correspond to the members of some fixed “universal” set
A “1” in a location means that the element is in the set; a “0” means that it is not
UNIX file privileges
{user, group, others} x {read, write, execute}
9 possible privileges
Type “ls -l” on UNIX
total 142
-rw-rw-r-- 1 pawlicki none 76 Jun 20 2000 PKG416.desc
-rw-rw-r-- 1 pawlicki none 28906 Jun 20 2000 PKG416.pdf
-rw-rw-r-- 1 pawlicki none 1849 Jun 20 2000 let.1
-rw-rw-r-- 1 pawlicki none 0 Apr 2 13:03 out
-rw-rw-r-- 1 pawlicki none 39891 Jun 20 2000 stapp.uu
UNIX files
The usual order is rwx for each of user (owner), group, and others
So, a protection mode of 110100000 means that the owner may read and write (but not execute), the group can read only, and others cannot even read
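A small sketch of reading such a protection mode as a characteristic vector over the nine privileges. The function name `describe_mode` is made up for illustration; it is not a UNIX call.

```python
def describe_mode(bits: str) -> str:
    """Turn a 9-character 0/1 string into ls-style rwx flags.

    Positions follow the usual order: rwx for user, then group, then others.
    """
    if len(bits) != 9 or set(bits) - {"0", "1"}:
        raise ValueError("expected nine 0/1 characters")
    flags = "rwx" * 3
    # A 1 keeps the privilege letter; a 0 becomes "-", as in ls -l output.
    return "".join(f if b == "1" else "-" for f, b in zip(flags, bits))


print(describe_mode("110100000"))  # rw-r-----
print(describe_mode("110110100"))  # rw-rw-r--
```

The second call gives the mode of the files in the listing above (without the leading file-type character).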
CV advantages
If the universal set is small, sets can be represented by bits packed 32 to a word
Insert, delete, and lookup are O(1) on the proper bit
Union, intersection, and difference are implemented on a word-by-word basis
O(m), where m is the size of the universal set
Small constant factor (1/32)
Fast machine operations (remember the “bits” lab)
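The operations above might look like the following sketch. It assumes the universal set is {0, ..., m-1} and stores the whole vector in one Python integer instead of an array of 32-bit words, so it shows the bit operations but not the word-by-word loop. The class name `BitSet` is made up for illustration.

```python
class BitSet:
    """Set over the universe {0, ..., m-1}, stored as the bits of one integer."""

    def __init__(self, m, bits=0):
        self.m = m
        self.bits = bits

    def _check(self, x):
        if not 0 <= x < self.m:
            raise ValueError("element outside the universal set")

    def insert(self, x):
        self._check(x)
        self.bits |= 1 << x        # set bit x

    def delete(self, x):
        self._check(x)
        self.bits &= ~(1 << x)     # clear bit x

    def lookup(self, x):
        self._check(x)
        return (self.bits >> x) & 1 == 1

    def union(self, other):
        return BitSet(self.m, self.bits | other.bits)

    def intersection(self, other):
        return BitSet(self.m, self.bits & other.bits)

    def difference(self, other):
        return BitSet(self.m, self.bits & ~other.bits)

    def members(self):
        return [x for x in range(self.m) if self.lookup(x)]


a = BitSet(9)
b = BitSet(9)
for x in (0, 1, 3):
    a.insert(x)
for x in (1, 3, 6):
    b.insert(x)

print(a.union(b).members())         # [0, 1, 3, 6]
print(a.intersection(b).members())  # [1, 3]
print(a.difference(b).members())    # [0]
```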
Hashing
A cool way to get from an element x to the place where x can be found
An array [0..B-1] of buckets
Each bucket contains a list of set elements
B = number of buckets
A hash function that takes potential set elements and produces a “random” integer in [0..B-1]
Example
If the set elements are integers then the simplest/best hash function is usually h(x) = x % B
Suppose B = 6 and we wish to store the integers {70, 53, 99, 94, 83, 76, 64, 30}
They belong in the buckets 4, 5, 3, 4, 5, 4, 4, and 0
Note: if B = 7, the buckets are 0, 4, 1, 3, 6, 6, 1, and 2
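The bucket numbers above can be checked with a few lines. The helper name `bucket_of` is made up for illustration.

```python
def bucket_of(x, B):
    """Hash an integer key into one of B buckets: h(x) = x % B."""
    return x % B


keys = [70, 53, 99, 94, 83, 76, 64, 30]
print([bucket_of(x, 6) for x in keys])  # [4, 5, 3, 4, 5, 4, 4, 0]
print([bucket_of(x, 7) for x in keys])  # [0, 4, 1, 3, 6, 6, 1, 2]
```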
Pitfalls of Hash Function Selection
We want to get a uniform distribution of elements into buckets
Beware of data patterns that cause non-uniform distribution
Example
If the integers were all even, then B = 6 would cause only buckets 0, 2, and 4 to fill
If we hashed words in the UNIX dictionary into 10 buckets by length of word, then 20% would go into bucket 7
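A quick way to see the first pitfall is to count how many keys land in each bucket. This sketch uses 60 even integers as made-up sample data and compares B = 6 with B = 7; the helper name `bucket_counts` is an assumption.

```python
from collections import Counter


def bucket_counts(keys, B):
    """Return a list whose entry b is the number of keys with key % B == b."""
    counts = Counter(k % B for k in keys)
    return [counts.get(b, 0) for b in range(B)]


evens = list(range(0, 120, 2))   # 60 even keys: 0, 2, ..., 118
print(bucket_counts(evens, 6))   # [20, 0, 20, 0, 20, 0]
print(bucket_counts(evens, 7))   # [9, 8, 9, 8, 9, 8, 9]
```

With B = 6 half the buckets stay empty; with B = 7 the same keys spread over every bucket.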
Dictionary Operations
Lookup
Go to the head of bucket h(x)
Search the bucket list to see whether x is in the bucket
Insertion: append if not found
Delete: list deletion from the bucket list
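The three operations can be sketched as follows, assuming integer keys, h(x) = x % B, and Python lists standing in for the linked bucket lists. The class name `ChainedHashSet` is made up for illustration.

```python
class ChainedHashSet:
    """Hash table of B buckets; each bucket is a list of the keys that hash to it."""

    def __init__(self, B=7):
        self.B = B
        self.buckets = [[] for _ in range(B)]

    def _bucket(self, x):
        # Go to bucket h(x), with h(x) = x % B.
        return self.buckets[x % self.B]

    def lookup(self, x):
        # Search the bucket list for x.
        return x in self._bucket(x)

    def insert(self, x):
        # Append only if x is not already in its bucket.
        bucket = self._bucket(x)
        if x not in bucket:
            bucket.append(x)

    def delete(self, x):
        # Ordinary list deletion from the bucket list.
        bucket = self._bucket(x)
        if x in bucket:
            bucket.remove(x)


s = ChainedHashSet(B=7)
for key in [70, 53, 99, 94, 83, 76, 64, 30]:
    s.insert(key)
print(s.buckets)  # [[70], [99, 64], [30], [94], [53], [], [83, 76]]
```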
Analysis
If we pick B to be near n, the number of elements in the set, then the average list is O(1) long
Thus, dictionary ops take O(1) time on average
Worst case: all elements go into one bucket, O(n)
Managing Hash Table Size
If n gets as high as 2B, create a new hash table with 2B buckets
“Rehash” every element into the new table: O(n) time total
There were at least n inserts since the last “rehash”
All these inserts took time O(n)
Thus, we “amortize” the cost of rehashing over the inserts since the last rehash: a constant factor, at worst
So, even with rehashing we get O(1) time ops
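A minimal sketch of this doubling rule, again assuming integer keys and h(x) = x % B. The class name `GrowingHashSet` and the `rehashes` counter are made up for illustration.

```python
class GrowingHashSet:
    """Chained hash set that doubles its bucket count when n reaches 2B."""

    def __init__(self, B=4):
        self.B = B
        self.n = 0             # number of stored elements
        self.rehashes = 0      # how many times the table was rebuilt
        self.buckets = [[] for _ in range(B)]

    def lookup(self, x):
        return x in self.buckets[x % self.B]

    def insert(self, x):
        bucket = self.buckets[x % self.B]
        if x in bucket:
            return
        bucket.append(x)
        self.n += 1
        if self.n >= 2 * self.B:
            self._rehash()

    def _rehash(self):
        # Build a table with 2B buckets and move every element: O(n) total.
        old = self.buckets
        self.B *= 2
        self.buckets = [[] for _ in range(self.B)]
        for bucket in old:
            for x in bucket:
                self.buckets[x % self.B].append(x)
        self.rehashes += 1


g = GrowingHashSet(B=4)
for key in range(100):
    g.insert(key)
print(g.B, g.n, g.rehashes)  # 64 100 4
```

Starting from 4 buckets, 100 inserts trigger only 4 rehashes (at n = 8, 16, 32, and 64), which is why the cost spreads out to a constant per insert.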
Collisions
A collision occurs when two values in the set hash to the same value
There are several ways to deal with this
Chaining (using a linked list or some secondary structure)
Open addressing: double hashing or linear probing
Chaining
Buckets for the earlier example with B = 7:
0: 70
1: 99, 64
2: 30
3: 94
4: 53
5: (empty)
6: 83, 76
Very efficient time-wise
Other approaches use less space
Open Addressing
When a collision occurs, if the table is not full, find an available space
Linear probing
Double hashing
Linear Probing
If the current location is occupied, try the next table location
LinearProbingInsert(K) {
    if (table is full) error;
    probe = h(K);
    while (table[probe] is occupied)
        probe = (probe + 1) % M;
    table[probe] = K;
}
Walk along the table until an empty spot is found
Uses less memory than chaining (no links)
Takes more time than chaining (long walks)
Deleting is a pain (mark a slot as having been deleted)
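A runnable sketch of linear probing, including the "mark a slot as deleted" idea. It assumes integer keys and h(K) = K % M; the names `LinearProbingSet`, `EMPTY`, and `DELETED` are made up for illustration.

```python
EMPTY = None
DELETED = object()   # marker left behind by delete so later lookups keep walking


class LinearProbingSet:
    def __init__(self, M=13):
        self.M = M
        self.table = [EMPTY] * M

    def _probes(self, K):
        # Visit h(K), h(K)+1, ... wrapping around, at most M slots in all.
        start = K % self.M
        for i in range(self.M):
            yield (start + i) % self.M

    def lookup(self, K):
        for p in self._probes(K):
            if self.table[p] is EMPTY:
                return False            # a never-used slot ends the walk
            if self.table[p] is not DELETED and self.table[p] == K:
                return True
        return False

    def insert(self, K):
        if self.lookup(K):
            return
        for p in self._probes(K):
            if self.table[p] is EMPTY or self.table[p] is DELETED:
                self.table[p] = K
                return
        raise OverflowError("table is full")

    def delete(self, K):
        for p in self._probes(K):
            if self.table[p] is EMPTY:
                return
            if self.table[p] is not DELETED and self.table[p] == K:
                self.table[p] = DELETED
                return


t = LinearProbingSet(M=13)
for key in [18, 41, 22, 59, 32, 31, 73]:
    t.insert(key)
print(t.table.index(31), t.table.index(73))  # 8 10

t.delete(31)
print(t.lookup(73))  # True: the walk continues past the deleted slot
```

Without the deleted marker, removing 31 would leave an empty slot at 8 and the lookup for 73 would stop there and wrongly report that 73 is absent.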
Linear Probing: h(K) = K % 13
Insert: 18, 41, 22, 59, 32, 31, 73
h(K) : 5, 2, 9, 7, 6, 5, 8
Insert 18: slot 5
Insert 41: slot 2
Insert 22: slot 9
Insert 59: slot 7
Insert 32: slot 6
Insert 31: slots 5, 6, and 7 are occupied, so 31 goes in slot 8
Insert 73: slots 8 and 9 are occupied, so 73 goes in slot 10
Final table:

| Slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|------|---|---|---|---|---|---|---|---|---|---|----|----|----|
| Key  |   |   | 41 |   |   | 18 | 32 | 59 | 31 | 22 | 73 |    |    |
Double Hashing
If the current location is occupied, try another table location
Use two hash functions
If M is prime, eventually will examine every location
DoubleHashInsert(K) {
    if (table is full) error;
    probe = h1(K);
    offset = h2(K);
    while (table[probe] is occupied)
        probe = (probe + offset) % M;
    table[probe] = K;
}
Many of the same (dis)advantages as linear probing
Distributes keys more evenly than linear probing
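A runnable sketch of the insert routine, using the two hash functions from the lecture's worked example, h1(K) = K % 13 and h2(K) = 8 - K % 8. The function name `double_hash_insert` and the use of `None` for an empty slot are assumptions.

```python
def double_hash_insert(table, K):
    """Insert integer K using double hashing; return the slot it lands in."""
    M = len(table)                      # table size, prime in the example (13)
    if all(slot is not None for slot in table):
        raise OverflowError("table is full")
    probe = K % M                       # h1(K): the home slot
    offset = 8 - K % 8                  # h2(K): step size, always in 1..8
    while table[probe] is not None:
        probe = (probe + offset) % M    # jump by the key's own step size
    table[probe] = K
    return probe


table = [None] * 13
slots = [double_hash_insert(table, K) for K in [18, 41, 22, 59, 32, 31, 73]]
print(slots)  # [5, 2, 9, 7, 6, 8, 3]
```

Because the step is never 0 and 13 is prime, the probe sequence reaches every slot before it repeats, so the loop ends whenever a free slot exists.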
Double Hashing: h1(K) = K % 13, h2(K) = 8 - K % 8
Insert: 18, 41, 22, 59, 32, 31, 73
h1(K) : 5, 2, 9, 7, 6, 5, 8
h2(K) : 6, 7, 2, 5, 8, 1, 7
18, 41, 22, 59, and 32 go directly into slots 5, 2, 9, 7, and 6
Insert 31: slot 5 is occupied; stepping by h2 = 1 passes occupied slots 6 and 7 and reaches slot 8
Insert 73: slot 8 is occupied; stepping by h2 = 7 passes occupied slots 2 and 9 and reaches slot 3
Final table:

| Slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|------|---|---|---|---|---|---|---|---|---|---|----|----|----|
| Key  |   |   | 41 | 73 |   | 18 | 32 | 59 | 31 | 22 |    |    |    |