LECTURE 25: BUCKET SORT & RADIX SORT
CSC 213 – Large Scale Programming
Bucket-Sort
Buckets, B, is an array of Sequences
Sorts Collection, C, in two phases:
  1. Remove each element v from C & add to B[v]
  2. Move elements from each bucket back to C
Bucket-Sort Algorithm
Algorithm bucketSort(Sequence<Integer> C)
  B = new Sequence<Integer>[10]  // & instantiate each Sequence
  // Phase 1
  for each element v in C
    B[v].addLast(v)  // Assumes each number in C between 0 & 9
  endfor
  // Phase 2
  loc = 0
  for each Sequence b in B
    for each element v in b
      C.set(loc, v)
      loc += 1
    endfor
  endfor
  return C
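The pseudocode above can be sketched in runnable Java. This is a minimal sketch, not the course's implementation: it assumes java.util.List stands in for the lecture's Sequence type, and the class name BucketSortSketch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BucketSortSketch {
    // Sorts values in the range 0..9 in place, following the two phases of the pseudocode.
    static List<Integer> bucketSort(List<Integer> c) {
        // One bucket per legal value; each bucket has to be instantiated.
        List<List<Integer>> b = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            b.add(new ArrayList<>());
        }
        // Phase 1: add each element v to bucket B[v].
        for (int v : c) {
            b.get(v).add(v);  // fails with IndexOutOfBoundsException if v is outside 0..9
        }
        // Phase 2: copy the buckets back into C in bucket order.
        int loc = 0;
        for (List<Integer> bucket : b) {
            for (int v : bucket) {
                c.set(loc, v);
                loc += 1;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        List<Integer> c = new ArrayList<>(List.of(7, 1, 3, 7, 1, 0, 9));
        System.out.println(bucketSort(c));  // prints [0, 1, 1, 3, 7, 7, 9]
    }
}
```

Note that no two elements are ever compared with each other; the value itself selects the bucket.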
Bucket Sort Properties
For this to work, values must be legal indices
  Only non-negative integers can be used
  Sorting occurs without comparing objects
BUCKET-SORT is a stable sort
  Stable sorts preserve relative ordering of objects with same value
  (BUBBLE-SORT & MERGE-SORT are other stable sorts)
Bucket Sort Extensions
Use Comparator for BUCKET-SORT
  Get index for v using compare(v, null)
Comparator for booleans could return
  0 when v is false
  1 when v is true
Comparator for US states, could return
  Annual per capita consumption of Jello
  Consumption of jello overall, in cubic feet
  State’s ranking by population
Bucket Sort Extensions
State’s ranking by population
1 California
2 Texas
3 New York
4 Florida
5 Illinois
6 Pennsylvania
7 Ohio
8 Michigan
9 Georgia
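The extension can be sketched in Java as a bucket sort that asks a caller-supplied function for each element's bucket index. The slides obtain the index through compare(v, null) on a Comparator; this sketch swaps in java.util.function.ToIntFunction instead, which is an assumption made for readability. The class name KeyedBucketSortSketch and the RANK map are illustrative, with ranks copied from the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

class KeyedBucketSortSketch {
    // Bucket sort where 'index' maps each element to its bucket (0 .. numBuckets-1).
    static <T> List<T> bucketSort(List<T> c, int numBuckets, ToIntFunction<T> index) {
        List<List<T>> b = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            b.add(new ArrayList<>());
        }
        // Phase 1: drop each element into the bucket its index selects.
        for (T v : c) {
            b.get(index.applyAsInt(v)).add(v);
        }
        // Phase 2: concatenate the buckets in order.
        List<T> sorted = new ArrayList<>();
        for (List<T> bucket : b) {
            sorted.addAll(bucket);
        }
        return sorted;
    }

    // Population ranks as listed on the slide.
    static final Map<String, Integer> RANK = Map.of(
        "California", 1, "Texas", 2, "New York", 3, "Florida", 4, "Illinois", 5,
        "Pennsylvania", 6, "Ohio", 7, "Michigan", 8, "Georgia", 9);

    public static void main(String[] args) {
        List<String> states = List.of("Ohio", "Texas", "Georgia", "California", "Florida");
        // Ranks run 1..9, so 10 buckets are allocated and bucket 0 stays empty.
        System.out.println(bucketSort(states, 10, RANK::get));
        // prints [California, Texas, Florida, Ohio, Georgia]

        // Booleans: false goes to bucket 0, true to bucket 1.
        System.out.println(bucketSort(List.of(true, false, true, false), 2, v -> v ? 1 : 0));
        // prints [false, false, true, true]
    }
}
```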
Bucket Sort Extensions
Extended BUCKET-SORT works with many types
  Limited set of data needed for this to work
  Need way to enumerate values of the set
("enumerate" is a subtle hint)
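One plausible reading of the "enumerate" hint is Java's enum types, whose ordinal() method already numbers the values of a limited set from 0. The sketch below rests on that assumption; the Size enum and the class name EnumBucketSortSketch are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class EnumBucketSortSketch {
    // Hypothetical enum; declaration order defines the sort order.
    enum Size { SMALL, MEDIUM, LARGE }

    // Bucket sort for any enum type: one bucket per enum constant, indexed by ordinal().
    static <E extends Enum<E>> List<E> bucketSort(List<E> c, Class<E> type) {
        int numBuckets = type.getEnumConstants().length;
        List<List<E>> b = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            b.add(new ArrayList<>());
        }
        for (E v : c) {
            b.get(v.ordinal()).add(v);
        }
        List<E> sorted = new ArrayList<>();
        for (List<E> bucket : b) {
            sorted.addAll(bucket);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<Size> sizes = List.of(Size.LARGE, Size.SMALL, Size.MEDIUM, Size.SMALL);
        System.out.println(bucketSort(sizes, Size.class));  // prints [SMALL, SMALL, MEDIUM, LARGE]
    }
}
```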
d-Tuples
Combination of d values such as (k1, k2, …, kd)
  ki is ith dimension of the tuple
A point (x, y, z) is 3-tuple
  x is 1st dimension’s value
  Value of 2nd dimension is y
  z is 3rd dimension’s value
Lexicographic Order
Assume a & b are both d-tuples
  a = (a1, a2, …, ad)
  b = (b1, b2, …, bd)
Can say a < b if and only if
  a1 < b1 OR
  a1 = b1 && (a2, …, ad) < (b2, …, bd)
Order these 2-tuples using previous definition
  (3 4) (7 8) (3 2) (1 4) (4 8)
Ordered result for (3 4) (7 8) (3 2) (1 4) (4 8):
  (1 4) (3 2) (3 4) (4 8) (7 8)
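The recursive definition of a < b amounts to scanning both tuples from the first dimension and deciding at the first position where they differ. A minimal Java sketch follows, assuming tuples are stored as int[] of equal length; the class name LexicographicSketch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LexicographicSketch {
    // Returns a negative number, zero, or a positive number when a is less than,
    // equal to, or greater than b. Assumes a and b have the same length d.
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                // First differing dimension decides the order.
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;  // every dimension matched
    }

    public static void main(String[] args) {
        List<int[]> tuples = new ArrayList<>(List.of(
            new int[]{3, 4}, new int[]{7, 8}, new int[]{3, 2},
            new int[]{1, 4}, new int[]{4, 8}));
        tuples.sort(LexicographicSketch::compare);
        for (int[] t : tuples) {
            System.out.print(Arrays.toString(t) + " ");
        }
        // prints [1, 4] [3, 2] [3, 4] [4, 8] [7, 8]
    }
}
```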
Radix-Sort
Very fast sort for data expressed as d-tuple
  Cheats to win; faster than the O(n log n) lower bound of comparison-based sorting
  Sort performed using d calls to bucket sort
  Sorts least to most important dimension of tuple
Luckily lots of data are d-tuples
  String is d-tuple of char
    “L E T T E R S”
    “L I N G E R S”
  Digits of an int can be used for sorting, also
    1 0 0 1 3 7 2 9
    1 0 0 9 2 2 1 0
Radix-Sort For Integers
Represent int as a d-tuple of digits:
  62₁₀ = 111110₂
  04₁₀ = 000100₂
Decimal digits need 10 buckets to use for sorting
Ordering using their bits needs 2 buckets
O(d∙n) time needed to run RADIX-SORT
  d is length of longest element in input
  In most cases value of d is constant (d = 31 for int)
  Radix sort takes O(n) time, ignoring constant
Radix-Sort In Action
List of 4-bit integers sorted using RADIX-SORT (bit 0 is least significant)

Input   After bit 0   After bit 1   After bit 2   After bit 3
1001    0010          1001          1001          0001
0010    1110          1101          0001          0010
1101    1001          0001          0010          1001
0001    1101          0010          1101          1101
1110    0001          1110          1110          1110
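The passes in this example can be reproduced with a short Java sketch that performs one stable two-bucket pass per bit and prints the list after each pass. The names RadixTraceSketch, pass, and asBits are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RadixTraceSketch {
    // One stable two-bucket pass on the given bit (0 = least significant).
    static List<Integer> pass(List<Integer> c, int bit) {
        List<Integer> zeros = new ArrayList<>();
        List<Integer> ones = new ArrayList<>();
        for (int v : c) {
            if (((v >> bit) & 1) == 0) {
                zeros.add(v);
            } else {
                ones.add(v);
            }
        }
        zeros.addAll(ones);  // bucket 0 first, then bucket 1
        return zeros;
    }

    // Formats each value as a 4-bit string; assumes values are in 0..15.
    static String asBits(List<Integer> c) {
        StringBuilder sb = new StringBuilder();
        for (int v : c) {
            String s = Integer.toBinaryString(v);
            sb.append(String.format("%4s", s).replace(' ', '0')).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        List<Integer> c = List.of(0b1001, 0b0010, 0b1101, 0b0001, 0b1110);
        System.out.println("input: " + asBits(c));
        for (int bit = 0; bit < 4; bit++) {
            c = pass(c, bit);
            System.out.println("bit " + bit + ": " + asBits(c));
        }
        // Final line printed: bit 3: 0001 0010 1001 1101 1110
    }
}
```

Each pass keeps elements with equal bits in their previous relative order, which is why the earlier, less significant passes are not undone by later ones.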
Radix-Sort
Algorithm radixSort(Sequence<Integer> C)
  // Works from least to most significant value
  for bit = 0 to 30
    C = bucketSort(C, bit)  // Sort C using the specified bit
  endfor
  return C
What is big-Oh complexity for Radix-Sort?
  Call in loop uses each element twice
  Loop repeats once per digit to complete sort
Answer to big-Oh complexity for Radix-Sort:
  Call in loop uses each element twice: O(n)
  Loop repeats once per digit to complete sort: O(1) repetitions when d is constant
  O(n) * O(1) = O(n)
  But if loop repeats O(log n) times (?): O(n) * O(log n) = O(n log n)
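A runnable Java sketch of radixSort, with the two-argument bucketSort(C, bit) that the pseudocode calls but never defines. Its body here is an assumption: a stable split into a 0-bucket and a 1-bucket. The sketch also assumes java.util.List in place of Sequence and non-negative inputs, so bits 0 to 30 cover every value; the class name RadixSortSketch is illustrative.

```java
import java.util.ArrayList;
import java.util.List;

class RadixSortSketch {
    // Stable bucket sort on a single bit: two buckets, one per bit value.
    // Each element is touched twice: once going into a bucket, once coming out.
    static List<Integer> bucketSort(List<Integer> c, int bit) {
        List<Integer> zeros = new ArrayList<>();
        List<Integer> ones = new ArrayList<>();
        for (int v : c) {
            if (((v >>> bit) & 1) == 0) {
                zeros.add(v);
            } else {
                ones.add(v);
            }
        }
        List<Integer> out = new ArrayList<>(zeros);
        out.addAll(ones);
        return out;
    }

    // Works from least to most significant bit.
    // Assumes every value is non-negative, so the sign bit (bit 31) is always 0.
    static List<Integer> radixSort(List<Integer> c) {
        for (int bit = 0; bit <= 30; bit++) {
            c = bucketSort(c, bit);
        }
        return c;
    }

    public static void main(String[] args) {
        List<Integer> c = List.of(170, 45, 75, 90, 2, 802, 24, 66);
        System.out.println(radixSort(c));  // prints [2, 24, 45, 66, 75, 90, 170, 802]
    }
}
```

The loop always runs 31 times regardless of n, so the total work is about 31 passes of 2n element moves, which is O(n) with a sizable constant.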
For Next Lecture
Review requirements for program #2
  1st Preliminary deadline is Monday
  Spend time working on this: design saves coding
Reading on Graph ADT for Wednesday
  Note: these have nothing to do with bar charts
  What are mathematical graphs?
  Why are they the basis of everything in CS?