15-211 Fundamental Structures of Computer Science
February 25, 2003
Ananda Guna
Sorting – Part II
Announcements
Work on Homework #4. Due on Monday, March 17, 11:59pm. You should have started by now!
Quiz #2 is Tuesday, Feb. 25. Study the Huffman and LZW algorithms.
Midterm is Tuesday March 4th
Review for the midterm test on Thursday
Master Theorem
THEOREM: The recurrence T(n) = aT(n/b) + cn, T(1) = c,
where a, b, and c are all constants, solves to:
T(n) = Θ(n) if a < b
T(n) = Θ(n log n) if a = b
T(n) = Θ(n^(log_b a)) if a > b
Recurrences
Divide-and-conquer algorithms often lead to recurrences of the following form:
T(n) = a·T(n/b) + cn, T(1) = c
(Here a, b, and c are constants > 0.)
For merge sort: a = b = 2.
What if a = b = 3 for merge sort? How will that affect c?
Solving General Recurrences
We can solve this by repeated substitution method
T(n) = aT(n/b) + cn
     = a(aT(n/b²) + cn/b) + cn
     = a(a(aT(n/b³) + cn/b²) + cn/b) + cn
     = …
     = a^(k+1) T(n/b^(k+1)) + cn[(a/b)^k + … + (a/b)² + (a/b) + 1]
We will solve this in class.
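As a sanity check (our own sketch, not part of the original lecture; the helper name T is illustrative), evaluating the recurrence directly for powers of b reproduces the three Master Theorem regimes:

```python
def T(n, a, b, c):
    """Evaluate T(n) = a*T(n/b) + c*n with T(1) = c (n a power of b)."""
    if n <= 1:
        return c
    return a * T(n // b, a, b, c) + c * n

n = 2 ** 10
print(T(n, 1, 2, 1))   # a < b: Theta(n)       -> 2n - 1       = 2047
print(T(n, 2, 2, 1))   # a = b: Theta(n log n) -> n(log n + 1) = 11264
print(T(n, 4, 2, 1))   # a > b: Theta(n^2)     -> 2n^2 - n     = 2096128
```

The a = b = 2 case is exactly the mergesort recurrence from the previous slide.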
Sorting Recap
Selection sort: always O(n²).
Insertion sort: the total time is O(n + #inversions). This is O(n²) in the worst case and average case, but might be smaller if the file is almost sorted.
Bubble sort: between insertion and selection sort in terms of running time.
These are all easy to code, but O(n²) on average.
We can do better
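As a quick illustration (our own sketch, not from the slides), here is insertion sort instrumented to count shifts; each shift removes exactly one inversion, so the shift count equals the number of inversions, matching the O(n + #inversions) bound above:

```python
def insertion_sort(a):
    """Insertion sort; returns (sorted_list, number_of_shifts)."""
    a = list(a)
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]   # shift one element right; removes one inversion
            shifts += 1
            j -= 1
        a[j + 1] = key
    return a, shifts

# Almost-sorted input: one inversion, so near-linear work.
print(insertion_sort([1, 2, 4, 3, 5]))   # ([1, 2, 3, 4, 5], 1)
```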
Sorting Recap ctd..
Better algorithms:
Heapsort: O(n log n) worst-case. Can be done in-place in an array.
Mergesort: O(n log n) worst-case. Simple divide-and-conquer: split into left and right halves, recursively sort both halves, and then merge the results. Running time is described by the recurrence T(n) = 2T(n/2) + cn, which solves to O(n log n).
Quicksort: O(n²) worst-case but O(n log n) average-case.
• If you always pick the leftmost element as pivot, then this is a lot like inserting into a binary search tree.
• Cost is like the sum of the depths of the nodes.
• How can we avoid the worst case for any data set? Hint: randomize.
Why is quicksort better than mergesort? Hint: a faster inner loop.
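A minimal sketch of randomized quicksort (our own illustration, not from the slides; real implementations partition in place, which is where the fast inner loop comes from):

```python
import random

def quicksort(a):
    """Randomized quicksort: a random pivot makes the O(n^2)
    worst case vanishingly unlikely for any fixed input."""
    if len(a) <= 1:
        return list(a)
    pivot = random.choice(a)
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))   # [1, 1, 2, 3, 4, 5, 6, 9]
```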
Lower Bound for the Sorting Problem
How fast can we sort?
We have seen several sorting algorithms with O(N log N) running time.
Can we do better than N log N?
In fact, Ω(N log N) is a general lower bound for comparison-based sorting.
A proof appears in Weiss.
Informally we can argue as follows…
Decision tree for sorting
[Figure: decision tree for sorting three elements a, b, c. The root compares a and b; each internal node is a comparison (a<b?, b<c?, a<c?) with two branches, and the six leaves are the possible orderings a<b<c, a<c<b, b<a<c, b<c<a, c<a<b, c<b<a.]
N! leaves.
So, the tree has height at least log(N!).
log(N!) = Θ(N log N).
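The claim log(N!) = Θ(N log N) follows from the bounds (N/2)^(N/2) ≤ N! ≤ N^N; a quick numeric check (our own sketch, not from the slides):

```python
import math

def log2_factorial(n):
    """log2(n!) computed as a sum of logs, avoiding overflow of n! itself."""
    return sum(math.log2(i) for i in range(2, n + 1))

n = 1000
lo = (n / 2) * math.log2(n / 2)   # from n! >= (n/2)^(n/2)
hi = n * math.log2(n)             # from n! <= n^n
assert lo <= log2_factorial(n) <= hi
print(log2_factorial(n))
```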
Our lower bound argument
We make the following observations/arguments:
For any two different permutations P1, P2 of the input, the algorithm must at some point make a comparison that causes it to do different things on the two permutations (otherwise, they couldn't both end up sorted).
Each comparison has only two outcomes (it's a YES/NO question).
So there must be some permutation that causes the algorithm to ask at least log(n!) questions.
So we have argued that comparison-based sorting is both O(n log n) and Ω(n log n).
So this is Θ(n log n).
Summary on sorting bound
If we are restricted to comparisons on pairs of elements, then the general lower bound for sorting is Ω(N log N).
A decision tree is a representation of the possible comparisons required to solve a problem.
Bucket Sort
Non-comparison-based sorting
If we can use more than just comparisons of pairs of elements, we can sometimes sort more quickly.
A simple example is bucket sort.
In bucket sort, we require the additional knowledge that all elements are non-negative integers less than a specified maximum value.
Bucket sort
[Figure: bucket sort of the input 1 3 3 1 2 into buckets labeled 1, 2, 3; collecting the buckets yields 1 1 2 3 3.]
Implementing Bucket Sort
Assume all values are in the range 0..k−1 for some small k.
Make an array of k linked lists.
Insert each item into array[item.value()].
Make one pass and collect all items.
This is an O(N + k) algorithm.
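A minimal bucket sort along the lines above (our own sketch; the function name and optional key parameter are illustrative):

```python
def bucket_sort(items, k, key=lambda x: x):
    """Bucket sort for integer keys in 0..k-1; O(N + k) and stable."""
    buckets = [[] for _ in range(k)]   # one list per possible key value
    for item in items:                 # distribute: O(N)
        buckets[key(item)].append(item)
    out = []
    for b in buckets:                  # collect: O(N + k)
        out.extend(b)
    return out

print(bucket_sort([1, 3, 3, 1, 2], 4))   # [1, 1, 2, 3, 3]
```

Because each bucket is appended to in input order, equal keys keep their relative order, which is the stability property discussed below.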
Bucket sort characteristics
Runs in O(N) time.
Easy to implement each bucket as a linked list.
Is stable: if two elements (A, B) are equal with respect to sorting, and they appear in the input in the order (A, B), then they remain in the same order in the output.
Radix Sort
If your integers are in a larger range, then do a bucket sort on each digit.
Start by sorting on the low-order digit using a STABLE bucket sort.
Then sort on the next-lowest digit, and so on.
If the items are b digits long (or b bytes long for strings), then the time to sort N items is O(Nb).
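The passes described above can be sketched as follows (our own code, assuming non-negative integers; each pass is a stable bucket sort on one digit):

```python
def radix_sort(nums, base=10):
    """LSD radix sort: repeatedly bucket-sort on each digit,
    least significant first. Each pass must be stable."""
    if not nums:
        return []
    m, digits = max(nums), 1           # count digit positions b
    while m >= base:
        m //= base
        digits += 1
    for d in range(digits):
        buckets = [[] for _ in range(base)]
        for x in nums:                 # appends keep input order: stable
            buckets[(x // base ** d) % base].append(x)
        nums = [x for b in buckets for x in b]
    return nums

print(radix_sort([32, 224, 16, 15, 31, 169, 123, 252]))
# [15, 16, 31, 32, 123, 169, 224, 252]
```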
Radix sort Example
A sorting algorithm that goes beyond comparison - radix sort.
Input (3-bit numbers): 010 000 101 001 111 011 100 110  (= 2 0 5 1 7 3 4 6)
After a stable sort on bit 0: 010 000 100 110 101 001 111 011
After a stable sort on bit 1: 000 100 101 001 010 110 111 011
After a stable sort on bit 2: 000 001 010 011 100 101 110 111  (= 0 1 2 3 4 5 6 7)
Each sorting step must be stable.
Radix sort characteristics
Each sorting step can be performed via bucket sort, and is thus O(N).
If the numbers are all b bits long, then there are b sorting steps.
Hence, radix sort is O(bN).
Also, a binary variant of radix sort (radix exchange sort) can be implemented in-place, much like quicksort.
Not just for binary numbers
Radix sort can be used for decimal numbers and alphanumeric strings.
Input: 032 224 016 015 031 169 123 252
After a stable sort on the ones digit: 031 032 252 123 224 015 016 169
After a stable sort on the tens digit: 015 016 123 224 031 032 252 169
After a stable sort on the hundreds digit: 015 016 031 032 123 169 224 252
Thursday and Next Week
We will do a review for the midterm on Thursday.
The midterm test is Tuesday, March 4th.
We will post some old exams on Bb.
We will have online office hours next week.
Work on HW4 – ask questions early.