Post on 22-Dec-2015
Divide and Conquer
The most well-known algorithm design strategy:
1. Divide the instance of the problem into two or more smaller instances
2. Solve the smaller instances recursively
3. Obtain the solution to the original (larger) instance by combining these solutions
Divide-and-conquer technique
[Figure: a problem of size n is split into subproblem 1 and subproblem 2, each of size n/2; a solution to subproblem 1 and a solution to subproblem 2 are combined into a solution to the original problem.]
Divide and Conquer Examples
Sorting: mergesort and quicksort
Tree traversals
Binary search
Matrix multiplication: Strassen’s algorithm
Convex hull: QuickHull algorithm
General Divide and Conquer recurrence:
$T(n) = aT(n/b) + f(n)$, where $f(n) = \Theta(n^k)$
1. $a < b^k$: $T(n) = \Theta(n^k)$
2. $a = b^k$: $T(n) = \Theta(n^k \lg n)$
3. $a > b^k$: $T(n) = \Theta(n^{\log_b a})$
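The three cases above can be checked mechanically. The following is a minimal sketch, assuming integer a ≥ 1, b > 1 and a non-negative exponent k; the function name `master_theorem` and the output format are illustrative choices, not part of the slides.

```python
import math


def master_theorem(a: int, b: int, k: float) -> str:
    """Classify T(n) = a*T(n/b) + Theta(n^k) using the three cases above."""
    if a < 1 or b <= 1 or k < 0:
        raise ValueError("expects a >= 1, b > 1, k >= 0")
    threshold = b ** k
    if math.isclose(a, threshold):
        return f"Theta(n^{k:g} log n)"           # case 2: a = b^k
    if a < threshold:
        return f"Theta(n^{k:g})"                 # case 1: a < b^k
    return f"Theta(n^{math.log(a, b):.3g})"      # case 3: a > b^k


print(master_theorem(2, 2, 1))  # mergesort: Theta(n^1 log n)
print(master_theorem(1, 2, 0))  # binary search: Theta(n^0 log n), i.e. Theta(log n)
print(master_theorem(7, 2, 2))  # Strassen: Theta(n^2.81)
```

The exponent in case 3 is rounded to three significant digits for display only.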
Mergesort
Algorithm:
Split array A[1..n] in two and make copies of each half in arrays B[1..⌊n/2⌋] and C[1..⌈n/2⌉]
Sort arrays B and C
Merge sorted arrays B and C into array A
Mergesort
Algorithm:
Merge sorted arrays B and C into array A as follows:
– Repeat the following until no elements remain in one of the arrays:
  compare the first elements in the remaining unprocessed portions of the arrays
  copy the smaller of the two into A, while incrementing the index indicating the unprocessed portion of that array
– Once all elements in one of the arrays are processed, copy the remaining unprocessed elements from the other array into A.
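The split, sort, and merge steps described above can be sketched in Python as follows. This is an illustrative version using 0-based Python lists rather than the 1-based arrays of the slides; the function names `merge` and `mergesort` are chosen here for the example.

```python
def merge(b, c, a):
    """Merge sorted lists b and c into a, where len(a) == len(b) + len(c)."""
    i = j = k = 0
    # Repeat until no elements remain in one of the lists.
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            a[k] = b[i]
            i += 1
        else:
            a[k] = c[j]
            j += 1
        k += 1
    # One list is exhausted: copy the unprocessed rest of the other into a.
    if i == len(b):
        a[k:] = c[j:]
    else:
        a[k:] = b[i:]


def mergesort(a):
    """Sort list a in place by splitting, sorting the copies, and merging."""
    if len(a) > 1:
        mid = len(a) // 2
        b, c = a[:mid], a[mid:]   # copies of the two halves
        mergesort(b)
        mergesort(c)
        merge(b, c, a)


data = [7, 2, 1, 6, 4, 9, 5]
mergesort(data)
print(data)  # [1, 2, 4, 5, 6, 7, 9]
```

Using `<=` in the comparison keeps equal elements in their original order, so this sketch is stable.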
How Merging Works
A: 7 2 1 6 4 9 5
B: 7 2 1 6   C: 4 9 5   Split the list
B: 1 2 6 7   C: 4 5 9   Sort each list
Merge the lists
A: 1 2 4 5 6 7 9
Putting it Together
A: 7 2 1 6 4 9 5
B: 7 2 1 6   C: 4 9 5   Split the list
B: 1 2 6 7   C: 4 5 9   Sort each list
Each list is sorted by recursively applying mergesort to the sub-lists
D: 7 2   E: 1 6   F: 4 9   G: 5   Split again
H: 7   I: 2   J: 1   K: 6   L: 4   M: 9   N: 5
Efficiency of mergesort
All cases have the same time efficiency: Θ(n log n)
Number of comparisons is close to the theoretical minimum for comparison-based sorting: log2(n!) ≈ n log2 n - 1.44n
Space requirement: Θ(n) (NOT in-place)
Can be implemented without recursion (bottom-up)
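As an illustration of the non-recursive (bottom-up) variant mentioned above, here is a minimal sketch. It merges runs of width 1, 2, 4, ... between two buffers and returns a new sorted list; the name `mergesort_bottom_up` and the two-buffer layout are assumptions made for this example.

```python
def mergesort_bottom_up(a):
    """Return a sorted copy of a, merging runs of width 1, 2, 4, ... without recursion."""
    n = len(a)
    src, dst = list(a), [None] * n
    width = 1
    while width < n:
        # Merge each pair of adjacent runs src[lo:mid] and src[mid:hi] into dst[lo:hi].
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            i, j = lo, mid
            for k in range(lo, hi):
                if i < mid and (j >= hi or src[i] <= src[j]):
                    dst[k] = src[i]
                    i += 1
                else:
                    dst[k] = src[j]
                    j += 1
        src, dst = dst, src   # the merged runs become the input of the next pass
        width *= 2
    return src


print(mergesort_bottom_up([7, 2, 1, 6, 4, 9, 5]))  # [1, 2, 4, 5, 6, 7, 9]
```

The extra buffer reflects the Θ(n) space requirement noted above.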
Quicksort
Select a pivot (partitioning element)
Rearrange the list so that all the elements in the positions before the pivot are smaller than or equal to the pivot and those after the pivot are larger than the pivot (see algorithm Partition in section 4.2)
Exchange the pivot with the last element in the first (i.e., ≤) sublist; the pivot is now in its final position
Sort the two sublists
[Figure: array layout after partitioning: elements A[i] ≤ p, then the pivot p, then elements A[i] > p]
The partition algorithm
Example: 8 1 12 2 6 10 14 15 4 13 9 11 3 7 5
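The quicksort steps can be sketched as below. This is a simplified single-scan partition that takes the first element as the pivot; it is not necessarily the Partition algorithm of section 4.2 that the slides refer to, and the names `partition` and `quicksort` are chosen for this example.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[lo]; return the pivot's final index."""
    pivot = a[lo]
    last_small = lo  # index of the last element known to be <= pivot
    for i in range(lo + 1, hi + 1):
        if a[i] <= pivot:
            last_small += 1
            a[last_small], a[i] = a[i], a[last_small]
    # Exchange the pivot with the last element of the <= sublist.
    a[lo], a[last_small] = a[last_small], a[lo]
    return last_small


def quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        s = partition(a, lo, hi)   # a[s] is now in its final position
        quicksort(a, lo, s - 1)
        quicksort(a, s + 1, hi)


data = [8, 1, 12, 2, 6, 10, 14, 15, 4, 13, 9, 11, 3, 7, 5]
print(partition(data, 0, len(data) - 1))  # 7: seven elements are <= the pivot 8
quicksort(data)
print(data)
```

On the example list, the pivot 8 lands at index 7 because exactly seven other elements are less than or equal to it.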
Efficiency of quicksort
Best case: split in the middle, Θ(n log n)
Worst case: sorted array! Θ(n^2)
Average case: random arrays, Θ(n log n)
Efficiency of quicksort
Improvements:
– better pivot selection: median-of-three partitioning avoids the worst case in sorted files
– switch to insertion sort on small subfiles
– elimination of recursion
These combine to a 20-25% improvement
Considered the method of choice for internal sorting of large files (n ≥ 10000)
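A rough sketch of how the three improvements listed above might fit together is shown below. The cutoff value of 10, the helper names, and the choice to loop on the larger sublist while recursing only on the smaller one (a partial elimination of recursion) are all assumptions made for illustration; the slides do not specify them.

```python
CUTOFF = 10  # assumed threshold for switching to insertion sort


def median_of_three(a, lo, hi):
    """Move the median of a[lo], a[mid], a[hi] to a[lo] so it serves as the pivot."""
    mid = (lo + hi) // 2
    m = sorted([lo, mid, hi], key=lambda i: a[i])[1]  # index holding the median value
    a[lo], a[m] = a[m], a[lo]


def insertion_sort(a, lo, hi):
    """Sort the small subfile a[lo..hi] in place."""
    for i in range(lo + 1, hi + 1):
        v, j = a[i], i - 1
        while j >= lo and a[j] > v:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v


def quicksort_tuned(a, lo=0, hi=None):
    """Quicksort with median-of-three pivots and an insertion sort cutoff."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        if hi - lo + 1 <= CUTOFF:
            insertion_sort(a, lo, hi)
            return
        median_of_three(a, lo, hi)
        pivot, s = a[lo], lo
        for i in range(lo + 1, hi + 1):
            if a[i] <= pivot:
                s += 1
                a[s], a[i] = a[i], a[s]
        a[lo], a[s] = a[s], a[lo]
        # Recurse on the smaller side and keep looping on the larger side.
        if s - lo < hi - s:
            quicksort_tuned(a, lo, s - 1)
            lo = s + 1
        else:
            quicksort_tuned(a, s + 1, hi)
            hi = s - 1


already_sorted = list(range(2000))
quicksort_tuned(already_sorted)  # sorted input no longer triggers deep recursion
```

Recursing only on the smaller sublist keeps the recursion depth logarithmic in n, which is why the sorted input above does not exhaust the call stack.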
QuickHull Algorithm
Inspired by quicksort, computes the convex hull:
Assume points are sorted by x-coordinate values
Identify extreme points P1 and P2 (part of hull)
[Figure: point set with the extreme points P1 and P2 marked]
QuickHull Algorithm
Compute upper hull:
– find point Pmax that is farthest away from line P1P2
– compute the hull to the left of line P1Pmax
– compute the hull to the right of line P2Pmax
Compute lower hull in a similar manner
[Figure: points P1, P2, and Pmax, with Pmax the point farthest from line P1P2]
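The upper-hull and lower-hull steps can be sketched as follows, assuming points are given as (x, y) tuples. The helper names `cross` and `_hull_side` are hypothetical, and points that lie exactly on a hull edge are left out of the result in this version.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b; positive when b is left of the ray o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _hull_side(p1, p2, points):
    """Hull vertices strictly left of the directed line p1 -> p2, ordered from p1 to p2."""
    left = [p for p in points if cross(p1, p2, p) > 0]
    if not left:
        return []
    # The point with the largest triangle area is the one farthest from line p1p2.
    pmax = max(left, key=lambda p: cross(p1, p2, p))
    return _hull_side(p1, pmax, left) + [pmax] + _hull_side(pmax, p2, left)


def quickhull(points):
    """Return the convex hull vertices in clockwise order, starting at the leftmost point."""
    pts = sorted(set(points))          # sort by x-coordinate (ties broken by y)
    if len(pts) < 3:
        return pts
    p1, p2 = pts[0], pts[-1]           # extreme points, always part of the hull
    upper = _hull_side(p1, p2, pts)    # points above line P1P2
    lower = _hull_side(p2, p1, pts)    # points below line P1P2
    return [p1] + upper + [p2] + lower


print(quickhull([(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]))
# [(0, 0), (0, 4), (4, 4), (4, 0)]
```

Each call to `_hull_side` scans its candidate points once to find Pmax, which is the linear-time step the efficiency analysis relies on.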
Efficiency of QuickHull algorithm
Finding the point farthest away from line P1P2 can be done in linear time
This gives the same efficiency as quicksort:
– Worst case: Θ(n^2)
– Average case: Θ(n log n)
Efficiency of QuickHull algorithm
If points are not initially sorted by x-coordinate value, this can be accomplished in Θ(n log n), with no increase in asymptotic efficiency class
Other algorithms for convex hull:
– Graham’s scan
– DCHull
also in Θ(n log n)
Closest-Pair Problem: Divide and Conquer
Brute force approach requires comparing every point with every other point
Given n points, we must perform 1 + 2 + 3 + … + (n-2) + (n-1) comparisons:
$\sum_{k=1}^{n-1} k = \frac{(n-1)n}{2}$
Brute force: O(n^2)
The Divide and Conquer algorithm yields O(n log n)
Reminder: if n = 1,000,000 then n^2 = 1,000,000,000,000 whereas n log n ≈ 20,000,000
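To make the brute force count concrete, here is a small sketch that compares every pair of points and counts the comparisons as it goes. It assumes points are (x, y) tuples and uses `math.dist` (Python 3.8 or later); the function name is illustrative.

```python
from itertools import combinations
from math import dist


def closest_pair_brute_force(points):
    """Return (smallest distance, number of pairs compared) over all pairs of points."""
    best = float("inf")
    comparisons = 0
    for p, q in combinations(points, 2):
        comparisons += 1
        best = min(best, dist(p, q))
    return best, comparisons


pts = [(i, (i * i) % 7) for i in range(14)]
print(closest_pair_brute_force(pts)[1])  # 91 comparisons, i.e. 13 * 14 / 2
```

For 14 points the counter reaches (n-1)n/2 = 91, matching the summation formula.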
Closest-Pair Algorithm
Given: A set of points in 2-D
Closest-Pair Algorithm
Step 1: Sort the points in one dimension
Let’s sort based on the x-axis
O(n log n) using quicksort or mergesort
[Figure: 14 points, numbered 1 to 14 in order of x-coordinate]
Closest-Pair Algorithm
Step 2: Split the points, i.e., draw a line at the mid-point between points 7 and 8
[Figure: the 14 points with a vertical dividing line between points 7 and 8]
Closest-Pair Algorithm
[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2]
Advantage: Normally, we’d have to compare each of the 14 points with every other point.
(n-1)n/2 = 13*14/2 = 91 comparisons
Closest-Pair Algorithm
[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2]
Advantage: Now, we have two sub-problems of half the size. Thus, we have to do 6*7/2 = 21 comparisons twice, which is 42 comparisons
Closest-Pair Algorithm
[Figure: Sub-Problem 1 with closest distance d1 and Sub-Problem 2 with closest distance d2]
Solution: d = min(d1, d2)
Advantage: With just one split we cut the number of comparisons roughly in half (42 instead of 91). We gain an even greater advantage if we split the sub-problems further.
Closest-Pair Algorithm
[Figure: Sub-Problem 1 with closest distance d1 and Sub-Problem 2 with closest distance d2]
d = min(d1, d2)
Problem: However, what if the two closest points come from different sub-problems?
Closest-Pair Algorithm
Here is an example where we have to compare points from sub-problem 1 to the points in sub-problem 2.
[Figure: points 7 and 8 lie close together on opposite sides of the dividing line]
Closest-Pair Algorithm
However, we only have to compare points inside the following “strip.”
[Figure: a vertical strip around the dividing line between Sub-Problem 1 and Sub-Problem 2]
Closest-Pair Algorithm
[Figure: the strip extends a distance d on each side of the dividing line]
d = min(d1, d2)
Step 3: But we can continue the advantage by splitting the sub-problems.
Closest-Pair Algorithm
Step 3: In fact, we can continue to split until each sub-problem is trivial, i.e., takes one comparison.
Closest-Pair Algorithm
Finally: The solution to each sub-problem is combined until the final solution is obtained
Closest-Pair Algorithm
Finally: On the last step the ‘strip’ will likely be very small. Thus, combining the two largest sub-problems won’t require much work.
Closest-Pair Algorithm
In this example, it takes 22 comparisons to find the closest pair.
The brute force algorithm would have taken 91 comparisons.
But the real advantage occurs when there are millions of points.
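The whole procedure (sort, split, solve each half, then check the strip) can be sketched as below. This is a simplified version under stated assumptions: points are (x, y) tuples, sub-problems of three or fewer points are solved by brute force rather than split down to a single comparison, and the strip is re-sorted by y at every level, which gives O(n log^2 n) rather than the O(n log n) of the fully tuned algorithm. The names `closest_pair` and `_closest` are illustrative.

```python
from math import dist, inf


def closest_pair(points):
    """Return the smallest distance between any two of the given (x, y) points."""
    pts = sorted(points)  # Step 1: sort by x-coordinate
    return _closest(pts)


def _closest(pts):
    n = len(pts)
    if n <= 3:
        # Trivial sub-problem: compare the few remaining points directly.
        return min((dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]),
                   default=inf)
    mid = n // 2
    mid_x = pts[mid][0]  # Step 2: dividing line
    d = min(_closest(pts[:mid]), _closest(pts[mid:]))  # d = min(d1, d2)
    # Only points within distance d of the dividing line can form a closer pair.
    strip = sorted((p for p in pts if abs(p[0] - mid_x) < d), key=lambda p: p[1])
    for i, p in enumerate(strip):
        for q in strip[i + 1:]:
            if q[1] - p[1] >= d:
                break  # remaining strip points are too far away vertically
            d = min(d, dist(p, q))
    return d


print(closest_pair([(0, 0), (5, 5), (1, 0), (9, 9)]))  # 1.0
```

Keeping the strip sorted by y lets the inner loop stop as soon as the vertical gap reaches d, so each strip point is compared with only a few neighbours.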
Closest-Pair Problem: Divide and Conquer
Here is another animation:
http://www.cs.mcgill.ca/~cs251/ClosestPair/ClosestPairApplet/ClosestPairApplet.html
Remember
Homework is due on Friday
– In class and on paper
There is a talk today:
– 4PM RB 340 or 328
– Darren Lim (faculty candidate)
– Bioinformatics: Secondary Structure Prediction in Proteins
Long Term
HW7 will be given on Friday.
It’ll be due on Wed.
It’ll be returned on Friday.
Exam2 will be on Monday the 22nd