divide and conquer the most well known algorithm design strategy: 1. divide instance of problem into...

Divide and Conquer

The most well-known algorithm design strategy:

1. Divide the instance of the problem into two or more smaller instances

2. Solve the smaller instances recursively

3. Obtain the solution to the original (larger) instance by combining these solutions
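The three steps above can be sketched on a deliberately simple problem, finding the largest element of a non-empty list. The function name `dc_max` and the choice of problem are illustrative assumptions, not part of the slides.

```python
def dc_max(a, lo=0, hi=None):
    """Largest element of the non-empty list a[lo..hi], by divide and conquer."""
    if hi is None:
        hi = len(a) - 1
    if lo == hi:                      # instance of size 1: nothing left to divide
        return a[lo]
    mid = (lo + hi) // 2              # 1. divide into two smaller instances
    left = dc_max(a, lo, mid)         # 2. solve each instance recursively
    right = dc_max(a, mid + 1, hi)
    return max(left, right)           # 3. combine the two solutions
```

A linear scan is just as fast for this problem; the sketch only shows the shape that mergesort and quicksort share.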

Divide-and-conquer technique

[Figure: a problem of size n is split into subproblem 1 of size n/2 and subproblem 2 of size n/2; a solution to subproblem 1 and a solution to subproblem 2 are combined into a solution to the original problem]

Divide and Conquer Examples

Sorting: mergesort and quicksort
Tree traversals
Binary search
Matrix multiplication: Strassen's algorithm
Convex hull: QuickHull algorithm

General Divide and Conquer recurrence:

$T(n) = aT(n/b) + f(n)$, where $f(n) = \Theta(n^k)$

1. $a < b^k$: $T(n) = \Theta(n^k)$

2. $a = b^k$: $T(n) = \Theta(n^k \lg n)$

3. $a > b^k$: $T(n) = \Theta(n^{\log_b a})$
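As a quick check of the three cases, the sketch below classifies a recurrence from its parameters a, b, and k. The function name `master_theorem` and the output strings are illustrative assumptions.

```python
import math

def master_theorem(a: int, b: int, k: int) -> str:
    """Classify T(n) = a*T(n/b) + Theta(n^k) using the three cases above."""
    if a < b ** k:
        return f"Theta(n^{k})"                    # case 1
    if a == b ** k:
        return f"Theta(n^{k} lg n)"               # case 2
    return f"Theta(n^{math.log(a, b):.3f})"       # case 3: exponent is log_b a
```

For mergesort (a = 2, b = 2, k = 1) this gives case 2, i.e. Θ(n lg n); for Strassen's algorithm (a = 7, b = 2, k = 2) it gives case 3 with exponent log₂ 7 ≈ 2.807.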

Mergesort

Algorithm:
Split array A[1..n] in two and make copies of each half in arrays B[1..⌈n/2⌉] and C[1..⌊n/2⌋]
Sort arrays B and C
Merge sorted arrays B and C into array A

Mergesort

Algorithm:
Merge sorted arrays B and C into array A as follows:
– Repeat the following until no elements remain in one of the arrays:
  compare the first elements in the remaining unprocessed portions of the arrays
  copy the smaller of the two into A, while incrementing the index indicating the unprocessed portion of that array
– Once all elements in one of the arrays are processed, copy the remaining unprocessed elements from the other array into A.
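The merge step and the recursive algorithm around it can be sketched as follows. This is a minimal version that returns new lists instead of copying back into A; the names `merge` and `mergesort` are illustrative.

```python
def merge(b, c):
    """Merge two sorted lists b and c into one sorted list."""
    a = []
    i = j = 0                          # indices of the unprocessed portions
    while i < len(b) and j < len(c):   # until one list is exhausted
        if b[i] <= c[j]:               # copy the smaller first element
            a.append(b[i])
            i += 1
        else:
            a.append(c[j])
            j += 1
    a.extend(b[i:])                    # copy whatever remains in the other list
    a.extend(c[j:])
    return a

def mergesort(a):
    """Return a sorted copy of a."""
    if len(a) <= 1:
        return list(a)
    mid = (len(a) + 1) // 2            # first half takes the extra element
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))
```

Using `<=` in the comparison keeps equal elements in their original order, so the sort is stable.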

How Merging Works

A: 7 2 1 6 4 9 5

B: 7 2 1 6   C: 4 9 5   (Split the list)

B: 1 2 6 7   C: 4 5 9   (Sort each list)

Merge the lists

A: 1 2 4 5 6 7 9

Putting it Together

A: 7 2 1 6 4 9 5

B: 7 2 1 6   C: 4 9 5   (Split the list)

B: 1 2 6 7   C: 4 5 9   (Sort each list)
Each list is sorted by recursively applying mergesort to the sub-lists

D: 7 2   E: 1 6   F: 4 9   G: 5   (Split again)

H: 7   I: 2   J: 1   K: 6   L: 4   M: 9   N: 5

Efficiency of mergesort

All cases have the same time efficiency: Θ(n log n)
The number of comparisons is close to the theoretical minimum for comparison-based sorting: log₂(n!) ≈ n log₂ n - 1.44n
Space requirement: Θ(n) (NOT in-place)
Can be implemented without recursion (bottom-up)
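A bottom-up version can be sketched by starting from runs of length one and merging neighbouring runs until a single run remains. This sketch uses the standard library's `heapq.merge` for the merge step instead of a hand-written merge; the function name `mergesort_bottom_up` is an assumption.

```python
from heapq import merge

def mergesort_bottom_up(a):
    """Return a sorted copy of a without using recursion."""
    runs = [[x] for x in a]                 # every element is a sorted run
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):           # merge a pair of neighbouring runs
                merged.append(list(merge(runs[i], runs[i + 1])))
            else:                           # odd run out is carried over as-is
                merged.append(runs[i])
        runs = merged
    return runs[0] if runs else []
```

Each pass halves the number of runs, so there are about log₂ n passes of Θ(n) work each.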

Quicksort

Select a pivot (partitioning element)
Rearrange the list so that all the elements in the positions before the pivot are smaller than or equal to the pivot and those after the pivot are larger than the pivot (see algorithm Partition in section 4.2)
Exchange the pivot with the last element in the first (i.e., ≤) sublist; the pivot is now in its final position
Sort the two sublists

[Figure: p | A[i] ≤ p | A[i] > p]
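The steps above can be sketched as follows, taking the first element as the pivot. The partition here is a simple single-scan variant chosen for brevity; it is not necessarily the Partition algorithm of section 4.2, and the function names are illustrative.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[lo]; return the pivot's final index."""
    pivot = a[lo]
    last_small = lo                    # last index of the "<= pivot" sublist
    for i in range(lo + 1, hi + 1):
        if a[i] <= pivot:
            last_small += 1
            a[last_small], a[i] = a[i], a[last_small]
    # exchange the pivot with the last element of the <= sublist
    a[lo], a[last_small] = a[last_small], a[lo]
    return last_small

def quicksort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        s = partition(a, lo, hi)       # pivot is now in its final position s
        quicksort(a, lo, s - 1)        # sort the two sublists
        quicksort(a, s + 1, hi)
```

On the list 8 1 12 2 6 10 14 15 4 13 9 11 3 7 5, the first call to `partition` places the pivot 8 at index 7, since seven elements are smaller than it.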

The partition algorithm

Example: 8 1 12 2 6 10 14 15 4 13 9 11 3 7 5

Efficiency of quicksort

Best case: split in the middle: Θ(n log n)

Worst case: sorted array!: Θ(n²)

Average case: random arrays: Θ(n log n)

Efficiency of quicksort

Improvements:
– better pivot selection: median-of-three partitioning avoids the worst case in sorted files
– switch to insertion sort on small subfiles
– elimination of recursion
These combine to a 20-25% improvement

Considered the method of choice for internal sorting of large files (n ≥ 10000)
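The first two improvements can be sketched as below. The cutoff of 10 elements is a hypothetical value, not one given in the slides, and the helper names are illustrative. Recursion is kept here for readability.

```python
CUTOFF = 10  # hypothetical threshold for switching to insertion sort

def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place by insertion."""
    for i in range(lo + 1, hi + 1):
        v = a[i]
        j = i - 1
        while j >= lo and a[j] > v:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v

def median_of_three(a, lo, hi):
    """Move the median of a[lo], a[mid], a[hi] into a[lo] to serve as pivot."""
    mid = (lo + hi) // 2
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    a[lo], a[mid] = a[mid], a[lo]      # a[mid] held the median

def quicksort_tuned(a, lo=0, hi=None):
    """Sort a in place using median-of-three pivots and a small-subfile cutoff."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        insertion_sort(a, lo, hi)
        return
    median_of_three(a, lo, hi)
    pivot, s = a[lo], lo
    for i in range(lo + 1, hi + 1):    # single-scan partition around a[lo]
        if a[i] <= pivot:
            s += 1
            a[s], a[i] = a[i], a[s]
    a[lo], a[s] = a[s], a[lo]
    quicksort_tuned(a, lo, s - 1)
    quicksort_tuned(a, s + 1, hi)
```

On an already sorted array the median of the first, middle, and last elements is the true median, so each split lands in the middle instead of producing the worst case.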

QuickHull Algorithm

Inspired by quicksort, compute the convex hull:
Assume points are sorted by x-coordinate values
Identify extreme points P1 and P2 (part of hull)

[Figure: point set with extreme points P1 and P2]

QuickHull Algorithm

Compute upper hull:
– find point Pmax that is farthest away from line P1P2
– compute the hull to the left of line P1Pmax
– compute the hull to the right of line P2Pmax
Compute lower hull in a similar manner

[Figure: point set with P1, P2, and Pmax]
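A minimal sketch of the procedure, assuming points are (x, y) tuples. The sign of a cross product tells which side of a directed line a point lies on, and its magnitude is proportional to the distance from that line. The helper names `cross` and `hull_side` are assumptions for illustration.

```python
def cross(o, a, b):
    """Positive if b lies to the left of the directed line o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_side(p1, p2, pts):
    """Hull vertices strictly left of p1 -> p2, listed in order from p1 to p2."""
    left = [p for p in pts if cross(p1, p2, p) > 0]
    if not left:
        return []
    pmax = max(left, key=lambda p: cross(p1, p2, p))   # farthest from the line
    return hull_side(p1, pmax, left) + [pmax] + hull_side(pmax, p2, left)

def quickhull(points):
    """Convex hull vertices in clockwise order, starting at the leftmost point."""
    pts = sorted(set(points))          # sort by x-coordinate (ties by y)
    if len(pts) < 3:
        return pts
    p1, p2 = pts[0], pts[-1]           # extreme points, always on the hull
    upper = hull_side(p1, p2, pts)     # upper hull
    lower = hull_side(p2, p1, pts)     # lower hull, by the same routine
    return [p1] + upper + [p2] + lower
```

Points lying exactly on a hull edge are left out, because the side test uses a strict inequality.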

Efficiency of QuickHull algorithm

Finding the point farthest away from line P1P2 can be done in linear time
This gives the same efficiency as quicksort:
– Worst case: Θ(n²)
– Average case: Θ(n log n)

Efficiency of QuickHull algorithm

If points are not initially sorted by x-coordinate value, this can be accomplished in Θ(n log n): no increase in asymptotic efficiency class
Other algorithms for convex hull:
– Graham's scan
– DCHull
also in Θ(n log n)

Closest-Pair Problem: Divide and Conquer

Brute force approach requires comparing every point with every other point
Given n points, we must perform 1 + 2 + 3 + … + (n-2) + (n-1) comparisons:

$\sum_{k=1}^{n-1} k = \frac{(n-1)n}{2}$

Brute force: O(n²)
The Divide and Conquer algorithm yields O(n log n)
Reminder: if n = 1,000,000 then n² = 1,000,000,000,000 whereas n log₂ n ≈ 20,000,000
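The brute-force count can be checked with a short sketch that compares every pair once and tallies the comparisons. Points are assumed to be (x, y) tuples, and the function name is illustrative.

```python
from itertools import combinations
from math import dist, inf

def closest_pair_brute(points):
    """Return (smallest distance, number of pairs compared)."""
    best = inf
    comparisons = 0
    for p, q in combinations(points, 2):   # every unordered pair exactly once
        comparisons += 1
        best = min(best, dist(p, q))
    return best, comparisons
```

For n points the counter always ends at (n-1)n/2, the sum given above.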

Closest-Pair Algorithm

Given: A set of points in 2-D

Step 1: Sort the points in one dimension

Let's sort based on the x-axis

O(n log n) using quicksort or mergesort

[Figure: 14 points in the plane, labeled 1 to 14 in order of x-coordinate]


Step 2: Split the points, i.e., draw a line at the mid-point between points 7 and 8

[Figure: the 14 points divided by a line between points 7 and 8 into Sub-Problem 1 and Sub-Problem 2]

Advantage: Normally, we'd have to compare each of the 14 points with every other point.

(n-1)n/2 = 13*14/2 = 91 comparisons

[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2]

Advantage: Now, we have two sub-problems of half the size. Thus, we have to do 6*7/2 = 21 comparisons twice, which is 42 comparisons

[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2, with the closest-pair distances d1 and d2 marked]

solution d = min(d1, d2)

Advantage: With just one split we cut the number of comparisons roughly in half (from 91 to 42). We gain an even greater advantage if we split the sub-problems.

[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2, with d1 and d2 marked]

d = min(d1, d2)

Problem: However, what if the closest two points are each from different sub-problems?

[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2, with d1 and d2 marked]

Here is an example where we have to compare points from sub-problem 1 to the points in sub-problem 2.

[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2, with d1 and d2 marked]

However, we only have to compare points inside the following "strip."

[Figure: the 14 points divided into Sub-Problem 1 and Sub-Problem 2, with d1 and d2 marked and a strip extending a distance d on each side of the dividing line]

d = min(d1, d2)

Step 3: But, we can continue the advantage by splitting the sub-problems.

[Figure: the 14 points, with the sub-problems split further]

Step 3: In fact we can continue to split until each sub-problem is trivial, i.e., takes one comparison.

[Figure: the 14 points]

Finally: The solution to each sub-problem is combined until the final solution is obtained

[Figure: the 14 points]

Finally: On the last step the 'strip' will likely be very small. Thus, combining the two largest sub-problems won't require much work.

[Figure: the 14 points]

Closest-Pair Algorithm

In this example, it takes 22 comparisons to find the closest pair.
The brute force algorithm would have taken 91 comparisons.
But, the real advantage occurs when there are millions of points.
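The walkthrough above can be sketched as follows, assuming points are (x, y) tuples. This simplified version re-sorts the strip by y at every level, which gives O(n log² n) rather than the O(n log n) quoted earlier; reaching that bound requires carrying y-sorted lists through the recursion. Function names are illustrative.

```python
from math import dist, inf

def closest_pair(points):
    """Smallest distance between any two of the given points."""
    return _closest(sorted(points))            # Step 1: sort by x-coordinate

def _closest(px):
    n = len(px)
    if n <= 3:                                 # trivial sub-problem: brute force
        return min((dist(p, q) for i, p in enumerate(px) for q in px[i + 1:]),
                   default=inf)
    mid = n // 2                               # Step 2: split at the middle point
    mid_x = px[mid][0]
    d = min(_closest(px[:mid]), _closest(px[mid:]))   # d = min(d1, d2)
    # Only points within d of the dividing line can form a closer pair.
    strip = sorted((p for p in px if abs(p[0] - mid_x) < d), key=lambda p: p[1])
    for i, p in enumerate(strip):
        for q in strip[i + 1:]:
            if q[1] - p[1] >= d:               # too far apart vertically: stop
                break
            d = min(d, dist(p, q))
    return d
```

The inner loop stops as soon as the vertical gap reaches d, so each strip point is compared with only a handful of neighbours.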

Closest-Pair Problem: Divide and Conquer

Here is another animation: http://www.cs.mcgill.ca/~cs251/ClosestPair/ClosestPairApplet/ClosestPairApplet.html

Remember

Homework is due on Friday
– In class and on paper
There is a talk today:
– 4PM RB 340 or 328
– Darren Lim (faculty candidate)
– Bioinformatics: Secondary Structure Prediction in Proteins

Long Term

HW7 will be given on Friday.
It'll be due on Wed.
It'll be returned on Friday.
Exam2 will be on Monday the 22nd
