CSCI 256 Data Structures and Algorithm Analysis Lecture 13 Some slides by Kevin Wayne copyright 2005, Pearson Addison Wesley all rights reserved, and some by Iker Gondra


Page 1: (title slide)

Page 2:

Algorithmic Paradigms

• Greedy
– Build up a solution incrementally, myopically optimizing some local criterion

• Divide-and-conquer
– Break up a problem into two or more subproblems, solve each subproblem independently, and combine solutions to subproblems to form a solution to the original problem

• Dynamic programming
– Break up a problem into a series of overlapping subproblems, and build up solutions to larger and larger subproblems
– Rather than solving overlapping subproblems again and again, solve each subproblem only once and record the results in a table. This simple idea can easily transform exponential-time algorithms into polynomial-time algorithms
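To make the table idea concrete, here is a minimal Python sketch (an illustration, not from the slides) contrasting a naive exponential recursion with a memoized one for the Fibonacci numbers:

```python
from functools import lru_cache

def fib_naive(n):
    # Exponential time: the same subproblems are recomputed over and over
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Linear number of distinct calls: each subproblem is solved once
    # and its result cached in the lru_cache table
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
```

fib_memo(500) returns essentially instantly, while fib_naive(500) would effectively never finish.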

Page 3:

Algorithmic Paradigms

• Divide-and-conquer strategies usually reduce a polynomial running time to something faster, but still polynomial

• We need something stronger to reduce an exponential brute-force search down to polynomial time; this is where DP comes in

Page 4:

Dynamic Programming (DP)

• DP is like the divide-and-conquer method in that it solves problems by combining the solutions to subproblems

• Divide-and-conquer algorithms partition the problem into independent subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem

• In contrast, DP is applicable when the subproblems are not independent (i.e., subproblems share subsubproblems). In this context, divide-and-conquer does more work than necessary, repeatedly solving the common subsubproblems. DP solves each subproblem just once and saves its answer in a table

Page 5:

A Little Bit of DP History

• Bellman
– Pioneered the systematic study of DP in the 1950s
– DP as a general method for optimizing multistage decision processes

• Etymology
– Dynamic programming = planning over time. Thus the word “programming” in the name of this technique stands for “planning”, not computer programming
– The Secretary of Defense was hostile to mathematical research
– Bellman sought an impressive name to avoid confrontation
• "it's impossible to use dynamic in a pejorative sense"
• "something not even a Congressman could object to"

Reference: Bellman, R. E. Eye of the Hurricane, An Autobiography.

Page 6:

DP: Many Applications

• Areas
– Bioinformatics
– Control theory
– Information theory
– Operations research
– Computer science: theory, graphics, AI, systems, …

• Some famous DP algorithms – we’ll see some of these
– Viterbi for hidden Markov models
– Unix diff for comparing two files
– Smith-Waterman for sequence alignment
– Bellman-Ford for shortest path routing in networks
– Cocke-Kasami-Younger for parsing context-free grammars

Page 7:

Developing DP algorithms

• The development of a DP algorithm can be broken into a sequence of 4 steps

1) Characterize the structure of an optimal solution

2) Recursively define the value of an optimal solution

3) Compute the value of an optimal solution in a bottom-up fashion

4) Construct an optimal solution from computed information

Page 8:

Weighted Interval Scheduling

• Weighted interval scheduling problem
– Job j starts at sj, finishes at fj, and has weight or value vj
– Two jobs are compatible if they don't overlap
– Goal: find a maximum-weight subset of mutually compatible jobs

[Figure: timeline from 0 to 11 showing eight jobs a–h]

Page 9:

Unweighted Interval Scheduling Review

• Recall: Greedy algorithm works if all weights are 1
– Consider jobs in ascending order of finish time
– Add a job to the subset if it is compatible with previously chosen jobs

• Observation: Greedy algorithm can fail spectacularly if arbitrary weights are allowed

[Figure: timeline from 0 to 11 with two overlapping jobs a and b, one with weight 999 and one with weight 1]

Page 10:

Weighted Interval Scheduling

• Notation: Label jobs by finishing time: f1 ≤ f2 ≤ … ≤ fn

• Def: p(j) = largest index i < j such that job i is compatible with job j

• Ex: p(8) = 5, p(7) = 3, p(2) = 0
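Because the jobs are sorted by finish time, each p(j) can be found by a binary search over the finish times. A Python sketch (a hypothetical helper, using 1-indexed arrays with index 0 unused):

```python
from bisect import bisect_right

def compute_p(start, finish):
    # p[j] = largest index i < j with finish[i] <= start[j], or 0 if none.
    # start[1..n] and finish[1..n] describe jobs already sorted by finish time.
    n = len(start) - 1
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        # first position in finish[1..j-1] whose value exceeds start[j],
        # minus one, is the largest compatible job index
        p[j] = bisect_right(finish, start[j], 1, j) - 1
    return p
```

For example, for three jobs with (start, finish) = (0, 3), (1, 5), (3, 6), this gives p = [0, 0, 0, 1]: only job 1 finishes before job 3 starts.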

Time0 1 2 3 4 5 6 7 8 9 10 11

6

7

8

4

3

1

2

5

Page 11:

DP: Binary Choice: do we take job j or not??

• Notation: OPT(j) = value of optimal solution to the problem consisting of job requests 1, 2, ..., j
– Case 1: OPT selects job j
• can't use incompatible jobs { p(j) + 1, p(j) + 2, ..., j - 1 }
• must include optimal solution to problem consisting of remaining compatible jobs 1, 2, ..., p(j)
– Case 2: OPT does not select job j
• must include optimal solution to problem consisting of remaining compatible jobs 1, 2, ..., j - 1

This binary choice gives the optimal substructure: OPT(0) = 0, and OPT(j) = max(vj + OPT(p(j)), OPT(j - 1)) for j ≥ 1

Page 12:

Weighted Interval Scheduling: Brute Force

• Brute force algorithm

Input: n, s1,…,sn, f1,…,fn, v1,…,vn

Sort jobs by finish times so that f1 ≤ f2 ≤ … ≤ fn
Compute p(1), p(2), …, p(n)

Compute-Opt(j) {
   if (j = 0)
      return 0
   else
      return max(vj + Compute-Opt(p(j)), Compute-Opt(j-1))
}
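A direct Python transcription of this pseudocode (a sketch; v and p are assumed 1-indexed with index 0 unused):

```python
def compute_opt(j, v, p):
    # Brute-force recursion: value of the optimal solution for jobs 1..j.
    # Exponential in the worst case, since subproblems are recomputed.
    if j == 0:
        return 0
    return max(v[j] + compute_opt(p[j], v, p),  # Case 1: take job j
               compute_opt(j - 1, v, p))        # Case 2: skip job j
```

For example, with values v = [0, 2, 4, 3] and the p = [0, 0, 0, 1] computed for the three-job instance above, compute_opt(3, v, p) returns 5 (take jobs 1 and 3).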

Page 13:

Claim: Compute-Opt(j) correctly computes Opt(j) for each j = 0, 1, 2, …, n

Proof: How??

Page 14:

Claim: Compute-Opt(j) correctly computes Opt(j) for each j = 0, 1, 2, …, n

Proof: By induction (of course!!) – use strong induction here.

By definition, Opt(0) = 0 (this is the base case).

Assume Compute-Opt(i) correctly computes Opt(i) for all i < j. By induction, Compute-Opt(p(j)) = Opt(p(j)) and Compute-Opt(j-1) = Opt(j-1).

Hence, from the definition of Opt(j),

Opt(j) = max(vj + Opt(p(j)), Opt(j-1)) = max(vj + Compute-Opt(p(j)), Compute-Opt(j-1)) = Compute-Opt(j)

Page 15:

Weighted Interval Scheduling: Brute Force

• Observation: Recursive algorithm fails spectacularly because of redundant subproblems ⇒ exponential algorithm

• Number of recursive calls for the family of "layered" instances below grows like the Fibonacci sequence

[Figure: layered instance with jobs 1–5, where p(1) = 0 and p(j) = j-2, together with the resulting tree of recursive calls, which grows like the Fibonacci sequence]

Page 16:

Recursive algorithm takes exponential time

• Indeed, if we implement Compute-Opt as written, it takes exponential time in the worst case. Our example on the previous slide, which shows the tree of calls, shows that the tree widens very quickly due to the recursive branching. In this case, where p(j) = j-2 for each j = 2, 3, …, Compute-Opt generates separate recursive calls on problems of sizes j-1 and j-2, so the total number of calls grows like the Fibonacci numbers, which increase exponentially!
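A quick way to see this growth (a sketch, assuming the layered instance above) is to count the total number of Compute-Opt invocations directly:

```python
def total_calls(j):
    # Invocations triggered by Compute-Opt(j) on the layered instance
    # where p(1) = 0 and p(j) = j - 2: each call spawns calls on
    # j - 1 and p(j), mirroring the two branches of the recursion.
    if j == 0:
        return 1
    if j == 1:
        return 1 + total_calls(0) + total_calls(0)  # p(1) = 0
    return 1 + total_calls(j - 1) + total_calls(j - 2)
```

The counts for j = 0, 1, 2, 3, 4, 5 come out as 1, 3, 5, 9, 15, 25: a Fibonacci-like recurrence, hence exponential growth.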

Page 17:

Recursive algorithm takes exponential time

• How do we get around this? Note the same computation is occurring over and over again in many recursive calls

Page 18:

Weighted Interval Scheduling: SOLUTION!!! – Memoization

• Memoization: Store results of each subproblem in a cache; lookup and use as needed

Input: n, s1,…,sn, f1,…,fn, v1,…,vn

Sort jobs by finish times so that f1 ≤ f2 ≤ … ≤ fn
Compute p(1), p(2), …, p(n)

for j = 1 to n
   Opt[j] = -1
Opt[0] = 0

M-Compute-Opt(j) {
   if (Opt[j] == -1)
      Opt[j] = max(vj + M-Compute-Opt(p(j)), M-Compute-Opt(j-1))
   return Opt[j]
}
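A Python rendering of the memoized version (a sketch with the same 1-indexed conventions; None plays the role of the slide's -1 "empty" marker):

```python
def m_compute_opt(j, v, p, memo):
    # Memoized recursion: memo[0] = 0, memo[1..n] start as None ("empty").
    # Each entry is computed at most once; later calls just look it up.
    if memo[j] is None:
        memo[j] = max(v[j] + m_compute_opt(p[j], v, p, memo),
                      m_compute_opt(j - 1, v, p, memo))
    return memo[j]
```

Called as m_compute_opt(n, v, p, [0] + [None] * n); on the same small three-job instance as before it returns 5, but now each subproblem is solved only once.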

Page 19:

Intelligent M-Compute-Opt

Use an array M[0,…,n]; M[j] starts with the value "empty"; it will hold the value of M-Compute-Opt(j) as soon as it is determined. To determine Opt(n), invoke M-Compute-Opt(n)

Page 20:

Fill in the array with the Opt values

Opt[j] = max(vj + Opt[p(j)], Opt[j - 1])

[Figure: the array Opt[0…n] filled in left to right with the computed values for the running example]

Page 21:

Weighted Interval Scheduling: Running Time

• Claim: Memoized version of algorithm takes O(n log n) time
– Sort by finish time: O(n log n)
– Computing p(·): O(n) after sorting by finish time
– M-Compute-Opt(j): each invocation takes O(1) time and either
• (i) returns an existing value Opt[j], or
• (ii) fills in one new entry Opt[j] and makes two recursive calls – how many recursive calls in total???

• Progress measure = number of nonempty entries of Opt[]
– initially 0, at most n throughout
– (ii) increases it by 1 ⇒ at most 2n recursive calls

• Overall running time of M-Compute-Opt(n) is O(n)

Page 22:

Weighted Interval Scheduling: Finding a Solution

• Q: DP algorithm computes optimal value. What if we want the solution itself?
– A: Do some post-processing

Run M-Compute-Opt(n)
Run Find-Solution(n)

Find-Solution(j) {
   if (j = 0)
      output nothing
   else if (vj + Opt[p(j)] > Opt[j-1]) {
      print j
      Find-Solution(p(j))
   } else
      Find-Solution(j-1)
}

• # of recursive calls ≤ n ⇒ O(n)
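In Python, the traceback can also be written iteratively (a sketch; opt is the filled table, and v and p are 1-indexed as before):

```python
def find_solution(v, p, opt):
    # Walk back through the table from j = n: take job j exactly when
    # Case 1 (v[j] + opt[p[j]]) beats Case 2 (opt[j-1]).
    chosen = []
    j = len(opt) - 1
    while j > 0:
        if v[j] + opt[p[j]] > opt[j - 1]:
            chosen.append(j)
            j = p[j]       # jump to the last compatible job
        else:
            j -= 1         # job j is not in the optimal solution
    return chosen[::-1]    # report jobs in increasing order
```

On the running three-job example (v = [0, 2, 4, 3], p = [0, 0, 0, 1], opt = [0, 2, 4, 5]) it returns [1, 3].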

Page 23:

Memoization or Iteration over Subproblems

• KEY to this algorithm is the array Opt – it encodes the notion that we are using the value of the optimal solutions to the subproblems on intervals {1, 2, …, j}, and defines Opt[j] based on values earlier in the array; once we have Opt[n] the problem is solved – it contains the value of the optimal solution, and Find-Solution retrieves the actual solution

• NOTE – we can directly compute the entries in Opt by an iterative algorithm rather than memoized recursion

Page 24:

Weighted Interval Scheduling: Iterative Version

Input: n, s1,…,sn, f1,…,fn, v1,…,vn

Sort jobs by finish times so that f1 ≤ f2 ≤ … ≤ fn
Compute p(1), p(2), …, p(n)

Iterative-Compute-Opt {
   Opt[0] = 0
   for j = 1 to n
      Opt[j] = max(vj + Opt[p(j)], Opt[j-1])
}
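Putting all the pieces together, here is a self-contained Python sketch of the iterative algorithm (a hypothetical helper, assuming the input is a list of (start, finish, value) triples):

```python
from bisect import bisect_right

def weighted_interval_scheduling(jobs):
    # jobs: list of (start, finish, value) triples; returns the optimal value.
    jobs = sorted(jobs, key=lambda job: job[1])   # sort by finish time
    n = len(jobs)
    start = [0] + [s for s, f, v in jobs]         # 1-indexed arrays,
    finish = [0] + [f for s, f, v in jobs]        # index 0 unused
    value = [0] + [v for s, f, v in jobs]
    # p[j] = largest i < j with finish[i] <= start[j], via binary search
    p = [bisect_right(finish, start[j], 1, j) - 1 if j else 0
         for j in range(n + 1)]
    # fill the table left to right: Opt[j] = max(vj + Opt[p(j)], Opt[j-1])
    opt = [0] * (n + 1)
    for j in range(1, n + 1):
        opt[j] = max(value[j] + opt[p[j]], opt[j - 1])
    return opt[n]
```

On the greedy-failure instance from earlier (a weight-999 job overlapping a weight-1 job), this correctly returns 999.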

Page 25:

Iteration over Subproblems

• The previous algorithm captures the essence of the dynamic programming technique and serves as a template for the algorithms we develop in the next few lectures: algorithms that iteratively build up solutions to subproblems (though there is always an equivalent way to formulate the algorithm as a memoized recursion)

Page 26:

Basic Properties to be Satisfied for DP Approach

• There are only a polynomial number of subproblems.

• The solution to the original problem can easily be computed from solutions to subproblems (and, as in the case of weighted interval scheduling, may be one of the subproblems).

• There is a natural ordering of subproblems from smallest to largest, with an easy-to-compute recurrence that allows one to determine the solution of a subproblem from the solutions of some number of smaller subproblems.

Subtle issue – the relationship between figuring out the subproblems and then finding the recurrence relation to link them together.