dynamic programming and optimal control€¦ · overview dynamic programming algorithm (dpa)...

16
| | Final Recitation 09.01.2017 Rajan Gill, Weixuan Zhang 1 Dynamic Programming and Optimal Control

Upload: others

Post on 20-Jul-2020

15 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Final Recitation

09.01.2017Rajan Gill, Weixuan Zhang 1

Dynamic Programming and Optimal

Control

Page 2: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Overview

Dynamic Programming Algorithm (DPA)

Deterministic Systems and the Shortest Path (SP)

Infinite Horizon Problems, Stochastic SP

Deterministic Continuous-Time Optimal Control

09.01.2017Rajan Gill, Weixuan Zhang 2

Outline

Page 3: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

|| 09.01.2017Rajan Gill, Weixuan Zhang 3

Overview

Page 4: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Basic Problem

Alternative Problem Formulation

Reformulations

Time lag, correlated disturbances, forecasts, …

09.01.2017Rajan Gill, Weixuan Zhang 4

Dynamic Programming Algorithm (DPA)

Page 5: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Basic idea: Principle of Optimality

Algorithm:

Minimizing the recursion equation for each and gives

us the optimal policy:

09.01.2017Rajan Gill, Weixuan Zhang 5

Dynamic Programming Algorithm (DPA)

Page 6: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Consider now problems where

is a finite set,

No disturbance .

Convert DP to SP (and vice versa)

DP:

SP:

Viterbi Algorithm

09.01.2017Rajan Gill, Weixuan Zhang 6

Deterministic Systems and the Shortest Path

Page 7: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

DP finds all optimal paths to end node. Sometimes not

needed.

Exploit structure of these problems to come up with

efficient algorithms for solving shortest path problems:

09.01.2017Rajan Gill, Weixuan Zhang 7

Deterministic Systems and the Shortest Path

Label Correcting Algorithm

Step 1: Remove a node i from OPEN and for each child j of i,

execute step 2.

Step 2: If di + aij < min{dj,dT}, set dj = di + aij and set i to be the

parent of j. In addition, if j≠T, place j in OPEN if it is not already in

OPEN, while if j=T, set dT to the new value di+aiT.

Step 3: If OPEN is empty, terminate; else go to step 1.

Page 8: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Consider time-invariant system with infinite horizon:

Optimal policy is stationary:

Optimal cost solves Bellman’s equation:

09.01.2017Rajan Gill, Weixuan Zhang 8

Infinite Horizon Problems

Page 9: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Stochastic Shortest Path problems:

Cost-free termination state :

a policy and an integer such that:

Infinite Horizon Problems: Stochastic Shortest Path

09.01.2017Rajan Gill, Weixuan Zhang 9

Page 10: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Value iteration:

Step 1: Choose an initial guess .

Step 2: Update cost values with the value iteration formula:

Step 3: If converged for all , terminate. Else go to step 2.

Infinite Horizon Problems: Stochastic Shortest Path

09.01.2017Rajan Gill, Weixuan Zhang 10

Page 11: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Policy iteration:

Step 1: Choose an initial stationary policy .

Step 2: Policy evaluation (compute cost of current policy):

Step 3: Policy improvement (find a better policy):

Step 4: If for all , terminate. Else go to step 2.

Infinite Horizon Problems: Stochastic Shortest Path

09.01.2017Rajan Gill, Weixuan Zhang 11

(lin. sys. of eq.)

Page 12: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Linear programming:

Optimal cost solves the following linear program:

For each admissible pair we get one linear constraint

Infinite Horizon Problems: Stochastic Shortest Path

09.01.2017Rajan Gill, Weixuan Zhang 12

Page 13: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Discounted problems:

Discounted cost:

Infinite Horizon Problems: Discounted Problems

09.01.2017Rajan Gill, Weixuan Zhang 13

Page 14: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Basic Problem

No noise: deterministic.

Goal: Find an admissible control trajectory , ,

and corresponding state trajectory which minimize

the cost.

Solution is found by HJB or Minimum Principle.

09.01.2017Rajan Gill, Weixuan Zhang 14

Deterministic Continuous-Time Optimal Control

Page 15: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Hamilton-Jacobi-Bellman Equation (cont.-time analog to DPA)

“Derived” by discretizing and taking limits of DPA.

Partial differential equation. Very hard to solve!

Usually guess a solution and proof that is satisfies HJB.

Sufficient condition.

Optimal policy: that minimize RHS of HJB.

09.01.2017Rajan Gill, Weixuan Zhang 15

Deterministic Continuous-Time Optimal Control

Page 16: Dynamic Programming and Optimal Control€¦ · Overview Dynamic Programming Algorithm (DPA) Deterministic Systems and the Shortest Path (SP) Infinite Horizon Problems, Stochastic

||

Minimum Principle (Only finds optimal solution for a specific initial condition )

Define Hamiltonian:

Then:

Only necessary conditions.

Various extensions (e.g. fixed terminal state, …).

09.01.2017Rajan Gill, Weixuan Zhang 16

Deterministic Continuous-Time Optimal Control