performance bounds for constrained linear stochastic...

32
Performance Bounds for Constrained Linear Stochastic Control Yang Wang and Stephen Boyd Electrical Engineering Department Stanford University Workshop on Optimization and Control with Applications, Harbin, June 9 2009

Upload: others

Post on 03-Dec-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Performance Bounds for

Constrained Linear Stochastic Control

Yang Wang and Stephen Boyd

Electrical Engineering DepartmentStanford University

Workshop on Optimization and Control with Applications, Harbin, June 9 2009

Page 2: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Outline

• constrained linear stochastic control problem

• some heuristic control schemes

– projected linear control– control-Lyapunov– certainty-equivalent model predictive control (MPC)

• the linear quadratic case

• performance bound

• performance bound parameter choices for control schemes

• numerical examples

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 1

Page 3: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Linear stochastic system

• linear dynamical system with process noise:

xt+1 = Axt + But + wt, t = 0, 1, . . . ,

– xt ∈ Rn is the state– ut ∈ U is the control input– U ⊂ Rm is the input constraint set, with 0 ∈ U– wt ∈ Rn is zero mean IID process noise, Ewtw

T

t= W

• state feedback control policy:

ut = φ(xt), t = 0, 1, . . . ,

φ : Rn → U is the state feedback function

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 2

Page 4: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Objective

• objective is average stage cost:

J = lim supT→∞

1

TE

T−1∑

t=0

(ℓx(xt) + ℓu(ut))

– ℓx : Rn → R is state stage cost function– ℓu : U → R is the input state cost function

• ℓx, ℓu, U need not be convex

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 3

Page 5: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Stochastic control problem

• stochastic control problem: choose feedback function φ to minimize J

• infinite dimensional nonconvex optimization problem

• problem data:

– dynamics and input matrices A, B– distribution of process noise wt

– state and input cost functions ℓx, ℓu

– input constraint set U

• φ⋆ denotes an optimal feedback function

• J⋆ denotes optimal objective value

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 4

Page 6: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

‘Solution’ via dynamic programming

• find V ⋆ : Rn → R and α with

V ⋆(z) + α = minv∈U

(ℓu(v) + EV ⋆(Az + Bv + wt))

(expectation is over wt)

• optimal feedback function is then

φ⋆(z) = argminv∈U

(ℓu(v) + EV ⋆(Az + Bv + wt))

• optimal value of stochastic control problem is J⋆ = α

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 5

Page 7: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Stochastic control problem

• generally very hard to solve(even more: how would we represent a general function φ?)

• can be effectively solved

– when the problem dimensions are very small, e.g., n = m = 1– when U = Rm and ℓx, ℓu are convex quadratic;

in this case optimal policy is linear: φ⋆(z) = Kz

• many suboptimal methods have been proposed

– can evaluate J for a given φ via Monte Carlo simulation– but how suboptimal is it?

• this talk: an effective method for finding a (good) lower bound on J⋆

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 6

Page 8: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Projected linear state feedback

• a simple suboptimal policy:

φpl(z) = P(Kplz)

– Kpl ∈ Rm×n is a gain matrix (to be chosen)– P is projection onto U

• when U is a box, i.e., U = {u | ‖u‖∞ ≤ Umax}, reduces to saturatedlinear state feedback

φpl(z) = Umaxsat((1/Umax)Kplz)

sat is (entrywise) unit saturation

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 7

Page 9: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Control-Lyapunov policy

• control-Lyapunov policy is

φclf(z) = argminv∈U

(ℓu(v) + EVclf(Az + Bv + wt))

– Vclf : Rn → R (which is to be chosen) is the control-Lyapunovfunction

– when Vclf = V ⋆, this is optimal policy

• when Vclf is quadratic, the control-Lyapunov policy simplifies to

φclf(z) = argminv∈U

(ℓu(v) + Vclf(Az + Bv))

since Ewt = 0, and term involving EwtwT

t= W is constant

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 8

Page 10: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Certainty-equivalent model predictive control (MPC)

• φmpc(z) is found by solving (possibly approximately)

minimize∑

T−1

τ=0 (ℓx(xτ) + ℓu(vτ)) + Vmpc(xT )subject to xτ+1 = Axτ + Bvτ , τ = 0, . . . , T − 1

vτ ∈ U , τ = 0, . . . , T − 1x0 = z

– variables are v0, . . . , vT−1, x0, . . . , xT

– Vmpc : Rn → R is the terminal cost (to be chosen)– T is the planning horizon (also to be chosen)

• let solution be v⋆

0, . . . , v⋆

T−1, x⋆

0, . . . , x⋆

T

• MPC policy is φmpc(z) = v⋆

0

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 9

Page 11: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Parameters in heuristic control policies

• performance of suboptimal policies depends on choice of parameters(Kpl, Vclf, Vmpc and T )

• one choice for Vclf, Vmpc: (quadratic) value function for someunconstrained linear quadratic problem

• one choice for Kpl: optimal gain matrix for some unconstrained linearquadratic problem

• we will suggest some parameters later . . .

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 10

Page 12: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

The performance bound

our method:

• computes a lower bound J lb ≤ J⋆ using convex optimization(hence is tractable)

• bound is computed for each specific problem instance

• (at this time) cannot guarantee tightness of bound

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 11

Page 13: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Judging a heuristic policy

• suppose we have a heuristic policy φ with objective J(evaluated by Monte Carlo, say)

• since J lb ≤ J⋆ ≤ J , if J − Jlb is small, then

– policy φ is nearly optimal– bound J lb is nearly tight

• if J − J lb is big, then for this problem instance, either

– policy is poor, or,– bound is poor (or both)

• examples suggest that J − J lb is often small

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 12

Page 14: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Unconstrained linear quadratic control

• can effectively solve stochastic control problem when

– U = Rm (no constraints)– ℓx(z) = zTQz, ℓu(v) = vTRv, Q � 0, R � 0

• optimal cost is J⋆

lq = Tr(P ⋆

lqW )

• optimal state feedback function is φ⋆(z) = K⋆

lqz, where

K⋆

lq = −(R + BTP ⋆

lqB)−1BTP ⋆

lqA

• P ⋆

lq is positive semidefinite solution of ARE

P ⋆

lq = Q + ATP ⋆

lqA − ATP ⋆

lqB(R + BTP ⋆

lqB)−1BTP ⋆

lqA

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 13

Page 15: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Linear quadratic control via LMI/SDP

• can characterize J⋆

lq and P ⋆

lq via the semidefinite program (SDP)

maximize Tr(PW )

subject to P � 0[

R + BTPB BTPAATPB Q + ATPA − P

]

� 0

– variable is P– optimal point is P = P ⋆

lq; optimal value is J⋆

lq

• solution does not depend on W , as long as W ≻ 0

• constraints are convex in (P, Q,R), so J⋆

lq(Q,R) is a concave functionof (Q,R)

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 14

Page 16: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Basic bound

• suppose Q � 0, R � 0, s satisfy

zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

i.e., quadratic stage costs are everywhere smaller than ℓx + ℓv

• then J⋆

lq(Q, R) + s is a lower bound on J⋆

• follows from monotonicity of stochastic control cost w.r.t. stage costs

• lefthand side is optimal value of unconstrained quadratic problem

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 15

Page 17: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Optimizing the bound

• can optimize the lower bound over Q, R, s by solving

maximize J⋆

lq(Q,R) + s

subject to Q � 0, R � 0,zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

• a convex optimization problem

– objective is concave– constraints are convex– last constraint is convex in Q, R, s for each z and v

• last constraint is semi-infinite, parameterized by the (infinite) setz ∈ Rn, u ∈ U

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 16

Page 18: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Optimizing the bound

• semi-infinite constraint makes problem difficult in general

• can solve exactly in a few cases

• in other cases, can replace semi-infinite constraint with conservativeapproximation, which still gives a lower bound

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 17

Page 19: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Quadratic stage cost and finite input set

• can solve optimization problem exactly when

– ℓx(z) = zTQ0z, ℓu(v) = vTR0v, Q � 0, R � 0– U = {u1, . . . , uK} (finite input constraint set)

• constraint

zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

becomes

Q � Q0, uT

iRui + s ≤ uT

iR0ui, i = 1, . . . , K

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 18

Page 20: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

• to optimize the bound we solve SDP (with variables P , Q, R, s)

maximize Tr(PW ) + s

subject to P � 0, Q � 0, R � 0, Q � Q0[

R + BTPB BTPAATPB Q + ATPA − P

]

� 0

uT

iRui + s ≤ uT

iR0ui, i = 1, . . . , K

• monotone in Q, so we can set Q = Q0 w.l.o.g.

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 19

Page 21: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

S-procedure relaxation

• suppose stage costs are quadratic

• suppose we can find R1, . . . , RM and s1, . . . , sM for which

U ⊆ U = {v | vTRiv + si ≤ 0, i = 1, . . . , M}

• a sufficient condition for

zTQz + vTRv + s ≤ ℓx(z) + ℓu(v) for all z ∈ Rn, v ∈ U

is

zTQz + vTRv + s ≤ zTQ0z + vTR0v for all z ∈ Rn, v ∈ U

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 20

Page 22: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

• equivalent to Q � Q0 and

vTRiv + si ≤ 0, i = 1, . . . , M =⇒ vTRv + s ≤ vTR0v

• which is implied by Q � Q0 and the existence of λ1, . . . , λM ≥ 0 with

R − R0 �M∑

i=1

λiRi, s ≤M∑

i=1

λisi

(by the S-procedure)

• so J⋆

lq(Q,R) + s is a still a lower bound on J⋆

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 21

Page 23: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

• to optimize the bound we solve the SDP

maximize Tr(PW ) + s0

subject to P � 0, Q � 0, R � 0, Q � Q0

[

R + BTPB BTPAATPB Q + ATPA − P

]

� 0

R − R0 �∑

M

i=1 λiRi, s0 ≤∑

M

i=1 λisi

λi ≥ 0, i = 1, . . . , M

with variables P , Q, R, λ1, . . . , λM , s0, . . . , sM

• can set Q = Q0 w.l.o.g.

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 22

Page 24: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Suboptimal control policies

• optimizing the lower bound gives Plb

• can interpret Tr(PlbW ) as optimal cost of an unconstrained quadraticproblem that approximates (and underestimates) our problem

• suggests thatVlb(z) = zTPlbz,

andKlb = −(Rlb + BTPlbB)−1BTPlbA

are good choices of parameters for suboptimal control policies

• examples show this is the case

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 23

Page 25: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Numerical examples

• illustrate bounds for 3 examples

– small problem with trilevel inputs– large problem with box constraints– discretized mechanical control system

• compare lower bound with various heuristic policies

– projected linear state feedback– model predictive control– control-Lyapunov policy

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 24

Page 26: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Small problem with trilevel inputs

• n = 8, m = 2

• A, B matrices randomly generated; A scaled so maxi |λi(A)| = 1

• quadratic stage costs with R0 = I, Q0 = I

• wt ∼ N (0, 0.25I)

• finite input set: U = {−0.2, 0, 0.2}2

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 25

Page 27: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Large problem with box constraints

• n = 30, m = 10

• A, B matrices randomly generated; A scaled so maxi |λi(A)| = 1

• quadratic stage costs with R0 = I, Q0 = I

• wt ∼ N (0, 0.25I)

• box input constraints: U = {v ∈ Rm | ‖v‖∞ ≤ 0.1}

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 26

Page 28: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Discretized mechanical control system

u1 u2

u3

• 6 masses connected by springs; 3 input tensions between masses

• quadratic stage costs with R0 = I, Q0 = I

• wt uniform on [−0.5, 0.5]

• box input constraints: U = {v ∈ Rm | ‖v‖∞ ≤ 0.1}

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 27

Page 29: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Heuristic policies

• projected linear state feedback with Kpl = K⋆

lq

• control-Lyapunov policy with Vclf(z) = zTPlbz

• model predictive control (MPC) with T = 30, Vmpc(z) = zTPlbz

(for trilevel example we solve convex relaxation with u(t) ∈ [−0.2, 0.2],then round value to {−0.2, 0, 0.2})

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 28

Page 30: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Results

small trilevel large random masses

PLSF 12.9 31.3 269.8

CLF 10.8 25.6 61.1

MPC 10.9 25.7 58.9

J lb 9.1 23.8 43.2

• control-Lyapunov with Plb and MPC achieve similar performance

• control-Lyapunov policy can be computed very fast (in tens ofmicroseconds); MPC policy can be computed in milliseconds

• bound Jlb is reasonably close to J for these examples

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 29

Page 31: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

Conclusions

• we’ve shown how to find lower bounds on optimal performance forconstrained linear stochastic control problems

• requires solution of convex optimization problem, hence is tractable

• provides only provable lower bound on optimal performance that we areaware of

• as a by-product, provides excellent choice for quadraticcontrol-Lyapunov function

• in many cases, gives everything you want:

– a provable lower bound on performance– a relatively simple heuristic policy that comes close

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 30

Page 32: Performance Bounds for Constrained Linear Stochastic Controlweb.stanford.edu/~boyd/papers/pdf/stoch_ctrl_bnds_talk.pdfLinear stochastic system • linear dynamical system with process

References

• Wang & Boyd, Performance Bounds for Linear Stochastic Control,Systems Control Letters 58(3):178–182, 2008

• Wang & Boyd, Performance Bounds and Suboptimal Policies for LinearStochastic Control via LMIs, (manuscript)

• Cogill & Lall, Suboptimality Bounds in Stochastic Control: A QueueingExample, Proceedings ACC p1642–1647, 2006

(similar results in MDP setting)

• Lincoln & Rantzer, Relaxing Dynamic Programming, IEEE T-AC51(8):1249-2006, 2006

Workshop on Optimization and Control with Applications, Harbin, June 9 2009 31