aa203 optimal and learning-based controlasl.stanford.edu/aa203/pdfs/lecture/lecture_13.pdf · the...

AA203Optimal and Learning-based Control

Introduction to MPC, persistent feasibility

Roadmap

Optimal control

Open-loop

Indirect methods

Direct methods

Closed-loop

DP HJB / HJI

MPC

Adaptiveoptimal control Model-based RL

Linear methods

Non-linear methods

LQR iLQR DDP LQGAA 203 | Lecture 135/22/19 2

Reachability analysisCoV NOC PMP State/control

paramControlparam

Model predictive control • Model predictive control (or, more broadly, receding horizon

control) entails solving finite-time optimal control problems in a receding horizon fashion

5/22/19 AA 203 | Lecture 13 3

Model predictive control

Key steps:1. At each sampling time 𝑡, solve an open-loop optimal control

problem over a finite horizon2. Apply optimal input signal during the following sampling interval

𝑡, 𝑡 + 13. At the next time step 𝑡 + 1, solve new optimal control problem

based on new measurements of the state over a shifted horizon

5/22/19 AA 203 | Lecture 13 4

Basic formulation

• Consider the problem of regulating to the origin the discrete-time linear invariant system

𝐱 𝑡 + 1 = 𝐴𝐱 𝑡 + 𝐵𝐮 𝑡 , 𝐱 𝑡 ∈ ℝ,, 𝐮 𝑡 ∈ ℝ-

subject to the constraints𝐱 𝑡 ∈ 𝑋, 𝐮 𝑡 ∈ 𝑈, 𝑡 ≥ 0

where the sets 𝑋 and 𝑈 are polyhedra

5/22/19 AA 203 | Lecture 13 5

Basic formulation

• Let 𝑈5→5BC|5∗ ≔ {𝐮5|5∗ , 𝐮5BI|5∗ , … , 𝐮5BCHI|5∗ } be the optimal solution, then

𝐮 𝑡 = 𝐮5|5∗ (𝐱(𝑡))• The optimization problem is then repeated at time 𝑡 + 1, based on

the new state 𝐱5BI|5BI= 𝐱(𝑡 + 1)• Define 𝜋5 𝐱 𝑡 ≔ 𝐮5|5∗ (𝐱(𝑡))• Then the closed-loop system evolves as

𝐱 𝑡 + 1 = 𝐴𝐱 𝑡 + 𝐵𝜋5 𝐱 𝑡 ≔ 𝐟XY(𝐱 𝑡 , 𝑡)• Central question: characterize the behavior of closed-loop system

5/22/19 AA 203 | Lecture 13 8

Simplifying the notation• Note that the setup is time-invariant, hence, to simplify the notation, we

can let 𝑡 = 0 in the finite-time optimal control problem, namely

• Denote 𝑈G∗ 𝐱 𝑡 = {𝐮G∗ , … , 𝐮CHI∗ }5/22/19 AA 203 | Lecture 13 9

𝐽G∗ 𝐱 𝑡 = min𝐮Z,…,𝐮>?@

𝑝 𝐱C +DEFG

CHI

𝑐(𝐱E, 𝐮E)

subject to 𝐱EBI= 𝐴𝐱E + 𝐵𝐮E, 𝑘 = 0,… ,𝑁 − 1𝐱E∈ 𝑋, 𝐮E∈ 𝑈, 𝑘 = 0,… ,𝑁 − 1

𝐱C∈ 𝑋N𝐱G= 𝐱(𝑡)

Simplifying the notation

• With new notation,𝐮 𝑡 = 𝐮G∗ 𝐱 𝑡 = 𝜋(𝐱(𝑡))

and closed-loop system becomes 𝐱 𝑡 + 1 = 𝐴𝐱 𝑡 + 𝐵𝜋 𝐱 𝑡 ≔ 𝐟XY(𝐱 𝑡 )

5/22/19 AA 203 | Lecture 13 10

Typical cost functions

• 2-norm:𝑝 𝐱C = 𝐱C[ 𝑃𝐱C, 𝑐 𝐱E, 𝐮E = 𝐱E[ 𝑄𝐱E+ 𝐮E[ 𝑅𝐮E, 𝑃 ≥ 0, 𝑄 ≥ 0, 𝑅 > 0

• 1-norm or ∞-norm:𝑝 𝐱C = 𝑃𝐱C a 𝑐 𝐱E, 𝐮E = 𝑄𝐱E a+ 𝑅𝐮E a, 𝑝 = 1 or ∞

where 𝑃, 𝑄, 𝑅 are full column ranks

5/22/19 AA 203 | Lecture 13 11

Online model predictive control

repeat

5/22/19 AA 203 | Lecture 13 12

measure the state 𝐱(𝑡) at time instant 𝑡obtain 𝑈G∗ 𝐱 𝑡 by solving finite-time optimal control problemif 𝑈G∗ 𝐱 𝑡 = ∅ then ‘problem infeasible’ stopapply the first element 𝐮G∗ of 𝑈G∗ 𝐱 𝑡 to the systemwait for the new sampling time 𝑡 + 1

Main implementation issues

1. The controller may lead us into a situation where after a few steps the finite-time optimal control problem is infeasible→ persistent feasibility issue

2. Even if the feasibility problem does not occur, the generated control inputs may not lead to trajectories that converge to the origin (i.e., closed-loop system is unstable) → stability issue

Key question: how do we guarantee that such a “short- sighted” strategy leads to effective long-term behavior?

5/22/19 AA 203 | Lecture 13 13

Analysis approaches

1. Analyze closed-loop behavior directly → generally very difficult

2. Derive conditions on terminal function 𝑝, and terminal constraint set 𝑋N so that persistent feasibility and closed-loop stability are guaranteed

5/22/19 AA 203 | Lecture 13 14

Addressing persistent feasibility

Goal: design MPC controller so that feasibility for all future times is guaranteed

Approach: leverage tools from invariant set theory

5/22/19 AA 203 | Lecture 13 15

Set of feasible initial states

• Set of feasible initial states

𝑋G ≔ 𝐱G ∈ 𝑋 ∃ 𝐮G,… , 𝐮CHI such that 𝐱E ∈ 𝑋, 𝐮E ∈ 𝑈, 𝑘 = 0,… ,𝑁 − 1,𝐱C ∈ 𝑋N where 𝐱EBI = 𝐴𝐱E + 𝐵𝐮E, 𝑘 = 0,… ,𝑁 − 1}

• A control input can be found only if 𝐱(0) ∈ 𝑋G!

5/22/19 AA 203 | Lecture 13 16

Controllable sets

• For the autonomous system 𝐱 𝑡 + 1 = 𝜙(𝐱(𝑡))with constraints 𝐱 𝑡 ∈ 𝑋, 𝐮 𝑡 ∈ 𝑈, the one-step controllable set to set 𝑆 is defined as

Pre 𝑆 ≔ {𝐱 ∈ ℝ, ∶ 𝜙 𝐱 ∈ 𝑆}

• For the system 𝐱 𝑡 + 1 = 𝜙 𝐱 𝑡 , 𝐮 𝑡 with constraints 𝐱 𝑡 ∈ 𝑋,𝐮 𝑡 ∈ 𝑈, the one-step controllable set to set 𝑆 is defined as

Pre 𝑆 ≔ {𝐱 ∈ ℝ, ∶ ∃𝑢 ∈ 𝑈 such that 𝜙 𝐱, 𝐮 ∈ 𝑆}

5/22/19 AA 203 | Lecture 13 17

Control invariant sets

• A set 𝐶 ⊆ 𝑋 is said to be a control invariant set for the system 𝐱 𝑡 + 1 = 𝜙 𝐱 𝑡 , 𝐮 𝑡 with constraints 𝐱 𝑡 ∈ 𝑋, 𝐮 𝑡 ∈ 𝑈, if:

𝐱 𝑡 ∈ 𝐶 ⇒ ∃𝐮 ∈ 𝑈 such that 𝜙 𝐱 𝑡 , 𝐮 𝑡 ∈ 𝐶, for all 𝑡

• The set 𝐶v ⊆ 𝑋 is said to be the maximal control invariant set for the system 𝐱 𝑡 + 1 = 𝜙 𝐱 𝑡 , 𝐮 𝑡 with constraints 𝐱 𝑡 ∈ 𝑋,𝐮 𝑡 ∈ 𝑈, if it is control invariant and contains all control invariant sets contained in 𝑋

• These sets can be computed by using the MPT toolbox https://www.mpt3.org/

5/22/19 AA 203 | Lecture 13 18

https://www.mpt3.org/

Persistent feasibility lemma

• Define “truncated” feasibility set:𝑋I ≔ 𝐱I ∈ 𝑋 ∃ 𝐮I,… , 𝐮CHI such that 𝐱E ∈ 𝑋, 𝐮E ∈ 𝑈, 𝑘 = 1,… ,𝑁 − 1,

𝐱C ∈ 𝑋N where 𝐱EBI = 𝐴𝐱E + 𝐵𝐮E, 𝑘 = 1,… ,𝑁 − 1}

• Feasibility lemma: if set 𝑋I is a control invariant set for system:𝐱 𝑡 + 1 = 𝐴𝐱 𝑡 + 𝐵𝐮 𝑡 , 𝐱 𝑡 ∈ 𝑋, 𝐮 𝑡 ∈ 𝑈, 𝑡 ≥ 0

then the MPC law is persistently feasible

5/22/19 AA 203 | Lecture 13 19


• Proof:1. Pre 𝑋I = {𝐱 ∈ ℝ, ∶ ∃𝐮 ∈ 𝑈 such that 𝐴𝐱 + 𝐵𝐮 ∈ 𝑋I}2. Since 𝑋I is control invariant

∀𝐱 ∈ 𝑋I ∃𝐮 ∈ 𝑈 such that 𝐴𝐱 + 𝐵𝐮 ∈ 𝑋I3. Thus 𝑋I ⊆ Pre 𝑋I ∩ 𝑋4. One can write𝑋G = 𝐱G ∈ 𝑋 ∃𝐮G ∈ 𝑈 such that 𝐴𝐱G + 𝐵𝐮 ∈ 𝑋I} = Pre 𝑋I ∩ 𝑋

5. Thus, 𝑋I ⊆ 𝑋G

5/22/19 AA 203 | Lecture 13 20


• Proof:6. Pick some 𝐱G ∈ 𝑋G. Let 𝑈G∗ be the solution to the finite-time

optimization problem, and 𝐮G∗ be the first control. Let𝐱I = 𝐴𝐱G + 𝐵𝐮G∗

7. Since 𝑈G∗ is clearly feasible, one has 𝐱I ∈ 𝑋I. Since 𝑋I ⊆ 𝑋G, one has

𝐱I ∈ 𝑋Ghence the next optimization problem is feasible!

5/22/19 AA 203 | Lecture 13 21

Practical significance

• For 𝑁 = 1, we can set 𝑋N = 𝑋I. If we choose the terminal set to be control invariant, then MPC will be persistently feasible independentof chosen control objectives and parameters• Designer can choose the parameters to affect performance (e.g.,

stability)• How to extend this result to 𝑁 > 1?

5/22/19 AA 203 | Lecture 13 22

Persistent feasibility theorem

• Feasibility theorem: if set 𝑋N is a control invariant set for system:𝐱 𝑡 + 1 = 𝐴𝐱 𝑡 + 𝐵𝐮 𝑡 , 𝐱 𝑡 ∈ 𝑋, 𝐮 𝑡 ∈ 𝑈, 𝑡 ≥ 0

then the MPC law is persistently feasible

5/22/19 AA 203 | Lecture 13 23


• Proof1. Define “truncated” feasibility set at step 𝑁 − 1:

𝑋CHI ≔ 𝐱CHI ∈ 𝑋 ∃ 𝐮CHI such that 𝐱CHI ∈ 𝑋, 𝐮CHI ∈ 𝑈,𝐱C∈ 𝑋N where 𝐱C = 𝐴𝐱CHI + 𝐵𝐮CHI}

2. Due to the terminal constraint𝐴𝐱CHI + 𝐵𝐮CHI = 𝐱C ∈ 𝑋N

5/22/19 AA 203 | Lecture 13 24


• Proof3. Since 𝑋N is a control invariant set, there exists a 𝐮 ∈ 𝑈 such that

𝐱B = 𝐴𝐱C + 𝐵𝐮 ∈ 𝑋N4. The above is indeed the requirement to belong to set 𝑋CHI5. Thus, 𝐴𝐱CHI + 𝐵𝐮CHI = 𝐱C ∈ 𝑋CHI6. We have just proved that 𝑋CHI is control invariant 7. Repeating this argument, one can recursively show that 𝑋CHQ,

𝑋CHO,⋯ , 𝑋I are control invariant, and the persistent feasibility lemma then applies

5/22/19 AA 203 | Lecture 13 25

Practical aspects of persistent feasibility

• The terminal set 𝑋N is introduced artificially for the sole purpose of leading to a sufficient condition for persistent feasibility• We want it to be large so that it does not compromise closed-loop

performance• Though it is simplest to choose 𝑋N = 0, this is generally undesirable • We’ll discuss better choices in the next lecture

5/22/19 AA 203 | Lecture 13 26

Example

• System with 𝑛 = 3 states, 𝑚 = 2 inputs• 𝐴, 𝐵 chosen randomly • quadratic stage cost: 𝑐 𝐱, 𝐮 = 𝐱[𝐱 + 𝐮[𝐮• 𝑋 = 𝐱 𝐱 v ≤ 1}, 𝑈 = {𝐮 | 𝐮 v ≤ 0.5}• initial point: (0.9, −0.9, 0.9)

5/22/19 AA 203 | Lecture 13 27

Next time

• Stability of MPC, explicit MPC, and practical aspects

AA 203 | Lecture 135/22/19 28

aa203 optimal and learning-based controlasl.stanford.edu/aa203/pdfs/lecture/lecture_13.pdf · the...

Documents