AA203: Optimal and Learning-based Control

Introduction to MPC, persistent feasibility

Source: asl.stanford.edu/aa203/pdfs/lecture/lecture_13.pdf (transcript posted 09-Mar-2020)

Roadmap

[Figure: course roadmap. Optimal control splits into open-loop methods (indirect methods: CoV, NOC, PMP; direct methods: state/control parameterization, control parameterization) and closed-loop methods (DP, HJB/HJI, MPC). Other labels in the figure: adaptive optimal control, model-based RL, linear methods (LQR, LQG), non-linear methods (iLQR, DDP), reachability analysis.]

AA 203 | Lecture 13 | 5/22/19

Model predictive control

• Model predictive control (or, more broadly, receding horizon control) entails solving finite-time optimal control problems in a receding horizon fashion

Model predictive control

Key steps:
1. At each sampling time $t$, solve an open-loop optimal control problem over a finite horizon
2. Apply the optimal input signal during the following sampling interval $[t, t+1)$
3. At the next time step $t+1$, solve a new optimal control problem based on new measurements of the state over a shifted horizon

Basic formulation

• Consider the problem of regulating to the origin the discrete-time linear time-invariant system
$$\mathbf{x}(t+1) = A\mathbf{x}(t) + B\mathbf{u}(t), \quad \mathbf{x}(t) \in \mathbb{R}^n, \quad \mathbf{u}(t) \in \mathbb{R}^m$$
subject to the constraints
$$\mathbf{x}(t) \in X, \quad \mathbf{u}(t) \in U, \quad t \ge 0$$
where the sets $X$ and $U$ are polyhedra

Basic formulation

• Assume that a full measurement of the state $\mathbf{x}(t)$ is available at the current time $t$
• The finite-time optimal control problem solved at each stage is
$$J_t^*(\mathbf{x}(t)) = \min_{\mathbf{u}_{t|t}, \dots, \mathbf{u}_{t+N-1|t}} \; p(\mathbf{x}_{t+N|t}) + \sum_{k=0}^{N-1} c(\mathbf{x}_{t+k|t}, \mathbf{u}_{t+k|t})$$
subject to
$$\begin{aligned}
&\mathbf{x}_{t+k+1|t} = A\mathbf{x}_{t+k|t} + B\mathbf{u}_{t+k|t}, && k = 0, \dots, N-1 \\
&\mathbf{x}_{t+k|t} \in X, \quad \mathbf{u}_{t+k|t} \in U, && k = 0, \dots, N-1 \\
&\mathbf{x}_{t+N|t} \in X_f \\
&\mathbf{x}_{t|t} = \mathbf{x}(t)
\end{aligned}$$

Basic formulation

Notation:
• $\mathbf{x}_{t+k|t}$ is the state vector at time $t+k$ predicted at time $t$ (via the system’s dynamics)
• $\mathbf{u}_{t+k|t}$ is the input $\mathbf{u}$ at time $t+k$ computed at time $t$

Note: in general, $\mathbf{x}_{3|1} \neq \mathbf{x}_{3|2}$

Basic formulation

• Let $U^*_{t \to t+N|t} := \{\mathbf{u}^*_{t|t}, \mathbf{u}^*_{t+1|t}, \dots, \mathbf{u}^*_{t+N-1|t}\}$ be the optimal solution, then
$$\mathbf{u}(t) = \mathbf{u}^*_{t|t}(\mathbf{x}(t))$$
• The optimization problem is then repeated at time $t+1$, based on the new state $\mathbf{x}_{t+1|t+1} = \mathbf{x}(t+1)$
• Define $\pi_t(\mathbf{x}(t)) := \mathbf{u}^*_{t|t}(\mathbf{x}(t))$
• Then the closed-loop system evolves as
$$\mathbf{x}(t+1) = A\mathbf{x}(t) + B\pi_t(\mathbf{x}(t)) := \mathbf{f}_{cl}(\mathbf{x}(t), t)$$
• Central question: characterize the behavior of the closed-loop system

Simplifying the notation

• Note that the setup is time-invariant, hence, to simplify the notation, we can let $t = 0$ in the finite-time optimal control problem, namely
$$J_0^*(\mathbf{x}(t)) = \min_{\mathbf{u}_0, \dots, \mathbf{u}_{N-1}} \; p(\mathbf{x}_N) + \sum_{k=0}^{N-1} c(\mathbf{x}_k, \mathbf{u}_k)$$
subject to
$$\begin{aligned}
&\mathbf{x}_{k+1} = A\mathbf{x}_k + B\mathbf{u}_k, && k = 0, \dots, N-1 \\
&\mathbf{x}_k \in X, \quad \mathbf{u}_k \in U, && k = 0, \dots, N-1 \\
&\mathbf{x}_N \in X_f \\
&\mathbf{x}_0 = \mathbf{x}(t)
\end{aligned}$$
• Denote $U_0^*(\mathbf{x}(t)) = \{\mathbf{u}_0^*, \dots, \mathbf{u}_{N-1}^*\}$

Simplifying the notation

• With the new notation,
$$\mathbf{u}(t) = \mathbf{u}_0^*(\mathbf{x}(t)) = \pi(\mathbf{x}(t))$$
and the closed-loop system becomes
$$\mathbf{x}(t+1) = A\mathbf{x}(t) + B\pi(\mathbf{x}(t)) := \mathbf{f}_{cl}(\mathbf{x}(t))$$

Typical cost functions

• 2-norm:
$$p(\mathbf{x}_N) = \mathbf{x}_N^T P \mathbf{x}_N, \quad c(\mathbf{x}_k, \mathbf{u}_k) = \mathbf{x}_k^T Q \mathbf{x}_k + \mathbf{u}_k^T R \mathbf{u}_k, \quad P \succeq 0, \; Q \succeq 0, \; R \succ 0$$
• 1-norm or ∞-norm:
$$p(\mathbf{x}_N) = \|P\mathbf{x}_N\|_p, \quad c(\mathbf{x}_k, \mathbf{u}_k) = \|Q\mathbf{x}_k\|_p + \|R\mathbf{u}_k\|_p, \quad p = 1 \text{ or } \infty$$
where $P$, $Q$, $R$ have full column rank (the subscript $p$ is the norm index, not the terminal cost $p(\cdot)$)
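As a rough illustration of the two cost families above, the following pure-Python sketch evaluates a quadratic stage cost and a 1-norm or ∞-norm stage cost for one state/input pair. The helper names (`mat_vec`, `quadratic_stage_cost`, `norm_stage_cost`) and all numerical values are made up for this example and do not come from the lecture.

```python
def mat_vec(M, v):
    """Matrix-vector product for plain Python lists."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def quadratic_stage_cost(x, u, Q, R):
    """c(x, u) = x^T Q x + u^T R u."""
    xQx = sum(xi * qi for xi, qi in zip(x, mat_vec(Q, x)))
    uRu = sum(ui * ri for ui, ri in zip(u, mat_vec(R, u)))
    return xQx + uRu

def norm_stage_cost(x, u, Q, R, p):
    """c(x, u) = ||Q x||_p + ||R u||_p for p = 1 or p = float('inf')."""
    if p not in (1, float("inf")):
        raise ValueError("p must be 1 or float('inf')")

    def norm(v):
        if p == 1:
            return sum(abs(vi) for vi in v)
        return max(abs(vi) for vi in v)

    return norm(mat_vec(Q, x)) + norm(mat_vec(R, u))

# Hypothetical data: 2 states, 1 input
x, u = [1.0, -2.0], [0.5]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = [[2.0]]

c_quad = quadratic_stage_cost(x, u, Q, R)           # 5.0 + 0.5
c_one = norm_stage_cost(x, u, Q, R, 1)              # 3.0 + 1.0
c_inf = norm_stage_cost(x, u, Q, R, float("inf"))   # 2.0 + 1.0
```

With quadratic costs the finite-time problem is a quadratic program; with 1-norm or ∞-norm costs it can be recast as a linear program, which is one common reason for choosing them.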

Online model predictive control

repeat
    measure the state $\mathbf{x}(t)$ at time instant $t$
    obtain $U_0^*(\mathbf{x}(t))$ by solving the finite-time optimal control problem
    if $U_0^*(\mathbf{x}(t)) = \emptyset$ then ‘problem infeasible’, stop
    apply the first element $\mathbf{u}_0^*$ of $U_0^*(\mathbf{x}(t))$ to the system
    wait for the new sampling time $t+1$
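The loop above can be sketched in a few lines of Python. This is only a toy under stated assumptions: a hypothetical scalar system $x(t+1) = 1.2\,x(t) + u(t)$ with $X = [-1, 1]$, $U = [-0.5, 0.5]$, stage cost $x^2 + u^2$, terminal cost $x^2$, terminal set $X_f = X$, and a brute-force search over a gridded input set instead of a real QP solver. None of these choices come from the lecture.

```python
from itertools import product

A, B = 1.2, 1.0                 # hypothetical scalar dynamics x+ = A x + B u
X_MAX, U_MAX = 1.0, 0.5         # X = [-1, 1], U = [-0.5, 0.5]
N = 3                           # prediction horizon
U_GRID = [i / 10 for i in range(-5, 6)]   # gridded inputs inside U

def solve_ftocp(x0):
    """Brute-force the finite-time problem from x0.
    Returns the best input sequence, or None if no gridded sequence is feasible."""
    if abs(x0) > X_MAX:
        return None
    best_cost, best_seq = float("inf"), None
    for seq in product(U_GRID, repeat=N):
        x, cost, feasible = x0, 0.0, True
        for u in seq:
            cost += x * x + u * u          # stage cost c(x, u)
            x = A * x + B * u              # predicted next state
            if abs(x) > X_MAX:             # state / terminal constraint
                feasible = False
                break
        if feasible and cost + x * x < best_cost:   # add terminal cost p(x_N)
            best_cost, best_seq = cost + x * x, seq
    return best_seq

def run_mpc(x0, steps):
    """Receding-horizon loop: solve, apply the first input, repeat."""
    traj = [x0]
    for _ in range(steps):
        seq = solve_ftocp(traj[-1])
        if seq is None:
            print("problem infeasible")
            break
        traj.append(A * traj[-1] + B * seq[0])   # only the first input is applied
    return traj

trajectory = run_mpc(0.9, steps=15)
```

Only `seq[0]` is ever applied; the rest of the planned sequence is discarded and recomputed at the next sampling time, which is what distinguishes MPC from a single open-loop solve.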

Main implementation issues

1. The controller may lead us into a situation where after a few steps the finite-time optimal control problem is infeasible → persistent feasibility issue

2. Even if the feasibility problem does not occur, the generated control inputs may not lead to trajectories that converge to the origin (i.e., closed-loop system is unstable) → stability issue

Key question: how do we guarantee that such a “short-sighted” strategy leads to effective long-term behavior?
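To make the persistent feasibility issue concrete, here is a minimal sketch on a hypothetical unstable scalar system $x(t+1) = 2x(t) + u(t)$ with $X = [-1, 1]$, $U = [-0.5, 0.5]$, horizon $N = 1$, and terminal set $X_f = X$. The system, the greedy objective (minimize the magnitude of the next state), and the function names are assumptions chosen for illustration.

```python
A, B = 2.0, 1.0                 # hypothetical unstable scalar system
X_LO, X_HI = -1.0, 1.0          # state constraint set X (also used as X_f)
U_LO, U_HI = -0.5, 0.5          # input constraint set U

def one_step_mpc(x):
    """Horizon-1 problem: minimize |A x + B u| s.t. u in U, A x + B u in X.
    Returns the input, or None if the problem is infeasible (assumes B > 0)."""
    lo = max(U_LO, (X_LO - A * x) / B)
    hi = min(U_HI, (X_HI - A * x) / B)
    if lo > hi:
        return None
    return min(max(-A * x / B, lo), hi)   # unconstrained minimizer, clipped

def first_infeasible_time(x0, steps=20):
    """Run the closed loop; return the first time step at which the
    optimization problem is infeasible, or None if it never is."""
    x = x0
    for t in range(steps):
        u = one_step_mpc(x)
        if u is None:
            return t
        x = A * x + B * u
    return None

t_fail = first_infeasible_time(0.7)   # feasible at t = 0, infeasible at t = 1
t_ok = first_infeasible_time(0.4)     # remains feasible
```

Starting from $x(0) = 0.7$ the first problem is feasible, but the best admissible input only brings the state to about $0.9$, from which no input in $U$ keeps the next state inside $X$.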

Analysis approaches

1. Analyze closed-loop behavior directly → generally very difficult

2. Derive conditions on the terminal function $p$ and the terminal constraint set $X_f$ so that persistent feasibility and closed-loop stability are guaranteed


Addressing persistent feasibility

Goal: design MPC controller so that feasibility for all future times is guaranteed

Approach: leverage tools from invariant set theory


Set of feasible initial states

• Set of feasible initial states
$$X_0 := \{\mathbf{x}_0 \in X \mid \exists\, \mathbf{u}_0, \dots, \mathbf{u}_{N-1} \text{ such that } \mathbf{x}_k \in X, \; \mathbf{u}_k \in U, \; k = 0, \dots, N-1, \; \mathbf{x}_N \in X_f, \text{ where } \mathbf{x}_{k+1} = A\mathbf{x}_k + B\mathbf{u}_k, \; k = 0, \dots, N-1\}$$
• A control input can be found only if $\mathbf{x}(0) \in X_0$!

Controllable sets

• For the autonomous system $\mathbf{x}(t+1) = \phi(\mathbf{x}(t))$ with constraints $\mathbf{x}(t) \in X$, the one-step controllable set to set $S$ is defined as
$$\text{Pre}(S) := \{\mathbf{x} \in \mathbb{R}^n : \phi(\mathbf{x}) \in S\}$$
• For the system $\mathbf{x}(t+1) = \phi(\mathbf{x}(t), \mathbf{u}(t))$ with constraints $\mathbf{x}(t) \in X$, $\mathbf{u}(t) \in U$, the one-step controllable set to set $S$ is defined as
$$\text{Pre}(S) := \{\mathbf{x} \in \mathbb{R}^n : \exists\, \mathbf{u} \in U \text{ such that } \phi(\mathbf{x}, \mathbf{u}) \in S\}$$
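For a scalar linear system with interval sets, both definitions of $\text{Pre}(S)$ reduce to simple endpoint arithmetic. The sketch below assumes $\phi(x, u) = a x + b u$ with $a > 0$, $b > 0$ and represents intervals as `(lo, hi)` tuples; the function names and numbers are illustrative only.

```python
def pre_autonomous(S, a):
    """Pre(S) = {x : a*x in S} for the autonomous system x+ = a*x (a > 0)."""
    s_lo, s_hi = S
    return (s_lo / a, s_hi / a)

def pre_controlled(S, U, a, b):
    """Pre(S) = {x : exists u in U with a*x + b*u in S} (a > 0, b > 0).
    a*x must lie in S shifted by -b*u for some u in U, i.e. in
    [s_lo - b*u_hi, s_hi - b*u_lo]."""
    (s_lo, s_hi), (u_lo, u_hi) = S, U
    return ((s_lo - b * u_hi) / a, (s_hi - b * u_lo) / a)

S = (-1.0, 1.0)      # hypothetical target set
U = (-0.5, 0.5)      # hypothetical input set

pre_auto = pre_autonomous(S, a=2.0)             # (-0.5, 0.5)
pre_ctrl = pre_controlled(S, U, a=2.0, b=1.0)   # (-0.75, 0.75)
```

The controlled set is larger than the autonomous one because the input can absorb part of the distance to $S$. In higher dimensions with polyhedral sets the same computation requires a projection, which is what polyhedral toolboxes automate.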

Control invariant sets

• A set $C \subseteq X$ is said to be a control invariant set for the system $\mathbf{x}(t+1) = \phi(\mathbf{x}(t), \mathbf{u}(t))$ with constraints $\mathbf{x}(t) \in X$, $\mathbf{u}(t) \in U$, if:
$$\mathbf{x}(t) \in C \;\Rightarrow\; \exists\, \mathbf{u}(t) \in U \text{ such that } \phi(\mathbf{x}(t), \mathbf{u}(t)) \in C, \text{ for all } t$$
• The set $C_\infty \subseteq X$ is said to be the maximal control invariant set for the system $\mathbf{x}(t+1) = \phi(\mathbf{x}(t), \mathbf{u}(t))$ with constraints $\mathbf{x}(t) \in X$, $\mathbf{u}(t) \in U$, if it is control invariant and contains all control invariant sets contained in $X$

• These sets can be computed by using the MPT toolbox https://www.mpt3.org/
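For intuition, the maximal control invariant set of a scalar system can be approximated without a toolbox by the standard fixed-point iteration $\Omega_{k+1} = \text{Pre}(\Omega_k) \cap \Omega_k$ started from $\Omega_0 = X$. The lecture does not spell out this iteration; the sketch below assumes a hypothetical system $x(t+1) = 2x(t) + u(t)$ with $X = [-1, 1]$ and $U = [-0.5, 0.5]$, and it is not the MPT toolbox API.

```python
A, B = 2.0, 1.0          # hypothetical scalar system (A > 0, B > 0)
X = (-1.0, 1.0)          # state constraint interval
U = (-0.5, 0.5)          # input constraint interval

def pre(S):
    """One-step controllable set of the interval S for x+ = A x + B u, u in U."""
    return ((S[0] - B * U[1]) / A, (S[1] - B * U[0]) / A)

def max_control_invariant(tol=1e-9, max_iter=1000):
    """Iterate Omega <- Pre(Omega) ∩ Omega from Omega = X until it stops shrinking.
    Returns None if the iterate becomes empty."""
    omega = X
    for _ in range(max_iter):
        p = pre(omega)
        nxt = (max(omega[0], p[0]), min(omega[1], p[1]))
        if nxt[0] > nxt[1]:
            return None
        if abs(nxt[0] - omega[0]) < tol and abs(nxt[1] - omega[1]) < tol:
            return nxt
        omega = nxt
    return omega

C_inf = max_control_invariant()   # approaches [-0.5, 0.5] for these numbers
```

For this system the iterates shrink geometrically toward $[-0.5, 0.5]$: any state with $|x| > 0.5$ is pushed further out even by the largest admissible input, so it cannot belong to a control invariant set.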

Persistent feasibility lemma

• Define the “truncated” feasibility set:
$$X_1 := \{\mathbf{x}_1 \in X \mid \exists\, \mathbf{u}_1, \dots, \mathbf{u}_{N-1} \text{ such that } \mathbf{x}_k \in X, \; \mathbf{u}_k \in U, \; k = 1, \dots, N-1, \; \mathbf{x}_N \in X_f, \text{ where } \mathbf{x}_{k+1} = A\mathbf{x}_k + B\mathbf{u}_k, \; k = 1, \dots, N-1\}$$
• Feasibility lemma: if the set $X_1$ is a control invariant set for the system
$$\mathbf{x}(t+1) = A\mathbf{x}(t) + B\mathbf{u}(t), \quad \mathbf{x}(t) \in X, \quad \mathbf{u}(t) \in U, \quad t \ge 0$$
then the MPC law is persistently feasible

Persistent feasibility lemma

• Proof:
1. $\text{Pre}(X_1) = \{\mathbf{x} \in \mathbb{R}^n : \exists\, \mathbf{u} \in U \text{ such that } A\mathbf{x} + B\mathbf{u} \in X_1\}$
2. Since $X_1$ is control invariant, $\forall\, \mathbf{x} \in X_1 \;\; \exists\, \mathbf{u} \in U$ such that $A\mathbf{x} + B\mathbf{u} \in X_1$
3. Thus $X_1 \subseteq \text{Pre}(X_1) \cap X$
4. One can write $X_0 = \{\mathbf{x}_0 \in X \mid \exists\, \mathbf{u}_0 \in U \text{ such that } A\mathbf{x}_0 + B\mathbf{u}_0 \in X_1\} = \text{Pre}(X_1) \cap X$
5. Thus, $X_1 \subseteq X_0$

Persistent feasibility lemma

• Proof:
6. Pick some $\mathbf{x}_0 \in X_0$. Let $U_0^*$ be the solution to the finite-time optimization problem, and $\mathbf{u}_0^*$ be the first control. Let $\mathbf{x}_1 = A\mathbf{x}_0 + B\mathbf{u}_0^*$
7. Since $U_0^*$ is clearly feasible, one has $\mathbf{x}_1 \in X_1$. Since $X_1 \subseteq X_0$, one has $\mathbf{x}_1 \in X_0$, hence the next optimization problem is feasible!

Practical significance

• For $N = 1$, we can set $X_f = X_1$. If we choose the terminal set to be control invariant, then MPC will be persistently feasible independent of the chosen control objectives and parameters
• Designer can choose the parameters to affect performance (e.g., stability)
• How to extend this result to $N > 1$?
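The $N = 1$ case can be checked numerically on a toy problem. The sketch assumes a hypothetical scalar system $x(t+1) = 2x(t) + u(t)$ with $U = [-0.5, 0.5]$ and the terminal set $X_f = [-0.5, 0.5]$, which is control invariant for this system. The objective used here (drive the next state as close to zero as the constraints allow) is an arbitrary choice, in line with the remark that feasibility does not depend on the objective.

```python
A, B = 2.0, 1.0                 # hypothetical unstable scalar system
U_LO, U_HI = -0.5, 0.5          # input constraint set U
X_F = (-0.5, 0.5)               # terminal set, control invariant for this system

def one_step_mpc(x, target):
    """Horizon-1 problem: find u in U with A*x + B*u in target (assumes B > 0),
    choosing the admissible input that brings the next state closest to 0."""
    lo = max(U_LO, (target[0] - A * x) / B)
    hi = min(U_HI, (target[1] - A * x) / B)
    if lo > hi:
        return None                      # problem infeasible
    return min(max(-A * x / B, lo), hi)

def stays_feasible(x0, steps=50):
    """True if the closed loop never hits an infeasible problem."""
    x = x0
    for _ in range(steps):
        u = one_step_mpc(x, X_F)
        if u is None:
            return False
        x = A * x + B * u
    return True

# Initial states on a grid over X_0 = Pre(X_f) ∩ X = [-0.5, 0.5]
results = {i / 10: stays_feasible(i / 10) for i in range(-5, 6)}
rejected_at_start = one_step_mpc(0.7, X_F) is None
```

Every initial state in $X_0$ remains feasible for all simulated steps, whereas a state outside $X_0$ such as $0.7$ is rejected at the very first solve instead of failing later.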

Persistent feasibility theorem

• Feasibility theorem: if the set $X_f$ is a control invariant set for the system
$$\mathbf{x}(t+1) = A\mathbf{x}(t) + B\mathbf{u}(t), \quad \mathbf{x}(t) \in X, \quad \mathbf{u}(t) \in U, \quad t \ge 0$$
then the MPC law is persistently feasible

Persistent feasibility theorem

• Proof
1. Define the “truncated” feasibility set at step $N-1$:
$$X_{N-1} := \{\mathbf{x}_{N-1} \in X \mid \exists\, \mathbf{u}_{N-1} \text{ such that } \mathbf{x}_{N-1} \in X, \; \mathbf{u}_{N-1} \in U, \; \mathbf{x}_N \in X_f, \text{ where } \mathbf{x}_N = A\mathbf{x}_{N-1} + B\mathbf{u}_{N-1}\}$$
2. Due to the terminal constraint
$$A\mathbf{x}_{N-1} + B\mathbf{u}_{N-1} = \mathbf{x}_N \in X_f$$

Persistent feasibility theorem

• Proof
3. Since $X_f$ is a control invariant set, there exists a $\mathbf{u} \in U$ such that $\mathbf{x}^+ = A\mathbf{x}_N + B\mathbf{u} \in X_f$
4. The above is indeed the requirement to belong to set $X_{N-1}$
5. Thus, $A\mathbf{x}_{N-1} + B\mathbf{u}_{N-1} = \mathbf{x}_N \in X_{N-1}$
6. We have just proved that $X_{N-1}$ is control invariant
7. Repeating this argument, one can recursively show that $X_{N-2}, X_{N-3}, \dots, X_1$ are control invariant, and the persistent feasibility lemma then applies

Practical aspects of persistent feasibility

• The terminal set $X_f$ is introduced artificially for the sole purpose of leading to a sufficient condition for persistent feasibility
• We want it to be large so that it does not compromise closed-loop performance
• Though it is simplest to choose $X_f = \{0\}$, this is generally undesirable
• We’ll discuss better choices in the next lecture
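One way to see why a small terminal set is undesirable is to compare the resulting sets of feasible initial states. The sketch below uses the backward recursion $X_k = \text{Pre}(X_{k+1}) \cap X$ with $X_N = X_f$ on a hypothetical scalar system $x(t+1) = 2x(t) + u(t)$, $X = [-1, 1]$, $U = [-0.5, 0.5]$. The system and the two candidate terminal sets are assumptions for illustration, and the code does not handle the case where an intersection is empty.

```python
A, B = 2.0, 1.0          # hypothetical scalar system (A > 0, B > 0)
X = (-1.0, 1.0)          # state constraint interval
U = (-0.5, 0.5)          # input constraint interval

def pre(S):
    """One-step controllable set of the interval S for x+ = A x + B u, u in U."""
    return ((S[0] - B * U[1]) / A, (S[1] - B * U[0]) / A)

def feasible_set(X_f, N):
    """X_0 obtained from X_N = X_f by N steps of X_k = Pre(X_{k+1}) ∩ X."""
    S = X_f
    for _ in range(N):
        p = pre(S)
        S = (max(p[0], X[0]), min(p[1], X[1]))
    return S

origin_only = {N: feasible_set((0.0, 0.0), N) for N in (1, 2, 3)}
invariant = {N: feasible_set((-0.5, 0.5), N) for N in (1, 2, 3)}

for N in (1, 2, 3):
    print(N, origin_only[N], invariant[N])
```

With $X_f = \{0\}$ the feasible set grows slowly with the horizon ($[-0.25, 0.25]$, $[-0.375, 0.375]$, $[-0.4375, 0.4375]$ for $N = 1, 2, 3$), while the control invariant choice $X_f = [-0.5, 0.5]$ already gives $[-0.5, 0.5]$ at $N = 1$.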

Example

• System with $n = 3$ states, $m = 2$ inputs
• $A$, $B$ chosen randomly
• quadratic stage cost: $c(\mathbf{x}, \mathbf{u}) = \mathbf{x}^T\mathbf{x} + \mathbf{u}^T\mathbf{u}$
• $X = \{\mathbf{x} \mid \|\mathbf{x}\|_\infty \le 1\}$, $U = \{\mathbf{u} \mid \|\mathbf{u}\|_\infty \le 0.5\}$
• initial point: $(0.9, -0.9, 0.9)$

Next time

• Stability of MPC, explicit MPC, and practical aspects