
Efficient Direct Multiple Shooting for Nonlinear Model Predictive Control on Long Horizons

C. Kirches^{∗,a}, L. Wirsching^a, H. G. Bock^a, J. P. Schlöder^a

^a Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Im Neuenheimer Feld 368, 69120 Heidelberg, GERMANY

Abstract

We address direct multiple shooting based algorithms for nonlinear model predictive control, with a focus on problems with long prediction horizons. We describe different efficient multiple shooting variants with a computational effort that is only linear in the horizon length. Proposed techniques comprise structure exploiting linear algebra on the one hand, and approximation of derivative information in an adjoint Sequential Quadratic Programming method on the other hand. For explicit one–step methods for ordinary differential equations we address the issue of consistent and fast generation of both forward and adjoint derivatives of dynamic process models according to the principle of Internal Numerical Differentiation. We discuss the applicability of the proposed methods at the example of three benchmark problems. These have recently been addressed in literature and serve to evaluate the relative performance of each of the proposed methods for both off–line optimal control and on–line nonlinear model predictive control. Throughout, we compare against results published for a recently proposed collocation approach based on finite elements.

Key words: nonlinear model predictive control, direct and simultaneous methods for optimal control, sensitivity generation, benchmarks

1. Introduction

In Nonlinear Model Predictive Control (NMPC), one repeatedly computes solutions to optimal control problems (OCPs) on a finite prediction horizon in order to generate feedback controls for dynamical processes. Since these optimal control problems are approximations to an infinite–horizon counterpart, the choice of the prediction horizon length is crucial. For closed–loop performance and stability, the use of long horizons would be preferable. Moreover, fast systems often require a fine control discretization of the prediction horizon for sufficient controllability. Classic numerical methods, however, frequently show a cubic runtime complexity in the horizon length or, respectively, in the discretization granularity. This effectively limits the applicability of such methods to short horizons and coarse discretizations.

∗Corresponding author. Tel.: +49 (6221) 54–8895; Fax: +49 (6221) 54–5444. Email address: [email protected] (C. Kirches)

Preprint submitted to Journal of Process Control, July 4, 2011

1.1. Contributions

In this paper we reconsider three case studies for NMPC on long horizons that have been presented in [1, 2]. For each of the case studies we compare the performance of a full–space and an adjoint Sequential Quadratic Programming (SQP) method, as well as of three different approaches to structure exploitation for the multiple shooting parameterization. For the adjoint SQP method we propose a matrix–free condensing step with reduced runtime complexity. We improve over [1], wherein model predictive control schemes are not addressed, by investigating the performance and behavior of all presented algorithms in an NMPC context for typical set point change scenarios. Here we conduct a detailed runtime analysis and give recommendations for the choice of algorithms depending on the characteristics of the dynamic process under consideration. In [1] a collocation scheme based on finite elements (CFE) is proposed for the computation of sensitivities. Addressing this, we review the principle of Internal Numerical Differentiation (IND) due to Bock [3] and apply it to automatically, precisely, and efficiently compute derivatives of a discretization scheme for the process dynamics. Using the test cases we compare the precision and computational effort of IND to that reported for the CFE scheme.

2. The Direct Multiple Shooting Method for Optimal Control

In this section we give a brief presentation of the direct multiple shooting method for optimal control [4] and the solution of the arising structured nonlinear programming problems (NLPs) in an SQP context.

2.1. Optimal control problem formulation

We consider the following class of OCPs which typically arise in NMPC on the fixed and finite prediction horizon $[t_0, t_0 + T]$,

$$
\begin{aligned}
\min_{x(\cdot),\,u(\cdot)} \quad & \int_{t_0}^{t_0+T} L(x(t), u(t))\,\mathrm{d}t + E(x(t_0 + T)) && \text{(1a)} \\
\text{s.t.} \quad & \dot{x}(t) = f(x(t), u(t)), \quad t \in [t_0, t_0 + T], && \text{(1b)} \\
& x(t_0) = x_0, && \text{(1c)} \\
& 0 \le r(x(t), u(t)), \quad t \in [t_0, t_0 + T], && \text{(1d)} \\
& 0 \le h(x(t_0 + T)). && \text{(1e)}
\end{aligned}
$$

We denote by $x(t) \in \mathbb{R}^{n^x}$ the state vector and by $u(t) \in \mathbb{R}^{n^u}$ the vector of continuous controls of the dynamic process.

The state trajectory is determined from the initial value problem (IVP) (1b, 1c), where $x_0$ is the current state of the process and $f(x, u)$ describes a model of the dynamic process. For clarity of exposition we concentrate on ODE process models in the following, and refer to [5] for treatment of models described by differential–algebraic equations (DAEs). States and controls may be subject to constraints (1d) and the final state may be restricted by an end–point constraint (1e).

The objective function is of Bolza type with a Lagrange term $L(x, u)$ and a Mayer term $E(x(t_0 + T))$. This formulation also includes least–squares objective functions of the form $L(x, u) = \|l(x, u)\|_2^2$, where $l$ is the least–squares residual vector. A typical example is the tracking–type objective

$$
L(x, u) = (x - \bar{x})^T Q(t)\,(x - \bar{x}) + (u - \bar{u})^T R(t)\,(u - \bar{u}), \tag{2}
$$

where $\bar{x}$ and $\bar{u}$ are reference trajectories for $x$ and $u$, and $Q(t)$ and $R(t)$ are suitable positive definite weighting matrices. A typical choice for the Mayer term is the quadratic cost

$$
E(x(t_0 + T)) = \bigl(x(t_0 + T) - \bar{x}(t_0 + T)\bigr)^T P\,\bigl(x(t_0 + T) - \bar{x}(t_0 + T)\bigr), \tag{3}
$$

with a suitable weighting matrix $P$. The Mayer term can be used, typically in conjunction with the end–point constraint $h(x(t_0 + T))$, to design feedback control schemes that guarantee stability of the closed–loop system, see [6, 7].

The problem may also depend on time–independent model parameters which are not considered as degrees of freedom for the optimization. These parameters may e.g. define set–points for tracking–type objectives, or may be fixed design parameters of the process model. In practice, it may happen that some of the parameters change during the runtime of the process, and this gives rise to the important area of on–line state and parameter estimation (see e.g. [8, 9, 10]).

2.2. Discretization of Controls and Parameterization of States

In this work, algorithms for the efficient numerical solution of problem (1) are based on the direct multiple shooting method for optimal control, first described by [11, 4] and extended in a series of subsequent works, cf. [12]. For a suitable partition of the prediction horizon $[t_0, t_0 + T] \subset \mathbb{R}$ into $N$ shooting intervals $[t_i, t_{i+1}]$, $0 \le i < N$, we choose the control discretization

$$
u(t) = \varphi_i(t, q_i), \quad t \in [t_i, t_{i+1}]. \tag{4}
$$

In NMPC, the usual choice for the basis functions $\varphi_i$ is piecewise constant controls $\varphi_i(t, q_i) = q_i \in \mathbb{R}^{n^q}$ for $t \in [t_i, t_{i+1}]$. In contrast to single shooting we also apply a state parameterization by introduction of additional initial values $s_i$ for computing the state trajectories on the shooting intervals,

$$
\dot{x}_i(t) = f(x_i(t), \varphi_i(t, q_i)), \quad x_i(t_i) = s_i, \quad t \in [t_i, t_{i+1}], \quad 0 \le i < N. \tag{5}
$$

Continuity of the optimal trajectory on the whole horizon $[t_0, t_0 + T]$ is ensured by additional matching conditions

$$
s_{i+1} = x_i(t_{i+1}; t_i, s_i, q_i), \quad 0 \le i < N, \tag{6}
$$

wherein $x_i(t; t_i, s_i, q_i)$ denotes the solution of the IVP (5) depending on $s_i$ and $q_i$. One particular advantage of this approach is that it allows for the use of adaptive integrators for function and sensitivity evaluation, cf. Section 3. Path constraints (1d) are usually imposed on the shooting grid $\{t_i\}_{0 \le i \le N}$ only, but strict feasibility in the interior could be ensured if needed, cf. [13].
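To make the parameterization (4)–(6) concrete, the following Python sketch evaluates the matching conditions (6) for a toy scalar model. It is an illustration under stated assumptions, not the authors' implementation: the model `f(x, u) = -x + u`, the fixed-step RK4 routine `rk4_integrate`, the shooting grid, and all numerical values are hypothetical stand-ins for an adaptive integrator and a real process model.

```python
import math

def rk4_integrate(f, x, u, t0, t1, steps=20):
    """Integrate x' = f(x, u) from t0 to t1 with fixed-step classical RK4."""
    h = (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(x, u)
        k2 = f(x + 0.5 * h * k1, u)
        k3 = f(x + 0.5 * h * k2, u)
        k4 = f(x + h * k3, u)
        x = x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

def matching_residuals(f, grid, s, q):
    """Return x_i(t_{i+1}; t_i, s_i, q_i) - s_{i+1} for 0 <= i < N, cf. (6)."""
    return [rk4_integrate(f, s[i], q[i], grid[i], grid[i + 1]) - s[i + 1]
            for i in range(len(q))]

def f(x, u):
    # Hypothetical scalar process model, chosen only for illustration.
    return -x + u

grid = [0.0, 0.5, 1.0, 1.5]          # shooting grid t_0 < ... < t_N, N = 3
q = [1.0, 0.0, 0.5]                  # piecewise constant controls q_i

# An arbitrary guess for the node values s_i gives a discontinuous trajectory.
s_guess = [0.0, 0.0, 0.0, 0.0]
res_guess = matching_residuals(f, grid, s_guess, q)

# Node values taken from a forward simulation satisfy (6) by construction.
s_cont = [0.0]
for i in range(len(q)):
    s_cont.append(rk4_integrate(f, s_cont[i], q[i], grid[i], grid[i + 1]))
res_cont = matching_residuals(f, grid, s_cont, q)
```

In an SQP method the node values need not be continuous at intermediate iterates; the residuals of (6) enter the NLP as equality constraints and vanish only at convergence.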

2.3. Nonlinear Program82

From the multiple shooting discretization we obtain the NLP83

mins,q

∑N−1

i=0Li (si, qi) + E (sN) (7a)

s.t. 0 = x0 − s0, (7b)

0 = xi(ti+1; ti, si, qi)− si+1, 0 ≤ i < N, (7c)

0 ≤ r(si, ϕi(ti, qi)), 0 ≤ i < N, (7d)

0 ≤ h(sN), (7e)

4

wherein Li (si, qi) =∫ ti+1

tiL(x(t), ϕi(t, qi)) dt. This NLP shows a parametric84

dependence on the initial value x0 and, with Λ = (I, 0, 0, . . . )T and w = (q, s)85

where q = (q0, . . . , qN−1) and s = (s0, . . . , sN), can be written in the more86

generic form87

minw

φ(w) s.t. c(w) + Λx0 = 0, d(w) ≥ 0. (8)

2.4. Sequential Quadratic Programming

We solve NLP (8) with a Newton–type method. Starting with an initial guess $(w^0, \lambda^0, \mu^0)$, a full step SQP iteration (see e.g. [14]) is performed by solving the quadratic programming problem (QP)

$$
\begin{aligned}
\min_{\Delta w} \quad & \tfrac{1}{2} \Delta w^T B^k \Delta w + \Delta w^T b^k && \text{(9a)} \\
\text{s.t.} \quad & 0 = C^k \Delta w + c^k + \Lambda x_0^k, && \text{(9b)} \\
& 0 \le D^k \Delta w + d^k && \text{(9c)}
\end{aligned}
$$

for $(\Delta w, \lambda_{\mathrm{QP}}, \mu_{\mathrm{QP}})$ and iterating according to

$$
w^{k+1} = w^k + \Delta w, \quad \lambda^{k+1} = \lambda_{\mathrm{QP}}, \quad \mu^{k+1} = \mu_{\mathrm{QP}}. \tag{10}
$$

In (9), $B^k := B(w^k)$ denotes an approximation of the Hessian of the Lagrangian of (7), $b^k := \nabla\phi(w^k)$ is the objective gradient, and (9b, 9c) are linearizations of the equality constraints $c(\cdot)$ and inequality constraints $d(\cdot)$ at the current iterate $w^k$, respectively. Various structural features of QP (9), such as a block diagonal Hessian and block bidiagonal Jacobians, can be exploited, see Section 5.

2.5. Adjoint SQP

In subproblem (9) it may be desirable to use approximations $\tilde{C}^k$, $\tilde{D}^k$ instead of the exact Jacobians $C^k$, $D^k$. In this case, one has to worry about correct identification of the true active set of the solution of (9). In [48, 2] it has been shown that this is possible by using the so–called modified gradient

$$
\tilde{b}^k := b^k + (\tilde{C}^k - C^k)^T \lambda^k + (\tilde{D}^k - D^k)^T \mu^k \tag{11}
$$

in place of $b^k$. The key motivation here is that the product $C^{kT} \lambda^k$ can be computed as a cheap adjoint derivative of the right-hand side of (7c), without having to compute the expensive full Jacobian $C^k$. The same of course holds true for the right-hand sides of (7d, 7e) and $D^{kT} \mu^k$. For (7c) this involves the computation of an adjoint sensitivity of the discretized IVP (1b, 1c). Applicable principles are addressed in Section 3.
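The role of the modified gradient can be illustrated on a toy equality-constrained problem. The Python sketch below is an assumption-laden illustration, not the method of [48, 2]: the objective, the constraint, the fixed Jacobian approximation `C_TILDE`, the choice $B = I$, and the sign convention $\mathcal{L} = \phi - \lambda c$ for the Lagrangian are all hypothetical choices made so that the QP can be solved in closed form. It shows that, although the constraint Jacobian is never updated, the iteration converges to a point satisfying the exact first-order optimality conditions.

```python
def phi_grad(w):
    # Gradient of the hypothetical objective phi(w) = 0.5 * (w1^2 + w2^2).
    return [w[0], w[1]]

def c(w):
    # Hypothetical nonlinear equality constraint c(w) = 0.
    return w[0] + w[1] + 0.1 * w[0] ** 2 - 1.0

def c_adjoint(w, lam):
    # Product C(w)^T * lam; a real code would obtain it by one adjoint sweep.
    return [(1.0 + 0.2 * w[0]) * lam, lam]

C_TILDE = [1.0, 1.0]  # fixed Jacobian approximation (exact only at w = 0)

def adjoint_sqp_step(w, lam):
    """One full step with B = I, fixed C_TILDE, and a modified gradient."""
    b = phi_grad(w)
    ct_lam = c_adjoint(w, lam)
    # Modified gradient: b + (C_tilde - C)^T lam, cf. (11).
    b_mod = [b[j] + C_TILDE[j] * lam - ct_lam[j] for j in range(2)]
    # Closed-form solution of  min 0.5*dw^T dw + b_mod^T dw  s.t. C_tilde dw + c = 0.
    cc = sum(cj * cj for cj in C_TILDE)
    lam_qp = (sum(C_TILDE[j] * b_mod[j] for j in range(2)) - c(w)) / cc
    dw = [-b_mod[j] + C_TILDE[j] * lam_qp for j in range(2)]
    return [w[j] + dw[j] for j in range(2)], lam_qp

w, lam = [0.0, 0.0], 0.0
for _ in range(30):
    w, lam = adjoint_sqp_step(w, lam)

# Exact KKT residual, evaluated with the true constraint derivative.
stationarity = [g - a for g, a in zip(phi_grad(w), c_adjoint(w, lam))]
feasibility = c(w)
```

Without the correction term in `b_mod`, the same iteration would converge to a point that is stationary for the approximated problem only.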


3. Solutions and Sensitivities of the Dynamic Process

The formulation of the QP (9) crucially relies on the availability of first order derivative information. In this section we review the fundamental principle of Internal Numerical Differentiation, pioneered by [3]. We discuss its use to automatically, efficiently, and precisely compute derivatives of a discretization scheme for (1b), as shown in [3, 16], for the class of explicit Runge–Kutta methods [17, 18, 19]. Discussions of the more involved case of linear multi–step methods can be found e.g. in [20, 21] and in [22, 23] for the adjoint case.

3.1. The Principle of Internal Numerical Differentiation

In optimization of dynamic processes by direct and simultaneous methods like collocation or direct multiple shooting, also referred to as “first discretize, then optimize” approaches, it is paramount that all employed derivatives are consistent with the discretization, i.e., one is interested in an exact derivative of the discretized problem.

In order to achieve this goal in direct methods for optimal control, it is vitally important to use the same discretization scheme for both computation of the nominal solution trajectory and computation of sensitivities, i.e., to ensure identical choices for all adaptively chosen components of the discretization scheme during both computations. These components may comprise e.g. the choice of step sizes in error–controlled methods, of orders in variable–order methods, of Jacobians and iteration counts in implicit methods, or of pivoting decisions in factorizations. This principle, referred to as Internal Numerical Differentiation [3], extends also to the process model, i.e., the ODE's right hand side function $f$ in (1b).

3.2. An Exemplary Adaptive Discretization Scheme

To illustrate this principle, we consider an explicit Runge–Kutta (RK) scheme $\Phi$ with $s \ge 1$ stages and Butcher tableau $\alpha, \gamma \in \mathbb{R}^s$, $\beta \in \mathbb{R}^{s \times s}$. This scheme computes an approximation $\eta(\tau + h) = \eta + h\,\Phi(\tau, \eta; h)$ to the solution $x(\tau + h; \tau, \eta)$ of the IVP

$$
\dot{x}(t) = f(t, x(t)), \quad x(\tau) = \eta, \quad t \in [\tau, \tau + h], \tag{12}
$$

using the step function

$$
\Phi(\tau, \eta; h) := \sum_{i=1}^{s} \gamma_i k_i, \qquad k_i := f\Bigl(\tau + h\alpha_i,\; \eta + h \sum_{j=1}^{i-1} \beta_{ij} k_j\Bigr), \tag{13}
$$

wherein we dropped the control argument from $f$. We assume that some error control mechanism, given an initial value $\eta_0$, in iterations $k = 0, \ldots, K - 1$ adaptively chooses a step size $h_k > 0$ to compute the approximation $\eta_{k+1} = \eta(\tau_{k+1})$ to $x(\tau_{k+1})$ at time $\tau_{k+1} = \tau_k + h_k$. Details can be found e.g. in [19, 24] and many other related works.
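The step function (13) translates almost literally into code. The following Python sketch assumes a scalar state and the classical four-stage RK4 tableau for brevity (the case studies in Section 6 use a Fehlberg scheme instead); the names `step_function` and `integrate` are hypothetical, and the list of step sizes stands in for the output of an error control mechanism, which is not implemented here.

```python
# Classical RK4 Butcher tableau (alpha, beta, gamma), assumed for illustration.
ALPHA = [0.0, 0.5, 0.5, 1.0]
BETA = [[0.0, 0.0, 0.0, 0.0],
        [0.5, 0.0, 0.0, 0.0],
        [0.0, 0.5, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]]
GAMMA = [1.0 / 6.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 6.0]

def step_function(f, tau, eta, h):
    """Evaluate Phi(tau, eta; h) = sum_i gamma_i * k_i as in (13)."""
    k = []
    for i in range(len(GAMMA)):
        # Stage argument eta + h * sum_{j < i} beta_ij * k_j (explicit scheme).
        arg = eta + h * sum(BETA[i][j] * k[j] for j in range(i))
        k.append(f(tau + h * ALPHA[i], arg))
    return sum(g * ki for g, ki in zip(GAMMA, k))

def integrate(f, tau0, eta0, step_sizes):
    """Apply eta_{k+1} = eta_k + h_k * Phi(tau_k, eta_k; h_k) for given h_k."""
    tau, eta = tau0, eta0
    for h in step_sizes:
        eta = eta + h * step_function(f, tau, eta, h)
        tau += h
    return eta

# One step of size 0.1 for x' = -x, x(0) = 1.
eta_one_step = integrate(lambda t, x: -x, 0.0, 1.0, [0.1])
```

Keeping the sequence of step sizes as explicit data is what later allows the sensitivity computations to reuse exactly the same discretization.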

3.3. A Consistent Forward Mode Discretization Scheme

A discretization scheme to compute a forward directional sensitivity

$$
x^d(t_b; t_a) = \frac{\mathrm{d}x(t_b)}{\mathrm{d}x(t_a)} \cdot d, \quad [t_a, t_b] \subseteq [t_0, t_0 + T], \quad d \in \mathbb{R}^{n^x},\; d \neq 0 \tag{14}
$$

that is consistent with (13) can be obtained by forward mode differentiation of (13) with respect to $\eta$ and reads

$$
\Phi^d(\tau, \eta, \eta^d; h) := \sum_{i=1}^{s} \gamma_i k^d_i, \qquad k^d_i := \frac{\partial f}{\partial x}(\cdot) \cdot \Bigl(\eta^d + h \sum_{j=1}^{i-1} \beta_{ij} k^d_j\Bigr), \tag{15}
$$

where $\eta^d(\tau)$ is the approximation of $x^d(\tau; t_a)$, and $\eta^d(t_a) = d$. Arguments of $\frac{\partial f}{\partial x}$ have been dropped for brevity of notation and are the same as in (13). Hence, the forward sensitivity scheme (15) is best evaluated simultaneously with the forward simulation scheme (13). Clearly, both schemes are consistent: step sizes and evaluation points coincide, and forward directions are propagated by $\Phi^d$ just like state approximations by $\Phi$. Hence the IND principle is satisfied here. We remark that solving the variational system

$$
\dot{x}^d(t) = \frac{\partial f}{\partial x}(t, x)\, x^d(t), \quad x^d(t_a) = d \tag{16}
$$

on $[t_a, t_b]$ conforms to the IND principle only if the same RK scheme and the same sequence $\{h_k\}$ of step sizes as for the nominal solution is chosen, and if the approximations $\eta_k$ are reused to evaluate $\frac{\partial f}{\partial x}$.
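A consistency check for the forward scheme (15) can be sketched in a few lines of Python. The assumptions are again a scalar state, the classical RK4 tableau, a hypothetical model $f(t, x) = -x^2$, and a hand-picked step size sequence `STEPS` that mimics the output of an error controller. The sensitivity is propagated together with the nominal state and compared against a central finite difference of the discrete scheme with the step sizes frozen.

```python
ALPHA = [0.0, 0.5, 0.5, 1.0]
BETA = [[0.0, 0.0, 0.0, 0.0],
        [0.5, 0.0, 0.0, 0.0],
        [0.0, 0.5, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]]
GAMMA = [1.0 / 6.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 6.0]

def f(t, x):
    return -x * x          # hypothetical nonlinear right-hand side

def dfdx(t, x):
    return -2.0 * x        # its derivative with respect to x

def rk_forward(eta, eta_d, tau, steps):
    """Propagate state and forward direction with identical stages and steps."""
    for h in steps:
        k, kd = [], []
        for i in range(4):
            arg = eta + h * sum(BETA[i][j] * k[j] for j in range(i))
            arg_d = eta_d + h * sum(BETA[i][j] * kd[j] for j in range(i))
            t_i = tau + h * ALPHA[i]
            k.append(f(t_i, arg))
            # k_i^d = df/dx(.) * (eta^d + h * sum_j beta_ij k_j^d), cf. (15).
            kd.append(dfdx(t_i, arg) * arg_d)
        eta = eta + h * sum(g * ki for g, ki in zip(GAMMA, k))
        eta_d = eta_d + h * sum(g * ki for g, ki in zip(GAMMA, kd))
        tau += h
    return eta, eta_d

STEPS = [0.1, 0.2, 0.15]   # frozen step sizes, assumed given

eta_K, sens = rk_forward(1.0, 1.0, 0.0, STEPS)

# Central finite difference of the discrete map eta_0 -> eta_K, same steps.
eps = 1e-6
fd = (rk_forward(1.0 + eps, 0.0, 0.0, STEPS)[0]
      - rk_forward(1.0 - eps, 0.0, 0.0, STEPS)[0]) / (2.0 * eps)
```

The two values agree up to the finite difference error, which reflects that `sens` is the derivative of the discretization scheme itself rather than of the exact ODE solution.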

3.4. A Consistent Adjoint Mode Discretization Scheme

As discussed in Section 2.5, the availability of adjoint derivative information may bring considerable benefits to Newton–type methods. A discretization scheme to compute an adjoint directional ODE sensitivity

$$
\lambda^x(t_b; t_a) = \lambda^T \cdot \frac{\mathrm{d}x(t_b)}{\mathrm{d}x(t_a)}, \quad [t_a, t_b] \subseteq [t_0, t_0 + T], \quad \lambda \in \mathbb{R}^{n^x},\; \lambda \neq 0 \tag{17}
$$

that is consistent with (13) can be obtained by applying the reverse mode of automatic differentiation (see e.g. [25]) with respect to $\eta$ to (13),

$$
\lambda^{\Phi}(\tau, \eta, \lambda^{\eta}; h) := -\sum_{i=1}^{s} \lambda^{kT}_i \frac{\partial f}{\partial x}(\cdot), \qquad
\lambda^{k}_i := \gamma_i \lambda^{\eta} + h \sum_{j=i+1}^{s} \beta_{ji}\, \lambda^{kT}_j \frac{\partial f}{\partial x}(\cdot).
$$

This adjoint scheme starts with $\lambda^{\eta}_K := \lambda$ and proceeds for $k = K, \ldots, 1$ backwards in time with steps $\lambda^{\eta}_{k-1} := \lambda^{\eta}_k - h_{k-1} \cdot \lambda^{\Phi}(\tau_{k-1}, \eta_{k-1}, \lambda^{\eta}_k; h_{k-1})$. Afterwards, $\lambda^{\eta}_0$ is the approximation of (17) and the adjoint directional derivative of the discretization scheme. We remark that, in some similarity to the forward case, solving the adjoint system

$$
\dot{\lambda}^x(t)^T = -\lambda^x(t)^T \frac{\partial f}{\partial x}(t, x), \quad \lambda^x(t_b) = d \tag{18}
$$

on $[t_a, t_b]$ backwards in time using the RK scheme that is adjoint to the one used for the forward simulation, i.e., the RK scheme with transposed Butcher matrix $\beta$, creates a consistent scheme conforming to IND if again all adaptive components remain unaltered, cf. [16, 2].
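The backward sweep can be sketched analogously to the forward case. The Python code below is a minimal illustration under the same assumptions as before (scalar state, classical RK4 tableau, hypothetical model $f(t, x) = -x^2$, frozen step sizes); the names `forward_with_tape` and `adjoint_sweep` are hypothetical. The nominal integration records step sizes and stage evaluation points, and the adjoint sweep reuses them in reverse order.

```python
ALPHA = [0.0, 0.5, 0.5, 1.0]
BETA = [[0.0, 0.0, 0.0, 0.0],
        [0.5, 0.0, 0.0, 0.0],
        [0.0, 0.5, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]]
GAMMA = [1.0 / 6.0, 1.0 / 3.0, 1.0 / 3.0, 1.0 / 6.0]

def f(t, x):
    return -x * x          # hypothetical nonlinear right-hand side

def dfdx(t, x):
    return -2.0 * x

def forward_with_tape(eta, tau, steps):
    """Nominal integration that records step sizes and stage points."""
    tape = []
    for h in steps:
        k, points = [], []
        for i in range(4):
            arg = eta + h * sum(BETA[i][j] * k[j] for j in range(i))
            points.append((tau + h * ALPHA[i], arg))
            k.append(f(*points[-1]))
        tape.append((h, points))
        eta = eta + h * sum(g * ki for g, ki in zip(GAMMA, k))
        tau += h
    return eta, tape

def adjoint_sweep(lam, tape):
    """Propagate the adjoint direction backwards over the recorded steps."""
    lam_eta = lam
    for h, points in reversed(tape):
        fx = [dfdx(t, x) for t, x in points]
        lam_k = [0.0] * 4
        for i in reversed(range(4)):
            # lam_k_i = gamma_i * lam_eta + h * sum_{j > i} beta_ji * fx_j * lam_k_j
            lam_k[i] = GAMMA[i] * lam_eta + h * sum(
                BETA[j][i] * fx[j] * lam_k[j] for j in range(i + 1, 4))
        # lam_eta_{k-1} = lam_eta_k - h * lam_Phi, with lam_Phi = -sum_i lam_k_i * fx_i
        lam_eta = lam_eta + h * sum(fx[i] * lam_k[i] for i in range(4))
    return lam_eta

STEPS = [0.1, 0.2, 0.15]   # frozen step sizes, assumed given
eta_K, tape = forward_with_tape(1.0, 0.0, STEPS)
adj = adjoint_sweep(1.0, tape)

# Reference: central finite difference of the discrete map, same steps.
eps = 1e-6
fd = (forward_with_tape(1.0 + eps, 0.0, STEPS)[0]
      - forward_with_tape(1.0 - eps, 0.0, STEPS)[0]) / (2.0 * eps)
```

For a scalar state the forward and adjoint derivatives coincide; for $n^x > 1$ a single adjoint sweep yields the product $\lambda^T \, \mathrm{d}\eta_K / \mathrm{d}\eta_0$ at a cost independent of the number of directions.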

3.5. Remarks on Efficiency and Accuracy

The IND principle is specifically designed to allow for larger integrator steps $h_k$ to be made, hence allowing for faster computation of the ODE system's solution and sensitivities, incurring no compromise in accuracy (see Section 6). By invoking automatic differentiation, a sensitivity computation scheme consistent in the sense of the IND principle always guarantees that the obtained derivatives are, up to rounding errors, those of the chosen discretization scheme. The actual choice of the discretization scheme is liberated of any derivative accuracy concerns and should rather be made with regard to e.g. the nonlinearity, stiffness, and numerical stability. For stiff systems, implicit methods such as [23, 21] are required. Applicable IND techniques can be found e.g. in [23].

4. Techniques for Nonlinear Model Predictive Control

We now address the issue of applying either of the two SQP methods of Sections 2.4, 2.5 in an on–line NMPC setting.

4.1. Initial–Value Embedding

The key to an efficient numerical algorithm for NMPC is to reuse information from the last problem to initialize the new problem. This is due to the fact that subsequent problems differ only in the parameter $x_0$ of the linear embedding $\Lambda$ in (8). If model predictions are sufficiently close to real process behavior, it is reasonable to expect that the information contained in the previous problem's solution is a good initial guess very close to the solution of the new subproblem.

In [26] and related works it has been proposed to initialize the current problem with the full solution of the previous optimization run, in particular control and state variables as well as multipliers. In doing so, the value of $s_0$ will in general not be the value of the current state $x_0$. We can, however, guarantee that $s_0$ attains the value of $x_0$ already after the first full Newton–type step by explicitly including the linear initial value constraint (7b), as done in the QP formulation (9). We refer to this way to initialize the current problem as Initial Value Embedding (IVE).

4.2. Real–Time Iterations

Applying the Newton–type step to the new problem initialized by the IVE yields a tangential predictor of the solution, i.e., a first order Taylor approximation, even in the presence of an active set change. This motivates the idea of real–time iterations, which perform only one Newton–type iteration per NMPC sample, and is at the same time the main reason for preferring active set methods over interior–point techniques. We refer to [27] for a detailed survey on the topic of initial value embeddings and the resulting first order tangential predictors.

4.3. Three Phases of a Real–Time Iteration

Using the IVE also has an important algorithmic advantage. We can evaluate all derivatives and all function values except the initial value constraint without knowledge of the current state $x_0$. Consequently, we can pre–solve a major part of QP (9). This allows us to separate each real–time iteration into the following three phases.

Preparation. All functions and derivatives that do not require knowledge of $x_0$ are evaluated using the iterate $(w^k, \lambda^k, \mu^k)$ of the previous step. ODE solution and sensitivity computation according to the IND principle take place in this phase. In addition, sparsity analysis, structure exploitation, and matrix factorizations happen here, see Section 5. Note that the preparation phase of the new problem always takes place one sampling period ahead. The preparation phase can be interpreted as setting up the tangential predictor as a piecewise linear mapping $x_0 \mapsto \Delta q^k_0$.

Feedback. As soon as $x_0$ is available, the QP (9) is solved for $\Delta q^k_0$ and the feedback control $\varphi_0(t_k, q^k_0 + \Delta q^k_0)$ is given to the process. Hence, the feedback delay reduces to the remaining solution time of the QP after preparation. The affine–linear dependence of this QP on $x_0$ via $\Lambda$ can further be exploited as described in Section 4.4.

Transition. The full variables vector $\Delta w^k = (\Delta q^k, \Delta s^k)$ is computed. The SQP step (10) is performed to obtain the new set $(w^{k+1}, \lambda^{k+1}, \mu^{k+1})$ of NLP variables.
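The separation into three phases can be mimicked on a deliberately tiny example. The Python sketch below assumes a single shooting interval ($N = 1$), scalar state and control, no inequality constraints, and the QP objective $\tfrac12 (r\,\Delta q^2 + p\,\Delta s_1^2) + g_q \Delta q + g_s \Delta s_1$; all function names and numbers are hypothetical. In this unconstrained setting the mapping $x_0 \mapsto \Delta q_0$ is affine, so the preparation phase can compute its offset and gain before $x_0$ is known.

```python
def prepare(Gs, Gq, c1, s0, r, p, gq, gs):
    """Preparation: uses linearization data only, not the current state x0.

    Eliminates ds0 = x0 - s0 and ds1 = Gs*ds0 + Gq*dq + c1 and returns the
    offset and gain of the affine feedback map x0 -> dq0.
    """
    denom = r + p * Gq * Gq
    gain = -p * Gq * Gs / denom
    offset = -(gq + gs * Gq + p * Gq * (c1 - Gs * s0)) / denom
    return offset, gain

def feedback(offset, gain, x0):
    """Feedback: a single multiply-add once x0 has been measured."""
    return offset + gain * x0

def transition(Gs, Gq, c1, s0, x0, dq0):
    """Transition: recover the remaining step components for the SQP update."""
    ds0 = x0 - s0
    ds1 = Gs * ds0 + Gq * dq0 + c1
    return ds0, ds1

# Hypothetical linearization data of the previous iterate.
Gs, Gq, c1, s0 = 0.8, 0.5, 0.1, 1.0
r, p, gq, gs = 1.0, 2.0, 0.2, -0.4

offset, gain = prepare(Gs, Gq, c1, s0, r, p, gq, gs)   # before x0 arrives
x0 = 1.3                                                # measured state
dq0 = feedback(offset, gain, x0)                        # fast response
ds0, ds1 = transition(Gs, Gq, c1, s0, x0, dq0)

# Stationarity of the reduced QP in dq0, used as a consistency check.
reduced_gradient = r * dq0 + p * Gq * ds1 + gq + gs * Gq
```

With inequality constraints the map becomes piecewise affine, and the feedback phase requires the solution of a (prepared) QP rather than one multiply-add.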

4.4. Parametric Quadratic Programming

Both the structured NLP (7) and the QP subproblems (9) show a linear parametric dependence on $x_0$. This is favorably exploited by parametric active set methods, cf. [28, 29, 30]. The idea here is to introduce a linear affine homotopy in a scalar parameter $\tau \in [0, 1] \subset \mathbb{R}$ from the QP that was solved in iteration $k - 1$ to the QP to be solved in iteration $k$:

$$
\begin{aligned}
\min_{\Delta w} \quad & \tfrac{1}{2} \Delta w^T B \Delta w + b^T(\tau) \Delta w && \text{(19a)} \\
\text{s.t.} \quad & 0 = C \Delta w + c(\tau) + \Lambda x_0(\tau), && \text{(19b)} \\
& 0 \le D \Delta w + d(\tau), && \text{(19c)}
\end{aligned}
$$

with initial values $x_0(0) = x_0^{k-1}$, $x_0(1) = x_0^k$. The transition from the old to the new QP data is realized by gradient and constraint right hand sides $b \in \mathcal{H}^{n^w}$, $c \in \mathcal{H}^{n^c}$, $d \in \mathcal{H}^{n^d}$ that are affine in $\tau$,

$$
\mathcal{H}^n = \{\phi : \mathbb{R} \to \mathbb{R}^n \mid \phi(\tau) = (1 - \tau)\phi^{k-1} + \tau\phi^k,\; \tau \in [0, 1]\}. \tag{20a}
$$

Moreover, from the optimality conditions of QP (19) in $\tau = 0$ and $\tau = 1$ it can be seen that an update of the QP's matrices $B$, $C$, $D$ is also possible without having to introduce a matrix–valued homotopy, see e.g. [38].

Using this approach to compute the SQP algorithm's steps has multiple advantages. First, a phase one for finding a feasible point of the QP is unnecessary, as we can start the homotopy in a trivial QP with zero vectors and known optimal solution. Second, we can monitor the process of solving the QP using the distance to the homotopy end. Intermediate iterates are physically meaningful, and are optimal for a known QP of the homotopy. Thus, intermediate control feedback can be given during the ongoing solution process. Furthermore, this property of the intermediate iterates also gives rise to online algorithms which fix the maximum number of active set changes to meet hard real–time constraints, cf. [29, 30].
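The homotopy idea can be illustrated on a one-dimensional QP with a single inequality constraint. The Python sketch below is a toy illustration only and does not reproduce the parametric active set methods of [28, 29, 30]: the problem $\min \tfrac12 B\,\Delta w^2 + b(\tau)\Delta w$ subject to $0 \le \Delta w + d(\tau)$, the helper names, and the data are hypothetical. It locates the value of $\tau$ at which the active set changes and shows that the solution path is piecewise affine in $\tau$.

```python
def affine(v0, v1, tau):
    """Affine interpolation (1 - tau) * v0 + tau * v1, cf. (20a)."""
    return (1.0 - tau) * v0 + tau * v1

def solution_at(B, b, d):
    """Solution of min 0.5*B*dw^2 + b*dw s.t. 0 <= dw + d (B > 0)."""
    return max(-b / B, -d)

def trace_homotopy(B, b0, b1, d0, d1):
    """Follow the solution from tau = 0 to tau = 1.

    g(tau) = b(tau) - B*d(tau) is affine in tau; the constraint is active
    exactly where g(tau) > 0, so at most one active set change occurs.
    Returns the list of breakpoints and the solution at tau = 1.
    """
    g0, g1 = b0 - B * d0, b1 - B * d1
    breakpoints = []
    if g0 * g1 < 0.0:
        breakpoints.append(g0 / (g0 - g1))   # root of the affine function g
    dw_end = -d1 if g1 > 0.0 else -b1 / B
    return breakpoints, dw_end

# Hypothetical data: constraint inactive at tau = 0, active at tau = 1.
B, b0, b1, d0, d1 = 1.0, -1.0, 3.0, 2.0, 1.0
breakpoints, dw_end = trace_homotopy(B, b0, b1, d0, d1)

# At the breakpoint both candidate solutions coincide (continuity of the path).
tau_star = breakpoints[0]
free_branch = -affine(b0, b1, tau_star) / B
active_branch = -affine(d0, d1, tau_star)
```

Every point on the path is optimal for the QP with data $b(\tau)$, $d(\tau)$, which is the property that permits intermediate feedback when the homotopy is stopped early.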

5. Variants of Structure Exploitation

The QPs (9, 19) exhibit a highly ordered block structure that is due to the direct multiple shooting discretization of problem (1). As efficient numerical algorithms will have to exploit this structure, we describe in this section three algorithmic variants that serve this purpose given different characteristics of the problem.

5.1. Generic Sparse Solvers

If QP (9) or (19) is to be solved directly in an active set method, factors of the structured symmetric indefinite KKT system must be obtained after every active set change. A straightforward approach is to use a sparse $LBL^T$ decomposition as available e.g. with the highly efficient implementation MA57 [31] available from HSL. An alternative ignoring symmetry is an LU decomposition (e.g. through UMFPACK [32]) as proposed in [33]. Sufficient sparsity of the QP blocks is required for this approach to be efficient, though, and the observable runtime complexity crucially depends on this. In addition, matrix update techniques that provide for cheap recovery of the factorization after an active set change are not available in general.

5.2. Block Structured Factorization

An alternative approach much more tailored to the particular direct multiple shooting structures is a symmetric indefinite block factorization as proposed in [34, 35]. An extensive discussion of the family of such factorizations is available in [36]. In [37] a similar idea has been used in a dual active set method. In [38] the application of such approaches to mixed–integer optimal control problems is discussed, and suitable matrix update techniques are derived. Achievable runtime complexities are in general linear in the number $N$ of shooting intervals, and cubic in $n^x$ and $n^q$, respectively. Update techniques reduce the latter effort to quadratic complexity [37, 39].

5.3. Condensing

The fundamental idea of all so-called condensing algorithms is to pre–process the block structured QP (9) into a smaller but densely populated one before solving it by a dense active set method. Indeed, this is the only structure exploiting approach considered for comparison of run times in [1], and we briefly present it in the following.

The direct multiple shooting discretization creates a particular block structure of the constraint matrix $C^k$ in (9) which is deduced from (7),

$$
C = \left(\;
\overbrace{\begin{matrix}
 & & \\
G^q_0 & & \\
 & \ddots & \\
 & & G^q_{N-1}
\end{matrix}}^{=:\,C_1}
\quad
\overbrace{\begin{matrix}
-I & & & \\
G^s_0 & -I & & \\
 & \ddots & \ddots & \\
 & & G^s_{N-1} & -I
\end{matrix}}^{=:\,C_2}
\;\right). \tag{21}
$$

Herein $G^s_i$ denotes the state sensitivity $\mathrm{d}x(t_{i+1}; t_i, s_i, q_i)/\mathrm{d}s_i$ and $G^q_i$ denotes the parametric sensitivity $\mathrm{d}x(t_{i+1}; t_i, s_i, q_i)/\mathrm{d}q_i$, which both can be computed according to the IND principle of Section 3. The matrices $B^k$, $D^k$ and the vector $b^k$ are partitioned accordingly. Condensing projects problem (9) onto the null space of the linearization of the matching conditions (6) by applying a block Gaussian elimination $M$ to (21) that yields $C' = M C^k$ with

$$
C'_1 = \begin{pmatrix}
0 & & & \\
G^q_0 & & & \\
G^s_1 G^q_0 & G^q_1 & & \\
\vdots & \vdots & \ddots & \\
G^{N-1}_1 G^q_0 & G^{N-1}_2 G^q_1 & \cdots & G^q_{N-1}
\end{pmatrix}, \qquad C'_2 = -I. \tag{22}
$$

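Therein, $G^j_i = G^s_j \cdot \ldots \cdot G^s_i = \prod_{k=i}^{j} G^s_k$ for $j \ge i$, with the factors ordered by decreasing index $k$ from left to right. In the same way we obtain $c' = M c^k$. We have from (22) that $0 = C'_1 \Delta q - \Delta s + c'$. This shape of the constraints lends itself to elimination of $\Delta s$ from the problem by substitution of $\Delta s$ in both the objective and the constraints. This results in the condensed problem

$$
\begin{aligned}
\min_{\Delta q} \quad & \tfrac{1}{2} \Delta q^T B'' \Delta q + \Delta q^T b'' && \text{(23a)} \\
\text{s.t.} \quad & 0 \le D'' \Delta q + d'', && \text{(23b)}
\end{aligned}
$$

wherein

$$
\begin{aligned}
B'' &= B_{11} + B_{12} C'_1 + C'^T_1 B^T_{12} + C'^T_1 B_{22} C'_1, & D'' &= D_1 + D_2 C'_1, && \text{(24a)} \\
b'' &= b_1 + C'^T_1 b_2 - B^T_{12} c' - C'^T_1 B_{22} c', & d'' &= D_2 c' + d. && \text{(24b)}
\end{aligned}
$$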

We refer to e.g. [12] for an extensive derivation of condensing methods also addressing effective reductions for the DAE constrained case.

In a straightforward implementation, the computational effort for condensing is $O(N^3 \cdot (n^x)^3 \cdot (n^q)^2)$. Condensing methods favourably exploit the structural properties of moderately-sized problems that have more system states than controls, i.e. $n^x \gg n^q$. Instead of $n^w = N n^x + (N - 1) n^q$ unknowns, the condensed QP (23) only holds the $n^w_1 = (N - 1) n^q$ controls that would also appear in a single–shooting approach, and eliminates the $n^w_2 = N n^x$ states. Hence, this QP can typically be solved efficiently using a dense active set method, e.g. [41, 29, 30]. The characteristic runtime complexity here is $O((n^w_1)^3)$ in the first active set QP iteration, and $O((n^w_1)^2)$ for all subsequent ones.

Block structured factorizations as mentioned above are more effective for problems with very large control dimensions $n^q$ or very large numbers $N$ of shooting intervals on the prediction horizon, see [36, 40, 38] and Section 6.
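The structure of the condensed constraint matrix can be reproduced with a short Python sketch. It assumes scalar blocks ($n^x = n^q = 1$) so that plain lists suffice, and the function names and numerical values are hypothetical. The sketch builds $C'_1$ and $c'$ row by row from the linearized matching conditions $\Delta s_{i+1} = G^s_i \Delta s_i + G^q_i \Delta q_i + c_{i+1}$ and checks that $\Delta s = C'_1 \Delta q + c'$ reproduces the direct recursion.

```python
def condense(Gs, Gq, c):
    """Build dense C1' ((N+1) x N) and c' for scalar blocks.

    Gs, Gq: sensitivities G^s_i, G^q_i for 0 <= i < N.
    c:      N+1 residuals; c[0] belongs to the initial value constraint.
    """
    N = len(Gq)
    C1 = [[0.0] * N for _ in range(N + 1)]
    cp = [c[0]] + [0.0] * N
    for i in range(N):
        # Row i+1 is G^s_i times row i, plus the new entry G^q_i.
        for j in range(i):
            C1[i + 1][j] = Gs[i] * C1[i][j]
        C1[i + 1][i] = Gq[i]
        cp[i + 1] = Gs[i] * cp[i] + c[i + 1]
    return C1, cp

def states_by_recursion(Gs, Gq, c, dq):
    """Reference: propagate ds through the linearized matching conditions."""
    ds = [c[0]]
    for i in range(len(Gq)):
        ds.append(Gs[i] * ds[i] + Gq[i] * dq[i] + c[i + 1])
    return ds

# Hypothetical data for N = 3 shooting intervals.
Gs = [0.9, 1.1, 0.8]
Gq = [0.5, -0.3, 0.2]
c = [0.1, -0.2, 0.05, 0.3]
dq = [1.0, -2.0, 0.5]

C1, cp = condense(Gs, Gq, c)
ds_condensed = [sum(C1[r][j] * dq[j] for j in range(3)) + cp[r]
                for r in range(4)]
ds_reference = states_by_recursion(Gs, Gq, c, dq)
```

For matrix-valued blocks the inner loop becomes a sequence of matrix products, which is where the cost of matrix condensing accumulates as $N$ grows.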

5.4. Vector Condensing for Adjoint SQP

In Section 2.5 we have sketched a variant of the SQP method that works on approximations of both the constraint linearizations and the Hessian of the Lagrangian. This fact can be exploited to speed up the condensing algorithm considerably.

Prior to the first iteration, we initialize the adjoint SQP method with matrices $B$, $C$, $D$ of our choice, e.g. the exact Hessian and Jacobians of the steady state solution, computed off–line. These matrices are henceforth kept fixed and serve as (freely available) approximations for all subsequent iterations. Online, only a single cheap adjoint derivative is then required to compute the modified gradient compensating for this approximation.

Moreover, in (24) only the computation of the vector values $b''$ and $d''$ from the modified gradient $\tilde{b}$ and the constraint residuals $c$, $d$ is required. The effort here is only $O(N \cdot (n^x)^2 \cdot n^q)$ and hence this method promises to be drastically faster than the full matrix condensing of Section 5.3.
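One possible way to realize such vector-only operations is to never form $C'_1$ at all. The Python sketch below is an illustration under assumptions, not the authors' algorithm: it again uses scalar blocks and hypothetical names and data, and shows a forward recursion for $c'$ and a backward recursion for products of the form $C'^T_1 v$, each with a cost that is linear in $N$. A dense reference matrix is built only to check the result.

```python
def condensed_residual(Gs, c):
    """Forward recursion c'_0 = c_0, c'_{i+1} = G^s_i * c'_i + c_{i+1}."""
    cp = [c[0]]
    for i in range(len(Gs)):
        cp.append(Gs[i] * cp[i] + c[i + 1])
    return cp

def condensed_transpose_product(Gs, Gq, v):
    """Return C1'^T v without forming C1' (v has one entry per node)."""
    N = len(Gq)
    out = [0.0] * N
    a = 0.0
    for j in reversed(range(N)):
        # a_{j+1} = v_{j+1} + G^s_{j+1} * a_{j+2}, with a_{N+1} := 0.
        a = v[j + 1] + (Gs[j + 1] * a if j + 1 < N else 0.0)
        out[j] = Gq[j] * a
    return out

def dense_C1(Gs, Gq):
    """Dense reference matrix, used here only for verification."""
    N = len(Gq)
    C1 = [[0.0] * N for _ in range(N + 1)]
    for i in range(N):
        for j in range(i):
            C1[i + 1][j] = Gs[i] * C1[i][j]
        C1[i + 1][i] = Gq[i]
    return C1

# Hypothetical data for N = 3 shooting intervals.
Gs = [0.9, 1.1, 0.8]
Gq = [0.5, -0.3, 0.2]
c = [0.1, -0.2, 0.05, 0.3]
v = [1.0, 2.0, -1.0, 0.5]

cp = condensed_residual(Gs, c)
matrix_free = condensed_transpose_product(Gs, Gq, v)
C1 = dense_C1(Gs, Gq)
reference = [sum(C1[r][j] * v[r] for r in range(4)) for j in range(3)]
```

Both recursions touch each shooting interval once, in contrast to the dense construction, whose number of entries alone grows quadratically in $N$.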


6. Case Studies

In this section we consider three case studies, two of which were also addressed in [1]. These comprise a nonlinear batch process, a continuously stirred tank reactor (CSTR) originally due to [42] and treated by e.g. [43, 44, 45] and [1], and a motion control problem for a chain of masses connected by springs, see e.g. [2]. To each of the problems we apply

• the full–space SQP method with computation of exact Jacobians in each NMPC iteration as described in Section 2.4 and [4, 26, 12]. Here, the Hessian is computed by either an L–BFGS (economic NMPC) or a Gauß–Newton (tracking NMPC) approximation.

• the adjoint SQP method with cheap adjoint–mode computation of the modified gradient as described in Section 2.5 and [48, 2, 46, 47]. Here, the Hessian is computed in the off–line optimal solution in advance, and is kept fixed for all adjoint SQP iterations.

We use Fehlberg's 4th/5th order RK scheme with six stages, and compute sensitivities according to the IND principle. We set up and solve the structured QP subproblem (9)

• using a parametric active set method with a sparse $LBL^T$ factorization of the KKT system in the above methods, computed by the generic sparse linear algebra package MA57 described in [31], see Section 5.1;

• alternatively using a parametric active set method with a tailored block factorization [36, 38, 40] for the arising KKT systems, see Section 5.2;

• or alternatively by condensing of QP (9), see [16, 12], and solution of the condensed QP (23) by the parametric active set method qpOASES 3.0 [29, 30], see Section 5.3.

  For full–space SQP this requires matrix condensing. This is the only approach considered in [1] for comparison to the CFE approach presented there.

  For adjoint SQP, after initialization with set–point matrices, we only need the cheap vector condensing step (24b), cf. Section 5.4.

Our computing platform is one core of an Intel Core i7 940 machine running at 2.67 GHz. Run times quoted from [1] are for an Intel P4 at 3.00 GHz.

6.1. Nonlinear Batch

The first problem considered for case study is a simplified chemical batch reactor with nonlinear dynamics. The yield $x_2$ after one hour of operation is to be maximized by suitable control of a reactor temperature profile $u$. The problem formulation on a time horizon $t \in [0, 1]$ is given as follows:

$$
\begin{aligned}
\min_{x(\cdot),\,u(\cdot)} \quad & -x_2(1) && \text{(25a)} \\
\text{s.t.} \quad & \dot{x}_1(t) = -\bigl(u(t) + p\,u^2(t)\bigr)\, x_1(t), && \text{(25b)} \\
& \dot{x}_2(t) = u(t)\, x_1(t), && \text{(25c)} \\
& x_1(0) = 1, \quad x_2(0) = 0, && \text{(25d)} \\
& 0 \le x_1(t), \quad 0 \le 1 - x_2(t), \quad 0 \le u(t) \le 5, && \text{(25e)}
\end{aligned}
$$

where $p = \tfrac{1}{2}$. Figure 1 depicts state trajectories and piecewise constant optimal control trajectories for a discretization of $N = 160$ shooting intervals.
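For orientation, the dynamics (25b)–(25d) can be simulated for a given piecewise constant control profile with a few lines of Python. The sketch below is a plain forward simulation, not an optimization: the fixed-step RK4 integrator, the function names, and the number of RK steps per interval are assumptions, and the constant control $u \equiv 1$ is used only as an example input.

```python
import math

P = 0.5  # model parameter p of (25)

def rhs(x1, x2, u):
    """Right-hand side of (25b), (25c)."""
    return -(u + P * u * u) * x1, u * x1

def simulate(controls, rk_steps=10):
    """Integrate (25b)-(25d) on [0, 1] with piecewise constant controls.

    Uses fixed-step classical RK4 with rk_steps steps per control interval.
    """
    h = 1.0 / (len(controls) * rk_steps)
    x1, x2 = 1.0, 0.0                      # initial values (25d)
    for u in controls:
        for _ in range(rk_steps):
            a1, a2 = rhs(x1, x2, u)
            b1, b2 = rhs(x1 + 0.5 * h * a1, x2 + 0.5 * h * a2, u)
            c1, c2 = rhs(x1 + 0.5 * h * b1, x2 + 0.5 * h * b2, u)
            d1, d2 = rhs(x1 + h * c1, x2 + h * c2, u)
            x1 += h * (a1 + 2.0 * b1 + 2.0 * c1 + d1) / 6.0
            x2 += h * (a2 + 2.0 * b2 + 2.0 * c2 + d2) / 6.0
    return x1, x2

# Constant control u = 1 on N = 5 intervals.
x1_end, x2_end = simulate([1.0] * 5)
objective = -x2_end                        # value of (25a) for this control
```

For constant $u$ the model is linear in $x$, so the result can be compared with the closed-form solution $x_1(1) = e^{-(u + p u^2)}$ and $x_2(1) = u\,(1 - x_1(1))/(u + p u^2)$.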

[Figure 1, panel (a): Control trajectory u(·), plotted as u over t [h]. Panel (b): State trajectories x1(·), x2(·), plotted as x over t [h].]

Figure 1: Optimal control and state trajectories of problem (25) for N = 160.

Off–line Setting. Table 1 lists problem dimensions and objective function values for increasingly fine discretizations $N$ of the time horizon, and compares run times of the three proposed structure exploiting algorithms for the full space SQP. For each choice of $N$ the number of RK steps was chosen such that a relative accuracy of $10^{-6}$ of the optimal objective is ensured. SQP iterations with an L–BFGS Hessian were performed until a KKT tolerance (cf. [12]) of $10^{-7}$ was satisfied. We initialized all computations with $s_i = (1, 0)$ for $0 \le i \le N$, $q_i = 1$ for $0 \le i \le N - 1$.

| N | n_var | n_con | RK steps K | Objective [×10^−1] | Uncondensed SQP: # It. | Bl. [ms] | Sp. [ms] | Condensed SQP: # It. | Co. [ms] |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 18 | 10 | 5 | −5.68388 | 7 | 3 | 4 | 7 | 3 |
| 10 | 33 | 20 | 4 | −5.72243 | 9 | 6 | 6 | 9 | 5 |
| 20 | 63 | 40 | 3 | −5.73298 | 9 | 9 | 9 | 9 | 9 |
| 40 | 123 | 80 | 2 | −5.73479 | 10 | 13 | 12 | 10 | 17 |
| 80 | 243 | 160 | 1 | −5.73528 | 10 | 22 | 24 | 11 | 57 |
| 160 | 483 | 320 | 1 | −5.73541 | 12 | 54 | 58 | 12 | 341 |
| 320 | 963 | 640 | 1 | −5.73544 | 12 | 152 | 159 | 14 | 2717 |

Table 1: Off–line optimal control: Objective function values, total number of full SQP iterations, and total run times until convergence for problem (25), $n^x = 2$, $n^u = 1$. Sp.: Block structured QP solver using MA57. Bl.: Block structured QP solver using block structured linear algebra. Co.: Matrix condensing and dense QP solver qpOASES.

Discussion. The number of SQP iterations required is nearly independent of $N$; only very slow growth is observed. For the block QP solvers, overall runtime for small $N$ is dominated by IND sensitivity computations, and grows nearly linearly for $N \ge 40$. Matrix condensing (Co.) runtime shows cubic growth, as derived from our presentation in Section 5, and falls behind for $N \ge 80$. There is no significant difference between block structured factorization (Bl.) and generic sparse factorization (Sp.). This can be explained by observing that the number of active set changes stays below 3 in all SQP iterations, such that matrix update techniques available for the block structure factorization cannot play out their full potential.

Even taking slightly differing computational platforms into account, all approaches using our IND principle are clearly at least as fast as the CFE approach, for which run times from 188 ms (N = 5) to 735 ms (N = 160) are reported in [1]. The only exception is the matrix condensing variant, the only one investigated in [1]. Error control is not addressed therein, but errors of the solutions in the range of $10^{-5}$ are reported, such that the quality of the obtained solutions can be assumed to be comparable.

Shrinking Horizon NMPC. In [1] problem (25) has been mentioned as a benchmark for NMPC on long horizons, but no scenario is proposed. The real–world process is simulated by an IVP, starting at t = 0 in x = (1, 0). The controller is initialized in the off–line optimal solution. We consider here a shrinking horizon scenario with a disturbance $\Delta p = +0.7$ at t = 0.5 h for $\Delta t = 0.05$ h. This disturbance is applied to the real–world simulation, and is assumed to be known to the optimizer with a delay of one sampling period.

To realize a shrinking horizon, we fix the controls $q_{N-k}, \ldots, q_{N-1}$ to zero in the k-th RTI, thus keeping the state $s_{N-k}$ unaltered for the remainder of the horizon, since the right-hand sides of (25b) and (25c) vanish for u = 0. The problem's dimensions and block structure remain the same. The advantage here is that for the block structured KKT factorization, the new factorization can be computed from the old one by means of a single matrix update. For vector condensing, we avoid having to recondense when the number of matrix blocks decreases.
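One way to picture the shrinking horizon device described above is to collapse the bounds of the trailing controls to zero while leaving the number of variables untouched. The following Python sketch does this for the control bounds of (25e); the function name and the realization through bounds are assumptions made for illustration, the text only states that the controls are fixed to zero.

```python
def shrinking_horizon_bounds(n_intervals, k, lower=0.0, upper=5.0):
    """Control bounds for the k-th real-time iteration.

    The last k controls q_{N-k}, ..., q_{N-1} are fixed to zero by collapsing
    their bounds; all other controls keep the original bounds of (25e).
    """
    if not 0 <= k <= n_intervals:
        raise ValueError("k must lie between 0 and the number of intervals")
    free = n_intervals - k
    lo = [lower] * free + [0.0] * k
    hi = [upper] * free + [0.0] * k
    return lo, hi


# Usage: bounds for the third real-time iteration with N = 160.
lo, hi = shrinking_horizon_bounds(n_intervals=160, k=3)
```

Because only bounds change from one iteration to the next, the QP keeps its dimensions and block structure, which is the property the factorization update relies on.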

Figure 2 shows the feedback controls and the resulting state trajectories generated by 160 real–time iterations on a shrinking horizon for both the full SQP and the adjoint SQP controller. Towards the end of the horizon, the full SQP controller shows better reaction to the nonlinearity of the process. It achieves a slightly improved yield of $x_2(1) = 5.64731 \cdot 10^{-1}$, compared to $x_2(1) = 5.63029 \cdot 10^{-1}$ for the adjoint SQP controller.

[Figure 2, panel (a): feedback control trajectories u(·) over t [h]. Panel (b): state trajectories x1(·), x2(·) over t [h].]

Figure 2: Shrinking horizon feedback control and state trajectories of problem (25) for N = 160. Red: RTI using full SQP. Blue: RTI using adjoint SQP.

Discussion. For all algorithmic variants proposed, Table 2 shows a detailed analysis of the computational effort of the preparation and feedback phases of the RTI of Section 4. The time spent in the transition phase is negligible in all cases.

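Full SQP controller:

| N | Prep. Sp. [ms] | Prep. Co. [ms] | Feedback Sp. [ms] | Feedback Co. [ms] |
|---|---|---|---|---|
| 160 | 10 | 85 | 2 | < 1 |
| 320 | 31 | 607 | 4 | 3 |
| 640 | 105 | 5026 | 7 | 17 |

Adjoint SQP controller:

| N | Prep. Sp. [ms] | Prep. Co. [ms] | Feedback Sp. [ms] | Feedback Co. [ms] |
|---|---|---|---|---|
| 160 | 9 | 9 | 2 | < 1 |
| 320 | 29 | 28 | 4 | 1 |
| 640 | 99 | 98 | 12 | 13 |

Table 2: NMPC: Average per–iteration run times of preparation and feedback phases in milliseconds for the full SQP controller (first table) and the adjoint SQP controller (second table) on problem (25). Sp.: Sparse QP solver using MA57. Co.: Matrix condensing (full SQP) or vector condensing (adjoint SQP), including runtime of the dense QP solver qpOASES.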

From Table 2 we can again see the cubic complexity of matrix condensing necessary for full SQP. For the adjoint SQP controller, however, the proposed vector condensing drastically reduces the runtime spent in the preparation phase, and is competitive when compared to the sparse QP solver. In addition, the adjoint SQP preparation phase is slightly faster than that of the full SQP. This is due to cheaper sensitivity computation by adjoint IND. This effect will become more evident in the next case studies.
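The cubic growth of matrix condensing can be read off the reported numbers by estimating the exponent of a power law between successive doublings of N. The short Python sketch below does this for the full SQP preparation times of the condensing variant in Table 2; the helper name is hypothetical and the two-point power-law fit is a deliberately crude assumption.

```python
import math


def empirical_order(n_small, t_small, n_large, t_large):
    """Exponent a in t ~ N**a estimated from two (N, run time) pairs."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)


# Preparation phase of matrix condensing (full SQP, column Co. of Table 2).
condensing_prep_ms = {160: 85.0, 320: 607.0, 640: 5026.0}
orders = [
    empirical_order(160, condensing_prep_ms[160], 320, condensing_prep_ms[320]),
    empirical_order(320, condensing_prep_ms[320], 640, condensing_prep_ms[640]),
]
```

Both estimated exponents come out close to three, in line with the complexity stated in the text.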

6.2. Continuous Stirred Tank Reactor

The second case study addresses a continuous stirred tank reactor (CSTR) originally due to [42], here in a variant described by [45] and later considered by e.g. [43, 44]. In this setup, an exothermic reaction of $x_2(\cdot)$ takes place in a liquid of varying level $x_1(\cdot)$ with feed $u_1(\cdot)$, and is controlled by external regulation $u_2(\cdot)$ of the temperature $x_3(\cdot)$. For $t \in [0, 50]$ we strive to minimize a weighted deviation from a given set–point. Parameters in Table 3 are taken from [45, 44], with time in minutes. The set–point is $x_1^s = 0.659$ m, $x_2^s = 877$ mol/m³, $u_1^s = 0.1$ m³/min, $u_2^s = 300$ K, and the least–squares weights are $w_1 = 1$, $w_2 = 10^{-4}$, $w_3 = 10^{5}$, $w_4 = 10^{-1}$.

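\begin{align}
\min_{x(\cdot),\,u(\cdot)} \quad & \int_0^{50} w_1 \left(x_1(t) - x_1^s\right)^2 + w_2 \left(x_2(t) - x_2^s\right)^2 \tag{26a} \\
& \qquad + w_3 \left(u_1(t) - u_1^s\right)^2 + w_4 \left(u_2(t) - u_2^s\right)^2 \, \mathrm{d}t \notag \\
\text{s.t.} \quad & \dot{x}_1(t) = \frac{1}{\pi r^2} \left(F_0 - u_1(t)\right), \tag{26b} \\
& \dot{x}_2(t) = \frac{1}{\pi r^2} \frac{F_0 \left(c_0 - x_2(t)\right)}{x_1(t)} - k_0\, x_2(t)\, e^{-\frac{E}{R} \frac{1}{x_3(t)}}, \tag{26c} \\
& \dot{x}_3(t) = \frac{1}{\pi r^2} \frac{F_0 \left(T_0 - x_3(t)\right)}{x_1(t)} - \frac{\Delta H}{\rho\, C_p}\, k_0\, x_2(t)\, e^{-\frac{E}{R} \frac{1}{x_3(t)}} \tag{26d} \\
& \qquad + \frac{2U}{r \rho\, C_p} \left(u_2(t) - x_3(t)\right), \notag \\
& x_1(0) = x_1^s, \quad x_2(0) = x_2^s, \quad x_3(0) = 324.5\ \text{K}, \tag{26e} \\
& 0.5\ \text{m} \le x_1(t) \le 2.5\ \text{m}, \quad 800\ \text{mol/m}^3 \le x_2(t) \le 1000\ \text{mol/m}^3, \notag \\
& 0.085\ \text{m}^3/\text{min} \le u_1(t) \le 0.115\ \text{m}^3/\text{min}, \quad 299\ \text{K} \le u_2(t) \le 301\ \text{K}. \notag
\end{align}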

Off–Line Run Time Comparison. The off–line scenario investigated in [1] involves a set–point change of the molar concentration for $t \ge 9.0$ min to $c_0 = 1050$ mol/m³. We initialized all NLP variables in the steady–state for $c_0 = 1000$ mol/m³. The total run time for computation of the off–line optimal solution (N = 50 intervals, K = 20 RK steps) using the sparse QP solver is 89 ms.

| Sym. | Unit |
|---|---|
| $x_1$ | m |
| $x_2$ | mol/m³ |
| $x_3$ | K |
| $u_1$ | m³/min |
| $u_2$ | K |

| Sym. | Value | Unit |
|---|---|---|
| $F_0$ | 0.1 | m³/min |
| $T_0$ | 350 | K |
| $c_0$ | 1000 | mol/m³ |
| $r$ | 0.219 | m |
| $k_0$ | $7.2 \times 10^{10}$ | 1/min |
| $E/R$ | 8750 | K |
| $U$ | 54936 | J/(min m² K) |
| $\rho$ | 1000 | kg/m³ |
| $C_p$ | 239 | J/(kg K) |
| $\Delta H$ | −50000 | J/mol |

Table 3: State and control units and parameter values and units for the CSTR model (26).
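As a plausibility check of the CSTR model, the right-hand side of (26b)–(26d) can be evaluated at the set–point with the parameter values of Table 3; the derivatives should then be close to zero. The Python sketch below does this. The function and constant names are hypothetical, and the small residuals that remain are attributed here to the rounding of the published set–point values, which is an assumption.

```python
import math

# Parameter values of Table 3 (time in minutes).
F0, T0, C0 = 0.1, 350.0, 1000.0
R_TANK, K0, E_OVER_R = 0.219, 7.2e10, 8750.0
U_HEAT, RHO, CP, DELTA_H = 54936.0, 1000.0, 239.0, -50000.0


def cstr_rhs(x, u, c0=C0, f0=F0):
    """Right-hand side of (26b)-(26d) for x = (level, concentration, temperature)."""
    x1, x2, x3 = x
    u1, u2 = u
    area = math.pi * R_TANK ** 2
    rate = K0 * x2 * math.exp(-E_OVER_R / x3)
    dx1 = (f0 - u1) / area
    dx2 = f0 * (c0 - x2) / (area * x1) - rate
    dx3 = (
        f0 * (T0 - x3) / (area * x1)
        - DELTA_H / (RHO * CP) * rate
        + 2.0 * U_HEAT / (R_TANK * RHO * CP) * (u2 - x3)
    )
    return dx1, dx2, dx3


# Evaluate at the set-point; the derivatives should be small there.
dx = cstr_rhs((0.659, 877.0, 324.5), (0.1, 300.0))
```

Passing `c0=1050.0` or `f0=0.11` to `cstr_rhs` reproduces the kind of parameter change considered in the off–line and NMPC scenarios of this section.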

Figures 3 and 4 show the corresponding optimal control and state profiles for N = 50 shooting intervals. It is important to note that in this off–line computation, the optimizer anticipates this set–point change.

[Figure 3, panel (a): control trajectory u1(·) in m³/min over t [min]. Panel (b): control trajectory u2(·) in K over t [min].]

Figure 3: Optimal control trajectories of problem (26) for N = 50.

[Figure 4, panel (a): state trajectory x1(·) in m over t [min]. Panel (b): state trajectory x2(·) in mol/m³ over t [min]. Panel (c): state trajectory x3(·) in K over t [min].]

Figure 4: Optimal state trajectories of problem (26) for N = 50.

The problem treated in [1] is very similar, although parameter values and units appear to have been mixed up in writing, and the initialization used there is unknown. For comparison, [1] reports a run time of 984 ms for the CFE approach computing the off–line solution, roughly 10 times slower than our approach. In addition, conversion of the values of [45] yields U = 54,750 J/(min m² K), whereas we use the value U = 54,936 J/(min m² K) as done in [1, 44].

Adaptivity in IND. The CSTR dynamics show more nonlinear features in the optimal solution than the batch reactor does. Therefore, we use this example to demonstrate one feature of the IND approach to sensitivity generation, namely the ability to easily choose an adaptive discretization by using a local error detection and control facility in the RK discretization scheme. This feature bears potential for both computational speedups and increased precision of the obtained optimal solutions, and sets the IND approach apart from the CFE approach presented in [1].

| N | Fixed step RK, K = 2 | K = 4 | K = 8 | K = 16 | K = 32 | Adaptive RK, øK | Adaptive RK, Objective |
|---|---|---|---|---|---|---|---|
| 10 | — | — | 0.91899 | 0.92882 | 0.92651 | 28.7 | 0.92691 |
| 20 | — | 0.90168 | 0.90908 | 0.90738 | 0.90781 | 15.1 | 0.90766 |
| 40 | 0.89613 | 0.90382 | 0.90198 | 0.90247 | 0.90194 | 8.0 | 0.90230 |
| 80 | 0.90276 | 0.90093 | 0.90141 | 0.90089 | 0.90133 | 4.6 | 0.90124 |
| 160 | 0.90064 | 0.90112 | 0.90060 | 0.90105 | 0.90093 | 2.8 | 0.90095 |

Table 4: Optimal objective function values for adaptive choice of variable–size RK steps to satisfy a relative local error tolerance of $10^{-8}$ (IND only, columns 7 and 8). The fixed–step RK schemes (CFE and IND, columns 2 to 6) do not lead to an accuracy of five significant digits in the objective for any of the listed choices of the number K of equidistant steps.

Discussion. Table 4 shows optimal objective function values computed for the CSTR example (26) for increasingly fine choices of the number of shooting intervals N and the number of equidistant RK steps K per shooting interval. The results are compared to those obtained for an adaptive step size choice based on a Fehlberg type estimator [19] for the local truncation error of the RK scheme, which is required to stay below $10^{-8}$. Here, Table 4 lists the average number øK of RK steps per shooting interval. The corresponding optimal objective function values are correct up to five decimal digits. The fully automatic choice of steps leads to small step sizes where necessary and large steps where possible. As is evident, a significantly lower total number of steps is taken, and this saves computation time. In addition, the precision is unrivaled by any of the optimization runs carried out with comparable fixed step counts.
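The mechanism behind the adaptive column of Table 4 can be illustrated with an embedded RK pair and a standard step size controller. The Python sketch below uses the classical Fehlberg 4(5) coefficients as a stand-in; the pair actually used for the reported results, the controller constants (safety factor 0.9, growth limits 0.2 and 5), and all function names are assumptions of this illustration, and no sensitivity generation is included.

```python
# Classical Fehlberg 4(5) tableau.
C = (0.0, 1 / 4, 3 / 8, 12 / 13, 1.0, 1 / 2)
A = (
    (),
    (1 / 4,),
    (3 / 32, 9 / 32),
    (1932 / 2197, -7200 / 2197, 7296 / 2197),
    (439 / 216, -8.0, 3680 / 513, -845 / 4104),
    (-8 / 27, 2.0, -3544 / 2565, 1859 / 4104, -11 / 40),
)
B4 = (25 / 216, 0.0, 1408 / 2565, 2197 / 4104, -1 / 5, 0.0)
B5 = (16 / 135, 0.0, 6656 / 12825, 28561 / 56430, -9 / 50, 2 / 55)


def rkf45_step(f, t, y, h):
    """One embedded step: 4th-order result and a relative error estimate."""
    dim = range(len(y))
    k = []
    for ci, ai in zip(C, A):
        yi = [y[j] + h * sum(a * kj[j] for a, kj in zip(ai, k)) for j in dim]
        k.append(f(t + ci * h, yi))
    y4 = [y[j] + h * sum(b * kj[j] for b, kj in zip(B4, k)) for j in dim]
    y5 = [y[j] + h * sum(b * kj[j] for b, kj in zip(B5, k)) for j in dim]
    err = max(abs(a - b) / max(1.0, abs(a)) for a, b in zip(y5, y4))
    return y4, err


def integrate_adaptive(f, t0, t1, y0, tol=1e-8, h0=0.1):
    """Integrate from t0 to t1; returns the final state and accepted step count."""
    t, y, h, steps = t0, list(y0), h0, 0
    while t1 - t > 1e-14:
        h = min(h, t1 - t)
        y_new, err = rkf45_step(f, t, y, h)
        if err <= tol:  # accept the step, otherwise retry with smaller h
            t, y, steps = t + h, y_new, steps + 1
        factor = 5.0 if err == 0.0 else 0.9 * (tol / err) ** 0.2
        h *= min(5.0, max(0.2, factor))
    return y, steps


# Usage on the scalar test equation y' = -y, y(0) = 1, integrated to t = 1.
y_end, n_steps = integrate_adaptive(lambda t, y: [-y[0]], 0.0, 1.0, [1.0])
```

The accepted step count returned by `integrate_adaptive` plays the role of the per-interval step number that is averaged into øK.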


Moving Horizon NMPC. We consider the set–point change of [44] to $F_0 = 0.11$ m³/min at t = 5 min, and use a prediction horizon of 5 min length. Note that again the set–point change is not anticipated by the optimizer, but is only assumed to be known one sample time after it happened. On startup, the process is assumed to be in steady–state. Least–squares objective weights proposed in [1] are not suitable for steady–state tracking by NMPC, as the control regularization is too strong. We instead propose to choose $w_1 = 1$, $w_2 = 10^{-4}$, $w_3 = 10^{-8}$, $w_4 = 10^{-4}$, realizing a tracking objective for $x_1^s$, $x_2^s$ with a reasonably small control regularization.

Feedback control and state trajectories around the set–point change are shown in Figures 5 and 6 for both controllers.

[Figure 5, panel (a): control trajectory u1(·) over t [min]. Panel (b): control trajectory u2(·) over t [min].]

Figure 5: Moving horizon NMPC feedback control trajectories of problem (26) for N = 160. Red: RTI using full SQP. Blue: RTI using adjoint SQP.

[Figure 6, panel (a): state trajectory x1(·) over t [min]. Panel (b): state trajectory x2(·) over t [min]. Panel (c): state trajectory x3(·) over t [min].]

Figure 6: Moving horizon NMPC state trajectories of problem (26) for N = 160. Red: RTI using full SQP. Blue: RTI using adjoint SQP.

Discussion. Table 5 presents run times for preparation and feedback phases of the RTI scheme for both the full SQP and the adjoint SQP controller. As before, matrix condensing falls behind as the number N of shooting intervals increases, but in the preparation phase the vector condensing proposed for the adjoint SQP controller matches the performance of the sparse QP solver. The feedback phase, however, is considerably slower also for vector condensing. This is explained by the small number of $n_x = 3$ differential states that can be eliminated in the condensed QP. The advantages of condensing and adjoint IND will become more evident for problems with larger state space dimension, such as in the third case study to be presented next.

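Full SQP controller:

| N | Prepar. Sp. [ms] | Prepar. Co. [ms] | Feedback Sp. [ms] | Feedback Co. [ms] |
|---|---|---|---|---|
| 80 | 10 | 25 | 1 | 1 |
| 160 | 24 | 297 | 2 | 4 |
| 320 | 66 | 2162 | 4 | 31 |
| 640 | 192 | 17744 | 9 | 193 |

Adjoint SQP controller:

| N | Prepar. Sp. [ms] | Prepar. Co. [ms] | Feedback Sp. [ms] | Feedback Co. [ms] |
|---|---|---|---|---|
| 80 | 8 | 8 | 1 | 1 |
| 160 | 20 | 20 | 2 | 2 |
| 320 | 56 | 55 | 4 | 12 |
| 640 | 172 | 169 | 9 | 54 |

Table 5: NMPC: Average per–iteration run times in milliseconds for the full SQP controller (first table) and the adjoint SQP controller (second table) on problem (26), $n_x = 3$, $n_u = 2$. Sp.: Sparse QP solver using MA57. Co.: Dense QP solver qpOASES, including runtime of matrix condensing (full SQP) or vector condensing (adjoint SQP).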

6.3. Chain Problem

The third case study involves a motion control problem for a chain of n + 1 point masses connected by springs and subject to gravity, see [2]. The point mass positions are denoted by $x_i(t) \in \mathbb{R}^3$ and velocities by $v_i(t) \in \mathbb{R}^3$. The first point mass is fixed at the origin. Starting in $x_i(0) = (7.5\,i/n, 0, 0)$, $v_i(0) = (0, 0, 0)$, $0 \le i \le n$, the velocity $v_n(t)$ of the final point mass is to be controlled by $u(t) \in \mathbb{R}^3$ such that this energy–conserving system returns to rest in 40 s.

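\begin{align}
\min_{x(\cdot),\,u(\cdot)} \quad & \int_0^{40} w_v \sum_{i=1}^{n} \|v_i(t)\|_2^2 + w_x \|x_n(t) - x_e\|_2^2 + w_u \|u(t)\|_2^2 \, \mathrm{d}t \tag{27a} \\
\text{s.t.} \quad & \dot{x}_i(t) = v_i(t), \quad 1 \le i \le n-1, \tag{27b} \\
& \dot{v}_i(t) = \left(F_{i+1}(t) - F_i(t)\right) \cdot n/m - g, \quad 1 \le i \le n-1, \tag{27c} \\
& \dot{x}_n(t) = u(t), \tag{27d} \\
& u(t) \in [-1, 1]^3. \tag{27e}
\end{align}

Therein for $1 \le i \le n$

\begin{equation}
F_i(t) := \left(x_i(t) - x_{i-1}(t)\right) \cdot k \left(n - l_r / \|x_i(t) - x_{i-1}(t)\|_2\right), \tag{28}
\end{equation}

with $x_0(t) := (0, 0, 0)$ for all $t \in [0, 40]$.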
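The structure of the chain dynamics (27b)–(27d) with the spring forces (28) can be sketched in a few lines of Python. The parameter values are those listed for the chain model in Table 6; the function names and the list-of-vectors data layout are assumptions made for illustration only.

```python
import math

# Parameter values of the chain model (Table 6).
K_SPRING, L_REST, MASS, GRAVITY = 0.1, 0.55, 0.45, (0.0, 0.0, 9.81)


def spring_force(x_i, x_prev, n):
    """Force F_i of (28) between neighbouring point masses."""
    diff = [a - b for a, b in zip(x_i, x_prev)]
    dist = math.sqrt(sum(d * d for d in diff))
    scale = K_SPRING * (n - L_REST / dist)
    return [scale * d for d in diff]


def chain_rhs(positions, velocities, u):
    """Time derivatives for (27b)-(27d).

    positions:  x_1, ..., x_n      (n vectors in R^3)
    velocities: v_1, ..., v_{n-1}  (n - 1 vectors in R^3)
    """
    n = len(positions)
    nodes = [(0.0, 0.0, 0.0)] + list(positions)  # prepend the fixed x_0
    forces = [spring_force(nodes[i], nodes[i - 1], n) for i in range(1, n + 1)]
    # forces[i - 1] holds F_i, so F_{i+1} - F_i is forces[i] - forces[i - 1].
    dv = [
        [(forces[i][c] - forces[i - 1][c]) * n / MASS - GRAVITY[c] for c in range(3)]
        for i in range(1, n)
    ]
    dx = [list(v) for v in velocities] + [list(u)]
    return dx, dv


# Usage: evaluate the dynamics at the initial configuration for n = 15.
n = 15
x_init = [(7.5 * i / n, 0.0, 0.0) for i in range(1, n + 1)]
v_init = [(0.0, 0.0, 0.0)] * (n - 1)
dx, dv = chain_rhs(x_init, v_init, (0.0, 0.0, 0.0))
n_states = 3 * (len(dx) + len(dv))
```

At the equidistant initial configuration all spring forces coincide, so only gravity acts on the interior masses, and the state count evaluates to 87.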


Characteristics and weights are given in Table 6. For a chain of n + 1 = 16 point masses (n = 15) this system has $n_x = 87$ states and $n_u = 3$ controls, and hence is considerably larger than the case studies considered in [1].

| Sym. | Value | Unit |
|---|---|---|
| $g$ | (0, 0, 9.81) | m/s² |
| $k$ | 0.1 | N/m |
| $l_r$ | 0.55 | m |
| $m$ | 0.45 | kg |
| $n$ | 15 | – |
| $x_e$ | (7.5, 0, 0) | m |
| $w_v$ | 0.25 | – |
| $w_x$ | 25 | – |
| $w_u$ | 0.01 | – |

Table 6: Parameter values for the chain model (27).

Off–line Optimal Control. For N = 640 the NLP after discretization has $n_{\text{var}} = 57690$ unknowns and $n_{\text{eq}} = 55680$ equality constraints. The off–line solution is obtained after a run time of 2 minutes using the full SQP algorithm with Gauss–Newton approximation of the Hessian, and using the block structured QP solver. Off–line optimal control trajectories that bring the chain to rest are shown in Figure 7 for granularities N = 80, 160, 320, 640 of the control. Clearly, a sufficiently fine control discretization is required to this end: For N = 80 the chain could not be brought to rest by t = 40 s.
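The problem sizes quoted for the discretized NLP follow a simple counting rule. The Python sketch below uses a rule inferred from the reported numbers, namely $n_x + n_u$ unknowns on each of the N + 1 shooting nodes and $n_x$ matching conditions per interval; this rule is an assumption of the illustration and is not stated explicitly in the text.

```python
def nlp_dimensions(n_intervals, n_states, n_controls):
    """Variable and matching-condition counts of the multiple shooting NLP.

    Assumes (n_states + n_controls) unknowns on each of the N + 1 shooting
    nodes and n_states matching conditions per shooting interval.
    """
    n_var = (n_intervals + 1) * (n_states + n_controls)
    n_eq = n_intervals * n_states
    return n_var, n_eq


chain_dims = nlp_dimensions(640, 87, 3)   # chain problem (27), N = 640
batch_dims = nlp_dimensions(160, 2, 1)    # batch reactor (25), N = 160
```

Under this assumption the rule reproduces both the chain problem sizes for N = 640 and the batch reactor dimensions reported for N = 160 in Table 1.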

[Figure 7, panel (a): control trajectories ux(·), uz(·) in m over t [s]. Panel (b): point mass positions (xx [m], xz [m]) for N = 80 (red) and N = 160 (blue) at t = 40.]

Figure 7: Off–line optimal controls for the chain problem (27) (left) for N = 80 (red), 160 (blue), 320 (green), 640 (black), and ultimate point mass positions ($x_{i,x}(40)$, $x_{i,z}(40)$).

NMPC Scenario. We now consider the above problem as an NMPC problem on a prediction horizon of 8 s discretized with N = 40, 80, or 160 shooting intervals, and running for 50 s. Table 7 presents run times for both proposed controllers and all three approaches at structure exploitation. Clearly, the adjoint SQP controller relying on adjoint IND sensitivity generation now delivers the better performance by a significant margin. Coupled with the proposed vector condensing approach, feedback delays in the low millisecond range are possible even for this larger system with 87 states. Second to vector condensing comes the dedicated block structured QP solver. For N = 40, the adjoint SQP controller is real–time feasible (8 s / 40 = 200 ms sampling time) using both approaches to structure exploitation, and is almost real–time feasible for N = 80 (100 ms sampling time). The sparse solver relying on MA57 now falls behind in the feedback phase due to lack of sufficient sparsity in the systems to be solved.
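The real–time feasibility statement above amounts to comparing the per–iteration effort with the sampling time. The Python sketch below performs this arithmetic for the adjoint SQP controller using the run times reported in Table 7. The feasibility criterion used here, preparation plus feedback time not exceeding one sampling period, and the function names are assumptions of this illustration.

```python
def sampling_time_ms(horizon_s, n_intervals):
    """Sampling period in milliseconds for an equidistant control grid."""
    return 1000.0 * horizon_s / n_intervals


def is_real_time_feasible(horizon_s, n_intervals, prep_ms, feedback_ms):
    """True if one preparation plus one feedback phase fit into a sampling period."""
    return prep_ms + feedback_ms <= sampling_time_ms(horizon_s, n_intervals)


# Adjoint SQP controller, (preparation, feedback) times in ms from Table 7.
adjoint_block = {40: (56, 12), 80: (116, 28)}
adjoint_condensing = {40: (59, 1), 80: (123, 3)}
feasible = {
    (name, n): is_real_time_feasible(8.0, n, *times)
    for name, table in (("Bl.", adjoint_block), ("Co.", adjoint_condensing))
    for n, times in table.items()
}
```

With these numbers both variants fit into the 200 ms period for N = 40, while for N = 80 the totals of 144 ms and 126 ms exceed the 100 ms period by a moderate amount.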

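Full SQP controller:

| N | Prep. Sp. [ms] | Prep. Bl. [ms] | Prep. Co. [ms] | Feedback Sp. [ms] | Feedback Bl. [ms] | Feedback Co. [ms] |
|---|---|---|---|---|---|---|
| 40 | 229 | 228 | 1650 | 218 | 15 | 9 |
| 80 | 455 | 454 | 7456 | 807 | 39 | 29 |
| 160 | 916 | 911 | 38870 | 1767 | 85 | 161 |

Adjoint SQP controller:

| N | Prep. Sp. [ms] | Prep. Bl. [ms] | Prep. Co. [ms] | Feedback Sp. [ms] | Feedback Bl. [ms] | Feedback Co. [ms] |
|---|---|---|---|---|---|---|
| 40 | 51 | 56 | 59 | 209 | 12 | 1 |
| 80 | 108 | 116 | 123 | 531 | 28 | 3 |
| 160 | 220 | 221 | 249 | 1822 | 89 | 14 |

Table 7: NMPC: Average per–iteration run times in milliseconds for the full SQP controller (first table) and the adjoint SQP controller (second table) on problem (27), $n_x = 87$, $n_u = 3$. Sp.: Sparse QP solver using MA57. Bl.: Block structured QP solver. Co.: Dense QP solver qpOASES, including runtime of matrix condensing (full SQP) or vector condensing (adjoint SQP).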

7. Summary and Conclusions

In this paper we addressed fast numerical methods for direct multiple shooting based NMPC of dynamic process control problems with long prediction horizons. We presented two SQP methods, one using full Jacobian information and one based on approximate Jacobians and compensation through a modified gradient. Both methods crucially rely on efficient and accurate computation of forward or adjoint derivative information.

To this end, we reviewed the fundamental principle of IND using the example of explicit one–step methods for ODE process models. The direct multiple shooting discretization induces a block structured QP subproblem from which approximate feedback controls are computed. For use in active set QP solvers we presented three variants of problem structure exploitation. These comprise sparse linear algebra, block structured linear algebra, and a condensing preprocessing step. For the adjoint SQP method we proposed a matrix free condensing step that has a significant runtime complexity advantage.

To evaluate the relative merits of each of the proposed direct multiple shooting frameworks, we considered three benchmark problems recently addressed in the literature. We computed off–line optimal solutions for set–point change or disturbance scenarios, and also treated these scenarios in a simulated NMPC setting. We provided detailed insight into the achieved run times, possible sampling rates, and feedback delays. Here, we found the proposed adjoint SQP method combined with vector condensing to perform best by a wide margin for systems with larger state space dimensions.

We also carried out a comparison of the IND principle to a recently proposed CFE based sensitivity generation method. Including adaptivity into the discretization scheme is easily possible in the IND approach. We showed that throughout all presented computations, the IND approach is considerably faster than CFE.

Both the full SQP and the adjoint SQP algorithm considered in this paper treat optimality conditions of the underlying NLP. Further run time speedups are possible, e.g. by using the concept of multi–level iteration schemes as first described in [48].

Acknowledgements

The research leading to these results has received funding from the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement no. FP7-ICT-2009-4 248940. We gratefully acknowledge support by the Heidelberg Graduate School of Mathematical and Computational Methods for the Sciences (HGS MathComp) funded by Deutsche Forschungsgemeinschaft (DFG). The financial support of the DFG in the context of the research cluster “Optimization–based control of chemical processes” is gratefully acknowledged.

References

[1] J. Tamimi, P. Li, A combined approach to nonlinear model predictive control of fast systems, Journal of Process Control 20 (2010) 1092–1102.

[2] L. Wirsching, An SQP Algorithm with Inexact Derivatives for a Direct Multiple Shooting Method for Optimal Control Problems, Diploma thesis, Heidelberg University, 2006.

[3] H. G. Bock, Numerical treatment of inverse problems in chemical reaction kinetics, in: K. Ebert, P. Deuflhard, W. Jäger (Eds.), Modelling of Chemical Reaction Systems, volume 18 of Springer Series in Chemical Physics, Springer, Heidelberg, 1981, pp. 102–125.

[4] H. G. Bock, K. J. Plitt, A Multiple Shooting algorithm for direct solution of optimal control problems, in: Proceedings of the 9th IFAC World Congress, Pergamon Press, Budapest, 1984, pp. 242–247.

[5] D. Leineweber, Efficient reduced SQP methods for the optimization of chemical processes described by large sparse DAE models, volume 613 of Fortschritt-Berichte VDI Reihe 3, Verfahrenstechnik, VDI Verlag, Düsseldorf, 1999.

[6] D. Q. Mayne, J. B. Rawlings, C. V. Rao, P. O. M. Scokaert, Constrained model predictive control: stability and optimality, Automatica 36 (2000) 789–814.

[7] J. B. Rawlings, D. Mayne, Model Predictive Control: Theory and Design, Nob Hill Publishing, LLC, 2009.

[8] C. V. Rao, J. B. Rawlings, D. Q. Mayne, Constrained state estimation for nonlinear discrete-time systems: Stability and moving horizon approximations, IEEE Transactions on Automatic Control 48 (2003) 246–258.

[9] M. Diehl, P. Kühl, H. G. Bock, J. P. Schlöder, Schnelle Algorithmen für die Zustands- und Parameterschätzung auf bewegten Horizonten, Automatisierungstechnik 54 (2006) 602–613.

[10] P. Kühl, M. Diehl, T. Kraus, J. P. Schlöder, H. G. Bock, A real-time algorithm for moving horizon state and parameter estimation, Computers and Chemical Engineering 35 (2011) 71–83.

[11] K. J. Plitt, Ein superlinear konvergentes Mehrzielverfahren zur direkten Berechnung beschränkter optimaler Steuerungen, Diploma thesis, Rheinische Friedrich–Wilhelms–Universität Bonn, 1981.

[12] D. Leineweber, I. Bauer, A. Schäfer, H. G. Bock, J. P. Schlöder, An efficient multiple shooting based reduced SQP strategy for large-scale dynamic process optimization (Parts I and II), Computers and Chemical Engineering 27 (2003) 157–174.

[13] A. Potschka, H. G. Bock, J. P. Schlöder, A minima tracking variant of semi-infinite programming for the treatment of path constraints within direct solution of optimal control problems, Optimization Methods and Software 24 (2009) 237–252.

[14] J. Nocedal, S. Wright, Numerical Optimization, Springer Verlag, Berlin Heidelberg New York, 2nd edition, 2006.

[48] H. G. Bock, M. Diehl, E. A. Kostina, J. P. Schlöder, Constrained Optimal Feedback Control for DAE, in: L. Biegler, O. Ghattas, M. Heinkenschloss, D. Keyes, B. van Bloemen Waanders (Eds.), Real-Time PDE-Constrained Optimization, SIAM, 2007, pp. 3–24.

[16] H. G. Bock, Randwertproblemmethoden zur Parameteridentifizierung in Systemen nichtlinearer Differentialgleichungen, volume 183 of Bonner Mathematische Schriften, Universität Bonn, Bonn, 1987.

[17] C. D. T. Runge, Über die numerische Auflösung von Differentialgleichungen, Mathematische Annalen 46 (1895) 167–178.

[18] M. Kutta, Beitrag zur näherungsweisen Integration totaler Differentialgleichungen, Zeitschrift für Mathematik und Physik 46 (1901) 435–453.

[19] E. Fehlberg, Klassische Runge-Kutta-Formeln fünfter und siebenter Ordnung mit Schrittweiten-Kontrolle, Computing 4 (1969) 93–106.

[20] I. Bauer, H. G. Bock, S. Körkel, J. P. Schlöder, Numerical methods for initial value problems and derivative generation for DAE models with application to optimum experimental design of chemical processes, in: Scientific Computing in Chemical Engineering II, Springer, 1999, pp. 282–289.

[21] L. Petzold, S. Li, Y. Cao, R. Serban, Sensitivity analysis of differential-algebraic equations and partial differential equations, Computers and Chemical Engineering 30 (2006) 1553–1559.

[22] A. Sandu, Reverse automatic differentiation of linear multistep methods, in: T. J. Barth, M. Griebel, D. E. Keyes, R. M. Nieminen, D. Roose, T. Schlick, C. H. Bischof, H. M. Bücker, P. Hovland, U. Naumann, J. Utke (Eds.), Advances in Automatic Differentiation, volume 64 of Lecture Notes in Computational Science and Engineering, Springer Berlin Heidelberg, 2008, pp. 1–12.

[23] J. Albersmeyer, Adjoint based algorithms and numerical methods for sensitivity generation and optimization of large scale dynamic systems, Ph.D. thesis, Heidelberg University, 2010.

[24] W. Enright, D. Higham, B. Owren, W. Sharp, A Survey of the Explicit Runge–Kutta Method, Technical Report 291/94, Department of Computer Science, University of Toronto, Toronto, M5S 1A4, Canada, 1995.

[25] A. Griewank, Evaluating Derivatives, Principles and Techniques of Algorithmic Differentiation, number 19 in Frontiers in Applied Mathematics, SIAM, Philadelphia, 2000.

[26] M. Diehl, H. G. Bock, J. P. Schlöder, R. Findeisen, Z. Nagy, F. Allgöwer, Real-time optimization and nonlinear model predictive control of processes governed by differential-algebraic equations, J. Proc. Contr. 12 (2002) 577–585.

[27] M. Diehl, H. J. Ferreau, N. Haverbeke, Efficient numerical methods for nonlinear MPC and moving horizon estimation, in: L. Magni, D. Raimondo, F. Allgöwer (Eds.), Nonlinear Model Predictive Control, volume 384 of Springer Lecture Notes in Control and Information Sciences, Springer-Verlag, Berlin, Heidelberg, New York, 2009, pp. 391–417.

[28] M. Best, An Algorithm for the Solution of the Parametric Quadratic Programming Problem, Applied Mathematics and Parallel Computing, Physica-Verlag, Heidelberg, pp. 57–76.

[29] H. J. Ferreau, H. G. Bock, M. Diehl, An online active set strategy to overcome the limitations of explicit MPC, International Journal of Robust and Nonlinear Control 18 (2008) 816–830.

[30] H. J. Ferreau, A. Potschka, C. Kirches, The qpOASES website, 2011. http://www.kuleuven.be/optec/software/qpOASES.

[31] I. Duff, MA57 – a code for the solution of sparse symmetric definite and indefinite systems, ACM Transactions on Mathematical Software 30 (2004) 118–144.

[32] T. Davis, Algorithm 832: UMFPACK - an unsymmetric-pattern multifrontal method with a column pre-ordering strategy, ACM Trans. Math. Software 30 (2004) 196–199.

[33] H. Huynh, A Large-Scale Quadratic Programming Solver Based On Block-LU Updates of the KKT System, Ph.D. thesis, Stanford University, 2008.

[34] M. Steinbach, A structured interior point SQP method for nonlinear optimal control problems, in: R. Bulirsch, D. Kraft (Eds.), Computational Optimal Control, volume 115 of International Series of Numerical Mathematics, Birkhäuser, Basel Boston Berlin, 1994, pp. 213–222.

[35] M. Steinbach, Structured interior point SQP methods in optimal control, Zeitschrift für Angewandte Mathematik und Mechanik 76 (1996) 59–62.

[36] M. Steinbach, Fast recursive SQP methods for large-scale optimal control problems, Ph.D. thesis, Heidelberg University, 1995.

[37] R. Bartlett, L. Biegler, QPSchur: A dual, active set, Schur complement method for large-scale and structured convex quadratic programming algorithm, Optimization and Engineering 7 (2006) 5–32.

[38] C. Kirches, Fast numerical methods for mixed–integer nonlinear model–predictive control, Ph.D. thesis, Heidelberg University, 2010.

[39] C. Kirches, H. G. Bock, J. P. Schlöder, S. Sager, A factorization with update procedures for a KKT matrix arising in direct optimal control, Mathematical Programming Computation (2010). (submitted). Available Online: http://www.optimization-online.org/DB_HTML/2009/11/2456.html.

[40] C. Kirches, H. G. Bock, J. P. Schlöder, S. Sager, Block structured quadratic programming for the direct multiple shooting method for optimal control, Optimization Methods and Software 26 (2011) 239–257.

[41] P. Gill, W. Murray, M. Saunders, User’s Guide For QPOPT 1.0: A Fortran Package For Quadratic Programming, 1995.

[42] K.-U. Klatt, S. Engell, Rührkesselreaktor mit Parallel- und Folgereaktion, in: S. Engell (Ed.), Nichtlineare Regelung – Methoden, Werkzeuge, Anwendungen. VDI-Berichte Nr. 1026, VDI-Verlag, Düsseldorf, 1993, pp. 101–108.

[43] M. Diehl, Real-Time Optimization for Large Scale Nonlinear Processes, Ph.D. thesis, Heidelberg University, 2001.

[44] G. Pannocchia, J. B. Rawlings, Disturbance models for offset–free model-predictive control, AIChE Journal 49 (2003) 426–437.

[45] M. Henson, D. Seborg, Nonlinear Process Control, Prentice Hall, Upper Saddle River, NJ, 1st edition, 1997.

[46] L. Wirsching, J. Albersmeyer, P. Kühl, M. Diehl, H. G. Bock, An adjoint-based numerical method for fast nonlinear model predictive control, in: M. Chung, P. Misra (Eds.), Proceedings of the 17th IFAC World Congress, Seoul, Korea, July 6–11, 2008, volume 17, IFAC-PapersOnLine, 2008, pp. 1934–1939.

[47] C. Kirches, L. Wirsching, S. Sager, H. G. Bock, Efficient numerics for nonlinear model predictive control, in: M. Diehl, F. Glineur, E. Jarlebring, W. Michiels (Eds.), Recent Advances in Optimization and its Applications in Engineering, Springer, 2010, pp. 339–359.

[48] H. G. Bock, M. Diehl, E. A. Kostina, J. P. Schlöder, Constrained Optimal Feedback Control for DAE, in: L. Biegler, O. Ghattas, M. Heinkenschloss, D. Keyes, B. van Bloemen Waanders (Eds.), Real-Time PDE-Constrained Optimization, SIAM, 2007, ch. 1, pp. 3–24.
