Lagrange Multiplier Approach to Variational Problems and Applications



Advances in Design and Control

SIAM's Advances in Design and Control series consists of texts and monographs dealing with all areas of design and control and their applications. Topics of interest include shape optimization, multidisciplinary design, trajectory optimization, feedback, and optimal control. The series focuses on the mathematical and computational aspects of engineering design and control that are usable in a wide variety of scientific and engineering disciplines.

Editor-in-Chief
Ralph C. Smith, North Carolina State University

Editorial Board
Athanasios C. Antoulas, Rice University
Siva Banda, Air Force Research Laboratory
Belinda A. Batten, Oregon State University
John Betts, The Boeing Company
Stephen L. Campbell, North Carolina State University
Eugene M. Cliff, Virginia Polytechnic Institute and State University
Michel C. Delfour, University of Montreal
Max D. Gunzburger, Florida State University
J. William Helton, University of California, San Diego
Arthur J. Krener, University of California, Davis
Kirsten Morris, University of Waterloo
Richard Murray, California Institute of Technology
Ekkehard Sachs, University of Trier

Series Volumes
Ito, Kazufumi and Kunisch, Karl, Lagrange Multiplier Approach to Variational Problems and Applications
Xue, Dingyü, Chen, YangQuan, and Atherton, Derek P., Linear Feedback Control: Analysis and Design with MATLAB
Hanson, Floyd B., Applied Stochastic Processes and Control for Jump-Diffusions: Modeling, Analysis, and Computation
Michiels, Wim and Niculescu, Silviu-Iulian, Stability and Stabilization of Time-Delay Systems: An Eigenvalue-Based Approach
Ioannou, Petros and Fidan, Barış, Adaptive Control Tutorial
Bhaya, Amit and Kaszkurewicz, Eugenius, Control Perspectives on Numerical Algorithms and Matrix Problems
Robinett III, Rush D., Wilson, David G., Eisler, G. Richard, and Hurtado, John E., Applied Dynamic Programming for Optimization of Dynamical Systems
Huang, J., Nonlinear Output Regulation: Theory and Applications
Haslinger, J. and Mäkinen, R. A. E., Introduction to Shape Optimization: Theory, Approximation, and Computation
Antoulas, Athanasios C., Approximation of Large-Scale Dynamical Systems
Gunzburger, Max D., Perspectives in Flow Control and Optimization
Delfour, M. C. and Zolésio, J.-P., Shapes and Geometries: Analysis, Differential Calculus, and Optimization
Betts, John T., Practical Methods for Optimal Control Using Nonlinear Programming
El Ghaoui, Laurent and Niculescu, Silviu-Iulian, eds., Advances in Linear Matrix Inequality Methods in Control
Helton, J. William and James, Matthew R., Extending H∞ Control to Nonlinear Systems: Control of Nonlinear Systems to Achieve Performance Objectives


Society for Industrial and Applied Mathematics
Philadelphia

Lagrange Multiplier Approach to Variational Problems and Applications

Kazufumi Ito
North Carolina State University
Raleigh, North Carolina

Karl Kunisch
University of Graz
Graz, Austria


Copyright © 2008 by the Society for Industrial and Applied Mathematics.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.

Trademarked names may be used in this book without the inclusion of a trademark symbol. These names are used in an editorial context only; no infringement of trademark is intended.

Library of Congress Cataloging-in-Publication Data

Ito, Kazufumi.
Lagrange multiplier approach to variational problems and applications / Kazufumi Ito, Karl Kunisch.
p. cm. -- (Advances in design and control ; 15)
Includes bibliographical references and index.
ISBN 978-0-898716-49-8 (pbk. : alk. paper)
1. Linear complementarity problem. 2. Variational inequalities (Mathematics). 3. Multipliers (Mathematical analysis). 4. Lagrangian functions. 5. Mathematical optimization. I. Kunisch, K. (Karl), 1952– II. Title.

QA402.5.I89 2008
519.3--dc22
2008061103

SIAM is a registered trademark.


To our families:

Junko, Yuka, and Satoru

Brigitte, Katharina, Elisabeth, and Anna


Contents

Preface

1  Existence of Lagrange Multipliers
   1.1  Problem statement and generalities
   1.2  A generalized open mapping theorem
   1.3  Regularity and existence of Lagrange multipliers
   1.4  Applications
   1.5  Weakly singular problems
   1.6  Approximation, penalty, and adapted penalty techniques
        1.6.1  Approximation techniques
        1.6.2  Penalty techniques

2  Sensitivity Analysis
   2.1  Generalities
   2.2  Implicit function theorem
   2.3  Stability results
   2.4  Lipschitz continuity
   2.5  Differentiability
   2.6  Application to optimal control of an ordinary differential equation

3  First Order Augmented Lagrangians for Equality and Finite Rank Inequality Constraints
   3.1  Generalities
   3.2  Augmentability and sufficient optimality
   3.3  The first order augmented Lagrangian algorithm
   3.4  Convergence of Algorithm ALM
   3.5  Application to a parameter estimation problem

4  Augmented Lagrangian Methods for Nonsmooth, Convex Optimization
   4.1  Introduction
   4.2  Convex analysis
        4.2.1  Conjugate and biconjugate functionals
        4.2.2  Subdifferential
   4.3  Fenchel duality theory
   4.4  Generalized Yosida–Moreau approximation
   4.5  Optimality systems
   4.6  Augmented Lagrangian method
   4.7  Applications
        4.7.1  Bingham flow
        4.7.2  Image restoration
        4.7.3  Elastoplastic problem
        4.7.4  Obstacle problem
        4.7.5  Signorini problem
        4.7.6  Friction problem
        4.7.7  L^1-fitting
        4.7.8  Control problem

5  Newton and SQP Methods
   5.1  Preliminaries
   5.2  Newton method
   5.3  SQP and reduced SQP methods
   5.4  Optimal control of the Navier–Stokes equations
        5.4.1  Necessary optimality condition
        5.4.2  Sufficient optimality condition
        5.4.3  Newton's method for (5.4.1)
   5.5  Newton method for the weakly singular case

6  Augmented Lagrangian-SQP Methods
   6.1  Generalities
   6.2  Equality-constrained problems
   6.3  Partial elimination of constraints
   6.4  Applications
        6.4.1  An introductory example
        6.4.2  A class of nonlinear elliptic optimal control problems
   6.5  Approximation and mesh-independence
   6.6  Comments

7  The Primal-Dual Active Set Method
   7.1  Introduction and basic properties
   7.2  Monotone class
   7.3  Cone sum preserving class
   7.4  Diagonally dominated class
   7.5  Bilateral constraints, diagonally dominated class
   7.6  Nonlinear control problems with bilateral constraints

8  Semismooth Newton Methods I
   8.1  Introduction
   8.2  Semismooth functions in finite dimensions
        8.2.1  Basic concepts and the semismooth Newton algorithm
        8.2.2  Globalization
        8.2.3  Descent directions
        8.2.4  A Gauss–Newton algorithm
        8.2.5  A nonlinear complementarity problem
   8.3  Semismooth functions in infinite-dimensional spaces
   8.4  The primal-dual active set method as a semismooth Newton method
   8.5  Semismooth Newton methods for a class of nonlinear complementarity problems
   8.6  Semismooth Newton methods and regularization

9  Semismooth Newton Methods II: Applications
   9.1  BV-based image restoration problems
   9.2  Friction and contact problems in elasticity
        9.2.1  Generalities
        9.2.2  Contact problem with Tresca friction
        9.2.3  Contact problem with Coulomb friction

10 Parabolic Variational Inequalities
   10.1  Strong solutions
   10.2  Regularity
   10.3  Continuity of q → y(q) ∈ L^∞(Ω)
   10.4  Difference schemes and weak solutions
   10.5  Monotone property

11 Shape Optimization
   11.1  Problem statement and generalities
   11.2  Shape derivative
   11.3  Examples
        11.3.1  Elliptic Dirichlet boundary value problem
        11.3.2  Inverse interface problem
        11.3.3  Elliptic systems
        11.3.4  Navier–Stokes system

Bibliography

Index


Preface

The objective of this monograph is the treatment of a general class of nonlinear variational problems of the form

min_{y∈Y, u∈U} f(y, u)
subject to e(y, u) = 0, g(y, u) ∈ K,    (0.0.1)

where f : Y × U → R denotes the cost functional, and e : Y × U → W and g : Y × U → Z are functionals used to describe equality and inequality constraints. Here Y, U, W, and Z are Banach spaces and K is a closed convex set in Z. A special choice for K is simple box constraints

K = {z ∈ Z : φ ≤ z ≤ ψ},    (0.0.2)

where Z is a lattice with ordering ≤, and φ, ψ are elements in Z. Theoretical issues which will be treated include the existence of minimizers, optimality conditions, Lagrange multiplier theory, sufficient optimality considerations, and sensitivity analysis of the solutions to (0.0.1) with respect to perturbations in the problem data. These topics will be covered mainly in the first part of this monograph. The second part focuses on selected computational methods for solving the constrained minimization problem (0.0.1). The final chapter is devoted to the characterization of shape gradients for optimization problems constrained by partial differential equations.

Problems which fit into the framework of (0.0.1) are quite general and arise in application areas that were intensively investigated during the last half century. They include optimal control problems, structural optimization, inverse and parameter estimation problems, contact and friction problems, problems in image reconstruction and mathematical finance, and others. The variable y is often referred to as the state variable and u as the control or design parameter. The relationship between the variables y and u is described by e. It typically represents a differential or a functional equation. If e can be used to express the variable y as a function of u, i.e., y = Φ(u), then (0.0.1) reduces to

min_{u∈U} J(u) = f(Φ(u), u) subject to g(Φ(u), u) ∈ K.    (0.0.3)

This is referred to as the reduced formulation of (0.0.1). In the more general case, when y and u are both independent variables linked by the equation constraint e(y, u) = 0, it will be convenient at times to introduce x = (y, u) in X = Y × U and to express (0.0.1) as

min_{x∈X} f(x) subject to e(x) = 0, g(x) ∈ K.    (0.0.4)


From a computational perspective it can be advantageous to treat y and u as independentvariables even though a representation of y in terms of u is available.

As an example consider the inverse problem of determining the diffusion parameter a in

−∇·(a∇y) = f in Ω,  y|_∂Ω = 0    (0.0.5)

from the measurement y_obs, where Ω is a bounded open set in R^2 and f ∈ H^{-1}(Ω). A nonlinear least squares approach to this problem can be formulated as (0.0.1) by choosing y ∈ Y = H^1_0(Ω), a ∈ U = H^1(Ω) ∩ L^∞(Ω), W = H^{-1}(Ω), Z = L^2(Ω) and considering

min ∫_Ω (|y − y_obs|^2 + β|∇a|^2) dx
subject to (0.0.5) and a̲ ≤ a ≤ a̅,    (0.0.6)

where 0 < a̲ < a̅ < ∞ are lower and upper bounds for the "control" variable a. In this example, given a ∈ U, the state y ∈ Y can be uniquely determined by solving the elliptic boundary value problem (0.0.5) for y, and we obtain a problem of type (0.0.3).

The general approach that we follow in this monograph for the analytical as well as the numerical treatment of (0.0.1) is based on Lagrange multiplier theory. Let us subsequently suppose that Y, U, and W are Hilbert spaces and discuss the case Z = U and g(y, u) = u, i.e., the constraint u ∈ K. We assume that f and e are C^1 and denote by f_y, f_u the Fréchet derivatives of f with respect to y and u, respectively. Let W^* be the dual space of W and let the duality product be denoted by ⟨·, ·⟩_{W^*,W}. The analogous notation is used for Z and Z^*. We form the Lagrange functional

L(y, u, λ) = f(y, u) + ⟨e(y, u), λ⟩_{W,W^*},    (0.0.7)

where λ ∈ W^* is the Lagrange multiplier associated with the equality constraint e(y, u) = 0 which, for the present discussion, is supposed to exist. It will be shown that a minimizing pair (y, u) satisfies

L_y(y, u, λ) = f_y(y, u) + e_y(y, u)^*λ = 0,
⟨f_u(y, u) + e_u(y, u)^*λ, v − u⟩_{Z^*,Z} ≥ 0 for all v ∈ K,
e(y, u) = 0 and u ∈ K.    (0.0.8)

In the case K = Z the second equation of (0.0.8) results in the equality f_u(y, u) + e_u(y, u)^*λ = 0. In this case a first possibility for solving the system (0.0.8) for the unknowns (y, u, λ) is the use of a direct equation solver of Newton type, for example. Here the Lagrange multiplier λ is treated as an independent variable just like y and u. Alternatively, for (y, u) satisfying e(y, u) = 0 and λ satisfying f_y(y, u) + e_y(y, u)^*λ = 0, the gradient of J(u) of (0.0.3) can be evaluated as

J_u = f_u(y, u) + e_u(y, u)^*λ.    (0.0.9)

Thus the combined step of determining y ∈ Y for given u ∈ K such that e(y, u) = 0 and finding λ ∈ W^* satisfying f_y(y, u) + e_y(y, u)^*λ = 0 for (y, u) ∈ Y × U provides a possibility for evaluating the gradient of J at u. If in addition u has to satisfy a constraint of the form u ∈ K, the projected gradient is obtained by projecting f_u(y, u) + e_u(y, u)^*λ onto K, and (projected) gradient-based iterative methods can be employed to solve (0.0.1).
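The following Python sketch (an illustration added here, not taken from the text) carries out this forward/backward procedure on an invented finite-dimensional linear-quadratic model with e(y, u) = Ay − Bu and f(y, u) = (1/2)|y − y_d|^2 + (α/2)|u|^2; the matrices A, B and the data y_d, α are hypothetical placeholders. The adjoint-based gradient (0.0.9) is checked against a central difference quotient.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, alpha = 5, 3, 1e-2
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # state operator (assumed invertible)
B = rng.standard_normal((n, m))                     # control operator
y_d = rng.standard_normal(n)                        # desired state

def grad_J(u):
    y = np.linalg.solve(A, B @ u)            # forward step: solve e(y, u) = Ay - Bu = 0
    lam = np.linalg.solve(A.T, -(y - y_d))   # backward (adjoint) step: f_y + e_y^* lam = 0
    return alpha * u - B.T @ lam             # (0.0.9): J_u = f_u + e_u^* lam, with e_u = -B

def J(u):
    y = np.linalg.solve(A, B @ u)
    return 0.5 * np.sum((y - y_d) ** 2) + 0.5 * alpha * np.sum(u ** 2)

u = rng.standard_normal(m)
e0 = np.zeros(m)
e0[0] = 1.0
eps = 1e-6
print(grad_J(u)[0], (J(u + eps * e0) - J(u - eps * e0)) / (2 * eps))  # should agree
```

Note that only two linear solves are needed per gradient evaluation, independently of the dimension of u; this is the practical appeal of the backward (adjoint) step.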

In optimal control of differential equations, the multiplier λ is called the adjoint state and the second equation in (0.0.8) is called the optimality condition. Further, the system (0.0.8) coincides with the celebrated Pontryagin maximum principle. The steps in the procedure explained above for obtaining the gradient are referred to as the forward equation step for the state equation (the third equation in (0.0.8)) and the backward equation step for the adjoint equation (the first equation). This terminology is motivated by time-dependent problems, where the third equation is an initial value problem and the first one is a problem with terminal time boundary condition. Lastly, if K is of type (0.0.2), then the second equation of (0.0.8) can be written as the complementarity condition

f_u(y, u) + e_u(y, u)^*λ + η = 0,
η = max(0, η + (u − ψ)) + min(0, η + (u − φ)).    (0.0.10)

Due to the nondifferentiability of the max and min operations, classical Newton methods are not directly applicable to system (0.0.10). Active set methods provide a very efficient alternative. They turn out to be equivalent to semismooth Newton methods in function spaces. If appropriate structural conditions are met, including that f be quadratic and e affine, then such techniques are globally convergent. Moreover, under more general conditions they exhibit local superlinear convergence. Due to the practical relevance of problems with box constraints, a significant part of this monograph is devoted to the analysis of these methods.
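A minimal sketch of the active set idea on an invented unilateral instance of (0.0.10): minimize (α/2)|u|^2 − (b, u) subject to u ≤ ψ, so that the optimality system reads αu − b + η = 0, η = max(0, η + (u − ψ)). All data below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, b, psi = 1.0, rng.standard_normal(8), np.zeros(8)

u, eta = np.zeros(8), np.zeros(8)
for k in range(20):
    active = eta + (u - psi) > 0           # predicted active set {u = psi}
    u = np.where(active, psi, b / alpha)   # stationarity with eta = 0 off the active set
    eta = np.where(active, b - alpha * u, 0.0)   # eta = b - alpha*u on the active set
    if np.allclose(eta, np.maximum(0.0, eta + (u - psi))):  # (0.0.10) holds: stop
        break
print(u, eta)   # expect u = min(b/alpha, psi) componentwise, with eta >= 0
```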

A frequently employed alternative to the multiplier approach is given by the penalty method. To explain the procedure consider the sequence of minimization problems

min_{y∈Y, u∈K} f(y, u) + (c_k/2)|e(y, u)|_W^2    (0.0.11)

for an increasing sequence of penalty parameters c_k. That is, the equality constraint is eliminated through the quadratic penalty function. This requires solving the unconstrained minimization problem (0.0.11) over Y × K for a sequence of {c_k} tending to infinity. Under mild assumptions it can be shown that the sequence of minimizers (y_k, u_k) determined by (0.0.11) converges to a minimizer of (0.0.1) as c_k → ∞. Moreover, (y_k, u_k) satisfies

f_y(y_k, u_k) + e_y(y_k, u_k)^*(c_k e(y_k, u_k)) = 0,
⟨f_u(y_k, u_k) + e_u(y_k, u_k)^*(c_k e(y_k, u_k)), v − u_k⟩_{Z^*,Z} ≥ 0.    (0.0.12)

Comparing (0.0.12) with (0.0.8) it is natural to ask whether c_k e(y_k, u_k) tends to a Lagrange multiplier associated to e(y, u) = 0 as c_k → ∞. Indeed this can be shown under suitable conditions. Despite disadvantages due to slow convergence and possible ill-conditioning for solving (0.0.12) with large values of c_k, the penalty method is widely accepted in practice. This is due, in part, to the simplicity of the approach and the availability of powerful algorithms for solving (0.0.12) if K = Z, when the inequality in (0.0.12) becomes an equality.
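A minimal numerical illustration of (0.0.11)-(0.0.12) on a toy equality-constrained quadratic of our own choosing, min (1/2)|x − x_0|^2 subject to e(x) = a·x − b = 0, for which the exact multiplier is λ∗ = (a·x_0 − b)/|a|^2; the penalized problems are solved in closed form and c_k e(x_k) is compared with λ∗.

```python
import numpy as np

x0 = np.array([1.0, 2.0])
a = np.array([1.0, 1.0])
b = 1.0
lam_exact = (a @ x0 - b) / (a @ a)

for c in [1.0, 10.0, 100.0, 1000.0]:
    # unconstrained minimizer of 0.5*|x - x0|^2 + (c/2)*(a.x - b)^2:
    # stationarity gives (I + c*a*a^T) x = x0 + c*b*a
    x = np.linalg.solve(np.eye(2) + c * np.outer(a, a), x0 + c * b * a)
    print(c, c * (a @ x - b), lam_exact)   # c_k * e(x_k) -> lambda*
```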

A third methodology is given by duality techniques. They are based on the introduction of the dual functional

d(λ) = inf_{y∈Y, u∈K} L(y, u, λ)    (0.0.13)

and the duality property

sup_{λ∈W^*} d(λ) = inf_{y∈Y, u∈K} f(y, u) subject to e(y, u) = 0.    (0.0.14)

The duality method for solving (0.0.1) requires minimizing L(y, u, λ_k) over (y, u) ∈ Y × K and updating λ by means of

λ_{k+1} = λ_k + α_k J e(y_k, u_k),    (0.0.15)

where

(y_k, u_k) = argmin_{y∈Y, u∈K} L(y, u, λ_k),

α_k > 0 is an appropriately chosen step size, and J denotes the Riesz mapping from W onto W^*. It can be argued that e(y_k, u_k) is the gradient of d(λ) at λ_k and that (0.0.15) is in turn a steepest ascent method for the maximization of d(λ) under appropriate conditions. Such methods are called primal-dual methods. Despite the fact that the method can be justified only under fairly restrictive convexity assumptions on L(y, u, λ) with respect to (y, u), it provides an elegant use of Lagrange multipliers and is a basis for so-called augmented Lagrangian methods.
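A sketch of the update (0.0.15) on the same toy problem as above (data again invented); the inner minimization of L(·, λ_k) has a closed form, and the λ-update is a steepest ascent step for d(λ).

```python
import numpy as np

x0 = np.array([1.0, 2.0])
a = np.array([1.0, 1.0])
b = 1.0
lam, alpha = 0.0, 0.5           # step size chosen < 2/|a|^2 for convergence
for k in range(50):
    x = x0 - lam * a            # x_k = argmin_x L(x, lam), here in closed form
    lam = lam + alpha * (a @ x - b)   # steepest ascent step on the dual d(lam)
print(x, lam, (a @ x0 - b) / (a @ a))  # lam converges to the exact multiplier
```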

Augmented Lagrangian methods with K = Z are based on the following problem, which is equivalent to (0.0.1):

min_{y∈Y, u∈K} f(y, u) + (c/2)|e(y, u)|_W^2
subject to e(y, u) = 0.    (0.0.16)

Under rather mild conditions the quadratic term enhances the local convexity of L(y, u, λ) in the variables (y, u) for sufficiently large c > 0. It helps the convergence of direct solvers based on the necessary optimality conditions (0.0.8). To carry this a step further we introduce the augmented Lagrangian functional

L_c(y, u, λ) = f(y, u) + ⟨e(y, u), λ⟩ + (c/2)|e(y, u)|_W^2.    (0.0.17)

The first order augmented Lagrangian method is the primal-dual method applied to (0.0.17), i.e.,

(y_k, u_k) = argmin_{y∈Y, u∈K} L_c(y, u, λ_k),
λ_{k+1} = λ_k + c J e(y_k, u_k).    (0.0.18)

Its advantage over the penalty method is attributed to the fact that local convergence of (y_k, u_k) to a minimizer (y, u) of (0.0.1) holds for all sufficiently large and fixed c > 0, without requiring that c → ∞. As we noted, L_c(y, u, λ) has local convexity properties under well-studied assumptions, and the primal-dual viewpoint is applicable to the multiplier update. The iterates (y_k, u_k, λ_k) converge linearly to the triple (y, u, λ) satisfying the first order necessary optimality conditions, and convergence can improve as c > 0 increases. Due to these attractive characteristics and properties, the method of multipliers and its subsequent Newton-like variants have been recognized as a powerful method for minimization problems with equality constraints. They constitute an important part of this book.
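A sketch of the first order augmented Lagrangian iteration (0.0.18) on the toy problem used above, with the penalty parameter c held fixed; in this example the constraint residual decays linearly, roughly by the factor 1/(1 + c|a|^2).

```python
import numpy as np

x0 = np.array([1.0, 2.0])
a = np.array([1.0, 1.0])
b = 1.0
c, lam = 10.0, 0.0
for k in range(10):
    # x_k = argmin_x L_c(x, lam): (I + c*a*a^T) x = x0 + (c*b - lam)*a
    x = np.linalg.solve(np.eye(2) + c * np.outer(a, a), x0 + (c * b - lam) * a)
    lam = lam + c * (a @ x - b)          # multiplier update, c stays fixed
    print(k, abs(a @ x - b), lam)        # residual decays linearly, lam -> 1
```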


In (0.0.16) and (0.0.18) the constraint u ∈ K remained an explicit constraint and was not augmented. To describe a possibility for augmenting inequalities we return to the general form g(y, u) ∈ K and consider inequality constraints with finite rank, i.e., Z = R^p and K = {z ∈ R^p : z_i ≤ 0, 1 ≤ i ≤ p}. Then under appropriate conditions the formulation

min_{y∈Y, u∈U, q∈R^p} L_c(y, u, λ) + (μ, g(y, u) − q) + (c/2)|g(y, u) − q|_{R^p}^2
subject to q ≤ 0    (0.0.19)

is equivalent to (0.0.1). Here μ ∈ R^p is the Lagrange variable associated with the inequality constraint g(y, u) ≤ 0. Minimizing the functional in (0.0.19) over q ≤ 0 results in the augmented Lagrangian functional

L_c(y, u, λ, μ) = L_c(y, u, λ) + (1/(2c)) ( |max(0, μ + c g(y, u))|_{R^p}^2 − |μ|_{R^p}^2 ),    (0.0.20)

where equality and finite rank inequality constraints are augmented. The corresponding augmented Lagrangian method is

(y_k, u_k) = argmin_{y∈Y, u∈U} L_c(y, u, λ_k, μ_k),
λ_{k+1} = λ_k + c J e(y_k, u_k),
μ_{k+1} = max(0, μ_k + c g(y_k, u_k)).    (0.0.21)
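A sketch of (0.0.21) for a single (p = 1) invented inequality constraint, min (1/2)|x − x_0|^2 subject to g(x) = a·x − b ≤ 0; the inner subproblem is solved by a few semismooth Newton steps on its stationarity equation, anticipating the methods of Chapter 8. All data are hypothetical.

```python
import numpy as np

x0 = np.array([2.0, 2.0])
a = np.array([1.0, 0.0])
b = 1.0
c, mu = 10.0, 0.0
for k in range(15):
    # inner problem: minimize 0.5*|x - x0|^2 + (1/(2c))*(max(0, mu + c*g(x))^2 - mu^2)
    x = x0.copy()
    for j in range(20):
        s = mu + c * (a @ x - b)
        grad = x - x0 + max(0.0, s) * a          # stationarity residual
        if np.linalg.norm(grad) < 1e-12:
            break
        H = np.eye(2) + (c if s > 0 else 0.0) * np.outer(a, a)  # generalized Jacobian
        x = x - np.linalg.solve(H, grad)
    mu = max(0.0, mu + c * (a @ x - b))          # multiplier update as in (0.0.21)
print(x, mu)   # for this data: x -> (1, 2) and mu -> 1
```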

In the discussion of box-constrained problems we already pointed out the relevance of numerical methods for nonsmooth problems; see (0.0.10). In many applied variational problems, for example in mechanics, fluid flow, or image analysis, nonsmooth cost functionals arise. Consider as a special case the simplified friction problem

min f(y) = ∫_Ω ( (1/2)(|∇y|^2 + |y|^2) − f y ) dx + g ∫_Γ |y| ds over y ∈ H^1(Ω),    (0.0.22)

where Ω is a bounded domain with boundary Γ. This is an unconstrained minimization problem with X = H^1(Ω) and no control variables. Since the functional is not continuously differentiable, the necessary optimality condition f_y(y) = 0 is not applicable. However, with the aid of a generalized Lagrange multiplier theory a necessary optimality condition can be written as

−Δy + y = f,
−∂y/∂ν = g λ on Γ,
|λ(x)| ≤ 1 and λ(x) y(x) = |y(x)| a.e. on Γ.    (0.0.23)

From a Lagrange multiplier perspective, problems with L^1-type cost functionals as in (0.0.22) and box-constrained problems are dual to each other. So it comes as no surprise that again semismooth Newton methods provide an efficient technique for solving (0.0.22) or (0.0.23). We shall analyze them and provide a theoretical basis for their numerical efficiency.
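A hedged one-dimensional analogue of (0.0.22)-(0.0.23): for min (1/2)u^2 − fu + g|u| over u ∈ R, the optimality system u − f + gλ = 0, |λ| ≤ 1, λu = |u| is solved exactly by soft thresholding (shrinkage), as the following sketch verifies on sample data.

```python
import numpy as np

def shrink(f, g):
    # soft-thresholding: the exact minimizer of 0.5*u^2 - f*u + g*|u|
    return np.sign(f) * np.maximum(np.abs(f) - g, 0.0)

f, g = np.array([2.0, 0.3, -1.5]), 1.0
u = shrink(f, g)
lam = np.where(u != 0.0, np.sign(u), f / g)  # multiplier: lam = sign(u) if u != 0, else f/g
print(u, lam)                                # |lam| <= 1 and lam*u = |u| hold
print(u - f + g * lam)                       # stationarity residual is zero
```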

This monograph also contains the analysis of optimization problems constrained by partial differential equations which are singular in the sense that the state variable cannot be differentiated with respect to the control. This, however, does not preclude that the cost functional is differentiable with respect to the control and that an optimality principle can be derived. In terms of shape optimization problems this means that the cost functional is shape differentiable while the state is not differentiable with respect to the shape.

In summary, Lagrange multiplier theory provides a tool for the analysis of general constrained optimization problems with cost functionals which are not necessarily C^1 and with state equations which are in some sense singular. It also leads to a theoretical basis for developing efficient and powerful iterative methods for solving such problems. The purpose of this monograph is to provide a rather thorough analysis of Lagrange multiplier theory and to show its impact on the development of numerical algorithms for problems which are posed in a function space setting.

Let us give a short description of the book for those readers who do not intend to read it by consecutive chapters. Chapter 1 provides a variety of tools to establish existence of Lagrange multipliers and is called upon in all the following chapters. Here, as in other chapters, we do not attempt to give the most general results, nor do we strive for covering the complete literature. Chapter 2 is devoted to the sensitivity analysis of abstract constrained nonlinear programming problems and it essentially stands for itself. This chapter is of great importance, addressing continuity, Lipschitz continuity, and differentiability of the solutions to optimization and optimal control problems with respect to parameters that appear in the problem formulation. Such results are not only of theoretical but also of practical importance. The sensitivity equations have been a starting point for the development of algorithmic concepts for decades. Nevertheless, readers who are not interested in this topic at first may skip this chapter without missing technical results which might be needed for later chapters.

Chapters 3, 5, and 6 form a unit which is devoted to smooth optimization problems. Chapter 3 covers first order augmented Lagrangian methods for optimization problems with equality and inequality constraints. Here, as in the remainder of the book, the equality constraints that we have in mind typically represent partial differential equations. In fact, during the period in which this monograph was written, the terminology "PDE-constrained optimization" emerged. Inverse problems formulated as regularized least squares problems and optimal control problems for (partial) differential equations are primary examples for the theories that are discussed here. Chapters 5 and 6 are devoted to second order iterative solution methods for equality-constrained problems. Again the equality constraints represent partial differential equations. This naturally gives rise to the following situation: The variables with respect to which the optimization is carried out can be classified into two groups. One group contains the state variables of the differential equations and the other group consists of variables which represent the control or input variables for optimal control problems, or coefficients in parameter estimation problems. If the state variables are considered as functions of the independent controls, inputs, or coefficients, and the cost functional in the optimization problem is only considered as a functional of the latter, then this is referred to as the reduced formulation. Applying a second order method to the reduced functional we arrive at the Newton method for optimization problems with partial differential equations as constraints. If both state and control variables are kept as independent variables and the optimality system involving primal and adjoint variables, which are the Lagrange multipliers corresponding to the PDE-constraints, is derived, we arrive at the sequential quadratic programming (SQP) technique: It essentially consists of applying a Newton algorithm to the first order necessary optimality conditions. The Newton method for the reduced formulation and the SQP technique are the focus of Chapter 5. Chapter 6 is devoted to second order augmented Lagrangian techniques which are closely related, as we shall see, to SQP methods. Here the equation constraint is augmented in a penalty term, which has the effect of locally convexifying the optimization problem. Since augmented Lagrangians also involve Lagrange multipliers, there is, however, no necessity to let the penalty parameter tend to infinity and, in fact, we do not suggest doing so.

A second larger unit is formed by Chapters 4, 7, 8, and 9. Nonsmoothness, primal-dual active set strategy, and semismooth Newton methods are the keywords which characterize the contents of these chapters. Chapter 4 is essentially a recapture of concepts from convex analysis in a format that is used in the remaining chapters. A key result is the formulation of differential inclusions which arise in optimality systems by means of nondifferentiable equations which are derived from Yosida–Moreau approximations and which will serve as the basis for the primal-dual active set strategy. Chapter 7 is devoted to the primal-dual active set strategy and its global convergence properties for unilaterally and bilaterally constrained problems. The local analysis of the primal-dual active set strategy is achieved in the framework of semismooth Newton methods in Chapter 8. It contains the notion of Newton derivative and establishes local superlinear convergence of the Newton method for problems which do not satisfy the classical sufficient conditions for local quadratic convergence. Two important classes of applications of semismooth Newton methods are considered in Chapter 9: image restoration and deconvolution problems regularized by the bounded variation (BV) functional, and friction and contact problems in elasticity.

Chapter 10 is devoted to a Lagrangian treatment of parabolic variational inequalities in unbounded domains as they arise in the Black–Scholes equation, for example. It contains the use of monotone techniques for analyzing parabolic systems without relying on compactness assumptions in a Gelfand-triple framework. In Chapter 11 we provide a calculus for obtaining the shape derivative of the cost functional in shape optimization problems which bypasses the need for using the shape derivative of the state variables of the partial differential equations. It makes use of the expansion technique that is proposed in Chapters 1 and 5 for weakly singular optimal control problems, and of the assumption that an appropriately defined adjoint equation admits a solution. This provides a versatile technique for evaluating the shape derivative of the cost functional using Lagrange multiplier techniques.

There are many additional topics which would fit under the title of this monograph which, however, we chose not to include. In particular, issues of discretization, convergence, and rate of convergence are not discussed. Here the issue of proper discretization of adjoint equations consistent with the discretization of the primal equation and the consistent time integration of the adjoint equations must be mentioned. We do not enter into the discussion of whether to discretize an infinite-dimensional nonlinear programming problem first and then to decide on an iterative algorithm to solve the finite-dimensional problems, or the other way around, consisting of devising an optimization algorithm for the infinite-dimensional problem which is subsequently discretized. It is desirable to choose a discretization and an iterative optimization strategy in such a manner that these two approaches commute. Discontinuous Galerkin methods are well suited for this purpose; see, e.g., [BeMeVe]. Another important area which is not in the focus of this monograph is the efficient solution of those large scale linear systems which arise in optimization algorithms. We refer the reader to, e.g., [BGHW, BGHKW], and the literature cited there. The solution of large scale time-dependent optimal control problems involving the coupled system of primal and adjoint equations, which need to be solved in opposite directions with respect to time, still offers a significant challenge, despite the advances that were made with multiple shooting, receding horizon, and time-domain decomposition techniques. From the point of view of optimization theory there are several topics as well into which one could expand. These include globalization strategies, like trust region methods, exact penalty methods, quasi-Newton methods, and a more abstract Lagrange multiplier theory than that presented in Chapter 1.

As a final comment we stress that for a full treatment of a variational problem in function spaces, both its infinite-dimensional analysis as well as its proper discretization and the relation between the two are indispensable. Proceeding from an infinite-dimensional problem directly to its discretization without such a treatment, important issues can be missed. For instance, discretization without a well-posed analysis may result in the use of inappropriate inner products, which may lead to unnecessary ill-conditioning, which entails unnecessary preconditioning. Inconsiderate discretization may also result in the loss of structural properties, as for instance symmetry properties.


Chapter 1

Existence of Lagrange Multipliers

1.1 Problem statement and generalities

We consider the constrained optimization problem

min f(x)
subject to x ∈ C and g(x) ∈ K,    (1.1.1)

where f is a real-valued functional defined on a real Banach space X; further, C is a closed convex subset of X, g is a mapping from X into the real Banach space Z, and K is a closed convex cone in Z. Throughout the monograph it is understood that the vertex of a cone coincides with the origin. Further, in this chapter it is assumed that (1.1.1) admits a solution x∗. Here we refer to x∗ as a solution if it is feasible, i.e., x∗ satisfies the constraints in (1.1.1), and f has a local minimum at x∗. It is assumed that

f is Fréchet differentiable at x∗, and
g is continuously Fréchet differentiable at x∗,    (1.1.2)

with Fréchet derivatives denoted by f′(x∗) and g′(x∗), respectively. The set of feasible points for (1.1.1) will be denoted by M = {x ∈ C : g(x) ∈ K}. Moreover, the following notation will be used. The topological duals of X and Z are denoted by X∗ and Z∗. For the subset A of X the polar or dual cone is defined by

A+ = {x∗ ∈ X∗ : ⟨x∗, a⟩ ≤ 0 for all a ∈ A},

where ⟨·, ·⟩ (= ⟨·, ·⟩_{X∗,X}) denotes the duality pairing between X∗ and X. Further, for x ∈ X the conical hull of C − {x} is given by

C(x) = {λ(c − x) : c ∈ C, λ ≥ 0},

and K(z) with z ∈ Z is defined analogously. The Lagrange functional L : X × Z∗ → R associated to (1.1.1) is defined by

L(x, λ) = f(x) + ⟨λ, g(x)⟩_{Z∗,Z}.


Definition 1.1. An element λ∗ ∈ Z∗ is called a Lagrange multiplier for (1.1.1) at the solution x∗ if λ∗ ∈ K+, ⟨λ∗, g(x∗)⟩ = 0, and

f′(x∗) + λ∗ ∘ g′(x∗) ∈ −C(x∗)+.

Lagrange multipliers are of fundamental importance for expressing necessary as well as sufficient optimality conditions for (1.1.1) and for the development of algorithms to solve (1.1.1). If C = X, then the inclusion in Definition 1.1 results in the equation

f′(x∗) + λ∗ ∘ g′(x∗) = 0.

For the existence analysis of Lagrange multipliers one makes use of the following tangent cones, which approximate the feasible set M at x∗:

T(M, x∗) = {x ∈ X : x = lim_{n→∞} (1/t_n)(x_n − x∗), t_n → 0+, x_n ∈ M},
L(M, x∗) = {x ∈ X : x ∈ C(x∗) and g′(x∗)x ∈ K(g(x∗))}.

The cone T(M, x∗) is called the sequential tangent cone (or Bouligand cone) and L(M, x∗) the linearizing cone of M at x∗.

Proposition 1.2. If x∗ is a solution to (1.1.1), then f′(x∗)x ≥ 0 for all x ∈ T(M, x∗).

Proof. Let x ∈ T(M, x∗). Then there exist sequences {x_n}_{n=1}^∞ in M and {t_n} with lim_{n→∞} t_n = 0+ such that x = lim_{n→∞} (1/t_n)(x_n − x∗). By the definition of Fréchet differentiability there exists a sequence {r_n} in R such that

f′(x∗)x = lim_{n→∞} (1/t_n) f′(x∗)(x_n − x∗) = lim_{n→∞} (1/t_n)(f(x_n) − f(x∗) + r_n)

and lim_{n→∞} r_n/|x_n − x∗| = 0. Since (1/t_n)|x_n − x∗| → |x|, it follows that lim_{n→∞} (1/t_n) r_n = 0. By optimality of x∗ we have f(x_n) − f(x∗) ≥ 0 for all sufficiently large n, and hence

f′(x∗)x ≥ lim_{n→∞} (1/t_n) r_n = 0,

and the result follows.

We now briefly outline the main steps involved in the abstract approach to prove the existence of Lagrange multipliers. For this purpose we assume that C = X, K = {0}, so that the set of feasible points is described by M = {x : g(x) = 0}. We assume that g′(x∗) : X → Z is surjective. Then by Lyusternik's theorem [Ja, We] we have

L(M, x∗) ⊂ T(M, x∗).    (1.1.3)

Consider the convex cone

B = {(f′(x∗)x + r, g′(x∗)x) : r ≥ 0, x ∈ X} ⊂ R × Z.

Observe that (0, 0) ∈ B and that due to Proposition 1.2 and (1.1.3) the origin (0, 0) is a boundary point of B. Since g′(x∗) is surjective, B has nonempty interior, and hence by the Eidelheit separation theorem that we shall recall below there exists a closed hyperplane in R × Z which supports B at (0, 0), i.e., there exists 0 ≠ (α, λ∗) ∈ R × Z∗ such that

α(f′(x∗)x + r) + ⟨λ∗, g′(x∗)x⟩ ≥ 0 for all (r, x) ∈ R+ × X.

Setting x = 0 we have α ≥ 0. If α = 0, then λ∗ g′(x∗) = 0, which would imply λ∗ = 0, which is impossible. Consequently α > 0, and without loss of generality we can assume α = 1. Hence λ∗ is a Lagrange multiplier. For the general case of equality and inequality constraints the application of Lyusternik's theorem is replaced by stability results for solutions of systems of inequalities. The existence of a separating hyperplane will follow from a generalized open mapping theorem, which is covered in the following section.

The separation theorem announced above is presented next.

Theorem 1.3. Let K_1 and K_2 be nontrivial, convex, and disjoint subsets of a normed linear space X. If K_1 is closed and K_2 is compact, then there exists a closed hyperplane strictly separating K_1 and K_2, i.e., there exist x∗ ∈ X∗, β ∈ R, and ε > 0 such that

⟨x∗, x⟩ ≤ β − ε for all x ∈ K_1 and ⟨x∗, x⟩ ≥ β + ε for all x ∈ K_2.

If K_1 is open and K_2 is arbitrary, then these inequalities still hold with ε = 0.

Let us give a brief description of the sections of this chapter. Section 1.2 contains the open mapping theorem already alluded to above. Regularity properties which guarantee the existence of a Lagrange multiplier in general nonlinear programming problems in infinite dimensions are analyzed in section 1.3, and applications to parameter estimation and optimal control problems are described in section 1.4. Section 1.5 is devoted to the derivation of a first order optimality system for a class of weakly singular optimal control problems and to several applications which are covered by this class. Finally, in section 1.6 several further alternatives for deriving optimality systems are presented in an informal manner.

1.2 A generalized open mapping theorem

Throughout this section T denotes a bounded linear operator from X to Z. As in the previous section, C stands for a closed convex set in X, and K stands for a closed convex cone in Z. For Theorem 1.4 below it is sufficient that K is a closed convex set. For ρ > 0 we set X_ρ = {x : |x| ≤ ρ}, and Z_ρ is defined analogously. Recall that by the open mapping theorem, surjectivity of T implies that T maps open sets of X onto open sets of Z. As a consequence TX_1 contains Z_ρ for some ρ > 0. The following theorem generalizes this result.

Theorem 1.4. Let x̄ ∈ C and ȳ ∈ K, where K is a closed convex set. Then the following statements are equivalent:

(a) Z = TC(x̄) − K(ȳ),

(b) there exists ρ > 0 such that Z_ρ ⊂ T((C − {x̄}) ∩ X_1) − (K − {ȳ}) ∩ Z_1,

where C(x̄) = {λ(x − x̄) : λ ≥ 0, x ∈ C} and K(ȳ) = {k − λȳ : λ ≥ 0, k ∈ K}.

Proof. Clearly (b) implies (a). To prove the converse we first show that

C(x̄) = ⋃_{n∈N} n((C − x̄) ∩ X_1).    (1.2.1)

The inclusion ⊃ is obvious, and hence it suffices to prove that C(x̄) is contained in the set on the right-hand side of (1.2.1). Let x ∈ C(x̄). Then x = λy with λ ≥ 0 and y ∈ C − x̄. Without loss of generality we can assume that λ > 0 and |y| > 1. Convexity of C implies that (1/|y|)y ∈ (C − x̄) ∩ X_1. Let n ∈ N be such that n ≥ λ|y|. Then x = (λ|y|)(1/|y|)y ∈ n((C − x̄) ∩ X_1), and (1.2.1) is verified.

For α > 0 let A_α = α(T((C − x̄) ∩ X_1) − (K − ȳ) ∩ Z_1). We will show that 0 ∈ int Ā_1. In fact, from (1.2.1) and the analogous equality with C replaced by K, it follows from (a) that

Z = ⋃_{n∈N} nT((C − x̄) ∩ X_1) + ⋃_{n∈N} (−n)((K − ȳ) ∩ Z_1).

This implies that

Z = ⋃_{n∈N} A_n = ⋃_{n∈N} Ā_n.    (1.2.2)

Thus the complete space Z is the countable union of the closed sets Ā_n. By the Baire category theorem there exists some m ∈ N such that int Ā_m ≠ ∅. Let a ∈ int Ā_m. By (1.2.2) there exists k ∈ N with −a ∈ Ā_k. Using Ā_α = αĀ_1 this implies that −(m/k)a ∈ Ā_m. It follows that the half-open interval {λ(−(m/k)a) + (1 − λ)a : 0 ≤ λ < 1} belongs to int Ā_m, and consequently 0 ∈ int Ā_m = m int Ā_1. Thus we have

0 ∈ int Ā_1.    (1.2.3)

Hence for some ρ > 0

Z_ρ ⊂ (1/2)Ā_1 = Ā_{1/2} ⊂ A_{1/2} + Z_{ρ/2},    (1.2.4)

and consequently for all i = 0, 1, . . .

Z_{(1/2)^i ρ} = (1/2)^i Z_ρ ⊂ (1/2)^i (A_{1/2} + Z_{ρ/2}) = A_{(1/2)^{i+1}} + Z_{(1/2)^{i+1} ρ}.    (1.2.5)

Now choose y ∈ Z_ρ. By (1.2.4) there exist x_1 ∈ (C − x̄) ∩ X_1, y_1 ∈ (K − ȳ) ∩ Z_1, and r_1 ∈ Z_{ρ/2} such that y = T((1/2)x_1) − (1/2)y_1 + r_1.

Applying (1.2.5) with i = 1 to r_1 implies the existence of x_2 ∈ (C − x̄) ∩ X_1, y_2 ∈ (K − ȳ) ∩ Z_1, and r_2 ∈ Z_{(1/2)^2 ρ} such that

y = T((1/2)x_1 + (1/2)^2 x_2) − ((1/2)y_1 + (1/2)^2 y_2) + r_2.

Repeated application of this argument implies that y = Tu_n − v_n + r_n, where

u_n = Σ_{i=1}^n (1/2)^i x_i with x_i ∈ (C − x̄) ∩ X_1,
v_n = Σ_{i=1}^n (1/2)^i y_i with y_i ∈ (K − ȳ) ∩ Z_1,

and r_n ∈ Z_{(1/2)^n ρ}. Observe that

u_n = (1/2)x_1 + · · · + (1/2)^n x_n + (1 − Σ_{i=1}^n (1/2)^i) · 0 ∈ (C − x̄) ∩ X_1

and

|u_n − u_{n+m}| ≤ Σ_{i=n+1}^{n+m} (1/2)^i ≤ (1/2)^n.

Consequently {u_n}_{n=1}^∞ is a Cauchy sequence and there exists some x ∈ (C − x̄) ∩ X_1 such that lim_{n→∞} u_n = x. Similarly there exists v ∈ (K − ȳ) ∩ Z_1 such that lim_{n→∞} v_n = v. Moreover, lim_{n→∞} r_n = 0, and continuity of T implies that y = Tx − v.

1.3 Regularity and existence of Lagrange multipliers

In this section existence of a Lagrange multiplier for the optimization problem (1.1.1) will be established using a regular point condition which was proposed in the publications of Robinson for sensitivity analysis of generalized equations and later by Kurcyusz, Zowe, and Maurer for the existence of Lagrange multipliers.

Definition 1.5. The element x ∈ M is called regular (or, alternatively, x satisfies the regular point condition) if

0 ∈ int{g′(x)(C − x) − K + g(x)}.    (1.3.1)

If (1.3.1) holds, then clearly

0 ∈ int{g′(x)C(x) − K(g(x))}.    (1.3.2)

Note that g′(x)C(x) − K(g(x)) is a cone. Consequently (1.3.2) implies that

g′(x)C(x) − K(g(x)) = Z.    (1.3.3)

Finally, (1.3.3) implies (1.3.1) by Theorem 1.4. Thus the three conditions (1.3.1)–(1.3.3) are equivalent. Condition (1.3.1) is used in the work of Robinson (see, e.g., [Ro2]) on the stability of solutions to systems of inequalities, and (1.3.3) is the regularity condition employed in [MaZo, ZoKu] to guarantee the existence of Lagrange multipliers.


Remark 1.3.1. If C = X, then the so-called Slater condition, g′(x)h ∈ int K(g(x)) for some h ∈ X, implies (1.3.1). In case C = X = R^n, Z = R^m, K = {y ∈ R^m : y_i = 0 for i = 1, . . . , k, and y_i ≤ 0 for i = k + 1, . . . , m}, g = (g_1, . . . , g_k, g_{k+1}, . . . , g_m), the constraint g(x) ∈ K amounts to k equality and m − k inequality constraints, and the regularity condition (1.3.3) becomes

the gradients {g′_i(x)}_{i=1}^k are linearly independent, and there exists h ∈ X such that
g′_i(x)h = 0 for i = 1, . . . , k,
g′_i(x)h < 0 for i = k + 1, . . . , m with g_i(x) = 0.    (1.3.4)

This regularity condition is referred to as the Mangasarian–Fromowitz constraint qualification.

Other variants of conditions which imply existence of Lagrange multipliers are named after F. John, Karush–Kuhn–Tucker (KKT), and Arrow, Hurwicz, and Uzawa.

Theorem 1.6. If the solution x∗ of (1.1.1) is regular, then there exists an associated Lagrange multiplier λ∗ ∈ Z∗.

Proof. By Proposition 1.2 we have

f′(x∗)x ≥ 0 for all x ∈ T(M, x∗).

From Theorem 1 and Corollary 2 in [We], which can be viewed as generalizations of Lyusternik's theorem, it follows that L(M, x∗) ⊂ T(M, x∗) and hence

f′(x∗)x ≥ 0 for all x ∈ L(M, x∗).    (1.3.5)

We define the set

B = {(f′(x∗)x + r, g′(x∗)x − y) : r ≥ 0, x ∈ C(x∗), y ∈ K(g(x∗))} ⊂ R × Z.

This is a convex cone containing the origin (0, 0) ∈ R × Z. Due to (1.3.5) the origin is a boundary point of B. By the regular point condition and Theorem 1.4 with T = g′(x∗) and ȳ = g(x∗) there exists ρ > 0 such that

{(α, y) : α ≥ max{f′(x∗)x : x ∈ (C − x∗) ∩ X_1} and y ∈ Z_ρ} ⊂ B,

and hence int B ≠ ∅. Consequently there exists a hyperplane in R × Z which supports B at (0, 0), i.e., there exists (α, λ∗) ∈ R × Z∗ with (α, λ∗) ≠ (0, 0) such that

α(f′(x∗)x + r) + λ∗(g′(x∗)x − y) ≥ 0 for all x ∈ C(x∗), y ∈ K(g(x∗)), r ≥ 0.    (1.3.6)

Setting (r, x) = (0, 0), this implies that ⟨λ∗, y⟩ ≤ 0 for all y ∈ K(g(x∗)) and consequently

λ∗ ∈ K+ and λ∗ g(x∗) = 0.    (1.3.7)

The choice (x, y) = (0, 0) in (1.3.6) implies α ≥ 0. If α = 0, then due to regularity of x∗ we have λ∗ = 0, which is impossible. Consequently α > 0, and without loss of generality we can assume that α = 1. Finally, with (r, y) = (0, 0), inequality (1.3.6) implies

f′(x∗) + λ∗ g′(x∗) ∈ −C(x∗)+.

This concludes the proof.

We close this subsection with two archetypical problems.

Equality-constrained problem. We consider

min f(x)
subject to g(x) = 0.    (1.3.8)

Let x∗ denote a local solution at which (1.1.2) is satisfied and such that g′(x∗) : X → Z is surjective. Then there exists λ∗ ∈ Z∗ such that the KKT condition

f′(x∗) + λ∗ g′(x∗) = 0 in X∗

holds. Surjectivity of g′(x∗) is of course only a sufficient, not a necessary, condition for the existence of a Lagrange multiplier. This topic is taken up in Example 1.12 and below.

Inequality-constrained problem. This is the inequality-constrained problem with the so-called unilateral box, or simple, constraints. We suppose that Z = L^2(Ω) with Ω a domain in R^n and consider

min f(x)
subject to g(x(s)) ≤ 0 for a.e. s ∈ Ω.    (1.3.9)

Setting C = X and K = {v ∈ L^2(Ω) : v ≤ 0}, this problem becomes a special case of (1.1.1). Let x∗ denote a local solution at which (1.1.2) is satisfied and such that g′(x∗) : X → Z is surjective. Then there exists λ∗ ∈ L^2(Ω) such that the KKT conditions

f′(x∗) + λ∗ g′(x∗) = 0 in X∗,
λ∗(s) ≥ 0, g(x∗(s)) ≤ 0, λ∗(s) g(x∗(s)) = 0 for a.e. s ∈ Ω    (1.3.10)

hold. Again, surjectivity of g′(x∗) is a sufficient, but not a necessary, condition for (1.3.10) to hold.

For later use let us consider the case where the functionals f′(x∗) ∈ X∗ and λ∗ ∈ Z∗ are represented by duality pairings, i.e.,

f′(x∗)(x) = ⟨J_X f′(x∗), x⟩_{X∗,X},  λ∗(z) = ⟨J_Z λ∗, z⟩_{Z∗,Z}

for x ∈ X and z ∈ Z. Here J_X is the canonical isomorphism from L(X, R) into the space of representations of continuous linear functionals on X with respect to the ⟨·, ·⟩_{X∗,X} duality pairing, and J_Z is defined analogously. In finite dimensions these isomorphisms are transpositions. With this notation f′(x∗) + λ∗ g′(x∗) = 0 can be expressed as

J_X f′(x∗) + g′(x∗)∗ J_Z λ∗ = 0.

Henceforth the notation of the isomorphisms will be omitted.


1.4 Applications

This section is devoted to a discussion of examples in which the general existence result of the previous section is applicable, as well as to other examples in which it fails. Throughout, Ω denotes a bounded domain in R^n with Lipschitz continuous boundary ∂Ω. We use standard Hilbert space notation as introduced in [Ad], for example.

Differently from the previous sections we shall use J to denote the cost functional and f for inhomogeneities in equality constraints representing partial differential equations. Moreover, the constraints g(x) ∈ K from the previous sections will be split into equality constraints, denoted by e(x) = 0, and inequality constraints, g(x) ∈ K, with K a nontrivial cone.

Example 1.7. Consider the unilateral obstacle problem

min J(y) = (1/2) ∫_Ω |∇y|^2 dx − ∫_Ω f y dx
over y ∈ H^1_0(Ω) and y(x) ≤ ψ(x) for a.e. x ∈ Ω,    (1.4.1)

where f ∈ L^2(Ω) and ψ ∈ H^1_0(Ω). This is a special case of (1.1.1) with X = Z = H^1_0(Ω), K = {φ ∈ H^1_0(Ω) : φ(x) ≤ 0 for a.e. x ∈ Ω}, C = X, and g(y) = y − ψ. It is well known and simple to argue that (1.4.1) admits a unique solution y∗ ∈ H^1_0(Ω). Moreover, since g′ is the identity, every element of X satisfies the regular point condition. Hence by Theorem 1.6 there exists λ∗ ∈ H^1_0(Ω)∗ such that

⟨λ∗, φ⟩_{H^{-1},H^1_0} ≤ 0 for all φ ∈ K,
⟨λ∗, y∗ − ψ⟩_{H^{-1},H^1_0} = 0,
−Δy∗ + λ∗ = f in H^{-1},

where H^{-1} = H^1_0(Ω)∗. Under well-known regularity assumptions on ψ and ∂Ω, cf. [Fr, IK1], for example, λ∗ ∈ L^2(Ω). This extra smoothness of the Lagrange multiplier does not follow from the general multiplier theory of the previous section.
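As a numerical aside (not part of the original example), the following sketch discretizes (1.4.1) on (0, 1) by finite differences, with invented data f and obstacle ψ, and computes the discrete pair (y∗, λ∗) by the primal-dual active set iteration analyzed in Chapters 7 and 8.

```python
import numpy as np

n, h = 99, 1.0 / 100
xg = np.linspace(h, 1.0 - h, n)
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2   # discrete -d^2/dx^2
f = 10.0 * np.ones(n)                                           # invented right-hand side
psi = 0.05 + 0.5 * (xg - 0.5) ** 2                              # invented obstacle, y <= psi

y, lam = np.zeros(n), np.zeros(n)
for k in range(100):
    active = lam + (y - psi) > 0
    M, rhs = A.copy(), f.copy()
    idx = np.where(active)[0]
    M[idx] = 0.0
    M[idx, idx] = 1.0
    rhs[idx] = psi[idx]                         # enforce y = psi on the active set
    y = np.linalg.solve(M, rhs)
    lam = np.where(active, f - A @ y, 0.0)      # -y'' + lam = f, lam = 0 off the active set
    if np.array_equal(active, lam + (y - psi) > 0):   # active set settled: stop
        break
print(lam.min() >= -1e-10, (y - psi).max() <= 1e-10)  # lam >= 0 and y <= psi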

The following result is useful to establish the regular point condition in diverse optimal control and parameter estimation problems. We require some notation. Let L be a closed convex cone in the real Hilbert space U which induces an ordering denoted according to u ≥ 0 if u ∈ L. Let q∗ ∈ U∗ be a nontrivial bounded linear functional on U, with π : U → ker q∗ the orthogonal projection. For φ ∈ U, μ ∈ R, and γ ∈ R+ define

g : U → Z := U × R × R

by

g(u) = (u − φ, |u|^2 − γ^2, q∗(u) − μ),

where |·| denotes the norm in U, and put K = L × R− × {0} ⊂ Z.

Proposition 1.8. Assume that [ker q∗]^⊥ ∩ L ≠ {0} and let h_0 ∈ [ker q∗]^⊥ ∩ L with q∗(h_0) = 1. If q∗(L) ⊂ R+, q∗(φ) < μ, and |π(φ)|^2 + μ^2|h_0|^2 < γ^2, then the set

M = {u ∈ U : g(u) ∈ K}

is nonempty and every element u ∈ M is regular, i.e.,

0 ∈ int{g(u) + g′(u)U − K} for each u ∈ M.    (1.4.2)

Proof. Note that U = ker q∗ ⊕ span{h_0} and that the orthogonal projection onto [ker q∗]^⊥ is given by q∗(u)h_0 for u ∈ U. To show that M is nonempty, we define

ū = φ + (μ − q∗(φ))h_0.

Observe that

ū − φ = (μ − q∗(φ))h_0 ∈ L,
|ū|^2 = |π(φ)|^2 + μ^2|h_0|^2 < γ^2,

and q∗(ū) = μ. Thus ū ∈ M. Next let u ∈ M be arbitrary. We have to verify that

0 ∈ int{(u − φ + h − L, |u|^2 − γ^2 + 2(u, h) − R−, q∗(h)) : h ∈ U} ⊂ Z,    (1.4.3)

where (·, ·) denotes the inner product in U. Put

δ = min( (γ^2 − |π(φ)|^2 − μ^2|h_0|^2) / (1 + 2γ + 2γ|h_0|), (μ − q∗(φ)) / (‖q∗‖ + 1) )

and define B = {(v, r, s) ∈ Z : |(v, r, s)|_Z ≤ δ}. Without loss of generality we endow the product space with the supremum norm in this proof. We shall show that

B ⊂ {(u − φ + h_1 + σh_0 − L, |u|^2 − γ^2 + 2(u, h_1 + σh_0) − R−, σ q∗(h_0)) : h_1 ∈ ker q∗, σ ∈ R},    (1.4.4)

which implies (1.4.3). Let (v, r, s) ∈ B be chosen arbitrarily. We put σ = s and verify that there exists (h_1, l, r−) ∈ ker q∗ × L × R− such that

(u − φ + h_1 + sh_0 − l, |u|^2 − γ^2 + 2(u, h_1 + sh_0) − r−) = (v, r).    (1.4.5)

This will imply the claim. Observe that by the choice of δ we find μ − q∗(φ) − q∗(v) + s ≥ 0. Hence l is defined by

l = (μ − q∗(φ) − q∗(v) + s)h_0 ∈ L.

Choosing h_1 = π(v − u + φ), we obtain equality in the first coordinate of (1.4.5). For the second coordinate in (1.4.5) we observe that

|u|^2 − γ^2 + 2(u, h_1 + sh_0)
= |πu|^2 + μ^2|h_0|^2 − γ^2 + 2(u, π(v − u + φ) + sh_0)
≤ |πu|^2 + μ^2|h_0|^2 − γ^2 + |πu|^2 + |πφ|^2 − 2|πu|^2 + 2γ(|v| + |s||h_0|)
≤ μ^2|h_0|^2 − γ^2 + |πφ|^2 + 2δγ(1 + |h_0|) ≤ −δ

by the definition of δ. Hence there exists r− ∈ R− such that equality holds in the second coordinate of (1.4.5).


Remark 1.4.1. If the constraint q∗(u) = μ is not present in the definition of M, then the conclusion of Proposition 1.8 holds provided that |φ| < γ.

Example 1.9. We consider a least squares formulation for the estimation of the potential c in

−Δy + cy = f in Ω,
y = 0 on ∂Ω    (1.4.6)

from an observation z ∈ L^2(Ω), where f ∈ L^2(Ω) and Ω ⊂ R^n, with n ≤ 4. For this purpose we set

M = {c ∈ L^2(Ω) : c(x) ≥ 0, |c| ≤ γ}

for some γ > 0 and consider

min J(c) = (1/2)|y(c) − z|^2 + (α/2)|c|^2
subject to c ∈ M and y(c) the solution to (1.4.6).    (1.4.7)

The Lax–Milgram theorem implies the existence of a variational solution y(c) ∈ H^1_0(Ω) for every c ∈ M. Here we use the fact that H^1(Ω) ⊂ L^4(Ω) and cy ∈ L^{4/3}(Ω) for c ∈ L^2(Ω), y ∈ H^1(Ω), for n ≤ 4. Proposition 1.8 with L = {c ∈ L^2(Ω) : c(x) ≥ 0} and

g(c) = (c, |c|^2 − γ^2)

is applicable and implies that every element of M is a regular point. Using subsequential weak limit arguments one can argue the existence of a solution c∗ to (1.4.7). To apply Theorem 1.6, Fréchet differentiability of c → J(c) at c∗ must be verified. This will follow from Fréchet differentiability of c → y(c) at every c ∈ M, which in turn can be argued by means of the implicit function theorem, for example. The Fréchet derivative of c → y(c) at c ∈ M in direction h ∈ L^2(Ω), denoted by y′(c)h, satisfies

−Δ(y′(c)h) + c (y′(c)h) = −h y(c) in Ω,
y′(c)h = 0 on ∂Ω.    (1.4.8)

With these preliminaries we obtain a Lagrange multiplier (μ_1, μ_2) ∈ L × R+ such that the following first order necessary optimality system holds:

(y(c∗) − z, y′(c∗)h)_{L^2} + α(c∗, h) − (μ_1, h) + 2μ_2(c∗, h) = 0 for all h ∈ L^2(Ω),
−(μ_1, c∗) + μ_2(|c∗|^2 − γ^2) = 0,
(μ_1, μ_2) ∈ L × R+.    (1.4.9)

The first equation in (1.4.9) can be simplified by introducing p(c∗) as the variational solution to the adjoint equation given by

−Δp + c∗p = −(y(c∗) − z) in Ω,
p = 0 on ∂Ω.    (1.4.10)

Page 29: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 11

1.4. Applications 11

Using (1.4.8) and (1.4.10) in (1.4.9) we obtain the optimality system⎧⎪⎪⎨⎪⎪⎩p(c∗) y(c∗)+ αc∗ − μ1 + 2μ2c

∗ = 0,

(μ1, μ2) ∈ −L× R+,(c∗,

∣∣c∗∣∣2 − γ 2)∈ L× R−,

(μ1, c∗) = 0, μ2

((∣∣c∗∣∣2 − γ 2

)= 0.

(1.4.11)

Example 1.10. We revisit Example 1.9 but this time the state variable is considered as anindependent variable. This results in the following problem with equality and inequalityconstraints: ⎧⎪⎨⎪⎩

min J (y, c) = 1

2|y − z|2 + α

2|c|2

subject to e(y, c) = 0 and g(c) ∈ L× R−,(1.4.12)

where g is as in Example 1.9 and e : H 10 (�)× L2(�)→ H−1(�) is defined by

e(y, c) = −�y + cy − f.

Note that cy ∈ L43 (�) ⊂ H−1(�) for c ∈ L2(�), y ∈ H 1(�) if n ≤ 4. Let

(c∗, y∗) ∈ L2(�) × H 10 (�) denote a solution of (1.4.12). Fréchet differentiability of J, e,

and g is obvious for this formulation with y as independent variable. The regular pointcondition now has to be established for the constraint(

e

g

)∈ K = {0} × L× R− ⊂ H−1(�)× L2(�)× R.

The second and third components are treated as in Example 1.9 by means of Propo-sition 1.8 and the first coordinate is incorporated by an existence argument for (1.4.6). Thedetails are left to the reader. Consequently there exists a Lagrange multiplier (p, μ1μ2) ∈H 1

0 (�)× L× R+ such that the derivative of

L(y, c, p, μ1, μ2)

= J (y, c)+ 〈e(y, c), p〉H−1,H 10+ (μ1,−c)+ μ2

(∣∣c∗∣∣2 − γ 2

)with respect to (y, c) is zero at (y∗, c∗, p, μ1, μ2). This implies that p is the variationalsolution in H 1

0 (�) to {−�p + c∗p=−(y∗ − z) in �,

p= 0 on ∂�

and

py∗ + αc∗ − μ1 + 2μ2c∗ = 0.

In addition, of course,

(μ1, μ2) ∈ −L× R+,(c∗,

∣∣c∗∣∣2 − γ 2

)∈ L× R−,

(μ1, c∗) = 0, μ2

(∣∣c∗∣∣2 − γ

)= 0.

Page 30: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 12

12 Chapter 1. Existence of Lagrange Multipliers

Thus the first order optimality condition derived in Example 1.9 above and the presentone coincide, and the adjoint variable of Example 1.9 becomes the Lagrange multiplier ofthe equality constraint e(y, c) = 0.

Example 1.11. We consider a least squares formulation for the estimation of the diffusioncoefficient a in {−(ayx)x + cy = f in � = (0, 1),

y(0) = y(1) = 0,(1.4.13)

where f ∈ L2(0, 1) and c ∈ L2(�), c ≥ 0, are fixed. We assume that a is known outsideof an interval I = (β, γ ) with 0 < β < γ < 1 and that it coincides there with the valuesof a known function ν ∈ H 1(0, 1), which satisfies min {ν(x) : x ∈ (0, 1)} > 0. The set ofadmissible parameters is then defined by

M ={a ∈ H 1(I ) : a ≥ ν,

∣∣a − a∣∣H 1(I )

≤ γ, a(β) = ν(β),

a(γ ) = ν(γ ), and∫ γ

β

a dx = m}.

Here a ∈ H 1(I ) is a fixed reference parameter with a(β) = ν(β), a(γ ) = ν(γ ), andm = ∫ γ

βa dx. Recall that H 1(I ) ⊂ C(I ) and hence a(β) and a(γ ) are well defined. The

coefficient a appearing in (1.4.13) coincides with ν on the complement of I and with someelement of M on I . Consider the least squares formulation⎧⎨⎩min 1

2

∣∣y(a)− z∣∣2L2(0,1) + α

2

∣∣a∣∣2H 1(I )

subject to a ∈ M and y(a) solution to (1.4.13),(1.4.14)

where z ∈ L2(0, 1) is given. It is simple to argue the existence of a solution a∗ to (1.4.14).Note that minimizing over M implies that the mean of the coefficient a is known. This,together with some additional technical assumptions, implies that the solution a∗ is (locally)stable with respect to perturbations of z [CoKu]. To argue the existence of Lagrange mul-tipliers associated to the constraints defining M one considers the set of shifted parametersM = {a − a : a ∈ M}, i.e.,

M ={a ∈ H 1(I ) : a ≥ ν − a,

∣∣a∣∣H 1(I )

≤ γ,

∫ γ

β

a dx = 0, a(β) = a(γ ) = 0

}.

We apply Proposition 1.8 with U = H 10 (I ), and (|u|2

L2(I )+ |ux |2L2(I )

)1/2 as norm,

L = {a ∈ U : a ≥ 0}, q∗(a) = ∫ γ

βa dx, ϕ = ν − a ∈ U , and μ = 0. The mapping

g : U → U × R× R is given by

g(a) =(a − ϕ,

∣∣a∣∣2H 1(I )

− γ 2, q∗(a)).

Let us assume that∫ γ

βν dx < m and

∣∣ν − a∣∣H 1(I )

< γ . Then we have q∗(L) ⊂R+, q∗(ϕ) < m− ∫ γ

βadx = μ, and∣∣πϕ∣∣

H 1(I )= ∣∣π(ν − a)

∣∣H 1(I )

≤ ∣∣ν − a∣∣H 1(I )

< γ.

Page 31: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 13

1.4. Applications 13

It remains to ascertain that [ker q∗]⊥∩L is nonempty. Let ψ be the unique solution of{−ψxx + ψ = 1 on (β, γ ),

ψ(β) = ψ(γ ) = 0.

By the maximum principle ψ ∈ L. A short calculation shows that ψ ∈ [ker q∗]⊥ andhence h0 := q∗(ψ)−1ψ satisfies h0 ∈ [ker q∗]⊥ ∩ L and q∗(h0) = 1.

Next we present examples where the general existence result of Section 1.3 is notapplicable. Different techniques to be developed in the following sections will guarantee,however, the existence of a Lagrange multiplier. In these examples we shall again use e forequality and g for inequality constraints.

Example 1.12. Consider the problem{min x2

1 + x23

subject to e(x1, x2, x3) = 0,(1.4.15)

where

e(x1, x2, x3) =(x1 − x2

2 − x3

x22 − x2

3

).

Here X = R3, Z = R2, e of Section 1.3 is e here, and K = {0}. Note that x∗ = (0, 0, 0) is asolution of (1.4.15) and that ∇e(x∗) is not surjective. Hence x∗ is not a regular point for theconstraint e(x) = 0. Nevertheless (0, λ2) with λ2 ⊂ R arbitrary is a Lagrange multiplierfor the constraint in (1.4.15).

Note that in case C = X and K = {0}, the last requirement in Definition 1.1 becomes

f ′(x∗)+ e′(x∗)∗λ∗ = 0 in X∗,

where e′(x∗)∗ : Z∗ → X∗ is the adjoint of e′(x∗). Hence if e′(x∗) is not surjective and if ithas closed range, then ker e′(x∗)∗ is not empty and the Lagrange multiplier is not unique.

Example 1.13. Here we consider optimal control of the equation with elliptic nonlinearity(Bratu problem) {−�y + exp(y) = u in �,

∂y

∂n= 0 on � and y = 0 on ∂� \ �, (1.4.16)

where � is a bounded domain inRn, n ≤ 3, with smooth boundary ∂� and � is a connected,strict subset of ∂�. Let H 1

�(�) = {y ∈ H 1(�) : y = 0 on ∂� \ �}. The following lemmaestablishes the concept of solution to (1.4.16). Its proof is given at the end of this section.

Lemma 1.14. For every u ∈ L2(�) the variational problem

(∇y,∇v)+ 〈exp y, v〉H 1�(�)∗,H 1

�(�) = (u, v) for v ∈ H 1�(�)

has a unique solution y = y(u) ∈ H 1�(�) and there exists a constant C such that

|y|H 1�∩L∞ ≤ C(|u|L2(�) + C) for all u ∈ L2(�) (1.4.17)

Page 32: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 14

14 Chapter 1. Existence of Lagrange Multipliers

and

|y(u1)− y(u2)|H 1�∩L∞ ≤ C|u1 − u2|L2(�) for all ui ∈ L2(�), i = 1, 2. (1.4.18)

The optimal control problem is given by⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩min J (y, u) = 1

2 |y − z∣∣2L2(�)

+ α2 |u

∣∣2L2(�)

subject to (y, u) ∈ (H 1

�(�) ∩ L∞(�))× L2(�)

and (y, u) solution to (1.4.17),

(1.4.19)

where z ∈ L2(�). To consider this problem as a special case of (1.1.1) we set X =Y × L2(�), where Y = H 1

�(�) ∩ L∞(�), Z = H 1�(�)∗,K = {0}, and define e : X→ Z∗

as the mapping which assigns to (y, u) ∈ Y × U the functional

v → (∇y,∇v)+ (exp(y), v)− (u, v).

The control component of any minimizing sequence {(yn, un)} for J is bounded. Togetherwith Lemma 1.14 it is therefore simple to argue that (1.4.19) admits a solution (y∗, u∗) ∈Y × U . Clearly J and e are continuously Fréchet differentiable. If (y∗, u∗) was a regularpoint, then this would require surjectivity of e′(y∗, u∗) : X → H 1

�(�)∗. For (δy, δu) ∈X, e′(y∗, u∗)(δy, δu) is the functional defined by

v → (∇δy,∇v)+ (exp(y∗)δy − δu, v), v ∈ H 1�(�).

Since−�+exp(y∗) is an isomorphism from H 1�(�) to H 1

�(�)∗ and H 1�(�) is not contained

in L∞(�), if n > 1, it follows that e′(y∗, u∗) : X→ H 1�(�)∗ is not surjective if n > 1.

Example 1.15. We consider the least squares problem for the estimation of the vector-valued convection coefficient u in{−�y + u · ∇y = f in �,

y = 0 on ∂�, div u = 0,(1.4.20)

from data z ∈ L2(�). Here � is a bounded domain inRn, n ≤ 3, with smooth boundary andf ∈ L2(�). To cast the problem in abstract form we chooseU = {

u ∈ L2n(�) : div u = 0

},

X = (H 1

0 (�) ∩ L∞(�))×U,Z = H−1(�),K = {0} and define e : X→ Z as the mapping

which assigns to (y, u) ∈ X the functional

v → (∇y,∇v)− (uy,∇v)− (f, v) for v ∈ H 10 (�).

Note that e(y, u) is not well defined for (y, u) ∈ H 10 (�)× U , since u · ∇y ∈ L1(�) only.

The regularized least squares problem is given by⎧⎨⎩min J (y, u) = 1

2 |y − z|2L2(�)

+ α2 |u|2L2

n(�)

subject to e(y, u) = 0, div u = 0, (y, u) ∈ X.

(1.4.21)

Page 33: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 15

1.4. Applications 15

Observe that (u · ∇y, y) = (∇ · (uy), y) = 0. Using this fact and techniques similar tothose of the proof of Lemma 1.14 (compare [Tr, Section 2.3] and [IK14, IK15]) it can beshown that for every u ∈ U there exists a unique solution y = y(u) ∈ X of (1.4.20) and forevery bounded subset B of U there exists a constant k(B) such that

|y(u)|X ≤ k(B)|u|L2n

for all u ∈ B. (1.4.22)

Extracting a subsequence of a minimizing sequence to (1.4.21) it is simple to argue theexistence of a solution (y∗, u∗) ∈ X to (1.4.21). Clearly e is Fréchet differentiable at(y∗, u∗) and for δy ∈ H 1

0 (�) ∩ L∞(�), ey(y∗, u∗)δy ∈ H−1 is the functional defined by⟨

ey(y∗, u∗)δy, v

⟩H−1,H 1

0= (∇δy,∇v)+ (u∗ · ∇δy, v) with v ∈ H 1

0 (�).

Note that ey(y∗, u∗) is well defined on H 10 (�) ∩ L∞(�). But as a consequence of the

bilinear term u∗ · ∇δy which is only in L1(�), ey(y∗, u∗) is not defined on H 1

0 (�). Theoperator e′(y∗, u∗) from X to H−1(�) is not surjective if n > 1, and hence (y∗, u∗) doesnot satisfy the regular point condition.

We now turn to the proof of Lemma 1.14 for which we require the following resultfrom [Tr].

Lemma 1.16. Let ϕ : (k1, h1)→ R be a nonnegative, nonincreasing function and supposethat there are positive constants r,K , and β, with β > 1, such that

ϕ(h) ≤ K(h− k)−rϕ(k)β for k1 < k < h < h1.

If k := K1r 2

β

β−1 ϕ(k1)β−1r satisfies k1 + k < h1, then ϕ(k1 + k) = 0.

Proof of Lemma 1.14. Let us first argue the existence of a solution y = y(u) ∈ of

(∇y,∇v)+ (ey, v)H 1�(�)∗,H 1

�(�) = (u, v) for all v ∈ H 1�(�). (1.4.23)

Since (exp (s1) − exp (s2))(s1 − s2) ≥ 0 for s1, s2 ∈ R, it follows that −�φ + exp (φ) −γ I defines a maximal monotone operator from a subset of H 1

�(�) to H 1�(�)∗ for some

γ > 0 [Ba], and hence (1.4.23) has a unique variational solution y = y(u) ∈ H 1�(�) for

every u ∈ L2(�); see [IK15] for details. Since

|∇(y(u1)− y(u2))|2 ≤ (u1 − u2, y(u1)− y(u2))

and ∂� \ � is nonempty, there exists an embedding constant C such that

|y(u1)− y(u2)|H 1�(�) ≤ C |u1 − u2|L2(�) (1.4.24)

for u1, u2 in L2(�), and

|y(u)|H 1�≤ C(|u|L2(�) + C) for all u ∈ L2(�).

Throughout the remainder of the proof C will denote a generic constant, independentof u ∈ L2(�). To verify (1.4.17) it remains to obtain an L∞(�) bound for y = y(u). The

Page 34: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 16

16 Chapter 1. Existence of Lagrange Multipliers

proof is based on a generalization of well-known L∞(�) estimates due to Stampacchia andMiranda [Tr] for linear variational problems to the nonlinear problem (1.4.23). Let us aimfirst for a pointwise (a.e.) upper bound for y. For k ∈ (0,∞) we set yk = (y − k)+ and�k = {x ∈ � : yk > 0}. Note that yk ∈ H 1

�(�) and yk ≥ 0. Using (1.4.23) we find

(∇yk,∇yk) = (∇y,∇yk) = (u, yk)− (ey, yk) ≤ (u, yk),

and hence

|∇yk|2L2 ≤ |u|L2 |yk|L2 . (1.4.25)

By Hölder’s inequality and a well-known embedding result

|yk|L2 =(∫

�k

y2k

) 12

≤ |yk|L6 |�k| 13 ≤ C|∇yk|L2 |�k| 1

3 .

Here we used the assumption that n ≤ 3. Employing this estimate in (1.4.25) implies that

|∇yk|L2 ≤ C|�k| 13 |u|L2 . (1.4.26)

We denote by h and k arbitrary real numbers satisfying 0 < k < h <∞, and we find

|yk|4L4 =∫�k

(y − k)4 >

∫�h

(y − k)4 ≥ |�h|(h− k)4,

which, combined with (1.4.26), gives

|�h| ≤ C(h− k)−4|�k| 43 |u|4L2 , (1.4.27)

where the constant C is independent of h, k, and u. It will be shown that Lemma 1.16 isapplicable to (1.4.27) with ϕ(k) = |�k|, β = 4

3 and K = C|u|4L2 . The conditions on k1 and

h1 can easily be satisfied. In fact, in our case k1 = 0, h1 = ∞, and k = C14 |u|L2 2

β

β−1 |�0| β−14 .

The condition k1 + k < h1 is satisfied since

k = C14 |u|L2 2

β

β−1 |�0| β−14 < C

14 |u|L2 2

β

β−1 |�| β−14 <∞.

We conclude that |�k| = 0 and hence y ≤ k a.e. in �. A uniform lower bound ony can be obtained in an analogous manner by considering yk = (−(k + y))+. We leavethe details to the reader. This concludes the proof of (1.4.17). To verify (1.4.18) the H 1

estimate for y(u1)−y(u2) is already clear from (1.4.24) and it remains to verify the L∞(�)

estimate. Let us set yi = y(ui), z = y1 − y2, zk = (z − k)+, and �k = {x ∈ � : zk > 0}for k ∈ (0,∞). We obtain

|∇zk|2L2 = (∇z,∇zk) = (u1 − u2, zk)− (ey1 − ey2 , zk) ≤ (u1 − u2, zk).

Proceeding as above with y and yk replaced by z and zk the desired pointwise upper boundfor y1 − y2 is obtained. For the lower bound we define zk = (−(k + z))+ for k ∈ (0,∞)

and �k = {x ∈ � : zk > 0} = {x : k + y1(x) < y2(x)}. It follows that

|∇zk|2L2 = −(∇(y1 − y2),∇zk) = (ey1 − ey2 , zk)− (u1 − u2, zk) ≤ −(u1 − u2, zk).

From this inequality we obtain the desired uniform pointwise lower bound ony1 − y2.

Page 35: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 17

1.5. Weakly singular problems 17

1.5 Weakly singular problemsWe consider the optimization problem with equality and inequality constraints of the type{

min J (y, u)

subject to e(y, u) = 0, u ∈ C ⊂ U,(1.5.1)

with J : Y × U → R, e : Y1 × U → W , where Y,U,W are Hilbert spaces and Y1 is aBanach space that is densely embedded in Y . Further C is a closed convex subset of U .Below W ∗ will denote the dual of W . Let (y∗, u∗) denote a local solution to (1.5.1) and letV (y∗)×V (u∗) ⊂ Y1×U denote a neighborhood of (y∗, u∗) such that J (y∗, u∗) ≤ J (y, u)

for all (y, u) ∈ V (y∗)× (V (u∗)∩C) satisfying e(y, u) = 0. It is assumed throughout thatJ is Fréchet differentiable in a neighborhood in the Y × U topology of (y∗, u∗) and thatthe Fréchet derivative is locally Lipschitz continuous. Further e is assumed to be Fréchetdifferentiable at (y∗, u∗) with Fréchet derivative

e′(y∗, u∗)(δy, δu) = ey(y∗, u∗)δy + eu(y

∗, u∗)δu.

In particular ey(y∗, u∗) ∈ L(Y1,W). Since Y1 is dense in Y , one may consider ey(y∗, u∗)as densely defined linear operator with domain in Y . To distinguish this operator fromey(y

∗, u∗) defined on Y1 we shall denote it in this section by

G : D(G) ⊂ Y → W

and we assume that

(H1) G∗ : D(G∗) ⊂ W ∗ → Y ∗ is densely defined.

Then necessarily G is closable [Ka, p. 168]. Its closure will be denoted by the same symbol.In addition the following assumptions will be required:

(H2) Jy(y∗, u∗) ∈ Rg G∗.

Condition (H2) is a regularity assumption. It implies the existence of a solution λ∗ ∈ D(G∗)to the adjoint equation

G∗λ+ Jy(y∗, u∗) = 0,

which is a Lagrange multiplier associated to the constraint e(y, u) = 0.

(H3) There exists a dense subset D of C with the following property: For every u ∈ D

there exists tu > 0 such that for all t ∈ [0, tu] there exists y(t) ∈ Y1 satisfyinge(y(t), u∗ + t (u− u∗)) = 0 and

limt→0+

1

t|y(t)− y∗|2Y = 0. (1.5.2)

(H4) For every u ∈ D and y(·) as in (H3), e is directionally differentiable at every elementof {(y∗ + s(y(t) − y∗), u∗ + st (u − u∗)) : s ∈ [0, 1], t ∈ [0, tu]} in all directions(y, u) ∈ Y1 × U and

limt→0+

1

t

⟨∫ 1

0[e′(y∗+s(y(t)−y∗), u∗+stv)−e′(y∗, u∗)](y(t)−y∗, tv)ds, λ∗

⟩W,W ∗

= 0,

where v = u− u∗.

Page 36: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 18

18 Chapter 1. Existence of Lagrange Multipliers

Note that (H4) is satisfied if (H3) holds with Y replaced by Y1 and if e : Y1 → W is Fréchetdifferentiable with locally Lipschitzian derivative.

Our assumptions do not require surjectivity of e′(y∗, u∗) : Y1 × U �→ W which isrequired by (1.3.1) nor that e′(y∗, u∗) is well defined on all of Y ×U . Below we shall giveseveral examples which illustrate the applicability of the assumptions. For now, we arguethat if they are satisfied, then a Lagrange multiplier with respect to the equality constraintexists.

Theorem 1.17. Let (y∗, u∗) be a local solution of (1.5.1) and assume that (H1)–(H4) hold.Then⎧⎪⎨⎪⎩

e(y∗, u∗) = 0 in W (primal equation),G∗λ∗ + Jy(y

∗, u∗) = 0 in Y ∗ (adjoint equation),⟨eu(y

∗, u∗)∗λ∗ + Ju(y∗, u∗), u− u∗

⟩U∗,U

≥ 0 for all u ∈ C (optimality).

(1.5.3)

The system of equations (1.5.3) is referred to as an optimality system.

Proof. Let u ∈ D, set v = u− u∗, choose tu according to (H3), and assume that t ∈ (0, tu].Due to (H3) and (H4)

0 = e(y(t), u∗ + tv)− e(y∗, u∗) = G(y(t)− y∗)+ t eu(y∗, u∗) v

+∫ 1

0[e′(y∗ + s(y(t)− y∗), u∗ + stv)− e′(y∗, u∗)](y(t)− y∗, tv)ds.

(1.5.4)

(H2) implies the existence of a solution λ∗ to the adjoint equation. Observe that by (1.5.4)and the fact that u∗ is a local solution to (1.5.1)

0 ≤ J (y(t), u∗ + tv)− J (y∗, u∗) = J ′(y∗, u∗)(y(t)− y∗, tv)

+∫ 1

0

[J ′(y∗ + s(y(t)− y∗), u∗ + stv)− J ′(y∗, u∗)

](y(t)− y∗, tv)ds

+〈G∗λ∗, y(t)− y∗〉Y ∗,Y + t〈eu(y∗, u∗)v, λ∗〉W,W ∗

+∫ 1

0

⟨[e′(y∗ + s(y(t)− y∗), u∗ + stv)− e′(y∗, u∗)

](y(t)− y∗, tv), λ∗

⟩W,W ∗

ds.

By the second equation in (1.5.3), local Lipschitz continuous differentiability of J , (H3),(H4), and the fact that J ′(y∗, u∗)(y(t) − y∗, tv) = Jy(y

∗, u∗)(y(t) − y∗) + tJu(y∗, u∗)v

we obtain

0 ≤ limt→0+

1

tJ (y(t), u∗ + tv)− J (y∗, u∗)) = 〈Ju(y∗, u∗)(u∗)+ eu(y

∗, u∗)∗λ∗, u− u∗〉U∗,U .

Since u is arbitrary in D and D is dense in C it follows that

〈Ju(y∗, u∗)(u∗)+ eu(y∗, u∗)∗λ∗, u− u∗〉U∗,U ≥ 0 for all u ∈ C.

This ends the proof.

Page 37: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 19

1.5. Weakly singular problems 19

We next give several examples which demonstrate the applicability of hypotheses(H1)–(H4) and the necessity to allow for two spaces Y1 and Y with Y1 � Y . The typicalsituation that we have in mind is Y1 = Y ∩ L∞(�) with Y a Hilbertian function spaceover �.

Example 1.18. Consider first the finite-dimensional equality-constrained optimizationproblem ⎧⎨⎩

min y21 + u2,

y1 − y22 = u,

y32 = u2

(1.5.5)

for which (y∗, u∗) = (0, 0, 0) is the solution. Here Y = R2, U = R1, W = R1, and

e(y, u) =(y1 − y2

2 − u

y32 − u2

).

Note that with (y1, y2, u) = (x1, x2, x3) this problem coincides with Example 1.12. Werecall that e′(y∗, u∗) is not surjective and the theory of Section 1.3 assuring the existenceof a Lagrange multiplier is therefore not applicable. However, G∗ = (

1 00 0

)and the adjoint

equation

G∗(λ1

λ2

)=

(00

)has infinitely many solutions. Thus (H1), (H2) are satisfied. As for (H3) note that

(y1(u)y2(u)

) =(u+u 4

3

u23

)defines a solution branch to e(y, u) = 0 for which (H3) is satisfied. It is simple to

verify (H4). Hence (1.5.5) is an example for an optimization problem where all hypotheses(H1)–(H4) are satisfied and Theorem 1.17 is applicable.

Example 1.19. We consider the optimal control problems with distributed control

min J (y, u) = 1

2|y − z|2L2(�) +

α

2|u|2L2(�) (1.5.6)

subject to ⎧⎨⎩−�y + exp(y) = u in �,∂y

∂n= 0 on �,

y = 0 on ∂� \ �,(1.5.7)

where α > 0, z ∈ L2(�), � is a bounded domain in Rn, n ≤ 3, with smooth boundary ∂�,and � is a connected, strict subset of ∂�. The concept of solution to (1.5.7) is the variationalone of Lemma 1.14. To consider this problem in the general setting of this section the controlvariable u is chosen inU = L2(�). We set Y = H 1

�(�) = {y ∈ H 1(�) : y = 0 on ∂�\�},Y1 = H 1

�(�) ∩ L∞(�), and W = (H 1�(�))∗. Moreover e : Y1 × U → W is defined by

assigning to (y, u) ∈ Y1 × U the functional on W ∗ given by

v → (∇y,∇v)+ (exp y, v)− (u, v) for v ∈ H 1�(�).

Page 38: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 20

20 Chapter 1. Existence of Lagrange Multipliers

Let (y∗, u∗) ∈ Y1 denote a solution to (1.5.6), (1.5.7). For (y, u) ∈ Y1 × L2(�) theFréchet derivative ey(y, u)δy of e with respect to y in direction δy ∈ H 1

�(�) is given by thefunctional v → (∇δy,∇v)+ ((exp y)δy, v). Clearly ey(y, u) : Y → W is symmetric and(H1), (H2) are satisfied. (H3) is a direct consequence of Lemma 1.14. To verify (H4) notethat e′(y, u)(δy, δu) is the functional defined by

v → (∇δy,∇v)+ (exp(y)δy, v)− (δu, v)� for v ∈ H 1�(�).

If (y, u) ∈ Y1 × U , then e′(y, u) is well defined on Y × U . (H4) requires us to consider

limt→0+

1

t

∫ 1

0

∫�

(exp(y∗ + s(y(t)− y∗))− exp y∗)(y(t)− y∗)λ∗dxds, (1.5.8)

where y(t) is the solution to (1.5.7) with u = u∗ + tv, v ∈ U . Note that λ∗ ∈ L∞ and that{|y(t)|L∞ : t ∈ [0, 1]} is bounded by Lemma 1.14. Moreover |y(t)− y∗|Y1 ≤ c t |v|L2 for aconstant c independent of t ∈ [0, 1], and thus the pointwise local Lipschitz property of theexponential function implies that the limit in (1.5.8) is zero. (H4) now easily follows.

The considerations of this example remain correct for cost functionals J that are muchmore general than the one in (1.5.6). In fact, it suffices thatJ is weakly lower semicontinuousfrom Y×U to R and radially unbounded with respect to u, i.e., limn→∞ |un|L2 = ∞ impliesthat lim supn→∞ inf y∈Y J (y, un) = ∞. This guarantees existence of a solution (y∗, u∗).The general regularity assumptions and (H1)–(H4) are satisfied if J : Y × U → R iscontinuous and Fréchet differentiable in a neighborhood of (y∗, u∗) with locally Lipschitzcontinuous derivative.

Example 1.20. Consider the nonlinear optimal boundary control problem

min J (y, u) = 1

2|y − z|2L2(�) +

α

2|u|2Hs(�) (1.5.9)

subject to ⎧⎨⎩−�y + exp y = f in �,∂y

∂n= u on �,

y = 0 on ∂� \ �,(1.5.10)

where α > 0, z ∈ L2(�), f ∈ L∞ are fixed, � is a bounded domain in Rn, with smoothboundary ∂�, and � is a nonempty, connected, strict subset of ∂�. Further s is a realnumber strictly larger than n−3

2 if n ≥ 3, and s = 0 if n < 3. Differently from Example 1.19the dimension n of � is now allowed to be arbitrary. This example can be treated within thegeneral framework of this book by setting Y, Y1,W as in Example 1.19. The control spaceU is chosen to be Hs(�). To verify (H1)–(H4) one proceeds as in Example 1.19 by utilizingthe following lemma. Its proof is given in [IK15] and is similar to that of Lemma 1.14.

Lemma 1.21. The variational problem

(∇y,∇v)+ (exp y, v) = (f, v)+ (u, v)�, v ∈ H 1�(�),

Page 39: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 21

1.5. Weakly singular problems 21

has a unique solution y = y(u) ∈ H 1�(�) ∩ L∞(�) for every u ∈ Hs(�), and there exists

a constant c such that

|y|H 1�∩L∞ ≤ c(|u|Hs(�) + c) for all u ∈ Hs(�). (1.5.11)

Moreover, c can be chosen such that

|y(u1)− y(u2)|H 1�(�)∩L∞ ≤ c|u1 − u2|Hs for all ui ∈ Hs(�), i = 1, 2. (1.5.12)

Example 1.22. We reconsider (1.4.21) from Example 1.15 as a special case of (1.5.1). Forthis purpose we set Y = H 1

0 (�), Y1 = H 10 (�) ∩ L∞(�), W = H−1(�), and e as in

Example 1.15. Let (y∗, u∗) ∈ Y1 × U denote a solution of (1.4.21). Clearly e is Fréchetdifferentiable at (y∗, u∗) and the partial derivative G = ey(y

∗, u∗) ∈ L(Y1,W) is thefunctional characterized by

〈ey(y∗, u∗)δy, v〉W,W ∗ = (∇δy,∇v)+ (u∗ · ∇δy, v), v ∈ W ∗ = H 10 (�).

As a consequence of the quadratic term u∗ · ∇δy which is only in L1(�), G is not definedon all of Y = H 1

0 (�). As an operator from Y1 to W , the operator G is not surjective.Considered as an operator with domain in Y , its adjoint is given by

G∗w = −�w − ∇ · (u∗w).

The domain of G∗ contains Y1 and hence G∗ is densely defined. Moreover its range con-tains L2(�) and thus (H1) as well as (H2) are satisfied. Let U(u∗) ⊂ U be a boundedneighborhood of u∗. Since for every u ∈ U(u∗)

(∇(y(u)− y∗),∇v)− (u(y(u)− y∗),∇v) = ((u− u∗)y∗,∇v) for all v ∈ H 10 (�),

it follows that there exists a constant k > 0 such that

|y(u)− y∗|H 1 ≤ k|u− u∗|L2n

for all u ∈ U(u∗), (1.5.13)

and (H3) follows. The validity of (H4) is a consequence of (1.5.13) and the fact that λ∗ isthe unique variational solution in H 1

0 (�) to

−�λ∗ − ∇ · (u∗λ∗) = −(y∗ − z)

and hence an element of L∞(�).

Remark 1.5.1. Comparing Examples 1.19 and 1.20 with Example 1.22 we observe that thelinearization e′(y, u), with (y, u) ∈ Y1 × U , is well defined on Y × U for Examples 1.19and 1.20 but it is only defined with domain strictly contained in Y × U for Example 1.22.For none of these examples is e defined on all of Y × U .

Example 1.23. Here we consider the nonlinear optimal control problem with nonlinearityof blowup type: ⎧⎪⎪⎨⎪⎪⎩

min 12 |∇(y − z)|2

L22(�)

+ α2 |u|2L2(�)

subject to

−�y − exp y = f in �,∂y

∂n= u on �,

y = 0 on ∂� \ �,(1.5.14)

Page 40: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 22

22 Chapter 1. Existence of Lagrange Multipliers

where α > 0, z ∈ H 1�(�), f ∈ L2(�), � is a smooth bounded domain in R2, and � ⊂ ∂�

is a connected strict subset of ∂�. Since � is assumed to be a two-dimensional domain wehave the following property of the exponential function: for every p ∈ [1,∞),

{| exp y|Lp : y ∈ B} is bounded (1.5.15)

provided that B is a bounded subset of H 10 (�) [GT, p. 155]. The variational form of the

boundary value problem in (1.5.14) is given by

(∇y,∇v)− (exp y, v) = (f, v)+ (u, v)� for all v ∈ H 1�(�), (1.5.16)

where H 1�(�) is defined in Example 1.19. To argue existence of a solution to (1.5.14) let

{(yn, un)} be a minimizing sequence with weak limit (y∗, u∗) ∈ H 10 (�) × L2(�). Due to

(1.5.15) and the radial unboundedness of the cost functional with respect to the H 1�(�) ×

L2(�)-norm the set {| exp yn|Lp : n ∈ N} is bounded for everyp ∈ [1,∞) and {| exp yn|W 1,p :n ∈ N} is bounded for every p ∈ [1, 2). Since W 1,p(�) is compactly embedded inL2(�) for every p ∈ (1, 2) it follows that for a subsequence of {yn}, denoted by thesame symbol, lim exp(yn) = exp y∗ in L2(�). It is now simple to argue that (y∗, u∗) is asolution to (1.5.14). Let us discuss then the validity of (H1)–(H4) with Y = Y1 = H 1

�(�),W = (H 1

�(�))∗, with the obvious choice for J , and with e : Y × U → W the mappingassigning to (y, u) ∈ Y × U the functional v → (∇y,∇v)− (exp y, v)− (f, v)− (u, v)�for v ∈ H 1

�(�). From (1.5.15) it follows that e is well defined and its Fréchet derivative at(y, u) in direction (δy, δu) is characterized by

(e′(y, u)(δy, δu), v) = (∇δy,∇v)− (exp(y)δy, v)− (δu, v)� for v ∈ H 1�(�).

The operator G = ey(y∗, u∗) can be expressed as

G(δy) = −�δy − exp(y∗)δy.

Note that G ∈ L(Y,W ∗), and G is self-adjoint with compact resolvent. In particular, (H1)is satisfied. The spectrum of G consists of eigenvalues only. It will be assumed that

0 is not an eigenvalue of G. (1.5.17)

Due to the regularity assumption for z (note that it would suffice to have �z ∈ (H 1�)∗),

(1.5.17) implies that (H2) holds. To argue the validity of (H3) and (H4) one can rely on theimplicit function theorem. Let B be a bounded open neighborhood of y∗ in H 1

�(�). Using(1.5.15) one argues the existence of a constant κ > 0 such that

| exp y − exp y|L4 ≤ κ|y − y|H 1 for all y, y ∈ B.

It follows that e is continuous on B × U and its partial derivative ey(y, u) is Lipschitzcontinuous with respect to (y, u) ∈ B × U . The implicit function theorem implies theexistence of a neighborhoodU(u∗) ofu∗ such that for everyu ∈ U(u∗) there exists a solutiony(u∗) of (1.5.16) depending continuously on u. Since e(y, u) is Lipschitz continuous withrespect to u it follows, moreover, that there exists L > 0 such that

|y(u)− y∗|H 1 ≤ L|u− u∗|L2(�) for all u ∈ U(u∗).

(H3) and (H4) are a direct consequence of this estimate.

Page 41: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 23

1.6. Approximation, penalty, and adapted penalty techniques 23

The methodology utilized to consider this example can also be applied to Examples 1.19 and1.20 provided that � is restricted to being two-dimensional. This is essential for (1.5.15) tohold. For Example 1.23 it is essential for the cost functional to be radially unbounded withrespect to the H 1

�(�)-norm for the y-component to guarantee that minimizing sequences arebounded. For Examples 1.19 and 1.20 the a priori bound on the y-component of minimizingsequences can be obtained through the state equation.

1.6 Approximation, penalty, and adapted penaltytechniques

As in the previous section we consider problems of type{min J (y, u)

subject to e(y, u) = 0, u ∈ C ⊂ U.(1.6.1)

We describe techniques that in specific situations can be more powerful than thegeneral theory of Section 1.4 to obtain optimality systems of type⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩

e(y∗, u∗) = 0,

e∗y(y∗, u∗)λ∗ + Jy(y∗, u∗) = 0,⟨

e∗u(y∗, u∗)λ∗ + Ju(y∗, u∗), u− u∗

⟩U∗,U ≥ 0 for all u ∈ C,

(1.6.2)

where (y∗, u∗) denotes a solution to (1.6.1). We shall proceed formally here, referring thereader to [Ba, Chapter 3.2.2] and [Lio1] for details.

1.6.1 Approximation techniques

If J and/or e are not smooth, then one can aim for obtaining an optimality system of the form(1.6.2) by introducing families of smooth operators eε and smooth functionals Jε, ε > 0,and replacing (1.6.1) by {

min Jε(y, u)

subject to eε(y, u) = 0, and u ∈ C.(1.6.3)

Let (yε, uε), ε > 0, denote solutions to (1.6.3). Under relatively mild conditions it can beargued that {(yε, uε)}ε>0 contains accumulation points and that they are solutions of (1.6.2).But they need not coincide with (y∗, u∗). To overcome this difficulty a penalty term is addedto the cost in (1.6.3), resulting in the adapted penalty formulation⎧⎪⎨⎪⎩

min Jε(y, u)+ 1

2| u− u∗|2

subject to eε(y, u) = 0 and u ∈ C.

(1.6.4)

It can be shown that under appropriate assumptions (yε, uε) converges to (y∗, u∗) and thatthere exists λ∗ such that (1.6.2) holds. This approach can be used for optimal control ofsemilinear elliptic equations, variational inequalities [Ba, BaK1, BaK2], and state-dependentparameter estimation problems [BKR], for example.

Page 42: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 24

24 Chapter 1. Existence of Lagrange Multipliers

1.6.2 Penalty techniques

This technique is well suited for systems with multiple steady states or finite explosionproperty. The procedure is explained by means of an example. We consider the nonlinearparabolic control system⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩

yt −�y − y3 = u in Q = (0, T )×�,

y = 0 on � = (0, T )× ∂�,

y(0, ·) = y0 in �,

(1.6.5)

where y = y(t, x), T > 0, and y0 is a given initial condition. With (1.6.5) we associate theoptimal control problem⎧⎪⎨⎪⎩

min J (y, u) = 1

6|y − z|6L6(Q) +

α

2|u|2L2(Q)

subject to u ∈ C, y ∈ H 1,2(Q) and (1.6.5),

(1.6.6)

where α > 0, z ∈ L6(Q), C is a closed convex set in L2(Q), and H 1,2(Q) = {ϕ ∈ L2(Q) :

ϕt , ϕxi , ϕxixj ∈ L2(Q)}. Realizing the state equation by a penalty term leads to⎧⎪⎨⎪⎩

min Jε(y, u) = J (y, u)+ 1

ε

∣∣yt −�y − y3 − u∣∣2L2(Q)

subject to u ∈ C, y ∈ H 1,2(Q), y|� = 0, y(0, ·) = y0.

(1.6.7)

It can be shown that for every ε > 0 there exists a solution (yε, uε) to (1.6.7). Considering thebehavior of the pair (yε, uε) as ε→ 0+ the question of convergence to a particular solution(y∗, u∗) of (1.6.6) again arises. As before this can be realized by means of adaptation terms:⎧⎪⎨⎪⎩

min Jε(y, u)+ 1

2

∣∣y − y∗∣∣2L2(Q)

+ 1

2

∣∣u− u∗∣∣2L2(Q)

subject to u ∈ C, y ∈ H 1,2(Q), y|� = 0, y(0, ·) = y0.

(1.6.8)

Let us denote solutions to (1.6.8) by (yε, uε). It is fairly straightforward to deriveoptimality conditions for (1.6.8): Setting

λε = −1

ε

(yεt −�yε − (yε)3 − uε

) ∈ L2(Q) (1.6.9)

we find ⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩λεt −�λε − 3(yε)2λε = −(yε − z)5 − (yε − y∗) in Q,

λε = 0 on �,

λε(T , ·) = 0 in Q

(1.6.10)

Page 43: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 25

1.6. Approximation, penalty, and adapted penalty techniques 25

and∫Q

(αuε − λε)(u− uε)dQ+∫Q

(uε − u∗)(u− uε)dQ ≥ 0 for all u ∈ C, (1.6.11)

where λε must be interpreted as a weak solution in L2(Q) of (1.6.10) (i.e., the inner productwith smooth test function is taken, and partial integration bringing all differentiations onthe test function is carried out). It can be verified that uε → u∗ in L2(Q), yε ⇀ y∗ weaklyin H 1,2(Q) and that there exists λ∗ ∈ W 2,1;6/5, the weak limit of λε, such that the followingoptimality system for (1.6.6) is satisfied:⎧⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎩

yt −�y − y3 = u in Q,

−λt −�λ− 3y2 λ = −(y − z)5 in Q,

y = λ = 0 on∑

,

y(0, ·) = y0, λ(T , ·) = 0 in �,

(1.6.12)

where W 2,1;6/5 = {ϕ ∈ L6/5(Q) : ϕt , ϕxi , ϕxixj ∈ L6/5(Q)

}; see [Lio1, Chapter I.3].

Page 44: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 26

Page 45: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 27

Chapter 2

Sensitivity Analysis

2.1 GeneralitiesIn this chapter we discuss the sensitivity of solutions to parameterized mathematical pro-gramming problems of the form

minx∈C f (x, p) subject to

e(x, p) = 0, g(x, p) ≤ 0, and �(x, p) ∈ K,

(2.1.1)

where C is a closed convex set in X, and further e : X × P → W represents an equalityconstraint, g : X×P → Rm a finite-dimensional inequality constraint, and � : X×P → Z

an infinite-dimensional affine constraint. The cost f as well as the equality and inequalityconstraints are allowed to depend on a parameter p ∈ P . Unless otherwise specified P isa normed linear space, X, W, Z are real Hilbert spaces, and K is a closed convex conewith vertex at 0 in Z. Such problems arise in inverse, control problems, and variationalinequalities, and some examples will be discussed in Section 2.6.

The cone K induces a natural ordering ≤ by means of z1 ≤ z2 if z1 − z2 ∈ K . Wedenote by K+ the dual cone given by

K+ = {z ∈ Z∗ : 〈z, z〉 ≤ 0 for all z ∈ K},where 〈·, ·〉 denotes the duality pairing between Z and Z∗. Suppose that x0 is the solutionto (2.1.1) for the nominal parameter p = p0. The objective of this section is to investigate

• continuity and Lipschitz continuity of the solution mapping

p ∈ P → (x(p), λ(p), μ(p), η(p)) ∈ X ×W ∗ × Rm,+ ×K+,

where for p in a neighborhood of p0, x(p) ∈ X denotes the local solution of (2.1.1)in a neighborhood of x0, and λ, μ, η are the Lagrange multipliers associated withconstraints described by e, g, � in (2.1.1);

• differentiability of the minimum value function F(p) = f (x(p), p) at p0; and

27

Page 46: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 28

28 Chapter 2. Sensitivity Analysis

• directional differentiability of the solution mapping.

Due to the local nature of the analysis, f , e, and g need only be defined in a neigh-borhood of (x0, p0). We assume throughout this chapter that they are twice continuouslyFréchet differentiable with respect to x and that their first and second derivatives are con-tinuous in a neighborhood of (x0, p0). It is assumed that � is affine in x for every p ∈ P

and that the first derivative �′(p0) with respect to x is continuous in a neighborhood of p0.In order to ensure the existence of a Lagrange multiplier at x0 we assume that

(H1) x0 is a regular point, i.e.,

0 ∈ int

⎧⎨⎩⎛⎝ e′(x0, p0)

g′(x0, p0)

�′(p0)

⎞⎠ (C − x0)+⎛⎝ 0Rm+−K

⎞⎠+⎛⎝ 0

g(x0, p0)

�(x0, p0)

⎞⎠⎫⎬⎭ , (2.1.2)

where the interior is taken in W × Rm × Z and primes denote the Fréchet derivatives withrespect to x. We define the Lagrange functional L : X × P ×W ∗ × Rm × Z∗ → R by

L(x, p, λ, μ, η) = f (x, p)+ 〈λ, e(x, p)〉 + 〈μ, g(x, p)〉Rm + 〈η, �(x, p)〉.With (2.1.2) holding, it follows from Theorem 1.6 that there exists a Lagrange multiplier(λ0, μ0, η0) ∈ W ∗ × Rm,+ ×K+ such that

〈L′(x0, p0, λ0, μ0, η0), c − x0〉 ≥ 0 for all c ∈ C,

e(x0, p0) = 0,

〈μ0, g(x0, p0)〉 = 0, g(x0, p0) ≤ 0, μ0 ≥ 0,

〈η0, �(x0, p0)〉 = 0, �(x0, p0) ∈ K, η0 ∈ K+.

(2.1.3)

To express (2.1.3) in a more compact form we recall that the subdifferential ∂ψC of theindicator function

ψC(x) ={

0 if x ∈ C,

∞ if x /∈ C

of a closed convex set C in a Banach space X is given by

∂ψC(x) ={ {y ∈ X∗ : 〈y, c − x〉X∗,X ≤ 0 for all c ∈ C} if x ∈ C,

{ } if x /∈ C.

The set ∂ψC(x) is also referred to as the normal cone to C at x. For convenience we alsospecify

∂ψK+(η) ={ {z ∈ Z : 〈z∗ − η, z〉X∗,X ≤ 0 for all z∗ ∈ K+} if η ∈ K+,{ } if η /∈ K+.

Page 47: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 29

2.1. Generalities 29

Note that (2.1.3) is equivalent to

0 ∈

⎧⎪⎪⎨⎪⎪⎩L′(x0, p0, λ0, μ0, η0)+ ∂ψC(x0),

e(x0, p0),

−g(x0, p0)+ ∂ψRm,+(μ0),

−�(x0, p0)+ ∂ψK+(η0).

(2.1.4)

In fact this follows from the following characterization of the normal cone to K+.

Lemma 2.1. Suppose that K is a closed convex cone in Z. Then

z ∈ ∂ψK+(η) if and only if z ∈ K, η ∈ K+, and 〈η, z〉 = 0.

Proof. Assume that z ∈ ∂ψK+(η). Then ∂ψK+(η) is nonempty, η ∈ K+, and

〈z∗ − η, z〉 ≤ 0 for all z∗ ∈ K+. (2.1.5)

Hence with z∗ = 2η this implies 〈η, z〉 ≤ 0 and for z∗ = η

2 we have 〈η, z〉 ≥ 0, andtherefore 〈η, z〉 = 0. From (2.1.5) it follows that 〈z∗, z〉 ≤ 0 for all z∗ ∈ K+, and hence thegeometric form of the Hahn–Banach theorem implies that z ∈ K .

Conversely, if z ∈ K , η ∈ K+, and 〈η, z〉 = 0, then 〈z∗ − η, z〉 ≤ 0 for all z∗ ∈ Z∗and thus z ∈ ∂ψK+(η).

Let A : X→ X∗ be the operator representation of the Hessian L′′(x0, p0, λ0, μ0, η0)

of L such that〈Ax1, x2〉 = L′′(x0, p0, λ0, μ0, η0)(x1, x2)

and define operators E : X→ W , G : X→ Rm, and L : X→ Z by

E = e′(x0, p0), G = g′(x0, p0), L = �′(p0).

Without loss of generality one can assume that the coordinates of the inequality con-straints g(x, p0) ≤ 0 and the associated Lagrange multiplier μ0 are arranged so thatμ0 = (μ+0 , μ

00, μ

−0 ) and g = (g+, g0, g−), with g+ : X → Rm1 , g0 : X → Rm2 ,

g− : X→ Rm3 , m = m1 +m2 +m3, and

g+(x0, p0) = 0, μ+0 > 0,g0(x0, p0) = 0, μ0

0 = 0,g−(x0, p0) < 0, μ−0 = 0.

(2.1.6)

We further define

G+ = g+(x0, p0)′, G0 = g0(x0, p0)

′,

and

E+ =(

E

G+

): X→ W × Rm1 .

Page 48: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 30

30 Chapter 2. Sensitivity Analysis

Note that the coordinates denoted by superscripts + behave locally like equalityconstraints. Define E(z) : X × R→ (W × Rm1)× Rm2 × Z for z ∈ Z by

E(z) =⎛⎝ E+ 0

G0 0L z.

⎞⎠ .

The adjoint E∗(z) is given by

E∗(z) =(

E∗+ G∗0 L∗

0 0 〈·, z〉Z).

The following conditions will be used.(H2) There exists a κ > 0 such that 〈Ax, x〉X∗,X ≥ κ |x|2X for all x ∈ ker(E+).(H3) E(z) is surjective at z = �(x0, p0).(H4) There exists a neighborhood U×N of (x0, p0) and a positive constant ν such that

|f (x, p)− f (x, q)| + |e(x, p)− e(x, q)|W + |g(x, p)− g(x, q)|Rm

+ |�(x, p)− �(x, q)|Z ≤ ν |p − q|P

for all (x, p) and (x, q) in U × N .

Condition (H2) is a second order sufficient optimality condition for (2.1.1) at x0 and (H3)implies that there exists a neighborhood O of �(x0, p0) in Z such that E(z) is surjective forall z ∈ O.

The chapter is organized as follows. In Sections 2.2 and 2.3 we discuss the basic resultsfor establishing Lipschitz continuity of the solution mapping. The implicit function theoryfor the generalized equation of the form (2.1.4) is discussed in Section 2.2. In Section 2.3stability results for solutions to mathematical programming problems including (2.1.1) areaddressed. Sufficient conditions for stability of the solutions are closely related to suffi-cient optimality conditions for local minimizers. Section 2.3 therefore also contains firstand second order sufficient conditions for local optimality. Sections 2.4 and 2.5 are de-voted to establishing Lipschitz continuity and directional differentiability of the solutionmapping. For the latter we employ the assumption of polyhedricity of a closed convexcone K which, together with appropriate additional conditions, implies that the directionalderivative (x, λ, μ, η) in direction q satisfies

0 ∈

⎧⎪⎪⎨⎪⎪⎩L′p(x0, p0, λ0, μ0, η0)q + Ax + E∗λ+G∗+μ+ +G∗

0μ0 + L∗η,

−e+p (x0, p0)q − E+x,−g0

p(x0, p0)q −G0x + ∂ψRm2 ,+(μ0),

−�p(x0, p0)q − Lx + ∂ψK+(η),

(2.1.7)

where e+ = (e

g+)

and K+ is the dual cone of K = ∪λ>0λ(K − �(x0, p0))∩[η0]⊥. Section 2.6contains some applications to optimal control of ordinary differential equations. This chapteris strongly influenced by the work of S. M. Robinson and also by that of W. Alt.

Page 49: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 31

2.2. Implicit function theorem 31

2.2 Implicit function theoremIn this section we discuss an implicit function theorem for parameterized generalized equa-tions of the form

0 ∈ F(x, p)+ ∂ψC(x), (2.2.1)

where F : X×P → X∗, with X∗ the strong dual space of X, and C is a closed convex set inX. We assume that X and P are normed linear spaces unless specified otherwise. Supposex0 ∈ X is a solution to (2.2.1) for a reference parameter p = p0 ∈ P . Let F(x, p0) beFréchet differentiable at x0 and define the linearized form by

T x = F(x0, p0)+ F ′(x0, p0)(x − x0)+ ∂ψC(x). (2.2.2)

Definition 2.2 (Strong Regularity). The generalized equation (2.2.1) is called stronglyregular at x0 with associated Lipschitz constant ρ if there exist neighborhoods of V of 0and U of x0 in X such that (T −1V ) ∩ U , the intersection of U with the restriction of T −1

to V , is single valued and Lipschitz continuous from V to U with modulus ρ.

Note that if C = X, the strong regularity assumption coincides with F ′(x0, p0)−1

being a bounded linear operator, which is the common condition for the implicit functiontheorem. The following result is taken from Robinson’s work [Ro3].

Theorem 2.3 (Generalized Implicit Function Theorem). Let P be a topological space,and assume that F is Fréchet differentiable with respect to x and both F and F ′ arecontinuous at (x0, p0). If (2.2.1) is strongly regular at x0 with associated Lipschitz constantρ, then for every ε > 0 there exists neighborhoods Nε of p0 and Uε of x0, and a single-valued function x : Nε → Uε such that for every p ∈ Nε , x(p) is the unique solution in Uε

of (2.2.1). Furthermore, for each p, q ∈ Nε

|x(p)− x(q)|X ≤ (ρ + ε) |F(p, x(q))− F(q, x(q))|X∗ . (2.2.3)

Proof. For ε > 0 we choose a δ > 0 such that ρδ < ε/(ρ + ε). By strong regularity, thereexists neighborhoods V of 0 and U of x0 in X such that (T −1V ) ∩ U is single valued andLipschitz continuous from V to U with modulus ρ. Let

h(x, p) = F(x0, p0)+ F ′(x0, p0)(x − x0)− F(x, p)

and choose a neighborhood Nε of p0 and a closed ball Uε of radius r about x0 contained inU so that for each p ∈ Nε and x ∈ Uε we have for h(x, p) ∈ V

|F ′(x, p)− F ′(x0, p0)| ≤ δ

and

ρ |F(x0, p)− F(x0, p0)| ≤ (1− ρδ)r.

Page 50: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 32

32 Chapter 2. Sensitivity Analysis

For any p ∈ Nε define an operator �p from Uε → U by

�p(x) = T −1(h(x, p)) ∩ U. (2.2.4)

Note that x ∈ Uε ∩ �p(x) if and only if x ∈ Uε and (2.2.1) holds. We show that �p is astrict contraction and �p maps Uε into itself. For x1, x2 ∈ Uε we find, using h′(x, p) =F ′(x0, p0)− F ′(x, p), that

|�p(x1)−�p(x2)| ≤ ρ |h(x1, p)− h(x2, p)|

= ρ| ∫ 10 h′(x1 + t (x1 − x2), p)(x1 − x2) dt | ≤ ρδ |x1 − x2|,

where ρ δ < 1. Since x0 = T −1(0) ∩ U we have

|�p(x0)− x0| ≤ ρ |h(x0, p)| = ρ |F(x0, p)− F(x0, p0)| ≤ (1− ρδ)r,

and thus for x ∈ Uε

|�p(x)− x0| ≤ |�p(x)−�p(x0)| + |�p(x0)− x0| ≤ ρδ |x − x0| + (1− ρδ)r ≤ r.

By the Banach fixed point theorem �p has a unique fixed point x(p) in Uε and for eachx ∈ Uε we have

|x(p)− x| ≤ (1− ρδ)−1|�p(x)− x|. (2.2.5)

It follows from our earlier observation that x(p) is the unique solution to (2.2.1) in Uε . Toverify (2.2.3) with p, q ∈ Nε we use (2.2.5) with x = x(q) and obtain

|x(p)− x(q)| ≤ (1− ρδ)−1|�p(x(q))− x(q)|.Since x(q) = �q(x(q)) we find

|�p(x(q))− x(q)| ≤ ρ |h(x(q), p)− h(x(q), q)| = ρ |F(x(q), p)− F(x(q), q)|,and hence

|x(p)− x(q)| ≤ ρ(1− ρδ)−1|F(x(q), p)− F(x(q), q)|.Observing that ρ(1− ρδ)−1 ≤ ρ + ε the desired estimate (2.2.3) follows.

As a consequence of the previous theorem, Lipschitz continuity of F(x, p) in p

implies local Lipschitz continuity of x(p) at p0.

Corollary 2.4. Assume in addition to the conditions of Theorem 2.3 that P is a normedlinear space and that for some ν > 0

|F(x, p)− F(x, q)| ≤ ν |p − q|for p, q ∈ Nε and x ∈ Uε . Then x(p) is Lipschitz on Nε with modulus ν (ρ + ε).

It should be remarked that the condition of strong regularity is the weakest possiblecondition which can be imposed on the values of a function F and its derivative at a point x0,

Page 51: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 33

2.2. Implicit function theorem 33

so that for each perturbation satisfying the hypothesis of Theorem 2.3, a function x(·) willexist having the properties stated in Theorem 2.3. To see this, one has only to consider afunction F : X→ X∗ which is Fréchet differentiable at x0 and satisfies

0 ∈ F(x0)+ ∂ψC(x0).

Let P be a neighborhood of the origin in X∗, and let

F(x, p) = F(x0, p0)+ F ′(x0, p0)(x − x0)− p

with p0 = 0. Choose ε > 0. If there exist neighborhoods Nε and Vε and a function x(·)satisfying the properties stated in Theorem 2.3, then with

T x = F(x0, p0)+ F ′(x0, p0)(x − x0)+ ∂ψC(x)

we see that the restriction to Nε of T −1V is a singled-valued, Lipschitz continuous function.Therefore the generalized equation 0 ∈ F(x, p0)+ ∂ψC(x) is strongly regular at x0.

One of the important consequences of Theorem 2.3 is the following theorem onparametric sensitivity analysis.

Theorem 2.5. Under the conditions of Theorem 2.3 and Corollary 2.4 there exists for eachε > 0 a function αε(p) : Nε → R+ satisfying limp→p0 αε(p) = 0, such that

|x(p)−�p(x0)| ≤ αε(p) |p − p0|,where �p(x0) is the unique solution in Uε of the linear generalized equation

0 ∈ F(x0, p)+ F ′(x0, p0)(x − x0)+ ∂ψC(x). (2.2.6)

Proof. From the proof of Theorem 2.3 it follows that x(p) = �p(x(p)) and thus by strongregularity

|x(p)−�p(x0)| ≤ ρ |h(x(p), p)− h(x0, p)|

≤ ρ

∣∣∣∣∫ 1

0h′(x0 + θ (x(p)− x0)) dθ

∣∣∣∣ |x(p)− x0|

≤ ρν(ρ + ε)|p − p0|∣∣∣∣∫ 1

0h′(x0 + θ (x(p)− x0)) dθ

∣∣∣∣ .Since h′(x, p) = F ′(x0, p0)− F ′(x, p) and F ′ is continuous,∣∣∣∣∫ 1

0e′(x0 + θ (x(p)− x0)) dθ

∣∣∣∣→ 0

as p→ p0, which completes the proof.

In the case C = X the generalized equation (2.2.1) reduces to the nonlinear equationF(x, p) = 0 and

�p(x0) = F ′(x0, p0)−1(F (x0, p0)− F(x0, p)).

Page 52: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 34

34 Chapter 2. Sensitivity Analysis

Thus, Theorem 2.5 shows that if F(x0, ·) is Fréchet differentiable at p0, then so is x(p) and

x ′(p0) = −F ′(x0, p0)−1 ∂F

∂p(x0, p0). (2.2.7)

In many applications, one might find (2.2.6) significantly easier to solve than (2.2.1). In fact,the necessary optimality condition (2.1.3) is of the form (2.2.1), and (2.2.6) corresponds toa quadratic programming problem, as we will discuss in Section 2.4. Thus Theorem 2.5 canprovide a relatively cheap way to find a good approximation to x(p) for p near p0.

2.3 Stability resultsThroughout this section, let X, Y be real Banach spaces and let C be closed convex subsetsof X. Concerning K it suffices in this section to assume that it is a closed convex set in Y ,unless it is specifically assumed that it is a closed convex cone in Y . For p in the metricspace P with metric δ(·, ·) consider the parameterized minimization problem

min f (x, p) subject to x ∈ C, g(x, p) ∈ K. (2.3.1)

This class of problems contains (2.1.1) as a special case. To facilitate the analysis of (2.3.1)let �(p) denote the set of feasible points, i.e.,

�(p) = {x ∈ C : g(x, p) ∈ K}.For a given p0 ∈ P and x0 ∈ �(p0) a local minimizer of (2.3.1), define the set �r(p) by

�r(p) = �(p) ∩ B(x0, r),

where r > 0 and B(x0, r) is the closure of the open ball with radius r at x0 in X. We furtherdefine the value function μr(p) by

μr(p) = inf {f (x, p)|x ∈ �r(p)}. (2.3.2)

Throughout this section it is assumed that f : X × P → R and g : X × P → Y arecontinuously Fréchet differentiable with respect to x and that their first derivatives arecontinuous in a neighborhood of (x0, p0). Recall from Definition 1.5 that x0 ∈ �(p0) is aregular point if

0 ∈ int {g(x0, p0)+ gx(x0, p0)(C − x0)−K}. (2.3.3)

We use a generalized inverse function theorem for set-valued functions due to Robin-son [Ro3] for the stability analysis of (2.3.1).

For a nonempty set S in X and x ∈ X we define

d(x, S) = inf {|y − x| : y ∈ S},and d(x, S) = ∞ if S = ∅. For nonempty sets S and S in X the Hausdorff distance isdefined by

dH (S, S) = max{

sup{d(x, S) : x ∈ S}, sup{d(y, S) : y ∈ S}}.

Page 53: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 35

2.3. Stability results 35

We further set BX = {x ∈ X : |x| < 1}. For a set-valued function F : X → Y we definethe preimage of F by F−1(y) = {x ∈ X : y ∈ F(x)}. A set-valued function is called closedand convex if its graph is closed and convex.

Theorem 2.6 (Generalized Inverse Function Theorem). Let X and Y be normed linearspaces and F a closed convex set-valued function from X to Y . Let y0 ∈ F(x0) and supposethat for some η > 0,

y0 + η BY ⊂ F(x0 + BX).

Then for x ∈ x0 + BX and any y ∈ y0 + int (η BY ),

d(x, F−1(y) ∩ (x0 + BX)) ≤ (η − |y − y0|)−1(1+ |x − x0|)d(y, F (x)).

Proof. Let x ∈ x0+BX and y−y0 ∈ int (η BY ). For y ∈ F(x) the claim is clearly satisfiedand therefore we assume y /∈ F(x). Let δ > 0 be arbitrary and choose yδ ∈ F(x) such that

0 < |yδ − y| < d(y, F (x))+ δ. (2.3.4)

Let α = η − |y − y0| > 0. Choose any ε ∈ (0, α) and set

yε = y + (α − ε)|y − yδ|−1(y − yδ).

Then |yε − y0| ≤ |y − y0| + (α − ε) < η, and

yε ∈ y0 + int (ηBY ).

Thus by assumption there exists

xε ∈ x0 + BX with yε ∈ F(xε).

For 11+(α−ε)|y−yδ |−1 ∈ (0, 1), we find, using convexity of F and the fact that yδ ∈ F(x),

y = (1− λ)yδ + λ yε ∈ (1− λ)F (x)+ λF(xε) ⊂ F((1− λ)x + λ xε). (2.3.5)

Since x and xε are contained in x0 + BX we have (1 − λ)x + λ xε ∈ x0 + BX. Togetherwith (2.3.5) this implies

d(x, F−1(y) ∩ (x0 + BX)) ≤ |x − (1− λ)x + λ xε | = λ |x − xε |.However, |x − xε | ≤ |x − x0| + |x0 − xε | ≤ |x − x0| + 1 and

λ = |y − yδ||y − yδ| + α − ε

<|y − yδ|α − ε

.

Hence by (2.3.4)

d(x, F−1(y) ∩ (x0 + BX)) ≤ (η − |y − y0| − ε)−1(1+ |x − x0|) (d(y, F (x))+ δ),

and letting ε → 0+, δ → 0+ the claim follows.

Page 54: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 36

36 Chapter 2. Sensitivity Analysis

The proof of the stability result below relies on the following set-valued fixed point lemma.For a metric space (X , d) let F(X ) denote the collection of all nonempty closed subsetsof X .

Lemma 2.7. Let (X , d) be a complete metric space and let � : X → F(X ) be a set-valuedmapping. Suppose that there exists x0 ∈ X and constants r > 0, α ∈ (0, 1), ε ∈ (0, r) suchthat for all x1, x2 in the closed ball B(x0, r) the sets �(x1), �(x2) are nonempty,

dH (�(x1),�(x2)) ≤ α d(x1, x2),

and

d(x0,�(x0)) ≤ (1− d)(r − ε).

Then there exists x ∈ B(x0, r) such that x ∈ �(x)andd(x0, x) ≤ (1−α)−1d(x0,�(x0))+ε.If moreover

dH (�(S1),�(S2)) ≤ α dH (S1,S2) (2.3.6)

for all nonempty closed subsets S1 and S2 of B(x0, r), then the sequence Sk = �(Sk−1),with S0 = {x0}, converges in the Hausdorff metric to a set S satisfying S = �(S).

Proof. Let δ = α(x0,�(x0)) and set γ = εα−1(1 − α). By induction one proves theexistence of a sequence xk+1 ∈ �(xk), k = 0, 1, . . . , such that xk ∈ B(x0, r) and

d(xk+1, xk) ≤ αkδ + (1− 2−(k+1))γ αk+1,

d(xk+ü1, x0) ≤ (1− αk+1)(r − ε)+ γ

k+1∑i=1

(1− 2−i )αi .

(2.3.7)

In fact, choose x1 ∈ �(x0) with d(x0, x1) ≤ δ + γ

2 α ≤ (1 − α)(r − ε) + γ

2 α ≤ r .Hence x1 ∈ B(x0, r) and (2.3.7) holds for k = 0. Now suppose that (2.3.7) holds with k

replaced by k−1. Then d(x0, xk) < r and hence �(xk) is nonempty and there exists xk+1 ∈�(xk) satisfying d(xk+1, xk) ≤ d(�(xk), xk)+2−(k+1)γ αk+1. Consequently d(xk+1, xk) ≤dH (�(xk),�(xk−1)) + 2−(k+1)γ αk+1 ≤ α d(xk, xk−1) + 2−(k+1)γ αk+1 ≤ αkδ + (1 −2−(k+1))γ αk+1, and d(xk+1, x0) ≤ d(xk+1, xk) + d(xk, x0) ≤ γ

∑k+1i=1 (1 − 2−i )αi + (1 −

αk+1)(r − ε), and (2.3.7) holds for all k. For k ≥ 0 and for every m ≥ 1 we have

d(xk+m, xk) ≤m−1∑i=0

d(xk+i+1, xk+i )

≤m−1∑i=0

(αk+iδ + (1− 2−(k+i+1))γ αk+i+1

)≤ αk(1− α)−1(δ + γ α),

Page 55: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 37

2.3. Stability results 37

and consequently {xk} is a Cauchy sequence in B(xo, r). Since X is complete there existsx ∈ B(x0, r) with limk→∞ xk = x and d(x, x0) ≤ δ

1−α + ε. To verify the fixed pointproperty, note that for every k we have

d(x,�(x)) ≤ d(x, xk+1)+ d(xk+1,�(x))

≤ d(x, xk+1)+ dH (�(xk),�(x))

≤ d(x, xk+1)+ α d(xk, x).

(2.3.8)

Passing to the limit with respect to k implies that x ∈ �(x).We now assume that� also satisfies (2.3.6). Since the set of closed subsets of B(xo, r)

is complete with respect to the Hausdorff metric, the existence of a set S with lim Sk = S

in the Hausdorff metric follows. To argue invariance of S under � note that

dH (S,�(S)) = limk→∞ dH (Sk,�(S))

= limk→∞ dH (�(Sk−1),�(S)) ≤ lim

k→∞ dH (Sk−1, S) = 0

and hence S = �(S).

The following result is an important consequence of Theorem 2.6.

Theorem 2.8 (Stability). Let x0 ∈ �(p0) be a regular point. Then for every ε > 0 thereexist neighborhoods Uε of x0, Nε of p0 such that for p ∈ Nε the set �(p) is nonempty andfor each x ∈ Uε ∩ C

d(x,�(p)) ≤ (M + ε) d(0, {g(x, p)−K : x ∈ C}), (2.3.9)

where a constant M > 0 is independent of ε > 0.

Proof. Let F be the closed convex set-valued mapping from X into Y given by

F(x) ={

g′(x0, p0)(x − x0)+ g(x0, p0)−K for x ∈ C,

{} for x /∈ C.

The regular point condition implies that there exists an η > 0 such that

η BY ⊂ F(x0 + BX).

Here we use Theorem 1.4 and the discussion below Definition 1.5. For δ > 0 and r > 0 letρ = (η − 2δ r)−1(1+ r) where we assume that

2δ r < η and 2ρδ < 1.

Leth(x, p) = g(x0, p0)+ g′(x0, p0)(x − x0)− g(x, p)

Page 56: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 38

38 Chapter 2. Sensitivity Analysis

and choose a neighborhood of N of p0 and a closed ball B(x0, r) of radius r about x0 suchthat for x ∈ U = C ∩ B(x0, r) and p ∈ N

|g′(x, p)− g′(x0, p0)| ≤ δ

and|g(x0, p)− g(x0, p0)| ≤ δ r.

As in the proof of Theorem 2.3 it can be shown that

|h(x1, p)− h(x2, p)| ≤ δ|x1 − x2|for all x1, x2 ∈ U and p ∈ N . In particular, if x ∈ U and p ∈ N , then

|h(x, p)| ≤ |h(x0, p0)| + |h(x, p)− h(x0, p0)| ≤ δ r + δ r < η.

For any p ∈ N define the set-valued closed convex mapping �p from U → C by

�p(x) = F−1(h(x, p)). (2.3.10)

Note that x ∈ �p(x) implies that g(x, p) ∈ K . We argue that Lemma 2.7 is applicablewith � = �p and X = C endowed with the metric induced by X. For every x ∈ U the set�p(x) is a closed convex subset of X. (Hence, if X is a reflexive Banach space, then themetric projection is well defined and Lemma 2.7 will be applicable with ε = 0.) For everyx1, x2 ∈ U we have by Theorem 2.6

dH (�p(x1),�p(x2))| ≤ ρ |h(x1, p)− h(x2, p)| < ρδ |x1 − x2|,where we used that h(xi, p) ∈ F(xi) for i = 1, 2. Moreover

|�p(x0)− x0| ≤ ρ |g(x0, p)− g(x0, p0)| ≤ ρδr,

and for every pair of sets S1 and S2 in U we have

dH (�p(S1),�p(S2)) ≤ ρ δ dH (S1,S2).

Hence all conditions of Lemma 2.7 are satisfied with α = 2ρδ and ε = r(1−2α)1−α . Con-

sequently there exists a Cauchy sequence {xk} in C such that xk+1 ∈ �p(xk) and x =limk→∞ xk ∈ C satisfies x ∈ �p(x). Moreover, if Sk = �p(Sk−1) with S0 = {x0}, thendH (Sm, Sk) → 0 as m ≥ k → ∞ and thus S = limk→∞ Sk satisfies S = �p(S) andS ⊂ �p. Let

(x, y) ∈ M = {(x, y) ∈ U × Y : g(x0, p0)+ g′(x0, p0)(x − x0)+ y ∈ K}.For all x ∈ S we have by Theorem 2.6

d(x, F−1(h(x, p)) ≤ ρ d(h(x, p), F (x)) ≤ ρ |h(x, p)+ y|

≤ ρ (|g(x, p)− g(x0, p0)− g′(x0, p0)(x − x0)− y| + |h(x, p)− h(x, p)| ).

Page 57: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 39

2.3. Stability results 39

Since |h(x, p)− h(x, p)| ≤ δ |x − x|,

d(x, S) ≤ ρ

1− ρδ|g(x, p)− g(x0, p0)− g′(x0, p0)(x − x0)− y|

for all (x, y) ∈ M . Now, for x ∈ C and y ∈ g(x, p)−K we let y = g(x, p)− g(x0, p0)−g′(x0, p0)(x − x0)− y. Then (x, y) ∈ M and hence

d(x,�(p)) ≤ d(x, S) ≤ ρ

1− ρδ|y|.

This holds for all y ∈ g(x, p)−K . Since ρ

1−ρδ = 1+rη−δ(1+3r) , we can select for every ε > 0

neighborhoods N = Nε and U = Uε such that ρ

1−ρδ ≤ η−1 + ε. This implies the claim

with M = η−1.

Theorem 2.9 (Lipschitz Continuity of Value Function). Let x0 ∈ �(p0) be a localminimizer. Assume moreover that x0 is regular and that there exists a neighborhood U ×N

of (x0, p0) such that

|f (x, p)− f (x, p0)| ≤ Lf (|x − x| + δ(p, p0)) (2.3.11)

for x, x ∈ U and p ∈ N and

|g(x, p)− g(x, p0)| ≤ Lg δ(p, p0) (2.3.12)

for (x, p) ∈ U ×N . Then there exist constants r > 0 and Lr and a neighborhood N of p0

such that|μr(p)− μr(p0)| ≤ Lr δ(p, p0)

for all p ∈ N .

Proof. Since x0 is a local minimizer of (2.3.1) at p0 there exists s > 0 such that

f (x0, p0) ≤ f (x, p0) for all x ∈ �s(p0).

Let Cs = C ∩ B(x0, s). By Theorem 1.4 and the discussion following Definition 1.5 theregular point condition (2.3.3) also holds with C replaced by Cs . Applying Theorem 2.8with C = Cs and ε = 1 there exist neighborhoods U1 of x0 and N1 of p0 such that �s(p)

is nonempty and

d(x,�s(p)) ≤ (M + 1) d(0, {g(x, p)−K : x ∈ Cs}) (2.3.13)

for each x ∈ U1 and p ∈ N1.We now choose r > 0 and a neighborhood N of p0 such that

r ≤ s, B(x0, r) ⊂ U1 ∩ U, N ⊂ N1 ∩N, and 2(M + 1)Lg δ(p, p0) < r for all p ∈ N .

(2.3.14)By (2.3.13) and (2.3.12) we obtain for all p ∈ N

d(x0, �s(p)) ≤ (M + 1) |g(x0, p)− g(x0, p0)| ≤ (M + 1)Lg δ(p, p0).

Page 58: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 40

40 Chapter 2. Sensitivity Analysis

Thus there exists a x(p) ∈ �s(p) such that

|x(p)− x0| ≤ 2(M + 1)Lg δ(p, p0) < r

and therefore x(p) ∈ �r(p) and f (x(p), p) ≥ μr(p). From (2.3.11)

|f (x(p), p)− f (x0, p0)| ≤ Lr δ(p, p0),

where Lr = Lf (2(M + 1)Lg + 1). Combining these inequalities, we have

μr(p) ≤ f (x(p), p) ≤ f (x0, p0)+ Lr δ(p, p0) = μr(p0)+ Lr δ(p, p0) (2.3.15)

for all p ∈ N .Conversely, let p ∈ N and x ∈ �r(p). From (2.3.13) and (2.3.12)

d(x,�s(p0)) ≤ (M + 1) |g(x, p)− g(x0, p0)|.Hence there exists x ∈ �s(p0) such that

|x − x| ≤ 2(M + 1)Lg δ(p, p0) < r, (2.3.16)

and thus f (x0, p0) ≤ f (x, p0). From (2.3.11) we deduce that |f (x, p) − f (x, p0)| ≤Lr δ(p, p0), and hence

μr(p0) = f (x0, p0) ≤ f (x, p0) ≤ f (x, p)+ Lr δ(p, p0) (2.3.17)

for each x ∈ �r(p). The desired estimate follows from (2.3.15) and (2.3.17).

In the above proof we followed the work of Alt [Alt3]. Next, we establish Höldercontinuity of the local minimizers x(p) of (2.3.1). Theorem 2.9 provides Lipschitz conti-nuity of the value function under the regular point condition. The following example showsthat the local solutions to the perturbed problems may behave rather irregularly.

Example 2.10. Let X = Y = R2, P = R, C = R2, K = {(y1, y2) : y1 ≥ 0, y2 ≥ 0},f (x1, x2, p) = x1+p x2, and g(x1, x2, p) = (x1, x2). For p0 = 0 a local solution to (2.3.1)is given by x0 = (0, 1) and μr(p0) = 0. Now let r > 0 be arbitrary. Then for p �= p0 theperturbed problem

min f (x, p), x ∈ �r(p) (2.3.18)

has a unique solution

x(p) ={

(0,max(1− r, 0)) if p > 0,(0, 1+ r) if p < 0

and

μr(p) ={

p max(1− r, 0) if p > 0,p(1+ r) if p < 0.

This shows that μr(p) is Lipschitz continuous but the distance |x(p)−x0| is not continuouswith respect to p at p0 = 0. The reason for this behavior of x(p) is that the unperturbed

Page 59: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 41

2.3. Stability results 41

problem does not depend on x2 and all (0, x2) with x2 ≥ 0 are optimal for p = p0. On theother hand if we let f (x1, x2, p) = x1 + px2 + 1

2 (x2 − 1)2, then for r ≥ 1 and |p| ≤ 1 thesolution to (2.3.18) is given by

x(p) = (0, 1− p).

Thus the distance |x(p)− x0| is continuous in p.

This example suggests that some kind of local invertibility should be assumed onf in order to ensure continuity of the solutions. We define a condition closely related tosufficient optimality conditions:

There exist κ > 0, β ≥ 1, γ > 0, and neighborhoods U of x0 and N of p0 such that

f (x, p) ≥ f (x0, p0)+ κ |x − x0|β − γ δ(p, p0) (2.3.19)

for all p ∈ N and x ∈ �(p) ∩ U .

The following result shows that this condition is implied by sufficient optimalityconditions at x0.

Theorem 2.11. Assume that x0 is regular, that (2.3.11)–(2.3.12) hold, and that there existconstants κ > 0, β ≥ 1 and neighborhood U of x0 such that

f (x, p0) ≥ f (x0, p0)+ κ |x − x0|β for all x ∈ �(p0) ∩ U . (2.3.20)

Then condition (2.3.19) holds.

Proof. We select r, s > 0 and a neighborhoodN1 ofp0 as in (2.3.14) in the proof of Theorem2.9. We can assume that B(x0, s) ∈ U . Set U = B(x0, r) and choose x ∈ �r(p), p ∈ N .From (2.3.16) there exists an x ∈ �s(p0) such that

|x − x| ≤ 2(M + 1)Lg δ(p, p0),

and due to (2.3.17)

f (x, p) ≥ f (x, p0)− Lr δ(p, p0).

By (2.3.20)

f (x, p0) ≥ f (x0, p0)+ κ |x − x0|β.Note that

|x − x0|β ≤ (|x − x0| + |x − x|)β

≤ |x − x0|β + β(|x − x0| + |x − x|)β−1 |x − x| ≤ |x − x0|β + β(3s)β−1 |x − x|.Thus,

f (x, p) ≥ f (x0, p0)+ κ |x − x0|β − Lr δ(p, p0)

≥ f (x0, p0)+ κ |x − x0|β − γ δ(p, p0),

where γ = Lr + κβ(3s)β−12(M + 1)Lg and (2.3.19) follows with N = N .

Page 60: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 42

42 Chapter 2. Sensitivity Analysis

Condition (2.3.20) can be verified under the following second order sufficient opti-mality condition due to Maurer and Zowe [MaZo] .

Theorem 2.12. Assume that K is a closed convex cone in Y and that x0 ∈ �(p0) is aregular point, and let L(x, λ) = f (x, p0)+〈λ, g(x, p0)〉Y ∗,Y be the Lagrangian for (2.3.1)at x0 with Lagrange multiplier λ0. If there exist constants ω > 0, β > 0 such that

L′′(x0, λ0, p0)(h, h) ≥ ω |h|2 for all h ∈ S, (2.3.21)

where

S = L(�(p0), x0) ∩ {h ∈ X : 〈λ0, g′(x0, p0)h〉Y ∗,Y ≥ −β |h|},

then there exist κ > 0 and a neighborhood U of x0 such that (2.3.20) holds with β = 2.For convenience we recall the definition of the linearizing cone: L(�(p0), x0) =

{x ∈ C(x0) : g′(x0)x ∈ K(g(x0))}, where K(g(x0)) = {λ(y − g(x0)) : y ∈ K, λ ≥ 0}.

Proof. All quantities are evaluated at p0 and this dependence will therefore be suppressed.First we show that every x ∈ � = �(p0) can be represented as

x − x0 = h(x)+ z(x) with h(x) ∈ L(�, x0) and |z(x)| = o(|x − x0|), (2.3.22)

where o(s)

s→ 0 as s → 0+. Let x ∈ � and expand g at x0:

g(x)− g(x0) = g′(x0)(x − x0)+ r(x, x0) with |r(x, x0)| = o(|x − x0|).By the generalized open mapping theorem (see Theorem 1.4), there exist α > 0 and k(x) ∈K(g(x0)) such that for some z(x) ∈ α |r(x, x0)| (BX ∩ C(x0))

r(x, x0) = g′(x0)z(x)− k(x).

If we put h(x) = x − x0 + z(x), then h(x) ∈ C(x0), |z(x)| = o(|x − x0|), and

g′(x0)h(x) = g′(x0)(x − x0)+ r(x, x0)+ k(x)

= g(x)− g(x0)+ k(x) ∈ K − g(x0)+ k(x) ⊂ K(g(x0)),

i.e., h(x) ∈ L(�, x0). Next, for x ∈ �

f (x) ≥ f (x)+ 〈λ0, g(x)〉 = L(x, λ0) = L(x0, λ0)+ 1

2L′′(x − x0, x − x0)+ r(x − x0),

(2.3.23)where |r(x−x0)| = o(|x−x0|2), and we used 〈λ0, g(x)〉 ≤ 0 for x ∈ �, and L′(x0, λ0) = 0.Now put B = L′′(x0, λ0) and S = L(�) ∩ {h ∈ X : 〈λ0, g

′(x0)h〉 ≥ −β |h|} in Lemma2.13. It implies the existence of δ0 > 0 and γ > 0 such that

L′′(x0, λ0)(x − x0, x − x0) ≥ δ0 |x − x0|2 for all x − x0 = h(x)+ z(x) ∈ � − x0

with |z(x)| ≤ γ |h(x)| and 〈λ0, g′(x0)h〉 ≥ −β |h|.

Page 61: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 43

2.3. Stability results 43

Choose δ ∈ (0, 1) satisfying δ/(1 − δ) < γ and ρ > 0 such that |z(x)| ≤ δ|x − x0| for|x − x0| ≤ ρ. Then for |x − x0| ≤ ρ

|h(x)| ≥ |x − x0| − |z(x)| ≥ (1− δ)|x − x0|and thus

|z(x)| ≤ δ

1− δ|h(x)| < γ |h(x)|. (2.3.24)

Hence

L′′(x0, λ0)(x − x0, x − x0) ≥ δ0 |x − x0|2 for all x − x0 = h(x)+ z(x) ∈ � − x0

with |x − x0| ≤ ρ and 〈λ0, g′(x0)h(x)〉 ≥ −β |h(x)|.

By (2.3.23) there exists ρ ∈ (0, ρ] such that

f (x) ≥ f (x0)+ δ04 |x − x0|2 for all x − x0 = h(x)+ z(x) ∈ � − x0

with |x − x0| ≤ ρ and 〈λ0, g′(x0)h(x)〉 ≥ −β |h(x)|. (2.3.25)

For the case x − x0 = h(x) + z(x) ∈ � − x0 with |x − x0| ≤ ρ and 〈λ0, g′(x0)h(x)〉 <

−β |h(x)| we find

f (x)− f (x0) = f ′(x0)(x − x0)+ r(x − x0)

= −〈λ0, g′(x0)h(x)〉 − 〈λ0, g

′(x0)z(x)〉 + r(x − x0)

≥ β|h(x)| + r1(x − x0),

where r1(x − x0) = −〈λ0, g′(x0)z(x)〉 + r(x − x0) and r1(x − x0) = o(|x − x0|). Together

with (2.3.24) this implies

f (x) ≥ f (x0)+ β(1− δ)|x − x0| + r1(x − x0) for x − x0 = h(x)+ z(x) ∈ � − x0

with |x − x0| ≤ ρ and 〈λ0, g′(x0)h(x)〉 < −β |h(x)|.

Combined with (2.3.25) we arrive at the desired conclusion.

Lemma 2.13. Let S be a subset of a normed linear space X and let B : X × X → R be abounded symmetric quadratic form satisfying for some ω > 0

B(h, h) ≥ ω |h|2 for all h ∈ S.

Then there exist δ0 > 0 and γ > 0 such that

B(h+ z, h+ z) ≥ δ0 |h+ z| 2 for all h ∈ S, z ∈ X with |z| ≤ γ |h|.

Proof. Let b = ‖B‖ and choose γ > 0 such that δ = ω − 2bγ − bγ 2 > 0. Then for allz ∈ X and h ∈ S satisfying |z| ≤ γ |h|

B(h+ z, h+ z) ≥ ω |h|2 − 2b|h||z| − b|z|2 ≥ δ |h|2.

Page 62: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 44

44 Chapter 2. Sensitivity Analysis

Since |h+ z| ≤ |h| + |z| ≤ (1+ γ ) |h| we get

B(h+ z, h+ z) ≥ δ0 |h+ z|2

with δ0 = δ/(1+ γ )2.

Remark 2.3.1. An example which illustrates the benefit of including the set {h ∈ X :〈λ0, g

′(x0, p0)h〉Y ∗,Y ≥ −β |h|} into the definition of the set S on which the Hessian mustbe positive definite is given by the one-dimensional example with

f (x) = −x3 − x, g(x) = x, K = R−, and C = R.

In this case� = R−, x0 = 0, f (x0) = 0, f ′(x0) = −1, andf ′′(x0) = 0. We further find thatL(�, x0) = K and that the Lagrange multiplier is given by λ0 = 1. Consequently for β ∈(0, 1) we have that S = L(�, x0)∩ {h ∈ R : λ0g

′(x0)h ≥ −β |h|} = {0}. Hence (2.3.21) istrivially satisfied. Let us note that f is bounded below as f (x) ≥ f (x0)+max(|x|, 2|x|2)for x ∈ K . A similar argument does not hold if f is replaced by f (x) = −x3. In thiscase x0 = 0, λ0 = 0, f ′′(x0) = 0, S = K , and (2.3.21) is not satisfied. Note that in thiscase we have lack of strict complementarity; i.e., the constraint is active and simultaneouslyλ0 = 0.

Theorem 2.14. Suppose that x0 ∈ �(p0) is a regular point and that for some β > 0

f ′(x0)h ≥ β|h| for all h ∈ L(�(p0), x0).

Then there exists α > 0 and a neighborhood U of x0 such that

f (x) ≥ f (x0)+ α|x − x0| for all x ∈ �(p0) ∩ U .

Proof. For any x ∈ �(p0) we use the representation

x − x0 = h(x)+ z(x)

from (2.3.22) in the proof of Theorem 2.12. By expansion we have

f (x)− f (x0) = f ′(x0)h(x)+ f ′(x0)z(x)+ r(x, x0)

with |r(x, x0)| = o(|x − xo|). The first order condition implies

f (x)− f (x0) ≥ β|h(x)| + r1(x, x0),

where r1(x, x0) = f ′(xo)z(x)+ r(x, xo) and |r1(x, x0)| = o(|x− x0|). Choose a neighbor-hood U of x0 such that |z(x)| ≤ 1

2 |x − x0| and |r1(x, x0)| ≤ β

4 |x − x0| for x ∈ �(p0) ∩ U .Then |h(x)| ≤ 1

2 |x − x0| and f (x)− f (x0) ≥ β

4 |x − x0| for x ∈ U .

Note that for C = X we have f ′(x0) + g(x0, p0)∗λ0 = 0 and the second order

condition of Theorem 2.12 involves the set of directions h ∈ L(�(p0), x0) for which thefirst order condition is violated.

Following [Alt3] we now show that (2.3.19) implies stability of local minimizers.

Page 63: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 45

2.4. Lipschitz continuity 45

Theorem 2.15. Suppose that the local solution x0 ∈ �(p0) is a regular point and that(2.3.11), (2.3.12), and (2.3.19) are satisfied. Then there exist real numbers r > 0 andL > 0 and a neighborhood N of p0 such that the value function μr is Lipschitz continuousat p0 and for each p ∈ N the following statements hold:

(a) For every sequence {xn} in �r(p) with the property that limn→∞ f (xn, p) =μr(p) it follows that xn ∈ B(x0, r) for all sufficiently large n.

(b) If there exists a point x(p) ∈ �r(p) with μr(p) = f (x(p), p), then x(p) ∈B(x0, r), i.e., x(p) is a local minimizer of (2.3.1) and

|x(p)− x0| ≤ Lδ(p, p0)1β .

Proof. We choose r > 0 and a neighborhood N of p0 as in (2.3.14) in the proof of Theorem2.9. Without loss of generality we can assume that N × B(x0, r) ⊂ N × U with N, U

determined from (2.3.19) and that

κ rβ > (Lr + γ )δ(p, p0) (2.3.26)

for all p ∈ N , where Lr is determined in Theorem 2.9. Thus μr is Lipschitz continuous atp0 by Theorem 2.9. Let p ∈ N be arbitrary and let {xn} in �(p) be a sequence satisfyinglimn→∞ f (xn, p) = μr(p). By (2.3.19) and Lipschitz continuity of μr

κ |xn − x0|β ≤ |f (xn, p)− f (x0, p0)| + γ δ(p, p0)

≤ |f (xn, p)− μr(p)| + |μr(p)− μr(p0)| + γ δ(p, p0)

≤ |f (xn, p)− μr(p)| + (Lr + γ ) δ(p, p0).

This estimate together with (2.3.26) imply |xn − x0| < r for all sufficiently large n. Thisproves (a).

Similarly, for x(p) ∈ �r(p) satisfying μr(p) = f (x(p), p) we have

κ |x(p)− x0|β ≤ (Lr + γ ) δ(p, p0).

Thus, |x(p)− x0| < r ,

|x(p)− x0| ≤(Lr + γ

κδ(p, p0)

) 1β

and (b) follows.

2.4 Lipschitz continuityIn this section we investigate Lipschitz continuous dependence of the solution and theLagrange multipliers with respect to the parameter p in (2.1.1). Throughout this sectionand the next we assume that C = X and that the requirements on the function spacesspecified below (2.1.1) are met.

Page 64: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 46

46 Chapter 2. Sensitivity Analysis

Theorem 2.16. Assume (H1)–(H4) hold at the local solution x0 of (2.1.1). Then thereexist neighborhoods N = N(p0) of p0 and U = U(x0, λ0, μ0, η0) of (x0, λ0, μ0, η0) and aconstant M such that for all p ∈ N there exists a unique (xp, λp, μp, ηp) ∈ U satisfying

0 ∈

⎧⎪⎪⎨⎪⎪⎩L′(x, p, λ, μ, η),−e(x, p),−g(x, p)+ ∂ψRm,+(μ),

−�(x, p)+ ∂ψK+(η),

(2.4.1)

and

|(x(p), λ(p), μ(p), η(p))− (x(q), λ(q), μ(q), η(q))| ≤ M |p − q| (2.4.2)

for all p, q ∈ N . Moreover, there exists a neighborhood N ⊂ N of p0 such that x(p) is alocal solution of (2.1.1) if p ∈ N .

For the proof we shall employ the following lemma.

Lemma 2.17. Consider the quadratic programming problem in a Hilbert space H

min J (x) = 1

2〈Ax, x〉 + 〈a, x〉

subject to Ex = b in Y and Gx − c ∈ K ⊂ Z,

(2.4.3)

where Y , Z are Banach spaces, E ∈ L(H, Y ), G ∈ L(H, Z), and K is a closed convexcone in Z. If

(a) 〈A·, ·〉 is a symmetric bounded quadratic form on H×H and there exists an ω > 0such that 〈Ax, x〉 ≥ ω |x|2H for all x ∈ ker (E), and

(b) for all (b, c) ∈ Y × Z

S(b, c) = {x ∈ H : Ex = b, Gx − c ∈ K} is nonempty,

then there exists a unique solution to (2.4.3) satisfying

〈Ax + a, v − x〉 ≥ 0 for all v ∈ S(b, c).

Moreover, if x is a regular point, then there exists (λ, η) ∈ Y ∗ × Z∗ such that

0 ∈⎛⎝ a

b

c

⎞⎠+⎛⎝ A E∗ G∗−E 0 0−G 0 0

⎞⎠⎛⎝ x

λ

η

⎞⎠+⎛⎝ 0

0∂ψK+(η)

⎞⎠ .

Proof. Since K is closed and convex, S(b, c) is closed and convex as well. Since S(b, c)

is nonempty, there exists a unique w ∈ rangeE∗ such that Ew = b. Moreover, everyx ∈ S(b, c) can be expressed as x = w + y, where y ∈ ker (E). From (a) it follows thatJ is coercive and bounded below on S(b, c). Hence there exists a bounded minimizingsequence {xn} in S(b, c) such that limn→∞ J (xn) → inf x∈S(b,c) J (x). Boundedness of

Page 65: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 47

2.4. Lipschitz continuity 47

xn and weak sequential closedness of S(b, c) imply the existence of a subsequence of xnthat converges weakly to some x in S(b, c). Since J is weakly lower semicontinuous,J (x) ≤ lim inf J (xn)n→∞. Hence x minimizes J over S(b, c).

For v ∈ S(b, c) and t ∈ (0, 1) we have x + t (v − x) ∈ S(b, c). Thus,

0 ≤ J (x + t (v − x))− J (x) = t 〈Ax + a, v − x〉 + t2

2〈A(v − x), v − x〉

and letting t → 0+ we have

〈Ax + a, v − x〉 ≥ 0 for all v ∈ S(b, c).

The last assertion follows from Theorem 1.6.

Proof of Theorem 2.16. The proof of the first assertion of the theorem is based on the implicitfunction theorem of Robinson for generalized equations; see Theorem 2.3. It requires us toverify the strong regularity condition for the linearized form of (2.4.1) which is given by

0 ∈

⎧⎪⎪⎨⎪⎪⎩L′(x0, p0, λ0, μ0, η0)+ A(x − x0)+ E∗(λ− λ0)+G∗(μ− μ0)+ L∗(η − η0),

−e(x0, p0)− E(x − x0),

−g(x0, p0)−G(x − x0)+ ∂ψRm,+(μ),

−�(x0, p0)− L(x − x0)+ ∂ψK+(η),

where the operators A,E,G, and L are defined in the introduction to this chapter.If we define the multivalued operator T from X ×W × Rm × Z to itself by

T

⎛⎜⎜⎝x

λ

μ

η

⎞⎟⎟⎠ =

⎛⎜⎜⎝A E∗ G∗ L∗−E 0 0 0−G 0 0 0−L 0 0 0

⎞⎟⎟⎠⎛⎜⎜⎝

x

λ

μ

η

⎞⎟⎟⎠

+

⎛⎜⎜⎝f ′(x0, p0)− Ax0

Ex0

−g(x0, p0)+Gx0

−�(x0, p0)+ Lx0

⎞⎟⎟⎠+⎛⎜⎜⎝

00

∂ψRm,+(μ)

∂ψK+(η)

⎞⎟⎟⎠ ,

then the linearization can be expressed as

0 ∈ T (x, λ, μ, η).

Since the constraint associated withg−(x, p0) ≤ 0 is inactive in a neighborhood of x0, strongregularity of T at (x0, λ0, μ0, η0) easily follows from strong regularity of the multivaluedmapping T from X× (W × Rm1)× Rm2 ×Z into itself at (x0, (λ0, μ

+0 ), μ

00, η0) defined by

T

⎛⎜⎜⎝x

λ

μ0

η

⎞⎟⎟⎠ =

⎛⎜⎜⎝A E∗+ G∗

0 L∗−E+ 0 0 0−G0 0 0 0−L 0 0 0

⎞⎟⎟⎠⎛⎜⎜⎝

x

λ

μ0

η

⎞⎟⎟⎠

+

⎛⎜⎜⎝f ′(x0, p0)− Ax0

E+x0

G0x0

−�(x0, p0)+ Lx0

⎞⎟⎟⎠+⎛⎜⎜⎝

00

∂ψRm2 ,+(μ0)

∂ψK+(η)

⎞⎟⎟⎠ .

Page 66: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 48

48 Chapter 2. Sensitivity Analysis

Strong regularity of T requires us to show that there exist neighborhoods of V of 0 and U of(x0, (λ0, μ

+0 ), μ

00, η0) in X× (W ×Rm1)×Rm2 ×Z such that T −1(V )∩ U is single valued

and Lipschitz continuous from V to U .We now divide the proof in several steps.

Existence. We show the existence of a solution to

(α, β, γ, δ) ∈ T (x, λ, μ0, η)

for (α, β, γ, δ) ∈ V . Observe that this is equivalent to

0 ∈

⎛⎜⎜⎝a

b

c

d

⎞⎟⎟⎠+⎛⎜⎜⎝

A E∗+ G∗0 L∗

−E+ 0 0 0−G0 0 0 0−L 0 0 0

⎞⎟⎟⎠⎛⎜⎜⎝

x

λ

μ0

η

⎞⎟⎟⎠+⎛⎜⎜⎝

00

∂ψRm2 ,+(μ0)

∂ψK+(η)

⎞⎟⎟⎠ , (2.4.4)

where

(a, b, c, d) = f ′(x0, p0)−Ax0−α,E+x0−β,G0x0−γ,−�(x0, p0)+Lx0− δ). (2.4.5)

To solve (2.4.4) we introduce the quadratic optimization problem with linear con-straints:

min1

2〈Ax, x〉 + 〈a, x〉

subject to E+x = b, G0x ≤ c, and Lx − d ∈ K.

(2.4.6)

We verify the conditions in Lemma 2.17 for (2.4.6) with Y = W×Rm1 , Z = Rm2×Z,E = E+, G = (G0, L), and K = Rm2,− ×K . Let S be the feasible set of (2.4.6):

S(α, β, γ, δ) = {x ∈ X : E+x = b, G0x ≤ c, Lx − d ∈ K},where the relationship between (α, β, γ, δ) and (a, b, c, d) is given by (2.4.5). Clearly,x0 ∈ S(0) and moreover x0 is regular point. From Theorem 2.8 it follows that there existsa neighborhood V of 0 such that for all (α, β, δ, γ ) ∈ V the set S(α, β, γ, δ) is nonempty.Lemma 2.17 implies the existence of a unique solutionx to (2.4.6) for every (α, β, δ, γ ) ∈ V .By (H1)–(H2) and Theorems 2.12, 2.11, and 2.14 the solution x is Hölder continuous withexponent 1

2 with respect to (α, β, γ, δ) ∈ V . Next we show that x = x(α, β, γ, δ) is aregular point for (2.4.6), i.e.,

0 ∈ int

⎧⎨⎩⎛⎝ 0

G0x − c

Lx − d

⎞⎠+⎛⎝ E+

G0

L

⎞⎠ (X − x)+⎛⎝ 0

Rm2,+−K

⎞⎠⎫⎬⎭ .

This is equivalent to

0 ∈ int

⎧⎨⎩⎛⎝ β

γ

�(x0, p0)+ δ

⎞⎠+⎛⎝ E+

G0

L

⎞⎠ (X − x0)+⎛⎝ 0

Rm2,+−K

⎞⎠⎫⎬⎭ .

Page 67: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 49

2.4. Lipschitz continuity 49

Since x0 is a regular point, there exists a neighborhood V ⊂ V of 0 such that this inclusionholds for all (α, β, γ, δ) ∈ V . Hence the existence of a solutions to (2.4.4) follows fromLemma 2.17.

Uniqueness. To argue uniqueness, assume that (xi, λi , μ0i , ηi), i = 1, 2, are solutions to

(2.4.4). It follows that

〈A(x1 − x2)+ E∗+(λ1 − λ2)+G∗0(μ

01 − μ0

2)+ L∗(η1 − η2), x1 − x2〉 ≤ 0,

E+(x1 − x2) = 0,

〈G0(x1 − x2), μ01 − μ0

2〉 ≥ 0,

〈L(x1 − x2), η1 − η2〉 ≥ 0.

Using (H2) we find

κ |x1−x2|2 ≤ 〈A(x1−x2), x1−x2〉 ≤ −〈G∗0(μ

01−μ0

2), x1−x2〉−(L∗(η1−η2), x1−x2) ≤ 0,

which implies that x1 = x2 = x. To show uniqueness of the remaining components, weobserve that

E∗+(λ1 − λ2)+G∗0(μ

01 − μ0

2)+ L∗(η1 − η2) = 0,

〈η1 − η2, L(x − x0)+ �(x0, p0)+ δ〉 = 0,

or equivalentlyE∗(z)(λ1 − λ2, μ

01 − μ0

2 + η1 − η2) = 0,

where z = L(x − x0) + �(x0, p0) + δ and E is defined below (2.1.6). Due to continuousdependence of x on (α, β, γ, δ) we can assure that z ∈ O for all (α, β, γ, δ) ∈ V . Hence(H3) implies that (λ1 − λ2, μ

01 − μ0

2, η1 − η2) = 0.

Continuity. By (H3) the operator E∗(�(x0, p0)) has closed range and injective. Hence thereexists ε > 0 such that

|E∗(�(x0, p0))�| ≥ ε |�|for all � = (λ, μ0, η). Since ‖E∗(�(x0, p0)) − E∗(z)‖ ≤ |z − �(x0, p0)|, there exists aneighborhood of 0 in X × (W × Rm1)× Rm2 × Z, again denoted by V , such that

|E∗(zδ)�| ≥ ε

2|�| (2.4.7)

for all (α, β, γ, δ) ∈ V , where zδ = �(x0, p0)+ Lx − Lx0 + δ. Any solution (x, λ, μ0, η)

satisfiesE∗(zδ)(λ, μ0, η) = (−Ax − a, 0).

It thus follows from (2.4.7) that there exists a constant k1 such that for all (α, β, γ, δ) ∈ V

the solution (x, λ, μ, η) satisfies

|(x, λ, μ0, η)|X×(W×Rm1 )×Rm2×Z ≤ k1.

Page 68: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 50

50 Chapter 2. Sensitivity Analysis

Henceforth let (αi, βi, γi, δi) ∈ V and let (xi, λi , μ0i , ηi) denote the corresponding solution

to (2.4.4) for i = 1, 2. Then

E∗(zδ1)

⎛⎝ λ1 − λ2

μ01 − μ0

2η1 − η2

⎞⎠ =(

α1 − α2 + A(x2 − x1)

〈η2, L(x2 − x1)+ δ2 − δ1〉).

By (2.4.7) this implies that there exists a constant k2 such that

|(λ1− λ2, μ01−μ0

2, η1−η2)|(W×Rm1 )×Rm2×Z ≤ k2 (|α1−α2|+|δ1−δ2|+|x1−x2|). (2.4.8)

Now, from the first equation in (2.4.4) we have

〈A(x1−x2)+E∗+(λ1− λ2)+G∗0(μ

01−μ0

2)+L∗(η1−η2)+a1−a2, x1−x2〉 ≤ 0. (2.4.9)

We also find

〈μ01 − μ0

2,G0(x1 − x2)〉 = 〈μ01, c1 − c2〉 + 〈μ0

1, c2 −G0x2〉

−〈μ02,G0x1 − c1〉 − 〈μ0

2, c1 − c2〉 ≥ 〈μ01 − μ0

2, c1 − c2〉(2.4.10)

and similarly

〈η1 − η2, L(x1 − x2)〉 ≥ 〈η1 − η2, d1 − d2〉. (2.4.11)

Representing x1 − x2 = v + w with v ∈ ker (R+) and w ∈ range (E∗+), one obtains

E+(x1 − x2) = E+w = b1 − b2.

Thus

|w| ≤ k3 |b1 − b2| (2.4.12)

for some k3, and from (H2) and (2.4.9)–(2.4.11)

κ |v|2 ≤ 〈Av, v〉 = 〈A(x1 − x2), x1 − x2〉 − 2 〈Av,w〉 − 〈Aw,w〉

≤ −〈λ1 − λ2, b1 − b2〉 − 〈μ01 − μ0

2, c1 − c2〉 − 〈η1 − η2, d1 − d2〉

− 〈a1 − a2, v + w〉 − 2 〈Av,w〉 − 〈Aw,w〉.Let � = (λ1 − λ2, μ

01 − μ0

2, η1 − η2). Then

κ |v|2 ≤ |�||(β1 − β2, γ1 − γ2, δ1 − δ2)| + |α1 − α2|(|v| + |w|)+ ‖A‖(2|v||w| + |w|2).It thus follows from (2.4.8), (2.4.12) that there exists a constant k4 such that

|x1 − x2| ≤ k4 |(α1 − α2, β1 − β2, γ1 − γ2, δ1 − δ2)|for all (αi, βi, γi, δi) ∈ V . We apply (2.4.8) once again to obtain Lipschitz continuity of(x, λ, μ0, η) with respect to (α, β, γ, δ) in a neighborhood of the origin. Consequently T

Page 69: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 51

2.4. Lipschitz continuity 51

is strongly regular at (x0, λ0, μ0, η0), Robinson’s theorem is applicable, and (2.4.1), (2.4.2)follow.

Local solution. We show that there exists a neighborhood N of p0 such that for p ∈ N thesecond order sufficient optimality (2.3.21) is satisfied at x(p) so that x(p) is a local solutionof (2.1.1) by Theorem 2.12. Due to (H2) and the smoothness properties of f, e, g we canassume that

L′′(x(p), p, λ(p), η(p))(h, h) ≥ κ

2|h|2 for all h ∈ ker E+ (2.4.13)

if p ∈ N(p0). Let us define Ep = (e′(x(p), p), g′+(x(p), p)) for p ∈ N(p0). Due tosurjectivity ofEp0 and smoothness properties of e, g there exists a neighborhood N ⊂ N(p0)

of p0 such that Ep is surjective for all p ∈ V . Lemma 2.13 implies the existence of δ0 >

and γ > 0 such that

L′′(x(p), p, λ(p), η(p))(h+ z, h+ z) ≥ δ0 |h+ z|2

for all h ∈ ker E+ and z ∈ X satisfying |z| ≤ γ |h|. The orthogonal projection onto ker Ep

is given by Pker Ep= I − E∗p(EpE

∗p)−1Ep. We can select N so that

|E∗p(EpE∗p)−1Ep − E∗p0

(Ep0E∗p0)−1Ep0 | ≤

γ

1+ γ

for all p ∈ V . For x ∈ ker Ep, we have x = h + z with h ∈ ker E+ and z ∈ (ker E+)⊥and |x|2 = |h|2 + |z|2. Thus,

|z| ≤ |E∗p(EpE∗p)−1Epx − E∗p0

(Ep0E∗p0)−1Ep0x| ≤

γ

1+ γ(|h| + |z|)

and hence |z| ≤ γ |h|. From (2.4.13) this implies

L′′(x(p), p, λ(p), η(p)) ≥ δ0 |x|2 for all x ∈ ker Ep

and, since L(�(p), p) ⊂ ker Ep, the second order sufficient optimality (2.3.21) is satisfiedat x(p).

The first part of the proof of Theorem 2.16 contains a Lipschitz continuity result forlinear complementarity problems. In the following corollary we reconsider this result ina modified form which will be used for the convergence analysis of sequential quadraticprogramming problems.

Throughout the following discussion p0 in (2.1.1) (with C = X) is fixed and thereforeits notation is suppressed. For (x, λ, μ) ∈ X ×W ∗ × Rm let

L(x, λ, μ) = f (x)+ 〈λ, e(x)〉 + 〈μ, g(x)〉;i.e., the equality and finite rank inequality constraints are realized in the Lagrangian term.For (x, λ, μ) ∈ X×W ∗×Rm, (x, λ, μ) ∈ X×W ∗×Rm, and (a, b, c, d) ∈ X×W×Rn×Z

Page 70: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 52

52 Chapter 2. Sensitivity Analysis

consider ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎩

min 12 L′′

(x, λ, μ)(x − x, x − x)+ 〈f ′(x)+ a, x − x〉,

e(x)+ e′(x)(x − x) = b,

g(x)+ g′(x)(x − x) ≤ c,

�(x)+ L(x − x)− d ∈ K.

(2.4.14)

Let (x, λ, μ) ∈ X × W ∗ × Rm and define the operator G from X × W ∗ × Rm × Z∗ toX ×W × Rm × Z by

G(x, λ, μ)(x, λ, μ, η)

=

⎛⎜⎜⎝f ′(x)−e(x)−g(x)−�(x)

⎞⎟⎟⎠+⎛⎜⎜⎝

L′′(x, λ, μ)(x − x)+ e′(x)∗λ+ g′(x)∗μ+ L∗η−e′(x)(x − x)

−g′(x)(x − x)

−L(x − x)

⎞⎟⎟⎠ .

Note that

0 ∈

⎛⎜⎜⎝a

b

c

d

⎞⎟⎟⎠+ G(x, λ, μ)(x, λ, μ, η)+

⎛⎜⎜⎝00∂ψRm,+(μ)

∂ψK+(η)

⎞⎟⎟⎠ (2.4.15)

is the first order optimality system for (2.4.14). The following modification of (H2) will beused

(H2) there exists κ > 0 such that 〈L′′(x0, λ0, μ0) x, x〉X ≥ κ|x|2X for all x ∈ ker(E+).

Corollary 2.18. Assume that (H1), (H2), (H3) hold at a local solution of (2.1.1) and thatthe second derivatives of f , e, and g are Lipschitz continuous in a neighborhood of x0.Then there exist neighborhoods U(ξ0) of ξ0 = (x0, λ0, μ0, η0) in X × W ∗ × Rm × Z∗,U (x0, λ0, μ0) of (x0, λ0, μ0), and V of the origin in X ×W × Rm × Z and a constant K ,such that for all (x, λ, μ) ∈ U (x0, λ0, μ0)) and q = (a, b, c, d) ∈ V , there exists a uniquesolution ξ = ξ(x, λ, μ, q) = (x, λ, μ, η) ∈ U(ξ0) of (2.4.15), and x is a local solution of(2.4.14). Moreover, for every pair q1, q2 ∈ V and (x1, λ1, μ1), (x2, λ2, μ2) ∈ U (x0, λ0, μ0)

we have

|ξ(x1, λ1, μ1, q1)− ξ(x2, λ2, μ2, q2)| ≤ K(|(x1, λ1, μ1)− (x2, λ2, μ2)| + |q1 − q2|).

Proof. From the first part of the proof of Theorem 2.16 it follows that

0 ∈

⎛⎜⎜⎝a

b

c

d

⎞⎟⎟⎠+ G(x0, λ0, μ0)(x, λ, μ, η)+

⎛⎜⎜⎝00∂ψRm,+(μ)

∂ψK+(η)

⎞⎟⎟⎠

Page 71: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 53

2.5. Differentiability 53

admits a unique solution ξ = (x, λ, μ, η) which depends on (a, b, c, d) Lipschitz contin-uously provided that (a, b, c, d) is sufficiently small. The proof of the corollary can becompleted by observing that the estimates in the proof of Theorem 2.16 also hold uniformlyif (x0, λ0, μ0) is replaced by (x, λ, μ) in a sufficiently small neighborhood U (x0, λ0, μ0)

of (x0, λ0, μ0), and (a, b, c, d) is in a sufficiently small neighborhood V of the origin inX ×W × Rm × Z.

As an alternative to reconsidering the proof of Theorem 2.16 one can apply Theorem2.16 to (2.1.1) with P = (X×W ∗ ×Rm)× (X×W ×Rm×Z), p = (x, λ, μ, a, b, c, d),p0 = (x0, λ0, μ0, 0), and f, e, g, � replaced by

f (x, x, λ, μ, a) = 12L′′

(x, λ, μ)(x − x, x − x)+ 〈f ′(x)+ a, x − x〉,

e(x; x, b) = e(x)+ e′(x)(x − x)− b,

g(x; x, c) = g(x)+ g′(x)(x − x)− c,

�(x; x, d) = �(x)+ L(x − x)− d.

Clearly (H1)–(H3) hold for e, g, � at (x0, p0)withp0 = (x0, λ0, μ0, 0). Lipschitz continuityof f , e, g, � with respect to (x, λ, μ, a, b, c, d) is a consequence of the general regularityrequirements on f, e, g, � made at the beginning of the chapter and the assumption thatthe second derivatives of f and e are Lipschitz continuous in a neighborhood of x0. Thisimplies (H4) for e, f , g, �.

2.5 DifferentiabilityIn this section we discuss the differentiability properties of the solution ξ(p) = (x(p), λ(p),

μ(p), η(p)) of the optimality system (2.1.3) with respect to p. Throughout it is assumedthat p ∈ N so that (2.1.3) has the unique solutions ξ(p) in U , where N and U are specifiedin Theorem 2.16. We assume that C = X.

Definition 2.19. A functionφ fromP to a normed linear space X is said to have a directionalderivative at p0 ∈ P if

limt→0+

φ(p0 + t q)− φ(p0)

t

exists for all q ∈ P .

Consider the linear generalized equation

0 ∈

⎧⎪⎪⎨⎪⎪⎩L′(x0, p, λ0, μ0, η0)+ A(x − x0)+ E∗(λ− λ0)+G∗(μ− μ0)+ L∗(η − η0),

−e(x0, p)− E(x − x0),

−g(x0, p)−G(x − x0)+ ∂ψRm,+(μ),

−�(x0, p)− L(x − x0)+ ∂ψK+(η).(2.5.1)

In the proof of Theorem 2.16 strong regularity of (2.1.4) at ξ(0) = (x0, λ0, μ0, η0) wasestablished. Therefore it follows from Theorem 2.5 that there exist neighborhoods of p0

Page 72: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 54

54 Chapter 2. Sensitivity Analysis

and ξ(p0) which, without loss of generality, we again denote by N and U , such that (2.5.1)admits a unique solution ξ (p) = (x(p), λ(p), μ(p), η(p)) inU for everyp ∈ N . Moreoverwe have

|ξ(p)− ξ (p)| ≤ α(p) |p − p0|, (2.5.2)

where α(p) : N → R+ satisfies α(p)→ 0 asp→ p0. Thus if ξ has a directional derivativeat p0, then so does ξ(p) and the directional derivatives coincide. We henceforth concentrateon the analysis of directional differentiability of ξ (p). From (2.5.2) and Lipschitz continuityif p→ ξ(p) at p0, it follows that for every q ∈ P

lim supt>0

∣∣∣∣∣ ξ (p0 + t q)− ξ (p0)

t

∣∣∣∣∣ <∞.

Hence there exist weak cluster points of ξ (p0+t q)−ξ (p0)

tas t → 0+. Note that these weak

cluster points coincide with those of ξ(p0+t q)−ξ(p0)

tas t → 0+. We will show that under

appropriate conditions, these weak cluster points are strong limit points and we deriveequations that they satisfy. The following definition and additional hypotheses will be used.

Definition 2.20. A closed convex set C of a Hilbert space H is called polyhedric at z ∈ H if

∪λ>0 λ(C − Pz) ∩ [z− Pz]⊥ = ∪λ>0 λ(C − Pz) ∩ [z− Pz]⊥,

where P denotes the metric projection onto C and [z − Pz]⊥ stands for the orthogonalcomplement of the subspace spanned by x − Pz ∈ H . Moreover C is called polyhedric ifC is polyhedric at every z ∈ H .

(H5) e(x0, ·), �(x0, ·), f ′(x0, ·), e′(x0, ·), g′(x0, ·), and �′(x0, ·) are directionally differen-tiable at p0.

(H6) K is polyhedric at �(x0, p0)+ η0.

(H7) There exists ν > 0 such that 〈Ax, x〉 ≥ ν |x|2 for all x ∈ ker E.

(H8)(EL

) : X→ W × Z is surjective.

Since every element z ∈ Z can be decomposed uniquely as z = z1 + z2 with z1 = PKz andz2 = PK+z and 〈z1, z2〉 = 0 (see [Zar]), (H.6) is equivalent to

∪λ>0 λ(K − �(x0, p0)) ∩ [η0]⊥ = ∪λ>0 λ(K − �(x0, p0)) ∩ [η0]⊥.

Recall the decomposition of the inequality constraint with finite rank and the nota-tion that was introduced in (2.1.6). Due to the complementarity condition and continuousdependency of ξ(p) on p one can assume that

g+(x(p), p) = 0, g−(x(p), p) < 0, μ+(p) > 0, μ−(p) = 0 (2.5.3)

for p ∈ N . This also holds with (x(p), μ(p)) replaced by (x(p), μ(p)).

Page 73: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 55

2.5. Differentiability 55

Theorem 2.21. Let (H1)–(H5) hold and let (x, λ, μ, η) denote a weak cluster point ofξ(p0+t q)−ξ(p0)

tas t → 0+. Then (x, λ, μ, η) satisfies

0 ∈

⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

L′p(x0, po, λ0, μ0, η0)q + Ax + E∗λ+G∗μ+ L∗η,

−e+p (x0, p0)q − E+x,

−g0p(x0, p0)−G0x + ∂ψRm2 ,+(μ0),

μ−,

〈η, �(x0, p0)〉 + 〈η0, Lx + �p(x0, p0)q〉.

(2.5.4)

Proof. As described above, we can restrict ourselves to a weak cluster point of ξ (p0+t q))−ξ (p0)

t.

We put pn = p0 + tn q. By (2.5.1)

0=L′(x0, pn, λ0, μ0, η0)− L′(x0, p0, λ0, μ0, η0)

+A(x(pn)− x0)+ E∗(λ(pn)− λ0)+G∗(μ(pn)− μ0)+ L∗(η(pn)− η0) = 0,

0=−e(x0, pn)+ e(x0, p0)− E(x(pn)− x0) = 0.

Dividing these two equations by tn > 0 and letting tn → 0+ we obtain the first two equationsin (2.5.4). For the third equation note that μ0 ≥ 0 since μ0(pn) ≥ 0 and μ0(p0) = 0. Sinceg0(x0, p0) = 0 we have from (2.5.1)

〈g0(x0, pn)− g0(x0, p0)+G0(x(pn)− x0), z− (μ0(pn)− μ0(p0)〉 ≥ 0

for all z ∈ Rm2,+. Dividing this inequality by tn > 0 and letting tn → 0+ we obtain thethird inclusion. The fourth equation is obvious. For the last equation we recall that

〈η(pn), �(x0, pn)+ L(x(pn)− x0)〉 = 〈η0, �(x0, p0)〉 = 0.

Thus,

〈η(pn)− η0, �(x0, pn)〉 + 〈η(pn), L(x(pn)− x0)〉 + 〈η0, �(x0, pn)− �(x0, p0)〉 = 0,

which implies the last equality.

Define the value function V (p)

V (p) = infx∈c {f (x, p) : e(x, p) = 0, g(x, p) ≤ 0, �(x, p) ∈ K}.

Then we have the following corollary.

Corollary 2.22 (Sensitivity of Cost). Let (H.1)–(H.4) hold and assume that e, g, � arecontinuously differentiable in the sense of Fréchet at (x0, p0). Then the Gâteaux derivative

Page 74: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 56

56 Chapter 2. Sensitivity Analysis

of the value function μ at p0 exists and is given by

V ′(p0)= Lp(x0, p0, λ0, μ0, η0)

= fp(x0, p0)+ 〈λ0, ep(x0, p0)〉 + 〈μ0, gp(x0, p0)〉 + 〈η0, �p(x0, p0)〉.

Proof. Note that

V (p) = f (x(p), p) = L(x(p), p, λ(p), μ(p), η(p)).

For q ∈ P let ξ(t) = t−1(x(p0 + t q)− x(p0)), t > 0. Then there exists a subsequence tnsuch that limn→∞ tn = 0 and ξ(tn)→ (x, λ, μ, η). Now, let pn = p0 + tn q and

V (pn)− V (p0) = L(x(pn), pn, λ(pn), μ(pn), η(pn))

−L(x(pn), p0, λ(pn), μ(pn), η(pn))+ L(x(pn), p0, λ(pn), μ(pn), η(pn))

−L(x(pn), p0, λ(p0), μ(p0), η(p0))+ L(x(pn), p0, λ(p0), μ(p0), η(p0))

−L(x(p0), p0, λ(p0), μ(p0), η(p0)).

Thus,

V (pn)−V (p0)

tn= 〈L′(x0, p0, λ0, μ0, η0), x〉 + 〈L(x0, p0, λ0, μ0, η0)p, q〉

+〈λ, e(x0, p0)〉 + 〈μ, g(x0, p0)〉 + 〈η, �(x0, p0)〉 +O(tn).

Clearly 〈λ, e(x0, p0)〉 = 0 and considering separately the cases according to (2.5.3)we find 〈μ, g(x0, p0)〉 = 0 as well. Below we verify that

〈η0, Lx + �p(x0, p0)q〉 = 0. (2.5.5)

From Theorem 2.21 this implies that 〈η, �(x0, p0)〉 = 0. To verify (2.5.5) note that from(2.5.1) we have

〈η(tn), �(x0, p0)− �(x0, pn)− L(x(tn)− x0)〉 ≤ 0

and consequently 〈η0, �p(x0, p0)q + L x〉 ≥ 0. Moreover

〈η0, �(x0, pn)− �(x0, p0)+ L(x(tn)− x0)〉 ≤ 0

and therefore 〈η0, �p(x0, p0)q + L x〉 ≤ 0 and (2.5.5) holds. It follows that

limn→∞

V (pn)− V (p0)

tn= 〈Lp(x0, p0, λ0, μ0, η0), q〉.

Since the limit does not depend on the sequence {tn} the desired result follows.

Page 75: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 57

2.5. Differentiability 57

In order to prove strong differentiability of x(p) we use a result due to Haraux [Har]on directional differentiability of the metric projection onto a closed convex set.

Theorem 2.23. Let C be a closed convex set in a Hilbert space H with metric projectionP from H onto C. Let ψ be an H -valued function and assume that C is polyhedric atψ(0). Set

K(ψ(0)) = ∪λ>0 λ(C − Pψ(0)) ∩ [ψ(0)− Pψ(0)]⊥,and denote by PK(ψ(0)) the projection onto K(ψ(0)). If there exists a sequence tn such that

limn→∞ tn → 0+ and limn→∞ ψ(tn)−ψ(0)tn

=: ψ exists in H , then

limtn→0+

Pψ(tn)− Pψ(0)

tn= PK(ψ(0))ψ .

Proof. Let γ (t) = Pψ(t)−Pψ(0)t

. Since the metric projection onto a closed convex set in aHilbert space is a contraction, γ (tn) has a weak limit which is denoted by γ . Since

〈ψ(tn)− Pψ(tn), Pψ(0)− Pψ(tn)〉 ≤ 0

and Pψ(tn) = tn γ (tn)+ Pψ(0), we obtain

〈ψ(tn)− tn γ (tn)− Pψ(0),−tnγ (tn)〉 ≤ 0.

This implies ⟨tn

ψ(tn)− ψ(0)

tn+ ψ(0)− tn γ (tn)− Pψ(0),−tnγ (tn)

⟩≤ 0

and hence

t2n

⟨γ (tn), γ (tn)− ψ(tn)− ψ(0)

tn

⟩≤ tn〈γ (tn), ψ(0)− Pψ(0)〉

= 〈Pψ(tn)− Pψ(0), ψ(0)− Pψ(0)〉 ≤ 0

(2.5.6)

for all n. Since the norm is weakly lower semicontinuous, (2.5.6) implies that

〈γ, γ − ψ〉 ≤ 0. (2.5.7)

Employing (2.5.6) again we have

0 ≥ 〈γ 〈tn), ψ(0)− Pψ(0)〉 ≥ tn

⟨γ (tn), γ (tn)− ψ(tn)− ψ(0)

tn

⟩for tn > 0. Since {γ (tn)} is bounded, this implies

〈γ,ψ(0)− Pψ(0)〉 = 0,

and thus γ ∈ K(ψ(0)).

Page 76: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 58

58 Chapter 2. Sensitivity Analysis

Let w ∈ ∪λ>0 λ(C−Pψ(0))∩[ψ(0)−Pψ(0)]⊥ be arbitrary. Then there exist λ > 0and u ∈ C such that w = λ (u− Pψ(0)) and 〈ψ(0)− Pψ(0), u− Pψ(0)〉 = 0. Since

〈ψ(tn)− tn γ (tn)− Pψ(0), u− Pψ(0)− tnγ (tn)〉 ≤ 0,

we find for δn = γ (tn)− γ that⟨tn

ψ(tn)− ψ(0)

tn+ ψ(0)− Pψ(0)− tnγ − tnδn, u− Pψ(0)− tnγ − tnδn

⟩≤ 0.

Thus,⟨tn

ψ(tn)− ψ(0)

tn− tnγ − tnδn, u− Pψ(0)− tnγ − tnδn

⟩≤ 〈ψ(0)− Pψ(0), tnδn〉

and therefore⟨ψ(tn)− ψ(0)

tn− γ, u− Pψ(0)

⟩≤ 〈δn, u− Pψ(0)〉 + 〈ψ(0)− Pψ(0), δn〉 + tn M

for some constant M . Letting tn → 0+ we have

〈ψ − γ, u− Pψ(0)〉 ≤ 0.

Since C is polyhedric at ψ(0) it follows that

〈ψ − γ,w〉 ≤ 0 for all w ∈ K(ψ(0)). (2.5.8)

Combining (2.5.7) and (2.5.8) one obtains

〈γ − ψ, γ − w〉 ≤ 0 for all w ∈ K(ψ(0)),

and thus γ = PK(ψ(0))ψ

To show strong convergence of γ (tn) observe that from (2.5.7) and (2.5.8)

〈ψ − γ, γ 〉 = 0.

Hence

〈ψ, γ 〉 = |γ |2 ≤ lim inf |γ (tn)|2 ≤ lim sup

⟨ψ(tn)− ψ(0)

tn, γ (tn)

⟩= 〈ψ, γ 〉,

which implies that lim |γ (tn)|2 = |γ |2 and completes the proof.

Theorem 2.24 (Sensitivity Equation). Let (H1)–(H8) hold. Then the solution mappingp → ξ(p) = (x(p), λ(p), μ(p), η(p)) is directionally differentiable at p0, and the direc-tional derivative (x, λ, μ, η) at p0 in direction q ∈ P satisfies

0 ∈

⎧⎪⎪⎨⎪⎪⎩Lp(x0, p0, λ0, μ0, η0)q + Ax + E∗λ+G∗+μ+G∗

0μ0 + L∗η,

−e+p (x0, p0)q − E+x,−g0

p(x0, p0)q −G0x + ∂ψRm,+(μ0),

−�p(x0, p0)q − Lx + ∂ψK+(η).

(2.5.9)

Page 77: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 59

2.5. Differentiability 59

Proof. Let {tn} be a sequence of real numbers with limn→∞ tn = 0+ andw−lim t−1n (ξ(p0+

tnq) − ξ(p0)) = (x, λ, μ, η). Then (x, λ, μ, η) is also a weak cluster point of w −lim t−1

n (ξ (p0 + tnq)− ξ(p0)). The proof is now given in several steps.We first show that x(tn)−x(0)

tnconverges strongly in X as n→∞ and (2.5.9) holds for

all weak cluster points. Let pn = p0 + tn q and define

φ(tn) = L′(x0, pn, λ0, μ0, η0)+ f ′(x0, p0)− Ax0.

From (2.5.1) we have

0 ∈

⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩L′(x0, pn, λ0, μ0, η0)+ f ′(x0, p0)+ A(x(tn)− x0)+ E∗(λ(tn))

+G∗(μ(tn))+ L∗(η(tn)),−e(x0, pn)− E(x(tn)− x0),

−g(x0, pn)−G(x(tn)− x0)+ ∂ψRm,+(μ(tn)),

−�(x0, pn)− L(x(tn)− x0)+ ∂ψK+(η(tn)).

(2.5.10)

By (H8) there exists a unique w(tn) ∈ range (E∗, L∗) such that⎛⎝ E

L

⎞⎠w(tn) =⎛⎝ −e(x0, pn)+ Ex0

−�(x0, pn)+ Lx0

⎞⎠ .

Due to (H5) and (H8) there exists w ∈ X such that lim t−1n (w(tn) − w(0)) = w. If we

define y(tn) = x(tn)− w(tn), then using Lemma 2.7 one verifies that y(tn) ∈ C, where

C = {x ∈ X : Ex = 0 and Lx ∈ K}.

Observe that

〈E∗λ(tn)+ L∗η(tn), c − yn(tn)〉 ≤ 0 for all c ∈ C

and thus

〈Ayn(t)+ φ(tn)+ Aw(tn)+G∗μ(tn), c − y(tn)〉 ≥ 0 for all c ∈ C. (2.5.11)

If we equip ker E with the inner product

((x, y)) = 〈Ax, y〉 for x, y ∈ ker E,

then ker E is a Hilbert space by (H7). Let AP = Pker E APker E and define

ψ(tn) = −A−1P Pker E(φ(tn)+ Aw(tn)+G∗μ(tn)) ∈ ker E.

Then limtn→0+ψ(tn)−ψ(0)

tn=: ψ exists, and (2.5.11) is equivalent to

((y(tn)− ψ(tn), c − y(tn))) ≥ 0 for all c ∈ C.

Page 78: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 60

60 Chapter 2. Sensitivity Analysis

Thus y(tn) = PCψ(tn), where PC denotes the metric projection in ker E onto PC withrespect to the inner product ((·, ·)). For any h ∈ ker E we find

((h, ψ(0)− PC ψ(0) ))= 〈h,−f ′(x0, p0)+ Axo − Aw(0)−G∗μ0 − Ay(0)〉

= 〈h,−f ′(x0, p0)−G∗μ0 − E∗λ0〉 = 〈h,L∗η0〉,

and therefore

[ψ(0)− PCψ(0)]⊥ = {h : 〈η0, L h〉 = 0}. (2.5.12)

Here the orthogonal complement is taken with respect to the ((·, ·))—the inner producton ker E. Moreover we have L(y(0) = �(x0, p0). Hence from Lemma 2.25 below wehave ∪λ>0λ(C − PC ψ(0)) ∩ [ψ(0)− PC ψ(0)]⊥ = ∪λ>0λ(C − PC ψ(0)) ∩ [ψ(0) −PC ψ(0)]⊥, i.e., C is polyhedric with respect to ψ(0). Theorem 2.23 therefore impliesthat limtn→o+(t

−1n )(y(tn)− y(0)) =: y exists and satisfies

((ψ − y, v − y)) ≥ 0 for all v ∈ C(ψ(0)). (2.5.13)

Moreover limtn→0(t−1n )(x(tn)− x(0)) exists and will be denoted by x.

To verify that (x, λ, μ, η) satisfies (2.5.9) it suffices to prove the last inclusion. From(2.5.13) we have

〈−L′p(x0, p0, λ0, μ0, η0)q − A x −G∗μ, v − y〉 ≤ 0

and from the first equation of (2.5.4)

〈L∗η, v − y〉 = 〈η, L v − �p(x0, p0)q〉 ≤ 0 (2.5.14)

for all v ∈ C(ψ(0)) = ∪λ>0 λ(C − PC ψ(0)) ∩ [ψ(0) − PC ψ(0)]⊥. Here we used thefacts that φ = L′p(x0, λ0, μ0, ηo)q and L w = −�p(x0, p0)q. From (2.5.12), (H8), andLy(0) = �(x0, p0) we have

L C(ψ(0)) = ∪λ>0 λ(K − �(x0, p0) ∩ [η0]⊥ =: K

and therefore by (2.5.14)

〈η, v − L x − � p(x0, p0)q〉 ≤ 0 for all v ∈ K. (2.5.15)

Recall from (2.5.5) that

〈η0, �p(xo, p0)q + L x〉 = 0.

This further implies �p(x0, p0)q +L x ∈ K , and from (2.5.15) we deduce η ∈ K+. Conse-quently η ∈ ∂ ψK(�p(x0, p0)q+L x), which is equivalent to � p(xo, p0)q+L x ∈ ∂ ψK+(η)

and implies the last inclusion in (2.5.9).

Page 79: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 61

2.5. Differentiability 61

Next, we show that the weak cluster points of limt→0+ξ (t)−ξ(0)

tare unique. Let

(xi , λi , μi , ηi), i = 1, 2, be two weak cluster points. Then by (2.5.9)

0 = 〈A(x1 − x2), x1 − x2〉 + 〈μ1 − μ2,G(x1 − x2)〉 + 〈η1 − η2, L(x1 − x2)〉

= 〈A(x1 − x2), x1 − x2〉 + 〈μ01,−g0

p(x0, p0)q −G0x2〉 + 〈μ02,−g0

p(x0, p0)q −G0x1〉

+〈η1,−�p(x0, p0)q − Lx2〉 + 〈η2,−�p(x0, p0)q − Lx1〉

≥ 〈A(x1 − x2), x1 − x2〉,and by (H2) this implies x1 = x2. From (2.5.9) we have

E∗(�(x0, p0))

⎛⎝ ˙λ1 − ˙λ2

μ01 − μ0

2η1 − η2

⎞⎠ = 0

and hence (H3) implies (λ1, μ1, η1) = (λ2, μ2, η2). Consequently ξ (t)−ξ(0)t

has a uniqueweak limit as t → 0+. Finally we show that the unique weak limit is also strong. For thex-component this was already verified. Note that from (2.5.1)

E∗(�(x0, p0))

⎛⎝ ˜λ(t)− λ0

μ0 − μ00

η(t)− η0

⎞⎠ =⎛⎝ −L′(x0, p0 + t q, λ0, μ0, η0)− A(x(t)− x0)

〈η(t),−�(x0, p + t q)+ �(x0, p0)− L( ˆx(t)− x0)〉

⎞⎠ .

Dividing this equation by t and noticing that the right-hand side converges strongly ast → 0+, it follows from (H3) that limt→0+ t−1(λ(t)− λ0, μ(t)−μ0, η(t)− η0) convergesstrongly as well.

Lemma 2.25. Assume that (H6) and (H8) hold and let y(0) ∈ C be such that Ly(0) =�(x0, p0). Then we have

∪λ>0 λ(C − y(0)) ∩ {h ∈ ker E : 〈η0, L h〉 = 0}

= ∪λ>0 λ(C − y(0)) ∩ {h ∈ ker E : 〈η0, L h〉 = 0}.(2.5.16)

Proof. It suffices to verify that the set on the right-hand side of (2.5.16) is contained in theset on the left. For this purpose let y ∈ ∪λ>0 λ(C − y(0)) ∩ {h ∈ ker E : 〈η0, L h〉 =0} and decompose y = y1 + y2 with y1 ∈ ker( E

L ) and y2 ∈ (ker( EL ))⊥. We need

only consider y2. Since Ly2 ∈ ∪λ→0 λ(K − �(x0, p0)) ∩ [η0]⊥ and therefore Ly2 ∈∪λ>0 λ(K − �(x0, p0)) ∩ [η0]⊥ by (H6), hence there exist sequences {λn} and {kn} withλn > 0, kn ∈ K, 〈kn, η0〉 = 0, and lim λn(kn − �(x0, p0)) = Ly2. Let us definecn = cn+y(0), where cn is the unique element in ( E

L )∗ satisfying ( EL )cn = ( 0

kn−Ly(0) ). Thenwe haveE(cn−y(0)) = 0, 〈η0, L cn〉 = 〈η0, kn〉 = 0 and hence the sequenceλn(cn−y(0)) iscontained in the set on the left-hand side of (2.5.16). Moreover limn→∞( E

L )(λn(cn−y(0)) =( EL )y2, and since both λn(cn − y(0)) and y2 are contained in the range of ( E

L )∗ we find thatlimn→∞ λn(cn − y(0)) = y2.

Page 80: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 62

62 Chapter 2. Sensitivity Analysis

2.6 Application to optimal control of anordinary differential equation

We consider optimal control of the linear ordinary differential equation with a terminalconstraint

min∫ T

0f (y(t), u(t), α)dt

subject to

y(t) = A(α)y(t)+ B(α)u(t)+ h(t) on (0, T ),

y(0) = 0,

|y(T )− yd | ≤ δ,

u ≤ z a.e. on (0, T ),

(2.6.1)

where T > 0, y(t) ∈ Rn, u(t) ∈ Rn, α ∈ Rr , yd ∈ Rn, δ > 0, z ∈ L2, A(α) ∈ Rn×n,B(α) ∈ Rn×m, f : Rn×Rm×Rr → R, andh ∈ L2(0, T ). All function spaces of this sectionare considered on the interval (0, T ) with Euclidean image space of appropriate dimension.The perturbation parameter is given by p = (α, yd, h, z) ∈ P = Rr × Rn × L2 × L2 withreference parameter p0 = (α0, y

d0 , h0, z0). To identify (Pp) as a special case of (2.1.1) we

setX = H 1

L × L2,W = L2, Z = L2,K = {φ ∈ L2 : φ ≤ 0 a.e.},where H 1

L = {φ ∈ H 1 : φ(0) = 0}, x = (y, u), and

f (y, u, p) =∫ T

0f (y, u, α)dt,

e(y, u, p) = y − A(α)y − B(α)u− h,

g(y, u, p) = 1

2

(|y(T )− yd |2 − δ2),

�(y, u, p) = u− z.

The linearizations E,G, and L are

E(y0, u0, p0)(h, v) = h− A(α0)h− B(α0)v,

G(y0, u0, p0)(h, v) = 〈y0(T )− yd0 , h(T )〉,

L(y0, u0, p0)(h, v) = v.

The following assumptions will imply conditions (H1)–(H8) of this chapter.

(A1) There exists a solution (y0, u0) of (2.6.1).

Page 81: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 63

2.6. Application to optimal control of an ODE 63

(A2) A(·) and B(·) are Lipschitz continuous and directionally differentiable at α0.

(A3) (y, u) → f (y, u, α) is twice continuously differentiable in a neighborhood of (y0,

u0, α0) ∈ H 1L × L2 × Rr .

(A4) There exists ε > 0 such that for the Hessian with respect to (y, u)

(h, v) f′′(y0, u0, α0) (h, v)

T ≥ ε|v|2

for all (h, v) ∈ Rn × Rm.

(A5) There exist a neighborhood V of (y0, u0, α0) ∈ H 1L × L2 × Rr and a constant κ

such that |f (y, u, α1) − f (y, u, α2)| ≤ κ|α1 − α2| for all (y, u, αi) ∈ V , i = 1, 2,and f ′(y0, u0, ·) is directionally differentiable at α0, where the prime denotes thederivative with respect to (y, u).

(A6) 〈y0(T )− yd0 ,

∫ T

0 eA(α0)(T−s) B(α0) (z0 − u0)ds〉 �= 0.

With (A3) holding, the regularity requirements made at the beginning of the chapterare satisfied. The regular point condition (H1) will be satisfied if for every (φ, ρ, ψ) ∈L2 × R× L2 there exist (h, v) ∈ H 1

L × L2 and (r+, r, k) ∈ R+ × R×K such that⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩h− A(α0)h− B(α0)v = φ,

〈y0(T )− yd0 , h(T )〉 + r+ + r g(y0, u0, p0) = ρ,

v − k + r(u0 − z0) = ψ.

(2.6.2)

From the first and third equations we have

v = ψ + k + r(z0 − u0),

h(t) = ρ1(t)+∫ t

0eA(α0)(t−s)B(α0)(k + r(z0 − u0))dt,

where ρ1(t) =∫ t

0 eA(α0)(t−s)(B(α0)ψ+φ)ds. The second equation in (2.6.2) is equivalent to⟨y0(T )− yα

0 ,

∫ T

0eA(α0)(T−s)B(α0)(k + r(z0 − u0))ds

⟩+ r+ + r g(y0, u0, p0)

= ρ − ρ1(T ).

If g(y0, u0, p0) = 0, then, using (A6), the desired solution for (2.6.2) is obtained by settingr+ = k = 0 and

r =⟨y0(t)− yd

o ,

∫ T

0eA(α0)(T−s)B(α0)(z0 − u0)ds

⟩−1

(ρ − ρ1(T )).

Ifg(y0, u0, p0) < 0 andρ−ρ1 ≥ 0, then the choice r+ = ρ−ρ1, r = k = 0 gives the desiredsolution. Finally if g(y0, u0, p0) < 0 and ρ − ρ1 < 0, then r = (ρ − ρ1)g(y0, u0, p0) > 0,

Page 82: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 64

64 Chapter 2. Sensitivity Analysis

r+ = 0, and k = r(u0−u) ∈ K give a solution for (2.6.2). Thus the regular point conditionholds and implies the existence of a Lagrange multiplier (λ0, μ0, η0) ∈ L2 × R× L2. Forthe Lagrangian functional

L(y, u, p0, λ0, μ0, η0)=∫ T

0f (y, u, α0)dt + 〈λ0, y − A(α0)y − B(α0)u− h0〉

+μ0

2(|y(T )− yd

0 | − δ2|)− 〈η0, u− z0〉,the Hessian with respect to (y, u) at (y0, u0) in directions (h, v) ∈ H 1

L × L2 satisfies

L′′(y0, u0, p0, λ0, μ0, η0)((h, v), (h, v))

=∫ T

0(h(t), v(t))f

′′(y0, u0, α0)(h(t), v(t))

T dt + μ0|h(T )|2 ≥ ε|v|2L2

by (A4). Let k1 > 0 be chosen such that |h|H 1L≤ k1|v|L2 for all (h, v) ∈ H 1

L ×L2 in ker E.Then there exists a constant k2 > 0 such that

L′′(y0, u0, p0, λ0, μ0, η0)((h, v), (h, v)) ≥ k2(|h|2H 1

L

+ |v|1L2)

for all (h, v) ∈ ker E and (H2), (H7) hold.We turn to the verification of (H3). The case g(y0, u0, α0) < 0 is simple and we

therefore consider the case g(y0, u0, α0) = 0. For arbitrary (φ, ρ, ψ) ∈ L2 × R × L2

existence of a solution (h, v, r) ∈ H 11 × L2 × R to⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩

h− A(α0)h− B(α0)v = φ,

〈y0(T )− yd0 , h(T )〉 = ρ,

v + r(z0 − u0) = ψ

(2.6.3)

must be shown. From the first and third equations

v = ψ − r(z0 − u0)

and

h(t) = ρ1(t)− r

∫ t

0eA(α0)(t−s)B(α0)(z0 − u0)ds.

From the second equation of (2.6.3) and (H7) we find

r =⟨y0(T )− yd

0 ,

∫ T

0eA(α0)(T−s)B(α0)(z0 − u0)ds

⟩−1

(〈y0(T )− yd0 , ρ1(T )〉 − ρ),

and (H3) holds. Conditions (H4) and (H5) follow from (A2) and (A5). The cone of a.e.nonpositive functions inL2 is polyhedric [Har] and hence (H6) holds. Finally (H8) is simpleto check. We note that (A5) and (A6) are not needed if (2.6.1) is considered without terminalconstraint.

Page 83: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 65

Chapter 3

First Order AugmentedLagrangians for Equalityand Finite Rank InequalityConstraints

3.1 GeneralitiesThis chapter is devoted to first order augmented Lagrangian methods for constrained prob-lems of the type {

min f (x) over x ∈ X

subject to e(x) = 0, g(x) ≤ 0, �(x) ∈ K,(3.1.1)

where f : X→ R, e : X→ W, g : X→ Rm, � : X→ Z with X,W , and Z real Hilbertspaces, K a closed convex cone with vertex at 0 in Z, and Rm endowed with the naturalordering x ≤ 0 if xi ≤ 0 for i = 1, . . . , m. A sequence of unconstrained problems with theproperty that their solutions converge to the solution of (3.1.1) will be formulated. As weshall see, first order augmented Lagrangian techniques are related to Lagrangian or dualitymethods, as well as to the penalty technique. They arise from considering the Lagrangianfunctional for (3.1.1) and adding a proper penalty term. Differently from pure penaltymethods, however, the penalty parameter is not taken to infinity. Rather the penalty term withfixed weight is used to enhance convexity of the Lagrangian functional in the neighborhoodof a local minimum and to speed up convergence of pure Lagrangian methods. Affineinequality constraints with infinite-dimensional image space can be added to the problemformulation (3.1.1). Such constraints, however, are not augmented in this chapter. Centralto the analysis is the augmentability property, which is also of interest in its own right. It hasreceived a considerable amount of attention for finite-dimensional optimization problems;see, e.g., [Hes1]. Augmentability of (3.1.1) means that a properly defined penalty terminvolving the constraints in (3.1.1) can be added to the Lagrangian functional for (3.1.1)such that the resulting augmented Lagrangian functional has the property that its Hessianis positive definite on all of X under minimal assumptions on (3.1.1). We shall presenta convergence theory for the first order augmented Lagrangian method applied to (3.1.1)without strict complementarity assumption. This means that we do not assume that μ∗i = 0implies gi(x∗) < 0, where x∗ denotes a local solution of (3.1.1) and μ∗i is the ith coordinateof the Lagrange multiplier associated to the inequality constraintg(x) ≤ 0. As an applicationwe shall discuss the fact that in the context of parameter estimation problems the first order

65

Page 84: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 66

66 Chapter 3. First Order Augmented Lagrangians

augmented Lagrangian method can be considered as a hybrid method combining the output-matching and equation error methods. In the context of parameter estimation or, equallywell, of optimal control problems, the first order augmented Lagrangian method lends itselfto vectorization and parallelization in a natural way; see [KuTa].

It will be convenient throughout this section to identify the Hilbert spaces X,W , andZ with their duals. As a consequence the Lagrange multiplier associated to the equalityconstraint e(x) = 0 is sought in W and that for �(x) ∈ K in Z.

Let us now fix those assumptions which are assumed throughout this chapter: Thereexists x∗ ∈ X and (λ∗, μ∗, η∗) ∈ W × Rp × Z such that{

f, e, and g are twice continuously Fréchet differentiable

in a neighborhood of x∗,(3.1.2)

x∗ is a stationary point of (3.1.1) with Lagrange multiplier (λ∗, μ∗), i.e.,{(f ′(x∗), h)X +

(λ∗, e′(x∗)h

)W+ (

μ∗, g′(x∗)h)Rm +

(η∗, �′(x∗)h

)Z= 0

for all h ∈ X,(3.1.3)

and {e(x∗) = 0,

(μ∗, g(x∗)

)Rm = 0, μ∗ ≥ 0, g(x∗) ≤ 0,(

η∗, �(x∗))Z= 0, η∗ ∈ K+, �(x∗) ∈ K.

(3.1.4)

Moreover we assume that

e′(x∗) : X→ W is surjective. (3.1.5)

Above(·, ·)

Wand

(·, ·)Z

denote the inner products in W and Z, respectively, and(·, ·)Rm is the usual inner product in Rm. If x∗ is a local solution of (3.1.1), if further (3.1.2)

holds and x∗ is a regular point in the sense of Definition 1.5, then there exists a Lagrangemultiplier (λ∗, μ∗, η∗) such that (3.1.3), (3.1.4) hold.

Introducing the Lagrange functional L : X ×W × Rm × Z → R by

L(x, λ, μ, η) = f (x)+ (λ, e(x)

)W+ (

μ, g(x))Rm +

(η, �(x)

)Z,

(3.1.3) can be expressed as

L′(x∗, λ∗, μ∗, η∗)h = 0 for all h ∈ X.

Here and below the prime denotes differentiation with respect to x. With (3.1.2) holding,L′′(x∗, λ∗, μ∗, η∗) : X ×X→ R exists as symmetric bilinear form given by

L′′(x∗, λ∗, μ∗, η∗)(h, k) = f ′′(x∗)(h, k)+ (λ∗, e′′(x∗)(h, k)

)W+ (

μ∗, g′′(x∗)(h, k))Rm

for all (h, k) ∈ X×X.The index set of the coordinates of the finite rank inequality constraintsis decomposed as

I1 = {i : gi(x∗) = 0, μ∗i = 0},I2 = {i : gi(x∗) = 0, μ∗i < 0},I3 = {i : gi(x∗) < 0}.

Page 85: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 67

3.2. Augmentability and sufficient optimality 67

Clearly I1 ∪ I2 ∪ I3 = {1, . . . , m}. If I1 is empty, then we say that strict complementarityholds. Let us denote the cardinality of Ii by mi. Without loss of generality we assumethat mi ≥ 1 for i = 1, 2, 3 and that I1 = 1, . . . , m1, I2 = m1 + 1, . . . , m1 + m2, andI3 = m1 +m2 + 1, . . . , m.

The following second order sufficient optimality condition will be of centralimportance:

There exists a constant γ > 0 such that

⎧⎪⎪⎨⎪⎪⎩L′′

(x∗, λ∗, μ∗, η∗)(h, h) ≥ γ |h|2X for all h ∈ C,where C = {h ∈ X : e′(x∗)h = 0, g′i (x∗)h ≤ 0

for i ∈ I1, g′i (x

∗)h = 0 for i ∈ I2}.(3.1.6)

In caseX is finite-dimensional, (3.1.6) is well known to imply that x∗ is a strict local solutionto (3.1.1); see, e.g., [Be, p. 71]. For the infinite-dimensional case this will be proved inTheorem 3.4 below.

Section 3.2 is devoted to augmentability of problem (3.1.1). This means that a func-tional is associated to (3.1.1) with the property that if x∗ is a local minimizer of (3.1.1), thenit is also a minimizer of that functional where the essential constraints are eliminated andat most simple affine constraints remain. Here this is achieved on the basis of Lagrangianfunctionals and augmentability is obtained without the use of a strict complementarity as-sumption. Section 3.3 contains the foundations for the first order augmented Lagrangianalgorithm based on a duality framework. The convergence analysis for the first order algo-rithm is given in Section 3.4. Section 3.5 contains an application to parameter estimationproblems.

3.2 Augmentability and sufficient optimalityWe consider a modification of (3.1.1) where the equality and finite rank inequality constraintsof (3.1.1) are augmented:

min fc(x, u) = f (x)+ c

2|e(x)|2W +

c

2|g(x)+ u|2

Rm over (x, u) ∈ X × Rm

subject to e(x) = 0, g(x)+ u = 0, u ≥ 0, �(x) ∈ K,

(3.2.1)

where c > 0 will be appropriately chosen. Observe that x∗ is a local solution to (3.1.1)if and only if (x∗, u∗) = (x∗,−g(x∗)) is a (local) solution to (3.2.1). Moreover, x∗ isstationary for (3.1.1) with Lagrange multiplier (λ∗, μ∗, η∗) if and only if (x∗,−g(x∗)) isstationary for (3.2.1) with Lagrange multiplier (λ∗, μ∗, μ∗, η∗), i.e., (3.1.3)–(3.1.4) holdwith g(x∗) = −u∗. Associated to (3.2.1) we introduce the functional

Lc(x, u, λ, μ, η) = L(x, λ, μ, η)+ c

2|e(x)|2W +

c

2|g(x)+ u|2

Rm . (3.2.2)

Page 86: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 68

68 Chapter 3. First Order Augmented Lagrangians

Its Hessian L′′c at (x∗, u∗) in direction ((h, k), (h, k)) ∈ (X × Rm) is given by

L′′c (x∗, u∗, λ∗, μ∗, η∗)((h, k), (h, k))= L′′(x∗, λ∗, μ∗, η∗)(h, h)+ c|e′(x∗)h|2 + c|g′(x∗)h+ k|2= L′′(x∗, λ∗, μ∗, η∗)(h, h)+ c|e′(x∗)h|2 + c

∑i∈I2

|g′(x∗)h|2

+ c∑

i∈I1∪I3

|gi(x∗)h+ ki |2 + c∑i∈I2

(|ki |2 + 2kig′i (x

∗)h).

(3.2.3)

By the Riesz representation theorem there exists for every i ∈ {1, . . . , m} a unique li ∈ X

such that g′i (x∗)h =(li , h

)X

for every h ∈ X, where(·, ·)

Xdenotes the inner product in X.

The operator E : X→ W × Rm2 , defined by

Eh = (e′(x∗)h,

(lm1+1, h

)X, . . . ,

(lm1+m2 , h

)X

),

combines the linearized equality constraint and the active inequality constraints, whichact like equalities in the sense that the associated Lagrange multipliers are positive. Theessential technical result which will imply augmentability of the constraints in (3.1.1) isgiven next.

Proposition 3.1. If (3.1.2)–(3.1.6) hold, then there exist constants c > 0 and τ ∈ (0, γ ]such that

H((h, k), (h, k)) := L′′(x∗, λ∗, μ∗, η∗)(h, h)+ c|Eh|2 + c

m1∑i=1

|(li , h)X + ki |2

≥ τ(|h|2 + |k2Rm1 |)

for all c ≥ c, h ∈ X, and k ∈ Rm1+ .

Proof. Step 1. An equivalent characterization of C is derived by eliminating linear depen-dencies in the definition of C. The cone in (3.1.6) can be equivalently expressed as

C = {h ∈ X : Eh = 0 and(li , h

)X≤ 0 for all i ∈ I1}. (3.2.4)

We next show that⎧⎪⎪⎪⎨⎪⎪⎪⎩there exist m2 ≤ m2 and vectors ni, i = m1 + 1, . . . , m1 + m2,

such that E : X→ Y × Rm2 defined by

Eh = (e′(x∗)h,

(nm1+1, h

)X, . . . ,

(nm1+m2 , h

)X

)is surjective and ker E = ker E.

(3.2.5)

To verify (3.2.5) letni = Pker e′ li ,wherePker e′ denotes the projection onto ker e′(x∗).ChooseI2 ⊂ I2 such that {ni}i∈I2

are linearly independent and span{ni}i∈I2= span{ni}i∈I2 . Possibly

after reindexing we have {ni}i∈I2= {ni}m1+m2

m1+1 .We show that ker E = ker E. Let h ∈ ker E.

Then e′(x∗)h = 0. For every i ∈ I2 let li = ni + l⊥i , where l⊥i ∈ (ker e′(x∗))⊥. Then

Page 87: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 69

3.2. Augmentability and sufficient optimality 69

(ni, h

)X= (

li , h)X= 0, i ∈ I2, and therefore h ∈ ker E. The converse, ker E ⊂ ker E, is

proved similarly. To verify surjectivity of E, observe that the adjoint E∗ : W×Rm2 → X ofE is given by E∗(w, r) = e′(x∗)∗w+∑

i∈I2rini . If E∗(w, r) = 0, then, using

∑i∈I2

rini ∈ker e′(x∗) and e′(x∗)∗w ∈ (ker e′(x∗))⊥, we have

∑i∈I2

rini = 0 and e′(x∗)∗w = 0.

Therefore (w, r) = 0, ker E∗ = 0, and range E = W ×Rm2 , by the closed range theorem.Thus (3.2.5) holds.

Finally we consider the set of vectors {ni}i∈I1 with ni = PEli . After possible rear-rangement of indices there exists a subset I1 = {1, . . . , m1} ⊂ I1 such that {ni}i∈I1

arelinearly independent in ker E. The cone C can therefore equivalently be expressed as

C = {h ∈ X : h ∈ ker E and

(ni, h

)X≤ 0 for all i ∈ I1

}.

To exclude trivial cases we assume throughout that I1 and I2 are nonempty.Step 2. Here we characterize the polar cone

C∗1 ={h ∈ ker E : (h, h)

X≤ 0 for all h ∈ C1

}of the closed convex cone

C1 ={h ∈ ker E : (ni, h

)X≤ 0 for all i ∈ I1

}.

Denoting by P and P ∗ the canonical projections in ker E onto C1 and C∗1 , respectively,every element h ∈ ker E can be uniquely expressed as h = h1+h2 with

(h1, h2

)X= 0 and

h1 = Ph ∈ C1, h2 = P ∗h ∈ C∗1 (see [Zar, p. 256]). Moreover

C1 =⋂i∈I1

Ci with Ci = {h ∈ ker E : (ni, h)X≤ 0}

and

C∗1 =(⋂

Ci

)∗ = co⋃

C∗i = co⋃

C∗i

(see [Zar]), with co denoting convex closure. It follows that

C∗1 ={ m1∑

i=1

αini : (α1, . . . , αm1) ∈ Rm1+

}. (3.2.6)

Step 3. Let K1 be chosen such that

K1

m1∑i=1

|αi |2 ≤∣∣∣∣∣m1∑i=1

αini

∣∣∣∣∣2

for every (α1, . . . , αm1) ∈ Rm1 . For arbitrary h =∑m1i=1 αini ∈ C∗1 we have

|h|2 =m1∑i=1

αi

(ni, h

)X≤

∑I+1

αi〈ni, h)X,

Page 88: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 70

70 Chapter 3. First Order Augmented Lagrangians

where I+1 = {i ∈ I1 :(ni, h

)X> 0 and αi > 0}. Consequently

|h|2 ≤⎛⎝∑

I+1

α2i

⎞⎠⎛⎝∑I+i

(ni, h

)2X

⎞⎠1/2

≤ K−1/21 |h|

⎛⎝∑I+1

(ni, h

)2X

⎞⎠1/2

and

K1|h|2 ≤∑I+1

(ni, h

)2X

for every h ∈ C∗1 . (3.2.7)

Step 4. Every h ∈ X can be uniquely decomposed into mutually orthogonal elementsas h = h1 + h2 + h3, where h1 ∈ C1 ⊂ ker E, h2 ∈ C∗1 ⊂ ker E, and h3 ∈ ker E⊥.

By Step 2 there exists a vector (α1, . . . , αm1) ∈ Rm1 such that h2 =∑m1i=1 αini. Note

that(ni, h1

)X≤ 0 for all i ∈ I1. Therefore, using the fact that

(h1, h2

)X= ∑m1

i=1 αi

(h1,

ni

)X= 0, it follows that{

if αi > 0 for some i = 1, . . . , m1, then(ni, h1

)X= (

PEli, h1)X= (

li , h1)X= 0.

(3.2.8)

We shall use Step 3 with I+1 = {i ∈ I1 :(ni, h2

)X> 0 and αi > 0}. Let k ∈ Rm1+ and c > 0.

Then by (3.1.6) and (3.2.8) we have

H((h, k), (h, k)) = L′′(x∗, λ∗, μ∗, η∗)(h, h)+ c|Eh3|2+ c

∑I+1

∣∣(li , h3)X+ (

li , h2)X+ ki

∣∣2 + c∑I1\I+1

∣∣(li , h)X + ki∣∣2

≥ γ |h1|2 + 2L′′(x∗, λ∗, μ∗, η∗)(h1, h2 + h3)

+ L′′(x∗, λ∗, μ∗, η∗)(h2 + h3, h2 + h3)

+ c|Eh3|2 + c1

2

∑I+1

((ni, h2

)X+ ki

)2 − c1

∑I+1

(li , h3

)2X

+ c2

2

∑I1\I+1

|ki |2 − c2

∑I1\I+1

(li , h

)2X

for all 0 < c1, c2 ≤ c to be chosen below. Here we used (a + b)2 ≥ 12a

2 − b2 and thefact that

(li , h2

) = (ni, h2

)for i ∈ I1. To estimate Eh3 from below we make use of the

fact that E is an isomorphism from (ker E)⊥ to W ×Rm2 . Hence there exists K2 > 0 suchthat K2|h3| ≤ |Eh3| ≤ |Eh3| for all h3 ∈ (ker E)⊥. Setting L = ∥∥L′′

(x∗, λ∗, μ∗)∥∥ , c3 =

min(c1, c2), and c4 = max(c1, c2) we obtain

H((h, k), (h, k)) = γ |h1|2 − 2L|h1|(|h2| + |h3|)− L(|h2| + |h3|)2

+ cK22 |h3|2 + c1

2

∑I+1

(ni, h2

)2X+ c3

2

∑I1

k2i

− 3c4

∑I1

(li , h3

)2X− 3c2

∑I1−I+1

((li , h1

)2X+ (

li , h2)2X

),

where we used the fact that(ni, h2

)X≥ 0 for i ∈ I+1 .

Page 89: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 71

3.2. Augmentability and sufficient optimality 71

Setting |l|2 =∑I1|li |2 and using (3.2.7) we find

H((h, k), (h, k)) ≥ γ |h1|2 + c1

2K1|h2|2 + cK2

2 |h3|2 + c3

2

∑I1

k2i

− 2L|h1|(|h2| + |h3|)− 2L(|h2|2 + |h3|2)− 3c2|l|2|h1|2 − 3c2|l|2|h2|2 − 3c4|l|2|h3|2

≥ γ |h1|2 + c1

2K1|h2|2 + cK2

2 |h3|2 + c3

2

∑I1

k2i

− 2L(γ

4

)1/2 (γ4

)−1/2 |h1||h2| − 2L(γ

4

)1/2 (γ4

)−1/2 |h1||h3|− 2L|h2|2 − 2L|h3|2 − 3c2|l|2|h1|2 − 3c2|l|2|h2|2 − 3c4|l|2|h3|2

≥ |h1|2(γ − γ

2− 3c2|l|2

)+ |h2|2

(c1

2K1 − 4

γL2 − 2L− 3c2|l|2

)+ |h3|2

(cK2

2 −4

γL2 − 2L− 3c4|l|2

)+ c3

2

∑I1

k2i .

We now choose the constants in the order c2, c1, and c such that the coefficients of|h1|2, |h2|2, and |h3|2, with c replaced by c, are positive. This implies the claim forevery c ≥ c.

It is worthwhile to point out the following corollary to Proposition 3.1 which iswell known in finite-dimensional spaces.

Corollary 3.2. Let E ∈ L(X,W) be surjective and let A ∈ L(X) be self-adjoint andcoercive on ker E; i.e., there exists γ > 0 such that

(Ax, x

) ≥ γ |x|2 for all x ∈ ker E.Then there exist constants τ > 0 and c > 0 such that

((A + cE∗E)x, x

)X≥ τ |x|2 for all

x ∈ X and c ≥ c.

This follows from Proposition 3.1 with A = L′′(x∗).

Proposition 3.3. Let (3.1.2)–(3.1.6) hold. Then there exists σ > 0 such that

L′′c (x∗, u∗, λ∗, μ∗, η∗)((h, k), (h, k))+∑i∈I2

μ∗i ki ≥ σ(|h|2 + |k2|)

for all h ∈ X with |h| ≤ μ/(2c supi∈I2|li |), μ = minI2 μ

∗i and all k ∈ Rm with ki ≥ 0 for

i ∈ I1 ∪ I2, and c is as introduced in Proposition 3.1.

Proof. Let A = L′′c (x∗, u∗, λ∗, μ∗, η∗)((h, k), (h, k))+∑

i∈I2μ∗i ki . Then from (3.2.3) and

the definition of H

A = H((h, k), (h, k))+ c∑I3

∣∣(li , h)X + ki∣∣2 + c

∑I2

{|ki |2 + 2(li , h

)Xki}+∑

I2

μ∗i ki ,

Page 90: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 72

72 Chapter 3. First Order Augmented Lagrangians

where k denotes the first m1 coordinates of the vector k. By Proposition 3.1 we have forevery ε > 1

A ≥ τ(|h|2 + |k|2Rm1 )+ c

∑I3

((li , h

)2X(1− ε2)+ |ki |2(1− ε−2)

)+ c

∑I2

|ki |2 − 2c∑I2

∣∣(li , h)X∣∣ki +∑I2

μ∗i ki

≥ τ(|h|2 + |k|2Rm1 )+ c(1− ε−2)

∑I2∪I3

|ki |2

+ c(1− ε2)|h|2∑I3

|li |2 − 2c∑I2

∣∣(li , h)X∣∣ki +∑I2

μ∗i ki .

Now choose ε > 1 such that τ2 = (ε2 − 1)τ

∑I3|li |2. Then using the constraint for h we

find that

A ≥ 1

2τ(|h|2 + |k|2

Rm1 )+ c(1− ε−2)∑I2∪I3

k2i +

(μ− 2c sup

I2

|li ||h|)∑

I2

ki

≥ 1

2τ(|h|2 + |k|2

Rm1 )+ c(1− ε−2)∑I2∪I3

k2i .

This proves the claim for c = c. For arbitrary c ≥ c the assertion follows from the form ofL′′c (x∗, u∗, λ∗, μ∗).

Theorem 3.4. Assume that (3.1.2)–(3.1.6) hold. Then there exist constants σ > 0, c > 0and a neighborhood U(x∗,−g(x∗)) = U(x∗, u∗) of (x∗, u∗) such that

Lc(x, u, λ∗, μ∗, η∗)+ (

μ∗, u)Rm

= f (x)+ (λ∗, e(x)

)W+ (

μ∗, g(x)+ u)Rm +

(η∗, �(x)

)Z+ c

2|e(x)|2W +

c

2|g(x)+ u|2

Rm

≥ f (x∗)+ σ (|x − x∗|2 + |u− u∗|2)(3.2.9)

for all c ≥ c and (x, u) ∈ U(x∗, u∗) with ui ≥ 0 for i ∈ I1 ∪ I2.

Proof. By (3.1.4) we have(μ∗, u∗

) = −(μ∗, g(x∗)

) = 0. Moreover

Lc(x∗, u∗, λ∗, μ∗, η∗)+ (

μ∗, u∗)Rm = f (x∗),

(Lc)x(x∗, u∗, λ∗, μ∗, η∗) = (Lc)u(x

∗, u∗, λ∗, μ∗, η∗) = 0,

where (Lc)x and (Lc)u denote the partial derivatives of Lc with respect to x and u. Conse-quently

Lc(x∗, u∗, λ∗, μ∗, η∗)+ (

μ∗, u∗)Rm = f (x∗)+ (

μ∗, u− u∗)Rm

+L′′c (x∗, u∗, λ∗, μ∗, η∗)((x− x∗, u−u∗), (x− x∗, u−u∗))+ o(|x− x∗|2+ |u−u∗|2).

Page 91: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 73

3.2. Augmentability and sufficient optimality 73

The definitions of I1 and I3 imply that μ∗i = 0 for i ∈ I1 ∪ I3. Thus Proposition 3.3 impliesthe existence of a neighborhood U(x∗, u∗) of (x∗, u∗) and a constant σ > 0 such that (3.2.9)holds for all (x, u) ∈ U(x∗, u∗) with ui ≥ 0 for i ∈ I1 ∪ I2.

The conclusion of Theorem 3.4 is referred to as augmentability of problem (3.1.1).This means that a functional, which in our case is Lc(x, u, λ

∗, μ∗, η∗) + (μ∗, u∗

)Rm, can

be found with the property that it has (x∗, u∗) as a strict local minimizer under the simpleconstraint u ≥ 0, while the explicit nonlinear constraints are eliminated. Let us note that if(3.1.6) holds with C replaced by the larger set

C = {h ∈ X : e′(x∗)h = 0, g′i (x∗)h ≤ 0 for i ∈ I1 ∪ I2},

then the conclusion of Theorem 3.4 holds with Lc(x, u, λ∗, μ∗, η∗)+ (

μ∗, u∗)Rm replaced

by Lc(x, u, λ∗, μ∗, η∗). In case X is finite-dimensional, augmentability is analyzed in detail

in [Hes1], for example. The proof of augmentability in [Hes1] depends in essential manneron compactness of the unit ball. In [Be, p. 161 ff ], the augmentability analysis relies on thestrict complementarity assumption, i.e., I1 = ∅ is assumed.

We obtain, as immediate consequence of Theorem 3.4, that (3.1.6) provides a secondorder sufficient optimality condition.

Corollary 3.5. Assume that (3.1.2)–(3.1.6) hold. Then there exists a neighborhood U(x∗)of x∗ such that

f (x) ≥ f (x∗)+ σ |x − x∗|2for all x ∈ U(x∗) satisfying e(x) = 0, g(x) ≤ 0, and �(x) ∈ K.

Theorem 3.4 can be used as the basis to define augmented cost functionals in termsof x only with the property that x∗ is a uniform strict local unconstrained minimum. Thetwo choices we present differ in the way in which the inequality constraints are treated. Thefirst choice uses the classical cutoff penalty functional

g(x) = max(g(x), 0), (3.2.10)

and the second is the Bertsekas penalty functional

g(x, μ, c) = max

(g(x),−μ

c

), (3.2.11)

whereμ ∈ Rm and c > 0. In each of the two cases the max operation acts coordinatewise. Tomotivate the choice (3.2.11) we use the Lagrangian Lc(x, u, λ, μ, η) introduced in (3.2.2)for (3.2.1) and consider⎧⎨⎩min Lc(x, u, λ, μ, η) = fc(x, u)+

(λ, e(x)

)W

+(μ, g(x)+ u

)Rm +

(η, �(x)

)Z

over x ∈ X, u ≥ 0.(3.2.12)

Carrying out the constraint minimization with respect to u with x and μ fixed results in

u = u(x, μ) = max

(0,−

(g(x)+ μ

c

)),

Page 92: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 74

74 Chapter 3. First Order Augmented Lagrangians

and consequently

g(x)+ u(x, μ) = max

(g(x),−μ

c

)= g(x, μ, c). (3.2.13)

Corollary 3.6. Let (3.1.2)–(3.1.6) hold. Then there exists a neighborhood U(x∗) of x∗such that

f (x)+(λ∗, e(x)

)W+ (

μ∗, g(x))Rm +

(η∗, �(x)

)Z+ c

2|e(x)|2W +

c

2|g(x)|2

≥ f (x∗)+ σ |x − x∗|2

for all x ∈ U(x∗), and c ≥ c, where g(x) = max(g(x), 0).

Proof. Settingu = max(−g(x), 0) (3.2.14)

we have g(x) + u = g(x) and u ≥ 0. Recall the definition of U = U(x∗,−g(x∗)) fromTheorem 3.4. Determine a neighborhood U(x∗) such that x ∈ U(x∗) implies (x,−g(x)) ∈U and gi(x) ≤ 0 if gi(x∗) < 0. It is simple to argue that x ∈ U(x∗) implies (x, u) ∈ U,

where u is defined in (3.2.14). The claim now follows from Theorem 3.4.

Corollary 3.7. Let (3.1.2)–(3.1.6) hold and let r > 0. Then there exist constants δ > 0 andc = c(r) ≥ c such that

f (x)+(λ∗, e(x)

)X+ (

μ∗, g(x, μ, c))Rm +

(η∗, �(x)

)Z+ c

2|e(x)|2W +

c

2|g(x, μ, c)|2

≥ f (x∗)+ σ |x − x∗|2

for all c ≥ c, x ∈ Bδ = {x ∈ X : |x − x∗| ≤ δ}, and μ ∈ B+r = {μ ∈ Rm+ : |μ| ≤ r},where g(x, μ, c) = max(g(x),−μ

c).

Proof. Let ε > 0 be such that gi(x∗) ≤ −ε for all i ∈ I3 and such that |x − x∗| < ε

and |u − u∗| < ε implies (x, u) ∈ U(x∗, u∗), where U(x∗, u∗) is given in Theorem 3.4.Determine δ ∈ (0, ε) such that |x−x∗| ≤ δ implies |g(x)−g(x∗)| < ε

2 and choose c ≥ 2rε.

For c ≥ c and μ ∈ B+r we define

u = max

(0,−

c+ g(x)

)). (3.2.15)

Then g(x) + u = g(x) and u ≥ 0. To verify the claim it suffices to show that x ∈ Bδ

and μ ∈ B+r imply |u − u∗| < ε, where u is defined in (3.2.15). If i ∈ I1 ∪ I2, then|ui −u∗i | ≤ |gi(x∗)−gi(x)|+ μi

c. For i ∈ I3 we have μi

c+gi(x) ≤ μi

c+|gi(x∗)−gi(x)|+

gi(x∗) < r

c+ ε

2 − ε < 0 and consequently |ui − u∗i | = |gi(x∗) − gi(x) − μi

c|. Summing

over i = 1, . . . , m we find |u− u∗| ≤ 2|g(x∗)− g(x)|2 + 2c2 |μ|2 < ε2.

Remark 3.2.1. Note that

x → (μ, g(x, μ, c)

)+ c

2|g(x, μ, c)|2

Page 93: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 75

3.3. The first order augmented Lagrangian algorithm 75

is C1 if g is C1. This follows from the identity(μ, g(x, μ, c)

)Rm + c

2|g(x, μ, c)|2

Rm = 1

2c

(|max(0, μ+ cg(x))|2Rm − |μ|2Rm

). (3.2.16)

Here, as throughout, the norm on Rm is the Euclidean one. On the other hand,

x → (μ, g(x, μ, c)

)+ c

2|g(x, μ, c)|2

is not C1.

3.3 The first order augmented Lagrangian algorithmHere we describe the first order augmented Lagrangian algorithm which is a hybrid methodcombining the penalty technique and a Lagrangian method. Considering for a momentequality constraints only, the penalty method for (3.1.1) consists in minimizing

f (x)+ ck

2|e(x)|2

for a sequence of penalty parameters ck tending to ∞. The Lagrangian method relies onminimizing

f (x)+ (λk, e(x))

and updating λk as a maximizer of the dual problem associated to (3.1.1), which willbe defined below. The first order augmented Lagrangian method provides a systematictechnique for the multiplier update. Its convergence analysis will be given in the nextsection.

Let x∗ be a local solution of (3.1.1) with (3.1.2)–(3.1.6) holding. The algorithm willrequire startup values (λ0, μ0) ∈ W ×Rm+ for the Lagrange multipliers. We set r = |μ∗| +(|λ0−λ∗|2+|μ0−μ∗|2)1/2 and choose ε and c = c(r, ε) ≥ c as in the proof of Corollary 3.6.Recall that εwas chosen such thatgi(x∗) ≤ −ε for all indices i in the set of inactive indices I3

and such that |x−x∗| < ε, |u−u∗| < ε imply (x, u) ∈ U(x∗,−g(x∗))withU(x∗,−g(x∗))given in Theorem 3.4. Finally, let {cn}∞n=1 be a monotonically nondecreasing sequence ofpenalty parameters with c1 > c, and set σn = cn− c. It is not required that limn→∞ cn = ∞,as would be the case for penalty methods.

Algorithm ALM.

1. Choose (λ0, μ0) ∈ W × Rm+.

2. Let xn be a solution to

min Ln(x) over x ∈ X with �(x) ∈ K. (Pn)

3. Update the Lagrange multipliers

λn = λn−1 + σne(xn), μn = μn−1 + σng(xn, μn−1, cn),

Page 94: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 76

76 Chapter 3. First Order Augmented Lagrangians

where

Ln(x) = f (x)+ (λn−1, e(x)

)W+ (

μn−1, g(x, μn−1, cn))Rm

+ cn

2|e(x)|2W +

cn

2|g(x, μn−1, cn)|2.

Observe that μn = max(μn−1 + σng(xn), μn−1(1 − σncn)) and consequently μn ∈ Rn+ for

each n = 1, 2, . . . . Existence of solutions to (Pn) will be discussed in Section 3.4. It will beshown that (Pn) has a solution in the interior of Bδ (see Corollary 3.6) for all n sufficientlylarge, or for all n if c1 is sufficiently large, and that these solutions converge to x∗, while(λn, μn) converges to (λ∗, μ∗).

To motivate the Lagrange multiplier update in step 3 of the algorithm and to justifycalling this a first order algorithm, we consider problem (3.2.1) without the infinite rank in-equality constraint. Introducing Lagrange multipliers for the equality constraints e(x) = 0and g(x) + u = 0 we obtained the augmented Lagrange functional Lc in (3.2.12). Carry-ing out the minimization with respect to u and utilizing (3.2.13) suggests introducing themodified augmented Lagrangian functional

Lc(x, λ, μ) = f (x)+ (λ, e(x)

)W+ (

μ, g(x, μ, c))Rm + c

2|e(x)|2W +

c

2|g(x, μ, c)|2

Rm.

(3.3.1)Since Lc(x

∗, λ∗, μ∗) = f (x∗) we find

Lc(x∗, λ, μ) ≤ Lc(x

∗, λ∗, μ∗) ≤ Lc(x, λ∗, μ∗) for all x ∈ Bδ and μ ≥ 0, (3.3.2)

whenever c ≥ c. The first inequality follows from the fact that e(x∗) = g(x∗, μ) = 0 forμ ≥ 0 and the second one from Corollary 3.6. From the saddle point property (3.3.2) forLc(x, λ, μ) we deduce that

supλ,μ≥0

infx

Lc(x, λ, μ) ≤ Lc(x∗, λ∗, μ∗) ≤ inf

xsupλ,μ

Lc(x, λ, μ). (3.3.3)

For the following discussion we assume strict complementarity, i.e., I1 = ∅. Thenx → Lc(x, λ, μ) is twice continuously differentiable for (x, λ, μ) in a neighborhood of(x∗, λ∗, μ∗) and

L′′c (x∗, λ∗, μ∗)(h, h) = L′′(x∗, λ∗, μ∗)(h, h)+ c|Eh|2W for h ∈ X.

Thus by (3.1.6) and Corollary 3.2, L′′c (x∗, λ∗, μ∗) is positive definite on X for all c suffi-ciently large. Moreover (x, λ, μ)→ L′′c (x, λ, μ) is continuous in a neighborhood U(x∗)×U(λ∗, μ∗) of (x∗, λ∗, μ∗) and L′′c (x, λ, μ) ≥ τ |x|2 for τ > 0 independent of x ∈ X and(λ, μ) ∈ W × Rm. It follows that

minx∈U(x∗)

L′′c (x, λ, μ)

admits a unique solution x(λ, μ) for every (λ, μ) ∈ U(λ∗, μ∗) and as a consequence ofTheorem 2.24 (λ, μ) → x(λ, μ) is differentiable. Consequently the locally defined dualfunctional

d(λ, μ) = minx∈U(x∗)

L′′c (x, λ, μ)

Page 95: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 77

3.3. The first order augmented Lagrangian algorithm 77

is differentiable for (λ, μ) in a neighborhood of (λ∗, μ∗). For the gradient ∇d at (λ, μ) indirection (δλ, δμ) we obtain using (3.2.16)

∇d(λ, μ)(δλ, δμ) =(f ′(x)+ e′(x)∗λ+ g′(x)∗ max(0, μ+ cg(x)), xλ(δλ)+ xμ(δμ)

)+ (

δλ, e(x))+ (

δμ, g(x, μ, c)),

where x = x(λ, μ).

Utilizing the first order optimality condition this implies that the gradient of d(λ, μ)is given by

∇d(λ, μ) = (e(x(λ, μ)), g(x(λ, μ), μ, c)) ∈ W × Rm. (3.3.4)

Thus the multiplier updates in step 3 of Algorithm ALM are steepest ascent directions forthe dual problem

supλ,μ≥0

d(λ, μ).

In view of (3.3.3) the first order augmented Lagrangian algorithm is an iterativealgorithm combining minimization in the primal direction in step 2 and maximization in thedual direction on every level of the iteration.

Remark 3.3.1. In this chapter we identify X and W with their dual spaces and considere′(x)∗ as an operator from W to X. If, alternatively, e′(x)∗ is considered as an operatorfrom W ∗ to X∗, then the Lagrange multiplier is taken as an element of W ∗ and the Lagrangefunctional is defined as L(x, λ) = f (x) + 〈λ, e(x)〉W ∗,W . The relation between λ and λ

is given by I λ = λ, with I the canonical isomorphism from W to W ∗. If, for example,W = H−1(�), then W ∗ = H 1

0 (�) and I = (−�)−1. The Lagrange multiplier update instep 3 of Algorithm ALM is then given by λn = λn−1 + σnI e(xn).

As already mentioned the augmented Lagrangian algorithm ALM is also a hybridmethod combining Lagrange multiplier and penalty methods. Solving (Pn) without thepenalty terms we obtain the former, for λn = 0 and cn satisfying limn→∞ cn = ∞we obtainthe latter. The choice of cn for Algorithm ALM in practice is an important one. While nogeneral purpose techniques appear to be available, guidelines for the choice include choosingthem large enough such that the augmented Lagrangian has positive definite Hessian in thesense of Proposition 3.1. Large cn will improve the convergence estimates, as we shall seein the following section, and at the same time they can be the cause for ill-conditioning ofthe auxiliary problems (Pn). In fact, for large cn the Hessian of the cost functional Ln(x)

can have eigenvalues on significantly different scales. For further discussion on the choiceof the parameter c we refer the reader to [Be].

Let us make a few historical comments on the first order augmented Lagrangianalgorithm. It was originated by Hestenes [Hes2] and Powell [Pow] and extended amongothers by Bertsekas [Be] and in [PoTi, Roc2]. The infinite-dimensional case has received lessattention. In this respect we refer the reader to the work of Polyak and Tret’yakov [PoTr] andFortin and Glowinski [FoGl]. Both consider the case of augmenting equality constraintsonly. In [FoGl] augmented Lagrangian methods are also developed systematically forsolving nonlinear partial differential equations.

Page 96: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 78

78 Chapter 3. First Order Augmented Lagrangians

3.4 Convergence of Algorithm ALMWe shall use the following notation:

Lc(x, μ) = f (x)+ (λ∗, e(x)

)+ (μ∗, g(x, μ, c)

)+ (η∗, �(x)

)+ c

2|e(x)|2+ c

2|g(x, μ, c)|2.

Further we set r = |μ∗| + (|λ0 − λ∗|2 + |μ0 − μ∗|2)1/2. Corollary 3.7 then guarantees theexistence of δ > 0 and c = c(r) such that

Lc(x, μ)− f (x∗) ≥ σ |x − x∗|2 (3.4.1)

for all x ∈ Bδ, μ ∈ B+r , and c ≥ c. We also assume that Bδ is contained in the region ofapplicability of (3.4.1) and that {cn}∞n=1 is a monotonically nondecreasing sequence withc < c1. Then we have the following convergence properties of Algorithm ALM fromarbitrary initializations (λ0, μ0) ∈ W ×Rm+ in the case that suboptimal solutions are chosenin step 2.

Theorem 3.8. Assume that (3.1.2)–(3.1.5), (3.4.1) hold and that xn ∈ Bδ satisfy

Ln(xn) ≤ Ln(x∗) for each n = 1, . . . . (3.4.2)

If (λn, μn) are defined as in step 3 of Algorithm ALM with xn replaced by xn, then for n ≥ 1we have with σn = cn − c

σ |xn − x∗|2 + 1

2σn

(|λn − λ∗|2 + |μn − μ∗|2)

≤ 1

2σn

(|λn−1 − λ∗|2 + |μn−1 − μ∗|2). (3.4.3)

This implies that μn ∈ B+r for all n ≥ 1 and

|xn − x∗|2 ≤ 1

2σ σn

(|λ0 − λ∗|2 + |μ0 − μ∗|2) (3.4.4)

and

∞∑n=1

σn|xn − x∗|2 ≤ 1

2σ(|λ0 − λ∗|2 + |μ0 − μ∗|2). (3.4.5)

Proof. We proceed by induction and assume that the claim has been verified up to n−1. Forn = 1 the result can be obtained by the general arguments given below. For the inductionstep we observe that (3.4.3), which is assumed to hold up to n− 1, implies

|μn−1| ≤ |μ∗| + (|λ0 − λ∗|2 + |μ0 − μ∗|2)1/2 = r.

Consequently μn−1 ∈ B+r and (3.4.1) with μ = μn−1 is applicable. Using the fact that

[max(0, μn−1,i + cngi(x∗))]2 − μ2

n−1,i ≤ 0

Page 97: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 79

3.4. Convergence of Algorithm ALM 79

and (3.2.16) we find

Ln(x∗) = f (x∗)+ 1

2cn

m∑i=1

{[max(0, μn−1,i + cngi(x∗))]2 − μ2

n−1,i} ≤ f (x∗). (3.4.6)

Next we rearrange terms in Ln(xn) and obtain

Ln(xn) = f (xn)+(λ∗, e(xn)

)+ (λn−1 − λ∗, e(xn)

)+ (μ∗, g(xn, μn−1, cn)

)+ (

μn−1 − μ∗, g(xn, μn−1, cn))+ c

2|e(xn)|2 + 1

2(cn − c)|e(xn)|2

+ c

2|g(xn, μn−1, cn)|2 + 1

2(cn − c)|g(xn, μn−1, cn)|2

= L(xn, μn−1)+ 1

2σn

(|λn − λ∗|2 − |λn−1 − λ∗|2)

+ 1

2σn

(|μn − μ∗|2 − |μn−1 − μ∗|2)− (η∗, �(xn)

)Z.

Since η∗ ∈ K+ and �(xn) ∈ K we have(η∗, �(xn)

) ≤ 0. This fact together with the aboveequality, (3.4.2), and (3.4.6) implies

L(xn, μn−1)+ 1

2σn

(|λn − λ∗|2 − |λn−1 − λ∗|2)

+ 1

2σn

(|μn − μ∗|2 − |μn−1 − μ∗|2) ≤ Ln(xn) ≤ f (x∗).

Finally (3.4.1) implies that

σ |xn − x∗|2 + 1

2σn

|λn − λ∗|2 + 1

2σn

|μn − μ∗|2 ≤ 1

2σn

|λn−1 − λ∗|2 + 1

2σn

|μn−1 − μ∗|2,

which establishes (3.4.3). Estimates (3.4.4) and (3.4.5) follow from (3.4.3).

The following conditions will be utilized to guarantee existence of local solutions tothe auxiliary problems (Pn):⎧⎪⎪⎪⎨⎪⎪⎪⎩

f : X→ R and gi : X→ R, i = 1, . . . , m, are weakly lower

semicontinuous,

e : X→ W maps weakly convergent sequences to

weakly convergent sequences.

(3.4.7)

Further (PCn ) denotes problem (Pn) with the additional constraint that x is contained in the

closed ball Bδ. We refer to xn as the solution to (PCn ) if Ln(xn) ≤ Ln(x) for all x ∈ Bδ.

Proposition 3.9. If (3.1.2)–(3.1.5), (3.4.1), and (3.4.7) hold, then (P Cn ) admits a solution

xn for every n = 1, 2, . . . . Moreover, there exists n0 such that every solution xn of (P Cn )

Page 98: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 80

80 Chapter 3. First Order Augmented Lagrangians

satisfies xn ∈ int Bδ if n ≥ n0. If 1c1−c (|λ0 − λ∗|2 + |μ0 − μ∗|2) is sufficiently small, then

every solution xn of (P Cn ) is in int Bδ for every n = 1, 2, . . . .

Thus for n0 or c sufficiently large the solutions to (PCn ) are local solutions of the

unconstrained problems (Pn).

Proof. Let {xin}i∈N be a minimizing sequence for (PC

n ). There exists a weakly convergentsubsequence {xik

n }k∈N with weak limit xn ∈ Bδ. Condition (3.4.7) implies weak lowersemicontinuity of Ln and hence Ln(xn) ≤ lim inf ik Ln(x

ikn ) and xn is a solution of (PC

n ).In particular (3.4.2) holds with xn replaced by xn for n = 1, 2, . . . . Consequently

limn xn = x∗ and xn ∈ int Bδ for all n sufficiently large. Alternatively, if 1c1−c (|λ0 − λ∗|2 +

|μ0 − μ∗|2) is sufficiently small, then xn ∈ int Bδ for all n = 1, 2, . . . by (3.4.4).

For the solutions xn obtained in Proposition 3.9 the conclusions of Theorem 3.8hold, in particular limn xn = x∗ and the associated Lagrange multipliers {λn} and {μn} arebounded.

To investigate convergence of the Lagrange multipliers we introduce M : X →W × Rm1+m2 × Z defined by

M = (e′(x∗), g′ac(x∗), �′(x∗)),

where gac denotes the first m1 +m2 coordinates of g. The following additional hypothesiswill be needed:

There exists a constant κ > 0 such that∥∥f ′(x∗)− f ′(x)∥∥ ≤ κ|x∗ − x|,∥∥e′(x∗)− e′(x)

∥∥ ≤ κ|x∗ − x|,∥∥g′(x∗)− g′(x)∥∥ ≤ κ|x∗ − x|

(3.4.8)

for all x ∈ Bδ, andM is surjective. (3.4.9)

We point out that (3.4.9) is more restrictive than the regular point condition in Def-inition 1.5 of Chapter 1. Below we set μn,ac = col(μn,1, . . . , μn,m1+m2) and μ∗ac =col(μ∗1, . . . , μ

∗m1+m2

). Without loss of generality we shall assume that σ ≤ 1.

Theorem 3.10. Let (3.1.2)–(3.1.5), (3.4.1), and (3.4.8)–(3.4.9) hold and let xn be a solutionof (Pn) in int Bδ, n ≥ 1. Then there exists a constant K independent of n such that

|λn − λ∗| + |μn − μ∗| + |ηn − η∗| ≤ K√σ σn

(|λn−1 − λ∗| + |μn−1 − μ∗|) for all n ≥ 1.

Here ηn denotes the Lagrange multiplier associated with the constraint � ∈ K in (Pn).

Sufficient conditions for xn ∈ Bδ were given in Proposition 3.9 above.

Page 99: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 81

3.4. Convergence of Algorithm ALM 81

Proof. Optimality of xn and x∗ for (Pn) and (3.1.1) (see (3.1.3)) imply(f ′(xn), h

)+ (λn−1 + cne(xn), e

′(xn)h)+ (

max(0, μn−1 + cng(xn)), g′(xn)h

)+(

ηn, �′(xn)h

) = 0

and (f ′(x∗), h

)+ (λ∗, e′(x∗)h

)+ (μ∗, g′(x∗)h

)+ (η∗, �′(x∗)h

) = 0

for all h ∈ X. If we set λn = λn−1+ cne(xn) and μn = max(0, μn−1+ cng(xn)), we obtainthe following equation in X:

e′(x∗)∗(λn − λ∗)+ g′ac(x∗)∗(μn,ac − μ∗ac)+ �′(x∗)∗(ηn − η∗) = f ′(x∗)− f ′(xn)

+ (e′(x∗)∗ − e′(xn)∗)λn + (g′ac(x∗)∗ − g′ac(xn)

∗)μn,ac.

Here we used the fact that μn,i = 0 for i ∈ I3 by the choice of δ and c (see the proof ofCorollary 3.7), and we setμn,ac = col (μn,1, . . . , μn,m1+m2).Note thatM∗ : W×Rm1+m2 →X is given by M∗ = col (e′(x∗)∗, g′ac(x∗)∗). Hence we find

MM∗(λn − λ∗, μn,ac − μ∗ac, ηn − η∗) = M(f ′(x∗)− f ′(xn)

+ (e′(x∗)∗ − e′(xn)∗)λn + (g′ac(x∗)∗ − g′ac(xn)

∗)μn,ac

). (3.4.10)

Since λn − λ∗ = ce(xn) and μn − μn = cg(xn, μn−1, cn) we have

|λn − λn| = c|e(xn)− e(x∗)|,|μn,ac − μn,ac| ≤ c|gac(xn)− gac(x

∗)|. (3.4.11)

Thus by (3.4.8) and (3.4.3), (3.4.4) of Theorem 3.8 the sequences {λn} and {μn} are uniformlybounded. Consequently by (3.4.9) and (3.4.10) there exists a constant K such that |λn −λ∗| + |μn,ac − μ∗ac| ≤ K|xn − x∗|, and further by (3.4.11) a constant K can be chosensuch that

|λn − λ∗| + |μnac − μ∗ac| + |ηn − η∗| ≤ K|xn − x∗| for all n = 1, 2, . . . . (3.4.12)

The choice of δ implies that gi(xn) ≤ μn−1,i

cnfor all i ∈ I3 and therefore

μni =

c

cnμn−1

i for i ∈ I3, n ≥ 1. (3.4.13)

From (3.4.3), with xn replaced by xn, (3.4.12), and (3.4.13), and using σ ≤ 1 the theoremfollows.

Corollary 3.11. Under the assumptions of Theorem 3.10 we have

|λn − λ∗| + |μn − μ∗| + |ηn − η∗| ≤(

K√σ

)n n∏i=1

1√σi

(|λ0 − λ∗| + |μ0 − μ∗|)

for all n ≥ 1.

Page 100: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 82

82 Chapter 3. First Order Augmented Lagrangians

Remark 3.4.1. If the surjectivity requirement for M is replaced by assuming that M∗ issurjective, then the proof of Theorem 3.10 implies the existence of a constant K such that

|PM(λn − λ∗, μn,ac − μ∗ac, ηn − η∗)|W×Rm1+m2×Z ≤ K|xn − x∗|,where PM = M(M∗M)−1M∗ denotes the orthogonal projection of W × Rm1+m2 × Z ontothe range of M which is closed, since M∗ is surjective. Since xn → x∗ in X this impliesconvergence of PM(λn, μn,ac, ηn) to PM(λ∗, μ∗ac, η∗).

Remark 3.4.2. If a constraint of the type x ∈ C, with C a closed convex set in X, appears in(3.1.1), then Theorem 3.8 and Proposition 3.9 remain valid if Algorithm ALM is modifiedsuch that in (Pn) the functional Ln(x) is minimized over x ∈ C and xn appearing in (3.4.2)satisfies xn ∈ C. The stationarity condition (3.1.3) has to be replaced by

f ′(x∗)(x − x∗)+ (λ∗, e′(x∗)(x − x∗)

)W+ (

μ∗, g′(x∗)(x − x∗))Rm ≥ 0 (3.1.3′)

for all x ∈ C, in this case.

3.5 Application to a parameter estimation problemWe consider a least squares formulation for the estimation of the coefficient a in{

−div (a grad y) = f in �,

y = 0 in ∂�,(3.5.1)

where � is a bounded domain in Rn, with Lipschitz continuous boundary ∂� if n ≥ 2, andf ∈ H−1(�), f �= 0. Let Q be a Hilbert space that embeds compactly in L∞(�) and letN : Q→ R denote a seminorm on Q with ker(N) ⊂ span{1} and such that N(a) definesa norm on Q/R = {a ∈ Q : (1, a)

Q= 0} that is equivalent to the norm induced by Q. For

z ∈ H 10 (�), α > 0, and β > 0 we consider{

min 12 |y − z| 2

H 10 (�)

+ α2N

2(a)

subject to e(a, y) = 0, a(x) ≥ β,(3.5.2)

where e : Q×H 10 (�)→ H−1(�) is given by

e(a, y) = div (a grad y)+ f.

This is a special case of (3.1.1) with X = Q×H 10 (�), W = H−1(�), g = 0, and f the

cost functional in (3.5.2). The affine constraint a(x) ≥ β can be considered as an explicitconstraint as discussed in Remark 3.4.2 by setting C = {a ∈ Q : a(x) ≥ β}. Denote thesolution to (3.5.1) as a function of a by y(a). To guarantee the existence of a solution for(3.5.2) we require the following assumption:{

either N is a norm on Q

or inf {|y(a)− z|H 1 : a is constant } < |z|H 1 .(3.5.3)

Page 101: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 83

3.5. Application to a parameter estimation problem 83

Lemma 3.12. If (3.5.3) holds, then there exists a solution (a∗, y∗) of (3.5.2).

Proof. Let {(an, yn)}∞n=1 denote a minimizing sequence for (3.5.2). If N is a norm, thena subsequence argument implies the existence of a solution to (3.5.2). Otherwise we de-compose an as an = a1

n + a2n with a2

n ∈ Q/R and a1n ∈ (Q/R)⊥. Since {(an, yn)}∞n=1 is a

minimizing sequence, {N(a2n)}∞n=1 and consequently {|a2

n|L∞}∞n=1 are bounded. We have

a1n

∫�

|∇yn|2 dx = −∫�

a2n|∇yn|2 dx +

∫�

fyn dx.

Since a ≥ β implies a1 ≥ β, it follows that {a1n|∇yn|2L2}∞n=1, is bounded. If a1

n →∞, then|∇yn| = |∇y(an)| → 0 for n→∞. In this case

|z| ≤ limn→∞ |y(an)− z|2H 1(�) + βN2(an) ≤ inf {|y(a)− z|H 1 : a = constant} < |z|H 1 ,

which is impossible. Therefore {a1n}∞n=1 and as a consequence {an}∞n=1 are bounded, and a

subsequence of {(an, yn)} converges weakly to a solution of (3.5.2).

Using Theorem 1.6 one argues the existence of Lagrange multipliers (λ∗, η∗) ∈H−1(�)×Q∗ such that the Lagrangian

L(a, y, λ∗, η∗) = 1

2|y − z|2

H 10 (�)

+ α

2N2(a)+ 〈λ∗, e(a, y)〉H 1

0 ,H−1 + 〈η∗, β − a〉Q∗,Q

is stationary at (a∗, y∗), and 〈η∗, β − a∗〉Q∗,Q = 0, 〈η∗, h〉Q∗,Q ≥ 0 for all h ≥ 0. Hence(3.1.2)–(3.1.5) hold. Here (3.1.3)–(3.1.4) must be modified to include the affine inequalityconstraint with infinite-dimensional image space. We turn to the augmentability condition(3.4.1) and introduce the augmented Lagrangian

Lc(a, y, λ∗, η∗) = L(a, y, λ∗, η∗)+ c

2|e(a, y)|2H−1

for c ≥ 0.Henceforth we assume that (3.5.2) admits a solution (a0, y0) forα = 0.Such a solution

exists if z is attainable, i.e., if there exists a0 ∈ Q, a0 ≥ β, such that y(a0) = y and in thiscase y0 = y(a0). Alternatively, if the set of coefficients is further constrained by a normbound |a|Q ≤ γ, for example, then existence of a solution to (3.5.2) with α = 0 is alsoguaranteed.

Proposition 3.13. Let (a0, y0) denote a solution to (3.5.2) with α = 0, let (a∗, y∗) be asolution for α > 0, and choose γ ≥ |a∗|Q. Then, if |y0 − z|H 1 is sufficiently small, thereexist positive constants c0 and σ such that

Lc(a, u; λ∗, η∗) ≥ 1

2|y∗ − z|2H 1 + α

2N2(a∗)+ σ(|a − a∗|2Q + |y − y∗|2

H 10)

for all c ≥ c0 and (a, y) ∈ C ×H 10 (�) with |a|Q ≤ γ.

This proposition implies that (3.4.1) is satisfied and that Theorem 3.8 and Proposi-tion 3.9 are applicable with Bδ = {a ∈ Q : a(x) ≥ β, |a|Q ≤ γ } ×H 1

0 (�).

Page 102: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 84

84 Chapter 3. First Order Augmented Lagrangians

Proof. For (a, y) ∈ C ×H 10 (�) we have

Lc(a, y, λ∗, η∗)− Lc(a

∗, y∗, λ∗, η∗) = ∇L(a∗, y∗, λ∗, η∗)

− (h∇λ∗,∇v)

L2 + 1

2|v|2

H 10+ α

2N2(h)+ c

2|e(a, h)|2H−1 ,

where h = a − a∗, v = y − y∗, and ∇L denotes the gradient of L with respect to (a, y).

First order optimality implies that

Lc(a, y, λ∗, η∗)− Lc(a

∗, y∗, λ∗, η∗) ≥ −(h∇λ∗,∇v)

L2

+ 1

2|v|2

H 10+ α

2N2(h)+ c

2|e(a, h)|2H−1 =: D.

Introduce P = ∇ · (−�)−1∇ and note that P can be extended to an orthogonal projectionon L2(�)n. We find that

|e(a, y)|2H−1 = (P(a∇y − a∗∇y∗), a∇y − a∗∇y∗)

L2

= (P(a∇v + h∇y∗), h∇y∗ + a∇v)

L2

≥ 1

2

(P(h∇y∗), h∇y∗)

L2 −(P(a∇v), a∇v)

L2

≥ 1

2

(P(h∇y∗), h∇y∗)

L2 − |a|2L∞|v|2H 10

≥ 1

2

(P(h∇y∗), h∇y∗)

L2 − γ 2k21 |v|2H 1

0,

where k1 denotes the embedding constant ofQ intoL∞(�).Henceforth we consider the casethat ker N = span{1} and use the decomposition introduced in the proof of Lemma 3.12.We have

|e(a, y)|2 ≥ 1

2|h1|2|y∗|2H 1

0− |h1| |h2|L∞|y∗|2H 1

0− γ 2k2

1 |v|H 10.

Since Q embeds continuously into L∞(�) and N is a norm equivalent to the norm on Q/R,there exists a constant k2 such that |h2|L∞ ≤ k2N(h) for all h = h1+h2 ∈ Q. Consequently

|e(a, y)|2 ≥ 1

2|h1|2|y∗|2H 1

0− k2|h1|N(h2) |y∗|2H 1

0− γ 2k2

1 |v|H 10

(3.5.4)

for all (a, y) ∈ C×H 10 (�).Next note thatLy(a

∗, u∗; λ∗, η∗) = 0 implies that−∇(a∗∇λ∗) =−�(y∗ − y) in H−1(�), and hence∣∣(h∇λ∗,∇v)

L2

∣∣ ≤ 1

β(|h1| + k2N(h2))|y∗ − z|H 1

0|v|H 1

0. (3.5.5)

Setting c = δα with δ > 0 and observing that |f |H−1 ≤ k3|y∗|H 1 for a constant k3 dependingonly on β,

D = 1

2(1− δαγ 2k2

1)|v|2H 10− 1

β(|h1| + k2N(h2))|y∗ − z||v|H 1

0

+ α

2

(N2(h2)+ δ

2|y∗|2

H 10|h1|2 − k2

k23

δ|h1|N(h2) |f |2H−1

).

Page 103: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 85

3.5. Application to a parameter estimation problem 85

There exist δ > 0 and k4 > 0 such that

D ≥ k4(|v|2H 10+ |h1|2 +N2(h2))− 1

β(|h1| + k2N(h2)) |y∗ − z|H 1

0|v|H 1

0.

Since |h1| +N(h2) defines an equivalent norm on Q, the claim follows with c0 = δα.

It is worthwhile to comment on the auxiliary problems of Algorithm ALM. Theseinvolve the minimization of

Ln(a, y) = 1

2|y−z|2

H 10+ α

2N2(a)+〈λn−1, A(a)y+f 〉H 1

0 ,H−1+ cn

2|A(a)y+f |2H−1 (3.5.6)

over (a, y) ∈ C ×H 10 (�), where A(a)y = div (a grad y). The resulting optimality condi-

tions are given by

α

2(N2(an))

′(a − an)

+ (−∇λn−1 ∇ yn + cn ∇�−1(A(an)yn + f ) ∇yn, a − an)Ln ≥ 0 (3.5.7)

for all a ≥ an and

−�(yn − z)+ A(an)λn−1 + A(an)(−�)−1(A(an)yn + f ) = 0. (3.5.8)

The Lagrange multiplier is updated according to

λn = λn−1 + σn(−�)−1(A(an)yn + f ). (3.5.9)

To simplify (3.5.7), (3.5.8) for numerical realization one can proceed iteratively solv-ing (3.5.7) with yn replaced by yn−1 for an and then (3.5.8) for yn. A good initialization forthe variable λ is given by λ0 = 0. In fact, for small residue problems |y∗ − z|H 1

0is small

and then ∂∂y

L(a∗, y∗, λ∗, η∗) = 0 implies that λ∗ is small. Setting λ0 = 0 and y0 = z thecoefficient a1 is determined from

mina≥β

α

2N2(a)+ C1

2|A(a)z+ f |2H−1 . (3.5.10)

Thus the first step of the augmented Lagrangian algorithm for estimating a in (3.5.1) involvesa regularized equation error step. Recall that the equation error method for parameterestimation problems consists in solving the hyperbolic equation −div (a grad z) = f fora. This estimate is improved during the subsequent iterations. The first order augmentedLagrangian method can therefore be considered as a hybrid method combining the outputleast squares and the equation error approach to parameter estimation.

Let us close this section with some brief comments. If the H 1 least squares criterion isreplaced by an L2-criterion, then the first negative Laplacian in (3.5.8) must be replaced bythe identity operator. For the H 1-criterion the same number of differentiations are appliedto y in the least squares and the equation error term in (3.5.8). If one would approach theproblem of estimating a in (3.5.1) by first discretizing and then applying an optimizationstrategy, the discrete analogue of (−�)−1 can be interpreted as preconditioning. We assumedthat Q embeds compactly in L∞(�). This is the case, for example, if Q = H 1(�) whenn = 1, or Q = H 2(�) when n = 2 or 3, or if Q ⊂ L∞(�) is finite-dimensional.

Page 104: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 86

Page 105: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 87

Chapter 4

Augmented LagrangianMethods for Nonsmooth,Convex Optimization

4.1 IntroductionThe class of optimization problems that motivates this chapter is given by

min f (x)+ ϕ(�x) over x ∈ C, (4.1.1)

where X, H are real Hilbert spaces, C is a closed convex subset of X, and � ∈ L(X,H).In this chapter we identify H with its dual space and we distinguish between X and itsdual X∗. Further f : X → R is a continuously differentiable, convex function, and ϕ :H → (−∞,∞] is a proper, lower semicontinuous, convex but not necessarily differentiablefunction. This problem class encompasses a wide variety of optimization problems includingvariational inequalities of the first and second kind [Glo]. Our formulation here is slightlymore general than the one in [Glo, EkTe], since we allow the additional constraint x ∈ C.

For example, one can formulate regularized inverse scattering problems in the form(4.1.1):

min∫�

μ

2|∇u|2 + g |∇u| dx + 1

2

∫�

∣∣∣∣∫�

k(x, y)u(y) dy − z(x)

∣∣∣∣2

dx (4.1.2)

over u ∈ H 1(�) and u ≥ 0, where k(x, y) denotes a scattering kernel. The problemconsists in recovering the original image u defined on a domain � from scattered andnoisy data z ∈ L2(�). Here μ > 0 and g > 0 are fixed and should be adjusted to thestatistics of the noise. If μ = 0, then this problem is equivalent to the image enhancementalgorithm in [ROF] based on minimization of the BV-seminorm

∫�|∇u| ds. In this example

ϕ(v) = g∫�|v| ds, which is nondifferentiable. Several additional applications are treated

at the end of this chapter.We develop a Lagrange multiplier theory to deal with the nonsmoothness of ϕ. To

briefly explain the approach let x, λ ∈ H and c > 0, and define the family of generalizedYosida–Moreau approximations ϕc(x, λ) by

ϕc(x, λ) = infu∈H

{ϕ(x − u)+ (λ, u)H + c

2|u|2H

}. (4.1.3)

87

Page 106: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 88

88 Chapter 4. Augmented Lagrangian Methods for Nonsmooth, Convex Optimization

This is equivalent to an augmented Lagrangian approach. In fact, note that (4.1.1) is equiv-alent to

min f (x)+ ϕ(�x − u)

subject to x ∈ C and u = 0 in H.

(4.1.4)

Treating the equality constraint u = 0 in (4.1.4) by the augmented Lagrangian methodresults in the minimization problem

minx∈C,u∈H f (x)+ ϕ(�x − u)+ (λ, u)H + c

2|u|2H , (4.1.5)

where λ ∈ H is a multiplier and c is a positive scalar penalty parameter. Equivalently,problem (4.1.5) is written as

minx∈C Lc(x, λ) = f (x)+ ϕc(�x, λ). (4.1.6)

It will be shown that ϕc(u, λ) is continuously Fréchet differentiable with respect to u ∈ H .Moreover, if xc ∈ C denotes the solution to (4.1.6), then it satisfies

〈f ′(xc)+�∗λc, x − xc〉X∗,X ≥ 0 for all x ∈ C,

λc = ϕ′c(�xc, λc).

It will further be shown that under appropriate conditions the pair (xc, λc) ∈ C ×H has a(strong-weak) cluster point (x, λ) as c → ∞ such that x ∈ C is the minimizer of (4.1.1)and that λ ∈ H is a Lagrange multiplier in the sense that

〈f ′(x)+�∗λ, x − x〉X∗,X ≥ 0 for all x ∈ C (4.1.7)

with the complementarity condition

λ = ϕ′c(�x, λ) for each c > 0. (4.1.8)

System (4.1.7)–(4.1.8) for the pair (x, λ) is a necessary and sufficient optimality conditionfor problem (4.1.1). We analyze iterative algorithms of Uzawa and augmented Lagrangiantype for finding the optimal pair (x, λ) and present a convergence analysis. It will be shownthat condition (4.1.8) is equivalent to the complementarity condition λ ∈ ∂ϕ(�x). Thusthe frequently employed differential inclusion λ ∈ ∂ϕ(�x) is replaced by the nonlinearequation (4.1.8).

This chapter is organized as follows. In Sections 4.2–4.3 we present the basic convexanalysis and duality theory in Banach spaces. Section 4.4 is devoted to the generalizedYosida–Moreau approximation which is the basis for the remainder of the chapter. Condi-tions for the existence of Lagrange multiplier for (4.1.1) and optimality systems are derivedin Section 4.5. Section 4.6 is devoted to the augmented Lagrangian algorithm. Convergenceof both the augmented Lagrangian and the Uzawa methods are proved. Section 4.7 containsa large number of concrete applications.

Page 107: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 89

4.2. Convex analysis 89

4.2 Convex analysisIn this section we present standard results from convex analysis and duality theory in Banachspaces following, in part, [BaPe, EkTu]. Throughout X denotes a real Banach space.

Definition 4.1. (1) A functional F : X→ (−∞,∞] is called convex if

F((1− λ) x1 + λ x2) ≤ (1− λ) F (x1)+ λF(x2)

for all x1, x2 ∈ X and 0 ≤ λ ≤ 1. It is called proper if it is not identically∞.(2) A functional F : X → (−∞,∞] is said to be lower semicontinuous (l.s.c.) at

x ∈ X ifF(x) ≤ lim inf

y→xF (y).

A functional F is l.s.c. if it is l.s.c. at all x ∈ X.(3) A functional F : X→ (−∞,∞] is called weakly lower semicontinuous (w.l.s.c.)

at x ifF(x) ≤ lim inf

n→∞ F(xn)

for all sequences {xn} converging weakly to x. Further F is w.l.s.c. if it is w.l.s.c. at allx ∈ X.

(4) The subset D(F) = {x ∈ X : F(x) < ∞} of X is called the effective domainof F .

(5) The epigraph of F is defined by epi(F ) = {(x, c) ∈ X × R : F(x) ≤ c}.

Lemma 4.2. A functional F : X→ (−∞,∞] is l.s.c. if and only if its epigraph is closed.

Proof. The lemma follows from the fact that epi (F ) is closed if and only if (xn, cn) ∈epi(F )→ (x, c) in X × R implies F(x) ≤ c.

Lemma 4.3. A functional F : X→ (−∞,∞] is l.s.c. if and only if its level sets Sc = {x ∈X : F(x) ≤ c} are closed for all c ∈ R.

Proof. Assume that F is l.s.c. Let c > 0 and let {xn} be a sequence in Sc with limit x. ThenF(x) ≤ lim inf x→∞ F(xn) ≤ c and hence x ∈ Sc and Sc is closed. Conversely, assumethat Sc is closed for all c > 0, and let {xn} be a sequence converging to x in X. Choose asubsequence {xnk

} withlimk→∞F(xnk

) = lim infn→∞ F(xn),

and suppose that F(x) > lim inf n→∞ F(xn). Then there exists c ∈ R such that

lim infn→∞ F(xn) < c < F(x),

and there exists an index m such that F(xnk) < c for all k ≥ m. Since Sc is closed x ∈ Sc,

which is a contradiction to c < F(x).

Lemma 4.4. Assume that all level sets of the functional F : X → (−∞,∞] are convex.Then F is l.s.c. if and only if it is w.l.s.c. on X.

Page 108: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 90

90 Chapter 4. Augmented Lagrangian Methods for Nonsmooth, Convex Optimization

Proof. Assume that F is l.s.c. Suppose that {xn} converges weakly to x in X, and let {xnk}

be a subsequence such that d = lim inf F(xn) = limk→∞ F(xnk). By Lemma 4.3 the sets

{x : F(x) ≤ d+ ε} are closed for every ε > 0. By assumption they are also convex. Henceby Mazur’s lemma they are also weakly (sequentially) closed. Hence F(x) ≤ d + ε forevery ε > 0. Since ε is arbitrary, we have that F(x) ≤ d and F is w.l.s.c. The converseimplication is obvious.

Note that for a convex function all associated level sets are convex, but not viceversa.

Theorem 4.5. Let F be a proper, l.s.c., convex functional on X. Then F is bounded belowby an affine functional; i.e., there exist x∗ ∈ X∗ and c ∈ R such that

F(x) ≥ 〈x∗, x〉X∗,X + c for all x ∈ X.

Moreover, F is the pointwise supremum of a family of such continuous affine functionals.

Proof. Let x0 ∈ X and choose β ∈ R such that F(x0) > β. Since epi(F ) is a closed convexsubset of the product space X × R, it follows from the separation theorem for convexsets [EkTu] that there exists a closed hyperplane H ⊂ X × R given by

H = {(x, r) ∈ X × R : 〈x∗0 , x〉 + ar = α} with x∗0 ∈ X∗, a ∈ R, α ∈ R,such that

〈x∗0 , x0〉 + aβ < α < 〈x∗0 , x〉 + ar for all (x, r) ∈ epi(F ).

Setting x = x0 and r = F(x0), we have

〈x∗0 , x0〉 + aβ < α < 〈x∗0 , x0〉 + aF(x0)

and thus a (F (x0)− β) > 0. If F(x0) <∞, then a > 0 and thus

α

a− 1

a〈x∗0 , x〉 ≤ r for all (x, r) ∈ epi(F ),

β <α

a− 1

a〈x∗0 , x0〉 < F(x0).

Hence, b(x) = αa− 1

a〈x∗0 , x〉 is a continuous affine function on X such that b ≤ F and

the first claim is established with c = αa

and x∗ = −x∗0a

. Moreover β < b(x0) < F(x0).Therefore

F(x0) = sup{b(x0) = 〈x∗, x0〉 + c :x∗ ∈ X∗, c ∈ R, b(x) ≤ F(x) for all x ∈ X

}.

(4.2.1)

If F(x0) = ∞, either a > 0 (thus we proceed as above) or a = 0. In the latter caseα − 〈x∗0 , x0〉 > 0 and α − 〈x∗0 , x〉 < 0 on D(F). Since F is proper there exists an affinefunction b(x) = 〈x∗, x〉 + c such that b ≤ F . Thus

〈x∗, x〉 + c + θ (α − 〈x∗0 , x〉) < F(x)

Page 109: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 91

4.2. Convex analysis 91

for all x and θ > 0. Choosing θ > 0 large enough so that

〈x∗, x〉 + c + θ (α − 〈x∗0 , x〉) > β,

we have that b(x) = 〈x∗, x〉+ c+ θ (α−〈x∗0 , x〉) is a continuous affine function on X withb ≤ F and β < b(x0). Therefore (4.2.1) holds at x0 with F(x0) = ∞ as well.

Theorem 4.6. If F : X→ (−∞,∞] is convex and bounded above on an open set U , thenF is continuous on U .

Proof. We choose M ∈ R such that F(x) ≤ M − 1 for all x ∈ U . Let x be any element inU . Since U is open there exists a δ > 0 such that the open ball {x ∈ X : |x − x| < δ} iscontained in U . For any ε ∈ (0, 1), let θ = ε

M−F(x). Then for x ∈ X satisfying |x− x| < θ δ

we have ∣∣∣∣x − x

θ+ x − x

∣∣∣∣ = |x − x|θ

< δ.

Hence x−xθ+ x ∈ U . By convexity of F

F(x) ≤ (1− θ)F (x)+ θ F

(x − x

θ+ x

)< (1− θ)F (x)+ θ M,

and thus

F(x)− F(x) < θ(M − F(x)) = ε.

Similarly, x−xθ+ x ∈ U and

F(x) ≤ θ

1+ θF

(x − x

θ+ x

)+ 1

1+ θF (x) <

θM

1+ θ+ 1

1+ θF (x),

which impliesF(x)− F(x) > −θ(M − F(x)) = −ε.

Therefore |F(x)− F(x)| < ε if |x − x| < θδ and F is continuous in U .

Theorem 4.7. If F : X → (−∞,∞] is a proper, l.s.c., convex functional on X, and F isbounded above on a convex, open δ-neighborhood U of a bounded, convex set C, then F isLipschitz continuous on C.

Proof. By Theorem 4.5 and by assumption there exist constants M and m such that

m ≤ F(x) ≤ M for all x ∈ U .

Let x and x be in C with |x − x| ≤ δM−m , x �= x, and set θ = 2|x−x|

θ. Without loss of

generality we assume that 2M−m < 1 so that θ ∈ (0, 1). Then y = x−x

θ+ x ∈ U since x ∈ U

and |y − x| = x−x|θ≤ δ

2 . Due to convexity of F we have

F(x) ≤ (1− θ)F (x)+ θ

(F

(x − x

θ+ x

))≤ (1− θ)F (x)+ θ M

Page 110: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 92

92 Chapter 4. Augmented Lagrangian Methods for Nonsmooth, Convex Optimization

and hence

F(x)− F(x) ≤ θ(M − F(x)) ≤ 2

δ(M −m)|x − x|.

Similarly, x−xθ+ x ∈ U and

F(x) ≤ θ

1+ θF

(x − x

θ+ x

)+ 1

1+ θF (x) ≤ θ M

1+ θ+ 1

1+ θF (x),

which implies−θ(M −m) ≤ −θ(M − F(x)) ≤ F(x)− F(x)

and therefore

|F(x)− F(x)| ≤ 2

θ(M −m)|x − x| for |x − x| ≤ δ

M −m.

Since C is bounded and convex, Lipschitz continuity for all x, x ∈ C follows.

4.2.1 Conjugate and biconjugate functionals

Definition 4.8. The functional F ∗ : X∗ → [−∞,∞] defined by

F ∗(x∗) = sup {〈x∗, x〉X∗,X − F(x) : x ∈ X}is called the conjugate of F .

If F is bounded below by an affine functional 〈x∗, ·〉 − c, then

F ∗(x∗) = supx∈X{〈x∗, x〉 − F(x)} ≤ sup

x∈X{〈x∗, x〉 − 〈x∗, x〉 + c} = c,

and hence F ∗ is not identically ∞. Note also that D(F) = ∅ implies that F ∗ ≡ −∞ andconversely, if F ∗(x∗) = −∞ for some x∗ ∈ X, then D(F) is empty.

Example 4.9. If F(x) = 1p|x|p, x ∈ R, then F ∗(x∗) = 1

q|x∗|q for 1 < p < ∞ and

1p+ 1

q= 1.

Example 4.10. If F(x) is the indicator function of the closed unit ball of X, i.e., F(x) = 0for |x| ≤ 1 and F(x) = ∞ otherwise, then F ∗(x∗) = |x∗|. In fact, F ∗(x∗) = sup{〈 x∗, x〉 :|x| ≤ 1} = |x∗|.If F is a proper, l.s.c., convex functional on X, then by Theorem 4.5

F(x0) = sup {〈x∗, x0〉 − c : x∗ ∈ X∗, c ∈ R, 〈x∗, x〉 − c ≤ F(x) for all x ∈ X}for every x0 ∈ X. For x∗ ∈ X define

c(x∗) = inf {c ∈ R : c ≥ 〈x∗, x〉 − F(x) for all x ∈ X}

Page 111: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 93

4.2. Convex analysis 93

and observe thatF(x0) = sup

x∗∈X∗{〈x∗, x0〉 − c(x∗)}.

But c(x∗) = supx∈X{〈x∗, x〉 − F(x)} = F ∗(x∗) and hence

F(x0) = supx∗∈X∗

{〈x∗, x0〉 − F ∗(x∗)}.

This suggests the following definition.

Definition 4.11. For F : X→ (−∞,∞] the biconjugate functional F ∗∗ : X→ (−∞,∞]is defined by

F ∗∗(x) = supx∗∈X∗

{〈x∗, x〉 − F ∗(x∗)}.

Above we proved the following result.

Theorem 4.12. If F is a proper, l.s.c., convex functional, then F = F ∗∗.

It is simple to argue that (F ∗)∗ = F ∗∗ if X is reflexive. Next we prove some importantresults on conjugate functionals.

Theorem 4.13. For every functional F : X → (−∞,∞] the conjugate F ∗ is convex andl.s.c. on X∗.

Proof. If F is identically∞, then F ∗ is identically−∞. Otherwise D(F) is nonempty andF(x∗) > −∞ for all x∗ ∈ X∗, and

F ∗(x∗) = sup {〈x∗, x〉 − F(x) : x ∈ D(F)}.Thus F ∗ is the pointwise supremum of the family of continuous affine functionals x∗ →〈x∗, x〉 − F(x) for x ∈ D(F). This implies that F ∗ is convex. Moreover {x∗ ∈ X∗ :〈x∗, x〉 − F(x) ≤ c} is closed for each c ∈ R and x ∈ D(F). Consequently

{x∗ ∈ X∗ : F ∗(x∗) ≤ c} =⋂

x∈D(F)

{x∗ ∈ X∗ : 〈x∗, x〉 − F(x) ≤ c}

is closed for each c ∈ R. Hence F ∗ is l.s.c. by Lemma 4.3.

Theorem 4.14. For every F : X → (−∞,∞] the biconjugate F ∗∗ is convex, l.s.c., andF ∗∗ ≤ F . Moreover for each x ∈ X

F ∗∗(x) = sup {〈x∗, x〉−c : x∗ ∈ X∗, c ∈ R, 〈x∗, x〉−c ≤ F(x) for all x ∈ X}. (4.2.2)

Proof. If there is no continuous affine functional which is everywhere less than F , thenF ∗ ≡ ∞ and hence F ∗∗ ≡ −∞. In fact, if there exist c ∈ R and x∗ ∈ X∗ with c ≥ F ∗(x∗),then 〈x∗, ·〉 − c is everywhere less than F , which is impossible. The claims readily followin this case.

Page 112: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 94

94 Chapter 4. Augmented Lagrangian Methods for Nonsmooth, Convex Optimization

Otherwise, assume that there exists a continuous affine functional everywhere lessthan F . Then D(F ∗) is nonempty and F ∗∗(x) > −∞ for all x ∈ X. If F ∗(x∗) = −∞for some x∗ ∈ X∗, then F ∗∗ ≡ ∞, F ≡ ∞, and the claims follow. In the remaining caseF ∗(x∗) is finite for all x∗ and we have

F ∗∗(x) = sup{〈x∗, x〉 − F ∗(x∗) : x∗ ∈ X∗, F ∗(x∗) finite}.For every x∗ ∈ D(F ∗) we have

〈x∗, x〉 − F ∗(x∗) ≤ F(x) for all x ∈ X,

hence F ∗∗ ≤ F and

F ∗∗(x) ≤ sup{〈x∗, x〉 − c : x∗ ∈ X∗, c ∈ R, 〈x∗, x〉 − c ≤ F(x) for all x ∈ X}.Let x∗ and c be such that 〈x∗, x〉 − c ≤ F(x) for all x ∈ X. Then

〈x∗, x〉 − c ≤ 〈x∗, x〉 − F(x∗) for all x ∈ X.

Therefore

sup{〈x∗, x〉 − c : x∗ ∈ X∗, c ∈ R, 〈x∗, x〉 − c ≤ F(x) for all x ∈ X} ≤ F ∗∗(x),

and (4.2.2) follows. Moreover F ∗∗ is the pointwise supremum of a family of continuousaffine functionals, and it follows as in the proof of Theorem 4.13 that F ∗ is convex andl.s.c.

Theorem 4.15. If F : X→ (−∞,∞] is a convex functional which is finite and l.s.c. at x,then F(x) = F ∗∗(x).

For the proof of this result we refer the reader to [EkTu, p. 104], for example.

Theorem 4.16. For every F : X→ (−∞,∞], we have F ∗ = F ∗∗∗.

Proof. Since F ∗∗ ≤ F due to Theorem 4.14, F ∗ ≤ F ∗∗∗ by the definition of conjugatefunctions. The definition of F ∗∗ implies that

〈x∗, x〉 − F ∗∗(x) ≤ F(x∗) (4.2.3)

if F ∗∗(x) and F ∗(x∗) are finite. If F ∗(x∗) = ∞ or F ∗∗(x) = ∞, inequality (4.2.3) holdsas well. If F ∗(x∗) = −∞, we have that F and F ∗ are identically ∞. If F ∗∗(x) = −∞,then F ∗ is identically∞, and (4.2.3) holds for all (x, x∗) ∈ X ×X∗. Thus,

F ∗∗∗(x) = supx∈X{〈x∗, x〉 − F ∗∗(x)} ≤ F(x∗)

for all x∗ ∈ X∗. Hence F ∗∗∗ = F ∗.

Theorem 4.17. Let F : X → (−∞,∞] and assume that ∂F (x) �= ∅ for some x ∈ X.Then F(x) = F ∗∗(x).

Proof. Since ∂F (x) �= ∅ there exists a continuous affine functional � ≤ F with �(x) =F(x). Due to Theorem 4.14 we have � ≤ F ∗∗ ≤ F . It follows that �(x) = F ∗∗(x) = F(x),as desired.

Page 113: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 95

4.2. Convex analysis 95

4.2.2 Subdifferential

In this short section we summarize some important results on subdifferentials of functionalsF : X → (−∞,∞]. While F is not assumed to be convex, the applications that we havein mind are to convex functionals.

Definition 4.18. Let F : X → (−∞,∞]. The subdifferential of F at x is the (possiblyempty) set

∂F (x) = {x∗ ∈ X∗ : F(y)− F(x) ≥ 〈x∗, y − x〉 for all y ∈ X}.If ∂F (x) is nonempty, then F is called subdifferentiable at x and the set of all points whereF is subdifferentiable is denoted by D(∂F).

As a first observation, we note that x0 = argminx∈X F(x) if and only if 0 ∈ ∂F (x0).In fact, these two statements are equivalent to

F(x) ≥ F(x0)+ (0, x − x0) for all x ∈ X.

Example 4.19. Let F be Gâteaux differentiable at x, i.e., there exists w∗ ∈ X∗ such that

limt→0+

F(x + t v)− F(x)

t= 〈w∗, v〉 for all v ∈ X,

and w∗ ∈ X∗ is called the Gâteaux derivative of F at x. It is denoted by F ′(x). If in additionF is convex, then F is subdifferentiable at x and ∂F (x) = {F ′(x)}. Indeed, for v = y − x

F(x + t (y − x))− F(x)

t≤ F(y)− F(x), where 0 < t < 1.

As t → 0+ we obtain

〈F ′(x), y − x〉 ≤ F(y)− F(x) for all y ∈ X,

and thus F ′(x) ∈ ∂F (x). On the other hand, if w∗ ∈ ∂F (x), we find for y ∈ X and t > 0

F(x + t y)− F(x)

t≥ 〈w∗, y〉.

Taking the limit t → 0+, we obtain

〈F ′(x)− w∗, y〉 ≥ 0 for all y ∈ X.

This implies that w∗ = F ′(x).

Example 4.20. For F(x) = 12 |x|2, x ∈ X, we will show that ∂F (x) = F(x), where

F : X→ X∗ denotes the duality mapping. In fact, if x∗ ∈ F(x), then

〈x∗, x − y〉 = |x|2 − 〈x∗, y〉 ≥ 1

2(|x|2 − |y|2) for all y ∈ X.

Page 114: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 96

96 Chapter 4. Augmented Lagrangian Methods for Nonsmooth, Convex Optimization

Thus x∗ ∈ ∂ϕ(x). Conversely, if x∗ ∈ ∂ϕ(x), then

1

2(|y|2 − |x|2) ≥ 〈x∗, y − x〉 for all y ∈ X. (4.2.4)

We let y = t x, 0 < t < 1, and obtain

1+ t

2|x|2 ≤ 〈x∗, x〉

and thus |x|2 ≤ 〈x∗, x〉. Similarly, if t > 1, then |x|2 ≥ 〈x∗, x〉 and therefore |x|2 = 〈x∗, x〉and |x∗| ≥ |x|. On the other hand, setting y = x + t u, t > 0, in (4.2.4), we have

t 〈x∗, u〉 ≤ 1

2(|x + t u|2 − |x|2) ≤ t |u||x| + t2

2|u|2,

which implies 〈x∗, u〉 ≤ |u||x|. Hence |x∗| ≤ |x| and we obtain |x|2 = |x∗|2 = 〈x∗, x〉Example 4.21. Let K be a closed convex subset of X and let ψK be the indicator functionof K , i.e.,

ψK(x) ={

0 if x ∈ K,

∞ otherwise.

Obviously, ψK is convex and l.s.c. on X. By definition we have for x ∈ K

∂ψK(x) = {x∗ ∈ X∗ : 〈x∗, y − x〉 ≤ 0 for all y ∈ K}and ∂ψK(x) = ∅ if x /∈ X. Thus D(ψK) = D(∂ψK) = K and ∂ψK(x) = {0} for eachinterior point of K . Moreover, for x ∈ K , ∂ψK(x) coincides with the definition of thenormal cone to K at x.

Theorem 4.22. Let F : X→ (−∞,∞].(1) We have

x∗ ∈ ∂F (x) if and only if F(x)+ F ∗(x∗) = 〈x∗, x〉.(2) Assume that X is reflexive. Then x∗ ∈ ∂F (x) implies that x ∈ ∂F ∗(x∗). If

moreover F is convex and l.s.c., then x∗ ∈ ∂F (x) if and only if x ∈ ∂F ∗(x∗).

Proof. (1) Note that x∗ ∈ ∂F (x) if and only if

〈x∗, x〉 − F(x) ≤ 〈x∗, x〉 − F(x) (4.2.5)

for all x ∈ X. By the definition of F ∗ this implies F ∗(x∗) = 〈x∗, x〉 − F(x). Conversely,if F(x∗)+ F(x) = 〈x∗, x〉, then (4.2.5) holds for all x ∈ X.

(2) Since F ∗∗ ≤ F by Theorem 4.14, it follows from (1) that

F ∗∗(x) ≤ 〈x∗, x〉 − F ∗(x∗).


By the definition of F∗∗,

F∗∗(x) ≥ 〈x∗, x〉 − F∗(x∗).

Hence

F∗(x∗) + F∗∗(x) = 〈x∗, x〉.

Since X is reflexive, F∗∗ = (F∗)∗. Applying (1) to F∗ we have x ∈ ∂F∗(x∗). If in addition F is convex and l.s.c., it follows from Theorem 4.12 that F∗∗ = F. Thus if x ∈ ∂F∗(x∗), then

F(x) + F∗(x∗) = F∗(x∗) + F∗∗(x) = 〈x∗, x〉

by applying (1) to F∗. Therefore x∗ ∈ ∂F(x), again by (1).

Proposition 4.23. For F : X → (−∞,∞] the set ∂F(x) is closed and convex for every x ∈ X.

Proof. If F(x) = ∞, then ∂F(x) = ∅. Henceforth let F(x) < ∞. For every x∗ ∈ X∗ we have F∗(x∗) ≥ 〈x∗, x〉 − F(x), and hence by Theorem 4.22

∂F(x) = {x∗ ∈ X∗ : F∗(x∗) − 〈x∗, x〉 ≤ −F(x)}.

By Theorem 4.13 the functional x∗ → F∗(x∗) − 〈x∗, x〉 is convex and l.s.c. The claim follows from Lemma 4.3.

Theorem 4.24. If the convex function F is continuous at x̄, then ∂F(x̄) is not empty.

Proof. Since F is continuous at x̄, there exists for every ε > 0 an open neighborhood Uε of x̄ such that

F(x) ≤ F(x̄) + ε for x ∈ Uε.

Then Uε × (F(x̄) + ε, ∞) is an open set in X × R contained in epi F. Hence (epi F)°, the interior of epi F, is nonempty. Since F is convex, epi F is convex, and so is (epi F)°. Note that (x̄, F(x̄)) is a boundary point of epi F. Hence by the Hahn–Banach separation theorem there exists a closed hyperplane S = {(x, a) ∈ X × R : 〈x∗, x〉 + α a = β}, with nontrivial (x∗, α) ∈ X∗ × R and β ∈ R, such that

〈x∗, x〉 + α a > β for all (x, a) ∈ (epi F)°,
〈x∗, x̄〉 + α F(x̄) = β.    (4.2.6)

Since the closure of (epi F)° coincides with the closure of epi F, every neighborhood of a point (x, a) ∈ epi F contains an element of (epi F)°. Suppose 〈x∗, x〉 + α a < β. Then

{(x, a) ∈ X × R : 〈x∗, x〉 + α a < β}


is a neighborhood of (x, a) and contains an element of (epi F)°, which contradicts (4.2.6). Therefore

〈x∗, x〉 + α a ≥ β for all (x, a) ∈ epi F. (4.2.7)

Suppose α = 0. For any u ∈ Uε there is an a ∈ R such that F(u) ≤ a. Then from (4.2.7)

〈x∗, u〉 = 〈x∗, u〉 + α a ≥ β,

and thus, by (4.2.6),

〈x∗, u − x̄〉 ≥ 0 for all u ∈ Uε.

Choose δ > 0 such that |u − x̄| ≤ δ implies u ∈ Uε. For any nonzero element x ∈ X let t = δ/|x|. Then |(tx + x̄) − x̄| = |tx| = δ, so that tx + x̄ ∈ Uε. Hence

〈x∗, x〉 = 〈x∗, (tx + x̄) − x̄〉/t ≥ 0.

Similarly, −tx + x̄ ∈ Uε and

〈x∗, x〉 = 〈x∗, (−tx + x̄) − x̄〉/(−t) ≤ 0.

Thus 〈x∗, x〉 = 0 for all x ∈ X, i.e., x∗ = 0, which contradicts the nontriviality of (x∗, α). Therefore α is nonzero. From (4.2.6) and (4.2.7) we have, for (x̄, a) ∈ (epi F)° with a > F(x̄), that α(a − F(x̄)) > 0, and hence α > 0. Employing (4.2.6) and (4.2.7) again implies that

〈−x∗/α, x − x̄〉 + F(x̄) ≤ F(x)

for all x ∈ X, and therefore −x∗/α ∈ ∂F(x̄).

4.3 Fenchel duality theory

In this section we discuss some elements of duality theory. We call

inf_{x∈X} F(x) (P)

the primal problem, where X is a real Banach space and F : X → (−∞,∞] is a proper l.s.c. convex function. We have the following result on the existence of a minimizer.

Theorem 4.25. Let X be reflexive and let F be an l.s.c. proper convex functional defined on X satisfying

lim_{|x|→∞} F(x) = ∞. (4.3.1)

Then there exists an x̄ ∈ X such that

F(x̄) = inf{F(x) : x ∈ X}.

Proof. Let η = inf{F(x) : x ∈ X} and let {xn} be a minimizing sequence, so that lim_{n→∞} F(xn) = η. Condition (4.3.1) implies that {xn} is bounded in X. Since X is


reflexive, there exists a subsequence that converges weakly to some x̄ in X, and it follows from Lemma 4.4 that F(x̄) = η.

We embed (P) into a family of perturbed problems

inf_{x∈X} Φ(x, y), (Py)

where y ∈ Y is an embedding variable, Y is a Banach space, and Φ : X × Y → (−∞,∞] is a proper l.s.c. convex function with Φ(x, 0) = F(x). Thus (P0) = (P). For example, in terms of (4.1.1) we let

F(x) = f(x) + ϕ(Λx) (4.3.2)

and, with Y = H,

Φ(x, y) = f(x) + ϕ(Λx + y). (4.3.3)

Definition 4.26. The dual problem of (P) with respect to Φ is defined by

sup_{y∗∈Y∗} (−Φ∗(0, y∗)). (P∗)

The value function of (Py) is defined by

h(y) = inf_{x∈X} Φ(x, y), y ∈ Y.

In this section we analyze the relationship between the primal problem (P) and its dual (P∗). Throughout we assume that h(y) > −∞ for all y ∈ Y.

Theorem 4.27. sup (P∗) ≤ inf (P).

Proof. For any (x, y) ∈ X × Y and (x∗, y∗) ∈ X∗ × Y∗ we have

〈x∗, x〉 + 〈y∗, y〉 − Φ(x, y) ≤ Φ∗(x∗, y∗).

Thus

0 = 〈0, x〉 + 〈y∗, 0〉 ≤ Φ(x, 0) + Φ∗(0, y∗) = F(x) + Φ∗(0, y∗)

for all x ∈ X and y∗ ∈ Y∗. Therefore

sup_{y∗∈Y∗} (−Φ∗(0, y∗)) = sup (P∗) ≤ inf_{x∈X} F(x) = inf (P).

Lemma 4.28. h is convex.

Proof. The proof is established by contradiction. Suppose there exist y1, y2 ∈ Y and θ ∈ (0, 1) such that

θ h(y1) + (1 − θ) h(y2) < h(θ y1 + (1 − θ) y2).

Then there exist c and ε > 0 such that

θ h(y1) + (1 − θ) h(y2) < c − ε < c < h(θ y1 + (1 − θ) y2).

Set a1 = h(y1) + ε/θ and

a2 = (c − θ a1)/(1 − θ) = (c − ε − θ h(y1))/(1 − θ) > h(y2).

By the definition of h there exist x1, x2 ∈ X such that

h(y1) ≤ Φ(x1, y1) ≤ a1 and h(y2) ≤ Φ(x2, y2) ≤ a2.

Thus

h(θ y1 + (1 − θ) y2) ≤ Φ(θ x1 + (1 − θ) x2, θ y1 + (1 − θ) y2)
≤ θ Φ(x1, y1) + (1 − θ) Φ(x2, y2) ≤ θ a1 + (1 − θ) a2 = c,

which is a contradiction. Hence h is convex.

Lemma 4.29. For all y∗ ∈ Y∗, h∗(y∗) = Φ∗(0, y∗).

Proof.

h∗(y∗) = sup_{y∈Y} (〈y∗, y〉 − h(y)) = sup_{y∈Y} (〈y∗, y〉 − inf_{x∈X} Φ(x, y))
= sup_{y∈Y} sup_{x∈X} (〈y∗, y〉 − Φ(x, y)) = sup_{y∈Y} sup_{x∈X} (〈0, x〉 + 〈y∗, y〉 − Φ(x, y))
= sup_{(x,y)∈X×Y} (〈(0, y∗), (x, y)〉 − Φ(x, y)) = Φ∗(0, y∗).

Theorem 4.30. If h is l.s.c. at 0, then inf (P) = sup (P∗).

Proof. Since F is proper, h(0) = inf_{x∈X} F(x) < ∞. Since h is convex by Lemma 4.28, it follows from Theorem 4.15 that h(0) = h∗∗(0). Thus by Lemma 4.29

sup (P∗) = sup_{y∗∈Y∗} (−Φ∗(0, y∗)) = sup_{y∗∈Y∗} (〈y∗, 0〉 − h∗(y∗)) = h∗∗(0) = h(0) = inf (P).

Theorem 4.31. If h is subdifferentiable at 0, then inf (P) = sup (P∗) and ∂h(0) is the set of solutions of (P∗).

Proof. By Lemma 4.29 we have that y∗ solves (P∗) if and only if

−h∗(y∗) = −Φ∗(0, y∗) = sup_{z∗∈Y∗} (−Φ∗(0, z∗)) = sup_{z∗∈Y∗} (〈z∗, 0〉 − h∗(z∗)) = h∗∗(0).

By Theorem 4.22

h∗∗(0) + h∗∗∗(y∗) = 〈y∗, 0〉 = 0

if and only if y∗ ∈ ∂h∗∗(0). Since h∗∗∗ = h∗ by Theorem 4.16, we have y∗ ∈ ∂h∗∗(0) if and only if −h∗(y∗) = h∗∗(0). Consequently y∗ solves (P∗) if and only if y∗ ∈ ∂h∗∗(0). Since ∂h(0) is not empty, ∂h(0) = ∂h∗∗(0) by Theorem 4.17. Therefore ∂h(0) is the set of all solutions of (P∗), and (P∗) has at least one solution. Let y∗ ∈ ∂h(0). Then

〈y∗, y〉 + h(0) ≤ h(y)

for all y ∈ Y. If {yn} is a sequence in Y such that yn → 0, then

lim inf_{n→∞} h(yn) ≥ lim_{n→∞} 〈y∗, yn〉 + h(0) = h(0),

and h is l.s.c. at 0. By Theorem 4.30, inf (P) = sup (P∗).

Corollary 4.32. If there exists an x̄ ∈ X such that Φ(x̄, ·) is finite and continuous at 0, then h is continuous on an open neighborhood U of 0 and h = h∗∗. Moreover,

inf (P) = sup (P∗),

and ∂h(0) is the set of solutions of (P∗).

Proof. We first show that h is continuous. Clearly, Φ(x̄, ·) is bounded above on an open neighborhood U of 0. Since for all y ∈ Y

h(y) ≤ Φ(x̄, y),

h is bounded above on U. Since h is convex by Lemma 4.28, it is continuous by Theorem 4.6. Hence h = h∗∗ by Theorem 4.15. Moreover h is subdifferentiable at 0 by Theorem 4.24. The conclusion then follows from Theorem 4.31.

Example 4.33. Consider the case (4.3.2)–(4.3.3), i.e.,

Φ(x, y) = f(x) + ϕ(Λx + y),

where f : X → (−∞,∞] and ϕ : Y → (−∞,∞] are l.s.c. and convex, and Λ : X → Y is a continuous linear operator. Let us calculate the conjugate of Φ:

Φ∗(x∗, y∗) = sup_{x∈X} sup_{y∈Y} {〈x∗, x〉 + 〈y∗, y〉 − Φ(x, y)}
= sup_{x∈X} {〈x∗, x〉 − f(x) + sup_{y∈Y} [〈y∗, y〉 − ϕ(Λx + y)]},

where

sup_{y∈Y} [〈y∗, y〉 − ϕ(Λx + y)] = sup_{y∈Y} [〈y∗, Λx + y〉 − ϕ(Λx + y) − 〈y∗, Λx〉]
= sup_{z∈Y} [〈y∗, z〉 − ϕ(z)] − 〈y∗, Λx〉 = ϕ∗(y∗) − 〈y∗, Λx〉.


Thus

Φ∗(x∗, y∗) = sup_{x∈X} {〈x∗, x〉 − 〈y∗, Λx〉 − f(x) + ϕ∗(y∗)}
= sup_{x∈X} {〈x∗ − Λ∗y∗, x〉 − f(x)} + ϕ∗(y∗) = f∗(x∗ − Λ∗y∗) + ϕ∗(y∗).

Theorem 4.34. For x̄ ∈ X and ȳ∗ ∈ Y∗ the following statements are equivalent.

(1) x̄ solves (P), ȳ∗ solves (P∗), and min (P) = max (P∗).

(2) Φ(x̄, 0) + Φ∗(0, ȳ∗) = 0.

(3) (0, ȳ∗) ∈ ∂Φ(x̄, 0).

Proof. Clearly (1) implies (2). If (2) holds, then

Φ(x̄, 0) = F(x̄) ≥ inf (P) ≥ sup (P∗) ≥ −Φ∗(0, ȳ∗)

by Theorem 4.27. Thus

Φ(x̄, 0) = min (P) = max (P∗) = −Φ∗(0, ȳ∗).

Therefore (2) implies (1). Since 〈(0, ȳ∗), (x̄, 0)〉 = 0, the equivalence of (2) and (3) follows from Theorem 4.22.

Any solution ȳ∗ of (P∗) is called a Lagrange multiplier associated with Φ. For Example 4.33 the optimality condition implies

0 = Φ(x̄, 0) + Φ∗(0, ȳ∗) = f(x̄) + f∗(−Λ∗ȳ∗) + ϕ(Λx̄) + ϕ∗(ȳ∗)
= [f(x̄) + f∗(−Λ∗ȳ∗) − 〈−Λ∗ȳ∗, x̄〉] + [ϕ(Λx̄) + ϕ∗(ȳ∗) − 〈ȳ∗, Λx̄〉].    (4.3.4)

Since each expression of (4.3.4) in square brackets is nonnegative, it follows that

f(x̄) + f∗(−Λ∗ȳ∗) − 〈−Λ∗ȳ∗, x̄〉 = 0,
ϕ(Λx̄) + ϕ∗(ȳ∗) − 〈ȳ∗, Λx̄〉 = 0.

By Theorem 4.22

−Λ∗ȳ∗ ∈ ∂f(x̄),
ȳ∗ ∈ ∂ϕ(Λx̄).    (4.3.5)

The functional L on X × Y∗ defined by

−L(x, y∗) = sup_{y∈Y} {〈y∗, y〉 − Φ(x, y)} (4.3.6)


is called the Lagrangian. Note that

Φ∗(x∗, y∗) = sup_{x∈X, y∈Y} {〈x∗, x〉 + 〈y∗, y〉 − Φ(x, y)}
= sup_{x∈X} {〈x∗, x〉 + sup_{y∈Y} [〈y∗, y〉 − Φ(x, y)]} = sup_{x∈X} (〈x∗, x〉 − L(x, y∗)).

Thus

−Φ∗(0, y∗) = inf_{x∈X} L(x, y∗) (4.3.7)

and therefore the dual problem (P∗) is equivalent to

sup_{y∗∈Y∗} inf_{x∈X} L(x, y∗).

If Φ is a convex l.s.c. function that is finite at (x, y), then for the biconjugate of Φx : y → Φ(x, y) in y we have Φx∗∗(y) = Φ(x, y) and

Φ(x, y) = Φx∗∗(y) = sup_{y∗∈Y∗} {〈y∗, y〉 − Φx∗(y∗)} = sup_{y∗∈Y∗} {〈y∗, y〉 + L(x, y∗)}.

Hence

Φ(x, 0) = sup_{y∗∈Y∗} L(x, y∗) (4.3.8)

and the primal problem (P) can equivalently be written as

inf_{x∈X} sup_{y∗∈Y∗} L(x, y∗).

Thus by means of the Lagrangian L the problems (P) and (P∗) can be formulated as min-max problems, and by Theorem 4.27

sup_{y∗} inf_{x} L(x, y∗) ≤ inf_{x} sup_{y∗} L(x, y∗).

Theorem 4.35 (Saddle Point). Assume that Φ is a convex l.s.c. function that is finite at (x̄, 0). Then the following are equivalent.

(1) (x̄, ȳ∗) ∈ X × Y∗ is a saddle point of L, i.e.,

L(x̄, y∗) ≤ L(x̄, ȳ∗) ≤ L(x, ȳ∗) for all x ∈ X, y∗ ∈ Y∗. (4.3.9)

(2) x̄ solves (P), ȳ∗ solves (P∗), and min (P) = max (P∗).

Proof. Suppose (1) holds. From (4.3.7) and (4.3.9)

L(x̄, ȳ∗) = inf_{x∈X} L(x, ȳ∗) = −Φ∗(0, ȳ∗),

and from (4.3.8) and (4.3.9)

L(x̄, ȳ∗) = sup_{y∗∈Y∗} L(x̄, y∗) = Φ(x̄, 0).

Thus Φ(x̄, 0) + Φ∗(0, ȳ∗) = 0, and (2) follows from Theorem 4.34.

Conversely, if (2) holds, then from (4.3.7) and (4.3.8)

−Φ∗(0, ȳ∗) = inf_{x∈X} L(x, ȳ∗) ≤ L(x̄, ȳ∗),
Φ(x̄, 0) = sup_{y∗∈Y∗} L(x̄, y∗) ≥ L(x̄, ȳ∗).

Consequently −Φ∗(0, ȳ∗) = Φ(x̄, 0) by Theorem 4.34, and (4.3.9) holds.

Theorem 4.35 implies that the absence of a duality gap between (P) and (P∗) is equivalent to the saddle point property of the pair (x̄, ȳ∗).

For Example 4.33 we have

L(x, y∗) = f(x) + 〈y∗, Λx〉 − ϕ∗(y∗). (4.3.10)

If (x̄, ȳ∗) is a saddle point, then from (4.3.9)

−Λ∗ȳ∗ ∈ ∂f(x̄),
Λx̄ ∈ ∂ϕ∗(ȳ∗).    (4.3.11)

It follows from Theorem 4.22 that the second inclusion is equivalent to

ȳ∗ ∈ ∂ϕ(Λx̄),

and (4.3.11) is equivalent to (4.3.5), if Y is reflexive. Thus the necessary optimality system for

min_{x∈X} F(x) = f(x) + ϕ(Λx)

is given by

−Λ∗ȳ∗ ∈ ∂f(x̄),
ȳ∗ ∈ ∂ϕ(Λx̄).    (4.3.12)

4.4 Generalized Yosida–Moreau approximation

Definition 4.36 (Monotone Operator). Let A be a graph in X × X∗.

(1) A is monotone if

〈y1 − y2, x1 − x2〉 ≥ 0 for all x1, x2 ∈ D(A) and y1 ∈ Ax1, y2 ∈ Ax2.

(2) A is maximal monotone if every monotone extension of A coincides with A.


Let ϕ be an l.s.c. convex function on X. For x∗1 ∈ ∂ϕ(x1) and x∗2 ∈ ∂ϕ(x2),

ϕ(x1) − ϕ(x2) ≤ 〈x∗1, x1 − x2〉,
ϕ(x2) − ϕ(x1) ≤ 〈x∗2, x2 − x1〉.

It follows that 〈x∗1 − x∗2, x1 − x2〉 ≥ 0. Hence ∂ϕ is monotone in X × X∗. The following theorem characterizes maximal monotone operators by a range condition [ItKa].

Theorem 4.37 (Minty–Browder). Assume that X and X∗ are reflexive and strictly convex Banach spaces and let F : X → X∗ denote the duality mapping. Then a monotone operator A is maximal monotone if and only if R(λF + A) = X∗ for all λ > 0 (or, equivalently, for some λ > 0).

Theorem 4.38 (Rockafellar). Let X be a real Banach space. If ϕ is an l.s.c. proper convex functional on X, then ∂ϕ is a maximal monotone operator from X into X∗.

Proof. We prove the theorem for the case that X is reflexive. The general case is considered in [Roc2]. By Asplund's renorming theorem we can assume that, after choosing an equivalent norm, X and X∗ are strictly convex. Using the Minty–Browder theorem it suffices to prove that R(F + ∂ϕ) = X∗. For x∗0 ∈ X∗ we must show that the equation x∗0 ∈ Fx + ∂ϕ(x) has at least one solution x0. Note that Fx is single valued due to the fact that X∗ is strictly convex. Define the proper convex functional f : X → (−∞,∞] by

f(x) = ½|x|²X + ϕ(x) − 〈x∗0, x〉.

Since f is l.s.c. and f(x) → ∞ as |x| → ∞, there exists x0 ∈ D(f) such that f(x0) ≤ f(x) for all x ∈ X. The subdifferential of the mapping x → ½|x|² is given by the monotone operator F. Hence we find

ϕ(x) − ϕ(x0) ≥ 〈x∗0, x − x0〉 − 〈F(x), x − x0〉 for all x ∈ X.

Setting x(t) = x0 + t(u − x0) with u ∈ X and using the convexity of ϕ we have

ϕ(u) − ϕ(x0) ≥ 〈x∗0, u − x0〉 − 〈F(x(t)), u − x0〉.

Taking the limit t → 0+ and using the fact that x → F(x) is continuous from X endowed with the norm topology to X∗ endowed with the weak∗ topology, we obtain

ϕ(u) − ϕ(x0) ≥ 〈x∗0, u − x0〉 − 〈F(x0), u − x0〉,

which implies that x∗0 − F(x0) ∈ ∂ϕ(x0).

Throughout the remainder of this section let H denote a real Hilbert space which is identified with its dual H∗. Further let A denote a maximal monotone operator in H × H. Recall that A is necessarily densely defined and closed. Moreover, the resolvent


Jμ = (I + μA)⁻¹, with μ > 0, is a contraction defined on all of H; see [Bre, ItKa, Paz]. Moreover

|Jμx − x| → 0 as μ → 0+ for each x ∈ H. (4.4.1)

The Yosida approximation Aμ of A is defined by

Aμx = (1/μ)(x − Jμx).

The operator Aμ is single valued, monotone, everywhere defined, and Lipschitz continuous with Lipschitz constant 1/μ, and Aμx ∈ AJμx for all x ∈ H.

Let ϕ be an l.s.c., proper, and convex function on H. Throughout the remainder of this chapter let A denote the maximal monotone operator ∂ϕ on the Hilbert space H = H∗. For x, λ ∈ H and c > 0 define the functional ϕc(x, λ) by

ϕc(x, λ) = inf_{u∈H} {ϕ(x − u) + (λ, u)H + (c/2)|u|²H}. (4.4.2)

Then ϕc(x, λ) is a smooth approximation of ϕ in the following sense.

Theorem 4.39. For x, λ ∈ H the infimum in (4.4.2) is attained at the unique point

uc(x, λ) = x − J_{1/c}(x + λ/c).

The function x → ϕc(x, λ) is convex and Lipschitz continuously Fréchet differentiable in x with

ϕ′c(x, λ) = λ + c uc(x, λ) = A_{1/c}(x + λ/c).

Moreover, lim_{c→∞} ϕc(x, λ) = ϕ(x) and

ϕ(J_{1/c}(x + λ/c)) − (1/(2c))|λ|²H ≤ ϕc(x, λ) ≤ ϕ(x) for every x, λ ∈ H.

Proof. First we observe that (4.4.2) is equivalent to

ϕc(x, λ) = inf_{v∈H} {ϕ(v) + (c/2)|x + λ/c − v|²} − (1/(2c))|λ|², (4.4.3)

where v = x − u. Note that v → ψ(v) = ϕ(v) + (c/2)|v − y|², with y = x + λ/c, is convex and l.s.c. for every y ∈ H. Since ϕ is proper, D(A) ≠ ∅, and thus for x0 ∈ D(A) and x∗0 ∈ Ax0 we have

ϕ(x) ≥ ϕ(x0) + (x∗0, x − x0) for all x ∈ H.

Thus ψ(v) → ∞ as |v| → ∞. Hence there exists a unique minimizer v0 ∈ H of ψ. Let ξ ∈ H and define η = (1 − t)v0 + tξ for 0 < t < 1. Since ϕ is convex we have

ϕ(η) − ϕ(v0) ≤ t(ϕ(ξ) − ϕ(v0)). (4.4.4)


Moreover, since ψ(v0) ≤ ψ(η),

ϕ(η) − ϕ(v0) ≥ (c/2)(|v0 − y|² − |(1 − t)v0 + tξ − y|²) = tc(v0 − ξ, v0 − y) − (t²c/2)|v0 − ξ|². (4.4.5)

From (4.4.4) and (4.4.5) we obtain, dividing by t and taking the limit t → 0+,

ϕ(ξ) − ϕ(v0) ≥ c(y − v0, ξ − v0).

Since ξ ∈ H is arbitrary this implies that y − v0 ∈ (1/c)Av0. Thus v0 = J_{1/c}y and

uc(x, λ) = x − v0 = x − J_{1/c}(x + λ/c)

attains the infimum in (4.4.2). Note that this argument also implies that A is maximal.

For x1, x2 ∈ H and 0 < t < 1 set yi = xi + λ/c and vi = J_{1/c}yi, i = 1, 2. Using v = (1 − t)v1 + tv2 as a comparison element in (4.4.3) and the joint convexity of (v, y) → ϕ(v) + (c/2)|v − y|², we obtain

ϕc((1 − t)x1 + tx2, λ) ≤ (1 − t)ϕc(x1, λ) + tϕc(x2, λ),

so that x → ϕc(x, λ) is convex.

Next we show that ϕ′c(x, λ) = A_{1/c}(x + λ/c). For x̄ ∈ H, let ȳ = x̄ + λ/c ∈ H and v̄ = J_{1/c}ȳ. Then we have

ϕ(v̄) + (c/2)|v̄ − y|² ≥ ϕ(v0) + (c/2)|v0 − y|²

and

ϕ(v0) + (c/2)|v0 − ȳ|² ≥ ϕ(v̄) + (c/2)|v̄ − ȳ|².

Thus

(c/2)(|v0 − ȳ|² − |v0 − y|²) ≥ ϕc(x̄, λ) − ϕc(x, λ) ≥ (c/2)(|v̄ − ȳ|² − |v̄ − y|²). (4.4.6)

Since |v̄ − v0| ≤ |ȳ − y| = |x̄ − x| → 0 as |x̄ − x| → 0, it follows from (4.4.6) that

|ϕc(x̄, λ) − ϕc(x, λ) − (c(y − v0), x̄ − x)| / |x̄ − x| → 0

as |x̄ − x| → 0. Hence x → ϕc(x, λ) is Fréchet differentiable with F-derivative

c(y − v0) = λ + c uc(x, λ) = A_{1/c}(x + λ/c).
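To make these constructions concrete, the following small numerical sketch (our illustration, not part of the text; the helper names prox_abs and phi_c are ours) evaluates ϕc(x, λ) for the model case H = R, ϕ(x) = |x|, where J_{1/c} is the soft-thresholding map. It checks that ϕ′c(x, λ) = λ + c uc(x, λ) = A_{1/c}(x + λ/c) agrees with a finite-difference quotient of a direct numerical minimization of (4.4.2).

# Moreau-Yosida regularization of phi(x) = |x| on H = R: a sketch of Theorem 4.39.
# For phi = |.|, the resolvent J_mu = (I + mu*d|.|)^(-1) is soft-thresholding.
import math

def prox_abs(y, mu):                        # J_mu(y) = sign(y)*max(|y| - mu, 0)
    return math.copysign(max(abs(y) - mu, 0.0), y)

def phi_c(x, lam, c):
    # brute-force evaluation of (4.4.2): inf over u of |x-u| + lam*u + (c/2)*u^2
    return min(abs(x - u) + lam*u + 0.5*c*u*u
               for u in (-5.0 + 1e-4*i for i in range(100001)))

x, lam, c = 0.7, 0.3, 2.0
y   = x + lam/c
uc  = x - prox_abs(y, 1.0/c)                # u_c(x, lam) from Theorem 4.39
dphi = lam + c*uc                           # phi_c'(x, lam)
A    = c*(y - prox_abs(y, 1.0/c))           # Yosida approximation A_{1/c}(x + lam/c)
print(dphi, A)                              # the two expressions coincide
h = 1e-3                                    # finite-difference check in x
print((phi_c(x + h, lam, c) - phi_c(x - h, lam, c))/(2*h), dphi)
print(phi_c(x, lam, c), abs(x))             # phi_c(x, lam) <= phi(x), as asserted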


The dual representation of ϕc(x, λ) is derived next. Define the functional Φ(v, y) on H × H by

Φ(v, y) = ϕ(v) + (c/2)|v − (ȳ + y)|²,

where we set ȳ = x + λ/c. Consider the family of primal problems

inf_{v∈H} Φ(v, y) for each y ∈ H (Py)

and the dual problem

sup_{y∗∈H} {−Φ∗(0, y∗)}. (P∗)

If h(y) is the value function of (Py), i.e., h(y) = inf_{v∈H} Φ(v, y), then from (4.4.3) we have

ϕc(x, λ) = h(0) − (1/(2c))|λ|². (4.4.7)

From the proof of Theorem 4.39 it follows that h is continuously Fréchet differentiable with h′(0) = ϕ′c(x, λ). It thus follows from Theorem 4.31 that inf (P0) = max (P∗) and h′(0) is the solution of (P∗).

This leads to the following theorem.

Theorem 4.40. For x, λ ∈ H

ϕc(x, λ) = sup_{y∗∈H} {(x, y∗)H − ϕ∗(y∗) − (1/(2c))|y∗ − λ|²H}, (4.4.8)

where the supremum is attained at the unique point λc(x, λ), and we have

λc(x, λ) = λ + c uc(x, λ) = ϕ′c(x, λ) = A_{1/c}(x + λ/c). (4.4.9)

Proof. For (v∗, y∗) ∈ H × H we have by definition

Φ∗(v∗, y∗) = sup_{v∈H} sup_{y∈H} {(v∗, v) + (y∗, y) − ϕ(v) − (c/2)|v − (ȳ + y)|²}
= sup_{v∈H} {(v∗, v) + (y∗, v − ȳ) − ϕ(v) + sup_{y∈H} [−(y∗, v − (ȳ + y)) − (c/2)|v − (ȳ + y)|²]}
= sup_{v∈H} {(y∗ + v∗, v) − ϕ(v)} − (y∗, ȳ) + (1/(2c))|y∗|²
= ϕ∗(y∗ + v∗) − (y∗, ȳ) + (1/(2c))|y∗|².


Hence,

h(0) = sup_{y∗∈H} {−Φ∗(0, y∗)} = sup_{y∗∈H} {(y∗, ȳ) − ϕ∗(y∗) − (1/(2c))|y∗|²},

which implies (4.4.8) by (4.4.7) and since ȳ = x + λ/c. By Theorem 4.31 the supremum of y∗ → (x, y∗) − ϕ∗(y∗) − (1/(2c))|y∗ − λ|² is attained at the unique point h′(0) = λc(x, λ) given by (4.4.9).

The following theorem provides an equivalent characterization of λ ∈ ∂ϕ(x). This result will be used in the following section to replace the differential inclusion, which relates the primal and adjoint variables, by a nonlinear equation.

Theorem 4.41.

(1) If λ ∈ ∂ϕ(x) for x, λ ∈ H, then λ = ϕ′c(x, λ) for all c > 0.

(2) Conversely, if λ = ϕ′c(x, λ) for some c > 0, then λ ∈ ∂ϕ(x).

Proof. If λ ∈ ∂ϕ(x), then by Theorems 4.39, 4.40, and 4.22

ϕ(x) ≥ ϕc(x, λ) ≥ 〈λ, x〉 − ϕ∗(λ) = ϕ(x)

for every c > 0. Thus λ ∈ H attains the supremum in (4.4.8), and by Theorem 4.40 we have λ = ϕ′c(x, λ). Conversely, if λ ∈ H satisfies λ = ϕ′c(x, λ) for some c > 0, then uc(x, λ) = 0 by Theorem 4.40. Hence it follows from Theorem 4.39, (4.4.2), and Theorem 4.40 that

ϕ(x) = ϕc(x, λ) = 〈λ, x〉 − ϕ∗(λ),

and thus λ ∈ ∂ϕ(x) by Theorem 4.22.
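As an illustration (our addition, not from the text), the fixed point characterization of Theorem 4.41 can be verified directly for ϕ(x) = |x| on H = R, where ∂ϕ(0) = [−1, 1]: any |λ| ≤ 1 should satisfy λ = ϕ′c(0, λ), while for x ≠ 0 only λ = sign(x) is a fixed point.

# Fixed-point test of Theorem 4.41 for phi(x) = |x| on R (sketch).
import math

def prox_abs(y, mu):
    return math.copysign(max(abs(y) - mu, 0.0), y)

def dphi_c(x, lam, c):                 # phi_c'(x, lam) = A_{1/c}(x + lam/c)
    y = x + lam/c
    return c*(y - prox_abs(y, 1.0/c))

c = 5.0
print(dphi_c(0.0,  0.4, c))   # 0.4 : 0.4 lies in the subdifferential [-1,1] -> fixed
print(dphi_c(0.0,  1.7, c))   # 1.0 : 1.7 is not a subgradient at 0 -> not fixed
print(dphi_c(2.0,  1.0, c))   # 1.0 : sign(2) = 1 is the subgradient at 2 -> fixed
print(dphi_c(2.0, -0.5, c))   # 1.0 != -0.5: not a subgradient at x = 2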

4.5 Optimality systems

In this section we derive first order optimality systems for (4.1.1) based on Lagrange multipliers. In what follows we assume that f : X → R and ϕ : H → (−∞,∞] are l.s.c. convex functions, that f is continuously differentiable, and that ϕ is proper. Further Λ ∈ L(X, H) and C is a closed convex set in X. In addition we require that

(A1) f, ϕ are bounded below by zero on C,

(A2) 〈f′(x1) − f′(x2), x1 − x2〉X∗,X ≥ σ|x1 − x2|²X for some σ > 0 independent of x1, x2 ∈ C, and

(A3) ϕ(Λx0) < ∞ and ϕ is continuous at Λx0 for some x0 ∈ C.


As a consequence of (A2) we have

f(x) − f(x0) − 〈f′(x0), x − x0〉X∗,X = ∫_0^1 〈f′(x0 + t(x − x0)) − f′(x0), x − x0〉X∗,X dt ≥ (σ/2)|x − x0|²X (4.5.1)

for all x ∈ X. Due to Theorem 4.24 there exists y∗0 ∈ ∂ϕ(Λx0) such that

ϕ(Λx) − ϕ(Λx0) ≥ (y∗0, Λx − Λx0)H for all x ∈ X. (4.5.2)

Hence f(x) + ϕ(Λx) → ∞ as |x|X → ∞, and it follows from Theorem 4.25 and (A2) that there exists a unique minimizer x̄ ∈ C of (4.1.1).

Theorem 4.42 (Optimality). A necessary and sufficient condition for x̄ ∈ C to be the minimizer of (4.1.1) is given by

〈f′(x̄), x − x̄〉X∗,X + ϕ(Λx) − ϕ(Λx̄) ≥ 0 for all x ∈ C. (4.5.3)

Proof. Assume that x̄ is the minimizer of (4.1.1). Then for x ∈ C and 0 < t < 1 we have x̄ + t(x − x̄) ∈ C and

f(x̄ + t(x − x̄)) + ϕ(Λ(x̄ + t(x − x̄))) ≥ f(x̄) + ϕ(Λx̄).

Since

ϕ(Λ((1 − t)x̄ + tx)) − ϕ(Λx̄) ≤ t(ϕ(Λx) − ϕ(Λx̄)),

we obtain

t⁻¹(f(x̄ + t(x − x̄)) − f(x̄)) + ϕ(Λx) − ϕ(Λx̄) ≥ 0.

Taking the limit t → 0+, we obtain (4.5.3) for all x ∈ C.

Conversely, assume that x̄ ∈ C satisfies (4.5.3). Then, from (4.5.1),

f(x) + ϕ(Λx) − (f(x̄) + ϕ(Λx̄))
= f(x) − f(x̄) − 〈f′(x̄), x − x̄〉 + 〈f′(x̄), x − x̄〉 + ϕ(Λx) − ϕ(Λx̄) ≥ (σ/2)|x − x̄|²X

for all x ∈ C. Thus x̄ is the minimizer of (4.1.1).

Next, ϕ is split off from the optimality system (4.5.3) by means of an appropriately defined Lagrange multiplier. For this purpose we consider the regularized minimization problems

min f(x) + ϕc(Λx, λ) over x ∈ C (4.5.4)


for c > 0 and λ ∈ H. From Theorem 4.39 it follows that x → ϕc(Λx, λ) is convex, continuously differentiable, and bounded below by −(1/(2c))|λ|²H. Thus, for λ ∈ H, (4.5.4) has a unique solution xc ∈ C, and xc ∈ C is the solution of (4.5.4) if and only if xc satisfies

〈f′(xc), x − xc〉 + (ϕ′c(Λxc, λ), Λ(x − xc))H ≥ 0 for all x ∈ C. (4.5.5)

It follows from Theorems 4.39 and 4.40 that

ϕ′c(Λxc, λ) = A_{1/c}(Λxc + λ/c) = λc ∈ H. (4.5.6)

We have the following result.

Theorem 4.43 (Lagrange Multipliers). (1) Suppose that xc converges strongly to x̄ in X as c → ∞ and that {λc}c≥1 has a weak cluster point. Then each weak cluster point λ̄ ∈ H of {λc}c≥1 satisfies

λ̄ ∈ ∂ϕ(Λx̄) and
〈f′(x̄), x − x̄〉X∗,X + (λ̄, Λ(x − x̄))H ≥ 0 for all x ∈ C.    (4.5.7)

Conversely, if (x̄, λ̄) ∈ C × H satisfies (4.5.7), then x̄ minimizes (4.1.1).

(2) Assume that there exists λc ∈ ∂ϕ(Λxc) for each c ≥ 1 such that {|λc|H : c ≥ 1} is bounded. Then xc converges strongly to x̄ in X as c → ∞.

Proof. To verify (1), assume that xc → x̄ and let λ̄ be a weak cluster point of λc in H. Then, from (4.5.5)–(4.5.6), we conclude that (x̄, λ̄) ∈ C × H satisfies

〈f′(x̄), x − x̄〉 + (λ̄, Λ(x − x̄))H ≥ 0

for all x ∈ C. It follows from Theorems 4.39 and 4.40 that

(λc, vc)H − (1/(2c))|λc − λ|² = ϕc(vc, λ) + ϕ∗(λc) ≥ ϕ(J_{1/c}(vc + λ/c)) − (1/(2c))|λ|²H + ϕ∗(λc),

where vc = Λxc. Since J_{1/c}(vc + λ/c) → v̄ = Λx̄ and ϕ and ϕ∗ are l.s.c., letting c → ∞ we obtain

(λ̄, v̄)H ≥ ϕ(v̄) + ϕ∗(λ̄),

which implies that λ̄ ∈ ∂ϕ(v̄) by Theorem 4.22. Hence (x̄, λ̄) ∈ C × H satisfies (4.5.7).

Conversely, suppose that (x̄, λ̄) ∈ C × H satisfies (4.5.7). Then ϕ(Λx) − ϕ(Λx̄) ≥ (λ̄, Λ(x − x̄))H for all x ∈ C. Thus the inequality in (4.5.7) implies (4.5.3). It then follows from Theorem 4.42 that x̄ minimizes (4.1.1).

For (2) we note that from (4.5.5), by convexity of x → ϕc(Λx, λ), we have

〈f′(xc), x − xc〉 + ϕc(Λx, λ) − ϕc(Λxc, λ) ≥ 0 for all x ∈ C.

In particular, with x = x̄,

〈f′(xc), x̄ − xc〉 + ϕc(Λx̄, λ) − ϕc(Λxc, λ) ≥ 0.


Also, from (4.5.3),

〈f′(x̄), xc − x̄〉 + ϕ(Λxc) − ϕ(Λx̄) ≥ 0.

Adding these inequalities, we obtain

〈f′(x̄) − f′(xc), x̄ − xc〉 + ϕc(Λxc, λ) − ϕ(Λxc) ≤ ϕc(Λx̄, λ) − ϕ(Λx̄) ≤ 0. (4.5.8)

By (4.4.8), with λc ∈ ∂ϕ(vc) as in the statement of (2),

ϕ(vc) − (1/(2c))|λc − λ|²H = 〈vc, λc〉H − ϕ∗(λc) − (1/(2c))|λc − λ|²H ≤ ϕc(vc, λ), (4.5.9)

where vc = Λxc, and hence

ϕ(vc) − ϕc(vc, λ) ≤ (1/(2c))|λc − λ|²H.

Thus, from (4.5.8) and (A2),

(σ/2)|xc − x̄|²X ≤ (1/(2c))|λc − λ|²H,

and since {|λc|H}c≥1 is bounded, |xc − x̄|X → 0 as c → ∞.

The following lemma addresses the condition in Theorem 4.43 (2).

Lemma 4.44. (1) If D(ϕ) = H and ϕ is bounded above on bounded sets, then ∂ϕ(Λxc) is nonempty and the elements λc ∈ ∂ϕ(Λxc) can be chosen uniformly bounded in H for c ≥ 1.

(2) If ϕ = ψK with K a closed convex set in H and Λxc ∈ K for all c > 0, then λc can be chosen to be 0 for all c > 0.

Proof. (1) By the definition of xc and by Theorem 4.39

f(xc) + ϕc(vc, λ) ≤ f(x̄) + ϕ(Λx̄)

and

f(xc) + ϕ(J_{1/c}(vc + λ/c)) ≤ f(x̄) + ϕ(Λx̄) + (1/(2c))|λ|²H,

where vc = Λxc. Since ϕ is bounded below by 0, there exists a constant K1 > 0 independent of c ≥ 1 such that

f(xc) − f(x̄) − 〈f′(x̄), xc − x̄〉 ≤ K1 + ‖f′(x̄)‖|xc − x̄|X.

By (4.5.1)

(σ/2)|xc − x̄|²X ≤ K1 + ‖f′(x̄)‖|xc − x̄|X.

Thus there exists a constant M such that |xc − x̄|X ≤ M for c ≥ 1. Since ϕ is everywhere defined, convex, l.s.c., and assumed to be bounded above on bounded sets, it follows from


Theorem 4.7 that ϕ is Lipschitz continuous on the open set B = {v ∈ H : |v − v̄| < (M + 1)‖Λ‖}, where v̄ = Λx̄. Let L denote the Lipschitz constant. By Theorem 4.24, ∂ϕ(vc) is nonempty for c ≥ 1. Let λc ∈ ∂ϕ(vc) for c ≥ 1. Hence

L|v − vc| ≥ ϕ(v) − ϕ(vc) ≥ (λc, v − vc)

for all v ∈ B. Since vc ∈ B we have |λc| ≤ L, and the claim follows.

(2) In this case ∂ϕ(vc) is given by the normal cone NK(vc) = {w ∈ H : (w, vc − v)H ≥ 0 for all v ∈ K}, and thus 0 ∈ ∂ϕ(vc) for all c > 0.

Theorem 4.45 (Complementarity). Assume that there exists a pair (x̄, λ̄) ∈ C × H that satisfies (4.5.7). Then the complementarity condition λ̄ ∈ ∂ϕ(Λx̄) can equivalently be expressed as

λ̄ = ϕ′c(Λx̄, λ̄), (4.5.10)

and x̄ is the unique solution of

min_{x∈C} f(x) + ϕc(Λx, λ̄) (4.5.11)

for every c > 0.

Proof. The first claim follows directly from Theorem 4.41. From (4.5.5) we conclude that x̂ is a minimizer of (4.5.11) if and only if x̂ ∈ C satisfies

〈f′(x̂), x − x̂〉X∗,X + (ϕ′c(Λx̂, λ̄), Λ(x − x̂))H ≥ 0 for all x ∈ C.

From (4.5.7) and (4.5.10) it follows that x̄ ∈ C satisfies this inequality as well, and hence x̂ = x̄.

Theorems 4.43 and 4.45 imply that if a pair (x̄, λ̄) ∈ C × H satisfies

〈f′(x̄), x − x̄〉 + (λ̄, Λ(x − x̄)) ≥ 0 for all x ∈ C,
λ̄ = ϕ′c(Λx̄, λ̄)    (4.5.12)

for some c > 0, then x̄ is the minimizer of (4.1.1). Conversely, if x̄ is a minimizer of (4.1.1) and

∂(ϕ ∘ Λ + ψC)(x̄) = Λ∗∂ϕ(Λx̄) + ∂ψC(x̄), (4.5.13)

then there exists a λ̄ ∈ H such that the pair (x̄, λ̄) satisfies (4.5.12) for all c > 0. Here ψC denotes the indicator function of the set C. In fact, it follows from (4.5.3) that −f′(x̄) ∈ ∂(ϕ ∘ Λ + ψC)(x̄), and by (4.5.13)

−f′(x̄) ∈ Λ∗∂ϕ(Λx̄) + ∂ψC(x̄) = Λ∗∂ϕ(Λx̄) + NC(x̄),

where NC(x̄) = {z ∈ X∗ : 〈z, x − x̄〉 ≤ 0 for all x ∈ C}. This implies that there exists some λ̄ ∈ ∂ϕ(Λx̄) such that (4.5.7) holds, and hence also (4.5.12) for the pair (x̄, λ̄). Condition (4.5.13) holds, for example, if there exists x̃ ∈ int(C) such that ϕ is continuous and finite at Λx̃ (see, e.g., Propositions 12 and 13 in Section 3.2 of [EkTu]).


The following theorem establishes the equivalence between the existence of a Lagrange multiplier λ̄ and the uniform boundedness of λc.

Theorem 4.46. Suppose Λ is compact and let λ ∈ H be fixed. Then λc = ϕ′c(vc, λ) is uniformly bounded in c ≥ 1 if and only if there exists λ̄ ∈ ∂ϕ(Λx̄) such that (4.5.7) holds. In either case there exists a subsequence (xc, λc) ∈ X × H such that xc → x̄ strongly in X and λc ⇀ λ̄ weakly in H, where (x̄, λ̄) ∈ C × H satisfies (4.5.7).

Proof. Assume that |λc|H is uniformly bounded. From (4.5.5)

〈f′(xc) + Λ∗λc, x − xc〉 ≥ 0 for all x ∈ C,

where λc = ϕ′c(Λxc, λ). From (A2) it follows that, for fixed x ∈ C,

σ|xc − x|²X ≤ 〈f′(xc) − f′(x), xc − x〉 ≤ −〈f′(x) + Λ∗λc, xc − x〉 ≤ (‖Λ‖|λc| + ‖f′(x)‖)|xc − x|,

which by assumption implies that |xc|X is uniformly bounded. Since Λ is compact, every weakly convergent subsequence λc ⇀ λ̄ in H satisfies Λ∗λc → Λ∗λ̄ strongly in X∗. Again from (A2) we have

σ|xc − xc′|X ≤ |Λ∗λc − Λ∗λc′|X∗

for any c, c′ > 0 along such a subsequence. Hence {xc} is a Cauchy sequence in X, and thus there exists x̄ ∈ X such that |xc − x̄|X → 0. The rest of the proof of the "only if" part is identical to that of the first part of Theorem 4.43. It will be shown in the proof of Theorem 4.48 that

(σ/2)|xc − x̄|²X + (1/(2c))|λc − λ̄|²H ≤ (1/(2c))|λ − λ̄|²H

if (4.5.7) holds. This implies the "if" part.

4.6 Augmented Lagrangian method

In this section we discuss the augmented Lagrangian method for (4.1.1). Throughout we assume that (A1)–(A3) of Section 4.5 hold, and we set Lc(x, λ) = f(x) + ϕc(Λx, λ).

Augmented Lagrangian Method.

Step 1: Choose a starting value λ1 ∈ H and a positive number c, and set k = 1.

Step 2: Given λk ∈ H, find xk ∈ C such that

Lc(xk, λk) = min Lc(x, λk) over x ∈ C.

Step 3: Update λk by λk+1 = ϕ′c(Λxk, λk).

Step 4: If the convergence criterion is not satisfied, set k = k + 1 and go to Step 2.
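The following self-contained sketch (our illustration; all names are ours) runs the algorithm for the scalar model problem min_x ½(x − a)² + |x|, i.e., f(x) = ½(x − a)², Λ = I, C = X = R, ϕ = |·|, whose exact solution is the soft-thresholding x̄ = sign(a) max(|a| − 1, 0) with multiplier λ̄ = a − x̄. Step 2 is carried out by solving the smooth, monotone scalar equation x − a + ϕ′c(x, λk) = 0 by bisection, and Step 3 is the update λk+1 = ϕ′c(xk, λk).

# Augmented Lagrangian method (Steps 1-4) for min 0.5*(x-a)^2 + |x| on R. Sketch.
import math

def prox_abs(y, mu):
    return math.copysign(max(abs(y) - mu, 0.0), y)

def dphi_c(x, lam, c):                 # phi_c'(x, lam) = A_{1/c}(x + lam/c)
    y = x + lam/c
    return c*(y - prox_abs(y, 1.0/c))

def solve_step2(a, lam, c):
    # Step 2: the optimality condition g(x) = x - a + phi_c'(x, lam) = 0 has a
    # continuous, strictly increasing g; plain bisection suffices.
    g = lambda x: x - a + dphi_c(x, lam, c)
    lo, hi = -abs(a) - 2.0, abs(a) + 2.0       # g(lo) < 0 < g(hi)
    for _ in range(80):
        mid = 0.5*(lo + hi)
        if g(mid) > 0.0: hi = mid
        else: lo = mid
    return 0.5*(lo + hi)

a, c = 1.5, 1.0
xbar = math.copysign(max(abs(a) - 1.0, 0.0), a)    # exact minimizer (= 0.5)
lam = 0.0                                           # Step 1
for k in range(6):                                  # Steps 2-4
    x = solve_step2(a, lam, c)
    lam = dphi_c(x, lam, c)                         # Step 3
    print(k, x, lam, abs(x - xbar))
# x -> 0.5 and lam -> 1.0 = a - xbar after a couple of iterations,
# consistent with the monotone estimate (4.6.5) of Theorem 4.48 below.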


Before proving convergence of the augmented Lagrangian method we motivate the update of the Lagrange multiplier in Step 3. First, it follows from (4.5.12) that it is a fixed point iteration for the necessary optimality condition in the dual variable λ. For the second motivation we define the dual functional dc : H → R by

dc(λ) = inf f(x) + ϕc(Λx, λ) over x ∈ C (4.6.1)

for λ ∈ H. The update is then a steepest ascent step for dc(λ). In fact, it will be shown in the following lemma that (4.6.1) attains its minimum at a unique minimizer x(λ) ∈ C and that dc is continuously Fréchet differentiable with F-derivative

d′c(λ) = uc(Λx(λ), λ),

where uc(x, λ) is defined in Theorem 4.39. Thus the steepest ascent step with step size c is given by λk+1 = λk + c uc(Λx(λk), λk), which by Theorem 4.40 coincides with the update given in Step 3.

Lemma 4.47. For λ ∈ H and c > 0 the infimum in (4.6.1) is attained at a unique minimizer x(λ) ∈ C, and the mapping λ ∈ H → x(λ) ∈ X is Lipschitz continuous with Lipschitz constant ‖Λ‖/σ. Moreover, the dual functional dc is continuously Fréchet differentiable with F-derivative

d′c(λ) = uc(Λx(λ), λ),

where uc(v, λ) is defined in Theorem 4.39.

Proof. The proof is given in three steps.

Step 1. Since f(x) + ϕc(Λx, λ) is convex and l.s.c. and since (4.5.1) holds, there exists a unique x(λ) that attains its minimum over C. To establish the Lipschitz continuity of λ → x(λ) we note that x(λ) satisfies the necessary optimality condition

〈f′(x(λ)), x − x(λ)〉 + (ϕ′c(Λx(λ), λ), Λ(x − x(λ))) ≥ 0 for all x ∈ C. (4.6.2)

Using this inequality at λ, μ ∈ H, we obtain

〈f′(x(μ)) − f′(x(λ)), x(μ) − x(λ)〉 + (ϕ′c(Λx(μ), μ) − ϕ′c(Λx(λ), λ), Λ(x(μ) − x(λ))) ≤ 0.

By (A2) and (4.5.6) this implies that

σ|x(μ) − x(λ)|²X + (A_{1/c}(Λx(μ) + μ/c) − A_{1/c}(Λx(λ) + λ/c), Λ(x(μ) − x(λ))) ≤ 0,

and thus

σ|x(μ) − x(λ)|²X + (A_{1/c}(Λx(μ) + μ/c) − A_{1/c}(Λx(λ) + μ/c), Λ(x(μ) − x(λ)))
≤ (A_{1/c}(Λx(λ) + λ/c) − A_{1/c}(Λx(λ) + μ/c), Λ(x(μ) − x(λ))).


Since A_{1/c} is monotone and Lipschitz continuous with Lipschitz constant c, this inequality yields

σ|x(μ) − x(λ)|X ≤ ‖Λ‖|μ − λ|H,

which proves the first assertion.

Step 2. We show that for every v ∈ H the functional λ ∈ H → ϕc(v, λ) is Fréchet differentiable with F-derivative (∂/∂λ)ϕc(v, λ) given by uc(v, λ). Since (4.4.2) can equivalently be written as (4.4.3), it follows from the proof of Theorem 4.39 that λ ∈ H → ϕc(v, λ) is Fréchet differentiable and

(∂/∂λ)ϕc(v, λ) = (1/c)(λ + c uc(v, λ)) − λ/c = uc(v, λ).

Step 3. To argue the differentiability of λ → dc(λ), note that it follows from (4.6.2) that for λ, μ ∈ H

dc(μ) − dc(λ) = f(x(μ)) − f(x(λ)) + ϕc(Λx(μ), λ) − ϕc(Λx(λ), λ) + ϕc(Λx(μ), μ) − ϕc(Λx(μ), λ)
≥ 〈f′(x(λ)), x(μ) − x(λ)〉 + (ϕ′c(Λx(λ), λ), Λ(x(μ) − x(λ))) + ϕc(Λx(μ), μ) − ϕc(Λx(μ), λ)
≥ ϕc(Λx(μ), μ) − ϕc(Λx(μ), λ).    (4.6.3)

Similarly, we have

dc(μ) − dc(λ) = f(x(μ)) − f(x(λ)) + ϕc(Λx(μ), μ) − ϕc(Λx(λ), μ) + ϕc(Λx(λ), μ) − ϕc(Λx(λ), λ)
≤ −〈f′(x(μ)), x(λ) − x(μ)〉 − (ϕ′c(Λx(μ), μ), Λ(x(λ) − x(μ))) + ϕc(Λx(λ), μ) − ϕc(Λx(λ), λ)
≤ ϕc(Λx(λ), μ) − ϕc(Λx(λ), λ).    (4.6.4)

It follows from Step 2 and Theorem 4.39 that

|ϕc(Λx(μ), μ) − ϕc(Λx(μ), λ) − (uc(Λx(λ), λ), μ − λ)|
= |∫_0^1 (uc(Λx(μ), λ + t(μ − λ)) − uc(Λx(λ), λ), μ − λ) dt|
≤ (1/c)|μ − λ| ∫_0^1 |A_{1/c}(Λx(μ) + (λ + t(μ − λ))/c) − A_{1/c}(Λx(λ) + λ/c) + t(λ − μ)| dt
≤ (1/c)|μ − λ| ∫_0^1 (c‖Λ‖|x(μ) − x(λ)| + 2t|μ − λ|) dt ≤ (‖Λ‖²/σ + 1/c)|μ − λ|².


Hence (4.6.3)–(4.6.4) and Step 2 imply that λ ∈ H → dc(λ) is Fréchet differentiable with F-derivative given by uc(Λx(λ), λ), and uc(Λx(λ), λ) is Lipschitz continuous in λ.

The following theorem asserts convergence of the augmented Lagrangian method.

Theorem 4.48. Assume that (A1)–(A3) hold and that there exists λ̄ ∈ ∂ϕ(Λx̄) such that (4.5.7) is satisfied. Then the sequence (xk, λk) is well defined and satisfies

(σ/2)|xk − x̄|²X + (1/(2c))|λk+1 − λ̄|²H ≤ (1/(2c))|λk − λ̄|²H (4.6.5)

and

Σ_{k=1}^∞ (σ/2)|xk − x̄|²X ≤ (1/(2c))|λ1 − λ̄|²H, (4.6.6)

which implies that |xk − x̄|X → 0 as k → ∞.

Proof. It follows from Theorem 4.45 that λ̄ = ϕ′c(v̄, λ̄), where v̄ = Λx̄. Next we establish (4.6.5). From (4.5.5) and Step 3

〈f′(xk), x − xk〉 + (λk+1, Λ(x − xk)) ≥ 0 for all x ∈ C,

and from (4.5.7)

〈f′(x̄), xk − x̄〉 + (λ̄, Λ(xk − x̄)) ≥ 0.

Adding these two inequalities (the first with x = x̄), we obtain

〈f′(xk) − f′(x̄), xk − x̄〉 + (λk+1 − λ̄, Λ(xk − x̄)) ≤ 0. (4.6.7)

From Theorems 4.39 and 4.40

λk+1 − λ̄ = A_{1/c}(vk + λk/c) − A_{1/c}(v̄ + λ̄/c),

where vk = Λxk and v̄ = Λx̄. From the definition of Aμ we have

〈Aμv − Aμv̄, v − v̄〉 = μ|Aμv − Aμv̄|² + 〈Aμv − Aμv̄, Jμv − Jμv̄〉 ≥ μ|Aμv − Aμv̄|²

for μ > 0 and v, v̄ ∈ H, since Aμv ∈ AJμv and A is monotone. Thus,

(λk+1 − λ̄, vk − v̄) = (λk+1 − λ̄, vk + λk/c − (v̄ + λ̄/c)) − (1/c)(λk+1 − λ̄, λk − λ̄)
≥ (1/c)|λk+1 − λ̄|² − (1/c)(λk+1 − λ̄, λk − λ̄)
≥ (1/(2c))|λk+1 − λ̄|² − (1/(2c))|λk − λ̄|².

Hence (4.6.5) follows from (A2) and (4.6.7). Summing (4.6.5) with respect to k, we obtain (4.6.6).


The duality (Uzawa) method is an iterative method with respect to λk ∈ H. It is given by

λk+1 = ϕ′c(Λxk, λk), (4.6.8)

where xk ∈ C solves

〈f′(xk) + Λ∗λk, x − xk〉 ≥ 0 for all x ∈ C. (4.6.9)

It will be shown that the Uzawa method is conditionally convergent in the sense that there exists c̄ > 0 such that it converges for c ∈ (0, c̄]. On the other hand, the augmented Lagrangian method can be written as

λk+1 = ϕ′c(Λxk, λk),

where xk ∈ C satisfies

〈f′(xk) + Λ∗λk+1, x − xk〉 ≥ 0 for all x ∈ C.

Note that the Uzawa method is explicit with respect to λ, while the augmented Lagrangian method is implicit and converges unconditionally with respect to c (see Theorem 4.48).

Theorem 4.49 (Uzawa Algorithm). Assume that (A1)–(A3) hold and that there exists λ̄ ∈ ∂ϕ(Λx̄) such that (4.5.7) is satisfied. Then there exists c̄ > 0 such that for the sequence (xk, λk) generated by (4.6.8)–(4.6.9) we have |xk − x̄|X → 0 as k → ∞ if 0 < c ≤ c̄.

Proof. As shown in the proof of Theorem 4.48, it follows from (4.5.7) and (4.6.9) that

〈f′(xk) − f′(x̄), xk − x̄〉 + 〈λk − λ̄, Λ(xk − x̄)〉 ≤ 0. (4.6.10)

Since

λk+1 − λ̄ = A_{1/c}(vk + λk/c) − A_{1/c}(v̄ + λ̄/c),

where vk = Λxk and v̄ = Λx̄, and since A_{1/c} is Lipschitz continuous with Lipschitz constant c,

|λk+1 − λ̄|² ≤ |λk − λ̄ + c(vk − v̄)|²
= |λk − λ̄|² + 2c〈λk − λ̄, vk − v̄〉 + c²|vk − v̄|²
≤ |λk − λ̄|² − (2cσ − c²‖Λ‖²)|xk − x̄|²,

where (4.6.10) was used for the last inequality. Choose c̄ < 2σ‖Λ‖⁻². Then β = 2cσ − c²‖Λ‖² > 0 if c ∈ (0, c̄], and

β Σ_{k=0}^∞ |xk − x̄|² ≤ |λ0 − λ̄|².

Consequently |xk − x̄| → 0 as k → ∞.


4.7 Applications

In this section we discuss applications of the results in Sections 4.5 and 4.6. In many cases the conjugate functional ϕ∗ is given by

ϕ∗(v) = ψK∗(v),

where K∗ is a closed convex set in H and ψS is the indicator function of a set S, i.e., ψS(x) = 0 if x ∈ S and ψS(x) = ∞ if x ∉ S.

Then it follows from Theorem 4.40 that for v, λ ∈ H

ϕc(v, λ) = sup_{y∗∈K∗} {−(1/(2c))|y∗ − (λ + cv)|²H} + (1/(2c))(|λ + cv|²H − |λ|²H). (4.7.1)

Hence the supremum is attained at λc(v, λ) = ProjK∗(λ + cv), where ProjK∗(w) denotes the projection of w ∈ H onto K∗. This implies that the update in Step 3 of the augmented Lagrangian algorithm is given by

λk+1 = ProjK∗(λk + cΛxk). (4.7.2)

Example 4.50. For the equality constraint

ϕ(v) = ψK(v) with K = {0},

and ϕ∗(w) = 0 for w ∈ H, i.e., K∗ = H. Thus

ϕc(v, λ) = (1/(2c))(|λ + cv|²H − |λ|²H)

and

λk+1 = λk + cΛxk,

which coincides with the first order augmented Lagrangian update for equality constraints discussed in Chapter 3.

Example 4.51. If ϕ(v) = ψK(v), where K is a closed convex cone with vertex at the origin in H, then ϕ∗ = ψK⁺, where K⁺ = {w ∈ H : 〈w, v〉H ≤ 0 for all v ∈ K} is the dual cone of K. In particular, if K = {v ∈ L²(Ω) : v ≤ 0 a.e.}, then K⁺ = {w ∈ L²(Ω) : w ≥ 0 a.e.}. Thus for the inequality constraint in H = L²(Ω) the update (4.7.2) becomes

λk+1 = max(0, λk + cΛxk),

where the max operation is defined pointwise in Ω. Here L²(Ω) denotes the space of scalar-valued square integrable functions over a domain Ω in Rⁿ. For K = {v ∈ Rⁿ : vi ≤ 0 for all i} the update (4.7.2) is given by λk+1 = max(0, λk + cΛxk), where the maximum is taken coordinatewise. It coincides with the first order augmented Lagrangian update for finite rank inequality constraints in Chapter 3.


Example 4.52. If ϕ(v) = |v|H, then ϕ∗ = ψB, where B is the closed unit ball in H. Thus the update (4.7.2) becomes

λk+1 = (λk + cΛxk) / max(1, |λk + cΛxk|H).
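In vectorized form the three updates above are pointwise evaluations of the projection in (4.7.2); a brief numpy sketch (our naming, with v standing for Λxk) follows.

# Multiplier updates (4.7.2) for Examples 4.50-4.52 as projections onto K*. Sketch.
import numpy as np

def update_equality(lam, v, c):           # K = {0}, K* = H (Example 4.50)
    return lam + c*v

def update_onesided(lam, v, c):           # K = {v <= 0}, K* = {w >= 0} (Example 4.51)
    return np.maximum(0.0, lam + c*v)

def update_ball(lam, v, c):               # phi = |.|_H, K* = unit ball (Example 4.52)
    w = lam + c*v
    return w / max(1.0, np.linalg.norm(w))

lam = np.array([0.2, -0.4]); v = np.array([0.5, 0.3]); c = 2.0
print(update_equality(lam, v, c))
print(update_onesided(lam, v, c))
print(update_ball(lam, v, c))             # always lands inside the unit ball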

Example 4.53. We consider the bilateral inequality constraint φ ≤ v ≤ ψ in L²(Ω). We set

K = {v ∈ L²(Ω) : −(ψ − φ)/2 ≤ v − (φ + ψ)/2 ≤ (ψ − φ)/2}

and ϕ = ψK. It follows that

ϕ∗(w) = 〈w, (φ + ψ)/2〉_{L²} + 〈(ψ − φ)/2, |w|〉_{L²}.

From Theorem 4.40 we conclude that the expression 〈v, w〉 − ϕ∗(w) − (1/(2c))|w − λ|² is maximized with respect to w at the element y∗ satisfying

v − (φ + ψ)/2 − (y∗ − λ)/c ∈ ((ψ − φ)/2) ∂(|·|)(y∗)

a.e. in Ω. Thus it follows from Theorem 4.40 that the complementarity condition (4.1.8) is given by

λ̄ = max(0, λ̄ + c(Λx̄ − ψ)) + min(0, λ̄ + c(Λx̄ − φ)),

and the Lagrange multiplier update in Step 3 of the augmented Lagrangian method is

λk+1 = max(0, λk + c(Λxk − ψ)) + min(0, λk + c(Λxk − φ)),

where the max and min operations are defined pointwise a.e. in Ω.
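A pointwise numpy sketch of this bilateral update (our naming; φ, ψ, and λ sampled on a grid):

# Bilateral multiplier update from Example 4.53 (pointwise a.e. in Omega). Sketch.
import numpy as np

def update_bilateral(lam, v, phi, psi, c):
    # lam_{k+1} = max(0, lam + c(v - psi)) + min(0, lam + c(v - phi)),  phi <= psi;
    # at most one of the two terms is nonzero at each point.
    return np.maximum(0.0, lam + c*(v - psi)) + np.minimum(0.0, lam + c*(v - phi))

lam = np.zeros(5)
v   = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])      # current values of Lambda x_k
phi, psi, c = -1.0, 1.0, 10.0
print(update_bilateral(lam, v, phi, psi, c))
# -> [-10.  0.  0.  0.  10.]: nonzero exactly where v violates the bounds.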

Note that, with obvious modifications, Λx in Sections 4.5 and 4.6 can be replaced by an affine function of the form Λx + a with a ∈ H.

4.7.1 Bingham flow

We consider the problem

min ∫_Ω ((μ/2)|∇u|² − fu) dx + g ∫_Ω |∇u| dx over u ∈ H¹₀(Ω), (4.7.3)

where Ω is a bounded open set in R² with Lipschitz boundary, g and μ are positive constants, and f ∈ L²(Ω). For a discussion of (4.7.3) we refer the reader to [Glo, GLT] and the references therein. In the context of the general theory of Section 4.5 we choose

X = H¹₀(Ω), H = L²(Ω) × L²(Ω), and Λ = g grad,


C = X, and define f : X → R and ϕ : H → R by

f(u) = ∫_Ω ((μ/2)|∇u|² − fu) dx

and

ϕ(v1, v2) = ∫_Ω √(v1² + v2²) dx.

Conditions (A1)–(A3) are clearly satisfied. Since dom(ϕ) = H it follows from Theorem 4.45 and Lemma 4.44 that there exists λ̄ such that (4.5.7) holds. Moreover, it is not difficult to show that

ϕ∗(v) = ψK∗(v),

where K∗ is given by

K∗ = {v ∈ H : |v(x)|_{R²} ≤ 1 a.e. in Ω}.

Hence it follows from (4.7.2) that Steps 2 and 3 in the augmented Lagrangian method are given by

−μΔuk − g div λk+1 = f, (4.7.4)

where

λk+1 = λk + c∇uk on Ak = {x : |λk(x) + c∇uk(x)|_{R²} ≤ 1},
λk+1 = (λk + c∇uk)/|λk + c∇uk|_{R²} on Ω \ Ak.    (4.7.5)
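The update (4.7.5) is the pointwise projection of λk + c∇uk onto the unit ball of R², applied at every spatial point. A vectorized numpy sketch (our construction, with the gradient field stored as an array of shape (2, m, n)):

# Pointwise R^2-projection update (4.7.5) for the Bingham problem. Sketch.
import numpy as np

def update_bingham(lam, grad_u, c):
    # lam, grad_u: arrays of shape (2, m, n) holding vector fields on a grid
    w = lam + c*grad_u
    mag = np.sqrt(w[0]**2 + w[1]**2)      # |lam_k + c grad u_k|_{R^2} pointwise
    return w / np.maximum(1.0, mag)       # identity on A_k = {mag <= 1}, scaled outside

lam = np.zeros((2, 3, 3)); grad_u = np.random.randn(2, 3, 3); c = 1.0
lam_next = update_bingham(lam, grad_u, c)
print(np.sqrt(lam_next[0]**2 + lam_next[1]**2).max())   # <= 1 by construction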

4.7.2 Image restoration

For the image restoration problem introduced in (4.1.2) the analysis is similar to the one for the Bingham flow, and Steps 2 and 3 in the augmented Lagrangian method are given by

−μΔuk + K∗(Kuk − z) − g div λk+1 = 0, μ ∂uk/∂n + g λk+1 · n = 0 on Γ,

where

λk+1 = λk + c∇uk on Ak = {x : |λk(x) + c∇uk(x)|_{R²} ≤ 1},
λk+1 = (λk + c∇uk)/|λk + c∇uk|_{R²} on Ω \ Ak

in the strong form. Here K : L²(Ω) → L²(Ω) denotes the convolution operator defined by

(Ku)(x) = ∫_Ω k(x, s)u(s) ds.


4.7.3 Elastoplastic problem

We consider the problem

min ∫_Ω ((β/2)|∇u|² − fu) dx over u ∈ H¹₀(Ω)
subject to |∇u| ≤ 1 a.e. in Ω,    (4.7.6)

where β is a positive constant, Ω is a bounded domain in R², and f ∈ L²(Ω). In the context of the general theory we choose

X = H¹₀(Ω), H = L²(Ω) × L²(Ω), and Λ = ∇,

C = X, and define f : X → R and ϕ : H → (−∞,∞] by

f(u) = ∫_Ω ((β/2)|∇u|² − fu) dx and ϕ = ψK,

where K is the closed convex set defined by K = {v ∈ H : |v|_{R²} ≤ 1 a.e. in Ω}. Then

ϕ∗(w) = ∫_Ω |w|_{R²} dx.

The maximum in (4.4.8), i.e., of 〈∇u, w〉 − ϕ∗(w) − (1/(2c))|w − λ|² with respect to w ∈ L²(Ω) × L²(Ω), is attained at y∗ such that

λ + c∇u = y∗ + cμ, where |μ| ≤ 1 and μ · y∗ = |y∗|

a.e. in Ω. It thus follows from Theorem 4.40 that the Lagrange multiplier update is given by

λk+1 = c max(0, |(λk + c∇uk)/c| − 1) (λk + c∇uk)/|λk + c∇uk|

a.e. in Ω. The existence of a Lagrange multiplier λ̄ ∈ L∞(Ω) for the inequality constraint in (4.7.6) is shown in [Bre2] for f ≡ 1. In general, existence is still an open problem.

4.7.4 Obstacle problem

We consider the problem

min ∫_Ω (½|∇u|² − fu) dx over u ∈ H¹₀(Ω)
subject to φ ≤ u ≤ ψ a.e. in Ω.    (4.7.7)

In the context of the general theory we choose

X = H¹₀(Ω), H = L²(Ω), and Λ = the natural injection,

C = X, and define f : X → R and ϕ : H → (−∞,∞] by

f(u) = ∫_Ω (½|∇u|² − fu) dx and ϕ = ψK,

where K is the closed convex set defined by K = {v ∈ H : φ ≤ v ≤ ψ a.e. in Ω}. For the one-sided constraint u ≤ ψ (i.e., φ = −∞) it is shown in [IK4], for example, that there exists a unique λ̄ ∈ ∂ϕ(ū) such that (4.5.7) is satisfied, provided that ψ ∈ H¹(Ω), ψ|_Γ ≥ 0, and sup(0, f + Δψ) ∈ L²(Ω). In fact, the optimal pair (ū, λ̄) ∈ (H² ∩ H¹₀)(Ω) × L²(Ω) satisfies

−Δū + λ̄ = f,
λ̄ = max(0, λ̄ + c(ū − ψ)).    (4.7.8)

In this case Steps 2 and 3 in the augmented Lagrangian method are given by

−Δuk + λk+1 = f,
λk+1 = max(0, λk + c(uk − ψ)).

For the two-sided constraint case we assume that φ, ψ ∈ H¹(Ω) satisfy

φ ≤ 0 ≤ ψ on Γ, max(0, Δψ + f), min(0, Δφ + f) ∈ L²(Ω), (4.7.9)

S1 ∩ S2 = ∅, where S1 = {x ∈ Ω : Δψ + f > 0} and S2 = {x ∈ Ω : Δφ + f < 0}, (4.7.10)

and that there exists a c0 > 0 such that

−Δ(ψ − φ) + c0(ψ − φ) ≥ 0 a.e. in Ω. (4.7.11)

Let λ̄ ∈ H be defined by

λ̄(x) = Δψ(x) + f(x) for x ∈ S1, λ̄(x) = Δφ(x) + f(x) for x ∈ S2, and λ̄(x) = 0 otherwise. (4.7.12)

Employing the regularization procedure (4.5.4) we obtain the following theorem.

Theorem 4.54. Assume that φ, ψ ∈ H¹(Ω) satisfy (4.7.9)–(4.7.11), and let λ̄ be defined by (4.7.12). Then (4.5.5) takes the form

−Δuc + λc = f, with λc = λ̄ + c(uc − ψ) where λ̄ + c(uc − ψ) > 0, λc = λ̄ + c(uc − φ) where λ̄ + c(uc − φ) < 0, and λc = 0 otherwise, (4.7.13)

and φ ≤ uc ≤ ψ, |λc| ≤ |λ̄| a.e. in Ω for c ≥ c0. Moreover, as c → ∞, uc ⇀ ū weakly in H²(Ω) and λc ⇀ λ∗ weakly in L²(Ω), where ū is the solution of (4.7.7) and (ū, λ∗) satisfies the necessary and sufficient optimality condition

−Δū + λ∗ = f, with λ∗ = λ∗ + c(ū − ψ) where λ∗ + c(ū − ψ) > 0, λ∗ = λ∗ + c(ū − φ) where λ∗ + c(ū − φ) < 0, and λ∗ = 0 otherwise.


Proof. From Theorem 4.40 it follows that

λc(u, λ̄) = λ̄ + c(u − ψ) where λ̄ + c(u − ψ) > 0, λc(u, λ̄) = λ̄ + c(u − φ) where λ̄ + c(u − φ) < 0, and λc(u, λ̄) = 0 otherwise. (4.7.14)

We note that (4.5.4) has a unique solution uc ∈ H¹₀(Ω). From (4.5.5)–(4.5.6) we deduce that uc satisfies (4.7.13). Since λ̄ ∈ L²(Ω) it follows that λc(uc, λ̄) ∈ L²(Ω), and thus uc ∈ H²(Ω). Let η = sup(0, uc − ψ). Then η ∈ H¹₀(Ω). Hence we have

(∇(uc − ψ), ∇η) + (−(Δψ + f) + λc, η) = 0.

For x ∈ Ω assume that uc(x) ≥ ψ(x). If λ̄(x) > 0, then −(Δψ + f) + λc = c(uc − ψ) ≥ 0. If λ̄(x) = 0, then −(Δψ + f) + λc ≥ c(uc − ψ) ≥ 0. If λ̄(x) < 0, then −(Δψ + f) + λc ≥ −Δ(ψ − φ) + c(ψ − φ) ≥ 0 for c ≥ c0. Thus we have (−(Δψ + f) + λc, η) ≥ 0 and |∇η|² = 0. This implies that η = 0 and uc ≤ ψ a.e. in Ω. Similarly, we can prove that uc ≥ φ a.e. in Ω by choosing the test function η = inf(0, uc − φ) ∈ H¹₀(Ω). Moreover, it follows from (4.7.14) that |λc| = |λc(uc, λ̄)| ≤ |λ̄| a.e. in Ω, and |λc|_{L²} is uniformly bounded. Thus there exists a weakly convergent subsequence (uc, λc) ⇀ (ū, λ∗) in H²(Ω) × L²(Ω). Moreover, uc converges strongly to ū in H¹₀(Ω) and, as shown in the proof of Theorem 4.46,

−Δū + λ∗ = f and λ∗ ∈ ∂ϕ(ū). (4.7.15)

Hence it follows from Theorem 4.43 that ū minimizes (4.7.7). Since the solution of (4.7.7) is unique, it follows from (4.7.15) that λ∗ ∈ L²(Ω) is unique. The theorem now follows from Theorem 4.45.

From Theorem 4.54 it follows that Steps 2 and 3 in the augmented Lagrangian method for the two-sided constraint are given by

−Δuk + λk+1 = f, with λk+1 = λk + c(uk − ψ) where λk + c(uk − ψ) > 0, λk+1 = λk + c(uk − φ) where λk + c(uk − φ) < 0, and λk+1 = 0 otherwise.

Note that the last equality can equivalently be expressed as

λk+1 = max(0, λk + c(uk − ψ)) + min(0, λk + c(uk − φ)).
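As a more complete illustration, the following sketch (entirely our construction, assuming scipy is available; all names are ours) applies the method to a one-sided 1D obstacle problem, min ∫ ½|u′|² − fu dx subject to u ≤ ψ on (0, 1) with u(0) = u(1) = 0. It uses the closed form ϕc(v, λ) = (1/(2c))(max(0, λ + c(v − ψ))² − λ²) pointwise for Step 2 and the update λk+1 = max(0, λk + c(uk − ψ)) for Step 3.

# Augmented Lagrangian method for a 1D obstacle problem (one-sided, u <= psi).
# Finite differences; Step 2 solved by L-BFGS-B on the smooth regularized energy.
import numpy as np
from scipy.optimize import minimize

n   = 99                                  # interior grid points
h   = 1.0/(n + 1)
x   = np.linspace(h, 1.0 - h, n)
f   = 20.0*np.ones(n)                     # load
psi = 0.4*np.ones(n)                      # upper obstacle
main, off = 2.0/h*np.ones(n), -1.0/h*np.ones(n - 1)

def apply_A(u):                           # stiffness matrix action (1D Laplacian)
    Au = main*u
    Au[:-1] += off*u[1:]
    Au[1:]  += off*u[:-1]
    return Au

def step2(lam, c, u0):
    # Step 2: minimize 0.5 u'Au - h f'u + h * sum_i phi_c(u_i, lam_i)
    def energy(u):
        m = np.maximum(0.0, lam + c*(u - psi))        # = phi_c'(u, lam) pointwise
        val  = 0.5*u @ apply_A(u) - h*f @ u + h*np.sum((m**2 - lam**2)/(2.0*c))
        return val, apply_A(u) - h*f + h*m
    return minimize(energy, u0, jac=True, method='L-BFGS-B').x

c, lam, u = 100.0, np.zeros(n), np.zeros(n)
for k in range(15):
    u   = step2(lam, c, u)
    lam = np.maximum(0.0, lam + c*(u - psi))          # Step 3, cf. (4.7.8)
    print(k, float(np.max(u - psi)))                  # constraint violation -> 0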

4.7.5 Signorini problem

Consider the Signorini problem

min ∫_Ω (½(|∇u|² + |u|²) − fu) dx over u ∈ H¹(Ω)
subject to u ≥ 0 on the boundary Γ,    (4.7.16)


which is a simplified version of a contact problem arising in elasticity theory. In this case we choose

X = H¹(Ω), H = L²(Γ), and Λ = the trace operator on the boundary Γ,

C = X, and define f : X → R and ϕ : H → (−∞,∞] by

f(u) = ∫_Ω (½(|∇u|² + |u|²) − fu) dx and ϕ = ψK,

where K is the closed convex set defined by K = {v ∈ L²(Γ) : v ≥ 0 a.e. in Γ}. If f ∈ L²(Ω), then the unique minimizer ū ∈ H²(Ω) [Bre2], and it is shown in [Glo] that for λ̄ = ∂ū/∂n the pair (ū, λ̄) satisfies (4.5.7). Note that range(Λ) is dense in H and Λ is compact. It thus follows from Theorems 4.43 and 4.46 that (4.5.7) has a unique solution, and that uc → ū strongly in X and λc ⇀ λ̄ weakly in H. Steps 2 and 3 in the augmented Lagrangian method are given by

−Δuk + uk = f, ∂uk/∂n − λk+1 = 0 on Γ,
λk+1 = max(0, λk − c uk) on Γ.

4.7.6 Friction problem

Consider a simplified friction problem from elasticity theory:

min ∫_Ω (½(|∇u|² + |u|²) − fu) dx + g ∫_Γ |u| ds over u ∈ H¹(Ω), (4.7.17)

where Ω is a bounded open domain of R² with sufficiently smooth boundary Γ. In this case we choose

X = H¹(Ω), H = L²(Γ), and Λ = the trace operator on the boundary Γ,

C = X, and define f : X → R and ϕ : H → R by

f(u) = ∫_Ω (½(|∇u|² + |u|²) − fu) dx and ϕ(v) = g ∫_Γ |v| ds.

Note that ϕ∗ = ψK∗, where K∗ = {v ∈ H : |v(x)| ≤ g a.e. in Γ}. Since dom(ϕ) = H it follows from Theorem 4.43 and Lemma 4.44 that there exists λ̄ such that (4.5.7) holds. From (4.7.2), Steps 2 and 3 in the augmented Lagrangian method are given by

−Δuk + uk = f, ∂uk/∂n + λk+1 = 0 on Γ,
λk+1 = λk + c uk on Γk = {x ∈ Γ : |λk + c uk| ≤ g},
λk+1 = g(λk + c uk)/|λk + c uk| on Γ \ Γk.


4.7.7 L¹-fitting

Consider the minimization problem

min ∫_Ω |u − z| dx + (μ/2) ∫_Ω |∇u|² dx over u ∈ H¹₀(Ω) (4.7.18)

for the interpolation of noisy data z ∈ L²(Ω) by minimizing the L¹-norm of the error u − z over u ∈ H¹₀(Ω). Again μ > 0 is fixed but should be adjusted to the statistics of the noise. The analysis of this problem is very similar to that of the friction problem, and Steps 2 and 3 in the augmented Lagrangian method are given by

−μΔuk + λk+1 = 0,
λk+1 = λk + c(uk − z) on Ωk = {x : |λk(x) + c(uk(x) − z(x))| ≤ 1},
λk+1 = (λk + c(uk − z))/|λk + c(uk − z)| on Ω \ Ωk.

4.7.8 Control problem

Consider the optimal control problem

min ½ ∫_0^T (|x(t)|² + |u(t)|²) dt over (x, u) ∈ L²(0, T; Rⁿ) × L²(0, T; Rᵐ)
subject to x(t) − x0 − ∫_0^t (Ax(s) + Bu(s)) ds = 0 and |u(t)| ≤ 1 a.e.,    (4.7.19)

where A ∈ Rⁿˣⁿ, B ∈ Rⁿˣᵐ, and x0 ∈ Rⁿ are fixed. In this case we formulate the problem in the form (4.1.1) by choosing

X = L²(0, T; Rⁿ) × L²(0, T; Rᵐ), H = L²(0, T; Rᵐ), and Λ(x, u) = u for (x, u) ∈ X,

and define f : X → R and ϕ : H → (−∞,∞] by

f(x, u) = ½ ∫_0^T (|x(t)|² + |u(t)|²) dt and ϕ(v) = ψK(v),

where K = {v ∈ H : |v(t)| ≤ 1 a.e. in (0, T)}. C is the closed affine space defined by

C = {(x, u) ∈ X : x(t) − x0 − ∫_0^t (Ax(s) + Bu(s)) ds = 0}.

It follows from Theorems 4.39 and 4.40 that

λc = c(v + λ/c − (v + λ/c)/max(1, |v + λ/c|)) = c max(0, |v + λ/c| − 1) (λ + cv)/|λ + cv| (4.7.20)

for v, λ ∈ H.
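The two expressions in (4.7.20) agree, as a quick numerical check in Rᵐ confirms (a sketch; names ours):

# Check the identity (4.7.20): c*(y - y/max(1,|y|)) = c*max(0,|y|-1)*y/|y|,
# where y = v + lam/c, so that (lam + c*v)/|lam + c*v| = y/|y|.
import numpy as np

rng = np.random.default_rng(1)
for _ in range(5):
    v, lam, c = rng.standard_normal(3), rng.standard_normal(3), 2.0
    y = v + lam/c
    lhs = c*(y - y/max(1.0, np.linalg.norm(y)))
    w = lam + c*v
    rhs = c*max(0.0, np.linalg.norm(y) - 1.0)*w/np.linalg.norm(w)
    print(np.linalg.norm(lhs - rhs))      # ~0 in every trial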


For c > 0 the regularized problem

min fc(x, u) = f(x, u) + ϕc(u, 0) over (x, u) ∈ C

has a unique solution (xc, uc) ∈ C. Define the Lagrangian

L(x, u, μ) = fc(x, u) + ∫_0^T (x(t) − x0 − ∫_0^t (Ax(s) + Bu(s)) ds, μ(t))_{Rⁿ} dt.

Since the mapping F : X → L²(0, T; Rⁿ) defined by

F(x, u)(t) = x(t) − ∫_0^t (Ax(s) + Bu(s)) ds, t ∈ (0, T),

is surjective, it follows from the Lagrange multiplier theory that there exists a unique Lagrange multiplier μc ∈ L²(0, T; Rⁿ) such that the Fréchet derivative of L with respect to (x, u) satisfies L′(xc, uc, μc)(h, v) = 0 for all (h, v) ∈ X. Hence we obtain

∫_0^T [(h(t) − ∫_0^t Ah(s) ds, μc(t)) + (xc(t), h(t))] dt = 0

for all h ∈ L²(0, T; Rⁿ) and

∫_0^T [(uc(t) + c max(0, |uc| − 1) uc/|uc|, v(t)) − (∫_0^t Bv(s) ds, μc(t))] dt = 0

for all v ∈ L²(0, T; Rᵐ). It is not difficult to show that if we define pc(t) = −∫_t^T μc(s) ds, then (xc, uc, pc) satisfies

(d/dt)pc(t) + Aᵀpc(t) + xc(t) = 0, pc(T) = 0,
uc(t) + c max(0, |uc(t)| − 1) uc(t)/|uc(t)| = −Bᵀpc(t).    (4.7.21)

Let (x̃, ũ) be a feasible point of (4.7.19), i.e., (x̃, ũ) ∈ C and ũ ∈ K. Since fc(xc, uc) ≤ fc(x̃, ũ) = f(x̃, ũ) and ϕc(uc, 0) ≥ 0, it follows that |(xc, uc)|X is uniformly bounded in c > 0. Thus there exists a subsequence of (xc, uc) that converges weakly to (x̄, ū) ∈ X as c → ∞. We shall show that (x̄, ū) is the solution of (4.7.19). First, since F(xc, uc) = x0, xc converges strongly to x̄ in L²(0, T; Rⁿ) and weakly in H¹(0, T; Rⁿ), and (x̄, ū) ∈ C. From the first equation of (4.7.21) it follows that pc converges strongly to p in H¹(0, T; Rⁿ), where p satisfies

(d/dt)p(t) + Aᵀp(t) + x̄(t) = 0, p(T) = 0.

Next, from the second equation of (4.7.21) we have

|uc(t)| + c max(0, |uc(t)| − 1) = |Bᵀpc(t)|


and thus

|uc(t)| = |Bᵀpc(t)| if |Bᵀpc(t)| ≤ 1, and |uc(t)| = 1 + (|Bᵀpc(t)| − 1)/(c + 1) if |Bᵀpc(t)| ≥ 1, t ∈ [0, T]. (4.7.22)

Note that

uc(t) = −Bᵀpc(t)/(1 + ηc(t)|uc(t)|⁻¹),
where ηc(t) = c max(0, |uc(t)| − 1) = (c/(1 + c)) max(0, |Bᵀpc(t)| − 1).    (4.7.23)

Define the functions η, u ∈ H¹(0, T) by

η(t) = max(0, |Bᵀp(t)| − 1) and u(t) = −Bᵀp(t)/(1 + η(t)) = −Bᵀp(t)/max(1, |Bᵀp(t)|) (4.7.24)

for t ∈ [0, T]. Then

|u(t)| = |Bᵀp(t)|/max(1, |Bᵀp(t)|).

Since Bᵀpc → Bᵀp in C(0, T), we have |uc| → |u| in C(0, T) by (4.7.22), and it follows from (4.7.23)–(4.7.24) that ηc → η and uc → u in C(0, T). Since also uc ⇀ ū weakly in L²(0, T; Rᵐ), we have ū = u. Therefore we conclude that uc → ū in C(0, T), and if λ(t) = η(t)ū(t), then (x̄, ū, p, λ) satisfies (x̄, ū) ∈ C, ū ∈ K, and

(d/dt)p(t) + Aᵀp(t) + x̄(t) = 0, p(T) = 0,
ū(t) + λ(t) = −Bᵀp(t), λ(t) = λc(ū(t), λ(t)) = ϕ′c(ū, λ),    (4.7.25)

which is equivalent to (4.5.7). Now from Theorem 4.43 we deduce that (x̄, ū) is the solution of (4.7.19). The last equality in (4.7.25) can be verified by considering separately the cases |ū(t)| < 1 and |ū(t)| = 1. It follows from (4.7.20) that Steps 2 and 3 in the augmented Lagrangian method are given by

which is equivalent to (4.5.7). Now, from Theorem 4.43 we deduce that (x, u) is the solutionto (4.7.19). The last equality in (4.7.25) can be verified by separately considering the cases|u(t)| < 1 and |u(t)| = 1. It follows from (4.7.20) that Steps 2 and 3 in the augmentedLagrangian method are given by

d

dtxk(t) = Axk(t)+ Buk(t), xk(0) = x0,

d

dtpk(t)+ AT pk(t)+ xk = 0, pk(T ) = 0,

uk(t)+ λk+1(t) = −BT pk(t), where λk+1 = c max

(0,

∣∣∣∣uk + λk

c

∣∣∣∣− 1

)λk + c uk

|λk + c uk|for t ∈ [0, T ].


Chapter 5

Newton and SQP Methods

In this chapter we discuss the Newton method for the equality-constrained optimal control problem

min J(y, u) subject to e(y, u) = 0, (P)

where J : Y × U → R, e : Y × U → W, and Y, U, and W are Hilbert spaces. The focus is on problems where for given u there exists a solution y(u) of e(y(u), u) = 0, which is typical for optimal control problems. We shall refer to

Ĵ(u) = J(y(u), u) (5.0.1)

as the reduced cost functional. In the first section we give necessary and sufficient optimality conditions based on Taylor series expansion arguments. Section 5.2 is devoted to Newton's method for solving (P), and sufficient conditions for quadratic convergence are given. Section 5.3 contains a discussion of SQP (sequential quadratic programming) and reduced SQP techniques. We do not provide a convergence analysis there, since it can be obtained as a special case of the results on second order augmented Lagrangians in Chapter 6. The results are specialized to a class of optimal control problems for the Navier–Stokes equations in Section 5.4. Section 5.5 is devoted to the Newton method for weakly singular problems as introduced in Chapter 1.

5.1 Preliminaries

(I) Necessary Optimality Condition

The first objective is to characterize the derivative of Ĵ without recourse to the derivative of the state y with respect to the control u. This is in the same spirit as Theorem 1.17 but obtained under the less restrictive assumptions of this chapter. We assume that

(C1) (y, u) ∈ Y × U satisfies e(y, u) = 0 and there exists a neighborhood V(y) × V(u) of (y, u) on which J and e are C¹ with Lipschitz continuous first derivatives, such that for every v ∈ V(u) there exists a unique y(v) ∈ V(y) satisfying e(y(v), v) = 0.


Theorem 5.1. Assume that the pair (y, u) satisfies (C1) and that there exists λ ∈ W∗ such that

(C2) ey(y, u)∗λ + Jy(y, u) = 0, and

(C3) lim_{t→0+} (1/t)|y(u + td) − y|²_Y = 0 for all d ∈ U.

Then the Gâteaux derivative Ĵ′(u) exists and

Ĵ′(u) = eu(y, u)∗λ + Ju(y, u). (5.1.1)

Proof. For d ∈ U and t sufficiently small let v = y(u + td) − y. Then

Ĵ(u + td) − Ĵ(u) = ∫_0^1 (J′(y + sv, u + std)(v, td) − J′(y, u)(v, td)) ds + J′(y, u)(v, td). (5.1.2)

Similarly,

0 = 〈λ, e(y + v, u + td) − e(y, u)〉_{W∗,W}
= 〈λ, ∫_0^1 (e′(y + sv, u + std)(v, td) − e′(y, u)(v, td)) ds〉 + 〈λ, e′(y, u)(v, td)〉, (5.1.3)

and hence, by the Lipschitz continuity of e′ on V(y) × V(u) and (C3), we have

lim_{t→0+} (1/t)〈λ, ey(y, u)v〉 = −〈λ, eu(y, u)d〉.

Since J′ is Lipschitz continuous on V(y) × V(u), it follows from (5.1.2) and conditions (C2)–(C3) that

lim_{t→0+} (Ĵ(u + td) − Ĵ(u))/t = −lim_{t→0+} (1/t)〈λ, ey(y, u)v〉 + Ju(y, u)d = (eu(y, u)∗λ + Ju(y, u))d.

Defining the Lagrangian L : Y × U ×W ∗ → R by

L(y, u, λ) = J (y, u)+ 〈λ, e(y, u)〉,(5.1.1) can be written as

J ′(u) = Lu(y, u, λ).

If ey(y, u) : Y → W is bijective, then (5.1.1) also follows from the implicit functiontheory. In fact, by the implicit function theorem there exists a neighborhood V (y)× V (u)

of (y, u) such that e(y, v) = 0 has a unique solution y(v) ∈ V (y) for v ∈ V (u) andy(·) : V (u) ⊂ U → Y is C1 with

ey(y, v)y′(v)+ eu(y, v) = 0, v ∈ V (u).

Page 149: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 131

5.1. Preliminaries 131

Thus

J ′(u)v = Jy(y, u)y′(u)v + Ju(y, u)v = 〈eu(y, u)∗λ, v〉 + Ju(y, u)v.

If conditions (C1)–(C3) hold at a local minimizer (y∗, u∗) of (P ), it follows fromTheorem 5.1 that the first order necessary optimality condition for (P ) is given by⎧⎪⎪⎨⎪⎪⎩

ey(y∗, u∗)∗λ∗ + Jy(y

∗, u∗) = 0,

eu(y∗, u∗)∗λ∗ + Ju(y

∗, u∗) = 0,

e(y∗, u∗) = 0,

(5.1.4)

or equivalently

L′(y∗, u∗, λ∗) = 0, e(y∗, u∗) = 0,

where, as in the previous chapter, prime with L denotes the derivative with respect to (y, u).

(II) Sufficient Optimality Condition

Assume that (y∗, u∗, λ∗) ∈ Y1 × U ×W ∗ satisfies (C1) and that the necessary optimalitycondition (5.1.4) holds. For (y, u) ∈ V (y∗)×V (u∗) define the quadratic approximation toJ (y, u) and 〈λ, e(y, u)〉 by

E1(y−y∗, u−u∗) = J (y, u)−J (y∗, u∗)−Ju(y∗, u∗)(u−u∗)−Jy(y

∗, u∗)(y−y∗) (5.1.5)

and

E2(y− y∗, u−u∗) = 〈λ∗, e(y, u)− e(y∗, u∗)− ey(y∗, u∗)(y− y∗)− eu(y

∗, u∗)(u−u∗)〉.(5.1.6)

Since for y = y(u) with u ∈ V (u∗)

E2(y(u)− y∗, u− u∗) = −〈λ∗, ey(y∗, u∗)(y − y∗)+ eu(y∗, u∗)(u− u∗)〉,

we find by summing (5.1.5) and (5.1.6) with y = y(u) and using (5.1.4)

J (u)− J (u∗) = J (y(u), u)−J (y∗, u∗) = E1(y(u)−y∗, u−u∗)+E2(y(u)−y∗, u−u∗).

Based on this identity, we have the following local sufficient optimality conditions.• If E(u) := E1(y(u)− y∗, u− y∗)+E2(y(u)− y∗, u− u∗) ≥ 0 for all u ∈ V (u∗),

then u∗ is a local minimizer of (P ). If E(u) > 0 for all u ∈ V (u∗) with u �= u∗, then u∗ isa strict local minimizer.

•Assume that J is locally uniformly convex in the sense that for some α > 0

E1(y(u)− y∗, u− u∗) ≥ α(|y(u)− y∗|2Y + |u− u∗|2U) for all u ∈ V (u∗),

and let β ≥ 0 be the size of the nonlinearity of e, i.e.,

|e(y, u)−e(y∗, u∗)−ey(y∗, u∗)(y−y∗)−eu(y

∗, u∗)(u−u∗)| ≤ β(|y−y∗|2Y +|u−u∗|2U)

Page 150: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 132

132 Chapter 5. Newton and SQP Methods

for all (y, u) ∈ V (y∗)× V (u∗). Then

E(u) ≥ (α − β|λ∗|W ∗) (|y(u)− y∗|2Y + |u− u∗|2U) for all u ∈ V (u∗).

Thus, u∗ is a strict local minimizer of (P ) if α − β|λ∗|W ∗ > 0.• Note that

E(u) = L(x, λ∗)− L(x∗, λ∗)− L′(x∗, λ∗)(x − x∗),

where x = (y, u), x∗ = (y∗, u∗), and y∗ = y(u∗). If L is C2 with respect to x = (y, u) inV (y∗)× V (u∗) with second derivative with respect to x denoted by L′′, we have

E(u) = L′′(x∗, λ∗)(x − x∗, x − x∗)+ r(x − x∗), (5.1.7)

where |r(x − x∗)| = o(|x − x∗|2). Thus, if there exists σ > 0 such that

L′′(x∗, λ∗)(δx, δx) ≥ σ |δx|2 for all δx ∈ ker e′(x∗), (5.1.8)

and e′(x∗) : Y × U → W is surjective, then there exist σ0 > 0 and ρ > 0 such that

J (u)− J (u∗) = E(u) ≥ σ0(|y(u)− y(u∗))|2 + |u− u∗|2) (5.1.9)

for all y(u) ∈ V (y∗)×V (u∗)with |(y(u), u)−(y(u∗), u∗)| < ρ. In fact, from Lemma 2.13,Chapter 2, with S = ker e′(x∗) there exist γ > 0, σ0 > 0 such that

L′′(x∗, λ∗)(x − x∗, x − x∗) ≥ σ0|x − x∗|2

for all x ∈ V (y∗)× V (u∗) with |xker⊥| ≤ γ |xker|,(5.1.10)

where we decomposed x − x∗ as

x − x∗ = xker + xker⊥ ∈ ker e′(x∗)+ (ker e′(x∗))⊥.

Since 0 = e(x)− e(x∗) = e′(x∗)(x − x∗)+ η(x − x∗), where |η(x − x∗)|W = o(|x − x∗|),it follows that for δ ∈ (0, γ

1+γ ) there exists ρ > 0 such that

|xker⊥| ≤ δ|x − x∗| if |x − x∗| < ρ.

Consequently

|xker⊥| ≤δ

1− δ|xker| < γ |xker| if |x − x∗| < ρ,

and hence (5.1.9) follows from (5.1.7) and (5.1.10).We refer the reader to [CaTr, RoTr] and the literature cited therein for detailed in-

vestigations on the topic of second order sufficient optimality. The aim of these results isto establish second order sufficient optimality conditions which are close to second ordernecessary conditions.

Page 151: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 133

5.2. Newton method 133

5.2 Newton methodIn this section we describe the Newton method applied to the reduced form (5.0.1) of (P ).That is, let y(u) denote a solution to e(y, u) = 0. Then the constrained optimization problemis transformed to the unconstrained problem for u in U :

minu∈U J (u) = J (y(u), u). (5.2.1)

Let (y∗, u∗) denote a solution to (P ), and assume that (C1) holds for (y∗, u∗) and that (C2),(C3) hold for all (y(u), u) ∈ V (y∗) × V (u∗). In addition it is assumed that J and e areC2 in V (y∗)×V (u∗) with Lipschitz continuous second derivatives. From Theorem 5.1 thefirst derivative of J (u) is given by

J ′(u) = eu(y, u)∗λ+ Ju(y, u), (5.2.2)

where u ∈ V (u∗), y = y(u), and λ = λ(u) satisfy

ey(y, u)∗λ = −Jy(y, u). (5.2.3)

From (5.2.2) it follows that

J ′(u) = Lu(y(u), u, λ(u)) for u ∈ V (u∗). (5.2.4)

We henceforth assume that for every u ∈ V (u∗) and y = y(u), λ = λ(u),⎛⎝ Lyy(y, u, λ) ey(y, u)∗

ey(y, u) 0

⎞⎠⎛⎝ μ1

μ2

⎞⎠+⎛⎝ Lyu(y, u, λ)

eu(y, u)

⎞⎠ = 0 (5.2.5)

admits a solution (μ1, μ2) ∈ L(U, Y )× L(U,W ∗). Note that (5.2.5) is an operator equa-tion in L(U, Y ∗) × L(U,W). It consists of the linearized primal equation for μ1 and theadjoint operator equation with right-hand side −ey(y, u)∗μ2 − Lyu(y, u, λ) ∈ L(U, Y ∗)for μ2.

Using (5.24) and the adjoint operator in the form Ly(y, u, λ) = 0 and e(y, u) = 0,we find

J ′′(u) = Lyu(y, u, λ)μ1 + eu(y, u)∗μ2 + Luu(y, u, λ), (5.2.6)

where (μ1, μ2) satisfy (5.2.5). In fact, since e and J are required to have Lipschitz con-tinuous second derivatives in V (y∗) × V (u∗) we obtain the following relationships in W

and Y ∗:

ey(y, u)v + eu(y, u)(td) = o(|t |),Lyy(y, u, λ)v + ey(y, u)

∗w + Lyu(y, u, λ)(td) = o(|t |),(5.2.7)

Page 152: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 134

134 Chapter 5. Newton and SQP Methods

where (y, u) ∈ V (y∗)× V (u∗), d ∈ U , and v = y(u+ td)− y, w = λ(u+ td)− λ. By(5.2.4), (5.2.5), and (5.2.7) we find

J ′(u+ td)− J ′(u) = Luy(y, u, λ)v + Luu(y, u, λ)(td)+ eu(y, u)∗w + o(|t |)

= −Lyy(y, u, λ)(μ1, v)− 〈ey(y, u)v, μ2〉W,L(U,W ∗) + Luu(y, u, λ)(td)

−〈ey(y, u)μ1, w〉L(U,W ∗),W + o(|t |)= 〈ey(y, u)μ1, w〉L(U,W),W ∗ + Lyu(y, u, λ)(td, μ1)+ 〈e∗u(y, u)μ2, td〉L(U,U∗),U

+Luu(y, u, λ)(td)− 〈ey(y, u)μ1, w〉L(U,W),W ∗ + o(|t |).Dividing by t and letting t → 0 we obtain (5.2.6).

From (5.2.5) we deduce

μ1 = −(ey(y, u))−1eu(y, u),

μ2 = −(ey(y, u)∗)−1(Lyyμ1 + Lyu),

where L is evaluated at (y, u, λ). Hence the second derivative of J is given by

J ′′(u) = Luyμ1 − eu(y, u)∗(ey(y, u)∗)−1(Lyyμ1 + Lyu)+ Luu

= T (y, u)∗L′′(y, u, λ)T (y, u),

(5.2.8)

with

T (y, u) =⎛⎝ −ey(y, u)−1eu(y, u)

I

⎞⎠ ,

where T ∈ L(U, Y × U). After this preparation the Newton equation

J ′′(u)δu+ J ′(u) = 0 (5.2.9)

can be expressed as

T (y, u)∗L′′(y, u, λ)(

δy

δu

)+ T (y, u)∗

⎛⎝ 0

(eu(y, u)∗λ+ Ju(y, u)

⎞⎠ = 0,

ey(y, u)δy + eu(y, u)δu = 0.

By the definition of T (y, u) and the closed range theorem we have

ker(T (y, u)∗) = range(T (y, u))⊥ = ker(e′(y, u))⊥ = range(e′(y, u)∗)

provided that range(e′(y, u)), and hence range(e′(y, u))∗, is closed. As a consequence theNewton update can be expressed as(

L′′(y, u, λ) e′(y, u)∗e′(y, u) 0

)⎛⎝ δy

δu

δλ

⎞⎠+⎛⎝ 0

eu(y, u)∗λ+ Ju(y, u)

0

⎞⎠ = 0, (5.2.10)

Page 153: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 135

5.2. Newton method 135

where (δy, δu, δλ) ∈ Y × U ×W ∗ and the equality is understood in Y ∗ × U ∗ ×W .We now state the Newton iteration for (5.2.1).

Newton Method.

(i) Initialization: Choose u0 ∈ V (u∗), solve

e(y, u0) = 0, ey(y, u0)∗λ+ Jy(y, u0) = 0 for (y0, λ0),

and set k = 0(ii) Newton step: Solve for (δy, δu, δλ) ∈ Y × U ×W ∗

(L′′(yk, uk, λk) e′(yk, uk)

∗e′(yk, uk) 0

)⎛⎝ (δy

δu

)δλ

⎞⎠+⎛⎝ 0

eu(yk, uk)∗λk + Ju(yk, uk)

0

⎞⎠ = 0.

(iii) Update uk+1 = uk + δu.(iv) Feasibility step: Solve for (yk+1, λk+1) ∈ Y ×W ∗

e(y, uk+1) = 0, ey(yk+1, uk+1)∗λ+ Jy(y, uk+1) = 0.

(v) Stop, or set k = k + 1, and goto (ii).

The following theorem provides sufficient conditions for local quadratic convergence.Here we assume that ey(x∗) : Y → W is a bijection at a local solution x∗ = (y∗, u∗) of (P ).Then there corresponds a unique Lagrange multiplier λ∗.

Theorem 5.2. Let x∗ = (y∗, u∗) be a local solution to (P ) and let V (y∗)× V (u∗) denotea neighborhood in which J and e are C2 with Lipschitz continuous second derivatives. Letus further assume that ey(x∗) : Y → W is a bijection and that there exists κ > 0 such that

L′′(x∗, λ∗)(h, h) ≥ κ|h|2 for all h ∈ ker e′(x∗).

Then, if |u0 − u∗| is sufficiently small, |(yk+1, uk+1, λk+1) − (y∗, u∗, λ∗)| ≤ K|uk − u∗|2for a constant K independent of k = 0, 1, . . . .

Proof. Since ey(x∗) : Y → W is a homeomorphism and x → ey(x) is continuous from

X→ L(Y,X) , there exist r > 0 and γ > 0 such that⎧⎪⎪⎨⎪⎪⎩‖ey(x)−1‖L(W,Y ) ≤ γ for all x ∈ Br,

‖ey(x)−∗‖L(Y ∗,W ∗) ≤ γ for all x ∈ Br,

〈T ∗(x)L′′(x, λ)T (x)u, u〉 ≥ κ2 |u|2 for all x ∈ Br, |λ− λ∗| < r,

(5.2.11)

where Br = {(y, u) ∈ Y × U : |y − y∗| < r, |u − u∗| < r}. These estimates imply that,possibly after decreasing r > 0, there exists some K > 0 such that

|y(u)− y∗|Y ≤ K|u− u∗|U and |λ(u)− λ∗|W ∗ ≤ K|u− u∗|U for all |u− u∗| < r,

Page 154: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 136

136 Chapter 5. Newton and SQP Methods

where ey(y(u), u)∗λ(u) = −Jy(y(u), u). Thus for ρ = min(r, r

K) the inequalities in

(5.2.11) hold with x = (y(u), u) and λ = λ(u), provided that |u − u∗| < ρ. Let S(u) =T (y(u), u)∗L′′(y(u), u, λ(u))T (y(u), u). Then there exists μ > 0 such that

|S(u)− S(u∗)| ≤ μ|u− u∗| for |u− u∗| < ρ.

Let |u0−u∗| < min( κμ, ρ) and, proceeding by induction, assume that |uk−u∗| ≤ |u0−u∗|.

Then

S(uk)(uk+1 − u∗) = S(uk)(uk − u∗)− J ′(uk)+ J ′(u∗)

=∫ 1

0(S(uk + s(u∗ − uk))− S(uk))(u

∗ − uk) ds

(5.2.12)

and hence

|uk+1 − u∗| ≤ 2

κ

μ

2|uk − u∗|2 ≤ μ

κ|u0 − u∗| |u0 − u∗| ≤ |u0 − u∗| < ρ.

It further follows that |y(uk+1) − y∗|Y ≤ Kμ

κ|uk − u∗|2 and |y(uk+1) − y∗|Y ≤ K|uk+1 −

u∗| < Kρ ≤ r . In a similar manner we have |λ(uk+1) − λ∗|W ∗ ≤ Kμ

κ|uk − u∗|2 and

|λ(uk+1)− λ∗|W ∗ < r . The claim follows from these estimates.

Remark 5.2.1. If |λ∗|W ∗ is small, then it is suggested to approximate the Hessian ofL′′(y, u, λ) by

L′′(y, u, λ) ∼ J ′′(y, u)

and the reduced Hessian T ∗L′′T by

J ′′(u) ∼ T (y, u)∗J ′′(y, u)T (y, u).

Under the assumptions of Theorem 5.2, T ∗J ′′ T is positive definite in a neighborhood ofthe solution, provided that λ∗ is sufficiently small. Moreover, if λ∗ = 0, then the Newtonmethod converges superlinearly to (y∗, u∗), if it converges. Indeed we can follow the proofof Theorem 5.2 and replace (5.2.12) by

S0(uk)(uk+1 − u∗) = S(uk)(uk − u∗)− J ′(uk)+ J ′(u∗)

−T ∗(y(uk), uk)(e′′(xk)∗(λk − λ∗))T (y(uk), uk)(uk − u∗),

where S0(u) = T ∗(y(u), u) J ′′(y(u), u) T (y(u), u).

Remark 5.2.2. Note that the reduced Hessian S = T ∗L′′T is a Schur complement of thelinear system (5.2.10). That is, if we eliminate δy and δλ by

δy = −e−1y euδu, δλ = −(e∗y)−1(Lyyδy + Lyuδu),

we obtain (5.2.9) in the form

Sδu+ Lu(y, u, λ) = 0. (5.2.13)

Page 155: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 137

5.3. SQP and reduced SQP methods 137

Remark 5.2.3. If (5.2.13) is solved by an iterative method based on Krylov subspaces,then this uses Sr for given r ∈ U and requires that we perform the forward solution

δy = −e−1y (y, u)eu(y, u)r,

the adjoint solution

δλ = −(ey(y, u)∗)−1(Lyy δy + Lyur),

and evaluation of the expression

Sr = Luy δy + Luur + eu(y, u)∗δλ.

In applications where the discretization of U is much smaller than that of Y ×U ×W ∗ thisprocedure may offer a significant reduction in storage and execution time over solving thesaddle point problem (5.2.10).

5.3 SQP and reduced SQP methodsIn this section we describe the SQP (sequential quadratic programing) and reduced SQPmethods without entering into technical details. Throughout this section we set x = (y, u) ∈X = Y × U . Let x∗ denote a local solution to{

min J (x) over x ∈ X

subject to e(x) = 0.(5.3.1)

Further let

L(x, λ) = J (x)+ 〈λ, e(x)〉denote the Lagrangian and let λ∗ be a Lagrange multiplier at the solution x∗. Then anecessary first order optimality condition is given by

L′(y∗, u∗, λ∗) = 0, e(x∗) = 0, (5.3.2)

and a second order sufficient optimality condition by

L′′(x∗, λ∗)(h, h) ≥ σ |h|2X for all h ∈ ker e′(x∗), (5.3.3)

where σ > 0. Differently from the Newton method described in the previous section,in the SQP method both y and u are considered as independent variables related by theequality constraint e(y, u) = 0 which is realized by a Lagrangian term. The SQP methodthen consists essentially in a Newton method applied to the necessary optimality condition(5.3.2) to iteratively solve for (y∗, u∗, λ∗). This results in determining updates from thelinear system(

L′′(yk, uk, λk)e′(yk, uk)

∗e′(yk, uk) 0

)⎛⎝(δy

δu

)δλ

⎞⎠+⎛⎝(

ey(yk, uk)∗λk + Jy(yk, uk)

eu(yk, uk)∗λk + Ju(yk, uk)

)e(yk, uk)

⎞⎠ = 0.

(5.3.4)

Page 156: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 138

138 Chapter 5. Newton and SQP Methods

If the values for yk and λk are obtained by means of a feasibility step as in (iv) of the Newtonalgorithm, then the first and the last components on the right-hand side of (5.3.4) are 0 andwe arrive at the Newton algorithm. This will be further investigated in Section 5.5 forweakly singular problems. Note that for affine constraints the iterates, except possibly theinitialization, are feasible by construction, and hence e(xk) = 0.

Let us note that given an iterate xk = (yk+1, uk+1) near x∗ the SQP step for (5.3.1) isalso obtained as the necessary optimality to the quadratic subproblem⎧⎪⎨⎪⎩

min〈L′(xk, λk), h〉 + 1

2L′′(xk, λk)(h, h)

subject to e′(xk)h+ e(xk) = 0, h ∈ X,

(5.3.5)

for hk = (δyk, δuk) and setting xk+1 = xk + hk .If (5.3.1) admits a solution (x∗, λ∗) and J and e are C2 in a neighborhood of x∗ with

Lipschitz continuous second derivatives, and if further e′(x∗) is surjective and there existsκ > 0 such that

L′′(x∗, λ∗)(h, h) ≥ κ|h|2X for all h ∈ ker e′(x∗),

then

|(xk+1, λk+1)− (x∗, λ∗)|2 ≤ K|(xk, λk)− (x∗, λ∗)|for a constant K independent of k, provided that |(x0, λ0)− (x∗, λ∗)| is sufficiently small.Thus the SQP method is locally quadratically convergent. The proof is quite similar to theone of Theorem 6.4.

Assume next that there exists a null-space representation of e′(x) for all x in a neigh-borhood of x∗; i.e., there exist a Hilbert space H and an isomorphism T (x) from H ontoker e′(x), where ker e′(x∗) is endowed with the norm of X. Then (5.3.2) and (5.3.3) can beexpressed as

T (x∗)∗J ′(x∗) = 0 (5.3.6)

and

T (x∗)∗L′′(x∗, λ∗)T (x∗) ≥ σ I, (5.3.7)

respectively.Referring to (5.3.5), let qk ∈ ker e′(xk)⊥ ⊂ X satisfy e′(xk)qk + e(xk) = 0. Then

hk ∈ X can be expressed ashk = qk + T (xk)w.

Thus, (5.3.5) is reduced to

min〈L′(xk, λk), T (xk)w〉 + 〈T (xk)w,L′′(xk, λk)qk〉

+1

2〈T (xk)w,L′′(xk, λk)T (xk)w〉,

(5.3.8)

Page 157: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 139

5.3. SQP and reduced SQP methods 139

and the solution hk to (5.3.5) can be expressed as

hk = qk + T (xk)wk, (5.3.9)

where wk is a solution to the unconstrained problem (5.3.8) in H , given by

T ∗(xk)L′′(xk, λk)T (xk)wk = −T ∗(xk)(J ′(xk)− L′′(xk, λk)Rk(xk)e(xk)). (5.3.10)

Therefore the full SQP step is decomposed into a minimization step in H , a restoration stepto the linearized equation

e′(xk)q + e(xk) = 0,

and an update of the Lagrange multiplier according to

e′(xk)∗ δλ = −e′(xk)∗λk − Jy(xk)− L′′(xk, λk)hk.

If e′(xk) ∈ L(X,W) admits a right inverse R(xk) ∈ L(W,X) satisfying e′(xk)R(xk) = IW ,then qk = −R(xk)e(xk) in (5.3.9) and

hk = −Rke(xk)− Tk

(T ∗k L′′

(xk, λk)Tk)−1T ∗k (J

′(xk)− L′′(xk, λk)Rke(xk)), (5.3.11)

where Tk = T (xk) and Rk = R(xk) and we used that T ∗w = 0 for w ∈ range e′(x)∗. Notethat a right inverse to e′(x) for x in a neighborhood of x∗ exists if e′(x∗) is surjective andx → e′(x) is continuous.

An alternative to deriving the update wk is given by differentiating

T (x)∗J ′(x) = T (x)∗L′(x, λ)

with respect to x, evaluating at x = x∗, λ = λ∗, and using (5.3.2):

d

dx(T (x∗)∗J ′(x∗)) = T (x∗)∗L′′(x∗, λ∗).

This representation holds in general only at the solution x∗. But if we use its structure foran update of the form hk = qk + T (xk)wk in a Newton step to (5.3.6), we arrive at

T ∗k L′′(xk, λk)hk = T ∗k L′′(xk, λk)(Tkwk − R(xk)e(xk)) = −T ∗k J ′(xk),which results in an update as in (5.3.10) above.

In the reduced SQP approach the term L′′(xk, λk)Rke(xk) is deleted from the expres-

sion in (5.3.11). Recall that for Newton’s method this term vanishes since e(xk) = 0 at eachiteration level. This results in

T ∗k L′′(xk, λk)T (xk)wk = −T ∗k J ′(xk). (5.3.12)

This equation for the update of the control coincides with the Schur complement form ofthe Newton update given in (5.2.13), since

Lu(xk, λk) = e∗u(xk)λk + Ju(xk) = T ∗k J′(xk),

Page 158: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 140

140 Chapter 5. Newton and SQP Methods

where we used the fact that in the Newton method e∗y(xk)λk + Jy(xk) = 0. The updates forthe state and the adjoint state differ, however, for the Newton and the reduced SQP methods.

A second distinguishing feature for reduced SQP methods is that often the reducedHessians T ∗k L′′(xk, λk)Tk are approximated by invertible operators Bk ∈ L(H) suggestedby secant methods. Thus a reduced SQP step has the form xk+1 = xk + hRED

k , where

hREDk = −Rk e(xk)− Tk B

−1k T ∗k J ′(xk). (5.3.13)

For the reduced SQP method the step hREDk in general depends on the specific choice of

the null-space representation and the right inverse. In the full SQP method the step hk in(5.3.11) is invariant with respect to these choices. A third distinguishing feature of reducedSQP methods is the choice of the Lagrange multiplier, which is required to specify theupdate of Bk . The λ-update is typically not derived from the first equation in (5.3.4). Fromthe first order condition we have

〈J ′(x∗), R(x∗)h〉 + 〈λ∗, e′(x∗)R(x∗)h〉 = 〈J ′(x∗), R(x∗)h〉 + 〈λ∗, h〉 = 0

for all h ∈ W.

This suggests the λ-update

λ+ = −R(x)∗ J ′(x). (5.3.14)

Other choices for the Lagrange multiplier update are possible. For convergence proofs ofreduced SQP methods this update is required to be locally Lipschitz continuous; see [Kup,KuSa].

In the case of problem (P ), a vector (δy, δu) ∈ X lies in the null-space of e′(x) if

ey(x)δy + eu(x)δu = 0.

Assuming that ey(x) : Y → W is invertible, we have

ker e′(x) = {(δy, δu) : δy = −ey(x)−1eu(x) δu}.

This suggests choosing H = U and using the following definitions for the null-spacerepresentation and the right inverse:

T (x) = (−ey(x)−1eu(x), I ), R(x) = (ey(x)−1, 0). (5.3.15)

In this case a reduced SQP step can be decomposed as follows:

(i) solve Bk δu = −T (xk)∗J ′(xk),

(ii) solve ey(xk)δy = −e′u(xk)δu− e(xk),

(iii) set xk+1 = xk + (δy, δu),

(iv) update λ.

Page 159: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 141

5.3. SQP and reduced SQP methods 141

The update of the Lagrange multiplier λ is needed for the approximation Bk to the reducedHessian T ∗(xk)L′′(xk, λk)T (xk). A popular choice is given by the BFGS-update formula

Bk+1 = Bk + 1

〈zk, δuk〉 〈zk, ·〉zk −1

〈δuk, Bkδuk〉 〈Bkδuk, ·〉Bkδuk,

where

zk = T ∗(xk)L′(xk + T (xk)δuk, λk

)− J ′(xk).

Each SQP step requires at least three linear systems solves, one in W ∗ for the evaluation ofT (xk)

∗J ′(xk), one inU for δu, and another one in Y for δy. The update of the BFGS formularequires one additional system solve. The typical convergence behavior that can be provedfor reduced SQP methods with BFGS update is local two-step superlinear convergence[Kup, KuSa, KSa],

limk→∞

|xk+1 − x∗||xk−1 − x∗| = 0

provided that

B0 − T (x∗)∗L′′(x∗, λ∗)T (x∗) is compact. (5.3.16)

Here B0 denotes the initialization to the reduced Hessian approximation, and (x∗, λ∗) is asolution to (5.1.4). If e is C1 at x∗ and ey(x

∗) is continuously invertible and compact, then(5.3.16) holds for the choice B0 = Luu(x

∗, λ∗). In [HiK2], condition (5.3.16) is analyzedfor a class of optimal control problems related to the stationary Navier–Stokes equations.We recall that condition (5.3.16) is related to an analogous condition in the context of solvingnonlinear equations F(x) = 0 in infinite-dimensional Hilbert spaces by means of secantmethods, in particular the Broyden method. The initialization of the Jacobian must be acompact perturbation of the linearization ofF at the solution to ascertain localQ-superlinearconvergence [Grie, Sa].

If we deal with finite-dimensional problems with e a mapping fromRN intoRM , withM < N , then a common null-space representation is provided by the QR decomposition.Let H = RN−M and

e′(x)T = Q(x)R(x) = (Q1(x) Q2(x))

(R1(x)

0

),

where Q(x) is orthogonal, Q1(x) ∈ RN×M , Q2(x) ∈ RN×(N−M), and R1(x) ∈ RM×M is anupper triangular matrix. Then the null-space representation and the right inverse to e′(x)are given by

T (x) = Q2(x) ∈ RN×(N−M) and R(x) = Q1(x)(RT1 (x))

−1 ∈ RN×M.

In the finite-dimensional setting one often works with the “least squares solution” for theλ-update, i.e., with the solution to

minλ|J ′(x)+ e′(x)∗λ|,

which results in λ(x)+ = −R−11 (x)QT

1 (x)J′(x).

Page 160: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 142

142 Chapter 5. Newton and SQP Methods

To derive another interesting feature of a variant of a reduced SQPmethod we recapturethe derivation from above for (P ) with T and R chosen as in (5.3.15), as⎧⎪⎪⎨⎪⎪⎩

T (x)∗L′′(x, λ)T (x)w = −T (x)∗L′(x, λ),

h = (δy, δu) = q + T (x)w, where q − R(x)e(x) = 0,

λ+ δλ = −R(x)∗J ′(x).

(5.3.17)

In particular, the term L′′(xk, λk)Rke(xk) is again deleted from the expression in (5.3.10)

and the Lagrange multiplier update is chosen according to (5.3.14). However, differentlyfrom (5.3.13), the reduced Hessian is not approximated in (5.3.17). Here we do not indicatethe iteration index k, and we assume that ey(x) : Y → W is bijective and that (5.3.7) holds.

Then (q,w, δλ) is a solution to (5.3.17) if and only if (δy, δu, δλ) is a solution to thesystem⎛⎝ 0 0 ey(x)

∗0 T (x)∗L′′(x, λ)T (x) eu(x)

∗ey(x) eu(x) 0

⎞⎠⎛⎝ δy

δu

δλ

⎞⎠ = −⎛⎝ Ly(x, λ)

Lu(x, λ)

e(x)

⎞⎠ , (5.3.18)

with w = δu, q = −R(x)e(x), and (δy, δu) = q + T (x)δu. In fact, R∗(x) is a left inverseto e′(x)∗, and from the first equation in (5.3.18) we have

δλ = −(ey(x)∗)−1(Jy(x)+ e∗y(x)λ) = −R(x)∗J ′(x)+ λ.

From the second equation

T (x)∗L′′(x, λ)T (x)δu = −Lu(x, λ)− eu(x)∗δλ

= −Lu(x, λ)+ eu(x)∗ey(x)−∗Ly(x, λ) = −T (x)∗L′(x, λ) = −T (x)∗J ′(x).

The third equation is equivalent to

(δy, δu) = −R(x)e(x)+ T (x)δu.

System (5.3.18) should be compared to (5.3.4). We note that the reduced SQP step isequivalent to a block tridiagonal system, where the “Luu” element in the system matrix isreplaced by the reduced Hessian while the elements corresponding to “Lyy” and “Lyu” arezero. The system matrix of (5.3.18) can be used advantageously as preconditioning foriteratively solving (5.3.4). In fact, let us denote the system matrices in (5.3.4) and (5.3.18)by S and Sred , respectively, and consider the iteration

δzn+1 = δzn − S−1red

(S δzk + col (Ly(x, λ),Lu(x, λ), e(x))

). (5.3.19)

In [IKSG] it was proved that the iteration matrix I −S−1red S is nilpotent of degree 3. Hence

the iteration (5.3.19) converges in 3 steps to the solution of (5.3.4) with (xk, λk) = (x, λ).

Page 161: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 143

5.4. Optimal control of the Navier–Stokes equations 143

5.4 Optimal control of the Navier–Stokes equationsIn this section we consider the optimal control problem⎧⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎩

min J (y, u) = ∫ T

0

(�(y(t))+ h(u(t))

)dt

over u ∈ U = L2(0, T ; U ) subject to

ddty(t) = A0y(t)+ F(y(t))+ Bu(t) for t ∈ (0, T ],

y(0) = y0,

(5.4.1)

where A0 is a densely defined, self-adjoint operator in a Hilbert space H which satisfies(−A0φ, φ)H ≥ α|φ|2H for some α > 0 independent of φ ∈ domA0. We set V =dom((−A0)

12 ) endowed with |φ|V = |(−A0)

12 φ|H as norm. The V -coercive form (v,w)→

((−A0)12 v, (−A0)

12 w) also defines an operator A ∈ L(V , V ∗) satisfying 〈−Av,w〉V ∗,V =

((−A0)12 v, (−A0)

12 w)H for all v,w ∈ W . Since A and A0 coincide on dom A0, and

dom A0 is dense in V , we shall not distinguish between A and A0 as operators in L(V , V ∗).In (5.4.1), moreover, U is the Hilbert space of controls, B ∈ L(U , V ∗), and y0 ∈ H . We as-sume that h ∈ C1(U ,R) with Lipschitz continuous first derivative and � ∈ C1(V ,R), with�′ Lipschitz continuous on bounded subsets of H . The nonlinearity F is supposed to satisfy

(H1)

⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

F : V → V ∗ is continuously differentiable and there exists

a constant c > 0 such that for every ε > 0

〈F(y)− F(y), y − y〉V ∗,V ≤ ε|y − y|2V + cε(|y|2V + |y|2H |y|2V )|y − y|2H ,

〈F ′(y)y, y〉V ∗,V ≤ ε|y|2V + cε|y|2H |y|2V (1+ |y|2H ),

|(F ′(y)− F ′(y))v|V ∗ ≤ c|y − y| 12H |y − y| 1

2V |v|

12H |v|

12V

for all y, y, and v in V.

(H2)

⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

For any u ∈ L2(0, T ; U ) there exists a unique weak

solution y = y(u) in

W(0, T ) = L2(0, T ;V ) ∩H 1(0, T ;V ∗) satisfying

〈 ddty(t), ψ〉V ∗,V = 〈A0y(t)+ F(y(t))+ Bu(t), ψ〉V ∗,V

for all ψ ∈ V, and y(0) = y0.

Moreover {y(u) : |u|L2(0,T ;U ) ≤ r} is bounded in

W(0, T ) for each r > 0.

With these preliminaries the dynamical system in (5.4.1) can be considered with valuesin V ∗. The conditions on A0 and F are motivated by the two-dimensional incompressible

Page 162: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 144

144 Chapter 5. Newton and SQP Methods

Navier–Stokes equations with distributed control given by⎧⎪⎪⎨⎪⎪⎩ddty + (y · ∇)y + ∇p = ν∇y + Bu in (0, T ] ×�,

∇y = 0 in (0, T ] ×�,

y(0, ·) = y0 in �,

(5.4.2)

where � is a bounded domain with Lipschitz continuous boundary ∂�, y = y(t, x) ∈ R2 isthe the velocity field, p = p(t, x) ∈ R is the pressure, and ν is the normalized viscosity. Let

V = {φ ∈ H 10 (�)2 : ∇ · φ = 0}, H = {φ ∈ L2(�)2 : ∇ · φ = 0, n · φ = 0 on δ�},

where n is the outer unit normal vector to ∂�, let � denote the Laplacian in H , and let PF

denote the orthogonal projection of L2(�)2 onto the closed subspace H . Then A0 = νP�

is the Stokes operator inH . It is a self-adjoint operator with domain dom((−A0)1

2 ) = V and

−(A0φ,ψ)H = ν(∇φ,∇ψ)L2 for all φ ∈ dom(A0), ψ ∈ V.

If ∂� is sufficiently smooth, e.g., ∂� is C2, then dom(A0) = H 2(�)2∩V. The nonlinearityin (5.4.2) satisfies the following properties:∫

(u · ∇v) w dx =∫�

(u · ∇w) v dx, (5.4.3)

∫�

(u · ∇v)w dx ≤ c|u| 12H |u|

12V |v|V |w|

12H |w|

12V (5.4.4)

for a constant c independent of u, v,w ∈ V ; see, e.g., [Te]. In terms of the generalformulation (5.4.1) the nonlinearity F : V → V ∗ is given by

(F (φ), v)V ∗,V = −∫�

(φ · ∇)φ v dx.

It is well defined due to (5.4.4). Moreover (H1) follows from (5.4.3) and (5.4.4). From thetheory of variational solutions to the Navier–Stokes equations it follows that there existsa constant C such that for all y0 ∈ H and u ∈ L2(0, T ; U ) there exists a unique solutiony ∈ W(0, T ) such that

|y|C(0,T ;H) + |y|W(0,T ) ≤ C(|y0|H + |u|L2(0,T ;U ) + |y0|2H + |u|2L2(0,T ;U )),

where |y|W(0,T ) = |y|L2(0,T ;V )+| ddt y|L2(0,T ;V ∗) and we recall that C([0, T ];H) is embeddedcontinuously into W(0, T ); see, e.g., [LiMa, Tem]. Assuming the existence of a local solu-tion to (5.4.1) we shall present in the following subsections first and second order optimalityconditions, as well as the steps which are necessary to realize the Newton algorithm.

Control and especially optimal control has received a considerable amount of attentionin the literature. Here we only mention a few [Be, FGH, Gu, HiK2] and refer the reader tofurther references given there.

Page 163: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 145

5.4. Optimal control of the Navier–Stokes equations 145

5.4.1 Necessary optimality condition

Let x∗ = (y∗, u∗) denote a local solution to (5.4.1). We shall derive a first order optimalitycondition by verifying the assumptions of Theorem 5.1 and applying (5.1.4). In the contextof Section 5.1 we set

Y = W(0, T ) = L2(0, T ;V ) ∩H 1(0, T ;V ∗),W = L2(0, T ;V ∗)×H, U = L2(0, T ; U ),

J (y, u) = ∫ T

0

(�(y(t))+ h(u(t))

)dt,

e(y, u) = (yt − A0y − F(y)− Bu, y(0)).

To verify (C1)–(C3) with (y, u) = (y∗, u∗), let V (y∗)×V (u∗) denote a bounded neighbor-hood of (y∗, u∗). Let V (y∗) be chosen such that y(u) ∈ V (y∗) for every u ∈ V (u∗). Thisis possible by (H2). Since Y = W(0, T ) is embedded continuously into C([0, T ];H), thecontinuity assumptions for � and h imply the continuity requirements J in (C1). Note thate′(y, u) : Y × U → W is given by

e′(y, u)(δy, δu) = ((δy)t − A0δy − F ′(y)δy − Bδu, δy(0)),

and global Lipschitz continuity of (y, u) → e′(y, u) from V (y∗) × V (u∗) ⊂ Y × U toL(Y × U,W) follows from the last condition in (H1). Solvability of e(y, u) = 0 withrespect to y for given u follows from (H2) and hence (C1) holds. Since, by (H1), thebounded bilinear form a(t; ·, ·) : V × V → R defined by

t → a(t;φ,ψ) = −〈A0φ,ψ〉V ∗,V − 〈F ′(y∗(t))φ, ψ〉V ∗,Vsatisfies

a(t;φ, φ) ≥ |φ|2V − ε|φ|2V −c

ε|φ|2V |y∗(t)|2V (1+ |y∗|2C(0,T ;H)),

there exists a constant c > 0 such that

a(t;φ, φ) ≥ 1

2|φ|2V − c|y∗(t)|2V |φ|2H for t ∈ (0, T ),

where |y∗|2V ∈ L1(0, T ). Consequently, the adjoint equation{ − ddtp∗(t) = A0p

∗(t)+ F ′(y∗(t))∗p∗(t)+ �′(y∗(t)),

p∗(T ) = 0(5.4.5)

admits a unique solution p∗ ∈ W(0, T ), with

|p∗(t)|2H +∫ T

t

|p∗(s)|2V ds ≤ eC∫ T

0 |y∗(s)|2V ds∫ T

t

|�′(y∗(s))|2V ∗ds (5.4.6)

for a constant C independent of t ∈ [0, T ]. In fact,

−1

2

d

dt|p∗(t)|2H + |p∗(t)|2V ≤ |〈F ′(y∗(t))p∗(t), p∗(t)〉V ∗,V | + |�′(y∗(t))|V ∗ |p∗(t)|V .

Page 164: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 146

146 Chapter 5. Newton and SQP Methods

Hence by (H1) there exists a constant C such that

− d

dt|p∗(t)|2H − C|y∗(t)|2V |p∗(t)|2H + |p∗(t)|2V ≤ |�′(y∗(t))|2V ∗ .

Multiplying by exp(− ∫ T

tρ(s)ds), where ρ(s) = C|y∗(s)|2V , we find

|p∗(t)|2H +∫ T

t

|p∗(s)|2V exp∫ s

t

ρ(τ )dτ ds ≤∫ T

t

|�′(y∗(s))|2V ∗ exp∫ s

t

ρ(τ )dτ ds

and (5.4.6) follows. This implies (C2) with λ∗ = (p∗, p∗(0)). For any y ∈ W(0, T ) wehave

d

dt|y(t)|2H = 2

⟨d

dty(t), y(t)

⟩V ∗,V

for t ∈ (0, T ). (5.4.7)

Let u ∈ V (u∗) and denote by y = y(u) ∈ V (y∗) the solution to the dynamical system in(5.4.1). From (5.4.7) we have

1

2

d

dt|y(t)− y∗(t)|2H + |y(t)− y∗(t)|2V

= 〈F(y(t))− F(y∗(t)), y(t)− y∗(t)〉V ∗,V + 〈Bu(t)− Bu∗(t), y(t)− y∗(t)〉V ∗,V .

By (H2) the set {y(u) : u ∈ V (u∗)} is bounded in W(0, T ). Hence by (H1) there exists aconstant C independent of u ∈ V (u∗) and t ∈ [0, T ] such that

d

dt|y(t)− y∗(t)|2H + |y(t)− y∗(t)|2V ≤ C(ρ(t)|y(t)− y∗(t)|2H + |u(t)− u∗(t)|2

U),

where ρ(t) = |y(t)|2V + |y∗(t)|2V . By Gronwall’s inequality this implies that

|y(t)− y∗(t)|2H +∫ t

0|y(t)− y∗(t)|2V ds ≤ C exp

(C

∫ T

0ρ(τ)dτ

)∫ t

0|u(s)− u∗(s)|2ds,

where∫ T

0 ρ(τ)dτ is bounded independent of u ∈ V (u∗). Utilizing the equations satisfiedby y(u) and y(u∗) it follows that

|y(u)− y(u∗)|W(0,T ) ≤ C|u− u∗|L2(0,T ;U ) (5.4.8)

for a constant C independent of u ∈ V (u∗). Hence (C3) follows and Theorem 5.1 impliesthe optimality system

d

dty∗(t) = A0y

∗(t)+ F(y∗(t))+ Bu∗(t), y(0) = y0,

− d

dtp∗(t) = A0(y

∗(t))+ F ′(y∗(t))∗p∗(t)+ �′(y∗(t)), p∗(T ) = 0,

B∗p∗(t)+ h′(u∗(t)) = 0.

Page 165: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 147

5.4. Optimal control of the Navier–Stokes equations 147

5.4.2 Sufficient optimality condition

We assume J to be of the form

J (y, u) = 1

2

∫ T

0(Q(y(t)− y(t)), y(t)− y(t))Hdt + β

2

∫ T

0|u(t)|2

Udt,

where Q is symmetric and positive semidefinite on H , y ∈ L2(0, T ;H), and β > 0. Weshall verify the requirements in (II) of Section 5.1. By (5.4.8) it follows that

E1(y(u)− y∗, u− u∗) ≥ α(|y(u)− y∗|2W(0,T ) + |u− u∗|2L2(0,T ;U )

)

for all u ∈ V (u∗).(5.4.9)

Moreover,

|e(y(u), u)− e(y∗, u∗)− e′(y∗, u∗)(y(u)− y∗, u− u∗)|L2(0,T ;V ∗)×H

= |F(y(u))− F(y∗)− F ′(y∗)(y(u)− y∗)|L2(0,T ,V ∗)

=(∫ T

0|F(y)− F(y∗)− F ′(y∗)(y(u)− y∗)|2V ∗

) 12

≤ c

(∫ T

0

(∫ 1

0s|y(t)− y∗(t)|H |y(t)− y∗(t)|V ds

)2

dt

) 12

≤ c√2|y − y∗|C(0,T ;H)|y − y∗|L2(0,T ;V ),

and therefore

|e(y(u), u)− e(y∗, u∗)− e′(y∗, u∗)(y(u)− y∗)|L2(0,T ;V ∗)×H

≤ C|y(u)− y∗|2W(0,T )

(5.4.10)

for a constant C independent of u ∈ V (u∗). From (5.4.6)

|p∗|L2(0,T ;V ) ≤ exp

(C

2|y∗|L2(0,T ;V )

)|Q(y∗ − y)|L2(0,T ;V ∗). (5.4.11)

Combining (5.4.9)–(5.4.11) implies that if |Q(y∗ − y)|L2(O,T ,V ∗) is sufficiently small, then(y∗, u∗) is a strict local minimum.

5.4.3 Newton’s method for (5.4.1)

Here we specify the steps necessary to carry out one Newton iteration for the optimal controlproblem related to the Navier–Stokes equation. Let u ∈ L2(0, T ; U ) and let y = y(u) ∈W(0, T ) denote the associated velocity field. For δu ∈ L2(0, T ; U ) let δy, λ, and μ inW(0, T ) denote the solutions to the sensitivity equation

d

dtδy = A0δy + F ′(y)δy + Bδu, δy(0) = 0, (5.4.12)

Page 166: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 148

148 Chapter 5. Newton and SQP Methods

the adjoint equation

− d

dtλ = A0λ+ F ′(y)∗λ+ �′(y), λ(T ) = 0, (5.4.13)

and the adjoint equation arising in the Hessian

− d

dtμ = A0μ+ F ′(y)∗μ+ (F ′′(y)∗λ)δy + �′′(y)δy, μ(T ) = 0. (5.4.14)

Then the operators characterizing the Newton step

J ′′(u) δu = −J ′(u) (5.4.15)

can be expressed by

J ′(u) = h′(u)+ B∗λ

and

J ′′(u)δu = h′′(u)δu+ B∗μ.

Let us also note that

−〈F ′(y)δy, ψ〉V ∗,V = b(y, δy, ψ)+ b(δy, y, ψ),

−〈F ′(y)∗λ,ψ〉 = b(y, ψ, λ)+ b(ψ, y, λ),

−〈(F ′′(y)∗λ) δy, ψ〉 = b(δy, ψ, λ)+ b(ψ, δy, λ)

for all ψ ∈ V , where b(u, v,w) = ∫�(u · ∇)v w dx. The evaluation of the gradient J ′(u)

requires one forward in time solve for y(u) and one adjoint solve for λ. Each evaluation ofthe Hessian necessitates one forward solve of the sensitivity equation for δy and one solveof the second adjoint equation (5.4.14) for μ.

If the approximation as in Remark 5.2.1 is used, then this results in not including theterm (F ′′(y)∗λ)δy in the second adjoint equation and each evaluation requires us to solvethe linear time-varying Hamiltonian system

d

dty = A(t) δy + B δu, δy(0) = 0,

− d

dtμ = A(t)∗μ+ �′′(y)δy, μ(T ) = 0,

where A(t) = A0 + F ′(y(t)).

5.5 Newton method for the weakly singular caseIn this section we discuss the Newton method for the weakly singular case as introducedin Section 1.5, and we utilize the notation given there. In particular J : Y × U → R, e :Y1 × U → W , with ey(y, u) considered as an operator in Y . Assuming that (y, u, λ) ∈Y1 × U × dom(ey(y, u)

∗), the SQP step for (δy, δu, δλ) is given by⎛⎝L′′(y, u, λ) e′(y, u)∗

e′(y, u) 0

⎞⎠⎛⎝δy

δu

δλ

⎞⎠ = −⎛⎝ey(y, u)

∗λ+ Jy(y, u)

eu(y, u)∗λ+ Ju(y, u)

e(y, u)

⎞⎠ .

Page 167: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 149

5.5. Newton method for the weakly singular case 149

The updates (y+δy, u+δu, λ+δλ) will not necessarily remain in Y1×U×dom(ey(y, u)∗)

since it is not assumed that e′ is surjective from Y1×U to W . However, the feasibility stepsconsisting in solving the primal equation

e(y, u+) = 0 (5.5.1)

for y = y+ and the adjoint equation

ey(y+, u+)∗λ+ Jy(y

+, u+) = 0 (5.5.2)

for the dual variableλ = λ+ will guarantee that (y+, λ+) ∈ Y1×Z1 holds. Hereu+ = u+δu

and Z1 ⊂ dom (ey(y+, u+)∗) denotes a Banach space densely embedded into W ∗, with

λ∗ ∈ Z1. Since Y1 and Z1 are contained in Y and W ∗, the feasibility steps (5.5.1)–(5.5.2)can also be considered as smoothing steps. Thus we obtain the Newton method for thesingular case.

Algorithm

• Initialization: Choose u0 ∈ V (u∗), solve

e(y, u0) = 0, ey(y0, u0)∗λ+ Jy(y0, u0) = 0 for (y0, λ0) ∈ Y1 × Z1,

and set k = 0.

• Newton step: Solve for (δy, δu, δλ) ∈ Y × U ×W ∗⎛⎝L′′(yk, uk, λk) e′(yk, uk)∗

e′(yk, uk) 0

⎞⎠⎛⎝δy

δu

δλ

⎞⎠ = −⎛⎝ 0eu(yk, uk)

∗λk + Ju(yk, uk)

0

⎞⎠ .

• Newton update: uk+1 = uk + δu.

• Feasibility step: Solve for (yk+1, λk+1) ∈ Y1 × Z1:

e(y, uk+1) = 0, ey(yk+1, uk+1)∗λ+ Jy(yk+1, uk+1) = 0.

• Stop, or set k = k + 1, and goto the Newton step.

Remark 5.5.1. The algorithm is essentially the Newton method of Section 5.2. Because ofthe feasible step, the first and the last components on the right-hand side of the SQP step arezero, and Newton’s method and the SQP method coincide. Let us point out that the SQPiteration may not be well defined without the feasibility step since the updates (y+, λ+) mayonly be in Y ×W ∗, while e and e′y(y+, u+)∗ are not necessarily well defined on Y ×U andW ∗.

We next specify the assumptions which justify the above derivation and under whichwell-posedness and convergence of the algorithm can be proved. Thus let (y∗, u∗, λ∗) ∈Y1 × U × Z1 be a solution to (5.1.4), or equivalently to (1.5.3), and let

V (y∗)× V (u∗)× V (λ∗) ⊂ Y1 × U × Z1

be a convex bounded neighborhood of the solution triple (y∗, u∗, λ∗).

Page 168: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 150

150 Chapter 5. Newton and SQP Methods

(H5) (a) For every u ∈ V (u∗) there exists a unique solution y = y(u) ∈ V (y∗) ofe(y, u) = 0. Moreover, there exists M > 0 such that |y(u)− y∗|Y1 ≤ M|u− u∗|U .

(b) For every (y, u) ∈ V (y∗) × V (u∗) there exists a unique solution λ = λ(y, u) ∈V (λ∗) of ey(y, u)∗λ+Jy(y, u) = 0 and |λ(y, u)−λ∗|Z1 ≤ M|(y, u)−(y∗, u∗)|Y1×U .

(H6) J is twice continuously Fréchet differentiable on Y × U with the second derivativelocally Lipschitz continuous.

(H7) The operator e : V (y∗) × V (u∗) ⊂ Y1 × U → W is Fréchet differentiable withLipschitz continuous Fréchet derivative e′(y, u) ∈ L(Y1 × U,W). Moreover, foreach (y, u) ∈ V (y∗)× V (u∗) the operator e′(y, u) with domain in Y ×U has closedrange.

(H8) For every λ ∈ V (λ∗) the mapping (y, u)→ 〈λ, e(y, u)〉W ∗,W from V (y∗)×V (u∗) toR is twice Fréchet differentiable and the mapping (y, u, λ)→ 〈λ, e′′(y, u)(·, ·)〉W ∗,Wfrom V (y∗) × V (u∗) × V (λ∗) ⊂ Y1 × U × Z1 → L(Y1 × U, Y ∗ × U ∗) is Lips-chitz continuous. Moreover, for each (y, u, λ) ∈ Y1 × U × Z1, the bilinear form〈λ, e′′(y, u)(·, ·)〉W ∗,W can be extended as a continuous bilinear form on (Y × U)2.

(H9) For every (y, u) ∈ V (y∗) × V (u∗) ⊂ Y1 × U the operator e′(y, u) can be extendedas continuous linear operator from Y × U to W , and the mapping (y, u)→ e′(y, u)from V (y∗) × V (u∗) ⊂ Y1 × U → L(Y × U,W) is continuous and (y, u, λ) →(λ, e′′(y, u)(·, ·)) fromV (y∗)×V (u∗)×V (λ∗) ⊂ Y1×U×Z1 → L(Y×U, Y ∗×U ∗)is continuous.

(H10) There exists κ > 0 such that

〈L′′(y∗, u∗, λ∗)v, v〉Y ∗×U∗,Y×U ≥ κ |v|2Y×Ufor all v ∈ ker e′(y∗, u∗) ⊂ Y × U .

(H11) e′(y∗, u∗) ∈ L(Y × U,W) is surjective.

Condition (H5) requires well-posedness of the primal and the adjoint equation inY1, respectively, Z1. The adjoint equations arise from linearization of e at elements ofY1×U . Condition (H6) requires smoothness of J . In (H7) and (H8) the necessary regularityrequirements for e as mapping on Y1×U and in Y ×U are specified. From (H5) it followsthat the initialization as well as the feasibility step are well defined provided thatuk ∈ V (u∗).As a consequence the derivatives of J and e that are required for defining the Newton stepare taken at elements (yk, uk, λk) ∈ Y1 × U × Z1.

For x = (y, u, λ) ∈ V (y∗)× V (u∗)× V (λ∗) let A(x) denote the operator

A(x) =(

L′′(y, u, λ) e′(y, u)∗e′(y, u) 0

).

Conditions (H6) and (H9) guarantee that the operatorA(x) ∈ L(Y×U×W ∗, Y ∗×U ∗×W)

for x ∈ V (y∗)× V (u∗)× V (λ∗) . Conditions (H6), (H9)–(H11) imply that

Page 169: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 151

5.5. Newton method for the weakly singular case 151

(H12) there exist a neighborhood V (x∗) ⊂ V (y∗)× V (u∗)× V (λ∗) and M1, such that forevery x = (y, u, λ) ∈ V (x∗) and δw ∈ Y ∗ × U ∗ ×W

A(x)δx = δw

admits a unique solution δx ∈ Y × U ×W ∗ satisfying

|δx|Y×U×W ∗ ≤ M1 |δw|Y ∗×U∗×W .

Theorem 5.3. If (H5)–(H8) and (H12) hold at a solution (y∗, u∗, λ∗) ∈ Y1 × U × Z1 of(5.1.4) and |u0−u∗|U is sufficiently small, then the iterates of the algorithm are well definedand they satisfy

|(yk+1, uk+1, λk+1)− (y∗, u∗, λ∗)|Y1×U×Z1 ≤ K |uk − u∗|2U (5.5.3)

for a constant K independent of k.

Proof. Well-posedness of the algorithm in a neighborhood of (y∗, u∗, λ∗) follows from(H5) and (H12). To prove convergence let us denote by x∗ the triple (y∗, u∗, λ∗), andsimilarly δx = (δy, δu, δλ) and xk = (yk, uk, λk). Without loss of generality we assumethat min(M,M1) ≥ 1. The Newton step of the algorithm can be expressed as

A(xk)δx = −F(xk),

with F : Y1 × U × Z1 → Y ∗ × U ∗ ×W defined by

F(y, u, λ) = −(ey(y, u)∗λ+ Jy(y, u), eu(y, u)∗λ+ Ju(y, u), e(y, u)).

Due to the smoothing step the first and third coordinates of F are 0 at xk . By (H5) thereexists M > 0 such that

|yk − y∗|Y1 ≤ M |uk − u∗|U if uk ∈ V (u∗) (5.5.4)

and

|λk − λ∗|Z1 ≤ M |(yk, uk)− (y∗, u∗)|Y1×U if (yk, uk) ∈ V (y∗)× V (u∗), (5.5.5)

where (yk, λk) are determined by the feasibility step. Let us assume that xk ∈ V (x∗). Thenit follows from (H6)–(H8) that, possibly after increasing M , it can be chosen such that forevery xk ∈ V (y∗)× V (u∗)× V (λ∗)

|F(x∗)− F(xk)− F ′(xk)(x∗ − xk)|Y ∗×U∗×W

=∫ 1

0|(F ′(xk + s(x∗ − xk))− F ′(xk))(x∗ − xk)|Y ∗×U∗×W ds

=∫ 1

0|(A(xk + s(x∗ − xk))− A(xk))(x

∗ − xk)|Y ∗×U∗×W ds ≤ M

2|x∗ − xk|2Y1×U×Z1

.

Page 170: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 152

152 Chapter 5. Newton and SQP Methods

Moreover,

|A(xk)(xk + δx − x∗)| = |F(x∗)− F(xk)− F ′(xk)(x∗ − xk)| ≤ M

2|xk − x∗|2Y1×U×Z1

.

Consequently, by (H12)

|uk+1 − u∗|U ≤ |xk+1 − x∗|Y×U×W ∗ ≤ MM1

2|xk − x∗|2Y1×U×Z1

, (5.5.6)

provided that xk ∈ V (x∗). The proof will be completed by an induction argument withrespect to k. Let r be such that 2M5M1r < 1 and that |x− x∗| < 2M2r implies x ∈ V (x∗).Assume that |u0 − u∗| ≤ r . Then |y0 − y∗|Y1 ≤ Mr by (5.5.4) and |λ0 − λ∗|Z1 ≤

√2M2r

by (5.5.5). It follows that

|x0 − x∗|Y1×U×Z1 ≤ 2M2|u0 − u∗|2U ≤ 2M2r

and hence x0 ∈ V (x∗). Let |xk − x∗|Y1×U×Z1 ≤ 2M2r . Then from (5.5.6)

|uk+1 − u∗|U = 2M5M1r2 ≤ r. (5.5.7)

Consequently (5.5.4)–(5.5.6) are applicable and imply

|xk+1 − x∗|Y1×U×Z1 ≤ 4M2|uk+1 − u∗|2U ≤ M3M1|xk − x∗|2Y1×U×Z1. (5.5.8)

It follows that|xk+1 − x∗|Y1×U×Z1 ≤ 4M7M1|uk − u∗|2U ,

which implies (5.5.3) withK = 4M7M1. From (5.5.7)–(5.5.8) finally |xk+1−x∗|Y1×U×Z1 ≤2M2r .

Let us return now to some of the examples of Chapter 1, Section 1.5, and discuss theapplicability of conditions (H5)–(H11).

Example 1.19 revisited. Condition (H5)(a) is a direct consequence of Lemma 1.14. Con-dition (H5)(b) corresponds to

(∇λ,∇φ)+ (eyλ, φ)+ (y − z, φ) = 0 for all φ ∈ Y, (5.5.9)

given y ∈ Y1. Let Z1 = Y1 = Y ∩ L∞(�). It follows from [Tr, Chapter 2.3] and the proofof Lemma 1.14 that there exists a unique solution λ = λ(y) ∈ Z1 to (5.5.9). Moreover, ify ∈ V (y∗) and w = λ(y)− λ(y∗), then w ∈ Z1 satisfies

(∇w,∇φ)+ (ey∗w, φ)+ ((ey − ey

∗)λ, φ)+ (y − y∗, φ) = 0 for all φ ∈ Y.

From the proof of Lemma 4.1 it follows that there exists M > 0 such that

|λ(y)− λ(y∗)|Z1 ≤ M |y − y∗|Y1 for all y ∈ V (y∗)

and thus (H5)(b) is satisfied. It is simple to argue the validity of (H6)–(H9). Note thate′(y∗, u∗) is surjective from Y × U to W and thus (H11) is satisfied. As for (H10), thiscondition is equivalent to the requirement that

|δy|2L2(�) + (λ∗ey∗, (δy)2)+ β |δu|2U ≥ κ(|δy|2Y + |δu|2U)

Page 171: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 153

5.5. Newton method for the weakly singular case 153

for all (δy, δu) ∈ Y × U satisfying

(∇δy,∇φ)+ (ey∗δy, φ)− (δu, φ) = 0 for all φ ∈ Y, (5.5.10)

where λ∗ is the solution to (5.3.11) with y = y∗. Then there exists k > 0 such that

|δy|2Y ≤ k|δu|2

for all (δy, δu) satisfying (5.5.10). It follows that (H10) holds if |(λ∗)−|L∞ is sufficientlysmall. This is the case, for example, in the case of small residue problems, in the sense that|y∗ − z|2

L2(�)is small enough. If z ≥ y∗, then the weak maximum principle [Tr] is applied

to (5.5.9) and gives λ∗ ≥ 0.With quite analogous arguments it can be shown that (H5)–(H11) also hold for Ex-

ample 1.20 of Chapter 1.

Example 1.22 revisited. The constraint and adjoint equations are given by

(∇y,∇φ)− (u y,∇φ) = (f, φ) for all φ ∈ H 10 (�)

and

(∇λ,∇φ)+ (u λ,∇φ)+ (y − z, φ) = 0 for all φ ∈ Y = H 10 (�), (5.5.11)

where div u = 0, f ∈ L2(�), and z ∈ L2(�). As already discussed in Examples 1.15and 1.22 they admit unique solutions in Y1 = H 1

0 (�) ∩ L∞(�) for u ∈ U = L2n(�). Let

w1 = y(u)− y(u∗) and w2 = λ(y, u)− λ(y∗, u∗). Then

(∇w1,∇φ)− (uw1,∇φ) = ((u− u∗)y∗,∇φ)and

(∇w2,∇φ)+ (uw2,∇φ) = −((u− u∗) λ∗,∇φ)− (y − y∗, φ) for all φ ∈ Y.

From (1.4.22) it follows that (H5) holds if y∗ and λ∗ are in W 1,∞. Conditions (H6)–(H8) are easily verified. Here we address only the closed range property of e′(y, u), with(y, u) ∈ Y1 × U . For u ∈ C∞n (�) with div u = 0 surjectivity follows from the Lax–Milgram lemma, and a density argument asserts surjectivity for every u ∈ U . We turn to(H12) and assume that u ∈ C∞n (�), div u = 0 first. Then e′(y, u) ∈ L(V × U,H−1(�))

and e′(y, u)(δy, δu) = 0 can be expressed as

(∇δy,∇φ)− (δu y,∇φ)− (u δy,∇φ) = 0 for all φ ∈ Y.

Hence

|δy|Y ≤ |y|L∞|δu|U (5.5.12)

for all (δy, δu) ∈ ker e′(y, u). Henceforth we assume that

β − 2|y∗|L∞|λ∗|L∞ > 0.

Page 172: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 154

154 Chapter 5. Newton and SQP Methods

Then there exists a neighborhood V (y∗)× V (λ∗) of (y∗, λ∗) and κ > 0 such that

β − 2|y|L∞(�)|λ|L∞ ≥ κ (5.5.13)

for all (y, λ) ∈ V (y∗) × V (λ∗). For (δy, δu) ∈ Y × U and (y, λ) ∈ V (y∗) × V (λ∗) wehave by (5.5.12) and (5.5.13)

L′′(y, u, λ)((δy, δu), (δy, δu)) = |δy|2L2(�)

+ β|δu|2U − 2(δu∇δy, λ)≥ β|δu|2 − 2|λ|L∞|∇δy|Y |δu|U ≥ κ|δu|2U .

This estimate, together with (5.5.12), implies that L′′(y, u, λ) is coercive on ker e′(y, u)and hence A(x)δx = δw admits a unique solution in Y ×U × Y for every δw. To estimateδx in terms of δw, we note that the last equation in the system A(x)δx = δw implies that

|δy|Y ≤ |y|L∞|δu|U + |w3|H−1 . (5.5.14)

Similarly, we find from the first equation in the system that

|δλ|2Y ≤ |δy||δλ| + |λ|L∞|δλ|Y |δu|U + |w1|H−1 |δλ|Yand consequently

|δλ|Y ≤ c|δy| + |λ|L∞(�)|δu|U + |w1|H−1 , (5.5.15)

where c is the embedding constant of H 10 (�) into L2(�). Moreover, using (5.5.14) we find

L′′(y, u, λ)((δy, δu), (δy, δu)) ≥ |δy|2L2 + β|δu|2U − 2|λ|L∞|δy|Y |δu|U

≥ |δy|2L2 + β|δu|2U − 2|λ|L∞|y|L∞(�)|δu|2U − 2|λ|L∞|w3|H−1 |δu|U

≥ |δy|2 + κ|δu|2U − 2|λ|L∞|w3|H−1 |δu|U .(5.5.16)

From (5.5.15), (5.5.16) and utilizing A(x)δx = δw we obtain

|δy|2 + κ|δu|2 ≤ (2|λ|L∞|δu|U + |δλ|Y )|w3|H−1 + |w1|H−1 |δy|Y + |w2|U |δu|U . (5.5.17)

Inequalities (5.5.14), (5.5.15), and (5.5.17) imply the existence of a constant M1 such that

|δx|Y×U×Y ≤ M1|δw|H−1×U×H−1 (5.5.18)

for all (y, λ) ∈ V (y∗)× V (λ∗) and every b ∈ C∞n (�), with div b = 0. A density argumentwith respect to u implies thatA(x)δx = δw admits a solution for all x ∈ V (y∗)×U×V (λ∗)and that (5.5.18) holds for all such x.

Page 173: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 155

Chapter 6

AugmentedLagrangian-SQPMethods

6.1 GeneralitiesThis chapter is devoted to second order augmented Lagrangian methods for optimizationproblems with equality constraints of the type{

min f (x) over x ∈ X

subject to e(x) = 0,(6.1.1)

and to problems with equality constraints as well as additional constraints{min f (x) over x ∈ C

subject to e(x) = 0,(6.1.2)

where f : X→ R, e : X→ W , with X and W real Hilbert spaces and C a closed convexset in X. We shall show that for equality constraints it is possible to replace the first orderLagrangian update that was developed in Chapter 4 by a second order one. It will becomeevident that second order augmented Lagrangian methods are closely related to SQP meth-ods. For equality-constrained problems the SQP method also coincides with the Newtonmethod applied to the first order optimality conditions. Just like the Newton method, theSQP method and second order augmented Lagrangian methods are convergent with secondorder convergence rate if the initialization is sufficiently close to a local solution of (6.1.1)and if appropriate additional regularity conditions are satisfied. In case the initialization isnot in the region of attraction, globalization techniques such as line searches or trust regiontechniques may be necessary. We shall not focus on such methods within this monograph.However, the penalty term, which, together with the Lagrangian term, characterizes theaugmented Lagrangian method, also has a globalization effect. This will become evidentfrom the analytical as well as the numerical results of this chapter. Let us stress that for theconcepts that are analyzed in this chapter we do not advocate choosing c large.

As in Chapter 3, throughout this chapter we identify the dual of the spaceW with itself,and we consider e′(x)∗ as an operator from W to X. In Section 6.2 we present the secondorder augmented Lagrangian method for (6.1.1). Problems with additional constraints as

155

Page 174: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 156

156 Chapter 6. Augmented Lagrangian-SQP Methods

in (6.1.2) are considered in Section 6.3. Applications to optimal control problems will begiven in Section 6.4. Section 6.5 is devoted to short discussions of miscellaneous topicsincluding reduced SQP methods and mesh independence. In Section 6.6 we give a shortdescription of related literature.

6.2 Equality-constrained problemsIn this section we consider {

min f (x) over x ∈ X

subject to e(x) = 0,(6.2.1)

f : X → R, e : X → W , with X and W real Hilbert spaces. Let x∗ be a local solution of(6.2.1). As before derivatives as well as partial derivatives with respect to the variable x

will be denoted by primes. We shall not distinguish by notation between the functional f ′in the dual X∗ of X and its Riesz representation in X. As in Chapter 4 we shall identify thetopological duals of W and Z with themselves.

It is assumed throughout that⎧⎪⎨⎪⎩f and e are twice continuously Fréchet differentiable

with Lipschitz continuous second derivatives

in a convex neighborhood of V (x∗) of x∗(6.2.2)

ande′(x∗) is surjective. (6.2.3)

The Lagrangian functional associated with (6.2.1) is denoted by L : X ×W → R and it isgiven by

L(x, λ) = f (x)+ (λ, e(x)

)W.

With (6.2.3) holding there exists a Lagrange multiplier λ∗ ∈ W such that the following firstorder necessary optimality condition is satisfied:

L′(x∗, λ∗) = 0, e(x∗) = 0. (6.2.4)

We shall also make use of the following second order sufficient optimality condition:{there exists κ > 0 such that

L′′(x∗, λ∗)(h, h) ≥ κ |h|2X for all h ∈ ker e′(x∗).(6.2.5)

Here L′′(x∗, λ∗) denotes the bilinear form characterizing the second Fréchet derivative ofL with respect to x at (x∗, λ∗). For any c > 0 the augmented Lagrangian functionalLc : X ×W → R is defined by

Lc(x, λ) = f (x)+ (λ, e(x)

)W+ c

2|e(x)|2W .

We note that the necessary optimality condition implies

L′c(x∗, λ∗) = 0 e(x∗) = 0 for all c ≥ 0. (6.2.6)

Page 175: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 157

6.2. Equality-constrained problems 157

Lemma 6.1. Let (6.2.3) and (6.2.5) hold. Then there exists a neighborhood V (x∗, λ∗) of(x∗, λ∗), c > 0 and σ > 0 such that

L′′c (x, λ)(h, h) ≥ σ |h|2X for all h ∈ X, (x, λ) ∈ V (x∗, λ∗), and c ≥ c.

Proof. Corollary 3.2 and conditions (6.2.3) and (6.2.5) imply the existence of σ > 0 andc > 0 such that

L′′c (x∗, λ∗)(h, h) ≥ 2σ |h|2X for all h ∈ X and c ≥ c.

Due to continuity of (x, λ)→ L′′c (x, λ) the conclusion of the lemma follows.

Lemma 6.1 implies in particular that x → Lc(x, λ∗) can be bounded from below by

a quadratic function. This fact is referred to as augmentability of (6.2.1) at (x∗, λ∗).

Lemma 6.2. Let (6.2.3) and (6.2.5) hold. Then there exist σ > 0, c > 0, and aneighborhood V (x∗) of x∗ such that

Lc(x, λ∗) ≥ Lc(x

∗, λ∗)+ σ∣∣x − x∗

∣∣2X

for all x ∈ V (x∗) and c ≥ c. (6.2.7)

Proof. Due to Taylor’s theorem, Lemma 6.1, and (6.2.6) we find for x ∈ V (x∗)

Lc(x, λ∗) = Lc(x

∗, λ∗)+ 1

2L′′c (x∗, λ∗)(x − x∗, x − x∗)+ o(|x − x∗|2X)

≥ Lc(x∗, λ∗)+ σ

2

∣∣x − x∗∣∣2X+ o(

∣∣x − x∗∣∣2X).

The claim follows from this estimate.

Without loss of generality we may assume that the neighborhoods V (x∗) and V (x∗)of (6.2.2) and Lemma 6.2 coincide and that V (x∗, λ∗) of Lemma 6.1 equals V (x∗)×V (λ∗).Due to (6.2.2) we can further assume that e′(x) is surjective for all x ∈ V (x∗).

To iteratively determine (x∗, λ∗) one can apply Newton’s method to (6.2.6). Given acurrent iterate (x, λ) the next iterate (x, λ) is the solution to(

L′′c (x, λ) e′(x)∗e′(x) 0

)(x − x

λ− λ

)= −

(L′c(x, λ)e(x)

). (6.2.8)

Alternatively, (6.2.8) can be used to define the λ-update only, whereas the x-update iscalculated by a different technique. We shall demonstrate next that (6.2.8) can be solvedfor λ without recourse to x.

For (x, λ) ∈ V (x∗, λ∗) and c ≥ c we define B(x, λ) ∈ L(W) by

B(x, λ) = e′(x)L′′c (x, λ)−1e′(x)∗. (6.2.9)

Page 176: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 158

158 Chapter 6. Augmented Lagrangian-SQP Methods

Here L(W) denotes the space of bounded linear operators from W into itself. Note thatB(x, λ) is invertible. In fact, there exists a constant k > 0 such that(

B(x, y)y, y)W= (

L′′c (x, λ)−1e′(x)∗y, e′(x)∗y)X

≥ k∣∣e′(x)∗y∣∣2

X

for all y ∈ W . Since e′(x)∗ is injective and has closed range, there exists k such that∣∣e′(x)∗y∣∣2X≥ k |y|2W for all y ∈ W,

and by the Lax–Milgram theorem continuous invertibility of B(x, λ) follows, provided that(x, λ) ∈ V (x∗, λ∗) and c ≥ c. Premultiplying the first equation in (6.2.8) by e′(x)L′′c (x, λ)−1

we obtain {λ= λ+ B(x, λ)−1

[e(x)− e′(x)L′′c (x, λ)−1L′c(x, λ)

],

x= x − L′′c (x, λ)−1L′c(x, λ).(6.2.10)

Whenever the dependence of (x, λ) on (x, λ, c) is important, (x(x, λ, c), λ(x, λ, c)) willbe written in place of (x, λ). If, for fixed λ, x = x(λ) is chosen as a local solution to

min Lc(x, λ) subject to x ∈ X, (6.2.11)

then L′c(x(λ), λ) = 0 and

λ(x(λ), λ, c) = λ+ B(x(λ), λ)−1e(x(λ)). (6.2.12)

We point out that (6.2.12) can be interpreted as a second order update to the Lagrangevariable. To acknowledge this, let dc denote the dual functional associated with Lc, i.e.,

dc(λ) = min Lc(x, λ) subject to {x : ∣∣x − x∗∣∣ ≤ ε}

for some ε > 0. Then the first and second derivatives of dc with respect to λ satisfy

∇λdc(λ) = e(x(λ))

and

∇2λdc(λ) = −B(x(λ), λ),

and (6.2.12) presents a Newton step for maximizing dc.Returning to (6.2.8) we note that its blockwise solution given in (6.2.10) requires

setting up and inverting L′′c (x, λ). Following an argument due to Bertsekas [Be] we nextargue that L′′c (x, λ) can be avoided during the iteration. One requires only that L′′0(x, λ) =L′′(x, λ). In fact, we find

L′c(x, λ) = L′0(x, λ+ ce(x))

and

L′′c (x, λ) = L′′0(x, λ+ ce(x))+ c(e′(x)(·), e′(x)(·))

W.

Page 177: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 159

6.2. Equality-constrained problems 159

Consequently (6.2.8) can be expressed as(L′′0(x, λ+ ce(x))+ ce′(x)∗e′(x) e′(x)∗

e′(x) 0

)(x − x

λ− λ

)= −

(L′0(x, λ+ ce(x))

e(x)

). (6.2.13)

Using the second equation e′(x)(x−x) = −e(x) in the first equation of (6.2.13) we arrive at

L′′0(x, λ+ ce(x))(x − x)− ce′(x)∗e(x)+ e′(x∗)(λ− λ) = −L′0(x, λ+ ce(x)),

and hence (6.2.8) is equivalent to(L′′0(x, λ+ ce(x)) e′(x)∗

e′(x) 0

)(x − x

λ− (λ+ ce(x))

)= −

(L′0(x, λ+ ce(x))

e(x)

). (6.2.14)

Solving (6.2.8) is thus equivalent to

(i) carrying out the first order multiplier iteration

λ = λ+ ce(x),

(ii) solving (L′′0(x, λ) e′(x)∗e′(x) 0

)(x − x

λ− λ

)= −

(L′0(x, λ)e(x)

)for (x, λ).

It will be convenient to introduce the matrix of operators

M(x, λ) =(

L′′0(x, λ) e′(x)∗e′(x) 0

).

With (6.2.3) and (6.2.5) holding there exist a constantκ > 0 and a neighborhoodU(x∗, λ∗) ⊂V (x∗, λ∗) of (x∗, λ∗) such that

‖ M−1(x, λ) ‖L(X×W)≤ κ for all (x, λ) ∈ U(x∗, λ∗). (6.2.15)

Lemma 6.3. Assume that (6.2.3) and (6.2.5) hold. Then there exists a constant K > 0such that for any (x, λ) ∈ U(x∗, λ∗) the solution (x, λ) of

M(x, λ)

(x − x

λ− λ

)= −

(L′(x, λ)e(x)

)satisfies ∣∣∣(x, λ)− (x∗, λ∗)

∣∣∣X×W

≤ K∣∣(x, λ)− (x∗, λ∗)

∣∣2X×W . (6.2.16)

Page 178: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 160

160 Chapter 6. Augmented Lagrangian-SQP Methods

Proof. Note that

M(x, λ)

(x − x

λ− λ

)= −

(L′(x, λ)− L′(x∗, λ∗)

e(x)− e(x∗)

)and consequently

M(x, λ)

(x − x∗

λ− λ∗

)= −

(L′(x, λ)− L′(x∗, λ∗)

e(x)− e(x∗)

)+M(x, λ)

(x − x∗

λ− λ∗

).

This equality further implies

M(x, λ)

(x − x∗

λ− λ∗

)=

∫ 1

0

[M(x, λ)−M(tx + (1− t)x∗, tλ+ (1− t)λ∗)

](x − x∗

λ− λ∗

)dt.

The regularity properties of f and e imply that (x, λ)→ M(x, λ) is Lipschitz continuouson U(x∗, λ∗) for a Lipschitz constant γ > 0. Thus we obtain∣∣∣∣M(x, λ)

(x − x∗

λ− λ∗

)∣∣∣∣X×W

≤ γ

2

∣∣(x, λ)− (x∗, λ∗)∣∣2X×W ,

and by (6.2.15) ∣∣∣(x, λ)− (x∗, λ∗)∣∣∣X×W

≤ γ κ

2

∣∣(x, λ)− (x∗, λ∗)∣∣2X×W ,

which implies the claim.

We now describe three algorithms and analyze their convergence. They are all basedon (6.2.14) and differ only in the choice of x. Recall that if x is a solution to (6.2.11),then (6.2.12) and hence (6.2.14) provide a second order update to the Lagrange multiplier.Solving (6.2.11) implies extra computational cost. In the results which follow we show thatas a consequence a larger region of attraction with respect to the initial condition and animproved rate of convergence factor is obtained, compared to methods which solve (6.2.11)only approximately or skip it all together. As a first choice x in (6.2.14) is determined bysolving

min Lc(x, λn) subject to x ∈ V (x∗).

The second choice is to take x only as an appropriate suboptimal solution to this optimizationproblem, and the third choice is to simply choose x as the solution x of the previous iterationof (6.2.14).

Algorithm 6.1.

(i) Choose λ0 ∈ W, c ∈ (c,∞) and set σ = c − c, n = 0.

(ii) Determine x as a solution of

min Lc(x, λn) subject to x ∈ V (x∗). (Paux)

Page 179: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 161

6.2. Equality-constrained problems 161

(iii) Set λ = λn + σe(x).

(iv) Solve for (x, λ):

M(x, λ)

(x − x

λ− λ

)= −

(L′0(x, λ)e(x)

).

(v) Set λn+1 = λ, n = n+ 1, and goto (ii).

The existence of a solution to (Paux) is guaranteed if, for example, the followingconditions on f and e hold:⎧⎪⎨⎪⎩

f : X→ R is weakly lower semicontinuous,

e : X→ W maps weakly convergent sequences

to weakly convergent sequences.

(6.2.17)

Under the conditions of Theorem 6.4 below it follows that the solutions x of (Paux)

satisfy x ∈ V (x∗).

Theorem 6.4. If (6.2.3), (6.2.5), and (6.2.17) hold and 1c−c |λ0 − λ∗|2 is sufficiently small,

then Algorithm 6.1 is well defined and

∣∣(xn+1, λn+1)− (x∗, λ∗)∣∣X×W ≤

K

c − c

∣∣λn − λ∗∣∣2W

(6.2.18)

for a constant K independent of c and n = 0, 1, . . . .

Proof. Let η be the largest radius for a ball centered at (x∗, λ∗) and contained in U(x∗, λ∗),and let γ be a Lipschitz constant for f ′ and e′ on U(x∗, λ∗). Further let E = e′(x∗) andnote that (EE∗)−1E ∈ L(X) as a consequence of (6.2.3). We define

M = 1+ 2(cγ )2 + 8γ 2(1+ μ)2 ‖(EE∗)−1E ‖2,

where μ =(

1+ cγ√2σ σ

)|λ0 − λ∗|W + |λ∗|W , and we put

η = min

√2σ σ

M,

2σ σ

KM

), (6.2.19)

where K is given in Lemma 6.3. Let us assume that∣∣λ0 − λ∗∣∣W

< η.

The proof will be given by induction on n. The case n = 0 follows from the generalarguments given below. For the induction step we assume that∣∣λi − λ∗

∣∣W≤ ∣∣λi−1 − λ∗

∣∣W

for i = 1, . . . , n. (6.2.20)

Page 180: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 162

162 Chapter 6. Augmented Lagrangian-SQP Methods

Using (ii) of Algorithm 6.1 we have

f (x∗) ≥ Lc(x, λn) = f (x)+ (λ∗, e(x)

)W+ (

λn − λ∗, e(x))W

+ 1

2c |e(x)|2W +

1

2(c − c) |e(x)|2W

≥ Lc(x, λ∗) + (

λn − λ∗, e(x))W+ σ

2|e(x)|2W

= Lc(x, λ∗) + 1

( ∣∣λn + σe(x)− λ∗∣∣2W− ∣∣λn − λ∗

∣∣2W

).

In view of Lemma 6.2 and the fact that x is chosen in V (x∗)(⊂ V (x∗)) we find

σ∣∣x − x∗

∣∣2X+ 1

∣∣∣λ− λ∗∣∣∣2

W≤ 1

∣∣λn − λ∗∣∣2W. (6.2.21)

This implies in particular that |x − x∗| ≤√

M2σ σ |λ0 − λ∗| < η and hence x ∈ V (x∗). The

necessary optimality conditions for (6.2.1) and (Paux) with x ∈ V (x∗) are given by

f ′(x∗)+ E∗λ∗ = 0

andf ′(x)+ e′(x)∗(λn + ce(x)) = 0.

Subtracting these two equations and defining λ = λn + ce(x) give

E∗(λ∗ − λ) = f ′(x)− f ′(x∗)+ (e′(x)∗ − e′(x∗)∗)(λ)

and consequently

λ∗ − λ = (EE∗)−1E[f ′(x)− f ′(x∗)+ (e′(x)∗ − e′(x∗)∗)(λ)

].

We obtain|λ∗ − λ|W ≤ 2γ (1+ |λ|W)‖(EE∗)−1E‖ ∣∣x − x∗

∣∣X. (6.2.22)

To estimate λ observe that due to (6.2.20) and (6.2.21)∣∣λ∣∣W≤

∣∣∣λ− λ

∣∣∣W+

∣∣∣λ∣∣∣W= c

∣∣e(x)− e(x∗)∣∣W+ ∣∣λ∗∣∣

W+

∣∣∣λ− λ∗∣∣∣W

≤ cγ∣∣x − x∗

∣∣W+ ∣∣λ∗∣∣

W+ ∣∣λn − λ∗

∣∣W≤

(1+ cγ√

2σ σ

) ∣∣λn − λ∗∣∣W

+ ∣∣λ∗∣∣W≤ μ. (6.2.23)

Using (6.2.20)–(6.2.23) we find

|(x, λ)− (x∗, λ∗)|2X×W ≤ |x − x∗|2X + 2|λ− λ|2W + 2|λ− λ∗|2W= |x − x∗|2X + 2(cγ )2|x − x∗|2X + 8γ 2(1+ μ)2‖(EE∗)−1E‖2 |x − x∗|2X= M

2σ σ

∣∣λn − λ∗∣∣2X<

M

2σ ση2 ≤ η2.

Page 181: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 163

6.2. Equality-constrained problems 163

This implies that Lemma 6.3 is applicable with (x, λ) = (x, λ). We find

∣∣(xn+1, λn+1)− (x∗, λ∗)∣∣X×W ≤

KM

2σ σ

∣∣λn − λ∗∣∣2X, (6.2.24)

and (6.2.18) is proved with K = KM2σ . From (6.2.20), (6.2.24), and the definition of η we

also obtain ∣∣(xn+1, λn+1)− (x∗, λ∗)∣∣X×W ≤

∣∣λn − λ∗∣∣W

< η.

This implies (6.2.20) for i = n+ 1 and the proof is finished.

Algorithm 6.2. This coincides with Algorithm 6.1 except for (i) and (ii), which are re-placed by

(i)′ Choose λ0 ∈ W, c ∈ (c,∞), and σ ∈ (0, c − c] and set n = 0.

(ii)′ Determine x ∈ V (x∗) such that

Lc(x, λn) ≤ Lc(x∗, λn) = f (x∗).

Theorem 6.5. Let (6.2.3) and (6.2.5) hold. If |λ0 − λ∗|W is sufficiently small, thenAlgorithm 6.2 is well defined and

∣∣(xn+1, λn+1)− (x∗, λ∗)∣∣X×W ≤ K

(1+ 1

σ σ

) ∣∣λn − λ∗∣∣2W

(6.2.25)

for all n = 0, 1, . . . . Here xn+1 stands for x of step (iv) of Algorithm 6.1 andK , independentof c, is given in (6.2.16).

Proof. Let η be defined as in the proof of Theorem 6.4. We define

η = min

{η√a,

1

Ka

}, (6.2.26)

where a = 1+ 1σ σ

. The proof is based on an induction argument. If∣∣λ0 − λ∗∣∣ < η,

then the first iterate of Algorithm 6.2 is well defined,∣∣(x1, λ1)− (x∗, λ∗)∣∣X×W < η

and (6.2.25) holds with n = 0. This will follow from the general arguments given below.Assuming that |(xn, λn)− (x∗, λ∗)|X×W < η, we show that Algorithm 6.2 is well definedfor n+ 1, that ∣∣(xn+1, λn+1)− (x∗, λ∗)

∣∣X×W < η,

Page 182: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 164

164 Chapter 6. Augmented Lagrangian-SQP Methods

and that (6.2.25) holds. As in the proof of Theorem 6.4 one argues that (6.2.21) holds andconsequently ∣∣∣(x, λ)− (x∗, λ∗)

∣∣∣2

X×W≤

(1+ 1

2σ σ

) ∣∣λn − λ∗∣∣2W. (6.2.27)

This implies that |(x, λ∗) − (x∗, λ∗)|X×W < η and hence Lemma 6.3 is applicable with(x, λ) = (x, λ) and (iv) of Algorithm 6.2 is well defined. Combining (6.2.16) with (6.2.27)we find ∣∣(xn+1, λn+1)− (x∗, λ∗)

∣∣X×W ≤ K

(1+ 1

2σ σ

) ∣∣λn − λ∗∣∣2W,

which is (6.2.25). By the definition of η we further obtain∣∣(xn+1, λn+1)− (x∗, λ∗)∣∣X×W ≤

∣∣λn − λ∗∣∣W

< η.

Remark 6.2.1. In the proof of Theorem 6.4 as well as in that of Theorem 6.5 we utilize(6.2.7) from Lemma 6.2. Conditions (6.2.3) and (6.2.5) are sufficient conditions for (6.2.7)to hold for c ≥ c > 0. If (6.2.7) can be shown to hold for all c ≥ 0, then c can be chosenequal to 0 and σ = c is admissible in (i′) of Algorithm 6.2.

In the third algorithm we delete the second step of Algorithms 6.1 and 6.2 and directlyiterate (6.2.14).

Algorithm 6.3.

(i) Choose (x0, λ0) ∈ X ×W, c ≥ 0 and put n = 0.

(ii) Set λ = λn + ce(xn).

(iii) Solve for (x, λ):

M(xn, λ)

(x − xn

λ− λ

)= −

(L′0(xn, λ)e(xn)

).

(iv) Set (xn+1, λn+1) = (x, λ), n = n+ 1, and goto (ii).

Theorem 6.6. Let (6.2.3) and (6.2.5) hold. If max(1, c) |(x0, λ0)− (x∗, λ∗)|X×W issufficiently small, then Algorithm 6.3 is well defined and∣∣(xn+1, λn+1)− (x∗, λ∗)

∣∣X×W ≤ K

∣∣(xn, λn)− (x∗, λ∗)∣∣2X×W

for a constant K independent of n = 0, 1, 2, . . . , but depending on c.

Proof. Let η and γ be defined as in the proof of Theorem 6.4. We introduce

η = min

{η√a,

1

Ka

}, (6.2.28)

Page 183: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 165

6.3. Partial elimination of constraints 165

where a = max(2, 1+ 2c2γ 2

)and K is defined in Lemma 6.3.

Let us assume that ∣∣(x0, λ0)− (x∗, λ∗)∣∣X×W < η.

Again we proceed by induction and the case n = 0 follows from the general argumentsgiven below. Let us assume that |(xn, λn)− (x∗, λ∗)|X×W < η. Then∣∣∣(xn, λ)− (x∗, λ∗)

∣∣∣2

X×W≤ ∣∣xn − x∗

∣∣2X+ 2c2

∣∣e(xn)− e(x∗)∣∣2W+ 2

∣∣λn − λ∗∣∣2W

≤ ∣∣xn − x∗∣∣2X+ 2c2γ 2

∣∣xn − x∗∣∣2X+ 2

∣∣λn − λ∗∣∣2W

≤ a∣∣(xn, λn)− (x∗, λ∗)

∣∣2X×W < η2,

and thus Lemma 6.3 is applicable with (x, λ) = (xn, λ).

Remark 6.2.2. (i) If c is set equal to 0 in Algorithm 6.3, then we obtain the well-knownSQP algorithm for the equality-constrained problem (6.2.1). It is well known to have asecond order convergence rate which also follows from Theorem 6.6 since K is finite forc = 0.

(ii) Theorem 6.6 suggests that in case (Paux) is completely skipped in the secondorder augmented Lagrangian update the penalty parameter may have a negative effect onthe region of attraction and on the convergence rate estimate. Our numerical experience,without additional globalization techniques, indicates that moderate values of c do notimpede the behavior of the algorithm when compared to c = 0, which results in the SQPalgorithm. Choosing c > 0 may actually enlarge the region of attraction when compared toc = 0. For parameter estimation problems c > 0 is useful, because in this way Algorithm6.3 becomes a hybrid algorithm combining the output least squares and the equation errorformulations [IK9].

6.3 Partial elimination of constraintsHere we consider ⎧⎪⎨⎪⎩

min f (x) subject to

e(x) = 0, g(x) ≤ 0, and �(x) ∈ K,

(6.3.1)

where f : X → R, e : X → W, g : X → Rm, � : X → Z, where X,W , and Z arereal Hilbert spaces, K is a closed convex cone in Z, and � is an affine constraint. In theremainder of this chapter we identify the dual of Z with itself. Let x∗ be a local solution of(6.3.1) and assume that⎧⎨⎩

f, e, and g are twice continuously Fréchet differentiablewith second derivatives Lipschitz continuousin a neighborhood of x∗.

(6.3.2)

Page 184: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 166

166 Chapter 6. Augmented Lagrangian-SQP Methods

The objective of this section is to approximate (6.3.1) by a sequence of problemswith quadratic cost and affine constraints. For this purpose we utilize the LagrangianL : X ×W × Rm × Z → R associated with (6.3.1) given by

L(x, λ, μ, η) = f (x)+ (λ, e(x)

)W+ (

μ, g(x))Rm +

(η, �(x)

)Z.

We shall require the following assumption (cf. Chapter 3):

x∗ is a regular point, i.e., (6.3.3)

0 ∈ int

⎧⎨⎩⎛⎝ e′(x∗)

g′(x∗)L

⎞⎠X +⎛⎝ 0Rm+−K

⎞⎠+⎛⎝ 0

g(x∗)�(x∗)

⎞⎠⎫⎬⎭ ,

where the interior is taken in W × Rm × Z. With (6.3.3) holding there exists a Lagrangemultiplier (λ∗, μ∗, η∗) ∈ W × Rm,+ ×K+ such that

0 ∈

⎧⎪⎪⎨⎪⎪⎩L′(x∗, λ∗, μ∗, η∗),−e(x∗),−g(x∗)+ ∂ ψRm,+(μ∗),−�(x∗)+ ∂ ψK+(η∗).

As in Section 3.1, we assume that the coordinates of the inequalities g(x) ≤ 0 are arrangedso that μ∗ = (μ∗,+, μ∗,0, μ∗,−) and g = (g+, g0, g−) with g+ : X→ Rm1 , g0 : Rm2 , g− :X→ Rm3 , m = m1 +m2 +m3, and

g+(x∗) = 0, μ∗,+ > 0,

g0(x∗) = 0, μ∗,0 = 0,

g−(x∗) < 0, μ∗,− = 0.

We further define G+ = g+(x∗)′,G0 = g0(x∗)′,

E+ =(

E

G+

): X→ W × Rm1 ,

and for z ∈ Z we define the operator E(z) : X × R→ (W × Rm1)× Rm2 × Z by

E(z) =⎛⎝ E+ 0

G0 0L z

⎞⎠ .

The following additional assumptions are required:

there exists κ > 0 such that L′′(x∗, λ∗, μ∗) (x, x) ≥ κ|x|2X for all x ∈ ker(E+) (6.3.4)

and

E(�(x∗)) is surjective. (6.3.5)

The SQP method for (6.2.1) with elimination of the equality and finite rank inequalityconstraints is given next. We set

L(x, λ, μ) = f (x)+ (λ, e(x)

)W+ (

μ, g(x))Rm .

Page 185: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 167

6.3. Partial elimination of constraints 167

Algorithm 6.4.

(i) Choose (x0, λ0, μ0) ∈ X ×W × Rm and set n = 0.

(ii) Solve for (xn+1, λn+1, μn+1):⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩min 1

2L′′(xn, λn, μn)(x − xn, x − xn)+ f ′(xn)(x − xn),

e(xn)+ e′(xn)(x − xn) = 0,

g(xn)+ g′(xn)(x − xn) ≤ 0, �(xn)+ L(x − xn) ∈ K,

(Paux)

where (λn+1, μn+1) are the Lagrange multipliers associated to the equality and in-equality constraints.

(iii) Set n = n+ 1 and goto (ii).

Since (6.3.3)–(6.3.5) imply (H1), (H2), (H3) of Chapter 2, there exist by Corol-lary 2.18 neighborhoods U (x∗, λ∗, μ∗) and U(x∗) such that the auxiliary problem (Paux)of Algorithm 6.4 admits a unique local solution xn+1 in U(x∗) provided that (xn, λn, μn) ∈U (x∗, λ∗, μ∗) = U (x∗)×U (λ∗, μ∗). To obtain a convergence rate estimate forxn, Lagrangemultipliers to the constraints in (Paux) are introduced. Since the regular point condition isstable with respect to perturbations in x∗, we can assume that U (x∗) is chosen sufficientlysmall such that for xn ∈ U (x∗) there exist (λn+1, μn+1, ηn+1) ∈ W × Rm × Z such that

0 ∈ G(xn, λn, μn)(xn+1, λn+1, μn+1, ηn+1)+

⎛⎜⎜⎝00∂ ψRm,+ (μn+1)

∂K+ (ηn+1)

⎞⎟⎟⎠ , (6.3.6)

where

G(xn, λn, μn)(x, λ, μ, η)

=

⎛⎜⎜⎝L′′

(xn, λn, μn)(x − xn)+ f ′(xn)+ e′(xn)∗λ+ g′(xn)∗μ+ L∗η−e(xn)− e′(xn)(x − xn)

−g(xn)− g′(xn)(x − xn)

−�(xn)− L(x − xn)

⎞⎟⎟⎠ .

Theorem 6.7. Assume that (6.3.2) – (6.3.5) are satisfied at (x∗, λ∗, μ∗)and that |(x0, λ0, μ0)−(x∗, λ∗, μ∗)| is sufficiently small. Then there exists K > 0 such that

|(xn+1, λn+1, μn+1, ηn+1)− (x∗, λ∗, μ∗, η∗)| ≤ K|(xn, λn, μn)− (x∗, λ∗, μ∗)|2for all n = 1, 2, . . . .

Proof. The optimality system for (6.3.1) can be expressed by

0 ∈ G(x∗, λ∗, μ∗)(x∗, λ∗, μ∗, η∗)+

⎛⎜⎜⎝00∂ ψRm,+ (μ∗)∂ ψK+ (η∗)

⎞⎟⎟⎠ ,

Page 186: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 168

168 Chapter 6. Augmented Lagrangian-SQP Methods

or equivalently

0 ∈

⎛⎜⎜⎝anbncndn

⎞⎟⎟⎠+ G(xn, λn, μn)(x∗, λ∗, μ∗, η∗)+

⎛⎜⎜⎝00∂ ψRm,+ (μ∗)∂+k (η∗)

⎞⎟⎟⎠ , (6.3.7)

where (an, bn, cn, dn) = (a(xn, λn, μn), b(xn), c(xn), d(xn)) and

a(x, λ, μ) = f ′(x∗)− f ′(x)+ (e′(x∗)∗ − e′(x)∗)λ∗

+ (g′(x∗)∗ − g′(x)∗)μ∗ − L′′(x, λ, μ)(x∗ − x),

b(x) = e(x)+ e′(x)(x∗ − x)− e(x∗),c(x) = g(x)+ g′(x)(x∗ − x)− g(x∗),d(x) = 0.

Without loss of generality we may assume that the first and second derivatives of f, e, andg are Lipschitz continuous in U (x∗). It follows that there exists L such that

(|a(x, λ)|2 + |b(x)|2 + |c(x)|2)1/2≤ L(|x − x∗|2 + |λ− λ∗|2 + |μ− μ∗|2)for all (x, λ, μ) ∈ U (x∗)× U (λ∗, μ∗).

(6.3.8)

Let K be determined from Corollary 2.18 and letB((x∗, λ∗, μ∗), r) denote a ball in U (x∗)×U (λ∗, μ∗) with center (x∗, λ∗, μ∗) and radius r , where r K L < 1.

Proceeding by induction, assume that (xn, λn, μn) ∈ B((x∗, λ∗, μ∗), r). Then fromCorollary 2.18, with (x1, λ1, μ1) = (x2, λ2, μ2) = (xn, λn, μn), and (6.3.6) – (6.3.8) wefind

|(xn+1, λn+1, μn+1, ηn+1)− (x∗, λ∗, μ∗, η∗)|≤ K|(an, bn, cn)|X×W×Rm ≤ K L(|xn − x∗|2 + |λn − λ∗|2 + |μn − μ∗|2)≤ (K L r)r < r.

This estimate implies that (xn+1, λn+1, μn+1) ∈ B((x∗, λ∗, μ∗), r), as well as the desiredlocal quadratic convergence of Algorithm 6.4.

Remark 6.3.1. Let L(x, λ) = f (x) + (λ, e(x)

)W

and consider Algorithm 6.4 withL′′

(xn, λn, μn) replaced by L′′(xn, λn); i.e., only the equality constraints are eliminated.

If (6.3.4) is satisfied with L′′(x∗, λ∗, μ∗) replaced by L′′

(x∗, λ∗), then Theorem 6.7 holdsfor the resulting algorithm with the triple (x, λ, μ) replaced by (x, λ).

Next we consider {min f (x) subject toe(x) = 0, x ∈ C,

(6.3.9)

Page 187: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 169

6.3. Partial elimination of constraints 169

where f and e are as in (6.3.1) and C is a closed convex set in X. Let x∗ denote a localsolution of (6.3.9) and assume that{

f and e are twice continuously Fréchet differentiable,f′′

and e′′

are Lipschitz continuous in a neighborhood of x∗, (6.3.10)

0 ∈ int e′(x∗)(C − x∗), (6.3.11)

and

there exists a constant κ > 0 such that L′′(x∗, λ∗) (x, x) ≥ κ|x|2X

for all x ∈ ker e′(x∗). (6.3.12)

To solve (6.3.9) we consider the SQP algorithm with elimination of the equality constraint.

Algorithm 6.5.

(i) Choose (x0, λ0) ∈ X ×W ∗ and set n = 0.

(ii) Solve for (xn+1, λn+1):⎧⎪⎨⎪⎩min 1

2L′′(xn, λn)(x − xn, x − xn)+ f ′(xn)(x − xn),

e(xn)+ e′(xn)(x − xn) = 0, x ∈ C,

(Paux)

where λn+1 is the Lagrange multiplier associated to the equality constraint.

(iii) Set n = n+ 1 and goto (ii).

The optimality condition for (Paux) in Algorithm 6.5 is given by

0 ∈ G(xn, λn)(xn+1, λn+1)+(∂ψC(xn+1)

0

), (6.3.13)

where

G(xn, λn)(x, λ) =(

L′′(xn, λn)(x − xn)+ f ′(xn)−e(xn)− e′(xn)(x − xn)

).

Due to (6.3.11) and (6.3.12) there exists a neighborhood U(x∗, λ∗) of (x∗, λ∗) such that(6.3.13) admits a solution (xn+1, λn+1) and xn+1 is the solution of (Paux) of Algorithm 6.5,provided that (xn, λn) ∈ U(x∗, λ∗). We also require the following condition:⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

There exist neighborhoods U (x∗)× U (λ∗) of (x∗, λ∗) andV of the origin in X ×W and a constant K such thatfor all q1, q2 ∈ V and (x, λ) ∈ U (x∗)× U (λ∗) there existsa unique solution(x, λ) = (x(x, λ, q), λ(x, λ, q1), λ(x, λ, q1)) ∈ U (x∗)× U (λ∗) of0 ∈ q1 + G(x, λ)(x, λ)+ ∂ψC(x) and|(x(x, λ, q1), λ(x, λ, q1))− (x(x, λ, q2), λ(x, λ, q2))| ≤ K|q1 − q2|.

(6.3.14)

Page 188: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 170

170 Chapter 6. Augmented Lagrangian-SQP Methods

Remark 6.3.2. From the discussion in the first part of this section it follows that (6.3.14)holds for problem (6.3.1) with C = {x : g(x) ≤ 0, �(x) ∈ K}, provided that g is convex.Condition (6.3.14) was analyzed for optimal control problems in several papers; see, forinstance, [AlMa, Tro].

Theorem 6.8. Suppose that (6.3.10)–(6.3.13) hold at a local solution x∗ of (6.3.9). If|(x0, λ0) − (x∗, λ∗)| is sufficiently small, then the iterates (xn, λn) converge quadraticallyto (x∗, λ∗).

Proof. The optimality system for (6.3.9) can be expressed as

0 ∈ G(x∗, λ∗)(x∗, λ∗)+(∂ψC(x

∗)0

),

or equivalently

0 ∈(anbn

)+ G(xn, λn)(x

∗, λ∗)+(∂ψC(x

∗)0

), (6.3.15)

where

an = f ′(x∗)− f ′(xn)+ (e′(xn)∗ − e′(x∗)∗)λ∗ − L′′(xn, λn)(x

∗ − xn),

bn = e(xn)+ e′(xn)(x∗ − xn)− e(x∗).

In view of (6.3.10) we may assume that U (x∗) is chosen sufficiently small such that f′′

ande′′

are Lipschitz continuous on U (x∗). Consequently there exists L such that

|(an, bn)| ≤ L(|xn − x∗|2 + |λn − λ∗|2), provided that xn ∈ U (x∗). (6.3.16)

Let B((x∗, λ∗), r) denote a ball with radius r < (K L)−1 and center (x∗, λ∗) containedin U(x∗, λ∗) and in U (x∗) × U (λ∗), and assume that (xn, λn) ∈ B((x∗, λ∗), r). Then by(6.3.13)–(6.3.16) we have

|(xn+1, λn+1)− (x∗, λ∗)| ≤ K L |(xn, λn)− (x∗, λ∗)|2 < |(xn, λn)− (x∗, λ∗)| < r.

The proof now follows by an induction argument.

In Section 6.2 the combination of the second order method (6.2.8) with first orderaugmented Lagrangian updates was analyzed for equality-constrained problems. This ap-proach can also be taken for problems with inequality constraints. We present the analogueof Theorem 6.4. Let

Lc(x, λ, μ) = f (x)+ (λ, e(x)

)W+ (

μ, g(x, μ, c))Rm + c

2|e(x)|2W +

c

2|g(x, μ, c)|2

Rm,

where g(x, μ, c) = max(g(x),−μ

c), as defined in Chapter 3. Below c ≥ c denote the

constants and Bδ the closed ball of radius δ around x∗ that were utilized in Corollary 3.7.

Page 189: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 171

6.3. Partial elimination of constraints 171

Algorithm 6.6.

(i) Choose (λ0, μ0) ∈ W × Rm, c ∈ [c,∞) and set n = 0.

(ii) Determine x as the solution to

min Lc(x, λn, μn) subject to x ∈ Bδ, �(x) ∈ K. (P 1aux)

(iii) Update λ = λn + (c − c)e(x), μ = μn + (c − c)g(x, μn, c).

(iv) Solve (Paux) of Algorithm 6.4 with (xn, λn, μn) = (x, λ, μ) for (xn+1, λn+1, ηn+1).

(v) Set n = n+ 1 and goto (ii).

Theorem 6.9. Assume that (3.4.7), (3.4.9) of Chapter 3 and (6.3.2)–(6.3.4) of this chapterhold at (x∗, λ∗, μ∗). If 1

c−c (|λ0 − λ∗|2 + |μ0 − μ∗|2) is sufficiently small, then Algorithm6.6 is well defined and

|(xn+1, λn+1, μn+1, ηn+1)− (x∗, λ∗, μ∗, η∗)|X×W×Rm×Z

≤ K

c − c(|λn − λ∗|2W + |μn − μ∗|2

Rm)

for a constant K independent of n = 0, 1, 2, . . . .

Proof. The assumptions guarantee that Corollary 3.7, Proposition 3.9, and Theorem 3.10 ofChapter 3 and Theorem 6.7 of the present chapter are applicable. The proof is now similar tothat of Theorem 6.4, with Theorem 6.7 replacing Lemma 6.3 to estimate the step of (Paux).Let η be the radius of the largest ball U centered at (x∗, λ∗, μ∗) such that Theorem 6.7 isapplicable and that the x-coordinates of elements in U are also in Bδ . Let

η = min

(η κ√c − c,

κ2(c − c)

K

), where κ2 = 2(c − c)

1+ 2K2,

with K defined in Theorem 3.10 and K in Theorem 6.7,

|(λ0, μ0)− (λ∗, μ∗)|W×Rm < η.

We proceed by induction with respect to n. The case n = 0 is simple and we assume that

|(λi, μi)− (λ∗, μ∗)| ≤ |(λi−1, μi−1)− (λ∗, μ∗)| for i = 1, . . . , n. (6.3.17)

From Theorems 3.8 and 3.10 we have

|(x − x∗, λ− λ∗, μ− μ∗)| ≤ 1

κ√c − c

| (λn, μn)− (λ∗, μ∗)|

κ√c − c

≤ η.

(6.3.18)

Page 190: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 172

172 Chapter 6. Augmented Lagrangian-SQP Methods

This estimate implies that x ∈ int Bδ and that Theorem 6.7 with (xn, λn, μn) replaced by(x, λ, μ) is applicable. It implies

|(xn+1, λn+1, μn+1, ηn+1)− (x∗, λ∗, μ∗, η∗)| ≤ K|(x, λ, μ)− (x∗, λ∗, μ∗)|2, (6.3.19)

and combined with (6.3.18)

|(xn+1, λn+1, μn+1, ηn+1)− (x∗, λ∗, μ∗, η∗)| ≤ K

κ2(c − c)|(λn, μn)− (λ∗, μ∗)|2

≤ |(λn, μn)− (λ∗, μ∗)|2.This implies (6.3.17) with i = n+ 1 as well as the claim of the theorem.

6.4 Applications

6.4.1 An introductory example

Consider the optimal control problem⎧⎪⎪⎨⎪⎪⎩min 1

2

∫Q|y − z|2dxdt + β

2 |u|2U ,yt = �y + g(y)+ Bu in Q,

y = 0 on �,

y(0, ·) = ϕ on �,

(6.4.1)

where Q = (0, T )×� and � = (0, T )× ∂�. Define e : X = Y × U → W by

e(y, u) = yt −�y − g(y)− Bu.

Here B ∈ L(U, Y ), where U is the Hilbert space of controls, and the choice for Y can be

Y = {y ∈ L2(H 10 (�)) : yt ∈ L2(H−1(�))}

or

Y = L2(H 10 (�) ∩H 2(�)) ∩W 1,2(L2(�)),

for example. Here L2(H 10 (�)) is an abbreviation for L2(0, T ;H 1

0 (�)). The former choicecorresponds to variational formulations, the latter to strong formulations of the partial dif-ferential equation and both require regularity assumptions for g. Matching choices forW are

W = L2(H−1(�)) and W = L2(L2(�)).

We proceed with Y = {y ∈ L2(H 10 (�)) : yt ∈ L2(H−1(�))}. The Lagrangian is given by

L(y, u, λ) = J (y, u)+ 〈λ, e(y, u)〉L2(H−1)

= J (y, u)+∫ T

0〈λ, (−�)−1(yt −�y − g(y)− Bu)〉H−1,H 1

0dt,

Page 191: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 173

6.4. Applications 173

where � denotes the Laplacian with Dirichlet boundary conditions. The augmented La-grangian SQP step as described below (6.2.14) is given by

λ = λ+ c(yt −�y − g(y)− Bu)

and ⎛⎝ I + g′′(y)�−1λ 0 ( ∂∂t+�+ g′(y))�−1

0 βI B∗�−1

(−�)−1( ∂∂t−�− g′(y)) �−1B 0

⎞⎠⎛⎝ δy

δu

δλ

⎞⎠= −

⎛⎝ y − z− (( ∂∂t+�+ g′(y))(−�)−1λ

βu+ B∗�−1λ

(−�)−1(yt −�y − g(y)− Bu)

⎞⎠ ,

with δy(0, ·) = δλ(T , ·) = 0. Inspection of the previous system (e.g., for the sake ofsymmetrization) suggests introducing the new variable � = −�−1λ.

This results in an update for � given by

� = �+ c(−�)−1(yt −�y − g(y)− Bu), (6.4.2)

and the transformed system has the form⎛⎝ I − g′′(y)� 0 −( ∂∂t+�+ g′(y))

0 βI −B∗( ∂∂t−�− g′(y)) −B 0

⎞⎠⎛⎝ δy

δu

δ�

⎞⎠= −

⎛⎝ y − z− ( ∂∂t+�+ g′(y))�

βu− B∗�yt −�y − g(y)− Bu

⎞⎠ . (6.4.3)

Let us also point out that if we had immediately discretized (6.4.1), then the differencesbetween topologies tend to get lost and (−�)−1 in (6.4.2) may have been forgotten. On thediscretized level the effect of (−�)−1 can be restored by preconditioning.

Let us now return to the system (6.4.3). Due to its size it will—after discretization—not be solved by a direct method but rather by iteration based on conjugate gradients.The question of choice for preconditioners arises. The following two choices for blockpreconditioners were successful in our tests [KaK]:⎛⎝ 0 0 P ∗

0 βI 0P 0 0

⎞⎠−1

=⎛⎝ 0 0 P−1

0 β−1I 0P−∗ 0 0

⎞⎠ (6.4.4)

and ⎛⎝ I 0 P ∗0 βI 0P 0 0

⎞⎠−1

=⎛⎝ 0 0 P−1

0 β−1I 0P−∗ 0 P−∗P−1,

⎞⎠ , (6.4.5)

Page 192: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 174

174 Chapter 6. Augmented Lagrangian-SQP Methods

where P = ∂t − �. Note that (6.4.4) requires 2, whereas (6.4.5) needs 4, parabolicsolves per iteration. Numerically, it appears that (6.4.4) is preferable to (6.4.5). For furtherinvestigations on preconditioning of SQP-like systems we refer the reader to [BaSa]. Thismay still be extended by the results of Battermann and Sachs.

6.4.2 A class of nonlinear elliptic optimal control problems

The general framework of the previous section will be applied to optimal control problemsgoverned by partial differential equations of the type⎧⎪⎨⎪⎩

−�y + g(y)= f in �,

∂y

∂n= u on �1,

∂y

∂n= uf on �2,

(6.4.6)

where f ∈ L2(�) and uf ∈ L2(�2) are fixed and u ∈ L2(�1) will be the control variable.Here � is a bounded domain in R2 with C1,1 boundary or � is convex. For the case ofhigher-dimensional domains we refer the reader to [IK10]. The boundary ∂� is assumedto consist of two disjoint sets �1, �2, each of which is connected, or possibly consisting offinitely many connected components, with � = ∂� = �1 ∪ �2, and �2 possibly empty.Further it is assumed that g ∈ C2(R) and that g(H 1(�)) ⊂ L1+ε(�) for some ε > 0.Equation (6.4.6) is understood in the variational sense, i.e.,

(∇y,∇ϕ)� + (g(y), ϕ)� = (f , ϕ)� + (u, ϕ)� (6.4.7)

for all ϕ ∈ H 1(�), where

u ={

u on �1,

uf on �2,

(·, ·)� denotes the L2-inner product on �, and (·, ·)� stands for duality pairing betweenfunctions in Lp(�) and Lq(�) with p−1+ q−1 = 1 for appropriate choices of p. We recallthat H 1(�) is continuously embedded into Lp(�) for every p ≥ 1 if n = 2.

In (6.4.7) we should more precisely write 〈u, τ�ϕ〉� instead of 〈u, ϕ〉� , with τ� thezero order trace operator on �. However, we shall frequently suppress this notation. Werefer to y as a solution of (6.4.6) if (6.4.7) holds. The optimal control problem is given by⎧⎨⎩

min 12 |Cy − yd |2Z + α

2 |u|2L2(�1)

subject to (y, u) ∈ H 1(�)× L2(�1) a solution of (6.4.6).(6.4.8)

Here C is a bounded linear (observation) operator from H 1(�) to a Hilbert space Z, andyd ∈ Z and α > 0 are fixed.

To express (6.4.8) in the form (6.2.1) of Section 6.2 we introduce

e : H 1(�)× L2(�1)→ H 1(�)∗

with

〈e(y, u), ϕ〉(H 1)∗,H 1 = (∇y,∇ϕ)� + (g(y)− f , ϕ)� − (u, ϕ)�

Page 193: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 175

6.4. Applications 175

and

e : H 1(�)× L2(�)→ H 1(�)

by

e = N e,

where N : H 1(�)∗ → H 1(�) is the Neumann solution operator associated with

(∇v,∇ϕ)� + (v, ϕ) = (h, ϕ)� for all ϕ ∈ H 1(�),

where h ∈ H 1(�)∗. In the context of Section 6.2 we set

X = H 1(�)× L2(�1), Y = H 1(�),

with x = (y, u) ∈ X, and

f (x) = f (y, u) = 1

2|Cy − yd |2Z +

α

2|u|2L2(�1)

.

We assume that (6.4.8) admits a solution x∗ = (y∗, u∗). The regularity requirements (6.2.2)of Section 6.2 are clearly met by the mapping g. Those for e are implied by

(h0)y → g(y) is twice continuously differentiable from H 1(�) to L1+ε(�)

for some ε > 0 with Lipschitz continuous second derivative in aneighborhood of y∗.

We shall also require the following hypothesis:

(h1) g′(y∗) ∈ L2+ε(�) for some ε > 0.

With (h1) holding, g′(y∗)ϕ ∈ L2(�) for every ϕ ∈ H 1(�). It is simple to argue that

ker e′(x∗) = {(v, h) : (∇v,∇ϕ)�+ (

g′(y∗)v, ϕ)�= (

h, ϕ)�1

for all ϕ ∈ H 1(�)},i.e., (v, h) ∈ ker e′(x∗) if and only if (v, h) is a variational solution of⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩

−�v + g′(y∗)v= 0 in �,

∂v∂n=h on �1,

∂v∂n= 0 on �2.

(6.4.9)

If (6.2.3) is satisfied, i.e., if e′(x∗) is surjective, then there exists a unique Lagrange multiplierλ∗ ∈ H 1(�) associated to x∗ such that

e′(x∗)∗λ∗ + (NC∗(Cy∗ − yd), αu∗) = 0 in H 1(�)× L2(�1), (6.4.10)

where e′(x∗)∗ : H 1(�) → H 1(�) × L2(�1) denotes the adjoint of e′(x∗) and C∗ : Z →H 1(�)∗ stands for the adjoint of C : H 1(�) → Z with Z a pivot space. More preciselywe have the following proposition.

Page 194: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 176

176 Chapter 6. Augmented Lagrangian-SQP Methods

Proposition 6.10 (Necessary Condition). If x∗ is a solution to (6.4.8), e′(x∗) is surjective,and (h1) holds, then λ∗ is a variational solution of{ −�λ∗ + g′(y∗)λ∗ = −C∗(Cy∗ − yd) in �,

∂λ∗

∂n= 0 on �,

(6.4.11)

i.e., (∇λ∗,∇ϕ)+ (g′(y∗)λ∗, ϕ)+ (Cy∗ − yd, Cϕ)Z = 0 for all ϕ ∈ H 1(�) and

τ�1λ∗ = αu∗ on �1. (6.4.12)

Proof. The Lagrangian associated with (6.4.8) can be expressed by

L(y, u, λ) = 1

2|Cy − yd |2Z +

α

2|u|2L2(�1)

+ (∇λ,∇y)�

+(λ, g(y)− f

)�− (

λ, u)�.

For every (v, h) ∈ H 1(�)× L2(�1) we find

Ly(y∗, u∗, λ∗)(v) = (

Cy∗ − yd, Cv)Z+ (∇λ∗,∇v)

�+ (

λ∗, g′(y∗)v)�

and

Lu(y∗, u∗, λ∗)(h) = α

(u∗, h

)�1− (

λ∗, h)�1.

Thus the claim follows.

We now aim for a priori estimates for λ∗. We define B : H 1(�) → H 1(�)∗ as thedifferential operator given by the left-hand side of (6.4.11); i.e., Bv = ϕ is characterized asthe solution to(∇v,∇ψ)

�+ (

g′(y∗)v, ψ)�= 〈ϕ,ψ〉(H 1)∗,H 1 for all ψ ∈ H 1(�).

We shall use the following hypothesis:

(h2) 0 is not an eigenvalue of B.

Note that (h2) holds, for example, if

g′(y∗) ≥ β a.e. on �

for some β > 0. With (h2) holding, B is an isomorphism from H 1(�) onto H 1(�)∗.Moreover, (h2) implies surjectivity of e′(x∗).

Lemma 6.11. Let the conditions of Proposition 6.10 hold.(i) There exists a constant K(x∗) such that

|λ∗|H 1 ≤ K(x∗)|(NC∗(Cy∗ − yd), αu∗)|X.

Page 195: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 177

6.4. Applications 177

(ii) If moreover (h2) is satisfied and C∗(Cy∗ − yd) ∈ L2(�), then there exists aconstant K(y∗) such that

|λ∗|H 2 ≤ K(y∗)|C∗(Cy∗ − yd)|L2(�).

Proof. Due to surjectivity of e′(x∗) we have (e′(x∗)e′(x∗)∗)−1 ∈ L(H 1(�)) and thus (i)follows from (6.4.10). Let us turn to (ii). Due to (h2) and (6.4.11) there exists a constantKy∗ such that

|λ∗|H 1 ≤ Ky∗ |C∗(Cy∗ − yd)|(H 1)∗ . (6.4.13)

To obtain the desiredH 2(�) estimate for λ∗ one applies the well-knownH 2 a priori estimatefor Neumann problems to

−�λ∗ + λ∗ = w in �,

∂λ∗

∂n= 0 in ∂�

with w = λ∗ − g′(y∗)λ∗ − C∗(Cy∗ − yd). This gives

|λ∗|H 2 ≤ K(|λ∗|L2 + |g′(y∗)λ∗|L2 + |C∗(Cy∗ − yd)|L2

)for a constant K (depending on � but independent of y∗). Since

|g′(y∗)λ∗|L2 ≤ |g′(y∗)|L1+ε |λ∗|H 1 ,

the desired result follows from (6.4.13).

To calculate the second Fréchet derivative we shall use

(h3) g′′(y∗) ∈ L1+ε(�) for some ε > 0.

Proposition 6.12. Let (6.2.3), (h1), and (h3) hold. Then

L′′(y∗, u∗, λ∗)((v, h), (v, h)) = |Cv|2Z + α|h|2L2(�1)

+ (λ∗, g′′(y∗)v2

)�

(6.4.14)

for all (v, h) ∈ X.

Proof. It suffices to observe that by Sobolev’s embedding theorem there exists a constantKe such that

|(λ∗, g′′(y∗)v2)�| ≤ Ke|λ∗|H 1 |g′′(y∗)|L1+ε |v|2H 1 (6.4.15)

for all v ∈ H 1(�).

We turn now to an analysis of the second order sufficient optimality condition (6.2.5)for the optimal control problem (6.4.8). In view of (6.4.14) the crucial term is given by(λ∗, g′′(y∗)v2

)�

. Two types of results will be given. The first will rely on |(λ∗, g′′(y∗)v2)�|

Page 196: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 178

178 Chapter 6. Augmented Lagrangian-SQP Methods

being sufficiently small. This can be achieved by guaranteeing that λ∗ or, in view of (6.4.11),Cy∗ − yd is small. We refer to this case as small residual problems. The second class ofassumptions rests on guaranteeing that λ∗g′′(y∗) ≥ 0 a.e. on �.

In the statement of the following theorem we use Ke from (6.4.15) and K(x∗),K(y∗)from Lemma 6.11. Further ‖B−1‖ denotes the norm of B−1 as an operator from H 1(�)∗ toH 1(�).

Theorem 6.13. Let (6.2.3), (h1)–(h3) hold.(i) If Z = H 1(�), C = id, and

KeK(x∗)|(y∗ − yd, αg∗)|X|g′′(y∗)|Lq < 1, (6.4.16)

then the second order sufficient optimality condition (6.2.5) holds.(ii) If Z = L2(�), C is the injection of H 1(�) into L2(�), and

keK(y∗)|y∗ − yd |L2(�)|g′′(y∗)|L∞(�) ≤ 1, (6.4.17)

where ke is the embedding constant of H 2(�) into L∞(�), then (6.2.5) is satisfied.(iii) If

2‖B−1‖‖τ�1‖Ky∗ |C∗(Cy∗ − yd)|(H 1)∗ < α, (6.4.18)

where ‖τ�1‖ is the norm of the trace operator from H 1(�) onto L2(�1) and Ky∗ is definedin (6.4.13), then (6.2.5) is satisfied.

Proof. (i) By (6.4.14) and (6.4.15) we have for every (v, h) ∈ X

L′′(y∗, u∗, λ∗)((v, h), (v, h))≥ |v|2H 1(�) + α|h|2L2(�1)

−Ke|λ∗|H 1(�)|g′′(y∗)|L1+ε (�)|v|2H 1(�)

≥ (1−KeK(x∗)|(y∗ − yd, αu

∗)|X|g′′(y∗)|L1+ε (�)

) |v|2H 1 + α|h|2L2(�1),

where in the last estimate we used Lemma 6.11 (i). The claim now follows from (6.4.16).We observe that in this case L′′(y∗, u∗, λ∗) is positive definite on all of X, not only onker e′(x∗).

(ii) By (6.4.9) and (h2) we obtain

|v|H 1(�) ≤ ‖B−1‖‖τ�1‖|h|L2(�1) for all (v, h) ∈ ker e′(x∗). (6.4.19)

Here ‖τ�1‖ denotes the norm of the trace operator from H 1(�) onto L2(�1). Hence byLemma 6.11 (ii) and (6.4.19) we find for every (v, h) ∈ ker e′(x∗)

L′′(v∗, u∗, λ∗)((v, h), (v, h)) = |v|2L2(�) + α|h|2L2(�1)− (

λ∗, g′′(y∗)v2)L2(�)

≥ |v|2L2(�) + α|h|2L2(�1)− |λ∗|L∞(�)|g′′(y∗)|L∞(�)|v|2L2(�)

≥[1− keK(y∗)|g′′(y∗)|L∞(�)|y∗ − yd |L2(�)

]|v|2L2(�)

2

(|h|2L2(�1)

+ 1

‖B−1‖2‖τ�1‖2|v|2H 1(�)

).

Page 197: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 179

6.4. Applications 179

Due to (6.4.17) the expression in brackets is nonnegative and the result follows.(iii) In this case C can be a boundary observation operator, for example. As in (i) we

find

L′′(y∗, u∗, λ∗)((v, h), (v, h)) ≥ |Cv|2Z + α|h|2L2(�1)−Ke|λ∗|H 1(�)|g′′|L1+ε (�)|v|2H 1(�)

≥ |Cv|2Z + α|h|2L2(�1)−Ky∗ |C∗(Cy∗ − yd)|(H 1)∗ |g′′|Lε(�)|v|2H 1(�),

where (6.4.17) was used. This estimate and (6.4.19) imply that for (v, h) ∈ ker e′(x∗)

L′′(y∗, u∗, λ∗)((v, h), (v, h)) ≥ |Cv|2Z +α

2|h|2L2(�1)

+[

α

2‖B−1‖‖τ�1‖−Ky∗ |C∗(Cy∗ − yd)|(H 1)∗ |g′′|L1+ε (�)

]|v|2H 1(�).

The desired result follows from (6.4.18).

In view of (6.4.16)–(6.4.18) and the fact that |y∗ − yd | is decreasing with α → 0+,the question arises, whether decreasing α results in the fact that the second order sufficientoptimality condition. This is nontrivial since the term |y∗ − yd | in (6.4.16)–(6.4.18) ismultiplied by factors which depend on x∗ and hence on α itself. We refer the reader to[IK10] for a treatment of this issue.

In Theorem 6.13 the second order optimality condition was guaranteed by smallresidue conditions. Alternatively one can proceed by assuming that

(h4) λ∗g′′(y∗) ≥ 0 a.e. on �.

Theorem 6.14. Assume that (6.2.3), (h1), (h3), and (h4) hold and that

(a) Z = H 1(�) and C = id,

or

(b) (h2) is satisfied.

Then (6.2.5) holds.

Proof. By Proposition 6.12 and (h4) we find for all (v, h) ∈ X

L′′(y∗, u∗, λ∗)((v, h), (v, h)) ≥ |Cv|2Z + α|h|2L2(�1).

In the case that (a) holds the conclusion is obvious and L′′(v∗, u∗, λ∗) is positive not onlyon ker e′(x∗) but on all of X. In case (b) we use (6.4.19) to conclude that

L′′(y∗, u∗, λ∗)((v, h), (v, h)) ≥ |Cv|2Z +α

2

(|h|2L2(�1)

+ 1

‖B−1‖2‖τ�1‖2|v|2H 1(�)

)for all (v, h) ∈ ker e′(x∗).

Next we give a sufficient condition for (h4).

Page 198: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 180

180 Chapter 6. Augmented Lagrangian-SQP Methods

Theorem 6.15. Let (6.2.3), (h1) hold and assume that

(i) 〈C∗(yd − Cy∗), ψ〉(H 1)∗,H 1 ≥ 0 for all ψ ∈ H 1(�) with ψ ≥ 0, a.e.,

(ii) g′(y∗) ≥ 0 a.e. on �, and

(iii) g′′(y∗) ≥ 0 a.e. on �.

Then (h4) holds. The conclusion remains correct if the inequalities in (i) and (iii) arereversed.

Proof. Set ϕ = inf(0, λ∗) ∈ H 1(�) in (6.4.11). Then we have∫�

|∇ϕ|2dx + (g′(y∗)ϕ, ϕ)+ 〈c∗(Cy∗ − yd), ϕ〉(H 1)∗×H 1 = 0.

Since g′(y∗) ≥ 0 it follows from (i) that |∇ϕ|2L2(�)

= 0 and λ∗ ≥ 0. Together with (iii)we find λ∗g′′(y∗) ≥ 0 a.e. on �. If the inequalities in (i) and (iii) are reversed, we takeϕ = sup(0, λ∗).

Example 6.16. We consider

−�y + y3 − y = f in �,

∂y

∂n= u on � (6.4.20)

and the associated optimal control problem⎧⎨⎩ min 12

∫�|y − yd |2dx + α

2

∫�u2dx

subject to (y, u) ∈ H 1(�)× L2(�) a solution of (6.4.20).(6.4.21)

In the context of the problem (6.4.8) we set �1 = � = ∂�, �2 = ∅, and

Z = L2(�), C : H 1(�)→ L2(�) canonical injection, g(t) = t3 − t.

Equation (6.4.20) represents a simplified Ginzburg–Landau model for superconductivitywith y denoting the wave function, which is valid in the absence of internal magnetic fields(see [Tin, Chapters 1, 4]). Both (6.4.20) and −�y + y3 + y = h are of interest in thiscontext, but here we concentrate on (6.4.20) since it has three equilibria,±1 and 0, of which±1 are stable and 0 is unstable.

Proposition 6.17. Problem (6.4.21) admits a solution (y∗, u∗) ∈ H 1(�)× L1(�).

Proof. We first argue that the set of admissible pairs (y, u) ∈ H 1(�)× L2(�) for (6.4.21)is not empty. For this purpose we consider

−�y + y3 − y= f in �,

y= 0 on ∂�.(6.4.22)

Page 199: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 181

6.4. Applications 181

Let T : L6(�)→ L6(�) be the operator defined by

T (y) = (−�)−1(h+ y − y3),

where � denotes the Laplacian with Dirichlet boundary conditions. Clearly T is completelycontinuous and (I − T )y = 0 implies that y is a solution to (6.4.22) with y ∈ H 2(�) ∩H 1

0 (�). By a variant of Schauder’s fixed point theorem due to Schäfer [Dei, p. 61], either(I− t T )y = 0 admits a solution for every t ∈ [0, 1] or (I− t T )y = 0 admits an unboundedset of solutions for some t ∈ (0, 1). Assume that the latter is the case and let y = y(t)

satisfy (I − t T )y = 0. Then −(t)−1�y + y3 − y = f , and hence

t−1

∫�

|∇y|2dx +∫�

y4dx =∫�

y2dx +∫

f y dx

≤∫|y(x)|>2

y2dx +∫|y(x)|≤2

y2dx +∫�

|y| |f |dx.(6.4.23)

This implies that

t−1∫�

|∇y|2dx ≤ 4 |�| +∫�

|y| |f |dx.

Since y ∈ H 10 (�), Poincaré’s inequality implies the existence of a constant C independent

of y = y(t) such that

|y(t)|H 10≤ C(|f |L2 + |�|).

Consequently the set of solutions y(t) to (I − t T )y = 0 is necessarily bounded in L6(�),and (6.4.22) admits a solution y ∈ H 2(�) ∩ H 1

0 (�). Setting u = ∂y

∂u∈ H 1/2(�) ⊂ L2(�)

we find the existence of an admissible pair for (6.4.21).Let (yn, un) ∈ H 1(�) × L2(�) be a minimizing sequence for (6.4.21). Due to the

choice of the cost functional the sequence {un}∞n=1 is bounded in L2(�). Proceeding as in(6.4.23) it is simple to argue that {yn}∞n=1 is bounded in H 1(�). Standard weak convergencearguments then imply that every weak subsequential limit of {(yn, un)} is a solution to(6.4.21).

Proposition 6.18. For every (y, u) ∈ X = H 1(�)×L2(�) the linearization e′(y, u) : X→X is surjective.

Proof. We follow [GHS]. Since e = N e with N defined below (6.4.8) it suffices to showthat E : = e(y, u) : X→ H 1(�)∗ is surjective, where E is characterized by

〈E(v, h),w〉H 1,∗,H 1 = (∇v,∇w)�+ (

q v,w)�− (

h, τ� w)�,

with q = 3y2 − 1 ∈ H 1(�). Consider the operator E0 : H 1(�)→ H 1(�)∗ defined by

〈E0(v), w〉H 1,∗,H 1 = (∇v,∇w)�+ (

q v,w)�,

and observe that E0 = −�+I+(q−1)I ; i.e., E0 is a compact perturbation by the operator(q − 1)I ∈ L(H 1(�),H 1(�)∗) of the homeomorphism (−�+ I ) ∈ L(H 1(�),H 1(�)∗).

Page 200: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 182

182 Chapter 6. Augmented Lagrangian-SQP Methods

By the Fredholm alternative either E0, and hence E, is surjective or there exists a finite-dimensional kernel of E0 spanned by the basis {vi}Ni=1 in H 1(�) and E0v = z, for z ∈H 1(�)∗, is solvable if and only if 〈z, vi〉H 1,∗,H 1 = 0 for all i = 1, . . . , N . In the latter casevi, i = 1, . . . , N , are nontrivial on � by Lemma 6.19 below. Without loss of generalitywe may assume that

(vi, vj

)�= δij for i, j = 1, . . . , N . For z ∈ H 1(�)∗ we define z ∈

H 1(�)∗ by 〈z, w〉H 1,∗,H 1 = 〈z,w〉H 1,∗,H 1+ (h,w

)�

, where h = −∑Ni=1〈z, vi〉H 1,∗,H 1τ�vi ∈

L2(�). Then 〈z, vi〉H 1,∗,H 1 = 0 for all i = 1, . . . , N and hence there exists v ∈ H 1(�) suchthat E0v = z or equivalently(∇v,∇w)

�+ (

q v,w)�= 〈z,w〉H 1,∗,H 1 for all v ∈ H 1(�).

Consequently E is surjective.

Lemma 6.19. If for q ∈ L2(�) the function v ∈ H 1(�) satisfies(∇v,∇w)�+ (

q v,w)�= 0 for all v ∈ H 1(�) (6.4.24)

and v = 0 on �, then v = 0 in �.

Proof. Let � be a bounded domain with smooth boundary strictly containing �. Define

v ={v in �,

0 in � \�,

and let q denote the extension by 0 of q. Since v = 0 on � = ∂� we have v ∈ H 1(�) and(∇v,∇w)�+ (

q v, w)�= 0 for all w ∈ H 1

0 (�) by (6.4.24). This, together with the fact

that v = 0 on an open nonempty subset of �, implies that v = 0 in � and v = 0 in �; see[Geo].

Let us discuss the validity of the conditions (hi) for the present problem. Sobolev’sembedding theorem and

g′(t) = 3t2 − 1 and g′′(t) = 6t

imply that (h1), (h3), and (h4) hold. The location of the equilibria of g suggests that forh = 0, yd ≥ 1 implies 1 ≤ y∗ ≤ yd and similarly that yd ≤ −1 implies yd ≤ y∗ ≤ −1.This was confirmed in numerical experiments [IK10]. In these cases g′(y∗) ≥ β > 0 and(i)–(iii) of Theorem 6.15 hold.

The numerical results in [IK10] are based on Algorithm 6.3. While it does not containa strategy on the choice of c, it was observed that c > 0 and of moderate size is superior(with respect to region of attraction for convergence) to c = 0, which corresponds to theSQP method without line search. For example, for the choice yd = 1 and the initializationy0 = −2, the iterates yn have to pass through the stable equilibrium −1 and the unstableequilibrium 0 to reach the desired state yd . This can be accomplished with Algorithm 6.3with c > 0, without globalization strategy, but not with c = 0. Similar comments apply tothe case of Neumann boundary controls.

Page 201: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 183

6.5. Approximation and mesh-independence 183

Example 6.20. This is the singular system

−�y − y3 = h in �,

∂y

∂n= u on �, (6.4.25)

and the associated optimal control problem⎧⎨⎩min 1

2 |y − yd |2H 1(�)+ α

2

∫�u2ds

subject to (y, u) ∈ H 1(�)× L2(�1) a solution of (6.4.25),(6.4.26)

where yd ∈ H 1(�). If (6.4.25) admits at least one feasible pair (y, u), then it is simple toargue that (6.4.26) has a solution x∗ = (y∗, u∗). We refer the reader to [Lio2, Chapter 3] forexistence results in the case that the cost functional is of the form |y−yd |rLr (�)+α|g|2

L2(�)for

appropriately chosen r > 2. The existence of a Lagrange multiplier is assured in the samemanner as in Example 6.6. Clearly (h1) and (h3) are satisfied. For yd = const ≥ 1

2 weobserved numerically that 0 ≤ y∗ ≤ yd , λ∗ < 0, which in view of (h4) and Theorem 6.14explains the second order convergence rate that is observed numerically.

6.5 Approximation and mesh-independenceThis section is devoted to a brief description of some aspects concerning approximationof Algorithm 6.3. For this purpose let h denote a discretization parameter tending to zeroand, for each h, let Xh and Wh be finite-dimensional subspaces of X and W , respectively.In this section we set Z = X × W and Zh = Xh × Wh. The finite-dimensional spacesXh and Wh are endowed with the inner products and norms induced by X and W . Weintroduce surjective restriction operators RX

h ∈ L(X,Xh) and RWh ∈ L(W,Wh) and define

Rh = (RXh , R

Wh ). For each h, let fh : Xh → R and eh : Xh → Wh denote discretizations of

the cost functional f and the constraint e. Together with the infinite-dimensional problem(6.2.1) we consider the finite-dimensional discrete problems{

min fh(xh) over x ∈ Xh

subject to eh(x) = 0.(6.5.1)

The Lagrangians for (6.5.1) are given by

Lh(xh, λh) = fh(xh)+(eh(xh), λh

)W,

and the approximations of (6.2.6) are

L′h(xh, λh + ceh(xh)) = 0 , eh(xh) = 0, (6.5.2)

which are first order necessary optimality conditions for (6.5.1).

Page 202: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 184

184 Chapter 6. Augmented Lagrangian-SQP Methods

We require the following assumptions.

Condition 6.5.1. The approximations (Zh, Rh) of the space Z are uniformly bounded; i.e.,there exists a constant cR > 0 independent of h satisfying

‖Rh‖L(Z,Zh)≤ cR .

Condition 6.5.2. For every h problem (6.5.1) has a local solution x∗h . The mappings fh

and eh are twice continuously Fréchet differentiable in neighborhoods V ∗h of x∗h , and theoperators eh are Lipschitz continuous on V ∗h with a uniform Lipschitz constant ξe > 0independent of h.

Condition 6.5.3. For every h there exists a nonempty open and convex set V ∗h ⊆ V ∗h suchthat a uniform Babuška–Brezzi condition is satisfied on V ∗h ; i.e., there exists a constantβ > 0 independent of h satisfying

infwh∈Wh

supqh∈Xh

(e′h(xh)qh, wh

)W

‖qh‖X ‖wh‖W≥ β for all xh ∈ V ∗h .

The Babuška–Brezzi condition implies that the operators e′h(xh) are surjective on V ∗h .Hence, if Condition 6.5.3 holds, there exists for every h > 0 a Lagrange multiplier λ∗h ∈ Wh

such that (x∗h, λ∗h) solves (6.5.2).

Condition 6.5.4. There exist a scalar r > 0 independent of h and a neighborhood V (λ∗h)of λ∗h for every h such that for V ∗h = V ∗h × V (λ∗h)

(i) B((x∗hλ

∗h); r

) ⊆ V ∗h and(ii) a uniform second order sufficient optimality condition is satisfied on V ∗h ; i.e., there

exists a constant κ > 0 such that for all (xh, λh) ∈ V ∗hL′′h(xh, λh)(qh)

2 ≥ κ ‖qh‖2X

for all qh ∈ ker e′h(xh).

We define

F(x, λ) =(

L′(x, λ)e(x)

).

For every h and for all (xh, λh) ∈ V ∗h we introduce approximations of F and M by

Fh(xh, λh)=(

L′h(xh, λh)

eh(xh)

),

Mh(xh, λh)=(

L′′h(xh, λh) e′h(xh)

e′h(xh) 0

).

By Conditions 6.5.3 and 6.5.4 there exists a bound η which may depend on β and κ∗ but isindependent of h so that

‖M−1h (xh, λh)‖L(Zh)

≤ η (6.5.3)

Page 203: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 185

6.5. Approximation and mesh-independence 185

for all (xh, λh) ∈ V ∗h . We require the following consistency properties of Fh and Mh, whereV (x∗, λ∗) = V (x∗)×V (λ∗) denotes the neighborhood of the solution x∗ of (6.2.1) and theassociated Lagrange multiplier λ∗ defined above (6.2.8).

Condition 6.5.5. For each h we have Rh(V (x∗, λ∗)) ⊆ V ∗h , and the approximations of theoperators F and M are consistent on V (x∗) and V (x∗, λ∗), respectively, i.e.,

limh→0

‖Fh(Rh(x, λ))− RhF(x, λ)‖Z = 0

and

limh→0

∥∥∥∥Mh(Rh(x, λ))Rh

(q

w

)− RhM(x, λ)

(q

w

)∥∥∥∥Z

= 0

for all (q,w) ∈ Z and (x, λ) ∈ V (x∗, λ∗). Moreover, for every h let the operator Mh beLipschitz continuous on V ∗h with a uniform Lipschitz constant ξM > 0 independent of h.

Now we consider the following approximation of Algorithm 6.3.

Algorithm 6.7.

(i) Choose (x0h, λ

0h) ∈ V ∗h , c ≥ 0 and put n = 0.

(ii) Set λnh = λn

h + c eh(xnh).

(iii) Solve for (xh, λh):

Mh(xnh, λ

nh)

(xh − xn

h

λh − λnh

)= −

(L′h(xn

h, λnh)

eh(xnh)

).

(iv) Set (xn+1h , λn+1

h ) = (xh, λh), n = n+ 1, and goto (ii).

Theorem 6.21. Let c ‖(x0h, λ

0h)− (x∗h, λ

∗h)‖Z be sufficiently small for all h and let (6.2.3),

(6.2.5), and Conditions 6.5.2–6.5.4 hold. Then we have the following:(a) Algorithm 6.7 is well defined and

‖(xn+1h , λn+1

h )− (x∗h, λ∗h)‖Z ≤ C ‖(xn

h, λnh)− (x∗h, λ

∗h)‖2

Z,

where C = 12 η ξM max(2, 1+ 2c2ξ 2

e ) is independent of h.(b) Let (x0, λ0) be a startup value of Algorithm 6.3 such that c ‖(x0, λ0)− (x∗, λ∗)‖Z

is sufficiently small and assume that

limh→0

∥∥(x0h, λ

0h)− Rh(x

0, λ0)∥∥Z= 0. (6.5.4)

If in addition Conditions 6.5.1 and 6.5.5 are satisfied, we obtain

limh→0

∥∥(xnh, λ

nh)− Rh(x

n, λn)∥∥Z= 0

Page 204: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 186

186 Chapter 6. Augmented Lagrangian-SQP Methods

for alln, where (xnh, λ

nh)and (xn, λn)are thenth iterates of the finite- and infinite-dimensional

methods, respectively.

Corollary 6.22. Under the hypotheses of Theorem 6.21

limh→0

‖(x∗h, λ∗h)− Rh(x∗, λ∗)‖

Z= 0

holds.

Corollary 6.23. Let the hypotheses of Theorem 6.21 hold and let {(Zh(n), Rh(n))} be asequence of approximations with limn→∞ h(n) = 0. Then we have

limn→∞‖(x

nh(n), λ

nh(n))− Rh(n)(x

∗, λ∗)‖Z= 0.

We turn to a brief account of mesh-independence. This is an important feature of iter-ative approximation schemes for infinite-dimensional problems. It asserts that the numberof iterations to reach a specified approximation quality ε > 0 is independent of the meshsize. We require the following notation:

n(ε) = min{n0| for n ≥ n0 :

∥∥F (xn, λn + ce(xn)

)∥∥Z< ε

},

nh(ε) = min{n0| for n ≥ n0 :

∥∥Fh

(xnh, λ

nh + ceh(x

nh)

)∥∥Z< ε

}.

We point out that both n(ε) and nh(ε) depend on the startup values (x0, λ0) and (x0h, λ

0h) of

the infinite- and finite-dimensional methods.

Theorem 6.24. Under the assumptions of Theorem 6.21 there exists for each ε > 0 aconstant hε > 0 such that

n(ε)− 1 ≤ nh(ε) ≤ n(ε) for h ∈ (0, hε] .

The proofs of the results of this section as well as numerical examples which demon-strate that mesh-independence is also observed numerically can be found in [KVo1, Vol].

6.6 CommentsSecond order augmented Lagrangian methods as discussed in Section 6.2 were treatedin [Be] for finite-dimensional and in [IK9, IK10] for infinite-dimensional problems. Thehistory for SQP methods is a long one; see [Han, Pow1] and [Sto, StTa], for exam-ple, and the references therein for the analysis of finite-dimensional problems. Infinite-dimensional problems are considered in [Alt5], for example. Mesh-independence of aug-mented Lagrangian-SQP methods was analyzed in [Vol]. This paper also contains manyreferences on mesh-independence for SQP and Newton methods. Methods which replaceequality- and inequality-constrained optimization problems by a linear quadratic approx-imation with respect to the equality constraints and the cost functional, while leaving the

Page 205: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 187

6.6. Comments 187

equality constraints as explicit constraints, are frequently referred to as Lagrange–Newtonmethods in the literature. We quote [Alt4, AlMa, Don, Tro] and the references therein.Reduced SQP methods in infinite-dimensional spaces were analyzed in [JaSa, KSa, Kup]among others. For optimization problems with equality and simple inequality constraints theSQP method can advantageously be combined with projection methods; see [Hei], for ex-ample. We have not considered the topic of approximating the Hessian by secant updates.In part this is due to the fact that in the context of optimal control of partial differentialequations the structure of the nonlinearities is such that the continuous problem may allowa rather straightforward characterization of the gradient and Hessian, provided, of course,that it is sufficiently regular. Secant methods for SQP methods are considered in, e.g.,[Sto, KuSa, Kup, StTa].

Page 206: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 188

Page 207: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 189

Chapter 7

The Primal-DualActive Set Method

This chapter is devoted to the primal-dual active set strategy for variational problems withsimple constraints. This is an efficient method for solving the optimality systems arisingfrom quadratic programming problems with unilateral or bilateral affine constraints and it isequally well applicable to certain complementarity problems. The algorithm and some of itsbasic properties are described in Section 7.1. In the ensuing sections sufficient conditionsfor its convergence with arbitrary initialization and without globalization are presentedfor a variety of different classes of problems. Sections 7.2 and 7.3 are devoted to thefinite-dimensional case where the system matrix is an M-matrix or has a cone-preservingproperty, respectively. Operators which are of diagonally dominant type are considered inSections 7.4 and 7.5 for unilateral, respectively, bilateral problems. In Section 7.6 nonlinearoptimal control problems with control constraints are investigated.

7.1 Introduction and basic propertiesHere we investigate the primal-dual active method. Let us consider the quadratic program-ming problem

⎧⎪⎪⎨⎪⎪⎩minx∈X

1

2

(Ax, x

)X− (

a, x)X

subject to Ex = b, Gx ≤ ψ,

(7.1.1)

where A ∈ L(X) is a self-adjoint operator in the real Hilbert space X, E ∈ L(X,W),G ∈ L(X,Z), with W a real Hilbert space, and Z = Rn or Z = L2(�), with � a domainin Rd , endowed with the usual Hilbert space structure and the natural ordering. We assumethat (7.1.1) admits a unique solution denoted by x. If x is a regular point in the sense ofDefinition 1.5 (with C = X and g(x) = (Ex − b,Gx − ψ)), then there exists a Lagrange

189

Page 208: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 190

190 Chapter 7. The Primal-Dual Active Set Method

multiplier (λ, μ) ∈ W × Z such that

Ax + E∗λ+G∗μ = a,

Ex = b,

μ = max(0, μ+ c (Gx − ψ)),

(7.1.2)

where c > 0 is a fixed constant and max is interpreted pointwise a.e. in � if Z = L2(�) andcoordinatewise if Z = Rn. The third equation in (7.1.2) constitutes the complementaritycondition associated with the inequality constraint in (7.1.1), as discussed in Examples 4.51and 4.53. Note that the auxiliary problems in the augmented Lagrangian-SQP methods ofChapter 6 (see for instance (ii) of Algorithm 6.4 or (ii) of Algorithm 6.6) take the formof (7.1.1).

The primal-dual active set method that will be analyzed in this chapter is an efficienttechnique for solving (7.1.2). While (7.1.2) is derived from (7.1.1) with A self-adjoint, thisassumption is not essential in the remainder of this chapter, and we therefore drop it unlessit is explicitly specified.

We next give two examples of constrained optimal control problems which are specialcases of (7.1.1).

Example 7.1. Let X = L2(�), where � ⊂ � is the control domain and consider theoptimal control problem⎧⎪⎪⎨⎪⎪⎩

minu∈X

1

2

∫�

|y − yd |2 dx + α

2|u|2X

subject to −�y = Bu, y = 0 on ∂� and u ≤ ψ,

where α > 0, yd ∈ L2(�), ψ ∈ L2(�), and B ∈ L(X,L2(�)) is the extension-by-zerooperator � to �. This problem can be formulated as (7.1.1) without equality constraint(E = 0) by setting

A = αI + B∗(−�)−2B, a = B∗(−�)−1y, and G = I,

where � denotes the Laplacian with homogeneous Dirichlet boundary conditions.

Example 7.2. Here we consider optimal control of the heat equation with Neumann bound-ary control and set, X = L2(0, T ;L2(�)), where � is a subset of the boundary ∂� of �:⎧⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎩

minu∈X

1

2

∫ T

0

∫�

|y − yd |2 dx dt + α

2|u|2U

subject to ddty = �y, y(0, ·) = y0 y = 0 on ∂� \ �,

ν · ∇y(t) = Bu(t) in �, u(t) ≤ ψ,

where α > 0, yd ∈ L2(0, T ;L2(�)), y0 ∈ L2(�), � is a subset of the boundary ∂�, ν isthe outer normal to �, and B is the extension-by-zero operator � to ∂�. For u ∈ X there

Page 209: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 191

7.1. Introduction and basic properties 191

exists a unique solution y = y(u) ∈ Y = L2(0, T ;H 1(�))∩H 1(0, T ;L2(�)) to the initialboundary value problem arising as the equality constraints. Let T ∈ L(U, Y ) denote thesolution operator given by y = T (u). Setting

A = αI + T ∗T , a = T ∗yd, and G = I,

this optimal control problem is equivalent to (7.1.1).

In these examples A is bounded on X. More specifically it is a compact perturbationof a multiple of the identity. There are important problems which are not covered by (7.1.1).For the obstacle problem, considered in Section 4.7.4, the choice X = L2(�) results in theunbounded operator A = −�, and hence it is not covered by (7.1.1). If X is chosen asH 1

0 (�) and G as the inclusion of H 10 (�) into L2(�), then the solution does not satisfy the

regular point condition. While special techniques allow one to guarantee the existence ofa Lagrange multiplier μ associated with the constraint y ≤ ψ in L2(�), the natural spacefor the Lagrange multiplier is H−1(�). This suggests combining the primal-dual active setmethod with a regularization method, which will be discussed in a more general context inSection 8.6. Discretization of the obstacle problem by central finite differences leads to asystem matrix A on X = Rn that is symmetric positive definite and is an M-matrix. Thisimportant class of finite-dimensional variational problems will be discussed in Section 7.2.

Another important class of problems are state-constrained optimal control problem,for example,

min1

2

∫�

|y − y|2 dx + α

2|u|2L2(�)

subject to

−�y = u, y = 0 on ∂�, y ≤ ψ

for y ∈ H 10 (�) and u ∈ L2(�). This problem can be formulated as (7.1.1) with X =

(H 2(�) ∩H 10 (�))× L2(�) and G the inclusion of X into L2(�). The solution, however,

will not satisfy a regular point condition, and the Lagrange multiplier associated with thestate constraint y ≤ ψ is only a measure in general. This class of problems will be discussedin Section 8.6.

Let us return to (7.1.2). If the active set

A = {μ+ c(Gx − ψ) > 0}is known, then the linear system reduces to

Ax + E∗λ+G∗μ = a,

Ex = b,

Gx = ψ in A and μ = 0 in Ac.

Here and below we frequently use the notation {f > 0} to stand for {x : f (x) > 0} iff ∈ L2(�) and {i : fi > 0} if f ∈ Rn. The active set at the solution, however, is unknown.

Page 210: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 192

192 Chapter 7. The Primal-Dual Active Set Method

The primal-dual active set method uses the complementarity condition

μ = max(0, μ+ c (Gx − ψ))

as a prediction strategy. Based on the current primal-dual pair (x, μ) the updates for theactive and inactive sets are determined by

I = {μ+ c(Gx − ψ) ≤ 0} and A = {μ+ c(Gx − ψ) > 0}.This leads to the following Newton-like method.

Primal-Dual Active Set Method.

(i) Initialize x0, μ0. Set k = 0.

(ii) Set Ik = {μk + c(Gxk − ψ) ≤ 0}, Ak = {μk + c(Gxk − ψ) > 0}.(iii) Solve for (xk+1, λk+1, μk+1):

Axk+1 + E∗λk+1 +G∗μk+1 = a,

Exk+1 = b,

Gxk+1 = ψ in Ak and μk+1 = 0 in Ik.

(iv) Stop, or set k = k + 1, and return to (ii).

It will be shown in Section 8.4 that the above algorithm can be interpreted as a semismoothNewton method for solving (7.1.2). This will allow the local convergence analysis of thealgorithm. In this chapter we concentrate on its global convergence, i.e., convergence forarbitrary initializations and without the necessity for a line search. In case A is positivedefinite, a sufficient condition for the existence of solutions to the auxiliary systems of step(iii) of the algorithm is given by surjectivity of E and surjectivity G : N(E)→ Z. In factin this case (iii) is the necessary optimality condition for⎧⎪⎪⎨⎪⎪⎩

minx∈X

1

2

(Ax, x

)X− (

a, x)X

subject to Ex = b, Gx = ψ on Ak.

In the remainder of this chapter we focus on the reduced form of (7.1.2) given by

Ax + μ = a, μ = max(0, μ+ c (x − ψ)), (7.1.3)

where A ∈ L(Z).We now derive sufficient conditions which allow us to transform (7.1.2) into (7.1.3).

In a first step we assume that

G is surjective, range(G∗) ⊂ kerE, Ex = b for some x ∈ (ker E)⊥. (7.1.4)

Page 211: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 193

7.1. Introduction and basic properties 193

Note that (7.1.4) implies that G : N(E) → Z is surjective. If not, then there exists anonzero z ∈ Z such that (z,Gx)Z = (G∗z, x)X = 0 for all x ∈ ker E. If we let x = G∗z,then |x|2 = 0 and z = 0, since G∗ is injective. Let PE denote the orthogonal projection inX onto ker E. Then (7.1.2) is equivalent to

Ax +G∗μ = PE(a − Ax), μ = max(0, μ+ c (Gx − (ψ −Gx),

E∗λ = (I − PE)(a − A(PEx + x)),

with A = PEAPE and x = x + x ∈ ker E + (ker E)⊥. The first of the above equations isequivalent to the system

(I − PG)A((I − PG) x + PG x)+G∗μ = (I − PG)PE(a − A x),

PG A((I − PG) x + PG x) = PG PE (a − A x),

(7.1.5)

where PG = I − G∗(GG∗)−1G is the orthogonal projection in ker E ⊂ X onto ker G.Since G∗ is injective the first equation in (7.1.5) is equivalent to

(GG∗)−1GA(G∗(GG∗)−1y + x2)+ μ = (GG∗)−1GPE(a − A x),

where y = Gx1 for x1 ∈ ker E ∩ (ker G)⊥, x2 ∈ ker G, and x = x1 + x2. Let

A11 = (GG∗)−1GAG∗(GG∗)−1, A12 = (GG∗)−1GAPG,A22 = PGAPG

and

a1 = (GG∗)−1GPE(a − Ax), a2 = PGPE(a − Ax).

Then (7.1.5) is equivalent to the following equation in Z × (ker E ∩ ker G):(A11 A12

A21 A22

)(y

x2

)+

0

)=

(a1

a2

). (7.1.6)

Equation (7.1.6) together with

μ = max(0, μ+ c(y − (ψ −Gx))) (7.1.7)

and E∗λ = (I − PE)(a − A(x + x)) are equivalent to (7.1.2) where x = x + x, withx = x1 + x2 ∈ ker E, x2 ∈ ker E ∩ ker G, x1 ∈ ker E ∩ (ker G)⊥, y = Gx1.

Note that the system matrix in (7.1.6) is positive definite if A restricted to ker E ispositive definite.

Let us now further assume that

A22 is nonsingular. (7.1.8)

Then (7.1.6), (7.1.7) are equivalent to

(A11 − A12A−122 A

∗12)y + μ = a1 − A12A

−122 a2

Page 212: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 194

194 Chapter 7. The Primal-Dual Active Set Method

and (7.1.7), which is of the desired form (7.1.3). In the finite-dimensional case, (7.1.3)admits a unique solution for every a ∈ Rn if and only if A is a P -matrix; see [BePl,Theorem 10.2.15.]. Recall that A is called a P -matrix if all its principal minors are positive.In view of the fact that the reduction of (7.1.6) to (7.1.3) was achieved by taking the Schurcomplement with respect to A22 it is also worthwhile to recall that the Schur complementof a P -matrix (resp., M-matrix) is again a P -matrix (resp., M-matrix); see [BePl, p. 292].

For further reference it will be convenient to specify the primal-dual active set algo-rithm for the reduced system (7.1.3).

Primal-Dual Active Set Method for Reduced System.

(i) Initialize x0, μ0. Set k = 0.

(ii) Set Ik = {μk + c(xk − ψ) ≤ 0}, Ak = {μk + c(xk − ψ) > 0}.(iii) Solve for (xk+1, μk+1):

Axk+1 + μk+1 = a,

xk+1 = ψ in Ak and μk+1 = 0 in Ik.

(iv) Stop, or set k = k + 1, and return to (ii).

We use (7.1.3) rather than (7.1.6) for the convergence analysis for the reason ofavoiding additional notation. All convergence results that follow equally well apply to(7.1.6) where it is understood that the coordinates corresponding to the variable x2 aretreated as inactive throughout the algorithm and the corresponding Lagrange multipliers areset and updated by 0.

In the following subsections convergence will be proved under various different con-ditions. These conditions will imply also the existence of a solution to the subsystems instep (iii) of the algorithm as well as the existence of a unique solution to (7.1.3).

For � ⊂ {1, . . . , n}, respectively, � ⊂ �, let R� denote the restriction operator to�. Then R∗

�is the extension-by-zero operator to �c. For any A let I be its complement in

Rn, respectively, �, and denote

AA = RAAR∗A, AA,I = RAAR∗I,

and analogously for AI and AI,A, and

δxA = RA(xk+1 − xk), δxI = RI(xk+1 − xk),

and analogously for δμA and δμI . From (iii) above we have

AAkδxAk

+ AAk ,IkδxIk

+ δμAk= 0,

AIkδxIk

+ AIk ,AkδxAk

− μkIk= 0.

(7.1.9)

Page 213: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 195

7.1. Introduction and basic properties 195

The following properties for k = 1, 2, . . . follow from steps (ii) and (iii):

μk(xk − ψ) = 0, μk + (xk − ψ) > 0 on Ak,

xk − ψ ≥ 0, μk ≥ 0 on Ak, xk ≤ ψ, μk ≤ 0 on Ik,

δxAk≤ 0, δμIk

≥ 0,

(7.1.10)

all statements holding coordinatewise if Z = Rn and pointwise a.e. for Z = L2(�).

Remark 7.1.1. From (7.1.10) it follows that Ak = Ak+1 implies that the solution is found,i.e., (xk, μk) = (x∗, μ∗). In numerical practice it was observed that Ak = Ak+1 can beused as a stopping criterion; see [HiIK, IK20, IK22].

Remark 7.1.2. The primal-dual active set strategy can be interpreted as a prediction strategywhich, on the basis of (xk, μk), predicts the true active and inactive sets for (7.1.3), i.e.,the sets

A∗ = {μ∗ + c(x∗ − ψ) > 0} and I∗ = (A∗)c.

To further pursue this point we assume that the systems in (7.1.3) and step (iii) of thealgorithm admit solutions, and we define the following partitioning of the index set atiteration level k:

IG = Ik ∩ I∗, IB = Ik ∩A∗, AG = Ak ∩A∗, AB = Ak ∩ I∗.

The sets IG, AG give a good prediction and the sets IB , AB give a bad prediction. Letus denote �x = xk+1 − x∗, �μ = μk+1 − μ∗, and we denote by G((xk, μk)) the systemmatrix for step (iii) of the algorithm:

G(xk, μk) =

⎛⎜⎜⎝AIk

AIkAkIIk

0AAkIk

AAk0 IAk

0 0 IIk0

0 −cIAk0 0

⎞⎟⎟⎠ .

Then we have the identity

G(xk, μk)

⎛⎜⎜⎝�xIk

�xAk

�μIk

�μAk

⎞⎟⎟⎠ = −col(0Ik

, 0Ak, 0IG

, μ∗IB, 0AG

, c(ψ − x∗)AB

). (7.1.11)

Here we assumed that the components of the equation μ−max{0, μ+ c(x − ψ)} = 0 areordered as (IG, IB,AG,AB). Since xk ≥ ψ on Ak and μk ≤ 0 on Ik , we have

|ψ − x∗|AB≤ |xk − x∗|AB

and |μ∗|IB≤ |μk − μ∗|IB

. (7.1.12)

By the definition

�yAG= 0, �xAB

= (ψ − x∗)AB, �μIG

= 0, �μIB= −μ∗IB

. (7.1.13)

Page 214: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 196

196 Chapter 7. The Primal-Dual Active Set Method

On the basis of (7.1.11)–(7.1.13) we can draw the following conclusions.(i) If xk → x∗, then, in the finite-dimensional case, there exists an index k such that

IB = AB = ∅ for all k ≥ k. Consequently the convergence occurs in finitely many steps.(ii) By (7.1.11)–(7.1.12) there exists a constant κ ≥ 1 independent of k such that

|�x| + |�μ| ≤ κ(|(xk − x∗)AB

| + |(μk − μ∗)IB|) .

Thus if the incorrectly predicted sets are small in the sense that

|(xk − x∗)AB| + |(μk − μ∗)IB

| ≤ 1

2κ − 1(|(xk − x∗)AB,c

| + |(μk − μ∗)IB,c|),

where AB,c, IB,c denote the complement of the indices AB , IB , respectively, then

|xk+1 − x∗| + |μk+1 − μ∗| ≤ 1

2(|xk − y∗| + |μk − μ∗|),

and convergence follows.(iii) If x∗ < ψ and μ0 + c (x0 − ψ) ≤ 0 (e.g., y0 = ψ , μ0 = 0), then the algorithm

converges in one step. In fact, in this case AB = IB = ∅.

7.2 Monotone classIn this section we assume that Z = Rn and that A is an M-matrix, i.e., it is nonsingular, itsnondiagonal elements are nonnegative, and A−1 ≥ 0. Then there exists a unique solution(x∗, μ∗) to (7.1.3).

Example 7.3. Consider the discretized obstacle problem on the square (0, 1)× (0, 1). Leth = 1

Nand letui,j denote the approximation of the solution at the nodes (ih, jh), 0 ≤ i, j ≤

N . Using the central difference approximation results in a finite-dimensional variationalinequality satisfying

4ui,j − ui+1,j − ui−1,j − ui,j+1 − ui,j−1

h2+ μi,j = fi,j ,

μi,j = max(0, μi,j + c (ui,j − ψi,j )

for 1 ≤ i, j ≤ N − 1, and u0,j = uN,j = ui,0 = ui,N = 0. The resulting matrix A inR(N−1)2

is an M-matrix.

The following theorem asserts the convergence of the iterates of the primal-dual activeset method for (7.1.3).

Theorem 7.4. Assume that A is an M-matrix. Then xk → x∗ for arbitrary initial data.Moreover x∗ ≤ xk+1 ≤ xk for all k ≥ 1, xk ≤ ψ for all k ≥ 2, and there exists k0 such thatμk ≥ 0 for all k ≥ k0.

Proof. Since A is an M-matrix we have A−1I ≥ 0 and A−1

I AI,A ≤ 0 for every indexpartition of {1, . . . , n} into I and A. Since δxIk

= −A−1Ik

AIkAkδxAk

−A−1Ik

δμIkby (7.1.9)

Page 215: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 197

7.3. Cone sum preserving class 197

it follows that δxIk≤ 0. Together with δxAk

≤ 0, which follows from the third equationin (7.1.10), this implies that xk+1 ≤ xk for k ≥ 1. Next we show that xk is feasible fork ≥ 2. Due to monotonicity of xk with respect to k it suffices to show this for k = 2. For isuch that (x1 − ψ)i > 0 we have μ1

i = 0 by (7.1.10) and hence μ1i + c (x1 − ψ)i > 0 and

i ∈ A1. Since x2 = ψ on A1 and x2 ≤ x1 it follows that x2 ≤ ψ .To verify that x∗ ≤ xk for k ≥ 1, note that

aIk−1 = μ∗Ik−1+ AIk−1x

∗Ik−1

+ AIk−1Ak−1x∗Ak−1

= AIk−1xkIk−1

+ AIk−1Ak−1ψAk−1 ,

and consequently

AIk−1

(xk

Ik−1− x∗Ik−1

) = μ∗Ik−1+ AIk−1Ak−1

(x∗Ak−1

− ψAk−1

).

Since μ∗Ik−1≥ 0 and x∗Ak−1

≤ ψAk−1 , the M-matrix properties of A imply that xkIk−1

≥ x∗Ik−1

and consequently xk ≥ x∗ for all k ≥ 1.Turning to the feasibility of μk assume that for a pair of indices (k, i), k ≥ 1, we have

μki < 0. Then necessarily i ∈ Ak−1, xk

i = ψi , and μki + c(xk

i − ψi) < 0. It follows that

i ∈ Ik , μk+1i = 0, and μk+1

i + c(xk+1i − ψi) ≤ 0, since xk+1

i ≤ ψi , k ≥ 1. Consequentlyi ∈ Ik+1 and by induction i ∈ Ik for all k ≥ k + 1. Thus, whenever a coordinate of μk

becomes negative at iteration k, it is zero from iteration k+1 onwards, and the correspondingprimal coordinate is feasible. Due to finite-dimensionality of Rn it follows that there existsko such that μk ≥ 0 for all k ≥ ko.

Monotonicity of xk and x∗ ≤ xk ≤ ψ for k ≥ 2 imply the existence of x suchthat lim xk = x ≤ ψ . Since μk = Axk + a ≥ 0 for all k ≥ ko, there exists μ suchthat lim μk = μ ≥ 0. Together with the complementarity property μ(x − ψ), which is aconsequence of the first equation in (7.1.10), it follows that (x, μ) = (x∗, μ∗).

7.3 Cone sum preserving classFor rectangular matrices B ∈ Rn×m we denote by ‖ · ‖1 the subordinate matrix norm whenboth Rn and Rm are endowed with the 1-norms. Moreover, B+ denotes the n × m matrixcontaining the positive parts of the elements of B. Recall that a square matrix is called aP -matrix if all its principle minors are positive.

Theorem 7.5. If A is a P -matrix and for every partitioning of the index set {1, . . . , n} intodisjoint subsets I and A we have ‖(A−1

I AIA)+‖1 < 1 and∑

i∈I(A−1I xI)i > 0 for xI ≥ 0

with xI �= 0, then limk→∞ xk = x∗.

The third condition in the above theorem motivates the terminology cone sum pre-serving. If A is an M-matrix, then the conditions of Theorem 7.5 are satisfied. The proofwill reveal that M(xk) =∑n

i=1 xki is a merit function.

Proof. From (7.1.9) and the fact that xk+1 = ψ on Ak we have for k = 1, 2, . . .

(xk+1 − xk)Ik= A−1

IkAIkAk

(xk − ψ)Ak+ A−1

Ikμk

Ik

Page 216: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 198

198 Chapter 7. The Primal-Dual Active Set Method

and upon summation over the inactive indices∑i∈Ik

(xk+1i − xk

i ) =∑i∈Ik

(A−1

IkAIkAk

(xk − ψ)Ak

)i+

∑i∈Ik

(A−1Ik

μkIk)i . (7.3.1)

Using again that xk+1 = ψ on Ak , this implies that

n∑i=1

(xk+1i −xk

i ) = −∑i∈Ak

(xki −ψi)+

∑i∈Ik

(A−1Ik

AIkAk(xk−ψ)Ak

)i +∑i∈Ik

(A−1Ik

μkIk)i . (7.3.2)

Since xkAk≥ ψAk

it follows that

n∑i=1

(xk+1i − xk

i ) ≤ (‖(A−1Ik

AIkAk)+‖1 − 1) |xk − ψ |1,Ak

+∑i∈Ik

(A−1Ik

μkIk)i < 0 (7.3.3)

unless xk is the solution to (7.1.3). In fact, if |xk −ψ |1,Ak, then xk ≤ ψ on Ak and μk ≥ 0

on Ak . If moreover μIk= 0, then μk ≥ 0 and xk ≤ ψ on �. Together with the first

equation in (7.1.10), this implies that {(xk, μk)} satisfies the complementarity conditions.It also satisfies Axk + μk = a and hence (xk, μk) is a solution to (7.1.3). Consequently

xk →M(xk) =n∑

i=1

xki

acts as a merit function for the algorithm. Since there are only finitely many possible choicesfor active/inactive sets, there exists an iteration index k such that Ik = Ik+1. In this case(xk+1, μk+1) is a solution to (7.1.3). In fact, in view of (iii) of the algorithm it suffices to showthat xk+1 and μk+1 are feasible. This follows from the fact that due to Ik = Ik+1 we have

c(xk+1i −ψi) = μk+1

i + c(xk+1i −ψi) ≤ 0 for i ∈ Ik and μk+1

i + c(xk+1i −ψi) = μk+1

i > 0for i ∈ Ak . From (7.1.10) we deduce μk+1(xk+1−ψ) = 0, and hence the complementarityconditions hold and the algorithm converges in finitely many steps.

A perturbation result. We now discuss the primal-dual active set strategy for the casewhere the matrix A can be expressed as an additive perturbation of an M-matrix.

Theorem 7.6. Assume that A = M + K with M an M-matrix and K an n × n matrix.If ‖K‖1 is sufficiently small, then the primal-dual active set algorithm is well defined andlimk→∞(xk, μk) = (x∗, μ∗), where (x∗, μ∗) is a solution to (7.1.3). If yT Ay > 0 for y �= 0,then the solution to (7.1.3) is unique.

Proof. As a consequence of the assumption that M is an M-matrix all principal submatricesof M are M-matrices as well [BePl]. Let S denote the set of all subsets of {1, . . . , n} andA its complement. Define

ρ = supI∈S

‖M−1I KI‖1 and σ = sup

I∈S‖BIA(K)‖1, (7.3.4)

Page 217: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 199

7.3. Cone sum preserving class 199

where BIA(K) = M−1I KIA−M−1

I KI(M+K)−1I AIA. Assume that K is chosen such that

ρ < 12 and σ < 1. For every subset I ∈ S the inverse of AI exists and can be expressed as

A−1I = (II +

∞∑i=1

(−M−1I KI)i

)M−1

I .

Consequently the algorithm is well defined. Proceeding as in the proof of Theorem 7.5 wearrive at

n∑i=1

(xk+1i − xk

i ) = −∑i∈Ak

(xki − ψi)+

∑i∈Ik

(A−1

IkAIkAk

(xk − ψ)Ak

)i+

∑i∈Ik

(A−1Ik

μkIk)i ,

where μki ≤ 0 for i ∈ Ik and xk

i ≥ ψi for i ∈ Ak . Below we drop the index k with Ik andAk . Note that A−1

I AIA ≤ M−1I KIA−M−1

I KI(M+K)−1I AIA = BIA(K). Here we used

(M +K)−1I −M−1

I = −M−1I KI(M +K)−1

I and M−1I MIA ≤ 0. This implies

n∑i=1

(xk+1i − xk

i ) ≤ −∑i∈A

(xki − ψi)+

∑i∈I

(BIA(K)(xk − ψ)A

)i+

∑i∈I

(A−1I μk

I)i .

(7.3.5)We estimate

∑i∈I

(A−1I μk

I)i =∑i∈I

⎛⎝M−1I μk

I +∞∑j=1

(−M−1I KI)jM−1

I μkI

⎞⎠i

≤ −|M−1I μk

I |1 +∞∑j=1

ρi |M−1I μk

I |1

= (α − 1)|M−1I μk

I |1 +(

1

1− ρ− (α + 1)

)|M−1

I μkI |1 = (α − 1)|M−1

I μkI |1,

where we set α = ρ

1−ρ ∈ (0, 1) by (7.3.4). This estimate, together with (7.3.4) and (7.3.5),implies that

n∑i=1

(xk+1i − xk

i ) ≤ (σ − 1)|xkAk− ψAk

|1 + (α − 1)|M−1Ik

μkIk|1.

Now it can be verified in the same manner as in the proof of Theorem 7.5 that xk →M(xk) = ∑n

i=1 xki is a merit function for the algorithm and convergence of (xk, μk) to

a solution (x∗, μ∗) follows. If there are two solutions to (7.1.3), then their difference y

satisfies ytAy ≤ 0 and hence y = 0 and uniqueness follows.

Observe that the M-matrix property is not stable under arbitrarily small perturbationssince off-diagonal elements may become positive. Theorem 7.6 guarantees that convergenceof the primal-dual active set strategy for arbitrary initial data is preserved for sufficientlysmall perturbations K of an M-matrix. Therefore, Theorem 7.6 is also of interest in con-nection with numerical implementations of the primal-dual active set algorithm.

Page 218: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 200

200 Chapter 7. The Primal-Dual Active Set Method

7.4 Diagonally dominated classWe again consider the reduced problem (7.1.3),

Ax + μ = a, μ = max(0, μ+ c (x − ψ)), (7.4.1)

but differently from Sections 7.2 and 7.3 we admit also the infinite-dimensional case. Suf-ficient conditions related to diagonal dominance of A will be given which imply that

M(xk+1, μk+1) = max

∫Ik

|(xk+1 − ψ)+| dx,∫

Ak

|(μk+1)−| dx)

(7.4.2)

with β > 0 acts as a merit functional for the primal-dual algorithm. Here we set φ+ =max(φ, 0) and φ− = −min(φ, 0). The natural norm associated to this merit functional isthe L1(�)-norm and consequently we assume that

A ∈ L(L1(�)), a ∈ L1(�), and ψ ∈ L1(�). (7.4.3)

The analysis of this section can also be used to obtain convergence in the Lp(�)-norm forany p ∈ (1,∞) if the norms in the integrands of M are replaced by | · |p-norms and theL1(�)-norms below are replaced by Lp(�)-norms as well.

The results also apply for Z = Rn. In this case the integrals in (7.4.2) are replacedby sums over the respective index sets.

We assume that there exist constants ρi, i = 1, . . . , 5, such that for all partitions Aand I of � and for all φA ≥ 0 in L2(A) and φI ≥ 0 in L2(I)

|[A−1I φI]−| ≤ ρ1 |φI |,

|[A−1I AIAφA]+| ≤ ρ2 |φA|

(7.4.4)

and

|[AAφA]−| ≤ ρ3 |φA|,

|[AAIA−1I φI]−| ≤ ρ4 |φI |,

|[AAIA−1I AIAφA]+| ≤ ρ5 |φA|.

(7.4.5)

Here | · | denotes the L1(�)-norm. Assumption (7.4.4) requires in particular the existenceof A−1

I . By a Schur complement argument with respect to the sets Ik and Ak this impliesexistence of a solution to the linear systems in step (iii) of the algorithm for every k.

Theorem 7.7. If (7.4.3), (7.4.4), (7.4.5) hold and ρ = max(β ρ1 + ρ2,ρ3

β+ ρ4 + ρ5

β) <

1, then M is a merit function for the primal-dual algorithm of the reduced system andlimk→∞(xk, μk) = (x∗, μ∗) in L1(�)× L1(�), with (x∗, μ∗) a solution to (7.4.1).

Proof. For every k ≥ 1 we have (xk+1−ψ)+ ≤ (xk+1− xk)+ on Ik and (μk+1)− = (δμ)−on Ak . Therefore

M(xk+1, μk+1) ≤ max

∫Ik

(δxIk)+,

∫Ak

(δμAk)−

). (7.4.6)

Page 219: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 201

7.4. Diagonally dominated class 201

From (7.1.9) we deduce that

δxIk= −A−1

Ik(−μk

Ik) + A−1

IkAIkAk

(−δxAk),

with μkIk≤ 0 and δxAk

≤ 0. By (7.4.4) therefore

|(δxIk)+| ≤ ρ1|μk

Ik| + ρ2|δxAk

|

= ρ1

∫Ik∩Ak−1

|(μkIk)−| + ρ2

∫Ak∩Ik−1

(xk − ψ)+

≤(ρ1 + ρ2

β

)M(xk, μk).

(7.4.7)

Similarly, by (7.1.9)

δμAk= AAk

(−δxAk)+ AAkIk

A−1Ik

(−μkIk)− AAkIk

A−1Ik

AIkAk(−δxAk

).

Since δxAk≤ 0 and μk

Ik≤ 0, we find by (7.4.5)

|(δμAk)−| ≤ ρ3|δxAk

| + ρ4|μkIk| + ρ5|δxAk

| ≤(ρ3 + ρ5

β+ ρ4

)M(xk, νk), (7.4.8)

and therefore

M(xk+1, μk+1) ≤ max

(βρ1 + ρ2,

ρ3 + ρ5

β+ ρ4

)M(xk, μk) = ρ M(xk, μk).

Thus, if ρ < 1, then M is a merit functional. Furthermore M(xk+1, μk+1) ≤ ρkM(x1, μ1).Together with (7.4.7), (7.4.8), and (7.1.9) it follows that (xk, μk) is a Cauchy sequence.Hence there exists (x∗, μ∗) such that limk→∞(xk, μk) = (x∗, μ∗) and Ax∗ + μ∗ =a, μ∗(x∗−ψ) = 0 a.e. in�. Since (xk−ψ)+ → (x∗−ψ)+ as k →∞ and limk→∞

∫�(xk+1−

ψ)+ = 0, it follows that x∗ ≤ ψ . Similarly, one argues that μ∗ ≥ 0. Thus (x∗, μ∗) is asolution to (7.4.1).

Concerning the uniqueness of the solution to (7.4.1), assume that A ∈ L(L2(�)) andthat (Ay, y)L2(�) > 0 for all y �= 0. Assume further that (x∗, μ∗) and (x, μ) are solutions to(7.4.1) with x−x∗ ∈ L2(�). Then (x−x∗, A(x−x∗))L2(�) ≤ 0 and therefore x−x∗ = 0.

Remark 7.4.1. In the finite-dimensional case the integrals in the definition of M must bereplaced by sums over the active/inactive index sets. If A is an M-matrix, then ρ1 = ρ2 =ρ5 = 0 and ρ < 1 if ρ3

β+ ρ4 < 1. This is the case if A is diagonally dominant in the sense

that ρ4 < 1 and β is chosen sufficiently large. If these conditions are met, then ρ < 1 isstable under perturbations of A.

Remark 7.4.2. Consider the infinite-dimensional case with A = αI + K , where α > 0,K ∈ L(L1(�)), andKφ ≥ 0 for allφ ≥ 0. This is the case for the operators in Examples 7.1and 7.2, as can be argued by using the maximum principle. Let ‖K‖ denote the norm of

Page 220: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 202

202 Chapter 7. The Primal-Dual Active Set Method

K in K ∈ L(L1(�)). For ‖K‖ < α and any I ⊂ � we have A−1I = 1

αII − 1

αKIA−1

Iand hence ρ1 ≤ ‖K‖

α(α−‖K‖) . Moreover ρ3 = 0. The conditions involving ρ2, ρ4, and ρ5 are

satisfied with ρ2 = ‖K‖α−‖K‖ , ρ4 = ‖K‖2

α(α−‖K‖) , and ρ5 = ‖K‖2

α−‖K‖ .

7.5 Bilateral constraints, diagonally dominated classConsider the quadratic programming with the bilateral constraints

min1

2

(Ax, x

)X− (

a, x)X

subject toEx = b, ϕ ≤ Gx ≤ ψ

with conditions on A,E, and G as in (7.1.1). From Example 4.53 of Section 4.7, we recallthat the necessary optimality condition for this problem is given by

Ax + E∗λ+G∗μ = a,

Ex = b,

μ = max(0, μ+ c(Gx − ψ))+min(0, μ+ c(Gx − ϕ)),

(7.5.1)

where max as well as min are interpreted as pointwise a.e. operations if Z = L2(�) andcoordinatewise for Z = Rn.

Primal-Dual Active Set Algorithm.

(i) Initialize x0, μ0. Set k = 0.

(ii) Given (xk, μk), setA+

k = {μk + c(Gxk − ψ) > 0},Ik = {μk + c(Gxk − ψ) ≤ 0 ≤ μk + c(Gxk − ϕ)},A−

k = {μk + c(Gxk − ϕ) < 0}.

(iii) Solve for (xk+1, λk+1, μk+1):

Axk+1 + E∗λk+1 +G∗μk+1 = a,

Exk+1 = b,

Gxk+1 = ψ in A+k , μk+1 = 0 in Ik, and Gxk+1 = ϕ in A−

k .

(iv) Stop, or set k = k + 1, and return to (ii).

We assume the existence of a solution to the auxiliary systems in step (iii). In caseA is positive definite a sufficient condition for existence is given by surjectivity of E and

Page 221: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 203

7.5. Bilateral constraints, diagonally dominated class 203

surjectivity of G : N(E) → Z. As in the unilateral case we shall consider a transformedversion of (7.5.1). If (7.1.4) and (7.1.8) are satisfied, then (7.5.1) can be transformed intothe equivalent system

Ax + μ = a, μ = max(0, μ+ c (x − ψ))+min(0, μ+ c (x − ϕ)), (7.5.2)

where A is a bounded operator on Z. The algorithm for this reduced system is obtainedfrom the above algorithm by replacing G by I and deleting the terms involving E and E∗.As in the unilateral case, if one does not carry out the reduction step from (7.1.6) to (7.5.2),then the coordinates corresponding to x2 are treated as inactive ones in the algorithm. Wehenceforth concentrate on the infinite-dimensional case and give sufficient conditions for

M(xk+1, μk+1)=max

(∫Ik

((xk+1 − ψ)+ + (xk+1 − ϕ)−) dx,

∫A+

k

(μk+1)− dx +∫

A−k

(μk+1)+ dx

) (7.5.3)

to act as a merit function for the algorithm applied to the reduced system. In the finite-dimensional case the integrals must be replaced by sums over the respective index sets. Wenote that (iii) with G = I implies the complementarity property

(xk − ψ)(xk − ϕ)μk = 0 a.e. in �. (7.5.4)

As in the previous section the merit function involves L1-norms and accordingly weaim for convergence in L1(�). We henceforth assume that

A ∈ L(L1(�)), a ∈ L1(�), ψ and ϕ ∈ L1(�). (7.5.5)

Below ‖ · ‖ denotes the norm of operators in L(L1(�)). The following conditions will beused: There exist constants ρi, i = 1, . . . , 5, such that for arbitrary partitions A ∪ I = �

we have

‖A−1I ‖ ≤ ρ1,

‖A−1I AIA‖ ≤ ρ2

(7.5.6)

and

‖AAk− c I‖ ≤ ρ3,

‖AAIA−1I ‖ ≤ ρ4,

‖AAIA−1I AIA‖ ≤ ρ5.

(7.5.7)

We further set ρ = 2 max(max(ρ1, ρ2,ρ2

c),max(ρ3 + ρ5, ρ4),

ρ3+ρ5

c).

Theorem 7.8. If (7.5.5), (7.5.6), (7.5.7) hold and ρ < 1, then M is a merit function forthe primal-dual algorithm of the reduced system (7.5.2) and limk→∞(xk, μk) = (x∗, μ∗) inL1(�)× L1(�), with (x∗, μ∗) a solution to (7.5.2).

Page 222: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 204

204 Chapter 7. The Primal-Dual Active Set Method

Proof. For δx = xk+1 − xk and δμ = μk+1 − μk we have

AA+kδxA+

k+ AA+

k IkδxIk

+ AA+k A−

kδxA−

k+ δμA+

k= 0,

AIkδxIk

+ AIkA+kδxA+

k+ AIkA−

kδxA−

k− μk

Ik= 0,

AA−kδxA−

k+ AA−

k IkδxIk

+ AA−k A+

kδxA+

k+ δμA−

k= 0

(7.5.8)

with

μk

A+k

⎧⎨⎩> 0 on A+

k−1 ∩A+k ,= 0 on Ik−1 ∩ A+k ,

> c (ψ − ϕ) on A−k−1 ∩A+

k ,

(7.5.9)

μkIk∈

⎧⎨⎩[c(ϕ − ψ), 0) on A+

k−1 ∩ Ik,

= 0 on Ik−1 ∩ Ik,

(0, c(ψ − ϕ)] on A−k−1 ∩ Ik,

(7.5.10)

μk

A−k

⎧⎨⎩< c (ϕ − ψ) on A+

k−1 ∩A−k ,= 0 on Ik−1 ∩ A−k ,

< 0 on A−k−1 ∩A−

k ,

(7.5.11)

δxA+k

⎧⎨⎩= 0 on A+

k−1 ∩A+k ,

< 0 on Ik−1 ∩ A+k ,= ψ − ϕ <

μk

con A−

k−1 ∩A+k ,

(7.5.12)

δxA−k

⎧⎨⎩ = ϕ − ψ >μk

con A+

k−1 ∩A−k ,

> 0 on Ik−1 ∩ A−k ,= 0 on A−k−1 ∩A−

k .

(7.5.13)

From (7.5.4)

(xk+1Ik

− ψIk)+ ≤ (δxIk

)+ and (xk+1Ik

− ϕIk)− ≤ (δxIk

)−. (7.5.14)

This implies that

M(xk+1, μk+1) ≤ max

(∫Ik

|δxIk| ,

∫A+

k

(μk+1)− +∫

A−k

(μk+1)+). (7.5.15)

From (7.5.8), (7.5.6), (7.5.10), (7.5.12), and (7.5.13) we have

|δxIk| ≤ ρ1|μk

Ik| + ρ2|δxAk

|

≤ ρ1

(|(μk

Ik∩A+k−1

)−| + |(μk

Ik∩A−k−1

)+|)+ ρ2

(|δxA+

k| + |δxA−

k|)

≤ ρ1

(|(μk

Ik∩A+k−1

)−| + |(μk

Ik∩A−k−1

)+|)+ ρ2

(|(xk − ψ)+A+

k ∩Ik−1| + 1

c|(μk

A+k ∩A−

k−1)+|

+|(xk − ϕ)−A−k ∩Ik−1

| + 1c|(μk

A−k ∩A+

k−1)−|

).

Page 223: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 205

7.5. Bilateral constraints, diagonally dominated class 205

This implies

|δxIk| ≤ 2 max

(ρ1, ρ2,

ρ2

c

)M(xk, μk). (7.5.16)

From (7.5.8) further

μk+1Ak− (μk

Ak− cδxAk

) = g, (7.5.17)

where g = (cI − AδxAk)δxAk

− AAkIkA−1

AkAIkAk

δxAk. By (7.5.9), (7.5.12), we have

μk

A+k

− c δxA+k≥ 0.

Similarly, by (7.5.11), (7.5.13)

μk

A−k

− c δxA−k≤ 0.

Consequently,

|(μk+1A+

k

)−| + |(μk+1A−

k

)+|≤ |gAk| ≤ (ρ3 + ρ5)|δxAk

| + ρ4|μIk |

≤ 2 max

(ρ4, ρ3 + ρ5,

ρ3 + ρ5

c

)M(xk, μk).

(7.5.18)

By (7.5.15), (7.5.16), and (7.5.18)

M(xk+1, μk+1) ≤ 2 max

(max

(ρ1, ρ2,

ρ2

c

),max

(ρ4, ρ3 + ρ5,

ρ3 + ρ5

c

))M(xk, μk).

It follows that M(xk+1, μk+1) ≤ ρkM(x1, μ1) and if ρ < 1, then M(xk, μk)→ 0 as k →∞. From the estimates leading to (7.5.16) it follows that xk is a Cauchy sequence. Moreoverμk is a Cauchy sequence by (7.5.8). Hence there exist (x∗, μ∗) such that limk→∞(xk, μk) =(x∗, μ∗). By Lebesgue’s bounded convergence theorem and since M(xk, μk) → 0, itfollows that ϕ ≤ x∗ ≤ ψ . Clearly Ax∗ +μ∗ = a and (x∗ −ψ)(x∗ − ϕ)μ∗ = 0 by (7.5.4).This last equation implies that μ∗ = 0 on I∗ = {ϕ < x∗ < ψ}. It remains to show thatμ∗ ≥ 0 on A∗,+ = {x∗ = ψ} and μ∗ ≤ 0 on A∗,− = {x∗ = ϕ}. Let s ∈ A∗,+ be such thatxk(s) and μk(s) converge. Then μ∗(s) ≥ 0. If not, then μ∗(s) < 0 and there exists k suchthat μk(s) + c(xk(s) − ψ(s)) ≤ μ∗(s)

2 < 0 for all k ≥ k. Then s ∈ Ik and μk+1 = 0 fork ≥ k, contradicting μ∗(s) < 0. Analogously one shows that μ∗ ≤ 0 on A∗,−.

Conditions (7.5.6) and (7.5.7) are satisfied for additive perturbations of the operatorcI , for example. This can be deduced from the following result.

Theorem 7.9. Assume that A = cI+K with K ∈ L(L1(�)) and ‖K‖ < c and that (7.5.5),(7.5.6), (7.5.7) are satisfied. If

ρ = 2 max

(max

(‖K‖c

ρ1, ρ2

),max(ρ3 + ρ5, ρ4),

ρ3 + ρ5

c

)< 1,

then the conclusions of the previous theorem are valid.

Page 224: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 206

206 Chapter 7. The Primal-Dual Active Set Method

Proof. We follow the proof of Theorem 7.8 and eliminate the overestimate (7.5.14). LetP = {xk+1

Ik− ψ > 0} ∩ Ik . We find

xk+1 − ψ

⎧⎪⎨⎪⎩≤ δxk

P∩Ik−1on P ∩ Ik−1,

= δxk

P∩A+k−1

on P ∩A+k−1,

= δxk

P∩A−k−1+ (ϕ − ψ)P∩A−

k−1on P ∩A−

k−1.

This estimate, together with A−1 = 1cI − 1

cKA−1, implies that∫

Ik

(xk+1 − ψ)+ ≤∫P

δxk +∫P∩A−k−1

ϕ − ψ

≤∫P

A−1Ik

μkIk−

∫P

A−1Ik

AIkAkδxAk

+∫P∩A−k−1

ϕ − ψ

= 1

c

∫P

μkIk+

∫P∩A−k−1

ϕ − ψ − 1

c

∫P

KIkA−1

Ikμk

Ik−

∫P

A−1Ik

AIkAkδxAk

≤ −1

c

∫P

KIkA−1

Ikμk

Ik−

∫P

A−1Ik

AIkAkδxAk

,

and hence ∫Ik

(xk+1 − ψ)+ ≤ ‖K‖c

ρ1|μkIk| + ρ2|δxAk

|.

An analogous estimate can be obtained for∫Ik(xk+1 − ϕ)− and we find∫

Ik

(xk+1 − ψ)+∫

Ik

(xk+1 − ϕ)− ≤ ‖K‖c

ρ1|μkIk| + ρ2|δxAk

|.

We can now proceed as in the proof of Theorem 7.8.

Example 7.10. We apply Theorem 7.9, A = I + K ∈ L(L1(�)). By Neumann seriesarguments we find ρ = 2 max( γ

1−γ ,max(γ + γ 2

1−γ ,γ

1−γ )) = γ

1−γ , where γ = ‖K‖, and

ρ < 1 if ‖K‖ < 13 . If A = I +K is replaced by A = cI +K , then ρ < 1 if γ < c

2c+1 , in

case c ≥ 1, and ρ < 1 if c2

c+2 , in case c ≤ 1.

Example 7.11. Here we consider the finite-dimensional case A = I + K ∈ Rn×n, whereRn is endowed with the �1-norm. Again Theorem 7.9 is applicable and ρ < 1 if ‖K‖ < 1

3 ,where ‖ ·‖ denotes the matrix-norm subordinate to the �1-norm ofRn. Recall that this normis given by the maximum over the column sums of the absolute values of the matrix.

7.6 Nonlinear control problems with bilateral constraintsIn this section we consider a special case of bilaterally constrained problems that werealready investigated in the previous section, where the operator A is of the form T ∗T . This

Page 225: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 207

7.6. Nonlinear control problems with bilateral constraints 207

will allow us to obtain improved sufficient conditions for global convergence of the primal-dual active set algorithm. The primary motivation for such problems are optimal controlproblems, and we therefore slightly change the notation to that which is more common inoptimal control.

Let U and Y be Hilbert spaces with U = L2(�), where � is a bounded measurableset inRd , and let T : U → Y be a, possibly nonlinear, continuously differentiable, injective,mapping with Fréchet derivative denoted by T ′. Further let ϕ,ψ ∈ U with ϕ < ψ a.e. in�. For α > 0 and z ∈ Y consider

minϕ≤u≤ψ J (u) = 1

2|T (u)− z|2Y +

α

2|u|2U . (7.6.1)

The necessary optimality condition for (7.6.1) is given by{αu+ T ′(u)∗(T (u)− z)+ μ = 0,μ = max(0, μ+ α(u− ψ))+min(0, μ+ α(u− ϕ)),

(7.6.2)

where (u, μ) ∈ U ×U , and max as well as min are interpreted as pointwise a.e. operations.If the upper or lower constraint are not present, we can set ψ = ∞ or ϕ = −∞.

Example 7.12. Let � ⊂ Rn be a bounded domain in Rn with Lipschitz continuous bound-ary �, let � ⊂ �, � ⊂ � be measurable subsets, and consider the optimal control problem

minϕ≤u≤ψ

1

2|y − z|2Y +

α

2| u|2U subject to

(∇y,∇v)� + (y, v)� = (u, v)� for all v ∈ H 1(�), (7.6.3)

where z ∈ L2(�). We set Y = L2(�) and U = L2(�). Define Lu = y with L : L2(�)→L2(�) as the solution operator to the inhomogeneous Neumann boundary value problem(7.6.3) and set T = R�L : L2(�) → L2(�), where R� : L2(�) → L2(�) denotes the

canonical restriction operator. Then T ∗ : L2(�)→ L2(�) is given by

T ∗ = R�L∗ E�

with R� the restriction operator to � and E� the extension-by-zero operator from � to �.Further the adjoint L∗ : L2(�) → L2(�) of L is given by L∗w = τ�p, where τ� is theDirichlet trace operator from H 1(�) to L2(�) and p is the solution to

(∇p,∇w)� + (p,w)� = (w, w)� for all w ∈ H 1(�). (7.6.4)

The fact that L and L∗ are adjoint to each other follows by setting v = p in (7.6.3) andw = y in (7.6.4).

We next specify the primal-dual active set algorithm for (7.6.1). The iteration indexis denoted by k and an initial choice (u0, μ0) is assumed to be available.

Page 226: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 208

208 Chapter 7. The Primal-Dual Active Set Method

Primal-Dual Active Set Algorithm.

(i) Given (uk, μk), determineA+

k = {μk + α(uk − ψ) > 0},Ik = {μk + α(uk − ψ) ≤ 0 ≤ μk + α(uk − ϕ)},A−

k = {μk + α(uk − ϕ) < 0}.(ii) Determine (uk+1, μk+1) from

uk+1 = ψ on A+k , uk+1 = ϕ on A−

k , μk+1 = 0 on Ik,

and

α uk+1 + T ′(uk+1)∗(T (uk+1)− z)+ μk+1 = 0. (7.6.5)

Note that the equations for (uk+1, μk+1) in step (ii) of the algorithm constitute thenecessary optimality condition for the auxiliary problem⎧⎨⎩ min 1

2 |T (u)− z|2Y + α2 |u|2U over u ∈ U

subject to u = ψ on A+k , u = ϕ on A−k .(7.6.6)

The analysis in this section relies on the fact that (7.6.2) can be equivalently expressed as⎧⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎩

y = T (u),

λ = −T ′(u)∗(y − z),

α u− λ+ μ = 0,

μ = max(0, μ+ (u− ψ))+min(0, μ+ (u− ϕ)),

(7.6.7)

where λ = −T ′(u)∗(T (u)−z) is referred to as the adjoint state. Analogously, for uk+1 ∈ U ,setting yk+1 = T (uk+1), λk+1 = −T ′(uk+1)∗(T (uk+1) − z), (7.6.5) can equivalently beexpressed as ⎧⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎪⎩

yk+1 = T (uk+1), where

uk+1 =⎧⎨⎩

ψ on A+k ,

1αλk+1 on Ik,

ϕ on A−k ,

λk+1 = −T ′(uk+1)∗(yk+1 − z),

α uk+1 − λk+1 + μk+1 = 0.

(7.6.8)

In what follows we give conditions which guarantee convergence of the primal-dualactive set strategy for linear and certain nonlinear operators T from arbitrary initial data.The convergence proof is based on an appropriately defined functional which decays when

Page 227: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 209

7.6. Nonlinear control problems with bilateral constraints 209

evaluated along the iterates of the algorithm. An a priori estimate for the adjoint variable λ

in (7.6.8) will play an essential role.To specify the condition alluded to in the above let us consider two consecutive iterates

of the algorithm. For every k = 1, 2, . . . , the sets A+k , A−

k , and Ik give a mutually disjointdecomposition of �. According to (i) and (ii) in the form (7.6.8) we find

uk+1 − uk =

⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩Rk

A+ on A+k ,

1α(λk+1 − λk)+ Rk

I on Ik,

RkA− on A−

k ,

(7.6.9)

where the residual Rk is given by

RkA+ =

⎧⎨⎩0 on A+

k−1 ∩A+k ,

ψ − 1αλk = ψ − uk < 0 on Ik−1 ∩A+

k ,

ψ − ϕ < 1αμk on A−

k−1 ∩A+k ,

(7.6.10)

RkI =

⎧⎨⎩1αμk = 1

αλk − ψ ≤ 0 on A+

k−1 ∩ Ik,

0 on Ik−1 ∩ Ik,1αμk = 1

αλk − ϕ ≥ 0 on A−

k−1 ∩ Ik,

(7.6.11)

RkA− =

⎧⎨⎩ϕ − ψ > 1

αμk on A+

k−1 ∩A−k ,

ϕ − 1αλk = ϕ − uk > 0 on Ik−1 ∩A−

k ,

0 on A−k−1 ∩A−

k .

(7.6.12)

Here Rk denotes the function defined on � whose restrictions to A+k , Ik , A−

k coincide withRk

A+ , RkI , and Rk

A− .We shall utilize the following a priori estimate:⎧⎨⎩

There exists ρ < α such that

|λk+1 − λk|U < ρ |Rk|U for every k = 1, 2, . . . .(7.6.13)

Sufficient conditions for (7.6.13) will be given at the end of this section. The convergenceproof will be based on the following merit functional M : U × U → R given by

M(u,μ) = α2∫�

(|(u− ψ)+|2 + |(ϕ − u)+|2) dx +∫

A+(u)|μ−|2 dx +

∫A−(u)

|μ+|2 dx,

where A+(u) = {x : u ≥ ψ} and A−(u) = {x : u ≤ ϕ}. Note that the iterates (uk, μk) ∈U × U satisfy

μk(uk − ψ)(ϕ − uk)(x) = 0 for a.e. x ∈ �, (7.6.14)

and hence at most one of the integrands of M(uk, μk) can be strictly positive at x ∈ �.

Theorem 7.13. Assume that (7.6.13) holds for the iterates of the primal-dual active setstrategy. Then M(uk+1, μk+1) ≤ α−2ρ2M(uk, μk) for every k = 1, . . . . Moreover thereexist (u∗, μ∗) ∈ U×U , such that limk→∞(uk, μk) = (u∗, μ∗) and (u∗, μ∗) satisfies (7.6.2).

Page 228: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 210

210 Chapter 7. The Primal-Dual Active Set Method

Proof. From (7.6.8) we have

μk+1= λk+1 − αψ on A+k ,

uk+1= 1

αλk+1 on Ik,

μk+1= λk+1 − αϕ on A−k .

Using step (ii) of the algorithm in the form of (7.6.8) implies that

μk+1 = λk+1 − λk + λk − αψ = λk+1 − λk +⎧⎨⎩

μk > 0 on A+k−1 ∩A+

k ,

α(uk − ψ) > 0 on Ik−1 ∩A+k ,

αuk + μk − αψ ≥ 0 on A−k−1 ∩A+

k ,

and therefore|μk+1,−(x)| ≤ |λk+1(x)− λk(x)| for x ∈ A+

k . (7.6.15)

Analogously one derives

|μk+1,+(x)| ≤ |λk+1(x)− λk(x)| for x ∈ A−k . (7.6.16)

Moreover

uk+1 − ψ = 1

α(λk+1 − λk + λk)− ψ

= 1

α(λk+1 − λk)+

⎧⎨⎩1αμk ≤ 0 on A+

k−1 ∩ Ik,

uk − ψ ≤ 0 on Ik−1 ∩ Ik,1αμk + u− ψ ≤ 0 on A−

k−1 ∩ Ik,

which implies that

|(uk+1 − ψ)+(x)| ≤ 1

α|λk+1(x)− λk(x)| for x ∈ Ik. (7.6.17)

Analogously one derives that

|(ϕ − uk+1)+(x)| ≤ 1

α|λk+1(x)− λk(x)| for x ∈ Ik. (7.6.18)

Due to (ii) of the algorithm we have that

(uk+1 − ψ)+ = (ϕ − uk+1)+ = 0 on A+k ∪A−

k ,

which, together with (7.6.17)–(7.6.18), implies that

|(uk+1 − ψ)+(x)| + |(ϕ − uk+1)+(x)| ≤ 1

α|λk+1(x)− λk(x)| for x ∈ �. (7.6.19)

From (7.6.14)–(7.6.16) and since ϕ < ψ a.e. on � we find

|μk+1,−(x)| ≤ |λk+1(x)− λk(x)| for x ∈ A+(uk+1) (7.6.20)

Page 229: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 211

7.6. Nonlinear control problems with bilateral constraints 211

and

|μk+1,+(x)| ≤ |λk+1(x)− λk(x)| for x ∈ A−(uk+1). (7.6.21)

Combining (7.6.19)–(7.6.21) implies that

M(uk+1, μk+1) ≤∫�

|λk+1(x)− λk(x)|2dx. (7.6.22)

Since (7.6.13) is supposed to hold we have

M(uk+1, μk+1) ≤ ρ2|Rk|2U .Moreover, from the definition of Rk we deduce that

|Rk|2U ≤ α−2M(uk, μk), (7.6.23)

and consequently

M(uk+1, μk+1) ≤ α−2ρ2M(uk, μk) for k = 1, 2, . . . . (7.6.24)

From (7.6.13), (7.6.23), (7.6.24) it follows that |λk+1 − λk|U ≤ (ρ

α)kρ|R0|U . Thus there

exists λ∗ ∈ U such that limk→∞ λk = λ∗.Note that for k ≥ 1

A+k = {x : λk(x) > α ψ(x)}, Ik = {x : α ϕ(x) ≤ λk(x) ≤ α ψ(x)},

A−k = {x : λk(x) < α ϕ(x)},

and hence

μk+1 = max(0, λk − α ψ)+min(0, λk − α ϕ)+ (λk+1 − λk)χA+k ∪A−k .

Since limk→∞(λk+1 − λk) = 0 and limk→∞ λk exists, it follows that there exists μ∗ ∈ U

such that limk→∞ μk = μ∗, and

μ∗ = max(0, λ∗ − α ψ)+min(0, λ∗ − α ϕ). (7.6.25)

From the last equation in (7.6.8) it follows that there exists u∗ such that limk→∞ uk = u∗ andα u∗−λ∗+μ∗ = 0. Combined with (7.6.25) the triple (u∗, μ∗) satisfies the complementaritycondition given by the second equation in (7.6.2). Passing to the limit with respect to k in(7.6.5) we obtain that the first equation in (7.6.2) is satisfied by (u∗, μ∗).

We turn to the discussion of (7.6.13) and consider the linear case first.

Proposition 7.14. If T is linear and ‖T ‖2L(U,Y ) < α, then (7.6.13) holds.

Proof. From (7.6.8) and (7.6.9) we have, with δu = uk+1 − uk , δy = yk+1 − yk , andδλ = λk+1 − λk , ⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩

δu = Rk + 1αδλ χIk

,

T δu = δy,

T ∗ δy + δλ = 0.

(7.6.26)

Page 230: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 212

212 Chapter 7. The Primal-Dual Active Set Method

Taking the inner product in L2 with δλ in the first equation we arrive at

(δu, δλ) ≤ (Rk, δλ),

and taking the inner product with δu in the third equation,

|δy|2 + (δλ, δu) = 0.

Combining these two relations implies that

|δy|2 ≤ (Rk, δλ).

Utilizing the third equation in (7.6.26) in this last inequality and the fact that the normsof T and T ∗ coincide, we have |δλ| ≤ ‖T ∗‖2 |Rk|, from which the desired estimatefollows.

We now turn to a particular case when (7.6.13) holds for a nonlinear operator T . Let� = � be a bounded domain in Rn, n = 2 or 3, with smooth boundary ∂ �. Furtherlet φ : R → R be a monotone mapping with locally Lipschitzian derivative, satisfyingφ(0) = 0, and such that the substitution operator determined by φ maps H 1(�) into L2(�).We choose U = Y = L2(�) and define T (u) = y as the solution operator to{ −�y + φ(y)= u in �,

y= 0 on ∂�,(7.6.27)

where � denotes the Laplacian. The adjoint variable λ is the solution to{�λ+ φ′(y)λ=−(y − z) in �,

λ= 0 on ∂�.(7.6.28)

Let (u0, μ0) be an arbitrary initialization and let U = {uk : k = 1, 2, . . .} denote the set ofiterates generated by the primal-dual active set algorithm. Since these iterates are solutionsto the auxiliary problems (7.6.6), it follows that for every α > 0 the set U is bounded inL2(�) uniformly with respect to α ≥ α.

By monotone operator theory and regularity theory of elliptic partial differential equa-tions it follows that the set of primal states {yk = y(uk) : k = 1, 2, . . .} and adjoint states{λk = λ(y(uk)) : k = 1, 2, . . .} are bounded subsets of L∞(�); see [Tr]. Let C denote thisbound and let LC denote the Lipschitz constant of φ on the ball BC(0) with center 0 andradiusC inR. Denote byH 1

0 (�) = {u ∈ H 1(�) : u = 0 on ∂�} the Hilbert space endowedwith norm |∇u|L2 and let κ stand for the embedding constant from H 1

0 (�) into L2(�).

Proposition 7.15. Assume that 0 < (1+C LC) κ4

α−(1+C LC) κ4 < 1, where α ≥ α. Then (7.6.13) holdsfor the mapping T determined by the solution operator to (7.6.27).

Proof. From (7.6.8) we have

−�(yk+1 − yk)+ φ(yk+1)− φ(yk) = 1

α(λk+1 − λk)χIk+1 + Rk, (7.6.29)

−�(λk+1 − λk)+ φ′(yk+1)(λk+1 − λk)+ (φ′(yk+1)− φ′(yk))λk + yk+1 − yk = 0,(7.6.30)

Page 231: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 213

7.6. Nonlinear control problems with bilateral constraints 213

both Laplacians with homogeneous Dirichlet boundary conditions. Taking the inner productof (7.6.29) with yk+1 − yk we have, using monotonicity of φ,

|yk+1 − yk|1 ≤ κ2

α|(λk+1 − λk)|1 + |Rk|−1, (7.6.31)

where | · |1 and | · |−1 denote the norms in H 10 (�) and H−1(�), respectively. Note that

φ′(yk+1) ≥ 0. Hence from (7.6.30) we find

|λk+1 − λk|21 ≤ C LC |yk+1 − yk|L2 |λk+1 − λk|L2 + |(yk+1 − yk, λk+1 − λk)|

≤ (1+ C LC) κ2|yk+1 − yk|1|λk+1 − λk|1.

Thus,|λk+1 − λk|1 ≤ (1+ C LC) κ

2 |yk+1 − yk|1and hence from (7.6.31)

|yk+1 − yk|1 ≤ α

α − (1+ C LC) κ4|Rk|−1.

It thus follows that

|λk+1 − λk|L2 ≤ α (1+ C LC) κ4

α − (1+ C LC) κ4|Rk|L2 .

This implies (7.6.13) with

ρ = (1+ C LC) κ4

α − (1+ C LC) κ4.

Page 232: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 214

Page 233: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 215

Chapter 8

Semismooth NewtonMethods I

8.1 IntroductionIn this chapter we study semismooth Newton methods for solving nonlinear nonsmoothequations. These investigations are motivated by complementarity problems, variationalinequalities, and optimal control problems with control or state constraints, for example.The operator equation for which we desire to find a solution is typically Lipschitz continuousbut not C1 regular. We shall also establish the relationship between the semismooth Newtonmethod and the primal-dual active set method that was discussed in Chapter 7.

Since semismooth Newton methods are not widely known even for finite-dimensionalproblems, we consider the finite-dimensional case before we turn to problems in infinitedimensions. In fact, these two cases are distinctly different. In finite dimensions we haveRademacher’s theorem, which states that every locally Lipschitz continuous function isdifferentiable almost everywhere. This result has no counterpart for functions betweeninfinite-dimensional function spaces.

As an example consider the nonlinear complementarity problem

g(x) ≤ 0, x ≤ ψ, and (g(x), x − ψ)Rn = 0,

where g : Rn → Rn and ψ ∈ Rn. It can be expressed equivalently as the problem of findinga root to the following equation:

F(x) = g(x)+max(0,−g(x)+ x − ψ) = max(g(x), x − ψ) = 0, (8.1.1)

where the max operation just like the inequalities must be interpreted componentwise. Notethat F is a locally Lipschitz continuous function if g is locally Lipschitzian, but it is not C1

if g is C1. A function is called locally Lipschitz continuous if it is Lipschitz continuous onevery bounded subset of its domain.

Let us introduce some of the key concepts for the finite-dimensional case. For alocally Lipschitz continuous function F : Rm → Rn let DF denote the set of points atwhich F is differentiable. For x ∈ Rm we define ∂BF (x) as

∂BF (x) ={J : J = lim

xi→x, xi∈DF

∇F(xi)

}, (8.1.2)

215

Page 234: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 216

216 Chapter 8. Semismooth Newton Methods I

and we denote by ∂F (x) the generalized derivative at x introduced by Clarke [Cla], i.e.,

∂F (x) = co ∂BF (x), (8.1.3)

where co stands for the convex hull.A generalized Newton iteration for solving the nonlinear equation F(x) = 0 with

F : Rn → Rn can be defined by

xk+1 = xk − V −1k F (xk), where Vk ∈ ∂BF (xk). (8.1.4)

The reason for using ∂BF rather than ∂F in (8.1.4) is the following: For the convergenceanalysis we shall require that all V ∈ ∂BF (x∗) are nonsingular, where x∗ is the thoughtsolution to F(x) = 0. This is more readily satisfied for ∂BF than for ∂F , as can be seenfrom F(x) = |x|, for example. In this case 0 ∈ ∂F (0) but 0 /∈ ∂BF .

We also introduce the coordinatewise operation

∂bF (x) = ⊗mi=1∂BFi(x),

where Fi is the ith coordinate of F . From the definition of ∂BF (x) it follows that ∂BF (x) ⊂∂bF (x). For F given in (8.1.1) we have

∂BF (x) = ∂bF (x)

if −g′(x)+ I is surjective. Moreover, in Section 8.4 it will be shown that if we select

V (x) d =⎧⎨⎩

d if − g(x)+ x − ψ > 0,

g′(x)d if − g(x)+ x − ψ ≤ 0,

then the generalized Newton method reduces to the primal-dual active set method.Local convergence of {xk} to x∗, a solution of F(x) = 0, is based on the following

concepts. The generalized Jacobians V k ∈ ∂BF (xk) are selected so that their inversesV (xk)

−1 are uniformly bounded and that they satisfy the condition

|F(x∗ + h)− F(x∗)− V h| = o(|h|), (8.1.5)

where V = V (x∗ + h) ∈ ∂BF (x∗ + h), for h in a neighborhood of x∗. Then from (8.1.5)with h = xk − x∗ and Vk = V (xk) we have

|xk+1 − x∗| = |V −1k (F (xk)− F(x∗)− Vk(x

k − x∗))| = o(|xk − x∗|). (8.1.6)

Thus, there exists a neighborhood B(x∗, ρ) of x∗ such that if x0 ∈ B(x∗, ρ), then xk ∈B(x∗, ρ) and xk converges to x∗ superlinearly. This discussion will be made rigorous for Fmapping between finite-dimensional spaces, as above, as well as for the infinite-dimensionalcase. For the finite-dimensional case we shall rely on the notion of semismooth functions.

Definition 8.1. F : Rm → Rn is called semismooth at x if F is locally Lipschitz at x and

limV∈∂F (x+t h′), h′→h, t→0+

V h′ exists for all h ∈ Rm. (8.1.7)

Page 235: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 217

8.2. Semismooth functions in finite dimensions 217

Semismoothness was originally introduced by Miflin [Mif] for scalar-valued func-tions. Convex functions and real-valued C1 functions are examples for such semismoothfunctions. Definition 8.1 is due to Qi and Sun [Qi, QiSu]. It will be shown that if F issemismooth at x∗, condition (8.1.5) holds. Due to the fact that the notion of Clarke derivativeis not available in infinite dimensions, Definition 8.1 does not allow a direct generalizationto the infinite-dimensional case. Rather, the notion of Newton differentiability related tothe property expressed in (8.1.5) was developed in [CNQ, HiIK]. Alternatively, in [Ulb]the infinite-dimensional case of mappings into Lp spaces was treated by considering thesuperposition of mappings F = !(G), with G a C1 mapping into Lr and ! the substitu-tion operator (!y)(s) = ψ(y(s)) for y ∈ Lr , where ψ is a semismooth function betweenfinite-dimensional spaces.

As an alternative to (8.1.4) the increment dk for a generalized Newton method xk+1 =xk + dk can be defined as the solution to

F(xk)+ F ′(xk; d) = 0, (8.1.8)

where F ′(x; d) denotes the directional derivative of F at x in direction d. Note that it maybe a nontrivial task to solve (8.1.8) for dk . This method was investigated in [Pan1, Pan2].We shall return to (8.1.8) in the context of globalization of the method specified by (8.1.4).

In Section 8.2 we present the finite-dimensional theory for Newton’s method forsemismooth functions. Section 8.3 is devoted to the discussion of Newton differentia-bility and solving nonsmooth equations in Banach spaces. In Section 8.4 we exhibit therelationship between the primal-dual active set method and semismooth Newton methods.Section 8.5 is devoted to a class of nonlinear complementarity problems. In Section 8.6we discuss applications where, for different reasons, semismooth Newton methods are notdirectly applicable but rather a regularization is necessary as, for instance, in the case ofstate-constrained optimal control problems.

8.2 Semismooth functions in finite dimensions

8.2.1 Basic concepts and the semismooth Newton algorithm

In this section we discuss properties of semismooth functions and analyze convergence of thegeneralized Newton method (8.1.4). We follow quite closely the work by Qi and Sun [QiSu].First we describe the relationship of semismoothness to directional differentiability. Afunction F : Rm → Rn is called directionally differentiable at x ∈ Rm if

limt→0+

F(x + t h)− F(x)

t= F ′(x;h)

exists for all h ∈ Rd . Further F is called Bouligand-differentiable (B-differentiable) [Ro4]at x if it is directionally differentiable at x and

limh→0

F(x + h)− F(x)− F ′(x;h)|h| = 0.

A locally Lipschitzian function F is B-differentiable at x if and only if it is directionallydifferentiable at x; see [Sha] and the references therein.

Page 236: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 218

218 Chapter 8. Semismooth Newton Methods I

Theorem 8.2. (i) Suppose that F : Rm → Rn is locally Lipschitz continuous and direction-ally differentiable at x. Then F ′(x; ·) is Lipschitz continuous and for every h, there existsa V ∈ ∂F (x) such that

F ′(x;h) = V h. (8.2.1)

(ii) If F is locally Lipschitz continuous, then the following statements are equivalent.(1) F is semismooth at x.(2) F is directionally differentiable at x and for every V ∈ ∂F (x + h),

V h− F ′(x;h) = o(|h|) as h→ 0.

(3) limx+h∈DF , |h|→0F ′(x+h;h)−F ′(x;h)

|h| = 0.

Proof. (i) For h, h′ ∈ Rm we have

|F ′(x;h)− F ′(x;h′)| =∣∣∣∣ limt→0+

F(x + th)− F(x + th′)t

∣∣∣∣ ≤ L |h− h′|, (8.2.2)

where L is the Lipschitz constant of F in a neighborhood of x. Thus F ′(x; ·) is Lipschitzcontinuous at x. Since

F(y)− F(x) ∈ co ∂F ([x, y])(y − x) (8.2.3)

for all x, y ∈ Rm (see [Cla, p. 72]), there exist a sequence {tk}, with tk → 0+, and Vk ∈co ∂F ([x, x + tk h])(y − x) such that

F ′(x;h) = limk→∞ Vkh.

Since F is locally Lipschitz, the sequence {Vk} is bounded, and there exists a subsequenceof Vk , denoted by the same symbol, such that Vk → V . Moreover ∂F is closed at x; i.e.,xi → x and Zi → Z, with Zi ∈ ∂F (xi), imply that Z ∈ ∂F (x) [Cla, p. 70], and henceV ∈ ∂F (x). Thus F ′(x;h) = V h, as desired.

(ii) We turn to verify the equivalence of (1)–(3).(1)→ (2): First we show that F ′(x;h) exists and that

F ′(x;h) = limV∈∂F (x+th), t→0+

V h. (8.2.4)

SinceF is locally Lipschitz, {F(x+th)−F(x)

t: t → 0} is bounded. Thus there exists a sequence

ti → 0+ and � ∈ Rn such that

limi→∞

F(x + tih)− F(x)

ti= �.

We argue that � equals the limit in (8.2.4). By (8.2.3)

F(x + tih)− F(x)

ti∈ co (∂F ([x, x + tih])) h = co (∂F ([x, x + tih]h).

Page 237: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 219

8.2. Semismooth functions in finite dimensions 219

The Carathéodory theorem implies that for each i there exist tki ∈ [0, ti], λki ∈ [0, 1] with∑n

k=0 λki = 1, and V k

i ∈ ∂F (x + tki h), where k = 0, . . . , n, such that

F(x + tih)− F(x)

ti=

n∑k=0

λki V

ki h.

By passing to subsequences such that lim λki → λk , for k = 0, . . . , n, we have

� =n∑

k=0

limi→∞ λk

i

(limi→∞V k

i h)=

n∑k=0

λk limV∈∂F (x+th), t→0+

V h = limV∈∂F (x+th), t→0+

V h.

Next we prove that the limit in (8.2.4) is uniform for all h with |h| = 1. This implies (2).If the claimed uniform convergence in (8.2.4) does not hold, then there exists ε > 0, andsequences {hk} in Rm with |hk| = 1, {tk} with tk → 0+, and Vk ∈ ∂F (x + tkh

k) such that

|Vkhk − F ′(x;hk)| ≥ 2ε.

Passing to a subsequence we may assume that hk converges to some h ∈ Rm. By Lipschitzcontinuity of F ′(x;h) with respect to h we have

|Vkhk − F ′(x;h)| ≥ ε

for all k sufficiently large. This contradicts the semismoothness of F at x.(2)→ (1): Suppose that F is not semismooth at x. Then there exist ε > 0, h ∈ Rm,

and sequences {hk} in Rm and {tk}, and h ∈ Rm satisfying hk → h, tk → 0+, and Vk ∈∂F (x + tkh

k) such that|Vkh

k − F ′(x;h)| ≥ 2ε.

Since F ′(x; ·) is Lipschitz continuous,

|Vkhk − F ′(x;hk)| ≥ ε

for sufficiently large k. This contradicts assumption (2).

(2)→ (3) follows from (8.2.1).

(3)→ (2): For arbitrary ε > 0 there exists δ > 0 such that for all h with |h| < δ andx + h ∈ DF

|F ′(x + h;h)− F ′(x;h)| ≤ ε |h|. (8.2.5)

Let V ∈ ∂F (x + h) and |h| < δ2 with h �= 0. By (8.1.3)

V h ∈ co{

limh′→h, x+h′∈DF

F ′(x + h′;h)}.

By the Carathéodory theorem, there exist λk ≥ 0 with∑n

k=0 λk = 1 and hk, k = 1, . . . ,

satisfying x + hk ∈ DF and |hk − h| ≤ min( δ2 ,ε|h|L, |h|), where L is the Lipschitz constant

of F near x, such that ∣∣∣∣V h−n∑

k=0

λkF ′(x + hk;h)∣∣∣∣ ≤ ε|h|. (8.2.6)

Page 238: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 220

220 Chapter 8. Semismooth Newton Methods I

From (8.2.5) and (8.2.2) we find∣∣∣∣ n∑k=0

λkF ′(x + hk;h)− F ′(x;h)∣∣∣∣

≤n∑

k=0

λk(|F ′(x + hk;h)− F ′(x + hk;hk)| + |F ′(x + hk;hk)− F ′(x;hk)|

+|F ′(x;hk)− F ′(x;h)|)≤

n∑k=0

λk(2L|hk − h| + ε |hk|) ≤ 4ε |h|.

It thus follows from (8.2.6) that

|V h− F ′(x + h;h)| ≤ 5ε |h|.Since ε > 0 is arbitrary, (2) follows.

Theorem 8.3. Let F : Rn → Rn be locally Lipschitz continuous, let x ∈ Rn, and supposethat allV ∈ ∂BF (x) are nonsingular. Then there exist a neighborhoodN of x and a constantC such that for all y ∈ N and V ∈ ∂BF (y)

|V −1| ≤ C. (8.2.7)

Proof. First, we claim that there exist a neighborhood N of x and a constant C such thatfor all y ∈ DF ∩N , ∇F(y) is nonsingular and

|∇F(y)−1| ≤ C. (8.2.8)

If this claim is not true, then there exists a sequence yk → x, yk ∈ DF , such that eitherall ∇F(yk) are singular or |(∇F(yk))−1| → ∞. Since F is locally Lipschitz the set{∇F(yk) : k = 1, . . .} is bounded. Thus there exists a subsequence of ∇F(yk) thatconverges to some V . Then V must be singular and V ∈ ∂BF (x). This contradicts theassumption and there exists a neighborhoodN of x such that (8.2.8) holds for all y ∈ DF∩N .Moreover (8.2.7) follows from (8.1.2), (8.2.8), and continuity of the norm.

Lemma 8.4. Suppose that F : Rn → Rn is semismooth at a solution x∗ of F(x) = 0and that all V ∈ ∂BF (x∗) are nonsingular. Then there exist a neighborhood N of x∗ andε : R+ → R+ with limt→0+ ε(t) = 0 monotonically such that

|x − V −1F(x)− x∗| ≤ ε(|x − x∗|)|x − x∗|,

|F(x − V −1F(x))| ≤ ε(|x − x∗|)|F(x)|for all V ∈ ∂BF (x) and x ∈ N .

Proof. From Theorem 8.3 there exist a neighborhood $N$ of $x^*$ and a constant $C$ such that $|V^{-1}| \le C$ for all $V \in \partial_B F(x)$ with $x \in N$. Thus, if $x \in N$, then it follows from Theorem 8.2 (2) and B-differentiability of $F$ at $x^*$, which is implied by semismoothness of $F$ at $x^*$, that

$$|x - V^{-1}F(x) - x^*| \le |V^{-1}|\, |F(x) - F(x^*) - V(x - x^*)| \le |V^{-1}| \big( |F(x) - F(x^*) - F'(x^*; x - x^*)| + |F'(x^*; x - x^*) - V(x - x^*)| \big) \le \varepsilon(|x - x^*|)\, |x - x^*|.$$

This implies the first claim. Let $\bar{x} = x - V^{-1}F(x)$. Since $F$ is B-differentiable at $x^*$ we have

$$|F(\bar{x})| \le |F'(x^*; \bar{x} - x^*)| + \varepsilon(|\bar{x} - x^*|)\, |\bar{x} - x^*|.$$

From the first part of the lemma we obtain, for a possibly redefined function $\varepsilon$,

$$|F(\bar{x})| \le (L + \varepsilon)\, \varepsilon\, |x - x^*|,$$

where $\varepsilon = \varepsilon(|x - x^*|)$ and $L$ is the Lipschitz constant of $F$ at $x^*$. Since

$$|x - x^*| \le |x - \bar{x}| + |\bar{x} - x^*| \le |V^{-1}F(x)| + |\bar{x} - x^*| \le C\,|F(x)| + \varepsilon\, |x - x^*|,$$

we find

$$|x - x^*| \le \frac{C}{1 - \varepsilon}\, |F(x)|$$

for all $x$ sufficiently close to $x^*$, and hence

$$|F(\bar{x})| \le \frac{C\,\varepsilon\,(L + \varepsilon)}{1 - \varepsilon}\, |F(x)|.$$

This implies the second claim.

We are now prepared for the local superlinear convergence result that was announced in the introduction of this chapter.

Theorem 8.5 (Superlinear Convergence). Suppose that $F : \mathbb{R}^n \to \mathbb{R}^n$ is semismooth at a solution $x^*$ of $F(x) = 0$ and that all $V \in \partial_B F(x^*)$ are nonsingular. Then the iterates

$$x^{k+1} = x^k - V_k^{-1} F(x^k), \qquad V_k \in \partial_B F(x^k),$$

for $k = 0, 1, \dots$ are well defined and converge to $x^*$ superlinearly if $x^0$ is chosen sufficiently close to $x^*$. Moreover, $|F(x^k)|$ decreases superlinearly to 0.

Proof. We proceed by induction and suppose that $x^0 \in N$ and $\varepsilon(|x^0 - x^*|) \le 1$, with $N$ and $\varepsilon$ as in Lemma 8.4. Without loss of generality we may assume that $N$ is a ball in $\mathbb{R}^n$. To verify the induction step suppose that $x^k \in N$. Then

$$|x^{k+1} - x^*| \le \varepsilon(|x^k - x^*|)\, |x^k - x^*| \le |x^0 - x^*|.$$

This implies that $x^{k+1} \in N$ and superlinear convergence of $x^k \to x^*$. Superlinear convergence of $F(x^k)$ to 0 follows from the second part of Lemma 8.4.
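To make the iteration of Theorem 8.5 concrete, the following minimal sketch (our own illustration, not taken from the text) runs $x^{k+1} = x^k - V_k^{-1}F(x^k)$ with $V_k \in \partial_B F(x^k)$ for the piecewise smooth test function $F(x) = Ax + \max(0, x) - b$ on $\mathbb{R}^n$, with max acting componentwise. The helper names `F`, `V`, `semismooth_newton`, and the positive definite test matrix are assumptions of this sketch; positive definiteness of $A$ guarantees that every element $A + \operatorname{diag}(\chi_{\{x>0\}})$ of $\partial_B F(x)$ is nonsingular.

```python
import numpy as np

# A minimal sketch of the semismooth Newton iteration of Theorem 8.5,
# x^{k+1} = x^k - V_k^{-1} F(x^k), with V_k in the B-subdifferential.
# Test problem (illustrative, not from the text): F(x) = A x + max(0, x) - b.

def F(x, A, b):
    return A @ x + np.maximum(0.0, x) - b

def V(x, A):
    # One element of \partial_B F(x): derivative 1 where x > 0, else 0.
    return A + np.diag((x > 0).astype(float))

def semismooth_newton(A, b, x0, tol=1e-12, maxit=50):
    x = x0.copy()
    for k in range(maxit):
        r = F(x, A, b)
        if np.linalg.norm(r) < tol:
            break
        x = x - np.linalg.solve(V(x, A), r)
    return x, k

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)           # symmetric positive definite
b = rng.standard_normal(n)
x, k = semismooth_newton(A, b, np.zeros(n))
print(k, np.linalg.norm(F(x, A, b)))  # typically converges in a few steps
```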


8.2.2 Globalization

We now discuss globalization of the semismooth Newton method (8.1.4). For this purpose we define the merit function $\theta$ by

$$\theta(x) = |F(x)|^2.$$

Throughout we assume that $F : \mathbb{R}^n \to \mathbb{R}^n$ is locally Lipschitz continuous and B-differentiable and that the following assumptions (8.2.9)-(8.2.11) hold:

$$S = \{x \in \mathbb{R}^n : |F(x)| \le |F(x^0)|\} \ \text{is bounded.} \qquad (8.2.9)$$

There exist $\sigma, b > 0$ and a graph $\Gamma$ from $S$ to the nonempty subsets of $\mathbb{R}^n$ such that $\theta'(x; d) \le -\sigma\, \theta(x)$ and $|d| \le b\,|F(x)|$ for all $d \in \Gamma(x)$ and $x \in S$. $\qquad$ (8.2.10)

Moreover $\Gamma$ has the following closure property: $x^k \to \bar{x}$ and $d^k \to \bar{d}$ with $x^k \in S$ and $d^k \in \Gamma(x^k)$ imply $\theta^o(\bar{x}; \bar{d}) \le -\sigma\, \theta(\bar{x})$. $\qquad$ (8.2.11)

Here $\theta^o(x; d)$ denotes the Clarke generalized directional derivative of $\theta$ at $x$ in direction $d$ (see [Cla]), which is defined by

$$\theta^o(x; d) = \limsup_{y\to x,\ t\to 0^+} \frac{\theta(y + t d) - \theta(y)}{t}.$$

It is assumed that $b > C$ is an arbitrarily large parameter; it serves the purpose that the directions $\{d^k\}$ are uniformly bounded. With reference to the closure property of $\Gamma$ given in (8.2.11), note that it is not required that $\bar{d} \in \Gamma(\bar{x})$. Conditions (8.2.10) and (8.2.11) will be discussed in Section 8.2.3 below.

Following [HaPaRa, Qi] we next investigate a globalization strategy for the generalized Newton iteration (8.1.4) and introduce the following algorithm.

Algorithm G.
Let $\beta, \gamma \in (0, 1)$ and $\bar{\sigma} \in (0, \sigma)$. Choose $x^0 \in \mathbb{R}^n$ and set $k = 0$. Given $x^k$ with $F(x^k) \ne 0$:

(i) If there exists a solution $h^k$ to

$$V_k h^k = -F(x^k), \qquad V_k \in \partial_B F(x^k),$$

with $|h^k| \le b\,|F(x^k)|$, and if further

$$|F(x^k + h^k)| < \gamma\, |F(x^k)|,$$

set $d^k = h^k$, $x^{k+1} = x^k + d^k$, $\alpha_k = 1$, and $m_k = 0$.

(ii) Otherwise choose $d^k \in \Gamma(x^k)$ and let $\alpha_k = \beta^{m_k}$, where $m_k$ is the first positive integer $m$ for which

$$\theta(x^k + \beta^m d^k) - \theta(x^k) \le -\bar{\sigma} \beta^m\, \theta(x^k).$$

Finally set $x^{k+1} = x^k + \alpha_k d^k$.

In (ii) the increment $d^k = h^k$ from (i) can be chosen if $\theta'(x^k; h^k) \le -\bar{\sigma}\,\theta(x^k)$.
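The following sketch (our own, under the assumption that the direction oracle $\Gamma(x)$ returns the Bouligand-type direction solving $V d = -F(x)$, as in choice (a) of Section 8.2.3 below) indicates how Algorithm G can be organized in code. The names `algorithm_G`, `b_bound`, and the parameter values are hypothetical.

```python
import numpy as np

# A sketch of Algorithm G (globalized semismooth Newton). The descent
# oracle Gamma(x) is assumed here to return the solution of V d = -F(x),
# V in \partial_B F(x); problem data and parameters are illustrative.

def algorithm_G(F, V, x0, beta=0.5, gamma=0.9, sigma_bar=0.1,
                b_bound=1e6, tol=1e-12, maxit=100):
    x = x0.copy()
    theta = lambda z: float(np.dot(F(z), F(z)))   # merit function |F|^2
    for _ in range(maxit):
        r = F(x)
        if np.linalg.norm(r) < tol:
            break
        h = np.linalg.solve(V(x), -r)             # assumes V(x) nonsingular
        # (i) accept the full semismooth Newton step if it reduces |F|
        if np.linalg.norm(h) <= b_bound * np.linalg.norm(r) and \
           np.linalg.norm(F(x + h)) < gamma * np.linalg.norm(r):
            x = x + h
            continue
        # (ii) otherwise backtrack on theta with the Armijo-type test
        d, t = h, 1.0
        while theta(x + t * d) - theta(x) > -sigma_bar * t * theta(x):
            t *= beta
            if t < 1e-16:
                break
        x = x + t * d
    return x
```

With `F` and `V` from the sketch after Theorem 8.5, a call such as `algorithm_G(lambda z: F(z, A, b), lambda z: V(z, A), np.zeros(n))` would drive $|F|$ to zero; near the root the full step of (i) is accepted, so the iteration reduces to that of Theorem 8.5.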

Theorem 8.6. Suppose that $F : \mathbb{R}^n \to \mathbb{R}^n$ is locally Lipschitz and B-differentiable.

(a) Assume that (8.2.9)-(8.2.11) hold. Then the sequence $\{x^k\}$ generated by Algorithm G is bounded, it satisfies $|F(x^{k+1})| < |F(x^k)|$ for all $k \ge 0$, and each accumulation point $x^*$ of $\{x^k\}$ satisfies $F(x^*) = 0$.

(b) If moreover for one such accumulation point

$$|h| \le c\,|F'(x^*; h)| \quad \text{for all } h \in \mathbb{R}^n, \qquad (8.2.12)$$

then the sequence $x^k$ converges to $x^*$.

(c) If in addition to the above assumptions $F$ is semismooth at $x^*$ and all $V \in \partial_B F(x^*)$ are nonsingular, then $x^k$ converges to $x^*$ superlinearly.

Proof. (a) First we prove that for each $x \in S$ with $\theta(x) \ne 0$ and $d$ satisfying $\theta'(x; d) \le -\sigma\,\theta(x)$, there exists $\bar{\tau} > 0$ such that

$$\theta(x + \tau d) - \theta(x) \le -\bar{\sigma}\tau\,\theta(x) \quad \text{for all } \tau \in [0, \bar{\tau}].$$

If this is not the case, then there exists a sequence $\tau_n \to 0^+$ such that

$$\theta(x + \tau_n d) - \theta(x) > -\bar{\sigma}\tau_n\,\theta(x).$$

Dividing both sides by $\tau_n$ and letting $n \to \infty$, we have by (8.2.10)

$$-\bar{\sigma}\,\theta(x) \le \theta'(x; d) \le -\sigma\,\theta(x).$$

Since $\bar{\sigma} < \sigma$, this shows $\theta(x) = 0$, which contradicts the assumption $\theta(x) \ne 0$. Hence for each level $k$ at which $d^k \in \Gamma(x^k)$ is chosen according to the second alternative in Algorithm G, there exist $m_k < \infty$ and $\alpha_k > 0$ such that $|F(x^{k+1})| < |F(x^k)|$. By construction the iterates therefore satisfy $|F(x^{k+1})| < |F(x^k)|$ for each $k \ge 0$.

Assume first that $\limsup \alpha_k > 0$. If the first alternative in Algorithm G with $\alpha_k = 1$ occurs infinitely many times, then using the fact that $\gamma < 1$ we find that $\lim_{k\to\infty} \theta(x^k) = 0$. Otherwise, for all $k$ sufficiently large,

$$0 \le \theta(x^{k+1}) \le (1 - \bar{\sigma}\alpha_k)\,\theta(x^k) \le \theta(x^k).$$

Thus $\theta(x^k)$ is monotonically decreasing, bounded below by 0, and hence convergent. Therefore $\lim_{k\to\infty}(\theta(x^{k+1}) - \theta(x^k)) = 0$ and consequently $\lim_{k\to\infty} \alpha_k\,\theta(x^k) = 0$. Thus $\limsup \alpha_k > 0$ implies that $\lim_{k\to\infty} \theta(x^k) = 0$. Hence each accumulation point $x^*$ of $\{x^k\}$ satisfies $F(x^*) = 0$. Note that the existence of an accumulation point follows from (8.2.9).

If on the other hand $\limsup \alpha_k = 0$, then $m_k \to \infty$. By the definition of $m_k$, for $\tau_k := \beta^{m_k - 1}$ we have $\tau_k \to 0$ and

$$\theta(x^k + \tau_k d^k) - \theta(x^k) > -\bar{\sigma}\tau_k\,\theta(x^k). \qquad (8.2.13)$$


By (8.2.9), (8.2.10) the sequence $\{(x^k, d^k)\}$ is bounded. Let $\{(x^k, d^k)\}_{k\in K}$ be any convergent subsequence with limit $(x^*, d)$. Note that

$$\frac{\theta(x^k + \tau_k d^k) - \theta(x^k)}{\tau_k} = \frac{\theta(x^k + \tau_k d) - \theta(x^k)}{\tau_k} + \frac{\theta(x^k + \tau_k d^k) - \theta(x^k + \tau_k d)}{\tau_k},$$

where

$$\lim_{k\in K,\,k\to\infty} \frac{\theta(x^k + \tau_k d^k) - \theta(x^k + \tau_k d)}{\tau_k} = 0,$$

since $\theta$ is locally Lipschitz continuous. Since $d^k \in \Gamma(x^k)$ for all $k \in K$, it follows from (8.2.11) that $\theta^o(x^*; d) \le -\sigma\,\theta(x^*)$. Then from (8.2.13) and (8.2.11) we find

$$-\bar{\sigma}\,\theta(x^*) \le \limsup_{k\in K,\,k\to\infty} \frac{\theta(x^k + \tau_k d^k) - \theta(x^k)}{\tau_k} \le \theta^o(x^*; d) \le -\sigma\,\theta(x^*). \qquad (8.2.14)$$

It follows that $(\sigma - \bar{\sigma})\,\theta(x^*) \le 0$ and thus $\theta(x^*) = 0$.

(b) Since $F$ is B-differentiable at $x^*$, there exists $\delta > 0$ such that for $|x - x^*| \le \delta$

$$|F(x) - F(x^*) - F'(x^*; x - x^*)| \le \frac{1}{2c}\,|x - x^*|.$$

Thus,

$$|F'(x^*; x - x^*)| \le |F(x)| + |F(x) - F(x^*) - F'(x^*; x - x^*)| \le |F(x)| + \frac{1}{2c}\,|x - x^*|.$$

From (8.2.12)

$$|x - x^*| \le c\,|F'(x^*; x - x^*)| \le c\,|F(x)| + \frac{1}{2}\,|x - x^*|,$$

and thus

$$|x - x^*| \le 2c\,|F(x)| \quad \text{if } |x - x^*| \le \delta.$$

Given $\varepsilon \in (0, \delta)$ define the set

$$N(x^*, \varepsilon) = \Big\{ x \in \mathbb{R}^n : |x - x^*| \le \varepsilon,\ |F(x)| \le \frac{\varepsilon}{2c + b} \Big\}.$$

Since $x^*$ is an accumulation point of $\{x^k\}$, there exists an index $\bar{k}$ such that $x^{\bar{k}} \in N(x^*, \varepsilon)$. Since $|d^k| \le b\,|F(x^k)|$ for all $k$, we have for $x^k \in N(x^*, \varepsilon)$

$$|x^{k+1} - x^*| = |x^k - x^* + \alpha_k d^k| \le |x^k - x^*| + \alpha_k\,|d^k| \le 2c\,|F(x^k)| + b\,|F(x^k)| = (2c + b)\,|F(x^k)| \le \varepsilon.$$

Since moreover $|F(x^{k+1})| < |F(x^k)|$, it follows that $x^{k+1} \in N(x^*, \varepsilon)$. By induction, $x^k \in N(x^*, \varepsilon)$ for all $k \ge \bar{k}$, and thus the sequence $x^k$ converges to $x^*$.


(c) Since $\lim_{k\to\infty} x^k = x^*$, the iterates $x^k$ of Algorithm G enter the region of attraction for Theorem 8.5. Moreover, referring to the proof of Lemma 8.4, for any $\gamma \in (0, 1)$ there exists $k_\gamma$ such that the iterates according to (8.1.4) satisfy $|F(x^{k+1})| \le \gamma\,|F(x^k)|$ for $k \ge k_\gamma$. Hence these iterates coincide with those of Algorithm G for $k \ge k_\gamma$, and superlinear convergence follows.

Remark 8.2.1. (i) We point out that the requirement that the graph $\Gamma$ satisfy the closure property (8.2.11) is used in the proof of Theorem 8.6 only for the case that $\limsup_{k\to\infty} \alpha_k = 0$.

(ii) For part (a) of Theorem 8.6 the condition $|h^k| \le b\,|F(x^k)|$ in the first alternative of Algorithm G is not required, and the condition $|d| \le b\,|F(x)|$ for all $x \in S$ used in alternative (ii) can be replaced by requiring that the directions are uniformly bounded. These conditions are used in the proof of Theorem 8.6(b).

(iii) Since $h \to F'(x^*; h)$ is positively homogeneous, and since we consider the finite-dimensional case here, one can easily argue that (8.2.12) is equivalent to $F'(x^*; h) \ne 0$ for all $h \ne 0$, which is called BD-regularity in [Qi].

8.2.3 Descent directions

We turn to a discussion of conditions (8.2.10) and (8.2.11) required for the descent directions $d$.

For the Clarke generalized directional derivative $\theta^o(x; d)$ we have, by local Lipschitz continuity of $F$ at $x$,

$$\theta^o(x; d) = \limsup_{y\to x,\ t\to 0^+} \frac{2\big(F(x),\, F(y + t d) - F(y)\big)}{t},$$

and there exists $F^o : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ such that

$$\theta^o(x; d) = 2\big(F(x), F^o(x; d)\big) \quad \text{for } (x, d) \in \mathbb{R}^n \times \mathbb{R}^n.$$

We introduce the notion of a quasi-directional derivative.

Definition 8.7. Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be directionally differentiable. Then $G : S \times \mathbb{R}^n \to \mathbb{R}^n$ is called a quasi-directional derivative of $F$ on $S \subset \mathbb{R}^n$ if

(i) $(F(x), F'(x; d)) \le (F(x), G(x; d))$ for all $d \in \mathbb{R}^n$, $x \in S$,

(ii) $G(x; td) = t\,G(x; d)$ for all $d \in \mathbb{R}^n$, $x \in S$, and $t \ge 0$,

(iii) $(F(\bar{x}), F^o(\bar{x}; \bar{d})) \le \limsup_{x\to\bar{x},\ d\to\bar{d}} (F(x), G(x; d))$ for all $x \to \bar{x}$, $d \to \bar{d}$ with $x, \bar{x} \in S$.

For the special case of optimization subject to box constraints a quasi-directional derivative will be constructed in Section 8.2.5. In the remainder of this section we consider the relationship between well-known choices of descent directions (see, e.g., [HPR, Pan1]) and the concept of quasi-directional derivative of Definition 8.7, and we assume that $F$ is a locally Lipschitz continuous and directionally differentiable function on $S$, where $S$ refers to the set defined in (8.2.9).

(a) Bouligand direction. If there exists $\bar{b}$ such that

$$|h| \le \bar{b}\,|F'(x; h)| \quad \text{for all } x \in S,\ h \in \mathbb{R}^n, \qquad (8.2.15)$$

and if

$$F(x) + F'(x; d) = 0 \qquad (8.2.16)$$

admits a solution $d$ for each $x \in S$, then a first choice for the direction is given by the solution $d$ to (8.2.16), i.e., $\Gamma(x) = d$; see, e.g., [Pan1]. By (8.2.15) we have $|d| \le \bar{b}\,|F(x)|$. Moreover

$$\theta'(x; d) = 2\,(F'(x; d), F(x)) = -2\,\theta(x),$$

and therefore the inequalities in (8.2.10) hold with $b = \bar{b}$ and $\sigma = 2$. For this choice, however, $\Gamma$ does not satisfy (8.2.11) in general, unless additional conditions are imposed on the problem data; see Section 8.2.5 below.

(b) Generalized Bouligand direction. As a second choice (see [HPR, Pan1]), we assume that $G$ is a quasi-directional derivative of $F$ on $S$, that

$$|h| \le \bar{b}\,|G(x; h)| \quad \text{for all } x \in S,\ h \in \mathbb{R}^n \qquad (8.2.17)$$

holds, and that

$$F(x) + G(x; d) = 0 \qquad (8.2.18)$$

admits a solution $d$, which is used as the descent direction, for each $x \in S$. We set $\Gamma(x) = d$. Then one argues as for the first choice that the inequalities in (8.2.10) hold with $b = \bar{b}$ and $\sigma = 2$. Moreover $\Gamma$ satisfies (8.2.11), since for any $(x, d) \to (\bar{x}, \bar{d})$ in $S \times \mathbb{R}^n$ with $\Gamma(x) = d$ we have

$$\theta^o(\bar{x}; \bar{d}) \le 2 \limsup_{x\to\bar{x},\ d\to\bar{d}} (F(x), G(x; d)) = -2 \lim_{x\to\bar{x}} |F(x)|^2 = -2\,\theta(\bar{x}).$$

We refer the reader to Section 8.2.5 for the construction of $G$ for specific applications.

(c) Generalized gradient direction. The following choice was discussed in [HPR]. Here $d$ is chosen as the solution to

$$\min_d\ J(x, d) = 2\,(F(x), G(x; d)) + \eta\,|d|^2, \qquad (8.2.19)$$

where $\eta > 0$ and $x \in S$. Assume that for some $L > 0$

$$h \to G(x; h) \ \text{is continuous and} \ |G(x; h)| \le L\,|h| \ \text{for all } x \in S,\ h \in \mathbb{R}^n. \qquad (8.2.20)$$

Then $d \to J(x, d)$ is coercive, bounded below, and continuous. Thus there exists an optimal solution $d$ to (8.2.19).

Lemma 8.8. Assume that $F : \mathbb{R}^n \to \mathbb{R}^n$ is Lipschitz continuous and directionally differentiable and that $G$ is a quasi-directional derivative of $F$ on $S$ satisfying (8.2.20).

(a) If $d$ is an optimal solution to (8.2.19), then

$$(F(x), G(x; d)) = -\eta\,|d|^2.$$

(b) If $d = 0$ is an optimal solution to (8.2.19), then $(F(x), G(x; h)) \ge 0$ for all $h \in \mathbb{R}^n$.

Proof. (a) Let $d$ be an optimal solution to (8.2.19) and consider

$$\min_{\alpha\ge 0}\ 2\,(F(x), G(x; \alpha d)) + \alpha^2\eta\,|d|^2.$$

Then $\alpha = 1$ is optimal, and differentiating with respect to $\alpha$ we obtain

$$(F(x), G(x; d)) + \eta\,|d|^2 = 0.$$

(b) If $d = 0$ is an optimal solution, then the optimal value of (8.2.19) is zero. It follows that for all $h \in \mathbb{R}^n$ and $\alpha \ge 0$

$$0 \le 2\alpha\,(F(x), G(x; h)) + \alpha^2\eta\,|h|^2.$$

Dividing by $\alpha > 0$ and letting $\alpha \to 0^+$ we obtain the claim.

By Lemma 8.8 the optimal value of the cost $J$ in (8.2.19) is given by $-\eta\,|d|^2$. If this value is negative, then any solution to (8.2.19) provides a direction of decrease for $\theta$. The optimal value of the cost is 0 if and only if $d = 0$ is the optimal solution. In this case Lemma 8.8 implies that $x$ is a stationary point in the sense that $(F(x), G(x; h)) \ge 0$ for all $h \in \mathbb{R}^n$.

Let us now turn to the discussion of condition (8.2.10) for the direction given by the solution $d$ to (8.2.19), i.e., $\Gamma(x) = d$. We assume (8.2.17) and that (8.2.18) admits a solution for every $x \in S$. Since $J(x, 0) = 0$, we have $2\,(F(x), G(x; d)) \le -\eta\,|d|^2$ and therefore

$$\eta\,|d|^2 \le -2\,(F(x), G(x; d)) \le 2\,|F(x)|\,|G(x; d)| \le 2L\,|d|\,|F(x)|.$$

Thus,

$$|d| \le \frac{2L}{\eta}\,|F(x)|,$$

and the second condition in (8.2.10) holds. Turning to the first condition, let $\tilde{d}$ satisfy $F(x) + G(x; \tilde{d}) = 0$. Then using Lemma 8.8(a) and (8.2.17) we find, at a solution $d$ to (8.2.19),

$$J(x, d) = (F(x), G(x; d)) = -\eta\,|d|^2 \le 2\,(G(x; \tilde{d}), F(x)) + \eta\,|\tilde{d}|^2 \le -2\,|F(x)|^2 + \eta\bar{b}^2\,|F(x)|^2 = -(2 - \eta\bar{b}^2)\,\theta(x). \qquad (8.2.21)$$

Since $G$ is a quasi-directional derivative of $F$ on $S$, we have $\theta'(x; d) \le 2\,(F(x), G(x; d)) \le -2(2 - \eta\bar{b}^2)\,\theta(x)$, and thus the direction $d$ defined by (8.2.19) satisfies the first condition in (8.2.10) with $\sigma = 2(2 - \eta\bar{b}^2)$, provided that $\eta < \frac{2}{\bar{b}^2}$.


To argue (8.2.11) for this choice of $\Gamma$, let $x^k \to \bar{x}$, $d^k \to \bar{d}$ with $d^k = \Gamma(x^k)$, $x^k \in S$, and choose $\tilde{d}^k$ such that $F(x^k) + G(x^k; \tilde{d}^k) = 0$. Then

$$\frac{1}{2}\,\theta^o(\bar{x}; \bar{d}) \le \limsup_{k\to\infty} (F(x^k), G(x^k; d^k)) \le \limsup_{k\to\infty} \big( 2\,(F(x^k), G(x^k; d^k)) + \eta\,|d^k|^2 \big) \le \limsup_{k\to\infty} \big( 2\,(F(x^k), G(x^k; \tilde{d}^k)) + \eta\,|\tilde{d}^k|^2 \big) \le -2\,|F(\bar{x})|^2 + \eta\bar{b}^2 \lim_{k\to\infty} |F(x^k)|^2 = -(2 - \eta\bar{b}^2)\,\theta(\bar{x}),$$

and thus (8.2.11) holds if $\eta < \frac{2}{\bar{b}^2}$.

8.2.4 A Gauss–Newton algorithm

In this section we admit the case that the minimum of $\theta$ is not attained with value 0, which typically arises in nonlinear least squares problems, and the case where the equation $V_k d = -F(x^k)$ or $F(x^k) + G(x^k; d) = 0$, which would provide candidates for search directions in (ii) of Algorithm G, does not have a solution. For these cases we consider the following Gauss–Newton method.

Gauss–Newton algorithm.
Let $\beta, \gamma \in (0, 1)$, $\alpha > 0$, and $\bar{\sigma} \in (0, 2\eta)$. Choose $x^0 \in \mathbb{R}^n$ and set $k = 0$. Given $x^k$ with $F(x^k) \ne 0$:

(i) If there exists $V_k \in \partial_B F(x^k)$ such that

$$h^k = -\big(\alpha\,|F(x^k)|\,I + V_k^T V_k\big)^{-1} V_k^T F(x^k) \qquad (8.2.22)$$

with $|h^k| \le b\,|F(x^k)|$ satisfies $|F(x^k + h^k)| < \gamma\,|F(x^k)|$, let $d^k = h^k$ and set $x^{k+1} = x^k + d^k$, $\alpha_k = 1$, and $m_k = 0$.

(ii) Otherwise, let $d^k$ be defined by (8.2.19) with $x = x^k$. Stop if $d^k = 0$ or $F(x^k) = 0$. Otherwise, set $\alpha_k = \beta^{m_k}$, where $m_k$ is the first positive integer $m$ for which

$$\theta(x^k + \beta^m d^k) - \theta(x^k) \le -\bar{\sigma}\beta^m\,|d^k|^2, \qquad (8.2.23)$$

and set $x^{k+1} = x^k + \alpha_k d^k$.

Note that (8.2.22) gives the solution to

$$\min_h\ \frac{1}{2}\,|F(x^k) + V_k h|^2 + \frac{\alpha}{2}\,|F(x^k)|\,|h|^2. \qquad (8.2.24)$$

If the Gauss–Newton algorithm terminates after finitely many steps with index $k$, then either $F(x^k) = 0$ or $d^k = 0$ in the second alternative of the algorithm. In the latter case $x^k$ is a stationary point of $\theta$ in the sense that $(F(x^k), G(x^k; d)) \ge 0$ for all $d$, by Lemma 8.8.
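The regularized step (8.2.22) and the acceptance test of alternative (i) are easy to realize numerically. The following sketch is our own illustration; `F` and `V` are assumed to be callables as in the earlier sketches, and the parameter values for `alpha` and `gamma` are hypothetical.

```python
import numpy as np

# A sketch of the regularized Gauss-Newton step (8.2.22),
# h = -(alpha |F(x)| I + V^T V)^{-1} V^T F(x),
# which solves the damped least-squares problem (8.2.24).

def gauss_newton_step(F, V, x, alpha=1e-2):
    r, Vk = F(x), V(x)
    nr = np.linalg.norm(r)
    return np.linalg.solve(alpha * nr * np.eye(len(x)) + Vk.T @ Vk,
                           -Vk.T @ r)

def gn_accept(F, x, h, gamma=0.9):
    # acceptance test of step (i): |F(x + h)| < gamma |F(x)|
    return np.linalg.norm(F(x + h)) < gamma * np.linalg.norm(F(x))
```

Note that the damping term $\alpha|F(x)|I$ keeps the linear system solvable even when $V_k$ is singular, which is precisely the situation this algorithm is designed for.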


We next discuss global convergence when the Gauss–Newton algorithm takes infinitely many steps.

Theorem 8.9. Suppose that $F : \mathbb{R}^n \to \mathbb{R}^n$ is locally Lipschitz and B-differentiable, that (8.2.9), (8.2.11), and (8.2.20) hold, and that $G$ is a quasi-directional derivative of $F$ on $S$.

(a) If $\{x^k\}$ is an infinite sequence generated by the Gauss–Newton algorithm, then $\{x^k\}$ is bounded and $|F(x^{k+1})| < |F(x^k)|$ for all $k$. If the first alternative occurs infinitely often, then every accumulation point $x^*$ satisfies $F(x^*) = 0$. Otherwise, $\lim_{k\to\infty} d^k = 0$ and

$$(F(x^*), G(x^*; h)) \ge 0 \quad \text{for all } h \in \mathbb{R}^n. \qquad (8.2.25)$$

(b) If for some accumulation point $F(x^*) = 0$ and

$$|h| \le c\,|F'(x^*; h)| \quad \text{for all } h \in \mathbb{R}^n, \qquad (8.2.26)$$

then the sequence $x^k$ converges to $x^*$.

(c) If in addition to the assumptions in (a) and (b), $F$ is semismooth at $x^*$ and all $V \in \partial_B F(x^*)$ are nonsingular, then $x^k$ converges to $x^*$ superlinearly.

Proof. (a) The algorithm guarantees that $|F(x^{k+1})| < |F(x^k)|$ for all $k$. Due to (8.2.9) the sequence $\{x^k\}$ is bounded. In case the first alternative is taken, we have from (8.2.24) with $h = 0$

$$\alpha\,|d^k|^2 \le |F(x^k)|.$$

In case of the second alternative, Lemma 8.8(a) and (8.2.20) imply that

$$\eta\,|d^k|^2 \le |F(x^k)|\,|G(x^k; d^k)| \le L\,|F(x^k)|\,|d^k|.$$

Consequently the sequence $\{d^k\}$ is bounded.

If the first alternative of the Gauss–Newton algorithm occurs infinitely often, then $\lim_{k\to\infty} |F(x^k)| = 0$ and every accumulation point $x^*$ of $\{x^k\}$ satisfies $F(x^*) = 0$.

Let us turn to the case when eventually only the second alternative occurs. Since $|F(x^k)|$ is monotonically decreasing and bounded below, the sequence $\theta(x^{k+1}) - \theta(x^k)$ converges to 0, and hence by (8.2.23)

$$\lim_{k\to\infty} \alpha_k\,|d^k|^2 = 0.$$

If $\lim_{k\to\infty} d^k \ne 0$, then there exists an index set $K$ such that $\lim_{k\in K,\,k\to\infty} |d^k| \ne 0$ and consequently $\lim_{k\in K,\,k\to\infty} \alpha_k = 0$. For $\tau_k = \beta^{m_k - 1}$ we have

$$-\bar{\sigma}\,|d^k|^2 \le \frac{1}{\tau_k}\big( \theta(x^k + \tau_k d^k) - \theta(x^k) \big). \qquad (8.2.27)$$

Let $\bar{K} \subset K$ be such that $\{x^k\}_{k\in\bar{K}}$, $\{d^k\}_{k\in\bar{K}}$ are convergent subsequences with limits $x^*$ and $\bar{d}$. Note that

$$\frac{1}{\tau_k}\big(\theta(x^k + \tau_k d^k) - \theta(x^k)\big) = \frac{1}{\tau_k}\big(\theta(x^k + \tau_k \bar{d}) - \theta(x^k)\big) + \frac{1}{\tau_k}\big(\theta(x^k + \tau_k d^k) - \theta(x^k + \tau_k \bar{d})\big) \qquad (8.2.28)$$


with

$$\lim_{k\in\bar{K},\,k\to\infty} \frac{1}{\tau_k}\big( \theta(x^k + \tau_k d^k) - \theta(x^k + \tau_k \bar{d}) \big) = 0,$$

since $\theta$ is locally Lipschitz continuous. By Lemma 8.8(a) we have $(F(x^k), G(x^k; d^k)) = -\eta\,|d^k|^2$ for all $k \in \bar{K}$. Since $G$ is assumed to be a quasi-directional derivative, we have $\theta^o(x^*; \bar{d}) = 2\,(F(x^*), F^o(x^*; \bar{d})) \le -2\eta\,|\bar{d}|^2$. Passing to the limit in (8.2.27), utilizing (8.2.28), we find

$$-\bar{\sigma}\,|\bar{d}|^2 \le \limsup_{k\in\bar{K},\,k\to\infty} \frac{1}{\tau_k}\big( \theta(x^k + \tau_k d^k) - \theta(x^k) \big) \le \theta^o(x^*; \bar{d}) \le -2\eta\,|\bar{d}|^2.$$

Since $\bar{\sigma} \in (0, 2\eta)$, this implies that $\bar{d} = 0$, which contradicts our assumption. Consequently $\lim_{k\to\infty} d^k = 0$. From Lemma 8.8(a) we have

$$2\alpha\,(F(x^k), G(x^k; h)) + \eta\alpha^2\,|h|^2 = J(x^k, \alpha h) \ge J(x^k, d^k) = -\eta\,|d^k|^2$$

for every $\alpha > 0$ and $h \in \mathbb{R}^n$. Passing to the limit with respect to $k$ and dividing by $\alpha$ we obtain

$$2\,(F(x^*), G(x^*; h)) + \eta\alpha\,|h|^2 \ge 0,$$

and (8.2.25) follows by letting $\alpha \to 0^+$.

(b) The proof is identical to the one of Theorem 8.6(b).

(c) From Theorem 8.3 there exist a bounded neighborhood $N \subset \{x : |F(x)| \le |F(x^0)|\}$ of $x^*$ and a constant $C$ such that for all $x \in N$ and $V \in \partial_B F(x)$ we have $|V^{-1}| \le C$. Consequently there exists $M > 0$ such that for all $x \in N$

$$\big| \big(\alpha|F(x)|\,I + V(x)^T V(x)\big)^{-1} V(x)^T - V(x)^{-1} \big| \le \big| \alpha|F(x)|\,\big(\alpha|F(x)|\,I + V(x)^T V(x)\big)^{-1} \big| \le M\,|F(x)| \le M\,|F(x^0)|. \qquad (8.2.29)$$

Let $h = -\big(\alpha|F(x)|\,I + V(x)^T V(x)\big)^{-1} V(x)^T F(x)$. Then by Lemma 8.4, possibly after shrinking $N$, we have for all $x \in N$

$$|x + h - x^*| = |x - V(x)^{-1}F(x) - x^* + h + V(x)^{-1}F(x)| \le \varepsilon(|x - x^*|)\,|x - x^*| + M\,|F(x)|^2. \qquad (8.2.30)$$

Moreover

$$|F(x + h)| \le \big|F\big(x - V(x)^{-1}F(x)\big)\big| + \big|F(x + h) - F\big(x - V(x)^{-1}F(x)\big)\big| \le L\,\varepsilon(|x - x^*|)\,|F(x)| + ML^2\,|F(x)|^2 \le L\big(\varepsilon(|x - x^*|) + ML^2\,|x - x^*|\big)\,|F(x)|,$$

where we used (8.2.29) and denoted by $L$ the Lipschitz constant of $F$ on the bounded set $\{x : |x| \le M\,|F(x^0)|^2\} \cup \{x - V(x)^{-1}F(x) : x \in N\} \cup N$.


Since $x^k \to x^*$, the last estimate implies the existence of an index $\bar{k}$ such that $x^k \in N$ and

$$|F(x^k + h^k)| \le \gamma\,|F(x^k)| \quad \text{for all } k \ge \bar{k},$$

where $h^k = -\big(\alpha|F(x^k)|\,I + V(x^k)^T V(x^k)\big)^{-1} V(x^k)^T F(x^k)$. Thus the first alternative in the Gauss–Newton algorithm is chosen for all $k \ge \bar{k}$, and superlinear convergence follows from (8.2.30) with $(x, h) = (x^k, h^k)$, where we also use that $|F(x)| \le L\,|x - x^*|$ for $x \in N$.

8.2.5 A nonlinear complementarity problem

Consider the complementarity problem

$$\phi \le x \le \psi, \quad g(x)_i\,(x - \psi)_i\,(x - \phi)_i = 0 \ \text{for all } i, \quad g(x) \le 0 \ \text{for } x \ge \psi \ \text{and} \ g(x) \ge 0 \ \text{for } x \le \phi,$$

where $g \in C^1(\mathbb{R}^n, \mathbb{R}^n)$ and $\phi < \psi$. This corresponds to the bilateral constraint case and can equivalently be expressed as

$$F(x) = g(x) + \max(0, -g(x) + x - \psi) + \min(0, -g(x) + x - \phi) = 0;$$

see Section 4.7, Example 4.53. Clearly $\theta(x) = |F(x)|^2$ is locally Lipschitz continuous and directionally differentiable, and hence B-differentiable.

Define

$$A^+ = \{-g(x) + x - \psi > 0\}, \quad A^- = \{-g(x) + x - \phi < 0\},$$
$$I_1 = \{-g(x) + x - \psi = 0\}, \quad I_2 = \{x - \psi < g(x) < x - \phi\}, \quad \text{and} \quad I_3 = \{-g(x) + x - \phi = 0\}.$$

We obtain

$$F'(x; d) = \begin{cases} d & \text{on } A^+ \cup A^-, \\ g'(x)\,d & \text{on } I_2, \\ \max(g'(x)\,d,\ d) & \text{on } I_1, \\ \min(g'(x)\,d,\ d) & \text{on } I_3, \end{cases}$$

and the Bouligand direction (8.2.16) is given as the solution to

$$d + x - \psi = 0 \ \text{on } A^+, \quad d + x - \phi = 0 \ \text{on } A^-, \quad g'(x)\,d + g(x) = 0 \ \text{on } I_2,$$
$$\max(g'(x)\,d,\ d) + F(x) = 0 \ \text{on } I_1, \quad \min(g'(x)\,d,\ d) + F(x) = 0 \ \text{on } I_3. \qquad (8.2.31)$$
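As a numerical illustration of the case analysis above, the following sketch (our own; the callables `g`, `gprime` and the data `phi`, `psi` are assumptions) evaluates the bilateral residual $F$ and the directional derivative $F'(x; d)$ through the index sets $A^\pm$, $I_1$, $I_2$, $I_3$.

```python
import numpy as np

# A sketch evaluating F(x) = g(x) + max(0, -g(x)+x-psi) + min(0, -g(x)+x-phi)
# and F'(x; d) via the index sets A+, A-, I1, I2, I3 defined above.

def residual(x, g, phi, psi):
    gx = g(x)
    return gx + np.maximum(0.0, -gx + x - psi) + np.minimum(0.0, -gx + x - phi)

def dir_deriv(x, d, g, gprime, phi, psi):
    gx, Jd = g(x), gprime(x) @ d
    up, lo = -gx + x - psi, -gx + x - phi     # arguments of max and min
    return np.where(up == 0.0, np.maximum(Jd, d),          # I1
           np.where(lo == 0.0, np.minimum(Jd, d),          # I3
           np.where((up > 0) | (lo < 0), d, Jd)))          # A+/A- vs I2
```

In floating-point practice the equality tests defining $I_1$ and $I_3$ would typically be replaced by tolerance bands, which is one motivation for the $\delta$-relaxed sets $A_\delta$, $I_\delta$ used below.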


Further, if $g'(x) - I$ is surjective and $g \in C^2(\mathbb{R}^n, \mathbb{R}^n)$, then $F^o(x; d)$ satisfies

$$F^o(x; d) = \begin{cases} d & \text{on } A^+ \cup A^-, \\ g'(x)\,d & \text{on } I_2, \\ \max(g'(x)\,d,\ d) & \text{on } \{F(x) > 0\} \cap (I_1 \cup I_3), \\ \min(g'(x)\,d,\ d) & \text{on } \{F(x) \le 0\} \cap (I_1 \cup I_3). \end{cases} \qquad (8.2.32)$$

To verify this claim we consider $F(x) = g(x) + \max(0, -g(x) + x - \psi)$. The general case then follows with minor modifications. We find

$$\theta^o(x; d) = \limsup_{y\to x,\,t\to 0^+} \frac{2}{t}\,\big(F(x),\ F(y + td) - F(y)\big) = \limsup_{y\to x,\,t\to 0^+} \frac{2}{t}\,\big(F(x),\ \max(g'(x)(y + td),\ y + td - \bar{\psi}) - \max(g'(x)y,\ y - \bar{\psi})\big), \qquad (8.2.33)$$

where $\bar{\psi} = \psi + g(x) - g'(x)x$. For $d \in \mathbb{R}^n$ we define $r \in \mathbb{R}^n$ by

$$r_i > 0 \ \text{on } S_1 = \{i \in I_1 : F_i(x) > 0,\ (Ad)_i > d_i\} \cup \{i \in I_1 : F_i(x) \le 0,\ (Ad)_i < d_i\},$$
$$r_i < 0 \ \text{on } S_2 = \{i \in I_1 : F_i(x) > 0,\ (Ad)_i \le d_i\} \cup \{i \in I_1 : F_i(x) \le 0,\ (Ad)_i \ge d_i\},$$

and $r$ arbitrary otherwise. Here we set $A = g'(x)$. Let $y \in \mathbb{R}^n$ be such that

$$A y = y - \bar{\psi} + r,$$

and choose $t_y > 0$ such that

$$(Ay + tAd)_i > (y + td - \bar{\psi})_i \ \text{for } i \in S_1, \qquad (Ay + tAd)_i < (y + td - \bar{\psi})_i \ \text{for } i \in S_2$$

for $t \in (0, t_y)$. Passing to the limit in (8.2.33) we arrive at (8.2.32).

For arbitrary $\delta > 0$ we claim that the mapping $G$ defined by

$$G(x; d) = \begin{cases} d & \text{on } A_\delta = \{-g(x) + x - \psi > \delta\} \cup \{-g(x) + x - \phi < -\delta\}, \\ g'(x)\,d & \text{on } I_\delta = \{x - \psi + \delta \le g(x) \le x - \phi - \delta\}, \\ \max(g'(x)\,d,\ d) & \text{otherwise, if } F(x) > 0, \\ \min(g'(x)\,d,\ d) & \text{otherwise, if } F(x) \le 0, \end{cases} \qquad (8.2.34)$$

is a quasi-directional derivative for $F$. In fact, $G$ is clearly positively homogeneous of degree 1, and

$$(F(x), F'(x; d)) \le (F(x), G(x; d)),$$

which shows (i) of Definition 8.7. Moreover we have

$$(F(x), F^o(x; d)) \le (F(x), G(x; d)).$$

If $|x - \bar{x}|$ is sufficiently small, then

$$i \in I_\delta(x) \ \text{implies that} \ i \in I_2(\bar{x}), \qquad i \in A_\delta(x) \ \text{implies that} \ i \in A^+(\bar{x}) \cup A^-(\bar{x}).$$

Thus (iii) holds by the definition of $G$.

To argue that (8.2.18) has a solution, let $A_\delta \cup I_\delta \cup (A_\delta \cup I_\delta)^c$ be a pairwise disjoint partition of the index set $\{1, \dots, n\}$. Set $A = g'(x)$, for $x \in S$, and decompose $A$ according to the partition of the index set:

$$A = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}.$$

Setting $d = (d_1, d_2, d_3)^T$ and $F = (F_1, F_2, F_3)^T$, the equation

$$G(x; d) = -F$$

is equivalent to

$$d_1 = -F_1, \qquad d_2 = -A_{22}^{-1} A_{23}\,d_3 + A_{22}^{-1}\big({-F_2} + A_{21} F_1\big),$$

and

$$\begin{cases} M d_3 + \mu = -w - F_3, \\ \mu = \max(0, \mu + d_3 + F_3) \ \text{on } \{F(x) > 0\}, \\ \mu = \min(0, \mu + d_3 + F_3) \ \text{on } \{F(x) \le 0\}, \end{cases} \qquad (8.2.35)$$

where

$$w = -A_{31} F_1 + A_{32} A_{22}^{-1}\big({-F_2} + A_{21} F_1\big) \quad \text{and} \quad M = A_{33} - A_{32} A_{22}^{-1} A_{23}.$$

Assume that $A$ is symmetric positive definite for every $x \in S$. Then every Schur complement of $A$ is positive definite as well, and (8.2.35) admits a unique solution $(d_3, \mu)$. It follows that $G(x; d) = -F$ admits a unique solution $d$ for every $x \in S$ and $F \in \mathbb{R}^n$. Consequently (8.2.18) is satisfied. Since $g'(x)$ is positive definite for every $x \in S$, and $S$ is closed and bounded by (8.2.9), and hence compact, it follows that $g'(x)$ and $M$ are uniformly positive definite with respect to $x \in S$. This implies that there exists $\bar{b}$, independent of $x \in S$, such that $|d| \le \bar{b}\,|F|$. Consequently (8.2.17) holds.

We remark that (iii) of Definition 8.7 is not satisfied for the choice $G(x; d) = F'(x; d)$ unless $g(x)_i \ne (x - \psi)_i$ and $g(x)_i \ne (x - \phi)_i$ for all $i$ and $x \in S$.


8.3 Semismooth functions in infinite-dimensional spaces

In infinite-dimensional spaces notions of generalized derivatives for functions which are not $C^1$ cannot rely on Rademacher's theorem. Here, instead, we shall mainly utilize a concept of generalized derivative that is sufficient to guarantee superlinear convergence of Newton's method. We refer the reader to the discussion in Section 8.1, especially (8.1.5) and (8.1.6). This notion of differentiability is called Newton derivative and will be defined below. We refer the reader to [HiIK, CNQ, Ulb] for further discussion of the topics covered in this section.

Let $X, Z$ be real Banach spaces and let $D \subset X$ be an open set.

Definition 8.10. (1) $F : D \subset X \to Z$ is called Newton differentiable at $x$ if there exist an open neighborhood $N(x) \subset D$ and mappings $G : N(x) \to \mathcal{L}(X, Z)$ such that

$$\lim_{|h|\to 0} \frac{|F(x + h) - F(x) - G(x + h)\,h|_Z}{|h|_X} = 0. \qquad \text{(A)}$$

The family $\{G(s) : s \in N(x)\}$ is called an N-derivative of $F$ at $x$.

(2) $F$ is called semismooth at $x$ if it is Newton differentiable at $x$ and

$$\lim_{t\to 0^+} G(x + t h)\,h \ \text{exists uniformly in } |h| = 1.$$

(3) $F$ is directionally differentiable at $x \in D$ if

$$\lim_{t\to 0^+} \frac{F(x + t h) - F(x)}{t} =: F'(x; h)$$

exists for all $h \in X$.

(4) $F$ is B-differentiable at $x \in D$ if $F$ is directionally differentiable at $x$ and

$$\lim_{|h|\to 0} \frac{F(x + h) - F(x) - F'(x; h)}{|h|_X} = 0.$$

Note that, differently from the finite-dimensional case, we do not require Lipschitz continuity of $F$ as part of the definition of semismoothness.

Lemma 8.11. Suppose that $F : D \subset X \to Z$ is Newton differentiable at $x \in D$ with N-derivative $G$.

(1) $F$ is directionally differentiable at $x$ if and only if

$$\lim_{t\to 0^+} G(x + t h)\,h \ \text{exists for all } h \in X. \qquad (8.3.1)$$

In this case $F'(x; h) = \lim_{t\to 0^+} G(x + th)\,h$ for all $h \in X$.

(2) $F$ is B-differentiable at $x$ if and only if

$$\lim_{t\to 0^+} G(x + t h)\,h \ \text{exists uniformly in } |h| = 1, \qquad (8.3.2)$$

i.e., $F$ is semismooth.


Proof. (1) If (8.3.1) holds for $h \in X$ with $|h| = 1$, then by (A)

$$\lim_{t\to 0^+} \frac{F(x + th) - F(x)}{t} = \lim_{t\to 0^+} G(x + th)\,h.$$

Since $h \in X$ with $|h| = 1$ was arbitrary, this implies that $F$ is directionally differentiable at $x$ and

$$F'(x; h) = \lim_{t\to 0^+} G(x + th)\,h.$$

Similarly the converse holds.

(2) If $F$ is directionally differentiable at $x$, then $F$ is B-differentiable at $x$, i.e.,

$$\lim_{|h|\to 0} \frac{F(x + h) - F(x) - F'(x; h)}{|h|_X} = 0,$$

if and only if

$$\lim_{t\to 0^+} \frac{F(x + tv) - F(x)}{t} - F'(x; v) = 0 \ \text{uniformly in } |v|_X = 1.$$

Here we use positive homogeneity of the directional derivative $F'(x; h)$ with respect to the second variable.

If $F$ is B-differentiable at $x$, then it is directionally differentiable and from (1) we have

$$\lim_{t\to 0^+} \frac{F(x + tv) - F(x)}{t} = F'(x; v) = \lim_{t\to 0^+} G(x + tv)\,v.$$

The B-differentiability property and the equivalence stated above imply that $\lim_{t\to 0^+} G(x + tv)\,v$ exists uniformly in $|v| = 1$. The converse follows similarly.

Example 8.12. Let $\psi : \mathbb{R} \to \mathbb{R}$ be semismooth at every $x \in \mathbb{R}$ in the sense of Definition 8.1 and globally Lipschitz, i.e., there exists a constant $L$ such that

$$|\psi(s) - \psi(t)| \le L\,|s - t| \quad \text{for all } s, t \in \mathbb{R}.$$

We first argue that there exists a measurable selection $V : \mathbb{R} \to \mathbb{R}$ such that $V(t) \in \partial\psi(t)$ for a.e. $t \in \mathbb{R}$. Since $\partial\psi(s)$ is a nonempty closed set in $\mathbb{R}$ for every $s \in \mathbb{R}$ (see [Cla, p. 70]), it suffices to prove that the multivalued function $s \to \partial\psi(s)$ is measurable, i.e., that for every compact set $C \subset \mathbb{R}$ the preimage $P_C = \{t \in \mathbb{R} : \partial\psi(t) \cap C \ne \emptyset\}$ is measurable. The measurable selection theorem (see [Cla, p. 111]) then ensures the existence of the desired measurable selection $V$. To verify measurability of $s \to \partial\psi(s)$, let $C$ be a compact set and let $\{t_k\}$ be a convergent sequence in $P_C$ with limit $t^*$. Choose $v_k \in \partial\psi(t_k) \cap C$ for $k = 1, 2, \dots$. By compactness of $C$ there exists a convergent subsequence, denoted by the same symbol, with limit $v^* \in C$. Since $t_k \to t^*$, upper semicontinuity of $\partial\psi$ at $t^*$ (see [Cla, p. 70]) implies the existence of a sequence $\tilde{v}_k \in \partial\psi(t^*)$ such that $\lim_{k\to\infty}(v_k - \tilde{v}_k) = 0$. Consequently $\lim_{k\to\infty} \tilde{v}_k = v^*$, and by closedness of $\partial\psi(t^*)$ we have $v^* \in \partial\psi(t^*) \cap C$. Thus $P_C$ is closed and therefore measurable.

Associated to $\psi$ with the properties specified above, we define for $1 \le p \le q \le \infty$ the substitution operator $F : L^q(\Omega) \to L^p(\Omega)$ by

$$F(x)(s) = \psi(x(s)) \quad \text{for a.e. } s \in \Omega,$$

where $x \in L^q(\Omega)$ and $\Omega$ is a bounded domain in $\mathbb{R}^n$.

We now verify that $F$ is Newton differentiable at every $x \in L^q(\Omega)$ if $1 \le p < q \le \infty$ and that any measurable selection $V$ of $\partial\psi$ provides an N-derivative. Let $D : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be given by

$$D(s, v) = |\psi(s + v) - \psi(s) - V(s + v)\,v|.$$

By Theorem 8.2 we have

$$\lim_{v\to 0} v^{-1} D(s, v) = 0 \quad \text{for every } s \in \mathbb{R}. \qquad (8.3.3)$$

Moreover, global Lipschitz continuity of $\psi$ implies that

$$D(s, v) \le 2L\,|v| \quad \text{for all } (s, v) \in \mathbb{R}^2. \qquad (8.3.4)$$

Let $x \in L^q(\Omega)$ and let $\{h_k\}$ be a sequence in $L^q(\Omega)$ converging to 0. Then there exists a subsequence $\{h_{k'}\}$ converging to 0 a.e. in $\Omega$. By (8.3.3) this implies that $h_{k'}(s)^{-1} D(x(s), h_{k'}(s)) \to 0$ for a.e. $s \in \Omega$. By (8.3.4) Lebesgue's bounded convergence theorem is applicable and implies that $h_{k'}^{-1} D(x, h_{k'})$ converges to 0 in $L^p$ for every $1 \le p < \infty$. Since this is the case for every a.e. convergent subsequence of $\{h_k\}$, and since $\{h_k\}$ was arbitrary, we find that

$$|h^{-1} D(x, h)|_{L^p} \to 0 \quad \text{for every } 1 \le p < \infty \ \text{as } |h|_{L^q(\Omega)} \to 0. \qquad (8.3.5)$$

By the Hölder inequality we obtain

$$|D(x, h)|_{L^p} \le |h^{-1} D(x, h)|_{L^r}\,|h|_{L^q},$$

where $r = \frac{qp}{q - p}$ if $q < \infty$ and $r = p$ if $q = \infty$. Using (8.3.5) (with $p$ replaced by $r$) this implies Newton differentiability of $F$ at $x$.

To verify that $F$ is semismooth at $x \in L^q(\Omega)$, we shall show that

$$\frac{|V(x + h)\,h - \psi'(x; h)|_{L^p}}{|h|_{L^q}} \to 0 \quad \text{as } |h|_{L^q} \to 0. \qquad (8.3.6)$$

Let $\tilde{D} : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be given by $\tilde{D}(s, v) = |V(s + v)\,v - \psi'(s; v)|$. Then by Theorem 8.2 (3) and Lipschitz continuity of $\psi$ we have

$$\lim_{v\to 0} v^{-1} \tilde{D}(s, v) = 0 \quad \text{and} \quad \tilde{D}(s, v) \le 2L\,|v| \ \text{for all } (s, v) \in \mathbb{R}^2.$$

The proof of (8.3.6) can now be completed in the same manner as that for Newton differentiability, using Lebesgue's bounded convergence theorem. Semismoothness of $F : L^q(\Omega) \to L^p(\Omega)$ follows from (8.3.5), (8.3.6), and Lemma 8.11 (2). The class of mappings $F$ of this example was treated in [Ulb].

Example 8.13. Let $X$ be a Hilbert space. Then the norm functional $F(x) = |x|$ is Newton differentiable. In fact, let $G(x + h)\,h = \big(\frac{x + h}{|x + h|}, h\big)_X$ for $x + h \ne 0$, and $G(0)\,h = (\lambda, h)_X$ for some fixed $\lambda \in X$. Then

$$|h|^{-1}\,|F(x + h) - F(x) - G(x + h)\,h| = |h|^{-1}\,\bigg| \frac{(2(x + h), h)_X - |h|^2}{|x| + |x + h|} - \frac{(x + h, h)_X}{|x + h|} \bigg| \to 0$$

as $h \to 0$. Hence $F$ is Newton differentiable on $X$. Moreover $F$ is semismooth.
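As a quick numerical sanity check (our own illustration, in $\mathbb{R}^3$ with the Euclidean inner product), the following sketch evaluates the quotient in (A) for the norm functional along shrinking perturbations:

```python
import numpy as np

# Numerical check that G(x+h)h = (x+h, h)/|x+h| satisfies the
# Newton-differentiability quotient (A) for F(x) = |x|; illustration only.

rng = np.random.default_rng(1)
x = rng.standard_normal(3)
d = rng.standard_normal(3)
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    h = t * d
    xh = x + h
    G_h = np.dot(xh, h) / np.linalg.norm(xh)
    quot = abs(np.linalg.norm(xh) - np.linalg.norm(x) - G_h) / np.linalg.norm(h)
    print(t, quot)   # quotient tends to 0 as t -> 0
```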


Example 8.14. Let $F : L^q(\Omega) \to L^p(\Omega)$ denote the pointwise max operation $F(x) = \max(0, x)$ and define $G_m$ by

$$G_m(x)(s) = \begin{cases} 0 & \text{if } x(s) < 0, \\ \delta & \text{if } x(s) = 0, \\ 1 & \text{if } x(s) > 0, \end{cases}$$

where $\delta \in [0, 1]$. It follows from Example 8.12 that $F$ is semismooth from $L^q(\Omega)$ into $L^p(\Omega)$ provided that $1 \le p < q \le \infty$, with $G_m$ as N-derivative. It can also be argued that $G_m$ is an N-derivative for $F$ for any choice of $\delta \in \mathbb{R}$ (see [HiIK]). If $p = q$, then $F$ is still directionally differentiable at every $x \in L^q(\Omega)$. In fact, for $h \in L^q(\Omega)$ define

$$F'(x; h)(s) = \begin{cases} 0 & \text{if } x(s) < 0, \ \text{or } x(s) = 0 \ \text{and } h(s) \le 0, \\ h(s) & \text{if } x(s) > 0, \ \text{or } x(s) = 0 \ \text{and } h(s) \ge 0. \end{cases}$$

Then we have

$$\bigg| \frac{F(x(s) + t h(s)) - F(x(s))}{t} - F'(x; h)(s) \bigg| \le 2\,|h(s)|$$

and

$$\lim_{t\to 0^+} \bigg| \frac{F(x(s) + t h(s)) - F(x(s))}{t} - F'(x; h)(s) \bigg| = 0 \quad \text{for a.e. } s \in \Omega.$$

By the Lebesgue dominated convergence theorem

$$\lim_{t\to 0^+} \bigg| \frac{F(x + t h) - F(x)}{t} - F'(x; h) \bigg|_{L^p} = 0,$$

i.e., $F'(x; h)$ is the directional derivative of $F$ at $x$.

However, $F$ is not Newton differentiable with $G_m$ as N-derivative in general if $p = q$. To see this, consider $x(s) = -|s|$ on $\Omega = (-1, 1)$ and choose $h_n$ as $\frac{1}{n}$ multiplied by the characteristic function of the interval $(-\frac{1}{n}, \frac{1}{n})$. Then $|h_n|^p_{L^p} = \frac{2}{n^{p+1}}$ and

$$\int_{-1}^{1} |F(x + h_n) - F(x) - G_m(x + h_n)\,h_n|^p\, ds = \int_{-\frac{1}{n}}^{\frac{1}{n}} |x(s)|^p\, ds = \frac{2}{p + 1}\Big(\frac{1}{n}\Big)^{p+1}.$$

Thus,

$$\lim_{n\to\infty} \frac{|F(x + h_n) - F(x) - G_m(x + h_n)\,h_n|_{L^p}}{|h_n|_{L^p}} = \Big(\frac{1}{p + 1}\Big)^{\frac{1}{p}} \ne 0,$$

and hence condition (A) is not satisfied at $x$ for any $p \in [1, \infty)$. To consider the case $p = \infty$ we choose $\Omega = (0, 1)$ and show that (A) is not satisfied at $x(s) = s$. For this purpose define, for $n = 2, \dots$,

$$h_n(s) = \begin{cases} -(1 + \frac{1}{n})\,s & \text{on } (0, \frac{1}{n}], \\ (1 + \frac{1}{n})\,s - \frac{2}{n}(1 + \frac{1}{n}) & \text{on } (\frac{1}{n}, \frac{2}{n}], \\ 0 & \text{on } (\frac{2}{n}, 1]. \end{cases}$$


Observe that $E_n = \{s : x(s) + h_n(s) < 0\} \supset (0, \frac{1}{n}]$. Therefore

$$\lim_{n\to\infty} \frac{1}{|h_n|_{L^\infty(0,1)}}\,\big|\max(0, x + h_n) - \max(0, x) - G_m(x + h_n)\,h_n\big|_{L^\infty(0,1)} = \lim_{n\to\infty} \frac{n^2}{n + 1}\,|x|_{L^\infty(E_n)} \ge \lim_{n\to\infty} \frac{n}{n + 1} = 1,$$

and hence (A) cannot be satisfied.
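The failure of (A) for $p = q$ in the first counterexample can be observed numerically. The following sketch (our own; grid size and the choice $p = 2$ are assumptions) approximates the $L^p$ quotient for the sequence $h_n$ above and compares it with the predicted limit $(1/(p+1))^{1/p}$:

```python
import numpy as np

# Numerical illustration of the failure of (A) for p = q: for
# x(s) = -|s| on (-1,1) and h_n = (1/n) * chi_{(-1/n, 1/n)}, the quotient
# |F(x+h_n) - F(x) - G_m(x+h_n) h_n|_{L^p} / |h_n|_{L^p}
# approaches (1/(p+1))^(1/p) instead of 0.

p = 2.0
s = np.linspace(-1, 1, 200_001)          # fine grid for the integrals
ds = s[1] - s[0]
x = -np.abs(s)
Fx = np.maximum(0.0, x)
for n in [10, 100, 1000]:
    h = np.where(np.abs(s) < 1.0 / n, 1.0 / n, 0.0)
    Gm = (x + h > 0).astype(float)       # N-derivative candidate
    resid = np.maximum(0.0, x + h) - Fx - Gm * h
    num = (np.sum(np.abs(resid) ** p) * ds) ** (1 / p)
    den = (np.sum(np.abs(h) ** p) * ds) ** (1 / p)
    print(n, num / den, (1 / (p + 1)) ** (1 / p))
```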

Lemma 8.15. Suppose that $H : D \subset X \to Y$ is continuously Fréchet differentiable at $x \in D$ and that $\phi : Y \to Z$ is Newton differentiable at $H(x)$ with N-derivative $G$. Then $F = \phi(H)$ is Newton differentiable at $x$ with N-derivative $G(H(x + h))\,H'(x + h) \in \mathcal{L}(X, Z)$ for $h$ sufficiently small.

Proof. Let $U$ be a convex neighborhood of $x$ in $D$ such that $H' \in \mathcal{L}(X, Y)$ is continuous in $U$ and $H(U)$ is contained in $N(H(x))$, with $N(H(x))$ defined according to the N-differentiability of $\phi$ at $H(x)$. Let $h \in X$ be such that $x + h \in U$ and note that

$$H(x + h) = H(x) + \int_0^1 H'(x + \theta h)\,h\, d\theta, \qquad (8.3.7)$$

where

$$\bigg\| \int_0^1 H'(x + \theta h)\, d\theta - H'(x + h) \bigg\| \to 0 \quad \text{as } |h|_X \to 0, \qquad (8.3.8)$$

since $H' \in \mathcal{L}(X, Y)$ is continuous at $x$. By Newton differentiability of $\phi$ at $H(x)$ and (8.3.7) it follows that

$$\lim_{|h|_X\to 0} \frac{1}{|h|_X}\,\bigg| \phi(H(x + h)) - \phi(H(x)) - G(H(x + h)) \int_0^1 H'(x + \theta h)\,h\, d\theta \bigg|_Z = 0.$$

From (8.3.8) we deduce that

$$\lim_{|h|_X\to 0} \frac{1}{|h|_X}\,\big| \phi(H(x + h)) - \phi(H(x)) - G(H(x + h))\,H'(x + h)\,h \big|_Z = 0.$$

This implies Newton differentiability of $F = \phi(H)$ at $x$.

Theorem 8.16. Suppose that $x^*$ is a solution to $F(x) = 0$ and that $F$ is Newton differentiable at $x^*$ with N-derivative $G$. If $G(x)$ is nonsingular for all $x \in N(x^*)$ and $\{\|G(x)^{-1}\| : x \in N(x^*)\}$ is bounded, then the Newton iteration

$$x^{k+1} = x^k - G(x^k)^{-1} F(x^k)$$

converges superlinearly to $x^*$, provided that $|x^0 - x^*|$ is sufficiently small.

Proof. Note that the Newton iterates satisfy

$$|x^{k+1} - x^*| \le \|G(x^k)^{-1}\|\,|F(x^k) - F(x^*) - G(x^k)(x^k - x^*)| \qquad (8.3.9)$$

if $x^k \in N(x^*)$. Let $B(x^*, r)$ denote a ball of radius $r$ centered at $x^*$ contained in $N(x^*)$, and let $M$ be such that $\|G(x)^{-1}\| \le M$ for all $x \in B(x^*, r)$. We apply (A) with $x = x^*$. Let $\eta \in (0, 1]$ be arbitrary. Then there exists $\rho \in (0, r)$ such that

$$|F(x^* + h) - F(x^*) - G(x^* + h)\,h| < \frac{\eta}{M}\,|h| \le \frac{1}{M}\,|h| \qquad (8.3.10)$$

for all $|h| < \rho$. Consequently, if we choose $x^0$ such that $|x^0 - x^*| < \rho$, then by induction from (8.3.9), (8.3.10) with $h = x^k - x^*$ we have $|x^{k+1} - x^*| < \rho$, and in particular $x^{k+1} \in B(x^*, \rho)$. It follows that the iterates are well defined. Moreover, since $\eta \in (0, 1]$ was chosen arbitrarily, $x^k \to x^*$ superlinearly.

The following theorem provides conditions which guarantee, in an appropriate sense, global convergence of the Newton iteration [QiSu].

Theorem 8.17. Suppose that $F$ is continuous and directionally differentiable on the closed ball $S = \bar{B}(x^0, r)$. Assume also the existence of bounded operators $G(\cdot) \in \mathcal{L}(X, Z)$ and constants $\beta, \gamma, \delta \in \mathbb{R}^+$ such that

$$\|G(x)^{-1}\| \le \beta, \qquad |G(x)(y - x) - F'(x; y - x)| \le \gamma\,|y - x|,$$
$$|F(y) - F(x) - F'(x; y - x)| \le \delta\,|y - x|$$

for all $x, y \in S$, where $\alpha = \beta(\gamma + \delta) < 1$, and

$$\beta\,|F(x^0)| \le r(1 - \alpha).$$

Then the iterates defined by

$$x^{k+1} = x^k - G(x^k)^{-1} F(x^k), \qquad k = 0, 1, \dots,$$

remain in $S$ and converge to the unique solution $x^*$ of $F(x) = 0$ in $S$. Moreover, we have the error estimate

$$|x^k - x^*| \le \frac{\alpha}{1 - \alpha}\,|x^k - x^{k-1}|.$$

Proof. First note that

$$|x^1 - x^0| \le |G(x^0)^{-1} F(x^0)| \le \beta\,|F(x^0)| \le r(1 - \alpha).$$

Thus $x^1 \in S$. Suppose that $x^1, \dots, x^k \in S$. Then

$$|x^{k+1} - x^k| \le |G(x^k)^{-1} F(x^k)| \le \beta\,|F(x^k)| \le \beta\,|F(x^k) - F(x^{k-1}) - F'(x^{k-1}; x^k - x^{k-1})| + \beta\,|G(x^{k-1})(x^k - x^{k-1}) - F'(x^{k-1}; x^k - x^{k-1})| \le \beta(\delta + \gamma)\,|x^k - x^{k-1}| = \alpha\,|x^k - x^{k-1}| \le \alpha^k\,|x^1 - x^0| \le r(1 - \alpha)\,\alpha^k. \qquad (8.3.11)$$


Since

$$|x^{k+1} - x^0| \le \sum_{j=0}^{k} |x^{j+1} - x^j| \le \sum_{j=0}^{k} r\,\alpha^j (1 - \alpha) \le r,$$

we have $x^{k+1} \in S$, and by induction $x^k \in S$ for all $k$. For each $m > n$

$$|x^m - x^n| \le \sum_{j=n}^{m-1} |x^{j+1} - x^j| \le \sum_{j=n}^{m-1} r\,\alpha^j (1 - \alpha) \le r\,\alpha^n.$$

Hence $\{x^k\}$ is a Cauchy sequence in $S$ and there exists $\lim_{k\to\infty} x^k = x^* \in S$. By continuity of $F$ and (8.3.11)

$$|F(x^*)| = \lim_k |F(x^k)| \le \lim_k \alpha\beta^{-1}\,|x^k - x^{k-1}| = 0,$$

i.e., $F(x^*) = 0$. For $y^* \in S$ satisfying $F(y^*) = 0$ we find

$$|y^* - x^*| \le \beta\,|G(x^*)(y^* - x^*)| \le \beta\,|F(y^*) - F(x^*) - F'(x^*; y^* - x^*)| + \beta\,|G(x^*)(y^* - x^*) - F'(x^*; y^* - x^*)| \le \alpha\,|y^* - x^*|.$$

This implies $y^* = x^*$, and hence $x^*$ is the unique solution to $F(x) = 0$ in $S$. Finally,

$$|x^m - x^k| \le \sum_{j=k}^{m-1} |x^{j+1} - x^j| \le \sum_{j=1}^{m-k} \alpha^j\,|x^k - x^{k-1}| \le \frac{\alpha}{1 - \alpha}\,|x^k - x^{k-1}|$$

implies the asserted error estimate by letting $m \to \infty$.

8.4 The primal-dual active set method as a semismooth Newton method

Let us consider the complementarity problem in the unknowns $(x, \mu) \in L^2(\Omega) \times L^2(\Omega)$

$$A x + \mu = a, \qquad \mu = \max(0, \mu + c(x - \psi)), \qquad (8.4.1)$$

where $A \in \mathcal{L}(L^2(\Omega))$, with $\Omega$ a bounded domain in $\mathbb{R}^n$, $c > 0$, and $a \in L^p(\Omega)$, $\psi \in L^p(\Omega)$ for some $p > 2$. Recall that the second equation in (8.4.1) is equivalent to

$$x \le \psi, \quad \mu \ge 0, \quad (\mu, x - \psi)_{L^2(\Omega)} = 0, \qquad (8.4.2)$$

where the inequalities and the max operation are understood in the pointwise a.e. sense. In Example 7.1 of Chapter 7 it was shown that such problems arise in the context of constrained optimal control problems with $A$ of the form


$$A = \alpha I + B^*(-\Delta)^{-2} B, \qquad (8.4.3)$$

where $\alpha > 0$, $B$ is a bounded operator, and $\Delta$ denotes the Laplacian with homogeneous Dirichlet boundary conditions. This suggests and justifies our assumption that

$$A = \alpha I + C, \quad \text{where } C \in \mathcal{L}(L^2(\Omega), L^p(\Omega)) \text{ with } p > 2. \qquad (8.4.4)$$

We shall show that the primal-dual active set method discussed in Chapter 7 is equivalent to the semismooth Newton method applied to

$$0 = F(x, \mu) = \begin{cases} A x - a + \mu, \\ \mu - \max(0, \mu + c(x - \psi)). \end{cases} \qquad (8.4.5)$$

For this purpose we define $G : L^p(\Omega) \to L^2(\Omega)$ by

$$G(x)(s) = \begin{cases} 0 & \text{if } x(s) \le 0, \\ 1 & \text{if } x(s) > 0, \end{cases} \qquad (8.4.6)$$

and we recall that the max function is Newton differentiable from $L^p(\Omega)$ to $L^2(\Omega)$ if $p > 2$, with $G$ as an N-derivative. A Newton step applied to the second equation in (8.4.5) results in

$$c(x^{k+1} - x^k) + c(x^k - \psi) = 0 \ \text{in } \mathcal{A}_k = \{s : \mu^k(s) + c(x^k(s) - \psi(s)) > 0\},$$
$$(\mu^{k+1} - \mu^k) + \mu^k = 0 \ \text{in } \mathcal{I}_k = \{s : \mu^k(s) + c(x^k(s) - \psi(s)) \le 0\}.$$

Hence a Newton step for (8.4.5) is given by

$$\begin{cases} A x^{k+1} - a + \mu^{k+1} = 0, \\ x^{k+1} = \psi \ \text{in } \mathcal{A}_k, \\ \mu^{k+1} = 0 \ \text{in } \mathcal{I}_k. \end{cases} \qquad (8.4.7)$$

This coincides with the primal-dual active set strategy of Section 7.1. To analyze its local convergence properties, note that under assumption (8.4.4) equation (8.4.5) with $c = \alpha$ is equivalent to the reduced problem

$$A x - a + \max(0, -C x + a - \alpha\psi) = 0. \qquad (8.4.8)$$

Applying a semismooth Newton step to (8.4.8) results in

$$A(x^{k+1} - x^k) - G(-C x^k + a - \alpha\psi)\,C(x^{k+1} - x^k) + A x^k - a + \max(0, -C x^k + a - \alpha\psi) = 0. \qquad (8.4.9)$$

Setting $\mu^{k+1} = a - A x^{k+1}$, the iterates of (8.4.9) coincide with those of (8.4.7), provided that the initialization for the reduced iteration (8.4.9) is chosen such that $\mu^0 = a - A x^0$. This follows from the fact that $\mu^k + \alpha(x^k - \psi) = -C x^k + a - \alpha\psi$.
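For a finite-dimensional impression of iteration (8.4.7), the following sketch (our own discretization; the matrix $A$, the data $a$, $\psi$, and the helper name `pdas` are assumptions of this illustration) fixes $x$ to $\psi$ on the active set and sets $\mu = 0$ on the inactive set in each step:

```python
import numpy as np

# A sketch of the primal-dual active set iteration (8.4.7) for a
# discretized problem: A x + mu = a, mu = max(0, mu + c (x - psi)).

def pdas(A, a, psi, c=1.0, maxit=50):
    n = len(a)
    x, mu = np.zeros(n), np.zeros(n)
    for _ in range(maxit):
        active = mu + c * (x - psi) > 0
        inact = ~active
        # Solve A x - a + mu = 0 with x = psi on active, mu = 0 on inactive.
        x_new, mu_new = np.zeros(n), np.zeros(n)
        x_new[active] = psi[active]
        rhs = a[inact] - A[np.ix_(inact, active)] @ psi[active]
        x_new[inact] = np.linalg.solve(A[np.ix_(inact, inact)], rhs)
        mu_new[active] = (a - A @ x_new)[active]
        x, mu = x_new, mu_new
        if np.array_equal(active, mu + c * (x - psi) > 0):
            break                      # active set settled: solution found
    return x, mu

rng = np.random.default_rng(2)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)            # positive definite test operator
a, psi = rng.standard_normal(n), 0.1 * np.ones(n)
x, mu = pdas(A, a, psi)
print(np.max(x - psi), np.min(mu))     # feasibility and sign of multiplier
```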


For any partition $\Omega = \mathcal{A} \cup \mathcal{I}$ into measurable sets $\mathcal{A}$ and $\mathcal{I}$, let $R_\mathcal{I} : L^2(\Omega) \to L^2(\mathcal{I})$ denote the canonical restriction operator and $R_\mathcal{I}^* : L^2(\mathcal{I}) \to L^2(\Omega)$ its adjoint. Further set

$$A_\mathcal{I} = R_\mathcal{I}\, A\, R_\mathcal{I}^*.$$

Proposition 8.18. Assume that (8.4.1) with $\psi$ and $a$ in $L^p(\Omega)$, $p > 2$, admits a solution $(x^*, \mu^*) \in L^2(\Omega) \times L^2(\Omega)$ and that (8.4.4) holds. If moreover

$$\{A_\mathcal{I}^{-1} : \Omega = \mathcal{I} \cup \mathcal{A}\} \ \text{is uniformly bounded in } \mathcal{L}(L^2(\mathcal{I})), \qquad (8.4.10)$$

then the iterates $(x^k, \mu^k)$ defined by (8.4.7) converge superlinearly in $L^2(\Omega) \times L^2(\Omega)$ to $(x^*, \mu^*)$, provided that $|x^* - x^0|$ is sufficiently small and $\mu^0 = a - A x^0$. Moreover $(x^*, \mu^*) \in L^p(\Omega) \times L^p(\Omega)$.

Proof. First note that if (8.4.1) admits a solution $(x^*, \mu^*)$ for some $c > 0$, then it admits the same solution with $c = \alpha$, and $x^*$ is also a solution of (8.4.8). By (8.4.4) this implies that $x^* \in L^p(\Omega)$, and the first equation in (8.4.1) implies that $\mu^* \in L^p(\Omega)$. By Lemma 8.15 and Example 8.14 the mapping $x \to Ax - a + \max(0, -Cx + a - \alpha\psi)$ is Newton differentiable from $L^2(\Omega)$ into itself, with N-derivative $A - G(-Cx + a - \alpha\psi)\,C$. By (8.4.10) the family of inverses of these N-derivatives is uniformly bounded in $\mathcal{L}(L^2(\Omega))$, and hence by Theorem 8.16 the iterates $x^k$ converge superlinearly to $x^*$, provided that $|x^0 - x^*|$ is sufficiently small. Superlinear convergence of $\mu^k$ to $\mu^*$ follows from (8.4.5) and (8.4.7).

Note that (8.4.10) is satisfied for operators of the form given in (8.4.3).

The characterization of inequalities by means of max and min operations in complementarity systems such as (8.4.2) is one of several possibilities. Another frequently used complementarity function is the Fischer–Burmeister function $\sqrt{(\psi - x)^2 + \mu^2} - (\psi - x) - \mu$. Numerical experiments appear to indicate that max-based complementarity functions can be more efficient numerically than the Fischer–Burmeister function; see, e.g., [Kan].

As shown in Section 7.5, a class of optimization problems with bilateral constraints $\varphi \le x \le \psi$ can be expressed as

$$A x - a + \mu = 0, \qquad \mu = \max(0, \mu + c(x - \psi)) + \min(0, \mu + c(x - \varphi)), \qquad (8.4.11)$$

or, equivalently, as

$$A x - a + \max(0, -C x + a - \alpha\psi) + \min(0, -C x + a - \alpha\varphi) = 0, \qquad (8.4.12)$$

where $c = \alpha$ and $\varphi < \psi$ with $a$, $\varphi$, and $\psi$ in $L^p(\Omega)$.

The primal-dual active set strategy applied to (8.4.11) can be expressed as

$$\begin{cases} A x^{k+1} - a + \mu^{k+1} = 0, \\ x^{k+1} = \psi \ \text{in } \mathcal{A}_k^+ = \{s : \mu^k(s) + c(x^k(s) - \psi(s)) > 0\}, \\ \mu^{k+1} = 0 \ \text{in } \mathcal{I}_k = \{s : \mu^k(s) + c(x^k(s) - \psi(s)) \le 0 \le \mu^k(s) + c(x^k(s) - \varphi(s))\}, \\ x^{k+1} = \varphi \ \text{in } \mathcal{A}_k^- = \{s : \mu^k(s) + c(x^k(s) - \varphi(s)) < 0\}. \end{cases} \qquad (8.4.13)$$


In the bilateral case the value of $c$ influences the likelihood of switches from being active-above to active-below in one iteration. If, for example, $s \in \mathcal{A}_k^+$, then $\mu^{k+1}(s) + c(x^{k+1}(s) - \varphi(s)) = \mu^{k+1}(s) + c(\psi(s) - \varphi(s))$ is more likely to be negative, and hence $s \in \mathcal{A}_{k+1}^-$, if $c$ is large.

For $c = \alpha$, iteration (8.4.13) is equivalent to applying the semismooth Newton method to (8.4.12) with $G$ as N-derivative for the max function and

$$G(x)(s) = \begin{cases} 0 & \text{if } x(s) \ge 0, \\ 1 & \text{if } x(s) < 0 \end{cases}$$

as N-derivative for the min function. Under the assumptions of Proposition 8.18, local superlinear convergence of the iteration (8.4.13) can be shown.
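The bilateral update (8.4.13) can be sketched along the same lines as the unilateral `pdas` above; again the data and helper name are our own illustrative assumptions:

```python
import numpy as np

# A sketch of the bilateral primal-dual active set iteration (8.4.13):
# x is fixed to psi on A+, to phi on A-, and mu = 0 on I.

def pdas_bilateral(A, a, phi, psi, c=1.0, maxit=50):
    n = len(a)
    x, mu = 0.5 * (phi + psi), np.zeros(n)
    for _ in range(maxit):
        up = mu + c * (x - psi) > 0          # A_k^+
        lo = mu + c * (x - phi) < 0          # A_k^-
        inact = ~(up | lo)
        x_new = np.where(up, psi, np.where(lo, phi, 0.0))
        rhs = a[inact] - A[np.ix_(inact, ~inact)] @ x_new[~inact]
        x_new[inact] = np.linalg.solve(A[np.ix_(inact, inact)], rhs)
        mu_new = np.where(inact, 0.0, a - A @ x_new)
        if np.array_equal(up, mu_new + c * (x_new - psi) > 0) and \
           np.array_equal(lo, mu_new + c * (x_new - phi) < 0):
            return x_new, mu_new             # both active sets settled
        x, mu = x_new, mu_new
    return x, mu
```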

8.5 Semismooth Newton methods for a class of nonlinear complementarity problems

In this section we consider nonlinear complementarity problems in the Hilbert space $X = L^2(\Omega)$. Let $g$ be a continuous mapping from $L^2(\Omega)$ into itself, with $\Omega$ a bounded domain in $\mathbb{R}^n$, and consider

$$g(x) + \mu = 0, \quad \mu \ge 0, \quad x \le \psi, \quad \text{and} \quad (\mu, x - \psi) = 0, \qquad (8.5.1)$$

where $\psi \in L^2(\Omega)$. As discussed in Chapter 7, (8.5.1) can be expressed equivalently as

$$0 = F(x, \mu) = \begin{cases} g(x) + \mu, \\ \mu - \max(0, \mu + c(x - \psi)), \end{cases} \qquad (8.5.2)$$

for any $c > 0$, where max denotes the pointwise max operation.

If $\Phi$ is a continuously differentiable functional on $X$, then (8.5.1) with $g = \Phi'$ is the necessary optimality condition for

$$\min_{x\in L^2(\Omega)} \Phi(x) \quad \text{subject to} \quad x \le \psi. \qquad (8.5.3)$$

We shall employ semismooth Newton methods for solving $F(x, \mu) = 0$. Let $G(x)$ be as defined in Section 8.4.

Applying a primal-dual active set strategy to the second equation in (8.5.2) results in

$$\begin{cases} g(x^{k+1}) + \mu^{k+1} = 0, \\ x^{k+1} = \psi \ \text{in } \mathcal{A}_k = \{s : \mu^k(s) + c(x^k(s) - \psi(s)) > 0\}, \\ \mu^{k+1} = 0 \ \text{in } \mathcal{I}_k = \{s : \mu^k(s) + c(x^k(s) - \psi(s)) \le 0\} \end{cases} \qquad (8.5.4)$$

for $k = 0, 1, \dots$. To investigate the convergence properties of (8.5.4) we note that (8.5.2) is equivalent to

$$g(x) + \max(0, -g(x) + c(x - \psi)) = 0, \qquad (8.5.5)$$


and the iteration (8.5.4) is equivalent to the reduced iteration

$$g(x^{k+1}) + G\big({-g(x^k)} + c(x^k - \psi)\big)\Big[\big({-g(x^{k+1})} + c(x^{k+1} - \psi)\big) - \big({-g(x^k)} + c(x^k - \psi)\big)\Big] + \max\big(0, -g(x^k) + c(x^k - \psi)\big) = 0, \qquad (8.5.6)$$

provided that the initialization for (8.5.4) is chosen such that $\mu^0 = -g(x^0)$. Note that (8.5.6) can be considered as a partial semismooth Newton iteration. Separating the generalized differentiation of the max operation from the linearization of the nonlinearity $g$ was investigated numerically in [dReKu, GrVo].

Proposition 8.19. Assume that (8.5.2) admits a solution $x^*$, that $x \to g(x) - c(x - \psi)$ is Lipschitz continuous from $L^2(\Omega)$ to $L^p(\Omega)$ in a neighborhood $U$ of $x^*$ for some $c > 0$ and $p > 2$, and that there exists $\alpha > 0$ such that

$$\alpha\,|x - y|^2_{L^2(\mathcal{I})} \le \int_\mathcal{I} \big(g(x) - g(y)\big)(s)\,\big(x - y\big)(s)\, ds \qquad (8.5.7)$$

for all partitions $\Omega = \mathcal{A} \cup \mathcal{I}$ and $x, y \in L^2(\Omega)$. Then the iterates $(x^k, \mu^k)$ defined by (8.5.4) converge superlinearly to $(x^*, \mu^*)$, provided that $|x^* - x^0|$ is sufficiently small and $\mu^0 = -g(x^0)$.

Proof. From (8.5.5) and (8.5.6) we have

$$g(x^{k+1}) - g(x^*) + G(z^k)\big({-g(x^{k+1})} + c(x^{k+1} - \psi) + g(x^*) - c(x^* - \psi)\big) = G(z^k)\big({-g(x^k)} + c(x^k - \psi) + g(x^*) - c(x^* - \psi)\big) + \max\big(0, -g(x^*) + c(x^* - \psi)\big) - \max\big(0, -g(x^k) + c(x^k - \psi)\big),$$

where $z^k = -g(x^k) + c(x^k - \psi)$. Taking the inner product in $L^2(\Omega)$ with $x^{k+1} - x^*$, using (8.5.7) to estimate the left-hand side from below, and using Example 8.14 and Lipschitz continuity of $x \to g(x) - c(x - \psi)$ from $L^2(\Omega)$ to $L^p(\Omega)$ to bound the right-hand side from above, we find

$$\sqrt{\min(c, \alpha)}\,|x^{k+1} - x^*|_{L^2(\Omega)} \le o\big(|x^k - x^*|_{L^2(\Omega)}\big).$$

Finally, (8.5.2) and (8.5.4) imply that

$$\mu^{k+1} - \mu^* = g(x^*) - g(x^{k+1}),$$

and hence superlinear convergence of $\mu^k$ to $\mu^*$ follows from superlinear convergence of $x^k$ to $x^*$.

A full semismooth Newton step applied to (8.5.5) is given by

$$g'(x^k)(x^{k+1} - x^k) + G\big({-g(x^k)} + c(x^k - \psi)\big)\big({-g'(x^k)(x^{k+1} - x^k)} + c(x^{k+1} - x^k)\big) + g(x^k) + \max\big(0, -g(x^k) + c(x^k - \psi)\big) = 0. \qquad (8.5.8)$$


It can be equivalently expressed as

$$\begin{cases} g'(x^k)(x^{k+1} - x^k) + g(x^k) + \mu^{k+1} = 0, \\ x^{k+1} = \psi \ \text{in } \mathcal{A}_k = \{s : -g(x^k)(s) + c(x^k(s) - \psi(s)) > 0\}, \\ \mu^{k+1} = 0 \ \text{in } \mathcal{I}_k = \{s : -g(x^k)(s) + c(x^k(s) - \psi(s)) \le 0\}. \end{cases} \qquad (8.5.9)$$

Remark 8.5.1. Note that, differently from the linear case considered in Section 8.4, if we apply a full semismooth Newton step to (8.5.2) rather than to the reduced equation (8.5.5), then the resulting algorithm differs in the update of the active/inactive sets. Applying a semismooth Newton step to (8.5.2) results in $\mathcal{A}_k = \{s : \mu^k(s) + c(x^k(s) - \psi(s)) > 0\} = \{s : -g(x^{k-1})(s) - g'(x^{k-1})(x^k - x^{k-1})(s) + c(x^k(s) - \psi(s)) > 0\}$.
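In a discretized setting, one step of (8.5.9) amounts to a linear solve on the inactive set. The following sketch is our own; the callables `g`, `gprime`, the data `psi`, and the helper name `newton_step_859` are assumptions of this illustration.

```python
import numpy as np

# A sketch of the full semismooth Newton step (8.5.9) for the nonlinear
# complementarity problem g(x) + mu = 0, mu = max(0, mu + c (x - psi)).

def newton_step_859(x, g, gprime, psi, c=1.0):
    gx, J = g(x), gprime(x)
    n = len(x)
    active = -gx + c * (x - psi) > 0
    inact = ~active
    x_new = np.zeros(n)
    x_new[active] = psi[active]
    # On I_k: mu^{k+1} = 0, i.e. [J (x_new - x) + g(x)]_I = 0.
    rhs = -(gx[inact] + J[np.ix_(inact, active)] @ (x_new - x)[active]) \
          + J[np.ix_(inact, inact)] @ x[inact]
    x_new[inact] = np.linalg.solve(J[np.ix_(inact, inact)], rhs)
    mu_new = np.where(inact, 0.0, -(gx + J @ (x_new - x)))
    return x_new, mu_new
```

Iterating `x, mu = newton_step_859(x, g, gprime, psi)` until the active set no longer changes realizes (8.5.9).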

To investigate local convergence of (8.5.8) we denote, for any partition $\Omega = \mathcal{A} \cup \mathcal{I}$ into measurable sets $\mathcal{A}$ and $\mathcal{I}$, by $R_\mathcal{I} : L^2(\Omega) \to L^2(\mathcal{I})$ the canonical restriction operator and by $R_\mathcal{I}^* : L^2(\mathcal{I}) \to L^2(\Omega)$ its adjoint. Further we set

$$g'(x)_\mathcal{I} = R_\mathcal{I}\, g'(x)\, R_\mathcal{I}^*.$$

Proposition 8.20. Assume that (8.5.2) admits a solution $x^*$, that $x \to g(x) - c(x - \psi)$ is a $C^1$ function from $L^2(\Omega)$ to $L^p(\Omega)$ in a neighborhood $U$ of $x^*$ for some $c > 0$ and $p > 2$, and that

$$\{g'(x)_\mathcal{I}^{-1} \in \mathcal{L}(L^2(\mathcal{I})) : x \in U,\ \Omega = \mathcal{A} \cup \mathcal{I}\} \ \text{is uniformly bounded}.$$

Then the iterates $x^k$ defined by (8.5.8) converge superlinearly to $x^*$, provided that $|x^* - x^0|$ is sufficiently small.

Proof. By Lemma 8.15 and Example 8.14 the mapping $x \to \max(0, -g(x) + c(x - \psi))$ is Newton differentiable from $L^2(\Omega)$ into itself, and $G(-g(x) + c(x - \psi))(-g'(x) + cI)$ is an N-derivative. Moreover $g'(x) + G(-g(x) + c(x - \psi))(-g'(x) + cI)$ is invertible in $\mathcal{L}(L^2(\Omega))$ with uniformly bounded inverses for $x \in U$. Setting

$$z = -g(x) + c(x - \psi), \quad \mathcal{A} = \{z > 0\}, \quad \mathcal{I} = \Omega \setminus \mathcal{A}, \quad h_\mathcal{I} = \chi_\mathcal{I} h, \quad h_\mathcal{A} = \chi_\mathcal{A} h,$$

this follows from the fact that for given $f \in L^2(\Omega)$ the solution to the equation

$$g'(x)\,h + G(z)\big({-g'(x)\,h} + c h\big) = f$$

is given by

$$c\,h_\mathcal{A} = f_\mathcal{A} \quad \text{and} \quad h_\mathcal{I} = g'_\mathcal{I}(x)^{-1}\Big( f_\mathcal{I} - \frac{1}{c}\,\chi_\mathcal{I}\, g'(x)\,f_\mathcal{A} \Big).$$

From Theorem 8.16 we conclude that $x^k \to x^*$ superlinearly, provided that $|x^* - x^0|$ is sufficiently small.


8.6 Semismooth Newton methods and regularization

In this section we discuss the case where the operator $A$ of Section 8.4 is not a continuous but only a closed operator in $X = L^2(\Omega)$. Let $V$ denote a real Hilbert space that is densely and continuously embedded into $X$, and let $V^*$ denote its dual. For $\psi \in V$ let $C = \{y \in V : y \le \psi\}$.

We identify $X^*$ with $X$, and thus we have the Gel'fand triple framework $V \subset X = X^* \subset V^*$. Let $a : V \times V \to \mathbb{R}$ be a bounded, coercive bilinear form, i.e., for some $M, \nu > 0$

$$|a(y, v)| \le M\,|y|_V\,|v|_V \ \text{for all } y, v \in V, \qquad a(v, v) \ge \nu\,|v|^2_V \ \text{for all } v \in V.$$

For given $f \in V^*$ consider the variational inequality for $y \in C$:

$$a(y, v - y) - \langle f, v - y \rangle \ge 0 \quad \text{for all } v \in C. \qquad (8.6.1)$$

The existence of solutions to (8.6.1) is established in Theorem 8.26. The solution to (8.6.1) is unique. In fact, if $y$ and $\tilde{y}$ in $C$ are solutions to (8.6.1), we have

$$a(y, \tilde{y} - y) - \langle f, \tilde{y} - y \rangle \ge 0, \qquad a(\tilde{y}, y - \tilde{y}) - \langle f, y - \tilde{y} \rangle \ge 0.$$

Summing these inequalities, we obtain $a(y - \tilde{y}, y - \tilde{y}) \le 0$ and thus $y = \tilde{y}$.

If $a$ is symmetric, then (8.6.1) is the necessary and sufficient optimality condition for the constrained minimization problem in $V$

$$\min\ \frac{1}{2}\,a(y, y) - \langle f, y \rangle \quad \text{over } y \in C. \qquad (8.6.2)$$

Let us define $A \in \mathcal{L}(V, V^*)$ by

$$\langle A y, v \rangle_{V^*\times V} = a(y, v) \quad \text{for } y, v \in V.$$

We note that (8.6.2) is equivalent to

$$A y + \mu = f, \quad y \le \psi, \quad \mu \in C^+, \quad \langle \mu, y - \psi \rangle_{V^*, V} = 0, \qquad (8.6.3)$$

where the first equality holds in $V^*$ and $C^+ = \{\mu \in V^* : \langle \mu, v \rangle_{V^*, V} \le 0 \ \text{for all } v \le 0\}$. If $\mu$ (or equivalently $y$) has extra regularity in the sense that $\mu \in L^2(\Omega)$, then (8.6.3) is equivalent to

$$\begin{cases} A y + \mu = f, \\ \mu = \max(0, \mu + c(y - \psi)) \end{cases} \qquad (8.6.4)$$

for each $c > 0$.


Example 8.21. For the obstacle problem in its simplest form we choose $V = H_0^1(\Omega)$ and $a(v, w) = \int_\Omega \nabla v \cdot \nabla w\, dx$. In general, the unique solution $(y, \mu)$ is in $L^2(\Omega) \times H^{-1}(\Omega)$. If $\partial\Omega$ and $\psi$ are sufficiently regular, then $(y, \mu) \in (H_0^1(\Omega) \cap H^2(\Omega)) \times L^2(\Omega)$ and (8.6.3) is equivalent to (8.6.4).

Example 8.22. Consider the state-constrained optimal control problem with $\beta > 0$ and $\bar{y} \in L^2(\Omega)$

$$\min_{u\in L^2(\Omega)}\ \frac{1}{2}\,|y - \bar{y}|^2_{L^2(\Omega)} + \frac{\beta}{2}\,|u|^2_{L^2(\Omega)} \quad \text{subject to} \quad E y = u \ \text{and} \ y \in C, \qquad (8.6.5)$$

where $E$ is a closed linear operator in $L^2(\Omega)$. We assume that $E^{-1}$ exists and set $V = \operatorname{dom}(E)$, where $\operatorname{dom}(E)$ is endowed with the graph norm of $E$. For $E = -\Delta$ with homogeneous Dirichlet boundary conditions we have $\operatorname{dom} E = H_0^1(\Omega) \cap H^2(\Omega)$. The necessary and sufficient optimality condition for (8.6.5) is given by

$$E y = u, \quad y \in C, \quad \beta u = p, \quad \langle E^* p + y - \bar{y}, v - y \rangle_{V^*\times V} \ge 0 \ \text{for all } v \in C, \qquad (8.6.6)$$

where $E^* : X \to V^*$ denotes the conjugate of $E$ as an operator from $V$ to $X$. This optimality system can be written as (8.6.1) with $f = \bar{y}$ and

$$a(v, w) = \beta\,(E v, E w)_X + (v, w) \quad \text{for } v, w \in V.$$

It can also be written in the form (8.6.3) with $\mu \in V^*$, but differently from Example 8.21, $\mu \notin L^2(\Omega)$ in general; see [BeK3].

In the case of mixed constraints, i.e., when pointwise constraints on the controls and the state are present, it is natural to treat the control constraints with the active set methods presented in Chapter 7 and to relax the state constraints by the technique explained in this section.

Thus the obstacle problem and the state-constrained optimal control problem differ in that for the former, under weak regularity conditions on the problem data, the Lagrange multiplier is in $L^2(\Omega)$ and complementarity can be expressed in the form (8.6.4), whereas for the latter this is not the case. Before we can address the applicability of the primal-dual active set strategy or semismooth Newton methods to such problems we also need to consider the candidates for the iterates. In the case of the obstacle problem these are given formally by

$$\begin{cases} a(y^{k+1}, v) + (\mu^{k+1}, v)_{L^2(\Omega)} = (f, v)_{L^2(\Omega)} \ \text{for all } v \in H_0^1(\Omega), \\ y^{k+1} = \psi \ \text{in } \mathcal{A}_k = \{x : \mu^k(x) + c(y^k(x) - \psi(x)) > 0\}, \\ \mu^{k+1} = 0 \ \text{in } \mathcal{I}_k = \{x : \mu^k(x) + c(y^k(x) - \psi(x)) \le 0\}. \end{cases} \qquad (8.6.7)$$

This is the optimality condition for

$$\min\ \frac{1}{2} \int_\Omega |\nabla y|^2\, dx - (f, y)_{L^2(\Omega)} \quad \text{over } y \in H_0^1(\Omega) \ \text{with } y = \psi \ \text{on } \mathcal{A}_k,$$


for which the Lagrange multiplier $\mu^{k+1}$ is not in $L^2(\Omega)$, and hence an update of $\mathcal{A}_k$ as in (8.6.7) is not feasible. Alternatively, considering the possibility of applying a semismooth Newton approach, we observe that the reduced form of (8.6.4) is given by

$$\mu = \max\big(0, \mu + c(A^{-1}(f - \mu) - \psi)\big),$$

so that no smoothing of $\mu$ under the max operation occurs.

In order to remedy these difficulties related to the regularity properties of the Lagrange multiplier, we consider a one-parameter family of regularized problems based on smoothing of the complementarity condition, given by

$$\mu = \alpha \max\big(0, \mu + c(y - \psi)\big), \quad \text{with } 0 < \alpha < 1. \qquad (8.6.8)$$

This is a relaxation of the second equation in (8.6.4) with $\alpha$ as a continuation parameter. Note that an update for $\mu$ based on (8.6.8) results in $\mu \in L^2(\Omega)$. Equation (8.6.8) is equivalent to

$$\mu = \max\Big(0, \frac{c\alpha}{1 - \alpha}\,(y - \psi)\Big), \qquad (8.6.9)$$

with $c\alpha/(1 - \alpha)$ ranging over $(0, \infty)$ for $\alpha \in (0, 1)$. We shall use a generalization of (8.6.8) and introduce an additional shift parameter $\bar{\mu} \in L^2(\Omega)$ into (8.6.9). Moreover we replace $c\alpha/(1 - \alpha)$ by $c$ and arrive at

$$\mu = \max\big(0, \bar{\mu} + c(y - \psi)\big), \quad \text{with } c \in (0, \infty). \qquad (8.6.10)$$

This is exactly the same as the generalized Yosida–Moreau approximation for inequality constraints discussed in Section 4.7 and is related to augmented Lagrangian methods as described in Section 4.6. Utilizing this regularization in (8.6.4) results in

$$\begin{cases} A y + \mu = f, \\ \mu = \max\big(0, \bar{\mu} + c(y - \psi)\big). \end{cases} \qquad (8.6.11)$$

Note that for each $c > 0$

$$y \to \max\big(0, \bar{\mu} + c(y - \psi)\big)$$

is Lipschitz continuous and monotone from $X$ to $X$. Thus the existence of a unique $y_c \in V$ satisfying

$$A y + \max\big(0, \bar{\mu} + c(y - \psi)\big) = f$$

follows by monotone operator techniques; see, e.g., the discussion above Theorem 4.43. This implies the existence of a unique solution $(y_c, \mu_c) \in V \times L^2(\Omega)$ to (8.6.11). Convergence as $c \to \infty$ will be analyzed at the end of this section.

The semismooth Newton iteration for (8.6.11) is discussed next.

Semismooth Newton Algorithm with Regularization.

(i) Choose $\bar{\mu}$, $c$, $y^0$, and set $k = 0$.

(ii) Set $\mathcal{A}_k = \{x : (\bar{\mu} + c(y^k - \psi))(x) > 0\}$, $\mathcal{I}_k = \Omega \setminus \mathcal{A}_k$.

(iii) Solve for $y^{k+1} \in V$:

$$a(y^{k+1}, v) + \big(\bar{\mu} + c(y^{k+1} - \psi), \chi_{\mathcal{A}_k} v\big)_{L^2(\Omega)} = (f, v)_{L^2(\Omega)} \quad \text{for all } v \in V.$$

(iv) Set

$$\mu^{k+1} = \begin{cases} 0 & \text{on } \mathcal{I}_k, \\ \bar{\mu} + c(y^{k+1} - \psi) & \text{on } \mathcal{A}_k. \end{cases}$$

(v) Stop, or set $k = k + 1$ and go to (ii).
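For orientation, the following sketch applies the algorithm to a finite-difference discretization of a one-dimensional obstacle problem. The discretization, all data, and the parameter values are our own illustrative assumptions, not taken from the text.

```python
import numpy as np

# A sketch of the semismooth Newton algorithm with regularization for a
# 1D obstacle problem: -y'' + max(0, mu_bar + c (y - psi)) = f on (0,1),
# y(0) = y(1) = 0, discretized by finite differences.

m = 200                                   # interior grid points
h = 1.0 / (m + 1)
s = np.linspace(h, 1 - h, m)
L = (np.diag(2 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
     - np.diag(np.ones(m - 1), -1)) / h**2          # Dirichlet Laplacian
f = 10.0 * np.ones(m)
psi = 0.05 + 0.1 * np.abs(s - 0.5)        # obstacle from above
mu_bar = np.zeros(m)
c = 1e6                                   # regularization parameter

y = np.zeros(m)
for k in range(50):
    active = mu_bar + c * (y - psi) > 0
    # Step (iii): (L + c chi_A) y = f + chi_A (c psi - mu_bar)
    M = L + c * np.diag(active.astype(float))
    y_new = np.linalg.solve(M, f + active * (c * psi - mu_bar))
    if np.array_equal(active, mu_bar + c * (y_new - psi) > 0):
        y = y_new
        break                             # active set settled
    y = y_new
mu = np.where(active, mu_bar + c * (y - psi), 0.0)
print(k, np.max(y - psi))                 # violation shrinks as c grows
```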

The iterates $\mu^k$ assigned in step (iv) are not needed by the algorithm itself, but they will be useful in the convergence analysis. The practical relevance of $\bar{\mu}$ stems from the fact that for certain problems a proper choice can guarantee feasibility of the iterates, i.e., $y^k \le \psi$, as will be shown next.

Theorem 8.23. Assume that, for $\phi \in V$, $a(\phi, \phi^+) \le 0$ implies $\phi^+ = 0$. Then if

$$\bar{\mu} \ge (f - A\psi)^+$$

in the sense that

$$(\bar{\mu}, \phi) \ge \langle f, \phi \rangle - a(\psi, \phi) \qquad (8.6.12)$$

for all $\phi \in V$ with $\phi \ge 0$, the solution to (8.6.11) is feasible, i.e., $y_c \le \psi$.

Proof. Since for $\phi = (y_c - \psi)^+$

$$\big(\max(0, \bar{\mu} + c(y_c - \psi)), \phi\big) \ge (\bar{\mu}, \phi),$$

it follows from (8.6.11) and (8.6.12) that

$$a(y_c - \psi, \phi) = \langle f, \phi \rangle - a(\psi, \phi) - \big(\max(0, \bar{\mu} + c(y_c - \psi)), \phi\big) \le \langle f, \phi \rangle - a(\psi, \phi) - (\bar{\mu}, \phi) \le 0.$$

This implies that $\phi = 0$ and thus $y_c \le \psi$.

Theorem 8.24. If a(φ, φ⁺) ≤ 0, for φ ∈ V, implies that φ⁺ = 0, then the iterates of the semismooth Newton algorithm with regularization satisfy y_{k+1} ≤ y_k for all k ≥ 1.

Proof. Let φ = (y_{k+1} − y_k)⁺ and observe that

a(y_{k+1} − y_k, φ) + a(y_k, φ) − 〈f, φ〉 + (μ̄ + c(y_k − ψ), φ χ_{A_k}) + c(y_{k+1} − y_k, φ χ_{A_k}) = 0.

We have

a(y_k, φ) − 〈f, φ〉 + (μ̄ + c(y_k − ψ), φ χ_{A_k})
  = −(μ̄ + c(y_k − ψ), φ χ_{A_{k−1}}) + (μ̄ + c(y_k − ψ), φ χ_{A_k})
  = (μ̄ + c(y_k − ψ), φ χ_{A_k ∩ I_{k−1}}) − (μ̄ + c(y_k − ψ), φ χ_{I_k ∩ A_{k−1}}) ≥ 0.


This implies that

a(y_{k+1} − y_k, φ) + c(y_{k+1} − y_k, φ χ_{A_k}) ≤ 0;

thus a(y_{k+1} − y_k, φ) ≤ 0 and hence φ = 0.

Let us stress that in a finite-dimensional or discretized setting with L²(Ω) replaced by Rⁿ the need for regularization is not apparent, since the lack of regularity, as exhibited in Examples 8.21 and 8.22 and in the discussion of the iterative step of the semismooth Newton algorithm, does not exist. If the finite-dimensional problems arise from discretization of continuous ones, however, then the regularity of the Lagrange multipliers certainly influences the convergence and approximation properties. It will typically lead to mesh-size-dependent behavior of the semismooth Newton algorithm.

We turn to convergence of the semismooth Newton algorithm with regularization. Recall that A^{-1} ∈ L(V*, V). Below we shall also denote the restriction of A^{-1} to L²(Ω) by the same symbol.

Theorem 8.25. Assume that μ̄ − cψ ∈ L^p(Ω) and A^{-1} ∈ L(L²(Ω), L^p(Ω)) for some p > 2. If μ_0 ∈ L²(Ω) and |μ_0 − μ_c|_{L²(Ω)} is sufficiently small, then (y_k, μ_k) → (y_c, μ_c) superlinearly in V × L²(Ω).

Proof. First we show superlinear convergence of μ_k to μ_c by applying Theorem 8.16 to F : L²(Ω) → L²(Ω) defined by

F(μ) = μ − max(0, μ̄ + c(A^{-1}(f − μ) − ψ)).

By Example 8.14 and Lemma 8.15 the mapping F is Newton differentiable with N-derivative given by

G(μ) = I + c G(μ̄ + c(A^{-1}(f − μ) − ψ)) A^{-1},

where the G on the right-hand side was defined in (8.4.6). The proof will be completed by showing that G(μ) has uniformly bounded inverses in L(L²(Ω)) for μ ∈ L²(Ω). We define

A = {x : (μ̄ + c(A^{-1}(f − μ) − ψ))(x) > 0}, I = Ω \ A.

Further, let R_A : L²(Ω) → L²(A) and R_I : L²(Ω) → L²(I) denote the restriction operators to A and I. Their adjoints R_A* : L²(A) → L²(Ω) and R_I* : L²(I) → L²(Ω) are the extension-by-zero operators from A and I to Ω, respectively. The mapping (R_A, R_I) : L²(Ω) → L²(A) × L²(I) determines an isometric isomorphism, and every μ ∈ L²(Ω) can uniquely be expressed as (R_A μ, R_I μ). The operator G(μ) can equivalently be expressed as

G(μ) = \begin{pmatrix} I_A & 0 \\ 0 & I_I \end{pmatrix} + c \begin{pmatrix} R_A A^{-1} R_A^* & R_A A^{-1} R_I^* \\ 0 & 0 \end{pmatrix},

where I_A and I_I denote the identity operators on L²(A) and L²(I). Let (g_A, g_I) ∈ L²(A) × L²(I) be arbitrary and consider the equation

G(μ)((δμ)_A, (δμ)_I) = (g_A, g_I). (8.6.13)


Then necessarily (δμ)_I = g_I, and (8.6.13) is equivalent to

(δμ)_A + c R_A A^{-1} R_A^* (δμ)_A = g_A − c R_A A^{-1} R_I^* g_I. (8.6.14)

The Lax–Milgram theorem and nonnegativity of A^{-1} imply the existence of a unique solution (δμ)_A to (8.6.14), and consequently (8.6.13) has a unique solution for every (g_A, g_I) and every μ. Moreover these solutions are uniformly bounded with respect to μ ∈ L²(Ω), since (δμ)_I = g_I and

|(δμ)_A|_{L²(A)} ≤ |g_A|_{L²(A)} + c ‖A^{-1}‖_{L(L²(Ω))} |g_I|_{L²(I)}.

This proves superlinear convergence μ_k → μ_c in L²(Ω). Superlinear convergence of y_k to y_c in V follows from Ay_k + μ_k = f and the fact that A : V → V* is a homeomorphism.

Convergence of the solutions (y_c, μ_c) of (8.6.11) as c → ∞ is addressed next.

Theorem 8.26. The solution (y_c, μ_c) ∈ V × X of the regularized problem (8.6.11) converges to the solution (y*, μ*) ∈ V × V* of (8.6.3) in the sense that y_c → y* strongly in V and μ_c → μ* strongly in V* as c → ∞.

Proof. Recall that y_c ∈ V satisfies

a(y_c, v) + (μ_c, v) = 〈f, v〉 for all v ∈ V,
μ_c = max(0, μ̄ + c(y_c − ψ)). (8.6.15)

Since μ_c ≥ 0, for y ∈ C

(μ_c, y_c − y) = (μ_c, y_c − ψ − (y − ψ)) ≥ (c/2)|(y_c − ψ)⁺|²_X − (1/(2c))|μ̄|²_X.

Letting v = y_c − y in (8.6.15),

a(y_c, y_c − y) + (c/2)|(y_c − ψ)⁺|²_X ≤ 〈f, y_c − y〉 + (1/(2c))|μ̄|²_X. (8.6.16)

Since a is coercive, this implies that

ν|y_c|²_V + (c/2)|(y_c − ψ)⁺|²_X is bounded uniformly in c.

Thus there exist a subsequence of y_c, denoted by the same symbol, and y* ∈ V such that y_c → y* weakly in V. For all φ ≥ 0

((y_c − ψ)⁺, φ)_X ≥ (y_c − ψ, φ)_X.

Since (y_c − ψ)⁺ → 0 in X and y_c → y* weakly in X as c → ∞, this yields

(y* − ψ, φ)_X ≤ 0 for all φ ≥ 0,


which implies y* − ψ ≤ 0 and thus y* ∈ C. Since φ ↦ √a(φ, φ) defines an equivalent norm on V and norms are weakly lower semicontinuous, letting c → ∞ in (8.6.16) we obtain

a(y*, y* − y) ≤ 〈f, y* − y〉 for all y ∈ C,

which implies that y* is the solution to (8.6.1). Now, setting y = y* in (8.6.16),

a(y_c − y*, y_c − y*) + a(y*, y_c − y*) − 〈f, y_c − y*〉 ≤ (1/(2c))|μ̄|²_X.

Since y_c → y* weakly in V, this implies that lim_{c→∞} |y_c − y*|_V = 0. Hence (y_c, μ_c) → (y*, μ*) strongly in V × V*.


Chapter 9

Semismooth Newton Methods II: Applications

In the previous chapter semismooth Newton methods in function spaces were investigated. It was demonstrated that in certain cases the semismooth Newton method is equivalent to the primal-dual active set method. The application to nonlinear complementarity problems was discussed, as was the necessity of introducing regularization in cases where the Lagrange multiplier associated to the inequality condition has low regularity. In this chapter applications of semismooth Newton methods to nondifferentiable variational problems in function spaces will be treated. They concern image restoration problems regularized by bounded variation functionals in Section 9.1 and frictional contact problems in elasticity in Section 9.2.

We shall make use of the Fenchel duality theorem, which we recall here for further reference; see, e.g., Section 4.3 and [BaPe, EkTe] for details. Let V and Y be Banach spaces with topological duals V* and Y*, respectively. Further, let Λ ∈ L(V, Y) and let F : V → R ∪ {∞}, G : Y → R ∪ {∞} be convex, proper, and l.s.c. functionals such that there exists v₀ ∈ V with F(v₀) < ∞, G(Λv₀) < ∞, and G continuous at Λv₀. Then

inf_{v∈V} {F(v) + G(Λv)} = sup_{q∈Y*} {−F*(−Λ*q) − G*(q)}, (9.0.1)

where Λ* ∈ L(Y*, V*) is the adjoint of Λ. See, e.g., Section 4.3, Theorem 4.30 and Example 4.33 with Φ(v, w) = F(v) + G(Λv + w). The convex conjugates F* : V* → R ∪ {∞} and G* : Y* → R ∪ {∞} of F and G, respectively, are defined by

F*(v*) = sup_{v∈V} {〈v, v*〉_{V,V*} − F(v)},

and analogously for G*. The conditions imposed on F and G guarantee that the dual problem, i.e., the problem on the right-hand side of (9.0.1), admits a solution. Furthermore, v̄ ∈ V and q̄ ∈ Y* are solutions to the two optimization problems in (9.0.1) if and only if the extremality conditions

−Λ*q̄ ∈ ∂F(v̄),
q̄ ∈ ∂G(Λv̄) (9.0.2)

hold, where ∂F denotes the subdifferential of F.


9.1 BV-based image restoration problems

In this section we consider the nondifferentiable optimization problem

min (1/2)∫_Ω |Ku − f|² dx + (α/2)∫_Ω |u|² dx + β ∫_Ω |Du| over u ∈ BV(Ω), (9.1.1)

where Ω is a simply connected domain in R² with Lipschitz continuous boundary ∂Ω, f ∈ L²(Ω), β > 0, α ≥ 0 are given, and K ∈ L(L²(Ω)). We assume that K*K is invertible or α > 0. Further, BV(Ω) denotes the space of functions of bounded variation. A function u is in BV(Ω) if the BV-seminorm defined by

∫_Ω |Du| = sup { ∫_Ω u div v⃗ dx : v⃗ ∈ (C₀^∞(Ω))², |v⃗(x)|_{ℓ^∞} ≤ 1 }

is finite. Here |·|_{ℓ^∞} denotes the supremum norm on R². It is well known that BV(Ω) ⊂ L²(Ω) for Ω ⊂ R² (see [Giu]) and that u ↦ |u|_{L²} + ∫_Ω |Du| defines a norm on BV(Ω).

If K = identity and α = 0, then (9.1.1) is the well-known image restoration problem with BV-regularization. It consists of recovering the true image u from the noisy image f. BV-regularization is known to be preferable to regularization by ∫_Ω |∇u|² dx, for example, due to its ability to preserve edges in the original image during the reconstruction process. Since the pioneering work in [ROF], the literature on (9.1.1) has grown tremendously. We give some selected references [AcVo, CKP, ChLi, ChK, DoSa, GeYa, IK12] and refer the reader to the monograph [Vog] for additional ones.

Despite its favorable properties for the reconstruction of images, and especially of images with blocky structure, problem (9.1.1) poses some severe difficulties. On the analytical level these are related to the fact that (9.1.1) is posed in a nonreflexive Banach space, the dual of which is difficult to characterize [Giu, IK18]; on the numerical level, the optimality system related to (9.1.1) consists of a nonlinear partial differential equation which is not directly amenable to numerical implementations.

Following [HinK1] we shall show that the predual of (9.1.1) is a bilaterally constrained optimization problem in a Hilbert space, for which the primal-dual active set strategy can advantageously be applied and which can be analyzed as a semismooth Newton method. We require some facts from vector-valued function spaces, which we summarize next. Let IL²(Ω) = L²(Ω) × L²(Ω) be endowed with the Hilbert space inner product structure and norm. If the context suggests to do so, then we shall distinguish between vector fields v⃗ ∈ IL²(Ω) and scalar functions v ∈ L²(Ω) by using an arrow on top of the letter. Analogously we set IH¹₀(Ω) = H¹₀(Ω) × H¹₀(Ω). We set L²₀(Ω) = {v ∈ L²(Ω) : ∫_Ω v dx = 0} and

H₀(div) = {v⃗ ∈ IL²(Ω) : div v⃗ ∈ L²(Ω), v⃗ · n = 0 on ∂Ω},

where n is the outer normal to ∂Ω. The space H₀(div) is endowed with the norm |v⃗|²_{H₀(div)} = |v⃗|²_{IL²(Ω)} + |div v⃗|²_{L²}. Further, we put H₀(div 0) = {v⃗ ∈ H₀(div) : div v⃗ = 0 a.e. in Ω}. It is well known that

IL²(Ω) = grad H¹(Ω) ⊕ H₀(div 0); (9.1.2)

cf. [DaLi, p. 216], for example. Moreover,

H₀(div) = H₀(div 0)^⊥ ⊕ H₀(div 0), (9.1.3)


with

H₀(div 0)^⊥ = {v⃗ ∈ grad H¹(Ω) : div v⃗ ∈ L²(Ω), v⃗ · n = 0 on ∂Ω},

and div : H₀(div 0)^⊥ ⊂ H₀(div) → L²₀(Ω) is a homeomorphism. In fact, it is injective by construction, and for every f ∈ L²₀(Ω) there exists, by the Lax–Milgram lemma, ϕ ∈ H¹(Ω) such that

div ∇ϕ = f in Ω, ∇ϕ · n = 0 on ∂Ω,

with ∇ϕ ∈ H₀(div 0)^⊥. Hence, by the closed mapping theorem we have

div ∈ L(H₀(div 0)^⊥, L²₀(Ω)).

Finally, let P_div and P_div⊥ denote the orthogonal projections in IL²(Ω) onto H₀(div 0) and grad H¹(Ω), respectively. Note that the restrictions of P_div and P_div⊥ to H₀(div) coincide with the orthogonal projections in H₀(div) onto H₀(div 0) and H₀(div 0)^⊥.

Let 1⃗ denote the two-dimensional vector field with 1 in both coordinates, set B = αI + K*K, and consider

min (1/2)|div p⃗ + K*f|²_B over p⃗ ∈ H₀(div),
such that −β1⃗ ≤ p⃗ ≤ β1⃗, (9.1.4)

where for v ∈ L²(Ω) we put |v|²_B = (v, B^{-1}v)_{L²}. It is straightforward to argue that (9.1.4) admits a solution.

Theorem 9.1. The Fenchel dual to (9.1.4) is given by (9.1.1), and the solutions u* of (9.1.1) and p⃗* of (9.1.4) are related by

Bu* = div p⃗* + K*f, (9.1.5)

〈(−div)* u*, p⃗ − p⃗*〉_{H₀(div)*, H₀(div)} ≤ 0 for all p⃗ ∈ H₀(div) (9.1.6)

with −β1⃗ ≤ p⃗ ≤ β1⃗.

Alternatively, (9.1.4) can be considered as the predual of the original problem (9.1.1).

Proof. We apply Fenchel duality as recalled at the beginning of the chapter with V = H₀(div), Y = Y* = L²(Ω), Λ = −div, G : Y → R given by G(v) = (1/2)|v − K*f|²_B, and F : V → R defined by F(p⃗) = I_{[−β1⃗, β1⃗]}(p⃗), where

I_{[−β1⃗, β1⃗]}(p⃗) = 0 if −β1⃗ ≤ p⃗(x) ≤ β1⃗ for a.e. x ∈ Ω, and ∞ otherwise.

The convex conjugate G* : L²(Ω) → R of G is given by

G*(v) = (1/2)|Kv + f|² + (α/2)|v|² − (1/2)|f|².
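For completeness we sketch the computation behind this formula (our verification; it uses only the definition of the conjugate and B = αI + K*K):

\begin{aligned}
G^*(v) &= \sup_{w\in L^2(\Omega)}\bigl\{(v,w)-\tfrac12\,(w-K^*f,\,B^{-1}(w-K^*f))\bigr\}
        = (v,K^*f)+\tfrac12\,(v,Bv) \quad\text{(maximizer } w=K^*f+Bv\text{)}\\
       &= (Kv,f)+\tfrac{\alpha}{2}|v|^2+\tfrac12|Kv|^2
        = \tfrac12|Kv+f|^2+\tfrac{\alpha}{2}|v|^2-\tfrac12|f|^2 .
\end{aligned}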

Further, the conjugate F* : H₀(div)* → R of F is given by

F*(q⃗) = sup_{p⃗∈S₁} 〈q⃗, p⃗〉_{H₀(div)*, H₀(div)} for q⃗ ∈ H₀(div)*, (9.1.7)


where S₁ = {p⃗ ∈ H₀(div) : −β1⃗ ≤ p⃗ ≤ β1⃗}. Let us set

S₂ = {p⃗ ∈ C¹₀(Ω) × C¹₀(Ω) : −β1⃗ ≤ p⃗ ≤ β1⃗}.

The set S₂ is dense in S₁ in the topology of H₀(div). In fact, let p⃗ be an arbitrary element of S₁. Since (D(Ω))² is dense in H₀(div) (see, e.g., [GiRa, p. 26]), there exists a sequence p⃗ₙ ∈ (D(Ω))² converging in H₀(div) to p⃗. Let P denote the canonical projection in H₀(div) onto the closed convex subset S₁ and note that, since p⃗ ∈ S₁,

|p⃗ − P p⃗ₙ|_{H₀(div)} ≤ |p⃗ − p⃗ₙ|_{H₀(div)} + |p⃗ₙ − P p⃗ₙ|_{H₀(div)} ≤ 2|p⃗ − p⃗ₙ|_{H₀(div)} → 0 for n → ∞.

Hence lim_{n→∞} |p⃗ − P p⃗ₙ|_{H₀(div)} = 0 and S₂ is dense in S₁. Returning to (9.1.7), we have for v ∈ L²(Ω) and (−div)* ∈ L(L²(Ω), V*)

F*((−div)* v) = sup_{p⃗∈S₂} (v, −div p⃗),

which can be +∞. By the definition of the functions of bounded variation it is finite if and only if v ∈ BV(Ω) (see [Giu, p. 3]), and

F*((−div)* v) = β ∫_Ω |Dv| < ∞ for v ∈ BV(Ω).

The dual problem to (9.1.4) is found to be

min (1/2)|Ku − f|² + (α/2)|u|² + β ∫_Ω |Du| over u ∈ BV(Ω).

From (9.0.2) we moreover find

〈(−div)* u*, p⃗ − p⃗*〉_{H₀(div)*, H₀(div)} ≤ 0 for all p⃗ ∈ S₁

and

Bu* = div p⃗* + K*f.

We obtain the following optimality system for (9.1.4).

Corollary 9.2. Let p⃗* ∈ H₀(div) be a solution to (9.1.4). Then there exists λ⃗* ∈ H₀(div)* such that

div* B^{-1} div p⃗* + div* B^{-1} K*f + λ⃗* = 0, (9.1.8)

〈λ⃗*, p⃗ − p⃗*〉_{H₀(div)*, H₀(div)} ≤ 0 for all p⃗ ∈ H₀(div) (9.1.9)

with −β1⃗ ≤ p⃗ ≤ β1⃗.

Proof. Apply div* B^{-1} to (9.1.5) and set λ⃗* = −div* u* ∈ H₀(div)* to obtain (9.1.8). For this choice of λ⃗*, (9.1.9) follows from (9.1.6).


To guarantee uniqueness of the solution we replace (9.1.4) by

min (1/2)|div p⃗ + K*f|²_B + (γ/2)|p⃗|² over p⃗ ∈ H₀(div),
such that −β1⃗ ≤ p⃗ ≤ β1⃗, (9.1.10)

with γ > 0. Clearly (9.1.10) admits a unique solution.

Our goal now is to investigate the primal-dual active set strategy for (9.1.10). If we were to simply apply this method to the first order necessary optimality condition for (9.1.10), then the resulting nonlinear operator arising in the complementarity system would not be Newton differentiable; cf. Section 8.6. The numerical results obtained in this way are competitive from the point of view of image reconstruction [HinK1], but due to the lack of Newton differentiability the iteration numbers are mesh-dependent. We therefore approximate (9.1.10) by a sequence of more regular problems in which the constraints −β1⃗ ≤ p⃗ ≤ β1⃗ are realized by penalty terms and an IH¹₀(Ω) smoothing guarantees superlinear convergence of semismooth Newton methods applied to the first order optimality conditions of the approximating problems:

min (1/(2c))|∇p⃗|² + (1/2)|div p⃗ + K*f|²_B + (γ/2)|p⃗|²
    + (1/(2c))|max(0, c(p⃗ − β1⃗))|² + (1/(2c))|min(0, c(p⃗ + β1⃗))|² over p⃗ ∈ IH¹₀(Ω). (9.1.11)

Let p⃗_c denote the unique solution to (9.1.11). It satisfies the optimality condition

−(1/c)Δp⃗_c − ∇B^{-1} div p⃗_c − ∇B^{-1}K*f + γ p⃗_c + λ⃗_c = 0, (9.1.12a)

λ⃗_c = max(0, c(p⃗_c − β1⃗)) + min(0, c(p⃗_c + β1⃗)). (9.1.12b)

Next we address convergence as c → ∞.

Theorem 9.3. The family {(p⃗_c, λ⃗_c)}_{c>0} converges weakly in H₀(div) × IH¹₀(Ω)* to the unique solution (p⃗*, λ⃗*) ∈ H₀(div) × H₀(div)* of the optimality system associated to (9.1.10), given by

div* B^{-1} div p⃗* + div* B^{-1} K*f + γ p⃗* + λ⃗* = 0, (9.1.13)

〈λ⃗*, p⃗ − p⃗*〉_{H₀(div)*, H₀(div)} ≤ 0 for all p⃗ ∈ H₀(div). (9.1.14)

Moreover, the convergence of p⃗_c to p⃗* is strong in H₀(div).

Proof. The proof is related to that of Theorem 8.26. The variational form of (9.1.13) is given by

(div p⃗*, div v⃗)_B + (K*f, div v⃗)_B + γ(p⃗*, v⃗) + 〈λ⃗*, v⃗〉_{H₀(div)*, H₀(div)} = 0 (9.1.15)

for all v⃗ ∈ H₀(div). To verify uniqueness, let us suppose that (p⃗ᵢ, λ⃗ᵢ) ∈ H₀(div) × H₀(div)*, i = 1, 2, are two solution pairs to (9.1.13), (9.1.14). For δp⃗ = p⃗₂ − p⃗₁, δλ⃗ = λ⃗₂ − λ⃗₁ we have

(B^{-1} div δp⃗, div v⃗) + γ(δp⃗, v⃗) + 〈δλ⃗, v⃗〉_{H₀(div)*, H₀(div)} = 0 (9.1.16)


for all v⃗ ∈ H₀(div), and

〈δλ⃗, δp⃗〉_{H₀(div)*, H₀(div)} ≥ 0.

With v⃗ = δp⃗ in (9.1.16) we obtain

|div δp⃗|²_B + γ|δp⃗|² ≤ 0

and hence p⃗₁ = p⃗₂. From (9.1.15) we deduce that λ⃗₁ = λ⃗₂. Thus uniqueness is established, and we can henceforth rely on subsequential arguments.

In the following computation we consider the coordinates λ_c^i, i = 1, 2, of λ⃗_c. We have for a.e. x ∈ Ω

λ_c^i p_c^i = (max(0, c(p_c^i − β)) + min(0, c(p_c^i + β))) p_c^i
  = c(p_c^i − β) p_c^i if p_c^i ≥ β,   0 if |p_c^i| ≤ β,   c(p_c^i + β) p_c^i if p_c^i ≤ −β.

It follows that

(λ_c^i, p_c^i)_{L²(Ω)} ≥ (1/c)|λ_c^i|²_{L²(Ω)} for i = 1, 2,

and consequently

(λ⃗_c, p⃗_c)_{IL²(Ω)} ≥ (1/c)|λ⃗_c|²_{IL²(Ω)} for every c > 0. (9.1.17)

From (9.1.12) and (9.1.17) we deduce that

(1/c)|∇p⃗_c|² + |div p⃗_c|²_B + γ|p⃗_c|² ≤ |div p⃗_c|_B |K*f|_B

and hence

(1/c)|∇p⃗_c|² + (1/2)|div p⃗_c|²_B + γ|p⃗_c|² ≤ (1/2)|K*f|²_B. (9.1.18)

We further estimate

|λ⃗_c|_{IH¹₀(Ω)*} = sup_{|v⃗|_{IH¹₀(Ω)}=1} 〈λ⃗_c, v⃗〉_{IH¹₀(Ω)*, IH¹₀(Ω)}
  ≤ sup_{|v⃗|_{IH¹₀(Ω)}=1} { (1/c)|∇p⃗_c||∇v⃗| + |div p⃗_c|_B |div v⃗|_B + |K*f|_B |div v⃗|_B + γ|p⃗_c||v⃗| }.

From (9.1.18) we deduce the existence of a constant K, independent of c ≥ 1, such that

|λ⃗_c|_{IH¹₀(Ω)*} ≤ K. (9.1.19)

Combining (9.1.18) and (9.1.19) we can assert the existence of (p⃗*, λ⃗*) ∈ H₀(div) × IH¹₀(Ω)* such that for a subsequence, denoted by the same symbol,

(p⃗_c, λ⃗_c) ⇀ (p⃗*, λ⃗*) weakly in H₀(div) × IH¹₀(Ω)*. (9.1.20)


We recall the variational form of (9.1.12), i.e.,

(1/c)(∇p⃗_c, ∇v⃗) + (div p⃗_c, div v⃗)_B + (K*f, div v⃗)_B + γ(p⃗_c, v⃗) + (λ⃗_c, v⃗) = 0

for all v⃗ ∈ IH¹₀(Ω). Passing to the limit c → ∞, using (9.1.18) and (9.1.20), we have

(div p⃗*, div v⃗)_B + (K*f, div v⃗)_B + γ(p⃗*, v⃗) + 〈λ⃗*, v⃗〉_{IH¹₀(Ω)*, IH¹₀(Ω)} = 0 for all v⃗ ∈ IH¹₀(Ω). (9.1.21)

Since IH¹₀(Ω) is dense in H₀(div) and p⃗* ∈ H₀(div), (9.1.21) holds for all v⃗ ∈ H₀(div). Consequently λ⃗* can be identified with an element of H₀(div)*, and 〈·, ·〉_{IH¹₀(Ω)*, IH¹₀(Ω)} in (9.1.21) can be replaced by 〈·, ·〉_{H₀(div)*, H₀(div)}. We next verify that p⃗* is feasible. For this purpose note that

(λ⃗_c, p⃗ − p⃗_c) = (max(0, c(p⃗_c − β1⃗)) + min(0, c(p⃗_c + β1⃗)), p⃗ − p⃗_c) ≤ 0 (9.1.22)

for all −β1⃗ ≤ p⃗ ≤ β1⃗. From (9.1.11) we have

(1/c)|∇p⃗_c|² + |div p⃗_c + K*f|²_B + γ|p⃗_c|² + (1/c)|λ⃗_c|² ≤ |K*f|²_B. (9.1.23)

Consequently, (1/c)|λ⃗_c|² ≤ |K*f|²_B for all c > 0. Note that

(1/c)|λ⃗_c|²_{IL²(Ω)} = c|max(0, p⃗_c − β1⃗)|²_{IL²(Ω)} + c|min(0, p⃗_c + β1⃗)|²_{IL²(Ω)},

and thus

|max(0, p⃗_c − β1⃗)|²_{IL²(Ω)} → 0 and |min(0, p⃗_c + β1⃗)|²_{IL²(Ω)} → 0 as c → ∞. (9.1.24)

Recall that p⃗_c ⇀ p⃗* weakly in IL²(Ω). Weak lower semicontinuity of the convex functional p⃗ ↦ |max(0, p⃗ − β1⃗)|_{IL²(Ω)} and (9.1.24) imply that

∫_Ω |max(0, p⃗* − β1⃗)|² dx ≤ lim inf_{c→∞} ∫_Ω |max(0, p⃗_c − β1⃗)|² dx = 0.

Consequently p⃗* ≤ β1⃗, and analogously one verifies that −β1⃗ ≤ p⃗*. In particular p⃗* is feasible, and from (9.1.22) we conclude that

〈λ⃗_c, p⃗* − p⃗_c〉_{H₀(div)*, H₀(div)} ≤ 0 for all c > 0. (9.1.25)

By optimality of p⃗_c for (9.1.11) we have

lim sup_{c→∞} ( (1/2)|div p⃗_c + K*f|²_B + (γ/2)|p⃗_c|² ) ≤ (1/2)|div p⃗ + K*f|²_B + (γ/2)|p⃗|² (9.1.26)


for all p⃗ ∈ S₂ = {p⃗ ∈ (C¹₀(Ω))² : −β1⃗ ≤ p⃗ ≤ β1⃗}. Density of S₂ in S₁ = {p⃗ ∈ H₀(div) : −β1⃗ ≤ p⃗ ≤ β1⃗} in the norm of H₀(div) implies that (9.1.26) holds for all p⃗ ∈ S₁, and consequently

lim sup_{c→∞} ( (1/2)|div p⃗_c + K*f|²_B + (γ/2)|p⃗_c|² ) ≤ (1/2)|div p⃗* + K*f|²_B + (γ/2)|p⃗*|²
  ≤ lim inf_{c→∞} ( (1/2)|div p⃗_c + K*f|²_B + (γ/2)|p⃗_c|² ),

where for the last inequality weak lower semicontinuity of norms is used. The above inequalities, together with weak convergence of p⃗_c to p⃗* in H₀(div), imply strong convergence of p⃗_c to p⃗* in H₀(div).

Finally we aim at passing to the limit in (9.1.25). This is impeded by the fact that we only established λ⃗_c ⇀ λ⃗* in IH¹₀(Ω)*. Note from (9.1.12) that {−(1/c)Δp⃗_c + λ⃗_c}_{c≥1} is bounded in H₀(div)*. Hence there exists μ⃗* ∈ H₀(div)* such that

−(1/c)Δp⃗_c + λ⃗_c ⇀ μ⃗* weakly in H₀(div)*,

and consequently also in IH¹₀(Ω)*. Moreover, {(1/√c)|∇p⃗_c|}_{c≥1} is bounded, and hence

−(1/c)Δp⃗_c ⇀ 0 weakly in IH¹₀(Ω)*

as c → ∞. Since λ⃗_c ⇀ λ⃗* weakly in IH¹₀(Ω)*, it follows that

〈λ⃗* − μ⃗*, v⃗〉_{IH¹₀(Ω)*, IH¹₀(Ω)} = 0 for all v⃗ ∈ IH¹₀(Ω).

Since both λ⃗* and μ⃗* are elements of H₀(div)* and since IH¹₀(Ω) is dense in H₀(div), it follows that λ⃗* = μ⃗* in H₀(div)*. For p⃗ ∈ S₂ we have

〈λ⃗*, p⃗ − p⃗*〉_{H₀(div)*, H₀(div)} = 〈μ⃗*, p⃗ − p⃗*〉_{H₀(div)*, H₀(div)}
  = lim_{c→∞} 〈−(1/c)Δp⃗_c + λ⃗_c, p⃗ − p⃗_c〉_{H₀(div)*, H₀(div)}
  = lim_{c→∞} ( (1/c)(∇p⃗_c, ∇(p⃗ − p⃗_c)) + (λ⃗_c, p⃗ − p⃗_c) )
  ≤ lim_{c→∞} ( (1/c)(∇p⃗_c, ∇p⃗) + (λ⃗_c, p⃗ − p⃗_c) ) ≤ 0

by (9.1.22) and (9.1.23). Since S₂ is dense in S₁, we find

〈λ⃗*, p⃗ − p⃗*〉_{H₀(div)*, H₀(div)} ≤ 0 for all p⃗ ∈ S₁.

Remark 9.1.1. For γ = 0 problems (9.1.10) and (9.1.11) admit a solution, which, however, is not unique. From the proof of Theorem 9.3 it follows that {(p⃗_c, λ⃗_c)}_{c>0} contains a weak accumulation point in H₀(div) × IH¹₀(Ω)*, and every weak accumulation point is a solution of (9.1.10).


Semismooth Newton treatment of (9.1.11).

We turn to the algorithmic treatment of the infinite-dimensional problem (9.1.11), for which we propose the following algorithm.

Algorithm.

(1) Choose p⃗₀ ∈ IH¹₀(Ω) and set k = 0.

(2) Set, for i = 1, 2,

A^{+,i}_{k+1} = {x : (p^i_k − β)(x) > 0},
A^{−,i}_{k+1} = {x : (p^i_k + β)(x) < 0},
I^i_{k+1} = Ω \ (A^{+,i}_{k+1} ∪ A^{−,i}_{k+1}).

(3) Solve for p⃗ ∈ IH¹₀(Ω)

(1/c)(∇p⃗, ∇v⃗) + (div p⃗, div v⃗)_B + (K*f, div v⃗)_B + γ(p⃗, v⃗)
  + (c(p⃗ − β1⃗)χ_{A⁺_{k+1}}, v⃗) + (c(p⃗ + β1⃗)χ_{A⁻_{k+1}}, v⃗) = 0 (9.1.27)

for all v⃗ ∈ IH¹₀(Ω), and set p⃗_{k+1} = p⃗.

(4) Set, for i = 1, 2,

λ^i_{k+1} = 0 on I^i_{k+1},   λ^i_{k+1} = c(p^i_{k+1} − β) on A^{+,i}_{k+1},   λ^i_{k+1} = c(p^i_{k+1} + β) on A^{−,i}_{k+1}.

(5) Stop, or set k = k + 1 and goto (2).

Above, χ_{A⁺_{k+1}} stands for the componentwise characteristic function

χ^i_{A⁺_{k+1}} = 1 if x ∈ A^{+,i}_{k+1},   0 if x ∉ A^{+,i}_{k+1},

and analogously for A⁻_{k+1}. The superscript i, i = 1, 2, refers to the respective component. We note that (9.1.27) admits a solution p⃗_{k+1} ∈ IH¹₀(Ω). Step (4) is included for the sake of the analysis of the algorithm. Let C : IH¹₀(Ω) → H^{−1}(Ω) × H^{−1}(Ω) stand for the operator

C = −(1/c)Δ − ∇B^{−1} div + γ id.

It is a homeomorphism for every c > 0 and allows us to express (9.1.12) as

C p⃗ − ∇B^{−1}K*f + c max(0, p⃗ − β1⃗) + c min(0, p⃗ + β1⃗) = 0, (9.1.28)


where we drop the index in the notation for p⃗_c. For ϕ ∈ L²(Ω) we define

D max(0, ϕ)(x) = 1 if ϕ(x) > 0, 0 if ϕ(x) ≤ 0, (9.1.29)

and

D min(0, ϕ)(x) = 1 if ϕ(x) < 0, 0 if ϕ(x) ≥ 0. (9.1.30)

Using (9.1.29), (9.1.30) as Newton derivatives for the max and min operations in (9.1.28), the semismooth Newton step can be expressed as

C p⃗_{k+1} + c(p⃗_{k+1} − β1⃗)χ_{A⁺_{k+1}} + c(p⃗_{k+1} + β1⃗)χ_{A⁻_{k+1}} − ∇B^{−1}K*f = 0, (9.1.31)

and λ⃗_{k+1} from step (4) of the algorithm is given by

λ⃗_{k+1} = c(p⃗_{k+1} − β1⃗)χ_{A⁺_{k+1}} + c(p⃗_{k+1} + β1⃗)χ_{A⁻_{k+1}}. (9.1.32)

The iteration of the algorithm can also be expressed with respect to the variable λ⃗ rather than p⃗. For this purpose we define

F(λ⃗) = λ⃗ − c max(0, C^{−1}(∇f̃ − λ⃗) − β1⃗) − c min(0, C^{−1}(∇f̃ − λ⃗) + β1⃗), (9.1.33)

where we put f̃ = B^{−1}K*f. Setting p⃗_k = C^{−1}(∇f̃ − λ⃗_k), the semismooth Newton step applied to F(λ⃗) = 0 at λ⃗ = λ⃗_k results in

λ⃗_{k+1} = c(C^{−1}(∇f̃ − λ⃗_{k+1}) − β1⃗)χ_{A⁺_{k+1}} + c(C^{−1}(∇f̃ − λ⃗_{k+1}) + β1⃗)χ_{A⁻_{k+1}},

which coincides with (9.1.32). Therefore the semismooth Newton iteration according to the algorithm and that for F(λ⃗) = 0 coincide, provided that the initializations are related by C p⃗₀ − ∇f̃ + λ⃗₀ = 0. The mapping F is Newton differentiable, i.e., for every λ⃗ ∈ IL²(Ω)

|F(λ⃗ + h⃗) − F(λ⃗) − DF(λ⃗ + h⃗)h⃗|_{IL²(Ω)} = o(|h⃗|_{IL²(Ω)}) (9.1.34)

for |h⃗|_{IL²(Ω)} → 0; see Example 8.14. Here D denotes the Newton derivative of F defined by means of (9.1.29) and (9.1.30). For (9.1.34) to hold, the smoothing property of C^{−1}, in the sense of an embedding from IL²(Ω) into IL^p(Ω) for some p > 2, is essential. The following result now follows from Theorem 8.16.

Theorem 9.4. If |λ⃗_c − λ⃗₀|_{IL²(Ω)} is sufficiently small, then the iterates {(p⃗_k, λ⃗_k)}_{k=1}^∞ of the algorithm converge superlinearly in IH¹₀(Ω) × IL²(Ω) to the solution (p⃗_c, λ⃗_c) of (9.1.11).
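As a concrete illustration of the algorithm (ours, not from [HinK1]; the one-dimensional setting with K = I and α = 0, so that B = I, and all parameter values are assumptions), the following Python fragment runs the Newton step (9.1.31) for a discretized analogue of (9.1.11) on (0, 1), where div p⃗ reduces to p′ and C to (1/c + 1)(−Δ) + γ id.

import numpy as np

n = 400
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
f = (x > 0.3).astype(float) - 0.5 * (x > 0.7)   # piecewise constant "image"
beta, gamma, c = 5e-3, 1e-6, 1e8                # parameters chosen for the experiment

L = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2      # -Laplacian with Dirichlet b.c.
grad_f = np.gradient(f, h)                      # discrete gradient of the data
C = (1.0 / c + 1.0) * L + gamma * np.eye(n)     # C = -(1/c)Lap - grad div + gamma id

p = np.zeros(n)
for k in range(50):
    Aplus = p - beta > 0                        # active set for the upper bound
    Aminus = p + beta < 0                       # active set for the lower bound
    chi = (Aplus | Aminus).astype(float)
    # Newton step (9.1.31): (C + c diag(chi)) p = grad f + c beta (chi_+ - chi_-)
    rhs = grad_f + c * beta * Aplus - c * beta * Aminus
    p_new = np.linalg.solve(C + c * np.diag(chi), rhs)
    if (np.array_equal(Aplus, p_new - beta > 0)
            and np.array_equal(Aminus, p_new + beta < 0)):
        p = p_new
        break                                   # consecutive active sets coincide
    p = p_new

u = np.gradient(p, h) + f                       # recover u from B u = div p + K* f
print(f"iterations: {k + 1}, max |p|/beta = {np.abs(p).max() / beta:.3f}")

The last line recovers the restored signal from (9.1.5); the iteration stops as soon as two consecutive pairs of active sets coincide.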

In Theorem 9.3 convergence as c → ∞ is established, and Theorem 9.4 asserts convergence of the iterates (p⃗_k, λ⃗_k) for each fixed c. The combination of these two limit processes was addressed in [HinK2], where a path concept with respect to the variable c is developed and the iterations with respect to k are controlled to remain in a neighborhood of the path.


Let us compare the algorithm of this section to the general framework of augmented Lagrangians presented in Section 4.6 for nonsmooth problems. We again introduce in (9.1.10) a diffusive regularization and realize the inequality constraints by a generalized Yosida–Moreau approximation. This suggests considering the Lagrangian L_c̄ : IH¹₀(Ω) × IL²(Ω) → R defined by

L_c̄(p⃗, λ⃗) = (1/(2c))|∇p⃗|² + (1/2)|div p⃗ + K*f|²_B + (γ/2)|p⃗|² + φ_c̄(p⃗, λ⃗), (9.1.35)

where φ_c̄ is the generalized Yosida–Moreau approximation of the indicator function φ of the set {p⃗ ∈ IL²(Ω) : −β1⃗ ≤ p⃗ ≤ β1⃗}, and c > 0, c̄ > 0. Here we choose c̄ differently from c, since in the limit of the augmented Lagrangian iteration the constraint −β1⃗ ≤ p⃗ ≤ β1⃗ is satisfied for any fixed c̄ > 0. We have

φ_c̄(p⃗, λ⃗) = inf_{q⃗∈IL²(Ω)} { φ(p⃗ − q⃗) + (λ⃗, q⃗)_{IL²(Ω)} + (c̄/2)|q⃗|²_{IL²(Ω)} }

for c̄ > 0 and λ⃗ ∈ IL²(Ω), which can equivalently be expressed as

φ_c̄(p⃗, λ⃗) = (1/(2c̄))|max(0, λ⃗ + c̄(p⃗ − β1⃗))|²_{IL²(Ω)} + (1/(2c̄))|min(0, λ⃗ + c̄(p⃗ + β1⃗))|²_{IL²(Ω)} − (1/(2c̄))|λ⃗|²_{IL²(Ω)}.

The auxiliary problems in step 2 of the augmented Lagrangian method of Section 4.6, with L_c̄ given in (9.1.35), coincide with (9.1.11) except for the shift by λ⃗_k in the max/min operations. Each of these auxiliary problems can efficiently be solved by the semismooth Newton algorithm presented in this section if c(p⃗_k ∓ β1⃗) is replaced by λ⃗_k + c̄(p⃗_k ∓ β1⃗). Conversely, one can think of introducing augmented Lagrangian steps into the algorithm of this section, with the goal of avoiding possible ill-conditioning as c → ∞. For numerical experience with the algorithmic concepts of this section we refer the reader to [HinK1].

9.2 Friction and contact problems in elasticity

9.2.1 Generalities

This section is concerned with the application of semismooth Newton methods to contact problems with Tresca and Coulomb friction in two spatial dimensions. In contact problems, also known as Signorini problems, one has to detect the contact zone between an elastic body and a rigid foundation. At the contact boundary, frictional forces are often too large to be neglected. Thus, besides the nonpenetration condition, the frictional behavior in the contact zone has to be taken into account. Modeling friction involves a convex, nondifferentiable functional which, by means of Fenchel duality theory, can be related to a bilateral obstacle problem, for which the primal-dual active set concept can advantageously be applied.

We commence with a formulation of the Signorini contact problem with Coulomb friction in two dimensions. A generalization to the three-dimensional case is given in [HuStWo]. We consider an elastic body that occupies, in its initial configuration, the open and bounded domain Ω ⊂ R² with C^{1,1}-boundary Γ = ∂Ω. Let this boundary be divided


into three disjoint parts, namely, the Dirichlet part Γ_d, the part Γ_n with prescribed surface load h ∈ L²(Γ_n) := (L²(Γ_n))², and the part Γ_c, where contact and friction with a rigid foundation may occur. For simplicity we assume that Γ̄_c ∩ Γ̄_d = ∅ to avoid working with the space H^{1/2}_{00}(Γ_c). We are interested in the deformation y = (y₁, y₂)ᵀ of the elastic body, which is also subject to a given body force f ∈ L²(Ω) := (L²(Ω))². The gap between the elastic body and the rigid foundation is d := τ_N d⃗ ≥ 0, where d⃗ ∈ H¹(Ω) := (H¹(Ω))² and τ_N y denotes the normal component of the trace along Γ_c. As usual in linear elasticity, the linearized strain tensor is

ε(y) = (1/2)(∇y + (∇y)ᵀ).

Using Hooke's law for the stress-strain relation, the linearized stress tensor

σ(y) := Cε(y) := λ tr(ε(y)) Id + 2με(y)

is obtained, where λ and μ are the Lamé parameters. These parameters are given by

λ = (Eν)/((1 + ν)(1 − 2ν)) and μ = E/(2(1 + ν)),

with Young's modulus E > 0 and the Poisson ratio ν ∈ (0, 0.5). Above, C denotes the fourth order isotropic material tensor for linear elasticity.

)with Young’s modulus E > 0 and the Poisson ration ν ∈ (0, 0.5). Above, C denotes thefourth order isotropic material tensor for linear elasticity.

The Signorini problem with Coulomb friction is then given as follows:

−Div σ (y) = f in �, (9.2.1a)

τy = 0 on �d, (9.2.1b)

σ (y)n = h on �n, (9.2.1c)

τN y − d ≤ 0, σN(y) ≤ 0, (τN y − d)σN(y) = 0 on �c, (9.2.1d)

|σT (y)| < F|σN(y)| on {x ∈ �c : τT y = 0}, (9.2.1e)

σT (y) = −F|σN(y)||τT y| τT y on {x ∈ �c : τT y �= 0}, (9.2.1f )

where the sets {x ∈ �c : τT y = 0} and {x ∈ �c : τT y �= 0} are referred to as thesticky and the sliding regions, respectively. Above, Div denotes the rowwise divergenceoperator and τ : H1(�) → H

12 (�) := (

H12 (�)

)2the zero order trace mapping. The

corresponding scalar-valued normal and tangential component mappings are denoted byτN , τT : H1(�)→ H

12 (�c); i.e., for y ∈ H1(�) we have the splitting τy = (τN y)n+ (τT y)t

with n and t denoting the unit normal and tangential vector along�c, respectively. Similarly,using (9.2.1a) we can, following [KO], decompose the stress along the boundary, namely,σ (y)n = σN(y)n+ σT (y)t with mappings σN, σT : Y → H− 1

2 (�c). Moreover, F : �c → R

denotes the friction coefficient which is supposed to be uniformly Lipschitz continuous.There are major mathematical difficulties inherent in the problem (9.2.1). For instance,

in general σN(y) in (9.2.1e), (9.2.1f) is not pointwise a.e. defined. Replacing the Coulombfriction in the above model by Tresca friction means replacing |σN(y)| by a given frictiong. The resulting system can then be analyzed and the existence of a unique solution can beproved. For a review we refer the reader to [Rao], for example.


9.2.2 Contact problem with Tresca friction

We now give a variational statement of the Signorini problem with given friction. The set of admissible deformations is defined as

Y := {v ∈ H¹(Ω) : τv = 0 a.e. on Γ_d},

where H¹(Ω) := (H¹(Ω))². To incorporate the nonpenetration condition between the elastic body and the rigid foundation we introduce the cone

K := {v ∈ Y : τ_N v ≤ 0 a.e. on Γ_c}.

We define the symmetric bilinear form a(·, ·) on Y × Y and the linear form L(·) on Y by

a(y, z) := ∫_Ω σ(y) : ε(z) dx,   L(y) := ∫_Ω f y dx + ∫_{Γ_n} h τy dx,

where : denotes the sum of the componentwise products.

For the friction g we assume that g ∈ H^{−1/2}(Γ_c) and g ≥ 0, i.e., 〈g, h〉_{Γ_c} ≥ 0 for all h ∈ H^{1/2}(Γ_c) with h ≥ 0. Since the friction coefficient F is assumed to be uniformly Lipschitz continuous, it is a factor on H^{1/2}(Γ_c); i.e., the mapping

λ ∈ H^{1/2}(Γ_c) ↦ Fλ ∈ H^{1/2}(Γ_c)

is well defined and bounded; see [Gri, p. 21]. By duality it follows that F is a factor on H^{−1/2}(Γ_c) as well. Consequently, the nondifferentiable functional

j(y) := ∫_{Γ_c} Fg|τ_T y| dx

is well defined on Y. After these preparations we can state the contact problem with given friction as

min_{y∈d⃗+K} J(y) := (1/2)a(y, y) − L(y) + j(y) (P)

or, equivalently, as the elliptic variational inequality [Glo]:

Find y ∈ d⃗ + K such that
a(y, z − y) + j(z) − j(y) ≥ L(z − y) for all z ∈ d⃗ + K. (9.2.2)

Due to the Korn inequality, the functional J(·) is uniformly convex; further, it is l.s.c. This implies that (P) and, equivalently, (9.2.2) admit a unique solution y* ∈ d⃗ + K.

To derive the dual problem corresponding to (P), we apply the Fenchel calculus to the mappings F : Y → R and G : V × H^{1/2}(Γ_c) → R given by

F(y) := −L(y) if y ∈ d⃗ + K, ∞ else;

G(q, ν) := (1/2)∫_Ω q : Cq dx + ∫_{Γ_c} Fg|ν| dx,


where

V := {p ∈ (L²(Ω))^{2×2} : p₁₂ = p₂₁}.

Furthermore, Λ ∈ L(Y, V × H^{1/2}(Γ_c)) is given by

Λy := (Λ₁y, Λ₂y) = (ε(y), τ_T y),

which allows us to express (P) as

min_{y∈Y} {F(y) + G(Λy)}.

Endowing V × H^{1/2}(Γ_c) with the usual product norm, F and G satisfy the conditions for the Fenchel duality theorem, which was recalled in the preamble of this chapter. For the convex conjugate one derives that F*(−Λ*(p, μ)) equals +∞ unless

−Div p = f, p · n = h in L²(Γ_n), and p_T + μ = 0 in H^{−1/2}(Γ_c), (9.2.3)

where p_T = (nᵀp) · t ∈ H^{−1/2}(Γ_c). Further one obtains that

F*(−Λ*(p, μ)) = −〈p_N, d〉_{Γ_c} if (9.2.3) holds and p_N ≤ 0 in H^{−1/2}(Γ_c), and ∞ else.

Evaluating the convex conjugate of G yields

G*(p, μ) = (1/2)∫_Ω C^{−1}p : p dx if 〈Fg, |ν|〉_{Γ_c} − 〈ν, μ〉_{Γ_c} ≥ 0 for all ν ∈ H^{1/2}(Γ_c), and ∞ else.

Thus, following (9.0.1) we derive the dual problem corresponding to (P):

sup −(1/2)∫_Ω C^{−1}p : p dx + 〈p_N, d〉_{Γ_c}
over (p, μ) ∈ V × H^{−1/2}(Γ_c) subject to (9.2.3), p_N ≤ 0 in H^{−1/2}(Γ_c),
and 〈Fg, |ν|〉_{Γ_c} − 〈ν, μ〉_{Γ_c} ≥ 0 for all ν ∈ H^{1/2}(Γ_c). (P*)

This problem is an inequality-constrained maximization problem for a quadratic functional, while the primal problem (P) involves the minimization of a nondifferentiable functional. Evaluating the extremality conditions (9.0.2) for the above problems, one obtains the following lemma.

Lemma 9.5. The solution y* ∈ d⃗ + K of (P) and the solution (p*, μ*) of (P*) are related by σ(y*) = p* and by the existence of λ* ∈ H^{−1/2}(Γ_c) such that

a(y*, z) − L(z) + 〈μ*, τ_T z〉_{Γ_c} + 〈λ*, τ_N z〉_{Γ_c} = 0 for all z ∈ Y, (9.2.4a)
〈λ*, τ_N z〉_{Γ_c} ≤ 0 for all z ∈ K, (9.2.4b)
〈λ*, τ_N y* − d〉_{Γ_c} = 0, (9.2.4c)
〈Fg, |ν|〉_{Γ_c} − 〈μ*, ν〉_{Γ_c} ≥ 0 for all ν ∈ H^{1/2}(Γ_c), (9.2.4d)
〈Fg, |τ_T y*|〉_{Γ_c} − 〈μ*, τ_T y*〉_{Γ_c} = 0. (9.2.4e)


Proof. The extremality condition −Λ*(p*, μ*) ∈ ∂F(y*) results in y* ∈ d⃗ + K and

(p*, ε(z − y*)) − L(z − y*) + 〈μ*, τ_T(z − y*)〉_{Γ_c} ≥ 0 for all z ∈ d⃗ + K. (9.2.5)

The condition (p*, μ*) ∈ ∂G(Λy*) yields p* = σ(y*) as well as (9.2.4d) and (9.2.4e). Introducing the multiplier λ* for the variational inequality (9.2.5) leads to (9.2.4a), (9.2.4b), and (9.2.4c).

According to (9.2.3) the multiplier μ* ∈ H^{−1/2}(Γ_c) corresponding to the nondifferentiability of the primal functional J(·) has the mechanical interpretation μ* = −σ_T(y*). Using Green's theorem in (9.2.4a) one finds

λ* = −σ_N(y*), (9.2.6)

i.e., λ* is the negative stress in the normal direction.

We now briefly comment on the case where the given friction g is more regular, namely, g ∈ L²(Γ_c). In this case we can define G on the larger set V × L²(Γ_c). One can verify that the assumptions for the Fenchel duality theorem hold, and thus obtain higher regularity for the dual variable μ corresponding to the nondifferentiability of the cost functional in (P*); in particular, μ ∈ L²(Γ_c). This implies that the dual problem can be written as follows:

sup −(1/2)∫_Ω C^{−1}p : p dx + 〈p_N, d〉_{Γ_c}
over (p, μ) ∈ V × L²(Γ_c) subject to (9.2.3), p_N ≤ 0 in H^{−1/2}(Γ_c),
and |μ| ≤ Fg a.e. on Γ_c. (9.2.7)

Utilizing the relation p = σ(y) and (9.2.6), one can transform (9.2.7) into

− min (1/2)a(y_{λ,μ}, y_{λ,μ}) + 〈λ, d〉_{Γ_c}
over (λ, μ) ∈ H^{−1/2}(Γ_c) × L²(Γ_c) with λ ≥ 0 in H^{−1/2}(Γ_c), |μ| ≤ Fg a.e. on Γ_c,
where y_{λ,μ} satisfies
a(y_{λ,μ}, z) − L(z) + 〈λ, τ_N z〉_{Γ_c} + (μ, τ_T z)_{Γ_c} = 0 for all z ∈ Y. (9.2.8)

Problem (9.2.8) is an equivalent form of the dual problem (9.2.7), now written in the variables λ and μ. The primal variable y_{λ,μ} appears only as an auxiliary variable determined from λ and μ. Since g ∈ L²(Γ_c), the extremality conditions corresponding to (P) and (9.2.7) can also be given more explicitly. First, (9.2.4d) is equivalent to

|μ*| ≤ Fg a.e. on Γ_c, (9.2.4d′)

and a brief computation shows that (9.2.4e) is equivalent to

either τ_T y* = 0, or τ_T y* ≠ 0 and μ* = Fg τ_T y*/|τ_T y*|. (9.2.4e′)

Moreover, (9.2.4d′) and (9.2.4e′) can equivalently be expressed as

σ τ_T y* − max(0, σ τ_T y* + μ* − Fg) − min(0, σ τ_T y* + μ* + Fg) = 0 (9.2.9)

with arbitrary σ > 0.
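The brief computation referred to above is a pointwise case distinction; we record it here (our verification), writing s := στ_T y*:

\begin{aligned}
&\tau_T y^*=0:\ \text{(9.2.9)}\iff \max(0,\mu^*-Fg)+\bigl(-\min(0,\mu^*+Fg)\bigr)=0;
\ \text{both terms are nonnegative,}\\
&\qquad\text{so both vanish, i.e. } |\mu^*|\le Fg,\ \text{which is (9.2.4d$'$)}.\\
&\tau_T y^*>0:\ \text{if } \mu^*=Fg \text{ then } s+\mu^*-Fg=s>0 \text{ and } s+\mu^*+Fg>0,
\ \text{so (9.2.9) reads } s-s-0=0;\\
&\qquad\text{conversely (9.2.9) forces } \mu^*=Fg.\ \text{The case } \tau_T y^*<0 \text{ is symmetric, giving } \mu^*=-Fg,\\
&\qquad\text{in accordance with (9.2.4e$'$)}.
\end{aligned}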


We now introduce and analyze a regularized version of the contact problem with given friction that allows the application of the semismooth Newton method. In what follows we assume that g ∈ L²(Γ_c). We start our consideration with a regularized version of the dual problem (9.2.7), written in the form (9.2.8). Let γ₁, γ₂ > 0, λ̄ ∈ L²(Γ_c), λ̄ ≥ 0, and μ̄ ∈ L²(Γ_c), and define the functional J̄_{γ₁,γ₂} : L²(Γ_c) × L²(Γ_c) → R by

J̄_{γ₁,γ₂}(λ, μ) := (1/2)a(y_{λ,μ}, y_{λ,μ}) + (λ, d)_{Γ_c} + (1/(2γ₁))‖λ − λ̄‖²_{Γ_c} + (1/(2γ₂))‖μ − μ̄‖²_{Γ_c} − (1/(2γ₁))‖λ̄‖²_{Γ_c} − (1/(2γ₂))‖μ̄‖²_{Γ_c},

where y_{λ,μ} ∈ Y satisfies

a(y_{λ,μ}, z) − L(z) + (λ, τ_N z)_{Γ_c} + (μ, τ_T z)_{Γ_c} = 0 for all z ∈ Y. (9.2.10)

The regularized dual problem with given friction is defined as

max −J̄_{γ₁,γ₂}(λ, μ) over (λ, μ) ∈ L²(Γ_c) × L²(Γ_c) with λ ≥ 0, |μ| ≤ Fg a.e. on Γ_c. (P̄_{γ₁,γ₂})

Obviously, the last two terms in the definition of J̄_{γ₁,γ₂} are constants and can thus be neglected in the optimization problem (P̄_{γ₁,γ₂}). However, they are introduced with regard to the primal problem corresponding to (P̄_{γ₁,γ₂}), which we turn to next. We define the functional J_{γ₁,γ₂} : Y → R by

J_{γ₁,γ₂}(y) := (1/2)a(y, y) − L(y) + (1/(2γ₁))‖max(0, λ̄ + γ₁(τ_N y − d))‖²_{Γ_c} + (1/γ₂)∫_{Γ_c} Fg h(τ_T y(x), μ̄(x)) dx,

where h(·, ·) : R × R → R is a local smoothing of the absolute value function, given by

h(x, α) := |γ₂x + α| − (1/2)Fg if |γ₂x + α| ≥ Fg,
           (1/(2Fg))|γ₂x + α|² if |γ₂x + α| < Fg. (9.2.11)
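In numerical form the smoothing (9.2.11) is just a pointwise branch; a direct Python transcription (ours; γ₂ and the pointwise value of Fg are passed in explicitly) reads:

import numpy as np

def h(x, alpha, gamma2, Fg):
    """Local smoothing of the absolute value from (9.2.11)."""
    z = gamma2 * x + alpha
    return np.where(np.abs(z) >= Fg,
                    np.abs(z) - 0.5 * Fg,     # outer (linear) branch
                    0.5 / Fg * z**2)          # inner (quadratic) branch

# the two branches match in value at |z| = Fg: Fg - Fg/2 = (1/(2Fg)) Fg^2 = Fg/2
print(h(np.array([-1.0, 0.0, 1.0]), 0.0, gamma2=1.0, Fg=2.0))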

Then the primal problem corresponding to (P̄_{γ₁,γ₂}) is

min_{y∈Y} J_{γ₁,γ₂}(y). (P_{γ₁,γ₂})

This can be verified similarly as for the original problem using Fenchel duality theory; see [Sta1] for details. Clearly, both (P_{γ₁,γ₂}) and (P̄_{γ₁,γ₂}) admit unique solutions y_{γ₁,γ₂} and (λ_{γ₁,γ₂}, μ_{γ₁,γ₂}), respectively. Note that the regularization turns the primal problem into the unconstrained minimization of a continuously differentiable functional, while the corresponding dual problem is still a constrained minimization of a quadratic functional. To shorten notation we henceforth mark all variables of the regularized problems only by the


index γ instead of γ₁, γ₂. It can be shown that the extremality conditions relating (P_{γ₁,γ₂}) and (P̄_{γ₁,γ₂}) are

a(y_γ, z) − L(z) + (μ_γ, τ_T z)_{Γ_c} + (λ_γ, τ_N z)_{Γ_c} = 0 for all z ∈ Y, (9.2.12a)

λ_γ − max(0, λ̄ + γ₁(τ_N y_γ − d)) = 0 on Γ_c, (9.2.12b)

γ₂(ξ_γ − τ_T y_γ) + μ_γ − μ̄ = 0,
ξ_γ − max(0, ξ_γ + σ(μ_γ − Fg)) − min(0, ξ_γ + σ(μ_γ + Fg)) = 0 (9.2.12c)

for any σ > 0. Here, ξ_γ is the Lagrange multiplier associated to the constraint |μ| ≤ Fg in (P̄_{γ₁,γ₂}). By setting σ = γ₂^{−1}, ξ_γ can be eliminated from (9.2.12c), which results in

γ₂ τ_T y_γ + μ̄ − μ_γ − max(0, γ₂ τ_T y_γ + μ̄ − Fg) − min(0, γ₂ τ_T y_γ + μ̄ + Fg) = 0. (9.2.13)

While (9.2.13) and (9.2.12c) are equivalent, they motivate slightly different active set algorithms due to the parameter σ in (9.2.12c).

Next we investigate the convergence of the primal variable y_γ as well as of the dual variables (λ_γ, μ_γ) as the regularization parameters γ₁, γ₂ tend to infinity. For this purpose we denote by y* the solution of (P) and by (λ*, μ*) the solution to (P*).

Theorem 9.6. For all λ̄ ∈ L²(Γ_c), λ̄ ≥ 0, μ̄ ∈ L²(Γ_c), and g ∈ L²(Γ_c), the primal variable y_γ converges to y* strongly in Y, and the dual variables (λ_γ, μ_γ) converge to (λ*, μ*) weakly in H^{−1/2}(Γ_c) × L²(Γ_c) as γ₁ → ∞ and γ₂ → ∞.

Proof. The proof of this theorem is related to that of Theorem 8.26 in Chapter 8 and can be found in [KuSt].

An active set algorithm for solving (9.2.12) is presented next. Its interpretation as a generalized Newton method is discussed later. In the following we drop the index γ.

Algorithm SSN.

1. Initialize (λ⁰, ξ⁰, μ⁰, y⁰) ∈ L²(Γ_c) × L²(Γ_c) × L²(Γ_c) × Y, σ > 0, and set k := 0.

2. Determine the active and inactive sets

A_c^{k+1} = {x ∈ Γ_c : λ̄ + γ₁(τ_N y^k − d) > 0},
I_c^{k+1} = Γ_c \ A_c^{k+1},
A_{f,−}^{k+1} = {x ∈ Γ_c : ξ^k + σ(μ^k + Fg) < 0},
A_{f,+}^{k+1} = {x ∈ Γ_c : ξ^k + σ(μ^k − Fg) > 0},
I_f^{k+1} = Γ_c \ (A_{f,−}^{k+1} ∪ A_{f,+}^{k+1}).

3. If k ≥ 1, A_c^{k+1} = A_c^k, A_{f,−}^{k+1} = A_{f,−}^k, and A_{f,+}^{k+1} = A_{f,+}^k, stop. Else


4. Solve

a(y^{k+1}, z) − L(z) + (μ^{k+1}, τ_T z)_{Γ_c} + (λ^{k+1}, τ_N z)_{Γ_c} = 0 for all z ∈ Y,
λ^{k+1} = 0 on I_c^{k+1},   λ^{k+1} = λ̄ + γ₁(τ_N y^{k+1} − d) on A_c^{k+1},
μ^{k+1} − μ̄ − γ₂ τ_T y^{k+1} = 0 on I_f^{k+1},
μ^{k+1} = −Fg on A_{f,−}^{k+1},   μ^{k+1} = Fg on A_{f,+}^{k+1}.

5. Set

ξ^{k+1} := τ_T y^{k+1} + γ₂^{−1}(μ̄ + Fg) on A_{f,−}^{k+1},
ξ^{k+1} := τ_T y^{k+1} + γ₂^{−1}(μ̄ − Fg) on A_{f,+}^{k+1},
ξ^{k+1} := 0 on I_f^{k+1},

set k := k + 1, and goto step 2.
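Pointwise, step 2 is plain array arithmetic; the following numpy fragment (ours, with placeholder data standing in for the traces and iterates on a grid of boundary nodes) determines the sets:

import numpy as np

m = 8
rng = np.random.default_rng(0)
tauN_y, tauT_y = rng.normal(size=m), rng.normal(size=m)  # traces of y^k on Gamma_c
lam_bar, mu_bar, d = np.zeros(m), np.zeros(m), 0.05 * np.ones(m)
xi, mu = rng.normal(size=m), rng.normal(size=m)          # current iterates xi^k, mu^k
Fg = np.full(m, 1.0)                                     # pointwise values of F*g
gamma1, sigma = 1e4, 1.0

Ac = lam_bar + gamma1 * (tauN_y - d) > 0                 # contact active set A_c^{k+1}
Af_minus = xi + sigma * (mu + Fg) < 0                    # friction set A_{f,-}^{k+1}
Af_plus = xi + sigma * (mu - Fg) > 0                     # friction set A_{f,+}^{k+1}
If = ~(Af_minus | Af_plus)                               # inactive friction set I_f^{k+1}
print(Ac.sum(), Af_minus.sum(), Af_plus.sum(), If.sum())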

Note that there exists a unique solution to the system in step 4, since it represents the necessary and sufficient optimality conditions for the equality-constrained auxiliary problem

min J̄_{γ₁,γ₂}(λ, μ) over λ = 0 on I_c^{k+1}, μ = −Fg on A_{f,−}^{k+1}, μ = Fg on A_{f,+}^{k+1},

with J̄_{γ₁,γ₂} as defined in (9.2.10), which clearly has a unique solution. If the algorithm stops in step 3, then y^k is the solution to the primal problem (P_{γ₁,γ₂}) and (λ^k, μ^k) solves the dual problem (P̄_{γ₁,γ₂}).

Provided that we choose σ = γ₂^{−1}, the above algorithm can be interpreted as a semismooth Newton method in infinite-dimensional spaces. To show this assertion, we consider a reduced system instead of (9.2.12). Thereby, as in the dual problem (P̄_{γ₁,γ₂}), the primal variable y only acts as an auxiliary variable that is calculated from the dual variables (λ, μ). We introduce the mapping F : L²(Γ_c) × L²(Γ_c) → L²(Γ_c) × L²(Γ_c) by

F(λ, μ) = ( λ − max(0, λ̄ + γ₁(τ_N y_{λ,μ} − d)),
            γ₂ τ_T y_{λ,μ} + μ̄ − μ − max(0, γ₂ τ_T y_{λ,μ} + μ̄ − Fg) − min(0, γ₂ τ_T y_{λ,μ} + μ̄ + Fg) ), (9.2.14)

where for given λ and μ we denote by y_{λ,μ} the solution to

a(y, z) − L(z) + (μ, τ_T z)_{Γ_c} + (λ, τ_N z)_{Γ_c} = 0 for all z ∈ Y. (9.2.15)

For (λ, μ) ∈ L²(Γ_c) × L²(Γ_c) we have τ_N y_{λ,μ}, τ_T y_{λ,μ} ∈ H^{1/2}(Γ_c). Since H^{1/2}(Γ_c) embeds continuously into L^p(Γ_c) for every p < ∞, we have the following composition for the first component of F, for each 2 < p < ∞:

L²(Γ_c) × L²(Γ_c) → L^p(Γ_c) → L²(Γ_c),
(λ, μ) ↦ τ_N y_{λ,μ} ↦ max(0, λ̄ + γ₁(τ_N y_{λ,μ} − d)). (9.2.16)


Since the second mapping in this composition involves a norm gap under the max functional, it is Newton differentiable (see Example 8.14), and thus the first component of F is Newton differentiable. A similar observation holds for the second component as well, and thus the whole mapping F is Newton differentiable. Hence, we can apply the semismooth Newton method to the equation F(λ, μ) = 0. Calculating the explicit form of the Newton step leads to Algorithm SSN with σ = γ₂^{−1}.

Theorem 9.7. Suppose that there exists a constant g₀ > 0 with Fg ≥ g₀, and further suppose that σ ≥ γ₂^{−1} and that ‖λ⁰ − λ_γ‖_{Γ_c}, ‖μ⁰ − μ_γ‖_{Γ_c} are sufficiently small. Then the iterates (λ^k, ξ^k, μ^k, y^k) of Algorithm SSN converge superlinearly to (λ_γ, ξ_γ, μ_γ, y_γ) in L²(Γ_c) × L²(Γ_c) × L²(Γ_c) × Y.

Proof. The proof consists of two steps. First we prove the assertion for σ = γ₂^{−1}, and then we utilize this result for the general case σ ≥ γ₂^{−1}.

Step 1. For σ = γ₂^{−1} Algorithm SSN is a semismooth Newton method for the equation F(λ, μ) = 0 (F as defined in (9.2.14)). We already argued Newton differentiability of F. To apply Theorem 8.16, it remains to show that the generalized derivatives have uniformly bounded inverses, which can be achieved similarly as in the proof of Theorem 8.25 in Chapter 8. Clearly, the superlinear convergence of (λ^k, μ^k) carries over to the variables ξ^k and y^k.

Step 2. For σ > γ₂^{−1} we cannot use the above argument directly. Nevertheless, one can prove superlinear convergence of the iterates by showing that in a neighborhood of the solution the iterates of Algorithm SSN with σ > γ₂^{−1} coincide with those of Algorithm SSN with σ = γ₂^{−1}. The argument for this fact exploits the smoothing properties of the Neumann-to-Dirichlet mapping for the elasticity equation. First, we again consider the case σ = γ₂^{−1}. Clearly, for all k ≥ 1 we have λ^k − λ^{k−1} ∈ L²(Γ_c) and μ^k − μ^{k−1} ∈ L²(Γ_c). The corresponding difference y^k − y^{k−1} of the primal variables satisfies

a(y^k − y^{k−1}, z) + (μ^k − μ^{k−1}, τ_T z)_{Γ_c} + (λ^k − λ^{k−1}, τ_N z)_{Γ_c} = 0 for all z ∈ Y.

From regularity results for elliptic variational equalities it follows that there exists a constant C > 0 such that

‖τ_T y^k − τ_T y^{k−1}‖_{C⁰(Γ_c)} ≤ C(‖λ^k − λ^{k−1}‖_{Γ_c} + ‖μ^k − μ^{k−1}‖_{Γ_c}). (9.2.17)

We now show that (9.2.17) implies

A_{f,−}^k ∩ A_{f,+}^{k+1} = A_{f,+}^k ∩ A_{f,−}^{k+1} = ∅, (9.2.18)

provided that ‖λ⁰ − λ_γ‖_{Γ_c} and ‖μ⁰ − μ_γ‖_{Γ_c} are sufficiently small. If B := A_{f,−}^k ∩ A_{f,+}^{k+1} ≠ ∅, then it follows that τ_T y^{k−1} + γ₂^{−1}(μ̄ + Fg) < 0 and τ_T y^k + γ₂^{−1}(μ̄ − Fg) > 0 on B, which implies that τ_T y^k − τ_T y^{k−1} > 2γ₂^{−1}Fg ≥ 2γ₂^{−1}g₀ > 0 on B. This contradicts (9.2.17) provided that ‖λ⁰ − λ_γ‖_{Γ_c} and ‖μ⁰ − μ_γ‖_{Γ_c} are sufficiently small. Analogously, one can show that A_{f,+}^k ∩ A_{f,−}^{k+1} = ∅.

We now choose an arbitrary σ ≥ γ₂^{−1} and assume that (9.2.18) holds for Algorithm SSN with σ = γ₂^{−1}. Then we can argue that in a neighborhood of the solution the iterates of Algorithm SSN are independent of σ ≥ γ₂^{−1}. To verify this assertion we separately consider


the sets I_f^k, A_{f,−}^k, and A_{f,+}^k. On I_f^k we have ξ^k = 0, and thus σ has no influence when determining the new active and inactive sets. On the set A_{f,−}^k we have μ^k = −Fg. Here, we consider two types of sets. First, sets where ξ^k < 0 belong to A_{f,−}^{k+1} for the next iteration, independently of σ. Second, if ξ^k ≥ 0, we use

ξ^k + σ(μ^k − Fg) = ξ^k − 2σFg.

Sets where ξ^k − 2σFg ≤ 0 are transferred to I_f^{k+1}, and those where 0 < ξ^k − 2σFg ≤ ξ^k − 2γ₂^{−1}Fg would belong to A_{f,+}^{k+1} for the next iteration. However, the case x ∈ A_{f,−}^k ∩ A_{f,+}^{k+1} cannot occur for σ ≥ γ₂^{−1}, since it is already ruled out by (9.2.18) for σ = γ₂^{−1}. On the set A_{f,+}^k one argues analogously.

This shows that in a neighborhood of the solution the iterates are the same for all σ ≥ γ₂^{−1}, and thus the superlinear convergence result from Step 1 carries over to the general case σ ≥ γ₂^{−1}, which ends the proof.

Aside from the requirement that ‖λ⁰ − λ_γ‖_{Γ_c} and ‖μ⁰ − μ_γ‖_{Γ_c} be sufficiently small, σ controls how likely it is that points are moved from the lower active set to the upper one, or vice versa, in one iteration. Smaller values of σ make it more likely that points belong to A_{f,−}^k ∩ A_{f,+}^{k+1} or A_{f,+}^k ∩ A_{f,−}^{k+1}. In the numerical realization of Algorithm SSN it turns out that choosing small values for σ may not be optimal, since this may lead to the situation that points which are active with respect to the upper bound become active with respect to the lower bound in the next iteration, and vice versa. This in turn may lead to cycling of the iterates. Such undesired behavior can be overcome by choosing larger values for σ. If the active set strategy is based on (9.2.13), one cannot take advantage of a parameter which helps prevent points from changing from A_{f,+} to A_{f,−}, or vice versa, in one iteration.

Remark 9.2.1. So far we have not remarked on the choice of λ̄ and μ̄. One possibility is to choose them according to first order augmented Lagrangian updates. In practice this results in carrying out some steps of Algorithm SSN and then updating (λ̄, μ̄) to the current values of (λ^k, μ^k).

9.2.3 Contact problem with Coulomb friction

We give a short description of how the contact problem with Coulomb friction can be treated with the methods of the previous subsection. As pointed out earlier, one of the main difficulties of such problems is related to the lack of regularity of σ_N(y) in (9.2.1). The following formulation utilizes the contact problem with given friction g ∈ H^{−1/2}(Γ_c) and a fixed point idea. We define the cone of nonnegative functionals over H^{1/2}(Γ_c) as

H^{−1/2}_+(Γ_c) := {ξ ∈ H^{−1/2}(Γ_c) : 〈ξ, η〉_{Γ_c} ≥ 0 for all η ∈ H^{1/2}(Γ_c), η ≥ 0}

and consider the mapping Ψ : H^{−1/2}_+(Γ_c) → H^{−1/2}_+(Γ_c) defined by Ψ(g) := λ_g, where λ_g is the unique multiplier for the contact condition in (9.2.4) for the problem with given friction g. Property (9.2.4b) implies that Ψ is well defined. With (9.2.6) in mind, y ∈ Y is called a weak solution of the Signorini problem with Coulomb friction if its negative normal


boundary stress −σ_N(y) is a fixed point of the mapping Ψ. In general, such a fixed point of Ψ does not exist unless F is sufficiently small; see, e.g., [EcJa, Has, HHNL].

The regularization of the Signorini contact problem with Coulomb friction that we consider here corresponds to the regularization in (P_{γ₁,γ₂}) and reflects the fact that the Lagrange multiplier for the contact condition relates to the negative stress in the normal direction. It is given by

a(y, z − y) + (max(0, λ̄ + γ₁(τ_N y − d)), τ_N(z − y))_{Γ_c} − L(z − y)
  + (1/γ₂)∫_{Γ_c} F max(0, λ̄ + γ₁(τ_N y − d)) { h(τ_T z, μ̄) − h(τ_T y, μ̄) } dx ≥ 0 (9.2.19)

for all z ∈ Y, with h(·, ·) as defined in (9.2.11). Existence for (9.2.19) is obtained by means of a fixed point argument for the regularized Tresca friction problem. For this purpose we set L²₊(Γ_c) := {ξ ∈ L²(Γ_c) : ξ ≥ 0 a.e.} and define the mapping Ψ_γ : L²₊(Γ_c) → L²₊(Γ_c) by Ψ_γ(g) := λ_γ = max(0, λ̄ + γ₁(τ_N y_γ − d)), with y_γ the unique solution of the regularized contact problem with friction g ∈ L²₊(Γ_c). In a first step, Lipschitz continuity of the mapping Φ_γ : L²₊(Γ_c) → Y, which assigns to a given friction g ∈ L²₊(Γ_c) the corresponding solution y_γ of (P_{γ₁,γ₂}), is investigated.

by !γ (g) := λγ = max(0, λ+γ1(τN yγ −d)) with yγ the unique solution of the regularizedcontact problem with friction g ∈ L2+(�c). In a first step Lipschitz continuity of the mapping�γ : L2+(�c)→ Y which assigns to a given friction g ∈ L2+(�c) the corresponding solutionyγ of (Pγ1,γ2 ) is investigated.

Lemma 9.8. For every γ1, γ2 > 0 and λ ∈ L2(�c), μ ∈ L2(�c) the mapping �γ definedabove is Lipschitz continuous with constant

L = ‖F‖L∞(�c) c1

κ, (9.2.20)

where ‖F‖∞ denotes the essential supremum of F, κ the coercivity constant of a(· , ·), and c1

the continuity constant of the trace mapping from Y to L2(�c). In particular, the Lipschitzconstant L does not depend on the regularization parameters γ1, γ2.

For the proof we refer the reader to [Sta1]. We next address properties of the map-ping !γ .

Lemma 9.9. For every γ1, γ2 > 0 and λ ∈ L2(�c), μ ∈ L2(�c) the mapping !γ :L2+(�c)→ L2+(�c) is compact and Lipschitz continuous with constant

L = cγ

κ‖F‖∞,

where c is a constant resulting from trace theorems.

Proof. We consider the following composition of mappings.

L2+(�c)�γ−→ Y

"−→ L2(�c)ϒ−→ L2+(�c),

g �→ y �→ τN y �→ max(0, λ+ γ1(τN y − d)).(9.2.21)

From Lemma 9.8 it is known that �γ is Lipschitz continuous. The mapping " consistsof the linear trace mapping from Y into H

12 (�c) and the compact embedding of this space


into L²(Γ_c). Therefore, it is compact and linear, in particular Lipschitz continuous with a constant c₂ > 0. Since

‖max(0, λ̄ + γ₁(ξ − d)) − max(0, λ̄ + γ₁(ξ̃ − d))‖_{Γ_c} ≤ γ₁‖ξ − ξ̃‖_{Γ_c} (9.2.22)

for all ξ, ξ̃ ∈ L²(Γ_c), the mapping Υ is Lipschitz continuous with constant γ₁. This implies that Ψ_γ is Lipschitz continuous with constant

L = c₁c₂γ₁ ‖F‖_∞ / κ, (9.2.23)

where c₁, c₂ are constants from trace theorems. Concerning compactness, the composition of T and Φ_γ is compact. From (9.2.22) it then follows that Ψ_γ is compact. This ends the proof.

We can now show that the regularized contact problem with Coulomb friction has a solution.

Theorem 9.10. The mapping Ψ_γ admits at least one fixed point, i.e., the regularized Coulomb friction problem (9.2.19) admits a solution. If ‖F‖_∞ is such that L as defined in (9.2.23) is smaller than 1, the solution is unique.

Proof. We apply the Leray–Schauder fixed point theorem to the mapping Ψ_γ : L²(Γ_c) → L²(Γ_c). Using Lemma 9.9, it suffices to show that λ_γ is bounded in L²(Γ_c) independently of g. This is clear taking into account the dual problem (P̄_{γ₁,γ₂}). Indeed,

min_{λ≥0, |μ|≤Fg a.e. on Γ_c} J̄_{γ₁,γ₂}(λ, μ) ≤ min_{λ≥0} J̄_{γ₁,γ₂}(λ, 0) < ∞.

Hence, the Leray–Schauder theorem guarantees the existence of a solution to the regularized Coulomb friction problem. Uniqueness of the solution holds if F is such that L is smaller than 1, since in this case Ψ_γ is a contraction.

Hence, the Leray–Schauder theorem guarantees the existence of a solution to the regularizedCoulomb friction problem. Uniqueness of the solution holds if F is such that L is smallerthan 1, since in this case !γ is a contraction.

In the following algorithm the fixed point approach is combined with an augmentedLagrangian concept to solve (9.2.19).

Algorithm ALM-FP.

1. Initialize (λ0, μ0) ∈ L2(�c)× L2(�c) and g0 ∈ L2(�c), m := 0.

2. Choose γ m1 , γ m

2 > 0 and determine the solution (λm, μm) to problem (P γ1,γ2

) with

given friction gm and λ := λm, μ := μm .

3. Update gm+1 := λm, λm+1 := λm, μm+1 := μm, and m := m + 1. Unless anappropriate stopping criterion is met, goto step 2.

The auxiliary problems (P γ1,γ2

) in step (2) can be solved by Algorithm SSN. Numericalexperiments with this algorithm are given in [KuSt]. For the following brief discussion ofthe above algorithm, we assume that the mapping! admits a fixed point λ∗ inL2(�c). Then,


with the variables y* ∈ Y and μ* ∈ L²(Γ_c) corresponding to λ*, we have for γ₁, γ₂ > 0 that

a(y*, z) − L(z) + (λ*, τ_N z)_{Γ_c} + (μ*, τ_T z)_{Γ_c} = 0 for all z ∈ Y,
λ* − max(0, λ* + γ₁(τ_N y* − d)) = 0 on Γ_c,
γ₂ τ_T y* − max(0, γ₂ τ_T y* + μ* − Fλ*) − min(0, γ₂ τ_T y* + μ* + Fλ*) = 0 on Γ_c.

Provided that Algorithm ALM-FP has a limit point, it also satisfies this system of equations; i.e., the limit satisfies the original, nonregularized contact problem with Coulomb friction.

We end the section with a numerical example taken from [KuSt].

We end the section with a numerical example taken from [KuSt].

Example 9.11. We solve the contact problem with Tresca as well as with Coulomb frictionusing Algorithms SSN and ALM-FP, respectively. We choose � = [0, 3] × [0, 1], the gapfunction d = max

(0.0015, 0.003(x1 − 1.5)2 + 0.001

)and E = 10000, ν = 0.45, and

f = 0. The boundary of possible contact and friction is �c := [0, 3] × {0}, and we assumetraction-free boundary conditions on �n := [0, 3] × {1} ∪ {0} × [0, 0.2] ∪ {3} × [0, 0.2].On �d := {0} × [0.2, 1] ∪ {3} × [0.2, 1] we prescribe the deformation as follows:

τy =

⎧⎪⎪⎨⎪⎪⎩(

0.003(1− x2)

−0.004

)on {0} × [0.2, 1],(−0.003(1− x2)

−0.004

)on {3} × [0.2, 1].

Further γ1 = γ2 = 108, σ = 1, λ = μ = 0, and Algorithm SSN is initialized byλ0 = μ0 = 0. Algorithm ALM-FP is initialized with the solution of the pure contactproblem and g0 = 0, λ0 = μ0 = 0. The MATLAB code is based on [ACFK], whichthat uses linear and bilinear finite elements for the discretization of the elasticity equationswithout friction and contact.

Figure 9.1. Deformed mesh for g ≡ 1; gray tones visualize the elastic shear energy density.

The semismooth Newton method detects the solution for given friction g ≡ 1 and F = 1 after 7 iterations. The corresponding deformed mesh and the elastic shear energy density are shown in Figure 9.1. As expected, in a neighborhood of the points (0, 0.2) and



Figure 9.2. Upper row, left: multiplier λ∗ (solid), rigid foundation (multiplied by5 · 103, dotted), and normal displacement τN y∗ (multiplied by 5 · 103, dashed). Lower row,left figure: dual variable μ∗ (solid) with bounds±Fλ∗ (dotted) and tangential displacementy∗ (multiplied by 5 · 103, dashed) for F = 2. Middle column: same as left, but with F = 5.Right column: same as first column, but with F = 10.

(3, 0.2), i.e., the points where the boundary conditions change from Neumann to Dirichlet,we observe a stress concentration due to a local singularity of the solution. Further, we alsoobserve a (small) stress concentration close to the points where the rigid foundation has thekinks.

We turn to the Coulomb friction problem and investigate the performance of Algo-rithmALM-FP. In Figure 9.2 the normal and the tangential displacement with correspondingmultipliers for F = 2, 5, 10 are depicted. One observes that the friction coefficient signif-icantly influences the deformation. For instance, in the case F = 2 the elastic body is incontact with the foundation in the interval [1.4, 1.6], but it is not for F = 5 and F = 10.These large values of F may, however, be physically of little relevance. Algorithm ALM-FP requires overall between 20 and 25 linear solves to stop with |gm−gm−1|�c

|gm|�c ≤ 10−7. Forfurther information on the numerical performance we refer the reader to [Sta1].

Page 295: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 277

Chapter 10

Parabolic VariationalInequalities

In this section we discuss the Lagrange multiplier approach to parabolic variational inequal-ities in the Hilbert space H = L2(�) which are of the type⟨

d

dty∗(t)+ Ay∗(t)− f (t), y − y∗(t)

⟩≥ 0, y∗(t)− ψ ∈ C,

y∗(0) = y0

(10.0.1)

for all y − ψ ∈ C, where the closed convex set C of H is defined by

C = {y ∈ H : y ≥ 0},

A is a closed operator on H , � denotes an open domain in Rn, and y ≥ 0 is interpreted inthe pointwise a.e. sense.

We consider the Black–Scholes model for American options, which is a variationalinequality of the form

− d

dtv(t, S)−

(σ 2

2S2 vSS + rS vS − r v + Bv

)≥ 0 ⊥ v(t, S) ≥ ψ(S),

v(T , S) = ψ(S)

(10.0.2)

for a.e. (t, S) ∈ (0, T )× (0,∞), where ⊥ denotes complementarity, i.e., a ≥ 0 ⊥ b ≥ ψ

if a ≥ 0, b ≥ ψ , and a(b−ψ) = 0. In (10.0.2) the reward function ψ(S) = (K − S)+ forthe put option and ψ(S) = (S − K)+ for the call option. Here S ≥ 0 denotes the price, vthe value of the share, r > 0 is the interest rate, σ > 0 is the volatility of the market, andK is the strike price. Further T is the maturity date. The integral operator B is defined by

Bv(S) = −λ∫ ∞

0

((z− 1)SvS + (v(t, S)− v(t, zS))

)dν(z).

Note that (10.0.2) is a backward equation with respect to the time variable. Settingy(t, S) = v(T − t, S) we arrive at (10.0.1), and (10.0.2) has the following interpretation

277

Page 296: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 278

278 Chapter 10. Parabolic Variational Inequalities

[Kou] in mathematical finance. The price process St is governed by the Ito’s stochasticdifferential equation

dSt/St− = r dt + σ dBt + (Jt − 1) dπt ,

where Bt denotes a standard Brownian motion, πt is a counting Poisson process, Jt − 1 isthe magnitude of the jump ν, and λ ≥ 0 is the rate. The value function v is represented by

v(t, S) = supτ

Et,x[e−r(τ−t)ψ(Sτ )] over all stopping times τ ≤ T . (10.0.3)

It will be shown that for the put case, (0,∞) can be replaced by � = (S,∞) withcertain S > 0. Thus, (10.0.2) can be formulated as (10.0.1) by defining a bounded bilinearform a on V × V by

a(v, φ) =∫ ∞

S

(σ 2

2S2vS + (r − σ 2)S v

)φS + (2r − σ 2) vφ − Bv φ) dS (10.0.4)

for v, φ ∈ V , where V is defined by

V ={φ ∈ H : φ is absolutely continuous on (S,∞),

∫ ∞

S

S2|φS |2 dS <∞ and φ(S)→ 0 as S →∞ and ψ(S) = 0

}equipped with

|φ|2V =∫ ∞

S

(S2|φS |2 + |φ|2) dS.

Now

a(v, φ) ≤ σ 2

2|v|V |φ|V + |r − σ 2||v|H |φ|V + |2r − σ 2||v|H |φ|H

and

a(v, v)≥ σ 2

2|v|2V +

(2r − 3

2σ 2

)|v|2H − |r − σ 2||v|V |v|H

≥ σ 2

4|v|2V +

(2r − 3

2σ 2 − (r − σ 2)2

σ 2

)|v|2H .

Note that if Av ∈ L2(�), then

(Av, φ) = a(v, φ) for all φ ∈ V.

Then, the solution v(T − t, S) satisfies (10.0.1) with f (S) = λ∫ S

S

0 ψ(zS) dν(z). Letv+ = sup(0, v). Since v+ ≥ v a.e. in �,

(−Bv, v+) ≥ (−Bv+, v+).

Page 297: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 279

279

Thus,

Av = −(σ 2

2S2 vSS + rS vS − r v + Bv

)satisfies

〈Av, v+〉 ≥ 〈Av+, v+〉.Motivated by this example we make the following assumptions. Let X be a Hilbert

space that is continuously embedded into H , and let V be a separable closed linear subspaceof X endowed with the induced norm and dense in H . Assume that

ψ ∈ X, f ∈ L2(0, T ;V ∗),and that

φ+ = sup(0, φ) ∈ V for all φ ∈ V.

The following assumptions will be used for the operator A.(1) A ∈ L(X, V ∗), i.e., there exists M such that

| 〈Ay, φ〉V ∗×V | ≤ M|y| |φ| for all y ∈ X and φ ∈ V,

and A is closed in H with

dom(A) = {y ∈ X : Ay ∈ H } ⊂ X,

where dom(A) is a Hilbert space equipped with the graph norm.(2) There exist ω > 0 and ρ ∈ R such that for all φ ∈ V

〈Aφ, φ〉 ≥ ω|φ|2V − ρ|φ|2H .

(3) For all φ ∈ V ,〈Aφ, φ+〉 ≥ 〈Aφ+, φ+〉.

(4) There exists λ ∈ H satisfying λ ≤ 0 a.e. such that

〈λ+ Aψ − f (t), φ〉 ≤ 0

for a.e. t and all φ ∈ V satisfying φ ≥ 0 a.e.(5) There exists an ψ ∈ dom(A) such that ψ − ψ ∈ V ∩ C = C.(6) Let as be the symmetric form on V × V defined by

as(y, φ) = 1

2(〈Ay, φ〉 + 〈Aφ, y〉)

for y, φ ∈ V and assume that the skew-symmetric form satisfies

1

2|〈Ay, φ〉 − 〈Aφ, y〉| ≤ M|y|V |φ|H

for a constant M independent of y, φ ∈ V .(7) (φ − γ )+ ∈ V for any γ ∈ R+ and φ ∈ V , and 〈A1, (φ − γ )+〉 ≥ 0.

Page 298: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 280

280 Chapter 10. Parabolic Variational Inequalities

Assumptions (1)–(5) apply to (10.0.2) and to second order elliptic differential oper-ators. Assumption (6) applies to the biharmonic operator �2 and to self-adjoint operators.For the biharmonic operator and systems of equations as the elasticity system, for instance,the monotone property (3) does not hold.

In this chapter we discuss (10.0.1) without assuming that V is embedded compactlyinto H . In the latter case, one can use the Aubin lemma, which states that W(0, T ) =L2(0, T ;V )∩H 1(0, T ;V ∗) is compactly embedded intoL2(0, T ;H). This ensures that theweak limit of certain approximating sequences defines the solution; see, e.g., [GLT, IK23].Instead, our analysis uses the monotone trick for variational inequalities. From, e.g., [Tan,p. 151], we recall that W(0, T ) embeds continuously into C([0, T ];H).

We commence with the definitions of strong and weak solutions to (10.0.1).

Definition 10.1 (Strong Solution). Given y0 − ψ ∈ C and f ∈ L2(0, T ;H), an X-valuedfunction y∗(t), with y∗ − ψ ∈ L2(0, T ;V ) ∩ H 1(0, T ;V ∗) is called a strong solution of(10.0.1) if y∗(0) = y0, y∗ ∈ H 1(δ, T ;H) for every δ > 0, y∗(t, x) ≥ ψ(x) a.e. in(0, T )×�, and for a.e. t ∈ (0, T ),⟨

d

dty∗(t)+ Ay∗(t)− f (t), y − y∗(t)

⟩≥ 0

for a.e. t ∈ (0, T ) and for all y − ψ ∈ C.

Defining λ∗ = − ddty∗ − Ay∗ + f (t) ∈ L2(0, T ;V ∗), we have in the distributional

sense that y∗ satisfies⎧⎪⎨⎪⎩d

dty∗(t)+ Ay∗(t)+ λ∗(t) = f (t), y∗(0) = y0,

〈λ∗(t), y − ψ〉 ≤ 0 for all y − ψ ∈ C and 〈λ∗, y∗ − ψ〉 = 0.

(10.0.5)

Moreover, in case that y∗ is a strong solution we have y∗ ∈ L2(δ, T ; dom(A)) for everyδ > 0. Further λ∗ ∈ L2(δ, T ;H) and (10.0.1) can equivalently be written as a variationalinequality in the form⎧⎪⎨⎪⎩

d

dty∗(t)+ Ay∗(t)+ λ∗(t) = f (t), y∗(0) = y0,

λ∗(t) ≤ 0, y∗(t) ≥ ψ, (y∗(t)− ψ, λ∗(t))H = 0 for a.e. t > 0.

(10.0.6)

Definition 10.2 (Weak Solution). Assume that y0 ∈ H and f ∈ L2(0, T , V ∗). Then afunction y∗ − ψ ∈ L2(0, T ;V ) satisfying y∗(t, x) ≥ ψ(x) a.e. in (0, T ) × � is called aweak solution to (10.0.1) if∫ T

0

[⟨d

dty(t), y(t)− y∗(t)〉 + 〈Ay∗(t), y(t)− y∗(t)〉 − 〈f (t), y(t)− y∗(t)

⟩]dt

+1

2|y(0)− y0|2H ≥ 0

(10.0.7)

Page 299: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 281

10.1. Strong solutions 281

is satisfied for all y − ψ ∈ K, where

K = {y ∈ W(0, T ) : y(t, x) ≥ 0 a.e. in (0, T )×�} (10.0.8)

and

W(0, T ) = L2(0, T ;V ) ∩H 1(0, T ;V ∗).

Since for y∗ and y in W(0, T )∫ T

0

⟨d

dty(t)− y∗(t), y(t)− y∗(t)

⟩dt = 1

2(|y(T )− y∗(T )|2H − |y(0)− y0|2H ),

it follows that a strong solution to (10.0.1) is also a weak solution.Let us briefly outline this chapter. Section 10.1 is devoted to proving existence and

uniqueness of strong solutions. Extra regularity of solutions is obtained in Section 10.2.Continuous dependence of the solution with respect to parameters in A is investigated inSection 10.3. Section 10.4 focuses on weak solutions obtained as the limit of approximatingdifference schemes. In Section 10.5 monotonic behavior of solutions with respect to initialconditions and the forcing function is proved. We refer to [Sey] and the literature cited therefor an introduction to numerical aspects in mathematical finance.

10.1 Strong solutionsIn this section we establish the existence of the strong solution to (10.0.1) under assumptions(1)–(5) and (1)–(2), (5)–(6), respectively.

For λ ∈ H satisfying λ ≤ 0 we consider the regularized equations of the form{ddtyc + Ayc +min(0, λ+ c (yc − ψ)) = f, c > 0,

yc(0) = y0.(10.1.1)

Proposition 10.3. If assumptions (1)–(2) hold and y0 ∈ H , f ∈ L2(0, T ;V ∗), and λ ∈ H ,then (10.1.1) has a unique solution yc satisfying yc − ψ ∈ L2(0, T ;V ) ∩H 1(0, T ;V ∗).

Proof. Existence and uniqueness of the solution to (10.1.1) follow with monotone tech-niques; see [ItKa, Lio3], for instance. Define A : V → V ∗ by

Aφ = Aφ +min(−λ, cφ).Then (10.1.1) can equivalently be expressed as

d

dtv +Av = f − λ− Aψ ∈ L2(0, T ;V ∗), (10.1.2)

with v = yc − ψ and v(0) = y0 − ψ ∈ H . We note that A is hemicontinuous, i.e.,s → 〈A(φ1 + sφ2), φ3〉 is continuous from R→ R for all φi ∈ V, i = 1, . . . , 3, and

|Aφ|V ∗ ≤ |Aφ|V ∗ + c|φ|H for all φ ∈ V,

〈Aφ1 −Aφ2, φ1 − φ2〉 ≥ ω|φ1 − φ2|2V − ρ|φ1 − φ2|2H for all φ1, φ2 ∈ V,

〈Aφ, φ〉 ≥ ω|φ|2V − ρ|φ|2H for all φ ∈ V.

Page 300: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 282

282 Chapter 10. Parabolic Variational Inequalities

It follows that (10.1.2) admits a unique solution v ∈ L2(0, T ;V )∩H 1(0, T ;V ∗) andthis gives the desired solution yc = v + ψ of (10.1.1); cf. [ItKa, Theorem 8.7], [Lio3,Theorem II.1.2].

Theorem 10.4. (1) If in addition to the assumptions in Proposition 10.3 assumptions (3)–(4) hold, y0 − ψ ∈ C, then yc(t) − ψ ∈ C and yc(t) ≥ yc(t) for c ≥ c. Moreover,yc − ψ → y∗ − ψ strongly in L2(0, T ;V ) and weakly in H 1(0, T ;V ∗) as c→∞, wherey∗ is the unique solution of (10.0.1) in the sense that y∗ −ψ ∈ K, (10.0.6) is satisfied withλ∗ ∈ L2(0, T ;H), and the estimate

1

2e−2ρt |yc(t)− y∗(t)|2H +

∫ t

0e−2ρsω |yc(s)− y∗(s)|2V ds ≤ 1

c

∫ t

0e−2ρs |λ|2 ds → 0

for t ∈ [0, T ] holds. If in addition assumption (7) is satisfied and λ ∈ L∞(�), then

|yc(t)− y∗(t)|L∞ ≤ 1

c|λ|L∞ .

(2) If assumptions (1)–(4) hold, y0 − ψ ∈ C, and f ∈ L2(0, T ;H), then y∗ is theunique strong solution to (10.0.1).

Proof. (1) From (10.1.1) it follows that yc satisfies{ddtyc + A(yc − ψ)+ λc = f − Aψ,

yc(0) = y0,(10.1.3)

where

λc = min(0, λ+ c (yc − ψ)).

If y0−ψ ∈ C, then yc(t)−ψ ∈ C. In fact, letφ = min(0, yc−ψ) = −(yc−ψ)− ∈ −C∩V .Since λ ≤ 0 it follows that⟨

d

dtyc + A(yc − ψ), φ

⟩+ 〈Aψ − f + λ+ c φ, φ〉 = 0,

where by assumptions (3) and (2)

〈A(yc − ψ), φ〉 ≥ 〈Aφ, φ〉 ≥ ω|φ|2 − ρ|φ|2Hand by (4)

〈Aψ − f (t)+ λ, φ〉 ≥ 0.

Thus,

1

2

d

dt|φ|2H ≤ ρ |φ|2H

Page 301: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 283

10.1. Strong solutions 283

and consequently

e−2ρt |φ|2H ≤ |φ(0)|2H = 0. (10.1.4)

Since

0 ≥ λc = min(0, λ+ c (yc − ψ)) ≥ λ,

we have

|λc(t)|H ≤ |λ|Hfor all t ∈ [0, T ]. From (10.1.3) we deduce that {yc} is bounded inL2(0, T ;V ). By assump-tion (1) and again by (10.1.3) it follows that {Ayc} and { d

dtyc} are bounded in L2(0, T ;V ∗).

Thus, there exist λ∗ ∈ L2(0, T ;H) satisfying λ∗ ≤ 0 a.e. and y∗ satisfying y∗ − ψ ∈ K,such that for a subsequence denoted again by c,

λc → λ∗ weakly in L2(0, T ;H),

Ayc → Ay∗ andd

dtyc → d

dty∗ weakly in L2(0, T ;V ∗)

(10.1.5)

as c→∞ . Taking the limit in (10.1.3) implies that

d

dty∗ + Ay∗ − f = −λ∗, y∗(0) = y0, (10.1.6)

with equality in the differential equation holding in the sense of L2(0, T ;V ∗).For φ = −(yc − yc)

− with c ≤ c we deduce from (10.1.3) that⟨d

dt(yc − yc)+ A(yc − yc), φ

⟩+ (λc − λc, φ) = 0,

where

(λc − λc, φ)=(

min(0, λ+ c(yc − ψ))−min(0, λ+ c(yc − ψ))

+min(0, λ+ c(yc − ψ))−min(0, λ+ c(yc − ψ)), φ) ≥ 0,

since yc ≥ ψ . Hence, using the same arguments as those leading to (10.1.4), we have|φ(t)|H = 0 and thus

yc ≥ yc for c ≤ c.

By Lebesgue dominated convergence theorem and the theorem of Beppo Levi, yc → y∗strongly to L2(0, T ;H) and pointwise a.e. in (0, T )×�. Since

0 ≥∫ T

0(λc, yc − ψ)H dt ≥ −1

c

∫ T

0|λ|2H dt → 0

as c→∞, we have ∫ T

0(λ∗, y∗ − ψ) dt = 0.

Page 302: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 284

284 Chapter 10. Parabolic Variational Inequalities

That is, (y∗, λ∗) satisfies (10.0.6), where the first equation is satisfied in the sense ofL2(0, T ;V ∗). Suppose that y ∈ K satisfies (10.0.1). Then it follows that

1

2

d

dt|y∗(t)− y(t)|2H + 〈A(y∗ − y(t)), y∗(t)− y(t)〉 ≤ 0

and thus e−2ρt |y∗(t) − y(t)|2H ≤ |y0 − y(0)|2H . This implies that y∗ is the unique solutionto (10.0.1) in K and that the whole family {(yc, λc)} converges in the sense specified in(10.1.5). From (10.0.1) and (10.1.1)⟨

d

dty∗(t)+ Ay∗(t)− f (t), yc(t)− y∗(t)

⟩≥ 0,

⟨d

dtyc(t)+ Ayc(t)− f (t), y∗(t)− yc(t)

⟩≥ (λc, yc − ψ)H .

Since yc ≥ ψ , we have (λc, yc − ψ) ≥ − 1c|λ|2H . Summing the above inequalities and

multiplying by e−2ρt give

1

2

d

dt(e−2ρt |yc(t)− y∗(t)|2H )+ e−2ρt 〈A(yc(t)− y∗(t)), yc(t)− y∗(t)〉

+ρ |yc(t)− y∗(t)|2H ≤1

ce−2ρt |λ|2,

which implies the first estimate and in particular that yy → y∗ strongly in L2(0, T ;V ).Suppose next that in addition λ ∈ L∞(�). Let k ∈ R+ and φ = (yc − y∗ − k)+. By

assumption φ ∈ V . From (10.0.6) and (10.1.1)⟨d

dty∗ + Ay∗ − f, φ

⟩≥ 0

and ⟨d

dtyc + Ayc + λc − f, φ

⟩= 0.

If k ≥ 1c|λ|L∞ , then

(λc, φ) = (min(0, λ+ c (yc − ψ)), (yc − y∗ − k)+) = 0,

where we use that yc ≥ y∗ ≥ ψ . Hence, we obtain⟨d

dt(yc − y∗ − k)+ A(yc − y∗ − k)+ Ak, φ

⟩≤ 0.

By assumption (3) and since 〈A1, φ〉 ≥ 0,

1

2

d

dt|φ|2 + 〈Aφ, φ〉 ≤ 0,

which implies the second estimate.

Page 303: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 285

10.1. Strong solutions 285

(2) Now suppose that f ∈ L2(0, T ;H) and that assumptions (1)–(4) hold. Consider(10.1.3) in the form { d

dtyc + Ayc = f − λc,

y(0) = y0.(10.1.7)

We decompose yc = yc,i + yh, where yc,i and yh are the solutions to (10.1.7) with initialcondition and forcing functions set to zero, respectively. Note that {λc} is bounded inL2(0, T ;H) uniformly with respect to c. Hence by the following lemma {Ayc,i} and { d

dtyc,i}

are bounded inL2(0, T ;H) uniformly for c > 0. Moreover, Ayh ∈ L2(δ, T ;H) and ddtyh ∈

L2(δ, T ;H) for every δ > 0. Thus yc is bounded in H 1(δ, T ;H)∩L2(δ, T ; dom(A)) andconverges weakly in H 1(δ, T ;H) ∩ L2(δ, T ; dom(A)) to y∗ for c→∞.

Lemma 10.5. Under the assumptions of the previous theorem −A generates an analyticsemigroup on H . If d

dtx + Ax = g ∈ L2(0, T ;H), with x0 = 0, then d

dtx(t) and Ax(t) ∈

L2(0, T ;H), and

|Ax|L2(0,T ;H) ≤ k |g|L2(0,T ;H),

with k independent of g ∈ L2(0, T ;H).

Proof. LetB = A−ρI . Further for u ∈ dom(A) and λ ∈ Cwith Reλ ≥ 0 set g = λu+Bu.Then, since

Re λ (u, u)H + 〈Bu, u〉 ≤ |g|H |u|H ,

assumption (2) implies that

ω|u|2V ≤ |f |H |u|H . (10.1.8)

From assumption (1) and (10.1.8)

|λ| |u|2H ≤ |g|H |u|H + M|u|2V ≤(

1+ M

ω

)|g|H |u|H ,

and thus

|u|H = |(λI + B)−1g|H ≤(

1+ M

ω

)1

|λ| |g|H . (10.1.9)

It thus follows from [ItKa, Paz, Tan] that −B and hence −A generate analytic semigroupson H related by e−B t = e (ρ−A) t .

For g ∈ L2(0,∞;H) with eρ ·g ∈ L2(0,∞;H) consider

d

dtx + Ax = g with x(0) = 0.

This is related tod

dtz+ Bz = gρ := geρ· with x(0) = 0 (10.1.10)

Page 304: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 286

286 Chapter 10. Parabolic Variational Inequalities

by z(t) = eρ tx. Taking the Laplace transform of (10.1.10), we obtain

λz+ Bz = gρ, where z =∫ ∞

z0e−λsz(s) ds,

and thus by (10.1.9)

|Bz|H ≤ |B(λI + B)−1gρ |H ≤(

2+ M

ω

)|gρ |H .

From the Fourier–Plancherel theorem we have∫ ∞

0|Az(t)|2H dt ≤

(2+ M

ω

)∫ ∞

0|gρ |2 dt.

This implies that |Ax|L2(0,T ;H) ≤ eρ T (2 + Mω)|g|L2(0,T ;H) choosing g = 0 for

t ≥ T .

To allow δ = 0 for the strong solutions in the previous theorem we let y denote thesolution to {

ddty + Ay = 0,

y(0) = y0,

and we consider { ddt(yc − y)+ A(yc − y) = f − λc,

y(0)− y(0) = 0.

Arguing as in (2) of Theorem 10.4 we obtain the following corollary.

Corollary 10.6. Under the assumptions of Theorem 10.4 we have y∗ − y ∈ H 1(0, T ;H)∩L2(0, T ; dom(A)).

Next we turn to verify existence under a different set of assumptions which, in par-ticular, does not involve the monotone assumption (3). For λ = 0 in (10.1.1), let yc denotethe corresponding solution, i.e.,

d

dtyc + Ayc + c min(0, c (yc − ψ)) = f, c > 0, (10.1.11)

which exists by Theorem 10.4 (1).

Theorem 10.7. If assumptions (1)–(2) and (5)–(6) hold, y0 − ψ ∈ C ∩ V , and f ∈L2(0, T ;H), then (10.0.1) has a unique, strong solution y∗(t) in H 1(0, T ;H) ∩L2(0, T ; dom(A)), and yc → y∗ strongly in L2(0, T ;V ) ∩ C(0, T ;H) as c→∞. More-over t → y∗(t) ∈ V is right continuous. If in addition assumptions (3)–(4) hold, thenyc ≤ yc for c ≤ c and yc(t) → y∗(t) strongly in H for each t ∈ [0, T ] and pointwisea.e. in �.

Page 305: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 287

10.1. Strong solutions 287

Proof. For λ = 0 we have

1

2

d

dt|yc − ψ |2H + 〈A(yc − ψ), yc − ψ〉 + 〈Aψ − f, yc − ψ〉 + c |(yc − ψ)−|2 = 0.

From assumptions (1)–(2) we have

|yc(t)− ψ |2H +∫ t

0(ω |yc − ψ |2V + c |(yc − ψ)−|2H ) ds

≤ |y0 − ψ |2H +∫ t

0

(2ρ |yc − ψ |2H +

1

ω|Aψ − f |2V ∗

)ds

and thus

|yc(t)− ψ |2H +∫ t

0(ω |yc − ψ |2V + c |(yc − ψ)−|2H ) ds

≤ e2ρt

(|y0 − ψ |2H +

1

ω

∫ t

0|Aψ − f |2V ∗ ds

).

(10.1.12)

With assumptions (5) and (6) holding, we have that yc ∈ H 1(0, T ;H),(d

dtyc + Ayc + c min(0, yc − ψ)− f (t),

d

dtyc

)= 0

for a.e. t ∈ (0, T ). Then∣∣∣∣ ddt yc∣∣∣∣2

H

+ d

dt(as(yc − ψ, yc − ψ)+ c|(yc − ψ)−|2) ≤ 2(M2|yc − ψ |2V + |Aψ − f |2H )

and hence

as(yc(t)− ψ, yc(t)− ψ)+ c |(yc(t)− ψ)−|2H +∫ t

0

∣∣∣∣ ddt yc∣∣∣∣2

H

ds

≤ as(y0 − ψ, y0 − ψ)+∫ t

02(M2|yc(s)− ψ |2V + |Aψ − f (s)|2H ) ds,

(10.1.13)

where we used the fact that y0 − ψ ∈ C. It thus follows from (10.1.12)–(10.1.13) that

|yc(t)− ψ |2V + c |(yc(t)− ψ)−|2H +∫ t

0

∣∣∣∣ ddt yc∣∣∣∣2

H

ds

≤ K

(|y0 − ψ |V + |ψ − ψ |V +

∫ t

0|Aψ − f (s)|2H ds

) (10.1.14)

for a constant K independent of c > 0 and t ∈ [0, T ].Hence there exists a y∗ such that y∗ − ψ ∈ H 1(0, T ;H) ∩ B(0, T ;V ) and on a

subsequence

d

dtyc → d

dty∗, Ayc → Ay∗

Page 306: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 288

288 Chapter 10. Parabolic Variational Inequalities

weakly in L2(0, T ;H) and L2(0, T ;V ∗), respectively. In particular this implies that yc →y∗ weakly in L2(0, T ;H). Above B(0, T ;V ) denotes the space of all everywhere boundedmeasurable functions from [0, T ] toV . By assumption (5) we have that y∗−ψ ∈ B(0, T ;V )

as well.Since ∫ T

0(yc − ψ, φ)H dt ≥ −

∫ T

0((yc − ψ)−, φ)H dt

for all φ ∈ L2(0, T ;H) with φ(t) ∈ C for a.e. t , and since limc→0∫ T

0 |(yc(t)−ψ)−|2H dt =0, by (10.1.14) we have∫ T

0(y∗(t)− ψ, φ)H dt ≥ 0 for all φ ∈ L2(0, T ;H) with φ(t) ∈ C. (10.1.15)

This implies that y∗(t)− ψ ∈ C for a.e. t ∈ [0, T ]. For y − ψ ∈ C

−(c(yc(t)− ψ)−, y − yc(t)) = −(c(yc(t)− ψ)−, y − ψ − (yc(t)− ψ)

) ≤ 0

for a.e. t ∈ (0, T ). It therefore follows from (10.1.11) that for y ∈ K⟨d

ds(yc(t)− y(t)+ y(t))+ A(yc(t)− y(t))+ Ay(t)− f (t), y(t)− yc(t)

⟩≥ 0.

(10.1.16)Hence ∫ t

0e−2ρs

⟨d

ds(yc(s)− y(s)), yc(s)− y(s)

⟩ds

+∫ t

0e−2ρs

⟨d

dsy(s)+ A(yc(s)− y(s)), yc(s)− y(s)

≤∫ t

0e−2ρs〈Ay(s)− f (s), y(s)− yc(s)〉 ds.

For z ∈ W(0, T ) we have∫ t

0e−2ρs

⟨d

dsz(s), z(s)

⟩ds = 1

2(e−2ρt |z(t)|2H − |z(0)|2H )+

∫ t

0ρ e−2ρs |z(s)|2H ds.

(10.1.17)This, together with (10.1.16), implies for y ∈ K

1

2(e−2ρt |yc(t)− y(t)|2H − |y0 − y(0)|2H )

+∫ t

0e−2ρs

( ⟨d

dsy(s)+ A(yc(s)− y(s)), yc(s)− y(s)

⟩+ ρ |yc(s)− y(s)|2H

)ds

≤∫ t

0e−2ρs〈Ay(s)− f (s), y(s)− yc(s)〉 ds

→∫ t

0e−2ρs〈Ay − f (s), y(s)− y∗(s)〉 ds

Page 307: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 289

10.1. Strong solutions 289

as c→∞. Since norms are w.l.s.c., we obtain

1

2(e−2ρt |y∗(t)− y(t)|2H − |y0 − y(0)|2H )

+∫ t

0e−2ρs

(⟨d

dsy(s)+ A(y∗(s)− y(s)), y∗(s)− y(s)

⟩+ ρ |y∗(s)− y(s)|2H

)ds

≤∫ t

0e−2ρs〈Ay(s)− f (s), y(s)− y∗(s)〉 ds,

or equivalently, using (10.1.17)∫ t

0e−2ρs

(⟨d

dsy∗(s)+ Ay∗(s)− f (s), y(s)− y∗(s)

⟩)ds ≥ 0 (10.1.18)

for all y ∈ K and t ∈ [0, T ]. If y(t) − ψ ∈ H 1(0, T ;H) ∩ B(0, T ;V ) also satisfies(10.1.18), then it follows from (10.1.18) that∫ t

0e−2ρs

⟨d

ds(y(s)− y∗(s))+ A(y(s)− y∗(s)), y(s)− y∗(s)

⟩ds ≤ 0.

Using (10.1.17) this implies

1

2e−2ρt |y(t)−y∗(t)|2H+

∫ t

0e−2ρs(〈A(y(s)−y∗(s)), y(s)−y∗(s)〉+ρ |y∗(s)−y(s)|2H ) ds ≤ 0

and thus y(t) = y∗(t). Hence the solution to (10.1.18) is unique. Integrating (10.1.16) on(τ, t) with 0 ≤ τ < t ≤ T we obtain with the arguments that lead to (10.1.18)∫ t

τ

e−2ρs

⟨d

dsy∗(s)+ Ay∗(s)− f (s), y(s)− y∗(s)

⟩ds ≥ 0, (10.1.19)

and thus y∗ satisfies (10.0.1).To argue that yc − ψ → y∗ − ψ strongly in L2(0, T ;V ) ∩ C(0, T ;H), note that

λc = c min(0, yc − ψ) converges weakly in L2(0, T ;V ∗) to λ∗. From (10.1.11) and(10.0.5) we have

1

2

d

dt|yc − y∗|2 + 〈A(yc − y∗), yc − y∗〉 = 〈λ∗ − λc, yc − y∗〉

≤ 〈λ∗ − λc, yc − ψ + ψ − y∗〉 ≤ 〈λ∗, yc − ψ〉 + 〈λc, y∗ − ψ〉 =: ηc,

where |ηc|L1(0,T ;R) → 0 for c→∞. By assumption (2)

1

2

d

dt|yc − y∗|2H + ω|yc − y∗|2V − ρ|yc − y∗|2H ≤ ηc,

and henced

dt[e−2ρt |yc − y∗|2H ] + ω e−2ρt |yc − y∗|2V ≤ 2e−2ρtηc,

Page 308: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 290

290 Chapter 10. Parabolic Variational Inequalities

which implies that

e−2ρt |yc(t)− y∗(t)|2H + ω

∫ t

0e−2ρs |yc(s)− y∗(s)|2V ds ≤

∫ t

0e−2ρsηc(s) ds,

and the desired convergence of yc to y∗ in L2(0, T ;V ) ∩ C(0, T ;H) follows.It remains to argue right continuity of t → y∗(t) ∈ V . From (10.1.19) it follows that∫ t

τ

⟨e−2ρs

(d

dsy∗(s)+ Ay∗(s)− f (s)

),y∗(s − h)− y∗(s)

−h⟩ds ≤ 0, (10.1.20)

where h > 0. Using a(b − a) = b2−a2

2 − 12 (a − b)2 we find

lim infh→0

1

−h∫ t

τ

e−2ρsas(y∗(s)− ψ, y∗(s − h)− y∗(s))

≥ 2ρ∫ t

τ

e−2ρsas(y∗(s)− ψ, y∗(s)− ψ) ds

+1

2e−2ρtas(y

∗(t)− ψ, y∗(t)− ψ)− 1

2e−2ρτ as(y

∗(τ )− ψ, y∗(τ )− ψ).

This estimate, together with assumption (6) and the fact that { y∗(·−h)−y∗−h }h>0 is weaklybounded in L2(0, T ;H), allows us to pass to the limit in (10.1.20) to obtain

e−2ρtas(y∗(t)− ψ, y∗(t)− ψ)− e−2ρτ as(y

∗(τ )− ψ, y∗(τ )− ψ)

+ ∫ t

τe−2ρs

∣∣ ddty∗(s)

∣∣2

Hds

≤ M

∫ t

τ

|y∗(s)− ψ |V∣∣∣∣ ddt y∗(s)

∣∣∣∣ ds + ∫ t

τ

e−2ρs |Aψ − f (s)|H∣∣∣∣ ddt y∗(s)

∣∣∣∣ ds.Consequently, we have

e−2ρtas(y∗(t)− ψ, y∗(t)− ψ) ≤ e−2ρτ as(y

∗(τ )− ψ, y∗(τ )− ψ)

+∫ t

τ

e−2ρs(M2|y∗(s)− ψ |2V + |Aψ − f (s)|2H ) ds

for 0 ≤ τ ≤ t ≤ T . This implies that

as(y∗(τ )−ψ, y∗(τ )−ψ)+|y∗(τ )−ψ |2H ≥ lim sup

t→τ

(as(y

∗(t)−ψ, y∗(t)−ψ)+|y∗(t)−ψ |2H).

Since as(φ, φ)+|φ|2H defines an equivalent norm on the Hilbert spaceV , |y∗(t)−y∗(τ )|V →0 as t ↓ τ . Hence y∗ is right continuous.

Now, in addition assumptions (3)–(4) are supposed to hold. Let

λc(t) = c min(0, yc(t)− ψ).

Then for c ≤ c and φ = (yc − yc)+

(λc − λc, φ) =((c − c)min(0, yc − ψ)+ c(min(0, yc − ψ)−min(0, yc − ψ)), φ

) ≥ 0.

Page 309: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 291

10.2. Regularity 291

Hence, using the arguments leading to (10.1.4), we have yc ≤ yc for c ≤ c. Then yc(t)→y∗(t) strongly in H and pointwise a.e. in �. Moreover,

λc(t) = c min(0, yc − ψ)→ λ∗ weakly in L2(0, T ;V ∗).Summarizing Theorems 10.4 and 10.7 we have the following corollary.

Corollary 10.8. If assumptions (1)–(6) hold, y0 − ψ ∈ C ∩ V , and f ∈ L2(0, T ;H), thenfor every t ∈ [0, T ]

yc(t) ≤ y∗(t) ≤ yc(t)

and

yc(t) ↑ y∗(t) and yc(t) ↓ y∗(t)

pointwise a.e., monotonically in � as c → ∞. Moreover, y∗ − ψ ∈ H 1(0, T ;H) ∩L2(0, T ; dom(A)) ∩ L2(0, T ;V ).

10.2 RegularityIn this section we discuss additional regularity of the solution y∗ to (10.0.1) under theassumptions of Theorem 10.4 (3). For h > 0 we have, suppressing the superscripts “∗”,

d

dt

((y(t + h)− y(t)

h

)+ A

(y(t + h)− y(t)

h

)+ λ(t + h)− λ(t)

h= f (t + h)− f (t)

h.

From (10.0.5)

(λ(t + h)− λ(t), y(t + h)− y(t)) = −(λ(t + h), y(t)− ψ)− (λ(t), y(t + h)− ψ) ≥ 0

and thus ⟨d

dt

((y(t + h)− y(t)

h

),(y(t + h)− y(t)

h

+⟨A

(y(t + h)− y(t)

h

),y(t + h)− y(t)

h

≤⟨f (t + h)− f (t)

h,y(t + h)− y(t)

h

⟩.

Multiplying this by t > 0 we obtain

d

dt

(t

2

∣∣∣∣y(t + h)− y(t)

h

∣∣∣∣2

H

)+ tω

2

∣∣∣∣y(t + h)− y(t)

h

∣∣∣∣2

V

≤ 1

2

∣∣∣∣ (y(t + h)− y(t)

h

∣∣∣∣2

+ρt∣∣∣∣y(t + h)− y(t)

h

∣∣∣∣2

H

+ t

∣∣∣∣ (f (t + h)− f (t)

h

∣∣∣∣2

V ∗.

Page 310: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 292

292 Chapter 10. Parabolic Variational Inequalities

Integrating in time,

t

∣∣∣∣y(t + h)− y(t)

h

∣∣∣∣2

H

+ ω

∫ t

0s

∣∣∣∣y(s + h)− y(s)

h

∣∣∣∣2

V

ds

≤ e2ρt∫ t

0

(∣∣∣∣y(s + h)− y(s)

h

∣∣∣∣2

H

+ s

ω

∣∣∣∣f (s + h)− f (s)

h

∣∣∣∣2

V ∗

)ds,

and letting h→ 0+, we obtain

t

∣∣∣∣ ddt y∣∣∣∣2

H

+ ω

∫ t

0s

∣∣∣∣ ddt y∣∣∣∣2

V

ds ≤ e2ρt∫ t

0

(∣∣∣∣ ddt y∣∣∣∣2

H

+ s

ω

∣∣∣∣ ddt f∣∣∣∣2

H

)ds. (10.2.1)

Hence we obtain the following theorem.

Theorem 10.9. Suppose that assumptions (1)–(5) hold and that y0 − ψ ∈ C ∩ V , f ∈L2(0, T ;H) ∩ H 1(0, T , V ∗). Then the strong solution satisfies (10.2.1) and thus y(t) ∈dom (A) for all t > 0.

The conclusion of Theorem 10.9 remains correct under the assumptions of the firstpart of Theorem 10.7, i.e., under assumptions (1)–(2) and (5)–(6), y0 − ψ ∈ C ∩ V , andf ∈ L2(0, T ;H) ∩H 1(0, T , V ∗).

10.3 Continuity of q → y(q) ∈ L∞(�)

In this section we analyze the continuous dependence of the strong solution to (10.0.1) withrespect to parameters in the operator A. Let U denote the normed linear space of parametersand let U ⊂ U be a bounded subset such that for each q ∈ U the operator A(q) satisfiesthe assumptions (1)–(5) specified at the beginning of this chapter with ρ = 0. We assumefurther that dom(A(q)) = D is independent of q ∈ U and that

q ∈ U → A(q) ∈ L(X, V ∗)

is Lipschitz continuous with Lipschitz constant κ . Let y(q) be the solution to (10.0.1)corresponding to A = A(q), q ∈ U . Then for q1, q2 ∈ U , we have⟨

d

dt(y(q1)− y(q2))+ A(q1)(y(q1)− y(q2))

+ (A(q1)− A(q2))y(q2) , y(q1)− y(q2)

⟩≤ 0,

and therefore

|y(q1)(T )− y(q2)(T )|2H + ω

∫ T

0|y(q1)− y(q2)|2V dt

≤ 1

ω

∫ T

0|(A(q2)− A(q1))y(q2)|2V ∗ dt = e(q1, q2)

2,

Page 311: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 293

10.3. Continuity of q → y(q) ∈ L∞(�) 293

where

e(q1, q2)2 ≤ κ

ω|q1 − q2|2U |y(q2)|2L2(0,T ;V ).

Since from Theorem 10.9, y(q)(T ) ∈ D for q ∈ U , it follows by interpolation that

|y(q1)(T )− y(q2)(T )|Wα ≤ Ce(q1, q2)1−α (10.3.1)

where Wα = [H,D]α is the interpolation space between D and H (see, e.g., [Fat, Chapter8]), and C is an embedding constant. If L∞(�) ⊂ Wα, with α ∈ (α0, 1) for some α0, thenHölder’s continuity of q ∈ U → y(q)(T ) ∈ L∞(�) follows.

Next we prove Lipschitz continuity of q ∈ U → y(q) ∈ L∞(0, T ;L∞(�)). Someprerequisites are established first. We assume that A(q) generates an analytic semigroupS(t) = S(t; q) on H for every q ∈ U [Paz]. Then for each q ∈ U there exists M such that

‖AαS(t)‖ ≤ M

tαfor all t > 0, (10.3.2)

where Aα denote the fractional powers of A, with α ∈ (0, 1). We assume that M isindependent of q ∈ U . We shall further assume that

dom(A12 ) = V (10.3.3)

for all q ∈ U , which is the case for a large class of second order elliptic differential operators;see, e.g., [Fat, Chapter 8]. We assume that

‖A(q)‖L(V,V )∗ ≤ ω for all q ∈ U . (10.3.4)

Let r > 2 be such that

V ⊂ Lr(�),

and let M denote the embedding constant so that

|ζ |Lr (�) ≤ M|ζ |V for all ζ ∈ V. (10.3.5)

For

p = 2r

r − 2∈ (2,∞),

we shall utilize the assumption

|A− 12 (q1)(A(q1)− A(q2))y|Lp(�) ≤ κ|q1 − q2|U |A(q2)

αy|Hfor some α ∈ (0, 1) and all q1, q2 ∈ U , y ∈ D.

(10.3.6)

This assumption is applicable, for example, if the parameter enters as a constant intothe leading differential term of A(q) or if it enters into the lower order terms.

Page 312: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 294

294 Chapter 10. Parabolic Variational Inequalities

Theorem 10.10. Let A(q) generate an analytic semigroup for every q ∈ U , and letassumptions (1)–(5) and (7) hold. If further (10.3.3)–(10.3.6) are satisfied and f ∈L∞(0, T ;H), y0 ∈ C ∩ D, then q → y(q) is Lipschitz continuous from U → L∞(0, T ;L∞(�)).

Proof. (1) Let q ∈ U and A = A(q). Let f 1, f 2 ∈ L∞(0, T ;H) with

A−12 f i ∈ L∞(0, T ;Lp(�)), i = 1, 2,

and let y1, y2 denote the corresponding strong solutions to (10.0.1). For k > 0 let

φk = max(0, y1 − y2 − k)

and

�k = {x ∈ � : φk > 0}.By assumption φk ∈ V and φk ≥ 0. Note that

(min(0, λ+ c (z1 − ψ))−min(0, λ+ c (z2 − ψ)), (z1 − z2 − k)+) ≥ 0

for all z1, z2 ∈ H and k > 0. Thus, it follows from (10.0.5) and Theorem 10.4 that forφk = (y1 − y2 − k)+ ∈ V⟨

d

dt(y1 − y2), φk

⟩+ 〈A(y1 − y2), φk〉 ≤ (f 1 − f 2, φk) (10.3.7)

for a.e. t ∈ (0, T ). By assumption we have

〈Aζ, (ζ − k)+) ≥ ω|(ζ − k)+|2Vfor ζ ∈ V and hence it follows from (10.3.7) that

ω

∫ T

0|φk|2V dt ≤

∫ T

0|(f, φk)| dt ≤ ‖A(q)‖L(V .V ∗)

∫ T

0|A− 1

2 f |L2(�k)|φk|V dt,

where φk(0) = 0, f = f 1 − f 2, and thus for any β > 1

ω

(∫ T

0|φk|2V dt

) 12

≤ ω

(∫ T

0|A− 1

2 f |2L2(�k)dt

) 12

≤ ω C1−β(∫ T

0|A− 1

2 f |2L2(�k)dt

) β

2

(10.3.8)

with

C =(∫ T

0|A− 1

2 f |2L2 dt

) 12

.

Page 313: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 295

10.3. Continuity of q → y(q) ∈ L∞(�) 295

For p > q = 2 we have∫�k

|A− 12 f |q dx ≤

(∫�k

|A− 12 f |p

)q/p

|�k|(p−q)/p. (10.3.9)

We denote by h and k arbitrary real numbers satisfying 0 < k < h < ∞ and we findfor r > 2

|φk|rLr ≥∫�k

|φ − k|r dx >

∫�h

|φ − k|r ds ≥ |�h||h− k|r . (10.3.10)

It thus follows from (10.3.5) and (10.3.8)–(10.3.10) that for β > 1

(∫ T

0|�h| 2

r dt

) 12

≤ M

|h− k|(∫ T

0|φk|2V

) 12

≤ ωMC1−β

ω|h− k|(∫ T

0|A− 1

2 f |2L2(�k)dt

) β

2

≤ ωMC1−β

ω|h− k|(∫ T

0|A− 1

2 f |2Lp |�k|p−2p dt

) β

2

.

For 1P+ 1

Q= 1 this implies that

(∫ T

0|�h| 2

r dt

) 12

≤ ωMC1−β

ω|h− k|(∫ T

0|A− 1

2 f |2PLp dt

) β

2P(∫ T

0|�k|

Q(p−2)p dt

) β

2Q

.

For P = ∞ and Q = 1 this implies, using that p = 2rr−2 ,

(∫ T

0|�h| 2

r dt

) 12

≤ K

|h− k|(∫ T

0|�k| 2

r dt

) β

2

, (10.3.11)

where K = ωMC1−βω

|A− 12 f |βL∞(0,T ;Lp(�)).

Now, we use the following fact [Tr]: Let ϕ : (k1, h1) → R be a nonnegative,nonincreasing function and suppose that there are positive constants K , s, and β > 1such that

ϕ(h) ≤ K(h− k)−sϕ(k)β for k1 < k < h < h1.

Then, if k = K1s 2

β

β−1 ϕ(k1)β−1r satisfies k1 + k < h1, it follows that ϕ(k1 + k) = 0. Here

we set

ϕ(k) =(∫ T

0|�k| r2 dt

) 12

on (0,∞), s = 1, β > 1, and

k1 = supt∈(0,T )

|A− 12 (f 1(t)− f 2(t))|Lp(�).

Page 314: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 296

296 Chapter 10. Parabolic Variational Inequalities

It then follows from (10.3.8)–(10.3.10), as in the computation below (10.3.10), that

ϕ(k1) ≤ MωC

ωk1.

From the definition of k we have in case C ≥ k1

k ≤ 2β

β−1

(ωM

ω

C1−βkβ1 Cβ−1k

1−β1 = 2

β

β−1

(ωM

ω

k1.

The same estimate can also be obtained in the case that C ≤ k1, and consequently k1+ k ≤� k1, where � = 1 + 2

β

β−1 ( ωMω)β . Hence we obtain y1 − y2 ≤ �k1 a.e. in (0, T ) × �.

Analogously a uniform lower bound for y1 − y2 is obtained by using φk = min(0, y1 −y2 − k) ≤ 0 and thus

|y1 − y2|L∞(0,T ;L∞(�)) ≤ � supt∈(0,T )

|A− 12 (f 1(t)− f 2(t))|Lp(�). (10.3.12)

(2) We use the estimate of step (1) to obtain Lipschitz continuous dependence of thesolution on the parameter q. Let q1, q2 ∈ U with corresponding solutions y(q1) and y(q2).Since

d

dty(q2)+ A(q1)y(q2)+ (A(q2)− A(q1))y(q2)+ λ(q2) = f (t),

y(q2) is the solution to (10.0.1) with A = A(q1) and f (t) = f − (A(q2)− A(q1))y(q2) ∈L2(0, T ;H). Hence we can apply the estimate of (1) with A = A(q1) and f 1 − f 2 =(A(q2)− A(q1))y(q2) and obtain

|y1 − y2|L∞(0,T ;L∞(�)) ≤ � supt∈(0,T )

|A(q1)− 1

2 (A(q1)− A(q2))y(t; q2)|Lp(�).

Utilizing (10.3.6) this implies that

|y1 − y2|L∞(0,T ;L∞(�)) ≤ � κ |q1 − q2|U supt∈(0,T )

|Aα(q2)y(t; q2)|H . (10.3.13)

To estimate Aα(q2)y(t; q2) recall from Theorem 10.4 that λ ≤ λ(t; q) ≤ 0 and thus {f −λ(q) : q ∈ U} is uniformly bounded in L∞(0, T ;H). From (10.0.6) we have that

A(q2)αy(t; q2) = A(q2)

αS(t; q2)y0 +∫ t

0A(q2)

αS(t − s; q2)(f (s)− λ(s; q2)) ds

∈ L∞(0, T ;H).

From (10.3.2) and since y0 ∈ D it follows that {A(q2)αy(q2) : q2 ∈ U} is bounded in

L∞(0, T ;H) as desired.

Page 315: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 297

10.4. Difference schemes and weak solutions 297

10.4 Difference schemes and weak solutionsIn this section we establish existence of weak solutions to (10.0.1) based on finite differenceschemes (10.4.1). This difference approximation is also used to establish uniqueness ofthe weak solution and for proving monotonicity properties of the solution in the followingsection. For the sake of the simplicity of presentation we assume that ρ = 0. Consequently〈Aφ, φ〉 = as(φ, φ) defines an equivalent norm on V . For h > 0 consider the discretized(in time) variational inequality: Find yk − ψ ∈ C, k = 1, . . . , N, satisfying⟨

yk − yk−1

h+ Ayk − f k, y − yk

⟩≥ 0, y0 = y0 (10.4.1)

for all y − ψ ∈ C, where

f k = 1

h

∫ kh

(k−1)hf (t) dt

and Nh = T . Throughout this section we assume that y0 ∈ H and f ∈ L2(0, T ;V ∗).

Theorem 10.11. Assume that y0 ∈ H and f ∈ L2(0, T , V ∗) and that assumptions (1)–(2)hold. Then there exists a unique solution {yk}Nk=1 to (10.4.1).

Proof. To establish existence of solutions to (10.4.1), we proceed by induction with respectto k and assume that existence has been proven up to k − 1. To verify the induction stepconsider the regularized problems

ykc − yk−1

h+ Ayk

c + c min(0, ykc − ψ)− f k = 0. (10.4.2)

Sincey ∈ H → min c (0, y − ψ)

is Lipschitz continuous and monotone, the operator B : V → V ∗ defined by

B(y) = y

h+ Ay + c min(0, y − ψ)

is coercive, monotone, and continuous for all h > 0. Hence by the theory of maximalmonotone operators (10.4.2) admits a unique solution; cf., e.g., [ItKa, Chapter I.5], [Ba,Chapter II.1]. For each c > 0 and k = 1, . . . , N we find

1

2h(|yk

c − ψ |2H − |yk−1 − ψ |2H + |ykc − yk−1|2H )

+〈A(ykc − ψ)+ Aψ − f k, yk

c − ψ〉 + c |(ykc − ψ)−|2H = 0.

Thus the families |ykc − ψ |2V and c |(yk

c − ψ)−|2H are bounded in c > 0 and there existsa subsequence of {yk

c − ψ} that converges to some yk − ψ weakly in V as c → ∞. As

Page 316: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 298

298 Chapter 10. Parabolic Variational Inequalities

argued in the proof of Theorem 10.7 (cf. (10.1.15)), yk − ψ ∈ C and hence yk − ψ ∈ C.Note that

(−(ykc−ψ)−, y−yk

c ) = (−(ykc−ψ)−, y−ψ−(yk

c−ψ)) ≤ 0 for all y−ψ ∈ C (10.4.3)

and

lim infc→∞ 〈Ayk

c , ykc − y〉

= lim infc→∞

(〈A(ykc − ψ), yk

c − ψ〉 + 〈A(ykc − ψ),ψ − y〉 + 〈Aψ, yk

c − ψ〉)≥ 〈A(yk − ψ), yk − ψ〉 + 〈A(yk − ψ),ψ − y〉 + 〈Aψ, yk − ψ〉

= 〈Ayk, yk − y〉.

(10.4.4)

Passing to the limit in (10.4.2) utilizing (10.4.3) and (10.4.4) we obtain⟨yk − yk−1

h+ Ayk − f k, yk − y

⟩≤ lim inf

c→∞

(⟨ykc − yk−1

h, yk

c − y

⟩+ 〈Ayk

c , ykc − y〉 − 〈f k, yk

c − y〉)≤ 0,

and hence yk satisfies (10.4.1).To verify uniqueness, let yk be another solution to (10.4.1). Then, from (10.4.1),⟨

(yk − yk−1)− (yk−1 − yk−1)

h+ A(yk − yk), yk − yk

⟩≤ 0

and thus1

2h|yk − yk|2H + 〈A(yk − yk), yk − yk〉 ≤ 1

2h|yk−1 − yk−1|2H .

Since y0 = y0 = y0, this implies that yk = yk for all k ≥ 1.

Next we discuss existence and uniqueness of weak solutions to (10.0.1) by passing tothe limit in the piecewise defined functions

y(1)h = yk + t − k h

h(yk+1 − yk), y

(2)h = yk+1 on (k h, (k + 1) h] (10.4.5)

for k = 0, . . . , N − 1.

Theorem 10.12. Suppose that the assumptions of Theorem 10.11 hold. Then there existsa unique weak solution y∗ of (10.0.1). Moreover t → y∗(t) ∈ H is right continuous,y∗ ∈ B(0, T ;H), and y

(2)h − ψ → y∗ − ψ strongly in L2(0; T ;V ).

Page 317: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 299

10.4. Difference schemes and weak solutions 299

Proof. Setting y = ψ in (10.4.1), we obtain

|yk − ψ |2H − |yk−1 − ψ |2H + |yk − yk−1|2H + hω |yk − ψ |2V ≤h

ω|Aψ − f k|2V ∗ .

Thus,

|ym−ψ |2H +m∑

k=�+1

(|yk − yk−1|2H +ωh |yk −ψ |2V ) ≤ |y�−ψ |2H +1

ω

m∑k=�+1

|Aψ − f k|2V ∗ h(10.4.6)

for all 0 ≤ � < m ≤ N . Since∫ T

0|y(1)

h − y(2)h |2H h ≤ h

3

N∑k=1

|yk − yk−1|2H h→ 0 as h→ 0+.

From the above estimates it follows that there exist subsequences of y(1)h , y

(2)h (denoted by

the same symbols) and y∗(t) ∈ L2(0, T ;V ) such that

y(1)h (t), y

(2)h (t)→ y∗(t) weakly in L2(0, T ;V ) as h→ 0+. (10.4.7)

Note that

d

dty(1)h = yk+1 − yk

hon (k h, (k + 1) h].

Thus, we have from (10.4.1) for every y ∈ K⟨d

dty + Ay

(2)h − fh, y − y

(2)h

⟩+

⟨d

dty(1)h − d

dty, y − y

(2)h

⟩≥ 0 (10.4.8)

a.e. in (0, T ). Here⟨d

dt(y

(1)h − y), y − y

(2)h

⟩=

⟨d

dty(1)h − d

dty, y − y

(1)h

⟩+

⟨d

dty(1)h − d

dty, y

(1)h − y

(2)h

⟩(10.4.9)

with ∫ T

0

⟨d

dty(1)h − d

dty, y − y

(1)h

⟩dt ≤ 1

2|y(0)− y0|2H (10.4.10)

and ∫ T

0

(d

dty(1)h , y

(1)h − y

(2)h

)dt = −1

2

N∑k=1

|yk − yk−1|2H . (10.4.11)

Since ∫ T

0〈Ay∗, y∗ − y〉 dt ≤ lim inf

h→0+

∫ T

0〈Ay(2)

h , y(2)h − y〉 dt,

Page 318: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 300

300 Chapter 10. Parabolic Variational Inequalities

which can be argued as in (10.4.4), it follows from (10.4.7)–(10.4.11) that every weak clusterpoint y∗ of y(2)

h satisfies∫ T

0

⟨d

dty + Ay∗ − f, y − y∗

⟩dt + 1

2|y(0)− y0|2H ≥ 0 (10.4.12)

for all y ∈ K and a.e. t ∈ (0, T ). Hence y∗ ∈ L2(0, T ;V ) is a weak solution of (10.0.1)and y∗ ∈ B(0, T ;H).

Moreover, from (10.4.6)

|y∗(t)− ψ |2H ≤ |y∗(τ )− ψ |2H +1

ω

∫ t

τ

|Aψ − f (s)|2V ∗ ds

for all 0 ≤ τ ≤ t ≤ T . Thus,

lim supt↓τ

|y∗(t)− ψ |2H ≤ |y∗(τ )− ψ |2H .

which implies that t → y∗(t) ∈ H is right continuous.Let y∗ be a weak solution. Setting y = y

(1)h ∈ K in (10.4.12) and y = y∗(t) in

(10.4.8) we have ∫ T

0

⟨d

dty(1)h + Ay∗ − f, y

(1)h − y∗

⟩dt ≥ 0, (10.4.13)

∫ T

0

⟨d

dty(1)h + Ay

(2)h − fh, y

∗ − y(2)h

⟩dt ≥ 0, (10.4.14)

where we used that ∫ T

0

⟨d

dt(y

(1)h − y∗), y∗ − y

(2)h

⟩dt ≤ 0

from (10.4.9)–(10.4.11).Summing up (10.4.13) and (10.4.14) and using (10.4.11) implies that∫ T

0

(〈Ay∗, y(1)h − y

(2)h 〉 − 〈f, y(1)

h − y∗〉 − 〈fh, y∗ − y

(2)h 〉

)dt

≥ 1

2

N∑k=1

|yk − yk−1|2H +∫ T

0〈A(y∗ − y

(2)h ), y∗ − y

(2)h 〉 dt.

Letting h → 0+ we obtain 0 ≥ 〈A(y∗(t) − y(t)), y∗(t) − y(t)〉 a.e. on (0, T ) for everyweak cluster point y of y(2)

h in L2(0, T ;V ). This implies that the weak solution is uniqueand that ∫ T

0〈A(y∗ − y

(2)h ), y∗ − y

(2)h 〉 dt → 0

as h→ 0+.

Page 319: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 301

10.4. Difference schemes and weak solutions 301

Corollary 10.13. Let y = y(y0, y) denote the weak solution to (10.0.1), given y0 ∈ H andf ∈ L2(0, T ;V ∗). Then for all t ∈ [0, T ]

|y(y0, f )(t)− y(y0, f )(t)|H + ω

∫ T

0|y(y0, f )− y(y0, f )|2V ds

≤ |y0 − y0|2H +1

ω

∫ t

0|f − f |2V ∗ ds.

Proof. Let yk and yk be the solution to (10.4.1) corresponding to (y0, f ) and (y0, f ). Itthen follows from (10.4.1) that⟨

(yk − yk)− (yk−1 − yk−1)

h+ A(yk − yk)− (f k − f k), yk − yk

⟩≤ 0.

Thus,

|yk − yk|2H + ωh |yk − yk|2V ≤ |yk−1 − yk−1|2H +h

ω|f k − f k|2V ∗ .

Summing this in k, we have

|ym − ym|2H + ω

m∑k=1

h |yk − yk|2V ≤ |y0 − y0|2H +1

ω

m∑k=1

h |f k − f k|2V ∗ ,

which implies the desired estimate by letting h→ 0+.

Corollary 10.14. Let λ ∈ H satisfy λ ≤ 0 and let yc ∈ W(0, T ) be the solution to

d

dtyc(t)+ Ayc(t)+min(0, λ+ c (yc − ψ)) = f. (10.4.15)

Then yc → y∗ weakly in L2(0, T ;V ) and yc(T )→ y∗(T ) weakly in H as c→∞, wherey∗ is the unique weak solution to (10.0.1). In addition, if y∗ ∈ W(0, T ), then

|yc − y∗|L2(0,T ;V ) + |yc − y∗|L∞(0,T ;H) → 0

as c→∞.

Proof. Note that

(min(0, λ+ c (yc − ψ)), yc − ψ) ≥ c

2|(yc − ψ)−|2H −

1

2c|λ|2H .

Thus, we have

1

2

d

dt|yc − ψ |2H + 〈A(y − ψ), yc − ψ〉 + c |(yc − ψ)−|2H ≤ 〈f − Aψ, yc − ψ〉 + 1

c|λ|2

and

|yc(t)−ψ |2H + ω

∫ t

0(|yc −ψ |2V + c |(yc −ψ)2

H ) ds ≤ 1

ω

∫ t

0

(|f − Aψ |2V ∗ +

1

c|λ|2

)ds.

Page 320: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 302

302 Chapter 10. Parabolic Variational Inequalities

Hence∫ T

0 |yc − ψ |2H dt → 0 as c→ 0 and {yc − ψ}c≥1 is bounded is L2(0, T ;V ). Usingthe same arguments as in the proof of Theorem 10.7, there exist y∗ and a subsequence of{yc − ψ}c≥1 that converges weakly to y∗ − ψ ∈ L2(0, T ;V ), and y∗ − ψ ≥ 0 a.e. in(0, T )×�. For y(t) ∈ K∫ T

0

[⟨d

dty(t)− d

dt(y(t)− yc), y(t)− yc(t)

⟩+ 〈Ayc(t)− f (t), y(t)− yc(t)〉

+(min(0, λ+ c (yc − ψ)), y(t)− ψ − (yc − ψ))

]dt = 0,

where∫ T

0

⟨− d

dt(y(t)− yc), y(t)− yc(t)

⟩dt = 1

2(|y(0)− y0|2H −|y(T )− yc(T )|2), (10.4.16)

(min(0, λ+ c (yc − ψ)), y(t)− ψ − (yc − ψ)) ≤ 1

2c|λ|2H . (10.4.17)

Hence, we have∫ T

0[⟨d

dty(t), y(t)− yc(t)

⟩dt + 〈Ayc(t)− f (t), y(t)− yc(t)〉 + 1

2|y(0)− y0|2H

≥ 1

2|y(T )− yc(T )|2H −

1

2c

∫ T

0|λ|2H ds.

Letting c→∞, y∗ satisfies (10.0.7) and thus y∗ is the weak solution of (10.0.1).Suppose that y∗ ∈ W(0, T ). Then by (10.4.15),⟨

d

dt(yc − y∗)+ A(yc − y∗)+ d

dty∗ + Ay∗ − f, y∗ − yc

⟩+(min(0, λ+ c (yc − ψ)), y∗(t)− ψ − (yc − ψ)) = 0.

From (10.4.16)–(10.4.17), and since yc → y∗ weakly in L2(0, T ;V ),

1

2|yc(t)− y∗(t)|2H +

∫ t

0〈A(yc − y∗), yc − y∗〉 ds → 0,

and this convergence is uniform with respect to t ∈ [0, T ].

10.5 Monotone propertyIn this section we establish monotone properties of the weak solution to (10.0.1). As in theprevious section assumption (2) is used with ρ = 0.

Corollary 10.15. If assumptions (1)–(3) hold, then

y(y0, f ) ≥ y(y0, f )

Page 321: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 303

10.5. Monotone property 303

provided that y0 ≥ y0 and f ≥ f , with y0, y0 ∈ H and f, f ∈ L2(0, T ;V ∗). As aconsequence, if y0 = ψ and f (t) ≥ f (s) for a.e. t > s, then the weak solution satisfiesy(t) ≥ y(s) for t ≥ s.

Proof. Assume that yk−1 ≥ yk−1. For φ = −(ykc − yk

c )− it follows from (10.4.2) that(

ykc − yk

c − (yk−1 − yk−1)

h, φ

)+ 〈A(yk

c − ykc )− (f k − f k), φ〉 − c

((yk

c − ψ)−

−(ykc − ψ)−, φ

) = 0.

Since ⟨−yk−1 − yk−1

h− (f k − f k), φ

⟩− c

((yk

c − ψ)− − (ykc − ψ)−, φ

) ≥ 0,

we have from assumption (3) that

|φ|2Hh

+ 〈Aφ, φ〉 ≤ 0

and thus ykc − yk

c ≥ 0 for sufficiently small h > 0. From the proof of Theorem 10.11 itfollows that we can take the limit with respect to c and obtain yk − yk ≥ 0. By inductionthis holds for all k ≥ 0. The first claim of the theorem now follows from (10.4.2) andTheorem 10.12. The second one follows from

y(t;ψ, f (·)) = y(s; y(t − s;ψ, f ), f (· + t − s)) ≥ y(s;ψ, f (·)).

Corollary 10.16. Let assumptions (1)–(3) hold and suppose that the stationary variationalinequality

〈Ay − f, φ − y〉 ≥ 0 for all φ − ψ ∈ C (10.5.1)

with f ∈ V ∗ has a solution y−ψ ∈ C. Then if y(0) = ψ and f (t) = f , we have y(t) ↑ y,where y is the minimum solution to (10.5.1).

Proof. Suppose y is a solution to (10.5.1). Since y(t) := y, t ≥ 0, is also the uniquesolution to (10.0.1) with y0 = y ≥ ψ , it follows from Corollary 10.16 that y(t) ≤ y for allt ∈ [0, T ]. On the other hand, it follows from Theorem 10.4 (2) that⟨

y(τ + 1)− y(τ)+∫ τ+1

τ

(Ay(s)− f ) ds, φ − y(τ)

⟩≥ 0 for all φ − ψ ∈ C (10.5.2)

since y(s) ≥ y(τ), s ≥ τ . By the Lebesgue monotone convergence theorem y =limτ→∞ y(τ) ∈ C exists. Letting τ →∞ in (10.5.2), we obtain that y satisfies (10.5.1) andthus y ≤ y.

Page 322: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 304

304 Chapter 10. Parabolic Variational Inequalities

Corollary 10.17 (Perturbation). Let assumptions (1)–(3) hold, and let ψ1, ψ2 ∈ H andf ∈ L2(0, T ;V ∗). Denote by y1

c and y2c the solutions to (10.1.1) with y0 equal to ψ1 and

ψ2, respectively, and let y1 and y2 be the corresponding weak solutions to (10.0.1). Assumethat (φ − γ )+ ∈ V for any γ ∈ R+, that φ ∈ V , and that 〈A1, (φ − γ )+〉 ≥ 0. Then forα = max(0, supx∈� (ψ1 − ψ2)) and β = min(0, inf x∈� (ψ1 − ψ2)) we have

β ≤ y1c − y2

c ≤ α,

β ≤ y1 − y2 ≤ α.

Proof. Note that

c(y1c − y2

c − α) = λ+ c(y1c − ψ1)− (λ+ c(y2

c − ψ2))− c(ψ1 − ψ2 − α).

Thus, an elementary calculation shows that

(min(0, λ+ c(y1c − ψ1))−min(0, λ+ c(y2

c − ψ2)), (y1c − y2

c − α)+) ≥ 0.

As in the proof of Theorem 10.4 it follows that φ = (yc − y2c − α)+ satisfies

1

2

d

dt|φ|2H ≤ ρ|φ|2H .

Since φ(0) = 0, this implies that φ(t) = 0, t ≥ 0, and thus y1c −y2

c ≤ α a.e. on (0, T )×�.Similarly, letting φ = (y1

c − y2c − β)− we obtain y1

c − y2c ≥ β a.e. on (0, T ) × �. Since

from Corollary 10.14, y1c → y1 and y2

c → y2 weakly in L2(0, T ;H) for c→∞, we obtainthe desired estimates.

Page 323: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 305

Chapter 11

Shape Optimization

11.1 Problem statement and generalitiesWe apply the framework that we developed in Section 5.5 for weakly singular problems tocalculate the shape derivative of constrained optimization problems of the form{

min J (y,�, �)

subject to e(y,�) = 0(11.1.1)

over admissible domains � and manifolds �. Here e denotes a partial differential equationdepending on the state variable y and the domain �, and J stands for a cost functionaldepending, besides y and �, on �, which constitutes the variable part of the boundary of �.

We consider as an example the problem of determining an unknown interior domainD ⊂ U in an elliptic equation{

�y = 0, in � = U \D,

y = 0 on ∂D

from the Cauchy data

y = f,∂y

∂n= g on the outer boundary ∂U.

This problem can be formulated as the shape optimization problem⎧⎪⎪⎨⎪⎪⎩min

∫∂U|y − f |2 ds

subject to

�y = 0 in � and y = 0 on ∂D, ∂∂ny = g on ∂U,

and it is a special case by the theory to be presented in Section 11.2.Shape sensitivity calculus has been widely analyzed in the past and is covered in the

well-known lecture notes [MuSi] and in the monographs [DeZo, HaMä, HaNe, SoZo, Zo],

305

Page 324: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 306

306 Chapter 11. Shape Optimization

for example. The most commonly used approach relies on differentiating the reducedfunctional J (�) = J (y(�),�, �(�)) using the chain rule. As a consequence shape differ-entiability of y with respect to variations of the domain are essential in this method. In analternative approach the partial differential equation is realized in a Lagrangian formulation;see [DeZo], for example.

The method for computing the shape derivative that we describe here is quite differentand elementary. In short, it can be described as follows. First we embed e(y,�) = 0 to anequation in a fixed domain �0 by a coordinate transformation which is called the methodof mapping. Then we combine the Lagrange multiplier method to realize the constrainte(y,�) = 0, using the shape derivative of functionals to calculate the shape derivative ofJ (�). In this process, differentiability of the state with respect to the geometric quantitiesis not used. In fact, we require only Hölder continuity with exponent greater than or equalto 1

2 of y with respect to the geometric data. We refer the reader to [IKPe2] for an examplein which the reduced cost functional is shape differentiable whereas the state variable of theconstraining partial differential equation is not.

For comparison we briefly discuss an example using the “chain rule” approach. Butfirst we require some notions from shape calculus, which we introduce only formally here.Let � be a reference domain in Rd , and let h : U → Rd , with � ⊂ U , denote a mappingdescribing perturbations of � by means of

�t = Ft(�),

where Ft : �→ Rd is a perturbation of the identity, given by Ft = id + th, for example.Let {zt : �t → R|t ≤ τ }, for some τ > 0, denote a family of mappings, and consider theassociated family zt = zt ◦ Ft : � → R which transports zt back to �. Then the shapederivative of {zt } at � in direction h is defined as

z′(x) = limt→0

1

t(zt (x)− z0(x)) for x ∈ �,

and the material derivative is given by

z(x) = limt→0

1

t(zt (x)− z0(x)) for x ∈ �.

Under appropriate regularity conditions we have

z′ = z− ∇z · F0.

For a functional �→ J (�) the shape derivative at � with respect to the perturbation h isdefined as

J ′(�)h = limt→0

1

t

(J (�t)− J (�)

).

Consider now the cost functional

min J (y,�, �) = 1

2

∫�

y2d � (11.1.2)

Page 325: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 307

11.1. Problem statement and generalities 307

subject to the constraint e(y,�) = 0 which is given by the mixed boundary value problem

−�y = f in �, (11.1.3)

y = 0 on �0, (11.1.4)

∂y

∂n= g on �. (11.1.5)

Here the boundary of the domain � is the disjoint union of a fixed part �0 and the unknownpart�, and f and g are given functions. Aformal differentiation leads to the shape derivativeof the reduced cost functional

J ′(�)h =∫�

yy ′� d� + 1

2

∫�

(∂y2

∂n+ κy2

)h · n d�, (11.1.6)

where y ′� denotes the shape derivative of the solution y of (11.1.3) at � with respect toa deformation field h, and κ stands for the curvature of �. For a thorough discussion ofthe details we refer the reader to [DeZo, SoZo]. Differentiating formally the constrainte(y,�) = 0 with respect to the domain, one obtains that y ′� satisfies

−�y ′� = 0 in �,

y ′� = 0 on �0, (11.1.7)

∂y ′�∂n

= div�(h · n∇�y)+(f + ∂g

∂n+ κg

)h · n on �,

where div� , ∇� stand for the tangential divergence and tangential gradient, respectively, onthe boundary �. Introducing a suitably defined adjoint variable and using (11.1.7), the firstterm on the right-hand side of (11.1.6) can be manipulated in such a way that J ′(�)h can berepresented in the form required by the Zolesio–Hadamard structure theorem (see [DeZo])

J ′(�)h =∫�

Gh · n d�.

We emphasize that the kernelG does not involve the shape derivative y ′� anymore. Althoughy ′� is only an intermediate quantity, a rigorous analysis requires justifying the formal steps inthe preceding discussion. In addition one has to verify that the solution of (11.1.7) actuallyis the shape derivative of y in the sense of the definition in, e.g., [SoZo]. Furthermore, sincethe trace of y ′� on �0 is used in (11.1.7) one needs y ′� ∈ H 1(�). However, y ∈ H 2(�) is notsufficient to allow for an interpretation of the Neumann condition in (11.1.7) in H−1/2(�).Hence y ′� ∈ H 1(�) requires more regularity of the solution y of (11.1.3) than H 2(�). Inthe approach of this chapter we utilize only y ∈ H 2(�) for the characterization of the shapederivative of J (�). We return to this example in Section 11.3.

In Section 11.2 we present the proposed general framework to compute the shapederivative for (11.1.1). Section 11.3 contains applications to shape optimization constrainedby linear elliptic systems, inverse interface problems, the Bernoulli problem, and shapeoptimization for the Navier–Stokes equations.

Page 326: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 308

308 Chapter 11. Shape Optimization

11.2 Shape derivativeConsider the shape optimization problem

min J (y,�, �) ≡∫�

j1(y) dx +∫�

j2(y) ds +∫∂�\�

j3(y)ds (11.2.1)

subject to the constrainte(y,�) = 0, (11.2.2)

which represents a partial differential equation posed on the domain � ⊂ Rd with boundary∂�. We focus on sensitivity analysis of the reduced cost functional in (11.2.1)–(11.2.2)with respect to �.

To describe the admissible class of geometries, letU ⊂ Rd be a fixed bounded domainwith C1,1-boundary ∂U , or a convex domain with Lipschitzian boundary, and let D be adomain with C1,1-boundary � := ∂D satisfying D ⊂ U . For the reference domain � weadmit any of the three cases

(i) � = D,

(ii) � = U ,

(iii) � = U \ D.

Note that∂� = (∂� ∩ �) ∪ (∂� \ �) ⊂ U ∪ ∂U. (11.2.3)

Thus the boundary ∂� for the cases (i)–(iii) is given by

(i)′ ∂� = � ∪ ∅ = �,

(ii)′ ∂� = ∅ ∪ ∂U = ∂U ,

(iii)′ ∂� = � ∪ ∂U.

To introduce the admissible class of perturbations let h ∈ C1,1(U) with h = 0 on∂U = 0 and define, for t ∈ R, the mappings Ft : U → Rd by the perturbation of identity

Ft = id + t h. (11.2.4)

Then there exists τ > 0 such that Ft(U) = U and Ft is a diffeomorphism for |t | < τ .Defining the perturbed domains

�t = Ft(�)

and the perturbed manifolds as�t = Ft(�),

it follows that �t is of class C1,1 and �t ⊂ U for |t | < τ . Note that since h|∂U = 0 theboundary of U remains fixed as t varies, and hence by (11.2.3)

(∂�)t \ �t = ∂� \ � for |t | < τ.

Page 327: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 309

11.2. Shape derivative 309

Alternatively to (11.2.4) the perturbations could be described as the flow determined by theinitial value problem

d

dtχ(t) = h(χ(t)), χ(0; x) = x,

with Ft(x) = χ(t; x), i.e., by the velocity method.

Let J (�t ) be the functional defined by J (�t ) = J (yt , �t , �t ), where yt satisfies theconstraint

e(yt , �t ) = 0. (11.2.5)

The shape derivative of J at � in the direction of the deformation field h is defined as

J ′(�)h = limt→0

1

t

(J (�t )− J (�)

).

The functional J is called shape differentiable at � if J ′(�)h exists for all h ∈ C1,1(U,Rd)

and defines a continuous linear functional on C1,1(U ,Rd). Using the method of mappingsone transforms the perturbed state constraint (11.2.5) to the fixed domain�. For this purposedefine

yt = yt ◦ Ft .

Then yt : �→ Rl satisfies an equation on the reference domain �, which we express as

e(yt , t) = 0, |t | < τ. (11.2.6)

We suppress the dependence of e on h, because h will denote a fixed vector field throughout.Because F0 = id one obtains y0 = y and

e(y0, 0) = e(y,�). (11.2.7)

We axiomatize the above description and impose the following assumptions on e,respectively, e.

(H1) There is a Hilbert space X and a C1-function e : X × (−τ, τ ) → X∗ such thate(yt , �t ) = 0 is equivalent to

e(yt , t) = 0 in X∗,

with e(y, 0) = e(y,�) for all y ∈ X.

(H2) There exists 0 < τ0 ≤ τ such that for |t | < τ0 there exists a unique solution yt ∈ X

to e(yt , t) = 0 and

limt→0

|yt − y0|Xt1/2

= 0.

Page 328: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 310

310 Chapter 11. Shape Optimization

(H3) ey(y,�) ∈ L(X,X∗) satisfies

〈e(v,�)− e(y,�)− ey(y,�)(v − y), ψ〉X∗×X = O(|v − y|2X)for every ψ ∈ X, where y, v ∈ X.

(H4) e and e satisfy

limt→0

1

t〈e(yt , t)− e(y, t)− e(yt , �)+ e(y,�), ψ〉X∗×X = 0

for every ψ ∈ X, where yt and y are the solutions of (11.2.6) and (11.2.2),respectively.

In applications (H4) typically results in an assumption on the regularity of the coefficientsin the partial differential equation and on the vector field h. We assume throughout thatX ↪→ L2(�,Rl) and, in the case that j2, j3 are nontrivial, that the elements ofX admit tracesin L2(�,Rl), respectively, L2(∂� \ �,Rl). Typically X will be a subspace of H 1(�,Rl)

for some l ∈ N.With regards to the cost functional J we require

(H5) ji ∈ C1,1(Rl ,R), i = 1, 2, 3.

As a consequence of (H1)–(H2) we infer that (11.2.5) has a unique solution yt which isgiven by

yt = yt ◦ F−1t .

Condition (H5) implies that j1(y) ∈ L2(�), j ′1(y) ∈ L2(�)l , j2(y) ∈ L2(�), j ′2(y) ∈L2(�)l , and j3(y) ∈ L2(∂U), j ′3(y) ∈ L2(∂U)l for y ∈ X. Hence the cost functionalJ (y,�, �) is well defined for every y ∈ X.

Lemma 11.1. There is a constant c > 0, such that

|ji(v)− ji(y)− j ′i (y)(v − y)|L1 ≤ c|v − y|2Xholds for all y, v ∈ X, i = 1, 2, 3.

Proof. For j1 the claim follows from∫�

|j1(v)− j1(y)− j ′1(y)(v − y)| dx

≤∫�

∫ 1

0|j ′1

(y(x)+ s(v(x)− y(x))

)− j ′1(y(x))| ds |v(x)− y(x)| dx

≤ L

2|v − y|2L2 ≤ c|v − y|2X,

where L > 0 is the Lipschitz constant for j ′1. The same argument is valid also for j2

and j3.

Page 329: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 311

11.2. Shape derivative 311

Subsequently we use the following notation:

It = det DFt, At = (DFt)−T , wt = It |Atn|,

where DFt is the Jacobian of Ft and n denotes the outer normal unit vector to �. Werequire additional regularity properties of the transformation Ft which we specify next. LetI = [−τ0, τ0] with τ0 ≤ τ sufficiently small.

F0 = id, t → Ft ∈ C(I, C2(U ,Rd)),

t → Ft ∈ C1(I, C1(U ,Rd)), t → F−1t ∈ C(I, C1(U ,Rd)),

t → It ∈ C1(I, C(U)), t → At ∈ C(I, C(U ,Rd×d)),t → wt ∈ C(I, C(�)), (11.2.8)

d

dtFt |t=0 = h,

d

dtF−1t |t=0 = −h,

d

dtDFt |t=0 = Dh,

d

dtDF−1

t |t=0 = d

dt(At )

T |t=0 = −Dh,

d

dtIt |t=0 = div h,

d

dtwt |t=0 = div� h.

The surface divergence div� is defined by

div� h = div h|� − (Dhn) · n.The properties (11.2.8) are easily verified if Ft is specified by perturbation of the identityas in (11.2.4). As a consequence of (11.2.8) there exists α > 0 such that

It (x) ≥ α, x ∈ U . (11.2.9)

We furthermore recall the following transformation theorem where we already utilize(11.2.9).

Lemma 11.2. (1) Let ϕt ∈ L1(�t); then ϕt ◦Ft ∈ L1(�) and∫�t

ϕt dxt =∫�

ϕt ◦Ft det DFt dx.

(2) Let ht ∈ L1(�t ); then ht ◦Ft ∈ L1(�) and∫�t

ht d�t =∫�

ht ◦Ft det DFt |(DFt)−T n| d�.

We now formulate the representation of the shape derivative of J at � in direction h.

Theorem 11.3. Assume that (H1)–(H5) hold, that F satisfies (11.2.8), and that the adjointequation

〈ey(y,�)ψ, p〉X∗×X − (j ′1(y), ψ)� − (j ′2(y), ψ)� − (j ′3(y), ψ)�\� = 0, ψ ∈ X,

(11.2.10)

Page 330: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 312

312 Chapter 11. Shape Optimization

admits a unique solution p ∈ X, where y is the solution to (11.2.2). Then the shapederivative of J at � in the direction h exists and is given by

J ′(�)h = − d

dt〈e(y, t), p〉X∗×X|t=0 +

∫�

j1(y) div h dx +∫�

j2(y) div� h ds. (11.2.11)

Proof. Referring to (H2) let yt , y ∈ X satisfy

e(yt , t) = e(y,�) = 0 (11.2.12)

for |t | < τ0. Then yt = yt ◦ Ft is the solution of (11.2.5). Utilizing Lemma 11.2 onetherefore obtains

1

t(J (�t )− J (�)) = 1

t

∫�

(It j1(y

t )− j1(y))dx + 1

t

∫�

(wtj2(y

t )− j2(y))ds

+ 1

t

∫∂U

(j3(yt )− j3(y))ds

= 1

t

∫�

(It (j1(y

t )− j1(y)− j ′1(y)(yt − y))+ (It − 1)j ′1(y)(y

t − y)

+ j ′1(y)(yt − y)+ (It − 1)j1(y)

)dx

+ 1

t

∫�

(wt(j2(y

t )− j2(y)− j ′2(y)(yt − y))+ (wt − 1)j ′2(y)(y

t − 1)

+ j ′2(y)(yt − y)+ (wt − 1)j2(y)

)ds

+ 1

t

∫∂U

(j3(yt )− j3(y)− j ′3(y)(y

t − y))ds + 1

t

∫∂U

j ′3(y)(yt − y)ds.

(11.2.13)

Lemma 11.1 and (11.2.8) result in the estimates∣∣∣∣∫�

It (j1(yt )− j1(y)− j ′1(y)(y

t − y)) dx

∣∣∣∣ ≤ c|yt − y|2X,∣∣∣∣∫�

wt(j2(yt )− j2(y)− j ′2(y)(y

t − y)) ds

∣∣∣∣ ≤ c|yt − y|2X,∣∣∣∣∫∂U

(j3(yt )− j3(y)− j ′3(y)(y

t − y))ds

∣∣∣∣ ≤ c|yt − y|2X,

(11.2.14)

where c > 0 does not depend on t . Employing the adjoint state p one obtains

(j ′1(y), yt − y)� + (j ′2(y), y

t − y)� + (j ′3(y), yt − y)∂U = 〈ey(y,�)(yt − y), p〉X∗×X

= −〈e(yt , �)− e(y,�)− ey(y,�)(yt − y), p〉X∗×X− 〈e(yt , t)− e(y, t)− e(yt , �)+ e(y,�), p〉X∗×X− 〈e(y, t)− e(y, 0), p〉X∗×X,

(11.2.15)

Page 331: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 313

11.2. Shape derivative 313

where we used (11.2.12). We estimate the ten additive terms on the right-hand side of(11.2.13). Terms one, five, and nine converge to zero by (11.2.14) and (H2). Terms twoand six converge to 0 by (11.2.8) and (H2). For terms four and eight one uses (11.2.8).The claim (11.2.11) now follows by passing to the limit in terms three, seven, and ten using(11.2.15), (H3), (H2), (H4), and (H1).

To check (H2) in specific applications the following result will be useful. It relies on

(H6)

⎧⎪⎪⎨⎪⎪⎩the linearized equation

〈Ey(y,�)δy, ψ〉X∗×X = 〈f,ψ〉X∗×X, ψ ∈ X,

admits a unique solution δy ∈ X for every f ∈ X∗.

Note that this condition is more stringent than the assumption of solvability of theadjoint equation in Theorem 11.3, which requires solvability only for a specific right-handside.

Proposition 11.4. Assume that (11.2.2) admits a unique solution y and that (H6) is satisfied.Then (H2) holds.

Proof. Let y ∈ X be the unique solution of (11.2.2). In view of

ey(y, 0) = ey(y,�)

Assumption (H6) implies that ey(y, 0) is bijective. The claim follows from the implicitfunction theorem.

Computing the derivative ddt〈e(y, t), p〉X∗×X|t=0 in (11.2.11), and subsequently argu-

ing that J is in fact a shape derivative at �, can be facilitated by transforming the equation〈e(y, t), p〉 = 0 back to 〈e(y ◦ F−1

t , �t ), p ◦ F−1t 〉 = 0 together with the following well-

known differentiation rules of the functionals.

Lemma 11.5 (see [DeZo]). (1) Let f ∈ C(I,W 1,1(U)) and assume that ft (0) exists inL1(U). Then

d

dt

∫�t

f (t, x) dx|t=0 =∫�

ft (0, x) dx +∫�

f (0, x) h · n ds.

(2) Let f ∈ C(I,W 2,1(U)) and assume that ft (0) exists in W 1,1(U). Then

d

dt

∫�t

f (t, x) ds|t=0 =∫�

ft (0, x) ds +∫�

(∂

∂nf (0, s)+ κf (0, s)

)h · n ds,

where κ stands for the additive curvature of �.

The first part of the lemma is valid also for domains � with Lipschitz continuousboundary.

Page 332: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 314

314 Chapter 11. Shape Optimization

In the examples below f (t) will be typically given by expressions of the form

μv ◦ F−1t , μ∂i(v ◦ F−1

t )∂j (w ◦ F−1t ), v ◦ F−1

t w ◦ F−1t ∂i(z ◦ F−1

t ),

where μ ∈ H 1(U) and v, z, and w ∈ H 2(U) are extensions of elements in H 2(�). Theassumptions of Lemma 11.5 can be verified using the following result.

Lemma 11.6 (see [SoZo]). (1) If y ∈ Lp(U), then t → y ◦ F−1t ∈ C(I, Lp(U)), 1 ≤

p <∞.(2) If y ∈ H 2(U), then t → y ◦ F−1

t ∈ C(I, H 2(U)).(3) If y ∈ H 2(U), then d

dt(y ◦ F−1

t )|t=0 exists in H 1(U) and is given by

d

dt(y ◦ F−1

t )|t=0 = −(Dy) h.

As a consequence we note that ddt∂i((y ◦ F−1

t ))|t=0 exists in L2(U) and is given by

d

dt∂i((y ◦ F−1

t ))|t=0 = −∂i(Dy h), i = 1, . . . , d.

In the next section∇y stands for (Dy)T , where y is either a scalar- or vector-valued function.To enhance readability we use two symbols for the inner product inRd , (x, y), respectively,x · y. The latter will be utilized only in the case of nested inner products.

11.3 ExamplesThroughout this section it is assumed that (H5) is satisfied and that the regularity assumptionsof Section 11.2 for D, �, and U hold. If J does not depend on �, we write J (y,�) in placeof J (y,�, �).

11.3.1 Elliptic Dirichlet boundary value problem

As a first example we consider the volume functional

J (y,�) =∫�

j1(y) dx

subject to the constraint

(μ∇y,∇ψ)� − (f, ψ)� = 0, (11.3.1)

where X = H 10 (�), f ∈ H 1(U), and μ ∈ C1(U ,Rd×d) such that μ(x) is symmetric and

uniformly positive definite. Here � = D and � = ∂�. Thus e(y,�) : X→ X∗ is given by

〈e(y,�), ψ〉X∗×X = (μ∇y,∇ψ)� − (f, ψ)�.

Page 333: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 315

11.3. Examples 315

The equation on the perturbed domain is determined by

〈e(yt , �t ), ψt 〉X∗t ×Xt=

∫�t

(μ∇yt ,∇ψt) dxt −∫�t

fψ dxt

=∫�

(μtAt∇yt , At∇ψt)It dx −∫�

f tψt dx ≡ 〈e(yt , t), ψt 〉X∗,X(11.3.2)

for any ψt ∈ Xt , with yt = yt ◦ Ft , μt = μ◦Ft , f t = f ◦Ft , and Xt = H 1

0 (�t). Herewe used that ∇yt = (At∇yt )◦F−1

t and Lemma 11.5. (H1) is a consequence of (11.2.8),(11.3.2), and the smoothness of μ and f . Since (11.3.1) admits a unique solution and (H6)holds, Proposition 11.4 implies (H2). Since e is linear in y, assumption (H3) follows. Forthe verification of (H4) observe that

〈e(yt , t)− e(y, t)− e(yt , �)+ e(y,�), ψ〉X∗×X= ((μtItAt − μ)∇(yt − y),At∇ψ)� + (μ∇(yt − y), (At − I )∇ψ).

Hence (H4) follows from differentiability of μ, (11.2.8), and (H2). In view of Theorem 11.3we have to compute d

dt〈e(y, t), p〉X∗×X|t=0 for which we use the representation on �t in

(11.3.2). Recall that the solution y of (11.3.1) as well as the adjoint state p, defined by

(μ∇p,∇ψ)� = (j ′1(y), ψ)�, ψ ∈ H 10 (�), (11.3.3)

belong to H 2(�) ∩H 10 (�). Since � ∈ C1,1 (actually Lipschitz continuity of the boundary

would suffice), y as well as p can be extended to functions in H 2(U), which we againdenote by the same symbol. Therefore by Lemmas 11.5 and 11.6

d

dt〈e(y, t), p〉X∗×X|t=0

= d

dt

(∫�t

(μ∇(y ◦ F−1t ),∇(p ◦ F−1

t )) dxt −∫�t

fp ◦ F−1t dxt

)|t=0

=∫�

(μ∇y,∇p) (h, n) ds +∫�

((μ∇(−∇y · h),∇p)

+ (μ∇y,∇(−∇p · h))+ f (∇p, h) ) dx. (11.3.4)

Note that ∇y · h as well as ∇p · h do not belong to H 10 (�) but they are elements of H 1(�).

Therefore Green’s theorem implies∫�

((μ∇(−∇y · h),∇p)+ (μ∇y,∇(−∇p · h))+ f (∇p, h)) dx=

∫�

div(μ∇p) (∇y, h) dx −∫�

(μ∇p, n)(∇y, h) ds

+∫�

(div(μ∇y)+ f ) (∇p, h) dx −∫�

(μ∇y, n) (∇p, h) ds

= −∫�

j ′1(y) (∇y, h) dx − 2∫�

(μn, n)∂y

∂n

∂p

∂n(h, n) ds.

(11.3.5)

Page 334: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 316

316 Chapter 11. Shape Optimization

Above we used the strong form of (11.3.1) and (11.3.3) in L2(�) as well as the identities

(μ∇y, n) = (μn, n)∂y

∂n, (∇y, h) = ∂y

∂n(h, n)

(together with the ones with y and p interchanged) which follow from y, p ∈ H 10 (�).

Applying Theorem 11.3 results in

J ′(�)h = − d

dt〈e(y, t), p〉X∗,X|t=0 +

∫�

j1(y) div h dx

=∫�

(μn, n)∂y

∂n

∂p

∂n(h, n) ds +

∫�

(j ′1(y)(∇y, h)+ j1(y) div h) dx

=∫�

(μn, n)∂y

∂n

∂p

∂n(h, n) ds +

∫�

div(j1(y)h) dx,

and the Stokes theorem yields the final result,

J ′(�)h =∫�

((μn, n)

∂y

∂n

∂p

∂n+ j1(y)

)(h, n) ds.

Remark 11.3.1. If we were to be content with a representation of the shape variation interms of volume integrals we could take the expression for d

dte(y, t)|t=0 given in (11.3.4)

and bypass the use of Green’s theorem in (11.3.5). The regularity requirement on the domainthen results from y ∈ H 2(�), p ∈ H 2(�). In [Ber] the shape derivative in terms of thevolume integral is referred to as the weak shape derivative, whereas the final form in termsof the boundary integrals is called the strong shape derivative.

11.3.2 Inverse interface problem

We consider an inverse interface problem which is motivated by electrical impedance to-mography. Let U = � = (−1, 1) × (−1, 1) and ∂U = ∂�. Further let the domainD = �−, with �− ⊂ U , represent the inhomogeneity of the conducting medium and set�+ = U \ �−. We assume that �− is a simply connected domain of class C2 with bound-ary � which represents the interface between �− and �+. The inverse problem consists ofidentifying the unknown interface � from measurements z which are taken on the boundary∂U . This can be formulated as

min J (y,�) ≡∫∂U

(y − z)2 ds (11.3.6)

subject to the constraints

− div(μ∇y) = 0 in �− ∪�+,

[y] = 0,

∂y

∂n−

]= 0 on �, (11.3.7)

∂y

∂n= g on ∂U,

Page 335: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 317

11.3. Examples 317

where g ∈ H 1/2(∂U), z ∈ L2(∂U), with∫∂U

g = ∫∂U

z = 0, [v] = v+ − v− on �, andn+/− standing for the unit outer normals to �+/−. The conductivity μ is given by

μ(x) ={μ−, x ∈ �−,μ+, x ∈ �+,

for some positive constants μ− and μ+. In the context of the general framework of Section11.2 we have j1 = j2 = 0 and j3 = (y − z)2. Clearly (11.3.7) admits a unique solutiony ∈ H 1(U) with

∫∂U

y = 0. Its restrictions to �+ and �− will be denoted by y+ and y−,respectively. It turns out that the regularity of y± is better than the one of y.

Proposition 11.7. Let � and �± be as described above. Then the solution y ∈ H 1(U) of(11.3.7) satisfies

y± ∈ H 2(�±).

Proof. Let �H be the smooth boundary of a domain �H with

�− ⊂ �H ⊂ �H ⊂ U.

Then y|�H ∈ H 3/2 (�H ). The problem{ −div(μ+∇yH ) = 0 in U\�H,

∂yH∂n= g on ∂U, yH = y|�H on �H

has a unique solution yH ∈ H 2(U\�H) with yH = y+|�H. Therefore,

b := y|∂U = yH |∂U ∈ H 3/2(∂U).

Then the solution y to (11.3.7) coincides with the solution to⎧⎨⎩−div(μ∇y) = 0 in �− ∪ �+,[y] = 0, [μ ∂y

∂n− ] = 0 on �,

y = b on ∂U.

We now argue that y± ∈ H 2(�±). Let yb ∈ H 2(U) denote the solution to{ −�yb = 0 in U,

yb = b on ∂U.

Define w ∈ H 10 (U) as the unique solution to the interface problem⎧⎪⎨⎪⎩

− div(μ∇w) = 0 in �,

[w] = 0, [μ ∂w∂n− ] = −[μ ∂yb

∂n− ] on �,

w = 0 on ∂U.

(11.3.8)

Page 336: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 318

318 Chapter 11. Shape Optimization

Then yb ∈ H 2(�) implies [μ ∂yb∂n− ] ∈ H 1/2(�). By [ChZo], (11.3.8) has a unique solution

w ∈ H 10 (U) with the additional regularity w± ∈ H 2(�±). Consequently y = w + yb

satisfies y|∂� = g and y± ∈ H 2(�±), as desired. In an analogous wayp± ∈ H 2(�±).

To consider the inverse problem (11.3.6), (11.3.7) within the general framework ofSection 11.2 we set X = {v ∈ H 1(U) : ∫

∂Uv = 0} and define

〈e(y,�), ψ〉X∗×X = (μ∇y,∇ψ)U − (g, ψ)∂U ,

respectively,

〈e(y, t), ψ〉X∗×X = (μtAt∇y,At∇ψIt )U − (g, ψ)∂U

= (μ+∇(y◦F−1t ),∇(ψ◦F−1

t ))�+t + (μ−∇(y◦F−1t ),∇(ψ◦F−1

t ))�−t − (g, ψ)∂U .

Note that the boundary term is not affected by the transformation Ft since the deformationfield h vanishes on ∂U . The adjoint state is given by

− div(μ∇p) = 0 in �− ∪�+,

[p] = 0,

∂p

∂n−

]= 0 on �, (11.3.9)

∂p

∂n= 2(y − z) on ∂U,

respectively,

(μ∇p,∇ψ)U = 2(y − z, ψ)∂U for ψ ∈ X. (11.3.10)

Assumption (H4) requires us to consider

1

t|〈e(yt − y, t)− e(yt − y,�),ψ〉|

≤ 1

t

∫�+|(μ+ItAt∇(yt − y),At∇ψ)− (μ+∇(yt − y),∇ψ)| dx

+ 1

t

∫�−|(μ−ItAt∇(yt − y),At∇ψ)− (μ+∇(yt − y),∇ψ)| dx

≤ μ+∫�+

∣∣∣∣(1

t(ItAt − I )∇(yt − y),At∇ψ

)∣∣∣∣ dx+ μ+

∫�+

∣∣∣∣(∇(yt − y),1

t(At − I )∇ψ)

∣∣∣∣ dx+ μ−

∫�−

∣∣∣∣(1

t(ItAt − I )∇(yt − y),At∇ψ

)∣∣∣∣ dx+ μ−

∫�−

∣∣∣∣(∇(yt − y),1

t(At − I )∇ψ)

∣∣∣∣ dx.The right-hand side of this inequality converges to 0 as t → 0+ by (11.2.8). The remainingassumptions can be verified as in Section 11.3.1 for the Dirichlet problem and thus Theorem

Page 337: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 319

11.3. Examples 319

11.3 is applicable. By Proposition 11.7 the restrictions y± = y|�± , p± = p|�± satisfy y±,p± ∈ H 2(�±). Using Lemma 11.5 we find that

d

dt〈e(y, t), p〉X∗×X|t=0

=∫∂�+

(μ+∇y+,∇p+)(h, n+) ds −∫�+

(μ+∇(∇y+ · h),∇p+) dx

−∫�+

(μ+∇y+,∇(∇p+ · h)) dx +∫∂�−

(μ−∇y−,∇p−)(h, n−) ds

−∫�−

(μ−∇(∇y− · h),∇p−) dx −∫�−

(μ−∇y−,∇(∇p− · h)) dx

=∫�

[μ∇y,∇p](h, n+) ds −∫�+

μ+(∇(∇y+ · h),∇p+)+ (∇y+,∇(∇p+ · h)) dx

−∫�−

μ−(∇(∇y− · h),∇p−)+ (∇y−,∇(∇p− · h)) dx.

Applying Green’s formula as in Example 11.3.1 (observe that (∇y, h), (∇p, h) /∈ H 1(U))together with (11.3.9) results in

−∫�+

(μ+∇(∇y+ · h),∇p+)) dx −∫�−

(μ−∇(∇y− · h),∇p−)) dx

=∫�+

div(μ+∇p+)(∇y+, h) dx +∫�−

div(μ−∇p−)(∇y−, h) dx

−∫∂�+

(μ+∇p+, n+)(∇y+, h) ds −∫∂�−

(μ−∇p−, n−)(∇y−, h) ds

= −∫�

∂p

∂n+(∇y, h)

]ds.

In the last step we utilize h = 0 on ∂U . Similarly we obtain

−∫�+

(μ+∇y+,∇(∇p+ · h))) dx −∫�−

(μ−∇y−,∇(∇p− · h))) dx

= −∫�

∂y

∂n+(∇p, h)

]ds.

Collecting terms results in

J ′(�)h = −∫�

[μ(∇y,∇p)] (h, n+) ds+∫�

([μ

∂p

∂n+(∇y, h)

]+

∂y

∂n+(∇p, h)

])ds.

The identity[ab] = [a]b+ + a−[b] = a+[b] + [a]b−

implies[ab] = 0 if [a] = [b] = 0.

Page 338: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 320

320 Chapter 11. Shape Optimization

Hence the transition conditions [μ

∂y

∂n+

]=

[∂y

∂τ

]= 0,[

μ∂p

∂n+

]=

[∂p

∂τ

]= 0,

(11.3.11)

where ∂∂τ

stands for the tangential derivative imply[μ

∂p

∂n+(∇y, h)

]=

∂p

∂n+∂y

∂n+(h, n+)+ μ

∂p

∂n+∂y

∂τ(h, τ )

]=

∂p

∂n+∂y

∂n+(h, n+)

]+

∂p

∂n+∂y

∂τ(h, τ )

]=

∂p

∂n+∂y

∂n+

](h, n+),

and analogously [μ

∂y

∂n+(∇p, h)

]=

∂p

∂n+∂y

∂n+

](h, n+),

which entails

J ′(�)h = −∫�

[μ(∇y,∇p)] (h, n+) ds + 2∫�

∂p

∂n+∂y

∂n+

](h, n+) ds

= −∫�

[μ∂y

∂τ

∂p

∂τ

](h, n+) ds +

∫�

∂y

∂n+∂p

∂n+

](h, n+) ds

= −∫�

[μ]∂y∂τ

∂p

∂τ(h, n+) ds +

∫�

∂y

∂n+∂p

∂n+

](h, n+) ds.

In view of (11.3.11) this can be rearranged as

−[μ]∂y∂τ

∂p

∂τ+

∂y

∂n+∂p

∂n+

]= −μ+ ∂y

∂τ

∂p

∂τ+ μ−

∂y

∂τ

∂p

∂τ+ μ+

∂y+

∂n+∂p+

∂n+− μ−

∂y−

∂n+∂p−

∂n+

= −μ+(∂y

∂τ

∂p

∂τ+ 1

2

(∂y+

∂n+∂p−

∂n++ ∂y−

∂n+∂p+

∂n+

))+ μ−

(∂y

∂τ

∂p

∂τ+ 1

2

(∂y−

∂n+∂p+

∂n++ ∂y+

∂n+∂p−

∂n+

))= −1

2[μ]((∇y+,∇p−)+ (∇y−,∇p+)),

which gives the representation

J ′(�)h = −1

2

∫�

[μ]((∇y+,∇p−)+ (∇y−,∇p+)) (h, n+) ds= −

∫�

[μ](∇y+,∇p−) (h, n+) ds.

Page 339: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 321

11.3. Examples 321

11.3.3 Elliptic systems

Here we consider a domain � = U \D, where D ⊂ U and the boundaries ∂U and � = ∂D

are assumed to be C2 regular.We consider the optimization problem

min J (y,�, �) ≡∫�

j1(y) dx +∫�

j2(y) ds,

where y is the solution of the elliptic system

〈e(y,�), ψ〉X∗×X =∫�

(a(x,∇y,∇ψ)− (f, ψ)

)dx −

∫�

(g, ψ) ds = 0 (11.3.12)

inX = {v ∈ H 1(�)l : v|∂U = 0}. Above∇y stands for (Dy)T . We require thatf ∈ H 1(U)l

and that g is the trace of a given function G ∈ H 2(U)l . Furthermore we assume thata : U × Rd×d × Rd×d satisfies

(1) a(·, ξ, η) is continuously differentiable for every ξ , η ∈ Rd×d ,

(2) a(x, ·, ·) defines a bilinear form on Rd×d × Rd×d which is uniformly bounded inx ∈ U ,

(3) a(x, ·, ·) is uniformly coercive for all x ∈ U .

In the case of linear elasticity a is given by

a(x,∇y,∇ψ) = λ tr e(y) tr e(ψ)+ 2μe(y) : e(ψ),

where e(y) = 12 (∇y + (∇y)T ), and λ, μ are the positive Lamé coefficients. In this case a

is symmetric, and (11.3.12) admits a unique solution in X ∩H 2(�)l for every f ∈ L2(�)l

and g ∈ H12 (∂U)l ; see, e.g., [Ci].

The method of mapping suggests defining

〈e(y, t), ψ〉X∗×X =∫�

(a(Ft (x), At∇y,At∇ψ)− (f t , ψ)

)It dx −

∫�

(gt , ψ)wt ds

=∫�t

(a(x,∇(y◦F−1

t ),∇(ψ◦F−1t ))− (f, ψ◦F−1

t ))dx −

∫�t

(g, ψ◦F−1t ) ds.

(11.3.13)

The adjoint state is determined by the equation

〈ey(y,�)ψ, p〉X∗×X =∫�

(a(x,∇ψ,∇p)− j ′1(y)ψ

)dx −

∫�

j ′2(y)ψ ds = 0, (11.3.14)

ψ ∈ X. Under the regularity assumptions on a, (11.3.12) admits a unique solution inX ∩ H 2(�)l and the adjoint equation admits a solution for any right-hand side in X∗ so

Page 340: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 322

322 Chapter 11. Shape Optimization

that Proposition 11.4 is applicable. All these properties are satisfied for the linear elasticitycase. Assumptions (H1)–(H4) can then be argued as in Section 11.3.1.

Employing Lemma 11.5 we obtain

d

dt〈e(y, t), p〉X∗×X|t=0 = −

∫�

(a(x,∇(∇yT h),∇p)+ a(x,∇y,∇(∇pT h))

)dx

+∫�

a(x,∇y,∇p) (h, n) ds +∫�

(f,∇pT h) dx −∫�

(f, p) (h, n) ds

+∫�

(g,∇pT h) ds −∫�

(∂

∂n(g, p)+ κ(g, p)

)(h, n) ds.

Since ∇yT h ∈ X and ∇pT h ∈ X, this expression can be simplified using (11.3.12) and(11.3.14):

d

dt〈e(y, t), p〉X∗×X|t=0 = −

∫�

j ′1(y)∇yT h dx −∫�

j ′2(y)∇yT h ds

+∫�

(a(x,∇y,∇p)− (f, p)

)(h, n) ds −

∫�

(∂

∂n(g, p)+ κ(g, p)

)(h, n) ds,

which implies

J ′(�)h =∫�

j ′1(y)∇yT h dx +∫�

j1(y) div h dx

+∫�

j ′2(y)∇yT h ds +∫�

j2(y) div� h ds

+∫�

(−a(x,∇y,∇p)+ (f, p)+ ∂

∂n(g, p)+ κ(g, p)

)(h, n) ds.

For the third and fourth terms the tangential Green’s formula (see, e.g., [DeZo]) yields∫�

j ′2(y)∇yT h ds +∫�

j2(y) div� h ds =∫�

(∂

∂nj2(y)+ κj2(y)

)(h, n) ds.

The first and second terms can be combined using the Stokes theorem. Summarizing wefinally obtain

J ′(�)h =∫�

(−a(x,∇y,∇p)+ (f, p)+ j1(y)

+ ∂

∂n

(j2(y)+ (g, p)

)+ κ(j2(y)+ (g, p)))(h, n) ds.

(11.3.15)

This example also comprises the shape optimization problem of Bernoulli type:

min J (y,�, �) ≡ min�

∫�

y2 ds,

Page 341: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 323

11.3. Examples 323

where y is the solution of the mixed boundary value problem

−�y= f in �,

y = 0 on ∂U,

∂y

∂n= g on �,

which was analyzed with a similar approach in [IKP]. Here the boundary ∂� of the domain� ⊂ R2 is the disjoint union of a fixed part ∂U and an unknown part � both with nonemptyrelative interior. Let the state space X be given by

X = {ϕ ∈ H 1(�) : ϕ = 0 on ∂U}.Then the Eulerian derivative of J is given by (11.3.15), which reduces to

J ′(�)h =∫�

(−(∇y,∇p)+ fp + ∂

∂n(y2 + gp)+ κ(y2 + gp)

)(h, n) ds.

This result coincides with the representation obtained in [IKP]. The present derivation,however, is considerably simpler due to a better arrangement of terms in the proof ofTheorem 11.3. It is straightforward to adapt the framework to shape optimization problemsassociated with the exterior Bernoulli problem.

11.3.4 Navier–Stokes system

Consider the stationary Navier–Stokes equations

−ν �y + (y · ∇)y + ∇p = f in �,

div y = 0 in �,

y = 0 on ∂�

(11.3.16)

on a bounded domain � ⊂ Rd , d = 2, 3, with ν > 0 and f ∈ H 1(U). In the context of thegeneral framework we set � = D with C2-boundary � = ∂�. The variational formulationof (11.3.16) is given by

Find (y, p) ∈ X ≡ H 10 (�)d × L2(�)/R such that⟨

e((y, p),�

), (ψ, χ)

⟩X∗×X ≡ ν(∇y,∇ψ)� + ((y · ∇)y, ψ)�

− (p, div ψ)� − (f, ψ)� + (div y, χ)� = 0(11.3.17)

holds for all (ψ, χ) ∈ X. Let the cost functional J be given by

J (y,�) =∫�

j1(y) dx.

Considering (11.3.17) on a perturbed domain �t mapping the equation back to the referencedomain � yields the form of e(y, t). Concerning the transformation of the divergence wenote that for ψt ∈ H 1

0 (�t)d and ψt = ψt ◦ Ft ∈ H 1

0 (�)d , one obtains

div ψt = (Dψti A

Tt ei) ◦ F−1

t = ((At )i∇ψt,i) ◦ F−1t ,

Page 342: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 324

324 Chapter 11. Shape Optimization

where ei stands for the ith canonical basis vector in Rd and (At )i denotes the ith row ofAt = (DFt)

−T . We follow the convention to sum over indices which occur at least twicein a term. Thus one obtains⟨

e((yt , pt ), t

), (ψ, χ)

⟩X∗×X = ν(ItAt∇yt , At∇ψ)� +

((yt · At∇)yt , Itψ

)�

− (pt , It (At )k∇ψk

)�− (f t It , ψ)� +

(It (At )k∇yt

k, χ)�= 0

for all (ψ, χ) ∈ X.The adjoint state (λ, q) ∈ X is given by the solution to⟨

e′((y, p),�

)(ψ, χ), (λ, q)

⟩X∗×X = (j ′1(y), ψ)�,

which amounts to

ν(∇ψ,∇λ)� + ((ψ · ∇)y + (y · ∇)ψ, λ)�

− (χ, div λ)� + (div ψ, q)� = (j ′1(y), ψ)�(11.3.18)

for all (ψ, χ) ∈ X. Integrating by parts one obtains

((y · ∇)ψ, λ)� = −∫�

ψ · λ div y dx −∫�

ψ · ((y · ∇)λ) dx

+∫�

(ψ · λ) (y · n) ds = −(ψ, (y · ∇)λ)�

because y ∈ H 10 (�)d and div y = 0. Therefore

((ψ · ∇)y + (y · ∇)ψ, λ)� = (ψ, (∇y)λ− (y · ∇)λ )� (11.3.19)

holds for all ψ ∈ H 1(�)d . As a consequence the adjoint equation can be interpreted as

−ν�λ+ (∇y)λ− (y · ∇)λ− ∇q = j ′1(y),div λ = 0,

(11.3.20)

where the first equation holds in L2(�)d and the second one in L2(�).For the evaluation of d

dt〈e((y, p), t), (λ, q)〉X∗×X|t=0, (y, p), (λ, q) ∈ X being the

solution of (11.3.17), respectively, (11.3.18), we transform this expression back to�t , whichgives ⟨

e((y, p), t

), (λ, q)

⟩X∗×X = ν(∇(y ◦ F−1

t ),∇(λ ◦ F−1t ))�t

+ ((y ◦ F−1

t · ∇)y ◦ F−1t , λ ◦ F−1

t

)�t− (div(λ ◦ F−1

t ), p ◦ F−1t )�t

− (f, λ ◦ F−1t )�t

+ (div(y ◦ F−1t ), q ◦ F−1

t )�t.

To verify conditions (H1)–(H4) we introduce the continuous trilinear form c : H 10 (�)d×

H 10 (�)d ×H 1

0 (�)d by c(y, v,w) = ((y · ∇)v,w) and assume that

ν2 > N |f |H−1 and ν > M, (11.3.21)

Page 343: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 325

11.3. Examples 325

where

N = supy,v,w∈H 10

c(y, v,w)

|y|H 10|v|H 1

0|w|H 1

0

and M = supv∈H 10

c(v, v, y)

|v|2H 1

0

,

with y the solution to (11.3.16). Condition (H1) is satisfied by construction. If ν is suffi-ciently large so that the first inequality in (11.3.21) is satisfied, existence of a unique solution(y, p) ∈ H 1

0 (�)d × L2(�)/R to (11.3.16) is guaranteed; see, e.g., [Te]. The second con-dition in (11.3.21) ensures the bijectivity of the linearized operator e′(y, p), and thus (H6)holds. In particular this implies that (H2) holds and that the adjoint equation admits a uniquesolution. To verify (H3) we consider for arbitrary (v, q) ∈ X and (ψ, χ) ∈ X

〈e((v, q),�)− e((y, p),�)− e′((y, p),�)((v, q)− (y, p)), (ψ, χ)〉X∗,X= ((v − y) · ∇(v − y), ψ) ≤ K|ψ |H 1

0 (�)|v − y|2H 1

0 (�),

where K is an embedding constant, independent of (v, q) ∈ X and (ψ, χ) ∈ X. Verifying(H4) requires us to consider the quotient of the following expression with t and taking thelimit as t → 0:

ν[(ItAt∇(yt − y),At∇ψ)− (∇(yt − y),∇ψ)]+ [((yt · At∇)yt , Itψ)− ((yt · ∇)yt , ψ)− ((y · At∇)y, Itψ)+ ((y · ∇)y, ψ)]− [(It (At )k∇ψk, p

t − p)+ (div ψ,pt − p)]+ [(It (At )k∇ (yt

k − yk), χ)− (div(yt − y), χ)]for (ψ, χ) ∈ X. The first two terms in square brackets can be treated by analogous estimatesas in Sections 11.3.1 and 11.3.2. Noting that the third and fourth square brackets can beestimated quite similarly to each other, we give the estimate for the last one:

((It − 1)(At )k∇(ytk − yk), χ)+ (((At )k − ek)∇(yt

k − yk), χ)

(ek∇(ytk − yk)− div(yt − y), χ),

which, upon division by t , tends to 0 for t → 0.In the following calculation we utilize that (y, p), (λ, q) ∈ H 2(�)d ×H 1(�), which

is satisfied if � is C2. Applying Lemma 11.5 results in

d

dt

⟨e((y, p), t

), (λ, q)

⟩X∗×X|t=0

= ν(∇(−∇yT h),∇λ)� + ν(∇y,∇(−∇λT h))� + ν

∫�

(∇y,∇λ) (h, n) ds

+((

(−∇yT h) · ∇)y, λ

)�+ (

(y · ∇)(−∇yT h), λ)�

+ ((y · ∇)y,−∇λT h

)�+

∫�

((y · ∇)y, λ) (h, n) ds

− (−∇pT h, div λ)� − (p, div(−∇λT h))� −∫�

p div λ (h, n) ds

− (f,−∇λT h)� −∫�

f λ (h, n) ds

+ (div(−∇yT h), q)� + (div y,−∇qT h)� +∫�

q div y (h, n) ds.

Page 344: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 326

326 Chapter 11. Shape Optimization

Since div y = div λ = 0 and y, λ ∈ H 10 (�)d , this expression simplifies to

d

dt

⟨e((y, p), t

), (λ, q)

⟩X∗×X|t=0

= ν(∇y,∇ψλ)� + ((y · ∇)y, ψλ)� − (p, div ψλ)� − (f, ψλ)�

+ ν(∇ψy,∇λ)� + ((ψy · ∇)y + (y · ∇)ψy, λ)�

+ (div ψy, q)� + ν

∫�

(∇y,∇λ) (h, n) ds,

where we have used the abbreviation

ψy = −(∇y)T h, ψλ = −(∇λ)T h.Note that ψy , ψλ ∈ H 1(�)d but not in H 1

0 (�)d . Green’s formula, together with (11.3.16),(11.3.20), entails

d

dt

⟨e((y, p), t

), (λ, q)

⟩X∗×X|t=0

= (−ν�y + (y · ∇y)y + ∇p − f,ψλ)� +∫�

ν

(∂y

∂n,ψλ

)ds +

∫�

p (ψλ, n) ds

+ (ψy,−ν�λ+ (∇y)λ− (y · ∇)λ− ∇q)� +∫�

ν

(∂λ

∂n,ψy

)ds

+∫�

q (ψy, n) ds + ν

∫�

(∇y,∇λ) (h, n) ds

= −∫�

ν

(∂y

∂n, (∇λ)T h

)ds −

∫�

p ((∇λ)T h, n) ds −∫�

ν

(∂λ

∂n, (∇y)T h

)ds

−∫�

q ((∇y)T h, n) ds + ν

∫�

(∇y,∇λ) (h, n) ds − (j ′1(y), (∇y)T h)�

= −∫�

(∂y

∂n,∂λ

∂n

)+ p

(∂λ

∂n, n

)+ q

(∂y

∂n, n

))(h, n) ds − (j ′1(y), (∇u)T h)�.

Arguing as in Section 11.3.3 one eventually obtains by Theorem 11.3

J ′(�)h =∫�

(∂y

∂n,∂λ

∂n

)+ p

(∂λ

∂n, n

)+ q

(∂y

∂n, n

))(h, n) ds

+∫�

(j1(y) div h+ j ′1(y)∇yT h) dx

=∫�

(∂y

∂n,∂λ

∂n

)+ p

(∂λ

∂n, n

)+ q

(∂y

∂n, n

)+ j1(y)

)(h, n) ds.

Page 345: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 327

Bibliography

[ACFK] J.Albert, C. Carstensen, S.A. Funken, and R. Close, MATLAB implementationof the finite element method in elasticity, Computing 69(2002), 239–263.

[AcVo] R. Acar and C. R. Vogel, Analysis of bounded variation penalty methods forill-posed problems, Inverse Problems 10(1994), 1217–1229.

[Ad] R. Adams, Sobolev Spaces, Academic Press, Boston, 1975.

[AlMa] W. Alt and K. Malanowski, The Lagrange-Newton method for nonlinearoptimal control problems, Comput. Optim. Appl. 2(1993), 77–100.

[Alt1] W. Alt, Stabilität mengenwertiger Abbildungen mit Anwendungen auf nicht-lineare Optimierungsprobleme, Bayreuther Mathematische Schriften 3, 1979.

[Alt2] W. Alt, Lipschitzian perturbations in infinite optimization problems, in Math-ematical Programming with Data Perturbations II, A.V. Fiacco, ed., LectureNotes in Pure and Appl. Math. 85, Marcel Dekker, New York, 1983, 7–21.

[Alt3] W. Alt, Stability of Solutions and the Lagrange-Newton Method for NonlinearOptimization and Optimal Control Problems, Habilitation thesis, Bayreuth,1990.

[Alt4] W. Alt, The Lagrange-Newton method for infinite-dimensional optimizationproblems, Numer. Funct. Anal. Optimiz. 11(1990), 201–224.

[Alt5] W. Alt, Sequential quadratic programming in Banach spaces, in Advancesin Optimization, W. Oettli and D. Pallaschke, eds., Lecture Notes in Econom.and Math. Systems 382, Springer-Verlag, Berlin, 1992, 281–301.

[Ba] V. Barbu, Analysis and Control of Nonlinear Infinite Dimensional Systems,Academic Press, Boston, 1993.

[BaHe] A. Battermann and M. Heinkenschloss, Preconditioners for Karush-Kuhn-Tucker matrices arising in the optimal control of distributed systems, in Controland Estimation of Distributed Parameter Systems, Vorau (1996), Internat. Ser.Numer. Math. 126, Birkhäuser, Basel, 1998, 15–32.

[BaK1] V. Barbu and K. Kunisch, Identification of nonlinear elliptic equations, Appl.Math. Optim. 33(1996), 139–167.

327

Page 346: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 328

328 Bibliography

[BaK2] V. Barbu and K. Kunisch, Identification of nonlinear parabolic equations,Control Theory Adv. Tech. 10(1995), 1959–1980.

[BaKRi] V. Barbu, K. Kunisch, and W. Ring, Control and estimation of the boundaryheat transfer function in Stefan problems, RAIRO Modél Math. Anal. Numér.30(1996), 1–40.

[BaPe] V. Barbu and Th. Percupanu, Convexity and Optimization in Banach Spaces,Reidel, Dordrecht, 1986.

[BaSa] A. Battermann and E. Sachs, An indefinite preconditioner for KKT systemsarising in optimal control problems, in Fast Solution of Discretized Optimiza-tion Problems, Berlin, 2000, Internat. Ser. Numer. Math. 138, Birkhäuser,Basel, 2001, 1–18.

[Be] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods,Academic Press, Paris, 1982.

[BeK1] M. Bergounioux and K. Kunisch, Augmented Lagrangian techniques for el-liptic state constrained optimal control problems, SIAM J. Control Optim.35(1997), 1524–1543.

[BeK2] M. Bergounioux and K. Kunisch, Primal-dual active set strategy for stateconstrained optimal control problems, Comput. Optim. Appl. 22(2002), 193–224.

[BeK3] M. Bergounioux and K. Kunisch, On the structure of the Lagrange multi-plier for state-constrained optimal control problems, Systems Control Lett.48(2002), 16–176.

[BeMeVe] R. Becker, D. Meidner, and B. Vexler, Efficient numerical solution of parabolicoptimization problems by finite element methods, Optim. Methods Softw.22(2007), 813–833.

[BePl] A. Berman and R. J. Plemmons, Nonnegative Matrices in the MathematicalSciences, Computer Science and Scientific Computing Series,Academic Press,New York, 1979.

[Ber] M. Berggren, A Unified Discrete-Continuous Sensitivity Analysis Method forShape Optimization, Lecture at the Radon Institut, Linz, Austria, 2005.

[Bew] T. Bewley, Flow control: New challenges for a new renaissance, Progress inAerospace Sciences 37(2001), 24–58.

[BGHW] L. T. Biegler, O. Ghattas, M. Heinkenschloss, and B. van Bloemen Waanders,eds., Large-Scale PDE-Constrained Optimization, Lecture Notes in Comput.Sci. Eng. 30, Springer-Verlag, Berlin, 2003.

[BGHKW] L. T. Biegler, O. Ghattas, M. Heinkenschloss, D. Keyes, and B. van Bloe-men Waanders, eds., Real-Time PDE-Constrained Optimization, SIAM,Philadelphia, 2007.

Page 347: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 329

Bibliography 329

[BHHK] M. Bergounioux, M. Haddou, M. Hintermüller, and K. Kunisch, A comparisonof a Moreau–Yosida-based active set strategy and interior point methods forconstrained optimal control problems, SIAM J. Optim. 11(2000), 495–521.

[BIK] M. Bergounioux, K. Ito, and K. Kunisch, Primal-dual strategy for constrainedoptimal control problems, SIAM J. Control Optim. 37(1999), 1176–1194.

[BKW] A. Borzì, K. Kunisch, and D.Y. Kwak, Accuracy and convergence properties ofthe finite difference multigrid solution of an optimal control optimality system,SIAM J. Control Optim. 41(2003), 1477–1497.

[BoK] A. Borzì and K. Kunisch, The numerical solution of the steady state solidfuel ignition model and its optimal control, SIAM J. Sci. Comput. 22(2000),263–284.

[BoKVa] A. Borzì, K. Kunisch, and M. Vanmaele, A multigrid approach to the optimalcontrol of solid fuel ignition problems, in Multigrid Methods VI, Lect. NotesComput. Sci. Eng. 14, Springer-Verlag, Berlin, 59–65.

[Bre] H. Brezis, Opérateurs Maximaux Monotones et Semi-groupes de Constractiondas le Espaces de Hilbert, North–Holland, Amsterdam, 1973.

[Bre2] H. Brezis, Problèmes unilatéraux, J. Math. Pures Appl. 51(1972), 1–168.

[CaTr] E. Casas and F. Tröltzsch, Second-order necessary and sufficient optimalityconditions for optimization problems and applications to control theory, SIAMJ. Optim. 13(2002), 406–431.

[ChK] G. Chavent and K. Kunisch, Regularization of linear least squares problems bytotal bounded variation, ESAIM Control Optim. Calc. Var. 2(1997), 359–376.

[ChLi] A. Chambolle and P.-L. Lions, Image recovery via total bounded variationminimization and related problems, Numer. Math. 76(1997), 167–188.

[ChZo] Z. Chen and J. Zou, Finite element methods and their convergence for ellipticand parabolic interface problems, Numer. Math. 79(1998), 175–202.

[Ci] P. G. Ciarlet, Mathematical Elasticity, Vol. 1, North–Holland, Amsterdam,1987.

[CKP] E. Casas, K. Kunisch, and C. Pola, Regularization by functions of boundedvariation and applications to image enhancement, Appl. Math. Optim.40(1999), 229–258.

[Cla] F. H. Clarke, Optimization and Nonsmooth Analysis, John Wiley and Sons,New York, 1983.

[CNQ] X. Chen, Z. Nashed, and L. Qi, Smoothing methods and semismooth methodsfor nondifferentiable operator equations, SIAM J. Numer. Anal. 38(2000),1200–1216.

Page 348: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 330

330 Bibliography

[CoKu] F. Colonius and K. Kunisch, Output least squares stability for parameter esti-mation in two point value problems, J. Reine Angew. Math. 370(1986), 1–29.

[DaLi] R. Dautray and J. L. Lions, Mathematical Analysis and Numerical Methodsfor Science and Technology, Vol. 3, Springer-Verlag, Berlin, 1990.

[Dei] K. Deimling, Nonlinear Functional Analysis, Springer-Verlag, Berlin, 1985.

[DeZo] M. C. Delfour and J.-P. Zolésio, Shapes and Geometries: Analysis, DifferentialCalculus, and Optimization, SIAM, Philadelphia, 2001.

[Don] A. L. Dontchev, Local analysis of a Newton-type method based on partialelimination, in The Mathematics of Numerical Analysis, Lectures in Appl.Math. 32, AMS, Providence, RI, 1996, 295–306.

[DoSa] D. C. Dobson and F. Santosa, Recovery of blocky images from noisy and blurreddata, SIAM J. Appl. Math. 56(1996), 1181–1198.

[dReKu] C. de los Reyes and K. Kunisch, A comparison of algorithms for controlconstrained optimal control of the Burgers equation, Calcolo 41(2004), 203–225.

[EcJa] C. Eck and J. Jarusek, Existence results for the static contact problem withCoulomb friction, Math. Models Methods Appl. Sci. 8(1998), 445–468.

[EkTe] I. Ekeland and R. Temam, Convex Analysis and Variational Problems, North–Holland, Amsterdam, 1976.

[EkTu] I. Ekeland and T. Turnbull, Infinite Dimensional Optimization and Convexity,The University of Chicago Press, Chicago, 1983.

[Fat] H. O. Fattorini, Infinite Dimensional Optimization and Control Theory, Cam-bridge University Press, Cambridge, 1999.

[FGH] A. V. Fursikov, M. D. Gunzburger, and L. S. Hou, Boundary value prob-lems and optimal boundary control for the Navier–Stokes systems: The two-dimensional case, SIAM J. Control Optim. 36(1998), 852–894.

[FoGl] M. Fortin and R. Glowinski, Augmented Lagrangian Methods: Applicationsto Numerical Solutions of Boundary Value Problems, North–Holland, Ams-terdam, 1983.

[Fr] A. Friedman, Variational Principles and Free Boundary Value Problems, JohnWiley and Sons, New York, 1982.

[Geo] V. Georgescu, On the unique continuation property for Schrödinger Hamilto-nians, Helv. Phys. Acta 52(1979), 655–670.

[GeYa] D. Geman and C. Yang, Nonlinear image recovery with half-quadratic regu-larization, IEEE Trans. Image Process. 4(1995), 932–945.

Page 349: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 331

Bibliography 331

[GHS] M. D. Gunzburger, L. Hou, and T. P. Svobodny, Finite element approxima-tions of optimal control problems associated with the scalar Ginzburg-Landauequation, Comput. Math. Appl. 21(1991), 123–131.

[GiRa] V. Girault and P.-A. Raviart, Finite Element Methods for Navier–Stokes Equa-tions, Springer-Verlag, Berlin, 1986.

[Giu] E. Giusti, Minimal Surfaces and Functions of Bounded Variation, Birkhäuser,Boston, 1984.

[Glo] R. Glowinski, Numerical Methods for Nonlinear Variational Problems,Springer-Verlag, Berlin, 1984.

[GLT] R. Glowinski, J. L. Lions, and R. Tremoliers, Numerical Analysis of VariationalInequalities, North–Holland, Amsterdam, 1981.

[Gri] P. Grisvard, Elliptic Problems in Nonsmooth Domains, Monographs Stud.Math. 24, Pitman, Boston, MA, 1985.

[Grie] A. Griewank, The local convergence of Broyden-like methods on Lipschitzianproblems in Hilbert spaces, SIAM J. Numer. Anal. 24(1987), 684–705.

[GrVo] R. Griesse and S. Volkwein, A primal-dual active set strategy for optimalboundary control of a nonlinear reaction-diffusion system, SIAM J. ControlOptim. 44(2005), 467–494.

[Gu] M. D. Gunzburger, Perspectives of Flow Control and Optimization, SIAM,Philadelphia, 2003.

[Han] S.-P. Han, Superlinearly convergent variable metric algorithms for generalnonlinear programming problems, Math. Programming 11(1976), 263–282.

[HaMä] J. Haslinger and R. A. E. Mäkinen, Introduction to Shape Optimization,SIAM, Philadelphia, 2003.

[HaNe] J. Haslinger and P. Neittaanmäki, Finite Element Approximation for OptimalShape, Material and Topology Design, 2nd ed, Wiley, Chichester, 1996.

[HaPaRa] S.-H. Han, J.-S. Pang, and N. Rangaray, Glabally convergent Newton methodsfor nonsmooth equations, Math. Oper. Res. 17(1992), 586–607.

[Har] A. Haraux, How to differentiate the projection on a closed convex set inHilbert space. Some applications to variational inequalities, J. Math. Soc.Japan 29(1977), 615–631.

[Has] J. Haslinger, Approximation of the Signorini problem with friction, obeyingthe Coulomb law, Math. Methods Appl. Sci. 5(1983), 422–437.

[Hei] M. Heinkenschloss, Projected sequential quadratic programming methods,SIAM J. Optim. 6(1996), 373–417.

Page 350: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 332

332 Bibliography

[Hes1] M. R. Hestenes, Optimization Theory. The Finite Dimensional Case, JohnWiley and Sons, New York, 1975.

[Hes2] M. R. Hestenes, Multiplier and gradient methods, J. Optim. Theory Appl.4(1968), 303–320.

[HHNL] I. Hlavacek, J. Haslinger, J. Necas, and J. Lovisek, Solution of VariationalInequalities in Mechanics, Appl. Math. Sci. 66, Springer-Verlag, New York,1988.

[HiIK] M. Hintermüller, K. Ito, and K. Kunisch, The primal-dual active set strategyas a semismooth Newton method, SIAM J. Optim. 13(2002), 865–888.

[HiK1] M. Hinze and K. Kunisch, Three control methods for time-dependent fluidflow, Flow Turbul. Combust. 65(2000), 273–298.

[HiK2] M. Hinze and K. Kunisch, Second order methods for optimal control of time-dependent fluid flow, SIAM J. Control Optim. 40(2001), 925–946.

[HinK1] M. Hintermüller and K. Kunisch, Total bounded variation regularization as abilaterally constrained optimization problem, SIAM J. Appl. Math. 64(2004),1311–1333.

[HinK2] M. Hintermüller and K. Kunisch, Feasible and noninterior path-followingin constrained minimization with low multiplier regularity, SIAM J. ControlOptim. 45(2006), 1198–1221.

[HPR] S.-P. Han, J.-S. Pang, and N. Rangaraj, Globally convergent Newton methodsfor nonsmooth equations, Math. Oper. Res. 17(1992), 586–607.

[HuStWo] S. Hüeber, G. Stadler, and B. I. Wohlmuth, A primal-dual active set algorithmfor three-dimensional contact problems with Coulomb friction, SIAM J. Sci.Comput. 30(2008), 572–596.

[IK1] K. Ito and K. Kunisch, The augmented Lagrangian method for equality andinequality constraints in Hilbert space, Math. Programming 46(1990), 341–360.

[IK2] K. Ito and K. Kunisch, The augmented Lagrangian method for parameteriza-tion in elliptic systems, SIAM J. Control Optim. 28(1990), 113–136.

[IK3] K. Ito and K. Kunisch, The augmented Lagrangian method for estimating thediffusion coefficient in an elliptic equation, in Proceedings of the 26th IEEEConference on Decision and Control, Los Angeles, 1987, 1400–1404.

[IK4] K. Ito and K. Kunisch, An augmented Lagrangian technique for variationalinequalities, Appl. Math. Optim. 21(1990), 223–241.

[IK5] K. Ito and K. Kunisch, Sensitivity analysis of solutions to optimization prob-lems in Hilbert spaces with applications to optimal control and estimation, J.Differential Equations 99(1992), 1–40.

Page 351: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 333

Bibliography 333

[IK6] K. Ito and K. Kunisch, On the choice of the regularization parameter in non-linear inverse problems, SIAM J. Optim. 2(1992), 376–404.

[IK7] K. Ito and K. Kunisch, Sensitivity measures for the estimation of parametersin 1-D elliptic boundary value problems, J. Math. Systems Estim., Control6(1996), 195–218.

[IK8] K. Ito and K. Kunisch, Maximizing robustness in nonlinear illposed inverseproblems, SIAM J. Control Optim. 33(1995), 643–666.

[IK9] K. Ito and K. Kunisch, Augmented Lagrangian-SQP-methods in Hilbert spacesand application to control in the coefficients problems, SIAM J. Optim.6(1996), 96–125.

[IK10] K. Ito and K. Kunisch, Augmented Lagrangian-SQP methods for nonlinearoptimal control problems of tracking type, SIAM J. Control Optim. 34(1996),874–891.

[IK11] K. Ito and K. Kunisch, Augmented Lagrangian methods for nonsmooth convexoptimization in Hilbert spaces, Nonlinear Anal. 41(2000), 573–589.

[IK12] K. Ito and K. Kunisch, An active set strategy based on the augmentedLagrangian formulation for image restoration, MZAN Math. Model Nu-mer. Anal. 33(1999), 1–21.

[IK13] K. Ito and K. Kunisch, Augmented Lagrangian formulation of nonsmooth,convex optimization in Hilbert spaces, in Control of Partial Differential Equa-tions, E. Casas, ed., Lecture Notes in Pure andAppl. Math. 174, Marcel Dekker,New York, 1995, 107–117.

[IK14] K. Ito and K. Kunisch, Estimation of the convection coefficient in ellipticequations, Inverse Problems 13(1997), 995–1013.

[IK15] K. Ito and K. Kunisch, Newton’s method for a class of weakly singular optimalcontrol problems, SIAM J. Optim. 10(1999), 896–916.

[IK16] K. Ito and K. Kunisch, Optimal control of elliptic variational inequalities,Appl. Math. Optim. 41(2000), 343–364.

[IK17] K. Ito and K. Kunisch, Optimal control of the solid fuel ignition model withH 1-cost, SIAM J. Control Optim. 40(2002), 1455–1472.

[IK18] K. Ito and K. Kunisch, BV-type regularization methods for convoluted objectswith edge, flat and grey scales, Inverse Problems 16(2000), 909–928.

[IK19] K. Ito and K. Kunisch, Optimal control, in Encyclopedia of Electrical andElectronic Engineering, J. G. Webster, ed., 15, John Wiley and Sons, NewYork, 1999, 364–379.

[IK20] K. Ito and K. Kunisch, Semi-smooth Newton methods for variational inequal-ities of the first kind, MZAN Math. Model. Numer. Anal. 37(2003), 41–62.

Page 352: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 334

334 Bibliography

[IK21] K. Ito and K. Kunisch, The primal-dual active set method for nonlinear optimalcontrol problems with bilateral constraints, SIAM J. Control Optim. 43(2004),357–376.

[IK22] K. Ito and K. Kunisch, Semi-smooth Newton methods for state-constrainedoptimal control problems, Systems Control Lett. 50(2003), 221–228.

[IK23] K. Ito and K. Kunisch, Parabolic variational inequalities: The Lagrange mul-tiplier approach, J. Math. Pures Appl. (9) 85(2006), 415–449.

[IKK] K. Ito, M. Kroller, and K. Kunisch, A numerical study of an augmented La-grangian method for the estimation of parameters in elliptic systems, SIAMJ. Sci. Statist. Comput. 12(1991), 884–910.

[IKP] K. Ito, K. Kunisch, and G. Peichl, Variational approach to shape derivativesfor a class of Bernoulli problems, J. Math. Anal. Appl. 314(2006), 126–149.

[IKPe] K. Ito, K. Kunisch, and G. Peichl, On the regularization and the numericaltreatment of the inf-sup condition for saddle point problems, Comput. Appl.Math. 21(2002), 245–274.

[IKPe2] K. Ito, K. Kunisch, and G. Peichl, A variational approach to shape derivatives,to appear in ESAIM Control Optim. Calc. Var.

[IKSG] K. Ito, K. Kunisch, V. Schulz, and I. Gherman, Approximate Nullspace Itera-tions for KKT Systems in Model Based Optimization, preprint, University ofTrier, 2008.

[ItKa] K. Ito and F. Kappel, Evolution Equations and Approximations, World Scien-tific, River Edge, NJ, 2002.

[Ja] J. Jahn, Introduction to the Theory of Nonlinear Optimization, Springer-Verlag,Berlin, 1994.

[JaSa] H. Jäger and E.W. Sachs, Global convergence of inexact reduced SQPmethods,Optim. Methods Softw. 7(1997), 83–110.

[Ka] K. Kato, Perturbation Theory for Linear Operators, Springer-Verlag, Berlin,1980.

[KaK] A. Kauffmann and K. Kunisch, Optimal control of the solid fuel ignition model,ESAIM Proc. 8(2000), 65–76.

[Kan] C. Kanzow, Inexact semismooth Newton methods for large-scale complemen-tarity problems, Optim. Methods Softw. 19(2004), 309–325.

[KeSa] C. T. Kelley and E. W. Sachs, Solution of optimal control problems by apointwise projected Newton method, SIAM J. Control Optim. 33(1995), 1731–1757.

Page 353: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 335

Bibliography 335

[KO] N. Kikuchi and J. T. Oden, Contact Problems in Elasticity: A Study of Vari-ational Inequalities and Finite Element Methods, SIAM Stud. Appl. Math. 8,SIAM, Philadelphia, 1988.

[Kou] S. G. Kou, A jump-diffusion model for option pricing, Management Sci.48(2002), 1086–1101.

[KPa] K. Kunisch and X. Pan, Estimation of interfaces from boundary measurements,SIAM J. Control Optim. 32(1994), 1643–1674.

[KPe] K. Kunisch and G. Peichl, Estimation of a temporally and spatially varyingdiffusion coefficient in a parabolic system by an augmented Lagrangian tech-nique, Numer. Math. 59(1991), 473–509.

[KRe] K. Kunisch and F. Rendl, An infeasible active set method for quadratic prob-lems with simple bounds, SIAM J. Optim. 14(2003), 35–52.

[KRo] K. Kunisch and A. Rösch, Primal-dual active set strategy for a general classof optimal control problems, to appear in SIAM J. Optim.

[KSa] K. Kunisch and E. W. Sachs, Reduced SQP methods for parameter identifica-tion problems, SIAM J. Numer. Anal. 29(1992), 1793–1820.

[Kup] F.-S. Kupfer, An infinite-dimensional convergence theory for reduced SQPmethods in Hilbert space, SIAM J. Optim. 6(1996), 126–163.

[KuSa] F.-S. Kupfer and E. W. Sachs, Numerical solution of a nonlinear paraboliccontrol problem by a reduced SQP method, Comput. Optim. Appl. 1(1992),113–135.

[KuSt] K. Kunisch and G. Stadler, Generalized Newton methods for the 2D-Signorinicontact problem with friction in function space, MZAN Math. Model. Numer.Anal. 39(2005), 827–854.

[KuTa] K. Kunisch and X-. Tai, Sequential and parallel splitting methods for bilinearcontrol problems in Hilbert spaces, SIAM J. Numer. Anal. 34(1997), 91–118.

[KVo1] K. Kunisch and S. Volkwein, Augmented Lagrangian-SQP techniques andtheir approximation, in Optimization Methods in Partial Differential Equa-tions, Contemp. Math. 209, AMS, Providence, RI, 1997, 147–160.

[KVo2] K. Kunisch and S. Volkwein, Galerkin proper orthogonal decomposition meth-ods for parabolic systems, Numer. Math. 90(2001), 117–148.

[KVo3] K. Kunisch and S. Volkwein, Galerkin proper orthogonal decomposition meth-ods for a general equation in fluid dynamics, SIAM J. Numer. Anal. 40(2002),492–515.

[LeMa] F. Lempio and H. Maurer, Differential stability in infinite-dimensional nonlin-ear programming, Appl. Math. Optim. 6(1980), 139–152.

Page 354: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 336

336 Bibliography

[LiMa] J.-L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problemsand Applications, Springer-Verlag, Berlin, 1972.

[Lio1] J.-L. Lions, Control of Distributed Singular Systems, Gauthier-Villars, Paris,1985.

[Lio2] J.-L. Lions, Optimal Control of Systems Governed by Partial DifferentialEquations, Springer-Verlag, Berlin, 1971.

[Lio3] J.-L. Lions, Quelques Methodes de Resolution de Problemes aux Limites NonLineares, Dunod, Paris, 1969.

[MaZo] H. Maurer and J. Zowe, First and second-order necessary and sufficient opti-mality conditions for infinite-dimensional programming problems, Math. Pro-gramming 16(1979), 98–110.

[Mif] R. Mifflin, Semismooth and semiconvex functions in constrained optimization,SIAM J. Control Optim. 15(1977), 959–972.

[MuSi] F. Murat and J. Simon, Sur le controle par un domain geometrique, Rapport76015, Universite Pierre et Marie Curie, Paris, 1976.

[Pan1] J. S. Pang, Newton’s method for B-differentiable equations, Math. Oper. Res.15(1990), 311–341.

[Pan2] J. S. Pang, A B-differentiable equation-based, globally and locally quadrat-ically convergent algorithm for nonlinear programs, complementarity andvariational inequality problems, Math. Programming 51(1991), 101–131.

[Paz] A. Pazy, Semi-groups of nonlinear contractions and their asymptotic behavior,in Nonlinear Analysis and Mechanics: Heriot Watt Symposium, Vol III, R. J.Knops, ed., Res. Notes in Math. 30, Pitman, Boston, MA, 1979, 36–134.

[PoTi] E. Polak andA. L. Tits, A globally convergent implementable multiplier methodwith automatic limitation, Appl. Math. Optim. 6(1980), 335–360.

[PoTr] V. T. Polak and N. Y. Tret’yakov, The method of penalty estimates for condi-tional extremum problems, Žurnal Vycislitel’noı Matematiki i MatematiceskoıFiziki 13(1973), 34–46.

[Pow] M. J. D. Powell, A method for nonlinear constraints in minimization problems,in Optimization, R. Fletcher, ed., Academic Press, New York, 1968, 283–298.

[Pow1] M. J. D. Powell, The convergence of variable metric methods for nonlinearlyconstrained optimization problems, Nonlinear Programming 3, O. L. Man-gasarian, ed., Academic Press, New York, 1987, 27–63.

[Qi] L. Qi, Convergence analysis of some algorithms for solving nonsmooth equa-tions, Math. Oper. Res. 18(1993), 227–244.

Page 355: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 337

Bibliography 337

[QiSu] L. Qi and J. Sun, A nonsmooth version of Newton’s method, Math. Program-ming 58(1993), 353–367.

[Rao] M. Raous, Quasistatic Signorini problem with Coulomb friction and couplingto adhesion, in New Developments in Contact Problems, P. Wriggers andPanagiotopoulos, eds., CISM Courses and Lectures 384, Springer-Verlag, NewYork, 1999, 101–178.

[Ro1] S. M. Robinson, Regularity and stability for convex multivalued functions,Math. Oper. Res. 1(1976), 130–143.

[Ro2] S. M. Robinson, Stability theory for systems of inequalities, Part II: Differen-tiable nonlinear systems, SIAM J. Numer. Anal. 13(1976), 497–513.

[Ro3] S. M. Robinson, Strongly regular generalized equations, Math. of Oper. Res.5(1980), 43–62.

[Ro4] S. M. Robinson, Local structure of feasible sets in nonlinear programming,Part III: Stability and sensitivity, Math. Programming Stud. 30(1987), 45–66.

[Roc1] R. T. Rockafeller, Local boundedness of nonlinear monotone operators, Michi-gan Math. J. 16(1969), 397–407.

[Roc2] R. T. Rockafeller, The multiplier method of Hestenes and Powell applied toconvex programming, J. Optim. Theory Appl. 12(1973), 34–46.

[ROF] L. I. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noiseremoval algorithms, Phys. D 60 (1992), 259–268.

[RoTr] A. Rösch and F. Tröltzsch, On regularity of solutions and Lagrange multipliersof optimal control problems for semilinear equations with mixed pointwisecontrol-state constraints, SIAM J. Control Optim. 46(2007), 1098–1115.

[Sa] E. Sachs, Broyden’s method in Hilbert space, Math. Programming 35(1986),71–82.

[Sey] R. Seydel, Tools for Computational Finance, Springer-Verlag, Berlin, 2002.

[Sha] A. Shapiro, On concepts of directional differentiability, J. Optim. TheoryAppl.13(1999), 477–487.

[SoZo] J. Sokolowski and J. P. Zolesio, Introduction to Shape Optimization, Springer-Verlag, Berlin, 1991.

[Sta1] G. Stadler, Infinite-Dimensional Semi-Smooth Newton and Augmented La-grangian Methods for Friction and Contact Problems in Elasticity, Ph.D.thesis, University of Graz, 2004.

[Sta2] G. Stadler, Semismooth Newton and augmented Lagrangian methods for asimplified problem friction, SIAM J. Optim. 15(2004), 39–62.

Page 356: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 338

338 Bibliography

[Sto] J. Stoer, Principles of sequential quadratic programming methods for solvingnonlinear programs, in Computation Mathematical Programming, K. Schitt-kowski, ed., NATO Adv. Sci. Inst. Ser F Comput. Systems Sci. 15, Springer-Verlag, Berlin, 1985, 165–207.

[StTa] J. Stoer and R. A. Tapia, The Local Convergence of Sequential QuadraticProgramming Methods, Technical Report 87–04, Rice University, 1987.

[Tan] H. Tanabe, Equations of Evolution, Pitman, London, 1979.

[Te] R. Temam, Navier Stokes Equations: Theory and Numerical Analysis, North–Holland, Amsterdam, 1979.

[Tem] R. Temam, Infinite-Dimensional Dynamical Systems in Mechanics andPhysics, Springer-Verlag, Berlin, 1988.

[Tin] M. Tinkham, Introduction to Superconductivity, McGraw–Hill, New York,1975.

[Tr] G. M. Troianiello, Elliptic Differential Equations and Obstacle Problems,Plenum Press, New York, 1987.

[Tro] F. Troeltzsch, An SQP method for the optimal control of a nonlinear heatequation, Control Cybernet. 23(1994), 267–288.

[Ulb] M. Ulbrich, Semismooth Newton methods for operator equations in functionspaces, SIAM J. Optim. 13(2003), 805–841.

[Vog] C. R. Vogel, Computational Methods for Inverse Problems, Frontiers Appl.Math. 23, SIAM, Philadelphia, 2002.

[Vol] S. Volkwein, Mesh-independence for an augmented Lagrangian-SQP methodin Hilbert spaces, SIAM J. Control Optim. 38(2000), 767–785.

[We] J. Werner, Optimization—Theory and Applications, Vieweg, Braunschweig,1984.

[Zar] E. M. Zarantonello, Projection on convex sets in Hilbert space and spectraltheory, in Contributions to Nonlinear FunctionalAnalysis, E. H. Zarantonello,ed., Academic Press, New York, 1971, 237–424.

[Zo] J. P. Zolesio, The material derivative (or speed) method for shape optimization,in Optimization of Distributed Parameter Structures, Vol II, E. Haug and J. Cea,eds., Sijthoff & Noordhoff, Alphen aan den Rijn, 1981, 1089–1151.

[ZoKu] J. Zowe and S. Kurcyusz, Regularity and stability for the mathematical pro-gramming problem in Banach spaces, Appl. Math. Optim. 5(1979), 49–62.

Page 357: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 339

Index

adapted penalty, 23adjoint equation, 18American options, 277augmentability, 65, 68, 73augmented Lagrange functional, 76augmented Lagrangian, 65, 263augmented Lagrangian-SQPmethod, 155

Babuška–Brezzi condition, 184Bernoulli problem, 322Bertsekas penalty functional, 73BFGS-update formula, 141biconjugate functional, 93bilateral constraints, 202Bingham flow, 120Black–Scholes model, 277Bouligand differentiability, 217, 234Bouligand direction, 226box constraints, 225Bratu problem, 13, 152BV-regularization, 254BV-seminorm, 87, 254

Clarke directional derivative, 222coercive, 71complementarity condition, 88, 113complementarity problem, 51, 215, 240conical hull, 1conjugate functional, 92, 253constrained optimal control, 190contact problem, 263convection estimation, 14, 153convex functional, 89Coulomb friction, 263

descent direction, 225directional differentiability, 58, 217, 234

dual cone, 1, 27dual functional, 76, 115dual problem, 77, 99duality pairing, 1

effective domain, 89Eidelheit separation theorem, 3elastoplastic problem, 122epigraph, 89equality constraints, 3, 8equality-constrained problem, 7extremality condition, 253

feasibility step, 149Fenchel duality theorem, 253first order augmented Lagrangian, 67, 75first order necessary optimality condition,

156friction problem, 125, 263

Gâteaux differentiable, 95Gauss–Newton algorithm, 228Gel’fand triple, 246generalized equation, 31, 33generalized implicit function theorem, 31generalized inverse function theorem, 35globalization strategy, 222

Hessian, 29

image restoration, 121implicit function theorem, 31indicator function, 28, 92inequality constraints, 6inequality-constrained problem, 7inverse function theorem, 35inverse interface problem, 316

339

Page 358: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 340

340 Index

Karush–Kuhn–Tucker condition, 6

L1-fitting, 126Lagrange functional, 28, 66Lagrange multiplier, 2, 28, 277Lagrangian method, 75least squares, 10linear elasticity, 263linearizing cone, 2lower semicontinuous, 89

M-matrix, 194, 196Mangasarian–Fromowitz constraint

qualification, 6material derivative, 306Maurer–Zowe optimality condition, 42maximal monotone operator, 104mesh-independence, 183method of mapping, 306metric projection, 57monotone operator, 104

Navier–Stokes control, 143, 323Newton differentiable, 234Newton method, 129, 133nonlinear complementarity problem, 231,

243nonlinear elliptic optimal control prob-

lem, 174normal cone, 28null-space representation, 138

obstacle problem, 8, 122, 247optimal boundary control, 20optimal control, 62, 126, 172optimal distributed control, 19optimality system, 18

P -matrix, 194parabolic variational inequality, 277parameter estimation, 82parametric mathematical programming, 27

Hölder continuity of solutions, 45Lipschitz continuity of solutions, 46Lipschitz continuity of value func-

tion, 39sensitivity of value function, 55

partial semismooth Newton iteration, 244penalty method, 75penalty techniques, 24polar cone, 1, 69polyhedric, 54, 57preconditioning, 174primal equation, 18primal-dual active set strategy, 189, 202,

240proper functional, 89

quasi-directional derivative, 225

reduced formulation, xivreduced SQP method, 139regular point, 28, 166regular point condition, 5restriction operator, 183

saddle point, 103second order augmented Lagrangian, 155second order sufficient optimality condi-

tion, 156semismooth function, 216semismooth Newton method, 192, 215,

241sensitivity equation, 58sequential tangent cone, 2shape calculus, 305shape derivative, 305, 306, 308shape optimization, 305Signorini problem, 124, 263Slater condition, 6SQP (sequential quadratic programing),

129, 137stability analysis, 34state-constrained optimal control, 191, 247stationary point, 66strict complementarity, 65, 67, 76strong regularity, 31strong regularity condition, 47subdifferential, 28, 95sufficient optimality, 67sufficient optimality condition, 131superlinear convergence, 141, 221

Tresca friction, 263

Page 359: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 341

Index 341

Uzawa method, 118

value function, 39, 99variational inequality, 246

weakly lower semicontinuous, 89weakly singular problem, 148

Yosida–Moreau approximation, 87, 248

Page 360: Lagrange Multiplier Approach to Variational Problems and Applications

ItoKunisc2008/6/12page 342