
Birkhäuser Advanced Texts Basler Lehrbücher

Vassili Kolokoltsov

Differential Equations on Measures and Functional Spaces


Birkhäuser Advanced Texts Basler Lehrbücher

Series editors
Steven G. Krantz, Washington University, St. Louis, USA
Shrawan Kumar, University of North Carolina at Chapel Hill, Chapel Hill, USA
Jan Nekovář, Sorbonne Université, Paris, France

More information about this series at http://www.springer.com/series/4842


Vassili Kolokoltsov
Department of Statistics, University of Warwick, Warwick, UK
Higher School of Economics, Moscow, Russia

ISSN 1019-6242        ISSN 2296-4894 (electronic)
Birkhäuser Advanced Texts Basler Lehrbücher
ISBN 978-3-030-03376-7        ISBN 978-3-030-03377-4 (eBook)
https://doi.org/10.1007/978-3-030-03377-4

Mathematics Subject Classification (2010): 34B15, 34G10, 34G20, 34H05, 35D30, 35F20, 35F21, 35K25, 35K55, 35K67, 35Q20, 35Q82, 35Q83, 35Q84, 35Q91, 35Q92, 35S05, 45N05, 46E10, 47D03, 47D06, 47D07, 47D08, 47G20, 49J50, 91A13, 91A22

© Springer Nature Switzerland AG 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This book is published under the imprint Birkhäuser, www.birkhauser-science.com, by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.


Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

Standard notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

1 Analysis on Measures and Functional Spaces

1.1 Banach spaces: notations and examples . . . . . . . . . . . . . . . 1

1.2 Smooth functions on Banach spaces . . . . . . . . . . . . . . . . . 9

1.3 Additive and multiplicative integrals . . . . . . . . . . . . . . . . . 14

1.4 Differentials of the norms . . . . . . . . . . . . . . . . . . . . . . . 19

1.5 Smooth mappings between Banach spaces . . . . . . . . . . . . . . 23

1.6 Locally convex spaces and Fréchet spaces . . . . . . . . . . . . . . 26

1.7 Linear operators in spaces of measures and functions . . . . . . . . 37

1.8 Fractional calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

1.9 Generalized functions: main operations . . . . . . . . . . . . . . . 56

1.10 Generalized functions: regularization . . . . . . . . . . . . . . . . . 59

1.11 Fourier transform, fundamental solutions and Green functions . . 64

1.12 Sobolev spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

1.13 Variational derivatives . . . . . . . . . . . . . . . . . . . . . . . . . 71

1.14 Derivatives compatible with duality, AM- and AL-spaces . . . . . 77

1.15 Hints and answers to chosen exercises . . . . . . . . . . . . . . . . 80

1.16 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 83

2 Basic ODEs in Complete Locally Convex Spaces

2.1 Fixed-point principles for curves in Banach spaces . . . . . . . . . 85

2.2 ODEs in Banach spaces: well-posedness . . . . . . . . . . . . . . . 88

2.3 Linear equations and chronological exponentials . . . . . . . . . . 93

2.4 Linear evolutions involving spatially homogeneous ΨDOs . . . . . 98

2.5 Hamiltonian systems, boundary-value problems and the method of shooting . . . . . . . . . . 104

2.6 Hamilton–Jacobi equation, method of characteristics and calculus of variations . . . . . . . . . . 113


2.7 Hamilton–Jacobi–Bellman equation and optimal control . . . . . . 119

2.8 Sensitivity of integral equations . . . . . . . . . . . . . . . . . . . . 125

2.9 ODEs in Banach spaces: sensitivity . . . . . . . . . . . . . . . . . 131

2.10 Linear first-order partial differential equations . . . . . . . . . . . 133

2.11 Equations with memory: causality . . . . . . . . . . . . . . . . . . 135

2.12 Equations with memory: fractional derivatives . . . . . . . . . . . 137

2.13 Linear fractional ODEs and related integral equations . . . . . . . 139

2.14 Linear fractional evolutions involving spatiallyhomogeneous ΨDOs . . . . . . . . . . . . . . . . . . . . . . . . . . 142

2.15 Sensitivity of integral and differential equations:advanced version . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

2.16 ODEs in locally convex spaces . . . . . . . . . . . . . . . . . . . . 148

2.17 Monotone and accretive operators . . . . . . . . . . . . . . . . . . 150

2.18 Hints and answers to chosen exercises . . . . . . . . . . . . . . . . 153

2.19 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 155

3 Discrete Kinetic Systems: Equations in l+p

3.1 Equations in Rn+ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

3.2 Examples in Rn+: replicator dynamics and mass-action-law kinetics . . . . . . . . . . 162

3.3 Entropy and equilibria for linear evolutions in Rn+ . . . . . . . . . 169

3.4 Entropy and equilibria for nonlinear evolutions in Rn+ . . . . . . . 172

3.5 Kinetic equations for collisions, fragmentation, reproduction and preferential attachment . . . . . . . . . . 175

3.6 Simplest equations in l+p . . . . . . . . . . . . . . . . . . . . . . . . 182

3.7 Existence of solutions for equations in l+p . . . . . . . . . . . . . . 186

3.8 Additive bounds for rates . . . . . . . . . . . . . . . . . . . . . . . 188

3.9 Evolution of moments under additive bounds . . . . . . . . . . . . 191

3.10 Accretive operators in lp . . . . . . . . . . . . . . . . . . . . . . . . 194

3.11 Accretivity for evolutions with additive rates . . . . . . . . . . . . 197

3.12 The major well-posedness result in l+p . . . . . . . . . . . . . . . . 200

3.13 Sensitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

3.14 Second-order sensitivity . . . . . . . . . . . . . . . . . . . . . . . . 206

3.15 Stability of solutions with respect to coefficients . . . . . . . . . . 208

3.16 Hints and answers to chosen exercises . . . . . . . . . . . . . . . . 209

3.17 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 210


4 Linear Evolutionary Equations: Foundations

4.1 Semigroups and their generators . . . . . . . . . . . . . . . . . . . 213

4.2 Semigroups of operators on Banach spaces . . . . . . . . . . . . . 219

4.3 Simple diffusions and the Schrödinger equation . . . . . . . . . . . 223

4.4 Evolutions generated by powers of the Laplacian . . . . . . . . . . 233

4.5 Evolutions generated by ΨDOs with homogeneous symbols and their mixtures . . . . . . . . . . 237

4.6 Perturbation theory and the interaction picture . . . . . . . . . . . 244

4.7 Path integral representation . . . . . . . . . . . . . . . . . . . . . . 250

4.8 Diffusion with drifts and Schrödinger equations with singular potentials and magnetic fields . . . . . . . . . . 254

4.9 Propagators and their generators . . . . . . . . . . . . . . . . . . . 259

4.10 Well-posedness of linear Cauchy problems . . . . . . . . . . . . . . 263

4.11 The operator-valued Riccati equation . . . . . . . . . . . . . . . . 266

4.12 An infinite-dimensional diffusion equation in variational derivatives . . . . . . . . . . 270

4.13 Perturbation theory for propagators . . . . . . . . . . . . . . . . . 272

4.14 Diffusions and Schrödinger equations with nonlocal terms . . . . . 278

4.15 ΨDOs with homogeneous symbols (time-dependent case) . . . . . 281

4.16 Higher-order ΨDEs with nonlocal terms . . . . . . . . . . . . . . . 284

4.17 Hints and answers to chosen exercises . . . . . . . . . . . . . . . . 285

4.18 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 286

5 Linear Evolutionary Equations: Advanced Theory

5.1 T -products with three-level Banach towers . . . . . . . . . . . . . 289

5.2 Adding generators with 4-level Banach towers . . . . . . . . . . . 296

5.3 Mixing generators . . . . . . . . . . . . . . . . . . . . . . . . . . . 304

5.4 The method of frozen coefficients: heuristics . . . . . . . . . . . . . 308

5.5 The method of frozen coefficients: estimates for the Green function . . . . . . . . . . 311

5.6 The method of frozen coefficients: main examples . . . . . . . . . . 316

5.7 The method of frozen coefficients: regularity . . . . . . . . . . . . 322

5.8 The method of frozen coefficients: the Cauchy problem . . . . . . 326

5.9 Uniqueness via duality and accretivity; generalized solutions . . . 331

5.10 Uniqueness via positivity and approximations; Feller semigroups . 334

5.11 Lévy–Khintchin generators and convolution semigroups . . . . . . 339

5.12 Potential measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

5.13 Vector-valued convolution semigroups . . . . . . . . . . . . . . . . 348

5.14 Equations of order at most one . . . . . . . . . . . . . . . . . . . . 351

5.15 Smoothness and smoothing of propagators . . . . . . . . . . . . . 354

5.16 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 358


6 The Method of Propagators for Nonlinear Equations

6.1 Hamilton–Jacobi–Bellman (HJB) and Ginzburg–Landau equations . . . . . . . . . . 362

6.2 Higher-order PDEs and ΨDEs, and Cahn–Hilliard-type equations . . . . . . . . . . 370

6.3 Nonlinear evolutions and multiplicative-integral equations . . . . . 371

6.4 Causal equations and general path-dependent equations . . . . . . 374

6.5 Simplest nonlinear diffusions: weak treatment . . . . . . . . . . . . 377

6.6 Simplest nonlinear diffusions: strong treatment . . . . . . . . . . . 379

6.7 Simplest nonlinear diffusions: regularity and sensitivity . . . . . . 382

6.8 McKean–Vlasov equations . . . . . . . . . . . . . . . . . . . . . . . 385

6.9 Landau–Fokker–Planck-type equations . . . . . . . . . . . . . . . . 393

6.10 Forward-backward systems . . . . . . . . . . . . . . . . . . . . . . 394

6.11 Linearized evolution around non-linear propagators . . . . . . . . 396

6.12 Sensitivity of nonlinear propagators . . . . . . . . . . . . . . . . . 401

6.13 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 402

7 Equations in Spaces of Weighted Measures

7.1 Conditional positivity . . . . . . . . . . . . . . . . . . . . . . . . . 405

7.2 Simplest equations that preserve positivity . . . . . . . . . . . . . 407

7.3 Path-dependent equations and forward-backward systems . . . . . 413

7.4 Kinetic equations (Boltzmann, Smoluchowski, Vlasov, Landau) and replicator dynamics . . . . . . . . . . 416

7.5 Well-posedness for basic kinetic equations . . . . . . . . . . . . . . 424

7.6 Equations with additive bounds for rates . . . . . . . . . . . . . . 427

7.7 On the sensitivity of kinetic equations . . . . . . . . . . . . . . . . 431

7.8 On the derivation of kinetic equations: second quantization and beyond . . . . . . . . . . 432

7.9 Interacting particles and measure-valued diffusions . . . . . . . . . 438

7.10 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 440

8 Generalized Fractional Differential Equations

8.1 Green functions of fractional derivatives and the Mittag-Leffler function . . . . . . . . . . 444

8.2 Linear evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446

8.3 The fractional HJB equation and related equations with smoothing generators . . . . . . . . . . 450

8.4 Generalized fractional integration and differentiation . . . . . . . . 453

8.5 Generalized fractional linear equations, part I . . . . . . . . . . . . 458

8.6 Generalized fractional linear equations, part II . . . . . . . . . . . 463

8.7 The time-dependent case; path integral representation . . . . . . . 468


8.8 Chronological operator-valued Feynman–Kac formula . . . . . . . 475

8.9 Summary and comments . . . . . . . . . . . . . . . . . . . . . . . 479

9 Appendix

9.1 Fixed-point principles . . . . . . . . . . . . . . . . . . . . . . . . . 481

9.2 Special functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483

9.3 Asymptotics of the Fourier transform: power functions and their exponents . . . . . . . . . . 485

9.4 Asymptotics of the Fourier transform: functions of power growth . . . . . . . . . . 491

9.5 Argmax in convex Hamiltonians . . . . . . . . . . . . . . . . . . . 498

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 521


Preface

Objectives, scope and methodology

This is an advanced text on ordinary differential equations (ODEs) in Banach and more general locally convex spaces, most notably ODEs on measures and various function spaces. The methodology is carefully chosen in order to provide a very concise introduction of the fundamentals and then move on quickly, but rigorously and systematically, to the forefront of modern research in linear and nonlinear PDEs, ΨDEs, general kinetic equations and fractional evolutions. More than half of the book content has not previously been included in any textbook. Other parts have been streamlined and given unified arguments.

The level of generality was chosen such that the book content is suitable for the study of the most important nonlinear equations of mathematical physics, such as Boltzmann, Smoluchowski, Vlasov, Landau–Fokker–Planck, Cahn–Hilliard, Hamilton–Jacobi–Bellman, nonlinear Schrödinger or McKean–Vlasov diffusions and their nonlocal extensions, of mass-action-law kinetics from chemistry, as well as of nonlinear evolutions arising in evolutionary biology, mean-field games, optimization theory, epidemics and systems biology, in general models of interacting particles or agents that describe splitting and merging, collisions and breakage, mutations or the preferential-attachment growth on networks. With this objective in mind, the abstract vector spaces are introduced and studied mostly not for their own sake, but as a convenient tool for storing and summarizing the basic properties of the concrete infinite-dimensional spaces of smooth or integrable functions, measures and distributions (generalized functions), which are crucial for the above-mentioned equations. In other words, the general theory is developed as a tool for effectively solving concrete problems, and it aims at simplifying and not complicating the matter. In accordance with this approach, we are not dealing much with 'pathologies' that may arise in abstract spaces, but rather focus on the regularity properties of the most important classes of equations.

A large number of remarks and comments are scattered throughout the text which stress the interconnections between various parts of the book and aim at revealing where and how a particular result is used in other chapters, or may be used in other contexts.

In order to make the text appealing and accessible to readers with different backgrounds, much attention is paid to the clarification of the links between the languages of pseudo-differential operators (ΨDOs), generalized functions, operator theory, abstract linear spaces, fractional calculus and path integrals. With the same objective in mind, lots of attention is paid to proper definitions of all the objects that are used. Some definitions are even repeated in different chapters. A detailed subject index refers to the pages where the corresponding notions are defined. Also, the book contains many exercises that deal with examples and further developments. Note that these exercises never substitute for the proofs of the main results. Solutions are provided for most exercises of the four initial chapters. Exercises in later chapters are more research-oriented.


General context and specific features

The basic classes of partial differential, integral and pseudo-differential equations usually lead to ODEs in infinite-dimensional spaces with unbounded, not Lipschitz-continuous and/or singular coefficients. Roughly speaking, our major tools in their study derive from the methods of semigroups and propagators, on the one hand, and from the exploitation of some kind of positivity preservation, on the other.

The overall emphasis is on the well-posedness of the problems (existence, uniqueness and continuous dependence of solutions on the initial data), sensitivity (smooth dependence of solutions on initial data and/or parameters), regularity of the solutions in various classes of smooth functions equipped with pointwise or integral norms, with precise growth estimates, and either the integral representations of solutions (whenever possible) or the natural approximating schemes that allow for various kinds of numerical algorithms to be employed for finding solutions. Apart from being crucial for numerical computations, explicit estimates are important for studying equations with random coefficients, which requires precise control over all bounds. Together with the regular solutions, various concepts of generalized solutions are introduced, whereby two basic classes of such solutions are stressed: generalized solutions by approximations (that can be approximations of regular solutions or approximations by discrete times) and generalized solutions by duality (that can be a Banach-space duality or, more generally, duality of locally convex spaces, the most notable example of the latter being the method of generalized functions).

A unique feature of the exposition in this book is that it is strongly influenced by the links with probability theory and Markov processes (Feller semigroups, Lévy–Khintchin generators and path integrals are standard players in stochastic analysis, but not in standard texts on ODEs), though always remaining independent of these links. The links are crucial for modern developments in the field, since probability theory keeps steadily penetrating all areas of natural and even social sciences. They are made explicit in several side notes that are aimed at readers with some knowledge of stochastic analysis. The links are revealed in detail in [147], [148]. Other accessible books on the links between PDEs and stochastics are [12], [68] and [118].

The exposition is also strongly influenced by fractional calculus, which is rapidly developing as an appropriate tool to deal with various complex problems in natural and social sciences, see, e.g., [253, 255, 263]. Although results on fractional differential equations are analysed in special sections (which readers may choose to omit), it turns out that the types of singularities occurring in fractional equations are in fact quite common in other settings, like nonlinear diffusions or general perturbation theory estimates. Therefore, the growth estimates of the solutions and their sensitivity to parameters are naturally expressed in terms of the Mittag-Leffler functions in various, seemingly unrelated contexts, these functions being the main players in fractional calculus. A unified abstract framework for these contexts makes it possible to treat them in a very effective and concise way.


The source of many developments in the theory of differential equations (especially nonlinear differential equations) can be traced back to the analysis of systems of interacting particles (or agents in the social context) in the limit of large particle numbers (dynamic law of large numbers). Although this link has not been formally developed in the book, it was crucial for the selection of material and methodology.

Another source of new methods and ideas in nonlinear differential equations is the theory of optimization and competitive control systems. This link is being developed here, including the analysis of various classes of the Hamilton–Jacobi–Bellman equation, forward-backward systems (occurring in mean-field games), the Riccati equation and the replicator dynamics of evolutionary game theory and controlled systems of interacting agents.

Since differential equations are a key tool in almost all developments and applications of mathematics, many introductory textbooks on differential equations are available. Traditionally, the topic is included in the undergraduate curriculum in two separate parts: ordinary differential equations (ODEs) and partial differential equations (PDEs). Examples of classical texts on ODEs are [15, 104, 220, 224]. Meanwhile, the standard theory of PDEs is much more diversified. Starting from the classical second-order equations of mathematical physics (Laplace, heat and wave equations), it utilizes various methods for various types of equations. Therefore, the boundaries of even the core of the subject are difficult to delineate, and the same applies to a comprehensive list of textbooks. However, a large portion of these methods can be unified by looking at partial differential operators as representatives of linear operators in certain abstract spaces, and then applying the tools of functional analysis. In this framework, PDEs – and more general integro-differential and pseudo-differential equations – are considered ODEs in abstract linear spaces, and the boundary between these two parts of the theory is melting away. The well-known book [219] was one of the first systematic developments of this framework. Following this general idea, the present book provides a unique concise and application-oriented exposition of ODEs on measures and functional spaces, starting from scratch and moving up to the level of modern research in many directions, including non-equilibrium statistical mechanics, nonlinear quantum mechanics, fractional evolutions, evolutionary biology, models of interacting agents, and others.

Readers and prerequisites

This textbook is designed to serve as a multi-purpose learning resource on differential equations. It is mostly aimed at postgraduate or final year undergraduate courses for mathematics students. However, the final chapters can also be of interest to researchers in linear and nonlinear differential equations. On the other hand, the book can also be used for basic undergraduate courses and self-studies. For instance, Chapter 2 can be considered an intensive undergraduate introductory course on ODEs, if the term 'Banach space' is substituted by 'Euclidean space Rd' and if the integration methods for the simplest one-dimensional equations are added as exercises. A study in this framework would help the students to grasp the more abstract approaches from the beginning of their curriculum, which makes their further transition to graduate courses easier. Similarly, Chapter 3 on the equations in Rd+ and l+1 does not require any prerequisites apart from introductory calculus and linear algebra.

The overall level of presentation is meant to be appropriate for readers who are familiar with basic calculus and linear algebra, with the principles of convergence in general compact, metric and Banach spaces, with the basic notions of linear spaces, linear operators, dual spaces and operators, with the theory of measure, integration and Lp-spaces, and occasionally with some functions of complex variables. No prior knowledge of ODEs is assumed.

Each of the first four chapters can serve as the basis for a crash course on their respective topic. Put together, they cover a range of topics that is appropriate for a full one-semester module. Elements of the other chapters can be used to enhance the course in various particular directions, or as a basis for more advanced courses. For each chapter, a specific abstract and summary are provided.

The material of the last chapters has been adapted from research articles and specialized monographs. Their topical selection was of course influenced by the research interests of the author, but a strong attempt was made to choose only topics within the mainstream of research, including general methods which can be used in a variety of developments and which at the same time have grown mature enough to be presented in a more or less final form.

Though aimed at mathematicians and filled with abstract theory, the book is meant to be truly application-oriented: not in the sense that it is to be used for producing certain concrete industrial products, but in the sense that the abstract theory is developed in order to effectively solve basic concrete problems that arise in natural sciences or modelling of social processes – that is, as a tool to streamline and simplify the analysis of these problems, and not for the sake of generality in its own right.

Bibliographic comments

Due to the immense amount of literature on the topics that are touched upon in the book, it was unfortunately impossible to provide an exhaustive guide to all relevant contributions. Instead, the given bibliography essentially includes the sources that have been used by the author for the preparation of the manuscript, as well as some classical textbooks and key references for related further developments.

Acknowledgement

It is my pleasure to express my gratitude for fruitful discussions and joint research on the topics reflected in this book to my colleagues, collaborators and PhD students, especially to S. Assing, A. Bensoussan, A. Hilbert, M.E. Hernández-Hernández, A. Kulik, S. Katsikas, O.A. Malafeyev, L. Toniazzi, M. Troeva, M. Veretennikova and W. Yang. Also, I gratefully acknowledge the support by the Russian Academic Excellence project '5-100'.


Standard notations

N, Z, R, C Sets of natural, integer, real and complex numbers

Z+   N ∪ {0}

R+   {x ∈ R : x ≥ 0}

N∞, Z∞, R∞, C∞   Sets of sequences from N, Z, R, C

Z∞+, R∞+   Subsets of Z∞, R∞ with non-negative elements

Cd, Rd Complex and real d-dimensional spaces

(x, y) or xy   Scalar product of the vectors x, y ∈ Rd; also x² = |x|² = (x, x)

|x| or ‖x‖   Standard Euclidean norm √(x, x) of x ∈ Rd or x ∈ Cd

Re a, Im a   Real and imaginary parts of a complex number a

[x] Integer part of a real number x (maximal integer not exceeding x)

sgn x = sgn(x)   The sign of x (equals 1, 0, −1 if x > 0, x = 0, x < 0, respectively)

Sd Unit sphere in Rd+1

C(X)   For a metric or topological space X, the Banach space of bounded continuous functions on X equipped with the sup-norm ‖f‖ = ‖f‖C(X) = sup_{x∈X} |f(x)|

M(X) Banach space of finite signed Borel measures on X

M+(X) and P(X)   The subsets of M(X) of positive and positive normalized (probability) measures

C∞(X)   For a locally compact X, the subspace of C(X) consisting of functions that tend to zero at infinity

Cf(X)   For a positive function f on X, the Banach space of continuous functions on X with a finite norm ‖g‖Cf(X) = ‖g/f‖C(X)

Mf(X)   For a positive continuous function f on X, the space of Borel measures on X with a finite norm ‖μ‖Mf(X) = sup{(g, μ) : ‖g‖Cf(X) ≤ 1}

Cf,∞(X)   For a locally compact X and a positive function f, the subspace of Cf(X) consisting of functions g such that g(x)/f(x) → 0 as x → ∞

Ck(Rd), or short Ck   Banach space of k times continuously differentiable functions with bounded derivatives on Rd, with the norm being the sum of the sup-norms of the function itself and all its partial derivatives up to and including the order k

Ck∞(Rd) ⊂ Ck(Rd)   Functions whose derivatives up to and including order k are all in C∞(Rd)

∇f = (∇1f, . . . , ∇df) = (∂f/∂x1, . . . , ∂f/∂xd)   The gradient of the function f

Δ = ∇² = ∑_j ∇²_j   The Laplacian operator

∇⊗²f = ∂²f/∂x²   The matrix of the second-order derivatives of f, sometimes referred to as the Hessian


Lp(X,μ), p ≥ 1   The Banach spaces of (the equivalence classes of) integrable functions on the metric or topological space X with respect to the Borel measure μ, equipped with the p-norm ‖f‖p = [∫ |f(x)|^p μ(dx)]^{1/p}

Lp(Rd)   The space Lp(Rd, μ) with Lebesgue measure μ

S(Rd)   Schwartz space of fast-decreasing functions: {f ∈ C∞(Rd) : ∀ k, l ∈ N, |x|^k ∇^l f ∈ C∞(Rd)}

|ν|   The (positive) total variation measure for a signed measure ν

(f, g) = ∫ f(x)g(x) dx   Scalar product for functions f, g on Rd. For f ∈ C(X), μ ∈ M(X), the following notation is used: (f, μ) = (μ, f) = ∫_X f(x) μ(dx)

AT or A′ Transpose of a matrix A

A∗ or A′   Dual or adjoint operator of A

kerA, trA Kernel and trace of the matrix A

1M   Indicator function of a set M (equals one or zero according to whether its argument is in M or not)

1 Constant function equal to one, and also the identity operator

f = O(g) For functions f and g, this means that |f | ≤ Cg for some constant C

f = o(g) as x → a   For functions f and g, this means that lim_{x→a} f(x)/g(x) = 0

Standard abbreviations

ODE Ordinary differential equation

PDE Partial differential equation

ΨDE Pseudo-differential equation

ΨDO Pseudo-differential operator

r.h.s., l.h.s. Right-hand side and left-hand side, respectively


Chapter 1

Analysis on Measures and Functional Spaces

In this chapter, we shall review some key facts on the calculus of smooth mappings between Banach spaces, Fréchet spaces and general locally convex spaces, and their key representatives, including the spaces of generalized functions (or distributions). Sections 1.1 to 1.6 deal with abstract notions, and the remaining part provides more concrete information on spaces of measures and functions, their dual spaces and the structure of the basic classes of linear operators, including multidimensional mixed fractional derivatives and ΨDOs. In order to be reasonably self-contained, we supply most of the proofs, apart from some standard facts that are clearly formulated and provided with references where the proofs can be found.

1.1 Banach spaces: notations and examples

In this introductory section, we explain the notations for the basic Banach spaces of functions, operators and measures that are used throughout the book without further reminder. Apart from fixing notations, the objective is to draw the circle of notions and ideas that comprise the main building blocks for the development in this treatise and that the reader is supposed to be familiar with. Besides standard calculus and linear algebra (including the Fourier transform), this includes the theory of measure and integration, convergence in compact, metric (and rarely in general topological) spaces and basic theoretical facts on Banach spaces. It is only on rare occasions that we use deeper facts of functional analysis like the Hahn–Banach theorem, Baire's theorem on categories or Schauder's fixed-point principle. Details on all these prerequisites can be found in many standard texts on functional analysis, including [166, 231] and [115].


We shall mostly work with Banach spaces over the field of real numbers. Recall that for any Banach space B with the norm ‖.‖ = ‖.‖B (we write simply ‖ξ‖ if it is clear which Banach space we are talking about), the dual Banach space, usually denoted by B∗ or B′, is defined as the space of continuous linear functionals z on B, the value of z at y being usually denoted by z(y) = (z, y). This space is Banach with respect to the norm

‖z‖∗ = sup{|z(y)| : ‖y‖ ≤ 1}.

Remark 1. For a real Hilbert space H, for instance the usual Euclidean space Rn, H can be identified with its dual H∗, so that (z, y) is the inner product of z and y, which coincides with the value of z on y.

Each Banach space B is naturally embedded in its second dual space B∗∗, since any element y ∈ B defines the linear functional on B∗ via the formula y(z) = (z, y). The Banach space B is called reflexive if this embedding is a bijection, that is, the second dual B∗∗ is isomorphic to B. Reflexive spaces share many features with Hilbert spaces and are usually more convenient for analysis. The main examples of Banach spaces in this book, however, are not reflexive.

The weakest topology of B∗ ensuring that all functionals from B are continuous is called the ∗-weak topology of B∗. Thus b∗_n → b∗ in the ∗-weak topology means that (b∗_n, b) → (b∗, b) for any b ∈ B. The weakest topology of B ensuring that all functionals from B∗ are continuous is called the weak topology of B. If B is reflexive, then the weak and the ∗-weak topology coincide on B∗, as well as on B = (B∗)∗.

A subset Z of B∗ is said to separate points of B if for any b1 ≠ b2 ∈ B there exists z ∈ Z such that (z, b1) ≠ (z, b2).

In many situations, it is handy to work with dual pairs of Banach spaces. In physics, these pairs usually represent observables and states. Therefore we shall sometimes use the notations Bst and Bobs for the dual pairs. We say that a pair of Banach spaces (Bobs, Bst) is a dual pair if each of these spaces is a closed subspace of the dual of the respective other space that separates the points of the latter. Given a pair (Bobs, Bst), one defines the weak topology of Bobs (respectively Bst) with respect to this dual pair as the weakest topology that makes all functionals from Bst (respectively Bobs) continuous.

A linear operator A between two Banach spaces B1 and B2 is a linear mapping A : D → B2, where D is a subspace of B1 called the domain of A. The operator A is said to be densely defined if D is dense in B1. The operator A is called bounded if the norm ‖A‖ = sup_{x∈D} ‖Ax‖/‖x‖ is finite. If A is bounded and D is dense, then A has a unique bounded extension (with the same norm) to an operator with the whole B1 as domain.

A linear operator on a Banach space is called a contraction if its norm does not exceed 1. For an operator A : B → B, the dual operator A∗ : B∗ → B∗ is defined by the equation (A∗z, b) = (z, Ab).


It is known (and not difficult to show) that a linear operator A : B1 → B2 is continuous if and only if it is bounded. For a continuous linear mapping A : B1 → B2, its norm is defined as

‖A‖B1→B2 = ‖A‖L(B1,B2) = sup_{x≠0} ‖Ax‖B2 / ‖x‖B1. (1.1)

The space of bounded linear operators B1 → B2 equipped with this norm is a Banach space itself, often denoted by L(B1, B2). For B1 = B2 = B we shall also use the shorter notation ‖A‖L(B) or even simpler ‖A‖B instead of ‖A‖L(B,B). A sequence of bounded operators An, n = 1, 2, . . . , from B1 to B2 is said to converge strongly to an operator A if Anf → Af for any f ∈ B1. This defines the strong topology on L(B1, B2), which is weaker than the norm topology. If B2 = R, then L(B1,R) = B∗_1 and the strong topology turns into the ∗-weak topology of B∗_1.

Exercise 1.1.1. Show that ‖A∗‖ = ‖A‖ for any bounded operator A : B → B.

A bilinear operator from B1 to B2 is a mapping D : B1 × B1 → B2 which is linear with respect to each of its two variables. It is called bounded if there exists a constant d such that ‖D(x, y)‖ ≤ d‖x‖ ‖y‖. The minimal such constant is called the norm of D. Clearly, if D is bounded, then it is continuous as a function of two variables. Let us denote by L2(B1, B2) the space of bounded symmetric bilinear operators from B1 to B2, which is Banach with respect to the above-defined norm. Similarly, one defines the spaces Ln(B1, B2) of multi-linear symmetric bounded mappings B1 × · · · × B1 → B2.

Remark 2. If a bilinear mapping between Banach spaces is continuous with respect to each of its variables separately, then it is bounded. This fact follows from the principle of uniform boundedness.

The following examples of Banach spaces are going to appear quite often in this book.

For a topological (in particular metric) space X and a Banach space B, we define the space C(X,B) of continuous bounded functions f : X → B. Note that this space is often denoted by Cb(X,B) in other literature. It is a Banach space equipped with the sup-norm

‖f‖C(X,B) = sup_{x∈X} ‖f(x)‖B.

Sometimes we shall also use the space Cuc(X,B) of bounded, uniformly continuous functions on X. Another established notation for this space is BUC(X,B). For the special case B = R, we write shortly C(X) for C(X,R) and Cuc(X) for Cuc(X,R). Both of them are Banach spaces with the norm ‖f‖sup = sup_x |f(x)|. The space Cuc(X) is a closed subspace of C(X).

Particular cases are the space (Rd, sup) of d-dimensional vectors with the norm ‖y‖sup = max_j |y_j| and the space l∞ ⊂ R∞ of bounded sequences with the norm ‖y‖sup = sup_j |y_j|.


Integration represents a pairing of measures and functions. Whenever the integral exists, one therefore often uses the notation (f, μ) = (μ, f) = ∫ f(x) μ(dx) for a function f and a measure μ. The space M(X) for a topological (in particular metric) space X is the space of bounded signed measures on the Borel σ-algebra of X (i.e., the algebra that is generated by all open subsets). It is a Banach space with respect to the norm

‖μ‖ = sup{|(f, μ)| : f ∈ C(X), ‖f‖sup ≤ 1}.

This norm coincides with the full variation norm ‖μ‖ = (1, μ+ + μ−), where μ = μ+ − μ− is the Hahn decomposition of μ into its positive and negative parts. The positive measure |μ| = μ+ + μ− is often referred to as the total variation measure of μ. The subset of positive measures is denoted by M+(X). The elements of M+(X) that have a total measure of 1 are called probability measures, and the set of these measures is denoted by P(X).

An important example for measures are the Dirac measures or atoms δx, which assign the measure 1 to the point x and zero to the complement of x. A measure μ ∈ M(X) is called discrete or atomic if μ = ∑_{j=1}^∞ aj δxj with some summable sequence {aj} of numbers and some sequence {xj} of points in X. A measure is called continuous if it contains no atoms (i.e., any point is a set of zero measure). It is known (and easy to show) that the sets of discrete and continuous measures are closed Banach subspaces in M(X).
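For purely atomic measures, the norm and the Hahn decomposition above are completely explicit: ‖μ‖ = ∑_j |aj|, with μ± collecting the positive and negative weights. A minimal numerical sketch illustrating this and the dual formula for the norm (the atoms and weights below are hypothetical data, not taken from the text):

```python
import numpy as np

# A purely atomic signed measure mu = sum_j a_j * delta_{x_j} (hypothetical data).
points = np.array([0.0, 1.0, 2.5, 4.0])
weights = np.array([0.7, -0.3, 1.2, -0.5])

# Hahn decomposition mu = mu_+ - mu_- into positive and negative parts.
mu_plus = np.clip(weights, 0, None)
mu_minus = np.clip(-weights, 0, None)

# Total variation norm ||mu|| = (1, mu_+ + mu_-) = sum_j |a_j|.
tv_norm = (mu_plus + mu_minus).sum()

# The dual formula sup{|(f, mu)| : ||f||_sup <= 1} is attained at f(x_j) = sgn(a_j).
f_opt = np.sign(weights)
pairing = np.dot(f_opt, weights)

print(tv_norm, pairing)   # both equal 2.7
```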

Occasionally, we shall use unbounded measures μ, which will always be taken from the class of the so-called Radon measures MR(X). These measures have the property that |μ|(A) is finite for any compact set A. For μ1, μ2 ∈ MR(X), the complex-valued function μ1 + iμ2 on Borel subsets of X is called a complex Radon measure. The space of such measures is denoted by MC_R(X). For μ = μ1 + iμ2, the total variation measure of μ is defined as the positive measure |μ| = |μ1| + |μ2|. The space of bounded complex measures is denoted by MC(X).

An important special case of unbounded measures and functions is given by weighted spaces. For a continuous non-negative function L on X, let the weighted function space CL(X) be the space of continuous functions on X with the finite norm

‖f‖CL(X) = inf{K : |f(x)| ≤ KL(x) for all x}. (1.2)

Similarly, one defines the weighted measure space M(X,L) of Borel measures on X with the finite norm

‖μ‖M(X,L) = ∫_X L(x) |μ|(dx) = sup{(g, μ) : ‖g‖CL(X) ≤ 1}. (1.3)

If L is strictly positive, then the above-defined weighted spaces are Banach spaces and all measures from M(X,L) are Radon measures.

If X is a locally compact space, the closed subspace C∞(X) (respectively C∞(X,B) for a Banach space B) of C(X) (respectively of C(X,B)) consists of functions that vanish at infinity, i.e., such functions f that for any ε > 0 there exists a compact K such that |f(x)| < ε for x ∉ K. Similarly, for a positive function L on X the space CL,∞(X) is the closed subspace of CL(X) which contains functions f such that f(x)/L(x) vanishes at infinity. The fundamental Riesz–Markov theorem states that M(X) coincides with the dual space [C∞(X)]∗, which defines the ∗-weak topology on M(X). For the space C∞(N), which is the closed subspace of l∞ containing sequences {y1, y2, . . .} such that yn → 0 as n → ∞, we use the short term c∞. (Another established notation for this space is c0.)

If a positive Borel measure μ is chosen on X, the spaces of (the equivalence classes of) integrable functions Lp(X,μ), p ≥ 1, are defined in the usual way. They are equipped with the p-norm

‖f‖p = [∫ |f(x)|^p μ(dx)]^{1/p}.

Special cases are the spaces lp ⊂ R∞ of sequences with the bounded norm

‖y‖p = (∑_{j=1}^∞ |y_j|^p)^{1/p},

and their finite-dimensional subsets Rd. As usual, we write Lp(Ω) for Ω ⊂ Rn, if the measure is assumed to be Lebesgue.

The space L∞(X,μ) is the space of (equivalence classes of) bounded functions on X with the ess-sup-norm (the supremum that is obtained by ignoring sets of measure zero), denoted by ‖f‖∞. For X = N, L∞(X,μ) turns into the space l∞ of bounded sequences, where the sup-norm and ess-sup-norm coincide.

It is a known fact that [Lp(X,μ)]∗ = Lq(X,μ) for p ≥ 1 with 1/p + 1/q = 1, including the case p = 1 and q = ∞, but excluding p = ∞. By this correspondence, an element f ∈ Lq(X,μ) defines a functional on Lp(X,μ) via integration: f(g) = ∫ f(x)g(x) μ(dx). In particular, l∗_p = l_q. Moreover, l1 = (c∞)∗.

According to tradition, the weak topology on M(X) (also called narrow topology by some authors) is meant to be the weak topology with respect to the pair (C(X),M(X)). Note that this definition differs from the general definition of weak topology as given above. Only for l1 = M(N) = L1(N) (where in the notation L1(N) the set N is supposed to be equipped with the uniform measure that assigns the unit measure to any point) the dual space is l∞ = C(N) and the weak topology for measures coincides with the weak topology in the general sense of functional analysis.

When working with the weak topology, Prokhorov's compactness criterion plays an important role: the bounded family μt of Borel measures on a complete metric space X is relatively compact in the weak topology if it is tight, that is, for any ε > 0 there exists a compact set K ⊂ X such that μt(X \ K) < ε for all t. For example, let Pp(Rd) denote the subset of P(Rd) that contains probability measures μ with a finite pth order moment, i.e., with ∫ |x|^p μ(dx) < ∞. Then, for any p > 0 and λ > 0, the set

M = {μ ∈ Pp(Rd) : ∫ |x|^p μ(dx) ≤ λ}

is compact in the weak topology of P(Rd). More generally, for any non-negative continuous function L on a locally compact space X such that L(x) → ∞ as x → ∞, the set

M+≤λ(X,L) = {μ ∈ M+(X) : ∫_X L(x) μ(dx) ≤ λ} (1.4)

is compact in the weak topology of M(X) for any λ > 0.

A sequence of measures μn is said to converge vaguely to μ if (μn, φ) converges to (μ, φ) for any continuous φ with a compact support. For bounded sequences μn on a locally compact space, the vague and ∗-weak convergence coincide. However, unbounded sequences of measures can converge vaguely (and not ∗-weakly) to an unbounded measure.

Remark 3. A sequence of measures μn converges vaguely to a measure μ if and only if μn converges in the space D′ of generalized functions, see Section 1.9.

Let us point out the crucial difference between strong Banach topologies and weak topologies in M(X) for the example X = Rd. Recall that a measure μ ∈ M(Rd) is called absolutely continuous (respectively singular) with respect to the Lebesgue measure if μ has a density with respect to the Lebesgue measure (or if the whole measure is concentrated on a set of zero Lebesgue measure, respectively). The famous Lebesgue decomposition theorem states that any continuous measure in Rd can be uniquely represented as the sum of an absolutely continuous and a singular measure. Moreover, the sets of absolutely continuous and singular measures are closed Banach subspaces in M(Rd). In the weak topology, the situation is different: here, the space of absolutely continuous measures (which is naturally isomorphic to the space L1(Rd)) is weakly dense in M(X). In fact, let φ(x) be a mollifier, i.e., a continuous even function Rd → [0, 1] that is supported on the unit ball and has the unit norm in L1(Rd). For any μ ∈ M(X), let us define

fn(x) = n^d ∫ φ(n(x − y)) μ(dy). (1.5)

Then the sequence of measures with the continuous densities fn converges weakly to μ, as n → ∞.

Exercise 1.1.2. Prove this statement.

Exercise 1.1.3. Show that the set of discrete measures is weakly dense in M(X).
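The smoothing (1.5) is easy to probe numerically. Below is a minimal one-dimensional sketch (all data are hypothetical) that uses the hat mollifier φ(x) = max(1 − |x|, 0), which is even, supported on [−1, 1] and has unit L1-norm, and checks the weak convergence of the densities fn against a bounded continuous test function:

```python
import numpy as np

# A 1D illustration of (1.5): smoothing a discrete measure mu = sum_j a_j delta_{x_j}
# with the hat mollifier phi(x) = max(1 - |x|, 0) (even, supported on [-1, 1], unit L^1 norm).
phi = lambda x: np.clip(1.0 - np.abs(x), 0.0, None)

atoms = np.array([-1.0, 0.3, 2.0])           # hypothetical atoms of mu
weights = np.array([0.5, 0.2, 0.3])          # hypothetical weights (a probability measure)
g = lambda x: np.cos(x)                      # bounded continuous test function

exact = np.dot(weights, g(atoms))            # (g, mu)

x = np.linspace(-5.0, 5.0, 200001)
for n in [1, 10, 100]:
    f_n = n * sum(a * phi(n * (x - xj)) for a, xj in zip(weights, atoms))
    approx = np.trapz(g(x) * f_n, x)         # (g, f_n dx) -> (g, mu) as n grows
    print(n, approx, abs(approx - exact))
```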

Page 23: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.1. Banach spaces: notations and examples 7

For the analysis of ODEs, Lipschitz-continuous functions play a crucial role. We define CbLip(X) as the space of bounded Lipschitz functions with the norm

‖f‖bLip = ‖f‖ + ‖f‖Lip,   ‖f‖Lip = sup_{x≠y} |f(x) − f(y)|/ρ(x, y), (1.6)

where ρ is the metric on X. The Lipschitz constant itself does not represent a norm, since ‖f‖Lip = 0 for all constant functions f. One says that f is locally Lipschitz if for any x the Lipschitz constant of f, restricted to some neighbourhood of x, is finite. If we want to stress which metric ρ is used, we can use the notation CbLip(ρ)(X) both for the space itself and for the norms. For instance, using the l1-norm for vectors x ∈ Rd, one has

‖f‖Lip(1) = sup_{x≠y} |f(x) − f(y)|/|x − y|1 = sup_j sup |f(x) − f(y)|/|xj − yj|, (1.7)

where the last supremum is over the pairs x, y that differ only in their jth coordinate.

Exercise 1.1.4. Prove the second equation in (1.7).
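The second equality in (1.7) can also be checked numerically: on a rectangular grid the supremum of the difference quotients over all pairs and the supremum over pairs differing in a single coordinate essentially agree. A small illustrative sketch (the function f below is a hypothetical example, not one from the text):

```python
import numpy as np
from itertools import product

# Numerical check of (1.7) on a grid in R^2 with the l1 distance.
f = lambda p: np.sin(p[0]) + 0.5 * np.cos(p[0] + p[1])
grid = [np.array(p) for p in product(np.linspace(0, 2, 21), repeat=2)]

def lip_all_pairs(points):
    best = 0.0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            best = max(best, abs(f(x) - f(y)) / np.abs(x - y).sum())
    return best

def lip_coordinate_pairs(points):
    best = 0.0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            if np.count_nonzero(x != y) == 1:   # pairs differing in one coordinate only
                best = max(best, abs(f(x) - f(y)) / np.abs(x - y).sum())
    return best

# The two suprema essentially agree (up to grid effects), as (1.7) asserts.
print(lip_all_pairs(grid), lip_coordinate_pairs(grid))
```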

For X an open subset of Rd, let Ck(X) denote the space of k times continuously differentiable functions on X with uniformly bounded derivatives, equipped with the norm

‖f‖Ck(X) = ‖f‖ + ∑_{j=1}^k ‖f^{(j)}‖, (1.8)

where ‖f^{(j)}‖ is the supremum of the absolute values of all partial derivatives of f of order j. In particular, for a differentiable function, we find ‖f‖C1 = ‖f‖bLip(1). For X a closed convex subset of Rd with a nonempty interior Int(X), let Ck(X) denote the closed subspace of Ck(Int(X)) that contains functions whose derivatives up to the kth order all have a continuous extension to X. For a convex subset X ⊂ Rd, we denote by Ck∞(X) (respectively CkbLip(X)) the closed subspace of functions f from Ck(X) such that f and all its derivatives up to order k belong to C∞(X) (respectively with all derivatives up to and including order k belonging to CbLip(X)). For a Banach space B, the B-valued spaces Ck∞(X,B) and CkbLip(X,B) are similarly defined.

Let us now recall the celebrated Monge–Kantorovich theorem. It states that the weak topology on the subset P1(Rd) of probability measures with a finite first moment can be metricized by the metric

dMK(μ1, μ2) = sup{|(f, μ1 − μ2)| : |f(x) − f(y)| ≤ |x − y|}.
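For measures on the real line, dMK is the Wasserstein-1 (Kantorovich–Rubinstein) distance and equals the L1-distance between the cumulative distribution functions, which makes it easy to compute. A small sketch, assuming SciPy is available (the two discrete probability measures are hypothetical examples):

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Two hypothetical discrete probability measures on R.
x1, w1 = np.array([0.0, 1.0, 3.0]), np.array([0.2, 0.5, 0.3])
x2, w2 = np.array([0.5, 2.0]), np.array([0.6, 0.4])

# d_MK as the Wasserstein-1 distance between the two measures.
d1 = wasserstein_distance(x1, x2, u_weights=w1, v_weights=w2)

# The same number via the 1D formula  d_MK = int |F_1(t) - F_2(t)| dt.
t = np.linspace(-1.0, 4.0, 100001)
F1 = (t[:, None] >= x1[None, :]).astype(float) @ w1
F2 = (t[:, None] >= x2[None, :]).astype(float) @ w2
d2 = np.trapz(np.abs(F1 - F2), t)

print(d1, d2)   # the two values agree (both are 0.7 for this data)
```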


Let us single out an important corollary of this fact (and of the above-mentioned facts on tightness), which is very useful for working with spaces of differentiable functions.

Proposition 1.1.1. For any k ≥ 1 and λ > 0, the set

P1≤λ(Rd) = {μ ∈ P(Rd) : ∫ |x| μ(dx) ≤ λ}

is a compact subset of the dual Banach space (Ck(Rd))∗.

The existence of the metric dMK for metricizing the weak convergence allows for a better quantification of the classes of weakly continuous functions. For instance, a handy class are the weakly Lipschitz-continuous functions on measures that have a finite weak Lipschitz constant

‖F‖weakLip = sup{|F(μ1) − F(μ2)| : dMK(μ1, μ2) ≤ 1}. (1.9)

The metric dMK is not the only handy metric for metricizing the weak topology. Let us indicate another one. For that purpose, let X be a locally compact space such that the space C∞(X) is separable (which is the case, e.g., when X is a locally compact metric space), so that there exists a countable set of functions φn ∈ C∞(X), n ∈ N, of unit norm such that their finite linear combinations are dense in C∞(X). Then the function

d(μ, ν) = ∑_n 2^{−n} |(φn, μ − ν)| / (1 + |(φn, μ − ν)|) (1.10)

defines a distance that metricizes the ∗-weak topology on M(X). On weakly compact sets (1.4), weak and ∗-weak topologies coincide, which has the following implication.

Proposition 1.1.2. For any non-negative continuous function L on a locally compact metric space X such that L(x) → ∞ as x → ∞, the weak topology of the compact sets (1.4) can be given by the metric (1.10).
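A direct numerical transcription of (1.10) for discrete measures on R is sketched below; the family φn(x) = cos(nx) exp(−x²) is merely an illustrative choice of unit-norm functions in C∞(R) (an assumption of this sketch, not the construction used in the text), and the series is truncated, which retains all but an exponentially small part of the weights 2^{−n}:

```python
import numpy as np

def pairing(phi, atoms, weights):
    # (phi, mu) for a discrete measure mu = sum_j a_j delta_{x_j}
    return np.dot(weights, phi(atoms))

def d_weak(mu, nu, n_terms=30):
    # Truncated version of the series (1.10).
    total = 0.0
    for n in range(1, n_terms + 1):
        phi = lambda x, n=n: np.cos(n * x) * np.exp(-x**2)   # illustrative phi_n
        diff = abs(pairing(phi, *mu) - pairing(phi, *nu))
        total += 2.0**(-n) * diff / (1.0 + diff)
    return total

mu = (np.array([0.0, 1.0]), np.array([0.5, 0.5]))
nu = lambda e: (np.array([0.0 + e, 1.0 + e]), np.array([0.5, 0.5]))

# Small shifts of the atoms give small distances, as required by weak convergence.
for e in [1.0, 0.1, 0.01]:
    print(e, d_weak(mu, nu(e)))
```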

Finally, let us suggest a couple of exercises on the properties of the spaces of sequences lp.

Exercise 1.1.5. Let A = (Ajk), j, k ∈ N, be an infinite matrix defining a linear operator A in l∞ according to the usual rule: (Ax)_j = ∑_k Ajk xk. Check the following statements:

‖A‖l∞→l∞ = sup_j ∑_k |Ajk|,   ‖A‖l1→l∞ = sup_j sup_k |Ajk|, (1.11)

‖A‖l1→l1 = sup_k ∑_j |Ajk|,   ‖A‖l∞→l1 ≤ ∑_j ∑_k |Ajk|. (1.12)

Find an example which shows that (1.12) can be a strict inequality.
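For matrices with only finitely many nonzero entries, the formulas (1.11) and (1.12) reduce to the classical induced matrix norms (maximal absolute row sum and maximal absolute column sum), which the following sketch checks numerically; a random matrix also typically exhibits the strict inequality in (1.12):

```python
import numpy as np

# Finite-dimensional check of (1.11)-(1.12) on a random matrix (hypothetical data).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 5))
absA = np.abs(A)

# ||A||_{l_inf->l_inf} = sup_j sum_k |A_jk| and ||A||_{l_1->l_1} = sup_k sum_j |A_jk|
print(np.isclose(np.linalg.norm(A, ord=np.inf), absA.sum(axis=1).max()))
print(np.isclose(np.linalg.norm(A, ord=1), absA.sum(axis=0).max()))

# ||A||_{l_1->l_inf} = sup_{j,k} |A_jk|: the ratio ||Ax||_inf / ||x||_1 never exceeds it,
# and it is attained on basis vectors e_k.
x = rng.normal(size=(5, 2000))
ratios_1_inf = np.abs(A @ x).max(axis=0) / np.abs(x).sum(axis=0)
print(ratios_1_inf.max() <= absA.max() + 1e-12)

# ||A||_{l_inf->l_1} <= sum_{j,k} |A_jk|, typically strictly (cancellations in A @ x).
sign_x = np.sign(rng.normal(size=(5, 2000)))      # ||x||_inf = 1 for sign vectors
print(np.abs(A @ sign_x).sum(axis=0).max(), absA.sum())
```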


Exercise 1.1.6. A symmetric matrix A as defined in the previous exercise also defines a bilinear operator that is an element of L2(l∞,R). Show that

‖A‖L2(l∞,R) ≤ ∑_j ∑_k |Ajk|,   ‖A‖L2(l1,R) ≤ sup_j sup_k |Ajk|. (1.13)

Exercise 1.1.7. A sequence y(m) of elements of l∞ converges ∗-weakly to y, i.e., (y(m), x) → (y, x) for any x ∈ l1, if and only if it is uniformly bounded and each coordinate converges.

Exercise 1.1.8. A sequence y(m) of elements of l1 converges weakly to y ∈ l1 if and only if it converges in the norm of l1.

1.2 Smooth functions on Banach spaces

Let M be a closed convex subset of a Banach space B and F a real function on M. We shall assume for convenience that the linear space that is generated by M coincides with B. This can always be achieved by reducing B appropriately.

The directional derivative (less popular names are first variation or Gâteaux derivative, the latter term being usually linked to linearity, see below) of F at Y in the direction ξ ∈ B (so that hξ ∈ M − Y for some h > 0) is defined as

DξF(Y) = DF(Y)[ξ] = lim_{h→0+} (F(Y + hξ) − F(Y))/h. (1.14)

Hereby, the notation h → 0+ means that h → 0 through positive values only.

Remark 4. The different notations Dξ and D[ξ] are introduced in order to use them in different contexts, where subscripts or brackets may be overloaded with other stuff.

One says that F is directionally differentiable on M if this derivative exists at all points of M and in all eligible directions ξ (so that hξ ∈ M − Y for some h > 0). We say that the directional derivative is bounded on M if |DξF(Y)| ≤ C‖ξ‖ for all Y, ξ and a constant C. It is locally bounded if it is bounded for Y from any bounded subset of M.

From the definition, it follows that if F is directionally differentiable on M ,then

DaξF (Y ) = aDξF (Y ) (1.15)

for all a > 0. If the directional derivative is locally bounded, then for any Y ∈ M and ξ ∈ M − Y the function F(Y + sξ) of a real variable s ∈ [0, 1] has bounded right and left derivatives everywhere, and hence by the Lebesgue theorem it is differentiable almost everywhere and equals the integral of its derivative:

F(Y + ξ) − F(Y) = ∫_0^1 (d/ds)F(Y + sξ) ds = ∫_0^1 D_ξF(Y + sξ) ds. (1.16)

This is the first-order Taylor expansion for functions on Banach spaces.
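The definitions (1.14) and (1.16) can be illustrated numerically. The sketch below (not part of the formal development; the functional F and the vectors are arbitrary choices) treats a smooth functional on a finite-dimensional truncation of a sequence space, compares the one-sided difference quotient with the explicit directional derivative, and verifies the first-order Taylor expansion by a trapezoidal quadrature:

```python
import numpy as np

# A smooth functional on a truncated sequence space (here R^5 for concreteness):
def F(Y):
    return float(np.sum(np.sin(Y)))

def DF(Y, xi):
    # directional derivative D_xi F(Y) = sum_j cos(Y_j) * xi_j
    return float(np.cos(Y) @ xi)

rng = np.random.default_rng(1)
Y, xi = rng.normal(size=5), rng.normal(size=5)

# Definition (1.14): one-sided difference quotient for small h > 0.
h = 1e-6
print((F(Y + h * xi) - F(Y)) / h, DF(Y, xi))   # nearly equal

# Expansion (1.16): F(Y+xi) - F(Y) = int_0^1 D_xi F(Y + s xi) ds.
s = np.linspace(0.0, 1.0, 2001)
integrand = np.array([DF(Y + t * xi, xi) for t in s])
integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(s))  # trapezoid rule
print(F(Y + xi) - F(Y), integral)              # nearly equal
```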


The existence of a bounded directional derivative does not imply that the mapping D_ξF(Y) is linear in ξ. If the mapping D_ξF(Y) is linear in ξ, the linear operator DF(Y) : ξ ↦ D_ξF(Y) is usually called the Gateaux derivative. If this is the case for all Y, then F is called Gateaux-differentiable on M. The standard condition that is sufficient for this linearity involves the continuity with respect to Y:

Proposition 1.2.1. If F has a bounded (respectively locally bounded) directional derivative on M such that D_ξF(Y) is continuous in Y for any ξ, then D_ξF(Y) is linear in ξ for any Y, D_ξF(Y) is a continuous function of the two variables ξ and Y, and F(Y) is Lipschitz-continuous on M (respectively locally).

Proof. By the first-order Taylor expansion and (1.15), we get

F(Y + s(ξ_1 + ξ_2)) = F(Y + sξ_1) + [F(Y + sξ_1 + sξ_2) − F(Y + sξ_1)]
= F(Y + sξ_1) + s ∫_0^1 D_{ξ_2}F(Y + sξ_1 + hsξ_2) dh
= F(Y + sξ_1) + ∫_0^s D_{ξ_2}F(Y + sξ_1 + hξ_2) dh.

Due to the continuity of D_{ξ_2}F(Z) at the point Z = Y, it follows that

lim_{s→0+} (1/s)[F(Y + s(ξ_1 + ξ_2)) − F(Y)] = D_{ξ_1}F(Y) + D_{ξ_2}F(Y),

that is, the additivity of D_ξF(Y). On the other hand, passing to the limit s → 0+ in the equation

F(Y + sξ) − F(Y) = ∫_0^s D_ξF(Y + hξ) dh = −[F(Y) − F(Y + sξ)] = −∫_0^s D_{−ξ}F(Y + sξ − hξ) dh (1.17)

yields D_{−ξ}F(Y) = −D_ξF(Y), which extends (1.15) to negative a and thus completes the proof of the linearity of D_ξF(Y). The continuity of D_ξF(Y) with respect to both variables follows from this linearity. The Lipschitz-continuity of F follows from (1.16). □

Remark 5. For the first-order Taylor expansion to hold, only the local boundedness of D_ξF(Y) for any ξ is needed. Similarly, for the linearity in ξ to hold, only the local boundedness of D_ξF(Y) and its continuity in Y for any ξ are needed. In the finite-dimensional case, this would imply the boundedness in ξ. But in the infinite-dimensional case, the boundedness in ξ (or the continuity of D_ξF(Y) with respect to ξ) represents an additional assumption.


Defining the norm of the linear mapping D_ξF(Y) = DF(Y)[ξ] in the usual way, i.e.,

‖DF(Y)‖ = sup{|DF(Y)[ξ]| : ‖ξ‖ ≤ 1},

one obtains from (1.16) the formula for finite increments:

|F(Y) − F(Z)| ≤ ‖Y − Z‖ sup{‖DF(Z + s(Y − Z))‖ : s ∈ [0, 1]}. (1.18)

In many cases, a version of differentiability that is stronger than that of Gateaux becomes important. One says that a function F on M is Frechet-differentiable at Y if there exists an element DF(Y) ∈ B∗, called the derivative of F at Y, such that for any ε > 0 there exists δ such that

|F(Y + ξ) − F(Y) − DF(Y)[ξ]| ≤ ε‖ξ‖ (1.19)

for any ξ with Y + ξ ∈ M and ‖ξ‖ ≤ δ. In other words, the condition reads |F(Y + ξ) − F(Y) − DF(Y)[ξ]| = o(‖ξ‖), as ‖ξ‖ → 0. In this context, the terms ‘derivative’, ‘strong derivative’ and ‘Frechet derivative’ all mean the same thing.

Notice that the existence of the Gateaux derivative (1.14) means that for any ε there exists δ such that

|F(Y + hξ) − F(Y) − DF(Y)[hξ]| ≤ ε‖hξ‖ (1.20)

for h ∈ (0, δ). Therefore, the Frechet derivative of F is also necessarily its Gateaux derivative. The difference is that (1.19) holds uniformly for all ξ in some bounded domain.

At this point, an important peculiarity of infinite-dimensional settings has to be mentioned. The continuity of DF(Y)[ξ] as a function of two variables clearly implies that the mapping Y ↦ DF(Y) is a continuous mapping M → B∗ with B∗ considered in its ∗-weak topology, but not necessarily in the norm topology. For B = R^d, these topologies coincide, which implies that under the assumptions of Proposition 1.2.1 the Gateaux derivative coincides with the Frechet derivative (according to the next result). In general Banach spaces, one has to impose the norm continuity as an additional assumption.

Proposition 1.2.2. Under the assumptions of Proposition 1.2.1, let us assume that the mapping Y ↦ DF(Y) is continuous as a mapping from M to B∗ with the norm topology of B∗. Then F is Frechet-differentiable at any point, with the derivative being given by the directional (or Gateaux) derivative.

Proof. By the first-order Taylor expansion (1.16), we find

|F(Y + ξ) − F(Y) − DF(Y)[ξ]| = |∫_0^1 (DF(Y + sξ) − DF(Y))[ξ] ds| ≤ ‖ξ‖ sup{‖DF(Y + sξ) − DF(Y)‖ : s ∈ [0, 1]},

implying (1.19) by the continuity of DF. □


Let us denote by C^1_Frechet(M), or simply C^1(M), the space of bounded continuous functions F on M that have bounded Frechet derivatives DF(Y) such that the mapping Y ↦ DF(Y) is continuous in the norm topologies of B and B∗. The space C^1(M) is equipped with the norm

‖F‖_{C^1(M)} = sup_{Y∈M} |F(Y)| + sup_{Y∈M} ‖DF(Y)‖_{B∗}. (1.21)

Let us denote by C^1_Gat(M) the space of bounded continuous functions F on M that have bounded Gateaux derivatives DF(Y) such that DF(Y)[ξ] depends continuously on Y for any ξ (or equivalently, by Proposition 1.2.1, such that DF(Y)[ξ] is a continuous function of two variables), equipped with the same norm (1.21). It follows from Proposition 1.2.2 that C^1(M) is a closed subspace of C^1_Gat(M), and F ∈ C^1_Gat(M) belongs to C^1(M) whenever the mapping Y ↦ DF(Y) is continuous in the norm topologies of B and B∗.

The higher-order derivatives are defined recursively. For instance,

D^2F(Y)[ξ, η] = D(DF(Y)[ξ])[η].

Proposition 1.2.3. If the derivatives D^lF(Y)[ξ_1, . . . , ξ_l], l = 1, . . . , k, are well defined and depend continuously on Y, then the multi-linear forms D^lF(Y)[ξ_1, . . . , ξ_l], l = 1, . . . , k, are invariant under any permutations of ξ_1, . . . , ξ_l.

Proof. Let us show this for k = 2; all other cases work accordingly. Applying the first-order Taylor expansion (1.16) twice yields

D(DF(Y)[ξ_2])[ξ_1]
= lim_{h_1→0} (1/h_1) lim_{h_2→0} (1/h_2) [(F(Y + h_1ξ_1 + h_2ξ_2) − F(Y + h_1ξ_1)) − (F(Y + h_2ξ_2) − F(Y))]
= lim_{h_1→0} lim_{h_2→0} ∫_0^1 ds_1 ∫_0^1 ds_2 D(DF(Y + s_1h_1ξ_1 + s_2h_2ξ_2)[ξ_2])[ξ_1].

Due to the assumptions of continuity, this repeated limit equals the joint limit lim_{h_1,h_2→0}. Hence it can be reversed and equals the repeated limit lim_{h_2→0} lim_{h_1→0}. □

The spaces C^k(M) of k times continuously differentiable functions on M are defined recursively as the subsets of functions F from C^{k−1}(M) with the derivative D^kF(Y) being uniformly bounded in Y and such that the mapping Y ↦ D^kF(Y) is continuous in the norm topologies. The space C^k(M) is equipped with the norm

‖F‖_{C^k(M)} = ‖F‖_{C^{k−1}(M)} + sup_{Y∈M} sup_{‖ξ_1‖,...,‖ξ_k‖≤1} |D^kF(Y)[ξ_1, . . . , ξ_k]|. (1.22)


Similarly, one defines the space C^k_Gat(M) ⊂ C^{k−1}_Gat(M) of k times Gateaux-differentiable functions with bounded derivatives such that D^kF(Y)[ξ_1, . . . , ξ_k] is a continuous function of the (k+1) variables, equipped with the same norm (1.22). As above, C^k(M) is a closed subspace of C^k_Gat(M), and F ∈ C^k_Gat(M) belongs to C^k(M) whenever the mappings Y ↦ D^lF(Y) are continuous as mappings between the Banach spaces B and L^l(B,R) for all l = 1, . . . , k.

Of course, these definitions depend on the norm that is used in B. For instance, the norm (1.8) is a special case of (1.22) if the norm ‖.‖_1 is chosen for R^d. More generally, if B = l_1 and M = l_1 or M = l_1^+, then

‖F‖_{C^1(M)} = ‖F‖_{C(M)} + sup_{x∈M} sup_k |∂F/∂x_k|, (1.23)

‖F‖_{C^2(M)} = ‖F‖_{C^1(M)} + sup_{x∈M} sup_{k,l} |∂²F/∂x_k∂x_l|. (1.24)

Moreover, if all ∂F/∂x_k are defined and the r.h.s. of (1.23) is bounded, then (i) F ∈ C^1(M) if the mapping x ↦ {∂F/∂x_k} is continuous as a mapping from l_1 to l_∞, and (ii) F ∈ C^1_Gat(M) if the mapping x ↦ ∂F/∂x_k is continuous for any k. From the first-order Taylor formula (1.16), it follows that for F ∈ C^1_Gat(M),

F(x) = F(0) + ∑_j x_j G_j(x), (1.25)

with continuous functions G_j. In fact, this holds with G_j(x) = ∫_0^1 (∂F/∂x_j)(sx) ds.

By applying the first-order Taylor expansion to D_ξF(Y + sξ) in (1.16) and then to the next derivative, one gets the Taylor expansion of any order. For instance, for F ∈ C^2_Gat(M) the second-order Taylor expansion reads

F(Y + ξ) − F(Y) = DF(Y)[ξ] + ∫_0^1 (1 − s) D²F(Y + sξ)[ξ, ξ] ds. (1.26)

As an insightful example, let us differentiate the determinant mapping A ↦ det A for A a square matrix in R^d. If A is invertible, then

D det A[B] = det A tr(A^{−1}B). (1.27)

To obtain this formula, one can use the identity

det A = exp{tr ln A}, (1.28)

which is valid whenever ‖A − 1‖ < 1. Recall that an analytic function f(x) = ∑_{n=0}^∞ f_n x^n/n! with radius of convergence R can be defined as a mapping on the set of square matrices by the same series expansion, as long as their norm does not exceed R. This defines exp and ln in (1.28) whenever ‖A − 1‖ < 1. Formula (1.28) is proven by reducing it to the case of diagonal matrices, where it becomes straightforward. For sufficiently small ε, one therefore has

det(A + εB) = det A det(1 + εA^{−1}B) = det A exp{tr ln(1 + εA^{−1}B)}
= det A exp{ε tr(A^{−1}B) + o(ε)} = det A (1 + ε tr(A^{−1}B) + o(ε)),

which implies (1.27).
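A quick numerical check of (1.27), assuming nothing beyond standard linear algebra routines (the matrices below are random illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)) + 4.0 * np.eye(4)   # shifted to be safely invertible
B = rng.normal(size=(4, 4))

h = 1e-7
numeric = (np.linalg.det(A + h * B) - np.linalg.det(A)) / h
formula = np.linalg.det(A) * np.trace(np.linalg.solve(A, B))  # det A * tr(A^{-1} B)

print(numeric, formula)   # agree up to discretisation error
```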

1.3 Additive and multiplicative integrals

In this section, we briefly touch upon the theory of integration of vector-valued functions. For standard proofs, we shall refer to other sources, but we will clearly indicate the main peculiarities of the infinite-dimensional case.

By the standard definition, if F is a given σ-algebra of subsets of a set Ω, and S a metric space, a mapping f : Ω → S is F-measurable if f^{−1}(M) ∈ F for any Borel set M ⊂ S. In the theory of integration, it is convenient to modify this definition: if B is a Banach space, a mapping f : Ω → B is called F-measurable if f^{−1}(M) ∈ F for any Borel set M ⊂ B and if the range of f is separable. These two definitions only coincide for separable Banach spaces B. The convenience of the second definition stems from the fact (see, e.g., Section 1 of [63] for a proof) that f : Ω → B is F-measurable in the second sense for any Banach space B if and only if there exists a sequence f_n of B-valued F-step functions (i.e., finite linear combinations of functions taking some constant value on an element of F and vanishing otherwise) such that f_n → f pointwise. This property can be used – and is indeed used – as an alternative definition. Obviously, it nicely matches the standard definition of the integral as the limit of the integrals over step functions.

Remark 6. It can also be shown (see Section 1 of [63]) that for any F-measurable function f : Ω → B there exists a sequence f_n of functions that are countable linear combinations of functions taking some constant value on an element of F and vanishing otherwise, such that f_n → f uniformly.

For a measure space (Ω, F, μ), where Ω is a set, F its σ-algebra and μ a σ-additive measure on F, a set N ⊂ Ω is called μ-negligible if N is a subset of a set of μ-measure zero. Recall that some property (of the points of Ω) is said to hold almost surely with respect to μ if it holds everywhere apart from a negligible set. For Ω a subset of R^d and F the Borel σ-algebra (the only case we are going to deal with), N is negligible if and only if for any ε > 0 there exists a countable union U of open balls such that U ⊃ N and the total volume of U does not exceed ε.

A mapping f : Ω → B is called μ-measurable if it is F-measurable apart from a negligible set. A function f : Ω → B is called Bochner-integrable with respect to μ if it is μ-measurable and ‖f‖ is integrable with respect to μ. Bochner's theorem states (see, e.g., [115] for a proof) that f is Bochner-integrable if and only if there exists a sequence f_n of B-valued F-step functions converging to f almost surely with respect to μ (i.e., outside a μ-negligible set) and such that

∫ ‖f_n(ω) − f(ω)‖ μ(dω) → 0,

as n → ∞. The Bochner integral of f over the set M ∈ F is then defined as

∫_M f(ω) μ(dω) = lim_{n→∞} ∫_M f_n(ω) μ(dω),

where the integrals of the step functions are defined in the usual way (as for real functions).

Remark 7. The extension of this integral to functions with values in general locally convex linear spaces is also well established, see, e.g., [209].

We are mostly interested in integrals of B-valued functions that are defined on the real line. For ODEs, the link between integration and differentiation is of key importance. We shall first prove two preliminary results on the unique identification of a function by its right or left derivative. Afterwards, we will discuss the link between integration and differentiation in more detail for the simpler Riemann integral.

Lemma 1.3.1. Let a continuous function h : [0, 1] → B, with B a Banach space, have a right derivative h′_+(x) everywhere on [0, 1) which is continuous up to the boundary point. Then h(x) is differentiable and h′(x) = h′_+(x), so that

h(x) − h(0) = ∫_0^x h′(y) dy = ∫_0^x h′_+(y) dy.

Proof. Let us introduce the integral ψ(x) = ∫_0^x h′_+(y) dy. Then φ(x) = ψ(x) − h(x) is continuous and its right derivative vanishes everywhere. It remains to show that such a φ must be a constant. Assuming that this is not the case, one can find two points a < b in [0, 1] such that φ(b) ≠ φ(a). Set ε = ‖φ(b) − φ(a)‖/2(b − a). Since φ′_+(a) = 0, the number

c = inf{x ≤ b : ‖φ(x) − φ(a)‖/(x − a) > ε}

is well defined and c ∈ (a, b). By continuity, we find ‖φ(c) − φ(a)‖ = ε(c − a). Since φ′_+(c) = 0, there exists d ∈ (c, b) such that ‖φ(x) − φ(c)‖ ≤ ε(x − c) for x ∈ (c, d). For these x,

‖φ(x) − φ(a)‖ ≤ ε(c − a) + ε(x − c) = ε(x − a),

which contradicts the definition of c. □

We would like to extend this result to functions that have derivatives almost everywhere. Basic examples of real analysis (for instance, the famous Cantor staircase) show that a continuous function whose derivative vanishes almost everywhere does not necessarily have to be a constant. For this to hold, a stronger continuity requirement is needed: A function f : R → B is called absolutely continuous if for any ε > 0 there exists δ > 0 such that for any finite collection of pairwise disjoint intervals (a_k, b_k) with ∑_k (b_k − a_k) < δ it follows that ∑_k ‖f(b_k) − f(a_k)‖ < ε.

Lemma 1.3.2. Let an absolutely continuous function h : [0, 1] → B, with B a Banach space, have a vanishing right derivative h′_+(x) almost everywhere on [0, 1). Then h is a constant.

Proof. Since the set where h′_+ does not exist can be enclosed into a union of intervals with arbitrarily small total length, the total oscillation of h on this union can be made arbitrarily small due to the absolute continuity of h. But then, the total oscillation on the remaining closed set must be zero by Lemma 1.3.1. Thus the total oscillation of h vanishes. □

Let us specifically mention the simple but important link between the Bochner integral and the usual integrals in the case B = L_1(X).

Lemma 1.3.3. Let f_t be a bounded measurable curve [0, T ] → L_1(X, μ), where X is a complete separable metric space and μ a finite Borel measure on X. Then f_t is Bochner-integrable, and

(∫_0^T f_t dt)(x) = ∫_0^T f_t(x) dt (1.29)

almost surely with respect to μ, with the usual real-valued integral on the r.h.s.

Proof. Equation (1.29) holds for L_1(X, μ)-valued step functions. Approximating f by such functions f^n so that

∫_0^T ‖f^n_t − f_t‖_{L_1(X,μ)} dt → 0,

as n → ∞ (which is possible by the Bochner theorem), and passing to the limit yields (1.29) for a given f_t. □

Let us now discuss Riemann integrals and their multiplicative extensions, which are crucial for the theory of linear ODEs. For that purpose, let f be a function f : [t, T ] → B with values in a Banach space B. For a partition Δ = {t = t_0 < t_1 < · · · < t_n = T } of the interval [t, T ], let us define |Δ| = max_j(t_j − t_{j−1}). Let s_j ∈ [t_{j−1}, t_j], j = 1, . . . , n, be arbitrary points. The expression

R(f, Δ, {s_j}) = ∑_{j=1}^n f(s_j)(t_j − t_{j−1})

is called the Riemann sum built on the triple (f, Δ, {s_j}). The function f is called Riemann-integrable if the limit of these sums exists, as |Δ| → 0, independently of the choices of {s_j}. This limit is called the Riemann integral of f on [t, T ]:

∫_t^T f(s) ds = lim_{|Δ|→0} R(f, Δ, {s_j}). (1.30)
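Definition (1.30) can be illustrated for an operator-valued integrand, here a curve of 2×2 matrices, where the Riemann sums can be seen to converge in the operator norm to the matrix of entrywise integrals. This is a sketch of the finite-dimensional situation only, not of the general Banach-space statement.

```python
import numpy as np

def f(s):
    # a continuous curve of 2x2 matrices on [0, 1]
    return np.array([[np.cos(s), s],
                     [s ** 2,    np.exp(-s)]])

def riemann_sum(f, t, T, n):
    # left-point Riemann sum R(f, Delta, {s_j}) for the uniform partition
    pts = np.linspace(t, T, n + 1)
    return sum(f(s) * (b - a) for s, a, b in zip(pts[:-1], pts[:-1], pts[1:]))

exact = np.array([[np.sin(1.0), 0.5],
                  [1.0 / 3.0,   1.0 - np.exp(-1.0)]])   # entrywise integrals over [0, 1]

for n in (10, 100, 1000):
    err = np.linalg.norm(riemann_sum(f, 0.0, 1.0, n) - exact, ord=2)
    print(n, err)   # errors decrease roughly like 1/n
```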

The main criterion of integrability is as follows.

Theorem 1.3.1. If f is bounded on [t, T ] and continuous on [t, T ] \ N, where N is a subset of zero Lebesgue measure (or a Lebesgue negligible set), then f is Riemann-integrable on any interval [t_1, T_1], t ≤ t_1 ≤ T_1 ≤ T, and the integral ∫_t^s f(τ) dτ is a Lipschitz-continuous function:

‖∫_t^{s_1} f(τ) dτ − ∫_t^{s_2} f(τ) dτ‖ ≤ |s_1 − s_2| sup_{s∈[t,T]} ‖f(s)‖.

Proof. A proof can be found in [117], where it is carried out even more generally for functions with values in Frechet spaces. The proof is essentially the same as for real-valued functions. □

As a direct consequence of Theorem 1.3.1 and Lemma 1.3.2, we obtain the following assertion.

Theorem 1.3.2. Let f be bounded on [t, T ] and continuous on [t, T ] \ N, where N is a subset of zero Lebesgue measure. Then the function F(s) = ∫_t^s f(τ) dτ, s ∈ [t, T ], the indefinite integral of f, is an absolutely continuous function that vanishes at t and is differentiable at all points of continuity of f, where F′(s) = f(s). Moreover, F is the unique absolutely continuous function that vanishes at t and is differentiable with F′(s) = f(s) almost surely.

Next, let A : [t, T ] → L(B,B) be a curve in the space of bounded operators L(B,B). For a partition Δ = {t = t_0 < t_1 < · · · < t_n = T } of the interval [t, T ] and a choice of points s_j ∈ [t_{j−1}, t_j], j = 1, . . . , n, let us define two types of multiplicative time-ordered Riemann approximations as

R_M(A, Δ, {s_j}) = e^{(t_n−t_{n−1})A(s_n)} e^{(t_{n−1}−t_{n−2})A(s_{n−1})} · · · e^{(t_1−t)A(s_1)}, (1.31)

R̃_M(A, Δ, {s_j}) = [1 + (t_n − t_{n−1})A(s_n)] [1 + (t_{n−1} − t_{n−2})A(s_{n−1})] · · · [1 + (t_1 − t)A(s_1)]. (1.32)

If there exists a limit of R_M(A, Δ, {s_j}), as |Δ| → 0, independent of the choices of s_j, then A is called multiplicatively time-ordered Riemann-integrable on [t, T ]. The respective limit is called the multiplicative time-ordered Riemann integral of A(s), or the T-product, or the chronological exponential, or the time-ordered exponential:

T exp{∫_t^T A(τ) dτ} = lim_{|Δ|→0} R_M(A, Δ, {s_j}). (1.33)

Remark 8. Reversing the order of the multipliers in (1.31) leads to the time-reversed multiplicative Riemann integral.

The relation between the approximations (1.31) and (1.32) is determined by the following elementary fact.

Proposition 1.3.1. If the function A is bounded, then the limit of R_M(A, Δ, {s_j}) exists if and only if the limit of R̃_M(A, Δ, {s_j}) exists, in which case the two limits coincide:

lim_{|Δ|→0} R_M(A, Δ, {s_j}) = lim_{|Δ|→0} R̃_M(A, Δ, {s_j}).

Proof. We have

R_M(A, Δ, {s_j}) − R̃_M(A, Δ, {s_j})
= [e^{(t_n−t_{n−1})A(s_n)} − (1 + (t_n − t_{n−1})A(s_n))] e^{(t_{n−1}−t_{n−2})A(s_{n−1})} · · · e^{(t_1−t)A(s_1)}
+ · · · + [1 + (t_n − t_{n−1})A(s_n)] · · · [1 + (t_2 − t_1)A(s_2)] [e^{(t_1−t)A(s_1)} − (1 + (t_1 − t)A(s_1))],

so that

‖R_M(A, Δ, {s_j}) − R̃_M(A, Δ, {s_j})‖ ≤ exp{(T − t) sup_{s∈[t,T]} ‖A(s)‖} ∑_j ‖e^{(t_j−t_{j−1})A(s_j)} − (1 + (t_j − t_{j−1})A(s_j))‖.

Each term of the last sum is of order (t_j − t_{j−1})², so that ‖R_M(A, Δ, {s_j}) − R̃_M(A, Δ, {s_j})‖ is of order ∑_j (t_j − t_{j−1})² ≤ |Δ|(T − t), which tends to 0 as |Δ| → 0. □
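As a numerical illustration of the approximations (1.31), (1.32) and of Proposition 1.3.1 (a sketch only, for a bounded matrix-valued curve A chosen arbitrarily below), both ordered products can be computed on uniform partitions and compared; their common limit is the T-product (1.33), whose role for linear ODEs is taken up in Section 2.3.

```python
import numpy as np
from scipy.linalg import expm

def A(s):
    # a bounded, continuous matrix curve A : [0, 1] -> L(R^2, R^2)
    return np.array([[0.0, 1.0 + s],
                     [-1.0, np.sin(3.0 * s)]])

def R_exp(A, t, T, n):
    # approximation (1.31): ordered product of exponentials e^{(t_j - t_{j-1}) A(s_j)}
    pts = np.linspace(t, T, n + 1)
    U = np.eye(2)
    for a, b in zip(pts[:-1], pts[1:]):
        U = expm((b - a) * A(a)) @ U       # later factors multiply from the left
    return U

def R_euler(A, t, T, n):
    # approximation (1.32): ordered product of factors 1 + (t_j - t_{j-1}) A(s_j)
    pts = np.linspace(t, T, n + 1)
    U = np.eye(2)
    for a, b in zip(pts[:-1], pts[1:]):
        U = (np.eye(2) + (b - a) * A(a)) @ U
    return U

ref = R_exp(A, 0.0, 1.0, 20000)            # very fine partition as a reference value
for n in (10, 100, 1000):
    print(n,
          np.linalg.norm(R_exp(A, 0.0, 1.0, n) - ref),
          np.linalg.norm(R_euler(A, 0.0, 1.0, n) - ref))   # both errors shrink as n grows
```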

The main result about the multiplicative integral for bounded A is the following.

Theorem 1.3.3. The function A is multiplicatively Riemann-integrable on [t, T ] if and only if it is Riemann-integrable on [t, T ]. In particular, it is multiplicatively Riemann-integrable if it is bounded on [t, T ] and continuous on [t, T ] \ N, where N is a subset of zero Lebesgue measure.

Proof. A proof can be found in [198] (Theorem 16.3), where it is carried out in even greater generality for functions with values in Banach algebras. □

The application of multiplicative integrals to basic linear ODEs will be presented in Section 2.3, where an alternative series representation for T-products is also developed.


Remark 9. Of course, the story is quite different for unbounded operators A, a case which is of great importance for applications. This case is not covered by the above results! In particular, in this case R̃_M may not be defined, so that the definition via R_M is more fundamental.

1.4 Differentials of the norms

As an example for differentiation, let us look at an important class of functions on a Banach space, the so-called convex functions φ, defined by the property

φ(αx + (1 − α)y) ≤ αφ(x) + (1 − α)φ(y) (1.34)

for any α ∈ (0, 1). It is equivalent to the requirement that the restriction of φ to any straight line is convex as a real function on the line, which can in turn be rewritten as

φ(x_2) ≤ ((x_2 − x_1)/(x_3 − x_1)) φ(x_3) + ((x_3 − x_2)/(x_3 − x_1)) φ(x_1) (1.35)

for any points x_1, x_2, x_3 such that x_2 belongs to the interval (x_1, x_3). Since (1.35) can be equivalently written as

(φ(x_2) − φ(x_1))/(x_2 − x_1) ≤ (φ(x_3) − φ(x_1))/(x_3 − x_1), (1.36)

it follows that (1.35) is equivalent to the requirement that, for x < y, the increments (φ(y) − φ(x))/(y − x) are increasing both in y and in x. In particular, for any convex function φ(x) on the real line, the right and left derivatives φ′_+(x) and φ′_−(x) always exist and, for h > 0,

(φ(x) − φ(x − h))/h ≤ φ′_−(x) ≤ φ′_+(x) ≤ (φ(x + h) − φ(x))/h. (1.37)

For any Banach space, ‖x‖ and ‖x‖² are convex functions. It therefore follows that their directional derivatives

[x, y]_+ = D_y‖x‖ = (d/dh_+)|_{h=0} ‖x + hy‖,
[x, y]_− = −D_{−y}‖x‖ = (d/dh_−)|_{h=0} ‖x + hy‖ = −(d/dh_+)|_{h=0} ‖x − hy‖ (1.38)

and

(x, y)_+ = (1/2) D_y‖x‖² = (1/2)(d/dh_+)|_{h=0} ‖x + hy‖²,
(x, y)_− = −(1/2) D_{−y}‖x‖² = (1/2)(d/dh_−)|_{h=0} ‖x + hy‖² (1.39)


are always well defined and have the following properties:

[x, y]_− ≤ [x, y]_+,  (x, y)_− ≤ (x, y)_+,  (x, y)_± = ‖x‖ [x, y]_±. (1.40)

The functions [x, y]_− and [x, y]_+ are called the normalized lower and upper semi-inner products on B. The functions (x, y)_− and (x, y)_+ are called the lower and upper semi-inner products on B. Clearly,

[0, y]_± = ±‖y‖,  (0, y)_± = 0.

Proposition 1.4.1. Let μ_t be a curve in B, t ∈ [0, T ]. Then at any point t where the right (or left) derivative μ′_{t±} of μ_t exists, the right (or left) derivative of ‖μ_t‖ and of ‖μ_t‖² also exists, with

‖μ_t‖′_± = [μ_t, μ′_{t±}]_±,  (‖μ_t‖²)′_± = 2(μ_t, μ′_{t±})_±. (1.41)

Proof. This follows from the definitions. For the first equation, e.g., at the points t where μ′_{t+} exists, we know that

‖μ_t‖′_+ = lim_{h→0+} (1/h)(‖μ_{t+h}‖ − ‖μ_t‖) = lim_{h→0+} (1/h)(‖μ_t + μ′_{t+}h + o(h)‖ − ‖μ_t‖) = [μ_t, μ′_{t+}]_+. □

As a corollary, we get the following proposition.

Proposition 1.4.2. Let μ_t = ∫_0^t ν_s ds with a bounded curve ν_t in B, t ∈ [0, T ], which is almost surely continuous. Then

‖μ_t‖ = ‖μ_0‖ + ∫_0^t [μ_s, μ′_s]_+ ds = ‖μ_0‖ + ∫_0^t [μ_s, μ′_s]_− ds = ‖μ_0‖ + (1/2) ∫_0^t ([μ_s, μ′_s]_+ + [μ_s, μ′_s]_−) ds. (1.42)

Proof. By Theorem 1.3.2, μ_t is an absolutely continuous function which is differentiable at all points of continuity of ν, i.e., almost surely. Hence, by Proposition 1.4.1, ‖μ_t‖ is absolutely continuous and almost surely has bounded right and left derivatives as given by (1.41). For an absolutely continuous real function, the set of points where the right and the left derivative differ from each other has zero measure (Lebesgue theorem) and thus does not contribute to the integral (alternatively, see Lemma 1.3.2). Hence one can use either of them, or any convex combination, in the Newton–Leibnitz integral representation. □

As an important example, let us consider the space of measures.


Proposition 1.4.3. Let B = M(X) for a complete metric space X. If μ ∈ M^+(X), then

[μ, ν]_± = ±‖ν_sing‖ + ∫ ν_abs(dx), (1.43)

where ν = ν_abs + ν_sing is the Lebesgue decomposition of ν into its absolutely continuous and singular parts with respect to μ. For a general μ ∈ M(X),

[μ, ν]_± = ±‖ν_sing‖ + ∫_{X_+} ν_abs(dx) − ∫_{X_−} ν_abs(dx), (1.44)

where X_+ (respectively X_−) is the support of the positive (respectively negative) part of μ.

Proof. Let us derive only [μ, ν]_+ for μ ∈ M^+(X), leaving all other cases as an exercise. For μ ∈ M^+(X),

[μ, ν]_+ = lim_{h→0+} (1/h)(‖μ + hν‖ − ‖μ‖) = ‖ν_sing‖ + lim_{h→0+} (1/h)(‖μ + hν_abs‖ − ‖μ‖).

Since ν_abs = g(x)μ with some g ∈ L_1(X, μ), we have

[μ, ν]_+ = ‖ν_sing‖ + lim_{h→0+} ∫ (1/h)(|1 + hg(x)| − 1) μ(dx).

The function under the integral tends to g(x) for all x (where g(x) is finite) and is bounded in absolute value by |g(x)|. Hence, by the dominated convergence theorem, the limit in the last formula equals ∫ g(x) μ(dx) = ∫ ν_abs(dx), as required. □

Exercise 1.4.1. Let K be a compact space, B = C(K), and f, g ∈ B, f ≠ 0. Then

[f, g]_+ = max{g(x) sgn(f(x)) : |f(x)| = ‖f‖},
[f, g]_− = min{g(x) sgn(f(x)) : |f(x)| = ‖f‖}. (1.45)

Exercise 1.4.2. Let B = L_1(X, μ), with a Borel measure μ on a complete metric space X. Then

[f, g]_± = ∫_{X\M_0} g(x) sgn(f(x)) μ(dx) ± ∫_{M_0} |g(x)| μ(dx), (1.46)

where M_0 = {x : f(x) = 0}. In particular, for B = l_1,

[f, g]_± = ∑_{j: f_j ≠ 0} g_j sgn f_j ± ∑_{j: f_j = 0} |g_j|. (1.47)

Finally, for B = R, [f, g]_± = g sgn(f) if f ≠ 0 and [f, g]_± = ±g if f = 0.
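Formula (1.47) can be checked against the defining one-sided derivative (1.38) on a finite truncation of l_1 (a numerical illustration only; the vectors f, g below are arbitrary):

```python
import numpy as np

def norm1(x):
    return np.sum(np.abs(x))

def bracket_plus(f, g, h=1e-6):
    # [f, g]_+ : right derivative of ||f + h g||_1 at h = 0, cf. (1.38)
    return (norm1(f + h * g) - norm1(f)) / h

def bracket_plus_formula(f, g):
    # formula (1.47) with the plus sign:
    # sum over f_j != 0 of g_j sgn f_j, plus sum over f_j = 0 of |g_j|
    nz = f != 0
    return np.sum(g[nz] * np.sign(f[nz])) + np.sum(np.abs(g[~nz]))

f = np.array([1.5, -2.0, 0.0, 0.5, 0.0])
g = np.array([0.3, 1.0, -0.7, -0.2, 0.4])

print(bracket_plus(f, g))           # close to ...
print(bracket_plus_formula(f, g))   # ... the value 0.2 given by (1.47)
```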

The following results on the differentiation of the magnitudes of curves in the spaces L_1(X) or M(X) are an important tool for proving the correctness of measure-valued nonlinear evolutions, e.g., in Sections 3.12, 3.13 and 7.6.


Proposition 1.4.4. Let X be a complete separable metric space and μ ∈ M^+(X). Let y_t be a bounded almost surely continuous curve in L_1(X, μ) and

x(t, z) = ∫_0^t y(s, z) ds,

where, according to Lemma 1.3.3, the integral can be understood as a usual integral (pointwise, for any z), or as the Bochner integral for L_1(X, μ)-valued functions, or as a Riemann integral. Then

|x(t, z)| = ∫_0^t sgn[x(s, z)] y(s, z) ds. (1.48)

Proof. Let us give two proofs, one as a corollary of Proposition 1.4.2 and an independent one.

(i) Let M_t = {z : x(t, z) = 0}. By Proposition 1.4.2 applied in B = L_1(X, μ), together with Exercise 1.4.2,

‖x(t, ·)‖_{L_1(X,μ)} = ∫_0^t ds (∫_{X\M_s} sgn[x(s, z)] y(s, z) μ(dz) + ∫_{M_s} |y(s, z)| μ(dz))
= ∫_0^t ds (∫_{X\M_s} sgn[x(s, z)] y(s, z) μ(dz) − ∫_{M_s} |y(s, z)| μ(dz)).

Hence the last term in both expressions vanishes, so that y(s, z) = 0 for almost all pairs (s, z) with x(s, z) = 0, and (1.48) follows.

(ii) First note that if y(t) is a real-valued bounded measurable function on [0, T ] and x(t) = ∫_0^t y(s) ds, then |x(t)| = ∫_0^t sgn x(s) y(s) ds, which can be seen, for instance, by approximating x(t) or y(t) with polynomials and then passing to the limit. Applying this result to y(t) = y(t, z) for each fixed z yields (1.48). □

Proposition 1.4.5. Let X be a complete separable metric space. Let ν_t be a bounded almost surely continuous curve in M(X) and μ(t) = ∫_0^t ν(s) ds (Riemann or Bochner integral). Then

|μ(t)| = ∫_0^t sgn[μ(s)] ν(s) ds, (1.49)

where sgn[μ(t)] = sgn[μ(t)](z) is the function on X which equals 1 or −1 on the positive or respectively negative part of μ(t).

Proof. (i) One proof can be carried out exactly as the first proof of Proposition 1.4.4. (ii) The second proof works for the case when ν_t has only discontinuities of the first kind and at each point coincides either with its left or right limit. In this case, all measures ν(t) are absolutely continuous with respect to the measure M = ∫_0^T |ν(t)| dt, and thus the statement is reduced to the setting of Proposition 1.4.4, as all our curves become elements of L_1(X, M). □


1.5 Smooth mappings between Banach spaces

For mappings F : M → B_2 from a closed convex subset M of a Banach space B_1 to a Banach space B_2, the previous notions and results of the case B_2 = R can be extended almost automatically. Namely, the definitions of directional (or Gateaux) derivatives (1.14) and of derivatives (1.19) (or Frechet derivatives) remain the same, only convergence and norms are understood in the sense of the corresponding Banach spaces. Propositions 1.2.1, 1.2.2 and their proofs also remain valid, the only difference being the use of Banach-space-valued integrals of functions F(s) from R to B_2 (see Theorems 1.3.1 and 1.3.2 for these integrals).

For continuous functions F (and we are only using continuous functions), such integrals are defined exactly as for real functions, namely as limits of Riemann sums, with the Leibnitz rule ∫_0^1 F′(s) ds = F(1) − F(0) following in the usual way.

Extending the notation C^k(M), we define the spaces

C^1(M, B_2) and C^2(M, B_2)

of differentiable functions F of order 1 or 2 with continuous bounded derivatives, the continuity being understood as the continuity of the mappings Y ↦ DF(Y) and Y ↦ D^2F(Y) with the norm topologies of M, L(B_1, B_2) and L_2(B_1, B_2). The norm on these spaces is defined analogously to (1.21) and (1.22), for instance

‖F‖_{C^2(M,B_2)} = sup_Y (‖F(Y)‖_{B_2} + ‖DF(Y)‖_{L(B_1,B_2)} + ‖D^2F(Y)‖_{L_2(B_1,B_2)})
= sup_Y (‖F(Y)‖_{B_2} + sup_{‖ξ‖_{B_1}≤1} ‖DF(Y)[ξ]‖_{B_2} + sup_{‖ξ‖_{B_1},‖η‖_{B_1}≤1} ‖D^2F(Y)[ξ, η]‖_{B_2}). (1.50)

Similarly, the spaces of Gateaux-differentiable functions C^1_Gat(M, B_2) and C^2_Gat(M, B_2) are defined, which contain C^1(M, B_2) and C^2(M, B_2) as closed subspaces. Again, the difference is that in the Gateaux case the derivatives are assumed to be continuous functions of all their variables, while in the case of Frechet-differentiable functions they are continuous functions from M to the corresponding spaces of bounded operators. The Taylor formulas (1.16) and (1.26) remain valid for F ∈ C^1_Gat(M, B_2) and F ∈ C^2_Gat(M, B_2), respectively.

The following chain rule is a key tool for analysis.

Proposition 1.5.1. Let Φ ∈ C^1(M, B_2) (respectively C^1_Gat(M, B_2)), with M a closed convex subset of B_1, and let F ∈ C^1(B_2, B_3) (respectively C^1_Gat(B_2, B_3)) for some Banach spaces B_1, B_2, B_3. Then the composition F ◦ Φ(Y) = F(Φ(Y)) belongs to C^1(M, B_3) (respectively C^1_Gat(M, B_3)) and

D_ξ(F ◦ Φ)(Y) = DF(Φ(Y))[D_ξΦ(Y)], (1.51)

for any Y, ξ such that Y + hξ ∈ M for some h > 0.


Proof. By the first-order Taylor expansion of the composition map, we get

F(Φ(Y + ξ)) − F(Φ(Y))
= ∫_0^1 DF(Φ(Y) + s(Φ(Y + ξ) − Φ(Y)))[Φ(Y + ξ) − Φ(Y)] ds
= ∫_0^1 DF(Φ(Y) + s(Φ(Y + ξ) − Φ(Y))) [∫_0^1 DΦ(Y + θξ)[ξ] dθ] ds
= ∫_0^1 DF(Φ(Y) + s(Φ(Y + ξ) − Φ(Y))) [DΦ(Y)[ξ] + ∫_0^1 (DΦ(Y + θξ) − DΦ(Y))[ξ] dθ] ds.

Writing hξ instead of ξ and passing to the limit h → 0 completes the proof. □

Another important result concerns partial derivatives of Gateaux or Frechet type:

Proposition 1.5.2. Let B_1, B_2, B_3 be three Banach spaces, and M_1 and M_2 convex closed subsets of B_1 and B_2 such that the linear span of each M_i is B_i. Let F be a mapping B_1 × B_2 → B_3 such that the Gateaux derivatives D_1F(b_1, b_2)[ξ_1] and D_2F(b_1, b_2)[ξ_2] with respect to the first and the second variable exist for all (b_1, b_2) ∈ M_1 × M_2 and all ξ_1, ξ_2 such that b_i + hξ_i ∈ M_i for sufficiently small h.

(i) If for any ξ_1 and ξ_2 the functions D_1F(b_1, b_2)[ξ_1] and D_2F(b_1, b_2)[ξ_2] are continuous as functions of (b_1, b_2), then the mapping F is Gateaux-differentiable on M_1 × M_2 and

DF(b_1, b_2)[ξ_1, ξ_2] = D_1F(b_1, b_2)[ξ_1] + D_2F(b_1, b_2)[ξ_2]. (1.52)

(ii) If the mappings (b_1, b_2) ↦ D_1F(b_1, b_2) and (b_1, b_2) ↦ D_2F(b_1, b_2) are continuous as mappings from M_1 × M_2 to L(B_1, B_3) and L(B_2, B_3) respectively, then the mapping F is Frechet-differentiable on M_1 × M_2 and this derivative is given by (1.52).

Proof. Let us prove only (i), since the second statement is fully analogous. By the first-order Taylor expansion,

F(b_1 + hξ_1, b_2 + hξ_2) − F(b_1, b_2)
= F(b_1 + hξ_1, b_2 + hξ_2) − F(b_1, b_2 + hξ_2) + F(b_1, b_2 + hξ_2) − F(b_1, b_2)
= ∫_0^h D_1F(b_1 + sξ_1, b_2 + hξ_2)[ξ_1] ds + ∫_0^h D_2F(b_1, b_2 + sξ_2)[ξ_2] ds,

which, after division by h and letting h → 0+, tends to D_1F(b_1, b_2)[ξ_1] + D_2F(b_1, b_2)[ξ_2] due to the assumed continuity. □


The space C_bLip(M, B_2) of Lipschitz-continuous mappings M → B_2 is defined as the subspace of C(M, B_2) with a finite norm

‖F‖_bLip = ‖F‖_{C(M,B_2)} + ‖F‖_Lip,  ‖F‖_Lip = sup_{Y≠Z} ‖F(Y) − F(Z)‖_{B_2} / ‖Y − Z‖_{B_1}. (1.53)

As follows from the first-order Taylor expansion (1.16), if F ∈ C^1_Gat(M, B_2), then F ∈ C_bLip(M, B_2) and

‖F‖_bLip = ‖F‖_{C^1(M,B_2)},  ‖F‖_Lip = sup_Y ‖DF(Y)‖. (1.54)

Similarly, one defines the space C^1_bLip(M, B_2) of functions F from C^1(M, B_2) which have a Lipschitz-continuous derivative, that is,

‖DF(Y_1) − DF(Y_2)‖ ≤ C‖Y_1 − Y_2‖, (1.55)

with a constant C. We say that F has a locally Lipschitz-continuous derivative if this holds for Y_1, Y_2 from any bounded subset of M. From the first-order Taylor expansion (1.16), it follows that

F(Y + ξ) − F(Y) = DF(Y)[ξ] + ∫_0^1 (DF(Y + sξ)[ξ] − DF(Y)[ξ]) ds. (1.56)

Consequently, if (1.55) holds, then

‖F(Y + ξ) − F(Y) − D_ξF(Y)‖_{B_2} ≤ C‖ξ‖²/2, (1.57)

which is stronger than (1.19) and often a rather handy inequality. If the derivative is only locally Lipschitz-continuous, the estimate (1.57) holds for Y, ξ from any bounded set.

Weakening the condition of Lipschitz continuity, it is handy to define the space C^1_luc(M, B_2) (respectively C^1_uc(M, B_2)) of mappings with locally uniformly continuous (respectively uniformly continuous) derivatives. It is the closed subspace of C^1(M, B_2) of functions F such that the mapping Y ↦ DF(Y) is uniformly continuous on bounded subsets of M (respectively on the whole of M). If F ∈ C^1_luc(M, B_2) (respectively F ∈ C^1_uc(M, B_2)), then with the help of (1.56) again, (1.19) improves to

‖F(Y + ξ) − F(Y) − DF(Y)[ξ]‖_{B_2} ≤ ε‖ξ‖ for ‖ξ‖ ≤ δ, (1.58)

uniformly for Y and Y + ξ from any bounded set (respectively for any Y, Y + ξ from M).

Similarly, the space C^2_luc(M, B_2) (respectively C^2_uc(M, B_2)) of mappings with locally uniformly continuous (respectively uniformly continuous) second derivatives is the closed subspace of C^2(M, B_2) of functions F such that the mapping Y ↦ D^2F(Y) is uniformly continuous on bounded subsets of M (respectively on the whole of M).

By introducing these spaces, the peculiarity of the infinite-dimensional case becomes obvious once again, since for finite-dimensional B_1 and B_2 the spaces C^1_luc(M, B_2) and C^1(M, B_2) coincide (though C^1_uc(M, B_2) may still be different). The same applies to the spaces C^2_luc(M, B_2) and C^2(M, B_2).

The following example shows that derivatives in Banach spaces may look quite different from the usual derivatives.

Exercise 1.5.1. (i) If A is an invertible element of L(B,B), then the mapping F : A ↦ A^{−1} has the following differential at A:

DF(A)[ξ] = −A^{−1}ξA^{−1}. (1.59)

Hint: (A + hξ)^{−1} = (1 + hA^{−1}ξ)^{−1}A^{−1} = (1 − hA^{−1}ξ + o(h))A^{−1}.

(ii) If C(t) is a smooth curve in L(B,B), then

(d/dt)(1 + C(t))^{−1} = −(1 + C(t))^{−1} C′(t) (1 + C(t))^{−1}. (1.60)
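Formula (1.59) admits the same kind of numerical verification as (1.27) above (an illustrative sketch with arbitrarily chosen matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3)) + 3.0 * np.eye(3)    # shifted to be safely invertible
xi = rng.normal(size=(3, 3))

h = 1e-7
numeric = (np.linalg.inv(A + h * xi) - np.linalg.inv(A)) / h
formula = -np.linalg.inv(A) @ xi @ np.linalg.inv(A)     # right-hand side of (1.59)

print(np.max(np.abs(numeric - formula)))   # small, of order h
```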

1.6 Locally convex spaces and Frechet spaces

In this book, we work mostly with Banach spaces. However, some very useful spaces – in particular the spaces of generalized functions or distributions, or some classes of smooth functions, notably the Schwartz space – do not fit into the Banach setting. Since most of our methods for dealing with ODEs can be naturally extended to the general setting of locally convex spaces, we provide here the necessary background on these spaces in a concise form. More extensive expositions can be found, e.g., in [115, 212, 235].

Remark 10. Readers who do not want to deal with this level of generality (which is more ‘topological’ in nature than the rest of the book) can skip this section and always think of Banach spaces whenever we mention results or definitions for general locally convex spaces, in particular Frechet spaces. One only has to keep in mind that by the convergence of sequences in function spaces that are equipped with a countable set of norms or seminorms (see the key examples (1.70), (1.72), (1.73) below), one means their convergence in each of these seminorms. Although the general point of view definitely enhances the understanding of generalized functions, it is possible to grasp the basic properties of these objects without recourse to abstract spaces.

A set M in a linear vector space V is called absorbing or absorbent if for any x ∈ V there exists t ∈ R_+ such that tx ∈ M, or in other words, if V = ∪_{t∈R_+} tM. Clearly, 0 ∈ M for such a set M.


The key notion for the theory is the following link between geometric and analytic representations of convex sets. For any set M ⊂ V, its Minkowski functional p_M : V → R_+ is defined by the formula

p_M(x) = inf{t > 0 : x ∈ tM}. (1.61)

The requirement that M is absorbing is then equivalent to the requirement that p_M(x) has a finite non-negative value for any x ∈ V.

For any convex absorbing set M, p_M has the following properties: (i) if x ∈ M, then p_M(x) ≤ 1; (ii) if p_M(x) < 1, then x ∈ M; (iii) p_M(tx) = t p_M(x) for any t ≥ 0 and x ∈ V; (iv) p_M(x + y) ≤ p_M(x) + p_M(y) for any x, y ∈ V.

Exercise 1.6.1. Prove these properties of pM .

Conversely, any mapping p : V → R_+ satisfying the above conditions (i)–(iv) is a Minkowski functional for some convex absorbing set. In fact, p(x) = p_M(x) for any set M such that

{x : p(x) < 1} ⊂ M ⊂ {x : p(x) ≤ 1}.

Exercise 1.6.2. Prove this assertion.
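Definition (1.61) is easy to turn into a computation once a membership oracle for a convex absorbing set M is available. The following sketch (with a hypothetical example set, an ellipse in R², for which the Minkowski functional is known in closed form) recovers p_M(x) by bisection in t:

```python
import numpy as np

def minkowski(x, in_M, t_hi=1e6, iters=80):
    """p_M(x) = inf{t > 0 : x in t M}, computed by bisection on t,
    assuming M is convex, absorbing and contains a neighbourhood of zero."""
    t_lo = 0.0
    for _ in range(iters):
        t = 0.5 * (t_lo + t_hi)
        if in_M(x / t):          # x in t*M  <=>  x/t in M
            t_hi = t
        else:
            t_lo = t
    return t_hi

# Example: M = {(x, y) : x^2/4 + y^2 <= 1}; then p_M(x, y) = sqrt(x^2/4 + y^2).
in_ellipse = lambda z: z[0] ** 2 / 4.0 + z[1] ** 2 <= 1.0

z = np.array([3.0, 1.5])
print(minkowski(z, in_ellipse))                  # approx 2.1213 ...
print(np.sqrt(z[0] ** 2 / 4.0 + z[1] ** 2))      # ... = sqrt(4.5), the exact value
```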

Recall that a semi-norm on V is a functional p : V → R_+ such that p(αx) = |α|p(x) for any α ∈ R and p(x + y) ≤ p(x) + p(y). A subset M ⊂ V is called symmetric, balanced or centered if x ∈ M implies −x ∈ M.

Remark 11. For complex spaces, the above notion of a subset M being balanced requires that x ∈ M implies λx ∈ M for any complex λ of unit magnitude.

As a direct consequence of the properties of Minkowski functionals, it follows that if M is a convex absorbing and symmetric set, then p_M is a semi-norm. Also vice versa, for any semi-norm p there exists a convex absorbing and symmetric set M such that p = p_M, for instance M = {x : p(x) < 1}.

A linear topological space is a linear vector space equipped with a topology which renders the operations of addition and scalar multiplication continuous. Notice that any neighbourhood U of zero in a linear topological space V is absorbing. In fact, the mapping t ↦ tx is continuous for any x and takes 0 ∈ R to 0 ∈ V. Hence tx ∈ U for sufficiently small t.

By a base of open neighbourhoods of a point x in a topological space, one means a family of neighbourhoods U_α of x with the property that for any neighbourhood U of x there exists α such that U_α ⊂ U. A linear topological space is called locally convex if zero has a base of open neighbourhoods that consists of convex centered (and hence absorbing) sets M_α, with α from some set of indices. Notice that the Minkowski functional p_α of M_α is continuous for any M_α. In fact, if x − y ∈ εM_α, then |p_α(x) − p_α(y)| ≤ p_α(x − y) ≤ ε. Hence the open neighbourhoods of zero M_α can be described as the sets {x : p_α(x) < 1} for continuous semi-norms p_α. Therefore, one can conclude that a linear topological space is locally convex if and only if its topology can be defined by a family of semi-norms, that is, there exists a family of semi-norms p_α, with α from some set of indices, such that the sets

N_{α_1,...,α_k; ε_1,...,ε_k} = {x : p_{α_j}(x) < ε_j, j = 1, . . . , k} (1.62)

define a base of open neighbourhoods of zero. Hence, a base of neighbourhoods of any point x is given by the sets x + N_{α_1,...,α_k; ε_1,...,ε_k}. A sequence x_n ∈ V converges to x in this topology if and only if it converges in any p_α, that is, for any ε > 0 and α there exists N such that p_α(x_n − x) < ε for all n > N.

A topology on a set V is said to be separated or Hausdorff if any two points x ≠ y from V have non-intersecting neighbourhoods. Clearly, a topological linear space with the topology given by semi-norms p_α is Hausdorff if and only if the family p_α is separating, that is, if p_α(x) = 0 for all α implies x = 0.

Exercise 1.6.3. Let V be a linear vector space and p_α a separating family of semi-norms on V. Show that V becomes a locally convex topological linear space if it is equipped with the topology that is generated by the base of open sets (1.62) and all its shifts. A proof can be found, e.g., in [212].

Proposition 1.6.1. A linear mapping L : V_1 → V_2 between two locally convex spaces with the topologies generated by the semi-norms {p^1_α} and {p^2_β} respectively is continuous if and only if for any β there exist a constant C and a finite number of semi-norms {p^1_{α_1}, . . . , p^1_{α_k}} such that

p^2_β(Lx) ≤ C(p^1_{α_1}(x) + · · · + p^1_{α_k}(x)). (1.63)

Proof. The ‘if’ part follows from the definition of continuity: the pre-image of an open set (in our case {x : p^2_β(Lx) < ε}) must be open. For proving the ‘only if’ part, notice that, if L is continuous, then for any β there exist a constant ε and a finite number of semi-norms {p^1_{α_1}, . . . , p^1_{α_k}} such that

p^1_{α_1}(x) < ε, . . . , p^1_{α_k}(x) < ε ⇒ p^2_β(Lx) < 1. (1.64)

If (1.63) does not hold, we can choose a sequence C_n → ∞, as n → ∞, and a sequence of vectors x_n ∈ V_1 such that

p^2_β(Lx_n) > C_n(p^1_{α_1}(x_n) + · · · + p^1_{α_k}(x_n)). (1.65)

However, by linearity, if (1.65) holds for x_n, it holds for all kx_n with k > 0. We can therefore choose x_n such that

ε/2 < p^1_{α_1}(x_n) + · · · + p^1_{α_k}(x_n) < ε.

Then (1.65) will contradict (1.64) for large enough C_n. □

This proposition implies that two families of semi-norms p_α and d_β on a linear space V are equivalent – that is, they define the same topology on V – if and only if for any β and α there exist a constant C and finite collections of semi-norms {p_{α_1}, . . . , p_{α_k}} and {d_{β_1}, . . . , d_{β_m}} such that

d_β(x) ≤ C(p_{α_1}(x) + · · · + p_{α_k}(x)),  p_α(x) ≤ C(d_{β_1}(x) + · · · + d_{β_m}(x)). (1.66)

A family of semi-norms p_α on V is called directed if for any α_1, α_2 there exist α_3 and a constant C such that p_{α_1}(x) + p_{α_2}(x) ≤ C p_{α_3}(x). In any locally convex space V with the topology defined by the semi-norms p_α, there always exists an equivalent directed family of semi-norms. In fact, it is sufficient to choose all finite sums of the semi-norms of the initial family as a new family. As follows from Proposition 1.6.1, two directed families of semi-norms p_α and d_β on a linear space V are equivalent if and only if for any β and α there exist a constant C and semi-norms p_{α_1} and d_{β_1} such that

d_β(x) ≤ C p_{α_1}(x),  p_α(x) ≤ C d_{β_1}(x). (1.67)

One often has to deal with families of linear mappings. A family of linear mappings L_β : V → W between two topological linear spaces V, W is called equicontinuous if for any neighbourhood N of zero in W there exists a neighbourhood U of zero in V such that L_β U ⊂ N for all β. Notice that for a single mapping L this turns into the definition of continuity. The following proposition is a direct extension of Proposition 1.6.1 to families of mappings.

Proposition 1.6.2. A family of linear mappings L_ν : V_1 → V_2 between two locally convex spaces with the topologies generated by the semi-norms p^1_α and p^2_β respectively is equicontinuous if and only if for any β there exist a constant C and a finite number of semi-norms {p^1_{α_1}, . . . , p^1_{α_k}} such that

p^2_β(L_ν x) ≤ C(p^1_{α_1}(x) + · · · + p^1_{α_k}(x)) (1.68)

for all x ∈ V_1 and all ν.

The structure of a topological linear space makes it possible to define analogues of Cauchy sequences, which are essential for the study of metric spaces. A sequence x_n in a topological linear space is called a Cauchy sequence if for any neighbourhood U of zero there exists N such that x_n − x_m ∈ U for all m, n > N. In locally convex spaces with the topology defined by the semi-norms p_α, the property of being Cauchy is equivalent to the requirement that x_n is a Cauchy sequence in each semi-norm p_α, that is, for any ε > 0 and any α there exists N such that p_α(x_n − x_m) < ε for all m, n > N. A topological linear space V is called sequentially complete if any Cauchy sequence in V converges.

A topological space is called metricizable whenever its topology can be defined in terms of a certain metric.

In general (non-metricizable) topological spaces, the convergence of sequences does not fully specify the topology. Instead, one must use converging directed sets (also called nets) x_μ that are indexed by a partially ordered set of indices μ such that for any μ_1, μ_2 there exists μ such that μ > μ_1 and μ > μ_2. A directed set x_μ is called a Cauchy directed set if for any neighbourhood U of zero there exists μ such that x_{μ_1} − x_{μ_2} ∈ U for all μ_1, μ_2 > μ. A topological linear space V is called complete if any Cauchy directed set in V converges. This completeness implies sequential completeness, and the two notions are equivalent for metricizable spaces.

Proposition 1.6.3. A locally convex Hausdorff space V is metricizable if and only if V has a countable base of neighbourhoods of zero.

Proof. Any metric space has a countable base of neighbourhoods of any point, given by the balls centered at this point with rational radii. Therefore, the topology can be specified by a countable set of semi-norms.

On the other hand, if there exists a countable base of neighbourhoods of zero, the topology of V can be generated by a countable set p_n of semi-norms. Then the formula

d(x, y) = ∑_n 2^{−n} p_n(x − y) / (1 + p_n(x − y)) (1.69)

specifies a metric on V that defines the same topology as the topology given by the family of semi-norms p_n. □

Exercise 1.6.4. Check the last assertion. Also check that V is complete with respect to the family of semi-norms p_n if and only if it is complete as a metric space with the metric (1.69).

A locally convex Hausdorff (topological linear) space V is called a Frechet space if it is complete and its topology can be specified by a countable set of semi-norms. (Thus, V is metricizable.)

The key examples of Frechet spaces include various classes of smooth functions on R^d. For instance, the Schwartz space S(R^d) is the space of infinitely differentiable functions on R^d that decrease at infinity faster than any power, together with all their derivatives. The topology in this space is defined by the countable set of norms (with p, q non-negative integers)

‖f‖_{p,q} = ∑_{k_1+···+k_d ≤ p, m_1+···+m_d ≤ q} sup_x ∏_{j=1}^d |x_j|^{k_j} |∂^{m_1+···+m_d} f / (∂x_1^{m_1} · · · ∂x_d^{m_d})|, (1.70)

or equivalently, by their integral versions

‖f‖_{p,q,2} = ∑_{k_1+···+k_d ≤ p, m_1+···+m_d ≤ q} (∫ |∏_{j=1}^d x_j^{k_j} ∂^{m_1+···+m_d} f / (∂x_1^{m_1} · · · ∂x_d^{m_d})|² dx)^{1/2}. (1.71)

Exercise 1.6.5. (i) Check that the families of norms (1.70) and (1.71) are equivalent. (ii) Check that S is a Frechet space, that is, it is complete.
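As a concrete, low-dimensional illustration of the seminorms (1.70) (a sketch only: d = 1, derivatives up to order two, and the supremum approximated over a finite grid), one can tabulate ‖f‖_{p,q} for the Gaussian f(x) = e^{−x²}, for which the required derivatives are available in closed form; all these seminorms are finite, as membership in S(R) requires.

```python
import numpy as np

# f and its first two derivatives for f(x) = exp(-x^2)
f  = lambda x: np.exp(-x ** 2)
f1 = lambda x: -2.0 * x * np.exp(-x ** 2)
f2 = lambda x: (4.0 * x ** 2 - 2.0) * np.exp(-x ** 2)
derivs = [f, f1, f2]

x = np.linspace(-20.0, 20.0, 400001)   # wide grid; the functions decay fast

def seminorm(p, q):
    # one-dimensional version of (1.70):
    # sum over k <= p, m <= q of sup_x |x|^k |f^{(m)}(x)|
    return sum(np.max(np.abs(x) ** k * np.abs(derivs[m](x)))
               for k in range(p + 1) for m in range(q + 1))

for p in range(3):
    for q in range(3):
        print(p, q, seminorm(p, q))   # all finite, growing with p and q
```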


For a compact set K in R^d, the space C^∞_0(K) is the space of infinitely differentiable functions on R^d with support in K, i.e., they vanish outside K together with all their derivatives. The topology in this space is defined by the following countable set of norms:

‖f‖_q = ∑_{m_1+···+m_d = q} sup_{x∈K} |∂^q f / (∂x_1^{m_1} · · · ∂x_d^{m_d})|. (1.72)

Let Ω be an open subset of R^d and Ω_1 ⊂ Ω_2 ⊂ · · · an increasing sequence of open subsets such that the closure Ω̄_n of each Ω_n is a compact subset of Ω_{n+1} and Ω = ∪_n Ω_n. Let C^∞(Ω) be the space of infinitely differentiable functions on Ω. The topology in this space is defined by the following countable family of seminorms:

‖f‖_{q,n} = ∑_{m_1+···+m_d = q} sup_{x∈Ω_n} |∂^q f / (∂x_1^{m_1} · · · ∂x_d^{m_d})|. (1.73)

Exercise 1.6.6. Check that C^∞_0(K) and C^∞(Ω) are Frechet spaces.

The space C^∞_0(Ω) = ∪_n C^∞_0(Ω̄_n) of infinitely differentiable functions on Ω with compact support is dense in C^∞(Ω). In order to define its natural topology, we have to go beyond the class of Frechet spaces (using inductive limits, see below).

A subset M of a linear topological space V is called bounded if it is absorbed by any neighbourhood of zero, that is, for any neighbourhood of zero U there exists λ > 0 such that λM ⊂ U. If V is locally convex, with the topology generated by the semi-norms p_α, then M is bounded if and only if sup{p_α(x) : x ∈ M} < ∞ for any α. The following elementary properties of bounded sets are worth being noted.

Proposition 1.6.4.

(i) Any compact set is bounded.

(ii) The Minkowski functional p_M of a bounded balanced convex absorbing set M in a separated (or Hausdorff) locally convex space is a norm.

Proof. (i) For any seminorm p and any point x in a compact set M, there exists λ_x > 0 such that x ∈ {y : p(y) < λ_x}. Choosing a finite subcover of this open cover of M shows that sup{p(y) : y ∈ M} < ∞.

(ii) Since p_M is a seminorm, one only has to check that x ≠ 0 implies p_M(x) ≠ 0. Due to the separation, for any x ≠ 0 there exists a neighbourhood U of zero such that x ∉ U. Since M is bounded, there exists λ > 0 such that λM ⊂ U, and consequently p_M(x) ≠ 0. □

Remark 12. The norm p_M(x) from Proposition 1.6.4 is generally not continuous and defines a stronger topology on the subspace of V generated by M than the topology that is induced from V.


Bounded sets are crucial ingredients for defining various useful topologies on the space of continuous linear operators L(V_1, V_2) between two locally convex spaces V_1, V_2 with the topologies generated by the semi-norms p^1_α and p^2_β respectively. The most important operator topologies are the topology of pointwise convergence, defined by the family of seminorms p^2_β(Lx) for L ∈ L(V_1, V_2), with all possible β and all points x ∈ V_1, and the topology of bounded convergence, defined by the family of seminorms sup{p^2_β(Lx) : x ∈ M}, with all possible β and all bounded subsets M ⊂ V_1. If V_1, V_2 are Banach spaces, the topology of pointwise convergence coincides with the strong operator topology (as defined in Section 1.1), and the topology of bounded convergence is generated by the norm (1.1). It is sometimes referred to as the uniform topology.

Remark 13. An intermediate topology between the topologies of pointwise convergence and bounded convergence is the topology of compact convergence, which is defined by the family of seminorms sup{p^2_β(Lx) : x ∈ M}, with all possible β and all compact subsets M ⊂ V_1.

Important cases are the topologies on the dual space to V, denoted by V′ or V∗. The dual space is defined as the space of continuous linear functionals V → R, that is, V′ = L(V,R). (Of course, for complex spaces one uses C instead of R in this definition.) The topology of pointwise convergence in L(V,R) is called the ∗-weak topology on V′, and the topology of bounded convergence is called the strong topology on V′.

Remark 14. Notice the mismatch in the standard nomenclature: for a Banach space V, the ‘strong topology’ on L(V,R) is the ‘∗-weak topology’ on V′ = L(V,R).

A mapping between two locally convex spaces is called bounded if it takes bounded subsets to bounded subsets.

Remark 15. According to this definition, the linear mapping f : x ↦ x in R is bounded, although the function f(x) = x is surely not bounded in the usual sense of calculus.

In Banach spaces, bounded sets are sets that are bounded in norm. Therefore, a linear mapping between Banach spaces is continuous if and only if it is bounded. In general locally convex spaces, continuity of a linear map L implies that it is bounded, which follows directly from the definitions. But the converse implication does not hold in general, which gives rise to the following definition. A locally convex space V is called bornological if any bounded linear mapping L : V → W, with W being any locally convex space, is continuous.

Proposition 1.6.5. Any metricizable locally convex space V, i.e., a locally convex space with a countable base of neighbourhoods of zero, is bornological.

Proof. If V is metricizable, then there exists a countable base of open neighbourhoods {U_n} such that U_{n+1} ⊂ U_n for all n. Let us assume that a linear mapping L : V → W is not continuous. Then there exists a neighbourhood N of zero in W such that L^{−1}(N) is not a neighbourhood of zero, and hence tL^{−1}(N) is not a neighbourhood either, for any t. We can therefore choose a sequence x_n ∈ U_n such that x_n ∉ nL^{−1}(N). Thus {x_n} is a bounded set (because it converges to zero), but {L(x_n)} is not (since it cannot be absorbed by N). Therefore, L is not bounded. □

A balanced closed convex absorbing set in a locally convex space V is calleda barrel. A locally convex space V is called barrelled if any barrel in V is a neigh-bourhood of zero.

Proposition 1.6.6. Any Frechet space is barrelled.

Proof. This is a direct consequence of the fundamental Baire theorem, whichstates that a complete metric space cannot be represented as a countable unionof nowhere-dense sets. An alternative direct proof can be found in [235]. �

The importance of a space being barrelled lies primarily in the followingprinciple of uniform boundedness, also called Banach–Steinhaus theorem whenapplied to Banach spaces.

Theorem 1.6.1. Let V be a barrelled space and W any locally convex space. Let Lβ

be a family of continuous linear mappings V → W such that the sets {Lβv} arebounded in W for any v. Then the family Lβ is equicontinuous.

Proof. For a closed balanced convex neighbourhood N of zero in W , each setL−1β (N) is closed, because Lβ is continuous. Hence the set M = ∩βL

−1β (N) is

closed, convex and balanced. By the last assumption on Lβ, M is also absorbing.Hence it is a barrel and therefore a neighbourhood of zero in V . The equicontinuityfollows from Lβ(M) ⊂ N . �

A fundamental tool for constructing new classes of spaces is based on theidea of the inductive limit. Let X0 ⊂ X1 ⊂ X2 ⊂ · · · be an increasing sequence oftopological (e.g., metric) spaces such that for any n the topology of Xn coincideswith the topology that is induced on it by Xn+1. In other words, the open setsin Xn are sets of the form Xn ∩ U with U open subsets of Xn+1. This impliesthat the inclusions Xn → Xn+1 are continuous. The topology of the inductivelimit on X = ∪nXn is the topology whose open sets U are subsets of X suchthat U ∩Xn is open in Xn for any n. This implies that a mapping X → Y withany topological space Y is continuous if and only if its restriction to any of Xn

is continuous. In fact, this property can be taken as an alternative definition ofthe inductive topology. By yet another equivalent characterization, the topologyof the inductive limit is the strongest one on X for which all inclusions Xn → Xare continuous.

Exercise 1.6.7. Let V ⊂ W be two locally convex spaces such that the topology ofV is induced by W . Show that for any balanced convex neighbourhood V0 of zero

Page 50: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

34 Chapter 1. Analysis on Measures and Functional Spaces

in V , there exists a balanced convex neighbourhood W0 of zero in W such thatV0 = W0 ∩ V .

Applying this general notion to linear spaces, we can define the inductivelimit of an increasing sequence of locally convex spaces V0 ⊂ V1 ⊂ · · · as the spaceV = ∪nVn equipped with the inductive topology. This topology is also locallyconvex and separated if all Vj are separated.

Exercise 1.6.8. Check the last claim.

Proposition 1.6.7.

(i) The inductive limit of barrelled spaces is barrelled.

(ii) The inductive limit of bornological spaces is bornological.

Proof. (i) If U is a barrel in V , then U ∩ Vn is a barrel in Vn and hence a neigh-bourhood there. Therefore, U is a neighbourhood in V by the definition of theinductive topology.

(ii) If L : V → W is bounded, then the restriction of L on Vn is boundedand hence continuous. Therefore, L : V → W is continuous by the definition ofthe inductive topology. �

Let us quote the following result that stresses the role of the inductive limitin the theory of locally convex spaces. Note, however, that we shall not use thisfact. (Its proof is not very difficult and can be found, e.g., in [235].)

Proposition 1.6.8.

(i) A Hausdorff locally convex space is bornological if and only if it is the induc-tive limit of normed spaces.

(ii) A complete Hausdorff bornological space is the inductive limit of Banachspaces.

A prime example for the inductive limit is the space D(Ω) = C∞c (Ω) of in-

finitely differentiable functions on an open subset Ω inRd with a compact support.It was represented earlier as C∞

0 (Ω) = ∪nC∞0 (Ωn), where Ω1 ⊂ Ω2 ⊂ · · · is an

increasing sequence of open subsets of Ω such that the closure Ωn of each Ωn isa compact subset of Ωn+1 and Ω = ∪nΩn. The natural topology on D(Ω) is thetopology of the inductive limit of the Frechet spaces C∞

0 (Ωn). Equipped with thistopology, it is usually referred to as the space of test functions on Ω.

The dual spaces D′(Ω) to D(Ω) and S′(Rn) to the Schwartz space S(Rd) arecalled the spaces of generalized functions (or distributions) and tempered general-ized functions (or distributions) respectively. They play an important role in theanalysis of partial differential (and pseudo-differential) equations in Rd, and willbe discussed in more detail in the next sections (essentially independent from thegeneral theory). At this stage, let us only note that the operation of differentia-tion extends to generalized functions by the usual duality, that is, for any ξ ∈ D′

and φ ∈ D one defines (ξ′, φ) = −(ξ, φ′) (motivated by the integration by parts

Page 51: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.6. Locally convex spaces and Frechet spaces 35

formula). According to this definition, any generalized function is infinitely differ-entiable. This allows for an extension of the calculus beyond its usual boundaries.For instance, each locally integrable function g(x) on Rn can be considered anelement of D′ acting on D as φ �→ ∫

g(x)φ(x) dx, and thus we have found a wayto differentiate such functions.

The following result describes the convergence of sequences (not necessarilydirected sets) in the inductive limit.

Proposition 1.6.9. Let X = ∪Xn be the inductive limit of the sequence of locallyconvex spaces Xn such that each Xn is a proper closed subspace of Xn+1. A se-quence {xm} converges in X if and only if all xm belong to some Xn and {xm}converges in Xn.

Remark 16. The proof is not very difficult. It is based on the Hahn–Banach theo-rem and can be found, e.g., in [231]. This general result is only quoted for the sakeof completeness. We only need its application to the space of test functions D(Ω).But to work in this space, one can just define sequential convergence there withthe help of this rule, instead of deriving it from the general construction of theinductive limit. (Note that this is the most common way to develop the theory,which is often found in the literature.)

Remark 17. From Proposition 1.6.9, it follows that the inductive limit X is notmetricizable. In particular, the space D(Ω) of test functions is not metricizable. Infact, assuming the existence of a countable base of open neighbourhoods Un of theorigin in X , we can choose a sequence xn ∈ Un such that xn /∈ Xn. Then {xn} hasto converge to 0 by the definition of the base, but it does not so by Proposition1.6.9.

To complete this lengthy section, let us describe the smoothness in locallyconvex spaces.

For a mapping F : V → W between two locally convex spaces V,W , thedirectional derivative at a point Y in the direction ξ ∈ V is defined as it is forBanach spaces:

DξF (Y ) = DF (Y )[ξ] = limh→0+

1

h(F (Y + hξ)− F (Y )). (1.74)

If this derivative exists for some ξ and all Y from some convex subset M ofV , and if it is continuous in Y (for this ξ), then the homogeneity property (1.15)holds for Y ∈ M , as well as the first-order Taylor expansion (1.16):

F (Y + ξ)− F (Y ) =

∫ 1

0

d

dsF (Y + sξ) ds =

∫ 1

0

DξF (Y + sξ) ds, (1.75)

The justification is the same as for the Banach case.

If the mapping DξF (Y ) is defined for all Y in M and all ξ, and if it is linearand continuous in ξ, then the linear operatorDξF (Y ) is usually called the Gateauxderivative. If this is the case for all Y , then F is called Gateaux-differentiable onM .

Page 52: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

36 Chapter 1. Analysis on Measures and Functional Spaces

The analogue to Proposition 1.2.1 (see also Remark 5) reads as follows:

Proposition 1.6.10.

(i) If DξF (Y ) exists for all ξ and all Y from M , and if it is continuous inY for any ξ, then DξF (Y ) is linear in ξ. In particular, if DξF (Y ) is alsocontinuous in ξ (the continuity at the point ξ = 0 is of course sufficient forthe continuity of a linear map), then DξF (Y ) is Gateaux-differentiable.

(ii) If additionally the set {DξF (Y ) : Y ∈ M} is bounded in W for any ξ and Vis a barrelled space, then the family of linear mappings DξF (Y ) (parametrizedby Y ∈ M) is equicontinuous, and therefore the mapping (ξ, Y ) �→ DξF (Y )is continuous.

Proof. (i) This repeats exactly the proof of Proposition 1.2.1. (ii) This followsfrom the principle of uniform boundedness (Theorem 1.6.1). �

One can imagine several reasonable extensions of the notion of Frechet deriva-tives that turn into the usual derivative for Banach spaces. The most naturalextensions are as follows.

One says that a function F on M is Frechet-differentiable at Y (respectivelystrongly Frechet-differentiable) if there exists an element DF (Y ) ∈ B∗, called thederivative of F at Y , such that

limh→0

F (Y + hξ)− F (Y )− hDF (Y )[ξ]

h= 0 (1.76)

uniformly for ξ from any bounded subset of V (respectively from any neighbour-hood of zero). In Banach spaces, neighbourhoods are bounded sets, and thus bothnotions yield the usual Frechet derivative in the case of Banach spaces.

Varying the class of subsets for which uniform convergence in (1.76) is re-quired leads to other notions of differentiability. For instance, the Gateaux deriva-tive corresponds to the choice of singletons, so that Frechet-differentiability im-plies the Gateaux-differentiability, as for Banach spaces. The so-called Hadamardderivative corresponds to the choice of compact sets, which makes it an interme-diate notion between the Gateaux and the Frechet derivative.

Difficulties with all these derivatives in a general locally convex contextmostly arise when it comes to the chain rule and thus a systematic developmentof the usual rules of calculus. This has to do with the fact that the mapping ofthe composition of continuous linear operators L(V,B) × L(B,W ) → L(V,W )(taken in the topology of bounded convergence) turns out to be continuous (as afunction of two variables) for Banach intermediate spaces B only (see, e.g., [264]for a proof). In concrete situations, however, problems with the chain rules canusually be overcome by introducing some sort of equicontinuity assumptions onthe derivatives.

Page 53: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.7. Linear operators in spaces of measures and functions 37

1.7 Linear operators in spaces of measuresand functions

We shall now discuss the basic representations of linear operators on the spacesof measures in terms of transition kernels, and on the main functional spacesas pseudo-differential operators (ΨDO). In this context, we define the Fouriertransform in order to clarify the related notation and highlight its basic properties.Their proofs are omitted, but can be found in numerous books such as [232].

A transition kernel (or just kernel) ν(x,A) or ν(x, dy) from a topologicalspace X to a topological space Y is a function of two variables such that, for eachx ∈ X , ν(x, .) ∈ M+(Y ), and for each Borel set A, ν(., A) is a Borel-measurablefunction on X . A signed transition kernel is defined in the same way, but withν(x, .) ∈ M(Y ). If X = Y , then ν is referred to as the kernel in X . Any signedtransition kernel defines a linear operator from bounded measurable functions onY to measurable functions on X , where ν plays the role of the integral kernel viathe formula

Tνf(x) =

∫f(y)ν(x, dy). (1.77)

If all ν are positive measures, then this Tν is positivity preserving: it takes non-negative functions to non-negative functions.

A kernel ν is called bounded if supx ‖ν(x, .)‖ < ∞. In particular, ν is a prob-ability kernel or stochastic kernel if all measures ν(x, .) are probability measures.

One says that a signed transition kernel ν is weakly continuous if Tνf(x) isa continuous function for any f ∈ C(Y ). For a weakly continuous bounded signedkernel ν, the operator Tν is a bounded linear operator C(Y ) → C(X). For X alocally compact space, the dual operator T ′

ν acts in the space of measures M(X)by the formula

(T ′νμ)(dy) =

∫μ(dx)ν(x, dy). (1.78)

If X = Rd and the kernel ν(x, dy) has a dual kernel ν′ such that ν(x, dy)dx =ν′(y, dx)dy, then the dual operator can be reduced to the action on C(Rd) via theformula

(T ′νg)(y) =

∫g(x)ν′(y, dx). (1.79)

The famous Riesz–Markov theorem states that if Y is a locally compact met-ric space, then any bounded linear functional on C∞(Y ) is given by integrationwith respect to a Borel measure. Consequently, in this case any bounded linearoperator T : C∞(Y ) → C(X) is given by (1.77) with some bounded weakly con-tinuous signed transition kernel ν. These operators are contractions if and onlyif ‖ν(x, .)‖ ≤ 1 for any x. Of course, this holds for probability kernels ν. There-fore, bounded linear operators C∞(Y ) → C(X) can be naturally lifted up into thebounded linear operators C(Y ) → C(X), given by the same formula (1.77). Thislifting can be described in an alternative way without a reference to the struc-

Page 54: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

38 Chapter 1. Analysis on Measures and Functional Spaces

ture given by ν. Namely, a sequence of functions fn ∈ C(X) is said to convergeto f ∈ C(X) in the bounded-pointwise topology (or shortly, in bp-topology) if thefamily {fn} is uniformly bounded and fn(x) → f(x) for any x, as n → ∞. If φ is abounded linear functional on C∞(X), then its lifting to C(X) can be performed bythe bp-closure: for f ∈ C(X), φ(f) = limn→∞ φ(fn) for any sequence fn ∈ C∞(X)that is bp-converging to f . (The existence of the limit follows from the dominatedconvergence theorem and the fact that φ can be represented via the measure ν.)Similarly, the operators C∞(Y ) → C(X) are lifted to C(X) by the bp-closure.

It is worth noting that C∞(X) is closed in C(X) in its Banach topology, butit is dense in C(X) in the bp-topology.

In principle, the operators on C(X) (not C∞(X)) can be quite different fromthose lifted from C∞(X). For instance, the formulas f �→ lim supx �→∞ f(x) orf �→ lim supn�→∞ f(xn), with any sequence xn tending to infinity, define linearcontinuous functionals on C(R). However, such functionals are rarely used.

Proposition 1.7.1. For a bounded linear functional φ : C(X) → R with a locallycompact metric space X, the following three properties are equivalent:

(i) φ(f) =∫f(x)μ(dx) with some μ ∈ M(X),

(ii) φ is bp-continuous,

(iii) φ is obtained by the bp-closure from a bounded functional on C∞(X).

Proof. The implication (iii) to (i) follows from the structure of the functionalson C∞(X), i.e., from the Riesz–Markov theorem. Implication (i) to (ii) followsfrom the dominated convergence theorem. Finally, any bounded linear functionalφ : C(X) → R can be reduced to C∞(X), where it has a structure as given in (i).If it is bp-continuous, its value on any element of C(X) will be recovered by thebp-closure from this restriction. �

Similarly, a bounded operator T : C(Y ) → C(X) has a representation (1.77)with a bounded kernel ν if and only if it is bp-bp-continuous, i.e., it takes bp-converging sequences to bp-converging sequences. We shall only work with opera-tors on C(X) that are lifted from the operators on C∞(X) by bp-continuity.

In practice, unbounded operators in C(Rd), or in C(X) with X a convexsubset of Rd, are often well defined on some classes of smooth functions. In orderto distinguish this class of operators, let us say that, for a k ∈ N, an unboundedoperator A in C(Rd) is of at most kth order or has order not exceeding k if Ais a bounded operator Ck(Rd) → C(Rd). The structure of such operators, whenreduced to Ck∞(Rd), is as follows.

Proposition 1.7.2. If A is a bounded operator Ck∞(X) → C(X), then

Af(x) =

k∑m=0

∑i1,...,im

∫∂mf(y)

∂yi1 · · · ∂yimνi1···im(x, dy), (1.80)

with certain weakly continuous bounded signed transition kernels νi1···im .

Page 55: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.7. Linear operators in spaces of measures and functions 39

Proof. The mapping from f to the collection of all its partial derivatives up toorder k is a bounded injection of Ck

∞(X) to the product of a sufficient numberof C(X). Hence, by the Hahn–Banach theorem, any bounded linear functional onCk

∞(X) can be extended to a bounded linear functional on this product of C(X),which yields (1.80). �

The simplest examples of such operators are of course the differential oper-ators:

Af(x) =

k∑m=0

∑i1,...,im

Ai1···im(x)∂mf(x)

∂xi1 · · · ∂xim

. (1.81)

Another particular case are integral operators on the tails of the Taylor ex-pansion:

Af(x) =

∫ [f(x+y)−

k∑m=0

1

m!

∑i1,...,im

yi1 · · · yimAm(y)∂mf(x)

∂xi1 · · ·∂xim

]νi1···im(x, dy),

(1.82)where ν is not necessarily a bounded kernel, but only such that the integral is welldefined on smooth functions, that is, such that min(|y|k+1, 1) is integrable. Thefunctions Am play the role of a mollifier. Sometimes, they are indeed needed, butoften Am(y) = 1 is enough.

The main example for such operators is given by the powers of Laplacians,see (1.145) below. Another class of examples corresponds to the cases with k = 1and k = 2:

Aν1f(x) =

∫(f(x+ y)− f(x))ν(x, dy),

Aν2f(x) =

∫(f(x+ y)− f(x)− (∇f(x), y)χ(y))ν(x, dy),

(1.83)

where∫min(|y|, 1)ν(x, dy) < ∞ in the first equation and

∫min(|y|2, 1)ν(x, dy) <

∞ in the second one. These operators arise in stochastic analysis, where they areoften referred to as Levy–Khintchin-type operators. The mollifier χ in (1.83) isconventionally chosen either as χ(y) = 1/(1 + y2) or as χ(y) = 1y≤1.

In order to represent operators (1.82) in the general form (1.80) with boundedkernels ν, one has to expand f(x+ y) in a Taylor series. For instance, if d = 1, thefirst operator in (1.83) can be written equivalently as

Aν1f(x) =

∫ ∞

−∞f ′(x+z)Φ(x, z) dz, Φ(x, z) =

⎧⎪⎪⎨⎪⎪⎩

Φ+(x, z) =

∫ ∞

z

ν(x, dy), z > 0,

Φ−(x, z) =∫ z

−∞ν(x, dy), z < 0,

(1.84)

Page 56: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

40 Chapter 1. Analysis on Measures and Functional Spaces

so that Φ±(x, .) are positive functions, decreasing (respectively increasing) in z,and ∫ ∞

0

Φ+(x, z)dz =

∫ ∞

0

yν(x, dy),

∫ 0

−∞Φ−(x, z)dz = −

∫ 0

−∞yν(x, dy).

And the second operator in (1.83) with χ = 1 (for simplicity) can be written as

Aν2f(x) =

∫ ∞

−∞f ′′(x + z)Φ(x, z) dz,

Φ(x, z) =

⎧⎪⎪⎨⎪⎪⎩

Φ+(x, z) =

∫ ∞

z

(y − z)ν(x, dy), z > 0,

Φ−(x, z) =∫ z

−∞(z − y)ν(x, dy), z < 0,

(1.85)

so that Φ±(x, .) are positive convex functions, decreasing (respectively increasing)in z:

∂Φ+(x, z)

∂z= −

∫ ∞

z

ν(x, dy) < 0,∂Φ−(x, z)

∂z=

∫ z

−∞ν(x, dy) > 0.

Exercise 1.7.1. Check the formulas (1.84) and (1.85).

The representation (1.80) is of course not unique. The situation becomessimpler in dimension d = 1. In this case, the Taylor expansion

f(x) = f(a) +k−1∑m=1

f (m)(a)xm

m!+

∫ x

a

f (k)(y)(x − y)k−1

(k − 1)!dy

yields a bijective mapping from Ck([a, b]) to Rk × C([a, b]). Consequently, anycontinuous linear operator A : Ck([a, b]) → C([a, b]) has a unique representation

Af(x) = α0(x)f(a) +

k−1∑m=1

αm(x)f (m)(a) +

∫ b

a

f (k)(y)ν(x, dy), (1.86)

with some signed weakly continuous bounded transition kernel ν in [a, b] andcontinuous functions αj(x).

The most appropriate language for describing the operators in the spacesof smooth functions is the language of pseudo-differential operators. In order tointroduce this language properly, let us recall the basics of the Fourier transform(see any text on analysis for detail, e.g., [232], if needed) and also fix the notations.

The classical Fourier theorem states that the Fourier transform

F : φ → (Fφ)(p) =

∫e−i(p,x)φ(x) dx (1.87)

Page 57: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.7. Linear operators in spaces of measures and functions 41

is an isomorphism of the Schwartz space S(Rn), with the inversion formula

F−1 : ψ → (F−1ψ)(λ) = (2π)−n

∫ei(p,x)ψ(x) dx = (2π)−nFψ(−p). (1.88)

Remark 18. An annoying inconsistency prevails in the definitions of the Fouriertransform, since it can also be defined as

F : φ → (Fφ)(p) =

∫ei(p,x)φ(x) dx,

and also with various multipliers like (2π)−n or (2π)−n/2. These differences affectthe form of the inverse transform and the sign in the link between the differentia-tion of a function and the multiplication of its Fourier transform by p.

The most basic example for a Fourier transform is the transform of the ex-ponential of a quadratic form:∫

e−i(p,x) exp

{−1

2(Ax, x)

}dx =

(2π)d/2√detA

exp

{−1

2(p,A−1p)

}(1.89)

for a symmetric positive matrix A.

Exercise 1.7.2. Prove (1.89). Hint: Bring A into a diagonal form, i.e., express it asA = ODO−1 with a diagonal matrix D and an orthogonal matrix O. Then, onecan reduce the calculations to the one-dimensional case.

The Riemann–Lebesgue lemma states that F extends to the bounded linearoperator L1(Rn) → C∞(Rn). It also extends to the bounded operator M(Rn) →C(Rn) and to the isomorphism of the space L2(Rn), so that the following isometryrelation holds: ∫

(Ff)(x)(Fg)(x) dx = (2π)n∫

f(x)g(x) dx. (1.90)

Another fundamental fact on the range of the Fourier transform is given bythe Paley–Wiener theorem, which states that the image under Fourier transform ofthe space of functions from S(Rd) with a compact support in the ball {y : |y| ≤ R}coincides with the space of entire analytic functions g on Cn such that, for anyN > 0,

|g(ξ)| ≤ CNeR|Im ξ|(1 + |ξ|)−N , (1.91)

with a constant CN depending on N (see proofs in [232] or [90]).

The Fourier transform is used as the universal tool for diagonalizing transla-tion invariant linear operators, because – as can be seen by the direct applicationof integration by parts – the operator (1.87) turns the operator of differentiationinto an operator of multiplication:(

F∂φ

∂xj

)(p) =

∫e−i(p,x) ∂φ

∂xj(x) dx = −ipj(Fφ)(p). (1.92)

Page 58: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

42 Chapter 1. Analysis on Measures and Functional Spaces

This implies that operator (1.87) turns differential operators with constant coeffi-cients into operators of multiplication by a function.

Moreover, the Fourier transform takes the operation of convoluting functions,

(φ � ψ)(x) =

∫φ(x − y)ψ(y)dy

to a multiplication of functions:

F (φ � ψ)(p) = (Fφ)(p)(Fψ)(p), (1.93)

and vice versa:F (φψ) = (2π)−d(Fφ) � (Fψ). (1.94)

For an operator A acting in a space of functions on Rn, its symbol is definedas the following function of two variables:

ψ(x, p) = exp{−ixp}(A exp{ip ·})(x), (1.95)

whenever this expression is well defined. Representing a function u via the Fourierinversion formula

u =1

(2π)n

∫u(p)eixp dp

yields

Au(x) =1

(2π)n

∫u(p)(Aeip ·)(x) dp =

1

(2π)n

∫u(p)eixpψ(x, p) dp, (1.96)

which expresses the action of A in terms of its symbol.

For instance, the operator (1.80) of order at most k has the symbol

ψ(x, p) = ime−ipxk∑

m=0

∑i1,...,im

pi1 · · · pim∫

eiypνi1···im(x, dy).

For a function ψ(x, p) which is a polynomial in the variables p = (p1, . . . , pn),the differential operator ψ(x,−i∇) acts as

ψ(x,−i∇)u(x) =1

(2π)n

∫u(p)eixpψ(x, p) dp,

because the operator of differentiation turns into the operator of multiplicationunder the Fourier transform. Therefore, the general operator A with the symbolψ(x, p) acting by (1.96) can be naturally denoted by ψ(x,−i∇). Operators repre-sented in this form are called pseudo-differential operators (ΨDOs) with symbolsψ. In cases where ψ(x, p) = ψ(p) does not depend on x, ψ(−i∇) is referred to asan operator with constant coefficients, or as a spatially homogeneous operator. By

Page 59: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.8. Fractional calculus 43

(1.96), the action of ψ(−i∇) on a function is equivalent to the multiplication ofits Fourier image with ψ(p), so that

F [ψ(−i∇)u](y) = ψ(y)(Fu)(y), (1.97)

F [V (.)u(.)](y) = V (i∇)(Fu)(y) (1.98)

for bounded continuous functions ψ and V . Moreover, as follows from (1.93), theseoperators commute with the convolution in the sense that

ψ(−i∇)(φ � ψ) = (ψ(−i∇)φ � ψ) = (φ � ψ(−i∇)ψ), (1.99)

whenever all these expressions are well defined.

1.8 Fractional calculus

In this book, we shall demonstrate in many places that an extension of the theory ofODEs that is required for including fractional derivatives can be achieved more orless directly if one derives both of them from an appropriately formulated generaltheory of integral equations. In this section, we present the necessary backgroundon fractional calculus in concise manner, namely the definitions of fractional inte-grals and derivatives in their Banach-space-valued extension, their most importantalternative representations, the link between integrals and derivatives, the action offractional derivatives on the exponents and how this leads to the symbols of ΨDOsrepresenting these derivatives, and finally their finite-dimensional extensions.

Remark 19. Readers who are not interested in the ‘fractional’ development maywell skip this section and just note the definitions of fractional integrals that weoccasionally use as a tool for streamlining the treatment of certain series expan-sions.

Let Iaf be the integration operator defined on the set of continuous curvesf ∈ C([a, b], B), with B a Banach space, as

Iaf(x) =

∫ x

a

f(t) dt.

Integration by parts yields

I2af(x) =

∫ x

a

(Iaf)(y) dy =

∫ x

a

(x− y)f(y) dy.

Similarly, by induction one gets the following formula for the iterated Riemannintegral:

Ina f(x) =1

(n− 1)!

∫ x

a

(x− t)n−1f(t)dt. (1.100)

Page 60: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

44 Chapter 1. Analysis on Measures and Functional Spaces

This formula suggests a natural analytical extension, if x > a, to complex nwith positive real part, leading to the following definition of the (left) fractionalor Riemann–Liouville (RL) integral of order β, for any β with positive real part:

Iβa f(x) = Iβa+f(x) =1

Γ(β)

∫ x

a

(x− t)β−1f(t)dt. (1.101)

As can be expected from this definition, one finds the following semigroupproperty of fractional integrals by direct integration:

Iβ1a Iβ2

a f(x) = Iβ1+β2a f(x), (1.102)

for positive β1, β2 and any continuous curve f in B. On the other hand, directdifferentiation shows that for β > k (only β ∈ R are considered here) with k ∈ N,

dk

dxkIβa f(x) = Iβ−k

a f(x). (1.103)

Since

(Iβa 1)(x) =(x − a)β

Γ(β + 1)

where 1 is the constant function with value 1, it follows from the definition of theMittag-Leffler function (see (9.13)) that

Eβ(λ(t− a)β) =

⎛⎝ ∞∑

j=0

λjIjβa

⎞⎠1(t), (1.104)

for any number λ, or more generally for any bounded linear operator λ in a Banachspace B.

Noting that the derivation is the inverse operation to usual integration, thedefinition (1.101) of the fractional integral suggests two notions of fractional deriva-tives: 1) the so-called RL (left) derivatives of order β ∈ (n, n+ 1), with n a non-negative integer:

Dβa+f(x) =

dn+1

dxn+1In+1−βa f(x)

=1

Γ(n+ 1− β)

dn+1

dxn+1

∫ x

a

(x− t)n−βf(t)dt, x > a,

(1.105)

and 2) the so-called Caputo (left) derivative of order β ∈ (n, n+ 1):

Dβa+∗f(x) = In+1−β

a

[dn+1

dxn+1f

](x)

=1

Γ(n+ 1− β)

∫ x

a

(x− t)n−β

[dn+1

dtn+1f

](t)dt, x > a.

(1.106)

Page 61: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.8. Fractional calculus 45

Proposition 1.8.1. For f ∈ C1(R, B) and β ∈ (0, 1), x > a,

Dβa+f(x) =

1

Γ(−β)

∫ x−a

0

f(x− z)− f(x)

z1+βdz +

f(x)

Γ(1 − β)(x− a)β, (1.107)

Dβa+∗f(x) =

1

Γ(−β)

∫ x−a

0

f(x− z)− f(x)

z1+βdz +

f(x)− f(a)

Γ(1 − β)(x− a)β, (1.108)

implying

Dβa+∗f(x) = Dβ

a+[f − f(a)](x) = Dβa+f(x)−

f(a)

Γ(1− β)|x − a|β . (1.109)

Proof. Integrating by parts yields

I1−βa+ f(x) =

1

Γ(1 − β)

∫ x

a

(x− t)−βf(t) dt = − 1

Γ(1− β)

∫ x

a

d

dt

[(x− t)1−β

1− β

]f(t) dt

=1

Γ(1 − β)

[(x− a)1−β

1− β

]f(a) +

1

Γ(1− β)

∫ x

a

[(x− t)1−β

1− β

]f ′(t) dt,

so that

Dβa+f(x) =

d

dxI1−βa+ f(x) =

f(a)

Γ(1− β)(x − a)β+

1

Γ(1− β)

∫ x

a

(x− t)−βf ′(t) dt.

(1.110)Another integration by parts using f ′(t) = (f(t)− f(x))′ yields

Dβa+f(x) =

f(a)

Γ(1− β)(x − a)β− f(a)− f(x)

Γ(1− β)(x − a)β− β

Γ(1− β)

∫ x

a

f(t)− f(x)

(x− t)1+βdt,

which equals the r.h.s. of (1.107). On the other hand,

Dβa+∗f(x) = I1−β

a+ f ′(x) =1

Γ(1− β)

∫ x

a

(x− t)−βf ′(t) dt,

which differs from (1.110) by f(a)(x − a)−β/Γ(1 − β). This leads to (1.108) and(1.109). �

In particular, it follows that for smooth bounded integrable functions, theleft RL and Caputo derivatives coincide for a = −∞, β ∈ (0, 1). Therefore, onedefines the fractional derivative in generator form as their common value:

dxβf(x) = Dβ

−∞+f(x) = Dβ−∞+∗f(x) =

1

Γ(−β)

∫ ∞

0

f(x− z)− f(x)

z1+βdz.

(1.111)

The following corollary is important for building the generalized fractionalcalculus, as performed in Chapter 8.

Page 62: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

46 Chapter 1. Analysis on Measures and Functional Spaces

Proposition 1.8.2. The operators Dβa+∗ (respectively Dβ

a+) are obtained from Dβ∞+

by the restriction of its action on the space C1([a,∞), B) that can be consideredthe subspace of functions from C1(R, B) that are constants for x ≤ a (respectivelyon the subspace of C1

kill([a,∞), B) consisting of functions that vanish for x ≤ a).

Proof. One only has to observe that

1

Γ(1 − β)(x− a)β=

1

Γ(−β)

∫ ∞

x−a

dz

zβ+1. �

As can be expected from the definition, an important consequence of (1.109)is that the fractional derivatives form the left inverse operations to the RL inte-grals:

Dβa+∗I

βa g(x) = Dβ

a+Iβa g(x) = g(x), (1.112)

for any continuous curve g and x > a. In fact, the first equation follows from(1.109) and the second is obtained from the definition and (1.102):

Dβa+I

βa g =

d

dxI1−βa Iβa g = g.

Proposition 1.8.3. For f ∈ C2(R, B) and β ∈ (1, 2), x > a,

Dβa+f(x) =

1

Γ(−β)

∫ x−a

0

f(x− z)− f(x) + f ′(x)zz1+β

dz

+f(x)(x − a)−β

Γ(1− β)+

βf ′(x)(x − a)1−β

Γ(2− β), (1.113)

Dβa+∗f(x) =

1

Γ(−β)

∫ x−a

0

f(x− z)− f(x) + f ′(x)zz1+β

dz (1.114)

+(f(x) − f(a))(x− a)−β

Γ(1− β)+

(βf ′(x) − f ′(a))(x − a)1−β

Γ(2− β),

so that

Dβa+∗f(x) = Dβ

a+[f − f(a)− f ′(a)(. − a)](x)

= Dβa+f(x)−

f(a)(x− a)−β

Γ(1− β)− f ′(a)(x − a)1−β

Γ(2− β). (1.115)

Proof. For β ∈ (1, 2) and x > a,

Dβa+∗f(x) =

1

Γ(2− β)

∫ x

a

(x− t)1−βf ′′(t) dt, (1.116)

which rewrites as

1

Γ(2− β)(x− a)1−β(f ′(x) − f ′(a)) +

1− β

Γ(2− β)

∫ x

a

(x − t)−β(f ′(t)− f ′(x)) dt.

Page 63: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.8. Fractional calculus 47

Using

f ′(t)− f ′(x) =d

dt(f(t)− f(x) − (t− x)f ′(x)),

another integration by parts yields

Dβa+∗f(x) =

1

Γ(2 − β)(x− a)1−β(f ′(x) − f ′(a))

− 1

Γ(1− β)(x− a)−β(f(a)− f(x)− (a− x)f ′(x))

+1

Γ(−β)

∫ x

a

f(t)− f(x)− (t− x)f ′(x)(x− t)1+β

dt,

which equals the r.h.s. of (1.114).

On the other hand, again for β ∈ (1, 2) and x > a,

I2−βa+ f(x) =

1

Γ(2− β)

∫ x

a

(x− t)1−βf(t) dt,

which rewrites by integration by parts as

I2−βa+ f(x) =

f(a)

Γ(2− β)

(x− a)2−β

2− β+

1

Γ(2− β)

∫ x

a

(x− t)2−β

2− βf ′(t) dt,

and by yet another integration by parts as

I2−βa+ f(x) =

f(a)

Γ(2− β)

(x − a)2−β

2− β+

f ′(a)(x − a)3−β

Γ(2− β)(2 − β)(3 − β)

+

∫ x

a

(x− t)3−βf ′′(t)Γ(2− β)(2 − β)(3 − β)

dt.

Consequently,

Dβa+f(x) =

d2

dx2I2−βa+ f(x) =

1− β

Γ(2− β)(x − a)−βf(a)

+f ′(a)(x − a)1−β

Γ(2− β)+

∫ x

a

(x− t)1−βf ′′(t)Γ(2− β)

dt.

Comparing this with (1.116) yields (1.113) and (1.115). �

For smooth bounded integrable functions, the left RL and Caputo derivativesagain coincide for a = −∞, β ∈ (1, 2), and one defines the fractional derivative ingenerator form as their common value:

dxβf(x) = Dβ

−∞+f(x) = Dβ−∞+∗f(x) =

1

Γ(−β)

∫ ∞

0

f(x− z)− f(x) + f ′(x)zz1+β

dz.

(1.117)

Page 64: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

48 Chapter 1. Analysis on Measures and Functional Spaces

Formula (1.115) implies that (1.112) holds for β ∈ (1, 2). Similar argumentsjustify it for all positive β. Moreover, for any β ∈ (n, n+1) one derives that the leftRL and Caputo derivatives coincide for a = −∞, and one defines the fractionalderivative in generator form as their common value:

dxβf(x) = Dβ

−∞+f(x) = Dβ−∞+∗f(x)

=1

Γ(n+ 1− β)

∫ ∞

0

dn+1

dxn+1

f(x− z)dz

zβ−n

=1

Γ(n+ 1− β)

dn+1

dxn+1

∫ ∞

0

(x− t)n−βf(t) dt (1.118)

=1

Γ(−β)

∫ ∞

0

[f(x− z)− f(x) + f ′(x)z − · · · − 1

n!f (n)(x)(−z)n

]dz

z1+β

=1

Γ(−β)

∫ 0

−∞

[f(x+ z)− f(x)− f ′(x)z − · · · − 1

n!f (n)(x)zn

]dz

|z|1+β.

Note that we presented several equivalent forms that are used in various contexts.

Let us specifically distinguish the following consequence of (1.112) that allowsdifferential equations to be rewritten with fractional derivatives in an equivalentintegral form, which is crucial for their analysis.

Proposition 1.8.4. Let β ∈ (n, n+ 1) with a non-negative integer n.

(i) If g ∈ C([a, b], B) and

f(x) =

n−1∑k=0

xk

k!f (k)(a) +

1

Γ(β)

∫ x

0

(x− s)β−1g(s) ds, (1.119)

then g = Dβa+∗f .

(ii) If f ∈ Cn−1([a, b], B) and g = Dβa+∗f is well defined as a continuous function,

then f is given by (1.119).

Proof. Let us prove it for β ∈ (0, 1), the case β ∈ (n, n+ 1) with arbitrary n ∈ Nbeing analogous. (i) We have

Dβa+∗f = Dβ

a+(f − f(a)) = Dβa+I

βa g = g.

(ii) We have

Iβa g = Iβad

dxI1−βa (f − f(a)) = D1−β

a+∗ I1−βa (f − f(a))

= D1−βa+ I1−β

a (f − f(a)) = f − f(a). �Remark 20. An insightful framework for understanding fractional operations isgiven by the theory of generalized functions, which allows one to look at fractionalintegrals and derivatives in a unified way as convolutions with regularized powerfunctions, see Section 1.10.

Page 65: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.8. Fractional calculus 49

Turning to the right derivative, notice that for x < a formula (1.100) rewrites as

Ina f(x) =(−1)n

(n− 1)!

∫ a

x

(t− x)n−1f(t)dt. (1.120)

This suggests several possible normalizations for the analytic continuation in n andthe corresponding inversions (fractional derivatives). The most common definitionof the right fractional or Riemann–Liouville (RL) integral of order β (with positivereal part) is

Iβa−f(x) =1

Γ(β)

∫ a

x

(t− x)β−1f(t) dt, (1.121)

and the right versions of (1.105), (1.106) are chosen as follows:

Dβa−f(x) =

(−1)n+1

Γ(n+ 1− β)

dn+1

dxn+1

∫ a

x

(t− x)n−βf(t) dt, x < a, (1.122)

Dβa−∗f(x) =

(−1)n+1

Γ(n+ 1− β)

∫ a

x

(t− x)n−β dn+1

dxn+1f(t) dt, x < a. (1.123)

When β ∈ (0, 1) and x < a, similar calculations as for the left derivative (see(1.110)) lead to the following analogues of (1.107), (1.108):

Dβa−f(x) =

1

Γ(−β)

∫ a−x

0

f(x+ z)− f(x)

z1+βdz +

f(x)

Γ(1− β)(a− x)β, (1.124)

Dβa−∗f(x) =

1

Γ(−β)

∫ a−x

0

f(x+ z)− f(x)

z1+βdz +

f(x)− f(a)

Γ(1− β)(a− x)β, (1.125)

implying

Dβa−∗f(x) = Dβ

a−[f − f(a)](x) = Dβa−f(x)−

f(a)

Γ(1 − β)(a− x)β. (1.126)

When β ∈ (1, 2), x < a, one obtains

Dβa−f(x) =

1

Γ(−β)

∫ a−x

0

f(x+ z)− f(x)− f ′(x)zz1+β

dz

+f(x)(a− x)−β

Γ(1− β)− βf ′(x)(a − x)1−β

Γ(2 − β), (1.127)

Dβa−∗f(x) =

1

Γ(−β)

∫ a−x

0

f(x+ z)− f(x)− f ′(x)zz1+β

dz

+(f(x)− f(a))(a− x)−β

Γ(1− β)− (βf ′(x)− f ′(a))(a− x)1−β

Γ(2− β), (1.128)

Page 66: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

50 Chapter 1. Analysis on Measures and Functional Spaces

so that

Dβa−∗f(x) = Dβ

a−[f − f(a)− f ′(a)(.− a)](x)

= Dβa−f(x)−

f(a)(a− x)−β

Γ(1− β)+

f ′(a)(a− x)1−β

Γ(2− β). (1.129)

For smooth bounded integrable functions, the right fractional derivatives ingenerator form is again the common value of Dβ

a−f(x) and Dβa−∗f(x) for a = ∞,

which for β ∈ (n, n+ 1) is

d(−x)βf(x) = Dβ

∞−f(x) = Dβ∞−∗f(x) (1.130)

=1

Γ(−β)

∫ ∞

0

[f(x+ z)− f(x)− f ′(x)z − · · · − 1

n!f (n)(x)zn

]dz

z1+β.

It is straightforward to see that the pairs of operators (1.118), (1.130) aredual in the sense that (

dxβf, g

)=

(f,

d(−x)βg

)(1.131)

for β ∈ (n, n + 1) and sufficiently regular functions f, g, where the pairing (f, g)denotes of course the usual L2-product: (f, g) =

∫f(x)g(x)dx. This fact also

justifies the notation dβ/d(−x)β , since for β = 1 the operators d/dx and −d/dx =d/d(−x) are dual.

Proposition 1.8.5. For any β > 0,

d(±x)βe−ipx = exp{∓iπβ sgn p/2}|p|βe−ipx. (1.132)

Proof. For natural β, this follows from the usual differentiation. Let β ∈ (n, n+1)with a non-negative integer n. Then by (1.118) and (1.130),

d(±x)βe−ipx =

1

Γ(−β)e−ipx

∫ ∞

0

[e±ipz − 1− (±ipz)− · · · − 1

n!(±ipz)n

]dz

z1+β,

and (1.132) follows by (9.24). �

When applied to fractional derivatives, the correspondence (1.97) implies thefollowing.

Proposition 1.8.6. For any β > 0,

F

(dβ

d(±x)βf

)(p) = exp

{± i

2πβ sgn p

}|p|βF (f)(p), (1.133)

F

(dβ

dxβ+

d(−x)β

)(p) = 2 cos(πβ/2)|p|βF (f)(p). (1.134)

Page 67: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.8. Fractional calculus 51

Proof. By (1.131) and the definition of F ,

F

(dβ

dxβf

)(p) =

∫ (dβ

d(−x)βe−ipx

)f(x) dx.

Hence (1.132) proves (1.133). Equation (1.134) is a consequence of (1.133). �Remark 21. Note that exp{iπβ sgn p/2}|p|β is the value of the main branch of theanalytic function (ip)β (if p is real). Thus Proposition 1.8.6 states that F takesthe fractional β-derivative to a multiplication by (ip)β .

Keeping in mind that the Fourier transform F takes −d2/dx2 to the op-erator of multiplication by p2, one defines the symmetric fractional derivative|d2/dx2|β/2 = |d/dx|β as the operator that the Fourier transform takes to |p|β.Proposition 1.8.7. For any β ∈ (n, n+ 1) with a non-negative integer n,∣∣∣∣ ddx

∣∣∣∣β

f(x) =1

2Γ(−β) cos(πβ/2)(1.135)

×∫ ∞

−∞

[f(x+ z)− f(x)− f ′(x)z − · · · − 1

n!f (n)(x)zn

]dz

|z|1+β.

If β = 2k with k ∈ N, then |d/dx|2k = (−1)kd2k/dx2k. If β = 2k + 1 with k ∈ N,then∣∣∣∣ ddx

∣∣∣∣β

f(x) =(−1)k+1

π(2k + 1)! (1.136)

× limε→0

∫R\(−ε,ε)

[f(x+ z)− f(x)− f ′(x)z − · · · − 1

(2k)!f (2k)(x)z2k

]dz

|z|2k+2.

Proof. It follows from (1.134) that for positive β which is not an odd integer,∣∣∣∣ ddx∣∣∣∣β

f(x) =1

2 cos(πβ/2)

(dβ

dxβ+

d(−x)β

).

Therefore, the first two statements follow from the definitions of the one-sidedfractional derivatives. If β = 2k + 1, we use the continuity to write∣∣∣∣ ddx

∣∣∣∣2k+1

f(x) = limε→0

limβ→2k+1

1

2Γ(−β) cos(πβ/2)

×∫ ∞

−∞

[f(x+ z)− f(x)− f ′(x)z − · · · − 1

(2k)!f (2k)(x)z2k

]dz

|z|1+β,

where β → 2k + 1 from below. To take the limit, we note that

cos(πβ/2) = (−1)k sin

(π2k + 1− β

2

)∼ (−1)kπ

2k + 1− β

2

Page 68: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

52 Chapter 1. Analysis on Measures and Functional Spaces

for small (2k+1−β)/2 and that (the analytic continuation of) the function Γ(−β)has a pole at β = 2k + 1 with Γ(β) ∼ −1/[(2k + 1)!(2k + 1 − β)] for β near thispole. This implies (1.136). �Exercise 1.8.1. As an example, check that for β ∈ (0, 1), δ < β, x > 0,

d(−x)βxδ =

xδ−β

Γ(−β)

∫ ∞

0

(1 + z)δ − 1

z1+βdz. (1.137)

The exact evaluation of fractional derivatives can be rarely achieved, and oneusually has to confine oneself to its asymptotic or qualitative behaviour, as thefollowing example shows.

Proposition 1.8.8. Let β ∈ (0, 1), and t, x, α > 0. Then

d(−x)βexp{−txα} ∼ x−β exp{−txα} ×

⎧⎪⎪⎪⎪⎨⎪⎪⎪⎪⎩

(txα)β , txα > 1,

txα, txα ≤ 1, β > α,

txα − ln(txα), txα ≤ 1, β = α,

(txα)β/α, txα ≤ 1, β < α,(1.138)

where ∼ means that f ∼ g on a set A, if C−1 < f/g < C on A for someconstant C.

Proof. We have

d(−x)βexp{−txα} =

1

Γ(−β)

∫ ∞

0

[exp{−t(x+ y)α} − exp{−txα}] dy

y1+β.

Changing the variable y into z = y/x yields

d(−x)βexp{−txα} =

exp{−txα}Γ(−β)xβ

∫ ∞

0

[exp{−txα((1 + z)α − 1)} − 1]dz

z1+β.

Next, changing the variable z into w = (1 + z)α − 1 yields

d(−x)βexp{−txα} =

exp{−txα}Γ(−β)xβ

∫ ∞

0

[exp{−txαw} − 1]dw

f(w),

wheref(w) = ((1 + w)1/α − 1)1+βα(1 + w)(α−1)/α

is a positive increasing function of w > 0 such that f(w) ∼ w1+β/α for w > 1 andf(w) ∼ w1+β for w ≤ 1. Hence (1.138) follows from the following similarities:

−∫ ∞

1

(e−tw − 1)dz

w1+β∼

⎧⎪⎨⎪⎩

1/γ, t > 1,

tmin(1,γ), t ≤ 1, γ = 1,

− t ln t, t ≤ 1, γ = 1,

(1.139)

−∫ 1

0

(e−tw − 1)dz

w1+β∼{tβ, t > 1,

t, t ≤ 1.�

Page 69: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.8. Fractional calculus 53

Exercise 1.8.2. Let β ∈ (0, 1), δ < β and t, x, α > 0. Show that

d(−x)β(xδ exp{−txα}) ∼ xδ−β exp{−txα}g(txα), (1.140)

where g(y) ∼ yβ for large y and g is uniformly bounded for y ∈ [0, a] with anya > 0.

So far we have worked in one dimension only. In finite dimensions, we shalllimit the discussion to symmetric (mixed) fractional derivatives. Since one canweight the derivatives of different directions differently, it is natural to consider asymmetric (mixed) fractional operator in Rd of the form∫

Sd−1

|(∇, s)|βμ(ds), (1.141)

where μ(ds) is an arbitrary centrally symmetric finite (non-negative) Borel mea-sure on the sphere Sd−1, and β > 0. The most natural way to define this operator isvia the Fourier transform, i.e., as an operator that multiplies the Fourier transformof a function by ∫

Sd−1

|(p, s)|βμ(ds),or, in other words, via the equation

F (

∫Sd−1

|(∇, s)|βμ(ds)f)(p) =∫Sd−1

|(p, s)|βμ(ds)Ff(p). (1.142)

By analogy with the one-dimensional case, in order to get a more concreteexpression for this operator, one can look at the operator

Lβμf =

∫ ∞

0

∫Sd−1

(f(x+ y)− f(x)− (∇f(x), y)− · · · − 1

k!∇kf(x)y⊗k

)d|y|μ(ds)|y|1+β

,

(1.143)where β ∈ (k, k + 1), s = y/|y|, and

∇kf(x)y⊗k =∑

j1,...,jk

∂kf(x)

∂xj1 · · ·∂xjk

yj1 · · · yjk .

By (9.28),

(Lβμe

±i(p,.))(x) = e±i(p,x)Γ(−β) cos(πβ/2)

∫Sd−1

|(p, s)|βμ(ds). (1.144)

Consequently,

F (Lβμf)(p) =

∫e−i(p,x)Lβ

μf(x) dx =

∫f(x)Lβ

μe−i(p,x) dx

= (Ff)(p)Γ(−β) cos(πβ/2)

∫Sd−1

|(p, s)|βμ(ds).

Page 70: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

54 Chapter 1. Analysis on Measures and Functional Spaces

Therefore, according to the above definition of the operator (1.141) via its Fouriertransform, we find ∫

Sd−1

|(∇, s)|βμ(ds) = 1

Γ(−β) cos(πβ/2)Lβμ, (1.145)

whenever β is not an odd integer. This formula extends (1.135) to arbitrary di-mensions.

For β ∈ (1, 2), the r.h.s. of (1.145) can be written in equivalent forms as

1

2Γ(−β) cos(πβ/2)

∫ ∞

0

∫Sd−1

(f(x+ y) + f(x− y)− 2f(x))d|y|

|y|1+βμ(ds)

=1

Γ(−β) cos(πβ/2)limε→0

∫ ∞

ε

∫Sd−1

(f(x+ y)− f(x))d|y|

|y|1+βμ(ds). (1.146)

For β = 2k+1 with a non-negative integer k, it follows by the same limitingprocedure as in Proposition 1.8.7 that∫

Sd−1

|(∇, s)|2k+1μ(ds) f(x) =2

π(−1)k+1(2k + 1)!

× limε→0

∫ ∞

ε

∫Sd−1

(f(x+ y)− f(x)− (∇f(x), y) − · · ·

· · · − 1

(2k)!∇2kf(x)y⊗2k

)d|y|

|y|2k+2μ(ds). (1.147)

One can see directly from both its Fourier representation (1.142) and theintegro-differential representation (1.145) that the operators

∫Sd−1 |(∇, s)|βμ(ds)

are self-dual, that is(∫Sd−1

|(∇, s)|βμ(ds) f, g)

=

(f,

∫Sd−1

|(∇, s)|βμ(ds) g). (1.148)

The measure μ that mixes the fractional derivatives in various directions isoften referred to as the spectral measure of the operator (1.145).

The case of the uniform (or Lebesgue) measure μ(ds) = ds and of the corre-

sponding operator Lβds is particularly interesting. In this case, we can calculate∫

Sd−1

|(p, s)|β ds

= |p|β |Sd−2|∫ π

0

| cos θ|β sind−2 θ dθ = 2|p|β|Sd−2|∫ 1

0

uβ(1 − u2)(d−3)/2 du

= |p|β |Sd−2|∫ 1

0

v(β−1)/2(1− v)(d−3)/2 dv = |p|β |Sd−2|B(β + 1

2,d− 1

2

).

Page 71: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.8. Fractional calculus 55

Using (9.15) and (9.12) yields∫Sd−1

|(p, s)|βds = 2|p|βπ(d−1)/2Γ((β + 1)/2)

Γ((β + d)/2). (1.149)

Consequently, the operator Lβds multiplies the Fourier transform of a function by

2|p|βΓ(−β) cos(πβ/2)π(d−1)/2Γ((β + 1)/2)

Γ((β + d)/2).

Therefore, for any positive non-integer β, the fractional Laplacian operator |∇|β =|Δ|β/2, defined as the operator that multiplies the Fourier transform of a functionby |p|β, can be represented as

|∇|βf(x) = 1

2π(d−1)/2 cos(πβ/2)

Γ((β + d)/2)

Γ((β + 1)/2)Γ(−β)Lβdsf(x). (1.150)

In the language of ΨDOs (see (1.95)), formula (1.132) means that the function

exp{±iπβ sgn p/2}|p|β (1.151)

is the symbol of the operator dβ/d(±x)β . Since the analytic properties of symbolsare important for the analysis of ΨDOs, let us note that this function is theboundary value at real p of the analytic function (−ip)β on the half-space {Im p ≥0}, or (for the negative sign in (1.151)) of the function (ip)β on the half-space{Im p ≤ 0}, respectively.

Similarly, formula (1.144) means that the function

ψ(p) =

∫Sd−1

|(p, s)|βμ(ds) (1.152)

=1

Γ(−β) cos(πβ/2)

∫ ∞

0

∫Sd−1

(ei(y,p) − 1− i(y, p)− · · · − ik(y, p)k

k!

)d|y|μ(dy))|y|1+β

,

where y = y/|y|, is the symbol of the operator∫Sd−1 |(∇, s)|βμ(ds). The corre-

sponding ΨDOs with variable coefficients have the symbols

ψ(x, p) =

∫Sd−1

|(p, s)|βμ(x, ds). (1.153)

Another useful extension of operators Lβμ is given by operators of the type

Lβμ,Ωf =

∫ ∞

0

∫Sd−1

(f(x+ y)− f(x)− (∇f(x), y) − · · ·

· · · − 1

k!∇kf(x)y⊗k

)Ω(x, y)d|y|μ(ds)

|y|1+β,

(1.154)

Page 72: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

56 Chapter 1. Analysis on Measures and Functional Spaces

with a function Ω(x, y). These operators are sometimes referred to as hyper-singular integrals, with the function Ω called their characteristics. If Ω(x, y) de-pends on y only through the ratio y/|y|, these hyper-singular integrals have sym-bols of the type (1.153). General hyper-singular integrals arise naturally as thedual operators to operators with symbols of the type (1.153).

Finally, let us note that the equivalence between the definitions of the frac-tional derivatives via their Fourier transform (1.142) and via an integral operatorof the type (1.145) was obtained for functions f from the Schwartz space S(Rd).Once this is done, the formulas (1.142) and (1.145) can be treated in two differentways in order to extend the fractional derivatives to less regular functions: Onthe one hand, formula (1.142) allows for the definition of

∫Sd−1 |(∇, s)|βμ(ds) as a

bounded operator from the space of functions representable as the Fourier trans-forms of functions φ ∈ L1(R

d) such that |p|βφ(p) ∈ L1(Rd) to the space C∞(Rd).

On the other hand, formula (1.145) allows the definition of∫Sd−1 |(∇, s)|βμ(ds)

as a bounded operator from Cn+1(Rd) to C(Rd), or more generally from thesubspace of Cn(Rd) with Holder-continuous nth-order derivatives to C(Rd).

1.9 Generalized functions: main operations

In Remark 16, we had introduced the space D = D(Ω) of infinitely differentiablefunctions (real- or sometimes complex-valued) on the open set Ω ⊂ Rn with abounded support, equipped with such a topology that a sequence φn ∈ D convergesto φ ∈ D, as n → ∞, if there exists a compact K ⊂ Rn such that all φn and φvanish outside K and φn → φ uniformly on K together with all its derivatives. Acontinuous linear functional ξ on D is called a generalized function on Rn, or isalternatively referred to as a distribution. Its value on φ ∈ D is usually denotedby ξ(φ), or sometimes (ξ, φ), if no ambiguity arises (for instance, related to thecomplex scalar product, see Remark 22 below).

In this section, we shall summarize the basic definitions, notations and for-mulas with respect to the theory of generalized functions. A detailed expositioncan be found in many places, e.g., [90], while we shall keep it short.

The space D′ of all generalized functions is equipped with the usual dualtopology (which would formally be better called ∗-weak topology), that is ξn → ξimplies ξn(φ) → ξ(φ) for all φ ∈ D. In this context, the space D is referred to asthe space of test functions.

An alternative convenient space of linear functionals is the space S′(Rn)of continuous linear functionals on the Schwartz space S(Rn), referred to as thespace of tempered generalized functions. This space is again equipped with thecorresponding weak topology: ξn → ξ means that ξn(φ) → ξ(φ) for all φ ∈ S.

Basic examples of generalized functions are supplied by ‘normal’ locally in-tegrable functions f(x) on Rn and locally bounded measures μ acting on D byintegration, that is f(φ) =

∫f(x)φ(x) dx or μ(φ) =

∫φ(x)μ(dx). A notable ex-

Page 73: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

1.9. Generalized functions: main operations 57

ample of the latter case is given by the famous Dirac delta-functions δx, which arepoint measures at x acting on D by the evaluation δx(φ) = φ(x). A sequence ofregular functions fn is called a δ-convergent sequence, or shortly a δ-sequence, iffn → δ = δ0, as n → ∞ in the sense of generalized functions.

In order to stress the difference between D′ and S′, let us note that thefunction ex on R defines an element of D′(R), which does not belong to S′(R).

Remark 22. The use of the Fourier transform makes it indispensable to work withcomplex-valued functions. In this case, it is often convenient to define the cor-respondence between ordinary functions and generalized functions in a differentway. In fact, some authors prefer to assign a generalized function to the ordinaryfunction f according to the rule (f, φ) =

∫f(x)φ(x) dx, which aligns the notation

with the usual scalar product in L2(Rd), but makes the correspondence of func-tions and generalized functions anti-linear. We shall stick to our original definitiongiven above. Of course, with such a modified correspondence, some other formulasalso change, most notably those related to the Fourier transform.

The direct product f × g of two generalized functions on Rd and Rn is natu-rally defined as the generalized function on Rd+n, so that it acts as (f × g, φψ) =(f, φ)(g, ψ) for the products of the test functions φ(x)ψ(y) from S(Rd+n) orD(Rd+n), and extends to arbitrary test functions by linearity and continuity.

The operation of differentiation extends to generalized functions by duality,that is, for any ξ ∈ D′ and φ ∈ D one defines (ξ′, φ) = −(ξ, φ′) (because (f ′, φ) =−(f, φ′) due to integration by parts for f, φ ∈ D). According to this definition,any generalized function is infinitely differentiable. For instance, the derivative δ′xacts on D(R) as δ′x(φ) = −φ′(x).

Moreover, if L = ψ(−i∇) is a pseudo-differential operator in Rd with symbolψ(p), the formal dual is the operator L∗ = ψ(i∇) with the symbol ψ(−p), andtherefore the action of L on generalized functions is defined as

(Lξ, φ) = (ψ(−i∇)ξ, φ) = (ξ, L∗φ) = (ξ, ψ(i∇)φ). (1.155)

Accordingly, one says that a generalized function ξ is a generalized solution to theequation Lξ = f , if this equation holds in the sense of generalized functions.

Similarly, the Fourier transform of a generalized function ξ is defined as

(Fξ, φ) = (ξ, Fφ).

From the Fourier theorem on the isomorphism of S(Rd) under F , it follows that Fis also an isomorphism of S′(Rd). In order to see what happens withD′, we observethat, according to the Paley–Wiener theorem (see (1.91)), the Fourier transformis an isomorphism of the space D(Rd) and the space of test functions Z(Cd)consisting of entire analytic functions g on Cd for which an R exists (dependingon g) such that for any N > 0,

|g(ξ)| ≤ CNeR|Im ξ|(1 + |ξ|)−N , (1.156)

Page 74: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

58 Chapter 1. Analysis on Measures and Functional Spaces

with a constant CN depending on N and g. Consequently, F and F−1 yield iso-morphisms between the dual spaces D′(Rd) and Z ′(Cd), the dual space of Z(Cd).

The definition of the convolution requires much care. Observing that forordinary functions (say, from the space S)∫

(f � g)(x)φ(x) dx =

∫ ∫f(x)g(y)φ(x + y) dxdy,

it is natural to define the convolution f � g of two generalized functions by setting((f � g), φ

)=((f × g)(x, y), φ(x + y)

). (1.157)

However, this term is not always well defined, because the function φ(x+ y) doesnot usually belong to S(R2d) or D(R2d) (see Proposition 1.9.2 below).

Although one cannot speak about the value of a generalized function ξ at aspecific point, it makes sense to say that ξ vanishes in an open set U – meaningthat (ξ, φ) = 0 for all φ with a support in U . Accordingly, by the support of ageneralized function ξ in Rd, one means the intersection of all closed sets K ⊂ Rd

such that ξ vanishes on Rd \K. A noteworthy example is as follows.

Proposition 1.9.1. If a generalized function ξ on R (from S′ or D′) is supported on

the one-point set {0}, then there exists k = 0, 1, 2, . . . such that ξ =∑k

j=0 ajδ(j)

with some constants aj.

Proof. If ξ ∈ D′ and it has a compact support, then ξ ∈ S′. Due to the continuity ofξ (see Proposition 1.6.1), there exists k = 0, 1, 2, . . . such that |(ξ, φ)| ≤ ‖φ‖Ck(R).By Proposition 1.7.2, any such functional is given by the sum of the integrals ofthe derivatives of φ up to order k over some measures. Due to the condition ofbeing supported at zero, these measures have to have support at zero. �

An important property of functions that are supported by proper cones isthe fact that the convolution restricted to such functions is always well defined.For instance, the following holds.

Proposition 1.9.2. If f and g are generalized functions on Rd that are supportedon the positive cone Rd

+, the convolution f � g is well defined and also supportedon Rd

+.

Proof. It follows from (1.157), since for any φ ∈ D(Rd), the function φ(x +y)1x≥01y≥0 belongs to D(R2d). �

Finally, let us comment on the operation of (pointwise) multiplication of generalized functions. If f is an infinitely smooth function with all derivatives bounded and ξ a generalized function, then the product fξ can be defined as the generalized function acting by the rule (fξ, φ) = (ξ, fφ). If f is not infinitely smooth, this product fξ may not be well defined. Thus by passing to generalized functions, the product structure of usual functions is effectively lost. Several approaches exist


(which we will not touch) suggesting to extend the notion of generalized functions in such a way that an extension of the multiplicative structure to these objects becomes possible.

1.10 Generalized functions: regularization

If a function f on R is not locally integrable, it cannot be used directly to define the generalized function φ ↦ ∫ f(x)φ(x) dx. However, if f is integrable around any point apart from a finite set M of singular points, then ∫ f(x)φ(x) dx is defined for all φ ∈ D that vanish in a neighbourhood of M, and a generalized function f̃ is called a regularization of f if (f̃, φ) = ∫ f(x)φ(x) dx for all such φ. Note that a regularization is never uniquely defined. In fact, if f̃ is a regularization of f having a singularity at 0, then f̃ + δ^{(k)} is also a regularization for any k ∈ N. Nevertheless, for a wide class of functions including various combinations of power functions, there exists in some sense a canonical way to choose a regularization. Let us demonstrate this choice on the basic example of one-sided power functions

x_+^λ = {x^λ for x > 0; 0 for x ≤ 0},     x_−^λ = {0 for x ≥ 0; |x|^λ for x < 0},

and their even and odd combinations

|x|^λ = x_+^λ + x_−^λ,     |x|^λ sgn x = x_+^λ − x_−^λ.

These functions are locally integrable for λ > −1 and have a non-integrable singularity at x = 0 for λ ≤ −1. What is the natural regularization for λ ≤ −1? Two ideas can be exploited for answering this question.

Firstly, it would be nice to keep the usual rules of differentiation, say (x_+^λ)′ = λ x_+^{λ−1}. Let us calculate (x_+^λ)′ for λ ∈ (−1, 0) in the sense of generalized functions:

((x_+^λ)′, φ) = −(x_+^λ, φ′) = −∫_0^∞ x^λ φ′(x) dx = −lim_{ε→0} ∫_ε^∞ x^λ φ′(x) dx.

Integrating by parts with φ′(x) dx = d(φ(x) − φ(0)) yields

((x_+^λ)′, φ) = lim_{ε→0} [ ε^λ (φ(ε) − φ(0)) + ∫_ε^∞ (φ(x) − φ(0)) λ x^{λ−1} dx ] = ∫_0^∞ (φ(x) − φ(0)) λ x^{λ−1} dx.

Therefore, the generalized function acting as φ ↦ ∫_0^∞ (φ(x) − φ(0)) x^μ dx is the natural regularization of x_+^μ for μ ∈ (−2, −1), which we shall denote by x_+^μ in a slightly abusing notation. This choice of regularization ensures that the rule (x_+^λ)′ = λ x_+^{λ−1} holds for λ ∈ (−1, 0).


Iterating this procedure, i.e., differentiating the obtained function x_+^λ for λ ∈ (−2, −1) and so on, one can show that if the regularized version of x_+^λ for −n − 1 < Re λ < −n is defined by the equation

(x_+^λ, φ) = ∫_0^∞ x^λ [ φ(x) − φ(0) − xφ′(0) − ⋯ − x^{n−1} φ^{(n−1)}(0)/(n−1)! ] dx,   (1.158)

then the rule (x_+^λ)′ = λ x_+^{λ−1} holds for all λ ≠ −1, −2, . . ..

The second approach to canonical regularization is based on the idea of analytic continuation. Namely, for Re λ > −1 we have

(x_+^λ, φ) = ∫_0^∞ x^λ φ(x) dx = ∫_0^1 x^λ [φ(x) − φ(0)] dx + ∫_1^∞ x^λ φ(x) dx + φ(0)/(λ + 1).

This function is analytic in λ for Re λ > −2 and has a single pole at the point λ = −1. Hence the idea of the analytic continuation suggests to take it as the extension of the function x_+^λ to the domain Re λ > −2. Notably, if λ ∈ (−2, −1), this formula rewrites as

(x_+^λ, φ) = ∫_0^∞ (φ(x) − φ(0)) x^λ dx,

yielding the same result as obtained previously with respect to the consistency of the differentiation rules. Continuing in the same way, i.e., adding and subtracting further terms of the Taylor expansion of the test function φ, yields the analytic continuation of the integral ∫_0^∞ x^λ φ(x) dx to the domain Re λ > −n − 1 in the form

∫_0^∞ x^λ φ(x) dx = ∫_1^∞ x^λ φ(x) dx + Σ_{k=1}^n φ^{(k−1)}(0)/[(k−1)!(k+λ)] + ∫_0^1 x^λ [ φ(x) − φ(0) − xφ′(0) − ⋯ − x^{n−1} φ^{(n−1)}(0)/(n−1)! ] dx.   (1.159)

This analytic function has simple poles at λ = −1, −2, . . . , −n. For −n − 1 < Re λ < −n, it coincides remarkably with the expression (1.158) that was obtained from another point of view, because

∫_1^∞ x^{k+λ} dx = −1/(k + λ + 1)

for k ∈ [0, n − 1] and −n − 1 < Re λ < −n.

Therefore, the formulas (1.158) and (1.159) specify an analytic continuation of the integral ∫_0^∞ x^λ φ(x) dx to the whole complex plane of λ (with poles at λ = −1, −2, . . .) and define a family of generalized functions x_+^λ that satisfy the usual differentiation rule (x_+^λ)′ = λ x_+^{λ−1} (for λ outside the poles).


Similarly, one can define a family of generalized functions x_−^λ. However, it is easier to use the identity (x_−^λ, φ(x)) = (x_+^λ, φ(−x)) for writing down the expressions directly:

(x_−^λ, φ) = ∫_0^∞ x^λ [ φ(−x) − φ(0) + xφ′(0) − ⋯ − (−x)^{n−1} φ^{(n−1)}(0)/(n−1)! ] dx,   (1.160)

(x_−^λ, φ) = ∫_1^∞ x^λ φ(−x) dx + Σ_{k=1}^n (−1)^{k−1} φ^{(k−1)}(0)/[(k−1)!(k+λ)] + ∫_0^1 x^λ [ φ(−x) − φ(0) + xφ′(0) − ⋯ − (−x)^{n−1} φ^{(n−1)}(0)/(n−1)! ] dx,   (1.161)

valid for −n − 1 < Re λ < −n and for −n − 1 < Re λ, respectively.

It is insightful to observe from the formulas (1.158) and (1.160) on the one hand and the formulas (1.130) and (1.118) on the other hand that the values (x_±^λ, φ) with λ < −1 yield nothing but the fractional derivatives Γ(−β) d^β/d(∓x)^β φ(0) with β = −1 − λ ∈ (n−1, n), see (1.130). We can generally conclude that both fractional integrals and derivatives (in the generator form) of a function are given by convolutions with the power functions x_+^λ (defined as generalized functions that regularize the usual power functions).

Adding and subtracting the expressions for (x_±^λ, φ), we see that in both cases half of the poles cancel, so that |x|^λ has poles only at λ = −1, −3, −5, . . . and |x|^λ sgn x has poles only at λ = −2, −4, . . .. Therefore, for −2k − 1 < Re λ < −2k + 1,

(|x|^λ, φ) = ∫_0^∞ x^λ [ φ(x) + φ(−x) − 2( φ(0) + x²φ′′(0)/2 + ⋯ + x^{2k−2} φ^{(2k−2)}(0)/(2k−2)! ) ] dx.   (1.162)

In particular, the generalized function |x|^{−2k} with k ∈ N, that can be naturally denoted by x^{−2k}, acts as

(x^{−2k}, φ) = ∫_0^∞ x^{−2k} [ φ(x) + φ(−x) − 2( φ(0) + x²φ′′(0)/2 + ⋯ + x^{2k−2} φ^{(2k−2)}(0)/(2k−2)! ) ] dx,   (1.163)

and the function |x|^{−1} sgn x, that can be naturally denoted by 1/x, acts as

(x^{−1}, φ) = ∫_0^∞ [φ(x) − φ(−x)] dx/x,   (1.164)

which coincides with the Cauchy principal value of the integral with the density 1/x:

(x^{−1}, φ) = lim_{ε→0} ∫_{R\(−ε,ε)} φ(x)/x dx.   (1.165)
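The equality of (1.164) and (1.165) is easy to confirm numerically; the sketch below (illustrative only, with a particular test function chosen for the purpose and not taken from the text) computes both sides for φ(x) = e^{−(x−1)²}.

```python
import numpy as np
from scipy.integrate import quad

phi = lambda x: np.exp(-(x - 1.0)**2)   # an ad hoc test function

# (1.164): integral over (0, infinity) of [phi(x) - phi(-x)]/x
odd_form = quad(lambda x: (phi(x) - phi(-x)) / x, 0, np.inf)[0]

# (1.165): Cauchy principal value of the integral of phi(x)/x over |x| >= eps, eps -> 0
def pv(eps):
    left  = quad(lambda x: phi(x) / x, -np.inf, -eps)[0]
    right = quad(lambda x: phi(x) / x,  eps,  np.inf)[0]
    return left + right

print(odd_form, pv(1e-4), pv(1e-6))   # all three values agree
```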


The simplest application of the obtained formulas is the analytic continuation of the Euler Gamma-function

Γ(λ) = ∫_0^∞ x^{λ−1} e^{−x} dx = (x_+^{λ−1}, e^{−x}),

where e^{−x} can be considered continued as a smooth function to the left of the origin in such a way that it becomes a function from S. Namely, applying (1.158) and (1.159) yields the continuation of Γ(λ + 1) to an analytic function with poles at λ = −1, −2, . . ., so that, for Re λ > −n − 1,

Γ(λ+1) = ∫_1^∞ x^λ e^{−x} dx + Σ_{k=1}^n (−1)^{k−1}/[(k−1)!(k+λ)] + ∫_0^1 x^λ [ e^{−x} − Σ_{k=0}^{n−1} (−x)^k/k! ] dx.   (1.166)

For −n − 1 < Re λ < −n, this reduces to

Γ(λ+1) = ∫_0^∞ x^λ [ e^{−x} − Σ_{k=0}^{n−1} (−x)^k/k! ] dx.   (1.167)
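As a quick consistency check, the value produced by (1.167) can be compared with the standard continuation of the Gamma function to negative non-integer arguments; the following minimal sketch (illustrative only, relying on scipy.special.gamma, which is not part of the text) does this for λ ∈ (−2, −1), i.e., n = 1.

```python
import numpy as np
from math import factorial
from scipy.integrate import quad
from scipy.special import gamma

def gamma_cont(lam, n):
    # Gamma(lam + 1) by (1.167), valid for -n-1 < Re(lam) < -n
    taylor = lambda x: sum((-x)**k / factorial(k) for k in range(n))
    return quad(lambda x: x**lam * (np.exp(-x) - taylor(x)), 0, np.inf)[0]

lam = -1.3
print(gamma_cont(lam, 1), gamma(lam + 1))   # both give Gamma(-0.3)
```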

Thus Γ(λ + 1) has simple poles at λ = −k, k ∈ N, like the generalized functions x_+^λ. By dividing (x_+^λ, φ) by Γ(λ + 1), we therefore cancel the poles, so that the generalized function x_+^λ/Γ(λ + 1) is an entire analytic function in λ. Similarly, one shows the following assertion.

Proposition 1.10.1. The generalized functions

x_+^λ/Γ(λ+1),     x_−^λ/Γ(λ+1),     |x|^λ/Γ((λ+1)/2),     sgn x · |x|^λ/Γ((λ+2)/2)   (1.168)

are entire functions of the parameter λ ∈ C.

Odd and even combinations of x_±^λ make it possible to cancel some of the singularities, but not all of them. The natural question arises whether a linear combination (with regular coefficients, not like 1/Γ(λ + 1)) can be chosen in such a way that we get an analytic function of λ without any singularities. A nice construction of such a combination goes as follows. For complex numbers z ∈ C \ R_−, one can fix the argument arg(z) by the requirement −π < arg(z) < π. Then for any λ ∈ C, the function

z^λ = |z|^λ e^{iλ arg(z)} = e^{λ ln|z|} e^{iλ arg(z)}

is a well-defined analytic function of z ∈ C \ R_−. This allows for the definition of the functions

(x ± i0)^λ = lim_{τ→0+} (x ± iτ)^λ,   (1.169)


where τ → 0+ means the convergence from the right. Since

lim_{τ→0+} arg(x ± iτ) = {0 for x > 0; ±π for x < 0},

it follows that

(x ± i0)^λ = {|x|^λ for x > 0; e^{±iπλ}|x|^λ for x < 0}.   (1.170)

In other words,

(x ± i0)^λ = x_+^λ + e^{±iπλ} x_−^λ.   (1.171)

Though (x ± i0)^λ are not locally integrable functions for all λ, the formula (1.171) makes it possible to give them a meaning as generalized functions for all λ ≠ −1, −2, . . .. Moreover, as can be seen from the above formulas for x_±^λ, their singularities cancel in the combinations (1.171), so that the following assertion holds.

Proposition 1.10.2. The generalized functions (x ± i0)^λ given by (1.171) or (1.169) are entire functions of the parameter λ ∈ C.

Extending the above arguments to the finite-dimensional case, one can obtain the analytic continuation of generalized functions arising from homogeneous functions on R^d of order λ, that is functions of the type Ψ_λ(x) = |x|^λ ψ(x/|x|) with some integrable function ψ on S^{d−1}. In fact, for any test function φ ∈ D(R^d), we find

∫ |x|^λ ψ(x/|x|) φ(x) dx = ∫_0^∞ r^{λ+d−1} u(r) dr,   (1.172)

with

u(r) = ∫_{S^{d−1}} ψ(s) φ(rs) ds,

where ds denotes the Lebesgue measure on S^{d−1}. The function u(r) is infinitely differentiable on R_+ with a bounded support, with all derivatives having continuous limits as r → 0 due to

u^{(k)}(0) = ∫_{S^{d−1}} ψ(s) ∇^k φ(0) s^{⊗k} ds

(concise tensor notations from (3.7) and (3.8) were used). Hence it can be continued to a function belonging to D(R). Consequently, the integral (1.172) is well defined for λ > −d. Moreover, it equals (x_+^{λ+d−1}, u). By the properties of x_+^{λ+d−1}, it therefore follows that the integral (1.172) has an analytic continuation to the plane of complex λ with possible poles only at the points λ = −d, −d − 1, . . .. This extension defines the generalized function Ψ_λ on R^d. Moreover, if ψ(s) is an even function, i.e., it satisfies ψ(s) = ψ(−s), then the derivatives u^{(k)}(0) of odd orders k vanish – which implies that the poles of Ψ_λ may occur only at


λ = −d, −d − 2, −d − 4, . . .. A basic example is obtained by choosing ψ(s) = 1, which leads to the generalized functions |x|^λ on R^d.

The same results also hold if ψ is not a function on S^{d−1}, but a measure – or even more generally a generalized function, that is a continuous linear functional on the space of smooth functions on S^{d−1}.

1.11 Fourier transform, fundamental solutions and Green functions

The true power of generalized functions reveals itself in combination with the theory of Fourier transforms. By (9.20), for λ > 0 and τ > 0,

F(x_±^λ e^{−τ|x|})(p) = ∫_0^∞ r^λ e^{−rτ ∓ irp} dr = (±ip + τ)^{−1−λ} Γ(1 + λ),

where −π/2 < arg(±ip + τ) < π/2. This can be rewritten as

F(x_±^λ e^{−τ|x|})(p) = e^{∓iπ(λ+1)/2} (p ∓ iτ)^{−1−λ} Γ(1 + λ),   (1.173)

where −π < arg(p − iτ) < 0 and 0 < arg(p + iτ) < π, respectively. Passing to the limit τ → 0+ yields

F(x_±^λ/Γ(1 + λ))(p) = e^{∓iπ(λ+1)/2} (p ∓ i0)^{−1−λ}.   (1.174)

By analytic continuation, this equation holds in the sense of generalized functions for all λ, since both sides are entire functions of λ by Propositions 1.10.1 and 1.10.2. Representing (p ∓ i0)^{−1−λ} by (1.171), we obtain

F(x_±^λ)(p) = Γ(1 + λ) [ e^{∓iπ(λ+1)/2} p_+^{−(1+λ)} + e^{±iπ(λ+1)/2} p_−^{−(1+λ)} ],   (1.175)

for λ ≠ −1, −2, . . ., and thus

F^{−1}(x_±^λ)(p) = (1/2π) F(x_±^λ)(−p) = (Γ(1 + λ)/2π) [ e^{∓iπ(λ+1)/2} p_−^{−(1+λ)} + e^{±iπ(λ+1)/2} p_+^{−(1+λ)} ].   (1.176)

Adding and subtracting yields the Fourier transforms of the even and odd combinations. For instance,

F(|x|^λ)(p) = −2Γ(1 + λ) sin(λπ/2) |p|^{−(1+λ)}.   (1.177)
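Formula (1.177) can be probed through the regularization used in its derivation: for τ > 0 the transform of |x|^λ e^{−τ|x|} has the closed form 2 Re[Γ(1+λ)(τ + ip)^{−1−λ}], which approaches the right-hand side of (1.177) as τ → 0. The following minimal sketch (illustrative only; the convention Ff(p) = ∫ f(x)e^{−ipx} dx used above is assumed, and the parameter values are ad hoc) displays this for λ ∈ (−1, 0).

```python
import numpy as np
from scipy.special import gamma

lam, p = -0.4, 1.7               # parameters chosen for illustration

def reg_ft(tau):
    # exact value of F(|x|^lam e^{-tau|x|})(p) = 2 Re[ Gamma(1+lam) (tau + i p)^{-1-lam} ]
    return 2*np.real(gamma(1 + lam) * (tau + 1j*p)**(-(1 + lam)))

limit = -2*gamma(1 + lam)*np.sin(lam*np.pi/2)*abs(p)**(-(1 + lam))   # r.h.s. of (1.177)
print(reg_ft(1e-2), reg_ft(1e-5), limit)    # the regularized values approach the limit
```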

This result has a direct extension to arbitrary dimensions. Namely, the Fourier transform of the function |x|^λ in R^d equals

F(|x|^λ)(p) = ∫_0^∞ ∫_{S^{d−1}} |y|^{λ+d−1} e^{i(y,p)} d|y| dȳ,


with ȳ = y/|y| and ds the Lebesgue measure on S^{d−1} applied to s = ȳ. Hence for 0 < λ + d < 1, we get from (9.26) that

F(|x|^λ)(p) = Γ(d + λ) ∫_{S^{d−1}} |(p, s)|^{−d−λ} cos(π(d + λ)/2) ds.

Applying (1.149) finally yields

F(|x|^λ)(p) = 2 cos(π(d + λ)/2) π^{(d−1)/2} [ Γ(λ + d) Γ((−λ − d + 1)/2) / Γ(−λ/2) ] |p|^{−d−λ}.   (1.178)

With the convergence (as an improper Riemann integral) proved for 0 < λ + d < 1, this formula extends by analytic continuation to all λ ≠ 0, 2, 4, . . . and λ ≠ −d, −d − 2, . . . as an equation for generalized functions.

More generally, formula (9.26) provides the Fourier transform of the homogeneous generalized function |x|^ω μ(dx̄) d|x| with a symmetric measure μ on S^{d−1} for −1 < ω < 0, which again can be extended to other λ by analytic continuation of homogeneous functions as explained above (see the discussion after formula (1.172)).

In the analysis of partial differential and pseudo-differential equations, a key role is played by the so-called fundamental solutions. If L is a linear differential or pseudo-differential operator with constant coefficients, its fundamental solution is defined as a generalized function E_L such that LE_L(x) = δ(x).

Proposition 1.11.1. If g(x) is a function such that the convolution

u(x) = (E_L ⋆ g)(x) = ∫ E_L(x − y) g(y) dy   (1.179)

is well defined, then this convolution solves the equation Lu(x) = g(x). Moreover, this solution is unique in the class of functions where the convolution E_L ⋆ u is well defined.

Proof. The first assertion holds because of

Lu(x) = (LE_L ⋆ g)(x) = (δ ⋆ g)(x) = g(x).

If we assume that there are two solutions to the equation Lu(x) = g(x), then their difference solves the equation Lu = 0. Hence

u = u ⋆ δ = u ⋆ LE_L = Lu ⋆ E_L = 0,

where (1.99) was used. □

Note, however, that a fundamental solution may not be unique (see examples below or Proposition 5.12.2).


A useful extension can be obtained for Banach-space-valued functions. Namely, for a Banach space B, we may define the B-valued generalized functions from D′_B(R^d) and S′_B(R^d) as the spaces of continuous linear mappings D(R^d) → B respectively S(R^d) → B. Any locally integrable function ξ : R^d → B defines a generalized function from D′_B(R^d) by the usual rule (ξ, φ) = ∫ ξ(x) φ(x) dx for any φ ∈ D(R^d). The differentiation operation extends to D′_B(R^d) and S′_B(R^d) like in the real-valued case. It is important to note that Proposition 1.11.1 extends to B-valued solutions of the equation Lu(x) = g(x):

Proposition 1.11.2. If g ∈ D′_B(R^d) and the convolution E_L ⋆ g is well defined, then this convolution solves the equation Lu(x) = g(x) in D′_B(R^d). Moreover, this solution is unique in the class of u ∈ D′_B(R^d) for which E_L ⋆ u is well defined.

Let us see how the Fourier transform works when it comes to calculating fundamental solutions. Let E_L be a fundamental solution for a pseudo-differential operator with constant coefficients and symbol ψ. Then passing to the Fourier transform in the equation LE_L = δ yields ψ(p) FE_L(p) = 1. Hence

E_L = F^{−1}(1/ψ).   (1.180)

For instance, if L = |∇|^β = |Δ|^{β/2}, with β > 0, it follows that E_L = F^{−1}(|p|^{−β}). Consequently, by (1.178), if β ≠ d, d + 2, d + 4, . . ., the fundamental solution for L is

E_L(x) = c(β, d) |x|^{β−d},   (1.181)

with a constant c(β, d). In particular, for the usual Laplacian (where β = 2) one gets

E_L = −|x|^{2−d} / [(d − 2)|S^{d−1}|],   (1.182)

for all d ≠ 2.

Exercise 1.11.1.

(i) Check (1.182) by direct differentiation, as well as the formula

E_Δ = (1/2π) ln|x| for d = 2.   (1.183)

(ii) Check that the functions

θ(t) e^{−at},     θ(t) sin(at)/a

represent fundamental solutions to the one-dimensional operators a + d/dt and a² + d²/dt² respectively. Also, show that both fundamental solutions are unique under the additional assumption that they vanish for negative t.


If a ΨDO L has an infinitely differentiable symbol ψ(p) of at most polynomial growth, then the action Lξ is well defined for any generalized function. In fact, Lξ can be defined by its Fourier transform ψ(p)Fξ. If the symbol ψ is not infinitely smooth, this product may not be defined, and therefore also not the action Lξ. Thus one has to be cautious when working with ΨDOs with non-smooth symbols in the framework of generalized functions.

Let us find the fundamental solutions for the operators of the fractional derivation d^β/d(±x)^β. By (1.151), their symbols equal

exp{±iπβ sgn p/2} |p|^β

and are not infinitely smooth. Nevertheless, the fundamental solutions are well defined by the formula

E_{±β} = F^{−1}( exp{∓iπβ sgn p/2} |p|^{−β} ) = exp{∓iπβ/2} F^{−1} p_+^{−β} + exp{±iπβ/2} F^{−1} p_−^{−β}.

Using formula (1.175), we find (after some cancellation and exploiting the fact Γ(β)Γ(1 − β) = π/sin(πβ)) that

E_{±β}(x) = x_±^{β−1}/Γ(β),   (1.184)

for all positive β ∉ N. These fundamental solutions are unique under the additional condition that they vanish on a half-line (see Proposition 5.12.2 for more general cases).

Another type of fundamental solution arises for evolutionary problems. Namely, if L is a linear differential or pseudo-differential operator on R^d with constant coefficients, its Green function, or the fundamental solution for its Cauchy problem (also referred to as the heat kernel), is defined as the generalized function G_L(t, ·) on R^d depending on the parameter t > 0 such that

∂G_L/∂t − LG_L = 0, t > 0;     lim_{t→0} G_L(t, x) = δ(x).   (1.185)

When such a G_L is known, it follows from its definition that the formula

f(t, x) = (G_L(t, ·) ⋆ f_0)(x)   (1.186)

supplies a solution to the Cauchy problem

∂f(t, x)/∂t − Lf(t, x) = 0, t > 0;     f(0, x) = f_0(x),   (1.187)

with an arbitrary initial condition f_0 ∈ D. If the Green function is given by ordinary locally integrable functions of x depending smoothly on t (which is often


the case in concrete examples), then the convolution in (1.186) turns into the usual integral,

f(t, x) = ∫ G_L(t, x − y) f_0(y) dy,   (1.188)

and is well defined for all bounded measurable functions f_0.

More generally, for a time-dependent family of operators L_t, the Green function of L_t is defined as the generalized function G_L(t, s, ·) on R^d depending on the parameters t > s such that

∂G_L(t, s, ·)/∂t − L_t G_L(t, s, ·) = 0;     lim_{t→s} G_L(t, s, x) = δ(x).   (1.189)

The most fundamental example of the Green function is supplied by the Cauchy problem for the diffusion or heat conductivity equation:

∂f(t, x)/∂t = (1/2) Δf(t, x),     f(0, x) = f_0(x).   (1.190)

The corresponding Green function solves the problem

∂G(t, x)/∂t − (1/2) ΔG(t, x) = 0, t > 0;     lim_{t→0} G(t, x) = δ(x),   (1.191)

and is given by the formula

G(t, x) = (2πt)^{−d/2} exp{−x²/2t}.   (1.192)

Its derivation via the Fourier transform can be found, e.g., with (2.63).
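A direct numerical check of (1.188) and (1.190) is instructive: convolving an initial condition with the kernel (1.192) on a grid and comparing the time increment with (1/2)Δf. This is only a minimal sketch (one dimension, crude finite differences, a Gaussian-mixture initial condition chosen ad hoc, none of which appear in the text).

```python
import numpy as np

x  = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]
f0 = np.exp(-(x - 1)**2) + 0.5*np.exp(-(x + 2)**2 / 0.5)   # arbitrary initial condition

def heat_solution(t):
    # f(t, x) = integral of G(t, x - y) f0(y) dy with G from (1.192), d = 1
    G = np.exp(-(x[:, None] - x[None, :])**2 / (2*t)) / np.sqrt(2*np.pi*t)
    return (G * f0[None, :]).sum(axis=1) * dx

t, dt = 0.5, 1e-3
f_minus, f_mid, f_plus = heat_solution(t - dt), heat_solution(t), heat_solution(t + dt)

dfdt = (f_plus - f_minus) / (2*dt)                 # numerical time derivative
lap  = np.gradient(np.gradient(f_mid, dx), dx)     # numerical second space derivative
print(np.max(np.abs(dfdt - 0.5*lap)[50:-50]))      # small residual away from the boundary
```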

We shall now give two simple but fundamental results that link fundamental solutions and Green functions.

Proposition 1.11.3. Let L be a pseudo-differential operator ψ(−i∇), and let its Green function G_L be given by ordinary functions G_L(t, x), t > 0, so that the integrals ∫ G_L(t, x) dx are uniformly bounded for small t and G_L(t, x) satisfies the equation ∂G_L/∂t − LG_L = 0 classically. Then the function

E(t, x) = {G_L(t, x) for t > 0; 0 for t ≤ 0}   (1.193)

is the fundamental solution for the operator ∂/∂t − L in R^{d+1}.


Proof. By (1.155) and the integrability of G_L(t, ·), if φ ∈ D(R^{d+1}), then

([∂/∂t − L]E, φ) = −∫_0^∞ dt ∫ G_L(t, x) [ ∂φ/∂t(t, x) + L*φ(t, x) ] dx

= −lim_{ε→0} ∫_ε^∞ dt ∫ G_L(t, x) [ ∂φ/∂t(t, x) + L*φ(t, x) ] dx

= lim_{ε→0} ∫_ε^∞ dt ∫_{R^d} [∂/∂t − L] G_L(t, x) φ(t, x) dx + lim_{ε→0} ∫_{R^d} G_L(ε, x) φ(ε, x) dx.

The first term vanishes, since G_L satisfies the equation ∂G_L/∂t − LG_L = 0 classically. Moreover,

lim_{ε→0} ∫_{R^d} G_L(ε, x) (φ(ε, x) − φ(0, x)) dx = 0,

due to the uniform integrability of G_L(t, ·). Hence

([∂/∂t − L]E, φ) = lim_{ε→0} ∫_{R^d} G_L(ε, x) φ(0, x) dx = φ(0, 0). □

What can be said about the fundamental solution for L, once the Green function G_L of its Cauchy problem is given? Let us say that a generalized function ξ on R^{d+1} = {t, x_1, . . . , x_d} can be reduced to the generalized function ξ̃ on R^d, if

(ξ̃, φ) = (ξ, φ(x)1(t))

is well defined in the following sense: for any φ ∈ D(R^d) and any sequence of functions η_k : R → [0, 1] such that η_k → 1 monotonically as k → ∞, there exists a limit of the sequence (ξ, φ(x)η_k(t)), denoted by (ξ, φ(x)1(t)), such that this limit does not depend on the sequence η_k and depends continuously on φ in the topology of D(R^d).

Remark 23. In fact, the last condition about the continuity is automatically fulfilled once the limit exists. However, this fact is not at all obvious and requires a nontrivial proof (see, e.g., [261]).

Proposition 1.11.4. Let L be a pseudo-differential operator ψ(−i∇), and let E ∈ D′(R^{d+1}) be the fundamental solution for the operator ∂/∂t − L. If E can be reduced to the generalized function Ẽ ∈ D′(R^d), then Ẽ is a fundamental solution for the operator −L.

Proof. We have

−(LẼ, φ) = −(Ẽ, L*φ) = −(E, L*φ(x)1(t)) = ([∂/∂t − L]E, φ(x)1(t)) = (δ(x)δ(t), φ(x)1(t)) = φ(0),

as claimed. □


As an exemplary application, let us derive again the fundamental solution (1.182) for the Laplacian operator Δ in d > 2, now using formula (1.192) for the Green function. By Proposition 1.11.4, the fundamental solution E(x) for −Δ/2 should be given by the integral

E(x) = ∫_0^∞ (2πt)^{−d/2} exp{−x²/2t} dt

whenever it is well defined, which is the case for d > 2. Therefore, using this formula for d > 2 and changing the variable t into u = x²/2t yields t = x²/2u, dt = −(x²/2u²) du and thus

E(x) = [1/(2π^{d/2}|x|^{d−2})] ∫_0^∞ u^{−2+d/2} e^{−u} du = Γ(−1 + d/2)/(2π^{d/2}|x|^{d−2}).

Evaluating the Γ-function yields the same result for the fundamental solution for the Laplacian as in (1.182).
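Numerically, for d = 3 the time integral of the heat kernel indeed reproduces Γ(d/2 − 1)/(2π^{d/2}|x|^{d−2}) = 1/(2π|x|), the fundamental solution for −Δ/2. A minimal sketch (illustrative only, with the dimension and the value of |x| chosen ad hoc):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

d, r = 3, 0.7   # dimension and |x|, chosen for illustration

# E(x) = integral over t of (2*pi*t)^{-d/2} exp(-|x|^2/(2t)) dt
num   = quad(lambda t: (2*np.pi*t)**(-d/2) * np.exp(-r**2/(2*t)), 0, np.inf)[0]
exact = gamma(d/2 - 1) / (2 * np.pi**(d/2) * r**(d - 2))   # = 1/(2*pi*r) for d = 3

print(num, exact, 1/(2*np.pi*r))   # all three coincide
```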

1.12 Sobolev spaces

The Fourier transform allows for a stratification of the spaces S and S′ in a natural way. For an integer k ≥ 0 and p ≥ 1, the Sobolev space H^k_p = H^k_p(R^d) is defined as the subspace of functions u from L_p(R^d) such that the partial derivatives up to and including order k (always defined in the sense of generalized functions) turn out to be also functions from L_p(R^d). The norm on H^k_p is defined as

‖f‖_{k,p} = Σ_{q=0}^k Σ_{m_1+⋯+m_d=q} ‖ ∂^q f/∂x_1^{m_1} ⋯ ∂x_d^{m_d} ‖_{L_p(R^d)}.   (1.194)

According to the Fourier inversion theorem, the norm (1.194) for p = 2 is equivalent to the norm

‖f‖_{H^k_2} = ( ∫ (1 + |p|²)^k |Ff(p)|² dp )^{1/2}.   (1.195)

This allows for an extension of the definition of H^k_2 to all real k. All these spaces H^s_2 = H^s_2(R^d), s ∈ R, are Hilbert spaces, so that the Fourier transform

F : H^k_2(R^d) → L_2(R^d, (1 + |p|²)^{k/2} dp)

is an isometry. Similar representations via the Fourier transform also exist for H^k_p with p ≠ 2, but we will not give details here.

With the help of the celebrated Sobolev embedding theorem, one can embed these spaces in some classes of regular functions. The most basic result is as follows.


Theorem 1.12.1.

(i) If s > d/2, then H^s_2(R^d) ⊂ C_∞(R^d).

(ii) If s > d/2 + k, then H^s_2(R^d) ⊂ C^k_∞(R^d).

Proof. (i) By the Riemann–Lebesgue lemma, it is sufficient to check that u ∈ H^s_2 implies Fu ∈ L_1(R^d), because in this case u = F^{−1} ∘ Fu ∈ C_∞(R^d). With the Cauchy inequality for scalar products in Hilbert spaces, we find

∫ |(Fu)(p)| dp ≤ ( ∫ |(Fu)(p)|² (1 + |p|²)^s dp )^{1/2} ( ∫ (1 + |p|²)^{−s} dp )^{1/2} = ‖u‖_{H^s_2} ( ∫ (1 + |p|²)^{−s} dp )^{1/2},

and the last term here is finite because of s > d/2. (ii) The proof is similar, if we take into account that the differentiation of u of order k adds a multiplier of magnitude |p|^k to its Fourier image. □

For an open Ω ⊂ R^d, one defines the local Sobolev space H^s_2(Ω) as the subspace of generalized functions ξ of D′(R^d) such that φξ ∈ H^s_2(R^d) for all φ ∈ D(R^d) with a support in Ω.

Remark 24. One can show (e.g., in [232], Section IX.6) that for any bounded open Ω and any ξ ∈ D′(R^d) there exists an s such that ξ ∈ H^s_2(Ω). Therefore, the spaces H^s_2(Ω) cover all of D′.

1.13 Variational derivatives

We shall now discuss the peculiarities that arise when analysing spaces of measures and the dual functional spaces, where the structure of differentials is revealed in terms of variational derivatives. For that purpose, recall that M(X) is the space of bounded Borel measures on a metric space X, and M^+(X) is its cone of positive measures. Let M_{<λ}(X) (respectively M_{≤λ}(X) or M_λ(X)) and M^+_{<λ}(X) (respectively M^+_{≤λ}(X) or M^+_λ(X)) denote the parts of these sets containing measures of norm less than λ (respectively not exceeding λ or equal to λ).

For a function F on M^+_{≤λ}(X) or M_{≤λ}(X), the variational derivative δF(Y)/δY(x) is defined as the directional derivative of F(Y) in the direction δ_x:

δF(Y)/δY(x) = D_{δ_x} F(Y) = lim_{s→0+} (1/s)(F(Y + sδ_x) − F(Y)).   (1.196)

The higher derivatives δ^l F(Y)/δY(x_1) ⋯ δY(x_l) are defined inductively.

If F is a continuous function and δF(Y)/δY(·) exists for all x ∈ X and depends continuously on Y in the norm topology, then the function F(Y + sδ_x) of s ∈ R_+ has a continuous right derivative everywhere and is therefore continuously differentiable (by Lemma 1.3.1). This implies

F(Y + δ_x) − F(Y) = ∫_0^1 [ δF(Y + sδ_x)/δY(x) ] ds.   (1.197)

From the definition of variational derivatives, it follows that

lim_{s→0+} (1/s)(F(Y + saδ_x) − F(Y)) = a δF/δY(x)

for a positive a, which allows for an extension of equation (1.197) to

F(Y + aδ_x) − F(Y) = a ∫_0^1 [ δF/δY(x) ](Y + saδ_x) ds.   (1.198)

As for the general directional derivatives (see equation (1.17)), one can deduce from (1.197) that (1.198) still holds for negative a – as long as Y + aδ_x belongs to the set where F is defined, of course.

Furthermore, again by the same argument as in the proof of Proposition 1.2.1, one finds that if δF(Y)/δY(·) exists for all x ∈ X and depends continuously on Y in the norm topology, then the directional derivative exists in the direction of ξ = Σ_j a_j δ_{x_j}, i.e., an arbitrary finite linear combination of Dirac masses (again restricted by the condition that Y + aδ_x belongs to the set where F is defined), and

F(Y + ξ) − F(Y) = ∫_0^1 ∫ [ δF(Y + sξ)/δY(x) ] ξ(dx) ds.   (1.199)

The natural question arises whether one can derive from the existence of variational derivatives that directional derivatives in an arbitrary direction ξ ∈ M(X) exist, and whether such directional derivatives can be expressed in terms of the variational derivatives. In principle, a continuous linear functional on the space of measures given by F(ξ) = (f, ξ) with some f(x) for ξ being linear combinations of Dirac point masses can look quite different for other measures, because the sets of continuous and discrete measures are closed Banach subspaces in M(R^d). In order to extend the functional F uniquely from its values on point masses, one has to assume that this functional is weakly continuous. (Linguistically counterintuitive, the weak continuity is a stronger requirement than just continuity.)

This motivates the following definition. Let C_weak(M^+_{≤λ}(X)) denote the space of bounded weakly continuous functions on M^+_{≤λ}(X). Equipped with the usual sup-norm, it represents a Banach space that is a closed subspace of C(M^+_{≤λ}(X)), the space of bounded functions on M^+_{≤λ}(X) that are continuous in the norm topology. We denote by C^k_weak(M^+_{≤λ}(X)), k = 1, 2, . . ., the subspace of C_weak(M^+_{≤λ}(X)) consisting of functions F such that δ^l F(Y)/δY(x_1) ⋯ δY(x_l) exists for all l = 1, . . . , k and (x_1, . . . , x_l) ∈ X^l, Y ∈ M^+_{<λ}(X), and represents a continuous mapping of l + 1 variables with measures equipped with the weak topology, this mapping having a bounded continuous extension to the closed set X^l × M^+_{≤λ}(X). The spaces C^k_weak(M^+_{≤λ}(X)) are Banach with respect to the recursively defined norm

‖F‖_{C^k_weak(M^+_{≤λ}(X))} = ‖F‖_{C^{k−1}_weak(M^+_{≤λ}(X))} + sup_{x_1,…,x_k} sup_{Y ∈ M^+_{≤λ}(X)} | δ^k F(Y)/δY(x_1) ⋯ δY(x_k) |,   (1.200)

where C^0_weak(M^+_{≤λ}(X)) = C_weak(M^+_{≤λ}(X)). The spaces C^k_weak(M_{≤λ}(X)) are defined accordingly.

We say that a function F on M^+(X) (and similarly, on M(X)) has locally bounded continuous variational derivatives up to order k if F ∈ C^k_weak(M^+_{≤λ}(X)) for all λ > 0. Let us denote the space of such functions by C^k_{lbw}(M^+(X)). It is a Frechet space equipped with the countable set of norms (1.200) with λ ∈ N.

Theorem 1.13.1. Let X be a locally compact metric space. Then the spaces C^k_weak(M^+_{≤λ}(X)) are closed Banach subspaces of C^k_Gat(M^+_{≤λ}(X)), with the norm (1.200) coinciding with the norm (1.22). If F ∈ C^1_weak(M^+_{≤λ}(X)), then

D_ξ F(Y) = ∫ [ δF(Y)/δY(x) ] ξ(dx),   (1.201)

F(Y + ξ) − F(Y) = ∫_0^1 ( δF(Y + sξ)/δY(·), ξ ) ds,   (1.202)

for any eligible Y, ξ such that Y, Y + ξ ∈ M^+_{≤λ}(X).

Proof. Let us show (1.202) for F ∈ C^1_weak(M^+_{≤λ}(X)). All other statements follow directly from this one.

Assume that ξ ∈ M^+(X) and ξ_k → ξ weakly in M(X) as k → ∞, where all ξ_k are finite linear combinations of the Dirac measures with positive coefficients such that Y + ξ_k ∈ M^+_{≤λ}(X). We are going to pass to the limit k → ∞ in the equation (1.199) for ξ_k. Due to F ∈ C_weak(M^+_{≤λ}(X)), one has

F(Y + ξ_k) − F(Y) → F(Y + ξ) − F(Y), k → ∞.

Next,

∫_0^1 ( δF/δY(·)(Y + sξ_k), ξ_k ) ds − ∫_0^1 ( δF/δY(·)(Y + sξ), ξ ) ds
= ∫_0^1 ( δF/δY(·)(Y + sξ_k), ξ_k − ξ ) ds + ∫_0^1 ( δF/δY(·)(Y + sξ_k) − δF/δY(·)(Y + sξ), ξ ) ds.

Since δF/δY(x) is a continuous function of two variables,

δF/δY(x)(Y + sξ_k) − δF/δY(x)(Y + sξ) → 0,


as k → ∞ uniformly for x from any compact set. Therefore, the last integral in the above formula converges to zero, as k → ∞. To see that the first integral in the above formula converges to zero, one has to note that if ξ_k → ξ weakly, then (f_k, ξ_k) → (f, ξ), whenever all f_k are continuous and uniformly bounded and converge to f uniformly on compact sets. □

Corollary 1. Let X be a locally compact metric space. If Y, Y + ξ ∈ M^+_{≤λ}(X) and F ∈ C^2_weak(M_{≤λ}(X)) or F ∈ C^3_weak(M_{≤λ}(X)), then the second- and third-order Taylor expansions hold:

(a) F(Y + ξ) − F(Y) = ( δF(Y)/δY(·), ξ ) + ∫_0^1 (1 − s) ( δ²F(Y + sξ)/δY(·)δY(·), ξ ⊗ ξ ) ds,

(b) F(Y + ξ) − F(Y) = ( δF(Y)/δY(·), ξ ) + (1/2) ( δ²F(Y)/δY(·)δY(·), ξ ⊗ ξ )
+ (1/2) ∫_0^1 (1 − s)² ( δ³F(Y + sξ)/δY(·)δY(·)δY(·), ξ^{⊗3} ) ds.   (1.203)

Proof. Straightforward from the usual Taylor expansion and Theorem 1.13.1. □

It is natural to ask when a function from C^k_weak(M^+_{≤λ}(X)) has a Frechet derivative.

Theorem 1.13.2. Let X be a locally compact metric space.

(i) A function F from C^k_weak(M^+_{≤λ}(X)) belongs to C^k(M^+_{≤λ}(X)) whenever F ∈ C^{k−1}(M^+_{≤λ}(X)) and the mapping

Y ↦ δ^k F(Y)/δY(x_1) ⋯ δY(x_k)

is continuous as a mapping of the Banach spaces M^+_{≤λ}(X) → C(X^k).

(ii) An F from C^k(M^+_{≤λ}(X)) belongs to C^k_weak(M^+_{≤λ}(X)) whenever all derivatives D^l F(Y)[ξ_1, . . . , ξ_l], l ≤ k, are given by the following integrals with continuous functions:

D^l F(Y)[ξ_1, . . . , ξ_l] = ∫ ω_Y(x_1, . . . , x_l) ξ_1(dx_1) ⋯ ξ_l(dx_l),   (1.204)

in which case

ω_Y(x_1, . . . , x_l) = δ^l F(Y)/δY(x_1) ⋯ δY(x_l).   (1.205)

Proof. (i) The condition of the theorem means that

sup_{x_1,…,x_k} | δ^k F(Y_1)/δY(x_1) ⋯ δY(x_k) − δ^k F(Y_2)/δY(x_1) ⋯ δY(x_k) | → 0

as ‖Y_1 − Y_2‖ → 0, which is precisely the condition for a Gateaux derivative to be a Frechet derivative, as given in Proposition 1.2.2. (ii) Equation (1.205) follows from (1.204) and the definition of variational derivatives. □

With the help of Theorem 1.13.2, we can recast any general statement involving Gateaux or Frechet derivatives on measures in terms of variational derivatives.

Remark 25. Counterintuitively, weak differentiability does not imply the weak Lipschitz continuity. For φ ∈ C(R^n), the linear functional F(μ) = (φ, μ) = ∫ φ(z) μ(dz) on M(Z) is weakly continuously differentiable of all orders, but it is weakly Lipschitz only if φ is Lipschitz, with ‖F‖_{weakLip} = ‖φ‖_{Lip} (see (1.9)).

Let us present some basic examples for calculating variational derivatives.

Exercise 1.13.1. Check that

(i) if F(μ) = ∫ g(y_1, . . . , y_k) μ(dy_1) ⋯ μ(dy_k) with a symmetric function g, then

δF(μ)/δμ(x) = k ∫ g(x, y_2, . . . , y_k) μ(dy_2) ⋯ μ(dy_k);

(ii) if F(μ) = Ω[ ∫ g(y_1, . . . , y_k) μ(dy_1) ⋯ μ(dy_k) ] with a smooth function Ω, then

δF(μ)/δμ(x) = Ω′[ ∫ g(y_1, . . . , y_k) μ(dy_1) ⋯ μ(dy_k) ] × k ∫ g(x, y_2, . . . , y_k) μ(dy_2) ⋯ μ(dy_k);

(iii) if F(μ) = ∫ V(μ, y) μ(dy), then

δF(μ)/δμ(x) = V(μ, x) + ∫ [ δV(μ, y)/δμ(x) ] μ(dy).
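The first of these formulas is easy to check numerically on a discrete measure: for F(μ) = ∫∫ g(y_1, y_2) μ(dy_1)μ(dy_2) with symmetric g, the difference quotient (F(μ + sδ_x) − F(μ))/s from (1.196) must approach 2∫ g(x, y) μ(dy). A minimal sketch (illustrative only, with an ad hoc kernel g and a measure supported on finitely many atoms):

```python
import numpy as np

g = lambda y1, y2: np.exp(-(y1 - y2)**2)          # a symmetric kernel, chosen ad hoc
atoms   = np.array([0.0, 1.0, 2.5])               # support of a discrete measure mu
weights = np.array([0.5, 1.0, 0.2])               # its (positive) weights

def F(points, masses):
    # F(mu) = double integral of g(y1, y2) mu(dy1) mu(dy2)
    return np.sum(g(points[:, None], points[None, :]) * masses[:, None] * masses[None, :])

x, s = 0.7, 1e-6
F_plus = F(np.append(atoms, x), np.append(weights, s))    # F(mu + s*delta_x)
finite_difference = (F_plus - F(atoms, weights)) / s

exact = 2 * np.sum(g(x, atoms) * weights)                 # k * integral of g(x, y) mu(dy), k = 2
print(finite_difference, exact)                           # the two values agree
```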

For the mappings between different spaces of measures, the nomenclature becomes quite cumbersome, since the various topologies on the domain and the range must be distinguished.

Let us say that a mapping Φ : M_{≤λ}(X) → M(Z) has a strong (respectively weak) variational derivative δΦ(Y, x) if for any Y ∈ M_{≤λ}(X), x ∈ X the limit

δΦ(Y, x) = δΦ/δY(x) = lim_{s→0+} (1/s)(Φ(Y + sδ_x) − Φ(Y))

exists in the norm topology of M(Z) (respectively in the weak topology of M(Z)) and is a finite signed measure on Z. Higher derivatives are defined inductively. We say that Φ belongs to

C^l_weak(M_{≤λ}(X); M(Z))

(respectively C^l_{weak,weak}(M_{≤λ}(X); M(Z))), l = 1, 2, . . ., if for all k = 1, . . . , l, the strong (respectively weak) variational derivative δ^k Φ(Y; x_1, . . . , x_k) exists for all (x_1, . . . , x_k) ∈ X^k, Y ∈ M_{≤λ}(X) and represents a continuous mapping M_{≤λ}(X) × X^k → M(Z) in the sense of the weak topology.

The spaces C^l_weak(M_{≤λ}(X); M(Z)) and C^l_{weak,weak}(M_{≤λ}(X); M(Z)) are Banach spaces when equipped with the norm

‖Φ‖_k = Σ_{l=0}^k sup_{x_1,…,x_l} sup_{Y ∈ M_{≤λ}(X)} ‖ δ^l Φ(Y)/δY(x_1) ⋯ δY(x_l) ‖_{M(Z)}.

With these definitions, the following inclusions hold:

C^l_weak(M_{≤λ}(X); M(Z)) ⊂ C^l_{weak,weak}(M_{≤λ}(X); M(Z)),
C^l_weak(M_{≤λ}(X); M(Z)) ⊂ C^l_Gat(M_{≤λ}(X); M(Z)).

Let us say that Φ : M(X) → M(Z) has locally bounded continuous strong (respectively weak) variational derivatives of order up to k, if their restriction to any M_{≤λ}(X) belongs to C^k_weak(M_{≤λ}(X); M(Z)) (respectively C^k_{weak,weak}(M_{≤λ}(X); M(Z))). The space of such mappings is denoted by C^k_{lb,weak}(M(X); M(Z)) respectively C^k_{lb,weak,weak}(M(X); M(Z)).

Lemma 1.13.1. Let Φ ∈ C^1_{lb,weak,weak}(M(X); M(X)) and F ∈ C^1_{lbw}(M(X)). Then the composition F ∘ Φ(Y) = F(Φ(Y)) belongs to C^1_{lbw}(M(X)) and

δ(F ∘ Φ)(Y)/δY(x) = ∫ [ δF(Z)/δZ(y) ]|_{Z=Φ(Y)} [ δΦ/δY(x) ](Y, dy).   (1.206)

Proof. By (1.198) and the definitions, we find

δ(F ∘ Φ)(Y)/δY(x) = lim_{h→0+} (1/h)(F(Φ(Y + hδ_x)) − F(Φ(Y)))
= lim_{h→0+} ∫_0^1 ds ( δF(Φ(Y) + s(Φ(Y + hδ_x) − Φ(Y)))/δZ(·), (1/h)(Φ(Y + hδ_x) − Φ(Y)) ).

This implies (1.206), where again the fact is used that if ξ_k → ξ weakly, then (f_k, ξ_k) → (f, ξ), whenever all f_k are continuous and uniformly bounded and converge to f uniformly on compact sets. □

Similar facts and notations can be used for mappings Φ : M^+(X) → M(Z).

Note that the application of variational derivatives to the analysis of interacting particles requires special formulae for differentiating functions on the set of discrete measures, as developed in [147], see also Lemma 7.8.1.


1.14 Derivatives compatible with duality, AM- and AL-spaces

The possibility to use variational derivatives and the natural order are specific features of spaces of measures and functions, which we are mostly working with. However, in order to understand the general situation in which these methods are applicable, we shall briefly outline the relevant abstract notions.

First, let us discuss an abstract setting for variational derivatives. For that purpose, recall that by a dual pair we mean a pair of Banach spaces (B_obs, B_st) such that each of these spaces is a closed subspace of the Banach dual of the other space that separates the points of the latter. Also recall that for a closed convex subset M of B_st we denoted by C^1_Gat(M) (respectively C^1(M) = C^1_Frech(M)) the subspace of C(M) of Gateaux- (respectively Frechet-) differentiable functions, i.e., such functions that the directional derivatives D_ξ F(Y) = DF(Y)[ξ] exist for all Y ∈ M and ξ ∈ M − Y, that they are continuous functions of two variables Y, ξ and linear in ξ (respectively when the mapping Y → DF(Y) is continuous as a mapping M → B*_st). Let us say that a Gateaux or Frechet derivative respects the pair (B_obs, B_st) or is compatible with the duality (B_obs, B_st) if DF(Y) ∈ B_obs for all Y. Let us denote the corresponding subspaces of Gateaux- or Frechet-differentiable functions by C^1_Gat(M ⊂ B_st)(B_obs, B_st) or C^1(M ⊂ B_st)(B_obs, B_st), respectively. In a symmetrical manner, the spaces C^1_Gat(M ⊂ B_obs)(B_obs, B_st) and C^1(M ⊂ B_obs)(B_obs, B_st) of differentiable functions on M ⊂ B_obs respecting the pair (B_obs, B_st) are defined.

Remark 26. Similarly, the spaces C^k_Gat(M ⊂ B_st)(B_obs, B_st) or C^k(M ⊂ B_st)(B_obs, B_st) can be defined by the requirement that the multi-linear operators of higher derivatives belong to the appropriate tensor products of B_obs.

In our main example, the pair (B_obs, B_st) is the pair (C(X), M(X)) for a metric space X. Then the space C^1_Gat(M)(B_obs, B_st) for M ⊂ M(X) consists of Gateaux-differentiable functions F on M such that DF(Y)[ξ] = ∫ ω_Y(x) ξ(dx) with some ω_Y ∈ C(X) uniformly bounded in Y ∈ M, x ∈ X, and such that ω_Y(x) depends continuously on Y in the norm topology of B_st for any x. Choosing ξ = δ_x, we see that ω_Y(x) = δF(Y)/δY(x), i.e., it is the variational derivative as defined earlier. Thus Theorem 1.13.1, say for k = 1, states the inclusion C^1_weak(M_{≤λ}(X)) ⊂ C^1_Gat(M_{≤λ}(X))(B_obs, B_st), the difference between the two spaces reflecting the requirement of continuity of the variational derivative ω_Y(x) = δF(Y)/δY(x) with respect to Y being taken in weak or norm topology.

Consequently, the notion of variational derivatives provides a notion of Gateaux or Frechet derivatives that is compatible with a certain duality.

Apart from the pair (C(X), M(X)), another prime example of (B_obs, B_st) in quantum physics is the pair (L^{sa}(H), L^{sa}_1(H)) of the space L^{sa}(H) of all self-adjoint bounded operators in a Hilbert space H and L^{sa}_1(H) of all self-adjoint trace-class operators in H (also referred to as nuclear operators). It is known that L^{sa}(H) = (L^{sa}_1(H))* (see proofs, e.g., in [208]), the duality relation being given by the trace tr(AB), A ∈ L^{sa}(H), B ∈ L^{sa}_1(H). Therefore, the weak topology on L^{sa}(H) with respect to this pair is its *-weak topology. It plays a key role in the analysis of Banach algebras as well as in physics. It is known under different names: as operator ultra-weak topology, σ-weak topology and normal topology. Thus the derivative DF(Y)[ξ] of a function F on L^{sa}(H) is compatible with this Banach pair if it can be represented as DF(Y)[ξ] = tr(ωξ) with some ω ∈ L^{sa}_1(H), or, in other words, if DF(Y)[ξ] is a *-weakly continuous functional of ξ ∈ L^{sa}(H).

Next, let us briefly outline an abstract setting for dealing with the order. We shall essentially follow [240] and [38], where more details can be found. The abstract notion of order arises by generalizing the properties of the usual order ≤ on the real numbers. This order enjoys the following properties:

L1: x ≤ x for all x;

L2: x ≤ y and y ≤ z implies x ≤ z (transitivity);

L3: x ≤ y and y ≤ x implies x = y (antisymmetry);

L4: for any pair x, y there exists an element x ∨ y (called the maximum of x and y) such that x ≤ x ∨ y, y ≤ x ∨ y, and x ∨ y is the smallest element with these properties, i.e., if x ≤ z and y ≤ z for some element z, then necessarily x ∨ y ≤ z;

L5: for any pair x, y there is an element x ∧ y (called the minimum of x and y) such that x ∧ y ≤ x, x ∧ y ≤ y, and x ∧ y is the greatest element with these properties, i.e., if z ≤ x and z ≤ y for some element z, then necessarily z ≤ x ∧ y.

A binary relation ≤ on an arbitrary set L is called a partial order if it satisfies the properties L1 to L3. The set L with a partial order is called a lattice, if the relation ≤ additionally satisfies the properties L4 and L5. In general lattices, the elements x ∨ y, x ∧ y given by L4 and L5 are usually called the least upper bound respectively the greatest lower bound of x and y. Notice that they are uniquely defined due to axiom L3. The major difference between general lattices and the order on usual numbers is that for a general lattice not all pairs must be comparable, i.e., there can be pairs x, y such that neither x ≥ y nor x ≤ y holds.

An element M of a lattice L is called its maximum element, if x ≤ M for all x from L. Similarly, an element m is called its minimum element, if m ≤ x for all x from L. The set of real numbers has neither a maximum nor a minimum element. But for the set of all non-negative numbers, zero is of course the minimum element.

A lattice L is called distributive if

x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z) (1.207)

for all x, y, z from L. This is known to be equivalent to the dual distributive law

x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z). (1.208)


Non-distributive lattices arise naturally in quantum logics, where the simplest example is the quantum bit or qubit, which represents the set of all points on the two-dimensional sphere, together with an empty set ∅ and the whole sphere Q itself and the order structure that is induced by set theory, i.e., defined by inclusion. In this case, ∅ and Q are the minimum and maximum elements, respectively, and any two different points of Q are not comparable, i.e., neither x ≤ y nor y ≤ x holds. In particular, x ∧ y = ∅ and x ∨ y = Q for any such points, so that the l.h.s. of (1.207) is x and its r.h.s. is ∅.

A vector lattice or a linear lattice is a lattice V equipped with the structure of a linear space, such that

x ≤ y ⇒ x + z ≤ y + z and λx ≤ λy   (1.209)

for all x, y, z ∈ V and λ ≥ 0. This structure allows for the definition of the positive cone V_+ = {x ∈ V : x ≥ 0} of V, and for any x ∈ V its positive part, its negative part and the modulus as x_+ = x ∨ 0, x_− = (−x) ∨ 0 and |x| = x ∨ (−x), respectively. The elements x, y are called order orthogonal if |x| ∧ |y| = 0. From just these axioms, one can derive many properties that are analogous to the properties of the usual modulus.

A vector lattice V is called a Banach lattice if V is a Banach space with respect to a norm ‖·‖ such that |x| ≤ |y| implies ‖x‖ ≤ ‖y‖. Basic examples of Banach lattices are the spaces C(X) and M(X) of bounded continuous functions equipped with the usual sup-norm, and of bounded Borel measures on a metric space X equipped with the total variation norm, respectively, and their subspaces. For f, g ∈ C(X), f ∧ g and f ∨ g are the pointwise minimum respectively the pointwise maximum of f and g. On the other hand, for a measure μ ∈ M(X), the modulus |μ| is the total variation measure |μ| = μ_+ + μ_−, obtained from the Hahn decomposition μ = μ_+ − μ_− of μ into its positive and negative parts. Clearly, the sup-norm on V = C(X) satisfies

‖g ∨ f‖ = ‖g‖ ∨ ‖f‖ = max(‖g‖, ‖f‖) for g, f ∈ V+, (1.210)

and the total variation norm on V = M(X) satisfies

‖μ+ ν‖ = ‖μ‖+ ‖ν‖ for μ, ν ∈ V+. (1.211)

This motivates the following definitions. A Banach lattice V is called an abstract M-space or AM-space if its norm satisfies the condition (1.210). If the unit ball of such a space contains a largest element e (which is evidently unique), then e is referred to as the unit of V. A Banach lattice is called an abstract L-space or AL-space if its norm satisfies (1.211). It is remarkable that these properties are sufficient to characterize the spaces of continuous functions and measures according to the following results (see proofs in [240]).

Theorem 1.14.1. Every AM-space with a unit is isomorphic to C(K) with some compact K. Every AL-space is isomorphic to L_1(X, μ) with a locally compact space X and a positive measure μ on it.


Theorem 1.14.2. The Banach dual to any AM-space is an AL-space, and the Banach dual to any AL-space is an AM-space.

In particular, the spaces of bounded continuous functions C(X) on a complete metric space, or its weighted modification C_f(X) consisting of continuous functions ψ such that ψ/f ∈ C(X), are clearly AM-spaces, and hence they are isomorphic to spaces C(K) with some compact sets K (though the identification of such K does not seem to be obvious at all). Similarly, the space M(X) of bounded Borel measures on X or its weighted modifications are AL-spaces and can therefore be realized as spaces of integrable functions L_1(Y, μ) with a locally compact Y. In a way, Theorem 1.14.1 says that we can apply the constructions related to the spaces C(X) and M(X), like variational derivatives or conditional positivity criteria, to a wide class of AM-spaces and their dual AL-spaces.

Alternatively, variational derivatives can be extended to Banach lattices as follows. Since point measures are known to be extreme points of the cone of positive measures, the analogues of variational derivatives in Banach lattices are directional derivatives in the directions of the extreme points of the positive cone V_+. For example, for the Banach pair (L^{sa}(H), L^{sa}_1(H)) discussed above, the analogues of variational derivatives of functions on L^{sa}_1(H) that arise from the order (A ≥ B means that A − B is a positive self-adjoint operator) are the directional derivatives in the directions of one-dimensional orthogonal projectors, which are known to represent the extreme points of the cone of positive elements of L^{sa}_1(H).

1.15 Hints and answers to chosen exercises

Exercise 1.1.1. ‖A*‖ ≤ ‖A‖ is straightforward. For proving the equality, one uses the fact (see Proposition 2.17.2) that for any b ∈ B there exists z ∈ B* such that ‖z‖ = 1 and (z, b) = 1.

Exercise 1.1.2. It is claimed that for any g ∈ C(R^d)

∫_{R^{2d}} g(x) n^d φ(n(x − y)) μ(dy) dx → ∫_{R^d} g(x) μ(dx) = ∫_{R^{2d}} g(y) μ(dy) φ(n(x − y)) n^d dx,

as n → ∞. To show this claim, it is sufficient to show that

g_n = ∫ (g(x) − g(y)) n^d φ(n(x − y)) dx = ∫ (g(y + z/n) − g(y)) φ(z) dz → 0

uniformly for y from any compact set, and this follows from the continuity of g. In fact, the functions g_n are bounded, so for any ε > 0 there exists a compact K such that ∫_{R^d \ K} g_n(y) μ(dy) < ε.


Exercise 1.1.3. First reduce the problem to the case when the support of μ ∈ M(X) is a compact K ⊂ X. In this case, for any f and ε > 0 there exists a partition of K, i.e., its representation K = ∪ X_j as a union of a finite number of pairwise disjoint subsets, such that the oscillation of f on each X_j does not exceed ε. Choosing arbitrary x_j ∈ X_j and defining

μ_ε = Σ_j μ(X_j) δ_{x_j},

one gets |(f, μ) − (f, μ_ε)| ≤ ε‖μ‖.

Exercise 1.1.5. The equations (1.11) are straightforward. The first equation in (1.12) follows from duality (see Exercise 1.1.1). For example, it is sufficient to use just R².

Exercise 1.1.7.

(y^{(m)}, x) = Σ_{j=1}^n y_j^{(m)} x_j + Σ_{j=n+1}^∞ y_j^{(m)} x_j,

so that, if y^{(m)} is bounded, the second term can be made arbitrarily small for any x by choosing n large enough. This reduces the problem to a finite-dimensional setting.

Exercise 1.1.8. Weak convergence implies tightness, that is, for any ε there exists N such that |y_n^{(m)}| < ε for all n > N and all m.

Exercise 1.4.1. Firstly,

[f, g]_+ = lim_{h→0} (1/h)(max_x |f(x) + hg(x)| − ‖f‖) ≥ lim_{h→0} (1/h) max_{x∈M_f} (|f(x)| + hg(x) sgn f(x) − ‖f‖) = max_{x∈M_f} (g(x) sgn f(x)).

In order to prove the equality, choose a converging subsequence from the sequence x_h that realises a maximum for finite h.

Exercise 1.4.2. It is a consequence of Proposition 1.4.3.

Exercise 1.5.1. (i) (A + hξ)^{−1} = (1 + hA^{−1}ξ)^{−1} A^{−1} = (1 − hA^{−1}ξ + o(h)) A^{−1}.

Exercise 1.6.1. Claims (i) to (iii) are straightforward. To prove (iv), note that if x = tm_1, y = sm_2 with m_1, m_2 ∈ M, then

(x + y)/(t + s) = (tm_1 + sm_2)/(t + s) ∈ M.

Exercise 1.6.4.

p_n(x) < ε ⟹ d(x, 0) < Σ_{k≠n} 2^{−k} + ε,     d(x, 0) < ε ⟹ p_n(x) < ε/(1 − ε).


Exercise 1.6.5.

‖f‖²_{p,q,2} ≤ ‖f‖²_{p+d,q} ∫ (1 + ‖x‖^{2d})^{−1} dx,

‖f‖_{0,0} ≤ ∫ |∂f/∂x_j| dx ≤ ( ∫ |∂f/∂x_j|² (1 + ‖x‖)^{2d} dx )^{1/2} ( ∫ (1 + ‖x‖)^{−2d} dx )^{1/2}.

Exercise 1.6.7. For a balanced convex neighbourhood V_0 of zero in V, there exists a neighbourhood W_1 of zero in W such that V_0 = W_1 ∩ V. Since W is locally convex, there exists a convex balanced neighbourhood W_2 of zero in W such that W_2 ⊂ W_1. Take

W_0 = {αv + βw : v ∈ V_0, w ∈ W_2, |α| + |β| = 1}.

Exercise 1.6.8. Since for any x, y ∈ V there exists n such that x, y ∈ V_n, the separation property follows. The local convexity follows essentially from Exercise 1.6.7.

Exercise 1.7.1. For (1.85), use the Taylor expansions in the form

f(x + y) − f(x) − f′(x)y = ∫_0^y f′′(x + z)(y − z) dz,   y > 0,

f(x + y) − f(x) − f′(x)y = ∫_y^0 f′′(x + z)(z − y) dz,   y < 0.

Exercise 1.7.2. One can reduce the calculations to the one-dimensional case by bringing A into diagonal form, i.e., expressing it as A = ODO^{−1} with a diagonal matrix D and an orthogonal matrix O.

Exercise 1.8.2. This is a corollary of (1.138) and (1.137).

Exercise 1.11.1(i).

(Δ ln|x|, ψ) = (ln|x|, Δψ) = lim_{ε→0} ∫_{|x|≥ε} ln|x| Δψ(x) dx.

In polar coordinates (r, φ), we find

Δ = ∂²/∂r² + (1/r) ∂/∂r + (1/r²) ∂²/∂φ² = (1/r)(∂/∂r)(r ∂/∂r) + (1/r²) ∂²/∂φ²,

implying

(Δ ln|x|, ψ) = lim_{ε→0} ∫_ε^∞ dr ∫_0^{2π} r ln r [ (1/r)(∂/∂r)(r ∂/∂r) + (1/r²) ∂²/∂φ² ] ψ(r, φ) dφ.


The term with ∂²/∂φ² disappears due to periodicity, so that

(Δ ln|x|, ψ) = lim_{ε→0} ∫_ε^∞ dr ∫_0^{2π} ln r [ (∂/∂r)(r ∂/∂r) ] ψ(r, φ) dφ
= −lim_{ε→0} ∫_ε^∞ dr ∫_0^{2π} (∂/∂r)(ln r) r (∂/∂r) ψ(r, φ) dφ − lim_{ε→0} ε ln ε ∫_0^{2π} (∂/∂r) ψ(ε, φ) dφ.

The second term vanishes, and therefore

(Δ ln|x|, ψ) = −lim_{ε→0} ∫_ε^∞ dr ∫_0^{2π} (∂/∂r) ψ(r, φ) dφ = lim_{ε→0} ∫_0^{2π} ψ(ε, φ) dφ = 2πψ(0).

1.16 Summary and comments

We developed the calculus on locally convex spaces, paying most attention to the case of Banach spaces and going into special details for the space of measures, where the variational derivatives play a crucial role. The analysis on general spaces was presented in a more sketchy way. By carefully selecting the material, we refined the general ideas to such a degree that they are appropriate for applications to a wide class of concrete partial differential and pseudo-differential equations, some of which will be dealt with further on. With the space of generalized functions, the key example of locally convex spaces was properly introduced together with its main properties, including fundamental solutions and the Fourier transform. Propositions 1.4.4 and 1.4.5, which are major tools for proving uniqueness and stability of general kinetic equations, seem to have been found first in [141]. A probabilistic proof based on martingale theory is given in [215]. Here, we give the most elementary proof based on the theory of semi-inner products.

Many sources exist where the classical topics that are touched upon here are developed in various directions. The general references to topological linear spaces include [115, 212] and [235]. The discussion of various notions of differentiability in locally convex spaces can be found, e.g., in [20, 21, 264].

Fractional calculus is presented in many excellent books. Standard references include [123, 238] and [239]. Note that the last book specifically deals with hypersingular integrals, which we only touched upon here. Of great importance is the probabilistic interpretation of fractional derivatives (see [152] and references therein), which plays the key role in establishing their links with various models in physics and economics. In fact, one of the strongest impetuses for the modern development of fractional calculus came from its appearance in the theory of continuous-time random walks (CTRW), see, e.g., [144, 206, 207, 253, 255] and references therein.

A well-written exposition of generalized function is the book [90]. The theoryof generalized functions as a powerful tool for solving various classes of partial dif-ferential equations was first developed by S.L. Sobolev, following the ideas of N.M.Gunter, who initially introduced them under the name of ‘functions of domains’.

Page 100: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

84 Chapter 1. Analysis on Measures and Functional Spaces

The final, more abstract theory was shaped by L. Schwartz and I.M. Gelfand (moreon this history, e.g., in [14]).

The modern development of the theory of pseudo-differential operators wasessentially initiated by Hormander [109] and Maslov [200]. Many excellent expo-sitions of pseudo-differential operators (ΨDOs) include [202, 245, 254]. Here, weonly use the notations of the classical theory of ΨDOs, since this theory mostlydeals with smooth symbols and the problems discussed here are often lacking suchregularity. The theory of pseudo-differential operators with non-smooth symbolsas they arise in the analysis of Markov processes is developed in [118].

Of course, many important tools of modern analysis on measures did not findtheir place in this exposition. The most notable omission are metrics that metri-cise the weak convergence (Wasserstein–Kantorovich metrics) and the notions ofderivatives that are motivated by the links with probability theory, like the deriva-tives of Malliavin or Ito’s pathwise calculus in the sense of Follmer and Cont, see,e.g., [10] and [26] and [84]. The fundamental monograph on mean field games,[48], contains a lot of information on modern analysis of the mappings betweenmeasure spaces. Also, we did not touch the very interesting topic of the analysison metric spaces without a linear structure, which can be found in [7].

Page 101: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

Chapter 2

Basic ODEs in CompleteLocally Convex Spaces

In this chapter, we present more or less standard material (though not easilyfound in a concise systematic form) on vector-valued ODEs with a Lipschitz r.h.s.We shall fix the notations and set the scene for further developments. The Lips-chitz r.h.s. ensures that the solution is globally well defined in both forward andbackward time. We shall prove the basic well-posedness and sensitivity for ODEs,explain their links with partial differential equations via the method of charac-teristics, present some extensions to equations with memory (including causalequations and fractional derivatives in time), and finally review the theory of ac-cretivity. We shall mostly work with Banach spaces, but will indicate the necessarymodifications for the analogous theory in general locally convex spaces at the endof the chapter. Although used here for ODEs with Lipschitz r.h.s., the abstractwell-posedness results are formulated in such a way that they are applicable towide classes of more general equations, as will be seen later. Therefore, this chap-ter lays the foundation for all future developments, the main general tools beingpresented in Sections 2.1, 2.3, 2.8, 2.10 and 2.15.

2.1 Fixed-point principles for curves in Banach spaces

Instead of carrying out case-by-case modifications of the fixed-point argumentsfor proving the well-posedness of various ODEs, it is handy to have some abstractversion that we derive here as a consequence of the general fixed-point principlein metric spaces, as recalled in the Appendix (Section 9.1).

Let B be a Banach space andM a closed convex subset therein. For any τ < t,let C([τ, t],M) be a convex subset of the Banach space C([τ, t], B) of functions on[τ, t] with values in M ⊂ B. Thus C([τ, t],M) is a complete metric space, equipped

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_2

85

Page 102: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

86 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

with the distance that is induced from C([τ, t], B). There is a natural restrictionmapping that projects C([τ, t],M) → C([τ, s],M) for t ≥ s ≥ τ and a naturalinclusion M → C([τ, t],M) that maps Y ∈ M onto a constant function μs = Yon [τ, t]. For any Y ∈ M , let CY ([τ, t],M) denote the closed convex subset ofC([τ, t],M) consisting of functions μt with the initial condition Y : μτ = Y . LetB1 be an auxiliary Banach space of some parameters.

Theorem 2.1.1. Suppose that for any Y ∈ M , α ∈ B1, a mapping

ΦY,α : C([τ, T ],M) → CY ([τ, T ],M)

is given with some T > τ such that for any t the restriction of ΦY,α(μ.) on [τ, t]depends only on the restriction of the function μs on [τ, t]. Moreover

‖[ΦY,α(μ1. )](t) − [ΦY,α(μ

2. )](t)‖ ≤ L(Y )

∫ t

τ

‖μ1. − μ2

. ‖C([τ,s],B) ds,

‖[ΦY1,α1(μ.)](t) − [ΦY2,α2(μ.)](t)‖ ≤ κ‖Y1 − Y2‖+ κ1‖α1 − α2‖,(2.1)

for any μ1, μ2 ∈ C([τ, T ],M), α1, α2 ∈ B1, some constants κ,κ1 and a continuousfunction L on M .

Then for any Y ∈ M , α ∈ B1 the mapping ΦY,α has a unique fixed pointμt,τ (Y, α) in CY ([τ, T ],M). Moreover, for all t ∈ [τ, T ],

‖μt,τ (Y, α)− Y ‖ ≤ e(t−τ)L(Y )‖[ΦY,α(Y )](t) − Y ‖, (2.2)

and the fixed points μt,τ (Y1, α1) and μt,τ (Y2, α2) with different initial data Y1, Y2

and parameters α1, α2 satisfy the estimate

‖μt,τ (Y1, α1)− μt,τ (Y2, α2)‖≤ (κ‖Y1 − Y2‖+ κ1‖α1 − α2‖) exp{(t− τ)min(L(Y1), L(Y2))}.

(2.3)

Proof. It follows from (2.1) by direct induction that

‖[ΦnY,α(μ

1. )]− [Φn

Y,α(μ2. )]‖C([τ,t],B) ≤ (t− τ)nLn(Y )

n!‖μ1

. − μ2. ‖C([τ,t],B) (2.4)

holds for any t ∈ [τ, T ]. Hence, by Proposition 9.1.1 the mapping Φ has a uniquefixed point μt,τ (Y, α) in CY ([τ, T ], B), the approximations [Φn

Y,α(Y )](t) convergeto this point in C([τ, T ], B), and (2.2) holds. Finally, equation (2.3) follows from(2.1) and Proposition 9.1.3 (equation (9.2)). �

We shall see several applications of this result, often with slight extensionsthat are now pointed out in detail. For this, let M(t), t ∈ [τ, T ], be an increas-ing family of bounded convex closed subsets in B such that M = M(τ), andlet C([τ, T ],M(.)) denote the closed subset of curves μ. of C([τ, T ],M(T )) suchthat μt ∈ M(t) for all t ∈ [τ, T ]. The following extension of Theorem 2.1.1 isstraightforward.

Page 103: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.1. Fixed-point principles for curves in Banach spaces 87

Theorem 2.1.2. Let all assumptions of Theorem 2.1.1 hold, with the only differ-ence that for any Y ∈ M = M(τ), ΦY,α maps C([τ, T ],M(.)) to itself. Then thestatements of Theorem 2.1.1 hold for all Y ∈ M with the fixed point existing inC([τ, T ],M(.)).

For more advanced developments, (e.g., for nonlinear parabolic equationsand equations with fractional derivatives), the following extension turns out to beuseful. Note that the Mittag-Leffler function used therein is defined in (9.13).

Theorem 2.1.3.

(i) Suppose that the conditions of Theorem 2.1.1 hold, but with (2.1) replaced by

‖[ΦY,α(μ1. )](t)− [ΦY,α(μ

2. )](t)‖ ≤ L(Y )

∫ t

τ

(t− s)−ω‖μ1. − μ2

. ‖C([τ,s],B) ds,

‖[ΦY1,α1(μ.)](t)− [ΦY2,α2(μ.)](t)‖ ≤ κ‖Y1 − Y2‖+ κ1‖α1 − α2‖, (2.5)

with some ω ∈ (0, 1). Then for any Y ∈ M the mapping ΦY,α has a uniquefixed point μt,τ (Y, α) in CY ([τ, T ],M).

Moreover, for all t ∈ [τ, T ],

‖μt,τ (Y, α)−Y ‖ ≤ E1−ω(L(Y )Γ(1−ω)(t− τ)1−ω)‖[ΦY,α(Y )](t)−Y ‖, (2.6)

and the fixed points μt,τ (Y1, α1) and μt,τ (Y2, α2) with different initial dataY1, Y2 and parameters α1, α2 satisfy the estimate (for any j = 1, 2)

‖μt,τ (Y1, α1)− μt,τ (Y2, α2)‖≤ (κ‖Y1 − Y2‖+ κ1‖α1 − α2‖)E1−ω(L(Yj)Γ(1− ω)(t− τ)1−ω).

(2.7)

(ii) Similar to Theorem 2.1.2, let all assumptions of part (i) hold, with the onlydifference that for any Y ∈ M = M(τ), ΦY,α maps C([τ, T ],M(.)) to it-self (for an increasing family of bounded convex closed subsets M(.)). Thenthe statements of (i) hold for all Y ∈ M with the fixed point existing inC([τ, T ],M(.)).

Proof. Let us prove only (i), since the modification (ii) is straightforward. The firstinequality of (2.5) can be rewritten in terms of the fractional Riemann integral(see (1.100) for its definition) as

‖[ΦY,α(μ1. )](t)− [ΦY,α(μ

2. )](t)‖ ≤ L(Y )Γ(1 − ω)I1−ω

τ (‖μ1. − μ2

. ‖C([τ,.],B))(t).

By iteration and the composition rule Ikτ Imτ = Ik+m

τ for fractional integrals (see(1.102)), it follows that

‖[ΦnY,α(μ

1. )]− [Φn

Y,α(μ2. )]‖C([τ,t],B)

≤ (L(Y )Γ(1 − ω))nIn(1−ω)τ (‖μ1

. − μ2. ‖C([τ,.],B))(t)

Page 104: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

88 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

≤ (L(Y )Γ(1− ω))n

Γ(n(1− ω))‖μ1

. − μ2. ‖C([τ,t],B)

∫ t

τ

(t− s)n(1−ω)−1ds

=(L(Y )Γ(1− ω)(t− τ)1−ω)n

Γ(n(1− ω) + 1)‖μ1

. − μ2. ‖C([τ,t],B), (2.8)

where in the last equation the formula Γ(x+ 1) = xΓ(x) was used. Of course, theestimate (2.8) can be alternatively obtained from (2.5) by direct induction. Thefractional integrals are only used as an elegant tool in order to avoid such routineinduction arguments.

Consequently, by the definition of the Mittag-Leffler function (see (9.13)) itfollows that Propositions 9.1.1 and 9.1.3 are applicable with

A = A(t−τ) =

∞∑n=0

(L(Y )Γ(1 − ω)(t− τ)1−ω)n

Γ(n(1− ω) + 1)= E1−ω(L(Y )Γ(1−ω)(t−τ)1−ω),

which yields the unique fixed point for ΦY and the estimates (2.6) and (2.7). �

2.2 ODEs in Banach spaces: well-posedness

As usual, let B be a Banach space. Differential equations supplemented by someinitial conditions (i.e., conditions at an initial time of the evolution) are usuallyreferred to as Cauchy problems.

Theorem 2.2.1. Let F be a Lipschitz-continuous (not necessarily bounded) mappingB → B, with a Lipschitz constant ‖F‖Lip = L as defined by (1.53). Then for anyY ∈ B there exists a unique global solution μt = μt(Y ) (defined for all t ≥ 0) tothe Cauchy problem

μt = F (μt), (2.9)

with the initial condition μ0 = Y . Moreover

‖μt(Y )− Y ‖ ≤ tetL‖F (Y )‖ ≤ tetL(L‖Y ‖+ ‖F (0)‖), (2.10)

and the solutions μt(Y1) and μt(Y2) with different initial data Y1, Y2 satisfy theestimate

‖μt(Y1)− μt(Y2)‖ ≤ etL‖Y1 − Y2‖. (2.11)

Proof. For t > 0 the Cauchy problem (2.9) is equivalent to the integral equation

μt = Y +

∫ t

0

F (μs) ds. (2.12)

Let us fix an arbitrary T > 0, and let us define the mapping

ΦY : C([0, T ], B) → CY ([0, T ], B)

Page 105: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.2. ODEs in Banach spaces: well-posedness 89

by the equation

[ΦY (μ.)](t) = Y +

∫ t

0

F (μs) ds, t ∈ [0, T ], (2.13)

where CY ([0, T ], B) is the subset of curves μt in C([0, T ], B) such that μ0 = Y .For any two curves μ1

t , μ2t from C([0, T ], B), it follows that

‖[ΦY1(μ1. )](t) − [ΦY2(μ

2. )](t)‖ ≤ L

∫ t

0

‖μ1. − μ2

. ‖C([0,s],B) ds+ ‖Y1 − Y2‖. (2.14)

Consequently, Theorem 2.1.1 applies with the constant L, the empty space B1,κ = 1 and [ΦY (Y )](t)− Y = t‖F (Y )‖. �

In the above Lipschitz setting, negative times can be treated in the same wayas positive times. This yields the following result.

Proposition 2.2.1. Under the assumptions of Theorem 2.2.1, the solutions to equa-tion (2.9) are also well defined for all t < 0, so that (2.10) and (2.11) extend to

‖μt − Y ‖ ≤ te|t|L‖F (Y )‖, ‖μt(Y1)− μt(Y2)‖ ≤ e|t|L‖Y1 − Y2‖. (2.15)

Finally, for any t, the mapping Y �→ μ−t(Y ) is the inverse of the mapping Y �→μt(Y ).

Proof. Global Lipschitz continuity ensures that the same arguments as in Theorem2.2.1 work for negative times. (Inverting the time reduces the case with negativet to the case with positive t.) Thus only the last statement needs a proof. For anys < 0 the functions μt(μs(Y )) and μs+t(Y ) satisfy the same equation with thesame initial condition μs(Y ) at t = 0. Hence μt(μs(Y )) = μs+t(Y ). In particular,μt(μ−t(Y )) = Y . �

Several extensions of the above results are straightforward. Namely, one oftendeals with equations whose r.h.s. depends explicitly on the time t and/or on anadditional parameter. In other words, one deals with equations of the form

μt = F (t, μt, α), (2.16)

with α from some other Banach space B1, and one is interested in the continuousdependence of the solutions with respect to α. However, lifting equation (2.16) upto the respective equation in B × B1 by adding the equation α = 0 reduces theproblem of sensitivity with respect to a parameter (or the continuous dependenceon it) to a problem of sensitivity with respect to the initial data. The dependenceon time does not require any changes in the above results, as long as it is contin-uous. Therefore, the proof of the following result is a direct extension of the proofof Theorem 2.2.1.

Page 106: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

90 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Theorem 2.2.2. Let F : R×B×B1 → B be a continuous function such that F (t, ., .)is Lipschitz-continuous as a mapping B ×B1 → B with a Lipschitz constant thatis independent of t, so that

‖F (t, Y1, α1)− F (t, Y2, α2)‖ ≤ L‖Y1 − Y2‖+ L1‖α1 − α2‖. (2.17)

Then

(i) for any Y ∈ B, s ∈ R, α ∈ B1 there exists a unique global solution μt,s(Y, α)(defined for all t ∈ R) to the Cauchy problem for equation (2.16) with theinitial condition μs,s = Y . Moreover,

‖μt,s(Y, α)− Y ‖ ≤ |t− s|e|t−s|L(L‖Y ‖+ supτ∈[s,t]

‖F (τ, 0, α)‖); (2.18)

(ii) the solutions μt,s(Y1, α1) and μt,s(Y2, α2) with different initial data Y1, Y2

and parameter values α1, α2 satisfy the estimate

‖μt,s(Y1, α1)−μt,s(Y2, α2)‖ ≤ e|t−s|L(‖Y1−Y2‖+L1(t−s)‖α1−α2‖); (2.19)

(iii) for any s, t, α the mapping Y �→ μs,t(Y, α) is the inverse of the mappingY �→ μt,s(Y, α).

In many applications, F is not everywhere continuous in t. In this case,instead of the Cauchy problem for (2.16) one can work with its integral version:

μt = μs +

∫ t

s

F (τ, μτ , α) dτ. (2.20)

Proposition 2.2.2. Let F (τ, μτ , α) satisfy all the assumptions of Theorem 2.2.2apart from being continuous in t.

(i) Let F be measurable and locally bounded as a function of t. Then the claimsof Theorem 2.2.2 remain valid when applied to the problem (2.20).

(ii) Let F be continuous with respect to t apart from some fixed set N of measurezero (independent of other arguments of F ), and let F also be locally boundedin t. Then a locally absolutely continuous function μt solves (2.20) if andonly if it assumes the initial value μs and satisfies (2.16) almost surely.

Proof. Statement (i) is straightforward, since it is actually the problem (2.20),which has been dealt with in Theorem 2.2.2. Statement (ii) follows from Theorem1.3.2. �Remark 27. If B = R, the integral equation (2.20) is equivalent to (2.16) beingsatisfied almost surely, whenever F is locally bounded and measurable in t.

As another key extension, let us mention the equations of higher order,namely equations of the type

μ(k)t = F (t, μt, μ

′t, . . . , μ

(k−1)t , α), (2.21)

Page 107: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.2. ODEs in Banach spaces: well-posedness 91

where k ∈ N. The Cauchy problem for this equation for times t ≥ s is posed byspecifying the k initial vectors:

Y0 = μs, Y1 = μ′s, . . . , Yk−1 = μ(k−1)

s . (2.22)

The standard trick is to rewrite this equation as a first-order equation in Bk

(with the norm being defined as the sum of the norms of the components in B)by setting

νt = (ν0t , ν1t , . . . , ν

k−1t ) = (μt, μ

′t, . . . , μ

(k−1)t ) ∈ Bk.

In terms of ν, the problem (2.21), (2.22) rewrites as

d

dtνt =

d

dt(ν0t , ν

1t , . . . , ν

k−1t ) = (ν1t , . . . , ν

k−1t , F (t, ν0t , ν

1t , . . . , ν

k−1t , α)) (2.23)

with the initial conditionνs = (Y0, . . . , Yk−1). (2.24)

Theorem 2.2.3. Let F : R × Bk × B1 → B be a continuous function such thatF (t, ., .) is Lipschitz-continuous as a mapping Bk × B1 → B with a Lipschitzconstant that is independent of t, so that

‖F (t, Y0, . . . , Yk−1, α)− F (t, Z0, . . . , Zk−1, β)‖ ≤ L

k−1∑j=0

‖Yj − Zj‖+ L1‖α− β‖.

(2.25)Then

(i) for any Y = (Y0, . . . , Yk−1) ∈ Bk, s ∈ R, α ∈ B1, there exists a uniqueglobal solution νt,s(Y, α) (defined for all t ∈ R) to the Cauchy problem (2.23),(2.24), and thus a unique global solution μt,s(Y, α) = ν0t,s(Y, α) to the Cauchyproblem (2.21), (2.22), and

‖νt,s(Y, α)− Y ‖Bk ≤ |t− s|e|t−s|(1+L)((1 + L)‖Y ‖Bk + supτ∈[s,t]

‖F (τ, 0, α)‖);(2.26)

(ii) the solutions μt,s(Y, α) and μt,s(Z, β) with different initial data Y, Z andparameter values α, β satisfy the estimate

‖νt,s(Y, α)−νt,s(Z, β)‖ ≤ e|t−s|(1+L)(‖Y −Z‖Bk +L1(t−s)‖α−β‖); (2.27)

Proof. The results follow from applying Theorem 2.2.2 in the Banach space Bk,taking into account that

‖(ν1t , . . . , νk−1t , F (t, ν0t , ν

1t , . . . , ν

k−1t , α))

− (ν1t , . . . , νk−1t , F (t, ν0t , ν

1t , . . . , ν

k−1t , α))‖Bk

≤ (1 + L)

k−1∑j=0

‖Yj − Zj‖+ L1‖α− β‖. �

Page 108: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

92 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

In this book, we shall mostly be looking for the global solutions of ODEs(defined for all t > 0). However, local solutions are also often found in practice.Such solutions arise when the assumptions of Theorem 2.2.1 turn out to holdfor some finite period of time only, say when the r.h.s is only locally Lipschitz-continuous. Physically, local solutions usually describe the effect of explosion: thesolutions diverge to infinity in finite times. We illustrate this possibility in thefollowing exercise.

Exercise 2.2.1. Solve the Cauchy problem for the ODE x = x3. For each initialpoint x0, find the ‘explosion time’ t0 such that limt→t0 x(t) = ∞.

The next exercise provides the simplest example for the case of Lipschitzcontinuity not being met by the r.h.s., and the corresponding Cauchy problemhaving infinitely many solutions.

Exercise 2.2.2. Show that the Cauchy problem for the ODE x =√x with the

initial condition x0 = 0 has infinitely many solutions.

As already emphasized, the abstract view on ODEs as promoted in this textmakes it possible to look at PDEs as ODEs in Banach spaces. Besides this abstractapproach, there are other ways for reducing PDEs to ODEs, some of them moreconcrete and practically very useful. For instance, a systematic method for dealingwith first-order PDEs will be discussed later in this chapter. Another such method(with some empirical flavor) consists of searching for solutions of a particular formfor a given PDE. The following exercise demonstrates this approach on the exampleof one of the most famous nonlinear PDEs, the KdV-equation

∂u(t, x)

∂t=

3

2u∂u(t, x)

∂x+

1

4

∂3u(t, x)

∂x3, (2.28)

where t, x ∈ R.

Exercise 2.2.3. Show that if a solution to the KdV-equation has the form of a‘traveling wave’ u(t, x) = w(x + ct) with a function w and a constant c, then wsatisfies the ODE

(w′)2 = −2w3 + 4cw2 − 8c1w − 8c2 (2.29)

with some constants c1, c2.

By an appropriate linear transformation of the unknown function, i.e., bychanging w into W = aw+ b, equation (2.29) can be transformed to the canonicalform

(W ′)2 = 4W 3 − k1W − k2, (2.30)

which is a famous classical ODE with solutions being given by the Weierstrasselliptic p-function. Therefore, the traveling-wave-solution u(x + ct) to the KdV-equation is explicitly expressed in terms of the classical elliptic functions. This

Page 109: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.3. Linear equations and chronological exponentials 93

includes the famous ‘solitary waves’ (or solitons) on shallow waters that can bedescribed by solutions of the form

usol(t, x) = 8k2(ekx+k3t + e−kx−k3t

)−2

that solves the KdV-equation for any value of k. S. Russel is reported to haveobserved it and singled it out for the first time by following it for several miles onthe horseback along a narrow channel.

After the well-posedness of ODEs, the next basic question concerns their sen-sitivity to parameters and initial data. Before diving into this topic, we shall firstdiscuss the properties and concrete representations for the solutions of three classesof equations: general linear equations with bounded generators, linear ΨDEs withspatially homogeneous symbols and basic Hamiltonian evolutions, with the appli-cation of the latter to PDEs and optimal control.

2.3 Linear equations and chronological exponentials

The simplest class of ODEs in a Banach space B is given by the linear equations

μt = Aμt + gt, (2.31)

whereA is a linear operator in B and gt a given curve there. For instance, ifB = Rd

and A is a square matrix, we talk about linear equations in finite dimensions.

Proposition 2.3.1. Let A ∈ L(B,B) and gt a continuous curve. Then the uniquesolution to the Cauchy problem of equation (2.31) with the initial data μ0 = Y isgiven by the following Duhamel formula:

μt = eAtY +

∫ t

0

e(t−s)Ags ds. (2.32)

Exercise 2.3.1. Check (2.32) by direct differentiation, taking into account that theoperator exponent eA =

∑n

An

n! satisfies the same rules of differentiation as theusual exponent.

We will now see how one can derive (2.32) in a more general setting of time-dependent evolutions, namely for the equation

μt = Atμt + gt, (2.33)

where At is a family of bounded linear operators in B depending continuously on t,or more generally measurably (in the strong operator topology). That is, t �→ Atfis a continuous, or measurable, function for any f ∈ B.

Page 110: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

94 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

If and only if all operators At commute, the solution can be written as astraightforward extension of the homogeneous case:

μt = exp

{∫ t

0

As ds

}Y +

∫ t

0

exp

{∫ t

s

Aτ dτ

}gs ds. (2.34)

In order to find the general solution to the Cauchy problem of (2.33), let usfirst rewrite it in the integral form as

μt = Y +

∫ t

0

Asμs ds+

∫ t

0

gs ds. (2.35)

Let us iterate this equation by replacing μs under the integral by the wholeexpression given by the r.h.s. of the formula:

μt = Y +

∫ t

0

(AsY + gs) ds+

∫ t

0

As

(∫ s

0

(As1μs1 + gs1) ds1

)ds.

Denoting gt =∫ t

0 gsds and repeating this procedure recursively leads, for anyn ≥ 2, to the formula

μt = Y + gt +n−1∑k=1

∫0≤s1≤···≤sk≤t

Ask · · ·As1(Y + gs1)ds1 · · · dsk

+

∫0≤s1≤···≤sn≤t

Asn · · ·As1μs1ds1 · · · dsk.

If all As are uniformly bounded, it follows that the last term here tends to zero,as n → ∞, leading to the following result.

Proposition 2.3.2. Let At and gt be families of uniformly bounded operators andelements in B that depend continuously – or more generally almost surely contin-uously (with the points of discontinuity forming a negligible set) – on t ∈ R. Thenthe unique solution to the Cauchy problem of equation (2.33) with the initial dataμ0 = Y , given by Theorem 2.2.1 or Proposition 2.2.2, has the following convergentseries representation:

μt = Y + gt +

∞∑k=1

∫0≤s1≤···≤sk≤t

Ask · · ·As1(Y + gs1)ds1 · · · dsk. (2.36)

Formula (2.36) can be rewritten in various insightful ways. For instance,when I ◦ A denotes the operator in C([0, T ], B) (for any fixed T ) acting as g. �→∫ t

0Asgs ds, equation (2.36) rewrites as the geometric series

μt =∞∑k=0

(I ◦A)k(Y + g.)(t). (2.37)

Page 111: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.3. Linear equations and chronological exponentials 95

Remark 28. The sum∑∞

k=0(I ◦A)k can be interpreted as the series expansion forthe operator (1 − I ◦ A)−1, referred to as the resolvent of the operator I ◦ A. Infact, it follows formally from (2.35) that (1 − I ◦ A)−1(Y + g.)(t) should be thesolution to (2.35), in accordance with (2.37).

Alternatively, setting g = 0 first, one can rewrite (2.36) as

μt = Y +

∞∑k=1

1

k!

∫ t

0

· · ·∫ t

0

T (Ask · · ·As1)Y ds1 · · · dsk, (2.38)

where T is the ordering functional, which for any sequence s1, . . . , sk reorders theproduct Ask · · ·As1 in such a way that sj follow each other in decreasing order fromthe left to the right. A comparison with the usual exponential expansion suggeststhe definition of the fundamental concept of the chronological exponential or time-ordered exponential or T -product of the integral of time-dependent operators as

T exp

{∫ t

s

Aτdτ

}=

∞∑k=1

1

k!

∫ t

0

· · ·∫ t

0

T (Ask · · ·As1) ds1 · · · dsk. (2.39)

With this formula for the chronological exponentials (2.36), one gets thefollowing fundamental representation of the solution:

μt = T exp

{∫ t

0

As ds

}Y +

∫ t

0

T exp

{∫ t

s

Aτ dτ

}gs ds. (2.40)

If At commutes, this turns into (2.34), and if At does not depend on t, it reducesto (2.32).

The chronological exponentials can be expressed in another insightful way,namely as multiplicative Riemann integrals (see Section 1.3). Approximating At inthe equation μt = Atμt by piecewise-constant families suggests the formulation ofthe approximate solution to the Cauchy problem of this equation with the initialcondition μs = Y as

μΔt = exp{(tn − tn−1)Atn−1} · · · exp{(t1 − t0)At0}Y. (2.41)

where Δ = {s = t0 < t1 < · · · < tn = t} is any partition of the interval [s, t].If At is continuous apart from a set of zero measure and uniformly bounded, itfollows from Theorem 1.3.3 that the approximations μΔ

t do converge towards thesolution to the equation μt = Atμt, as |Δ| = max(tj+1 − tj) → 0. This leads to analternative representation for the chronological exponent (2.39):

T exp

{∫ t

s

Aτdτ

}= lim

|Δ|→0exp{(tn − tn−1)Atn−1} · · · exp{(t1 − t0)At0}. (2.42)

Remark 29. For unbounded At, the story is more complicated. See Theorem 5.1.3for continuous families At and [149] (Theorem 2) for more general cases.

Page 112: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

96 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Note for future reference that if At is piecewise-constant, i.e., At = Aj fortj ≤ t < tj+1 for some partition Δ = {s = t0 < t1 < · · · < tn = t}, then

T exp

{∫ t

s

Aτdτ

}= exp{(tn − tn−1)Atn−1} · · · exp{(t1 − t0)At0}. (2.43)

As follows from the definition of the chronological product, the solution tothe linear equation (2.33) can be estimated by

‖μt‖ ≤ exp{t sups∈[0,t]

‖As‖B→B}(‖Y ‖+ t sups∈[0,t]

‖gs‖). (2.44)

Similarly, one solves the backward linear problem

μt = −Atμt + gt, t ≤ r, μr = Y. (2.45)

Its equivalent integral representation reads

μt = Y +

∫ r

t

(Asμs − gs) ds, t ≤ r. (2.46)

If the operators At commute, the unique solution to this problem equals

μt = exp

{∫ r

t

As ds

}Y +

∫ r

t

exp

{∫ s

t

Aτ dτ

}gs ds. (2.47)

In the general case, the exponents are changed into the backward chronologicalexponentials or backward T -product:

μt = T exp

{∫ r

t

As ds

}Y +

∫ r

t

T exp

{∫ s

t

Aτ dτ

}gs ds, (2.48)

where

T exp

{∫ t

s

Aτdτ

}= lim

|Δ|→0exp{(t1 − t0)At0} · · · exp{(tn − tn−1)Atn−1}. (2.49)

The following notation containing the inverse order of integration is also in use:

T exp

{∫ t

s

Aτdτ

}= T exp

{∫ s

t

Aτdτ

}, s ≤ t. (2.50)

Quite similarly, one can also conclude that the integral equation

μt = GtY +

∫ t

0

At,sμs ds+

∫ t

0

gs ds, t > 0, (2.51)

Page 113: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.3. Linear equations and chronological exponentials 97

with uniformly bounded linear operators Gt, At,s depending continuously on t andmeasurably on s, has a unique solution that is given by the series representation

μt = GtY +gt+∞∑k=1

∫0≤s1≤···≤sk≤t

Ask+1,sk · · ·As2,s1(Gs1Y +gs1)ds1 · · · dsk (2.52)

(where sk+1 = t), with the estimate

‖μt‖ ≤ exp{t sups1≤s2≤t

‖As2,s1‖B→B}(

sups∈[0,t]

‖Gs‖B→B‖Y ‖+ t sups∈[0,t]

‖gs‖).

(2.53)

Finally, let us look at linear equations inRd. By (2.32) the problem is reducedto calculating the exponential etA with an arbitrary matrix A. First, let us assumethat A is diagonalizable (which is, e.g., always the case when A is symmetric). Thismeans that there exists an invertible C such that A = CDC−1 with a diagonalmatrix D that has some numbers λ1, . . . , λd on the diagonal. In this case, we find

etA = CetDC−1,

where etD is a diagonal matrix with numbers etλ1 , . . . , etλd on the diagonal. More-over, the columns vj = C.j of the matrix C are the eigenvectors of A, so that

etAvj = etλjvj , j = 1, . . . , d.

If A is not diagonalizable, it can be brought to the Jordan normal form, for whichthe calculation of the exponent is lengthier, but still quite explicit (see any el-ementary introduction to ODEs for more details). Therefore, for the numericalsolutions of linear problems in large dimensions, the main problem lies in the ef-fective calculation of the Jordan normal form of A, which in turn can basically bereduced to finding eigenvectors and eigenvalues of A.

Exercise 2.3.2. Find the exponential exp{tσj} for the Pauli matrices

σ1 =

(1 0

0 −1

), σ2 =

(0 1

1 0

), σ3 =

(1 −i

0 i

). (2.54)

Exercise 2.3.3. Find the exponentials of the Jordan blocks

J2 =

(0 1

0 0

), J3 =

⎛⎜⎝0 1 0

0 0 1

0 0 0

⎞⎟⎠

in dimensions d = 2 and d = 3. Afterwards, extend this result to arbitrary dimen-sions.

Page 114: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

98 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Exercise 2.3.4. Find the solution to the second-order linear equation

f + bf(x) = g(t), t ≥ 0, (2.55)

with the initial data f(0) = f0, f(0) = y0.

Exercise 2.3.5. Find the solution to equation (2.55) with the boundary conditionsf(0) = f0, f(T ) = fT .

2.4 Linear evolutions involving spatiallyhomogeneous ΨDOs

In this section, we deal mostly with equations that have a Lipschitz-continuousr.h.s. First, we give a brief introduction to the important class of equations thatcan be brought into such a form by Fourier transform and then solved explicitly.Namely, let us consider the evolution equations

ft = −ψt(−i∇)ft, f |t=s = fs, t ≥ s, (2.56)

for the (possibly time-dependent family of) differential or pseudo-differential op-erators with constant coefficients (i.e., spatially homogeneous operators) with suf-ficiently regular symbols ψt.

The most studied examples of this Cauchy problem are the simplest diffusionequations and, more generally, equations with (possibly fractional) powers of theLaplacian,

ft = −|Δ|α/2ft, f |t=s = fs, (2.57)

as they arise from (2.56) with ψ(p) = |p|α. We shall discuss these problems andtheir extensions in more detail in Chapter 4. Passing to the Fourier transformf(p) =

∫e−ipxf(x) dx, the Cauchy problem (2.56) turns into the problem

d

dtft(p) = −ψt(p)ft(p), f |t=s = fs,

By (2.34) and Proposition 2.3.2, this problem has the unique solution

ft = exp

{−∫ t

s

ψτ (p) dτ

}fs(p), (2.58)

whenever ψt is almost surely continuous with respect to t.

Returning to f via the inverse Fourier transform yields

ft(x) =

∫Gψ

t,s(x− y)fs(y) dy =

∫Gψ

t,s(z)fs(x− z) dy, (2.59)

with

Gψt,s(x) =

1

(2π)d

∫eipx exp

{−∫ t

s

ψτ (p) dτ

}dp, (2.60)

whenever this integral is well defined.

Page 115: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.4. Linear evolutions involving spatially homogeneous ΨDOs 99

From (2.59), one derives the following characteristic feature of evolutions thatare generated by ΨDOs with constant coefficients: The resolving operator fs → ftcommutes with the shifts, i.e., with the mappings f(x) → f(x+a), for any a ∈ Rd.

If the family ψ does not explicitly depend on t, the solution to the Cauchyproblem (2.56) simplifies to

ft(x) =

∫Gψ

t−s(x− y)fs(y) dy, (2.61)

with

Gψt (x) =

1

(2π)d

∫eipx exp{−tψ(p)} dp. (2.62)

According to the general definitions (1.185) and (1.189), the functions Gψt,s(x)

or Gψt (x) are referred to as the Green functions or the heat kernels for the corre-

sponding Cauchy problem. By (2.59) and (2.61), these functions solve the Cauchyproblem (2.56) with the Dirac initial condition δ(x). For instance, for the basicdiffusion equation ft =

12Δft with ψ(p) = p2/2, the Green function (2.62) has the

form

Gψt (x) =

1

(2π)d

∫eipx exp{−tp2} dp =

1

(2πt)d/2exp

{−x2

2t

}, (2.63)

where (1.89) was used.

Similarly, the solution to the backward Cauchy problem

ft = ψt(−i∇)ft, f |t=r = fr, t ≤ r, (2.64)

can be written as

ft(x) =

∫Gψ

r,t(x − y)fr(y) dy =

∫Gψ

r,t(z)fr(x− z) dy, (2.65)

because, by (2.47), its Fourier transform equals

ft = exp

{−∫ r

t

ψτ (p) dτ

}fr(p). (2.66)

Whenever ψt is continuous with respect to p (with a real part that is boundedfrom below) and almost surely continuous with respect to t (or even just measur-

able, see Remark 27), the function exp{− ∫ tsψτ (p) dτ} is well defined as an element

of S′(Rd), that is, as a tempered generalized function. In this case, equation (2.59)can be written as a convolution:

ft(x) = (Gψt,s � fs)(x). (2.67)

Therefore, whenever the convolutions (2.59) or (2.61) are well defined, theyyield unique solutions to the Cauchy problem (2.56), well defined in the sense ofgeneralized functions.

Page 116: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

100 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Natural Banach spaces that provide the well-posedness of (2.56) are spaceswith various integral norms, since those norms can be recast in terms of the Fouriertransform. Basic examples are the Sobolev spaces (see (1.194) and (1.195)), theso-called Wiener space or Wiener ring F (L1(R

d)) of the Fourier transforms ofL1(R

d) (equipped with the norm inherited from L1(Rd)) and, more generally, the

space F (M(Rd)) of the Fourier transforms of bounded signed measures.

Theorem 2.4.1. Let ψt be a locally bounded measurable function of two variableswith a non-negative real part. Then the mapping fs �→ ft that resolves the problem(2.56) and is given by (2.59) is a well-defined contraction in the space L2(R

d), inthe Sobolev spaces Hs

2(Rd) for all s ∈ R, in the Wiener space F (L1(R

d)) and inthe space F (M(Rd)). Moreover, in all these spaces the initial values are attainedin the usual sense: ‖ft − fs‖ → 0 as t− s → 0.

Proof. This follows from the observation that the mapping fs �→ ft given by (2.58)is a well-defined contraction in the spaces Lq(R

d), q ≥ 1, and their weighted

versions, the spaces Lq(Rd, (1 + |p|2)s dp), and that ‖ft − fs‖ → 0 as t− s → 0 in

all these spaces. �Remark 30. The last property that is stated in Theorem 2.4.1 is referred to asstrong continuity of the mappings fs �→ ft. This property will be studied in moredetail later.

Let us give a reasonably general criterion for the heat kernels to exist as usualsmooth functions.

Theorem 2.4.2. Let ψt(p) be a continuous function of two variables, with its realpart growing as a power for large p:

Reψt(p) ≥ Clow|p|δ, |p| > p, (2.68)

with some positive constants Clow, δ, p. Then, for any t > s, the heat kernel Gψt,s(.)

belongs to C∞(Rd) together with all its spatial derivatives (derivatives with respectto x of any order), and

supx

| ∂l

∂xi1 · · · ∂xil

Gψt,s(x)| ≤

|Sd−1|(2π)d

(pd+l +

1

δΓ

(d+ l

δ

)(Clow(t− s))−(d+l)/δ

),

(2.69)where |Sd−1| is the surface area of the unit sphere in Rd.

If additionally ψt(p) grows at most polynomially for large p, i.e.,

|ψt(p)| ≤ Cup|p|N , |p| > p, (2.70)

with some positive constants Cup and N , then the derivatives

∂Gψt,s(.)

∂tand

∂Gψt,s(.)

∂s

Page 117: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.4. Linear evolutions involving spatially homogeneous ΨDOs 101

are well defined, belong to C∞(Rd) as well as all their spatial derivatives and

supx

max

(∣∣∣∣ ∂∂tGψt,s(x)

∣∣∣∣ ,∣∣∣∣ ∂∂sGψ

t,s(x)

∣∣∣∣)

(2.71)

≤ |Sd−1|(2π)d

(pd+l max

|p|≤p,τ∈[s,t]|ψτ (p)|+ 1

δCupΓ

(d+N

δ

)(Clow(t− s))−(d+N)/δ

).

Proof. Since

∂m

∂xj1 · · · ∂xjm

Gψt,s(x) = (2π)−d

∫imeipxpj1 · · · pjm exp

{−∫ t

s

ψτ (p) dτ

}dp,

∂tGψ

t,s(x) = −(2π)−d

∫eipxψt(p) exp

{−∫ t

s

ψτ (p)dτ

}dp,

∂sGψ

t,s(x) = (2π)−d

∫eipxψs(p) exp

{−∫ t

s

ψτ (p)dτ

}dp,

the claims about C∞(Rd) follow from the Riemann–Lebesgue lemma. Moreover,(2.68) implies

supx

|Gψt,s(x)| ≤

|Sd−1|(2π)d

pd +|Sd−1|(2π)d

∫ ∞

0

exp{−Clow(t− s)|p|δ}|p|d−1d|p|.

A change of variables to r = Clow(t − s)|p|δ in the last integral allows for thisintegral to be expressed in terms of the Gamma function, using the equation∫ ∞

0

rω exp{−Arδ}dr = 1

δΓ((1 + ω)δ)A−(1+ω)/δ.

This leads to (2.69) with l = 0. The other estimates are similarly obtained. �

Equations that satisfy (2.68) are usually called parabolic.

Theorem 2.4.2 implies that the solution ft given by (2.61) is infinitely smoothfor any fs ∈ L1(R

d). This, however, does neither imply that the solution is welldefined for fs ∈ C∞(Rd), nor that the spaces C∞(Rd) or L1(R

d) are preservedby the resolving mapping fs �→ ft. In order for such conclusions to hold, we needthe integrability of G. And for a proof, we need to know the behaviour of G forlarge x. For an appropriate decrease of G(x), as x → ∞, some smoothness of thesymbol is required. The simplest result in this direction is as follows.

Theorem 2.4.3. Under the assumptions of Theorem 2.4.2, let us assume that ψt(p)are infinitely differentiable in p with all derivatives growing at most polynomiallyas p → ∞ (uniformly in t). Then Gψ

t,s(x) decreases faster than any power as

x → ∞, that is for any m > 0 there exists a constant C(m) such that |Gψt,s(x)| ≤

C(m)|x|−m for |x| > 1. In particular, the resolving operator fs �→ ft maps thespaces C∞(Rd) or L1(R

d) to themselves.

Page 118: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

102 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Proof. Due to the Riemann–Lebesgue lemma and the assumptions on ψ, the func-tions

(−i)lxi1 · · ·xilGψt,s(x) = (2π)−d

∫eipx

∂l

∂pi1 · · ·∂pilexp

{−∫ t

s

ψτ (p) dτ

}dp

belong to C∞(Rd) for any l > 0. �

A case that has been studied extensively is the case when ψt(p) is a poly-nomial in p of an even order 2m such that the homogeneous part of the highestorder 2m, ψ2m

t (p), is positive non-degenerate, that is |ψ2mt (p)| ≥ ε|p|2m with some

ε > 0. In this case, a very detailed analysis of G is available. In particular, Gψt,d(x)

decreases exponentially as x → ∞ (see, e.g., [91]).

In many important examples, however, ψ(p) has some singularity. A promi-nent example are the powers ψ(p) = |p|β with some real β. Let us give some resultson the behaviour of G for large x in the presence of such singularities.

Theorem 2.4.4. Under the assumptions of Theorem 2.4.2, let us assume thatψt(p) is l-times continuously differentiable with respect to p outside some compact,nowhere dense set of singularities (the same for all t), such that all derivatives aregrowing at most polynomially as p → ∞ and satisfy the condition that any productof the partial derivatives

∂l1ψ

∂pi1 · · · ∂pil1· · · ∂lmψ

∂pj1 · · · ∂pjlm(2.72)

with l1 + · · ·+ lm = l is locally integrable everywhere. Then, for any s < t,

(1 + |.|l)Gψt,s(.), (1 + |.|l)∂G

ψt,s(.)

∂t, (1 + |.|l)∂G

ψt,s(.)

∂s∈ C∞(Rd), (2.73)

and the same holds for all spatial derivatives of these functions.

In particular, if this holds for l = d + 1, then the functions Gψt,s(.),

∂Gψt,s(.)

∂t

and∂Gψt,s(.)

∂s belong to L1(Rd) together with all their spatial derivatives. In other

words, these functions belong to the Sobolev spaces Hk1 (R

d) for all natural k. Thecorresponding norms are again uniformly bounded for t − s ∈ [T1, T2] with any0 < T1 < T2.

Proof. According to the assumptions, the function under the integral in the ex-pression

(−i)lxi1 · · ·xilGψt,s(x) = (2π)−d

∫eipx

∂l

∂pi1 · · ·∂pilexp

{−∫ t

s

ψτ (p) dτ

}dp

belongs to L1(Rd). Therefore, the first inclusion in (2.73) follows again by the

Riemann–Lebesgue lemma. Since multiplying by polynomials in p or by ψ(p) does

Page 119: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.4. Linear evolutions involving spatially homogeneous ΨDOs 103

not increase the singularity, the other inclusions in (2.73) immediately follow. Allother statements follow from to the fact that the condition (1 + |x|d+1)f(x) ∈C∞(Rd) for a continuous f implies f ∈ L1(R

d). �Remark 31. Although Theorem 2.4.2 and (2.4.4) imply that the resolving operatorfs �→ ft of the Cauchy problem (2.56) takes C∞(Rd) to itself and L1(R

d) to itself,the question remains open as to how the initial conditions are met, that is whether‖ft− fs‖ → 0 as t− s → 0 in these spaces. This question is ultimately linked withthe question of uniform boundedness for small t− s of the operators fs �→ ft. Weshall return to these questions in Chapter 4.

For example, if ψ(p) = |p|β with a β > 0, the only singularity of ψ is atthe origin, and the condition of the integrability of the products (2.72) is fulfilledfor l = d + k, where k is the biggest integer that is strictly smaller than β. Inparticular, l ≥ d + 1 for β > 1. The case of homogeneous ψ will be considered indetail in the Sections 4.4 and 4.5. Here, let the operator ψ(−i∇) be the fractionalderivatives (right or left) on R as an insightful example. The corresponding Greenfunctions or heat kernels solve the problems

∂G

∂t(t, x) = − dβ

d(±x)βG(t, x), t ≥ 0, Gt=0 = δ(x), (2.74)

with β ∈ (0, 1). From Proposition 1.8.6, it follows that the corresponding symbolsψ equal

ψ(p) = exp

{± i

2πβ sgn p

}|p|β,

so that

G(t, x) = G±β(t, x) =1

∫ ∞

−∞exp

{ipx− t|p|β exp

{± i

2πβ sgn p

}}dp. (2.75)

Taking into account that this expression is real, it follows that

G(t, x) = G±β(t, x) =1

2πRe

∫ ∞

−∞exp

{ipx− t|p|β exp

{± i

2πβ sgn p

}}dp

=1

πRe

∫ ∞

0

exp

{ipx− tpβ exp

{± i

2πβ

}}dp. (2.76)

From this formula, it follows that G±β(t, x) = G∓β(t,−x). Moreover, thesefunctions satisfy a scaling law:

G±β(t, x) = t−1/βG±β(1, t−1/βx) =

1

|x|G±β(t|x|−β , sgnx). (2.77)

This law is crucial for their analysis. Moreover, Theorem 2.4.2 (i) implies that theheat kernel G±β(t, .) belongs to C∞(R) together with all its spatial derivatives,

and that the derivative∂G±β(t,.)

∂t also belongs to C∞(R) together with all its spatialderivatives.

Page 120: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

104 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Remark 32. In probability theory, the functions p±β(t,x)=G∓β(t,−x)=G±β(t,x)are referred to as stable densities. They represent transition densities for increasingrespectively decreasing stable subordinators.

More specific properties of these stable densities are collected in the followingassertion.

Proposition 2.4.1.

(i) The functions G±β(t, x) are non-negative, and

G±β(t, x) = 0 for ∓ x > 0. (2.78)

(ii) For any t, these functions behave like |x|−1−β for large x, in the sense thatthere exits a finite positive limit

lim|x|→∞

G±β(t, x)|x|1+β for ± x > 0. (2.79)

(iii) G±β(t, x) > 0 for ±x > 0, and they are unimodal in this region, that is,they increase monotonically from zero up to the maximum and then decreasemonotonically back to zero.

Proof. Assertion (i) follows from the more general Proposition 5.11.3. Assertion(ii) follows from the more general Proposition 4.5.1 that shall be proven later on.Assertion (iii) will be neither proved nor used here. Proofs can be found, e.g., in[78, 267] or [148]. �

Another key property of these functions has to do with the following repre-sentation of the Mittag-Leffler function:

Eβ(s) =1

β

∫ ∞

0

esxx−1−1/βGβ(1, x−1/β) dx =

∫ ∞

0

esy−β

Gβ(1, y) dy, (2.80)

which holds for β ∈ (0, 1) and all s ∈ C. As we shall see later on, this formula iscrucial for analysing general fractional equations. We shall give two analytic proofsof this formula in Proposition 8.1.1 and in Section 8.4. A probabilistic proof canbe found in [106].

2.5 Hamiltonian systems, boundary-value problems

and the method of shooting

For a smooth function H(x, p), x, p ∈ Rm, the system of equations⎧⎪⎪⎨⎪⎪⎩

x =∂H

∂p

p = −∂H

∂x

(2.81)

Page 121: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.5. Hamiltonian systems, boundary-value problems, method of shooting 105

is called the system of Hamiltonian equations with the Hamiltonian function, orjust the Hamiltonian, H . It represents one of the key examples of ODEs that occurin physics. The coordinates x and p are referred to as position and momentum.Projections of the solutions onto the position coordinates are often referred toas characteristics or extremals. The reason for this nomenclature will be revealedbelow. This and the next section analyse Hamiltonian systems in more detail. Notethat the obtained results are not used in other parts of the book.

Unlike most of the other book content, we mostly stick to finite-dimensionalHamiltonian systems, since their infinite-dimensional extension is nontrivial andhas to be specially developed. Many specialized treatises are devoted to this topic.But even the finite-dimensional theory of Hamiltonian systems is quite specific,which is why it can only be roughly touched in the framework of this book, namelyby introducing the method of shooting that can be used to solve boundary-valueproblems for ODEs, provided that the corresponding Cauchy (or initial value)problem is well understood. The core idea of this method is to consider all trajec-tories that emerge from a certain point (shooting the corresponding mechanicalparticles out of this point) and then to try to find the trajectory that satisfies therequired condition on its end point.

In the next section, we shall explain how Hamiltonian systems can be used tosolve first-order partial differential equations, namely the Hamilton–Jacobi equa-tions.

A basic class of Hamiltonian functions that is responsible for the second-orderNewton equations of mechanics is given by the convex quadratic-in-momentumHamiltonians

H(x, p) =1

2(G(x)p, p) − (A(x), p) − V (x), (2.82)

where G(x) is a non-negative symmetric matrix. Such a Hamiltonian is called non-degenerate if G is strictly positive, so that G(x)−1 exists for all x and is uniformlybounded. In this section, we develop the theory of boundary-value problems forthe basic non-degenerate case. For the more general case, we shall refer to theoriginal papers.

Before proceeding with the boundary-value problem, one needs some esti-mates for the solutions to the Cauchy problem for the Hamiltonian system (2.81).For the Hamiltonian (2.82), it has the form⎧⎪⎨

⎪⎩x = G(x)p−A(x)

pi = −1

2

(∂G

∂xip, p

)+

(∂A

∂xi, p

)+

∂V

∂xi, i = 1, . . . ,m,

(2.83)

where we have written the second equation for each coordinate separately. In whatfollows, Br denotes the balls in Rm of radius r centered at zero. Furthermore, letus denote by X(s, x0, p0), P (s, x0, p0) the solution to (2.81) or (2.83) with theinitial data (x0, p0) (whenever it exists).

Page 122: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

106 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Lemma 2.5.1. Let G(x), A(x), V (x) be twice continuously differentiable functions.Then for an arbitrary x0 ∈ Rm and an arbitrary open bounded neighbourhoodU(x0) of x0, there exist positive constants t0, c0, C such that for t ∈ (0, t0], c ∈(0, c0] and p0 ∈ Bc/t, the solution X(s, x0, p0), P (s, x0, p0) exists on the interval[0, t], and for all s ∈ [0, t],

X(s, x0, p0) ∈ U(x0), |P (s, x0, p0)| < C(|p0|+ t). (2.84)

Proof. Let T be the time of exit of the solution from the domain U(x0), namely

T (t) = min (t, sup{s : X(s, x0, p0) ∈ U(x0), P (s, x0, p0) < ∞}) .Since the derivatives of G, A and V are bounded in U(x0), it follows that

for s ≤ T (t) the growth of |X(s, x0, p0)− x0| and |P (s, x0, p0)| is bounded by thesolution to the system

x = K(p+ 1), p = K(p2 + 1) (2.85)

in R2 with the initial conditions x(0) = 0, p(0) = |p0| and some constant K. Thesolution to the second equation is

p(s) = tan(Ks+ arctan p(0)) =p(0) + tanKs

1− p(0) tanKs. (2.86)

Therefore, if |p0| ≤ c/t with c ≤ c0 < 1/K, where K is chosen in such a waythat tanKs ≤ Ks for s ≤ t0, then

1− |p0| tanKs > 1− |p0|Ks ≥ 1− c0K

for all s ≤ T (t). Consequently, for such s,

|P (s, x0, p0)| ≤ |p0|+ Ks

1− c0K, |X(s, x0, p0)− x0| ≤ Ks+K

c+ Ks2

1− c0K.

Choosing t0, c0 in such a way that the last inequality implies X(s, x0, p0) ∈U(x0) for s ≤ t0, c ≤ c0, it follows that T (t) = t. This implies (2.84) as required.

�Lemma 2.5.2. Let G(x), A(x), V (x) be twice continuously differentiable functions.Then there exist t0 > 0 and c0 > 0 such that, for s ≤ t ≤ t0, c ≤ c0, p0 ∈ Bc/t,

1

s

∂X

∂p0(s, x0, p0) = G(x0) +O(c + t),

∂P

∂p0(s, x0, p0) = 1 +O(c+ t). (2.87)

Proof. Differentiating the first equation in (2.83) yields

xi =∂Gik

∂xl(x)xlpk +Gik(x)pk − ∂Ai

∂xlxl (2.88)

=

(∂Gik

∂xlpk − ∂Ai

∂xl

)(Gljpj −Al) +Gik

(−1

2(∂G

∂xkp, p) + (

∂A

∂xk, p) +

∂V

∂xk

).

Page 123: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.5. Hamiltonian systems, boundary-value problems, method of shooting 107

Consequently, differentiating the Taylor expansion

x(s) = x0 + x(0)s+

∫ s

0

(s− τ)x(τ) dτ

with respect to the initial momentum p0 and using (2.84), one gets

∂X

∂p0(s,x0,p0) (2.89)

=G(x0)s+

∫ s

0

(O(1+ |p0|2)∂X

∂p0(τ,x0,p0)+O(1+ |p0|) ∂P

∂p0(τ,x0,p0)

)(s−τ)dτ.

Similarly, differentiating p(s) = p0 +∫ s

0p(τ) dτ leads to

∂P

∂p0(s, x0, p0) (2.90)

= 1 +

∫ s

0

(O(1 + |p0|2)∂X

∂p0(τ, x0, p0) + O(1 + |p0|) ∂P

∂p0(τ, x0, p0)

)dτ.

Let us now look at the matrices

v(s) =1

s

∂X

∂p0(s, x0, p0), u(s) =

∂P

∂p0(s, x0, p0)

as elements of the Banach space C([0, t],Mm) of continuous m×m-matrix-valuedfunctions M(s) on [0, t] equipped with the norm sup{|M(s)| : s ∈ [0, t]}. Then onecan write the equations (2.89), (2.90) in the abstract form

v = G(x0) + L1v + L1u, u = 1 + L2v + L2u,

where L1, L2, L1, L2 are linear operators in C([0, t],Mm) with the norms |Li| =O(c2 + t2) and |Li| = O(c + t). This implies (2.87) for c and t small enough. Infact, the second equation yields u = 1 + O(c+ t) +O(c2 + t2)v. Substituting thisequality in the first equation yields v = G(x0)+O(c+ t)+O(c2+ t2)v, and solvingthis equation with respect to v leads to the first equation in (2.87). �

Now we are ready to prove the existence of the family Γ(x0) of solutions ofthe system (2.83) that start at x0 and cover a neighbourhood of x0 in times t ≤ t0.

Theorem 2.5.1. Let G(x), A(x), V (x) be twice continuously differentiable functionsand let the matrix G be positive non-degenerate. Then

(i) for each x0 ∈ Rm there exist c and t0 such that for all t ≤ t0 the mappingp0 �→ X(t, x0, p0) defined on the ball Bc/t is a diffeomorphism onto its image;

(ii) for an arbitrary small enough c, there are positive r = O(c) and t0 = O(c)such that the image of this diffeomorphism contains the ball Br(x0) for allt ≤ t0.

Page 124: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

108 Chapter 2. Basic ODEs in Complete Locally Convex Spaces

Proof. (i) Note first that, by Lemma 2.5.2, the mapping p0 �→ X(t, x0, p0) is alocal diffeomorphism for all t ≤ t0. Moreover, if p0, q0 ∈ Bc/t, then

X(t, x0, p0)−X(t, x0, q0) =

∫ 1

0

∂X

∂p0(t, x0, q0 + s(p0 − q0)) ds (p0 − q0)

= t(G(x0) +O(c + t))(p0 − q0).

(2.91)

Therefore, if c and t are sufficiently small, the r.h.s. of (2.91) cannot vanish forp0 − q0 = 0.

(ii) We must prove that for x ∈ Br(x0) there exists p0 ∈ Bc/t such thatx = X(t, x0, p0), or equivalently, such that

p0 = p0 +1

tG(x0)

−1(x−X(t, x0, p0)).

In other words, the mapping

Fx : p0 �→ p0 +1

tG(x0)

−1(x−X(t, x0, p0)) (2.92)

has a fixed point in the ball Bc/t. Since every continuous mapping from a ball toitself has a fixed point (Schauder fixed-point principle), it is enough to prove thatFx takes the ball Bc/t into itself, i.e., that

|Fx(p0)| ≤ c/t (2.93)

whenever x ∈ Br(x0) and |p0| ≤ c/t. By (2.84), (2.88) and (2.89), we have

X(t, x0, p0) = x0 + t(G(x0)p0 −A(x0)) +O(c2 + t2).

Therefore, it follows from (2.92) that (2.93) is equivalent to

|G(x0)−1(x− x0) +O(t + c2 + t2)| ≤ c,

which holds for t ≤ t0, |x− x0| ≤ r and sufficiently small r, t0, provided that c ischosen small enough. �

As a consequence, we get the following local well-posedness result for theboundary-value problem for quadratic Hamiltonians.

Theorem 2.5.2. If either (i) A, V,G,G−1 ∈ C2(Rd), or (ii) G is a constant pos-itive matrix, A ∈ C2(Rd) and the second derivatives V ′′ exist and are uniformlybounded, then there exist positive r, c, t0 such that for any t ∈ (0, t0] and any x1, x2

with |x1 − x2| ≤ r, there exists a solution to the system (2.83) with the boundaryconditions

x(0) = x1, x(t) = x2.

Moreover, this solution is unique under the additional assumption that |p(0)| ≤ c/t.

Page 125: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

2.5. Hamiltonian systems, boundary-value problems, method of shooting 109

Proof. The case (i) follows directly from Theorem 2.5.1. Under the assumptions(ii), the proof of Lemma 2.5.1 can be repeated with the system

x = K(p+ 1), p = K(1 + p+ x)

as a bound for the solution to the Hamiltonian system. This system is linear(this is where the assumption of G being constant comes in) and its solutions areestimated by the exponents. The rest of the proof remains the same. �

The proof of the existence of the boundary-value problem given above isnot constructive. However, once the well-posedness is given, one can constructapproximate solutions up to any order in small t for smooth enough Hamiltonians.For this purpose, one again begins with the construction of the asymptotic solutionfor the Cauchy problem.

Proposition 2.5.1. If the functions G, A, V in (2.82) have continuous derivatives of order up to k + 1, then for the solution to the Cauchy problem for equation (2.83) with initial data x(0) = x0, p(0) = p0, the following asymptotic formulae hold:

X(t, x0, p0) = x0 + tG(x0)p0 − A(x0)t + ∑_{j=2}^k Qj(t, tp0) + O(c + t)^{k+1},   (2.94)

P(t, x0, p0) = p0 + (1/t)[∑_{j=2}^k Pj(t, tp0) + O(c + t)^{k+1}],   (2.95)

where Qj(t, q) = Qj(t, q1, . . . , qm), Pj(t, q) = Pj(t, q1, . . . , qm) are homogeneous polynomials of degree j with respect to all their arguments, with coefficients that depend on the values of G, A, V and their derivatives up to order j at the point x0. Moreover, one has the following expansion for the derivatives with respect to the initial momentum:

(1/t)(∂X/∂p0) = G(x0) + ∑_{j=1}^k Qj(t, tp0) + O(c + t)^{k+1},   (2.96)

∂P/∂p0 = 1 + ∑_{j=1}^k Pj(t, tp0) + O(c + t)^{k+1},   (2.97)

where Qj, Pj are again homogeneous polynomials of degree j, but now matrix-valued.

Proof. This follows directly from differentiating the equations (2.83), then using the Taylor expansion for their solution up to kth order, and finally estimating the remainder with the help of Lemma 2.5.1. □


Proposition 2.5.2. If the functions G, A, V in (2.82) have continuous derivatives of order up to k + 1 and G(x0) is a non-degenerate positive matrix, the function p0(t, x, x0), defined by the equation x = X(t, x0, p0(t, x, x0)) according to Theorem 2.5.1, has the asymptotic expansion

p0(t, x, x0) = (1/t) G(x0)^{-1}[(x − x0) + A(x0)t + ∑_{j=2}^k Pj(t, x − x0) + O(c + t)^{k+1}],   (2.98)

where the Pj(t, x − x0) are certain homogeneous polynomials of degree j in all their arguments.

Proof. It follows from (2.94) that x − x0 can be expressed as an asymptotic power series in the variable (p0t) with coefficients that have asymptotic expansions in powers of t. This implies the existence and uniqueness of the formal power series of the form (2.98) which solves equation (2.94) with respect to p0. The well-posedness of this equation (which follows from Theorem 2.5.1) completes the proof. □

We have shown how the method of shooting is used for proving the well-posedness and for effectively calculating the solution to the boundary-value problem for Hamiltonian systems locally. The vast development of the theory of Hamilton equations is beyond the scope of this book, but we shall provide some comments on its various directions. (More bibliographical comments are provided in the last section of the chapter.)

For convex Hamiltonians, it will be shown in the next section that the projections of the solutions to the boundary-value problems on the position coordinate x, that is, the curves X(τ, x0, p0) with x = X(t, x0, p0), provide local minimizers for the integral functional

It(y(.)) = ∫_0^t L(y(τ), ẏ(τ)) dτ   (2.99)

among all piecewise smooth curves y(τ) with given boundary conditions y(0) = x0, y(t) = x (hence the term extremals mentioned above), where L(x, v) is the Lagrange function or the Lagrangian of H, defined as the Legendre transform of H(x, p) in the variable p:

L(x, v) = max_p (pv − H(x, p)).   (2.100)

For example, for the quadratic Hamiltonian H(x, p) = (1/2)(G(x)p, p), the corresponding Lagrangian is also quadratic:

L(x, v) = (1/2)(G^{-1}(x)v, v).   (2.101)

Exercise 2.5.1. Check formula (2.101).
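A quick sketch of the computation behind this exercise (assuming, as in the quadratic case above, that G(x) is symmetric and positive definite): the maximum in (2.100) is attained where v = G(x)p, i.e., p = G^{-1}(x)v, and hence

L(x, v) = (G^{-1}(x)v, v) − (1/2)(G(x)G^{-1}(x)v, G^{-1}(x)v) = (1/2)(G^{-1}(x)v, v),

which is formula (2.101).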


This link to the theory of optimization (or the calculus of variations) makes it possible to derive from the local well-posedness of the boundary-value problem (proved above) the global existence result for its solution – without the restriction that x0 and x are close and t is small. (Note, however, that this does not imply the uniqueness!) Hereby, a possible solution can be identified as a curve supplying the global minimum of (2.99). This existence result for non-degenerate quadratic Hamiltonians is referred to as Tonelli's theorem.

The key application of this theory in geometry and general relativity arises from pure quadratic Hamiltonians with a Lagrangian of the form (2.101). When defined on the tangent bundles to a Riemannian manifold, these functions that are quadratic in the velocity v specify Riemannian metrics. The corresponding local minimizers or extremals are called geodesics. These are curves of minimal length, that is, the analogue to straight lines in Euclidean geometry. The corresponding evolution generated by the Hamilton equations is referred to as the geodesic flow.

While non-degenerate quadratic Hamiltonians arise in geodesic flows and in standard problems of the calculus of variations with Lagrangians L(x, v) that depend quadratically on v, degenerate quadratic Hamiltonians arise in the study of stochastic geodesic flows (i.e., geodesic flows that are perturbed by some noisy input) and in problems of the calculus of variations with Lagrangians L(x, ẋ, . . . , x^{(k)}) that depend on higher derivatives and show a quadratic dependence on the highest derivative x^{(k)}. For degenerate quadratic Hamiltonians, the analysis of the boundary-value problem is much less straightforward. For example, let us consider the following quadratic convex Hamiltonian on R^{4d}:

H(x, y, p, q) = −f(y)p + (1/2)q^2   (2.102)

with an everywhere-positive function f, for instance f(y) = y^2. (Note that (p, q) is the momentum related to the position coordinates (x, y).) The corresponding Hamiltonian system reads

ẋ = −f(y), ẏ = q, ṗ = 0, q̇ = f′(y)p.

Therefore, ẋ is always negative and there is no solution of the Hamiltonian system joining (x0, y0) and (x, y) whenever x > x0, even for small positive t and close x, x0. In other words, the boundary-value problem is not solvable even locally.

A natural approach to address the local boundary-value problem for a degenerate Hamiltonian near a point x0 is to consider its linear approximation, obtained by taking the quadratic (in both p and x) approximation to the Hamiltonian around some point x0. However, it turns out that the solution to the local boundary-value problem is approximated by the solutions to the corresponding linear approximation only for a particular class of Hamiltonians. This class of Hamiltonians has been identified in [136], where the full theory of the boundary-value problem for the corresponding Hamiltonian equations (local well-posedness and global existence of the global minimizing solutions) is provided. The Hamiltonians of this class are called regular in [136]. They can be classified with the help of Young diagrams (finite non-increasing sequences of natural numbers). For a diagram that consists of only one number, the Hamiltonian is non-degenerate. For a diagram that consists of two numbers k ≥ n, the corresponding Hamiltonians H(x, y, p, q), x, p ∈ R^n, y, q ∈ R^k have the form

H = (1/2)(g(x)q, q) − (a(x) + α(x)y, p) − (b(x) + β(x)y + (1/2)(γ(x)y, y), q) − V(x, y),   (2.103)

or more explicitly

H = (1/2) ∑ g_{ij}(x) q^i q^j − ∑ (a^i(x) + α^i_j y^j) p_i − ∑ (b^i(x) + β^i_j(x) y^j + (1/2) ∑ γ^{jl}_i y_j y_l) q^i − V(x, y),

where g is a positive k × k matrix, V(x, y) is a polynomial in y of degree ≤ 4, bounded from below, and the rank of the matrix α(x) is n.

Exercise 2.5.2. Write down explicitly the Hamiltonian system arising from Hamiltonians of the type (2.103). Afterwards, prove the following generalization of Lemma 2.5.1 for this system: There exist constants K, t0, and c0 such that for all c ∈ (0, c0] and t ∈ (0, t0], the solution (X, Y, P, Q)(t, x0, y0, p0, q0) to this system with the initial data (x0, y0, p0, q0) exists on the interval [0, t] whenever

|y0| ≤ c/t, |q0| ≤ c^2/t^2, |p0| ≤ c^3/t^3.

On this interval, the following estimates hold:

|x − x0| ≤ Kt(1 + c/t), |y − y0| ≤ Kt(1 + c^2/t^2),
|q − q0| ≤ Kt(1 + c^3/t^3), |p − p0| ≤ Kt(1 + c^4/t^4).

Note that a solution to this exercise can be found in [136].

Exercise 2.5.3. Prove the following local well-posedness result for the boundary-value problem for a Hamiltonian system with a Hamiltonian of the form (2.103): (i) There exist positive real numbers c and t0 (depending only on x0) such that for all t ≤ t0 and |y| ≤ c/t, the mapping (p0, q0) ↦ (X, Y)(t, x0, y0, p0, q0) defined on the polydisc B_{c^3/t^3} × B_{c^2/t^2} is a diffeomorphism onto its image. (ii) There exist positive r, c, t0 such that the image of this diffeomorphism contains the polydisc B_{r/t}(y) × B_r(x), where

(x, y) = (X, Y)(t, x0, y0, 0, 0).

The arguments are rather lengthy, see again [136].


Remark 33. For a general Young diagram M = {m_{M+1} ≥ m_M ≥ · · · ≥ m_0 > 0}, let x^0, . . . , x^M, x^{M+1} = y be the position coordinates in the spaces of dimensions m_0, . . . , m_{M+1}, respectively, and let p^0, . . . , p^M, p^{M+1} = q be the momenta. The corresponding regular Hamiltonians are defined by the formula

H(x, y, p, q) = (1/2)(g(x^0)q, q) − R_1(x, y)p_0 − · · · − R_{M+1}(x, y)p_M − R_{M+2}(x, y)q − R_{2(M+2)}(x, y),

where the R_I(x, y) are (vector-valued) polynomials in the variables x^1, . . . , x^M, y = x^{M+1} of M-degree I with smooth coefficients depending on x^0, and where the M-degree of a polynomial is defined by prescribing the degree I to the variables x^I, I = 0, . . . , M + 1. Moreover, g(x^0) is non-degenerate and the matrices ∂R_I/∂x^I have the rank m_{I−1}.

The problem of solving Hamiltonian equations is closely related to (in fact, is often even reduced to) the problem of finding the conservation laws (also referred to as first integrals), which are functions g(x, p) such that the values g(X(τ, x0, p0), P(τ, x0, p0)) do not depend on τ for any x0, p0. A tool for finding such first integrals is supplied by the notion of the Poisson bracket, which is defined for functions g, h on R^{2d} as follows:

{g, h}(x, p) = (∂g/∂x)(∂h/∂p) − (∂g/∂p)(∂h/∂x).   (2.104)

The functions g, h are said to be in involution if their bracket vanishes. It is straightforward to see that if g and H are in involution, then g is a conservation law for the Hamiltonian system with the Hamiltonian function H.

Exercise 2.5.4. Check this claim.
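A one-line sketch of the verification asked for here: along a solution of (2.81),

(d/dτ) g(X, P) = (∂g/∂x)Ẋ + (∂g/∂p)Ṗ = (∂g/∂x)(∂H/∂p) − (∂g/∂p)(∂H/∂x) = {g, H}(X, P),

so if {g, H} = 0, then g is constant along the flow, i.e., a conservation law.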

If a sufficient number of conservation laws can be found (i.e., sufficient to identify the solutions), the Hamiltonian system is referred to as integrable. Integrable infinite-dimensional systems are of particular interest. Their study brought to life the remarkable theory of solitons and instantons in nonlinear PDEs, based on the famous KdV-equation.

2.6 Hamilton–Jacobi equation, method of characteristics and calculus of variation

We shall now introduce the method of characteristics, which links ODEs and first-order PDEs and is a key tool for solving the latter.

Remark 34. Remarkably, the method also turns out to be effective in the opposite direction, making it possible to derive solutions to ODEs in some important cases when the solutions to PDEs are easier to find, see, e.g., [16].


We develop the method based on the most important case of the Hamilton–Jacobi equations, which differ from the general first-order PDE by the restriction that these equations do not depend explicitly on the unknown function, but only on its partial derivatives. (Note, however, that general linear equations can be found in Section 2.10.) As a consequence, we will show the central role of Hamiltonian systems in solving minimization problems of the calculus of variations. We will build the theory of the Cauchy problem for the Hamilton–Jacobi equation for small times and then briefly indicate the ways for constructing global generalized solutions.

Let H = H(x, p) be a twice continuously differentiable function on R^{2n}. Let X(t, x0, p0), P(t, x0, p0) denote the solution to the Hamiltonian system (2.81) with initial conditions (x0, p0) at time zero. (As was already noted, the projections on the x-space of the solutions of (2.81) are called characteristics of the Hamiltonian H, or extremals.) Suppose that for some x0, t0 > 0 and all t ∈ (0, t0], there exists a neighbourhood of the origin Ωt ⊂ R^n such that the mapping p0 ↦ X(t, x0, p0) is a diffeomorphism from Ωt onto its image and, moreover, this image contains a fixed neighbourhood D(x0) of x0 (not depending on t). Then the family Γ(x0) of solutions of (2.81) with initial data (x0, p0), p0 ∈ Ωt, is called the field of characteristics starting from x0 and covering D(x0) in times t. Basic classes of Hamiltonians where such a field exists (which occurs whenever the boundary-value problem is locally well posed) were identified in the previous section.

Assuming that this field exists, one can define the smooth function

p0(t, x, x0) : (0, t0] × D(x0) → Ωt(x0)

so that X(t, x0, p0(t, x, x0)) = x.

The family Γ(x0) defines two natural vector fields in (0, t0] × D(x0). Namely, each point of this set is associated with the momentum and velocity vectors

p(t, x) = P(t, x0, p0(t, x, x0)), v(t, x) = (∂H/∂p)(x, p(t, x))   (2.105)

of the solution to (2.81) joining x0 and x in time t.

With each solution X(t, x0, p0), P(t, x0, p0), one associates the action function defined by the integral

σ(t, x0, p0) = ∫_0^t (P(τ, x0, p0)Ẋ(τ, x0, p0) − H(X(τ, x0, p0), P(τ, x0, p0))) dτ.   (2.106)

Due to the properties of the field of characteristics Γ(x0), one can locally define the two-point function S(t, x, x0) as the action along the trajectory from Γ(x0) that joins x0 and x in the time t, i.e.,

S(t, x, x0) = σ(t, x0, p0(t, x, x0)).   (2.107)


Using the vector field p(t, x), one can rewrite this in the equivalent form

S(t, x, x0) = ∫_0^t [p(τ, x) dx − H(x, p(τ, x)) dτ],   (2.108)

the curvilinear integral being taken along the characteristic X(τ, x0, p0(t, x, x0)).

The following statement represents the key link between ODEs and first-order partial differential equations – more concretely, between Hamiltonian equations and the Hamilton–Jacobi equation. It also plays the central role in the classical calculus of variations.

Theorem 2.6.1. For any x0, the function S(t, x, x0) satisfies the Hamilton–Jacobi equation

∂S/∂t + H(x, ∂S/∂x) = 0   (2.109)

in the domain (0, t0] × D(x0). Moreover,

(∂S/∂x)(t, x) = p(t, x).   (2.110)

Finally, the integral in the r.h.s. of (2.108) does not depend on the path of integration, i.e., it has the same value for all smooth curves x(τ) joining x0 and x in the time t and lying completely in the domain D(x0).

Proof. First we prove (2.110). This equation can be rewritten as

P(t, x0, p0) = (∂S/∂x)(t, X(t, x0, p0)),

or equivalently as

P(t, x0, p0) = (∂σ/∂p0)(t, x0, p0(t, x, x0)) (∂p0/∂x)(t, x, x0).

Since X(t, x0, p0(t, x, x0)) = x, we find

(∂p0/∂x)^{-1}(t, x, x0) = (∂X/∂p0)(t, x0, p0(t, x, x0)).

It follows that equation (2.110), written in terms of the variables (t, p0), has the form

P(t, x0, p0)(∂X/∂p0)(t, x0, p0) = (∂σ/∂p0)(t, x0, p0).   (2.111)

To prove this equation, we note that it holds at t = 0, since both parts vanish at that time. Moreover, differentiating this equation with respect to t and using (2.81), one gets the tautology

−(∂H/∂x)(∂X/∂p0) + P(∂^2X/∂t∂p0) = (∂P/∂p0)(∂H/∂p) + P(∂^2X/∂t∂p0) − (∂H/∂p)(∂P/∂p0) − (∂H/∂x)(∂X/∂p0),

showing that (2.111) holds for all t.


In order to prove (2.109), let us first rewrite it as

∂σ/∂t + (∂σ/∂p0)(∂p0/∂t)(t, x) + H(x, p(t, x)) = 0.

Inserting the expressions for ∂σ/∂t and ∂σ/∂p0 from (2.106) and (2.111), respectively, yields

P(t, x0, p0)Ẋ(t, x0, p0) + P(t, x0, p0)(∂X/∂p0)(t, x0, p0)(∂p0/∂t) = 0,

which is true, because the equation

(∂X/∂p0)(t, x0, p0)(∂p0/∂t) + Ẋ(t, x0, p0) = 0

is obtained by differentiating the equation X(t, x0, p0(t, x, x0)) = x.

The final statement follows, because the integrand in (2.108) is a complete differential due to (2.109) and (2.110). □

The integral on the r.h.s. of (2.108) is often referred to as the invariant Hilbertintegral.
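As a simple illustrative check (a special case of (2.82) with G = 1, A = 0, V = 0, i.e., H(x, p) = |p|^2/2): the characteristics are straight lines, the two-point function is S(t, x, x0) = |x − x0|^2/(2t), and indeed

∂S/∂t + (1/2)|∂S/∂x|^2 = −|x − x0|^2/(2t^2) + (1/2)|(x − x0)/t|^2 = 0,

in agreement with (2.109), while ∂S/∂x = (x − x0)/t is the (constant in τ) momentum of the straight-line characteristic joining x0 and x in time t, in agreement with (2.110).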

For the sake of completeness, let us examine the key link of Hamiltonian systems with the calculus of variations. For this purpose, one introduces the Lagrange function L(x, v) as the Legendre transform (2.100) of H(x, p) in the variable p, the integral functional (2.99) defined on piecewise-smooth curves joining x0 and x in the time t, and the Weierstrass function W(x, q, p) defined in the Hamiltonian picture as

W(x, q, p) = H(x, q) − H(x, p) − (q − p, (∂H/∂p)(x, p)).

One says that the Weierstrass condition holds for a solution (x(τ), p(τ)) to the system (2.81), if W(x(τ), q, p(τ)) ≥ 0 for all τ and all q ∈ R^n. For example, if the Hamiltonian H is convex (even non-strictly) in the variable p, then the Weierstrass function is non-negative for any choice of its arguments, thus in this case the Weierstrass condition holds trivially for all curves.

The following result is the basic Weierstrass sufficient condition for a minimum in the calculus of variations.

Theorem 2.6.2. If the Weierstrass condition holds on a trajectory X(τ, x0, p0), P(τ, x0, p0) of the field Γ(x0) joining x0 and x in the time t (i.e., such that X(t, x0, p0) = x), then the characteristic X(τ, x0, p0) provides a minimum for the functional (2.99) over all curves that lie completely in D(x0). Furthermore, S(t, x, x0) is the corresponding minimal value.


Proof. For any curve y(τ) joining x0 and x in the time t and lying in D(x0), one finds

It(y(.)) = ∫_0^t L(y(τ), ẏ(τ)) dτ ≥ ∫_0^t (p(τ, y(τ))ẏ(τ) − H(y(τ), p(τ, y(τ)))) dτ.

Since the r.h.s. is the invariant Hilbert integral, it equals S(t, x, x0).

It remains to prove that S(t, x, x0) provides the value of It on the characteristic X(τ, x0, p0(t, x, x0)). For this purpose, it is enough to show that

P(τ, x0, p0)Ẋ(τ, x0, p0) − H(X(τ, x0, p0), P(τ, x0, p0))

equals L(X(τ, x0, p0), Ẋ(τ, x0, p0)), where p0 = p0(t, x, x0); in other words, that

P(τ, x0, p0)(∂H/∂p)(X(τ, x0, p0), P(τ, x0, p0)) − H(X(τ, x0, p0), P(τ, x0, p0)) ≥ q(∂H/∂p)(X(τ, x0, p0), P(τ, x0, p0)) − H(X(τ, x0, p0), q)

for all q. But this inequality is just the Weierstrass condition. □

Exercise 2.6.1. Let the field of characteristics exist for all x0 in some open set. Show the following further link between the field of extremals and the two-point function:

(∂S/∂x0)(t, x, x0) = −p0(t, x, x0).   (2.112)

Moreover, show that the function S(t, x, x0) as a function of t, x0 satisfies the Hamilton–Jacobi equation corresponding to the Hamiltonian H̃(x, p) = H(x, −p).

Exercise 2.6.2. Using equations (2.110) and (2.112) and assuming that the matrices (∂X/∂p0) are non-degenerate, derive the equations

(∂^2S/∂x^2)(t, x, x0) = (∂P/∂p0)(t, x0, p0) ((∂X/∂p0)(t, x0, p0))^{-1},   (2.113)

(∂^2S/∂x0^2)(t, x, x0) = ((∂X/∂p0)(t, x0, p0))^{-1} (∂X/∂x0)(t, x0, p0),   (2.114)

(∂^2S/∂x0∂x)(t, x, x0) = −((∂X/∂p0)(t, x0, p0))^{-1},

linking the derivatives of the solutions to Hamiltonian systems with the second derivatives of the two-point function. By (2.96), this result implies that, for non-degenerate quadratic Hamiltonians and small t,

(∂^2S/∂x^2)(t, x, x0) ∼ (1/t)G^{-1}(x0), (∂^2S/∂x0^2)(t, x, x0) ∼ (1/t)G^{-1}(x0),   (2.115)

so that S is convex in x and x0.


Using the asymptotic solutions to the boundary-value problems constructed in the previous section, one can derive the asymptotic representation of the two-point function.

Proposition 2.6.1. Under the assumptions of Proposition 2.5.1, the two-point function S(t, x, x0) can be expanded into the form

S(t, x, x0) = (1/2t)(x − x0 + A(x0)t, G(x0)^{-1}(x − x0 + A(x0)t)) + (1/t)(V(x0)t^2 + ∑_{j=3}^k Pj(t, x − x0) + O(c + t)^{k+1}),   (2.116)

where the Pj are again polynomials in t and x − x0 of degree j, and where the term that is quadratic in x − x0 is explicitly written down.

Proof. First, one needs to find the asymptotic expansion for the action σ(t, x0, p0) defined by (2.106). For a Hamiltonian of the form (2.82), one gets that σ(t, x0, p0) equals

∫_0^t [(1/2)(G(X(τ, x0, p0))P(τ, x0, p0), P(τ, x0, p0)) + V(X(τ, x0, p0))] dτ.

Using the expansions of Proposition 2.5.1, one obtains

σ(t, x0, p0) = (1/t)[(1/2)(p0t, G(x0)p0t) + V(x0)t^2 + ∑_{j=3}^k Pj(t, tp0) + O(c + t)^{k+1}],

where the Pj are polynomials of degree ≤ j in p0. Inserting the asymptotic expansion (2.98) for p0(t, x, x0) in this formula yields (2.116). □

Remark 35. Of course, one can calculate the coefficients of the expansion (2.116) directly from the Hamilton–Jacobi equation, without solving the boundary-value problem. However, the well-posedness of the boundary-value problem explains why the asymptotic expansion has such a form, and it justifies the formal calculation of its coefficients by means of, e.g., the method of undetermined coefficients.

Remark 36. Similarly (but with much more calculational effort), for the Hamiltonian (2.103) one finds that the main term of the small-time asymptotics of the two-point function has the form

S0(t, x, y, x0, y0) = (1/2t)(y − y0, g^{-1}(x0)(y − y0)) + (6/t^3)(g^{-1}(x0)z, z),

with

z = α^{-1}(x0)(x − x0) + (t/2)[y + y0 + 2α^{-1}(x0)a(x0)].

This expression is mostly known due to its appearance in the Green function of the so-called Kolmogorov diffusion.


2.7 Hamilton–Jacobi–Bellman equation and optimal control

Theorem 2.6.1 showed that the two-point function S(t, x, x0) satisfies the Hamilton–Jacobi equation (2.109) locally. From these local solutions, one can also derive the solutions to the corresponding Cauchy problem:

∂S/∂t + H(x, ∂S/∂x) = 0, S(0, x) = S0(x).   (2.117)

For this purpose, a different field of characteristics has to be employed. Namely, assuming that S0 is differentiable, let us consider the family of characteristics that exit from all points ξ ∈ R^n with the momentum p(ξ) = (∂S0/∂x)(ξ) and the corresponding mapping Y(t, ξ) = X(t, ξ, p(ξ)).

Proposition 2.7.1. Let H be a quadratic Hamiltonian of the type (2.82) with G, A, V ∈ C^2(R^n), and let S0 ∈ C^2(R^n). Then there exists t0 such that the mappings ξ ↦ Y(t, ξ) are diffeomorphisms R^n → R^n for all t ∈ [0, t0].

Proof. Let us denote ‖∇S‖ = sup_x |∂S0/∂x|. According to (2.86), if t0 is chosen such that

tan(Kt0)‖∇S‖ < 1,

then the solution (2.85) exists for all p0 = ∂S0/∂ξ, s ≤ t0, and is bounded in norm by

C = (‖∇S‖ + tan(Kt0)) / (1 − tan(Kt0)‖∇S‖).

Hence, by (2.86), for t ≤ t0,

|Y(t, ξ) − ξ| ≤ K(1 + C)t0.

Following the same arguments as in the proof of Lemma 2.5.1, we find that

(∂X/∂ξ)(s) = 1 + O(s), (∂X/∂p0)(s) = O(s),

so that

(∂Y/∂ξ)(s) = (∂X/∂ξ)(s) + (∂X/∂p0)(s)(∂^2S0/∂ξ^2) = 1 + O(s).

Consequently, for small enough t0 and t < t0, the determinant of (∂Y/∂ξ)(t) is uniformly bounded from below and above. Hence the mappings ξ ↦ Y(t, ξ) are local diffeomorphisms. Moreover, since

(Y(t, ξ1) − Y(t, ξ2), ξ1 − ξ2) = (ξ1 − ξ2, ∫_0^1 (∂Y/∂ξ)(t, ξ2 + h(ξ1 − ξ2))(ξ1 − ξ2) dh),


it follows that the mappings ξ ↦ Y(t, ξ) are injective. Finally, since Y(t, ξ) → ∞ as ξ → ∞, it follows that the image of Y(t, .) is both open and closed in R^n, and therefore coincides with R^n. □

Theorem 2.7.1. Under the assumptions of Proposition 2.7.1, the function

S(t, x) = [S0(ξ) + S(t, x, ξ)]|_{ξ=ξ(t,x)}   (2.118)

is the unique smooth solution to the Cauchy problem (2.117) for t ≤ t0, where ξ(t, x) is the inverse function to Y(t, ξ).

Proof. (i) Let us prove that S is a solution. Firstly,

(∂S/∂x)(t, x) = (∂S(t, x, ξ)/∂x)|_{ξ=ξ(t,x)} + (∂S0/∂ξ + ∂S(t, x, ξ)/∂ξ)|_{ξ=ξ(t,x)} (∂ξ/∂x).

The second term vanishes due to (2.112), which implies

(∂S/∂x)(t, x) = (∂S(t, x, ξ)/∂x)|_{ξ=ξ(t,x)}.

Next, we find

(∂S/∂t)(t, x) = (∂S(t, x, ξ)/∂t)|_{ξ=ξ(t,x)} + (∂S0/∂ξ + ∂S(t, x, ξ)/∂ξ)|_{ξ=ξ(t,x)} (∂ξ/∂t).

The second term vanishes due to (2.112). Hence

(∂S/∂t)(t, x) = −H(x, (∂S(t, x, ξ)/∂x)|_{ξ=ξ(t,x)}) = −H(x, (∂S/∂x)(t, x)),

as required.

(ii) Let g(t, x) be a smooth solution to the Cauchy problem (2.117) for t ≤ t0. The main point is to show that the spatial gradient of g coincides with the momentum on the corresponding characteristics:

(∂g/∂x)(t, X(t, ξ, p0 = ∂S0/∂ξ)) = P(t, ξ, p0 = ∂S0/∂ξ).   (2.119)

Fixing ξ, let f(t) and p(t) denote the l.h.s. and the r.h.s. of this equation, respectively. First of all, we find f(0) = p(0) = p0 = ∂S0/∂ξ, because g(0, x) = S0(x). Next, the Hamiltonian equations yield

ṗ(t) = −(∂H/∂x)(X(t, ξ, p0), p(t)).


Moreover, we find

ḟ(t) = (∂^2g/∂x∂t)(t, x)|_{x=X(t,ξ,p0)} + (∂^2g/∂x^2)(t, X(t, ξ, p0)) (∂H/∂p)(X(t, ξ, p0), p(t)).

Due to

(∂^2g/∂x∂t)(t, x) = −(∂/∂x)H(x, (∂g/∂x)(t, x)) = −(∂H/∂x)(x, (∂g/∂x)(t, x)) − (∂^2g/∂x^2)(t, x)(∂H/∂p)(x, (∂g/∂x)(t, x)),

it follows that

ḟ(t) = −(∂H/∂x)(x, f(t)) + (∂^2g/∂x^2)(t, x)[(∂H/∂p)(x, p(t)) − (∂H/∂p)(x, f(t))],

with x = X(t, ξ, p0). Since the equation for p(t) can be equivalently rewritten in the form

ṗ(t) = −(∂H/∂x)(x, p(t)) + (∂^2g/∂x^2)(t, x)[(∂H/∂p)(x, p(t)) − (∂H/∂p)(x, p(t))],

again with x = X(t, ξ, p0), the functions f(t) and p(t) satisfy the same ODE, and hence coincide. Consequently,

(∂g/∂x)(t, x) = (∂S/∂x)(t, x)

at all points t ≤ t0, x. Therefore, we also find

(∂g/∂t)(t, x) = −H(x, (∂g/∂x)(t, x)) = −H(x, (∂S/∂x)(t, x)) = (∂S/∂t)(t, x),

which implies g(t, x) = S(t, x). □

A very special class of Hamiltonians is given by functions that are linear with respect to one of their two variables. For instance, if H(x, p) = −(p, f(x)) is linear with respect to p, then pẋ = −pf(x) = H(x, p) and hence σ(t, x0, p0) = 0. Therefore, the two-point function S(t, x, x0) also vanishes. Moreover, the first equation of the Hamiltonian system, ẋ = ∂H/∂p = −f(x), does not depend on p, so that X(t, x0, p0) does not depend on p0 and is the solution to the ODE ẋ = −f(x) with the initial condition x0. The following fact is a specification of Theorem 2.7.1 for the case of a Hamiltonian that is linear in p.

Theorem 2.7.2. In the case H(x, p) = −(p, f(x)) with f ∈ C^1(R^d), the mapping ξ ↦ Y(t, ξ) from Proposition 2.7.1 is given by the solution to the ODE ẋ = −f(x) with the initial condition ξ and is a diffeomorphism for any t ∈ R. The solution (2.118) is globally defined (for all t ∈ R) and equals

S(t, x) = S0(ξ)|_{ξ=ξ(t,x)},   (2.120)

where ξ(t, x) is the inverse mapping to ξ ↦ Y(t, ξ), which is also given as the solution ξ(t, x) to the ODE ξ̇ = f(ξ) with the initial condition x.
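A one-dimensional illustration of Theorem 2.7.2 (with the concrete choice f(x) = x, which is our own example and not from the text): here ẋ = −x, so Y(t, ξ) = ξe^{−t}, the inverse mapping is ξ(t, x) = xe^{t} (the time-t solution of ξ̇ = ξ started at x), and (2.120) gives S(t, x) = S0(xe^{t}). Indeed, since H(x, p) = −px, equation (2.117) reads ∂S/∂t − x ∂S/∂x = 0, and

∂S/∂t − x ∂S/∂x = S0′(xe^{t})xe^{t} − x S0′(xe^{t})e^{t} = 0.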

In Section 2.10, we shall derive this important result for a more general infinite-dimensional setting in a different way, independent of the theory of Hamiltonian systems.

Unlike the case of Hamiltonians that are linear in p, the mapping ξ ↦ X(t, ξ, p0 = ∂S0/∂ξ) usually does not remain a diffeomorphism for all t, which implies that the classical (smooth) solutions to the Cauchy problem (2.117) fail to exist globally, and that one has to resort to some kind of generalized solutions.

Equation (2.115) implies that, for non-degenerate quadratic Hamiltonians, the solution (2.118) can be rewritten in the following insightful form:

S(t, x) = min_ξ [S0(ξ) + S(t, x, ξ)].   (2.121)

This has numerous interesting consequences. First, since this form does not depend on the derivatives, it can be used for defining generalized solutions by approximations, i.e., as a limit of sequences of classical solutions. Another approach can be derived from the observation that the mapping S0 ↦ St = S(t, .) given by (2.121) can be considered linear in the exotic (max,+)-structure (also referred to as tropical algebra), i.e., min(S0^1, S0^2) ↦ min(St^1, St^2). Therefore, the methods of linear equations can be applied in this case and one can define generalized solutions by duality in the sense of the corresponding (max,+)-'generalized functions'. Finally, the representation (2.121) is convenient for developing the methods of viscosity solutions. References for these developments are provided in Section 2.19.

In the theory of optimization, one of the basic equations is the so-called Bellman equation. It has the form

∂S/∂t + sup_{u∈U} [(g(x, u), ∂S/∂x) − J(x, u)] = 0,   (2.122)

with some functions g(x, u), J(x, u) and the parameter u taken from some set U. This equation is nothing but the Hamilton–Jacobi equation (2.109) with the specific Hamiltonian

H(x, p) = sup_{u∈U} [(g(x, u), p) − J(x, u)].   (2.123)

Therefore, the equations (2.109) are often referred to as the Hamilton–Jacobi–Bellman equations, or shortly HJB-equations.
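A small example of the representation (2.123) (the concrete data are our own illustration, not from the text): take U = [−1, 1] ⊂ R, g(x, u) = u and J ≡ 0; then

H(x, p) = sup_{|u|≤1} up = |p|,

a convex but non-smooth Hamiltonian. Conversely, being a supremum of functions that are affine in p, any Hamiltonian of the form (2.123) is automatically convex in p, which is the observation behind Remark 37 below.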


Remark 37. Notice that any function H(x, p) that is convex in p can be written in the form (2.123). Therefore, the Bellman equations and the Hamilton–Jacobi equations (with a convex Hamiltonian) represent in fact the same class of equations.

Remark 38. For the sake of completeness, let us recall the basic heuristic derivation of the Bellman equation (2.122). Assume that an agent has a position x ∈ R^d at the time t. Also assume that the position will be moved to a new position x(t + τ) during a small time τ according to the ODE ẋ = g(x, u), where the 'control parameter' u can be chosen by the agent from some given set U. Moreover, the agent has to pay a charge of J(x, u) per unit of time during the transition. Furthermore, let us assume that at the next time t + τ the agent can choose another u for the next transition during the time interval [t + τ, t + 2τ], and so on, until the terminal time T is reached, where the agent receives the award VT(x(T)) that depends on the final position. Then for the total optimal payoff S(t, x) of the agent starting at x at the time t, we can write the following approximate equation (in the first order for small τ):

S(t, x) = sup_{u∈U} [S(t + τ, x + g(x, u)τ) − J(x, u)τ].

Expanding S by the first-order Taylor expansion yields the approximation

S(t, x) = sup_{u∈U} [S(t, x) + (∂S/∂t)(t, x)τ + (∂S/∂x)(t, x)g(x, u)τ − J(x, u)τ].

Cancelling S(t, x) from both sides yields (2.122).
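A minimal numerical sketch of this dynamic-programming recursion (our own toy data, not from the text): one spatial dimension, g(x, u) = u with U a finite grid in [−1, 1], running cost J(x, u) = (x^2 + u^2)/2, terminal award VT(x) = −x^2, discretized exactly as in the display above.

    import numpy as np

    # toy data (illustrative choices, not from the text)
    T, tau = 1.0, 0.01
    xs = np.linspace(-2.0, 2.0, 201)            # spatial grid
    us = np.linspace(-1.0, 1.0, 21)             # finite control set U
    J = lambda x, u: 0.5 * (x**2 + u**2)        # running cost per unit time
    S = -xs**2                                   # terminal award V_T(x)

    for _ in range(int(T / tau)):                # backward in time from T to 0
        # payoff of moving with velocity u for time tau:
        # S(t+tau, x + g(x,u) tau) - J(x,u) tau, then sup over u in U
        candidates = [np.interp(xs + u * tau, xs, S) - J(xs, u) * tau for u in us]
        S = np.max(candidates, axis=0)

    print("approximate optimal payoff S(0, 0) =", S[np.argmin(np.abs(xs))])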

Equation (2.122) arises from a control problem where an agent can control the velocity of its movement. A different class of Bellman equations arises when the motion of an agent is not smooth, but subject to possible jumps. Namely, if the number of allowed jumps from a position x ∈ R^d is finite and if the jumps are given by some functions x ↦ y1(x), . . . , ym(x) with the intensities u_j ν_j(x) controlled by an agent via the control parameters u = (u1, . . . , um) ∈ U, the equation for the optimal payoff takes the form

∂S/∂t + sup_{u∈U} [∑_{j=1}^m u_j ν_j(x)(S(t, y_j(x)) − S(t, x)) − J(x, u)] = 0,   (2.124)

which is called the Bellman equation for controlled jump-processes.

Remark 39. The heuristic derivation of Remark 38 is modified for the jump-type dynamics as follows: Saying that possible jumps x ↦ y1(x), . . . , ym(x) occur with the rates u_j ν_j(x) per unit time means that the probability of having a jump in a small time τ equals approximately R = τ ∑_j u_j ν_j(x), and the probability to have the jump x → y_j(x) (when a jump occurs) is u_j ν_j(x)/R. Under this assumption, the approximate equation for the optimal payoff becomes

S(t, x) = sup_{u∈U} [τ ∑_{j=1}^m u_j ν_j(x) S(t + τ, y_j(x)) − J(x, u)τ + (1 − τ ∑_{j=1}^m u_j ν_j(x)) S(t + τ, x)].

Expanding the change in small τ into a Taylor series yields (2.124).

More generally, if jumps are distributed over R^d with some controlled intensities ν(u, x, dy) (given by a transition kernel in R^d that depends on u ∈ U as a parameter), one gets the more general Bellman equation for controlled jump-processes in the form

∂S/∂t + sup_{u∈U} [∫ (S(t, y) − S(t, x)) ν(u, x, dy) − J(x, u)] = 0.   (2.125)

From the very essence of optimal control problems, it is seen that the natural problem for Bellman equations is the backward Cauchy problem, where the solutions of (2.124) or (2.125) are sought for times t ≤ T under the additional terminal condition S(T, x) = VT(x) with a given VT(x).

Remark 40. The same backward problem is natural in stochastic analysis when one is concerned with the value of some payoff (say, a financial obligation) at the time t, where the payoff value depends on the position of the process at the future time T > t. This remark provides an additional motivation for the study of backward problems (and thus the backward propagators of Chapter 4).

The equations (2.125) may be regarded as being simpler than the HJB-equation (2.122), since they are not PDEs, but ODEs. However, they illustrate the importance of the abstract Banach-space setting for ODEs, because equation (2.125) cannot be written in the form ẋ = F(t, x) for a function F(t, x) in any finite-dimensional Euclidean space. On the other hand, equation (2.125) is an equation of the type Ṡ + F(t, S) = 0, with S being an element of the functional Banach space C(R^d). Hence, we get the following well-posedness result for the Bellman equation for jump processes as a direct application of Theorems 2.2.1 and 2.2.2 or of Proposition 2.2.1.

Theorem 2.7.3.

(i) Let the transition kernels ν be weakly continuous (for any u) and uniformly bounded, i.e.,

sup_{u∈U} sup_{x∈R^d} ∫ ν(u, x, dy) < ∞.

Then the backward Cauchy problem for equation (2.125) is well posed in C(R^d), i.e., for any VT ∈ C(R^d) there exists a unique curve St in C(R^d), t ≤ T, such that (2.125) holds and S(T, x) = VT(x). (Note that the derivative in t is defined with respect to the Banach topology of C(R^d).) Moreover, the solution depends Lipschitz-continuously on the terminal value VT.

(ii) If additionally ν = ν^α depends Lipschitz-continuously on a parameter α from a Banach space B1, so that

‖ν^α(u, x, dy) − ν^β(u, x, dy)‖ ≤ κ‖α − β‖,

then the solution St also depends Lipschitz-continuously on α.
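A minimal sketch of how (2.125) becomes an ODE in a function space, here degenerated to a finite state space so that C(R^d) is replaced by R^3 (all data below are our own illustrative assumptions, not from the text): three states, two controls, jump intensity matrices ν(u), running cost J, terminal value VT, integrated backward from T by an explicit Euler scheme.

    import numpy as np

    # illustrative data: 3 states, 2 controls; nu[u][x, y] = jump intensity x -> y
    nu = {0: np.array([[0., 1., 0.], [0., 0., 2.], [1., 0., 0.]]),
          1: np.array([[0., 0., 3.], [1., 0., 0.], [0., 2., 0.]])}
    J = {0: np.array([0.1, 0.1, 0.1]), 1: np.array([0.5, 0.5, 0.5])}   # running cost J(x, u)
    V_T = np.array([1.0, 0.0, -1.0])                                   # terminal values
    T, dt = 1.0, 0.001

    def rhs(S):
        # sup over u of  sum_y (S(y) - S(x)) nu(u, x, dy)  -  J(x, u)
        vals = [nu[u] @ S - nu[u].sum(axis=1) * S - J[u] for u in nu]
        return np.max(vals, axis=0)

    S = V_T.copy()
    for _ in range(int(T / dt)):
        # equation (2.125) gives dS/dt = -rhs(S); stepping backward from T to 0
        S = S + dt * rhs(S)
    print("value at time 0:", S)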

We shall pick up the story of the jump-type Bellman equations in Chapter 7 in the context of forward-backward systems.

2.8 Sensitivity of integral equations

We continue the development of the general theory of ODEs by addressing the issue of sensitivity, or smooth dependence of the solutions on the initial data or parameters. In order to present sensitivity results for various equations in a unified way, we shall discuss here the smooth dependence of the fixed points on initial data and parameters for a special class of integral mappings Φ, namely for the operators ΦY in C([τ, t], B) of the form

[ΦY,τ(μ.)](t) = Gt,τ Y + ∫_τ^t Ωt,s(μs) ds,   (2.126)

where Gt,s and Ωt,s, t > s, are families of linear operators respectively (possibly) nonlinear mappings in B, such that Gt,t is the identity. As usual, we write ‖.‖ for the norm ‖.‖B. In this section, we shall deal with uniformly bounded Ω. Later on, we shall discuss an extension for Ω having a singularity at t = s.

Therefore, assuming the existence of the unique fixed points μt,τ(Y) ∈ C([τ, T], B) of these mappings for any Y ∈ B, T > τ, we are interested in the derivatives

ξt = ξt,τ(Y)[ξ] = Dμt,τ(Y)[ξ]   (2.127)

in some direction ξ, where D denotes the derivative with respect to the argument from B (for fixed t, τ) throughout this section. Also, BR denotes the ball of radius R in B centered at zero.

As an important hint for obtaining these derivatives, we note that if they are well defined, then they satisfy the equation

ξt = Gt,τ ξ + ∫_τ^t DΩt,s(μs,τ)[ξs] ds,   (2.128)

which is obtained by differentiation from the fixed-point equation

μt,τ = Gt,τ Y + ∫_τ^t Ωt,s(μs,τ) ds.   (2.129)


Theorem 2.8.1. Let Gt,s and Ωt,s depend continuously on t and be measurable in s. Let equation (2.129) be well posed in the following sense: for any T, R > 0 there exists MT(R) > 0 such that for any Y ∈ BR, τ ∈ [0, T], there exists a unique solution to equation (2.129) for t ∈ [τ, T] such that

‖μ.‖C([τ,T],B) ≤ MT(R).

Let Gt,t be the identity operator and

‖Gt,τ‖B→B ≤ G,   (2.130)

with some constant G ≥ 1. Moreover, let Ωt,s ∈ C^1_uc(BR, B) uniformly in t, s for any R. In other words, for any ε, R, there exist L(R) and δ = δ(R, ε) such that

‖DΩt,s(μ1)‖B→B ≤ L(R), ‖DΩt,s(μ1) − DΩt,s(μ2)‖B→B ≤ ε,   (2.131)

for all t, s, and any μ1, μ2 such that μ1 ∈ BR, ‖μ1 − μ2‖ ≤ δ.

Then the mapping μτ = Y ↦ μt,τ(Y) belongs to C^1_uc(BR, B) for all t, ξt = Dμt,τ(Y)[ξ] represents the unique solution to equation (2.128), and

‖ξt‖ ≤ G e^{(t−τ)L(MT(R))}‖ξ‖, 0 ≤ τ ≤ t ≤ T.   (2.132)

Proof. First of all, subtracting the equations for μt,τ(Y1) and μt,τ(Y2) yields

‖μt,τ(Y1) − μt,τ(Y2)‖ ≤ G‖Y1 − Y2‖ + ∫_τ^t L(MT(R))‖μs,τ(Y1) − μs,τ(Y2)‖ ds,

which implies

‖μt,τ(Y1) − μt,τ(Y2)‖ ≤ G e^{(t−τ)L(MT(R))}‖Y1 − Y2‖, Y1, Y2 ∈ BR,   (2.133)

by Gronwall's lemma.

Next, (2.52) and (2.53) ensure that the Cauchy problem for the linear equation (2.128) is well posed for any ξ, and that it defines a unique ξt that satisfies (2.132). In order to show that this solution defines the derivative Dμt,τ(Y)[ξ], we have to prove that ‖φt‖ ≤ ε‖ξ‖ for any ε and ξ whenever ‖ξ‖ ≤ δ with sufficiently small δ, where

φt = μt,τ(Y + ξ) − μt,τ(Y) − ξt.

For this, the idea is to find the equation for φ and then estimate its solution. Subtracting from the integral equation for μt,τ(Y + ξ) the integral equations for μt,τ(Y) and ξt yields

φt = ∫_τ^t [Ωt,s(μs,τ(Y + ξ)) − Ωt,s(μs,τ(Y)) − DΩt,s(μs,τ(Y))[ξs]] ds,


and adding and subtracting DΩt,s(μs,τ(Y))[μs,τ(Y + ξ) − μs,τ(Y)] leads to

φt = ∫_τ^t DΩt,s(μs,τ(Y))[φs] ds + ∫_τ^t (Ωt,s(μs,τ(Y + ξ)) − Ωt,s(μs,τ(Y)) − DΩt,s(μs,τ(Y))[μs,τ(Y + ξ) − μs,τ(Y)]) ds.   (2.134)

For any ε, let us choose δ ≤ 1 such that (1.58) holds for F = Ωt,s, that is,

‖Ωt,s(Z + ξ) − Ωt,s(Z) − DΩt,s(Z)[ξ]‖ ≤ ε‖ξ‖ for ‖ξ‖ ≤ δ, ‖Z‖ ≤ MT(R + 1).

By (2.133), if ‖ξ‖ ≤ G^{-1} e^{−(T−τ)L(MT(R))} δ, then

‖μt,τ(Y + ξ) − μt,τ(Y)‖ ≤ G e^{(t−τ)L(MT(R))}‖ξ‖ ≤ δ,

so that

‖Ωt,s(μs,τ(Y + ξ)) − Ωt,s(μs,τ(Y)) − DΩt,s(μs,τ(Y))[μs,τ(Y + ξ) − μs,τ(Y)]‖ ≤ ε‖μt,τ(Y + ξ) − μt,τ(Y)‖ ≤ εG e^{(t−τ)L(MT(R))}‖ξ‖.   (2.135)

Consequently, (2.134) implies

‖φt‖ ≤ ∫_τ^t L(MT(R))‖φs‖ ds + ε(t − τ)G e^{(t−τ)L(MT(R))}‖ξ‖.

By Gronwall's lemma, this implies

sup_{s∈[τ,t]} ‖φs‖ ≤ ε(t − τ)G e^{2(t−τ)L(MT(R))}‖ξ‖,   (2.136)

thus showing that ξt = Dμt,τ(Y)[ξ].

It remains to show that ξt = Dμt,τ(Y)[ξ] is a uniformly continuous function of Y ∈ BR. Let ξ^1_t = Dμt,τ(Y1)[ξ] and ξ^2_t = Dμt,τ(Y2)[ξ]. Subtracting the respective equations (2.128) for ξ^1_t and ξ^2_t yields

‖ξ^1_t − ξ^2_t‖ ≤ ∫_τ^t ‖DΩt,s(μs,τ(Y1))[ξ^1_s − ξ^2_s]‖ ds + ∫_τ^t ‖(DΩt,s(μs,τ(Y1)) − DΩt,s(μs,τ(Y2)))[ξ^2_s]‖ ds,

and thus

‖ξ^1_t − ξ^2_t‖ ≤ L(MT(R)) ∫_τ^t ‖ξ^1_s − ξ^2_s‖ ds + (t − τ)G e^{(t−τ)L(MT(R))}‖ξ‖ sup_{s∈[τ,t]} ‖DΩt,s(μs,τ(Y1)) − DΩt,s(μs,τ(Y2))‖.   (2.137)


The last term can be made arbitrarily small for small ‖Y1 − Y2‖, because Ωt,s ∈ C^1_uc(BR, B). Again by Gronwall's lemma, ‖ξ^1_t − ξ^2_t‖ can therefore be made arbitrarily small for small Y1 − Y2, as required. □

The next statement provides an alternative proof of Theorem 2.8.1 under a slightly stronger assumption: Lipschitz continuity of DΩ rather than simple continuity. The idea is to show that the derivative ξt can be obtained as the limit of the derivatives ξ^n_t = D[Φ^n_{Y,τ}](t)[ξ] of the approximations that are used in the proof of Theorem 2.1.1 in order to obtain a fixed point for the mapping Φ given by (2.126).

From the recursion formula for Φ^n_t and Proposition 1.5.1, it follows that all ξ^n_t are well defined and satisfy the recursion formula

ξ^n_t = Gt,τ ξ + ∫_τ^t DΩt,s([Φ^{n−1}_{Y,τ}](s))[ξ^{n−1}_s] ds,   (2.138)

whenever Ωt,s is differentiable.

Theorem 2.8.2. Under the assumptions of Theorem 2.8.1, let us additionally suppose that Ωt,s ∈ C^1_bLip(BR, B) uniformly in t, s for any R, so that (2.131) is improved to

‖DΩt,s(μ1)‖B→B ≤ L(R), ‖DΩt,s(μ1) − DΩt,s(μ2)‖B→B ≤ LD(R)‖μ1 − μ2‖,   (2.139)

for all μ1, μ2 ∈ BR and some constants L(R) and LD(R).

Then the mapping μτ = Y ↦ μt,τ(Y) belongs to C^1_bLip(BR, B) for all t, τ. Moreover, ξt = Dμt,τ(Y)[ξ] is the unique solution to equation (2.128). It satisfies (2.132) and is the limit of the approximations (2.138).

Proof. First of all, it follows from (2.130) and (2.139) that

‖[ΦY,τ(μ^1.)](t) − [ΦY,τ(μ^2.)](t)‖ ≤ L(MT(R)) ∫_τ^t ‖μ^1. − μ^2.‖C([τ,s],B) ds,
‖[ΦY1,τ(μ.)](t) − [ΦY2,τ(μ.)](t)‖ ≤ G‖Y1 − Y2‖.   (2.140)

By Theorem 2.1.1, this implies the convergence of the approximations Φ^n_Y to the unique fixed point μt,τ(Y) of ΦY. The well-posedness of the linear equation (2.128) as well as the estimate (2.132) follow as in Theorem 2.8.1.

In order to show that the solution ξt to (2.128) yields the derivative of μt,τ(Y), let us show that the derivatives ξ^n_t of the approximations converge, where ξ^0_t = ξ. This would imply that the limit ξt is precisely Dμt,τ(Y)[ξ] and satisfies equation (2.145). In fact, if a sequence of functions converges uniformly, and if the sequence of their derivatives converges uniformly as well, then the limit of the sequence of the derivatives coincides with the derivative of the limit.


From (2.138), it follows that

‖ξ^n_t‖ ≤ G‖ξ‖ + ∫_τ^t L(MT(R))‖ξ^{n−1}_s‖ ds,

which implies by Lemma 9.1.1 that all approximations are bounded by (2.132). In order to prove that the ξ^n converge, we derive from (2.138) that

‖ξ^{n+1}_t − ξ^n_t‖ ≤ ∫_τ^t ‖DΩt,s([Φ^n_{Y,τ}(μ.)](s))‖B→B ‖ξ^n_s − ξ^{n−1}_s‖ ds + ∫_τ^t ‖DΩt,s([Φ^n_{Y,τ}(μ.)](s)) − DΩt,s([Φ^{n−1}_{Y,τ}(μ.)](s))‖B→B ‖ξ^{n−1}_s‖ ds.

For estimating the last term, we can use (2.4), which yields

‖[Φ^n_{Y,τ}(μ.)] − [Φ^{n−1}_{Y,τ}(μ.)]‖C([τ,t],B) ≤ ((t − τ)^{n−1} L^{n−1}(MT(R)) / (n − 1)!) ‖Φ^1_Y(Y) − Y‖C([τ,t],B).

Therefore, taking into account (2.139), we get

‖ξ^{n+1}_t − ξ^n_t‖ ≤ L(MT(R)) ∫_τ^t ‖ξ^n_s − ξ^{n−1}_s‖ ds + ‖ξ‖G exp{(t − τ)L(MT(R))} ((t − τ)^n L^{n−1}(MT(R)) LD(MT(R)) / n!) ‖ΦY(Y) − Y‖C([τ,t],B).

Consequently, we find

‖ξ^{n+1}_t − ξ^n_t‖ ≤ L^2(MT(R)) ∫_τ^t ds ∫_τ^s ‖ξ^{n−1}_{s1} − ξ^{n−2}_{s1}‖ ds1 + 2‖ξ‖G exp{(t − τ)L(MT(R))} ((t − τ)^n L^{n−1}(MT(R)) LD(MT(R)) / n!) ‖ΦY(Y) − Y‖C([τ,t],B).

By induction, it follows that

‖ξ^{n+1}_t − ξ^n_t‖ ≤ ‖ξ^1_t − ξ^0‖ ((L(t − τ))^n / n!) + n^2 ‖ξ‖G exp{(t − τ)L(MT(R))} ((t − τ)^n L^{n−1}(MT(R)) LD(MT(R)) / n!) ‖ΦY(Y) − Y‖C([τ,t],B).   (2.141)

Therefore, the sequence ξ^n_t converges in C([τ, T], B). □


As was mentioned in the comments after equation (2.16), the problem of sensitivity with respect to a parameter can be reduced to the problem of sensitivity with respect to initial data. So let us formulate the obtained result directly via this approach. For that purpose, we shall work with real parameters, since one can always reduce the general case to this particular one by using directional derivatives in the space of vector-valued parameters.

Instead of looking for fixed points of the single integral operator (2.126), let us now consider the family of such equations

[ΦY,τ;α(μ.)](t) = G^α_{t,τ} Y + ∫_τ^t Ω^α_{t,s}(μs) ds,   (2.142)

where G^α_{t,s} and Ω^α_{t,s}, t ≥ s, are families of linear operators respectively (possibly) nonlinear mappings in B that depend on a real parameter α. Differentiating this equation for the fixed point μ^α_{t,τ}(Y), we find that if the derivative βt = ∂μ^α_{t,τ}(Y)/∂α is well defined, then it satisfies the equation

βt = (∂G^α_{t,τ}/∂α) Y + ∫_τ^t ((∂Ω^α_{t,s}(μs,τ)/∂α) + DΩ^α_{t,s}(μs,τ)[βs]) ds.   (2.143)

)ds. (2.143)

Theorem 2.8.3. Assume that G^α_{t,s} and Ω^α_{t,s} satisfy all the assumptions of Theorem 2.8.1, with all estimates being uniform in α. Moreover, let Ω^α_{t,s}(μ) ∈ C^1_uc(R × BR, B) as a function of (α, μ) and G^α_{t,s} ∈ C^1_uc(R, L(B, B)) as a function of α, uniformly in t, s for any R. Then the mapping μτ = Y ↦ μ^α_{t,τ}(Y) belongs to C^1_uc(R × BR, B) for all t, τ, and βt = ∂μ^α_{t,τ}(Y)/∂α is the unique solution to equation (2.143).

Higher-order derivatives of the fixed points of the operators (2.126) with respect to initial data can be derived analogously. However, it is easier to obtain them by looking at the differentiability of equation (2.128) with respect to the parameter μτ. Looking at the directional derivative

ηt = D^2 μt,τ(Y)[ξ^1, ξ^2] = D(ξt,τ(Y)[ξ^1])[ξ^2] = (d/dh)|_{h=0} ξt,τ(Y + hξ^2)[ξ^1],

differentiating (2.128) gives us the equation

ηt = ∫_τ^t DΩt,s(μs,τ)[ηs] ds + ∫_τ^t D^2 Ωt,s(μs,τ)[ξ^1_s, ξ^2_s] ds,   (2.144)

where ξ^j_t = Dμt,τ(Y)[ξ^j]. As a consequence of Theorem 2.8.3, we get the following.

Theorem 2.8.4. Under the assumptions of Theorem 2.8.1, let Ωt,s ∈ C^2_uc(BR, B) for all R. Then the mapping μτ = Y ↦ μt,τ(Y) belongs to C^2_uc(BR, B) for all t, τ, and ηt = D^2 μt,τ(Y)[ξ^1, ξ^2] is the unique solution to equation (2.144).


2.9 ODEs in Banach spaces: sensitivity

As a consequence of Theorems 2.8.1 and 2.8.2, we get the following result on the sensitivity of ODEs.

Theorem 2.9.1. Let F ∈ C^1_luc(B, B) for a Banach space B (and consequently the assumptions of Theorem 2.2.1 hold). Then the mapping μ0 = Y ↦ μt(Y) that is constructed in Theorem 2.2.1 belongs to C^1_luc(B, B) for all t, and ξt = Dμt(Y)[ξ] is the unique solution to the linear equation

ξt = ξ + ∫_0^t DF(μs)[ξs] ds ⟺ ξ̇t = DF(μt)[ξt] and ξ0 = ξ,   (2.145)

with the initial condition ξ0 = ξ. Moreover,

‖ξt‖ ≤ e^{|t|L}‖ξ‖ ≤ exp{|t| ‖F‖C^1(B)}‖ξ‖.   (2.146)

If additionally F ∈ C^1_uc(B, B) or F ∈ C^1_bLip(B, B), then the mapping μ0 = Y ↦ μt(Y) belongs to C^1_uc(B, B) or C^1_bLip(B, B), respectively.

Proof. We apply Theorems 2.8.1 and 2.8.2 with Ωt,s = F. Moreover, the result for t < 0 is obtained by changing the variable t to −t. □
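A small numerical sketch of Theorem 2.9.1 in B = R^2 (the vector field and step sizes are our own illustrative choices): solve μ̇ = F(μ) together with the variational equation ξ̇t = DF(μt)[ξt] from (2.145), and compare ξt with a finite-difference quotient of the flow.

    import numpy as np

    def F(m):
        # an illustrative smooth vector field on R^2
        return np.array([np.sin(m[1]), -m[0] + 0.1 * m[1]**2])

    def DF(m):
        # its Jacobian matrix
        return np.array([[0.0, np.cos(m[1])],
                         [-1.0, 0.2 * m[1]]])

    def flow(Y, t, n=10000):
        m, dt = np.array(Y, dtype=float), t / n
        for _ in range(n):
            m = m + dt * F(m)
        return m

    def flow_with_sensitivity(Y, xi, t, n=10000):
        # integrate the ODE together with the linear equation (2.145) for xi_t
        m, x, dt = np.array(Y, dtype=float), np.array(xi, dtype=float), t / n
        for _ in range(n):
            m, x = m + dt * F(m), x + dt * DF(m) @ x
        return m, x

    Y, xi, t, h = np.array([0.3, -0.2]), np.array([1.0, 0.0]), 1.0, 1e-6
    _, xi_t = flow_with_sensitivity(Y, xi, t)
    fd = (flow(Y + h * xi, t) - flow(Y, t)) / h   # finite-difference approximation of D mu_t(Y)[xi]
    print("variational equation:", xi_t, " finite differences:", fd)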

Similarly, the following result is a consequence of Theorem 2.8.4.

Theorem 2.9.2. Under the assumption of Theorem 2.2.1, let us additionally assume that F ∈ C^2_uc(BR, B) for any R. Then the mapping Y ↦ μt(Y) belongs to C^2_uc(BR, B) for all t, and ηt = D^2 μt(Y)[ξ^1, ξ^2] is the unique solution to the linear equation

η̇t = D^2 F(μt)[ξ^1_t, ξ^2_t] + DF(μt)[ηt],   (2.147)

with the initial condition η0 = 0, where ξ^i_t = Dμt(Y)[ξ^i], i = 1, 2. If F ∈ C^2_bLip(BR, B), then the mapping Y ↦ μt(Y) belongs to C^2_bLip(B, B).

The key property of the differential Dμt(Y) is that it transfers the vector field F(Y) = (d/dt)μt(Y)|_{t=0} along its integral curves μt:

Proposition 2.9.1. Under the assumptions of Theorem 2.9.1,

F(μt(Y)) = Dμt(Y)[F(Y)]   (2.148)

for any t and Y. Moreover, μt(Y) ∈ C^1(K × B) as a function of two variables for any bounded interval K of R.

Proof. The function ξt = F(μt(Y)) satisfies the equation

ξ̇t = DF(μ)|_{μ=μt(Y)}[ξt],

with the initial condition ξ0 = F(Y). On the other hand, zt = Dμt(Y)[F(Y)] has the same initial data and satisfies the same equation, because

żt = D(F ∘ μt)(Y)[F(Y)] = DF(μ)|_{μ=μt(Y)}[Dμt(Y)[F(Y)]]

by the chain rule. Therefore, the uniqueness of the solution to (2.145) yields ξt = Dμt(Y)[F(Y)], as required. The last statement follows from Proposition 1.5.2. □

In the time-dependent case with F(t, μt) and the corresponding solutions μt,s(Y), we have

F(t, Y) = (∂μt,s(Y)/∂t)|_{t=s} = −(∂μt,s(Y)/∂s)|_{s=t},

and the natural question arises as to which of these derivatives is transferred along the solutions by the differential Dμt,s(Y).

Proposition 2.9.2. Under the assumptions of Theorem 2.2.2 and assuming that F(t, Y, α) ∈ C^1_luc(B, B) as a function of Y uniformly in t and α, it follows that

∂μt,s(Y, α)/∂s = −Dμt,s(Y, α)[F(s, Y, α)].   (2.149)

Proof. Since

μt,s(Y, α) = Y + ∫_s^t F(τ, μτ,s(Y, α), α) dτ,

it follows that ηt,s = ∂μt,s(Y, α)/∂s satisfies the equation

ηt,s = −F(s, Y, α) + ∫_s^t DF(τ, μτ,s(Y, α), α)[ητ,s] dτ,

and consequently

(d/dt)ηt,s = DF(t, μ, α)|_{μ=μt,s(Y,α)}[ηt,s].

By the chain rule, the same equation is satisfied by zt = Dμt,s(Y, α)[F(s, Y, α)], which implies (2.149). □

Exercise 2.9.1. Under the assumptions of Proposition 2.9.2, write down the equation for ∂μt,s(Y, α)/∂α.

Exercise 2.9.2. Prove that under the assumptions of Theorem 2.8.3, equation (2.148) generalizes to

F(t, μt) = Dμt,s(Y)[F(s, Y)] + ∫_s^t Dμt,τ(μτ,s(Y))[(∂F/∂τ)(τ, μτ,s(Y))] dτ.   (2.150)


Exercise 2.9.3. Prove the analogue of Theorem 2.9.1 for Gateaux derivatives: Under the assumption of Theorem 2.2.1 and additionally assuming F ∈ C^1_Gat(B, B), it follows that the mapping μ0 = Y ↦ μt(Y) belongs to C^1_Gat(B, B) for all t, and ξt = Dμt(Y)[ξ] is still the unique solution to (2.145). Hint: Inequality (1.20) extends to Y and ξ from any compact set, and the set of solutions μs(μ0 + hη), s ≤ t, h ≤ 1, is compact for a given η (as the image of a continuous mapping of the square).

The next exercise provides more concrete estimates for the derivatives with respect to initial data in the case of R^d. Exactly the same estimates can also be proved for the equations in l^1.

Exercise 2.9.4.

(i) Let g ∈ C^1(R^d), and let X^x(t) = {X^x_j(t)} be the solution to the equation ẋ = g(x) in R^d. Then X^x(t) ∈ C^1(R^d) as a function of x and

sup_{j,x} ∑_k |∂X^x_k(t)/∂x_j| ≤ exp{t sup_{k,x} |∂g(x)/∂x_k|}.   (2.151)

Moreover, if f ∈ C^1(R^d), then

sup_{j,x} |(∂/∂x_j) f(X^x(t))| ≤ sup_{j,x} |(∂/∂x_j) f(x)| exp{t sup_{k,x} |∂g(x)/∂x_k|}.   (2.152)

(ii) Let g ∈ C^2(R^d). Then X^x(t) ∈ C^2(R^d) as a function of x and

sup_{j,i,x} ∑_k |∂^2 X^x_k(t)/∂x_i∂x_j| ≤ t sup_{j,i,x} |∂^2 g(x)/∂x_j∂x_i| exp{3t sup_{j,x} |∂g(x)/∂x_j|}.   (2.153)

Moreover, if f ∈ C^2(R^d), then

sup_{j,i,x} |(∂^2/∂x_j∂x_i) f(X^x(t))| ≤ (sup_{j,i,x} |(∂^2/∂x_j∂x_i) f(x)| + t sup_{k,x} |∂f(x)/∂x_k| sup_{j,i,x} |∂^2 g(x)/∂x_j∂x_i|) exp{3t sup_{k,x} |∂g(x)/∂x_k|}.   (2.154)

2.10 Linear first-order partial differential equations

A standard method of solving partial differential equations of first order is based on obtaining the solutions in terms of the solutions of certain ODEs called the characteristics of the original equations. In Sections 2.6 and 2.7, we developed this method for finite-dimensional Hamiltonian systems. Now, we show how this method works in the infinite-dimensional setting, although we restrict our attention to linear equations.

We start with the simplest case of time-homogeneous equations.


Theorem 2.10.1. Under the assumptions of Theorem 2.9.1, let S ∈ C^1(B). Then the function G(t, Y) = S(μt(Y)) is the unique solution in C^1(R × B) to the linear partial differential equation

(∂G/∂t)(t, Y) = DG(t, Y)[F(Y)]   (2.155)

with the initial condition G(0, Y) = S(Y), where D denotes the derivative with respect to the second variable of G. The curves μt(Y) are referred to as the characteristics of the partial differential equation (2.155).

Proof. Let G(t, Y) = S(μt(Y)). Then G ∈ C^1(R × B) by Propositions 2.9.1 and 1.5.1. Moreover, by (2.148) and Proposition 1.5.1, we find

(∂G/∂t)(t, Y) = DS(μ)|_{μ=μt(Y)}[F(μt)] = DS(μ)|_{μ=μt(Y)}[Dμt(Y)[F(Y)]] = D(S ∘ μt)(Y)[F(Y)] = DG(t, Y)[F(Y)],

therefore G is in fact a solution.

In order to prove the uniqueness, let us assume that g(t, Y) is another solution to (2.155) from C^1(R × B). Let us introduce the function φ(t, Y) = g(t, μ_{−t}(Y)). Then this function does not depend on time, since

(∂φ/∂t)(t, Y) = (∂g/∂t)(t, μ_{−t}(Y)) − Dg(t, μ_{−t}(Y))[F(μ_{−t}(Y))] = 0.

Consequently, g(t, Y) = g(0, μt(Y)), i.e., g is a function of μt(Y). Therefore it coincides with S(μt(Y)). □
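A finite-dimensional illustration (our own example, with B = R^d and a linear vector field F(Y) = AY for a d × d matrix A): the characteristics are μt(Y) = e^{tA}Y, and G(t, Y) = S(e^{tA}Y) indeed satisfies (2.155), since

(∂G/∂t)(t, Y) = DS(e^{tA}Y)[A e^{tA}Y] = DG(t, Y)[AY],

where the last equality uses DG(t, Y)[ξ] = DS(e^{tA}Y)[e^{tA}ξ] and the fact that e^{tA} commutes with A.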

Let us now turn to the time-dependent case. Extrapolating the previous result to the dynamics of functions that arise from the evolution Y ↦ μt,s(Y) of Theorem 2.2.2 (omitting the irrelevant dependence on a parameter α), one could expect a function of the type G(t, Y) = S(μt,s(Y)) to satisfy the equation ∂G/∂t = DG(t, Y)[Ft(Y)]. The correct result, however, is different.

Theorem 2.10.2. Under the assumptions of Theorem 2.2.2 (omitting the irrelevant dependence on a parameter α), let S ∈ C^1(B). Then the function G(t, s, Y) = S(μt,s(Y)) satisfies the linear equations

(∂G/∂s)(t, s, Y) = −DG(t, s, Y)[Fs(Y)],   (2.156)

where D again denotes the derivative with respect to the last variable of G. Moreover, this function is the unique solution to this equation from the space C^1(R × R × B) with the initial condition G(t, t, Y) = S(Y).


Proof. By (2.149), we find

$$\frac{\partial G}{\partial s}(t, s, Y) = DS(\mu_{t,s}(Y))\Bigl[\frac{\partial \mu_{t,s}(Y)}{\partial s}\Bigr] = -DS(\mu_{t,s}(Y))[D\mu_{t,s}(Y)[F(s, Y)]] = -DG(t, s, Y)[F_s(Y)],$$

as claimed in (2.156).

Suppose that $\varphi(s, Y)$ is another solution, and let $g(s) = \varphi(s, \mu_{s,t}(Y))$. Then

$$g'(s) = \frac{\partial \varphi}{\partial s}(s, \mu_{s,t}(Y)) + D\varphi(s, \mu_{s,t}(Y))\Bigl[\frac{\partial \mu_{s,t}}{\partial s}(Y)\Bigr] = -D\varphi(s, \mu_{s,t}(Y))[F(s, \mu_{s,t}(Y))] + D\varphi(s, \mu_{s,t}(Y))[F(s, \mu_{s,t}(Y))] = 0,$$

because of (2.149) and the assumption that $\varphi$ solves (2.156). Hence $\varphi(s, \mu_{s,t}(Y)) = \varphi(t, Y) = S(Y)$ and thus $\varphi(s, Y) = S(\mu_{t,s}(Y))$, as claimed. □

The already discussed link between the nonlinear dynamics $\mu_t(Y)$ in $B$ solving the ODE $\dot\mu_t = F(\mu_t)$ and the linear evolution on the functions on $B$, $T_tS(Y) = S(\mu_t(Y))$, solving the PDE (2.155) is a crucial tool in many branches of analysis. We shall return to this link in Chapter 4 (see Proposition 4.1.1) and in Chapter 7 (when deriving the kinetic equations).
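For orientation, the following minimal finite-dimensional sketch (not from the text; it assumes $B = \mathbf{R}^2$, a sample vector field $F$ and a sample function $S$) computes $G(t, Y) = S(\mu_t(Y))$ by integrating the characteristics numerically and checks the PDE (2.155) at one point by finite differences.

```python
import numpy as np

# Sketch (assumptions: B = R^2, F and S chosen arbitrarily for illustration).
F = lambda y: np.array([y[1], -np.sin(y[0])])     # a sample vector field
S = lambda y: y[0]**2 + 0.3*y[1]                  # a sample C^1 function S

def mu(t, Y, n=20000):                            # characteristics mu_t(Y) by RK4
    y, h = np.array(Y, float), t / n
    for _ in range(n):
        k1 = F(y); k2 = F(y + 0.5*h*k1); k3 = F(y + 0.5*h*k2); k4 = F(y + h*k3)
        y = y + h*(k1 + 2*k2 + 2*k3 + k4)/6
    return y

G = lambda t, Y: S(mu(t, Y))                      # the candidate solution of (2.155)
t0, Y0, eps = 0.7, np.array([0.5, -0.2]), 1e-5

dG_dt = (G(t0 + eps, Y0) - G(t0 - eps, Y0)) / (2*eps)
DG = np.array([(G(t0, Y0 + eps*e) - G(t0, Y0 - eps*e)) / (2*eps) for e in np.eye(2)])
print(dG_dt, DG @ F(Y0))                          # the two numbers agree, as (2.155) asserts
```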

2.11 Equations with memory: causality

In practice, one often has to deal with extensions of ODEs that incorporate memory. The implementation of memory is usually achieved by two approaches.

Firstly, one can work with an extension of (2.9), where the r.h.s. depends on the past values of the unknown function:

$$\dot\mu_t = F(t, \mu_{\le t}), \tag{2.157}$$

with $F(t, \mu_{\le t})$ a continuous mapping $[0, T] \times C([0, T], B) \to B$ such that for any $t$, $F(t, \mu_.)$ depends only on the values $\mu_s$, $s \in [0, t]$. Such equations are also referred to as causal equations. A particular case is the class of delay equations, where $F(t, \mu_{\le t})$ is of the type $F_t(\mu_t, \mu_{t-\delta_1}, \dots, \mu_{t-\delta_k})$, with some $\delta_k > \dots > \delta_1 > 0$, i.e., the function depends on several past values of $\mu_s$. In this section, we shall discuss such equations.

The second approach is based on replacing the usual derivative on the l.h.s. of (2.9) by a fractional derivative, which will be discussed in the next section.

Theorem 2.11.1. Let $B$ be a Banach space and $T > 0$. For any $t$, let $F(t, \mu_.)$ be a Lipschitz-continuous mapping $C([0, t], B) \to B$, with a Lipschitz constant $\|F\|_{Lip} = L$ that can be chosen uniformly in $t \in [0, T]$, so that

$$\|F(t, \mu^1_{\le t}) - F(t, \mu^2_{\le t})\| \le L\,\|\mu^1_. - \mu^2_.\|_{C([0,t],B)} = L \sup_{s\in[0,t]} \|\mu^1_s - \mu^2_s\| \tag{2.158}$$


for all $t \in [0, T]$ and $\mu^1_., \mu^2_. \in C([0, t], B)$. Then for any $Y \in B$ there exists a unique solution $\mu_.(Y) \in C([0, T], B)$ to the Cauchy problem for equation (2.157) with the initial condition $\mu_0 = Y$. Moreover,

$$\|\mu_.(Y) - Y\|_{C([0,t],B)} \le e^{tL}\Bigl(tL\|Y\| + \int_0^t \|F(s, 0)\|\,ds\Bigr) \tag{2.159}$$

for all $t \in [0, T]$. Finally, for solutions $\mu_t(Y_1)$ and $\mu_t(Y_2)$ with different initial data $Y_1, Y_2$, the following estimate holds:

$$\|\mu_.(Y_1) - \mu_.(Y_2)\|_{C([0,t],B)} \le e^{tL}\|Y_1 - Y_2\|. \tag{2.160}$$

Proof. This is an example where the abstract Theorem 2.1.1 can be applied. Namely, the mapping $\Phi_Y : C([0, T], B) \to C_Y([0, T], B)$ defined by the equation

$$[\Phi_Y(\mu_.)](t) = Y + \int_0^t F(s, \mu_.)\,ds, \qquad t \in [0, T], \tag{2.161}$$

satisfies the same estimate (2.14) as for usual ODEs, as well as the estimate

$$\|[\Phi_Y(Y)](t) - Y\| \le \int_0^t \|F(s, Y)\|\,ds \le tL\|Y\| + \int_0^t \|F(s, 0)\|\,ds. \qquad\square \tag{2.162}$$

Remark 41. We introduced causal equations in the most transparent, somewhat simplified way. More established definitions of a causal r.h.s. of the equation $\dot\mu = F(\mu_.)$ require that the functional $F$ on curves $\mu_.$ satisfies the following property: if two curves $\mu^1_.$ and $\mu^2_.$ coincide up to a time $t$, then the corresponding curves $F(\mu^1_.)$ and $F(\mu^2_.)$ also coincide up to time $t$.

Exercise 2.11.1. Show that the unique solution to the equation $\dot x = \int_0^t x(s)\,ds$ on $\mathbf{R}$ with the initial condition $x_0$ equals $x(t) = x_0 \cosh t$.
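This exercise can also be checked by the iteration used in the proof above. The following minimal sketch (not from the text) performs the Picard iterations of the map (2.161) for this particular causal r.h.s. on a grid and compares with $x_0 \cosh t$.

```python
import numpy as np

# Sketch: solve the causal equation x'(t) = \int_0^t x(s) ds, x(0) = x0,
# by iterating [Phi(x)](t) = x0 + \int_0^t F(s, x_{<=s}) ds with F(s, x_.) = \int_0^s x(r) dr.

T, N, x0 = 1.0, 2001, 1.0
t = np.linspace(0.0, T, N)
h = t[1] - t[0]

def cumtrap(y):                      # cumulative trapezoidal integral on the grid
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * h * (y[1:] + y[:-1]))
    return out

x = np.full(N, x0)                   # initial guess: the constant curve x0
for _ in range(30):                  # Picard iterations of Phi
    F = cumtrap(x)                   # F(s) = \int_0^s x(r) dr, the causal r.h.s.
    x = x0 + cumtrap(F)              # [Phi(x)](t)

print(np.max(np.abs(x - x0 * np.cosh(t))))   # small: only the discretization error remains
```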

As a concrete example of causal equations, let us consider the case where the r.h.s. depends on a fractional integral of the unknown function:

$$\dot\mu_t = F(t, (I^\beta_a \mu_.)(t)), \tag{2.163}$$

with any β > 0. The following result is a direct consequence of Theorem 2.11.1.

Theorem 2.11.2. Let $B$ be a Banach space and $T > 0$. Let $F(t, \mu)$ be a continuous mapping $[0, T] \times B \to B$, which is Lipschitz-continuous in the second variable, i.e.,

‖F (t, μ1)− F (t, μ2)‖ ≤ L‖μ1 − μ2‖ (2.164)

for all $t \in [0, T]$ and $\mu_1, \mu_2 \in B$. Then for any $a \in [0, T)$ and $Y \in B$, there exists a unique solution $\mu_.(Y) \in C([a, T], B)$ to the Cauchy problem for equation (2.163) with the initial condition $\mu_a = Y$. Moreover, the estimates (2.159) and (2.160) hold with the constant $LT^\beta/\Gamma(\beta)$ instead of $L$.


2.12 Equations with memory: fractional derivatives

An alternative standard way of introducing memory into a system that is governed by an ODE is to change the usual derivative on the l.h.s. into a fractional derivative of order $\beta \in (0, 1)$. Thus, for a Banach space $B$, a vector $Y \in B$ and constants $a \in \mathbf{R}$, $\beta \in (0, 1)$, let us consider the Cauchy problem

$$D^\beta_{a+*}\mu_t = F(t, \mu_t), \qquad \mu_a = Y, \quad t \ge a, \tag{2.165}$$

where $D^\beta_{a+*}$ is the Caputo fractional derivative of order $\beta$, see (1.109) and (1.106).

More generally, for $\beta \in (k-1, k)$ with $k \in \mathbf{N}$, and $Y_0, Y_1, \dots, Y_{k-1} \in B$, we consider the Cauchy problem

$$D^\beta_{a+*}\mu_t = F_t(\mu_t, \mu'_t, \dots, \mu^{(k-1)}_t), \qquad \mu_a = Y_0, \quad \frac{d}{dt}\mu_a = Y_1, \;\dots,\; \frac{d^{k-1}}{dt^{k-1}}\mu_a = Y_{k-1}, \quad t \ge a. \tag{2.166}$$

By Proposition 1.8.4, the problems (2.165) and (2.166) are equivalent to the integral equations

$$\mu_t = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} F(s, \mu_s)\,ds, \tag{2.167}$$

respectively

$$\mu_t = \sum_{j=0}^{k-1} \frac{(t-a)^j}{j!} Y_j + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} F(s, \mu_s, \mu'_s, \dots, \mu^{(k-1)}_s)\,ds. \tag{2.168}$$

Remark 42. Readers who do not wish to enter the world of fractional calculus can just consider the equations (2.167) and (2.168) (containing nothing that is explicitly ‘fractional’) as defining the evolutions (2.165) and (2.166) driven by the fractional Caputo derivatives.

Recall that Eβ denotes the Mittag-Leffler function.

Theorem 2.12.1. Let $F$ be a continuous function $\mathbf{R} \times B \to B$, which is Lipschitz-continuous in the variable $\mu \in B$, with a Lipschitz constant $\|F\|_{Lip} = L$ as defined in (1.53). Then for any $Y \in B$ there exists a unique global (defined for all $t \ge a$) solution $\mu_t = \mu_t(Y)$ to the problem (2.167) – and therefore also to (2.165). Moreover,

$$\|\mu_t(Y) - Y\| \le E_\beta(L(t-a)^\beta)\, \frac{(t-a)^\beta}{\Gamma(\beta+1)} \max_{s\in[a,t]} \|F(s, Y)\|, \tag{2.169}$$

and the solutions $\mu_t(Y_1)$ and $\mu_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate

$$\|\mu_t(Y_1) - \mu_t(Y_2)\| \le \|Y_1 - Y_2\|\, E_\beta(L(t-a)^\beta). \tag{2.170}$$


Proof. This is a direct consequence of Theorem 2.1.3, since the solutions to the problem (2.167) are fixed points of the mapping

$$[\Phi(\mu_.)](t) = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} F(s, \mu_s)\,ds,$$

which satisfies all assumptions of Theorem 2.1.3 with $\omega = 1 - \beta$, $\varkappa = 1$, $L(Y) = L/\Gamma(\beta)$. (Strictly speaking, this applies only after shifting $t$ to $t - a$.) □
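For a concrete feel of the integral form (2.167), here is a rough numerical sketch (not from the text; assumptions: scalar case $B = \mathbf{R}$, $a = 0$, $F(t, \mu) = -\mu$, for which the exact solution is the Mittag-Leffler function $E_\beta(-t^\beta)$). It discretizes the weakly singular integral with a simple product-rectangle rule.

```python
import numpy as np
from math import gamma

# Sketch of (2.167): mu_t = Y + (1/Gamma(beta)) int_0^t (t-s)^(beta-1) F(s, mu_s) ds.
beta, Y, T, N = 0.6, 1.0, 2.0, 4000
F = lambda t, m: -m
t = np.linspace(0.0, T, N + 1)
mu = np.empty(N + 1); mu[0] = Y

# w[j] = int_{s_j}^{s_{j+1}} (t_n - s)^(beta-1) ds, with F frozen at the left endpoints
for n in range(1, N + 1):
    w = ((t[n] - t[:n])**beta - (t[n] - t[1:n+1])**beta) / beta
    mu[n] = Y + (w * F(t[:n], mu[:n])).sum() / gamma(beta)

def ml(z, K=120):                       # truncated Mittag-Leffler series E_beta(z)
    return sum(z**k / gamma(beta*k + 1) for k in range(K))

print(mu[-1], ml(-T**beta))             # the two values are close
```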

Theorem 2.2.3 suggests the application of the same trick to equation (2.166). For this, notice first that differentiating equation (2.168) $(k-1)$ times yields

$$\mu^{(k-1)}_t = Y_{k-1} + \frac{1}{\Gamma(\beta - (k-1))} \int_a^t (t-s)^{\beta-k} F(s, \mu_s, \mu'_s, \dots, \mu^{(k-1)}_s)\,ds.$$

(Of course, this also follows from (1.103).) Hence, in terms of the vector-function

$$\nu_t = (\nu^0_t, \nu^1_t, \dots, \nu^{k-1}_t) = (\mu_t, \mu'_t, \dots, \mu^{(k-1)}_t) \in B^k$$

and the vector $Y = (Y_0, \dots, Y_{k-1}) \in B^k$, the problem (2.168) can be rewritten as

$$\nu_t = (\nu^0_t, \dots, \nu^{k-1}_t) = Y + \int_a^t \Bigl(\nu^1_s, \dots, \nu^{k-1}_s, \frac{(t-s)^{\beta-k}}{\Gamma(\beta-(k-1))} F(s, \nu^0_s, \dots, \nu^{k-1}_s)\Bigr)\,ds. \tag{2.171}$$

Noticing that $1 \le (t-a)^{-\omega}(T-a)^{\omega}$ for $t \in [a, T]$, we can apply Theorem 2.1.3 in the Banach space $B^k$ with $\varkappa = 1$, $\omega = k - \beta$, which gives the following result.

Theorem 2.12.2. Let $F$ be a continuous function $\mathbf{R} \times B^k \to B$ such that

$$\|F(t, Y_0, \dots, Y_{k-1}) - F(t, Z_0, \dots, Z_{k-1})\| \le L \sum_{j=0}^{k-1} \|Y_j - Z_j\|.$$

Then for any $Y = (Y_0, \dots, Y_{k-1}) \in B^k$ there exists a unique global (defined for all $t \ge a$) solution $\nu_t = \nu_t(Y)$ to the problem (2.171), and hence the unique solution $\mu_t = \mu_t(Y) = \nu^0_t(Y)$ to the problem (2.168). Moreover, for $t \in [a, T]$ with any $T$,

$$\|\nu_t(Y) - Y\|_{B^k} \le \Bigl(\|Y\|_{B^k} + \frac{(t-a)^\beta}{\Gamma(\beta+1)} \max_{s\in[a,t]} \|F(s, Y)\|\Bigr) E_{\beta-(k-1)}\bigl[(L + (T-a)^{k-\beta}\Gamma(\beta-(k-1)))(t-a)^{\beta-(k-1)}\bigr], \tag{2.172}$$

and the solutions $\nu_t(Y)$ and $\nu_t(Z)$ with different initial data $Y, Z$ satisfy the estimate

$$\|\nu_t(Y) - \nu_t(Z)\|_{B^k} \le \|Y - Z\|_{B^k}\, E_{\beta-(k-1)}\bigl[(L + (T-a)^{k-\beta}\Gamma(\beta-(k-1)))(t-a)^{\beta-(k-1)}\bigr]. \tag{2.173}$$


2.13 Linear fractional ODEs and related integral equations

As a key example for fractional ODEs, let us consider the linear Cauchy problem

$$D^\beta_{a+*}\mu_t = A\mu_t + b_t, \qquad \mu_a = Y, \quad t \ge a, \tag{2.174}$$

with $\beta \in (0, 1)$, $b_t$ a continuous curve in $B$, and $A$ a bounded linear operator in $B$. The equivalent integral form of the problem reads

$$\mu_t = Y + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1}(A\mu_s + b_s)\,ds. \tag{2.175}$$

Proposition 2.13.1. The unique solution (from Theorem 2.12.1) to the problem (2.174) or (2.175) equals

$$\mu_t = E_\beta(A(t-a)^\beta)Y + \int_a^t (t-s)^{\beta-1} E_{\beta,\beta}(A(t-s)^\beta)\, b_s\,ds = E_\beta(A(t-a)^\beta)Y + \beta \int_a^t (t-s)^{\beta-1} E'_\beta(A(t-s)^\beta)\, b_s\,ds. \tag{2.176}$$

Proof. By recursively replacing $\mu_s$ under the integral in equation (2.175) by the whole expression of the r.h.s. of (2.175) and by using the semigroup property of the fractional integral $I^\beta_a$, one finds

$$\mu_t = \bigl(1 + A(I^\beta_a 1)(t) + \dots + A^{k-1}(I^{(k-1)\beta}_a 1)(t)\bigr)Y + \frac{A^k}{\Gamma(k\beta)} \int_a^t (t-s)^{k\beta-1}\mu_s\,ds + (I^\beta_a + AI^{2\beta}_a + \dots + A^{k-1}I^{k\beta}_a)b(t),$$

which yields (2.176) by passing to the limit $k \to \infty$ and using (9.14). □

As in the case of usual linear equations, the analysis of equation (2.174) extends straightforwardly to the time-dependent case,

$$D^\beta_{a+*}\mu_t = A_t\mu_t + b_t, \qquad \mu_a = Y, \quad t \ge a, \tag{2.177}$$

where $A_t$ is a family of linear operators, provided that one carefully takes into account the non-commutativity of the family $A_t$.

Proposition 2.13.2. Let $\beta \in (0, 1)$. Suppose that $A_t$ is a family of uniformly bounded operators in $B$ that depend continuously on $t$, and that $b_t$ is a continuous curve in $B$. Then equation (2.177) has a unique solution. This solution has a representation as the geometric series

$$\mu_t = \sum_{m=0}^{\infty} (I^\beta_a \circ A)^m\bigl[Y + (I^\beta_a b)(\cdot)\bigr](t), \tag{2.178}$$


where $I^\beta_a \circ A$ acts in $C([a, t], B)$ by the formula

$$(I^\beta_a \circ A)g(t) = \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1} A_s g_s\,ds.$$

Finally,

$$\|\mu_t\| \le E_\beta\bigl(\sup_{s\in[a,t]}\|A_s\|\,(t-a)^\beta\bigr)\Bigl(\|Y\| + \sup_{s\in[a,t]}\|(I^\beta_a b)(s)\|\Bigr). \tag{2.179}$$

Proof. As mentioned before, the expansion (2.178) is obtained as the direct extension of the corresponding expansion for a constant $A$. □

Remark 43. Since the integral version (2.167) of (2.177) reads

$$(1 - I^\beta_a \circ A)\mu_. = Y + (I^\beta_a b)(\cdot),$$

representation (2.178) is formally obtained by the expansion of $(1 - I^\beta_a \circ A)^{-1}$ into a geometric series.

More generally, estimates of the type (2.179) extend to linear equations of the type

$$\mu_t = G_t Y + \int_a^t A_{t,s}\mu_s\,ds + g_t. \tag{2.180}$$

Namely, the following assertion is obtained by using the same arguments as in the proof of Proposition 2.13.1, i.e., recursively inserting the r.h.s. of equation (2.180) in order to express $\mu_s$ under the integral and then estimating the terms of the resulting series via fractional integrals.

Proposition 2.13.3. Suppose that $A_{t,s}$, $t > s$, and $G_t$, $t \ge 0$, are families of bounded linear operators in $B$ that depend continuously on $t$ and measurably on $s$, and let $g_t$ be a continuous curve in $B$. Suppose further that the family $G_t$ is uniformly bounded and

$$\|A_{t,s}\| \le |A|\,(t-s)^{-\omega} \tag{2.181}$$

for some constants $|A| > 0$ and $\omega \in (0, 1)$. Then equation (2.180) has a unique solution. This solution has a representation as the geometric series

$$\mu_t = \sum_{m=0}^{\infty} (I_A)^m (G_. Y + g_.)(t), \tag{2.182}$$

where $I_A$ acts in $C([a, t], B)$ by the formula

$$(I_A h)(t) = \int_a^t A_{t,s} h_s\,ds.$$

Finally,

$$\|\mu_t\| \le E_{1-\omega}\bigl(|A|\Gamma(1-\omega)(t-a)^{1-\omega}\bigr) \sup_{s\in[a,t]}\bigl(\|G_s\|_{B\to B}\|Y\| + \|g_s\|\bigr). \tag{2.183}$$


Remark 44. The reason for studying equation (2.180) is that the mild forms of many standard PDEs (including diffusions) are represented by equations of this type, see, e.g., Sections 4.6 and 4.8. Therefore, equation (2.180) is a handy way to put both fractional and usual evolutions under one single umbrella.

The discussed theory of linear equations with $\beta \in (0, 1)$ extends directly to the case of arbitrary positive $\beta$. In fact, by (2.168), the equivalent integral representation for the Cauchy problem

$$D^\beta_{a+*}\mu_t = A_t\mu_t + b_t, \qquad \mu_a = Y_0, \quad \frac{d}{dt}\mu_a = Y_1, \;\dots,\; \frac{d^{k-1}}{dt^{k-1}}\mu_a = Y_{k-1}, \quad t \ge a, \tag{2.184}$$

is

$$\mu_t = \sum_{j=0}^{k-1} \frac{(t-a)^j}{j!} Y_j + \frac{1}{\Gamma(\beta)} \int_a^t (t-s)^{\beta-1}(A_s\mu_s + b_s)\,ds. \tag{2.185}$$

Theorem 2.13.1. Let $\beta \in (k-1, k)$ for a natural $k$, and suppose that $A_t$ is a family of uniformly bounded operators in $B$ depending continuously on $t$ and $b_t$ is a continuous curve in $B$. Then equation (2.185) has a unique solution. This solution has a representation as the geometric series

$$\mu_t = \sum_{m=0}^{\infty} (I^\beta_a \circ A)^m\Bigl[Y_0 + (\cdot - a)Y_1 + \dots + \frac{(\cdot - a)^{k-1}}{(k-1)!}Y_{k-1} + (I^\beta_a b)(\cdot)\Bigr](t) \tag{2.186}$$

and is bounded:

$$\|\mu_t\| \le E_\beta\bigl(\sup_{s\in[a,t]}\|A_s\|\,(t-a)^\beta\bigr)\Bigl(\sum_{l=0}^{k-1}\frac{(t-a)^l}{l!}\|Y_l\| + \sup_{s\in[a,t]}\|(I^\beta_a b)(s)\|\Bigr). \tag{2.187}$$

If At = A does not depend on t, the solution takes the form

$$\mu_t = \sum_{l=0}^{k-1} \bigl(I^l_a E_\beta(A(\cdot-a)^\beta)\bigr)(t)\,Y_l + \beta\int_a^t (t-s)^{\beta-1} E'_\beta(A(t-s)^\beta)\,b_s\,ds. \tag{2.188}$$

Proof. Let us only check how (2.188) is obtained for the case $k = 2$. Again, replacing $\mu_s$ under the integral in equation (2.185) by the whole expression of the r.h.s. of (2.185) yields

$$\mu_t = Y_0 + (t-a)Y_1 + I^\beta_a\bigl(b_. + A(Y_0 + (\cdot-a)Y_1 + I^\beta_a(A\mu_. + b_.))\bigr)(t) = Y_0 + (I^\beta_a 1)(t)AY_0 + (I^1_a 1)(t)Y_1 + (I^{\beta+1}_a 1)(t)AY_1 + (I^\beta_a b_.)(t) + (I^{2\beta}_a Ab_.)(t) + (I^{2\beta}_a A^2\mu_.)(t).$$


Repeating this recursively leads to

$$\mu_t = \sum_{l=0}^{m} (I^{l\beta}_a 1)(t)A^l Y_0 + \sum_{l=0}^{m} (I^{l\beta+1}_a 1)(t)A^l Y_1 + \sum_{l=0}^{m} (I^{(l+1)\beta}_a A^l b_.)(t) + (I^{(m+1)\beta}_a A^{m+1}\mu_.)(t).$$

Passing to the limit $m \to \infty$ and taking into account formula (1.104) yields (2.188) for $k = 2$.

The formulae (2.186) and (2.187) are obtained similarly as straightforward extensions of Propositions 2.13.1 and 2.13.2. □

Finally, let us consider a system of fractional differential equations of different orders, for $\mu = (\mu^1, \dots, \mu^k) \in B^k$:

$$\begin{cases} D^{\beta_1}_{a+*}\mu^1_t = F_1(t, \mu_t), & \mu^1_a = Y_1,\\ \quad\cdots\\ D^{\beta_k}_{a+*}\mu^k_t = F_k(t, \mu_t), & \mu^k_a = Y_k, \end{cases} \tag{2.189}$$

or in the integral form

$$\mu^j_t = Y_j + (I^{\beta_j}_a F_j(\cdot, \mu_.))(t), \qquad j = 1, \dots, k. \tag{2.190}$$

The following statement is yet another direct consequence of Theorem 2.1.3.

Theorem 2.13.2. For a natural $k$, let $\beta_1, \dots, \beta_k \in (0, 1)$ and let $F_1, \dots, F_k$ be Lipschitz-continuous mappings $B^k \to B$, so that

$$\|F_j(\mu) - F_j(\bar\mu)\| \le L\|\mu - \bar\mu\|_{B^k} = L \sum_{i=1}^{k} \|\mu^i - \bar\mu^i\|$$

with a constant $L$. Then the system (2.190) has a unique global solution $\mu_t(Y)$, $t \ge a$, for any $Y = (Y_1, \dots, Y_k) \in B^k$. Moreover, for any $T > a$ there exists a constant $C(T, L)$ such that for all $t \in [a, T]$

$$\|\mu_t(Y) - Y\| \le C(T, L) \max_{s\in[a,t]} \|F(s, Y)\|, \qquad \|\mu_t(Y) - \mu_t(\bar Y)\| \le C(T, L)\|Y - \bar Y\|. \tag{2.191}$$
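Such mixed-order systems are as easy to iterate numerically as a single equation. The following minimal sketch (not from the text; assumptions: $B = \mathbf{R}$, $a = 0$, a toy coupling $F_1(t,\mu) = -\mu^2$, $F_2(t,\mu) = \mu^1$) performs Picard iterations of the coupled integral equations (2.190) on a grid.

```python
import numpy as np
from math import gamma

# Sketch of (2.190) with two components of different orders b1 != b2.
b1, b2, Y1, Y2, T, N = 0.5, 0.9, 1.0, 0.0, 1.0, 1500
t = np.linspace(0.0, T, N + 1)

def frac_int(values, beta):
    """(I^beta_0 values)(t_n) for all n, left-endpoint product-rectangle rule."""
    out = np.zeros_like(values)
    for n in range(1, N + 1):
        w = ((t[n] - t[:n])**beta - (t[n] - t[1:n+1])**beta) / beta
        out[n] = (w * values[:n]).sum() / gamma(beta)
    return out

mu1, mu2 = np.full(N + 1, Y1), np.full(N + 1, Y2)
for _ in range(25):                        # Picard iterations of the coupled maps
    mu1, mu2 = Y1 + frac_int(-mu2, b1), Y2 + frac_int(mu1, b2)

print(mu1[-1], mu2[-1])                    # the approximate solution at t = T
```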

2.14 Linear fractional evolutions involving spatially homogeneous ΨDOs

In this section, we extend the results of Section 2.4 to fractional evolutions. Namely, let us consider the equations

$$D^\beta_{a+*} f_t = -\psi(-i\nabla) f_t, \qquad f|_{t=a} = f_a, \tag{2.192}$$


for differential or pseudo-differential operators with constant coefficients, with sufficiently regular symbols $\psi$ (choosing $\psi$ to be time-independent for simplicity), where fractional derivatives are taken with respect to the variable $t$.

Passing to the Fourier transform $\hat f(p) = \int e^{-ipx} f(x)\,dx$, the Cauchy problem (2.192) turns into

$$D^\beta_{a+*}\hat f_t(p) = -\psi(p)\hat f_t(p), \qquad \hat f|_{t=a} = \hat f_a.$$

By Proposition 2.13.1, it has the solution

$$\hat f_t(p) = E_\beta(-(t-a)^\beta\psi(p))\,\hat f_a(p).$$

Returning to f via the inverse Fourier transform yields

$$f_t(x) = \int G^{\psi,\beta}_{t-a}(x-y) f_a(y)\,dy \tag{2.193}$$

with

$$G^{\psi,\beta}_{t-a}(x) = \frac{1}{(2\pi)^d} \int e^{ipx} E_\beta(-(t-a)^\beta\psi(p))\,dp, \tag{2.194}$$

whenever this integral is well defined. The ‘fractional heat kernels’ $G^{\psi,\beta}_{t-a}(x)$ solve the Cauchy problem (2.192) with the Dirac initial condition $\delta(x)$.

Similar to the usual linear equations, the solution (2.193) is well defined and unique in $S'(\mathbf{R}^d)$ for continuous functions $\psi$ that are bounded from below.

In order to see how the solution can be defined in spaces of more regular functions, we can exploit the integral representation for the Mittag-Leffler function (2.80), that is

$$\beta E_\beta(s) = \int_0^\infty e^{sx} x^{-1-1/\beta} G_\beta(1, x^{-1/\beta})\,dx.$$

Using this formula, we rewrite (2.194) as

$$G^{\psi,\beta}_{t-a}(x) = \frac{1}{(2\pi)^d\beta} \int e^{ipx}\,dp \int_0^\infty e^{-(t-a)^\beta\psi(p)y}\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy, \tag{2.195}$$

or as

$$G^{\psi,\beta}_{t-a}(x) = \frac{1}{\beta} \int_0^\infty G^{\psi}_{(t-a)^\beta y}(x)\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy, \tag{2.196}$$

where $G^{\psi}_{(t-a)^\beta y}(x)$ is the heat kernel (2.62) of the corresponding problem (2.56) with the usual derivative.

The following assertion is an application of a general result that relates linear Cauchy problems with usual and fractional derivatives.

Theorem 2.14.1. Let the family of linear mappings $f_s \mapsto f_t$ resolving problem (2.56) and given by (2.59) be well defined in some functional space $L$ with norms that are uniformly bounded by a constant $M$ for $T_1 \le s \le t \le T_2$. For instance,


according to Theorem 2.4.1, these mappings are contractions if $L$ is chosen as $L_2(\mathbf{R}^d)$ or the Sobolev space $H^k_2(\mathbf{R}^d)$ for any $k \in \mathbf{R}$, whenever $\psi$ is a locally bounded measurable function with a non-negative real part. Then the mappings $f_a \mapsto f_t$ given by (2.193) and resolving problem (2.192) are also well defined in the same space $L$ with norms that are bounded by the same constant $M$.

Proof. From Proposition 2.4.1 (i) and (iii), it follows that the function $y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})$ under the integral in (2.196) is bounded as $y \to 0$ and decays faster than any power as $y \to \infty$. In particular, it belongs to $L_1(\mathbf{R})$. For the mapping $f_a \mapsto f_t$ given by (2.193), we therefore find

$$\|f_t\|_L \le \frac{1}{\beta}\, M \|f_a\|_L \int_0^\infty y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy = M\|f_a\|_L E_\beta(0) = M\|f_a\|_L,$$

as required. □

Let us extend this result to equations with a nontrivial r.h.s., i.e., to equations of the type

$$D^\beta_{a+*} f_t = -\psi(-i\nabla) f_t + g_t, \qquad f|_{t=a} = f_a, \tag{2.197}$$

and let us formulate the result in the framework of Theorem 2.4.1.

Theorem 2.14.2. Let $\psi(p)$ be a locally bounded measurable function with a non-negative real part, and let $g_t$ be a continuous curve in the space $L$, which is either $L_1(\mathbf{R}^d)$ or $L_2(\mathbf{R}^d)$ or $F(L_1(\mathbf{R}^d))$. Then there exists a unique solution to the problem (2.197), and it is given by the formula

$$f_t(x) = \frac{1}{\beta}\int_0^\infty dy \int_{\mathbf{R}^d} G^{\psi}_{y(t-a)^\beta}(x-z)\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta}) f_a(z)\,dz + \int_a^t ds \int_0^\infty dy \int_{\mathbf{R}^d} (t-s)^{\beta-1} G^{\psi}_{y(t-s)^\beta}(x-z)\, y^{-1/\beta} G_\beta(1, y^{-1/\beta}) g_s(z)\,dz. \tag{2.198}$$

Moreover, it satisfies the estimate

$$\|f_t\|_L \le \|f_a\|_L + \frac{1}{\Gamma(\beta)}\int_a^t (t-s)^{\beta-1}\|g_s(\cdot)\|_L\,ds. \tag{2.199}$$

Proof. By the Fourier transform, equation (2.197) turns into the equation

$$D^\beta_{a+*}\hat f_t(p) = -\psi(p)\hat f_t(p) + \hat g_t(p), \qquad \hat f|_{t=a} = \hat f_a. \tag{2.200}$$

According to Proposition 2.13.1, its solution is unique and given by the formula

$$\hat f_t(p) = E_\beta(-\psi(p)(t-a)^\beta)\hat f_a(p) + \beta\int_a^t (t-s)^{\beta-1} E'_\beta(-\psi(p)(t-s)^\beta)\hat g_s(p)\,ds. \tag{2.201}$$


Using again (2.80), this can be written as

$$\hat f_t(p) = \frac{1}{\beta}\int_0^\infty \exp\{-y\psi(p)(t-a)^\beta\}\, y^{-1-1/\beta} G_\beta(1, y^{-1/\beta})\,dy\; \hat f_a(p) + \int_a^t ds \int_0^\infty \exp\{-y\psi(p)(t-s)^\beta\}(t-s)^{\beta-1} y^{-1/\beta} G_\beta(1, y^{-1/\beta})\,dy\; \hat g_s(p). \tag{2.202}$$

Applying the inverse Fourier transform yields (2.198). Estimate (2.199) follows from the contraction property in $L$ of the integral operators with the kernel $G^\psi_t(x-z)$, and due to $E_\beta(0) = 1$, $E'_\beta(0) = 1/\Gamma(\beta+1)$. □
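The reduction to independent Fourier modes used in this proof is easy to exploit numerically. The following sketch (not from the text; assumptions: $d = 1$, $a = 0$, $\psi(p) = p^2$, $g \equiv 0$) solves the scalar Caputo problem for each mode via the integral form (2.167), treating the most recent sub-interval implicitly for stability, and then inverts the FFT.

```python
import numpy as np
from math import gamma

# Sketch: in Fourier variables, (2.192) decouples into D^beta f^hat(p,.) = -psi(p) f^hat(p,.).
beta, T, Nt, Nx, Lx = 0.8, 0.3, 400, 128, 20.0
x = np.linspace(-Lx/2, Lx/2, Nx, endpoint=False)
p = 2*np.pi*np.fft.fftfreq(Nx, d=Lx/Nx)
fhat = np.fft.fft(np.exp(-x**2))                 # Fourier transform of the initial condition f_a
lam = p**2                                       # the symbol psi(p)

t = np.linspace(0.0, T, Nt + 1)
u = np.tile(fhat, (Nt + 1, 1))                   # u[n, :] approximates f^hat at t_n
for n in range(1, Nt + 1):
    w = ((t[n] - t[:n])**beta - (t[n] - t[1:n+1])**beta) / beta
    mem = (w[:n-1, None] * u[:n-1]).sum(axis=0)  # 'history' part of the fractional integral
    u[n] = (fhat - lam * mem / gamma(beta)) / (1.0 + lam * w[n-1] / gamma(beta))

ft = np.fft.ifft(u[-1]).real                     # the solution f_t of (2.192)
print(ft.max(), ft.sum() * (Lx/Nx))              # the peak decays; the total mass is conserved
```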

As an example, let us consider the Cauchy problem

$$D^\alpha_{a+*} f_t(x) = -\frac{d^\beta}{dx^\beta} f_t(x) + g_t(x), \qquad f|_{t=a} = f_a, \tag{2.203}$$

where $\alpha, \beta \in (0, 1)$ and the operator $D^\alpha_{a+*}$ is assumed to act on the variable $t$. According to Theorem 2.14.2 and Proposition 2.4.1, its unique solution is given by the formula

$$f_t(x) = \frac{1}{\alpha}\int_0^\infty dy \int_{-\infty}^x G_\beta(y(t-a)^\alpha, x-z)\, y^{-1-1/\alpha} G_\alpha(1, y^{-1/\alpha}) f_a(z)\,dz + \int_a^t ds \int_0^\infty dy \int_{-\infty}^x (t-s)^{\alpha-1} G_\beta(y(t-s)^\alpha, x-z)\, y^{-1/\alpha} G_\alpha(1, y^{-1/\alpha}) g_s(z)\,dz. \tag{2.204}$$

2.15 Sensitivity of integral and differential equations: advanced version

Let us now prove an advanced version of the sensitivity for integral equations. Its application will include both the fractional equations discussed above and nonlinear diffusions that will be considered later on. In fact, the above Theorems 2.8.1 and 2.8.2 were formulated in such a way that they can be more or less straightforwardly extended to the present setting. We are again looking for the derivatives of the fixed points of equation (2.129), the only difference being now the presence of a singularity of $\Omega_{t,s}$ for $t = s$.

Theorem 2.15.1. Suppose that the assumptions of Theorem 2.8.1 hold with a slight modification concerning $\Omega$. Namely, instead of (2.131), we assume that $\Omega_{t,s} \in C^1_{uc}(B_R, B)$ for $t > s$ and any $R$, and there exists $\omega \in (0, 1)$ such that for any $\varepsilon, R$, there exist $L(R)$ and $\delta = \delta(R, \varepsilon)$ such that

$$\|D\Omega_{t,s}(\mu)\|_{B\to B} \le L(R)(t-s)^{-\omega}, \qquad \|D\Omega_{t,s}(\mu_1) - D\Omega_{t,s}(\mu_2)\|_{B\to B} \le \varepsilon(t-s)^{-\omega}, \tag{2.205}$$

for all t, s, and for any μ1, μ2 with μ1 ∈ BR, ‖μ1 − μ2‖ ≤ δ.


Then the mapping $\mu_\tau = Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^1_{uc}(B_R, B)$, and thus to $C^1_{luc}(B, B)$, for all $t$. Moreover, $\xi_t = D\mu_{t,\tau}(Y)[\xi]$ is the unique solution to equation (2.128) and the following estimate holds:

$$\|\xi_t\| \le E_{1-\omega}\bigl(L(M_T(R))\Gamma(1-\omega)(t-\tau)^{1-\omega}\bigr)\, G\,\|\xi\|. \tag{2.206}$$

Proof. We are using the same arguments as in the proof of Theorem 2.8.1. First of all, equation (2.128) is well posed and its solution satisfies (2.206) due to Proposition 2.13.3. Next, we find

$$\|\mu_{t,\tau}(Y_1) - \mu_{t,\tau}(Y_2)\| \le G\|Y_1 - Y_2\|\, E_{1-\omega}\bigl(L(M_T(R))\Gamma(1-\omega)(t-\tau)^{1-\omega}\bigr) \tag{2.207}$$

by Theorem 2.1.3. As in Theorem 2.8.1, the function $\varphi_t = \mu_{t,\tau}(Y + \xi) - \mu_{t,\tau}(Y) - \xi_t$ again satisfies equation (2.134).

For an ε, we choose δ ≤ 1 such that

$$\|\Omega_{t,s}(Z+\xi) - \Omega_{t,s}(Z) - D\Omega_{t,s}(Z)[\xi]\| \le \varepsilon\|\xi\|(t-s)^{-\omega}$$

for $\|\xi\| \le \delta$ and $\|Z\| \le M_T(R+1)$. By (2.133), if

$$\|\xi\| \le \bigl[G\,E_{1-\omega}(L(M_T(R))\Gamma(1-\omega)(t-\tau)^{1-\omega})\bigr]^{-1}\delta,$$

then

$$\|\mu_{t,\tau}(Y+\xi) - \mu_{t,\tau}(Y)\| \le \delta,$$

so that

$$\|\Omega_{t,s}(\mu_{s,\tau}(Y+\xi)) - \Omega_{t,s}(\mu_{s,\tau}(Y)) - D\Omega_{t,s}(\mu_{s,\tau}(Y))[\mu_{s,\tau}(Y+\xi) - \mu_{s,\tau}(Y)]\| \le \varepsilon\|\mu_{t,\tau}(Y+\xi) - \mu_{t,\tau}(Y)\|(t-s)^{-\omega} \le \varepsilon G(t-s)^{-\omega} E_{1-\omega}\bigl(L(M_T(R))\Gamma(1-\omega)(t-\tau)^{1-\omega}\bigr)\|\xi\|. \tag{2.208}$$

Consequently, by (2.134) and Proposition 2.13.3, we find

$$\sup_{s\in[\tau,t]} \|\varphi_s\| \le \varepsilon\|\xi\|\kappa,$$

with a $\kappa$ depending on $L(M_T(R))$, $T$, $\tau$ and $G$. This shows that $\xi_t = D\mu_{t,\tau}(Y)[\xi]$.

The continuity of $\xi_t = D\mu_{t,\tau}(Y)[\xi]$ as a function of $Y \in B_R$ can be shown like in Theorem 2.8.1. □

As an application, we can get the following result on the sensitivity of fractional equations.


Theorem 2.15.2. Let $F \in C^1_{luc}(B, B)$ for a Banach space $B$ – therefore, the assumptions of Theorem 2.12.1 hold. Then the mapping $Y \mapsto \mu_t(Y)$ as constructed in Theorem 2.12.1 belongs to $C^1_{luc}(B, B)$ for all $t$, and $\xi_t = D\mu_t(Y)[\xi]$ is the unique solution to the linear equation

$$\xi_t = \xi + \frac{1}{\Gamma(\beta)}\int_0^t (t-s)^{\beta-1} DF(s, \mu_s)[\xi_s]\,ds \;\Longleftrightarrow\; D^\beta_{a+*}\xi_t = DF(t, \mu_t)[\xi_t] \ \text{and}\ \xi_0 = \xi. \tag{2.209}$$

Moreover, the following estimate holds:

$$\|\xi_t\| \le E_\beta(L(t-a)^\beta)\|\xi\|. \tag{2.210}$$
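The meaning of (2.209) can be checked numerically. The following sketch (not from the text; assumptions: $B = \mathbf{R}$, $a = 0$, $F(t, \mu) = -\mu^3$, so that $DF(t, \mu)[\xi] = -3\mu^2\xi$) compares the derivative $\xi_t = D\mu_t(Y)[\xi]$ obtained from the linear equation (2.209) with a finite-difference quotient of the nonlinear solution.

```python
import numpy as np
from math import gamma

beta, Y, T, N = 0.7, 1.0, 1.0, 3000
t = np.linspace(0.0, T, N + 1)

def solve(rhs, y0):
    """Product-rectangle scheme for y_t = y0 + (1/Gamma(beta)) int_0^t (t-s)^(beta-1) rhs ds."""
    y = np.empty(N + 1); y[0] = y0
    for n in range(1, N + 1):
        w = ((t[n] - t[:n])**beta - (t[n] - t[1:n+1])**beta) / beta
        y[n] = y0 + (w * rhs(t[:n], y[:n], np.arange(n))).sum() / gamma(beta)
    return y

mu = solve(lambda s, y, idx: -y**3, Y)                      # nonlinear solution mu_t(Y)
xi = solve(lambda s, y, idx: -3.0 * mu[idx]**2 * y, 1.0)    # variational equation (2.209) with xi = 1

eps = 1e-5
mu_plus = solve(lambda s, y, idx: -y**3, Y + eps)
print(xi[-1], (mu_plus[-1] - mu[-1]) / eps)                 # the two derivatives agree
```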

Next, let us formulate an extension of Theorem 2.8.2 to singular Ωt,s.

Theorem 2.15.3. Under the assumptions of Theorem 2.15.1, let us additionally suppose that $\Omega_{t,s} \in C^1_{bLip}(B_R, B)$ for $t > s$, so that

$$\|D\Omega_{t,s}(\mu_1) - D\Omega_{t,s}(\mu_2)\|_{B\to B} \le L_D(R)(t-s)^{-\omega}\|\mu_1 - \mu_2\| \tag{2.211}$$

for all μ1, μ2 ∈ BR and some constants L(R) and LD(R).

Then the mapping $Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^1_{bLip}(B_R, B)$ for all $t, \tau$. Moreover, $\xi_t = D\mu_{t,\tau}(Y, \alpha)[\xi]$ is the unique solution to equation (2.128), it satisfies (2.206) and is the limit of the approximations (2.138).

Exercise 2.15.1. Give the full proof of Theorem 2.15.3 by identifying those arguments in the proof of Theorem 2.8.2 that have to be modified.

Exercise 2.15.2. Formulate and prove the extension of Theorem 2.8.3 for singular $\Omega_{t,s}$.

Finally, let us formulate a direct extension of Theorem 2.8.4 that deals with second-order derivatives.

Theorem 2.15.4. Under the assumptions of Theorem 2.15.1, let $\Omega_{t,s} \in C^2_{uc}(B_R, B)$ for all $R$ and $t > s$, and for any $\varepsilon, R$, let $L_2(R)$ and $\delta_2 = \delta_2(R, \varepsilon)$ exist such that

$$\|D^2\Omega_{t,s}(\mu)\|_{B\times B\to B} \le L_2(R)(t-s)^{-\omega}, \qquad \|D^2\Omega_{t,s}(\mu_1) - D^2\Omega_{t,s}(\mu_2)\|_{B\times B\to B} \le \varepsilon(t-s)^{-\omega}, \tag{2.212}$$

for all $t, s$, and any $\mu_1, \mu_2$ with $\mu_1 \in B_R$, $\|\mu_1 - \mu_2\| \le \delta_2$. Then the mapping $\mu_\tau = Y \mapsto \mu_{t,\tau}(Y)$ belongs to $C^2_{uc}(B_R, B)$ for all $t, \tau$. Moreover, $\eta_t = D^2\mu_{t,\tau}(Y)[\xi_1, \xi_2]$ is the unique solution to equation (2.144).


2.16 ODEs in locally convex spaces

In this section, we are going to show how the previous results for Banach spaces can be extended to general locally convex spaces. All necessary definitions can be found in Section 1.6.

The main idea is rather simple: convergence of sequences in locally convex spaces is equivalent to their convergence in each of the semi-norms that define the topology. Therefore, in order to extend the main results for the Banach spaces, the requirement is that all the relevant estimates must hold in each of the semi-norms. Note that we will not discuss the more subtle situation when this uniformity does not hold.

Throughout this section, let us denote by $V$ any complete Hausdorff locally convex linear topological space, with the topology being defined by a separating family of semi-norms $p_\gamma$. If the set of semi-norms $p_\gamma$ is countable, then $V$ is a Frechet space, which can be metricised by the metric (1.69).

First, let us formulate the extensions of the fixed-point principles. The analogue to Proposition 9.1.1 reads as follows.

Proposition 2.16.1. If $\Phi$ is a mapping $V \to V$ such that $p_\gamma(\Phi^n(x) - \Phi^n(y)) \le \alpha^\gamma_n p_\gamma(x - y)$ for all $x, y$ and $\gamma$, with some $\alpha^\gamma_n$ such that $A^\gamma = 1 + \sum_{n=1}^\infty \alpha^\gamma_n < \infty$, then $\Phi$ has a unique fixed point $x^*$, $\Phi^n(x)$ converges to $x^*$ for any $x$, and

$$p_\gamma(x - x^*) \le A^\gamma p_\gamma(x - \Phi(x)). \tag{2.213}$$

Proof. Using the estimates from the proof of Proposition 9.1.1 for each semi-norm $p_\gamma$ (instead of the metric $\rho$) implies that $\Phi^n(x)$ is Cauchy in each semi-norm and hence converges to a point $x^*$. Applying (2.213) to another fixed point $x = \tilde x^*$ yields $p_\gamma(\tilde x^* - x^*) = 0$ for any $\gamma$ and hence $\tilde x^* = x^*$. □

Similarly, Proposition 9.1.3 on the stability of fixed points can be directly extended:

Proposition 2.16.2. If $\Phi_1, \Phi_2$ are two mappings $V \to V$ such that $p_\gamma(\Phi^n_j(x) - \Phi^n_j(y)) \le \alpha^\gamma_n(j) p_\gamma(x - y)$ for $j = 1, 2$ and all $x, y$ and $\gamma$, with some $\alpha^\gamma_n(j)$ such that $A^\gamma(j) = 1 + \sum_{n=1}^\infty \alpha^\gamma_n(j) < \infty$, and if $p_\gamma(\Phi_1(x) - \Phi_2(x)) \le \varepsilon_\gamma$ for all $x$, then

$$p_\gamma(x^*_1 - x^*_2) \le \varepsilon_\gamma \min_{j=1,2} A^\gamma(j)$$

for the fixed points $x^*_j$ of the mappings $\Phi_j$.

In order to extend Theorems 2.1.1 and 2.1.3, we introduce the spaces $C([\tau, t], V)$ of continuous functions $[\tau, t] \to V$. These spaces are again complete Hausdorff locally convex spaces when equipped with the family of semi-norms

$$p^{[\tau,t]}_\gamma(\mu_.) = \sup_{s\in[\tau,t]} p_\gamma(\mu(s)).$$


For a closed convex set $M \subset V$ and $Y \in M$, let $C([\tau, t], M)$ be a closed convex subset of $C([\tau, t], V)$ of functions with values in $M$, and let $C_Y([\tau, t], M)$ be a subset of functions $\mu$ with $\mu_\tau = Y$. If the family of semi-norms $p_\gamma$ is countable, then both $V$ and $C([s, t], V)$ are Frechet spaces, and hence metric spaces. However, this property will not be used here.

The proof of the following result is derived in the same way from Propositions 2.16.1 and 2.16.2 as the proofs of Theorems 2.1.1 and 2.1.3 are derived from the corresponding fixed-point principles in metric spaces.

Theorem 2.16.1. Suppose that for any $Y \in M$ and $\alpha \in B_1$ (with $B_1$ a Banach space), a mapping $\Phi_{Y,\alpha} : C([\tau, T], M) \to C_Y([\tau, T], M)$ is given with some $T > \tau$ such that for any $t$ the restriction of $\Phi_{Y,\alpha}(\mu_.)$ on $[\tau, t]$ depends only on the restriction of the function $\mu_s$ on $[\tau, t]$. Moreover, for any $\gamma$, let

$$p_\gamma\bigl([\Phi_{Y,\alpha}(\mu^1_.)](t) - [\Phi_{Y,\alpha}(\mu^2_.)](t)\bigr) \le L_\gamma(Y) \int_\tau^t (t-s)^{-\omega} p^{[\tau,s]}_\gamma(\mu^1_. - \mu^2_.)\,ds,$$
$$p_\gamma\bigl([\Phi_{Y_1,\alpha_1}(\mu_.)](t) - [\Phi_{Y_2,\alpha_2}(\mu_.)](t)\bigr) \le \kappa_\gamma p_\gamma(Y_1 - Y_2) + \kappa^1_\gamma\|\alpha_1 - \alpha_2\|, \tag{2.214}$$

for any $\mu^1, \mu^2 \in C([\tau, T], M)$, some constants $\omega \in [0, 1)$, $\kappa_\gamma$, $\kappa^1_\gamma$ and continuous functions $L_\gamma$ on $M$.

Then for any $Y \in M$ and $\alpha \in B_1$, the mapping $\Phi_{Y,\alpha}$ has a unique fixed point $\mu_{t,\tau}(Y, \alpha)$ in $C_Y([\tau, T], M)$. Moreover, for all $t \in [\tau, T]$ and all $\gamma$,

$$p_\gamma(\mu_{t,\tau}(Y, \alpha) - Y) \le e^{(t-\tau)L_\gamma(Y)} p_\gamma\bigl([\Phi_{Y,\alpha}(Y)](t) - Y\bigr), \tag{2.215}$$

if ω = 0, or

$$p_\gamma(\mu_{t,\tau}(Y, \alpha) - Y) \le E_{1-\omega}\bigl(L_\gamma(Y)\Gamma(1-\omega)(t-\tau)^{1-\omega}\bigr) p_\gamma\bigl([\Phi_Y(Y)](t) - Y\bigr), \tag{2.216}$$

if ω > 0.

Finally, the fixed points $\mu_{t,\tau}(Y_1, \alpha_1)$ and $\mu_{t,\tau}(Y_2, \alpha_2)$ with different initial data $Y_1, Y_2$ and parameters $\alpha_1, \alpha_2$ satisfy the estimate

$$p_\gamma(\mu_{t,\tau}(Y_1, \alpha_1) - \mu_{t,\tau}(Y_2, \alpha_2)) \le \bigl(\kappa_\gamma p_\gamma(Y_1 - Y_2) + \kappa^1_\gamma\|\alpha_1 - \alpha_2\|\bigr)\exp\{(t-\tau)L_\gamma(Y_j)\}, \tag{2.217}$$

if ω = 0, or

$$p_\gamma(\mu_{t,\tau}(Y_1, \alpha_1) - \mu_{t,\tau}(Y_2, \alpha_2)) \le \bigl(\kappa_\gamma p_\gamma(Y_1 - Y_2) + \kappa^1_\gamma\|\alpha_1 - \alpha_2\|\bigr) E_{1-\omega}\bigl(L_\gamma(Y_j)\Gamma(1-\omega)(t-\tau)^{1-\omega}\bigr) \tag{2.218}$$

if $\omega > 0$, for $j = 1, 2$.

All results that have been previously derived for Banach spaces can be straightforwardly extended to the case of general $V$. For instance, Theorems 2.2.1 and 2.12.1 rewrite as follows.


Theorem 2.16.2. Let F be a continuous mapping R× V → V which is Lipschitz-continuous in the variable μ ∈ V in all semi-norms, i.e.,

pγ(F (t, μ1)− F (t, μ2)) ≤ Lγpγ(μ1 − μ2)

for all $\gamma$ and some constants $L_\gamma$. Then for any $Y \in V$ there exists a unique global solution $\mu_t = \mu_t(Y)$ to the Cauchy problems

$$\dot\mu_t = F(t, \mu_t), \qquad \mu_a = Y, \quad t \in \mathbf{R}, \tag{2.219}$$

and

$$D^\beta_{a+*}\mu_t = F(t, \mu_t), \qquad \mu_a = Y, \quad t \ge a, \tag{2.220}$$

with any β ∈ (0, 1). Moreover,

$$p_\gamma(\mu_t(Y) - Y) \le |t-a|\,e^{|t-a|L_\gamma} \sup_{s\in[a,t]} p_\gamma(F(s, Y)) \tag{2.221}$$

and

$$p_\gamma(\mu_t(Y) - Y) \le E_\beta(L_\gamma(t-a)^\beta)\,\frac{(t-a)^\beta}{\Gamma(\beta+1)} \max_{s\in[a,t]} p_\gamma(F(s, Y)), \tag{2.222}$$

for the equations (2.219) and (2.220), respectively.

Finally, the solutions $\mu_t(Y_1)$ and $\mu_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate

$$p_\gamma(\mu_t(Y_1) - \mu_t(Y_2)) \le e^{|t-a|L_\gamma} p_\gamma(Y_1 - Y_2) \tag{2.223}$$

and

$$p_\gamma(\mu_t(Y_1) - \mu_t(Y_2)) \le p_\gamma(Y_1 - Y_2)\, E_\beta(L_\gamma(t-a)^\beta) \tag{2.224}$$

for the equations (2.219) and (2.220), respectively.

Similar extensions can be achieved for the results on sensitivity.
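As a toy illustration of such semi-norm-wise estimates (a sketch, not from the text), one can realize $V$ as a space of sequences with semi-norms $p_\gamma(x) = \max_{j\le\gamma}|x_j|$ and take the lower-triangular shift $F(x)_1 = 0$, $F(x)_j = x_{j-1}$ for $j \ge 2$, which is Lipschitz in every $p_\gamma$ with $L_\gamma = 1$; the Picard iterations then converge semi-norm-wise, and the solution of $\dot x = F(x)$, $x(0) = (1, 0, 0, \dots)$, is $x_j(t) = t^{j-1}/(j-1)!$.

```python
import numpy as np
from math import factorial

J, T, N = 12, 1.0, 2000                       # truncation level, horizon, time grid
t = np.linspace(0.0, T, N + 1)
h = T / N
x0 = np.zeros(J); x0[0] = 1.0

def F(x):                                     # x has shape (N+1, J); lower-triangular shift
    out = np.zeros_like(x)
    out[:, 1:] = x[:, :-1]
    return out

x = np.tile(x0, (N + 1, 1))                   # initial guess: the constant curve x0
for _ in range(40):                           # Picard iterations of the integral map
    Fx = F(x)
    x = x0 + np.concatenate([np.zeros((1, J)),
                             np.cumsum(0.5*h*(Fx[1:] + Fx[:-1]), axis=0)])

exact = np.array([T**j / factorial(j) for j in range(J)])
for gamma in (3, 6, 12):                      # error measured in a few semi-norms p_gamma
    print(gamma, np.abs(x[-1, :gamma] - exact[:gamma]).max())
```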

2.17 Monotone and accretive operators

In this section, we remind the reader of the useful notions of monotone and accretive operators, returning to Banach spaces for simplicity. Moreover, for the sake of completeness, we sketch the main results of the general theory of ODEs involving such operators, which can cover quite many interesting problems. For more details, however, we refer to the abundant literature, where this theory is well documented, see, e.g., [27, 244, 262].

A Banach space is called strictly convex if $x \ne y$, $\|x\| = \|y\| = 1$ implies $\|hx + (1-h)y\| < 1$ for any $h \in (0, 1)$. A Banach space is called uniformly convex if for any $\varepsilon \in (0, 2)$ there exists $\delta$ such that $\|x\| \le 1$, $\|y\| \le 1$ and $\|x - y\| > \varepsilon$ implies $\|x + y\| < 2(1 - \delta)$.


Exercise 2.17.1.

(i) A Banach space is strictly convex if $x \ne y$, $\|x\| = \|y\| = 1$ implies $\|hx + (1-h)y\| < 1$ for some $h \in (0, 1)$.

(ii) If a Banach space is uniformly convex, then it is strictly convex.

Proposition 2.17.1. If a Banach space $B$ or its dual $B^*$ is uniformly convex, then $B$ is reflexive (see, e.g., [244]).

Basic examples of uniformly convex Banach spaces are Hilbert spaces and the spaces $L_p(\mathbf{R}^n)$ with $p > 1$. On the other hand, the spaces $L_1(\mathbf{R}^n)$ and $M(\mathbf{R}^n)$ are neither strictly convex nor reflexive.

The duality mapping $J$ of $B$ is defined as the following multi-valued mapping from $B$ to $B^*$:

$$J(x) = \{x^* \in B^* : (x^*, x) = \|x\|^2 = \|x^*\|^2\}. \tag{2.225}$$

Exercise 2.17.2. If $B = L_p(\mathbf{R}^n)$, $p > 1$, then $J$ is single-valued. Namely, if $f \in B$ with $\|f\| = 1$, then $(J(f))(x) = \mathrm{sgn}(f(x))|f(x)|^{p-1} \in L_q(\mathbf{R}^n)$ with $1/q + 1/p = 1$.

Exercise 2.17.3. Let $B = L_1(\mathbf{R}^n)$ and $f \in L_1(\mathbf{R}^n)$. If $f > 0$ everywhere, then $J(f) = 1$. In general, $g \in J(f)$ if $g(x) = \mathrm{sgn}(f(x))$ for $f(x) \ne 0$ and $g(x) \in [-1, 1]$ otherwise.

Proposition 2.17.2. The image $J(x)$ is never empty. (This is a consequence of the Hahn–Banach theorem.) If $B$ is strictly convex, then the duality mapping is single-valued.

Proof. See [244], Section II.8. □

A mapping $A$ from a subspace $D$ of a Banach space $B$ to subsets of its dual $B^*$ (in other words, a multi-valued mapping $D \to B^*$) is called monotone if $(x^* - y^*, x - y) \ge 0$ for all $x, y \in D$, $x^* \in A(x)$, $y^* \in A(y)$. In particular, if $A$ is linear and single-valued, this is equivalent to the requirement that $(A(x), x) \ge 0$. The most fundamental examples of monotone mappings are the sub-gradients of convex functions. Recall that for a convex function $\varphi : B \to \mathbf{R}$ the sub-gradient $\partial\varphi(x)$ at $x$ is defined as

∂φ(x) = {x∗ ∈ B∗ : φ(z)− φ(x) ≥ (x∗, z − x) for all z ∈ B}. (2.226)

Summing up these conditions for the pairs (x, y) and (y, x) yields

(x∗ − y∗, x− y) ≥ 0 for all x∗ ∈ ∂φ(x), y∗ ∈ ∂φ(y),

that is, the monotonicity of the sub-gradient mapping x → ∂φ(x).

If $H$ is a Hilbert space and $D \subset H$, a mapping $A : H \to H$ is called accretive if it becomes monotone after the usual identification of $H$ and $H^*$ (see Remark 1), i.e., if $(A(x) - A(y), x - y) \ge 0$ for all $x, y \in D$.

In order to define accretivity for Banach spaces, one uses the duality mapping to transfer $B$ to $B^*$. As for monotonicity, the most natural notion of accretivity is formulated in terms of multi-valued mappings. Recall that by a (binary) relation


$A$ on $B$, one means any subset of $B \times B$ whose domain $D(A)$ is defined as $\{x \in B : \exists y : \{x, y\} \in A\}$ and whose range or image is defined as $\{y \in B : \exists x : \{x, y\} \in A\}$. These relations are naturally identified with multi-valued mappings (which we denote by the same letter): $A(x) = \{y : \{x, y\} \in A\}$. The inverse relation or the inverse mapping is defined as $A^{-1} = \{\{y, x\} : \{x, y\} \in A\}$. Linear operations on the relations are defined as $\lambda A = \{\{x, \lambda y\} : \{x, y\} \in A\}$ for $\lambda \in \mathbf{R}$ and

$$A + B = \{\{x, y + z\} : \{x, y\} \in A, \{x, z\} \in B\}.$$

A relation $A$ is said to be a contraction if $\|y_1 - y_2\| \le \|x_1 - x_2\|$ for all $y_j \in A(x_j)$.

The relation or the multi-valued mapping A is called accretive if

[x1 − x2, y1 − y2]+ ≥ 0 ⇐⇒ (x1 − x2, y1 − y2)+ ≥ 0 (2.227)

whenever $\{x_j, y_j\} \in A$, $j = 1, 2$. (The equivalence follows from (1.40).) By the definition of the semi-inner product and by the monotonicity of the slopes of convex functions (1.37), this condition is equivalent to the requirement that

$$\|x_1 - x_2\| \le \|(x_1 + \alpha y_1) - (x_2 + \alpha y_2)\| \tag{2.228}$$

whenever $\alpha > 0$ and $\{x_j, y_j\} \in A$, $j = 1, 2$. In other words, this requirement means that the relations $J_\alpha = (I + \alpha A)^{-1}$ are contractions for all $\alpha > 0$. If such an $A$ is single-valued, one can call it an accretive mapping. Note, however, that the term ‘accretive mapping’ often refers to multi-valued mappings in the literature.

For a linear operator $A$, condition (2.228) turns into

$$\|x + \alpha Ax\| \ge \|x\|$$

for any $\alpha > 0$, in which case one says that the operator $(-A)$ is dissipative.
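The contraction property of the resolvents $J_\alpha = (I + \alpha A)^{-1}$ is easy to observe in the simplest cases. The following sketch (not from the text; a one-dimensional monotone example $A(x) = x^3$) computes $J_\alpha$ by bisection and checks that it does not increase distances, cf. (2.228).

```python
import numpy as np

def resolvent(x, alpha, tol=1e-12):
    """Solve y + alpha*y**3 = x for y (the map y -> y + alpha*y^3 is strictly increasing)."""
    lo, hi = -abs(x) - 1.0, abs(x) + 1.0
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if mid + alpha*mid**3 < x:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

rng = np.random.default_rng(0)
alpha = 0.7
for _ in range(5):
    x1, x2 = rng.normal(size=2) * 3
    d_res = abs(resolvent(x1, alpha) - resolvent(x2, alpha))
    print(d_res <= abs(x1 - x2) + 1e-9)       # True: J_alpha is a contraction
```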

Proposition 2.17.3.

(i) $A$ is accretive if and only if $\{x_j, y_j\} \in A$, $j = 1, 2$ implies that there exists $f \in J(x_1 - x_2)$ such that $(f, y_1 - y_2) \ge 0$.

(ii) If $A$ is accretive, then the image of $(I + \alpha A)$ coincides with the whole $B$ for some $\alpha > 0$ if and only if it coincides with the whole $B$ for all $\alpha > 0$.

Proof. See [244], Section IV.7. □

An accretive relation is called m-accretive if the last condition in Proposition 2.17.3 holds, i.e., the image of $(I + \alpha A)$ coincides with the whole $B$ for all $\alpha > 0$.

The following Kato’s theorem is the main result for accretive operators in auniformly convex context.

Theorem 2.17.1. Let $B^*$ be uniformly convex and $A$ an m-accretive relation in $B$. Then for any $u_0 \in D(A)$, $\omega \in \mathbf{R}$ and $T > 0$, there exists a unique curve $u \in C([0, T]; B)$ such that

ωu(t) ∈ u′(t) +A(u(t)) (2.229)

for almost all $t \in [0, T]$. Moreover, $u'$ exists almost everywhere and is uniformly bounded.


Proof. See [244], Section IV.7. □

For single-valued relations $A$, the inclusion (2.229) reduces to the usual ODE $u'(t) + A(u(t)) = \omega u(t)$.

An extension of this result exists for general Banach spaces, namely the Crandall–Liggett Theorem, see, e.g., [244] or [262]. In this case, however, one cannot generally guarantee the existence of a classical solution. Therefore, one proves the existence of a unique generalized solution, the so-called $C_0$-solution, which is defined as the limit of certain natural discrete approximations.

Kato’s and Crandall–Ligget’s Theorems extend the famous Hille–Yosida re-sult for linear operators A to the nonlinear case. As for the Hille–Yosida case,the necessity to check m-accretivity is a very delicate point when checking theassumptions of these theorems, since m-accretivity is a much stronger require-ment than just accretivity. We refer to the above-mentioned books for numerousexamples of successful applications of Kato’s and Crandall–Ligget’s Theorem. Anotable example is the so-called porous medium equation in L1(Ω), Ω ⊂ Rn, withAu = −Δρ(u) and ρ some real function, the main example being ρ(r) = r|r|m−1.

2.18 Hints and answers to chosen exercises

Exercise 2.2.1. The solution is $x^2(t) = (x_0^{-2} - 2t)^{-1}$, and the explosion time is given by $t_0 = 1/(2x_0^2)$.

Exercise 2.2.2. For any $t_0 \ge 0$, the formulae $x(t) = 0$ for $t \le t_0$ and $x(t) = (t - t_0)^2/4$ for $t \ge t_0$ define a solution.

Exercise 2.2.3. Substituting $u(t, x) = w(x + ct)$ in (2.28) yields the equation

$$cw' = \frac{3}{2} w w' + \frac{1}{4} w''',$$

which implies

$$cw = \frac{3}{4} w^2 + \frac{1}{4} w'' + c_1$$

with a constant $c_1$. For further integration and to get rid of the second derivative, the trick is to multiply this equation by the factor $w'$.

Exercise 2.3.2. Since $\sigma_j^2 = 1$, $\exp\{t\sigma_j\} = \cosh t + \sigma_j \sinh t$.

Exercise 2.3.3. The key point is that $(J_2)^2 = 0$, $(J_3)^3 = 0$. Answer:

$$\exp\{tJ_2\} = 1 + tJ_2 = \begin{pmatrix} 1 & t\\ 0 & 1\end{pmatrix}, \qquad \exp\{tJ_3\} = 1 + tJ_3 + \frac{t^2}{2}J_3^2 = \begin{pmatrix} 1 & t & t^2/2\\ 0 & 1 & t\\ 0 & 0 & 1\end{pmatrix}.$$


Exercise 2.3.4. Rewrite the equation in terms of $f$ and $y = \dot f$, and use the Duhamel formula. Alternatively, rewrite the equation in terms of the function $\dot f$. Answer:

$$f(t) = f_0 + \frac{1}{b}(1 - e^{-bt})y_0 + \frac{1}{b}\int_0^t (1 - e^{-(t-s)b})g(s)\,ds. \tag{2.230}$$

Exercise 2.3.5. Answer:

$$f(t) = f_0 + (f_T - f_0)\frac{1 - e^{-bt}}{1 - e^{-bT}} + \frac{1}{b}\int_0^t (1 - e^{-(t-s)b})g(s)\,ds - \frac{1 - e^{-bt}}{b(1 - e^{-bT})}\int_0^T (1 - e^{-(T-s)b})g(s)\,ds. \tag{2.231}$$

Exercise 2.5.1. The maximum in $\max_p(pv - (G(x)p, p)/2)$ is attained at $p = G^{-1}(x)v$.

Exercise 2.6.1. If the curve $(x(\tau), p(\tau))$ is a solution to (2.81) joining $x_0$ and $x$ in the time $t$, then the curve $(\tilde x(\tau) = x(t - \tau), \tilde p(\tau) = -p(t - \tau))$ is the solution to the Hamiltonian system with the Hamiltonian $H$ joining the points $x$ and $x_0$ in the time $t$.

Exercise 2.6.2. For the first equation in (2.113), note that

$$\frac{\partial^2 S}{\partial x^2}(t, x, x_0) = \frac{\partial p(t, x)}{\partial x},$$

where $p(t, x) = P(t, x_0, p_0)$, $x = X(t, x_0, p_0)$.

Exercise 2.11.1. Differentiation turns this equation into the ODE $\ddot x = x$.

Exercise 2.15.1. The estimates for the increments $\xi^{n+1}_t - \xi^n_t$ become

$$\|\xi^{n+1}_t - \xi^n_t\| \le \Gamma(1-\omega) L(M_T(R))\, I^{1-\omega}_\tau \|\xi^n_. - \xi^{n-1}_.\|(t) + \kappa\|\xi\|(t-\tau)\frac{\bigl(L(M_T(R))\Gamma(1-\omega)(t-\tau)^{1-\omega}\bigr)^{n-1}}{\Gamma(n(1-\omega)+1)}$$

for some constant $\kappa = \kappa(T)$ that depends on $T$.

Exercise 2.17.1. (i) Let $h_0 \in (0, 1)$ be such that $\|h_0 x + (1 - h_0)y\| < 1$. Suppose that there is another $\bar h \in (0, 1)$, say $\bar h > h_0$, such that $\|\bar h x + (1 - \bar h)y\| = 1$. Then $\|hx + (1 - h)y\| = 1$ for all $h \in [\bar h, 1]$, which contradicts the assumption for the pair of points $\bar h x + (1 - \bar h)y$ and $x$.

(ii) This follows from (i), since it yields the condition there with $h = 1/2$.

Exercise 2.17.2. This follows from the Hölder inequality, which states that $\int g(x)f(x)\,dx \le \|g\|_p\,\|f\|_q$. The equality can only occur if $|g(x)|^p = |f(x)|^q$ for almost all $x$.


2.19 Summary and comments

As mentioned before, the material of this chapter is more or less standard. Still, we tried hard to balance clarity, brevity and a reasonable generality, as well as to streamline and simplify the proofs.

A methodological novelty lies in the systematic use of abstract fixed-point results for curves from Section 2.1, which is amplified by the use of the semigroup of fractional integration for dealing with singularities in time. This leads to a very concise presentation of various well-posedness results, extending usual ODEs to rather general causal equations and equations with fractional derivatives, and to precise estimates (including constants) for the growth of solutions, the continuous dependence on the initial data and the derivatives with respect to initial data and parameters. Later in this book, we shall see that the results from Section 2.1 allow for quick and elegant arguments in other, more advanced situations like nonlinear diffusions or the Hamilton–Jacobi–Bellman equations. Also, much care was given to distinguish the differentials of Gateaux and Frechet, which is a specific feature of the infinite-dimensional setting. Moreover, the method of T-products or chronological exponentials was developed in some detail. This method is used in abundance in the physics literature, while (strangely enough) it is a rare guest in mathematics textbooks.

The representation (2.80) for the Mittag-Leffler function, which is crucial for our approach to fractional calculus, was probably first established by Zolotarev in [266], following the results of Pollard [223], who proved that the Mittag-Leffler function is the Laplace transform of a positive function. In Chapter 8, we shall give two new proofs for this formula and its extension to generalized Mittag-Leffler functions.

Section 2.5 is a very short introduction to Hamiltonian systems, with the emphasis on boundary-value problems. For a full exposition of such boundary-value problems (including complex and/or stochastic characteristics), we refer the reader to [136] and [159]. The classical book on the mathematical aspects of Hamiltonian mechanics is [16]. Hamiltonian dynamics can be integrable or exhibiting chaotic behaviour. Integrable systems have rather transparent general structures, since their trajectories fill the tori (Arnold–Liouville theorem). Therefore, much effort was given to the description and classification of integrable systems. For two-dimensional flows, a powerful topological invariant is the Fomenko–Zieschang invariant that was discovered in [85]. This invariant can be effectively calculated for many classical systems in order to prove their topological equivalence or non-equivalence, see, e.g., [203] and [40]. Among the conservation laws for Hamiltonian systems, a most prominent role is played by laws that are polynomial in the momentum. Therefore, much attention has been given to geodesic flows with such first integrals. Starting from the classification of geodesic flows with quadratic integrals in [132], the problem was intensively studied with impressive results on the geodesic flows, see, e.g., [252] and [172], as well as for other systems, see [170].


The integrability of geodesic flows with polynomial integrals leads to various interesting geometric properties (like all geodesics being closed), see [133] and [126], as well as spectral properties of the corresponding geometric Laplacians, see [67]. Complete integrability of infinite-dimensional Hamiltonian systems remains a very active area of research that we do not touch here at all.

Sections 2.6, 2.7 and 2.10 touch the vast area of the method of characteristics for first-order PDEs. Its development in the geometric theory of branching solutions leads to the dynamics of Lagrangian manifolds, which specifies the quasi- or semiclassical approximations for quantum mechanical problems via the Maslov canonical operator, see, e.g., [199, 200]. One version of this theory can be applied to diffusions and other stochastic processes, see [136]. The idea is that in the asymptotic analysis of PDEs with a small parameter, like the Schrödinger equation $ih\dot\psi = (V(x) - h^2\Delta/2)\psi$ or the diffusion equation $h\dot u = -(V(x) - h^2\Delta/2)u$, the solution is sought in a quickly oscillating form $\psi(t, x) = \varphi(t, x)\exp\{iS(t, x)/h\}$ or in a bell-shaped form $u(t, x) = \varphi(t, x)\exp\{-S(t, x)/h\}$ with $S \ge 0$, respectively.

The developments in the HJB-equation theory in applications to optimization and games lead to the theory of generalized solutions based on the idea of viscosity solutions, see [83], or on Subbotin's minimax solutions, see [250] and [251], or on the idempotent (or tropical, or max-plus) superposition principle and related ideas of approximations, see, e.g., [158] and [194]. The analogues of usual characteristics for the generalized solutions are certain piecewise smooth curves that are specified by the Pontryagin maximum principle.

Apart from classical Hamiltonian systems, the method of characteristics has been successfully applied to general first-order PDEs, as well as to integro-differential and general abstract operator equations, see, e.g., [200] and [201]. Semiclassical asymptotics involving general first-order PDEs were developed in the framework of superprocesses in [135].

The method of characteristics can even be successfully developed if the solutions to the underlying ODEs are not well defined. In this case, certain generalized solutions arise, see, e.g., the case of the transport equation in [64, 193] and references therein.

An interesting development is the method of stochastic characteristics, which can be used for semiclassical approximations of the stochastic heat or Schrödinger equation (see [136]). This method transforms stochastic PDEs to simpler PDEs with random coefficients, see [174] for the general theory and [162] for its application to the analysis of sensitivity of stochastic McKean–Vlasov equations. For stochastic characteristics in optimal control, we can refer to [66] and references therein. Various probabilistic methods for integral representations and numerical solutions (see, e.g., [61] and references therein) can also be considered an extension of the method of characteristics, where the solutions to PDEs propagate via the random trajectories of Markov processes.

More details on causal equations that have been touched upon in Section 2.11 can be found, e.g., in [178]. Insightful examples occur in the modelling of


epidemics, see, e.g., [230]. Fractional equations were briefly discussed as an exemplary application of general fixed-point theorems for curves. This topic will be picked up and further developed in Chapter 8.

Well-written books on ODEs in Banach spaces include [27, 197, 257, 262]. A more application-oriented presentation is given in [244]. The so-called degenerate equations (where the l.h.s. $\frac{d}{dt}Mf$ contains a linear operator $M$ with a nontrivial kernel, rather than just $\frac{d}{dt}$) are developed in [75]. The basics of ordinary differential equations in locally convex spaces have been developed in [209] and [189].

The porous medium equation $\dot u = \Delta\rho(u)$ that was mentioned in Section 2.17 remains a very active area of research, see, e.g., [217] and [4] and references therein.

Note that we did not touch the very important direction of research that arises from ODEs with a discontinuous r.h.s. For such equations, we refer to [79, 80] and [171]. A proper analysis of such equations naturally leads to the theory of differential inclusions, see [18], [248].

Abundant literature exists that is devoted only to the existence of solutions to ODEs when the r.h.s. is continuous, but not Lipschitz-continuous. For this purpose, a different class of fixed-point principles must be used. The most standard principle is the Schauder fixed-point principle, stating that a compact (completely continuous) mapping from a convex closed subset of a Banach space to itself has a fixed point. An almost direct application of this result yields the existence of a solution to the Carathéodory equation $\dot x = f(t, x)$ in $\mathbf{R}^d$, where $f$ is measurable with respect to $t$, continuous with respect to $x$ and bounded by a summable function of $t$ (see, e.g., [80] for a proof). If $f$ is continuous, this result is the classical Peano theorem. In Banach spaces, however, it does not hold. Namely, Godunov's theorem states that for any infinite-dimensional Banach space $B$ there exists a continuous function $f(t, x)$ on $\mathbf{R} \times B$ and an initial condition $x_0$ such that the equation $\dot x = f(t, x)$ has no (even local) solution with this initial condition. Hájek and Johanis showed in [102] that for any separable Banach space $B$ there exists a continuous mapping $f : B \to B$ such that the autonomous equation $\dot x = f(x)$ has no solution at any point. In order to prove the existence of solutions of Banach space-valued ODEs, one uses various extensions of the fixed-point principles that often depend on some measure of non-compactness, the simplest one being the Kuratowski measure of non-compactness $\chi(S)$ of subsets of a metric space, which is the infimum of numbers $\varepsilon$ such that there exists a covering of $S$ by a finite number of sets whose diameter does not exceed $\varepsilon$. E.g., the Sadovskii fixed-point principle states that a continuous condensing mapping from a convex closed subset of a Banach space to itself has a fixed point (see [236]). (A mapping $f$ from a bounded subset $S$ of a Banach space $B$ is called condensing if $\chi(f(S')) < \chi(S')$ for all $S' \subset S$.) The related weaker Darbo theorem [58] states the existence of a fixed point if $\chi(f(S')) \le k\chi(S')$ for all $S' \subset S$ with some $k \in (0, 1)$.

Many peculiarities arise when passing from Banach spaces to Frechet spaces. For instance, one can use a version of the Lipschitz condition with the Lipschitz constant being an infinite-dimensional matrix that calibrates the stretching of


different norms (see, e.g., [108] and references therein). Another approach is based on the fact that Frechet spaces are projective limits of Banach spaces and that ODEs in Frechet spaces can be recast in terms of the systems of equations in sequences of Banach spaces (see, e.g., [89] and references therein). Yet another approach is based on various subtle smoothing properties of specific Frechet spaces and the resulting inverse function theorems (see, e.g., [225] and references therein).

Let us stress again that we were mainly interested in constructing sufficiently regular global solutions to infinite-dimensional ODEs. Abundant literature exists that is devoted to the general classification of various particular features of the solutions to infinite-dimensional ODEs as compared to finite-dimensional ones, see, e.g., [52, 101, 112, 189].


Chapter 3

Discrete Kinetic Systems: Equations in $l^+_p$

In this chapter, we initiate the theory of positivity-preserving ODEs with unbounded coefficients in the most simple case of spaces of sequences. Unlike the Lipschitz-continuous case, the behaviour for forward and backward times is quite different.

As a warm-up, we begin with an elementary theory of ODEs in $\mathbf{R}^n_+$. This setting allows for a succinct demonstration of two new tools that arise as a substitute for global Lipschitz continuity, namely the bounds from positivity preservation and from linear Lyapunov functions. Afterwards, we introduce the main examples for equations in $\mathbf{R}^n_+$ and $l^+_p$ that occur in natural and social sciences. The analysis of equations in $\mathbf{R}^n_+$ is concluded by basic results on equilibria and ergodicity that arise from analysing entropy and its extensions.

The study of chemical reactions gives inspiration for a useful general representation of positivity-preserving ODEs in order to describe the evolution of interacting particle systems. This representation is the starting point for more advanced methods that are based on two additional tools, namely moment estimates and accretivity. These methods will be further developed for equations in $l_p$.

Hence, this chapter is effectively decomposed into two parts that can be read independently: first Sections 3.1 to 3.4 on the evolutions in $\mathbf{R}^n_+$, with the emphasis on large time behaviour, and secondly Sections 3.5 to 3.14 on the evolutions in $l^+_p$, with the emphasis on non-explosion, uniqueness and sensitivity.

Note that the results of this chapter are not used in other parts of the book and give a more or less self-contained overview of the topic. In Chapter 7, the theory will be generalized to measure-valued evolutions on arbitrary state spaces.


3.1 Equations in $\mathbf{R}^n_+$

Even very simple equations with a quadratic r.h.s. may have no global solutions. For instance, the equation $\dot x = x^2$ in $\mathbf{R}$ has the general solution $x(t) = (x_0^{-1} - t)^{-1}$, which for $x_0 > 0$ is only defined for $t < 1/x_0$. Nevertheless, important classes of equations with a quadratic or polynomial r.h.s. do have global solutions due to the combined effect of two bounds, arising from positivity and growth estimates that are governed by a linear Lyapunov function.

First of all, recall the notations
$$\mathbb{R}^n_+ = \{x = (x_1,\dots,x_n) \in \mathbb{R}^n : x_j \ge 0 \text{ for all } j\}, \qquad \Sigma_n = \Big\{x \in \mathbb{R}^n_+ : \sum_j x_j = 1\Big\}$$
for the positive quadrant and the standard simplex in $\mathbb{R}^n$. We shall denote the interior of $\mathbb{R}^n_+$ by
$$\mathbb{R}^n_{++} = \{x = (x_1,\dots,x_n) \in \mathbb{R}^n : x_j > 0 \text{ for all } j\}.$$
The expression $(x,y)$ denotes the usual inner product in $\mathbb{R}^n$.

The following simple observation is the starting point for our analysis. If $A$ is an $n\times n$-matrix, then the solution $e^{tA}x$ to the linear equation $\dot x = Ax$ in $\mathbb{R}^n$ with the initial condition $x$ always takes $\mathbb{R}^n_+$ to itself if and only if $A$ is conditionally positive in the sense that its off-diagonal terms are non-negative (or equivalently, if $(Av)_j \ge 0$ whenever $v_j = 0$). We say that the r.h.s. $f(x)$ of the ODE
$$\dot x = f(x) \iff \{\dot x_j = f_j(x) \text{ for all } j = 1,\dots,n\} \qquad (3.1)$$
is conditionally positive if $f_j(x) \ge 0$ whenever $x_j = 0$. Moreover, we say that a vector $L \in \mathbb{R}^n_{++}$ is a Lyapunov function or a Lyapunov vector for equation (3.1), or that the mapping $f$ has the Lyapunov function $L$, if the Lyapunov condition
$$(L, f(x)) \le a(L,x) + b \qquad (3.2)$$
holds with some constants $a, b$. The function $L$ is called a subcritical (respectively critical) Lyapunov function, or $f$ is said to be $L$-subcritical (respectively $L$-critical), if $(L,f(x)) \le 0$ (respectively $(L,f(x)) = 0$) for all $x \in \mathbb{R}^n_+$.

Theorem 3.1.1. If $f : \mathbb{R}^n_+ \to \mathbb{R}^n$ is conditionally positive, locally Lipschitz-continuous (that is, Lipschitz-continuous on any bounded subset of $\mathbb{R}^n_+$) and has a Lyapunov vector $L \in \mathbb{R}^n_{++}$, then for any $x \in \mathbb{R}^n_+$ there exists a unique global solution $X(t,x)$ (defined for all $t \ge 0$) to equation (3.1) with the initial condition $x$, and this solution lies in $\mathbb{R}^n_+$ for all $t$. Moreover, if $a \ne 0$, then
$$0 \le (L, X(t,x)) \le e^{at}\Big((L,x) + \frac{b}{a}\Big) - \frac{b}{a}. \qquad (3.3)$$
If $a = 0$, then $(L, X(t,x)) \le (L,x) + bt$. Finally, if $f$ is $L$-critical, then $(L, X(t,x)) = (L,x)$.


Remark 45. Intuitively, this result is clear. In fact, by conditional positivity, the vector field $f(x)$ at any boundary point of $\mathbb{R}^n_+$ is directed inside or tangent to the boundary, thus not allowing a solution to leave. On the other hand, the Lyapunov condition implies that
$$(L, X(t,x)) \le (L,x) + a\int_0^t (L, X(s,x))\, ds + bt,$$
which leads to (3.3) by Gronwall's lemma. However, a rigorous proof is not fully straightforward, since already the existence of a solution is not clear: any attempt to construct a solution via the usual approximation schemes (see, e.g., Theorem 2.2.1) or by standard Euler or Peano approximations encounters a problem, since all these approximations may fail to be positivity-preserving (and may therefore jump out of the domain where $f$ is defined).

Proof. We obtain positivity-preserving approximations by using a linear bound for the negative part of $f$. Namely, assuming $a \ne 0$ for definiteness, let
$$M^x_{a,b}(t) = \Big\{ y \in \mathbb{R}^n_+ : (L,y) \le e^{at}\Big((L,x) + \frac{b}{a}\Big) - \frac{b}{a} \Big\}.$$
Fixing $T$, let us define the space $C_{a,b}(T)$ of continuous functions $y : [0,T] \to \mathbb{R}^n_+$ such that $y(t) \in M^x_{a,b}(t)$ for all $t \in [0,T]$.

Now, let $f_L = f_L(x)$ be the maximum of the Lipschitz constants (1.7) of all $f_j$ on $M^x_{a,b}(T)$, and let us pick a constant $K = K(x) \ge f_L$. By conditional positivity, $f_j(y) \ge -K y_j$ in $M^x_{a,b}(T)$. Therefore, we can rewrite equation (3.1) equivalently as
$$\dot y = (f(y) + Ky) - Ky \iff \{\dot y_j = (f_j(y) + K y_j) - K y_j,\ j = 1,\dots,n\}, \qquad (3.4)$$
which ensures that the nonlinear part $f_j(y) + K y_j$ of the r.h.s. is always non-negative.

Next, we modify the usual approximation scheme (see Theorem 2.2.1) by defining the map $\Phi_x$ from $C_{a,b}(T)$ to itself in the following way: for a $y \in C([0,T], \mathbb{R}^n_+)$, let $\Phi_x(y)$ be the solution to the equation
$$\frac{d}{dt}[\Phi_x(y)](t) = f(y(t)) + K y(t) - K[\Phi_x(y_.)](t)$$
with the initial data $[\Phi_x(y)](0) = x$. It is a linear equation with a unique explicit solution, which can be taken as an alternative definition of $\Phi_x$:
$$[\Phi_x(y)](t) = e^{-Kt}x + \int_0^t e^{-K(t-s)}\big[f(y(s)) + K y(s)\big]\, ds.$$
Clearly, a fixed point of $\Phi_x$ is a solution to (3.1) with the initial data $x$.


Next, let us check that $\Phi_x$ takes $C_{a,b}(T)$ to itself. In fact, if $y \in C_{a,b}(T)$, we find that
$$\begin{aligned}
(L, [\Phi_x(y)](t)) &= e^{-Kt}(L,x) + \int_0^t e^{-K(t-s)}\big[(a+K)(L,y(s)) + b\big]\, ds \\
&\le e^{-Kt}(L,x) + \int_0^t e^{-K(t-s)}\Big[(a+K)\Big(e^{as}\Big((L,x) + \frac{b}{a}\Big) - \frac{b}{a}\Big) + b\Big]\, ds \\
&= (L,x)e^{-Kt} + e^{-Kt}\big(e^{(K+a)t} - 1\big)\Big((L,x) + \frac{b}{a}\Big) - \frac{b}{a}e^{-Kt}\big(e^{Kt}-1\big) \\
&= (L,x)e^{at} + \frac{b}{a}\big(e^{at} - 1\big).
\end{aligned}$$
Notice that, due to this bound, the iterations of $\Phi_x$ remain in $C_{a,b}(T)$. Therefore, it is justified to use the Lipschitz constant $K$ for $f$. The proof is completed by referring to Theorem 2.1.2, because
$$\|[\Phi_{x_1}(y^1_.)](t) - [\Phi_{x_2}(y^2_.)](t)\| \le (K + f_L)\int_0^t \|y^1_s - y^2_s\|\, ds + \|x_1 - x_2\|$$
and $\|[\Phi_x(x)](t) - x\| \le t f(x)$. $\square$
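
The construction used in this proof is easy to mirror numerically. The following is a minimal sketch, not taken from the text, of the iteration $y \mapsto \Phi_x(y)$ for an assumed toy conditionally positive right-hand side with the critical Lyapunov vector $L = (1,1)$; the function `f`, the constant `K` and all numerical parameters are illustrative choices only.

```python
import numpy as np

# Toy conditionally positive r.h.s. on R^2_+: f_1 >= 0 on {x_1 = 0}, f_2 >= 0 on
# {x_2 = 0}, and (L, f(x)) = 0 for L = (1, 1), so x_1 + x_2 should be conserved.
def f(x):
    x1, x2 = x
    return np.array([-x1 * x2 + x2, x1 * x2 - x2])

def phi(y, x0, K, t):
    """One application of the map Phi_x from the proof:
    [Phi_x(y)](t) = e^{-Kt} x0 + int_0^t e^{-K(t-s)} (f(y(s)) + K y(s)) ds,
    discretised on the grid t with the trapezoidal rule. The integrand is
    non-negative on the relevant set, so every iterate stays in R^2_+."""
    g = np.array([f(ys) + K * ys for ys in y])
    out = np.empty_like(y)
    out[0] = x0
    for i in range(1, len(t)):
        w = np.exp(-K * (t[i] - t[: i + 1]))[:, None]   # kernel e^{-K(t_i - s)}
        vals = w * g[: i + 1]
        ds = np.diff(t[: i + 1])[:, None]
        out[i] = np.exp(-K * t[i]) * x0 + 0.5 * ((vals[:-1] + vals[1:]) * ds).sum(axis=0)
    return out

x0 = np.array([0.8, 0.2])
t = np.linspace(0.0, 2.0, 401)
K = 2.0                               # any K not smaller than the local Lipschitz constant
y = np.tile(x0, (len(t), 1))          # start the iteration from the constant function
for _ in range(30):                   # Picard-type iteration towards the fixed point
    y = phi(y, x0, K, t)

print("minimum over all components and times:", y.min())                    # stays >= 0
print("range of (L, X(t, x)):", y.sum(axis=1).min(), y.sum(axis=1).max())   # ~ constant
```

The point of the $+Ky-Ky$ splitting is visible here: the bracketed integrand is non-negative, so no iterate ever leaves the positive quadrant, in contrast to a naive Euler or Peano scheme.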

3.2 Examples in $\mathbb{R}^n_+$: replicator dynamics and mass-action-law kinetics

This section presents basic examples of equations in $\mathbb{R}^n_+$ as they arise in chemistry, physics, biology and economics. Apart from supplying practical examples, the relevant models of natural science motivate an important representation of such equations in cases when an appropriate Lyapunov function is available. Namely, they make it possible to represent the r.h.s. of any such equation as the description of a combined effect that results from many small-scaled reactions, each one of which satisfies the corresponding Lyapunov condition. This representation is crucial for the advanced analysis of such equations, since it allows for an estimate of higher moments of the conservation law and for proving the property of accretivity that leads to uniqueness and continuous dependence on initial data.

To begin with, let us consider equations with a quadratic r.h.s.:
$$\dot x = (Ax, x) \iff \{\dot x_j = (A^j x, x) \text{ for all } j = 1,\dots,N\}, \qquad (3.5)$$
where each $A^j$ is a symmetric $N\times N$-matrix. Let $\hat A^j$ denote the $(N-1)\times(N-1)$-matrix obtained from $A^j$ by deleting the row and column of index $j$. Clearly, the r.h.s. of (3.5) is conditionally positive if and only if
$$(\hat A^j v, v) \ge 0 \quad \text{for all } v \in \mathbb{R}^{N-1}_+. \qquad (3.6)$$


The following equation has similar properties:
$$\dot x = A[x^{\otimes k}] \iff \{\dot x_j = A^j[x^{\otimes k}] \text{ for all } j = 1,\dots,N\}, \qquad (3.7)$$
with the r.h.s. being a polynomial of order $k$:
$$A^j[Y(1)\otimes\cdots\otimes Y(k)] = \sum_{i_1,\dots,i_k} A^j_{i_1,\dots,i_k} Y_{i_1}(1)\cdots Y_{i_k}(k), \qquad (3.8)$$
with some array of $N^{k+1}$ numbers $A = (A^j) = \{A^j_{i_1,\dots,i_k}\}$, which is symmetric with respect to changes of the order of $i_1,\dots,i_k$ for any $j$. Note that the algebra of tensor products will not be used: here, the notation $[Y(1)\otimes\cdots\otimes Y(k)]$ can be understood as merely denoting the collection of vectors $\{Y(1),\dots,Y(k)\}$. Therefore, $X^{\otimes k}$ is just a handy notation for the collection of $k$ vectors, each of which equals $X$.

The r.h.s. of (3.7) is conditionally positive if and only if
$$\hat A^j[v^{\otimes k}] \ge 0 \quad \text{for all } v \in \mathbb{R}^{N-1}_+, \qquad (3.9)$$
where $\hat A^j$ denotes the array of $(N-1)^k$ numbers that is obtained from $A^j$ by deleting all elements $A^j_{i_1,\dots,i_k}$ with at least one of the $i_l$ being equal to $j$.

An equation with an analytic r.h.s. can thus be written as
$$\dot x = A(0) + \sum_{k=1}^{\infty} A(k)[x^{\otimes k}], \qquad (3.10)$$
where $A(0)$ is a constant vector and each $A(k)$, $k > 0$, has the form of the r.h.s. of (3.7). If only a finite number of the $A(k)$ in (3.10) are non-vanishing, then we have an equation with a general polynomial r.h.s. Clearly, if each $A(k)$ is conditionally positive, then the same applies to the r.h.s. of (3.10). Note, however, that the converse is not true. (A full discussion of this point is given in [143].)

The polynomial r.h.s. of equation (3.7) can be equivalently written as a summation over unordered and ordered collections $i_1,\dots,i_k$. This notation is widely used in the theory of interacting particles and will be frequently used in our exposition.

Proposition 3.2.1. Let $\Psi(i_1,\dots,i_k) \in \mathbb{Z}^\infty_+$ denote the profile of the collection $\{i_1,\dots,i_k\}$, that is, the coordinate $\psi_j$ of $\Psi$ equals the number of indices $j$ that enter the collection $\{i_1,\dots,i_k\}$. With the notation
$$x^\Psi = \prod_j x_j^{\psi_j}, \qquad \Psi! = \prod_j \psi_j!, \qquad (3.11)$$
the polynomial $A[x^{\otimes k}]$ given by the symmetric arrays $A = (A_{i_1\cdots i_k})$ can be rewritten as
$$A[x^{\otimes k}] = \sum_{i_1,\dots,i_k} A_{i_1\cdots i_k} x_{i_1}\cdots x_{i_k} = k! \sum_{i_1\le\cdots\le i_k} \Big(\prod_m [\psi_m(i_1,\dots,i_k)]!\Big)^{-1} A_{i_1\cdots i_k} x_{i_1}\cdots x_{i_k} = k! \sum_{\Psi\in\mathbb{Z}^\infty_+} A_\Psi \frac{x^\Psi}{\Psi!}. \qquad (3.12)$$
Therefore, equation (3.7) can be written as
$$\dot x_j = k! \sum_{\Psi\in\mathbb{Z}^\infty_+} A^j_\Psi \frac{x^\Psi}{\Psi!}. \qquad (3.13)$$

Proof. Equation (3.12) is proved by direct induction. In order to get a feeling for how the combinatorics works, let us write it down specifically for $k=2$ and $k=3$:
$$\sum_{i_1,i_2} A_{i_1 i_2} x_{i_1} x_{i_2} = \sum_i A_{ii} x_i x_i + 2\sum_{i_1<i_2} A_{i_1 i_2} x_{i_1} x_{i_2}, \qquad (3.14)$$
$$\sum_{i_1,i_2,i_3} A_{i_1 i_2 i_3} x_{i_1} x_{i_2} x_{i_3} = \sum_i A_{iii} x_i x_i x_i + 3\sum_{i<k} A_{iik} x_i x_i x_k + 3\sum_{i<k} A_{ikk} x_i x_k x_k + 3! \sum_{i_1<i_2<i_3} A_{i_1 i_2 i_3} x_{i_1} x_{i_2} x_{i_3}. \qquad (3.15)$$
(The coefficients are the multinomial weights $k!/\Psi!$ from (3.12): for $k = 3$, a profile with one doubled index has weight $3!/2! = 3$.) $\square$
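
A quick numerical check of this bookkeeping, not part of the original text, can be reassuring: for a random symmetric array, the ordered sum and the unordered sum weighted by $k!/\Psi!$ coincide. Everything below is an illustrative sketch with assumed data.

```python
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)
N, k = 4, 3
A_raw = rng.random((N,) * k)
# Symmetrise A over its k indices, as required in (3.12).
A = sum(np.transpose(A_raw, p) for p in itertools.permutations(range(k))) / math.factorial(k)
x = rng.random(N)

# Ordered summation: sum over all tuples (i1, ..., ik).
ordered = sum(A[idx] * np.prod(x[list(idx)])
              for idx in itertools.product(range(N), repeat=k))

# Unordered summation with the multinomial weights k!/Psi!, as in (3.12).
unordered = 0.0
for idx in itertools.combinations_with_replacement(range(N), k):
    psi = np.bincount(idx, minlength=N)          # the profile Psi of the collection
    weight = math.factorial(k) / np.prod([math.factorial(int(c)) for c in psi])
    unordered += weight * A[idx] * np.prod(x[list(idx)])

print(ordered, unordered)                        # the two sums agree
```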

As another example, let us consider the class of equations
$$\dot x_j = x_j \Omega_j(x), \quad j = 1,\dots,N, \qquad (3.16)$$
which are conditionally positive for all functions $\Omega_j$ on $\mathbb{R}^n_+$. This class contains the so-called replicator dynamics (RD) equations from evolutionary game theory that are widely used in evolutionary biology and economics:
$$\dot x_j = x_j\Big(\sum_k \Pi_{jk} x_k - \sum_{kl} \Pi_{kl} x_k x_l\Big), \quad j = 1,\dots,N, \qquad (3.17)$$
with some non-negative numbers $\Pi_{kl}$. Equation (3.17) is obtained as the equation for the frequencies $x_j = n_j/(n_1+\cdots+n_N)$ from the equation for the absolute sizes
$$\dot n_j = n_j\Big(c + \sum_k \Pi_{jk} x_k\Big), \quad j = 1,\dots,N, \qquad (3.18)$$
with any constant $c$, which describes the evolution of sizes based on the background reproduction/death rate $c$ and the pairwise interaction with rates given by the matrix $\Pi$.
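
As a small illustration (assumed data, not from the text), one can integrate (3.17) for a randomly chosen non-negative payoff matrix $\Pi$ and observe that the simplex is preserved:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
Pi = rng.random((N, N))           # assumed non-negative payoff matrix (illustrative)

def replicator_rhs(x):
    # Equation (3.17): x_j' = x_j ( (Pi x)_j - (x, Pi x) )
    fitness = Pi @ x
    return x * (fitness - x @ fitness)

x = np.full(N, 1.0 / N)           # start at the barycentre of the simplex
h, steps = 1e-3, 20000
for _ in range(steps):
    x = x + h * replicator_rhs(x)     # explicit Euler; a small h keeps x in the simplex

print("sum of frequencies:", x.sum())      # stays ~1 (the simplex is invariant)
print("final frequencies:", np.round(x, 4))
```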


Remark 46. General replicator dynamics with $k$th-order local interactions in arbitrary state spaces is discussed in Chapter 7. Its extension for games against a major player is developed in [154].

From Theorem 3.1.1, we get the following consequences.

Proposition 3.2.2. If (3.6), respectively (3.9), holds and $\sum_j L_j A^j = 0$ for a vector $L \in \mathbb{R}^N_{++}$, then for any $x \in \mathbb{R}^N_+$ there exists a unique global non-negative solution $X(t,x)$ to (3.5), respectively (3.7), and $(L, X(t,x)) = (L,x)$ for all $t \ge 0$.

Proposition 3.2.3. If the functions $\Omega_j$ are Lipschitz-continuous in bounded domains and $\sum_j L_j x_j \Omega_j(x) = 0$ for a vector $L \in \mathbb{R}^N_{++}$, then for any $x \in \mathbb{R}^N_+$ there exists a unique global non-negative solution $X(t,x)$ to (3.16), and $(L, X(t,x)) = (L,x)$ for all $t \ge 0$.

One of the most important examples for equations with a polynomial r.h.s. is given by the so-called mass-action-law kinetics in chemistry that we shall now describe.

A general chemical reaction is described by the symbolic equation
$$\sum_{j=1}^{l} \alpha_j A_j \to \sum_{i=1}^{k} \beta_i B_i, \qquad (3.19)$$
where $A_1,\dots,A_l$ and $B_1,\dots,B_k$ denote the types of the initial and resulting molecules, respectively. The integers $\alpha_1,\dots,\alpha_l$ and $\beta_1,\dots,\beta_k$ are called stoichiometric coefficients (from the Greek words for 'measure' and 'element'), and they specify the number of corresponding molecules in the input and output of the reaction. If the reaction can go in both directions, one writes
$$\sum_{j=1}^{l} \alpha_j A_j \rightleftharpoons \sum_{i=1}^{k} \beta_i B_i \qquad (3.20)$$
instead of (3.19).

The rates of a chemical reaction depend in a certain way on the concentrations of the reactants. These are often denoted by $[A_j]$, so that
$$r = f([A_1],\dots,[A_l]). \qquad (3.21)$$
The identification of the rates $r$ for concrete reactions is a very complicated task, which is usually performed experimentally. In many cases, the rates are found to depend on the concentrations by a power law, i.e., as
$$r = k [A_1]^{a_1}\cdots[A_l]^{a_l} \qquad (3.22)$$
with some real (not necessarily integer) $a_j$ and a constant $k > 0$. In this case, the reaction is said to have the order $a_1 + \cdots + a_l$. The individual numbers $a_j$ are the orders in $A_j$. One says that the reaction (3.19) satisfies the mass-action-law if (3.22) holds. Note that $k$ is referred to as the rate constant.


Remark 47. Some reasons for the power dependence on the concentrations will be revealed in Section 3.5.

It turns out that the mass-action-law mostly holds for elementary reactions involving a small number of particles only. In many cases, however, the complex reaction (3.19) turns out to run in several steps, each of which is an elementary reaction that satisfies the mass-action-law. The sequence of these elementary steps describes the mechanism of the reaction. As a simple example, let us consider the reaction
$$A \to I \to P,$$
where the product $P$ is obtained from $A$ via an intermediate $I$. Assuming that both steps are first-order reactions with coefficients $k_A$ and $k_I$, respectively, the equations for the concentrations become
$$\frac{d[A]}{dt} = -k_A[A], \qquad \frac{d[I]}{dt} = k_A[A] - k_I[I], \qquad \frac{d[P]}{dt} = k_I[I].$$
If we assume that at the beginning only $A$ was present, the explicit solution reads
$$[P](t) = \Big[1 + \frac{k_A e^{-k_I t} - k_I e^{-k_A t}}{k_I - k_A}\Big][A_0]. \qquad (3.23)$$

One of the basic models for a binary reaction that produces $P$ from $A$ and $B$ is based on the idea of an activation complex, where $A, B$ are first supposed to form the compound intermediate state $C = AB$, which can then either fall apart or produce the product $P$. Therefore, the reaction has two steps $A + B \longleftrightarrow AB$ and $AB \to P$. Assuming that both steps are elementary (and satisfy the mass-action-law of first order in each type of particles), the rates are $k_A[A][B]$ and $k[AB]$ for the forward and backward reactions in $A + B \longleftrightarrow AB$, and $k_p[AB]$ for the last reaction. The system of equations for the concentrations therefore becomes
$$\frac{d[A]}{dt} = \frac{d[B]}{dt} = -k_A[A][B] + k[AB], \qquad \frac{d[AB]}{dt} = k_A[A][B] - k[AB] - k_p[AB], \qquad \frac{d[P]}{dt} = k_p[AB], \qquad (3.24)$$
which is explicitly solvable only for the unimportant case of $k_p = 0$.

In some reactions, the rates can be considerably enhanced by the presence of a catalyst, i.e., a substance that accelerates the reaction. Autocatalysis occurs when a reaction product acts as a catalyst. The simplest example is the reaction $A \to P$ with the rate $r = k[A][P]$. The equations for the concentrations are
$$\frac{d[A]}{dt} = -k[A][P], \qquad \frac{d[P]}{dt} = k[A][P],$$
with the explicit solution
$$[A](t) = [A_0]\,\frac{[A_0] + [P_0]}{[A_0] + [P_0]\, e^{kt([A_0]+[P_0])}}. \qquad (3.25)$$

Exercise 3.2.1. Prove the formulae (3.23) and (3.25).
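
A numerical cross-check of these closed-form expressions, in the spirit of Exercise 3.2.1, can be sketched as follows; the rate constants and initial concentrations are arbitrary illustrative values, and the errors printed are of the order of the Euler step size.

```python
import numpy as np

# Assumed rate constants and initial data (illustrative only).
kA, kI, k = 1.3, 0.4, 2.0
A0, P0 = 1.0, 0.1

def euler(rhs, y0, t):
    y, out = np.array(y0, float), [np.array(y0, float)]
    for i in range(1, len(t)):
        y = y + (t[i] - t[i - 1]) * rhs(y)
        out.append(y.copy())
    return np.array(out)

t = np.linspace(0.0, 5.0, 5001)

# A -> I -> P with first-order steps; compare [P](t) with formula (3.23).
traj = euler(lambda y: np.array([-kA * y[0], kA * y[0] - kI * y[1], kI * y[1]]),
             [A0, 0.0, 0.0], t)
P_exact = (1 + (kA * np.exp(-kI * t) - kI * np.exp(-kA * t)) / (kI - kA)) * A0
print("max error for (3.23):", np.abs(traj[:, 2] - P_exact).max())

# Autocatalysis A -> P with rate k[A][P]; compare [A](t) with formula (3.25).
traj2 = euler(lambda y: np.array([-k * y[0] * y[1], k * y[0] * y[1]]), [A0, P0], t)
S = A0 + P0
A_exact = A0 * S / (A0 + P0 * np.exp(k * S * t))
print("max error for (3.25):", np.abs(traj2[:, 0] - A_exact).max())
```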

A general kinetic system can be defined as a collection of molecules or species $A_1,\dots,A_m$ and a collection of $k$ possible reactions of the type (3.19) between them, that is,
$$\sum_{j=1}^{m} \alpha_{j\rho} A_j \to \sum_{j=1}^{m} \beta_{j\rho} A_j, \quad \rho = 1,\dots,k, \qquad (3.26)$$
each having a prescribed rate $r_\rho(c)$ that depends on the concentrations $c = (c_1,\dots,c_m)$, where $c_j = [A_j]$ is the concentration of $A_j$. The evolution of the concentrations is then given by the following system of equations:
$$\frac{dc_l}{dt} = \sum_{\rho=1}^{k} r_\rho(c)(\beta_{l\rho} - \alpha_{l\rho}) = \sum_{\rho=1}^{k} r_\rho(c)\,\gamma_{l\rho}, \quad l = 1,\dots,m, \qquad (3.27)$$
with
$$\gamma_\rho = (\gamma_{1\rho},\dots,\gamma_{m\rho}) = (\beta_{1\rho} - \alpha_{1\rho},\dots,\beta_{m\rho} - \alpha_{m\rho})$$
being referred to as the elementary reaction vectors or stoichiometric vectors. The subspace $S$ of $\mathbb{R}^m$ that is generated by all stoichiometric vectors is called the stoichiometric space of the kinetic system.

Equivalently, a kinetic system can be described in terms of complexes. In this context, a complex is meant to be any finite collection of species of the type $A_1,\dots,A_m$, which can be formally expressed by a linear combination $\sum_{j=1}^m \alpha_j A_j$ with certain non-negative integers $\alpha_j$ (expressing the number of species $A_j$ in this complex). Let $C(1),\dots,C(n)$ be the collection of all distinct complexes that appear either on the r.h.s. or the l.h.s. of any of the reactions (3.26). Of course, the condition $n \le 2k$ always holds, where $n < 2k$ may occur if some complexes enter more than one reaction. These complexes
$$C(i) = \sum_{j=1}^{m} y_j(i) A_j, \quad i = 1,\dots,n, \qquad (3.28)$$
are specified by the complex vectors $y(i)$ (which have of course nothing to do with complex numbers!), their coordinates $y_j(i)$ being referred to as the molecularity of $A_j$ in the complex $C(i)$. The matrix
$$Y = \begin{pmatrix} y_1(1) & \cdots & y_1(n) \\ \vdots & & \vdots \\ y_m(1) & \cdots & y_m(n) \end{pmatrix}$$
that is formed by their coordinates is called the complex matrix. With these notations in mind, an equivalent description of a kinetic system can be given by the collection of molecules or species $A_1,\dots,A_m$, a collection of $n$ complexes (3.28) and a matrix-valued function $R(c) = (r_{ij}(c))$ on $\mathbb{R}^m_+$, where each element $r_{ij}(c)$ specifies the rate of the reaction $C(j) \to C(i)$ and where the natural convention applies that $r_{jj} = 0$ for all $j$. In these terms, the evolution (3.27) has the form
$$\frac{dc_l}{dt} = \sum_{i,j=1}^{n} r_{ij}(c)\big(y_l(i) - y_l(j)\big) = f_l(c), \quad l = 1,\dots,m. \qquad (3.29)$$

The vector-function $f(c) = (f_l(c))$ on the r.h.s. is called the species formation vector. By changing the summation index, one can rewrite $f_l(c)$ in the following equivalent forms:
$$f_l(c) = \sum_{i,j=1}^{n} \big(r_{ij}(c) - r_{ji}(c)\big) y_l(i) = \sum_{i<j} \big(r_{ij}(c) - r_{ji}(c)\big)\big(y_l(i) - y_l(j)\big). \qquad (3.30)$$
Since $r_{ij}(c) - r_{ji}(c)$ is the total rate of the reaction $C(j) \to C(i)$, the vector
$$(R(c) - R^T(c))\mathbf{1} = \Big(\sum_{j=1}^{n} \big(r_{1j}(c) - r_{j1}(c)\big), \dots, \sum_{j=1}^{n} \big(r_{nj}(c) - r_{jn}(c)\big)\Big) \in \mathbb{R}^n \qquad (3.31)$$
(where $\mathbf{1}$ is the vector with all coordinates $1$ and $R^T$ denotes the transpose matrix) can be naturally called the complex formation vector. As can be seen from (3.30) and (3.31), the species formation vector and the complex formation vector are linked by the equation
$$f(c) = Y \big(R(c) - R^T(c)\big)\mathbf{1}. \qquad (3.32)$$
In order to switch back to the description (3.26), one only has to enumerate all reactions with positive rates $r_{ij}(c) > 0$, taking into account that if the reaction of the index $\rho$ in (3.27) is the reaction $C(j) \to C(i)$, then $\gamma_{l\rho} = y_l(i) - y_l(j)$. Therefore, the stoichiometric space $S$ can be equivalently described as the space that is generated by the vectors $y(i) - y(j)$ for $i, j$ with non-vanishing $r_{ij}(c)$.

A kinetic system is called a mass-action-law kinetics if all admissible reactions (3.26) of this system satisfy the mass-action-law (3.22), that is, if
$$r_{ij}(c) = k_{ij} \prod_{l=1}^{m} c_l^{y_l(j)} \qquad (3.33)$$
with a certain matrix $K = (k_{ij})$ with non-negative elements that is called the rate constant matrix. In this case, equations (3.27) are of the type (3.10).
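
The complex-based description lends itself directly to computation. The following sketch, with an assumed toy reversible reaction $A_1 + A_2 \rightleftharpoons A_3$ and assumed rate constants, builds $R(c)$ from (3.33) and evaluates the species formation vector via (3.32); it is an illustration only, not part of the text.

```python
import numpy as np

# Species: m = 3; complexes: C(1) = A1 + A2 and C(2) = A3 (assumed toy system).
Y = np.array([[1, 0],
              [1, 0],
              [0, 1]], dtype=float)          # complex matrix: column i is the vector y(i)
K = np.array([[0.0, 0.7],                    # k_{ij}: rate constant of C(j) -> C(i)
              [1.5, 0.0]])

def R(c):
    # Mass-action rates (3.33): r_ij(c) = k_ij * prod_l c_l^{y_l(j)}.
    monomials = np.prod(c[:, None] ** Y, axis=0)       # one monomial per complex
    return K * monomials[None, :]

def species_formation(c):
    # Formula (3.32): f(c) = Y (R(c) - R(c)^T) 1.
    Rc = R(c)
    return Y @ ((Rc - Rc.T) @ np.ones(Rc.shape[0]))

c = np.array([1.0, 0.5, 0.2])
print("f(c) =", species_formation(c))
# L = (1, 1, 2) gives the same weight to both complexes, hence (L, f(c)) = 0.
L = np.array([1.0, 1.0, 2.0])
print("(L, f(c)) =", L @ species_formation(c))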


3.3 Entropy and equilibria for linear evolutions in $\mathbb{R}^n_+$

In this section, we shall discuss equilibria and convergence to equilibria for linear systems in $\mathbb{R}^n_+$:
$$\frac{dc_j}{dt} = \sum_{l \ne j} (q_{jl} c_l - q_{lj} c_j) \qquad (3.34)$$
with some non-negative coefficients $q_{jl}$. In chemistry, such systems describe the evolution of the concentrations $c = \{c_j = [A_j]\}$ of molecules of the types $\{A_1,\dots,A_n\}$ such that $A_i$ can change into $A_j$ according to first-order kinetics, i.e., with the rates $q_{ji} c_i$. System (3.34) is then referred to as the master equation.

By adding all the equations of (3.34), we get
$$\sum_j \frac{dc_j}{dt} = \sum_{j,l:\, l\ne j} (q_{jl} c_l - q_{lj} c_j) = 0, \qquad (3.35)$$
which expresses the basic conservation law. (It is assumed that no sources and no sinks are present in the system.) Turning to the normalized quantities $p_j = c_j/(c_1 + \cdots + c_n)$ (which can be interpreted as frequencies or probabilities), (3.34) becomes
$$\frac{dp_j}{dt} = \sum_{l \ne j} (q_{jl} p_l - q_{lj} p_j), \qquad \sum_j p_j = 1. \qquad (3.36)$$
In probability theory, these equations are referred to as Kolmogorov's forward equations for a Markov chain with the Q-matrix $Q = (q_{ij})$. This coincidence does not come as a surprise, due to the usual (in practice) identification of frequencies or concentrations with probabilities.

The most important problem for the analysis is the search for equilibria of the system (3.34) and for the rates of convergence to these equilibria. For this purpose, the graph of the system (3.34) (or of the Markov chain (3.36)) plays a key role. It is defined as the directed graph with $n$ vertices $A_1,\dots,A_n$ such that the edge $A_i \to A_j$ is present if and only if $q_{ji} > 0$. System (3.34) is called weakly reversible if for any $i, j$ a path from $A_i$ to $A_j$ exists if and only if a path from $A_j$ to $A_i$ exists. This clearly holds if and only if the graph decomposes into a disjoint union of strongly connected components. If $c^* = (c^*_1,\dots,c^*_n)$ is an equilibrium for (3.34), then
$$\sum_{l \ne j} (q_{jl} c^*_l - q_{lj} c^*_j) = 0, \quad j = 1,\dots,n. \qquad (3.37)$$
This equation is often referred to as the balance condition. An equilibrium $c^*$ is strictly positive if $c^*_j > 0$ for all $j$. The system (3.34) is called ergodic if there exists a unique (up to a multiplier) strictly positive equilibrium. The following standard fact is worth being quoted (see any textbook on Markov chains for a proof).

Proposition 3.3.1.

(i) System (3.34) (or (3.36)) is ergodic if and only if the corresponding directed graph is strongly connected.

(ii) A strictly positive equilibrium exists if and only if the system is weakly reversible, i.e., the graph is a disjoint union of strongly connected components.

If the system (3.34) is weakly reversible and its graph has $d$ strongly connected components $G_1,\dots,G_d$ (we denote by $G_j$ both the subgraph itself and its collection of vertices, which are in one-to-one correspondence due to the assumed connectivity), then the sums
$$\beta_s = \sum_{j \in G_s} c_j \qquad (3.38)$$
are integrals of the evolution (3.34) for all $s$. Moreover, if $c^*$ is any strictly positive equilibrium of the system, then, for any collection of positive numbers $\beta = (\beta_1,\dots,\beta_d)$, there exists a unique strictly positive equilibrium $c(\beta)$ of the evolution (3.34) satisfying the conservation laws (3.38), namely
$$c_j(\beta) = \beta_s\, \frac{c^*_j}{\sum_{l \in G_s} c^*_l}, \quad j \in G_s. \qquad (3.39)$$

If $c^*$ is a strictly positive equilibrium for (3.34), then
$$\sum_{l \ne j} q_{lj} c_j = \sum_{l \ne j} q_{lj} c^*_j\, \frac{c_j}{c^*_j} = \sum_{l \ne j} q_{jl} c^*_l\, \frac{c_j}{c^*_j},$$
therefore (3.34) can be equivalently written as
$$\frac{dc_j}{dt} = \sum_{l \ne j} q_{jl} c^*_l \Big(\frac{c_l}{c^*_l} - \frac{c_j}{c^*_j}\Big). \qquad (3.40)$$

For a smooth convex function $h$ on the positive half-line (with zero not necessarily included) and a strictly positive equilibrium $c^*$ of (3.34), the Csiszár–Morimoto (conditional) entropy is defined as the following function on the concentrations:
$$H_h(c\|c^*) = \sum_j c^*_j\, h\Big(\frac{c_j}{c^*_j}\Big). \qquad (3.41)$$
A direct application of the method of Lagrange multipliers implies that $c^*$ is the unique minimum of $H_h$ on the set of $c$ that are subject to the basic conservation law $\sum c_j = \sum c^*_j$, and the equilibria $c(\beta)$ from (3.39) are the unique minima of $H_h$ on the set of $c$ that are subject to the conservation laws (3.38).

It turns out that $H_h$ is a Lyapunov function for (3.40) for any (convex) $h$, since the following Morimoto H-theorem holds:

Proposition 3.3.2. Under the evolution (3.40), the following is true:
$$\frac{dH_h(c\|c^*)}{dt} = \sum_{(l,j):\, l\ne j} q_{jl} c^*_l\, h'\Big(\frac{c_j}{c^*_j}\Big)\Big(\frac{c_l}{c^*_l} - \frac{c_j}{c^*_j}\Big) \le 0. \qquad (3.42)$$


Moreover, if $h$ is strictly convex, then the equality in (3.42) holds only if $(c_j)$ forms an equilibrium of the type (3.39).

Proof. Due to the conservation law (3.35) applied to (3.40), it follows that
$$\sum_{(l,j):\, l\ne j} q_{jl} c^*_l (x_l - x_j) = 0$$
holds for all vectors $x$ of the form
$$x = (x_1,\dots,x_n) = \Big(\frac{c_1}{c^*_1},\dots,\frac{c_n}{c^*_n}\Big).$$
By linearity, since all $c^*_j$ are positive, we find that this holds for all $x \in \mathbb{R}^n$. Consequently,
$$\frac{dH_h(c\|c^*)}{dt} = \sum_{(l,j):\, l\ne j} h'\Big(\frac{c_j}{c^*_j}\Big) q_{jl} c^*_l \Big(\frac{c_l}{c^*_l} - \frac{c_j}{c^*_j}\Big) = \sum_{(l,j):\, l\ne j} q_{jl} c^*_l \Big[h\Big(\frac{c_j}{c^*_j}\Big) - h\Big(\frac{c_l}{c^*_l}\Big) + h'\Big(\frac{c_j}{c^*_j}\Big)\Big(\frac{c_l}{c^*_l} - \frac{c_j}{c^*_j}\Big)\Big] \le 0, \qquad (3.43)$$
where the last inequality follows from the convexity of $h$.

Finally, the equality in (3.43) holds if and only if $q_{jl} = 0$ whenever $c_j/c^*_j \ne c_l/c^*_l$. Equivalently, the equality in (3.43) holds if and only if $c_j/c^*_j = c_l/c^*_l$ for $j, l$ from any strongly connected component $G_s$ of the graph $G$ of system (3.40), and thus if $c_j = \lambda_s c^*_j$ for any $j$ from $G_s$ with some $\lambda_s > 0$, which is equivalent to (3.39). $\square$

Remark 48. Proposition 3.3.2 implies the uniqueness part of Proposition 3.3.1, that is, the ergodicity of the positive equilibrium.

A basic example of $H_h$ arises from $h(x) = x(\ln x - 1)$, in which case we find
$$H_h(c\|c^*) = \sum_j c_j\Big(\ln\frac{c_j}{c^*_j} - 1\Big). \qquad (3.44)$$
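
A short numerical experiment, illustrative only (the generator below is a random matrix with all off-diagonal rates positive, hence strongly connected), shows the Morimoto H-theorem at work for this choice of $h$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
Q = rng.random((n, n)); np.fill_diagonal(Q, 0.0)        # q_{jl} > 0 off the diagonal

def rhs(p):
    # Kolmogorov's forward equation (3.36): p_j' = sum_{l != j} (q_{jl} p_l - q_{lj} p_j).
    return Q @ p - Q.sum(axis=0) * p

# The strictly positive equilibrium: the kernel of the generator (unique up to scaling).
A = Q - np.diag(Q.sum(axis=0))
w, V = np.linalg.eig(A)
p_star = np.abs(np.real(V[:, np.argmin(np.abs(w))])); p_star /= p_star.sum()

def H(p):
    # Csiszar--Morimoto entropy (3.41) with h(x) = x(ln x - 1), cf. (3.44).
    r = p / p_star
    return np.sum(p_star * r * (np.log(r) - 1.0))

p = rng.random(n); p /= p.sum()
h_values, dt = [], 1e-3
for _ in range(5000):
    h_values.append(H(p))
    p = p + dt * rhs(p)

print("H is non-increasing:", all(a >= b - 1e-12 for a, b in zip(h_values, h_values[1:])))
print("distance to equilibrium:", np.abs(p - p_star).max())
```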

One says that $c^* = (c^*_1,\dots,c^*_n)$ is a point of detailed balance if, for all $j$ and $l$,
$$q_{jl} c^*_l = q_{lj} c^*_j,$$
i.e., the reaction rates from $l$ to $j$ and from $j$ to $l$ coincide for any pair. Of course, any point of detailed balance is an equilibrium, but not vice versa.

Remark 49. In functional-analytic terms, a point $c^*$ is a point of detailed balance if the Q-matrix $Q = (q_{ij})$ is Hermitian (or self-adjoint) as an operator in $\mathbb{R}^n$ equipped with the scalar product $(u,v)_{c^*} = \sum_j u_j v_j c^*_j$.


3.4 Entropy and equilibria for nonlinear evolutions in $\mathbb{R}^n_+$

This section is devoted to the key facts on equilibria of mass-action-law kinetics. Note that this content will not be used in other parts of the book.

The kinetic system (3.29) is said to be conservative if it admits a linear positive conservation law, that is, if there exists a vector $b = (b_1,\dots,b_m)$ with all $b_j > 0$ such that the product $(b,c)$ is preserved by (3.29):
$$\frac{d}{dt}(b,c) = (b, f(c)) = 0$$
for all $c$. In other words, $b$ is a critical Lyapunov vector for (3.29). Conservativity is the simplest way to ensure well-posedness, which under this condition follows from Theorem 3.1.1.

As usual, an equilibrium point for the system (3.29) is defined as a vector $c^* = (c^*_1,\dots,c^*_m) \in \mathbb{R}^m_+$ such that $f(c^*) = 0$. This equilibrium is positive if $c^*_j > 0$ for all $j$. By analogy with linear systems, one says that $c^*$ is a point of detailed balance if the reaction rates of any pair of forward and backward reactions coincide at $c^*$, that is, if
$$R(c^*) = R^T(c^*).$$
By (3.32), any point of detailed balance is an equilibrium point. But formula (3.32) suggests the introduction of an intermediate notion (between equilibrium and detailed balance). Namely, one says that $c^*$ is a point of complex balance if the complex formation vector vanishes at this point:
$$\big(R(c^*) - R^T(c^*)\big)\mathbf{1} = 0.$$
Therefore, a point of complex balance is also an equilibrium. For linear systems, complexes and species coincide, and therefore the notions of detailed and complex balance also coincide.

One says that system (3.29) is complex balanced (respectively detailed balanced) if the set of positive equilibria is not empty and coincides with the set of points of complex (respectively detailed) balance.

Complex balancing turns out to be crucial for finding the 'thermodynamic properties' of (3.29), i.e., for finding out whether a function (3.44) or some natural extension of it can serve as a Lyapunov function for (3.29), as in the linear case. We shall now see how it works for mass-action-law systems. For this purpose, recall that a kinetic system is a mass-action-law kinetics if all admissible reactions (3.26) of this system are of the form (3.33).

Theorem 3.4.1. Assume that a positive point of detailed balance $c^*$ exists for a mass-action-law kinetics. Then the system is detailed balanced, that is, all other positive equilibrium points are points of detailed balance. Moreover, the function
$$G(c) = \sum_j c_j\Big(\ln\frac{c_j}{c^*_j} - 1\Big) \qquad (3.45)$$
is a Lyapunov function for the system (as in the linear case), that is,
$$\frac{dG(c)}{dt} \le 0$$
for all $c$, with equality only for the points of detailed balance. Finally, these points $c$ of detailed balance can be characterized as the critical points of $G$ on the hyperplane $c + S$, where $S$ is the stoichiometric space.

Proof. Since $\frac{\partial G}{\partial c_i}$ equals $\ln(c_i/c^*_i)$, one can rewrite the reaction rates in the following equivalent forms:
$$r_{ij}(c) = r_{ij}(c^*) \prod_{l=1}^{m} \Big(\frac{c_l}{c^*_l}\Big)^{y_l(j)} = r_{ij}(c^*) \exp\Big\{\sum_{l=1}^{m} y_l(j) \ln\Big(\frac{c_l}{c^*_l}\Big)\Big\}, \qquad (3.46)$$
and consequently as
$$r_{ij}(c) = r_{ij}(c^*) \exp\{(y(j), \nabla G(c))\}. \qquad (3.47)$$
Taking into account the detailed balance condition $r_{ij}(c^*) = r_{ji}(c^*)$, this implies
$$\ln\frac{r_{ij}(c)}{r_{ji}(c)} = (y(j) - y(i), \nabla G(c)) \qquad (3.48)$$
whenever $k_{ij} \ne 0$. Consequently, applying the last expression in (3.30) takes us to
$$\frac{dG(c)}{dt} = (\nabla G(c), f(c)) = \sum_{i<j:\, k_{ij} \ne 0} \big(r_{ij}(c) - r_{ji}(c)\big)\big(y(i) - y(j), \nabla G(c)\big),$$
and thus
$$\frac{dG(c)}{dt} = -\sum_{i<j:\, k_{ij} \ne 0} \big(r_{ij}(c) - r_{ji}(c)\big) \ln\frac{r_{ij}(c)}{r_{ji}(c)} \le 0. \qquad (3.49)$$
The latter inequality is a consequence of the elementary inequality $(x-y)(\ln x - \ln y) > 0$ for $x \ne y$. Moreover, the inequality on the r.h.s. of (3.49) becomes an equality if and only if all terms in the sum vanish, i.e., if and only if $r_{ij}(c) = r_{ji}(c)$ for all $i < j$ (with $k_{ij} \ne 0$), that is, if $c$ is a point of detailed balance. Finally, it follows that $c$ is an equilibrium if and only if $(y(i) - y(j), \nabla G(c)) = 0$ for all $i < j$ with $k_{ij} \ne 0$, which implies that $c$ is a critical point of $G$ on $c + S$. $\square$
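
As a sanity check of the theorem (with assumed toy data, not taken from the text), consider the single reversible reaction $A + B \rightleftharpoons C$ under mass-action kinetics; choosing unit rate constants makes $c^* = (1,1,1)$ a point of detailed balance, and $G$ indeed decreases along the flow:

```python
import numpy as np

# Reversible reaction A + B <-> C with mass-action rates; k_f = k_b = 1 makes
# c* = (1, 1, 1) a point of detailed balance (illustrative choice).
k_f = k_b = 1.0
c_star = np.array([1.0, 1.0, 1.0])

def rhs(c):
    a, b, ab = c
    r = k_f * a * b - k_b * ab          # net rate of A + B -> C
    return np.array([-r, -r, r])

def G(c):
    # The Lyapunov function (3.45).
    return np.sum(c * (np.log(c / c_star) - 1.0))

c, dt = np.array([2.0, 0.5, 0.3]), 1e-3
g_values = []
for _ in range(10000):
    g_values.append(G(c))
    c = c + dt * rhs(c)

print("G is non-increasing:", all(a >= b - 1e-12 for a, b in zip(g_values, g_values[1:])))
print("net rate at the end (detailed balance):", k_f * c[0] * c[1] - k_b * c[2])
```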

Let us now turn to the complex balance for mass-action-law systems. Let $G$ be defined by (3.45) with some arbitrary positive (not necessarily equilibrium) vector $c^*$. In this case, the representation (3.47) still holds. It turns out to be useful to introduce an auxiliary function
$$\theta(\lambda; c) = \sum_{i \ne j} r_{ij}(c^*) \exp\big\{(\lambda y(j) + (1-\lambda) y(i), \nabla G(c))\big\}, \quad \lambda \in [0,1]. \qquad (3.50)$$
In particular, the following three properties hold:
$$\theta(1; c) = \sum_j \Big(\sum_{i \ne j:\, k_{ij} \ne 0} r_{ij}(c^*)\Big) \exp\{(y(j), \nabla G(c))\},$$
$$\theta(0; c) = \sum_{i \ne j} r_{ij}(c^*) \exp\{(y(i), \nabla G(c))\} = \sum_j \Big(\sum_{i \ne j:\, k_{ji} \ne 0} r_{ji}(c^*)\Big) \exp\{(y(j), \nabla G(c))\},$$
$$\frac{d}{d\lambda}\theta(1; c) = -\frac{dG(c)}{dt}. \qquad (3.51)$$

The following statement derives the dissipation of entropy from the complex balance condition. Its proof reveals the reason for introducing the function $\theta$.

Theorem 3.4.2. The condition of complex balance for $c^*$ is equivalent to the validity of the equation
$$\theta(1; c) = \theta(0; c)$$
for all $c$. If this is the case, then the following dissipation inequality with the Lyapunov function $G$ holds:
$$\frac{d}{dt}G(c) = -\frac{d}{d\lambda}\theta(1; c) \le 0. \qquad (3.52)$$

Proof. From the definition of $\theta$, we find
$$\theta(1) - \theta(0) = \sum_j \Big(\sum_{i \ne j:\, k_{ij} \ne 0} r_{ij}(c^*) - \sum_{i \ne j:\, k_{ji} \ne 0} r_{ji}(c^*)\Big) \exp\{(y(j), \nabla G(c))\}.$$
The convexity of $G$ implies that all the exponents are linearly independent functions. Therefore, $\theta(1) - \theta(0) = 0$ for all $c$ is equivalent to
$$\sum_{i \ne j:\, k_{ij} \ne 0} r_{ij}(c^*) = \sum_{i \ne j:\, k_{ji} \ne 0} r_{ji}(c^*),$$
which is the condition of complex balance for $c^*$. Finally, $\theta(\lambda)$ is easily seen to be convex in $\lambda$ (since its second derivative is positive). Therefore, $\theta(1; c) = \theta(0; c)$ implies (3.52). $\square$

Notice that $\frac{d}{d\lambda}\theta(1; c) = 0$ only if $\theta$ does not actually depend on $\lambda$. This occurs when $(y(j) - y(i), \nabla G(c)) = 0$ for all $i, j$ with $k_{ij} \ne 0$, that is, again when $c$ is a critical point of $G$ on $c + S$.


3.5 Kinetic equations for collisions, fragmentation, reproduction and preferential attachment

From now on, we will focus on equations in $l^+_p$. Let us start with the introduction of the main examples of ODEs in $l_p$-spaces that arise from the analysis of interacting particles. This type of equation can be characterized by a discrete parameter. These ODEs are natural extensions of the ODEs in $\mathbb{R}^n_+$ as discussed above.

We introduce these evolutions in an intuitive manner by freely identifying probabilities and frequencies, as is usually done in physics or biology. One can derive them rigorously as a dynamic law of large numbers for the corresponding Markov processes of evolution. Such a derivation can be found in the literature, see, e.g., [147] and the extensive bibliography therein. Here, we want to concentrate on the properties of the equations rather than their derivations.

As we shall show, the kinetic equations that arise from physical models of interactions cover basically all types of ODEs in $\mathbb{R}^n_+$ or $l^+_p$. In other words, any equation in $\mathbb{R}^n$ or $l_p$ that preserves positivity can be considered a kinetic equation for some process of interacting particles.

Assuming that $j \in \mathbb{N}$ labels the type of a particle (for instance, its mass), a sequence $(n_1, n_2, \dots) \in \mathbb{Z}^\infty_+$ can be thought of as describing the state (also referred to as the profile) of a system of many particles containing $n_j$ particles of each type $j$, $j = 1,2,\dots$. If the number of types is finite, then $j \in \{1,\dots,N\}$ with some finite $N$.

In order to introduce the evolution of a system in differential form, it is convenient to describe the states by a sequence of real numbers $(x_1, x_2, \dots) \in \mathbb{R}^\infty_+$ rather than natural numbers. Each $x_j$ represents the concentration of the population of type $j$. (Note that in (3.27) these concentrations were denoted $c_j$, in accordance with the common literature in chemistry.)

Remark 50. For instance, one can think of the $x_j$ as frequencies, $x_j = n_j/\sum_l n_l$, or as probabilities of choosing a particle of type $j$ in a random sample.

The simplest diagonal system
$$\dot x_j = a_j x_j + A_j, \quad j \in \mathbb{N} \text{ or } j \in \{1,\dots,N\}, \qquad (3.53)$$
describes the process of death and/or reproduction (death for $a_j < 0$ and reproduction for $a_j > 0$) combined with an external input of particles.

Suppose next that any particle of the type $k$ can mutate to a type $m$ with the rate $Q^m_k$, independently of the presence of other particles. Since the number of such mutations will be proportional to the size $x_j$ of the population of type $j$, the process of mutation can be described by the following infinite system of equations:
$$\dot x_j = \sum_{k=1}^{\infty} \sum_{m \ne k} x_k Q^m_k (\delta^m_j - \delta^k_j) = \sum_{k \ne j} (x_k Q^j_k - x_j Q^k_j), \qquad (3.54)$$
where the $Q^k_j$ are any non-negative numbers such that $\sum_k Q^k_j < \infty$ and $Q^j_j = 0$ for all $j$. Equation (3.54) plays the central role in the theory of Markov chains, where it is referred to as Kolmogorov's forward equation.

On the other hand, if any particle of the type $k$ can be decomposed into two particles of the types $m$ and $n$ with the rate $P^{mn}_k$, independently of the presence of other particles, the evolution is governed by the system
$$\dot x_j = \sum_{k=1}^{\infty} x_k \sum_{(m,n)} P^{mn}_k (\delta^m_j + \delta^n_j - \delta^k_j), \qquad (3.55)$$
which describes the process of fragmentation or the process of branching. (We mean fragmentation if the particles $m, n$ are in some sense smaller than $k$, and branching if they are the same as $k$.) In this equation, $\sum_{(m,n)}$ denotes the sum over all pairs of types $(m,n)$ with irrelevant order.

The linearity of the evolutions (3.54) and (3.55) reflects the absence of interactions: any particle is subject to certain transformations independently of other particles. Meanwhile, a quadratic r.h.s. appears when one deals with binary interactions. For example, if any two particles of the types $k$ and $l$ can merge into a new particle of the type $m$ with the rate $P^m_{kl}$, the evolution is governed by the system
$$\dot x_j = \frac{1}{2}\sum_{k=1}^{\infty}\sum_{l=1}^{\infty} x_k x_l \sum_{m=1}^{\infty} P^m_{kl}\big(\delta^m_j - \delta^l_j - \delta^k_j\big), \qquad (3.56)$$
which describes the processes of merging, or coagulation, or coalescence. On the other hand, if any two particles of the types $k$ and $l$ can collide or mutate to create a new pair of particles of the types $m$ and $n$ with the rate $P^{mn}_{kl}$, the evolution is governed by the system
$$\dot x_j = \frac{1}{2}\sum_{k=1}^{\infty}\sum_{l=1}^{\infty} x_k x_l \sum_{(m,n)} P^{mn}_{kl}\big(\delta^m_j + \delta^n_j - \delta^l_j - \delta^k_j\big), \qquad (3.57)$$
which describes the processes of collision, or collision breakage, or pairwise mutations.

Another insightful example is the interest-driven migration in decision processes like evolutionary games, where agents of the behavioural type $i$ can migrate to the type $j$ under the influence of an agent of the type $j$ if some performance function $R_j(x)$ is better for the type $j$, i.e., if the migration probabilities are $P^j_i = (R_j - R_i)^+$. The corresponding evolution is governed by the equation
$$\dot x_j = \frac{1}{2}\sum_{k=1}^{\infty}\sum_{l=1}^{\infty} x_k x_l (R_k - R_l)^+ \big(\delta^k_j - \delta^l_j\big), \qquad (3.58)$$
which can be simplified to its most common form
$$\dot x_j = \frac{1}{2}\, x_j \sum_{k=1}^{\infty} x_k (R_j - R_k). \qquad (3.59)$$


Exercise 3.5.1. Check that (3.59) is equivalent to (3.58). Hint: equation (3.58) can be rewritten as
$$\dot x_j = \frac{1}{2}\, x_j \sum_{l \ne j} x_l (R_j - R_l)^+ - \frac{1}{2}\, x_j \sum_{k \ne j} x_k (R_k - R_j)^+.$$

A polynomial r.h.s. of order $k$ describes interactions that involve simultaneous transformations of $k$ particles ($k$th-order interactions). Namely, the evolution of a system where any $k$ particles of the types $i_1,\dots,i_k$ can be changed into a collection of particles described by a profile $\Phi = (\phi_1,\phi_2,\dots) \in \mathbb{Z}^\infty_{+,\mathrm{fin}}$ with some rate $P^\Phi_{i_1,\dots,i_k}$ is governed by the equations
$$\dot x_j = \frac{1}{k!}\sum_{i_1,\dots,i_k=1}^{\infty} x_{i_1}\cdots x_{i_k} \sum_{\Phi} P^\Phi_{i_1,\dots,i_k}\big(\phi_j - \delta^{i_1}_j - \cdots - \delta^{i_k}_j\big). \qquad (3.60)$$

Equivalently, if $\Psi$ denotes the profile of a collection of $k$ particles that is eligible for a transformation to the profile $\Phi$ with the rate $P^\Phi_\Psi$, equation (3.60) can be rewritten in the form
$$\dot x_j = \sum_{\Psi:\, \#(\Psi)=k} \frac{x^\Psi}{\Psi!} \sum_{\Phi} P^\Phi_\Psi \big(\phi_j - \psi_j\big), \qquad (3.61)$$
where notation (3.11) and the last equation in (3.12) were used, and where $\#(\Psi) = \sum_j \psi_j$ denotes the number of particles in the profile $\Psi$. If one relaxes the constraints on $\Psi$ by requiring that it includes at most $k$ rather than exactly $k$ particles, one can include the transformations of all orders not exceeding $k$, or even more generally transformations of any order. In the latter case, the equation gets the form
$$\dot x_j = \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi} P^\Phi_\Psi \big(\phi_j - \psi_j\big) = \sum_{k=1}^{\infty} \frac{1}{k!}\sum_{i_1,\dots,i_k=1}^{\infty} x_{i_1}\cdots x_{i_k} \sum_{\Phi} P^\Phi_{i_1,\dots,i_k}\big(\phi_j - \delta^{i_1}_j - \cdots - \delta^{i_k}_j\big). \qquad (3.62)$$

The equations (3.62) are the general kinetic equations with a polynomial or analytic r.h.s. Even more generally, the coefficients $P^\Phi_\Psi$ can also depend on $x$, in which case one speaks of a mean-field dependence of the reaction rates.

With the mean-field dependence $R_j(x)$ included, the equations (3.59) contain the famous replicator dynamics of evolutionary games in its most general form.

Remark 51. Let us point out a slight abuse of notation: by $P^{mn}_k$ in (3.55), we mean $P^\Phi_\Psi$ with $\Psi = \{\psi_j = \delta_{jk}\}$ and $\Phi = \{\phi_j = \delta^m_j + \delta^n_j\}$.

The equations (3.62) can be equivalently written in another insightful form, namely the weak form, where the elements $x$ of $l_1$ are characterized by their scalar products with arbitrary elements $g \in c_\infty$. Multiplying (3.62) by a $g \in c_\infty$ yields the representation of this equation in its weak form:
$$\frac{d}{dt}(g, x) = \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi} P^\Phi_\Psi\, (g, \phi - \psi) = \sum_{k=1}^{\infty} \frac{1}{k!}\sum_{i_1,\dots,i_k=1}^{\infty} x_{i_1}\cdots x_{i_k} \sum_{\Phi} P^\Phi_{i_1,\dots,i_k}\big[g(j_1) + \cdots + g(j_m) - g(i_1) - \cdots - g(i_k)\big]. \qquad (3.63)$$
This equation must hold for all $g \in c_\infty$ or, more generally, for $g$ from some dense subset of $c_\infty$, where $j_1,\dots,j_m$ are the types of particles in a profile $\Phi$: $\Phi = \Phi(j_1,\dots,j_m)$. We shall use this form in the $l_p$-theory implicitly, when calculating the derivatives of linear functionals on the solutions. In Chapter 7, we shall discuss its very important extension to the analysis of evolutions of measures on general state spaces.

Remark 52. Let us stress again that the summation over profiles $\Phi$ in (3.63) means a summation over an unordered collection of particles. Therefore, switching to a summation over ordered indices $j_1,\dots,j_m$ would require additional combinatorial adjustments, as was explicitly demonstrated for $\Psi$.

When the number of possible transformations $\Psi \to \Phi$ in (3.61) is finite and the set of types $j$ is a finite set $\{1,\dots,N\}$, i.e., if equation (3.61) is an equation in $\mathbb{R}^N$ with a polynomial r.h.s., then equation (3.61) describes a general mass-action-law kinetics (3.27) from chemistry, where the transformations $\Psi \to \Phi$ are referred to as reactions and the output-input profiles $\Psi, \Phi$ as complexes or compounds.

If the quantity $(L, x)$ is conserved for an $L \in \mathbb{R}^\infty_{++}$ and all transformations $\Psi \to \Phi$ in (3.61), i.e., if $(L,\Phi) = (L,\Psi)$ whenever $P^\Phi_\Psi \ne 0$, then the function $j \mapsto L_j$ is a critical Lyapunov vector for the system (3.61), and it is also referred to as the conservation law. In chemistry, this usually holds with $L_j = m_j$ being the mass of a particle of the type $j$, due to the principle of conservation of mass or the Lomonosov–Lavoisier law. A natural subclass of processes arises once the index $j$ itself is interpreted as the mass of a particle: $m_j = j$. In this case, particles differ only by their masses. The corresponding processes are referred to as mass-exchange processes.

Another important class of processes includes some quantity $L$ that is allowed to grow only by reactions involving no more than one particle. In this case, equation (3.62) gets the form
$$\dot x_j = a_j + \sum_k x_k \sum_{\Phi:\, (L,\Phi) \le L_k + b} P^\Phi_k \big(\phi_j - \delta^k_j\big) + \sum_{\Psi} \frac{x^\Psi}{\Psi!} \sum_{\Phi:\, (L,\Phi) \le (L,\Psi)} P^\Phi_\Psi \big(\phi_j - \psi_j\big), \qquad (3.64)$$
with non-negative $b$ and $a_j$. A notable example is the celebrated model of preferential attachment. In this model, one interprets $x_k$ as the concentration of coalitions of the size $k$, and the growth obeys the following rule: new particles are regularly injected into the system in such a way that they either become independent (i.e., they form a new coalition of the unit size) or randomly join one of the coalitions with a probability that is proportional to its size. This process is governed by the equation
$$\dot x_j = a(x)\delta^1_j + \sum_{k=1}^{\infty} k x_k P(x)\big(\delta^j_{k+1} - \delta^j_k\big) = a(x)\delta^1_j + P(x)\big[(j-1)x_{j-1} - j x_j\big], \qquad (3.65)$$
with $P^{k+1}_k = k x_k P(x)$, as discussed in Remark 51. Here, we included some additional mean-field dependence of $a$ and $P$. This equation is a variation of (3.64) with $L = (1,2,\dots)$ and $b = 1$.
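
For a concrete feel of (3.65), here is a small sketch with constant (mean-field-free) coefficients $a$ and $P$, both assumed values, and a truncation at an assumed maximal size $J$. Summing (3.65) against $L = (1,2,\dots)$ gives $\frac{d}{dt}(L,x) = a + P\,(L,x)$, and the simulation reproduces the resulting closed-form mass.

```python
import numpy as np

# Truncated preferential-attachment dynamics (3.65) with constant injection rate a
# and attachment coefficient P (illustrative assumptions); sizes are cut at J.
a, P, J = 1.0, 0.5, 200
T, dt = 5.0, 1e-3

def rhs(x):
    j = np.arange(1, J + 1)
    dx = np.empty(J)
    dx[0] = a - P * 1.0 * x[0]                            # injection of unit coalitions
    dx[1:] = P * (j[:-1] * x[:-1] - j[1:] * x[1:])        # growth j -> j+1 at rate P j x_j
    return dx

x = np.zeros(J)
for _ in range(int(T / dt)):
    x = x + dt * rhs(x)

j = np.arange(1, J + 1)
print("mass (L, x) at time T:", (j * x).sum())
print("prediction (a/P)(e^{PT} - 1):", (a / P) * (np.exp(P * T) - 1.0))
```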

By far the most important interactions are binary interactions. Therefore, we shall often deal with equations that combine a linear and/or quadratic r.h.s. For example, the most studied mass-exchange system is the Smoluchowski coagulation-fragmentation model, where any pair of particles with the masses $k$ and $l$ can coagulate to produce a particle of the mass $k+l$ with some rate $Q_{kl}$, and any particle of the mass $k$ can fragment into two pieces of the masses $l$ and $k-l$ for any $l < k$ with some rate $P^{k-l,l}$. These processes are special cases of (3.55) and (3.56). The corresponding evolution equations are
$$\dot x_j = \frac{1}{2}\sum_{k=1}^{\infty}\sum_{l=1}^{\infty} x_k x_l\, Q_{kl}\big(\delta^{k+l}_j - \delta^l_j - \delta^k_j\big) + \frac{1}{2}\sum_{k=1}^{\infty} x_k \sum_{m<k} P^{m,k-m}\big(\delta^m_j + \delta^{k-m}_j - \delta^k_j\big), \qquad (3.66)$$

or equivalently
$$\dot x_j = \frac{1}{2}\sum_{k=1}^{j-1} x_k x_{j-k}\, Q_{k,j-k} - x_j \sum_{m=j+1}^{\infty} x_{m-j}\, Q_{m-j,j} + \sum_{m=j+1}^{\infty} x_m P^{j,m-j} - \frac{1}{2}\, x_j \sum_{m=1}^{j-1} P^{m,j-m}. \qquad (3.67)$$

Remark 53. The notations for the rates $P$ may differ when one sums over an ordered or an unordered collection of particles in the profiles, see (3.12). For example, in order for the part with the coefficients $P$ in (3.66) (representing the fragmentation), where one traditionally sums over ordered pairs, to coincide with the r.h.s. of equation (3.57) – where one sums over unordered pairs –, one must take $P^{mn}$ from (3.66) to be equal to $P^{mn}$ from (3.57) for $m \ne n$ and to be twice the value of $P^{mn}$ from (3.57) for $m = n$.
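
For readers who want to experiment, a crude truncation of (3.67) with constant kernels can be coded in a few lines. Everything below (the cut-off $J$, the kernels $q$ and $p$, the time step, and the rule that coagulations exceeding $J$ are switched off so that mass stays conserved) is an illustrative assumption rather than part of the text.

```python
import numpy as np

# Truncated Smoluchowski coagulation-fragmentation (3.67) with constant kernels
# Q_{kl} = q and P^{m,k-m} = p; coagulations producing sizes above J are dropped.
J, q, p = 30, 1.0, 0.2

def rhs(x):
    dx = np.zeros(J)
    for j in range(1, J + 1):
        gain_coag = 0.5 * q * sum(x[k - 1] * x[j - k - 1] for k in range(1, j))
        loss_coag = q * x[j - 1] * sum(x[l - 1] for l in range(1, J - j + 1))
        gain_frag = p * sum(x[m - 1] for m in range(j + 1, J + 1))
        loss_frag = 0.5 * p * (j - 1) * x[j - 1]
        dx[j - 1] = gain_coag - loss_coag + gain_frag - loss_frag
    return dx

x, dt = np.zeros(J), 1e-3
x[0] = 1.0                        # start with monomers only
for _ in range(2000):
    x = x + dt * rhs(x)

sizes = np.arange(1, J + 1)
print("mass (L, x) with L(j) = j:", (sizes * x).sum())   # stays ~1
print("monomer and dimer concentrations:", x[0], x[1])
```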


Combining the evolutions from (3.65) and (3.66) yields the equation
$$\dot x_j = a(x)\delta^1_j + \sum_{k=1}^{\infty} k x_k P(x)\big(\delta^j_{k+1} - \delta^j_k\big) + \frac{1}{2}\sum_{k=1}^{\infty}\sum_{l=1}^{\infty} x_k x_l\, Q_{kl}(x)\big(\delta^{k+l}_j - \delta^l_j - \delta^k_j\big) + \frac{1}{2}\sum_{k=1}^{\infty} x_k \sum_{m<k} P^{m,k-m}(x)\big(\delta^m_j + \delta^{k-m}_j - \delta^k_j\big). \qquad (3.68)$$
This equation describes the process of merging and splitting of, for instance, banks, firms or internet communities (or, in a physical interpretation, the coagulation-fragmentation of particles), enhanced by an external flux of new members that arrive according to the preferential-attachment model and are subject to a mean-field dependence of the coefficients. In yet another interpretation, equation (3.68) can be considered a description of the process of evolutionary coalition building.

Let us show now that equations of the type (3.61) are the most general equations for representing positivity-preserving evolutions in $l_p$ or in $\mathbb{R}^N$. Let us start with the simplest case of linear equations
$$\dot x_j = \sum_k A_{jk} x_k, \quad j \in \mathbb{N} \text{ or } j \in \{1,\dots,N\}, \qquad (3.69)$$
with some matrix $A_{jk}$. The following result is straightforward.

Proposition 3.5.1. If $A_{jk}$ is conditionally positive, i.e., if $A_{jk} \ge 0$ for $j \ne k$, and if the sequence $A_{mj}$ is summable over $m$ for any $j$, then equation (3.69) can be written in the form
$$\dot x_j = \sum_k x_k \sum_{m \ne k} P^m_k \big(\delta^m_j - \delta^k_j\big) + \sum_m A_{mj}\, x_j, \qquad (3.70)$$
where $P^m_k = A_{mk}$ for $m \ne k$. In other words, it is a kinetic equation that describes mutation, death and reproduction. Moreover, the last term, responsible for death and reproduction, vanishes if and only if $\sum_m A_{mj} = 0$ for all $j$.

In particular, the equations describing fragmentation or branching can be similarly represented in a form that describes mutation, death and reproduction. For instance, the second term of (3.66) can be written as follows:
$$\sum_{k=1}^{\infty} x_k \sum_{m<k} P^{m,k-m}\big(\delta^m_j - \delta^k_j\big) + \frac{1}{2}\sum_{m<j} P^{m,j-m}\, x_j. \qquad (3.71)$$

When looking at polynomial equations, an additional detail comes into play. Let us consider a multi-linear operator in $l_p$ (or $c_\infty$ or $\mathbb{R}^\infty$) that is given by the array $A = (A^j) = \{A^j_{i_1,\dots,i_k}\}$, which is symmetric with respect to the change of order of $i_1,\dots,i_k$ for any $j$, with $j, i_1,\dots,i_k$ from $\{1,\dots,N\}$ in the case of systems in $\mathbb{R}^N$ and with $j, i_1,\dots,i_k \in \mathbb{N}$ in the case of systems in $\mathbb{R}^\infty$, and which acts according to (3.8):
$$A[Y(1)\otimes\cdots\otimes Y(k)] = \{A^j[Y(1)\otimes\cdots\otimes Y(k)]\}$$
with
$$A^j[Y(1)\otimes\cdots\otimes Y(k)] = \sum_{i_1,\dots,i_k} A^j_{i_1,\dots,i_k} Y_{i_1}(1)\cdots Y_{i_k}(k). \qquad (3.72)$$
Such a multi-linear operator is conditionally positive if $A^j[Y(1)\otimes\cdots\otimes Y(k)] \ge 0$ whenever $Y_j(1) = \cdots = Y_j(k) = 0$. Moreover, we say that the corresponding polynomial kinetic equation of the type (3.7), that is,
$$\dot x = A[x^{\otimes k}] \iff \{\dot x_j = A^j[x^{\otimes k}] \text{ for all } j = 1,\dots,N\}, \qquad (3.73)$$
although now in $\mathbb{R}^N$ or $\mathbb{R}^\infty$, with $A$ given by (3.72), is strongly conditionally positive if the mapping $A$ is conditionally positive as a multi-linear operator. Accordingly, we say that an equation with an analytic r.h.s. (3.10) (now again considered either in $\mathbb{R}^N$ or $\mathbb{R}^\infty$) is strongly conditionally positive if each homogeneous part of it is given by a conditionally positive multi-linear operator $A(k)$.

It is straightforward to see that all equations (3.60) to (3.62) are strongly conditionally positive. This property turns out to be characteristic for equations (3.60)–(3.62). Let us explain it in detail for quadratic equations only, by presenting the following straightforward result. (The general case only requires a more sophisticated notation.)

Proposition 3.5.2. Let the r.h.s. of the equation
$$\dot x_j = (A^j x, x) = \sum_{k,l} A^j_{kl} x_k x_l = \sum_k A^j_{kk} x_k^2 + 2\sum_k \sum_{l=1}^{k-1} A^j_{kl} x_k x_l, \quad j = 1,\dots,N \text{ or } j \in \mathbb{N}, \qquad (3.74)$$
be strongly conditionally positive, and let $\sum_m |A^m_{kj}| < \infty$ and $\sum_m A^m_{kj} = 0$ for any $j, k$. Then it can be represented as
$$(A^j x, x) = \frac{1}{2}\sum_{k=1}^{\infty}\sum_{l=1}^{\infty} x_k x_l \sum_{(m,n)} P^{mn}_{kl}\big(\delta^m_j + \delta^n_j - \delta^l_j - \delta^k_j\big)$$
with some positive symmetric $P^{mn}_{kl}$, i.e., as a kinetic equation describing the processes of binary collisions (preserving the number of particles). Such a representation is unique if it is additionally required that one of the two particles in each collision remains unchanged, i.e., $P^{mn}_{kl} = 0$ if neither $m$ nor $n$ equals $k$. This unique representation is as follows:
$$\dot x_j = \sum_k x_k^2 \sum_{m \ne k} P^{mk}_{kk}\big(\delta^m_j - \delta^k_j\big) + 2\sum_k \sum_{l=1}^{k-1} x_k x_l \sum_{m \ne l} P^{mk}_{lk}\big(\delta^m_j - \delta^l_j\big), \qquad (3.75)$$
where $P^{mk}_{lk} = A^m_{lk}$ for $k \ge l$ and $m \ne l$.

Notice that if one has $\sum_m L(m) A^m_{kj} = 0$ instead of $\sum_m A^m_{kj} = 0$ for all $k, j$, with some function $L \in \mathbb{R}^\infty_{++}$, then the representation (3.75) can be obtained for the variables $y_j = L(j) x_j$.

If there is no conservation law of the type $\sum_m L(m) A^m_{kj} = 0$, then an additional term representing death and reproduction must appear in the corresponding extension of (3.75).

The extension of Proposition 3.5.2 to other polynomials is also straightforward:

Proposition 3.5.3. For any collection of symmetric arrays $A^j = (A^j_{i_1\cdots i_k})$ such that $\sum_j A^j = 0$ and such that the operators defined by the $A^j$ are strongly conditionally positive, one has
$$\sum_{i_1\le\cdots\le i_k} A^j_{i_1\cdots i_k} x_{i_1}\cdots x_{i_k} = \sum_{i_1\le\cdots\le i_k} \sum_{m \ne i_1} P^{m i_2\cdots i_k}_{i_1 i_2\cdots i_k}\big(\delta^m_j - \delta^{i_1}_j\big)\, x_{i_1}\cdots x_{i_k}, \qquad (3.76)$$
where $P^{m i_2\cdots i_k}_{i_1 i_2\cdots i_k} = A^m_{i_1 i_2\cdots i_k}$ for $m \ne i_1$ and $i_1 \le\cdots\le i_k$.

3.6 Simplest equations in $l^+_p$

In this section, we discuss equations in $l^+_p$. Their properties are obtained by a more or less direct extension of the theory for $\mathbb{R}^n_+$. But first, we introduce the basic objects related to linear Lyapunov functions in $\mathbb{R}^\infty$.

For a $y = (y_1,y_2,\dots) \in \mathbb{R}^\infty$, we shall denote $|y| = (|y_1|,|y_2|,\dots) \in \mathbb{R}^\infty_+$. Moreover, we shall use the notation $(g,f)$ for the inner product $(g,f) = \sum_{k=1}^{\infty} g_k f_k$ for elements $g, f \in \mathbb{R}^\infty$ such that $(|g|,|f|) = \sum_{k=1}^{\infty} |g_k|\,|f_k| < \infty$. $P_n$ denotes the projection operator $\mathbb{R}^\infty \to \mathbb{R}^n$ acting as
$$P_n(x_1,x_2,\dots) = (x_1,\dots,x_n,0,0,\dots).$$
For sequences from $\mathbb{R}^\infty$ with all elements strictly positive, we shall write $\mathbb{R}^\infty_{++}$.

For $L \in \mathbb{R}^\infty_{++}$, let
$$M(L) = \{y \in \mathbb{R}^\infty : (L,|y|) < \infty\}, \qquad M_{\le\lambda}(L) = \{y \in M(L) : (L,|y|) \le \lambda\}$$
for a $\lambda > 0$. We also introduce the notation $M^+(L) = M(L) \cap \mathbb{R}^\infty_+$ and $M^+_{\le\lambda}(L) = M_{\le\lambda}(L) \cap \mathbb{R}^\infty_+$. Of major importance are functions $L$ with finite level sets, i.e., such that $\{j : L(j) \le \lambda\}$ is finite for any $\lambda > 0$. This requirement is equivalent to the requirement that $L(j) \to \infty$ as $j \to \infty$. The main examples of such $L$ are the functions $(1^j, 2^j, \dots) \in \mathbb{R}^\infty_+$ for a $j \ge 0$, in which case the corresponding $M(L)$ is denoted by $M_j$.

The subsets $M_{\le\lambda}(L)$ are useful, since they make it possible to control the finite-dimensional approximations of the elements of $l_p$. Namely, it is clear that $M(L) \subset c_\infty$ and $M(L) \subset l_p$ for any $p \ge 1$ whenever $L$ has finite level sets and is non-decreasing. Moreover, if $x \in M_{\le\lambda}(L)$, then
$$\|x - P_{n-1}(x)\|_{c_\infty} \le \|x - P_{n-1}(x)\|_1 \le \frac{1}{L(n)}\sum_{k=n}^{\infty} L(k)|x_k| \le \frac{\lambda}{L(n)}, \qquad (3.77)$$
and for any $p \ge 1$
$$\|x - P_{n-1}(x)\|_p = \Big(\sum_{k=n}^{\infty} |x_k|^p\Big)^{1/p} \le \sum_{k=n}^{\infty} |x_k| \le \frac{\lambda}{L(n)}. \qquad (3.78)$$

As a direct consequence of these estimates, we get the following result.

Lemma 3.6.1. $M_{\le\lambda}(L)$ is a compact subset of $c_\infty$ and of any $l_p$, $p \ge 1$, whenever $L$ has finite level sets and is non-decreasing.

Remark 54. This is of course just a discrete-setting specification of the general fact that the set of measures with bounded moments of any given order is weakly compact.

Proof. In fact, for any sequence $y^k$, $k = 1,2,\dots$, of elements of $M_{\le\lambda}(L)$, and for any $\varepsilon$, we can choose $n_0$ such that $\|y^k - P_n(y^k)\| \le \varepsilon$ for $n > n_0$, all $k$ and any of the norms in $l_p$ or $c_\infty$. Therefore, we can choose a convergent subsequence from $y^k$. $\square$

Extending the definition from the finite-dimensional setting, let us say that the r.h.s. $f(x)$ of the ODE
$$\dot x = f(x) \iff \{\dot x_j = f_j(x) \text{ for all } j = 1,2,\dots\} \qquad (3.79)$$
in $\mathbb{R}^\infty$ is conditionally positive if $f_j(x) \ge 0$ whenever $x_j = 0$. Let us say that $L \in \mathbb{R}^\infty_{++}$ is a Lyapunov function or a Lyapunov vector for equation (3.79), or that the mapping $f$ has the Lyapunov function $L$, if $f(x) \in M(L)$ (this is an additional requirement compared to the finite-dimensional case!) and the Lyapunov condition
$$(L, f(x)) \le a(L,x) + b \qquad (3.80)$$
holds with some constants $a, b$ whenever $x \in M^+(L)$. The Lyapunov function $L$ is called subcritical (respectively critical), or $f$ is said to be $L$-subcritical (respectively $L$-critical), if $(L, f(x)) \le 0$ (respectively $(L,f(x)) = 0$) for all $x \in M^+(L)$.


Theorem 3.6.1. Let $f : M^+(L) \to M(L)$ be conditionally positive, Lipschitz-continuous on any set $M^+_{\le\lambda}(L)$ as a mapping from $c_\infty \to c_\infty$ (respectively from $l_p \to l_p$ for a $p \ge 1$), and have the Lyapunov vector $L$, which has finite level sets and is non-decreasing. Then for any $x \in M^+_{\le\lambda}(L)$ there exists a unique global solution $X(t,x) \in M^+(L)$ (defined for all $t \ge 0$) to equation (3.79) in the space $c_\infty$ (respectively $l_p$) with the initial condition $x$. Moreover, if $a \ne 0$, then
$$X(t,x) \in M^+_{\le\lambda(t)}(L), \qquad \lambda(t) = e^{at}\Big(\lambda + \frac{b}{a}\Big) - \frac{b}{a} = e^{at}\lambda + (e^{at}-1)\frac{b}{a}. \qquad (3.81)$$
If $a = 0$, then the same holds with $\lambda(t) = \lambda + bt$. If $f$ is $L$-critical, then the same holds with $\lambda(t) = \lambda$.

Proof. This is a straightforward extension of the proof of Theorem 3.1.1. The constant $K$ is chosen as the Lipschitz constant of $f$ in $M_{\le\lambda(T)}(L)$, considered as a mapping from $c_\infty \to c_\infty$ (respectively from $l_p \to l_p$). $\square$

Theorem 3.6.2. Under the assumptions of Theorem 3.6.1, assume additionally that $f \in C^1(M^+_{\le\lambda}(L), c_\infty)$ (respectively $f \in C^1(M^+_{\le\lambda}(L), l_p)$) for any $\lambda$, where $M^+_{\le\lambda}(L)$ is considered in the topology of $c_\infty$ (respectively $l_p$). Then the derivative $\xi_t = DX(t,x)[\xi]$ of the solutions with respect to the initial data is well defined as an element of the corresponding space $c_\infty$ or $l_p$, and it is the unique solution to the equation
$$\xi_t = \xi + \int_0^t Df(X(s,x))[\xi_s]\, ds.$$

Proof. Once the well-posedness is proved as in Theorem 3.6.1, Theorem 3.6.2 can be proved as in the Lipschitz case, Theorem 2.9.1, since the solutions starting in $M^+_{\le\lambda}$ belong to $M^+_{\le\lambda(T)}$, where the r.h.s. is assumed to be Lipschitz-continuous. $\square$

The assumptions of Theorem 3.6.1 are rather strong, the key constraint being the assumption that the image of $f$ belongs to $M(L)$. Therefore, they do not cover many important examples. What they do cover, though, are polynomial equations
$$\dot x_j = \sum_i A^j_i(1)\, x_i + \sum_{i_1,i_2} A^j_{i_1 i_2}(2)\, x_{i_1} x_{i_2} + \cdots + \sum_{i_1,\dots,i_k} A^j_{i_1\cdots i_k}(k)\, x_{i_1}\cdots x_{i_k} \qquad (3.82)$$
with local bounded coefficients, as the next result shows. Let us say that the polynomial operator given by the arrays $A(m) = \{A^j_{i_1\cdots i_m}(m)\}$ is local if for any $n$ there exists $j_0$ such that $A^j_{i_1\cdots i_m}(m) = 0$ whenever $j > j_0$ and all $i_l$ do not exceed $n$. Roughly speaking, this means that species of lower levels cannot directly influence the changes in species of considerably higher levels.

Theorem 3.6.3. Let L ∈ R∞++ be non-decreasing with finite set levels and such that

for any m = 1, . . . , kL(j) ≤ L(i1) + · · ·+ L(im)


whenever A^j_{i_1 \cdots i_m}(m) ≠ 0 (in particular, all A(m) are local) and
\[
\sum_j L(j)\, A^j_{i_1 \cdots i_m}(m) = 0
\]
for all m, i_1, . . . , i_m. Assume that

(i) the r.h.s. of (3.82) is conditionally positive;

(ii) there is a fixed number J such that the number of values j with A^j_{i_1 \cdots i_m}(m) ≠ 0 is uniformly bounded by J for all m, i_1, . . . , i_m;

(iii) all arrays are uniformly bounded: |A^j_{i_1 \cdots i_m}(m)| < K with some K.

Then equation (3.82) satisfies the conditions of Theorem 3.6.1 for p = 1, and therefore has a unique global solution in l^+_1 for any x ∈ M^+(L). Moreover, ∑_j L(j) x_j is constant along any solution.

Proof. We have
\[
\sum_j L(j) \sum_{i_1, \ldots, i_m} |A^j_{i_1 \cdots i_m}(m)|\, x_{i_1} \cdots x_{i_m} \le \sum_{i_1, \ldots, i_m} JK\,(L(i_1) + \cdots + L(i_m))\, x_{i_1} \cdots x_{i_m} \le JKm\,(L, x)\,\|x\|_{l_1}^{m-1},
\]
which shows that the r.h.s. of (3.82) takes M^+(L) to M(L). Furthermore, we find
\[
\sum_j \Big| \sum_{i_1, \ldots, i_m} A^j_{i_1 \cdots i_m}(m)\, x_{i_1} \cdots x_{i_m} - \sum_{i_1, \ldots, i_m} A^j_{i_1 \cdots i_m}(m)\, y_{i_1} \cdots y_{i_m} \Big| \le \sum_{i_1, \ldots, i_m} \sum_j |A^j_{i_1 \cdots i_m}(m)| \sum_{p=1}^{m} x_{i_1} \cdots x_{i_{p-1}} |x_{i_p} - y_{i_p}|\, y_{i_{p+1}} \cdots y_{i_m} \le JKm\,\|x - y\|_{l_1} \max\big(\|x\|_{l_1}^{m-1}, \|y\|_{l_1}^{m-1}\big),
\]
which shows the Lipschitz continuity of the r.h.s. of (3.82) in l_1 on any set M_{≤λ}(L). □

As an example, let us apply this result to the Smoluchowski equation (3.66) or (3.67).

Corollary 2. If sup_{k,l} Q_{kl} < ∞ and sup_k ∑_{m<k} P_{m,k−m} < ∞, then the assumptions of Theorem 3.6.3 hold for equation (3.66) with L(j) = j, therefore implying the well-posedness of (3.66) for bounded coefficients.
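As a numerical illustration of the finite-dimensional truncations that underlie such results, the following hedged Python sketch integrates a truncated discrete coagulation system of the standard Smoluchowski form with a bounded kernel (the choice Q_{kl} = 1 and no fragmentation is purely hypothetical and is not meant to reproduce (3.66) verbatim) and monitors the quantity (L, x) with L(j) = j.

```python
import numpy as np
from scipy.integrate import solve_ivp

N = 50                           # truncation level n of the projection P_n
Q = np.ones((N + 1, N + 1))      # bounded coagulation rates Q_{kl} (hypothetical choice)

def rhs(t, x):
    # Truncated discrete coagulation: x[k] is the density of clusters of size k;
    # clusters of size > N produced by a reaction are simply discarded.
    dx = np.zeros_like(x)
    for k in range(1, N + 1):
        gain = 0.5 * sum(Q[i, k - i] * x[i] * x[k - i] for i in range(1, k))
        loss = x[k] * sum(Q[k, j] * x[j] for j in range(1, N + 1))
        dx[k] = gain - loss
    return dx

x0 = np.zeros(N + 1)
x0[1] = 1.0                      # start from monomers only
sol = solve_ivp(rhs, (0.0, 5.0), x0, rtol=1e-8, atol=1e-10)

# mass (L, x) with L(j) = j; it decreases slightly only because the truncation
# discards clusters larger than N, while the untruncated dynamics conserves it
mass = np.array([sum(j * sol.y[j, m] for j in range(1, N + 1))
                 for m in range(sol.y.shape[1])])
print(mass[0], mass[-1])
```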


3.7 Existence of solutions for equations in l+p

Lemma 3.7.1. Let B be a Banach space and K, K_T, K ⊂ K_T, be compact subsets therein. Let f and f_n, n = 1, 2, . . ., be a uniformly bounded family of continuous functions [0, T] × K_T → B such that ‖f_n − f‖_{C([0,T]×K_T, B)} → 0 as n → ∞. Suppose that for any n and x^n_0 ∈ K there exists a solution for t ∈ [0, T] to the equation ẋ = f_n(t, x) in B with the initial condition x^n_0 such that x^n(t) ∈ K_T for all t. Moreover, suppose that x^n_0 → x_0 as n → ∞. Then there exists a solution for t ∈ [0, T] to the equation ẋ = f(t, x) in B with the initial condition x_0 such that x(t) ∈ K_T for all t.

Proof. Since all x^n(t) assume values in a compact set, and since the derivatives ẋ^n(t) are uniformly bounded, one can choose a subsequence of the sequence of functions x^n(t) – which we again denote by x^n(t) – that converges to a function x(t) uniformly for t ≤ T. Clearly, x(t) also assumes values in K_T. Moreover, since
\[
\|f_n(t, x^n(t)) - f(t, x(t))\| \le \|f_n(t, x^n(t)) - f(t, x^n(t))\| + \|f(t, x^n(t)) - f(t, x(t))\|,
\]
and since f is continuous and hence uniformly continuous on [0, T] × K_T, we can conclude that the sequence of the derivatives ẋ^n(t) = f_n(t, x^n(t)) converges to f(t, x(t)) uniformly on t ≤ T. Therefore, ẋ(t) exists in B and equals f(t, x(t)). □

For a subset D ⊂ R^∞, let us say that a mapping f : D → R^∞ is local if for any n there exists m_0 such that
\[
f_m(P_n(x)) = 0 \quad \text{for any } x \text{ and any } m \ge m_0. \qquad (3.83)
\]
Clearly, this notion extends the above definition for polynomial operators on the r.h.s. of (3.82).

Theorem 3.7.1. Let L ∈ R^∞_{++} be non-decreasing and with finite level sets.

(i) Let f : M^+(L) → l_p, p ≥ 1, be conditionally positive and local;

(ii) let f(P_n(x)) be L-subcritical and Lipschitz-continuous for any n, as a mapping l_p → l_p on any set M^+_{≤λ}(L);

(iii) let ‖f(P_n(x)) − f(x)‖_{l_p} → 0, as n → ∞, uniformly on any set M^+_{≤λ}(L).

Then for any x ∈ M^+_{≤λ}(L) there exists a global solution X(t, x) ∈ M^+_{≤λ}(L) (defined for all t ≥ 0) to equation (3.79) in l_p with the initial condition x.

Proof. The equation ẋ = f(P_n(x)) has a unique solution for any n, since by locality of f this equation is finite-dimensional and satisfies the assumptions of Theorem 3.1.1 due to (i) and (ii). Therefore, the required existence follows by (iii) and Lemmas 3.6.1 and 3.7.1 with K = K_T = M^+_{≤λ}(L). □

Remark 55. The crucial difference compared to Theorem 3.6.1 is that we do not assume f(x) ∈ M(L) here. Therefore, the sub-criticality condition makes sense only for the approximations f(P_n(x)).


As an example, we consider again a polynomial equation of the type (3.82) with sub-multiplicative coefficients.

Theorem 3.7.2. Let L ∈ R^∞_{++} be non-decreasing with finite level sets such that for any m = 1, . . . , k
\[
|A^j_{i_1 \cdots i_m}(m)| \le o(1)\, L(i_1) \cdots L(i_m), \qquad (3.84)
\]
where the function o(1) of the variables i_1, . . . , i_m is bounded by a constant K and tends to zero whenever max(i_1, . . . , i_m) → ∞, and
\[
\sum_j L(j)\, A^j_{i_1 \cdots i_m}(m) \le 0
\]
for all m, i_1, . . . , i_m. Assume that

(i) the r.h.s. of (3.82) is conditionally positive;

(ii) there is a fixed number J such that the number of values j with A^j_{i_1 \cdots i_m}(m) ≠ 0 is uniformly bounded by J for all m, i_1, . . . , i_m.

Then equation (3.82) satisfies the conditions of Theorem 3.7.1 for p = 1, and therefore has a global solution in l^+_1 for any x ∈ M^+(L). Moreover, for x ∈ M^+_{≤λ}(L), solutions exist that remain in M^+_{≤λ}(L) for all times.

Proof. We have
\[
\sum_j \sum_{i_1, \ldots, i_m} |A^j_{i_1 \cdots i_m}(m)|\, x_{i_1} \cdots x_{i_m} \le JK \sum_{i_1, \ldots, i_m} L(i_1) \cdots L(i_m)\, x_{i_1} \cdots x_{i_m} \le JK\,(L, x)^m,
\]
which shows that the r.h.s. of (3.82) takes M^+(L) to l_1. Furthermore,
\[
\|A(m)(P_n(x)) - A(m)(x)\|_1 \le \sum_j \sum_{i_1, \ldots, i_m} |A^j_{i_1 \cdots i_m}(m)|\, \big| x_{i_1} \cdots x_{i_m} - x_{i_1}\mathbf{1}(i_1 \le n) \cdots x_{i_m}\mathbf{1}(i_m \le n) \big| \le \sum_j \sum_{i_1, \ldots, i_m} |A^j_{i_1 \cdots i_m}(m)|\, x_{i_1} \cdots x_{i_m}\, \mathbf{1}_{\max(i_1, \ldots, i_m) > n} \le \sum_{i_1, \ldots, i_m} J\, L(i_1) \cdots L(i_m)\, \omega(n)\, x_{i_1} \cdots x_{i_m} \le J\,\omega(n)\,(L, x)^m,
\]
where ω(n) → 0, as n → ∞. This establishes condition (iii) of Theorem 3.7.1 for p = 1. □

As an example, let us apply this result to the Smoluchowski equation (3.66) or (3.67) again.

Corollary 3. If Q_{kl} ≤ o(1) kl and ∑_{m<k} P_{m,k−m} ≤ o(1) k, as k, l → ∞, then the assumptions of Theorem 3.7.2 hold for equation (3.66) with L(j) = j.


Let us emphasize that Theorems 3.7.1 and 3.7.2 are only about existence. There are examples showing that under such conditions multiple solutions can actually exist.

Finally, note that in order to get the most effective well-posedness results, one has to employ yet another idea of generalized monotonicity or accretivity.

3.8 Additive bounds for rates

We shall now work explicitly with equations as represented by (3.61). Let us consider this equation in its most general form, including mean-field dependence of coefficients that is subject to a conservation law and a bound on the order of reactions. This can be either equations of the type
\[
\dot x_j = f_j(x) = a_j(x) + \sum_l x_l \sum_{\Phi : (L,\Phi) \le L_l + b} P^{\Phi}_l(x)\,(\phi_j - \delta^l_j) + \sum_{\Psi : 2 \le \#(\Psi) \le k} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P^{\Phi}_{\Psi}(x)\,(\phi_j - \psi_j), \qquad (3.85)
\]
where #(Ψ) = (1, Ψ) = ∑_j ψ_j denotes the total number of particles in Ψ and where L ∈ R^∞_{++}, a(x) = (a_j(x)) ∈ R^∞_{+,fin} and b, P^Φ_Ψ(x) are non-negative, or more generally equations of the type
\[
\dot x_j = f_j(x) = a_j(x) + \sum_l x_l \sum_{\Phi : (L,\Phi) \le L_l + b} P^{\Phi}_l(x)\,(\phi_j - \delta^l_j) + \sum_{\Psi : \#(\Psi) \ge 2} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P^{\Phi}_{\Psi}(x)\,(\phi_j - \psi_j), \qquad (3.86)
\]
where the constraint on Ψ is relaxed. The function f on the r.h.s. of these equations is always conditionally positive, and according to (3.83) it is also local whenever L has finite level sets.

Remark 56. Writing an equation in the form (3.86) makes clear that all transformations arising from the interactions (involving more than one particle) do not increase L (i.e., they are subcritical), but some growth of L is allowed by explicit injections or spontaneous transformations (that are still allowed to be enhanced by a mean-field interaction) of single particles.

We shall mostly work with equations of the type (3.85), since they are by far the most useful ones for all practical purposes, especially for k = 2, i.e., for binary interactions. But we will also sketch some results for (3.86) in order to highlight where the finiteness of k becomes crucial.

The most established growth condition for the rates P^Φ_Ψ, which allows us to cover basic practical examples and to develop a sound theory, is the condition of


additive bounds for the rates with respect to L:
\[
\sum_{\Phi} P^{\Phi}_{\Psi}(x) \le C\,(L, \Psi) \qquad (3.87)
\]
for all Ψ and a constant C. Another crucial assumption that is often used could be called the no-dust condition. This condition forbids the creation of an unbounded number of small particles by a single reaction:
\[
\#(\Phi) = \sum_j \phi_j \le J \quad \text{whenever } P^{\Phi}_{\Psi}(x) \ne 0, \qquad (3.88)
\]
with a constant J. Note that this condition is similar, but not equivalent, to condition (iii) of Theorem 3.6.3. The following result shows that with the help of these conditions, one can effectively estimate the L-norm of the r.h.s. f(x) of equation (3.85) in terms of x itself.

Proposition 3.8.1.

(i) Let L ∈ R^∞_{++} and let us assume that (3.87) holds. Then
\[
(L, |f(x)|) = \sum_j L_j |f_j(x)| \le (L, a(x)) + Cb\,(L, x) + 2C\,(L^2, x) \sum_{m=1}^{\infty} \frac{m\,(1, x)^{m-1}}{(m-1)!}, \qquad (3.89)
\]
for f from (3.86), so that f maps M^+(L^2) to M(L). In the case of equation (3.85), the sum on the r.h.s. extends only to m ≤ k.

(ii) If additionally (3.88) holds, then
\[
\|f(x)\|_{l_1} \le (1, a(x)) + C(J + k)\,(L, x) \sum_{m=1}^{k} \frac{(1, x)^{m-1}}{(m-1)!}, \qquad (3.90)
\]
for f from (3.85), so that f maps M^+(L) to l_1. In fact, under the conditions (3.88) and (3.87), the estimate (3.90) holds for the more general equations
\[
\dot x_j = f_j(x) = a_j(x) + \sum_{\Psi : \#(\Psi) \le k} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi} P^{\Phi}_{\Psi}(x)\,(\phi_j - \psi_j).
\]

Proof. (i) If x ∈ M^+(L^2), then
\[
(L, |f(x)|) = \sum_j L_j |f_j(x)| \le (L, |a|) + \sum_l x_l L_l C b + 2 \sum_{\Psi} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P^{\Phi}_{\Psi}(x)\,(L, \Psi).
\]
From now on, let us set a = 0 and b = 0, since their contributions are already obtained. Using (3.87), we can then conclude that (L, |f(x)|) does not exceed
\[
2C \sum_{\Psi} \frac{x^{\Psi}}{\Psi!}\,(L, \Psi)^2 = 2C \sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1, \ldots, i_m} x_{i_1} \cdots x_{i_m} (L_{i_1} + \cdots + L_{i_m})^2,
\]


where we used (3.12). Using the trivial estimate
\[
(a_1 + \cdots + a_m)^2 \le m\,(a_1^2 + \cdots + a_m^2)
\]
and the symmetry of the indices i_1, . . . , i_m, we further derive that
\[
(L, |f(x)|) \le 2C \sum_{m=1}^{\infty} \frac{1}{(m-1)!} \sum_{i_1, \ldots, i_m} x_{i_1} \cdots x_{i_m} (L^2_{i_1} + \cdots + L^2_{i_m}) = 2C \sum_{m=1}^{\infty} \frac{1}{(m-1)!} \sum_{i_1, \ldots, i_m} x_{i_1} \cdots x_{i_m}\, m\, L^2_{i_1} = 2C\,(L^2, x) \sum_{m=1}^{\infty} \frac{m\,(1, x)^{m-1}}{(m-1)!},
\]
which implies (3.89). Therefore, (L, |f|) is bounded.

(ii) Similarly, we have
\[
\|f(x)\|_1 \le (1, a(x)) + (J + k) \sum_{\Psi : \#(\Psi) \le k} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi} P^{\Phi}_{\Psi}(x) \le (1, a(x)) + (J + k)\,C \sum_{\Psi : \#(\Psi) \le k} \frac{x^{\Psi}}{\Psi!}\,(L, \Psi) \le (1, a(x)) + (J + k)\,C \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1, \ldots, i_m} x_{i_1} \cdots x_{i_m} (L_{i_1} + \cdots + L_{i_m}),
\]
which implies (3.90). □

If a and b vanish, then (L, f(P_n(x))) ≤ 0. In general, the following result applies:

Proposition 3.8.2. Under condition (3.87), the following estimates hold:
\[
(L, f(P_n(x))) \le (L, a) + Cb\,(L, P_n(x)), \qquad (3.91)
\]
for f from (3.86), and
\[
(L, X_n(t, x)) \le \lambda(t) = e^{Cbt}\lambda + (e^{Cbt} - 1)\frac{(L, a)}{Cb}, \qquad (3.92)
\]
for any solution X_n(t, x) to the equation ẋ = f(P_n(x)) with the initial condition x ∈ M^+_{≤λ}(L).

Proof. The first estimate (3.91) is straightforward. For any x ∈ M^+(L), P_n(x) ∈ M^+(L^2) and therefore f(P_n(x)) ∈ M(L) by Proposition 3.8.1. Hence (3.92) follows by Theorem 3.6.1. □


3.9 Evolution of moments under additive bounds

We will now show the key property of additive bounds for rates: under additive (with respect to L) bounds, all powers p ≥ 1 of the conservation law L are also Lyapunov functions on the subsets M≤λ(L), so that (3.91) extends to L^β for all β > 1. Before going ahead, let us first collect some elementary inequalities that are routinely used for estimating moments in this context.

Lemma 3.9.1.

(i) For any m, β ≥ 1, a_j ≥ 0 and b_j ≥ 1,
\[
\sum_{j=1}^{m} a_j^{\beta} b_j \le \Big( \sum_{j=1}^{m} a_j b_j \Big)^{\beta}. \qquad (3.93)
\]

(ii) For any m, a_j ≥ 0 and β ≥ 1,
\[
\sum_{j=1}^{m} a_j^{\beta} \le \Big( \sum_{j=1}^{m} a_j \Big)^{\beta} \le m^{\beta - 1} \sum_{j=1}^{m} a_j^{\beta}, \qquad (3.94)
\]
while for β ∈ [0, 1] one has (∑_{j=1}^{m} a_j)^β ≤ ∑_{j=1}^{m} a_j^β.

(iii) For any a, b, β > 0, the following estimates hold:
\[
(a + b)^{\beta} - a^{\beta} - b^{\beta} \le 2^{\beta}\,(a b^{\beta - 1} + b a^{\beta - 1}), \qquad (a + b)^{\beta} - a^{\beta} \le \beta 2^{\beta}\,(b^{\beta} + b a^{\beta - 1}). \qquad (3.95)
\]

Exercise 3.9.1. Prove these inequalities.
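Before the formal proof, a quick numerical spot-check of (3.93)–(3.95) on random data can be instructive (it is of course not a proof, only a sanity check of the constants as stated above, which is how the following hedged Python sketch should be read).

```python
import numpy as np

rng = np.random.default_rng(1)
violations = {"(3.93)": 0, "(3.94)": 0, "(3.95) first": 0, "(3.95) second": 0}
eps = 1e-12                                 # tolerance for floating-point comparisons

for _ in range(100_000):
    beta = 1.0 + 4.0 * rng.random()         # beta >= 1
    m = int(rng.integers(1, 6))
    aj = 2.0 * rng.random(m)                # a_j >= 0
    bj = 1.0 + 2.0 * rng.random(m)          # b_j >= 1
    a, b = 2.0 * rng.random(2) + 1e-6       # a, b > 0

    if (aj**beta * bj).sum() > (aj * bj).sum()**beta + eps:
        violations["(3.93)"] += 1
    s = aj.sum()
    if (aj**beta).sum() > s**beta + eps or s**beta > m**(beta - 1) * (aj**beta).sum() + eps:
        violations["(3.94)"] += 1
    if (a + b)**beta - a**beta - b**beta > 2**beta * (a * b**(beta - 1) + b * a**(beta - 1)) + eps:
        violations["(3.95) first"] += 1
    if (a + b)**beta - a**beta > beta * 2**beta * (b**beta + b * a**(beta - 1)) + eps:
        violations["(3.95) second"] += 1

print(violations)   # expected: all counters equal to zero
```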

Proposition 3.9.1. Let L ∈ R^∞_{++} be non-decreasing with finite level sets such that L_j ≥ 1 for all j. Moreover, let us assume that (3.87) holds. Then
\[
(L^{\beta}, f(P_n(x))) \le (L^{\beta}, x)\,\kappa(\beta, \lambda) + (L^{\beta}, a(x)) \qquad (3.96)
\]
for any β > 1, f from (3.86), any n and x ∈ M^+_{≤λ}(L) ∩ M(L^β), where
\[
\kappa(\beta, \lambda) = C c_{\beta} \Big( b \beta \lambda + \sum_{m=1}^{\infty} \frac{\lambda^m m^{\beta}}{m!} \Big), \qquad (3.97)
\]
with a constant c_β. Therefore, if the coordinates of a(x) are uniformly bounded by the coordinates of a vector a ∈ M^+(L^β), then L^β is a Lyapunov vector in the sense of (3.80). In the case of equation (3.85), the sum in (3.97) is over m ≤ k − 1.

Proof. Notice first that using P_n(x) rather than x in (3.96) ensures by locality that f(P_n(x)) ∈ M(L^β). Therefore, the l.h.s. of (3.96) is well defined. Let us set


a = 0 and b = 0, since their contributions are straightforward to obtain. By (3.85), we find
\[
(L^{\beta}, f(P_n(x))) \le \sum_{\Psi} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P^{\Phi}_{\Psi}(x) \sum_j \big(L^{\beta}_j \phi_j - L^{\beta}_j \psi_j\big).
\]
By (3.93), it follows that
\[
(L^{\beta}, f(P_n(x))) \le \sum_{\Psi} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P^{\Phi}_{\Psi}(x) \Big[ \Big( \sum_j L_j \phi_j \Big)^{\beta} - \sum_j L^{\beta}_j \psi_j \Big],
\]
and consequently by (3.87)
\[
(L^{\beta}, f(P_n(x))) \le \sum_{\Psi} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P^{\Phi}_{\Psi}(x) \Big[ \Big( \sum_j L_j \psi_j \Big)^{\beta} - \sum_j L^{\beta}_j \psi_j \Big] \le C \sum_{\Psi} \frac{x^{\Psi}}{\Psi!}\,(L, \Psi) \Big[ \Big( \sum_j L_j \psi_j \Big)^{\beta} - \sum_j L^{\beta}_j \psi_j \Big],
\]
which can be rewritten with the summation over unordered indices by (3.12) as
\[
C \sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1, \ldots, i_m \le n} x_{i_1} \cdots x_{i_m} (L_{i_1} + \cdots + L_{i_m}) \big[ (L_{i_1} + \cdots + L_{i_m})^{\beta} - L^{\beta}_{i_1} - \cdots - L^{\beta}_{i_m} \big],
\]
which, due to the symmetry, yields
\[
(L^{\beta}, f(P_n(x))) \le C \sum_m \frac{1}{(m-1)!} \sum_{i_1, \ldots, i_m \le n} x_{i_1} \cdots x_{i_m} L_{i_1} \big[ (L_{i_1} + \cdots + L_{i_m})^{\beta} - L^{\beta}_{i_1} - \cdots - L^{\beta}_{i_m} \big]. \qquad (3.98)
\]

Let us first perform the next step for the easiest case β = 2, in order to understand the main point of the argument: the use of cancellation for reducing the highest power of L by one. In this case, one can expand the brackets in (L_{i_1} + · · · + L_{i_m})^2 to find
\[
(L^2, f(P_n(x))) \le 2C \sum_{m=2}^{\infty} \frac{1}{(m-1)!} \sum_{i_1, \ldots, i_m \le n} x_{i_1} \cdots x_{i_m} L_{i_1} \Big( \sum_{1 \le p < q \le m} L_{i_p} L_{i_q} \Big) \le 2C \sum_{m=2}^{\infty} \frac{1}{(m-1)!} \sum_{i_1, \ldots, i_m \le n} x_{i_1} \cdots x_{i_m} \Big( L^2_{i_1} \sum_{p>1} L_{i_p} + (m-1)\, L_{i_1} \sum_{p>1} L^2_{i_p} \Big),
\]
which, by symmetry, equals
\[
2C \sum_{m=2}^{\infty} \frac{m(m-1)}{(m-1)!} \sum_{i_1, \ldots, i_m \le n} x_{i_1} \cdots x_{i_m} L^2_{i_1} L_{i_2}.
\]


Since L_j ≥ 1 for all j, the above term is therefore bounded by
\[
2C \sum_{m=1}^{\infty} \frac{m(m+1)}{m!}\,(L^2, x)\,(L, x)^m,
\]
as required.

Let us now turn to arbitrary β. In order to perform a similar cancellation, we employ the second inequality of (3.95) with a = L_{i_1} and b = L_{i_2} + · · · + L_{i_m} to obtain
\[
L_{i_1} \big[ (L_{i_1} + \cdots + L_{i_m})^{\beta} - L^{\beta}_{i_1} - \cdots - L^{\beta}_{i_m} \big] \le \beta 2^{\beta} \big[ L_{i_1} (L_{i_2} + \cdots + L_{i_m})^{\beta} + L^{\beta}_{i_1} (L_{i_2} + \cdots + L_{i_m}) \big] + L_{i_1} \big[ (L_{i_2} + \cdots + L_{i_m})^{\beta} - L^{\beta}_{i_2} - \cdots - L^{\beta}_{i_m} \big].
\]
By (3.94) and symmetry, (3.98) therefore yields the estimate
\[
(L^{\beta}, f(P_n(x))) \le C c_{\beta} \sum_{m=2}^{\infty} \frac{(m-1)^{\beta}}{(m-1)!} \sum_{i_1, \ldots, i_m \le n} x_{i_1} \cdots x_{i_m} L^{\beta}_{i_1} L_{i_2} \le C c_{\beta} \sum_{m=2}^{\infty} \frac{(m-1)^{\beta}}{(m-1)!}\,(L^{\beta}, x)\,(L, x)^{m-1},
\]
which implies (3.96). □

As a consequence of these moment estimates, one obtains the following existence result.

Theorem 3.9.1. Let L ∈ R^∞_{++} be non-decreasing with finite level sets such that L_j ≥ 1 for all j, and let the coefficients P^Φ_Ψ(P_n(x)) be Lipschitz-continuous in x for any n.

(i) Let (3.87) and (3.88) hold. Let x ∈ M^+_{≤λ}(L) ∩ M_{≤ν}(L^β), and let the coordinates of a(x) be uniformly bounded by the coordinates of a vector a ∈ M^+(L^β) with β > 1. Then for any p ≥ 1 there exists a global solution X(t, x) to (3.85) in l_p with the initial condition x such that
\[
X(t, x) \in M^+_{\le \lambda(t)}(L) \cap M_{\le \nu(t)}(L^{\beta}), \qquad \nu(t) = e^{\kappa(\beta, \lambda(t))\, t}\,\nu + \big(e^{\kappa(\beta, \lambda(t))\, t} - 1\big)\,\frac{(L^{\beta}, a)}{\kappa(\beta, \lambda(t))}, \qquad (3.99)
\]
with κ from (3.97) and λ(t) from (3.92).


(ii) Let (3.87) hold. Then the same existence result is valid for equation (3.86) and all β > 2.

Proof. (i) We shall use Lemma 3.7.1 with B = l_1, K = M^+_{≤λ}(L) ∩ M_{≤ν}(L^β) and K_T = M^+_{≤λ(T)}(L) ∩ M_{≤ν(T)}(L^β). Due to locality and the conditional positivity of f(x), the equation ẋ = f(P_n(x)) is well posed for all n, and its solution belongs to K_T by Proposition 3.9.1 and Theorem 3.6.1 applied to the Lyapunov function L^β. In order to apply Lemma 3.7.1 in the space l_p, p ≥ 1, one therefore only has to prove that ‖f(x) − f(P_n(x))‖_{l_1} → 0 as n → ∞ uniformly for x ∈ K_T (because the norms in l_p are bounded by the norm in l_1). Similar to the estimates that are used in the proof of Proposition 3.8.1(ii), we get
\[
\sum_j |f_j(P_n(x)) - f_j(x)| \le (J + k)\,C \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1, \ldots, i_m : \max(i_q) \ge n} x_{i_1} \cdots x_{i_m} (L_{i_1} + \cdots + L_{i_m}) = (J + k)\,C \sum_{m=1}^{k} \frac{1}{(m-1)!} \sum_{i_1, \ldots, i_m : \max(i_q) \ge n} x_{i_1} \cdots x_{i_m} L_{i_1},
\]
which tends to zero, as n → ∞, uniformly in x ∈ K_T, since we can estimate
\[
x_{i_1} L_{i_1} \le x_{i_1} \frac{L^{\beta}_{i_1}}{L^{\beta - 1}_n}, \qquad x_{i_l} \le x_{i_l} \frac{L^{\beta}_{i_l}}{L^{\beta}_n},
\]
for i_1 > n or i_l > n with l ≠ 1, respectively.

(ii) This proof is analogous, based on the estimates from Proposition 3.8.1(i). Namely, we can prove that (L, |f(x) − f(P_n(x))|) → 0 as n → ∞ for x ∈ K_T (for which β > 2 is required). This implies again that ‖f(x) − f(P_n(x))‖_{l_1} → 0. □

Unlike Theorem 3.7.1, the conditions of Theorem 3.9.1 also allow for a proof of the uniqueness, as will be shown shortly.

3.10 Accretive operators in lp

The standard assumption for the r.h.s. of an ODE that ensures the uniqueness of solutions is Lipschitz continuity, as used, e.g., in Theorem 2.2.1. Another condition, which is a bit more subtle, can serve the same purpose: accretivity. The general notion of accretivity was given in Section 2.17. In this section, we make an independent sketch of the theory that is adapted to the spaces l_p.

A motivation for the definition can be gained from the attempt to estimate the norm of the solutions to an ODE. Namely, if x(t) ∈ l_p, p > 1, for all t and


ẋ(t) + f(t, x(t)) = 0, then
\[
\frac{d}{dt} \|x(t)\|_p^p = \frac{d}{dt} \sum_j |x_j(t)|^p = - \sum_j p\,|x_j(t)|^{p-1}\, \mathrm{sgn}(x_j(t))\, f_j(t, x(t)).
\]

Remark 57. In the literature on accretive operators, one traditionally writes equations in the form ẋ(t) + f(t, x(t)) = 0 rather than our usual ẋ(t) = f(t, x(t)).

In order to estimate this expression by ‖x(t)‖_p^p – and consequently to be able to estimate the norm ‖x(t)‖_p^p via Gronwall's lemma – one needs an estimate of the type
\[
- \sum_j |x_j|^{p-1}\, \mathrm{sgn}(x_j)\, f_j(t, x) \le \kappa \sum_j |x_j|^p \qquad (3.100)
\]
with a constant κ ≥ 0. Similarly, in order to estimate the derivative
\[
\frac{d}{dt} \|x(t) - y(t)\|_p^p = - \sum_j p\,|x_j(t) - y_j(t)|^{p-1}\, \mathrm{sgn}(x_j(t) - y_j(t))\,\big(f_j(t, x(t)) - f_j(t, y(t))\big)
\]
in terms of ‖x(t) − y(t)‖_p^p, one needs an estimate of the type
\[
- \sum_j |x_j - y_j|^{p-1}\, \mathrm{sgn}(x_j - y_j)\,\big(f_j(x) - f_j(y)\big) \le \kappa \sum_j |x_j - y_j|^p = \kappa\,\|x - y\|_p^p. \qquad (3.101)
\]

One says that a mapping f : M → l_p defined on a subset M ⊂ l_p is quasi-accretive in l_p if (3.101) holds with a constant κ for all x, y ∈ M. Such a mapping is called accretive if the same holds with κ = 0.

Notice that if f : l_2 → l_2 is Lipschitz-continuous, that is, ‖f(x) − f(y)‖_2 ≤ κ‖x − y‖_2, then
\[
\Big| \sum_j (x_j - y_j)\,\big(f_j(x) - f_j(y)\big) \Big| \le \|x - y\|_2\, \|f(x) - f(y)\|_2 \le \kappa\,\|x - y\|_2^2,
\]
which implies that f is quasi-accretive in l_2. Note, however, that the converse may not be true! Therefore, quasi-accretivity is a weaker property than Lipschitz continuity.

For spaces l_p with p > 1, the norm is a differentiable function. It is a peculiarity of the case l_1 that its norm is not differentiable. However, since the norm is a convex function, the right and left directional derivatives of the norm are well defined, which leads to the notion of the semi-inner products (1.38), (1.39). For absolutely continuous functions, the set of points where the right and left derivatives differ has zero measure and does not contribute to the integrals. Therefore, it is essentially irrelevant whether the right or left derivative or any of their convex combinations is chosen when estimating quantities like ‖x(t) − y(t)‖_p^p, see Proposition 1.4.2. Strictly speaking, the accretivity of f : M → l_1 means that [x − y, f(x) − f(y)]_+ ≥ 0, with the formula for [x, z]_± in l_1 being given in (1.47). Since by (1.40), [x, z]_+ ≥ ([x, z]_+ + [x, z]_−)/2, and since the formula for the r.h.s.


is simpler in l_1, one naturally modifies the accretivity condition in l_1 by defining modified accretivity as
\[
\tfrac{1}{2}\big( [x - y, f(x) - f(y)]_+ + [x - y, f(x) - f(y)]_- \big) \ge 0, \qquad (3.102)
\]
which is just a bit stronger than the standard accretivity (and coincides with it for all l_p with p > 1). Due to (1.47), the modified accretivity (that we shall stick to) and quasi-accretivity of a mapping f in l_1 can be formally written in the same way as for other l_p as
\[
- \sum_j \mathrm{sgn}(x_j - y_j)\,\big(f_j(x) - f_j(y)\big) \le \kappa \sum_j |x_j - y_j|, \qquad (3.103)
\]
where the function sgn(y) has the usual meaning of being equal to 1, −1, 0 for y > 0, y < 0, y = 0, respectively.

Remark 58. Using the accretivity notion in its standard form would require a more specific description of the contribution of the terms with x_j − y_j = 0 in (3.103), which turns out to be irrelevant in all concrete examples.

The following simple result shows in detail how the notion of accretivity works.

Proposition 3.10.1. Let f : l_p → l_p be quasi-accretive and let x(t) and y(t) be two solutions to the equation ẋ + f(x) = 0 in l_p, p ≥ 1. Then
\[
\|x(t) - y(t)\|_p \le \|x(0) - y(0)\|_p\, e^{t\kappa}. \qquad (3.104)
\]
In particular, for a given x(0) there can exist at most one solution. Moreover, if f is accretive, then the norm of the difference of any two solutions is non-increasing in time.

Proof. In fact, by quasi-accretivity, we find
\[
\frac{d}{dt} \|x(t) - y(t)\|_p^p \le p\kappa\, \|x(t) - y(t)\|_p^p,
\]
and (3.104) follows from Gronwall's lemma. For p = 1, the norm is not differentiable, and the most direct proof is completed by Proposition 1.4.4 in the space l_1. □

As we shall see, accretivity with respect to the weighted norm x → (L, |x|) with L ∈ R^∞_{++} arises naturally in the analysis of equations of the type (3.61). Such accretivity of f(x) means that
\[
- \sum_j L_j\, \mathrm{sgn}(x_j - y_j)\,\big(f_j(x) - f_j(y)\big) \le \kappa \sum_j L_j\, |x_j - y_j|. \qquad (3.105)
\]


3.11 Accretivity for evolutions with additive rates

The following result on the accretivity with respect to the weighted norms is a major ingredient for the analysis of well-posedness in l_p for equations with additive bounds for rates.

Lemma 3.11.1. Let L_j ≥ 1 for all j, let the r.h.s. f of equation (3.86) satisfy (3.87), and let the coefficients P^Φ_Ψ(x), a(x) satisfy the following Lipschitz continuity conditions:
\[
\sum_{\Phi} |P^{\Phi}_{\Psi}(x) - P^{\Phi}_{\Psi}(y)| \le \tilde C\,(L, |x - y|)\,(L, \Psi), \qquad (3.106)
\]
\[
(L, |a(x) - a(y)|) \le C_a\,(L, |x - y|). \qquad (3.107)
\]
Then −f is accretive on M^+(L^2) with respect to the weighted norm x ↦ (L, |x|). More precisely,
\[
\sum_{i=1}^{\infty} L_i\,\sigma_i\,\big(f_i(x) - f_i(y)\big) \le \big[\alpha(\lambda)\,\nu + C_a + b(C + \lambda \tilde C)\big]\,(L, |x - y|), \qquad (3.108)
\]
for x, y ∈ M^+_{≤λ}(L) ∩ M^+_{≤ν}(L^2) and σ_i = sgn(x_i − y_i), where
\[
\alpha(\lambda) = 2(C + \tilde C) \sum_{m=1}^{\infty} \frac{(m+1)\,\lambda^{m-1}}{(m-1)!}. \qquad (3.109)
\]
In the case of equation (3.85), the summation in the last formula is over m ≤ k − 1.

Remark 59. Here, we mean that σ_i equals 1, −1, 0 for x_i − y_i > 0, x_i − y_i < 0, x_i − y_i = 0, respectively – in line with our convention on the accretivity for the space l_1, see (3.102) and (3.103). However, the proof below will show that the assignment of the value σ_i for the case x_i − y_i = 0 turns out to be irrelevant, which confirms the claim made in Remark 58 above.

Proof. Proposition 3.8.1(i) ensures that the l.h.s. of (3.108) is well defined. In order to keep the formulae slim, we set a = 0, since the contribution of a is straightforward.

Let us first consider the case when P^Φ_Ψ(x) does not depend on x and when b = 0. By (3.85), we get
\[
\sum_{i=1}^{\infty} L_i\,\sigma_i\,\big(f_i(x) - f_i(y)\big) = \sum_{i=1}^{\infty} L_i\,\sigma_i \sum_{\Psi} \frac{x^{\Psi} - y^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} P^{\Phi}_{\Psi}\,(\phi_i - \psi_i),
\]


which rewrites as
\[
\sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1, \ldots, i_m} (x_{i_1} \cdots x_{i_m} - y_{i_1} \cdots y_{i_m}) \sum_{(L,\Phi) \le L_{i_1} + \cdots + L_{i_m}} P^{\Phi}_{i_1, \ldots, i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1} \sigma_{i_1} - \cdots - L_{i_m} \sigma_{i_m} \Big)
\]
\[
= \sum_m \frac{1}{m!} \sum_{i_1, \ldots, i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}} (x_{i_q} - y_{i_q})\, y_{i_{q+1}} \cdots y_{i_m} \sum_{(L,\Phi) \le L_{i_1} + \cdots + L_{i_m}} P^{\Phi}_{i_1, \ldots, i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1} \sigma_{i_1} - \cdots - L_{i_m} \sigma_{i_m} \Big)
\]
\[
= \sum_m \frac{1}{m!} \sum_{i_1, \ldots, i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}}\, y_{i_{q+1}} \cdots y_{i_m}\, |x_{i_q} - y_{i_q}|\, \sigma_{i_q} \sum_{(L,\Phi) \le L_{i_1} + \cdots + L_{i_m}} P^{\Phi}_{i_1, \ldots, i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1} \sigma_{i_1} - \cdots - L_{i_m} \sigma_{i_m} \Big).
\]

And now comes the main trick:
\[
\sigma_{i_q} \sum_{(L,\Phi) \le L_{i_1} + \cdots + L_{i_m}} P^{\Phi}_{i_1, \ldots, i_m} \Big( \sum_{i=1}^{\infty} L_i \sigma_i \phi_i - L_{i_1} \sigma_{i_1} - \cdots - L_{i_m} \sigma_{i_m} \Big) = \sum_{(L,\Phi) \le L_{i_1} + \cdots + L_{i_m}} P^{\Phi}_{i_1, \ldots, i_m} \Big( \sum_i \sigma_{i_q} L_i \sigma_i \phi_i - \sigma_{i_q} L_{i_1} \sigma_{i_1} - \cdots - \sigma_{i_q} L_{i_m} \sigma_{i_m} \Big) \qquad (3.110)
\]
\[
\le C\,(L_{i_1} + \cdots + L_{i_m})\,(L_{i_1} + \cdots + L_{i_m} - \sigma_{i_q} L_{i_1} \sigma_{i_1} - \cdots - \sigma_{i_q} L_{i_m} \sigma_{i_m}).
\]
Since the term L_{i_q} cancels in the second bracket, the last expression vanishes for m = 1. For m > 1, it does not exceed
\[
2C L_{i_q} \sum_{p \ne q} L_{i_p} + 2C \Big( \sum_{p \ne q} L_{i_p} \Big)^2 \le 2C L_{i_q} \sum_{p \ne q} L_{i_p} + 2C\,(m-1) \sum_{p \ne q} L^2_{i_p}.
\]
Consequently,
\[
\sum_{i=1}^{\infty} L_i \sigma_i\,\big(f_i(x) - f_i(y)\big) \le 2C \sum_{m=2}^{\infty} \frac{1}{m!} \sum_{i_1, \ldots, i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}}\, y_{i_{q+1}} \cdots y_{i_m}\, |x_{i_q} - y_{i_q}| \Big[ L_{i_q} \sum_{p \ne q} L_{i_p} + (m-1) \sum_{p \ne q} L^2_{i_p} \Big], \qquad (3.111)
\]


so that L_{i_q} enters only linearly! This expression can be bounded by
\[
2C\,(L, |x - y|) \sum_{m=2}^{\infty} \frac{1}{(m-2)!} \big[ \lambda^{m-1} + (m-1)\lambda^{m-2}\nu \big] \le 2C\,(L, |x - y|) \sum_{m=1}^{\infty} \frac{(m+1)\,\lambda^{m-1}}{(m-1)!}\,\nu,
\]
which implies (3.108). (Note that we used the estimate λ ≤ ν in the second inequality.)

If nontrivial b are included, the estimates (3.110) are modified for m = 1. An additional term that is bounded by C b L_i then adds to the r.h.s. of (3.111):
\[
C b \sum_i |x_i - y_i|\, L_i = C b\,(L, |x - y|).
\]

Turning to a general P^Φ_Ψ(x) satisfying (3.106) and b = 0, we have to add to the r.h.s. of (3.111) the term
\[
\sum_{i=1}^{\infty} L_i \sigma_i \sum_{\Psi} \frac{y^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} \big[ P^{\Phi}_{\Psi}(x) - P^{\Phi}_{\Psi}(y) \big]\,(\phi_i - \psi_i) \le \sum_{i=1}^{\infty} \sum_{\Psi} \frac{y^{\Psi}}{\Psi!} \sum_{\Phi : (L,\Phi) \le (L,\Psi)} \big| P^{\Phi}_{\Psi}(x) - P^{\Phi}_{\Psi}(y) \big|\,(L_i \phi_i + L_i \psi_i),
\]
which by (3.106) and the estimate (L, Φ) ≤ (L, Ψ) does not exceed
\[
2 \sum_{\Psi} \frac{y^{\Psi}}{\Psi!}\, \tilde C\,(L, \Psi)^2\,(L, |x - y|) \le 2\tilde C\,(L, |x - y|) \sum_{i_1, \ldots, i_m} \frac{1}{m!}\, y_{i_1} \cdots y_{i_m}\, m\,(L^2_{i_1} + \cdots + L^2_{i_m}) = 2\tilde C\,(L, |x - y|) \sum_{m=1}^{\infty} \frac{m}{(m-1)!} \sum_{i_1, \ldots, i_m} y_{i_1} \cdots y_{i_m}\, L^2_{i_1}.
\]
This gives the remaining contribution to (3.109). Including b adds an additional term bλ\tilde C\,(L, |x − y|). □

It is remarkable that, similarly to the case with moments, the accretivity estimates can also be obtained for weighted norms (L^β, |x|) with any β > 1. In order to keep the formulae slim, let us only discuss the case β = 2.

Lemma 3.11.2. Under the assumptions of Lemma 3.11.1, the function −f is accretive on M^+(L^3) with respect to the weighted norm x ↦ (L^2, |x|). More precisely,
\[
\sum_{i=1}^{\infty} L^2_i\,\sigma_i\,\big(f_i(x) - f_i(y)\big) \le \alpha\big(C, \tilde C, C_a, (L^3, x)\big)\,(L^2, |x - y|), \qquad (3.112)
\]
for x, y ∈ M^+(L^3), with some continuous function α(C, \tilde C, C_a, (L^3, x)).


Proof. The difference to Lemma 3.11.1 only affects the parts with a = 0, b = 0 and P independent of x. In this case, working as in the proof of Lemma 3.11.1 we obtain
\[
\sum_{i=1}^{\infty} L^2_i\,\sigma_i\,\big(f_i(x) - f_i(y)\big) \le \sum_m \frac{1}{m!} \sum_{i_1, \ldots, i_m} \sum_{q=1}^{m} x_{i_1} \cdots x_{i_{q-1}}\, y_{i_{q+1}} \cdots y_{i_m}\, |x_{i_q} - y_{i_q}|\, \sigma_{i_q} \sum_{(L,\Phi) \le L_{i_1} + \cdots + L_{i_m}} P^{\Phi}_{i_1, \ldots, i_m} \Big( \sum_{i=1}^{\infty} L^2_i \sigma_i \phi_i - L^2_{i_1} \sigma_{i_1} - \cdots - L^2_{i_m} \sigma_{i_m} \Big).
\]
Now the main trick is the estimate
\[
\sigma_{i_q} \Big( \sum_i L^2_i \sigma_i \phi_i - L^2_{i_1} \sigma_{i_1} - \cdots - L^2_{i_m} \sigma_{i_m} \Big) \le \Big( \sum_i L^2_i \phi_i - L^2_{i_q} + \sum_{p \ne q} L^2_{i_p} \Big) \le \Big( \Big( \sum_i L_i \phi_i \Big)^2 - L^2_{i_q} + \sum_{p \ne q} L^2_{i_p} \Big) \le \Big( (L_{i_1} + \cdots + L_{i_m})^2 - L^2_{i_q} + \sum_{p \ne q} L^2_{i_p} \Big) \le 2 L_{i_q} \sum_{p \ne q} L_{i_p} + m \sum_{p \ne q} L^2_{i_p}.
\]
The leftover part is fully analogous to Lemma 3.11.1. □

3.12 The major well-posedness result in l+p

We can now improve Theorem 3.9.1 to the full well-posedness result.

Theorem 3.12.1. Let L ∈ R^∞_{++} be non-decreasing with finite level sets such that L_j ≥ 1 for all j. Let (3.87), (3.88), (3.106) and (3.107) hold, and let x ∈ M^+_{≤λ}(L) ∩ M_{≤ν}(L^2), a(x) = a ∈ R^∞_{+,fin}.

(i) Then for any p ≥ 1 there exists a unique global solution X(t, x) to (3.85) in l_p with the initial condition x such that
\[
X(t, x) \in M^+_{\le \lambda(t)}(L) \cap M_{\le \nu(t)}(L^2), \qquad \nu(t) = e^{\kappa(2, \lambda(t))\, t}\,\nu + \big(e^{\kappa(2, \lambda(t))\, t} - 1\big)\,\frac{(L^2, a)}{\kappa(2, \lambda(t))}, \qquad (3.113)
\]
with κ from (3.97) and λ(t) from (3.92). If x ∈ M^+(L^β) with some β > 2, then the corresponding solution X(t, x) also belongs to M^+(L^β), with similar estimates (given in Theorem 3.9.1).


(ii) If X(t, x) and X(t, y) are two solutions with the initial conditions x and y, then
\[
(L, |X(t, x) - X(t, y)|) \le \exp\big\{ [\alpha(\lambda(t))\,\nu(t) + C_a + b(C + \lambda(t)\tilde C)]\,t \big\}\,(L, |x - y|), \qquad (3.114)
\]
with α(λ) from (3.109).

(iii) If X_n(t, x) denotes the solution to the finite-dimensional approximation ẋ = f(P_n(x)), then
\[
\|X_n(t, x) - X(t, x)\|_{l_p} \to 0 \quad \text{and} \quad (L, |X_n(t, x) - X(t, x)|) \to 0 \qquad (3.115)
\]
as n → ∞, uniformly on x ∈ M_{≤ν}(L^2). In particular, it follows that if the sum in (3.86) is over Φ : (L, Φ) = (L, Ψ) and not just over (L, Φ) ≤ (L, Ψ) + b, then (L, X(t, x)) = (L, x).

Proof. The existence part is already proved in Theorem 3.9.1. Estimate (3.114) and hence uniqueness follow from the accretivity (3.108) in the weighted norm in the same way as (3.104) of Proposition 3.10.1 follows from the accretivity in l_p and Proposition 1.4.4 in the case p = 1. It remains to show (iii): The first limit in (3.115) follows from the construction of X(t, x) in Theorem 3.9.1, and the second limit follows from the convergence in l_1 and the observation that ∑_{j≥m} L_j y_j → 0 as m → ∞ uniformly for y ∈ M^+_{≤ν}(L^2). □

Remark 60. Part (ii) of Theorem 3.9.1 can be similarly extended into a well-posedness result for equation (3.86), but with the additional constraint that x, a ∈ M^+(L^β) with some β > 2.

As an example, let us consider equation (3.68) describing the mean-field dependent merging-splitting or coagulation-fragmentation, enhanced by the process of preferential attachment.

Theorem 3.12.2. Let P(x), a(x), Q_{kl}(x), P_{m,k}(x) be continuous non-negative functions on l_1 such that
\[
P(x) + a(x) \le c, \qquad Q_{kl}(x) \le c\,(k + l), \qquad \sum_{m+n<k} P_{mn}(x) \le c\,k, \qquad (3.116)
\]
\[
|P(x) - P(y)| + |a(x) - a(y)| \le c\,(L, |x - y|), \qquad |Q_{kl}(x) - Q_{kl}(y)| \le c\,(k + l)\,(L, |x - y|), \qquad (3.117)
\]
\[
\sum_{m+n \le k} |P_{mn}(x) - P_{mn}(y)| \le c\,k\,(L, |x - y|), \qquad (3.118)
\]
for L = (1, 2, . . .) and a constant c. Then equation (3.68) satisfies the assumptions of Theorem 3.12.1 with L = (1, 2, . . .) and is therefore well posed for initial conditions from M^+_{≤ν}(L^2).

Proof. Straightforward inspection. □


3.13 Sensitivity

Let us now prove the smooth dependence of the solutions to equation (3.85) (as constructed in Theorem 3.12.1) with respect to the initial data. In order to grasp the main idea more easily, we start with equations with coefficients that do not depend on x, i.e., we start with the equations of the type
\[
\dot x_j = \sum_{\Psi : \#(\Psi) \le k} \frac{x^{\Psi}}{\Psi!} \sum_{\Phi} P^{\Phi}_{\Psi}\,(\phi_j - \psi_j) = \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1, \ldots, i_m} x_{i_1} \cdots x_{i_m} \sum_{\Phi} P^{\Phi}_{i_1, \ldots, i_m}\,(\phi_j - \delta^{i_1}_j - \cdots - \delta^{i_m}_j), \qquad (3.119)
\]

where the summation over Φ is restricted to (L, Φ) ≤ (L, Ψ) for m > 1 and to (L, Φ) ≤ (L, Ψ) + b for m = 1. Assuming the existence of the partial derivative ξ(t) = ξ^p(t) = ∂X(t, x)/∂x_p of the solutions to (3.119), one gets the following equation for ξ = ξ(t) = ξ^p(t) by differentiation:
\[
\dot \xi_j = \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1, \ldots, i_m} \sum_{q=1}^{m} \xi_{i_q} \prod_{r \ne q} X_{i_r}(t, x) \sum_{\Phi} P^{\Phi}_{i_1, \ldots, i_m}\,(\phi_j - \delta^{i_1}_j - \cdots - \delta^{i_m}_j) = \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{l, i_1, \ldots, i_m} \xi_l\, X_{i_1}(t, x) \cdots X_{i_m}(t, x) \sum_{\Phi} P^{\Phi}_{i_1, \ldots, i_m, l}\,(\phi_j - \delta^l_j - \delta^{i_1}_j - \cdots - \delta^{i_m}_j), \qquad (3.120)
\]
with the initial condition ξ_j(0) = δ^p_j. Since we shifted the parameter m in the last equation, the restriction on Φ turns into (L, Φ) ≤ (L, Ψ) for m > 0 and (L, Φ) ≤ (L, Ψ) + b for m = 0.

Equation (3.120) differs from (3.119) in two essential points. First of all, it is linear: it can be written as
\[
\dot \xi_j = \sum_l A_{jl}(t)\, \xi_l \iff \dot \xi = A(t)\,\xi, \qquad (3.121)
\]
with the infinite matrix A having the elements
\[
A_{jl}(t) = \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{i_1, \ldots, i_m} X_{i_1}(t, x) \cdots X_{i_m}(t, x) \sum_{\Phi} P^{\Phi}_{i_1, \ldots, i_m, l}\,(\phi_j - \delta^l_j - \delta^{i_1}_j - \cdots - \delta^{i_m}_j).
\]
And secondly, no positivity of ξ can be expected. This implies that an estimate of (L, ξ) does not make much sense, and that only estimates for (L, |ξ|) can be relevant.


The natural finite-dimensional approximations to (3.120) are given by the equations
\[
\dot \xi_j = \sum_l A^{(n)}_{jl}(t)\, \xi_l = \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{l, i_1, \ldots, i_m} \xi_l\, X_{i_1}(t, P_n(x)) \cdots X_{i_m}(t, P_n(x)) \sum_{\Phi} P^{\Phi}_{i_1, \ldots, i_m, l}\,(\phi_j - \delta^l_j - \delta^{i_1}_j - \cdots - \delta^{i_m}_j). \qquad (3.122)
\]
(Note that they are finite-dimensional due to the locality of the r.h.s. of (3.119).) Obviously, the solutions ξ^{(n)}_j to these finite-dimensional equations with the initial conditions δ^p_j are well defined.

We shall analyse the smoothness of the solutions to equation (3.119) by the following steps. First we shall see with the help of accretivity that the approximations (3.122) fit the assumptions of Lemma 3.7.1, which leads to the existence of solutions to (3.120). Next, we shall use accretivity again for proving the uniqueness. This implies that the approximations ξ^{(n)}_j = ξ^{(n)}_j(t) (note that we often omit the dependence on t in order to keep the formulae slim) solving (3.122) actually converge to the solutions of (3.120) (not only along subsequences), which in turn implies that the derivatives ξ^{(n)}_j of X(t, P_n(x)) with respect to x_p converge. Therefore, their limit ξ_j is the derivative of X(t, x).
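To make this scheme concrete, the following hedged Python sketch takes a small finite-dimensional quadratic system of kinetic type (a purely hypothetical stand-in for a truncated version of (3.119), with a randomly chosen interaction matrix B and influx c), integrates the flow together with the linearized variational equation (the analogue of (3.120)-(3.121)), and compares the result with a central finite-difference approximation of ∂X(t, x)/∂x_p.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(3)
d, p, T = 6, 2, 1.0
B = rng.random((d, d))                   # hypothetical non-negative interaction matrix
c = rng.random(d)                        # hypothetical constant influx

f  = lambda x: c - x * (B @ x)           # toy quadratic right-hand side f(x)
Df = lambda x: -np.diag(B @ x) - np.diag(x) @ B   # its Jacobian Df(x)

def augmented(t, z):
    # joint evolution of the state x and of the derivative xi = dX/dx_p
    x, xi = z[:d], z[d:]
    return np.concatenate([f(x), Df(x) @ xi])

x0 = rng.random(d)
xi0 = np.zeros(d); xi0[p] = 1.0          # initial condition xi(0) = e_p, cf. (3.120)

sol = solve_ivp(augmented, (0.0, T), np.concatenate([x0, xi0]), rtol=1e-10, atol=1e-12)
xi_T = sol.y[d:, -1]                     # derivative obtained from the variational equation

h = 1e-6                                 # central finite difference in the p-th coordinate
def flow(x):
    return solve_ivp(lambda t, y: f(y), (0.0, T), x, rtol=1e-10, atol=1e-12).y[:, -1]
fd = (flow(x0 + h * np.eye(d)[p]) - flow(x0 - h * np.eye(d)[p])) / (2 * h)

print(np.max(np.abs(xi_T - fd)))         # expected: small, the two derivatives agree
```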

Let us start with accretivity estimates in the weighted norm y ↦ (L, |y|), as in Lemma 3.11.1, in order to get estimates for the growth of (L, |ξ^{(n)}|). The estimate (3.108) for the finite difference of the r.h.s. of equation (3.119) suggests that a similar estimate should hold for the derivatives of this r.h.s.:
\[
\sum_{i=1}^{\infty} L_i\, \mathrm{sgn}(\xi^{(n)}_i)\, [A^{(n)}(t)\,\xi^{(n)}]_i \le C\,(L, |\xi^{(n)}|) \Big[ b + 2\,(L^2, X(t, P_n(x))) \sum_{m=1}^{k-1} \frac{(m+1)\,(L, x)^{m-1}}{(m-1)!} \Big]. \qquad (3.123)
\]

In fact, this can be proved similarly to the proof of Lemma 3.11.1. Namely, setting b = 0 in order to avoid lengthy formulae (its contribution is straightforward), we have
\[
\sum_{i=1}^{\infty} L_i\, \mathrm{sgn}(\xi^{(n)}_i)\, [A^{(n)}(t)\,\xi^{(n)}]_i = \sum_{m=0}^{k-1} \frac{1}{m!} \sum_{l, i_1, \ldots, i_m} |\xi^{(n)}_l|\, X_{i_1}(t, P_n(x)) \cdots X_{i_m}(t, P_n(x)) \sum_{(L,\Phi) \le L_{i_1} + \cdots + L_{i_m} + L_l} P^{\Phi}_{l, i_1, \ldots, i_m}\, \mathrm{sgn}(\xi^{(n)}_l) \Big( \sum_i L_i\, \mathrm{sgn}(\xi^{(n)}_i)\, \phi_i - L_{i_1}\, \mathrm{sgn}(\xi^{(n)}_{i_1}) - \cdots - L_{i_m}\, \mathrm{sgn}(\xi^{(n)}_{i_m}) - L_l\, \mathrm{sgn}(\xi^{(n)}_l) \Big).
\]


Similarly to the proof of Lemma 3.11.1, we can show that this expression is bounded by
\[
\sum_{m=1}^{k-1} \frac{1}{m!} \sum_{l, i_1, \ldots, i_m} |\xi^{(n)}_l|\, X_{i_1}(t, P_n(x)) \cdots X_{i_m}(t, P_n(x)) \cdot 2C\,(L_{i_1} + \cdots + L_{i_m} + L_l)\,(L_{i_1} + \cdots + L_{i_m}) \le 2C\,(L, |\xi^{(n)}|) \sum_{m=1}^{k-1} \frac{(L, P_n(x))^m}{(m-1)!} + 2C\,(1, |\xi^{(n)}|) \sum_{m=1}^{k-1} \frac{(L^2, X(t, P_n(x)))\, m\,(L, P_n(x))^{m-1}}{(m-1)!},
\]
which yields (3.123).

By (3.123) and Proposition 1.4.4 in the Banach space of sequences with the weighted norm (L, |ξ|), it follows that
\[
(L, |\xi^{(n)}(t)|) \le L_p + C \int_0^t (L, |\xi^{(n)}(s)|) \Big[ b + 2\,(L^2, X(s, P_n(x))) \sum_{m=1}^{k-1} \frac{(m+1)\,(L, x)^{m-1}}{(m-1)!} \Big] ds.
\]
By Gronwall's lemma, and taking into account that X(t, P_n(x)) ∈ M_{≤ν(t)}(L^2) by (3.113) and that (L, |ξ^{(n)}|) = L_p at t = 0, it follows that
\[
(L, |\xi^{(n)}(t)|) \le \exp\Big\{ C t \Big[ b + 2\nu(t) \sum_{m=1}^{k-1} \frac{m\,(L, x)^{m-1}}{(m-1)!} \Big] \Big\}\, L_p. \qquad (3.124)
\]

Consequently, we can apply Lemma 3.7.1 to the approximations ξ^{(n)} with
\[
K_T = \Big\{ y \in l_1 : (L, |y|) \le \exp\Big\{ C T \Big[ b + 2\nu(T) \sum_{m=1}^{k-1} \frac{m\,(L, x)^{m-1}}{(m-1)!} \Big] \Big\}\, L_p \Big\}.
\]

It can be seen that all conditions are met due to (3.124) and Theorem 3.12.1(iii), which implies the existence of solutions to (3.120) with the initial condition ξ_j(0) = δ^p_j that can be obtained as the limit of a converging subsequence of ξ^{(n)}. Moreover, for any solution ξ to (3.120) from K_T, the following estimate can be obtained similarly to (3.123):
\[
\sum_{i=1}^{\infty} L_i\, \mathrm{sgn}(\xi_i)\, [A(t)\,\xi]_i \le C\,(L, |\xi|) \Big[ b + 2\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)\,(L, x)^{m-1}}{(m-1)!} \Big]. \qquad (3.125)
\]

Next, by linearity of A(t), this estimate can be applied to the difference ξ^p − ξ^q of any two solutions of (3.120) from K_T with the initial conditions ξ^p_j(0) = δ^p_j,


ξ^q_j(0) = δ^q_j. This leads to
\[
\sum_{i=1}^{\infty} L_i\, \mathrm{sgn}(\xi^p_i - \xi^q_i)\, [A(t)(\xi^p - \xi^q)]_i \le C\,(L, |\xi^p - \xi^q|) \Big[ b + 2\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)\,(L, x)^{m-1}}{(m-1)!} \Big]. \qquad (3.126)
\]
Using again Proposition 1.4.4 and Gronwall's lemma, we can conclude that
\[
(L, |\xi^p(t) - \xi^q(t)|) \le \exp\Big\{ C t \Big[ b + 2\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)\,(L, x)^{m-1}}{(m-1)!} \Big] \Big\}\, |L_p - L_q|. \qquad (3.127)
\]

The uniqueness of solutions to (3.120) with the initial condition ξ_j(0) = δ^p_j follows from (3.127). Consequently, it also follows that this solution is the limit in l_1 of the approximating solutions ξ^{(n)}. As already noted, this implies that the solution ξ^p(t) to (3.120) with the initial condition ξ_j(0) = δ^p_j equals in fact the derivative ∂X(t, x)/∂x_p.

Summing up, we proved the following sensitivity result for discrete kinetic equations:

Theorem 3.13.1. Let L ∈ R^∞_{++} be non-decreasing with finite level sets such that L_j ≥ 1 for all j, and let (3.87), (3.88) hold. Then the solutions X(t, x) to equation (3.119) (which are uniquely defined according to Theorem 3.12.1) are differentiable with respect to each coordinate x_p for x ∈ M^+(L^2), and the derivatives ξ^p(t) = ∂X(t, x)/∂x_p are the unique solutions to equation (3.120) in l_p belonging to M(L), with the initial condition ξ^p_j(0) = δ^p_j. Their growth is estimated by the same estimate (3.124) as for ξ^{(n)}, and the difference of two derivatives satisfies (3.127).

Let us now extend Theorem 3.13.1 to the full equations. Note that we omit the details of the proof and show only a straightforward modification to the above proof.

Theorem 3.13.2. Under the conditions of Theorem 3.12.1, assume additionally that the functions a(x), P^Φ_Ψ(x) are continuously differentiable in x, so that
\[
\sup_j \sum_{\Phi} \Big| \frac{\partial P^{\Phi}_{\Psi}(x)}{\partial x_j} \Big| \le \tilde C\,(L, \Psi), \qquad \sup_j \Big( L, \Big| \frac{\partial a(x)}{\partial x_j} \Big| \Big) \le C_a. \qquad (3.128)
\]
Then the solutions X(t, x) to equation (3.119) are differentiable with respect to each coordinate x_p for x ∈ M^+(L^2). The derivatives ξ^p(t) = ∂X(t, x)/∂x_p belong to


M(L) and have the bounds
\[
(L, |\xi^p(t)|) \le \exp\Big\{ t \Big[ C_a + (C + (L, x)\tilde C)\,b + 2(C + \tilde C)\,\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)\,(L, x)^{m-1}}{(m-1)!} \Big] \Big\}\, L_p, \qquad (3.129)
\]
and the difference of two derivatives can be estimated by
\[
(L, |\xi^p(t) - \xi^q(t)|) \le \exp\Big\{ t \Big[ C_a + (C + (L, x)\tilde C)\,b + 2(C + \tilde C)\,\nu(t) \sum_{m=1}^{k-1} \frac{(m+1)\,(L, x)^{m-1}}{(m-1)!} \Big] \Big\}\, |L_p - L_q|. \qquad (3.130)
\]

Let us emphasize that the solutions X(t, x) to equation (3.119) belong to M(L^2) and their derivatives ξ belong to M(L), which reflects the important general effect that derivatives with respect to the initial condition are often less regular than the solution itself. This regularity decay extends to other moments as follows.

Theorem 3.13.3. Under the conditions of Theorem 3.13.2, let x ∈ M^+(L^3) (and therefore X(t, x) ∈ M^+(L^3) for all t by Theorem 3.12.1(i)). Then the derivatives ξ^p(t) = ∂X(t, x)/∂x_p belong to M(L^2) and the following estimates hold:
\[
(L^2, |\xi^p(t)|) \le \alpha_1\big(t, C_a, C, \tilde C, (L^3, x)\big)\, L^2_p, \qquad (3.131)
\]
\[
(L^2, |\xi^p(t) - \xi^q(t)|) \le \alpha_1\big(t, C_a, C, \tilde C, (L^3, x)\big)\, |L^2_p - L^2_q|, \qquad (3.132)
\]
with some continuous function α_1(t, C_a, C, \tilde C, (L^3, x)).

Proof. It is the same as for Theorem 3.13.2, but one must now use the accretivity of Lemma 3.11.2 rather than that of Lemma 3.11.1. Namely, one obtains the following accretivity estimate for the derivative approximations:
\[
\sum_{i=1}^{\infty} L^2_i\, \mathrm{sgn}(\xi^{(n)}_i)\, [A^{(n)}(t)\,\xi^{(n)}]_i \le (L^2, |\xi^{(n)}|)\, \alpha_1\big(t, C_a, C, \tilde C, (L^3, x)\big) \qquad (3.133)
\]
with some continuous function α_1(t, C_a, C, \tilde C, (L^3, x)). □

with some continuous function α1(t, Ca, C, C, (L3, x)). �

In the following section, we will show that – as one may now expect – taking second derivatives leads to a further decay of the regularity.

3.14 Second-order sensitivity

For the analysis of fluctuations in interacting particle systems as described by the kinetic equations, the second-order derivatives of the solutions with respect to


the initial data play a key role. Their analysis is based on similar ideas as for the first-order derivatives. However, due to the expected further decay of regularity, they can only be obtained for solutions with a third-order moment in L, i.e., for solutions from M^+(L^3), according to Theorem 3.13.3.

Theorem 3.14.1. Under the conditions of Theorem 3.13.2, assume additionally that the functions a(x), P^Φ_Ψ(x) are twice continuously differentiable in x, so that
\[
\sup_{j,k} \sum_{\Phi} \Big| \frac{\partial^2 P^{\Phi}_{\Psi}(x)}{\partial x_j \partial x_k} \Big| \le \tilde C\,(L, \Psi), \qquad \sup_{j,k} \Big( L, \Big| \frac{\partial^2 a(x)}{\partial x_j \partial x_k} \Big| \Big) \le C_a. \qquad (3.134)
\]
Let x ∈ M^+(L^3), so that X(t, x) ∈ M^+(L^3) for all t by Theorem 3.12.1(i) and ξ^p(t) = ∂X(t, x)/∂x_p belong to M(L^2) by Theorem 3.13.3. Then the continuous derivatives η^{p,q}(t) = ∂^2 X(t, x)/∂x_p ∂x_q exist and belong to M(L). Moreover, the following estimate holds:
\[
(L, |\eta^{p,q}(t)|) \le t\, L_p L_q\, \alpha_2\big(t, C_a, C, \tilde C, (L^3, x)\big) \qquad (3.135)
\]
with some continuous function α_2(t, C_a, C, \tilde C, (L^3, x)).

Proof. It is very similar to the proof of Theorem 3.13.1. Reducing the discussion to the case of a and P independent of x, we first formally differentiate equation (3.120) (assuming that the derivative exists), which leads to the equation
\[
\dot \eta^{p,q}_j(t) = [A(t)\,\eta^{p,q}(t)]_j + \sum_{m=0}^{k-2} \frac{1}{m!} \sum_{l_1, l_2, i_1, \ldots, i_m} \xi^p_{l_1} \xi^q_{l_2}\, X_{i_1}(t, x) \cdots X_{i_m}(t, x) \sum_{\Phi} P^{\Phi}_{l_1, l_2, i_1, \ldots, i_m}\,(\phi_j - \delta^{l_1}_j - \delta^{l_2}_j - \delta^{i_1}_j - \cdots - \delta^{i_m}_j), \qquad (3.136)
\]
which must be satisfied with the initial condition η^{p,q}_j(0) = 0. This equation has the same linear part as equation (3.120), and its non-homogeneous term g, i.e., the sum in (3.136), has a uniformly bounded weighted norm (L, |g|), because (L^2, |ξ^p(t)|) and (L^2, |ξ^q(t)|) are bounded. Therefore, equation (3.136) has a unique solution, which belongs to M(L). By the same approximation as in Theorem 3.13.1, one can show that this solution yields in fact the required second-order derivative. □

For example, applying this result to equation (3.68) yields the following.

Theorem 3.14.2. Under the assumptions of Theorem 3.12.2, assume additionally that the functions a, P, Q_{kl}, P_{mk} are twice continuously differentiable and
\[
\Big| \frac{\partial a(x, b)}{\partial x_p} \Big| + \Big| \frac{\partial P(x, b)}{\partial x_p} \Big| \le c, \qquad \Big| \frac{\partial Q_{kj}(x, b)}{\partial x_p} \Big| \le c\,(k + j), \qquad \sum_{m+n<k} \Big| \frac{\partial P_{mn}(x, b)}{\partial x_p} \Big| \le c\,k, \qquad (3.137)
\]


\[
\Big| \frac{\partial^2 a(x, b)}{\partial x_p \partial x_q} \Big| + \Big| \frac{\partial^2 P(x, b)}{\partial x_p \partial x_q} \Big| \le c, \qquad \Big| \frac{\partial^2 Q_{kj}(x, b)}{\partial x_p \partial x_q} \Big| \le c\,(k + j), \qquad \sum_{m+n<k} \Big| \frac{\partial^2 P_{mn}(x, b)}{\partial x_p \partial x_q} \Big| \le c\,k, \qquad (3.138)
\]
uniformly for all x, p, q.

Then equation (3.68) satisfies the assumptions of Theorems 3.13.2 and 3.14.1 with L(j) = j. Therefore, for x with ∑_j j^3 x_j < ∞, the solutions to (3.68) are twice continuously differentiable, and the derivatives have the bounds
\[
\sup_x \sum_j j\,|\xi^p_j(t, x)| \le C(\nu, T)\,L_p, \qquad \sup_x \sum_j j^2\,|\xi^p_j(t, x)| \le C(\nu, T)\,L^2_p, \qquad (3.139)
\]
\[
\sup_x \sum_j j\,|\eta^{p,q}_j(t, x)| \le t\,C(\nu, T)\,L_p L_q, \qquad (3.140)
\]
uniformly for t ≤ T, where sup_x is over x ∈ M_{≤ν}(L^3), i.e., over x that satisfy the estimate ∑_j j^3 x_j ≤ ν.

3.15 Stability of solutions with respect to coefficients

In this section, we prove a simple result on the continuous dependence of the solutions to kinetic equations on the coefficients of their r.h.s.

Theorem 3.15.1. Suppose that we are given another system of equations (3.85) with coefficients ã(x), P̃(x) such that both these systems satisfy the assumptions of Theorem 3.12.1. Moreover, suppose that
\[
\sum_{\Phi} |P^{\Phi}_{i_1, \ldots, i_m}(x) - \tilde P^{\Phi}_{i_1, \ldots, i_m}(x)| \le \varepsilon\,(L_{i_1} + \cdots + L_{i_m} + \delta_{1m} b), \qquad (3.141)
\]
\[
\sum_j L_j\,|a_j(x) - \tilde a_j(x)| \le \varepsilon. \qquad (3.142)
\]
Then we have the following estimate for the solutions X(t, x) and X̃(t, x) to these equations with equal initial conditions:
\[
(L, |X(t, x) - \tilde X(t, x)|) \le \varepsilon\, t \Big[ 1 + b\,(L, x) + \nu(t) \sum_m \frac{m\,(L, x)^{m-1}}{(m-1)!} \Big] \exp\big\{ [\alpha(\lambda(t))\,\nu(t) + C_a + b(C + \tilde C)]\,t \big\}, \qquad (3.143)
\]
where ν(t) = max_{s≤t}(L^2, X(s, x)).

where ν = maxs≤t(L2, X(t, x)).

Proof. We have

∑iLiσi(fi(x)− fi(x)) =

∑iLiσi(fi(x)− fi(x)) +

∑iLiσi(fi(x)− fi(x)),


where σ_i = sgn(x_i − x̃_i), x and x̃ denote the two solutions, and f, f̃ are the r.h.s. of the two equations (3.85) considered. The first sum in this expression is estimated in Lemma 3.11.1. Consequently,
\[
\sum_{i=1}^{\infty} L_i \sigma_i\,\big(f_i(x) - \tilde f_i(\tilde x)\big) \le \big[\alpha(\lambda)\,\nu + C_a + b(C + \lambda \tilde C)\big]\,(L, |x - \tilde x|) + \sum_i L_i\,|a_i(\tilde x) - \tilde a_i(\tilde x)| + \sum_{\Psi} \frac{\tilde x^{\Psi}}{\Psi!} \sum_{\Phi} \big| P^{\Phi}_{\Psi}(\tilde x) - \tilde P^{\Phi}_{\Psi}(\tilde x) \big|\,\big[(L, \Phi) + (L, \Psi)\big].
\]
The last two terms can be estimated by
\[
\varepsilon + \varepsilon \sum_m \sum_{i_1, \ldots, i_m} \tilde x_{i_1} \cdots \tilde x_{i_m}\,(L_{i_1} + \cdots + L_{i_m})\,(L_{i_1} + \cdots + L_{i_m} + \delta_{m1} b) \le \varepsilon \Big[ 1 + b\,(L, \tilde x) + \nu \sum_m \frac{m\,(L, \tilde x)^{m-1}}{(m-1)!} \Big].
\]
The proof is then completed as in Theorem 3.12.1(ii), i.e., by Gronwall's lemma and Proposition 1.4.4. □

As an example, let us apply this theorem for estimating the difference between the solutions to (3.68) and their approximations arising from the use of approximate coefficients.

Theorem 3.15.2. Under the assumptions of Theorem 3.12.2, for the difference between the solutions X(t, x) of equation (3.68) and the solutions X_n(t, x) of the same equation, but with the coefficients a(P_n(x)), P(P_n(x)), Q_{kl}(P_n(x)) and P_{km}(P_n(x)), we have the estimate
\[
(L, |X(t, x) - X_n(t, x)|) \le \frac{t}{n^{\beta}}\, C\big(T, (L^{\beta}, x)\big) \qquad (3.144)
\]
for any β ≥ 2, uniformly for t ≤ T, with a constant C(T, (L^β, x)).

Proof. From the conditions of Theorem 3.12.2, we find the conditions of the previous theorem to be fulfilled with ε of the order 1/n^{β−1}. □

3.16 Hints and answers to chosen exercises

Exercise 3.2.1. [A] + [I] + [P] is preserved by the evolution of [A], [I], [P]. Therefore, we find [I] = [A_0] − [A] − [P], and the equation for [P] is
\[
\frac{d[P]}{dt} = k_I\big([A_0] - [A] - [P]\big) = k_I\big([A_0] - e^{-t k_a}[A_0] - [P]\big),
\]


which implies (3.23). [A] + [P] is preserved by the evolution of [A], [P]. Therefore, we find [P] = [A_0] + [P_0] − [A], and the equation for [A] is
\[
\frac{d[A]}{dt} = -k\,[A]\,\big([P_0] + [A_0] - [A]\big),
\]
which implies (3.25).

Exercise 3.9.1. (i) follows from the first inequality of (3.94). (ii) Dividing by max(a_j) leads to the simpler inequality
\[
1 + \sum_{j=1}^{m-1} x_j^{\beta} \le \Big( 1 + \sum_{j=1}^{m-1} x_j \Big)^{\beta} \le m^{\beta - 1} \Big( 1 + \sum_{j=1}^{m-1} x_j^{\beta} \Big), \qquad (3.145)
\]
with all x_j from [0, 1]. (iii) For β ≤ 1 the first inequality holds trivially, since the l.h.s. is non-positive. Due to homogeneity, the first inequality follows for β > 1 from the inequality
\[
(1 + x)^{\beta} - 1 \le 2^{\beta} x, \qquad x \in [0, 1].
\]
The last inequality follows from the mean-value theorem.

3.17 Summary and comments

The theory of positivity-preserving ODEs with unbounded coefficients was developed for spaces of sequences. The space l_1, being the simplest example of a space of measures, also has the specific property that the cone of positive elements l^+_1 has a non-empty interior. The interior of the cone of non-negative measures M^+(X) for an uncountable metric space X is empty. In fact, for any μ ∈ M^+(X) with such X, there exists an element x ∈ X such that μ{x} = 0, and therefore μ − εδ_x ∉ M^+(X) for any ε > 0. This difference is important when it comes to positivity-preserving solutions, since for l_1 the corresponding conditional positivity is a property on the boundary of l^+_1, while for M(R^n) it is a property on the whole set M^+(R^n).

Section 3.1, which deals with the equations in R^n that preserve the positive cone R^n_+, can be considered an application of the general theory of invariant sets for ODEs, see, e.g., [233], and [176] for Banach spaces. For the history of the detailed and complex balance conditions that culminate in Theorem 3.4.2, we refer to [95]. Extensions of these results to continuous state spaces are given in [94].

Sections 3.7 to 3.12 cover the theory of discrete kinetic equations, culminating in Theorem 3.12.1. For the basic Smoluchowski coagulation-fragmentation model, this result was first obtained in [25] and then extended to various models by many authors. The general form given here is taken from [139] (although it is further extended in order to account for a possible growth of L by injections and unilateral transformations), where a rather extensive bibliography can be found.


Let us emphasize, however, that the method of [25], developed specifically to treat Smoluchowski equations, can be considered an application of the general method of accretive operators. Theorem 3.13.1 is an adaptation to the discrete case of the corresponding sensitivity result for continuous state models from [147]. In the framework of increasing Lyapunov functions, the results of Theorems 3.13.2, 3.13.3, 3.14.1 and 3.14.2 are new. Sensitivity for the Smoluchowski equation with respect to a parameter was developed in [22] and applied in [23] for obtaining effective numeric solutions.

The kinetic equations that have been dealt with in this chapter can be obtained as weak limits of the Markov model of interacting particles, i.e., as a dynamic law of large numbers, see [139] and [147] for the general kinetics of mass-exchange processes. For the coagulation-fragmentation processes described by the Smoluchowski equations, the approximating Markov process is referred to as the Markus–Lushnikov process. The corresponding convergence result was obtained by many authors under various assumptions, see, e.g., [214] and references therein. This relation leads to a deep study of the coagulation-fragmentation process by probabilistic methods, see [36] and references therein. In particular, this method leads to a detailed description of special solutions once the total mass is not conserved – a situation that is referred to as gelation (creation of infinite-mass clusters in a finite time). It turns out that the limiting behaviour of fluctuation processes of particle systems around their law-of-large-numbers limits can be described by an appropriate infinite-dimensional diffusion process. The corresponding convergence result, i.e., an infinite-dimensional central limit theorem, was obtained for the discrete case with bounded coefficients in [59], and for a general case in [145]. The method of [59] is essentially probabilistic, while the method of [145] is analytic, with the main ingredient being the sensitivity of the solutions to kinetic equations with respect to the initial data – which is a very strong motivation for studying sensitivity.

The systematic inclusion of the positive influx of particles (growing Lyapunov functions) that was presented here is motivated by the vast modern literature on models of growth, in particular the famous preferential-attachment model that was introduced in the context of complex networks in [5]. Its remarkable history is nicely described in [246]. Our results on the sensitivity of the kinetic equations in these models allow for a rigorous proof, in a very general setting, that the corresponding Markov systems of interacting particles with a linear growth rate converge to the deterministic limit as described by these kinetic equations, see [154].


Chapter 4

Linear Evolutionary Equations: Foundations

This chapter deals with linear equations with an unbounded r.h.s. in Banach spaces (or even locally convex spaces), and the methods of semigroups and propagators are developed. The basic tools are duality, perturbation theory and the Fourier transform. The general theory is illustrated by various examples, including evolutions with mixed fractional Laplacians, Schrödinger equations with singular and polynomially growing potentials and magnetic fields, complex diffusions, and parabolic higher-order PDEs and ΨDEs with local and nonlocal perturbations.

4.1 Semigroups and their generators

We start by recalling the basic facts on semigroups of linear operators, which are defined as collections of continuous operators T_t, t ∈ R_+, in a linear topological space V such that T_0 is the identity and the chain rule (also called the group equation) T_{t+s} = T_t T_s holds for all t, s ≥ 0. Such a semigroup T_t is called strongly continuous (or, alternatively, a semigroup of the class (C_0)) if T_t f → f as t → 0 for any f ∈ V. In a Banach space B, this condition reads ‖T_t f − f‖ → 0 as t → 0, and in a locally convex space with the topology generated by semi-norms p_α, it turns into the requirement that p_α(T_t f − f) → 0 for any α and f. If bounded operators T_t are defined for all t ∈ R and satisfy the group equation T_t T_s = T_{t+s} there, then the family {T_t} is referred to as a group of operators.

We shall mostly work with Banach spaces. Note, however, that an extension of the basic theory to general locally convex spaces is usually straightforward if all the required conditions on the norm are extended in such a way that they hold for each semi-norm of the locally convex space.

If V is a barrelled space (for instance, a Banach space or a Frechet space),then it follows from the principle of uniform boundedness, Theorem 1.6.1, and the

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_4

213

Page 229: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

214 Chapter 4. Linear Evolutionary Equations: Foundations

strong continuity of Tt (and, of course, the continuity of each Tt) that the familyTt is locally equicontinuous. For a Banach space, this means that the norms ‖Tt‖are uniformly bounded for t from any compact interval. For a barrelled space Vwith the topology induced by the family of semi-norms pα, it means, according toProposition 1.6.2, that for any t > 0 and α there exist a constant C and a finitenumber of semi-norms {pα1 , . . . , pαk

} such that

pα(Tsx) ≤ C(pα1(x) + · · ·+ pαk(x)) (4.1)

for all x ∈ V and s ∈ [0, t].

The semigroup Tt in a locally convex space V is called equicontinuous if forany α there exist a constant C and a finite number of semi-norms {pα1 , . . . , pαk

}such that (4.1) holds for all s ≥ 0. In the case of a Banach space, this is equiva-lent to the requirement that the family of operators Tt is uniformly bounded. Aparticularly important case is when Tt is a semigroup of contraction in a Banachspace B, i.e., ‖Tt‖ ≤ 1 for all t.

The simplest example of strongly continuous semigroups of operators in aBanach space B is the family of exponents

Tt = etA =

∞∑n=0

tn

n!An, (4.2)

defined for any bounded linear operator A on B. In this case, the Tt actually forma group. A special case of such A are square matrices, i.e., linear operators in Rd.

Furthermore, the shifts Ttf(x) = f(x + t) form a strongly continuous semi-group (and even a group) of contractions on the Banach spaces C∞(R) and Lp(R),p ≥ 1, an equicontinuous semigroup on the Schwartz space S(Rd) and on the spaceof test functions D(Rd), i.e., infinitely differentiable functions with a compact sup-port. However, this semigroup is not strongly continuous on C(R).

Exercise 4.1.1. Check these assertions.

Observe also that if f is an analytic function, then

f(x+ t) =

∞∑n=0

tn

n!(Dnf)(x), (4.3)

which can be formally written as etDf(x).

The resolving operators f0 → ft from Theorem 2.4.1 in the case of a time-independent ψt = ψ provide more examples of strongly continuous semigroups.The general relation between semigroups and Cauchy problems will be establishedin the framework of propagators in Theorem 4.10.1.

If D ⊂ V and A : D → V is a linear mapping, then A is usually referred toas a linear operator on V with the domain D. Such operator is called closed if itsgraph is a closed subset of V × V . If V is metricizable (Banach or Frechet space),

Page 230: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.1. Semigroups and their generators 215

this is equivalent to the requirement that if xn → x and Axn → y as n → ∞for a sequence xn ∈ D, then x ∈ D and y = Ax. A is called closable if a closedextension of A exists, in which case the closure of A is defined as the minimalclosed extension of A, i.e., the operator with the graph being the closure of thegraph of A. A subspace D of the domain DA of a closed operator A is called acore for A if A is the closure of A restricted to D.

Let Tt be a strongly continuous semigroup of linear operators on a space V .The infinitesimal generator (or just the generator) of Tt is defined as the operator

Af = limt→0

Ttf − f

t

on the linear subspace DA ⊂ V (the domain of A) where this limit exists. Forexample, any bounded A on a Banach space B is the generator of the semigroup(4.2). By analogy, one often denotes the semigroup Tt that is generated by anoperator A by the exponential function: Tt = etA. Another standard example isthe semigroup of shifts (4.3), whose generator is the differentiation operator D.

A more complicated example is given by the link between nonlinear dynamicsin a Banach (or locally convex) space and the corresponding linear dynamics onthe functions, performed by the method of characteristics (see Theorem 2.10.1).Let us prove a simple assertion revealing this link.

Proposition 4.1.1.

(i) Under the assumptions of Theorem 2.2.1, let F be uniformly bounded. Thenthe operators TtS(Y ) = S(μt(Y )) form a strongly continuous group of linearcontractions in the space Cuc(B) of uniformly continuous bounded functionson B (as a closed subspace of C(B)).

(ii) Under the assumptions of Theorem 2.9.1, the space C1(B) is an invariantsubspace for this semigroup, where the generator is given by the formula

LS(Y ) = DS(Y )[F (Y )], (4.4)

and G(t, Y ) = TtS(Y ) solves the PDE (2.155) for any Y ∈ C1(B).

Proof. (i) The space Cuc(B) is invariant under Tt by (2.11). The strong continu-ity follows from (2.10) and the assumption that F is bounded. The contractionproperty is seen from the following estimate:

‖TtS‖ = supY

|TtS(Y )| = supY

|S(μt(Y ))| ≤ supμ

|S(μ)| = ‖S‖.

(ii) The space C1(B) is invariant under Tt by (2.146) and the chain rule. Theremaining statements are consequences of Theorem 2.10.1. �

Remark 61. This result can be extended to more general situations, when thedynamics μt(Y ) does not satisfy the assumptions of Theorem 2.2.1, in particular

Page 231: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

216 Chapter 4. Linear Evolutionary Equations: Foundations

when F (μ) is not Lipschitz-continuous. Whenever the dynamics t �→ μt is welldefined at least for positive t, the operators TtS form a semigroup (not a group!)of linear contractions in an appropriate space of functions. We shall explain inChapter 7 how such extensions can be used for deriving general kinetic equations.

Exercise 4.1.2. Let F ∈ CbLip(Rd,Rd) and TtF (Y ) = S(μt(Y )), the semigroup in

the space Cuc(Rd) given by Proposition 4.1.1. Show that the space C1(Rd) is a

core (although possibly not invariant) for this semigroup.

The following result shows that the domain of the generator is always ratherrich.

Proposition 4.1.2. Let Tt be a strongly continuous semigroup of continuous linearoperators on a locally convex space V . Then for all ψ ∈ V and t > 0, the vectorψ(t) =

∫ t

0Tuψ du belongs to DA and Aψ(t) = Ttψ−ψ. The vectors of these form,

and hence the vectors from the domain of D, are dense in B.

Proof. We have

Tδψ − ψ

δ=

1

δ

∫ t+δ

t

Tuψ du− 1

δ

∫ δ

0

Tuψ du,

which implies Aψ(t) = Ttψ − ψ by the continuity of Tuψ. Moreover, ψ(t)/t → ψas t → 0 implies the density. �

If the semigroup Tt on V is equicontinuous (in particular, if Tt is a uniformlybounded family on a Banach space), then the resolvent of Tt (or of A) is definedfor any λ > 0 as the operator

Rλf =

∫ ∞

0

e−λtTtf dt. (4.5)

The equicontinuity implies that the resolvent is a well-defined continuous operatorfor any λ > 0. In particular, if the family of semi-norms pα on V is ordered, thenfor any α there exist β and a constant Cαβ such that

‖Rλf‖α = λ−1Cαβ‖f‖β. (4.6)

The following result collects the basic properties of generators and resolvents. Italso provides another proof (for the equicontinuous case) that the generator alwayshas a dense domain.

Theorem 4.1.1. Let Tt = etA be a equicontinuous and strongly continuous semi-group of linear operators on a locally convex space V with the generator A. Thenthe following assertions hold:

(i) TtDA ⊂ DA for each t ≥ 0 and TtAf = ATtf for each t ≥ 0, f ∈ DA.

(ii) Ttf =∫ t

0 ATsf ds+ f and Ttf = ATtf for f ∈ DA.

Page 232: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.1. Semigroups and their generators 217

(iii) λRλf → f as λ → ∞.

(iv) Rλf ∈ DA for any f and λ > 0 and (λ −A)Rλf = f , i.e. Rλ = (λ− A)−1.

(v) If f ∈ DA, then RλAf = ARλf ; in particular, Rλ(λ − A)f = f , so that Rλ

is a bijection V → DA for any λ > 0.

(vi) DA is dense in B.

(vii) A is closed on DA.

(viii) If Tt acts in a Banach space and ‖Tt‖ ≤ M , then ‖Rλ‖ ≤ M/λ for anyλ > 0.

(ix) All Rλ commute, and the following resolvent equation holds:

Rλ −Rμ = (μ− λ)RλRμ. (4.7)

Proof. (i) For ψ ∈ DA,

ATtψ =

[limh→0

1

h(Th − I)

]Ttψ = Tt

[limh→0

1

h(Th − I)

]ψ = TtAψ.

(ii) Follows from (i).

(iii) Follows from the equation

λ

∫ ∞

0

e−λtTtf dt = λ

∫ ∞

0

e−λtf dt+λ

∫ ε

0

e−λt(Ttf−f) dt+λ

∫ ∞

ε

e−λt(Ttf−f) dt,

observing that the first term on the r.h.s. is f , the second (respectively third) termis small for small ε (respectively for any ε and large λ).

(iv) By definition,

ARλf = limh→0

1

h(Th − 1)Rλf = lim

h→0

1

h

∫ ∞

0

e−λt(Tt+hf − Ttf) dt

= limh→0

[eλh − 1

h

∫ ∞

0

e−λtTtf dt− eλh

h

∫ h

0

e−λtTtf dt

]= λRλf − f.

(v) Follows from the definitions and (ii).

(vi) Follows from (iv) and (v).

(vii) If fn → f as n → ∞ for a sequence fn ∈ D and Afn → g, then

Ttf − f = limn→∞

∫ t

0

TsAfn ds =

∫ t

0

Tsg ds.

Applying the fundamental theorem of calculus shows that g = Af , which com-pletes the proof.

(viii) Follows from (4.5).

(ix) Equation (4.7) follows more or less directly from the definitions. It impliesthat all Rλ commute. �

Page 233: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

218 Chapter 4. Linear Evolutionary Equations: Foundations

Exercise 4.1.3. Give a detailed derivation of the resolvent equation (4.7).

Proposition 4.1.3. Let an operator A with domain DA generate a strongly contin-uous semigroup of linear continuous operators Tt on a locally convex space V . IfD is a dense subspace of DA of A, i.e., invariant under all Tt, then D is a corefor A.

Proof. Let D be the domain of the closure of A restricted to D. We have to showthat for ψ ∈ DA there exists a sequence ψn ∈ D, n ∈ N, such that ψn → ψ andAψn → Aψ. By Proposition 4.1.2, it is enough to show this for vectors ψ(t) =∫ t

0 Tuψ du. Since D is dense, there exists a sequence ψn ∈ D converging to ψ.Therefore, also Aψn(t) → Aψ(t) holds, because Aψ(t) = Ttψ−ψ. The observationthat ψn(t) ∈ D by the invariance of D completes the proof. �

Under the assumptions of Theorem 4.1.1, the resolvent R0 for λ = 0 is ingeneral not well defined. Nevertheless, it is defined in many interesting situations.The operator R0 is then referred to as the potential operator.

Proposition 4.1.4. Under the assumptions of Theorem 4.1.1, let the operator R0f =∫Ttf dt be well defined as a continuous operator in V . Then R0 is a bijection

V → DA, and R0Ag = AR0g = −g for any g ∈ DA.

Proof. The assumption implies that Ttψ → 0 and ψ(t) =∫ t

0Tsψ ds → R0ψ as

t → ∞. But Aψ(t) = Ttψ − ψ (see Proposition 4.1.2), and hence Aψ(t) → −ψ.Consequently, since A is closed, we find R0ψ ∈ DA and AR0ψ = −ψ for any ψ.On the other hand, if ψ ∈ DA, then

R0Aψ = limt→∞

∫ t

0

ATsψ ds = limt→∞(Ttψ − ψ) = −ψ,

so the image of R0 coincides with DA. �

Let us now point out the two main relations by which the theory of semi-groups enters the theory of differential equations.

Proposition 4.1.4 states that the potential operator is the inverse of A upto the minus sign. Therefore, it can be considered the abstract analogue of thefundamental solution from the theory of generalized functions. More precisely, by(1.179), if A is a ΨDO, then the fundamental solution is the integral kernel of theoperator (−R0).

Theorem 4.1.1 (ii) states that Ttf solves the Cauchy problem for the equationft = Aft with the initial condition f0 = f , whenever f ∈ D. The inclusion ft ∈ Dis an abstract version of what is meant by the notion of classical solutions inclassical ODEs and PDEs. The theory of semigroups also fits the natural notionof generalized solutions. Let us say that ft is a generalized solution to the Cauchyproblem for the equation ft = Aft, if it is a continuous function of t, satisfies theinitial condition f0 = f , and if a sequence of elements fn ∈ D exists such thatfn → f and the corresponding (classical, i.e., belonging to the domain) solutions

Page 234: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.2. Semigroups of operators on Banach spaces 219

T nf converge to ft, as n → ∞. Therefore, the curves Ttf represent generalized

solutions for any f ∈ V . Another way of defining generalized solutions is based onduality. Both these notions as well as the related question of well-posedness willbe discussed in more detail later (on a more general level of propagators).

4.2 Semigroups of operators on Banach spaces

We shall now provide more details for the case of semigroups on Banach spacesB. Let us start with the classification of unbounded semigroups.

Proposition 4.2.1. Let Tt be a strongly continuous semigroup in a Banach spaceB. Then there exist constants M and m such that ‖Tt‖ ≤ Memt.

Proof. As was already mentioned above, the principle of uniform boundednessimplies that for any S the norms ‖Tt‖, t ∈ [0, S], are uniformly bounded, say, byM . Next, for any t, let n ∈ N be the integer part of t/S, so that t = nS + δ withδ ≤ S. Then ‖Tt‖ ≤ Mn+1 = Men lnM ≤ Memt with m = (lnM)/S. �

The infimum of those m for which ‖Tt‖ ≤ Memt with some M is called thetype of growth of the semigroup Tt. The following result is straightforward.

Theorem 4.2.1. If Tt is a strongly continuous semigroup of type m0 in a Banachspace B, Theorem 4.1.1 still holds with the only modification that Rλ is definedfor λ > m0 and (viii) is replaced by the estimate ‖Rλ‖ ≤ M(m)/(λ −m), whichholds for λ > m > m0 with some constants M(m) that may depend on m.

Next, we provide a simple convergence result.

Proposition 4.2.2. Let Tt be a strongly continuous semigroup in a Banach spaceB generated by the operator A on D. Let Tn be a sequence of strongly continuoussemigroups generated by the operators An on the domains Dn ⊃ D such that(An − A)f → 0 as n → ∞ for f ∈ D, and either (i) this convergence is uniformon the sets {f : ‖Af‖ ≤ K}, or (ii) the norms ‖Anf‖ are uniformly bounded onthe sets {f : ‖Af‖ ≤ K}. Then T n

t f → Ttf for any f ∈ B, uniformly for t fromcompact intervals.

Proof. We have

Tt − T nt = (T n

t−sTs)|t0 =

∫ t

0

d

ds(T n

t−sTs) ds =

∫ t

0

T nt−s(A−An)Ts ds. (4.8)

If ‖Tt‖ is bounded by M on [0, t], then ‖ATsf‖ ≤ M‖Af‖. In the case (i), (A −An)Tsf → 0 uniformly for s ∈ [0, t] for any f ∈ D. Therefore, (Tt − T n

t )f → 0uniformly for t from compact segments and any f ∈ D. Approximating arbitrary fby elements from D yields the claimed statement. In the case (ii), the convergence(Tt − T n

t )f → 0 for each t follows from the dominated convergence theorem.The uniform convergence is a consequence of the boundedness of the integrandin (4.8). �

Page 235: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

220 Chapter 4. Linear Evolutionary Equations: Foundations

Remark 62. With a somewhat more elaborate argument, this proposition can beproved without the additional assumptions (i) or (ii), see, e.g., [119].

A notable example for approximating semigroups is given by the Yosidaapproximation Aλ, which for any generator A of a semigroup Tt is defined as

Aλ = λARλ = λ(λRλ − 1). (4.9)

For any strongly continuous semigroup Tt of type m0, this operator is boundedfor λ > m0. For f ∈ DA, we find

(Aλ −A)f = λRλAf −Af,

which according to Theorems 4.1.1 and 4.2.1 tends to zero, as λ → ∞. Moreover,

‖Aλf‖ ≤ M(m)λ‖Af‖λ−m

.

Proposition 4.2.3. Let Tt be a strongly continuous semigroup in a Banach spaceB generated by the operator A on D. Let Aλ be its Yosida approximation. Let T λ

t

denote the semigroups exp{tAλ}. Then the semigroups T λt converge to Tt uniformly

on compact sets of t. Moreover, all semigroups exp{tAλ} and Tt commute, D isinvariant under all T λ

t , and A and T λt commute on D. Finally

AλTλt f → ATtf,

as λ → ∞, for any f ∈ D.

Proof. The first statement is a corollary to Proposition 4.2.2. All Aλ commute,because the resolvents Rλ commute. Therefore, their semigroups commute. Pass-ing to the limit in the relation exp{tAλ} exp{tAμ} = exp{tAμ} exp{tAλ} yieldsexp{tAλ}Tt = Tt exp{tAλ}. Finally,

AλTλt f −ATtf = (Aλ −A)T λ

t f +A(T λt − Tt)f = T λ

t (Aλ −A)f + (T λt − Tt)Af,

which tends to 0, as λ → ∞, due to the convergence of the operators Aλ and theirsemigroups. �

For the analysis of semigroups Tt generated by an operator A on the domainD, it is often convenient to measure the size of the elements f ∈ D by anothernorm, a canonical choice being

‖f‖D = ‖f‖B + ‖Af‖B. (4.10)

In particular, this is useful for the analysis of the regularity classes of solutions tothe equation ft = Aft, which are classified according to the number k ∈ N suchthat Akf is well defined.

Page 236: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.2. Semigroups of operators on Banach spaces 221

Proposition 4.2.4. Let Tt be a strongly continuous semigroup of linear operatorsin B generated by an operator A on the domain D. Then

(i) D is a Banach space with respect to ‖.‖D;(ii) the operator A is a contraction as an operator D → B;

(iii) the operators Tt form a strongly continuous semigroup of bounded operatorsin D with the same bounds for the norms as in B;

(iv) Tt in D is generated by the same operator A defined on the domain D = {f ∈D : Af ∈ D}.

Proof. (i) Completeness with respect to ‖.‖D is a consequence of A being a closedoperator onD. (ii) ‖Af‖B ≤ ‖Af‖B+‖f‖B = ‖f‖D. (iii) Since A and Tt commute,

‖Ttf‖D = ‖Ttf‖B + ‖TtAf‖B ≤ ‖Tt‖B→B‖f‖D.

Moreover,

‖Ttf − f‖D = ‖Ttf − f‖B + ‖TtAf −Af‖B,which tends to zero due to the strong continuity of Tt in B whenever Af ∈ B,that is, for all f ∈ D. (iv) If A is the generator of Tt in D with the domainD′, then ‖(Ttf − f)/t − Af‖D → 0, as t → 0, for any f ∈ D′. But this implies‖(Ttf − f)/t− Af‖B → 0, and hence Af = Af . Moreover, since Af = Af is thelimit in D of (Ttf−f)/t, it follows that Af ∈ D. Therefore, D′ ⊂ D. On the otherhand, if f ∈ D, then

‖(TtAf −Af)/t−AAf‖B → 0 =⇒ ‖(Ttf − f)/t−Af‖D → 0,

as t → 0, so that f ∈ D′. �

By iteration, one can therefore build a sequence of decreasing dense subspacesof arbitrary regularity D(k) = {f ∈ B : Akf ∈ B}, which are Banach spaces withthe norm ‖f‖D(k) = ‖f‖B + ‖Af‖B + · · · + ‖Akf‖B, so that Tt is a stronglycontinuous semigroup in each D(k) generated by A with the domain D(k + 1).This reflects an important property, namely the preservation of regularity of thesolutions to the equation ft = Aft.

As the later discussions on perturbation theory, on T -products and on gener-ator mixing will show, Banach towers of the embedded regularity classesD(k) of Acan be very useful for the construction of semigroups. Generally, a k-levelBanachtower is meant to be a collection of k embedded Banach spacesDk−1 ⊂ · · · ⊂ D1 ⊂B with the ordered norms ‖.‖Dk−1

≥ · · · ≥ ‖.‖D1 ≥ ‖.‖B such that every spacein this row is dense in the next space with respect to the topology of the latter.As a simple example for the application of such Banach towers, let us obtain aconvergence result for semigroups that extends Proposition 4.2.2 to the case whenthe limiting semigroup is not given a priori. (For an exemplary application of thisresult, see Section 5.2.)

Page 237: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

222 Chapter 4. Linear Evolutionary Equations: Foundations

Proposition 4.2.5.

(i) Let D ⊂ D ⊂ B be a three-level Banach tower. Let T nt be a sequence of

strongly continuous semigroups in B generated by the operators An on thecommon invariant core D such that ‖T n

t ‖B ≤ eKt, ‖T nt ‖D ≤ eKt with a con-

stant K for all n. Moreover, let An ∈ L(D,B) represent a Cauchy sequencein L(D,B). Then the sequence T n

t converges to a strongly continuous semi-group Tt in B with a domain containing D, where its generator A acts asAf = limn→∞ Anf . Moreover, ‖Tt‖B ≤ eKt.

(ii) If D is additionally invariant under T nt , ‖T n

t ‖D ≤ eKt, An ∈ L(D,D) and

they represent a Cauchy sequence in L(D,D) as well, then D is an invariantcore for Tt, and the T n

t converge to Tt also in D.

Proof. (i) Using the formulae (4.8) for comparing the semigroups T nt and Tm

t , wefind that

‖(Tmt − T n

t )f‖B ≤ teKt‖Am −An‖D→B‖f‖D. (4.11)

Therefore, Tmt f is a Cauchy sequence in C([0, T ], B) for any T > 0 and f ∈ D. This

implies that the curves Tmt f converge to a curve Ttf . By the density argument, we

can derive that the convergence holds for any f ∈ B. Consequently, the operatorsTt form a strongly continuous semigroup in B satisfying the required growth rate.In order to see that D is in the domain of Tt, we write

Ttf − f

t=

T nt f − f

t+

Ttf − T nt f

t.

By (4.11), the second term tends to zero, as t → 0. Therefore, we can write

Ttf − f

t=

1

t

∫ t

0

T ns Af ds+

1

t

∫ t

0

T ns (An −A)f ds+

Ttf − T nt f

t,

where the second and the third term can be made arbitrary small by choosing nlarge enough and t small enough. The first term tends to Af as t → 0 for any n(and in fact uniformly in n).

(ii) Repeating the above arguments for the pair of embedded spaces (D,D)shows that the T n

t converge to Tt also in the topology of D. This implies theinvariance of D under Tt. �

Since the canonical norm (4.10) may be rather artificial (after all, it stronglydepends on A), it might be handy to choose it – and the corresponding Banachtower – in a more universal way. For instance, for semigroups Tt on C∞(Rd), a use-ful universal Banach tower is formed by the spaces of smooth functions Ck

∞(Rd).

The following modification of Proposition 4.2.4 addresses this issue.

Proposition 4.2.6. Let Tt be a strongly continuous semigroup of linear operators inB generated by an operator A with an invariant core D. Assume that ‖.‖D ≥ ‖.‖Bis a norm on D with respect to which D is a Banach space, such that Tt is a strongly

Page 238: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.3. Simple diffusions and the Schrodinger equation 223

continuous semigroup there. Then Tt in D is generated by the same operator Adefined on the domain D = {f ∈ D : Af ∈ D}.Proof. Exactly as above, we can deduce that if f belongs to the domain of thegenerator A of Tt in D, then Af = Af (this is where the assumption ‖.‖D ≥ ‖.‖Bis used) and Af ∈ D. Conversely, assume that f ∈ D and Af ∈ D. The first

inclusion implies that Ttf − f =∫ t

0ATsfds. And the second inclusion, together

with the assumed strong continuity of Tt in D, implies that the function underthe integral is continuous in the norm topology of D. Consequently,

Ttf − f

t−Af =

1

t

∫ t

0

(TsAf −Af)ds,

and the r.h.s. tends to zero, as t → 0, in the topology of D. �

One can further extend the preservation of regularity to several semigroupswith a common invariant domain.

Proposition 4.2.7. Let T 1t , . . . , T

kt be k strongly continuous semigroups of linear

operators in B generated by the operators A1, . . . , Ak, respectively, with a commoninvariant core D.

(i) Then the closure D of D with respect to the norm ‖f‖D = ‖f‖B+∑

j ‖Ajf‖Bis a Banach space, which is a core for each group T k.

(ii) If additionally

‖AjTit f‖B ≤ κ(‖f‖B +

∑j‖Ajf‖B) = κ‖f‖D

for all t, i, j, and a constant κ, then all T j are strongly continuous boundedsemigroups in D generated by the same operators Aj but with the domain

Dj = {f ∈ D : Ajf ∈ D}.Exercise 4.2.1. Prove this proposition.

4.3 Simple diffusions and the Schrodinger equation

Possibly the most fundamental example of an operator semigroup is the semigroupgenerated by the (customary half of the) Laplacian Δ/2 with

Δf(x) =

d∑j=1

∂2f

∂x2j

, x ∈ Rd.

Apart from playing a fundamental role in various domains of natural science, ithas a large mathematical value, since all basic objects related to this semigroup(e.g., resolvent, generator) can be explicitly calculated. Therefore, one can use itas a basic example for testing general assertions.

Page 239: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

224 Chapter 4. Linear Evolutionary Equations: Foundations

The heat conduction semigroup ft = Ttf solving the Cauchy problem for thesimplest heat conduction or diffusion equation

ft =1

2Δft, f0 = f, (4.12)

has the following closed form:

ft(x) = Ttf(x) =

∫Gt(x− y)f(y) dy, (4.13)

with the function

Gt(x) = (2πt)−d/2 exp

{−x2

2t

}, t > 0. (4.14)

This function is referred to as the heat kernel or the Green function of equation(4.12).

Exercise 4.3.1.

(i) Gt(x) represents a so-called δ-sequence as t → 0, meaning that∫Gt(x− y)f(y) dy → f(x), t → 0, (4.15)

for any f ∈ C(Rd) and moreover

∂tGt(x) =

1

2ΔGt(x). (4.16)

(ii) For any t > 0 and any bounded measurable function f , Ttf given by (4.13)is an infinitely differentiable function satisfying (4.12). Moreover, Ttf → fin C∞(Rd) as t → 0 whenever f ∈ C∞(Rd).

(iii) The operators Tt of (4.13) form a strongly continuous semigroup in C∞(Rd)with C2

∞(Rd) being its invariant core.

(iv) The operators Tt form a semigroup of positivity-preserving contractions inC(Rd) that also preserve constants.

The semigroup Tt is a basic example for the class of Feller semigroups (seedefinitions in Section 5.10). These semigroups play a key role in the theory ofMarkov processes.

It is a bit more tricky to find not only a handy invariant core like C2∞(Rd),

but the full domain of the generator Δ/2 of the heat semigroup Tt of (4.13). Tothis end, let us limit our attention to the one-dimensional case d = 1. The keyingredient for the analysis is the resolvent of Tt, which for d = 1 is given by theoperator

Rλf(x) =1√2λ

∫ ∞

−∞exp{−

√2λ|y − x|}f(y) dy. (4.17)

Page 240: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.3. Simple diffusions and the Schrodinger equation 225

Exercise 4.3.2. Check that

R′λf =

∫ ∞

−∞exp{−

√2λ|y − x|}f(y) sgn (x− y) dy,

R′′λf =

√2λ

∫ ∞

−∞exp{−

√2λ|y − x|}f(y) dy − 2f(x),

(4.18)

where the primes denote derivatives with respect to x. Confirm that the functionsRλf , R

′λf , R

′′λf belong to C∞(R) whenever f ∈ C∞(R), in which case Rλ satisfies

the resolvent equation1

2

d2

dx2Rλf = λRλf − f. (4.19)

Hence conclude that equation (4.17) yields in fact the resolvent of the heat semi-group (4.13), and that the space C2

∞(R) is the domain of its generator.

Exercise 4.3.3. Check that the heat semigroup (4.13) is also a strongly continuoussemigroup in any of the spaces Lp(R

d) with p ≥ 1, but that it is not stronglycontinuous in C(Rd).

Exercise 4.3.4. The heat semigroup (4.13) extends to the semigroup on M(Rd)such that Ttf ∈ L1(R

d) for any t > 0 and f ∈ M(Rd). However, in M(Rd)this semigroup is not strongly continuous, but only weakly continuous. Check thisclaim on the example of the Green function Gt, which solves (4.12) with the Diracδ-function δ ∈ M(Rd) as initial condition.

The heat semigroup (4.13) is an example for a so-called analytic semigroup,meaning that the operators Tt can be extended to complex values of times t. Inother words, formula (4.13) solving (4.12) extends to the formula

ft(x) = T σt f(x) =

∫Gtσ(x− y)f(y) dy, (4.20)

which solves the complex diffusion equation

ft =1

2σΔft, f0 = f, (4.21)

with any σ ∈ C having positive real part. For any such σ, formula (4.20) stilldefines a strongly continuous semigroup in the space of complex-valued continuousfunctions on Rd vanishing at infinity.

Formula (4.20) can be further extended to pure imaginary σ, in which caseequation (4.21) turns to the most basic Schrodinger equation of quantum mechan-ics. In this case, however, the natural functional space becomes L2 rather thanC∞. Notice for the sake of completeness that, more generally, the evolutionarySchrodinger equation is an equation of the type

ft = −iHft, f0 = f, (4.22)

Page 241: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

226 Chapter 4. Linear Evolutionary Equations: Foundations

where H is a self-adjoint operator in a Hilbert space. It is shown in functionalanalysis that the resolving operators exp{−iHt} are well defined and form thegroup (not only semigroup) of unitary operators in this Hilbert space.

Equation (4.21) with σ = i + ε and a small real ε is sometimes considered aSchrodinger equation that is regularized by a complexification of the time or themass. The properties of this regularized equation can be exploited for obtaininguseful information about the limiting case σ = i.

Exercise 4.3.5. Show that, if σ = iκ with a real κ, the operators (4.20) form astrongly continuous semigroup in L2(Rd) and solve the corresponding Schrodingerequation (4.21). An invariant core for this semigroup is given by the Schwartz spaceS(Rd).

A key feature of the heat semigroup (4.13) is its smoothing property.

Proposition 4.3.1. For the Green function Gt from (4.14) and f ∈ C(Rd), j =1, . . . , d, we have ∥∥∥∥∂Gt(x)

∂xj

∥∥∥∥L1(Rd)

=

√2

πt, (4.23)

supx

∣∣∣∣ ∂

∂xjTtf(x)

∣∣∣∣ ≤√

2

πt‖f‖C(Rd),∥∥∥∥ ∂

∂xjTtf(x)

∥∥∥∥L1(Rd)

≤√

2

πt‖f‖L1(Rd). (4.24)

Proof. The estimates (4.24) are consequences of (4.23). In order to get (4.23), wecalculate

∂Gt(x)

∂xj= −(2πt)−d/2 xj

texp

{−x2

2t

},

which yields∥∥∥∥∂Gt(x)

∂xj

∥∥∥∥L1(Rd)

=1√t(2π)−d/2

∫|zj | exp

{−z2

2

}dz =

2√t(2π)−d/2(2π)(d−1)/2.

This implies (4.23). �

The estimates (4.24) extend to the derivatives of all orders. For instance, forany f ∈ C1(Rd), we get

supj,k

supx

∣∣∣∣ ∂2

∂xj∂xkTtf(x)

∣∣∣∣ ≤√

2

πt‖f‖C1(Rd). (4.25)

Exercise 4.3.6. Prove (4.25).

The next result shows that the smoothing property of Tt works also forintegral norms, that is for Sobolev spaces.

Page 242: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.3. Simple diffusions and the Schrodinger equation 227

Exercise 4.3.7. Following the line of arguments of Proposition 4.3.1 above, showthat the heat semigroup Tt takes L1(R

d) to Hk1 (R

d) with any k > 0, and that

‖Ttf‖Hk1 (R

d) ≤ κk(d)t−k/2‖f‖L1(Rd). (4.26)

Exercise 4.3.8. Extend (4.24), (4.25) and (4.26) to the semigroup T σt of (4.20) for

any σ with a positive real part.

Another key feature of the heat semigroup (4.13) is that it preserves smooth-ness and polynomial decay at infinity. Namely, for any f ∈ Ck(Rd),

‖Ttf‖Ck(Rd) ≤ ‖f‖Ck(Rd). (4.27)

On the other hand, if f increases not faster than |x|2n at infinity, n ∈ N , then

supx

|(1 + |x|2n)Ttf(x)| ≤ (1 + cnt) supx

|(1 + |x|2n)f(x)|, (4.28)

for t ∈ [0, 1] and a constant cn. Estimate (4.27) follows from the observation thatTt commutes with the operation of differentiation. In order to prove (4.28), onemust first show a rougher estimate:

supx

|(1 + |x|α)Ttf(x)| ≤ κα supx

|(1 + |x|α)f(x)|, (4.29)

which holds for any α > 0 and t ∈ [0, 1] with some constant κα.

Exercise 4.3.9. Derive the inequality (4.29) and give an estimate for the con-stant κα.

Next, we show that (4.28) is equivalent to the inequality

(1 + |x|2n)∫

(2πt)−d/2 exp

{− (x− y)2

2t

}dy

1 + |y|2n ≤ (1 + cnt). (4.30)

At t = 0, the l.h.s. of (4.30) equals one. In order to prove (4.30), one therefore hasto show that

supt∈[0,1]

supx

∣∣∣∣ ∂∂t((1 + |x|2n)

∫(2πt)−d/2 exp

{− (x− y)2

2t

}dy

1 + |y|2n)∣∣∣∣

=1

2sup

t∈[0,1]

supx

∣∣∣∣(1 + |x|2n)Δ∫(2πt)−d/2 exp

{− (x− y)2

2t

}1

1 + |y|2n dy

∣∣∣∣=

1

2sup

t∈[0,1]

supx

∣∣∣∣(1 + |x|2n)∫(2πt)−d/2 exp

{− (x− y)2

2t

1

1 + |y|2n dy

∣∣∣∣ < ∞.

Page 243: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

228 Chapter 4. Linear Evolutionary Equations: Foundations

But this follows from (4.29), since∣∣∣∣Δ 1

1 + |y|2n∣∣∣∣ =(

1

1 + r2n

)′′+

d− 1

r

(1

1 + r2n

)′

= − 2nr2n−2

(1 + r2n)2

[d+ 2n− 2− 2nr2n

(1 + r2n)3

]

≤ κn1

1 + r2n

with r = |y|, which completes the proof of (4.30).

Summing up, we proved the following result.

Proposition 4.3.2. The action of the semigroup Tt in the weighted spaces CLn(Rd)

with the functions Ln(x) = 1 + |x|n, n ∈ N, satisfy the estimates

‖Tt|L(CLn (Rd)) ≤ exp{Cnt} (4.31)

with a constant Cn.

Remark 63. By duality, this implies that Tt preserves the sets of measures withbounded moments of any order. The fact that one can reduce evolutions to mea-sures of bounded moments is crucial for nonlinear equations (see, e.g., Theo-rems 6.4.3 and 6.10.1) because, by Proposition 1.1.1, the sets of measures witha bounded moment are compact in the weak topology.

Both for mathematical developments and physical applications, a highly in-teresting field are the processes of heat conduction and diffusion in bounded do-mains, for which one needs to find appropriate modifications of the heat semigroupTt that act in spaces C(Ω) with Ω a subset of Rd. As the simplest example, let usconsider the case when the one-dimensional domain Ω is the half-line R+.

Exercise 4.3.10. Check that the operators

TNeut f(x) =

∫ ∞

0

(Gt(x− y) +Gt(x+ y))g(y) dy (4.32)

define a strongly continuous semigroup in C∞([0,∞)) with the generator Lf =f ′′/2 defined in this way on the domain

DNeu = {f ∈ C2∞([0,∞)) : f ′(0) = 0}.

Note that the condition f ′(0) = 0 is the so-called Neumann boundary condition.Formula (4.32) can be obtained from Tt of (4.13) by reducing it to the space ofeven functions.

Exercise 4.3.11. Check that the operators

TDirt f(x) =

∫ ∞

0

(Gt(x− y)−Gt(x+ y))g(y) dy (4.33)

Page 244: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.3. Simple diffusions and the Schrodinger equation 229

define a strongly continuous semigroup in the subspace Ckill(0),∞([0,∞)) ofC∞([0,∞)), consisting of functions that vanish at zero, with the generator Lf =f ′′/2 defined in this way on the domain

DDir = {f ∈ C2∞([0,∞)) : f(0) = f ′′(0) = 0}.

The condition f(0) = 0 is the so-called Dirichlet boundary condition. Formula(4.33) can be obtained from Tt of (4.13) by reducing it to the space of odd func-tions.

Exercise 4.3.12. Check that the operators

T stopt f(x) = 2f(0)

∫ ∞

x

Gt(y) dy +

∫ ∞

0

(Gt(x − y)−Gt(x+ y))g(y) dy, (4.34)

define a strongly continuous semigroup in C∞([0,∞)) with the generator Lf =f ′′/2 defined in this way on the domain

Dstop = {f ∈ C2∞([0,∞)) : f ′′(0) = 0}.

This semigroup is an extension of the semigroup TDirt to the whole space

C∞([0,∞)). Moreover, check the intertwining relation

(T stopt f)′ = TNeu

t (f ′). (4.35)

The theory extends directly to the case when Δ in (4.12) is substituted byan arbitrary non-degenerate second-order operator with constant coefficients:

ft =1

2(A∇,∇)ft + (b,∇)ft, f0 = f, (4.36)

where b ∈ Rd and A is a symmetric positive matrix. In this case, the solution is

ft(x) = TA,bt f(x) =

∫GA,b

t (x− y)f(y) dy, (4.37)

with the heat kernel

GA,bt (x) = (2πt)−d/2(detA)−1/2 exp

{− 1

2t(A−1(x+ bt), x+ bt)

}. (4.38)

Formula (4.37) can either be directly verified or derived via the Fourier trans-form, see (2.60) and (1.89).

Another example are Gaussian diffusions, which are an extension of equation(4.12) in the sense that their heat kernels or Green functions (i.e., the integralkernel of the operators that form the semigroup) can be written explicitly as

Page 245: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

230 Chapter 4. Linear Evolutionary Equations: Foundations

the exponential of a quadratic form. Namely, a Gaussian diffusion operator is asecond-order differential operator of the form

L =

(B x,

∂x

)+

1

2tr

(A

∂2

∂x2

), (4.39)

where x ∈ Rd, B and A are d × d-matrices, and the matrix A is symmetric andnon-negative. The corresponding parabolic equation ∂ft

∂t = Lft can be writtenmore explicitly as

∂ft∂t

=∑i,j

bijxi∂ft∂xj

+1

2

∑i,j

Aij∂2ft

∂xi∂xj. (4.40)

If A is the unit matrix, the second term turns into the Laplacian Δ, and the result-ing equation is usually referred to as the Ornstein–Uhlenbeck diffusion equation.

Remark 64. In Section 4.12, we shall look at infinite-dimensional generalizations ofGaussian or Ornstein–Uhlenbeck diffusions, by using an approach that is differentfrom the one employed here. Namely, instead of building a Green function (whichis not a straightforward object in the infinite-dimensional case), we will considerthe evolution of Gaussian packets, which naturally leads to the so-called Riccatiequations.

If the matrix

E = E(t) =

∫ t

0

eBτAeBT τ dτ (4.41)

is non-singular, where BT is the transpose of B, the Gaussian diffusion semigroupft = Ttf solving the Cauchy problem for equation (4.40) has the following closedform:

ft(x) = Ttf(x) =

∫GA,B

t (x− y)f(y) dy, (4.42)

with

GA,Bt (x) = (2π)−d/2(detE(t))−1/2 exp

{−1

2

(E−1(x0 − eBtx), x0 − eBtx

)}(4.43)

the heat kernel (or Green function) of equation (4.40).

Exercise 4.3.13.

(i) Check that (4.42) yields a solution to equation (4.40).

(ii) Derive (4.43) via the Fourier transform.

It is possible to fully classify operators of the type (4.39) having a non-singular matrix E(t), as well as the possible small-time asymptotics of E(t).Namely, it turns out (see, e.g., Chapter 1 of [136] for the proofs) that if E(t)is non-singular, then coordinates exist where E(t) is block-diagonal with blocks of

Page 246: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.3. Simple diffusions and the Schrodinger equation 231

dimension p that have the form Λp(t)(1 + O(t)), where the main term Λp(t) hasthe entries

Λp(t)ij =1

(p− i)!(p− j)!λp(t)ij =

1

(p− i)!(p− j)!

t2p+1−(i+j)

2p+ 1− (i+ j),

i, j = 1, . . . , p,

(4.44)

and the determinant (entering formula (4.43))

detΛp(t) = tp2 2! · · · (p− 1)!

p!(p+ 1)! · · · (2p− 1)!.

In order to better understand this structure, let us write down the blocks λp

and the inverses of Λp for p = 1, 2, 3:

λ1(t) = t, λ2(t) =

⎛⎜⎝t3

3

t2

2t2

2t,

⎞⎟⎠ λ3(t) =

⎛⎜⎜⎜⎜⎜⎜⎜⎝

t5

5

t4

4

t3

3

t4

4

t3

3

t2

2

t3

3

t2

2t

⎞⎟⎟⎟⎟⎟⎟⎟⎠

, (4.45)

(Λ1(t))−1 =

1

t, (Λ2(t))

−1 =

⎛⎜⎜⎝

12

t3− 6

t2

− 6

t24

t

⎞⎟⎟⎠ ,

(4.46)

(Λ3(t))−1 =

⎛⎜⎜⎜⎜⎜⎜⎝

6!

t5− 6!

2t460

t3

−6!

t4192

t3−36

t2

60

t3−36

t29

t

⎞⎟⎟⎟⎟⎟⎟⎠

.

Therefore, the Gaussian diffusions with non-singular E(t) (and thus with a smoothheat kernel) are fully classified by the numbers of blocks of order p (any collectioncan be realized). This can be nicely encoded by the so-called Young schemes orYoung diagrams, which are non-decreasing finite sequences of natural numbers.Also, the small-time asymptotics of their heat kernels are given in closed form.The situation with the blocks Λ1 only corresponds to the usual non-degeneratediffusions without drift (having heat kernels (4.38) with b = 0). Diffusion equationsthat arising from this scheme with only one block Λ2 are usually referred to asKolmogorov’s diffusion. One can also classify diffusions with variable coefficients,whose small-time asymptotics is given by the heat kernels of the above Gaus-sian diffusions. The symbols H(x, p) of these diffusion operators are the regular

Page 247: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

232 Chapter 4. Linear Evolutionary Equations: Foundations

Hamiltonians from formula (2.103) in connection with boundary-value problemsfor Hamilton systems.

As a last example which is important to many applications (see e.g. Theorem6.9.1), let us present without a proof the fundamental result on the degeneratediffusions that can be obtained by the method of stochastic analysis. (Proofs ofvarious versions can be found, e.g., in [87, 147] or [174].) The result deals withsituations when the matrix of diffusion coefficients can be written in a specialproduct form:

A(x) = σ(x)σT (x), (4.47)

with some square matrix σ and σT its transpose.

Theorem 4.3.1.

(i) In the diffusion equation of the form

∂u

∂t= Lu =

1

2(A(x)∇,∇)u + (b(x),∇)u(x), x ∈ Rd, (4.48)

let A be given by (4.47), and let both σ and b be Lipschitz-continuous inx. Then the operator L in (4.48) generates a strongly continuous semigroupof contractions Tt in C∞(Rd), whose domain contains the space of twicecontinuously differentiable functions with a compact support. Moreover, Tt

extends to the strongly continuous semigroup Tt in the space of weightedfunctions CL,∞(Rd) (as defined in (1.2)) with L(x) = 1 + x2, so that

‖Tt‖CL,∞(Rd) ≤ eKt, (4.49)

with a constant K depending on d and the Lipschitz constants of σ, b inCk(Rd).

(ii) If ∇σ,∇b are well defined and belong to Ck−1(Rd) with k ∈ N, then thespace Ck

∞(Rd) is invariant under the action of the semigroup Tt. Moreover,the Tt are bounded operators with respect to the usual norm of these spaces,so that

‖Tt‖Ck(Rd) ≤ eKkt, (4.50)

with a constant Kk depending on d, k and the norms of σ, b in Ck(Rd). Inparticular, if k ≥ 2, the space C2

∞(Rd) is an invariant core for the semigroupTt in C∞(Rd), and the space of functions D consisting of twice continuouslydifferentiable functions f such that their second-order derivatives belong toC∞(Rd) is an invariant core for the semigroup Tt in CL,∞(Rd). Moreover,if we equip the space D with a Banach space structure via the norm

‖f‖D = ‖f‖CL(Rd) + ‖∇2f‖C(Rd),

then ‖Tt‖D ≤ eKDt with a constant KD.

Page 248: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.4. Evolutions generated by powers of the Laplacian 233

(iii) Under the assumptions of (i), Tt respects the rates of polynomial growth ordecay at infinity in the following sense: for any real α, the spaces CLα(Rd)and CLα,∞(Rd) are invariant under Tt, and

‖Tt‖CLα(Rd) ≤ eκαt (4.51)

with constants κα.

Remark 65.

(i) Notice that the conditions of the theorem permit a linear growth of σ(x) andb(x), and therefore a quadratic growth of A(x) = σ(x)σT (x).

(ii) If A is a non-degenerate (strictly elliptic) matrix, then the analytic method offrozen coefficients can be used for obtaining this result. This will be demon-strated in Chapter 5.

(iii) Concerning the representation (4.47), it can be shown (see, e.g., Chapter 3 of[87]) that if A(x) ∈ C2(Rd) and if A(x) is a non-negative symmetric matrix,then there exists a symmetric matrix σ(x) such that A(x) = σ2(x) and σ(x)is Lipschitz-continuous in x with a Lipschitz constant L ≤ C(d)‖A‖C2(Rd)

with a constant C(d).

4.4 Evolutions generated by powers of the Laplacian

Since the Fourier transform takes the differentiation to multiplication, it takes thenegation of the Laplacian −Δ to the multiplication by |p|2 = p2. This fact is amotivation for calling −Δ the magnitude of Δ, |Δ| = −Δ (which actually fits thegeneral definition of |A| in the operator calculus). Consequently, one can naturallydefine the operator |Δ|α/2 for any α > 0 as the operator that becomes the operatorof multiplication by |p|α under Fourier transformation. Of course, for an integerβ, |Δ|β = (−Δ)β defined in such a way coincides with the corresponding power of−Δ calculated in the usual way.

The semigroups generated by powers of the Laplacian represent another fun-damental example of semigroups. These semigroups solve the equation

ft = −σ|Δ|α/2ft, f0 = f, (4.52)

with α > 0 and σ > 0. In what follows, we shall stick to real σ, since in the case ofcomplex diffusion, the theory remains more or less the same for a complex σ with apositive real part. We will now obtain some basic properties of the problem (4.52),whereby we use no other tools than integration by parts. In the next section, wewill refine these results in a more general setting of homogeneous symbols, usingmore sophisticated methods.

In order to solve (4.52), one has to apply the Fourier transform to it (see(1.87), if needed). According to the above definition of |Δ|β , this yields the fol-

Page 249: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

234 Chapter 4. Linear Evolutionary Equations: Foundations

lowing equation for ft = F (ft):

∂tft = −σ|p|αft, f0 = f . (4.53)

This linear problem has the solution

ft(p) = exp{−tσ|p|α}f(p). (4.54)

Consequently, taking the inverse Fourier transform and using (1.93) yields

ft(x) = Tα,σt f(x) =

∫Gα

tσ(x− y)f(y) dy, (4.55)

with

Gασ(x) = (2π)−d

∫exp{−σ|p|α + ipx} dp. (4.56)

In order to assess the properties of the semigroup Tα,σt , we have to analyse

the properties of the functions Gασ , given by the integral (4.56). By the Riemann–

Lebesgue Lemma, we find Gασ ∈ C∞(Rd), and all derivatives of Gα

σ with respectto x also belong to C∞(Rd) for all positive α, σ.

Equation (4.52) is a special case of equation (2.56) discussed in Theorems2.4.2 and 2.4.4. The conditions of both of these theorems are satisfied for (4.52)with δ = N = α and l = d + k, where k is the biggest integer that is strictlysmaller than α. In particular, l ≥ d+ 1 for α > 1. Therefore, if σ > 0 and α > 1,then Gα

σ ∈ L1(Rd) and

∂m

∂xj1 · · · ∂xjm

Gασ(x) ∈ L1(R

d)

for any indices j1, . . . , jm. However, much more can be said for the semigroup Tα,σt

resolving the problem (4.52).

Theorem 4.4.1. Let σ > 0 and α > 1. Then the semigroup Tα,σt given by (4.56)

is a uniformly bounded and strongly continuous semigroup in C∞(Rd), with allspaces Ck

∞(Rd) being invariant. In particular, the subspaces Ck∞(Rd) for all k ≥ α

represent invariant cores. Moreover, the semigroup is smoothing in the sense thatTα,σt f ∈ Ck∞(Rd) for any k > 0, t > 0 and f ∈ C(Rd). Finally, for all non-

negative integers k, l,∥∥∥∥ ∂k

∂xj1 · · · ∂xjk

Gαtσ(.)

∥∥∥∥L1(Rd)

≤ ckt−α/k, (4.57)

‖Tα,σt f‖Cl+k(Rd) ≤ ck,lt

−k/α‖f‖Cl(Rd), (4.58)

with some constants ck and ck,l depending on α, σ and d.

Page 250: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.4. Evolutions generated by powers of the Laplacian 235

Proof. Since

Gαtσ(x) = (2π)−d

∫exp{−tσ|p|α + ipx} dp

= (2π)−d

∫exp{−σ|q|α + iqxt−1/α}t−d/α dq,

it follows that (with y = xt−1/α)∫|Gα

tσ(x)| dx =

∫ ∣∣∣∣(2π)−d

∫exp{−σ|q|α + iqy} dq

∣∣∣∣dy,which has a bound that is independent from t. This implies that the semigroupTα,σt is uniformly bounded in C∞(Rd).

Next, we find

∂xjGα

tσ(x) = (2π)−dt−1/α

∫iqj exp{−σ|q|α + iqxt−1/α}t−d/α dp,

so that∫ ∣∣∣∣ ∂

∂xjGα

tσ(x)

∣∣∣∣ dx = t−1/α

∫ ∣∣∣∣(2π)−d

∫iqj exp{−σ|q|α + iqy} dp

∣∣∣∣dy,which is bounded by a constant times t−1/α. (4.57) is similarly obtained. Thisimplies (4.58) for l = 0. The case of arbitrary l can be reduced to the case l = 0,because the gradient ∇f evolves according to the same equation as f itself.

In order to show strong continuity, we have to show that ft = Tα,σt f → f in

C∞(Rd). By the Riemann–Lebesgue lemma, this follows from the fact that ft → fin L1(R

d), which again follows straightforwardly from (4.54).

Finally, for showing that the Ck∞(Rd) represent invariant cores for all k ≥ α,it remains to show that these spaces belong to the domain D(A) of the generatorA of the semigroup Tα,σ

t . For this, we observe that D(A) contains the subspaceof C∞(Rd) consisting of functions that are representable as Fourier transforms offunctions φ ∈ L1(R

d) such that |p|αφ(p) ∈ L1(Rd), because

(exp{−tσ|p|α} − 1)φ(p)/t → −σ|p|αφ(p), t → 0,

both pointwise and in L1(Rd). In particular, S(Rd) ⊂ D(A). In order to conclude

that Ck∞(Rd) ⊂ D(A) for all k ≥ α, it remains to note that A is closed (by Theorem

4.1.1(vii)) and that Ck∞(Rd) belongs to the closure of A defined on S(Rd) for allk ≥ α (due to the formulae (1.145) and (1.143)). �Remark 66. Spaces of smooth functions of fractional order (i.e., subspaces ofCk

∞(Rd) that consist of functions with Holder-continuous derivatives) yield coresthat represent the structure of the semigroups Tα,σ

t in a better way. However, weshall stick to Ck(Rd) for the sake of simplicity.

Page 251: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

236 Chapter 4. Linear Evolutionary Equations: Foundations

Proposition 4.4.1. Under the assumptions of Theorem 4.4.1, the following holds:

(i) The semigroups Tα,σt also represent strongly continuous bounded semigroups

in each space Ck∞(Rd).

(ii) The semigroups Tα,σt extend to C(Rd), so that Ttf(x) → f(x), as t → 0,

for any f ∈ C(Rd) and each x (uniformly on compact subsets of x, but notnecessarily uniform in all x).

(iii)∫Gα

σt(x)dx = 1 for all t, σ.

Proof. Statement (i) is straightforward. In order to prove (ii), one must approx-imate f ∈ C(Rd) by functions f ∈ C∞(Rd) that converge uniformly on anycompact subset. Statement (iii) follows from (ii) by choosing f = 1 and takinginto account that, by homogeneity,

∫Gα

σt(x)dx does not depend on time. �

In particular, it follows that if G is not positive everywhere (which is in factthe case for α > 2), then ‖Tt‖ =

∫ |Gασt(x)|dx > 1. Therefore, ‖Tt‖ does not tend

to 1, as t → 0, although Tt strongly converges to the identity operator.

Remark 67. Gασt(x) is positive for α ≤ 2. This follows from the conditional posi-

tivity of the generator, see Section 5.10.

For the analysis of duality (which is needed, e.g., when dealing with variablecoefficients), one has to know how the evolution Tt acts on the dual spaces toCk(Rd), i.e., on the spaces of integrable functions.

Proposition 4.4.2. Under the assumptions of Theorem 4.4.1, the following holds:

(i) The semigroup Tα,σt is strongly continuous in L1(R

d), and each Sobolev spaceHk

1 (Rd), k ∈ N, is invariant.

(ii) For t > 0, Tα,σt takes L1(R

d) to Hk1 (R

d), and the equation ft = −σ|Δ|α/2ftholds in the norm of L1(R

d).

(iii) The semigroups Tα,σt extend by weak continuity from L1(R

d) to M(Rd),so that Tα,σ

t μ ∈ C(Rd) ∩ L1(Rd) (more precisely, the measure Tα,σ

t μ has adensity that belongs to this space), and Tα,σ

t μ → μ weakly (but not necessarilystrongly), as t → 0, for any μ ∈ M(Rd).

Proof. (i) and (ii) follow directly from the properties of Gαtσ.

(iii) Again by the properties of Gαtσ , T

α,σt μ is a well-defined continuous and

integrable function (even infinitely differentiable) for any μ ∈ M(Rd). Since∫Gα

tσ(x−y)φ(x) → φ(y) uniformly due to the strong continuity of Tα,σt , it follows

that ∫Gα

tσ(x− y)φ(x)μ(dy) →∫

φ(y)μ(dy),

as t → 0, for any μ ∈ M(Rd). �

In order to assess the properties of the solutions to (4.52), let us obtain somepointwise estimates for the Green function Gα

σ(x), uniform with respect to its twokey variables σ and x.

Page 252: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.5. Evolutions generated by ΨDOs with homogeneous symbols . . . 237

Proposition 4.4.3. There exists a constant C = C(α) such that

|Gασ(x)| ≤ Cmin

(σ−d/α,

1

|x|d ,σk/α

|x|d+k

), (4.59)

where k is the biggest integer being strictly smaller than α.

Proof. Multiplying Gασ(x) by imxi1 · · ·xim is equivalent to differentiating its

Fourier transform exp{−σ|p|α} with respect to pi1 , . . . , pim . Each consecutive dif-ferentiation yields a multiplier of order either σ|p|α−1 or |p|−1. The resulting ex-pression for |xi1 · · ·ximGα

σ(x)| can then be estimated using |eipx| ≤ 1 and changingthe integration variable p to q via the equation σ|p|α = |q|α with dp = σ−d/αdq,which turns these multipliers to σ|p|α−1 = |q|α−1σ1/α and |p|−1 = |q|−1σ1/α.Therefore, the total dependence of the estimate on σ is reduced to σm/α, whicimplies that

|Gασ(x)| ≤ C

σ−d/ασm/α

|x|m (4.60)

for any m = 0, 1, . . . , k. The estimates (4.59) are obtained by choosing m = 0, d,d+ k. �

As we shall show in the next section by a more refined method, the estimate(4.59) can be optimized to |Gα

σ(x)| ≤ Cσ/|x|d+α. This implies, in particular, thatTheorem 4.4.1 can be extended to all α > 0.

4.5 Evolutions generated by ΨDOs with homogeneous

symbols and their mixtures

The case α ∈ (0, 1] that had been excluded above will now be dealt with in amuch more general setting of problems (2.56) with homogeneous symbols ψ of apositive order β. For the sake of simplicity, we shall work with time-homogeneousequations. Therefore, we are looking at the Cauchy problem

ft = −ψ(−i∇)ft, f |t=0 = f0, (4.61)

where

ψ(p) = |p|βω(p/|p|), (4.62)

where ω = ωr + iωi is a continuous complex-valued function on the sphere Sd−1

with positive real part. The corresponding semigroup of operators resolving (4.61)acts as

Ttf(x) =

∫Gψ

t (x− y)f(y) dy, (4.63)

with the Green function (2.62) of the form

Gψt (x) =

1

(2π)d

∫eipx exp{−t|p|βω(p/|p|)} dp. (4.64)

Page 253: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

238 Chapter 4. Linear Evolutionary Equations: Foundations

Obviously, this function is real, so Tt preserves the reality of functions if andonly if the condition

ω(p/|p|) = ω(−p/|p|) (4.65)

holds for all p.

An important example for an operator ψ(−i∇) with a homogeneous symbol(4.62) are mixed fractional symmetric derivatives (1.141) with symbols

ψ(p) =

∫Sd−1

|(p, s)|βμ(ds),and

ω(p/|p|) =∫Sd−1

|(p/|p|, s)|βμ(ds). (4.66)

This ω is real and non-negative, and the condition of its strict positivity is equiva-lent to the requirement that the support of the measure μ on Sd−1 is not containedin any hyperplane of Rd.

Exercise 4.5.1. Check the last claim.

Let us start with the one-dimensional case d = 1. Then

ψ(p) = a+1p≥0|p|β + a−1p<0|p|β (4.67)

(where 1M denotes the indicator function of the set M) with complex constants

a± = a±r + ia±i , a±r > 0. (4.68)

The reality condition (4.65) becomes a± = a∓, in which case ψ gets the form

ψ(p) = (ar + iai sgn (p))|p|β (4.69)

with constants ar > 0, ai ∈ R.

By changing the integration variable p to −p in (4.64), the Green functionfor d = 1 can be rewritten as

Gψt (x) =

1

∫ ∞

0

eipx exp{−ta+pβ} dp+ 1

∫ ∞

0

e−ipx exp{−ta−pβ} dp. (4.70)

Proposition 4.5.1. For any β > 0 and constants (4.68), the Green function (4.70)is infinitely smooth in both variables, and the estimates

| ∂l

∂xlGψ

t (x)| ≤ C(β, a±, l)min

(t−(1+l)/β ,

t

|x|1+β+l

)(4.71)∣∣∣∣ ∂∂t ∂l

∂xlGψ

t (x)

∣∣∣∣ ≤ 1

tC(β, a±, l)min

(t−(1+l)/β ,

t

|x|1+β+l

), (4.72)

hold for any integer l with constants C(β, a±, l), C(β, a±, l).

Page 254: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.5. Evolutions generated by ΨDOs with homogeneous symbols . . . 239

Proof. Applying Proposition 9.3.5 with $\omega=l$ to (4.70) and noting that the first terms of the asymptotics $\pm i/|x|$ for the two terms in (4.70) always cancel, we find the estimate
$$
\Bigl|\frac{\partial^l}{\partial x^l} G^\psi_t(x)\Bigr| \le C(\beta,a^\pm,l)\,\frac{t}{|x|^{1+\beta+l}}.
$$
Taking (2.69) into account completes the proof of (4.71). For the proof of (4.72), one applies Proposition 9.3.5 with $\omega=l+\beta$. $\square$
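As a numerical illustration (not part of the original text; the parameters $\beta$, $a^+$ and the quadrature grid are ad hoc assumptions, with $a^-=\overline{a^+}$ enforcing (4.65)), the following Python sketch evaluates (4.70) by a crude trapezoidal quadrature and compares $|G^\psi_t(x)|$ with the bound $\min(t^{-1/\beta}, t/|x|^{1+\beta})$ of (4.71) for $l=0$, which holds up to a constant.

```python
# Numerical sketch of Proposition 4.5.1 for d = 1, l = 0 (all parameters ad hoc).
import numpy as np

beta = 1.5
a_plus = 1.0 + 0.3j
a_minus = np.conj(a_plus)          # reality condition (4.65)

def green(t, x, p_max=200.0, n=400000):
    """Approximate G^psi_t(x) from (4.70) by the trapezoidal rule."""
    p = np.linspace(0.0, p_max, n)
    f = (np.exp(1j * p * x - t * a_plus * p**beta)
         + np.exp(-1j * p * x - t * a_minus * p**beta))
    dp = p[1] - p[0]
    return (0.5 * dp * (f[0] + f[-1]) + dp * f[1:-1].sum()).real / (2 * np.pi)

t = 0.1
for x in [0.5, 1.0, 2.0, 4.0, 8.0]:
    bound = min(t ** (-1.0 / beta), t / abs(x) ** (1.0 + beta))
    print(f"x = {x:4.1f}:  |G^psi_t(x)| = {abs(green(t, x)):.3e},  min-bound = {bound:.3e}")
```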

Let us extend these estimates to arbitrary dimensions d.

Theorem 4.5.1.

(i) Let $d>1$, $\beta>0$, and let the function $\omega$ on $S^{d-1}$ be $(d+1+[\beta])$ times continuously differentiable (where $[\beta]$ is the integer part of $\beta$; also note that these derivatives are uniformly bounded due to the compactness of $S^{d-1}$), with a real part that is bounded from below by a positive number. Then the Green function (4.64) satisfies the estimate
$$
|G^\psi_t(x)| \le C\,\min\Bigl(t^{-d/\beta},\ \frac{t}{|x|^{d+\beta}}\Bigr). \tag{4.73}
$$

(ii) Moreover, $G$ is differentiable with respect to $t$, and the following bound applies:
$$
\Bigl|\frac{\partial}{\partial t} G^\psi_t(x)\Bigr| \le C\,\min\Bigl(t^{-1-d/\beta},\ \frac{1}{|x|^{d+\beta}}\Bigr). \tag{4.74}
$$

(iii) If additionally $\omega$ is $(d+1+[\beta]+l)$ times continuously differentiable, then $G^\psi_t(x)$ is $l$ times continuously differentiable in $x$ and
$$
\Bigl|\frac{\partial^k}{\partial x_{i_1}\cdots\partial x_{i_k}} G^\psi_t(x)\Bigr| \le C\,\min\Bigl(t^{-(d+k)/\beta},\ \frac{t}{|x|^{d+\beta+k}}\Bigr) \tag{4.75}
$$
for all $k\le l$ and all $i_1,\dots,i_k$. In all these estimates, $C$ is a constant that depends on $\beta$, $d$ and the corresponding bounds for $\omega$ and its derivatives.

Proof. (i) The first bound $t^{-d/\beta}$ is already known from Proposition 4.4.3. Using spherical coordinates with the axis along the direction $\hat x = x/|x|$, we can rewrite (4.64) as
$$
G^\psi_t(x) = \frac{1}{(2\pi)^d}\int_0^\infty d|p|\int_0^\pi d\theta\int_{S^{d-2}} dn\; e^{i|p|\,|x|\cos\theta}\,\exp\{-t|p|^\beta\omega(\hat x,\cos\theta,n)\}\,\sin^{d-2}\theta\,|p|^{d-1}, \tag{4.76}
$$
where $\omega(\hat x,\cos\theta,n) = (\omega_r+i\omega_i)(\hat x,\cos\theta,n)$ denotes the function $\omega(p/|p|)$ expressed in terms of the spherical coordinates $\theta,n$ (which depend on $\hat x$). Changing $\theta$ to $u=\cos\theta$ yields the equivalent formulation
$$
G^\psi_t(x) = \frac{1}{(2\pi)^d}\int_0^\infty d|p|\int_{-1}^1 du\int_{S^{d-2}} dn\; e^{i|p|\,|x|u}\,\exp\{-t|p|^\beta\omega(\hat x,u,n)\}\,(1-u^2)^{(d-3)/2}\,|p|^{d-1}. \tag{4.77}
$$


When trying to directly apply Proposition 9.3.5, a problem arises from the possibility of $u=0$. Therefore, the idea is to separate the small values of $u$. Namely, let $\chi_1(u)$ be an even infinitely differentiable function $\mathbb R\to[0,1]$ with support $[-1/2,1/2]$ and such that $\chi_1(u)=1$ for $u\in[-1/4,1/4]$. Let $\chi_2(u)=1-\chi_1(u)$, and let us represent $G^\psi_t$ as the sum of two integrals $G^\psi_t(x)=I_1+I_2$ with
$$
I_j = \frac{1}{(2\pi)^d}\int_0^\infty d|p|\int_{-1}^1 du\int_{S^{d-2}} dn\; e^{i|p|\,|x|u}\,\exp\{-t|p|^\beta\omega(\hat x,u,n)\}\,(1-u^2)^{(d-3)/2}\chi_j(u)\,|p|^{d-1}. \tag{4.78}
$$

Let us start with $I_2$. Proposition 9.3.5 with $\omega=d-1$ implies $I_2 = I_2^0 + \tilde I_2$, where
$$
I_2^0 = \frac{|S^{d-2}|\,\Gamma(d)}{(2\pi)^d|x|^d}\int_{-1}^1 \exp\{i\,\mathrm{sgn}(u)\,\pi d/2\}\,(1-u^2)^{(d-3)/2}\chi_2(u)\,\frac{du}{|u|^d}
$$
and
$$
|\tilde I_2| \le C(d,\beta)\,\sup_s|\omega(s)|\,\frac{t}{|x|^{d+\beta}}.
$$

Now let us turn to $I_1$. Changing the integration variable $|p|$ to $r=|x|\,|p|$ yields
$$
I_1 = \frac{1}{(2\pi)^d|x|^d}\int_0^\infty dr\int_{-1}^1 du\int_{S^{d-2}} dn\; e^{iru}\,\exp\Bigl\{-t\,\frac{\omega(\hat x,u,n)}{|x|^\beta}\,r^\beta\Bigr\}\,(1-u^2)^{(d-3)/2}\chi_1(u)\,r^{d-1},
$$
which we represent as the sum $I_1 = I_1^0 + \tilde I_1$, where
$$
I_1^0 = \frac{|S^{d-2}|}{(2\pi)^d|x|^d}\int_0^\infty dr\int_{-1}^1 du\; e^{iru}\,(1-u^2)^{(d-3)/2}\chi_1(u)\,r^{d-1},
$$
$$
\tilde I_1 = \frac{1}{(2\pi)^d|x|^d}\int_0^\infty dr\int_{-1}^1 du\int_{S^{d-2}} dn\; e^{iru}\Bigl(\exp\Bigl\{-t\,\frac{\omega(\hat x,u,n)}{|x|^\beta}\,r^\beta\Bigr\}-1\Bigr)(1-u^2)^{(d-3)/2}\chi_1(u)\,r^{d-1}.
$$
Equivalently, the second term can be written as
$$
\tilde I_1 = -\frac{t}{(2\pi)^d|x|^{d+\beta}}\int_0^\infty dr\int_{-1}^1 du\int_{S^{d-2}} dn\; e^{iru}\,\omega(\hat x,u,n)\,r^\beta\,\phi\Bigl(\frac{t\,\omega(\hat x,u,n)\,r^\beta}{|x|^\beta}\Bigr)(1-u^2)^{(d-3)/2}\chi_1(u)\,r^{d-1},
$$


where $\phi(y) = (1-e^{-y})/y$ and $y = t\,\omega(\hat x,u,n)\,r^\beta/|x|^\beta$. Using $e^{iru} = (ir)^{-1}\frac{d}{du}e^{iru}$, we can now integrate by parts $(d+1+[\beta])$ times with respect to the variable $u$. Hereby, the required differentiation of the function
$$
\omega(\hat x,u,n)\,(1-u^2)^{(d-3)/2}\chi_1(u)
$$
does not create any additional singularities or growing terms. On the other hand,
$$
\frac{d}{du}\,\phi(y(u)) = [\omega(\hat x,u,n)]^{-1}\,\frac{\partial}{\partial u}\omega(\hat x,u,n)\,\Bigl(y\,\frac{d}{dy}\phi(y)\Bigr)\Big|_{y=t\omega(\hat x,u,n)r^\beta/|x|^\beta}.
$$
Applying this differentiation $(d+1+[\beta])$ times gives a bounded result, because the functions $\bigl[y\frac{d}{dy}\bigr]^k\phi(y)$ are uniformly bounded on $\mathbb R_+$ for any $k\in\mathbb N$. As a result, we get a function of $r$ that decreases as $r^{-2+\beta-[\beta]}$, as $r\to\infty$, and is therefore integrable. This yields the estimate
$$
|\tilde I_1| \le C_1\,\frac{t}{|x|^{d+\beta}}.
$$

Exercise 4.5.2. Check the claim about $\bigl[y\frac{d}{dy}\bigr]^k\phi(y)$.

Consequently, the correction terms $\tilde I_1 + \tilde I_2$ satisfy the required bounds, and in order to complete the proof, it is sufficient to show that $I_1^0 + I_2^0 = 0$. This is easy to see for odd dimensions $d$, since in this case both $I_1^0$ and $I_2^0$ vanish. For even dimensions, the direct proof does not seem to be so straightforward (see Exercise 4.5.3 below). The simplest way to see that the sum $I_1^0 + I_2^0$ vanishes is to observe that it does not depend on $\omega$ or $\beta$. But for $\omega=1$ and $\beta=2$ it must vanish, since otherwise the Green function $G^2_\sigma(x)$ from (4.56) would have a decay of order $1/|x|^d$ as $x\to\infty$, which contradicts the fact that the heat kernel of the standard diffusion decays exponentially as $x\to\infty$.

(ii) The proof of the second bound in (4.74) is analogous, the only difference now being that the corresponding terms $I^0_j$ already depend on $\beta$ and $\omega$, which is why we cannot generally prove that they vanish or cancel. Instead, we must directly estimate the corresponding integrals
$$
I_j = -\frac{1}{(2\pi)^d|x|^d}\int_0^\infty dr\int_{-1}^1 du\int_{S^{d-2}} dn\; e^{iru}\,\frac{\omega(\hat x,u,n)\,r^\beta}{|x|^\beta}\,\exp\Bigl\{-\frac{t\,\omega(\hat x,u,n)\,r^\beta}{|x|^\beta}\Bigr\}\,(1-u^2)^{(d-3)/2}\chi_j(u)\,r^{d-1},
$$
without the subtraction that was used above for distinguishing the major term of the asymptotics. In particular, for the integral $I_2$ we use the estimate (9.31) of Proposition 9.3.5 with $\omega=\beta+d-1$. The first bound in (4.74) is obtained by the variable change $tr^\beta=q^\beta$.

(iii) The first estimate is obtained by differentiating (4.64) and then using the estimate $|e^{i(p,x)}|\le 1$. The second, more subtle estimate is proved as in (i).


The difference is that differentiating the function $G_t(x)$ $k$ times with respect to $x$ yields multipliers of the order $|p|^k$. Therefore, one has to integrate by parts $d+1+[\beta]+k$ times in the corresponding integral $\tilde I_1$ in order to get the decay of order $r^{-2+\beta-[\beta]}$ in the integrand. $\square$

Exercise 4.5.3. Show that, for odd dimensions, both $I_1^0$ and $I_2^0$ from the proof of (i) vanish. On the other hand, for $d=2k$, show that
$$
I_1^0 + I_2^0 = -\frac{2}{2k-1} + 2\int_0^1 \Bigl[\phi(u) - \phi(0) - \frac{1}{2}\phi''(0)u^2 - \cdots - \frac{1}{(2k-2)!}\,\phi^{(2k-2)}(0)\,u^{2k-2}\Bigr]\frac{du}{u^{2k}},
$$
where $\phi(u) = (1-u^2)^{(d-3)/2}$. (Hint: you may use (1.175) and (1.163).) Use this formula to confirm that $I_1^0+I_2^0=0$ at least for the small dimensions $d=2$ or $4$. The solution to this exercise can be found in Section 5.2 of [136].

Remark 68.

(i) Expanding the exponent $\exp\{-t|p|^\beta\omega(\hat x,u,n)\}$ that appears in the integrals $I_j$ into a power series yields an asymptotic expansion of $G$ in power series of the variable $t/|x|^\beta$. One can even show that this power series is convergent for $\beta\in(0,1)$.

(ii) For $\beta=2k$ with $k\in\mathbb N$, the above power estimate is very rough, since in this case $G^\psi_t(x)$ actually decreases exponentially as $x\to\infty$, as in the case of diffusion with $\beta=2$.

As a corollary of the estimates (4.73), one can conclude that the corresponding semigroup $T_t$ given by (4.63), and resolving equation (4.61), acts by bounded operators in the weighted spaces $C_{L_\alpha}(\mathbb R^d)$ with $L_\alpha(x)=1+|x|^\alpha$. (Recall Remark 63 on why this is of importance.)

Proposition 4.5.2. Under the assumptions of Theorem 4.5.1, $T_t$ preserves the weighted spaces $C_{L_\alpha}(\mathbb R^d)$ for $\alpha\in[0,\beta)$, and the norms $\|T_t\|_{C_{L_\alpha}(\mathbb R^d)}$ are bounded on any bounded time interval.

Proof. We need to show that
$$
\Bigl|\int G^\psi_t(x-y)\,|y|^\alpha\,dy\Bigr| \le C(\alpha,\beta,t)\,(1+|x|^\alpha),
$$
for which it is sufficient to show that
$$
\Bigl|\int_{|y|>1} G^\psi_t(x-y)\,|y|^\alpha\,dy\Bigr| \le C(\alpha,\beta,t)\,(1+|x|^\alpha). \tag{4.79}
$$
Let us decompose this integral into the sum of two integrals $I_1+I_2$, where $I_1$ is over the set $\{y:|x-y|<t^{1/\beta}\}$.


Then, we have
$$
I_1 \le C(\beta)\int_{\{|x-y|<t^{1/\beta}\}} |y|^\alpha\,t^{-d/\beta}\,dy \le C(\alpha,\beta,t)\,(1+|x|^\alpha)\,t^{-d/\beta}\int_{\{|x-y|<t^{1/\beta}\}} dy \le C(\alpha,\beta,t)\,(1+|x|^\alpha).
$$
It remains to estimate
$$
I_2 \le C(\beta)\int_{\{|x-y|>t^{1/\beta},\,|y|>1\}} \frac{t\,|y|^\alpha}{|x-y|^{d+\beta}}\,dy.
$$
The integral over the set $|y|\le 2|x|$ is bounded by $|x|^\alpha$. Therefore, the following integral remains:
$$
\int_D \frac{t\,|y|^\alpha}{|x-y|^{d+\beta}}\,dy
$$
over the domain $D=\{y:|y|>1,\ |y|>2|x|,\ |x-y|>t^{1/\beta}\}$. But this integral can be estimated by
$$
C(t,\alpha,\beta)\int_D \frac{t\,|y|^\alpha}{|y|^{d+\beta}}\,dy,
$$
which again yields the same estimate. $\square$

An interesting extension of the above results concerns equations with mixed homogeneous symbols. Namely, let us consider the problem (4.61) with
$$
\psi(p) = \int_U |p|^{\beta(u)}\,\omega(u,p/|p|)\,\mu(du), \tag{4.80}
$$
where $\beta(u)$ and $\omega(u,.)$ are continuous functions of $u\in U$ (an arbitrary metric space), and $\mu$ is a (positive) measure on $U$. The corresponding semigroup of operators resolving (4.61) acts as (4.63) with
$$
G^\psi_t(x) = \frac{1}{(2\pi)^d}\int e^{ipx}\exp\Bigl\{-t\int_U |p|^{\beta(u)}\,\omega(u,p/|p|)\,\mu(du)\Bigr\}\,dp. \tag{4.81}
$$

Theorem 4.5.2. Let $\beta(u)\in[b_{\min},b_{\max}]$ with some $0<b_{\min}\le b_{\max}<\infty$, and let $\omega(u,.)$ satisfy all conditions of Theorem 4.5.1 with all estimates uniform in $u$. Let $\beta,\omega$ depend continuously on $u$, and let $\mu$ be a Borel measure on $U$ such that $\mu\{u:\beta(u)=b_{\max}\}>0$.


Then, for $t\in(0,1)$, the Green function (4.81) and its derivatives satisfy the estimates
$$
|G^\psi_t(x)| \le C\,\min\Bigl(t^{-d/b_{\max}},\ t\int_U \frac{\mu(du)}{|x|^{d+\beta(u)}}\Bigr), \tag{4.82}
$$
$$
\Bigl|\frac{\partial}{\partial t} G^\psi_t(x)\Bigr| \le C\,\min\Bigl(t^{-1-d/b_{\max}},\ \int_U \frac{\mu(du)}{|x|^{d+\beta(u)}}\Bigr), \tag{4.83}
$$
$$
\Bigl|\frac{\partial^k}{\partial x_{i_1}\cdots\partial x_{i_k}} G^\psi_t(x)\Bigr| \le C\,\min\Bigl(t^{-(d+k)/b_{\max}},\ t\int_U \frac{\mu(du)}{|x|^{d+\beta(u)+k}}\Bigr), \tag{4.84}
$$
for all $k\le l$ and $i_1,\dots,i_k$.

Proof. The second parts of the estimates in (4.82) to (4.84) are obtained in literally the same way as in the proof of Theorem 4.5.1, except that Proposition 9.3.6 is used rather than Proposition 9.3.5. In order to get the first estimate in (4.82), one decomposes the integral in (4.81) into two integrals over the sets $\{|p|\le 1\}$ and $\{|p|>1\}$, respectively. The first integral is bounded by a constant. Hence, up to a constant, we find
$$
|G^\psi_t(x)| \le \frac{1}{(2\pi)^d}\int \exp\bigl\{-t|p|^{b_{\max}}\,\mu\{u:\beta(u)=b_{\max}\}\,\min_{u,n}\omega(u,n)\bigr\}\,dp.
$$
This yields the first estimate in (4.82). Similar decompositions yield the first estimates in (4.83) and (4.84). $\square$

Exercise 4.5.4.

(i) Extend Theorem 4.4.1 to all $\alpha>0$.

(ii) Formulate and prove the analogue of Theorem 4.4.1 in the mixed homogeneous setting of Theorem 4.5.2.

4.6 Perturbation theory and the interaction picture

An important tool for the construction of semigroups is perturbation theory, which can be applied once a generator of interest can be written as the sum of a well-understood operator and a term that is smaller (in some sense). We start with the simplest result of this kind.

Theorem 4.6.1. Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$, and let $L$ be a bounded operator on $B$. Then $A+L$ with the same domain $D_A$ also generates a strongly continuous semigroup $\Phi_t$ on $B$, which is given by the series
$$
\begin{aligned}
\Phi_t &= T_t + \sum_{m=1}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} T_{t-s_m} L T_{s_m-s_{m-1}}\cdots L T_{s_1}\,ds_1\cdots ds_m \\
&= T_t + \sum_{m=1}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} T_{s_1} L T_{s_2-s_1}\cdots L T_{t-s_m}\,ds_1\cdots ds_m
\end{aligned} \tag{4.85}
$$
that converges in the operator norm. Moreover, $\Phi_t f$ is the unique (bounded) solution to the integral equation
$$
\Phi_t f = T_t f + \int_0^t T_{t-s} L \Phi_s f\,ds, \tag{4.86}
$$
with a given $f_0=f$.

Remark 69. The two versions of the series in (4.85) are obtained from each other by a trivial change of the integration variables. For the path-integral representation, however, it turns out that one of them can be preferable as corresponding to the 'natural' time-direction on the path. See Section 4.7 for more details on this point.

Proof. Clearly
$$
\|\Phi_t\| \le \|T_t\| + \sum_{m=1}^\infty \frac{(\|L\|\,t)^m}{m!}\Bigl(\sup_{s\in[0,t]}\|T_s\|\Bigr)^{m+1},
$$
which implies the convergence of the series. Next, the main semigroup condition is shown by
$$
\begin{aligned}
\Phi_t\Phi_\tau f &= \sum_{m=0}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} T_{t-s_m} L T_{s_m-s_{m-1}}\cdots L T_{s_1}\,ds_1\cdots ds_m \\
&\quad\times \sum_{n=0}^\infty \int_{0\le u_1\le\cdots\le u_n\le\tau} T_{\tau-u_n} L T_{u_n-u_{n-1}}\cdots L T_{u_1}\,du_1\cdots du_n \\
&= \sum_{m,n=0}^\infty \int_{0\le u_1\le\cdots\le u_n\le\tau\le v_1\le\cdots\le v_m\le t+\tau} dv_1\cdots dv_m\,du_1\cdots du_n \\
&\quad\times T_{t+\tau-v_m} L T_{v_m-v_{m-1}} L\cdots T_{v_2-v_1} L T_{v_1-u_n} L T_{u_n-u_{n-1}}\cdots L T_{u_1} \\
&= \sum_{k=0}^\infty \int_{0\le u_1\le\cdots\le u_k\le t+\tau} T_{t+\tau-u_k} L T_{u_k-u_{k-1}} L\cdots L T_{u_1}\,du_1\cdots du_k \\
&= \Phi_{t+\tau} f.
\end{aligned}
$$

Equation (4.86) is a consequence of (4.85). On the other hand, if (4.86) holds, then substituting the l.h.s. of this equation into its r.h.s. recursively yields
$$
\begin{aligned}
\Phi_t f &= T_t f + \int_0^t T_{t-s} L T_s f\,ds + \int_0^t ds_2\, T_{t-s_2} L \int_0^{s_2} ds_1\, T_{s_2-s_1} L \Phi_{s_1} f \\
&= T_t f + \sum_{m=1}^N \int_{0\le s_1\le\cdots\le s_m\le t} T_{t-s_m} L T_{s_m-s_{m-1}}\cdots L T_{s_1} f\,ds_1\cdots ds_m \\
&\quad + \int_{0\le s_1\le\cdots\le s_{N+1}\le t} T_{t-s_{N+1}} L T_{s_{N+1}-s_N}\cdots L T_{s_2-s_1} L \Phi_{s_1} f\,ds_1\cdots ds_{N+1}
\end{aligned}
$$
for arbitrary $N$. Since the last term tends to zero, the series representation (4.85) follows, which implies that the solution is unique.

Finally, since the terms with $m>1$ in (4.85) are of order $O(t^2)$ for small $t$, we find
$$
\frac{d}{dt}\Big|_{t=0}\Phi_t f = \frac{d}{dt}\Big|_{t=0}\Bigl(T_t f + \int_0^t T_{t-s} L T_s f\,ds\Bigr) = \frac{d}{dt}\Big|_{t=0} T_t f + Lf.
$$
Therefore, $\frac{d}{dt}\big|_{t=0}\Phi_t f$ exists if and only if $\frac{d}{dt}\big|_{t=0} T_t f$ exists, and in this case
$$
\frac{d}{dt}\Big|_{t=0}\Phi_t f = (A+L)f.
$$
Thus the domains of $A$ and $A+L$ coincide. $\square$

Equation (4.86) is often referred to as the mild form of the equation $\dot\Phi_t f = (A+L)\Phi_t f$, and the solutions to (4.86) are referred to as mild solutions to the equation $\dot\Phi_t f = (A+L)\Phi_t f$.
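For matrices, both the series (4.85) and the mild equation (4.86) can be checked directly. The following Python sketch (not part of the original text; the matrices, the time grid and the number of sweeps are ad hoc assumptions) accumulates the series by iterating the mild equation, each sweep adding the next term, and compares the result with $e^{t(A+L)}$.

```python
# Finite-dimensional sketch of Theorem 4.6.1 (all parameters are ad hoc choices).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n, t_end, steps, sweeps = 4, 1.0, 200, 10
h = t_end / steps
A = 0.7 * rng.normal(size=(n, n))
L = 0.5 * rng.normal(size=(n, n))

T = [expm(k * h * A) for k in range(steps + 1)]      # T_{t_k} on the grid
Phi = [Tk.copy() for Tk in T]                        # zeroth approximation Phi = T

for _ in range(sweeps):                              # Picard sweep of (4.86)
    new = [T[0].copy()]
    for k in range(1, steps + 1):
        vals = [T[k - j] @ L @ Phi[j] for j in range(k + 1)]
        integral = (0.5 * (vals[0] + vals[k]) + sum(vals[1:k])) * h   # trapezoid rule
        new.append(T[k] + integral)
    Phi = new

print("||series - exp(t(A+L))|| =", np.linalg.norm(Phi[-1] - expm(t_end * (A + L))))
```

The printed defect is small (it reflects only the quadrature error and the truncation of the series), illustrating that the perturbation series indeed resolves the Cauchy problem for $A+L$.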

Remark 70. If $D\subset D_A$ is a core for $T_t$ from Theorem 4.6.1, such that for any $f\in D_A$ there exists a sequence $f_n\in D$ with $f_n\to f$, $Af_n\to Af$, as $n\to\infty$, then also $Lf_n\to Lf$, and therefore $D$ is also a core for $\Phi_t$. If $D$ is an invariant core for $T_t$, however, we cannot generally conclude that it is also an invariant core for $\Phi_t$. Yet, if $L$ and $T_t$ are both bounded in $D$ with respect to some norm in $D$, then the perturbation series also converges in $D$, which ensures its invariance under $\Phi_t$.

Theorem 4.6.1 can be used to show the strong convergence of operator semigroups.

Theorem 4.6.2. Under the assumptions of Theorem 4.6.1, suppose that we are additionally given a sequence of operators $A_n$ generating uniformly bounded semigroups $T^n_t$ on some domains $D_n$ and a family of uniformly bounded operators $L_n$ in $B$. Assume that the $T^n_t$ converge strongly to $T_t$ and the $L_n$ converge strongly to $L$. Then the corresponding semigroups $\Phi^n_t$ (built by Theorem 4.6.1 from $T^n_t$ and $L_n$) converge strongly to the semigroup $\Phi_t$.

Proof. For each $n$, formula (4.85) can be rewritten as
$$
\Phi^n_t = T^n_t + \sum_{m=1}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} T^n_{t-s_m} L_n T^n_{s_m-s_{m-1}}\cdots L_n T^n_{s_1}\,ds_1\cdots ds_m. \tag{4.87}
$$
By the dominated convergence theorem, in order to prove that the $\Phi^n_t f$ converge to $\Phi_t f$, it is sufficient to show that
$$
T^n_{t-s_m} L_n T^n_{s_m-s_{m-1}}\cdots L_n T^n_{s_1} f \to T_{t-s_m} L T_{s_m-s_{m-1}}\cdots L T_{s_1} f,
$$
for any collection $0\le s_1\le\cdots\le s_m\le t$. But this follows from the assumptions of the theorem. $\square$


Perturbation theory is also used in a more general setting of unbounded operators $L$, when their unboundedness can be somehow estimated by $A$. Such estimates can be given in terms of $A$, or in terms of its resolvent, or in terms of its semigroup. We shall work with the last approach.

Theorem 4.6.3.

(i) Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$, and let $L$ be an unbounded operator in $B$ with domain $D_L$ such that $T_t f\in D_L$ for any $f\in B$, $t>0$, and
$$
\|L T_t\| \le \kappa\, t^{-\omega} \tag{4.88}
$$
with constants $\kappa>0$, $\omega\in(0,1)$, uniformly for $t$ from a fixed bounded interval. Then the series (4.85) converges in the operator norm, and the operators $\Phi_t$ form a strongly continuous semigroup in $B$. If $\|T_t\|\le M e^{mt}$, then
$$
\|\Phi_t\| \le M e^{mt}\, E_{1-\omega}\bigl(\Gamma(1-\omega)\,M\kappa\, t^{1-\omega}\bigr), \tag{4.89}
$$
with $E_{1-\omega}$ the Mittag-Leffler function.

(ii) If additionally $L$ is a closed operator, then $\Phi_t f\in D_L$ for any $f\in B$, $t>0$, with the same order of growth as for $T_t$:
$$
\|L\Phi_t\| \le M e^{mt}\,\kappa\,\Gamma(1-\omega)\,t^{-\omega}\, E_{1-\omega,1-\omega}\bigl(\kappa\,\Gamma(1-\omega)\,t^{1-\omega}\bigr). \tag{4.90}
$$
Moreover, $\Phi_t f$ is the unique solution to the integral equation (4.86) for any given $f_0=f$ such that $\|L\Phi_t f\|\le c\,t^{-\omega_1}$ with some constants $\omega_1\in(0,1)$ and $c>0$.

Proof. (i) For the $n$th term $\Phi_t(n)$ of the series (4.85), we have
$$
\|\Phi_t(n)\| \le M^{n+1} e^{mt}\kappa^n \int_{0\le s_1\le\cdots\le s_n\le t} (s_n-s_{n-1})^{-\omega}\cdots(s_2-s_1)^{-\omega} s_1^{-\omega}\,ds_1\cdots ds_n.
$$
By the definition of the fractional Riemann integral (see (1.101)) and by the composition rule $I^k I^n = I^{k+n}$ for fractional integrals, this estimate can be rewritten as
$$
\begin{aligned}
\|\Phi_t(n)\| &\le M^{n+1} e^{mt}(\kappa\Gamma(1-\omega))^n\, I^{n(1-\omega)}(1)(t) \\
&\le M^{n+1} e^{mt}\,\frac{(\kappa\Gamma(1-\omega))^n}{\Gamma(n(1-\omega))}\int_0^t (t-s)^{n(1-\omega)-1}\,ds = M^{n+1} e^{mt}\,\frac{(\kappa\Gamma(1-\omega)\,t^{1-\omega})^n}{\Gamma(n(1-\omega)+1)},
\end{aligned}
$$
where in the last equation the formula $\Gamma(x+1)=x\Gamma(x)$ was used. Alternatively, this estimate follows directly from the Dirichlet formula (9.16). The series with these terms converges, and its sum can be written in terms of the Mittag-Leffler


function (9.13). This takes us to (4.89). The semigroup property is proved as in Theorem 4.6.1.

(ii) We shall now prove that the series (4.85) still converges if we apply $L$ to all of its terms. For the term $\Phi_t(n)$, we have
$$
\begin{aligned}
\|L\Phi_t(n)\| &\le M^{n+1} e^{mt}\kappa^{n+1}\int_{0\le s_1\le\cdots\le s_n\le t} (t-s_n)^{-\omega}(s_n-s_{n-1})^{-\omega}\cdots(s_2-s_1)^{-\omega} s_1^{-\omega}\,ds_1\cdots ds_n \\
&= M^{n+1} e^{mt}\kappa^{n+1} t^{n(1-\omega)-\omega}\int_{0\le h_1\le\cdots\le h_n\le 1} (1-h_n)^{-\omega}(h_n-h_{n-1})^{-\omega}\cdots(h_2-h_1)^{-\omega} h_1^{-\omega}\,dh_1\cdots dh_n \\
&= M^{n+1} e^{mt}\,\kappa\, t^{-\omega}\,(\kappa t^{1-\omega})^n\,\frac{(\Gamma(1-\omega))^{n+1}}{\Gamma((n+1)(1-\omega))},
\end{aligned}
$$
where the Dirichlet formula (9.15) was used. Consequently, by the definition of the Mittag-Leffler function (9.13), we find
$$
\sum_{n=0}^\infty \|L\Phi_t(n)\| \le M e^{mt}\,\kappa\,\Gamma(1-\omega)\,t^{-\omega}\, E_{1-\omega,1-\omega}\bigl(\kappa\,\Gamma(1-\omega)\,M\,t^{1-\omega}\bigr). \tag{4.91}
$$

Since the series (4.85) still converges if we apply $L$ to all of its terms, it follows from the closedness of $L$ that $\Phi_t f\in D_L$ for any $f\in B$, $t>0$. The estimate (4.91) then implies (4.90). Consequently, the proof that $\Phi_t f$ is the unique solution to equation (4.86) can be completed as in Theorem 4.6.1. $\square$

It is reasonable to ask whether the strongly continuous semigroup $\Phi_t$ from Theorem 4.6.3 is actually generated by $A+L$, as was the case with bounded $L$. A simple answer can be obtained in terms of an intermediate Banach space, which is often explicitly given in applications.

Theorem 4.6.4. Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$. Let $\tilde B$ be a dense subspace of $B$, which is itself a Banach space under the norm $\|.\|_{\tilde B}\ge\|.\|_B$, and let $L\in\mathcal L(\tilde B,B)$. Let $D_A\subset\tilde B$, and let the semigroup $T_t$ have the following regularization property: $T_t f\in\tilde B$ for any $f\in B$, $t>0$, and
$$
\|T_t\|_{B\to\tilde B} \le \kappa\, t^{-\omega}, \tag{4.92}
$$
with constants $\omega\in(0,1)$, $\kappa>0$, uniformly for $t\in(0,1]$. Moreover, let $T_t$ be strongly continuous in $\tilde B$.

Then the semigroup $\Phi_t$ constructed in Theorem 4.6.3 has $D_A$ as an invariant core, where its generator equals $A+L$. Moreover, $\Phi_t$ maps $B$ to $\tilde B$ with the estimate
$$
\|\Phi_t\|_{B\to\tilde B} \le \tilde\kappa\, t^{-\omega}, \tag{4.93}
$$
with a constant $\tilde\kappa$ depending on $\kappa$, $\omega$, $\|L\|_{\tilde B\to B}$. Furthermore, $\Phi_t$ is also strongly continuous in $\tilde B$.


Proof. First of all, we can conclude that $\Phi_t$ maps $B$ to $\tilde B$ with the estimate (4.93), because the series (4.85) for $t>0$ converges in $\tilde B$ if applied to any $f\in B$. In fact, its norm is bounded by
$$
\|\Phi_t f\|_{\tilde B} \le \kappa t^{-\omega}\|f\|_B + \sum_{m=1}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} \|f\|_B\,(\kappa\|L\|_{\tilde B\to B})^m\,(t-s_m)^{-\omega}\cdots(s_2-s_1)^{-\omega} s_1^{-\omega}\,ds_1\cdots ds_m,
$$
which is estimated like the series (4.91) above, yielding (4.93). Similar estimates show that the $\tilde B$-norm of the sum in (4.85) converges to zero, as $t\to 0$, if applied to any $f\in\tilde B$. This shows that $\Phi_t$ is strongly continuous in $\tilde B$.

Finally, if $f\in\tilde B$, then
$$
\frac{d}{dt}\Big|_{t=0}\int_0^t T_{t-s} L\Phi_s f\,ds = Lf,
$$
because of the strong continuity of $\Phi_t$ in $\tilde B$ and the strong continuity of $T_t$ in $B$ (and of course the boundedness of $L:\tilde B\to B$). Therefore, if $f\in\tilde B$, then it belongs to the domain of the generator of $\Phi_t$ if and only if it belongs to $D_A$, in which case the generator equals $A+L$. Since the domain of any generator is invariant under its semigroup and since $\tilde B$ is invariant under $\Phi_t$, $D_A$ must be invariant under $\Phi_t$, and hence constitutes an invariant core. $\square$

Similarly to the above results, one can analyse the situation when the operator $L$ can be regularized by left multiplication with $T_t$, which leads to the following result. We omit the proof, since it is the same as the proof of Theorem 4.6.3 above.

Theorem 4.6.5. Let an operator $A$ with domain $D_A$ generate a strongly continuous semigroup $T_t$ on a Banach space $B$, and let $L$ be an operator mapping a subspace of $B$ to some (possibly different) space, but in such a way that the composition $T_t L$ is well defined for $t>0$ as a bounded operator in $B$ such that
$$
\|T_t L\| \le \kappa\, t^{-\omega} \tag{4.94}
$$
with constants $\kappa>0$, $\omega\in(0,1)$, uniformly for $t\in(0,1]$. Then the series (4.85) converges in the operator norm, the operators $\Phi_t$ form a strongly continuous semigroup in $B$, and $\Phi_t f$ is the unique bounded solution to the integral equation (4.86) for any given $f_0=f$. Moreover, $\Phi_t L$ is well defined as a bounded operator in $B$ for any $t>0$, and
$$
\|\Phi_t L\| \le \tilde\kappa\, t^{-\omega} \tag{4.95}
$$
for $t\in(0,1]$ with a constant $\tilde\kappa$.

Remark 71.

(i) In situations as described by Theorem 4.6.5, it can be difficult to identify the domain of the generator of the semigroup $\Phi_t$.

(ii) As an application, we shall consider the Schrödinger equation with a singular potential in Section 4.8.

As an insightful example for the perturbation series, let us consider an equation of the form
$$
\dot f_t(x) = \int_0^\infty \bigl(f_t(x+y) - f_t(x)\bigr)\,\nu(dy), \tag{4.96}
$$
with a finite measure $\nu$. The solution to the equation $\dot f_t = -\|\nu\| f_t$ with the initial condition $f_0$ equals $f_t = \exp\{-t\|\nu\|\} f_0$. Considering the operator $\int f(x+y)\,\nu(dy)$ as the perturbation, one gets the formula
$$
f_t(x) = \exp\{-t\|\nu\|\}\sum_{k=0}^\infty \frac{t^k}{k!}\int_0^\infty\cdots\int_0^\infty f_0(x+y_1+\cdots+y_k)\,\nu(dy_1)\cdots\nu(dy_k) \tag{4.97}
$$
for the resolving operator (4.85) of the Cauchy problem of equation (4.96).
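Formula (4.97) is just the expectation of $f_0$ along a compound Poisson jump process: the number of jumps up to time $t$ is Poisson with parameter $t\|\nu\|$ and the jump sizes are i.i.d. with law $\nu/\|\nu\|$. The following Python sketch (not part of the original text; the special choice $\nu=\lambda\delta_{y_0}$, the grid and all parameters are ad hoc assumptions) checks (4.97) against a direct matrix-exponential solution of (4.96) on a periodic grid.

```python
# Numerical check of (4.97) for nu = lam * delta_{y0} (all parameters ad hoc).
import numpy as np
from scipy.linalg import expm
from math import exp, factorial

N, h = 200, 0.1                     # periodic grid x_j = j*h
lam, m, t = 2.0, 7, 1.5             # nu = lam * delta_{m*h}
x = h * np.arange(N)
f0 = np.exp(-((x - 10.0) ** 2))

# Direct solution of  d f_t/dt (x) = lam * (f_t(x + y0) - f_t(x)), periodic in x.
Q = -lam * np.eye(N)
for j in range(N):
    Q[j, (j + m) % N] += lam
f_direct = expm(t * Q) @ f0

# Formula (4.97): Poisson average over the number of jumps.
f_series = sum(exp(-t * lam) * (t * lam) ** k / factorial(k) * np.roll(f0, -k * m)
               for k in range(60))

print("max difference between (4.97) and the direct solution:",
      np.abs(f_series - f_direct).max())
```

The two computations agree to machine precision, since both represent the same Poisson average on the periodic grid.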

Exercise 4.6.1. How must (4.97) be modified in order to include the case of a signed bounded measure $\nu$?

Let us distinguish an important special case of the above treatment of the equation $\dot f_t = (A+L)f_t$, when $A$ generates not only a semigroup, but a group. Namely, there exists a family of bounded operators $T_t$, $t\in\mathbb R$, depending strongly continuously on $t$, such that $T_t T_s = T_{t+s}$ and $\dot T_t f = A T_t f = T_t A f$ for any $f\in D$. For instance, this is the case when the operator $A$ is bounded or $A$ is a self-adjoint operator in a Hilbert space. In this case, choosing a new function $\mu_t = T_{-t} f_t$ in the equation $\dot f_t = (A+L)f_t$ takes us to
$$
\dot\mu_t = -A T_{-t} f_t + T_{-t}(A+L) f_t,
$$
and therefore, the equation $\dot f_t = (A+L)f_t$ can be written as
$$
\dot\mu_t = L_t\mu_t = T_{-t} L T_t\,\mu_t. \tag{4.98}
$$
This variant of the original equation $\dot f_t = (A+L)f_t$ plays a crucial role in quantum physics, where it is referred to as the interaction representation or the interaction picture. A standard technique in quantum physics is based on finding approximate solutions to (4.98) from the first terms of the series (2.36), with $L$ instead of $A$, even if $L$ is unbounded and the series does not converge. The point to emphasize is that the series (2.36) for $\mu_t = T_{-t} f_t$ and $L$ instead of $A$ can obviously be obtained from the perturbation series (4.85) for $f_t = \Phi_t f$ by applying $T_{-t}$ to both sides, so that the perturbation series is just another representation of the basic expansion (2.36) of the chronological exponential in the interaction picture.
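A finite-dimensional sketch (not part of the original text; the matrices, the step size and the skew-symmetric choice of $A$ are ad hoc assumptions) of the interaction picture: the equation (4.98) is integrated by a simple Runge-Kutta scheme and the solution is mapped back by $T_t$, which should reproduce $e^{t(A+L)}f_0$.

```python
# Interaction-picture sketch for matrices (all parameters are ad hoc choices).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 4
S = rng.normal(size=(n, n))
A = S - S.T                      # skew-symmetric, so exp(tA) is a group
L = 0.3 * rng.normal(size=(n, n))
f0 = rng.normal(size=n)
t_end, dt = 2.0, 1e-3

def rhs(t, mu):
    Tt = expm(t * A)
    return np.linalg.solve(Tt, L @ (Tt @ mu))   # T_{-t} L T_t mu, cf. (4.98)

mu, t = f0.copy(), 0.0
while t < t_end - 1e-12:                         # classical RK4 step
    k1 = rhs(t, mu)
    k2 = rhs(t + dt / 2, mu + dt / 2 * k1)
    k3 = rhs(t + dt / 2, mu + dt / 2 * k2)
    k4 = rhs(t + dt, mu + dt * k3)
    mu += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    t += dt

f_interaction = expm(t_end * A) @ mu
f_direct = expm(t_end * (A + L)) @ f0
print("difference between the two solutions:", np.linalg.norm(f_interaction - f_direct))
```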

4.7 Path integral representation

Integral representations for the solutions to linear PDEs in terms of an infinite-dimensional integral over some space of trajectories are a very important tool both for modern analysis and for theoretical physics. Let us show here a typical result of this kind that is implied by the perturbation theory developed above. For a more complete picture, we refer to Chapter 9 of [148].

The path-space that we shall work on is the space of piecewise constant paths. Namely, a sample path $Z$ in $\mathbb R^d$ on the time interval $[\tau,t]$ that starts at a point $x$ is defined by a finite number, say $n$, of jump-times $\tau<s_1<\cdots<s_n<t$, and by jump-sizes $z_1,\dots,z_n$ (each $z_j\in\mathbb R^d\setminus\{0\}$) at these times:
$$
Z_x(s) = x - Z^{s_1\cdots s_n}_{z_1\cdots z_n}(s), \qquad Z^{s_1\cdots s_n}_{z_1\cdots z_n}(s) = \begin{cases} Z_0 = 0, & \tau\le s<s_1,\\ Z_1 = z_1, & s_1\le s<s_2,\\ \cdots & \\ Z_n = z_1+\cdots+z_n, & s_n\le s\le t.\end{cases} \tag{4.99}
$$
Let $PC_x(\tau,t)$, abbreviated as $PC_x(t)$ for $\tau=0$, denote the set of all such right-continuous and piecewise-constant paths $[\tau,t]\to\mathbb R^d$ starting from the point $x$ at $\tau$, and let $PC^n_x(\tau,t)$ denote the subset of paths with exactly $n$ discontinuities. Topologically, $PC^0_x(\tau,t)$ is a point and $PC^n_x(\tau,t)=\mathrm{Sim}^n_{\tau,t}\times(\mathbb R^d)^n$, $n=1,2,\dots$, where
$$
\mathrm{Sim}^n_{\tau,t} = \{s_1,\dots,s_n:\ \tau<s_1<s_2<\cdots<s_n<t\} \tag{4.100}
$$
denotes the standard $n$-dimensional simplex. For $\tau=0$, we simply write $\mathrm{Sim}^n_t$.

To each $\sigma$-finite measure $M$ on $\mathbb R^d$, there corresponds the $\sigma$-finite measure $M^{PC}$ on $PC_x(t)$, which is defined as the sum of measures $M^{PC}_n$, $n=0,1,\dots$, where each $M^{PC}_n$ is the product-measure on $PC^n_x(t)$ of the Lebesgue measure on $\mathrm{Sim}^n_t$ and of $n$ copies of the measure $M$ on $\mathbb R^d$. Therefore, if $Z$ is parametrized as in (4.99), then
$$
M^{PC}_n(dZ(.)) = ds_1\cdots ds_n\,M(dz_1)\cdots M(dz_n),
$$
and for any measurable functional $F(Z_x(.)) = \{F_n(x-Z_0, x-Z_1,\dots,x-Z_n)\}$ on $PC_x(t)$, given by a collection of functions $F_n$ on $\mathbb R^{dn}$, $n=0,1,\dots$, we find
$$
\begin{aligned}
\int_{PC_x(t)} F(Z_x(.))\,M^{PC}(dZ(.)) &= F(x) + \sum_{n=1}^\infty \int_{PC^n_x(t)} F(Z_x(.))\,M^{PC}_n(dZ(.)) \\
&= \sum_{n=0}^\infty \int_{\mathrm{Sim}^n_t} ds_1\cdots ds_n \int_{\mathbb R^d}\cdots\int_{\mathbb R^d} M(dz_1)\cdots M(dz_n) \\
&\qquad\times F_n(x-Z_0, x-Z_1,\dots,x-Z_n).
\end{aligned} \tag{4.101}
$$

If the measure $M$ on $\mathbb R^d$ is finite, then the measure $M^{PC}=M^{PC}(t,x)$ on $PC_x(t)$ is also finite, with
$$
\|M^{PC}\| = 1 + \sum_{n=1}^\infty \int_{\mathrm{Sim}^n_t} ds_1\cdots ds_n\int_{\mathbb R^{dn}} M(dz_1)\cdots M(dz_n) = e^{t\|M\|}.
$$


Therefore, using the probabilistic notation $\mathbf E$ (the expectation) for the integral over the normalized (probability) measure $\widehat M^{PC} = e^{-t\|M\|}M^{PC}$ on the path-space $PC_x(t)$, we can write (4.101) as
$$
\int_{PC_x(t)} F(Z_x(.))\,M^{PC}(dZ(.)) = e^{t\|M\|}\int_{PC_x(t)} F(Z_x(.))\,\widehat M^{PC}(dZ(.)) = e^{t\|M\|}\,\mathbf E\,F(Z_x(.)). \tag{4.102}
$$

Let us now look at the perturbation series (4.85), assuming that $A$ is the operator of multiplication by a function $A(y)$ in $\mathbb R^d$ and $L$ is an integral operator in $C(\mathbb R^d)$. For the sake of simplicity, we take this integral operator to be spatially homogeneous, i.e., $Lf(x)=\int f(x-y)\,\nu(dy)$ with a measure $\nu$ on $\mathbb R^d$ (possibly unbounded and complex-valued). Then the series (4.85) (its second version!) can be rewritten as
$$
\begin{aligned}
\Phi_t Y(x) &= e^{tA(x)} Y(x) \\
&\quad + \sum_{m=1}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} Y(x-z_1-\cdots-z_m)\,ds_1\cdots ds_m\,\nu(dz_1)\cdots\nu(dz_m) \\
&\qquad\times \exp\{s_1 A(x) + (s_2-s_1)A(x-z_1)+\cdots+(t-s_m)A(x-z_1-\cdots-z_m)\}.
\end{aligned} \tag{4.103}
$$
The latter exponential term can also be written as
$$
\exp\Bigl\{\int_0^t A(Z_x(s))\,ds\Bigr\}.
$$
Comparing with (4.101), we find the following path integral representation for the solutions to the equation $\dot f_t = (A+L)f_t$.

Theorem 4.7.1. Under the assumptions of any one of the Theorems 4.6.1, 4.6.3 or 4.6.5, let $A$ be the operator of multiplication by the function $A(y)$ in $\mathbb R^d$ and $Lf(x)=\int f(x-y)\,\nu(dy)$ with a measure $\nu$ on $\mathbb R^d$. Then the convergent series (4.85) or (4.103), expressing the resolving operator for the Cauchy problem of the equation $\dot f_t = (A+L)f_t$, can be represented as a path integral of the type (4.101) with
$$
F(Z_x(.)) = \exp\Bigl\{\int_0^t A(Z_x(s))\,ds\Bigr\}\,Y(Z_x(t))
$$
and $\nu$ instead of $M$.

If $\nu$ is not positive, then it can be written as $\nu(dz)=\xi(z)M(dz)$ with a real (or complex) density $\xi$ and some positive measure $M$. In this case, (4.103) can be written as (4.101) with
$$
F(Z_x(.)) = \exp\Bigl\{\int_0^t A(Z_x(s))\,ds\Bigr\}\,\xi(z_1)\cdots\xi(z_n)\,Y(x-z_1-\cdots-z_n).
$$


As a basic example, let us consider the regularized Schrödinger equation
$$
\hbar\frac{\partial}{\partial t} f_t = \sigma(-\Delta + V(x)) f_t, \tag{4.104}
$$
where $\sigma$ is a complex constant with a non-negative real part; $\sigma=-i$ for the standard Schrödinger equation. If the potential $V$ is the Fourier transform of some measure $\nu$ on $\mathbb R^d$ (possibly unbounded), then applying the Fourier transform to equation (4.104) leads (by (1.94)) to the following equation for the Fourier transform $\hat f$ of $f$:
$$
\hbar\frac{\partial}{\partial t}\hat f_t(p) = -ip^2\,\hat f_t(p) - i(2\pi)^{-d}\int \hat f_t(p-q)\,\nu(dq). \tag{4.105}
$$
Therefore, we are in the framework of Theorem 4.7.1, which implies the path integral representation for the solutions to the Schrödinger equation. Different conditions on $\nu$ that ensure compliance with all assumptions of this theorem are discussed in detail in Chapter 9 of [148].

Later on, we will extend this theory to time-dependent $A$ and $L$, see (4.181). In Chapter 8, we will extend it into other directions, in particular to the case where $A$ is not necessarily a multiplication operator.

Remark 72. For readers with some probabilistic background, let us indicate an alternative approach to the path integral representation. To this end, let us start with the series (4.85) in its first version:
$$
\Phi_t Y = T_t Y + \sum_{m=1}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} T_{t-s_m} L T_{s_m-s_{m-1}}\cdots L T_{s_1} Y\,ds_1\cdots ds_m. \tag{4.106}
$$
It represents the solutions $f_t = \Phi_t Y$ to the equation $\dot f_t=(A+L)f_t$ with the initial condition $Y$. A similar series expansion can be used to find the solution to the Cauchy problem for the SDE $d\phi_t = A\phi_t\,dt + L\phi_t\,dW_t$, where $W$ is the Wiener process. This leads to the formula
$$
\phi_t Y = T_t Y + \sum_{m=1}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} T_{t-s_m} L T_{s_m-s_{m-1}}\cdots L T_{s_1} Y\,dW_{s_1}\cdots dW_{s_m}, \tag{4.107}
$$
expressed in terms of the iterated (multiplicative) Wiener (or Itô) integral. The celebrated Wiener isometry states that for any two expansions
$$
\phi^j = \sum_{m=0}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} g^j_m(s_1,\dots,s_m)\,dW_{s_1}\cdots dW_{s_m}, \qquad j=1,2
$$
(under appropriate growth conditions), one has
$$
\mathbf E(\phi^1\phi^2) = \sum_{m=0}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} g^1_m\,g^2_m\,ds_1\cdots ds_m.
$$


Using this isometry for $g^2_m=1$ for all $m$ and taking into account the well-known formula
$$
\exp\Bigl\{W_t - \frac{t}{2}\Bigr\} = \sum_{m=0}^\infty \int_{0\le s_1\le\cdots\le s_m\le t} dW_{s_1}\cdots dW_{s_m},
$$
one finds a representation of the series (4.106) in terms of the expectation $\mathbf E$ with respect to the Wiener measure:
$$
\Phi_t Y = \mathbf E\Bigl[\phi_t(Y)\exp\Bigl\{W_t - \frac{t}{2}\Bigr\}\Bigr], \tag{4.108}
$$
where $\phi_t Y$ solves the Cauchy problem for the SDE $d\phi_t = A\phi_t\,dt + L\phi_t\,dW_t$.

4.8 Diffusion with drifts and Schrödinger equations with singular potentials and magnetic fields

As an example for the application of the perturbation theory developed above, let us analyse the diffusion or heat conduction equation in $\mathbb R^d$ with a variable drift $b$ and a source $V$:
$$
\dot f_t = \frac{1}{2}\Delta f_t + (b(x),\nabla) f_t(x) + V(x) f_t(x), \qquad f_0=f, \tag{4.109}
$$
where $(b(x),\nabla)=\sum_j b_j(x)\frac{\partial}{\partial x_j}$. Hereby, the operator
$$
Lf = (b(x),\nabla) f(x) + V(x) f(x) \tag{4.110}
$$
is considered a perturbation. If $b=0$ and $V\in C(\mathbb R^d)$, then the semigroup generated by $L+\Delta/2$ can be constructed via Theorem 4.6.1. In order to deal with a non-vanishing drift $b$, Theorem 4.6.3 or 4.6.4 is required.

Proposition 4.8.1. Let $V$ and all $b_j$ belong to $C(\mathbb R^d)$. Then the operator $L+\Delta/2$ generates a strongly continuous semigroup in $C_\infty(\mathbb R^d)$, and $C^2_\infty(\mathbb R^d)$ is an invariant core.

Proof. This is a direct application of Theorem 4.6.4 with $\tilde B = C^1_\infty(\mathbb R^d)$ and of the regularization property (4.24) of the semigroup generated by $\Delta$. $\square$

One is often interested in situations with discontinuous drifts or sources.

Proposition 4.8.2. Let $V$ and all $b_j$ be bounded measurable functions on $\mathbb R^d$. Then the operator $L+\Delta/2$ still generates a strongly continuous semigroup $\Phi_t$ in $C_\infty(\mathbb R^d)$. Its members $\Phi_t$ provide unique solutions to the corresponding mild form of equation (4.109).


Proof. Formally, Theorem 4.6.4 does not apply even for the vanishing drift $b$, since $L$ is not a well-defined operator in $C_\infty(\mathbb R^d)$. However, the arguments proving the convergence of the series (4.85) and the strong continuity of $\Phi_t$ are still perfectly applicable. Alternatively, the proof can be based on Theorem 4.6.5, as will be done in Theorem 4.8.1 below. $\square$

These results automatically extend to complex diffusion equations, or regularized Schrödinger equations with magnetic fields:
$$
\dot f_t = \frac{1}{2}\sigma\Delta f_t + (b(x),\nabla) f_t(x) + V(x) f_t(x) = \Bigl(\frac{1}{2}\sigma\Delta + L\Bigr) f_t(x), \qquad f_0=f, \tag{4.111}
$$
where $\sigma = i+\varepsilon$ is a complex constant with a positive real part $\varepsilon$. This takes us to the following result.

Proposition 4.8.3. Let $V$ and all $b_j$ be bounded continuous complex-valued functions. Then the operator on the r.h.s. of (4.111) generates a strongly continuous semigroup in the space $C^{\mathbb C}_\infty(\mathbb R^d)$ of complex-valued continuous functions vanishing at infinity, and the corresponding complex-valued space $C^{2,\mathbb C}_\infty(\mathbb R^d)$ is an invariant core.

The limiting equation with $\sigma=i$ is the Schrödinger equation with magnetic fields. In its standard representation, it is written as
$$
i\hbar\frac{\partial}{\partial t} f_t = \frac{1}{2}\bigl(-i\nabla + A(x)\bigr)^2 f_t + V(x) f_t, \tag{4.112}
$$
with $\hbar$ the Planck constant, $V(x)$ the potential and $A(x)$ the so-called vector potential.

In physics, one is often interested in more general potentials $V$, e.g. potentials that are represented by measures. For this purpose, let us consider the regularized Schrödinger equation (4.111) when $V$ and $\nabla b$ are complex Radon measures. (Of course, $V$ in (4.110) should in this case better be denoted by $V(dx)$.) Let us introduce the subspace $\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$ of Radon measures $\mu$ on $\mathbb R^d$ such that
$$
|\mu|(B_r(x)) \le C r^\alpha \tag{4.113}
$$
for all $x\in\mathbb R^d$, $r\in(0,1)$ and some constant $\alpha\in(\max(d-2,0),d]$, called the dimensionality of $\mu$; here $B_r(x)$ denotes the ball of radius $r$ centred at $x$. Notice that (4.113) implies
$$
|\mu|(B_r(x)) \le C_1\max(1,r^d) \tag{4.114}
$$
for all $r$ and some other constant $C_1$ depending on $d$ and $\alpha$. The space $\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$ of measures of dimensionality $\alpha$ is a Banach space with respect to the norm
$$
\|\mu\|_{R,\alpha} = \sup_{x\in\mathbb R^d,\ r\in(0,1]} \frac{|\mu|(B_r(x))}{r^\alpha}. \tag{4.115}
$$


Natural examples of measures from $\mathcal M^{\mathbb C}_{R,2}(\mathbb R^3)$ are the volumes on regular hypersurfaces. Dirac's point masses (or $\delta$-functions) on $\mathbb R$ and their finite linear combinations belong to $\mathcal M^{\mathbb C}_{R,0}(\mathbb R)$.

Theorem 4.8.1. Let $b$ be a bounded measurable function. Let $V\in\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$ and $\frac{\partial b}{\partial x_j}\in\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$ for all $j$, where $d>1$, $\alpha\in(d-2,d]$, and let $\sigma$ be a complex constant with a positive real part. Then the perturbation series (4.85) with $A=\sigma\Delta/2$ and $L$ from (4.110) converges in the norm of the space of bounded operators in $C_\infty(\mathbb R^d)$, and $\Phi_t f$ represents a unique bounded solution to the mild equation (4.86) for any $f\in C_\infty(\mathbb R^d)$. Moreover, the operators $\Phi_t$ extend to the space $\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$, so that for any $f\in\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$, $\Phi_t f$ solves (4.86) and satisfies the estimate
$$
\|\Phi_t f\|_{C^{\mathbb C}_\infty(\mathbb R^d)} \le \kappa\, t^{-\omega}\,\|f\|_{\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)}, \tag{4.116}
$$
with any $\omega>(d-\alpha)/2$ and some $\kappa$.

Remark 73. The derivative $\frac{\partial b}{\partial x_j}$ is defined in the sense of generalized functions, i.e., it is effectively defined only inside the integral $\int f(x)\,\frac{\partial b}{\partial x_j}(dx)$. This fact will be used below in (4.119). For instance, if $b(x)=\mathbf 1_{[0,\infty)}$ is the indicator function of the positive half-line, then $\int f(x)\,b'(dx)=f(0)$. Readers who do not wish to touch generalized functions can work with the special case when $b$ is a Lipschitz-continuous function, for which the partial derivatives are well defined almost everywhere.

Proof. In view of Theorem 4.6.5, we only need to show the corresponding version of the estimate (4.94):
$$
\Bigl|\int G_{t\sigma}(x-y)\,(Lf)(y)(dy)\Bigr| \le \kappa\, t^{-\omega}\,\|f\|_{C(\mathbb R^d)} \tag{4.117}
$$
with some constants $\omega\in(0,1)$, $\kappa>0$, where the heat kernel $G$ is given by (4.14). Due to the structure of $L$, this boils down to proving the estimates
$$
\int |G_{t\sigma}(x-y)|\,|V|(dy) \le \kappa\, t^{-\omega} \tag{4.118}
$$
and
$$
\Bigl|\int G_{t\sigma}(x-y)\sum_j b_j(y)\frac{\partial}{\partial y_j} f(y)(dy)\Bigr| = \Bigl|\int \sum_j \frac{\partial}{\partial y_j}\bigl(G_{t\sigma}(x-y)\,b_j(y)\bigr)\,f(y)(dy)\Bigr| \le \kappa\, t^{-\omega}\,\|f\|_{C(\mathbb R^d)}. \tag{4.119}
$$

Let us start with (4.118). Since
$$
|G_{t\sigma}(x)| = (2\pi t|\sigma|)^{-d/2}\exp\Bigl\{-\frac{x^2\,\sigma_1}{2t}\Bigr\},
$$


with $\sigma_1$ being the real part of $\sigma^{-1}$, a re-scaling of $t$ and $\kappa$ leads to the following reduced estimate:
$$
(2\pi t)^{-d/2}\int \exp\Bigl\{-\frac{(x-y)^2}{2t}\Bigr\}\,|V|(dy) \le \kappa\, t^{-\omega}. \tag{4.120}
$$
Since condition (4.113) is shift-invariant, one can further reduce the proof to the estimate
$$
(2\pi t)^{-d/2}\int \exp\Bigl\{-\frac{x^2}{2t}\Bigr\}\,|V|(dx) \le \kappa\, t^{-\omega}. \tag{4.121}
$$

In order to estimate the integral on the l.h.s., we decompose it into three parts: over the ball $B_{t^\delta}(0)$, over the band $B_1(0)\setminus B_{t^\delta}(0)$, and over the remaining part. Therefore, it follows from (4.113) and (4.114) that
$$
\begin{aligned}
(2\pi t)^{-d/2}\int \exp\Bigl\{-\frac{x^2}{2t}\Bigr\}\,|V|(dx) &\le (2\pi t)^{-d/2} C t^{\delta\alpha} + (2\pi t)^{-d/2} C_1\exp\Bigl\{-\frac{t^{2\delta}}{2t}\Bigr\} \\
&\quad + (2\pi t)^{-d/2}\int_{|x|\ge 1}\exp\Bigl\{-\frac{x^2}{2t}\Bigr\}\,|V|(dx).
\end{aligned} \tag{4.122}
$$
For $\delta<1/2$, the second term is exponentially small for small $t$, and the first term is of order $t^{-\omega}$ with $\omega=d/2-\delta\alpha$. In order to have $\omega<1$, one has to choose $\delta\alpha>d/2-1$, which is consistent with the restriction $\delta<1/2$ precisely under the assumed condition $\alpha>d-2$.

It remains to estimate the last term in (4.122). It can be rewritten as
$$
(2\pi t)^{-d/2}\int_1^\infty \exp\Bigl\{-\frac{r^2}{2t}\Bigr\}\,\tilde V(dr),
$$
where $\tilde V((r_1,r_2]) = |V|(B_{r_2}(0)) - |V|(B_{r_1}(0))$. With the help of (4.114), this is further estimated by
$$
(2\pi t)^{-d/2} C_1\int_1^\infty \exp\Bigl\{-\frac{r^2}{2t}\Bigr\}\,r^d\,dr,
$$
which is exponentially small for small $t$, as expected.

Turning to (4.119), we note that, since $b$ is bounded, it is sufficient to show the following two estimates:
$$
\int\sum_j |G_{t\sigma}(x-y)|\,\Bigl|\frac{\partial}{\partial y_j} b_j(y)\Bigr|(dy) \le \kappa\, t^{-\omega}, \tag{4.123}
$$
$$
\int\sum_j \Bigl|\frac{\partial}{\partial y_j} G_{t\sigma}(x-y)\Bigr|\,dy \le \kappa\, t^{-\omega}. \tag{4.124}
$$
But estimate (4.123) is the same as (4.118), since only the dimensionality of $V$ and $\nabla b$ is relevant. Estimate (4.124) is straightforward (the calculations are performed in Proposition 4.3.1). $\square$
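As a numerical illustration (not part of the original text; the choice of the measure, the evaluation point and the quadrature are ad hoc assumptions) of the estimate (4.120), take $d=2$ and let $|V|$ be the length measure on the segment $\{(s,0):0\le s\le 1\}$, which has dimensionality $\alpha=1$; the left-hand side of (4.120) should then blow up no faster than $t^{-\omega}$ with $\omega=(d-\alpha)/2=1/2$ as $t\to 0$, as the following Python sketch confirms.

```python
# Numerical sketch of (4.120) for d = 2 and |V| the length measure on a segment.
import numpy as np
from scipy.integrate import quad

def lhs(t, x=(0.5, 0.0)):
    # (2 pi t)^{-1} * int_0^1 exp(-|x - (s, 0)|^2 / (2 t)) ds
    integrand = lambda s: np.exp(-((x[0] - s) ** 2 + x[1] ** 2) / (2 * t))
    val, _ = quad(integrand, 0.0, 1.0)
    return val / (2 * np.pi * t)

for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(f"t = {t:7.1e}:  lhs = {lhs(t):9.3f},   lhs * sqrt(t) = {lhs(t) * np.sqrt(t):.4f}")
```

The product $\text{lhs}\cdot\sqrt t$ stabilizes at a constant, in line with the exponent $\omega=1/2$.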

The situation is much simpler in the dimension $d=1$. If the measures $V$ and $b'$ are bounded, one can even deal with the Schrödinger equation directly, without any regularization.

Theorem 4.8.2. Let $b$ be a bounded measurable function, let $V$ and $b'$ be bounded complex measures on $\mathbb R$, and let $\sigma\ne 0$ be a complex constant with a non-negative real part. Then the perturbation series (4.85) with $A=\sigma\Delta/2$ and $L$ from (4.110) converges in the norm of the space of bounded operators in $C_\infty(\mathbb R)$, and $\Phi_t f$ represents a unique bounded solution to the mild equation (4.86) for any $f\in C_\infty(\mathbb R)$.

Proof. Estimating $|G_{t\sigma}|$ roughly by $(2\pi t|\sigma|)^{-1/2}$ implies that
$$
\int |G_{t\sigma}(x-y)|\,|V|(dy) \le (2\pi t|\sigma|)^{-1/2}\,\|V\|,
$$
which yields (4.118) with $\omega=1/2$. (4.119) is obtained similarly. $\square$

It is instructive to compare equation (4.109) with its dual equation
$$
\dot g_t = \frac{1}{2}\Delta g_t - (\nabla, b(x) g_t(x)) + V(x) g_t(x), \qquad g_0=g. \tag{4.125}
$$
In this case, $b$ stands under the operator $\nabla$, and it therefore seems as if more regularity were required for $b$ than in (4.109). However, this is not the case. Working with (4.125) as with (4.109) in Theorem 4.8.1, we again need to obtain the estimate (4.117). But here, unlike in the case of (4.109), the integration by parts transfers the derivatives to $G$ only. Therefore, there is no need to prove the estimate (4.123). Consequently, one obtains the following result:

Theorem 4.8.3. Let $b$ be a bounded measurable function and $V\in\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$, where $d>1$, $\alpha\in(d-2,d]$, and let $\sigma$ be a complex constant with a positive real part. Then the perturbation series (4.85) with $A=\sigma\Delta/2$ and
$$
Lg = -(\nabla, b(x) g(x)) + V(x) g(x)
$$
converges in the norm of the space of bounded operators in $C_\infty(\mathbb R^d)$, and $\Phi_t f$ represents a unique bounded solution to the corresponding mild form (4.86) of equation (4.125) for any $f\in C_\infty(\mathbb R^d)$. Moreover, the operators $\Phi_t$ extend to the space $\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$, so that for any $f\in\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)$, $\Phi_t f$ solves (4.86) and satisfies the estimate
$$
\|\Phi_t f\|_{C^{\mathbb C}_\infty(\mathbb R^d)} \le \kappa\, t^{-\omega}\,\|f\|_{\mathcal M^{\mathbb C}_{R,\alpha}(\mathbb R^d)}, \tag{4.126}
$$
with any $\omega>(d-\alpha)/2$ and some $\kappa$.


4.9 Propagators and their generators

For a set $S$, a family of mappings $U^{t,r}$ from $S$ to itself, parametrized by the pairs of numbers $r\le t$ (respectively $t\le r$) from a given finite or infinite interval of $\mathbb R$, is called a propagator (respectively a backward propagator) in $S$ if $U^{t,t}$ is the identity operator in $S$ for all $t$ and the following chain rule, or propagator equation, holds for $r\le s\le t$ (respectively for $t\le s\le r$): $U^{t,s}U^{s,r}=U^{t,r}$. If the mappings $U^{t,r}$ forming a backward propagator depend only on the differences $r-t$, then the family $T^t = U^{0,t}$ is clearly a semigroup.

Remark 74. In the literature, propagators are also referred to as two-parameter semigroups and evolutionary families.

The propagators of continuous linear operators in linear topological spaces $V$ (sometimes shortly referred to as propagators on $V$) arise naturally when solving linear Cauchy problems in $V$:
$$
\dot f(t) = A_t f(t), \qquad t\ge s, \tag{4.127}
$$
with a given $f = f(s)$, or its backward version
$$
\dot f(s) = -A_s f(s), \qquad s\le t, \tag{4.128}
$$
with a given $f = f(t)$, where $A_t$ is a family of densely defined operators in $V$, and where for $t=s$ in (4.127) (respectively $s=t$ in (4.128)) the derivative is understood as the right derivative (respectively left derivative).

One says that the propagator (respectively backward propagator) $U^{t,r}$ of continuous linear operators solves the Cauchy problem (4.127) (respectively the backward Cauchy problem (4.128)) on $D\subset B$ for a family of densely defined operators $A_t$, or equivalently, that a family of densely defined operators $A_t$ generates the propagator (respectively backward propagator) $U^{t,r}$ on $D$, if $D$ is a subspace contained in the domains of all $A_t$, $D$ is invariant under all $U^{t,s}$, and for any $f\in D$, $U^{t,s}f$ is a solution to (4.127) (respectively (4.128)), i.e.,
$$
\frac{d}{dt} U^{t,s} f = A_t U^{t,s} f, \qquad t\ge s, \tag{4.129}
$$
respectively
$$
\frac{d}{ds} U^{s,t} f = -A_s U^{s,t} f, \qquad s\le t, \tag{4.130}
$$
for the backward case.
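For matrices, a backward propagator generated by a time-dependent family $A_t$ can be approximated by composing short-time exponentials. The following Python sketch (not part of the original text; the family $A_t$, the step size and the test times are ad hoc assumptions) builds $U^{s,t}$ this way and checks the chain rule $U^{t,s}U^{s,r}=U^{t,r}$ together with equation (4.130) at a sample point.

```python
# Matrix sketch of a backward propagator and its generator (parameters ad hoc).
import numpy as np
from scipy.linalg import expm

n, dt = 3, 1e-3
A = lambda t: np.array([[0.0, 1.0, 0.0],
                        [-1.0, 0.0, t],
                        [0.0, -t, -0.5]])

def U(s, t):
    """Backward propagator: d/ds U^{s,t} = -A_s U^{s,t}, U^{t,t} = Id."""
    prop = np.eye(n)
    for k in range(int(round((t - s) / dt))):
        prop = prop @ expm(dt * A(s + k * dt))   # earlier times multiply from the left
    return prop

t0, s0, r0 = 0.0, 0.4, 1.0
chain_err = np.linalg.norm(U(t0, s0) @ U(s0, r0) - U(t0, r0))
deriv = (U(s0, r0) - U(s0 - dt, r0)) / dt        # left difference quotient in s
eq_err = np.linalg.norm(deriv + A(s0) @ U(s0, r0))
print("chain-rule defect:", chain_err)
print("defect in equation (4.130):", eq_err)
```

The chain-rule defect is of the order of rounding errors, while the defect in (4.130) is of the order of the step size.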

Remark 75. When dealing with discrete approximations, see (5.3), or in applications to stochastic equations, see [146], it can be useful to extend the above notion to the case when (4.129) or (4.130) hold only outside some fixed finite or even countable subset of $t$ and $s$. Note that practically all results below remain valid under this extension.


It is clear that if the Cauchy problems (4.127) or (4.128) have unique solutions given by continuous operators, then these operators form a propagator, respectively a backward propagator. Shortly, we shall prove the converse statement using the method of duality.

It is natural to ask what happens with a backward propagator $U^{s,t}$ when it is differentiated with respect to $t$, and whether it is sufficient to assume (4.130) only for $s=t$.

Proposition 4.9.1. Let $U^{s,t}$ be a backward propagator on $V$ with some invariant subspace $D$ and
$$
\frac{d}{ds^-} U^{s,t} f\Big|_{s=t} = -A_s f, \qquad f\in D, \tag{4.131}
$$
where $d/ds^-$ denotes the left derivative, for all $t$ and some family of operators $A_t$ with domains containing $D$. Then
$$
\frac{d}{ds^-} U^{s,t} f = -A_s U^{s,t} f, \qquad \frac{d}{dt^-} U^{s,t} f = U^{s,t} A_t f, \tag{4.132}
$$
for all $s<t$. If additionally the family $A_t$ is strongly continuous as a family of operators $D\to V$, i.e., if $A_t f$ is a continuous mapping $t\mapsto V$ for any $f\in D$, then the second equation of (4.132) improves to
$$
\frac{d}{dt} U^{s,t} f = U^{s,t} A_t f, \qquad s\le t, \tag{4.133}
$$
with the derivative understood as the right derivative in the case $s=t$.

Proof. The first equation in (4.132) follows from the chain rule for propagators and (4.131). The second equation in (4.132) is obtained by writing
$$
\frac{1}{h}\bigl(U^{s,t} - U^{s,t-h}\bigr) f = \frac{1}{h} U^{s,t-h}\bigl(U^{t-h,t} f - f\bigr) = U^{s,t-h}\Bigl(\frac{1}{h}\bigl(U^{t-h,t} f - f\bigr) - A_t f\Bigr) + U^{s,t-h} A_t f,
$$
for $h>0$. The first term tends to zero and the second to $U^{s,t} A_t f$, as $h\to 0$. This implies the second equation in (4.132). Finally, in order to get (4.133), Lemma 1.3.1 must be applied. $\square$

Remark 76. In order to deduce (4.130) from the first equation in (4.132) by Lemma 1.3.1, one has to require that the mapping $s\mapsto A_s U^{s,t}f$ is continuous for all $f\in D$ and $s<t$.

s,tf is continuous for all f ∈ Dand s < t.

Remark 77. Property (4.133) is crucial for the theory of propagators. It can be included into the definition of the generation of $U$ by $A$. However, this property is difficult to check in other ways than via the continuity assumptions of Proposition 4.9.1.


A propagator or a backward propagator $U^{t,r}$ of uniformly (for $t,r$ from a compact set) continuous linear operators on a locally convex linear space $V$ is called strongly continuous if the family $U^{t,r}$ depends strongly continuously on the pair of variables $(t,r)$, i.e., $U^{t,r}f$ is a continuous function $(t,r)\mapsto V$ for any $f\in V$. As for semigroups, it follows from the principle of uniform boundedness, Theorem 1.6.1, that if $V$ is a barrelled space, then the strongly continuous family of continuous operators $U^{t,r}$ is locally (that is, for $t,r$ from any compact set) equicontinuous. However, unlike in the semigroup case, the link between strongly continuous propagators and the Cauchy problems (4.127) or (4.128) is much less straightforward in general. For instance, no analogue of Proposition 4.2.4 seems to exist. Similarly, Proposition 4.2.1 on exponential bounds does not extend to propagators; see, e.g., [42] for a review of this topic.

If a propagator (respectively a backward propagator) $U^{t,r}$ of continuous linear operators solves the Cauchy problem (4.127) or (4.128) on $D$, then $U^{t,r}$ depends strongly continuously on $r$ (respectively on $t$) on the closure of $D$ in $V$. In this case, one says that if $f$ belongs to the closure of $D$ in $V$, then $U^{t,r}f$ defines the generalized solution to the Cauchy problem (4.127) or (4.128).

As for semigroups, we shall mostly work with propagators in Banach spaces, although most of the abstract results have direct extensions to locally convex spaces.

The following simple fact extends formula (4.8) and is crucial for comparing different propagators.

Proposition 4.9.2. Let $U^{s,t}_1$ and $U^{s,t}_2$ be two backward propagators of bounded linear operators in a Banach space $B$ with a common invariant subspace $D$. Let $A^1_t$ and $A^2_t$ be two families of operators in $B$ with domains containing $D$ such that $U^{s,t}_2, A^2_t$ satisfy (4.130) and $U^{s,t}_1, A^1_t$ satisfy (4.133). Moreover, let $U^{s,t}_1$ depend strongly continuously on $t$. Then
$$
\bigl(U^{t,r}_1 - U^{t,r}_2\bigr) f = \int_t^r U^{t,s}_1\bigl(A^1_s - A^2_s\bigr) U^{s,r}_2 f\,ds, \qquad f\in D. \tag{4.134}
$$

Proof. This follows from the observation that the function $U^{t,s}_1 U^{s,r}_2 f$ is differentiable in $s\in[t,r]$ for any $f\in D$, and from
$$
\frac{d}{ds}\bigl(U^{t,s}_1 U^{s,r}_2 f\bigr) = U^{t,s}_1\bigl(A^1_s - A^2_s\bigr) U^{s,r}_2 f. \tag{4.135}
$$
In order to prove this formula, we write
$$
\frac{1}{\delta}\bigl(U^{t,s+\delta}_1 U^{s+\delta,r}_2 f - U^{t,s}_1 U^{s,r}_2 f\bigr) = \frac{1}{\delta} U^{t,s+\delta}_1\bigl(U^{s+\delta,r}_2 - U^{s,r}_2\bigr) f + \frac{1}{\delta}\bigl(U^{t,s+\delta}_1 - U^{t,s}_1\bigr) U^{s,r}_2 f.
$$


The second term tends to $U^{t,s}_1 A^1_s U^{s,r}_2 f$, as $\delta\to 0$, because $U^{s,t}_1, A^1_t$ satisfy (4.133). The first term can be written as
$$
U^{t,s+\delta}_1\Bigl(\frac{1}{\delta}\bigl(U^{s+\delta,r}_2 - U^{s,r}_2\bigr) f + A^2_s U^{s,r}_2 f\Bigr) - U^{t,s+\delta}_1 A^2_s U^{s,r}_2 f,
$$
which converges to $-U^{t,s}_1 A^2_s U^{s,r}_2 f$, because $U^{t,s}_1$ is strongly continuous and $U^{s,t}_2, A^2_s$ satisfy (4.130). $\square$

As a direct consequence, we obtain the following stability (or continuity) result for propagators.

Proposition 4.9.3. Under the assumptions of Proposition 4.9.2, assume that $D$ is itself a Banach space under the norm $\|.\|_D\ge\|.\|_B$, such that the operators $A^1_t, A^2_t$ are bounded as operators $D\to B$ and the $U^{t,s}_2$ are bounded operators in $D$. Then
$$
\bigl\|\bigl(U^{t,r}_1 - U^{t,r}_2\bigr) f\bigr\|_B \le \|f\|_D\,\sup_{s\in[t,r]}\|U^{t,s}_1\|_{B\to B}\,\sup_{s\in[t,r]}\|U^{s,r}_2\|_{D\to D}\int_t^r \|A^1_s - A^2_s\|_{D\to B}\,ds. \tag{4.136}
$$

This result can be used to find the derivative of a propagator or a semigroup with respect to a parameter. In fact, the following statement is a consequence of Proposition 4.9.2.

Proposition 4.9.4. Let $U^{s,t}_\alpha$ be a family, depending on a parameter $\alpha\in\mathbb R$, of strongly continuous backward propagators of bounded linear operators in a Banach space $B$ with a common invariant subspace $D$. Let $A^\alpha_t$ be the families of operators in $B$ with domains containing $D$ that generate the propagators $U^{s,t}_\alpha$ on $D$. Assume that $D$ is itself a Banach space under the norm $\|.\|_D\ge\|.\|_B$, such that the operators $A^\alpha_t$ are bounded as operators $D\to B$ and the $U^{t,s}_\alpha$ are also bounded as operators in $D$. Finally, assume that, for any $t$ and $g\in D$, $A^\alpha_t g$ is differentiable with respect to $\alpha$ as a mapping $\mathbb R\to B$, and that these derivatives are uniformly bounded for $t$ and $g$ from any bounded sets. Then the mappings $U^{t,r}_\alpha f$ are differentiable in $\alpha$ for any $f\in B$, and
$$
\frac{\partial(U^{t,r}_\alpha f)}{\partial\alpha} = \int_t^r U^{t,s}_\alpha\,\frac{\partial A^\alpha_s}{\partial\alpha}\bigl(U^{s,r}_\alpha f\bigr)\,ds. \tag{4.137}
$$

In particular, if $\exp\{tA^\alpha\}$ is a family of strongly continuous semigroups of linear operators in $B$ with the same domain $D$, then the derivative with respect to the parameter $\alpha$ is given by the formula
$$
\frac{\partial}{\partial\alpha}\exp\{tA^\alpha\} = \int_0^t \exp\{(t-s)A^\alpha\}\,\frac{\partial A^\alpha}{\partial\alpha}\,\exp\{sA^\alpha\}\,ds, \tag{4.138}
$$
which is far from being the same as the formula for the derivative of a usual exponential.
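Formula (4.138) is easy to test for matrices. The following Python sketch (not part of the original text; the matrices, the midpoint quadrature and the finite-difference step are ad hoc assumptions) compares the right-hand side of (4.138) at $\alpha=0$ for the family $A^\alpha = A_0+\alpha\,\dot A$ with a central finite difference of $\exp\{tA^\alpha\}$.

```python
# Matrix check of formula (4.138) (all parameters are ad hoc choices).
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
n, t = 4, 1.3
A0 = rng.normal(size=(n, n))
dA = rng.normal(size=(n, n))              # dA/dalpha for A_alpha = A0 + alpha * dA
A = lambda alpha: A0 + alpha * dA

m = 2000
ds = t / m                                # midpoint rule for the integral in (4.138)
rhs = sum(expm((t - s) * A(0.0)) @ dA @ expm(s * A(0.0))
          for s in (ds * (k + 0.5) for k in range(m))) * ds

eps = 1e-6                                # central finite difference in alpha
lhs = (expm(t * A(eps)) - expm(t * A(-eps))) / (2 * eps)

print("difference between (4.138) and the finite difference:", np.linalg.norm(lhs - rhs))
```

The printed defect is small, reflecting only the quadrature and differencing errors.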


Finally, let us mention that there exists a standard method for building semigroups from propagators by enlarging the state space. Namely, each strongly continuous propagator $U^{t,s}$ of bounded linear operators in a Banach space $B$ can be associated with a semigroup of operators in the space $C(\mathbb R,B)$, which is often referred to as the Howland semigroup and is given by the formula
$$
(T_t f)(x) = U^{x,x-t} f(x-t), \qquad t\ge 0. \tag{4.139}
$$
The semigroup property $T_{t+s}=T_t T_s$ is readily checked. Many results on evolution equations that arise from this link can be found, e.g., in [53]. See also [243] for an application to second-order equations of the type $\ddot u = Bu+f$. More generally, for any $\delta\in\mathbb R$ and $\alpha>0$, a semigroup can be formed by the operators
$$
(T_t f)(x) = U^{x+\delta,x+\delta-\alpha t} f(x-\alpha t), \qquad t\ge 0. \tag{4.140}
$$

Exercise 4.9.1. Check that these operators define a strongly continuous semigroup in $C_\infty(\mathbb R,B)$. Moreover, if $A_t$ generates the forward propagator $U^{t,s}$, then the generator of the semigroup $T_t$ is
$$
(Lf)(x) = \alpha A_{x+\delta} f(x) - \alpha f'(x). \tag{4.141}
$$
An extension of this dynamics with a mixed fractional derivative instead of $f'$ will be presented in Chapter 8; see Theorems 8.7.2 and 8.8.1.

4.10 Well-posedness of linear Cauchy problems

The following result employs the method of duality for establishing the well-posedness of the linear Cauchy problem corresponding to a propagator (and, in particular, to a semigroup).

Theorem 4.10.1. Let $U^{t,r}$ be a strongly continuous backward propagator of bounded linear operators in a Banach space $B$, generated by a family of linear operators $A_t$ on a common dense domain $D$ that is invariant under all $U^{t,r}$. Moreover, let $A_t f$ be a continuous function $t\mapsto B$ for any $f\in D$. Then the following holds:

(i) The family of dual operators $V^{s,t}=(U^{t,s})^*$ forms a $*$-weakly continuous (in $s,t$) propagator of bounded linear operators in $B^*$ (contractions if all $U^{t,r}$ are contractions), so that
$$
\frac{d}{dt} V^{s,t}\xi = -V^{s,t} A^*_t\xi, \qquad \frac{d}{ds} V^{s,t}\xi = A^*_s V^{s,t}\xi, \qquad t\le s\le r, \tag{4.142}
$$
holds weakly on $D$, i.e., say, for the second equation,
$$
\frac{d}{ds}(f, V^{s,t}\xi) = (A_s f, V^{s,t}\xi), \qquad t\le s\le r, \quad f\in D; \tag{4.143}
$$


(ii) $V^{s,t}\xi$ is the unique solution to the Cauchy problem of equation (4.143), i.e., if $\xi_t=\xi$ for a given $\xi\in B^*$ and $\xi_s$, $s\in[t,r]$, is a $*$-weakly continuous family in $B^*$ satisfying
$$
\frac{d}{ds}(f,\xi_s) = (A_s f,\xi_s), \qquad t\le s\le r, \quad f\in D, \tag{4.144}
$$
then $\xi_s = V^{s,t}\xi$ for $t\le s\le r$;

(iii) $U^{s,r}f$ is the unique solution to the inverse Cauchy problem (4.130), i.e., if $f_r=f$, $f_s\in D$ for $s\in[t,r]$, and $f_s$ satisfies the equation
$$
\frac{d}{ds} f_s = -A_s f_s, \qquad t\le s\le r, \tag{4.145}
$$
then $f_s = U^{s,r}f$.

Proof. Statement (i) is a direct consequence of duality and the equations (4.130) and (4.133).

(ii) Let $g(s)=(U^{s,r}f,\xi_s)$ for a given $f\in D$. We will show that $g'(s)=0$ for all $s$. In fact, we have
$$
\bigl[(U^{s+\delta,r}f,\xi_{s+\delta}) - (U^{s,r}f,\xi_s)\bigr]/\delta = (U^{s+\delta,r}f - U^{s,r}f,\ \xi_{s+\delta})/\delta + (U^{s,r}f,\ \xi_{s+\delta}-\xi_s)/\delta,
$$
and the second term tends to $(A_s U^{s,r}f,\xi_s)$, as $\delta\to 0$. The first term can be written as
$$
-(A_s U^{s,r}f,\ \xi_{s+\delta}) + \Bigl(\frac{1}{\delta}\bigl(U^{s+\delta,r}-U^{s,r}\bigr)f + A_s U^{s,r}f,\ \xi_{s+\delta}\Bigr),
$$
where the first term tends to $-(A_s U^{s,r}f,\xi_s)$, as $\delta\to 0$, since $\xi$ is $*$-weakly continuous, and the second term tends to zero, because the $\xi_s$ are uniformly bounded in $B^*$. Therefore, we find $g'(s)=0$ as claimed. This implies $g(r)=(f,\xi_r)=g(t)=(U^{t,r}f,\xi_t)$, which shows that $\xi_r$ is uniquely defined. Similarly, we can analyse any other point $r'\in(s,r)$.

(iii) Similarly to (ii), it is sufficient to prove the equation
$$
\frac{d}{ds}(f_s, V^{s,t}\xi) = 0.
$$
To this end, let us write
$$
(f_{s+\delta}-f_s,\ V^{s+\delta,t}\xi)/\delta = \Bigl(\frac{1}{\delta}(f_{s+\delta}-f_s) + A_s f_s,\ V^{s+\delta,t}\xi\Bigr) - (A_s f_s,\ V^{s+\delta,t}\xi).
$$
The first term tends to zero, as $\delta\to 0$, because $\|(f_{s+\delta}-f_s)/\delta + A_s f_s\|\to 0$ and the family $V^{s+\delta,t}\xi$ is uniformly bounded in $B^*$. The second term tends to $-(A_s f_s, V^{s,t}\xi)$ because of the $*$-weak continuity of $V^{s,t}$. Consequently,
$$
\lim_{\delta\to 0}\bigl[(f_{s+\delta}, V^{s+\delta,t}\xi)/\delta - (f_s, V^{s,t}\xi)/\delta\bigr] = -(A_s f_s, V^{s,t}\xi) + \lim_{\delta\to 0}(f_s,\ V^{s+\delta,t}\xi - V^{s,t}\xi)/\delta = 0,
$$
as required. $\square$


Sometimes, not only the continuity, but also quantitative measures of the regularity of the dual propagator are required. For that purpose, a bit more structure is usually assumed.

Proposition 4.10.1. Under the assumptions of Theorem 4.10.1, suppose that $D$ is itself a Banach space under some norm $\|.\|_D$ such that $\|.\|_D\ge\|.\|_B$ and
$$
\|A_s\|_{D\to B}\le A, \qquad \|U^{s,r}\|_{B\to B}\le U_B,
$$
for all $s,r$ and some constants $A$, $U_B$. Then the dual curve $V^{s,t}\xi$ is Lipschitz-continuous in $s$ in the norm topology of $D^*$:
$$
\|V^{s+\delta,t}\xi - V^{s,t}\xi\|_{D^*} \le \delta\,U_B\,A\,\|\xi\|_{B^*}. \tag{4.146}
$$
If additionally $\|U^{s,r}\|_{D\to D}\le U_D$ with a constant $U_D$, then also
$$
\|V^{s,t}\xi - V^{s,t\pm\delta}\xi\|_{D^*} \le \delta\,U_D\,A\,\|\xi\|_{B^*}. \tag{4.147}
$$

Proof. We have
$$
\|V^{s+\delta,t}\xi - V^{s,t}\xi\|_{D^*} = \sup_{\|f\|_D\le 1} |(f,\ V^{s+\delta,t}\xi - V^{s,t}\xi)| = \sup_{\|f\|_D\le 1} |(U^{t,s+\delta}f - U^{t,s}f,\ \xi)| \le \delta\,U_B\,A\,\|\xi\|_{B^*}.
$$
(4.147) is proved in a similar way. $\square$

Let us now extend the uniqueness result to affine equations of the form
$$
\frac{d}{ds} f_s = -A_s f_s - g_s, \qquad t\le s\le r, \tag{4.148}
$$
with a given $f_r$ and a continuous curve $s\mapsto g_s\in B$.

Proposition 4.10.2. Under the assumptions of Theorem 4.10.1, suppose that $s\mapsto g_s$ is a continuous mapping $s\to B$ and $g_s\in D$ for all $s$, with uniformly bounded $\|g_s\|_D$. Then equation (4.148) has a unique solution $f_s$ for any boundary condition $f=f_r\in D$, and we have
$$
f_s = U^{s,r} f + \int_s^r U^{s,t} g_t\,dt. \tag{4.149}
$$

Proof. Let us assume that there are two solutions to (4.148) with the same boundary condition. Then their difference satisfies the equation $f'=-A_s f$ with a vanishing boundary condition. In this case, Theorem 4.10.1 implies that $f$ vanishes. Therefore, there can be at most one solution to (4.148) with a given boundary condition. Differentiating (4.149) with respect to $s$ (this is where we use $g_s\in D$!), one finds that it satisfies (4.148). $\square$

Page 281: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

266 Chapter 4. Linear Evolutionary Equations: Foundations

Remark 78. Of course, one obtains the direct analogues of Theorem 4.10.1 andProposition 4.10.2 for the forward propagators by reverting the direction of time.

The dual analogue of (4.148), written in the weak form, is the equation

d

ds(f, ξs) = (Asf, ξs) + (f, νs), s ≥ t, f ∈ D,

with some given curve νs. This equation often appears in the slightly modifiedform

d

ds(f, ξs) = (Asf, ξs) + (Lsf, νs), s ≥ t, f ∈ D, (4.150)

with Ls : B → B a family of bounded operators. The next statement is the dualversion of Proposition 4.10.2.

Proposition 4.10.3. Under the assumptions of Theorem 4.10.1, suppose that s �→ νsis a continuous curve in B∗ and s �→ Ls is a continuous curve in L(B,B). Thenthe weak equation (4.150) in B∗ with the initial condition ξt = ξ has a uniquesolution ξs given by the formula

(f, ξs) = (U t,sf, ξt) +

∫ s

t

(LrUr,sf, μr) dr. (4.151)

Proof. The uniqueness follows directly by subtracting two solutions and referringto the uniqueness of the solutions to (4.144). Therefore, one only has to show thatξs given by the explicit formula (4.151) solves the Cauchy problem for equation(4.150). This, however, follows by differentiation. �

4.11 The operator-valued Riccati equation

As an insightful example for the application of linear propagator methods, let usanalyse an important class of quadratic equations, the so-called Riccati equations.

As usual, let B and B∗ be a real Banach space and its dual. Let us say that adensely defined operator C from B to B� (possibly unbounded) is symmetric (re-spectively positive) if (Cv,w) = (Cw, v) (respectively if additionally (Cv, v) ≥ 0)for all v, w from the domain of C. Let us denote the space of bounded symmet-ric operators B �→ B∗ (respectively its convex subset of positive operators) bySL(B,B∗) (respectively SL+(B,B∗)). Analogous definitions are applied to theoperators B∗ �→ B. The notion of positivity of course induces a (partial) orderrelation on the space of symmetric operators.

For the study of equations in SL(B,B∗) or SL(B∗, B), it is convenient tointroduce a special norm therein:

‖C‖S = sup{|(Cv, v)| : |v| ≤ 1}.

Page 282: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.11. The operator-valued Riccati equation 267

Proposition 4.11.1. ‖C‖S is in fact a norm and it is equivalent on SL(B,B∗) tothe usual norm of the space L(B,B∗). More precisely,

‖C‖S ≤ ‖C‖L(B,B∗) ≤ 3‖C‖S. (4.152)

Proof. Homogeneity, positivity and the triangle inequality are straightforward con-sequences from the definition. Next, for any C ∈ SL(B,B∗), we find

(Cv,w) =1

2[(C(v + w), v + w) − (Cv, v) − (Cw,w)]. (4.153)

Consequently, if ‖C‖S = 0, then (Cv,w) = 0 for all v, w, and thus C = 0. There-fore, ‖C‖S satisfies all properties of a norm.

Next, the l.h.s. inequality of (4.152) follows from the definition. In order toget the r.h.s. inequality of (4.152), we use (4.153) to obtain

‖C‖L(B,B∗) = sup{|(Cv,w)| : |w|, |v| ≤ 1}≤ 1

2sup{|(C(v + w), v + w)| : |w|, |v| ≤ 1}+ sup{|(Cv, v)| : |v| ≤ 1}

≤ 3‖C‖S,as required. �

Let us start with the simplest quadratic equation

π(t) = −π(t)C(t, s)π(t), t ≥ s, (4.154)

for an unknown operator-valued function π(t) ∈ SL+(B∗, B).

Remark 79. Before touching operator-valued evolutions, it is useful to get a clearunderstanding of the simpler situation inR. Namely, look at the equation x = −x2.For x0 > 0, the solution is globally defined and tends to zero as t → ∞. For x0 < 0,the solution explodes, i.e., tends to −∞, in a finite time.

Exercise 4.11.1. Solve the equation x = −x2 explicitly to confirm the claim in theabove remark.

Proposition 4.11.2. Suppose that C(t, s), t ≥ s, is a family of elements inSL+(B,B∗) such that C(t, s) are strongly continuous in t for t > s, with an

integrable singularity at t = s at most, i.e., with∫ t

sC(τ, s) dτ < ∞. Then the

following holds:

(i) For any πs ∈ SL+(B∗, B) there exists a unique global strongly continuousfamily of operators π(t) ∈ SL+(B∗, B), t ≥ s, such that

π(t) = πs −∫ t

s

π(τ)C(τ, s)π(τ) dτ. (4.155)

(Note that the integral is defined in the norm topology.)

(ii) π(t) ≤ πs and the image of π(t) belongs to the image of πs for all t ≥ s.

(iii) Equation (4.154) holds in the norm topology of L(B,B∗) for t > s.

Page 283: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

268 Chapter 4. Linear Evolutionary Equations: Foundations

Proof. The uniqueness follows from Gronwall’s lemma in Proposition 9.1.4. Theexistence of a positive solution for small t− s follows from the explicit formula

π(t) = πs

(1+

∫ t

s

C(τ, s) dτ πs

)−1

=

(1+ πs

∫ t

s

C(τ, s) dτ

)−1

πs. (4.156)

In fact, for small t−s the inverse operators in this formula are well defined, because(1+ A)−1 exists whenever ‖A‖L(B,B) < 1. In order to check that this is indeed asolution, we can use (1.60), which implies, e.g., for the first formula

π(t) = −πs

(1+

∫ t

s

C(τ, s) dτ πs

)−1

C(t, s)πs(t)

(1+

∫ t

s

C(τ, s) dτ πs

)−1

= −π(t)C(t, s)π(t),

as required.

Remark 80. In order to derive (or to guess) formula (4.156), one can assume thatπ−1t exists. In this case, (4.154) is equivalent to the equation d

dtπ−1(t) = C(t, s).

It remains to show the global existence. To this end, it is natural to try toemploy some kind of accretivity. As it turns out, the norm ‖.‖S is convenient forthat purpose. More precisely, for any t > s there exists a decomposition s = t0 <t1 < · · · < tn = t of the interval [s, t] such that

3‖πs‖S∥∥∥∥∥∫ tj+1

tj

C(τ, s) dτ

∥∥∥∥∥L(B,B∗)

< 1

for all j. As can be seen from (4.154), the norm ‖π(t)‖S does not increase alongthe solution. Defining the solution first on the interval [t0, t1], we therefore findthat

‖π(t1)‖L(B∗,B)

∥∥∥∥∫ t2

t1

C(τ, s) dτ

∥∥∥∥L(B,B∗)

≤ 3‖πs‖S∥∥∥∥∫ t2

t1

C(τ, s) dτ

∥∥∥∥L(B,B∗)

< 1.

Therefore, the solution can be extended to the interval [t2, t3]. This procedure canbe applied to all remaining subintervals of the partition. �

The (time-nonhomogeneous, differential) Riccati equation in B is defined asthe equation

Rt = A(t)Rt + (A(t)Rt)∗ −RtC(t)Rt, (4.157)

with some given families of operators A(t) in B and C(t) from B to B∗, and wherethe solutions Rt are sought in the class SL+(B�, B). We assume that D is a densesubspace of B, which is itself a Banach space under the norm ‖.‖D ≥ ‖.‖B.

Page 284: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.11. The operator-valued Riccati equation 269

Proposition 4.11.3. Assume that

(i) A(t) and C(t) are bounded strongly continuous families of operators D → Bor D → B∗, respectively;

(ii) A(t) generates a propagator U t,s in B on the common invariant domain Dsuch that for any φ ∈ D the family U t,sφ is the unique solution in D to theCauchy problem

d

dtU t,sφ = A(t)U t,sφ, Us,sφ = φ,

and U t,s is strongly continuous both in B and D,

(iii) all C(t) are positive and C(t, s) = (U t,s)∗C(t)U t,s ∈ SL+(B,B∗) for t > s,with norms that have at most an integrable singularity at s = t.

Then for any R ∈ SL+(B∗, B) with the image belonging to D, the family

Rt = U t,sπ(t)(U t,s)∗, t ≥ s, (4.158)

where π(t) is the solution to (4.154) given by Proposition 4.11.2 with πs = R andC(t, s) = (U t,s)∗C(t)U t,s, is a continuous function t �→ SL+(B∗, B), t ≥ s, in thestrong operator topology, the images of all Rt belong to D, Rt depends Lipschitz-continuously on R and satisfies the Riccati equation weakly, i.e.,

d

dt(Rtv, w) = (A(t)Rtv, w) + (v,A(t)Rtw)− (RtC(t)Rtv, w) (4.159)

for all v, w ∈ B∗. If R extends to a bounded operator D∗ → D, then Rt satisfiesthe Riccati equation (4.157).

Proof. Everything follows from inspecting the given explicit formula. In fact, thisformula implies that

d

dtRtv = AtU

t,sπ(t)(U t,s)∗v − U t,sπ(t)C(t, x)π(t)(U t,s)∗v + U t,sπ(t)(U t,s)∗A∗sv.

The only problem in this formula is rooted in its last term, since A∗sv (and hence

(U t,s)∗A∗sv) may belong to D∗, where R and thus π(t) may be not defined. This

is why one generally resorts to weak solutions. �

Remark 81. The above results were formulated for the usual forward Cauchyproblem. Of course, everything remains valid for the backward Riccati equation

Rs = −A(s)Rs − (A(s)Rs)∗ +RsC(s)Rs, s ≤ t, (4.160)

if A(t) is assumed to generate the backward propagator Us,t, s ≤ t, in B. Thenthe family Rs = Us,tπs(U

s,t)∗ solves (4.160) with the terminal condition Rt = R,where πs solves the reduced backward Riccati equation πs = πsC(s, t)πs with thesame initial condition πt = R and C(s, t) = (Us,t)∗C(s)Us,t.

Page 285: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

270 Chapter 4. Linear Evolutionary Equations: Foundations

In the case B = C∞(Rd), B∗ = M(Rd), the elements C of SL(B,B∗) areusually integral operators with symmetric kernels C(x, y), so that the correspond-ing bilinear form can be written as

LC(v, w) = (Cv,w) =

∫ ∫C(x, y)v(x)w(y) dxdy. (4.161)

This extends to linear forms on the space of symmetric functions of two variables as

LC(R) =

∫ ∫C(x, y)R(x, y) dxdy. (4.162)

The corresponding bilinear form of the dual operator C∗ from SL(B∗, B) is thenanalogously given by

(Cμ, ν) =

∫ ∫C(x, y)μ(dx)ν(dy).

Note that the requirement of C ∈ SL(B,B∗) being positive turns into therequirement of (Cv, v) of (4.161) being positive for any v. Symmetric functionsC(x, y) that satisfy this property are called positive-definite kernels (real-valued inour case). In applications, such kernels arise as correlation functions of stationaryrandom fields, where their structure is highly developed and well understood.

4.12 An infinite-dimensional diffusion equationin variational derivatives

As an example for the application of the Riccati equation, let us analyse thebackward propagators in C(M(Rd)) specified by a second-order operator in thevariational derivatives of the form

OtF (Y ) =

(A(t)

δF

δY (.), Y

)+

1

2LC(t)

δ2F

δY 2(., .), (4.163)

where A(t) and C(t) are as in Proposition 4.11.3 with B = C∞(Rd), and LC(t) isthe bilinear form corresponding to C(t) given by (4.161) and (4.162). This equationis an infinite-dimensional extension of Gaussian diffusion (4.39) and can thereforebe referred to as measure-valued Ornstein–Uhlenbeck diffusion. Equations of thistype naturally appear in the analysis of fluctuations of systems with a large numberof particles (or agents) around their law-of-large-number-limits that are specifiedby the kinetic equations considered in Chapter 7.

Unlike the method of Green functions employed in Section 4.3, we shall anal-yse the evolution generated by Ot by looking at the evolution of Gaussian packets,i.e., functions of Y ∈ M(Rd) of the form

Ft(Y ) = FRt,μt,γt(Y ) = exp

{−1

2(Rt(Y − μt), Y − μt) + γt

}, (4.164)

Page 286: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.12. An infinite-dimensional diffusion equation in variational derivatives 271

where Rt ∈ SL+(B∗, B) are given by a symmetric bounded positive-definite kernelRt(x, y), so that

(RtY, Y ) =

∫ ∫Rt(x, y)Y (dx)Y (dy),

and μt ∈ M(Rd), γt ∈ C(Rd). It follows that

δFt

δY (z)= −Rt(Y − μt)(z)Ft(Y ) = −

∫Rt(z, y)(Y − μt)(dy)Ft(Y ),

δ2Ft

δY (z)δY (w)= [Rt(Y − μt)(z)Rt(Y − μt)(w) −Rt(z, w)]Ft(Y ),

and therefore

OtFt(Y ) =

[−(A(t)Rt(Y − μt), Y )

+1

2

(C(t)Rt(Y − μt), Rt(Y − μt)

)− 1

2LC(t)Rt

]F (Y )

= − F (Y )

∫ ∫[A(t)Rt(., y)] (z)(Y − μt)(dy)Y (dz)

+1

2F (Y )

∫ ∫C(t, x, y)

[∫ ∫Rt(x, v)(Y − μt)(dv)

×Rt(y, w)(Y − μt)(dw) −Rt(x, y)

]dxdy.

On the other hand,

Ft(Y ) =

[−1

2(Rt(Y − μt), Y − μt) + (μt, Rt(Y − μt)) + γt

]F (Y ).

Therefore, the backward equation F (Y ) = −OtFt(Y ) reads

1

2(Rt(Y − μt), Y − μt)− (μt, Rt(Y − μt))− γt

= −(A(t)Rt(Y − μt), Y − μt)− (A(t)Rt(Y − μt), μt)

+1

2(C(t)Rt(Y − μt), Rt(Y − μt))− 1

2LC(t)Rt.

This equation is satisfied if the following conditions hold:

(Rtv, v) = −2(A(t)Rtv, v) + (C(t)Rtv,Rtv),

μt = A∗(t)μt,

γt =1

2LC(t)Rt.

(4.165)

Page 287: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

272 Chapter 4. Linear Evolutionary Equations: Foundations

The first equation (written in the weak form in order to hold for any v) is aconsequence of the backward Riccati equation

(Rtv, w) = −(A(t)Rtv, w)− (A(t)Rtw, v) + (RtC(t)Rtv, w).

For the sake of definiteness, let D = C1∞(Rd) or D = C2

∞(Rd). The follow-ing result is a consequence of the above calculations and Proposition 4.11.3 (andRemark 81).

Proposition 4.12.1. Assume that

(i) A(t) and C(t) are bounded strongly continuous families of operators D → Band D → B∗, respectively;

(ii) A(t) generates a backward propagator Us,t in B on the common invariantdomain D, and U t,s is strongly continuous both in B and D;

(iii) all C(t) are positive and C(s, t) = (Us,t)∗C(s)Us,t ∈ SL+(B,B∗) for t > s,with the norms having an integrable singularity at s = t at most.

Then for any R ∈ SL+(B∗, B) with the image belonging to D, all three equa-tions in (4.165) are well posed and therefore specify a well-defined evolution ofGaussian packages that solves the backward Ornstein–Uhlenbeck equation F (Y ) =−OtFt(Y ).

From the equations (4.165) and the formulae of Proposition 4.11.3, it ad-ditionally follows that the backward evolution specified by the equation F (Y ) =−OtFt(Y ) is positivity-preserving and non-expansive (a contraction) on the set oflinear combinations of the above Gaussian packages.

Exercise 4.12.1. Prove this claim.

4.13 Perturbation theory for propagators

The next result extends Theorem 4.6.1 to propagators.

Theorem 4.13.1. Let U t,r be a strongly continuous backward propagator of boundedlinear operators in a Banach space B, generated by a family of linear operatorsAt on a common dense domain D, such that (4.130) holds. Let Lt be a family ofbounded operators in B that depend strongly continuously on t. Then the series

Φt,r = U t,r +∞∑

m=1

∫t≤s1≤···≤sm≤r

U t,s1Ls1Us1,s2 · · ·LsmUsm,r ds1 · · · dsm (4.166)

are well defined as a converging series of bounded operators in B. They forma strongly continuous backward propagator in B such that Φt,rf is the uniquebounded solution to the integral equation

Φt,rf = U t,rf +

∫ r

t

U t,sLsΦs,rf ds, (4.167)

Page 288: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.13. Perturbation theory for propagators 273

with a given fr = f . Moreover, if f ∈ D, then

d

dt− |t=r Φt,rf = −(Ar + Lr)f (4.168)

for any f ∈ D.

Remark 82. Unlike for semigroups, one cannot deduce directly from (4.168) thatD is invariant under Φt,r and that this propagator solves the Cauchy problem forAt + Lt there.

Proof. This is a straightforward extension of Theorem 4.6.1. The only differenceto note is that in order to conclude that

d

dt|t=r

∫ r

t

U t,sLsUs,rf ds = −Lrf,

one must use the continuous dependence of Ls on s and the observation that if

A(1)s , . . . , A

(k)s are k families of bounded strongly continuous operators, then the

family of compositions A(1)s ◦· · ·◦A(k)

s is also bounded and strongly continuous. �

As for semigroups, equation (4.167) is often referred to as the mild form of

the equation dΦt,rfdt = (A + L)Φt,rf and the solutions to (4.86) are referred to as

the mild solutions of the equation dΦt,rfdt = (A+ L)Φt,rf .

In order to conclude further (from the conditions of Theorem 4.13.1) that Φt,r

solves the Cauchy problem for At +Lt on D, one has to know that D is invariantunder Φt,r. This, however, does not follow from (4.166), even if we assume thatD is invariant under all Lt. A way to overcome this difficulty arises once thedomain D has itself a natural Banach space structure, given by a certain norm‖.‖D ≥ ‖.‖B (similar to the situation in Proposition 4.9.3). This is often the casefor concrete equations of practical interest. In this case, in order to check thatAsU

s,tf is continuous, it is sufficient to prove that Us,tf is a strongly continuousfamily of bounded operators in D and that As is a strongly continuous familyof bounded operators D → B (not B → B of course!). With the help of suchstructure, one obtains the following improvement of Theorem 4.13.1.

Proposition 4.13.1. Under the conditions of Theorem 4.13.1, assume additionallythat D is itself a Banach space under another norm ‖.‖D ≥ ‖.‖B such that As is astrongly continuous family of bounded operators D → B and Us,t is a strongly con-tinuous family of operators in D. Moreover, assume that the family Lt is stronglycontinuous as a family of operators B → B and is bounded as a family of operatorsD → D. Then D is invariant under Φt,r, the propagator Φt,r is generated by thefamily At + Lt on D and (As + Ls)Φ

s,tf is a continuous function (s, t) �→ B forany f ∈ D.

Proof. It follows that the series (4.166) converges in the norm of the Banach spaceD. Therefore,D is invariant under Φt,r. The continuity assumptions are specifically

Page 289: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

274 Chapter 4. Linear Evolutionary Equations: Foundations

designed in order to ensure the continuity of (As+Ls)Φs,tf as a function (s, t) �→ B

for any f ∈ D. The rest follows from Theorem 4.13.1, Proposition 4.9.1 andLemma 1.3.1. �

As a simple example, let us consider the straightforward time-dependentextension of formula (4.97). Namely, the unique solution given by the perturbationseries to the backward Cauchy problem

ft(x) = −∫ ∞

0

(ft(x+ y)− ft(x))νt(dy), t ≤ T, (4.169)

with a continuous family of finite measures νs and a given terminal condition fTcan be written as

ft(x) = exp

{−∫ T

t

‖νs‖ ds}

(4.170)

×∞∑k=0

∫t≤s1≤···≤sk≤T

∫ ∞

0

· · ·∫ ∞

0

fT (x+ y1 + · · ·+ yk)νs1(dy1) · · · νsk(dyk).

In the sensitivity analysis for nonlinear propagators (which will be carriedout later on), the following modification of Theorem 4.13.1 becomes important forthe case when the operators Lt are not bounded in B, but only in D:

Theorem 4.13.2. Again let D be a dense subspace of B, which is itself a Banachspace under another norm ‖.‖D ≥ ‖.‖B, and At be a family of bounded stronglycontinuous linear operators D → B. Let U t,r be a strongly continuous backwardpropagator of bounded linear operators in B generated by At on D such that U t,r isalso a strongly continuous propagator in the Banach space D. Let Lt be a stronglycontinuous family of bounded operators in D. Then the series (4.166) are welldefined as converging series of bounded operators in D. They form a stronglycontinuous backward propagator in D such that, for any f ∈ D, Φt,rf is theunique bounded solution to equation (4.167) with a given fr = f . Moreover,

d

dtΦt,rf = −(At + Lt)Φ

t,rf,d

drΦt,rf = Φt,r(Ar + Lr)f, (4.171)

for any f ∈ D, where the derivatives are understood in the sense of the topologyof B, and Φt,rf yields the unique solution to the backward Cauchy problem for theequation ft = −(At + Lt)ft with a given boundary condition fr = f ∈ D.

Proof. The convergence of the series (4.166) follows directly from the conditionsof the Theorem, which imply a) the invariance of D under all Φt,r and b) that Φt,r

defines a strongly continuous backward propagator in D (although possibly not inB). The invariance of D leads to (4.171), first with the left derivatives instead ofthe full derivatives, as in Proposition 4.9.1. The extension to the full derivatives

Page 290: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.13. Perturbation theory for propagators 275

follows from the assumed strong continuity of At and Lt. Finally, by Proposition4.10.2, any solution ft to the equation ft = −(At + Lt)ft satisfies the equation

ft = U t,rf +

∫ r

t

U t,sLsfs ds,

and is therefore unique. �

Let us present the counterpart of Theorem 4.6.3 for propagators. Its proof isalmost identical to the proof of Theorem 4.6.3 and is therefore omitted.

Theorem 4.13.3.

(i) Let the family of operators At with the common domain D generate a stronglycontinuous backward propagator U t,s on a Banach space B, and let Lt be afamily of operators in B with the common domain DL such that U t,sf ∈ DL

for any f ∈ B, t < s, and

‖LtUt,s‖ ≤ κ(s− t)−ω (4.172)

with constants κ > 0, ω ∈ (0, 1), uniformly for t, s from any fixed boundedinterval. Then the series (4.166) converges in the operator norm, and theoperators Φt,r form a strongly continuous backward propagator in B.

(ii) If additionally each Lt is a closed (or closable) operator on DL, then Φt,rf ∈DL for any f ∈ B, t > 0 (or Φt,rf belongs to the domain of the closure ofLt, respectively), with the same order of growth as for Tt:

‖LtΦt,r‖ ≤ κ(r − t)−ω, (4.173)

with some other constant κ. Moreover, Φt,rf is the unique solution to theintegral equation (4.167) for any given f0 = f such that ‖LtΦ

t,rf‖ ≤ c(r −t)−ω1 with some constants ω1 ∈ (0, 1) and c > 0.

Let us now present a version of Theorem 4.6.4 for propagators, which exploitsthe method of three-level Banach towers.

Theorem 4.13.4. Let U t,s be a strongly continuous backward propagator in a Ba-nach space B generated by the family At on an invariant dense subspace D. LetB be another dense subspace of B such that D ⊂ B ⊂ B. Assume that D andB are themselves Banach spaces under the norms ‖.‖D ≥ ‖.‖B ≥ ‖.‖B, and let

Lt ∈ L(B, B)∩L(D, B) be a strongly continuous family. Let U t,s have the followingregularization property: Ttf ∈ D for any f ∈ B, t > 0, and

‖U t,s‖B→B ≤ κ(s − t)−ω, ‖U t,s‖B→D ≤ κ(s− t)−ω (4.174)

with constants ω ∈ (0, 1),κ > 0, uniformly for (s− t) ∈ (0, 1]. Finally, let U t,s bestrongly continuous in B and D, and let

‖U t,s‖B→B ≤ Mem(s−t), ‖U t,s‖D→D ≤ MDemD(s−t). (4.175)

Page 291: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

276 Chapter 4. Linear Evolutionary Equations: Foundations

Then

(i) the backward propagator Φt,s constructed in Theorem 4.13.3 is strongly con-tinuous in both D and B, and it satisfies the estimates

‖Φt,s‖B→B ≤ Mem(s−t)E1−ω[Γ(1 − ω)κM(s− t)1−ω sups

‖Ls‖B→B],

‖Φt,s‖D→D ≤ MDemD(s−t)E1−ω[Γ(1− ω)κMD(s− t)1−ω sup

s‖Ls‖D→B];

(4.176)

(ii) it has the same regularization property as U t,s, namely

‖Φt,s‖B→B ≤ κ(s− t)−ω, ‖Φt,s‖B→D ≤ κ(s− t)−ω , (4.177)

with a constant κ that can also be expressed in terms of Mittag-Leffler func-tions;

(iii) the backward propagator Φt,r is generated by At+Lt on D, i.e., the equations(4.171) hold in B for any f ∈ D;

(iv) Φt,sf ∈ D for any f ∈ B and t < s, and

‖Φt,s‖B→D ≤ 4κ2(s− t)−2ω . (4.178)

Proof. The conditions (4.174) and Lt ∈ L(B, B) ∩ L(D, B) (with norms that areuniform in t) ensure that the perturbation series converge in B for any f ∈ Band in D for any f ∈ B. The estimates for the sum are obtained as in the proofof Theorem 4.6.4. The required strong continuity of Φt,r follows from the strongcontinuity of U t,r. Due to the invariance of D, it is sufficient to check the equations(4.171) only for t = r, where they are readily seen. Finally, (4.178) follows from(4.177) and the chain rule. �

An important example for the application of Theorem 4.13.4 will be givenin Theorem 6.8.3, which is devoted to quantitative estimates of the sensitivity ofMcKean–Vlasov diffusions with respect to the initial data.

Perturbation theory can be used for constructing solutions to the perturbedequation even if U t,s are neither supposed to be propagators, nor to representunique solutions to the Cauchy problem of the equation ft = −Atft.

Theorem 4.13.5. Let D ⊂ B ⊂ B be a triple of Banach spaces with ‖.‖D ≥ ‖.‖B ≥‖.‖B such that D (respectively B) is dense in B (respectively in B) in the topologyof B (respectively B). Let U t,s, t ≤ s, be a family of operators in B such that D andB are invariant and the U t,s are bounded and strongly continuous in all three spacesD, B, B. Also, let the U t,s be smoothing, such that Ttf ∈ D for any f ∈ B, t > 0,and that (4.174) holds. Let At ∈ L(D,B) and Lt ∈ L(B, B)∩L(D, B) be stronglycontinuous families, and for any f ∈ B, let the equation d

dtUt,sf = −AtU

t,sf holdfor t < s.

Page 292: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.13. Perturbation theory for propagators 277

Then the series (4.166) converges in the norms of all three spaces D, B, B,its sum Φt,s is strongly continuous in all these spaces, the operators Φt,s have thesame regularization property as U t,s (namely the estimates (4.177) hold), and forany f ∈ B the equation

d

dtΦt,rf = −(At + Lt)Φ

t,rf, t < r, (4.179)

is satisfied.

Proof. This is just a repetition of the arguments used in Theorem 4.13.4, whereeverything concerning uniqueness and the chain rule is omitted. �

Finally, let us give a direct extension of the path integral representationof Theorem 4.7.1. Keeping the notation (4.99) for piecewise constant paths, andPCx(τ, t) for the set of such paths [τ, t] �→ Rd starting from the point x at τ , letus assign to each continuous family of bounded measures Mt on Rd the measureMPC on PCx(τ, t), which is defined as the sum of measures MPC

n , n = 0, 1, . . . ,where each MPC

n is the product-measure on PCnx (τ, t) of the Lebesgue measure

on Simnτ,t and of n measures Msj on Rd taken at the points of discontinuity of

Zx(s). That is, if Z is parametrized as in (4.99), then

MPCn (dZ(.)) = ds1 · · · dsnMs1(dz1) · · ·Msn(dzn),

and for any measurable functional F (Zx(.)) = {Fn(x − Z0, x − Z1, . . . , x − Zn)}on PCx(τ, t), given by the collection of functions Fn on Rdn, n = 0, 1, . . .,∫

PCx(τ,t)

F (Zx(.))MPC(dZ(.)) = F (x) +

∞∑n=1

∫PCn

x (t)

F (Zx(.))MPCn (dZ(.))

=∞∑n=0

∫Simn

τ,t

ds1 · · · dsn∫Rd

· · ·∫Rd

Ms1(dz1) · · ·Msn(dzn)

× Fn(x− Z0, x− Z1, . . . , x− Zn). (4.180)

Since

‖MPC‖ = 1 +

∞∑n=1

∫Simn

τ,t

ds1 · · · dsn‖Ms1‖ · · · ‖Msn‖ = exp

{∫ t

τ

‖Ms‖ ds},

and using the probabilistic notation E (the expectation) for the integral over the

normalized (probability) measure MPC = exp{− ∫ t

τ‖Ms‖ ds}MPC on the path-

space PCx(τ, t), we can write (4.180) as∫PCx(τ,t)

F (Zx(.))MPC(dZ(.)) = exp

{∫ t

τ

‖Ms‖ ds}∫

PCx(t)

F (Zx(.))MPC(dZ(.))

= exp

{∫ t

τ

‖Ms‖ ds}EF (Zx(.)). (4.181)

Page 293: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

278 Chapter 4. Linear Evolutionary Equations: Foundations

Assuming that the propagators Us,t in (4.166) are generated by multiplicationoperators of the time-dependent family of functions A(t, x) and that Lt is a familyof bounded operators in C(Rd) of the form Ltf(x) =

∫f(x − y)νt(dy) with a

family of bounded measures νt on Rd, the series (4.166) can be rewritten as

Φt,rY (x) = exp

{∫ r

t

A(s, x) ds

}Y (x) (4.182)

+

∞∑m=1

∫t≤s1≤···≤sm≤r

Y (x− z1 − · · · − zm)ds1 · · · dsmνs1(dz1) · · · νsm(dzm)

× exp

{∫ s1

t

A(s, x) ds+

∫ s2

s1

A(s, x− z1) ds+ · · ·+∫ t

sm

A(s, x− z1 − · · · − zm)

}.

The last exponential term can be also written as exp{∫ r

tA(s, Zx(s)) ds}. Therefore,

the series (4.182), which is a performance of the perturbation series (4.166), canbe represented as a path integral of the type (4.181):∫

PCx(t,r)

exp

{∫ r

t

A(s, Zx(s)) ds

}Y (Zx(s))M

PC(dZ(.))

= exp

{∫ t

τ

‖Ms‖ ds}E

[exp

{∫ r

t

A(s, Zx(s)) ds

}Y (Zx(r))

].

(4.183)

4.14 Diffusions and Schrodinger equationswith nonlocal terms

As a first example for the application of perturbation theory of propagators, letus extend Proposition 4.8.1 to time-dependent drifts and sources, namely to theheat conduction equation

ft =1

2Δft + (bt(x),∇)ft(x) + Vt(x)ft(x), f0 = f, (4.184)

with the time-dependent perturbation Ltf = (bt(x),∇)f(x) + Vt(x)f(x).

Proposition 4.14.1. Let Vt, bt be bounded measurable functions. Then the series(4.166) with U t,s = exp{(s− t)Δ/2} converge, and the corresponding family Φt,r

forms a backward propagator that solves the mild equation (4.167).

As in the time-homogeneous case, these results automatically extend to com-plex diffusion equations, or the regularized Schrodinger equation with magneticfields:

ft =1

2σΔft +(bt(x),∇)ft(x) +Vt(x)ft(x) = (

1

2σΔ+Lt)ft(x), f0 = f, (4.185)

where σ = i+ ε is a complex constant with a positive real part ε.

Page 294: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.14. Diffusions and Schrodinger equations with nonlocal terms 279

Theorem 4.6.5 also extends to propagators, which almost literally leads tothe following time-dependent version of Theorems 4.8.1 and 4.8.2.

Theorem 4.14.1.

(i) Let bt(x) be a bounded measurable function and Vt a Radon measure on Rd,d > 1, depending measurably on t. Let Vt ∈ MC

R,α(Rd) and ∂bt

∂xj∈ MC

R,α(Rd)

for all j, with all constants uniform in t, α ∈ (d − 2, d] and σ a complexconstant with a positive real part. Then equation (4.185) is well posed in thesense of mild solutions.

(ii) Let bt(x) be a bounded measurable function. Let Vt and∂bt∂x be bounded (uni-

formly in t) measures on R, depending measurably on t. Let σ be a complexconstant with a non-negative real part (thus including the case σ = i). Thenequation (4.185) is well posed in the sense of mild solutions.

As an example related to Theorem 4.13.4, let us provide conditions thatensure that the equations (4.12) or (4.185) can be classically solved.

Theorem 4.14.2. Let bt, Vt ∈ C([0, T ], C1(Rd)) (or its complex-valued version),where d ≥ 1 and σ is a complex constant with a positive real part. Then the propa-gator Φt,r solving (according to Theorem 4.14.1) the mild form of equation (4.185)acts strongly continuously in C∞(Rd), C1

∞(Rd) and C2∞(Rd), it is generated by

Lt + σΔ/2 on C2∞(Rd) and therefore solves equation (4.185) classically.

Proof. This is a direct consequence of Theorem 4.13.4 with D = C2∞, B = C1

∞,B = C∞ and the regularization property of the semigroups T σ

t of (4.20), see (4.24),(4.25) and Exercise 4.3.8. �

Remark 83. By carefully looking at the proof of Theorem 4.13.4 in this case, wecan to conclude that the assumption bt, Vt ∈ C([0, T ], CbLip(R

d)) is sufficient forthe validity of the results of Theorem 4.14.2.

Let us now extend the theory for the equations (4.185) to the case of diffusionsor Schrodinger equations with non-local terms, i.e., to equations of the type

ft =

(1

2σΔ+ Lt

)ft(x)

=1

2σΔft + (bt(x),∇)ft(x) + Vt(x)ft(x)

+

∫ft(y)νt(x, dy) +

∫(∇ft(y), νt(x, dy)), f0 = f,

(4.186)

where νt(x, dy) is a (possibly signed or even complex) transition kernel for anyt, i.e., νt(x, dy) is measurable with respect to x ∈ Rd and belongs to M(Rd) orMC(Rd) as a function of the second variable. Moreover, νt = (ν1t , . . . , ν

dt ) is a

vector-valued transition kernel. An important special case are operators Lt of the

Page 295: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

280 Chapter 4. Linear Evolutionary Equations: Foundations

Levy–Kchintchin type, which arise in the theory of nonlocal Markov processes:

Ltf = (bt(x),∇)ft(x) + Vt(x)ft(x) +

∫(ft(x+ y)− ft(x))νt(x, dy)), (4.187)

where for any x the (positive) measure νt(x, .) may be unbounded, but such that∫min(1, |y|)νt(x, dy) < ∞.

By (1.84), these operators can be expressed in the form (4.186) with boundedνt, νt.

Interestingly enough, nonlocal equations of the type (4.186) also arise natu-rally in linearized evolutions of nonlinear diffusions, see equation (6.81) below.

Recall that the transition kernel νt(x, dy) is said to be weakly continuous if∫f(y)νt(x, dy) is a continuous function of t and x for any continuous bounded f .

Theorem 4.14.3.

(i) Let σ be a complex constant with a positive real part, let Vt, bt be boundedmeasurable complex-valued functions and νt, νt uniformly bounded complextransition kernels. Then the series (4.166) with U t,s = exp{(s−t)σΔ/2} con-verges both in C(Rd) and C1(Rd), and the corresponding family Φt,r forms astrongly continuous backward propagator both in C∞(Rd) and C1

∞(Rd). Thispropagator solves the mild equation (4.167).

(ii) Let additionally bt, Vt ∈ C([0, T ], C1(Rd)) (possibly complex-valued), and

let the partial derivatives ∂νt(x,.)∂xj

, ∂νt(x,.)∂xj

exist as uniformly bounded weakly

continuous families of complex transition kernels. Then the propagator Φt,r

from (i) acts strongly continuously in C∞(Rd), C1∞(Rd) and C2

∞(Rd), it isgenerated by Lt + σΔ/2 on C2

∞(Rd) and therefore solves equation (4.186)classically.

Proof. Again, this is a direct consequence of Theorem 4.13.4. �

For the sake of simplicity, we used the standard Laplacian operators as mainterm of the equations. The above results extend straightforwardly to equations ofthe type

ft =1

2(At∇,∇)ft + (bt(x),∇)ft(x) + Vt(x)ft(x), fs = f, (4.188)

with At a family of symmetric positive matrices. In fact, the solution to the cor-responding homogeneous problem

ft =1

2(At∇,∇)ft, fs = f, (4.189)

Page 296: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.15. ΨDOs with homogeneous symbols (time-dependent case) 281

is expressed in the following closed form (obtained directly by the Fourier trans-form, using (1.89)):

ft =(2π)−d/2√det(

∫ t

sAτdτ )

∫exp

{−1

2

((∫ t

s

Aτdτ

)−1

(x− y), x− y

)}f(y) dy.f

(4.190)The corresponding propagator has all properties that are required for an extensionof the above results to equations of the type (4.188).

4.15 ΨDOs with homogeneous symbols

(time-dependent case)

In this section, we extend the theory of Section 4.5 into several directions, includingtime-dependent homogeneous symbols.

We start with the Cauchy problem

ft = −ψt(−i∇)ft, f |t=s = fs, t ≥ s, (4.191)

with

ψt(p) = |p|βωt(p/|p|), (4.192)

where ωt = ωtr + iωt

i is a continuous function on R × Sd−1 with a positive realpart (see (4.66) for key examples). The corresponding propagator resolving (4.61)acts as

U t,sfs(x) =

∫Gψ

t,s(x− y)fs(y) dy, (4.193)

with the Green function (2.60) of the form

Gψt,s(x) =

1

(2π)d

∫eipx exp

{−|p|β

∫ t

s

ωτ (p/|p|) dτ}

dp. (4.194)

Since this function is real, U t,s preserves the reality of functions if and only if thecondition

ωt(p/|p|) = ωt(−p/|p|) (4.195)

holds for all p and t.

The following two theorems are proved by directly extending the argumentsfrom Theorems 4.5.1 and 4.4.1.

Theorem 4.15.1. Let β > 0 and a continuous function ωt(s) on Sd−1 be (d+1+[β])-times (where [β] is the integer part of β) continuously differentiable, with its realpart being bounded from below by a positive number. Then the following holds:

Page 297: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

282 Chapter 4. Linear Evolutionary Equations: Foundations

(i) The Green function (4.194) is infinitely differentiable in all variables t, s, xfor t− s > 0 and satisfies the estimate

max

(|Gψ

t,s(x)|,∣∣∣∣(t− s)

∂Gt,s(x)

∂t

∣∣∣∣ ,∣∣∣∣(t− s)

∂Gt,s(x)

∂s

∣∣∣∣)

≤ Cmin

((t− s)−d/β,

t− s

|x|d+β

) (4.196)

for a constant C.

(ii) The propagator U t,s given by (4.193) is a uniformly bounded and stronglycontinuous propagator in C∞(Rd), with all spaces Ck∞(Rd) being invariant.It is smoothing in the sense that U t,sf is infinitely differentiable for t > sand any f ∈ C∞(Rd). If β ≥ 1, then for all non-negative integers k, l,

‖U t,sf‖Cl+k(Rd) ≤ ck,l(t− s)−k/β‖f‖Cl(Rd) (4.197)

with some constants ck,l.

(iii) The family of operators −ψt(−i∇) generates the propagator U t,s on the in-variant subspace of C∞(Rd) consisting of functions that can be representedas Fourier transforms of functions φ ∈ L1(R

d) such that |p|βφ(p) ∈ L1(Rd),

so that equation (4.129) holds for functions from this space. (The −ψt(−i∇)are defined as operators that multiply the Fourier transform of a function by−ψt(p).)

(iv) If additionally ω is (d+ 1 + [β] + l)-times continuously differentiable, then∣∣∣∣ ∂k

∂xi1 · · · ∂xik

Gψt (x)

∣∣∣∣ ≤ Cmin

(t−(d+l)/β,

t

|x|d+β+l

), (4.198)

for a constant C and all k ≤ l and i1, . . . , ik.

Theorem 4.15.2. Under the assumptions of Theorem 4.15.1 (i) to (iii), assumeadditionally that the ψt are symmetric mixed fractional derivatives∫

Sd−1

|(∇, s)|βμt(ds),

that is, their symbols are given by formula (1.152) with some time-dependent familyof spectral measures μt. Then the family of operators −ψt(−i∇) generates thepropagator U t,s on the invariant spaces Ck∞(Rd) with any integer k ≥ β, suchequation (4.129) holds that for the functions from this space. (Here, the −ψt(−i∇)are defined on Ck∞(Rd) by the formulae (1.145) and (1.143).)

Next, let us consider the Cauchy problem

ft = −ψt(−i∇)ft + gt, f |t=s = fs, t ≥ s, (4.199)

where ψt is given by (4.192) and gt is a given curve in C∞(Rd).

Theorem 4.15.3. Let the assumptions of Theorem 4.15.2 hold and gt be a contin-uous curve [s,∞) → C∞(Rd). Let an integer k ≥ β. Then:

Page 298: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.15. ΨDOs with homogeneous symbols (time-dependent case) 283

(i) The function

ft = U t,sfs +

∫ t

s

U t,τgτ dτ (4.200)

represents the unique solution to equation (4.199), whenever fs ∈ Ck∞(Rd)

and gt is a bounded curve [s,∞] → Ck∞(Rd).

(ii) If β ≥ 1, fs ∈ C∞(Rd) and gt is a bounded curve [s,∞] → Ck−1∞ (Rd),then the function ft from (4.200) represents the unique continuous curve[s,∞] → C∞(Rd) with the initial condition fs such that ft belongs to Ck∞(Rd)and equation (4.199) holds for all t > s.

Proof. Statement (i) is a direct consequence of Proposition 4.10.2 and Remark 78.Statement (ii) is the effect of smoothing. Namely, by Theorem 4.15.1(ii), U t,sfs ∈Ck

∞(Rd) for all t > s, and by (4.197), U t,τgτ ∈ Ck∞(Rd) for all t > τ and the

family of the corresponding norms in Ck∞(Rd) is integrable in τ . �

Finally, let us briefly touch the extension of the theory to homogeneous sym-bols, but with time-dependent order. Namely, let us consider the Cauchy problem(4.191) with

ψt(p) = |p|βtωt(p/|p|). (4.201)

A generalization of all above results can again be obtained by directly extendingthe arguments from Theorems 4.5.1 and 4.4.1, where we now use Proposition 9.3.6rather than 9.3.5, see also similar arguments in Theorem 4.5.2. For instance, thefollowing Theorem holds:

Theorem 4.15.4. Let βt be a continuous curve on R with strictly positive values,and let μt be a continuous curve with values in the set of (positive) measures onSd−1 such that the functions

∫Sd−1 |(p, s)|βtμt(ds) are (d+2+[bmax])-times contin-

uously differentiable in p ∈ Sd−1 and bounded from below by a positive constant.Let us denote

bmin[s, t] = min{βτ : τ ∈ [s, t]}, bmax[s, t] = max{βτ : τ ∈ [s, t]}.

Then the Green function Gψt,s(x) of the corresponding propagator (4.193)

resolving (4.191) is infinitely differentiable with respect to all variables t, s, x, andfor t− s ∈ (0, 1) it satisfies the estimates

|Gψt,s(x)| ≤ Cmin

((t− s)−d/bmin[s,t],

∫ t

s

|x|d+β(τ)

),∣∣∣∣∂Gt,s(x)

∂t

∣∣∣∣ ≤ Cmin

((t− s)−(d+β(t))/bmin[s,t],

1

|x|d+β(t)

),∣∣∣∣∂Gt,s(x)

∂s

∣∣∣∣ ≤ Cmin

((t− s)−(d+β(s))/bmin[s,t],

1

|x|d+β(s)

),

(4.202)

for a constant C.

Page 299: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

284 Chapter 4. Linear Evolutionary Equations: Foundations

Moreover, the propagator U t,s given by (4.193) is a uniformly bounded andstrongly continuous propagator in C∞(Rd), with all spaces Ck

∞(Rd) being invari-ant. It is smoothing in the sense that U t,sf is infinitely differentiable for t > s andany f ∈ C∞(Rd), and for all non-negative integers k, l and t− s ≤ 1,

‖U t,sf‖Cl+k(Rd) ≤ ck,l(t− s)−k/bmin[s,t]‖f‖Cl(Rd) (4.203)

with some constants ck,l.

Finally, the family of operators −ψt(−i∇) generates the propagator U t,s onthe invariant spaces Ck

∞(Rd) with any integer k ≥ max{βτ}.Exercise 4.15.1. Give a full proof of this theorem.

Exercise 4.15.2. Extend Theorem 4.15.4 to time-varying mixtures, i.e., to symbolsof the type

ψt(p) =

∫U

|p|β(t,u)ωt(u, p/|p|)μt(du).

4.16 Higher-order ΨDEs with nonlocal terms

As another example for the application of perturbation theory to propagators, letus give an extension of Theorem 4.14.2 to higher (and/or fractional) order PDEswith nonlocal terms:

ft = −σ|Δ|α/2ft + Ltft (4.204)

with

Ltf =

k∑m=0

∑j1≤···≤jm≤d

btj1···jm(x)∂mf(x)

∂xj1 · · ·∂xjm

+

k∑m=0

∑j1≤···≤jm≤d

∫∂mf(y)

∂yj1 · · · ∂yjmνtj1···jm(x, dy),

(4.205)

where k ≥ 1 is any natural number that is strictly less than α and btj1···jm(x) and

νtj1···jm(x, dy) are families of measurable functions and transition kernels on Rd.

Let p be the smallest integer that is not less than α. The following result isa direct consequence of Theorem 4.13.4 and Proposition 4.4.1 with D = Cp

∞(Rd),B = Ck

∞(Rd).

Theorem 4.16.1.

(i) Let σ be a complex constant with a positive real part, let btj1···jm(x) be boundedmeasurable complex-valued functions and νtj1···jm(x, dy) uniformly boundedcomplex stochastic kernels. Then the series (4.166) with U t,s = exp{(s −t)σ|Δ|α} and Lt from (4.205) converges in C∞(Rd), and the correspondingfamily Φt,r forms a backward propagator that solves the mild equation (4.167).

(ii) Let additionally btj1···jm ∈ C([0, T ], Ck(Rd)) (possibly complex-valued), andlet the partial derivatives of order up to and including k of the kernels

Page 300: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.17. Hints and answers to chosen exercises 285

νtj1···jm(x, .) with respect to x exist as uniformly bounded, weakly continu-ous families of complex transition kernels. Then the propagator Φt,r from (i)acts strongly continuously in C∞(Rd), Ck∞(Rd) and Cp∞(Rd), it is generatedby Lt−σ|Δ|α/2 on Cp

∞(Rd) and therefore solves equation (4.186) classically.

Exercise 4.16.1. By applying Theorems 4.15.1 or 4.15.4, show how Theorem 4.16.1extends to the case when σ|Δ|α/2f is substituted by a general homogeneous op-erator of (possibly time-varying) orders βt ≥ α.

Remark 84. A corollary to Proposition 5.9.1 will extend this result to the casewhen the variable σ depends on x.

4.17 Hints and answers to chosen exercises

Exercise 4.1.1. All statements on strong continuity and equicontinuity are obtainedby reducing them to the problem of convergence Ttf → f uniform on compactsets. For a counterexample, take f(x) = sin

√x.

Exercise 4.1.2. The invariant core is given by all shifts TtS for all t and S ∈ C1(Rd).Therefore, one needs to show that any such TtS belongs to the closure of L definedon C1(Rd). To this end, approximate F by a sequence Fn ∈ C1(Rd,Rd) and showthat T n

t S(Y ) → TtS(Y ) and LT nt S(Y ) → LTtS(Y ).

Exercise 4.1.3.

RλRμf =

∫ ∞

0

e−λtTt

[∫ ∞

0

e−μsTsf ds

]dt =

∫ ∞

0

∫ ∞

0

e−λt−μsTt+sf dsdt.

Afterwards, change the integration variable t to t+ s.

Exercise 4.3.1. (i) For any ε there exists δ such that |f(y)−f(x)| < ε for |x−y| < δ.Next,∫

Gt(x− y)(f(y)− f(x)) dy =

∫|x−y|>tε+1/2

Gt(x− y)(f(y)− f(x)) dy

+

∫|x−y|≤tε+1/2

Gt(x− y)(f(y)− f(x)) dy,

and the first integral tends to zero as t → 0. Finally, for small enough t,∫|x−y|≤tε+1/2

Gt(x− y)|f(y)− f(x)| dy ≤ ε

∫Gt(x− y) dy = ε.

(iii) For f ∈ C2∞(Rd),

d

dtTtf(x) =

∫∂

∂tGt(x− y)f(y) dy

=1

2

∫ΔGt(x− y)f(y) dy =

1

2

∫Gt(x− y)Δf(y) dy.

Page 301: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

286 Chapter 4. Linear Evolutionary Equations: Foundations

Exercise 4.3.3. Strong continuity for continuous f with a compact support followsas in Exercise 4.3.1. It can be extended to the whole space by a density argument.

Exercise 4.3.5. This semigroup turns to the multiplication semigroup f(p) �→eiκp2

f(p) under Fourier transform.

Exercise 4.3.6. Use integration by parts.

Exercise 4.3.10. One only has to check that DNeu is invariant, which follows fromthe explicit formula for TNeu

t .

Exercise 4.3.11. Again, the invariance of the spaces C([0,∞)) and DDir followsfrom the explicit formula for TDir

t .

Exercise 4.5.2. yφ′(y) = e−y − φ(y).

Exercise 4.9.1.

TsTtf(x) = Ux+δ,x+δ−αs(Ttf)(x− αs)

= Ux+δ,x+δ−αsUx−αs+δ,x+δ−αs−αtf(x− αs− αt) = Ts+tf(x).

Exercise 4.12.1. One first shows this for positive and negative linear combinationsseparately, and then uses the observation that these two subsets evolve indepen-dently of each other.

4.18 Summary and comments

In this chapter, we developed the theory of semigroups and propagators of lin-ear operators in locally convex spaces, with the emphasis on the case of Banachspaces. The theory provides a tool for analysing a wide class of evolutionary par-tial differential and pseudo-differential equations. The presented results are mostlyknown. However, as in the previous chapters, the aim was to simplify, streamlineand unify various facts and approaches, while at the same time supplying a suf-ficiently detailed exposition and various insightful examples. The classical bookson linear differential equations are the treatises [110] and [175].

An extensive literature is devoted to the asymptotics of the Green functionof the Cauchy problem for spatially homogenous ΨDOs, the main emphasis beingusually given to the case of homogeneous symbols. One of the first papers in thisdirection was [77]. The case of the homogeneity indices α ∈ (0, 2) is crucial forprobability theory, since the corresponding Green functions represent transitionprobabilities of stable Levy processes. These transition probabilities have beenstudied in detail by probabilists, leading to very precise asymptotic estimates andexpansions, see, e.g., the classical monograph [267] for the one-dimensional case.For the multi-dimensional case, asymptotic expansions were obtained in [134].Based on the approaches of this paper, we tried to give a concise representationfor a reasonably general case of homogeneous symbols.

Page 302: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

4.18. Summary and comments 287

In many cases, the r.h.s. of an evolutionary equation can be written as thesum (A + L)u of two operators applied to the unknown function, one yielding awell-understood evolution etA (ideally given in a closed form) and the other, L,being in some sense more regular than the other. Perturbation theory is a classicaltool for dealing with such systems, and it has been developed by many authorsbased on various ways of comparing A and L, see, e.g., the well-known books[199, 200, 232]. Our approach is based on the systematic exploitation of the in-tegrability of the singularity of the compositions etAL or LetA. This provides aneffective and systematic way of obtaining various concrete results. The advantageof this approach also lies in its more or less straightforward extensibility to time-nonhomogeneous situations, i.e., to propagators. (Note that this may not be thecase for other methods, e.g., those that are based on resolvents.) For instance,Theorems 4.8.1 and 4.8.2 are obtained in [138] with much harder calculations, andtheir time-nonhomogeneous extension, Theorem 4.14.1, is derived in our approachmore or less automatically, including even more extensions to equations with non-local terms. An interesting problem that was not considered here is the analysisof diffusion equations with measure-valued drifts, see [122] and references therein.

Path integral representations for the solutions of PDEs are the main linkbetween PDEs and probability theory. We are not thoroughly developing this topichere. A specific branch of research on path integrals arises from the equations ofquantum mechanics, because the corresponding Feynmann path integral does notalways match the rigorous probabilistic treatment. In this chapter, we followed anapproach to path integrals as it arises from jump-type processes. This approach toFeynmann path integrals was first considered in [201] and fully developed in [136–138]. It exploits the link between path integrals and perturbation theory, which,in the theory of quantum fields, is usually designated graphically via Feynmanndiagrams. For more details and an extensive bibliography, we refer to [148]. Moredetails on other approaches to Feynmann path integrals can be found, e.g., in [6].

The literature on the infinite-dimensional Riccati equation is also extensive,mainly due to its application in optimal control, see, e.g., McEneaney [204] andreferences therein. It is often analysed in the setting of Hilbert spaces. Following[147], we gave a direct proof of its well-posedness for unbounded families A(t), C(t)in a Banach space via an explicit formula that arises from the ‘interaction repre-sentation’, while we by-passed any optimization interpretations and related tools.The related theory of measure-valued infinite-dimensional diffusion arises natu-rally in the analysis of fluctuations (dynamical central-limit theorem) of variousstatistical mechanical systems, see [143] and [145].

The method of duality is a well-established tool for proving the uniqueness ofevolution equations. It will be further developed in Chapter 5. Its concrete versionin Theorem 4.10.1 follows the exposition from [147], but is close to the classicalexpositions of [219] and [91].

Page 303: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

Chapter 5

Linear Evolutionary Equations:Advanced Theory

In this chapter, we continue the study of linear equations with unbounded r.h.s. inBanach spaces, thereby further developing the methods of semigroups and prop-agators. The first part (Sections 5.1 to 5.3) is devoted to methods for buildingpropagators and their generator families by combining (mixing, adding, makingthem time-varying) the generators of the semigroup. The second part of the chap-ter (Sections 5.4 to 5.7) is devoted to the method of frozen coefficients, which is avariation of the more general method of parametrix. This method aims at buildingsolutions to ΨDEs with variable coefficients from combinations of the solution tothe equations with constant coefficients. It is developed here in a rather generalform and illustrated on various examples, including Levy–Khintchin-type gener-ators and various mixed fractional Laplacians. In the final part of the chapter,general methods for proving uniqueness are introduced, with a discussion of thenotion of generalized solutions as a by-product. Particular attention is given topositivity-preserving evolutions, which leads to the so-called Feller semigroups thatplay a key role in stochastic calculus. As insightful examples of linear equationsand the related semigroups, we discuss in some detail the convolution semigroupsgenerated by Levy–Khintchin operators in Banach spaces and by generators oforder at most one. The chapter is completed with a brief discussion of smoothingand smoothness.

5.1 T -products with three-level Banach towers

Extending the theory of Section 2.3 to unbounded generators, we shall now developthe notion of T -products or chronological products, or operator-valued multiplicativeintegrals. They are an important tool for building propagators that are generatedby families of operators, each of which generates a sufficiently regular semigroup.

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_5

289

Page 304: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

290 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Let Lt, t ∈ R, be a family of operators in the Banach space B such that eachone of them generates a strongly continuous semigroup exp{sLt} in B, having thedense subspace D ⊂ B as as invariant core. The natural question arises whetherthe family Lt generates a propagator that solves the Cauchy problem

u(t) = Ltu(t), t ≥ s, us = u ∈ D, (5.1)

or a backward propagator solving the backward Cauchy problem

u(t) = −Ltu(t), t ≤ s, us = u ∈ D. (5.2)

Since each Lt generates a semigroup, one can expect the solution to be ob-tained by the limit of discrete approximations, where the evolution is definedby the corresponding semigroup for each step. To be precise, for a partitionΔ = {0 = t0 < t1 < · · · < tN = T } of an interval [0, T ], let us define a fam-ily of operators UΔ(τ, s), 0 ≤ s ≤ τ ≤ T , by the following rules:

UΔ(τ, s) = exp{(τ − s)Ltj}, tj ≤ s ≤ τ ≤ tj+1,

UΔ(τ, r) = UΔ(τ, s)UΔ(s, r), 0 ≤ r ≤ s ≤ τ ≤ T.(5.3)

It is clear that each UΔ is a propagator in B with the invariant domain D,generated by the family Ls(Δ), where s(Δ) = tj for tj ≤ s < tj+1.

Remark 85. In this context, the notion of generation is used in the sense of Re-mark 75, since the corresponding differential equations hold outside the finite set{t1, . . . , tN}.

Let Δtj = tj+1 − tj and |Δ| = maxj Δtj . If the limit

Us,rf = lim|Δ|→0

UΔ(s, r)f, s ≥ r, (5.4)

exists for some f and all 0 ≤ r ≤ s ≤ T , it is called the T -product (or chronologicalproduct or chronological exponential) of Lt applied to f . It is denoted by

T exp

{∫ s

r

Lτ dτ

}f = Us,r

for f, r ≤ s,

where the suffix ‘for’ refers to the fact that this T-product is ‘forward’ in time. Asmentioned above, one expects the T -product to provide a solution to (5.1).

The backward Cauchy problem (5.2) can be handled similarly. Namely, fora partition Δ = {0 = t0 < t1 < · · · < tN = t} we define the family of operatorsUΔ(s, τ), 0 ≤ s ≤ τ ≤ t, by the following rules:

UΔ(s, τ) = exp{(τ − s)Ltj+1}, tj ≤ s ≤ τ ≤ tj+1,

UΔ(r, τ) = UΔ(r, s)UΔ(s, τ), 0 ≤ r ≤ s ≤ τ ≤ T.(5.5)

Page 305: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.1. T -products with three-level Banach towers 291

If the limit

Us,rf = lim|Δ|→0

UΔ(s, r)f, s ≤ r, (5.6)

exists for some f and all 0 ≤ s ≤ r ≤ T , it is called the backward T -product of Lt

applied to f . It is denoted by

T exp

{∫ r

s

Lτ dτ

}f = T exp

{∫ s

r

Lτ dτ

}f

= T exp

{∫ r

s

(−Lτ ) dτ

}f = Us,r

backf, s ≤ r.

(5.7)

One expects this T -product to provide a solution to (5.2).

Remark 86. Due to this fact, it is customary in physics to denote propagatorsthat are generated by a family Lt by the corresponding T -product. In fact, onecan show that if the propagator is well defined, then the convergence in (5.6) takesplace under very general conditions. In what follows, we tackle the more compli-cated problem of proving convergence without initially assuming the existence ofa propagator generated by Lt.

For a rigorous treatment, it is handy to work with the common invariant corebeing equipped with its own Banach space structure (two-level Banach tower), aswe did in Proposition 4.13.1.

Theorem 5.1.1. Let B be a Banach space and D a dense subspace therein, whichis itself Banach with respect to the norm ‖.‖D ≥ ‖.‖B. Let a family Ltf , t ∈ [0, T ],of linear operators in B be given such that

(i) each Lt generates a strongly continuous semigroup esLt , s ≥ 0, in B with theinvariant core D such that

‖ exp{sLt}‖B→B ≤ eKs, ‖ exp{sLt}‖D→D ≤ eKs, s, t ∈ [0, T ], (5.8)

with some constant K;

(ii) Lt are uniformly bounded operators D → B, so that ‖L‖D→B ≤ L < ∞.They depend continuously on t in the norm topology of L(D,B).

Then:

(i) the T -products U r,sfor f = T exp{∫ r

sLτ dτ}f and Us,r

backf = T exp{∫ r

s(−Lτ )dτ}f

exist for all f ∈ B, where the limit is understood in the norm of B. Moreover,the convergence in (5.4), (5.6) is uniform in f on any bounded subset of Dand in s, t ∈ [0, T ];

(ii) the T -products U r,sfor and Us,r

back form a bounded strongly continuous propagatorand a backward propagator, respectively, with the bounds

‖U r,sfor ‖B→B ≤ eK(r−s), ‖Us,r

back‖B→B ≤ eK(r−s); (5.9)

Page 306: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

292 Chapter 5. Linear Evolutionary Equations: Advanced Theory

(iii) the T -products U r,sfor and Us,r

back satisfy the following equations:

d

dtUs,tforf = −Us,t

forLtf, s ≥ t (5.10)

andd

dtUs,tbackf = Us,t

backLtf, s ≤ t, (5.11)

for any f ∈ D;

(iv) the dual propagators V t,sfor = (Us,t

back)∗ and V t,s

back = (Us,tfor )

∗ in B∗ solve theweak versions of the dual equations to (5.2) and (5.1):

d

ds(f, V s,t

for ξ) = (Lsf, Vs,tfor ξ), t ≤ s ≤ r, f ∈ D, (5.12)

d

ds(f, V s,t

backξ) = (Lsf, Vs,tbackξ), r ≤ s ≤ t, f ∈ D. (5.13)

Remark 87. Let us highlight the importance of exponential bounds (5.8) thatwe sometimes refer to as regular bounds. In the sequel, they shall be regularlyused. They are much stronger than just the requirement of boundedness of thesemigroups. Semigroups with such estimates for the resolvents are called ‘stable’by some authors (see, e.g., [100]). Examples for the case when this property doesnot hold are given after Proposition 4.4.1.

Proof. For the sake of definiteness, let us work with the forward propagator, thecase of the backward propagator being analogous.

(i) We use formula (4.134) to write

(UΔ(s, r) − UΔ′(s, r))f =

∫ s

r

UΔ′(s, τ)(Lτ(Δ) − Lτ(Δ′))UΔ(τ, r) f dτ,

where s(Δ) = tj for tj ≤ s < tj+1. In fact, each UΔ is a propagator in B with theinvariant domain D generated by the family Lt(Δ).

Remark 88. Unlike the setting of Proposition 4.9.2, the propagators Us,rΔ are dif-

ferentiable only outside a finite subset of s and r. However, this does not affectthe validity of the representation (4.134), see Remark 75.

By (5.8), we find

‖UΔ(τ, r)‖D→D ≤ eK(τ−r), ‖UΔ′(s, τ)‖B→B ≤ eK(s−τ).

The assumed continuity of Lt therefore yields ‖(UΔ(s, r)− UΔ′(s, r))f‖B → 0, as|Δ|, |Δ′| → 0, uniformly in f on any bounded subset of D and in s, t ∈ [0, T ].The existence of the T -product for arbitrary f ∈ B follows by approximating fby elements from D and using the uniform boundedness of all approximationsUΔ(s, r).

Page 307: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.1. T -products with three-level Banach towers 293

(ii) The bound (5.9) follows from the convergence of the approximations withthe same bound. Next, choosing the partitions Δ of [r, τ ] containing the points ∈ (r, τ) in such a way that they are composed of a partition Δ1 of [r, s] and apartition Δ2 of [s, τ ], we get

U τ,rfor = lim

|Δ1|,|Δ2|→0UΔ2(τ, s)UΔ1(s, r) = U τ,s

for Us,rfor

showing the chain rule for U r,sfor .

If f ∈ D, then the equations

UΔ(s, r)f − f =

∫ s

r

Lτ(Δ)UΔ(τ, r)f dτ,

imply that‖UΔ(s, r)f − f‖B ≤ L(s− r)eK(s−r)‖f‖D.

Therefore, the family UΔ(s, r)f is Lipschitz-continuous as a function (s, t) → Buniformly for f bounded in D and all Δ. By approximations, one gets the strongcontinuity of the family U r,s

for in B.

(iii) Applying (4.133) to the propagator UΔ(s, r)f (with the inverted sign,since (4.133) applies to backward propagators) and integrating yields

Us,tΔ f = −

∫ t

s

Us,rΔ Lr(Δ)f dr, f ∈ D. (5.14)

Passing to the limit |Δ| → 0 yields

Us,tforf = −

∫ t

s

Us,rforLrf dr.

For f ∈ D, the function under the integral on the r.h.s. is continuous, whichimplies (5.10) after differentiation.

(iv) Equation (5.13) follows from (5.10) by the definition of duality. �

The difficulty to get Ufor or Uback fully generated by the family Lt in the senseof the ‘strong’ equations (4.129) and (4.130) stems from the fact that under theassumptions of Theorem 5.1.1 there seems to be no way to check the invarianceof D under Ufor or Uback. Therefore, the link between the propagators and the‘generators’ is only given by the weaker equations (5.10) and (5.11). As we shallshow shortly, this issue can be resolved by working with three Banach spaces. Butlet us first specify the notion of solutions arising from Theorem 5.1.1.

Let us say that ut is a generalized solution via discrete approximations to theCauchy problem (5.1) or (5.2), if for any ε > 0 there exists δ such that for anypartition Δ with |Δ| < δ,

‖ut − UΔ(t, s)u‖ < ε, (5.15)

with UΔ given by (5.3), t ∈ [s, T ], or by (5.5), t ∈ [0, s], respectively.

Page 308: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

294 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Theorem 5.1.2. Under the assumptions of Theorem 5.1.1, Us,tforu and Us,t

backu repre-sent the unique generalized solutions to (5.1) or (5.2), respectively. Moreover, anyclassical solution ut (such that ut ∈ D for all t and the equations (5.1) or (5.2)hold rigorously) is a generalized solution.

Remark 89. Defining generalized solutions via discrete approximations is a stan-dard approach in the theory of differential equations. For instance, the so-calledC0-solutions that are used in the Crandall–Ligget theory of accretive operators(see, e.g., [244] or [262]) are defined in this way. Of course, the reasonability ofsuch a definition should be supported by a result like Theorem 5.1.2.

Proof. The uniqueness follows from the existence of the T -product, see Theorem5.1.1(i). Furthermore, suppose that ut is a classical solution with the initial con-dition u ∈ D to, say, equation (5.1). Then

ut − UΔ(t, s)us = UΔ(t, r)ur|ts =∫ t

s

UΔ(t, r)(Lr − Lr(Δ))ur dr,

which tends to 0 as |Δ| → 0. �

As mentioned before, the natural setting for proving that the above T -products solve the equations (5.1) and (5.2) classically is obtained by the methodof embedded Banach spaces or Banach towers. In this context, a triple of spaces(three-level Banach tower) is sufficient. Namely, let us introduce another densesubspace D ⊂ D ⊂ B such that D is itself a Banach space under the norm‖.‖D ≥ ‖.‖D.Theorem 5.1.3. Under the assumptions of Theorem 5.1.1, suppose additionally that

(i) the Lt are also uniformly bounded as operators D → D, so that ‖Lt‖D→D ≤L < ∞;

(ii) D is also invariant under all esLt , and these operators are uniformly boundedas operators in D, with their norms not exceeding eKs.

Then, if f ∈ D, the approximations UΔ(s, r) converge also in D. Therefore, thespace D is invariant under all T -products U t,s

for and U t,sback. Moreover, these T -

products define strongly continuous propagators in D which solve the problems(5.1) or (5.2) for any f ∈ D.

Proof. Using the pair (D,D) instead of (D,B), one shows analogously to Theorem5.1.1 the convergence in D for f ∈ D first, and then extends the result to all f ∈ Dby a density argument. Similarly, one shows strong continuity in D first for f ∈ Dand then for all f ∈ D. Once this is proven, the equations (5.1) or (5.2) can beobtained either by passing to the limit in similar equations for the approximationsU t,sΔ , or from the equations (5.10), (5.11) by the arguments used in Proposition

4.9.1. �

Page 309: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.1. T -products with three-level Banach towers 295

As another example for the use of this three-spaces-technique, let us extend(at least partially) Proposition 4.2.6 to the preservation of regularity for propaga-tors. To this end, we assume the Banach spaces D ⊂ D ⊂ B as above.

Proposition 5.1.1. Let U t,s be a strongly continuous propagator (or backward pro-pagator) of linear operators in B generated by the family At on the invariantdomain D. Assume next that the space D is also invariant, that all operators U t,s

are uniformly bounded and strongly continuous as operators D → D and D → D,and that the At are bounded and strongly continuous as operators D → B andD → D. Then the propagator (or backward propagator) U t,s in D is generated bythe same family At on the invariant domain D.

Proof. Let us talk about backward propagators for the sake of definiteness. SinceU t,s is generated by At on D, we have

U t,sf − f = (s− t)Asf +

∫ s

t

(ArUr,s −As)f dr

for any f ∈ D. If f ∈ D, then the function under the integral is continuous in thetopology of D, because U t,s is strongly continuous in D and the At are continuousas mappings D → D. This implies

lims−t→0+

∥∥∥∥ 1

s− t(U t,sf − f)−Asf

∥∥∥∥D

= 0,

so that equation (4.131) holds in the topology of D. The rest follows from Propo-sition 4.9.1. �

As a direct consequence of the obtained results, we can extend Theorem 4.3.1to the time-dependent case (with stronger assumptions on regularity):

Theorem 5.1.4. In the diffusion operators

Ltu =1

2(A(t, x)∇,∇)u + (b(t, x),∇)u(x), x ∈ Rd, (5.16)

let A(t, x) = σ(t, x)σT (t, x), and let both σ and b be continuous and belong toC4(Rd) as functions of x. Then the family of operators Lt in (5.16) generates astrongly continuous backward propagator Us,t in C∞(Rd) on the common invariantdomain C2

∞(Rd), so that ‖Us,t‖L(C2(Rd)) ≤ eKt.

Remark 90. In this setting, the direct probabilistic approach still gives betterresults, since it allows for a straightforward time-dependent extension of Theorem4.3.1 without strengthening the regularity assumptions.

Page 310: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

296 Chapter 5. Linear Evolutionary Equations: Advanced Theory

5.2 Adding generators with 4-level Banach towers

We address now the problem of ‘adding generators’, i.e., constructing a propagatorthat is generated by the family Lt

1 + · · · + Ltn from the propagators that are

generated by each Ltj separately. We start with the case n = 2.

The Lie–Trotter–Daletski–Chernoff formula

eL1+L2 = limn→∞(eL1/neL2/n)n (5.17)

was established and widely applied under various assumptions for linear opera-tors L1, L2. In most cases, it is obtained under the condition that the semigroupexp{t(L1 + L2)} exists, see [232] and the bibliography therein. We are now goingto discuss a situation where the semigroups generated by L1, L2 are sufficientlyregular for deducing the existence of the semigroup generated by L1 + L2, foridentifying its invariant core and for getting the precise rates of convergence. Fur-thermore, we will extend this result to time-dependent generators.

For two given operators L1, L2 that generate the bounded semigroups etL1

and etL2 in a Banach space and for a given τ > 0, let us define the family ofbounded operators U τ

t , t > 0, in the following way: For k = 0 or k ∈ N, let

U τt = e(t−2kτ)L1(eτL2eτL1)k, 2kτ ≤ t ≤ (2k + 1)τ, (5.18)

U τt = e(t−(2k+1)τ)L2eτL1(eτL2eτL1)k, (2k + 1)τ ≤ t ≤ (2k + 2)τ. (5.19)

As in Theorem 5.1.3, we shall work with Banach towers. Here, however, thefour-level tower D3 ⊂ D2 ⊂ D ⊂ B turns out to be useful. (A definition of Banachtowers can be found prior to Proposition 4.2.6.) As the following results reveal,higher-level Banach towers can often be used as an intermediate tool, when theoperators under analysis belong to a class where more regular approximations arenaturally available. In particular, this is the case for the wide class of pseudo-differential operators (see the discussion prior to Theorem 5.15.1).

Theorem 5.2.1. Suppose that

(i) the linear operators L1, L2 in B generate strongly continuous semigroups etL1

and etL2 in B with D being their common invariant core, and

max(‖etL1‖B, ‖etL2‖B, ‖etL1‖D, ‖etL2‖D

) ≤ eKt

with a constant K;

(ii) D2 and D3 are also invariant under etL1 and etL2 with

max(‖etL1‖D2 , ‖etL2‖D2

) ≤ eK2t, max(‖etL1‖D3 , ‖etL2‖D3

) ≤ eK3t,

with constants K2,K3, where we assume for the sake of definiteness thatK3 ≥ K2 ≥ K;

Page 311: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.2. Adding generators with 4-level Banach towers 297

(iii) L1, L2 are bounded as operators D → B, D2 → D and D3 → D2 with normsthat are bounded by some constants LDB, L2, L3.

Then

(i) for any T > 0 and f ∈ B, the curves U τkt f , τk = 2−k, converge in C([0, T ], B)

to a curve Utf , as k → ∞. For f ∈ D, this convergence holds in C([0, T ], D),so that Utf ∈ C([0, T ], D);

(ii) the norms ‖Ut‖B and ‖Ut‖D are bounded by eKt;

(iii) for f ∈ D2 and τk < min(1, t/2), we have

‖(U τkt − Ut)f‖B ≤ 6LDBL2e

2K2τkeK2ttτk

1− τk‖f‖D2; (5.20)

(iv) the operators Ut form a strongly continuous semigroup in B with the gener-ator (L1 + L2)/2, with D being its invariant core;

(v) the curve Utf is a Lipschitz-continuous function t �→ Utf ∈ D for anyf ∈ D2.

Proof. Let τ ≤ t ∈ [0, T ] with an arbitrary given T > 0. Assumption (ii) implies

‖U τt ‖B→B ≤ eKt, ‖U τ

t ‖D→D ≤ eKt, ‖U τt ‖D2→D2 ≤ eK2t. (5.21)

Next, for any f ∈ D and i = 1, 2, we find

etLif − f =

∫ t

0

LiesLif ds, (5.22)

which implies‖etLif − f‖B ≤ tLDBe

Kt‖f‖D,‖etLif − f‖D ≤ tL2e

K2t‖f‖D2 .(5.23)

Moreover, we find

etLif = f + tLif +

∫ t

0

Li(esLif − f) ds

which implies‖etLif − f − tLif‖B ≤ t2LDBL2e

K2t‖f‖D2 . (5.24)

Consequently,

‖etL2etL1f − etL2f − tetL2L1f‖B ≤ t2LDBL2e2K2t‖f‖D2,

and therefore, by approximating etL2f by f + tL2f via (5.24) and tetL2L1f bytL1f via (5.22) applied to tL1f , the following estimate holds:

‖etL2etL1f − f − t(L1 + L2)f‖B ≤ 3t2LDBL2e2K2t‖f‖D2 . (5.25)

Page 312: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

298 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Consequently,

‖(etL1etL2 − etL2etL1)f‖B ≤ 6t2LDBL2e2K2t‖f‖D2 , (5.26)

and therefore

‖(e2tL1e2tL2 − etL1etL2etL1etL2)f‖B = ‖etL1(etL1etL2 − etL2etL1)etL2f‖B≤ 6t2LDBL2e

4K2t‖f‖D2.

Writing now

(eτL2eτL1)k − (eτL2/2eτL1/2)2k

=

k∑l=1

(eτL2eτL1)k−l[eτL2eτL1 − (eτL2/2eτL1/2)2](eτL2/2eτL1/2)2l−2,

we can conclude that∥∥∥((eτL2eτL1)k − (eτL2/2eτL1/2)2k)f∥∥∥B≤ 6kτ2LDBL2e

2K2τeK2t‖f‖D2,

so that

‖(U τt − U

τ/2t )f‖B ≤ 6τtLDBL2e

2K2τeK2t‖f‖D2. (5.27)

Consequently, for a natural number k < l and τ < 1, we find

‖(U τkt − U τl

t )f‖B ≤∑n>k

‖(U τnt − U

τn+1

t )f‖B ≤ 6LDBL2e4K2τeK2t

tτk1− τk

‖f‖D2,

which implies that U τkt converges in C([0, T ], B) as k → ∞ to a curve (which we

denote by Utf), and that the estimate (5.20) holds. Since the Ut are uniformlybounded, we can deduce by the usual density argument that U τk

t f converges forany f ∈ B, and that the limiting set of operators Ut forms a bounded semigroupin B.

Repeating the above arguments for the triple of spaces D3 ⊂ D2 ⊂ D yieldsthe convergence of the approximations in D and therefore the invariance of Dunder Ut and the required bound for Ut in D.

Next, by (5.23), we find

∥∥∥∥∥k∏

i=1

eτLk · · · eτL1f − f

∥∥∥∥∥D

=

∥∥∥∥∥k∑

i=1

(eτLi − 1)i−1∏l=1

eτLlf

∥∥∥∥∥D

≤ τkL2ekK2τ‖f‖D2,

where each Li is any of the operators L1, L2. Therefore,

‖U τt f − f‖D ≤ tL2e

K2t‖f‖D2,

Page 313: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.2. Adding generators with 4-level Banach towers 299

and thus, for t2 > t1,

‖U τt2f − U τ

t1f‖D = ‖(U τt2−t1 − 1)U τ

t1f‖D ≤ (t2 − t1)L2eK2t2‖f‖D2 , (5.28)

which proves the Lipschitz continuity of the approximations and therefore also ofthe limiting propagator in D.

It remains to prove statement (iv) of the theorem. Let first f ∈ D2 and let tbe a binary rational. Then t/2τk ∈ N for k large enough, so that

U τkt f = (eτkL2eτkL1)t/2τk

and

U τkt f − f =

(t/2τk)−1∑l=0

(eτkL2eτkL1 − 1)(eτkL2eτkL1)lf.

Therefore, (5.25) leads to

∥∥∥∥U τkt f − f − τk

(t/2τk)−1∑l=0

(L1 + L2)(eτkL2eτkL1)lf

∥∥∥∥B

=

∥∥∥∥(t/2τk)−1∑

l=0

[eτkL2eτkL1 − 1− τk(L1 + L2)](eτkL2eτkL1)lf

∥∥∥∥B

≤ 3

(t/2τk)−1∑l=0

τ2k LDBL2e2K2τkl‖f‖D2 ≤ 3tτke

tK2LDBL2‖f‖D2.

Consequently,

U τkt f − f =

1

2(2τk)

(t/2τk)−1∑l=0

(L1 + L2)U2lτkf

+ τk

(t/2τk)−1∑l=0

(L1 + L2)[Uτk2lτk

− U2lτk ]f + αk,

‖αk‖B ≤ 3tτketK2LDBL2‖f‖D2 .

Passing to the limit as k → ∞ in the topology of B and applying (5.20) yields

Utf − f =1

2

∫ t

0

(L1 + L2)Usf ds. (5.29)

By a density argument, the same formula holds for any f ∈ D. Due to the conti-nuity of Utf in D, it follows from (5.29) that

d

dt|t=0Utf =

1

2(L1 + L2)f

for any f ∈ D, where the derivative is defined in the topology of B. �

Page 314: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

300 Chapter 5. Linear Evolutionary Equations: Advanced Theory

The possibility of regular approximations provides a way to get rid of lengthyBanach towers. Namely, as a direct consequence of Theorem 5.2.1 and Proposition4.2.5, we can conclude the following:

Theorem 5.2.2.

(i) Suppose that L1, L2 ∈ L(D,B) and there exists a sequence of pairs of oper-ators Ln

1 , Ln2 , n ∈ N, satisfying the assumptions of Theorem 5.2.1 for each

n, with the constant K being independent of n, and such that Ln1 → L1 and

Ln2 → L2, as n → ∞, in the operator topologies of the space L(D,B). Then

the semigroups exp{tL1}, exp{tL2} and exp{t(L1 + L2)/2} converge in Bto some strongly continuous semigroups T 1

t , T2t and T 12

t , respectively, suchthat D belongs to the domain of the generators of these semigroups. Theirgenerators coincide with L1, L2 and (L1 + L2)/2, respectively.

(ii) Suppose additionally that the constant K2 is independent of n and the con-vergence Ln

1 → L1 and Ln2 → L2 holds also in the operator topologies of

the space L(D2, D). Then the space D is an invariant core for the semi-groups T 1

t , T2t , T

12t , where these semigroups are generated by the operators

L1, L2, (L1 + L2)/2 respectively.

Another way of reducing the length of the tower is based on additional dualityassumptions, as shown in the following result.

Theorem 5.2.3. Suppose that

(i) the linear operators L1, L2 in B generate strongly continuous semigroups etL1

and etL2 in B with the spaces D and D2 being invariant, such that etLj , j =1, 2, have norms that are bounded by eKt with some K in all spaces B,D,D2.Moreover, L1 and L2 are bounded as operators D → B and D2 → D withnorms that are bounded by some constant L.

(ii) the spaces D and B are dual Banach spaces, i.e., B = E∗B, D = E∗

D for someBanach spaces EB ⊂ ED.

Then the semigroups U τkt converge to a strongly continuous semigroup Ut in B

with an invariant core D, where it is generated by (L1 + L2)/2.

Proof. Following the proof of Theorem 5.2.1, we obtain (5.27) and therefore theconvergence of U τk

t in B, as well as the Lipschitz continuity in t of the curves U τkt f

in both spaces B and D for any f ∈ D2. Using the Banach–Alaoglu theorem on thecompactness of the unit ball in D = E∗

D in the ∗-weak topology, we can concludethat for any f ∈ D the sequence U τk

t f is bounded in D and therefore relativelycompact in the ∗-weak topology. Hence there exists a converging subsequence fl.But since EB ⊂ ED, it follows that fl converges ∗-weakly in B. But then its limit isUtf . Therefore, all converging subsequences have the same limit. Consequently, thesequence U τk

t f converges ∗-weakly in D, and hence Utf ∈ D. Therefore, D turnsout to be invariant under Ut, and the Ut are bounded operators in D. Moreover,

Page 315: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.2. Adding generators with 4-level Banach towers 301

the Lipschitz continuity of U τkt f in D implies that the Utf are also Lipschitz-

continuous in D. The proof that Ut is generated by (L1 +L2)/2 in D is similar tothe proof of Theorem 5.2.1. �

It is more or less straightforward to extend the theory to a finite number ofgenerators Lj, j = 1, . . . , n, with the approximations being defined as

U τt = e(t−(nk+l)τ)Ll+1eτLl · · · eτL1(eτLn · · · eτL1)k,

(nk + l)τ ≤ t ≤ (nk + l + 1)τ.(5.30)

Theorem 5.2.4. Suppose that the operators Lj in B, j = 1, . . . , n, satisfy all theconditions assumed for L1, L2 in Theorem 5.2.1. Then for any T > 0 and f ∈ B,the curves U τk

t f , τk = 2−k, converge in C([0, T ], B) to a curve Utf , as k → ∞,and for f ∈ D this convergence holds in C([0, T ], D), so that Utf ∈ C([0, T ], D).Moreover, the norms ‖Ut‖B and ‖Ut‖D are bounded by eKt and the operators Ut

form a strongly continuous semigroup in B with the generator (L1 + · · · + Ln)/nand the invariant core D.

Proof. From (5.24), we derive (5.25) as in Theorem 5.2.1. Then we repeat theprocedure. Namely, from (5.25), we can derive

‖etL3etL2etL1f − etL3f − tetL3(L1 + L2)f‖B ≤ 3t2LDBL2e3K2t‖f‖D2 , (5.31)

and consequently

‖etL3etL2etL1f − f − t(L1 + L2 + L3)f‖B≤ 3t2LDBL2e

3K2t‖f‖D2 + ‖etL3f − f − tL3f‖B + t‖(etL3 − 1)(L1 + L2)f‖B≤ 6t2LDBL2e

3K2t‖f‖D2 .

By induction, it follows that

‖etLn · · · etL1f−f−t(L1+ · · ·+Ln)f‖B ≤ 1

2n(n+1)t2LDBL2e

nK2t‖f‖D2, (5.32)

and therefore

‖etLn · · · etL1f − etLπ(n) · · · etLπ(1)f‖B ≤ n(n+ 1)t2LDBL2enK2t‖f‖D2, (5.33)

where π is any permutation of the set {1, . . . , n}. Consequently,‖etLn · · · etL1f − (etLn/2 · · · etL1/2)2f‖B ≤ 2n(2n+ 1)t2LDBL2e

nK2t‖f‖D2 ,

and therefore

‖(eτkLn · · · eτkL1)kf − (eτLn/2 · · · eτkL1/2)2kf‖B≤ n(n+ 1)tτkLDBL2e

nK2τkeKt‖f‖D2 .(5.34)

The rest of the proof is as in Theorem 5.2.1. �

Page 316: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

302 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Exercise 5.2.1. Derive the analogue of Theorems 5.2.2 and 5.2.3 for finite familiesof operators Lj.

As an example, let us consider the heat-conduction equation in Rd withbounded sources and unbounded sinks:

∂ft∂t

=1

2Δft − V (x)ft, (5.35)

where V function that is bounded from below and has a polynomial growth atinfinity. This example also illustrates the importance of using weighted spaces ofmeasures and functions. For the sake of simplicity, we shall work with positive V .

Since we aim at applying Theorem 5.2.1, we note that the operator Δ/2 gen-erates the heat semigroup Tt = exp{tΔ/2} in B = C∞(Rd), given by (4.13), withinvariant cores Ck

∞(Rd) for any k ≥ 2, and the operator (−V ) of multiplication by(−V (x)) generates the semigroup exp{−tV } of multiplication by exp{−tV (x)} inB = C∞(Rd) with cores being spaces of functions that decrease sufficiently fastat infinity. In order to find a common core, let us write down the derivatives ofthe semigroup exp{−tV }:

∂xj

(e−tV (x)f(x)

)= − t

∂V

∂xj(x)e−tV (x)f(x) + e−tV (x) ∂f

∂xj, (5.36)

∂2

∂xi∂xj

(e−tV (x)f(x)

)= − t

∂2V

∂xi∂xj(x)e−tV (x)f(x) + t2

∂V

∂xi(x)

∂V

∂xj(x)e−tV (x)f(x)

− t∂V

∂xie−tV (x) ∂f

∂xj− t

∂V

∂xje−tV (x) ∂f

∂xi+ e−tV (x) ∂2f

∂xi∂xj.

(5.37)

For any k ∈ R, let us now introduce the spaces Ck of continuous functions on Rd

with a finite norm‖f‖k = sup

x(|f(x)|(1 + |x|k)),

and by Ck,∞ its subspace of functions such that |f(x)|(1 + |x|k) → 0 as x → ∞.Let us denote by ∇lf the collection of all partial derivatives of f of order l, andlet us say that ∇lf ∈ Ck or ∇lf ∈ Ck,∞ if all these derivatives belong to Ck orCk,∞, respectively, with the norm ‖∇lf‖k being defined as the supremum of theCk-norms of all partial derivatives of order l. The next result identifies the simplestcommon core for the semigroups exp{tΔ/2} and exp{−tV } in B = C∞(Rd).

Lemma 5.2.1. Assume that V ≥ 0 and, for some even k > 0, that V,∇V,∇2V ∈C−k. Then the space

D = {f ∈ C2k,∞ : ∇f ∈ Ck,∞, ∇2f ∈ C0,∞ = B}is an invariant core for both exp{tΔ/2} and exp{−tV }. If D is equipped with thenorm

‖f‖D = ‖f‖2k + ‖∇f‖k + ‖∇2f‖B,

Page 317: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.2. Adding generators with 4-level Banach towers 303

then the semigroups exp{tΔ/2} and exp{−tV } act strongly continuously in D andV ,Δ ∈ L(D,B). Moreover, for t ∈ (0, 1), we have

‖ exp{−tV }‖D→D ≤ 1 + κt, ‖ exp{−tΔ/2}‖D→D ≤ 1 + κt, (5.38)

with some constant κ depending on d and on the norms of V,∇V,∇2V in thecorresponding spaces.

Proof. It can be seen from (5.36) that ∇(e−tV (x)f(x)) ∈ Ck,∞ whenever ∇f ∈Ck,∞ and f ∈ C2k,∞. (5.37) implies that ∇2(e−tV (x)f(x)) ∈ B whenever∇2f ∈ B,

∇f ∈ Ck,∞ and f ∈ C2k,∞. This implies the invariance of D under exp{−tV }. Theinvariance of D under exp{tΔ/2} follows from (4.27) and (4.28). The estimates(5.38) also follow from (5.36) and (5.37). �

The following lemma is proved in a completely analogous manner.

Lemma 5.2.2. Assume that V ≥ 0 and, for some even k > 0, that V , ∇V , ∇2V ,∇3V , ∇4V ∈ C−k. Then the space

D ={f ∈ C4k,∞ : ∇f ∈ C3k,∞, ∇2f ∈ C2k,∞, ∇3f ∈ Ck,∞, ∇4f ∈ B

}is also an invariant core for both exp{tΔ/2} and exp{−tV }. If D is equipped withthe norm

‖f‖D = ‖f‖4k + ‖∇f‖3k + ‖∇2f‖2k + ‖∇3f‖k + ‖∇4f‖B,then the semigroups exp{tΔ/2} and exp{−tV } act strongly continuously in D andV ,Δ ∈ L(D,D). Moreover, for t ∈ (0, 1), we have

‖ exp{−tV }‖D→D ≤ 1 + κt, ‖ exp{−tΔ/2}‖D→D ≤ 1 + κt, (5.39)

with some constant κ depending on d and on the norms of V and its derivativesin C−k.

The next result is a direct consequence of the Lemmas 5.2.1, 5.2.2 and The-orem 5.2.2.

Proposition 5.2.1. Under the assumptions of Lemma 5.2.2, the operator −V +Δ/2generates a strongly continuous semigroup in B, with D as introduced in Lemma5.2.1 being its invariant core. In particular, the Cauchy problem for equation (5.35)in B is well posed for initial conditions in D.

Exercise 5.2.2. Derive the analogue of Proposition 5.2.1 for a time-dependentfamily Vt using Theorem 5.1.1.

Exercise 5.2.3. Use Theorems 5.2.4 and 4.3.1 to show that the diffusion operator(5.16) generates a strongly continuous semigroup whenever

A(t, x) =n∑

j=1

σj(x)σTj (x)

and b, σj are sufficiently regular.

Page 318: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

304 Chapter 5. Linear Evolutionary Equations: Advanced Theory

5.3 Mixing generators

In this section, we extend the above construction to time-dependent mixturesof an infinite number of generators. The extension to a finite number of time-dependent terms is more or less straightforward. For this purpose, the Banachspaces D3 ⊂ D2 ⊂ D ⊂ B are supposed to form a four-level tower as in Theorem5.2.1.

For two families Ls1 and Ls

2, s ≥ 0, of linear operators in B such that eachLsi generates a bounded semigroup, and for a given τ > 0, let us define the family

U τt in the following way:

U τ2kτ,2lτ = exp{τL(2k−1)τ

2 } exp{τL(2k−2)τ1 } · · · exp{τL(2l+1)τ

2 } exp{τL2lτ1 };

k, l ∈ N, k > l;(5.40)

U τt,s = exp{(t− s)L2kτ

1 }, 2kτ ≤ s ≤ t ≤ (2k + 1)τ, (5.41)

U τt,s = exp{(t− s)L

(2k+1)τ2 }, (2k + 1)τ ≤ s ≤ t ≤ (2k + 2)τ. (5.42)

For other t, s, it is obtained by gluing in such a way that U τt,s becomes a propagator.

Theorem 5.3.1. Suppose that

(i) the linear operators Ls1 and Ls

2, s ∈ [0, T ], generate strongly continuous semi-groups exp{tLs

i} in B with the common invariant core D, and

max (‖ exp{tLs1}‖B, ‖ exp{tLs

2}‖B, ‖ exp{tLs1}‖D, ‖ exp{tLs

2}‖D) ≤ eKt

with a constant K;

(ii) D2 and D3 are also invariant under exp{Ls1} and exp{tLs

2} with

max (‖ exp{tLs1}‖D2, ‖ exp{tLs

2}‖D2) ≤ eK2t,

max (‖ exp{tLs1}‖D3, ‖ exp{tLs

2}‖D3) ≤ eK3t,

with constants K3 ≥ K2 ≥ K;

(iii) Ls1, L

s2 are bounded as operators D → B, D2 → D and D3 → D2 with norms

that are bounded by some constants LDB, L2, L3;

(iv) the Ltj depend Lipschitz-continuously on t in the following sense:

‖(Lt+τj − Lt

j)f‖B ≤ κτ‖f‖D2 , ‖(Lt+τj − Lt

j)f‖D ≤ κτ‖f‖D3 (5.43)

uniformly for finite t, τ .

Then the propagators U τkt,s converge in C([0, T ], B) and in C([0, T ], D) to a prop-

agator Ut,s such that

‖Ut,s‖B ≤ eK(t−s), ‖Ut,s‖D ≤ eK(t−s);

and the propagator Ut,s is generated on D by the family (Lt1 +Lt

2)/2 (in the senseof the definition given prior to Proposition 4.9.1).

Page 319: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.3. Mixing generators 305

Proof. This follows a similar path as in the proof for Theorem 5.2.1. Namely,instead of (5.25) one gets

‖ exp{τLt11 } exp{τLt2

2 }f − f − τ(Lt11 + Lt2

2 )f‖B ≤ 3τ2LDBL2e2K2t‖f‖D2 , (5.44)

for any t1, t2 ∈ (0, t). Together with (5.43), this implies the following modificationof (5.26):

‖(exp{tLt11 } exp{tLt2

2 } − exp{tLt22 } exp{tLt1

1 })f‖B≤ 6t2LDBL2e

2K2t‖f‖D2 + 2κt2‖f‖D2.(5.45)

Therefore, instead of (5.27) one gets the estimate

‖(U τt,s − U

τ/2t,s )f‖B ≤ τ(t− s)C(T )‖f‖D, (5.46)

with another constant C(T ). The remaining argument of Theorem 5.2.1 can besimilarly applied. �Exercise 5.3.1. Extend Theorems 5.2.2 and 5.2.3 to a time-dependent setting.

Exercise 5.3.2. Extend Theorems 5.45, 5.2.2 and 5.2.3 to the case of n familiesof operators Ls

i , i = 1, . . . , n, in B generating strongly continuous semigroupsexp{tLs

i}, and show that the time-dependent extensions of the approximations(5.30) converge to the propagator that is generated by the family (Ls

1+ · · ·+Lsn)/n

on D.

The main result of this section concerns infinite mixtures of the generators.Let D4 ⊂ D3 ⊂ D2 ⊂ D ⊂ B be a 5-level Banach tower, as defined prior toProposition 4.2.6.

Theorem 5.3.2. Suppose that we are given a family of operators L(x) in B depend-ing on a parameter x ∈ Rd such that L(x) are bounded operators D4 → D3, D3 →D2, D2 → D and D → B with norms not exceeding the constants L4, L3, L2andL1,respectively, so that each L(x) generates a strongly continuous semigroup etL(x) inB, with D their common invariant core and with all Dj being invariant. More-over, assume that the mapping x �→ L(x) is continuous in the operator topologiesof L(D4, D3), L(D3, D2), L(D2, D), L(D,B), and that the norms of the opera-tors etL(x) in the spaces D4, D3, D2, D,B are bounded by eK4t, eK3t, eK2t, eK1t

and eKt, respectively, with some constants Kj ,K. Let μt be a Lipschitz-continuous(in the norm topology of M(Rd)) family of probability measures on Rd. Then thetime-dependent family of operators Lt =

∫L(x)μt(dx) generates strongly continu-

ous forward and backward propagators in B with the common domain D such thatthe norms of U t,s in B and D do not exceed eK|t−s| and eK2|t−s|, respectively.

Proof. Let us work with forward propagators for the sake of definiteness. The ideais to approximate the integral Lt by Riemannian integral sums and use the versionof Theorem 5.2.4 with time-dependent generators (see Exercise 5.3.2) to constructthe propagators that are generated by these integral sums. More concretely, by

Page 320: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

306 Chapter 5. Linear Evolutionary Equations: Advanced Theory

applying this latter result to the 4-level tower D3 ⊂ D2 ⊂ D ⊂ B, for any finitepartition Δ = {Aj} of Rd into a disjoint union of measurable sets Aj and anypoints xj ∈ Aj , the family of operators

LΔt =

∑jμt(Aj)L(xj) =

1

n

∑jLtj, Lt

j = nμt(Aj)L(xj),

generates a propagator U t,sΔ in B on the common invariant domain D such that

‖U t,sΔ ‖B→B and ‖U t,s

Δ ‖D→D are bounded by eK|t−s| and eK2|t−s|, respectively.In fact, estimates of the type (5.43) follow from the Lipschitz continuity of μt.Moreover, although

‖ exp{tLsj}‖ ≤ exp{tnKμs(Aj)}

does not provide a uniform bound as required in Theorem 5.2.4, it can be shownthat their product is uniformly bounded:

‖ exp{tLs1} · · · exp{tLs

n}‖B ≤ exp{tnKμs(A1)} · · · exp{tnKμs(An)} = etnK .

This is sufficient for the proof of Theorem 5.2.4 (or its version with time-dependentgenerators).

Using the 4-level tower D4 ⊂ D3 ⊂ D2 ⊂ D, we show next that the spaceD2 is also invariant under U t,s

Δ , whereby these operators are bounded in norm byetK2 .

Since μt is a continuous curve, one can choose, for any ε, the cube [−R,R]d ⊂Rd such that μt(R

d \ [−R,R]d) < ε. Using the continuity of L(x) in Rd andhence its uniform continuity in [−R,R]d, we can find a partition of [−R,R]d,[−R,R]d = ∪n

j=1Aj , such that

‖L(x)− L(y)‖D→B < ε, ‖L(x)− L(y)‖D2→D < ε, ‖L(x)− L(y)‖D3→D2 < ε

for any j and any x, y ∈ Aj . Consequently, for the partition Δε = {Aj} ∪ (Rd \[−R,R]d) ofRd, the norms of the operator LΔ

t −Lt in the spaces L(D,B),L(D2, D)and L(D3, D2) are bounded by ε(1 + L1), ε(1 + L2) and ε(1 + L3), respectively.Therefore, we can apply Proposition 4.2.5 (more precisely, its direct extension totime-dependent generators) to derive the convergence of the propagators U t,s

Δεto

the required propagatorU t,s generated by the family Lt on the invariant domainD.�

In applications to nonlinear equations, one usually works with weakly contin-uous curves μt, rather than strongly continuous ones. This issue is now addressed.Also, we relax the assumptions on the regularity of L(x), which can be substi-tuted by appropriate approximations, as was the case with the addition of a finitenumber of generators.

Theorem 5.3.3. Let L(x) be a family of operators in B such that the mappingx �→ L(x) is continuous in the operator topologies of the spaces L(D2, D) and

Page 321: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.3. Mixing generators 307

L(D,B). Let a sequence of families Ln(x) exist that satisfies the assumptions ofTheorem 5.3.2 with constants K,K1,K2 independent of n and such that

supx

‖Ln(x) − L(x)‖D→B → 0, supx

‖Ln(x)− L(x)‖D2→D → 0, (5.47)

as n → ∞. Let μt be a weakly continuous family of probability measures on Rd,which is Lipschitz-continuous in the topology of the space (Ck(Rd))∗ with somek ∈ N, that is

∣∣∣∣∫

φ(x)(μt(dx)− μs(dx))

∣∣∣∣ ≤ κ|t− s|‖φ‖Ck(Rd)

for all φ ∈ Ck(Rd) and a constant κ. Then the time-dependent family of operatorsLt =

∫L(x)μt(dx) generates strongly continuous forward and backward propaga-

tors in B with the common domain D such that the norms of U t,s in B and D donot exceed eK|t−s| and eK1|t−s|, respectively.

Proof. Again, let us work with forward propagators. The idea is to approximate μt

by strongly continuous measure curves. For this purpose, we can use the standardapproximation (1.5). However, we must additionally use a cutoff in x, i.e., theapproximations

fnt (x) = φ(x/n)nd

∫φ(n(x− y))μt(dy), (5.48)

and choose φ ∈ Ck(Rd) such that φ(x) equals 1 in a neighbourhood of the origin.

Notice first that the curves fnt are Lipschitz-continuous in t in the norm

topology of L1(Rd), so that the measure-valued curves μn

t with the densities fnt

are Lipschitz-continuous in t in the norm topology of M(Rd). In fact,

|fnt (x)− fn

s (x)| = φ(x/n)nd

∣∣∣∣∫

φ(n(x− y))(μt(dy)− μs(dy))

∣∣∣∣≤ κ|t− s|nd+k‖φ‖Ck(Rd)φ(x/n),

and thus ∫|fn

t (x)− fns (x)| dx ≤ κ|t − s|n2d+k‖φ‖Ck(Rd).

Therefore, for any n, the operators Ln(x) and the curves μnt satisfy all assump-

tions of Theorem 5.3.2. Consequently, the families of the operators∫Ln(x)μ

nt (dx)

generate forward propagators U t,sn such that

‖U t,sn ‖B→B ≤ eKt, ‖U t,s

n ‖D→D ≤ eK1t, ‖U t,sn ‖D2→D2 ≤ eK2t.

Page 322: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

308 Chapter 5. Linear Evolutionary Equations: Advanced Theory

To complete the proof, we apply Proposition 4.2.5, more precisely its direct ex-tension to propagators. For doing so, we have to know that∥∥∥∥

∫Ln(x)μ

nt (dx) −

∫L(x)μt(dx)

∥∥∥∥D→B

→ 0,∥∥∥∥∫

Ln(x)μnt (dx) −

∫L(x)μt(dx)

∥∥∥∥D2→D

→ 0,

as n → ∞. Due to (5.47), it is sufficient to show that

∥∥∥∥∫

L(x)μnt (dx) −

∫L(x)μt(dx)

∥∥∥∥D→B

→ 0,

(5.49)∥∥∥∥∫

L(x)μnt (dx) −

∫L(x)μt(dx)

∥∥∥∥D2→D

→ 0.

Of course, we expect this to be true, since the μnt converge weakly to μt. To be

more precise, let us write down, e.g., the first limit in (5.49) in more detail:∥∥∥∥∫ ∫

L(x)φ(x/n)ndφ(n(x− y))μt(dy) dx−∫

L(x)μt(dx)

∥∥∥∥D→B

→ 0.

Since φ(x/n) → 1, as n → ∞, it is sufficient to show that∥∥∥∥∫ ∫

L(x)ndφ(n(x − y))μt(dy) dx −∫

L(y)μt(dy)

∥∥∥∥D→B

→ 0,

or equivalently∥∥∥∥∫ ∫

(L(x) − L(y))ndφ(n(x − y))μt(dy) dx

∥∥∥∥D→B

→ 0. (5.50)

But this follows from the continuity of L(x) (and hence its uniform continuity oncompact subsets of Rd) and the tightness of the family μt. �

5.4 The method of frozen coefficients: heuristics

The method of frozen coefficients is a classical approach to solving equations withvariable coefficients by approximating the solution with the solutions of equationsthat have constant, or frozen, coefficients. We start with some formal calculationsthat will suggest some natural assumptions on the operator symbols, which arethen given a rigorous form.

Let ψt(x,−i∇) be a time-dependent family of pseudo-differential operatorswith symbols ψt(x, p) whose real part is bounded from below. As basic examples,

Page 323: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.4. The method of frozen coefficients: heuristics 309

one can have in mind the diffusion operator − 12 (At(x)∇,∇), a fractional Laplacian

with a variable scale, ψ(x,−i∇) = σt(x)|Δ|α, or a more general ΨDO havinghomogeneous symbols with position- and time-dependent coefficients, ψ(x, p) =ωt(x, p/|p|)|p|βt(x). Lower-order terms can always be added, but we prefer to treatthem separately via perturbation theory.

Assume that we are interested in solving the Cauchy problem

∂tft(x) = −ψt(x,−i∇)ft(x), f |t=s = fs, t > s, (5.51)

expecting to obtain a solution of the form

ft(x) =

∫Gt,s(x, y)fs(y) dy, (5.52)

with a Green function G that solves equation (5.51) with the initial conditionGs,s(x, y) = δ(x− y).

Let us write ψt,[z](−iΔ) for the operator ψ with frozen coefficients z, i.e., itacts on functions f(x) as an operator with constant coefficients (which are fixed bythe choice of z). We know that the solution to the Cauchy problem for operatorswith constant coefficients,

∂tgt(x) = −ψt,[z](−i∇)gt(x), g|t=s = gs, (5.53)

is given by the formulae (2.59) and (2.60), so that

gt(x) =

∫Gψ,z

t,s (x− y)gs(y) dy, (5.54)

Gψ,zt,s (x) =

1

(2π)d

∫ei(p,x) exp

{−∫ t

s

ψτ (z, p) dτ

}dp. (5.55)

The idea is to use the function Gapt,s(x, y) = Gψ,y

t,s (x− y) as a first approxima-tion for the actual Gt,s(x, y). It is readily seen that it satisfies the equation

∂tGap

t,s(x, y) = −ψt,[y](−i∇)Gapt,s(x, y),

where the operator ∇ acts on the variable x, or equivalently

∂tGap

t,s(x, y) = −ψt(x,−i∇)Gapt,s(x, y)− Ft,s(x, y), (5.56)

with

Ft,s(x, y) = (ψt,[y](−i∇)− ψt(x,−i∇))Gapt,s(x, y). (5.57)

Page 324: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

310 Chapter 5. Linear Evolutionary Equations: Advanced Theory

For these entities, we have

ψt,[y](−i∇)Gapt,s(x, y) =

1

(2π)d

∫ei(p,x−y)ψt(y, p) exp

{−∫ t

s

ψτ (y, p) dτ

}dp,

ψt(x,−i∇)Gapt,s(x, y) =

1

(2π)d

∫ei(p,x−y)ψt(x, p) exp

{−∫ t

s

ψτ (y, p) dτ

}dp,

(5.58)which implies

Ft,s(x, y) =1

(2π)d

∫ei(p,x−y)[ψt(y, p)− ψt(x, p)] exp

{−∫ t

s

ψτ (y, p) dτ

}dp.

(5.59)If the evolution (5.52) is well defined and F is sufficiently regular, then it followsfrom (5.56) and Proposition 4.10.2 (see also Remark 78) that

Gapt,s(x, y) = Gt,s(x, y)−

∫ t

s

∫Gt,τ (x, z)Fτ,s(z, y) dz. (5.60)

Remark 91. Equation (5.60) can be considered a (slightly generalized) variant of(4.134).

In operator form, the r.h.s. of (5.60) reads G − FG, where F is the linearoperator acting on functions of four variables as

(F ψ)t,s(x, y) =

∫ t

s

∫ψt,τ (x, z)Fτ,s(z, y) dz. (5.61)

Therefore, it follows from (5.60) that

Gt,s(x, y) = [(I − F )−1Gap]t,s(x, y) =

∞∑n=0

(FnGap)t,s(x, y). (5.62)

The mth term of this series equals

(FmGap)t,s(x, y) =

∫s≤s1≤···≤sm≤t

ds1 · · · dsm∫

Gapt,sm(x, zm)Fsm,sm−1(zm, zm−1) · · ·Fs2,s1(z2, z1)Fs1,s(z1, y)dz1 · · · dzm. (5.63)

Or in other words,

Gt,s(x, y) = Gapt,s(x, y) +

∫ t

s

∫Gap

t,τ (x, z)Φτ,s(z, y) dz, (5.64)

with

Φt,s(x, y) =

∞∑n=0

(FnF )t,s(x, y). (5.65)

Page 325: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.5. The method of frozen coefficients: estimates for the Green function 311

Formula (5.64) suggests an alternative approach to the derivation of (5.62),namely by searching for the heat kernel Gt,s(x, y) in the form (5.64) with a newunknown function Φ. Plugging this expression into the equation

∂tGt,s(x, y) = −ψt(x,−i∇)Gt,s(x, y) (5.66)

yields

− ψt,[y](−i∇)Gapt,s(x, y) + Φt,s(x, y)−

∫ t

s

∫ψt,[z](−i∇)Gap

t,τ (x, z)Φτ,s(z, y) dz

= −ψt(x,−i∇)Gapt,s(x, y)−

∫ t

s

∫ψt(x,−i∇)Gap

t,τ (x, z)Φτ,s(z, y) dz, (5.67)

which implies the following integral equation for Φ:

Φt,s(x, y) = Ft,s(x, y) +

∫ t

s

∫Ft,τ (x, z)Φτ,s(z, y) dz. (5.68)

Solving this equation by successive approximation yields again (5.65), and there-fore also the expansion (5.62).

In order to render all these calculations rigorous, one has to show at leastthat all series involved in the definition of Gt,s(x, y) do indeed converge, and thento clarify in what sense (if any) this function satisfies equation (5.51) with theinitial condition f(y) = δ(y), obtained as the limit for t − s → 0 in the senseof generalized functions. We shall carry out these tasks in the following Sections.At this stage, it should only be noted that the expansion (5.65) is similar to theperturbation series (4.166), with F playing the role of LU . When analysing theconvergence of the series (5.62), one can therefore use similar arguments as inTheorems 4.13.3 and 4.13.4.

5.5 The method of frozen coefficients:

estimates for the Green function

As a starting point for a rigorous theory, let us begin with the convergence ofa series of the type (5.65). We are interested in convergence both in L1(R

d) andC∞(Rd). A quick look at the expression (5.59) (details will be given later) suggeststhat Ft,s(., y) should belong to the intersection L1(R

d) ∩C∞(Rd), although withpossible power-type singularities as t → s. This observation is reflected in thefollowing lemma, which is built on the weakest general assumptions that stillensure the required convergence of (5.65).

Lemma 5.5.1. Let Ft,s(x, y), T1 ≤ s < t ≤ T2, T = T2 − T1, x, y ∈ Rd, be acontinuous function in all its variables such that Ft,s(., y) ∈ L1(R

d)∩C∞(Rd) for

Page 326: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

312 Chapter 5. Linear Evolutionary Equations: Advanced Theory

all t, s, y and

|Ft,s(x, y)| ≤ min[(t− s)−ωCΩC

t−s(x − y), (t− s)−ωLΩLt−s(x − y)

](5.69)

with some constants ωL ∈ (0, 1), ωC > 0 and non-negative functions ΩCt (.) ∈

C∞(Rd), ΩLt (.) ∈ L1(R

d) that have uniformly bounded norms:

‖ΩCt (.)‖C∞(Rd) ≤ ΩC , ‖ΩL

t (.)‖L1(Rd) ≤ ΩL,

with some positive constants ΩC ,ΩL. Then the series on the r.h.s. of (5.65), whereF is given by (5.61), converges for any t > s, y both in L1(R

d) and C∞(Rd), andthe following estimates hold for its sum:

max(‖Φt,s(., y)‖L1(Rd), ‖Φt,s(x, .)‖L1(Rd)) ≤ (t− s)−ωLC(ΩL, ωL, T ), (5.70)

max(‖Φt,s(., y)‖C∞(Rd), ‖Φt,s(x, .)‖C∞(Rd)) ≤ (t− s)−ωCΩCC(ΩL, ωL, ωC , T ).

(5.71)

Explicit expressions for the constants C(ΩL, ωL, T ) and C(ΩL, ωL, ωC , T ) are givenbelow. Moreover, equation (5.68) has a unique solution Φ that satisfies the esti-mates (5.70), (5.71), and it is given by the sum on the r.h.s. of (5.65).

Proof. It is well known (and easy to see) that for any functions φ, ψ ∈ L1(Rd) the

convolution (φ � ψ)(x) =∫φ(x − y)ψ(y)dy also belongs to L1(R

d) and

‖φ � ψ‖L1(Rd) ≤ ‖φ‖L1(Rd)‖ψ‖L1(Rd).

For the nth term of the series on the r.h.s. of (5.65), we therefore get

(FnF )t,s(x, y) ≤∫s≤s1≤···≤sn≤t

ds1 · · · dsn(t− sn)−ωL(sn − sn−1)

−ωL · · · (s1 − s)−ωL

×∫Rdn

ΩLt−sn(x− yn) · · ·ΩL

s1−s(y1 − y)dy1 · · · dyn

≤ Ωn+1L

∫s≤s1≤···≤sn≤t

ds1 · · · dsn(t− sn)−ωL(sn − sn−1)

−ωL · · · (s1 − s)−ωL .

Using formula (9.15), we can conclude that

‖(FnF )t,s(., y)‖L1(Rd) ≤ Ωn+1L (t− s)n(1−ωL)−ωL

[Γ(1 − ωL)]n+1

Γ((n+ 1)(1− ωL)).

Therefore, by the definition of the Mittag-Leffler function (9.13), the series definingΦ converges in L1(R

d) and

‖Φt,s(., y)‖L1(Rd) ≤ ΩL(t− s)−ωLΓ(1−ωL)E1−ωL,1−ωL(Γ(1− ωL)ΩL(t− s)1−ωL),(5.72)

Page 327: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.5. The method of frozen coefficients: estimates for the Green function 313

which proves the estimate (5.70) with

C = ΩLΓ(1− ωL)E1−ωL,1−ωL(Γ(1 − ωL)ΩLT1−ωL).

Next, one observes that if φ ∈ L1(Rd) and ψ ∈ C∞(Rd), then φ � ψ ∈ C∞(Rd)

and

‖φ � ψ‖C∞(Rd) ≤ ‖φ‖C∞(Rd)‖ψ‖L1(Rd). (5.73)

Exercise 5.5.1. Prove the assertion (5.73).

Consequently,∫Ft,τ (., z)Fτ,s(z, y) dz ∈ C∞(Rd) and∥∥∥∥

∫Ft,τ (., z)Fτ,s(z, y) dz

∥∥∥∥C∞(Rd)

≤ ΩCΩL min[(t− τ)−ωC (τ − s)−ωL , (t− τ)−ωL(τ − s)−ωC

].

(5.74)

Now, a problem arises: how to push the (possibly non-integrable) singularity (t−τ)−ωC or (τ − s)−ωC through the integration over τ? The trick is to decomposethe integral into two parts:

∫ t

s

∫Ft,τ (x, z)Fτ,s(z, y) dz

=

∫ s+(t−s)/2

s

∫Ft,τ (x, z)Fτ,s(z, y) dz +

∫ t

s+(t−s)/2

∫Ft,τ (x, z)Fτ,s(z, y) dz,

and to estimate the first and the second integral by the first and the secondestimate of (5.74), respectively. This yields

∥∥∥∥∫ t

s

∫Ft,τ (., z)Fτ,s(z, y) dz

∥∥∥∥C∞(Rd)

≤ ΩCΩL

∫ s+(t−s)/2

s

(t− τ)−ωC (τ − s)−ωL dτ

+ΩCΩL

∫ t

s+(t−s)/2

(t− τ)−ωL(τ − s)−ωC dτ

≤ 2ωCΩCΩL(t− s)−ωC

(∫ s+(t−s)/2

s

(τ − s)−ωL dτ +

∫ t

s+(t−s)/2

(t− τ)−ωL dτ

)

=1

1− ωL2ωC+ωLΩCΩL(t− s)1−ωL−ωC .

In order to estimate the term FnF , we observe that for any partition s =s0 ≤ s1 ≤ · · · ≤ sn ≤ sn+1 = t, there exists k ∈ {0, . . . , n} such that sk+1 − sk >(t − s)/(n + 1). Therefore, the integral in the term FnF can be bounded by the

Page 328: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

314 Chapter 5. Linear Evolutionary Equations: Advanced Theory

sum of (n + 1) integrals Ik such that sk+1 − sk > (t − s)/(n + 1) in Ik. In eachintegral Ik, we estimate Fsk+1,sk(yk+1, yk) by

(sk+1 − sk)−ωCΩC

sk+1−sk(yk+1 − yk) ≤ (n+ 1)ωC (t− s)−ωCΩCsk+1−sk(yk+1 − yk)

and the other Fsj+1,sj (yj+1, yj) by

(sj+1 − sj)−ωLΩL

sj+1−sj (yj+1 − yj).

This leads to the estimate

‖(FnF )t,s(., y)‖C∞(Rd) ≤ (n+ 1)ωC+1(t− s)−ωCΩCΩnL

×∫s≤s1≤···≤sn≤t

(sn − sn−1)−ωL · · · (s1 − s)−ωLds1 · · · dsn

= (n+ 1)ωC+1(t− s)n(1−ωL)−ωCΩCΩnL

[Γ(1− ωl)]n

Γ(n(1− ωL) + 1),

where (9.16) was used. Consequently, we find

‖Φt,s(., y)‖C∞(Rd) ≤ ΩC(t− s)−ωC

∞∑n=0

(n+ 1)ωC+1ΩnL

[(t− s)1−ωLΓ(1− ωL)]n

Γ(n(1− ωL) + 1).

This series converges and can also be expressed in terms of the Mittag-Lefflerfunctions, which proves (5.71). The last statement is a direct consequence of theobtained estimates. �

We can now obtain a general criterion for ensuring that a candidate for theGreen function (5.64) suggested by the method of frozen coefficients is well definedand satisfies the required initial condition δ(y).

Lemma 5.5.2.

(i) Suppose that Φ satisfies the estimates (5.70), (5.71) of Lemma 5.5.1. LetGap

t,s(x, y), t > s, be a continuous function in all its variables such that

Gapt,s(., y) ∈ L1(R

d) ∩ C∞(Rd) for all t, s, y, and

‖Gapt,s(., y)‖L1(Rd) ≤ ΩGL,

‖Gapt,s(x, .)‖L1(Rd) ≤ ΩGL,

‖Gapt,s(., .)‖C∞(R2d) ≤ (t− s)−ωGΩGC ,

(5.75)

with some constants ωG,ΩGL,ΩGC > 0. Then Gt,s(x, y) given by (5.64) iswell defined, Gt,s(., y) ∈ L1(R

d) ∩ C∞(Rd) for all y and t > s, and for the

Page 329: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.5. The method of frozen coefficients: estimates for the Green function 315

second term in (5.64) we have the estimates

∥∥∥∥∫ t

s

∫Gap

t,τ (., z)Φτ,s(z, y) dz

∥∥∥∥L1(Rd)

≤ ΩGLC(ΩL, ωL, T )

1− ωL(t− s)1−ωL ,

(5.76)∥∥∥∥∫ t

s

∫Gap

t,τ (x, z)Φτ,s(z, .) dz

∥∥∥∥L1(Rd)

≤ ΩGLC(ΩL, ωL, T )

1− ωL(t− s)1−ωL ,

(5.77)∥∥∥∥∫ t

s

∫Gap

t,τ (., z)Φτ,s(z, .) dz

∥∥∥∥C∞(R2d)

≤ Cmax[(t− s)1−ωL−ωG , (t− s)1−ωC ]

(5.78)

with a constant C = C(ΩGL,ΩGC ,ΩL,ΩC , ωL, ωC , ωG, T ).

(ii) Additionally, let∥∥∥∥∫

Gapt,s(., y)f(y)dy − f(.)

∥∥∥∥C∞(Rd)

→ 0, as(t− s) → 0, (5.79)

for any f ∈ C∞(Rd). (This requirement expresses the strong continuity ofthe family of operators f → ∫

Gapt,s(., y)f(y)dy in C∞(Rd) at the diagonal

s = t and is a bit stronger than the requirement that∫Gap

t,s(., y) tends to δ(y)in the sense of generalized functions.) Then the same holds for Gt,s(x, y):∥∥∥∥

∫Gt,s(., y)f(y)dy − f(.)

∥∥∥∥C∞(Rd)

→ 0, as(t− s) → 0. (5.80)

Proof. (i) The estimates (5.76) and (5.77) are a direct consequence of (5.70) andthe first two estimates of (5.75).

Next, by (5.75), (5.70), (5.71) and (5.73) imply that∣∣∣∣∫

Gapt,τ (x,z)Φτ,s(z,y)dz

∣∣∣∣ (5.81)

≤min[(t−τ)−ωGΩGC(τ −s)−ωLC(ΩL,ωL,T ),ΩGL(τ−s)−ωCΩCC(ΩL,ωL,ωC ,T )].

In order to estimate the integral over τ , we use the same trick as in the proof ofLemma 5.5.2. Namely, we decompose this integral into two parts:

∫ t

s

dτ Gapt,τ (x, z)Φτ,s(z, y) dz

=

∫ s+(t−s)/2

s

dτ Gapt,τ (x, z)Φτ,s(z, y) dz +

∫ t

s+(t−s)/2

dτ Gapt,τ (x, z)Φτ,s(z, y) dz,

Page 330: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

316 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Afterwards, we can estimate the first and the second integral by the first and thesecond estimate of (5.81), respectively. This yields the estimate∫ t

s

dτ Gapt,τ (x, z)Φτ,s(z, y) dz ≤ 2ωG(t− s)−ωG

ΩGCC(ΩL, ωL, T )

1− ωL(t− s)1−ωL

+ 2ωC (t− s)−ωCΩGLΩCC(ΩL, ωL, ωC , T )(t− s),

which implies (5.78).

(ii) The estimate (5.77) implies that∥∥∥∥∫ t

s

∫Gap

t,τ (., z)Φτ,s(z, y)f(y) dz dy

∥∥∥∥C∞(Rd)

→ 0, ast− s → 0,

since (t− s)1−ωL‖f‖C(Rd) → 0 for any f ∈ C(Rd), so that (5.80) is equivalent to(5.79). �

5.6 The method of frozen coefficients:main examples

In the next section, we shall search for conditions to ensure that G actually solvesthe required equation. In this section, let us confirm that the conditions of Lemmas(5.5.1) and (5.5.2) are indeed satisfied for the basic examples that we have inmind, namely equations arising from operators with homogenous symbols. We shallconsider three cases separately: diffusions, operators with homogeneous symbolsand operators with homogeneous symbols of variable order. Their respective thesymbols are:

ψt(x, p) = (At(x)p, p), ψt(x, p) = ωt(x, p/|p|)|p|β , ψt(x, p) = ωt(x, p/|p|)|p|βt(x).(5.82)

Of course, the first case is a special case of the second one. However, it is reasonableto discuss diffusion separately a) because of its importance and b) due to its sim-plicity and therefore the possibility to avoid the machinery of pseudo-differentialoperators.

For a diffusion operator with the symbol ψt(x, p) = −(At(x)p, p) with a fam-ily of symmetric positive matrices At(x), the Green function for the correspondingequation with frozen coefficients is given by (4.190):

Gapt,s(x, y) =

(2π)−d/2√det(

∫ t

sAτ (y)dτ )

∫exp

{−1

2

((∫ t

s

Aτ (y)dτ

)−1

(x− y), x− y

)}.

(5.83)It satisfies the following estimate:

|Gapt,s(x, y)| ≤ (2πλmin(t− s))−d/2 exp

{− (x − y)2

2(t− s)λmax

}, (5.84)

Page 331: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.6. The method of frozen coefficients: main examples 317

where λmin and λmax are the minimum respectively the maximum of the eigenval-ues of all matrices Aτ , τ ∈ [s, t]. Consequently, (5.75) holds with ωG = d/2. Next,we find

∂Gapt,s(x, y)

∂x= −

(∫ t

s

Aτ (y)dτ

)−1

(x− y)Gapt,s(x, y) (5.85)

and

∂2Gapt,s(x, y)

∂xj∂xk= Gap

t,s(x, y)

[∑l,m

(∫ t

s

Aτ (y)dτ

)−1

jm

(x− y)m

(∫ t

s

Aτ (y)dτ

)−1

kl

(x − y)l

− δkj

(∫ t

s

Aτ (y)dτ

)−1

jk

]. (5.86)

Assume that the At(x) are γ-Holder-continuous in x with an index γ ∈ (0, 1],i.e.,

‖At(y)−At(x)‖ ≤ HA‖x− y‖γ . (5.87)

The function F given by (5.57) equals

Ft,s(x, y) =1

2tr

[(At(y)−At(x))

∂2Gapt,s(x, y)

∂x2

]

=1

2

∑j,k

(At,jk(y)−At,jk(x))∂2Gap

t,s(x, y)

∂xj∂xk.

(5.88)

In order to estimate this trace, we can use the fact that tr(CD) ≤ max |λC | ‖D‖for any symmetric matrices C,D, where max |λC | is the maximal magnitude ofthe eigenvalues of C, and ‖D‖ is the Euclidean norm of D, so that

‖D‖ ≤√∑

ij

D2ij ≤ dmax

i,j|Dij |.

Therefore, we get

|Ft,s(x, y)| ≤ min(λmax, HA‖x− y‖γ)∥∥∥∥∥∂

2Gapt,s(x, y)

∂xj∂xk

∥∥∥∥∥ .The last term can be estimated as the sum of two terms in the bracket on ther.h.s. of (5.86), which yields

|Ft,s(x, y)| ≤ d

λmin(t− s)min(HA‖x− y‖γ , λmax)

[1 +

(x− y)2

λmin(t− s)

]Gap

t,s(x, y).

(5.89)

Page 332: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

318 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Integrating over x and changing the variable x to w = (x− y)/√t− s leads to∫

|Ft,s(x, y)|dx ≤ dHA

λmin(t− s)1−γ/2

×∫

|w|γ[1 +

|w|2λmin

](2πλmin)

−d/2 exp

{− w2

2λmax

}dw.

Therefore, the second estimate in (5.69) holds with ωL = 1 − γ/2. Similarly, thefirst estimate in (5.69) holds with ωC = 1 + d/2.

Next, let us discuss the second symbol in (5.82). Assume that for each x theconditions of Theorem 4.15.1 are satisfied with bounds that are uniform in x, andthat ωt(x, s) is γ-Holder-continuous in x, i.e.,

|ωt(x, p/|p|)− ωt(y, p/|p|)| ≤ Hω|x− y|γ . (5.90)

It follows from (4.196) that for all t > s and x, y, z,

|Gψ,zt,s (x, y)| ≤ Cmin

((t− s)−d/β ,

t− s

|x− y|d+β

), (5.91)

where C depends on d, β, the minimal value of the real part and the maximummagnitude of ωt(x, s). Consequently, for the corresponding Gap

t,s(x, y) = Gψ,yt,s (x, y)

the estimates (5.75) hold with ωG = d/β.

In order to estimate the corresponding function F of (5.57), we note that

the action of ψt,[y](−i∇) on Gψ,yt,s (x, y) is equivalent to a differentiation in t, for

which the estimate (4.196) can be used. In the exact same way, one estimates the

action of ψt,[x](−i∇) = ψ(x,−i∇) on Gψ,yt,s (x, y). As a consequence, one sees that

the action of ψt,[y](−i∇) − ψ(x,−i∇) on Gψ,yt,s (x, y) has the same upper bound,

but with an additional factor min(1, |x − y|γ). In fact, analogously to (4.77), wehave

Ft,s(x, y) =1

(2π)d

∫ ∞

0

d|p|∫ 1

−1

du

∫S(d−2)

dn ei|p| |x−y|u(1 − u2)(d−3)/2|p|d−1|p|β

× exp

{−|p|β

∫ t

s

ωτ (y, x− y, u, n) dτ

}[ωt(y, x− y, u, n)− ωt(x, x − y, u, n)].

The last term in the square brackets brings in the factor min(1, |x − y|γ). Theremaining integral can be decomposed into two parts by writing 1 = χ1(u)+χ2(u)as for (4.77). The second integral is estimated by (9.31). Therefore, we get theestimate

|Ft,s(x, y)| ≤ Cmin(1, |x− y|γ) 1

t− s

((t− s)−d/β,

t− s

|x− y|d+β

). (5.92)

Similar to the case of diffusions we observe that, when estimating the norms of F ,the natural scaled variable is w = (x − y)(t − s)−1/β . This implies the estimates(5.69) with

ωL = 1− γ/β, ωC = 1 + d/β, (5.93)

Page 333: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.6. The method of frozen coefficients: main examples 319

which is of course consistent with the above estimate for diffusions correspondingto β = 2.

Let us now analyse the case of variable order of homogeneity, i.e., the thirdcase in (5.82):

ψt(x, p) = ωt(x, p/|p|)|p|βt(x).

Assuming that for each x the conditions of Theorem 4.15.4 are satisfied withbounds that are uniform in x, it follows from (4.202) that for t−s < T , with any T ,

|Gψ,zt,s (x, y)| ≤ Cmin

[(t− s)−d/bmin ,max

(t− s

|x− y|d+bmin,

t− s

|x− y|d+bmax

)],

(5.94)where C depends on T , bmax = maxt,x βt(x), bmin = mint,x βt(x), the minimalvalue of the real part and the maximum magnitude of ωt(x, s). Consequently, for

the corresponding Gapt,s(x, y) = Gψ,y

t,s (x, y) the estimates (5.75) hold with ωG =d/bmin. Next, assuming the Holder continuity of the coefficients, i.e., assuming(5.90) and

|βt(x) − βt(y)| ≤ Hβ‖x− y‖γ , (5.95)

it follows that

|Ft,s(x, y)| ≤ Cmin(1, |x− y|γ) 1

t− s(1 + | ln(t− s)|) (5.96)

×min

[(t− s)−d/bmin ,max

(t− s

|x− y|d+bmin,

t− s

|x− y|d+bmax

)],

because, for |p| > 1,∣∣∣|p|βt(x) − |p|βt(y)|∣∣∣ ≤ |βt(x) − βt(y)||p|bmin ln |p|,

and the multiplier ln |p| yields an additional multiplier | ln(t − s)| in the finalestimate. In particular, the estimates (5.69) hold with

ωL = 1− ε− γ/bmax, ωC = 1− ε+ d/bmin

with any ε > 0.

Summing up, we proved the following.

Theorem 5.6.1. Let the symbols of the Cauchy problem (5.51) have one of thethree types given by (5.82) and their coefficients (A, ω, β) be Holder-continuousin x. Suppose that A is symmetric with all eigenvalues between certain positivenumbers λmin, λmax, that the βt(x) belong to a certain interval [bmin, bmax] ⊂ R+

and that ωt(x, s) has a positive real part separated from zero and is (d + 1 +bmax)-times continuously differentiable in s. Then the Green function of the type(5.64) constructed by the method of frozen coefficients is a well-defined continuousfunction satisfying (5.76), (5.78) and (5.80). In particular, ωL = 1 − γ/β, ωC =1+ d/β in the first two cases.

Page 334: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

320 Chapter 5. Linear Evolutionary Equations: Advanced Theory

By a more careful analysis of the series (5.65), one can often obtain ratherprecise bounds and even two-sided estimates for the Green function Gt,s(x, y), aswell as its asymptotic expansions in small times and around the diagonal {x = y}.An extensive literature is devoted to such estimates and asymptotic expansions.For the sake of completeness, let us quote some key results.

In [116], the case of diffusion is studied in detail (see also [227] and referencestherein), which leads to the following theorem:

Theorem 5.6.2. For the general heat equation

∂ut(x)

∂t= Ltut(x),

Ltut(x) =1

2(a(t, x)∇,∇)ut(x) + (b(t, x),∇ut(x)) + c(t, x)ut(x),

(5.97)

suppose that a is uniformly elliptic, i.e., the expression (a(x)ξ, ξ) for unit vectors ξis uniformly bounded from below and above by positive constants m,m−1, and thatit is continuously differentiable in x. Moreover, suppose that b, c are continuouswith respect to all their variables and the following uniform bounds hold:

supt,x

max(|∇a(t, x)|, |b(t, x)|, |c(t, x)|) ≤ M.

Then there exist constants σi, Ci, i = 1, 2, depending only on m,M, T such thatthe Green function Gt,s(x, ξ) of equation (5.97) is well defined and satisfies thefollowing two-sided bounds:

C1Gσ1(t−s)(x− ξ) ≤ Gt,s(x, ξ) ≤ C2Gσ2(t−s)(x− ξ), 0 < s, t < T, (5.98)

where Gσ1(t−s)(x − ξ) is the heat kernel (4.14) of the basic heat equation.

Moreover, if a is twice continuously differentiable in x, and b, c are continu-ously differentiable (with all derivatives bounded), then Gt,s(x, ξ) is differentiablein x, ξ and

max

(∣∣∣∣ ∂∂ξGt,s(x, ξ)

∣∣∣∣ ,∣∣∣∣ ∂∂xGt,s(x, ξ)

∣∣∣∣)

≤ Ct−1/2Gt,s(x, ξ), 0 < s, t < T, (5.99)

where the constant C depends only on m,M, T and the bounds for the derivatives.

In [134, 136], two-sided bounds are obtained for the equation generated bymixed fractional Laplacians of order not exceeding 2, with variable coefficients, interms of the corresponding heat kernels of the constant coefficient case, as well asthe related estimates for the derivatives.

Let us give some precise results for the equation

∂u

∂t= −a(x)|∇|β(x)u, x ∈ Rd, t ≥ 0. (5.100)

Page 335: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.6. The method of frozen coefficients: main examples 321

Theorem 5.6.3. Let β(x) ∈ [βd, βu], a(x) ∈ [ad, au] be γ-Holder-continuous func-tions on Rd with values in compact subsets of (0, 2) and (0,∞), respectively, andγ ∈ (0, 1]. Then the Green function G(t, x, ξ), t > 0, x, ξ ∈ Rd, for equation (5.100)has the following upper bound:

G(t, x, ξ) ≤ K[Gβd

t (x− ξ) +Gβu

t (x− ξ)], (5.101)

for all t ≤ T with an arbitrary T , where K = K(T ) is a constant and where

Gβt (x − ξ) is the Green function of the Cauchy problem for the ΨDO with the

symbol |p|β. Moreover, if the index β(x) = β does not depend on x, then thefollowing two-sided estimate holds as well:

K−1Gβt (x− ξ) ≤ G(t, x, x0)

≤ KGβt (x− ξ).

A proof and various extensions can be found in [134, 136, 148].

Abundant literature exists on estimates and asymptotic expansions of theheat kernels for diffusions on manifolds, including the case of degenerate diffu-sions. For some basic results, see, e.g., [33, 182]. Starting from the classificationof Gaussian degenerate diffusions (with the Green function given by the exponentof a quadratic form) in terms of the Young schemes, see the discussion aroundformula (4.44), and taking into account the extreme importance of Gaussian es-timates, it is natural to ask which class of diffusions has a Gaussian form as themain term of the asymptotics for small times. This question was answered in [136],where the full classification of such diffusions is given in accordance to the classi-fication of Gaussian diffusions themselves. For instance, in the case of the Youngscheme (k, n), k ≥ n (Kolmogorov’s Gaussian diffusion), diffusion equations wherethe heat kernel has a corresponding Gaussian main term turn out to be given bysecond-order operators of the following type:

1

2

(G(x)

∂y,∂

∂y

)+

(a(x) + α(x)y,

∂x

)

+

(b(x) + β(x)y +

1

2(γ(x)y, y),

∂y

)− V (x, y),

(5.102)

where x ∈ Rn, y ∈ Rk, the rank of the matrix α(x) is n, G is a square k×k positivematrix, and V (x, y) is a polynomial in y of order at most 4 that is bounded frombelow. These operators naturally arise in the analysis of stochastic geodesic flowson manifolds.

For some recent developments in degenerate stable-like equations (with ho-mogeneous symbols of order not exceeding 2), we can refer, e.g., to [114, 167] andreferences therein.

Page 336: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

322 Chapter 5. Linear Evolutionary Equations: Advanced Theory

5.7 The method of frozen coefficients: regularity

In this section, we further investigate the regularity of the Green function Gt,s(x, y)as constructed above. To this end, we shall assume for the sake of simplicity thatthe symbols ψt = ψ do not depend on t. (The general case is not more demandingfrom an ideological point of view, but it requires lengthier formulae.) Therefore,the functions Gas

t,s, Ft,s, Φt,s and Gt,s become functions of the difference t − sonly. From now on, we will denote the functions Gas

t,0, Ft,0, Φt,0 and Gt,0(x, y) ina shorter form by Gas

t , Ft, Φt and Gt(x, y).

Aiming at the differentiability of G with respect to t, we start again withthe question concerning the series Φ, see (5.64) and (5.65). A quick look at theexamples of ψ considered above suggests that differentiating Ft(x, y) with respectto t should increase the singularity at small t by a factor 1/t, which motivates thegeneral assumption in the following result.

Lemma 5.7.1. Under the assumptions of Lemma 5.5.1, assume additionally thatF depends only on the difference of t− s and that Ft(x, y) is continuously differ-entiable in t, where the derivative has the same estimate as F with an additionalmultiplier of the order 1/t, i.e.,∣∣∣∣t ∂∂tFt(x, y)

∣∣∣∣ ≤ CF min[t−ωCΩC

t (x− y), t−ωLΩLt (x− y)

], (5.103)

with some constant CF . Then the function Φt(x, y) (which is Φt,0(x, y) in theprevious notation of the time-dependent case and denotes the sum of the series onthe r.h.s. of (5.65) with s = 0) is also continuously differentiable in t, and thisderivative has the same estimates as Φ with an additional multiplier of the order1/t, i.e., ∥∥∥∥t ∂∂tΦt(., y)

∥∥∥∥L1(Rd)

≤ t−ωLC(ΩL, ωL, T, CF ), (5.104)

∥∥∥∥t ∂∂tΦt(., y)

∥∥∥∥C∞(Rd)

≤ t−ωCΩCC(ΩL, ωL, ωC , T, CF ). (5.105)

Proof. Let us first look at the differentiability of the first nontrivial term in theseries (5.65):

I1(t, x, y) =

∫ t

0

∫Ft−τ (x, z)Fτ (z, y) dz.

The problem is that a direct differentiation and using the estimates for F yieldsa non-integrable singularity 1/(t − τ) inside the integral. This difficulty can beovercome by changing the variable of integration:

I1(t, x, y) = t

∫ 1

0

ds

∫Ft(1−s)(x, z)Fts(z, y) dz.

Page 337: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.7. The method of frozen coefficients: regularity 323

Differentiating now yields

∂tI1(t, x, y) =

∫ 1

0

ds

∫Ft(1−s)(x, z)Fts(z, y) dz

+ t

∫ 1

0

ds

∫(1 − s)

∂Fτ (x, z)

∂τ|τ=t(1−s)Fts(z, y) dz

+ t

∫ 1

0

ds

∫sFt(1−s)(x, z)

∂Fτ (z, y)

∂τ|τ=ts dz.

All three terms are perfectly well defined and can be estimated as I2 itself in theproof of Lemma 5.5.1 (using (5.103) for the second and third term). This leads to

∥∥∥∥t ∂∂tI1(t, ., y)∥∥∥∥L1(Rd)

≤ (1 + 2CF )Ω2Lt

1−2ωL[Γ(1− ωL)]

2

Γ(2(1− ωL)).

Similarly, the nth term of the series (5.65),

In(t, x, y) =

∫0≤s1≤···≤sn≤t

ds1 · · · dsn

×∫

Ft−sn(x, zn)Fsn−sn−1(zn, zn−1) · · ·Fs1(z1, y)dz1 · · · dzn,

can be rewritten as

In(t, x, y) = tn∫0≤s1≤···≤sn≤1

ds1 · · · dsn

×∫

Ft(1−sn)(x, zn)Ft(sn−sn−1)(zn, zn−1) · · ·Fts1(z1, y)dz1 · · · dzn.

Consequently,

∂tIn(t, x, y) =

n

tIn(t, x, y) + tn

n∑k=1

· · · (sk − sk−1)∂Fτ (zk, zk−1)

∂τ|τ=t(sk−sk−1) · · · ,

where · · · denotes all terms that were not subject to differentiation. Again, allterms can be estimated like In(t, x, y) itself, which yields∥∥∥∥t ∂∂tIn(t, ., y)

∥∥∥∥L1(Rd)

≤ n(1 + nCF )Ωn+1L tn(1−2ωL)−ωL

[Γ(1− ωL)]n

Γ(n(1− ωL)).

The corresponding sum is again convergent, thus yielding (5.104). The termsIn(t, ., y) are estimated as elements of C∞(Rd) exactly like in Lemma 5.5.1), whichyields (5.105). �

Page 338: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

324 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Lemma 5.7.2. Under the assumptions of Lemma 5.5.2, assume additionally thatψt = ψ does not depend on t, that Φ satisfies the estimates (5.104), (5.105) fromLemma 5.7.1 and that t ∂

∂tGapt (x, y) satisfies the same estimates as Gap

t (x, y) it-self, i.e.,∥∥∥∥t ∂∂tGap

t (., y)

∥∥∥∥L1(Rd)

≤ ΩGL,

∥∥∥∥t ∂∂tGapt (., y)

∥∥∥∥C∞(Rd)

≤ t−ωGΩGC , (5.106)

with some constants ΩGL, ΩGC > 0.

Then the function Gt(x, y) given by (5.64) (with s = 0) is continuously dif-ferentiable in t and |t ∂

∂tGt(x, y)| has again the same estimates as Gt(x, y) itself,i.e., the following estimates hold for the second term in (5.64):∥∥∥∥t ∂∂t

∫ t

0

∫Gap

t−τ (., z)Φτ (z, y) dz

∥∥∥∥L1(Rd)

≤ Ct1−ωL , (5.107)

∥∥∥∥t ∂∂t∫ t

0

∫Gap

t−τ (., z)Φτ (z, y) dz

∥∥∥∥C∞(Rd)

≤ Cmax[t1−ωL−ωG , t1−ωC ] (5.108)

with constants C depending on all constants that have entered the assumptions ofthe Lemma.

Proof. For estimating the derivative of the second term in (5.64), one uses thesame trick as for the integral I1 in the proof of Lemma 5.7.1. Namely, one rewritesit as ∫ t

0

∫Gap

t−τ (x, z)Φτ (z, y) dz = t

∫ 1

0

ds

∫Gap

t(1−s)(x, z)Φts(z, y) dz

so that

∂t

∫ t

0

∫Gap

t−τ (x, z)Φτ (z, y) dz =

∫ 1

0

ds

∫Gap

t(1−s)(x, z)Φts(z, y) dz

+

∫ 1

0

ds

∫(1 − s)

∂Gapτ (x, z)

∂τ

∣∣∣∣τ=t(1−s)

Φts(z, y) dz

+

∫ 1

0

ds

∫sGap

t(1−s)(x, z)∂Φτ (z, y)

∂τ

∣∣∣∣τ=ts

dz.

Finally, one uses the same estimates as for the second term in (5.64) itself. �

Everything is now ready for the main result that summarizes the outcome ofthe method of frozen coefficients in an abstract form.

Theorem 5.7.1. Let ψ(x, p) be a continuous function such that

Gapt (x, y) =

1

(2π)d

∫ei(p,x−y) exp{−tψ(y, p)}dp

Page 339: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.7. The method of frozen coefficients: regularity 325

is a well-defined continuous function with respect to all its variables for t > 0,is differentiable in t, and such that Gap

t (., y), ∂∂tG

apt (., y) ∈ L(R

d) ∩ C∞(Rd) andthe estimates (5.75), (5.106) hold, as well as the convergence property (5.79).Moreover, let the function

Ft(x, y) = (ψt,[y](−i∇)− ψt(x,−i∇))Gapt (x, y)

=1

(2π)d

∫ei(p,x−y)[ψt(y, p)− ψt(x, p)] exp{−tψ(y, p)} dp (5.109)

be well defined and continuously differentiable in t, and let the estimates (5.69)and (5.103) hold, i.e.,

|Ft(x, y)| ≤ min[t−ωCΩC

t (x− y), t−ωLΩLt (x− y)

],

|t ∂∂t

Ft(x, y)| ≤ CF min[t−ωCΩC

t (x− y), t−ωLΩLt (x − y)

].

Then the function Φ given by (5.65) (with s = 0) and the function

Gt(x, y) = Gapt (x, y) +

∫ t

0

∫Gap

t−τ (x, z)Φτ (z, y) dz (5.110)

are well defined continuous functions, which are differentiable in t and have theproperties (5.76), (5.78), (5.80), (5.107) and (5.107). Moreover, Gt satisfies equa-tion (5.66), i.e.,

∂tGt(x, y) = −ψt(x,−i∇)Gt(x, y) (5.111)

(where the action of ψ on G is defined classically via its action on Gap given by(5.58)), with the initial condition (5.80), i.e.,∥∥∥∥

∫Gt(., y)f(y)dy − f(.)

∥∥∥∥C∞(Rd)

→ 0, as(t− s) → 0,

for all f ∈ C∞(Rd).

Proof. When putting together the results of all lemmas above, the only thing thatis left to check is equation (5.111). From the derivation of equation (5.67), i.e., theequation

− ψt,[y](−i∇)Gapt (x, y) + Φt(x, y) −

∫ t

0

∫ψt,[z](−i∇)Gap

t−τ (x, z)Φτ (z, y) dz

= −ψt(x,−i∇)Gapt (x, y)−

∫ t

0

∫ψt(x,−i∇)Gap

t−τ (x, z)Φτ (z, y) dz, (5.112)

it follows that it is equivalent to (5.111) and equation (5.68) on Φ,

Φt(x, y) = Ft(x, y) +

∫ t

0

∫Ft−τ (x, z)Φτ (z, y) dz, (5.113)

Page 340: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

326 Chapter 5. Linear Evolutionary Equations: Advanced Theory

as long as all terms in (5.112) make sense. The point to emphasize here is thatalthough we already know that the function

∫ t

0

∫Ft−τ (x, z)Φτ (z, y) dz

=

∫ t

0

∫(ψt,[z](−i∇)− ψt(x,−i∇))Gap

t−τ (x, z)Φτ (z, y) dz

is well defined (since the integration over t can be performed), this does not directlyimply that each of the integrals on the l.h.s. and the r.h.s. of (5.112) makes sense.One still has to show that at least one of them is well defined. For this purpose,we note that

∂Gt(x, y)

∂t= lim

δ→0

1

δ(Gt+δ(x, y)−Gt(x, y))

=∂Gap

t (x, y)

∂t+ lim

δ→0

1

δ

∫ t+δ

t

∫Gap

t+δ−τ (x, z)Φτ (z, y)dz

+ limδ→0

∫ t

0

dτ1

δ(Gap

t+δ−τ (x, z)−Gapt−τ (x, z))Φτ (z, y)dz.

The l.h.s. is defined by Lemma 5.7.2, and the first limit on the r.h.s. is defined andequals Φt(x, y) by (5.80). Consequently, we may conclude that the second integralon the r.h.s. is also well defined. It equals

∫ t

0

dτ∂Gap

t−τ (x, z)

∂tΦτ (z, y)dz =

∫ t

0

∫ψt,[z](−i∇)Gap

t−τ (x, z)Φτ (z, y) dz,

as required. This completes the proof. �

5.8 The method of frozen coefficients:

the Cauchy problem

In this section, we derive some basic properties of the resolving operator for theCauchy problem of equations with the symbol ψ(x, p), as they arise as conse-quences of the properties of the Green function obtained above.

Proposition 5.8.1. Under the assumptions of Theorem 5.7.1, for any f ∈ C(Rd),the function

Ttf(x) =

∫Gt(x, y)f(y) dy (5.114)

is continuously differentiable in t for t > 0 and satisfies equation (5.51) classically(also for t > 0). Moreover, the initial condition is met in the sense that Ttf(x) →f(x), as t → 0, for any x, and this convergence is uniform in x for f ∈ C∞(Rd).

Page 341: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.8. The method of frozen coefficients: the Cauchy problem 327

Finally, the mappings Tt extend to M(Rd), so that for any μ ∈ M(Rd), Ttμ isa measure with a continuous density and the curve t → Ttμ is weakly continuousin M(Rd).

Proof. The only statement that is left to be proved is the weak continuity of theextension of Tt to measures. This follows from the corresponding property of Gas

(see Proposition 4.4.2) and the estimate

∥∥∥∥∫ t

0

∫ ∫Gap

t−τ (., z)Φτ (z, y) dzμ(dy)

∥∥∥∥L1(Rd)

≤ Ct1−ωL‖μ‖,

which is a consequence of (5.76). �

By Theorem 5.6.1, we know that the conditions of Lemmas 5.5.1, 5.5.2 aresatisfied for our basic examples. It is readily seen that the additional requirementsof the Lemmas 5.7.1 and 5.7.2 are also met by these examples. Namely, for thecase of diffusion it is seen from (5.83), (5.86), (5.88). For the other two cases in(5.82), the estimates (5.106) follow from (4.74) and (4.196). The estimates for thederivatives of F are obtained analogously. This leads to the following result:

Theorem 5.8.1. Under the assumptions of Theorem 5.6.1, the Green function ofthe type (5.64) constructed by the method of frozen coefficients satisfies the as-sumptions of Theorem 5.7.1 and consequently its assertions and corollary.

We are interested in the additional regularity of the solutions to the Cauchyproblems (5.51). For simplicity, we formulate them only for the class of homoge-neous symbols, keeping in mind that the applied principles are rather general.

Theorem 5.8.2. Assume that the assumptions of Theorem 5.6.1 hold for the symbolψ = ψ(x, p) = ω(x, p/|p|)|p|β (the second case in (5.82)) with β > 1. Let k denotethe maximal integer that is strictly less than β, and let ω(x, s) be (d+ 1+ β + k)-times continuously differentiable in s. Then for any integer l ≤ k, the heat kernelGt(x, y) given by (5.110) is l-times continuously differentiable in x and the secondterm of (5.110) has the following bounds:

∥∥∥∥∫ t

0

∫∂l

∂xi1 · · · ∂xil

Gapt−τ (., z)Φτ (z, y) dz

∥∥∥∥L1(Rd)

≤ Ct(γ−l)/β, (5.115)

∥∥∥∥∫ t

0

∫∂l

∂xi1 · · · ∂xil

Gapt−τ (x, z)Φτ (z, .) dz

∥∥∥∥L1(Rd)

≤ Ct(γ−l)/β, (5.116)

supx,y

∣∣∣∣∫ t

s

∫∂l

∂xi1 · · ·∂xil

Gapt,τ (x, z)Φτ,s(z, y) dz

∣∣∣∣ ≤ Ct(γ−d−l)/β. (5.117)

Moreover, the family of operators (5.114) is smoothing in both C∞(Rd) andM(Rd) in the sense that Ttf ∈ Ck

∞(Rd) for any f ∈ C(Rd) and (Ttμ)(x) ∈

Page 342: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

328 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Hk1 (R

d) for any μ ∈ M(Rd), and the following estimates hold for any l ≤ k andindices i1, . . . , il:∥∥∥∥ ∂l

∂xi1 · · ·xil

(Ttμ)(.)

∥∥∥∥L1(Rd)

≤ Ct−l/β‖μ‖M(Rd), (5.118)

∥∥∥∥ ∂l

∂xi1 · · ·xil

Ttf(.)

∥∥∥∥C(Rd)

≤ Ct−l/β‖f‖C(Rd). (5.119)

Proof. The function Gt(x, y) is given by (5.110). The required properties of itsfirst term follow from the corresponding results for position-independent equations,namely that the L1-norms of the spatial derivatives of order m of Gap

t are boundedby a constant times t−β/m, see (4.57). Using these properties of Gap

t , the estimates(5.115), (5.116) and (5.117) are proved by the same argument as estimates (5.76),(5.77) and (5.78) are proved in Lemma 5.5.2, i.e., due to the estimate of the L1-norm of Φt(x, .) by a constant times t−ωL with ωL = 1 − γ/β. The conditionk < β ensures the integrability of the singularity that arises for small (t−τ) in theintegrals (5.115) and (5.116). The estimates (5.118) and (5.119) are consequencesof (5.115) and (5.116). �

In Theorem 5.8.1, we obtained the differentiability of the heat kernel of orderk < β without any essential additional assumptions. This could be expected, sinceψ(x,−i∇) acts as a kind of generalized derivative of order β. However, the actionof ψ(x,−i∇) on the above Gt is defined in terms of its Fourier transform. Inorder to be able to define its action on Gt (and hence on other solutions to theCauchy problems (5.51)) directly via a formula like (1.145), (1.143) (at least for aparticular choice of ω), we need regularity of an order that is higher than β. Thenext result shows how this can be achieved.

Theorem 5.8.3. Let the assumptions of Theorem 5.6.1 again hold for the symbolψ(x, p) = −ω(x, p/|p|)|p|β with β > 0, and let k denote the maximal integer that isstrictly less than β. Let additionally ω(x, s) be q-times continuously differentiablein x, and let each of these derivatives be (d+1+(k+q)(β+1))-times continuouslydifferentiable in s with all bounds uniform in x, s. Then the family of operators(5.114) preserves the space Cq

∞(Rd) and is locally bounded in these spaces (withrespect to their standard norm). Moreover, they are smoothing in Cq(Rd) in thesense that Ttf ∈ Cq+k

∞ (Rd) for any f ∈ Cq(Rd), and the following estimates holdfor any l ≤ k:

‖Ttf‖Cl+q(Rd) ≤ Ct−l/β‖f‖Cq(Rd). (5.120)

Remark 92. The statement about smoothing becomes void for β ≤ 1. In this case,the smoothing effect can still be observed, but requires spaces of fractional orderof smoothness (see Remark 66) that we avoid here.

Proof. As in the previous theorem, we show that each term in formula (5.110) forGt(x, y) defines an integral operator with the required properties. These operators

Page 343: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.8. The method of frozen coefficients: the Cauchy problem 329

are

T 1t f(x) =

∫Gap

t (x, y)f(y) dy, Gapt (x, y) =

1

(2π)d

∫ei(p,x−y) exp{−tψ(y, p)}dp,

T 2t f(x) =

∫ t

0

∫∂

∂xGap

t−τ (x, z)Φτ (z, y)f(y) dz dy.

As far as T 1t is concerned, the idea is to transfer the firstm derivatives with respect

to x to the variable y, in order to further transfer it to f via integration by parts.Thereby, the main observation is that

Gt(x, y) =∂Gap

t (x, y)

∂x+

∂Gapt (x, y)

∂y

= − t

(2π)d

∫ei(p,x−y) ∂ψ(y, p)

∂yexp{−tψ(y, p)}dp,

(5.121)

which has the same bounds as Gapt (x, y) itself, given by (4.73), (4.75), i.e.,∣∣∣∣ ∂l

∂xi1 · · · ∂xil

Gt(x, y)

∣∣∣∣ ≤ Cmin

(t−(d+l)/β ,

t

|x|d+β+l

)

for l ≤ k, although each term on the l.h.s. of (5.121) is more singular due to theadditional factor t−1/β . Consequently, one has

∂x

∫Gap

t (x, y)f(y) dy =

∫Gt(x, y)f(y) dy −

∫∂

∂yGap

t (x, y)f(y) dy

=

∫Gt(x, y)f(y) dy +

∫Gap

t (x, y)∂

∂yf(y) dy.

Therefore, if f ∈ C1(Rd), then both terms are uniformly bounded. Repeating thisprocedure m times shows that

‖Ttf‖Cm(Rd) ≤ C‖f‖Cm(Rd),

i.e., the operators T 1t act as bounded operators on Cm(Rd). The extension to

(5.120) is obtained as in the proof of Theorem 5.8.3.

When dealing with the operators T 2t , the same procedure transfers the firstm

derivatives from Gap to Φτ . Therefore, we need the regularity of Φ. Two cases mustbe distinguished. The simpler case is when m ≤ k. In this case, the differentiationof F in the terms of the series (5.65) that defines Φ, like in the second term∫Ft,τ (., z)Fτ,s(z, y) dz, does not create a non-integrable singularity in τ , since

each differentiation just ‘spoils’ the estimates by a factor of the order (t− τ)−1/β

(assuming of course the differentiability of ω(x, p) with respect to x), and theestimate of the derivatives of all terms of the series (5.65) is done as in Lemma5.5.2 for Φ itself. An additional difficulty arises when dealing with m > k. In this

Page 344: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

330 Chapter 5. Linear Evolutionary Equations: Advanced Theory

case, the direct differentiation of F in the terms of the series (5.65) does create anon-integrable singularity. However, the situation is similar to the case when wedealt with the derivatives in t in Lemma 5.7.1. Therefore, it can be resolved byexactly the same procedure as there. �

Theorems 5.8.2 and 5.8.3 include the case of an even integer β = 2q, q ∈ N,and ψ being a homogeneous polynomial of order 2q in the variable p, which isthe case of classical parabolic PDEs. In this case, however, many things becomesimpler. For instance, since ω = m(x, s) is a polynomial in s, it is automaticallyinfinitely smooth in s. And the conditions on the positivity of the real part of ωare fulfilled if the polynomial ψ(x, p) is (strictly) elliptic, i.e.,

C−1|p|2q ≤ ψ(x, p) ≤ C|p|2q (5.122)

for all x and some constant C > 0.

As we already mentioned, the theory extends to evolutions that are gener-ated by homogeneous operators supplemented by operators of lower terms, like in(4.204). Namely, let us consider the equation

ft = −ψ(x,−i∇)ft + Ltft (5.123)

with ψ(x, p) = −ω(x, p/|p|)|p|β, with a β > 1, and

Ltf =

k∑m=0

∑j1≤···≤jm≤d

btj1···jm(x)∂mf(x)

∂xj1 · · ·∂xjm

+

k∑m=0

∑j1≤···≤jm≤d

∫∂mf(y)

∂yj1 · · · ∂yjmνtj1···jm(x, dy),

(5.124)

where k ≥ 1 is any natural number that is strictly less than β, and btj1···jm(x) and

νtj1···jm(x, dy) are families of measurable functions and transition kernels on Rd,respectively. Let p be the smallest integer that is not less than β. The followingresult is a direct consequence of Theorems 4.13.5 and 5.8.3 with D = Cp

∞(Rd) andB = Ck∞(Rd).

Theorem 5.8.4.

(i) Let ψ(x, p) = ω(x, p/|p|)|p|β with a β > 1 be a symbol satisfying the assump-tions of Theorem 5.8.3 with any q ≥ 0. Let btj1···jm(x) be bounded measur-able complex-valued functions and νtj1···jm(x, dy) uniformly bounded complexstochastic kernels. Then the series (4.166) with U t,s = Tt−s and Lt from(5.124) converges in C∞(Rd), and the corresponding functions Φt,rf solvethe mild form of equation (5.123).

(ii) Let additionally btj1···jm ∈ C([0, T ], Cq(Rd)) (possibly complex-valued), and letthe partial derivatives of the kernels νtj1···jm(x, .) with respect to x of the orderup to and including q exist as uniformly bounded weakly continuous families

Page 345: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.9. Uniqueness via duality and accretivity; generalized solutions 331

of complex transition kernels. Then the family of operators Φt,r from (i) actsstrongly continuously in C∞(Rd) and Cq

∞(Rd), and the corresponding series(4.166) converges in Cq(Rd). Moreover, the operators Φt,r are smoothing, sothat

‖Φt,rf‖Cl+q(Rd) ≤ C(t− r)−l/β‖f‖Cq(Rd). (5.125)

If q ≥ k, then Φt,rf solves equation (4.186) classically.

5.9 Uniqueness via duality and accretivity;

generalized solutions

The method of frozen coefficients did not supply the uniqueness of the solutions tothe Cauchy problems (5.51), since we can only assert that the constructed Greenfunction is unique among functions represented by (5.64) with a sufficiently regularΦ. Therefore, we cannot derive from Theorem 5.6.1 that the operators Tt form asemigroup. In the sequel, we shall present four approaches to uniqueness that allowfor a completion of the analysis of the equations (5.51). In this section, we discusstwo approaches that are based on duality, the second one being reminiscent to themethod of accretivity.

The result on uniqueness that we begin with is very close to Theorem 4.10.1.The difference is that instead of starting with a propagator and building the dualpropagator via duality, we start with the solutions (that do not necessarily forma propagator) for both the forward and the backward problems that have beenconstructed independently from the duality. We formulate the result in terms ofdual pairs of Banach spaces, i.e., pairs of Banach spaces (Bobs, Bst) such that eachof these spaces is a closed subspace of the dual of the other space that separatesthe points of the latter.

Theorem 5.9.1. Suppose that (Bobs, Bst) is a dual pair of Banach spaces andDobs ⊂ Bobs, Dst ⊂ Bst are their dense subspaces. Let At be a family of lin-ear operators Dobs → Bobs such that their dual operators A∗

t acting from B∗obs

to D∗obs represent the extensions (by weak continuity) of the family of operators,

which is also denoted by A∗t and acts from Dst to Bst.

Let U t,r, t ≤ r, be a strongly continuous family of bounded linear operatorsin Bobs such that Dobs is invariant and the equation

d

dsfs = −Asfs, s < t, (5.126)

is satisfied by fs = Us,tf for any f ∈ Dobs. Let Vr,t, r ≥ t, be a strongly contin-

uous family of bounded linear operators in Bst such that Dst is invariant and theequation

d

dsξs = A∗

sξs, s > t, (5.127)

is satisfied by ξs = V s,tξ for any ξ ∈ Dst.

Page 346: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

332 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Then V s,tξ is the unique solution to the Cauchy problem of equation (5.127),i.e., if ξt = ξ for a given ξ ∈ Dst and ξs, s ≥ t, is a continuous curve in Bst

such that ξs ∈ Dst for all s and satisfies (5.127), then ξs = V s,tξ for all s ≥ t.Analogously, Us,rf is the unique solution to the inverse Cauchy problem (5.126).

Proof. It is almost identical to the proof of Theorem 4.10.1. Namely, for a solutionξs to equation (5.127) with the initial condition ξt = ξ and any f ∈ Dobs, we definethe function g(s) = (Us,tf, ξs) and show exactly as in the proof of Theorem 4.10.1that g′(s) = 0. Therefore, we find g(s) = (f, ξs) = g(t) = (U t,sf, ξt), which showsthat ξs is uniquely defined, because Bst separates the points of Bobs and Dobs isdense in Bobs. The second statement is proved similarly. �

Of course, the uniqueness stated in Theorem 5.9.1 implies that families U t,s

and V s,t form a backward and a forward propagator, respectively.

Exercise 5.9.1. Analyse the proof of Theorem 4.10.1 in order to find out wherethe assumption of U t,r being a backward propagator was really used. Why is itnot possible to just start with any family U t,r resolving the corresponding Cauchyproblem, then build a family of dual operators solving the dual problem, and thencomplete the proof as above?

As an example, let us apply this result to equations of the type (5.123) withslightly more restrictive assumptions.

Theorem 5.9.2. Let us consider the equation

ft = −ψ(x,−i∇)ft + Ltft, (5.128)

with ψ(x, p) being an elliptic polynomial (see (5.122)) of order 2q, q ∈ N, and

Ltf =

k∑m=0

∑j1≤···≤jm≤d

btj1···jm(x)∂mf(x)

∂xj1 · · · ∂xjm

+

k∑m=0

∑j1≤···≤jm≤d

∫∂mf(y)

∂yj1 · · · ∂yjmνtj1···jm(x, y) dy,

(5.129)

where k ≥ 1 is any natural number that is strictly less than 2q, and btj1···jm(x)

and νtj1···jm(x, y) are functions of the class C2k(Rd) and C2k(R2d) ∩ H2k1 (R2d),

respectively.

Then the solutions constructed in Theorem 5.8.4 are unique and the operatorsΦt,r form a propagator.

Proof. This result follows from Theorem 5.9.1, since the dual operator to

−ψ(x,−i∇)f + Lt is − ψ(x,−i∇)f + Lt,

Page 347: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.9. Uniqueness via duality and accretivity; generalized solutions 333

with the same main term −ψ(x,−i∇) and Lt of the same type as Lt. Therefore,the solutions to the Cauchy problem with this operator −ψ(x,−i∇)f + Lt orto the time-reversed Cauchy problem with the operator ψ(x,−i∇)f − Lt can beconstructed by Theorem 5.8.4. �

Remark 93. The additional smoothness assumptions on ν have the specific purposethat the dual operator satisfies the conditions of Theorem 5.8.4(ii). Moreover, werestricted the main term ψ to be a polynomial, because otherwise the dual operatorwould not be of the same form as ψ, but would rather contain hyper-singularintegrals of the type (1.154) with nontrivial characteristics. In such a case, dealingwith the corresponding Cauchy problem would require additional work.

The duality can be used for proving uniqueness in a slightly different way(reminiscent to using accretivity) that we are going to explain now. This approachallows us to treat some other classes of equations, for instance those with symbolsψ(x, p) = at(x)|p|α. Instead of presenting the abstract framework, we just showhow the method works for this particular example.

Proposition 5.9.1. Let β > 0 and a(x) be a continuous function on Rd such thatC−1 ≤ a(x) ≤ C for all x and some C > 0. Then for any m ≥ β and f ∈Cm

∞(Rd) ∩ Hm1 (Rd), there may exist at most one (real) curve ft ∈ Cm

∞(Rd) ∩Hm

1 (Rd) that is continuous both in C∞(Rd) and L1(Rd) and satisfies the equation

ft = −a(x)|∇|βft, t ≥ s, both in the topology of C∞(Rd) and L1(Rd), and the

initial condition fs = f .

Proof. It is sufficient to prove the uniqueness for fs = 0. In this case, we have

d

dt

∫ft(x)ft(x)[a(x)]

−1 dx = −2

∫ft(x)|∇|βft(x) dx

= −2

∫|∇|β/2ft(x)|∇|β/2ft(x) dx ≤ 0,

and hence∫ft(x)ft(x)[a(x)]

−1 dx = 0 for all t. Therefore, we find ft = 0. �

Corollary 4. Let β > 0, m ≥ β, a(x) ∈ Cm(Rd) and C−1 ≤ a(x) ≤ C for all xand some C > 0. Then the resolving family of operators Tt of the Cauchy problemfor the equation ft = −a(x)|∇|βft constructed in Theorem 5.8.3 form a semigroupand yield unique solutions both in Cm

∞(Rd) and Hm1 (Rd).

Proof. The uniqueness of the solutions in Cm∞(Rd)∩Hm

1 (Rd), which follows fromTheorem 5.9.1, implies that the Tt form a semigroup, because TsTt and Tt+s mustcoincide as a solution to the same problem. Therefore, the uniqueness in eachspace Cm

∞(Rd) or Hm1 (Rd) is derived from Theorem 4.10.1. �

Corollary 5. Theorem 4.14.2 about the equation

ft = −σ|Δ|α/2ft + Ltft

Page 348: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

334 Chapter 5. Linear Evolutionary Equations: Advanced Theory

remains true if σ is not a constant, but a function from Ck(Rd) that is boundedfrom below by a positive constant.

Proof. It is straightforward. �

Exercise 5.9.2. Extend Proposition 5.9.1 to the case of the time-dependent fam-ily at(x).

Theorem 5.9.1 yields a convenient framework to discuss generalized solutions,as mentioned at the end of Section 4.1. Let us say that a continuous curve fs,s ≤ t, in Bobs is a generalized solution by approximation to the Cauchy problemof equation (5.126) with the terminal condition ft, if it satisfies this terminalcondition and there exists a sequence of elements fn

t ∈ D such that fnt → f

and the corresponding classical (i.e., belonging to the domain) solutions Us,tftconverge to fs, as n → ∞. Moreover, let us say that a continuous curve fs, s ≤ t,in Bobs (continuity can be understand in the strong sense or weakly with respectto the pair) is a generalized solution by duality to the Cauchy problem of equation(5.126) with the terminal condition ft, if for any ξ ∈ Dst, the weak form of equation(5.126) holds, i.e.,

d

ds(fs, ξ) = −(fs, A

∗ξ), s < t, ξ ∈ Dst. (5.130)

Generalized solutions to (5.127) are defined in a symmetrical manner.

Applying Theorem 4.10.1 and taking into account that Dobs is dense in Bobs,we can derive the following corollary to Theorem 5.9.1.

Proposition 5.9.2. Under the assumptions of Theorem 5.9.1, for any f ∈ Bobs

(respectively for any ξ ∈ Bst), the curve Us,tf (respectively V s,tξ) represents theunique generalized solution by approximation and the unique generalized solutionby duality to the Cauchy problem of equation (5.126) (respectively (5.127)).

Of course, analogues of these notions of generalized solutions can be definedand exploited for more general pairs of spaces (Bobs, Bst) that do not necessarilysatisfy the conditions of Theorem 5.9.1, and do not necessarily have to be Banach.For instance, the usual notion of a generalized solution in the theory of generalizedfunctions corresponds to the duality arising from pairs of locally convex spaces(D,D′).

5.10 Uniqueness via positivity and approximations;

Feller semigroups

In this section, we show that the uniqueness of solutions to the Cauchy problemsof equations of at most second order can be established with the help of theproperty of positivity-preservation. Towards the end of the section, we will develop

Page 349: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.10. Uniqueness via positivity and approximations; Feller semigroups 335

an alternative approach that is based on approximation and can also be directlyapplied to equations of at most second order.

One says that a linear operator L acting from a subspace D of C(Rd) tosome other space of functions on Rd

(i) is conditionally positive, if Lf(x) ≥ 0 for any f ∈ D such that f(x) = 0 =miny f(y);

(ii) satisfies the positive maximum principle (PMP), if Lf(x) ≤ 0 for any f ∈ Dsuch that f(x) = maxy f(y) ≥ 0.

By passing from f to −f , the property (ii) is seen to be equivalent to therequirement that Lf(x) ≥ 0 for any f ∈ D such that f(x) = miny f(y) ≤ 0. Inparticular, it implies conditional positivity.

By shifting, one sees that conditional positivity implies the PMP wheneverD contains constants and L takes non-positive values on them. For example, theoperator of multiplication u(x) �→ c(x)u(x) with a function c ∈ C(Rd) is alwaysconditionally positive, but it satisfies the PMP only in the case of a non-negative c.

The main class of examples for conditionally positive operators are specialrepresentatives of the operators of at most second order, namely the so-calledLevy–Khintchin-type operators, which are generally given by the following formula:

Lf(x) =1

2(A(x)∇,∇)f(x) + (b(x),∇)f(x)

+

∫(f(x+ y)− f(x)− (∇f(x), y)χ(|y|))ν(x, dy) + c(x)f(x),

(5.131)

where A is a non-negative symmetric matrix-valued function, ν is a non-negativekernel such that

∫min(|y|2, 1)ν(x, dy) < ∞ (so-called Levy kernel) and χ a non-

negative decreasing function (a mollifier, needed to make the integral well defined),which is normally taken to be either the indicator function χ(s) = 1s≤1 or χ(s) =1/(1 + s2). If ν(x, .) has a bounded first moment, then χ(|y|) is not needed at all.

It is readily seen that these operators are defined on C2(Rd) and conditionallypositive. They satisfy the PMP if and only if the function c(x) is non-positive.

Remark 94. It can be shown (Courrege theorem) that, under mild additionalassumptions, conditionally positive operators have to be of the type (5.131).

The following result is representative for a class of results that is referred toas the maximum principle.

Theorem 5.10.1. Let a subspace D ⊂ C(Rd) contain constant functions, and let afamily of operators Lt : D �→ C(Rd) satisfying the PMP be given. Let the numberss < T be given and u(t, x) ∈ C([s, T ] ×Rd. Assume that u(s, x) is non-negativeeverywhere, u(t, .) ∈ C∞(Rd) ∩D for all t ∈ [s, T ], is differentiable in t for t > sand satisfies the evolutionary equation

∂u

∂t= Ltu, t ∈ (s, T ].

Page 350: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

336 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Then u(t, x) ≥ 0 everywhere, and

max{u(t, x) : t ∈ [s, T ], x ∈ Rd} = max{u(s, x) : x ∈ Rd}. (5.132)

Proof. Suppose inf u = −α < 0. For a δ < α/(T − s), consider the function

v = u(t, x) + δ(t− s).

Clearly, this function also has a negative infimum. Since v tends to a positiveconstant δt as x → ∞, v has a global negative minimum at some point (t0, x0),which lies in (s, T ]×Rd. Therefore, we find ∂v

∂t (t0, x0) ≤ 0 (the equality being trueif t0 < T ), and by the PMP Ltv(t0, x0) ≥ 0. Consequently,(

∂v

∂t− Ltv

)(t0, x0) ≤ 0.

On the other hand, from the evolution equation (and since Ltδ ≤ 0 by the PMP),one deduces that(

∂v

∂t− Ltv

)(t0, x0) ≥

(∂u

∂t− Ltu

)(t0, x0) + δ = δ,

because, by the PMP, L takes positive (respectively negative) constants to non-positive (respectively non-negative) functions. This contradiction completes theproof of the first statement.

Next, assume that

max{u(t, x) : t ∈ [s, T ], x ∈ Rd} > max{u(s, x) : x ∈ Rd}.Then there exists a δ > 0 such that the function v = u(t, x)− δ(t− s) also attainsits maximum at a point (t0, x0) with t0 > s. Therefore, we find ∂v

∂t (t0, x0) ≥ 0 (theequality being true if t0 < T ), and by the PMP Ltv(t0, x0) ≤ 0. Consequently,(

∂v

∂t− Ltv

)(t0, x0) ≥ 0.

But from the evolution equation,(∂v

∂t− Ltv

)(t0, x0) ≤

(∂u

∂t− Ltu

)(t0, x0)− δ = −δ.

This is again a contradiction. �Remark 95. We assumed that the domain of Lt contains constants. This is notnecessary. If Lt satisfies the PMP on a subspace D of C∞(Rd), we can extend itto the space generated by D and constants by linearity and by setting Lt to bezero on constant functions. In this case, the result of Theorem 5.10.1 and its proofwould remain valid. Alternatively, the extension can be performed by continuity.

Page 351: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.10. Uniqueness via positivity and approximations; Feller semigroups 337

A strongly continuous semigroup Tt (or a propagator U t,s) in C∞(Rd) iscalled Feller semigroup (respectively Feller propagator) if for any f with values in[0, 1], all functions Ttf (respectively U t,sf) also have values in [0, 1].

Corollary 6. Strongly continuous semigroups in C∞(Rd) generated by operatorsthat satisfy the PMP are Feller semigroups. In particular, this is the case foroperators L of the type (5.131) with a non-positive c(x).

Corollary 7.

(i) Under the conditions on D and Lt as in the above theorem, assume thatf ∈ C([s, T ]×Rd), g ∈ C∞(Rd). Then the Cauchy problem

∂u

∂t= Ltu+ ft, u(s, x) = g(x), (5.133)

can have at most one solution u ∈ C([s, T ]×Rd) such that u(t, .) ∈ C∞(Rd)for all t ∈ [s, T ] and u(t, .) ∈ D for all t ∈ (s, T ] (and the equation is supposedto hold for t > s).

(ii) In particular, this result holds for operators Lt of the Levy–Khintchin type(5.131), if its coefficients are continuous bounded functions, where one cantake B = C∞(Rd) and D = C2

∞(Rd). (The kernels νt(x, dy) should dependcontinuously on t and x in the weak sense, i.e.,

∫f(y)ν(x, dy) is a continuous

function whenever f is continuous and such that f(x) ≤ min(1, |x|2).)As a direct consequence of this uniqueness result, we get the following im-

provement of Theorem 5.8.4.

Theorem 5.10.2. If β ∈ (1, 2] in Theorem 5.8.4 and ψ(x, p) =∫Sd−1 |(p, s)|βμ(ds)

(so that ψ(x,−i∇) is of the Levy–Khinchine type by (1.145)), then the operatorsΦt,r form a propagator yielding unique solutions to equation (5.123). If additionallythe operators Lt are of the Levy–Khintchin type, the propagators Φt,r are Feller.

From the characterization of the continuous operators in C∞(Rd) (see (1.77))and the property of positivity-preservation, it follows that the Feller semigroupsTt are given by the formulae Ttf(x) =

∫f(y)μt(x, dy) with some transition kernels

μt such that ‖μt(x, .)‖ ≤ 1. With this formula, one can extend the action of Tt

on C∞(Rd) to the space C(Rd) (although Ttf for f ∈ C(Rd) may turn out tobe not continuous). A Feller semigroup is called conservative, if its extension toC(Rd) preserves constants. In terms of μt, this condition means that all μt(x, .)are probability measures: ‖μt(x, .)‖ = 1.

The simplest examples of Feller semigroups are generated by the Levy–Khintchin operators, that is, operators (5.131) with A,b, ν constant (not depend-ing on x) and vanishing c. We shall discuss these examples in some detail in thenext section. For now, let us introduce another approach to uniqueness, based onapproximations.

Page 352: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

338 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Theorem 5.10.3. Let B be a Banach space and D a dense subspace that is itselfa Banach space under some norm ‖.‖D ≥ ‖.‖B. Let At be a family of uniformlybounded linear operators D → B. Let there exist a sequence of families of boundedoperators An

t in B such that

(i) Ant f → Atf for any f ∈ D,

(ii) the norms ‖Ant f‖ are uniformly bounded on bounded subsets of D,

(iii) the propagators U t,sn =T exp{∫ t

s Anτ dτ} in B (see (2.39) for the used notation)

generated by Ant are uniformly bounded in n for t from any compact set.

Then for any f ∈ D there might exist at most one continuous curve ft, t ≥ s,such that fs = f applies, ft ∈ D for all t, supt∈[s,T ] ‖ft‖D < ∞ and ft satisfies

the equation ft = Atft.

Proof. Let us show that any curve ft satisfying the requirements of the theoremis necessarily the limit of the sequence fn

t = U t,sn f solving the Cauchy problem for

the equations fnt = An

t fnt . This would imply the uniqueness.

From the equations for ft and fnt , we get the equation

d

dt(fn

t − ft) = Ant f

nt −Atft = An

t (fnt − ft) + (An

t −At)ft.

Since fns − fs = 0, it follows from Proposition 4.10.2 (and Remark 78) that

fnt − ft =

∫ t

s

U t,sn (An

s −As)fs ds.

By the assumption (i), the integrand converges to 0, as n → ∞. By the assumptions(ii) and (iii), one can apply the dominated convergence theorem to conclude thatfnt − ft → 0, as claimed. �

This theorem is most easily applied to Levy–Khintchin-type operators, whereit provides an alternative proof to statement (ii) of the corollary to Theorem5.10.1. In fact, by approximating the Levy kernels ν(x, dy) by bounded kernels1|y|>εν(x, dy) and the derivatives by the corresponding finite differences, one getsapproximations to the Levy–Khintchin-type operators that also satisfy the PMPand therefore generate propagators that are contractions and hence uniformlybounded. We will apply this method to fractional equations in Chapter 8.

Finally, let us formulate the general result on the well-posedness of Levy–Khintchin-type operators that can be kept in mind as a standard example for well-posed linear problems. This result extends the properties of diffusions given in The-orem 4.3.1, as well as Theorem 5.10.2. Recall that the Wasserstein–Kantorovichdistance of order p between measures on Rd with a finite pth moment is defined as

Wp(ν1, ν2) =

(infν

∫|y1 − y2|pν(dy1dy2)

)1/p

, (5.134)

Page 353: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.11. Levy–Khintchin generators and convolution semigroups 339

where inf is taken over all ν on Rd×Rd that couple ν1 and ν2, i.e., their projectionon the first and second variable coincides with ν1 and ν2, respectively.

Remark 96. This distance is usually defined for probability measures. But (5.134)also makes perfect sense for unbounded measures, as long as they have a finitesecond moment.

Theorem 5.10.4. Let an operator L have the form (5.131) with vanishing c, where

‖√A(x1)−

√A(x2)‖+ |b(x1)− b(x2)|+W2(1B1(.)ν(x1; .),1B1(.)ν(x2; .))

≤ κ|x1 − x2| (5.135)

with a certain constant κ, and

supx

(√A(x) + |b(x)|+

∫B1

|y|2ν(x, dy))

< ∞. (5.136)

Let the family of finite measures {1Rd\B1)(.)ν(x; .)} be uniformly bounded, tight

and depend weakly continuously on x. Then L extends to the generator of a con-servative Feller semigroup.

As in the case of degenerate diffusions, the proof is based on the application ofstochastic differential equations. We shall not give it here, but refer to the originalpaper [146].

In the remaining chapter, we shall look in more detail at some concrete classesof ΨDEs and their corresponding propagators.

5.11 Levy–Khintchin generators andconvolution semigroups

In this section, we shall deal with the properties of semigroups that are generatedby Levy–Khintchin operators, that is, operators of the type (5.131) with constantcoefficients and vanishing c:

Lf(x) =1

2(A∇,∇)f(x) + (b,∇)f(x)

+

∫(f(x+ y)− f(x)− (∇f(x), y)1|y|≤1)ν(dy),

(5.137)

where ν is a Levy measure, i.e., a measure on Rd \ {0} such that∫min(|y|2, 1)ν(dy) < ∞. (5.138)

In probability theory, the symbol of the ΨDO on the r.h.s. of (5.137),

ψ(p) = e−ipxLeipx

= −1

2(Ap, p) + i(b, p) +

∫[ei(p,y) − 1− i(p, y)1|y|≤1]ν(dy),

(5.139)

Page 354: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

340 Chapter 5. Linear Evolutionary Equations: Advanced Theory

is called the Levy exponent or Levy symbol or characteristic exponent of L, andthe semigroups generated by Levy–Khintchin operators describe the so-called Levyprocesses.

Let us emphasize that the condition (5.138) ensures that the Levy exponentψ(p) is well defined as a continuous function on R such that ψ(0) = 1 and |ψ(p)| ≤C(1 + p2) with some constant C and Reψ(p) ≤ 0.

Proposition 5.11.1. For any non-negative matrix A, a vector b and a Levy measureν, the operator L of the type (5.137) generates a Feller semigroup Tt in C∞(Rd)with all spaces Ck∞(Rd) invariant and, for k ≥ 2, representing cores. Moreover,Tt acts as

Ttf(x) =

∫f(x− y)μt(dy) =

∫f(x+ y)μt(dy), (5.140)

where the probability measure μt(dy) is the Green function of the Cauchy problemof the operator L given by the formulae

μt = F−1(etψ(.)) ⇐⇒ etψ(p) = (Fμt)(p), (5.141)

and where μ(dy) = μt(−dy).

Proof. The real part of the function −ψ(p) is bounded from below. Therefore,

the Cauchy problem for the equation ddt ft = ψ(p)ft, which is obtained by Fourier

transforming the equation ft = Lft, has the solution ft = exp{tψ(p)}f0. Thissolution defines a strongly continuous semigroup in all spaces Lp, see Theorem2.4.1. Consequently, for any f0 ∈ C∞(Rd) from the Wiener ring F (L1(R

d)) of

the Fourier transforms of functions f0 ∈ L1(Rd), the solution Ttf0 = ft to the

Cauchy problem for the equation ft = Lft is a uniquely defined continuous curvein C∞(Rd). Moreover, by Theorem 5.10.1, the mapping f0 → Ttf0 is such that forany f0 with values in [0, 1] the functions Ttf0 also have values in [0, 1]. Therefore,the operators Tt are contractions in C∞(Rd) when reduced to the Wiener ringF (L1(R

d)). Hence Tt extends to the strongly continuous semigroup on the wholeC∞(Rd) by a density argument.

Since the ΨDO L has constant coefficients, its semigroup is given by theconvolution with the generalized function Gψ

t−s, see (2.67) and (2.62), referred to

as the Green function of the Cauchy problem of L. Since the Green function Gψt−s

in our case defines a positive continuous operator on C∞(Rd), it is a measure (bythe Riesz–Markov theorem). This yields (5.140) and (5.141).

The invariance of the spaces Ck∞(Rd) is seen from (5.140). Since the Schwartz

space is invariant under the Fourier transform and belongs to the domain of theFourier transformed semigroup Tt, it belongs to the domain of L. It follows thatC2∞(Rd) belongs to the domain of L, because L is closed on its domain (Theorem4.1.1(vii)) and L is bounded as an operator C2

∞(Rd) → C∞(Rd). �

Page 355: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.11. Levy–Khintchin generators and convolution semigroups 341

In terms of μt, the semigroup equation TtTs = Ts+t can be rewritten as theconvolution equation

μt � μs = μs+t,

showing that the μt form a convolution semigroup, i.e., a semigroup with respectto the convolution. The strong continuity of Tt translates into the vague continuityof μt.

Remark 97. One can show that the vague continuity of a family of probabilitymeasures implies its weak continuity, so that μt is a weakly continuous family.

Notice that Tt of (5.140) extends directly to the semigroups of contractionson the space of bounded Borel functions on Rd.

For the theory of differential equations, it is important to know in whichsense the initial condition and the equation ft = Lft are satisfied by Ttf wheneverf deviates from the domain of L. Besides, one is also interested in regularizationproperties of Tt. The following result gives a partial answer (see Proposition 5.11.4for a follow-up).

Proposition 5.11.2. Let the assumptions of Proposition 5.11.1 hold. (i) If f is abounded Borel function, then Ttf(x) → f(x), as t → 0, at any point x of continuityof f . (ii) The semigroup Tt extends to the strongly continuous semigroup on thespace Cuc(R

d) of uniformly continuous functions on Rd. (iii) If all measures μt

have no atoms, then Ttf is continuous whenever f is bounded with at most acountable number of discontinuities.

Proof. (i) Let x be a point of continuity of f . For any ε, we can choose δ suchthat |x − y| < δ implies |f(x) − f(y)| < ε. Due to the strong continuity of Tt onC∞(R), we can choose t small, so that

μt[Rd \ (−δ, δ)d] < ε, μt[(−δ, δ)d] > 1− ε.

It follows that∣∣∣∣∫

f(x+ y)μt(dy)− f(x)

∣∣∣∣ =∣∣∣∣∫(f(x+ y)− f(x))μt(dy)

∣∣∣∣≤ 2ε‖f‖+

∣∣∣∣∣∫(−δ,δ)d

(f(x+ y)− f(x))μt(dy)

∣∣∣∣∣≤ 2ε‖f‖+ ε

with all three terms being of order ε.

(ii) This is the same as in (i). The only modification is that, for f ∈ Cuc(Rd),

δ can be chosen uniformly in x.

(iii) If μt have no atoms, f is bounded with at most a countable number ofdiscontinuities, and xn → x as n → ∞, then f(xn + y) → f(x + y) almost surelywith respect to μt. Therefore, the dominated convergence theorem completes theproof. �

Page 356: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

342 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Specific features arise in equations with one-sided ν, namely equations of thetype

ft = Lνft(x) =

∫ ∞

0

(ft(x+ y)− ft(x))ν(dy) =

∫ ∞

−∞(ft(x + y)− ft(x))ν(dy),

(5.142)where ν is a measure with a support on {y : y > 0} that satisfies the one-sidedLevy condition ∫ ∞

0

min(1, y)ν(dy) < ∞. (5.143)

The latter condition ensures that the symbol

ψν(p) =

∫(eipy − 1)ν(dy) (5.144)

of the operator Lν on the r.h.s. of (5.142) is well defined for all p, as a continuousfunction. Moreover, ψ(0) = 1, |ψ(p)| ≤ C(1 + |p|) with some constant C andReψ(p) ≤ 0.

Proposition 5.11.3.

(i) Under the condition (5.143), the equations (5.142) generate the Feller semi-groups Tt on C∞(R) such that

Ttf(x) =

∫ ∞

0

f(x+ y)G(ν)(t, dy) =

∫ 0

−∞f(x− y)G(ν)(t, dy), (5.145)

with some probability measures

G(ν)(t, dy) on R+ and G(ν)(t, dy) = G(ν)(t, d(−y)) on R−,

such that the value of Ttf(x) depends only on f(z) with z ≥ x. The spaceC1∞(R) is an invariant core for Tt.

(ii) The Tt have the following monotonicity properties: If f is non-decreasing,then Ttf(x) is non-decreasing both in t and in x, and Ttf(x) ≥ f(x) for allx and t.

(iii) Comparison principle: Let ν1, ν2 be two measures satisfying (5.143) and defin-ing the semigroups T 1

t and T 2t . Let ν1(dy) ≥ ν2(dy). Then T 1

t f(x) ≥ T 2t f(x)

for any non-decreasing f .

Proof. (i) If ν is a finite measure, then the assertion follows from (4.97). A generalν can be approximated by finite νε(dy) = 1|y|≥εν(dy), in which case the assertionfollows from the convergence of the corresponding semigroups, see Proposition4.2.2. The statement about the core is a consequence of the three facts: C2

∞(R)is a core by Proposition 5.11.1, Lν is bounded as an operator C1∞(R) → C∞(R),and the generator Lν is closed on its domain.

Page 357: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.11. Levy–Khintchin generators and convolution semigroups 343

(ii) It is seen from (5.145) (and the fact that the G(ν)(t, .) are probability mea-sures) that Ttf(x) is non-decreasing in x, and Ttf(x) ≥ f(x) for non-decreasingf . The monotonicity in t follows from equation (5.142).

(iii) Let ν1 − ν2 = ν3. Since ν3 is positive, it also defines a semigroup, sayT 3t , of the same kind. Moreover, all T j

t commute. This can be first checked forapproximating bounded νj as in (i), and then by passing to the limit for general νj .Moreover, T 1 = T 3T 2. This is a straightforward consequence of the Lie–Trotterformula (5.17) for commuting generators, or can be directly proved. Therefore,to show that T 1

t f(x) ≥ T 2t f(x) for any non-decreasing f means to show that

T 3t T

2t f(x) ≥ T 2

t f(x), and hence that T 3t g(x) ≥ g(x) for any non-decreasing g

(since T 2t f is non-decreasing for non-decreasing f). But this holds due to the third

statement in (ii). �

Remark 98. In probability theory, the order relation T 1t f(x) ≥ T 2

t f(x) for any non-decreasing f is referred to as stochastic order, and the property that Tt preservesthe set of increasing functions is referred to as stochastic monotonicity.

Proposition 5.11.4. Under the assumption (5.143), let Tt be the correspondingsemigroup given by (5.145) and let the measures μt have no atoms. Let f ∈ C∞(R)be piecewise differentiable, i.e., there exists a finite number of points a1 < · · · < aksuch that f is continuously differentiable outside the set of these points with auniformly bounded derivative. Then

limt→0

1

t(Ttf(x)− f(x)) = Lνf(x), (5.146)

for all x /∈ {a1, . . . , ak}.

Proof. Let fn be a sequence of uniformly bounded elements of C1∞(R) such that,for any y /∈ {a1, . . . , ak}, fn(y) = f(y) for large enough n (depending on y).Consequently, Lνfn(y) → Lνf(y) for y /∈ {a1, . . . , ak}, where Lνf(y) is definedby the r.h.s. of (5.142) and is continuous for y /∈ {a1, . . . , ak}. In fact, for anyy ∈ (ak−1, ak) we can write

Lνf(y) =

∫ ∞

0

(f(y + z)− f(y))ν(dz)

=

∫ (ak−y)/2

0

(f(y + z)− f(y))ν(dz) +

∫ ∞

(ak−y)/2

(f(y + z)− f(y))ν(dz),

The first term is bounded and approximated by the corresponding expression forfn, because f

′ is bounded on the segment [y, y+(ak−y)/2]. And the second term isbounded and approximated by the corresponding expression for fn, because ν(dz)is bounded for z > (ak − y)/2. This argument shows that Lνf(y) is uniformlybounded on R \ {a1, . . . , ak}. Hence, by the dominated convergence theorem anddue to the fact that the μt have no atoms, we find (TsLνfn)(x) → (TsLνf)(x)

Page 358: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

344 Chapter 5. Linear Evolutionary Equations: Advanced Theory

for all x. Again by the dominated convergence theorem, we can pass to the limitn → ∞ in the equation

Ttfn(x) = fn(x) +

∫ t

0

(TsLνfn)(x) ds

and obtain

Ttf(x) = f(x) +

∫ t

0

(TsLνf)(x) ds.

This implies (5.146), because (TsLνf)(x) → Lνf(x), as s → 0, by Proposition5.11.2. �

For a concrete ν, the assumptions on f ensuring (5.146) can be further weak-ened. The important methodological consequence of Proposition 5.11.4 is the possi-bility to talk about the values of Lνf(x) at particular points x, even if the functionLνf(x) is not globally defined. If (5.146) holds, then we can say that f belongs tothe domain of L locally, at a point x. This concept is crucial for applications toboundary-value problems of PDEs and ΨDEs, see e.g. Proposition 8.4.2 below.

As for the general Levy–Khintchin operators, G(ν)(t, .) is the Green functionof the Cauchy problem for equation (5.142), and

G(ν)(t, .) = F−1(etψν(.)) ⇐⇒ etψν(p) = (FG(ν)(t, .))(p). (5.147)

Similarly, the Cauchy problem for the equations

ft = L′νft(x) =

∫ ∞

0

(ft(x− y)− ft(x))ν(dy), (5.148)

with a generator that is dual to the operator on the r.h.s. of (5.142), has solutionsof the form

T ′tf(x) =

∫ 0

−∞f(x+ y)G(ν)(t, dy) =

∫ ∞

0

f(x− y)G(ν)(t, dy), (5.149)

the Green function G(ν)(t, dy) and the following symbol of the generator:

ψν−(p) = ψν(−p) =

∫(e−ipy − 1)ν(dy).

Remark 99. For the development of generalized fractional equations (see Chapter8), it is crucial that the space C([a,∞)) (as a subspace of C(R) of functions beingconstant to the left of a) and its subspace Ckill(a)([a,∞)) consisting of functionsvanishing to the left of a are both invariant under T ′

t . The restriction of thegenerator L′

ν to these subspaces of C(R) is a far-reaching extension of the mixedfractional derivative of the Caputo-type and of the RL-type.

Page 359: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.12. Potential measures 345

5.12 Potential measures

When working with one-sided equations of the type (5.142), it is often conve-nient to analyse the properties of the Green functions G(ν)(t, .) via their Laplacetransform. Namely, one introduces the Laplace exponent of the operator Lν as

φν(λ) = −ψν(iλ) =

∫ ∞

0

(1− e−λy)ν(dy), (5.150)

so that

e−tφν(λ) = etψν(iλ) =

∫ 0

−∞eλyG(ν)(t, dy)

=

∫ ∞

0

e−λyG(ν)(t, dy) = (LG(ν)(t, dy))(λ),

(5.151)

where we used the notation L for the Laplace transform.

An important general concept is the following: an infinitely differentiablefunction f on (0,∞) with non-negative values is called completely monotone (re-spectively a Bernstein function) if (−1)nf (n)(λ) ≥ 0 for all n = 0, 1, . . . (respec-tively if the derivative of f is completely monotone). Bernstein’s theorem states(for a proof, see, e.g., [241]) that a function f : (0,∞) → R is completely monotoneif and only if it is the Laplace transform of a positive measure:

f(λ) = (L(μ))(λ) =∫ ∞

0

e−λyμ(dy)

with some μ on {y : y ≥ 0} that may not be finite, but such that L(μ) is awell-defined function on {λ : λ > 0}. Therefore, the Laplace exponents φν(λ) of(5.150) represent the Bernstein functions, and the exponents e−tφν(λ) are com-pletely monotone for all t.

Proposition 5.12.1.

(i) For any measure ν on {y : y > 0} satisfying (5.143), there exists the vaguelimit

U (ν)(M) =

∫ ∞

0

G(ν)(t,M) dt

of the measures∫K

0G(ν)(t, .) dt, K → ∞, such that U (ν)(M) is finite for any

compact M . Moreover, the Laplace transform of U (ν) is well defined for allλ > 0 and

(LU (ν))(λ) =

∫ ∞

0

e−λyU (ν)(dy) = 1/φν(λ). (5.152)

(ii) Comparison principle: if ν1(dy) ≥ ν2(dy), then∫f(y)U (ν1)(dy) ≥

∫f(y)U (ν2)(dy)

for any non-decreasing f . The opposite inequality holds for non-increasing f .

Page 360: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

346 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Proof. (i) We have∫ ∞

0

(∫ ∞

0

e−λyG(ν)(t, dy)

)dt =

∫ ∞

0

e−tφν(λ)dt =1

φν(λ).

For any function u(y) with the compact support [0, z], it holds u(y) ≤ ‖u‖e−λyeλz .Therefore, ∫ ∞

0

(∫ ∞

0

u(y)G(ν)(t, dy)

)dt ≤ ‖u‖ eλz

φν(λ).

Consequently, U (ν)(.) =∫∞0

G(ν)(t, .) dt is a well-defined finite measure on [0, z]

for any finite z, and thus U (ν) exists as a σ-finite measure on R+.

(ii) This follows from the comparison principle for the semigroups Tt, seeProposition 5.11.3(iii). �

For example, if ν is finite, then it follows from (4.97) that∫f(y)U (ν)(dy) =

1

‖ν‖f(0) +∞∑k=1

1

‖ν‖k∫

· · ·∫

f(y1 + · · ·+ yk)ν(dy1) · · · ν(yk).(5.153)

Consequently, applying the comparison principle of Proposition 5.12.1 leads to thefollowing result.

Corollary 8. The potential measure U (ν) has an atom at zero if and only if ν isfinite, in which case this atom is δ0/‖ν‖.Exercise 5.12.1. Extend this assertion to λ-potential measures as defined in (5.154)below.

As already mentioned, in the terminology of differential equations, G(ν)(t, .)(respectively G(ν)(t, .)) is the Green function of the Cauchy problem for the opera-tor Lν (respectively for the operator L′

ν). Then, by Proposition 1.11.4, the measureU (ν)(dy) on {y ≥ 0} is the fundamental solution to the operator −L′

ν , and themeasure U (ν)(−dy) on {y ≤ 0} is the fundamental solution to the operator −Lν .

In the terminology of semigroups, the operator g → ∫g(x+ y)U (ν)(dy) with

the kernel U (ν)(dy) is the potential operator for the semigroup Tt, see Proposition4.1.4, and the convolution operator g → ∫

g(x − y)U (ν)(dy) is the potential op-erator for the semigroup T ′

t . However, unlike the statement of Proposition 4.1.4,this operator is not defined as a strong limit and therefore may be unbounded inC∞(Rd). The measure U (ν)(dy) is usually referred to as the potential measure ofthe convolution semigroup {G(ν)(t, .)}. Accordingly, the measure

U(ν)λ (A) =

∫ ∞

0

e−λtG(ν)(t, A) dt (5.154)

is called the λ-potential measure of the convolution semigroup {G(ν)(t, .)}. It is

finite with ‖U (ν)λ ‖ = 1/φν(λ) and represents the integral kernel of the resolvent

Page 361: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.12. Potential measures 347

operator R′λ of the semigroup T ′

t generated by L′ν . On the other hand, R′

λ =

(λ−L′ν)

−1 implies that its integral kernel U(ν)λ (dy) is the fundamental solution to

the operator λ− L′ν . Of course, U

(ν)0 (dy) = U (ν)(dy).

The final result of this section is devoted to the question of uniqueness of thefundamental solution to the operators L′

ν and their shifts.

Proposition 5.12.2. Let the measure ν on {y : y > 0} satisfy (5.143).

(i) For any λ > 0, the λ-potential measure U(ν)λ represents the unique funda-

mental solution to the operator λ− L′ν .

(ii) If the support of ν is not contained in a lattice {αn, n ∈ Z}, with some α > 0,then the measure U (ν)(dy) represents the unique fundamental solution to theoperator −L′

ν up to an additive constant.

(iii) Let {αn, n ∈ Z} be the minimal lattice (that cannot be further rarified) con-taining the support of ν, such that for any k ∈ Z, k > 1, there exists n ∈ Zsuch that αn belongs to the support of ν and n/k /∈ Z. Then any two fun-damental solutions to the operator −L′

ν differ by a linear combination of thetype

G(x) =∑n∈Z

an exp{2πnix/α} (5.155)

with some numbers an. In particular, U (ν)(dy) is again the unique funda-mental solution vanishing on the negative half-line.

Proof. (i) For any two fundamental solutions U1, U2 of λ− L′ν , the Fourier trans-

form implies that (ψν(−p)− λ)(FG)(p) = 0 for G = U1 −U2. Since Re (ψν(−p)−λ) ≤ −λ < 0 for all p, FG(p) = 0 and hence G = 0.

(ii) For any two fundamental solutions U1, U2 of −L′ν, the Fourier transform

implies ψν(−p)(FG)(p) = 0 for G = U1 − U2. Since the support of ν is notcontained in a lattice, ψν(−p) < 0 everywhere except at p = 0, because cos(py)−1 < 0 everywhere except when y = 2πn/p with some n ∈ Z. Therefore, FGhas a support at zero. Consequently, by Proposition 1.9.1, FG is a finite linearcombination of the derivatives δ(j) of the δ-function. But the derivative of ψν(−p)at zero does not vanish: it either equals −i

∫yν(dy), if this integral is finite, or is

not finite at all, if otherwise. In both cases, FG cannot have other terms in thesum apart from the δ-function itself. In fact, ψν(−p)

∑j ajδ

(j)(p) = 0 would meanthat

m∑j=0

aj [ψν(−p)φ(p)](j)(0) = 0

for any φ ∈ D(Rd). This is possible only if all aj = 0 for j > 0. Hence G is aconstant, as claimed.

(iii) Under the assumption of (ii), we have ν(dy) =∑

n>0 bnδαn(y) with somenon-negative numbers bn such that for any k ∈ Z, k > 1, there exists n ∈ Z such

Page 362: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

348 Chapter 5. Linear Evolutionary Equations: Advanced Theory

that bn > 0 and n/k /∈ Z. Therefore,

ψν(p) =∞∑n=1

bn(eipαn − 1).

Notice that ψν(p) = 0 if and only if cos(pαn) = 1, or equivalently pαn = 2πlwith l ∈ Z, for all n with bn > 0. Therefore, ψν(p) = 0 for pm = 2πm/α, m ∈ Z.Moreover, if p is not of this form, then ψν(p) = 0. To see this, let us assumeotherwise, i.e., ψν(p) = 0 for some p = pm. Then p = (2π/α)(m/k) with somerational number m/k. Let us choose it in such a way that the fraction m/k isirreducible. Then k > 1, since p = pm. Let us choose n ∈ Z such that bn = 0 andn/k /∈ Z. Since pαn = 2πl with some integer l, it follows that m/k = l/n. Sincem/k is irreducible, n/k is an integer, which leads to a contradiction.

Consequently, if ψν(−p)(FG)(p) = 0, then the support of FG is on thelattice {pm} (called the dual lattice to the lattice αn). As in (i), we find that thederivatives of the δ-function cannot enter the formula for FG. Therefore, we haveFG(p) =

∑m∈Z amδpm(p), which implies (5.155). The final statement is due to

the fact that a linear combination of exponents cannot vanish on the negativehalf-line. �

In Chapter 8, we shall develop this topic further, since Proposition 5.12.2 isthe cornerstone for the development of the generalized fractional calculus.

5.13 Vector-valued convolution semigroups

In this section, we present the Banach-valued extensions of the semigroups fromProposition 5.11.3, and some of their direct applications. Further extensions willbe provided in Chapter 8.

Proposition 5.13.1.

(i) Let B be a Banach space. Under the condition (5.143), the operators Tt and T ′t

given by (5.145) and (5.149) extend to C(R, B) (by the same formula) andrepresent strongly continuous semigroups in each of the spaces Ck∞(R, B),k = 0, 1, . . . and in Cuc(R, B).

(ii) If T εt and (T ε

t )′ denote the semigroups generated by the finite approximations

νε(dy) = 1|y|≥εν(dy) of ν, then T εt → Tt and (T ε

t )′ → T ′

t strongly, as ε → 0,

in each of the spaces Ck∞(R, B).

(iii) The space C1∞(R, B) is an invariant core for both Tt and T ′

t in C∞(R, B).

(iv) The resolvent operators R′λ of the semigroup T ′

t , given by the formula

R′λf(x) =

∫ ∞

0

f(x− y)U(ν)λ (dy),

also extend to bounded operators in C∞(R, B), so that R′λ(λ−L′

ν)f = f forany f ∈ C∞(R, B).

Page 363: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.13. Vector-valued convolution semigroups 349

Proof. (i) For the sake of definiteness, let us deal with Tt. For f ∈ C∞(R, B),the integral in (5.145) can be defined as the limit of Riemannian sums, whichconverge (in the norm of B) due to the uniform continuity of f on R. The uniformcontinuity of f also implies that Ttf is continuous. Since f tends to zero at infinity,the same holds for Ttf . The boundedness of the Tt in C∞(R, B) follows from theboundedness of their restrictions to C∞(R). The proof of the strong continuityin Cuc(R, B) relies on the same arguments as in Proposition 5.11.2(i). Finally,the same argument shows that Tt acts strongly continuously in each of the spacesCk∞(R, B), because differentiation commutes with all the operators Tt.

(ii) For any ε, the semigroups T εt are generated by bounded operators. Con-

sequently, they can be defined by a convergent exponential series. Since T εt → Tt

as operators in C∞(R) (by Proposition 4.2.2), it follows that the correspondingmeasures G(νε)(t, .) converge weakly to G(ν)(t, .) and hence T ε

t f → Ttf weakly inB for any f ∈ B. In order to see that this convergence is strong, we can estimatethe difference T ε1

t −T ε2t , with some ε1 > ε2, by the same method as used in formula

(4.8):

T ε1t − T ε2

t =

∫ t

0

T ε2t−s(Lνε1

− Lνε2)T ε1

s ds.

Therefore, if f ∈ C1(R, B),

‖(T ε1t − T ε2

t )f‖C(R,B) ≤∫ t

0

ds

∥∥∥∥∫ ε1

ε2

(T ε1s f(.+ y)− T ε1

s f(.))ν(dy)

∥∥∥∥C(R,B)

≤∫ t

0

ds

∫ ε1

ε2

y‖T ε1s f‖C1(R,B)ν(dy)

≤∫ t

0

ds

∫ ε1

ε2

y‖f‖C1(R,B)ν(dy), (5.156)

which tends to zero as ε1 → 0. Therefore, the family T εt f is Cauchy in C∞(R, B),

and hence the weak convergence implies the strong convergence.

(iii) For each ε and any f ∈ C∞(R, B), we have

T εt f(x)− f(x) =

∫ t

0

LνεTεsf(x) ds.

For f ∈ C1∞(R, B) we can pass to the limit ε → 0, which yields

Ttf(x)− f(x) =

∫ t

0

LνTsf(x) ds.

Therefore, C1∞(R, B) is an invariant core for Tt in C∞(R, B).

(iv) This follows from (i), (iii) and the general Theorem 4.1.1. �

Page 364: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

350 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Exercise 5.13.1. Check that Propositions 5.11.4 and 5.11.2 extend to the presentBanach-space-valued setting. Moreover, if the Banach space B is a Banach latticewith respect to some partial order relation, then the monotonicity properties ofProposition 5.11.3(ii) and (iii) also hold in this setting.

Theorem 5.13.1.

(i) Let A be an operator in B that generates a strongly continuous semigroupetA in B with an invariant core D. Let D itself be a Banach space undersome norm ‖.‖D ≥ ‖.‖B (for the canonical choice of such a norm see (4.10))such that etA is also strongly continuous in D and A is a bounded operatorD → B. Then etA extends to the strongly continuous semigroup in C∞(R, B)with the invariant core C∞(R, D).

(ii) The semigroup etA and the semigroups T ′t generated by L′

ν (as constructed inProposition 5.13.1) are commuting semigroups in C∞(R, B) with C1

∞(R, D)being their common invariant core. Moreover, the operator L′

ν +A generatesa strongly continuous semigroup T ′

tetA in the following spaces: (a) C∞(R, B)

with the invariant core C1∞(R, D); (b) Cuc(R, B) with the invariant core

C1uc(R, D) of functions from C1(R, D) which are uniformly continuous to-

gether with their derivatives; (c) Cuc((−∞, b], B) for any b, with the invariantcore C1

uc((−∞, b], D).

Proof. (i) The operators etA act pointwise in C∞(R, B): (etAf)(x) = etA(f(x)).These operators form a bounded semigroup in C∞(R, B), because the etA forma bounded semigroup in B. For the strong continuity, we note that the pointwiseconvergence, etA(f(x)) → f(x), as t → 0 for any x, follows from the strongcontinuity of etA in B. The uniform convergence in x is straightforward on compactsets and extends to all x due to f ∈ C∞(R, B). By applying the same result to D,we can conclude that the operators etA represent a strongly continuous semigroupin C∞(R, D) as well. Finally, for any f ∈ C∞(R, B), we have

1

t(etAf(x)− f(x)) =

1

t

∫ t

0

AesAf(x) ds → Af(x), as t → 0,

uniformly in x.

(ii) The commutativity of etA and T ′t can best be proved by starting from

their approximations with a bounded generator (say, the Yosida approximationfor A and (T ε

t )′ for Tt) and then passing to the limit in the commutation relation.

From the commutativity of T ′t and etA, it follow that the operators T ′

tetA form a

strongly continuous semigroup in C∞(R, B). Since both T ′t and etA have the core

C1∞(R, D), it follows that C1

∞(R, D) is also a core for T ′te

tA. Similarly, one dealswith other spaces. �

As a direct application, let us discuss some very simple fractional PDEs.Namely, let us consider the Cauchy problem

∂ft∂t

= L′νft +Aft, ft|t=0(x, y) = f0(x, y), (5.157)

Page 365: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.14. Equations of order at most one 351

where x ∈ R and y ∈ Rd, L′ν acts on the variable x and is given by (5.148),

and A is a linear operator acting on the variable y that generates a stronglycontinuous semigroup etA in C∞(Rd), with some invariant core D ⊂ C∞(Rd).

For instance, −L′ν can be taken as dβ

dxβ with β ∈ (0, 1) or as a linear combinationof these derivatives with positive coefficients, and A can be the generator of anarbitrary Feller semigroup, or an elliptic PDO, or a mixed fractional Laplacian likeA = a(y)|Δ|α. As a consequence of Theorem 5.13.1, we get the following result.

Proposition 5.13.2. For any f0 ∈ C∞(R, D), there exists a unique solution ft ∈C∞(R, D) to the problem (5.157). It is given by the formula

ft(x, .) =

∫ ∞

0

etAf(x− z, .)G(ν)(t, dz). (5.158)

5.14 Equations of order at most one

In this section, we extend equations with the r.h.s. of the type Lν or L′ν , as dis-

cussed above, to variable coefficients. Namely, we deal with semigroups generatedby integro-differential (or pseudo-differential) operators of order at most one, i.e.,by operators

Lf(x) = (b(x),∇f(x)) +

∫Rd\{0}

(f(x+ y)− f(x))ν(x, dy) (5.159)

with Levy kernels ν(x, .) that have a finite local first moment∫B1

|y|ν(x, dy). (Notethat Ba denotes the ball of radius a in Rd centered at the origin.) The argumentswill be similar to the arguments used in Proposition 5.13.1(ii), although we willrestrict our attention to real-valued evolutions only.

Theorem 5.14.1. Assume that b ∈ C1(Rd) and that ∇ν(x, dy), the gradient of theLevy kernel with respect to x, exists in the weak sense as a signed measure anddepends weakly continuously on x, in the sense that

∫f(y)∇ν(x, dy) is a contin-

uous function for any f ∈ C(Rd) with a support separated from zero. Moreover,assume that

supx

∫min(1, |y|)ν(x, dy) < ∞, sup

x

∫min(1, |y|)|∇ν(x, dy)| < ∞, (5.160)

and that for any ε > 0 there exists a K > 0 such that

supx

∫Rd\BK

ν(x, dy) < ε, supx

∫Rd\BK

|∇ν(x, dy)| < ε, (5.161)

supx

∫B1/K

|y|ν(x, dy) < ε. (5.162)

Page 366: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

352 Chapter 5. Linear Evolutionary Equations: Advanced Theory

Then L generates a conservative Feller semigroup Tt in C∞(Rd) with the invari-ant core C1

∞(Rd). Moreover, Tt reduced to C1∞(Rd) is also a strongly continuous

semigroup in the Banach space C1∞(Rd), where it is regular in the sense that

‖Tt‖C1∞(Rd) ≤ eKt (5.163)

with a constant K.

Proof. Notice first that (5.160) implies, for any ε > 0,

supx

∫Rd\Bε

ν(x, dy) < ∞, supx

∫Rd\Bε

|∇ν(x, dy)| < ∞. (5.164)

Next, since the operator∫Rd\B1

(f(x+ y)− f(x))ν(x, dy) (5.165)

is bounded in the Banach spaces C(Rd) and C1(Rd) (by (5.160)) and also inthe Banach spaces C∞(Rd) and C1∞(Rd) (by (5.161)), the standard perturbationargument (see Theorem 4.6.1 and Remark 70) makes it possible to reduce thesituation to the case when all ν(x, dy) have support in B1, which we shall assumefrom now on.

Let us introduce the approximation

Lhf(x) = (b(x),∇f(x)) +

∫Rd\Bh

(f(x+ y)− f(x))ν(x, dy). (5.166)

For any h > 0, this operator generates a conservative Feller semigroup T ht in

C∞(Rd) with the invariant core C1∞(Rd), because the first term in (5.166) does

so and the second term is a bounded operator in the Banach spaces C∞(Rd) andC1

∞(Rd) (by (5.164)). Therefore, perturbation theory (Theorem 4.6.1) applies.The conservativity also follows from the perturbation series representation, andthe contraction property follows, e.g., from Theorem 5.10.1.

Formally differentiating the equation f(x) = Lhf(x) with respect to x (thatis, assuming that all derivatives are well defined) yields the equation

d

dt∇kf(x) = Lh∇kf(x) + (∇kb(x),∇f(x)) +

∫B1\Bh

(f(x+ y)− f(x))∇kν(x, dy).

(5.167)Considering this an evolution equation for g = ∇f in the Banach space C∞(Rd ×{1, . . . , d}) = C∞(Rd) × · · · × C∞(Rd), we observe that the r.h.s. is representedas the sum of a diagonal operator that generates a Feller semigroup and of twobounded (uniformly in h by (5.160)) operators of g (by expanding f(x + y) −f(x) into a Taylor series). Therefore, these evolutions are well posed and generatesemigroups that are uniformly bounded in h.

Page 367: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.14. Equations of order at most one 353

Let us now show that if f0 ∈ C1∞(Rd), then ft ∈ C1∞(Rd) and its derivativeg = ∇f is actually given by the semigroup generated by (5.167). To this end,we first approximate b and ν by a sequence of twice continuously differentiableobjects bn, νn, n → ∞, and prove this claim for the corresponding operators Ln

h.For the corresponding approximating evolutions,

d

dt∇kf

n(x) = Lnh∇kf

n(x) + (∇kbn(x),∇fn(x))

+

∫B1\Bh

(fn(x+ y)− fn(x))∇kνn(x, dy),(5.168)

for gn = ∇fn, the space C1∞(Rd) is an invariant core. Therefore, if f0 ∈ C2

∞(Rd),then fn

t ∈ C2∞(Rd), so we can legitimately differentiate the evolution of fn

t andconclude that it does satisfy the equation (5.168). By the uniqueness Theorem4.10.1, we can further conclude that the evolution is given by the semigroup gen-erated by the operator on the r.h.s. of (5.168). Next, if f0 ∈ C1

∞(Rd), then wecan come to the same conclusion by approximating it by functions from C2

∞(Rd).Finally, the semigroup generated by the operator on the r.h.s. of (5.168) is givenby a perturbation series, where all terms apart from Ln

h are considered pertur-bations. Letting n → ∞, we observe that all terms of this series converge to thecorresponding series without the label ‘n’. Therefore, the derivatives of the ap-proximations ∇fn

t converge, as n → ∞, to the function given by the semigroupgenerated by (5.167). Hence the approximations ∇fn

t converge to ∇ft and thefunction gt = ∇ft is well defined and can be obtained by applying the operatorsof the semigroup generated by (5.167) to g0 = ∇f0.

Consequently, we may conclude that the ∇kTht f are uniformly bounded for

all h ∈ (0, 1] and t from any compact interval whenever f ∈ C1∞(Rd). Therefore,

by the same method as used in formula (4.8), we can write

(T h1t − T h2

t )f =

∫ t

0

T h2t−s(Lh1 − Lh2)T

h1s ds

for arbitrary h1 > h2 and then estimate

|(Lh1 − Lh2)Th1s f(x)| ≤

∫Bh1

\Bh2

|(T h1s f)(x+ y)− (T h1

s f)(x)|ν(x, dy)

≤∫Bh1

‖∇T h1s f‖|y|ν(x, dy) = o(1)‖f‖C1∞, as h1 → 0,

by (5.162), which yields

‖(T h1t − T h2

t )f‖ = o(1)t‖f‖C1∞, as h1 → 0. (5.169)

Therefore, the family T ht f converges to a family Ttf , as h → 0. Clearly, the limiting

family Tt specifies a strongly continuous semigroup in C∞(Rd). Writing

Tt − f

t=

Tt − T ht f

t+

T ht − f

t

Page 368: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

354 Chapter 5. Linear Evolutionary Equations: Advanced Theory

and noting that the first term is of the order o(1)‖f‖C1∞ due to (5.169), as h → 0,

we can conclude that C1∞(Rd) belongs to the domain of the generator of thesemigroup Tt in C∞(Rd) and that it is given there by (5.159).

In order to show that C1∞(Rd) is an invariant core, let us now apply to Tt

the procedure applied above to T ht . Differentiating, first formally, the evolution

equation with respect to x, we obtain for g = ∇f the equation

d

dtgk(x) = Lgk(x)+(∇kb(x),∇f(x))+

∫B1\Bh

(f(x+y)−f(x))∇kν(x, dy). (5.170)

In order to show that the semigroup generated by this equation actually yields thederivatives of ft for f ∈ C1

∞(Rd), we again approximate b and ν by a sequence oftwice-continuously differentiable objects bn, νn, n → ∞. For these bn, νn, the esti-mates (5.169) can be obtained in C1

∞(Rd) in the same way as they were obtainedabove in C∞(Rd). This implies the claim for the approximating evolutions withbn, νn. And again, the semigroups generated by the r.h.s. of (5.170) with bn, νnconverge to the semigroup generated by the r.h.s. of (5.170) without the label ‘n’,because all terms of the perturbation series converge.

The perturbation series representation also implies that Tt is a strongly con-tinuous semigroup in C1

∞(Rd), and that the estimate (5.163) holds. �

5.15 Smoothness and smoothing of propagators

In the abstract form, the regularization property of the semigroup St in a Banachspace B generated by an operator A with the domain D(A) means that Stμ ∈D(A) for any t > 0 and any μ ∈ B (not necessarily from D) and that an estimateof the type ‖AStμ‖ ≤ c(t)‖μ‖ holds with some c(t) having a singularity at zero.The abstract theory is well developed for Hilbert spaces B. For instance, if A ispositive and self-adjoint, then its semigroup is known to be regularizing with c(t)of the order t−1, see, e.g., [244]. In the non-Hilbert setting of spaces of measuresand continuous functions (where we mostly work), the regularization propertyis usually derived from the existence and properties of the Green function. InTheorem 5.8.3, we derived both the regularization property (5.120) of Cauchyproblems that arise from ΨDOs with homogeneous symbols and the preservation-of-smoothness property via certain manipulations with the Green function. Nowwe are going to show that the initial regularization property can be deepened, i.e.,semigroups that regularize continuous functions will also regularize more smoothfunctions. This deepening is the consequence of some mild assumption on the coreand of the possibility to perform a regular approximation.

For a pseudo-differential operator A in Rd with the symbol A(x, p), which wedenote by the same letter with some abuse of notation, let us denote by ∂A

∂xjthe

operator with the symbol ∂A∂xj

(x, p), and more generally by ∂kA∂xi1 ···∂xik

the operators

Page 369: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.15. Smoothness and smoothing of propagators 355

with the symbol ∂kA(x,p)∂xi1 ···∂xik

. The main examples are as follows: (i) A is a differential

operator

Af(x) =∑

Ai1···ik(x)∂kf(x)

∂xi1 · · · ∂xik

,

in which case the derivatives of A are reduced to the derivatives of its coefficients,say

∂A

∂xjf(x) =

∑ ∂Ai1···ik(x)∂xj

∂kf(x)

∂xi1 · · ·∂xik

;

(ii) A is an integral operator of the Levy–Khintchin type (5.131), in whichcase the differentiation of its integral part reduces to the differentiation of thetransition kernel ν with respect to the first variable:

∂A

∂xjf(x) =

1

2

(∂A(x)

∂xj∇,∇

)f(x) +

(∂b(x)

∂xj,∇)f(x)

+

∫(f(x+ y)− f(x)− (∇f(x), y)χ(y))

∂ν(x, dy)

∂xj,

(5.171)

with similar formulae for other derivatives.

Working in the framework of Banach spaces Ck(Rd) suggests to use operatorsof at most kth order (see Proposition 1.7.2). Whenever we choose such operatorsas the generators of semigroups in C∞(Rd), we shall tacitly assume that theirdomains contain the space Ck

∞(Rd).

By far the most important class of operators are operators of at most secondorder. In particular, such operators arise in the analysis of Markov processes.Although our methods work for operators with arbitrary order, we shall stick tooperators of second order for the sake of clearness and simplicity.

We shall use the fact that the equations of all practical examples are takenfrom a class where a natural approximation by operators of the same class is avail-able, which are either bounded or have smoother coefficients. Abstractly speaking,the Yosida approximations can serve as A(n). However, since we mostly apply thetheory to differential or pseudo-differential operators, these approximations canbe constructed explicitly by (i) using finite differences instead of differential oper-ators (or, more generally, linear combinations of exponentials for approximatingthe symbol of a pseudo-differential operator), in order to get a bounded approx-imation, or (ii) approximating the coefficients of a differential operators (moregenerally, the symbol of a pseudo-differential operator) by smoother ones, in orderto get more regular operators of the same class. Therefore, the existence and eventhe explicit form of the approximation A(n) is usually seen directly in concreteexamples.

Theorem 5.15.1. Let A be a pseudo-differential operator of at most second orderwhich generates a semigroup etA in C∞(Rd) having C2

∞(Rd) as its core and enjoys

Page 370: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

356 Chapter 5. Linear Evolutionary Equations: Advanced Theory

the following smoothing property: etAf ∈ C1∞(Rd) for any t > 0, f ∈ C∞(Rd),and

‖etAf‖C1(Rd) ≤ κt−ω‖f‖C(Rd), (5.172)

with constants κ > 0, ω ∈ (0, 1), and t ∈ (0, 1]. Let the operator ∂A∂xj

be well defined

as an operator of at most second order. Moreover, assume that the operator A canbe approximated by a sequence of operators An, so that

‖An −A‖L(C2∞(Rd),C(Rd)) → 0,

∥∥∥∥∂A∂x − ∂An

∂x

∥∥∥∥L(C2∞(Rd),C(Rd))

→ 0, (5.173)

as n → ∞, and such that the An have the same properties as A, but additionallyleave the spaces C1

∞(Rd), C2∞(Rd) and C3

∞(Rd) invariant under exp{tAn}, andsuch that the semigroups exp{tAn} are strongly continuous in C1∞(Rd).

Then etA takes C1(Rd) to C2∞(Rd),

‖etAf‖C2(Rd) ≤ κt−ω‖f‖C1(Rd) (5.174)

with some other constant κ, and the etA are strongly continuous in C1∞(Rd).

Similarly, if eAt acts strongly continuously on the spaces Cl∞(Rd) with l =1, . . . , k, then

‖etAf‖Ck(Rd) ≤ κt−ω‖f‖Ck−1(Rd). (5.175)

Proof. Let us only deal with the first statement, that is with k = 2. Assume firstthat the spaces Cj

∞(Rd), j = 1, 2, 3, are invariant under all operators etA, andthat the etA are strongly continuous in C1

∞(Rd). Then, if f0 ∈ C3∞(Rd), we can

differentiate the equation f = Af and obtain the equation

g = Adiagg +∂A

∂xf (5.176)

for the derivative g = ∂f∂x =

(∂f∂x1

, . . . , ∂f∂xd

), where Adiag is the diagonal operator

with the element A on the main diagonal. Due to Proposition 1.7.2, we can write

∂A

∂xf = A1g +B1f, (5.177)

where A1 is an operator of at most first order and B1 is a bounded operator inC∞(Rd). Therefore, the first step in the analysis of equation (5.176) consists ofanalysing the equation

g = Adiagg +A1g. (5.178)

Since C2∞(Rd) is an invariant core for etA, applying Theorem 4.6.4 withB = (C∞(Rd))d, B = (C1

∞(Rd))d and L = A1 leads to the conclusion that thesemigroup Φt yielding mild solutions to equation (5.178) is strongly continuousboth in B and B, takes B to B with the estimate

‖Φt‖B→B ≤ κt−ω (5.179)

and is generated by the operator Adiag +A1 on the invariant core (C2∞(Rd))d.

Page 371: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.15. Smoothness and smoothing of propagators 357

Next, by Proposition 4.10.2, equation (5.176) written as g = Adiagg +A1g+B1f has a unique solution gt for any g0 ∈ (C2

∞(Rd))d and a curve

f. ∈ C([0, T ], (C∞(Rd))d),

and it is given by the formula

gt = exp{t(Adiag +A1)}g0 +∫ t

0

exp{(t− s)(Adiag +A1)}B1fs ds. (5.180)

Moreover, for any f. ∈ C([0, T ], (C∞(Rd))d) and g0 ∈ (C∞(Rd))d (respectivelyg0 ∈ (C1

∞(Rd))d), the function t �→ gt is continuous in the topology of (C∞(Rd))d

(respectively (C1∞(Rd))d) and

‖gt‖(C1∞(Rd))d ≤ κt−ω‖g0‖(C∞(Rd))d +κt1−ω‖B1‖

1− ωsup

s∈[0,t]

‖fs‖C∞(Rd). (5.181)

Since g = ∂f∂x solves equation (5.176) with the initial condition g0 = ∂f0

∂xand the unique solution to this problem is given by (5.180), we can conclude thatthis formula yields the derivative g = ∂f

∂x . The estimate (5.181) translates to therequired estimate (5.174).

Due to the continuous dependence of gt on g0 and the assumed boundednessof the semigroup etA in C1

∞(Rd), formula (5.180) for the derivative g = ∂f∂x remains

valid even for f0 ∈ C1∞(Rd), and due to the smoothing property of eAt even forf ∈ C∞(Rd) and t > 0.

If A is approximated by the sequence An with the invariant spaces C2∞(Rd)

and C3∞(Rd), then gn given by (5.180) with all operators labeled by n accordingly

yield the derivatives gn = ∂fn

∂x . Because of the first estimate in (5.173), Proposition4.2.2 and the assumption that C2

∞(Rd) is a core for etA, we can conclude that thefnt = etA

n

f converge to ft = etAf in C∞(Rd) for any f ∈ C∞(Rd). By thesecond estimate in (5.173) and the perturbation argument, we can conclude thatexp{t(An

diag +An1 )}g0 converges to exp{t(Adiag +A1)}g0 for any g0 ∈ (C∞(Rd))d.

Consequently, the gnt =∂fn

t

∂x given by (5.180) with all objects labeled with ‘n’also converge for any f0 ∈ C∞(Rd), as n → ∞. The limiting function is givenby (5.180) and is equal to the derivative gt =

∂ft∂x , which implies all the required

estimates. �Corollary 9. Under the assumptions of Theorem 5.15.1, its statement generalizesto the assertion that eAt for t > 0 takes the space CbLip(R

d) to C2(Rd), with thesame estimate:

‖etAf‖C2(Rd) ≤ κt−ω‖f‖CbLip(Rd). (5.182)

Proof. This follows from the observation that elements of CbLip(Rd) can be uni-

formly approximated by elements of C1(Rd) and that the norms of CbLip(Rd) and

C1(Rd) coincide in C1(Rd). �

Page 372: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

358 Chapter 5. Linear Evolutionary Equations: Advanced Theory

5.16 Summary and comments

In this chapter, we continued our analysis of linear systems and extended thetheory into various directions.

The method with the T -product is another classical tool for dealing withnon-homogeneous, as well as nonlinear, equations. One of the first books to sys-tematically present this method in the context of quantum mechanics (nonlin-ear Schrodinger equation and related questions) was [201]. We developed the T -product approximation in Section 5.1. Our exposition is close to [100], althoughwe rely on the method of Banach towers for justifying the convergence insteadof assuming the reflexivity of the involved Banach spaces. In Section 5.2, we pro-vided a result that belongs to a class of formulae which is generally referred to asthe Lie–Trotter–Daletski–Chernoff formula. Following here the exposition of [148],we highlighted once again the convenient use of the method of Banach towers inconjunction with the method of regular approximation as a tool for proving con-vergence of Lie–Trotter approximations. The final results on mixing, Theorems5.3.2 and 5.3.3, are possibly new.

An alternative to finite Banach towers that turns out to be a handy tool inmany cases are the scales of Banach spaces that depend on a continuous parame-ter, the so-called Ovsyannikov’s method, for which we refer to [81] and referencestherein.

The second part of the chapter was devoted to the method of frozen coeffi-cients, which permit the construction of propagators for equations with variablecoefficients whenever the corresponding Cauchy problem for the equation withconstant coefficients is well understood. The literature on this method is quite ex-tensive, and we are not attempting to review it. Since we mainly apply the methodto the case of homogeneous symbols, let us mention the paper [128], followed by[213], where this method was initially applied to rather general homogeneous sym-bols. The monograph [72] is gives a very detailed presentation of the method offrozen coefficients for homogeneous symbols, using the theory of hyper-singularintegrals and the Fourier expansion of coefficients in the series with respect tospherical harmonics. For some recent achievements, let us mention [127] and [173],which improve the method of frozen coefficients by using a modified initial approx-imation for the Green function (by adding some correcting term). Serious attentionis given to equations with an index of homogeneity α ∈ (0, 2), since these equa-tions naturally appear in the analysis of Markov processes, namely the so-calledstable and stable-like processes. Two-sided estimates for the Green functions areprovided in [134] and [136]. In our exposition, an abstract version of the methodof frozen coefficients has been developed, which allows for a rather concise andunified presentation for various special cases, not only of the convergence of themain perturbation series, but also of the regularization properties of the obtainedresolving operators. The detailed exposition given here in such generality seemsto be new.

Page 373: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

5.16. Summary and comments 359

Unfortunately, we did not touch at all the issues related to the evolutionequations in domains with a boundary, see, e.g., [39, 51, 121] and references thereinfor various directions in this topic.

Next, we turned to the general methods for proving the uniqueness of solu-tions to Cauchy problems. We presented the methods based on duality, accretiv-ity, positivity and approximations in an abstract form. As examples, we analysedvarious classes of Levy–Khintcin-type operators that generate Feller semigroups.Considering the related convolution semigroups as lifted to Banach spaces, onecan obtain various useful extensions. As examples, we analysed simple fractionalPDEs. Next, following [147], we studied the class of Cauchy problems generatedby operators of order at most one, which can be considered the most generalrepresentation of mixed fractional derivatives. Proposition 5.12.2, which is themethodological basis for our future treatment of generalized fractional calculus, isnew. In the final section, the link between smoothness and smoothing was clari-fied in a rather general setting. Smoothing properties of the operator semigroupsare crucial for their application to nonlinear equations, as will be made clear inChapter 6.

Page 374: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

Chapter 6

The Method of Propagatorsfor Nonlinear Equations

This chapter applies the method of linear propagators to nonlinear equations withan unbounded r.h.s. We provide the well-posedness of nonlinear evolutions andtheir sensitivity with respect to initial data and parameters. It is shown how thesemethods work for equations with memory and with anticipating path dependence.The general theory is illustrated on various concrete examples. In particular, weanalyse nonlinear heat conduction equations, nonlinear Schrodinger equations andcomplex diffusions, HJB equations and Mc-Kean–Vlasov equations, Cahn–Hilliardequations, and the related forward-backward systems. The chapter ‘celebrates’ thestrength of the abstract Theorems 2.1.1 to 2.1.3, showing how various nontrivialequations can be dealt with more or less directly by means of these results, withestimates for the growth of solutions via the Mittag-Leffler functions.

We shall look at evolutionary equations where the r.h.s. is given by pseudo-differential operators with symbols At(x, p) that depend on an unknown function(equations in function spaces) or on an unknown measure (equations on measures),in particular by differential operators whose coefficients depend on the unknownfunction or measure. This dependence can be a) pointwise, i.e., At(x, p) dependson the value ut(x) of the unknown function u(x) (or of the density of the unknownmeasure), b) local, i.e., At(x, p) depends on the spatial derivatives of ut(x), or c)integral. In the last case, one can distinguish (i) the spatial integral dependence,i.e., At(x, p) depends on some integrals over ut(x) taken at time t, (ii) the adaptiveor causal dependence, i.e., At(x, p) depends on the values of us at times s ≤ t,and (iii) general path dependence. All of these cases require different regularityassumptions on the symbols or coefficients for the analysis. Namely, for pointwiseor local dependence of At(x, p) on some values of u, the smoothness is quantita-tively given in terms of the partial derivatives with respect to these values. For theintegral dependence of At(x, p) on u, the smoothness is most naturally representedby variational derivatives.

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_6

361

Page 375: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

362 Chapter 6. The Method of Propagators for Nonlinear Equations

A critical point for the classification of the methods that are used for analysisconcerns the possible non-degeneracy of the major (most singular) term that canusually be used for deriving certain smoothing properties of the evolution gener-ated by this major term and therefore allowing for much less regular behaviour ofthe remaining parts. The absence of such non-degeneracy assumptions makes theanalysis much more demanding. Another key point is the presence or absence ofnonlinearity in the major term. Finally, as in the linear case, serious progress canbe achieved by exploiting various representations of the equations, namely mild,strong or weak. These representations make it possible to deal with various reg-ularity classes of solutions. The strong form for measure-valued evolutions seemsto be most appropriate for the non-degenerate case, when the smoothing prop-erty turns arbitrary initial measures into measures with smooth densities, wherepseudo-differential operators are well defined. Since pseudo-differential operatorsare not well defined on arbitrary measures, however, the strong form may notbe available in the absence of smoothing, in which case the weak form becomesthe only possibility for the analysis. This is why the sensitivity for the generaldegenerate case is developed separately at the end of the chapter.

Let us explain in abstract terms the precise link to Theorems 2.1.1 to 2.1.3,which will be used in the sequel. The basic approach to the analysis of the ODE μ =F (μ) in a Banach space B with an initial state μ0 = Y for a Lipschitz-continuous

r.h.s. f was based on successive approximations μn+1(t) = x +∫ t

0F (μn(s)) ds,

which are obtained as solutions to the approximating recursive system of equa-tions μn+1 = F (μn). If F is unbounded, then such approximations may becomeinappropriate, because the growth rate of F may increase by the iterations. How-ever, if one can distinguish a linear or affine unbounded part of F such thatF (μ) = A[μ]μ + C[μ], where A[ν] is a linear operator in B for any ν that canbe handled by the linear theory, then the natural system of recursive approxima-tions to the ODE μ = F (μ) can is μn+1 = A[μn]μn+1 +C[μn]. The correspondingfixed-point equation can be interpreted as an infinite-dimensional multiplicative-integral equation. In fact, we already used such approximations, with A[μn] beinga constant, in order to handle the positivity in Theorem 3.1.1. In this chapter,we use this idea quite generally, exploiting it both for strong and weak equations.The proof of Theorem 6.1.1 below is the first application of this idea.

6.1 Hamilton–Jacobi–Bellman (HJB) and

Ginzburg–Landau equations

Let us start with the equation

∂f

∂t(x) = Af(x) +H

(x,

∂f

∂x(x), f(x)

), (6.1)

Page 376: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

6.1. Hamilton–Jacobi–Bellman (HJB) and Ginzburg–Landau equations 363

where A is the generator of a strongly continuous semigroup etA in C∞(Rd) suchthat

‖etAf‖C1(Rd) ≤ κt−ω‖f‖C(Rd), (6.2)

uniformly for t from a compact interval [0, T ], ω ∈ (0, 1), and H is a Lipschitz-continuous function in three variables, referred to as the Hamiltonian of equation(6.1). The smoothing assumption (6.2) is central for the following discussion. Inthe previous chapter, we showed that many natural evolutions (parabolic PDEsand ΨDEs) satisfy this property, see, e.g., Theorem 5.8.3.

A basic example for an equation of the type (6.1) comes from stochastic con-trol theory, where A is the generator of a sufficiently regular Feller process (say,diffusion or a stable or stable-like process) and the Hamiltonian function H hasthe form (2.123). In this case, equation (6.1) represents the standard Hamilton–Jacobi–Bellman (HJB) equation of stochastic control (of Markov processes gener-ated by A).

A substantial simplification arises when H in (6.1) does not depend on thederivative ∂f

∂x . An example for this case is the Ginzburg–Landau equation

∂f

∂t(x) = Δf(x)− ψ′(f(x)) (6.3)

with some given function ψ. The Ginzburg–Landau equation and the Cahn–Hilliard equation (considered below, see (6.28)) are central for materials science,see, e.g., [99] for their derivation and potential extensions.

Remark 100. If A in (6.1) is not smoothing, then the theory is quite different.This can already be seen at the case of vanishing A, which was touched upon inSection 2.6.

Motivated by Theorem 4.6.3, we shall consider the term with H as a per-turbation (though a nonlinear one) and will start with the corresponding mildsolutions to (6.1), i.e., with solutions of the mild form to the following equation:

ft = etAY +

∫ t

0

e(t−s)AH

(.,∂fs∂x

(.), fs(.)

)ds. (6.4)

Theorem 6.1.1. Let A be an operator in C∞(Rd) that generates a strongly con-tinuous semigroup etA in C∞(Rd) such that etA is also a strongly continuoussemigroup in C1∞(Rd) and

‖etA‖C∞(Rd)→C∞(Rd) ≤ TC , ‖etA‖C1∞(Rd)→C1∞(Rd) ≤ TD, (6.5)

with constants TC , TD and t ∈ [0, T ]. Let etA take C(Rd) to C1∞(Rd) and let

(6.2) hold with κ > 0, ω ∈ (0, 1), and let H(x, p, q) be a continuous function onRd ×Rd ×R such that h = supx |H(x, 0, 0)| < ∞ and

|H(x, p1, q1)−H(x, p2, q2)| ≤ LH |p1 − p2|+ LH |q1 − q2| (6.6)

Page 377: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

364 Chapter 6. The Method of Propagators for Nonlinear Equations

with a constant LH . Then for any Y ∈ C1∞(Rd) there exists a unique solutionf. ∈ C([0, T ], C1

∞(Rd)) to equation (6.4). Moreover, for all t ≤ T ,

‖ft(Y )− Y ‖C1(Rd) ≤ E1−ω(κLHΓ(1− ω)t1−ω) (6.7)

×(t1−ω

κ

1− ω(h+ LH‖Y ‖C1(Rd)) + ‖(etA − 1)Y ‖C1(Rd)

),

and the solutions ft(Y1) and ft(Y2) with different initial data Y1, Y2 satisfy theestimate

‖ft(Y1)− ft(Y2)‖C1(Rd) ≤ TD‖Y1 − Y2‖C1(Rd)E1−ω(κLHΓ(1− ω)t1−ω), (6.8)

where E denotes the Mittag-Leffler function (9.13).

Remark 101. The estimate (6.7) expresses the rate of convergence ft(Y ) → Y ,as t → 0, in terms of the rate of convergence etA(Y ) → Y . For instance, if Ybelongs to the domain of the generator of etA in C1(Rd), then the difference‖etA(Y )− Y ‖C1(Rd) will be of order t.

Proof. Using the notations given prior to Theorem 2.1.1, let us define the mappingΦY : C([0, t], C1∞(Rd)) → CY ([0, t], C

1∞(Rd)) by

[ΦY (f.)](t) = etAY +

∫ t

0

e(t−s)AH

(.,∂fs∂x

(.), fs(.)

)ds. (6.9)

Let us first show that this mapping is well defined. If f ∈ C([0, t], C1∞(Rd)), then[ΦY (f.)](t) ∈ C1

∞(Rd) for any t, with

‖[ΦY (f.)](t)‖C1(Rd) ≤ TD‖Y ‖C1(Rd) + κ

∫ t

0

(t− s)−ω(h+ LH‖fs‖C1(Rd)) ds.

We need to show the continuous dependence of [ΦY (f.)](t) on t. The first term in(6.9) depends continuously on t in the topology C1∞(Rd), because of the assumedstrong continuity of etA in C1

∞(Rd). For the difference of the values of the secondterm of (6.9) at different times t1 > t2, we find∫ t1

0

e(t1−s)AH(.,∂fs∂x

(.), fs(.)) ds −∫ t2

0

e(t2−s)AH(.,∂fs∂x

(.), fs(.)) ds

=

∫ t1

t2

e(t1−s)AH(.,∂fs∂x

(.), fs(.)) ds

+ (e(t1−t2)A − 1)

∫ t2

0

e(t2−s)AH(.,∂fs∂x

(.), fs(.)) ds.

The first term on the r.h.s. tends to zero in the C1(Rd)-norm, as t1 → t2, because of(6.2), and the second because of the strong continuity of etA in C1∞(Rd). Therefore,ΦY is well defined.

Page 378: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

6.1. Hamilton–Jacobi–Bellman (HJB) and Ginzburg–Landau equations 365

The rest is the application of Theorem 2.1.3, because

‖[ΦY1(f.)](t)− [ΦY2(f.)](t)‖C1∞(Rd) ≤ TD‖Y1 − Y2‖C1(Rd),

and

‖[ΦY (f1. )](t)− [ΦY (f

2. )](t)‖C1(Rd)

≤ κ

∫ t

0

(t− s)−ω‖H(.,∂f1

s

∂x(.), f1

s (.)) −H(.,∂f2

s

∂x(.), f2

s (.))‖C∞(Rd) ds,

≤ κLH

∫ t

0

(t− s)−ω‖f1s − f2

s ‖C1∞(Rd) ds,

which yields the estimate (2.5) of Theorem 2.1.3, and finally

[ΦY (Y )](t)− Y = (etA − 1)Y +

∫ t

0

e(t−s)AH(.,∂Y

∂x, Y )ds,

‖[ΦY (Y )](t)− Y ‖ ≤ ‖(etA − 1)Y ‖C1(Rd) +t1−ω

κ

1− ω(h+ LH‖Y ‖C1(Rd)). �

The next result states the continuous and Lipschitz dependence of the solu-tions to the HJB equation on a parameter entering the expression for the Hamil-tonian.

Theorem 6.1.2. Let Hα(x, p, q) be a family of Hamiltonians depending on a param-eter α taken from an auxiliary Banach space Bpar. Suppose that each Hα satisfiesall assumptions of Theorem 6.1.1 with all bounds uniform in α, and moreover

|Hα(x, p, q)−Hβ(x, p, q)| ≤ ‖α− β‖LparH (1 + |p|+ |q|), (6.10)

with a constant LparH . Then the solutions ft(Y, α) and ft(Y, β) of (6.4) (built in

Theorem 6.1.1) with different parameter values satisfy the estimate

sups∈[0,t]

‖fs(Y, α)− fs(Y β)‖C1(Rd) ≤ LparH K‖α− β‖(1 + ‖Y ‖C1(Rd)), (6.11)

where the constant K depends continuously on t, ω,κ, h, LH and TD.

Proof. This is again a consequence of Theorem 2.1.3, because

$$\|[\Phi_{Y,\alpha}(f_.)](t) - [\Phi_{Y,\beta}(f_.)](t)\|_{C^1(\mathbf{R}^d)} \le \Big\|\int_0^t e^{(t-s)A}\Big(H_\alpha\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) - H_\beta\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\Big)ds\Big\|_{C^1(\mathbf{R}^d)}$$
$$\le \frac{\kappa}{1-\omega}\,t^{1-\omega}\sup_{s\in(0,t]}\Big\|H_\alpha\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) - H_\beta\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\Big\|_{C(\mathbf{R}^d)}$$
$$\le \frac{\kappa}{1-\omega}\,t^{1-\omega}L_H^{par}\|\alpha-\beta\|\Big(1 + \sup_{s\in(0,t]}\|f_s\|_{C^1(\mathbf{R}^d)}\Big). \qquad\square$$


Let us now analyse the conditions under which the regularity of the solutions can be improved and one can therefore pass from the mild equation to the initial one.

Theorem 6.1.3.

(i) Under the assumptions of Theorem 6.1.1, assume additionally that $e^{tA}$ acts strongly continuously in $C^2_\infty(\mathbf{R}^d)$, so that

$$\|e^{tA}\|_{C^2_\infty(\mathbf{R}^d)\to C^2_\infty(\mathbf{R}^d)} \le \tilde T_D, \tag{6.12}$$

with some constant $\tilde T_D$, and that $H$ is Lipschitz-continuous in the first argument:

$$|H(x_1,p,q) - H(x_2,p,q)| \le L_H|x_1-x_2|\,|p|. \tag{6.13}$$

(The linear dependence of the Lipschitz constant on $|p|$ is a standard feature in all natural examples.) Then for any $f_0 \in C^2_\infty(\mathbf{R}^d)$, the unique solution $f_. \in C([0,T], C^1_\infty(\mathbf{R}^d))$ to equation (6.4) is such that $f_t \in C^2_\infty(\mathbf{R}^d)$ for any $t$, with the norm in $C^2_\infty(\mathbf{R}^d)$ being uniformly bounded for $t \in [0,T]$. Moreover, $f_t$ satisfies (6.1) for $t > 0$.

(ii) Let $H_\alpha(x,p,q)$ be a family of Hamiltonians and $A_\alpha$ a family of operators depending on a parameter $\alpha$ taken from an auxiliary Banach space $B_{par}$ in such a way that, for each $\alpha$, $H_\alpha$ and $A_\alpha$ satisfy the conditions of (i) with all bounds uniform in $\alpha$. Moreover, let (6.10) hold and

$$\|A_\alpha - A_\beta\|_{C^2(\mathbf{R}^d)\to C(\mathbf{R}^d)} \le L_A^{par}\|\alpha-\beta\|, \tag{6.14}$$

with a constant $L_A^{par}$. Then the estimate

$$\|f_t(Y,\alpha) - f_t(Y,\beta)\|_{C^1(\mathbf{R}^d)} \le (L_H^{par}+L_A^{par})\,K\,\|\alpha-\beta\|\,\big(1+\|Y\|_{C^1(\mathbf{R}^d)}\big) \tag{6.15}$$

holds for the solutions $f_t(Y,\alpha)$ and $f_t(Y,\beta)$ of (6.4) with different parameter values, where the constant $K$ depends continuously on $t, \omega, \kappa, h, L_H, T_D$ and $\tilde T_D$.

Remark 102. We do not state in (i) that $f_. \in C([0,T], C^2_\infty(\mathbf{R}^d))$.

Remark 103. In the proof, we shall use the additional regularization estimates (5.174) and (5.182) obtained in Theorem 5.15.1. One can avoid referring to this theorem by additionally assuming the smoothing properties (5.174) and (5.182). According to Theorem 5.8.3, these properties hold for the evolutions generated by parabolic PDEs and ΨDEs.

Proof. (i) The mapping (6.9) is now a mapping

$$\Phi_Y: C([0,t], C^2_\infty(\mathbf{R}^d)) \to C_Y([0,t], C^2_\infty(\mathbf{R}^d)),$$

and all functions $\Phi^n_Y(Y)(t)$ are uniformly bounded in $C_Y([0,t], C^2_\infty(\mathbf{R}^d))$. In fact, we know by Theorem 6.1.1 that for any $Y \in C^2_\infty(\mathbf{R}^d)$ all approximations are uniformly bounded in $C^1_\infty(\mathbf{R}^d)$, so the Lipschitz constant $L_H|p|$ in (6.13) is uniformly bounded for

$$p = \frac{\partial}{\partial x}\Phi^n_Y(Y)(t)$$

and all $n$. Therefore, the boundedness of all $[\Phi_Y(f_.)](t)$ in $C^2_\infty(\mathbf{R}^d)$ follows from (6.12) and (5.182). The continuity of the curve $[\Phi_Y(f_.)](t)$ in $C^2_\infty(\mathbf{R}^d)$ follows as in the proof of Theorem 6.1.1, but now the strong continuity of $e^{tA}$ in $C^2_\infty(\mathbf{R}^d)$ and (5.182) are used.

Therefore, all approximations of $f_t$ have uniformly bounded norms in $C^2_\infty(\mathbf{R}^d)$, and hence the limit $f_t$ has a bounded norm in $C^1_{bLip}(\mathbf{R}^d)$. But since $f_t$ satisfies (6.4), it follows that $f_t \in C^2_\infty(\mathbf{R}^d)$ for any $t > 0$. Since the norms in $C^1_{bLip}(\mathbf{R}^d)$ and $C^2(\mathbf{R}^d)$ coincide, it follows that the norms of $f_t$ in $C^2(\mathbf{R}^d)$ are uniformly bounded, as claimed.

Finally, in order to show that (6.1) holds for $t > 0$, we can apply Proposition 4.10.2. Note that it is not directly applicable, since we did not show that $g_t = H(x, \frac{\partial f_t}{\partial x}, f_t)$ belongs to $C^2_\infty(\mathbf{R}^d)$ for each $t$. Still, its conclusion holds, since $e^{(t-s)A}g_s \in C^2_\infty(\mathbf{R}^d)$ for any $t-s > 0$, and therefore the mild solution $f_t$ is differentiable at least for $t > 0$.

(ii) Again we use Theorem 2.1.3. To this end, we have to estimate

$$\|(e^{tA_\alpha} - e^{tA_\beta})Y\|_{C^1(\mathbf{R}^d)} + \Big\|\int_0^t \Big(e^{(t-s)A_\alpha}H_\alpha\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big) - e^{(t-s)A_\beta}H_\beta\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\Big)ds\Big\|_{C^1(\mathbf{R}^d)}.$$

Given the calculations in the proof of Theorem 6.1.2, we only need to estimate the differences

$$D_1 = \|(e^{tA_\alpha} - e^{tA_\beta})Y\|_{C^1(\mathbf{R}^d)}$$

and

$$D_2 = \Big\|\int_0^t \big(e^{(t-s)A_\alpha} - e^{(t-s)A_\beta}\big)H_\alpha\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)ds\Big\|_{C^1(\mathbf{R}^d)}.$$

The difference between two semigroups can be estimated with the help of formula (4.8):

$$e^{tA_\alpha} - e^{tA_\beta} = \int_0^t e^{(t-s)A_\beta}(A_\alpha - A_\beta)e^{sA_\alpha}\,ds. \tag{6.16}$$

Consequently, by (6.2) and (5.174), we find

$$D_1 \le \int_0^t \kappa\kappa(t-s)^{-\omega}\|\alpha-\beta\|\,L_A^{par}\,s^{-\omega}\,ds\,\|Y\|_{C^1(\mathbf{R}^d)} = B(1-\omega, 1-\omega)\,\kappa\kappa\,\|\alpha-\beta\|\,L_A^{par}\,t^{1-2\omega}\|Y\|_{C^1(\mathbf{R}^d)},$$

as required in order for Theorem 2.1.3 to be applicable. (Note that $B$ is the Beta-function defined in (9.7).) The difference $D_2$ is estimated in the same way. $\square$


Let us now formulate the time-dependent version of the above result. (Note that we omit the proof, since it is almost identical to the above proof.) In the time-dependent setting, the Cauchy problem for HJB equations usually arises in the inverse time. Therefore, we shall analyse the equation

$$\dot f_t(x) = -A_tf_t(x) - H_t\Big(x, \frac{\partial f_t}{\partial x}(x), f_t(x)\Big), \quad 0 \le t \le r \le T, \tag{6.17}$$

where $A_t$ is a family of operators generating a strongly continuous backward propagator $U^{t,r}$ in $C_\infty(\mathbf{R}^d)$ such that

$$\|U^{t,r}f\|_{C^1(\mathbf{R}^d)} \le \kappa(r-t)^{-\omega}\|f\|_{C(\mathbf{R}^d)} \tag{6.18}$$

uniformly for $t, r$ from a compact interval $[0,T]$, and $H_t$ is a family of Lipschitz-continuous functions.

Like in the time-homogeneous case (taking into account the inverse direction of time), the mild solutions to the Cauchy problem of equation (6.17) with the terminal data $Y = f_r$ are defined as the solutions to the mild form of the equation (6.17):

$$f_t = U^{t,r}Y + \int_t^r U^{t,s}H_s\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\,ds, \quad t \le r. \tag{6.19}$$

Theorem 6.1.4.

(i) Let $A_t$ be a family of operators in $C_\infty(\mathbf{R}^d)$ generating a strongly continuous backward propagator $U^{t,r}$ in $C_\infty(\mathbf{R}^d)$ such that $U^{t,r}$ is also a strongly continuous propagator in $C^1_\infty(\mathbf{R}^d)$ with

$$\|U^{t,r}\|_{C_\infty(\mathbf{R}^d)\to C_\infty(\mathbf{R}^d)} \le T_C, \quad \|U^{t,r}\|_{C^1_\infty(\mathbf{R}^d)\to C^1_\infty(\mathbf{R}^d)} \le T_D, \tag{6.20}$$

with constants $T_C, T_D$ and $t, r \in [0,T]$. Let $U^{t,r}$ take $C(\mathbf{R}^d)$ to $C^1_\infty(\mathbf{R}^d)$, let (6.18) hold with $\kappa > 0$, $\omega \in (0,1)$, and let $H_t(x,p,q)$ be a continuous function on $[0,T]\times\mathbf{R}^d\times\mathbf{R}^d\times\mathbf{R}$ such that $h = \sup_{t,x}|H_t(x,0,0)| < \infty$ and

$$|H_t(x,p_1,q_1) - H_t(x,p_2,q_2)| \le L_H|p_1-p_2| + L_H|q_1-q_2| \tag{6.21}$$

with a constant $L_H$. Then for any $Y = f_r \in C^1_\infty(\mathbf{R}^d)$ there exists a unique solution $f_t \in C^1_\infty(\mathbf{R}^d)$ to equation (6.19). Moreover, for all $t \le r$,

$$\|f_t(Y) - Y\|_{C^1(\mathbf{R}^d)} \le E_{1-\omega}\big(\kappa\Gamma(1-\omega)(r-t)^{1-\omega}\big)\left[h\kappa\frac{(r-t)^{1-\omega}}{1-\omega} + \|Y\|_{C^1(\mathbf{R}^d)}\Big(1 + T_D + \kappa L_H\frac{(r-t)^{1-\omega}}{1-\omega}\Big)\right], \tag{6.22}$$

and the solutions $f_t(Y_1)$ and $f_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate

$$\|f_t(Y_1) - f_t(Y_2)\|_{C^1(\mathbf{R}^d)} \le T_D\|Y_1-Y_2\|\,E_{1-\omega}\big(\kappa\Gamma(1-\omega)(r-t)^{1-\omega}\big), \tag{6.23}$$

where $E$ denotes the Mittag-Leffler function (9.13).


(ii) Let $H_{\alpha,t}(x,p,q)$ be a family of Hamiltonians depending on a parameter $\alpha$ taken from an auxiliary Banach space $B_{par}$, with each $H_\alpha$ satisfying the assumptions of (i) with all bounds uniform in $\alpha$. Moreover, let

$$|H_{\alpha,t}(x,p,q) - H_{\beta,t}(x,p,q)| \le \|\alpha-\beta\|\,L_H^{par}\,(1+|p|+|q|), \tag{6.24}$$

with a constant $L_H^{par}$. Then the solutions $f_t(Y,\alpha)$ and $f_t(Y,\beta)$ to (6.19) with different parameter values satisfy the estimate (6.11), where the constant $K$ depends continuously on $r-t, \omega, \kappa, h, L_H$ and $T_D$.

Exercise 6.1.1. Formulate and prove a non-homogeneous analogue of Theorem 6.1.3.

Finally, let us analyse the sensitivity (i.e., the smooth dependence on the initial data) of the nonlinear equations that we dealt with above. This analysis demonstrates once again the power of the abstract results of Chapter 2. The equations (6.4) and (6.19) are of the type (2.129) and can therefore be handled by Theorem 2.15.1. Let us discuss the sensitivity of the HJB equation in the simplest framework of Theorem 6.1.1; the other cases considered above can be dealt with analogously.

Theorem 6.1.5. Under the assumptions of Theorem 6.1.1, let us additionally assume that the derivatives $\frac{\partial H(x,p,q)}{\partial p}$ and $\frac{\partial H(x,p,q)}{\partial q}$ exist and are continuous functions, uniformly for $x \in \mathbf{R}^d$ and $p, q$ from any bounded set. Then the mapping $Y \mapsto f_t \in C^1_\infty(\mathbf{R}^d)$ yielding the solution to equation (6.4) constructed in Theorem 6.1.1 belongs to $C^1_{luc}(C^1_\infty(\mathbf{R}^d), C^1_\infty(\mathbf{R}^d))$, and the derivative $\xi_t = Df_t(Y)[\xi]$ is the unique solution to the equation

$$\xi_t = e^{tA}\xi + \int_0^t e^{(t-s)A}\Big(\frac{\partial H}{\partial p}\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\frac{\partial\xi_s}{\partial x} + \frac{\partial H}{\partial q}\Big(\cdot,\frac{\partial f_s}{\partial x}(\cdot), f_s(\cdot)\Big)\xi_s\Big)\,ds. \tag{6.25}$$

Moreover, this solution is bounded:

$$\|\xi_t\| \le \kappa(T, \|Y\|)\,\|\xi\|\,(1+t). \tag{6.26}$$

Proof. It is a consequence of (6.2) and Theorems 6.1.1, 2.15.1, if one notes that the derivative of the mapping $f \mapsto H(x, \frac{\partial f(x)}{\partial x}, f(x))$, as a mapping $C^1_\infty(\mathbf{R}^d) \to C(\mathbf{R}^d)$, is given by the formula

$$DH\Big(\cdot,\frac{\partial f}{\partial x}(\cdot), f(\cdot)\Big)[\xi](x) = \frac{\partial H}{\partial p}\Big(x,\frac{\partial f}{\partial x}(x), f(x)\Big)\frac{\partial\xi(x)}{\partial x} + \frac{\partial H}{\partial q}\Big(x,\frac{\partial f}{\partial x}(x), f(x)\Big)\xi(x). \qquad\square$$


6.2 Higher-order PDEs and ΨDEs, and Cahn–Hilliard-type equations

Similar to the operators of at most second order, as developed in Theorem 5.15.1, one can prove a strengthened smoothing for operators of at most order $\alpha$ under the corresponding regularity assumptions. This leads to extensions of the theory that was developed above for the equations (6.1). For the sake of simplicity, we shall analyse these equations only for the simplest $A$ of order $\alpha$, namely for $A = \sigma(x)|\Delta|^{\alpha/2}$, whose semigroup was constructed in Proposition 4.4.1 for a constant $\sigma$ and in the corollary to Proposition 5.9.1 for variable ones. The corresponding extension of the equations (6.1) is the class of equations of the type

$$\dot f_t(x) = -\sigma(x)|\Delta|^{\alpha/2}f_t + H\Big(x, \Big\{\frac{\partial^m f_t}{\partial x_{i_1}\cdots\partial x_{i_m}}\Big\}(x), f_t(x)\Big), \tag{6.27}$$

where $\sigma$ is a positive constant (more generally, a complex constant with a positive real part), and $H(x, \{p_{i_1,\dots,i_m}\}, q)$ is a function of the variables $x \in \mathbf{R}^d$ and $p_{i_1,\dots,i_m}, q \in \mathbf{R}$, where the $p_{i_1,\dots,i_m}$ are parametrized by sequences of $m$ numbers from $\{1, \dots, d\}$ with $m \in \{1, \dots, k\}$, where $k < \alpha$. Therefore, $H$ in (6.27) is a function of $f$ and all its derivatives of order up to $k$. Equation (6.27) is referred to as quasi-linear if $H$ is linear with respect to all partial derivatives of $f$.

An important example for physics is the so-called Cahn–Hilliard equation:

$$\dot f_t = -\sigma\Delta^2 f_t + \Delta\big(\gamma_2 f_t^3 + \gamma_1 f_t^2 - f_t\big), \tag{6.28}$$

with constants $\gamma_{1,2}$. It governs the thermodynamical process of the separation of mixtures (spinodal decomposition).

Remark 104. The growth of the coefficients in (6.28) does not fit the assumptions of our general treatment of (6.27) below. However, the particular structure makes it possible to get some a-priori estimates as an appropriate counterbalance for this growth, see, e.g., [73] and [74].
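As an illustration of how equations of the type (6.28) are usually handled in practice, the following Python sketch runs a minimal semi-implicit Fourier scheme for the one-dimensional Cahn–Hilliard equation: the stiff fourth-order term is treated implicitly, the nonlinearity explicitly. All numerical parameter values are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Semi-implicit spectral scheme for df/dt = -sigma f_xxxx + (g2 f^3 + g1 f^2 - f)_xx
N, L = 256, 2 * np.pi
x = np.linspace(0, L, N, endpoint=False)
k = np.fft.fftfreq(N, d=L / N) * 2 * np.pi
sigma, g1, g2 = 1e-2, 0.0, 1.0
dt, steps = 1e-3, 2000

rng = np.random.default_rng(0)
f = 0.1 * rng.standard_normal(N)          # small random initial mixture

denom = 1.0 + dt * sigma * k**4           # implicit factor for -sigma f_xxxx
for _ in range(steps):
    nonlin = g2 * f**3 + g1 * f**2 - f
    f_hat = (np.fft.fft(f) + dt * (-(k**2)) * np.fft.fft(nonlin)) / denom
    f = np.fft.ifft(f_hat).real
# f now exhibits the coarsening pattern typical of spinodal decomposition
```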

Similar to (6.4), one can define the mild solutions to (6.27) as the solutions to the mild form of this equation:

$$f_t = \exp\{-t\sigma|\Delta|^{\alpha/2}\}Y + \int_0^t \exp\{-(t-s)\sigma|\Delta|^{\alpha/2}\}\,H\Big(\cdot,\Big\{\frac{\partial^m f_s}{\partial x_{i_1}\cdots\partial x_{i_m}}\Big\}(\cdot), f_s(\cdot)\Big)\,ds. \tag{6.29}$$

The reason for the assumption $k < \alpha$ in (6.27) is the integrability of the singularity $t^{-k/2\alpha}$ in (4.58), which is ensured by this assumption. Keeping this in mind, the following result is a straightforward extension of Theorem 6.1.1.


Theorem 6.2.1. Let $H$ be a continuous function such that $h = \sup_x|H(x,0,0)| < \infty$ and

$$|H(x, \{p^1_{i_1,\dots,i_m}\}, q^1) - H(x, \{p^2_{i_1,\dots,i_m}\}, q^2)| \le L_H\sum_{i_1,\dots,i_m}|p^1_{i_1,\dots,i_m} - p^2_{i_1,\dots,i_m}| + L_H|q^1 - q^2| \tag{6.30}$$

with a constant $L_H$. Then for any $Y \in C^k_\infty(\mathbf{R}^d)$ there exists a unique solution $f_. \in C([0,T], C^k_\infty(\mathbf{R}^d))$ to equation (6.29). Moreover, the solutions $f_t(Y_1)$ and $f_t(Y_2)$ with different initial data $Y_1, Y_2$ satisfy the estimate

$$\|f_t(Y_1) - f_t(Y_2)\|_{C^k(\mathbf{R}^d)} \le c_1\|Y_1 - Y_2\|_{C^k(\mathbf{R}^d)}\,E_{1-k/2\alpha}\big(c_2 t^{1-k/2\alpha}\big), \tag{6.31}$$

with some constants $c_1, c_2$ that only depend on $\alpha$ and $\sigma$.

All the other results that have earlier been developed for equation (6.1) extend more or less automatically to the equations (6.27) and their related extension as well.

6.3 Nonlinear evolutions and multiplicative-integral equations

In the last two sections, we laid the foundation for studying the strong form of equations with a linear major term and a non-linearity that depends pointwise on the values of an unknown function and its derivatives, working with their mild representations. Now we set up an alternative scheme of analysis that arises from working with the weak form of the equations, and therefore with a strong emphasis on duality. The weak form arises naturally in many applications, e.g., in cases when the symbols bear some integral dependence on the unknown function. Sometimes the weak form is the only way to write down an equation rigorously. But it can also be used as an alternative approach to the equations discussed above.

A handy general framework for dealing with weak equations is given by the setting of the dual Banach pair $(B_{obs}, B_{st})$ (i.e., each of these spaces is a closed subspace of the dual of the other space that separates the points of the latter). Unlike the more popular pair $(B, B^*)$, the setting $(B_{obs}, B_{st})$ makes the results explicitly symmetric with respect to changing the order of spaces in the pair. Therefore, we shall analyse ODEs in the Banach space $B_{st}$ of the form

$$\frac{d}{dt}(f, \mu_t) = (A_t(\mu_t)f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.32}$$

which should hold for all $f$ from a dense subspace $D$ of $B_{obs}$, and where $A_t[\mu]$ is a family of linear operators in $B_{obs}$ for any $\mu$, with a domain containing $D$.

It will be convenient to assume that $D$ itself is a Banach space if equipped with some other norm $\|\cdot\|_D \ge \|\cdot\|_{B_{obs}}$, which allows for working with its dual Banach space $D^* \supset B^*_{obs} \supset B_{st}$ (and thus $\|\cdot\|_{D^*} \le \|\cdot\|_{B^*_{obs}} \le \|\cdot\|_{B_{st}}$). Recall that $C([\tau,T], D^*)$ denotes the Banach space of continuous functions $[\tau,T] \to D^*$. For $M \subset B_{st}$, let us denote by $C([\tau,T], M(D^*))$ the subset of $C([\tau,T], D^*)$ of functions that take values in $M$, and by $C_Y([\tau,T], M(D^*))$ the subset of $C([\tau,T], D^*)$ of functions $\mu_t$ with the given initial value $\mu_\tau = Y$. If $M$ is closed in $D^*$, then $C([\tau,T], M(D^*))$ is closed in $C([\tau,T], D^*)$ and a complete metric space with the distance induced by the norm of $C([\tau,T], D^*)$.

Theorem 6.3.1. Let $M$ be a convex subset of $B_{st}$ that is closed in the norm topologies of both $B_{st}$ and $D^*$. Let $\xi \mapsto A_t(\xi)$ be a mapping from $M \times [0,T]$ to the bounded linear operators $A_t[\xi]: D \to B_{obs}$, which is continuous in $t$ and Lipschitz-continuous as a mapping $D^* \to \mathcal{L}(D, B_{obs})$:

$$\|A_t(\xi) - A_t(\eta)\|_{D\to B_{obs}} \le L_A\|\xi - \eta\|_{D^*}, \quad \xi, \eta \in M, \tag{6.33}$$

for a constant $L_A$.

Assume that for any $Y \in M$ and $\xi_. \in C_Y([\tau,T], M(D^*))$, $\tau \in [0,T)$, the operator curve $A_t(\xi_t): D \to B_{obs}$ generates a strongly continuous backward propagator of uniformly bounded linear operators $U^{r,s}[\xi_.]$, $\tau \le r \le s \le T$, in $B_{obs}$ on the common invariant domain $D$ such that

$$\|U^{r,s}[\xi_.]\|_{D\to D} \le U_D, \quad \|U^{r,s}[\xi_.]\|_{B_{obs}\to B_{obs}} \le U_B, \tag{6.34}$$

for some constants $U_D, U_B$, and that the dual propagators $V^{s,r}[\xi_.] = (U^{r,s})^*[\xi_.]$ preserve the set $M$.

Then the weak nonlinear Cauchy problem (6.32) is well posed in $M$. More precisely, for any $Y \in M$ it has a unique solution $\mu_t(Y) \in M$, and the transformation $Y \mapsto \mu_t(Y)$ of $M$ depends Lipschitz-continuously on the time $t$ and the initial data in the norm of $D^*$, i.e.,

$$\|\mu_t(Y_1) - \mu_t(Y_2)\|_{D^*} \le \exp\{tU_DU_BL_A\}\,U_B\,\|Y_1 - Y_2\|_{D^*}, \tag{6.35}$$
$$\|\mu_t(Y) - Y\|_{D^*} \le t\,U_B\,\|Y\|_{B_{st}}\,\|A(Y)\|_{D\to B}\,\exp\{t\|Y\|_{B_{st}}U_DU_BL_A\}. \tag{6.36}$$

Remark 105.

(i) The constant $L_A$ may depend on $\xi$ and $\eta$, but it should be uniformly bounded for $\xi, \eta$ from any bounded subset of $M$.

(ii) The continuity (6.33) is a stronger requirement than the Lipschitz continuity in $B_{st}$.

(iii) The notation $C([\tau,T], M(D^*))$ (rather than just $C([\tau,T], M)$) emphasizes that $M$ is considered in the topology of $D^*$.

(iv) According to Proposition 4.10.1, the dual propagators $V^{s,r}[\xi_.]$ are necessarily Lipschitz-continuous functions of $s, r$ in the norm topology of $D^*$. Therefore, Theorem 6.3.1 is still valid if the propagators $U^{r,s}[\xi_.]$ can be constructed not for any $\xi \in C_Y([\tau,T], M(D^*))$, but only for those that are Lipschitz-continuous in the norm topology of $D^*$. As an application of this remark, see Theorem 6.9.1.


Proof. As mentioned before, we are planning to approximate the solutions to (6.32) by a recursive system solving the equations

$$\frac{d}{dt}(f, \xi_{n+1}(t)) = (A_t[\xi_n(t)]f, \xi_{n+1}(t)), \quad \xi_n(\tau) = Y, \quad f \in D, \tag{6.37}$$

expecting that the $\xi_n$ converge to a fixed point of the mapping $\xi_t \mapsto \Phi_Y(\xi_.)(t) = V^{t,\tau}[\xi_.]Y$ considered as a mapping of the metric space $C_Y([\tau,T], M(D^*))$ to itself. In order to estimate the difference of two propagators $U^{\tau,t}[\xi^1_.]$ and $U^{\tau,t}[\xi^2_.]$ via Proposition 4.9.2, we need the continuity of the mapping $A_t(\xi)f$ as a mapping $M(D^*) \to B_{obs}$ for any $f \in D$, which follows by (6.33). Since

$$(f, (V^{t,\tau}[\xi^1_.] - V^{t,\tau}[\xi^2_.])Y) = (U^{\tau,t}[\xi^1_.]f - U^{\tau,t}[\xi^2_.]f, Y),$$

it follows that

$$\|[\Phi_Y(\xi^1_.)](t) - [\Phi_Y(\xi^2_.)](t)\|_{D^*} = \|V^{t,\tau}[\xi^1_.]Y - V^{t,\tau}[\xi^2_.]Y\|_{D^*} \le \|U^{\tau,t}[\xi^1_.] - U^{\tau,t}[\xi^2_.]\|_{D\to B}\,\|Y\|_{B_{st}} \le U_DU_BL_A\|Y\|_{B_{st}}\int_\tau^t\|\xi^1_s - \xi^2_s\|_{D^*}\,ds. \tag{6.38}$$

Moreover,

$$\|[\Phi_{Y_1}(\xi_.)](t) - [\Phi_{Y_2}(\xi_.)](t)\|_{D^*} = \|V^{t,0}[\xi_.](Y_1 - Y_2)\|_{D^*} \le U_B\|Y_1 - Y_2\|_{B_{st}}. \tag{6.39}$$

Finally, using (4.133), we obtain

$$\|\Phi_Y(Y) - Y\|_{C([\tau,t],D^*)} = \sup_{s\in[\tau,t]}\|V^{s,\tau}[Y]Y - Y\|_{D^*} = \sup_{s\in[\tau,t]}\sup_{\|f\|_D\le 1}(U^{\tau,s}[Y]f - f, Y) \le (t-\tau)\,U_B\sup_{s\in[\tau,t]}\|A_s(Y)\|_{D\to B}\,\|Y\|_{B_{st}}.$$

Therefore, everything follows from Theorem 2.1.1. $\square$

Remark 106. For measure-valued equations, the basic examples of the set M arethe set of probability measures or the set of positive measures.

In the above fixed-point equation $\xi_. = \Phi_Y(\xi_.)$, the r.h.s. is expressed in terms of the propagator, which in turn represents a T-product or a multiplicative integral. Therefore, this equation is the multiplicative-integral analogue of the usual integral equations.
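To make the notion of a T-product (multiplicative integral) concrete, the following Python sketch approximates the propagator generated by a time-dependent matrix family by an ordered product of short-time exponentials; the matrix family and the number of factors are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm

def A(t):
    """Illustrative time-dependent generator (a 2x2 matrix family)."""
    return np.array([[0.0, 1.0], [-1.0 - 0.5 * np.sin(t), -0.1]])

def propagator(s, t, n=1000):
    """Ordered product of short-time exponentials approximating the
    multiplicative integral of A over [s, t]; later factors act on the left."""
    dt = (t - s) / n
    U = np.eye(2)
    for j in range(n):
        U = expm(dt * A(s + (j + 0.5) * dt)) @ U
    return U

U = propagator(0.0, 2.0)   # approximation of the propagator from time 0 to 2
```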

Let us now prove the stability (or continuous dependence) of the nonlinear semigroups of transformations $\mu_t$ with respect to small perturbations of the generators $A$.


Theorem 6.3.2. Assume that we have a family $\xi \mapsto A^\alpha_t(\xi)$ of mappings from $M \times [0,T]$ to the bounded linear operators $A^\alpha_t[\xi]: D \to B_{obs}$, satisfying the conditions of Theorem 6.3.1 with all constants being uniform in $\alpha$ for $\alpha$ from some auxiliary Banach space $B_1$. Suppose that

$$\|A^\alpha_t(\xi) - A^\beta_t(\xi)\|_{D\to B} \le \kappa\|\alpha - \beta\|, \quad \xi \in M, \tag{6.40}$$

with a constant $\kappa$. Then

$$\|\mu^\alpha_t(Y) - \mu^\beta_t(Y)\|_{D^*} \le (t-\tau)\,\|\alpha-\beta\|\,\kappa\,\exp\{(t-\tau)U_DU_BL_A\|Y\|_{B_{st}}\}\,U_DU_B\,\|Y\|_{B_{st}}, \tag{6.41}$$

where $\mu^\alpha_t(Y)$ denotes the corresponding solution to the equation with $A^\alpha_t$.

Proof. By duality and Proposition 4.9.3, we find

$$\|[\Phi^\alpha_Y(\xi_.)](t) - [\Phi^\beta_Y(\xi_.)](t)\|_{D^*} = \|V^{t,\tau}_\alpha[\xi_.]Y - V^{t,\tau}_\beta[\xi_.]Y\|_{D^*} \le \|U^{\tau,t}_\alpha[\xi_.] - U^{\tau,t}_\beta[\xi_.]\|_{D\to B_{obs}}\,\|Y\|_{B_{st}} \le (t-\tau)\,U_DU_B\,\kappa\,\|\alpha-\beta\|\,\|Y\|_{B_{st}}, \tag{6.42}$$

which again implies (6.41) by Theorem 2.1.1. $\square$

6.4 Causal equations and general path-dependent equations

In this section, we shall deal with a path-dependent version of equation (6.32), i.e., with the equation

$$\frac{d}{dt}(f, \mu_t) = (A[t, \{\mu_s\}_{0\le s\le T}]f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.43}$$

where $(t, \{\eta_s\}_{0\le s\le T}) \mapsto A[t, \{\eta_s\}_{0\le s\le T}]$ maps $\mathbf{R}_+ \times C_\mu([0,T], M(D^*))$ to the bounded linear operators $D \to B_{obs}$. We refer to equation (6.43) as the general path-dependent kinetic equation. It should hold for all test functions $f \in D$. If the operators $A$ only depend on the history of the trajectory $\{\mu_.\} \in C_\mu([0,T], M(D^*))$, that is,

$$\frac{d}{dt}(f, \mu_t) = (A[t, \{\mu_{\le t}\}]f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.44}$$

where $\{\mu_{\le t}\}$ is a short form for $\{\mu_s\}_{0\le s\le t}$, then the equations (6.44) are causal, see (2.157). It is also natural to call them adaptive kinetic equations, in analogy to adaptive control and adaptive stochastic differential equations.

If, on the other hand, the generators $A$ only depend on the future of the trajectory $\{\mu_.\}$, that is,

$$\frac{d}{dt}(f, \mu_t) = (A[t, \{\mu_{\ge t}\}]f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.45}$$

where $\{\mu_{\ge t}\}$ is a short form for $\{\mu_s\}_{t\le s\le T}$, then we call (6.45) an anticipating kinetic equation.

Remark 107. The main reason for talking about (seemingly quite exotic) anticipating equations lies in their inevitable appearance in the study of forward-backward systems, see Section 6.10.

Let us start with the causal equations, where no additional difficulties arise compared to equation (6.32).

Theorem 6.4.1. As in Theorem 6.3.1, let $M$ be a bounded convex subset of $B_{st}$ that is closed in the norm topologies of both $B_{st}$ and $D^*$. Assume that for any $t \in [0,T]$ and any curve $\{\xi_.\} \in C_\mu([0,T], M(D^*))$, a linear operator $A[t, \{\xi_{\le t}\}]: D \to B_{obs}$ is defined such that all $A[t, \{\xi_{\le t}\}]: D \to B_{obs}$ are uniformly bounded and Lipschitz-continuous in $\{\xi_.\}$, i.e., for any $\{\xi_.\}, \{\eta_.\} \in C_Y([0,T], M(D^*))$, we have

$$\sup_{s\in[0,t]}\|A[s, \{\xi_{\le s}\}] - A[s, \{\eta_{\le s}\}]\|_{D\to B_{obs}} \le L_A\|\xi_. - \eta_.\|_{C([0,t],D^*)}, \tag{6.46}$$

with a positive constant $L_A$.

Moreover, assume that for any $\{\xi_.\} \in C_Y([0,T], M(D^*))$, the operator curve $A[t, \{\xi_{\le t}\}]: D \to B_{obs}$ generates a strongly continuous backward propagator of bounded linear operators $U^{r,s}[\{\xi_{\le t}\}]$ in $B_{obs}$, $0 \le r \le s \le t$, on the common invariant domain $D$, so that

$$\|U^{r,s}[\{\xi_.\}]\|_{D\to D} \le U_D \quad\text{and}\quad \|U^{r,s}[\{\xi_.\}]\|_{B_{obs}\to B_{obs}} \le U_B, \quad r \le s, \tag{6.47}$$

for some constants $U_D, U_B$, and that their dual propagators $V^{s,r} = (U^{r,s})^*[\{\xi_{\le t}\}]$ preserve the set $M$.

Then the Cauchy problem (6.44) is well posed, that is, for any $Y \in M$, it has a unique solution $\mu_t(Y) \in M$ that depends Lipschitz-continuously on the time $t$ and the initial data in the norm of $D^*$. In other words, the same estimates (6.35), (6.36) hold, but with $\sup_{s\in[0,t]}\|A(s,Y)\|_{D\to B_{obs}}$ instead of $\|A(Y)\|_{D\to B_{obs}}$ in (6.36).

Proof. The proof of Theorem 6.3.1 was presented in such a way that it can be readily extended to the present, more general situation by working with the mapping $\Phi_Y(\xi_{\le t})(t) = V^{t,0}[\xi_{\le t}]Y$. $\square$

Let us now turn to general path-dependent equations.

Theorem 6.4.2. Let $M$ be a convex subset of $B_{st}$ with $\sup_{\mu\in M}\|\mu\|_{B_{st}} \le K$, which is closed in the norm topologies of both $B_{st}$ and $D^*$. Suppose that

(i) the linear operators $A[t, \{\xi_.\}]: D \to B_{obs}$ are uniformly bounded and Lipschitz-continuous in $\{\xi_.\}$, i.e., for any $\{\xi_.\}, \{\eta_.\} \in C_Y([0,T], M(D^*))$, we have

$$\sup_{t\in[0,T]}\|A[t, \{\xi_.\}] - A[t, \{\eta_.\}]\|_{D\to B_{obs}} \le L_A\sup_{t\in[0,T]}\|\xi_t - \eta_t\|_{D^*}, \tag{6.48}$$

with a positive constant $L_A$;

(ii) for any $\{\xi_.\} \in C_Y([0,T], M(D^*))$, the operator curve $A[t, \{\xi_.\}]: D \to B_{obs}$ generates a strongly continuous backward propagator of bounded linear operators $U^{t,s}[\{\xi_.\}]$ in $B$, $0 \le t \le s$, on the common invariant domain $D$, so that

$$\|U^{t,s}[\{\xi_.\}]\|_{D\to D} \le U_D \quad\text{and}\quad \|U^{t,s}[\{\xi_.\}]\|_{B\to B} \le U_B, \quad t \le s, \tag{6.49}$$

for some positive constants $U_D, U_B$. Moreover, suppose that their dual propagators $V^{s,t}[\{\xi_.\}]$ preserve the set $M$.

Then, if

$$L_AU_BU_DKT < 1, \tag{6.50}$$

the Cauchy problem (6.43) is well posed, i.e., for any $Y \in M$ it has a unique solution $\mu_t(Y) \in M$ (that is, (6.43) holds for all $f \in D$) that depends Lipschitz-continuously on the initial data in the norm of $D^*$:

$$\|\mu_.(Y_1) - \mu_.(Y_2)\|_{C([0,T],M(D^*))} \le \frac{U_B\|Y_1 - Y_2\|_{D^*}}{1 - L_AU_BU_DKT}. \tag{6.51}$$

Proof. The peculiarity of the general path dependence arises in the equation (6.42). Unlike for the causal case, it is not valid here. Instead, one gets

$$\|[\Phi_{Y_1}(\xi^1_.)](t) - [\Phi_{Y_2}(\xi^2_.)](t)\|_{D^*} \le U_DU_BL_A\|Y_1\|_{B^*}\,T\,\|\xi^1_. - \xi^2_.\|_{C([0,T],D^*)} + U_B\|Y_1 - Y_2\|_{B^*}. \tag{6.52}$$

Therefore, the mapping $\Phi_Y$ is a contraction if (6.50) holds. The existence of the fixed point now follows from the Banach contraction principle, and (6.51) follows from the stability of fixed points, see Proposition 9.1.3. $\square$

Theorem 6.4.3. Under the assumptions of Theorem 6.4.2, assume additionally that for any $t$ from a dense subset of $[0,T]$, the set

$$\{V^{t,0}[\{\xi_.\}]Y : \{\xi_.\} \in C_Y([0,T], M(D^*))\} \tag{6.53}$$

is relatively compact in $M(D^*)$. Then a solution to the Cauchy problem (6.43) exists in $M$ globally, i.e., without the restriction (6.50).

Proof. Since $M$ is convex, the space $C_Y([0,T], M(D^*))$ is also convex. Since the dual operators $V^{t,0}[\{\xi_.\}]$ preserve the set $M$, the mapping $\Phi_Y$ acts from $C_Y([0,T], M(D^*))$ to itself. Moreover, by (6.52), this mapping is Lipschitz-continuous.

Let us denote by $C$ the image of $C_Y([0,T], M(D^*))$ in $C_Y([0,T], M(D^*))$ under $\Phi_Y$. In particular, $\Phi_Y$ takes $C$ to itself. Together with (6.52), the assumption that the set (6.53) is compact in $M$ for any $t$ from a dense subset of $[0,T]$ implies that the set $C$ is relatively compact in $C_Y([0,T], M(D^*))$ (by the Arzelà–Ascoli theorem). Finally, the Schauder fixed-point theorem implies that there exists a fixed point in $C \subset C_Y([0,T], M(D^*))$, which ensures the existence of a solution to (6.43). $\square$


In all applications, we have to keep in mind that $B = B_{obs} = C_\infty(\mathbf{R}^d)$, $D = C^k_\infty(\mathbf{R}^d)$ with some $k \in \mathbf{N}$, $B_{st} = B^* = \mathcal{M}(\mathbf{R}^d)$ and $M = \mathcal{P}(\mathbf{R}^d)$.

According to Proposition 1.1.1, a handy class of compact subsets of $\mathcal{P}(\mathbf{R}^d)$ is given by the sets

$$\mathcal{P}^1_{\le\lambda}(\mathbf{R}^d) = \Big\{\mu \in \mathcal{P}(\mathbf{R}^d) : \int|x|\,\mu(dx) \le \lambda\Big\}.$$

Therefore, we get the following more concrete version of Theorem 6.4.3:

Theorem 6.4.4. Let $B = B_{obs} = C_\infty(\mathbf{R}^d)$, $D = C^k_\infty(\mathbf{R}^d)$ with some $k \in \mathbf{N}$, and $M = \mathcal{P}(\mathbf{R}^d)$. Under the assumptions in Theorem 6.4.2, assume additionally that all backward propagators $U^{t,s}[\{\xi_.\}]$ act as uniformly bounded operators in the weighted space $C_L(\mathbf{R}^d)$ with $L(x) = 1+|x|$. Then a solution to the Cauchy problem (6.43) exists in $\mathcal{P}^1(\mathbf{R}^d) = \cup_{\lambda>0}\mathcal{P}^1_{\le\lambda}(\mathbf{R}^d)$ for any $Y \in \mathcal{P}^1(\mathbf{R}^d)$.

6.5 Simplest nonlinear diffusions: weak treatment

Let us consider some basic examples of nonlinear evolutions (with both pointwiseand integral nonlinearities) and compare their strong and weak formulations.

First, let $B = B_{obs} = C_\infty(\mathbf{R})$, $B_{st} = B^* = \mathcal{M}(\mathbf{R})$, $M = \mathcal{M}^+_{\le\lambda}(\mathbf{R})$ and $D = C^1_\infty(\mathbf{R})$ equipped with the usual norm of $C^1_\infty(\mathbf{R})$. Let

$$A(\xi)f = a(\xi, x)\frac{\partial f}{\partial x} \tag{6.54}$$

with some function $a: B^*\times\mathbf{R} \to \mathbf{R}$. Then, we have

$$\|A(\xi) - A(\eta)\|_{D\to B} \le \|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbf{R})},$$

so that condition (6.33) requires

$$\|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbf{R})} \le L_A\|\xi - \eta\|_{D^*}. \tag{6.55}$$

Assuming that

$$a(\xi, x) = \int_{\mathbf{R}^k}g(x, y_1, \dots, y_k)\,\xi(dy_1)\cdots\xi(dy_k) \tag{6.56}$$

is the integral operator with the kernel $g$, we find

$$\|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbf{R})} \le \|g\|_{C(\mathbf{R}^{k+1})}\,k\,b^{k-1}\,\|\xi - \eta\|_{B^*},$$

where $b = \max(\|\xi\|_{B^*}, \|\eta\|_{B^*})$, and

$$\|a(\xi, \cdot) - a(\eta, \cdot)\|_{C(\mathbf{R})} \le \sup_{x,y}\,b^{k-1}\Big(k|g(x, y)| + \sum_j\Big|\frac{\partial g}{\partial y_j}(x, y)\Big|\Big)\|\xi - \eta\|_{D^*}, \tag{6.57}$$

so that the continuity (6.33) requires smoothness of the integral kernel $g$. As a direct consequence of Theorem 6.3.1, we have the following result:


Proposition 6.5.1. Let $A(\xi)$ be given by (6.54) with $a(\xi, \cdot) \in C^1(\mathbf{R})$ uniformly in $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbf{R})$, and let (6.55) hold there. For instance, the function $a$ can be given by (6.56) with $g \in C^1(\mathbf{R}^{k+1})$. Then the Cauchy problem

$$\frac{d}{dt}(f, \mu_t) = (a(\mu_t, x)f'(x), \mu_t), \quad \mu_0 = Y \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}), \quad f \in D = C^1(\mathbf{R})\cap C_\infty(\mathbf{R}), \tag{6.58}$$

with $t \in [0,T]$, satisfies the conditions of Theorem 6.3.1 with

$$M = \mathcal{M}^+_{\le\lambda}(\mathbf{R}), \quad U_B = 1, \quad U_D = \exp\big\{T\sup\{\|a(\mu, \cdot)\|_{C^1(\mathbf{R})} : \mu \in \mathcal{M}^+_{\le\lambda}(\mathbf{R})\}\big\}.$$

Therefore, this Cauchy problem is well posed in $M = \mathcal{M}^+(\mathbf{R})$.

Exercise 6.5.1. Extend Proposition 6.5.1 to $B = C_\infty(\mathbf{R}^d)$, and also for causal and path-dependent nonlinearities.
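A natural way to approximate the measure-valued solution of (6.58) numerically is an interacting-particle scheme: replace $\mu_t$ by the empirical measure of $N$ particles and move each particle along the characteristic $\dot x = a(\mu_t, x)$. The Python sketch below does this for a drift of the integral form (6.56) with $k=1$; the kernel $g$ and all numerical parameters are illustrative assumptions only.

```python
import numpy as np

def g(x, y):
    """Smooth bounded interaction kernel (an assumption for this sketch)."""
    return np.tanh(y - x)

N, dt, steps = 500, 0.01, 200
rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, N)            # particles, each carrying mass 1/N

for _ in range(steps):
    # a(mu, x_i) = (1/N) * sum_j g(x_i, x_j) for the empirical measure mu
    drift = g(X[:, None], X[None, :]).mean(axis=1)
    X = X + dt * drift                 # explicit Euler step along characteristics

# An observable (f, mu_t) is then approximated by np.mean(f(X)).
```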

Next, let us consider the (possibly complex) diffusion operator

$$A(\xi)f = \frac{1}{2}\sigma\Delta f + (b_t(\xi, x), \nabla)f(x) + V_t(\xi, x)f(x) \tag{6.59}$$

in $\mathbf{R}^d$, where $\sigma$ is a complex constant with a positive real part $\varepsilon$, and with some functions $b_t(\xi, x)$, $V_t(\xi, x)$ (possibly complex-valued). In this case, the appropriate spaces are $B = C_\infty(\mathbf{R}^d)$ and $D = C^2_\infty(\mathbf{R}^d)$, with $B^* = \mathcal{M}(\mathbf{R}^d)$ and either $M = \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d)$ or $M = \mathcal{M}_{\le\lambda}(\mathbf{R}^d)$ or, in the case of complex-valued functions, $M = \mathcal{M}^{\mathbf{C}}_{\le\lambda}(\mathbf{R}^d)$. The condition (6.33) requires that

$$\|b_t(\xi, \cdot) - b_t(\eta, \cdot)\|_{C(\mathbf{R}^d)} \le L_A\|\xi - \eta\|_{D^*}, \qquad \|V_t(\xi, \cdot) - V_t(\eta, \cdot)\|_{C(\mathbf{R}^d)} \le L_A\|\xi - \eta\|_{D^*}. \tag{6.60}$$

If $b_t, V_t$ are integrals of the type (6.56), then this condition can be more explicitly expressed in terms of the second derivatives of the corresponding integral kernels, as shown above for the function $a$.

The control over the growth of the solutions can be performed either via positivity or the PMP (see Corollary 6), or by just requiring uniform boundedness. As a direct consequence of Theorems 6.3.1 and 4.14.2 (and Corollary 6), we get the following result:

Proposition 6.5.2.

(i) Let $A_t(\xi)$ be given by (6.59) with a positive real $\sigma$ and with real continuous functions $b_t(\xi, x)$, $V_t(\xi, x)$ such that $V_t \le 0$ everywhere and $b_t(\xi, \cdot), V_t(\xi, \cdot) \in C^1(\mathbf{R}^d)$ uniformly in time and $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d)$ for any $\lambda$, and let (6.60) hold, again uniformly for $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d)$. Then the Cauchy problem

$$\frac{d}{dt}(f, \mu_t) = \Big(\frac{1}{2}\sigma\Delta f + (b_t(\mu_t, x), \nabla)f(x) + V_t(\mu_t, x)f(x),\ \mu_t\Big), \quad \mu_0 = Y \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d), \tag{6.61}$$

(written in the weak form for $f \in D = C^2_\infty(\mathbf{R}^d)$) satisfies the conditions of Theorem 6.3.1 with $M = \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d)$ for any $\lambda$. Therefore, this Cauchy problem is well posed in $\mathcal{M}^+(\mathbf{R}^d)$.

(ii) Let $b_t, V_t$ be complex-valued and $\sigma$ be complex with a positive real part. Let $b_t(\xi, \cdot), V_t(\xi, \cdot) \in C^1(\mathbf{R}^d)$ and let (6.60) hold uniformly in time and for all $\xi \in \mathcal{M}^{\mathbf{C}}(\mathbf{R}^d)$. Then the Cauchy problem (6.61) is well posed in $\mathcal{M}^{\mathbf{C}}(\mathbf{R}^d)$.

6.6 Simplest nonlinear diffusions: strong treatment

The weak equations (6.32), (6.59) can be equivalently written in the strong form:

$$\partial_t\mu_t = \frac{1}{2}\sigma\Delta\mu_t - (\nabla, b_t(\mu_t, x)\mu_t) + V_t(\mu_t, x)\mu_t, \quad \mu_0 = Y \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d). \tag{6.62}$$

Since the major term $\Delta$ is non-degenerate, one can expect better results from the strong treatment, compared to the weak approach developed above.

When comparing (6.62) with (6.1), it is important to note that we are now looking for measure-valued solutions, or at least solutions in $L_1(\mathbf{R}^d)$, as opposed to the well-posedness in $C^1_\infty(\mathbf{R}^d)$ that we proved for (6.1). Even more importantly, we do not assume the dependence of the coefficients on $\mu$ to be of a pointwise local form, as we did in (6.1). Moreover, for the sake of definiteness, we want to avoid this type of dependence here, since its treatment may be different from the treatment of the integral dependence, which we are going to mostly deal with from now on. In order to understand this point, observe that if $b_t(\mu_t, x)$ is given by a function of the type $h(\phi_t(x))$, where $\phi_t$ is the density of $\mu_t$, then $\frac{\partial b}{\partial x} = h'(\phi_t(x))\phi_t'(x)$, but if $b_t(\mu_t, x) = \int h(x)\,\mu_t(dx)$, then $\frac{\partial b}{\partial x} = 0$. To be specific, we shall usually assume that the dependence of the coefficients on $\mu$ and $x$ is given by some function of a finite number of integral monomials of the type (6.56), i.e.,

$$\int_{\mathbf{R}^k}g(x, y_1, \dots, y_k)\,\mu(dy_1)\cdots\mu(dy_k).$$

In particular, for $k = 0$ we have just a dependence on $x$, and if $g$ does not depend on $x$, then the monomial does not depend on $x$. This kind of dependence is convenient, because it allows for a more or less explicit representation of all variational derivatives. Also, it covers virtually all equations that arise from applications. As an important special case, let us mention the dependence of the coefficients on the convolutions $g \star \mu_t$. This case occurs, e.g., in the Landau–Fokker–Planck model (7.54) of grazing collisions.

Equation (6.62) can also be written in a mild form. Since the semigroup generated by $\sigma\Delta$ takes measures to $L_1(\mathbf{R}^d)$, the solutions to the mild equation should belong to $L_1(\mathbf{R}^d)$ for $t > 0$ (even if $Y$ does not), or in other words, they should be measures with densities. In terms of the densities $\phi_t$, the mild form of (6.62) reads

$$\phi_t(x) = \int G_{t\sigma}(x-y)\,Y(dy) + \int_0^t ds\int G_{(t-s)\sigma}(x-y)\,\big[V_s(\phi_s, y)\phi_s(y) - (\nabla, b_s(\phi_s, y)\phi_s(y))\big]\,dy, \tag{6.63}$$

where $V_t(\phi, y)$ and $b_t(\phi, y)$ denote the values of the functionals $V_t$ and $b_t$ on a measure with the density $\phi$ (with some abuse of notations). Integrating by parts yields

$$\phi_t(x) = \int G_{t\sigma}(x-y)\,Y(dy) + \int_0^t ds\int G_{(t-s)\sigma}(x-y)\,V_s(\phi_s, y)\phi_s(y)\,dy + \int_0^t ds\int\Big(\frac{\partial}{\partial y}G_{(t-s)\sigma}(x-y),\ b_s(\phi_s, y)\phi_s(y)\Big)\,dy. \tag{6.64}$$

This is the most convenient form, since it makes sense even if no smoothness is assumed on $b$. Therefore, one can expect that the well-posedness for the mild equation (6.64) can be proved under weaker assumptions than for (6.61). This is actually true, as we will show in the following theorem, which represents the main well-posedness result for equations of the type (6.61).

In order to work with the equations (6.63) and (6.64), it is often instructive to write down the weak form of this mild equation, which can also be considered the mild form of the weak equation (6.61):

$$(f, \phi_t) = (T_tf, Y) + \int_0^t ds\,\big([V_s(\phi_s, \cdot) + (b_s(\phi_s, \cdot), \nabla)]\,T_{t-s}f,\ \phi_s\big), \tag{6.65}$$

where $T_t$ is the heat semigroup generated by $\sigma\Delta/2$.

Theorem 6.6.1. Let $\sigma > 0$ and let $b_t(\xi, \cdot) = \{b^j_t(\xi, \cdot)\}$, $V_t(\xi, \cdot)$ be measurable bounded real or complex functions, such that

$$V = \sup_{y,\xi,t}|V_t(\xi, y)| < \infty, \qquad b = \sup_{y,\xi,t}\sum_j|b^j_t(\xi, y)| < \infty, \tag{6.66}$$

and let

$$\sup_{x,t}|b_t(\xi, x) - b_t(\eta, x)| \le L_A\|\xi - \eta\|_{\mathcal{M}(\mathbf{R}^d)}, \qquad \sup_{x,t}|V_t(\xi, x) - V_t(\eta, x)| \le L_A\|\xi - \eta\|_{\mathcal{M}(\mathbf{R}^d)}. \tag{6.67}$$

Then:

(i) For any $T > 0$ and any $Y$ having a density $\phi$, the mild equation (6.64) has a unique bounded solution $\phi_. \in C([0,T], L_1(\mathbf{R}^d))$. This solution has the bound

$$\|\phi_t\|_{L_1(\mathbf{R}^d)} \le \|Y\|_{\mathcal{M}(\mathbf{R}^d)}\,E_{1/2}\big(\Gamma(1/2)(V\sqrt t + b)\,\tilde\sigma\sqrt t\big), \tag{6.68}$$

and for any two solutions $\phi^1_t$ and $\phi^2_t$ with the initial conditions $Y^1$ and $Y^2$, respectively, the estimate

$$\|\phi^1_t - \phi^2_t\|_{L_1(\mathbf{R}^d)} \le \|Y^1 - Y^2\|_{L_1(\mathbf{R}^d)}\,E_{1/2}\big(\kappa(T)\Gamma(1/2)\sqrt t\big) \tag{6.69}$$

holds, where $E_{1/2}$ is the Mittag-Leffler function, $\tilde\sigma = \max(1, \sigma^{-1/2})$ and $\kappa(T)$ is given by (6.72) below.

(ii) For any $T > 0$ and $Y \in \mathcal{M}(\mathbf{R}^d)$, the mild equation (6.64) has a unique solution $\phi_. \in C((0,T], L_1(\mathbf{R}^d))$ such that $\phi_t \to Y$ weakly, as $t \to 0$. In this case, the estimate (6.69) can be rewritten as

$$\|\phi^1_t - \phi^2_t\|_{L_1(\mathbf{R}^d)} \le \|Y^1 - Y^2\|_{\mathcal{M}(\mathbf{R}^d)}\,E_{1/2}\big(\kappa(T)\Gamma(1/2)\sqrt t\big). \tag{6.70}$$

(iii) Solutions $\phi_t$ to the mild equation (6.64) also solve the weak equation (6.61).

(iv) If additionally $b$ and $V$ are real and $V$ is non-positive, then for any positive $\phi \in L_1(\mathbf{R}^d)$ and any positive $Y \in \mathcal{M}^+(\mathbf{R}^d)$ the solution $\phi_t$ is also non-negative, and we have

$$\|\phi_t\|_{L_1(\mathbf{R}^d)} \le \|Y\|_{\mathcal{M}(\mathbf{R}^d)},$$

and (6.70) holds with

$$\kappa(T) = \Gamma(1/2)\,\tilde\sigma\,\big[(V\sqrt T + b) + L_A(\sqrt T + 1)\big]\,\|Y\|.$$

Proof. (i) In order to be able to apply Theorem 2.1.3, we first need to get a bound for all iterations of the mapping $\Phi_Y(\phi_.)(t)$ defined as the r.h.s. of (6.64). Since $1 \le \sqrt t/\sqrt{t-s}$ and $\sqrt{2/\pi} \le 1$, using (4.23) yields

$$\|[\Phi_Y(\phi_.)](t)\|_{L_1(\mathbf{R}^d)} \le \|Y\|_{L_1(\mathbf{R}^d)} + \Gamma(1/2)(V\sqrt t + b)\,\tilde\sigma\int_0^t(t-s)^{-1/2}\|\phi_s\|_{L_1(\mathbf{R}^d)}\,ds.$$

Iterating and using the definition of the Mittag-Leffler function yields

$$\|[\Phi^n_Y(Y)](t)\|_{L_1(\mathbf{R}^d)} \le \|Y\|_{L_1(\mathbf{R}^d)}\,E_{1/2}\big(\Gamma(1/2)(V\sqrt t + b)\,\tilde\sigma\sqrt t\big), \tag{6.71}$$

which implies (6.68). Again using (4.23) yields

$$\|[\Phi_Y(\phi^1_.)](t) - [\Phi_Y(\phi^2_.)](t)\|_{L_1(\mathbf{R}^d)} \le \Gamma(1/2)(V\sqrt T + b)\,\tilde\sigma\int_0^t(t-s)^{-1/2}\|\phi^1_. - \phi^2_.\|_{C([0,s],L_1(\mathbf{R}^d))}\,ds$$
$$+\ \Gamma(1/2)L_A(\sqrt T + 1)\,\tilde\sigma\int_0^t(t-s)^{-1/2}\|\phi^1_. - \phi^2_.\|_{C([0,s],L_1(\mathbf{R}^d))}\,\|\phi^1_.\|_{C([0,T],L_1(\mathbf{R}^d))}\,ds.$$

Therefore,

$$\|[\Phi_Y(\phi^1_.)](t) - [\Phi_Y(\phi^2_.)](t)\|_{L_1(\mathbf{R}^d)} \le \kappa(T)\int_0^t(t-s)^{-1/2}\|\phi^1_. - \phi^2_.\|_{C([0,s],L_1(\mathbf{R}^d))}\,ds$$

with

$$\kappa(T) = \Gamma(1/2)\,\tilde\sigma\,\Big[(V\sqrt T + b) + L_A(\sqrt T + 1)\,E_{1/2}\big(\Gamma(1/2)(V\sqrt T + b)\,\tilde\sigma\sqrt T\big)\Big]. \tag{6.72}$$

Since $\|[\Phi_{Y_1}(\phi_.)](t) - [\Phi_{Y_2}(\phi_.)](t)\| \le \|Y_1 - Y_2\|$, the result now follows from Theorem 2.1.3 with $\omega = 1/2$ and the Banach space $L_1(\mathbf{R}^d)$.

(ii) Theorem 2.1.3 can be applied for finding the unique solution in the Banach space $\mathcal{M}(\mathbf{R}^d)$. However, the curve $\Phi_Y(\mu_.)(t)$ is only weakly continuous in $t$ at $t = 0$, because the same applies to the first term in (6.64). It follows from (6.64) that $\Phi_Y(\mu_.)(t)$ has a density for all $t > 0$.

(iii) If $\phi_t$ solves (6.64), then it solves (6.65) for any $f \in C(\mathbf{R}^d)$. If $f \in C^2_\infty(\mathbf{R}^d)$, then we can differentiate (6.65) and get

$$\frac{d}{dt}(f, \phi_t) = \frac{1}{2}\sigma(\Delta T_tf, \phi) + \frac{1}{2}\int_0^t ds\,\big([V_s(\phi_s, \cdot) + (b_s(\phi_s, \cdot), \nabla)]\,\sigma\Delta T_{t-s}f,\ \phi_s\big) + \big([V_t(\phi_t, \cdot) + (b_t(\phi_t, \cdot), \nabla)]f,\ \phi_t\big)$$
$$= \frac{1}{2}(\sigma\Delta f, \phi_t) + \big([V_t(\phi_t, \cdot) + (b_t(\phi_t, \cdot), \nabla)]f,\ \phi_t\big),$$

as required, where (6.63) was used in the last equation.

(iv) By Proposition 6.5.2(i), solutions to (6.65) preserve the positivity and do not increase the norm. $\square$

6.7 Simplest nonlinear diffusions: regularity and sensitivity

The regularity of the solutions can be enhanced by increasing the regularity of b:

Theorem 6.7.1.

(i) Under the assumptions of Theorem 6.6.1, assume additionally that $b_t(\xi, \cdot) \in C^1(\mathbf{R}^d)$ uniformly for $\xi \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d)$ and that

$$\sup_x\Big|\frac{\partial b_t(\xi, x)}{\partial x} - \frac{\partial b_t(\eta, x)}{\partial x}\Big| \le L_A\|\xi - \eta\|_{\mathcal{M}(\mathbf{R}^d)}. \tag{6.73}$$

Then for any $Y$ with the density $\phi \in H^1_1(\mathbf{R}^d)$, the solution $\phi_t$ to (6.64) belongs to $C([0,T], H^1_1(\mathbf{R}^d))$, and for any two solutions $\phi^1_t$ and $\phi^2_t$ with the initial conditions $\phi^1$ and $\phi^2$, respectively, the estimate

$$\|\phi^1_t - \phi^2_t\|_{H^1_1(\mathbf{R}^d)} \le c(t)\,\|\phi^1 - \phi^2\|_{H^1_1(\mathbf{R}^d)} \tag{6.74}$$

holds, with continuous functions $c(t)$ expressed again in terms of the Mittag-Leffler function, like in (6.70).


(ii) Moreover, the solution is smoothing in the following sense: for any $Y \in \mathcal{M}(\mathbf{R}^d)$, the solution $\phi_t$ to (6.64) belongs to $H^1_1(\mathbf{R}^d)$ for all $t > 0$, and we have

$$\|\phi_t\|_{H^1_1(\mathbf{R}^d)} \le \kappa\, t^{-1/2}\,\|Y\|_{\mathcal{M}(\mathbf{R}^d)} \tag{6.75}$$

with a constant $\kappa$.

Proof. (i) This is similar to the proof of Theorem 6.6.1, although it is more convenient to work with the representation (6.63). Under the present assumptions, the expression in the square brackets in (6.63) belongs to $L_1(\mathbf{R}^d)$ whenever $\phi_s \in H^1_1(\mathbf{R}^d)$. Therefore, the r.h.s. of (6.63) becomes bounded in $H^1_1(\mathbf{R}^d)$ due to the smoothing property of the heat semigroup. Now we can apply Theorem 2.1.3 in the Banach space $B = H^1_1(\mathbf{R}^d)$.

(ii) Suppose first that $\phi \in H^1_1(\mathbf{R}^d)$. Then $\|\phi_t\|_{H^1_1(\mathbf{R}^d)}$ is bounded. By (6.63), we have

$$\|\phi_t\|_{H^1_1(\mathbf{R}^d)} \le \frac{c}{\sqrt t}\,\|\phi\|_{L_1(\mathbf{R}^d)} + \int_0^t\frac{c}{\sqrt{t-s}}\,\|\phi_s\|_{H^1_1(\mathbf{R}^d)}\,ds,$$

with a constant $c$ that depends on $T, \sigma$ and the bounds for $V$ and $b$. Consequently,

$$\sup_{s\le t}\big(\sqrt s\,\|\phi_s\|_{H^1_1(\mathbf{R}^d)}\big) \le c\|\phi\|_{L_1(\mathbf{R}^d)} + \sqrt t\,\sup_{s\le t}\big(\sqrt s\,\|\phi_s\|_{H^1_1(\mathbf{R}^d)}\big)\int_0^t\frac{c}{\sqrt{t-s}\,\sqrt s}\,ds = c\|\phi\|_{L_1(\mathbf{R}^d)} + \sqrt t\,\sup_{s\le t}\big(\sqrt s\,\|\phi_s\|_{H^1_1(\mathbf{R}^d)}\big)\int_0^1\frac{c}{\sqrt{1-u}\,\sqrt u}\,du.$$

Therefore, for sufficiently small $t$, we have

$$\sup_{s\le t}\big(\sqrt s\,\|\phi_s\|_{H^1_1(\mathbf{R}^d)}\big) \le c\kappa\,\|\phi\|_{L_1(\mathbf{R}^d)},$$

with $\kappa = (1 - \sqrt t\,c\,B(1/2, 1/2))^{-1}$, which implies (6.75) for an initial $\phi \in H^1_1(\mathbf{R}^d)$.

Next, let $\phi \in L_1(\mathbf{R}^d)$. Then it can be approximated in $L_1$ by a sequence $\phi^n$ of elements of $H^1_1(\mathbf{R}^d)$ with the same norm in $L_1$. Applying (6.75) to these approximations yields for any $f \in C^1_\infty(\mathbf{R}^d)$ the estimate

$$|(\phi_t, f')| = \lim_{n\to\infty}|(\phi^n_t, f')| \le \kappa\, t^{-1/2}\,\|\phi\|_{L_1(\mathbf{R}^d)}\,\|f\|_{C(\mathbf{R}^d)}.$$

Therefore, $\phi_t$ has a generalized derivative for $t > 0$, which is a signed measure with a norm that is bounded by the r.h.s. of (6.75). But it is seen from (6.64) that the derivative of $\phi_t$ is in fact a function, not just a measure.

Finally, for $Y \in \mathcal{M}(\mathbf{R}^d)$, the solution has a density $\phi_t \in L_1(\mathbf{R}^d)$ for any $t > 0$, and the previous argument can be applied to the solution $\phi_t$ considered as the solution to the Cauchy problem starting at any time $t > 0$. $\square$


As a consequence of Theorems 2.15.1 and 6.6.1, we now prove the sensitivity result for the solutions to (6.64) and (6.61). For this purpose, we construct the variational derivatives of the solutions to nonlinear diffusion with respect to the initial data as Green functions (i.e., solutions with the Dirac initial data) of the Cauchy problem for the corresponding linearized equation (equations in variations).

Theorem 6.7.2. Under the assumptions of Theorem 6.6.1(iv), let the variational derivatives of $V_t(\mu, y)$ and $b_t(\mu, y)$ with respect to $\mu$ be well defined and locally bounded, i.e.,

$$\sup_{y,z,s}\Big|\frac{\delta V_s(\mu, y)}{\delta\mu(z)}\Big| \le R(\lambda), \qquad \sup_{y,z,s}\sum_j\Big|\frac{\delta b^j_s(\mu, y)}{\delta\mu(z)}\Big| \le R(\lambda), \tag{6.76}$$

for $\mu \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d)$ and constants $R(\lambda)$. Moreover, let these variational derivatives be locally Lipschitz-continuous, i.e.,

$$\sup_{y,z,s}\Big|\frac{\delta V_s(\mu^1, y)}{\delta\mu(z)} - \frac{\delta V_s(\mu^2, y)}{\delta\mu(z)}\Big| \le L(\lambda)\,\|\mu^1 - \mu^2\|_{\mathcal{M}(\mathbf{R}^d)}, \tag{6.77}$$

$$\sup_{y,z,s}\sum_j\Big|\frac{\delta b^j_s(\mu^1, y)}{\delta\mu(z)} - \frac{\delta b^j_s(\mu^2, y)}{\delta\mu(z)}\Big| \le L(\lambda)\,\|\mu^1 - \mu^2\|_{\mathcal{M}(\mathbf{R}^d)}, \tag{6.78}$$

for $\mu^1, \mu^2 \in \mathcal{M}_{\le\lambda}(\mathbf{R}^d)$ and constants $L(\lambda)$.

Then the mapping $\phi = \phi_0 \mapsto \phi_t$ with the initial data $\phi \in L_1(\mathbf{R}^d)$ (or, more generally, $Y \mapsto \phi_t$ with the initial data $Y \in \mathcal{M}(\mathbf{R}^d)$), solving (6.64) according to Theorem 6.6.1, belongs to $C^1_{luc}(L_1(\mathbf{R}^d), L_1(\mathbf{R}^d))$ (or, more generally, to $C^1_{luc}(\mathcal{M}(\mathbf{R}^d), L_1(\mathbf{R}^d))$), for all $t > 0$, and $\xi_t(x) = \frac{\delta\phi_t(Y)}{\delta Y(x)}$ is the unique solution to the equation

$$\xi_t(x; z) = G_{t\sigma}(z-x) + \int_0^t ds\int G_{(t-s)\sigma}(z-y)\Big(\int\frac{\delta V_s(\phi_s, y)}{\delta\phi_s(w)}\,\xi_s(x; w)\,\phi_s(y)\,dw + V_s(\phi_s, y)\,\xi_s(x; y)\Big)dy$$
$$+\ \int_0^t ds\int\Big(\frac{\partial}{\partial y}G_{(t-s)\sigma}(z-y),\ \int\frac{\delta b_s(\phi_s, y)}{\delta\phi_s(w)}\,\xi_s(x; w)\,\phi_s(y)\,dw + b_s(\phi_s, y)\,\xi_s(x; y)\Big)dy. \tag{6.79}$$

It satisfies the Dirac initial condition $\xi_0(x; \cdot) = \delta_x$ and has the bound

$$\|\xi_t(x, \cdot)\|_{L_1(\mathbf{R}^d)} \le E_{1/2}\Big[\Gamma(1/2)\sqrt t\,\big((\lambda R(\lambda) + V)\sqrt t + \lambda R(\lambda) + b\big)\Big]. \tag{6.80}$$

Finally, $\xi_t$ also solves the weak equation obtained by formal differentiation of (6.61):

$$\frac{d}{dt}(f, \xi_t(x; \cdot)) = \Big(\frac{1}{2}\sigma\Delta f + (b_t(\phi_t, \cdot), \nabla)f + V_t(\phi_t, \cdot)f,\ \xi_t(x; \cdot)\Big) + \int\!\!\int\frac{\delta V_t(\phi_t, y)}{\delta\phi_t(w)}\,\xi_t(x; w)\,f(y)\,\phi_t(y)\,dy\,dw + \int\!\!\int\Big(\frac{\delta b_t(\phi_t, y)}{\delta\phi_t(w)}\,\xi_t(x; w),\ \nabla f(y)\Big)\phi_t(y)\,dy\,dw. \tag{6.81}$$

Remark 108. Sensitivity still holds under the assumptions of Theorem 6.6.1(i)–(iii), although with more complicated (and more rapidly growing) bounds for $\xi_t$, namely with $\lambda$ not being $\|Y\|$ as in the proof below, but being given by the r.h.s. of (6.68).

Proof. We are in the setting of Theorem 2.15.1, with $\xi_t$ satisfying an equation of the type (2.128) with $\xi = \delta_x$, $G^{t,0}\xi = G_{t\sigma}(z-x)$ and $D\Omega_{t,s}$ given by the expression under the integral in (6.79). Therefore, we can estimate the norms as in the proof of Theorem 6.6.1 above and get

$$\|D\Omega_{t,s}(\phi)\|_{\mathcal{M}(\mathbf{R}^d)\to L_1(\mathbf{R}^d)} \le (\lambda R(\lambda) + V) + (t-s)^{-1/2}\,\tilde\sigma\,(\lambda R(\lambda) + b) \le (t-s)^{-1/2}\,\tilde\sigma\,\big[\lambda R(\lambda) + b + \sqrt t\,(\lambda R(\lambda) + V)\big],$$

with $\lambda = \|Y\|_{\mathcal{M}(\mathbf{R}^d)}$, which yields the first estimate in (2.205). Furthermore,

$$\|D\Omega_{t,s}(\phi^1) - D\Omega_{t,s}(\phi^2)\|_{\mathcal{M}(\mathbf{R}^d)\to L_1(\mathbf{R}^d)} \le (\lambda L(\lambda) + R(\lambda))\,\|\phi^1 - \phi^2\|_{C([0,s],L_1(\mathbf{R}^d))}\big(1 + \tilde\sigma(t-s)^{-1/2}\big) \le \tilde\sigma\,(\lambda L(\lambda) + R(\lambda))\,\|\phi^1 - \phi^2\|_{C([0,s],L_1(\mathbf{R}^d))}\,(1 + \sqrt t)\,(t-s)^{-1/2},$$

which yields the second estimate in (2.205). Therefore, the application of Theorem 2.15.1 completes the proof. $\square$

The weak and mild representations (6.81) and (6.79) of the derivatives with respect to the initial conditions can be used for deriving different kinds of regularity for these derivatives. This will be shown in the next section for a more general model with nontrivial diffusion coefficients.

6.8 McKean–Vlasov equations

Nonlinear diffusion equations represent a general class of diffusion-type equations with coefficients that depend on an unknown function. In other words, they are evolutionary equations of the second order, which are linear with respect to the derivatives:

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}(a_t(u_t, x)\nabla, \nabla)u_t(x) + (b_t(u_t, x), \nabla)u_t(x) + V_t(u_t, x)u_t, \tag{6.82}$$

or equivalently

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}\mathrm{tr}\Big(a_t(u_t, x)\frac{\partial^2 u_t(x)}{\partial x^2}\Big) + \Big(b_t(u_t, x), \frac{\partial u_t(x)}{\partial x}\Big) + V_t(u_t, x)u_t, \tag{6.83}$$

or more explicitly

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}\sum_{i,j}a_{t,ij}(u_t, x)\frac{\partial^2 u_t(x)}{\partial x_i\partial x_j} + \sum_j b^j_t(u_t, x)\frac{\partial u_t(x)}{\partial x_j} + V_t(u_t, x)u_t(x), \tag{6.84}$$

with given functions $a_t(u, x)$, $b_t(u, x)$, $V_t(u, x)$. When the coefficient functions $a, b, V$ are allowed to be complex-valued, we speak of complex nonlinear diffusion equations or (complex) nonlinear Schrödinger equations.

This section is devoted to the sensitivity of nonlinear diffusions with respect to the initial data. It also provides explicit bounds for the norms of the derivatives in various regularity classes.

Nonlinear diffusion equations and nonlinear Schrödinger equations are basic equations that represent the evolution of the large-number-of-particles limit for systems of interacting classical or quantum particles. Their derivation from interacting particle systems will be sketched in Section 7.8.
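To indicate what such a particle approximation looks like in the diffusive case, the following Python sketch runs an Euler–Maruyama scheme for $N$ interacting particles whose drift depends on the empirical measure through an integral monomial of the type (6.56); this is the classical particle approximation behind McKean–Vlasov diffusions. The kernel and all numerical parameters are illustrative assumptions, not taken from the text.

```python
import numpy as np

def g(x, y):
    """Smooth bounded pair-interaction kernel (an assumption for this sketch)."""
    return -np.sin(x - y)

N, dt, steps = 500, 0.01, 500
rng = np.random.default_rng(2)
X = rng.uniform(-np.pi, np.pi, N)      # particle positions approximating mu_0

for _ in range(steps):
    # b(mu^N_t, X_i) = (1/N) sum_j g(X_i, X_j), unit diffusion, V = 0
    drift = g(X[:, None], X[None, :]).mean(axis=1)
    X = X + dt * drift + np.sqrt(dt) * rng.standard_normal(N)

# The empirical law of X approximates mu_t; moments (f, mu_t) ~ np.mean(f(X)).
```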

So far, we have analysed the special case when $a$ is the identity matrix. If the dependence of the functions $b, V$ on $u$ is simply a dependence on the values of $u$ at given points, say $b(u, x) = b(u(x), x)$, then the equations (6.84) can be considered special cases of equation (6.1), with $H$ depending linearly on $p$ and $q$. Therefore, its main properties can be obtained in the framework of the HJB theory, for instance from Theorems 6.1.1, 6.1.4 and 6.1.5. Of course, the linearity of $H$ allows for a weakening of the assumptions on its regularity, e.g., by choosing $b$ and $V$ to be measures of dimensionality $\alpha \in (d-2, d]$, in the spirit of Theorem 4.8.1.

Returning to the general case (6.82), we shall always assume that for any continuous function $u_t(x)$ the second-order operator $(a_t(u_t(\cdot), x)\nabla, \nabla)/2$ generates a propagator $U^{t,s}_{a(u)}$ in $C_\infty(\mathbf{R}^d)$ on the invariant domain $C^2_\infty(\mathbf{R}^d)$. In this case, equation (6.82) with the initial condition $u_0$ can be rewritten in a mild form:

$$u_t = U^{t,0}_{a(u)}u_0 + \int_0^t U^{t,s}_{a(u)}\big((b(u_s, x), \nabla)u_s(x) + V(u_s, x)u_s\big)\,ds. \tag{6.85}$$

Very often, the nonlinear equations arise in their weak form, as equations of the type (6.32):

$$\frac{d}{dt}(f, \mu_t) = (A_t(\mu_t)f, \mu_t), \quad \mu_0 = Y, \quad f \in D, \tag{6.86}$$

where

$$A_t(\mu)f(x) = \frac{1}{2}(a_t(\mu, x)\nabla, \nabla)f(x) + (b_t(\mu, x), \nabla)f(x) + V_t(\mu, x)f(x). \tag{6.87}$$


The strong form of equation (6.86) for measures $\mu$ with densities $\phi$ reads

$$\frac{\partial\phi}{\partial t} = \frac{1}{2}(\nabla, \nabla(a_t(\phi, x)\phi(x))) - (\nabla, b_t(\phi, x)\phi(x)) + V_t(\phi, x)\phi(x), \tag{6.88}$$

where we denote the functions of measures and of their densities by the same letters, with some abuse of notation. More explicitly, equation (6.88) can be rewritten as

$$\frac{\partial\phi}{\partial t} = \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial x_i\partial x_j}\big(a_{t,ij}(\phi, x)\phi\big) - \sum_j\frac{\partial}{\partial x_j}\big(b^j_t(\phi, x)\phi(x)\big) + V_t(\phi, x)\phi(x), \tag{6.89}$$

which can of course be rewritten in the general form (6.84), although with some other functions $a, b$ and $V$.

Nonlinear diffusion equations are often referred to as McKean–Vlasov diffusions, especially when the coefficients $a, b, V$ depend on the unknown function $u$ via moments of the type

$$\int g(x_1, \dots, x_k)\,u(x_1)\cdots u(x_k)\,dx_1\cdots dx_k.$$

As we already mentioned, the analysis of diffusions crucially depends on whether the major second-order term is degenerate or not. We shall first analyse the non-degenerate case. In this case, the strong form and its mild version seem to be most convenient for the analysis. Even if the initial condition is a non-regular object, say a measure, the smoothing property of non-degenerate diffusions makes it a smooth function at any positive time.

If the matrix $a(u, x)$ does not depend on $u$, then equation (6.82) belongs to the class of equations of the type

$$\frac{\partial u_t}{\partial t} = \frac{1}{2}(a_t(x)\nabla, \nabla)u_t(x) + (b_t(u_t, x), \nabla u_t(x)) + V_t(u_t, x)u_t, \tag{6.90}$$

where $a$ is a standard second-order operator. Working with this equation makes the analysis easier, as compared to the general case. In fact, the analysis of this equation with non-degenerate $a$ can be reduced to the case of $a = \Delta$ by Proposition 5.6.2. For the sake of simplicity, we shall mostly concentrate on equations of the type (6.90) and use such a reduction. More precisely, we shall work with the corresponding weak equation (6.86) and the corresponding strong form

$$\frac{\partial\phi}{\partial t} = \frac{1}{2}(\nabla, \nabla(a_t(x)\phi(x))) - (\nabla, b(\phi, x)\phi(x)) + V_t(\phi, x)\phi(x). \tag{6.91}$$

References to papers that treat more general cases are supplied in the last section. Theorem 6.6.1 can be extended to the following result:

Theorem 6.8.1. Let $b_t(\xi, \cdot) = \{b^j_t(\xi, \cdot)\}$, $V_t(\xi, \cdot)$ be measurable bounded real functions with $V$ non-positive, satisfying (6.66) and (6.67). Let $a(x)$ be a matrix-valued function, with elements belonging to $C^2(\mathbf{R}^d)$ (uniformly in $t$), which is uniformly elliptic, so that

$$m(\xi, \xi) \le (a_t(x)\xi, \xi) \le m^{-1}(\xi, \xi), \qquad \sum_{ij}\|a_{t,ij}(x)\|_{C^2(\mathbf{R}^d)} \le M < \infty, \tag{6.92}$$

with some positive constants $m, M$.


Then for any $T > 0$ and any $Y$ having a non-negative density $\phi$ (respectively without density), the mild equation

$$\phi_t(x) = \int G_t(x, y)\,Y(dy) + \int_0^t ds\int G_{t-s}(x, y)\,V_s(\phi_s, y)\phi_s(y)\,dy + \int_0^t ds\int\Big(\frac{\partial}{\partial y}G_{t-s}(x, y),\ b_s(\phi_s, y)\phi_s(y)\Big)\,dy \tag{6.93}$$

has a unique bounded non-negative solution $\phi_. \in C([0,T], L_1(\mathbf{R}^d))$ (respectively a unique solution $\phi_. \in C((0,T], L_1(\mathbf{R}^d))$) such that $\phi_t \to Y$ weakly, as $t \to 0$, so that

$$\|\phi_t\|_{L_1(\mathbf{R}^d)} \le \|Y\|_{\mathcal{M}(\mathbf{R}^d)},$$

and for any two solutions $\phi^1_t$ and $\phi^2_t$ with the initial conditions $Y^1$ and $Y^2$, respectively, the estimate

$$\|\phi^1_t - \phi^2_t\|_{L_1(\mathbf{R}^d)} \le \|Y^1 - Y^2\|_{\mathcal{M}(\mathbf{R}^d)}\,E_{1/2}\big(\kappa(T)\sqrt t\big) \tag{6.94}$$

holds, where

$$\kappa(T) = C\big[(V\sqrt T + b) + L_A(\sqrt T + 1)\big]\,\|Y\|,$$

with a constant $C$ depending on $m, M, T$.

Finally, solutions $\phi_t$ to the mild equation (6.93) also solve the corresponding weak equation that extends (6.58).

Proof. The solution $\phi_t$ is the fixed point of the mapping $\Phi_Y$ that extends the mapping arising from (6.64), i.e.,

$$[\Phi_Y(\phi_.)](t)(x) = \int G_t(x, y)\,Y(dy) + \int_0^t ds\int G_{t-s}(x, y)\,V_s(\phi_s, y)\phi_s(y)\,dy + \int_0^t ds\int\Big(\frac{\partial}{\partial y}G_{t-s}(x, y),\ b_s(\phi_s, y)\phi_s(y)\Big)\,dy. \tag{6.95}$$

By Proposition 5.6.2, there exist constants $\sigma$ and $C$ depending only on $m, M, T$ such that the Green function $G_{t,s}(x, y)$ of the Cauchy problem for the operator $\phi(x) \mapsto \frac{1}{2}(\nabla, \nabla(a_t(x)\phi(x)))$ is differentiable in $x$ and $y$ and satisfies the estimates

$$G_{t,s}(x, y) \le C\,G_{\sigma(t-s)}(x-y), \quad 0 < s, t < T, \tag{6.96}$$

$$\max\Big(\Big|\frac{\partial}{\partial y}G_{t,s}(x, y)\Big|, \Big|\frac{\partial}{\partial x}G_{t,s}(x, y)\Big|\Big) \le C\,(t-s)^{-1/2}\,G_{\sigma(t-s)}(x-y), \quad 0 < s, t < T. \tag{6.97}$$

(Recall that this Green function is just the transpose kernel to the Green function $G^{s,t}(y, x)$ of the backward Cauchy problem for the dual operator $\frac{1}{2}(a_t(x)\nabla, \nabla)$.) Therefore, all estimates used in the proof of Theorem 6.6.1 remain valid, if the additional constant $C$ is included. $\square$


Similarly, Theorems 6.7.1 and 6.7.2 and Proposition 6.8.3 have direct extensions. For instance, the following sensitivity result for McKean–Vlasov diffusions holds.

Theorem 6.8.2. Under the assumptions of Theorem 6.8.1, let the variational derivatives of $V_t(\mu, y)$ and $b_t(\mu, y)$ with respect to $\mu$ be well defined, locally bounded and satisfy (6.76), (6.77) and (6.78).

Then the mapping $\phi = \phi_0 \mapsto \phi_t$, with the initial data $\phi \in L_1(\mathbf{R}^d)$ (or, more generally, $Y \mapsto \phi_t$ with the initial data $Y \in \mathcal{M}(\mathbf{R}^d)$), solving (6.95) according to Theorem 6.8.1, belongs to $C^1_{luc}(L_1(\mathbf{R}^d), L_1(\mathbf{R}^d))$ (or, more generally, to $C^1_{luc}(\mathcal{M}(\mathbf{R}^d), L_1(\mathbf{R}^d))$), for all $t > 0$, and $\xi_t(x) = \frac{\delta\phi_t(Y)}{\delta Y(x)}$ is the unique solution to the equation

$$\xi_t(x; z) = G_t(z, x) + \int_0^t ds\int G_{t-s}(z, y)\Big(\int\frac{\delta V_s(\phi_s, y)}{\delta\phi_s(w)}\,\xi_s(x; w)\,\phi_s(y)\,dw + V_s(\phi_s, y)\,\xi_s(x; y)\Big)dy$$
$$+\ \int_0^t ds\int\Big(\frac{\partial}{\partial y}G_{t-s}(z, y),\ \int\frac{\delta b_s(\phi_s, y)}{\delta\phi_s(w)}\,\xi_s(x; w)\,\phi_s(y)\,dw + b_s(\phi_s, y)\,\xi_s(x; y)\Big)dy. \tag{6.98}$$

It satisfies the Dirac initial condition $\xi_0 = \delta_x$ and has the bound

$$\|\xi_t(x, \cdot)\|_{L_1(\mathbf{R}^d)} \le E_{1/2}\big[C(2t+1)(2\lambda R(\lambda) + V + b)\big], \tag{6.99}$$

with a constant $C$ depending on $m, M$ and $T$. Finally, $\xi_t$ also solves the weak equation obtained by formal differentiation of (6.58):

$$\frac{d}{dt}(f, \xi_t(x; \cdot)) = \Big(\frac{1}{2}(A(\cdot)\nabla, \nabla)f + (b_t(\phi_t, \cdot), \nabla)f + V_t(\phi_t, \cdot)f,\ \xi_t(x; \cdot)\Big) + \int\!\!\int\frac{\delta V_t(\phi_t, y)}{\delta\phi_t(w)}\,\xi_t(x; w)\,f(y)\,\phi_t(y)\,dy\,dw + \int\!\!\int\Big(\frac{\delta b_t(\phi_t, y)}{\delta\phi_t(w)}\,\xi_t(x; w),\ \nabla f(y)\Big)\phi_t(y)\,dy\,dw. \tag{6.100}$$

As already mentioned, the weak and mild representations (6.100) and (6.98), respectively, of the derivatives with respect to the initial conditions can be used to derive different kinds of regularity for these derivatives. In order to illustrate this claim, let us observe that the weak equation (6.81) shows that the evolution $\xi \mapsto \xi_t$ of the directional derivatives of the solutions $\phi_t$ is dual to the backward evolution in $C(\mathbf{R}^d)$ generated by the equation

$$\dot f_t(z) = -\frac{1}{2}(A(z)\nabla, \nabla)f_t(z) - (b_t(\phi_t, z), \nabla)f_t(z) - V_t(\phi_t, z)f_t(z) - \int\frac{\delta V_t(\phi_t, y)}{\delta\phi_t(z)}\,f_t(y)\,\phi_t(y)\,dy - \int\Big(\frac{\delta b_t(\phi_t, y)}{\delta\phi_t(z)},\ \nabla f_t(y)\Big)\phi_t(y)\,dy. \tag{6.101}$$


Theorem 6.8.3. Under the assumptions of Theorem 6.8.2, let additionally

$$b_t(\mu, \cdot),\ V_t(\mu, \cdot),\ \frac{\delta b_t(\mu, y)}{\delta\mu(\cdot)},\ \frac{\delta V_t(\mu, y)}{\delta\mu(\cdot)} \in C^1(\mathbf{R}^d)$$

uniformly for bounded $\mu$, so that for $\mu \in \mathcal{M}^+_{\le\lambda}(\mathbf{R}^d)$ with any $\lambda$, we have

$$\sup_t\big(\|b_t(\mu, \cdot)\|_{C^1(\mathbf{R}^d)} + \|V_t(\mu, \cdot)\|_{C^1(\mathbf{R}^d)}\big) + \sup_{t,y}\Big(\Big\|\frac{\delta b_t(\mu, y)}{\delta\mu(\cdot)}\Big\|_{C^1(\mathbf{R}^d)} + \Big\|\frac{\delta V_t(\mu, y)}{\delta\mu(\cdot)}\Big\|_{C^1(\mathbf{R}^d)}\Big) \le c_1(\lambda) \tag{6.102}$$

with a continuous function $c_1(\lambda)$.

Then:

(i) The equation (6.101) is well posed in $C^2_\infty(\mathbf{R}^d)$ and it generates a backward propagator $\Phi^{t,s}$ acting strongly continuously in the spaces $C_\infty(\mathbf{R}^d)$, $C^1_\infty(\mathbf{R}^d)$ and $C^2_\infty(\mathbf{R}^d)$, so that

$$\|\Phi^{t,s}\|_{\mathcal{L}(C^1_\infty(\mathbf{R}^d))} \le e^{C(m,M)(s-t)}\,E_{1/2}\big[(V + b + 2R(\lambda))\,C(m,M)\sqrt{s-t}\big],$$
$$\|\Phi^{t,s}\|_{\mathcal{L}(C^2_\infty(\mathbf{R}^d))} \le e^{C(m,M)(s-t)}\,E_{1/2}\big[c_1(\lambda)\,C(m,M)\sqrt{s-t}\big], \tag{6.103}$$

with a constant $C(m,M)$ depending only on $m$ and $M$. Consequently, by duality, the weak equation (6.81) generates a (forward) propagator $(\Phi^{t,s})^*$ in $\mathcal{M}(\mathbf{R}^d)$ that extends to bounded propagators in the dual spaces $(C^1_\infty(\mathbf{R}^d))^*$ and $(C^2_\infty(\mathbf{R}^d))^*$. Moreover, the variational derivatives $\xi_t(x; \cdot)$ are twice differentiable in $x$, so that

$$\frac{\partial\xi_t(x; \cdot)}{\partial x} \in (C^1_\infty(\mathbf{R}^d))^*, \qquad \frac{\partial^2\xi_t(x; \cdot)}{\partial x^2} \in (C^2_\infty(\mathbf{R}^d))^*, \tag{6.104}$$

and

$$\Big\|\frac{\partial\xi_t(x; \cdot)}{\partial x}\Big\|_{(C^1_\infty(\mathbf{R}^d))^*} \le e^{C(m,M)t}\,E_{1/2}\big[(V + b + 2R(\lambda))\,C(m,M)\sqrt t\big],$$
$$\Big\|\frac{\partial^2\xi_t(x; \cdot)}{\partial x^2}\Big\|_{(C^2_\infty(\mathbf{R}^d))^*} \le e^{C(m,M)t}\,E_{1/2}\big[c_1(\lambda)\,C(m,M)\sqrt t\big]. \tag{6.105}$$

(ii) The directional derivative $\xi_t[\xi] = D\phi_t(Y)[\xi]$ belongs to $H^1_1(\mathbf{R}^d)$ whenever $\xi$ does, and

$$\|\xi_t\|_{H^1_1(\mathbf{R}^d)} \le C(T)\,\|\xi\|_{H^1_1(\mathbf{R}^d)},$$

where $C(T)$ depends on the bounds of all derivatives of $A$, $b$ and $V$ mentioned in the assumptions of the theorem, and can also be explicitly expressed in terms of the Mittag-Leffler function.


Proof. (i) It follows directly from Theorem 4.13.4 applied to equation (6.101) with $B = C_\infty(\mathbf{R}^d)$, $\tilde B = C^1_\infty(\mathbf{R}^d)$, $D = C^2_\infty(\mathbf{R}^d)$. Notice also that the derivatives (6.104) satisfy the same equation as $\xi_t$ itself, but have the initial conditions $\nabla\delta_x \in (C^1_\infty(\mathbf{R}^d))^*$, $\nabla^2\delta_x \in (C^2_\infty(\mathbf{R}^d))^*$.

(ii) It follows by applying Theorem 2.1.3 to equation (6.98) in the Banach space $H^1_1(\mathbf{R}^d)$. $\square$

Exercise 6.8.1. Apply Theorem 4.13.4 to further confirm that $\Phi^{t,s}$ is smoothing when taking $C^1_\infty(\mathbf{R}^d)$ to $C^2_\infty(\mathbf{R}^d)$ and $C_\infty(\mathbf{R}^d)$ to $C^1_\infty(\mathbf{R}^d)$. Find the corresponding estimates. Show that the dual propagators take $(C^2_\infty(\mathbf{R}^d))^*$ to $(C^1_\infty(\mathbf{R}^d))^*$ and $(C^1_\infty(\mathbf{R}^d))^*$ to $(C_\infty(\mathbf{R}^d))^*$, and thus that $\partial\xi_t(x;\cdot)/\partial x$ and $\partial^2\xi_t(x;\cdot)/\partial x^2$ belong to $H^1_1(\mathbf{R}^d)$ for any $t > 0$. Again obtain the corresponding estimates for the norms.

Finally, let us look at the second variational derivatives of the solutions to McKean–Vlasov equations with respect to the initial data:
\[
\eta_t(x,z;\cdot) = \frac{\delta^2\phi_t(Y)}{\delta Y(x)\,\delta Y(z)}.
\]
Remark 109. Second-order derivatives are crucial for the analysis of fluctuations in a system of interacting particles around its limit given by the nonlinear kinetic equations (see Section 7.8), for instance, the McKean–Vlasov equation. In particular, the estimates (6.110) and (6.113) are of great relevance, see Remark 116.

Differentiating equation (6.100), we obtain for $\eta$ the weak equation
\[
\frac{d}{dt}(f, \eta_t(x,z;\cdot)) = \Bigl(\tfrac12(a_t(\cdot)\nabla,\nabla)f + (b_t(\phi_t,\cdot),\nabla)f + V_t(\phi_t,\cdot)f,\ \eta_t(x,z;\cdot)\Bigr)
\]
\[
+ \int\!\!\int \frac{\delta V_t(\phi_t,y)}{\delta\phi_t(w)}\,\eta_t(x,z;w)\,f(y)\,\phi_t(y)\,dy\,dw
+ \int\!\!\int \Bigl(\frac{\delta b_t(\phi_t,y)}{\delta\phi_t(w)}\,\eta_t(x,z;w),\ \nabla f(y)\Bigr)\phi_t(y)\,dy\,dw + (f, g_t),
\qquad (6.106)
\]
with $(f, g_t)$ given by
\[
\int\!\!\int \Bigl[\frac{\delta V_t(\phi_t,y)}{\delta\phi_t(w)}f(y) + \Bigl(\frac{\delta b_t(\phi_t,y)}{\delta\phi_t(w)},\nabla f(y)\Bigr)\Bigr]
\bigl[\xi_t(x;y)\xi_t(z;w) + \xi_t(x;w)\xi_t(z;y)\bigr]\,dy\,dw
\]
\[
+ \int\!\!\int\!\!\int \Bigl[\frac{\delta^2 V_t(\phi_t,y)}{\delta\phi_t(w)\,\delta\phi_t(u)}f(y) + \Bigl(\frac{\delta^2 b_t(\phi_t,y)}{\delta\phi_t(w)\,\delta\phi_t(u)},\nabla f(y)\Bigr)\Bigr]
\xi_t(x;w)\,\xi_t(z;u)\,\phi_t(y)\,dy\,dw\,du,
\qquad (6.107)
\]
which should be satisfied with the vanishing initial condition $\eta_0(x,z;\cdot) = 0$. This is the same equation as (6.100), but with an additional non-homogeneous term $(f, g_t)$. Therefore, its solution can be expressed in terms of the propagators $(\Phi^{t,s})^*$ from Theorem 6.8.3:
\[
\eta_t(x,z;\cdot) = \int_0^t (\Phi^{0,s})^* g_s\, ds.
\qquad (6.108)
\]
Therefore, in spite of the challenging-looking expression (6.107), the analysis of $\eta$ is more or less straightforward.

However, the structure of (6.107) conveys an important message, namely that for this analysis one needs the exotic spaces $C^{2,k\times k}_{\rm weak}(M^+_{\le\lambda}(\mathbf{R}^d))$, which are subspaces of $C^{1,1}_{\rm weak}(M^+_{\le\lambda}(\mathbf{R}^d))$ consisting of functionals $F(\mu)$ such that $\frac{\delta^2 F(Y)}{\delta Y(x)\,\delta Y(z)}$ exists for all $x, z$ and belongs to $C^{k\times k}(\mathbf{R}^{2d})$ (see the definition of this space in Section 1.1) uniformly for $Y \in M^+_{\le\lambda}(\mathbf{R}^d)$. Their relevance for the approximation of systems of interacting particles will be further elaborated in Section 7.8. From Theorem 6.8.3 and formula (6.108), we can now derive the following consequence on the second-order sensitivity for the McKean–Vlasov diffusion:

Theorem 6.8.4. (i) Under the assumptions of Theorem 6.8.3, assume the existence of continuous bounded second-order variational derivatives:
\[
\sup_{y,w,u,t}\Bigl|\frac{\delta^2 V_t(\mu,y)}{\delta\mu(w)\,\delta\mu(u)}\Bigr| \le R_2(\lambda), \qquad
\sup_{y,w,u,t}\sum_j\Bigl|\frac{\delta^2 b^j_t(\mu,y)}{\delta\mu(w)\,\delta\mu(u)}\Bigr| \le R_2(\lambda).
\qquad (6.109)
\]
Then $\eta_t(x,z;\cdot)$ is well defined for any $t$ as an element of $(C^1(\mathbf{R}^d))^*$ and has the following bound:
\[
\|\eta_t(x,z;\cdot)\|_{(C^1(\mathbf{R}^d))^*} \le t\,C(m,M,T)\,\lambda\,[R(\lambda)+R_2(\lambda)]\,
\bigl(E_{1/2}[C(m,M,T)(2\lambda R(\lambda)+V+b)]\bigr)^3
\qquad (6.110)
\]
with $\lambda = \|Y\|_{M(\mathbf{R}^d)}$.

(ii) Assuming additionally that
\[
\sup_{y,t}\Bigl\|\frac{\delta^2 V_t(\mu,y)}{\delta\mu(\cdot)\,\delta\mu(\cdot)}\Bigr\|_{C^{1\times1}(\mathbf{R}^{2d})} \le R_3(\lambda), \qquad
\sup_{y,t}\sum_j\Bigl\|\frac{\delta^2 b^j_t(\mu,y)}{\delta\mu(\cdot)\,\delta\mu(\cdot)}\Bigr\|_{C^{1\times1}(\mathbf{R}^{2d})} \le R_3(\lambda),
\qquad (6.111)
\]
\[
\sup_{w,t}\Bigl\|\frac{\delta V_t(\mu,\cdot)}{\delta\mu(w)}\Bigr\|_{C^1(\mathbf{R}^d)} \le R_4(\lambda), \qquad
\sup_{w,t}\sum_j\Bigl\|\frac{\delta b^j_t(\mu,\cdot)}{\delta\mu(w)}\Bigr\|_{C^1(\mathbf{R}^d)} \le R_4(\lambda),
\qquad (6.112)
\]
it follows that the derivatives of $\eta_t(x,z;\cdot)$ with respect to $x$ and $z$ of order at most one are well defined as elements of $(C^2(\mathbf{R}^d))^*$ and
\[
\Bigl\|\frac{\partial^\alpha}{\partial x^\alpha}\frac{\partial^\beta}{\partial z^\beta}\eta_t(x,z;\cdot)\Bigr\|_{(C^2(\mathbf{R}^d))^*}
\le t\,C(m,M,T)\,\lambda\,[R(\lambda)+R_2(\lambda)+R_3(\lambda)+R_4(\lambda)]\,
\bigl(E_{1/2}[C(m,M,T)\,c_1(\lambda)]\bigr)^3
\qquad (6.113)
\]
for $\alpha, \beta = 0, 1$.


Proof. (i) For $g_t$ in (6.108), we have the estimate
\[
\|g_t\|_{(C^1(\mathbf{R}^d))^*} \le 4\lambda[R(\lambda)+R_2(\lambda)]\,\|Y\|_{M(\mathbf{R}^d)}\,\|\xi_t(x;\cdot)\|_{L_1(\mathbf{R}^d)}\,\|\xi_t(z;\cdot)\|_{L_1(\mathbf{R}^d)}
\le 4\lambda[R(\lambda)+R_2(\lambda)]\bigl(E_{1/2}[C(m,M,T)\,c_1(\lambda)]\bigr)^2,
\]
where (6.99) was used. Therefore, (6.110) follows by the first estimate in (6.103). In order to prove that the solution $\eta_t$ to equation (6.106) yields in fact the second derivative of $\mu_t$ with respect to the initial data, one can either use the strong form of this equation and apply Theorem 2.15.1, as in the proof of Theorem 6.7.2, or use an approximation by bounded operators, as will be explained in Section 6.12 for a more general setting.

(ii) The same as (i), but using the second estimate in (6.103). $\square$

6.9 Landau–Fokker–Planck-type equations

We shall now analyse the class of possibly degenerate nonlinear diffusions of the type (6.32) with
\[
A(\mu)f = \frac12\,{\rm tr}\Bigl(\int \sigma(x,y)\sigma^T(x,y)\,\mu(dy)\,\frac{\partial^2 f}{\partial x^2}\Bigr)
+ \Bigl(\int b(x,y)\,\mu(dy),\ \frac{\partial f}{\partial x}\Bigr),
\qquad (6.114)
\]
with a $d\times d$-square-matrix-valued function $\sigma(x,y)$, $x,y \in \mathbf{R}^d$, and a vector-valued function $b(x,y)$. As a key example, this class contains the celebrated Landau–Fokker–Planck equation from statistical physics (see Section 7.4 for the concrete $\sigma$, $b$ arising in this context).

Unlike the rest of the book, the results of this section depend strongly on the use of probability theory, since one of the ingredients of the proofs is supplied by Theorem 4.3.1, which we formulated without proof while mentioning that it arises as a consequence of the methods of stochastic analysis.

Theorem 6.9.1. Let $\sigma$, $b$ be continuous functions such that $\sigma(\cdot,y), b(\cdot,y) \in C^4(\mathbf{R}^d)$ with all required derivatives being continuous and uniformly bounded in both variables, and $\sigma(x,\cdot), b(x,\cdot) \in C^2(\mathbf{R}^d)$ with all required derivatives being continuous and uniformly bounded in both variables. Then the Cauchy problem (6.32), (6.114) is well posed in the sense that for any $Y \in P(\mathbf{R}^d)$ there exists a unique global solution $\mu_t(Y)$ in $M(\mathbf{R}^d)$ with the initial condition $Y$ such that $\mu_t(Y) \in P(\mathbf{R}^d)$ for all $t$.

Proof. Similar to our dealing with nonlinear diffusions, we are going to use Theorem 6.3.1 with $B^{\rm obs} = C_\infty(\mathbf{R}^d)$, $D = C^2_\infty(\mathbf{R}^d)$, $M = P(\mathbf{R}^d)$. By the assumption $\sigma(\cdot,y), b(\cdot,y) \in C^4(\mathbf{R}^d)$, by Theorem 5.3.3 with $D_2 = C^2_\infty(\mathbf{R}^d)$, and by Theorem 4.3.1, for any Lipschitz-continuous curve $\xi_t \in C_Y([0,T], M(D^*))$ there exists a backward propagator $U^{r,s}[\xi_.]$ generated by the family of operators
\[
A(\xi_t)f = \frac12\,{\rm tr}\Bigl(\int \sigma(x,y)\sigma^T(x,y)\,\xi_t(dy)\,\frac{\partial^2 f}{\partial x^2}\Bigr)
+ \Bigl(\int b(x,y)\,\xi_t(dy),\ \frac{\partial f}{\partial x}\Bigr),
\]
so that
\[
\|U^{r,s}[\xi_.]\|_{D\to D} \le U_D, \qquad \|U^{r,s}[\xi_.]\|_{B\to B} \le U_B
\]
uniformly for any compact interval containing $r, s$, with some constants $U_D, U_B$ that depend on the norms of $\sigma(\cdot,y), b(\cdot,y)$ in $C^4(\mathbf{R}^d)$. By the assumption $\sigma(x,\cdot), b(x,\cdot) \in C^2(\mathbf{R}^d)$, we can conclude that
\[
\|A(\xi) - A(\eta)\|_{D\to B} \le \sup_x\bigl(\|\sigma(x,\cdot)\|^2_{C^2(\mathbf{R}^d)} + \|b(x,\cdot)\|_{C^2(\mathbf{R}^d)}\bigr)\,\|\xi - \eta\|_{D^*}.
\]
Consequently, by Theorem 6.3.1 and Remark 105(iv), we find the required well-posedness of the Cauchy problem (6.32), (6.114). $\square$

Exercise 6.9.1. Extend Theorem 6.9.1 to the case of time-dependent $\sigma$ and $b$.

In most concrete applications (in particular in the setting of Landau equations), $\sigma$ and $b$ are not bounded. For instance, in the case of collisions of so-called Maxwellian molecules, these coefficients are of linear growth, see (7.55), (7.56). We shall not go into details concerning the necessary extensions, but refer instead to the original papers, see, e.g., the comments in Section 7.10 on the Landau–Fokker–Planck equation.
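To see the structure of (6.114) at work numerically, the following is a minimal sketch (not taken from the book) of the standard interacting-particle approximation of such a nonlinear diffusion in dimension one, with purely illustrative bounded coefficients `sigma` and `b` standing for $\sigma(x,y)$ and $b(x,y)$; the empirical measure of the particles plays the role of $\mu_t$.

```python
import numpy as np

# Illustrative bounded coefficients (assumptions for this sketch, not from the book)
sigma = lambda x, y: 1.0 + 0.5 * np.cos(x - y)   # sigma(x, y)
b     = lambda x, y: -np.tanh(x - y)             # b(x, y)

def simulate(N=500, T=1.0, dt=1e-2, seed=0):
    """Euler-Maruyama scheme for the particle system approximating (6.32), (6.114), d = 1.

    Each particle feels the empirical measure mu_N = (1/N) sum_j delta_{X_j} through the
    averaged drift  int b(x, y) mu(dy)  and the averaged diffusion  int sigma^2(x, y) mu(dy).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=N)                                        # sample from Y = N(0, 1)
    for _ in range(int(T / dt)):
        drift = b(x[:, None], x[None, :]).mean(axis=1)            # int b(x_i, y) mu_N(dy)
        var = (sigma(x[:, None], x[None, :]) ** 2).mean(axis=1)   # int sigma^2(x_i, y) mu_N(dy)
        x = x + drift * dt + np.sqrt(var * dt) * rng.normal(size=N)
    return x  # sample approximating mu_T(Y)

if __name__ == "__main__":
    sample = simulate()
    print("empirical mean/variance of mu_T:", sample.mean(), sample.var())
```

The averaging over the second argument is exactly the integration against $\mu(dy)$ in (6.114); for the concrete Landau–Fokker–Planck coefficients of Section 7.4 one would only exchange `sigma` and `b`, modulo the unboundedness issues just mentioned.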

6.10 Forward-backward systems

As we already pointed out, two different classes of the considered nonlinear equations can be distinguished: (i) with a local nonlinearity, i.e., depending on the pointwise values of the unknown function and its derivatives, and (ii) with a nonlinearity that depends on the integral characteristics of the unknown function (this is when the weak form often turns out to be useful). More recent studies (essentially arising from the mean-field games) brought to light the analysis of systems that are described by coupled combinations of equations of these two classes, which moreover are evolving in the opposite direction of time. A sufficiently general forward-backward system of this kind can be written as follows:
\[
\begin{cases}
\dfrac{d}{dt}(\phi,\mu_t) = \Bigl(A(\mu_t)\phi + \Bigl(b_t\bigl(x,\mu_t, u\bigl(x, \tfrac{\partial f_t}{\partial x}, \mu_t\bigr)\bigr), \nabla\phi\Bigr),\ \mu_t\Bigr), & \mu|_{t=0} = \mu_0,\ \phi \in D,\\[2mm]
\dfrac{\partial f_t}{\partial t} + A f_t + H_{\mu_t}\Bigl(x, \dfrac{\partial f_t}{\partial x}\Bigr) = 0, & f|_{t=T} = f_T.
\end{cases}
\qquad (6.115)
\]
Let us explain the meaning of all occurring terms. The unknown functions are $\mu_t \in P(\mathbf{R}^d)$, $t \in [0,T]$, and $f_t \in B = C_\infty(\mathbf{R}^d)$. The first equation is written in the weak form, which has to hold for all test functions $\phi \in D$, with $D$ usually taken to be $D = C^2_\infty(\mathbf{R}^d)$. $A$ and $A(\mu)$ are operators that generate strongly continuous semigroups in $C_\infty(\mathbf{R}^d)$ (for the sake of simplicity, we choose $A$ to be independent of $\mu$). The drift $b_t(x,\mu,u)$ is continuous, and the Hamilton function $H$ (or, more precisely, the family of these functions depending on $\mu$ as a parameter) is of the form
\[
H_\mu(x,p) = \max_{u \in U}\,[g(x,\mu,u)p - J(x,\mu,u)],
\qquad (6.116)
\]
as it arises from the theory of optimization, with some continuous functions $g, J$ and a closed set $U \subset \mathbf{R}^n$. It is assumed that this representation is chosen in such a way that the value $u(x,p,\mu)$, where the maximum in (6.116) is attained, is always uniquely defined. Therefore, $u$ in the first equation of (6.115) expresses the coupling between the two equations of (6.115).

Assuming that the backward Cauchy problem for the second equation in (6.115) is well posed for any curve $\mu_t \in C([0,T], P_1(D^*))$ (at least in the sense of mild solutions), the function
\[
u(x, \{\mu_{\ge t}\}; f_T) = u\Bigl(x,\ \frac{\partial f_t(x)}{\partial x}(x, \{\mu_{\ge t}\}, f_T),\ \mu_t\Bigr)
\qquad (6.117)
\]
is well defined. Consequently, the forward-backward system (6.115) can be rewritten as a single anticipating kinetic equation of the type (6.45):
\[
\frac{d}{dt}(\phi,\mu_t) = \bigl(A(\mu_t)\phi + (b_t(x,\mu_t, u(x,\{\mu_{\ge t}\}; f_T)), \nabla\phi),\ \mu_t\bigr), \qquad \mu|_{t=0} = \mu_0,\ \phi \in D.
\qquad (6.118)
\]
Let us say that the Hamiltonian (6.116) has a Lipschitz minimizer if the function $u(x,p,\mu)$ is a Lipschitz-continuous function of all its three variables, with $\mu \in P(\mathbf{R}^d)$ considered in the norm topology of the space $D^*$. In the sequel, we shall work only with Hamiltonians of this class. The most important nontrivial point is the Lipschitz continuity in $p$. This property is often fulfilled for convex (in $u$) functions $J$ and convex sets $U$. In the Appendix, see Theorem 9.5.1, a simple subclass of such Hamiltonians is explicitly identified.
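As a toy illustration of a Lipschitz minimizer (made-up data, not from the book), take $U = [0,1]\subset\mathbf{R}$, $g(x,\mu,u) = \alpha u$ with a bounded coefficient $\alpha = \alpha(x,\mu)$, and $J(x,\mu,u) = u^2/2$ in (6.116); the maximizer is then an explicit clipping operation and is Lipschitz in $p$ with constant $|\alpha|$.

```python
import numpy as np

def minimizer(alpha, p):
    """Argmax of  u * alpha * p - u**2 / 2  over u in U = [0, 1].

    Toy instance g(x, mu, u) = alpha * u, J = u**2 / 2 of (6.116);
    the argmax is clip(alpha * p, 0, 1), hence |alpha|-Lipschitz in p.
    """
    return np.clip(alpha * p, 0.0, 1.0)

def hamiltonian(alpha, p):
    u = minimizer(alpha, p)
    return u * alpha * p - 0.5 * u ** 2

if __name__ == "__main__":
    ps = np.linspace(-2.0, 2.0, 9)
    us = minimizer(1.0, ps)
    # increments of u are bounded by increments of p: Lipschitz continuity in p
    print(np.round(us, 3), bool(np.all(np.abs(np.diff(us)) <= np.abs(np.diff(ps)) + 1e-12)))
```

For other strictly convex $J$ and convex compact $U$ one expects a similarly explicit and Lipschitz dependence on $p$, in line with the discussion above.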

Recall that we denoted
\[
P_1 = P_1(\mathbf{R}^d) = \Bigl\{\mu \in P(\mathbf{R}^d):\ \int |x|\,\mu(dx) < \infty\Bigr\}.
\]

Theorem 6.10.1. Let $H_\mu(x,p)$ be a family of Hamiltonians (6.116) having a Lipschitz minimizer and satisfying the assumptions of Theorem 6.1.2. Let $A$ satisfy the assumption on $A$ of Theorem 6.1.1 and $A(\mu)$ satisfy the assumptions of Theorem 6.1.3(ii) with $B^{\rm par} = C([0,T], P_1(D^*))$. Assume that for any continuous function $u(x, \{\mu_.\}; f_T)$, which is Lipschitz-continuous in the first two variables $x \in \mathbf{R}^d$, $\mu_. \in C([0,T], P_1(D^*))$, and any curve $\xi_. \in C([0,T], P_1(D^*))$, the operators
\[
A_t = A_t(\{\xi_.\}) = A(\xi_t) + (b_t(x,\xi_t, u(x,\{\xi_.\}; f_T)), \nabla):\ D \to B
\qquad (6.119)
\]
generate conservative Feller propagators $U^{t,s}[\{\xi_.\}]$ in $B$ (recall that 'conservative' means 'preserving constants') on the common invariant domain $D$, such that the $U^{t,s}[\{\xi_.\}]$ also act as bounded operators in the space of weighted functions $C_L(\mathbf{R}^d)$ with $L(x) = 1 + |x|$, so that
\[
\|U^{t,s}[\{\xi_.\}]\|_{D\to D} \le U_D \quad \text{and} \quad \|U^{t,s}[\{\xi_.\}]\|_{C_L(\mathbf{R}^d)\to C_L(\mathbf{R}^d)} \le U_L,
\qquad (6.120)
\]
for some constants $U_D$, $U_L$. Moreover, let
\[
\sup_{t\in[0,T]}\|A[t,\{\xi_.\}] - A[t,\{\eta_.\}]\|_{D\to B} \le L_A \sup_{t\in[0,T]}\|\xi_t - \eta_t\|_{D^*}
\qquad (6.121)
\]
for a constant $L_A$. Then the forward-backward problem (6.115) has a solution $\mu_t \in P_1$ and $f_t \in C^1(\mathbf{R}^d)$ for any $f_T \in C^1(\mathbf{R}^d)$ and $\mu_0 \in P_1$. If $L_A U_D T < 1$, this solution is unique.

Proof. By Theorem 6.1.4(ii) with
\[
B^{\rm par} = C([0,T], P_1(D^*)),
\]
for any $\mu_t \in C([0,T], P_1(D^*))$ the backward equation in (6.115), which is an HJB equation, has a unique mild solution in $C^1(\mathbf{R}^d)$ for any $f_T \in C^1(\mathbf{R}^d)$. By the assumption of a Lipschitz minimizer and again by Theorem 6.1.2, the function $u(x,\{\mu_{\ge t}\}; f_T)$ depends Lipschitz-continuously on its arguments $x$ and $\{\mu_{\ge t}\}$. Therefore the proof can be completed by applying Theorems 6.4.2 and 6.4.4. $\square$

The main examples are supplied by operators $A$ and $A(\mu)$ of the Lévy–Khintchine type. For instance, the following holds:

Theorem 6.10.2. The assumption on $A$ from Theorem 6.10.1 holds if $b_t$ is a Lipschitz-continuous function of its variables and the $A(\mu)$ are either non-degenerate diffusion operators or mixed fractional Laplacians of the type (5.100) satisfying the assumptions of Theorem 5.6.3 uniformly in $\mu$.

Proof. The basic conditions (6.120) follow from Theorems 5.6.2 and 5.6.3 as well as from Propositions 4.5.2 and 4.3.2. $\square$

6.11 Linearized evolution around non-linear propagators

In this section, we make a preparatory step to proving the sensitivity for general evolutions provided by Theorem 6.3.1 without the simplifying assumption of the major term being non-degenerate.

In terms of the general notations for the spaces of mappings (see (1.53)), the assumption (6.33) means that $A \in C_{\rm bLip}(M(D^*), \mathcal{L}(D,B))$ with $L_A$ being the Lipschitz constant, where writing $M(D^*)$ emphasizes that $M$ is considered in the topology induced from $D^*$. Let us now make the stronger assumption that $A \in C^1(M(D^*), \mathcal{L}(D,B))$. We aim at showing that the derivative

\[
\xi_t = D\mu_t(Y)[\xi] = \lim_{h\to 0+}\frac1h\bigl(\mu_t(Y + h\xi) - \mu_t(Y)\bigr)
\qquad (6.122)
\]
is well defined as an element of $D^*$, with the limit in (6.122) also existing in the sense of the topology of $D^*$. This will illustrate a general feature of evolutions with a non-Lipschitz r.h.s., where the derivatives with respect to the initial data lie in a different regularity class than the solutions.

Assuming that (6.122) exists, one can differentiate equation (6.32) to obtain
\[
\frac{d}{dt}(f,\xi_t) = (A(\mu_t)f, \xi_t) + (DA(\mu_t)[\xi_t]f, \mu_t), \qquad \xi_0 = \xi,\ f \in D,
\qquad (6.123)
\]
which describes the linearized evolution around a path of a nonlinear semigroup $\mu_t = \mu_t(Y)$. As in similar situations, in order to prove the existence of the derivative (6.122), we first show the well-posedness of the evolution (6.123), and then we show that it yields the derivative (6.122).

A careful look at equation (6.123) leads to the conclusion that one cannot generally expect to solve it by a curve $\xi_t$ in $B^*$, but only in $D^*$. This in turn implies the necessity of additional assumptions on its r.h.s. Namely, let us assume that there exists the representation
\[
(D_\xi A(\mu)g, \mu) = (F(\mu)g, \xi),
\qquad (6.124)
\]
with $F(\mu)$ a Lipschitz-continuous mapping $M(D^*) \to \mathcal{L}(D,D)$, so that
\[
\|F(\mu) - F(\eta)\|_{D\to D} \le L_F\,\|\mu - \eta\|_{D^*},
\qquad (6.125)
\]
with a constant $L_F$ that may depend on $\mu$ and $\eta$, but must be uniformly bounded for $\mu, \eta$ from bounded subsets of $M$. This condition does not seem to be very intuitive from an abstract point of view. In concrete examples, however, the operators $A(\mu)$ are usually some differential operators with coefficients depending on $\mu$, i.e., they are sums of terms like
\[
A_\Omega(\mu) = \Omega(x,\mu)\,\frac{\partial^k}{\partial x_1\cdots\partial x_k}
\]
with some functional $\Omega(x,\mu)$ that smoothly depends on $\mu$. In this case, we find
\[
(D_\xi A(\mu)g, \mu) = \Bigl(D_\xi\Omega(x,\mu)\frac{\partial^k g}{\partial x_1\cdots\partial x_k},\ \mu\Bigr)
= \int\!\!\int \frac{\delta\Omega(x,\mu)}{\delta\mu(y)}\,\xi(dy)\,\frac{\partial^k g(x)}{\partial x_1\cdots\partial x_k}\,\mu(dx),
\qquad (6.126)
\]


so that the structure (6.124) becomes transparent with the function
\[
(F(\mu)g)(y) = \int \frac{\delta\Omega(x,\mu)}{\delta\mu(y)}\,\frac{\partial^k g(x)}{\partial x_1\cdots\partial x_k}\,\mu(dx).
\]
More generally, if $A(\mu)$ is a family of pseudo-differential operators with the symbols $H(x,p,\mu)$ (see (1.95) if needed) depending smoothly on $\mu$, one gets
\[
(F(\mu)g)(y) = \int \Bigl(\Bigl[\frac{\delta H(x,p,\mu)}{\delta\mu(y)}\Bigr]g\Bigr)(x)\,\mu(dx),
\qquad (6.127)
\]
where we denoted (only for this argument) by $[\psi]$ the pseudo-differential operator with the symbol $\psi$. Therefore, the following theory is effectively meant to be applicable to differential or pseudo-differential operators $A(\mu)$.

Under the assumption (6.124), equation (6.123) can be rewritten as
\[
\frac{d}{dt}(f,\xi_t) = (A(\mu_t)f, \xi_t) + (F(\mu_t)f, \xi_t), \qquad f \in D,
\qquad (6.128)
\]
which is dual to the equation
\[
\dot g = -(A(\mu_t) + F(\mu_t))g.
\qquad (6.129)
\]
Equations of this type were analysed in Theorem 4.13.2, which implies that this equation is well posed in $D$ with the derivative understood in the sense of the topology of $B^{\rm obs}$. The resolving backward propagator $\Phi^{t,s}[Y]$ of equation (6.129) acts in $D$ (and generally not in $B$), and the $\xi_t$ evolving according to the dual propagator $\Psi^{s,t}[Y] = (\Phi^{t,s}[Y])^*$ are well defined in $D^*$.

However, since $\Psi^{s,t}[Y] = (\Phi^{t,s}[Y])^*$ acts in $D^*$ and not in $B^*$, the expression on the r.h.s. of (6.128) may not make sense for $f \in D$. Therefore, this propagator supplies, strictly speaking, only a generalized solution to (6.128). This is a typical difficulty related to the already mentioned fact that nonlinear evolutions with unbounded coefficients usually push the derivatives of the solutions with respect to parameters outside the domain where the solutions live. In order to ensure that the $\Psi^{s,t}[Y]\xi$ do solve (6.128), further regularity assumptions are required, which can be conveniently formulated in the setting of the three Banach spaces $D \subset \tilde D \subset B^{\rm obs}$ as in Theorem 5.1.3, so that $\|\cdot\|_D \ge \|\cdot\|_{\tilde D} \ge \|\cdot\|_{B^{\rm obs}}$, $\tilde D$ is dense in $B^{\rm obs}$ in the topology of $B$, and $D$ is dense in $\tilde D$ in the topology of $\tilde D$. The main result of this section is as follows:

Theorem 6.11.1.

(i) Under the assumptions of Theorem 6.3.1, let the representation (6.124)–(6.125) hold. Then for each $Y \in M$, equation (6.129) is well posed, the solution being given by a strongly continuous backward propagator $\Phi^{t,s}[Y]$ in $D$ such that $\Phi^{t,s}[Y]g$ is the unique curve in $D$ solving equation (6.129) in $B^{\rm obs}$ (the derivative being defined with respect to the topology of $B$). The dual operators $\Psi^{s,t}[Y] = (\Phi^{t,s}[Y])^*$ form a $*$-weakly continuous propagator in $D^*$.

(ii) Assume additionally that the backward propagators $U^{t,s}[\eta_.]$ generated by $A(\eta_t)$ are also strongly continuous and bounded in $\tilde D$, and that the operators $A(\eta)$ are also bounded operators $D \to \tilde D$ such that
\[
\|A(\xi) - A(\eta)\|_{D\to\tilde D} \le L_A\,\|\xi - \eta\|_{D^*}.
\qquad (6.130)
\]
Moreover, let the representation (6.124) hold, with $F(\mu)$ being a Lipschitz-continuous mapping $M(D^*) \to \mathcal{L}(D,D)$ and $M(D^*) \to \mathcal{L}(\tilde D, \tilde D)$ such that (6.125) holds and
\[
\|F(\mu) - F(\eta)\|_{\tilde D\to\tilde D} \le L_F\,\|\mu - \eta\|_{D^*}.
\qquad (6.131)
\]
Then, for each $Y \in M$ and $\xi \in D^*$, $\xi_t = \Psi^{t,0}[Y]\xi$ is the unique solution to (6.123) in the sense that it holds for any $f \in D$.

Remark 110. The construction of propagators from condition (ii) can naturally be carried out via Theorem 5.1.1, that is, via T-products.

Proof. (i) This is a direct consequence of Theorem 4.13.2. (ii) By Proposition 5.1.1, the $U^{t,s}[\mu_t]g$ solve the equation $\dot g_t = -A(\mu_t)g_t$ in $\tilde D$ for any $g \in D$. Proposition 4.13.1, applied to $(D, \tilde D)$ rather than $(D, B^{\rm obs})$, yields the existence of the strong backward propagator $\Phi^{t,s}$ of linear operators in $\tilde D$ generated by the family $A(\mu_t) + F(\mu_t)$ on $D$. By Theorem 4.10.1, again applied to $(D, \tilde D)$, the dual propagator $\Psi^{s,t} = (\Phi^{t,s})^*$ supplies a unique solution to the Cauchy problem for equation (6.128). $\square$

Both for numerical simulations and for the application to interacting particles, it is crucial to analyse the dependence of the solutions on other parameters, not only on the initial data. Therefore, we shall consider a more general situation, where we are given a family of operators $A^\alpha(\mu)$ that depends on a real parameter $\alpha$ and satisfies the assumptions of Theorem 6.3.1 for each $\alpha$. For $\mu^\alpha_t = \mu^\alpha_t(Y)$, a solution corresponding to (6.32) with the initial condition $Y$, we are interested in the derivative
\[
\xi_t(\alpha) = \frac{\partial\mu^\alpha_t}{\partial\alpha}.
\qquad (6.132)
\]
Differentiating (6.32) (at least formally for the moment) with respect to $\alpha$ yields the equation
\[
\frac{d}{dt}(g,\xi_t(\alpha)) = (A^\alpha(\mu^\alpha_t)g, \xi_t(\alpha)) + (D_{\xi_t(\alpha)}A^\alpha(\mu^\alpha_t)g, \mu^\alpha_t)
+ \Bigl(\frac{\partial A^\alpha(\mu^\alpha_t)}{\partial\alpha}g,\ \mu^\alpha_t\Bigr),
\qquad (6.133)
\]
as an extension of (6.123), with the initial condition
\[
\xi_0 = \xi_0(\alpha) = \frac{\partial\mu^\alpha_0}{\partial\alpha}.
\qquad (6.134)
\]
We can now extend Theorem 6.11.1 to the case of the linearized evolution (6.133).


Theorem 6.11.2. Let the conditions of Theorem 6.11.1(ii) hold for each family $A^\alpha$, with all bounds being uniform in $\alpha$. Moreover, let $\partial A^\alpha(\mu^\alpha_t)/\partial\alpha$ exist and define a continuous mapping $M(D^*) \to \mathcal{L}(D, B^{\rm obs})$. Then the Cauchy problem for the weak equation (6.133) has a unique solution $\xi_t = \Pi^{t,0}[\alpha,Y]\xi_0$ (in the sense that (6.133) holds for all $g \in D$) given by the formula
\[
(g,\xi_r) = (\Phi^{t,r}[\alpha,Y]g,\ \xi_t) + \int_t^r\Bigl(\frac{\partial A^\alpha(\mu^\alpha_s)}{\partial\alpha}\,\Phi^{s,r}[\alpha,Y]g,\ \mu^\alpha_s\Bigr)ds.
\qquad (6.135)
\]

Proof. It follows directly from Theorem 6.11.1 and Proposition 4.10.3. $\square$

We complete this section with a simple stability result for $\Pi^{s,t}$.

Theorem 6.11.3.

(i) Under the assumptions of Theorem 6.11.1(ii), the propagator $\Psi$ depends continuously on $Y = \mu_0$ in the following sense:
\[
\|\Psi^{s,t}[Y_1] - \Psi^{s,t}[Y_2]\|_{D^*\to D^*} \le C(L_A + L_F)\,\|Y_1 - Y_2\|_{D^*},
\qquad (6.136)
\]
with a constant $C$ that is uniform for $Y_1, Y_2$ in any bounded subset of $M(D^*)$.

(ii) Under the assumptions of Theorem 6.11.2, suppose that
\[
\Bigl\|\frac{\partial A^\alpha(\mu)}{\partial\alpha} - \frac{\partial A^\alpha(\eta)}{\partial\alpha}\Bigr\|_{D\to B} \le L_{\partial A}\,\|\mu - \eta\|_{D^*},
\]
with a constant $L_{\partial A}$ that can be chosen uniformly for $\mu, \eta$ from any bounded subset of $M(D^*)$. Then the solutions $\xi_t = \Pi^{t,0}[\alpha,Y]\xi_0$ depend continuously on $Y = \mu_0$ in the following sense:
\[
\|(\Pi^{t,0}[\alpha,Y_1] - \Pi^{t,0}[\alpha,Y_2])\xi_0\|_{D^*} \le C\bigl(L_{\partial A} + (L_A + L_F)\|\xi\|_{D^*}\bigr)\,\|Y_1 - Y_2\|_{D^*}.
\qquad (6.137)
\]

Proof. (i) By Proposition 4.9.3, we have
\[
\|(\Phi^{t,s}[Y_1] - \Phi^{t,s}[Y_2])f\|_D \le t\,\|f\|_D\,(L_A + L_F)\,
\sup_{\tau,r}\|\Phi^{\tau,r}[Y_1]\|_{D\to D}\,
\sup_{\tau,r}\|\Phi^{\tau,r}[Y_2]\|_{D\to D}\,
\sup_\tau\|\mu_\tau(Y_1) - \mu_\tau(Y_2)\|_{D^*}.
\]
The last term can be estimated by
\[
U_B\,\exp\{t\,U_D U_B L_A\}\,\|Y_1 - Y_2\|_{D^*},
\]
which implies (6.136) by duality.

(ii) This again follows by Proposition 4.9.3. $\square$


6.12 Sensitivity of nonlinear propagators

In order to deduce that $\Psi^{s,t}[Y]\xi$ from the previous section equals the derivative (6.122), and therefore to prove the sensitivity result for general nonlinear propagators, we shall approximate $A(\mu)$ by a sequence of bounded operators $A_n(\mu)$, for which the statement is a straightforward consequence of the theory for equations with a Lipschitz-continuous r.h.s., see Theorem 2.9.1. Afterwards, we pass to the limit. For the setting of $\Psi$DOs that we are dealing with, a natural way is to approximate operators by approximating their symbols, see the discussion prior to Theorem 5.15.1.

Theorem 6.12.1. Under the assumptions of Theorem 6.11.1(ii), assume that there exists a sequence of families of operators $A_n(\mu)$ such that for each $n$ they satisfy all the assumptions of Theorem 6.3.1 with the same bounds, they are bounded as operators $D \to D$ and $B^{\rm obs} \to B^{\rm obs}$, and
\[
\|A_n(\eta) - A(\eta)\|_{D\to D} \to 0, \qquad \|F_n(\eta) - F(\eta)\|_{D\to D} \to 0,
\qquad (6.138)
\]
as $n \to \infty$, uniformly for $\eta$ from any bounded subset of $M$. Then the derivative (6.122) exists (the limit being understood in the sense of $D^*$) and it equals the unique solution $\xi_t = \Psi^{t,0}[Y]\xi$ to equation (6.123) constructed in Theorem 6.11.1(ii).

Proof. We need to add a dependence on $n$ to all objects constructed from $A_n$. By Theorem 2.9.1, the claim of Theorem 6.12.1 holds for $\xi^n_t[Y] = \Psi^{t,0}_n[Y]\xi$, which implies that
\[
\mu^n_t(Y + h\xi) - \mu^n_t(Y) = \int_0^h \xi^n_t[Y + r\xi]\,dr.
\qquad (6.139)
\]
By Theorem 6.3.2 and the first relation in (6.138), $\mu^n_t$ converges to $\mu_t$ in $D^*$, as $n \to \infty$. By (6.130), (6.131), (6.138) and Proposition 4.9.3, applied to the backward propagators $\Phi^{t,s}[Y]$ and $\Phi^{t,s}_n[Y]$ in $D$, it follows that $\xi^n_s \to \xi_s$ in $D^*$. Therefore, passing to the limit in (6.139) in the topology of $D^*$, as $n \to \infty$, yields
\[
\mu_t(Y + h\xi) - \mu_t(Y) = \int_0^h \xi_t[Y + r\xi]\,dr = h\,\xi_t[Y] + \int_0^h (\xi_t[Y + r\xi] - \xi_t[Y])\,dr.
\qquad (6.140)
\]
Theorem 6.11.3 implies that
\[
\|\xi_t[Y + r\xi] - \xi_t[Y]\|_{D^*} \to 0,
\]
as $r \to 0$, and consequently $(\mu_t(Y + h\xi) - \mu_t(Y))/h \to \xi_t[Y]$ in $D^*$, as $h \to 0$, as required. $\square$

The extension to a more general dependence on a parameter is as follows:


Theorem 6.12.2. Let $\xi_0 = \xi \in B^*$ be defined by (6.134), where the derivative exists in the norm topology of $D^*$. Then the derivative (6.132) exists in $D^*$ and it equals the unique solution $\xi_t[\alpha] = \Pi^{t,0}[\alpha, \mu^\alpha_0]\xi$ to equation (6.133) constructed in Theorem 6.11.2.

Proof. As above, we get the following equation for the approximating family:
\[
\mu^\alpha_t(n) - \mu^{\alpha_0}_t(n) = \int_{\alpha_0}^\alpha \xi_t[\beta](n)\,d\beta,
\qquad (6.141)
\]
which holds as an equation in $D^*$. Again passing to the limit completes the proof. $\square$

Theorems 6.12.1 and 6.12.2 are applicable to a wide range of nonlinear evolutions, including degenerate McKean–Vlasov diffusions, various nonlinear evolutions with fractional derivatives and the Landau–Fokker–Planck-type equations of Section 6.9.
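The content of Theorem 6.12.1 can be checked by hand in a finite-dimensional toy model (a sketch with made-up coefficients, not from the book): the derivative of a flow with respect to its initial data solves the linearized equation along the flow, and the difference quotient (6.122) converges to it.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy nonlinear vector field and its Jacobian (illustrative, not from the book)
def F(mu):
    return np.array([-mu[0] * mu[1], mu[0] * mu[1] - mu[1], mu[1]])

def DF(mu):
    return np.array([[-mu[1], -mu[0], 0.0],
                     [ mu[1],  mu[0] - 1.0, 0.0],
                     [ 0.0,    1.0, 0.0]])

def flow(y0, T=2.0):
    return solve_ivp(lambda t, y: F(y), (0.0, T), y0, rtol=1e-10, atol=1e-12).y[:, -1]

def linearized(y0, xi0, T=2.0):
    """Solve the flow together with xi' = DF(mu_t) xi, the finite-dimensional
    analogue of the linearized equation (6.123)."""
    def rhs(t, z):
        mu, xi = z[:3], z[3:]
        return np.concatenate([F(mu), DF(mu) @ xi])
    return solve_ivp(rhs, (0.0, T), np.concatenate([y0, xi0]),
                     rtol=1e-10, atol=1e-12).y[3:, -1]

if __name__ == "__main__":
    Y, xi, h = np.array([0.9, 0.1, 0.0]), np.array([1.0, -1.0, 0.0]), 1e-6
    fd = (flow(Y + h * xi) - flow(Y)) / h            # difference quotient as in (6.122)
    print(np.max(np.abs(fd - linearized(Y, xi))))    # agreement up to O(h)
```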

6.13 Summary and comments

A method for constructing nonlinear dynamics via infinite-dimensional multiplicative-integral equations was developed for the quantum mechanical problem of many-particle systems in [201]. Further development of this method for the analysis of nonlinear equations of control theory (HJB) and statistical mechanics of interacting Markov evolutions (like nonlinear diffusions and nonlinear evolutions with fractional Laplacians) in the spirit of Sections 6.3 and 6.12, including most importantly their sensitivity and applications to forward-backward systems as presented in this chapter, was carried out in several works of the author with co-workers, see, e.g., [143, 147, 163-165]. We did not touch here the important analysis of nonlinear diffusions in domains with a boundary, see, e.g., [226, 247] and references therein.

Our approach to HJB equations was essentially via mild solutions, which is effective for equations with a smoothing linear part. For first-order HJB equations of the type $H(x, u(x), \nabla u(x)) = 0$ with $x \in \Omega$ for open subsets $\Omega$ of a Banach space $B$ (including the evolutionary equation $\frac{\partial u}{\partial t} + H(x, u, \nabla u) = 0$ in $(0,T)\times\Omega$), the most appropriate notion of a solution is that of a viscosity solution. There are several similar definitions. The most transparent one that suffices to fully treat equations in $\mathbf{R}^d$ and in reflexive Banach spaces (see [54] and references therein) is the following: A continuous function $u$ on $\Omega$ is a viscosity solution to the equation $H(x, u(x), \nabla u(x)) = 0$ if it is both a supersolution and a subsolution, meaning that $H(x, u(x), p) \ge 0$ (respectively $H(x, u(x), p) \le 0$) for any $x$ and any subdifferential (respectively superdifferential) $p$ of $u$ at $x$, where $p \in B^*$ is called a subdifferential of $u$ at $x$ if
\[
\liminf_{y\to x}\,[u(y) - u(x) - (p, y-x)] \ge 0,
\]
and a superdifferential of $u$ at $x$ if
\[
\limsup_{y\to x}\,[u(y) - u(x) - (p, y-x)] \le 0.
\]

The literature on the popular topic of McKean–Vlasov diffusions (deterministic or random) is extensive, see, e.g., [3, 46, 56, 221] and references therein. The main new point of [147] was the systematic development of sensitivity (smoothness of solutions with respect to the initial data). There is quite a lot of work available on the Landau–Fokker–Planck equation (bibliographical comments are given in Section 7.10). We only explained how our general approach works for the class of equations of a similar type, but with bounded coefficients.

As mentioned, the rising interest in forward-backward systems has been due to the development of mean-field games, which currently represent one of the most popular directions in game theory. Mean-field games were initiated in [113] and [180]. At present, there exist already several excellent surveys and monographs on various directions of the theory, see [34, 43, 48, 92, 97]. For results on the existence of solutions to forward-backward systems under various assumptions, we can refer, e.g., to [29, 93, 164] and references therein. For an analysis in terms of the master equation, see [35, 45, 47, 48, 162].

The general Theorem 6.3.1 can be used in many other situations that have not been covered in this book. For instance, quantum dynamic semigroups acting in the space $\mathcal{L}(\mathcal{H},\mathcal{H})$ of bounded operators in a Hilbert space $\mathcal{H}$ are known to have generators of the form
\[
L(X) = \sum_{j=1}^\infty\Bigl(V_j^* X V_j - \frac12(V_j^* V_j X + X V_j^* V_j)\Bigr) + i[H, X],
\qquad (6.142)
\]
where $H$ is a self-adjoint operator in the Hilbert space $\mathcal{H}$ and $V_j$, $\sum_{j=1}^\infty V_j^* V_j \in \mathcal{L}(\mathcal{H},\mathcal{H})$. A straightforward manipulation shows that the corresponding dual evolution on the space of trace-class operators has the generator
\[
L'(Y) = \frac12\sum_{j=1}^\infty\bigl([V_j Y, V_j^*] + [V_j, Y V_j^*]\bigr) - i[H, Y],
\qquad (6.143)
\]
where $L'$ denotes the dual operator with respect to the usual pairing given by the trace (see Section 1.14). Nonlinear counterparts of dynamic semigroups that appear as the law of large numbers or the mean-field limit for interacting quantum particles are given by nonlinear equations of the form
\[
\dot Y = L'_Y(Y) = \frac12\sum_{j=1}^\infty\bigl([V_j(Y)Y, V_j^*(Y)] + [V_j(Y), Y V_j^*(Y)]\bigr) - i[H(Y), Y],
\qquad (6.144)
\]
where the operators $V_j$ and $H$ additionally depend on the current state $Y$. As a more or less direct consequence of Theorem 6.3.1 (see [147]), one can derive a rather general well-posedness result for equation (6.144), with its solution forming the so-called nonlinear quantum dynamic semigroup.
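The duality between (6.142) and (6.143) with respect to the trace pairing, ${\rm tr}(L(X)Y) = {\rm tr}(X L'(Y))$, can be verified numerically; the following sketch (finitely many random $V_j$ in a finite-dimensional space, a purely illustrative setup that is not part of the book) performs this check.

```python
import numpy as np

rng = np.random.default_rng(1)
n, J = 4, 3

def rand_matrix(k):
    return rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k))

def dag(A):
    return A.conj().T

H = rand_matrix(n); H = (H + dag(H)) / 2           # self-adjoint H
V = [rand_matrix(n) for _ in range(J)]             # finitely many V_j for this sketch

def L(X):        # generator (6.142) acting on observables
    out = 1j * (H @ X - X @ H)
    for Vj in V:
        out += dag(Vj) @ X @ Vj - 0.5 * (dag(Vj) @ Vj @ X + X @ dag(Vj) @ Vj)
    return out

def L_dual(Y):   # dual generator (6.143) acting on trace-class operators
    out = -1j * (H @ Y - Y @ H)
    for Vj in V:
        out += 0.5 * ((Vj @ Y @ dag(Vj) - dag(Vj) @ Vj @ Y)
                      + (Vj @ Y @ dag(Vj) - Y @ dag(Vj) @ Vj))
    return out

X, Y = rand_matrix(n), rand_matrix(n)
print(abs(np.trace(L(X) @ Y) - np.trace(X @ L_dual(Y))))   # ~ 1e-13 (machine precision)
```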

Combining the results of Chapters 4 and 5 with the general construction of Sections 6.3 and 6.4, one can construct nonlinear extensions of basically all linear evolutions covered in Chapters 4 and 5. We refer to them as nonlinear Markov if they define evolutions of measures that preserve positivity. As we saw in Section 5.10, this positivity-preservation is the hallmark of nonlinear evolutions that have generators of the Lévy–Khintchine type with coefficients depending on an unknown measure. For instance, fractional Laplacians $|\Delta|^{\alpha/2}$ have this form for $\alpha \le 2$. We refer to [143] for an analysis of a reasonably general equation
\[
\frac{d}{dt}(g,\mu_t) = -(\sigma(x)|\Delta|^{\alpha/2}g, \mu_t) + \int(V(x,y), \nabla g(x))\,\mu_t(dx)\,\mu_t(dy)
+ \int(g(x+z) - g(x))\,\psi(x,y;z)\,dz\,\mu_t(dx)\,\mu_t(dy),
\qquad (6.145)
\]
which combines fractional Laplacian, drift and integral terms.

As other examples, one can mention stochastic geodesic flows on manifolds, including non-local evolutions of the velocities (see [147]), and nonlinear diffusions with singular coefficients (depending, e.g., on the median or the VaR), as developed in [151] for tackling the model of financial asset pricing suggested in [55].


Chapter 7

Equations in Spaces of Weighted Measures

In this chapter, we deal with equations in spaces of weighted measures and continuous functions. Basic examples for this type of equation are supplied by general kinetic equations of statistical physics, including the celebrated equations of Boltzmann and Smoluchowski, as well as by replicator dynamics and its various modifications from evolutionary biology and dynamic games. We shall briefly explain how these nonlinear equations arise from the linear evolutions of systems of interacting particles when the number of particles tends to infinity, i.e., as the dynamic law of large numbers. Note that we shall only lay the foundations, while we will supply a brief guide to the enormous literature at the end of the chapter. Most of the exposition can be regarded as a far-reaching extension of the discrete setting of Chapter 3.

7.1 Conditional positivity

Let us start with the infinitesimal structure of positivity-preserving evolutions.

Theorem 7.1.1. Let $X$ be a complete metric space. Let $\dot\mu_t = F_t(\mu_t)$ be an equation in $M(X)$ such that $F: \mathbf{R}\times M(X) \to M(X)$ is a continuous function. Let $\mu_t$, $t \ge s$, be a solution to this equation with the initial data $\mu_s = \mu \in M^+(X)$. If the negative part $F^-_s(\mu)$ of the Hahn decomposition $F_s(\mu) = F^+_s(\mu) - F^-_s(\mu)$ of $F_s(\mu)$ is not absolutely continuous with respect to $\mu$, then $\mu_t \notin M^+(X)$ for all sufficiently small $t - s$.

Proof. By the Lebesgue decomposition theorem, we find $F^-_s(\mu) = F^-_{s,\rm abs}(\mu) + F^-_{s,\rm sing}(\mu)$, where $F^-_{s,\rm abs}(\mu)$ and $F^-_{s,\rm sing}(\mu)$ are the absolutely continuous respectively the singular parts of the measure $F^-_s(\mu)$ with respect to $\mu$. If $F^-_s(\mu)$ is not absolutely continuous with respect to $\mu$, there exists $A$ such that $(F^-_{s,\rm sing}(\mu))(A) > 0$ and $(F^-_{s,\rm abs}(\mu))(A) = (F^+_s(\mu))(A) = \mu(A) = 0$. For any solution $\mu_t$, we have
\[
\|\mu_{s+\delta} - \mu - \delta F_s(\mu)\| \le \delta\varepsilon,
\]
where $\varepsilon$ can be made arbitrarily small by choosing a sufficiently small $\delta$. Therefore,
\[
|\mu_{s+\delta}(A) + \delta(F^-_{s,\rm sing}(\mu))(A)| \le \varepsilon\delta.
\]
If $\varepsilon < (F^-_{s,\rm sing}(\mu))(A)$, it follows that $\mu_{s+\delta}(A) < 0$, as required. $\square$

This result motivates the following definition: A mapping $F: M(X) \to M(X)$ is conditionally positive if its negative part $F^-(\mu)$ is everywhere absolutely continuous with respect to $\mu$, i.e.,
\[
F(\mu) = \Omega(\mu) - a(x,\mu)\mu,
\qquad (7.1)
\]
where $a(x,\mu)$ is a non-negative function and $\Omega(\mu) \in M^+(X)$.

The proof given above is sufficiently robust to allow for several extensions. For instance, the same conclusion holds if we consider the same equation in the space of weighted measures (see $M(X,L)$ below), or if we require the equation to hold weakly. However, it does not cover equations where $F_t(\mu)$ maps $M(X)$ onto a more irregular class of generalized functions.

Introducing the transition kernel
\[
\nu(x,\mu,dy) = \frac{a(x,\mu)}{\int a(z,\mu)\,\mu(dz)}\,\Omega(\mu)(dy),
\qquad (7.2)
\]
depending on $\mu$ as on a parameter, one can rewrite (7.1) in the equivalent form
\[
F(\mu) = \int_X \mu(dw)\,\nu(w,\mu,\cdot) - a(\cdot,\mu)\,\mu(\cdot),
\qquad (7.3)
\]
which is convenient for comparisons with the linear theory, and therefore for the weak formulation. In fact, by (7.3), the equation $\dot\mu_t = F(\mu_t)$ can be rewritten in the standard weak form (6.32), i.e.,
\[
\frac{d}{dt}(f,\mu_t) = (A(\mu_t)f, \mu_t) = (f, F(\mu_t)), \qquad \mu_0 = Y,
\qquad (7.4)
\]
with test functions $f$ from $C(X)$ and
\[
(A(\mu)f)(x) = \int_X f(y)\,\nu(x,\mu,dy) - a(x,\mu)f(x).
\qquad (7.5)
\]
In particular, if $\int F(\mu)(dy) = 0$ for all $\mu$ (the conservativity condition), that is, $\int\Omega(\mu)(dy) = \int a(x,\mu)\,\mu(dx)$, then the kernel (7.2) satisfies the equation $a(x,\mu) = \int\nu(x,\mu,dy) = \|\nu(x,\mu,\cdot)\|$, so that
\[
A(\mu)f = \int_X (f(y) - f(x))\,\nu(x,\mu,dy).
\qquad (7.6)
\]


Remark 111. Formula (7.2) provides the simplest $\nu$ satisfying (7.3). In all practical examples, $F$ is expressed by (7.3) with a $\nu$ differing from (7.2). Therefore, we shall be constantly working with (7.3) without assuming the validity of (7.2).
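On a finite state space, the structure (7.3), (7.6) is just that of a Q-matrix whose jump rates may depend on the current distribution. The following sketch (illustrative rates, not from the book) integrates $\dot\mu = F(\mu)$ with $F$ in the conservative form (7.6) and illustrates that positivity and total mass are preserved, in agreement with conditional positivity.

```python
import numpy as np

def nu(x, mu):
    """Jump rates nu(x, mu, .) on X = {0, 1, 2}: illustrative distribution-dependent
    intensities of jumping from x to the other states."""
    rates = 1.0 + mu            # attraction towards already populated states
    rates[x] = 0.0              # no jump to itself
    return rates

def F(mu):
    """Right-hand side (7.3) in the conservative case (7.6):
    F(mu)(y) = sum_x mu(x) nu(x, mu, y) - a(y, mu) mu(y), with a(x, mu) = ||nu(x, mu, .)||."""
    out = np.zeros_like(mu)
    for x in range(len(mu)):
        r = nu(x, mu)
        out += mu[x] * r                # gain term
        out[x] -= mu[x] * r.sum()       # loss term a(x, mu) mu(x)
    return out

mu, dt = np.array([0.7, 0.2, 0.1]), 1e-3
for _ in range(5000):
    mu = mu + dt * F(mu)
print(mu, mu.sum(), bool((mu >= 0).all()))   # non-negative entries, total mass 1
```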

7.2 Simplest equations that preserve positivity

Recall the following notations for weighted measure spaces: For a non-negative continuous function $L$ on $X$, let us introduce the sets of weighted measures
\[
M(X,L) = \{\mu \in M(X):\ (L,|\mu|) < \infty\}, \qquad
M_{\le\lambda}(X,L) = \{\mu \in M(X,L):\ (L,|\mu|) \le \lambda\}
\]
for a $\lambda > 0$. We denote by $M^+(X,L)$ and $M^+_{\le\lambda}(X,L)$ the corresponding subsets of non-negative elements. The space $M(X,L)$ is a Banach space if equipped with the norm
\[
\|\mu\|_{M(X,L)} = \int L(x)\,|\mu|(dx) = \sup\{|(y,\mu)|:\ |y| \le L\}.
\]
Similar to the usual spaces of measures, the natural weak topology on the weighted space $M(X,L)$ is defined with respect to the duality relation with the weighted space $C_L(X)$ consisting of continuous functions on $X$ with a bounded norm
\[
\|f\|_{C_L(X)} = \inf\{K:\ |f(x)| \le K L(x) \text{ for all } x\}.
\]
For kernels (even signed kernels) $\nu(x,dy)$ acting in a space of weighted measures and functions, the natural norm is the 'double-norm', i.e., the norm in $C_L(X)$ with respect to the first variable and the norm in $M(X,L)$ with respect to the second one:
\[
\|\nu(\cdot,\cdot)\|_{LL(X)} = \inf\{K:\ \|\nu(x,\cdot)\|_{M(X,L)} \le K L(x)\}.
\qquad (7.7)
\]
For strictly positive $L$, this is equivalent to
\[
\|\nu(\cdot,\cdot)\|_{LL(X)} = \sup_x\frac{\|\nu(x,\cdot)\|_{M(X,L)}}{L(x)} = \sup_x\Bigl\{L^{-1}(x)\int_X L(y)\,|\nu(x,dy)|\Bigr\}.
\]
These norms also define the norms of the integral operators (1.77) with the kernel $\nu$, when they act in weighted spaces:
\[
\|T_\nu\|_{\mathcal{L}(C_L(X))} \le \|\nu(\cdot,\cdot)\|_{LL(X)}, \qquad
\|T'_\nu\|_{\mathcal{L}(M(X,L))} \le \|\nu(\cdot,\cdot)\|_{LL(X)}.
\qquad (7.8)
\]
Exercise 7.2.1. Prove these inequalities. Show that these inequalities become equalities if $L$ is strictly positive.
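For a finite state space, the double-norm (7.7) and the first inequality in (7.8) can be checked directly; the following sketch (random data, purely for illustration, not from the book) computes $\|\nu\|_{LL(X)}$ and verifies $\|T_\nu f\|_{C_L(X)} \le \|\nu\|_{LL(X)}\,\|f\|_{C_L(X)}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
L = 1.0 + rng.random(n)                 # strictly positive weight L
nu = rng.normal(size=(n, n))            # signed kernel nu(x, y) on a finite X

def double_norm(nu, L):
    # (7.7) for strictly positive L:  sup_x  L(x)^{-1} sum_y L(y) |nu(x, y)|
    return np.max((np.abs(nu) @ L) / L)

def CL_norm(f, L):
    return np.max(np.abs(f) / L)

f = rng.normal(size=n)
Tf = nu @ f                             # (T_nu f)(x) = sum_y f(y) nu(x, y)
print(bool(CL_norm(Tf, L) <= double_norm(nu, L) * CL_norm(f, L) + 1e-12))
```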

Let us clarify the relation between the Lipschitz continuity of $\nu$, $a$ and $F$ connected by (7.3).


Lemma 7.2.1. Suppose, for $\mu, \xi, \eta \in M_{\le\lambda}(X,L)$, that
\[
a(x,\mu) + \|\nu(\cdot,\mu,\cdot)\|_{LL(X)} \le c_1(\lambda),
\qquad (7.9)
\]
\[
|a(x,\xi) - a(x,\eta)| + \|\nu(\cdot,\xi,\cdot) - \nu(\cdot,\eta,\cdot)\|_{LL(X)} \le c_2(\lambda)\,\|\xi - \eta\|_{M(X,L)}.
\qquad (7.10)
\]
Then
\[
\|F(\mu)\|_{M(X,L)} \le \lambda\,c_1(\lambda),
\qquad (7.11)
\]
\[
\|F(\xi) - F(\eta)\|_{M(X,L)} \le (c_1(\lambda) + \lambda c_2(\lambda))\,\|\xi - \eta\|_{M(X,L)}.
\qquad (7.12)
\]
Proof. We have
\[
\|F(\mu)\|_{M(X,L)} \le \int_{X^2} L(y)\,\mu(dw)\,\nu(w,\mu,dy) + \int L(y)\,a(y,\mu)\,\mu(dy) \le c_1(\lambda)\,\|\mu\|_{M(X,L)},
\]
which shows (7.11). Next, we write
\[
\|F(\xi) - F(\eta)\|_{M(X,L)} \le \int L(y)\,|a(y,\xi) - a(y,\eta)|\,\eta(dy) + \int L(y)\,a(y,\xi)\,|\xi(dy) - \eta(dy)|
\]
\[
+ \int_{X^2} L(y)\,|\xi(dw) - \eta(dw)|\,\nu(w,\xi,dy) + \int_{X^2} L(y)\,\eta(dw)\,|\nu(w,\xi,dy) - \nu(w,\eta,dy)|,
\]
which leads to (7.12). $\square$

Most of the analysis of the equations (7.4) is linked with some Lyapunov functions. A non-negative continuous function $L$ on $X$ is called a Lyapunov function for $F(\mu)$ or $A(\mu)$ if $\nu(x,\mu,\cdot) \in M^+(X,L)$ for all $x$ and $\mu \in M^+(X,L)$ and
\[
(L, F(\mu)) = (A(\mu)L, \mu) \le \alpha(L,\mu) + \beta
\qquad (7.13)
\]
with some constants $\alpha, \beta$. As in the discrete setting, the Lyapunov function $L$ is called subcritical (respectively critical), or $F$ is said to be $L$-subcritical (respectively $L$-critical), if $(L, F(\mu)) \le 0$ (respectively $(L, F(\mu)) = 0$) for all $\mu \in M^+(X,L)$.

We shall now prove the basic well-posedness result for measure-valued ODEs with a Lyapunov condition and bounded rates, extending its discrete version of Theorem 3.6.1.

Theorem 7.2.1.

(i) Let $X$ be a complete metric space, $L$ a non-negative continuous function on $X$, and $a(x,\mu)$ a continuous non-negative function on $X\times M^+(X,L)$. Let $\nu$ be a family of transition kernels such that $\nu(x,\mu,\cdot) \in M^+(X,L)$ for all $x \in X$, $\mu \in M^+(X,L)$, depending continuously on $x$, with $\nu$ considered in the weak topology of $M^+(X,L)$. Let $F$ be given by (7.3). Let the Lyapunov condition (7.13) and the conditions of growth and Lipschitz continuity (7.9), (7.10) hold.

Then for any $Y \in M^+_{\le\lambda}(X,L)$, there exists a unique global solution $\mu_t(Y) \in M^+(X,L)$ (defined for all $t \ge 0$) to the Cauchy problem for the equation $\dot\mu = F(\mu)$ in $M(X,L)$ (i.e., with the derivative understood in the Banach topology of $M(X,L)$), with the initial condition $Y$. Moreover, if $\alpha \ne 0$, then
\[
\mu_t(Y) \in M^+_{\le\lambda(t)}(X,L), \qquad \lambda(t) = e^{\alpha t}\lambda + (e^{\alpha t} - 1)\beta/\alpha.
\qquad (7.14)
\]
If $\alpha = 0$, then the same holds with $\lambda(t) = \lambda + \beta t$. If $F$ is $L$-subcritical, then the same holds with $\lambda(t) = \lambda$.

(ii) Suppose that all the conditions of (i) hold apart from (7.9) and (7.10), which are substituted by the analogous conditions in the topology of $M(X) = M(X, L = 1)$, i.e.,
\[
a(x,\mu) + \|\nu(\cdot,\mu,\cdot)\|_{11(X)} \le c_1(\lambda),
\qquad (7.15)
\]
\[
|a(x,\xi) - a(x,\eta)| + \|\nu(\cdot,\xi,\cdot) - \nu(\cdot,\eta,\cdot)\|_{11(X)} \le c_2(\lambda)\,\|\xi - \eta\|_{M(X)}.
\qquad (7.16)
\]
Assume moreover that $L$ is bounded from below by a constant $L_0 > 0$. Then the same conclusion as in (i) holds, but the solution to the equation $\dot\mu = F(\mu)$ is understood in the sense of the Banach topology of $M(X)$.

Proof. (i) As was the case for Theorem 3.6.1, this is again a straightforward extension of the proof of Theorem 3.1.1 and a consequence of Theorem 2.1.2. Namely, fixing $T$, let us define the convex set $C_{a,b}(T)$ of continuous functions $\mu: [0,T] \mapsto M^+(X,L)$ such that $\mu([0,t]) \in M^+_{\le\lambda(t)}(X,L)$ for all $t \in [0,T]$. Let
\[
K = \max\{a(x,\mu):\ x \in X,\ \mu \in M^+_{\le\lambda(T)}(X,L)\}.
\]
Therefore, by defining the mapping $\Phi_Y$ as
\[
[\Phi_Y(\mu_.)](t) = e^{-Kt}Y + \int_0^t e^{-K(t-s)}[F(\mu_s) + K\mu_s]\,ds,
\]
it follows that $\Phi_Y$ preserves positivity and that the fixed points of $\Phi_Y$ are the solutions to $\dot\mu = F(\mu)$. The same integration as in Theorem 3.6.1 shows that $\Phi_Y$ preserves the set $C_{a,b}(T)$.

The proof is completed by referring to Theorem 2.1.2, because, by (7.11) and (7.12),
\[
\|[\Phi_{Y_1}(\mu^1_.)](t) - [\Phi_{Y_2}(\mu^2_.)](t)\|_{M(X,L)}
\le (K + c_1(\lambda) + \lambda c_2(\lambda))\int_0^t \|\mu^1_s - \mu^2_s\|_{M(X,L)}\,ds + \|Y_1 - Y_2\|_{M(X,L)}
\]
and
\[
[\Phi_Y(Y)](t) - Y = e^{-Kt}Y - Y + \int_0^t e^{-K(t-s)}[F(Y) + KY]\,ds = \frac1K(1 - e^{-Kt})F(Y),
\]
which implies $\|[\Phi_Y(Y)](t) - Y\|_{M(X,L)} \le t\,\|F(Y)\|_{M(X,L)}$.

(ii) The conditions (7.15) and (7.16) imply
\[
\|F(\mu)\|_{M(X)} \le \bigl(a(x,\mu) + \|\nu(\cdot,\mu,\cdot)\|_{11(X)}\bigr)\,\|\mu\|_{M(X)} \le \lambda c_1(\lambda)/L_0,
\qquad (7.17)
\]
\[
\|F(\xi) - F(\eta)\|_{M(X)} \le (c_1(\lambda) + \lambda c_2(\lambda)/L_0)\,\|\xi - \eta\|_{M(X)}.
\qquad (7.18)
\]
Consequently,
\[
\|[\Phi_{Y_1}(\mu^1_.)](t) - [\Phi_{Y_2}(\mu^2_.)](t)\|_{M(X)}
\le (K + c_1(\lambda) + \lambda c_2(\lambda)/L_0)\int_0^t\|\mu^1_s - \mu^2_s\|_{M(X)}\,ds + \|Y_1 - Y_2\|_{M(X)}.
\]
Therefore, the proof can again be completed by referring to Theorem 2.1.2 and the observation that the sets $M^+_{\le\lambda}(X,L)$ are closed in $M(X)$ for any $\lambda$. $\square$

Remark 112. In [147], four different proofs of Theorem 7.2.1 are provided for the special case $L = 1$.
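The mapping $\Phi_Y$ from the proof can be iterated numerically. The following sketch (finite state space, illustrative rates as in the sketch at the end of Section 7.1; not from the book) performs the Picard iteration for $\Phi_Y$ and illustrates why the shift by $K\mu$ keeps every iterate non-negative: the integrand $F(\mu_s) + K\mu_s$ has non-negative components as soon as $K \ge \sup a(x,\mu)$.

```python
import numpy as np

d = 3
def nu(x, mu):                      # illustrative distribution-dependent rates
    r = 0.5 + mu
    r[x] = 0.0
    return r

def a(x, mu):
    return nu(x, mu).sum()

def F(mu):                          # conservative right-hand side (7.3), (7.6)
    out = np.zeros(d)
    for x in range(d):
        out += mu[x] * nu(x, mu)
        out[x] -= a(x, mu) * mu[x]
    return out

def picard(Y, T=1.0, steps=200, iters=30, K=4.5):
    """Iterate  [Phi_Y(mu.)](t) = e^{-Kt} Y + int_0^t e^{-K(t-s)} [F(mu_s) + K mu_s] ds
    on a uniform time grid; K is chosen >= sup a(x, mu) on the relevant bounded set."""
    ts = np.linspace(0.0, T, steps + 1)
    mu = np.tile(Y, (steps + 1, 1))                          # initial guess: constant curve
    for _ in range(iters):
        integrand = np.array([F(m) + K * m for m in mu])     # componentwise non-negative
        new = np.empty_like(mu)
        for i, t in enumerate(ts):
            w = np.exp(-K * (t - ts[: i + 1]))[:, None]      # kernel e^{-K(t-s)}
            new[i] = np.exp(-K * t) * Y + np.trapz(w * integrand[: i + 1], ts[: i + 1], axis=0)
        mu = new
    return ts, mu

ts, mu = picard(np.array([0.6, 0.3, 0.1]))
print(bool(mu.min() >= 0.0), bool(np.allclose(mu.sum(axis=1), 1.0, atol=1e-2)))
```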

Let us now prove a sensitivity result for the evolution $\dot\mu = F(\mu)$. Translating the basic setting of Section 1.13 into the present setting with weighted spaces, we say that a mapping $F: M^+_{\le\lambda}(X,L) \mapsto M(X,L)$ has a strong variational derivative $\delta F(Y,x)$ if for any $Y \in M_{\le\lambda}(X)$, $x \in X$ the limit
\[
\delta F(Y,x) = \frac{\delta F}{\delta Y(x)} = \lim_{s\to 0+}\frac1s\bigl(F(Y + s\delta_x) - F(Y)\bigr)
\]
exists in the norm topology of $M(X,L)$.

We say that $F$ belongs to $C^1_{\rm weak}(M_{\le\lambda}(X,L), M(X,L))$ if the strong variational derivative $\delta F(Y,x)$ exists for all $x \in X$, $Y \in M(X,L)$ and is a continuous (in the sense of the weak topology) mapping $M^+_{\le\lambda}(X,L)\times X \to M(X,L)$. As in Theorem 1.13.2(i), it follows that a mapping $F \in C^1_{\rm weak}(M_{\le\lambda}(X,L), M(X,L))$ belongs to $C^1(M_{\le\lambda}(X,L), M(X,L))$ if the mapping $Y \mapsto \delta F(Y,x)$ is a continuous bounded mapping $M^+_{\le\lambda}(X,L) \to \mathcal{L}(M(X,L))$ with respect to the norm topologies on both sides, so that
\[
\|\delta F(Y_1,\cdot) - \delta F(Y_2,\cdot)\|_{LL(X)} \to 0,
\]
as $\|Y_1 - Y_2\|_{M(X,L)} \to 0$. (For this, recall that the norm in $\mathcal{L}(M(X,L))$ is given by (7.8).) The space $C^1_{\rm weak}(M_{\le\lambda}(X,L), M(X,L)) \cap C^1(M_{\le\lambda}(X,L), M(X,L))$ is a Banach space when equipped with the norm
\[
\sup_{Y\in M_{\le\lambda}(X,L)}\|F(Y)\|_{M(X,L)} + \sup_{Y\in M_{\le\lambda}(X,L)}\|\delta F(Y,\cdot)\|_{LL(X)}.
\qquad (7.19)
\]


As a direct consequence of Theorems 2.8.1 and 7.2.1, we obtain the following sensitivity result:

Theorem 7.2.2. Under the assumptions of Theorem 7.2.1(i), suppose that
\[
F \in C^1_{\rm weak}(M_{\le\lambda}(X,L), M(X,L)) \cap C^1(M_{\le\lambda}(X,L), M(X,L))
\]
for any $\lambda$ and that the derivative is uniformly continuous in the sense that for any $\lambda > 0$ and $\varepsilon > 0$ there exists $\delta > 0$ such that
\[
\|\delta F(Y_1,\cdot) - \delta F(Y_2,\cdot)\|_{LL(X)} < \varepsilon,
\]
whenever $Y_1, Y_2 \in M^+_{\le\lambda}(X,L)$ and $\|Y_1 - Y_2\|_{M(X,L)} < \delta$.

Then for any $\xi \in M^+(X,L)$, the derivative
\[
\xi_t = D\mu_t(Y)[\xi]
\]
exists strongly (i.e., the limit exists in the norm topology of $M(X,L)$) and is the unique solution to the integral equation
\[
\xi_t = e^{-Kt}\xi + \int_0^t e^{-K(t-s)}\Bigl[\int\delta F(\mu_s,x)\,\xi_s(dx) + K\xi_s\Bigr]ds,
\]
which is equivalent to the Cauchy problem
\[
\dot\xi_t = \int\delta F(\mu_t,x)\,\xi_t(dx), \qquad \xi_0 = \xi.
\]
Exercise 7.2.2. Formulate the analogous result in the framework of Theorem 7.2.1(ii).

Let us emphasize that the regularity conditions on $F$ in Theorem 7.2.2 follow from the following regularity conditions on $\nu$ and $a$ via (7.3): the strong variational derivatives
\[
\frac{\delta a(x,\mu)}{\delta\mu(y)}, \qquad \frac{\delta\nu(x,\mu,\cdot)}{\delta\mu(y)}
\]
exist, depend continuously on their arguments (measures considered in the weak topology of $M(X,L)$) and are locally uniformly continuous as functions of $\mu$ in the sense that for any $\lambda > 0$ and $\varepsilon > 0$ there exists $\delta > 0$ such that
\[
\sup_y\Bigl\{L^{-1}(y)\int L(x)\Bigl|\frac{\delta a(x,Y_1)}{\delta\mu(y)}Y_1(dx) - \frac{\delta a(x,Y_2)}{\delta\mu(y)}Y_2(dx)\Bigr|\Bigr\} < \varepsilon,
\qquad (7.20)
\]
\[
\sup_y\Bigl\{L^{-1}(y)\int\!\!\int L(x)\Bigl|\frac{\delta\nu(w,Y_1,dx)}{\delta\mu(y)}Y_1(dw) - \frac{\delta\nu(w,Y_2,dx)}{\delta\mu(y)}Y_2(dw)\Bigr|\Bigr\} < \varepsilon,
\qquad (7.21)
\]
whenever $Y_1, Y_2 \in M^+_{\le\lambda}(X,L)$ and $\|Y_1 - Y_2\|_{M(X,L)} < \delta$.

A particularly interesting case is that of equations with unbounded rates, where non-uniqueness of solutions can naturally be expected. A characteristic example of such a situation is given by the following existence result.


Theorem 7.2.3. Let $X$ be a locally compact metric space, $L$ a continuous function on $X$ that is bounded from below by a constant $L_0 > 0$ and tends to infinity as $x \to \infty$. Let $a(x,\mu)$ be a continuous non-negative function on $X\times M^+(X,L)$ and $\nu$ a family of transition kernels in $M(X,L)$ such that for $\mu, \xi, \eta \in M^+_{\le\lambda}(X,L)$, we have
\[
|a(x,\mu)| + \|\nu(x,\mu,\cdot)\|_{M(X)} \le \omega(L(x))\,L(x)\,c_1(\lambda),
\qquad (7.22)
\]
\[
|a(x,\xi) - a(x,\eta)|\,\mathbf{1}_{\{L(x)\le\rho\}} \le c_2(\lambda)\,c_2(\rho)\,\|\xi - \eta\|_{M(X)},
\qquad (7.23)
\]
\[
\|\nu(x,\xi,\cdot) - \nu(x,\eta,\cdot)\|_{M(X)}\,\mathbf{1}_{\{L(x)\le\rho\}} \le c_2(\lambda)\,c_2(\rho)\,\|\xi - \eta\|_{M(X)},
\qquad (7.24)
\]
with some $c_1(\lambda)$, $c_2(\lambda)$, $c_2(\rho)$ and a bounded function $\omega$ on $\mathbf{R}_+$ tending to zero at infinity. Let the Lyapunov condition (7.13) hold and $F$ be given by (7.3).

Then for any $Y \in M^+_{\le\lambda}(X,L)$, there exists a global solution $\mu_t(Y)$ to the Cauchy problem for the equation $\dot\mu = F(\mu)$ in $M(X)$, with the initial condition $Y$, such that (7.14) holds for $\alpha \ne 0$. For $\alpha = 0$, the same condition holds with $\lambda(t) = \lambda + \beta t$.

Proof. Let $\chi_n(x)$, $n \in \mathbf{N}$, be continuous functions $X \to [0,1]$ such that $\chi_n(x) = 1$ for $L(x) \le n-1$ and $\chi_n(x) = 0$ for $L(x) \ge n$. For any $n < m$, let us consider the cut-off data
\[
a_m(x,\mu) = \chi_m(x)\,a(x,\mu), \qquad \nu_n(x,\mu,\cdot) = \chi_n(x)\,\nu(x,\mu,\cdot).
\]
Let $F_{nm}$ be given by (7.3) with $a_m, \nu_n$ instead of $a, \nu$. Due to the assumption $n < m$, the Lyapunov condition (7.13) holds for $F_{nm}$. In fact, one can just fix $n = m - 1$. Applying Theorem 7.2.1(ii) leads to the conclusion that for any $n < m$ there exists a unique solution $\mu^{nm}_t(Y)$ to the Cauchy problem for the equation $\dot\mu = F_{nm}(\mu)$ which satisfies the required growth conditions (e.g., (7.14) for $\alpha \ne 0$).

Next, we find
\[
\|F_{nm}(\mu^{nm}_t(Y))\|_{M(X)} \le \sup_z\{\omega(z)\}\,\lambda(t)\,c_1(\lambda(t)),
\]
which is uniformly bounded in $n, m$. Consequently, for any sequence of pairs $n < m$ tending to infinity, the Arzelà–Ascoli theorem ensures the existence of a subsequence such that $\mu^{nm}_t(Y)$ converges in $C([0,t], M(X))$ to some function $\mu_t(Y)$. It remains to show that $\mu_t(Y)$ satisfies the equation
\[
\mu_t = Y + \int_0^t F(\mu_s)\,ds.
\]
This can be achieved by passing to the limit in the corresponding equation for $\mu^{nm}_t$. Since
\[
\|F_{nm}(\mu) - F(\mu)\|_{M(X)} \le \omega(n)\,\lambda\,c_1(\lambda) \to 0,
\]
as $n \to \infty$, uniformly for $\mu \in M^+_{\le\lambda}(X,L)$ with any $\lambda$, one only has to show that $\|F(\mu^{nm}_t(Y)) - F(\mu_t(Y))\|_{M(X)} \to 0$ as $m \to \infty$. Decomposing the integrals in the expression
\[
\|F(\mu^{nm}_t(Y)) - F(\mu_t(Y))\|_{M(X)}
\le \int_X |a(x,\mu^{nm}_t(Y))\,\mu^{nm}_t(Y)(dx) - a(x,\mu_t(Y))\,\mu_t(Y)(dx)|
\]
\[
+ \int_{X^2}|\mu^{nm}_t(Y)(dx)\,\nu(x,\mu^{nm}_t(Y),dy) - \mu_t(Y)(dx)\,\nu(x,\mu_t(Y),dy)|
\]
into two parts over the sets $\{L(x) > \rho\}$ and $\{L(x) \le \rho\}$, respectively, we find that the integrals over the first set tend to zero by (7.22), as $\rho \to \infty$, and the integrals over the second set tend to zero, as $n, m \to \infty$, for any $\rho$ due to the convergence $\mu^{nm}_t(Y) \to \mu_t(Y)$. $\square$

7.3 Path-dependent equations and forward-backward systems

d

dtμt(dx) = F (t, {μ.}) =

∫μt(dw)ν(t, w, {μ.}, dx)− a(t, x, {μ.})μt(dx), μ0 = Y,

(7.25)on some interval t ∈ [0, T ]. Extending Theorem 7.2.1 and limiting our attentionto the subcritical case only gives us the following result:

Theorem 7.3.1. Let $X$ be a locally compact metric space, $L$ a continuous function on $X$ such that $L(x) \ge L_0 > 0$ for all $x$, and $L(x) \to \infty$, as $x \to \infty$. Let $a(t,x,\{\mu_.\})$ be a continuous non-negative function on
\[
[0,T]\times X\times C([0,T], M^+(X,L))
\]
and $\nu$ a family of transition kernels such that $\nu(t,x,\{\mu_.\},\cdot) \in M^+(X,L)$ for all $t \in [0,T]$, $x \in X$, $\mu_. \in C([0,T], M^+(X,L))$, depending continuously on $t$ and $x$, with $\nu$ considered in the weak topology of $M^+(X)$. Let the Lyapunov condition $(L, F(t,\{\mu_.\})) \le 0$ hold, as well as the following conditions of growth and Lipschitz continuity:
\[
a(t,x,\{\mu_.\}) + \|\nu(t,\cdot,\{\mu_.\},\cdot)\|_{11(X)} \le c_1(\lambda),
\qquad (7.26)
\]
\[
|a(t,x,\{\xi_.\}) - a(t,x,\{\eta_.\})| + \|\nu(t,\cdot,\{\xi_.\},\cdot) - \nu(t,\cdot,\{\eta_.\},\cdot)\|_{11(X)}
\le c_2(\lambda)\,\|\xi - \eta\|_{C([0,T],M(X))}.
\qquad (7.27)
\]
Then for any $Y \in M^+_{\le\lambda}(X,L)$ there exists a solution $\mu_t(Y) \in M^+(X,L)$ to the Cauchy problem for equation (7.25) (with the derivative understood in the Banach topology of $M(X)$), with the initial condition $Y$. Moreover, this solution is unique for sufficiently small $T$, namely for
\[
T\,(1 + c_1(\lambda) + \lambda c_2(\lambda)/L_0) < 1.
\qquad (7.28)
\]

Proof. As in Theorem 7.2.1, our solution is a fixed point of the mapping
\[
[\Phi_Y(\mu_.)](t) = e^{-Kt}Y + \int_0^t e^{-K(t-s)}[F(s,\{\mu_.\}) + K\mu_s]\,ds.
\]
Since
\[
\|[\Phi_Y(\mu^1_.)](t) - [\Phi_Y(\mu^2_.)](t)\|_{M(X)} \le t\,\|\mu^1_. - \mu^2_.\|_{C([0,T],M(X))}\,(K + c_1(\lambda) + \lambda c_2(\lambda)/L_0),
\]
the mapping $\Phi_Y$ is a contraction in $C([0,T], M(X))$ under (7.28), which implies the existence and uniqueness of a fixed point.

For arbitrary $T$, $\Phi_Y$ maps $C([0,T], M^+_{\le\lambda}(X,L))$ to itself. By Proposition 1.1.2, the sets $M^+_{\le\lambda}(X,L)$ are metrizable and compact in the weak topology. Since the image of $\Phi_Y$ consists of Lipschitz-continuous curves, we can conclude by the Arzelà–Ascoli theorem that this image is compact in $C([0,T], M^+_{\le\lambda}(X,L))$, where $M^+_{\le\lambda}(X,L)$ is metricized by the metric of Proposition 1.1.2. Consequently, as in Theorem 6.4.3, the Schauder fixed-point theorem ensures the existence of a fixed point of $\Phi_Y$. $\square$

Let us now consider the forward-backward system with a forward kinetic equation and a backward Bellman equation of the jump-type, see (2.124) and (2.125). In the latter equation, we separate jumps that depend on the control $u$ and jumps that depend on the 'environment' $\mu$ that couples it with the kinetic equation:
\[
\frac{d}{dt}\mu_t(dx) = F(t,\mu_t, u(x,\{\mu_{\ge t}\}; S_T))
= \int\mu_t(dw)\,\nu(w,\mu_t, u(x,\{\mu_{\ge t}\}; S_T), dx) - a(x,\mu_t, u(x,\{\mu_{\ge t}\}; S_T))\,\mu_t(dx),
\]
\[
\mu_0 = Y,
\]
\[
\frac{\partial S}{\partial t} + \sup_{u\in U}\Bigl[\sum_{j=1}^m u_j\nu_j(x)\bigl(S(t,y_j(x)) - S(t,x)\bigr) - J(x,u)\Bigr]
+ \int\bigl(S(t,y) - S(t,x)\bigr)\,n(\mu_t,x,dy) = 0, \qquad S|_{t=T} = S_T,
\qquad (7.29)
\]
where $U$ is a convex set and $J$ a strictly convex function of $u$, so that the value $u(x,p)$ where the 'Hamiltonian'
\[
H(x,p) = \sup_{u\in U}\Bigl[\sum_{j=1}^m u_j\nu_j(x)p_j - J(x,u)\Bigr], \qquad x \in X,\ p \in \mathbf{R}^m,
\]
reaches its maximum is always unique. The control $u$ in the first equation is defined by the formula
\[
u(x,\{\mu_{\ge t}\}; S_T) = u(x,p)\big|_{p_j = S(t,y_j(x)) - S(t,x)},
\qquad (7.30)
\]
with $S(t,x)$ depending on $\{\mu_{\ge t}\}$ and $S_T$ via the solution to the second equation of (7.29).

Theorem 7.3.2. Let $X$ be a locally compact metric space, $L$ a continuous function on $X$ such that $L(x) \ge L_0 > 0$ for all $x$, and $L(x) \to \infty$, as $x \to \infty$. Let $U$ be a convex compact subset of a Euclidean space. Let $a(x,\mu,u)$ be a continuous non-negative function on $X\times M^+(X,L)\times U$ and $\nu$ a family of transition kernels such that $\nu(x,\mu,u,\cdot) \in M^+(X,L)$ for all $x \in X$, $\mu \in M^+(X,L)$, $u \in U$, depending continuously on $x$, with $\nu$ considered in the weak topology of $M^+(X)$. Let the Lyapunov condition $(L, F(t,\mu,u)) \le 0$ hold, as well as the following conditions of growth and Lipschitz continuity:
\[
a(x,\mu,u) + \|\nu(\cdot,\mu,u,\cdot)\|_{11(X)} \le c_1(\lambda),
\qquad (7.31)
\]
\[
|a(x,\mu_1,u_1) - a(x,\mu_2,u_2)| + \|\nu(\cdot,\mu_1,u_1,\cdot) - \nu(\cdot,\mu_2,u_2,\cdot)\|_{11(X)}
\le c_2(\lambda)\bigl(\|\mu_1 - \mu_2\|_{C([0,T],M(X))} + \|u_1 - u_2\|\bigr).
\qquad (7.32)
\]
Let $\nu_j$ and $n$ be uniformly bounded stochastic kernels and let the Hamiltonian $H(x,p)$ have a Lipschitz minimizer (see the discussion after equation (6.117) and Theorem 9.5.1 in the Appendix). Then for any $Y \in M^+_{\le\lambda}(X,L)$ and $S_T \in C(X)$, there exists a solution $\mu_t(Y) \in M^+(X,L)$, $S(t,\cdot) \in C(X)$ to the forward-backward system (7.29). Moreover, this solution is unique for sufficiently small $T$.

Proof. By Theorem 2.7.3, the solution to the second equation of (7.29) is well defined and depends Lipschitz-continuously on $\mu$ for any continuous curve $\mu_t(Y) \in M^+(X,L)$. Therefore, since $H$ has a Lipschitz minimizer, $u(x,\{\mu_{\ge t}\}; S_T)$ depends Lipschitz-continuously on $\mu$. Hence we find ourselves in the setting of Theorem 7.3.1. $\square$

An important observation that leads to an advanced study of the forward-backward system (7.29) is the possibility to encode the system into a single backward equation with an unknown function $G(t,x,\mu)$. This equation is referred to as the master equation and has the form
\[
\frac{\partial G}{\partial t}(t,x,\mu) + \int\bigl(G(t,y,\mu) - G(t,x,\mu)\bigr)\,n(\mu,x,dy)
\]
\[
+ \Bigl[\sum_{j=1}^m u_j\nu_j(x)\bigl(G(t,y_j(x),\mu) - G(t,x,\mu)\bigr) - J(x,u)\Bigr]\Big|_{u=u(x,G(t,\cdot),\mu)}
\]
\[
+ \int\frac{\delta G}{\delta\mu(z)}(t,x,\mu)\Bigl[\int_{w\in X}\mu(dw)\,\nu(w,\mu,u,dz) - a(z,\mu,u)\,\mu(dz)\Bigr]\Big|_{u=u(z,G(t,\cdot),\mu)} = 0,
\]


where $u = \{u_j\}(x, G(t,\cdot), \mu)$ is the argmax of the expression
\[
\sum_{j=1}^m u_j\nu_j(x)\bigl[G(t,y_j(x),\mu) - G(t,x,\mu)\bigr] - J(x,u).
\]
For instance, if $X = \{1,\dots,n\}$ is finite, then this equation simplifies to
\[
\frac{\partial G_i}{\partial t}(t,\mu) + \sum_k\bigl(G_k(t,\mu) - G_i(t,\mu)\bigr)n_{ik}(\mu)
+ \Bigl[\sum_{j=1}^m u_j\nu_{ji}\bigl(G_j(t,\mu) - G_i(t,\mu)\bigr) - J_i(u)\Bigr]\Big|_{u=u(i,G(t,\cdot),\mu)}
\]
\[
+ \sum_l\frac{\partial G_i}{\partial\mu_l}(t,\mu)\Bigl[\sum_p\mu_p\nu_{pl}(\mu,u) - a_l(\mu,u)\mu_l\Bigr]\Big|_{u=u(l,G(t,\cdot),\mu)} = 0,
\]
where $u = \{u_j\}(i, G(t,\cdot), \mu)$ is the argmax of the expression
\[
\sum_{j=1}^m u_j\nu_{ji}\bigl(G_j(t,\mu) - G_i(t,\mu)\bigr) - J_i(u).
\]

7.4 Kinetic equations (Boltzmann, Smoluchowski,

Vlasov, Landau) and replicator dynamics

The basic examples of measure-valued evolutions as they appear in natural sciencesare general kinetic equations. Some special cases of these equations, arising from adiscrete state space and given by (3.63) in the weak representation, were analysedin detail in Chapter 3. In this section, we are going to formulate the analoguesfor a state space X being an arbitrary complete metric space and point out theirmost well-known special cases.

Let us first introduce some notation. Denoting by X0 a one-point space andby Xj the powers X × · · ·×X (j times) considered with their product topologies,we denote by X their disjoint union X = ∪∞

j=0Xj. In applications, X specifies

the state space of one particle and X = ∪∞j=0X

j stands for the state space of anarbitrary number of similar particles. We denote by Csym(X ) the Banach spaces ofsymmetric bounded continuous functions on X and by Csym(X

k) the correspond-ing spaces of functions on the finite power Xk. The space of symmetric (positivefinite Borel) measures is denoted by Msym(X ). The elements of Msym(X ) andCsym(X ) are the (mixed) states and the observables, respectively, for an evolutionon X . We denote the elements of X by bold letters, say x, y. For a finite subsetI = {i1, . . . , ik} of a finite set J = {1, . . . , n}, we denote by |I| the number ofelements in I, by I its complement J \ I and by xI the collection of variablesxi1 , . . . , xik .

Page 430: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.4. Kinetic equations (Boltzmann, Smoluchowski, Vlasov, Landau) . . . 417

Reducing the set of observables to Csym(X ) effectively means that our statespace is not X (or Xk) but rather the quotient-space SX (respectively SXk)obtained by factorization with respect to all permutations. This allows for theidentifications Csym(X ) = C(SX ) and Csym(X

k) = C(SXk). The set of equiva-lence classes SX can be identified with the set of all finite subsets of X , the orderbeing irrelevant.

Each f ∈ Csym(X ) is defined by its components (restrictions) fk on Xk.E.g., for x = (x1, . . . , xk) ∈ Xk ⊂ X , say, we can write f(x) = f(x1, . . . , xk) =fk(x1, . . . , xk). Similar notations will be used for the components of measures fromM(X ). In particular, the pairing between Csym(X ) and M(X ) can be written as

(f, ρ) =

∫f(x)ρ(dx) = f0ρ0 +

∞∑n=1

∫f(x1, . . . , xn)ρ(dx1 · · · dxn)

for f ∈ Csym(X ), ρ ∈ M(X ). A useful class of measures (and mixed states) on Xis given by the decomposable measures of the form Y ⊗, which are defined for anarbitrary finite measure Y (dx) on X by their components

(Y ⊗)n(dx1 · · · dxn) = Y ⊗n(dx1 · · · dxn) = Y (dx1) · · ·Y (dxn).

Similarly, decomposable observables, multiplicative or additive, are defined for anarbitrary Q ∈ C(X) as follows:

(Q⊗)n(x1, . . . , xn) = Q⊗n(x1, . . . , xn) = Q(x1) · · ·Q(xn), (7.33)

(Q⊕)(x1, . . . , xn) = Q(x1) + · · ·+Q(xn). (7.34)

(Note that Q⊕ vanishes on X0.) In particular, if Q = 1, then Q⊕ = 1⊕ is thenumber of particles:

1⊕(x1, . . . , xn) = n.

The analogues of the rates PΦΨ from the discrete setting of the equations (3.62)

and (3.63) are (a) the transitions kernels P 1(x; dy) = {P 1m(x; dy1 · · · dym)} from

X to SX (they describe unilateral (or spontaneous) transformations of parti-cles such as splitting or mutations), (b) the transition kernels P 2(x1, x2; dy) ={P 2

m(x1, x2; dy1 · · · dym)} from SX2 to SX (they describe binary interactions suchas collisions or breakage), and generally (c) the transition kernels

P k(x1, . . . , xk; dy) = {P km(x1, . . . , xk; dy1 · · · dym)} (7.35)

from SXk to SX (they describe simultaneous interactions of k particles, i.e., kth-order interaction). The norms of the kernels,

P 1(x) =

∫XP 1(x; dy) =

∞∑m=0

∫Xm

P 1m(x; dy1 · · · dym)

Page 431: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

418 Chapter 7. Equations in Spaces of Weighted Measures

for unilateral transformations, and

P k(x1, . . . , xk) =

∫P k(x1, . . . , xk; dy) =

∞∑m=0

∫P km(x1, . . . , xk; dy1 · · · dym),

(7.36)for kth-order interaction, are usually referred to as the transition intensities(or rates).

The natural analogue of the weak equation (3.63) for a continuous state spaceis given by the following kinetic equation of pure jump-type in the weak form:

d

dt(g, μt) =

K∑k=1

1

k!

∫X

∫Xk

(g⊕(y) − g⊕(z))P k(z; dy)μ⊗kt (dz), (7.37)

where we used a finite bound K for the order of interaction, since only this caseprevails in concrete applications. Loosely speaking, these equations describe theevolution of the particle concentrations μt(dz) in regions (dz) under the trans-formations z → y with the rates P (z; dy). A systematic way for deriving theseequations from the evolution of systems of interacting particles will be outlined inSection 7.8. There, it will become clear (see (7.99)) that (7.37) is a consequence ofa certain scaling of the interactions of order k. This scaling is well established instatistical mechanics. In a biological context, however, a different scaling is moreappropriate. With this other scaling, (7.37) is replaced by the following re-scaledmodification:

d

dt(g, μt) =

K∑k=1

1

k!‖μt‖k−1

∫X

∫Xk

(g⊕(y)− g⊕(z))P k(z; dy)μ⊗kt (dz). (7.38)

Moreover, in the biological context, one is usually interested in normalized (prob-ability) measures ν = μ/‖μ‖. Since ‖μ‖ =

∫Xμ(dx) for positive μ, we find for

positive solutions μt to (7.38) that

d

dt‖μt‖ =

1

k!

∫Xk

Q(z)

(μt

‖μt‖)⊗k

(dz)‖μt‖, (7.39)

where

Q(z) =

∫X(1⊕(y) − 1⊕(z))P k(z; dy). (7.40)

Consequently, rewriting equation (7.38) in terms of the normalized measure νt =μt/‖μt‖ yields

d

dt

∫X

g(z)νt(dz) =1

k!

∫X

∫Xk

(g⊕(y) − g⊕(z))P k(z; dy)ν⊗kt (dz)

− 1

k!

∫X

g(z)νt(dz)

∫X

∫Xk

Q(z)ν⊗kt (dz).

(7.41)

Page 432: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.4. Kinetic equations (Boltzmann, Smoluchowski, Vlasov, Landau) . . . 419

Remark 113. The re-scaling of interactions that leads to (7.38) is equivalent toa time change in (7.37). A special case of this reduction in evolutionary biologyis the well-known trajectory-wise equivalence of the Lotka–Volterra model andreplicator dynamics, see, e.g., [111].

Notice that equation (7.37) is a special case of the equations (7.4) and (7.5)with

a(x, μ) =K∑

k=1

1

(k − 1)!

∫X

∫Xk−1

P k(x, z2, . . . , zk; dy)μ(dz2) · · ·μ(dzk), (7.42)

ν(w, μ, dy) =

K∑k=1

∞∑m=1

m

k!

∫Xm−1

∫Xk−1

P km(w, z2, . . . , zk; dydy2 · · · dym)μ(dz2) · · ·

· · ·μ(dzk).(7.43)

Example 1. Generalized Smoluchowski coagulation model. The classical Smolu-chowski model describes the process of mass-preserving binary coagulation of par-ticles. In a more general context, referred to as cluster coagulation (see [214]),a particle is characterized by a parameter x from a locally compact state spaceX , where a mapping E : X → R+, the generalized mass, and a transition ker-nel P 2

1 (z1, z2; dy) = K(z1, z2; dy), the coagulation kernel, are given such that themeasures K(z1, z2; .) are supported on the set {y : E(y) = E(z1)+E(z2)}. In thissetting, equation (7.37) takes the form

d

dt

∫X

g(z)μt(dz)

=1

2

∫X3

[g(y)− g(z1)− g(z2)]K(z1, z2; dy)μt(dz1)μt(dz2)

(7.44)

and is known as Smoluchowski’s equation. In the classical Smoluchowski model,we have X = R+, E(x) = x and K(x1, x2; dy) = K(x1, x2)δ(x1 + x2 − y) for acertain symmetric function K(x1, x2).

Example 2. Spatially homogeneous Boltzmann collisions. This model describes theprocess of binary collisions that transform the velocities of two particles (v1, v2) �→(w1, w2) in such a way that the total momentum and energy are conserved:

v1 + v2 = w1 + w2,

v21 + v22 = w21 + w2

2 .(7.45)

These equations imply that

w1 = v1 − n(v1 − v2, n),

w2 = v2 + n(v1 − v2, n)), n ∈ Sd−1, (n, v2 − v1) ≥ 0.(7.46)

Page 433: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

420 Chapter 7. Equations in Spaces of Weighted Measures

If we assume that the collision rates are shift-invariant, i.e., they depend on v1, v2only via their difference, and ignore the spatial distribution of particles, then theweak kinetic equation (7.37) for describing such collisions turns into the spatiallytrivial Boltzmann equation:

d

dt(g, μt) =

1

2

∫n∈Sd−1:(n,v2−v1)≥0

∫R2d

μt(dv1)μt(dv2)

× [g(w1) + g(w2)− g(v1)− g(v2)]B(v2 − v1, dn),

(7.47)

with a certain collision kernel B(v, dn) that specifies a concrete physical model forthe collisions. In the most common models, the kernel B has a density with respectto the Lebesgue measure on Sd−1 and depends on v only via its magnitude |v| andthe angle θ ∈ [0, π/2] between v and n. In other words, one assumes B(v, dn) tohave the form B(|v|, θ)dn for a certain function B. By extending B to the anglesθ ∈ [π/2, π] by

B(|v|, θ) = B(|v|, π − θ), (7.48)

we can write the weak form of the Boltzmann equation in the equivalent form

d

dt(g, μt) =

1

4

∫Sd−1

∫R2d

[g(w1) + g(w2)− g(v1)− g(v2)]

×B(|v1 − v2|, θ)dnμt(dv1)μt(dv2),

(7.49)

where w1, w2 are given by (7.46), θ is the angle between v2 − v1 and n, and Bsatisfies the condition (7.48).

Example 3. Multiple coagulation, fragmentation and collision breakage. Processesthat combine pure coagulation of no more than k particles, spontaneous fragmen-tation into no more than k pieces, and collisions (or collision breakages) of no morethan k particles are specified by the following transition kernels:

P l1(z1, . . . , zl, dy) = Kl(z1, . . . , zl; dy),

l = 2, . . . , k,

called coagulation kernels,

P 1m(z; dy1 · · · dym) = Fm(z; dy1 · · · dym),

m = 2, . . . , k,

called fragmentation kernels and

P ll (z1, . . . , zl; dy1 · · · dyl) = Cl(z1, . . . , zl; dy1 · · · dy2),

l = 2, . . . , k,

Page 434: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.4. Kinetic equations (Boltzmann, Smoluchowski, Vlasov, Landau) . . . 421

called collision kernels. The corresponding kinetic equation (7.37) takes the form

d

dt

∫g(z)μt(dz) (7.50)

=k∑

l=2

1

l!

∫z1,...,zl,y

[g(y)− g(z1)− · · · − g(zl)]Kl(z1, . . . , zl; dy)l∏

j=1

μt(dzj)

+

k∑m=2

∫z,y1,...,ym

[g(y1) + · · ·+ g(ym)− g(z)]Fm(z; dy1, . . . , dym)μt(dz)

+

k∑l=2

∫[g(y1) + · · ·+ g(yl)− g(z1)− · · · − g(zl)]

× Cl(z1, . . . , zl; dy1 · · · dyl)l∏

j=1

μt(dzj).

Apart from the instantaneous transformations of particles described by theequations (7.37), one can also consider other processes involving groups of particlesand described by some generators Ak acting in the spaces Csym(X

k) (given thatthe process involves k particles). If A = (A1, A2, . . .) denotes the collection of suchoperators, then the natural extension of the evolutions (7.37) are described by theequations

d

dt(g, μt) =

K∑k=1

1

k!

∫Xk

{Ag⊕(z) +

∫X(g⊕(y) − g⊕(z))P k(z; dy)

}μ⊗kt (dz),

(7.51)where the intuitive notation (Ag⊕)(z1, . . . , zl) = Alg⊕(z1, . . . , zl) was used. As afinal level of extension, let us mention the possibility for the rates and evolutionsA to depend on the current state μ (the so-called mean-field interaction). Thisleads to the following general kinetic equation in the weak form:

d

dt(g, μt) =

K∑k=1

1

k!

∫Xk

{A(μt)g

⊕(z) +∫X(g⊕(y) − g⊕(z))P k(μt, z; dy)

}μ⊗kt (dz).

(7.52)This equation with A = 0 is again a performance of the equations (7.4) and (7.5),with a, ν given by (7.42), (7.42) and P additionally depending on μ.

Example 4. Vlasov’s equation. In standard dynamics, particles are described bytheir positions and momenta, so that X = R2d. The Vlasov equation of plasmaphysics in the weak form is

d

dt(g, μt) =

∫R2

(∂H

∂p

∂g

∂x− ∂H

∂x

∂g

∂p

)μt(dxdp) (7.53)

+

∫R4d

(∇V (x1 − x2),

∂g

∂p1(x1, p1)

)μt(dx1dp1)μt(dx2dp2).

Page 435: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

422 Chapter 7. Equations in Spaces of Weighted Measures

The function H(x, p) is called the Hamiltonian, say H = p2/2−U(x) with a givenpotential U . V stands for the potential of the interaction.

Example 5. Landau–Fokker–Planck equation. This is the equation

d

dt(g, μt) =

∫R2d

[1

2(G(v − w)∇,∇)g(v) + (b(v − w),∇g(v))

]μt(dw)μt(dv),

(7.54)with a certain non-negative matrix-valued function G(v) and a vector field b(v).In the original equation that specifies the limiting regime of Boltzmann collisionsas described by (7.47) when they become grazing, i.e., when v1 is close to v2, wehave

Gij(v) = ψ(|v|)(|v|2δij − vivj), bi(v) =∑

j

∂vjGij(v), (7.55)

with some function ψ(r) that reflects the details of the interaction. The specialcase of ψ being a constant is referred to as the case of Maxwellian molecules. In-creasing or decreasing the function ψ specifies the cases of hard and soft potentials,respectively. A key property of G from (7.55) (essentially due to Theorem 4.3.1)is the possibility to represent it as a product G(z) = σ(z)σT (z). In fact, for d = 2and d = 3, this representation holds with

σ(z) =√ψ(|z|)

(z2 0

−z1 0

), σ(z) =

√ψ(|z|)

⎛⎜⎜⎝

z2 −z3 0

−z1 0 z3

0 z1 −z2

⎞⎟⎟⎠ , (7.56)

respectively.

Example 6. McKean–Vlasov diffusions. Taking A1 in (7.52) to be a second-order(or diffusion) operator and setting all other terms to zero, we obtain the McKean–Vlasov nonlinear diffusion that we already analysed in Chapter 6.

Example 7. Generalized replicator dynamics. The evolution

d

dt

∫X

g(x)νt(dx) =1

(k − 1)!

∫X

(H�(νt‖x)−H�(νt))g(x)νt(dx), (7.57)

represents the replicator dynamics in weak form for a symmetric k-person gamewith an arbitrary compact space of strategies. It is a special case of (7.41).H(z1; z2, . . . , zk) is a function that is symmetric with respect to permutationsof all variables apart from the first one and which is interpreted as the costsfor a player employing the strategy z1 when other players employ the strategiesz2, . . . , zk. Moreover, the following notations apply:

H�i (P ) =

∫Xk

Hi(x1, . . . , xk)P (dx1 · · · dxk),

P = (p1, . . . , pk) ∈ P(X1)× · · · × P(Xk),

H�i (P‖xi) =

∫X1×···×Xi−1×Xi+1×···×Xn

Hi(x1, . . . , xn) dp1 · · · dpi−1dpi+1 · · · dpn.

Page 436: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.4. Kinetic equations (Boltzmann, Smoluchowski, Vlasov, Landau) . . . 423

If the νt have densities ft with respect to some reference probability measure Mon X , then equation (7.57) can be rewritten in terms of ft as

ft(x) = ft(x)(H�(ftM‖x)−H�(ftM)), (7.58)

which is most established form of replicator dynamics. Even more generally, if thepairwise interaction of players does not constitute a game between themselves,but rather a process of adapting better strategies for playing against a commonadversary (an interaction that is analysed in [154] and called there the pressure-and-resistance game), then the replicator dynamics (7.57) turns into the equation

d

dt

∫X

g(x)νt(dx) =

∫X

(R(x, μ)−R(y, μ))g(x)νt(dx)νt(dy), (7.59)

where R is some function on X ×M(X).

Other examples include spatially nontrivial Boltzmann and Smoluchowskiequations, and various extensions of the replicator dynamics to evolutionary biol-ogy and many other models.

Let make explicit the crucial link between the dynamics Y �→ μt(Y ) andthe corresponding linear dynamics in C(M(X)) realized by the method of char-acteristics in the spirit of Proposition 4.1.1. According to this proposition (andRemark 61), once the well-posedness of equations (7.52) is proved, one can expectthe operators TtS(Y ) = S(μt(Y )) to form a semigroup of contractions on somespace of smooth functions on M+(X), with the generator

ΛA,PS(Y ) =

(δS(Y )

δY (.), μt(Y )|t=0

)

or explicitly

ΛA,PS(Y ) =

K∑k=1

1

k!

∫Xk

{(A(Y )

(δS(Y )

δY (.)

)⊕)(z) (7.60)

+

∫X

[(δS(Y )

δY (.)

)⊕(y) −

(δS(Y )

δY (.)

)⊕(z)

]P k(Y, z; dy)

}μ⊗kt (dz).

Therefore, the function Ft(Y ) = TtS(Y ) solves the following first-order infinite-dimensional PDE in variational derivatives:

d

dtFt(Y ) =

K∑k=1

1

k!

∫Xk

{(A(Y )

(δFt(Y )

δY (.)

)⊕)(z) (7.61)

+

∫X

[(δFt(Y )

δY (.)

)⊕(y)−

(δFt(Y )

δY (.)

)⊕(z)

]P k(Y, z; dy)

}Y ⊗k(dz).

In Section 7.8, we shall sketch the derivation of this equation (and hence of thecorresponding kinetic equation) from the evolution of interacting particle systems.

Page 437: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

424 Chapter 7. Equations in Spaces of Weighted Measures

Remark 114. If A and P do not depend explicitly on μ, then the r.h.s. of (7.61)preserves the analytic functions F of the type

Fg(Y ) =∞∑

m=0

∫gm(x1, . . . , xm)μ(dx1) · · ·μ(dxm)

with some g ∈ Csym(X ). Therefore, equation (7.61) can be rewritten in terms of thecoefficient functions gmt . This leads to the so-called Bogolubov chains or BBGKYhierarchies, which are another useful tool for analysing kinetic equations.

7.5 Well-posedness for basic kinetic equations

The class of equations (7.52) unifies an immense variety of important concreteevolutions. Moreover, it provides a fast unified approach for obtaining variousquantitative and qualitative features of concrete examples that had initially beendeveloped in lengthy case-by-case studies.

In all models of interest, one can distinguish a function on X that measuressome key property of particles and their clusters, and that does not increase duringall the transitions. For instance, this can be the mass in mass-exchange modelslike coagulation-fragmentation or the kinetic energy for Boltzmann collisions. Thisfunction plays the role of a Lyapunov function. In the sequel, we shall denote it byE (energy), which is the well-established notation in the Boltzmann setting. Theformal definition goes as follows: Let E be a positive function on X . The transitionkernel P = P (x; dy) in (7.35) is called E-subcritical (respectively E-critical), if∫

(E⊕(y) − E⊕(x))P (x; dy) ≤ 0 (7.62)

for all x (respectively if the equality holds). We say that P (x; dy) is E-preserving(respectively E-non-increasing) if the measure P (x; dy) is supported on the set{y : E⊕(y) = E⊕(x)} (respectively {y : E⊕(y) ≤ E⊕(x)}). Clearly, if P (x; dy)is E-preserving (respectively E-non-increasing), then it is also E-critical (respec-tively E-subcritical). E.g., if E = 1, then the preservation of E (subcriticallity)means that the number of particles remains constant (does not increase on aver-age) during the evolution of the process. As we shall see later, subcriticallity enterspractically all natural assumptions and ensures the non-explosion of the modelsof interaction.

The well-posedness for bounded kernels is mostly covered by Theorems 7.2.1and 7.2.2.

Theorem 7.5.1. Let X be a complete metric space and E a continuous function onX that is bounded from below by a constant E0 > 0. In order to keep the formulaeslim, we shall take E0 = 1. Let P (x; .) be a continuous transition kernel in SX

Page 438: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.5. Well-posedness for basic kinetic equations 425

that is E-subcritical with uniformly bounded intensities,

supz∈X

supμ∈M(X,E)

∫XP (z; dy) = P < ∞.

Then the Cauchy problem for equation (7.37) is well posed in M+(X,E) and thesolutions depend smoothly on the initial data.

Proof. It follows from Theorems 7.2.1 and 7.2.2 by taking into account the formu-lae (7.42) and (7.43), as well as the resulting formulae for the variational deriva-tives:

δa(x, μ)

δμ(z)=

K∑k=1

1

(k − 2)!

∫X

∫Xk−2

P k(x, z, z3, . . . , zk; dy)μ(dz3) · · ·μ(dzk), (7.63)

δν(w, μ, dy)

δμ(z)=

K∑k=1

∞∑m=1

m(k − 1)

k!(7.64)

×∫Xm−1

∫Xk−2

P km(w, z, z3, . . . , zk; dydy2 · · · dyk)μ(dz3) · · ·μ(dzk),

Moreover, the following estimates apply:

a(x, μ) ≤ P∑

k

(E, μ)k−1

(k − 1)!≤ P exp{(E, μ)},

∫E(y)ν(x, μ, dy) ≤ 1

k!

K∑k=1

∫E⊕(x, z2, . . . , zk)

× P k(x, z2, . . . , zk; dy)μ(dz2) · · ·μ(dzk)≤ E(x)P exp{(E, μ)}. �

This result can be extended to equations with a mean-field dependence:

d

dt(g, μt) =

K∑k=1

1

k!

∫X

∫Xk

(g⊕(y) − g⊕(z))P k(μt, z; dy)μ⊗kt (dz). (7.65)

Theorem 7.5.2. Let X be a complete metric space and E a continuous function onX that is bounded from below by a positive constant. Let P (μ,x; .) be a family ofcontinuous transition kernels in SX that are E-subcritical with uniformly boundedintensities,

supz∈X

supμ∈M(X,E)

∫XP (μ, z; dy) < ∞.

Let P (μ,x, .) depend locally Lipschitz-continuously on μ, so that

supx

‖P (ξ,x; .)− P (η,x; .)‖M(X,E) ≤ C(λ)‖ξ − η‖M(X,E)

Page 439: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

426 Chapter 7. Equations in Spaces of Weighted Measures

for ξ, η ∈ M+≤λ(X,E) and some constants C(λ). Then the Cauchy problem for

equation (7.65) is well posed in M+(X,E) and the solutions depend continuouslyon the initial data. If additionally P (μ,x; .) has uniformly continuous variationalderivatives with respect to μ on each set M+

≤λ(X,E), then the solutions dependsmoothly on the initial data.

Equations with unbounded rates and kernels are particularly interesting. Twomain classes of such kernels are usually discussed in the literature: Namely, thetransition kernel P is called multiplicatively E-bounded or E⊗-bounded (respec-tively additively E-bounded or E⊕-bounded) for a positive function E on X when-ever ‖P (μ,x; .)‖ ≤ cE⊗(x) (respectively ‖P (μ,x; .)‖ ≤ cE⊕(x)) for all μ and xand some constant c > 0, where we used the notations (7.33) and (7.34). In orderto keep the formulae short, we shall consider this definition to hold with c = 1.

The next result is a variation of Theorem 7.2.3.

Theorem 7.5.3. Let X be a locally compact metric space and E a continuous func-tion on X that is bounded from below by a constant E0 that we choose to equal1 without loss of generality. Let P (μ,x, .) be a family of continuous transitionkernels in SX such that the number of particles created in one transition is uni-formly bounded by some number M (i.e., m is bounded by M in (7.35)). Let P beE-subcritical and sub-multiplicatively E-bounded in the sense that

‖P (μ, z; .)‖ ≤ ω(z)E⊗(z), (7.66)

with some positive function ω(z) that is bounded by a constant Ω and tends tozero, as z → ∞. Let P (μ,x, .) depend locally Lipschitz-continuously on μ, so that

supx

‖P (ξ, z, .)− P (η, z, .)‖M(X) ≤ E⊗(z)C(λ)‖ξ − η‖M(X)

for ξ, η ∈ M+≤λ(X,E) and some constants C(λ). Then the Cauchy problem for

equation (7.65) has a global solution μt ∈ M+(X,E) for any μ0 ∈ M+(X,E).

Proof. The proof is essentially the same as in Theorem 7.2.3. The only differenceis that one has to choose the approximations in accordance with the particularstructure of the problem. Namely, one approximates the kernels P by

Pn(μ, z1, . . . zk; dy) =

k∏j=1

χn(zj)Pn(μ, z1, . . . zk; dy),

and takes an, νn as given by (7.42), (7.43) with Pn instead of P . This choice yieldsthe Lipschitz continuity conditions

|(an(x, ξ)− an(x, η)|

≤ E(n)‖ξ − η‖M(X)

K∑k=1

1

(k − 1)![C(λ)λk−1 +Ω(k − 1)λk−2],

(7.67)

Page 440: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.6. Equations with additive bounds for rates 427

‖(νn(x, ξ, .) − νn(x, η, .)‖M(X)

≤ ME(n)‖ξ − η‖M(X)

K∑k=1

1

k![C(λ)λk−1 +Ω(k − 1)λk−2]

(7.68)

for the approximating equations , which allows for the conclusion of the well-posedness of these approximating equations. The rest of the proof is the same asin Theorem 7.2.3. �

7.6 Equations with additive bounds for rates

The methods for the unified analysis of the equations (7.52) are largely based onideas that are similar to the ones used in the discrete case of Chapter 3: Lya-punov functions, preservation of positivity, finite-dimensional (or bounded rates)approximations, moment estimates and accretivity. In this section, we show howthe last two ideas can be exploited. For a complete picture, we refer to the morespecialized literature (see Section 7.10). Here, we shall only analyse a subclass ofthe equations (7.52) that describes binary interactions, i.e., the equations

d

dt(g, μt) =

2∑l=1

1

l!

∫Xl

[∫X(g⊕(y) − g⊕(z))P (z; dy)

]μ⊗lt (dz), μ0 = Y, (7.69)

and its integral version

(g, μt)− (g, μ) =

∫ t

0

ds

2∑l=1

1

l!

∫Xl

[∫X(g⊕(y)− g⊕(z))P (z; dy)

]μ⊗ls (dz), (7.70)

where we assume that the P (z; dy) are transition kernels in X ∪ SX2. Therefore,only two particles take part in any instantaneous act of interaction and only twoparticles can be created as the result of this interaction.

Already this system shows the advantage of a concise notation for functionsand measures on X used in (7.52). In a more detailed description, equation (7.69)can be written as

d

dt(g, μt) =

∫X

[∫X2

(g(y1) + g(y2)− g(z))P 12 (z; dy1dy2)

+

∫X

(g(y)− g(z))P 11 (z; dy)

]μt(dz)

+1

2

∫X2

∫X2

(g(y1) + g(y2)− g(z1)− g(z2))

× P 2(z1, z2; dy1, dy2)μt(dz1)μt(dz2)

+1

2

∫X2

∫X

(g(y)− g(z1)− g(z2))P2(z1, z2; dy)μt(dz1)μt(dz2),

μ0 = Y. (7.71)

Page 441: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

428 Chapter 7. Equations in Spaces of Weighted Measures

Equations of this type contain spatially homogeneous evolutions of the Boltzmannand Smoluchowski type (Examples 1 and 2 above).

We shall first prove the existence of the solutions and the moment estimates.Afterwards, we shall consider uniqueness and the continuous dependence on theinitial data.

Theorem 7.6.1. Let X be a locally compact metric space and P (x; .) be a continuoustransition kernel from X ∪ SX2 to itself such that P (x; .) is E-non-increasingand (1 + E)⊕-bounded for some continuous non-negative function E on X withE(x) → ∞ as x → ∞. Suppose that

∫(1 + Eβ)(x)μ(dx) < ∞ for the initial

condition μ with some β > 1.

Then there exists a global non-negative solution (in the topology of M(X))to (7.69), which does not increase E, i.e., with (E, μt) ≤ (E, Y ), t ≥ 0, such thatfor an arbitrary T ,

supt∈[0,T ]

∫(1 + Eβ)(x)μt(dx) ≤ C(T, β, (1 + E, Y ))(1 + Eβ, Y ) (7.72)

with some constant C(T, β, (1 + E, Y )).

Proof. Let us first approximate the transition kernel P by the cut-off kernels Pn

defined by the equation∫g(y)Pn(z; dy) =

∫χn(E

⊕(z))g(y)χn(E⊕(y))P (z; dy), (7.73)

for arbitrary g, where χ is the same as in Theorem 7.2.3. Then Pn has the sameproperties as P , but is bounded at the same time. Therefore, the solutions μn

t tothe corresponding kinetic equations with initial condition Y exist by Proposition7.5.2.

Since the evolution defined by Pn does not change measures outside thecompact region {y : E(y) ≤ n}, it follows that if

∫(1 + Eβ)(x)Y (dx) < ∞, then

the same holds for μt for all t. Our aim now is to obtain a bound for this entity thatis independent on n. For that purpose, let us denote by Fg the linear functionalon measures Fg(μ) = (g, μ), and by ΛFg(μt) the r.h.s. of equation (7.69). Due tothe structure of ΛFg(μt) and the assumptions on P , we find

ΛF1(μ) ≤ F1+E(μ), ΛFE(μ) ≤ 0,

which by Gronwall’s lemma implies

supt∈[0,T ]

∫(1 + E)(x)μn

t (dx) ≤ eT (1 + E, Y ). (7.74)

This implies that the sequence of curves μnt in C([0, T ],M(X)) is relatively com-

pact by the Ascoli theorem. Therefore, one can extract a subsequence (again

Page 442: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.6. Equations with additive bounds for rates 429

denoted by μnt ) that converges to some curve μ. ∈ C([0, T ],M(X)) which also

satisfies the growth condition (7.74).

Next, for any y in the support of P (x, .), we have

(Eβ)⊕(y) ≤ (E⊕(y))β ≤ (E⊕(x))β ,

since P is E-non-increasing and the function z �→ zβ is convex. Consequently, onehas

ΛFEβ (μ) =2∑

l=1

1

l!

∫Xl

[(Eβ)⊕(y) − (Eβ)⊕(x)]P (x; dy)μ⊗l(dx)

≤ 1

2

∫X2

[(E(x1) + E(x2))β − Eβ(x1)− Eβ(x2)]P (x; dy)μ(dx1)μ(dx2).

Using the symmetry with respect to permutations of x1, x2 and the assumptionthat P is (1 + E)⊕-bounded, one deduces that this expression does not exceed∫

[(E(x1) + E(x2))β − Eβ(x1)− Eβ(x2)](1 + E(x1))μ(dx1)μ(dx2).

Using the inequalities (3.95) with a = E(x1), b = E(x2) yields

(E(x1) + E(x2))β − Eβ(x1)− Eβ(x2)

≤ 2β[E(x1)Eβ−1(x2) + E(x2)E

β−1(x1)]

≤ 2β+1[Eβ(x1) + Eβ(x2)],

and

[(E(x1) + E(x2))β − Eβ(x1)]E(x1)

≤ β2β[E(x2)βE(x1) + E(x1)

βE(x2)].

Again by the symmetry, this implies

ΛFEβ (μ) ≤ 2β+1(2 + β)

∫Eβ(x1)(1 + E(x2))μ(dx1)μ(dx2). (7.75)

By (7.74), it follows that

ΛFEβ (μ) ≤ eT (1 + E, μ)2β+1(2 + β)(Eβ , μ).

The same estimates hold for the transitions Pn instead of P . Consequently, Gron-wall’s lemma implies for an arbitrary T that

FEβ (μnt ) = (Eβ , μn

t ) < C(T, β, (1 + E, Y ))(Eβ , Y )

with some constant C(T, β, (1 + E, μ0)) for all t ∈ [0, T ] and all n. This impliesthat the limiting curve μt satisfies the estimate (7.72). It remains to show thatμt satisfies (7.70) by passing to the limit in the corresponding equations for μn

t .This is done as in the proof of Theorem 7.2.3: all integrals outside the domain {y :E(y) < K} can be made arbitrary small by choosing a large K (because of (7.72));and inside this domain, the result follows from the convergence μn

t to μt. �

Page 443: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

430 Chapter 7. Equations in Spaces of Weighted Measures

Theorem 7.6.2. Suppose that the assumptions of Theorem 7.6.1 hold and that Pis (1 + Eα)⊕-bounded for some α ∈ [0, 1] such that β ≥ α + 1. Then there existsa unique non-negative solution μt to (7.69) satisfying (7.72) and a given initialcondition Y ∈ M+(X, 1 + Eβ).

Moreover, the mapping Y �→ μt is Lipschitz-continuous in the norm ofM1+E(X), i.e., for any two solutions μt and νt to (7.69) satisfying (7.72) withthe initial conditions Yμ and Yν , one has∫

(1 + E)(x)|μt − νt| (dx) ≤ C

∫(1 + E)(x)|Yμ − Yν | (dx) (7.76)

for some constant C uniformly for all t ∈ [0, T ].

Proof. Given the previous results, we only need to prove (7.76). For that purpose,let us apply Proposition 1.4.5 to the measure-valued curve (1+E)(x)(μt−νt)(dx).Let ft denote a version of the density of μt − νt with respect to |μt − νt|. Thenwe get∫

(1 + E)(x)|μt − νt|(dx) = ‖(1 + E)(μt − νt)‖

=

∫(1 + E)(x)|Yμ − Yν |(dx) +

∫ t

0

ds

∫X

fs(x)(1 + E)(x)(μs − νs)(dx).

By (7.69), the last integral in this expression equals∫ t

0

ds

∫ ∫ ([fs(1 + E)]⊕(y) − [fs(1 + E)](z)

)P 1(z; dy)(μs(dz)− νs(dz))

+

∫ t

0

ds

∫ ∫ ([fs(1 + E)]⊕(y) − [fs(1 + E)]⊕(z)

)× P 2(z; dy)(μs(dz1)μs(dz2)− νs(dz1)νs(dz2)). (7.77)

The second term on the r.h.s. equals

∫ t

0

dsk∑

l=1

∫ ∫ ([fs(1 + E)]⊕(y) − [fs(1 + E)]⊕(z)

)× P (z; dy)[(μs − νs)(dz1)μs(dz2) + νs(dz1)(μs − νs)(dz2)].

(7.78)

Let us now estimate the integral arising from the first term in the last squarebracket. (Note that the second term can be dealt with analogously.) We have∫∫ (

[fs(1 + E)]⊕(y) − [fs(1 + E)]⊕(z))P 2(z; dy)(μs − νs)(dz1)μs(dz2). (7.79)

=

∫∫ ([fs(1 + E)]⊕(y) − [fs(1 + E)]⊕(z)

)P 2(z; dy)fs(z1)|μs − νs|(dz1)μs(dz2)

Page 444: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.7. On the sensitivity of kinetic equations 431

Since E is non-increasing by P (z; dy), we find([fs(1 + E)]⊕(y) − [fs(1 + E)]⊕(z)

)fs(z1)

≤ (1 + E)⊕(y)− fs(z1)[fs(1 + E)]⊕(z)

≤ 4 + E⊕(z) − E(z1)− fs(z1)fs(z2)E(z2) ≤ 4 + 2E(z2).

Therefore, (7.79) does not exceed∫(4 + 2E(z2))(1 + Eα(z1) + Eα(z2))|μs − νs|(dz1)μs(dz2).

Consequently, since 1 + α ≤ β and α ≤ 1, the integral (7.78) does not exceed

C(T, β, (1 + E, Yμ + Yν))(1 + Eβ , Yμ + Yν)

∫ t

0

ds

∫(1 + E)(x)|μt − νt|(dx)

with some constant C. A similar procedure with the first term in (7.77) showsthat it is non-negative. Consequently, (7.76) follows by Gronwall’s lemma. �Remark 115. As in the discrete setting of Chapter 3, the above proof of theuniqueness is effectively the proof of the accretivity of the corresponding kineticequation with respect to the weighted norm of M(X,L).

7.7 On the sensitivity of kinetic equations

After the brief but systematic exposition of the previous sections, let us now sketchsome directions of further developments. In this section, we give some commentson how to deal with sensitivity. Thereby, the sensitivity for kinetic equations withadditively bounded kernels is derived from the sensitivity of the approximatedmodel with cut-off rates, like in the framework of Theorem 7.6.2.

To begin with, one differentiates equation (7.69) with respect to the initialdata and obtains the following linear equation for the derivative ξt = Dμt(Y )[ξ]:

d

dt(g, ξt) =

∫X

∫X(g⊕(y) − g(z))P 1(z; dy)ξt(dz)

+

∫X2

∫X(g⊕(y) − g(x)− g(z))

× P 2(x, z; dy)[ξt(dz)μt(dx) + ξt(dx)μt(dz)].

(7.80)

The dual backward equation on functions reads

d

dtgt(x) = −

∫X(g⊕t (y)− gt(x))P

1(x; dy)

−∫X

∫X(g⊕t (y)− gt(x))P

2(x, z; dy)μt(dz)

+

∫X

∫Xgt(z)P

2(x, z; dy)μt(dz).

(7.81)

Page 445: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

432 Chapter 7. Equations in Spaces of Weighted Measures

Note that we wrote the last term separately in order to highlight that this termdestroys the conditional positivity of the evolution. Therefore, the propagatorfor the evolution (7.81) is built in two steps: (1) for the corresponding equationwithout the last term, where the methods of positivity-preserving evolutions canbe used; (2) by dealing with the last term based on perturbation theory.

The systematic treatment reveals an important additional complication: so-lutions to the linearized equations (7.80) describing the evolution of the derivativeswith respect to the initial data usually belong to other weighted spaces (i.e., havestronger growth at infinity) than the solutions to the kinetic equations. We referto [147] for the full story. At this point, we only mention the remarkable obser-vation that the pair of equations (7.69) and (7.81) (the kinetic equation and thedual equation to its linearized evolution) can be written in the infinite-dimensionalHamiltonian form with the help of partial Frechet derivatives:

d

dtμt = D2H(μt, gt),

d

dtgt = −D1H(μt, gt) (7.82)

(the first equation being the strong form of weak equation (7.69)), where D1, D2

denote the derivatives with respect to the first or second variable, respectively,and where the Hamiltonian function is

H(μ, g) = (Aμg, μ), (7.83)

if written in the general form for all kinetic equations, or

H(μ, g) =

2∑l=1

1

l!

∫Xl

[∫X(g⊕(y) − g⊕(z))P (z; dy)

]μ⊗lt (dz), (7.84)

if written for the considered special case.

In its weak form and using fractional derivatives, the Hamiltonian system(7.82) can be rewritten as

d

dt(f, μt) = D2H(μt, gt)[f ],

d

dtgt(.) = −δH(μt, gt)

δμt(.). (7.85)

7.8 On the derivation of kinetic equations:second quantization and beyond

In this section, we shall draw a general scheme for the derivation of kinetic equa-tions from the evolution of systems of interacting particles, referring to [147] for adetailed, rigorous derivation based on this scheme. In the sequel, we shall use thenotation from Section 7.4.

Let us start with spontaneous transitions under mean-field interaction. LetA be an arbitrary ΨDO in Rd describing the evolution of a state (a particle) in

Page 446: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.8. On the derivation of kinetic equations: second quantization and beyond 433

X = Rd: ft(x) = Aft(x). This equation is interpreted as describing the evolutionof observables f (some measurable quantities) that depend on a (possibly ran-dom) state x. E.g., the first-order equation ft = a(x)∂ft∂x describes the evolutionft(x) = f(Xt(x)) of the quantity f depending on the position of a particle thatmoves deterministically according to the equation x = a(x), Xt(x) being its solu-tions starting at x at the time zero. A second-order equation stands for diffusionprocesses. Equations with the integral generator f(x) =

∫(f(y) − f(x))ν(dy) de-

scribe processes of random jumps occurring with the rate ν(y). In the quantumsetting, various self-adjoint operators A can be used.

To any operator A1 acting on the functions on X , there corresponds an oper-ator A1 acting on the space of functions on SX that describes the evolution of anarbitrary number of particles, each one developing according to A1, independentlyfrom each other. Therefore, A1 acts as

A1f(x1, . . . , xk) =∑

jA1

jf(x1, . . . , xk), (7.86)

where A1j denotes the operator A1 acting on f seen as a function of xj .

In quantum physics, the operator A1 is referred to as the second quantizationof A1. However, in quantum physics the emphasis is on operators that act inL2(Rd) (therefore, the operator (7.86) acts in L2(Rdk)). In this section, we willdeal with classical particles, where the more appropriate functional space is C(X)or better C∞(X).

One says that an operator A1 in C(X) is subject to a mean-field interaction,if instead of one operator we are given a family of operators A1(μ) depending onμ ∈ M+(X) as a parameter. The corresponding evolution ft = A1ft in C(SX )given by (7.86) is then modified by letting A1 depend on the ‘environment’ viathe empirical distribution

μ = δx/k = (δx1 + · · ·+ δxk)/k ∈ P(X),

where we introduced the notation δx = δx1+· · ·+δxkfor a point x = (x1, . . . , xk) in

SX . Therefore, the mean-field interacting particle system described by the familyof operators A1(μ) in C(X) is given by the equation

f(x1, . . . , xk) = A1(δx/k)f(x1, . . . , xk) =∑

jA1

j (δx/k)f(x1, . . . , xk), (7.87)

or, a in more concise form,

f(x) = A1(μ)f(x)|μ=δx/k, x ∈ SXk. (7.88)

As we saw earlier already, the inclusion SX to M(X) given by the scaledtransformation

x = (x1, . . . , xl) �→ h(δx1 + · · ·+ δxl) = hδx (7.89)

Page 447: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

434 Chapter 7. Equations in Spaces of Weighted Measures

plays a key role in the theory of measure-valued limits of interacting particlesystems. This transformation defines a bijection between SX and the set M+

δ (X)of finite linear combinations of Dirac-δ-measures with natural coefficients. Thisbijection can be used for equipping SX with the structure of a metric space bypulling back any distance on M(X) that is compatible with its weak topology. Inthe above-considered situation when the total number of particles is preserved bythe evolution, one naturally chooses h = 1/k, where k the number of particles. Butin other situations, h is best fixed as the inverse 1/N0 to the number of particlesin the initial state. The final objective is to pass to the limit h → 0, k → ∞ insuch a way that μ = hδx has a finite limit in M+(X).

In order to be able to find such limit, one has to transfer the action of A1 fromthe functions on SX to the functionals F (hδx) on M+(X). For this transforma-tion, the formulae for the differentiation of these functionals pay a crucial role. Inorder to properly describe these formulae, we need the notations from Section 1.13

and their extensions specifying the regularity of the variational derivatives δF (Y )δY (x)

with respect to x. Namely, let us define the space C1,kweak(M+

≤λ(X)) as the sub-

space of C1weak(M+

≤λ(X)) consisting of functionals F (μ) such that δF (Y )δY (.) ∈ Ck(X)

uniformly for Y ∈ M+≤λ(X). This space is Banach with the norm

‖F‖C1,kweak(M+

≤λ(X)) = sup

Y ∈M+≤λ

(X)

∥∥∥∥δF (Y )

δY (.)

∥∥∥∥Ck(X)

. (7.90)

Similarly, one can introduce the space Cn,kweak(M+

≤λ(X)) of functionals havingvariational derivatives of order up to k, which are continuously differentiable oforder up to n with respect to the parameters entering these variational deriva-tives. It turns out, however, that for the analysis of particle systems the key roleis played by the following space that may seem rather artificial at first sight: LetC2,k×k

weak (M+≤λ(R

d)) denote the subspace of C1,1weak(M+

≤λ(Rd)) consisting of func-

tionals F (μ) such that δ2F (Y )δY (x)δY (z) exists for all x, z and belongs to Ck×k(R2d) (see

the definition of this space in Section 1.1).

Lemma 7.8.1.

(i) If F ∈ C1,1weak(M(Rd)), then

∂xiF (hδx) = h

∂xi

δF (Y )

δY (xi)

∣∣∣∣Y=hδx

. (7.91)

(ii) If F ∈ C1,2weak(M(Rd)) ∩ C2,1×1

weak (M+≤λ(R

d)), then

∂2

∂x2i

F (hδx) = h∂2

∂x2i

δF (Y )

δY (xi)

∣∣∣∣Y=hδx

+ h2 ∂2

∂y∂z

δF (Y )

δY (y)δY (z)

∣∣∣∣Y =hδx,y=z=xi

,

(7.92)

Page 448: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.8. On the derivation of kinetic equations: second quantization and beyond 435

∂2

∂xi∂xjF (hδx) = h2 ∂2

∂xi∂xj

δF (Y )

δY (xi)δY (xj)

∣∣∣∣Y=hδx

, i = j. (7.93)

Proof. Let us prove only (7.91) and leave the other formulae as an exercise. Infact, this is a consequence of Theorem 1.13.1, because the use of (1.202) leads to

∂xiF (hδx) = lim

ε→0

1

ε[F (hδx + hδxi+ε − hδxi)− F (hδx)]

= limε→0

h

ε

∫ 1

0

(δF (Y )

δY (.)

∣∣∣∣Y=hδx+hs(δxi+ε−δxi

)

, δxi+ε − δxi

)ds

= limε→0

h

ε

∫ 1

0

(δF (hδx + hs(δxi+ε − δxi))

δY (xi + ε)− δF (hδx + hs(δxi+ε − δxi))

δY (xi)

)ds,

which implies (7.91). �Lemma 7.8.2. Let A1(μ) be a differential operator with coefficients that depend onμ, such that the symbol of A1, A1(μ, x, p), is a polynomial in p with a vanishingfree term: A(μ, x, 0) = 0. Then, if hδx tends to a measure μ ∈ M+(X), as h → 0(of course, x also changes with h), we have

limh→0

A1(μ)F (hδx) =

∫X

(A1(μ)

δF (μ)

δμ(.)

)(x)μ(dx). (7.94)

Proof. In fact, (7.91) implies that the highest-order term (in small h) of the deriva-tives of F (hδx) with respect to xi coincides with the derivatives of the functionδF (Y )δY (xi)

, i.e., in the highest order we have

A1(μ)F (hδx) = h∑

iA1

i (μ)δF (Y )

δY (xi)=

(A1(μ)

δF (μ)

δμ(.), hδx

).

This yields (7.94). �

Formula (7.94) extends to ΨDOs with symbols A1(μ, x, p) such that

A1(μ, x, 0) = 0.

In particular, it holds for the integral operatorsA1f(x) =∫(f(y)−f(x))ν(μ, x, dy).

Exercise 7.8.1. As an instructive exercise, give an independent proof of (7.94) forsuch integral operators.

As a consequence of Lemma 7.8.2, the generator A1(μ)F (hδx) acting onC(SX ) tends to the operator on the r.h.s. of (7.94) acting in C(M(X), as h → 0.This, however, is a particular performance of the operator (7.60) with k = 1 andvanishing P , and the evolution (7.88) on C(SX ) tends to the evolution

Ft(μ) = ΛA1Ft(μ) =

∫X

(A1(μ)

δFt

δμ(.)

)(x)μ(dx). (7.95)

Page 449: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

436 Chapter 7. Equations in Spaces of Weighted Measures

Therefore, we derived (7.60) as well as the corresponding kinetic equation

d

dt(g, μt) = (A1(μt)g, μt) =

∫X

A1(μt)g(x)μt(dx), (7.96)

with k = 1 and vanishing P .

Remark 116.

(i) Of course, by Proposition 4.2.2, in order to rigorously prove the convergenceof the semigroups generated by A1 on C(SX ) to the semigroups generated byΛA1 on C(M(X)), one has to ensure that the convergence of the generatorsholds on the core of the limiting semigroup. Therefore, one has to showthat smooth functionals F (μ) on M+(X) are invariant under the semigroupgenerated by ΛA, which boils down to the smooth dependence of the solutionsto kinetic equations with respect to the initial data, see [147] for detail.

(ii) The second term in the approximation of A1(μ) as h → 0, which is requiredfor providing error estimates, can be written in terms of expressions of thetype (7.93) involving second-order variational derivatives. This leads to anecessity to work with the spaces C2,1×1

weak , which in turn requires the second-order sensitivity for solutions to the kinetic equations, as analysed in Theorem6.8.4.

(iii) The formulae of Lemma 7.8.1(ii) are used when the next-order corrections inh to the evolution on F (hδx) are sought.

Let us now extend the story to operators that change the number of particles.For a transition kernel P 1(x; dy) of spontaneous transformations of single particles,the analogue of the evolution (7.87), which describes the process of spontaneousand independent transformations of any particle that is present in the system, isthe evolution

ft(x1, . . . , xk) =∑

j

∫X[ft(x : xj → y)− ft(x1, . . . , xk)]P

1(xj ; dy), (7.97)

where (x : xj → y) is the collection of points obtained from the collection x =(x1, . . . , xk) by substituting the point xj by the collection y. In terms of thefunctionals F (hδx) = f(x), this evolution can be rewritten as

Ft(hδx) =∑

j

∫X[Ft(hδx − hδxj + hδy)− Ft(hδx)]P

1(xj ; dy).

In case of the mean-field dependence, i.e., if P 1 = P 1(μ, x; dy), the equation ismodified accordingly:

Ft(hδx) =∑

j

∫X[Ft(hδx − hδxj + hδy)− Ft(hδx)]P

1(hδx, xj ; dy).

Page 450: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.8. On the derivation of kinetic equations: second quantization and beyond 437

Using (1.202) and passing to the limit h → 0 with hδx → μ yields the equation

Ft(μ) =

∫X

∫X

((δFt(μ)

δμ(.)

)⊕(y) − δFt(μ)

δμ(x)

)P 1(μ, x; dy)μ(dx). (7.98)

Therefore we derived equation (7.61) as well as the corresponding kinetic equationfor k = 1 and vanishing A.

Exercise 7.8.2. In the above scheme, identify A1(μ) or P 1(μ) that lead to theBoltzmann, Smoluchowski and Vlasov equations of Section 7.4.

Let us now turn to binary interactions. In this case, the limiting equationon C(M(X)) is influenced by the way the scaling parameter h is applied, whichshould reflect the physical setting of the problem.

Let A2 be an operator in Csym(X2), i.e., the state space of two particles.

As for the case of the operator A1 in C(X), the standard procedure (extendingthe second quantization of operators A1) for lifting the action A2 to a system ofarbitrary particle numbers is to define

A2f(x1, . . . , xk) =∑

(i,j)⊂{1,...,k}A2

ijf(x1, . . . , xk),

where A2ij denotes the operator A2 acting on f considered as a function of the

variables xi, xj .

Following the general idea that, as the number of particles grows, the sizeof each particle decreases and the action of each particle on another should alsodecrease proportionally, one can scale the binary interaction by another multiplierh (compared to spontaneous transitions). Therefore, the operator A2 lifted to thefunctionals on measures F (hδx) = f(x) should be scaled as

hA2F (hδx) = h∑

(i,j)⊂{1,...,k}A2

ijF (hδx), x ∈ SXk. (7.99)

Applying again Lemma (7.8.1) in conjunction with the identity

h2∑

(i,j)⊂{1,...,n}φ(xi, xj)

=1

2

∫ ∫φ(z1, z2)(hδx)(dz1)(hδx)(dz2)− h

2

∫φ(z, z)(hδx)(dz),

(7.100)

we find that, if hδx tends to a measure μ ∈ M+(X), as h → 0, then

limh→0

A2(μ)F (hδx) =1

2

∫X2

(A2(μ)

(δF (μ)

δμ(.)

)⊕)(x, y)μ(dx)μ(dy). (7.101)

Page 451: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

438 Chapter 7. Equations in Spaces of Weighted Measures

Therefore, the evolution Ft = hA2F (hδx) tends to the evolution

Ft(μ) =1

2

∫X2

(A2(μ)

(δFt(μ)

δμ(.)

)⊕)(x, y)μ(dx)μ(dy). (7.102)

Including the transition kernels of binary interactions yields the equation

Ft(μ) =1

2

∫X2

(A2(μ)

(δFt(μ)

δμ(.)

)⊕)(x, y)μ(dx)μ(dy) (7.103)

+1

2

∫X2

∫X

[(δFt(μ)

δμ(.)

)⊕(y)−

(δFt(μ)

δμ(.)

)⊕(z)

]P 2(z; dy)μ(dz1)μ(dz2).

Therefore, we derived a first-order infinite-dimensional PDE of the type (7.61) fork = 2. Arbitrary equations (7.61) are obtained similarly from particle systemswith kth-order interaction.

Notice that the Boltzmann, Smoluchowski and Vlasov equations can be ob-tained either via certain mean-field dependent processes of spontaneous trans-formations as specified by the kernel P 1(μ, x; dy) or the operator A that dependslinearly on μ, or via the process of pure binary interactions, without any mean-fielddependence, as specified by the operators A2 or the transition kernels P 2.

Exercise 7.8.3. In the above scheme, identify A2 or P 2 that lead to the Boltzmann,Smoluchowski and Vlasov equations of Section 7.4.

7.9 Interacting particles and measure-valued diffusions

A very natural uniform scaling of the binary interaction (7.99) always leads toa first-order infinite-dimensional PDE in variational derivatives, whose respectivesystem of characteristics is usually referred to as kinetic equations. However, suchscaling is far from being the unique reasonable way to do so. Non-uniform scalingmay lead to important higher-order PDEs. For instance, for the discrete setting,these limits are described in [140] and [139]. One of the most famous infinite-dimensional limit obtained in this way is the celebrated super-Brownian motion,as well as more general super-processes. In this section, we shall just consider anexample which is related to McKean–Vlasov equations as analysed in Chapter 6.

Assume that the operator A2 on Csym(Rd×Rd) from above is of the second-order diffusive type and mixes the coordinates:

A2f(x, y) = tr

(σ(x)σT (y)

∂2f

∂x∂y

)=∑i,j,k

σik(x)σjk(y)∂2f

∂xi∂yj. (7.104)

Let A1(μ) be a family of ΨDOs in C(Rd) with the symbol A1(μ, x, p) such thatA1(μ, x, 0) = 0. The interacting particle system specified by A1(μ) and A2 is

Page 452: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.9. Interacting particles and measure-valued diffusions 439

generated by the operator

∑jA1

j (hδx)f(x1, . . . , xk) +∑

(i,j)⊂{1,...,k}A2

ijf(x1, . . . , xk). (7.105)

If we do not apply additional scaling by h to the binary interactions (un-like (7.99)) when transforming it to an action on C(M(X)), then the resultingevolution on C(M(X)) is generated by the operator

A1(hδx)F (hδx) + A2F (hδx), x ∈ SXk. (7.106)

Applying the formulae (7.100) and (7.93) shows that if hδx converges to a measureμ ∈ M(X), then the expressions in (7.105) converge to

ΛAF (μ) =

∫X

(A1(μ)

δF

δμ(.)

)(x)μ(dx) (7.107)

+1

2

∫R2d

tr

(σ(x)σT (y)

∂2

∂x∂y

δ2F

δμ(x)δμ(y)

)μ(dx)μ(dz),

i.e., unlike the scaling result (7.99), we obtained a second-order infinite-dimensionalPDE in variational derivatives, a measure-valued diffusion.

Remark 117. For readers with a background in probability, let us point out whyequations that are generated by operators of the type (7.105) (and thus their limits(7.107)) are of particular interest: As can be directly checked by Ito’s formula, fora system of k stochastic SDEs, we have

dXjt = b(Xj

t , μt)dt+ σ1(Xjt )dB

jt + σ2(X

jt )dWt, j = 1, . . . , k, μt =

∑jδXj

t/k,

(7.108)where B1, . . . , Bk,W are independent Wiener processes, and the function

ES(X1t , . . . , X

kt )(x)

satisfies the equation

∂S

∂t=∑

j

[b(xj , μt)

∂S

∂xj+

1

2σ1(xj)

∂2S

∂x2j

]+

1

2

∑i<j

σ2(xi)σ2(xj)∂2S

∂xi∂xj(7.109)

(we assume x to be one-dimensional here in order to keep the formulae slim),generated by a r.h.s. of the type (7.105). Therefore, the equations generated bythe operators (7.107) describe the large-number-of-particles limit of mean-fieldinteracting stochastic processes (7.108) that are subject to the common noise Wt.

Page 453: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

440 Chapter 7. Equations in Spaces of Weighted Measures

7.10 Summary and comments

This chapter is essentially an adapted and simplified introduction to the author’smonograph [147]. Nonlinear positivity-preserving evolutions on measures that arenormally specified by kinetic equations have a natural probabilistic interpretation,which leads to the notion of ‘nonlinear Markov processes’ arising most naturally assome kind of measure-valued projection of Markov models of interacting particles.In [147], one can find a detailed treatment of linear equations with unboundedrates, the extension of Theorem 7.6.2 to very general kinetic equations, its conse-quences concerning the smoothness of solutions with respect to the initial data,as well as subtle estimates for the derivatives needed for a detailed analysis of ap-proximating particle systems as mentioned in Section 7.8, with an emphasis on theSmoluchowski and Boltzmann setting. In particular, the analysis of fluctuationsin systems of interacting particles around the measure-valued evolution dictatedby kinetic equations (central-limit-type behaviour) leads one to measure-valueddiffusions of the type (7.107).

The unifying picture presented in [147] (which we also followed in this chap-ter) was developed by putting together pieces of specific methods that had beendeveloped by different authors at different times when working with different prob-lems. In the same spirit, we tried to single out the core ideas (positivity preserva-tion, Lyapunov functions, moment estimates, accretivity and approximations bymodels with cut-off rates) underlying the huge amount of concrete research on con-crete models, which eventually allowed us to give a fast and concise presentationof many results in a unified abstract setting.

The systematic development of the theory of smoothness of solutions withrespect to the initial data as the basis for a new method for the rigorous derivationof kinetic equations which includes sharp error estimates for the approximatingsystems of interacting particles (often of order 1/N) was one of the core novelties of[147]. This development is based on previous research of [142] and [143]. We refer tothis book [147] (Section 11.6) for an extensive discussion of the immense literatureon kinetic models. At this point, we only mention a few papers that are relatedto the present exposition, in particular to the development of unified methods.Proposition 1.4.5 as a systematic tool for proving uniqueness (or accretivity) forkinetic equations was put forward in [141]. A simplified proof was given in [215]. Inthe present context, new simpler proofs under weaker assumptions are presented.The general abstract setting for treating coagulation models was developed in[214]. An extension of the treatment of equilibria to general models, in terms ofthe generalized entropy given in Section 3.4 for the discrete case, was developedin [94].

The deduction of general kth-order interaction kinetic equations via Bo-golyubov’s chain approach was carried out in [32]. The quantum analogues ofgeneral kinetic equations were developed in [30] and [31]. For an alternative treat-ment of particle dynamics via evolutions on configuration spaces (starting from a

Page 454: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

7.10. Summary and comments 441

locally finite rather than a finite number of particles), we can refer to [82, 168, 169]and references therein. Fractional kinetic hierarchies were developed in [130] and[57]. Note that we did not consider spatially nontrivial models in this chapter.The mollified version of spatially nontrivial Boltzmann equations does not bringmany new difficulties compared to the spatially trivial case. On the other hand,the proper spatially nontrivial Boltzmann equation is known to be notoriouslydifficult to analyse. The celebrated paper [65] settled the question of a weak ex-istence of its solutions. For spatially nontrivial coagulation models and relatedkinetic equations, we can refer to [103, 216] and [181]. Finally, let us mentionsome useful comprehensive sources for detailed studies of concrete models, namely[9, 13, 69, 88, 234, 260] and [36]. Let us also point out the paper [210], whichshows that all uniqueness results discussed above are in fact uniqueness under therestriction of the conservation of energy E, and that the Boltzmann equation, inparticular, can have infinitely many solutions otherwise.

A rigorous derivation of the Landau–Fokker–Planck equation (7.55) from theBoltzmann equation was performed in [17] for basic examples of the interactionpotential. For an extensive and detailed analytic theory of the Landau–Fokker–Planck equation, we can refer to [258] and [259], where several special ideas used forthe study of this equation are explained, including the general notion of weak so-lutions based on the entropy-production concept and a remarkable representationof this equation in the equivalent linear form. For the probabilistic analysis of thisequation, we can refer to [60], [98]and references therein. Its quantum extensionis given in [265]. Entropic properties of Fokker–Planck equations are developed indetail in [86].

Page 455: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

Chapter 8

Generalized FractionalDifferential Equations

In this chapter, the theory of linear and nonlinear fractional differential equationsis developed and extended to a large class of generalized fractional evolutions.The used method is mostly that of semigroups and propagators as developed inChapters 4 and 5. As previously, general facts are illustrated on concrete examples.The generalized fractional calculus is usually developed by extending fractionalintegrals to integral operators with arbitrary integral kernels (or some of theirsubclasses dictated by the classes of special functions under investigation) and thendefining fractional derivatives as the derivatives of these integral operators. In whatfollows, we will use an alternative approach to generalized fractional operations,motivated by a probabilistic interpretation (Levy processes that are interrupted ontheir attempt to cross a boundary). In this approach, one starts with the definitionof a generalized fractional derivative, and the generalized fractional integral is thendefined as the corresponding potential operator or, in other words, as the rightinverse operator to the fractional derivative. This potential operator is an integraloperator whose integral kernel is the fundamental solution to the operator of thegeneralized fractional derivative. A characteristic feature of our approach is thefact that we can analyse fractional equations not as some ‘unusual evolutions’, butrather as more ‘standard’ stationary problems. This allows us to put fractionalequations in the general context of semigroups and propagators.

© Springer Nature Switzerland AG 2019 V. Kolokoltsov, Differential Equations on Measures and Functional Spaces, Birkhäuser Advanced Texts Basler Lehrbücher, https://doi.org/10.1007/978-3-030-03377-4_8

443

Page 456: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

444 Chapter 8. Generalized Fractional Differential Equations

8.1 Green functions of fractional derivativesand the Mittag-Leffler function

We start by giving the most direct analytic derivation of Zolotarev’s formula (2.80),which expresses the Mittag-Leffler function in terms of the Green functions

G±β(t, x) =1

∫ ∞

−∞exp

{ipx− t|p|β exp

{± i

2πβ sgn p

}}dp

= F−1

(exp

{−t|p|β exp

{± i

2πβ sgn p

}})

=1

πRe

∫ ∞

0

exp

{ipx− tpβ exp

{± i

2πβ

}}dp

for the problem (2.74):

∂G

∂t(t, x) = − dβ

d(±x)βG(t, x), t ≥ 0, Gt=0 = δ(x),

with β ∈ (0, 1). Another proof, based on the semigroup theory, will be presentedlater on in a much more general setting, see (8.49).

The main ingredient for the direct analytic derivation is the Mellin transformof G±β , i.e., formula (8.1) below.

Proposition 8.1.1. Let β ∈ (0, 1).

(i) For any s > 0, we have∫ ∞

0

x−sGβ(t, x) dx =Γ(s/β)

βΓ(s)t−s/β ; (8.1)

(ii) For any s ∈ R, Zolotarev’s formula holds:

Eβ(s) =1

β

∫ ∞

0

esxx−1−1/βGβ(1, x−1/β) dx. (8.2)

Equivalently, this formula can be written as

Eβ(s) =1

β

∫ ∞

0

esxx−1Gβ(x, 1) dx =

∫ ∞

0

esy−β

Gβ(1, y) dy. (8.3)

Moreover, the derivatives of Eβ(s) (with respect to s) can be obtained bydifferentiation inside the integrals (8.2) or (8.3).

Remark 118. Formula (8.2) basically states that βEβ(−s) is the Laplace trans-form of the positive function x−1−1/βGβ(1, x

−1/β). In particular, βEβ(−s) is acompletely monotone function.

Page 457: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.1. Green functions of fractional derivatives and the Mittag-Leffler function 445

Proof. (i) Notice first of all that by Proposition 2.4.1(i) all derivatives of thefunction Gβ(t, x) with respect to x vanish at zero. Therefore, all integrals in (8.1)are well defined. Assume now that s /∈ N. By (1.176) and the definition of G, wefind∫ ∞

0

x−sGβ(t, x) dx =

∫R

(F−1(x−s+ ))(p) exp

{−t|p|β exp

{i

2πβ sgn p

}}dp

= 2Re

∫ ∞

0

F−1(x−s+ )(p) exp{−tpβeiπβ/2} dp

= ReΓ(1 − s)

πeiπ(1−s)/2

∫ ∞

0

ps−1 exp{−tpβeiπβ/2} dp.

Therefore, by (9.9), we find∫ ∞

0

x−sGβ(t, x) dx = ReΓ(1− s)

πeiπ(1−s)/2t−s/β Γ(s/β)

βe−iπs/2

= sin(πs)Γ(1 − s)

πt−s/β Γ(s/β)

β=

Γ(s/β)

βΓ(s)t−s/β ,

where formula (9.10) was used. By continuity, this extends to all positive s includ-ing s ∈ N.

(ii) The first formula in (8.3) is obtained from (8.2) by scaling (2.77). Thesecond formula in (8.3) is obtained by changing the integration variable to x = y−β .

Let us prove the second formula in (8.3). By expanding the exponents on ther.h.s. of (8.2) into a power series, we are led to show that

Eβ(s) =∞∑

n=0

sn

n!

∫ ∞

0

y−βnGβ(1, y) dy. (8.4)

But by (8.1), we find

1

n!

∫ ∞

0

y−βnGβ(1, y) dy =Γ(n)

n!βΓ(βn)=

1

Γ(1 + βn),

so that we get precisely the defining series for E.

Since the series representation (8.4) of the analytic function Eβ(s) can be dif-ferentiated termwise, and since the terms of the corresponding series are positive,the order of summation and integration can be changed in (8.4) as well as in itsderivatives. This justifies a differentiation under the integrals in (8.2) or (8.3). �

Remark 119. The proof implies that the integrals in (8.2) are well defined, whichis a strong requirement on the behaviour of Gβ near zero. Moreover, differentiatingunder the integrals in (8.2) or (8.3) increases the singularity at one side of R+,but it always remains integrable.

Page 458: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

446 Chapter 8. Generalized Fractional Differential Equations

Formula (8.2) is of key importance, since it allows for a definition of theMittag-Leffler function E(A) for any operator A that generates a strongly contin-uous semigroup Tt in a Banach space B via the formula

Eβ(A) =1

β

∫ ∞

0

eAxx−1−1/βGβ(1, x−1/β) dx =

1

β

∫ ∞

0

Txx−1−1/βGβ(1, x

−1/β) dx,

(8.5)and its derivative by

E′β(A) =

1

β

∫ ∞

0

eAxx−1/βGβ(1, x−1/β) dx =

1

β

∫ ∞

0

Txx−1/βGβ(1, x

−1/β) dx.

(8.6)

Since the solutions to linear fractional equations are expressed in terms ofEβ(At

β), let us derive the most concise integral representations for these expres-sions. Namely, changing variables in (8.5), (8.6) yields the equations

Eβ(Atβ) =

t

β

∫ ∞

0

eAyGβ(y, t)dy

y, (8.7)

E′β(At

β) =t1−β

β

∫ ∞

0

eAyGβ(y, t) dy. (8.8)

8.2 Linear evolution

The main result for linear fractional equations with time-independent generatorsis as follows:

Theorem 8.2.1. Let β ∈ (0, 1) and let A be a generator of a strongly continuoussemigroup Tt in a Banach space B with the domain D(A). Let Y ∈ D(A), and letbt be a continuous curve in B such that bt ∈ D(A) for any t and the norms ‖Abt‖are bounded on compact intervals of t. Then the fractional linear Cauchy problem(2.174), i.e.,

Dβa+∗μt = Aμt + bt, μa = Y, t ≥ a, (8.9)

has a unique solution, which is given by (2.176). Namely, the solution reads

μt = Eβ(A(t− a)β)Y + β

∫ t

a

(t− s)β−1E′β(A(t− s)β)bs ds, (8.10)

where Eβ, E′β are given by (8.5), (8.6). This function is also the unique solution

to the fractional integral equation

μt = Y +1

Γ(β)

∫ t

a

(t− s)β−1(Aμs + bs) ds = Y + Iβ(Aμ. + b.)(t). (8.11)

Finally, if Tt has the growth type m0 (see the definition before Theorem 4.2.1), sothat ‖Tt‖ ≤ Memt with some m > m0 and M , then the norms of μt satisfy the

Page 459: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.2. Linear evolution 447

estimate

‖μt‖B ≤ MEβ(m(t−a)β)‖Y ‖B+Mβ

∫ t

a

(t−s)β−1E′β(m(t−s)β)‖bs‖B ds. (8.12)

Proof. Let Aλ be the Yosida approximation of A, see (4.9). By Proposition 2.13.1,the Cauchy problem

Dβa+∗μt = Aλμt + bt, μa = Y, t ≥ a, (8.13)

is equivalent to the integral equation

μt = Y +1

Γ(β)

∫ t

a

(t− s)β−1(Aλμs + bs) ds = Y + Iβ(Aλμ. + b.)(t), (8.14)

and its unique solution is given by the formula

μλt = Eβ(Aλ(t− a)β)Y + β

∫ t

a

(t− s)β−1E′β(Aλ(t− s)β)bs ds, (8.15)

where the Mittag-Leffler function can be defined both by its series representationand by (8.5) due to the boundedness of Aλ.

By Proposition 4.2.3, the semigroups generated by Aλ converge to Tt. By theprinciple of uniform boundedness (see Theorem 1.6.1), this implies that the norms‖T λ

t ‖ are uniformly bounded for all λ and t from compact intervals. Consequently,by the dominated convergence and Proposition 4.2.3, the μλ

t converge to μt givenby (8.10), as λ → ∞, uniformly on compact intervals of t. This holds for any Yand bt continuous in B. Using now that Y ∈ D(A) and bt ∈ D(A), we can concludethat Aλμ

λt → Aμt. Therefore, passing to the limit in equation (8.14) yields (8.11).

This implies (8.9) by Proposition 1.8.4.

It remains to show the uniqueness. This will be done similarly to Theorem5.10.3. For that purpose, assume that some continuous curve ft in B is such thatft ∈ D(A) and satisfies the equation Dβ

a+∗ft = Aft + bt with f0 = Y . We shallshow that this ft is necessarily the limit of μλ

t (and hence coincides with μt). Infact, from the equations for ft and μλ

t , we get the equation

Dβa+∗(μ

λt − ft) = Aλμ

λt −Aft = Aλ(μ

λt − ft) + (Aλ −A)ft.

By (8.15), this implies

μλt − ft = β

∫ t

a

E′β(Aλ(t− s)β)(Aλ −A)fs ds.

But this tends to zero by the dominated convergence, since (Aλ − A)fs → 0 andthe norms ‖Afs‖ and ‖Aλfs‖ are uniformly bounded.

Finally, the estimate (8.12) is a direct consequence of (8.12) and the inequal-ity ‖Tt‖ ≤ Memt. �

Page 460: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

448 Chapter 8. Generalized Fractional Differential Equations

With Theorem 8.2.1, it becomes possible to get well-posedness and an in-tegral representation for the solutions to the equations (8.9) once the theory ofthe corresponding usual evolutionary equation μt = Aμt is developed. As basicexamples and consequences of Theorem 8.2.1, we can indicate the following cases:

(i) The general fractional Schrodinger equation, see (4.21):

Dβa+∗ψt = σHψt, (8.16)

where H is a self-adjoint operator in a Hilbert space H, and either σ = −i orH is bounded from above and σ is any complex number with a non-negative realpart. In both cases, Theorem 8.2.1 applies and ensures the well-posedness in H.Moreover, Theorem 8.2.1 applies to the regularized fractional Schrodinger equationor complex fractional diffusion of the type

Dβa+∗ft =

1

2σΔft + (b(x),∇)ft(x) + V (x)ft(x), f0 = f, (8.17)

considered in the Banach space C∞(Rd), where σ is a complex constant with apositive real part, with V and all bj bounded continuous complex-valued functions(see Proposition 4.8.3), or more generally for V, b from Theorem 4.8.1.

(ii) The general fractional Feller evolution, which is given by equation (8.9),where A is the generator of an arbitrary Feller semigroup, e.g., a diffusion or amore general Levy–Khintchin-type operator from Theorem 5.10.4. In this case,(8.10) reveals that evolution operators Y → μt given by (8.10) with vanishing bthave the following additional property: if 0 ≤ Y ≤ 1, then if 0 ≤ μt ≤ 1. Moreover,if the Feller semigroup generated by Aλ is conservative, i.e., it preserves constants,then the mapping Y �→ μt is also conservative.

(iii) Fractional evolutions generated by ΨDOs with symbols that do not de-pend on positions:

Dβa+∗ft = −ψ(−i∇)ft + gt, f |t=a = fa, (8.18)

with β ∈ (0, 1), under various assumptions on the symbols ψ(p), as described inTheorems 2.4.1, 4.4.1, 4.5.1 or 4.5.2. In this case, the general formula (8.10) takesthe more concrete form (2.198):

ft(x) =1

β

∫ ∞

0

dy

∫Rd

Gψy(t−a)β

(x− z)y−1−1/βGβ(1, y−1/β)fa(z) dz (8.19)

+

∫ t

a

ds

∫ ∞

0

dy

∫Rd

(t− s)β−1Gψy(t−s)β

(x− z)y−1/βGβ(1, y−1/β)gs(z) dz.

Exercise 8.2.1. Derive the well-posedness of the problems (8.18) for homogeneoussymbols ψ or their mixtures by using the Fourier transform that leads to (8.19)and by-passing the general Theorem 8.2.1.

Page 461: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.2. Linear evolution 449

(iv) The equations (8.9) with A being a fractional Laplacian with variablescale or a parabolic polynomial perturbed by local or nonlocal terms of lower order,see the corollary to Proposition 5.9.1 as well as Theorems 4.14.2 and 5.9.2.

For Y, bt ∈ D, the solutions to the equations (8.9) given by (8.10) belong tothe domain of A at all times, and they are classical in this sense. In analogy to thecase of usual linear evolutions (see the discussion prior to Proposition 5.9.2), onecan introduce the natural notion of generalized solutions to the equations (8.9).For instance, let us say that a continuous curve μt, t ≥ a, in B is a generalizedsolution by approximation to the Cauchy problem (8.9), if it satisfies the initialcondition μa = Y , and if there exists a sequence of elements μn ∈ D and curves bntin D such that μn → Y and bnt → bt, as n → ∞, and the corresponding (classical,i.e., belonging to the domain) solutions μn

t , given by (8.10) with μn instead of Yand bnt instead of bt, converge to μt, as n → ∞. The following assertion is a directconsequence of Theorem 8.2.1 (and its proof):

Proposition 8.2.1. Under the assumptions of Theorem 8.2.1, formula (8.10) sup-plies the unique generalized solution by approximation to the Cauchy problem (8.9)for any Y ∈ B and any bounded measurable curve bt ∈ B.

As one can expect, if the semigroup generated by an operator A has someregularization property, then the same holds for solutions to the problem (8.9). Asit turns out, the solution to (8.9) has even better regularization properties, sinceit ‘spreads’ the singularity by the integration. Let us formulate the precise result:

Theorem 8.2.2. Let B ⊃ B be two Banach spaces with ‖.‖B ≥ ‖.‖B. Let β ∈ (0, 1),and let A be a generator of a strongly continuous semigroup Tt in B. Let it beregularizing so that Tt takes B to B and

‖Ttf‖B ≤ κt−ω‖f‖B, t ≤ S, (8.20)

with constants ω ∈ (0, 1), κ > 0. Then the mapping (Y, {bt}) �→ μt(Y ) given byformula (8.10) (which by Proposition 8.2.1 yields the generalized solution to theproblem (8.9)) is also regularizing. More precisely, if S = ∞, then

‖μt(Y )‖B ≤ κ

β(t− a)β(1−ω)−1‖Y ‖B

∫ ∞

0

x−ω−1−1/βGβ(1, x−1/β) dx (8.21)

+ κ

∫ t

a

(t− s)−βω‖bs‖B ds

∫ ∞

0

x−ω−1/βGβ(1, x−1/β) dx

for all t > a (where the integrals are finite), and if S is finite, then

‖μt(Y )‖B ≤ κ(t− a)−ωβ‖Y ‖B + κ

∫ t

a

(t− s)β(1−ω)−1‖bs‖B ds, t ≤ S, (8.22)

with a constant κ depending on κ, S and the growth type m0 of Tt.

Page 462: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

450 Chapter 8. Generalized Fractional Differential Equations

Proof. Using (8.5) and (8.10) together with the estimates

‖eA(t−a)βxY ‖B ≤ κ(t− a)−βωx−ω‖Y ‖B,‖eA(t−s)βxbs‖B ≤ κ(t− s)−βωx−ω‖bs‖B,

yields (8.21) in the case of S = ∞. The first integral in (8.21) is finite, since thesingularity of the integrand at zero is of order x−ω.

If S is finite, then for any m > m0 there exists M such that

‖Ttf‖B ≤ κS−ωMem(t−S)‖f‖B, t ≥ S. (8.23)

Plugging this estimate into (8.10) yields (8.22). �

In earlier chapters, you will find many examples for the smoothing property(8.20) for B = C∞(Rd) or B = L1(R

d) and B = Ck∞(Rd) orHk1 (R

d), respectively,with some k ∈ N, see, e.g., (4.24) or (4.26), and Theorems 4.4.1 and 5.8.3.

8.3 The fractional HJB equation and relatedequations with smoothing generators

In this section, we shall analyse the equation

Dβa+�ft = Aft +H

(x,

∂ft∂x

, ft

), (8.24)

where A is the generator of a strongly continuous semigroup etA in C∞(Rd) sat-isfying (6.2), i.e.,

‖etAf‖C1(Rd) ≤ κt−ω‖f‖C(Rd), (8.25)

for t ∈ (0, S] with some finite or infinite S. Dβa+� is the Caputo derivative of order

β ∈ (0, 1), and H is a function that is Lipschitz-continuous in its three variables.

A basic example comes from stochastic control theory, where A is the gener-ator of a Feller process (say, a diffusion or a stable or stable-like process) and theHamiltonian function H has the form (2.123). The corresponding equation (6.1)was initially derived in [160], where it describes the optimal cost for controlledprocesses governed by scaling limits of continuous-time random walks.

Similarly, one can deal with the fractional versions of higher-order PDEs orΨDEs as analysed in Section 6.2, i.e., equations of the type

Dβa+�ft = −σ(x)|Δ|α/2ft +H

(x,

{∂mft

∂xi1 · · · ∂xim

}, ft

), (8.26)

where the notations are explained after equation (6.27), or even more generalequations with σ varying in space or general parabolic differential operators instead

Page 463: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.3. The fractional HJB equation and related equations . . . 451

of |Δ|α/2. In order to keep all formulae short, we shall stick to equations of thetype (8.24).

According to Theorem 8.2.1, if a sufficiently regular ft solves (8.24) with theinitial condition f0 = Y , then it satisfies the integral equation

ft = Eβ(A(t− a)β)Y + β

∫ t

a

(t− s)β−1E′β(A(t− s)β)H

(.,∂fs∂x

, fs

)ds, (8.27)

or equivalently

ft =1

β

∫ ∞

0

exp{A(t− a)βy}y−1−1/βGβ(1, y−1/β) dy Y (8.28)

+

∫ t

a

∫ ∞

0

(t− s)β−1 exp{A(t− s)βy}y−1/βGβ(1, y−1/β) dyH

(.,∂fs∂x

, fs

)ds.

Therefore, this equation can be naturally referred to as the mild equation or themild form of (8.24). In other words, ft is a solution to the mild equation (8.28), ifit is a fixed point of the mapping Φt:

[ΦY (f.)](t) =1

β

∫ ∞

0

exp{A(t− a)βy}y−1−1/βGβ(1, y−1/β) dy Y (8.29)

+

∫ t

a

∫ ∞

0

(t− s)β−1 exp{A(t− s)βy}y−1/βGβ(1, y−1/β) dyH

(.,∂fs∂x

, fs

)ds.

The theory will now be developed in full analogy with Theorems 6.1.1 and 6.1.2.The only difference is that it is not sufficient to have local bounds of the type (6.5),since the formula for the Mittag-Leffler function includes an integration over alltimes. Therefore, the assumptions must be modified in order to take into accountthe growth type of eAt.

Theorem 8.3.1. Let A be an operator in C∞(Rd) generating a strongly continuoussemigroup etA in C∞(Rd) such that etA is also a strongly continuous semigroupin C1

∞(Rd) satisfying (8.25) and

‖etA‖C∞(Rd)→C∞(Rd) ≤ MCemCt, ‖etA‖C1∞(Rd)→C1∞(Rd) ≤ MDemDt, (8.30)

with constants MC ,MD,mC ,mD and for all t. Let H(x, p, q) be a continuous func-tion on Rd ×Rd ×R such that h = supx |H(x, 0, 0)| < ∞, and let the Lipschitzcontinuity property (6.6) hold. Then for any Y ∈ C1

∞(Rd) there exists a unique so-lution f. ∈ C([0, T ], C1

∞(Rd)) to the mild equation (8.28). Moreover, the solutionsft(Y1) and ft(Y2) with different initial data Y1, Y2 satisfy the estimate

‖ft(Y1)− ft(Y2)‖C1(Rd) ≤ C‖Y1 − Y2‖C1(Rd), (8.31)

with the constant C depending continuously on t, κ, ω, β, LH , MC, MD, mC

and mD.

Page 464: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

452 Chapter 8. Generalized Fractional Differential Equations

Proof. By the same argument as in Theorem 6.1.1, one shows with the help of(8.21) or (8.10) that the mapping ΦY given by (8.29) is a well-defined mappingC([a, t], C1∞(Rd)) → CY ([a, t], C

1∞(Rd)) for any t > a that takes bounded subsetsto bounded subsets. (The notations given prior to Theorem 2.1.1 are used.) Moreprecisely,

‖ΦY (f.)](t)‖C1(Rd) ≤ MD‖Y ‖C1(Rd)Eβ(mD(t− a)β)

+ κ

∫ t

a

(t− s)β(1−ω)−1(h+ ‖fs‖C1(Rd)) ds.(8.32)

Moreover, by (8.5) and (8.10), we have

‖[ΦY1(f.)](t) − [ΦY2(f.)](t)‖C1∞(Rd) ≤ MDEβ(mDtβ)‖Y1 − Y2‖C1(Rd).

By (8.21) or (8.10), we find

‖[ΦY (f1. )](t)− [ΦY (f

2. )](t)‖C1(Rd) ≤ κLH

∫ t

a

(t− s)β(1−ω)−1‖f1s − f2

s ‖C1∞(Rd) ds,

uniformly for t from any compact interval. Therefore, we have the estimate (2.5)of Theorem 2.1.3, and applying this theorem completes the proof. �

By similarly modifying the proof of Theorem 6.1.2, we obtain the followingresult on the continuous dependence of solutions to fractional HJB equations ona parameter:

Theorem 8.3.2. Let Hα(x, p, q) be a family of Hamiltonians depending on a param-eter α taken from an auxiliary Banach space Bpar. Suppose that each Hα satisfiesall assumptions of Theorem 8.3.1 with all bounds uniform in α. Moreover, supposethat H is Lipschitz-continuous in α, in the sense that

|Hα1(x, p, q) −Hα2(x, p, q)| ≤ ‖α1 − α2‖LparH (1 + |p|+ |q|), (8.33)

with a constant LparH . Then the solutions ft(Y, α1) and ft(Y, α2) to (8.28) (built in

Theorem 8.3.1) with different parameter values satisfy the estimate

sups∈[a,t]

‖fs(Y, α1)− fs(Y, α2)‖C1(Rd) ≤ LparH K‖α1 − α2‖(1 + ‖Y ‖C1(Rd)), (8.34)

where the constant K depends continuously on t, ω,κ, βh, LH ,MC ,MD,mC andmD.

In full analogy to Theorem 6.1.5, one can prove the sensitivity (i.e., smoothdependence) of solutions to the fractional HJB equation 8.24 with respect to aparameter or to initial values. Similar to Theorem 6.1.3, one can prove additionalregularity of the solutions.

Page 465: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.4. Generalized fractional integration and differentiation 453

8.4 Generalized fractional integrationand differentiation

The fractional derivative dβfdxβ , β ∈ (0, 1), was suggested as a substitute to the usual

derivative dfdx , which can model some kind of memory by taking into account the

past values of f . An obvious extension that is widely used in the literature arevarious mixtures of such derivatives, both discrete and continuous,

N∑j=1

ajdβjf

dxβj,

∫ 1

0

dβf

dxβμ(dβ). (8.35)

In order to take this idea further, one can observe that dβfdxβ represents a weighted

sum of the increments of f , f(x − y) − f(x), from various past values of f tothe ‘present value’ at x. From this point of view, the natural class of generalizedmixed fractional derivatives is represented by the causal integral operators (alreadydiscussed in Section 5.11)

L′νf(x) =

∫ ∞

0

(f(x− y)− f(x))ν(dy), (8.36)

with some positive measure ν on {y : y > 0} satisfying the one-sided Levy condition(5.143): ∫ ∞

0

min(1, y)ν(y)dy < ∞. (8.37)

This condition ensures that Lν is well defined at least for the set of boundedinfinitely smooth functions on {y : y ≥ 0}. The dual operators to Lν are given bythe anticipating integral operators, i.e., weighted sums of the increments from the‘present’ to any point ‘in the future’:

Lνf(x) =

∫ ∞

0

(f(x+ y)− f(x))ν(dy). (8.38)

Of course, one can weight the points in the past or the future differently,depending on the present position. Also, one can add a local part for completingthe picture, which leads to the operators

Llν,bf(x) =

∫ ∞

0

(f(x− y)− f(x))ν(x, dy) + b(x)df

dx, (8.39)

with a non-positive drift b(x) and a transition kernel ν(x, .) such that∫min(1, y)ν(x, dy) < ∞.

Page 466: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

454 Chapter 8. Generalized Fractional Differential Equations

These operators fully capture the idea of ‘weighting the past’ and can be calledone-sided, namely left-sided or causal, operators of order at most one. Similarly,one can define the right-sided or anticipating operators of order at most one as

Lrν,bf(x) =

∫ ∞

0

(f(x+ y)− f(x))ν(x, dy) − b(x)df

dx. (8.40)

Remark 120. Notice that Ll and Lr are dual only if ν and b do not depend on x.

General operators of order at most one, are given by linear combinationsof one-sided operators, and their semigroups were systematically studied in [147],[148] (see also Section 5.14). The theory of the corresponding fractional differentialequations was built in [106]. For the sake of simplicity, let us stick to the mixedderivatives (8.38) and (8.36) and use the following notations:

D(ν)+ = −L′

νf(x) = −∫ ∞

0

(f(x− y)− f(x))ν(dy),

D(ν)− = −Lνf(x) = −

∫ ∞

0

(f(x+ y)− f(x))ν(dy).

(With some abuse of notations, if ν has a density, we shall denote this densityagain by ν.) The minus sign was introduced in order to comply with the standardnotation for fractional derivatives, so that, e.g.,

dxβf(x) = Dβ

−∞+ = D(ν)+

with ν(y) = −1/[Γ(−β)y1+β], because (see (1.111))

dxβf(x) = Dβ

−∞+f(x) =1

Γ(−β)

∫ ∞

0

f(x− y)− f(x)

y1+βdy

and Γ(−β) < 0.

The symbols of the ΨDOs D(ν)+ and D

(ν)− are −ψν(−p) and −ψν(p), respec-

tively, where

ψν(p) =

∫(eipy − 1)ν(dy)

is the symbol of the operator Lν .

If ν is finite, then the operators D(ν)+ are bounded, which is not the case

for the derivatives. Therefore, the proper extensions of the derivatives represent

only those operators D(ν)+ arising from infinite measures ν that satisfy (8.37). The

operators arising from a finite ν can better be considered analogues of the finitedifferences that approximate the derivatives (see Proposition 5.13.1(ii)).

The operators D(ν)± are an extension of the fractional derivatives Dβ

−∞+ and

Dβ∞−, often referred to as the fractional derivatives in generator form . When

Page 467: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.4. Generalized fractional integration and differentiation 455

looking for the corresponding extensions of the operators Dβa± and Dβ

a±∗ with afinite a, we note that, by Proposition 1.8.2 (and the symmetric property of the right

derivatives),Dβa+∗ (respectivelyD

βa−∗) is obtained fromDβ

−∞+ (respectivelyDβ∞−)

by restricting its action on the subspace C1([a,∞)) (respectively C1((−∞, a]).Therefore, the analogues of the Caputo derivatives should be defined as

D(ν)a+∗ = −

∫ x−a

0

(f(x− y)− f(x))ν(dy) −∫ ∞

x−a

(f(a)− f(x))ν(dy),

D(ν)a−∗ = −

∫ a−x

0

(f(x+ y)− f(x))ν(dy) −∫ ∞

a−x

(f(a)− f(x))ν(dy).

(8.41)

Let us denote by Ckkill(a)([a,∞)) and Ck

kill(a)((−∞, a]) the subspaces of

Ck([a,∞)) and Ck((−∞, a]), respectively, consisting of functions that vanish to

the right or to the left of a. Again by Proposition 1.8.2, the operators Dβa+ or

Dβa−, the analogues of the Riemann–Liouville derivatives, are obtained by fur-

ther restricting the actions of Dβ−∞+ and Dβ

∞− to the spaces C1kill(a)([a,∞)) and

C1kill(a)((−∞, a]):

D(ν)a+ = −

∫ x−a

0

(f(x − y)− f(x))ν(y)dy +

∫ ∞

x−a

f(x)ν(y)dy,

D(ν)a− = −

∫ a−x

0

(f(x + y)− f(x))ν(y)dy +

∫ ∞

a−x

f(x)ν(y)dy.

(8.42)

In order to see what the proper analogue of the fractional integral couldbe, notice that, according to (1.184), the fundamental solution (that vanishes on

the negative half-line) to the fractional derivative dβ

dxβ is Uβ(x) = xβ−1+ /Γ(β).

Therefore, the usual fractional integral

Iβa f(x) =1

Γ(β)

∫ x

a

(x− y)β−1f(y) dy =1

Γ(β)

∫ x−a

0

zβ−1f(x− z) dz (8.43)

is nothing but the potential operator of the semigroup generated by − dβ

dxβ , or, inother words, the integral operator with the kernel being the fundamental solution

to − dβ

dxβ (or, yet in other words, the convolution with this fundamental solution),restricted to the space Ckill(a)([a,∞)).

By Proposition 5.12.2, the potential measure U (ν)(dy) represents the uniquefundamental solution to the operator L′

ν , vanishing on the negative half-line.Therefore, the analogue of the fractional integral Iβa for such ν should be thepotential operator of the semigroup T ′

t generated by L′ν , i.e., the convolution with

U (ν)(dy) restricted to the space Ckill(a)([a,∞)):

I(ν)a f(x) =

∫ x−a

0

f(x− z)U (ν)(dz). (8.44)

Page 468: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

456 Chapter 8. Generalized Fractional Differential Equations

The following result corroborates this identification by showing the analogy withProposition 1.8.4 reduced to the case β ∈ (0, 1) and vanishing initial conditions,and extending it into various directions.

Proposition 8.4.1.

(i) Let the measure ν on {y : y > 0} satisfy (5.143). For any generalized functiong ∈ D′(R) supported on the half-line [a,∞) with any a ∈ R, and for any

λ ≥ 0, the convolution U(ν)λ �g with the λ-potential measure (5.154) is a well-

defined element of D′(R), which is also supported on [a,∞). This convolutionrepresents the unique solution (in the sense of generalized function) to theequation (λ − L′

ν)f = g, or equivalently

D(ν)+ f = −λf + g,

supported on [a,∞).

(ii) If λ > 0, and if g ∈ C∞(R) and is supported on the half-line [a,∞), i.e.,g ∈ Ckill(a)([a,∞)) ∩ C∞(R), then

f(x) = (U(ν)λ � g)(x) = R′

λg(x) =

∫ ∞

−∞g(x− y)U

(ν)λ (dy)

=

∫ x−a

0

g(x− y)

∫ ∞

0

e−λtG(ν)(t, dy) dt

(8.45)

belongs to the domain of the operator L′ν and therefore represents the classical

solution to the equation (λ− L′ν)f = g, or equivalently

D(ν)+ f = D

(ν)a+f = D

(ν)a+∗f = −λf + g. (8.46)

(iii) If λ = 0, then the potential U (ν) defines an unbounded operator in C∞(R)that does not fully fit into the framework of Proposition 4.1.4. However, ifreduced to the space Ckill(a)([a, b]) of continuous functions on [a, b] vanishingat a (this space is invariant under T ′

t and hence under all R′λ), the potential

operator R′0 with the kernel U (ν) becomes bounded, and therefore

(U (ν) � g)(x) = R′0g(x) = I(ν)a g(x) (8.47)

belongs to the domain of L′ν and represents the classical solution to the equa-

tion−L′

νf = D(ν)+ f = D

(ν)a+f = D

(ν)a+∗f = g (8.48)

on Ckill(a)([a, b]).

Proof. (i) By Propositions 1.11.1 and 1.9.2, the convolution U(ν)λ �g is well defined

and solves the equation (λ − L′ν)f = g. The uniqueness follows as in Proposition

5.12.2.

Page 469: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.4. Generalized fractional integration and differentiation 457

(ii) Since L′ν generates a semigroup Tt from (5.149), which preserves the

spaces C∞([a,∞)) and Ckill(a)([a,∞))∩C∞([a,∞)) (see Remark 99), these spacesare also invariant under the resolvent R′

λ = (λ − L′ν)

−1. By Theorem 4.1.1(v),the image of the resolvent always coincides with the domain of the generator.Therefore, R′

λg belongs to the intersection of Ckill(a)([a,∞))∩C∞([a,∞)) and the

domain of D(ν)+ .

(iii) The potential operator

R′0g(x) = (U (ν) � g)(x) =

∫ x−a

0

g(x− y)U (ν)(dy)

is bounded on Ckill(a)([a, b]) by Proposition 5.12.1. Therefore, Proposition 4.1.4applies to this space. �

In particular, applying (8.45) to L′ν = − dβ

dxβ and a comparison with (8.10)yield

βzβ−1E′β(−λzβ) =

∫ ∞

0

e−λtGβ(t, z) dt = Uβλ (z), (8.49)

which is equivalent to (8.8) and therefore gives another proof of this formula, andhence also of (8.2).

Unlike the case of usual fractional derivatives, for general ν the classicalinterpretation of the solutionR′

λg(x) is more subtle for g ∈ C([a,∞)) not vanishingat a. Moreover, one must carefully distinguish the cases when g is extended to theleft of a as g(a), which we denote by g, or as zero, which we denote by g0. By thecorollary to Proposition 5.12.1, if ν is not finite, then R′

0g is continuous at zeroeven if g ∈ C([a, b]) does not vanish at a. Still, it does not belong to the domain ofL′ν. However, it may well belong to the domain locally, outside the boundary point

a, see Proposition 5.11.4 and the follow-up discussion. In fact, the requirement forthe solution to belong to the domain outside a boundary point is common forclassical problems of PDEs. The following assertion gives a concrete illustrationof this point.

Proposition 8.4.2. Under the assumptions of Proposition 8.4.1, let the potentialmeasure U (ν)(dy) have a continuous density, U (ν)(y), with respect to the Lebesguemeasure. Let g ∈ C1[a, b].

Then the function f(x) = R′0g0(x) belongs to Ckill(a)([a, b]) and is continu-

ously differentiable in (a, b]. Therefore, by Proposition 5.11.4, it satisfies the equa-

tion D(ν)a+∗f = g0 locally, at all points from the interval (a, b].

Proof. From the formula for R′0g0(x), it follows that

d

dxR′

0g0(x) =

∫ x−a

0

d

dxg(x− y)U (ν)(y) dy + g(a)U (ν)(x− a),

Page 470: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

458 Chapter 8. Generalized Fractional Differential Equations

which is well defined and continuous for x ≥ a. The limit from the right ofddxR

′0g0(x) as x → a is g(a)U (ν)(0), which may cause a jump when this func-

tion crosses the value x = a. �

As mentioned before, the image of the resolvent coincides with the domainof the generator. This implies that the function (8.45) belongs to the domain ofL′ν, restricted to Ckill(a)([a,∞)), whenever g ∈ Ckill(a)([a,∞)). For any other g,

our generalized solution was defined in the sense of a generalized function (whichreflects the notion of generalized solutions via duality). As usual (see the discus-sion prior to Proposition 5.9.2), one can also introduce the notion of generalizedsolutions with the help of approximations. Namely, for a measurable boundedfunction g(x) on [a,∞), a continuous curve f(x), t ≥ a, is the generalized solution

via approximation to the problem D(ν)+ f = −λf + g on C([a, b]), if there exists

a sequence of curves gn(.) ∈ Ckill(a)([a, b]) such that gn → g almost surely, asn → ∞, and the corresponding classical (i.e., belonging to the domain) solutionsfn(x), given by (8.45) with gn(x) instead of g(x), converge pointwise to f(t), asn → ∞.

The following assertion is a consequence of Proposition 8.4.1.

Proposition 8.4.3. For any measurable bounded function b(x) on [a,∞), the for-mula (8.45) (respectively (8.47)) supplies the unique generalized solution by ap-proximation to the problem (8.46) (respectively (8.48)) on [a, b] for any b > a.

8.5 Generalized fractional linear equations, part I

In this section, we will analyse linear equations with a non-vanishing boundaryvalue at a by extending Proposition 8.4.1.

Proposition 8.5.1. Let a non-negative measure ν on {y : y > 0} satisfy (8.37).

(i) If g is a generalized function (from S′(R) or D′(R)) vanishing to the left ofa, then

f(x) = Y +I(ν)a g(x) = Y +

∫ x−a

0

g(x−y)U (ν)(dy) = Y +(g�U (ν))(x) (8.50)

is the unique solution (from S′(R) or D′(R), respectively) to the equation

g = D(ν)a+∗f = D

(ν)+ f (8.51)

that equals the constant Y to the left of a.

(ii) If g ∈ Ckill(a)([a, b]), then f from (8.50) belongs to the domain of the generatorof the semigroup Tt defined either on the space Cuc((−∞, b]) of uniformlycontinuous functions on (−∞, b] or on its subspace C([a, b]) of functions thatare constants to the left of a < b. In this case, f is the classical solution toequation (8.51) that equals Y to the left of a.

Page 471: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.5. Generalized fractional linear equations, part I 459

(iii) If g ∈ C([a, b]) (and is extended to the left of a by zero) and ν is not finite,then f ∈ C(−∞, b] for any b > a and therefore takes the initial conditionf(a) = Y in the classical sense.

Remark 121. If ν is finite and g(a) = 0, then f has a discontinuity at a, since inthis case the limit of f from the right at a equals Y + g(a)/‖ν‖, see the corollaryto Proposition 5.12.1.

Proof. (i) By Propositions 5.12.2 and 1.11.1, for any g ∈ S′(R) supported on

[a,∞), the function I(ν)a g(x) is the unique solution to the equation g = D

(ν)a+∗f , in

the sense of generalized functions, up to an additive constant. Therefore, addingY fixes the initial condition in a unique way.

(ii) As in Proposition 8.4.1, this follows from the fact that the image of thepotential operator, when it is bounded, coincides with the domain.

(iii) This follows from the corollary to Proposition 5.12.1. �

Let us now look at equations with g being continued to the left of a as g(a).

Proposition 8.5.2. Let the measure ν on {y : y > 0} satisfy (8.37) and let λ > 0.

(i) For any g ∈ C∞[a,∞), considered as an element of Cuc(R) by extending itto the left of a by the constant g(a), the function

f(x) =

∫ ∞

0

g(x− y)U(ν)λ (dy) = (g � U

(ν)λ )(x)

=

∫ x−a

0

g(x− y)U(ν)λ (dy) + g(a)

∫ ∞

x−a

U(ν)λ (dy)

(8.52)

is the unique solution to the equation

D(ν)a+∗f(x) = −λf(x) + g(x) (8.53)

in the domain of the generator of the semigroup Tt on Cuc(R). This functionequals g(a)/λ to the left of a.

(ii) For any g ∈ S′(R) that is constant to the left of a, the generalized function

g � U(ν)λ is a well-defined element of S′(R), and it represents the unique

solution to equation (8.53) in the sense of generalized functions.

Proof. (i) By (8.52), f = R′λg is obtained by applying the resolvent to g. Therefore,

it belongs to the domain of Lν and solves the equation (λ − L′ν)f = g. (ii) This

follows from Proposition 5.12.2(i). �

As above, one can also interpret the formula (8.52) in the sense of general-ized solutions by approximation. However, function (8.52) is not the solution thatwe are mostly interested in, since it prescribes the boundary value at a rather

Page 472: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

460 Chapter 8. Generalized Fractional Differential Equations

than solving the boundary-value problem. The most straightforward way to dealproperly with the problem

D(ν)a+∗f(x) = −λf(x) + g(x), f(a) = Y, x ≥ a, (8.54)

is to turn it into a problem with vanishing boundary value, which is a commontrick in the theory of PDEs. Namely, by introducing the new unknown functionu = f − Y , we see that u must solve the problem

D(ν)a+u(x) = −λu(x)− λY + g(x), u(a) = 0, x ≥ a, (8.55)

just with g − λY instead of g. We can therefore define the solution to (8.54) tobe the function f = u + Y , where u solves (8.55). For the sake of clarity, let usemphasize that in (8.55) the r.h.s. g(x)−λY is considered to be continued as zero

to the left of a. This definition also complies with one of the definitions of D(ν)a+∗,

given by D(ν)a+∗f = D

(ν)a+(f − f(a)) (arising from (1.109)).

Taking first g = 0, we find the solution to (8.54) to be

f(x) = Y + u(x)

= Y − λY

∫ x−a

0

∫ ∞

0

e−λtG(ν)(t, dy) dt

= λY

∫ ∞

0

e−λt

(∫ ∞

x−a

G(ν)(t, dy)

)dt.

(8.56)

Integrating by parts leads to an alternative expression for x > a:

f(x) = Y

∫ ∞

0

e−λt ∂

∂t

(∫ ∞

x−a

G(ν)(t, dy)

)dt.

Restoring g, we arrive at the following result:

Proposition 8.5.3. For any g supported on [a,∞), the unique solution to the prob-lem (8.54) in the sense defined above is given by the formula

f(x) = Y

∫ ∞

0

e−λt ∂

∂t

(∫ ∞

x−a

G(ν)(t, dy)

)dt

+

∫ x−a

0

g(x− y)

∫ ∞

0

e−λtG(ν)(t, dy) dt

(8.57)

This solution can be classified as classical (from the domain of the generator) orgeneralized (in the sense of generalized functions or by approximation) accordingto Proposition 8.4.1 applied to the problem (8.55).

Since for L′ν = − dβ

dxβ , the coefficient at Y for x− a = 1 is the Mittag-Lefflerfunction Eβ(−λ) of index β, one can define the analogue of the Mittag-Leffler

Page 473: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.5. Generalized fractional linear equations, part I 461

function for arbitrary ν as

E(ν)(−λ) =

∫ ∞

0

e−λt ∂

∂t

(∫ ∞

1

G(ν)(t, dy)

)dt

= λ

∫ ∞

0

e−λt

(∫ ∞

1

G(ν)(t, dy)

)dt

= 1− λ

∫ ∞

0

e−λt

(∫ 1

0

G(ν)(t, dy)

)dt.

(8.58)

By Proposition 5.11.3(ii), the function∫∞x−a

G(ν)(t, dy) increases with t. Therefore,its derivative is well defined as a positive measure (and as a function almost ev-erywhere), which makes the function E(ν)(−λ) a completely monotone function ofλ. This function is well defined and continuous for Reλ ≥ 0, since it is boundedthere by 1:

|E(ν)(−λ)| ≤∫ ∞

0

∂t

(∫ ∞

1

G(ν)(t, dy)

)dt =

(∫ ∞

1

G(ν)(t, dy)

)∣∣∣∣∞

0

= 1. (8.59)

Moreover, we have E(ν)(0) = 1.

In fact, one can define the family of these Mittag-Leffler functions dependingon the positive parameter z as

E(ν),z(−λ) =

∫ ∞

0

e−λt ∂

∂t

(∫ ∞

z

G(ν)(t, dy)

)dt

= 1− λ

∫ ∞

0

e−λt

(∫ z

0

G(ν)(t, dy)

)dt.

(8.60)

They all are completely monotone, and the solution (8.56) to the problem (8.55)is then expressed as

f(x) = Y E(ν),x−a(−λ) +

∫ x−a

0

g(x− y)U(ν)λ (dy), (8.61)

where the λ-potential measure is expressed in terms of E(ν),z by the equation∫ z

0

U(ν)λ (dy) = (1− E(ν),z(−λ))/λ. (8.62)

If the measures G(ν)(t, dy) have densities with respect to the Lebesgue measure,

say G(ν)(t, y), then the λ-potential measure also has a density U(ν)λ (y), and (8.62)

can be rewritten as

U(ν)λ (y) = − 1

λ

∂E(ν),y(−λ)

∂y. (8.63)

However, the additional relation E(ν),z(−λ) = E(ν)(−λzβ) only applies for the

case of the derivative dβ

dxβ , due to the particular scaling property of Gβ .

Page 474: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

462 Chapter 8. Generalized Fractional Differential Equations

In the case of Lν = − dβ

dxβ , we have∫ ∞

1

Gβ(t, y)dy =

∫ ∞

1

t−1/βGβ(1, t−1/βy)dy =

∫ ∞

t−1/β

Gβ(1, x)dx,

which shows that

∂x

∫ ∞

1

Gβ(t, y)dy =1

βt−1−1/βGβ(1, t

−1/β).

Therefore, we again arrive at formula (8.2). In particular, Eβ(λ) is an entire ana-

lytic function of λ. The same is true for the λ-potential measure Uβλ , due to (8.10).

In order for E(ν)(s) to be an entire analytic function like Eβ , some regularity as-sumptions on ν are needed. This will be discussed in the next section.

Let us now turn to the extension of linear equations to the Banach-space-valued setting, i.e., to the equations

D(ν)a+∗μ(x) = Aμ(x) + g(x), μ(a) = Y, x ≥ a. (8.64)

For μ(a) = Y = 0, this turns into the RL-type equation

D(ν)a+μ(x) = Aμ(x) + g(x), μ(a) = 0, x ≥ a. (8.65)

As above, we define the solution to (8.64) as a function μ(x) = Y + u(x), whereu(x) solves the problem

D(ν)a+u(x) = Au(x) +AY + g(x), u(a) = 0, x ≥ a. (8.66)

Compared to the case of real-valued A, the only new point is the application ofTheorem 5.13.1 for building the semigroup T ′

tetA and of Proposition 1.11.2 if one is

interested in generalized solutions in the sense of generalized functions. Notice alsothat the assumption of etA to be a contraction naturally extends the case A = −λwith λ > 0 (since e−λt ≤ 1) and allows for a definition of the operator-valuedgeneralized Mittag-Leffler functions by the operator-valued integral

E(ν),z(A) =

∫ ∞

0

etA∂

∂t

(∫ ∞

z

G(ν)(t, dy)

)dt

= 1 +A

∫ ∞

0

etA(∫ z

0

G(ν)(t, dy)

)dt.

(8.67)

Theorem 8.5.1.

(i) Let the measure ν on {y : y > 0} satisfy (5.143) and let A be the generator ofthe strongly continuous semigroup etA of contractions in the Banach space B,with the domain of the generator D ⊂ B. Then the L(B,B)-valued potentialmeasure

U(ν)−A(M) =

∫ ∞

0

etAG(ν)(t,M) dt (8.68)

Page 475: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.6. Generalized fractional linear equations, part II 463

of the semigroup T ′te

tA on the subspace Ckkill(a)([a, b], B) of Cuc((−∞, b], B)

(as constructed in Theorem 5.13.1) is well defined as a σ-finite measure on{y : y ≥ 0} such that for any z > 0, λ > 0,

U(ν)−A([0, z]) ≤ eλz/φν(λ).

Therefore, the potential operator (given by convolution with U(ν)−A) of the semi-

group T ′te

tA on Ckkill(a)([a, b], B) is bounded for any b > a.

(ii) For any g ∈ Ckill(a)([a, b], B), the B-valued function

f(x) =

∫ x−a

0

U(ν)−A(dy)g(x−y) =

∫ x−a

0

∫ ∞

0

etAG(ν)(t, dy) dt g(x−y) (8.69)

belongs to the domain of the generator of the semigroup T ′te

tA and is theunique solution to problem (8.65) from the domain. For any g ∈ C([a, b], B),continued as zero to the left of a, this function represents the unique general-ized solution to (8.65), both by approximation and in the sense of generalizedfunctions.

(iii) For any g ∈ C([a, b], B) (continued as zero to the left of a) and Y ∈ B, thefunction

f(x) = Y +

∫ x−a

0

U(ν)−A(dy)(AY + g(x− y))

= E(ν),x−a(A)Y +

∫ x−a

0

U(ν)−A(dy)g(x− y)

(8.70)

is the unique generalized solution to problem (8.64).

Proof. (i) For the measure U(ν)A , we obtain the same estimate as for U (ν) in Propo-

sition 5.12.1, because etA are contractions. (ii) Concerning the solutions in thedomain, this is again a consequence of Proposition 4.1.4. Generalized solutions inthe sense of generalized functions are obtained from Proposition 1.11.2. The exis-tence and uniqueness of generalized solutions by approximation is a consequenceof the explicit integral formula. (iii) This follows from (ii) by the definition of thesolution to (8.64). �

8.6 Generalized fractional linear equations, part II

We have constructed the solutions to the linear problems (8.65) and (8.64) onlyfor the case of A generating a contraction semigroup (with a direct extension tothe case of a uniformly bounded semigroup etA). This restriction is ultimatelylinked with the formula (8.58) for the generalized Mittag-Leffler function, whichit not directly seen to be extensible to negative λ. In this section, we presentsome additional assumptions on ν which ensure that this extension is possible and

Page 476: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

464 Chapter 8. Generalized Fractional Differential Equations

therefore allow for an extension of the earlier results to A generating arbitrarystrongly continuous semigroups. These assumptions are of two kinds: (a) via lowerbounds for ν(dy) and (b) via its asymptotics at small y.

By arguments that are similar to those in Proposition 2.13.1, we can derive

that the unique solution to the linear problem D(ν)a+∗f(x) = Af(x) with a bounded

operator A in a Banach space B and a given initial condition f(x) = Y must be

represented by a geometric series of the operators I(ν)a :

(1 +A(I(ν)a 1)(x) + · · ·+Ak[(I(ν)a )k1](x) + · · · )Y, (8.71)

whenever this series converges. Therefore we are looking for assumptions on νwhich can ensure the convergence and provide reasonable estimates for the sum.

Remark 122. Since the measure U (ν)(dy) is always finite on any finite interval[0, z], the series (8.71) has a non-vanishing radius of convergence (it convergesfor ‖A‖ sufficiently small). This means that Theorem 8.5.1 can be extended to Agenerating semigroups of the small-growth type. What we are looking for now arethe conditions that ensure the convergence of (8.71) for all ‖A‖.

The following assertion is a direct consequence of Propositions 9.4.3 and 9.4.2from the Appendix.

Proposition 8.6.1. Let κ > 0 and β ∈ (1/2, 1). Let ν(y) be a continuous functionon {y : y ≥ 0} satisfying (8.37) and such that the function

ν(y)− κβy−β−1

tends to zero, as y → 0, and is integrable around the origin. Then the symbol ψν(p)of the operator Lν has the asymptotic behaviour

ψν(p) = −e−iπβ sgn (p)/2Γ(1− β)κ|p|β +O(1). (8.72)

Moreover, the potential measure U (ν)(dx) on {x : x ≥ 0}, representing the funda-mental solution to the operator L′

ν , has a continuous density Eν(x), whose asymp-totic behaviour is given by (9.64) (where E(x) = E(−x)):

Eν(x) = 1

πκxβ−1 sin(πβ) +O(1),

with a uniformly bounded function O(1). In particular,

Eν(x) ≤ Cν(1 + xβ−1)/Γ(β), (8.73)

with some constant Cν .

Remark 123. The restriction to β > 1/2 is a consequence of the elementary meth-ods that were used for proving Proposition 9.4.3. The result also holds withoutthis restriction.

Page 477: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.6. Generalized fractional linear equations, part II 465

Rougher estimates are available from the comparison principle, as the fol-lowing consequence of Proposition 5.12.1 reveals (together with formula (8.43) forthe potential measure of the fractional derivative of order β):

Proposition 8.6.2. Let ν(dy) be a measure on {y : y > 0} satisfying (8.37) andhaving the lower bound of the β-fractional type

ν(dy) ≥ (−1/Γ(−β))Cνy−1−β dy (8.74)

with some β ∈ (0, 1) and Cν > 0. Then∫ x

0

U (ν)(dy) ≤ Cν(Iβ0 1)(x) = Cνx

β/Γ(β) (8.75)

for any x > 0.

Using (8.73) or (8.75), we can estimate the geometric series (8.71) with theintegral operators (8.44). Namely, we get from (8.75) that

|1 + λ(I(ν)a 1)(x) + · · ·+ λk[(I(ν)a )k1](x) + · · · |≤ 1 + Cν |λ|(Iβa 1)(x) + · · ·+ (Cν |λ|)k[Ikβa 1](x) + · · ·≤ Eβ(Cν |λ|(x− a)β),

(8.76)

where we refer to Proposition 2.13.1 for the last equation.

With (8.73), we find that the second term in this estimate dominates forx− a < 1. This implies

‖1+ λ(I(ν)a 1)(x) + · · ·+λk[(I(ν)a )k1](x) + · · · ‖ ≤ Eβ(2Cν |λ|(x− a)β), x− a ≤ 1.(8.77)

For x− a > 1, we can give the following rough bound:

(I1a + Iβa )n1(x) ≤ 2n max

k∈[0,n]Iβk+n−k1(x)

≤ 2n maxk

(t− a)βk+n−k

Γ(βk + n− k + 1)≤ 2n

(t− a)n

Γ(βn+ 1),

and therefore

‖1 + λ(I(ν)a 1)(x) + · · ·+ λk[(I(ν)a )k1](x) + · · · ‖ ≤ Eβ(2Cν |λ|(x− a)), x− a ≥ 1.(8.78)

Putting these estimates together yields

‖1+ λ(I(ν)a 1)(x) + · · ·+ λk[(I(ν)a )k1](x) + · · · ‖ ≤ Eβ(2Cν |λ|max(x− a, (x− a)β)).(8.79)

Theorem 8.6.1. Under the assumptions of either Proposition 8.6.2 or Proposition8.6.1, the integral (8.60) converges for all complex λ, so that the function E(ν),z(λ)

Page 478: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

466 Chapter 8. Generalized Fractional Differential Equations

(defined initially by (8.60) for negative parameter values) is an entire analyticfunction of λ. Its series expansions is

E(ν),z(λ) = 1 + λ(I(ν)0 1)(z) + · · ·+ λk[(I

(ν)0 )k1](z) + · · · , (8.80)

orE(ν),z(λ) = 1 + λ(I(ν)a 1)(x) + · · ·+ λk[(I(ν)a )k1](x) + · · · ,

with x − a = z. It can be also obtained by expanding the last expression of (8.60)into a power series in λ. The series (8.80) is bounded either by (8.76) or by (8.79),respectively.

Moreover, the integral expressing the λ-potential measure,

U(ν)λ ([0, z]) =

∫ ∞

0

e−λtG(ν)(t, [0, z]) dt,

converges for all complex λ, so that the λ-potential measure is also an entire an-alytic function of λ. Its series expansion is obtained from that of E(ν),z(−λ) viathe formula (8.62).

Finally, in case of the setting of Proposition 8.6.2, we have

‖U (ν)λ ([0, z])‖ ≤ Cνβ

∫ z

0

yβ−1E′β(|λ|yβ) dy, (8.81)

and in the case of the setting of Proposition 8.6.1, the measures U(ν)λ (dy) and

G(ν)(t, dy) have densities with respect to the Lebesgue measure.

Remark 124. The statement that the expansion (8.80) coincides with the integral(8.60) is a far-reaching extension of Zolotarev’s formula (2.80).

Proof. (i) Let us start with the setting of Proposition 8.6.2. Expanding the lastexpression of (8.60) into a power series in λ, the comparison principle of Proposi-tion 5.11.3 shows us that all terms are bounded by the corresponding terms of theseries with Gβ(t, dy) instead of G(ν)(t, dy). Therefore, this series is convergent forall λ. Since both the last expression in (8.60) and the series (8.80) solve the samelinear fractional equation, they coincide.

Finally, again by the comparison principle of Proposition 5.11.3, we find

‖U (ν)λ ([0, z])‖ ≤

∫ ∞

0

e|λ|tG(ν)(t, [0, z]) dt ≤ Cν

∫ ∞

0

e|λ|tGβ(t, [0, z]) dt,

which implies (8.81) by (8.10).

(ii) Let us turn to the setting of Proposition 8.6.1. Notice first that thepotential measure has a density, as was stated in Proposition 8.6.1. The existenceof a density of Gν(t, dy) follows from (8.72) and the representation of Gν(t, dy) asthe Fourier transform of exp{tψν(p)}, see, e.g., (5.147).

Page 479: xn--webducation-dbb.comwebéducation.com/wp-content/uploads/2019/07... · vi Contents 2.7 Hamilton–Jacobi–Bellmanequationandoptimalcontrol . . . . . . 119 2.8 Sensitivityofintegralequations

8.6. Generalized fractional linear equations, part II 467

As in (i), for positive λ, E(ν),z(−λ) given by the last expression of (8.60)coincides with its series representation (8.80), because they solve the same linearfractional equation. In order to see that the expansion (8.80) can be obtainedfrom the expansion of the last expression of (8.60), we note that all terms ofthe latter are bounded. (This is seen from (8.72) and the fact that Gν(t, dy) is theFourier transform of exp{tψν(p)}.) Therefore, this expansion is at least asymptotic.But two asymptotic expansions necessarily coincide. Hence, the expansion (8.80)coincides with the expansion obtained from the last expression of (8.60). But sincethe former converges for all λ, the latter converges as well. Therefore, the integralsin (8.60) also converge for all λ.

Finally, the analytic property of the potential measure follows from the cor-responding properties of E(ν),z via the formula (8.63). �

We are now ready for the main result of this section, which extends Theorem8.5.1 to arbitrary semigroups etA. The proof is literally the same as that of The-orem 8.5.1 (once the properties of the λ-potential measures from Theorem 8.6.1are obtained) and is therefore omitted.

Theorem 8.6.2. Under the assumptions of either Proposition 8.6.2 or Proposition8.6.1, let A be the generator of the strongly continuous semigroup etA in the Banachspace B, with the domain of the generator D ⊂ B. Let the growth type of etA bem0, so that ‖etA‖ ≤ Memt with any m > m0 and some M . Then the followingholds:

(i) The L(B,B)-valued potential measure
\[
U^{(\nu)}_{-A}([0,z]) = \int_0^\infty e^{tA}\,G^{(\nu)}(t,[0,z])\,dt \tag{8.82}
\]
of the semigroup $T'_t e^{tA}$ on the subspace $C_{kill(a)}([a,b],B)$ of $C_{uc}((-\infty,b],B)$, as constructed in Theorem 5.13.1, is well defined as a σ-finite measure on $\{y : y \ge 0\}$. In the setting of Proposition 8.6.2, we have
\[
U^{(\nu)}_{-A}([0,z]) \le C_\nu M\beta\int_0^z y^{\beta-1}E'_\beta(m y^\beta)\,dy \tag{8.83}
\]
for any z > 0.

(ii) The L(B,B)-valued generalized families of Mittag-Leffler functions
\[
E_{(\nu),z}(A) = \int_0^\infty e^{At}\,\frac{\partial}{\partial t}\Bigl(\int_z^\infty G^{(\nu)}(t,dy)\Bigr)dt
= 1 + A\int_0^\infty e^{At}\Bigl(\int_0^z G^{(\nu)}(t,dy)\Bigr)dt \tag{8.84}
\]
are well defined and bounded by
\[
\|E_{(\nu),z}(A)\| \le M E_\beta(C_\nu m z^\beta) \tag{8.85}
\]
or by
\[
\|E_{(\nu),z}(A)\| \le M E_\beta(C_\nu m \max(z,z^\beta)), \tag{8.86}
\]
respectively.

(iii) For any $g \in C_{kill(a)}([a,b],B)$, the B-valued function (8.69) belongs to the domain of the generator of the semigroup $T'_t e^{tA}$ and is the unique solution to the problem (8.65) from the domain. For any $g \in C([a,b],B)$, this function represents the unique generalized solution to (8.65), both by approximation and in the sense of generalized functions.

(iv) For any $g \in C([a,b],B)$ (continued as zero to the left of a) and $Y \in B$, the function (8.70) represents the unique generalized solution to the problem (8.64).

8.7 The time-dependent case; path integral representation

In this section, our aim is to extend the above results for (8.64) to the case of a family of operators A depending on x, i.e., to the problem
\[
D^{(\nu)}_{a+*}\mu(x) = A(x)\mu(x) + g(x), \quad \mu(a) = Y, \quad x \ge a. \tag{8.87}
\]

This development is based on an appropriate extension of Theorem 5.13.1, which we shall carry out in two steps, first for bounded and then for unbounded measures ν. In any case, the framework of three-level Banach towers turns out to be convenient.

Theorem 8.7.1.

(i) Let $\tilde D \subset D \subset B$ be three Banach spaces with the ordered norms $\|\cdot\|_{\tilde D} \ge \|\cdot\|_D \ge \|\cdot\|_B$, and let $\tilde D$ be dense in both D and B with respect to their topologies (a three-level Banach tower). Let A(x), $x \in \mathbf{R}$, be a uniformly bounded family of operators in $L(D,B)$ depending strongly continuously on x, which is also uniformly bounded and strongly continuous in $L(\tilde D,D)$. Let all A(x) generate uniformly bounded (for $x \in \mathbf{R}$ and t from any compact segment) strongly continuous semigroups $e^{tA(x)}$ in B with the common core D, which also represent uniformly bounded strongly continuous semigroups in D with the common core $\tilde D$, and uniformly bounded semigroups in $\tilde D$. Then the operators
\[
e^{tA(.)} : f(x) \mapsto e^{tA(x)}f(x)
\]
form a strongly continuous semigroup in $C_\infty(\mathbf{R},B)$ with the invariant core $C_\infty(\mathbf{R},D)$, and a strongly continuous semigroup in $C_\infty(\mathbf{R},D)$.

(ii) Assume additionally that the function $x \mapsto A(x)$ is differentiable both as a mapping $\mathbf{R} \to L(D,B)$ and as a mapping $\mathbf{R} \to L(\tilde D,D)$, and that the derivatives A′(x) (the prime denotes a derivative with respect to x) are uniformly (in x) bounded and strongly continuous families of operators, again both in $L(D,B)$ and $L(\tilde D,D)$. Then the operators $e^{tA(.)}$ represent a strongly continuous semigroup in the Banach space $C^1_\infty(\mathbf{R},B) \cap C_\infty(\mathbf{R},D)$ (with the norm defined as the sum of the norms in $C^1_\infty(\mathbf{R},B)$ and $C_\infty(\mathbf{R},D)$) with the invariant core $C^1_\infty(\mathbf{R},D) \cap C_\infty(\mathbf{R},\tilde D)$. Reduced to the latter space, the operators $e^{tA(.)}$ form a semigroup of bounded operators, if this space is equipped with its own Banach topology.

(iii) Under the assumptions of (i) and (ii), let $e^{tA(x)}$ have common growth types $m_0^B$, $m_0^D$ and $m_0^{\tilde D}$ as semigroups in B, D and $\tilde D$, respectively, so that
\[
\|e^{tA(x)}\|_{B\to B} \le M_B e^{t m_B}, \quad \|e^{tA(x)}\|_{D\to D} \le M_D e^{t m_D}, \quad \|e^{tA(x)}\|_{\tilde D\to\tilde D} \le M_{\tilde D} e^{t m_{\tilde D}} \tag{8.88}
\]
for any $m_B > m_0^B$, $m_D > m_0^D$, $m_{\tilde D} > m_0^{\tilde D}$ and some $M_B, M_D, M_{\tilde D}$. Then the semigroup $e^{tA(.)}$ in $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$ has a growth type that does not exceed $\max(m_0^B,m_0^D)$, and it satisfies the estimates
\[
\|e^{tA(.)}\|_{L(C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D))}
\le \max\Bigl(M_B e^{t m_B},\, M_D e^{t m_D} + M_B M_D\, t\, e^{t\max(m_B,m_D)}\sup_x\|A'(x)\|_{D\to B}\Bigr)
\le \max(M_D,M_B)\exp\Bigl\{t\Bigl(\max(m_B,m_D) + M_B\sup_x\|A'(x)\|_{D\to B}\Bigr)\Bigr\}. \tag{8.89}
\]
In $C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D)$, this semigroup has a growth type that does not exceed $\max(m_0^D,m_0^{\tilde D})$, and it satisfies the estimates
\[
\|e^{tA(.)}\|_{L(C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D))}
\le \max\Bigl(M_D e^{t m_D},\, M_{\tilde D} e^{t m_{\tilde D}} + M_D M_{\tilde D}\, t\, e^{t\max(m_D,m_{\tilde D})}\sup_x\|A'(x)\|_{\tilde D\to D}\Bigr)
\le \max(M_D,M_{\tilde D})\exp\Bigl\{t\Bigl(\max(m_D,m_{\tilde D}) + M_D\sup_x\|A'(x)\|_{\tilde D\to D}\Bigr)\Bigr\}. \tag{8.90}
\]

Proof. (i) Since
\[
e^{tA(x)}f(x) - e^{tA(x_0)}f(x_0) = e^{tA(x)}\bigl(f(x)-f(x_0)\bigr) + \bigl(e^{tA(x)} - e^{tA(x_0)}\bigr)f(x_0)
\]
and by Proposition 4.2.2, $e^{tA(x)}f(x)$ belongs to $C_\infty(\mathbf{R},B)$ (respectively $C_\infty(\mathbf{R},D)$) whenever f does. Therefore, the operators $e^{tA(.)}$ represent semigroups both in $C_\infty(\mathbf{R},B)$ and $C_\infty(\mathbf{R},D)$. By the uniform boundedness of $e^{tA(x)}$ with respect to x, these semigroups are locally bounded (bounded for t from compact segments).

Next, for $f \in C_\infty(\mathbf{R},D)$, we have
\[
e^{tA(x)}f(x) - f(x) = \int_0^t A(x)e^{sA(x)}f(x)\,ds,
\]
which tends to zero in B, as t → 0, because A(x) and $e^{tA(x)}$ are uniformly bounded as operators in $L(D,B)$ and $L(D,D)$, respectively. By a density argument and the boundedness of the operators $e^{tA(x)}$ in $L(B,B)$, the strong continuity of the semigroup $e^{tA(.)}$ in $C_\infty(\mathbf{R},B)$ follows.

Similarly, for $f \in C_\infty(\mathbf{R},\tilde D)$, $e^{tA(x)}f(x) - f(x) \to 0$ in D, as t → 0, because A(x) and $e^{tA(x)}$ are uniformly bounded as operators in $L(\tilde D,D)$ and $L(\tilde D,\tilde D)$, respectively. The boundedness of the operators $e^{tA(x)}$ in $L(D,D)$ implies the strong continuity of the semigroup $e^{tA(.)}$ in $C_\infty(\mathbf{R},D)$.

It remains to show that any $f \in C_\infty(\mathbf{R},D)$ belongs to the domain of the generator A(.) of the semigroup $e^{tA(.)}$ in $C_\infty(\mathbf{R},B)$. We have
\[
\frac1t\bigl(e^{tA(x)}f(x) - f(x)\bigr) = A(x)f(x) + \frac1t\int_0^t A(x)\bigl(e^{sA(x)} - 1\bigr)f(x)\,ds,
\]
and the second term tends to zero, as t → 0, due to the strong continuity of $e^{sA(.)}$ in $C_\infty(\mathbf{R},B)$.

(ii) Since
\[
\frac{d}{dx}\bigl[e^{tA(x)}f(x)\bigr] = \lim_{\delta\to 0}\frac1\delta\Bigl[\bigl(e^{tA(x+\delta)} - e^{tA(x)}\bigr)f(x) + e^{tA(x+\delta)}\bigl(f(x+\delta) - f(x)\bigr)\Bigr]
\]
and using (4.138), we find that, for $f \in C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$, the expression
\[
\frac{d}{dx}\bigl[e^{tA(x)}f(x)\bigr] = \int_0^t e^{(t-s)A(x)}A'(x)e^{sA(x)}f(x)\,ds + e^{tA(x)}f'(x) \tag{8.91}
\]
is well defined in the topology of B and represents an element of $C_\infty(\mathbf{R},B)$, because A′(x) is assumed to be bounded and strongly continuous as a family in $L(D,B)$. By the strong continuity of $e^{sA(.)}$ in $C_\infty(\mathbf{R},B)$, it follows that
\[
\frac{d}{dx}\bigl[e^{tA(x)}f(x)\bigr] \to f'(x),
\]
as t → 0. But by (i), the operators $e^{sA(.)}$ depend strongly continuously on s in $C_\infty(\mathbf{R},D)$. Consequently, the $e^{sA(.)}$ form a strongly continuous semigroup in $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$.

If $f \in C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D)$, then formula (8.91) holds also in the topology of $C_\infty(\mathbf{R},D)$. This implies that the $e^{sA(.)}$ preserve the space $C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D)$ and act as bounded operators in the Banach topology of this space.

(iii) By (8.91), we have
\[
\Bigl\|\frac{d}{dx}\bigl[e^{tA(x)}f(x)\bigr]\Bigr\|_{C_\infty(\mathbf{R},B)}
\le M_B e^{t m_B}\|f\|_{C^1_\infty(\mathbf{R},B)}
+ M_B M_D\, t\, e^{(t-s)m_B}e^{s m_D}\sup_x\|A'(x)\|_{D\to B}\,\|f\|_{C_\infty(\mathbf{R},D)},
\]
which implies the first inequality in (8.89). From this it follows that the growth type of $e^{tA(.)}$ in $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$ does not exceed $\max(m_0^B,m_0^D)$. The last inequality in (8.89) follows from the estimate
\[
1 + t M_B\sup_x\|A'(x)\|_{D\to B} \le \exp\Bigl\{t M_B\sup_x\|A'(x)\|_{D\to B}\Bigr\}.
\]
Similarly, (8.90) is obtained from the estimate
\[
\Bigl\|\frac{d}{dx}\bigl[e^{tA(x)}f(x)\bigr]\Bigr\|_{C_\infty(\mathbf{R},D)}
\le M_D e^{t m_D}\|f\|_{C^1_\infty(\mathbf{R},D)}
+ M_D M_{\tilde D}\, t\, e^{(t-s)m_D}e^{s m_{\tilde D}}\sup_x\|A'(x)\|_{\tilde D\to D}\,\|f\|_{C_\infty(\mathbf{R},\tilde D)}. \qquad\square
\]

Theorem 8.7.2.

(i) Under the assumptions of Theorem 8.7.1(i), let ν be a bounded measure on the ray {y : y > 0}. Then the operator $L'_\nu + A(.)$ generates a strongly continuous semigroup $\Phi^{\nu,A}_t$ in $C_\infty(\mathbf{R},B)$ with the invariant core $C_\infty(\mathbf{R},D)$, where this semigroup is also strongly continuous. The semigroup $\Phi^{\nu,A}_t$ has the following representation:
\[
\Phi^{\nu,A}_t Y(x) = e^{-t\|\nu\|}\Bigl[e^{tA(x)}Y(x)
+ \sum_{m=1}^\infty\int_{0\le s_1\le\cdots\le s_m\le t} ds_1\cdots ds_m\,\nu(dz_1)\cdots\nu(dz_m)\,
\exp\{s_1A(x)\}\exp\{(s_2-s_1)A(x-z_1)\}\cdots\exp\{(t-s_m)A(x-z_1-\cdots-z_m)\}\,Y(x-z_1-\cdots-z_m)\Bigr], \tag{8.92}
\]
or, using the notation (4.99) for piecewise-constant paths,
\[
\Phi^{\nu,A}_t Y(x) = e^{-t\|\nu\|}\Bigl[e^{tA(x)}Y(x)
+ \sum_{m=1}^\infty\int_{0\le s_1\le\cdots\le s_m\le t} ds_1\cdots ds_m\,\nu(dz_1)\cdots\nu(dz_m)\,
\exp\Bigl\{\int_0^{s_1}A(Z_x(\tau))\,d\tau\Bigr\}\exp\Bigl\{\int_{s_1}^{s_2}A(Z_x(\tau))\,d\tau\Bigr\}\cdots
\exp\Bigl\{\int_{s_m}^{t}A(Z_x(\tau))\,d\tau\Bigr\}\,Y(Z_x(t))\Bigr]. \tag{8.93}
\]

(ii) For any b > a, the operators $\Phi^{\nu,A}_t$ represent strongly continuous semigroups in the spaces $C_{kill(a)}([a,b],B)$ and $C_{kill(a)}([a,\infty),B)\cap C_\infty(\mathbf{R},B)$ as well.


(iii) If the assumptions of Theorem 8.7.1(ii) hold, then $L'_\nu + A(.)$ also generates a strongly continuous semigroup in $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$ with the invariant core $C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D)$. The operators $\Phi^{\nu,A}_t$ are bounded in the space $C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D)$ equipped with its own Banach topology.

Proof. (i) Since $L'_\nu$ is a bounded operator both in $C_\infty(\mathbf{R},B)$ and in $C_\infty(\mathbf{R},D)$, it follows from Theorem 4.6.1 that the operator
\[
(L'_\nu + A(.))f(x) = \int f(x-y)\,\nu(dy) + \bigl(A(x) - \|\nu\|\bigr)f(x)
\]
generates a strongly continuous semigroup in $C_\infty(\mathbf{R},B)$ with the invariant core $C_\infty(\mathbf{R},D)$, where this semigroup is also strongly continuous. Moreover, formula (4.85) (where the operator $\int f(x-y)\,\nu(dy)$ is considered a bounded perturbation) provides the representation (8.92). Unlike in (4.103), the operators A(x) may not commute; therefore, the exponentials cannot be merged into a single exponential. Due to the notation (4.99), the equations (8.93) and (8.92) are equivalent.

(ii) The invariance of the spaces $C_{kill(a)}([a,b],B)$ and $C_{kill(a)}([a,\infty),B)$ under $\Phi^{\nu,A}_t$ is seen from (8.92).

(iii) This follows again from Theorem 4.6.1 and the observation that the operators $e^{tA(.)}$ and $L'_\nu$ are bounded in the space $C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D)$ equipped with its own Banach topology. □

By (2.43), the product of the exponentials in (8.93) or (8.92) equals the backward chronological product $T\exp\{\int_0^t A(Z_x(\tau))\,d\tau\}$. Recalling the notation of Section 4.7 and denoting by $\nu^{PC}$ the measure on $PC_x(t)$ constructed from ν, we can rewrite (8.93) as
\[
\Phi^{\nu,A}_t Y(x) = e^{-t\|\nu\|}\int_{PC_x(t)} F(Z_x(.))\,\nu^{PC}(dZ(.)),
\]
with
\[
F(Z_x(.)) = T\exp\Bigl\{\int_0^t A(Z_x(\tau))\,d\tau\Bigr\}Y(Z_x(t)).
\]

By introducing the normalized probability measure $\tilde\nu^{PC} = e^{-t\|\nu\|}\nu^{PC}$ as in Section 4.7 and by denoting by $\mathbf{E}_\nu$ (the expectation) the integration with respect to this measure on the path space $PC_x(t)$, we arrive at the main path integral representation for the solutions:

Corollary 10. Under the assumptions of Theorem 8.7.2, the semigroup $\Phi^{\nu,A}_t$ yielding the unique solution to the Cauchy problem
\[
\dot\mu_t(x) = A(x)\mu_t(x) - D^{(\nu)}_{+}\mu_t(x), \quad \mu_0 = Y, \tag{8.94}
\]
has the following integral representation in terms of the backward chronological exponential:
\[
\Phi^{\nu,A}_t Y(x) = \int_{PC_x(t)} F(Z_x(.))\,\tilde\nu^{PC}(dZ(.))
= \mathbf{E}_\nu\Bigl[T\exp\Bigl\{\int_0^t A(Z_x(\tau))\,d\tau\Bigr\}Y(Z_x(t))\Bigr]. \tag{8.95}
\]

The next consequence shows that the formula (8.92) makes it possible to find the growth of the semigroup $\Phi^{\nu,A}_t$, whenever the growth of $e^{tA(x)}$ is known.

Corollary 11. Under the assumptions of Theorem 8.7.2(i) to (iii), the following estimates hold:
\[
\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbf{R},B))} \le M_B\exp\{t(m_B + \|\nu\|(M_B-1))\}, \tag{8.96}
\]
\[
\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbf{R},D))} \le M_D\exp\{t(m_D + \|\nu\|(M_D-1))\}, \tag{8.97}
\]
\[
\|\Phi^{\nu,A}_t\|_{L(C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D))}
\le \max(M_B,M_D)\exp\Bigl\{t\Bigl[\max(m_D,m_B) + M_B\sup_x\|A'(x)\|_{D\to B} + \|\nu\|(\max(M_B,M_D)-1)\Bigr]\Bigr\}, \tag{8.98}
\]
\[
\|\Phi^{\nu,A}_t\|_{L(C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D))}
\le \max(M_D,M_{\tilde D})\exp\Bigl\{t\Bigl[\max(m_D,m_{\tilde D}) + M_D\sup_x\|A'(x)\|_{\tilde D\to D} + \|\nu\|(\max(M_D,M_{\tilde D})-1)\Bigr]\Bigr\}. \tag{8.99}
\]

In particular, if the semigroups $e^{tA(x)}$ are regular in B and D in the sense that (8.88) holds with $M_D = M_B = 1$ and some $m_D, m_B$, which is equivalent to the requirement that
\[
\sup_t\sup_x\frac1t\ln\|e^{tA(x)}\|_{L(B)} < \infty, \qquad \sup_t\sup_x\frac1t\ln\|e^{tA(x)}\|_{L(D)} < \infty, \tag{8.100}
\]
then the same applies to the semigroup $\Phi^{\nu,A}_t$ both in $C_\infty(\mathbf{R},B)$ and in $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$, and its growth rates are given by the estimates
\[
\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbf{R},B))} \le e^{tm_B}, \qquad \|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbf{R},D))} \le e^{tm_D}, \tag{8.101}
\]
\[
\|\Phi^{\nu,A}_t\|_{L(C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D))} \le \exp\Bigl\{t\Bigl[\max(m_D,m_B) + \sup_x\|A'(x)\|_{D\to B}\Bigr]\Bigr\}, \tag{8.102}
\]
independently of ν.


Proof. By (8.92), we have
\[
\|\Phi^{\nu,A}_t\|_{C_\infty(\mathbf{R},B)} \le M_B e^{m_B t}\Bigl(1 + \sum_{n=1}^\infty\frac{\|\nu\|^n M_B^n t^n}{n!}\Bigr),
\]
which implies (8.96). Similarly, the other estimates are obtained due to (8.89) and (8.90). □

We can now address the problem (8.87) for the simplest case of bounded ν.

Theorem 8.7.3. Under the assumptions of Theorem 8.7.2, the resolvent operators $R^{A,\nu}_\lambda$ of the semigroup $\Phi^{\nu,A}_t$ in the space $C_{kill(a)}([a,\infty),B)\cap C_\infty(\mathbf{R},B)$ yielding the classical solutions to the problems
\[
\bigl(\lambda - A(x) + D^{(\nu)}_{a+}\bigr)\mu(x) = g(x), \quad \mu(a) = 0, \quad x \ge a, \tag{8.103}
\]
are well defined for
\[
\lambda > m_B + \|\nu\|(M_B - 1),
\]
and are given by the formula
\[
R^{A,\nu}_\lambda g(x) = \int_0^\infty e^{-\lambda t}\,\mathbf{E}_\nu\Bigl[T\exp\Bigl\{\int_0^t A(Z_x(\tau))\,d\tau\Bigr\}g(Z_x(t))\Bigr]dt. \tag{8.104}
\]
When reduced to $C_{kill(a)}([a,b],B)$, they are also well defined for $\lambda \ge m_B + \|\nu\|(M_B - 1)$. In particular, if all semigroups generated by A(x) in B are contractions, then the problem (8.87) with Y = 0 has a unique classical solution (belonging to the domain of the generator of the semigroup $\Phi^{\nu,A}_t$ in $C_{kill(a)}([a,b],B)$) given by (8.104) with λ = 0 for any $g \in C_{kill(a)}([a,b],B)$.

Since $g \in C_{kill(a)}([a,\infty),B)$, formula (8.104) can be rewritten as
\[
R^{A,\nu}_\lambda g(x) = \mathbf{E}_\nu\int_0^{\sigma_a} e^{-\lambda t}\Bigl[T\exp\Bigl\{\int_0^t A(Z_x(\tau))\,d\tau\Bigr\}g(Z_x(t))\Bigr]dt, \tag{8.105}
\]
where $\sigma_a = \inf\{t : Z_x(t) \le a\}$. This formula can be used for defining various generalized solutions to (8.103).


8.8 Chronological operator-valued Feynman–Kac formula

Let us now turn to problems of the type (8.87) with an unbounded ν.

Theorem 8.8.1.

(i) Under the assumptions of Theorem 8.7.1(i) to (iii), let the semigroups $e^{tA(x)}$ be regular in B and D in the sense that (8.88) holds with $M_D = M_B = 1$ and some $m_D, m_B$ (equivalently, if (8.100) holds). Let ν be a measure on the ray {y : y > 0} satisfying (8.37). Then the operator $L'_\nu + A(.)$ generates a strongly continuous semigroup $\Phi^{\nu,A}_t$ both in $C_\infty(\mathbf{R},B)$ and $C_\infty(\mathbf{R},D)$ solving the Cauchy problem (8.94), with the domains of the generator containing the spaces $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$ and $C^1_\infty(\mathbf{R},D)\cap C_\infty(\mathbf{R},\tilde D)$, respectively. The semigroup $\Phi^{\nu,A}_t$ can be obtained as the limit, as ε → 0, of the semigroups $\Phi^{\nu_\varepsilon,A}_t$ built by Theorem 8.7.2 for the finite approximations $\nu_\varepsilon(dy) = \mathbf{1}_{|y|\ge\varepsilon}\,\nu(dy)$ of ν, so that the semigroup $\Phi^{\nu,A}_t$ has the representation
\[
\Phi^{\nu,A}_t Y(x) = \lim_{\varepsilon\to 0}\Phi^{\nu_\varepsilon,A}_t Y(x)
= \lim_{\varepsilon\to 0}\mathbf{E}_{\nu_\varepsilon}\Bigl[T\exp\Bigl\{\int_0^t A(Z_x(\tau))\,d\tau\Bigr\}Y(Z_x(t))\Bigr], \tag{8.106}
\]
where the limit is well defined both in the topologies of B and D, and
\[
\|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbf{R},B))} \le e^{tm_B}, \qquad \|\Phi^{\nu,A}_t\|_{L(C_\infty(\mathbf{R},D))} \le e^{tm_D}. \tag{8.107}
\]

(ii) For any b > a, the operators $\Phi^{\nu,A}_t$ represent strongly continuous semigroups also in the spaces
\[
C_{kill(a)}([a,b],B), \quad C_{kill(a)}([a,\infty),B)\cap C_\infty(\mathbf{R},B), \quad
C_{kill(a)}([a,b],D), \quad C_{kill(a)}([a,\infty),D)\cap C_\infty(\mathbf{R},D).
\]

Proof. (i) This is similar to the proof of Proposition 5.13.1. By (8.101) and (8.102), the semigroups $\Phi^{\nu_\varepsilon,A}_t$ are uniformly (in ε) bounded in both
\[
C_\infty(\mathbf{R},B) \quad\text{and}\quad C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D).
\]
Estimating the difference between the actions of $\Phi^{\nu_\varepsilon,A}_t$ for two values $\varepsilon_2 < \varepsilon_1 < 1$ in the usual way, i.e., by a formula of the type (4.8), leads to
\[
\Phi^{\nu_{\varepsilon_1},A}_t - \Phi^{\nu_{\varepsilon_2},A}_t
= \int_0^t \Phi^{\nu_{\varepsilon_2},A}_{t-s}\bigl(L_{\nu_{\varepsilon_1}} - L_{\nu_{\varepsilon_2}}\bigr)\Phi^{\nu_{\varepsilon_1},A}_s\,ds.
\]


Hence, for $Y \in C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$, we can derive by (8.101) and (8.102) that
\[
\|(\Phi^{\nu_{\varepsilon_1},A}_t - \Phi^{\nu_{\varepsilon_2},A}_t)Y\|_{C_\infty(\mathbf{R},B)}
\le \int_0^t ds\,e^{(t-s)m_B}\sup_x\Bigl\|\int_{\varepsilon_2}^{\varepsilon_1}\bigl(\Phi^{\nu_{\varepsilon_1},A}_s Y(x-y) - \Phi^{\nu_{\varepsilon_1},A}_s Y(x)\bigr)\nu(dy)\Bigr\|_B
\le \int_0^t ds\,e^{(t-s)m_B}\int_{\varepsilon_2}^{\varepsilon_1} y\,\nu(dy)\,\|\Phi^{\nu_{\varepsilon_1},A}_s Y\|_{C^1_\infty(\mathbf{R},B)},
\]
and therefore
\[
\|(\Phi^{\nu_{\varepsilon_1},A}_t - \Phi^{\nu_{\varepsilon_2},A}_t)Y\|_{C_\infty(\mathbf{R},B)}
\le \int_{\varepsilon_2}^{\varepsilon_1} y\,\nu(dy)\int_0^t ds\,e^{(t-s)m_B}
\exp\Bigl\{s\Bigl[\max(m_D,m_B) + \sup_x\|A'(x)\|_{D\to B}\Bigr]\Bigr\}\,\|Y\|_{C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)}, \tag{8.108}
\]
which tends to zero, as $\varepsilon_1,\varepsilon_2 \to 0$, uniformly for t from any compact set. Therefore, the families $\Phi^{\nu_\varepsilon,A}_t Y$ converge, as ε → 0, for any $Y \in C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$. By a density argument, this convergence extends to all $Y \in C_\infty(\mathbf{R},B)$. Passing to the limit in the semigroup equation, we find that the limiting operators form a bounded semigroup in $C_\infty(\mathbf{R},B)$, with the same bounds (8.101). We denote this semigroup by $\Phi^{\nu,A}_t$. Its strong continuity follows from the strong continuity of $\Phi^{\nu_\varepsilon,A}_t$.

Next, writing
\[
\frac{\Phi^{\nu,A}_t Y - Y}{t} = \frac{\Phi^{\nu_\varepsilon,A}_t Y - Y}{t} + \frac{\Phi^{\nu,A}_t Y - \Phi^{\nu_\varepsilon,A}_t Y}{t}
\]
and noting that, by (8.108), the second term tends to zero, as t, ε → 0, we can conclude that the space $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$ belongs to the domain of the generator of the semigroup $\Phi^{\nu,A}_t$ in $C_\infty(\mathbf{R},B)$.

Finally, the same estimates as above can be established in the topology of $C_\infty(\mathbf{R},D)$, which shows the required properties of $\Phi^{\nu,A}_t$ in $C_\infty(\mathbf{R},D)$. The estimates (8.107) follow from the same estimates for $\Phi^{\nu_\varepsilon,A}_t$.

(ii) The invariance of the spaces
\[
C_{kill(a)}([a,b],B), \quad C_{kill(a)}([a,\infty),B), \quad C_{kill(a)}([a,b],D), \quad C_{kill(a)}([a,\infty),D)
\]
under $\Phi^{\nu,A}_t$ follows from their invariance under all $\Phi^{\nu_\varepsilon,A}_t$. □

From the point of view of numerical calculations, the limiting integral representation formula (8.106) seems to be the most appropriate. From a theoretical perspective, it is of course desirable to get rid of the limit in ε.
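In that spirit, here is a minimal sketch of the truncation step behind (8.106). Purely for concreteness it assumes a β-stable-like density ν(dy) = β y^(−1−β) dy with β ∈ (0, 1) (a standard example of a measure integrating min(1, y)); then the truncated measure $\nu_\varepsilon$ has total mass $\varepsilon^{-\beta}$ and its normalized jump distribution can be sampled by inversion, which is all that is needed to rerun the previous Monte Carlo scheme with $\nu_\varepsilon$ in place of ν.

```python
import numpy as np

rng = np.random.default_rng(1)

beta = 0.7        # illustrative stability index in (0, 1)
eps = 1e-3        # truncation level: keep only jumps of size >= eps
t = 0.5

# For nu(dy) = beta * y^{-1-beta} dy on (0, infinity) one has ||nu_eps|| = eps^{-beta}.
nu_eps_mass = eps ** (-beta)

def sample_jump():
    # Inverse-transform sampling from nu_eps / ||nu_eps||: P(jump > y) = (y/eps)^{-beta}, y >= eps.
    return eps * rng.uniform() ** (-1.0 / beta)

# Jumps of the truncated process on [0, t] arrive with rate ||nu_eps||; as eps -> 0 the
# rate blows up while individual jumps shrink, which is exactly the limit taken in (8.106).
n_jumps = rng.poisson(nu_eps_mass * t)
jumps = np.array([sample_jump() for _ in range(n_jumps)])
print(f"eps = {eps}: {n_jumps} jumps, total displacement {jumps.sum():.4f}")
```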

Remark 125. For the next result, some methods of stochastic analysis have to be invoked. Readers who are not familiar with these tools are advised to skip the next theorem and to simply regard $\mathbf{E}_\nu$ used below as a notation for $\lim_{\varepsilon\to 0}\mathbf{E}_{\nu_\varepsilon}$.


Theorem 8.8.2. Under the assumptions of Theorem 8.7.1, the formula (8.95) is generalized for the present setting as
\[
\Phi^{\nu,A}_t Y(x) = \mathbf{E}_\nu\Bigl[T\exp\Bigl\{\int_0^t A(Z_x(\tau))\,d\tau\Bigr\}Y(Z_x(t))\Bigr], \tag{8.109}
\]
where $\mathbf{E}_\nu$ denotes the expectation with respect to the measure on the càdlàg paths of the Lévy process generated by the operator $L'_\nu$, starting at x.

Proof. This follows from (8.103) and three additional points: (i) the convergence of Feller semigroups implies the weak convergence of the corresponding Markov processes; (ii) the limiting process generated by $L'_\nu$ is a Lévy process, whose trajectories are non-increasing càdlàg paths; (iii) the convergence of propagators that are parametrized by càdlàg paths, see Theorem 2 of [149] or Theorem 1.9.5 of [148]. □

The formula (8.109) is a time-ordered operator-valued version of the classical Feynman–Kac formula of stochastic calculus. As a consequence, as in the case of bounded ν, we obtain the solutions to the problem (8.87).

Theorem 8.8.3. Under the assumptions of Theorem 8.7.1, the resolvent operators $R^{A,\nu}_\lambda$ of the semigroup $\Phi^{\nu,A}_t$ in the space $C_{kill(a)}([a,\infty),B)\cap C_\infty(\mathbf{R},B)$ yielding the classical solutions to the problem (8.103) are well defined for $\lambda > m_B$ and are given by the formula
\[
R^{A,\nu}_\lambda g(x) = \mathbf{E}_\nu\int_0^{\sigma_a} e^{-\lambda s}\Bigl[T\exp\Bigl\{\int_0^s A(Z_x(\tau))\,d\tau\Bigr\}g(Z_x(s))\Bigr]ds
= \lim_{\varepsilon\to 0}\mathbf{E}_{\nu_\varepsilon}\int_0^{\sigma_a} e^{-\lambda s}\Bigl[T\exp\Bigl\{\int_0^s A(Z_x(\tau))\,d\tau\Bigr\}g(Z_x(s))\Bigr]ds. \tag{8.110}
\]
If all semigroups generated by A(x) in B are contractions, then the problem (8.87) with Y = 0 has a unique classical solution (belonging to the domain of the generator of the semigroup $\Phi^{\nu,A}_t$ in $C_{kill(a)}([a,b],B)$) for any $g \in C_{kill(a)}([a,b],B)$.

Note that the formula (8.110) is a time-ordered operator-valued version of the stationary Feynman–Kac formula of stochastic calculus.

As usual, if g is any bounded measurable function $[a,\infty) \to B$, then the formula (8.110) also yields generalized solutions, by approximation or by duality, to the problem (8.103).

Again as usual, one defines solutions to the problem (8.87) with arbitrary Y by shifting, i.e., as a function $\mu(x) = Y + u(x)$, where u solves the problem
\[
D^{(\nu)}_{a+*}u(x) = A(x)u(x) + A(x)Y + g(x), \quad u(a) = 0, \quad x \ge a. \tag{8.111}
\]

This leads to the following result:


Corollary 12. Under the assumptions of Theorem 8.7.1, if all semigroups generated by A(x) in B are contractions, then the problem (8.87) has the unique generalized solution
\[
\mu(x) = Y + \mathbf{E}_\nu\int_0^{\sigma_a}\Bigl[T\exp\Bigl\{\int_0^s A(Z_x(\tau))\,d\tau\Bigr\}\bigl(A(Z_x(s))Y + g(Z_x(s))\bigr)\Bigr]ds \tag{8.112}
\]
for any $Y \in D$ and any bounded measurable curve $g : [a,\infty) \to B$.

Remark 126.

(i) If one assumes some regularity on ν, as in Propositions 8.6.1 or 8.6.2, then one can relax the assumptions on A(x). Various additional regularity properties of solutions can be obtained by assuming some smoothing properties of the semigroups $e^{tA(x)}$.

(ii) Assuming the existence of a bounded second derivative A′′(x) allows for showing that the space $C^1_\infty(\mathbf{R},B)\cap C_\infty(\mathbf{R},D)$ is an invariant core for $\Phi^{\nu,A}_t$.

(iii) The backward time-ordered exponential $T\exp\{\int_s^t A(Z_x(\tau))\,d\tau\}$ represents the backward propagator $U^{s,t}$ solving the backward Cauchy problem
\[
\dot f_s(x) = -A(Z_x(s))f_s(x), \quad s \le t, \tag{8.113}
\]
with the given terminal condition $f_t$, where the family $A(Z_x(t))$ is bounded (as operators D → B), but discontinuous in t. However, by the properties of Lévy processes, the set of discontinuities is at most countable.

Let us present some examples, extending those of Section 8.2, in which the basic formula (8.112) is applicable. For better comparison with Section 8.2, we shall use the letter t for the argument, rather than the x used in the present section (where t was used as the time variable in the auxiliary semigroups).

(i) The generalized fractional Schrödinger equation with time-dependent Hamiltonian and a generalized fractional derivative:
\[
D^{(\nu)}_{a+*}\psi_t = -iH(t)\psi_t, \tag{8.114}
\]
where H(t) is a family of self-adjoint operators in a Hilbert space H such that the unitary groups generated by H(t) have a common domain D ⊂ H, where they are regular in the sense of the second condition of (8.100). The most basic concrete examples are Hamiltonians of the form $H(t) = -\Delta + V(t,x)$ with $V(t,.) \in C^2(\mathbf{R}^d)$, where D can be chosen as the Sobolev space $H^2_2(\mathbf{R}^d)$.

Similarly, one can deal with the fractional Schrödinger equation with a complex parameter:
\[
D^{(\nu)}_{a+*}\psi_t = \sigma H(t)\psi_t, \tag{8.115}
\]
if H(t) is a negative operator and σ is a complex number with a non-negative real part, and where again a common domain D ⊂ H exists such that the semigroups generated by σH(t) are regular. In both cases, formula (8.112) is applicable.


(ii) Generalized fractional Feller evolutions, where each A(t) in (8.111) generates a Feller semigroup in $C_\infty(\mathbf{R}^d)$, again with the additional property that the semigroups generated by A(t) act regularly in their invariant cores D, which can often be taken as $C^1_\infty(\mathbf{R}^d)$ (for operators of at most first order) or $C^2_\infty(\mathbf{R}^d)$ (e.g., for diffusions).

(iii) Generalized fractional evolutions generated by ΨDOs with spatially homogeneous symbols (or with constant coefficients):
\[
D^{(\nu)}_{a+*}f_t = -\psi_t(-i\nabla)f_t + g_t, \quad f|_{t=a} = f_a, \tag{8.116}
\]
under various assumptions on the symbols $\psi_t(p)$, as given, e.g., by Theorems 2.4.2, 4.15.1 or 4.15.4. In this case, the propagators solving (8.113) are explicitly constructed via (2.60) and (2.65). For instance, formula (8.110) for the solution to (8.116) with $f_a = 0$ becomes
\[
R^{A,\nu}_0 g(t,w) = \mathbf{E}_\nu\int_0^{\sigma_a}\int_{\mathbf{R}^d} G^{\psi,Z}_{s,0}(w-v)\,g_{Z_t(s)}(v)\,dv\,ds, \tag{8.117}
\]
where
\[
G^{\psi,Z}_{s,0}(w) = \frac{1}{(2\pi)^d}\int e^{ipw}\exp\Bigl\{-\int_0^s \psi_{Z_t(\tau)}(p)\,d\tau\Bigr\}dp. \tag{8.118}
\]
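The Green function (8.118) is an inverse Fourier transform that is easy to evaluate on a grid once the time-integrated symbol is known. The following sketch assumes, purely for illustration, a symmetric stable-like symbol ψ_τ(p) = c(τ)|p|^α in dimension d = 1 with a hypothetical coefficient c(τ); it is not the book's construction, only a numerical reading of (8.118).

```python
import numpy as np

alpha = 1.5                                   # illustrative stable index
c = lambda tau: 1.0 + 0.5 * np.sin(tau)       # illustrative time-dependent coefficient

def green(s, x, n=2**14, p_max=200.0):
    """Numerical evaluation of (8.118) for the symbol psi_tau(p) = c(tau)|p|^alpha:
    G(w) = (2 pi)^{-1} int e^{i p w} exp{ -int_0^s psi_tau(p) dtau } dp."""
    p, dp = np.linspace(-p_max, p_max, n, retstep=True)
    taus, dtau = np.linspace(0.0, s, 400, retstep=True)
    time_weight = np.sum(c(taus)) * dtau                 # int_0^s c(tau) dtau (Riemann sum)
    integrand = np.exp(1j * np.outer(x, p) - time_weight * np.abs(p) ** alpha)
    return np.real(integrand.sum(axis=1)) * dp / (2.0 * np.pi)

xs = np.linspace(-5.0, 5.0, 11)
print(green(0.5, xs))                         # a bell-shaped, heavy-tailed density profile
```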

8.9 Summary and comments

Fractional and fractal thinking is quickly developing as a new paradigm of tomorrow's science of complexity, see, e.g., [263] and numerous references therein. For an extensive review of the applications of fractional differential equations in natural sciences, we can refer to [192, 206, 237, 253] and [255]. The book [72] analyses fractional PDEs by first solving the case of constant coefficients via the Fourier transform and then exploiting the method of frozen coefficients.

Fractional equations are discussed in many books, see, e.g., [24, 62, 120, 222, 229, 256]. See also [177] specifically for the Banach-space setting. Computational aspects are discussed in [186]; for boundary-value problems, see [105]. For equations with constant coefficients, one of the basic methods for solving fractional PDEs is based on the Fourier transform, see, e.g., [11]. We used this method for equations with homogeneous symbols in Chapter 4.

An important development with respect to fractional equations is the fractional calculus of variations, see [8, 195, 196]. Optimization problems of this theory are formulated in terms of a certain class of fractional equations on bounded domains, the so-called fractional Euler–Lagrange equations. Their analysis seems to have been initiated in [41]. For fractional control problems and related equations, we can refer to [1, 160, 218] and references therein.

The theory of the fractional Hamilton–Jacobi–Bellman (HJB) equation as presented here was essentially developed in [160] and [161], where further details can be found. The HJB evolution that is fractional in time arises from the scaling limits of controlled continuous-time random walks. The Euler–Lagrange equations of the fractional calculus of variations and related models of fractional mechanics lead to another type of HJB equation, for which we refer to [196] and references therein. Related classes of nonlinear fractional equations are analysed in [4].

As mentioned before, generalized fractional calculus is usually developed by extending fractional integrals to integral operators with arbitrary integral kernels and then defining fractional derivatives as the derivatives of these integral operators, see [2, 123–125, 129, 130, 196]. Alternatively, one can use some compositions and mixtures of the standard derivatives, see, e.g., [70]. In this chapter, we used yet another approach, suggested in [152] and motivated by a probabilistic interpretation. It starts with the definition of a generalized fractional derivative, and the generalized fractional operator is then defined as the corresponding potential operator, or in other words, as the fundamental solution. When following this approach, we emphasized the analytic part of the construction. This approach leads to a certain new class of generalized Mittag-Leffler functions that can be represented, like their classical counterparts, as the Laplace transforms of positive functions, expressed in terms of the Green function of the corresponding generalized mixed-fractional-derivative operators.

For the general background on Feynman–Kac formulae, we can refer to [190]. The operator-valued versions that we exploited in this chapter were put forward in [100].

The fractional Schrödinger equation is gaining popularity in the physics community, see, e.g., [28, 179, 211] and references therein. For fractional versions of the wave equations, we refer to [228]. A detailed analysis of fractional Pearson diffusions is given in [185]. Important recent developments in fractional differential equations concern fractional kinetic equations with applications to statistical mechanics and fractional stochastic PDEs. For these developments, we refer to [130, 157, 188, 191, 268, 269] and references therein.

The main physical source for fractional equations is their appearance in the scaling limit of continuous-time random walks, see, e.g., [144, 183, 184, 206] and the extensive bibliography therein.


Chapter 9

Appendix

In this final chapter, we provide some useful facts and formulas from analysis as a convenient source of references. For the sake of completeness, we provide most of the proofs, which are either very simple (as in Section 9.1) or not quite standard, except for the basic properties of Euler's Gamma and Beta functions from Section 9.2, which can be found in many sources. Theorem 9.5.1 may be new; at least, the author did not find an appropriate reference.

9.1 Fixed-point principles

The most handy form of a fixed-point theorem for us will be the following generalized contraction principle, also referred to as the Weissinger fixed-point theorem:

Proposition 9.1.1. If Φ is a mapping X → X in a complete metric space X with a metric ρ such that $\rho(\Phi^n(x),\Phi^n(y)) \le \alpha_n\rho(x,y)$ for all x, y with some $\alpha_n$ such that $A = 1 + \sum_{n=1}^\infty\alpha_n < \infty$, then Φ has a unique fixed point $x^*$, $\Phi^n(x)$ converges to $x^*$ for any x, and
\[
\rho(x,x^*) \le A\,\rho(x,\Phi(x)). \tag{9.1}
\]

Proof. For any x, we find $\rho(\Phi^n(x),\Phi^{n+1}(x)) \le \alpha_n\rho(x,\Phi(x))$, and thus
\[
\rho(\Phi^n(x),\Phi^m(x)) \le \sum_{k=n}^{m-1}\alpha_k\,\rho(x,\Phi(x)).
\]
Consequently, the sequence $\Phi^n(x)$ is Cauchy and hence converges to a point $x^*$, so that
\[
\rho(x,x^*) \le \rho(x,\Phi(x)) + \rho(\Phi(x),\Phi^2(x)) + \cdots \le A\,\rho(x,\Phi(x)).
\]



This implies (9.1). Passing to the limit in the inequality $\rho(\Phi^{n+1}(x),\Phi^n(x)) \le \alpha_n\rho(x,\Phi(x))$ yields $\rho(\Phi(x^*),x^*) = 0$. Uniqueness follows from (9.1) by applying it to another fixed point $\tilde x^*$, which yields
\[
\rho(\tilde x^*,x^*) \le A\,\rho(\tilde x^*,\Phi(\tilde x^*)) = A\,\rho(\tilde x^*,\tilde x^*) = 0. \qquad\square
\]

The following contraction principle (also referred to as the Banach fixed-point theorem) is a consequence of Proposition 9.1.1.

Proposition 9.1.2. If the mapping Φ is a contraction, that is, $\rho(\Phi(x),\Phi(y)) \le \theta\rho(x,y)$ with $\theta\in(0,1)$, then $\rho(\Phi^n(x),\Phi^n(y)) \le \theta^n\rho(x,y)$, and the previous assertion applies with $A = \sum_{n=0}^\infty\theta^n = 1/(1-\theta)$.

Unlike the contraction principle that is commonly used in the classical texts on ODEs for proving well-posedness for small times (which then extends to finite times by iterations), Proposition 9.1.1 makes it possible to prove well-posedness directly for arbitrary times, which is especially handy for generalizations that include memory.
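As a quick illustration of the bound (9.1), the following sketch iterates an arbitrarily chosen contraction of the interval [0, 1] and checks the a priori estimate ρ(x, x*) ≤ A ρ(x, Φ(x)) with A = 1/(1 − θ) from Proposition 9.1.2; the particular map is an assumption made only for the example.

```python
import math

phi = lambda x: math.cos(x)     # an arbitrary contraction of [0, 1]; theta = sin(1) < 1
theta = math.sin(1.0)
A = 1.0 / (1.0 - theta)         # the constant from Proposition 9.1.2

x = 0.0
for _ in range(100):            # Phi^n(x0) converges to the unique fixed point x*
    x = phi(x)
x_star = x

x0 = 0.0
print(x_star, phi(x_star))                           # x* satisfies Phi(x*) = x*
print(abs(x0 - x_star) <= A * abs(x0 - phi(x0)))     # the a priori bound (9.1) holds
```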

Proposition 9.1.3. If $\Phi_1,\Phi_2$ are two mappings X → X in a complete metric space X such that $\rho(\Phi_j^n(x),\Phi_j^n(y)) \le \alpha_n(j)\rho(x,y)$ for j = 1, 2 and all x, y with some $\alpha_n(j)$ such that $A(j) = 1 + \sum_{n=1}^\infty\alpha_n(j) < \infty$, and if $\rho(\Phi_1(x),\Phi_2(x)) \le \varepsilon$ for all x, then
\[
\rho(x_1^*,x_2^*) \le \varepsilon\min_{j=1,2}A(j) \tag{9.2}
\]
for the fixed points $x_j^*$ of the mappings $\Phi_j$.

Proof. Note that
\[
\rho(x_1^*,\Phi_2^n(x_1^*)) \le \rho(x_1^*,\Phi_2(x_1^*)) + \cdots + \rho(\Phi_2^{n-1}(x_1^*),\Phi_2^n(x_1^*))
\le A(2)\,\rho(x_1^*,\Phi_2(x_1^*)) \le \varepsilon A(2).
\]
Passing to the limit yields $\rho(x_1^*,x_2^*) \le \varepsilon A(2)$. □

Fixed points are also often obtained from the following (generalized) Gronwall lemma:

Proposition 9.1.4. If a continuous function $u : \mathbf{R}_+ \to \mathbf{R}_+$ satisfies the inequality
\[
u(t) \le a + \int_0^t c(s)u(s)\,ds, \tag{9.3}
\]
where c(s) is an integrable (possibly unbounded) function, then
\[
u(t) \le a\exp\Bigl\{\int_0^t c(s)\,ds\Bigr\}.
\]


Proof. Using (9.3) recursively yields
\[
u(t) \le a + a\int_0^t c(s)\,ds + \int_0^t c(s_2)\Bigl(\int_0^{s_2}c(s_1)u(s_1)\,ds_1\Bigr)ds_2,
\]
and by induction we find
\[
u(t) \le a\Bigl(1 + \int_0^t c(s)\,ds + \cdots + \int_{0\le s_1\le\cdots\le s_n\le t}c(s_n)\cdots c(s_1)\,ds_1\cdots ds_n\Bigr)
+ \int_{0\le s_1\le\cdots\le s_{n+1}\le t}c(s_{n+1})\cdots c(s_1)u(s_1)\,ds_1\cdots ds_{n+1}.
\]
This implies the required estimate by passing to the limit n → ∞. □

We shall also use the following discrete version of the above result.

Lemma 9.1.1. Let a positive sequence $\{x_n\}$ of functions on $\mathbf{R}_+$ satisfy the estimates
\[
x_n(t) \le a + \int_0^t\bigl(b + c\,x_{n-1}(s)\bigr)ds \tag{9.4}
\]
with some non-negative constants a, b, c. Then the sequence $\{x_n\}$ is bounded, with
\[
x_n(t) \le [a + bt]\,e^{ct} + x_0\,\frac{(ct)^n}{n!}. \tag{9.5}
\]

Proof. By direct induction, one gets
\[
x_n \le a\Bigl(1 + ct + \cdots + \frac{c^nt^n}{n!}\Bigr)
+ bt\Bigl(1 + \frac{ct}{2} + \cdots + \frac{c^nt^n}{(n+1)!}\Bigr)
+ \frac{c^{n+1}t^{n+1}}{(n+1)!}\,x_0, \tag{9.6}
\]
which implies (9.5). □

Exercise 9.1.1. Extend the previous result to the case of an integrable function c(s) instead of a constant c.

9.2 Special functions

The Euler Gamma and Beta functions are defined for positive arguments as
\[
\Gamma(x) = \int_0^\infty e^{-t}t^{x-1}\,dt, \qquad B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt. \tag{9.7}
\]

From the definition, one can trivially derive the formula $\Gamma(x+1) = x\Gamma(x)$ and the useful integral
\[
\int_0^\infty e^{-tp^\beta}p^\omega\,dp = \frac1\beta\Gamma\Bigl(\frac{1+\omega}{\beta}\Bigr)t^{-(1+\omega)/\beta}. \tag{9.8}
\]


Continued to complex t with positive real part, this yields
\[
\int_0^\infty\exp\{-te^{i\phi}p^\beta\}\,p^\omega\,dp = \frac1\beta\Gamma\Bigl(\frac{1+\omega}{\beta}\Bigr)e^{-i\phi(1+\omega)/\beta}\,t^{-(1+\omega)/\beta}, \tag{9.9}
\]
where t > 0 and φ ∈ (−π/2, π/2). By a lengthy derivation, one obtains the fundamental relation
\[
\Gamma(\beta)\Gamma(1-\beta) = \pi/\sin(\pi\beta). \tag{9.10}
\]
The functions Γ and B are linked by the equation
\[
\Gamma(x)\Gamma(y) = \Gamma(x+y)B(x,y). \tag{9.11}
\]

The following well-known formulae express the volume $V_d$ of the unit ball in $\mathbf{R}^d$ and the area $|S^{d-1}|$ of the unit sphere in terms of the Γ-function:
\[
V_d = \frac{2}{d}\frac{\pi^{d/2}}{\Gamma(d/2)} = \frac{\pi^{d/2}}{\Gamma(1+d/2)} = \frac1d|S^{d-1}|, \qquad |S^{d-1}| = 2\frac{\pi^{d/2}}{\Gamma(d/2)}. \tag{9.12}
\]
The latter formula includes the case $|S^0| = 2$ (a two-point set), since $\Gamma(1/2) = \sqrt\pi$.

The Mittag-Leffler functions with a parameter α > 0 or with two parameters α, β > 0 are defined as
\[
E_\alpha(x) = \sum_{k=0}^\infty\frac{x^k}{\Gamma(\alpha k+1)}, \qquad E_{\alpha,\beta}(x) = \sum_{k=0}^\infty\frac{x^k}{\Gamma(\alpha k+\beta)}. \tag{9.13}
\]
These series converge for all x, and therefore $E_\alpha(x)$ is analytic in C. As with other analytic functions, the same expansion can be used for defining the function $E_\alpha(A)$ for bounded linear operators A in a Banach space B.

Differentiating the series and using $\Gamma(\alpha k+1) = \Gamma(\alpha k)\,\alpha k$, we get
\[
\frac{d}{dx}E_\alpha(x) = \sum_{k=1}^\infty\frac{kx^{k-1}}{\Gamma(\alpha k+1)} = \sum_{k=1}^\infty\frac{x^{k-1}}{\Gamma(\alpha k)\,\alpha} = \sum_{m=0}^\infty\frac{x^m}{\Gamma(\alpha(m+1))\,\alpha},
\]
in other words,
\[
\frac{d}{dx}E_\alpha(x) = \frac1\alpha E_{\alpha,\alpha}(x). \tag{9.14}
\]

Let us derive the so-called Dirichlet formulae that generalize (9.11). Firstly, for $\alpha_1,\ldots,\alpha_n\in(0,1)$, one has
\[
\Gamma(\alpha_1)\cdots\Gamma(\alpha_n) = \Gamma(\alpha_1+\cdots+\alpha_n)
\int_{0\le s_1\le\cdots\le s_{n-1}\le 1} ds_1\cdots ds_{n-1}\,
(1-s_{n-1})^{\alpha_n-1}(s_{n-1}-s_{n-2})^{\alpha_{n-1}-1}\cdots(s_2-s_1)^{\alpha_2-1}s_1^{\alpha_1-1}. \tag{9.15}
\]


The integral in this formula is often referred to as the multinomial Beta function. In order to see that (9.15) holds, let us write
\[
\Gamma(\alpha_1)\cdots\Gamma(\alpha_n) = \int_0^\infty\!\!\cdots\int_0^\infty e^{-t_1}t_1^{\alpha_1-1}\cdots e^{-t_n}t_n^{\alpha_n-1}\,dt_1\cdots dt_n.
\]
Then a change to the variables $x_j = t_1+\cdots+t_j$ yields
\[
\Gamma(\alpha_1)\cdots\Gamma(\alpha_n) = \int_0^\infty e^{-x_n}\,dx_n
\int_{0\le x_1\le\cdots\le x_n}(x_n-x_{n-1})^{\alpha_n-1}(x_{n-1}-x_{n-2})^{\alpha_{n-1}-1}\cdots(x_2-x_1)^{\alpha_2-1}x_1^{\alpha_1-1}\,dx_1\cdots dx_{n-1},
\]
which implies (9.15) after yet another change of variables $x_j = x_n s_j$, $j = 1,\ldots,n-1$.

Another version of the Dirichlet formula is
\[
\int_{s\le s_1\le\cdots\le s_n\le t} ds_1\cdots ds_n\,(s_n-s_{n-1})^{\alpha_n-1}\cdots(s_1-s)^{\alpha_1-1}
= (t-s)^{\alpha_1+\cdots+\alpha_n}\,\frac{\Gamma(\alpha_1)\cdots\Gamma(\alpha_n)}{\Gamma(\alpha_1+\cdots+\alpha_n+1)}. \tag{9.16}
\]
For the proof, one writes the l.h.s. as the repeated integral
\[
\int_0^{t-s}d\tau\int_{0\le x_1\le\cdots\le x_{n-1}\le\tau} dx_1\cdots dx_{n-1}\,(\tau-x_{n-1})^{\alpha_n-1}\cdots(x_2-x_1)^{\alpha_2-1}x_1^{\alpha_1-1}
= \int_0^{t-s}\tau^{\alpha_1+\cdots+\alpha_n-1}\,d\tau
\int_{0\le s_1\le\cdots\le s_{n-1}\le 1} ds_1\cdots ds_{n-1}\,(1-s_{n-1})^{\alpha_n-1}\cdots(s_2-s_1)^{\alpha_2-1}s_1^{\alpha_1-1}.
\]
Then (9.16) follows by applying (9.15).
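The case n = 2 of (9.16) is easy to check numerically; the sketch below compares a direct two-dimensional quadrature of the left-hand side with the closed form on the right (the exponents and the interval are arbitrary choices, and the quadrature may warn about the integrable endpoint singularities).

```python
from scipy.integrate import dblquad
from scipy.special import gamma

a1, a2 = 0.4, 0.7          # illustrative exponents alpha_1, alpha_2 in (0, 1)
s, t = 0.0, 2.0

# Left-hand side of (9.16) for n = 2:
#   int_{s <= s1 <= s2 <= t} (s2 - s1)^{a2-1} (s1 - s)^{a1-1} ds1 ds2.
lhs, _ = dblquad(lambda s1, s2: (s2 - s1) ** (a2 - 1) * (s1 - s) ** (a1 - 1),
                 s, t,                  # outer variable s2 in [s, t]
                 lambda s2: s,          # inner variable s1 in [s, s2]
                 lambda s2: s2)

rhs = (t - s) ** (a1 + a2) * gamma(a1) * gamma(a2) / gamma(a1 + a2 + 1)
print(lhs, rhs)                         # both values agree up to quadrature error
```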

9.3 Asymptotics of the Fourier transform: power functions and their exponents

Let us begin with the Fourier transform of power functions. Since $x_+^\omega$ for any real ω is not integrable, its Fourier transform cannot be defined in the classical sense. For ω > −1, however, the function $x_+^\omega$ is locally integrable, and can therefore be considered an element of the space of tempered distributions $S'(\mathbf{R})$. Hence its Fourier transform can be defined in the sense of generalized functions. In the most elementary approach, one can define the Fourier integral via certain regularizations. For instance, for ω > −1 and p > 0, one can define
\[
\int_0^\infty r^\omega e^{\pm irp}\,dr = \lim_{\varepsilon\to 0+}\int_0^\infty r^\omega e^{\pm ir(p\pm i\varepsilon)}\,dr. \tag{9.17}
\]
Note that it is only for ω ∈ (−1, 0) that the l.h.s. is also defined as an improper Riemann integral.

Proposition 9.3.1. Let ω > −1. Then, for the integral defined by (9.17),
\[
\int_0^\infty r^\omega e^{\pm irp}\,dr = p^{-1-\omega}\Gamma(1+\omega)\exp\Bigl\{\pm i\frac\pi2(1+\omega)\Bigr\} \tag{9.18}
\]
for p > 0, or equivalently
\[
\int_0^\infty r^\omega e^{irp}\,dr = |p|^{-1-\omega}\Gamma(1+\omega)\exp\Bigl\{i\frac\pi2(1+\omega)\,\mathrm{sgn}\,p\Bigr\} \tag{9.19}
\]
for real p ≠ 0, where sgn p denotes the sign of the real number p. If τ > 0, then
\[
\int_0^\infty r^\omega e^{-r(ip+\tau)}\,dr = (ip+\tau)^{-1-\omega}\Gamma(1+\omega), \tag{9.20}
\]
where arg(ip + τ) (which is needed for a proper definition of the r.h.s. of (9.20)) is chosen in its main branch, that is, −π/2 < arg(ip + τ) < π/2.

Proof. For ω > −1 and q > 0, we have
\[
\int_0^\infty r^\omega e^{-rq}\,dr = q^{-1-\omega}\Gamma(1+\omega). \tag{9.21}
\]
Since the functions on both sides are analytic in the complex half-plane {Re q > 0}, they must coincide there. This implies (9.20). The r.h.s. of (9.20) has a limit as $q = \mp ip + \tau$ tends to any non-vanishing point on the imaginary axis, that is, as τ → 0 with a fixed p > 0. Consequently, choosing $q = \mp ip = p\exp\{\mp i\pi/2\}$ yields (9.18) and hence (9.19). □

Proposition 9.3.2. If α ∈ (0, 1) and p is real, p ≠ 0, then
\[
\int_0^\infty(e^{irp}-1)\frac{dr}{r^{1+\alpha}} = -\frac{\Gamma(1-\alpha)}{\alpha}e^{-i\pi\alpha\,\mathrm{sgn}\,p/2}|p|^\alpha = \Gamma(-\alpha)e^{-i\pi\alpha\,\mathrm{sgn}\,p/2}|p|^\alpha. \tag{9.22}
\]

Proof. Integration by parts yields
\[
\int_0^\infty(e^{irp}-1)\frac{dr}{r^{1+\alpha}} = \frac{ip}{\alpha}\int_0^\infty r^{-\alpha}e^{irp}\,dr
\]
for any real p. By (9.18), this implies (9.22), where we also used the identity $\Gamma(1-\alpha) = -\alpha\Gamma(-\alpha)$. □
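Formula (9.20), on which these computations rest, is straightforward to confirm numerically, since the factor $e^{-\tau r}$ makes the integral absolutely convergent; the parameter values below are arbitrary.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

omega, p, tau = -0.3, 2.0, 0.5          # illustrative parameters (omega > -1, tau > 0)

# Real and imaginary parts of the integrand r^omega * exp(-r(ip + tau)).
real_part, _ = quad(lambda r: r ** omega * np.exp(-tau * r) * np.cos(p * r), 0, np.inf)
imag_part, _ = quad(lambda r: -r ** omega * np.exp(-tau * r) * np.sin(p * r), 0, np.inf)
lhs = real_part + 1j * imag_part

rhs = (1j * p + tau) ** (-1 - omega) * gamma(1 + omega)   # principal branch of the power
print(lhs, rhs)                                           # the two complex numbers agree
```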


Proposition 9.3.3. Let p ∈ R, p ≠ 0. If α ∈ (1, 2), then
\[
\int_0^\infty\frac{e^{irp}-1-irp}{r^{1+\alpha}}\,dr = -\frac{\Gamma(1-\alpha)}{\alpha}|p|^\alpha e^{-i\pi\alpha\,\mathrm{sgn}\,p/2} = \Gamma(-\alpha)|p|^\alpha e^{-i\pi\alpha\,\mathrm{sgn}\,p/2}. \tag{9.23}
\]
More generally, if α ∈ (k, k + 1) with a natural k, then
\[
\int_0^\infty\Bigl(e^{irp}-1-irp-\cdots-\frac{(irp)^k}{k!}\Bigr)\frac{dr}{r^{1+\alpha}} = -\frac{\Gamma(1-\alpha)}{\alpha}|p|^\alpha e^{-i\pi\alpha\,\mathrm{sgn}\,p/2}. \tag{9.24}
\]

Proof. Integration by parts yields
\[
\int_0^\infty\Bigl(e^{irp}-1-irp-\cdots-\frac{(irp)^k}{k!}\Bigr)\frac{dr}{r^{1+\alpha}}
= \frac{ip}{\alpha}\int_0^\infty\Bigl(e^{irp}-1-irp-\cdots-\frac{(irp)^{k-1}}{(k-1)!}\Bigr)\frac{dr}{r^{\alpha}}
\]
for any real p. Therefore, (9.24) is obtained from (9.22) by induction. □

All the above formulae have natural extensions to higher dimensions. The following result is a direct corollary of the formulae (9.19) and (9.24).

Proposition 9.3.4. Let μ(ds) be a measure on the unit sphere $S^{d-1}$ in $\mathbf{R}^d$ with d > 1, and let $p \in \mathbf{R}^d$, p ≠ 0.

(i) Let ω ∈ (−1, 0). Then
\[
\int_0^\infty\int_{S^{d-1}}|y|^\omega e^{i(y,p)}\,d|y|\,\mu(d\bar y)
= \Gamma(1+\omega)\int_{S^{d-1}}|(p,\bar y)|^{-1-\omega}e^{-i\pi(1+\omega)\,\mathrm{sgn}(p,\bar y)/2}\,\mu(d\bar y), \tag{9.25}
\]
where $\bar y = y/|y| \in S^{d-1}$. If μ is symmetric, i.e., invariant with respect to the inversion s → −s, then
\[
\int_0^\infty\int_{S^{d-1}}|y|^\omega e^{i(y,p)}\,d|y|\,\mu(d\bar y)
= \Gamma(1+\omega)\int_{S^{d-1}}|(p,\bar y)|^{-1-\omega}\cos(\pi(1+\omega)/2)\,\mu(d\bar y). \tag{9.26}
\]

(ii) Let α ∈ (k, k + 1) with a natural k. Then
\[
\int_0^\infty\int_{S^{d-1}}\Bigl(e^{i(y,p)}-1-i(y,p)-\cdots-\frac{i^k(y,p)^k}{k!}\Bigr)\frac{d|y|\,\mu(d\bar y)}{|y|^{1+\alpha}}
= \Gamma(-\alpha)\int_{S^{d-1}}e^{-i\pi\alpha\,\mathrm{sgn}(p,\bar y)/2}|(p,\bar y)|^\alpha\,\mu(d\bar y). \tag{9.27}
\]
If μ is symmetric, then the last expression simplifies to
\[
\Gamma(-\alpha)\cos(\pi\alpha/2)\int_{S^{d-1}}|(p,\bar y)|^\alpha\,\mu(d\bar y). \tag{9.28}
\]

Next, we need the Fourier transform of exponentials of power functions,
\[
I_{\beta\sigma\omega}(x) = \int_0^\infty\exp\{ixp - \sigma p^\beta\}\,p^\omega\,dp = \int_0^\infty\exp\{ixp - |\sigma|e^{i\psi}p^\beta\}\,p^\omega\,dp, \tag{9.29}
\]
for x ∈ R, where $\sigma = \sigma_r + i\sigma_i = |\sigma|e^{i\psi}$, and ω, β are real constants. These integrals can usually not be calculated in closed form, but we are interested in the main term of the asymptotics for large x. Note that changing the integration variable yields
\[
I_{\beta\sigma\omega}(x) = \frac{1}{|x|^{1+\omega}}\int_0^\infty\exp\{ip\,\mathrm{sgn}(x) - \sigma p^\beta|x|^{-\beta}\}\,p^\omega\,dp. \tag{9.30}
\]

Proposition 9.3.5. Let ω > −1, β > 0, and let $\sigma_r > 0$, or equivalently ψ ∈ (−π/2, π/2). Then
\[
|I_{\beta\sigma\omega}(x)| \le C^{1+\omega}\frac{|\sigma|}{|x|^{1+\omega}}\,\Gamma(1+\omega). \tag{9.31}
\]
More precisely,
\[
I_{\beta\sigma\omega}(x) = \frac{\Gamma(1+\omega)}{|x|^{1+\omega}}\,i\,\mathrm{sgn}(x)\exp\{i\,\mathrm{sgn}(x)\pi\omega/2\} + \tilde I_{\beta\sigma\omega}(x) \tag{9.32}
\]
with
\[
|\tilde I_{\beta\sigma\omega}(x)| \le C^{1+\omega+\beta}\frac{|\sigma|}{|x|^{1+\omega+\beta}}\,\Gamma(1+\omega+\beta), \tag{9.33}
\]
where C can be taken either as C = 1 or as $2\beta|\sigma|/(\pi\sigma_r)$ depending on certain relations between σ and β that are made explicit in the proof below. (Namely, when the rotation angle $\phi = \min(\phi_0,\pi/2)$ equals π/2 or $\phi_0$, respectively.) In particular, if β ≥ 2, the second case is realized.

Proof. The idea is to rotate the beam of integration (which is $\mathbf{R}_+$) onto the angle ±φ, where the sign ± corresponds to sgn(x) and denotes the anticlockwise or clockwise rotation, respectively (so that the real part of the number ip sgn(x) is negative throughout the rotation). Also, φ ∈ (0, π/2] is chosen as close as possible to π/2, with the restriction that the real part of $\sigma p^\beta$ must remain positive throughout the rotation. These conditions justify that a rotation according to the Cauchy theorem of complex analysis can be performed.

On the beam $p = re^{\pm i\phi} = re^{i\phi\,\mathrm{sgn}(x)}$, r > 0, the real and imaginary parts of the complex number $\sigma p^\beta$ equal
\[
r^\beta\xi_r = r^\beta[\sigma_r\cos(\beta\phi) - \mathrm{sgn}(x)\,\sigma_i\sin(\beta\phi)], \qquad
r^\beta\xi_i = r^\beta[\sigma_i\cos(\beta\phi) + \mathrm{sgn}(x)\,\sigma_r\sin(\beta\phi)].
\]
The value $\xi_r$ is positive for small φ, because $\sigma_r > 0$. By monotonicity, if $\mathrm{sgn}(x)\sigma_i \ge 0$, there exists a unique $\phi_0 \in (0,\pi/(2\beta)]$ such that $\xi_r = 0$, and if $\mathrm{sgn}(x)\sigma_i < 0$, there exists a unique $\phi_0 \in (\pi/(2\beta),\pi/\beta)$ such that $\xi_r = 0$. This $\phi_0$ is specified by the equations $\sin(\phi_0\beta) = \sigma_r/|\sigma|$, $\cos(\phi_0\beta) = \mathrm{sgn}(x)\sigma_i/|\sigma|$. Therefore, the right choice of φ is $\phi = \min(\phi_0,\pi/2)$.

If β ≥ 2, then π/β ≤ π/2, which implies $\phi = \phi_0$. If β < 2, then φ = π/2 whenever
\[
\sigma_r\cos(\beta\pi/2) - \mathrm{sgn}(x)\,\sigma_i\sin(\beta\pi/2) \ge 0.
\]
Performing the rotation and turning to the real variable r on the beam $p = re^{\pm i\phi}$ that is obtained by the rotation, we find
\[
I_{\beta\sigma\omega}(x) = \frac{e^{i\phi(1+\omega)\,\mathrm{sgn}(x)}}{|x|^{1+\omega}}\int_0^\infty\exp\Bigl\{i\,\mathrm{sgn}(x)re^{i\phi\,\mathrm{sgn}(x)} - \frac{r^\beta}{|x|^\beta}(\xi_r + i\xi_i)\Bigr\}r^\omega\,dr, \tag{9.34}
\]
where $\xi_r = 0$ and $\xi_i = \mathrm{sgn}(x)|\sigma|$, if $\phi = \phi_0$, and
\[
\xi_r = \sigma_r\cos(\beta\pi/2) - \mathrm{sgn}(x)\,\sigma_i\sin(\beta\pi/2) > 0, \qquad
\xi_i = \sigma_i\cos(\beta\pi/2) + \mathrm{sgn}(x)\,\sigma_r\sin(\beta\pi/2)
\]
otherwise. In both cases, we find $\xi_r^2 + \xi_i^2 = \sigma_r^2 + \sigma_i^2 = |\sigma|^2$. Therefore,
\[
|I_{\beta\sigma\omega}(x)| \le \frac{1}{|x|^{1+\omega}}\int_0^\infty e^{-r\sin\phi}r^\omega\,dr = \frac{1}{|x|^{1+\omega}}\Gamma(1+\omega)(\sin\phi)^{-(1+\omega)},
\]
which yields (9.31) with $C = (\sin\phi)^{-1}$.

On the other hand, using the elementary inequality
\[
|\exp\{-(x+iy)\} - 1| \le |x+iy|,
\]
valid for any x > 0 and y, we obtain
\[
I_{\beta\sigma\omega}(x) = \frac{e^{i\phi(1+\omega)\,\mathrm{sgn}(x)}}{|x|^{1+\omega}}\int_0^\infty\exp\{i\,\mathrm{sgn}(x)re^{i\phi\,\mathrm{sgn}(x)}\}r^\omega\,dr + \tilde I_{\beta\sigma\omega}(x),
\]
where
\[
|\tilde I_{\beta\sigma\omega}(x)| \le \frac{1}{|x|^{1+\omega+\beta}}\int_0^\infty\exp\{-r\sin\phi\}r^{\beta+\omega}\sqrt{\xi_r^2+\xi_i^2}\,dr
= \frac{|\sigma|}{|x|^{1+\omega+\beta}}\int_0^\infty\exp\{-r\sin\phi\}r^{\beta+\omega}\,dr
= \frac{|\sigma|}{|x|^{1+\omega+\beta}}\Gamma(1+\omega+\beta)(\sin\phi)^{-(1+\omega+\beta)},
\]
which yields (9.33) with $C = (\sin\phi)^{-1}$.


Using (9.21) with
\[
q = -i\,\mathrm{sgn}(x)e^{i\phi\,\mathrm{sgn}(x)} = \exp\{i\,\mathrm{sgn}(x)(\phi - \pi/2)\} = \sin\phi - i\,\mathrm{sgn}(x)\cos\phi
\]
(which has a positive real part) yields
\[
\frac{e^{i\phi(1+\omega)\,\mathrm{sgn}(x)}}{|x|^{1+\omega}}\int_0^\infty\exp\{i\,\mathrm{sgn}(x)re^{i\phi\,\mathrm{sgn}(x)}\}r^\omega\,dr
= \frac{e^{i\phi(1+\omega)\,\mathrm{sgn}(x)}}{|x|^{1+\omega}}\Gamma(1+\omega)\exp\{-i\,\mathrm{sgn}(x)(\phi-\pi/2)(1+\omega)\}
= \frac{\Gamma(1+\omega)}{|x|^{1+\omega}}\exp\{i\,\mathrm{sgn}(x)\pi(1+\omega)/2\}, \tag{9.35}
\]
which implies (9.32). Moreover, if φ = π/2, then
\[
|\tilde I_{\beta\sigma\omega}(x)| \le \frac{|\sigma|}{|x|^{1+\omega+\beta}}\Gamma(1+\omega+\beta).
\]
If $\phi = \phi_0$, then $\sin(\phi\beta) = \sigma_r/|\sigma|$, and therefore
\[
\sin\phi > \frac{\pi\phi}{2} > \frac{\pi}{2\beta}\arcsin(\sigma_r/|\sigma|) > \frac{\pi\sigma_r}{2\beta|\sigma|},
\]
which yields the required estimate for C. □

Remark 127. Expanding the exponent $\exp\{-r^\beta(\xi_r+i\xi_i)/|x|^\beta\}$ in (9.34) into a power series yields the full asymptotic expansion of $I_{\beta\sigma\omega}(x)$ in powers of $|x|^{-\beta}$.

An extension of Proposition 9.3.5 concerns a variation or a mixing of β and σ. Namely, extending (9.29), let us consider the integral
\[
I_{\beta\sigma\omega}(x) = \int_0^\infty\exp\Bigl\{ixp - \int_s^t\sigma_\tau p^{\beta_\tau}\,\mu(d\tau)\Bigr\}p^\omega\,dp, \tag{9.36}
\]
with continuous curves $\beta_t \in \mathbf{R}_+$, $\sigma_t \in \mathbf{C}$.

Proposition 9.3.6. Let ω > −1. Let β(τ) be a continuous curve $[s,t] \to [b_{min}, b_{max}]$ with $b_{min} > 0$, let $\sigma(\tau) = \sigma_r(\tau) + i\sigma_i(\tau)$ be a continuous curve $[s,t] \to \mathbf{C}$ whose real part $\sigma_r(\tau)$ is bounded from below by a constant σ > 0 and whose magnitude |σ(τ)| is bounded from above by a constant Σ, and let μ(dτ) be an arbitrary finite (non-negative) Borel measure on [s, t] (for instance, a discrete one). Then
\[
|I_{\beta\sigma\omega}(x)| \le \frac{C}{|x|^{1+\omega}} \tag{9.37}
\]
with a constant C depending on $b_{max}$, ω and the ratio Σ/σ. More precisely,
\[
I_{\beta\sigma\omega}(x) = \frac{\Gamma(1+\omega)}{|x|^{1+\omega}}\,i\,\mathrm{sgn}(x)\exp\{i\,\mathrm{sgn}(x)\pi\omega/2\} + \tilde I_{\beta\sigma\omega}(x) \tag{9.38}
\]
with
\[
|\tilde I_{\beta\sigma\omega}(x)| \le C\int_s^t\frac{|\sigma(\tau)|\,\mu(d\tau)}{|x|^{1+\omega+\beta_\tau}}, \tag{9.39}
\]
where the constant C depends on $b_{min}$, $b_{max}$, ω and Σ/σ.

Proof. For this proof, the arguments used in Proposition 9.3.5 can be directly extended. The rotation angle φ is defined as follows: $\phi = \min(\phi_0,\pi/2)$ with
\[
\phi_0 = \inf\{\psi : \exists\,\tau\in\mathrm{supp}\,\mu : \sigma_r(\tau)\cos(\beta(\tau)\psi) - \mathrm{sgn}(x)\,\sigma_i(\tau)\sin(\beta(\tau)\psi) = 0\}.
\]
After the rotation, the integral turns into
\[
I_{\beta\sigma\omega}(x) = \frac{e^{i\phi(1+\omega)\,\mathrm{sgn}(x)}}{|x|^{1+\omega}}\int_0^\infty\exp\{i\,\mathrm{sgn}(x)re^{i\phi\,\mathrm{sgn}(x)} - (\xi_r+i\xi_i)\}\,r^\omega\,dr, \tag{9.40}
\]
where
\[
\xi_r = \int_s^t\frac{r^{\beta(\tau)}}{|x|^{\beta(\tau)}}[\sigma_r(\tau)\cos(\beta(\tau)\phi) - \mathrm{sgn}(x)\,\sigma_i(\tau)\sin(\beta(\tau)\phi)]\,\mu(d\tau), \qquad
\xi_i = \int_s^t\frac{r^{\beta(\tau)}}{|x|^{\beta(\tau)}}[\sigma_i(\tau)\cos(\beta(\tau)\phi) + \mathrm{sgn}(x)\,\sigma_r(\tau)\sin(\beta(\tau)\phi)]\,\mu(d\tau).
\]
Therefore, we obtain (9.38) with
\[
|\tilde I_{\beta\sigma\omega}(x)| \le \frac{1}{|x|^{1+\omega}}\int_0^\infty e^{-r\sin\phi}r^\omega\sqrt{\xi_r^2+\xi_i^2}\,dr.
\]
The equation
\[
[\sigma_r(\tau)\cos(\beta(\tau)\phi) - \mathrm{sgn}(x)\,\sigma_i(\tau)\sin(\beta(\tau)\phi)]^2
+ [\sigma_i(\tau)\cos(\beta(\tau)\phi) + \mathrm{sgn}(x)\,\sigma_r(\tau)\sin(\beta(\tau)\phi)]^2 = |\sigma(\tau)|^2
\]
yields
\[
\sqrt{\xi_r^2+\xi_i^2} \le |\xi_r| + |\xi_i| \le 2\int_s^t\frac{r^{\beta(\tau)}}{|x|^{\beta(\tau)}}|\sigma(\tau)|\,\mu(d\tau).
\]
Estimating $r^{\beta(\tau)}$ by $r^{b_{max}}$ and $r^{b_{min}}$ for r > 1 and r < 1, respectively, yields (9.39). □

9.4 Asymptotics of the Fourier transform: functions of power growth

In this section, we briefly discuss a branch of analysis that deals with the relation between the asymptotic behaviour of functions and their Fourier transforms. Such relations are often referred to as Tauberian theorems (see [37] for the full story). We present here only some more or less elementary results.

We start with two basic lemmas of asymptotic analysis, which are presented under more general assumptions than usual. Moreover, they give emphasis to the main terms of the asymptotics with precise error estimates, rather than to the corresponding asymptotic expansions. The first of these lemmas is Watson's lemma.

Lemma 9.4.1. Let a, β, α, λ > 0, and let $f \in C([0,a])$ be such that $|f(x)-f(0)| \le Kx$. Then
\[
\int_0^a x^{\beta-1}f(x)\exp\{-\lambda x^\alpha\}\,dx = \frac1\alpha\Gamma(\beta/\alpha)f(0)\lambda^{-\beta/\alpha} + O(\lambda^{-(\beta+1)/\alpha}), \tag{9.41}
\]
where
\[
|O(\lambda^{-(\beta+1)/\alpha})| \le C(a,\beta,\alpha)\,K\,\lambda^{-(\beta+1)/\alpha} \tag{9.42}
\]
with a constant C(a, β, α).

Proof. This can be seen by writing $f(x) = f(0) + (f(x)-f(0))$ and using (9.8) for each term. Further details are available in many textbooks on calculus. □
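A quick numerical check of the leading term in (9.41): for a smooth f, the integral should approach (1/α)Γ(β/α) f(0) λ^(−β/α) as λ grows, with a relative error of order λ^(−1/α). All parameter values and the choice of f below are arbitrary.

```python
import math
from scipy.integrate import quad
from scipy.special import gamma

a, beta, alpha = 1.0, 0.8, 1.5            # illustrative parameters (all positive)
f = lambda x: 1.0 / (1.0 + x)             # smooth, f(0) = 1, |f(x) - f(0)| <= x on [0, a]

for lam in (10.0, 100.0, 1000.0):
    integral, _ = quad(lambda x: x ** (beta - 1) * f(x) * math.exp(-lam * x ** alpha), 0, a)
    leading = gamma(beta / alpha) / alpha * f(0) * lam ** (-beta / alpha)
    print(lam, integral / leading)        # the ratio tends to 1 as lambda increases
```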

The following lemma of Erdélyi (in a modified version) is more involved:

Lemma 9.4.2. Let a > 0, β ∈ (0, 1), and let p be a complex number with a non-negative imaginary part. Let $f \in C([0,a]) \cap C^1((0,a])$ be such that f(a) = 0 and $g(x) = x^{\beta-1}(f(x)-f(0))$ has an integrable derivative. Then
\[
I = \int_0^a x^{\beta-1}f(x)\exp\{ipx\}\,dx = \Gamma(\beta)(-ip)^{-\beta}f(0) + O(p^{-1}), \tag{9.43}
\]
where $(-ip)^{-\beta}$ is the branch of the analytic function on the half-plane of p with non-negative imaginary part such that $(-ip)^{-\beta} = \lambda^{-\beta}$ for $p = i\lambda$ with a positive λ, and where
\[
|O(p^{-1})| \le C(a,\beta)\,\|g'\|_{L_1[0,a]}\,|p|^{-1} \tag{9.44}
\]
with a constant C(a, β). For instance, for real p,
\[
I = \int_0^a x^{\beta-1}f(x)\exp\{ipx\}\,dx = e^{i\pi\beta\,\mathrm{sgn}(p)/2}\Gamma(\beta)|p|^{-\beta}f(0) + O(p^{-1}). \tag{9.45}
\]

Remark 128. The assumption that $x^{\beta-1}(f(x)-f(0))$ has an integrable derivative is exactly the assumption that is needed for our main application below. It is much weaker than the requirement that f itself has an integrable derivative.

Proof. Let $p = qe^{i\psi}$ with q > 0 and ψ ∈ [0, π]. First, we prove the assertion under the additional assumption that f(x) = 1 for x ∈ [0, δ] with some δ ∈ (0, a). For that purpose, the trick is to decompose the integral I into a sum of two integrals over the segments [0, δ] and [δ, a], respectively. To the first integral, which does not depend on f, one can apply the Cauchy theorem for complex variables and replace it by a sum of integrals over the segment $x = ie^{-i\psi}y$, y ∈ [0, δ], and the part of a circle that joins the points $ie^{-i\psi}\delta$ and δ, respectively. Let us assume for the sake of definiteness that ψ ∈ [0, π/2]. Then this quarter-circle can be parametrized as $x = \delta e^{i\varphi}$, $\varphi \in (0,(\pi/2)-\psi)$, to be integrated over φ in the opposite direction. This yields $I = I_1 + I_2 + I_3$ with
\[
I_1 = e^{i\pi\beta/2}e^{-i\psi\beta}\int_0^\delta y^{\beta-1}e^{-qy}\,dy, \qquad
I_2 = -i\int_0^{(\pi/2)-\psi}\delta^\beta e^{i\varphi\beta}\exp\{iq\delta e^{i(\varphi+\psi)}\}\,d\varphi, \qquad
I_3 = \int_\delta^a x^{\beta-1}f(x)\exp\{iqxe^{i\psi}\}\,dx.
\]
By Watson's lemma, we find that
\[
I_1 = e^{i\pi\beta/2}e^{-i\psi\beta}\Gamma(\beta)q^{-\beta} + O(q^{-1}) = \Gamma(\beta)(-ip)^{-\beta} + O(q^{-1}).
\]
Integration by parts shows that $I_2$ and $I_3$ are of order 1/q, which completes the proof in this case.

Let us now turn to a general f. For that purpose, let χ(x) be a mollifier, i.e., an infinitely differentiable function $\mathbf{R}_+ \to [0,1]$ that has the value 1 in a neighbourhood of zero and vanishes for x ≥ a. If we change f to (1 − χ)f in (9.43), the integrand vanishes near zero and integration by parts shows that it is of order 1/p. Therefore, it is sufficient to prove the result for fχ instead of f. For this, we get
\[
\int_0^a x^{\beta-1}f(0)\chi(x)\exp\{ipx\}\,dx = e^{i\pi\beta/2}\Gamma(\beta)p^{-\beta}f(0) + O(p^{-1})
\]
by the above result. It remains to show that the integral
\[
\int_0^a g(x)\chi(x)\exp\{ipx\}\,dx
\]
is of order O(1/p). Here, g(x)χ(x) is a continuous function that vanishes at the boundaries x = 0, a and has an integrable derivative. Integration by parts shows that the boundary terms vanish, and we get an integral of order O(1/p), as claimed. □

The next statement reflects the main application of Erdélyi's lemma that we are interested in.

Proposition 9.4.1. Let α ∈ (0, 1) and p ∈ R, p ≠ 0. Moreover, let ν(y) be a continuous complex-valued function on y > 0 such that $G(y) = \int_y^\infty\nu(z)\,dz$ is well defined for y > 0, that there exists a finite limit
\[
\kappa = \lim_{y\to 0}y^\alpha G(y), \tag{9.46}
\]


and that the function
\[
\frac{d}{dy}\bigl(G(y) - \kappa y^{-\alpha}\bigr) = \kappa\alpha y^{-\alpha-1} - \nu(y) \tag{9.47}
\]
is integrable around the origin.

Then the inverse Fourier transform of G is an analytic function in the complex open upper half-plane {Im p > 0}, has a continuous extension to $p \in \mathbf{R}\setminus\{0\}$, and it holds that
\[
\int_0^\infty e^{ipy}G(y)\,dy = \kappa\Gamma(1-\alpha)(-ip)^{\alpha-1} + O(1/p), \tag{9.48}
\]
where, for any a > 0,
\[
|O(1/p)| \le C(a,\alpha)\,\|\kappa\alpha y^{-\alpha-1} - \nu(y)\|_{L_1([0,a])}\,|p|^{-1} \tag{9.49}
\]
with a constant C(a, α). In particular, for real p,
\[
\int_0^\infty e^{ipy}G(y)\,dy = e^{i\pi(1-\alpha)\,\mathrm{sgn}(p)/2}\Gamma(1-\alpha)\kappa|p|^{-(1-\alpha)} + O(1/p). \tag{9.50}
\]

Remark 129. If the function (9.47) tends to zero as y → 0, then
\[
\kappa\alpha = \lim_{y\to 0}y^{\alpha+1}\nu(y), \tag{9.51}
\]
which implies (9.46) by L'Hôpital's rule.

Proof. Notice first that the integral on the l.h.s. of (9.48) is well defined as an improper Riemann integral (i.e., as the limit of integrals over [0, K], K → ∞), because of
\[
\int_0^K e^{ipy}G(y)\,dy = \frac{1}{ip}\int_0^K(e^{ipy}-1)\nu(y)\,dy + \frac{1}{ip}(e^{ipK}-1)G(K)
\]
and $\lim_{y\to\infty}G(y) = 0$. Choosing a mollifier χ(y), i.e., an infinitely differentiable function $\mathbf{R}_+ \to [0,1]$ such that χ(y) = 1 for y ≤ a/2 and χ(y) = 0 for y ≥ a with some (arbitrarily chosen) a > 0, we see that
\[
\int_0^\infty e^{ipy}G(y)(1-\chi(y))\,dy = -\frac{1}{ip}\int_{a/2}^\infty e^{ipy}\bigl[\nu(y)(1-\chi(y)) - G(y)\chi'(y)\bigr]dy,
\]
which is of order 1/|p|. Therefore, it is sufficient to prove the proposition for χ(y)G(y) instead of G(y). We can write $\chi(y)G(y) = y^{-\alpha}f(y)$ with a function f that has compact support and satisfies f(0) = κ, and
\[
y^{-\alpha}(f(y) - f(0)) = G(y)\chi(y) - y^{-\alpha}\kappa
\]
has an integrable derivative by (9.47). Hence the estimate (9.48) follows from Erdélyi's lemma. □


The following asymptotic result should be compared with the exact formulae of Proposition 9.3.2.

Proposition 9.4.2. Let α ∈ (0, 1), and let ν(y) be a continuous complex-valued function on y > 0 such that
\[
\int_0^\infty\min(1,y)\,|\nu(y)|\,dy < \infty.
\]
Set $G(y) = \int_y^\infty\nu(z)\,dz$ and
\[
\psi_\nu(p) = \int_0^\infty(e^{ipy}-1)\nu(y)\,dy = ip\int_0^\infty e^{ipy}G(y)\,dy. \tag{9.52}
\]
(The second equation holds due to integration by parts.)

(i) If G(y) satisfies (9.46) and (9.47), then $\psi_\nu(p)$ is an analytic function in the complex open upper half-plane {Im p > 0} and has a continuous extension to the closed half-plane {Im p ≥ 0}. In this half-plane,
\[
\psi_\nu(p) = -(-ip)^\alpha\Gamma(1-\alpha)\kappa + O(1), \tag{9.53}
\]
where, for any a > 0,
\[
|O(1)| \le C(a,\alpha)\,\|\kappa\alpha y^{-\alpha-1} - \nu(y)\|_{L_1([0,a])} \tag{9.54}
\]
with a constant C(a, α). In particular, for real p,
\[
\psi_\nu(p) = -e^{-i\pi\alpha\,\mathrm{sgn}(p)/2}\Gamma(1-\alpha)\kappa|p|^\alpha + O(1). \tag{9.55}
\]

(ii) If $\int_0^\infty y|\nu(y)|\,dy < \infty$, then
\[
\psi_\nu(p) = \int_0^\infty(e^{ipy}-1)\nu(y)\,dy = ip\Bigl[\int_0^\infty y\nu(y)\,dy + o(1)\Bigr], \tag{9.56}
\]
where o(1) → 0 as p → 0, and the derivative
\[
\frac{d}{dp}\int_0^\infty(e^{ipy}-1)\nu(y)\,dy = \int_0^\infty iye^{ipy}\nu(y)\,dy
\]
is well defined and uniformly bounded in the half-plane {Im p ≥ 0}.

(iii) If
\[
G(y) = \bigl(g + O(1/y^\lambda)\bigr)y^{-\gamma}, \tag{9.57}
\]
with $|O(1/y^\lambda)| \le C/y^\lambda$ and some g ≠ 0, γ ∈ (0, 1), λ > 1 − γ, C > 0, then
\[
\psi_\nu(p) = \int_0^\infty(e^{ipy}-1)\nu(y)\,dy = -g\Gamma(1-\gamma)(-ip)^\gamma + O(p), \tag{9.58}
\]
where, for any a > 0,
\[
|O(p)| \le |p|\Bigl[\int_0^a G(y)\,dy + \frac{C}{\gamma+\lambda-1}a^{1-\gamma-\lambda} + \frac{g}{1-\gamma}a^{1-\gamma}\Bigr]. \tag{9.59}
\]
In particular, for real p, we find
\[
\psi_\nu(p) = \int_0^\infty(e^{ipy}-1)\nu(y)\,dy = -g\Gamma(1-\gamma)e^{-i\pi\gamma\,\mathrm{sgn}(p)/2}|p|^\gamma + O(p). \tag{9.60}
\]
Moreover, if additionally
\[
\nu(y) = \gamma\bigl(g + O(1/y^\lambda)\bigr)y^{-\gamma-1} \tag{9.61}
\]
(note that (9.61) implies (9.57) by L'Hôpital's rule), then
\[
\frac{d}{dp}\psi_\nu(p) = -g\gamma\,\mathrm{sgn}(p)\Gamma(1-\gamma)e^{-i\pi\gamma\,\mathrm{sgn}(p)/2}|p|^{\gamma-1} + O(1) \tag{9.62}
\]
for real p, with a bounded function O(1).

Proof. (i) Due to (9.52), (9.53) follows from (9.48) by multiplying that formula by ip.

(ii) If ∫ y|ν(y)| dy < ∞, then
\[
\Bigl|\int_0^\infty (e^{ipy}-1)\,\nu(y)\,dy\Bigr| \le |p|\int_0^\infty y\,|\nu(y)|\,dy,
\]
which tends to zero at least as fast as p when p → 0. The asymptotic equation (9.56) is then obtained from the formula for the derivative.

(iii) Using (9.52) and splitting the integral into two parts over [0, a] and [a, ∞), respectively, we see that the first integral yields the first estimate on the r.h.s. of (9.59). Setting p = q e^{iψ} with q = |p| > 0 and ψ ∈ [0, π], and changing the variable of integration y to z = yq in the second integral, we get
\[
ip\int_a^\infty e^{ipy} G(y)\,dy = ip\,|p|^{\gamma-1}\,g\int_{|p|a}^\infty \exp\{iz e^{i\psi}\}\,z^{-\gamma}\,dz + r
\]
with
\[
|r| \le C\,|p|^{\lambda+\gamma}\int_{|p|a}^\infty z^{-\gamma-\lambda}\,dz = \frac{C}{\gamma+\lambda-1}\,a^{1-\gamma-\lambda}\,|p|,
\]
where the condition λ + γ > 1 was used. It remains to observe that
\[
\int_{|p|a}^\infty \exp\{iz e^{i\psi}\}\,z^{-\gamma}\,dz = \int_0^\infty \exp\{iz e^{i\psi}\}\,z^{-\gamma}\,dz - \int_0^{|p|a} \exp\{iz e^{i\psi}\}\,z^{-\gamma}\,dz,
\]


and
\[
\Bigl|\int_0^{|p|a} \exp\{iz e^{i\psi}\}\,z^{-\gamma}\,dz\Bigr| \le \frac{1}{1-\gamma}\,(|p|a)^{1-\gamma},
\qquad
\int_0^\infty \exp\{iz e^{i\psi}\}\,z^{-\gamma}\,dz = \Gamma(1-\gamma)\,\exp\{i(\psi-\pi/2)(\gamma-1)\},
\]
where (9.20) was used. Therefore, the main term of ψ_ν(p) equals
\[
ip\,|p|^{\gamma-1}\,g\,\Gamma(1-\gamma)\,\exp\{i(\psi-\pi/2)(\gamma-1)\} = -g\,\Gamma(1-\gamma)\,(-ip)^{\gamma},
\]
as required. This proves (9.58) and (9.59).

Finally, the previous analysis implies (9.62) for
\[
\frac{d}{dp}\int_0^\infty (e^{ipy}-1)\,\nu(y)\,dy = \int_0^\infty iy\,e^{ipy}\,\nu(y)\,dy
\]
by using yν(y) instead of G(y). □
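To make part (i) concrete, here is a small numerical sketch; it is illustrative only, not taken from the book, and assumes NumPy/SciPy together with a particular choice of ν. For the exponentially damped stable density ν(y) = κα y^{-α-1} e^{-y}, which satisfies (9.46) and (9.47) with the same κ, the standard subordinator formula gives the closed expression ψ_ν(p) = κΓ(1-α)(1 - (1-ip)^α), and the difference between ψ_ν(p) and the leading term of (9.55) indeed stays bounded as p grows:

```python
# Illustrative check of (9.53)/(9.55); a sketch under the stated assumptions, not from the book.
# nu(y) = kappa*alpha*y**(-alpha-1)*exp(-y)  ==>  psi_nu(p) = kappa*Gamma(1-alpha)*(1-(1-ip)**alpha).
import numpy as np
from scipy.special import gamma

alpha, kappa = 0.6, 1.0
for p in [1.0, 10.0, 100.0, 1000.0]:
    psi = kappa * gamma(1 - alpha) * (1 - (1 - 1j * p) ** alpha)
    main = -kappa * gamma(1 - alpha) * np.exp(-1j * np.pi * alpha / 2) * p ** alpha
    print(p, abs(psi), abs(psi - main))   # |psi| grows like p**alpha, the remainder stays O(1)
```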

Remark 130. In order to get rough asymptotics (without error estimates), some conditions can be relaxed. For instance, the general Tauberian theorems (see Chapter 4 in [37]) imply that for monotone positive G, the main terms in (9.58) and (9.53) follow just from the existence of the limits κ = lim_{y→0} y^α G(y) and g = lim_{y→∞} y^γ G(y).

The main reason for the asymptotic analysis that has been performed above is to get estimates for the fundamental solution to ΨDOs with symbols ψ_ν(p) from (9.53). By (1.180), this fundamental solution is given by the inverse Fourier transform: E_ν = F^{-1}(1/ψ_ν).

Proposition 9.4.3. Under the assumption of Proposition 9.4.2(i), let α ∈ (1/2, 1). Then the function
\[
E_\nu(x) = -[F^{-1}(1/\psi_\nu)](x) = -\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ipx}\,\frac{dp}{\psi_\nu(p)} \qquad (9.63)
\]
is well defined for real x ≠ 0. (Note that the integral can be defined either as an improper Riemann integral or as the limit over appropriately chosen complex x.) Moreover, this function is uniformly bounded for positive x and has the following asymptotic behaviour for negative x:
\[
E_\nu(x) = \frac{1}{\pi\kappa}\,|x|^{\alpha-1}\sin(\pi\alpha) + O(1), \qquad (9.64)
\]
with a uniformly bounded (for x < 0) function O(1).


Remark 131.

(i) The minus sign was introduced in (9.63) for convenience. In fact, the main object is the function E(x) = E_ν(-x), which represents the density of the potential measure U(dx) of the operator L'_ν with the symbol ψ_ν(p). It turns out that the measure U(dx) is supported on R_+ (see Section 8.4), so that E_ν(x) vanishes identically for x > 0.

(ii) Assumption α > 1/2 is a consequence of our technique. It can be relaxed by more advanced methods (see Remark 130).
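Before turning to the proof, it may help to see where the constant in (9.64) comes from; the following cross-check is illustrative only and not taken from the book. In the exactly stable case ν(y) = κα y^{-α-1} on (0, ∞), the symbol is ψ_ν(p) = -κΓ(1-α)(-ip)^α with no remainder, and the inverse transform (9.63) can be evaluated explicitly: E_ν(x) = |x|^{α-1}/(κΓ(α)Γ(1-α)) for x < 0 and E_ν(x) = 0 for x > 0. By the reflection formula Γ(α)Γ(1-α) = π/sin(πα), this is exactly the main term of (9.64). A short numerical confirmation (assuming NumPy/SciPy):

```python
# Cross-check of the constant in (9.64) in the exactly stable case (illustrative sketch).
import numpy as np
from scipy.special import gamma

alpha, kappa, x = 0.7, 2.0, -0.5
lhs = abs(x) ** (alpha - 1) / (kappa * gamma(alpha) * gamma(1 - alpha))   # explicit kernel
rhs = np.sin(np.pi * alpha) / (np.pi * kappa) * abs(x) ** (alpha - 1)     # main term of (9.64)
print(lhs, rhs)   # the two values coincide up to rounding
```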

Proof. We have
\[
E_\nu(x) = -\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ipx}\,\frac{dp}{\psi_\nu(p)}
= -\frac{1}{2\pi}\int_0^{\infty} e^{ipx}\,\frac{dp}{\psi_\nu(p)} - \frac{1}{2\pi}\int_0^{\infty} e^{-ipx}\,\frac{dp}{\psi_\nu(-p)}.
\]

In order to apply Proposition 9.4.2(iii) (with x instead of p and p instead of y), we observe from (9.55) that, for positive p,
\[
\frac{1}{\psi_\nu(p)} = -\frac{e^{i\pi\alpha/2}}{p^{\alpha}\,\Gamma(1-\alpha)\,\kappa}\bigl(1 + O(|p|^{-\alpha})\bigr), \qquad
\frac{1}{\psi_\nu(-p)} = -\frac{e^{-i\pi\alpha/2}}{p^{\alpha}\,\Gamma(1-\alpha)\,\kappa}\bigl(1 + O(|p|^{-\alpha})\bigr).
\]
Therefore, by Proposition 9.4.2 (with γ = λ = α, hence the condition α > 1/2), we find
\[
\begin{aligned}
E_\nu(x) &= -\frac{1}{2\pi i x}\,\frac{e^{i\pi\alpha/2}}{\Gamma(1-\alpha)\,\kappa}\,\Gamma(1-\alpha)\,e^{-i\pi\alpha\,\mathrm{sgn}\,x/2}\,|x|^{\alpha}
+ \frac{1}{2\pi i x}\,\frac{e^{-i\pi\alpha/2}}{\Gamma(1-\alpha)\,\kappa}\,\Gamma(1-\alpha)\,e^{i\pi\alpha\,\mathrm{sgn}\,x/2}\,|x|^{\alpha} + O(1)\\
&= -\frac{1}{2\pi i\kappa}\,\mathrm{sgn}\,x\,|x|^{\alpha-1}\,\bigl(e^{i\pi\alpha/2}e^{-i\pi\alpha\,\mathrm{sgn}\,x/2} - e^{-i\pi\alpha/2}e^{i\pi\alpha\,\mathrm{sgn}\,x/2}\bigr) + O(1).
\end{aligned}
\]
Remarkably, the main terms cancel for positive x, which proves that E_ν is bounded in this region. For negative x, we get
\[
E_\nu(x) = \frac{1}{2\pi i\kappa}\,|x|^{\alpha-1}\bigl(e^{i\pi\alpha} - e^{-i\pi\alpha}\bigr) + O(1) = \frac{1}{\pi\kappa}\,|x|^{\alpha-1}\sin(\pi\alpha) + O(1),
\]
as claimed. □

9.5 Argmax in convex Hamiltonians

In control theory, the Hamiltonian function often appears in the form
\[
H(p) = \max_{x\in X}\,\bigl(xp - U(x)\bigr), \qquad p \in \mathbf{R}^d, \qquad (9.65)
\]


where X is a closed set in R^d and U(x) a continuous function. In many situations (see specifically Theorem 6.10.1), it is important to know the properties of the (possibly multi-valued) function
\[
x(p) = \mathrm{argmax}\,\bigl(xp - U(x)\bigr),
\]
that is, x(p) is the set of points where the maximum in (9.65) is achieved. It is straightforward to see that if X is convex and U(x) is a strictly convex function, then x is single-valued. In this case, one can expect x(p) to be Lipschitz-continuous. For simplicity, we shall establish this fact only for X being a ball.

Theorem 9.5.1. Let X = {x : |x| ≤ 1} and let U(x) be a strictly convex, twice continuously differentiable function such that
\[
\Bigl(\frac{\partial^2 U}{\partial x^2}(x)\,\xi,\ \xi\Bigr) \ge a\,(\xi,\xi)
\]
for a constant a and all x and ξ, and such that x = 0 is the point of the global minimum of U. Then x(p) : R^d → X is a well-defined (globally) Lipschitz-continuous function, with the Lipschitz constant depending on a and
\[
b = \max\Bigl\{\Bigl\|\frac{\partial^2 U}{\partial x^2}(x)\Bigr\| : x \in X\Bigr\}.
\]

Proof. The global maximum (without the constraint x ∈ X) of px − U(x) is achieved at the point x, so that p = ∇U(x). Due to the convexity, the mapping x ↦ ∇U(x) is a diffeomorphism of R^d, and hence its inverse G(p) is also a diffeomorphism. Therefore, for p ∈ (∇U)(X), we find that x = G(p) and that x is a Lipschitz-continuous and smooth function of p. For p ∈ P = R^d \ (∇U)(X), the point x(p) belongs to the unit sphere S^{d−1}. Thus we only need to show the Lipschitz continuity of the mapping x as a mapping P → S^{d−1}.

From the method of Lagrange multipliers it follows that, for p ∈ P, x(p) solves the equation
\[
p = \nabla U(x) + \lambda x
\]
with some λ > 0.

Remark 132. One can see this also directly. In fact, since the function xp − U(x) attains its maximum over S^{d−1} at x(p), the gradient of this function must be orthogonal to the tangent plane of S^{d−1} at x(p).

Therefore, for all x ∈ S^{d−1}, the set {p : x(p) = x} is the ray
\[
\{p_\lambda = \nabla U(x) + \lambda x,\ \lambda \ge 0\}.
\]
By convexity, we find
\[
\bigl(\nabla U(x) - \nabla U(y),\ x-y\bigr) = \Bigl(x-y,\ \int_0^1 \frac{\partial^2 U}{\partial x^2}\bigl(y + t(x-y)\bigr)\,dt\,(x-y)\Bigr) \ge a\,(x-y)^2,
\]


and thus
\[
|\nabla U(x) - \nabla U(y)| \ge a\,|x-y|.
\]
Consequently,
\[
|x-y| \le \frac{1}{a}\,|p_0(x) - p_0(y)|. \qquad (9.66)
\]
In order to prove the theorem, we have to show that
\[
|x-y| \le C\,|p_\lambda(x) - p_\mu(y)| \qquad (9.67)
\]
for all λ, μ ≥ 0 and a constant C.

Remark 133. In order to get a better overview, notice that the norm |p_λ| increases with λ due to (x, ∇U(x)) > 0. Therefore, p_0(x) is the minimal (in magnitude) solution to the equation x(p) = x.

For proving (9.67), let us look for the value of
\[
\min\{|p_\lambda(x) - p_\mu(y)| : \lambda, \mu \ge 0\} \qquad (9.68)
\]
for given x ≠ y from S^{d−1}. Notice first of all that this value is positive, since p_λ(x) ≠ p_μ(y) for any λ, μ and any x ≠ y, because otherwise ∇U(x) + λx = ∇U(y) + μy, and thus x(p) = x = y (because x(q) is uniquely defined for any q).

Next, let us show that the minimum in (9.68) cannot be realized on a pair (λ_0, μ_0) such that both of these numbers are positive. In fact, assuming λ_0 > 0, μ_0 > 0, it follows that
\[
\bigl(x,\ \nabla U(x) - \nabla U(y) + \lambda_0 x - \mu_0 y\bigr) = 0, \qquad
\bigl(y,\ \nabla U(x) - \nabla U(y) + \lambda_0 x - \mu_0 y\bigr) = 0,
\]
or, equivalently, that
\[
\lambda_0 - \mu_0 (x,y) = -\bigl(x,\ \nabla U(x) - \nabla U(y)\bigr), \qquad
-\lambda_0 (x,y) + \mu_0 = \bigl(y,\ \nabla U(x) - \nabla U(y)\bigr).
\]
Summing up these equations yields
\[
(\lambda_0 + \mu_0)\bigl(1 - (x,y)\bigr) = -\bigl(x-y,\ \nabla U(x) - \nabla U(y)\bigr) < 0,
\]
and thus λ_0 + μ_0 < 0, which is a contradiction.

Therefore, either the minimum in (9.68) is realized on μ_0 = λ_0 = 0, in which case we have the required Lipschitz continuity by (9.66), or one of the numbers μ_0, λ_0 vanishes. For considering this second case, let λ_0 = 0, μ_0 > 0. Then
\[
\mu_0 = \bigl(y,\ \nabla U(x) - \nabla U(y)\bigr)
\]


and
\[
\min_{\lambda,\mu}\{|p_\lambda(x) - p_\mu(y)|\} = |p_0(x) - p_{\mu_0}(y)| = \bigl|\nabla U(x) - \nabla U(y) - \bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\,y\bigr|.
\]
Therefore, it remains to show that
\[
|x-y| \le C\,\bigl|\nabla U(x) - \nabla U(y) - \bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\,y\bigr|
\]
with a constant C. This, however, would follow directly from the inequality
\[
(x-y)^2 \le C\,\bigl(x-y,\ \nabla U(x) - \nabla U(y) - \bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\,y\bigr). \qquad (9.69)
\]

Now we have
\[
\bigl(x-y,\ \nabla U(x) - \nabla U(y)\bigr) \ge a\,(x-y)^2, \qquad
\bigl|\bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\bigr| \le b\,|x-y|.
\]
Noting that (y, x−y) = −|x−y|²/2 (whenever (x, y) > 0 and |x| = |y| = 1), it follows that
\[
\bigl|(x-y,\ y)\,\bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\bigr| \le b\,|x-y|^3/2.
\]
Therefore,
\[
\bigl(x-y,\ \nabla U(x) - \nabla U(y) - \bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\,y\bigr) > a\,(x-y)^2\,(1-\omega),
\]
where ω ≤ b|x−y|/(2a). Thus, for |x−y| ≤ a/b,
\[
\bigl(x-y,\ \nabla U(x) - \nabla U(y) - \bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\,y\bigr) > a\,(x-y)^2/2,
\]
which implies (9.69) with C = 2/a.

On the other hand, on the set {x, y ∈ X : |x−y| ≥ a/b} the expression
\[
\bigl|\nabla U(x) - \nabla U(y) - \bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\,y\bigr|
\]
is bounded from below by a positive constant, say c, since it is continuous and cannot vanish. Therefore, we have
\[
|x-y| \le \frac{2}{c}\,\bigl|\nabla U(x) - \nabla U(y) - \bigl(\nabla U(x) - \nabla U(y),\ y\bigr)\,y\bigr|,
\]
which again implies (9.69). □

Exercise 9.5.1. Show that if U in Theorem 9.5.1 is spherically symmetric, i.e., of the form U(x) = J(|x|), then
\[
x(p) = \min\bigl(1,\ (J')^{-1}(|p|)\bigr)\,\frac{p}{|p|}. \qquad (9.70)
\]
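The following numerical sketch is illustrative only and not taken from the book; it assumes NumPy/SciPy, dimension d = 2, and a particular radial potential, namely U(x) = J(|x|) with J(r) = cosh r − 1, so that J'(r) = sinh r, (J')^{-1}(s) = arcsinh s, and on the unit ball one may take a = 1 and b = cosh 1. It compares the constrained argmax computed by a generic solver with formula (9.70), and probes the Lipschitz bound of Theorem 9.5.1 on random pairs of points.

```python
# Illustrative check of Theorem 9.5.1 and (9.70); a sketch under the stated assumptions.
import numpy as np
from scipy.optimize import minimize

def U(x):
    return np.cosh(np.linalg.norm(x)) - 1.0

def x_of_p_numeric(p):
    # argmax of x.p - U(x) over the unit ball = argmin of U(x) - x.p
    cons = ({'type': 'ineq', 'fun': lambda x: 1.0 - np.dot(x, x)},)
    res = minimize(lambda x: U(x) - np.dot(x, p), x0=np.zeros(2),
                   method='SLSQP', constraints=cons)
    return res.x

def x_of_p_formula(p):
    r = np.linalg.norm(p)
    return np.zeros(2) if r == 0 else min(1.0, np.arcsinh(r)) * p / r   # formula (9.70)

rng = np.random.default_rng(0)
worst_err, worst_lip = 0.0, 0.0
for _ in range(200):
    p, q = rng.normal(size=2) * 3, rng.normal(size=2) * 3
    xp, xq = x_of_p_formula(p), x_of_p_formula(q)
    worst_err = max(worst_err, np.linalg.norm(x_of_p_numeric(p) - xp))
    worst_lip = max(worst_lip, np.linalg.norm(xp - xq) / np.linalg.norm(p - q))
print(worst_err, worst_lip)   # solver agrees with (9.70); empirical Lipschitz ratio stays bounded
```

The empirical ratio |x(p) − x(q)|/|p − q| stays bounded, in line with the theorem.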


Bibliography

[1] O.P. Agrawal. A General Formulation and Solution Scheme for FractionalOptimal Control Problems, J. Nonlinear Dynamics 38 (2004), 323–337.

[2] O.P. Agrawal. Generalized Variational Problems and Euler–Lagrange equa-tions. Computers and Mathematics with Applications 59 (2010) 1852–1864.

[3] N.U. Ahmed. A general class of McKean–Vlasov stochastic evolution equa-tions driven by Brownian motion and Levy process and controlled by Levymeasure. Discuss. Math. Differ. Incl. Control Optim. 36:2 (2016), 181–206.

[4] G. Akagi, G. Schimperna and A. Segatti. Fractional Cahn–Hilliard, Allen–Cahn and porous medium equations. J. Differential Equations 261:6 (2016), 2935–2985.

[5] R. Albert and A.-L. Barabasi. Statistical mechanics of complex networks.Reviews of Modern Physics. 74:1 (2002), 47–97. arXiv:cond-mat/0106096.

[6] S.A. Albeverio, R. Høegh-Krohn and S. Mazzucchi. Mathematical theory ofFeynman path integrals. An introduction. Second edition. Lecture Notes inMathematics, 523. Springer-Verlag, Berlin, 2008.

[7] L. Ambrosio, N. Gigli and G. Savare. Gradient Flows in Metric Spaces andin the Space of Probability Measures. Birkhauser, 2005.

[8] R. Almeida, Sh. Pooseh and D.F.M. Torres. Computational methods in thefractional calculus of variations. Imperial College Press, London, 2015.

[9] R. Alonso. Boltzmann-type equations and their applications. IMPA Math-ematical Publications. 30th Brazilian Mathematics Colloquium. InstitutoNacional de Matematica Pura e Aplicada (IMPA), Rio de Janeiro, 2015.

[10] A. Ananova and R. Cont. Pathwise integration with respect to paths of finitequadratic variation. J. Math. Pures Appl. 107:6 (2017), 737–757.

[11] V.V. Anh and N.N. Leonenko. Spectral Analysis of Fractional Kinetic Equa-tions with RandomData. Journal of Statistical Physics 104, Nos. 5/6, (2001),1349–1387.

[12] D. Applebaum. Levy Processes and Stochastic Calculus. Cambridge Studiesin Advanced Mathematics, vol. 93, CUP, 2004.



[13] V.V. Aristov. Direct methods for solving the Boltzmann equation and studyof nonequilibrium flows. Fluid Mechanics and its Applications, 60. KluwerAcademic Publishers Group, Dordrecht, 2001.

[14] V.I. Arnold. Mathematical Understanding of Nature. AMS, Providence, RI,2014.

[15] V.I. Arnold. Ordinary differential equations. Translated from the Russianby Roger Cooke. Second printing of the 1992 edition. Universitext. Springer-Verlag, Berlin, 2006.

[16] V.I. Arnold. Mathematical methods of classical mechanics. Nauka, Moscow,1979 (in Russian). French translation Mir, Moscow, 1976. Polish translationPWN, Warsaw, 1981.

[17] A.A. Arsen'ev and O.E. Buryak. On the connection between a solution of the Boltzmann equation and a solution of the Landau–Fokker–Planck equation. Math. USSR Sbornik 69:2 (1991), 465–478.

[18] J.P. Aubin and A. Cellina. Differential Inclusions. New York, Springer, 1994.

[19] Yu.V. Averbukh. A minimax approach to mean field games. Mat. Sbornik 206:7 (2015), 3–32 (in Russian).

[20] V.I. Averbukh and O.G. Smolyanov. The theory of differentiation in lineartopological spaces. Russian Math Survey 22:6 (1967), 201–258.

[21] V.I. Averbukh and O.G. Smolyanov. The various definitions of the deriva-tives in linear topological spaces. Russian Math Survey 23:4 (1968), 67–258.

[22] I.F. Bailleul. Sensitivity for the Smoluchowski equation. J. Phys. A 44 (2011),no. 24, 245004.

[23] I.F. Bailleul, Peter L.W. Man and M. Kraft. A stochastic algorithm for para-metric sensitivity in Smoluchowski’s coagulation equation. SIAM J. Numer.Anal. 48:3 (2010), 1064–1086.

[24] D. Baleanu, K. Diethelm, E. Scalas and J.J. Trujillo. Fractional calculus.Models and numerical methods. Second edition. Series on Complexity, Non-linearity and Chaos, 5. World Scientific, Hackensack, NJ, 2017.

[25] J.M. Ball, J. Carr. The discrete coagulation-fragmentation equations: exis-tence, uniqueness and density conservation. J. Stat. Phys. 61 (1990), 203–234.

[26] V. Bally, L. Caramellino and R. Cont. Stochastic Integration by Parts andFunctional Ito Calculus. Advanced Courses in Mathematics CRM Barcelona.Birkhauser, 2016.

[27] V. Barbu. Nonlinear Differential Equations of Monotone Type in BanachSpaces, Springer, New York, 2010.

[28] S.S. Bayin. Time fractional Schrodinger equation. Fox’s H-functions and theeffective potential. J. Math. Phys. 54 (2013), 012103.


[29] R. Basna, A. Hilbert and V.N. Kolokoltsov. An epsilon-Nash equilibriumfor non-linear Markov games of mean-field-type on finite spaces. Commun.Stoch. Anal. 8:4 (2014), 449–468.

[30] V.P. Belavkin. Quantum branching processes and nonlinear dynamics ofmulti-quantum systems. Dokl. Acad. Nauk SSSR (in Russian) 301:6 (1988),1348–1352.

[31] V.P. Belavkin. Multiquantum systems and point processes I. Reports onMath. Phys. 28 (1989), 57–90.

[32] V. Belavkin, V. Kolokoltsov. On general kinetic equation for many parti-cle systems with interaction, fragmentation and coagulation. Proc. R. Soc.Lond. A 459 (2002), 1–22.

[33] G. Ben Arous. Developpement asymptotique du noyau de la chaleur hypoel-liptique sur la diagonale. Ann. Inst. Fourier (Grenoble) 39:1 (1989), 73–99.

[34] A. Bensoussan, J. Frehse and Ph. Yam. Mean field games and mean fieldtype control theory. Springer Briefs in Mathematics. Springer, New York,2013.

[35] A. Bensoussan, J. Frehse and Ph. Yam. On the interpretation of the MasterEquation. Stochastic Process. Appl. 127:3 (2017), 2093–2137.

[36] J. Bertoin. Random fragmentation and coagulation processes. CambridgeStudies in Advanced Mathematics, 102. Cambridge University Press, Cam-bridge, 2006.

[37] N.H. Bingham, C.M. Goldie and J.L. Teugels. Regular variation. CambridgeUniversity Press, 1987.

[38] G. Birkhoff. Lattice Theory. American Mathematical Society, 1995 [3rd ed. with corrections].

[39] K. Bogdan. Sharp estimates for the Green function in Lipschitz domains. J.Math. Anal. Appl. 243 (2000) 326–337.

[40] A.V. Bolsinov and A.T. Fomenko. Integrable geodesic flows on two-dim-ensional surfaces. Monographs in Contemporary Mathematics. ConsultantsBureau, New York, 2000.

[41] L. Bourdin. Existence of a weak solution for fractional Euler–Lagrange equa-tions. J. Math. Anal. Appl. 399:1 (2013), 239–251.

[42] C. Buse, M. Megan, M.-S. Prajea and P. Preda. The strong variant of a Bar-bashin Theorem on Stability of Solutions for Non-Autonomous DifferentialEquations in Banach Spaces. Integr. Equ. Oper. Theory 59 (2007), 491–500.

[43] P.E. Caines, “Mean Field Games”, Encyclopedia of Systems and Control,Eds. T. Samad and J. Ballieul. Springer Reference 364780; DOI 10.1007/978-1-4471-5102-9 30-1, Springer-Verlag, London, 2014.

[44] P. Cardaliaguet, J.-M. Lasry, P.-L. Lions and A. Porretta. Long time averageof mean field games with a nonlocal coupling. SIAM J. Control Optim. 51:5(2013), 3558–3591.


[45] P. Cardaliaguet, F. Delarue, J.-M. Lasry and P.-L. Lions. The master equa-tion and the convergence problem in mean field games. arXiv:1509.02505v1[math.AP]

[46] R. Carmona and F. Delarue. Forward-backward stochastic differential equa-tions and controlled McKean–Vlasov dynamics. Ann. Probab. 43:5 (2015),2647–2700.

[47] R. Carmona and F. Delarue. The master equation for large population equi-libriums. Stochastic analysis and applications 2014, 77–128, Springer Proc.Math. Stat., 100, Springer, Cham, 2014.

[48] R. Carmona and F. Delarue. Probabilistic Theory of Mean Field Games withApplications, vol. I, II. Probability Theory and Stochastic Modelling vol. 83,84. Springer, 2018.

[49] C. Cercignani, G.M. Kremer. The relativistic Boltzmann equation: theoryand applications. Progress in Mathematical Physics, 22. Birkhauser Verlag,Basel, 2002.

[50] M. Chaleyat-Maurel and L. Elie. Diffusions Gaussiannes. Asterisque. 84-85(1981), 1762–1768.

[51] S. Cho, P. Kim and H. Park. Two-sided estimates on Dirichlet heat kernels for time-dependent parabolic operators with singular drifts in C^{1,α}-domains. J. Differential Equations 252:2 (2012), 1101–1145.

[52] M. Cichon. On solutions of differential equations in Banach spaces. NonlinearAnalysis 60 (2005), 651–667.

[53] C. Chicone and Y. Latushkin. Evolution Semigroups in Dynamical Systemsand Differential Equations. Mathematical Surveys and Monographs, vol. 70.AMS, 1999.

[54] M.G. Crandall and P.-L. Lions. Hamilton–Jacobi Equations in Infinite Di-mensions I. J. Funct. Anal. 62 (1985), 379–396.

[55] D. Crisan, Th. Kurtz and Y. Lee. Conditional distributions, exchangeableparticle systems, and stochastic partial differential equations. Ann. Inst.Henri Poincare Probab. Stat. 50:4 (2014), 946–974.

[56] D. Crisan and E. McMurray. Smoothing properties of McKean–Vlasov SDEs.Probab. Theory Related Fields 171:1-2 (2018), 97–148.

[57] J.L. Da Silva, A.N. Kochubei and Y. Kondratiev. Fractional statistical dy-namics and fractional kinetics. Methods Funct. Anal. Topology 22:3 (2016),197–209.

[58] G. Darbo. Punti uniti in transformazioni a condominio non compacto. Rend.Sem. Mat. Univ. Padova, 24 (1955), 84–92.

[59] M. Deaconu, N. Fournier, E. Tanre. Rate of convergence of a stochastic par-ticle system for the Smoluchowski coagulation equation. Methodol. Comput.Appl. Probab. 5:2 (2003), 131–158.


[60] F. Delarue, S. Menozzi and E. Nualart. The Landau equation for Maxwellianmolecules and the Brownian motion on SON (R). Electron. J. Probab. 20(2015), no. 92.

[61] P. Del Moral and A. Doucet. Interacting Markov chain Monte Carlo methodsfor solving nonlinear measure-valued equations. Ann. Appl. Probab. 20:2(2010), 593–639.

[62] K. Diethelm. The Analysis of Fractional Differential Equations. LectureNotes in Mathematics, vol. 2004. Springer (2010).

[63] N. Dinculeanu. Vector integration and stochastic integration in Banachspaces. Wiley, New York, 2000.

[64] R.J. DiPerna. Measure-valued solutions to conservation laws. Arch. RationalMech. Anal. 88:3 (1985), 223–270.

[65] R.J. DiPerna and P.-L. Lions. On the Cauchy problem for Boltzmann equa-tions: global existence and weak stability. Ann. of Math. 130:2 (1989), 321–366.

[66] B. Djehiche, H. Tembine and R. Tempone. A stochastic maximum principlefor risk-sensitive mean-field type control. IEEE Trans. Automat. Control60:10 (2015), 2640–2649.

[67] S. Yu. Dobrokhotov and A.I. Shafarevich. Tunnel splitting of the spectrumof Laplace–Beltrami operators on two-dimensional surfaces with a square-integrable geodesic flow (in Russian). Funktsional. Anal. i Prilozhen. 34:2(2000), 67–69 (in Russian). Engl. transl. Funct. Anal. Appl. 34:2 (2000),133–134.

[68] J. Duan. An introduction to Stochastic Dynamics. Cambridge Texts in Ap-plied Mathematics. Cambridge, 2015.

[69] P. Dubovskii. Mathematical theory of coagulation. Lecture Notes Series,23. Seoul National University, Research Institute of Mathematics, GlobalAnalysis Research Center, Seoul, 1994.

[70] M.M. Dzherbashian and A.B. Nersesian, Fractional derivatives and theCauchy problem for differential equations of fractional order, Izv. Acad.Nauk Armjanskvy SSR 3:1 (1968), 3–29.

[71] M.M. Dzherbashian. Integral Transforms and Representations of Functions in the Complex Plane. Nauka, Moscow, 1966 (in Russian).

[72] S.D. Eidelman, S.D. Ivasyshen and A.N. Kochubei. Analytic Methods in theTheory of Differential and Pseudo-Differential Equations of Parabolic Type.Operator Theory: Advances and Applications, vol. 152. Springer, Basel,2004.

[73] Ch.M. Elliott and Z. Songmu. On the Cahn–Hilliard Equation. Arch. Ratio-nal Mech. Anal. 96:4 (1986), 339–357.

[74] Ch.M. Elliott and H. Garcke. On the Cahn–Hilliard equation with degener-ate mobility. SIAM J. Math. Anal. 27:2 (1996), 404–423.


[75] A. Favini, A. Yagi. Degenerate differential equations in Banach spaces.Monographs and Textbooks in Pure and Applied Mathematics, 215. MarcelDekker, Inc., New York, 1999.

[76] H.O. Fattorini. The Cauchy problem. Encycl. Math. Appl. 18, Addison Wes-ley, Reading Mass., 1983.

[77] M.V. Fedoryuk. Asymptotics of the Green function of pseudo-differentialparabolic equations (in Russian). Differential equations 14:7 (1978), 1296–1301.

[78] W. Feller. An introduction to Probability. 2nd Edition, vol. 2, John Wiley,1971.

[79] A.F. Filippov. Differential equations with discontinuous right-hand side.Mat. Sb. 51(93):1 (1960), 99–128.

[80] A.F. Filippov. Differential Equations with Discontinuous Righthand Sides.Kluwer Academic, Dordrecht, 1988.

[81] D. Finkelshtein. Around Ovsyannikov’s method. Methods Funct. Anal.Topology 21:2 (2015), 134–150.

[82] D.L. Finkelshtein, Yu.G. Kondratiev, M.J. Oliveira. Markov evolutions andhierarchical equations in the continuum. II: Multicomponent systems. Rep.Math. Phys. 71:1 (2013), 123–148.

[83] W.H. Fleming and H.M. Soner. Controlled Markov Processes and Viscosity Solutions. Second edition. Springer, 2006.

[84] H. Follmer. Calcul d’Ito sans probabilites. In Seminar on Probability, XV(Univ. Strasbourg, Strasbourg, 1979/1980) (French), Lecture Notes in Math-ematics 850 (1981), 143–150. Springer, Berlin.

[85] A.T. Fomenko and Kh. Tsishang. A topological invariant and a criterionfor the equivalence of integrable Hamiltonian systems with two degrees offreedom. Izv. Akad. Nauk SSSR Ser. Mat. 54:3 (1990), 546–575 (in Russian).English transl. Math. USSR-Izv. 36:3 (1991), 567–596.

[86] T.D. Frank. Nonlinear Fokker–Planck Equations, Fundamentals and Appli-cations. Springer Series in Synergetics, 2005.

[87] M. Freidlin. Functional Integration and Partial Differential Equations.Princeton Univ. Press, Princeton, NY 1985.

[88] I. Gallagher, L. Saint-Raymond, B. Texier. From Newton to Boltzmann: hardspheres and short-range potentials. Zurich Lectures in Advanced Mathemat-ics. European Mathematical Society, Zurich, 2013.

[89] G.N. Galanis and P.K. Palamides. Nonlinear differential equations in Frechetspaces and continuum cross-section. Analele Stiintifice ale universitatii “Al.I. Cuza” IASI. Tomul LI, s.I, Matematica (2005), f.1.

[90] I.M. Gelfand and G.E. Shilov. Generalized Functions, vol. 1. Academic Press1964. Transl. from Russian, Moscow, 1958.


[91] I.M. Gelfand and G.E. Shilov. Generalized Functions, vol. 3. Academic Press1964. Transl. from Russian, Moscow, 1958.

[92] D.A. Gomes and J. Saude. Mean field games models – a brief survey. Dyn.Games Appl. 4:2 (2014), 110–154.

[93] D.A. Gomes, J. Mohr and R.R. Souza. Continuous time finite state spacemean field games. Appl. Math. Optim. 68:1 (2013), 99–143.

[94] A. Gorban and V. Kolokoltsov. Generalized Mass Action Law and Thermo-dynamics of Nonlinear Markov Processes. Mathematical Modeling of NaturalPhenomena (MMNP) 10:5 (2015), 16–46, http://www.mmnp-journal.org/

[95] A.N. Gorban, M. Shahzad. The Michaelis–Menten–Stueckelberg Theorem.Entropy 13 (2011), 966–1019.

[96] P. Gorka, H. Prado and J. Trujillo. The time fractional Schrodinger equationon Hilbert space. Integral Equations Operator Theory 87:1 (2017), 1–14.

[97] O. Gueant, J.-M. Lasry and P.-L. Lions. Mean Field Games and Appli-cations. Paris-Princeton Lectures on Mathematical Finance 2010. LectureNotes in Math. (2003), Springer, Berlin, p. 205–266.

[98] H. Guerin, S. Meleard and E. Nualart. Estimates for the density of a non-linear Landau process. Journal of Functional Analysis 238 (2006), 649–677.

[99] M.E. Gurtin. Generalized Ginzburg–Landau and Cahn–Hilliard equationsbased on a microforce balance. Physica D 92 (1996), 178–192.

[100] J.W. Hagood. The operator-valued Feynman–Kac formula with noncommu-tative operators. J. Funct. Anal. 38:1 (1980), 99–117.

[101] P. Hajek and P. Vivi. Some problems on ordinary differential equations inBanach spaces. RACSAM 104:2 (2010), 245–255.

[102] P. Hajek and M. Johanis. On Peano’s theorem in Banach spaces. J. Differ-ential Equations 249:12 (2010), 3342–3351.

[103] A. Hammond and F. Rezakhanlou. The kinetic limit for a system of coagulating Brownian particles. Arch. Ration. Mech. Anal. 185:1 (2007), 1–67.

[104] Ph. Hartman. Ordinary differential equations. Corrected reprint of the Sec.Edition. Classics in Applied Mathematics, 38. Society for Industrial andApplied Mathematics (SIAM), Philadelphia, PA, 2002.

[105] J. Henderson and R. Luca (2016). Boundary-value problems for systems ofdifferential, difference and fractional equations. Positive solutions. Elsevier,Amsterdam. 2016.

[106] M.E. Hernandez-Hernandez and V.N. Kolokoltsov. On the probabilistic ap-proach to the solution of generalized fractional differential equations of Ca-puto and Riemann–Liouville type. Journal of Fractional Calculus and Ap-plications 7:1 (2016), 147–175.

[107] M.E. Hernandez-Hernandez and V.N. Kolokoltsov. On the solution of two-sided fractional ordinary differential equations of Caputo type. Fract. Calc.Appl. Anal. 19:6 (2016), 1393–1413.


[108] G. Herzog. On Lipschitz conditions for ordinary differential equations inFrechet spaces. Czechoslovak Mathematical Journal 48 (123) (1998), 95–103.

[109] L. Hormander. Pseudo-differential operators. Comm. Pure Appl. Math. 18(1965), 501–517.

[110] L. Hormander. The analysis of linear partial differential operators. Vol. 1:Distribution theory and Fourier analysis. Vol. 2: Differential operators withconstant coefficients. Vol. 3: Pseudo-differential operators. Vol. 4: Fourierintegral operators. Berlin, New York : Springer, 2nd Ed., 2003–2009.

[111] J. Hofbauer, K. Sigmund. Evolutionary Games and Population Dynamics.Cambridge University Press, 1998.

[112] E. Horst. Differential equations in Banach spaces: five examples. Arch. Math.46 (1986), 440–444.

[113] M. Huang, R.P. Malhame, P.E. Caines. Large population stochastic dynamicgames: closed-loop McKean–Vlasov systems and the Nash certainty equiva-lence principle. Commun. Inf. Syst. 6:3 (2006), 221–251.

[114] L. Huang and St. Menozzi. A parametrix approach for some degeneratestable driven SDEs. Ann. Inst. Henri Poincare Probab. Stat. 52:4 (2016),1925–1975.

[115] K. Iosida. Functional analysis. Springer, 1965.

[116] S. Ito. Diffusion equation. Translations of Mathematical Monographs, 114.AMS, Providence, RI, 1992.

[117] L.M. Graves. Riemann Integration and Taylor’s Theorem in General Anal-ysis. Transactions of the American Mathematical Society 29:1 (1927), 163–177.

[118] N. Jacob. Pseudo-differential operators and Markov processes. Vol. II: Gen-erators and their potential theory. London: Imperial College Press, 2002.

[119] O. Kalenberg. Foundations of Modern Probability. Sec. Ed., Springer, Berlin,2002.

[120] A. Kilbas, H.M. Srivastava and J.J. Trujillo. Theory and applications offractional differential equations. Elsevier, Amsterdam, 2006.

[121] P. Kim and R. Song. Estimates on Green functions and Schrodinger-typeequations for non-symmetric diffusions with measure-valued drifts. J. Math.Anal. Appl. 332:1 (2007), 57–80.

[122] P. Kim and R. Song. Intrinsic ultracontractivity of nonsymmetric diffusionswith measure-valued drifts and potentials. Ann. Probab. 36:5 (2008), 1904–1945.

[123] V. Kiryakova. Generalized fractional calculus and applications. Pitman Re-search Notes in Mathematics Series, 301. Longman Scientific, Harlow. Cop-ublished in the United States with John Wiley and Sons, New York, 1994.

[124] V. Kiryakova. A brief story about the operators of the generalized fractionalcalculus. Fract. Calc. Appl. Anal. 11:2 (2008), 203–220.


[125] V. Kiryakova. From the hyper-Bessel operators of Dimovski to the general-ized fractional calculus. Fract. Calc. Appl. Anal. 17:4 (2014), 977–1000.

[126] K. Kiyohara. Two-dimensional geodesic flows having first integrals of higherdegree. Math. Ann. 320:3 (2001), 487–505.

[127] V. Knopova and A. Kulik. Parametrix construction of the transition proba-bility density of the solution to an SDE driven by α-stable noise. Annales del’Institut Henri Poincare. Probabilites et Statistiques 54:1 (2018), 100–140.

[128] A. Kochubei. Parabolic pseudodifferential equations, hypersingular integrals,and Markov processes. Math USSR Izvestiya 33:2 (1989), 233–259.

[129] A. Kochubei. General fractional calculus, evolution equations, and renewalprocesses. Integral Equations Operator Theory 71:4 (2011), 583–600.

[130] A.N. Kochubei and Y. Kondratiev. Fractional kinetic hierarchies and inter-mittency. Kinet. Relat. Models 10:3 (2017), 725–740.

[131] A.N. Kolmogorov and S.V. Fomin. Elements of the theory of functions andfunctional analysis (in Russian). Nauka, Moscow, 1976.

[132] V.N. Kolokoltsov. Geodesic flows on two-dimensional manifolds with an ad-ditional first integral that is polynomial with respect to velocities. Izv. Akad.Nauk SSSR Ser. Mat. 46:5 (1982), 994–1010 (in Russian).

[133] V.N. Kolokoltsov. New examples of manifolds with closed geodesics. VestnikMoskov. Univ. Ser. I Mat. Mekh. 4 (1984), 80–82 (in Russian).

[134] V.N. Kolokoltsov. Symmetric Stable Laws and Stable-like Jump-Diffusions.Proc. London Math. Soc. 3:80 (2000), 725–768.

[135] V.N. Kolokoltsov. Small diffusion and fast dying out asymptotics for super-processes as non-Hamiltonian quasi-classics for evolution equations. Elec-tronic Journal of Probability 6 (2001), paper 21.

[136] V.N. Kolokoltsov. Semiclassical Analysis for Diffusions and Stochastic Pro-cesses. Springer LNM, vol. 1724, Springer 2000.

[137] V.N. Kolokoltsov. A new path integral representation for the solutions ofthe Schrodinger and stochastic Schrodinger equation. Math. Proc. Cam.Phil.Soc. 132 (2002), 353–375.

[138] V.N. Kolokoltsov. On the singular Schrodinger equations with magneticfields. Matem. Zbornik 194:6 (2003), 105–126 (in Russian). Engl. transl.Sbornik Mathematics, p. 897–918.

[139] V.N. Kolokoltsov. Hydrodynamic limit of coagulation-fragmentation typemodels of k-nary interacting particles. Journal of Statistical Physics 115,5/6 (2004), 1621–1653.

[140] V.N. Kolokoltsov. Measure-valued limits of interacting particle systems withk-nary interaction II. Finite-dimensional limits. Stochastics and StochasticsReports 76:1 (2004), 45–58.


[141] V.N. Kolokoltsov. Kinetic equations for the pure jump models of k-naryinteracting particle systems. Markov Processes and Related Fields 12 (2006),95–138.

[142] V.N. Kolokoltsov. On the regularity of solutions to the spatially homoge-neous Boltzmann equation with polynomially growing collision kernel. Ad-vanced Studies in Contemp. Math. 12 (2006), 9–38.

[143] V.N. Kolokoltsov. Nonlinear Markov Semigroups and Interacting Levy TypeProcesses. Journ. Stat. Physics 126:3 (2007), 585–642.

[144] V.N. Kolokoltsov. Generalized Continuous-Time Random Walks (CTRW),Subordination by Hitting Times and Fractional Dynamics. arXiv:0706.1928v1[math.PR] 2007. Probab. Theory and Applications 53:4 (2009).

[145] V.N. Kolokoltsov. The central limit theorem for the Smoluchovski coagula-tion model. arXiv:0708.0329v1[math.PR] 2007. Prob. Theory Relat. Fields146: 1 (2010), 87–153.

[146] V.N. Kolokoltsov. The Levy–Khintchine type operators with variable Lip-schitz continuous coefficients generate linear or nonlinear Markov pro-cesses and semigroups. arXiv:0911.5688 (2009). Prob. Theory Related Fields151(2011), 95–123.

[147] V.N. Kolokoltsov. Nonlinear Markov processes and kinetic equations. Cam-bridge Tracks in Mathematics 182, Cambridge Univ. Press, 2010.

[148] V.N. Kolokoltsov. Markov processes, semigroups and generators. De GruyterStudies in Mathematics vol. 38, De Gruyter, 2011.

[149] V.N. Kolokoltsov. Stochastic Integrals and SDE Driven by Nonlinear LevyNoise. In: D. Crisan (Ed). Stochastic Analysis 2010. Springer, Berlin Heidel-berg, 2011, p. 227–242.

[150] V.N. Kolokoltsov. Nonlinear Markov games on a finite state space (mean-field and binary interactions). International Journal of Statistics and Prob-ability 1:1 (2012), 77–91.http://www.ccsenet.org/journal/index.php/ijsp/article/view/16682

[151] V.N. Kolokoltsov. Nonlinear diffusions and stable-like processes with coef-ficients depending on the median or VaR. Applied Mathematics and Opti-mization 68:1 (2013), 85–98.

[152] V.N. Kolokoltsov. On fully mixed and multidimensional extensions of theCaputo and Riemann–Liouville derivatives, related Markov processes andfractional differential equations. Fract. Calc. Appl. Anal. 18:4 (2015), 1039–1073. http://arxiv.org/abs/1501.03925.

[153] V.N. Kolokoltsov. Stochastic monotonicity and duality of kth order withapplication to put-call symmetry of powered options. http://arxiv.org/abs/1405.3894 Journal of Applied Probability 52:1 (2015), 82–101.


[154] V.N. Kolokoltsov. The evolutionary game of pressure (or interference), re-sistance and collaboration (2014). MOR (Mathematics of Operations Re-search), 42 (2017), no. 4,915–944.

[155] V.N. Kolokoltsov and O.A. Malafeyev. Understanding Game Theory. WorldScientific, Singapore, 2010.

[156] V.N. Kolokoltsov and O.A. Malafeyev. Mean field game model of corruption.Dynamics Games and Applications. 7:1 (2017), 34–47. Open Access mode.

[157] V.N. Kolokoltsov and O.A. Malafeyev. Many agent games in socio-economicsystems: corruption, inspection, coalition building, network growth, security.Springer, 2019.

[158] V.N. Kolokoltsov and V.P. Maslov. Idempotent analysis and its applications.Kluwer Publishing House, 1997.

[159] V.N. Kolokoltsov and A.E. Tyukov. Boundary-value problems for Hamilto-nian systems and absolute minimizers in calculus of variations. Electron. J.Differential Equations, No. 90 (2006).

[160] V. Kolokoltsov and M. Veretennikova. Fractional Hamilton Jacobi Bellmanequations for scaled limits of controlled Continuous Time Random Walks.Communications in Applied and Industrial Mathematics 6:1 (2014), e-484.DOI: 10.1685/journal.caim.484http://caim.simai.eu/index.php/caim/article/view/484/PDF

[161] V. Kolokoltsov and M. Veretennikova. Well-posedness and regularity ofthe Cauchy problem for nonlinear fractional in time and space equations.http://arxiv.org/abs/1402.6735. ‘Fractional Differential Calculus’ 4:1(2014),1–30, http://files.ele-math.com/articles/fdc-04-01.pdf

[162] V. Kolokoltsov and M. Troeva. On the mean field games with common noise and the McKean–Vlasov SPDEs. arXiv:1506.04594. To appear in Stochastic Analysis and Applications.

[163] V. Kolokoltsov, M. Troeva and W. Yang. On the rate of convergence forthe mean-field approximation of controlled diffusions with large number ofplayers. Dyn. Games Appl. 4:2 (2014), 208–230.

[164] V.N. Kolokoltsov and W. Yang. Existence of solutions to path-dependentkinetic equations and related forward-backward systems. Open Journal ofOptimization 2:2 (2013), 39–44.

[165] V.N. Kolokoltsov and W. Yang (2013). Sensitivity analysis for HJB equa-tions with an application to coupled backward-forward systems (2013).arXiv:1303.6234

[166] A.N. Kolmogorov and S.V. Fomin. Elements of the theory of functions andfunctional analysis. Moscow, Nauka, 6th Edition, 1989 (in Russian). Ara-bic translation by Dar Mir, Moscow, 1988. Portuguese translation by Mir,Moscow, 1982. German translation as Hochschulbucher fur Mathematik,Band 78. VEB Deutscher Verlag der Wissenschaften, Berlin, 1975. French


translation by Mir, Moscow, 1974. English translation by Dover Publica-tions, Inc., New York, 1975.

[167] V. Konakov, S. Menozzi and S. Molchanov. Explicit parametrix and locallimit theorems for some degenerate diffusion processes. Ann. Inst. HenriPoincare Probab. Stat. 46:4 (2010) 908–923.

[168] Y. Kondratiev, T. Kuna and N. Ohlerich. Spectral gap for Glauber typedynamics for a special class of potentials. Electron. J. Probab. 18 (2013),no. 42.

[169] Y. Kondratiev, T. Pasurek and M. Rockner. Gibbs measures of continuoussystems: an analytic approach. Rev. Math. Phys. 24:10 (2012), 1250026.

[170] V.V. Kozlov. Polynomial conservation laws for the Lorentz and the Boltz-mann–Gibbs gases. Uspekhi Mat. Nauk 71:2 (2016), 81–120 (in Russian).Engl. transl. Russian Math. Surveys 71:2 (2016), 253–290.

[171] N.N. Krasovskii and A.I. Subbotin. Game-Theoretical Control Problems.New York, Springer, 1988.

[172] B. Kruglikov. Invariant characterization of Liouville metrics and polynomialintegrals. J. Geom. Phys. 58:8 (2008), 979–995.

[173] A.M. Kulik. On weak uniqueness and distributional properties of a solutionto an SDE with α-stable noise. To appear in SPA.

[174] H. Kunita. Stochastic Flows and Stochastic Differential Equations. Cam-bridge studies in advanced mathematics, vol. 24. Cambridge Univ. Press,1990.

[175] O.A. Ladyzhenskaia, V.A. Solonnikov, and N.N. Uraltceva. Linear and quasi-linear equations of parabolic type. Translations of mathematical mono-graphs, vol. 23. Providence, American Mathematical Society, 1968. Russianedition published in Moscow in 1967.

[176] V. Lakshmikantham, A.R. Mitchell and R.W. Mitchell. Differential equa-tions on closed subsets of a Banach space. Transactions of the AMS 220(1976), 103–113.

[177] V. Lakshmikantham and J. Vasundhara Devi. Theory of Fractional Differ-ential Equations in a Banach Space. European Journal of Pure and AppliedMathematics 1:1 (2008), 38–45.

[178] V. Lakshmikantham et al. Theory of causal differential equations. Atlantisstudies in mathematics for engineering and science, vol. 5. Amsterdam, Paris.Atlantis Press, World Scientific, 2009.

[179] N. Laskin. Fractional Schrodinger equation. Phys. Rev. E 66 (2002), 056108.

[180] J.-M. Lasry and P.-L. Lions. Jeux a champ moyen. I. Le cas stationnaire(French). C.R. Math. Acad. Sci. Paris 343:9 (2006) 619–625.

[181] Ph. Laurencot and S. Mishler. Global existence for the discrete diffusivecoagulation-fragmentation equations in L1. Rev. Mat. Iberoamericana. 18:3(2002), 731–745.


[182] R. Leandre. Developpement asymptotique de la densite d’une diffusiondegenere. Forum Math. 4 (1992), 45–75.

[183] N.N. Leonenko, M.M. Meerschaert and A. Sikorskii. Correlation structureof fractional Pearson diffusions. Comput. Math. Appl. 66:5 (2013), 737–745.

[184] N.N. Leonenko, M.M. Meerschaert and A. Sikorskii. Fractional Pearson dif-fusions. J. Math. Anal. Appl. 403 (2013) 532–546.

[185] N.N. Leonenko, I. Papic, A. Sikorskii and N. Suvak. Heavy-tailed fractionalPearson diffusions. Stochastic Processes and their Applications 127 (2017)3512–3535.

[186] Ch. Li and F. Zeng (2015). Numerical methods for fractional calculus. CRCPress, Boca Raton, 2015.

[187] J. Lindenstrauss and L. Tzafriri. Classical Banach Spaces, vols. 1 and 2.Springer-Verlag, 1977.

[188] W. Liu, M. Rockner and J.L. da Silva. Quasi-linear (stochastic) partial dif-ferential equations with time-fractional derivatives. SIAM J. Math. Anal.50:3 (2018), 2588–2607.

[189] S.G. Lobanov and O.G. Smolyanov. Ordinary differential equations in locallyconvex spaces (Russian). Uspekhi Mat. Nauk 49:3 (1994), 93–168. EnglishTransl. in Russian Math. Surveys 49:3 (1994), 97–175.

[190] J. Lorinczi, F. Hiroshima and V. Betz. Feynman–Kac-type theorems andGibbs measures on path space. With applications to rigorous quantum fieldtheory. De Gruyter Studies in Mathematics, 34. Walter de Gruyter, Berlin,2011.

[191] G. Lv, J. Duan, H. Gao and J.-L. Wu. On a stochastic nonlocal conservationlaw in a bounded domain. Bull. Sci. Math. 140:6 (2016), 718–746.

[192] R.L. Magin. Fractional Calculus in Bioengineering. Begell House Publisher,Inc, Connecticut, 2006.

[193] S. Maniglia. Probabilistic representation and uniqueness results for measure-valued solutions of transport equations. J. Math. Pures Appl. (9) 87:6 (2007),601–626.

[194] O.A. Malafeyev. Controlled conflict systems. Petersburg University, 2000 (inRussian).

[195] A.B. Malinowska and D.F.M. Torres. Introduction to the Fractional Calculusof Variations. Imperial College Press, 2012.

[196] A.B. Malinowska, T. Odzijewicz and D.F.M. Torres. Advanced Methods inthe Fractional Calculus of Variations. Springer, Heidelberg, 2015.

[197] R.H. Martin. Nonlinear operators and differential equations in Banachspaces. New York, 1976.

[198] P.R. Masani. Multiplicative Riemann integration in normed rings. Trans.Amer. Math. Soc. 61 (1947), 147–192.


[199] V.P. Maslov. Perturbation Theory and Asymptotical Methods. MoscowState University Press, 1965 (in Russian). French Transl. Dunod, Paris, 1972.

[200] V.P. Maslov. Methodes Operatorielles. Moscow, Nauka 1974 (in Russian).French transl. Moscow, Mir, 1987.

[201] V.P. Maslov. Complex Markov Chains and Functional Feynman Integral.Moscow, Nauka, 1976 (in Russian).

[202] V.P. Maslov and M.V. Fedoryuk. Semiclassical Approximation in QuantumMechanics. Nauka, Moscow, 1976 (in Russian). Engl. transl. Reidel, Dor-drecht, 1981.

[203] V.S. Matveev and P.I. Topalov. Geodesic equivalence of metrics on surfaces,and their integrability. (Russian) Dokl. Akad. Nauk 367:6 (1999), 736–738.

[204] W.M. McEneaney. A new fundamental solution for differential Riccati equa-tions arising in control. Automatica (Journal of IFAC) 44:4 (2008), 920–936.

[205] M. Meerschaert, E. Nane and P. Vellaisamy. Distributed-order fractionaldiffusions on bounded domains. J. Math. Anal. Appl. 379:1 (2011), 216–228.

[206] M.M. Meerschaert and A. Sikorskii. Stochastic Models for Fractional Calcu-lus. De Gruyter Studies in Mathematics Vol. 43, NY (2012).

[207] R. Metzler and J. Klafter. The random walk’s guide to anomalous diffusion:a fractional dynamics approach. Physics Reports 339:1 (2000), 1–77.

[208] P.-A. Meyer. Quantum Probability for Probabilists. Springer LNM, vol.1538. Springer, 1991.

[209] V.M. Millionshchikov. On the theory of differential equations in locally con-vex spaces. (Russian) Mat. Sb. (N.S.) 57:4 (1962), 385–406.

[210] S. Mishler, B. Wennberg. On the spatially homogeneous Boltzmann equa-tion.Ann. Inst. H. Poincare Anal. Non Lineaire 16:4 (1999), 467–501.

[211] M. Naber. Time fractional Schrodinger equation. J. Math. Phys. 45 (2004),3339–3352.

[212] M.A. Naimark. Normed rings. Translated from the first Russian edition byLeo F. Boron. Reprinting of the revised English edition. Wolters–NoordhoffPublishing, Groningen, 1970.

[213] A. Negoro. Stable-like processes: construction of the transition density andthe behavior of sample paths near t = 0. Osaka J. Math. 31 (1994), 189–214.

[214] J. Norris. Cluster Coagulation. Comm. Math. Phys. 209 (2000), 407–435.

[215] J. Norris. A consistency estimate for Kac’s model of elastic collisions in adilute gas. Ann. Appl. Probab. 26:2 (2016), 1029–1081.

[216] J. Norris. Measure solutions for the Smoluchowski coagulation-diffusionequation. ArXiv:1408.5228, 2014.

[217] L. Orsina, M.M. Porzio, F. Smarrazzo. Measure-valued solutions of nonlinearparabolic equations with logarithmic diffusion. J. Evol. Equ. 15:3 (2015),609–645.


[218] F. Padula and A. Visioli. Advances in robust fractional control. Springer,2015.

[219] A. Pazy. Semigroups of Linear Operators and Applications to Partial Dif-ferential Equations. Springer-Verlag, New York, 1983.

[220] I.G. Petrovskii. Lectures on the theory of ordinary differential equations (inRussian). Sixth corrected edition. Nauka, Moscow 1970. English translationby Prentice-Hall, Englewood Cliffs, N.J., 1966.

[221] H. Pham. Linear quadratic optimal control of conditional McKean–Vlasovequation with random coefficients and applications. Probab. Uncertain.Quant. Risk 1 (2016), Paper No. 7.

[222] I. Podlubny. Fractional differential equations, An introduction to fractionalderivatives, fractional differential equations, to methods of their solution andsome of their applications. Mathematics in Science and Engineering, vol. 198.Academic Press, Inc., San Diego (1999).

[223] H. Pollard. The completely monotonic character of the Mittag-Leffler func-tion Ea(−x). Bull. Amer. Math. Soc. 54 (1948), 1115–1116.

[224] L.S. Pontryagin. Ordinary differential equations (in Russian). Fifth edition.Nauka, Moscow, 1982. Spanish translation by Aguilar, Madrid, 1973.

[225] M. Poppenberg. An application of the Nash–Moser theorem to ordinarydifferential equations in Frechet spaces. Studia Mathematica 137 (2) (1999),101–121.

[226] M.M. Porzio and F. Smarrazzo. Radon measure-valued solutions for somequasilinear degenerate elliptic equations. Ann. Mat. Pura Appl. (4) 194:2(2015), 495–532.

[227] F.O. Porper and S.D. Eidelman. Two-sided estimates of fundamental solu-tions of second-order parabolic equations, and some applications. UspehkiMat. Nauk 39:3 (1984), 107–156. Engl. transl. in Russian Math. Surv. 39:3(1984), 119–178.

[228] A.V. Pskhu. Fundamental solution of the diffusive wave equation of frac-tional order (in Russian). Izvestia RAN, Ser. Math. 73:2 (2009), 141–182.

[229] A.V. Pskhu. Partial differential equations of fractional order (in Russian).Nauka, Moscow (2005).

[230] L. Rass and J. Radcliffe. Spatial Deterministic Epidemics. MathematicalSurveys and Monographs, vol. 102. AMS 2003.

[231] M. Reed and B. Simon. Methods of Modern Mathematical Physics, vol. 1,Functional Analysis. Academic Press, N.Y. 1972.

[232] M. Reed and B. Simon. Methods of Modern Mathematical Physics, vol. 2,Harmonic Analysis. Academic Press, N.Y. 1975.

[233] R.M. Redheffer. The Theorems of Bony and Prezis on Flow-Invariant Sets.The American Mathematical Monthly 79 :7 (1972), 740–747.


[234] S. Rjasanow and W. Wagner. Stochastic numerics for the Boltzmann equa-tion. Springer Series in Computational Mathematics, 37. Springer-Verlag,Berlin, 2005.

[235] A.P. Robertson and W. Robertson. Topological vector spaces. CambridgeUniversity Press, 1973.

[236] B.N. Sadovskii. A fixed-point principle, Funktsional. Anal. i Prilozhen. 1:2(1967), 74–76.

[237] R.S. Saha. Fractional calculus with applications for nuclear reactor dynam-ics. CRC Press, Boca Raton, 2016.

[238] S.G. Samko, A.A. Kilbas, O.I. Marichev. Fractional integrals and derivatives.Theory and applications. Translated from the 1987 Russian original. Gordonand Breach Science Publishers, Yverdon, 1993.

[239] S.G. Samko. Hypersingular integrals and their applications (in Russian).Rostov State University, 1984. Engl. transl. Analytical Methods and SpecialFunctions, 5. Taylor and Francis, London, 2002.

[240] H.H. Schaefer. Banach Lattices and Positive Operators. Springer, Berlin-Heidelberg, 1974.

[241] R.L. Schilling, R. Song and Z. Vondracek. Bernstein Functions. Theory andApplications. Studies in Math 37, De Gruyter, 2010.

[242] W. Schneider. Completely monotone generalized Mittag-Leffler functions.Expo Math 14 (1996), 3–16.

[243] A. Schumacher. Second order Banach space valued differential equations: asemigroup approach. Arch. Math. 81 (2003), 446–456.

[244] R.E. Showalter. Monotone Operators in Banach Space and Nonlinear PartialDifferential Equations. Mathematics Surveys and Monographs 49, AmericanMathematical Society, 1997.

[245] M.A. Shubin. Pseudo-differential operators. Moscow, Nauka, 1978 (in Rus-sian).

[246] M.V. Simkin and V.P. Roychowdhury. Re-inventing Willis. Physics Reports502 (2011), 1–35.

[247] M. Slemrod. Dynamics of Measured Valued Solutions to a Backward-Forward Heat Equation. Journal of Dynamics and Differential Equations3:1 (1991), 1–28.

[248] G.V. Smirnov. Introduction to the Theory of Differential Inclusions. Provi-dence, RI, AMS, 2001.

[249] O.G. Smolyanov. Analysis in topological linear spaces. Moscow State Uni-versity, 1979 (in Russian).

[250] A.I. Subbotin. Generalized Solutions of First Order of PDEs: The DynamicalOptimization Perspectives. Boston, Birkhauser, 1995.


[251] N.N. Subbotina et al. The Method of Characteristics for the Hamilton–Jacobi–Bellmam equation. Ekaterinburg, RIO RAN, 2013 (in Russian).

[252] I.A. Taimanov. Topological obstructions to the integrability of geodesic flowson nonsimply connected manifolds. (Russian) Izv. Akad. Nauk SSSR Ser.Mat. 51:2 (1987), 429–435 (in Russian). Engl. transl. Math. USSR-Izv. 30:2(1988), 403–409.

[253] V.E. Tarasov. Fractional Dynamics, Applications of Fractional Calculus toDynamics of Particles, Fields and Media. Springer, Higher Education Press(2011).

[254] M.E. Taylor. Pseudo-differential operators. Princeton University Press, 1981.

[255] V.V. Uchaikin. Fractional Derivatives for Physicists and Engineers. Springer(2012).

[256] S. Umarov. Introduction to fractional pseudo-differential equations with sin-gular symbols. Developments in Mathematics, 41. Springer, 2015.

[257] V.V. Vasil’ev and S.I. Piskarev. Differential equations in a Banach space (inRussian). Moscow University Press, 1996.

[258] C. Villani. On the spatially homogeneous Landau equation for Maxwellianmolecules. Math. Models Methods Appl. Sci. 8:6 (1998), 957–983.

[259] C. Villani. On a new class of weak solutions to the spatially homogeneousBoltzmann and Landau equations. Arch. Rational Mech. Anal. 143:3 (1998),273–307.

[260] V. Vedenyapin, A. Sinitsyn and E. Dulov. Kinetic Boltzmann, Vlasov andrelated equations. Elsevier, Inc., Amsterdam, 2011.

[261] V.S. Vladimirov. Equations of mathematical physics (in Russian). Moscow,Nauka, 1988.

[262] I.I. Vrabie. Compactness Methods for Nonlinear Evolutions. Pitman Mono-graphs and Surveys in Pure and Applied Mathematics 32, Longman Scien-tific, 1987.

[263] B.J. West. Fractional calculus View of Complexity. Tomorrow’s Science.CRC Press, Boca Raton, 2016.

[264] S. Yamamuro. Differential Calculus in Topological Linear Spaces. SpringerLNM 374, Springer, Berlin, 1974.

[265] R. Yano. On quantum Fokker–Planck equation. J. Stat. Phys. 158:1 (2015),231–247.

[266] V.M. Zolotarev. On analytic properties of stable distribution laws (in Rus-sian). Vestnik Leningrad. Univ. 11:1 (1956), 49–52.

[267] V.M. Zolotarev. One-dimensional Stable Distributions. Moscow, Nauka,1983 (in Russian). Engl. transl. in vol. 65 of Translations of MathematicalMonographs AMS, Providence, Rhode Island, 1986.


[268] G. Zou, G. Lv and J.-L. Wu. On the regularity of weak solutions to space-time fractional stochastic heat equations. Statist. Probab. Lett. 139 (2018),84–89.

[269] G. Zou, G. Lv and J.-L. Wu. Stochastic Navier–Stokes equations with Ca-puto derivative driven by fractional noises. J. Math. Anal. Appl. 461:1(2018), 595–609.


Index

∗-weak topology, 2

abstract M-space, 79

accretive mapping, 152, 195

accretive relation, 152

accretivity

in weighted norm, 196, 197

autocatalysis, 166

backward Cauchy problem, 124

backward propagator, 259

Baire theorem, 33

Banach lattice, 79

Banach space

dual pair, 2

Banach tower, 221

Bellman equation, 122

jump processes, 123, 124

jump processes, well-posedness, 124

binary relation, 151

contraction, 152

Bochner integral, 15

Bogolyubov chains, 424

Boltzmann collisions, 419

Boltzmann’s equation

spatially trivial, 420

bounded set, 31

bp-topology, 38

branching, 176

Cahn–Hilliard equation, 370

Caputo fractional derivative, 44

generalized, 455

catalyst, 166

Cauchy principal value, 61

Cauchy sequence, 29

causal equations, 135, 136

causal integral operator, 453

chain rule, 259

chemical reaction, 165

order, 165

mechanism, 166

chronological exponential, 18, 95, 290

backward, 96

coagulation, 176

coagulation kernel, 419, 420

collision, 176

collision breakage, 176, 420

collision kernel, 420, 421

completely monotone function, 345

complex balance point, 172

complex diffusion equation, 255, 278

complex matrix, 168

conditional positivity, 335

for matrices, 160

mappings in R∞, 160

multilinear mappings, 181

strong form, 181

contraction, 2

contraction principle, 482

generalized, 481

convolution semigroup, 341

Csiszár–Morimoto entropy, 170

δ-sequence, 224

decomposable measures, 417

delay equations, 135

detailed balance, 171

detailed balance point, 172

diffusion equation, 68, 224, 254

complex, 225

Dirac measure, 4

directional derivative, 9, 35

Dirichlet boundary condition, 229

Dirichlet formulae, 484

discrete measure, 4

dissipative operator, 152

dual Banach space, 2

dual operator, 2

dual pair of Banach spaces, 77

dual space, 32

strong topology, 32

weak topology, 32

Duhamel formula, 93

elliptic polynomial, 330

equicontinuity, 29

Erdélyi’s lemma, 492

evolutionary coalition building, 180

explosion, 92

Feller propagator, 337

Feller semigroup, 337

conservative, 337

Feynman–Kac formula, 477

stationary, 477

time-ordered operator-valued, 477

first order kinetics

ergodic, 169

weakly reversible, 169

first-order kinetics, 169

forward-backward systems, 394, 414

Fourier theorem, 40

Fourier transform, 40

inversion formula, 41

Fréchet derivative, 36

strong, 36

Fréchet space, 30

fractional complex diffusion, 448

fractional derivative

generalized, 453

spectral measure, 54

symmetric, 51

symmetric mixed, 53

fractional derivatives in generator form, 454

fractional Feller evolution, 448, 479

fractional Laplacian, 55

fractional RL integral, 44

generalized, 455

left, 49

fractional Schrödinger equation, 448, 478

regularized, 448

fragmentation, 420

fragmentation kernel, 420

frozen coefficients, 309

fundamental solution, 65

for Cauchy problem, 67

Gâteaux derivative, 10, 35

compatible with duality, 77

Gaussian diffusion operator, 230

Gaussian diffusion semigroup, 230

generalized function, 34, 56

convolution, 58

differentiation, 34, 57

direct product, 57

Fourier transform, 57

pointwise product, 58

support, 58

tempered, 34, 56

generalized solution, 57

by duality, 334

to the Cauchy problem, 218

via approximation, 334, 449, 458

via discrete approximations, 293

generator of a strongly continuous semigroup, 215

generators of order at most one, 351

geodesic flow, 111

Ginzburg–Landau equation, 363

Godunov’s theorem, 157

Green function, 67, 68, 99, 224, 230

for fractional derivative, 103

for fractional derivative, Mellin transform, 444

Hadamard derivative, 36

Hamilton–Jacobi–Bellman (HJB) equation, 363

Hamilton–Jacobi–Bellman equation, 122

Hamiltonian

optimization theory, 122

with Lipschitz minimizer, 395

Hamiltonian equations, 105

heat conduction equation, 68, 224, 254, 278

heat kernel, 67, 99, 224, 230

for fractional derivative, 103

Howland semigroup, 263

hyper-singular integrals, 56

inductive limit, 33

of linear spaces, 34

intensity of transitions, 418

interest driven migration, 176

iterated Riemann integral, 43

KdV-equation, 92

kinetic equation, 177, 421

anticipating, 375

causal, 374

path dependent, 374, 413

weak form, 178, 418

kinetic system, 167

complex balanced, 172

conservative, 172

detailed balanced, 172

Kolmogorov’s diffusion, 231

Kolmogorov’s forward equation, 176

Lévy exponent, 340

Lévy kernel, 335

Lévy measure, 339

Lévy–Khintchin operators, 339

Lévy–Khintchin-type operators, 39, 335

Landau–Fokker–Planck equation, 422

Laplace exponent, 345

Laplacian, 223

lattice, 78

distributive, 78

Lebesgue decomposition theorem, 6

Lie–Trotter formula, 296

linear Cauchy problem, 259

linear operator, 2

bounded, 2, 32

closable, 215

closed, 214

closure, 215

core, 215

densely defined, 2

domain, 2

norm of, 3

strong convergence, 3

strong topology, 3

linear topological space, 27

metricizable, 29

locally convex space, 27

barrelled, 33

bornological, 32

Lomonosov–Lavoisier law, 178

lower and upper semi-inner product, 20

Lyapunov function, 408

subcritical, 408

m-accretive relation, 152

Markov chain

graph, 169

Kolmogorov’s forward equation, 169

mass-action-law, 165

mass-action-law kinetics, 178

mass-exchange process, 178

master equation, 169, 415

maximum principle, 335

McKean–Vlasov diffusions, 387, 422

measurable mapping, 14

Banach-space-valued, 14

method of duality, 263

mild equation, 246

for fractional evolutions, 451

mild solution, 246, 273, 370

to HJB, 363, 368

Minkowski functional, 27

Mittag-Leffler function, 484

mixed states, 416

molecularity, 167

monotone mapping, 151

Morimoto H-theorem, 170

multiple coagulation, 420

multiplicative Riemann integral, 18

Neumann boundary condition, 228

nonlinear diffusion, 385

complex, 386

nonlinear quantum dynamic semigroup, 404

observables, 416

decomposable, 417

operator of order at most one, 454

left-sided, 454

right-sided, 454

operators of at most kth order, 38

Ornstein–Uhlenbeck diffusion, 230

measure-valued, 270

Ovsyannikov’s method, 358

pairwise mutations, 176

Paley–Wiener theorem, 41

path integral, 252

for linear fractional evolution, 472

Schrödinger equation, 253

Peano’s theorem, 157

perturbation theory, 244

Poisson bracket, 113

porous medium equation, 153

positive maximum principle (PMP), 335

positive-definite kernels, 270

potential measure, 346

potential operator, 218

preferential attachment, 179

principle of uniform boundedness, 33

probability kernel, 37

probability measure, 4

Prokhorov’s compactness criterion, 5

propagator, 259

generated by, 259

of generalized solutions, 261

solving Cauchy problem, 259

strongly continuous, 261

propagator equation, 259

pseudo-differential operator, 42

symbol of, 42

quasi-accretive mapping, 195, 196

qubit, 79

Radon measure, 4

complex, 4

dimensionality, 255

reflexive Banach space, 2

replicator dynamics, 164, 177, 422

generalized, 423

resolvent, 216, 224

resolvent equation, 217

Riccati equation, 268

backward, 269

Riemann–Lebesgue lemma, 41

Riesz–Markov theorem, 5, 37

RL fractional derivative, 44

generalized, 455

Schrödinger equation, 225, 253

regularized, 226

with magnetic fields, 255

with magnetic fields, regularized, 255, 278

Schwartz space, 30

second quantization, 433

semi-norm, 27

semigroup of linear operators, 213

equicontinuous, 214

strongly continuous, 213

type of growth, 219

sensitivity, 125

discrete kinetic equations, 205

for nonlinear propagators, 401

fractional HJB, 452

fractional ODEs, 146

general kinetic equation, 431

HJB equation, 369

integral equations, 125, 145

McKean–Vlasov diffusion, 389

McKean–Vlasov diffusion, second order, 392

of ODEs, 131

simplest nonlinear diffusion, 384

Smoluchowski coagulation-fragmentation, 179

Smoluchowski’s equation, 419

Sobolev space, 70

local, 71

Sobolev embedding, 70

spectral measure, 54

stable densities, 104

stoichiometric coefficients, 165

stoichiometric space, 167

stoichiometric vectors, 167

strictly convex Banach space, 150

sub-gradient, 151

T-product, 18, 95, 290

backward, 96, 291

Tauberian theorems, 492

Taylor expansion

first order, 9

second and third order, 74

tightness, 5

time-ordered exponential, 18, 95

Tonelli’s theorem, 111

topology of bounded convergence, 32

topology of pointwise convergence, 32

total variation measure, 4

for complex measures, 4

transition kernel, 37, 279

additively bounded, 426

bounded, 37

critical, 424

E preserving, 424

multiplicatively bounded, 426

signed, 37

subcritical, 424

weakly continuous, 37, 280

uniformly convex Banach space, 150

variational derivative, 71

strong or weak, 75, 410

vector lattice, 79

viscosity solution, 402

Vlasov’s equation, 421

Watson’s lemma, 492

weak topology, 2, 6

for a pair, 2

for measures, 5

Weierstrass function, 116

Yosida approximation, 220