ordinary diﬀerential equations - lmu münchenphilip/publications/lecturenotes/ode.pdf · ordinary...

Ordinary Differential Equations

Peter Philip∗

Lecture Notes

Originally Created for the Class of Spring Semester 2012 at LMU Munich,

Revised and Extended for Several Subsequent Classes

November 24, 2017

Contents

1 Basic Notions 4

1.1 Types and First Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Equivalent Integral Equation . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.3 Patching and Time Reversion . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Elementary Solution Methods 12

2.1 Geometric Interpretation, Graphing . . . . . . . . . . . . . . . . . . . . . 12

2.2 Linear ODE, Variation of Constants . . . . . . . . . . . . . . . . . . . . . 12

2.3 Separation of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.4 Change of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

3 General Theory 24

3.1 Equivalence Between Higher-Order ODE and Systems of First-Order ODE 24

3.2 Existence of Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.3 Uniqueness of Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3.4 Extension of Solutions, Maximal Solutions . . . . . . . . . . . . . . . . . 40

3.5 Continuity in Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . 50

∗E-Mail: [email protected]

1

CONTENTS 2

4 Linear ODE 57

4.1 Definition, Setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.2 Gronwall’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.3 Existence, Uniqueness, Vector Space of Solutions . . . . . . . . . . . . . . 61

4.4 Fundamental Matrix Solutions and Variation of Constants . . . . . . . . 63

4.5 Higher-Order, Wronskian . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.6 Constant Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.6.1 Linear ODE of Higher Order . . . . . . . . . . . . . . . . . . . . . 68

4.6.2 Systems of First-Order Linear ODE . . . . . . . . . . . . . . . . . 75

5 Stability 84

5.1 Qualitative Theory, Phase Portraits . . . . . . . . . . . . . . . . . . . . . 84

5.2 Stability at Fixed Points . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

5.3 Constant Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

5.4 Linearization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

5.5 Limit Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

A Differentiability 118

B Kn-Valued Integration 119

B.1 Kn-Valued Riemann Integral . . . . . . . . . . . . . . . . . . . . . . . . . 119

B.2 Kn-Valued Lebesgue Integral . . . . . . . . . . . . . . . . . . . . . . . . . 122

C Metric Spaces 124

C.1 Distance in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 124

C.2 Compactness in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . 127

D Local Lipschitz Continuity 132

E Maximal Solutions on Nonopen Intervals 135

F Paths in Rn 135

G Operator Norms and Matrix Norms 139

H The Vandermonde Determinant 142

CONTENTS 3

I Matrix-Valued Functions 144

I.1 Product Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

I.2 Integration and Matrix Multiplication Commute . . . . . . . . . . . . . . 144

J Autonomous ODE 145

J.1 Equivalence of Autonomous and Nonautonomous ODE . . . . . . . . . . 145

J.2 Integral for ODE with Discontinuous Right-Hand Side . . . . . . . . . . 146

K Polar Coordinates 147

References 154

1 BASIC NOTIONS 4

1 Basic Notions

1.1 Types of Ordinary Differential Equations (ODE) and FirstExamples

A differential equation is an equation for some unknown function, involving one or morederivatives of the unknown function. Here are some first examples:

y′ = y, (1.1a)

y(5) = (y′)2 + π x, (1.1b)

(y′)2 = c, (1.1c)

∂tx = e2πit x2, (1.1d)

x′′ = −3x+

(−1

1

)

. (1.1e)

One distinguishes between ordinary differential equations (ODE) and partial differentialequations (PDE). While ODE contain only derivatives with respect to one variable, PDEcan contain (partial) derivatives with respect to several different variables. In general,PDE are much harder to solve than ODE. The equations in (1.1) all are ODE, and onlyODE are the subject of this class. We will see precise definitions shortly, but we canalready use the examples in (1.1) to get some first exposure to important ODE-relatedterms and to discuss related issues.

As in (1.1), the notation for the unknown function varies in the literature, where thetwo variants presented in (1.1) are probably the most common ones: In the first threeequations of (1.1), the unknown function is denoted y, usually assumed to depend ona variable denoted x, i.e. x 7→ y(x). In the last two equations of (1.1), the unknownfunction is denoted x, usually assumed to depend on a variable denoted t, i.e. t 7→ x(t).So one has to use some care due to the different roles of the symbol x. The notationt 7→ x(t) is typically favored in situations arising from physics applications, where trepresents time. In this class, we will mostly use the notation x 7→ y(x).

There is another, in a way a slightly more serious, notational issue that one commonlyencounters when dealing with ODE: Strictly speaking, the notation in (1.1b) and (1.1d)is not entirely correct, as functions and function arguments are not properly distin-guished. Correctly written, (1.1b) and (1.1d) read

∀x∈D(y)

y(5)(x) =(

y′(x))2

+ π x, (1.2a)

∀t∈D(x)

(∂tx)(t) = e2πit(

x(t))2, (1.2b)

where D(y) and D(x) denote the respective domains of the functions y and x. However,one might also notice that the notation in (1.2) is more cumbersome and, perhaps,harder to read. In any case, the type of slight abuse of notation present in (1.1b) and(1.1d) is so common in the literature that one will have to live with it.

1 BASIC NOTIONS 5

One speaks of first-order ODE if the equations involve only first derivatives such as in(1.1a), (1.1c), and (1.1d). Otherwise, one speaks of higher-order ODE, where the preciseorder is given by the highest derivative occurring in the equation, such that (1.1b) isan ODE of fifth order and (1.1e) is an ODE of second order. We will see later in Th.3.1 that ODE of higher order can be equivalently formulated and solved as systems ofODE of first order, where systems of ODE obviously consist of several ODE to be solvedsimultaneously. Such a system of ODE can, equivalently, be interpreted as a single ODEin higher dimensions: For instance, (1.1e) can be seen as a single two-dimensional ODEof second order or as the system

x′′1 = −3x1 − 1, (1.3a)

x′′2 = −3x2 + 1 (1.3b)

of two one-dimensional ODE of second order.

One calls an ODE explicit if it has been solved explicitly for the highest-order deriva-tive, otherwise implicit. Thus, in (1.1), all ODE except (1.1c) are explicit. In general,explicit ODE are much easier to solve than implicit ODE (which include, e.g., so-calleddifferential-algebraic equations, cf. Ex. 1.4(g) below), and we will mostly consider ex-plicit ODE in this class.

As the reader might already have noticed, without further information, none of theequations in (1.1) makes much sense. Every function, in particular, every functionsolving an ODE, needs a set as the domain where it is defined, and a set as the range itmaps into. Thus, for each ODE, one needs to specify the admissible domains as well asthe range of the unknown function. For an ODE, one usually requires a solution to bedefined on a nontrivial (bounded or unbounded) interval I ⊆ R. Prescribing the possiblerange of the solution is an integral part of setting up an ODE, and it often completelydetermines the ODE’s meaning and/or its solvability. For example for (1.1d), (a subsetof) C is a reasonable range. Similarly, for (1.1a)–(1.1c), one can require the range tobe either R or C, where requiring range R for (1.1c) immediately implies there is nosolution for c < 0. However, one can also specify (a subset of) Rn or Cn, n > 1, asrange for (1.1a), turning the ODE into an n-dimensional ODE (or a system of ODE),where y now has n compoments (y1, . . . , yn) (note that, except in cases where we aredealing with matrix multiplications, we sometimes denote elements of Rn as columnsand sometimes as rows, switching back and forth without too much care). A reasonablerange for (1.1e) is (a subset of) R2 or C2.

One of the important goals regarding ODE is to find conditions, where one can guan-rantee the existence of solutions. Moreover, if possible, one would like to find conditionsthat guarantee the existence of a unique solution. Clearly, for each a ∈ R, the functiony : R −→ R, y(x) = a ex, is a solution to (1.1a), showing one cannot expect uniquenesswithout specifying further requirements. The most common additional conditions thatoften (but not always) guarantee a unique solution are initial conditions, (e.g. requiringy(x0) = y0 (x0, y0 given); or boundary conditions (e.g. requiring y(a) = ya, y(b) = yb fory : [a, b] −→ Cn (ya, yb ∈ Cn given)).

Let us now proceed to mathematically precise definitions of the abovementioned notions.

1 BASIC NOTIONS 6

Notation 1.1. We will write K in situations, where we allow K to be R or C.

Definition 1.2. Let k, n ∈ N.

(a) Given U ⊆ R×K(k+1)n and F : U −→ Kn, call

F (x, y, y′, . . . , y(k)) = 0 (1.4)

an implicit ODE of kth order. A solution to this ODE is a k times differentiablefunction

φ : I −→ Kn, (1.5)

defined on a nontrivial (bounded or unbounded, open or closed or half-open) intervalI ⊆ R satisfying the two conditions

(i)

(x, φ(x), φ′(x), . . . , φ(k)(x)) ∈ I ×K(k+1)n : x ∈ I

⊆ U .

(ii) F (x, φ(x), φ′(x), . . . , φ(k)(x)) = 0 for each x ∈ I.

Note that condition (i) is necessary so that one can even formulate condition (ii).

(b) Given G ⊆ R×Kkn, and f : G −→ Kn, call

y(k) = f(x, y, y′, . . . , y(k−1)) (1.6)

an explicit ODE of kth order. A solution to this ODE is a k times differentiablefunction φ as in (1.5), defined on a nontrivial (bounded or unbounded, open orclosed or half-open) interval I ⊆ R satisfying the two conditions

(i)(

x, φ(x), φ′(x), . . . , φ(k−1)(x))

∈ I ×Kkn : x ∈ I

⊆ G.

(ii) φ(k)(x) = f(x, φ(x), φ′(x), . . . , φ(k−1)(x)) for each x ∈ I.

Again, note that condition (i) is necessary so that one can even formulate condition(ii). Also note that φ is a solution to (1.6) if, and only if, φ is a solution to theequivalent implicit ODE y(k) − f(x, y, y′, . . . , y(k−1)) = 0.

Definition 1.3. Let k, n ∈ N.

(a) An initial value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp.of the ODE (1.6)) plus the initial condition

∀j=0,...,k−1

y(j)(x0) = y0,j (1.7)

with given x0 ∈ R and y0,0, . . . , y0,k−1 ∈ Kn. A solution φ to the initial valueproblem is a k times differentiable function φ as in (1.5) that is a solution to theODE and that also satisfies (1.7) (with y replaced by φ) – in particular, this requiresx0 ∈ I.

1 BASIC NOTIONS 7

(b) A boundary value problem for (1.4) (resp. for (1.6)) consists of the ODE (1.4) (resp.of the ODE (1.6)) plus the boundary condition

∀j∈Ja

y(j)(a) = ya,j and ∀j∈Jb

y(j)(b) = yb,j (1.8)

with given a, b ∈ R, a < b; Ja, Jb ⊆ 0, . . . , k − 1, ya,j ∈ Kn for each j ∈ Ja, andyb,j ∈ Kn for each j ∈ Jb. A solution φ to the boundary value problem is a k timesdifferentiable function φ as in (1.5) that is a solution to the ODE and that alsosatisfies (1.8) (with y replaced by φ) – in particular, this requires [a, b] ⊆ I.

—

Under suitable hypotheses, initial and boundary value problems for ODE have uniquesolutions (for initial value problems, we will see some rather general results in Cor. 3.10and Cor. 3.16 below). However, in general, they can have infinitely many solutions orno solutions, as shown by Examples 1.4(b),(c),(e) below.

Example 1.4. (a) Let k ∈ N. The function φ : R −→ K, φ(x) = a ex, a ∈ K, is asolution to the kth order explicit initial value problem

y(k) = y, (1.9a)

∀j=0,...,k−1

y(j)(0) = a. (1.9b)

We will see later (e.g., as a consequence of Th. 4.8 combined with Th. 3.1) that φis the unique solution to (1.9) on R.

(b) Consider the one-dimensional explicit first-order initial value problem

y′ =√

|y|, (1.10a)

y(0) = 0. (1.10b)

Then, for every c ≥ 0, the function

φc : R −→ R, φc(x) :=

0 for x ≤ c,(x−c)2

4for x ≥ c,

(1.11)

is a solution to (1.10): Clearly, φc(0) = 0, φc is differentiable, and

φ′c : R −→ R, φ′

c(x) :=

0 for x ≤ c,x−c2

for x ≥ c,(1.12)

solving the ODE. Thus, (1.10) is an example of an initial value problem with un-countably many different solutions, all defined on the same domain.

1 BASIC NOTIONS 8

(c) As mentioned before, the one-dimensional implicit first-order ODE (1.1c) has noreal-valued solution for c < 0. For c ≥ 0, every function φ : R −→ R, φ(x) :=a ± x

√c, a ∈ R, is a solution to (1.1c). Moreover, for c < 0, every function

φ : R −→ C, φ(x) := a ± xi√−c, a ∈ C, is a C-valued solution to (1.1c). The

one-dimensional implicit first-order ODE

ey′

= 0 (1.13)

is an example of an ODE that does not even have a C-valued solution. It is anexercise to find f : R −→ R such that the explicit ODE y′ = f(x) has no solution.

(d) Let n ∈ N and let a, c ∈ Kn. Then, on R, the function

φ : R −→ Kn, φ(x) := c+ xa, (1.14)

is the unique solution to the n-dimensional explicit first-order initial value problem

y′ = a, (1.15a)

y(0) = c. (1.15b)

This situation is a special case of Ex. 1.6 below.

(e) Let a, b ∈ R, a < b. We will see later in Example 4.12 that on [a, b] the 1-dimensionalexplicit second-order ODE

y′′ = −y (1.16)

has precisely the set of solutions

L =(

(c1 sin+c2 cos) : [a, b] −→ K)

: c1, c2 ∈ K

. (1.17)

In consequence, the boundary value problem

y(0) = 0, y(π/2) = 1, (1.18a)

for (1.16) has the unique solution φ : [0, π/2] −→ K, φ(x) := sin x (using (1.18a)and (1.17) implies c2 = 0 and c1 = 1); the boundary value problem

y(0) = 0, y(π) = 0, (1.18b)

for (1.16) has the infinitely many different solutions φc : [0, π] −→ K, φc(x) :=c sin x, c ∈ K; and the boundary value problem

y(0) = 0, y(π) = 1, (1.18c)

for (1.16) has no solution (using (1.18c) and (1.17) implies the contradictory re-quirements c2 = 0 and c2 = −1).

1 BASIC NOTIONS 9

(f) Consider

F : R×K2 ×K2 −→ K2, F

(

x,

(

y1y2

)

,

(

z1z2

))

:=

(

z2z2 − 1

)

. (1.19a)

Clearly, the implicit K2-valued ODE

F (x, y, y′) =

(

y′2y′2 − 1

)

=

(

00

)

(1.19b)

has no solution on any nontrivial interval.

(g) Consider

F : R× C3 × C3 −→ C3, F

x,

y1y2y3

,

z1z2z3

:=

z2 + iy3 − 2iz1 + y2 − x2

y1 − ieix

. (1.20a)

It is an exercise to show the C3-valued implicit ODE

F (x, y, y′) =

y′2 + iy3 − 2iy′1 + y2 − x2

y1 − ieix

=

000

(1.20b)

has a unique solution on R (note that, here, we do not need to provide initialor boundary conditions to obtain uniqueness). The implicit ODE (1.20b) is anexample of a differential algebraic equation, since, read in components, only its firsttwo equations contain derivatives, whereas its third equation is purely algebraic.

1.2 Equivalent Integral Equation

It is often useful to rewrite a first-order explicit intitial value problem as an equivalentintegral equation. We provide the details of this equivalence in the following theorem:

Theorem 1.5. If G ⊆ R×Kn, n ∈ N, and f : G −→ Kn is continuous, then, for each(x0, y0) ∈ G, the explicit n-dimensional first-order initial value problem

y′ = f(x, y), (1.21a)

y(x0) = y0, (1.21b)

is equivalent to the integral equation

y(x) = y0 +

∫ x

x0

f(

t, y(t))

dt , (1.22)

in the sense that a continuous function φ : I −→ Kn, with x0 ∈ I ⊆ R being a nontrivialinterval, and φ satisfying

(

x, φ(x))

∈ I ×Kn : x ∈ I

⊆ G, (1.23)

1 BASIC NOTIONS 10

is a solution to (1.21) in the sense of Def. 1.3(a) if, and only if,

∀x∈I

φ(x) = y0 +

∫ x

x0

f(

t, φ(t))

dt , (1.24)

i.e. if, and only if, φ is a solution to the integral equation (1.22).

Proof. Assume I ⊆ R with x0 ∈ I to be a nontrivial interval and φ : I −→ Kn

to be a continuous function, satisfying (1.23). If φ is a solution to (1.21), then φ isdifferentiable and the assumed continuity of f implies the continuity of φ′. In otherwords, each component φj of φ, j = 1, . . . , n, is in C1(I,K). Thus, the fundamentaltheorem of calculus [Phi16, Th. 10.20(b)] applies, and [Phi16, (10.51b)] yields

∀x∈I

∀j∈1,...,n

φj(x) = φj(x0) +

∫ x

x0

fj(

t, φ(t))

dt(1.21b)= y0,j +

∫ x

x0

fj(

t, φ(t))

dt , (1.25)

proving φ satisfies (1.24). Conversely, if φ satisfies (1.24), then the validity of the initialcondition (1.21b) is immediate. Moreover, as f and φ are continuous, so is the integrandfunction t 7→ f

(

t, φ(t))

of (1.24). Thus, [Phi16, Th. 10.20(a)] applies to (the componentsof) φ, proving φ′(x) = f

(

x, φ(x))

for each x ∈ I, proving φ is a solution to (1.21).

Example 1.6. Consider the situation of Th. 1.5. In the particularly simple specialcase, where f does not actually depend on y, but merely on x, the equivalence between(1.21) and (1.22) can be directly exploited to actually solve the initial value problem:If f : I −→ Kn, where I ⊆ R is some nontrivial interval with x0 ∈ I, then we obtainφ : I −→ Kn to be a solution of (1.21) if, and only if,

∀x∈I

φ(x) = y0 +

∫ x

x0

f(t) dt , (1.26)

i.e. if, and only if, φ is the antiderivative of f that satisfies the initial condition. Inparticular, in the present situation, φ as given by (1.26) is the unique solution to theinitial value problem. Of course, depending on f , it can still be difficult to carry outthe integral in (1.26).

1.3 Patching and Time Reversion

If solutions defined on different intervals fit together, then they can be patched to obtaina solution on the union of the two intervals:

Lemma 1.7 (Patching of Solutions). Let k, n ∈ N. Given G ⊆ R×Kkn and f : G −→Kn, if φ : I −→ Kn and ψ : J −→ Kn are both solutions to (1.6), i.e. to

y(k) = f(x, y, y′, . . . , y(k−1)),

such that I =]a, b], J = [b, c[, a < b < c, and such that

∀j=0,...,k−1

φ(j)(b) = ψ(j)(b), (1.27)

1 BASIC NOTIONS 11

then

σ : I ∪ J −→ Kn, σ(x) :=

φ(x) for x ∈ I,

ψ(x) for x ∈ J,(1.28)

is also a solution to (1.6).

Proof. Since φ and ψ both are solutions to (1.6),

(

x, σ(x), σ′(x), . . . , σ(k−1)(x))

∈ (I ∪ J)×Kkn : x ∈ I ∪ J

⊆ G (1.29)

must hold, where (1.27) guarantees that σ(j)(b) exists for each j = 0, . . . , k−1. Moreover,σ is k times differentiable at each x ∈ I ∪ J , x 6= b, and

∀x∈I∪J,x 6=b

σ(k)(x) = f(

x, σ(x), σ′(x), . . . , σ(k−1)(x))

. (1.30)

However, at b, we also have (using the left-hand derivatives for φ and the right-handderivatives for ψ)

φ(k)(b) = f(

b, φ(b), φ′(b), . . . , φ(k−1)(b))

= f(

b, ψ(b), ψ′(b), . . . , ψ(k−1)(b))

= ψ(k)(b), (1.31)

which shows σ is k times differentiable and the equality of (1.30) also holds at x = b,completing the proof that σ is a solution.

It is sometimes useful to apply what is known as time reversion:

Definition 1.8. Let k, n ∈ N, Gf ⊆ R×Kkn, f : Gf −→ Kn, and consider the ODE

y(k) = f(x, y, y′, . . . , y(k−1)). (1.32)

We call the ODEy(k) = g(x, y, y′, . . . , y(k−1)), (1.33)

where

g : Gg −→ Kn, g(x, y) := (−1)k f(

− x, y1,−y2, . . . , (−1)k−1 yk)

, (1.34a)

Gg :=

(x, y) ∈ R×Kkn :(

− x, y1,−y2, . . . , (−1)k−1 yk)

∈ Gf

, (1.34b)

the time-reversed version of (1.32).

Lemma 1.9 (Time Reversion). Let k, n ∈ N, Gf ⊆ R×Kkn, and f : Gf −→ Kn.

(a) The time-reversed version of (1.33) is the original ODE, i.e. (1.32).

(b) If −∞ ≤ a < b ≤ ∞, then φ : ]a, b[−→ Kn is a solution to (1.32) if, and only if,

ψ : ]− b,−a[−→ Kn, ψ(x) := φ(−x), (1.35)

is a solution to the time-reversed version (1.33).

2 ELEMENTARY SOLUTION METHODS 12

Proof. (a) is immediate from the definition of g in (1.34).

(b): Due to (a), it suffices to show if φ is a solution to (1.32), then ψ is a solution to(1.33). Clearly, if x ∈]− b,−a[, then −x ∈]a, b[. Moreover, noting

∀j=0,...,k

∀x∈]−b,−a[

ψ(j)(x) = (−1)j φ(j)(−x), (1.36a)

one has

∀x∈]−b,−a[

(

(

− x, φ(−x), φ′(−x), . . . , φ(k−1)(−x))

∈ Gf

⇒(

x, ψ(x), ψ′(x), . . . , ψ(k−1)(x))

=(

x, φ(−x),−φ′(−x), . . . , (−1)k−1 φ(k−1)(−x))

∈ Gg

)

(1.36b)

and

∀x∈]−b,−a[

ψ(k)(x) = (−1)k f(

− x, φ(−x), φ′(−x), . . . , φ(k−1)(−x))

= (−1)k f(

− x, ψ(x),−ψ′(x), . . . , (−1)k−1 ψ(k−1)(x))

= g(

x, ψ(x), ψ′(x), . . . , ψ(k−1)(x))

,

(1.36c)

thereby establishing the case.

2 Elementary Solution Methods for 1-Dimensional

First-Order ODE

2.1 Geometric Interpretation, Graphing

Geometrically, in the 1-dimensional real-valued case, the ODE (1.21a) provides a slopey′ = f(x, y) for every point (x, y). In other words, it provides a field of directions. Thetask is to find a differentiable function φ such that its graph has the prescribed slope ineach point it contains. In certain simple cases, drawing the field of directions can helpto guess the solutions of the ODE.

Example 2.1. Let G := R+ × R and f : G −→ R, f(x, y) := y/x, i.e. we consider theODE y′ = y/x. Drawing the field of directions leads to the idea that the solutions arefunctions whose graphs constitute rays, i.e. φc : R

+ −→ R, y = φc(x) = c x with c ∈ R.Indeed, one immediately verifies that each φc constitutes a solution to the ODE.

2.2 Linear ODE, Variation of Constants

Definition 2.2. Let I ⊆ R be an open interval and let a, b : I −→ K be continuousfunctions. An ODE of the form

y′ = a(x)y + b(x) (2.1)

is called a linear ODE of first order. It is called homogeneous if, and only if, b ≡ 0; it iscalled inhomogeneous if, and only if, it is not homogeneous.


Theorem 2.3 (Variation of Constants). Let I ⊆ R be an open interval and let a, b :I −→ K be continuous. Moreover, let x0 ∈ I and c ∈ K. Then the linear ODE (2.1)has a unique solution φ : I −→ K that satisfies the initial condition y(x0) = c. Thisunique solution is given by

φ : I −→ K, φ(x) = φ0(x)

(

c+

∫ x

x0

φ0(t)−1 b(t) dt

)

, (2.2a)

where

φ0 : I −→ K, φ0(x) = exp

(∫ x

x0

a(t) dt

)

= e∫ xx0a(t) dt

. (2.2b)

Here, and in the following, φ−10 denotes 1/φ0 and not the inverse function of φ0 (which

does not even necessarily exist).

Proof. We begin by noting that φ0 according to (2.2b) is well-defined since a is assumedto be continuous, i.e., in particular, Riemann integrable on [x0, x]. Moreover, the fun-damental theorem of calculus [Phi16, Th. 10.20(a)] applies, showing φ0 is differentiablewith

φ′0 : I −→ K, φ′

0(x) = a(x) exp

(∫ x

x0

a(t) dt

)

= a(x)φ0(x), (2.3)

where Lem. A.1 of the Appendix was used as well. In particular, φ0 is continuous. Sinceφ0 6= 0 as well, φ−1

0 is also continuous. Moreover, as b is continuous by hypothesis,φ−10 b is continuous and, thus, Riemann integrable on [x0, x]. Once again, [Phi16, Th.

10.20(a)] applies, yielding φ to be differentiable with

φ′ : I −→ K,

φ′(x) = φ′0(x)

(

c+

∫ x

x0

φ0(t)−1 b(t) dt

)

+ φ0(x)φ0(x)−1 b(x)

= a(x)φ0(x)

(

c+

∫ x

x0

φ0(t)−1 b(t) dt

)

+ b(x) = a(x)φ(x) + b(x), (2.4)

where the product rule of [Phi16, Th. 9.7(c)] was used as well. Comparing (2.4) with(2.1) shows φ is a solution to (2.1). The computation

φ(x0) = φ0(x0) (c+ 0) = 1 · c = c (2.5)

verifies that φ satisfies the desired initial condition. It remains to prove uniqueness. Tothis end, let ψ : I −→ K be an arbitrary differentiable function that satisfies (2.1) aswell as the initial condition ψ(x0) = c. We have to show ψ = φ. Since φ0 6= 0, we candefine u := ψ/φ0 and still have to verify

∀x∈I

u(x) = c+

∫ x

x0

φ0(t)−1 b(t) dt . (2.6)

We obtain

a φ0 u+ b = aψ + b = ψ′ = (φ0 u)′ = φ′

0 u+ φ0 u′ = a φ0 u+ φ0 u

′, (2.7)


implying b = φ0 u′ and u′ = φ−1

0 b. Thus, the fundamental theorem of calculus in theform [Phi16, Th. 10.20(b)] implies

∀x∈I

u(x) = u(x0) +

∫ x

x0

u′(t) dt = c+

∫ x

x0

φ0(t)−1 b(t) dt , (2.8)

thereby completing the proof.

Corollary 2.4. Let I ⊆ R be an open interval and let a : I −→ K be continuous.Moreover, let x0 ∈ I and c ∈ K. Then the homogeneous linear ODE (2.1) (i.e. withb ≡ 0) has a unique solution φ : I −→ K that satisfies the initial condition y(x0) = c.This unique solution is given by

φ(x) = c exp

(∫ x

x0

a(t) dt

)

= c e∫ xx0a(t) dt

. (2.9)

Proof. One immediately obtains (2.9) by setting b ≡ 0 in in (2.2).

Remark 2.5. The name variation of constants for Th. 2.3 can be understood fromcomparing the solution (2.9) of the homogeneous linear ODE with the solution (2.2)of the general inhomogeneous linear ODE: One obtains (2.2) from (2.9) by varying theconstant c, i.e. by replacing it with the function x 7→ c+

∫ x

x0φ0(t)

−1 b(t) dt .

Example 2.6. Consider the ODE

y′ = 2xy + x3 (2.10)

with initial condition y(0) = c, c ∈ C. Comparing (2.10) with Def. 2.2, we observe weare facing an inhomogeneous linear ODE with

a : R −→ R, a(x) := 2x, (2.11a)

b : R −→ R, b(x) := x3. (2.11b)

From Cor. 2.4, we obtain the solution φ0,c to the homogeneous version of (2.10):

φ0,c : R −→ C, φ0,c(x) = c exp

(∫ x

0

a(t) dt

)

= cex2

. (2.12)

The solution to (2.10) is given by (2.2a):

φ : R −→ C,

φ(x) = ex2

(

c+

∫ x

0

e−t2

t3 dt

)

= ex2

(

c+

[

−1

2(t2 + 1)e−t

2

]x

0

)

= ex2

(

c+1

2− 1

2(x2 + 1)e−x

2

)

=

(

c+1

2

)

ex2 − 1

2(x2 + 1). (2.13)


2.3 Separation of Variables

If the ODE (1.21a) has the particular form

y′ = f(x)g(y), (2.14)

with one-dimensional real-valued continuous functions f and g, and g(y) 6= 0, then itcan be solved by a method known as separation of variables:

Theorem 2.7. Let I, J ⊆ R be (bounded or unbounded) open intervals and suppose thatf : I −→ R and g : J −→ R are continuous with g(y) 6= 0 for each y ∈ J . For each(x0, y0) ∈ I×J , consider the initial value problem consisting of the ODE (2.14) togetherwith the initial condition

y(x0) = y0. (2.15)

Define the functions

F : I −→ R, F (x) :=

∫ x

x0

f(t) dt , G : J −→ R, G(y) :=

∫ y

y0

dt

g(t). (2.16)

(a) Uniqueness: On each open interval I ′ ⊆ I satisfying x0 ∈ I ′ and F (I ′) ⊆ G(J), theinitial value problem consisting of (2.14) and (2.15) has a unique solution. Thisunique solution is given by

φ : I ′ −→ R, φ(x) := G−1(

F (x))

, (2.17)

where G−1 : G(J) −→ J is the inverse function of G on G(J).

(b) Existence: There exists an open interval I ′ ⊆ I satisfying x0 ∈ I ′ and F (I ′) ⊆ G(J),i.e. an I ′ such that (a) applies.

Proof. (a): We begin by proving G has a differentiable inverse function G−1 : G(J) −→J . According to the fundamental theorem of calculus [Phi16, Th. 10.20(a)], G is dif-ferentiable with G′ = 1/g. Since g is continuous and nonzero, G is even C1. IfG′(y0) = 1/g(y0) > 0, then G is strictly increasing on J (due to the intermediatevalue theorem [Phi16, Th. 7.57]; g(y0) > 0, the continuity of g, and g 6= 0 imply thatg > 0 on J). Analogously, if G′(y0) = 1/g(y0) < 0, then G is strictly decreasing on J .In each case, G has a differentiable inverse function on G(J) by [Phi16, Th. 9.9].

In the next step, we verify that (2.17) does, indeed, define a solution to (2.14) and(2.15). The assumption F (I ′) ⊆ G(J) and the existence of G−1 as shown above providethat φ is well-defined by (2.17). Verifying (2.15) is quite simple: φ(x0) = G−1(F (x0)) =G−1(0) = y0. To see φ to be a solution of (2.14), notice that (2.17) implies F = G φon I ′. Thus, we can apply the chain rule to obtain the derivative of F = G φ on I ′:

∀x∈I′

f(x) = F ′(x) = G′(

φ(x))

φ′(x) =φ′(x)

g(

φ(x)) , (2.18)

showing φ satisfies (2.14).


We now proceed to show that each solution φ : I ′ −→ R to (2.14) that satisfies (2.15)must also satisfy (2.17). Since φ is a solution to (2.14),

φ′(x)

g(

φ(x)) = f(x) for each x ∈ I ′. (2.19)

Integrating (2.19) yields

∫ x

x0

φ′(t)

g(

φ(t)) dt =

∫ x

x0

f(t) dt = F (x) for each x ∈ I ′. (2.20)

Using the change of variables formula of [Phi16, Th. 10.25] in the left-hand side of (2.20),allows one to replace φ(t) by the new integration variable u (note that each solutionφ : I ′ −→ R to (2.14) is in C1(I ′) since f and g are presumed continuous). Thus, weobtain from (2.20):

F (x) =

∫ φ(x)

φ(x0)

du

g(u)=

∫ φ(x)

y0

du

g(u)= G

(

φ(x))

for each x ∈ I ′. (2.21)

Applying G−1 to (2.21) establishes φ satisfies (2.17).

(b): During the proof of (a), we have already seen G to be either strictly increasingor strictly decreasing. As G(y0) = 0, this implies the existence of ǫ > 0 such that] − ǫ, ǫ[⊆ G(J). The function F is differentiable and, in particular, continuous. SinceF (x0) = 0, there is δ > 0 such that, for I ′ :=]x0−δ, x0+δ[, one has F (I ′) ⊆]−ǫ, ǫ[⊆ G(J)as desired.


y′ = −yx

on I × J := R+ × R+ (2.22)

with the initial condition y(1) = c for some given c ∈ R+. Introducing functions

f : R+ −→ R, f(x) := −1

x, g : R+ −→ R, g(y) := y, (2.23)

one sees that Th. 2.7 applies. To compute the solution φ = G−1 F , we first have todetermine F and G:

F : R+ −→ R, F (x) =

∫ x

1

f(t) dt = −∫ x

1

dt

t= − ln x, (2.24a)

G : R+ −→ R, G(y) =

∫ y

c

dt

g(t)=

∫ y

c

dt

t= ln

y

c. (2.24b)

Here, we can choose I ′ = I = R+, because F (R+) = R = G(R+). That means φ isdefined on the entire interval I. The inverse function of G is given by

G−1 : R −→ R+, G−1(t) = c et. (2.25)


Finally, we get

φ : R+ −→ R, φ(x) = G−1(

F (x))

= c e− lnx =c

x. (2.26)

The uniqueness part of Th. 2.7 further tells us the above initial value problem can haveno solution different from φ.

—

The advantage of using Th. 2.7 as in the previous example, by computing the relevantfunctions F , G, and G−1, is that it is mathematically rigorous. In particular, one can besure one has found the unique solution to the ODE with initial condition. However, inpractice, it is often easier to use the following heuristic (not entirely rigorous) procedure.In the end, in most cases, one can easily check by differentiation that the function foundis, indeed, a solution to the ODE with initial condition. However, one does not knowuniqueness without further investigations (general results such as Th. 3.15 below canoften help). One also has to determine on which interval the found solution is defined.On the other hand, as one is usually interested in choosing the interval as large aspossible, the optimal choice is not always obvious when using Th. 2.7, either.

The heuristic procedure is as follows: Start with the ODE (2.14) written in the form

dy

dx= f(x)g(y). (2.27a)

Multiply by dx and divide by g(y) (i.e. separate the variables):

dy

g(y)= f(x) dx . (2.27b)

Integrate:∫

dy

g(y)=

∫

f(x) dx . (2.27c)

Change the integration variables and supply the appropriate upper and lower limits forthe integrals (according to the initial condition):

∫ y

y0

dt

g(t)=

∫ x

x0

f(t) dt . (2.27d)

Solve this equation for y, set φ(x) := y, check by differentiation that φ is, indeed, asolution to the ODE, and determine the largest interval I ′ such that x0 ∈ I ′ and suchthat φ is defined on I ′. The use of this heuristic procedure is demonstrated by thefollowing example:


y′ = −y2 on I × J := R× R (2.28)


with the initial condition y(x0) = y0 for given values x0, y0 ∈ R. We manipulate (2.28)according to the heuristic procedure described in (2.27) above:

dy

dx= −y2 −y−2 dy = dx −

∫

y−2 dy =

∫

dx

−∫ y

y0

t−2 dt =

∫ x

x0

dt

[

1

t

]y

y0

= [t]xx0 1

y− 1

y0= x− x0

φ(x) = y =y0

1 + (x− x0) y0. (2.29)

Clearly, φ(x0) = y0. Moreover,

φ′(x) = − y20(

1 + (x− x0) y0)2 = −

(

φ(x))2, (2.30)

i.e. φ does, indeed, provide a solution to (2.28). If y0 = 0, then φ ≡ 0 is definedon the entire interval I = R. If y0 6= 0, then the denominator of φ(x) has a zero atx = (x0y0 − 1)/y0, and φ is not defined on all of R. In that case, if y0 > 0, thenx0 > (x0y0 − 1)/y0 = x0 − 1/y0 and the maximal open interval for φ to be defined onis I ′ =]x0 − 1/y0,∞[; if y0 < 0, then x0 < (x0y0 − 1)/y0 = x0 − 1/y0 and the maximalopen interval for φ to be defined on is I ′ =]−∞, x0−1/y0[. Note that the formula for φobtained by (2.29) works for y0 = 0 as well, even though not every previous expressionin (2.29) is meaningful for y0 = 0 and, also, Th. 2.7 does not apply to (2.28) for y0 = 0.In the present example, the subsequent Th. 3.15 does, indeed, imply φ to be the uniquesolution to the initial value problem on I ′.

2.4 Change of Variables

To solve an ODE, it can be useful to transform it into an equivalent ODE, using aso-called change of variables. If one already knows how to solve the transformed ODE,then the equivalence allows one to also solve the original ODE. We first present thefollowing Th. 2.10, which constitutes the base for the change of variables technique,followed by examples, where the technique is applied.

Theorem 2.10. Let G ⊆ R × Kn be open, n ∈ N, f : G −→ Kn, and (x0, y0) ∈ G.Define

∀x∈R

Gx := y ∈ Kn : (x, y) ∈ G (2.31)

and assume the change of variables function T : G −→ Kn is differentiable and suchthat

∀Gx 6=∅

(

Tx := T (x, ·) : Gx −→ Tx(Gx), Tx(y) := T (x, y), is a diffeomorphism)

,

(2.32)


i.e. Tx is invertible and both Tx and T−1x are differentiable. Then the first-order initial

value problems

y′ = f(x, y), (2.33a)

y(x0) = y0, (2.33b)

and

y′ =(

DT−1x (y)

)−1

f(

x, T−1x (y)

)

+ ∂xT(

x, T−1x (y)

)

, (2.34a)

y(x0) = T (x0, y0), (2.34b)

are equivalent in the following sense:

(a) A differentiable function φ : I −→ Kn, where I ⊆ R is a nontrivial interval, is asolution to (2.33a) if, and only if, the function

µ : I −→ Kn, µ(x) := (Tx φ)(x) = T(

x, φ(x))

, (2.35)

is a solution to (2.34a).

(b) A differentiable function φ : I −→ Kn, where I ⊆ R is a nontrivial interval, is asolution to (2.33) if, and only if, the function of (2.35) is a solution to (2.34).

Proof. We start by noting that the assumption of G being open clearly implies each Gx,x ∈ R, to be open as well, which, in turn, implies Tx(Gx) to be open, even though thisis not as obvious1. Next, for each x ∈ R such that Gx 6= ∅, we can apply the chain rule[Phi15, Th. 2.28] to Tx T−1

x = Id to obtain

∀y∈Tx(Gx)

DTx(

T−1x (y)

)

DT−1x (y) = Id (2.36)

and, thus, each DT−1x (y) is invertible with

∀y∈Tx(Gx)

(

DT−1x (y)

)−1

= DTx(

T−1x (y)

)

. (2.37)

Consider φ and µ as in (a) and notice that (2.35) implies

∀x∈I

φ(x) = T−1x (µ(x)). (2.38)

Moreover, the differentiability of φ and T imply differentiability of µ by the chain rule,which also yields

∀x∈I

µ′(x) = DT(

x, φ(x))

(

1φ′(x)

)

= DTx(φ(x))φ′(x) + ∂xT

(

x, φ(x))

.

(2.39)

1If Tx is a continuously differentiable map, then this is related to the inverse function theorem (see,e.g. [Phi15, Cor. C.9]); it is still true if Tx is merely continuous and injective, but then it is the invarianceof domain theorem of algebraic topology [Oss09, 5.6.15], which is equivalent to the Brouwer fixed-pointtheorem [Oss09, 5.6.10], and is much harder to prove.


To prove (a), first assume φ : I −→ Kn to be a solution of (2.33a). Then, for eachx ∈ I,

µ′(x)(2.39),(2.33a)

= DTx(φ(x)) f(

x, φ(x))

+ ∂xT(

x, φ(x))

(2.38)= DTx

(

T−1x (µ(x))

)

f(

x, T−1x (µ(x))

)

+ ∂xT(

x, T−1x (µ(x))

)

(2.37)=

(

DT−1x (µ(x))

)−1

f(

x, T−1x (µ(x))

)

+ ∂xT(

x, T−1x (µ(x))

)

, (2.40)

showing µ satisfies (2.34a). Conversely, assume µ to be a solution to (2.34a). Then, foreach x ∈ I,

(

DT−1x (µ(x))

)−1

f(

x, T−1x (µ(x))

)

+ ∂xT(

x, T−1x (µ(x))

)

(2.34a)= µ′(x)

(2.39)= DTx(φ(x))φ

′(x) + ∂xT(

x, φ(x))

. (2.41)

Using (2.38), one can subtract the second summand from (2.41). Multiplying the resultby DT−1

x (µ(x)) from the left and taking into account (2.37) then provides

∀x∈I

φ′(x) = f(

x, T−1x (µ(x))

) (2.38)= f

(

x, φ(x))

, (2.42)

showing φ satisfies (2.33a).

It remains to prove (b). If φ satisfies (2.33), then µ satisfies (2.34a) by (a). More-over, µ(x0) = T

(

x0, φ(x0))

= T (x0, y0), i.e. µ satisfies (2.34b) as well. Conversely,assume µ satisfies (2.34). Then φ satisfies (2.33a) by (a). Moreover, by (2.38), φ(x0) =T−1x0

(µ(x0)) = T−1x0

(T (x0, y0)) = y0, showing φ satisfies (2.33b) as well.

As a first application of Th. 2.10, we prove the following theorem about so-calledBernoulli differential equations:

Theorem 2.11. Consider the Bernoulli differential equation

y′ = f(x, y) := a(x) y + b(x) yα, (2.43a)

where α ∈ R\0, 1, the functions a, b : I −→ R are continuous and defined on an openinterval I ⊆ R, and f : I × R+ −→ R. For (2.43a), we add the initial condition

y(x0) = y0, (x0, y0) ∈ I × R+, (2.43b)

and, furthermore, we also consider the corresponding linear initial value problem

y′ = (1− α)(

a(x) y + b(x))

, (2.44a)

y(x0) = y1−α0 , (2.44b)

with its unique solution ψ : I −→ R given by Th. 2.3.


(a) Uniqueness: On each open interval I ′ ⊆ I satisfying x0 ∈ I ′ and ψ > 0 on I ′, theBernoulli initial value problem (2.43) has a unique solution. This unique solutionis given by

φ : I ′ −→ R+, φ(x) :=(

ψ(x)) 1

1−α . (2.45)

(b) Existence: There exists an open interval I ′ ⊆ I satisfying x0 ∈ I ′ and ψ > 0 on I ′,i.e. an I ′ such that (a) applies.

Proof. (b) is immediate from Th. 2.3, since ψ(x0) = y0 > 0 and ψ is continuous.

To prove (a), we apply Th. 2.10 with the change of variables

T : I × R+ −→ R+, T (x, y) := y1−α. (2.46)

Then T ∈ C1(I × R+,R) with ∂xT ≡ 0 and ∂yT (x, y) = (1− α) y−α. Moreover,

∀x∈I

Tx = S, S : R+ −→ R+, S(y) := y1−α, (2.47)

which is differentiable with the differentiable inverse function S−1 : R+ −→ R+,

S−1(y) = y1

1−α , DS−1(y) = (S−1)′(y) = 11−α

yα

1−α . Thus, (2.34a) takes the form

y′ =(

DT−1x (y)

)−1

f(

x, T−1x (y)

)

+ ∂xT(

x, T−1x (y)

)

= (1− α) y−α

1−α

(

a(x) y1

1−α + b(x)(

y1

1−α

)α)

+ 0

= (1− α)(

a(x) y + b(x))

. (2.48)

Thus, if I ′ ⊆ I is such that x0 ∈ I ′ and ψ > 0 on I ′, then Th. 2.10 says φ definedby (2.45) must be a solution to (2.43) (note that the differentiability of ψ implies thedifferentiability of φ). On the other hand, if λ : I ′ −→ R+ is an arbitrary solution to(2.43), then Th. 2.10 states µ := S λ = λ1−α to be a solution to (2.44). The uniquenesspart of Th. 2.3 then yields λ1−α = ψI′= φ1−α, i.e. λ = φ.

Example 2.12. Consider the initial value problem

y′ = f(x, y) := i− 1

ix− y + 2, (2.49a)

y(1) = i, (2.49b)

where f : G −→ C, G := (x, y) ∈ R×C : ix− y+2 6= 0 (G is open as the continuouspreimage of the open set C \ 0). We apply the change of variables

T : G −→ C, T (x, y) := ix− y. (2.50)

Then, T ∈ C1(G,C). Moreover, for each x ∈ R,

Gx = y ∈ C : (x, y) ∈ G = C \ ix+ 2 (2.51)


and we have the diffeomorphisms

Tx : C \ ix+ 2 −→ C \ −2, Tx(y) = ix− y, (2.52a)

T−1x : C \ −2 −→ C \ ix+ 2, T−1

x (y) = ix− y. (2.52b)

To obtain the transformed equation, we compute the right-hand side of (2.34a)

(

DT−1x (y)

)−1

f(

x, T−1x (y)

)

+ ∂xT(

x, T−1x (y)

)

= (−1) ·(

i− 1

y + 2

)

+ i =1

y + 2. (2.53)

Thus, the transformed initial value problem is

y′ =1

y + 2, (2.54a)

y(1) = T (1, i) = i− i = 0. (2.54b)

Using seperation of variables, one finds the solution

µ : ]− 1,∞[−→]− 2,∞[, µ(x) :=√2x+ 2− 2, (2.55)

to (2.54). Then Th. 2.10 implies that

φ : ]− 1,∞[−→ C, φ(x) := T−1x

(

µ(x))

= ix−√2x+ 2 + 2, (2.56)

is a solution to (2.49) (that φ is a solution to (2.49) can now also easily be checkeddirectly). It will become clear from Th. 3.15 below that φ and ψ are also the uniquesolutions to their respective initial value problems.

—

Finding a suitable change of variables to transform a given ODE such that one is in aposition to solve the transformed ODE is an art, i.e. it can be very difficult to spot auseful transformation, and it takes a lot of practise and experience.

Remark 2.13. Somewhat analogous to the situation described in the paragraph before(2.27) regarding the separation of variables technique, in practise, one frequently uses aheuristic procedure to apply a change of variables, rather than appealing to the rigorousTh. 2.10. For the initial value problem y′ = f(x, y), y(x0) = y0, this heuristic procedureproceeds as follows:

(1) One introduces the new variable z := T (x, y) and then computes z′, i.e. the deriva-tive of the function x 7→ z(x) = T (x, y(x)).

(2) In the result of (1), one eliminates all occurrences of the variable y by first replacingy′ by f(x, y) and then replacing y by T−1

x (z), where Tx(y) := T (x, y) = z (i.e. one hasto solve the equation z = T (x, y) for y). One thereby obtains the transformed initialvalue problem problem z′ = g(x, z), z(x0) = T (x0, y0), with a suitable function g.


(3) One solves the transformed initial value problem to obtain a solution µ, and thenx 7→ φ(x) := T−1

x

(

µ(x))

yields a candidate for a solution to the original initial valueproblem.

(4) One checks that φ is, indeed, a solution to y′ = f(x, y), y(x0) = y0.

Example 2.14. Consider

f : R+ × R −→ R, f(x, y) := 1 +y

x+y2

x2, (2.57)

and the initial value problem

y′ = f(x, y), y(1) = 0. (2.58)

We introduce the change of variables z := T (x, y) := y/x and proceed according to thesteps of Rem. 2.13. According to (1), we compute, using the quotient rule,

z′(x) =y′(x) x− y(x)

x2. (2.59)

According to (2), we replace y′(x) by f(x, y) and then replace y by T−1x (z) = xz to

obtain the transformed initial value problem

z′ =1

x

(

1 +y

x+y2

x2

)

− y

x2=

1

x(1 + z + z2)− z

x=

1 + z2

x, z(1) = 0/1 = 0. (2.60)

According to (3), we next solve (2.60), e.g. by seperation of variables, to obtain thesolution

µ :]

e−π2 , e

π2

[

−→ R, µ(x) := tan ln x, (2.61)

of (2.60), and

φ :]

e−π2 , e

π2

[

−→ R, φ(x) := xµ(x) = x tan ln x, (2.62)

as a candidate for a solution to (2.58). Finally, according to (4), we check that φ is,indeed, a solution to (2.58): Due to φ(1) = 1 · tan 0 = 0, φ satisfies the initial condition,and due to

φ′(x) = tan ln x+ x1

x(1 + tan2 ln x) = 1 + tan ln x+ tan2 ln x

= 1 +φ(x)

x+φ2(x)

x2, (2.63)

φ satisfies the ODE.

3 GENERAL THEORY 24

3 General Theory

3.1 Equivalence Between Higher-Order ODE and Systems ofFirst-Order ODE

It turns out that each one-dimensional kth-order ODE is equivalent to a system of kfirst-order ODE; more generally, that each n-dimensional kth-order ODE is equivalentto a kn-dimensional first-order ODE (i.e. to a system of kn one-dimensional first-orderODE). Even though, in this class, we will mainly consider explicit ODE, we provide theequivalence also for the implicit case, as the proof is essentially the same (the explicitcase is included as a special case).

Theorem 3.1. In the situation of Def. 1.2(a), i.e. U ⊆ R×K(k+1)n and F : U −→ Kn,plus (x0, y0,0, . . . , y0,k−1) ∈ R×Kkn, consider the kth-order initial value problem

F (x, y, y′, . . . , y(k)) = 0, (3.1a)

∀j∈0,...,k−1

y(j)(x0) = y0,j , (3.1b)

and the first-order initial value problem

y′1 − y2 = 0,

y′2 − y3 = 0,

...

y′k−1 − yk = 0,

F (x, y1, . . . , yk, y′k) = 0,

(3.2a)

y(x0) =

y0,0...

y0,k−1

(3.2b)

(note that the unknown function y in (3.1) is Kn-valued, whereas the unknown functiony in (3.2) is Kkn-valued). Then both initial value problems are equivalent in the followingsense:

(a) If φ : I −→ Kn is a solution to (3.1), then

Φ : I −→ Kkn, Φ :=

φφ′

...φ(k−1)

, (3.3)

is a solution to (3.2).

(b) If Φ : I −→ Kkn is a solution to (3.2), then φ := Φ1 (which is Kn-valued) is asolution to (3.1).

3 GENERAL THEORY 25

Proof. We rewrite (3.2a) asG(x, y, y′) = 0, (3.4)

where

G : V −→ Kkn,

V :=

(x, y, z) ∈ R×Kkn ×Kkn : (x, y, zk) ∈ U

⊆ R×Kkn ×Kkn,

G1(x, y, z) := z1 − y2,

G2(x, y, z) := z2 − y3,

...

Gk−1(x, y, z) := zk−1 − yk,

Gk(x, y, z) := F (x, y, zk). (3.5)

(a): As a solution to (3.1), φ is k times differentiable and Φ is well-defined. Then (3.1b)implies (3.2b), since

Φ(x0) =

φ(x0)φ′(x0)

...φ(k−1)(x0)

(3.1b)=

y0,0...

y0,k−1

.

Next, Def. 1.2(a)(i) for φ implies Def. 1.2(a)(i) for Φ, since

(

x,Φ(x),Φ′(x))

∈ I ×Kkn ×Kkn : x ∈ I(3.3)=(

x, φ(x), φ′(x), . . . , φ(k−1)(x), φ′(x), . . . , φ(k)(x))

∈ I ×Kkn ×Kkn : x ∈ I

(x, φ(x), . . . , φ(k)(x)) ∈ U

⊆ V. (3.6)

The definition of Φ in (3.3) implies

∀j∈1,...,k−1

Φ′j = (φ(j−1))′ = φ(j) = Φj+1,

showing Φ satisfies the first k − 1 equations of (3.2a). As

Φ′k(x) = (φ(k−1))′(x) = φ(k)(x)

and, thus,

∀x∈I

Gk

(

x,Φ(x),Φ′(x))

= F(

x, φ(x), φ′(x), . . . , φ(k)(x)) (3.1a)

= 0, (3.7)

Φ also satisfies the last equation of (3.2a).

(b): As Φ is a solution to (3.2), the first k − 1 equations of (3.2a) imply

∀j∈1,...,k−1

Φj+1 = Φ′j

φ=Φ1= φ(j),

3 GENERAL THEORY 26

i.e. φ is k times differentiable and Φ has, once again, the form (3.3) (note Φ1 = φ bythe definition of φ). Then, clearly, (3.2b) implies (3.1b), and Def. 1.2(a)(i) for Φ impliesDef. 1.2(a)(i) for φ:

(

x,Φ(x),Φ′(x))

∈ I ×Kkn ×Kkn : x ∈ I

⊆ V

and the definition of V in (3.5) imply

(

x, φ(x), . . . , φ(k))

∈ I ×K(k+1)n : x ∈ I

⊆ U.

Finally, from the last equation of (3.2a), one obtains

∀x∈I

F(

x, φ(x), . . . , φ(k)) (3.3),(3.5)

= Gk

(

x,Φ(x),Φ′(x))

= 0,

proving φ satisfies (3.1a).

Example 3.2. The second-order initial value problem

y′′ = −y, y(0) = 0,

y′(0) = r, r ∈ R given,(3.8)

is equivalent to the following system of two first-order ODE:

y′1 = y2,

y′2 = −y1,y(0) =

(

0r

)

. (3.9)

The solution to (3.9) is

Φ : R −→ R2, Φ(x) =

(

Φ1(x)Φ2(x)

)

=

(

r sin xr cos x

)

, (3.10)

and, thus, the solution to (3.8) is

φ : R −→ R, φ(x) = r sin x. (3.11)

—

As a consequence of Th. 3.1, one can carry out much of the general theory of ODE(such as results regarding existence and uniqueness of solutions) for systems of first-order ODE, obtaining the corresponding results for higher-order ODE as a corollary.This is the strategy usually pursued in the literature and we will follow suit in thisclass.

3.2 Existence of Solutions

It is a rather remarkable fact that, under the very mild assumption that f : G −→ Kn isa continuous function defined on an open subset G of R×Kkn with (x0, y0,0, . . . , y0,k−1) ∈

3 GENERAL THEORY 27

G, every initial value problem (1.7) for the n-dimensional explicit kth-order ODE (1.6)has at least one solution φ : I −→ Kn, defined on a, possibly very small, open interval.This is the contents of the Peano Th. 3.8 below and its Cor. 3.10. From Example 1.4(b),we already know that uniqueness of the solution cannot be expected without strongerhypotheses.

The proof of the Peano theorem requires some work. One of the key ingredients isthe Arzela-Ascoli Th. 3.7 that, under suitable hypotheses, guarantees a given sequenceof continuous functions to have a uniformly convergent subsequence (the formulationin Th. 3.7 is suitable for our purposes – many different variants of the Arzela-Ascolitheorem exist in the literature).

We begin with some prelimanaries from the theory of metric spaces. At this point, thereader might want to review the definition of a metric, a metric space, and basic notionson metric spaces, such as the notion of compactness and the notion of continuity offunctions between metric spaces. Also recall that every normed space is a metric spacevia the metric induced by the norm (in particular, if we use metric notions on normedspaces, they are always meant with respect to the respective induced metric). If youare not sufficiently familiar with metrics and norms, you might want to consult therelevant subsections of [Phi15, Sec. 1]; for compactness and some related results see,e.g., Appendix C.2.

Notation 3.3. Let (X, d) be a metric space. Given x ∈ X and r ∈ R+, let

Br(x) := y ∈ X : d(x, y) < r

denote the open ball with center x and radius r, also known as the r-ball with center x.

Definition 3.4. Let (X, dX) and (Y, dY ) be metric spaces. We say a sequence of func-tions (fm)m∈N, fm : X −→ Y , converges uniformly to a function f : X −→ Y if, andonly if,

∀ǫ>0

∃N∈N

∀m≥N,x∈X

dY(

fm(x), f(x))

< ǫ.

Theorem 3.5. Let (X, dX) and (Y, dY ) be metric spaces. If the sequence (fm)m∈N ofcontinuous functions fm : X −→ Y converges uniformly to the function f : X −→ Y ,then f is continuous as well.

Proof. We have to show that f is continuous at every ξ ∈ X. Thus, let ξ ∈ X and ǫ > 0.Due to the uniform convergence, we can choose m ∈ N such that dY

(

fm(x), f(x))

< ǫ/3for every x ∈ X. Moreover, as fm is continuous at ξ, there exists δ > 0 such thatx ∈ Bδ(ξ) implies dY

(

fm(ξ), fm(x))

< ǫ/3. Thus, if x ∈ Bδ(ξ), then

dY(

f(ξ), f(x))

≤ dY(

f(ξ), fm(ξ))

+ dY(

fm(ξ), fm(x))

+ dY(

fm(x), f(x))

<ǫ

3+ǫ

3+ǫ

3= ǫ,

proving f is continuous at ξ.

3 GENERAL THEORY 28

Definition 3.6. Let (X, dX) and (Y, dY ) be metric spaces and let F be a set of functionsfrom X into Y . Then the set F (or the functions in F) are said to be uniformlyequicontinuous if, and only if, for each ǫ > 0, there is δ > 0 such that

∀x,ξ∈X

(

dX(x, ξ) < δ ⇒ ∀f∈F

dY(

f(x), f(ξ))

< ǫ

)

. (3.12)

Theorem 3.7 (Arzela-Ascoli). Let n ∈ N, let ‖ · ‖ denote some norm on Kn, and letI ⊆ R be some bounded interval. If (fm)m∈N is a sequence of functions fm : I −→ Kn

such that fm : m ∈ N is uniformly equicontinuous and such that, for each x ∈ I, thesequence

(

fm(x))

m∈Nis bounded, then (fm)m∈N has a uniformly convergent subsequence

(fmj)j∈N, i.e. there exists f : I −→ Kn such that

∀ǫ>0

∃N∈N

∀j≥N,x∈I

‖fmj(x)− f(x)‖ < ǫ.

In particular, the limit function f is continuous.

Proof. Let (r1, r2, . . . ) be an enumeration of the set of rational numbers in I, i.e. ofQ ∩ I. Inductively, we construct a sequence (Fm)m∈N of subsequences of (fm)m∈N,Fm = (fm,k)k∈N, such that

(i) for each m ∈ N, Fm is a subsequence of each Fj with j ∈ 1, . . . ,m,

(ii) for each m ∈ N, Fm converges pointwise at each of the first m rational numbersrj; more precisely, there exists a sequence (z1, z2, . . . ) in Kn such that, for eachm ∈ N and each j ∈ 1, . . . ,m:

limk→∞

fm,k(rj) = zj.

Actually, we construct the (zm)m∈N inductively together with the (Fm)m∈N: Since thesequence (fm(r1))m∈N is, by hypothesis, a bounded sequence in Kn, one can apply theBolzano-Weierstrass theorem (cf. [Phi15, Th. 1.16(b)]) to obtain z1 ∈ Kn and a sub-sequence F1 = (f1,k)k∈N of (fm)m∈N such that limk→∞ f1,k(r1) = z1. To proceed byinduction, we now assume to have already constructed F1, . . . ,FM and z1, . . . , zM forM ∈ N such that (i) and (ii) hold for each m ∈ 1, . . . ,M. Since the sequence(fM,k(rM+1))k∈N is a bounded sequence in Kn, one can, once more, apply the Bolzano-Weierstrass theorem to obtain zM+1 ∈ Kn and a subsequence FM+1 = (fM+1,k)k∈N ofFM such that limk→∞ fM+1,k(rM+1) = zM+1. Since FM+1 is a subsequence of FM , it isalso a subsequence of all previous subsequences, i.e. (i) now also holds for m = M + 1.In consequence, limk→∞ fM+1,k(rj) = zj for each j = 1, . . . ,M + 1, such that (ii) nowalso holds for m =M + 1 as required.

Next, one considers the diagonal sequence (gm)m∈N, gm := fm,m, and observes that thissequence converges pointwise at each rational number rj (limm→∞ gm(rj) = zj), since,at least for m ≥ j, (gm)m∈N is a subsequence of every Fj (exercise) – in particular,(gm)m∈N is also a subsequence of the original sequence (fm)m∈N.

3 GENERAL THEORY 29

In the last step of the proof, we show that (gm)m∈N converges uniformly on the entireinterval I to some f : I −→ Kn. To this end, fix ǫ > 0. Since gm : m ∈ N ⊆ fm :m ∈ N, the assumed uniform equicontinuity of fm : m ∈ N yields δ > 0 such that

∀x,ξ∈I

(

|x− ξ| < δ ⇒ ∀m∈N

∥

∥gm(x)− gm(ξ)∥

∥ <ǫ

3

)

.

Since I is bounded, it has finite length and, thus, it can be covered with finitely manyintervals I1, . . . , IN , I =

⋃Nj=1 Ij, N ∈ N, such that each Ij has length less than δ.

Moreover, since Q is dense in R, for each j ∈ 1, . . . , N, there exists k(j) ∈ N suchthat rk(j) ∈ Ij. Define M := maxk(j) : j = 1, . . . , N. We note that each of thefinitely many sequences

(

(gm(r1))

m∈N, . . . ,

(

(gm(rM))

m∈Nis a Cauchy sequence. Thus,

∃K∈N

∀k,l≥K

∀α=1,...,M

∥

∥gk(rα)− gl(rα)∥

∥ <ǫ

3. (3.13)

We now consider an arbitrary x ∈ I and k, l ≥ K. Let j ∈ 1, . . . , N such that x ∈ Ij.Then rk(j) ∈ Ij, |rk(j) − x| < δ, and the estimate in (3.13) holds for α = k(j). Inconsequence, we obtain the crucial estimate

∀k,l≥K

∥

∥gk(x)− gl(x)∥

∥

≤ ‖gk(x)− gk(rk(j))∥

∥+∥

∥gk(rk(j))− gl(rk(j))∥

∥+∥

∥gl(rk(j))− gl(x)∥

∥

<ǫ

3+ǫ

3+ǫ

3= ǫ.

(3.14)

The estimate (3.14) shows(

(gm(x))

m∈Nis a Cauchy sequence for each x ∈ I, and we

can definef : I −→ Kn, f(x) := lim

m→∞gm(x). (3.15)

Since K in (3.14) does not depend on x ∈ I, passing to the limit k → ∞ in the estimateof (3.14) implies

∀l≥K,x∈I

‖gl(x)− f(x)‖ ≤ ǫ,

proving uniform convergence of the subsequence (gm)m∈N of (fm)m∈N as desired. Thecontinuity of f is now a consequence of Th. 3.5.

At this point, we have all preparations in place to state and prove the existence theorem.

Theorem 3.8 (Peano). If G ⊆ R×Kn is open, n ∈ N, and f : G −→ Kn is continuous,then, for each (x0, y0) ∈ G, the explicit n-dimensional first-order initial value problem

y′ = f(x, y), (3.16a)

y(x0) = y0, (3.16b)

has at least one solution. More precisely, given an arbitrary norm ‖ · ‖ on Kn, (3.16)has a solution φ : I −→ Kn, defined on the open interval

I :=]x0 − α, x0 + α[, (3.17)

3 GENERAL THEORY 30

α = α(b) > 0, where b > 0 is such that

B :=

(x, y) ∈ R×Kn : |x− x0| ≤ b and ‖y − y0‖ ≤ b

⊆ G, (3.18)

M :=M(b) := max‖f(x, y)‖ : (x, y) ∈ B <∞, (3.19)

and

α := α(b) :=

minb, b/M for M > 0,

b for M = 0.(3.20)

In general, the choice of the norm ‖ · ‖ on Kn will influence the possible sizes of α and,thus, of I.

Proof. The proof will be conducted in several steps. In the first step, we check α =α(b) > 0 is well-defined: Since G is open, there always exists b > 0 such that (3.18)holds. Since B is a closed and bounded subset of the finite-dimensional space R×Kn,B is compact (cf. [Phi15, Cor. 3.5]). Since f and, thus, ‖f‖ is continuous (every normis even Lipschitz continuous due to the inverse triangle inequality), it must assume itsmaximum on the compact set B (cf. [Phi15, Th. 3.8]), showing M ∈ R+

0 is well-definedby (3.19) and α is well-defined by (3.20).

In the second step of the proof, we note that it suffices to prove (3.16) has a solution φ+,defined on [x0, x0 + α[: One can then apply the time reversion Lem. 1.9(b): The proofproviding the solution φ+ also provides a solution ψ+ : [−x0,−x0 + α[−→ Kn to thetime-reversed initial value problem, consisting of y′ = −f(−x, y) and y(−x0) = y0 (notethat the same M and α work for the time-reversed problem). Then, according to Lem.1.9(b), φ− : ]x0 − α, x0] −→ Kn, φ−(x) := ψ+(−x), is a solution to (3.16). According toLem. 1.7, we can patch φ− and φ+ together to obtain the desired solution

φ : I −→ Kn, φ(x) :=

φ−(x) for x ≤ x0,

φ+(x) for x ≥ x0,(3.21)

defined on all of I. It is noted that one can also conduct the proof with the second stepomitted, but then one has to perform the following steps on all of I, which means onehas to consider additional cases in some places.

In the third step of the proof, we will define a sequence (φm)m∈N of functions

φm : I+ −→ Kn, I+ := [x0, x0 + α], (3.22)

that constitute approximate solutions to (3.16). To begin the construction of φm, fixm ∈ N. Since B is compact and f is continuous, we know f is even uniformly continuouson B (cf. [Phi15, Th. 3.9]). In particular,

∃δm>0

∀(x,y),(x,y)∈B

(

|x− x| < δm, ‖y − y‖ < δm ⇒∥

∥f(x, y)− f(x, y)∥

∥ <1

m

)

.

(3.23)

3 GENERAL THEORY 31

We now form what is called a discretization of the interval I+, i.e. a partition of I+ intosufficiently many small intervals: Let N ∈ N and

x0 < x1 < · · · < xN−1 < xN := x0 + α (3.24)

such that

∀j∈1,...,N

xj − xj−1 < β :=

minδm, δm/M, 1/m for M > 0,

minδm, 1/m for M = 0(3.25)

(for example one could make the equidistant choice xj := x0 + jh with h = α/N andN > α/β, but it does not matter how the xj are defined as long as (3.24) and (3.25)both hold). Note that we get a different discretization of I+ for each m ∈ N; however,the dependence on m is suppressed in the notation for the sake of readability. We nowdefine recursively

φm : I+ −→ Kn,

φm(x0) := y0,

φm(x) := φm(xj) + (x− xj) f(

xj, φm(xj))

for each x ∈ [xj , xj+1].

(3.26)

Note that there is no conflict between the two definitions given for x = xj with j ∈1, . . . , N − 1. Each function φm defines a polygon in Kn. This construction is knownas Euler’s method and it can be used to obtain numerical approximations to the solutionof the initial value problem (while simple, this method is not very efficient, though). Westill need to verify that the definition (3.26) does actually make sense: We need to checkthat f can, indeed, be applied to (xj, φm(xj)), i.e. we have to check (xj, φm(xj)) ∈ G.We can actually show the stronger statement

∀x∈I+

(x, φm(x)) ∈ B, (3.27)

where B is as defined in (3.18). First, it is pointed out that (3.20) implies α ≤ b, suchthat x ∈ I+ implies |x − x0| ≤ α ≤ b as required in (3.18). One can now prove (3.27)by showing by induction on j ∈ 0, . . . , N − 1:

∀x∈[xj ,xj+1]

(x, φm(x)) ∈ B. (3.28)

To start the induction, note φm(x0) = y0 and (x0, y0) ∈ B by (3.18). Now let j ∈0, . . . , N − 1 and x ∈ [xj, xj+1]. We estimate

‖φm(x)− y0‖ ≤ ‖φm(x)− φm(xj)‖+j∑

k=1

‖φm(xk)− φm(xk−1)‖

(3.26)= (x− xj)

∥

∥f(

xj, φm(xj))∥

∥+

j∑

k=1

(xk − xk−1)∥

∥f(

xk−1, φm(xk−1))∥

∥

(∗)

≤ (x− xj)M +

j∑

k=1

(xk − xk−1)M = (x− x0)M

≤ αM(3.20)

≤ b, (3.29)

3 GENERAL THEORY 32

where, at (∗), it was used that (xk, φm(xk)) ∈ B by induction hypothesis for eachk = 0, . . . , j, and, thus,

∥

∥f(

xk, φm(xk))∥

∥ ≤M by (3.19). Estimate (3.29) completes theinduction and the third step of the proof.

In the fourth step of the proof, we establish several properties of the functions φm. Thefirst two properties are immediate from (3.26), namely that φm is continuous on I+ anddifferentiable at each x ∈]xj, xj+1[, j ∈ 0, . . . , N − 1, where

∀j∈0,...,N−1

∀x∈]xj ,xj+1[

φ′m(x) = f

(

xj, φm(xj))

. (3.30)

The next property to establish is

∀s,t∈I+

‖φm(t)− φm(s)‖ ≤ |t− s|M. (3.31)

To prove (3.31), we may assume s < t without loss of generality. If s, t ∈ [xj, xj+1],j ∈ 0, . . . , N − 1, then

‖φm(t)− φm(s)‖(3.26)=

∥

∥φm(xj) + (t− xj) f(

xj, φm(xj))

− φm(xj)− (s− xj) f(

xj, φm(xj))∥

∥

= |t− s|∥

∥f(

xj, φm(xj))∥

∥

(3.19)

≤ |t− s|M (3.32a)

as desired. If s, t are not contained in the same interval [xj , xj+1], then fix j < k suchthat s ∈ [xj, xj+1] and t ∈ [xk, xk+1]. Then (3.31) follows from an estimate analogous tothe one in (3.29):

‖φm(t)− φm(s)‖

≤ ‖φm(s)− φm(xj+1)‖+k−1∑

l=j+1

‖φm(xl)− φm(xl+1)‖+ ‖φm(xk)− φm(t)‖

(3.32a)

≤ |s− xj+1|M +k−1∑

l=j+1

|xl − xl+1|M + |t− xk|M

= |t− s|M, (3.32b)

completing the proof of (3.31). The following property of the φm is the justification forcalling them approximate solutions to our initial value problem (3.16):

∀j∈0,...,N−1

∀x∈]xj ,xj+1[

∥

∥φ′m(x)− f

(

x, φm(x))∥

∥ <1

m. (3.33)

Indeed, (3.33) is a consequence of (3.23), i.e. of the uniform continuity of f on B: First,ifM = 0, then f ≡ φ′

m ≡ 0 and there is nothing to prove. So letM > 0. If x ∈]xj, xj+1[,then, according to (3.30), we have φ′

m(x) = f(

xj, φm(xj))

. Thus, by (3.25),

|x−xj| < β ≤ minδm, δm/M ⇒ ‖φm(x)−φm(xj)‖(3.31)

≤ |x−xj|M < δm, (3.34a)

3 GENERAL THEORY 33

and

∥

∥φ′m(x)− f

(

x, φm(x))∥

∥ =∥

∥f(

xj, φm(xj))

− f(

x, φm(x))∥

∥

(3.34a),(3.23)<

1

m, (3.34b)

proving (3.33).

The last property of the φm we need is

∀x∈I+

‖φm(x)‖ ≤ ‖φm(x)− φm(x0)‖+ ‖φm(x0)‖(3.31)

≤ |x− x0|M + ‖φm(x0)‖≤ αM + ‖y0‖,

(3.35)

which says that the φm are pointwise and even uniformly bounded.

In the fifth and last step of the proof, we use the Arzela-Ascoli Th. 3.7 to obtain afunction φ+ : I+ −→ Kn, and we show that φ constitutes a solution to (3.16). Accordingto (3.31), the φm are uniformly equicontinuous (given ǫ > 0, condition (3.12) is satisfiedwith δ := ǫ/M for M > 0 and arbitrary δ > 0 for M = 0), and according to (3.35)the φm are bounded such that the Arzela-Ascoli Th. 3.7 applies to yield a subsequence(φmj

)j∈N of (φm)m∈N converging uniformly to some continuous function φ+ : I+ −→ Kn.So it merely remains to verify that φ+ is a solution to (3.16).

As the uniform convergence of the (φmj)j∈N implies pointwise convergence, we have

φ+(x0) = limj→∞ φmj(x0) = y0, showing φ+ satisfies the initial condition (3.16b).

Next,∀x∈I

(x, φ+(x)) = limj→∞

(x, φmj(x)) ∈ B,

since each (x, φmj(x)) is in B and B is closed. In particular, f(x, φ+(x)) is well-defined

for each x ∈ I+. To prove that φ+ also satisfies the ODE (3.16a), by Th. 1.5, it sufficesto show

∀x∈I+

φ+(x)− φ+(x0)−∫ x

x0

f(

t, φ+(t))

dt = 0 (3.36)

(in particular, (3.36) implies φ+ to be differentiable). Fixing x ∈ I+ and using thetriangle inequality for the umpteenth time, one obtains

∥

∥

∥

∥

φ+(x)− φ+(x0)−∫ x

x0

f(

t, φ+(t))

dt

∥

∥

∥

∥

≤ ‖φ+(x)− φmj(x)‖+

∥

∥

∥

∥

φmj(x)− φ+(x0)−

∫ x

x0

f(

t, φmj(t))

dt

∥

∥

∥

∥

+

∥

∥

∥

∥

∫ x

x0

(

f(

t, φmj(t))

− f(

t, φ+(t))

)

dt

∥

∥

∥

∥

, (3.37)

holding for every j ∈ N. We will conclude the proof by showing that all three summandson the right-hand side of (3.37) tend to 0 for j → ∞. As already mentioned above,the uniform convergence of the (φmj

)j∈N implies pointwise convergence, implying theconvergence of the first summand. We tackle the third summand next, using∥

∥

∥

∥

∫ x

x0

(

f(

t, φmj(t))

− f(

t, φ+(t))

)

dt

∥

∥

∥

∥

≤∫ x

x0

∥

∥

∥f(

t, φmj(t))

− f(

t, φ+(t))

∥

∥

∥dt , (3.38)

3 GENERAL THEORY 34

which holds for every norm (cf. Appendix B), but can easily be checked directly forthe 1-norm, where ‖(z1, . . . , zn)‖1 :=

∑nj=1 |zj| (exercise). Given ǫ > 0, the uniform

continuity of f on B provides δ > 0 such that∥

∥f(

t, φmj(t))

− f(

t, φ+(t))∥

∥ < ǫ/α for‖φmj

(t) − φ+(t)‖ < δ. The uniform convergence of (φmj)j∈N then yields K ∈ N such

that ‖φmj(t)− φ+(t)‖ < δ for every j ≥ K and each t ∈ I. Thus,

∀j≥K

∫ x

x0

∥

∥

∥f(

t, φmj(t))

− f(

t, φ+(t))

∥

∥

∥ dt ≤ |x− x0| ǫα

≤ ǫ,

thereby establishing the convergence of the third summand from the right-hand sideof (3.37). For the remaining second summand, we note that the fact that each φm iscontinuous and piecewise differentiable (with piecewise constant derivative) allows toapply the fundamental theorem of calculus in the form [Phi16, Th. 10.20(b)] to obtain

∀x∈I+

φm(x) = φm(x0) +

∫ x

x0

φ′m(t) dt . (3.39)

Using (3.39) in the second summand of the right-hand side of (3.37) provides

∥

∥

∥

∥

φmj(x)− φ+(x0)−

∫ x

x0

f(

t, φmj(t))

dt

∥

∥

∥

∥

≤∫ x

x0

∥

∥

∥φ′mj(t)− f

(

t, φmj(t))

∥

∥

∥ dt

(3.33)

≤∫ x

x0

1

mj

≤ α

mj

,

showing the convergence of the second summand, which finally concludes the proof.

Corollary 3.9. If G ⊆ R×Kn is open, n ∈ N, f : G −→ Kn is continuous, and C ⊆ Gis compact, then there exists α > 0, such that, for each (x0, y0) ∈ C, the explicit n-dimensional first-order initial value problem (3.16) has a solution φ : I −→ Kn, definedon the open interval I :=]x0 − α, x0 + α[, i.e. always on an interval of the same length2α.

Proof. Exercise.

Corollary 3.10. If G ⊆ R × Kkn is open, k, n ∈ N, and f : G −→ Kn is continuous,then, for each (x0, y0,0, . . . , y0,k−1) ∈ G, the explicit n-dimensional kth-order initial valueproblem consisting of (1.6) and (1.7), which, for convenience, we rewrite

y(k) = f(

x, y, y′, . . . , y(k−1))

, (3.40a)

∀j∈0,...,k−1

y(j)(x0) = y0,j , (3.40b)

has at least one solution. More precisely, there exists an open interval I ⊆ R withx0 ∈ I and φ : I −→ Kn such that φ is a solution to (3.40). If C ⊆ G is compact,then there exists α > 0 such that, for each (x0, y0,0, . . . , y0,k−1) ∈ C, (3.40) has a solutionφ : I −→ Kn, defined on the open interval I :=]x0−α, x0+α[, i.e. always on an intervalof the same length 2α.

3 GENERAL THEORY 35

Proof. If f is continuous, then the right-hand side of the equivalent first-order system(3.2a) (written in explicit form) is given by the continuous function

f : G −→ Kkn, f(x, y1, . . . , yk) :=

y2y3...

yk−1

f(x, y1, . . . , yk)

. (3.41)

Thus, Th. 3.8 provides a solution Φ : I −→ Kkn to (3.2) and, then, Th. 3.1(b) yieldsφ := Φ1 to be a solution to (3.40). Moreover, if C ⊆ G is compact, then Cor. 3.9 providesα > 0 such that, for each (x0, y0,0, . . . , y0,k−1) ∈ C, (3.2) has a solution Φ : I −→ Kkn,defined on the same open interval I :=]x0 − α, x0 + α[. In particular, φ := Φ1, thecorresponding solution to (3.40) is also defined on the same I.

While the Peano theorem is striking in its generality, it does have several drawbacks:(a) the interval, where the existence of a solution is proved can be unnecessarily short;(b) the selection of the subsequence using the Arzela-Ascoli theorem makes the proofnonconstructive; (c) uniqueness of solutions is not provided, even in cases, where uniquesolutions exist; (d) it does not provide information regarding how the solution changeswith a change of the initial condition. We will subsequently address all these points,namely (b) and (c) in Sec. 3.3 (we will see that the proof of the Peano theorem becomesconstructive in situations, where the solution is unique – in general, a constructive proofis not available), (a) in Sec. 3.4, and (d) in Sec. 3.5.

3.3 Uniqueness of Solutions

Example 1.4(b) shows that the hypotheses of the Peano Th. 3.8 are not strong enoughto guarantee the initial value problem (3.16) has a unique solution, not even in someneighborhood of x0. The additional condition that will yield uniqueness is local Lipschitzcontinuity of f with respect to y.

Definition 3.11. Let m,n ∈ N, G ⊆ R×Km, and f : G −→ Kn.

(a) The function f is called (globally) Lipschitz continuous or just (globally) Lipschitzwith respect to y if, and only if,

∃L≥0

∀(x,y),(x,y)∈G

∥

∥f(x, y)− f(x, y)∥

∥ ≤ L‖y − y‖. (3.42)

(b) The function f is called locally Lipschitz continuous or just locally Lipschitz withrespect to y if, and only if, for each (x0, y0) ∈ G, there exists a (relative) open setU ⊆ G such that (x0, y0) ∈ U (i.e. U is a (relative) open neighborhood of (x0, y0))and f is Lipschitz continuous with respect to y on U , i.e. if, and only if,

∀(x0,y0)∈G

∃(x0, y0) ∈ U ⊆ G open

∃L≥0

∀(x,y),(x,y)∈U

∥

∥f(x, y)− f(x, y)∥

∥ ≤ L‖y − y‖.(3.43)

3 GENERAL THEORY 36

The number L occurring in (a),(b) is called Lipschitz constant. The norms on Km andKn in (a),(b) are arbitrary. If one changes the norms, then one will, in general, changeL, but not the property of f being (locally) Lipschitz.

Caveat 3.12. It is emphasized that f : G −→ Kn, (x, y) 7→ f(x, y), being Lipschitzwith respect to y does not imply f to be continuous: Indeed, if I ⊆ R, ∅ 6= A ⊆ Km,and g : I −→ Kn is an arbitrary discontinuous function, then f : I × A −→ Kn,f(x, y) := g(x) is not continuous, but satisfies (3.42) with L = 0.

—

While the local neighborhoods U , where a function locally Lipschitz (with respect to y)is actually Lipschitz continuous (with respect to y) can be very small, we will now showthat a continuous function is locally Lipschitz (with respect to y) on G if, and only if,it is Lipschitz continuous (with respect to y) on every compact set K ⊆ G.

Proposition 3.13. Let m,n ∈ N, G ⊆ R × Km, and f : G −→ Kn be continuous.Then f is locally Lipschitz with respect to y if, and only if, f is (globally) Lipschitz withrespect to y on every compact subset K of G.

Proof. First, assume f is not locally Lipschitz with respect to y. Then there exists(x0, y0) ∈ G such that

∀N∈N

∃(xN ,yN,1),(xN ,yN,2)∈G∩B1/N (x0,y0)

∥

∥f(xN , yN,1)− f(xN , yN,2)∥

∥ > N‖yN,1 − yN,2‖. (3.44)

The setK := (x0, y0) ∪

(xN , yN,j) : N ∈ N, j ∈ 1, 2

is clearly a compact subset of G (e.g. by the Heine-Borel property of compact sets (seeTh. C.19), since every open set containing (x0, y0) must contain all, but finitely many,of the elements of K). Due to (3.44), f is not (globally) Lipschitz with respect to y onthe compact set K (so, actually, continuity of f was not used for this direction).

Conversely, assume f to be locally Lipschitz with respect to y, and consider a compactsubset K of G. Then, for each (x, y) ∈ K, there is some (relatively) open U(x,y) ⊆ Gwith (x, y) ∈ U(x,y) and such that f is Lipschitz with respect to y in U(x,y). By theHeine-Borel property of compact sets (see Th. C.19), there are finitely many U1 :=U(x1,y1), . . . , UN := U(xN ,yN ), N ∈ N, such that

K ⊆N⋃

j=1

Uj. (3.45)

For each j = 1, . . . , N , let Lj denote the Lipschitz constant for f on Uj and set L′ :=maxL1, . . . , LN. As f is assumed continuous and K is compact, we have

M := max‖f(x, y)‖ : (x, y) ∈ K <∞. (3.46)

3 GENERAL THEORY 37

Using the compactness of K once again, there exists a Lebesgue number δ > 0 for theopen cover (Uj)j∈1,...,N of K (cf. Th. C.21), i.e. δ > 0 such that

∀(x,y),(x,y)∈K

(

‖y − y‖ < δ ⇒ ∃j∈1,...,N

(x, y), (x, y) ⊆ Uj

)

. (3.47)

Define L := maxL′, 2M/δ. Then, for every (x, y), (x, y) ∈ K:

‖y − y‖ < δ ⇒ ‖f(x, y)− f(x, y)‖ ≤ Lj‖y − y‖ ≤ L‖y − y‖, (3.48a)

‖y − y‖ ≥ δ ⇒ ‖f(x, y)− f(x, y)‖ ≤ 2M =2Mδ

δ≤ L‖y − y‖, (3.48b)

completing the proof that f is Lipschitz with respect to y on K.

While, in general, the assertion of Prop. 3.13 becomes false if the continuity of f is omit-ted, for convex G, it does hold without the continuity assumption on f (see AppendixD). The following Prop. 3.14 provides a useful sufficient condition for f : G −→ Kn,G ⊆ R×Km open, to be locally Lipschitz with respect to y:

Proposition 3.14. Let m,n ∈ N, let G ⊆ R × Km be open, and f : G −→ Kn. Asufficient condition for f to be locally Lipschitz with respect to y is f being continuously(real) differentiable with respect to y, i.e., f is locally Lipschitz with respect to y providedthat all partials ∂ykfl; k, l = 1, . . . , n (∂yk,1fl, ∂yk,2fl for K = C) exist and are continuous.

Proof. We consider the case K = R; the case K = C is included by using the identifi-cations Cm ∼= R2m and Cn ∼= R2n. Given (x0, y0) ∈ G, we have to show f is Lipschitzwith respect to y on some open set U ⊆ G with (x0, y0) ∈ U . As in the Peano Th. 3.8,since G is open,

∃b>0

B :=

(x, y) ∈ R× Rm : |x− x0| ≤ b and ‖y − y0‖1 ≤ b

⊆ G,

where ‖ · ‖1 denotes the 1-norm on Rm. Since the ∂ykfl, (k, l) ∈ 1, . . . ,m×1, . . . , n,are all continuous on the compact set B,

M := max

|∂ykfl(x, y)| : (x, y) ∈ B, (k, l) ∈ 1, . . . ,m × 1, . . . , n

<∞. (3.49)

Applying the mean value theorem (cf. [Phi15, Th. 2.32]) to the n components of thefunction

fx :

y ∈ Rm : (x, y) ∈ B

−→ Rn, fx(y) := f(x, y),

we obtain η1, . . . , ηn ∈ Rm such that

fl(x, y)− fl(x, y) =m∑

k=1

∂ykfl(x, ηl)(yk − yk), (3.50)

and, thus,

∀(x,y),(x,y)∈B

∥

∥f(x, y)− f(x, y)∥

∥

1=

n∑

l=1

|fl(x, y)− fl(x, y)|

(3.49),(3.50)

≤n∑

l=1

m∑

k=1

M |yk − yk| =n∑

l=1

M‖y − y‖1 = nM‖y − y‖1,(3.51)

3 GENERAL THEORY 38

i.e. f is Lipschitz with respect to y on B (where

(x, y) ∈ R× Rm : |x− x0| < b and ‖y − y0‖1 < b

⊆ B

is an open neighborhood of (x0, y0)), showing f is locally Lipschitz with respect to y.

Theorem 3.15. If G ⊆ R × Kn is open, n ∈ N, and f : G −→ Kn is continuous andlocally Lipschitz with respect to y, then, for each (x0, y0) ∈ G, the explicit n-dimensionalfirst-order initial value problem

y′ = f(x, y), (3.52a)

y(x0) = y0, (3.52b)

has a unique solution. More precisely, if I ⊆ R is an open interval and φ, ψ : I −→ Kn

are both solutions to (3.52a), then φ(x0) = ψ(x0) for one x0 ∈ I implies φ(x) = ψ(x)for all x ∈ I:

(

∃x0∈I

φ(x0) = ψ(x0)

)

⇒(

∀x∈I

φ(x) = ψ(x)

)

. (3.53)

Proof. We first show that φ and ψ must agree in a small neighborhood of x0:

∃ǫ>0

∀x∈]x0−ǫ,x0+ǫ[

φ(x) = ψ(x). (3.54)

Since f is continuous and both φ and ψ are solutions to the initial value problem (3.52),we can use Th. 1.5 to obtain

∀x∈I

φ(x)− ψ(x) =

∫ x

x0

(

f(

t, φ(t))

− f(

t, ψ(t))

)

dt . (3.55)

As f is locally Lipschitz with respect to y, there exists δ > 0 such that f is Lipschitzwith Lipschitz constant L ≥ 0 with respect to y on

U := (x, y) ∈ G : |x− x0| < δ, ‖y − y0‖ < δ,

where we have chosen some arbitrary norm ‖·‖ on Kn. The continuity of φ, ψ implies theexistence of ǫ > 0 such that B ǫ(x0) ⊆ I, φ(Bǫ(x0)) ⊆ Bδ(y0) and ψ(Bǫ(x0)) ⊆ Bδ(y0),implying

∀x∈Bǫ(x0)

∥

∥f(

x, φ(x))

− f(

x, ψ(x))∥

∥ ≤ L∥

∥φ(x)− ψ(x)∥

∥. (3.56)

Next, defineǫ := minǫ, 1/(2L)

and, using the compactness of Bǫ(x0) = [x0 − ǫ, x0 + ǫ] plus the continuity of φ, ψ,

M := max

‖φ(x)− ψ(x)‖ : x ∈ Bǫ(x0)

<∞.

From (3.55) and (3.56), we obtain

∀x∈Bǫ(x0)

‖φ(x)− ψ(x)‖ ≤ L

∣

∣

∣

∣

∫ x

x0

‖φ(t)− ψ(t)‖ dt∣

∣

∣

∣

≤ L |x− x0|M ≤ M

2(3.57)

3 GENERAL THEORY 39

(note that the integral in (3.57) can be negative for x < x0). The definition of Mtogether with (3.57) yields M ≤M/2, i.e. M = 0, finishing the proof of (3.54).

To prove φ(x) = ψ(x) for each x ≥ x0, let

s := supξ ∈ I : φ(x) = ψ(x) for each x ∈ [x0, ξ].One needs to show s = sup I. If s = sup I does not hold, then there exists α > 0 suchthat [s, s+ α] ⊆ I. Then the continuity of φ, ψ implies φ(s) = ψ(s), i.e. φ and ψ satisfythe same initial value problem at s such that (3.54) must hold with s instead of x0, incontradiction to the definition of s. Finally, φ(x) = ψ(x) for each x ≤ x0 follows in ancompletely analogous fashion, which concludes the proof of the theorem.

Corollary 3.16. If G ⊆ R×Kkn is open, k, n ∈ N, and f : G −→ Kn is continuous andlocally Lipschitz with respect to y, then, for each (x0, y0,0, . . . , y0,k−1) ∈ G, the explicitn-dimensional kth-order initial value problem consisting of (1.6) and (1.7), i.e.

y(k) = f(

x, y, y′, . . . , y(k−1))

,

∀j∈0,...,k−1

y(j)(x0) = y0,j ,

has a unique solution. More precisely, if I ⊆ R is an open interval and φ, ψ : I −→ Kn

are both solutions to (1.6), then

∀j∈0,...,k−1

φ(j)(x0) = ψ(j)(x0) (3.58)

holding for one x0 ∈ I implies φ(x) = ψ(x) for all x ∈ I.

Proof. Exercise.

Remark 3.17. According to Th. 3.15, the condition of f being continuous and locallyLipschitz with respect to y is sufficient for each initial value problem (3.52) to have aunique solution. However, this condition is not necessary: It is an exercise to show thatthe continuous function

f : R2 −→ R, f(x, y) :=

1 for y ≤ 0,

1 +√y for y ≥ 0,

(3.59)

is not locally Lipschitz with respect to y, but that, for each (x0, y0) ∈ R2, the initialvalue problem (3.52) still has a unique solution in the sense that (3.53) holds for eachsolution φ to (3.52a). And one can (can you?) even find simple examples of f beingdefined on an open domain such that f is discontinuous at every point in its domainand every initial value problem (3.52) still has a unique solution.

—

At the end of Sec. 3.2, it was pointed out that the proof of the Peano Th. 3.8 is non-constructive due to the selection of a subsequence. The following Th. 3.18 shows that,whenever the initial value problem has a unique solution, it becomes unnecessary toselect a subsequence, and the construction procedure (namely Euler’s method) used inthe proof of Th. 3.8 becomes an effective (if not necessarily efficient) numerical approx-imation procedure for the unique solution.

3 GENERAL THEORY 40

Theorem 3.18. Consider the situation of the Peano Th. 3.8. Under the additionalassumption that the solution to the explicit n-dimensional first-order initial value prob-lem (3.16) is unique on some interval J ⊆ [x0, x0 + α[, x0 ∈ J , and where α > 0 isconstructed as in Th. 3.8 (i.e. given by (3.18) – (3.20)), every sequence (φm)m∈N of func-tions defined on J according to Euler’s method as in the proof of Th. 3.8 (i.e. definedas in (3.26)) converges uniformly to the unique solution φ : J −→ Kn. An analogousstatement also holds for J ⊆]x0 − α, x0], x0 ∈ J .

Proof. Seeking a contradiction, assume (φm)m∈N does not converge uniformly to theunique solution φ. Then there exists ǫ > 0 and a subsequence (φmj

)j∈N such that

∀j∈N

‖φmj− φ‖sup = sup

‖φmj(x)− φ(x)‖ : x ∈ J

≥ ǫ. (3.60)

However, as a subsequence, (φmj)j∈N still has all the properties of the (φm)m∈N (namely

pointwise boundedness, uniform equicontinuity, piecewise differentiability, being approx-imate solutions according to (3.33)) that guanranteed the existence of a subsequence,converging to a solution. Thus, since the solution is unique on J , (φmj

)j∈N must, in turn,have a subsequence, converging uniformly to φ, which is in contradiction to (3.60). Thisshows the assumption that (φm)m∈N does not converge uniformly to φ must have beenfalse. The proof of the analogous statement for J ⊆]x0 − α, x0], x0 ∈ J , one obtains,e.g., via time reversion (cf. the second step of the proof of Th. 3.8).

Remark 3.19. The argument used to prove Th. 3.18 is of a rather general nature: Itcan be applied whenever a sequence is known to have a subsequence converging to somesolution of some equation (or some other problem), provided the same still holds forevery subsequence of the original sequence – in that case, the additional knowledge thatthe solution is unique implies the convergence of the original sequence without the needto select a subsequence.

3.4 Extension of Solutions, Maximal Solutions

The Peano Th. 3.8 and Cor. 3.10 show the existence of local solutions to explicit initialvalue problems, i.e. the solution’s existence is proved on some, possibly small, intervalcontaining the initial point x0. In the current section, we will address the question inwhich circumstances such local solutions can be extended, we will prove the existenceof maximal solutions (solutions that can not be extended), and we will learn how suchmaximal solutions can be identified.

Definition 3.20. Let φ : I −→ Kn, n ∈ N, be a solution to some ODE (such as (1.6)or (1.4) in the most general case), defined on some open interval I ⊆ R.

(a) We say φ has an extension or continuation to the right (resp. to the left) if, andonly if, there exists a solution ψ : J −→ Kn to the same ODE, defined on someopen interval J ⊇ I such that ψI= φ and

sup J > sup I (resp. inf J < inf I). (3.61)

3 GENERAL THEORY 41

An extension or continuation of φ is a function that is an extension to the right oran extension to the left (or both).

(b) The solution φ is called a maximal solution if, and only if, it does not admit anyextensions in the sense of (a) (note that we require maximal solutions to be definedon open intervals, cf. Appendix E).

Remark 3.21. As an immediate consequence of the time reversion Lem. 1.9(b), if asolution φ : I −→ Kn, n ∈ N, to (1.6), defined on some open interval I ⊆ R, has anextension to the right (resp. to the left) if, and only if, ψ : (−I) −→ Kn, ψ(x) := φ(−x),(solution to y(k) = (−1)k f(−x, y,−y′, . . . , (−1)k−1 y(k−1))) has an extension to the left(resp. to the right).

—

The existence of maximal solutions is not trivial – a priori it could be that every solutionhad an extension (analogous to the fact that to every x ∈ [0, 1[ (or every x ∈ R) thereis some bigger element in [0, 1[ (respectively in R)).

Theorem 3.22. Every solution φ0 : I0 −→ Kn to (1.4) (resp. to (1.6)), defined on anopen interval I0 ⊆ R, can be extended to a maximal solution of (1.4) (resp. of (1.6)).

Proof. The proof is carried out for solutions to (1.4) (the implicit ODE) – the proof forsolutions to the explicit ODE (1.6) is analogous and can also be seen as a special case.The idea is to apply Zorn’s lemma. To this end, define a partial order on the set

S := (I0, φ0) ∪ (I, φ) : φ : I −→ Kn is solution to (1.4), extending φ0 (3.62)

by letting(I, φ) ≤ (J, ψ) :⇔ I ⊆ J, ψI= φ. (3.63)

Every chain C, i.e. every totally ordered subset of S, has an upper bound, namely(IC, φC) with IC :=

⋃

(I,φ)∈C I and φC(x) := φ(x), where (I, φ) ∈ C is chosen such that

x ∈ I (since C is a chain, the value of φC(x) does not actually depend on the choice of(I, φ) ∈ C and is, thus, well-defined).

Clearly, IC is an open interval, I0 ⊆ IC, and φC extends φ0 as a function; we still need tosee that φC is a solution to (1.4). For this, we, once again, use that x ∈ IC means thereexists (I, φ) ∈ C such that x ∈ I and φ is a solution to (1.4). Thus, using the notationfrom Def. 1.2(a),

(x, φC(x), φ′C(x), . . . , φ

(k)C (x)) = (x, φ(x), φ′(x), . . . , φ(k)(x)) ∈ U

andF (x, φC(x), φ

′C(x), . . . , φ

(k)C (x)) = F (x, φ(x), φ′(x), . . . , φ(k)(x)) = 0,

showing φC is a solution to (1.4) as defined in Def. 1.2(a). In particular, (IC, φC) ∈ S. Toverify (IC, φC) is an upper bound for C, note that the definition of (IC, φC) immediatelyimplies I ⊆ IC for each (I, φ) ∈ C and φCI= φ for each (I, φ) ∈ C.To conclude the proof, we note that all hypotheses of Zorn’s lemma have been verifiedsuch that it yields the existence of a maximal element of (Imax, φmax) ∈ S, i.e. φmax :Imax −→ Kn must be a maximal solution extending φ0.

3 GENERAL THEORY 42

Proposition 3.23. Let k, n ∈ N. Given G ⊆ R × Kkn open and f : G −→ Kn

continuous, if φ : I −→ Kn is a solution to (1.6) such that I =]a, b[, a < b, b < ∞(resp. −∞ < a), then φ has an extension to the right (resp. to the left) if, and only if,

∃(b,η0,...,ηk−1)∈G

limx↑b

(

φ(x), φ′(x), . . . , φ(k−1)(x))

= (η0, . . . , ηk−1), (3.64a)

(

resp. ∃(a,η0,...,ηk−1)∈G

limx↓a

(

φ(x), φ′(x), . . . , φ(k−1)(x))

= (η0, . . . , ηk−1)

)

. (3.64b)

Proof. That the respective part of (3.64) is necessary for the existence of the respectiveextension is immediate from the fact that, for each solution to (1.6), the solution andall its derivatives up to order k − 1 must exist and must be continuous.

We now prove that (3.64a) is also sufficient for the existence of an extension to the right(the sufficiency of (3.64b) for the existence of an extension to the left is then immediatefrom Rem. 3.21). So assume (3.64a) to hold and consider the initial value problemconsisting of (1.6) and the initial conditions

∀j=0,...,k−1

y(j)(b) = ηj.

By Cor. 3.10, there must exist ǫ > 0 such that this initial value problem has a solutionψ : ]b−ǫ, b+ǫ[−→ Kn. We now show that φ extended to b via (3.64a) is still a solution to(1.6). First note the mean value theorem (cf. [Phi16, Th. 9.18]) yields that φ(j)(b) = ηjexists for j = 1, . . . , k − 1 as a left-hand derivative. Moreover,

limx↑b

φ(k)(x) = limx↑b

f(

x, φ(x), φ′(x), . . . , φ(k−1)(x))

= f(b, η0, . . . , ηk−1),

showing φ(k)(b) = f(

b, φ(b), φ′(b), . . . , φ(k−1)(b))

(again employing the mean value theo-rem), which proves φ extended to b is a solution to (1.6). Finally, Lem. 1.7 ensures

σ : ]a, b+ ǫ[−→ Kn, σ(x) :=

φ(x) for x ≤ b,

ψ(x) for x ≥ b,

is a solution to (1.6) that extends φ to the right.

Proposition 3.24. Let k, n ∈ N, let G ⊆ R × Kkn be open, let f : G −→ Kn becontinuous, and let φ : I −→ Kn be a solution to (1.6) defined on the open interval I.Consider x0 ∈ I and let gr+(φ) (resp. gr−(φ)) denote the graph of (φ, . . . , φ(k−1)) forx ≥ x0 (resp. for x ≤ x0):

gr+(φ) := gr+(φ, x0) :=(

x, φ(x), . . . , φ(k−1)(x))

∈ G : x ∈ I, x ≥ x0

, (3.65a)

gr−(φ) := gr−(φ, x0) :=(

x, φ(x), . . . , φ(k−1)(x))

∈ G : x ∈ I, x ≤ x0

. (3.65b)

If there exists a compact set K ⊆ G such that gr+(φ) ⊆ K (resp. gr−(φ) ⊆ K), then φhas an extension ψ : J −→ Kn to the right (resp. to the left) such that

∃x∈J

(

x, ψ(x), . . . , ψ(k−1)(x))

/∈ K. (3.66)

The statement can be rephrased by saying that gr+(φ) (resp. gr−(φ)) of each maximalsolution φ to (1.6) escapes from every compact subset of G when x appoaches the right(resp. the left) boundary of I (where the boundary of I can contain −∞ and/or +∞).

3 GENERAL THEORY 43

Proof. We conduct the proof for extensions to the right; extensions to the left can behandled completely analogously (alternatively, one can apply the time reversion Lem.1.9(b) as demonstrated in the last paragraph of the proof below). The proof for exten-sions to the right is divided into three steps. Let K ⊆ G be compact.

Step 1: We show that gr+(φ) ⊆ K implies φ has an extension to the right: Since K isbounded, so is gr+(φ), implying

b := sup I <∞ (3.67)

as well as

M1 := sup∥

∥φ(j)(x)∥

∥ : j ∈ 0, . . . , k − 1, x ∈ [x0, b[

<∞.

In the usual way, K compact and f continuous imply

M2 := max

‖f(x, y)‖ : (x, y) ∈ K

<∞.

SetM := maxM1,M2.

According to Prop. 3.23, we need to show (3.64a) holds. To this end, notice

∀j=0,...,k−1

∀x,x∈[x0,b[

‖φ(j)(x)− φ(j)(x)‖ ≤M |x− x| : (3.68)

Indeed,

∀x,x∈[x0,b[

‖φ(k−1)(x)− φ(k−1)(x)‖ =

∥

∥

∥

∥

∫ x

x

f(

t, φ(t), . . . , φ(k−1)(t))

dt

∥

∥

∥

∥

≤M |x− x|,

and, for 0 ≤ j < k − 1,

∀x,x∈[x0,b[

‖φ(j)(x)− φ(j)(x)‖ =

∥

∥

∥

∥

∫ x

x

φ(j+1)(t) dt

∥

∥

∥

∥

≤M |x− x|,

proving (3.68). Since K is compact, there exists a sequence (xm)m∈N in [x0, b[ such that

∃(b,η0,...,ηk−1)∈K

limm→∞

(

xm, φ(xm), φ′(xm), . . . , φ

(k−1)(xm))

= (b, η0, . . . , ηk−1). (3.69)

Using x := xm in (3.68) yields, for m→ ∞,

∀j=0,...,k−1

∀x∈[x0,b[

‖φ(j)(x)− ηj‖ ≤M |x− b|,

implying∀

j=0,...,k−1limx↑b

φ(j)(x) = ηj,

i.e. (3.64a) holds, completing the proof of Step 1.

3 GENERAL THEORY 44

Step 2: We show that gr+(φ) ⊆ K implies φ can be extended to the right to I∪]x0, b+α[,where α > 0 does not depend on b := sup I: Since K is compact, Cor. 3.9 guaranteesevery initial value problem

y(k) = f(x, y, y′, . . . , y(k−1)), (3.70a)

∀j=0,...,k−1

y(j)(ξ0) = y0,j , (ξ0, y0) ∈ K, (3.70b)

has a solution defined on ]ξ0 − α, ξ0 + α[ with the same α > 0. As shown in Step1, the solution φ can be extended into b = sup I such that it satisfies (3.70b) with(ξ0, y0) = (b, η) ∈ K. Thus, using Lem. 1.7, it can be pieced together with the solutionto (3.70) given on [b, b+ α[ by Cor. 3.9, completing the proof of Step 2.

Step 3: We finally show that gr+(φ) ⊆ K implies φ has an extension ψ : J −→ Kn

to the right such that (3.66) holds: We set a := inf I and φ0 := φ. Then, by Step 2,φ0 has an extension φ1 defined on ]a, b + α[. Inductively, for each m ≥ 1, either thereexists m0 ≤ m such that φm0 : ]a, b + m0α[−→ Kn is an extension of φ that can beused as ψ to conclude the proof (i.e. ψ := φm0 satisfies (3.66)) or φm can, once more,be extended to ]a, b + (m + 1)α[. As K is bounded, x ≥ x0 : (x, y) ∈ K ⊆ R mustalso be bounded, say by µ ∈ R. Thus, (3.66) must be satisfied for some ψ := φm with1 ≤ m ≤ (µ− x0)/α.

As mentioned above, one can argue completely analogous to the above proof to obtainthat gr−(φ) ⊆ K implies φ to have an extension to the left, satisfying (3.66). Here weshow how one, alternatively, can use the time reversion Lem. 1.9 to this end: Considerthe map

h : R×Kkn −→ R×Kkn, h(x, y1, . . . , yk) := (−x, y1, . . . , (−1)k−1yk),

which clearly constitutes an R-linear isomophism. Noting (1.6) and (1.32) are the same,we consider the time-reversed version (1.33) and observe Gg = h(G) to be open, h(K) ⊆Gg to be compact, g : Gg −→ Kn, g = (−1)k(f h), to be continuous. If gr−(φ, x0) ⊆ K,then gr+(ψ,−x0) ⊆ h(K), where ψ is the solution to the time-reversed version (1.33),

given by Lem. 1.9(b). Then ψ has an extension ψ to the right, satisfying (3.66) with ψreplaced by ψ and K replaced by h(K). Then, by Rem. 3.21, φ must have an extensionφ to the left, satisfying (3.66) with ψ replaced by φ.

In Th. 3.28 below, we will show that, for continuous f : G −→ Kn, each maximalsolution to (1.6) must go to the boundary of G in the sense of the following definition.

Definition 3.25. Let k, n ∈ N, let G ⊆ R × Kkn be open, let f : G −→ Kn, and letφ : ]a, b[−→ Kn, −∞ ≤ a < b ≤ ∞, be a solution to (1.6). We say that the solution φgoes to the boundary of G for x→ b (resp. for x→ a) if, and only if,

∀K ⊆ G compact

∃x0∈]a,b[

gr+(φ, x0) ∩K = ∅ (resp. gr−(φ, x0) ∩K = ∅), (3.71)

where gr+(φ, x0) and gr−(φ, x0) are defined as in (3.65) (with I =]a, b[). In other words,φ goes to the boundary of G for x → b (resp. for x → a) if, and only if, the graph of(φ, . . . , φ(k−1)) escapes every compact subset K of G forever for x→ b (resp. for x→ a).

3 GENERAL THEORY 45

Proposition 3.26. In the situation of Def. 3.25, if the solution φ goes to the boundaryof G for x→ b, then one of the following conditions must hold:

(i) b = ∞,

(ii) b <∞ and L := lim supx↑b∥

∥

(

φ(x), . . . , φ(k−1)(x))∥

∥ = ∞,

(iii) b <∞, L <∞ (L as defined in (ii)), G 6= R×Kkn (i.e. ∂G 6= ∅), and

limx↑b

dist(

(

x, φ(x), . . . , φ(k−1)(x))

, ∂G)

= 0. (3.72)

An analogous statement is valid for the solution φ going to the boundary of G for x→ a.

Proof. The proof is carried out for x→ b; the proof for x→ a is analogous.

Assume (i) – (iii) are all false. Choose c ∈]a, b[. Since (i) and (ii) are false,

∃0≤M<∞

∀x∈[c,b[

∥

∥

(

φ(x), . . . , φ(k−1)(x))∥

∥ ≤M.

If (iii) is false because G = R×Kkn, then K := (x, y) ∈ R×Kkn : x ∈ [c, b], ‖y‖ ≤Mis a compact subset of G that shows (3.71) does not hold. In the only remaining case,(iii) must be false, since (3.72) does not hold. Thus,

∃δ>0

∀x0∈]a,b[

∃x1∈]x0,b[

dist(

(

x1, φ(x1), . . . , φ(k−1)(x1)

)

, ∂G)

≥ δ.

Clearly, the setA :=

(x, y) ∈ G : dist(

(x, y), ∂G)

≥ δ

is closed (e.g. as the distance function d : R × Kkn −→ R+0 , d(·) := dist(·, ∂G) is

continuous (see Th. C.4) and A = (d−1[δ,∞[)∩ (G∪ ∂G)). In consequence, K ∩A withK as defined above is a compact subset of G that shows (3.71) does not hold.

Remark 3.27. (a) Examples such as the second ODE of Ex. 3.30(b) below show thatthe lim sup in Prop. 3.26(ii) can not be replaced with a lim.

(b) If f : G −→ Kn is continuous, then the three conditions of Prop. 3.26 are alsosufficient for φ to go to the boundary of G (cf. Cor. 3.29 below).

(c) For discontinuous f : G −→ Kn, in general, (ii) of Prop. 3.26 is no longer sufficientfor φ to go to the boundary ofG as is shown by simple examples, whereas (i) and (iii)remain sufficient, even for discontinuous f (exercise). Similarly, simple examplesshow Prop. 3.24 becomes false without the assumption of f being continuous; andit can also happen that a maximal solution escapes every compact set, but still doesnot go to the boundary of G (exercise).

Theorem 3.28. In the situation of Def. 3.25, if f : G −→ Kn is continuous andφ : ]a, b[−→ Kn is a maximal solution to (1.6), then φ must go to the boundary of G forboth x → a and x → b, i.e., for both x → a and x → b, it must escape every compactsubset K of G forever and it must satisfy one of the conditions specified in Prop. 3.26(and one of the analogous conditions for x→ a).

3 GENERAL THEORY 46

Proof. We carry out the proof for x→ b – the proof for x→ a can be done analogouslyor by applying the time reversion Lem. 1.9, as indicated at the end of the proof below.Let φ : ]a, b[−→ Kn be a maximal solution to (1.6). Seeking a contradiction, we assumeφ does not go to the boundary of G for x→ b, i.e. (3.71) does not hold and there existsa compact subset K of G and a strictly increasing sequence (xm)m∈N in ]a, b[ such thatlimm→∞ xm = b <∞ and

∀m∈N

(

xm, φ(xm), . . . , φ(k−1)(xm)

)

∈ K. (3.73)

We now define C to be another compact subset of G that is strictly between K and G,i.e. K ( C ( G: More precisely, we choose r > 0 such that

C :=

(x, y) ∈ R×Kkn : dist(

(x, y), K)

≤ r

⊆ G,

wheredist

(

(x, y), K)

= inf‖(x, y)− (x, y)‖2 : (x, y) ∈ K,‖ · ‖2 denoting the Euclidean norm on Rkn+1 for K = R and the Euclidean norm onR2kn+1 for K = C (this choice of norm is different from previous choices and will beconvenient later during the current proof). As φ is a maximal solution, Prop. 3.24guarantees the existence of another strictly increasing sequence (ξm)m∈N in ]a, b[ suchthat limm→∞ ξm = b < ∞, x1 < ξ1 < x2 < ξ2 < . . . (i.e. xm < ξm < xm+1 for eachm ∈ N) and such that

∀m∈N

(

ξm, φ(ξm), . . . , φ(k−1)(ξm)

)

/∈ C.

Noting(

xm, φ(xm), . . . , φ(k−1)(xm)

)

∈ K by (3.73) and K ⊆ C, define

∀m∈N

sm := sup

s ≥ xm :(

x, φ(x), . . . , φ(k−1)(x))

∈ C for each x ∈ [xm, s]

.

By the definition of sm as a sup, sm < xm+1 < b < ∞, and by the continuity of thedistance function d : R×Kkn −→ R+

0 , d(·) := dist(·, K) (see Th. C.4), one obtains

∀m∈N

dist(

(

sm, φ(sm), . . . , φ(k−1)(sm)

)

, K)

= r,

in particular,∀

x∈[xm,sm]

(

x, φ(x), . . . , φ(k−1)(x))

∈ C (3.74)

and

∀m∈N

∥

∥

∥

(

xm, φ(xm), . . . , φ(k−1)(xm)

)

−(

sm, φ(sm), . . . , φ(k−1)(sm)

)

∥

∥

∥

2≥ r. (3.75)

We use the boundedness of the compact set C and (3.74) to provide

M1 := sup∥

∥

(

φ(x), . . . , φ(k−1)(x))∥

∥

2: x ∈ [xm, sm], m ∈ N

<∞,

M2 := max

‖f(x, y)‖2 : (x, y) ∈ C

<∞

3 GENERAL THEORY 47

(as C is compact and f continuous),

M := maxM1,M2.

We now notice that each function

Jm : [xm, sm] −→ R×Kkn, Jm(x) :=(

x, φ(x), . . . , φ(k−1)(x))

,

is a continuously differentiable curve or path (using the continuity of f), cf. Def. F.1(for K = C, we consider Jm as a path in R2kn+1). To finish the proof, we will have tomake use of the notion of arc length (cf. Def. F.5) of such a continuously differentiablecurve: Recall that each such continuously differentiable path is rectifyable, i.e. it has awell-defined finite arc length l(Jm) (cf. Th. F.7). Moreover, l(Jm) satisfies

‖Jm(xm)− Jm(sm)‖2(F.4)

≤ l(Jm)(F.17)=

∫ sm

xm

‖J ′m(x)‖2 dx

=

∫ sm

xm

√

√

√

√1 +k∑

j=1

‖φ(j)(x)‖22 dx

≤∫ sm

xm

√

1 +∥

∥

(

φ(x), . . . , φ(k−1)(x))∥

∥

2

2+∥

∥f(Jm(x))∥

∥

2

2dx

≤∫ sm

xm

√1 + 2M2 dx , (3.76)

where it was used that ‖ · ‖2 was chosen to be the Euclidean norm. For each m ∈ N, weestimate

0 < r(3.75)

≤∥

∥

∥

(

xm, φ(xm), . . . , φ(k−1)(xm)

)

−(

sm, φ(sm), . . . , φ(k−1)(sm)

)

∥

∥

∥

2

= ‖Jm(xm)− Jm(sm)‖2(3.76)

≤∫ sm

xm

√1 + 2M2 dx

= (sm − xm)√1 + 2M2. (3.77)

However, limm→∞(sm − xm)√1 + 2M2 = 0 due to limm→∞ sm = limm→∞ xm = b, in

contradiction to r > 0. This contradiction shows our initial assumption that φ does notgo to the boundary of G for x→ b must have been wrong.

To obtain the remaining assertion that φ must go to the boundary of G for x → a,one can proceed as in the last paragraph of the proof of Prop. 3.23, making use of thefunction h defined there and of the time reversion Lem. 1.9: If K ⊆ G is a compactset and ψ is the solution to the time-reversed version given by Lem. 1.9(b), then ψmust be maximal as φ is maximal. Thus, for x → −a, ψ must escape the compact seth(K) forever by the first part of the proof above, implying φ must escape K forever forx→ a.

Corollary 3.29. Let k, n ∈ N, let G ⊆ R × Kkn be open, and let f : G −→ Kn

be continuous. If φ : ]a, b[−→ Kn, a < b, is a solution to (1.6), then the followingstatements are equivalent:

3 GENERAL THEORY 48

(i) φ is a maximal solution.

(ii) φ must go to the boundary of G for both x→ a and x→ b in the sense defined inDef. 3.25.

(iii) φ satisfies one of the conditions specified in Prop. 3.26 and one of the analogousconditions for x→ a.

Proof. (i) implies (ii) by Th. 3.28, (ii) implies (iii) by Prop. 3.26, and it is an exerciseto show (iii) implies (i) (here, Prop. 3.23 is the clue).

Example 3.30. The following examples illustrate the different kinds of possible bahav-ior of maximal solutions listed in Prop. 3.26 (the different kinds of bahavior can alreadybe seen for 1-dimensional ODE of first order):

(a) The initial value problemy′ = 0, y(0) = −1,

has the maximal solution φ : R −→ R, φ(x) = −1 – here we have

G = R2, f : G −→ R, f(x, y) = 0,

solution interval I = R, b := sup I = ∞, i.e. we are in Case (i) of Prop. 3.26.

(b) The initial value problemy′ = x−2, y(−1) = 1,

has the maximal solution φ : ]−∞, 0[−→ R, φ(x) = −x−1 – here we have

G =]−∞, 0[×R, f : G −→ R, f(x, y) = x−2,

solution interval I =]−∞, 0[, b := sup I = 0, limx↑0 |φ(x)| = ∞ i.e. we are in Case(ii) of Prop. 3.26.

To obtain an example, where we are also in Case (ii) of Prop. 3.26, but wherelimx↑b |φ(x)|, b := sup I, does not exist, consider the initial value problem

y′ = − 1

x2sin

1

x+

1

x3cos

1

x, y

(

− 1

π

)

= 0,

which has the maximal solution φ : ] − ∞, 0[−→ R, φ(x) = x−1 sin(x−1) (herelim supx↑0 |φ(x)| = ∞, but, as φ(−1/(kπ)) = 0 for each k ∈ N, limx↑0 |φ(x)| doesnot exist) – here we have

G = (R \ 0)× R, f : G −→ R, f(x, y) = − 1

x2sin

1

x+

1

x3cos

1

x.

To obtain an example, where we are again in Case (ii) of Prop. 3.26, but whereG = R2, consider the initial value problem

y′ = y2, y(−1) = 1,

3 GENERAL THEORY 49

which, as in the first example of (b), has the maximal solution φ : ] −∞, 0[−→ R,φ(x) = −x−1 – here we have

G = R2, f : G −→ R, f(x, y) = y2.

(c) The initial value problem

y′ = −y−1, y(−1) = 1,

has the maximal solution φ : ]−∞,−12[−→ R, φ(x) =

√−2x− 1 – here we have

G = R× (R \ 0), f : G −→ R, f(x, y) = −y−1,

solution interval I =]−∞,−12[, b := sup I = −1

2, ∂G = R× 0,

limx↑b

(x, φ(x)) =

(

−1

2, 0

)

∈ ∂G,

i.e. we are in Case (iii) of Prop. 3.26.

An example, where we are in Case (iii) of Prop. 3.26, but where limx↑b (x, φ(x))does not exist, is given by the initial value problem

y′ = − 1

x2cos

1

x, y

(

− 1

π

)

= 0,

has the maximal solution φ : ]−∞, 0[−→ R, φ(x) = sin(1/x) – here we have

G = (R \ 0)× R, f : G −→ R, f(x, y) = − 1

x2cos

1

x,

solution interval I =]−∞, 0[, b := sup I = 0, ∂G = 0 × R,

limx↑0

dist(

(x, φ(x)), ∂G)

= limx↑0

|x| = 0.

As a final example, where we are again in Case (iii) of Prop. 3.26, reconsider theinitial value problem from (a), but this time with

G =]− 1, 1[×]− 3, 5[, f : G −→ R, f(x, y) = 0.

Now the maximal solution is φ : ] − 1, 1[−→ R, φ(x) = −1, solution interval I =]−1, 1[, b := sup I = 1, and limx↑1 (x, φ(x)) = (1,−1) ∈ ∂G. This last example alsoillustrates that, even though it is quite common to omit an explicit specification ofthe domain G when writing an ODE (as we did in (a)) – where it is usually assumedthat the intended domain can be guessed from the context – the maximal solutionwill typically depend on the specification of G.

3 GENERAL THEORY 50

Example 3.31. We have already seen examples of initial value problems that admitmore than one maximal solution – for instance, the initial value problem of Ex. 1.4(b)had infinitely many different maximal solutions, all of them defined on all of R. The fol-lowing example shows that an initial value problem can have maximal solutions definedon different intervals: Let

G := R×]− 1, 1[, f : G −→ R, f(x, y) :=

√

|y|1−

√

|y|,

and consider the initial value problem

y′ = f(x, y) =

√

|y|1−

√

|y|, y(0) = 0. (3.78)

An obvious maximal solution is

φ : R −→ R, φ(0) = 0.

However, another maximal solution (that can be found using separation of variables) is

ψ : ]− 1, 1[−→ R, ψ(x) :=

−(

1−√1 + x

)2for −1 < x ≤ 0,

(

1−√1− x

)2for 0 ≤ x < 1.

To confirm the maximality of the solution ψ, note limx↓−1 (x, ψ(x)) = (−1,−1) ∈ ∂Gand limx↑1 (x, ψ(x)) = (1, 1) ∈ ∂G.

3.5 Continuity in Initial Conditions

The goal of the present section is to show that, under suitable conditions, small changesin the initial condition for an ODE result in small changes in the solution. As, insituations of nonuniqueness, we can change the solution without having changed theinitial condition at all, ensuring unique solutions to initial value problems is a minimalprerequisite for our considerations in this section.

Definition 3.32. Let G ⊆ R × Kkn, k, n ∈ N, and f : G −→ Kn. We say that theexplicit n-dimensional kth-order ODE (1.6), i.e.

y(k) = f(

x, y, y′, . . . , y(k−1))

, (3.79a)

admits unique maximal solutions if, and only if, f is such that every initial value problemconsisting of (3.79a) and

∀j∈0,...,k−1

y(j)(ξ) = ηj ∈ Kn, (3.79b)

with (ξ, η) ∈ G, has a unique maximal solution φ(ξ,η) : I(ξ,η) −→ Kn (combining Cor.3.16 with Th. 3.22 yields that G being open and f being continuous and locally Lipschitz

3 GENERAL THEORY 51

with respect to y is sufficient for (3.79a) to admit unique maximal solutions, but weknow from Rem. 3.17 that this condition is not necessary). If f is such that (3.79a)admits unique maximal solutions, then

Y : Df −→ Kn, Y (x, ξ, η) := φ(ξ,η)(x), (3.80)

defined onDf := (x, ξ, η) ∈ R×G : x ∈ I(ξ,η), (3.81)

is called the global or general solution to (3.79a). Note that the domain Df of Y isdetermined entirely by f , which is notationally emphasized by its lower index f .

Lemma 3.33. In the situation of Def. 3.32, the following holds:

(a) Y (ξ, ξ, η) = η0 for each (ξ, η) ∈ G.

(b) If k = 1, then η = η0 and Y(

x, x, Y (x, ξ, η))

= Y (x, ξ, η) for each (x, ξ, η), (x, ξ, η) ∈Df .

(c) If k = 1, then Y(

ξ, x, Y (x, ξ, η))

= η for each (x, ξ, η) ∈ Df .

Proof. (a) holds as Y (·, ξ, η) is a solution to (3.79b). For (b) note (x, ξ, η) ∈ Df im-plying

(

x, Y (x, ξ, η))

∈ G, i.e.(

x, Y (x, ξ, η))

are admissible initial data. Moreover,Y(

·, x, Y (x, ξ, η))

and Y (·, ξ, η) are both maximal solutions for some intial value prob-lem for (3.79a). Since both solutions agree at x = x, both functions must be identicalby the assumed uniqueness of solutions. In particular, they are defined for the same xand yield the same value at each x. Setting x := ξ in (b) yields (c).

The core of the proof of continuity in initial conditions as stated in Cor. 3.36 below isthe following Th. 3.34(a), which provides continuity in initial conditions locally. As abyproduct, we will also obtain a version of the Picard-Lindelof theorem in Th. 3.34(b),which states the local uniform convergence of the so-called Picard iteration, a method forobtaining approximate solutions that is quite different from the Euler method consideredabove.

Theorem 3.34. Consider the situation of Def. 3.32 for first-order problems, i.e. withk = 1, and with f being continuous and locally Lipschitz with respect to y on G open.Fix an arbitrary norm ‖ · ‖ on Kn.

(a) For each (σ, ζ) ∈ G ⊆ R × Kn and each −∞ < a < b < ∞ such that [a, b] ⊆ I(σ,ζ)(i.e., using the notation introduced in Def. 3.32, the maximal solution φ(σ,ζ) =Y (·, σ, ζ) is defined on [a, b]), there exists δ > 0 satisfying:

(i) For every point (ξ, η) in the open set

Uδ(σ, ζ) :=

(ξ, η) ∈ G : ξ ∈]a, b[,∥

∥η − Y (ξ, σ, ζ)∥

∥ < δ

, (3.82)

the maximal solution φ(ξ,η) = Y (·, ξ, η) is defined on ]a, b[ (i.e. ]a, b[⊆ I(ξ,η)).

3 GENERAL THEORY 52

(ii) The restriction of the global solution (x, ξ, η) 7→ Y (x, ξ, η) to the open set

W :=]a, b[×Uδ(σ, ζ) (3.83)

is continuous.

(b) (Picard-Lindelof) For each (σ, ζ) ∈ G, there exists α > 0 such that the Picarditeration, i.e. the sequence of functions (φm)m∈N0, φm : ]σ−α, σ+α[−→ Kn, definedrecursively by

φ0(x) := ζ, (3.84a)

∀m∈N0

φm+1(x) := ζ +

∫ x

σ

f(

t, φm(t))

dt , (3.84b)

converges uniformly to the solution of the initial value problem (3.79) (with k = 1and (ξ, η) := (σ, ζ)) on ]σ − α, σ + α[.

Proof. We will obtain (b) as an aside while proving (a). To simplify notation, weintroduce the function

ψ : [a, b] −→ Kn, ψ(x) := Y (x, σ, ζ).

Since [a, b] is compact and ψ is continuous,

γ := (Id, ψ)[a, b] = (x, ψ(x)) ∈ R×Kn : x ∈ [a, b]

is a compact subset of G (cf. C.14). Thus, γ has a positive distance from the closed set(R×Kn) \G, implying

∃δ1>0

C :=

(x, y) ∈ R×Kn : x ∈ [a, b],∥

∥y − ψ(x)∥

∥ ≤ δ1

⊆ G. (3.85)

Clearly, C is bounded and C is also closed (using the continuity of the distance functiond : R × Kn −→ R+

0 , d(·) := dist(·, γ), the continuity of the projection to the firstcomponent π1 : R × Kn −→ R, and noting C = d−1[0, δ1] ∩ π−1

1 [a, b]). Thus, C iscompact, and the hypothesis of f being locally Lipschitz with respect to y implies f tobe globally Lipschitz with some Lipschitz constant L ≥ 0 on the compact set C by Prop.3.13. We can now choose the number δ > 0 claimed to exist in (a) to be any number

0 < δ < e−L(b−a) δ1. (3.86)

Since −L(b− a) ≤ 0, we haveδ < δ1. (3.87)

Moreover, with d and π1 as above, Uδ(σ, ζ) as defined in (3.82) can be written in theform

Uδ(σ, ζ) = d−1[0, δ[ ∩ π−11 ]a, b[,

showing it is an open set ([0, δ[ is, indeed, open in R+0 ).

3 GENERAL THEORY 53

Even though we are mostly interested in what happens on the open set W , it will beconvenient to define functions on the slightly larger compact set

W := [a, b]× U,

U :=

(x, y) ∈ R×Kn : x ∈ [a, b],∥

∥y − ψ(x)∥

∥ ≤ δ

= d−1[0, δ] ∩ π−11 [a, b].

To proceed with the proof, we now carry out a form of the Picard iteration, recursivelydefining a sequence of functions (ψm)m∈N0 , ψm : W −→ Kn, defined recursively by

ψ0(x, ξ, η) := ψ(x) + η − ψ(ξ), (3.88a)

∀m∈N0

ψm+1(x, ξ, η) := η +

∫ x

ξ

f(

t, ψm(t, ξ, η))

dt . (3.88b)

The proof will be concluded if we can show the (ψm)m∈N0 constitute a sequence ofcontinuous functions converging uniformly on W to Y W . As an intermediate step, weestablish the following properties of the ψm (simultaneously) by induction on m ∈ N0:

(1) ψm is continuous for each m ∈ N0.

(2) One has

∀m∈N0,

(x,ξ,η)∈W

∥

∥ψm(x, ξ, η)− ψ(x)∥

∥ < δ1(

⇒ (x, ψm(x, ξ, η)) ∈ C)

.

In particular, since C ⊆ G, this shows the ψm are well-defined by (3.88b).

(3) One has

∀m∈N0,

(x,ξ,η)∈W

∥

∥ψm+1(x, ξ, η)− ψm(x, ξ, η)∥

∥ ≤ Lm+1 |x− ξ|m+1 δ

(m+ 1)!.

To start the induction proof, notice that the continuity of ψ implies the continuity ofψ0. Moreover, if (x, ξ, η) ∈ W , then

∥

∥ψ0(x, ξ, η)− ψ(x)∥

∥

(3.88a)=

∥

∥η − ψ(ξ)∥

∥ =∥

∥η − Y (ξ, σ, ζ)∥

∥ ≤ δ < δ1. (3.89)

Also, from ψ = Y (·, σ, ζ) = φ(σ,ζ), we know, for each x, ξ ∈ [a, b],

ψ(x)− ψ(ξ) = ζ +

∫ x

σ

f(t, ψ(t)) dt − ζ −∫ ξ

σ

f(t, ψ(t)) dt =

∫ x

ξ

f(t, ψ(t)) dt

and, for each (x, ξ, η) ∈ W ,

∥

∥ψ1(x, ξ, η)− ψ0(x, ξ, η)∥

∥ =

∥

∥

∥

∥

∫ x

ξ

(

f(

t, ψ0(t, ξ, η))

− f(t, ψ(t)))

dt

∥

∥

∥

∥

f L-Lip.

≤ L

∣

∣

∣

∣

∫ x

ξ

∥

∥ψ0(t, ξ, η)− ψ(t)∥

∥ dt

∣

∣

∣

∣

= L

∣

∣

∣

∣

∫ x

ξ

∥

∥η − ψ(ξ)∥

∥ dt

∣

∣

∣

∣

≤ L |x− ξ| δ,

3 GENERAL THEORY 54

completing the proof of (1) – (3) for m = 0. For the induction step, let m ∈ N0.

It is left as an exercise to prove the continuity of ψm+1.

Using the triangle inequality, we estimate, for each (x, ξ, η) ∈ W ,∥

∥ψm+1(x, ξ, η)− ψ(x)∥

∥

≤m∑

j=0

‖ψj+1(x, ξ, η)− ψj(x, ξ, η)∥

∥+ ‖ψ0(x, ξ, η)− ψ(x)∥

∥

(3.89), ind.hyp. for (3)

≤m∑

j=0

Lj+1 |x− ξ|j+1 δ

(j + 1)!+ δ ≤ eL|x−ξ| δ

(3.86)< eL(b−a) e−L(b−a) δ1 = δ1,

establishing the estimate of (2) for m + 1. To prove the estimate in (3) for m replacedby m+ 1, one estimates, for each (x, ξ, η) ∈ W ,

∥

∥ψm+2(x, ξ, η)− ψm+1(x, ξ, η)∥

∥ ≤∣

∣

∣

∣

∫ x

ξ

∥

∥

∥f(

t, ψm+1(t, ξ, η))

− f(

t, ψm(t, ξ, η))

∥

∥

∥ dt

∣

∣

∣

∣

≤ L

∣

∣

∣

∣

∫ x

ξ

∥

∥ψm+1(t, ξ, η)− ψm(t, ξ, η)∥

∥ dt

∣

∣

∣

∣

ind.hyp.

≤ L

∣

∣

∣

∣

∫ x

ξ

Lm+1 |t− ξ|m+1 δ

(m+ 1)!dt

∣

∣

∣

∣

=Lm+2 |x− ξ|m+2 δ

(m+ 2)!,

completing the induction proof of (1) – (3).

As a consequence of (3), for each l,m ∈ N0 such that m > l:

∀(x,ξ,η)∈W

∥

∥ψm(x, ξ, η)− ψl(x, ξ, η)∥

∥ ≤ δ

m∑

j=l+1

Lj (b− a)j

j!. (3.90)

The convergence of the exponential series, thus, implies that (ψm(x, ξ, η))m∈N0 is aCauchy sequence for each (x, ξ, η) ∈ W , yielding pointwise convergence of the ψm tosome function ψ : W −→ Kn. Letting m tend to infinity in (3.90) then shows

∀(x,ξ,η)∈W

∥

∥ψ(x, ξ, η)− ψl(x, ξ, η)∥

∥ ≤ δ

∞∑

j=l+1

Lj (b− a)j

j!,

where the independence of the right-hand side with respect to (x, ξ, η) ∈ W provesψm → ψ uniformly on W . The uniform convergence together with (1) then implies ψto be continuous.

In the final step of the proof, we show ψ = Y on W , i.e. ψ(·, ξ, η) solves (3.79) (withk = 1). By Th. 1.5, we need to show

∀(x,ξ,η)∈W

ψ(x, ξ, η) = η +

∫ x

ξ

f(

t, ψ(t, ξ, η))

dt (3.91)

3 GENERAL THEORY 55

(then uniqueness of solutions implies ψ(·, ξ, η) = Y (·, ξ, η)). To verify (3.91), givenǫ > 0, by the uniform convergence ψm → ψ, choose m ∈ N sufficiently large such that

∀k∈m−1,m

∀(x,ξ,η)∈W

∥

∥ψ(x, ξ, η)− ψk(x, ξ, η)∥

∥ < ǫ

and estimate, for each (x, ξ, η) ∈ W ,

∥

∥

∥

∥

ψ(x, ξ, η)− η −∫ x

ξ

f(

t, ψ(t, ξ, η))

dt

∥

∥

∥

∥

≤∥

∥

∥ψ(x, ξ, η)− ψm(x, ξ, η)

∥

∥

∥+

∥

∥

∥

∥

∫ x

ξ

(

f(

t, ψm−1(t, ξ, η))

− f(

t, ψ(t, ξ, η))

)

dt

∥

∥

∥

∥

< ǫ+ L

∣

∣

∣

∣

∫ x

ξ

∥

∥ψm−1(t, ξ, η)− ψ(t, ξ, η)∥

∥ dt

∣

∣

∣

∣

≤ ǫ+ L ǫ (b− a). (3.92)

As (3.92) holds for every ǫ > 0, (3.91) must be true as well.

It is noted that we have, indeed, proved (b) as a byproduct, since we know (for examplefrom the Peano Th. 3.8) that ψ must be defined on [σ − α, σ + α] for some α > 0 andthen φm = ψm(·, σ, ζ) on [σ − α, σ + α] for each m ∈ N0.

Theorem 3.35. As in Th. 3.34, consider the situation of Def. 3.32 for first-order prob-lems, i.e. with k = 1, and with f being continuous and locally Lipschitz with respect toy on G open. Then the global solution (x, ξ, η) 7→ Y (x, ξ, η) as defined in Def. 3.32 iscontinuous. Moreover, its domain Df is open.

Proof. Let (x, σ, ζ) ∈ Df . Then, using the notation from Def. 3.32, x is in the domain ofthe maximal solution φ(σ,ζ), i.e. x ∈ I(σ,ζ). Since I(σ,ζ) is open, there must be −∞ < a <x < b <∞ such that [a, b] ⊆ I(σ,ζ) and then Th. 3.34(a) implies the global solution Y tobe continuous on W , where W as defined in (3.83) is an open neighborhood of (x, σ, ζ).In particular, (x, σ, ζ) is an interior point of Df and Y is continuous at (x, σ, ζ). As(x, σ, ζ) was arbitrary, Df must be open and Y must be continuous.

Corollary 3.36. Consider the situation of Def. 3.32 with f being continuous and locallyLipschitz with respect to y on G open. Then the global solution (x, ξ, η) 7→ Y (x, ξ, η) asdefined in Def. 3.32 is continuous. Moreover, its domain Df is open.

Proof. It was part of the exercise that proved Cor. 3.16 to show that the right-hand sideF of the first-order problem equivalent to (3.79) in the sense of Th. 3.1 is continuousand locally Lipschitz with respect to y, provided f is continuous and locally Lipschitzwith respect to y. Thus, according to Th. 3.35, the equivalent first-order problem hasa continuous global solution Υ : DF −→ Kkn, defined on some open set DF . As aconsequence of Th. 3.1(b), Y = Υ1 : DF −→ Kn is the global solution to (3.79a). Sowe have Df = DF and, as Υ is continuous, so is Y .

It is sometimes interesting to consider situations where the right-hand side f dependson some (vector of) parameters µ in addition to depending on x and y:

3 GENERAL THEORY 56

Definition 3.37. If G ⊆ R×Kkn×Kl with k, n, l ∈ N, and f : G −→ Kn is such that,for each (ξ, η, µ) ∈ G, the explicit n-dimensional kth-order initial value problem

y(k) = f(

x, y, y′, . . . , y(k−1), µ)

, (3.93a)

∀j∈0,...,k−1

y(j)(ξ) = ηj ∈ Kn, (3.93b)

has a unique maximal solution φ(ξ,η,µ) : I(ξ,η,µ) −→ Kn, then

Y : Df −→ Kn, Y (x, ξ, η, µ) := φ(ξ,η,µ)(x), (3.94)

defined onDf := (x, ξ, η, µ) ∈ R×G : x ∈ I(ξ,η,µ), (3.95)

is called the global or general solution to (3.93a).

Corollary 3.38. Consider the situation of Def. 3.37 with f being continuous and locallyLipschitz with respect to (y, µ) on G open. Then the global solution Y as defined in Def.3.94 is continuous. Moreover, its domain Df is open.

Proof. We consider k = 1 (i.e. (3.93a) is of first order) – the case k > 1 can then, in theusual way, be obtained by applying Th. 3.1. To apply Th. 3.35 to the present situation,define the auxiliary function

F : G −→ Kn+l, Fj(x, y) :=

fj(x, y) for j = 1, . . . , n,

0 for j = n+ 1, . . . , n+ l.(3.96)

Then, since f is continuous and locally Lipschitz with respect to (y, µ), F is continuousand locally Lipschitz with respect to y, and we can apply Th. 3.35 to

y′ = F (x, y), (3.97a)

y(ξ) = (η, µ), (3.97b)

where (ξ, η, µ) ∈ G. According to Th. 3.35, the global solution Y : DF −→ Kn+l of(3.97a) is continuous on the open set DF . Moreover, by the definition of F in (3.96),we have

∀(x,ξ,η,µ)∈DF

Y (x, ξ, η, µ) =

(

Y (x, ξ, η, µ)µ

)

,

where Y is as defined in (3.94). In particular, Df = DF and the continuity of Y impliesthe continuity of Y .

Example 3.39. As a simple example of a parametrized ODE, consider f : R×K2 −→K, f(x, y, µ) := µy,

y′ = f(x, y, µ) = µ y,

y(ξ) = η,

with the global solution

Y : R× R×K2 −→ K, Y (x, ξ, η, µ) = η eµ (x−ξ).

4 LINEAR ODE 57

4 Linear ODE

4.1 Definition, Setting

In Sec. 2.2, we saw that the solution of one-dimensional first-order linear ODE wasparticularly simple. One can now combine the general theory of ODE with some linearalgebra to obtain results for n-dimensional linear ODE and, equivalently, for linear ODEof higher order.

Notation 4.1. For n ∈ N, let M(n,K) denote the set of all n× n matrices over K.

Definition 4.2. Let I ⊆ R be a nontrivial interval, n ∈ N, and let A : I −→ M(n,K)and b : I −→ Kn be continuous. An ODE of the form

y′ = A(x)y + b(x) (4.1)

is called an n-dimensional linear ODE of first order. It is called homogeneous if, andonly if, b ≡ 0; it is called inhomogeneous if, and only if, it is not homogeneous.

—

Using the notion of matrix norm (cf. Sec. G), it is not hard to show the right-hand sideof (4.1) is continuous and locally Lipschitz with respect to y and, thus, every initialvalue problem for (4.1) has a unique maximal solution (exercise). However, to showthe maximal solution is always defined on all of I, we need some additional machinery,which is developed in the next section.

4.2 Gronwall’s Inequality

In the current section, we will provide Gronwall’s inequality, which is also of interestoutside the field of ODE. Here, Gronwall’s inequality will allow us to prove the global ex-istence of maximal solutions for ODE with linearly bounded right-hand side – a corollarybeing that maximal solutions of (4.1) are always defined on all of I.

As an auxiliary tool on our way to Gronwall’s inequality, we will now briefly study(one-dimensional) differential inequalities:

Definition 4.3. Given G ⊆ R× R = R2, and f : G −→ R, call

y′ ≤ f(x, y) (4.2)

a (one-dimensional) differential inequality (of first order). A solution to (4.2) is a differ-entiable function w : I −→ R defined on a nontrivial interval I ⊆ R satisfying the twoconditions

(i)(

x, w(x))

∈ I × R : x ∈ I

⊆ G,

(ii) w′(x) ≤ f(

x, w(x))

for each x ∈ I.

4 LINEAR ODE 58

Proposition 4.4. Let G ⊆ R2 be open, let f : G −→ R be continuous and locallyLipschitz with respect to y, and let −∞ < a < b ≤ ∞. If w : [a, b[−→ R is a solutionto the differential inequality (4.2) and φ : [a, b[−→ R is a solution to the correspondingODE, then

w(a) ≤ φ(a) ⇒ ∀x∈[a,b[

w(x) ≤ φ(x). (4.3)

Proof. Consider the auxiliary function

g : G× R −→ R, g(x, y, µ) := f(x, y) + µ (4.4)

and the (parametrized) ODE

y′ = g(x, y, µ) = f(x, y) + µ. (4.5)

Since f is continuous and locally Lipschitz with respect to y, g is continuous and locallyLipschitz with respect to (y, µ). Thus, continuity in initial conditions as given by Cor.3.38 applies, yielding the global solution Y : Dg −→ R, (x, ξ, η, µ) 7→ Y (x, ξ, η, µ), tobe continuous on the open set Dg.

We now consider an arbitrary compact subinterval [a, c] ⊆ [a, b[ with a < c < b, notingthat it suffices to prove w ≤ φ on every such interval [a, c]. The set

γ := (Id, a, φ(a), 0)[a, c] =

(x, a, φ(a), 0) : x ∈ [a, c]

(4.6)

is a compact subset of Dg and, thus,

∃ǫ>0

γǫ :=

(x, ξ, η, µ) ∈ R4 : dist(

(x, ξ, η, µ), γ)

< ǫ

⊆ Dg. (4.7)

If we choose the distance in (4.7) to be meant with respect to the max-norm on R4 andif 0 < µ < ǫ, then (x, a, φ(a), µ) ∈ γǫ for each x ∈ [a, c], such that φµ := Y (·, a, φ(a), µ)is defined on (a superset of) [a, c]. We proceed to prove w ≤ φµ on [a, c]: Seeking acontradiction, assume there exists x0 ∈ [a, c] such that w(x0) > φµ(x0). Due to thecontinuity of w and φµ, w > φµ must then hold in an entire neighborhood of x0. On theother hand, w(a) ≤ φ(a) = φµ(a), such that, for

x1 := inf

x < x0 : w(t) > φµ(t) for each t ∈]x, x0]

,

a ≤ x1 < x0 and w(x1) = φµ(x1). But then, for each sufficiently small h > 0,

w(x1 + h)− w(x1) > φµ(x1 + h)− φµ(x1),

implying

w′(x1) = limh→0

w(x1 + h)− w(x1)

h≥ lim

h→0

φµ(x1 + h)− φµ(x1)

h= φ′

µ(x1)

= g(

x1, φµ(x1), µ)

= f(

x1, φµ(x1))

+ µ > f(

x1, φµ(x1))

= f(

x1, w(x1))

, (4.8)

in contradiction to w being a solution to (4.2).

4 LINEAR ODE 59

Thus, w ≤ φµ on [a, c] holds for every 0 < µ < ǫ, and continuity of Y on Dg yields,

∀x∈[a,c]

w(x) ≤ limµ→0

φµ(x) = limµ→0

Y(

x, a, φ(a), µ)

= Y(

x, a, φ(a), 0)

= φ(x), (4.9)

concluding the proof.

Theorem 4.5 (Gronwall’s Inequality). Let I := [a, b[, where −∞ < a < b ≤ ∞. Ifα, β, γ : I −→ R are continuous and β(x) ≥ 0 for each x ∈ I, then

∀x∈I

γ(x) ≤ α(x) +

∫ x

a

β(t) γ(t) dt (4.10)

implies

∀x∈I

γ(x) ≤ α(x) +

∫ x

a

α(t) β(t) exp

(∫ x

t

β(s) ds

)

dt . (4.11)

Proof. Defining the auxiliary functions ψ,w : I −→ R,

ψ(x) := γ(x)− α(x), (4.12a)

w(x) :=

∫ x

a

β(t)γ(t) dt , (4.12b)

(4.10) can be written as∀x∈I

ψ(x) ≤ w(x).

Moreover, this implies

∀x∈I

w′(x) = β(x)γ(x) = β(x)(

α(x) + ψ(x))

≤ β(x)w(x) + α(x)β(x),

showing w satisfies the (linear) differential inequality

y′ ≤ β(x) y + α(x) β(x). (4.13)

Continuously extending α and β to x < a (e.g. using the constant extensions α(x) = α(a)and β(x) := β(a) for x < a), we can consider the linear ODE corresponding to (4.13) onall of ] −∞, b[. Using the initial condition y(a) = w(a) = 0, yields the unique solution(employing the variation of constants Th. 2.3)

φ : ]−∞, b[−→ R,

φ(x) := exp

(∫ x

a

β(s) ds

) ∫ x

a

exp

(

−∫ t

a

β(s) ds

)

α(t) β(t) dt

=

∫ x

a

α(t) β(t) exp

(∫ x

t

β(s) ds

)

dt .

(4.14)

Finally, we apply Prop. 4.4 to conclude

∀x∈I

ψ(x) ≤ w(x)(4.3)

≤ φ(x)(4.14)=

∫ x

a

α(t) β(t) exp

(∫ x

t

β(s) ds

)

dt , (4.15)

which, taking into account ψ = γ − α, establishes (4.11).

4 LINEAR ODE 60

Example 4.6. Let I := [a, b[, where −∞ < a < b ≤ ∞. If β, γ : I −→ R arecontinuous, β(x) ≥ 0 for each x ∈ I, and C ∈ R, then

∀x∈I

γ(x) ≤ C +

∫ x

a

β(t) γ(t) dt (4.16)

implies

∀x∈I

γ(x) ≤ C exp

(∫ x

a

β(t) dt

)

: (4.17)

We apply Gronwall’s inequality of Th. 4.5 with α ≡ C together with the fundamentaltheorem of calculus to obtain the estimate

γ(x) ≤ C +

∫ x

a

C β(t) exp

(∫ x

t

β(s) ds

)

dt

= C − C

∫ x

a

(

−β(t) exp(∫ t

x

−β(s) ds))

dt = C − C

[

exp

(∫ t

x

−β(s) ds)]x

a

= C exp

(∫ x

a

β(t) dt

)

(4.18)

for each x ∈ I, proving (4.17).

—

The following Th. 4.7 will be applied to show maximal solutions to linear ODE arealways defined on all of I (with I as in Def. 4.2). However, Th. 4.7 is often also usefulto obtain the domains of maximal solutions for nonlinear ODE.

Theorem 4.7. Let n ∈ N, let I ⊆ R be an open interval, and let f : I ×Kn −→ Kn becontinuous. If there exist nonnegative continuous functions γ, β : I −→ R+

0 such that

∀(x,y)∈I×Kn

‖f(x, y)‖ ≤ γ(x) + β(x) ‖y‖, (4.19)

where ‖ · ‖ denotes some arbitrary norm on Kn, then every maximal solution to

y′ = f(x, y)

is defined on all of I.

Proof. Let c < d and φ : ]c, d[−→ Kn be a solution to y′ = f(x, y). We prove thatd < b := sup I implies φ can be extended to the right and a := inf I < c implies φ canbe extended to the left. First, assume d < b and let x0 ∈]c, d[. The idea is to applyExample 4.6 on the interval [x0, d[. To this end, we estimate, for each x ∈ [x0, d[:

‖φ(x)‖ =

∥

∥

∥

∥

φ(x0) +

∫ x

x0

f(

t, φ(t))

dt

∥

∥

∥

∥

≤ ‖φ(x0)‖+∫ x

x0

∥

∥

∥f(

t, φ(t))

∥

∥

∥ dt

(4.19)

≤ ‖φ(x0)‖+∫ x

x0

γ(t) dt +

∫ x

x0

β(t) ‖φ(t)‖ dt . (4.20)

4 LINEAR ODE 61

Since the continuous function γ is uniformly bounded on the compact interval [x0, d],

∃C≥0

∀x∈[x0,d[

‖φ(x)‖ ≤ C +

∫ x

x0

β(t) ‖φ(t)‖ dt .

Thus, Example 4.6 applies, providing

∀x∈[x0,d[

‖φ(x)‖ ≤ C exp

(∫ x

x0

β(t) dt

)

≤ C eM(d−x0), (4.21)

where M ≥ 0 is a uniform bound for the continuous function β on the compact interval[x0, d]. As (4.21) states that the graph

gr+(φ) =(

x, φ(x))

∈ G : x ∈ [x0, d[

is contained in the compact set

K := [x0, d]×

y ∈ Kn : ‖y‖ ≤ C eM(d−x0)

,

Prop. 3.24 implies φ has an extension to the right.

Now assume a < c. The idea is to apply the time reversion Lem. 1.9(b): According toLem. 1.9(b), ψ : ] − d,−c[−→ Kn, ψ(x) = φ(−x), is a solution to y′ = −f(−x, y) andthe first part of the prove above shows ψ to have an extension to the right. However,then Rem. 3.21 tells us φ has an extension to the left.

4.3 Existence, Uniqueness, Vector Space of Solutions

Theorem 4.8. Consider the setting of Def. 4.2 with an open interval I. Then everyinitial value problem consisting of the linear ODE (4.1) and y(x0) = y0, x0 ∈ I, y0 ∈ Kn,has a unique maximal solution φ : I −→ Kn (note that φ is defined on all of I).

Proof. It is an exercise to show the right-hand side of (4.1) is continuous and locallyLipschitz with respect to y. Thus, every initial value problem has a unique maximalsolution by using Cor. 3.16 and Th. 3.22. That each maximal solution is defined on Ifollows from Th. 4.7, as

∀x∈I

∥

∥A(x)y + b(x)∥

∥ ≤∥

∥b(x)∥

∥+∥

∥A(x)∥

∥ ‖y‖,

where∥

∥A(x)∥

∥ denotes the matrix norm of A(x) induced by the norm ‖ · ‖ on Kn (cf.Appendix G).

We will now proceed to study the solution spaces of linear ODE – as it turns out, thesesolution spaces inherit the linear structure of the ODE.

Notation 4.9. Again, we consider the setting of Def. 4.2. Define Li and Lh to be therespective sets of solutions to (4.1) and its homogeneous version, i.e.

Li :=

(φ : I −→ Kn) : φ′ = Aφ+ b

, (4.22a)

Lh :=

(φ : I −→ Kn) : φ′ = Aφ

. (4.22b)

4 LINEAR ODE 62

Lemma 4.10. Using Not. 4.9, we have

∀φ∈Li

Li = φ+ Lh = φ+ ψ : ψ ∈ Lh, (4.23)

i.e. one obtains all solutions to the inhomogeneous equation (4.1) by adding solutionsof the homogeneous equation to a particular solution to the inhomogeneous equation(note that this is completely analogous to what occurs for solutions to linear systems ofequations in linear algebra).

Proof. Exercise.

Theorem 4.11. Let I ⊆ R be a nontrivial interval, n ∈ N, and let A : I −→ M(n,K)be continuous. Then the following holds:

(a) The set Lh defined in (4.22b) constitutes a vector space over K.

(b) For each k ∈ N and φ1, . . . , φk ∈ Lh, the following statements are equivalent:

(i) The k functions φ1, . . . , φk are linearly independent over K.

(ii) There exists x0 ∈ I such that the k vectors φ1(x0), . . . , φk(x0) ∈ Kn are linearlyindependent over K.

(iii) The k vectors φ1(x), . . . , φk(x) ∈ Kn are linearly independent over K for everyx ∈ I.

(c) The dimension of Lh is n.

Proof. (a): Exercise.

(b): (iii) trivially implies (ii). That (ii) implies (i) can easily be shown by contraposition:If (i) does not hold, then there is (λ1, . . . , λk) ∈ Kk \ 0 such that

∑kj=1 λjφj = 0, i.e.

∑kj=1 λjφj(x) = 0 holds for each x ∈ I, i.e. (ii) does not hold. It remains to show (i)

implies (iii). Once again, we accomplish this via contraposition: If (iii) does not hold,then there are (λ1, . . . , λk) ∈ Kk \ 0 and x ∈ I such that

∑kj=1 λjφj(x) = 0. But

then, since∑k

j=1 λjφj ∈ Lh by (a),∑k

j=1 λjφj = 0 (using uniqueness of solutions). Inconsequence, φ1, . . . , φk are linearly dependent and (i) does not hold.

(c): Let (b1, . . . , bn) be a basis of Kn and x0 ∈ I. Let φ1, . . . , φn ∈ Lh be the solutionsto the initial conditions y(x0) = b1, . . . , y(x0) = bn, respectively. Then the φ1, . . . , φnmust be linearly independent by (b) (as they are linearly independent at x0), provingdimLh ≥ n. On the other hand, if φ1, . . . , φk ∈ Lh are linearly independent, k ∈ N,then, once more by (b), φ1(x), . . . , φk(x) ∈ Kn are linearly independent for each x ∈ Kn,showing k ≤ n and dimLh ≤ n.

Example 4.12. In Example 1.4(e), we had claimed that the second-order ODE (1.16)on [a, b], a < b, namely

y′′ = −y

4 LINEAR ODE 63

had the set of solutions L as in (1.17), namely

L =(

(c1 sin+c2 cos) : [a, b] −→ K)

: c1, c2 ∈ K

.

We are now in a position to verify this claim: The second-order ODE (1.16) is equivalentto the homogeneous linear first-order ODE

(

y′1y′2

)

=

(

y2−y1

)

=

(

0 1−1 0

)(

y1y2

)

(4.24)

with the vector space of solutions Lh of dimension 2 over K. Clearly, φ1, φ2 ∈ Lh, whereφ1, φ2 : [a, b] −→ K2 with

φ1(x) :=

(

sin xcos x

)

, φ2(x) :=

(

cos x− sin x

)

. (4.25)

Moreover, φ1 and φ2 are linearly independent (e.g. since φ1(0) =(

01

)

and φ2(0) =(

10

)

arelinearly independent, so are φ1, φ2 : R −→ K2 by Th. 4.11(b), implying, again by Th.4.11(b), the linear independence of φ1(a), φ2(a), finally implying the linear independenceof φ1, φ2 : [a, b] −→ K2). Thus,

Lh =(

(c1φ1 + c2φ2) : [a, b] −→ K2)

: c1, c2 ∈ K

(4.26)

and, since, according to Th. 3.1 the solutions to (1.16) are precisely the first componentsof solutions to (4.24), the representation (1.17) is verified.

4.4 Fundamental Matrix Solutions and Variation of Constants

Definition and Remark 4.13. A basis (φ1, . . . , φn), n ∈ N, of the n-dimensionalvector space Lh over K is called a fundamental system for the linear ODE (4.1). Onethen also calls the matrix

Φ :=

φ11 . . . φ1n...

...φn1 . . . φnn

, (4.27)

where the kth column of the matrix consists of the component functions φ1k, . . . , φnk ofφk, k ∈ 1, . . . , n, a fundamental system or a fundamental matrix solution for (4.1). Thelatter term is justified by the observation that Φ : I −→ M(n,K) can be interpreted asa solution to the matrix-valued ODE

Y ′ = A(x)Y : (4.28)

Indeed,Φ′ =

(

φ′1, . . . , φ

′n

)

=(

A(x)φ1, . . . , A(x)φn)

= A(x) Φ.

Corollary 4.14. Let φ1, . . . , φn ∈ Lh, n ∈ N, and let Φ be defined as in (4.27). Thenthe following statements are equivalent:

4 LINEAR ODE 64

(i) Φ is a fundamental system for (4.1).

(ii) There exists x0 ∈ I such that detΦ(x0) 6= 0.

(iii) det Φ(x) 6= 0 for every x ∈ I.

Proof. The equivalences are a direct consequence of the equivalences in Th. 4.11(b).

Theorem 4.15 (Variation of Constants). Consider the setting of Def. 4.2. If Φ : I −→M(n,K) is a fundamental system for (4.1), then the unique solution ψ : I −→ Kn ofthe initial value problem consisting of (4.1) and y(x0) = y0, (x0, y0) ∈ I ×Kn, is givenby

ψ : I −→ Kn, ψ(x) = Φ(x)Φ−1(x0) y0 + Φ(x)

∫ x

x0

Φ−1(t) b(t) dt . (4.29)

Proof. The initial condition is easily verified:

ψ(x0) = Φ(x0)Φ−1(x0) y0 + 0 = Id y0 = y0.

To check that ψ satisfies (4.1), one computes, for each x ∈ I,

ψ′(x)(I.3)= Φ′(x)Φ−1(x0) y0 + Φ′(x)

∫ x

x0

Φ−1(t) b(t) dt + Φ(x)Φ−1(x) b(x)

(4.28)= A(x) Φ(x)Φ−1(x0) y0 + A(x) Φ(x)

∫ x

x0

Φ−1(t) b(t) dt + b(x)

= A(x)ψ(x) + b(x), (4.30)


Remark 4.16. The 1-dimensional variation of constants formula (2.2) is actually aspecial case of (4.29): We note that, for n = 1 and A(x) = a(x), the solution Φ := φ0

to the 1-dimensional homogeneous equation as defined in (2.2b), i.e.

φ0 : I −→ K, φ0(x) = exp

(∫ x

x0

a(t) dt

)

= e∫ xx0a(t) dt

constitutes a fundamental matrix solution in the sense of Def. and Rem. 4.13 (since 1/φ0

exists). Taking into account Φ(x0) = φ0(x0) = 1, we obtain, for each x ∈ I,

φ(x)(4.29)= Φ(x)Φ−1(x0) y0 + Φ(x)

∫ x

x0

Φ−1(t) b(t) dt

= φ0(x)

(

y0 +

∫ x

x0

φ0(t)−1 b(t) dt

)

, (4.31)

which is (2.2a).

—

4 LINEAR ODE 65

In Sec. 4.6, we will study methods for actually finding fundamental matrix solutions incases where A is constant. However, in general, fundamental matrix solutions are oftennot explicitly available. In such situations, the following Th. 4.17 can sometimes helpto extract information about solutions.

Theorem 4.17 (Liouville’s Formula). Consider the setting of Def. 4.2 and recall thetrace of an n× n matrix A = (akl) is defined by

trA :=n∑

k=1

akk.

If Φ : I −→ M(n,K) is a fundamental system for (4.1), then

∀x0,x∈I

det(

Φ(x))

= det(

Φ(x0))

exp

(∫ x

x0

tr(

A(t))

dt

)

. (4.32)

Proof. Exercise.

4.5 Higher-Order, Wronskian

In Th. 3.1, we saw that higher-order ODE are equivalent to systems of first-order ODE.We can now combine Th. 3.1 with our findings regarding first-order linear ODE to helpwith the solution of higher-order linear ODE.

Definition 4.18. Let I ⊆ R be a nontrivial interval, n ∈ N. Let b : I −→ K anda0, . . . , an−1 : I −→ K be continuous functions. Then a (1-dimensional) linear ODE ofnth order is an equation of the form

y(n) = an−1(x)y(n−1) + · · ·+ a1(x)y

′ + a0(x)y + b(x). (4.33)

It is called homogeneous if, and only if, b ≡ 0; it is called inhomogeneous if, and only if,it is not homogeneous. Analogous to (4.22), define the respective sets of solutions

Hi :=

(φ : I −→ K) : φ(n) = b+n−1∑

k=0

akφ(k)

, (4.34a)

Hh :=

(φ : I −→ K) : φ(n) =n−1∑

k=0

akφ(k)

. (4.34b)

Definition 4.19. Let I ⊆ R be a nontrivial interval, n ∈ N. For each n-tuple of (n− 1)times differentiable functions φ1, . . . , φn : I −→ K, define the Wronskian

W (φ1, . . . , φn) : I −→ K, W (φ1, . . . , φn)(x) := det

φ1(x) . . . φn(x)φ′1(x) . . . φ′

n(x)...

...

φ(n−1)1 (x) . . . φ

(n−1)n (x)

.

(4.35)

4 LINEAR ODE 66

Theorem 4.20. Consider the setting of Def. 4.18.

(a) If Hi and Hh are the sets defined in (4.34), then Hh is an n-dimensional vectorspace over K and, if φ ∈ Hi is arbitrary, then

Hi = φ+Hh. (4.36)

(b) Let φ1, . . . , φn ∈ Hh. Then the following statements are equivalent:

(i) φ1, . . . , φn are linearly independent over K (i.e. (φ1, . . . , φn) forms a basis ofHh).

(ii) There exists x0 ∈ I such that the Wronskian does not vanish:

W (φ1, . . . , φn)(x0) 6= 0.

(iii) The Wronskian never vanishes, i.e. W (φ1, . . . , φn)(x) 6= 0 for every x ∈ I.

Proof. According to Th. 3.1, (4.33) is equivalent to the first-order linear ODE

y′ =

0 1 0 . . . 0 00 0 1 . . . 0 0...

. . . . . ....

0 0 0 . . . 1 00 0 0 . . . 0 1

a0(x) a1(x) a2(x) . . . an−2(x) an−1(x)

y1y2...

yn−2

yn−1

yn

+

00...00b(x)

=: A(x) y + b(x). (4.37)

Define

Li :=

(Φ : I −→ Kn) : Φ′ = AΦ + b

,

Lh :=

(Φ : I −→ Kn) : Φ′ = Aφ

.

(a): Let φ ∈ Hi and define

Φ :=

φφ′

...φ(n−1)

.

Then

HhTh. 3.1(a),(b)

= Ψ1 : Ψ ∈ Lhand

HiTh. 3.1(a),(b)

= Φ1 : Φ ∈ Li(4.23)= Φ1 : Φ ∈ Φ + Lh

= (Φ + Ψ)1 : Ψ ∈ Lh = φ+Hh.

4 LINEAR ODE 67

As a consequence of Th. 3.1, the map J : Lh −→ Hh, J(Φ) := Φ1, is a linear isomor-phism, implying that Hh, like Lh, is an n-dimensional vector space over K.

(b): For φ1, . . . , φn ∈ Hh, define Φkl := φ(l−1)k for each k, l ∈ 1, . . . , n and

∀x∈I

Φ(x) := (Φ1(x), . . . ,Φn(x)) = (Φkl(x)) =

φ1(x) . . . φn(x)φ′1(x) . . . φ′

n(x)...

...

φ(n−1)1 (x) . . . φ

(n−1)n (x)

such that detΦ(x) = W (φ1, . . . , φn)(x) for each x ∈ I. Since Th. 3.1 yields Φ1, . . . ,Φn ∈Lh if, and only if, φ1, . . . , φn ∈ Hh, the equivalences of (b) follow from the equivalencesof Cor. 4.14.

Example 4.21. Consider a0, a1 : R+ −→ K, a1(x) := 1/(2x), a0(x) := −1/(2x2), and

y′′ = a1(x) y′ + a0(x) y =

y′

2x− y

2x2.

One might be able to guess the solutions

φ1, φ2 : R+ −→ K, φ1(x) := x, φ2(x) :=

√x.

The Wronskian is

W (φ1, φ2) : R+ −→ K,

W (φ1, φ2)(x) = det

(

x√x

1 1/(2√x)

)

=

√x

2−√

x = −√x

2< 0,

i.e. φ1 and φ2 span Hh according to Th. 4.20(b):

Hh = c1φ1 + c2φ2 : c1, c2 ∈ K.

4.6 Constant Coefficients

For 1-dimensional first-order linear ODE, we obtained a solution formula in Th. 2.3 interms of integrals (of course, in general, evaluating integrals can still be very difficult,and one might need effective and efficient numerical methods). In the previous sections,we have studied systems of first-order linear ODE as well as linear ODE of higher order.Unfortunately, there are no general solution formulas for these situations (one can use(4.29) if one knows a fundamental system, but the problem is the absence of a generalprocedure to obtain such a fundamental system). However, there is a more satisfyingsolution theory for the situation of so-called constant coefficients, i.e. if A in (4.1) andthe a0, . . . , an−1 in (4.33) do not depend on x.

4 LINEAR ODE 68

4.6.1 Linear ODE of Higher Order

Definition 4.22. Let I ⊆ R be a nontrivial interval, n ∈ N. Let b : I −→ K becontinuous and a0, . . . , an−1 ∈ K. Then a (1-dimensional) linear ODE of nth order withconstant coefficients is an equation of the form

y(n) = an−1 y(n−1) + · · ·+ a1 y

′ + a0 y + b(x). (4.38)

—

In the present context, it is useful to introduce the following notation:

Notation 4.23. Let n ∈ N0.

(a) Let P denote the set of all polynomials over K, Pn := P ∈ P : degP ≤ n. Wewill also write P [R], P [C], Pn[R], Pn[C] if we need to be specific about the field ofcoefficients.

(b) Let I ⊆ R be a nontrivial interval. Let Dn(I) := Dn(I,K) denote the set of all ntimes differentiable functions f : I −→ K, and let

∂x : D1(I) −→ F(I,K) := f : I −→ K, ∂xf := f ′,

and, for each P ∈ Pn with P (x) =∑n

j=0 ajxj (a0, . . . , an ∈ K) define the differential

operator

P (∂x) : Dn(I) −→ F(I,K), P (∂x)f :=

n∑

j=0

aj∂jxf =

n∑

j=0

ajf(j). (4.39)

Remark 4.24. Using Not. 4.23(b), the ODE (4.38) can be written concisely as

P (∂x)y = b(x), where P (x) := xn −n−1∑

j=0

ajxj. (4.40)

—

The following Prop. 4.25 implies that the differential operator P (∂x) does not, actually,depend on the representation of the polynomial P .

Proposition 4.25. Let P, P1, P2 ∈ P.

(a) If P = P1 + P2 and n := maxdegP1, degP2, then

∀f∈Dn(I)

P (∂x)f = P1(∂x)f + P2(∂x)f.

(b) If P = P1P2 and n := maxdegP, degP1, degP2, then

∀f∈Dn(I)

P (∂x)f = P1(∂x)(

P2(∂x)f)

.

4 LINEAR ODE 69

Proof. Exercise.

Lemma 4.26. Let λ ∈ K and

f : R −→ K, f(x) := eλx. (4.41)

Then, for each P ∈ P,

P (∂x)f : R −→ K, P (∂x)f(x) = P (λ) eλx. (4.42)

Proof. There exists n ∈ N0 and a0, . . . , an ∈ K such that P (x) =∑n

j=0 ajxj. One

computes

P (∂x)f(x) =n∑

j=0

aj ∂jx e

λx =n∑

j=0

ajλjeλx = eλx P (λ),

proving (4.42).

Theorem 4.27. If a0, . . . , an−1 ∈ K, n ∈ N, and P (x) = xn−∑n−1j=0 ajx

j has the distinctzeros λ1, . . . , λn ∈ K (i.e. P (λ1) = · · · = P (λn) = 0), then (φ1, . . . , φn), where

∀j∈1,...,n

φj : I −→ K, φj(x) := eλjx, (4.43)

is a basis of the homogeneous solution space

Hh =

(φ : I −→ K) : P (∂x)φ = 0

(4.44)

to (4.38) (i.e. to (4.40)).

Proof. It is immediate from (4.42) and P (λj) = 0 that each φj satisfies P (∂x)φj = 0.From Th. 4.20(a), we already know Hh is an n-dimensional vector space over K. Thus,it merely remains to compute the Wronskian. One obtains (cf. (4.35)):

W (φ1, . . . , φn)(0) =

∣

∣

∣

∣

∣

∣

∣

∣

∣

1 . . . 1λ1 . . . λn...

...λn−11 . . . λn−1

n

∣

∣

∣

∣

∣

∣

∣

∣

∣

(H.2)=

n−1∏

k,l=0k>l

(λk − λl) 6= 0,

since the λj are all distinct. We have used that the Wronskian, in the present case, turnsout to be a Vandermonde determinant. The formula (H.2) for this type of determinant isprovided and proved in Appendix H. We also used that the determinant of a matrix is thesame as the determinant of its transpose: detA = detAt. From W (φ1, . . . , φn)(0) 6= 0and Th. 4.20(b), we conclude that (φ1, . . . , φn) is a basis of Hh.

Example 4.28. We consider the third-order linear ODE

y′′′ = 2y′′ − y′ + 2y, (4.45)

4 LINEAR ODE 70

which can be written as P (∂x)y = 0 with

P (x) := x3 − 2x2 + x− 2 = (x2 + 1)(x− 2) = (x− i)(x+ i)(x− 2), (4.46)

i.e. P has the distinct zeros λ1 = i, λ2 = −i, λ3 = 2. Thus, according to Th. 4.27, thethree functions

φ1, φ2, φ3 : R −→ C, φ1(x) = eix, φ2(x) = e−ix, φ3(x) = e2x, (4.47)

form a basis of the C-vector space Hh. If we consider (4.45) as an ODE over R, thenwe are interested in a basis of the R-vector space Hh. We can use linear combinationsof φ1 and φ2 to obtain such a basis (cf. Rem. 4.33(b) below):

ψ1, ψ2 : R −→ R, ψ1(x) =eix + e−ix

2= cos x, ψ2(x) =

eix − e−ix

2i= sin x. (4.48)

As explained in Rem. 4.33(b) below, as (φ1, φ2, φ3) are a basis of Hh over C, (ψ1, ψ2, φ3)are a basis of Hh over R.

—

By working a bit harder, one can generalize Th. 4.27 to the case where P has zeros ofhigher multiplicity. We provide this generalization in Th. 4.32 below after recalling thenotion of zeros of higher multiplicity in Rem. and Def. 4.29, and after providing twopreparatory lemmas.

Remark and Definition 4.29. According to the fundamental theorem of algebra (cf.[Phi16, Th. 8.32, Cor. 8.33(b)]), for every polynomial P ∈ Pn with degP = n, n ∈ N,there exists r ∈ N with r ≤ n, k1, . . . , kr ∈ N with k1 + · · · + kr = n, and distinctnumbers λ1, . . . , λr ∈ C such that

P (x) = (x− λ1)k1 · · · (x− λr)

kr . (4.49)

Clearly, λ1, . . . , λr are precisely the distinct zeros of P and kj is referred to as themultiplicity of the zero λj, j = 1, . . . , r.

Lemma 4.30. Let I ⊆ R be a nontrivial interval, λ ∈ K, k ∈ N0, and f ∈ Dk(I). Thenwe have

∀x∈I

(∂x − λ)k(

f(x) eλx)

= f (k)(x) eλx. (4.50)

Proof. The proof is carried out by induction. The case k = 0 is merely the identity(∂x − λ)0

(

f(x) eλx)

= f(x) eλx. For the induction step, let k ≥ 1 and compute, usingthe product rule,

(∂x − λ)k(

f(x) eλx) ind. hyp.

= (∂x − λ)(

f (k−1)(x) eλx)

= f (k)(x) eλx + f (k−1)(x)λ eλx − λ f (k−1)(x) eλx

= f (k)(x) eλx, (4.51)


4 LINEAR ODE 71

Lemma 4.31. Let P ∈ P and λ ∈ K such that P (λ) 6= 0. Then, for each Q ∈ P withdegQ = k, k ∈ N0, it holds that

∀x∈R

P (∂x)(

Q(x) eλx)

= R(x) eλx, (4.52)

where R ∈ P is still a polynomial of degree k.

Proof. We can rewrite P (cf. [Phi16, Th. 6.5(a)]) in the form

P (x) =n∑

j=0

bj (x− λ)j, n ∈ N0, (4.53)

where b0 = P (λ) 6= 0 and the remaining bj ∈ K can also be calculated from thecoefficients of P according to [Phi16, (6.6)]. We compute

P (∂x)(

Q(x) eλx) (4.53)

=n∑

j=0

bj (∂x − λ)j(

Q(x) eλx) (4.50)

=n∑

j=0

bj Q(j)(x) eλx,

i.e. (4.52) holds with R :=∑k

j=0 bj Q(j) and b0 6= 0 implies degR = degQ = k.

Theorem 4.32. If a0, . . . , an−1 ∈ K, n ∈ N, and P (x) = xn−∑n−1j=0 ajx

j has the distinctzeros λ1, . . . , λr ∈ K with respective multiplicities k1, . . . , kr ∈ N, then the set

B :=

(φjm : I −→ K) : j ∈ 1, . . . , r, m ∈ 0, . . . , kj − 1

, (4.54a)

where∀

j∈1,...,r∀

m∈0,...,kj−1φjm : I −→ K, φjm(x) := xm eλjx, (4.54b)

yields a basis of the homogeneous solution space

Hh =

(φ : I −→ K) : P (∂x)φ = 0

.

Proof. Since k1 + · · · + kr = n implies #B = n and we know dimHh = n, it suffices toshow that B ⊆ Hh and the elements of B are linearly independent. Let φjm be as in(4.54b). As λj is a zero of multiplicity kj of P , we can write P (x) = Qj(x)(x − λj)

kj

with some Qj ∈ P . From the computation

P (∂x)φjm(x) = Qj(∂x)(∂x − λj)kj(

xm eλjx) (4.50)

= Qj(∂x)(

∂kjx xm)

eλjxkj>m= 0,

we gather B ⊆ Hh. Linear independence of the φjm is verified by showing

(

r∑

j=1

Qj(x) eλjx = 0 ∧ ∀

j=1,...,rQj ∈ Pkj−1

)

⇒ ∀j=1,...,r

Qj ≡ 0. (4.55)

We prove (4.55) by induction on r. Since eλjx 6= 0 for each x ∈ R, the case r = 1is immediate. For the induction step, let r ≥ 2. If at least one Qj ≡ 0, then the

4 LINEAR ODE 72

remaining Qj ≡ 0 as well by the induction hypothesis. It only remains to consider thecase that none of the Qj vanishes identically. In that case, we apply (∂x − λr)

kr to∑r

j=1Qj(x) eλjx = 0, obtaining

r−1∑

j=1

Rj(x) eλjx = 0 (4.56)

with suitable Rj ∈ P , since Lem. 4.30 yields (∂x− λr)kr(

Qr(x) eλrx)

= Q(kr)r (x) eλrx = 0

and, for j < r, Lem. 4.31 applies due to (λj−λr)kr 6= 0, also providing degRj = degQj .Thus, none of the Rj in (4.56) can vanish identically, violating the induction hypothesis.This finishes the proof of Qj ≡ 0 for each j = 1, . . . , r and the proof of the theorem.

As it can occur in Th. 4.32 that P ∈ P [R], but λj ∈ C\R for some or all of the zeros λj,the question arises of how to obtain a basis of the R-vector space Hh from the basis ofthe C-vector space Hh provided by Th. 4.32. The following Rem. 4.33(b) answers thisquestion.

Remark 4.33. (a) If λ1, λ2 ∈ C, then complex conjugation has the properties (cf.[Phi16, Def. and Rem. 5.5])

λ1 ± λ2 = λ1 ± λ2, λ1λ2 = λ1 λ2.

In consequence, if P ∈ P [R], then P (λ) = P (λ) for each λ ∈ C. In particular, ifP ∈ P [R] and λ ∈ C \ R is a nonreal zero of P , then λ 6= λ is also a zero of P .

(b) Consider the situation of Th. 4.32 with P ∈ P [R]. Using (a), if φjm : I −→ C,φjm(x) = xm eλjx, λj ∈ C \ R, occurs in a basis for the C-vector space Hh (withm = 0 in the special case of Th. 4.27), then φjm : I −→ C, φjm(x) = xm eλjx, with

λj = λj will occur as well. Noting that, for each x ∈ R and each λ ∈ C,

eλx = ex(Reλ+i Imλ) = exReλ(

cos(x Imλ) + i sin(x Imλ))

, (4.57a)

eλx = ex(Reλ−i Imλ) = exReλ(

cos(x Imλ)− i sin(x Imλ))

, (4.57b)

1

2(eλx + eλx) = exReλ cos(x Imλ), (4.57c)

1

2i(eλx − eλx) = exReλ sin(x Imλ), (4.57d)

one can define

ψjm : I −→ R, ψjm(x) :=1

2(φjm(x) + φjm(x)) = xm exReλj cos(x Imλj), (4.58a)

ψjm : I −→ R, ψjm(x) :=1

2i(φjm(x)− φjm(x)) = xm exReλj sin(x Imλj).

(4.58b)

If one replaces each pair φjm, φjm in the basis for the C-vector space Hh with thecorresponding pair ψjm, ψjm, then one obtains a basis for the R-vector space Hh:This follows from

(

ψjmψjm

)

= A

(

φjmφjm

)

with A :=

( 12

12

12i

− 12i

)

, detA = − 1

2i6= 0. (4.59)

4 LINEAR ODE 73

Example 4.34. We consider the fourth-order linear ODE

y(4) = −8y′′ − 16y, (4.60)

which can be written as P (∂x)y = 0 with

P (x) := x4 + 8x2 + 16 = (x2 + 4)2 = (x− 2i)2(x+ 2i)2, (4.61)

i.e. P has the zeros λ1 = 2i, λ2 = −2i, both with multiplicity 2. Thus, according to Th.4.32, the four functions

φ10, φ11, φ20, φ21 : R −→ C,

φ10(x) = e2ix, φ11(x) = x e2ix, φ20(x) = e−2ix, φ21(x) = x e−2ix,

form a basis of the C-vector space Hh. If we consider (4.60) as an ODE over R, we canuse (4.58) to obtain the basis (ψ10, ψ11, ψ20, ψ21) of the R-vector space Hh, where

ψ10, ψ11, ψ20, ψ21 : R −→ R,

ψ10(x) = cos(2x), ψ11(x) = x cos(2x), ψ20(x) = sin(2x), ψ21(x) = x sin(2x).

—

If (4.38) is inhomogeneous, then one can use Th. 4.32 and, if necessary, Rem. 4.33(b),to obtain a basis of the homogeneous solution space Hh, then using the equivalence withsystems of first-order linear ODE and variation of constants according to Th. 4.15 tosolve (4.38). However, if the function b in (4.38) is such that the following Th. 4.35applies, then one can avoid using the above strategy to obtain a particular solution φto (4.38) (and, thus, the entire solution space via Hi = φ+Hh).

Theorem 4.35. Let a0, . . . , an−1 ∈ K, n ∈ N, and P (x) = xn −∑n−1j=0 ajx

j. Consider

P (∂x)y = Q(x)eµx, Q ∈ P , µ ∈ K. (4.62)

(a) (no resonance): If P (µ) 6= 0 and m := deg(Q) ∈ N0, then there exists a polynomialR ∈ P such that deg(R) = m and

φ : R −→ K, φ(x) := R(x) eµx, (4.63)

is a solution to (4.62). Moreover, if Q ≡ 1, then one can choose R ≡ 1/P (µ).

(b) (resonance): If µ is a zero of P with multiplicity k ∈ N and m := deg(Q) ∈ N0,then there exists a solution to (4.62) of the following form:

φ : R −→ K, φ(x) := R(x) eµx, R ∈ P , R(x) =m+k∑

j=k

cjxj, ck, . . . , cm+k ∈ K.

(4.64)

The reason behind the terms no resonance and resonance will be explained in the follow-ing Example 4.36.

4 LINEAR ODE 74

Proof. Exercise.

Example 4.36. Consider the second-order linear ODE

d2x

dt2+ ω2

0 x = a cos(ωt), ω0, ω ∈ R+, a ∈ R \ 0, (4.65)

which can be written as P (∂t)x = a cos(ωt) with

P (t) := t2 + ω20 = (t− iω0)(t+ iω0). (4.66)

Note that the unknown function is written as x depending on the variable t (instead of ydepending on x). This is due to the physical interpretation of (4.65), where x representsthe position of a so-called harmonic oscillator at time t, having angular frequency ω0

and being subjected to a periodic external force of angular frequency ω and amplitudea. We can find a particular solution φ to (4.65) by applying Th. 4.35 to

P (∂t)x = a eiωt. (4.67)

We have to distinguish two cases:

(a) Case ω 6= ω0: In this case, one says that the oscillator and the external force are notin resonance, which explains the term no resonance in Th. 4.35(a). In this case, wecan apply Th. 4.35(a) with µ := iω and Q ≡ a, yielding R ≡ a/P (iω) = a/(ω2

0−ω2),i.e.

φ0 : R −→ C, φ0(t) := R(t) eµt =a

ω20 − ω2

eiωt, (4.68a)

is a solution to (4.67) and

φ : R −→ R, φ(t) := Reφ0(t) =a

ω20 − ω2

cos(ωt), (4.68b)


(b) Case ω = ω0: In this case, one says that the oscillator and the external force are inresonance, which explains the term resonance in Th. 4.35(b). In this case, we canapply Th. 4.35(b) with µ := iω and Q ≡ a, i.e. m = 0, k = 1, yielding R(t) = ctfor some c ∈ C. To determine c, we plug x(t) = R(t) eµt into (4.67):

P (∂t)(

ct eiωt)

= ∂t(c eiωt + ciωt eiωt) + ω2

0ct eiωt

= ciω eiωt + ciω eiωt − cω2t eiωt + ω20ct e

iωt

= 2ciω eiωt = a eiωt ⇒ c = a/(2iω). (4.69)

Thus,

φ0 : R −→ C, φ0(t) :=a

2iωt eiωt, (4.70a)

is a solution to (4.67) and

φ : R −→ R, φ(t) := Reφ0(t) =a

2ωt sin(ωt), (4.70b)


4 LINEAR ODE 75

4.6.2 Systems of First-Order Linear ODE

Matrix Exponential Function

Definition 4.37. Let I ⊆ R be a nontrivial interval, n ∈ N, A ∈ M(n,K) and b :I −→ Kn be continuous. Then a linear ODE with constant coefficients is an equationof the form

y′ = Ay + b(x), (4.71)

i.e. a linear ODE, where the matrix A does not depend on x.

—

Recalling that the ordinary exponential function expa : R −→ C, x 7→ eax, a ∈ C,is precisely the solution to the initial value problem y′ = a y, y(0) = 1, the followingdefinition constitutes a natural generalization:

Definition 4.38. Given n ∈ N, A ∈ M(n,C), define the matrix exponential function

expA : R −→ M(n,C), x 7→ eAx, (4.72a)

to be the solution to the matrix-valued initial value problem

Y ′ = AY, Y (0) = Id, (4.72b)

i.e. as the fundamental matrix solution of y′ = Ay that satisfies Y (0) = Id (sometimescalled the principal maxtrix solution of y′ = Ay).

—

The previous definition of the matrix exponential function is further justified by thefollowing result:

Theorem 4.39. For each A ∈ M(n,C), n ∈ N, it holds that

∀x∈R

eAx =∞∑

k=0

(Ax)k

k!(4.73)

in the sense that the partial sums on the right-hand side converge pointwise to eAx onR, where the convergence is even uniform on every compact interval.

Proof. By the equivalence of all norms on Cn2 ∼= M(n,C), we may choose a convenientnorm on M(n,C). So we let ‖ · ‖ denote an arbitrary operator norm on M(n,C),induced by some norm ‖ · ‖ on Cn. We first show that the partial sums (Am(x))m∈N,

Am(x) :=∑m

k=0(Ax)k

k!, in (4.73) form a Cauchy sequence in M(n,C): For M,N ∈ N,

N > M , one estimates, for each x ∈ R,

‖AN(x)− AM(x)‖ =

∥

∥

∥

∥

∥

N∑

k=M+1

(Ax)k

k!

∥

∥

∥

∥

∥

(G.10)

≤N∑

k=M+1

‖A‖k |x|kk!

. (4.74)

4 LINEAR ODE 76

Since the convergence limm→∞

∑mk=0

‖A‖k |x|k

k!= e‖A‖|x| is pointwise for x ∈ R and uniform

on every compact interval, (4.74) shows each (Am(x))m∈N is a Cauchy sequence thatconverges to some Φ(x) ∈ M(n,C) (by the completeness of M(n,C)) pointwise forx ∈ R and uniform on every compact interval. It remains to show Φ is the solution to(4.72b), i.e.

∀x∈R

Φ(x) = Id+

∫ x

0

AΦ(t) dt . (4.75)

Using the identity

Am(x) = Id+m∑

k=1

(Ax)k

k!= Id+A

m−1∑

k=0

Akxk+1

(k + 1)!= Id+

∫ x

0

A

m−1∑

k=0

Aktk

k!dt ,

we estimate, for each x ∈ R and each m ∈ N,∥

∥

∥

∥

Φ(x)− Id−∫ x

0

AΦ(t) dt

∥

∥

∥

∥

≤∥

∥

∥

∥

Φ(x)− Am(x)

∥

∥

∥

∥

+

∥

∥

∥

∥

Am(x)− Id−∫ x

0

AΦ(t) dt

∥

∥

∥

∥

=

∥

∥

∥

∥

Φ(x)− Am(x)

∥

∥

∥

∥

+

∥

∥

∥

∥

∥

∫ x

0

(

Am−1∑

k=0

Aktk

k!− AΦ(t)

)

dt

∥

∥

∥

∥

∥

≤∥

∥

∥

∥

Φ(x)− Am(x)

∥

∥

∥

∥

+

∣

∣

∣

∣

∫ x

0

‖A‖∥

∥Am−1(t)− Φ(t)∥

∥ dt

∣

∣

∣

∣

→ 0 for m→ ∞,

which proves (4.75) and establishes the case.

The matrix exponential function has some properties that are familiar from the casen = 1 (see Prop. 4.40(a),(b)), but also some properties that are, perhaps, unexpected(see Prop. 4.42(a),(b)).

Proposition 4.40. Let A ∈ M(n,C), n ∈ N.

(a) eA(t+s) = eAteAs holds for each s, t ∈ R.

(b) (eAx)−1 = eA(−x) = e−Ax holds for each x ∈ R.

(c) For the transpose At, one has eAtx = (eAx)t for each x ∈ R.

Proof. (a): Fix s ∈ R. The function Φs : R −→ M(n,C), Φs(t) := eA(t+s) is a solutionto Y ′ = AY (namely the one for the initial condition Y (−s) = Id). Moreover, thefunction Ψs : R −→ M(n,C), Ψs(t) := eAteAs, is also a solution to Y ′ = AY , since

∂tΨs(t) =(

∂teAt)

eAs = AeAteAs = AΨs(t).

Finally, since Ψs(0) = eA0eAs = Id eAs = eAs = Φs(0), the claimed Φs = Ψs follows byuniqueness of solutions.

(b) is an easy consequence of (a), since

Id = eA0 = eA(x−x)(a)= eAxe−Ax.

4 LINEAR ODE 77

(c): Clearly, the map A 7→ At is continuous on M(n,C) (since limk→∞Ak = A implieslimk→∞ ak,αβ = aαβ for all components, which implies limk→∞At

k = At), providing, foreach x ∈ R,

eAtx = lim

m→∞

m∑

k=0

(Atx)k

k!= lim

m→∞

(

m∑

k=0

(Ax)k

k!

)t

=

(

limm→∞

m∑

k=0

(Ax)k

k!

)t

= (eAx)t,

completing the proof.

Proposition 4.41. Let A ∈ M(n,C), n ∈ N. Then

det eA = etrA.

Proof. Applying Liouville’s formula (4.32) to Φ(x) := eAx, x ∈ R, yields

det eAx = det eA0 exp

(∫ x

0

trA dt

)

= 1 · ex trA, (4.76)

and setting x = 1 in (4.76) proves the proposition.

Proposition 4.42. Let A,B ∈ M(n,C), n ∈ N.

(a) BeAx = eAxB holds for each x ∈ R if, and only if, AB = BA.

(b) e(A+B)x = eAxeBx holds for each x ∈ R if, and only if, AB = BA.

Proof. (a): If BeAx = eAxB holds for each x ∈ R, then differentiation yields BAeAx =AeAxB for each x ∈ R, and the case x = 0 provides BA Id = A IdB, i.e. BA = AB.For the converse, assume BA = AB and define the auxiliary maps

fB : M(n,C) −→ M(n,C), fB(C) := BC,

gB : M(n,C) −→ M(n,C), gB(C) := CB.

If ‖·‖ denotes an operator norm, then ‖BC1−BC2‖ ≤ ‖B‖‖C1−C2‖ and ‖C1B−C2B‖ ≤‖B‖‖C1 − C2‖, showing fB and gB to be (even Lipschitz) continuous. Thus,

BeAx = fB(eAx) = fB

(

limm→∞

m∑

k=0

(Ax)k

k!

)

= limm→∞

fB

(

m∑

k=0

(Ax)k

k!

)

= limm→∞

Bm∑

k=0

(Ax)k

k!AB=BA= lim

m→∞

(

m∑

k=0

(Ax)k

k!

)

B = limm→∞

gB

(

m∑

k=0

(Ax)k

k!

)

= gB

(

limm→∞

m∑

k=0

(Ax)k

k!

)

= gB(eAx) = eAxB,


(b): Exercise (hint: use (a)).

4 LINEAR ODE 78

Eigenvalues and Jordan Normal Form

We will see that the solution theory of linear ODE with constant coefficients is relatedto the eigenvalues of A. We recall the definition of this notion:

Definition 4.43. Let n ∈ N and A ∈ M(n,C). Then λ ∈ C is called an eigenvalue ofA if, and only if, there exists 0 6= v ∈ Cn such that

Av = λv. (4.77)

If (4.77) holds, then v 6= 0 is called an eigenvector for the eigenvalue λ.

Theorem 4.44. Let n ∈ N and A ∈ M(n,C).

(a) For each eigenvalue λ ∈ C of A with eigenvector v ∈ Cn \ 0, the function

φ : I −→ Cn, φ(x) := eλxv, (4.78)

is a solution to the homogeneous version of (4.71).

(b) If v1, . . . , vn is a basis of eigenvectors for Cn, where vj is an eigenvector withrespect to the eigenvalue λj ∈ C of A for each j ∈ 1, . . . , n, then φ1, . . . , φn with

∀j∈1,...,n

φj : I −→ Cn, φj(x) := eλjxvj, (4.79)

form a fundamental system for (4.71).

Proof. (a): One computes, for each x ∈ I,

φ′(x) = λ eλxv = eλxAv = Aφ(x),

proving that φ solves the homogeneous version of (4.71).

(b): Without loss of generality, we may consider I = R. We already know from (a) thateach φj is a solution to the homogeneous version of (4.71). Thus, it merely remainsto check that φ1, . . . , φn are linearly independent. As φ1(0) = v1, . . . , φn(0) = vn arelinearly independent by hypothesis, the linear independence of φ1, . . . , φn is provided byTh. 4.11(b).

To proceed, we need a few more notions and results from linear algebra:

Theorem 4.45. Let n ∈ N and A ∈ M(n,C). Then the following statements (i) and(ii) are equivalent:

(i) There exists a basis B of eigenvectors for Cn, i.e. there exist v1, . . . , vn ∈ Cn andλ1, . . . , λn ∈ C such that B = v1, . . . , vn is a basis of Cn and Avj = λjvj foreach j = 1, . . . , n (note that the vj must all be distinct, whereas some (or all) ofthe λj may be identical).

4 LINEAR ODE 79

(ii) There exists an invertible matrix W ∈ M(n,C) such that

W−1AW =

λ1 0. . .

0 λn

, (4.80)

i.e. A is diagonalizable (if (4.80) holds, then the columns v1, . . . , vn of W mustactually be the respective eigenvectors to the eigenvalues λ1, . . . , λn).

Proof. See, e.g., [Koe03, Th. 8.3.1].

Unfortunately, not every matrix A ∈ M(n,C) is diagonalizable. However, every A ∈M(n,C) can at least be transformed into Jordan normal form:

Theorem 4.46 (Jordan Normal Form). Let n ∈ N and A ∈ M(n,C). There exists aninvertible matrix W ∈ M(n,C) such that

B := W−1AW (4.81)

is in Jordan normal form, i.e. B has block diagonal form

B =

B1 0. . .

0 Br

, (4.82)

1 ≤ r ≤ n, where each block Bj is a so-called Jordan matrix or Jordan block, i.e.

Bj = (λj) or Bj =

λj 1 0 . . . 0λj 1

. . . . . . 00 λj 1

λj

, (4.83)

where λj is an eigenvalue of A.

Proof. See, e.g., [Koe03, Th. 9.5.6] or [Str08, Th. 27.13].

The reason Th. 4.46 regarding the Jordan normal form is useful for solving linear ODEwith constant coefficients is the following theorem:

Theorem 4.47. Let n ∈ N and A,W ∈ M(n,C), where W is assumed invertible.

(a) The following statements (i) and (ii) are equivalent:

(i) φ : I −→ Cn is a solution to y′ = Ay.

(ii) ψ := W−1φ : I −→ Cn is a solution to y′ = W−1AWy.

4 LINEAR ODE 80

(b) eW−1AWx = W−1eAxW for each x ∈ R.

Proof. (a): The equivalences

φ′ = Aφ ⇔ W−1φ′ = W−1Aφ ⇔ ψ′ = W−1AWψ

establish the case.

(b): By definition, x 7→ eW−1AWx is the solution to the initial value problem Y ′ =

W−1AWY , Y (0) = Id. Thus, noting W−1eA0W = Id and

(W−1eAxW )′ = W−1AeAxW = W−1AWW−1eAxW

shows x 7→ W−1eAxW is a solution to the same initial value problem, establishing(b).

Remark 4.48. To obtain a fundamental system for (4.71) with A ∈ M(n,C), it sufficesto obtain a fundamental system for y′ = By, where B := W−1AW is in Jordan normalform and W ∈ M(n,C) is invertible: If φ1, . . . , φn are linearly independent solutions toy′ = By, then A = WBW−1, Th. 4.47(a), and W being a linear isomorphism yield thatψ1 := Wφ1, . . . , ψn := Wφn are linearly independent solutions to y′ = Ay.

Moreover, since B is in block diagonal form with each block being a Jordan matrixaccording to (4.82) and (4.83), it actually suffices to solve y′ = By assuming that

B = λ Id+N, (4.84)

where λ ∈ C and N is a so-called canonical nilpotent matrix, i.e.

N = 0 (zero matrix) or N =

0 1 0 . . . 00 1

. . . . . . 0

0 0 1

0

, (4.85)

where the case N = 0 is already covered by Th. 4.44. The remaining case is covered bythe following Th. 4.49.

Theorem 4.49. Let λ ∈ C, k ∈ N, k ≥ 2, and assume 0 6= N ∈ M(k,C) is a canonicalnilpotent matrix according to (4.85). Then

Φ : R −→ M(k,C), Φ(x) := eλx

1 x x2

2. . . xk−2

(k−2)!xk−1

(k−1)!

0 1 x . . . xk−3

(k−3)!xk−2

(k−2)!

0 0 1. . .

......

......

.... . . . . .

...

0 0 0 . . . 1 x

0 0 0 . . . 0 1

, (4.86)

4 LINEAR ODE 81

is a fundamental matrix solution to

Y ′ = (λ Id+N)Y, Y (0) = Id, (4.87)

i.e.∀x∈R

Φ(x) = e(λ Id+N)x; (4.88)

in particular, the columns of Φ provide k solutions to y′ = (λ Id+N)y that are linearlyindependent.

Proof. Φ(0) = Id is immediate from (4.86). Since Φ(x) has upper triangular form withall 1’s on the diagonal, we obtain detΦ(x) = ekλx 6= 0 for each x ∈ R, showing thecolumns of Φ are linearly independent. Let φαβ : R −→ C denote the αth componentfunction of the βth column of Φ, i.e.

∀α,β∈1,...,k

φαβ : R −→ C, φαβ(x) :=

eλx xβ−α

(β−α)!for α ≤ β,

0 for α > β.

It remains to show that

∀α,β∈1,...,k

φ′αβ =

λφαβ + φα+1,β for α < β,

λφαβ for α = β,

0 for α > β.

(4.89)

One computes,

∀α,β∈1,...,k

φ′αβ(x) =

λ eλx xβ−α

(β−α)!+ eλx xβ−(α+1)

(β−(α+1))!for α < β,

λ eλx xβ−α

(β−α)!+ 0 for α = β,

0 for α > β,

i.e. (4.89) holds, completing the proof.

Example 4.50. For a 2-dimensional real system of linear ODE

y′ = Ay, A ∈ M(2,R), (4.90)

there exist precisely the following three possibilities (i) – (iii):

(i) The matrix A is diagonalizable with real eigenvalues λ1, λ2 ∈ R (λ1 = λ2 ispossible), i.e. there is a basis v1, v2 of R2 such that vj is an eigenvector for λj,j ∈ 1, 2. In this case, according to Th. 4.44(b), the two functions

φ1, φ2 : R −→ K2, φ1(x) := eλ1xv1, φ2(x) := eλ2xv2, (4.91)

form a fundamental system for (4.90) (over K).

4 LINEAR ODE 82

(ii) The matrix A is diagonalizable with two complex conjugate eigenvalues λ1, λ2 ∈C \ R, λ2 = λ1. Analogous to (i), one has a basis v1, v2 of C2 such that vjis an eigenvector for λj, j ∈ 1, 2, and the two functions in (4.91) still form afundamental system for (4.90), but with K replaced by C. However, one can stillobtain a real-valued fundamental system as follows: We have

λ1 = µ+ iω, λ2 = µ− iω, where µ ∈ R, ω ∈ R \ 0. (4.92)

Thus, if Av1 = λ1v1 with 0 6= v1 = α + iβ, where α, β ∈ R2, then, lettingv2 := v1 = α− iβ, and taking complex conjugates

Av2 = Av1 = Av1 = λ1v1 = λ1v1 = λ2v2

shows v2 is an eigenvector with respect to λ2. Thus, φ2 = φ1 and, similar to theapproach described in Rem. 4.33(b) above, we can let

(

ψ1

ψ2

)

=

( 12

12

12i

− 12i

)(

φ1

φ2

)

,

to obtain a fundamental system ψ1, ψ2 for (4.90) over R, where ψ1, ψ2 : R −→R2,

ψ1(x) = Re(φ1(x)) = Re(

e(µ+iω)x(α + iβ))

= Re(

eµx(

cos(ωx) + i sin(ωx))

(α + iβ))

= eµx(

α cos(ωx)− β sin(ωx))

, (4.93a)

ψ2(x) = Im(φ1(x)) = eµx(

α sin(ωx) + β cos(ωx))

. (4.93b)

(iii) The matrix A has precisely one eigenvalue λ ∈ R and the corresponding eigenspaceis 1-dimensional. Then there is an invertible matrix W ∈ M(2,R) such thatB := W−1AW is in (nondiagonal) Jordan normal form, i.e.

B = W−1AW =

(

λ 10 λ

)

.

According to Th. 4.49, the two functions

φ1, φ2 : R −→ K2, φ1(x) := eλx(

10

)

, φ2(x) := eλx(

x1

)

, (4.94)

form a fundamental system for y′ = By (over K). Thus, according to Th. 4.47,the two functions

ψ1, ψ2 : R −→ K2, ψ1(x) := Wφ1(x), ψ2(x) := Wφ2(x), (4.95)

form a fundamental system for (4.90) (over K).

Remark 4.51. One way of finding a fundamental matrix solution for y′ = Ay, A ∈M(n,C), is to obtain eAx, using the following strategy based on Jordan normal forms:

4 LINEAR ODE 83

(i) Determine the distinct eigenvalues λ1, . . . , λs, 1 ≤ s ≤ n, of A, which are preciselythe zeros of the characteristic polynomial χA(x) := det(A− x Id) (the multiplicityof the zero λj is called its algebraic multiplicity, the dimension of the eigenspaceker(A− λj Id) its geometric multiplicity).

(ii) Determine the Jordan normal form B of A and W such that B = W−1AW . Ingeneral, this means computing the (finitely many distinct) powers (A−λj Id)k and(suitable bases of) ker(A−λj Id)k (in general, this is a somewhat involved procedureand it is referred to [Mar04, Sections 4.2,4.3] and [Str08, Sec. 27] for details – thelecture notes do not provide further details here, as they rather recommend usingthe Putzer algorithm as described below instead).

(iii) For each Jordan block Bj (as in (4.83)) of B compute eBjx as in (4.86).

As step (ii) above tends to be complicated in practise, it is usually easier to obtain eAx

using the Putzer algorithm described next.

Putzer Algorithm

The Putzer algorithm due to [Put66] is a procedure for computing eAx that avoids thedifficulty of determining the Jordan normal form of A, and, thus, is often more efficientto employ in practise than the procedure described in Rem. 4.51 above. The Putzeralgorithm is provided by the following theorem:

Theorem 4.52. Let A ∈ M(n,C), n ∈ N. If λ1, . . . , λn ∈ C are precisely the eigenval-ues of A (not necessarily distinct, each eigenvalue occurring possibly repeatedly accordingto its multiplicity). Then

∀x∈R

eAx =n−1∑

k=0

pk+1(x)Mk, (4.96)

where the functions p1, . . . , pn : R −→ C and matrices M0, . . . ,Mn−1 ∈ M(n,C) aredefined recursively by

p′1 = λ1 p1, p1(0) = 1, (4.97a)

p′k = λk pk + pk−1, pk(0) = 0 for k = 2, . . . , n (4.97b)

(i.e. each pk is a solution to a (typically nonhomogeneous) 1-dimensional first-orderlinear ODE that can be solved using (2.2)) and

M0 := Id, (4.98a)

Mk =Mk−1 (A− λk Id) for k = 1, . . . , n− 1. (4.98b)

Proof. Note that (4.98) can be extended to k = n, yielding

Mn =n∏

k=1

(A− λk Id) = χA(A) = 0,

5 STABILITY 84

since each matrix annihilates its characteristic polynomial according to the Cayley-Hamilton theorem (cf. [Koe03, Th. 8.4.6] or [Str08, Th. 26.6]). Also note

∀k=0,...,n−1

AMk =Mk(A− λk+1 Id) + λk+1Mk =Mk+1 + λk+1Mk. (4.99)

We have to show that x 7→ Φ(x) :=∑n−1

k=0 pk+1(x)Mk solves the initial value problemY ′ = AY , Y (0) = Id. The initial condition is satisfied, as Φ(0) = p1(0)M0 = Id, andthe ODE is satisfied, as, for each x ∈ R,

Φ′(x)− AΦ(x) =n−1∑

k=0

p′k+1(x)Mk − A

n−1∑

k=0

pk+1(x)Mk

(4.97), (4.99)= λ1p1(x)M0 +

n−1∑

k=1

(

λk+1pk+1(x) + pk(x))

Mk

−n−1∑

k=0

pk+1(x)(

Mk+1 + λk+1Mk

)

= −pn(x)Mn = 0,

completing the proof.

5 Stability

5.1 Qualitative Theory, Phase Portraits

In the qualitative theory of ODE, which can be seen as part of the field of dynamicalsystems, the idea is to understand the set of solutions to an ODE (or to a class of ODE),if possible, without making use of explicit solution formulas, which, in most situations,are not available anyway. Examples of qualitative questions are if, and under whichconditions, solutions to an ODE are constant, periodic, are unbounded, approach somelimit (more generally, the solutions’ asymptotic behavior), etc. One often thinks of thesolutions as depending on a time-like variable, and then qualitative theory typicallymeans disregarding the speed of change, but rather focusing on the shape/geometry ofthe solution’s image.

The topic of stability takes continuity in intial conditions further and investigates thebehavior of solutions that are, at least initially, close to some given solution. Underwhich conditions do nearby solutions approach each other or diverge away from eachother, show the same or different asymptotic behavior etc.

Even though the abovedescribed considerations are not limited to this situation, a nat-ural starting point is to consider first-order ODE where the right-hand side does notdepend on x. In the following, we will mostly be concerned with this type of ODE,which has a special name:

5 STABILITY 85

Definition 5.1. If Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn, then the n-dimensional first-orderODE

y′ = f(y) (5.1)

is called autonomous and Ω is called the phase space.

Remark 5.2. In fact, nonautonomous ODE are not really more general than au-tonomous ODE, due to the, perhaps, surprising Th. J.1 of the Appendix, which statesthat every nonautonomous ODE is equivalent to an autonomous ODE. However, this factis of little practical relevance, since the autonomous ODE arising via Th. J.1 from nonau-tonomous ODE can never have bounded solutions on unbounded intervals, whereas thetheory of autonomous ODE is most powerful and useful for ODE that admit boundedsolutions on unbounded intervals (such as constant or periodic solutions, or solutionsapproaching constant or periodic functions).

Lemma 5.3. If, in the context of Def. 5.1, φ : I −→ Kn is a solution to (5.1), definedon the interval I ⊆ R, then

∀ξ∈R

φξ : I − ξ −→ Kn, φξ(x) := φ(x+ ξ), where I − ξ := x− ξ ∈ R : x ∈ I,(5.2)

is another solution to (5.1). In consequence, if φ is a maximal solution, then so is φξ.

Proof. Clearly, I − ξ is an interval. Note x ∈ I − ξ ⇒ x + ξ ∈ I and, since φ is asolution to (5.1), it is φ(I) ⊆ Ω, implying φξ(I − ξ) ⊆ Ω. Finally,

∀x∈I−ξ

φ′ξ(x) = φ′(x+ ξ) = f

(

φ(x+ ξ))

= f(

φξ(x))

,

completing the proof that φξ is a solution. Since each extension of φ yields an extensionof φξ and vice versa, φ is a maximal solution if, and only if, φξ is a maximal solution.

Lemma 5.4. If Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn is such that (5.1) admits uniquemaximal solutions (f being locally Lipschitz on Ω open is sufficient, but not necessary,cf. Def. 3.32), then the global solution Y : Df −→ Kn of (5.1) satisfies

(a) Y (x, ξ, η) = Y (x− ξ, 0, η) for each (x, ξ, η) ∈ Df .

(b) Y(

x, 0, Y (x, 0, η))

= Y (x+x, 0, η) for each (x, x, η) ∈ R×R×Kn such that (x, 0, η) ∈Df and

(

x, 0, Y (x, 0, η))

∈ Df .

Proof. (a): If ψ : Iξ,η −→ Kn and φ : I0,η −→ Kn denote the maximal solutions tothe initial data y(ξ) = η and y(0) = η, respectively, then (a) claims, using the notationfrom Lem. 5.3, ψ = φ−ξ. As a consequence of Lem. 5.3, φ−ξ : I0,η + ξ −→ Kn, is somemaximal solution to (5.1) and, since φ−ξ(ξ) = φ(0) = η = ψ(ξ), the assumed uniquenessyields the claimed ψ = φ−ξ, in particular, Iξ,η = I0,η + ξ.

(b): Let η := Y (x, 0, η). If ψ : I0,η −→ Kn and φ : I0,η −→ Kn denote the maximalsolutions to the initial data y(0) = η and y(0) = η, respectively, then (b) claims ψ = φx.As a consequence of Lem. 5.3, φx : I0,η − x −→ Kn, is some maximal solution to (5.1)and, since φx(0) = φ(x) = η = ψ(0), the assumed uniqueness yields the claimed ψ = φx,in particular, I0,η = I0,η − x.

5 STABILITY 86

Definition 5.5. Let I ⊆ R be an interval and φ : I −→ S (in principle, S can bearbitrary).

(a) The image of I under φ, i.e.

O(φ) := φ(I) = φ(x) : x ∈ I ⊆ S (5.3)

is often referred to as the orbit of φ in the present context of qualitative ODE theory.

(b) φ : R −→ S (note I = R) is called periodic if, and only if, there exists a smallestω > 0 (called the period of φ) such that

∀x∈R

φ(x+ ω) = φ(x). (5.4)

The requirement ω > 0 means constant functions are not periodic in the sense ofthis definition.

Lemma 5.6. Let φ : R −→ Kn, n ∈ N.

(a) If φ is continuous and (5.4) holds for some ω > 0, then φ is either constant orperiodic in the sense of Def. 5.5(b).

(b) (a) is false without the assumption of φ being continuous.

Proof. Exercise.

Definition 5.7. Let Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn. In the context of theautonomous ODE (5.1), the zeros of f are called the fixed points of the ODE (5.1) (cf.Lem. 5.8 below). One then sometimes uses the notation

F := Ff := η ∈ Ω : F (η) = 0 (5.5)

for the set of fixed points.

Lemma 5.8. Let Ω ⊆ Kn, n ∈ N, f : Ω −→ Kn, η ∈ Ω. Then the following statementsare equivalent:

(i) f(η) = 0, i.e. η is a fixed point of (5.1).

(ii) φ : R −→ Kn, φ ≡ η, is a solution to (5.1).

Proof. If f(η) = 0 and φ ≡ η, then φ′(x) = 0 = f(φ(x)) for each x ∈ R, i.e. (i) implies(ii). Conversely, if φ ≡ η is a solution to (5.1), then f(η) = f(φ(x)) = φ′(x) = 0, i.e. (ii)implies (i).

Proposition 5.9. If Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn is such that (5.1) admitsunique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, formaximal solutions φ1 : I1 −→ Kn, φ2 : I2 −→ Kn to (5.1), defined on open intervalsI1, I2, respectively, precisely one of the following two statements (i) and (ii) is true:

5 STABILITY 87

(i) O(φ1) ∩ O(φ2) = ∅, i.e. the solutions have disjoint orbits.

(ii) There exists ξ ∈ R such that

I2 = I1 − ξ and ∀x∈I2

φ2(x) = φ1(x+ ξ). (5.6)

In particular, it follows in this case that O(φ1) = O(φ2), i.e. the solutions havethe same orbit.

Proof. Suppose (i) does not hold. Then there are x1 ∈ I1 and x2 ∈ I2 such thatφ1(x1) = φ2(x2). Define ξ := x1 − x2 and consider

φ : I1 − ξ −→ Kn, φ(x) := φ1(x+ ξ). (5.7)

Then φ is a maximal solution of (5.1) by Lem. 5.3 and φ(x2) = φ1(x1) = φ2(x2). Byuniqueness of maximal solutions, we obtain φ = φ2, in particular, I2 = I1 − ξ, proving(5.6). Clearly, (5.6) implies O(φ1) = O(φ2).

Proposition 5.10. If Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn is such that (5.1) admitsunique maximal solutions (f being locally Lipschitz on Ω open is sufficient), then, foreach maximal solution φ : I −→ Kn to (5.1), defined on the open interval I, preciselyone of the following three statements is true:

(i) φ is injective.

(ii) I = R and φ is periodic.

(iii) I = R and φ is constant (in this case η := φ(0) is a fixed point of (5.1)).

Proof. Clearly, (i) – (iii) are mutually exclusive. Suppose (i) does not hold. Then thereexist x1, x2 ∈ I, x1 < x2, such that φ(x1) = φ(x2). Set ω := x2 − x1. According to Lem.5.3, ψ : I − ω −→ Kn, ψ(x) := φ(x + ω), must also be a maximal solution to (5.1).Since ψ(x1) = φ(x1 + ω) = φ(x2) = φ(x1), uniqueness implies ψ = φ and I = I − ω.As ω > 0, this means I = R and the validity of (5.4). As φ is also continuous, by Lem.5.6(a), either (ii) or (iii) must hold.

Corollary 5.11. If Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn is such that (5.1) admits uniquemaximal solutions (f being locally Lipschitz on Ω open is sufficient), then the orbitsof maximal solutions to (5.1) partition the phase space Ω into disjoint sets. Moreover,every point η ∈ Ω is either a fixed point, or it belongs to some periodic orbit, or it belongsto the orbit of some injective solution.

Proof. The corollary merely summarizes Prop. 5.9 and Prop. 5.10.

Definition 5.12. In the situation of Cor. 5.11, a phase portrait for (5.1) is a sketchshowing representative orbits. Thus, the sketch shows subsets of the phase space Ω,including fixed points (if any) and representative periodic solutions (if any). Usually,one also uses arrows to indicate the direction in which each drawn orbit is traced as thevariable x increases.

5 STABILITY 88

Example 5.13. Even though it is a main goal of qualitative theory to obtain phaseportraits without the need of explicit solution formulas, and we will study techniquesfor accomplishing this below, we will make use of explicit solution formulas for our firsttwo examples of phase portraits.

(a) Consider the autonomous linear ODE

(

y′1y′2

)

=

(

−y2y1

)

. (5.8)

Here we have Ω = R2 and f : Ω −→ Ω, f(y1, y2) = (−y2, y1). The only fixed pointis (0, 0). Clearly, for each r > 0, φ : R −→ R2, φ(x) := (r cos x, r sin x) is a solutionto (5.8) and its orbit is the circle with radius r around the origin. Since every pointof Ω belongs to such a circle, every orbit is either the origin or a circle around theorigin. Thus, the phase portrait consists of such circles plus the origin and arrowsthat indicate the circles are traversed counterclockwise.

(b) As compared to the previous one, the phase portrait of the autonomous linear ODE

(

y′1y′2

)

=

(

y2y1

)

(5.9)

is more complicated: While (0, 0) is still the only fixed point, for each r > 0, all thefollowing functions φ1, φ2, φ3, φ4 : R −→ R2 are solutions:

φ1(x) := (r cosh x, r sinh x), (5.10a)

φ2(x) := (−r cosh x,−r sinh x), (5.10b)

φ3(x) := (r sinh x, r cosh x), (5.10c)

φ4(x) := (−r sinh x,−r cosh x), (5.10d)

each type describing a hyperbolic orbit in some section of the plane R2. Thesesections are separated by rays, forming the orbits of the solutions φ5, φ6, φ7, φ8 :R −→ R2:

φ5(x) := (ex, ex), (5.10e)

φ6(x) := (−ex,−ex), (5.10f)

φ7(x) := (e−x,−e−x), (5.10g)

φ8(x) := (−e−x, e−x). (5.10h)

The two rays on (y1, y1) : y1 6= 0 move away from the origin, whereas the two rayson (y1,−y1) : y1 6= 0 move toward the origin. The hyperbolic orbits asymptoti-cally approach the ray orbits and are traversed such that the flow direction agreesbetween approaching orbits.

—

5 STABILITY 89

The next results will be useful to obtain new phase portraits from previously knownphase portraits in certain situations.

Proposition 5.14. Let Ω ⊆ Kn, n ∈ N, let I ⊆ R be some nontrivial interval, letf : Ω −→ Kn, and let φ : I −→ Kn be a solution to

y′ = γ(x) f(y), (5.11)

where γ : I −→ R is continuous. If φ′(x) 6= 0 for each x ∈ I (if one thinks of x astime, then one can think of φ′ as the velocity of φ), then there exists a continuouslydifferentiable bijective map λ : J −→ I, defined on some nontrivial interval J , such that(φ λ) : J −→ Kn is a solution to y′ = f(y).

Proof. Since φ′(x) 6= 0 for each x ∈ I, one has γ(x) 6= 0 for each x ∈ I. As γ is alsocontinuous, it must be either always negative or always positive. In consequence, fixingx0 ∈ I, Γ : I −→ R, Γ(x) :=

∫ x

x0γ(t) dt , is continuous and either strictly increasing or

strictly decreasing. In particular, Γ is injective, J := Γ(I) is an interval, and Γ : I −→ Jis bijective. The desired function λ is λ := Γ−1 : J −→ I. Indeed, according to [Phi16,Th. 9.9], λ is differentiable and its derivative is the continuous function

λ′ : J −→ R, λ′(x) =1

γ(

λ(x)) ,

implying, for each x ∈ J ,

(φ λ)′(x) = φ′(

λ(x))

λ′(x) = γ(

λ(x))

f(

φ(λ(x))) 1

γ(

λ(x)) = f

(

φ(λ(x)))

,

showing φ λ is a solution to y′ = f(y) as required.

Proposition 5.15. Let Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn. Moreover, consider acontinuous function h : Ω −→ R with the property that either h > 0 everywhere on Ωor h < 0 everywhere on Ω.

(a) If f has no zeros (i.e. F = ∅), then the ODE

y′ = f(y), (5.12a)

y′ = h(y) f(y) (5.12b)

have precisely the same orbits, i.e. every orbit of a solution to (5.12a) is an orbitof a solution to (5.12b) and vice versa.

(b) If f and h are such that the ODE (5.12) admit unique maximal solutions, then theODE (5.12) have precisely the same orbits (even if F 6= ∅).

Proof. (a): If φ : I −→ Kn is a solution to (5.12b), then γ := h φ is well-definedand continuous. Since F = ∅ implies φ′ 6= 0, we can apply Prop. 5.14 to obtainthe existence of a bijective λ1 : J1 −→ I such that φ λ1 is a solution to (5.12a).

5 STABILITY 90

Thus, O(φ) = O(φ λ1). Conversely, if ψ : I −→ Kn is a solution to (5.12a), i.e. to

y′ = h(y)h(y)

f(y), then γ := 1/(h ψ) is well-defined and continuous. Since F = ∅ implies

ψ′ 6= 0, we can apply Prop. 5.14 to obtain the existence of a bijective λ2 : J2 −→ I suchthat ψ λ2 is a solution to (5.12b). Thus, O(ψ) = O(ψ λ2).(b): We are now in the situation of Prop. 5.10 and Cor. 5.11, and from (a) we know everynonconstant orbit of (5.12a) is a nonconstant orbit of (5.12b) and vice versa. However,since h > 0 or h < 0, both ODE in (5.12) have precisely the same constant solutions,concluding the proof.

Remark 5.16. We apply Prop. 5.15 to phase portraits (in particular, assume uniquemaximal solutions). Prop. 5.15 says that overall multiplication with a continuous pos-itive function h does not change the phase portrait at all. Moreover, Prop. 5.15 alsostates that overall multiplication with a continuous negative function h does not changethe partition of Ω into solution orbits. However, after multiplication with a negative h,the orbits are clearly traversed in the opposite direction, i.e., for negative h, the arrowsin the phase portrait have to be reversed. For a general continuous h, this implies thephase portrait remains the same in each region of Ω, where h > 0; it remains the same,except for the arrows reversed, in each region of Ω, where h < 0; and the zeros of hadd additional fixed points, cutting some of the previous orbits. We summarize how toobtain the phase portrait of (5.12b) from that of (5.12a):

(1) Start with the phase portrait of (5.12a).

(2) Add the zeros of h as additional fixed points (if any). Previous orbits are cut, wherefixed points are added.

(3) Reverse the arrows where h < 0.

Example 5.17. (a) Consider the ODE

y′1 = −y2(

(y1 − 1)2 + y22)

,

y′2 = y1(

(y1 − 1)2 + y22)

,(5.13)

which comes from multiplying the right-hand side of (5.8) by h(y) = (y1− 1)2+ y22.The phase portrait is the same as the one for (5.8), except for the added fixed pointat (1, 0).

(b) Consider the ODEy′1 = −y1 y2 + y22,

y′2 = −y1 y2 + y21,(5.14)

which comes from multiplying the right-hand side of (5.8) by h(y) = y1 − y2. Thephase portrait is obtained from that of (5.8), where additional fixed points are onthe line with y1 = y2. This line cuts each previously circular orbit into two segments.The arrows have to be reversed for y2 > y1, that means above the y1 = y2 line.

5 STABILITY 91

Definition 5.18. Let Ω ⊆ Rn, n ∈ N, and f : Ω −→ Rn. A function E : Ω −→ R iscalled an integral for the autonomous ODE (5.1), i.e. for y′ = f(y), if, and only if, E φis constant for every solution φ of (5.1).

Lemma 5.19. Let Ω ⊆ Rn be open, n ∈ N, and f : Ω −→ Rn such that each initialvalue problem for (5.1) has at least one solution (f continuous is sufficient by Th. 3.8).Then a differentiable function E : Ω −→ R is an integral for (5.1) if, and only if,

∀y∈Ω

(∇E)(y) • f(y) =n∑

j=1

∂jE(y) fj(y) = 0. (5.15)

Proof. Let φ : I −→ Rn be a solution to y′ = f(y). Then, by the chain rule,

∀x∈I

(E φ)′(x) = (∇E)(φ(x)) • φ′(x) = (∇E)(φ(x)) • f(φ(x)). (5.16)

The differentiable function E φ : I −→ R is constant on the interval I if, and only if,(E φ)′ ≡ 0. Thus, by (5.16), E φ being constant for every solution φ is equivalent to(

(∇E) • f)

(y) = 0 for each y ∈ Ω such that at least one solution passes through y.

Example J.2 of the Appendix, pointed out by Anton Sporrer, shows the hypothesis ofLem. 5.19, that each initial value problem for (5.1) has at least one solution, can not beomitted. The following Prop. 5.20 makes use of integrals and applies to phase portraitsof 2-dimensional real ODE:

Proposition 5.20. Let Ω ⊆ R2 be open, and let f : Ω −→ R2 be continuous andsuch that (5.1) admits unique maximal solutions (f being locally Lipschitz is sufficient).Assume E : Ω −→ R to be a continuously differentiable integral for (5.1), i.e. fory′ = f(y), satisfying ∇E(y) 6= 0 for each y ∈ Ω. Then the following statements holdfor each maximal solution φ : I −→ R2 of (5.1) (I ⊆ R some open interval):

(a) If (xm)m∈N is a sequence in I such that limm→∞ φ(xm) = η ∈ Ω, then η ∈ F (i.e. ηis a fixed point) or η ∈ O(φ) (i.e. there exists ξ ∈ I with φ(ξ) = η).

(b) Let C ∈ R be such that E φ ≡ C (such a C exists, as E is an integral). IfE−1C = y ∈ Ω : E(y) = C is compact and E−1C ∩ F = ∅, then φ isperiodic.

Proof. Throughout the proof let C be as in (b), i.e. E φ ≡ C.

(a): The continuity of E yields E(η) = limm→∞E(φ(xm)) = C. Moreover, by hypoth-esis, (ǫ1, ǫ2) := ∇E(η) 6= (0, 0). We proceed with the proof for ǫ2 6= 0 – if ǫ2 = 0and ǫ1 6= 0, then the roles of the indices 1, 2 have to be switched in the following. Weapply the implicit function theorem [Phi15, Th. C.9] to the function f : Ω −→ R,f(y) := E(y)−C at its zero η = (η1, η2). By [Phi15, Th. C.9], there exist ǫ, δ > 0 and acontinuously differentiable map g : Ig −→ R, Ig :=]η1 − δ, η1 + δ[, such that g(η1) = η2,

∀s∈Ig

E(

s, g(s))

= C, (5.17a)

5 STABILITY 92

and, having fixed some arbitrary norm ‖ · ‖ on R2,

∀y∈Ω

(

(

‖y − η‖ < ǫ ∧ E(y) = C)

⇒ ∃s∈Ig

y =(

s, g(s))

)

. (5.17b)

We now assume η /∈ F and show η ∈ O(φ). If η /∈ F , then f(η) 6= 0 and the continuityof f and g imply there is δ > 0, δ ≤ δ, such that, for each s ∈ I :=]η1 − δ, η1 + δ[,f(s, g(s)) 6= 0. Define the auxiliary function ϕ : I −→ Ω, ϕ(s) = (s, g(s)). SinceE ϕ ≡ C, we can employ the chain rule to conclude

∀s∈I

0 = (E ϕ)′(s) = (∇E)(ϕ(s)) • ϕ′(s), (5.18)

i.e. the two-dimensional vectors (∇E)(ϕ(s)) and ϕ′(s) are orthogonal with respect tothe Euclidean scalar product. As E is an integral, using (5.15), f(ϕ(s)) is another vectororthogonal to (∇E)(ϕ(s)) and, since all vectors in R2 orthogonal to (∇E)(ϕ(s)) forma 1-dimensional subspace of R2 (recalling (∇E)(ϕ(s)) 6= 0), there exists γ(s) ∈ R suchthat

ϕ′(s) = γ(s)f(ϕ(s)) (5.19)

(note f(ϕ(s)) 6= 0 as s ∈ I). We can now apply Prop. 5.14, since (5.19) says ϕ is asolution to (5.11), the function γ : I −→ R, s 7→ γ(s) = ϕ′(s)/f(ϕ(s)) is continuous,and ϕ′(s) = (1, g′(s)) 6= (0, 0) for each s ∈ I. Thus, Prop. 5.14 provides a bijectiveλ : J −→ I, such that ϕ λ is a solution to y′ = f(y).

As we assume limm→∞ φ(xm) = η, there existsM ∈ N such that ‖φ(xm)−η‖ < ǫ for eachm ≥ M . Since E(φ(xm)) = C also holds, (5.17b) implies the existence of a sequence(sm)m∈N in I such that φ(xm) = (sm, g(sm)) for each m ≥ M . Then, for each m ≥ Mand τm := λ−1(sm), (ϕλ)(τm) = ϕ(sm) = φ(xm). On the other hand, for τ0 := λ−1(η1),(ϕ λ)(τ0) = ϕ(η1) = η, showing φ(xm), η ∈ O(ϕ λ). Since φ(xm) ∈ O(φ) as well,Prop. 5.9 implies O(ϕ λ) ⊆ O(φ), i.e. η ∈ O(φ), which proves (a). In preparation for(b), we also observe that ‖φ(xm) − η‖ < ǫ for each m ≥ M implies the sm for m ≥ Mall are in some compact interval I1 with η1 ∈ I1, implying the τm to be in the compactinterval J1 := λ−1[I1] with τ0 ∈ J1. We will use for (b) that J1 is bounded.

(b): As we have O(φ) ⊆ E−1C according to the choice of C, the assumed compactnessof E−1C and Prop. 3.24 show φ can only be maximal if it is defined on all of R (since(x, φ(x)) must escape every compact [−m,m]×E−1C, m ∈ N, on the left and on theright). Using the compactness of E−1C a second time, we obtain the existence of asequence (xm)m∈N in R such that limm→∞ xm = ∞ and limm→∞ φ(xm) = η ∈ E−1C.So we see that we are in the situation of (a). Let ψ be the maximal extension of thesolution ϕ λ constructed in the proof of (a). Then we know O(ψ)∩O(φ) 6= ∅ from theproof of (a) and, since ψ and φ both are maximal, Prop. 5.9 implies O(ψ) = O(φ) and,more importantly for us here, there exists ξ ∈ R such that ψ(x) = φ(x+ξ) for each x ∈ R.Let m ≥ M with M from the proof of (a). If ξ 6= 0, then φ(xm) = ψ(τm) = φ(xm + ξ)shows φ is not injective. If ξ = 0, then φ = ψ and φ(xm) = φ(τm). Since the τm arebounded, whereas the xm are unbounded, xm = τm cannot be true for all m, againshowing φ is not injective. Since E−1C ∩ F = ∅, φ cannot be constant, therefore itmust be periodic by Prop. 5.10.

5 STABILITY 93

Example 5.21. Using condition (5.15), i.e. ∇E • f ≡ 0, one readily verifies that thefunctions

E : R2 −→ R, E(y1, y2) := y21 + y22, (5.20a)

E : R2 −→ R, E(y1, y2) := y21 − y22, (5.20b)

are integrals for (5.8) and (5.9), i.e. for(

y′1y′2

)

=

(

−y2y1

)

and(

y′1y′2

)

=

(

y2y1

)

,

respectively, and we recover the respective phase portraits via the respective level curvesE(y1, y2) = C, C ∈ R.

Example 5.22. Consider the autonomous ODE(

y′1y′2

)

=

(

2y1y21− 2y21

)

. (5.21)

We claim thatE : R2 −→ R, E(y1, y2) := y1 e

−(y21+y22), (5.22)

is an integral for (5.21) and intend to use Prop. 5.20 to establish (5.21) has orbits thatare fixed points, orbits that are periodic, and orbits that are neither. To verify E is anintegral, one computes, for each (y1, y2) ∈ R2,

∇E(y1, y2) • (2y1y2, 1− 2y21)

=(

e−(y21+y22) − 2y21e

−(y21+y22), −2y1y2e

−(y21+y22))

• (2y1y2, 1− 2y21)

= e−(y21+y22) (1− 2y21,−2y1y2) • (2y1y2, 1− 2y21) = 0.

Clearly, the set of fixed points is

F =

(

− 1√2, 0

)

,

(

1√2, 0

)

.

The level set of 0 is E−10 = (0, y2) : y2 ∈ R, i.e. it is the y2-axis. This is anonperiodic orbit (actually, the orbit of solutions of the form φ : R −→ R2, φ(x) :=(0, x+ c), c ∈ R). Now consider the level set

E−1e−1 =

(y1, y2) : y1 > 0, y22 = ln y1 − y21 + 1

.

Using g : R+ −→ R, g(y1) = ln y1 − y21 + 1, and its derivative, it is not hard to showg has precisely two zeros, namely λ1 = 1 and 0 < λ2 < 1, and g ≥ 0 precisely on thecompact interval J := [λ2, 1], implying E−1e−1 =

(

y1,±√

g(y1))

: y1 ∈ J

, showingE−1e−1 is compact. According to Prop. 5.20(b), E−1e−1 must consist of one ormore periodic orbits.

5 STABILITY 94

5.2 Stability at Fixed Points

Given an autonomous ODE with a fixed point p, we will investigate the question underwhat conditions a solution φ(x) starting out near p will remain near p as x increases ordecreases.

To simplify notation, we will restrict ourselves to initial data y(0) = y0, which, in lightof Lem. 5.4(b), is not an essential restriction.

Notation 5.23. Let Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn such that

y′ = f(y) (5.23)

admits unique maximal solutions (f being locally Lipschitz on Ω open is sufficient). LetY : Df −→ Kn denote the general solution to (5.23) and define

Y : Df,0 −→ Kn, Y (x, η) := Y (x, 0, η),

Df,0 := (x, η) ∈ R×Kn : (x, 0, η) ∈ Df.(5.24)

Definition 5.24. Let Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn such that (5.23) admits uniquemaximal solutions (f being locally Lipschitz on Ω open is sufficient). Moreover, assumethe set of fixed points to be nonempty, F 6= ∅, and let p ∈ F . The fixed point p is saidto be positively (resp. negatively) stable if, and only if, the following conditions (i) and(ii) hold:

(i) There exists r > 0 such that, for each η ∈ Ω with ‖η − p‖ < r, the maximalsolution x 7→ Y (x, η) (cf. (5.24)) is defined on (a superset of) R+

0 (resp. R−0 ).

(ii) For each ǫ > 0, there exists δ > 0 such that, for each η ∈ Ω,

‖η − p‖ < δ ⇒ ∀x≥0

(resp. x ≤ 0)

‖Y (x, η)− p‖ < ǫ. (5.25)

The fixed point p is said to be positively (resp. negatively) asymptotically stable if, andonly if, (i) and (ii) hold plus the additional condition

(iii) There exists γ > 0 such that, for each η ∈ Ω,

‖η − p‖ < γ ⇒ limx→∞

Y (x, η) = p (resp. limx→−∞

Y (x, η) = p). (5.26)

The norm ‖ · ‖ on Kn used in (i) – (iii) above is arbitrary. Due to the equivalence ofnorms on Kn, changing the norm does not change the defined stability properties, eventhough, in general, it does change the sizes of r, δ, γ.

Remark 5.25. In the situation of Def. 5.24, consider the time-reversed version of (5.23),i.e.

y′ = −f(y). (5.27)

According to Lem. 1.9(b), (5.27) has the general solution

Y : D−f,0 −→ Kn, Y (x, η) := Y (−x, η),D−f,0 = (x, η) ∈ R×Kn : (−x, η) ∈ Df,0.

(5.28)

5 STABILITY 95

(a) Clearly, for a fixed point p ∈ F , we have the following equivalences:

p is positively stable for (5.23) ⇔ p is negatively stable for (5.27),

p is negatively stable for (5.23) ⇔ p is positively stable for (5.27).

(b) Clearly, for a fixed point p ∈ F , we have the following equivalences:

p is pos. asympt. stable for (5.23) ⇔ p is neg. asympt. stable for (5.27),

p is neg. asympt. stable for (5.23) ⇔ p is pos. asympt. stable for (5.27).

Lemma 5.26. Consider the situation of Def. 5.24 with f : Ω −→ Kn continuous onΩ ⊆ Kn open. Then the fixed point p is positively (resp. negatively) stable if, and onlyif, for each ǫ > 0, there exists δ > 0 such that, for each η ∈ Ω,

‖η − p‖ < δ ⇒ ∀x∈I(0,η)∩R

+0

(resp. x ∈ I(0,η) ∩ R−

0 )

‖Y (x, η)− p‖ < ǫ, (5.29)

where I(0,η) denotes the domain of the maximal solution Y (·, η).

Proof. Clearly, stability in the sense of Def. 5.24 implies (5.29), and it merely remainsto show that (5.29) implies Def. 5.24(i). As (5.29) holds, we can consider ǫ := 1 andobtain a corresponding δ =: r. Then (5.29) states that, for each η ∈ Ω with ‖η−p‖ < r,for x ≥ 0 (resp. for x ≤ 0), the maximal solution Y (x, η) remains in the compact setB1(p). Since f : Ω −→ Kn is continuous on Ω ⊆ Kn open, Th. 3.28 implies R+

0 ⊆ I(0,η)(resp. R−

0 ⊆ I(0,η)), proving Def. 5.24(i).

It is an exercise to show Lem. 5.26 becomes false if the hypothesis that f be continuousis omitted.

Example 5.27. (a) Consider the 1-dimensional R-valued ODE

y′ = y(y − 1). (5.30)

The set of fixed points is F = 0, 1. Moreover, Y ′(·, η) < 0 for 0 < η < 1 andY ′(·, η) > 0 for η ∈]−∞, 0[∪]1,∞[. It follows that, for p = 0, the positive stabilitypart of (5.29) holds (where, given ǫ > 0, one can choose δ := min1, ǫ). Moreover,for η < 0 and 0 < η < 1, one has limx→∞ Y (x, η) = 0. Thus, all three conditions ofDef. 5.24 are satisfied and 0 is positively asymptotically stable. Analogously, onesees that 1 is negatively asymptotically stable.

(b) For the R2-valued ODE of (5.8), (0, 0) is a fixed point that is positively and neg-atively stable, but neither positively nor negatively asymptotically stable. For theR2-valued ODE of (5.9), (0, 0) is a fixed point that is neither positively nor nega-tively stable.

5 STABILITY 96

(c) Consider the 1-dimensional R-valued ODE

y′ = y2. (5.31)

The only fixed point is 0, which is neither positively nor negatively stable. Indeed,not even Def. 5.24(i) is satisfied: One obtains

Y : Df,0 −→ R, Y (x, η) :=η

1− ηx,

where

Df,0 =(R× 0)∪

(x, η) ∈ R2 : η > 0, x ∈]−∞, 1/η[

∪

(x, η) ∈ R2 : η < 0, x ∈]1/η,∞[

,

showing every neighborhood of 0 contains η such that Y (·, η) is not defined on allof R+

0 and η such that Y (·, η) is not defined on all of R−0 .

Remark 5.28. There exist examples of autonomous ODE that show fixed points cansatisfy Def. 5.24(iii) without satisfying Def. 5.24(ii). For example, [Aul04, Ex. 7.4.16]provides the following ODE in polar coordinates (r, ϕ):

r′ = r (1− r), (5.32a)

ϕ′ =1− cosϕ

2= sin2 ϕ

2. (5.32b)

Even though it is somewhat tedious, one can show that its fixed point (1, 0) satisfies Def.5.24(iii) without satisfying Def. 5.24(ii) (see Claim 4 of Example K.2 in the Appendix).

—

We will now study a method that allows, in certain cases, to determine the stabilityproperties of a fixed point without having to know the solutions to an ODE. The methodis known as Lyapunov’s method. The key ingredient to this method is a test function V ,known as a Lyapunov function. Once a Lyapunov function is known, stability is ofteneasily tested. The catch, however, is that Lyapunov functions can be hard to find. Fromthe literature, it appears there is no definition for an all-purpose Lyapunov function, asa suitable choice depends on the circumstances.

Definition 5.29. Let Ω0 ⊆ Rn be open, n ∈ N. A function V : Ω0 −→ R is said tobe positive (resp. negative) definite at p ∈ Ω0 if, and only if, the following conditions (i)and (ii) hold:

(i) V (y) ≥ 0 (resp. V (y) ≤ 0) for each y ∈ Ω0.

(ii) V (y) = 0 if, and only if, y = p.

5 STABILITY 97

Theorem 5.30 (Lyapunov). Consider the situation of Def. 5.24 with K = R, Ω ⊆ Rn

open, and f : Ω −→ Rn continuous. Let Ω0 be open with p ∈ Ω0 ⊆ Ω ⊆ Rn. AssumeV : Ω0 −→ R to be continuously differentiable and define

V : Ω0 −→ R, V (y) := (∇V )(y) • f(y) =n∑

j=1

∂jV (y) fj(y). (5.33)

If V is positive definite at p and V ≤ 0 (resp. V ≥ 0) on Ω0, then p is positively (resp.negatively) stable. If, in addition, V is negative (resp. positive) definite at p, then p ispositively (resp. negatively) asymptotically stable.

Proof. The proof is carried out for the case of postive (asymptotic) stability; the prooffor the case of negative (asymptotic) stability is then easily obtained by reversing time,i.e. by using Rem. 5.25 together with noting V changing its sign when replacing f with−f . Fix your favorite norm ‖·‖ on Rn. Let r > 0 such that Br(p) = y ∈ Rn : ‖y−p‖ ≤r ⊆ Ω0 (such an r > 0 exists, as Ω0 is open). Define

k : ]0, r] −→ R+, k(ǫ) := min

V (y) : ‖y − p‖ = ǫ

, (5.34)

where k is well-defined, since the continuous function V assumes its min on compactsets, and k(ǫ) > 0 by the positive definiteness of V . Given ǫ ∈]0, r], since V (p) = 0,k(ǫ) > 0, and V continuous,

∃0<δ(ǫ)<ǫ

∀y∈Bδ(ǫ)(p)

V (y) < k(ǫ), (5.35)

where we used Not. 3.3 to denote an open ball with center p with respect to ‖ · ‖.We now claim that, for each η ∈ Bδ(ǫ)(p), the maximal solution x 7→ φ(x) := Y (x, η)must remain inside Bǫ(p) for each x ≥ 0 in its domain I(0,η) (implying p to be positivelystable by Lem. 5.26). Seeking a contradiction, assume there exists ξ ≥ 0 such that‖φ(ξ)− p‖ ≥ ǫ and let

s := sup

x ≥ 0 : φ(t) ∈ Bǫ(p) for each t ∈ [0, x]

≤ ξ <∞. (5.36)

The continuity of φ then implies ‖φ(s)− p‖ = ǫ, i.e.

V (φ(s)) ≥ k(ǫ) (5.37)

by the definition of k(ǫ). On the other hand, by the chain rule (V φ)′(x) = V (φ(x))(cf. (5.16)), such that V ≤ 0 implies

V (φ(s)) = V (η) +

∫ s

0

V (φ(x)) dx ≤ V (η)(5.35)< k(ǫ), (5.38)

in contradiction to (5.37), proving φ(x) ∈ Bǫ(p) for each x ∈ I(0,η)∩R+0 and the positive

stability of p.

5 STABILITY 98

For the remaining part of the proof, we additionally assume V to be negative definiteat p, while continuing to use the notation from above. Set γ := δ(r). We have to showlimx→∞ Y (x, η) = p for each η ∈ Bγ(p), i.e.

∀ǫ∈]0,r]

∃ξǫ≥0

∀x≥ξǫ

‖Y (x, η)− p‖ < ǫ. (5.39)

So fix η ∈ Bγ(p) and, as above, let φ(x) := Y (x, η). Given ǫ ∈]0, r], we first claim thatthere exists ξǫ ≥ 0 such that φ(ξǫ) ∈ Bδ(ǫ)(p), where δ(ǫ) is as in the first part of theproof above. Indeed, seeking a contradiction, assume ‖φ(x) − p‖ ≥ δ(ǫ) for all x ≥ 0,and set

α := max

V (y) : δ(ǫ) ≤ ‖y − p‖ ≤ r

. (5.40)

Then α < 0 due to the negative definiteness of V at p. Moreover, due to the choice ofγ, we have δ(ǫ) ≤ ‖φ(x)− p‖ ≤ r for each x ≥ 0, implying

∀x≥0

0 ≤ V (φ(x)) = V (η) +

∫ x

0

V (φ(t)) dt ≤ V (η) + αx, (5.41)

which is the desired contradiction, as α < 0 implies the right-hand side to go to −∞ forx→ ∞. Thus, we know the existence of ξǫ such that ηǫ := φ(ξǫ) ∈ Bδ(ǫ)(p).

To finish the proof, we recall from the first part of the proof that ‖Y (x, ηǫ)− p‖ < ǫ foreach x ≥ 0. Using Lem. 5.4(a), we obtain

∀x≥0

φ(ξǫ + x) = Y (ξǫ + x, ξǫ, ηǫ) = Y (ξǫ + x− ξǫ, ηǫ) = Y (x, ηǫ) ∈ Bǫ(p), (5.42)

showing ‖φ(x)− p‖ < ǫ for each x ≥ ξǫ as needed.

Example 5.31. Let k,m ∈ N and α, β > 0. We claim that (0, 0) is a positivelyasymptotically stable fixed point for each R2-valued ODE of the form

(

y′1y′2

)

=

(

−y2k−11 + αy1y

22

−y2m−12 − βy21y2

)

. (5.43)

Indeed, (0, 0) is clearly a fixed point, and we consider the Lyapunov function

V : R2 −→ R, V (y1, y2) :=y21α

+y22β, (5.44a)

which is clearly positive definite at (0, 0). Since V : R2 −→ R,

V (y1, y2) = ∇V (y1, y2) •(

− y2k−11 + αy1y

22, −y2m−1

2 − βy21y2)

= (2y1/α, 2y2/β) •(

− y2k−11 + αy1y

22, −y2m−1

2 − βy21y2)

= −2(y2k1 /α + y2m2 /β), (5.44b)

is clearly negative definite at (0, 0), Th. 5.30 proves (0, 0) to be a positively asymptoti-cally stable fixed point.

5 STABILITY 99

Theorem 5.32. Consider the situation of Def. 5.24 with K = R. Let Ω0 be open withp ∈ Ω0 ⊆ Ω ⊆ Rn. Assume V : Ω0 −→ R to be continuously differentiable and assumethere is an open set U ⊆ Ω0 such that the following conditions (i) – (iii) are satisfied:

(i) p ∈ ∂U , i.e. p is in the boundary of U .

(ii) V > 0 and V > 0 (resp. V < 0) on U (where V is defined as in (5.33)).

(iii) V (y) = 0 for each y ∈ Ω0 ∩ ∂U .

Then the fixed point p is not positively (resp. negatively) stable.

Proof. We assume V and V are positive, proving p not to be positively stable; the cor-responding statement regarding p not to be negatively stable is then, once again, easilyobtained by reversing time, i.e. by using Lem. 5.25 together with noting V changing itssign when replacing f with −f .Seeking a contradiction, assume p to be positively stable. Then there exists r > 0 suchthat Br(p) = y ∈ Rn : ‖y − p‖ ≤ r ⊆ Ω0 and η ∈ Br(p) implies Y (x, η) is definedfor each x ≥ 0. Moreover, positive stability and p ∈ ∂U also imply the existence ofη ∈ U ∩ Br(p) such that φ(x) := Y (x, η) ∈ Br(p) for all x ≥ 0 (note p 6= η as p ∈ ∂U).Set

s := sup

x ≥ 0 : φ(t) ∈ U for each t ∈ [0, x]. (5.45)

If s <∞, then the maximality of φ implies φ(s) to be defined. Moreover, φ(s) ∈ ∂U bythe definition of s, and φ(s) ∈ Br(p) ⊆ Ω0 by the choice of η. Thus, φ(s) ∈ Ω0 ∩ ∂Uand V (φ(s)) = 0. On the other hand, as V and V are positive on U , we have

V (φ(s)) = V (η) +

∫ s

0

V (φ(t)) dt > V (η) > 0, (5.46)

which is a contradiction to V (φ(s)) = 0, implying s = ∞ and φ(x) ∈ U as well asV (φ(x)) > V (η) > 0 hold for each x > 0.

To conclude the proof, consider the compact set

C := Br(p) ∩ U ∩ V −1[V (η),∞[⊆ Ω0. (5.47)

Then the choice of η guarantees φ(x) ∈ C for all x ≥ 0. If y ∈ C, then V (y) ≥ V (η) > 0.If y ∈ Ω0 ∩ ∂U , then V (y) = 0, showing C ∩ ∂U = ∅, i.e. C ⊆ U and

α := minV (y) : y ∈ C > 0. (5.48)

Thus,

∀x≥0

V (φ(x)) = V (η) +

∫ x

0

V (φ(t)) dt ≥ V (η) + αx. (5.49)

But this means that the continuous function V is unbounded on the compact set C andthis contradiction proves p is not positively stable.

5 STABILITY 100

Example 5.33. Let h1, h2 : Ω −→ R be continuously differentiable functions definedon some open set Ω ⊆ R2 with (0, 0) ∈ Ω and h1(0, 0) > 0, h2(0, 0) > 0. We claim that(0, 0) is not a positively stable fixed point for each R2-valued ODE of the form

(

y′1y′2

)

=

(

y2 h1(y1, y2)

y1 h2(y1, y2)

)

. (5.50)

Indeed, (0, 0) is clearly a fixed point, and we let Ω0 be some open neighborhood of (0, 0),where both h1 and h2 are positive (such an Ω0 exists by continuity of h1, h2 and h1, h2being positive at (0, 0)), and consider the Lyapunov function

V : Ω0 −→ R, V (y1, y2) := y1y2, (5.51a)

with V : Ω0 −→ R,

V (y1, y2) = ∇V (y1, y2) •(

y2 h1(y1, y2), y1 h2(y1, y2))

= (y2, y1) •(

y2 h1(y1, y2), y1 h2(y1, y2))

= y22 h1(y1, y2) + y21 h2(y1, y2) > 0 on Ω0 \ (0, 0). (5.51b)

Letting U := Ω0 ∩ (R+ × R+), one has (0, 0) ∈ ∂U , both V and V are positive on U ,and V = 0 on Ω0 ∩ ∂U ⊆ (0 × R) ∪ (R× 0). Thus, Th. 5.32 applies, yielding that(0, 0) is not positively stable.

Theorem 5.34. Let Ω ⊆ Rn be open, n ∈ N. Let F : Ω −→ R be C2 and consider

y′ = −∇F (y). (5.52)

If p ∈ Ω is an isolated critical point of F (i.e. ∇F (p) = 0 and there exists an open setO with p ∈ O ⊆ Ω and ∇F 6= 0 on O \ p), then p is a fixed point of (5.52) that ispositively asymptotically stable, negatively asymptotically stable, neither positively nornegatively stable as p is a local minimum for F , local maximum for F , neither.

Proof. Note that F being C2 implies ∇F to be C1 and, in particular, locally Lipschitz,such that (5.52) admits unique maximal solutions. Suppose F has a local min at p.As p is an isolated critical point, the local min at p must be strict, i.e. there exists anopen neighborhood Ω0 of p such that F (p) < F (y) for each y ∈ Ω0 \ p. Then theLyapunov function V : Ω0 −→ R, V (y) := F (y)− F (p), is clearly positive definite at pand V : Ω0 −→ R,

V (y) = ∇V (y) • (−∇F (y)) = −∇F (y) • ∇F (y) = −‖∇F (y)‖22, (5.53)

is clearly negative definite at p. Thus, p is a positively asymptotically stable fixed pointby Th. 5.30. If F has a local max at p, then the proof is conducted analogously, usingV : Ω0 −→ R, V (y) := F (p) − F (y), or, alternatively, by using time reversion (if Fhas a local max at p, then −F has a local min at p, i.e. p is a positively asymptoticallystable for y′ = ∇F (y), i.e. p is a negatively asymptotically stable for y′ = −∇F (y) byRem. 5.25(b)).

5 STABILITY 101

If p is neither a local min nor max for F , then let Ω0 := O, and V : Ω0 −→ R, V (y) :=F (p) − F (y), where O was chosen such that ∇F 6= 0 on O \ p, i.e. V : Ω0 −→ R,V (y) = ‖∇F (y)‖22, is positive definite at p. Let U := y ∈ Ω0 : F (y) < F (p). ThenU is open by the continuity of F , and p ∈ ∂U , as p is neither a local min nor maxfor F . By the continuity of F , F (y) = F (p) for each y ∈ Ω0 ∩ ∂U , i.e. V = 0 onΩ0 ∩ ∂U . Thus, Th. 5.32 applies, showing p is not positively stable. Analogously, usingU := y ∈ Ω0 : F (y) > F (p) and V (y) := F (y) − F (p) shows p is not negativelystable.

Example 5.35. (a) The function F : Rn −→ R, F (y) = ‖y‖22, has an isolated criticalpoint at 0, which is also a min for F . Thus, by Th. 5.34,

y′ = −∇F (y) = (−2y1, . . . ,−2yn) (5.54)

has 0 as a fixed point that is positively asymptotically stable.

(b) The function F : R2 −→ R, F (y) = ey1y2 , has an isolated critical point at 0, whichis neither a local min nor local max for F . Thus, by Th. 5.34,

y′ = −∇F (y) = (−y2ey1y2 , −y1ey1y2) (5.55)

has 0 as a fixed point that is neither positively nor negatively stable.

5.3 Constant Coefficients

The stability properties of systems of first-order linear ODE (cf. Sec. 4.6.2) are closelyrelated to the eigenvalues of the matrix A. As it turns out, the stability of the origin isessentially determined by the sign of the real part of the eigenvalues of A (cf. Th. 5.38below). We start with a preparatory lemma:

Lemma 5.36. Let n ∈ N and W ∈ M(n,K) be invertible. Moreover, let ‖ · ‖ be somenorm on M(n,K). Then

‖ · ‖W : M(n,K) −→ R+0 , ‖A‖W := ‖W−1AW‖, (5.56)

also constitutes a norm on M(n,K).

Proof. If A = 0, then ‖A‖W = ‖W−10W‖ = ‖0‖ = 0. If ‖A‖W = 0, then W−1AW = 0,i.e. A = W0W−1 = 0, showing ‖ · ‖W is positive definite. Next,

∀λ∈K

‖λA‖W = |λ| ‖W−1AW‖ = |λ| ‖A‖W ,

showing ‖ · ‖W is homogeneous of degree 1. Finally,

∀A,B∈M(n,K)

‖A+B‖W = ‖W−1(A+B)W‖ = ‖W−1AW +W−1BW‖ ≤ ‖A‖W +‖B‖W ,

showing ‖ · ‖W satisfies the triangle inequality.

5 STABILITY 102

Remark and Definition 5.37. Let n ∈ N, A ∈ M(n,C), and let λ ∈ C be aneigenvalue of A.

(a) Clearly, one has

0 ⊆ ker(A− λ Id) ⊆ ker(A− λ Id)2 ⊆ . . .

and the inclusion can be strict for at most n times. Let

r(λ) := min

k ∈ N0 : ker(A− λ Id)k = ker(A− λ Id)k+1

.

Then∀k∈N

ker(A− λ Id)r(λ) = ker(A− λ Id)r(λ)+k :

Indeed, otherwise, let k0 := mink ∈ N : ker(A − λ Id)r(λ) ( ker(A − λ Id)r(λ)+k.Then there exists v ∈ Cn such that (A−λ Id)r(λ)+k0v = 0, but (A−λ Id)r(λ)+k0−1v 6=0. However, that means w := (A − λ Id)k0−1v ∈ ker(A − λ Id)r(λ)+1, but w /∈ker(A− λ Id)r(λ), in contradiction to the definition of r(λ). The space

M(λ) := ker(A− λ Id)r(λ)

is called the generalized eigenspace corresponding to the eigenvalue λ.

(b) Due to A(A− λ Id) = (A− λ Id)A, one has

∀k∈N0

A(

ker(A− λ Id)k)

⊆ ker(A− λ Id)k,

i.e. all the kernels (in particular, the generalized eigenspace M(λ)) are invariantsubspaces for A.

(c) As already mentioned in Rem. 4.51 the algebraic multiplicity of λ, denoted ma(λ),is its multiplicity as a zero of the characteristic polynomial χA(x) = det(A− x Id),and the geometric multiplicity of λ is mg(λ) := dimker(A − λ Id). We call theeigenvalue λ semisimple if, and only if, its algebraic and geometric multiplicitiesare equal. We then have the equivalence of the following statements (i) – (iv):

(i) λ is semisimple.

(ii) M(λ) = ker(A− λ Id).

(iii) AM(λ) is diagonalizable.

(iv) All the Jordan blocks corresponding to λ are trivial, i.e. they all have size 1(i.e. there are dim ker(A− λ Id) such blocks).

Indeed, note that ma(λ) = dimker(A − λ Id)ma(λ) (e.g., since, if A is in Jordannormal form, then ma(λ) provides the size of the λ-block and, for A − λ Id, thisblock is canonically nilpotent). This shows the equivalence between (i) and (ii).Moreover, mg(λ) = ma(λ) means ker(A − λ Id) has a basis of ma(λ) eigenvectorsv1, . . . , vma(λ) for the eigenvalue λ. The equivalence of (i),(ii) with (iii) and with(iv) is then given by Th. 4.45 and Th. 4.46, respectively.

5 STABILITY 103

Theorem 5.38. Let n ∈ N and A ∈ M(n,C). Moreover, let ‖ · ‖ be some norm onM(n,C) and let λ1, . . . , λs ∈ C, 1 ≤ s ≤ n, be the distinct eigenvalues of A.

(a) The following statements (i) – (iii) are equivalent:

(i) There exists K > 0 such that ‖eAx‖ ≤ K holds for each x ≥ 0 (resp. x ≤ 0).

(ii) Reλj ≤ 0 (resp. Reλj ≥ 0) for every j = 1, . . . , s and if Reλj = 0 occurs, thenλj is a semisimple eigenvalue (i.e. its algebraic and geometric multiplicitiesare equal).

(iii) The fixed point 0 of y′ = Ay is positively (resp. negatively) stable.

(b) The following statements (i) – (iii) are equivalent:

(i) There exist K,α > 0 such that ‖eAx‖ ≤ Ke−α|x| holds for each x ≥ 0 (resp.x ≤ 0).

(ii) Reλj < 0 (resp. Reλj > 0) for every j = 1, . . . , s.

(iii) The fixed point 0 of y′ = Ay is positively (resp. negatively) asymptoticallystable.

Proof. Let ‖ · ‖max denote the max-norm on Cn2 ∼= M(n,C), i.e.

‖(mkl)‖max := max

|mkl| : k, l ∈ 1, . . . , n

(caveat: for n > 1, this is not the operator norm induced by the max-norm on Cn).Moreover, using Th. 4.46, let W ∈ M(n,C) be invertible and such that B := W−1AWis in Jordan normal form. Then, according to Lem. 5.36, ‖M‖Wmax := ‖W−1MW‖max

also defines a norm on M(n,C). According to Th. 4.47(b),

∀x∈R

‖eAx‖Wmax = ‖W−1eAxW‖max = ‖eW−1AWx‖max = ‖eBx‖max. (5.57)

According to Th. 4.44 and Th. 4.49, the entries βkl(x) of (βkl(x)) := eBx enjoy thefollowing property:

∀k,l∈1,...,n

∃j∈1,...,s

∃C>0

∃m∈N0

∀x∈R

|βkl(x)| = C eReλjx |x|m. (5.58)

Moreover,

(

|βkl(x)| = C eReλjx |x|m ∧ Reλj < 0)

⇒ limx→∞

|βkl(x)| = 0, (5.59a)(

|βkl(x)| = C eReλjx |x|m ∧ Reλj > 0)

⇒ limx→∞

|βkl(x)| = ∞, (5.59b)(

|βkl(x)| = C eReλjx |x|m ∧ Reλj = 0 ∧ m = 0)

⇒ |βkl| ≡ C, (5.59c)(

|βkl(x)| = C eReλjx |x|m ∧ Reλj = 0 ∧ m > 0)

⇒ limx→∞

|βkl(x)| = ∞. (5.59d)

(a): We start with the equivalence between (i) and (ii): Suppose, Reλj ≤ 0 for everyj = 1, . . . , s and if Reλj = 0 occurs, then λj is a semisimple eigenvalue. Then, using

5 STABILITY 104

Rem. and Def. 5.37(c) and (5.58), we are either in situation (5.59a) or in situation(5.59c). Thus, there exists K0 > 0 such that |βkl(x)| ≤ K0 for each x ≥ 0 and eachk, l = 1, . . . , n. Then there exists K1 > 0 such that

∀x≥0

‖eAx‖ ≤ K1 ‖eAx‖Wmax = K1 ‖eBx‖max ≤ K1K0, (5.60)

showing (i) holds with K := K1K0. Conversely, if there is j ∈ 1, . . . , s such thatReλj > 0, then there is βkl such that (5.59b) occurs; if there is j ∈ 1, . . . , s such thatReλj = 0 and λj is not semisimple, then, using Rem. and Def. 5.37(c), there is βkl suchthat (5.59d) occurs. In both cases,

limx→∞

‖eAx‖ = limx→∞

‖eAx‖Wmax = limx→∞

‖eBx‖max = ∞, (5.61)

i.e., the corresponding statement of (i) can not be true. The remaining case is handledvia time reversion: ‖eAx‖ ≤ K holds for each x ≤ 0 if, and only if, ‖e−Ax‖ ≤ K holdsfor each x ≥ 0, which holds if, and only if, Re(−λj) ≤ 0 for every j = 1, . . . , s withλj semisimple for Re(−λj) = 0, which is equivalent to Reλj ≥ 0 for every j = 1, . . . , swith λj semisimple for Reλj = 0.

We proceed to the equivalence between (i) and (iii): Fix some arbitary norm ‖ · ‖ onCn, and let ‖ · ‖op denote the induced operator norm on M(n,C). Let C1, C2 > 0 besuch that ‖M‖op ≤ C1‖M‖ and ‖M‖ ≤ C2‖M‖op for each M ∈ M(n,C). Supposethere exists K > 0 such that ‖eAx‖ ≤ K holds for each x ≥ 0. Given ǫ > 0, chooseδ := ǫ/(C1K). Then

∀η∈Bδ(0)

∀x≥0

‖Y (x, η)− 0‖ = ‖eAx η‖ ≤ ‖eAx‖op ‖η‖ < C1Kǫ

C1K= ǫ, (5.62)

proving 0 is positively stable. Conversely, assume 0 to be positively stable. Then thereexists δ > 0 such that ‖Y (x, η)‖ = ‖eAx η‖ < 1 for each η ∈ Bδ(0) and each x ≥ 0.Thus,

∀x≥0

‖eAx‖op(G.1)= sup

‖eAx η‖ : η ∈ Cn, ‖η‖ = 1

=2

δsup

‖eAx η‖ : η ∈ Cn, ‖η‖ = δ/2

≤ 2

δ,

(5.63)

showing (i) holds with K := 2C2/δ. The remaining case is handled via time reversion:‖eAx‖ ≤ K holds for each x ≤ 0 if, and only if, ‖e−Ax‖ ≤ K holds for each x ≥ 0, whichholds if, and only if, 0 is positively stable for y′ = −Ay, which, by Rem. 5.25(a), holdsif, and only if, 0 is negatively stable for y′ = Ay.

(b): As in (a), we start with the equivalence between (i) and (ii): Suppose, Reλj < 0for every j = 1, . . . , s. We first show, using (5.58),

∀k,l=1,...,n

∃Kkl,αkl>0

∀x≥0

|βkl(x)| ≤ Kkl e−αklx : (5.64)

According to (5.58),

∃Ckl>0

∀x≥0

|βkl(x)| = Ckl eReλjx/2 eReλjx/2 xm.

5 STABILITY 105

Since Reλj < 0, one has limx→∞ eReλjx/2 xm = 0, i.e. eReλjx/2 xm is uniformly boundedon [0,∞[ by someMkl > 0. Thus, (5.64) holds withKkl := CklMkl and αkl := −Reλj/2.In consequence, if K1 is chosen as in (5.60), then ‖eAx‖ ≤ Ke−α|x| ≤ K for each x ≥ 0holds with K := K1 maxKkl : k, l = 1, . . . , n and α := minαkl : k, l = 1, . . . , n.Conversely, if there is j ∈ 1, . . . , s such that Reλj ≥ 0, then there is βkl such that(5.59b) or (5.59c) or (5.59d) occurs. In each case, ‖eAx‖ 6→ 0 for x→ ∞, since

limx→∞

‖eAx‖Wmax = limx→∞

‖eBx‖max ∈ ]0,∞], (5.65)

i.e., the corresponding statement of (i) can not be true. The remaining case is handledvia time reversion: ‖eAx‖ ≤ Ke−α|x| holds for each x ≤ 0 if, and only if, ‖e−Ax‖ ≤Ke−α|x| holds for each x ≥ 0, which holds if, and only if, Re(−λj) < 0 for everyj = 1, . . . , s, which is equivalent to Reλj > 0 for every j = 1, . . . , s.

It remains to consider the equivalence between (i) and (iii): Let ‖ · ‖op and C1, C2 > 0be as in the proof of the equivalence between (i) and (iii) in (a). Suppose, there existK,α > 0 such that ‖eAx‖ ≤ Ke−α|x| holds for each x ≥ 0. Since ‖eAx‖ ≤ Ke−α|x| ≤ Kfor each x ≥ 0, 0 is positively stable by (a). Moreover,

∀η∈Cn

∀x≥0

‖Y (x, η)‖ = ‖eAx η‖ ≤ ‖eAx‖op ‖η‖ ≤ C1K e−α|x| ‖η‖ → 0 for x→ ∞,

(5.66)showing 0 to be positively asymptotically stable. For the converse, we will actuallyshow (iii) implies (ii). If 0 is positively asymptotically stable, then, in particular, itis positively stable, such that (ii) of (a) must hold. It merely remains to exclude thepossibility of a semisimple eigenvalue λ with Reλ = 0. If there were a semisimpleeigenvalue λ with Reλ = 0, then eBx had a Jordan block of size 1 with entry eλx, i.e.βkk(x) = eλx for some k ∈ 1, . . . , n. Let ek be the corresponding standard unit vectorof Cn (all entries 0, except the kth entry, which is 1). Then, for η := Wek,

∀ǫ∈R+

∀x∈R

‖W−1 eAx (ǫ η)‖ = ǫ ‖W−1 eAxWek‖ = ǫ ‖eBx ek‖ = ǫ ‖eλx ek‖= ǫ |eλx| ‖ek‖ = ǫ · 1 · ‖ek‖ > 0,

(5.67)

showing 0 were not positively asymptotically stable (e.g., since y 7→ ‖W−1y‖ defines anorm on Cn). The remaining case is, once again, handled via time reversion: ‖eAx‖ ≤Ke−α|x| holds for each x ≤ 0 if, and only if, ‖e−Ax‖ ≤ Ke−α|x| holds for each x ≥ 0,which holds if, and only if, 0 is positively asymptotically stable for y′ = −Ay, which, byRem. 5.25(b), holds if, and only if, 0 is negatively asymptotically stable for y′ = Ay.

Example 5.39. (a) The matrix

A =

(

2 11 2

)

has eigenvalues 1 and 3 and, thus, the fixed point 0 of y′ = Ay is negativelyasymptotically stable, but not positively stable.

(b) The matrix

A =

(

0 10 0

)

5 STABILITY 106

has eigenvalue 0, which is not semisimple, i.e. the fixed point 0 of y′ = Ay is neithernegatively nor positively stable.

(c) The matrix

A =

i 1 2 2− 3i0 −i 5 −170 0 −1 + 3i 00 0 0 −5

has simple eigenvalues i, −i, −1 + 3i, −5, i.e. the fixed point 0 of y′ = Ay ispositively stable (since all real parts are ≤ 0), but neither negatively stable norpositively asymptotically stable (since there are eigenvalues with 0 real part).

5.4 Linearization

If the right-hand side f of an autonomous ODE is differentiable and p is a fixed point (i.e.f(p) = 0), then one can sometimes use its linearization, i.e. its derivative A := Df(p)(which is an n× n matrix), to infer stability properties of y′ = f(y) at p from those ofy′ = Ay at 0 (see Th. 5.44 below). We start with some preparatory results:

Lemma 5.40. Let n ∈ N and consider the bilinear function

β : Rn × Rn −→ R, β(y, z) := y • (Bz) = ytBz =n∑

k,l=1

yk bkl zl, (5.68)

where B = (bkl) ∈ M(n,R), “•” denotes the Euclidean scalar product, and elements ofRn are interpreted as column vectors when involved in matrix multiplications.

(a) The function β is differentiable (it is even a polynomial, deg(β) ≤ 2, and, thus,C∞) and

∀k∈1,...,n

∂ykβ : Rn × Rn −→ R,

∀(y,z)∈Rn×Rn

∂ykβ(y, z) =n∑

l=1

bkl zl = (Bz)k,(5.69a)

∀l∈1,...,n

∂zlβ : Rn × Rn −→ R,

∀(y,z)∈Rn×Rn

∂zlβ(y, z) =n∑

k=1

yk bkl = (ytB)l,(5.69b)

∀(y,z)∈Rn×Rn

Dβ(y, z) = ∇ β(y, z) : Rn × Rn −→ R,

∇ β(y, z)(u, v) = β(y, v) + β(u, z) = ytBv + utBz.(5.69c)

(b) The function

V : Rn −→ R, V (y) := β(y, y) = y • (By) = ytBy =n∑

k,l=1

yk bkl yl, (5.70)

5 STABILITY 107

is differentiable (it is also even a polynomial, deg(β) ≤ 2, and, thus, C∞) and

∀k∈1,...,n

∂ykV : Rn −→ R,

∀y∈Rn

∂ykV (y) =n∑

l=1

yl(bkl + blk) = yt(B + Bt)k,(5.71a)

∀y∈Rn

DV (y) = ∇V (y) : Rn −→ R,

∇V (y)(u) = β(y, u) + β(u, y) = ytBu+ utBy = yt(B + Bt)u.(5.71b)

Proof. (a): (5.69a) and (5.69b) are immediate from (5.68) and, then, imply (5.69c).

(b): (5.71a) is immediate from (5.70) and, then, implies (5.71b).

Lemma 5.41. Let A,B ∈ M(n,R), n ∈ N, and V : Rn −→ R as in (5.70). Then

∀y∈Rn

∇V (y) • (Ay) = yt(BA+ AtB)y. (5.72)

Proof. We note

∀y∈Rn

(ytBt) • (Ay) = ytBtAy =(

ytBtAy)t

= ytAtBy (5.73)

and, thus, obtain

∀y∈Rn

∇V (y) • (Ay) (5.71b)= yt(B + Bt) • (Ay) (5.73)

= yt(BA+ AtB)y, (5.74)

proving (5.72).

Definition 5.42. A matrix B ∈ M(n,R), n ∈ N, is called positive definite if, and onlyif, the function V of (5.70) is positive definite at p = 0 in the sense of Def. 5.29.

Proposition 5.43. Let A ∈ M(n,R), n ∈ N. Then the following statements (i) – (iii)are equivalent:

(i) There exist positive definite matrices B,C ∈ M(n,R), satisfying

BA+ AtB = −C. (5.75)

(ii) Reλ < 0 holds for each eigenvalue λ ∈ C of A.

(iii) For each given positive definite (symmetric) C ∈ M(n,R), there exists a positivedefinite (symmetric) B ∈ M(n,R), satisfying (5.75).

Proof. (iii) immediately implies (i) (e.g. by applying (iii) with C := Id).

For the proof that (i) implies (ii), let B,C ∈ M(n,R) be positive definite matrices,satisfying (5.75). By Th. 5.38(b), it suffices to show 0 is a positively asymptoticallystable fixed point for y′ = Ay. To this end, we apply Th. 5.30, using V : Rn −→ R of

5 STABILITY 108

(5.70) as the Lyapunov function. Then, by Def. 5.42, B being positive definite meansV being positive definite at 0. Since

V : Rn −→ R, V (y) = ∇V (y) • (Ay) (5.72)= yt(BA+ AtB)y

(5.75)= −ytCy (5.76)

and C is positive definite, V is negative definite at 0, i.e. Th. 5.30 yields 0 to be apositively asymptotically stable fixed point for y′ = Ay as desired.

It remains to show that (ii) implies (iii). If all eigenvalues of A have negative real part,then, as A and At have the same eigenvalues, all eigenvalues of At have negative realpart as well. Thus, according to Th. 5.38(b),

∃K,α>0

∀x≥0

(

‖eAx‖max ≤ Ke−αx ∧ ‖eAtx‖max ≤ Ke−αx)

, (5.77)

where we have chosen the norm in (5.77) to mean the max-norm on Rn2(note that eAx

is real if A is real, e.g. due to the series representation (4.73)). Given C ∈ M(n,R),define

B :=

∫ ∞

0

eAtxC eAx dx . (5.78)

To verify that B ∈ M(n,R) is well-defined, note that each entry of the integrand matrixof (5.78) constitutes an integrable function on [0,∞[: Indeed,

∃M>0

∀x≥0

∥

∥eAtxC eAx

∥

∥

max≤M ‖eAtx‖max ‖C‖max ‖eAx‖max

≤M ‖C‖maxK2 e−2αx,

(5.79)

which is integrable on [0,∞[. Next, we compute

−C (5.79)= lim

x→∞eA

txC eAx − C = limx→∞

∫ x

0

∂s(

eAtsC eAs

)

ds

=

∫ ∞

0

∂s(

eAtsC eAs

)

ds

(I.3)=

∫ ∞

0

(

At eAtsC eAs + eA

tsC eAsA)

ds

(I.5),(I.6)= At

∫ ∞

0

eAtsC eAs ds +

(∫ ∞

0

eAtsC eAs ds

)

A

(5.78)= = AtB +BA, (5.80)

showing (5.75) is satisfied. If C is positive definite and 0 6= y ∈ Rn, then ytCy > 0,implying

ytBy = yt(∫ ∞

0

eAtxC eAx dx

)

y(I.5),(I.6)

=

∫ ∞

0

yt eAtxC eAx y dx

Prop. 4.40(c)=

∫ ∞

0

(eAx y)tC eAx y dx > 0, (5.81)

5 STABILITY 109

showing B is positive definite as well. Finally, if C is symmetric, then

Bt =

∫ ∞

0

(

eAtxC eAx

)tdx

Prop. 4.40(c)=

∫ ∞

0

eAtxC eAx dx = B, (5.82)

showing B is symmetric as well.

Theorem 5.44. Let Ω ⊆ Rn be open, n ∈ N, and f : Ω −→ Rn continuously differ-entiable. Let p ∈ Ω be a fixed point (i.e. f(p) = 0) and A := Df(p) ∈ M(n,R) thederivative of f at p. If all eigenvalues of A have negative (resp. positive) real parts, thenp is a positively (resp. negatively) asymptotically stable fixed point for y′ = f(y).

Proof. Let all eigenvalues of A have negative real parts. We first consider the specialcase p = 0, i.e. A = Df(0). By the equivalence between (ii) and (iii) of Prop. 5.43,we can choose C := Id in (iii) to obtain the existence of a positive definite symmetricmatrix B ∈ M(n,R), satisfying

BA+ AtB = − Id . (5.83)

The idea is now to apply the Lyapunov Th. 5.30 with V of (5.70), i.e.

V : Ω −→ R, V (y) := y • (By) = ytBy =n∑

k,l=1

yk bkl yl. (5.84)

We already know V to be continuously differentiable and positive definite. We willconclude the proof of 0 being positively asymptotically stable by showing there existsδ > 0, such that

V : Bδ(0) −→ R, V (y) = (∇V )(y) • f(y), (5.85)

is negative definite at 0, where we take Bδ(0) with respect to the 2-norm ‖ · ‖2 on Rn.The differentiability of f at 0 implies that (cf. [Phi15, Lem. 2.21])

r : Ω −→ Rn, r(y) := f(y)− Ay, (5.86)

satisfies

limy→0

‖r(y)‖2‖y‖2

= 0. (5.87)

Thus, we compute, for each y ∈ Ω,

V (y) = (∇V )(y) • f(y) (5.71b),(5.86)= (∇V )(y) • (Ay) +

(

yt(B + Bt))

• r(y)(5.72), B=Bt

= yt(BA+ AtB)y + 2ytB r(y)(5.83)= −‖y‖22 + 2 y •B r(y). (5.88)

We can estimate the second summand via the Cauchy-Schwarz inequality to obtain∣

∣y •B r(y)∣

∣ ≤ ‖y‖2 ‖B r(y)‖2 ≤ ‖y‖2 ‖B‖ ‖r(y)‖2, (5.89)

and, thus, using (5.87),

limy→0

∣

∣y •B r(y)∣

∣

‖y‖22= 0. (5.90)

5 STABILITY 110

Now choose δ > 0 such that Bδ(0) ⊆ Ω and such that

∀y∈Bδ(0)

2∣

∣y •B r(y)∣

∣

‖y‖22<

1

2. (5.91)

Then, for each 0 6= y ∈ Bδ(0)

V (y) = −‖y‖22 + 2 y •B r(y)(5.91)< −‖y‖22 +

‖y‖222

= −‖y‖222

< 0, (5.92)

showing V to be negative definite at 0, and 0 to be positively asymptotically stable. Ifp 6= 0, then consider the ODE y′ = g(y) := f(y + p), g : (Ω − p) −→ Rn. Then 0 is afixed point for y′ = g(y), Dg(0) = Df(p) = A, i.e. 0 is positively asymptotically stablefor y′ = g(y). But, since ψ is a solution to y′ = g(y) if, and only if, φ = ψ+p is a solutionto y′ = f(y), p must be positively asymptotically stable for y′ = f(y). The remainingcase that all eigenvalues of A have positive real parts is now treated via time reversion:If all eigenvalues of A have positive real parts, then all eigenvalues of −A = D(−f)(p)have negative real parts, i.e. p is positively asymptotically stable for y′ = −f(y), i.e., byRem. 5.25(b), p is negatively asymptotically stable for y′ = f(y).

Caveat 5.45. The following example shows that the converse of Th. 5.44 does nothold: A fixed point p can be positively (resp. negatively) asymptotically stable withoutA := Df(p) having only eigenvalues with negative (resp. positive) real parts. The sameexample shows that, in general, one can not infer anything regarding the stability of thefixed point p if A := Df(p) is merely stable, but not asymptotically stable: Consider

f : R2 −→ R2, f(y1, y2) := (y2 + µy31, −y1 + µy32), µ ∈ R. (5.93)

Then, independently of µ, (0, 0) is a fixed point and

Df(0, 0) =

(

0 1−1 0

)

(5.94)

with complex eigenvalues i and −i. Thus, the linearized system is positively and nega-tively stable, but not asymptotically stable, still independently of µ. However, we claimthat (0, 0) is a positively asymptotically stable fixed point for y′ = f(y) if µ < 0 anda positively asymptotically stable fixed point for y′ = f(y) if µ > 0. Indeed, this canbe seen by using the Lyapunov function V : R2 −→ R, V (y1, y2) = y21 + y22, which has∇V (y1, y2) = (2y1, 2y2) and

V (y1, y2) = ∇V (y1, y2) • f(y1, y2) = 2µ (y41 + y42). (5.95)

Thus, V is positive definite at (0, 0) and V is negative definite at (0, 0) for µ < 0 andpositive definite at (0, 0) for µ > 0.

Example 5.46. Consider (x, y, z)′ = f(x, y, z) with

f : R3 −→ R3, f(x, y, z) = (−x cos y, −yez, x2 − 2z). (5.96)

5 STABILITY 111

The derivative is

Df : R3 −→ M(3,R), Df(x, y, z) =

− cos y x sin y 00 −ez −yez2x 0 −2

. (5.97)

Clearly, (0, 0, 0) is a fixed point and Df(0, 0, 0) has eigenvalues −1 and −2. Thus,(0, 0, 0) is a positively asymptotically stable fixed point for (x, y, z)′ = f(x, y, z) by Th.5.44.

5.5 Limit Sets

Limit sets are important when studying the asymptotic behavior of solutions, i.e. φ(x)for x → ∞ and for x → −∞. If a solution has a limit, then its corresponding limit setconsists of precisely one point. In general, the limit set of a solution is defined to consistof all points that occur as limits of sequences taken along the solution’s orbit (of course,the limit sets can also be empty):

Definition 5.47. Let Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn be such that y′ = f(y) admitsunique maximal solutions. For each η ∈ Ω, we define the omega limit set and the alphalimit set of η as follows:

ω(η) := ωf (η) :=

y ∈ Ω : ∃(xk)k∈N⊆R

limk→∞

xk = ∞ ∧ limk→∞

Y (xk, η) = y

, (5.98a)

α(η) := αf (η) :=

y ∈ Ω : ∃(xk)k∈N⊆R

limk→∞

xk = −∞ ∧ limk→∞

Y (xk, η) = y

. (5.98b)

Remark 5.48. In the situation of Def. 5.47, consider the time-reversed version of y′ =f(y), i.e. y′ = −f(y), with its general solution Y (x, η) = Y (−x, η), cf. (5.28). Clearly,for each η ∈ Ω,

ωf (η) = α−f (η), αf (η) = ω−f (η). (5.99)

Proposition 5.49. In the situation of Def. 5.47, the following hold:

(a) If Y (·, η) is defined on all of R+0 , then

ω(η) =∞⋂

m=0

Y (x, η) : x ≥ m; (5.100a)

and if Y (·, η) is defined on all of R−0 , then

α(η) =∞⋂

m=0

Y (x, η) : x ≤ −m. (5.100b)

(b) All points in the same orbit have the same omega and alpha limit sets, i.e.

∀x∈I0,η

(

ω(η) = ω(

Y (x, η))

∧ α(η) = α(

Y (x, η))

)

.

5 STABILITY 112

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limitsets.

(a): Let y ∈ ω(η) and m ∈ N0. Then there is a sequence (xk)k∈N in R such thatlimk→∞ xk = ∞ and limk→∞ Y (xk, η) = y. Since, for sufficiently large k0 ∈ N, thesequence (Y (xk, η))k≥k0 is in Y (x, η) : x ≥ m, the inclusion “⊆” of (5.100a) is proved.

Conversely, assume y ∈ Y (x, η) : x ≥ m for each m ∈ N0. Then,

∀k∈N

∃xk∈[k,∞[

∥

∥Y (xk, η)− y∥

∥ <1

k,

providing a sequence (xk)k∈N in R such that limk→∞ xk = ∞ and limk→∞ Y (xk, η) = y,proving y ∈ ω(η) and the inclusion “⊇” of (5.100a).

(b): Let y ∈ ω(η) and x ∈ I0,η. Choose a sequence (xk)k∈N in R such that limk→∞ xk = ∞and limk→∞ Y (xk, η) = y. Then limk→∞(xk − x) = ∞ and

limk→∞

Y(

xk − x, Y (x, η)) Lem. 5.4(b)

= limk→∞

Y (xk, η) = y, (5.101)

proving ω(η) ⊆ ω(

Y (x, η))

. The reversed inclusion then also follows, since

Y(

− x, Y (x, η)) Lem. 5.4(b)

= Y (0, η) = η, (5.102)

concluding the proof.

Example 5.50. Let Ω ⊆ Kn, n ∈ N, and f : Ω −→ Kn be such that y′ = f(y) admitsunique maximal solutions.

(a) If η ∈ Ω is a fixed point, then ω(η) = α(η) = η. More generally, if η ∈ Ωis such that limx→∞ Y (x, η) = y ∈ Kn, then ω(η) = y; if η ∈ Ω is such thatlimx→−∞ Y (x, η) = y ∈ Kn, then α(η) = y.

(b) If A ∈ M(n,C) is such that the conditions of Th. 5.38(b) hold (all eigenvalues havenegative real parts, 0 is positively asymptotically stable), then ω(η) = 0 for eachη ∈ Cn and α(η) = ∅ for each η ∈ Cn \ 0.

(c) If η ∈ Ω is such that the orbit O(φ) of φ := Y (·, η) is periodic, then ω(η) = α(η) =O(φ). For example, for (5.8),

∀η∈R2

ω(η) = α(η) = S‖η‖2(0) =

y ∈ R2 : ‖y‖2 = ‖η‖2

. (5.103)

Example 5.51. As an example with nonperiodic orbits that have limit sets consistingof more than one point, consider

y′1 = y2 + y1(1− y21 − y22), (5.104a)

y′2 = −y1 + y2(1− y21 − y22). (5.104b)

5 STABILITY 113

We will show that, for each point except the origin (which is clearly a fixed point), theomega limit set is the unit circle, i.e.

∀η∈R2\0

ω(η) = S1(0) = y ∈ R2 : y21 + y22 = 1. (5.105)

We first verify that the general solution is

Y : Df,0 −→ R2, Y (x, η1, η2) =(η1 cos x+ η2 sin x, η2 cos x− η1 sin x)

√

η21 + η22 + (1− η21 − η22)e−2x

, (5.106)

where, letting

∀η∈y∈R2: ‖y‖2>1

xη :=1

2ln

(‖η‖22 − 1

‖η‖22

)

, (5.107)

Df,0 =(

R× η ∈ R2 : ‖η‖2 ≤ 1)

∪(

]xη,∞[×η ∈ R2 : ‖η‖2 > 1)

: (5.108)

For each (η1, η2) ∈ R2, Y (·, η1, η2) satisfies the initial condition:

Y (0, η1, η2) =(η1, η2)

√

η21 + η22 + (1− η21 − η22)= (η1, η2). (5.109)

The following computations prepare the check that each Y (·, η1, η2) satisfies (5.104):The 2-norm squared of the numerator in (5.106) is

∥

∥(η1 cos x+ η2 sin x, η2 cos x− η1 sin x)∥

∥

2

2

= η21 cos2 x+ 2η1η2 cos x sin x+ η22 sin

2 x+ η22 cos2 x− 2η1η2 cos x sin x+ η21 sin

2 x

= η21 + η22 = ‖η‖22. (5.110)

Thus,∥

∥Y (x, η1, η2)∥

∥

2=

‖η‖2√

‖η‖22 + (1− ‖η‖22)e−2x(5.111)

and

1− Y 21 (x, η1, η2)− Y 2

2 (x, η1, η2) = 1−∥

∥Y (x, η1, η2)∥

∥

2

2=

‖η‖22 + (1− ‖η‖22)e−2x − ‖η‖22‖η‖22 + (1− ‖η‖22)e−2x

=(1− ‖η‖22)e−2x

‖η‖22 + (1− ‖η‖22)e−2x. (5.112)

In consequence,

Y ′1(x, η1, η2)

=(−η1 sin x+ η2 cos x)

(

‖η‖22 + (1− ‖η‖22)e−2x)

+ (η1 cos x+ η2 sin x)(1− ‖η‖22)e−2x

(

‖η‖22 + (1− ‖η‖22)e−2x) 3

2

= Y2(x, η1, η2) + Y1(x, η1, η2)(

1− Y 21 (x, η1, η2)− Y 2

2 (x, η1, η2))

, (5.113)

5 STABILITY 114

verifying (5.104a). Similarly,

Y ′2(x, η1, η2)

=(−η2 sin x− η1 cos x)

(

‖η‖22 + (1− ‖η‖22)e−2x)

+ (η2 cos x− η1 sin x)(1− ‖η‖22)e−2x

(

‖η‖22 + (1− ‖η‖22)e−2x) 3

2

= −Y1(x, η1, η2) + Y2(x, η1, η2)(

1− Y 21 (x, η1, η2)− Y 2

2 (x, η1, η2))

, (5.114)

verifying (5.104b).

For ‖η‖2 ≤ 1, Y (·, η1, η2) is maximal, as it is defined on R (the denominator in (5.106)has no zero in this case). For ‖η‖2 > 1, the denominator clearly has a zero at xη < 0,where xη is defined as in (5.107). For x > xη, the expression under the square rootin (5.106) is positive. Since limx↓xη ‖Y (x, η1, η2)‖2 = ∞ for ‖η‖2 > 1, Y (·, η1, η2) ismaximal in this case as well, completing the verification of Y , defined as in (5.106) –(5.108), being the general solution of (5.104).

It remains to prove (5.105). From (5.111), we obtain

∀η∈R2\0

limx→∞

∥

∥Y (x, η1, η2)∥

∥

2= 1, (5.115)

which implies∀

η∈R2\0ω(η) ⊆ S1(0). (5.116)

Conversely, consider η = (η1, η2) ∈ R2 \ 0 and y = (y1, y2) ∈ S1(0). We will showy ∈ ω(η): Since ‖y‖2 = 1,

∃ϕy∈[0,2π[

y = (sinϕy, cosϕy). (5.117)

Analogously,∃

ϕη∈[0,2π[η = ‖η‖2 (sinϕη, cosϕη) (5.118)

(the reader might note that, in (5.117) and (5.118), we have written y and η using theirpolar coordinates, cf. [Phi15, Ex. 4.19]). Then, according to (5.106), we obtain, for eachx ≥ 0,

Y (x, η1, η2) =‖η‖2 (sinϕη cos x+ cosϕη sin x, cosϕη cos x− sinϕη sin x)

√

‖η‖22 + (1− ‖η‖22)e−2x

=‖η‖2

(

sin(x+ ϕη), cos(x+ ϕη))

√

‖η‖22 + (1− ‖η‖22)e−2x. (5.119)

Define∀k∈N

xk := ϕy − ϕη + 2πk ∈ R+. (5.120)

5 STABILITY 115

Then limk→∞ xk = ∞ and

limk→∞

Y (xk, η1, η2)

(5.119)= lim

k→∞

‖η‖2(

sin(ϕy − ϕη + 2πk + ϕη), cos(ϕy − ϕη + 2πk + ϕη))

√

‖η‖22 + (1− ‖η‖22)e−2x

= limk→∞

‖η‖2 (sinϕy, cosϕy)√

‖η‖22 + (1− ‖η‖22)e−2x= y, (5.121)

showing y ∈ ω(η) and S1(0) ⊆ ω(η).

Proposition 5.52. In the situation of Def. 5.47, if f is locally Lipschitz, then orbitsthat intersect an omega or alpha limit set, must entirely remain inside that same omegaor alpha limit set, i.e.

(

∀y∈Ω∩ω(η)

∀x∈I0,y

Y (x, y) ∈ ω(η)

)

∧(

∀y∈Ω∩α(η)

∀x∈I0,y

Y (x, y) ∈ α(η)

)

. (5.122)

Proof. Due to Rem. 5.48, it suffices to prove the statement involving the omega limit set.Let y ∈ Ω∩ω(η) and x ∈ I0,y. Choose a sequence (xk)k∈N in R such that limk→∞ xk = ∞and limk→∞ Y (xk, η) = y. Then limk→∞(xk + x) = ∞ and,

limk→∞

Y (xk + x, η)Lem. 5.4(b)

= limk→∞

Y(

x, Y (xk, η)) (∗)= Y (x, y), (5.123)

proving Y (x, y) ∈ ω(η). At “(∗)”, we have used that, due to f being locally Lipschitzby hypothesis, Y is continuous by Th. 3.35.

Proposition 5.53. In the situation of Def. 5.47, let η ∈ Ω be such that there exists acompact set K ⊆ Ω, satisfying

Y (x, η) : x ≥ 0 ⊆ K(

resp. Y (x, η) : x ≤ 0 ⊆ K)

. (5.124)

Then the following hold:

(a) ω(η) 6= ∅ (resp. α(η) 6= ∅).

(b) ω(η) (resp. α(η)) is compact.

(c) ω(η) (resp. α(η)) is a connected set, i.e. if O1, O2 are disjoint open subsets of Kn

such that ω(η) ⊆ O1∪O2 (resp. α(η) ⊆ O1∪O2), then ω(η)∩O1 = ∅ or ω(η)∩O2 = ∅(resp. α(η) ∩O1 = ∅ or α(η) ∩O2 = ∅).

Proof. Due to Rem. 5.48, it suffices to prove the statements involving the omega limitsets.

(a): Since, by hypothesis, (Y (k, η))k∈N is a sequence in the compact set K, it must havea subsequence, converging to some limit y ∈ K. But then y ∈ ω(η), i.e. ω(η) 6= ∅.

5 STABILITY 116

(b): According to (5.100a) and (5.124), ω(η) is a closed subset of the compact set K,implying ω(η) to be compact as well.

(c): Seeking a contradiction, we suppose the assertion is false, i.e. there are disjointopen subsets O1, O2 of Kn such that ω(η) ⊆ O1 ∪ O2, ω1 := ω(η) ∩ O1 6= ∅ and ω2 :=ω(η) ∩ O2 6= ∅. Then ω1 and ω2 are disjoint since O1, O2 are disjoint. Moreover, ω1

and ω2 are both subsets of the compact set ω(η). Due to ω1 = ω(η) ∩ (Kn \ O2) andω2 = ω(η) ∩ (Kn \ O1), ω1 and ω2 are also closed, hence, compact. Then, accordingto Prop. C.10, δ := dist(ω1, ω2) > 0. If y1 ∈ ω1 and y2 ∈ ω2, then there are numbers0 < s1 < t1 < s2 < t2 < . . . such that limk→∞ sk = limk→∞ tk = ∞ and

∀k∈N

(

Y (sk, η) ∈ O1 ∧ Y (tk, η) ∈ O2

)

. (5.125)

Define∀k∈N

σk := sup

x ≥ sk : Y (t, η) ∈ O1 for each t ∈ [sk, x]

. (5.126)

Then sk < σk < tk and the continuity of the (even differentiable) map Y (·, η) yieldsηk := Y (σk, η) ∈ ∂O1. Thus, (ηk)k∈N is a sequence in the compact set K ∩ ∂O1 and,therefore, must have a convergent subsequence, converging to some z ∈ K ∩ ∂O1. Butthen z ∈ ω(η), but not in O1 ∪O2, in contradiction to ω(η) ⊆ O1 ∪O2.

Theorem 5.54 (LaSalle). Let Ω ⊆ Rn, n ∈ N, and f : Ω −→ Rn be such that y′ = f(y)admits unique maximal solutions. Moreover, let Ω0 be an open subset of Ω, assumeV : Ω0 −→ R is continuously differentiable, K := y ∈ Ω0 : V (y) ≤ r is compact forsome r ∈ R, and V (y) ≤ 0 (resp. V (y) ≥ 0) for each y ∈ K, where V is defined as in(5.33). If η ∈ Ω0 is such that V (η) < r, then the following hold:

(a) Y (·, η) is defined on all of R+0 (resp. on all of R−

0 ).

(b) One has ω(η) ⊆ K (resp. α(η) ⊆ K) and V is constant on ω(η) (resp. on α(η)).

(c) If f is locally Lipschitz, then, letting

M :=

y ∈ K : V(

Y (x, y))

= 0 for each x ≥ 0 (resp. for each x ≤ 0)

,

one has ω(η) ⊆ M (resp. α(η) ⊆ M). In particular, V (y) = 0 for each y ∈ ω(η)(resp. for each y ∈ α(η)).

Proof. As usual, it suffices to prove the assertions for V (y) ≤ 0, as the assertions forV (y) ≥ 0 then follow via time reversion.

(a): We claim∀

x∈I0,η∩R+0

V(

Y (x, η))

< r : (5.127)

Indeed, if (5.127) does not hold, then let

0 < s := sup

x ≥ 0 : V(

Y (t, η))

< r for each t ∈ [0, x]

∈ I0,η, (5.128)

5 STABILITY 117

and

r = V(

Y (s, η))

= V (η) +

∫ s

0

V(

Y (t, η))

dt ≤ V (η) < r, (5.129)

which is impossible. Thus, (5.127) must hold. However, (5.127) implies R+0 ⊆ I0,η, since

Y (·, η) is a maximal solution and K = y ∈ Ω0 : V (y) ≤ r is compact.

(b): Let φ := Y (·, η). During the proof of (a) above, we have shown φ(x) ∈ K for eachx ≥ 0. Since, then, (V φ)′(x) = V (φ(x)) ≤ 0 for each x ≥ 0, V φ is nonincreasing forx ≥ 0. Since V φ is also bounded on K,

∃c∈R

c = limx→∞

V(

φ(x))

. (5.130)

If y ∈ ω(η), then there exists a sequence (xk)k∈N in R such that limk→∞ xk = ∞ andlimk→∞ φ(xk) = y. Thus, y ∈ K (since K is closed), and

V (y) = limk→∞

V(

φ(xk)) (5.130)

= c, (5.131)

proving (b).

(c): Let y ∈ ω(η) and φ := Y (·, y). Since f is assumed to be locally Lipschitz, Prop.5.52 applies and we obtain φ(x) = Y (x, y) ∈ ω(η) for each x ∈ R+

0 . Using (b), we knowV to be constant on ω(η), i.e. V φ must be constant on R+

0 as well, implying

∀x∈R+

0

V (φ(x)) = (V φ)′(x) = 0 (5.132)

as claimed.

Example 5.55. Let a < 0 < b and let h : ]a, b[−→ R be continuously differentiable andsuch that

h(x)

< 0 for x < 0,

= 0 for x = 0,

> 0 for x > 0.

(5.133)

Consider the autonomous ODE

y′1 = y2, (5.134a)

y′2 = −y21 y2 − h(y1). (5.134b)

The right-hand side is defined on Ω :=]a, b[×R and is clearly C1, i.e. the ODE admitsunique maximal solutions. Due to (5.133), F = (0, 0), i.e. the origin is the only fixedpoint of (5.134). We will use Th. 5.54(c) to show (0, 0) is positively asymptoticallystable: We introduce

H : ]a, b[−→ R, H(x) :=

∫ x

0

h(t) dt , (5.135)

and the Lyapunov function

V : Ω −→ R, V (y1, y2) := H(y1) +y222. (5.136)

A DIFFERENTIABILITY 118

Since H is positive definite at 0 (H is actually strictly decreasing on ]a, 0] and strictlyincreasing on [0, b[), V is positive definite at (0, 0). We also obtain

V : Ω −→ R, V (y1, y2) =(

h(y1), y2)

•(

y2,−y21 y2 − h(y1))

= −y21y22 ≤ 0. (5.137)

Thus, from the Lyapunov Th. 5.30, we already know (0, 0) to be positively stable.However, V is not negative definite at (0, 0), i.e. we can not immediately conclude that(0, 0) is positively asymptotically stable. Instead, as promised, we apply Th. 5.54(c):To this end, using that H is continuous and positive definite at 0, we choose r > 0 andc, d ∈ R, satisfying

a < c < 0 < d < b and H(c) = H(d) = r, (5.138)

and define

O := (y1, y2) ∈ Ω : V (y1, y2) < r, (5.139)

K := (y1, y2) ∈ Ω : V (y1, y2) ≤ r. (5.140)

Then O is open since V is continuous, and it suffices to show

∀(η1,η2)∈O

limx→∞

Y (x, η1, η2) = (0, 0). (5.141)

Moreover, the continuity of V implies K to be closed. Since K ⊆ [c, d]× [−√2r,

√2r], it

is also bounded, i.e. compact. Thus, Th. 5.54 applies to each η ∈ O. So let η ∈ O. Wewill show that M = (0, 0), where M is the set of Th. 5.54(c) (then ω(η) = (0, 0) byTh. 5.54(c), which implies (5.141) as desired). To verifyM = (0, 0), note V (y1, y2) < 0for y1, y2 6= 0, showing (y1, y2) /∈ M . For y1 = 0, y2 6= 0, let φ := Y (·, y1, y2). Thenφ2(0) = y2 6= 0 and φ′

1(0) = y2 6= 0, i.e. both φ1 and φ2 are nonzero on some interval]0, ǫ[ with ǫ > 0, showing (y1, y2) /∈ M . Likewise, if y1 6= 0, y2 = 0, then let φ be asbefore. This time φ1(0) = y1 6= 0 and φ′

2(0) = −h(y1) 6= 0, again showing both φ1 andφ2 are nonzero on some interval ]0, ǫ[ with ǫ > 0, implying (y1, y2) /∈M .

A Differentiability

We provide a lemma used in the variation of constants Th. 2.3.

Lemma A.1. Let O ⊆ R be open. If the function a : O −→ K is differentiable, then

f : O −→ K, f(x) := ea(x) (A.1a)

is differentiable with

f ′ : O −→ K, f ′(x) := a′(x) ea(x). (A.1b)

B KN -VALUED INTEGRATION 119

Proof. For K = R, the lemma is immediate from the chain rule of [Phi16, Th. 9.11].It remains to consider the case K = C. Note that we can not apply the chain rule forholomorphic (i.e. C-differentiable functions), since a is only R-differentiable and it doesnot need to have a holomorphic extension. However, we can argue as follows, merelyusing the chain rule and the product rule for real-valued functions: Write a = b + icwith differentiable functions b, c : O −→ R. Then

f(x) = ea(x) = eb(x)+ic(x) = eb(x) eic(x) = eb(x)(

sin c(x) + i cos c(x))

. (A.2)

Thus, one computes

f ′(x) = b′(x) eb(x) eic(x) + eb(x)(

− c′(x) cos c(x) + ic′(x) sin c(x))

= b′(x) ea(x) + ic′(x) eb(x)(

i cos c(x) + sin c(x))

= b′(x) ea(x) + ic′(x) eb(x) eic(x)

=(

b′(x) + ic′(x))

ea(x) = a′(x) ea(x), (A.3)

proving (A.1b).

B Kn-Valued Integration

During the course of this class, we frequently need Kn-valued integrals. In particular,for f : I −→ Kn, I an interval in R, we make use of the estimate ‖

∫

If‖ ≤

∫

I‖f‖,

for example in the proof of the Peano Th. 3.8. As mentioned in the proof of Th. 3.8,the estimate can easily be checked directly for the 1-norm on Kn, but it does hold forevery norm on Kn. To verify this result is the main purpose of the present section.Throughout the class, it suffices to use Riemann integrals. However, some readersmight be more familiar with Lebesgue integrals, which is a more general notion (everyRiemann integrable function is also Lebesgue integrable). For convenience, the materialis presented twice, first using Riemann integrals and arguments that make specific useof techniques available for Riemann integrals, then, second, using Lebesgue integralsand corresponding techniques. For Riemann integrals, the norm estimate is proved inTh. B.4, for Lebesgue integrals in Th. B.9.

B.1 Kn-Valued Riemann Integral

Definition B.1. Let a, b ∈ R, I := [a, b]. We call a function f : I −→ Kn, n ∈ N,Riemann integrable if, and only if, each coordinate function fj = πj f : I −→ K,j = 1, . . . , n, is Riemann integrable. Denote the set of all Riemann integrable functionsfrom I into Kn by R(I,Kn). If f : I −→ Kn is Riemann integrable, then

∫

I

f :=

(∫

I

f1, . . . ,

∫

I

fn

)

∈ Kn (B.1)

is the (Kn-valued) Riemann integral of f over I.


Remark B.2. The linearity of the K-valued integral implies the linearity of the Kn-valued integral.

Theorem B.3. Let a, b ∈ R, a ≤ b, I := [a, b]. If f ∈ R(I,Kn), n ∈ N, and φ :f(I) −→ R is Lipschitz continuous, then φ f ∈ R(I,R).

Proof. If K = R, then φ f = ψ ι f , where ι : Rn −→ Cn is the canonical imbedding,and ψ : Cn −→ R, ψ(z1, . . . , zn) := φ(Re z1, . . . ,Re zn). Clearly, ι f ∈ R(I,Cn), and,if φ is L-Lipschitz, L ≥ 0, then, due to

∀z,w∈Cn

|ψ(z)− ψ(w)| = |φ(Re z)− φ(Rew)| ≤ L ‖Re z − Rew‖(∗)

≤ CL ‖Re z − Rew‖1 = CLn∑

j=1

|Re zj − Rewj|

[Phi16, Th. 5.9(d)]

≤ CLn∑

j=1

|zj − wj| = CL ‖z − w‖1(∗∗)

≤ CCL ‖z − w‖,

(B.2)

where the estimate at (∗) holds with C ∈ R+, due to the equivalence of ‖ ·‖ and ‖ ·‖1 onRn, and the estimate at (∗∗) holds with C ∈ R+, due to the equivalence of ‖ ·‖1 and ‖ ·‖on Cn. Thus, by (B.2), ψ is Lipschitz as well, namely CCL-Lipschitz, and it suffices toconsider the case K = C, which we proceed to do next. Once again using the equivalenceof ‖ · ‖1 and ‖ · ‖ on Cn, there exists c ∈ R+ such that ‖z‖ ≤ c‖z‖1 for each z ∈ Cn.Assume φ to be L-Lipschitz, L ≥ 0. If f ∈ R(I,Cn), then Re f1, . . . ,Re fn ∈ R(I,R)and Im f1, . . . , Im fn ∈ R(I,R), i.e., given ǫ > 0, Riemann’s integrability criterion of[Phi16, Th. 10.13] provides partitions ∆1, . . . ,∆n of I and Π1, . . . ,Πn of I such that

∀j=1,...,n

R(∆j,Re fj)− r(∆j,Re fj) <ǫ

2ncL,

R(Πj, Im fj)− r(Πj, Im fj) <ǫ

2ncL,

(B.3)

where R and r denote upper and lower Riemann sums, respectively (cf. [Phi16, (10.7)]).Letting ∆ be a joint refinement of the 2n partitions ∆1, . . . ,∆n,Π1, . . . ,Πn, we have (cf.[Phi16, Def. 10.8(a),(b)] and [Phi16, Th. 10.10(a)])

∀j=1,...,n

R(∆,Re fj)− r(∆,Re fj) <ǫ

2ncL,

R(∆, Im fj)− r(∆, Im fj) <ǫ

2ncL.

(B.4)

Recalling that, for each g : I −→ R and ∆ = (x0, . . . , xN ) ∈ RN+1, N ∈ N, a = x0 <x1 < · · · < xN = b, Ik := [xk−1, xk], it is

r(∆, g) =N∑

k=1

mk |Ik| =N∑

k=1

mk(g)(xk − xk−1), (B.5a)

R(∆, g) =N∑

k=1

Mk |Ik| =N∑

k=1

Mk(g)(xk − xk−1), (B.5b)


wheremk(g) := infg(x) : x ∈ Ik, Mk(g) := supg(x) : x ∈ Ik, (B.5c)

we obtain, for each ξk, ηk ∈ Ik,

∣

∣(φ f)(ξk)− (φ f)(ηk)∣

∣ ≤ L∥

∥f(ξk)− f(ηk)∥

∥ ≤ cL∥

∥f(ξk)− f(ηk)∥

∥

1

= cL

n∑

j=1

∣

∣fj(ξk)− fj(ηk)∣

∣

[Phi16, Th. 5.9(d)]

≤ cLn∑

j=1

∣

∣Re fj(ξk)− Re fj(ηk)∣

∣+ cLn∑

j=1

∣

∣ Im fj(ξk)− Im fj(ηk)∣

∣

≤ cLn∑

j=1

(

Mk(Re fj)−mk(Re fj))

+ cLn∑

j=1

(

Mk(Im fj)−mk(Im fj))

. (B.6)

Thus,

R(∆, φ f)− r(∆, φ f) =N∑

k=1

(

Mk(φ f)−mk(φ f))

|Ik|

(B.6)

≤ cL

N∑

k=1

n∑

j=1

(

Mk(Re fj)−mk(Re fj))

|Ik|

+ cLN∑

k=1

n∑

j=1

(

Mk(Im fj)−mk(Im fj))

|Ik|

= cLn∑

j=1

(

R(∆,Re fj)− r(∆,Re fj))

+ cLn∑

j=1

(

R(∆, Im fj)− r(∆, Im fj))

(B.4)< 2ncL

ǫ

2ncL= ǫ. (B.7)

Thus, φ f ∈ R(I,R) by [Phi16, Th. 10.13].

Theorem B.4. Let a, b ∈ R, a ≤ b, I := [a, b]. For each norm ‖ · ‖ on Kn, n ∈ N, andeach Riemann integrable f : I −→ Kn, it is ‖f‖ ∈ R(I,R), and the following holds:

∥

∥

∥

∥

∫

I

f

∥

∥

∥

∥

≤∫

I

‖f‖. (B.8)

Proof. From Th. B.3, we obtain ‖f‖ ∈ R(I,R), as the norm ‖ · ‖ is 1-Lipschitz by theinverse triangle inequality. Let ∆ be an arbitrary partition of I. Recalling that, foreach g : I −→ R and ∆ = (x0, . . . , xN) ∈ RN+1, N ∈ N, a = x0 < x1 < · · · < xN = b,Ik := [xk−1, xk], ξk ∈ Ik, the intermediate Riemann sums

ρ(∆, f) =N∑

k=1

f(tk) |Ik| =N∑

k=1

f(tk)(xk − xk−1), (B.9)


we obtain, for ξk ∈ Ik,∥

∥

∥

(

(

ρ(∆,Re f1), ρ(∆, Im f1))

, . . . ,(

ρ(∆,Re fn), ρ(∆, Im fn))

)∥

∥

∥

=

∥

∥

∥

∥

∥

((

N∑

k=1

Re f1(ξk) |Ik|,N∑

k=1

Im f1(ξk) |Ik|)

, . . . ,

(

N∑

k=1

Re fn(ξk) |Ik|,N∑

k=1

Im fn(ξk) |Ik|))∥

∥

∥

∥

∥

=

∥

∥

∥

∥

∥

N∑

k=1

((

Re f1(ξk) |Ik|, Im f1(ξk) |Ik|)

, . . . ,(

Re fn(ξk) |Ik|, Im fn(ξk) |Ik|))

∥

∥

∥

∥

∥

≤N∑

k=1

∥

∥

∥

∥

((

Re f1(ξk), Im f1(ξk))

, . . . ,(

Re fn(ξk), Im fn(ξk)))

∥

∥

∥

∥

|Ik|

=N∑

k=1

‖f(ξk)‖ |Ik| = ρ(∆, ‖f‖). (B.10)

Since the intermediate Riemann sums in (B.10) converge to the respective integrals by[Phi16, (10.25b)], one obtains∥

∥

∥

∥

∫

I

f

∥

∥

∥

∥

= lim|∆|→0

∥

∥

∥

(

(

ρ(∆,Re f1), ρ(∆, Im f1))

, . . . ,(

ρ(∆,Re fn), ρ(∆, Im fn))

)∥

∥

∥

(B.10)

≤ lim|∆|→0

ρ(∆, ‖f‖) =∫

I

‖f‖, (B.11)

proving (B.8).

B.2 Kn-Valued Lebesgue Integral

Definition B.5. Let I ⊆ R be (Lebesgue) measurable, n ∈ N.

(a) A function f : I −→ Kn is called (Lebesgue) measurable (respectively, (Lebesgue)integrable) if, and only if, each coordinate function fj = πj f : I −→ K, j =1, . . . , n, is (Lebesgue) measurable (respectively, (Lebesgue) integrable), which, forK = C, means if, and only if, each Re fj and each Im fj, j = 1, . . . , n, is (Lebesgue)measurable (respectively, (Lebesgue) integrable).

(b) If f : I −→ Kn is integrable, then∫

I

f :=

(∫

I

f1, . . . ,

∫

I

fn

)

∈ Kn (B.12)

is the (Kn-valued) (Lebesgue) integral of f over I.

Remark B.6. The linearity of the K-valued integral implies the linearity of the Kn-valued integral.

Theorem B.7. Let I ⊆ R be measurable, n ∈ N. Then f : I −→ Kn is measurable inthe sense of Def. B.5(a) if, and only if, f−1(O) is measurable for each open subset O ofKn.


Proof. Assume f−1(O) is measurable for each open subset O of Kn. Let j ∈ 1, . . . , n.If Oj ⊆ K is open in K, then O := π−1

j (Oj) = z ∈ Kn : zj ∈ Oj is open in Kn.

Thus, f−1j (Oj) = f−1(O) is measurable, showing that each fj is measurable, i.e. f is

measurable. Now assume f is measurable, i.e. each fj is measurable. Since every openO ⊆ Kn is a countable union of open sets of the form O = O1 × · · · × On with each Oj

being an open subset of K, it suffices to show that the preimages of such open sets aremeasurable. So let O be as above. Then f−1(O) =

⋂nj=1 f

−1j (Oj), showing that f−1(O)

is measurable.

Corollary B.8. Let I ⊆ R be measurable, n ∈ N. If f : I −→ Kn is measurable, then‖f‖ : I −→ R is measurable.

Proof. If O ⊆ R is open, then ‖ · ‖−1(O) is an open subset of Kn by the continuity ofthe norm. In consequence, ‖f‖−1(O) = f−1

(

‖ · ‖−1(O))

is measurable.

Theorem B.9. Let I ⊆ R be measurable, n ∈ N. For each norm ‖ · ‖ on Kn and eachintegrable f : I −→ Kn, the following holds:

∥

∥

∥

∥

∫

I

f

∥

∥

∥

∥

≤∫

I

‖f‖. (B.13)

Proof. First assume that B ⊆ I is measurable, y ∈ Kn, and f = y χB, where χB is thecharacteristic function of B (i.e. the fj are yj on B and 0 on I \B). Then

∥

∥

∥

∥

∫

I

f

∥

∥

∥

∥

=∥

∥

∥

(

y1λ(B), . . . , ynλ(B))

∥

∥

∥= λ(B)‖y‖ =

∫

I

‖f‖, (B.14)

where λ denotes Lebesgue measure on R. Next, consider the case that f is a so-calledsimple function, that means f takes only finitely many values y1, . . . , yN ∈ Kn, N ∈ N,and each preimage Bj := f−1yj ⊆ I is measurable. Then

f =N∑

j=1

yjχBj, (B.15)

where, without loss of generality, we may assume that the Bj are pairwise disjoint. Weobtain

∥

∥

∥

∥

∫

I

f

∥

∥

∥

∥

≤N∑

j=1

∥

∥

∥

∥

∫

I

yjχBj

∥

∥

∥

∥

=N∑

j=1

∫

I

∥

∥yjχBj

∥

∥ =

∫

I

N∑

j=1

∥

∥yjχBj

∥

∥

(∗)=

∫

I

∥

∥

∥

∥

∥

N∑

j=1

yjχBj

∥

∥

∥

∥

∥

=

∫

I

‖f‖, (B.16)

where, at (∗), it was used that, as the Bj are disjoint, the integrands of the two integralsare equal at each x ∈ I.

C METRIC SPACES 124

Now, if f is integrable, then each Re fj and each Im fj is integrable (i.e. Re fj, Im fj ∈L1(I)) and there exist sequences of simple functions φj,k : I −→ R and ψj,k : I −→ R

such that limk→∞ ‖φj,k − Re fj‖L1(I) = limk→∞ ‖ψj,k − Im fj‖L1(I) = 0. In particular,

0 ≤ limk→∞

∣

∣

∣

∣

∫

I

φj,k −∫

I

Re fj

∣

∣

∣

∣

≤ limk→∞

‖φj,k − Re fj‖L1(I) = 0, (B.17a)

0 ≤ limk→∞

∣

∣

∣

∣

∫

I

ψj,k −∫

I

Im fj

∣

∣

∣

∣

≤ limk→∞

‖ψj,k − Im fj‖L1(I) = 0, (B.17b)

and also

0 ≤ limk→∞

‖φj,k + iψj,k − fj‖L1(I)

≤ limk→∞

‖φj,k − Re fj‖L1(I) + limk→∞

‖ψj,k − Im fj‖L1(I) = 0. (B.18)

Thus, we obtain∥

∥

∥

∥

∫

I

f

∥

∥

∥

∥

=

∥

∥

∥

∥

(∫

I

f1, . . . ,

∫

I

fn

)∥

∥

∥

∥

=

∥

∥

∥

∥

(

limk→∞

∫

I

φ1,k + i limk→∞

∫

I

ψ1,k, . . . , limk→∞

∫

I

φn,k + i limk→∞

∫

I

ψn,k

)∥

∥

∥

∥

= limk→∞

∥

∥

∥

∥

∫

I

(φk + i ψk)

∥

∥

∥

∥

≤ limk→∞

∫

I

‖φk + iψk‖(∗)=

∫

I

‖f‖, (B.19)

where the equality at (∗) holds due to limk→∞

∥

∥‖(φ1,k, . . . , φn,k)‖−‖f‖∥

∥

L1(I)= 0, which,

in turn, is verified by

0 ≤∫

I

∣

∣‖φk + iψk‖ − ‖f‖∣

∣ ≤∫

I

‖φk + iψk − f‖ ≤ C

∫

I

‖φk + iψk − f‖1

= C

∫

I

n∑

j=1

∣

∣φj,k + iψj,k − fj∣

∣→ 0 for k → ∞, (B.20)

with C ∈ R+ since the norms ‖ · ‖ and ‖ · ‖1 are equivalent on Kn.

C Metric Spaces

C.1 Distance in Metric Spaces

Lemma C.1. The following law holds in every metric space (X, d):

|d(x, y)− d(x′, y′)| ≤ d(x, x′) + d(y, y′) for each x, x′, y, y′ ∈ X. (C.1)

In particular, (C.1) states the Lipschitz continuity of d : X2 −→ R+0 (with Lipschitz

constant 1) with respect to the metric d1 on X2 defined by

d1 : X2 ×X2 −→ R+

0 , d1(

(x, y), (x′, y′))

= d(x, x′) + d(y, y′). (C.2)

Further consequences are the continuity and even uniform continuity of d, and also thecontinuity of d in both components.

C METRIC SPACES 125

Proof. First, note d(x, y) ≤ d(x, x′) + d(x′, y′) + d(y′, y), i.e.

d(x, y)− d(x′, y′) ≤ d(x, x′) + d(y′, y). (C.3a)

Second, d(x′, y′) ≤ d(x′, x) + d(x, y) + d(y, y′), i.e.

d(x′, y′)− d(x, y) ≤ d(x′, x) + d(y, y′). (C.3b)

Taken together, (C.3a) and (C.3b) complete the proof of (C.1).

Definition C.2. Let (X, d) be a nonempty metric space. For each A,B ⊆ X define thedistance between A and B by

dist(A,B) := infd(a, b) : a ∈ A, b ∈ B ∈ [0,∞] (C.4)

and∀

x∈Xdist(x,B) := dist(x, B) and dist(A, x) := dist(A, x). (C.5)

Remark C.3. Clearly, for dist(A,B) as defined in (C.4), we have

dist(A,B) <∞ ⇔ A 6= ∅ and B 6= ∅. (C.6)

Theorem C.4. Let (X, d) be a nonempty metric space. If A ⊆ X and A 6= ∅, then thefunctions

δ, δ : X −→ R+0 , δ(x) := dist(x,A), δ(x) := dist(A, x), (C.7)

are both Lipschitz continuous with Lipschitz constant 1 (in particular, they are bothcontinuous and even uniformly continuous).

Proof. Since dist(x,A) = dist(A, x), it suffices to verify the Lipschitz continuity of δ.We need to show

∀x,y∈X

| dist(x,A)− dist(y, A)| ≤ d(x, y). (C.8)

To this end, let x, y ∈ X and a ∈ A be arbitrary. Then

dist(x,A) ≤ d(x, a) ≤ d(x, y) + d(y, a) (C.9)

anddist(x,A)− d(x, y) ≤ d(y, a), (C.10)

implyingdist(x,A)− d(x, y) ≤ dist(y, A) (C.11)

anddist(x,A)− dist(y, A) ≤ d(x, y). (C.12)

Since x, y ∈ X were arbitrary, (C.12) also yields

dist(y, A)− dist(x,A) ≤ d(x, y), (C.13)

where (C.12) and (C.13) together are precisely (C.8).

C METRIC SPACES 126

Definition C.5. Let (X, d) be a metric space, A ⊆ X, and ǫ ∈ R+. Define

Aǫ := x ∈ X : d(x,A) < ǫ, (C.14a)

Aǫ := x ∈ X : d(x,A) ≤ ǫ. (C.14b)

We call Aǫ the open ǫ-fattening of A, and Aǫ the closed ǫ-fattening of A.

Lemma C.6. Let (X, d) be a metric space, A ⊆ X, and ǫ ∈ R+. Then Aǫ, the openǫ-fattening of A, is, indeed, open, and Aǫ, the closed ǫ-fattening of A, is, indeed, closed.

Proof. Since the distance function δ : X −→ R+0 , δ(x) := dist(x,A), is continuous by

Th. C.4, Aǫ = δ−1[0, ǫ[ is open as the continuous preimage of an open set (note that [0, ǫ[is, indeed, (relatively) open in R+

0 ); Aǫ = δ−1[0, ǫ] is closed as the continuous preimageof a closed set.

Lemma C.7. Let (X, d) be a metric space, A ⊆ X, and ǫ ∈ R+. If A is bounded, thenso are the fattenings Aǫ and Aǫ.

Proof. If A is bounded, then there exist x ∈ X and r > 0 such that A ⊆ Br(x). Lets := r + ǫ+ 1. If y ∈ Aǫ, then there exists a ∈ A such that d(a, y) < ǫ+ 1. Thus,

d(x, y) ≤ d(x, a) + d(a, y) < r + ǫ+ 1 = s, (C.15)

showing Aǫ ⊆ Aǫ ⊆ Bs(x), i.e. Aǫ and Aǫ are bounded.

Proposition C.8. Let (X, d) be a metric space, A ⊆ X, and 0 < ǫ1 < ǫ2.

(a) Then A ⊆ Aǫ1 ⊆ Aǫ1 ⊆ Aǫ2 ⊆ Aǫ2 always holds.

(b) If (X, ‖·‖) is a normed space with d being the induced metric, ∅ 6= A ⊆ X, and thereexists x /∈ A, satisfying δ := d(x,A) ≥ ǫ2, then all the inclusions in (a) are strict:A ( Aǫ1 ( Aǫ1 ( Aǫ2 ( Aǫ2. Caveat: For general metric spaces X and A satisfyingall the hypotheses, the inclusions do not need to be strict (consider discrete metricspaces for simple examples).

Proof. (a) is immediate from (C.14).

To prove (b), let a ∈ A and consider the maps

φ : [0, 1] −→ X, φ(t) := tx+ (1− t)a, (C.16a)

f : [0, 1] −→ R, f(t) := d(

φ(t), A)

. (C.16b)

If (sn)n∈N is a sequence in [0, 1] such that limn→∞ sn = s ∈ [0, 1], then limn→∞ φ(sn) =sx+(1−s)a = φ(s), i.e. φ is continuous. Then, using Th. C.4, f is also continuous. Thus,since f(0) = d(a,A) = 0 and f(1) = d(x,A) = δ ≥ ǫ2, one can use the intermediatevalue theorem [Phi16, Th. 7.57] to obtain, for each ǫ ∈ [0, ǫ2], some τ ∈ [0, 1], satisfyingf(τ) = ǫ. If ǫ > 0, then d(φ(τ), A) = f(τ) = ǫ > 0, i.e φ(τ) ∈ Aǫ \A and φ(τ) ∈ Aǫ \Aǫ,showing A ( Aǫ1 , Aǫ1 ( Aǫ1 , and Aǫ2 ( Aǫ2 . If ǫ := (ǫ1 + ǫ2)/2, then ǫ1 < ǫ = f(τ) =d(φ(τ), A) < ǫ2, i.e. φ(τ) ∈ Aǫ2 \ Aǫ1 , showing Aǫ1 ( Aǫ2 .

C METRIC SPACES 127

C.2 Compactness in Metric Spaces

Definition C.9. A subset C of a metric space X is called compact if, and only if, everysequence in C has a subsequence that converges to some limit c ∈ C.

Proposition C.10. Let (X, d) be a metric space, C,A ⊆ X. If C is compact, A isclosed, and A ∩ C = ∅, then dist(C,A) > 0.

Proof. Proceeding by contraposition, we show that dist(C,A) = 0 implies A ∩ C 6= ∅.If dist(C,A) = 0, then there exists a sequence ((ck, ak))k∈N in C × A such that

limk→∞

d(ck, ak) = 0. (C.17)

As C is compact, we may assume

limk→∞

ck = c ∈ C, (C.18)

also implying

limk→∞

ak = c, since ∀k∈N

d(ak, c) ≤ d(ak, ck) + d(ck, c). (C.19)

Since A is closed, (C.19) yields c ∈ A, i.e. c ∈ A ∩ C.

Proposition C.11. Let (X, d) be a metric space and C ⊆ X.

(a) If C is compact, then C is closed and bounded.

(b) If C is compact and A ⊆ C is closed, then A is compact.

Proof. (a): Suppose C is compact. Let (xk)k∈N be a sequence in C that converges inX, i.e. limk→∞ xk = x ∈ X. Since C is compact, (xk)k∈N must have a subsequencethat converges to some c ∈ C, implying x = c ∈ C and showing C is closed. If Cis not bounded, then, for each x ∈ X, there is a sequence (xk)k∈N in C such thatlimk→∞ d(x, xk) = ∞. If y ∈ X, then d(x, xk) ≤ d(x, y) + d(y, xk), i.e. d(y, xk) ≥d(x, xk)−d(x, y), showing that limk→∞ d(y, xk) = ∞ as well. Thus, y can not be a limitof any subsequence of (xk)k∈N. As y was arbitrary, C can not be compact.

(b): If (xk)k∈N is a sequence in A, then (xk)k∈N is a sequence in C. Since C is compact,it must have a subsequence that converges to some c ∈ C. However, as A is closed, cmust be in A, showing that (xk)k∈N has a subsequence that converges to some c ∈ A,i.e. A is compact.

Corollary C.12. A subset C of Kn, n ∈ N, is compact if, and only if, C is closed andbounded.

Proof. Every compact set is closed and bounded by Prop. C.11(a). If C is closed andbounded, and (xk)k∈N is a sequence in C, then the boundedness and the Bolzano-Weierstrass theorem yield a subsequence that converges to some x ∈ Kn. However,since C is closed, x ∈ C, showing that C is compact.

C METRIC SPACES 128

The following examples show that, in general, sets can be closed and bounded withoutbeing compact.

Example C.13. (a) If (X, d) is a noncomplete metric space, than it contains a Cauchysequence that does not converge. It is not hard to see that such a sequence cannot have a convergent subsequence, either. This shows that no noncomplete metricspace can be compact. Moreover, the closure of every bounded subset of X thatcontains such a nonconvergent Cauchy sequence is an example of a closed andbounded set that is noncompact. Concrete examples are given by Q∩ [a, b] for eacha, b ∈ R with a < b (these sets are Q-closed, but not R-closed!) and ]a, b[ for eacha, b ∈ R with a < b, in each case endowed with the usual metric d(x, y) := |x− y|.

(b) There can also be closed and bounded sets in complete spaces that are not compact.Consider the space X of all bounded sequences (xn)n∈N in K, endowed with the sup-norm ‖(xn)n∈N‖sup := sup|xn| : n ∈ N. It is not too difficult to see that X withthe sup-norm is a Banach space: Let (xk)k∈N with xk = (xkn)n∈N be a Cauchysequence in X. Then, for each n ∈ N, (xkn)k∈N is a Cauchy sequence in K, and,thus, it has a limit yn ∈ K. Let y := (yn)n∈N. Then

‖xk − y‖sup = sup|xkn − yn| : n ∈ N.

Let ǫ > 0. As (xk)k∈N is a Cauchy sequence with respect to the sup-norm, there isN ∈ N such that ‖xk−xl‖sup < ǫ for all k, l > N . Fix some l > N and some n ∈ N.Then ǫ ≥ limk→∞ |xkn − xln| = limk→∞ |yn − xln|. Since this is valid for each n ∈ N,we get ‖xl − y‖sup ≤ ǫ for each l > N , showing liml→∞ xl = y, i.e. X is completeand a Banach space.

Now consider the sequence (ek)k∈N with

ekn :=

1 for k = n,

0 otherwise.

Then (ek)k∈N constitutes a sequence in X with ‖ek‖sup = 1 for each k ∈ N. In par-ticular, (ek)k∈N is a sequence inside the closed unit ball B1(0), and, hence, bounded.However, if k, l ∈ N with k 6= l, then ‖ek−el‖sup = 1. Thus, neither (ek)k∈N nor anysubsequence can be a Cauchy sequence. In particular, no subsequence can converge,showing that the closed and bounded unit ball B1(0) is not compact.

Note: There is an important result that shows that a normed vector space is finite-dimensional if, and only if, the closed unit ball B1(0) is compact (see, e.g., [Str08,Th. 28.14]).

Theorem C.14. If (X, dX) and (Y, dY ) are metric spaces, C ⊆ X is compact, andf : C −→ Y is continuous, then f(C) is compact.

Proof. If (yk)k∈N is a sequence in f(C), then, for each k ∈ N, there is some xk ∈ Csuch that f(xk) = yk. As C is compact, there is a subsequence (ak)k∈N of (xk)k∈Nwith limk→∞ ak = a for some a ∈ C. Then (f(ak))k∈N is a subsequence of (yk)k∈N and

C METRIC SPACES 129

the continuity of f yields limk→∞ f(ak) = f(a) ∈ f(C), showing that (yk)k∈N has aconvergent subsequence with limit in f(C). We have therefore established that f(C) iscompact.

Theorem C.15. If (X, d) is a metric space, C ⊆ X is compact, and f : C −→ R iscontinuous, then f assumes its max and its min, i.e. there are xm ∈ C and xM ∈ Csuch that f has a global min at xm and a global max at xM .

Proof. Since C is compact and f is continuous, f(C) ⊆ R is compact according to Th.C.14. Then, by [Phi16, Lem. 7.53], f(C) contains a smallest element m and a largestelement M . This, in turn, implies that there are xm, xM ∈ C such that f(xm) = m andf(xM) =M .

Theorem C.16. If (X, dX) and (Y, dY ) are metric spaces, C ⊆ X is compact, andf : C −→ Y is continuous, then f is uniformly continuous.

Proof. If f is not uniformly continuous, then there must be some ǫ > 0 such that, foreach k ∈ N, there exist xk, yk ∈ C satisfying dX(x

k, yk) < 1/k and dY (f(xk), f(yk)) ≥ ǫ.

Since C is compact, there is a ∈ C and a subsequence (ak)k∈N of (xk)k∈N such thata = limk→∞ ak. Then there is a corresponding subsequence (bk)k∈N of (yk)k∈N such thatdX(a

k, bk) < 1/k and dY (f(ak), f(bk)) ≥ ǫ for all k ∈ N. Using the compactness of C

again, there is b ∈ C and a subsequence (vk)k∈N of (bk)k∈N such that b = limk→∞ vk.Now there is a corresponding subsequence (uk)k∈N of (ak)k∈N such that dX(u

k, vk) <1/k and dY (f(u

k), f(vk)) ≥ ǫ for all k ∈ N. Note that we still have a = limk→∞ vk.Given α > 0, there is N ∈ N such that, for each k > N , one has dX(a, u

k) < α/3,dX(b, v

k) < α/3, and dX(uk, vk) < 1/k < α/3. Thus, dX(a, b) < dX(a, u

k)+dX(uk, vk)+

dX(b, vk) < α, implying d(a, b) = 0 and a = b. Finally, the continuity of f implies

f(a) = limk→∞ f(uk) = limk→∞ f(vk) in contradiction to dY (f(uk), f(vk)) ≥ ǫ.

Theorem C.17. If (X, dX) and (Y, dY ) are metric spaces, C ⊆ X is compact, andf : C −→ Y is continuous and one-to-one, then f−1 : f(C) −→ C is continuous.

Proof. Let (yk)k∈N be a sequence f(C) such that limk→∞ yk = y ∈ f(C). Then thereis a sequence (xk)k∈N in C such that f(xk) = yk for each k ∈ N. Let x := f−1(y).It remains to prove that limk→∞ xk = x. As C is compact, there is a ∈ C and asubsequence (ak)k∈N of (xk)k∈N such that a = limk→∞ ak. The continuity of f yieldsf(a) = limk→∞ f(ak) = limk→∞ yk = y = f(x) since (f(ak))k∈N is a subsequence of(yk)k∈N. It now follows that a = x since f is one-to-one. The same argument showsthat every convergent subsequence of (xk)k∈N has to converge to x. If (xk)k∈N did notconverge to x, then there had to be some ǫ > 0 such that infinitely man xk are not inBǫ(x). However, the compactness of C would provide a convergent subsequence whoselimit could not be x, in contradiction to x having to be the limit of all convergentsubsequences of (xk)k∈N.

Definition C.18. A subset A of a metric space (X, d) is called precompact or totallybounded if, and only if, for each ǫ > 0, A can be covered by finitely many ǫ-balls, i.e. if,

C METRIC SPACES 130

and only if, there exist finitely many points a1, . . . , aN ∈ A, N ∈ N, such that

A ⊆N⋃

j=1

Bǫ(aj). (C.20)

Theorem C.19. For a subset C of a metric space (X, d), the following statements areequivalent:

(i) C is compact as defined in Def. C.9.

(ii) C has the Heine-Borel property, i.e. every open cover of C has a finite subcover,i.e. if (Oj)j∈I is a family of open sets Oj ⊆ X, satisfying

C ⊆⋃

j∈I

Oj, (C.21)

then there exist j1, . . . , jN ∈ I, N ∈ N, such that C ⊆ ⋃Nk=1Ojk .

(iii) C is precompact (i.e. totally bounded) as defined in Def. C.18 and complete, i.e.every Cauchy sequence in C converges to a limit in C.

Proof. We show (i) ⇒ (iii) ⇒ (ii) ⇒ (i).

“(i) ⇒ (iii)”: Let (cn)n∈N be a Cauchy sequence in C. As C is compact, (cn)n∈N has asubsequence (cnj

)j∈N such that limj→∞ cnj= c ∈ C. Given ǫ > 0 choose K ∈ N such

that, for each m,n ≥ K, d(cm, cn) <ǫ2, and such that, for each nj ≥ K, d(cnj

, c) < ǫ2.

Then, fixing some nj ≥ K,

∀n≥K

d(cn, c) ≤ d(cn, cnj) + d(cnj

, c) <ǫ

2+ǫ

2= ǫ, (C.22)

showing limn→∞ cn = c and the completeness of C. We now show C to be also totallybounded. We proceed by contraposition and assume C not to be totally bounded, i.e.there exists ǫ > 0 such that C is not contained in any finite union of ǫ-balls. Inductively,we construct a sequence (cn)n∈N in C such that

∀m,n∈N,m 6=n

d(cm, cn) ≥ ǫ : (C.23)

To start with, we note C 6= ∅ and choose some arbitrary c1 ∈ C. Assuming c1, . . . , ck ∈C, k ∈ N, have already been constructed such that d(cm, cn) ≥ ǫ holds for each m,n ∈1, . . . , k, there must be

c ∈ C \k⋃

j=1

Bǫ(cj). (C.24)

Choosing ck+1 := c, (C.24) guarantees (C.23) now holds for each m,n ∈ 1, . . . , k + 1.Due to (C.23), no subsequence of (cn)n∈N can be a Cauchy sequence, i.e. (cn)n∈N doesnot have a convergent subsequence, proving C is not compact.

C METRIC SPACES 131

“(iii) ⇒ (ii)”: Assume C to be precompact and complete. For each k ∈ N, the precom-pactness yields points ck1, . . . , c

kNk

∈ C, Nk ∈ N, such that

C ⊆Nk⋃

j=1

B 1k(ckj ). (C.25)

Seeking a contradiction, assume C does not have the Heine-Borel property, i.e. thereexists an open cover (Oj)j∈I of C which does not have a finite subcover. Inductively, weconstruct a decreasing sequence of subsets Ck of C, C ⊇ C1 ⊇ C2 ⊇ . . . , such that noCk can be covered by a finite subcover of (Oj)j∈I and such that

∀k∈N

∃j∈1,...,Nk

Ck ⊆ B 1k(ckj ) : (C.26)

To start out, we note that (C.25) implies at least one of the finitely many sets C ∩B1(c

11), . . . , C∩B1(c

1N1) can not be covered by a finite subcover of (Oj)j∈I , say, C∩B1(c

1j1).

Define C1 := C∩B1(c1j1). Then, given C1, . . . , Ck have already been constructed for some

k ∈ N, since Ck can not be covered by a finite subcover of (Oj)j∈I and

Ck ⊆ C ⊆Nk+1⋃

j=1

B 1k+1

(ck+1j ), (C.27)

there exists jk+1 ∈ 1, . . . , Nk+1 such that Ck ∩ B 1k+1

(ck+1jk+1

) can not be covered by a

finite subcover of (Oj)j∈I , either. Define Ck+1 := Ck ∩ B 1k+1

(ck+1jk+1

). For each k ∈ N,

choose some sk ∈ Ck (note Ck 6= ∅, as it can not be covered by finitely many Oj). Givenǫ > 0, there is K ∈ N such that 2

K< ǫ. If k, l ≥ K, then sk, sl ∈ CK ⊆ B 1

K(cKj ) for some

suitable j ∈ 1, . . . , NK. In particular, d(sk, sl) <2K< ǫ, showing (sk)k∈N is a Cauchy

sequence. As (sk)k∈N is a Cauchy sequence in C and C is complete, there exists c ∈ Csuch that limk→∞ sk = c. However, then there must exist some j ∈ I such that c ∈ Oj

and, since Oj is open, there is ǫ > 0 with Bǫ(c) ⊆ Oj, and Bǫ(c) must contain almostall of the sk. Choose k sufficiently large such that 1

k< ǫ

4and d(sk, c) <

ǫ2. Then, since

sk ∈ Ck ⊆ B 1k(ckj ), (C.28)

one has

∀x∈B 1

k(ckj )

d(x, c) ≤ d(x, sk) + d(sk, c) <2

k+ǫ

2<

2ǫ

4+ǫ

2= ǫ, (C.29)

showing Ck ⊆ B 1k(ckj ) ⊆ Bǫ(c) ⊆ Oj, in contradiction to Ck not being coverable by

finitely many Oj.

“(ii)⇒ (i)”: Assume C has the Heine-Borel property. Seeking a contradiction, assume Cis not compact, that means there exists a sequence (cn)n∈N in C such that no subsequenceof (cn)n∈N converges to a limit in C. According to [Phi15, Prop. 1.38(d)], no c ∈ C canbe a cluster point of (cn)n∈N, i.e., for each c ∈ C, there exists ǫc > 0 such that Bǫc(c)contains only finitely many of the cn. Since C ⊆ ⋃

c∈C Bǫc(c), the family(

Bǫc(c))

c∈Cconstitutes an open cover of C. As C has the Heine-Borel property, there exist finitelymany points a1, . . . , aN ∈ C, N ∈ N, such that C ⊆ ⋃N

j=1Bǫaj(aj), i.e. C contains only

finitely many of the cn, in contradiction to (cn)n∈N being a sequence in C.

D LOCAL LIPSCHITZ CONTINUITY 132

Caveat C.20. In general topological spaces, one defines compactness via the Heine-Borel property (a topological space C is defined to be compact if, and only if, C hasthe Heine-Borel property). Moreover, a topological space C is defined to be sequentiallycompact if, and only if, every sequence in C has a convergent subsequence. Using thisterminology, one can rephrase the equivalence between (i) and (ii) in Th. C.19 by statingthat a metric space is sequentially compact if, and only if, it is compact. However, ingeneral topological spaces, neither implication remains true ((iii) of Th. C.19 does noteven make sense in general topological spaces, as the concepts of boundedness, totalboundedness, and Cauchy sequences are, in general, not available): For an exampleof a topological space that is compact, but not sequentially compact, see, e.g. [Pre75,7.2.10(a)]; for an example of a topological space that is sequentially compact, but notcompact, see, e.g. [Pre75, 7.2.10(c)].

Theorem C.21 (Lebesgue Number). Let (X, d) be a metric space and C ⊆ X. If C iscompact and (Oj)j∈I is an open cover of C, then there exists a Lebesgue number δ forthe open cover, i.e. some δ > 0 such that, for each A ⊆ C with diamA < δ, there existsj0 ∈ I, where A ⊆ Oj0. Recall that

diamA =

0 for A = ∅,sup

d(x, y) : x, y ∈ A

for ∅ 6= A.(C.30)

Proof. Seeking a contradiction, assume there is no Lebesgue number for the open cover(Oj)j∈I . Then there are sequences (xk)k∈N in C and (Ak)k∈N in P(C) such that

∀k∈N

(

xk ∈ Ak, diamAk <1

k, and ∀

j∈IAk 6⊆ Oj

)

. (C.31)

As C is compact, we may assume that limk→∞ xk = c ∈ C. Then there must be Oj

such that c ∈ Oj and ǫ > 0 such that Bǫ(c) ⊆ Oj. If k ∈ N is such that 1k< ǫ

2and

d(xk, c) <ǫ2, then, for each a ∈ Ak, we have d(a, c) ≤ d(a, xk) + d(xk, c) <

ǫ2+ ǫ

2= ǫ,

implying the contradiction Ak ⊆ Bǫ(c) ⊆ Oj.

D Local Lipschitz Continuity

In Prop. 3.13, it was shown that a continuous function is locally Lipschitz with respectto y if, and only if, it is globally Lipschitz with respect to y on every compact set.The following Prop. D.1 shows that this equivalence holds even if f is not continuous,provided that each projection Gx as in (D.1) below is convex. On the other hand, Ex.D.2 shows that, in general, there exist discontinuous functions that are locally Lipschitzwith respect to y without being globally Lipschitz with respect to y on every compactset.

Proposition D.1. Let m,n ∈ N, G ⊆ R × Km, and f : G −→ Kn. If G is such thateach projection

Gx := y ∈ Km : (x, y) ∈ G, x ∈ R, (D.1)


is convex (in particular, if G itself is convex), then f is locally Lipschitz with respect toy if, and only if, f is (globally) Lipschitz with respect to y on every compact subset Kof G.

Proof. The proof of Prop. 3.13 shows, whithout making use of the continuity of f , that(global) Lipschitz continuity with respect to y on every compact subset K of G implieslocal Lipschitz continuity on G. Thus, assume f to be locally Lipschitz with respect toy and assume each Gx to be convex. The proof of Prop. 3.13 shows, whithout makinguse of the continuity of f , that, for each K ⊆ G compact

∃δ>0,L≥0

∀(x,y),(x,y)∈K

(

‖y − y‖ < δ ⇒ ‖f(x, y)− f(x, y)‖ ≤ L‖y − y‖)

. (D.2)

If (x, y), (x, y) ∈ K are arbitrary with y 6= y, then the convexity of Gx implies

(x, (1− t)y) + (x, ty) : t ∈ [0, 1] ⊆ G. (D.3)

Choose N ∈ N such that N > 2‖y − y‖/δ and set h := ‖y − y‖/N . Then

h < δ/2. (D.4)

Define

∀k=0,...,N

yk :=kh

‖y − y‖ y +(

1− kh

‖y − y‖

)

y. (D.5)

Then

∀k=0,...,N−1

‖yk+1 − yk‖ =

∥

∥

∥

∥

h

‖y − y‖ y +h

‖y − y‖ y∥

∥

∥

∥

= h < δ (D.6)

and

‖f(x, y)− f(x, y)‖ ≤N−1∑

k=0

‖f(x, yk)− f(x, yk+1)‖(D.2)

≤ L

N−1∑

k=0

‖yk − yk+1‖

= LN h = L ‖y − y‖, (D.7)

showing f to be (globally) L-Lipschitz with respect to y on K.

Example D.2. We provide two examples that show that, in general, a discontinuousfunction can be locally Lipschitz with respect to y without being globally Lipschitz withrespect to y on every compact set.

(a) ConsiderG :=]− 2, 2[×

(

]− 4,−1[∪]1, 4[)

(D.8)

and f : G −→ R,

f(x, y) :=

1/x for x 6= 0, y ∈]− 4,−1[,

0 for x = 0, y ∈]− 4,−1[,

0 for y ∈]1, 4[.(D.9)


For the following open balls with respect to the max norm ‖(x, y)‖ := max|x|, |y|,one has B1(x, y) ∩ G ⊆] − 2, 2[×] − 4,−1[ for y ∈] − 4,−1[, andB1(x, y) ∩ G ⊆]−2, 2[×]1, 4[ for y ∈]1, 4[. Thus, f(x, ·) is constant on each set B1(x, y)∩G (eitherconstantly equal to 1/x or constantly equal to 0), i.e. 0-Lipschitz with respect to y.In particular, f is locally Lipschitz with respect to y. However, f is not Lipschitzcontinuous with respect to y on the compact set

K := [−1, 1]×(

[−3,−2] ∪ [2, 3])

: (D.10)

For the sequence ((xk, yk, yk))k∈N, where

∀k∈N

xk := 1/k, yk := −2, yk := 2, (D.11)

one has

limk→∞

|f(xk, yk)− f(xk, yk)||yk − yk|

= limk→∞

k − 0

2− (−2)= ∞, (D.12)

showing f is not Lipschitz continuous with respect to y on K.

(b) If one increases the dimension by 1, then one can modify the example in (a) suchthe set G is even connected (this variant was pointed out by Anton Sporrer): Let

A :=(

]− 4,−1[×]− 2, 2[)

∪(

]− 4, 4[×]− 2, 0[)

∪(

]1, 4[×]− 2, 2[)

⊆ R2. (D.13)

Then A is open and connected (but not convex) and the same holds for

G :=]− 2, 2[×A ⊆ R3. (D.14)

Define

f : G −→ R, f(x, y1, y2) :=

1/x for x 6= 0, y1 ∈]− 4,−1[, y2 > 0,

0 otherwise.(D.15)

Then everything works essentially as in (a) (it might be helpful to graphicallyvisualize the set A and the behavior of the function f): For the following open ballswith respect to the max norm ‖(x, y)‖ := max|x|, |y1|, |y2|, one has

∀(x,y1,y2)∈G,y1∈]−4,−1[

(

(ξ, η1, η2) ∈ B1(x, y1, y2) ∩G ⇒ η1 < −1 + 1 = 0 < 1)

. (D.16)

Analogous to (a), f(x, ·) is constant on each set B1(x, y1, y2)∩G (either constantlyequal to 1/x or constantly equal to 0), i.e. 0-Lipschitz with respect to y. In particu-lar, f is locally Lipschitz with respect to y. However, f is not Lipschitz continuouswith respect to y on the compact set

K := [−1, 1]×(

[−3,−2] ∪ [2, 3])

× [−1, 1] : (D.17)

For the sequence ((xk, y1,k, y1,k), y2,k)k∈N with

∀k∈N

xk := 1/k, y1,k := −2, y1,k := 2, y2,k := 0, (D.18)

one has

limk→∞

|f(xk, y1,k, y2,k)− f(xk, y1,k, y2,k)|‖(y1,k, y2,k)− (y1,k, y2,k)‖max

= limk→∞

k − 0

max4, 0 = ∞, (D.19)

showing f is not Lipschitz continuous with respect to y on K.

E MAXIMAL SOLUTIONS ON NONOPEN INTERVALS 135

E Maximal Solutions on Nonopen Intervals

In Def. 3.20, we required a maximal solution to an ODE to be defined on an openinterval. The following Ex. E.1 shows it can occur that such a maximal solution has anextension to a larger nonopen interval. In such cases, one might want to call the solutionon the nonopen interval maximal rather than the solution on the smaller open interval.However, this would make the treatment of maximal solutions more cumbersome in someplaces, without adding any real substance, which is why we stick to our requirement formaximal solutions to always be defined on an open interval.

Example E.1. (a) Let

G := [0, 1]× R, f : G −→ R, f(x, y) := 0. (E.1)

Then, for each (x0, y0) ∈ G, the function

φ : [0, 1] −→ R, φ ≡ y0, (E.2)

is a solution to the initial value problem

y′ = f(x, y), y(x0) = y0. (E.3)

However, the maximal solution of (E.3) according to Def. 3.20 is φ]0,1[.

(b) The following modification of (a) allows f to be defined on all of R2: Let

G := R2, f : G −→ R, f(x, y) :=

0 for x ∈ [0, 1],

1 for x /∈ [0, 1].(E.4)

Then, for each (x0, y0) ∈ [0, 1]×R, the function φ of (E.2) is a solution to the initialvalue problem (E.3), but, again, the maximal solution of (E.3) according to Def.3.20 is φ]0,1[.

F Paths in Rn

Definition F.1. A path or curve in Rn, n ∈ N, is a continuous map ψ : I −→ Rn, whereI ⊆ R is an interval. One calls the path differentiable, continuously differentiable, etc.if, and only if, the function ψ has the respective property.

Definition F.2. If a, b ∈ R, a ≤ b, and I := [a, b], then we call

|I| := b− a = |a− b|, (F.1)

the length of I.

F PATHS IN RN 136

Definition F.3. Given a real interval I := [a, b] ⊆ R, a, b ∈ R, a < b, the (N + 1)-tuple ∆ := (x0, . . . , xN) ∈ RN+1, N ∈ N, is called a partition of I if, and only if,a = x0 < x1 < · · · < xN = b. The set of all partitions of I is denoted by Π(I) or byΠ[a, b]. Given a partition ∆ of I as above and letting Ij := [xj−1, xj], the number

|∆| := max

|Ij| : j ∈ 1, . . . , N

, (F.2)

is called the mesh size of ∆.

Notation F.4. Given a, b ∈ R, a < b, a path ψ : [a, b] −→ Rn, n ∈ N, and a partition∆ = (x0, . . . , xN), N ∈ N, of [a, b], we consider the approximation of ψ by the polygon,connecting the points ψ(x0), . . . , ψ(xN), where we denote the polygon’s length by

pψ(∆) := pψ(x0, . . . , xN) :=N−1∑

k=0

‖ψ(xk+1)− ψ(xk)‖2, (F.3)

using ‖ · ‖2 to denote the 2-Norm on Rn, i.e. the Euclidean norm.

Definition F.5. Given a, b ∈ R, a ≤ b, for each path ψ : [a, b] −→ Rn, n ∈ N, define

l(ψ) :=

0 for a = b,

sup

pψ(∆) : ∆ ∈ Π[a, b]

∈ [0,∞] for a < b.(F.4)

The path ψ is called rectifyable with arc length l(ψ) if, and only if, l(ψ) <∞.

Proposition F.6. Let a, b ∈ R, a < b, and let ψ : [a, b] −→ Rn be a path, n ∈ N.

(a) If ψ is affine, i.e. there exist y0, y1 ∈ Rn such that

∀x∈[a,b]

ψ(x) = y0 + x y1, (F.5)

then ψ is rectifyable with arc length

l(ψ) = ‖y1‖2 (b− a) = ‖ψ(b)− ψ(a)‖2. (F.6)

(b) If the path ψ is L-Lipschitz with L ≥ 0, then ψ is rectifyable and

l(ψ) ≤ L (b− a). (F.7)

(c) If the paths φ, ψ : [a, b] −→ Rn are both rectifyable, then

∣

∣l(φ)− l(ψ)∣

∣ ≤ l(φ− ψ). (F.8)

(d) For each ξ ∈ [a, b], it holds that

l(ψ) = l(ψ[a,ξ]) + l(ψ[ξ,b]). (F.9)

F PATHS IN RN 137

Proof. (a): For each partition (x0, . . . , xN), N ∈ N, of [a, b], we have

pψ(x0, . . . , xN) =N−1∑

k=0

‖ψ(xk+1)− ψ(xk)‖2 =N−1∑

k=0

‖xk+1 y1 − xk y1‖2

= ‖y1‖2N−1∑

k=0

(xk+1 − xk) = ‖y1‖2 (b− a), (F.10)

proving (F.6).

(b): For each partition (x0, . . . , xN), N ∈ N, of [a, b], we have

pψ(x0, . . . , xN ) =N−1∑

k=0

‖ψ(xk+1)− ψ(xk)‖2 ≤N−1∑

k=0

L ‖xk+1 − xk‖2

= LN−1∑

k=0

(xk+1 − xk) = L (b− a), (F.11)

proving (F.7).

(c): For each partition ∆ = (x0, . . . , xN), N ∈ N, of [a, b], we have

∣

∣pφ(∆)− pψ(∆)∣

∣ =

∣

∣

∣

∣

∣

N−1∑

k=0

‖φ(xk+1)− φ(xk)‖2 −N−1∑

k=0

‖ψ(xk+1)− ψ(xk)‖2∣

∣

∣

∣

∣

≤N−1∑

k=0

∣

∣

∣‖φ(xk+1)− φ(xk)‖2 − ‖ψ(xk+1)− ψ(xk)‖2

∣

∣

∣

≤N−1∑

k=0

∥

∥

∥φ(xk+1)− ψ(xk+1)−

(

φ(xk)− ψ(xk)∥

∥

∥

2

= pφ−ψ(∆), (F.12)

proving (F.8) (the last estimate in (F.12) holds true due to the inverse triangle inequal-ity).

(d): If ξ = a or ξ = b, then there is nothing to prove. Thus, assume a < ξ < b. If∆1 := (x0, . . . , xN) is a partition of [a, ξ] and ∆2 := (xN , . . . , xM) is a partition of [ξ, b],N,M ∈ N, M > N , then ∆ := (x0, . . . , xM) is a partition of [a, b]. Moreover,

pψ(∆) = pψ(∆1) + pψ(∆2) (F.13)

is immediate from (F.3), implying

l(ψ) ≥ l(ψ[a,ξ]) + l(ψ[ξ,b]). (F.14)

On the other hand, if ∆ = (x0, . . . , xM) M ∈ N, is a partition of [a, b], then, either thereis 0 < N < M such that ξ = N , in which case (F.13) holds once again, where ∆1 and ∆2

are defined as before. Otherwise, there is N ∈ 0, . . . ,M − 1 such that xN < ξ < xN+1

F PATHS IN RN 138

and, in this case, ∆1 := (x0, . . . , xN , ξ) is a partition of [a, ξ] and ∆2 := (ξ, xN+1, . . . , xM)is a partition of [ξ, b]. Moreover,

pψ(∆) =M−1∑

k=0

‖ψ(xk+1)− ψ(xk)‖2

=N−1∑

k=0

‖ψ(xk+1)− ψ(xk)‖2 + ‖xN+1 − xN‖+M−1∑

k=N+1

‖ψ(xk+1)− ψ(xk)‖2

≤ pψ(∆1) + pψ(∆2), (F.15)

showingl(ψ) ≤ l(ψ[a,ξ]) + l(ψ[ξ,b]) (F.16)

and concluding the proof.

Theorem F.7. Given a, b ∈ R, a < b, each continuously differentiable path ψ : [a, b] −→Rn, n ∈ N, is rectifyable with arc length

l(ψ) =

∫ b

a

∥

∥ψ′(x)∥

∥

2dx . (F.17)

Proof. Since ψ is continuously differentiable, it follows from [Phi15, Th. C.3] that ψ isLipschitz continuous on [a, b], i.e. ψ is rectifyable by Prop. F.6(b) above. To prove (F.17),according to the fundamental theorem of calculus [Phi16, Th. 10.20(b)], it suffices toshow the function

λ : [a, b] −→ R+0 , λ(x) := l(ψ[a,x]), (F.18)

is differentiable with derivative λ′(x) = ‖ψ′(x)‖2. To this end, first note the continuousfunction ψ′ is even uniformly continuous by Th. C.16. Thus,

∀ǫ>0

∃δ>0

∀x0,x∈[a,b]

(

|x0 − x| < δ ⇒ ‖ψ(x0)− ψ(x)‖2 < ǫ.)

. (F.19)

Fix x0 ∈ [a, b[ and consider x1 ∈]a, b[ such that x0 < x1 < x0 + δ. Define the affine path

α : [x0, x1] −→ Rn, α(x) := ψ(x0) + (x− x0)ψ′(x0). (F.20)

According to Prop. F.6(a), we have

l(α) = ‖ψ′(x0)‖2 (x1 − x0). (F.21)

Moreover, for the path ψ − α, we have

∀x∈[x0,x1]

‖ψ′(x)− α′(x)‖2 = ‖ψ′(x)− ψ′(x0)‖2(F.19)< ǫ. (F.22)

Thus, it follows from [Phi15, Th. C.3] that ψ−α is ǫ-Lipschitz on [a, b] and, then, Prop.F.6(b) yields

l(ψ[x0,x1] −α) ≤ ǫ(x1 − x0), (F.23)

G OPERATOR NORMS AND MATRIX NORMS 139

Prop. F.6(c), in turn, yields∣

∣l(ψ[x0,x1])− l(α)∣

∣ ≤ l(ψ[x0,x1] −α) ≤ ǫ(x1 − x0). (F.24)

Putting everything together, we obtain∣

∣

∣

∣

l(ψ[a,x1])− l(ψ[a,x0])

x1 − x0− ‖ψ′(x0)‖2

∣

∣

∣

∣

Prop. F.6(d), (F.21)=

∣

∣

∣

∣

l(ψ[x0,x1])

x1 − x0− l(α)

x1 − x0

∣

∣

∣

∣

(F.24)

≤ ǫ(x1 − x0)

x1 − x0= ǫ, (F.25)

showing the function λ from (F.18) has a right-hand derivative at x0 and the value ofthat right-hand derivative at x0 is the desired ‖ψ′(x0)‖2. Repeating the above argumentwith x0, x1 ∈]a, b] such that x0 − δ < x1 < x0 shows λ to have a left-hand derivative ateach x0 ∈]a, b] with value ‖ψ′(x0)‖2, which completes the proof.

Remark F.8. An example of a differentiable nonrectifyable path is given by (cf. [Wal02,Ex. 5.14.6])

ψ : [0, 1] −→ R2, ψ(x) :=

x2 cos πx2

for x 6= 0,

0 for x = 0.(F.26)

G Operator Norms and Matrix Norms

For the present ODE class, we are mostly interested in linear maps from Kn into itself.However, introducing the relevant notions for linear maps between general normed vectorspaces does not provide much additional difficulty, and, hopefully, even some extraclarity.

Definition G.1. Let A : X −→ Y be a linear map between two normed vector spaces(X, ‖ · ‖X) and (Y, ‖ · ‖Y ) over K. Then A is called bounded if, and only if, A mapsbounded sets to bounded sets, i.e. if, and only if, A(B) is a bounded subset of Y foreach bounded B ⊆ X. The vector space of all bounded linear maps between X and Yis denoted by L(X, Y ).

Definition G.2. Let A : X −→ Y be a linear map between two normed vector spaces(X, ‖ · ‖X) and (Y, ‖ · ‖Y ) over K. The number

‖A‖ := sup

‖Ax‖Y‖x‖X

: x ∈ X, x 6= 0

=sup

‖Ax‖Y : x ∈ X, ‖x‖X = 1

∈ [0,∞] (G.1)

is called the operator norm of A induced by ‖ · ‖X and ‖ · ‖Y (strictly speaking, the termoperator norm is only justified if the value is finite, but it is often convenient to use theterm in the generalized way defined here).

In the special case, where X = Kn, Y = Km, and A is given via a real m × n matrix,the operator norm is also called matrix norm.

—


From now on, the space index of a norm will usually be suppressed, i.e. we write just‖ · ‖ instead of both ‖ · ‖X and ‖ · ‖Y .Theorem G.3. For a linear map A : X −→ Y between two normed vector spaces(X, ‖ · ‖) and (Y, ‖ · ‖) over K, the following statements are equivalent:

(a) A is bounded.

(b) ‖A‖ <∞.

(c) A is Lipschitz continuous.

(d) A is continuous.

(e) There is x0 ∈ X such that A is continuous at x0.

Proof. Since every Lipschitz continuous map is continuous and since every continuousmap is continuous at every point, “(c) ⇒ (d) ⇒ (e)” is clear.

“(e) ⇒ (c)”: Let x0 ∈ X be such that A is continuous at x0. Thus, for each ǫ > 0, thereis δ > 0 such that ‖x− x0‖ < δ implies ‖Ax−Ax0‖ < ǫ. As A is linear, for each x ∈ Xwith ‖x‖ < δ, one has ‖Ax‖ = ‖A(x+ x0)−Ax0‖ < ǫ, due to ‖x+ x0 − x0‖ = ‖x‖ < δ.Moreover, one has ‖(δx)/2‖ ≤ δ/2 < δ for each x ∈ X with ‖x‖ ≤ 1. Letting L := 2ǫ/δ,this means that ‖Ax‖ = ‖A((δx)/2)‖/(δ/2) < 2ǫ/δ = L for each x ∈ X with ‖x‖ ≤ 1.Thus, for each x, y ∈ X with x 6= y, one has

‖Ax− Ay‖ = ‖A(x− y)‖ = ‖x− y‖∥

∥

∥

∥

A

(

x− y

‖x− y‖

)∥

∥

∥

∥

< L ‖x− y‖. (G.2)

Together with the fact that ‖Ax−Ay‖ ≤ ‖x− y‖ is trivially true for x = y, this showsthat A is Lipschitz continuous.

“(c) ⇒ (b)”: As A is Lipschitz continuous, there exists L ∈ R+0 such that ‖Ax−Ay‖ ≤

L ‖x − y‖ for each x, y ∈ X. Considering the special case y = 0 and ‖x‖ = 1 yields‖Ax‖ ≤ L ‖x‖ = L, implying ‖A‖ ≤ L <∞.

“(b) ⇒ (c)”: Let ‖A‖ <∞. We will show

‖Ax− Ay‖ ≤ ‖A‖ ‖x− y‖ for each x, y ∈ X. (G.3)

For x = y, there is nothing to prove. Thus, let x 6= y. One computes

‖Ax− Ay‖‖x− y‖ =

∥

∥

∥

∥

A

(

x− y

‖x− y‖

)∥

∥

∥

∥

≤ ‖A‖ (G.4)

as∥

∥

∥

x−y‖x−y‖

∥

∥

∥= 1, thereby establishing (G.3).

“(b) ⇒ (a)”: Let ‖A‖ <∞ and let M ⊆ X be bounded. Then there is r > 0 such thatM ⊆ Br(0). Moreover, for each 0 6= x ∈M :

‖Ax‖‖x‖ =

∥

∥

∥

∥

A

(

x

‖x‖

)∥

∥

∥

∥

≤ ‖A‖ (G.5)


as∥

∥

∥

x‖x‖

∥

∥

∥= 1. Thus ‖Ax‖ ≤ ‖A‖‖x‖ ≤ r‖A‖, showing that A(M) ⊆ Br‖A‖(0). Thus,

A(M) is bounded, thereby establishing the case.

“(a) ⇒ (b)”: Since A is bounded, it maps the bounded set B1(0) ⊆ X into somebounded subset of Y . Thus, there is r > 0 such that A(B1(0)) ⊆ Br(0) ⊆ Y . Inparticular, ‖Ax‖ < r for each x ∈ X satisfying ‖x‖ = 1, showing ‖A‖ ≤ r <∞.

Remark G.4. For linear maps between finite-dimensional spaces, the equivalent prop-erties of Th. G.3 always hold: Each linear map A : Kn −→ Km, (n,m) ∈ N2, iscontinuous (this follows, for example, from the fact that each such map is (trivially)differentiable, and every differentiable map is continuous). In particular, each linearmap A : Kn −→ Km, has all the equivalent properties of Th. G.3.

Theorem G.5. Let X and Y be normed vector spaces over K.

(a) The operator norm does, indeed, constitute a norm on the set of bounded linearmaps L(X, Y ).

(b) If A ∈ L(X, Y ), then ‖A‖ is the smallest Lipschitz constant for A, i.e. ‖A‖ is aLipschitz constant for A and ‖Ax − Ay‖ ≤ L ‖x − y‖ for each x, y ∈ X implies‖A‖ ≤ L.

Proof. (a): If A = 0, then, in particular, Ax = 0 for each x ∈ X with ‖x‖ = 1, implying‖A‖ = 0. Conversely, ‖A‖ = 0 implies Ax = 0 for each x ∈ X with ‖x‖ = 1. But thenAx = ‖x‖A(x/‖x‖) = 0 for every 0 6= x ∈ X, i.e. A = 0. Thus, the operator norm ispositive definite. If A ∈ L(X, Y ), λ ∈ K, and x ∈ X, then

∥

∥(λA)x∥

∥ =∥

∥A(λx)∥

∥ =∥

∥λ(Ax)∥

∥ = |λ|∥

∥Ax∥

∥, (G.6)

yielding

‖λA‖ = sup

‖(λA)x‖ : x ∈ X, ‖x‖ = 1

= sup

|λ| ‖Ax‖ : x ∈ X, ‖x‖ = 1

= |λ| sup

‖Ax‖ : x ∈ X, ‖x‖ = 1

= |λ| ‖A‖, (G.7)

showing that the operator norm is homogeneous of degree 1. Finally, if A,B ∈ L(X, Y )and x ∈ X, then

‖(A+ B)x‖ = ‖Ax+ Bx‖ ≤ ‖Ax‖+ ‖Bx‖, (G.8)

yielding

‖A+ B‖ = sup

‖(A+ B)x‖ : x ∈ X, ‖x‖ = 1

≤ sup

‖Ax‖+ ‖Bx‖ : x ∈ X, ‖x‖ = 1

≤ sup

‖Ax‖ : x ∈ X, ‖x‖ = 1

+ sup

‖Bx‖ : x ∈ X, ‖x‖ = 1

= ‖A‖+ ‖B‖, (G.9)

showing that the operator norm also satisfies the triangle inequality, thereby completingthe verification that it is, indeed, a norm.

(b): That ‖A‖ is a Lipschitz constant for A was already shown in the proof of “(b) ⇒(c)” of Th. G.3. Now let L ∈ R+

0 be such that ‖Ax−Ay‖ ≤ L ‖x−y‖ for each x, y ∈ X.Specializing to y = 0 and ‖x‖ = 1 implies ‖Ax‖ ≤ L ‖x‖ = L, showing ‖A‖ ≤ L.

H THE VANDERMONDE DETERMINANT 142

Remark G.6. Even though it is beyond the scope of the present class, let us mentionas an outlook that one can show that L(X, Y ) with the operator norm is a Banach space(i.e. a complete normed vector space) provided that Y is a Banach space (even if X isnot a Banach space).

Lemma G.7. If Id : X −→ X, Id(x) := x, is the identity map on a normed vector spaceX over K, then ‖ Id ‖ = 1 (in particular, the operator norm of a unit matrix is always1). Caveat: In principle, one can consider two different norms on X simultaneously,and then the operator norm of the identity can differ from 1.

Proof. If ‖x‖ = 1, then ‖ Id(x)‖ = ‖x‖ = 1.

Lemma G.8. Let X, Y, Z be normed vector spaces and consider linear maps A ∈L(X, Y ), B ∈ L(Y, Z). Then

‖BA‖ ≤ ‖B‖ ‖A‖. (G.10)

Proof. Let x ∈ X with ‖x‖ = 1. If Ax = 0, then ‖B(A(x))‖ = 0 ≤ ‖B‖ ‖A‖. If Ax 6= 0,then one estimates

∥

∥B(Ax)∥

∥ = ‖Ax‖∥

∥

∥

∥

B

(

Ax

‖Ax‖

)∥

∥

∥

∥

≤ ‖A‖ ‖B‖, (G.11)


Example G.9. Let m,n ∈ N and let A : Rn −→ Rm be the linear map given by them× n matrix (akl)(k,l)∈1,...,m×1,...,n. Then

‖A‖∞ := max

n∑

l=1

|akl| : k ∈ 1, . . . ,m

(G.12a)

is called the row sum norm of A, and

‖A‖1 := max

m∑

k=1

|akl| : l ∈ 1, . . . , n

(G.12b)

is called the column sum norm of A. It is an exercise to show that ‖A‖∞ is the operatornorm induced if Rn and Rm are endowed with the ∞-norm, and ‖A‖1 is the operatornorm induced if Rn and Rm are endowed with the 1-norm.

H The Vandermonde Determinant

Theorem H.1. Let n ∈ N and λ0, λ1, . . . , λn ∈ C. Moreover, let

V :=

1 λ0 . . . λn01 λ1 . . . λn1...

...1 λn . . . λnn

(H.1)

H THE VANDERMONDE DETERMINANT 143

be the corresponding Vandermonde matrix. Then its determinant, the so-called Vander-monde determinant is given by

det(V ) =n∏

k,l=0k>l

(λk − λl). (H.2)

Proof. The proof can be conducted by induction with respect to n: For n = 1, we have

det(V ) =

∣

∣

∣

∣

1 λ01 λ1

∣

∣

∣

∣

= λ1 − λ0 =1∏

k,l=0k>l

(λk − λl), (H.3)

showing (H.2) holds for n = 1. Now let n > 1. We know from Linear Algebra that thevalue of a determinant does not change if we add a multiple of a column to a differentcolumn. Adding the (−λ0)-fold of the nth column to the (n + 1)st column, we obtainin the (n+ 1)st column

0λn1 − λn−1

1 λ0...

λnn − λn−1n λ0

. (H.4)

Next, one adds the (−λ0)-fold of the (n− 1)st column to the nth column, and, succes-sively, the (−λ0)-fold of the mth column to the (m + 1)st column. One finishes, in thenth step, by adding the (−λ0)-fold of the first column to the second column, obtaining

det(V ) =

∣

∣

∣

∣

∣

∣

∣

∣

∣

1 λ0 . . . λn01 λ1 . . . λn1...

...1 λn . . . λnn

∣

∣

∣

∣

∣

∣

∣

∣

∣

=

∣

∣

∣

∣

∣

∣

∣

∣

∣

1 0 0 . . . 01 λ1 − λ0 λ21 − λ1λ0 . . . λn1 − λn−1

1 λ0...

......

......

1 λn − λ0 λ2n − λnλ0 . . . λnn − λn−1n λ0

∣

∣

∣

∣

∣

∣

∣

∣

∣

. (H.5)

Applying the rule for determinants of block matrices to (H.5) yields

det(V ) = 1 ·

∣

∣

∣

∣

∣

∣

∣

λ1 − λ0 λ21 − λ1λ0 . . . λn1 − λn−11 λ0

......

......

λn − λ0 λ2n − λnλ0 . . . λnn − λn−1n λ0

∣

∣

∣

∣

∣

∣

∣

. (H.6)

As we also know from Linear Algebra that determinants are linear in each row, for eachk, we can factor out (λk − λ0) from the kth row of (H.6), arriving at

det(V ) =n∏

k=1

(λk − λ0)

∣

∣

∣

∣

∣

∣

∣

1 λ1 . . . λn−11

......

......

1 λn . . . λn−1n

∣

∣

∣

∣

∣

∣

∣

. (H.7)

However, the determinant in (H.7) is precisely the Vandermonde determinant of the n−1numbers λ1, . . . , λn, which is given according to the induction hypothesis, implying

det(V ) =n∏

k=1

(λk − λ0)n∏

k,l=1k>l

(λk − λl) =n∏

k,l=0k>l

(λk − λl), (H.8)

I MATRIX-VALUED FUNCTIONS 144

completing the induction proof of (H.2).

I Matrix-Valued Functions

Notation I.1. Given m,n ∈ N, let M(m,n,K) denote the set of m× n matrices overK.

I.1 Product Rule

Proposition I.2. Let I ⊆ R be a nontrivial interval, let m,n, l ∈ N, and suppose

A : I −→ M(m,n,K), A(x) =(

aαβ(x))

, (I.1a)

B : I −→ M(n, l,K), B(x) =(

bαβ(x))

, (I.1b)

are differentiable. Then

C : I −→ M(m, l,K), C(x) := A(x)B(x), (I.2)

is differentiable, and one has the product rule

∀x∈I

C ′(x) = A′(x)B(x) + A(x)B′(x). (I.3)

Proof. Writing C(x) =(

cαβ(x))

and using the one-dimensional product rule togetherwith the definition of matrix multiplication, one computes, for each (α, β) ∈ 1, . . . ,m× 1, . . . , l,

c′αβ(x) =

(

n∑

γ=1

aαγ(x) bγβ(x)

)′

=n∑

γ=1

a′αγ(x) bγβ(x) +n∑

γ=1

aαγ(x) b′γβ(x)

=(

A′(x)B(x))

αβ+(

A(x)B′(x))

αβ, (I.4)

proving the proposition.

I.2 Integration and Matrix Multiplication Commute

Proposition I.3. Let m,n, p ∈ N, let I ⊆ R be measurable (e.g. an interval), let A :I −→ M(m,n,K), x 7→ A(x) = (akl(x)), be integrable (i.e. all Re akl, Im akl : I −→ R

are integrable).

J AUTONOMOUS ODE 145

(a) If B = (bjk) ∈ M(p,m,K), then

B

∫

I

A(x) dx =

∫

I

BA(x) dx . (I.5)

(b) If B = (Blj) ∈ M(n, p,K), then

(∫

I

A(x) dx

)

B =

∫

I

A(x)B dx . (I.6)

Proof. (a): One computes, for each (j, l) ∈ 1, . . . , p × 1, . . . , n,(

B

∫

I

A(x) dx

)

jl

=m∑

k=1

bjk

∫

I

akl(x) dx =

∫

I

(

m∑

k=1

bjkakl(x)

)

dx

=

∫

I

(BA(x))jl dx =

(∫

I

BA(x) dx

)

jl

, (I.7)

proving (I.5).

(b): One computes, for each (k, j) ∈ 1, . . . ,m × 1, . . . , p,((∫

I

A(x) dx

)

B

)

kj

=n∑

l=1

(∫

I

akl(x) dx

)

blj =

∫

I

(

n∑

l=1

akl(x)blj

)

dx

=

∫

I

(A(x)B)kj dx =

(∫

I

A(x)B dx

)

kj

, (I.8)

proving (I.6).

J Autonomous ODE

J.1 Equivalence of Autonomous and Nonautonomous ODE

Theorem J.1. Let G ⊆ R×Kn, n ∈ N, and f : G −→ Kn. Then the nonautonomousODE

y′ = f(x, y) (J.1)

is equivalent to the autonomous ODE

y′ = g(y), (J.2)

whereg : R×G −→ Kn+1, g(y1, . . . , yn+1) :=

(

1, f(y1, y2, . . . , yn+1))

, (J.3)

in the following sense:

J AUTONOMOUS ODE 146

(a) If φ : I −→ Kn is a solution to (J.1), then ψ : I −→ Kn+1, ψ(x) := (x, φ(x)), is asolution to (J.2).

(b) If ψ : I −→ Kn+1 is a solution to (J.2) with the property

∃x0∈I

ψ1(x0) = x0, (J.4)

then φ : I −→ Kn, φ(x) := (ψ2(x), . . . , ψn+1(x)), is a solution to (J.1).

Proof. (a): If φ : I −→ Kn is a solution to (J.1) and ψ : I −→ Kn+1, ψ(x) := (x, φ(x)),then

∀x∈I

ψ′(x) = (1, φ′(x)) =(

1, f(x, φ(x)))

= g(x, φ(x)) = g(ψ(x)), (J.5)

showing ψ is a solution to (J.2).

(b): If ψ : I −→ Kn+1 is a solution to (J.2) with the property (J.4) and φ : I −→ Kn,φ(x) := (ψ2(x), . . . , ψn+1(x)), then (J.4) implies ψ1(x) = x for each x ∈ I and, thus,

∀x∈I

φ′(x) = (ψ′2(x), . . . , ψ

′n+1(x)) = f(x, ψ2(x), . . . , ψn+1(x)) = f(x, φ(x)), (J.6)

showing φ is a solution to (J.1).

While Th. J.1 is somewhat striking and of theoretical interest, it has few useful applica-tions in practise, due to the unbounded first component of solutions to (J.2) (cf. Rem.5.2).

J.2 Integral for ODE with Discontinuous Right-Hand Side

The following Example J.2, provided by Anton Sporrer, shows Lem. 5.19 becomes falseif the hypothesis that every initial value problem for the considered ODE y′ = f(y) hasat least one solution is omitted:

Example J.2. Consider

f : R −→ R, f(y) :=

0 for y ∈ Q,

1 for y ∈ R \Q, (J.7)

and the autonomous ODE y′ = f(y). If (x0, y0) ∈ R×Q, then the initial value problemy(x0) = y0 has the unique solution φ : R −→ R, φ ≡ y0 ∈ Q. However, if (x0, y0) ∈R× (R \Q), then the initial value problem y(x0) = y0 has no solution. Since y′ = f(y)has only constant solutions, every function E : R −→ R is an integral for this ODEaccording to Def. 5.18. However, not every differentiable function E : R −→ R satisfies(5.15): For example, if E(y) := y, then E ′ ≡ 1, i.e.

∀y∈R\Q

E ′(y)f(y) = 1 6= 0, (J.8)

showing that Lem. 5.19 does not hold for y′ = f(y) with f according to (J.7).

K POLAR COORDINATES 147

K Polar Coordinates

Recall the following functions, used in polar coordinates of the plane:

r : R2 \ (0, 0) −→ R+, r(y1, y2) :=√

y21 + y22, (K.1a)

ϕ : R2 \ (0, 0) −→ [0, 2π[, ϕ(y1, y2) :=

0 for y2 = 0, y1 > 0,

arccot(y1/y2) for y2 > 0,

π for y2 = 0, y1 < 0,

π + arccot(y1/y2) for y2 < 0.

(K.1b)

Theorem K.1. Consider f : R2 \ (0, 0) −→ R2 and the corresponding R2-valuedautonomous ODE

y′1 = f1(y1, y2), (K.2a)

y′2 = f2(y1, y2), (K.2b)

together with its polar coordinate version

r′ = g1(r, ϕ), (K.3a)

ϕ′ = g2(r, ϕ), (K.3b)

where g : R+ × R −→ R2,

g1 : R+ × R −→ R,

g1(r, ϕ) := f1(r cosϕ, r sinϕ) cosϕ+ f2(r cosϕ, r sinϕ) sinϕ, (K.4a)

g2 : R+ × R −→ R,

g2(r, ϕ) :=1

r

(

f2(r cosϕ, r sinϕ) cosϕ− f1(r cosϕ, r sinϕ) sinϕ)

. (K.4b)

Let µ : I −→ R2 be a solution to (K.3).

(a) Thenφ : I −→ R2, φ(x) :=

(

µ1(x) cosµ2(x), µ1(x) sinµ2(x))

, (K.5)

is a solution to (K.2).

(b) If µ satisfies the initial condition

µ1(0) = ρ, ρ ∈ R+, (K.6a)

µ2(0) = τ, τ ∈ R, (K.6b)

and if

η1 = ρ cos τ, (K.7a)

η2 = ρ sin τ, (K.7b)


then φ satisfies the initial condition

φ1(0) = η1, (K.8a)

φ2(0) = η2. (K.8b)

Note that ρ > 0 implies (η1, η2) 6= (0, 0), and that, for (ρ, τ) ∈ R+ × [0, 2π[, (K.7)is equivalent to

r(η1, η2) = ρ, (K.9a)

ϕ(η1, η2) = τ (K.9b)

(cf. the computations of φ−1 φ and of φ φ−1 in [Phi15, Ex. 4.19]).

Proof. Exercise.

Example K.2. Consider the autonomous ODE (K.2) with

f1 : R2 \ (0, 0) −→ R,

f1(y1, y2) := y1(

1− r(y1, y2))

− y2 (r(y1, y2)− y1)

2 r(y1, y2), (K.10a)

f2 : R2 \ (0, 0) −→ R,

f2(y1, y2) := y2(

1− r(y1, y2))

+y1 (r(y1, y2)− y1)

2 r(y1, y2), (K.10b)

where r is the radial polar coordinate function as defined in (K.1a). Using g : R+×R −→R2 as defined in (K.4), one obtains, for each (ρ, ϕ) ∈ R+ × R,

g1(ρ, ϕ) = ρ cosϕ (1− ρ) cosϕ− ρ sinϕ (ρ− ρ cosϕ) cosϕ

2ρ

+ ρ sinϕ (1− ρ) sinϕ+ρ cosϕ (ρ− ρ cosϕ) sinϕ

2ρ

= ρ (1− ρ), (K.11a)

g2(ρ, ϕ) = sinϕ (1− ρ) cosϕ+cosϕ (ρ− ρ cosϕ) cosϕ

2ρ

− cosϕ (1− ρ) sinϕ+sinϕ (ρ− ρ cosϕ) sinϕ

2ρ

=1− cosϕ

2, (K.11b)

such that the autonomous ODE

r′ = r (1− r), (K.12a)

ϕ′ =1− cosϕ

2

[Phi16, (I.1c)]= sin2 ϕ

2, (K.12b)

is the polar coordinate version of (K.2) as defined in Th. K.1.


Claim 1. The general solution to

r′ = p(r), p : R+ −→ R, p(r) := r (1− r), (K.13)

isYp : Dp,0 −→ R+, Yp(x, ρ) :=

ρ

ρ+ (1− ρ) e−x, (K.14)

where

Dp,0 =(

R×]0, 1])

∪

(x, ρ) : ρ > 1, x ∈]

− lnρ

ρ− 1, ∞

[

. (K.15)

Proof. The initial condition is satisfied, since

∀ρ∈R+

Yp(0, ρ) =ρ

ρ+ (1− ρ)= ρ. (K.16)

The ODE is satisfied, since

∀(x,ρ)∈Dp,0

Y ′p(x, ρ) =

ρ(1− ρ)e−x(

ρ+ (1− ρ) e−x)2 = Yp(x, ρ)

(

1− Yp(x, ρ))

. (K.17)

To verify the form of Dp,0, we note that the denominator in (K.14) is positive for eachx ∈ R if 0 < ρ ≤ 1. If ρ > 1, then the function a : R −→ R, a(x) := ρ+ (1− ρ) e−x, isstrictly increasing (note a′(x) = (ρ− 1) e−x > 0) and has a unique zero at x = − ln ρ

ρ−1.

Thus limx↓− ln ρρ−1

Y (x, ρ) = ∞, proving the maximality of Y (·, ρ). N

Claim 2. Letting, for each k ∈ Z,

x0(τ) :=2 cos τ + 2

sin τfor τ ∈ R \ lπ : l ∈ Z, (K.18a)

Rk :=]0, π[+2kπ, (K.18b)

Lk :=]− π, 0[+2kπ, (K.18c)

A0 := R× 2kπ : k ∈ Z, (K.18d)

A0,k := R− × π + 2kπ, (K.18e)

B0,k := R+ × π + 2kπ, (K.18f)

A1,k :=

(x, τ) ∈ R2 : x ∈]−∞, x0(τ)[, τ ∈ Rk

, (K.18g)

A2,k :=

(x, τ) ∈ R2 : x ∈]x0(τ),∞[, τ ∈ Lk

, (K.18h)

B1,k :=

(x, τ) ∈ R2 : x ∈]x0(τ),∞[, τ ∈ Rk

, (K.18i)

B2,k :=

(x, τ) ∈ R2 : x ∈]−∞, x0(τ)[, τ ∈ Lk

, (K.18j)

C1,k :=

(x, τ) ∈ R2 : x = x0(τ), τ ∈ Rk

, (K.18k)

C2,k :=

(x, τ) ∈ R2 : x = x0(τ), τ ∈ Lk

, (K.18l)

the general solution to

ϕ′ = q(ϕ), q : R −→ R, q(ϕ) :=1− cosϕ

2= sin2 ϕ

2, (K.19)


is

Yq : R2 −→ R+,

Yq(x, τ) :=

τ for (x, τ) ∈ A0,

2(

kπ + arctan(

− 2x

))

for (x, τ) ∈ A0,k, k ∈ Z,

2(

(k + 1)π + arctan(

− 2x

))

for (x, τ) ∈ B0,k, k ∈ Z,

π + 2kπ for (x, τ) = (0, π + 2kπ), k ∈ Z,

2(

kπ + arctan(

2 sin τ2 cos τ−x sin τ+2

))

for (x, τ) ∈ A1,k ∪ A2,k, k ∈ Z,

π + 2kπ for (x, τ) ∈ C1,k, k ∈ Z,

2(

(k + 1)π + arctan(


))

for (x, τ) ∈ B1,k, k ∈ Z,

−π + 2kπ for (x, τ) ∈ C2,k, k ∈ Z,

2(

(k − 1)π + arctan(


))

for (x, τ) ∈ B2,k, k ∈ Z.

(K.20)

Proof. One observes that Yq is well-defined, since

R2 = A0 ∪⋃

k∈Z

(

(0, π + 2kπ) ∪A0,k ∪B0,k ∪A1,k ∪A2,k ∪B1,k ∪B2,k ∪C1,k ∪C2,k

)

(K.21)and, introducing the auxiliary function

∆ : R2 −→ R, ∆(x, τ) := 2 cos τ − x sin τ + 2, (K.22)

one has

∆(x, τ) 6= 0 for each (x, τ) ∈ A1,k ∪ A2,k ∪ B1,k ∪ B2,k, k ∈ Z. (K.23)

It remains to show that, for each τ ∈ R, the function x 7→ Yq(x, τ) is differentiable onR, satisfying

∀x∈R

Y ′q (x, τ) =

1− cosYq(x, τ)

2, (K.24)

and the initial conditionYq(0, τ) = τ. (K.25)

The initial condition (K.25) is satisfied, since

∀τ∈kπ: k∈Z

Yq(0, τ) = τ, (K.26a)

∀τ∈Rk∪Lk, k∈Z

Yq(0, τ) = 2kπ + 2arctan2 sin τ

2 cos τ + 2

[Phi16, (I.1d)]= 2kπ + 2arctan

(

tanτ

2

)

= 2kπ + 2(τ

2− kπ

)

= τ. (K.26b)

Next, we show that, for each τ ∈ R, the function x 7→ Yq(x, τ) is differentiable on R andsatisfies (K.24): For τ ∈ 2kπ : k ∈ Z, x 7→ Yq(x, τ) is constant, i.e. differentiability isclear, and

∀x∈R

1− cosYq(x, τ)

2=

1− cos(2kπ)

2= 0 = Y ′

q (x, τ) (K.27)


proves (K.24).

For each τ ∈ 2(k+1)π : k ∈ Z, differentiability is clear in each x ∈ R\0. Moreover,

∀x∈R\0

(

2 arctan

(

−2

x

))′

=4

x21

1 + 4x2

=4

4 + x2, (K.28)

and, thus, for each x ∈ R \ 0,1− cosYq(x, τ)

2=

1

2− 1

2cos

(

2 arctan

(

−2

x

))

[Phi16, (I.1e)]=

1

2− 1

2

1− 4x2

1 + 4x2

=1

2− 1

2

x2 − 4

x2 + 4=

8

2(4 + x2)=

4

4 + x2(K.28)= Y ′

q (x, τ), (K.29)

proving (K.24) for each x ∈ R \ 0. It remains to consider x = 0. One has, byL’Hopital’s rule [Phi16, Th. 9.26(a)],

limx↑0

Yq(0, τ)− Yq(x, τ)

0− x= lim

x↑0

π + 2kπ − 2(

kπ + arctan(

− 2x

))

−x[Phi16, (9.29)],(K.28)

= limx↑0

− 44+x2

−1= 1 (K.30)

and

limx↓0

Yq(0, τ)− Yq(x, τ)

0− x= lim

x↓0

π + 2kπ − 2(

(k + 1)π + arctan(

− 2x

))

−x[Phi16, (9.29)],(K.28)

= limx↓0

− 44+x2

−1= 1, (K.31)

showing x 7→ Yq(x, τ) to be differentiable in x = 0 with Y ′q (0, τ) = 1. Due to

1− cos(π + 2kπ)

2= 1 = Y ′

q (0, π + 2kπ), (K.32)

(K.24) also holds.

For each τ ∈ Rk ∪ Lk, the differentiability is clear in each x ∈ R \ x0(τ). Moreover,recalling ∆(x, τ) from (K.22), one has, for each x ∈ R \ x0(τ),(

2 arctan

(

2 sin τ

∆(x, τ)

))′

=4(sin τ)2

(∆(x, τ))21

1 + 4(sin τ)2

(∆(x,τ))2

=4(sin τ)2

4(sin τ)2 + (∆(x, τ))2, (K.33)

and, thus, for each x ∈ R \ x0(τ),

1− cosYq(x, τ)

2=

1

2− 1

2cos

(

2 arctan

(

2 sin τ

∆(x, τ)

))

[Phi16, (I.1e)]=

1

2− 1

2

1− 4(sin τ)2

(∆(x,τ))2

1 + 4(sin τ)2

(∆(x,τ))2

=1

2− 1

2

(∆(x, τ))2 − 4(sin τ)2

(∆(x, τ))2 + 4(sin τ)2=

8(sin τ)2

2(4(sin τ)2 + (∆(x, τ))2)

=4(sin τ)2

4(sin τ)2 + (∆(x, τ))2(K.33)= Y ′

q (x, τ), (K.34)


proving (K.24) for each x ∈ R \ x0(τ). It remains to consider x = x0(τ). For τ ∈ Rk,we have sin τ > 0 and x0(τ) > 0, and, thus, by L’Hopital’s rule [Phi16, Th. 9.26(a)],

limx↑x0(τ)

Yq(

x0(τ), τ)

− Yq(x, τ)

x0(τ)− x= lim

x↑x0(τ)

π + 2kπ − 2(

kπ + arctan(


))

x0(τ)− x

[Phi16, (9.29)],(K.33)= lim

x↑x0(τ)

− 4(sin τ)2

4(sin τ)2+(∆(x,τ))2

−1= 1 (K.35)

and

limx↓x0(τ)

Yq(

x0(τ), τ)

− Yq(x, τ)

x0(τ)− x= lim

x↓x0(τ)

π + 2kπ − 2(

(k + 1)π + arctan(


))

x0(τ)− x

[Phi16, (9.29)],(K.33)= lim

x↓x0(τ)

− 4(sin τ)2

4(sin τ)2+(∆(x,τ))2

−1= 1 (K.36)

showing x 7→ Yq(x, τ) to be differentiable in x = x0(τ) with Y′q (x0(τ), τ) = 1. Due to

1− cos(π + 2kπ)

2= 1 = Y ′

q (x0(τ), τ), (K.37)

(K.24) also holds. For τ ∈ Lk, we have sin τ < 0 and x0(τ) < 0, and, thus, by L’Hopital’srule [Phi16, Th. 9.26(a)],

limx↑x0(τ)

Yq(

x0(τ), τ)

− Yq(x, τ)

x0(τ)− x

= limx↑x0(τ)

−π + 2kπ − 2(

(k − 1)π + arctan(


))

x0(τ)− x

[Phi16, (9.29)],(K.33)= lim

x↑x0(τ)

− 4(sin τ)2

4(sin τ)2+(∆(x,τ))2

−1= 1 (K.38)

and

limx↓x0(τ)

Yq(

x0(τ), τ)

− Yq(x, τ)

x0(τ)− x= lim

x↓x0(τ)

−π + 2kπ − 2(

kπ + arctan(


))

x0(τ)− x

[Phi16, (9.29)],(K.33)= lim

x↓x0(τ)

− 4(sin τ)2

4(sin τ)2+(∆(x,τ))2

−1= 1 (K.39)

showing x 7→ Yq(x, τ) to be differentiable in x = x0(τ) with Y′q (x0(τ), τ) = 1. Due to

1− cos(−π + 2kπ)

2= 1 = Y ′

q (x0(τ), τ), (K.40)

(K.24) also holds. N


Claim 3. The general solution to (K.2) with f1, f2 according to (K.10) is

Y : Df,0 −→ R2,

Y (x, η1, η2) :=(

Yp(

x, r(η1, η2))

cosYq(

x, ϕ(η1, η2))

,

Yp(

x, r(η1, η2))

sinYq(

x, ϕ(η1, η2))

)

, (K.41)

where r and ϕ are given by (K.1), and

Df,0 =(

R× η ∈ R2 : 0 < ‖η‖2 ≤ 1)

∪

(x, η) ∈ R× R2 : ‖η‖2 > 1, x ∈]

− ln‖η‖2

‖η‖2 − 1, ∞

[

. (K.42)

Proof. Since (K.12) is the polar coordinate version of (K.2) with f1, f2 according to(K.10), everything follows from combining Th. K.1 with Claims 1 and 2. N

Claim 4. The autonomous ODE (K.2) with f1, f2 according to (K.10) has (1, 0) as itsonly fixed point, and (1, 0) satisfies Def. 5.24(iii) for x→ ∞ (even for each η ∈ R2 \0)without satisfying Def. 5.24(ii) (i.e. without being positively stable).

Proof. For each η ∈ R2 \ 0, it is r(η) > 0, and, thus

∀η∈R2\0

limx→∞

Yp(

x, r(η))

= limx→∞

r(η)

r(η) +(

1− r(η))

e−x= 1. (K.43)

Fix η ∈ R2 \ 0. If ϕ(η) = 0, then

limx→∞

Yq(

x, 0)

= limx→∞

0 = 0. (K.44a)

If ϕ(η) = π, then

limx→∞

Yq(

x, π)

= limx→∞

2

(

π + arctan

(

−2

x

))

= 2(π + 0) = 2π. (K.44b)

If 0 < ϕ(η) < π or π < ϕ(η) < 2π, then sinϕ(η) 6= 0 and, thus,

limx→∞

Yq(

x, ϕ(η))

= limx→∞

2

(

π + arctan

(

2 sinϕ(η)

2 cosϕ(η)− x sinϕ(η) + 2

))

= 2(π + 0) = 2π. (K.44c)

Using (K.43) and (K.44) in (K.41) yields

∀η∈R2\0

limx→∞

Y (x, η) = (1, 0). (K.45)

While (1, 0) is clearly a fixed point for (K.2) with f1, f2 according to (K.10), (K.45)shows that no other η ∈ R2 \ 0 can be a fixed point.

REFERENCES 154

For each τ ∈]0, π[ and η := (cos τ, sin τ), it is ϕ(η) = τ and Yq(0, ϕ(η)) = τ . Thus, dueto (K.44c) and the intermediate value theorem, the continuous function Yq(·, ϕ(η)) mustattain every value between τ and 2π, in particular, there is xπ > 0 that Yq(xτ , ϕ(η)) = πand Y (xτ , η) = (cos π, sin π) = (−1, 0). Since every neighborhood of (1, 0) containspoints η = (cos τ, sin τ) with τ ∈]0, π[, this shows that (1, 0) does not satisfy Def.5.24(ii) for x ≥ 0. N

References

[Aul04] Bernd Aulbach. Gewohnliche Differenzialgleichungen, 2nd ed. SpektrumAkademischer Verlag, Heidelberg, Germany, 2004 (German).

[Koe03] Max Koecher. Lineare Algebra und analytische Geometrie, 4th ed. Springer-Verlag, Berlin, 2003 (German), 1st corrected reprint.

[Mar04] Nelson G. Markley. Principles of Differential Equations. Pure and AppliedMathematics, Wiley-Interscience, Hoboken, NJ, USA, 2004.

[Oss09] E. Ossa. Topologie. Vieweg+Teubner, Wiesbaden, Germany, 2009 (German).

[Phi15] P. Philip. Calculus II for Statistics Students. Lecture Notes, Ludwig-Maximilians-Universitat, Germany, 2015, available in PDF format athttp://www.math.lmu.de/~philip/publications/lectureNot

es/calc2_forStatStudents.pdf.

[Phi16] P. Philip. Analysis I: Calculus of One Real Variable. Lecture Notes, Lud-wig-Maximilians-Universitat, Germany, 2015/2016, available in PDF format athttp://www.math.lmu.de/~philip/publications/lectureNotes/analysis1.pdf.

[Pre75] G. Preuß. Allgemeine Topologie, 2nd ed. Springer-Verlag, Berlin, 1975 (Ger-man).

[Put66] E.J. Putzer. Avoiding the Jordan Canonical Form in the Discussion of LinearSystems with Constant Coefficients. The American Mathematical Monthly 73(1966), No. 1, 2–7.

[Str08] Gernot Stroth. Lineare Algebra, 2nd ed. Berliner Studienreihe zur Mathe-matik, Vol. 7, Heldermann Verlag, Lemgo, Germany, 2008 (German).

[Wal02] Wolfgang Walter. Analysis 2, 5th ed. Springer-Verlag, Berlin, 2002 (Ger-man).

ordinary diﬀerential equations - lmu münchenphilip/publications/lecturenotes/ode.pdf · ordinary...

Documents