Analysis and Topology ∗

Zhiyuan Bai

Compiled on July 11, 2021

This document serves as a set of revision materials for the Cambridge Mathematical Tripos Part IB course Analysis and Topology in Michaelmas 2019. However, despite its primary focus, readers should note that it is NOT a verbatim recall of the lectures, since the author might have made further amendments in the content. Therefore, there should always be provisions for errors and typos while this material is being used.

Contents

1 Uniform Convergence

2 Uniform Continuity

3 Metric Space

4 Completeness

5 Topological Spaces

6 Connectedness

7 Compactness

8 Differentiation

1 Uniform Convergence

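Definition 1.1. A complex sequence $x_n$ is said to converge to a complex number $x$ if $\forall \varepsilon > 0, \exists N \in \mathbb{N}$ such that $\forall n > N$, $|x - x_n| < \varepsilon$.

Definition 1.2. Let $S$ be a set and let $f_n : S \to \mathbb{C}$ be a sequence of functions. Let $f : S \to \mathbb{C}$ be a function. We say $f_n \to f$ pointwise if for any $x \in S$, $f_n(x) \to f(x)$. In other words, $\forall x \in S, \forall \varepsilon > 0, \exists N \in \mathbb{N}, \forall n > N$, $|f(x) - f_n(x)| < \varepsilon$.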

Example 1.1. Let $S$ be the closed interval $[0, 1]$ and $f_n(x) = x^n$. Then $f_n \to f$ pointwise, where
$$f(x) = \begin{cases} 1, & \text{if } x = 1 \\ 0, & \text{otherwise.} \end{cases}$$

∗Based on the lectures under the same name taught by Prof. A. Zsák in Michaelmas 2019.

Note that in this example, although every $f_n$ is continuous, even smooth, the resulting limit $f$ need not be continuous. Here is another example:

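Example 1.2. Let $S = \mathbb{R}_{\ge 0}$ and let $f_n(x) = x^2 e^{-nx}$. Then $f_n \to 0$ pointwise, since $f_n(0) = 0$ and, for $x > 0$,
$$0 \le |f_n(x)| = \frac{x^2}{e^{nx}} = \frac{x^2}{1 + nx + \frac{n^2x^2}{2} + \frac{n^3x^3}{6} + \cdots} \le \frac{x^2}{nx} = \frac{x}{n} \to 0$$
as $n \to \infty$.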

There is another form of convergence, called uniform convergence, which is defined as follows:

Definition 1.3. Let $S$ be a set and let $f_n : S \to \mathbb{C}$ be a sequence of functions. Let $f : S \to \mathbb{C}$ be a function. We say $f_n \to f$ uniformly if $\forall \varepsilon > 0, \exists N \in \mathbb{N}, \forall n > N, \forall x \in S$, $|f(x) - f_n(x)| < \varepsilon$.

Note that the only difference between pointwise and uniform convergence is that the large integer $N$ does not depend on $x$ if the convergence is uniform. Note also that uniform convergence implies pointwise convergence, but not the other way around. Although it does not seem to be such a great difference in definition, in practice, it makes all the difference in the world.

Proposition 1.1. The sequence in the first example, i.e. $f_n : [0, 1] \to \mathbb{R}$ with $f_n(x) = x^n$, does not converge uniformly.

Proof. We can take for example $\varepsilon = 1/2$. Then for any $n \in \mathbb{N}$, we can take $x = \sqrt[n]{2/3}$, so that $x < 1$, $f(x) = 0$ and
$$|f_n(x) - f(x)| = |f_n(x)| = \frac{2}{3} > \frac{1}{2}.$$
So the claimed $N$ does not exist. Therefore the sequence $f_n$ does not converge uniformly.
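The dependence of $N$ on $x$ can be made concrete numerically. The following is a minimal sketch, not part of the lectures: the helper name `smallest_N` is hypothetical, and it simply solves $x^n < \varepsilon$ for $n$ using logarithms.

```python
import math

def smallest_N(x: float, eps: float) -> int:
    """Smallest N >= 0 such that x**n < eps for every n > N (for 0 <= x < 1)."""
    if x == 0.0:
        return 0
    # x**n < eps  <=>  n > log(eps) / log(x), since log(x) < 0 flips the inequality.
    return max(0, math.floor(math.log(eps) / math.log(x)))

eps = 0.5
# The required N grows without bound as x approaches 1, so no single N works for all x.
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, smallest_N(x, eps))

# The witness from the proof: x = (2/3)**(1/n) always gives f_n(x) = 2/3 > 1/2.
for n in (1, 10, 100):
    x = (2 / 3) ** (1 / n)
    print(n, x ** n)
```

For each fixed $x < 1$ a finite $N$ exists (pointwise convergence), but the printed values of $N$ increase as $x$ moves towards $1$, which is exactly the failure of uniformity.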

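Proposition 1.2. The sequence in the second example, i.e. $f_n : \mathbb{R}_{\ge 0} \to \mathbb{R}$ where $f_n(x) = x^2 e^{-nx}$, converges uniformly.

Proof. Note that $f_n(0) = 0$ and, for $x > 0$,
$$0 \le f_n(x) = \frac{x^2}{e^{nx}} = \frac{x^2}{1 + nx + \frac{n^2x^2}{2} + \cdots} \le \frac{x^2}{n^2x^2/2} = \frac{2}{n^2}.$$
Therefore for any $\varepsilon > 0$, we can take $N = \lceil \sqrt{2/\varepsilon} \rceil$, so for any $x \ge 0$, $n > N$, we have
$$|f(x) - f_n(x)| = |f_n(x)| = f_n(x) \le \frac{2}{n^2} < \frac{2}{N^2} \le \frac{2}{(\sqrt{2/\varepsilon})^2} = \varepsilon.$$
So $f_n \to 0$ uniformly.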
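As a numerical sanity check of the bound $f_n \le 2/n^2$, here is a small sketch that approximates $\sup f_n$ on a grid. It is an illustration only: the grid, the cut-off `upper` and the helper names are assumptions, and the comparison value $4/(e^2 n^2)$ comes from the elementary observation that $f_n'(x) = x e^{-nx}(2 - nx)$ vanishes at $x = 2/n$.

```python
import math

def f(n: int, x: float) -> float:
    return x * x * math.exp(-n * x)

def grid_sup(n: int, upper: float = 10.0, steps: int = 20_000) -> float:
    """Approximate sup of f_n over [0, upper] on a uniform grid.

    f_n is decreasing for x > 2/n, so nothing is lost by cutting off at `upper`.
    """
    return max(f(n, upper * k / steps) for k in range(steps + 1))

for n in (1, 2, 5, 10):
    exact = 4 / (math.e ** 2 * n ** 2)  # value of f_n at its critical point x = 2/n
    print(n, grid_sup(n), exact, 2 / n ** 2)
```

The grid supremum stays below $2/n^2$ for every $n$ tried, consistent with a bound that is independent of $x$.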

In fact, although a pointwise limit of continuous functions need not be continuous, a uniform limit of continuous functions always is.

Theorem 1.3. Let $S \subset \mathbb{C}$ be open. Suppose that $f_n : S \to \mathbb{C}$ is a sequence of continuous functions. If $f_n \to f$ uniformly, then $f$ is continuous as well.

Informal sketch. Idea: transfer the nice property of $f_n$ to $f$. Choose $N$ large enough that $f_n - f$ is arbitrarily small for all $n > N$. For $x'$ close to $x$, continuity of $f_n$ makes $f_n(x')$ close to $f_n(x)$. Then just use the triangle inequality (a "3-ε proof").

Proof. Fix $x \in S$ and $\varepsilon > 0$. We can choose $N$ large enough that $\sup |f_n - f| < \varepsilon/3$ for all $n > N$; fix one such $n$. Since $f_n$ is continuous at $x$, we can choose $\delta > 0$ such that $|x - x'| < \delta \implies |f_n(x) - f_n(x')| < \varepsilon/3$. For such $x'$,
$$|f(x) - f(x')| \le |f(x) - f_n(x)| + |f(x') - f_n(x')| + |f_n(x) - f_n(x')| < \frac{3\varepsilon}{3} = \varepsilon.$$
As desired.

Remark. 1. We can use this theorem to show that $x^n$ as in the previous example does not converge uniformly.
2. It is not true that differentiability is preserved under uniform convergence.

Theorem 1.4. Let $f_n : [a, b] \to \mathbb{R}$ be Riemann integrable for every $n$. If $f_n$ converges uniformly, then its limit is also Riemann integrable. Furthermore,
$$\int_a^b \lim_{n\to\infty} f_n(x)\,dx = \lim_{n\to\infty} \int_a^b f_n(x)\,dx.$$

Recall that a function $f$ is Riemann integrable if and only if the upper and lower sums of $f$ on the interval can be made arbitrarily close.

Proof. Firstly, $f$ is bounded. Since each $f_n$ is bounded, we can choose $n$ large enough that $|f_n - f| < 1$, and $M$ such that $|f_n| < M$. Then $|f| \le |f - f_n| + |f_n| < 1 + M$, so $f$ is bounded.
For $\varepsilon > 0$ choose $N$ such that $\sup |f_n - f| < \varepsilon/(3(b - a))$ for any $n > N$, and fix such an $n$. Since $f_n$ is integrable, there is some dissection $D$ of the interval $[a, b]$ such that $U_D(f_n) - L_D(f_n) < \varepsilon/3$. We have
$$|L_D(f) - L_D(f_n)| \le \sum_{(x_i) \in D} \left| \inf_{x \in [x_i, x_{i+1}]} f(x) - \inf_{x \in [x_i, x_{i+1}]} f_n(x) \right| (x_{i+1} - x_i) < \varepsilon/3.$$
Similarly $|U_D(f) - U_D(f_n)| < \varepsilon/3$. So
$$|U_D(f) - L_D(f)| \le |U_D(f) - U_D(f_n)| + |L_D(f_n) - L_D(f)| + |U_D(f_n) - L_D(f_n)| < 3\varepsilon/3 = \varepsilon.$$
This shows that $f$ is integrable. Finally, we have
$$\left| \int_a^b (f(x) - f_n(x))\,dx \right| \le \int_a^b \sup |f - f_n|\,dx < \varepsilon/3 < \varepsilon,$$
which completes the proof.

Remark. 1. For uniform convergence, we can swap the integral and the limit.
2. If $f_n \to f$ uniformly and every $f_n$ is bounded, then $f$ is bounded.
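To see that uniformity genuinely matters for swapping limit and integral, one can look at a sequence that converges pointwise but not uniformly. The example $f_n(x) = n x e^{-n x^2}$ on $[0, 1]$ is not from the lectures; it is chosen here as an assumption-labelled illustration because $\int_0^1 f_n = (1 - e^{-n})/2 \to 1/2$ while $f_n \to 0$ pointwise. The midpoint-rule helper is likewise only a sketch.

```python
import math

def f(n: int, x: float) -> float:
    return n * x * math.exp(-n * x * x)

def midpoint_integral(g, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

for n in (1, 10, 100, 1000):
    integral = midpoint_integral(lambda x: f(n, x), 0.0, 1.0)
    closed_form = (1 - math.exp(-n)) / 2
    peak = f(n, 1 / math.sqrt(2 * n))  # f_n attains its maximum at x = 1/sqrt(2n)
    print(n, integral, closed_form, peak)
```

The integrals approach $1/2$ rather than $\int_0^1 0\,dx = 0$, and the peak values grow like $\sqrt{n}$, so $\sup |f_n - 0|$ does not tend to $0$ and the theorem does not apply.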


Corollary 1.5. For uniform convergence, we can swap infinite sums and integrals. That is, if $f_n : [a, b] \to \mathbb{R}$ is a sequence of integrable functions whose partial sums converge uniformly to some function $f$, then $f$ is integrable and
$$\int_a^b f(x)\,dx = \sum_{n=1}^\infty \int_a^b f_n(x)\,dx.$$

Proof. Let
$$F_n(x) = \sum_{k=1}^n f_k(x),$$
so the $F_n$ are integrable and $F_n \to f$ uniformly. Then we can just apply the preceding theorem.

Theorem 1.6. Let $f_n : [a, b] \to \mathbb{R}$ be continuously differentiable on $[a, b]$. Assume that the sequence of partial sums of $f_n'$ converges uniformly on $[a, b]$, and that there is a $c \in [a, b]$ such that
$$\sum_{n=1}^\infty f_n(c)$$
converges. Then the sequence of partial sums of $f_n$ converges uniformly. Furthermore, the limit $f$ is continuously differentiable and
$$f'(x) = \sum_{n=1}^\infty f_n'(x).$$

Sketch of proof. Let
$$F_n(x) = \sum_{k=1}^n f_k(x), \qquad g(x) = \sum_{n=1}^\infty f_n'(x).$$
So we want to find a particular solution to the differential equation $f' = g$, and show that $F_n$ converges uniformly to it. So basically we want to justify
$$f(x) = \int_c^x g(t)\,dt + \sum_{n=1}^\infty f_n(c) = \lim_{n\to\infty} F_n(x) = \sum_{n=1}^\infty f_n(x)$$
rigorously and it would be done.

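Proof. Let
$$g(x) = \sum_{n=1}^\infty f_n'(x).$$
As a uniform limit of continuous functions, $g$ is continuous and hence Riemann integrable on $[a, b]$. Define $f : [a, b] \to \mathbb{R}$ by
$$f(x) = \int_c^x g(t)\,dt + \lambda, \qquad \text{where } \lambda = \sum_{n=1}^\infty f_n(c).$$
By FTC, $f$ is differentiable and $f'(x) = g(x)$. Since $g$ is continuous, $f \in C^1([a, b])$. It remains to show that the series $\sum f_n(x)$ converges uniformly to $f(x)$. Let $F_n(x)$ be the $n$-th partial sum of the series. By FTC again, $F_n(x) = \int_c^x F_n'(t)\,dt + F_n(c)$, so
$$|F_n(x) - f(x)| \le \left| \int_c^x (F_n'(t) - g(t))\,dt \right| + |F_n(c) - \lambda| \le (b - a) \sup |F_n' - g| + |F_n(c) - \lambda|.$$
The right-hand side does not depend on $x$ and tends to $0$, since $F_n' \to g$ uniformly and $F_n(c) \to \lambda$. Hence $F_n \to f$ uniformly.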

Definition 1.4. Let $f_n$ be a sequence of scalar functions on a set $S$. We say $f_n$ is uniformly Cauchy on $S$ if $\forall \varepsilon > 0, \exists N \in \mathbb{N}, \forall x \in S, \forall n, m > N$,
$$|f_n(x) - f_m(x)| < \varepsilon.$$

Theorem 1.7 (General Principle of Uniform Convergence). A sequence of uniformly Cauchy scalar functions $f_n$ on $S$ converges uniformly.

Proof. Firstly, we shall find a pointwise limit $f$ of $f_n$. The existence of $f$ is immediate since, with $x$ fixed, $f_n(x)$ is Cauchy (hence converges).
Then we shall show that this convergence is uniform. Given any $\varepsilon > 0$, $\exists N \in \mathbb{N}, \forall x \in S, \forall n, m > N$, $|f_n(x) - f_m(x)| < \varepsilon/2$. Now fix $x \in S$ and $n > N$. Since $f_n \to f$ pointwise, we can choose $m > N$ with $|f_m(x) - f(x)| < \varepsilon/2$, then
$$|f(x) - f_n(x)| \le |f(x) - f_m(x)| + |f_m(x) - f_n(x)| < 2\varepsilon/2 = \varepsilon.$$
So $f_n \to f$ uniformly.

So what we did is to fix the $x$ and the $n$, then let $m$ tend to infinity, and use the pointwise convergence to give the result. This is how we get past the dependence of $N$ on $x$ in the pointwise convergence result.

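Corollary 1.8. Let $f_n$ be a sequence of scalar functions on $S$, and let
$$\sum_{n=1}^\infty M_n$$
be convergent with $M_n \ge 0$. If $\sup |f_n| \le M_n$ for any $n$, then
$$\sum_{n=1}^\infty f_n$$
converges uniformly.

Proof. Let $F_n$ be the $n$-th partial sum of the series. Then $F_n$ is uniformly Cauchy due to the convergence of the series of $M_n$. Indeed, $\forall \varepsilon > 0, \exists N \in \mathbb{N}, \forall m > n > N$,
$$\sum_{k=n+1}^m M_k < \varepsilon,$$
so for every $x \in S$,
$$|F_m(x) - F_n(x)| \le \sum_{k=n+1}^m |f_k(x)| < \varepsilon.$$
Therefore $F_n$ is uniformly Cauchy, so it converges uniformly.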
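The argument above can be checked numerically on a concrete series. The choice $f_k(x) = \sin(kx)/k^2$ with $M_k = 1/k^2$ is an assumption made purely for illustration, as are the helper names and the sampling grid; the sketch compares the gap between two partial sums with the corresponding tail of $\sum M_k$.

```python
import math

def partial_sum(N: int, x: float) -> float:
    """N-th partial sum of sum_k sin(kx) / k**2."""
    return sum(math.sin(k * x) / k ** 2 for k in range(1, N + 1))

def tail_bound(N: int, M: int) -> float:
    """sum of M_k = 1/k**2 for N < k <= M; it bounds |F_M(x) - F_N(x)| for every x."""
    return sum(1 / k ** 2 for k in range(N + 1, M + 1))

xs = [2 * math.pi * j / 200 for j in range(201)]  # sample points in [0, 2*pi]
M = 400
for N in (5, 20, 80):
    gap = max(abs(partial_sum(M, x) - partial_sum(N, x)) for x in xs)
    print(N, gap, tail_bound(N, M))
```

The sampled gap never exceeds the tail of $\sum M_k$, and that tail does not depend on $x$, which is the content of the uniform Cauchy estimate.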


Now we consider the power series
$$\sum_{n=0}^\infty a_n (z - a)^n,$$
where $(a_n)_{n=0}^\infty$ is a sequence of complex numbers. Let $R$ be the radius of convergence. Now on the disk $|z - a| < R$, we consider
$$f(z) = \sum_{n=0}^\infty a_n (z - a)^n.$$
The question is: is the convergence uniform?

Example 1.3. Consider
$$f(z) = \sum_{n=0}^\infty z^n = \frac{1}{1 - z},$$
where $R = 1$. It does not converge uniformly on the disk $|z| < 1$. Indeed, on this disk the $N$-th partial sum is bounded by $N + 1$ but $1/(1 - z)$ is unbounded.

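Theorem 1.9. For any $r$ with $0 < r < R$, the series converges uniformly on $D(a, r)$.

Proof. Fix $w \in \mathbb{C}$ such that $r < |w - a| < R$. Since the series converges at $w$, there is an $M > 0$ such that $|a_n (w - a)^n| < M$ for any $n$. For any $z \in D(a, r)$ we have $|z - a|/|w - a| < r/|w - a| < 1$, so
$$|a_n (z - a)^n| = |a_n (w - a)^n| \left( \frac{|z - a|}{|w - a|} \right)^n \le M \left( \frac{r}{|w - a|} \right)^n.$$
Taking $M_n = M (r/|w - a|)^n$, whose sum is a convergent geometric series, Corollary 1.8 shows the result.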
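For the geometric series of Example 1.3 the error after $N$ terms is $z^{N+1}/(1 - z)$, whose modulus on $|z| \le r$ is at most $r^{N+1}/(1 - r)$, with equality at $z = r$. The sketch below samples the circle $|z| = r$ to illustrate this; the sampling scheme and function names are assumptions, not part of the notes.

```python
import cmath
import math

def partial_sum(N: int, z: complex) -> complex:
    """N-th partial sum 1 + z + ... + z**N of the geometric series."""
    return sum(z ** k for k in range(N + 1))

def sampled_sup_error(N: int, r: float, samples: int = 360) -> float:
    """Largest error |1/(1-z) - partial sum| over sample points on the circle |z| = r."""
    pts = (r * cmath.exp(2j * math.pi * j / samples) for j in range(samples))
    return max(abs(1 / (1 - z) - partial_sum(N, z)) for z in pts)

N = 20
for r in (0.5, 0.9, 0.99, 0.999):
    print(r, sampled_sup_error(N, r), r ** (N + 1) / (1 - r))
```

For fixed $r < 1$ the bound $r^{N+1}/(1 - r)$ tends to $0$ as $N$ grows, giving uniform convergence on $D(0, r)$; for fixed $N$ it blows up as $r \to 1$, matching the failure of uniform convergence on the whole open disk.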

The derivative of a power series (we can prove that it is complex differentiable, and we can do it term-by-term) has the same radius of convergence.

Remark. If we fix $w \in D(a, R)$, we can choose $r$ such that $|w - a| < r < R$. Fix any $\delta > 0$ such that $|w - a| + \delta < r$, then $D(w, \delta) \subset D(a, r)$, so
$$\sum_{n=0}^\infty a_n (z - a)^n$$
converges uniformly on $D(w, \delta)$. We say the series converges locally uniformly on $D(a, R)$.

2 Uniform Continuity

Let $U \subset \mathbb{C}$ and let $f$ be a scalar function on $U$. We know what continuity means in the sense of metric spaces.

Definition 2.1. We say $f$ is uniformly continuous on $U$ if for any $\varepsilon > 0, \exists \delta > 0, \forall x, y \in U$, $|x - y| < \delta \implies |f(x) - f(y)| < \varepsilon$.

Note that the difference between uniform continuity and ordinary continuity is that the value of $\delta$ does not depend on the point $x$.

Example 2.1. The standard example of a function that is continuous but not uniformly continuous is $f(x) = x^2$ on $\mathbb{R}$. To see why, take $\varepsilon = 1$ and choose any $\delta > 0$. Then
$$(x + \delta/2)^2 - x^2 = \delta x + \delta^2/4.$$
We can just choose $x > 1/\delta$ and the value would exceed $1$.
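The recipe in this example can be turned into a tiny witness generator. This is only a sketch: the function name `witness` and the particular choice $x = 1/\delta + 1$ (any $x > 1/\delta$ would do) are assumptions made for illustration.

```python
def witness(delta: float) -> tuple[float, float]:
    """Return (x, y) with |x - y| < delta but |x**2 - y**2| > 1."""
    x = 1.0 / delta + 1.0  # any x > 1/delta works
    y = x + delta / 2.0
    return x, y

for delta in (1.0, 0.1, 1e-3):
    x, y = witness(delta)
    # The two points are closer than delta, yet their squares differ by more than 1.
    print(delta, abs(x - y), abs(y * y - x * x))
```

However small $\delta$ is chosen, the witness pair violates the $\varepsilon = 1$ condition, so no single $\delta$ works on all of $\mathbb{R}$.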

Theorem 2.1. Let $f$ be a scalar function defined on a closed interval $[a, b]$. If $f$ is continuous then it is uniformly continuous.

Proof. (The Heine-Borel Theorem gives compactness of the interval, which provides a direct proof.)
Assume that it is not the case, so
$$\exists \varepsilon > 0, \forall \delta > 0, \exists x, y \in [a, b],\ |x - y| < \delta \wedge |f(x) - f(y)| \ge \varepsilon.$$
Choose such a 'bad' $\varepsilon$, consider $\delta_n = 1/n$ and choose $x_n, y_n$ accordingly.
By Bolzano-Weierstrass, there is a subsequence $x_{k_m}$ of $x_n$ that converges to some $x$ as $m \to \infty$. Note that since the interval is closed, $x \in [a, b]$. Then $|y_{k_m} - x| \le |x_{k_m} - x| + |x_{k_m} - y_{k_m}| < |x_{k_m} - x| + 1/k_m \to 0$, so $y_{k_m} \to x$.
Since $f$ is continuous at $x$, there is some $\delta$ such that for every $y \in [a, b]$, $|x - y| < \delta \implies |f(x) - f(y)| < \varepsilon/2$. There is some $N$ such that $m > N \implies |x_{k_m} - x| < \delta, |y_{k_m} - x| < \delta$. So $\varepsilon \le |f(x_{k_m}) - f(y_{k_m})| \le |f(x_{k_m}) - f(x)| + |f(x) - f(y_{k_m})| < 2\varepsilon/2 = \varepsilon$. This is a contradiction.

It follows that a continuous function on a closed interval is integrable.

3 Metric Space

Definition 3.1. Let $M$ be an arbitrary set. A metric on $M$ is a function $d : M \times M \to \mathbb{R}_{\ge 0}$ such that the following properties hold:
1. $\forall x, y \in M, d(x, y) = 0 \iff x = y$.
2. $\forall x, y \in M, d(x, y) = d(y, x)$.
3. $\forall x, y, z \in M, d(x, y) + d(y, z) \ge d(z, x)$.
The couple $(M, d)$ is called a metric space.

Example 3.1. 1. Let $M$ be $\mathbb{R}$ or $\mathbb{C}$; then $d(x, y) = |x - y|$ is called the usual metric on those sets.
2. Take $M = \mathbb{R}^n$ or $\mathbb{C}^n$, then
$$d((x_1, \dots, x_n), (y_1, \dots, y_n)) = \sqrt{|x_1 - y_1|^2 + \dots + |x_n - y_n|^2}.$$
This is called the $\ell_2$ metric.
3. Take the same $M$, then we can also have the $\ell_1$ metric where
$$d((x_1, \dots, x_n), (y_1, \dots, y_n)) = |x_1 - y_1| + \dots + |x_n - y_n|.$$
4. Also on the same $M$, we have the $\ell_\infty$ metric where
$$d((x_1, \dots, x_n), (y_1, \dots, y_n)) = \max_i |x_i - y_i|.$$
5. We can have the $\ell_p$ metric for $p \ge 1$ where
$$d((x_1, \dots, x_n), (y_1, \dots, y_n)) = \sqrt[p]{|x_1 - y_1|^p + \dots + |x_n - y_n|^p}.$$
6. Let $S$ be a set and $M = \ell_\infty S$ be the set of bounded scalar functions on $S$. The metric we can take is the uniform metric $d(f, g) = \sup_S |f - g|$, which is well-defined since $f, g$ are bounded.
7. Let $M$ be any set, then define
$$d(x, y) = \begin{cases} 1, & \text{if } x \ne y \\ 0, & \text{otherwise.} \end{cases}$$
This is called the discrete metric and the space $(M, d)$ the discrete metric space.
8. Let $G$ be a group that is generated by a symmetric set $S$. Define $d(x, y)$ to be the least number $n \ge 0$ of generators needed to get from $x$ to $y$.
This develops into the discipline called geometric group theory.
9. Suppose $G$ is a connected (finite) graph, then we can define the distance between two vertices $x, y$ to be the length of the shortest path from $x$ to $y$.
10. Riemannian metric in geometry.
11. Take $M$ to be the integers and fix a prime $p$. We can define the $p$-adic metric $d_p(x, y)$ to be $0$ if $x = y$ and otherwise $\|x - y\|_p = p^{-n}$, where $p^n$ is the greatest power of $p$ dividing $|x - y|$. It is obvious that it is a metric. The metric space $(\mathbb{Z}, d_p)$ is called the $p$-adic integers.
12. Let $M$ be the set of all functions from $\mathbb{N}$ to $\mathbb{R}$, that is, the set of all sequences. So for $x = (x_n), y = (y_n)$, we define
$$d(x, y) = \sum_{n=0}^\infty 2^{-n} \min\{1, |x_n - y_n|\}.$$
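Several of the metrics in this list are easy to experiment with. The following sketch implements the $\ell_p$, $\ell_\infty$, discrete and $p$-adic metrics and brute-force checks the triangle inequality on a small sample. All function names are hypothetical, and a finite sample check is of course only evidence, not a proof.

```python
import itertools
import random

def d_p(x, y, p: float) -> float:
    """The l_p distance between two tuples of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

def d_inf(x, y) -> float:
    return max(abs(a - b) for a, b in zip(x, y))

def d_discrete(x, y) -> int:
    return 0 if x == y else 1

def d_padic(x: int, y: int, p: int) -> float:
    """p-adic distance on the integers: p**(-n), with p**n the largest power dividing x - y."""
    if x == y:
        return 0.0
    diff, n = abs(x - y), 0
    while diff % p == 0:
        diff //= p
        n += 1
    return p ** (-n)

def satisfies_triangle(d, points, tol: float = 1e-12) -> bool:
    """Check d(x, z) <= d(x, y) + d(y, z) for all triples drawn from `points`."""
    return all(d(x, z) <= d(x, y) + d(y, z) + tol
               for x, y, z in itertools.product(points, repeat=3))

random.seed(0)
vectors = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(12)]
integers = list(range(-10, 11))

print(satisfies_triangle(lambda x, y: d_p(x, y, 1), vectors))
print(satisfies_triangle(lambda x, y: d_p(x, y, 2), vectors))
print(satisfies_triangle(d_inf, vectors))
print(satisfies_triangle(d_discrete, integers))
print(satisfies_triangle(lambda x, y: d_padic(x, y, 2), integers))
```

The small tolerance `tol` only absorbs floating-point rounding in the $\ell_p$ computations.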

As before, we can construct new objects from old.

Definition 3.2. Let $(M, d)$ be a metric space and $N \subseteq M$. Then $(N, d|_{N \times N})$ is a metric space and is called the metric subspace of $(M, d)$. Sometimes we denote $(N, d|_{N \times N})$ by $(N, d)$.

Example 3.2. $C[0, 1]$, the set of real continuous functions on the unit interval with the uniform metric, is a subspace of $\ell_\infty[0, 1]$.

Note that there are other metrics on $C[0, 1]$, for example the $L^1$ metric
$$d(f, g) = \int_0^1 |f(x) - g(x)|\,dx.$$
We also have the $L^2$ metric
$$d(f, g) = \sqrt{\int_0^1 |f(x) - g(x)|^2\,dx}.$$
Note that the $L^\infty$ metric is the uniform metric.

Definition 3.3. Let $(M, d), (N, d')$ be two metric spaces. We can define the metric product space by taking the underlying set $M \times N$ and the metric
$$d_p((m_1, n_1), (m_2, n_2)) = (d(m_1, m_2)^p + d'(n_1, n_2)^p)^{1/p}$$
for some $p \ge 1$, or
$$d_\infty((m_1, n_1), (m_2, n_2)) = \max\{d(m_1, m_2), d'(n_1, n_2)\}.$$
We can generalize it to any finite product of metric spaces.


We need to generalize to topological spaces to have the notion of a quotient space in a metric/topological space.
We can introduce (uniform) convergence again in any metric space. We work in a metric space $(M, d)$.$^1$

Definition 3.4. Given a sequence $x_n$ in $M$ and a point $x \in M$, we say $x_n \to x$ as $n \to \infty$ if for any $\varepsilon > 0, \exists N \in \mathbb{N}, \forall n > N, d(x_n, x) < \varepsilon$.
Conversely, if such an $x$ exists for a sequence $x_n$, we say $x_n$ is convergent.

Lemma 3.1. Assume $x_n \to x$ and $x_n \to y$, then $x = y$.

Proof. Assume for the sake of contradiction that it is not the case. We let $\varepsilon = d(x, y) > 0$. We have a large enough $N \in \mathbb{N}$ such that $n > N \implies d(x_n, x) < \varepsilon/2, d(x_n, y) < \varepsilon/2$, so
$$\varepsilon = d(x, y) \le d(x, x_n) + d(x_n, y) < 2\varepsilon/2 = \varepsilon.$$
A contradiction.

Now we can introduce the concept of limit.

Definition 3.5. Let $(x_n)$ be a convergent sequence which converges to $x$. We write
$$\lim_{n\to\infty} x_n = x.$$

Example 3.3. 1. In $\mathbb{R}, \mathbb{C}$, this is the usual notion of convergence.
2. Take the integers under the 2-adic metric; then $2^n \to 0$ as $n \to \infty$.
3. A sequence which is eventually constant converges to that constant. Obviously the converse is false in general, but true in discrete metric spaces.
4. Choose a nonempty set $S$. Then functions that converge under the (induced) uniform metric on $\ell_\infty S$ converge uniformly as functions. This can sometimes work even for functions $\notin \ell_\infty S$; for example take $S = \mathbb{R}$, then $f_n(x) = x + 1/n$ converges uniformly to the identity function.
5. Consider the space $\mathbb{R}^{\mathbb{N}}$, the set of all real sequences, with the metric
$$d((x_n), (y_n)) = \sum_{n=1}^\infty 2^{-n} \min\{1, |x_n - y_n|\}.$$
Then we can show that a sequence of sequences $(x^{(n)}_k)$ (where each $n$ gives a sequence) converges to a sequence $(x_k)$ if and only if for each $i$, $x^{(n)}_i \to x_i$.
In fact, if we fix a set $S$, is there always a metric $d$ on $\mathbb{R}^S$ such that $f_n \to f$ under the metric $d$ if $f_n \to f$ pointwise on $S$? The answer is no, but we will need topological tools to show it.
6. Consider $C[0, 1]$ under the uniform metric. The sequence defined by $f_n(x) = x^n$ does not converge, but suppose we equip $C[0, 1]$ with a different metric, for example the $L^1$ metric, that is,
$$d(f, g) = \int_0^1 |f(x) - g(x)|\,dx.$$
In this case, the sequence $f_n$ does converge to $0$.
7. Let $(M, d), (M', d')$ be two metric spaces, and consider the metric product space $M \oplus_p M' = (M \times M', d_p)$. Then $(x_n, y_n) \to (x, y)$ if and only if $x_n \to x, y_n \to y$.
8. Consider the metric subspace $N \subset M$. If a sequence $x_n$ converges to $x$ in $N$, then $x_n \to x$ in $M$. The converse is not true since we can take $N = M \setminus \{x\}$ if $x_n \to x$.

$^1$ When $d$ is understood, we do not usually state our metric $d$ explicitly.
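Item 6 can be illustrated numerically: the distance from $x^n$ to the zero function stays at $1$ in the uniform metric but equals $\int_0^1 x^n\,dx = 1/(n + 1)$ in the $L^1$ metric. The sketch below approximates both distances on a grid; the helper names, the grid size and the use of the midpoint rule are assumptions for illustration only.

```python
def l1_distance(f, g, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the L^1 distance on [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f((k + 0.5) * h) - g((k + 0.5) * h)) for k in range(steps))

def uniform_distance(f, g, steps: int = 10_000) -> float:
    """Grid approximation of sup |f - g| on [0, 1]; the grid contains both endpoints."""
    return max(abs(f(k / steps) - g(k / steps)) for k in range(steps + 1))

def zero(x: float) -> float:
    return 0.0

for n in (1, 5, 25, 125):
    f_n = lambda x, n=n: x ** n
    print(n, l1_distance(f_n, zero), 1 / (n + 1), uniform_distance(f_n, zero))
```

The $L^1$ column shrinks like $1/(n + 1)$ while the uniform column stays at $1$, so the same sequence converges to $0$ in one metric and not in the other.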

Definition 3.6. Let $(M, d), (M', d')$ be two metric spaces and $f : M \to M'$ be a function. We say $f$ is continuous at $a \in M$ if
$$\forall \varepsilon > 0, \exists \delta > 0, \forall b \in M,\ d(a, b) < \delta \implies d'(f(a), f(b)) < \varepsilon.$$
If $f$ is continuous at every $a \in N \subseteq M$, then we say $f$ is continuous on $N$.

Note that if $f$ is continuous on $M$, it is continuous on any $N \subseteq M$. The converse, however, is not true.

Example 3.4. We can take both metric spaces to be $\mathbb{R}$, then consider the function
$$f(x) = \begin{cases} 1, & \text{if } x \ne 0 \\ 0, & \text{otherwise.} \end{cases}$$
Then $f$ is continuous on $N = \mathbb{R} \setminus \{0\}$ but not on $M = \mathbb{R}$.

Proposition 3.2. $f$ is continuous at $a$ if and only if for any sequence $x_n \to a$, we have $f(x_n) \to f(a)$.

Proof. If $f$ is continuous at $a$, then $\forall \varepsilon > 0$, we can find some $\delta > 0$ such that $d(a, b) < \delta \implies d'(f(a), f(b)) < \varepsilon$. Now, given any $x_n \to a$, we can find $N \in \mathbb{N}$ with $\forall n > N, d(a, x_n) < \delta$, and with the same $N$, $\forall n > N$, we have $d'(f(a), f(x_n)) < \varepsilon$ by the above. So $f(x_n) \to f(a)$.
Conversely, suppose $x_n \to a \implies f(x_n) \to f(a)$ but $f$ is not continuous at $a$. Then we can find $\varepsilon > 0$ such that $\forall \delta > 0$, there is some $x \in M$ such that $d(x, a) < \delta$ but $d'(f(x), f(a)) \ge \varepsilon$. We may set $\delta_n = 1/n$ and obtain the corresponding $x_n$. Now $x_n \to a$ but $f(x_n) \not\to f(a)$. This is a contradiction.

The following two corollaries are then obvious.

Corollary 3.3. Let $f$ and $g$ be continuous scalar functions, then $f + g$, $f \times g$ and $f/g$ (provided that $\forall x, g(x) \ne 0$) are all continuous.

Corollary 3.4. If $f : M \to M'$, $g : M' \to M''$ are both continuous, then $g \circ f$ is continuous.

One can also prove them using $\varepsilon$-$\delta$, which is not hard either.

Example 3.5. 1. Constant, identity (equipping the same metric) and inclusion (in the sense of metric subspace) functions are continuous.
2. Real and complex polynomials are continuous.
3. The metric function itself is continuous (in fact Lipschitz) with respect to the $d_p$ metric on $M \times M$.

Definition 3.7. A function $f : (M, d) \to (M', d')$ is Lipschitz continuous if there is some $C \ge 0$ such that
$$\forall x, y \in M,\ d'(f(x), f(y)) \le C d(x, y).$$
We sometimes say that $f$ is $C$-Lipschitz.

Proposition 3.5. A Lipschitz function is uniformly continuous.

Proof. Trivial.

Definition 3.8. A map $g : (N, d) \to (N', d')$ is isometric if
$$\forall x, y \in N,\ d'(g(x), g(y)) = d(x, y).$$
Note that an isometric function is 1-Lipschitz. It is also injective.

We continue with examples.

Example 3.6. 4. Let $M, M'$ be metric spaces. Fixing $y \in M'$, the map $f : M \to M \oplus_p M'$ given by $f(x) = (x, y)$ is isometric, hence also (1-)Lipschitz.
5. Let $(M, d), (M', d')$ be metric spaces, and let $q : M \oplus_p M' \to M$, $q' : M \oplus_p M' \to M'$ be the projection functions. Both of these functions are 1-Lipschitz. We can easily extend this to a finite product of metric spaces.

Now we go on to talk about the topology of metric spaces. We start with two observations. Firstly, in a product metric space $M \oplus_p M'$, convergence does not depend on the value of $p$. Secondly, continuity depends only on the convergent sequences.

Definition 3.9. Fix a metric space $(M, d)$. For $x \in M$ and $r \ge 0$, the open ball $D_r(x)$ is the set $\{y \in M : d(x, y) < r\}$.

So $x_n \to x$ if and only if $\forall \varepsilon > 0, \exists N \in \mathbb{N}, n > N \implies x_n \in D_\varepsilon(x)$. And $f : M \to M'$ is continuous at $a \in M$ if and only if $\forall \varepsilon > 0, \exists \delta > 0, \forall x \in M, x \in D_\delta(a) \implies f(x) \in D_\varepsilon(f(a))$.

Definition 3.10. On $(M, d)$, for $x \in M$ and $r \ge 0$, the closed ball $B_r(x)$ is the set $\{y \in M : d(x, y) \le r\}$.

Example 3.7. 1. When $M$ is the real numbers, an open ball is an open interval and a closed ball is a closed interval.
2. In $\mathbb{R}^2$, $B_1(0, 0)$ is the unit disk with boundary under $d_2$, a slanted square under $d_1$, and a larger axis-aligned square under $d_\infty$.
3. If $M$ is discrete, $D_1(x) = \{x\}$, $B_1(x) = M$.

Note that $B_s(x) \subset D_r(x) \subset B_r(x)$ for any $s < r$.

Definition 3.11. A subset $U \subset M$ with $x \in U$ is called a neighbourhood of $x$ (in $M$) if there exists some $r > 0$ with $D_r(x) \subset U$.

It does not matter if we take the closed ball instead.

Definition 3.12. Given $U \subset M$, we say $U$ is open if $\forall x \in U, \exists r > 0, D_r(x) \subset U$.

So $U$ is open if and only if $U$ is a neighbourhood of $x$ for any $x \in U$.

Lemma 3.6. Open balls are open.

Proof. Immediate from the definition, but let us write the proof anyway.
Consider $D_r(x)$. For any $y \in D_r(x)$, since $d(x, y) < r$, if $z \in D_{r - d(x, y)}(y)$, then $d(x, z) \le d(y, z) + d(x, y) < r$. Hence $D_{r - d(x, y)}(y) \subset D_r(x)$.

Proposition 3.7. In a metric space $M$, the following are equivalent:
1. $x_n \to x$.
2. For any neighbourhood $U$ of $x$, there is some $N \in \mathbb{N}$ such that $\forall n > N, x_n \in U$.
3. For any open set $U$ containing $x$, there is some $N \in \mathbb{N}$ such that $\forall n > N, x_n \in U$.

Proof. $1 \implies 2$: $\exists r > 0, D_r(x) \subset U$, so we can choose an $N \in \mathbb{N}$ with $\forall n > N, d(x_n, x) < r \implies x_n \in U$.
$2 \implies 3$: Immediate, since an open set is a neighbourhood of each of its points.
$3 \implies 1$: Given $\varepsilon > 0$, take $U = D_\varepsilon(x)$, which is open by the preceding lemma; then the two statements become identical.

Proposition 3.8. Given a function $f : M \to M'$:
(A) For $a \in M$, the following are equivalent:
1. $f$ is continuous at $a$.
2. For any neighbourhood $V$ of $f(a)$, there is a neighbourhood $U$ of $a$ such that $f(U) \subset V$.
3. For any neighbourhood $V$ of $f(a)$, $f^{-1}(V)$ is a neighbourhood of $a$.
(B) The following are equivalent:
1. $f$ is continuous.
2. The pre-image of any open set is open.

Proof. Part (A):
$1 \implies 2$: Given any neighbourhood $V$ of $f(a)$, there is some $r$ such that $D_r(f(a)) \subset V$. Since $f$ is continuous at $a$, there is $\delta > 0$ with $f(D_\delta(a)) \subset D_r(f(a)) \subset V$. So $U = D_\delta(a)$ works.
$2 \implies 3$: Trivial, since there is some neighbourhood $U$ of $a$ with $f(U) \subset V \implies U \subset f^{-1}(V)$, so $f^{-1}(V)$ is a neighbourhood of $a$.
$3 \implies 1$: Given $\varepsilon > 0$, $f^{-1}(D_\varepsilon(f(a)))$ contains some open ball $D_\delta(a)$ for some $\delta > 0$, so it is done.
Part (B):
$1 \implies 2$: Given $V$ open in $M'$, for $x \in f^{-1}(V)$, we have $f(x) \in V$, so $V$ is a neighbourhood of $f(x)$. Since $f$ is continuous, by (A), $f^{-1}(V)$ is a neighbourhood of $x$. As $x$ was arbitrary, $f^{-1}(V)$ is open.
$2 \implies 1$: We shall show that $f$ is continuous at every point. Given $\varepsilon > 0, a \in M$, the ball $D_\varepsilon(f(a))$ is open in $M'$, so $f^{-1}(D_\varepsilon(f(a)))$ is open, so there is some $\delta$ with $D_\delta(a) \subset f^{-1}(D_\varepsilon(f(a)))$, so we are done.

Definition 3.13. The topology of a metric space is the collection of open subsets of it.

Proposition 3.9. In a metric space $M$, we have the following:
1. $\emptyset, M$ are open.
2. If $\{U_i\}_{i \in I}$ are open, then $\bigcup_{i \in I} U_i$ is open.
3. If $U, V \subset M$ are open, then $U \cap V$ is open.

Proof. 1 is trivial.
For 2, given $x$ in the union, there is a $j \in I$ such that $x \in U_j$, but then there is some open ball $U \ni x$ such that $U \subset U_j$, so $U$ is a subset of the union. So this union is open.
Regarding 3, given $x \in U \cap V$, there are $\varepsilon_U, \varepsilon_V > 0$ such that $D_{\varepsilon_U}(x) \subset U, D_{\varepsilon_V}(x) \subset V$, so the ball $D_{\min\{\varepsilon_U, \varepsilon_V\}}(x) \subset U \cap V$.

Definition 3.14. A subset $A \subset M$ is closed if whenever $x_n \to x$ in $M$ for some sequence $(x_n)$ in $A$, then $x \in A$.

Example 3.8. 1. Closed balls are closed.
2. So in $\mathbb{R}$, any closed interval is closed. Also $\mathbb{R}$ itself is both open and closed. $[0, 1)$ is neither open nor closed.

Lemma 3.10. A subset $A \subset M$ is closed if and only if $M \setminus A$ is open.

Proof. Suppose $A$ is closed but $M \setminus A$ is not open. Then there is some $x \in M \setminus A$ such that $D_r(x)$ is not contained in $M \setminus A$ for all $r > 0$. Hence for each $\varepsilon > 0$, $\exists x_\varepsilon \in A$ with $x_\varepsilon \in D_\varepsilon(x)$. Taking $a_n = x_{1/n}$ gives a sequence in $A$ converging to $x \notin A$, a contradiction.
Suppose $A$ is not closed but $M \setminus A$ is open. Then we can find a sequence $(a_n)$ in $A$ such that $a_n \to a \notin A$. Since $M \setminus A$ is open, there is some $\varepsilon > 0$ such that $D_\varepsilon(a) \subset M \setminus A$, but this is a contradiction since $a_n \notin M \setminus A$ for any $n$, yet $a_n$ gets $\varepsilon$-close to $a$.

Example 3.9. If $(N, d)$ is discrete, then every subset of $N$ is both open and closed.

Definition 3.15. Two metrics on a set are equivalent if they give the same topology.

Note that it is equivalent to say that the two metrics have the same convergent sequences, since these identify the closed sets. It also means that they have the same continuous functions, both to and from any other space. Note that two metrics $d, d'$ on $M$ induce the same topology if and only if the identity maps $(M, d) \to (M, d')$ and $(M, d') \to (M, d)$ are both continuous.

Definition 3.16. A map $g : M \to M'$ is called a homeomorphism if it is a bijection and both $g$ and $g^{-1}$ are continuous.
We say $g$ is an isometry if it is bijective and isometric.
We say $M$ and $M'$ are homeomorphic if there is a homeomorphism between them, and $M$, $M'$ are isometric if there is an isometry between them.

Remark. 1. A continuous bijection may not be a homeomorphism. Take the identity map $g : \mathbb{R} \to \mathbb{R}$ where the domain is equipped with the discrete metric and the codomain with the usual metric.
2. A surjective isometric function is an isometry.

Example 3.10. 1. $(0, 1)$ and $(0, \infty)$ are homeomorphic. Take $x \mapsto 1/x - 1$.
2. $\mathbb{R}^2$ and $\mathbb{C}$ are isometric.

Definition 3.17. Two metrics $d, d'$ on $M$ are uniformly equivalent if and only if both the identity functions $\mathrm{id} : (M, d) \to (M, d')$, $\mathrm{id} : (M, d') \to (M, d)$ are uniformly continuous.
We say $d, d'$ are Lipschitz equivalent if and only if both the identity functions are Lipschitz.

Example 3.11. 1. On $M \times M'$, $d_1, d_2, d_\infty$ are Lipschitz equivalent.
2. (non-example) On $C[0, 1]$ the uniform metric is not equivalent to the $L^1$ metric since they do not have the same convergent sequences.

4 Completeness

Definition 4.1. A sequence $(x_n)$ in a metric space $(M, d)$ is called Cauchy if $\forall \varepsilon > 0, \exists N \in \mathbb{N}, \forall n, m > N, d(x_n, x_m) < \varepsilon$.

Definition 4.2. A subset $A \subset M$, where $M$ is a metric space, is called bounded if there is some $r > 0$ and $z \in M$ such that $A \subset B_r(z)$.

Lemma 4.1. Convergent $\implies$ Cauchy $\implies$ Bounded.

Proof. Suppose $x_n \to x$. Then $\forall \varepsilon > 0$, we can find some $N \in \mathbb{N}$ such that $\forall n > N, d(x, x_n) < \varepsilon/2$, so $\forall n, m > N$,
$$d(x_m, x_n) \le d(x_m, x) + d(x_n, x) < 2\varepsilon/2 = \varepsilon.$$
Now assume that $(x_n)$ is Cauchy; we need $\{x_n : n \in \mathbb{N}\}$ to be contained in some ball. Taking $\varepsilon = 1$ in the Cauchy condition, there is some $N \in \mathbb{N}$ with $\forall n, m > N, d(x_n, x_m) < 1$, so $\{x_n : n \in \mathbb{N}, n > N\} \subset B_1(x_{N+1})$. Since $(x_n)_{n \le N}$ is finite, the ball $B_r(x_{N+1})$ with $r = \max\{1, d(x_{N+1}, x_i) : i \le N\}$ contains the whole sequence.

Remark. Bounded does not imply Cauchy and Cauchy does not imply Convergent.

Definition 4.3. A metric space $M$ is complete if every Cauchy sequence converges.

Proposition 4.2. If $M, M'$ are complete, so is $M \oplus_p M'$.

Proof. Let $(a_n)$ be Cauchy in the product, and write $a_i = (x_i, x_i')$. Then for all $m, n \in \mathbb{N}$, $\max\{d(x_m, x_n), d'(x_m', x_n')\} \le d_p(a_m, a_n)$, so $(x_n), (x_n')$ are both Cauchy. Since both $M, M'$ are complete, $\exists x \in M, x' \in M'$ with $x_n \to x, x_n' \to x'$. Hence $a_n \to (x, x')$.

Example 4.1. $\mathbb{R}^n, \mathbb{C}^n$ are complete (in the Euclidean metric) for any $n$.

There is another very important example:

Theorem 4.3. Let $S$ be a non-empty set, then $\ell_\infty S$ is complete under the uniform metric.

Proof. A Cauchy sequence $(f_n)$ in $\ell_\infty S$ is uniformly Cauchy, so Theorem 1.7 shows that it converges uniformly to some scalar function $f$ on $S$. To see that $f$ is bounded, choose $n \in \mathbb{N}$ such that $\sup |f_n - f| < 1$. Since there is some $C \ge 0$ such that $\sup |f_n| \le C$, we have $|f| \le |f - f_n| + |f_n| < C + 1$, so $f$ is bounded as well.

Proposition 4.4. Let $N \subset M$ be a metric subspace.
1. If $N$ is complete, then $N$ is closed in $M$.
2. If $N$ is closed and $M$ is complete, so is $N$.

So a metric subspace of a complete space is complete if and only if it is closed.

Proof. 1. If $N$ is complete, let $(x_n)$ be a sequence in $N$ such that $x_n \to x$ in $M$. This means that $(x_n)$ is Cauchy by Lemma 4.1, therefore it is convergent in $N$ due to completeness, hence $x \in N$ due to uniqueness of limits in a metric space.
2. Choose any Cauchy sequence $(x_n)$ in $N$. We know that $x_n \to x$ for some $x \in M$ due to completeness of $M$, but since $N$ is closed, $x \in N$ as well, so $N$ is complete.

Theorem 4.5. Let $M$ be a metric space, then the space $C_b(M)$ of bounded continuous scalar functions on $M$ is complete in the uniform metric.

Proof. $C_b(M)$ is a metric subspace of $\ell_\infty(M)$, which is complete. But a uniform limit of continuous functions is continuous, so $C_b(M)$ is closed.
To spell out the proof, suppose $f_n \to f$ in the uniform metric $D$ with each $f_n$ continuous. Fix $x \in M, \varepsilon > 0$. We can choose $N$ such that $D(f_n, f) < \varepsilon/3$ for all $n \ge N$. Fix any $n \ge N$; since $f_n$ is continuous, $\exists \delta > 0, d(x, y) < \delta \implies |f_n(x) - f_n(y)| < \varepsilon/3$. Hence $d(x, y) < \delta \implies |f(x) - f(y)| \le |f(x) - f_n(x)| + |f(y) - f_n(y)| + |f_n(x) - f_n(y)| < 3\varepsilon/3 = \varepsilon$.

Fix some $S \ne \emptyset$ and a metric space $(N, d')$. Let $\ell_\infty(S, N)$ be the space of bounded functions $S \to N$. Then we can define the uniform metric on $\ell_\infty(S, N)$ by $D(f, g) = \sup_{x \in S} d'(f(x), g(x))$.
Now given a metric space $(M, d)$, let $C_b(M, N)$ be the set of bounded continuous functions $M \to N$, then we have

Theorem 4.6. Let $S, M, N$ be as above and assume that $N$ is complete. Then $\ell_\infty(S, N)$ is complete under the uniform metric, and $C_b(M, N)$ is closed in $\ell_\infty(M, N)$, hence complete.

Proof. Analogous to the case where $N = \mathbb{R}$ or $\mathbb{C}$.

Example 4.2. 1. For any closed and bounded interval $[a, b] \subset \mathbb{R}$, the continuous functions on $[a, b]$ are exactly the continuous and bounded functions on $[a, b]$, so $C[a, b]$ is complete under the uniform metric.

Definition 4.4. A map $f : M \to M$ is a contraction mapping if $f$ is $L$-Lipschitz with $L < 1$.

Theorem 4.7 (Contraction Mapping Theorem, aka Banach Fixed Point Theorem). If $f$ is a contraction mapping on a nonempty complete metric space, then $f$ has a unique fixed point.

Note that it is important for the conditions listed to be satisfied.

Example 4.3. 1. If we remove the completeness criterion: $f : \mathbb{R} \setminus \{0\} \to \mathbb{R} \setminus \{0\}$ defined by $f(x) = x/2$ is a contraction but does not have a fixed point.
2. If we remove $L < 1$: $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = x + 1$ is 1-Lipschitz but does not have any fixed point.
3. $f(x) = x + 1/x$ on $[1, \infty)$.

Proof. Fix $x_0 \in M$, then define a sequence $x_n$ by $x_{n+1} = f(x_n)$, so $x_n = f^n(x_0)$. We shall show that this sequence is Cauchy. For $n \ge 2$, $d(x_n, x_{n-1}) \le L d(x_{n-1}, x_{n-2}) \le \dots \le L^{n-1} d(x_1, x_0)$ inductively. For $m > n$,
$$\begin{aligned} d(x_m, x_n) &\le d(x_n, x_{n+1}) + \dots + d(x_{m-1}, x_m) \\ &\le (L^{m-1} + L^{m-2} + \dots + L^n) d(x_1, x_0) \\ &\le \frac{L^n}{1 - L} d(x_1, x_0). \end{aligned}$$
The last term, which only depends on the smaller index $n$, can be as small as we want when $n$ is large enough, so the sequence is Cauchy.
Hence there is a limit $x$ of the sequence $x_n$ since $M$ is complete. Since $f$ is continuous, $f(x_n) \to f(x)$, but $f(x_n) = x_{n+1}$, so by uniqueness of limits, $f(x) = x$.
Suppose $f(x) = x$ and $f(y) = y$. If $x \ne y$, then $d(x, y) = d(f(x), f(y)) \le L d(x, y) < d(x, y)$, which is a contradiction.
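The iteration in the proof is directly computable. As a minimal sketch, take $f = \cos$ on $[0, 1]$, an example chosen here and not taken from the lectures: $\cos$ maps $[0, 1]$ into itself and, by the mean value theorem, is $L$-Lipschitz there with $L = \sin 1 < 1$. Letting $m \to \infty$ in the estimate above gives the a priori bound $d(x_n, x) \le \frac{L^n}{1 - L} d(x_1, x_0)$, which the code compares with the observed error. The helper name `iterate` is hypothetical.

```python
import math

def iterate(f, x0: float, n: int) -> list[float]:
    """Return [x_0, x_1, ..., x_n] with x_{k+1} = f(x_k)."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs

L = math.sin(1.0)                     # Lipschitz constant of cos on [0, 1]
xs = iterate(math.cos, 0.0, 100)
fixed = xs[-1]                        # numerically converged fixed point of cos
first_step = abs(xs[1] - xs[0])

for n in (5, 10, 20, 40):
    error = abs(xs[n] - fixed)
    bound = L ** n / (1 - L) * first_step
    print(n, error, bound)
```

The observed error decays geometrically and always sits below the a priori bound, which is what makes the theorem usable for numerical approximation.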

Note that xn → x exponentially fast, so it can also be applied to numericalanalysis to find an approximated solution of the fixed point.An application of the contraction mapping theorem is to analyze the existenceand uniqueness of the solution of an initial value problem.

Example 4.4. The IVP f ′(t) = f(t2), f(0) = y0 on C[0, 1/2] is what we areinterested in. Assume that f has a solution, then immediately f is continuouslydifferentiable. By FTC,

f(t) = f(0) +

∫ t

0

f(x2) dx

Let M = C[0, 1/2], which is nonempty and complete, then consider the mapping

T : M → M defined by (Tg)(t) = y0 +∫ t0g(x2) dx T is trivially well-defined

since x ∈ [0, 1/2] =⇒ x2 ∈ [0, 1/4] ⊂ [0, 1/2] and that g(x2) is continuous in x.Also by FTC, (Tg)′ = g, so Tg is continuously differentiable hence continuous.Now f solves the IVP iff f is a fixed point of T . Also we can check that T is acontraction. Indeed, take g, h ∈M, then

\[ |Tg(t) - Th(t)| = \left| \int_0^t \big( g(x^2) - h(x^2) \big)\, dx \right| \le \int_0^t |g(x^2) - h(x^2)|\, dx \le t\, D(g, h) \le D(g, h)/2. \]

So it is a contraction mapping, hence by Theorem 4.7 a unique fixed point exists.
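The operator T from this example maps polynomials to polynomials, so its iterates can be computed exactly. The sketch below is an illustration under assumptions of our own: polynomials are stored as dictionaries from exponent to coefficient, the initial value is taken to be $y_0 = 1$, and the uniform distance is approximated by sampling a grid on [0, 1/2]. The names `apply_T`, `evaluate` and `sup_distance` are hypothetical helpers.

```python
from fractions import Fraction

def apply_T(coeffs, y0):
    """Apply (Tg)(t) = y0 + integral_0^t g(x^2) dx to a polynomial g.

    coeffs maps an exponent k to the coefficient a_k of t^k. Since
    g(x^2) = sum a_k x^(2k), the integral is sum a_k t^(2k+1) / (2k+1).
    """
    result = {0: Fraction(y0)}
    for k, a in coeffs.items():
        result[2 * k + 1] = result.get(2 * k + 1, Fraction(0)) + Fraction(a) / (2 * k + 1)
    return result

def evaluate(coeffs, t):
    """Evaluate the polynomial at t using exact rational arithmetic."""
    return sum(a * Fraction(t) ** k for k, a in coeffs.items())

def sup_distance(p, q, samples=51):
    """Approximate the uniform distance on [0, 1/2] by sampling an even grid."""
    grid = [Fraction(i, 2 * (samples - 1)) for i in range(samples)]
    return max(abs(evaluate(p, t) - evaluate(q, t)) for t in grid)

y0 = 1  # assumed initial value, for illustration only
iterates = [{0: Fraction(y0)}]          # start from the constant function y0
for _ in range(4):
    iterates.append(apply_T(iterates[-1], y0))

# Distances between successive iterates; the contraction estimate says each
# is at most half of the previous one.
gaps = [sup_distance(iterates[n], iterates[n + 1]) for n in range(4)]
```

With $y_0 = 1$ the first iterates are 1, 1 + t, 1 + t + t³/3 and 1 + t + t³/3 + t⁷/21, and the successive gaps shrink much faster than the guaranteed factor 1/2.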

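Theorem 4.8 (Lindelöf-Picard Theorem). Let a < b and R > 0 be real numbers and $y_0 \in \mathbb{R}^n$. Suppose $\varphi : [a, b] \times B_R(y_0) \to \mathbb{R}^n$ is continuous and there is some K > 0 such that $\|\varphi(t, x) - \varphi(t, y)\| \le K\|x - y\|$ for all $x, y \in B_R(y_0)$ and all t ∈ [a, b]. Then there is ε > 0 such that for every $t_0 \in [a, b]$, the IVP

\[ f'(t) = \varphi(t, f(t)), \qquad f(t_0) = y_0 \]

has a unique solution in $[t_0 - \varepsilon, t_0 + \varepsilon] \cap [a, b]$.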

Proof. Since $[a, b] \times B_R(y_0)$ is closed and bounded and φ is continuous, ‖φ‖ is bounded above by some C > 0. Let ε = min{R/C, 1/(2K)}. We want to solve the IVP on $[c, d] = [t_0 - \varepsilon, t_0 + \varepsilon] \cap [a, b]$.
The metric space $M = C([c, d], B_R(y_0))$ is nonempty and complete, so a natural idea is to reduce the IVP to a fixed-point problem to which we can apply Theorem 4.7.
Consider the mapping T : M → M given by

\[ (Tg)(t) = y_0 + \int_{t_0}^{t} \varphi(x, g(x))\, dx. \]


By FTC, Tg is $C^1$ (with $(Tg)'(t) = \varphi(t, g(t))$), in particular continuous, whenever g is continuous. In addition, Tg takes values in $B_R(y_0)$ since

\[ \|(Tg)(t) - y_0\| = \left\| \int_{t_0}^{t} \varphi(x, g(x))\, dx \right\| \le \left| \int_{t_0}^{t} \|\varphi(x, g(x))\|\, dx \right| \le \varepsilon C \le R. \]

So indeed T is a well-defined function M → M. It is also a contraction mapping: for g, h ∈ M,

\[ \|Tg(t) - Th(t)\| = \left\| \int_{t_0}^{t} \big( \varphi(s, g(s)) - \varphi(s, h(s)) \big)\, ds \right\| \le \left| \int_{t_0}^{t} \|\varphi(s, g(s)) - \varphi(s, h(s))\|\, ds \right| \le \varepsilon K D(g, h) \le D(g, h)/2 \]

for any t ∈ [c, d], by the Lipschitz condition we assumed.
Note that g ∈ M solves the IVP iff it is a fixed point of T, so we are done by Theorem 4.7.
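The fixed-point formulation in this proof also suggests a numerical scheme, often called Picard iteration. The following is only a rough sketch under assumptions of our own: the integral is replaced by a cumulative trapezoidal rule on a grid, and the test problem f′ = f, f(0) = 1 on [0, 1/2] (where K = 1, so ε = 1/(2K) = 1/2) is chosen because its exact solution exp(t) is known. The helper name `picard_step` is hypothetical, and the restriction to the ball $B_R(y_0)$ is ignored since this φ is globally Lipschitz.

```python
import math

def picard_step(phi, y0, ts, g):
    """One application of (Tg)(t) = y0 + integral_{t0}^{t} phi(s, g(s)) ds.

    ts is an increasing grid with ts[0] = t0 and g holds the values g(ts[i]);
    the integral is approximated by the cumulative trapezoidal rule.
    """
    values = [phi(t, y) for t, y in zip(ts, g)]
    out = [y0]
    for i in range(1, len(ts)):
        out.append(out[-1] + 0.5 * (ts[i] - ts[i - 1]) * (values[i - 1] + values[i]))
    return out

# Assumed test problem: f'(t) = f(t), f(0) = 1 on [0, 1/2].
n = 200
ts = [0.5 * i / n for i in range(n + 1)]
g = [1.0] * (n + 1)                       # start from the constant function y0
gaps = []
for _ in range(20):
    g_next = picard_step(lambda t, y: y, 1.0, ts, g)
    gaps.append(max(abs(u - v) for u, v in zip(g, g_next)))
    g = g_next

# Distance to the exact solution; what remains is the quadrature error.
error = max(abs(v - math.exp(t)) for t, v in zip(ts, g))
```

The successive gaps decay at least geometrically with ratio 1/2, as the contraction estimate predicts; the remaining error comes from the trapezoidal rule, not from the iteration.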

Remark. 1. In general, however, one cannot extend the solution guaranteed above to a global solution. But in our previous example we can extend the solution to [0, 1).
2. Also, we can apply the theorem to solve higher order equations by considering the vector of derivatives.
3. If $f : [a, b] \to \mathbb{R}^n$ is written as $(f_1, f_2, \dots, f_n)$, then $f' = (f_1', f_2', \dots, f_n')$, assuming each component is differentiable. Similarly, the integral of a vector-valued function is the vector of the integrals of the components, given that they exist. So we can do everything by components.
4. The reason that the integral of the norm is at least the norm of the integral is Cauchy-Schwarz.
5. We can show that a continuous function on a closed bounded set in $\mathbb{R}^n$ is bounded by Bolzano-Weierstrass.

5 Topological Spaces

Definition 5.1. Consider a set X. A topology τ on X is a collection of subsets of X such that the following axioms hold:
1. ∅, X ∈ τ.
2. If $U_i \in \tau$ for all i ∈ I, then $\bigcup_{i \in I} U_i \in \tau$.
3. U, V ∈ τ =⇒ U ∩ V ∈ τ.
A topological space is a pair (X, τ) where X is a set and τ is a topology on X.

Note that the third axiom extends, by induction, to any finite family of elements of τ.
Members of τ are called open sets of X.
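On a finite set the axioms can be checked mechanically. The following sketch is an illustration only: the function name `is_topology` and the sample families are assumptions of ours. It relies on the observation that for a finite family, closure under pairwise unions and intersections already gives closure under arbitrary unions and finite intersections.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the topology axioms for a family tau of subsets of a finite set X."""
    X = frozenset(X)
    tau = {frozenset(U) for U in tau}
    # Axiom 1: the empty set and X belong to tau.
    if frozenset() not in tau or X not in tau:
        return False
    # Every member must be a subset of X.
    if any(not U <= X for U in tau):
        return False
    # Axioms 2 and 3: pairwise unions and intersections stay in tau.
    return all(U | V in tau and U & V in tau for U, V in combinations(tau, 2))

X = {1, 2, 3}
indiscrete = [set(), X]
chain = [set(), {1}, {1, 2}, X]
missing_union = [set(), {1}, {2}, X]    # {1} | {2} = {1, 2} is not in the family
```

Here `is_topology(X, indiscrete)` and `is_topology(X, chain)` hold, while `missing_union` fails the union axiom.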

Example 5.1. On any metric space, the metric induces the metric topology by Proposition 3.9. For example, the Euclidean distance on $\mathbb{R}^n$ induces the usual topology on $\mathbb{R}^n$.


Definition 5.2. A topological space X (or the topology of X) is called metrizable if it can be induced by some metric on X.

In the case where the topology is metrizable, any other metric that is equivalent to the inducing metric gives the same topology.

Example 5.2. The indiscrete topology on a set X is {∅, X}.

Definition 5.3. Given topologies $\tau_1, \tau_2$ on X, we say $\tau_1$ is coarser than $\tau_2$, or $\tau_2$ is finer than $\tau_1$, if $\tau_1 \subset \tau_2$.

We know that the indiscrete topology is coarser than any topology on X. It is then immediate that if |X| ≥ 2, then the indiscrete topology is not metrizable. Indeed, suppose d is a metric on X and x, y ∈ X are distinct. The open ball $D_{d(x,y)}(x)$ contains x but not y and is open in the metric topology of d, so it is neither ∅ nor X. Hence d cannot induce the indiscrete topology on X.

Example 5.3. The discrete topology on a set X is $\tau = 2^X$, the collection of all subsets of X. This is metrizable. Indeed, it is induced by the discrete metric. It is also the finest topology on X.

Example 5.4. The cofinite topology on a set X consists of the empty set together with all subsets of X whose complement is finite. When X is finite, this topology is just the discrete topology. When X is infinite, it is not metrizable. Indeed, fix x ≠ y in X. Whenever U, V are open sets with x ∈ U and y ∈ V, the set X \ (U ∩ V) = (X \ U) ∪ (X \ V) is finite, thus U ∩ V is not empty. This means that the space is not Hausdorff (defined below), but every metric space is Hausdorff (also below), so the cofinite topology on an infinite set is not metrizable.

Definition 5.4. A topological space X is called Hausdorff if for any two distinct elements x, y in X, there are open sets U, V such that x ∈ U, y ∈ V and U ∩ V = ∅.

Proposition 5.1. Any metric space is Hausdorff.

Proof. Given distinct x, y, consider $U = D_{d(x,y)/2}(x)$ and $V = D_{d(x,y)/2}(y)$. They are open, contain x, y respectively, and have empty intersection by the triangle inequality.

Definition 5.5. A subset of X is called closed if its complement is open.

This coincides with the definition of closed sets in a metric space by Lemma 3.10.

Proposition 5.2. 1. ∅, X are closed.
2. If $A_i$ is closed for all i ∈ I, then $\bigcap_{i \in I} A_i$ is closed.
3. If A, B are closed, then A ∪ B is closed.

Proof. Trivial.

Again the last one generalizes to any finite union.

Example 5.5. In the cofinite topology, a subset is closed if and only if it is finite or the whole of X.

Definition 5.6. Let X be a topological space, x ∈ X and U ⊂ X. We say U is a neighbourhood of x if there is an open V ⊂ X such that x ∈ V ⊂ U.


Note again that in a metric space, this reduces to our previous definition. The proof of this is trivial.

Proposition 5.3. Let U ⊂ X, then U is open if and only if every x ∈ U has a neighbourhood contained in U.

Proof. Completely trivial.

Definition 5.7. A sequence $(x_n)$ in X converges to some x ∈ X, written $x_n \to x$, if for any neighbourhood V of x, ∃N ∈ N, ∀n > N, $x_n \in V$.

Once again, this coincides with the previous definition in metric spaces by Proposition 3.7.

Example 5.6. In an indiscrete space, every sequence converges to every point.

Theorem 5.4. In a Hausdorff space, limits are unique.

Proof. If $x_n \to x$ and $x_n \to y$ but x ≠ y, then there are disjoint open sets U, V containing x, y respectively. Then there is some $N_1 \in \mathbb{N}$ such that $x_n \in U$ for all $n > N_1$, and some $N_2 \in \mathbb{N}$ such that $x_n \in V$ for all $n > N_2$. But then for any $n > \max\{N_1, N_2\}$, $x_n \in U \cap V = \varnothing$, contradiction.

Remark. In a metric space, A is closed if and only if whenever a sequence $(x_n)$ in A converges in the metric space, its limit is in A. The =⇒ part is true in every topological space, but ⇐= can fail.

Definition 5.8. Let X be a topological space, A ⊂ X and x ∈ X. Then x is called an accumulation point (aka limit point/cluster point) of A if for any neighbourhood U of x, (A \ {x}) ∩ U ≠ ∅.
The derived set A′ of A is the set of all accumulation points of A.

Example 5.7. In R, suppose A = [0, 1) ∪ {2}, then A′ = [0, 1]. Also Q′ = R,and Z′ = ∅.

Proposition 5.5. Let X be a topological space and A ⊂ X, then A is closed if and only if A′ ⊂ A.

Proof. If A is closed, then U = X \ A is open, so for any x ∈ X \ A, U is a neighbourhood of x with U ∩ A = ∅. Hence no point outside A is an accumulation point of A, that is, A′ ⊂ A.
Conversely, suppose A′ ⊂ A. Given x ∈ X \ A, we have x ∉ A′, so there is a neighbourhood U of x with U ∩ A = ∅, so x ∈ U ⊂ X \ A. By Proposition 5.3, X \ A is open, hence A is closed.

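Definition 5.9. Let A be a subset of a topological space X. The interior of A, written int A or $A^\circ$, is defined by

\[ \operatorname{int} A = \bigcup_{U \subset A,\ U \text{ open}} U. \]

The closure of A, written cl A or $\bar{A}$, is defined by

\[ \operatorname{cl} A = \bigcap_{A \subset F,\ F \text{ closed}} F. \]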


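Note that $A^\circ \subset A \subset \bar{A}$, and $A^\circ = A$ if and only if A is open, $\bar{A} = A$ if and only if A is closed.

Proposition 5.6.

\[ A^\circ = \{x \in X : A \text{ is a neighbourhood of } x\}, \]

\[ \bar{A} = \{x \in X : U \cap A \neq \varnothing \text{ for every neighbourhood } U \subset X \text{ of } x\} = A \cup A'. \]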

Proof. Exercise.

Example 5.8. In R, cl([0, 1) ∪ {2}) = [0, 1] ∪ {2}, int([0, 1) ∪ {2}) = (0, 1), cl Q = R, int Q = ∅ = int Z, cl Z = Z.
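For a finite topological space, interior and closure can be computed directly from Definition 5.9. This is a small sketch for experimentation; the helper names `interior` and `closure` and the three-point example space are assumptions of ours, not part of the notes.

```python
def interior(A, tau):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    result = frozenset()
    for U in tau:
        U = frozenset(U)
        if U <= A:
            result |= U
    return result

def closure(A, X, tau):
    """Intersection of all closed sets (complements of open sets) containing A."""
    X, A = frozenset(X), frozenset(A)
    result = X
    for U in tau:
        F = X - frozenset(U)      # F is closed
        if A <= F:
            result &= F
    return result

# Assumed example: a three-point space whose open sets form a chain.
X = {1, 2, 3}
tau = [set(), {1}, {1, 2}, {1, 2, 3}]
```

In this space `closure({1}, X, tau)` is all of X, so the single point 1 is dense, while `interior({2, 3}, tau)` is empty.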

Remark. Convergent sequences determine the metric topology, since in a metric space $x \in \bar{A}$ ⇐⇒ there is a sequence $(x_n)$ in A with $x_n \to x$. Again the ⇐= direction holds in all topological spaces, but the =⇒ direction can fail.

Definition 5.10. Let X be a topological space and A ⊂ X. We say A is dense if $\bar{A} = X$.
We say X is separable if there is a countable dense set in X.

Definition 5.11. 1. $\mathbb{R}^n$ is separable since $\overline{\mathbb{Q}^n} = \mathbb{R}^n$.
2. (non-example) An uncountable set with the discrete topology is not separable.

As usual we can try to construct new spaces from old.

Definition 5.12. Let (X, τ) be a topological space and Y ⊂ X. The subspace (or relative) topology on Y is the collection {U ∩ Y : U ∈ τ}. This is also called the topology on Y induced by τ.

One can check that this is indeed a topology.

Example 5.9. Let X = R, Y = [0, 2], then U = (1, 2] is open in Y since U = Y ∩ (1, 3). Note that U is not open in X.

Remark. 1. If Z ⊂ Y ⊂ X, then the topology on Z induced by the topology on X coincides with the topology on Z induced by the subspace topology on Y. So a subspace of a subspace is a subspace.
2. If N ⊂ M where M is a metric space and N carries the restricted metric, then the metric topology on N coincides with the subspace topology induced by the metric topology on M.

Proposition 5.7. Let Y be a subspace of a topological space X.
1. A ⊂ Y is closed in Y if and only if there is a closed set B ⊂ X such that B ∩ Y = A.
2. For all A ⊂ Y, $\operatorname{cl}_Y A = Y \cap \operatorname{cl}_X A$, where $\operatorname{cl}_Y$ and $\operatorname{cl}_X$ denote closure in Y and in X respectively.

Remark. The analogue of 2 for interiors does not always hold. Take X = R and A = Y = {0}: the interior of A in Y is {0}, while the interior of A in X is empty.

Proof. Trivial.


Definition 5.13. A base for a topological space (X, τ) is a family $\mathcal{B} \subset \tau$ such that for every U ∈ τ there is $\mathcal{C} \subset \mathcal{B}$ with

\[ U = \bigcup_{B \in \mathcal{C}} B. \]

In other words, the topology τ consists exactly of the unions of subfamilies of $\mathcal{B}$. So a base determines the topology.

Example 5.10. 1. The set of all open intervals is a base of the usual topology on R. In general, the collection of all open balls in a metric space is a base for the metric topology on it.

However, what we want to do is not to construct $\mathcal{B}$ from τ, but the other way around.

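Lemma 5.8. Let X be a set and $\mathcal{B} \subset 2^X$. Assume that
1. $X = \bigcup_{B \in \mathcal{B}} B$.
2. $\forall B_1, B_2 \in \mathcal{B}$, $\forall x \in B_1 \cap B_2$, $\exists B \in \mathcal{B}$ with $x \in B \subset B_1 \cap B_2$.
Then there is a unique topology on X with base $\mathcal{B}$.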

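Proof. Any topology with base $\mathcal{B}$ must be

\[ \tau = \left\{ \bigcup_{B \in \mathcal{C}} B : \mathcal{C} \subset \mathcal{B} \right\}, \]

which gives uniqueness. It remains to check that τ is a topology on X. Indeed, ∅ ∈ τ (take $\mathcal{C} = \varnothing$), X ∈ τ by the first assumption, and τ is closed under arbitrary unions. For intersections, consider

\[ U_1 = \bigcup_{B \in \mathcal{C}_1} B, \qquad U_2 = \bigcup_{B \in \mathcal{C}_2} B. \]

Given $x \in U_1 \cap U_2$, there are $B_1 \in \mathcal{C}_1$, $B_2 \in \mathcal{C}_2$ with $x \in B_1 \cap B_2$, so by the second assumption there is some $B_x \in \mathcal{B}$ such that $x \in B_x \subset B_1 \cap B_2 \subset U_1 \cap U_2$. Thus

\[ U_1 \cap U_2 = \bigcup_{x \in U_1 \cap U_2} B_x \in \tau. \]

By definition $\mathcal{B}$ is a base for τ.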
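For a finite set, both the hypotheses of the lemma and the topology it produces can be computed by brute force. The sketch below is illustrative only; `satisfies_base_conditions` and `topology_from_base` are hypothetical helper names, and the enumeration of all subfamilies is exponential in the size of the base, so it is meant for tiny examples.

```python
from itertools import combinations

def satisfies_base_conditions(X, base):
    """Check the two hypotheses of the lemma for a finite family of subsets of X."""
    X = frozenset(X)
    base = [frozenset(B) for B in base]
    covers = frozenset().union(*base) == X
    refines = all(
        any(x in B and B <= B1 & B2 for B in base)
        for B1 in base for B2 in base for x in B1 & B2
    )
    return covers and refines

def topology_from_base(base):
    """All unions of subfamilies of the base (the empty subfamily gives the empty set)."""
    base = [frozenset(B) for B in base]
    tau = set()
    for r in range(len(base) + 1):
        for family in combinations(base, r):
            tau.add(frozenset().union(*family))
    return tau

X = {1, 2, 3}
good_base = [{1}, {2}, {1, 2, 3}]
bad_base = [{1, 2}, {2, 3}]     # 2 lies in both sets, but no member fits inside {2}
tau = topology_from_base(good_base)
```

For `good_base` the generated topology has five open sets: ∅, {1}, {2}, {1, 2} and X.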

Definition 5.14. A topological space is called second-countable if it has acountable base.

Example 5.11. The set of all open balls with rational radii and centres in $\mathbb{Q}^n$ is a countable base for $\mathbb{R}^n$. So $\mathbb{R}^n$ is second-countable.

Definition 5.15. A map f : (X, τ) → (Y, ρ) is continuous if V ∈ ρ =⇒ $f^{-1}(V) \in \tau$.

This extends our previous definition of continuity in metric spaces by Proposition 3.8.

Proposition 5.9. Let f : X → Y be a map between topological spaces, then
1. f is continuous if and only if the preimage of any closed set is closed.
2. If $\mathcal{B}$ is a base for Y, then f is continuous if and only if $f^{-1}(B)$ is open in X for all $B \in \mathcal{B}$.
3. A composition of continuous functions is continuous.


Proof. Trivial.

Example 5.12. Constant maps, identity maps and inclusions of subspaces are always continuous. Hence the restriction of a continuous map to a subspace is continuous.

Definition 5.16. Let f : X → Y be a map between topological spaces. We say f is a homeomorphism if f is a bijection and both f and $f^{-1}$ are continuous. We say X, Y are homeomorphic, or X ≅ Y, if there is a homeomorphism between them.

Definition 5.17. f is an open map if for every U open in X, f(U) is open in Y. So f is a homeomorphism if and only if f is a continuous open bijection.

Definition 5.18. A property P of topological spaces is called a topologicalproperty if it is preserved under homeomorphisms.

Definition 5.19. Let (X, τ), (Y, ρ) be topological spaces and let

\[ \mathcal{B} = \{U \times V : U \in \tau,\ V \in \rho\}. \]

Then $X \times Y \in \mathcal{B}$ and $(U_1 \times V_1) \cap (U_2 \times V_2) = (U_1 \cap U_2) \times (V_1 \cap V_2) \in \mathcal{B}$. Thus, by Lemma 5.8, there is a unique topology on X × Y with base $\mathcal{B}$. This is called the product topology.

So a set W in the product space is open if and only if ∀(x, y) ∈ W, ∃U ∈ τ, V ∈ ρ, (x, y) ∈ U × V ⊂ W.

Example 5.13. $\mathbb{R}^2$ in the usual topology is homeomorphic to R × R in the product topology. In general, the topology induced by the (p-)product metric is the product topology of the metric topologies. So products of metrizable topologies are metrizable.

Proposition 5.10. Let $\pi_X : X \times Y \to X$ and $\pi_Y : X \times Y \to Y$ be the projections. Then $\pi_X, \pi_Y$ are continuous, and if Z is a topological space, then a map f : Z → X × Y is continuous if and only if $\pi_X \circ f$ and $\pi_Y \circ f$ are both continuous.

Note that f(z) = (πX ◦ f(z), πY ◦ f(z)).

Proof. Given an open set U ⊂ X, $\pi_X^{-1}(U) = U \times Y$, which is open in X × Y, so $\pi_X$ is continuous. Similarly, $\pi_Y$ is continuous.
If f is continuous, then both $\pi_X \circ f$ and $\pi_Y \circ f$ are continuous, since a composition of continuous functions is continuous. Conversely, if both $\pi_X \circ f$ and $\pi_Y \circ f$ are continuous, then by Proposition 5.9 it is enough to check that any member U × V of the base has an open preimage. Indeed,

\[ f^{-1}(U \times V) = f^{-1}(U \times Y) \cap f^{-1}(X \times V) = f^{-1}(\pi_X^{-1}(U)) \cap f^{-1}(\pi_Y^{-1}(V)) = (\pi_X \circ f)^{-1}(U) \cap (\pi_Y \circ f)^{-1}(V), \]

which is open by assumption.

It is trivial to extend all of the above to finite products. It is interesting to know that (X × Y) × Z ≅ X × (Y × Z), and X × Y ≅ Y × X. Now we turn to the quotient topology.


Definition 5.20. Start with a topological space (X, τ) and let R be an equivalence relation on X. We let X/R be the set of equivalence classes (the “quotient set”). Let q : X → X/R be the quotient map sending x ↦ [x], where [x] = {y ∈ X : yRx} is the equivalence class containing x. The quotient topology on X/R is the family

\[ \tau_R = \{V \subset X/R : q^{-1}(V) \in \tau\}. \]

Proposition 5.11. The quotient topology is indeed a topology.

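Proof. $q^{-1}(X/R) = X$ and $q^{-1}(\varnothing) = \varnothing$, so $\varnothing, X/R \in \tau_R$. For any family $(V_i)_{i \in I}$ in $\tau_R$,

\[ q^{-1}\Big( \bigcup_{i \in I} V_i \Big) = \bigcup_{i \in I} q^{-1}(V_i), \]

which is open. For $U, V \in \tau_R$,

\[ q^{-1}(U \cap V) = q^{-1}(U) \cap q^{-1}(V), \]

which is also open.

Remark. 1. Note that q is surjective, and continuous when X/R carries $\tau_R$.
2. For x ∈ X and t ∈ X/R, x ∈ t ⇐⇒ q(x) = t, hence for all V ⊂ X/R,

\[ q^{-1}(V) = \{x \in X : q(x) \in V\} = \{x \in X : \exists t \in V,\ q(x) = t\} = \bigcup_{t \in V} t. \]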
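The description of $q^{-1}(V)$ as the union of the classes in V makes the quotient topology easy to compute on a finite space. The sketch below is an illustration under our own assumptions: equivalence classes are passed in explicitly as a list of sets, and `quotient_topology` is a hypothetical helper name.

```python
from itertools import combinations

def quotient_topology(tau, classes):
    """Sets V of equivalence classes whose preimage under q is open.

    tau is the topology on X and classes is the partition of X into
    equivalence classes; q^{-1}(V) is the union of the classes in V.
    """
    tau = {frozenset(U) for U in tau}
    classes = [frozenset(c) for c in classes]
    result = set()
    for r in range(len(classes) + 1):
        for V in combinations(classes, r):
            preimage = frozenset().union(*V)
            if preimage in tau:
                result.add(frozenset(V))
    return result

# Assumed example: a four-point space with a chain of open sets,
# identifying 1 with 2 and 3 with 4.
tau = [set(), {1, 2}, {1, 2, 3}, {1, 2, 3, 4}]
c1, c2 = frozenset({1, 2}), frozenset({3, 4})
tau_R = quotient_topology(tau, [c1, c2])
```

Here the quotient has two points and three open sets: ∅, {c1} and {c1, c2}; the set {c2} is not open because its preimage {3, 4} is not.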

Example 5.14. Q ≤ R as (additive) groups, so the cosets of Q give an equivalence relation on R, and we can form the quotient topology on R/Q. One quickly finds that this is the indiscrete topology, which is not metrizable; this is why we do not take quotients of metric spaces.

Consider the quotient map q : X → X/R and any map f : X → Y such that xRy =⇒ f(x) = f(y). Then there is a map $\tilde{f} : X/R \to Y$ such that $\tilde{f} \circ q = f$. That is, the triangle formed by f : X → Y, q : X → X/R and $\tilde{f} : X/R \to Y$ commutes.

If f is surjective, so is $\tilde{f}$. Also, if f(x) = f(y) ⇐⇒ xRy, then $\tilde{f}$ is injective.

Proposition 5.12. Let X, Y be topological spaces, R an equivalence relation on X, q : X → X/R the quotient map, and f : X → Y some map with xRy =⇒ f(x) = f(y). Let $\tilde{f}$ be as above. Then
1. If f is continuous, so is $\tilde{f}$.
2. If f is an open map, so is $\tilde{f}$.

Proof. 1. Let V be open in Y, then $f^{-1}(V)$ is open in X, so $q^{-1}(\tilde{f}^{-1}(V)) = f^{-1}(V)$ is open, so $\tilde{f}^{-1}(V)$ is open in X/R by the definition of the quotient topology. Hence $\tilde{f}$ is continuous.
2. Given open V ⊂ X/R, $U = q^{-1}(V)$ is open in X, and V = q(U) since q is surjective, so $\tilde{f}(V) = \tilde{f}(q(U)) = f(U)$, which is open.


Corollary 5.13. If f(x) = f(y) ⇐⇒ xRy and f is surjective, continuous and open, then $\tilde{f}$ is a homeomorphism.

Remark. Work “upstairs”!

Example 5.15. Take R/Z, where the equivalence relation is given by the cosets of Z in the additive group R. Then $\mathbb{R}/\mathbb{Z} \cong S^1 = \{z \in \mathbb{C} : |z| = 1\}$. Indeed, consider the map $f(t) = e^{2\pi i t}$, which is continuous and surjective with f(x) = f(y) ⇐⇒ x − y ∈ Z. Once we know that f is open, the induced $\tilde{f}$ is a homeomorphism by the preceding corollary.
Suppose f is not open, so there is some open U ⊂ R such that f(U) is not open. Then there are z ∈ f(U) and a sequence $(z_n)$ in $S^1 \setminus f(U)$ such that $z_n \to z$. Pick x ∈ U with f(x) = z and, using surjectivity and periodicity of f, points $x_n \in [x - 1/2, x + 1/2]$ with $f(x_n) = z_n$; necessarily $x_n \notin U$. By Bolzano-Weierstrass, $(x_n)$ has a subsequence converging to some y, and y ∈ R \ U since R \ U is closed. By continuity f(y) = z = f(x), so x − y ∈ Z. As |x − y| ≤ 1/2, this forces x = y ∉ U, which is a contradiction.

6 Connectedness

An interval I in R has the defining property that for all x < y < z, x, z ∈ I =⇒ y ∈ I. We know that a continuous real function maps intervals to intervals, by the intermediate value theorem. But this may fail if the (restricted) domain is not an interval.

Definition 6.1. A topological space X is disconnected if there are nonempty open sets U, V ⊂ X that partition X, that is, U ∩ V = ∅ and U ∪ V = X. In this case, we say U, V disconnect X.
A topological space X is connected if it is not disconnected.

Lemma 6.1. The image of a connected space under a continuous function is connected.

Proof. Suppose f : X → Y is continuous and X is connected. Note that if we consider f as a map f : X → Im f, then it is still continuous. If U, V disconnect Im f, then $f^{-1}(U), f^{-1}(V)$ disconnect X, a contradiction.

Theorem 6.2. For a topological space X, the following are equivalent:
1. X is connected.
2. If f : X → R is continuous, then f(X) is an interval.
3. Every continuous function f : X → D, where D is discrete and |D| ≥ 2, is constant (see footnote 2).

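Proof. 1 =⇒ 2: Obvious from the preceding lemma and the fact that a subset of R is connected if and only if it is an interval.
2 =⇒ 3: Suppose f : X → D is continuous and not constant, and let d be one of its values. Since D is discrete, the map g : D → R with g(d) = 0 and g = 1 elsewhere is continuous, so g ∘ f : X → R is continuous with image {0, 1}, which is not an interval.
3 =⇒ 1: We prove the contrapositive. Suppose that U, V disconnect X, and choose d, e ∈ D with d ≠ e. Then the function f defined by

\[ f(x) = \begin{cases} d, & \text{if } x \in U \\ e, & \text{otherwise, that is, if } x \in V \end{cases} \]

is continuous but not constant.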
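For a finite space, connectedness can be tested straight from Definition 6.1: the space is disconnected exactly when some open set other than ∅ and X has an open complement. A minimal sketch (the function name `is_connected` and the two-point examples are assumptions of ours):

```python
def is_connected(X, tau):
    """True if no open set other than the empty set and X has an open complement."""
    X = frozenset(X)
    tau = {frozenset(U) for U in tau}
    return not any(U and U != X and (X - U) in tau for U in tau)

two_points = {1, 2}
sierpinski = [set(), {1}, {1, 2}]          # {1} is open but {2} is not
discrete = [set(), {1}, {2}, {1, 2}]       # every subset is open
```

A pair U, X \ U found by this search is exactly a disconnection, and mapping U to d and X \ U to e gives the non-constant continuous function used in the proof.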

2 Most of the time we take D = Z.


Example 6.1. 1. ∅ and singletons are connected.
2. Any indiscrete topological space is connected.
3. An infinite set with the cofinite topology is connected.
4. A discrete space with at least two points is disconnected.

Lemma 6.3. A subspace Y ⊂ X is disconnected if and only if there are open sets U, V ⊂ X such that U ∩ Y ≠ ∅, V ∩ Y ≠ ∅, U ∩ V ∩ Y = ∅, Y ⊂ U ∪ V.

Proof. Trivial.

Proposition 6.4. Let Y be a connected subspace of X, then $\bar{Y}$ is connected.

Proof. Assume not. Then by the preceding lemma, there exist open sets U, V in X such that $U \cap \bar{Y} \neq \varnothing$, $V \cap \bar{Y} \neq \varnothing$, $U \cap V \cap \bar{Y} = \varnothing$, $\bar{Y} \subset U \cup V$. It follows that U ∩ V ∩ Y = ∅ and Y ⊂ U ∪ V, so since Y is connected we must have, WLOG, U ∩ Y = ∅. Then Y ⊂ X \ U, which is closed, so $\bar{Y} \subset X \setminus U$, that is, $\bar{Y} \cap U = \varnothing$, contradiction.

Remark. 1. Alternatively, we can use the third part of Theorem 6.2.
2. In fact, any Z with $Y \subset Z \subset \bar{Y}$ is connected, since the closure of Y in Z is Z.

Alternative proof of Lemma 6.1. Let f : X → Y be continuous with X connected. For convenience we may assume f is surjective, by the same argument as in the original proof. Consider any continuous $g : Y \to \mathbb{Z}$, where $\mathbb{Z}$ carries the discrete topology. Then g ∘ f is continuous, hence constant since X is connected. But f is surjective, so g is constant, and we are done by Theorem 6.2.

Remark. 1. Connectedness is a topological property.
2. If f : X → Y is continuous and A ⊂ X is connected, then f(A) is connected.

Corollary 6.5. If X is connected and R is an equivalence relation on X, then X/R is connected.

Proof. The quotient map is continuous and surjective.

Example 6.2. The set $Y = \{(x, \sin(1/x)) : x > 0\} \subset \mathbb{R}^2$ is connected, since it is the image of $\mathbb{R}_{>0}$ under f(x) = (x, sin(1/x)), which is continuous since its components are continuous.
By Proposition 6.4, $\bar{Y} = Y \cup (\{0\} \times [-1, 1])$ is also connected. This is called the Topologist’s Sine Wave.

Lemma 6.6. Let $\mathcal{A}$ be a family of connected subsets of a topological space X such that $A \cap B \neq \varnothing$ for all $A, B \in \mathcal{A}$. Then $\bigcup_{A \in \mathcal{A}} A$ is connected.

Proof. Suppose $f : \bigcup_{A \in \mathcal{A}} A \to \mathbb{Z}$ is continuous. Then $f|_A$ is continuous for any $A \in \mathcal{A}$, thus it is constant, say with value $n_A$. For any $A, B \in \mathcal{A}$ we have $n_A = n_B$ since A ∩ B ≠ ∅. Thus f is constant, hence $\bigcup_{A \in \mathcal{A}} A$ is connected by Theorem 6.2.

Proposition 6.7. If X,Y are connected, so is X × Y .

Proof. Observe that ∀x ∈ X, {x} × Y ≅ Y is connected and ∀y ∈ Y, X × {y} ≅ X is connected as well. Since (x, y) ∈ ({x} × Y) ∩ (X × {y}), by the preceding lemma $A_{x,y} = (\{x\} \times Y) \cup (X \times \{y\})$ is connected. Now $(x, y') \in A_{x,y} \cap A_{x',y'}$, so this intersection is nonempty, and $X \times Y = \bigcup_{x \in X, y \in Y} A_{x,y}$ is connected by the preceding lemma.


Definition 6.2. Let X be a topological space. We define a relation R by xRy if and only if there is a connected U ⊂ X such that x, y ∈ U. One can check that this is an equivalence relation using Lemma 6.6, and the equivalence classes of R are called the connected components of X.

Let $C_x$ be the equivalence class containing x.

Proposition 6.8. Connected components are nonempty, maximal (with respect to inclusion) connected subsets of X, and they are closed.

Proof. Let C be a connected component, so it is the equivalence class of some x, that is $C = C_x$, and C is nonempty since it contains x. Given y ∈ C, there is a connected set $A_y$ containing x and y, and $A_y \subset C$ by definition of the relation. Now for all y, z ∈ C we have $x \in A_y \cap A_z \neq \varnothing$, therefore by Lemma 6.6 $C = \bigcup_{y \in C} A_y$ is connected.
If C ⊂ D and D is connected, then for every y ∈ D we have x, y ∈ D, so xRy and y ∈ C. Hence D ⊂ C, so C = D.
Finally, $\bar{C}$ is connected by Proposition 6.4 and contains C, so by maximality $\bar{C} = C$, therefore C is closed.

Definition 6.3. A topological space X is called path-connected if ∀x, y ∈ X, ∃γ : [0, 1] → X continuous with γ(0) = x, γ(1) = y.

Theorem 6.9. Any path-connected space is connected.

Proof. Suppose not, so X is path-connected but not connected, and there are open sets U, V that disconnect X. Fixing x ∈ U, y ∈ V, there exists a continuous γ : [0, 1] → X such that γ(0) = x, γ(1) = y. Thus $\gamma^{-1}(U), \gamma^{-1}(V)$ are nonempty, open, and partition [0, 1], so they disconnect [0, 1], which is a contradiction.

The converse, however, is not true.

Example 6.3. Take the Topologist’s Sine Wave, X = {(x, sin(1/x)) : x > 0} ∪ ({0} × [−1, 1]). We have already shown it is connected, but it is not path-connected. Indeed, pick the points (0, 0), (1, sin(1)) ∈ X and assume that γ : [0, 1] → X is continuous with γ(0) = (0, 0) and γ(1) = (1, sin(1)). Let $\gamma_1, \gamma_2$ be the components of γ, which are continuous. If $\gamma_1(t) > 0$, then $[0, \gamma_1(t)] \subset \gamma_1([0, t])$ by IVT, so there is n ∈ N with $(2\pi n)^{-1}, (2\pi n + \pi/2)^{-1} \in (0, \gamma_1(t)) \subset \gamma_1([0, t])$. So there are a, b ∈ (0, t) with $\gamma_1(a) = (2\pi n)^{-1}$ and $\gamma_1(b) = (2\pi n + \pi/2)^{-1}$, hence $\gamma_2(a) = 0$ and $\gamma_2(b) = 1$. Repeating this argument with t replaced by min(a, b), we find a sequence $1 > t_1 > t_2 > \dots > 0$ with

\[ \gamma_2(t_n) = \begin{cases} 1, & \text{if } n \text{ is even} \\ 0, & \text{otherwise.} \end{cases} \]

So $(t_n)$ converges, being decreasing and bounded, but $\gamma_2(t_n)$ does not. This contradicts the continuity of $\gamma_2$.
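The oscillation behind this argument can be seen numerically. The short sketch below (the helper name `witness_points` is our own) evaluates sin(1/x) at the points $(2\pi n)^{-1}$ and $(2\pi n + \pi/2)^{-1}$ used above: both sequences of points tend to 0, while the function values stay near 0 and 1 respectively.

```python
import math

def witness_points(n):
    """Return (a_n, b_n) with sin(1/a_n) = 0 and sin(1/b_n) = 1, up to rounding."""
    a = 1.0 / (2 * math.pi * n)
    b = 1.0 / (2 * math.pi * n + math.pi / 2)
    return a, b

rows = []
for n in (1, 10, 100):
    a, b = witness_points(n)
    # Each row records n, the two points, and the height of the curve at each.
    rows.append((n, a, math.sin(1 / a), b, math.sin(1 / b)))
```

So arbitrarily close to the vertical segment {0} × [−1, 1] the curve still takes heights near both 0 and 1, which is what prevents a continuous path from reaching (0, 0).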

Lemma 6.10 (Gluing Lemma). Let f : X → Y be a function between topological spaces. If X = A ∪ B where A, B are closed and $f|_A, f|_B$ are continuous, then f is continuous.

Proof. Given closed V in Y,

\[ f^{-1}(V) = (f^{-1}(V) \cap A) \cup (f^{-1}(V) \cap B) = (f|_A)^{-1}(V) \cup (f|_B)^{-1}(V). \]

Here $(f|_A)^{-1}(V)$ is closed in A, hence closed in X since A is closed, and similarly for B. So $f^{-1}(V)$ is closed, hence f is continuous.


Corollary 6.11. Let X be a topological space. Define the relation R by xRy if and only if there is a continuous γ : [0, 1] → X such that γ(0) = x, γ(1) = y. Then this is an equivalence relation.

Proof. Trivial.

Theorem 6.12. Let $U \subset \mathbb{R}^n$ be open, then U is connected if and only if U is path-connected.

Proof. By Theorem 6.9 it suffices to show that every open connected subset U of $\mathbb{R}^n$ is path-connected. WLOG U ≠ ∅. Fix $x_0 \in U$ and let V be the path-connected component of U containing $x_0$, that is, the equivalence class of $x_0$ under the relation in Corollary 6.11. We shall show that V and U \ V are both open, so that V = U since U is connected, which completes the proof.
V open: Since U is open, for any x ∈ V there is r > 0 such that $D_r(x) \subset U$. But any ball is path-connected, so every point of $D_r(x)$ is joined to x, and hence to $x_0$, by a path in U. Thus $D_r(x) \subset V$, so V is open.
U \ V open: By the same argument, every path-connected component of U is open, and U \ V is the union of all of them except V, so it is open.

Example 6.4. For n ≥ 2, $\mathbb{R}^n$ is not homeomorphic to R. Assume $f : \mathbb{R}^n \to \mathbb{R}$ is a homeomorphism. Fix $x \in \mathbb{R}^n$ and let y = f(x); then the restriction of f to $\mathbb{R}^n \setminus \{x\}$ is a homeomorphism onto R \ {y}. But $\mathbb{R}^n \setminus \{x\}$ is path-connected, hence connected, whereas R \ {y} is not, contradiction.

7 Compactness

Recall that a continuous, real-valued function on a closed bounded interval is bounded and attains its bounds. The question is: for which topological spaces X is it true that every continuous real function on X is bounded?

Example 7.1. 1. For finite X, every function X → R is bounded.
2. If for every continuous f : X → R there are finitely many sets $A_1, A_2, \dots, A_n \subset X$ with $X = \bigcup_i A_i$ such that f is bounded on each $A_i$, then every continuous f : X → R is bounded.

Note that given continuous f : X → R, for x ∈ X, $U_x = f^{-1}((f(x) - 1, f(x) + 1))$ is open and f is bounded there. So if some finite subset of $\{U_x : x \in X\}$ still covers X, then f must be bounded.

Definition 7.1. An open cover of a topological space X is a family of open sets $\mathcal{U} = \{U_i\}_{i \in I}$ in X such that $X = \bigcup_{i \in I} U_i$.
A subcover of $\mathcal{U}$ is a subset $\mathcal{V} \subset \mathcal{U}$ that is also an open cover of X. $\mathcal{V}$ is called a finite subcover if it is finite.
X is compact if every open cover of X has a finite subcover.

Theorem 7.1. If X ≠ ∅ is compact and f : X → R is continuous, then f is bounded and attains its bounds.

Proof. By continuity of f and compactness of X, there is a finite subset of $\{f^{-1}((f(x) - 1, f(x) + 1)) : x \in X\}$ that covers X. So f is bounded on each set of a finite family, hence on the union of that family, which is X. To show that f attains its lower bound, let m = inf{f(x) : x ∈ X}, which exists since X ≠ ∅ and f is bounded. Suppose that there is no x with f(x) = m, so for any x ∈ X, f(x) > m and there is $m_x$ such that $f(x) > m_x > m$. Let $U_x = f^{-1}((m_x, \infty))$, which is open and contains x, and $\inf_{U_x} f \ge m_x > m$.
The family of all $U_x$ is an open cover of X, so there is a finite F ⊂ X such that $\{U_x\}_{x \in F}$ covers X. So ∀y ∈ X, $f(y) \ge \min_{x \in F} m_x > m$, contradicting the definition of m. The upper bound is attained similarly, by applying the argument to −f.

Note that a subspace Y ⊂ X is compact iff whenever $\mathcal{U}$ is a family of open sets in X whose union contains Y, there is a finite subset $\mathcal{V} \subset \mathcal{U}$ such that the union of the elements of $\mathcal{V}$ contains Y.

Theorem 7.2. [0, 1] is compact.

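Proof. Let $\mathcal{U}$ be a family of open sets in R whose union contains [0, 1], and assume that no finite subfamily covers [0, 1]. If 0 ≤ a < b ≤ 1 and [a, b] cannot be covered by any finite $\mathcal{V} \subset \mathcal{U}$, then, letting c = (a + b)/2, one of [a, c], [c, b] cannot be covered by any finite $\mathcal{V} \subset \mathcal{U}$.
Therefore, inductively we can find intervals $I_n = [a_n, b_n]$, none of which is covered by a finite subfamily of $\mathcal{U}$, such that

\[ I_0 = [0, 1], \qquad I_{n+1} \subset I_n, \qquad |b_n - a_n| = 2^{-n}. \]

The sequence $(a_n)$ is increasing and bounded, so $a_n \to x$ and $b_n = a_n + (b_n - a_n) \to x$ for some x ∈ [0, 1]. Now there is $U \in \mathcal{U}$ such that x ∈ U, and then there is some ε > 0 with (x − ε, x + ε) ⊂ U. Therefore for sufficiently large n, $I_n \subset U$, so $I_n$ is covered by a single member of $\mathcal{U}$, which is a contradiction.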

Proposition 7.3. Let X be a topological space and Y ⊂ X a subspace, then
1. If X is compact and Y is closed in X, then Y is compact.
2. If X is Hausdorff and Y is compact, then Y is closed.

Proof. 1. Let $\mathcal{U}$ be a family of open sets in X whose union contains Y, then $\mathcal{U} \cup \{X \setminus Y\}$ is an open cover of X. Since X is compact, there is a finite $\mathcal{V} \subset \mathcal{U} \cup \{X \setminus Y\}$ that covers X, hence $\mathcal{V} \setminus \{X \setminus Y\} \subset \mathcal{U}$ is finite and covers Y.
2. We want to show that the complement of Y is open. Fix x ∈ X \ Y. For any y ∈ Y, there are disjoint open sets $U_y, V_y$ such that $x \in U_y$, $y \in V_y$. Then $\{V_y\}_{y \in Y}$ is an open cover of Y, thus there is some finite set F ⊂ Y such that $Y \subset \bigcup_{y \in F} V_y$. So $U = \bigcap_{y \in F} U_y$ is open, contains x, and is disjoint from $\bigcup_{y \in F} V_y \supset Y$. Hence x ∈ U ⊂ X \ Y, which shows that X \ Y is open.

Proposition 7.4. If X is compact and f : X → Y is continuous, then f(X) iscompact.

Proof. For any family of open sets $\{U_i\}_{i \in I}$ in Y that covers f(X), $\{f^{-1}(U_i)\}_{i \in I}$ is an open cover of X. Therefore there is some finite F ⊂ I such that $\{f^{-1}(U_i)\}_{i \in F}$ covers X, hence $\{U_i\}_{i \in F}$ covers f(X).

Remark. 1. Compactness is a topological property.
2. Let f : X → Y be continuous and A ⊂ X. If A is compact, then f(A) is compact.

Example 7.2. For a < b, [a, b] = f([0, 1]) where f(x) = (b − a)x + a, which is continuous, thus every closed bounded interval is compact.

Corollary 7.5. If X is compact and R is an equivalence relation on X, then X/R is compact.

Proof. The quotient map is continuous and surjective.


Theorem 7.6 (Topological Inverse Function Theorem). If f : X → Y is a continuous bijection, X is compact and Y is Hausdorff, then f is a homeomorphism.

Proof. It suffices to check that f is an open map, which, since f is a bijection, is equivalent to saying that f is a closed map.
Fix any closed V ⊂ X. Then V is compact since X is compact, thus f(V) is compact since f is continuous, therefore f(V) is closed since Y is Hausdorff. The result follows.

Example 7.3. The map $f : \mathbb{R} \to S^1$ given by $f(t) = e^{2\pi i t}$ induces a continuous bijection $\tilde{f} : \mathbb{R}/\mathbb{Z} \to S^1$. Now R/Z = q([0, 1]) (where q is the quotient map) is compact and $S^1$ is Hausdorff since it is a metric space, therefore $\tilde{f}$ is a homeomorphism.

Theorem 7.7 (Tychonoff’s Theorem on Finite Products; see footnote 3). Finite products of compact spaces are compact.

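Proof. It suffices to treat the product of two spaces. Assume X, Y are compact and let $\mathcal{U}$ be an open cover of X × Y. Fix x ∈ X. For any y ∈ Y there is some $W_y \in \mathcal{U}$ with $(x, y) \in W_y$, so there are $U_y$ open in X and $V_y$ open in Y such that $(x, y) \in U_y \times V_y \subset W_y$. By compactness of Y, there is a finite $F_x \subset Y$ with $\bigcup_{y \in F_x} V_y = Y$. Then $T_x = \bigcap_{y \in F_x} U_y$ is open and contains x, and

\[ T_x \times Y \subset \bigcup_{y \in F_x} (U_y \times V_y) \subset \bigcup_{y \in F_x} W_y. \]

The sets $W_y$ and $F_x$ depend on x, so write $W_y^x$ for $W_y$. Doing this for each x, the family $\{T_x\}_{x \in X}$ is an open cover of X, so there is a finite $F_X \subset X$ such that $\{T_x\}_{x \in F_X}$ covers X, hence

\[ X \times Y = \bigcup_{x \in F_X} (T_x \times Y) \subset \bigcup_{x \in F_X} \bigcup_{y \in F_x} W_y^x. \]

The sets $W_y^x$ on the right form the required finite subcover.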

Theorem 7.8 (Heine-Borel Theorem). A subset $K \subset \mathbb{R}^n$ is compact if and only if it is closed and bounded.

Proof. If K is compact, note that the map $\mathbb{R}^n \to \mathbb{R}$ given by x ↦ ‖x‖ is continuous, so it is bounded on K by Theorem 7.1, that is, K is bounded. K is also closed by Proposition 7.3, since $\mathbb{R}^n$ is Hausdorff.
Conversely, if K is closed and bounded, there is some M > 0 such that $K \subset [-M, M]^n$, which is compact by Theorem 7.2 (with Example 7.2) and Theorem 7.7. Since K is closed in $[-M, M]^n$, it is compact by Proposition 7.3.

Definition 7.2. Given an open set $U \subset \mathbb{R}^n$, a sequence of functions $f_k : U \to \mathbb{R}$ converges locally uniformly on U to some function f : U → R if ∀x ∈ U, ∃r > 0, $D_r(x) \subset U$ and $f_k \to f$ uniformly on $D_r(x)$.

This happens if and only if $f_k \to f$ uniformly on every compact subset of U.

Definition 7.3. A topological space X is sequentially compact if and only ifevery sequence in X has a convergent subsequence.

Example 7.4. Any closed bounded subset of $\mathbb{R}^n$ is sequentially compact by Bolzano-Weierstrass.

3 It works for arbitrary products, but that case is much, much harder.


Definition 7.4. Fix a metric space (M, d), ε > 0 and F ⊂ M. We say F is an ε-net for M if ∀x ∈ M, ∃y ∈ F, d(x, y) ≤ ε. That is,

\[ M = \bigcup_{y \in F} B_\varepsilon(y). \]

We say M is totally bounded if for any ε > 0, there is a finite ε-net for M.

Note that any compact metric space is totally bounded, but the converse is not true, as [0, 1) shows. The only thing missing there is completeness.

Theorem 7.9. The following are equivalent for a metric space M:
(1) M is compact.
(2) M is sequentially compact.
(3) M is totally bounded and complete.

Proof. 1 =⇒ 2: Let $(x_n)$ be a sequence in M, and for n ∈ N let $A_n = \{x_k : k > n\}$. It suffices to show that $\bigcap_{n \in \mathbb{N}} \bar{A}_n$ is nonempty, since any point of this intersection is the limit of a subsequence of $(x_n)$. Assume not, then

\[ \bigcup_{n \in \mathbb{N}} (M \setminus \bar{A}_n) = M. \]

Each $M \setminus \bar{A}_n$ is open, so by compactness of M there is some N ∈ N such that $\bigcup_{n \le N} (M \setminus \bar{A}_n) = M$. But $(A_n)$ is decreasing, so necessarily $M \setminus \bar{A}_N = M$, that is, $\bar{A}_N = \varnothing$, contradiction.
2 =⇒ 3: M is complete since a Cauchy sequence with a convergent subsequence is convergent. To see that it is totally bounded, assume it is not. Then there is some ε > 0 such that M has no finite ε-net. Pick any $x_1 \in M$, and once we have picked $x_1, \dots, x_n$, pick $x_{n+1} \notin \bigcup_{k=1}^{n} B_\varepsilon(x_k)$. This is possible since otherwise M would have a finite ε-net. But then $d(x_m, x_n) > \varepsilon$ for all m ≠ n, so $(x_n)$ has no Cauchy subsequence, hence no convergent subsequence.
3 =⇒ 1: Assume M is not compact, so there is an open cover $\mathcal{U}$ without any finite subcover. We say A ⊂ M is “bad” if no finite subfamily of $\mathcal{U}$ covers A. So M is bad but ∅ is not. Note that if $A = \bigcup_{i=1}^{n} B_i$ is bad, then some $B_i$ is bad.
Next, we show that if A is bad and ε > 0, then there is B ⊂ A such that B is bad and $\operatorname{diam} B = \sup_{x, y \in B} d(x, y) \le \varepsilon$. Indeed, since M is totally bounded, we have a finite ε/2-net F, that is,

\[ \bigcup_{x \in F} B_{\varepsilon/2}(x) = M \implies \bigcup_{x \in F} (B_{\varepsilon/2}(x) \cap A) = A. \]

So there is some x ∈ F such that $B_{\varepsilon/2}(x) \cap A$ is bad, and by the triangle inequality its diameter is at most ε. Using this we can construct a sequence $M \supset A_1 \supset A_2 \supset \cdots$ such that $A_n$ is bad (in particular nonempty) and $\operatorname{diam} A_n \le 1/n$ for every n. Picking $x_n \in A_n$ gives a Cauchy sequence $(x_n)$, which converges to some x ∈ M by completeness. There is some $U \in \mathcal{U}$ that contains x, and since U is open it contains $A_n$ when n is large enough. So $A_n$ is covered by a single member of $\mathcal{U}$, which contradicts $A_n$ being bad.
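The point-picking construction in the step 2 =⇒ 3 is, read constructively, a greedy algorithm for building an ε-net. The following is a minimal sketch on a finite set of points in the plane; the name `greedy_eps_net` and the sample grid are assumptions chosen for illustration.

```python
import math

def greedy_eps_net(points, eps):
    """Pick centres one by one, each farther than eps from all earlier centres.

    When the scan finishes, every point lies within eps of some centre,
    so the centres form an eps-net of the given finite set.
    """
    centres = []
    for p in points:
        if all(math.dist(p, c) > eps for c in centres):
            centres.append(p)
    return centres

# Assumed example: an 11 x 11 grid in the unit square.
points = [(i / 10, j / 10) for i in range(11) for j in range(11)]
eps = 0.25
net = greedy_eps_net(points, eps)
```

The chosen centres are pairwise more than ε apart, which is exactly the property the proof uses to rule out Cauchy subsequences when the process never stops.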

Remark. 1. We have a new proof of Bolzano-Weierstrass now! We also have a new proof of Theorem 7.7 for metric spaces.
2. Sadly, the equivalence of sequential compactness and compactness fails in both directions in general topological spaces.


8 Differentiation

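Definition 8.1. Fix m, n ∈ N and let L(R^m, R^n) be the set of linear maps from R^m to R^n. Note that this space is isomorphic to R^{mn}, both algebraically and topologically, when equipped with the norm

\[
\forall T \in L(\mathbb{R}^m, \mathbb{R}^n), \quad \|T\| = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |T_{ji}|^2} = \sqrt{\sum_{i=1}^{m} \|Te_i\|^2}
\]

where (T_{ji}) is the matrix of T with respect to the standard bases, so that Te_i = Σ_j T_{ji} e′_j.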

Lemma 8.1. (a) Given a linear map T ∈ L(R^m, R^n), we have ‖Tx‖ ≤ ‖T‖‖x‖ for every x ∈ R^m. So T is Lipschitz, hence continuous.
(b) For S ∈ L(R^m, R^n) and T ∈ L(R^n, R^p), ‖TS‖ ≤ ‖T‖‖S‖.

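Proof. (a) If x = Σ_i x_i e_i, then by the triangle inequality and Cauchy-Schwarz,

\[
\|Tx\| = \left\| \sum_{i=1}^{m} x_i Te_i \right\| \le \sum_{i=1}^{m} |x_i| \|Te_i\| \le \sqrt{\sum_{i=1}^{m} |x_i|^2} \sqrt{\sum_{i=1}^{m} \|Te_i\|^2} = \|x\| \|T\|
\]

(b) Applying part (a) to each Se_i, we have

\[
\|TS\| = \sqrt{\sum_{i=1}^{m} \|TSe_i\|^2} \le \sqrt{\sum_{i=1}^{m} \|T\|^2 \|Se_i\|^2} = \|T\| \|S\|
\]

As desired.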
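Both inequalities of Lemma 8.1 are easy to test numerically. The sketch below is one way to do so in plain Python on random matrices; the helper names (`matmul`, `matvec`, `frob_norm`, `check_lemma`) and the chosen dimensions are assumptions made for this illustration.

```python
import math
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    """Apply the matrix A to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def frob_norm(A):
    """Norm of Definition 8.1: square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in A for v in row))

def vec_norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(v * v for v in x))

def check_lemma(trials=200, m=3, n=4, p=2, seed=0, tol=1e-9):
    """Return True if both inequalities hold on all random samples."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    for _ in range(trials):
        S = rand(n, m)  # matrix of S : R^m -> R^n
        T = rand(p, n)  # matrix of T : R^n -> R^p
        x = [rng.uniform(-1, 1) for _ in range(m)]
        if vec_norm(matvec(S, x)) > frob_norm(S) * vec_norm(x) + tol:
            return False  # part (a) would fail
        if frob_norm(matmul(T, S)) > frob_norm(T) * frob_norm(S) + tol:
            return False  # part (b) would fail
    return True
```

The small tolerance `tol` only absorbs floating-point rounding; the inequalities themselves are exact.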

Recall that a function f : R → R is differentiable at a if lim_{h→0} (f(a + h) − f(a))/h exists. If we let ε(h) = (f(a + h) − f(a))/h − f′(a) for h ≠ 0, then f(a + h) = f(a) + f′(a)h + ε(h)h and ε(h) → 0 as h → 0. Equivalently, we can set ε(0) = 0 and ask that ε is continuous at 0. We use this formulation to define differentiation in higher dimensions.

Definition 8.2. Let m, n ∈ N, let U ⊂ R^m be open, f : U → R^n a function and a ∈ U. We say f is differentiable at a if there is a linear map T : R^m → R^n and a function ε : {h ∈ R^m : a + h ∈ U} → R^n such that

f(a + h) = f(a) + T(h) + ε(h)‖h‖

where ε(h) → 0 as h → 0 (equivalently, ε(0) = 0 and ε is continuous at 0).

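Remark. Given T, the function ε is determined by

\[
\varepsilon(h) = \begin{cases} 0, & \text{if } h = 0 \\ (f(a+h) - f(a) - T(h))/\|h\|, & \text{if } h \neq 0 \text{ and } a + h \in U \end{cases}
\]

Since U is open, ∃r > 0 with D_r(a) ⊂ U, so D_r(0) ⊂ Dom ε. Note also that our condition on ε is equivalent to saying ε(h)‖h‖ = o(‖h‖) as h → 0.
Next, we observe that T (if it exists) is unique. Indeed, if both T and S satisfy our condition, then (S(h) − T(h))/‖h‖ → 0 as h → 0. Fix x ≠ 0 and choose h = x/n for n ∈ N: by linearity (S(h) − T(h))/‖h‖ = (S(x) − T(x))/‖x‖ does not depend on n, so it must be 0. Hence S = T.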


Definition 8.3. This unique T is called the derivative of f at a, denoted by f′(a) or Df(a) or Df|_a, so

f(a+ h) = f(a) + f ′(a)(h) + o(‖h‖)

Definition 8.4. We say f is differentiable on U if it is differentiable at a for every a ∈ U. The derivative of f on U is then the map f′ : U → L(R^m, R^n), a ↦ f′(a).

Remark. When m = 1, T is a linear map R → R^n, so ∀x ∈ R, T(x) = xv where v = T(1) (this in fact gives us a natural correspondence L(R, R^n) ≅ R^n, T ↦ T(1)). Hence for open U ⊂ R, f : U → R^n and a ∈ U, to say f is differentiable at a is to say there is some v ∈ R^n with f(a + h) = f(a) + hv + o(h). In other words, the linear map f′(a) takes the form h ↦ hv.

Example 8.1. 1. Every constant function is differentiable, as we can take f′(a) = 0 ∈ L(R^m, R^n) for every a.
2. Every linear map f ∈ L(R^m, R^n) is differentiable with f′(a) = f for every a.
3. Any bilinear map f : R^m × R^n → R^p is differentiable. Indeed, we have

\[
f((a, b) + (h, k)) = f(a + h, b + k) = f(a, b) + f(a, k) + f(h, b) + f(h, k)
\]

Note that f(a, k) + f(h, b) is linear in (h, k), therefore it remains to check f(h, k) = o(‖(h, k)‖). Indeed,

\[
\begin{aligned}
\|f(h, k)\| &= \left\| f\left( \sum_{i=1}^{m} h_i e_i, \sum_{j=1}^{n} k_j e'_j \right) \right\| \\
&\le \sum_{i,j} |h_i| |k_j| \|f(e_i, e'_j)\| \\
&\le \|(h, k)\|^2 \sum_{i,j} \|f(e_i, e'_j)\| \\
&= O(\|(h, k)\|^2) = o(\|(h, k)\|)
\end{aligned}
\]

So f′(a, b)(h, k) = f(a, k) + f(h, b).

4. Take f : R^n → R by f(x) = ‖x‖², so

f(a + h) = ‖a + h‖² = ‖a‖² + 2⟨a, h⟩ + ‖h‖² = f(a) + 2⟨a, h⟩ + o(‖h‖)

That is, f′(a)(h) = 2⟨a, h⟩.
5. Let M_n ≅ L(R^n, R^n) be the collection of all n × n real matrices. Consider f : M_n → M_n, A ↦ A², which has

f(A + H) = A² + AH + HA + H² = f(A) + AH + HA + o(‖H‖)

since ‖H²‖ ≤ ‖H‖² by Lemma 8.1. So f′(A)(H) = AH + HA.
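The claim in item 5 can be checked numerically: for f(A) = A² the remainder f(A + tH) − f(A) − (A(tH) + (tH)A) equals t²H², so its norm divided by ‖tH‖ should shrink linearly in t. The following Python sketch does this for one pair of 2 × 2 matrices; the specific matrices and the helper names are assumptions chosen for the illustration.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(t, A):
    return [[t * a for a in row] for row in A]

def frob_norm(A):
    """Norm of Definition 8.1."""
    return math.sqrt(sum(v * v for row in A for v in row))

def remainder_ratio(A, H, t):
    """||f(A+tH) - f(A) - f'(A)(tH)|| / ||tH|| with f(A) = A^2."""
    tH = scale(t, H)
    linear = add(matmul(A, tH), matmul(tH, A))      # f'(A)(tH) = A tH + tH A
    shifted = add(A, tH)
    rem = sub(sub(matmul(shifted, shifted), matmul(A, A)), linear)
    return frob_norm(rem) / frob_norm(tH)

# Hypothetical test matrices.
A = [[1.0, 2.0], [3.0, 4.0]]
H = [[0.5, -1.0], [2.0, 1.0]]
steps = (1e-1, 1e-2, 1e-3)
ratios = [remainder_ratio(A, H, t) for t in steps]
```

Each ratio is bounded by t‖H‖, in line with the estimate ‖H²‖ ≤ ‖H‖² used in the example.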

Proposition 8.2. Differentiability implies continuity.

Proof. Write f(a + h) = f(a) + f′(a)(h) + ε(h)‖h‖ where ε(0) = 0 and ε is continuous at 0. The linear map f′(a) is continuous by Lemma 8.1, so the right-hand side is continuous in h at h = 0. Therefore h ↦ f(a + h) is continuous at h = 0, hence f is continuous at a.


Proposition 8.3 (Chain Rule). Consider open sets U ⊂ R^m, V ⊂ R^n and functions f : U → R^n, g : V → R^p with f(U) ⊂ V. If f is differentiable at a ∈ U and g is differentiable at f(a), then g ∘ f is differentiable at a and (g ∘ f)′(a) = g′(f(a)) ∘ f′(a).

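Proof. Let b = f(a), S = f′(a) and T = g′(b), so that

\[
\begin{cases}
f(a + h) = f(a) + S(h) + \varepsilon(h)\|h\| \\
g(b + k) = g(b) + T(k) + \delta(k)\|k\|
\end{cases}
\]

where ε(0) = 0, δ(0) = 0 and both of them are continuous at 0. So

\[
(g \circ f)(a + h) = g(b + S(h) + \varepsilon(h)\|h\|)
\]

Let k(h) = S(h) + ε(h)‖h‖, so this equals

\[
g(b) + T(k(h)) + \delta(k(h))\|k(h)\| = (g \circ f)(a) + (T \circ S)(h) + \|h\| T(\varepsilon(h)) + \delta(k(h)) \big\| S(h) + \varepsilon(h)\|h\| \big\|
\]

Since T is continuous and ε(h) → 0 as h → 0, we have T(ε(h)) → T(0) = 0, so ‖h‖T(ε(h)) = o(‖h‖) and this term is fine.
Also, k(0) = 0 and k is continuous at 0, so δ ∘ k is continuous at 0 with δ(k(0)) = 0. In addition, for h ≠ 0,

\[
0 \le \frac{\big\| S(h) + \varepsilon(h)\|h\| \big\|}{\|h\|} \le \frac{\|S(h)\| + \|\varepsilon(h)\| \|h\|}{\|h\|} \le \|S\| + \|\varepsilon(h)\|
\]

by Lemma 8.1, and the right-hand side is bounded near h = 0. So

\[
\lim_{h \to 0} \frac{\delta(k(h)) \big\| S(h) + \varepsilon(h)\|h\| \big\|}{\|h\|} = 0 \implies \delta(k(h)) \big\| S(h) + \varepsilon(h)\|h\| \big\| = o(\|h\|)
\]

Hence g ∘ f is differentiable at a and its derivative is T ∘ S = g′(b) ∘ f′(a) = g′(f(a)) ∘ f′(a).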
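As a sanity check of the chain rule, one can compare the product g′(f(a)) ∘ f′(a), computed by hand, with a finite-difference approximation of (g ∘ f)′(a). The sketch below does this for two hypothetical maps f : R² → R² and g : R² → R chosen only for illustration; they do not appear in the notes.

```python
import math

def f(x, y):
    """Hypothetical inner map R^2 -> R^2."""
    return (x * y, x + y * y)

def g(u, v):
    """Hypothetical outer map R^2 -> R."""
    return u * u + math.sin(v)

def chain_rule_gradient(x, y):
    """g'(f(a)) composed with f'(a), with both derivatives worked out by hand."""
    u, v = f(x, y)
    dg = (2 * u, math.cos(v))          # matrix of g'(u, v), a 1 x 2 row
    df = ((y, x), (1.0, 2 * y))        # matrix of f'(x, y), rows = components
    return tuple(dg[0] * df[0][i] + dg[1] * df[1][i] for i in range(2))

def numerical_gradient(x, y, step=1e-6):
    """Central differences of g o f along each coordinate direction."""
    def comp(s, t):
        return g(*f(s, t))
    dx = (comp(x + step, y) - comp(x - step, y)) / (2 * step)
    dy = (comp(x, y + step) - comp(x, y - step)) / (2 * step)
    return (dx, dy)

exact = chain_rule_gradient(0.7, -1.3)
approx = numerical_gradient(0.7, -1.3)
```

The two tuples agree up to the finite-difference error, which is what the proposition predicts for the partial derivatives of g ∘ f.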

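Proposition 8.4. A function f : U → R^n (where U ⊂ R^m is open) is differentiable at a ∈ U if and only if each component f_j = π_j ∘ f is differentiable at a. In that case,

\[
f'(a)(h) = \sum_{j=1}^{n} f_j'(a)(h) e'_j
\]

where e′_1, ..., e′_n is the standard basis of R^n.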

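Proof. Note that π_j(x) = ⟨x, e′_j⟩ is linear hence differentiable, thus by the chain rule the ⟹ direction is done. For ⟸, we have for every j,

\[
f_j(a + h) = f_j(a) + f_j'(a)(h) + \varepsilon_j(h)\|h\|
\]

So

\[
\begin{aligned}
f(a + h) &= \sum_{j=1}^{n} f_j(a + h) e'_j \\
&= \sum_{j=1}^{n} \left( f_j(a) + f_j'(a)(h) + \varepsilon_j(h)\|h\| \right) e'_j \\
&= f(a) + \sum_{j=1}^{n} f_j'(a)(h) e'_j + \left( \sum_{j=1}^{n} \varepsilon_j(h) e'_j \right) \|h\|
\end{aligned}
\]

Since ε(h) = Σ_{j=1}^{n} ε_j(h) e′_j has ε(0) = 0 and is continuous at 0, and h ↦ Σ_{j=1}^{n} f_j′(a)(h) e′_j is linear, the result follows.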

Proposition 8.5. Let f, g : U → R^n and φ : U → R, where U ⊂ R^m is open, be differentiable at a ∈ U. Then so are f + g and φf : x ↦ φ(x)f(x), and

(f + g)′(a) = f ′(a) + g′(a)

(φf)′(a)(h) = φ′(a)(h)f(a) + φ(a)f ′(a)(h)

Proof. We have

f(a + h) = f(a) + f′(a)(h) + ε(h)‖h‖

g(a+ h) = g(a) + g′(a)(h) + δ(h)‖h‖

φ(a+ h) = φ(a) + φ′(a)(h) + η(h)‖h‖

Hence

(f + g)(a+ h) = (f + g)(a) + (f ′(a) + g′(a))(h) + (ε(h) + δ(h))‖h‖

We could treat products in the same way, but we shall give a different proof. Let F : U → R × R^n = R^{n+1} by F(x) = (φ(x), f(x)) and G : R × R^n → R^n by (λ, y) ↦ λy. F is differentiable at a by Proposition 8.4 and G is differentiable since it is bilinear, therefore φf = G ∘ F is differentiable at a by the chain rule. By Example 8.1, G′(λ, y)(µ, z) = µy + λz, and F′(a)(h) = (φ′(a)(h), f′(a)(h)), so (φf)′(a)(h) = G′(φ(a), f(a))(F′(a)(h)) = φ′(a)(h)f(a) + φ(a)f′(a)(h), which is the formula as claimed.

Definition 8.5. Let U ⊂ R^m be open and f : U → R^n. Fix a ∈ U and a direction (nonzero vector) u ∈ R^m \ {0}. The limit

\[
\lim_{t \to 0} \frac{f(a + tu) - f(a)}{t}
\]

if it exists, is called the directional derivative of f at a in the direction u and is denoted by D_u f(a).

Remark. 1. f(a + tu) = f(a) + tD_u f(a) + o(t).
2. Let γ(t) = a + tu, then (f ∘ γ)′(0) = D_u f(a).

In the special case where u = e_i, we write D_i f(a) to denote D_{e_i} f(a), and it is called the i-th partial derivative of f at a.

Proposition 8.6. If f is differentiable at a, then all D_u f(a) exist and we have D_u f(a) = f′(a)(u), so for h = Σ_i h_i e_i, we have

\[
f'(a)(h) = \sum_{i} h_i D_i f(a)
\]

Proof. We have

f(a + h) = f(a) + f′(a)(h) + ε(h)‖h‖

Then, putting h = tu with t ≠ 0,

\[
\frac{f(a + tu) - f(a)}{t} = f'(a)(u) + \varepsilon(tu) \frac{|t|}{t} \|u\| \to f'(a)(u)
\]

as t → 0. The rest follows by linearity of f′(a).


Remark. 1. Assume f is differentiable at a, then the matrix of f′(a) is given by (f′(a))_{ji} = D_i f_j(a) = (∂f_j/∂x_i)(a). This is called the Jacobian of f at a, denoted by Jf(a).
2. If D_u f(a) exists, then so does D_u f_j(a) for all j, and we have D_u f_j(a) = π_j(D_u f(a)). So D_u π_j = π_j D_u.
3. The converse of the proposition fails in general: the existence of all directional derivatives at a does not imply that f is differentiable at a.
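To illustrate the failure of the converse, here is a standard counterexample (not taken from the notes), sketched in Python: f(x, y) = x²y/(x⁴ + y²) with f(0, 0) = 0. Every directional derivative at the origin exists, yet f takes the value 1/2 along the parabola y = x², so f is not even continuous at the origin and therefore not differentiable there by Proposition 8.2.

```python
def f(x, y):
    """f(x, y) = x^2 y / (x^4 + y^2), extended by 0 at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)

def directional_quotient(u, t):
    """Difference quotient (f(tu) - f(0)) / t at the origin in direction u."""
    return (f(t * u[0], t * u[1]) - f(0.0, 0.0)) / t

# For u = (u1, u2) with u2 != 0 the quotient tends to u1^2 / u2 as t -> 0;
# for u2 = 0 it is identically 0.
q_general = directional_quotient((1.0, 2.0), 1e-4)   # close to 1/2
q_axis = directional_quotient((1.0, 0.0), 1e-4)      # exactly 0

# Along the parabola y = x^2 the function is constantly 1/2 away from 0.
on_parabola = f(1e-3, 1e-6)
```

Note also that the limit u1²/u2 is not linear in u, which is another way to see that no derivative f′(0) can exist.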

Theorem 8.7. Suppose f : U → R^n where U ⊂ R^m is open, and let a ∈ U. Assume ∃r > 0 such that D_r(a) ⊂ U and, for all i, D_i f exists in D_r(a) and is continuous at a. Then f is differentiable at a.

Proof. WLOG n = 1 by Proposition 8.4 and the second remark above. We shall prove the case m = 2; the general case is similar.
Let a = (a_1, a_2) and consider h = (h_1, h_2) ∈ D_r(0). The derivative, if it exists, must send h to h_1 D_1f(a_1, a_2) + h_2 D_2f(a_1, a_2), so we will try to prove

f(a_1 + h_1, a_2 + h_2) − f(a_1, a_2) − h_1 D_1f(a_1, a_2) − h_2 D_2f(a_1, a_2) = o(‖h‖)

Note that we can write the left-hand side as the sum of two parts

f(a_1 + h_1, a_2 + h_2) − f(a_1 + h_1, a_2) − h_2 D_2f(a_1, a_2)

+ f(a_1 + h_1, a_2) − f(a_1, a_2) − h_1 D_1f(a_1, a_2)

By the definition of D_1f(a), the second part is f(a_1 + h_1, a_2) − f(a_1, a_2) − h_1 D_1f(a_1, a_2) = o(h_1) = o(‖h‖) as h → 0.
As for the first part, let φ(t) = f(a_1 + h_1, a_2 + t) for t ∈ [−|h_2|, |h_2|], so we have

f(a_1 + h_1, a_2 + h_2) − f(a_1 + h_1, a_2) − h_2 D_2f(a_1, a_2) = φ(h_2) − φ(0) − h_2 D_2f(a_1, a_2)

Note that φ is continuous and is differentiable in (−|h_2|, |h_2|), since D_2f exists in D_r(a). Indeed we have φ′(t) = D_2f(a_1 + h_1, a_2 + t). By the MVT, there is some θ = θ(h_1, h_2) ∈ (0, 1) such that φ(h_2) − φ(0) = φ′(θh_2)h_2. Hence

φ(h_2) − φ(0) − h_2 D_2f(a_1, a_2) = h_2 (D_2f(a_1 + h_1, a_2 + θh_2) − D_2f(a_1, a_2))

= o(h_2) = o(‖h‖)

as h → 0, because D_2f is continuous at a. So the theorem is proved.

Theorem 8.8 (Mean Value Inequality). Consider an open U ⊂ R^m and a function f : U → R^n. Assume f is differentiable on U and we are given a, b ∈ U such that [a, b] = {(1 − t)a + tb : t ∈ [0, 1]} ⊂ U. Suppose there is some M > 0 with ∀z ∈ [a, b], ‖f′(z)‖ ≤ M, then

‖f(b)− f(a)‖ ≤M‖b− a‖

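Proof. Let v = f(b) − f(a). Consider φ : [0, 1] → R defined by φ(t) = ⟨f((1 − t)a + tb), v⟩. Then φ(1) − φ(0) = ‖f(b) − f(a)‖² and, by the chain rule, φ is differentiable with φ′(t) = ⟨f′((1 − t)a + tb)(b − a), v⟩. By the MVT, ∃θ ∈ (0, 1) with φ(1) − φ(0) = φ′(θ), so we have

\[
\begin{aligned}
\|f(b) - f(a)\|^2 &= \langle f(b) - f(a), f(b) - f(a) \rangle = \varphi(1) - \varphi(0) = \varphi'(\theta) \\
&= \langle f'((1 - \theta)a + \theta b)(b - a), v \rangle \le \|f'((1 - \theta)a + \theta b)(b - a)\| \|v\| \\
&\le \|f'((1 - \theta)a + \theta b)\| \|b - a\| \|v\| \le M \|b - a\| \|f(b) - f(a)\|
\end{aligned}
\]

using Cauchy-Schwarz and Lemma 8.1. If f(b) = f(a) there is nothing to prove; otherwise divide by ‖f(b) − f(a)‖ and the theorem follows.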
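It is worth seeing why only an inequality can be expected when n > 1. A standard illustration (an assumption added here, not an example from the notes) is the curve f(t) = (cos t, sin t): its derivative has norm 1 everywhere, so the theorem gives ‖f(b) − f(a)‖ ≤ |b − a|, but f(2π) − f(0) = 0 while f′ never vanishes, so no "mean value equality" f(b) − f(a) = f′(θ)(b − a) can hold. A short Python check:

```python
import math
import random

def f(t):
    """Unit-speed parametrisation of the circle, a map R -> R^2."""
    return (math.cos(t), math.sin(t))

def increment(a, b):
    """Euclidean norm of f(b) - f(a)."""
    fa, fb = f(a), f(b)
    return math.hypot(fb[0] - fa[0], fb[1] - fa[1])

def inequality_holds(trials=1000, seed=0, tol=1e-12):
    """Check ||f(b) - f(a)|| <= M |b - a| with M = 1 on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.uniform(-10, 10), rng.uniform(-10, 10)
        if increment(a, b) > abs(b - a) + tol:
            return False
    return True

# The increment over a full turn vanishes (up to rounding),
# although ||f'(t)|| = 1 for every t.
full_turn = increment(0.0, 2 * math.pi)
```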


Corollary 8.9. Let U ⊂ R^m be open and connected, and f : U → R^n be differentiable such that f′ ≡ 0, then f is constant.

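Proof. It is locally constant by the preceding theorem: if D_r(a) ⊂ U and x ∈ D_r(a), then [a, x] ⊂ D_r(a) and ‖f(x) − f(a)‖ ≤ M‖x − a‖ for every M > 0, so f(x) = f(a). It is then globally constant by connectedness.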

Remark. Suppose we have open U ⊂ R^m, V ⊂ R^n and a bijection f : U → V such that f is differentiable at a ∈ U and f^{-1} is differentiable at f(a) ∈ V. Let S = f′(a), T = (f^{-1})′(f(a)), then ST = I_n and TS = I_m by the chain rule. So n = rank(ST) ≤ rank(S) ≤ m and m = rank(TS) ≤ rank(T) ≤ n, hence n = m.

Theorem 8.10 (Inverse Function Theorem). Let U ⊂ R^n be open, f : U → R^n a C^1 (continuously differentiable) function and a ∈ U a point such that f′(a) is invertible. Then there exist open sets V ⊂ U and W ⊂ R^n such that a ∈ V, f(a) ∈ W, f|_V : V → W is a bijection with a C^1 inverse g : W → V, and ∀y ∈ W, g′(y) = (f′(g(y)))^{-1}.

Proof. Step 1: We can assume WLOG that a = f(a) = 0 and f′(a) = I. This is because we can replace f by the map U − a = {x − a : x ∈ U} → R^n given by x ↦ (f′(a))^{-1}(f(x + a) − f(a)).
Now, by continuity of f′, we can fix r > 0 such that D_r(0) ⊂ U and ∀x ∈ D_r(0), ‖f′(x) − I‖ ≤ 1/2.

Step 2: ∀x, y ∈ D_r(0), ‖f(x) − f(y)‖ ≥ ‖x − y‖/2, so f is injective on D_r(0). To prove this, consider h(x) = x − f(x), which has ‖h′(x)‖ = ‖I − f′(x)‖ ≤ 1/2 for any x ∈ D_r(0). So ‖h(x) − h(y)‖ ≤ ‖x − y‖/2 by Theorem 8.8, in other words,

\[
\frac{\|x - y\|}{2} \ge \|h(x) - h(y)\| \ge \|x - y\| - \|f(x) - f(y)\|
\]

which leads to the desired inequality.

Step 3: For 0 < s < r/2, D_s(0) ⊂ f(B_{2s}(0)) ⊂ f(D_r(0)). Fix y ∈ D_s(0) and consider h : B_{2s}(0) → R^n by x ↦ y − f(x) + x. We have h′(x) = −f′(x) + I, so ∀x ∈ B_{2s}(0), ‖h′(x)‖ ≤ 1/2. h is then 1/2-Lipschitz by Theorem 8.8. Since h(0) = y, for any x ∈ B_{2s}(0) we have ‖h(x)‖ = ‖h(x) − h(0) + y‖ ≤ ‖x‖/2 + ‖y‖ ≤ 2s, so h(B_{2s}(0)) ⊂ B_{2s}(0). By Theorem 4.7 there is some x ∈ B_{2s}(0) such that h(x) = x, which means that y = f(x).

Step 4: Fix 0 < s < r/2, then let W = D_s(0) and V = f^{-1}(D_s(0)) ∩ D_r(0), so V is open, f(V) = W by step 3 and f is injective on V by step 2, so f|_V : V → W is a bijection. The inequality in step 2 also implies that the inverse g = (f|_V)^{-1} : W → V is 2-Lipschitz, hence continuous.
If g is differentiable, then we have I = (f ∘ g)′(y) = f′(g(y)) ∘ g′(y) for any y by the chain rule, hence g′(y) = (f′(g(y)))^{-1}. We want to show that g does have this as its derivative. Fix b ∈ W and let a = g(b), T = f′(a). Note that T is invertible: since ‖T − I‖ ≤ 1/2, we have ‖Tx‖ ≥ ‖x‖ − ‖(T − I)x‖ ≥ ‖x‖/2 for every x, so T is injective. We have f(a + h) = f(a) + T(h) + ε(h)‖h‖. Fix δ > 0 such that D_δ(b) ⊂ W and k ∈ D_δ(0). By setting h = h(k) = g(b + k) − g(b) we have k = f(a + h) − f(a) = T(h) + ε(h)‖h‖, so h = T^{-1}(k) − T^{-1}(ε(h))‖h‖, thus

g(b + k) = g(b) + h = g(b) + T^{-1}(k) − T^{-1}(ε(h))‖h‖ = g(b) + T^{-1}(k) + o(‖k‖)

The last equality holds because ‖h‖ ≤ 2‖k‖ (g is 2-Lipschitz) and ε(h(k)) → 0 as k → 0 (g is continuous). Hence g is differentiable at b with derivative T^{-1} = (f′(g(b)))^{-1}, which is continuous in b as a composition of g, f′ and matrix inversion.
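Step 3 is constructive: the point x with f(x) = y is the fixed point of the contraction x ↦ y − f(x) + x, so it can be approximated by iteration. The following Python sketch runs that iteration for a hypothetical map f with f(0) = 0 and f′(0) = I; the map, the target y and the number of iterations are assumptions made for the example.

```python
import math

def f(x):
    """Hypothetical C^1 map R^2 -> R^2 with f(0) = 0 and f'(0) = I.

    Here f'(x) - I = [[0, x2/2], [x1/2, 0]], whose norm is ||x||/2,
    so ||f'(x) - I|| <= 1/2 on the unit ball (r = 1 in the proof).
    """
    return (x[0] + x[1] ** 2 / 4, x[1] + x[0] ** 2 / 4)

def local_inverse(y, iterations=60):
    """Iterate h(x) = y - f(x) + x from x = 0, as in step 3 of the proof."""
    x = (0.0, 0.0)
    for _ in range(iterations):
        fx = f(x)
        x = (y[0] - fx[0] + x[0], y[1] - fx[1] + x[1])
    return x

y = (0.2, -0.1)                  # ||y|| < 1/2, so y lies in some D_s(0) with s < r/2
x = local_inverse(y)
residual = math.hypot(f(x)[0] - y[0], f(x)[1] - y[1])
```

Since the iteration map is 1/2-Lipschitz, the error at least halves at every step, so 60 iterations are far more than needed in double precision.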


Definition 8.6. Let U ⊂ R^m be open and f : U → R^n. Suppose a ∈ U, then f is twice differentiable at a if there is some open V with a ∈ V ⊂ U such that f is differentiable in V and the derivative f′ : V → L(R^m, R^n) is differentiable at a. Then f″(a) = (f′)′(a) is called the second derivative of f at a.

So we have f″(a) ∈ L(R^m, L(R^m, R^n)), where

f ′(a+ h) = f ′(a) + f ′′(a)(h) + ε(h)‖h‖

where ε(h) → 0 as h → 0. Note that ε(h) ∈ L(R^m, R^n). So f′(a + h)(k) = f′(a)(k) + f″(a)(h)(k) + ε(h)(k)‖h‖ for each fixed k ∈ R^m. Note also that L(R^m, L(R^m, R^n)) ≅ Bil(R^m × R^m, R^n) by the natural correspondence T ↦ B with B(h, k) = T(h)(k). So we can think of the second derivative at a as a bilinear map R^m × R^m → R^n.
In summary, f is twice differentiable at a if and only if there is a bilinear map B : R^m × R^m → R^n such that for any k ∈ R^m we have

f ′(a+ h)(k) = f ′(a)(k) +B(h, k) + o(‖h‖)

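Example 8.2. Consider f : M_n → M_n given by A ↦ A³. It is differentiable with f′(A)(H) = HA² + AHA + A²H. To find the second derivative, expand

\[
\begin{aligned}
f'(A + H)(K) &= K(A + H)^2 + (A + H)K(A + H) + (A + H)^2 K \\
&= f'(A)(K) \\
&\quad + KAH + KHA + AKH + HKA + AHK + HAK \\
&\quad + o(\|H\|)
\end{aligned}
\]

where the remaining terms KH² + HKH + H²K have norm at most 3‖K‖‖H‖² by Lemma 8.1. So the second derivative is the bilinear map f″(A)(H, K) = KAH + KHA + AKH + HKA + AHK + HAK.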
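The expansion in Example 8.2 can be verified exactly with integer matrices, where no rounding occurs: the difference f′(A + H)(K) − f′(A)(K) minus the claimed bilinear term should equal KH² + HKH + H²K. The Python sketch below checks this for one hypothetical choice of A, H and K; the helper names are assumptions for the illustration.

```python
from functools import reduce

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def prod(*Ms):
    """Product of several matrices, taken left to right."""
    return reduce(matmul, Ms)

def add(*Ms):
    """Entrywise sum of several matrices of the same size."""
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def df(A, K):
    """f'(A)(K) = K A^2 + A K A + A^2 K for f(A) = A^3."""
    return add(prod(K, A, A), prod(A, K, A), prod(A, A, K))

def second(A, H, K):
    """The claimed second derivative f''(A)(H, K)."""
    return add(prod(K, A, H), prod(K, H, A), prod(A, K, H),
               prod(H, K, A), prod(A, H, K), prod(H, A, K))

# Hypothetical integer matrices, so all arithmetic is exact.
A = [[1, 2], [0, -1]]
H = [[2, 0], [1, 3]]
K = [[0, 1], [-1, 2]]
leftover = sub(sub(df(add(A, H), K), df(A, K)), second(A, H, K))
quadratic_terms = add(prod(K, H, H), prod(H, K, H), prod(H, H, K))
```

The formula for `second` is visibly unchanged when H and K are swapped, which matches the symmetry of the second derivative.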

Assume that f has second derivative at a under the usual setup, then

f ′(a+ h)(k) = f ′(a)(k) + f ′′(a)(h, k) + o(‖h‖)

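Fix u, v ∈ R^m \ {0}. Putting k = v and using D_v f(x) = f′(x)(v) (Proposition 8.6), we have

D_v f(a + h) = D_v f(a) + f″(a)(h, v) + o(‖h‖)

So D_v f is differentiable at a with derivative h ↦ f″(a)(h, v), and therefore, by Proposition 8.6 again, we can write

D_u D_v f(a) = f″(a)(u, v)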

Theorem 8.11. Let U ⊂ R^m be open and f : U → R^n be twice differentiable on U with f″ continuous at a for some a ∈ U, then f″(a) is a symmetric bilinear map, that is, for any nonzero u, v ∈ R^m, D_u D_v f(a) = D_v D_u f(a).

Proof. WLOG n = 1 since (f_j)″ = (f″)_j. Define

φ(s, t) = f(a + su + tv) − f(a + su) − (f(a + tv) − f(a))

= f(a + su + tv) − f(a + tv) − (f(a + su) − f(a))

Consider Ψ(x) = f(a + xu + tv) − f(a + xu), so φ(s, t) = Ψ(s) − Ψ(0) = sΨ′(αs) where α = α(s, t) ∈ (0, 1) by the mean value theorem. Expanding gives φ(s, t) = s(D_u f(a + αsu + tv) − D_u f(a + αsu)). Consider ψ(y) = D_u f(a + αsu + yv), so φ(s, t) = s(ψ(t) − ψ(0)) = stψ′(βt) for some β = β(s, t) ∈ (0, 1). In other words,

\[
\frac{\varphi(s, t)}{st} = D_v D_u f(a + \alpha s u + \beta t v) = f''(a + \alpha s u + \beta t v)(v, u) \to f''(a)(v, u)
\]

as (s, t) → 0, by the continuity of f″ at a. Repeating the process in the other order (starting from the second expression for φ) gives

\[
\frac{\varphi(s, t)}{st} \to f''(a)(u, v)
\]

So they are equal.
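The quotient φ(s, t)/(st) from the proof is also a practical way to approximate f″(a)(u, v). The Python sketch below evaluates it for the hypothetical function f(x, y) = x³y² (chosen for the example, not from the notes) and compares it with the second derivative computed by hand from the second-order partial derivatives.

```python
def f(x, y):
    """Hypothetical smooth function R^2 -> R."""
    return x ** 3 * y ** 2

def phi(a, u, v, s, t):
    """phi(s, t) = f(a+su+tv) - f(a+su) - f(a+tv) + f(a), as in the proof."""
    def at(cs, ct):
        return f(a[0] + cs * u[0] + ct * v[0], a[1] + cs * u[1] + ct * v[1])
    return at(s, t) - at(s, 0.0) - at(0.0, t) + at(0.0, 0.0)

def second_derivative(a, u, v):
    """f''(a)(u, v) from the hand-computed second partial derivatives."""
    x, y = a
    fxx, fxy, fyy = 6 * x * y ** 2, 6 * x ** 2 * y, 2 * x ** 3
    return (fxx * u[0] * v[0]
            + fxy * (u[0] * v[1] + u[1] * v[0])
            + fyy * u[1] * v[1])

a, u, v = (1.0, 2.0), (1.0, 1.0), (1.0, -2.0)
step = 1e-4
quotient = phi(a, u, v, step, step) / (step * step)
exact = second_derivative(a, u, v)
```

Swapping u and v leaves both `phi` (with s and t swapped accordingly) and `second_derivative` unchanged, which is the symmetry asserted by the theorem.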

Suppose we have U ⊂ R^m open and f : U → R. We say f has a local maximum at a ∈ U if ∃r > 0, ∀b ∈ D_r(a), f(b) ≤ f(a). We can similarly define local minima.

Definition 8.7. We say f has a stationary point at a if f is differentiable at a and f′(a) = 0.

It is immediate that if f is differentiable at a local maximum or minimum a, then a is a stationary point of f.

Theorem 8.12. Let U ⊂ R^m be open and f : U → R be twice differentiable in U, and suppose f′(a) = 0 and f″ is continuous at a. If the symmetric form f″(a) is positive definite, then f has a local minimum at a; if it is negative definite, then f has a local maximum at a.

Proof. It is a non-examinable fact that

\[
f(a + h) = f(a) + f'(a)(h) + \tfrac{1}{2} f''(a)(h, h) + \varepsilon(h)\|h\|^2
\]

for some ε such that ε(h) → 0 as h → 0.
Recall that f″(a) is orthogonally diagonalizable as a real symmetric form, i.e. there is an orthonormal basis {u_k} for R^m such that

\[
f''(a)(u_i, u_j) = \begin{cases} 0, & \text{if } i \neq j \\ \lambda_i, & \text{if } i = j \end{cases}
\]

Assume that f″(a) is positive definite, then λ_i = f″(a)(u_i, u_i) > 0 for all i, hence µ = min{λ_i : 1 ≤ i ≤ m} > 0. Therefore we have, for any h = Σ_i h_i u_i ∈ R^m,

\[
f''(a)(h, h) = \sum_{i,j} h_i h_j f''(a)(u_i, u_j) = \sum_{i} h_i^2 \lambda_i \ge \mu \sum_{i} h_i^2 = \mu \|h\|^2
\]

Since f′(a) = 0, we get f(a + h) − f(a) = (1/2)f″(a)(h, h) + ε(h)‖h‖² ≥ (µ/2 − µ/4)‖h‖² = µ‖h‖²/4 ≥ 0 whenever ‖h‖ is small enough so that |ε(h)| ≤ µ/4. This means that f has a local minimum at a. The negative definite case is analogous.
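For m = 2 the definiteness test can be carried out by hand, since the eigenvalues of a symmetric 2 × 2 matrix have a closed form. The sketch below classifies a stationary point from the entries of the matrix of f″(a); the function names and the sample matrices are assumptions made for this illustration, and the label "theorem does not apply" covers every case in which f″(a) is neither positive nor negative definite.

```python
import math

def eigenvalues_sym2(p, q, r):
    """Eigenvalues (smaller first) of the symmetric matrix [[p, q], [q, r]]."""
    mean = (p + r) / 2
    radius = math.hypot((p - r) / 2, q)
    return (mean - radius, mean + radius)

def classify(p, q, r):
    """Apply Theorem 8.12 to a stationary point whose f''(a) has matrix [[p, q], [q, r]]."""
    smallest, largest = eigenvalues_sym2(p, q, r)
    if smallest > 0:
        return "local minimum"      # positive definite
    if largest < 0:
        return "local maximum"      # negative definite
    return "theorem does not apply"

# f(x, y) = x^2 + xy + y^2 at the origin has f''(0) = [[2, 1], [1, 2]].
kind_min = classify(2.0, 1.0, 2.0)
# f(x, y) = -x^2 - 3y^2 at the origin has f''(0) = [[-2, 0], [0, -6]].
kind_max = classify(-2.0, 0.0, -6.0)
# f(x, y) = x^2 - y^2 at the origin has f''(0) = [[2, 0], [0, -2]].
kind_other = classify(2.0, 0.0, -2.0)
```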
