REAL ANALYSIS

FALL 2001

Gabriel Nagy

Kansas State University

© Gabriel Nagy

Chapter I

Topology Preliminaries

Lecture 1

1. Review of basic topology concepts

In this lecture we review some basic notions from topology, the main goal being to set up the language. Except for one result (Urysohn's Lemma) there will be no proofs.

Definitions. A topology on a (non-empty) set $X$ is a family $\mathcal{T}$ of subsets of $X$, which are called open sets, with the following properties:

(top1): both the empty set $\varnothing$ and the total set $X$ are open;
(top2): an arbitrary union of open sets is open;
(top3): a finite intersection of open sets is open.

In this case the system $(X, \mathcal{T})$ is called a topological space.

If $(X, \mathcal{T})$ is a topological space and $x \in X$ is an element in $X$, a subset $N \subset X$ is called a neighborhood of $x$ if there exists some open set $D$ such that $x \in D \subset N$.

A collection $\mathcal{N}$ of neighborhoods of $x$ is called a basic system of neighborhoods of $x$, if for any neighborhood $M$ of $x$, there exists some neighborhood $N$ in $\mathcal{N}$ such that $x \in N \subset M$.

A collection $\mathcal{V}$ of neighborhoods of $x$ is called a fundamental system of neighborhoods of $x$ if for any neighborhood $M$ of $x$ there exists a finite sequence $V_1, V_2, \dots, V_n$ of neighborhoods in $\mathcal{V}$ such that $x \in V_1 \cap V_2 \cap \dots \cap V_n \subset M$.

A topology is said to have the Hausdorff property if:

(h) for any $x, y \in X$ with $x \neq y$, there exist open sets $U \ni x$ and $V \ni y$ such that $U \cap V = \varnothing$.

If $(X, \mathcal{T})$ is a topological space, a subset $F \subset X$ will be called closed, if its complement $X \smallsetminus F$ is open. The following properties are easily derived from the definition:

(c1) both the empty set $\varnothing$ and the total set $X$ are closed;
(c2) an arbitrary intersection of closed sets is closed;
(c3) a finite union of closed sets is closed.
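On a finite set the axioms (top1)-(top3), the Hausdorff property (h), and the passage to closed sets can be checked mechanically. The following is a minimal sketch, assuming a finite set and a family of subsets given as Python sets; the function names `is_topology`, `is_hausdorff`, and `closed_sets` are hypothetical and not part of the notes.

```python
from itertools import combinations

def is_topology(X, T):
    """Check (top1)-(top3) for a family T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(U) for U in T}
    if frozenset() not in T or X not in T:       # (top1)
        return False
    for U, V in combinations(T, 2):
        if U | V not in T:                       # (top2): pairwise unions suffice for a finite family
            return False
        if U & V not in T:                       # (top3): pairwise intersections suffice
            return False
    return True

def is_hausdorff(X, T):
    """Property (h): distinct points have disjoint open neighborhoods."""
    T = [frozenset(U) for U in T]
    return all(
        any(x in U and y in V and not (U & V) for U in T for V in T)
        for x, y in combinations(list(X), 2)
    )

def closed_sets(X, T):
    """Complements of the open sets."""
    X = frozenset(X)
    return {X - frozenset(U) for U in T}

X = {0, 1, 2}
T = [set(), {0}, {0, 1}, {0, 1, 2}]                                  # a chain of open sets
discrete = [set(c) for r in range(4) for c in combinations(X, r)]    # every subset is open

print(is_topology(X, T), is_hausdorff(X, T), is_hausdorff(X, discrete))
```

In this example the chain topology satisfies the axioms but fails (h), while the discrete topology is Hausdorff.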

Using the above properties of open/closed sets, one can perform the following constructions. Let $(X, \mathcal{T})$ be a topological space and $A \subset X$ be an arbitrary subset. Consider the set $\operatorname{Int}(A)$ to be the union of all open sets $D$ with $D \subset A$, and consider the set $\overline{A}$ to be the intersection of all closed sets $F$ with $F \supset A$. The set $\operatorname{Int}(A)$ (sometimes denoted simply by $\overset{\circ}{A}$) is called the interior of $A$, while the set $\overline{A}$ is called the closure of $A$. The properties of these constructions are summarized in the following:

Proposition 1.1. Let $(X, \mathcal{T})$ be a topological space, and let $A$ be an arbitrary subset of $X$.

A. (Properties of the interior)
(i) The set $\operatorname{Int}(A)$ is open and $\operatorname{Int}(A) \subset A$.
(ii) If $D$ is an open set such that $D \subset A$, then $D \subset \operatorname{Int}(A)$.
(iii) $x$ belongs to $\operatorname{Int}(A)$ if and only if $A$ is a neighborhood of $x$.
(iv) $A$ is open if and only if $A = \operatorname{Int}(A)$.

B. (Properties of the closure)
(i) The set $\overline{A}$ is closed and $\overline{A} \supset A$.
(ii) If $F$ is a closed set with $F \supset A$, then $F \supset \overline{A}$.
(iii) A point $x$ belongs to $\overline{A}$ if and only if $A \cap N \neq \varnothing$ for any neighborhood $N$ of $x$.
(iv) $A$ is closed if and only if $A = \overline{A}$.

C. (Relationship between interior and closure) $\operatorname{Int}(X \smallsetminus A) = X \smallsetminus \overline{A}$ and $\overline{X \smallsetminus A} = X \smallsetminus \operatorname{Int}(A)$.

Definition. Suppose $(X, \mathcal{T})$ is a topological space. Assume $A \subset X$ is a subset of $X$. On $A$ we can introduce a natural topology, sometimes denoted by $\mathcal{T}|_A$, which consists of all subsets of $A$ of the form $A \cap U$ with $U$ an open set in $X$. This topology is called the relative (or induced) topology.

Remark 1.1. If $A$ is already open in the topology $\mathcal{T}$, then a subset $V \subset A$ is open in the induced topology if and only if $V$ is open in the topology $\mathcal{T}$ (this follows from the fact that the intersection of any two open sets in $\mathcal{T}$ is again an open set in $\mathcal{T}$).
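The interior, the closure, and the relative topology can be computed directly from their definitions when $X$ is finite. The sketch below assumes a finite topological space encoded as Python sets; the helper names `interior`, `closure`, and `relative_topology` are illustrative only.

```python
def interior(A, T):
    """Union of all open sets D with D contained in A."""
    A = frozenset(A)
    result = frozenset()
    for D in T:
        D = frozenset(D)
        if D <= A:
            result |= D
    return result

def closure(A, X, T):
    """Intersection of all closed sets F with F containing A."""
    A, X = frozenset(A), frozenset(X)
    result = X
    for D in T:
        F = X - frozenset(D)          # closed sets are complements of open sets
        if A <= F:
            result &= F
    return result

def relative_topology(A, T):
    """All sets of the form A & U with U open in X."""
    A = frozenset(A)
    return {A & frozenset(U) for U in T}

X = {0, 1, 2}
T = [set(), {0}, {0, 1}, {0, 1, 2}]
A = {1, 2}

print(interior(A, T), closure(A, X, T), relative_topology(A, T))
```

For this choice of $A$ one can compare both sides of the two identities in part C of Proposition 1.1 by evaluating `interior` and `closure` on `A` and on its complement.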

Definition. Suppose $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ are topological spaces and $x$ is an element in $X$. A map $f : X \to Y$ is said to be continuous at $x$, if for any neighborhood $N$ of $f(x)$ in the topology $\mathcal{S}$ (on $Y$), the set

$$f^{-1}(N) = \{ z \in X \mid f(z) \in N \}$$

is a neighborhood of $x$ in the topology $\mathcal{T}$ (on $X$).

If $f$ is continuous at every point in $X$, then $f$ is said to be continuous.

Continuity is “well behaved” with respect to compositions:

Proposition 1.2. Suppose $(X, \mathcal{T})$, $(Y, \mathcal{S})$, and $(Z, \mathcal{Z})$ are topological spaces, and $X \xrightarrow{f} Y \xrightarrow{g} Z$ are two functions.
(i) If $f$ is continuous at a point $x \in X$, and if $g$ is continuous at $f(x)$, then $g \circ f$ is continuous at $x$.
(ii) If $f$ and $g$ are (globally) continuous, then so is $g \circ f$.

The identity map on a topological space is always continuous.

In terms of open/closed sets, the characterization of continuity is given by the following.

Proposition 1.3. If $(X, \mathcal{T})$, $(Y, \mathcal{S})$ are topological spaces and $f : (X, \mathcal{T}) \to (Y, \mathcal{S})$ is a map, then the following are equivalent:
(i) $f$ is continuous.
(ii) Whenever $U \subset Y$ is an open set, it follows that $f^{-1}(U)$ is also an open set (in $X$).
(iii) Whenever $F \subset Y$ is a closed set, it follows that $f^{-1}(F)$ is also a closed set (in $X$).

We conclude this section with a useful technical result.

Theorem 1.1 (Urysohn's Lemma). Let $(X, \mathcal{T})$ be a topological Hausdorff space with the following property:

(n) For any two disjoint closed sets $A, B \subset X$, there exist two disjoint open sets $U, V \subset X$, such that $U \supset A$ and $V \supset B$.

Then for any two disjoint closed sets $A, B \subset X$, there exists a continuous function $f : X \to [0, 1]$ such that $f|_A = 0$ and $f|_B = 1$.

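Proof. We begin with a refinement of property (n):

(n′) For any disjoint closed sets $A, B \subset X$, there exist two open sets $U, W \subset X$, such that $A \subset U$, $\overline{U} \subset W$, and $\overline{W} \cap B = \varnothing$.

To prove (n′), we first apply (n) to find two disjoint open sets $W, Z \subset X$ such that

$$W \supset A \text{ and } Z \supset B. \tag{1}$$

Next we apply (n) again to the pair of closed sets $A$ and $X \smallsetminus W$, and find two disjoint open sets $U, V \subset X$ such that

$$U \supset A \text{ and } V \supset X \smallsetminus W. \tag{2}$$

On the one hand, using the fact that $U \cap V = \varnothing$ and the fact that $V$ is open, we get the inclusion $\overline{U} \subset X \smallsetminus V$. Using (2) this gives

$$\overline{U} \subset X \smallsetminus V \subset W.$$

On the other hand, using the fact that $W \cap Z = \varnothing$ and the fact that $Z$ is open, we get $\overline{W} \subset X \smallsetminus Z$. But using (1) this will give

$$\overline{W} \subset X \smallsetminus Z \subset X \smallsetminus B,$$

and we are done.

To prove the Theorem, start with two disjoint closed sets $A, B \subset X$. For every integer $n \geq 0$ we define the set $D_n = \{ k/2^n : k \in \mathbb{Z},\ 0 \leq k \leq 2^n \}$, and we consider

$$D = \bigcup_{n=0}^{\infty} D_n.$$

(Notice that $D_n \subset D_{n+1}$, for all $n \geq 0$.)

We are going to construct a family $(V_t)_{t \in D}$ of open sets in $X$ with the following properties:
(i) $V_0 \supset A$ and $\overline{V_1} \cap B = \varnothing$;
(ii) $\overline{V_t} \subset V_s$, for all $t, s \in D$ with $t < s$.

Let us start by constructing $V_0$ and $V_1$. We use property (n′) to find open sets $U, W \subset X$, with

$$A \subset U \subset \overline{U} \subset W \text{ and } \overline{W} \cap B = \varnothing,$$

and we simply take $V_0 = U$ and $V_1 = W$.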

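The construction of the family $(V_t)_{t \in D}$ is carried out recursively. Assume that, for some integer $n \geq 0$, we have constructed the sets $(V_t)_{t \in D_n}$ with properties (i) and (ii) (satisfied for $t, s \in D_n$), and let us construct the next block of sets $(V_t)_{t \in D_{n+1} \smallsetminus D_n}$. We start off by observing that for every $t \in D_{n+1} \smallsetminus D_n$, the numbers

$$t^{\pm} = t \pm \frac{1}{2^{n+1}}$$

belong to $D_n$. Apply (n′) to the pair of disjoint closed sets $\overline{V_{t^-}}$ and $X \smallsetminus V_{t^+}$ to find two open sets $U, W \subset X$ such that

$$\overline{V_{t^-}} \subset U \subset \overline{U} \subset W \text{ and } \overline{W} \cap (X \smallsetminus V_{t^+}) = \varnothing.$$

Notice that the equality $\overline{W} \cap (X \smallsetminus V_{t^+}) = \varnothing$, coupled with the inclusion $\overline{U} \subset W \subset \overline{W}$, gives $\overline{U} \cap (X \smallsetminus V_{t^+}) = \varnothing$, so we get $\overline{U} \subset V_{t^+}$. We can then define $V_t = U$, and we will obviously have the inclusions

$$\overline{V_{t^-}} \subset V_t \subset \overline{V_t} \subset V_{t^+}. \tag{3}$$

Now the extended family $(V_t)_{t \in D_{n+1}}$ will also satisfy property (ii), since for $t, s \in D_{n+1}$ with $t < s$, one of the following will hold:

• either $t, s \in D_n$, or
• $t \in D_n$, $s \in D_{n+1} \smallsetminus D_n$, and $t \leq s^-$, or
• $t \in D_{n+1} \smallsetminus D_n$, $s \in D_n$, and $t^+ \leq s$, or
• $t, s \in D_{n+1} \smallsetminus D_n$, and $t^+ \leq s^-$.

(In each case, one uses (3) combined with the inductive hypothesis.)

Having constructed the family $(V_t)_{t \in D}$, with properties (i) and (ii), we define the function $f : X \to [0, 1]$ by

$$f(x) = \begin{cases} \inf\{ t \in D : x \in V_t \}, & \text{if } x \in V_1 \\ 1, & \text{if } x \notin V_1 \end{cases}$$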

Claim 1: The function $f$ is equivalently defined by

$$f(x) = \begin{cases} 0, & \text{if } x \in \overline{V_0} \\ \sup\{ t \in D : x \notin \overline{V_t} \}, & \text{if } x \notin \overline{V_0} \end{cases} \tag{4}$$

Let us denote by $g : X \to [0, 1]$ the function defined by formula (4). Fix some point $x \in X$. We break the proof into several cases.

Case I: $x \in \overline{V_0}$.
In particular, using (ii) we get $x \in V_t$, for all $t \in D$ with $t > 0$, and since $x \in V_1$, we have

$$f(x) = \inf\{ t \in D : x \in V_t \} = \inf\{ t \in D : t > 0 \} = 0 = g(x).$$

Case II: $x \notin V_1$.
Using (ii) we have $x \notin \overline{V_t}$, for all $t \in D$ with $t < 1$, and since $x \notin \overline{V_0}$, we have

$$g(x) = \sup\{ t \in D : x \notin \overline{V_t} \} = \sup\{ t \in D : t < 1 \} = 1 = f(x).$$

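Case III: $x \in V_1 \smallsetminus \overline{V_0}$.
By the definition of $f(x)$ we know:

$$x \notin V_t, \ \forall\, t \in D \text{ with } t < f(x); \tag{5}$$

$$\forall\, \varepsilon > 0,\ \exists\, s_\varepsilon \in D \text{ with } f(x) \leq s_\varepsilon < f(x) + \varepsilon, \text{ such that } x \in V_{s_\varepsilon}. \tag{6}$$

By the definition of $g(x)$ we know:

$$x \in \overline{V_t}, \ \forall\, t \in D \text{ with } t > g(x); \tag{7}$$

$$\forall\, \varepsilon > 0,\ \exists\, r_\varepsilon \in D \text{ with } g(x) \geq r_\varepsilon > g(x) - \varepsilon, \text{ such that } x \notin \overline{V_{r_\varepsilon}}. \tag{8}$$

Using (6) and (8) we see that we must have

$$s_\varepsilon \geq r_\varepsilon, \ \forall\, \varepsilon > 0. \tag{9}$$

Indeed, if there exists some $\varepsilon > 0$ for which we have $s_\varepsilon < r_\varepsilon$, then using (6) and property (ii) we would have

$$x \in V_{s_\varepsilon} \subset \overline{V_{s_\varepsilon}} \subset V_{r_\varepsilon} \subset \overline{V_{r_\varepsilon}},$$

which contradicts (8).

Now the inequality (9) gives

$$f(x) + \varepsilon > g(x) - \varepsilon, \ \forall\, \varepsilon > 0,$$

so we have in fact the inequality

$$f(x) \geq g(x).$$

Suppose now this inequality is strict. Using (5) and (7) we will get

$$x \in \overline{V_t} \text{ and } x \notin V_t, \text{ for all } t \in D \text{ with } f(x) > t > g(x). \tag{10}$$

Using the fact that $D$ is dense in $[0, 1]$, we could then find at least two elements $t_1, t_2 \in D$ such that

$$f(x) > t_1 > t_2 > g(x).$$

In this case (10) immediately creates a contradiction, since by (ii)

$$x \in \overline{V_{t_2}} \subset V_{t_1}.$$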

Claim 2: The function $f$ is continuous.
Since any open set in $\mathbb{R}$ is a union of open intervals, it suffices to prove the following two properties¹:

(usc): $f^{-1}\big((-\infty, t)\big)$ is open for all $t \in \mathbb{R}$;
(lsc): $f^{-1}\big((t, \infty)\big)$ is open for all $t \in \mathbb{R}$.

In order to prove property (usc) it suffices to prove the equality

$$f^{-1}\big((-\infty, t)\big) = \bigcup_{\substack{s \in D \\ s < t}} V_s. \tag{11}$$

Start with a point $x \in f^{-1}\big((-\infty, t)\big)$, which means that $f(x) < t$. Using (6), there exists some $s \in D$ with $f(x) \leq s < t$, such that $x \in V_s$, so $x$ indeed belongs to the right hand side of (11). Conversely, if $x$ belongs to the right hand side of (11), there exists some $s < t$ such that $x \in V_s$. By the definition of $f(x)$, it follows that $f(x) \leq s < t$, so $x \in f^{-1}\big((-\infty, t)\big)$.

In order to prove property (lsc) it suffices to prove the equality

$$f^{-1}\big((t, \infty)\big) = \bigcup_{\substack{r \in D \\ r > t}} (X \smallsetminus \overline{V_r}). \tag{12}$$

Start with a point $x \in f^{-1}\big((t, \infty)\big)$, which means that $f(x) > t$. Using (8) together with Claim 1, there exists some $r \in D$ with $f(x) \geq r > t$, such that $x \notin \overline{V_r}$, that is, $x \in X \smallsetminus \overline{V_r}$, so $x$ indeed belongs to the right hand side of (12). Conversely, if $x$ belongs to the right hand side of (12), there exists some $r > t$ such that $x \in X \smallsetminus \overline{V_r}$, i.e. $x \notin \overline{V_r}$. By the equivalent definition of $f(x)$ given by Claim 1, it follows that $f(x) \geq r > t$, so $x \in f^{-1}\big((t, \infty)\big)$.

¹ The condition (usc) means that $f$ is upper semi-continuous, while the condition (lsc) means that $f$ is lower semi-continuous.

Having proven that $f$ is continuous, let us finish the proof. Since $A \subset V_0$, by the definition of $f$, we get $f|_A = 0$. Since $B \subset X \smallsetminus V_1$, again by the definition of $f$, we get $f|_B = 1$.

Definition. A Hausdorff space (X, T ) with property (n) is called normal.
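The recursive step in the proof of Theorem 1.1 relies on a purely arithmetic fact: every new index $t \in D_{n+1} \smallsetminus D_n$ has both neighbors $t^{\pm} = t \pm 1/2^{n+1}$ in $D_n$. The following sketch checks this with exact rational arithmetic and lists the order in which the sets $V_t$ get built; the function names are hypothetical, and only the index sets are modeled, not the open sets themselves.

```python
from fractions import Fraction

def dyadic_level(n):
    """D_n = { k / 2^n : 0 <= k <= 2^n }, as exact fractions."""
    return {Fraction(k, 2**n) for k in range(2**n + 1)}

def neighbors(t, n):
    """For t in D_{n+1} minus D_n, return the pair (t^-, t^+)."""
    step = Fraction(1, 2**(n + 1))
    return t - step, t + step

def construction_order(levels):
    """Indices t in the order in which the sets V_t are constructed."""
    order = [Fraction(0), Fraction(1)]            # V_0 and V_1 come first
    for n in range(levels):
        order.extend(sorted(dyadic_level(n + 1) - dyadic_level(n)))
    return order

# Check that the neighbors of every new index already belong to the previous level.
for n in range(5):
    previous = dyadic_level(n)
    for t in dyadic_level(n + 1) - previous:
        lo, hi = neighbors(t, n)
        assert lo in previous and hi in previous

print([str(t) for t in construction_order(2)])
```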

Lecture 2

2. Ultrafilters

In this lecture we discuss a set theoretical concept, which turns out to be technically useful in topology.

Definition. Suppose $X$ is a fixed (non-empty) set. A filter in $X$ is a (non-empty) family $\mathcal{F}$ of non-empty subsets of $X$ which has the property²:

(f) Whenever $F$ and $G$ belong to $\mathcal{F}$, it follows that $F \cap G$ also belongs to $\mathcal{F}$.

What is important here is that all the sets in the filter are assumed to be non-empty. The set of all filters in $X$ can be ordered by inclusion. A simple application of Zorn's Lemma yields:

• For each filter $\mathcal{F}$ there exists at least one maximal filter $\mathcal{U}$ with $\mathcal{U} \supset \mathcal{F}$.

Maximal filters will be called ultrafilters.

An interesting feature of ultrafilters is given by the following:

Lemma 2.1. Let $X$ be a non-empty set, and let $\mathcal{U}$ be a filter on $X$. The following are equivalent:
(i) $\mathcal{U}$ is an ultrafilter.
(ii) For any subset $A \subset X$, it follows that either $A$ or $X \smallsetminus A$ belongs to $\mathcal{U}$, but not both!

Proof. (i) ⇒ (ii). Assume $\mathcal{U}$ is an ultrafilter. First remark that $X$ always belongs to $\mathcal{U}$. (Otherwise, if $X$ does not belong to $\mathcal{U}$, the family $\mathcal{U} \cup \{X\}$ will obviously be a new filter, which will contradict the maximality of $\mathcal{U}$.)

Let us assume that $A$ is non-empty and it does not belong to $\mathcal{U}$. This means that the family

$$\mathcal{M} = \mathcal{U} \cup \{ A \cap U \mid U \in \mathcal{U} \}$$

is no longer a filter (otherwise, the maximality of $\mathcal{U}$ will be contradicted). Note that if $F$ and $G$ belong to $\mathcal{M}$, then automatically $F \cap G$ belongs to $\mathcal{M}$. This means that the only thing that can prevent $\mathcal{M}$ from being a filter must be the fact that one of the sets in $\mathcal{M}$ is empty. That is, there is some set $V \in \mathcal{U}$ such that $A \cap V = \varnothing$. In other words, $V \subset X \smallsetminus A$. But then, it follows that for any $U \in \mathcal{U}$ we have $U \cap (X \smallsetminus A) \supset U \cap V \neq \varnothing$, and then the set

$$\mathcal{N} = \mathcal{U} \cup \{ U \cap (X \smallsetminus A) \mid U \in \mathcal{U} \}$$

will be a filter. By maximality, it follows that $\mathcal{N} = \mathcal{U}$; in particular, $X \smallsetminus A$ belongs to $\mathcal{U}$. It is obvious that $A$ and $X \smallsetminus A$ cannot simultaneously belong to $\mathcal{U}$, because this will force $\varnothing = A \cap (X \smallsetminus A)$ to belong to $\mathcal{U}$.

² Some textbooks may use a slightly different definition.

(ii) ⇒ (i). Assume property (ii) holds, but $\mathcal{U}$ is not maximal, which means that there exists some ultrafilter $\mathcal{V}$ with $\mathcal{V} \supsetneq \mathcal{U}$. Pick then some set $A \in \mathcal{V} \smallsetminus \mathcal{U}$. Since $A \notin \mathcal{U}$, by (ii) we must have $X \smallsetminus A \in \mathcal{U}$. This would force both $A$ and $X \smallsetminus A$ to belong to $\mathcal{V}$, which is impossible.
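On a small finite set, Lemma 2.1 can be verified by brute force. The sketch below enumerates all filters in the sense of the definition used in these notes (a non-empty family of non-empty subsets closed under intersection), selects the maximal ones, and tests property (ii). All function names are hypothetical, and the exhaustive search is only feasible for very small sets.

```python
from itertools import combinations

def nonempty_subsets(X):
    X = list(X)
    return [frozenset(c) for r in range(1, len(X) + 1) for c in combinations(X, r)]

def is_filter(F):
    """Non-empty family of non-empty sets, closed under pairwise intersection."""
    return bool(F) and all(A and (A & B) in F for A in F for B in F)

def all_filters(X):
    subsets = nonempty_subsets(X)
    found = []
    for r in range(1, len(subsets) + 1):
        for family in combinations(subsets, r):
            family = frozenset(family)
            if is_filter(family):
                found.append(family)
    return found

def ultrafilters(X):
    """Filters that are not properly contained in another filter."""
    filters = all_filters(X)
    return [F for F in filters if not any(F < G for G in filters)]

def has_property_ii(U, X):
    """Exactly one of A and its complement belongs to U, for every subset A."""
    X = frozenset(X)
    subsets = [frozenset()] + nonempty_subsets(X)
    return all((A in U) != ((X - A) in U) for A in subsets)

def principal(x, X):
    """The family of all subsets of X that contain the point x."""
    return frozenset(A for A in nonempty_subsets(X) if x in A)

X = {0, 1, 2}
maximal = ultrafilters(X)
print(len(maximal), all(has_property_ii(U, X) for U in maximal))
```

In this finite example the maximal filters turn out to be exactly the families `principal(x, X)`, while a non-maximal filter such as the one containing only $X$ fails property (ii).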

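Exercise 1. Let $\mathcal{U}$ be an ultrafilter on $X$, and let $A \in \mathcal{U}$. Prove that the collection

$$\mathcal{U}|_A = \{ U \cap A : U \in \mathcal{U} \}$$

is an ultrafilter on $A$.

Remark 2.1. If $\mathcal{U}$ is an ultrafilter on $X$, and $A \in \mathcal{U}$, then $\mathcal{U}$ contains all sets $B$ with $A \subset B \subset X$. Indeed, if we start with such a $B$, then by the above result, either $B \in \mathcal{U}$ or $X \smallsetminus B \in \mathcal{U}$. Notice however that in the case $X \smallsetminus B \in \mathcal{U}$ we would get

$$\mathcal{U} \ni (X \smallsetminus B) \cap A = \varnothing,$$

which is impossible. Therefore $B$ must belong to $\mathcal{U}$.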

We are now in a position to define the notion of convergence for ultrafilters, by means of the following.

Proposition 2.1. Let $(X, \mathcal{T})$ be a topological space, let $\mathcal{U}$ be an ultrafilter in $X$, and let $x$ be a point in $X$. The following are equivalent:

(i) Every neighborhood of $x$ belongs to $\mathcal{U}$.
(ii) There exists a basic system $\mathcal{N}$ of neighborhoods of $x$, with $\mathcal{N} \subset \mathcal{U}$.
(iii) There exists a fundamental system $\mathcal{V}$ of neighborhoods of $x$, with $\mathcal{V} \subset \mathcal{U}$.

If the ultrafilter $\mathcal{U}$ satisfies one of the equivalent conditions above, we say that $\mathcal{U}$ is convergent to $x$, and we write $\mathcal{U} \to x$.

Proof. The implications (i) ⇒ (ii) ⇒ (iii) are obvious.

(iii) ⇒ (i). Let $\mathcal{V}$ be a fundamental system of neighborhoods of $x$, with $\mathcal{V} \subset \mathcal{U}$. Start with an arbitrary neighborhood $M$ of $x$. By the properties of $\mathcal{V}$, there exists a finite sequence $V_1, \dots, V_n \in \mathcal{V}$, with

$$x \in V_1 \cap \dots \cap V_n \subset M.$$

Since $\mathcal{V} \subset \mathcal{U}$, and $\mathcal{U}$ is a filter, it follows that the intersection $W = V_1 \cap \dots \cap V_n$ belongs to $\mathcal{U}$. By Remark 2.1 it follows that $M$ itself belongs to $\mathcal{U}$. Since $M$ was arbitrary, it follows that $\mathcal{U}$ indeed satisfies condition (i).

The Hausdorff property has a nice ultrafilter characterization:

Proposition 2.2. For a topological space $(X, \mathcal{T})$, the following are equivalent:

(i) The topology $\mathcal{T}$ is Hausdorff.
(ii) Every convergent ultrafilter in $X$ has a unique limit.

Proof. (i) ⇒ (ii). Assume the topology is Hausdorff. Let $\mathcal{U}$ be an ultrafilter in $X$ which is convergent to both $x$ and $y$. If $x \neq y$, then by the Hausdorff property, there exist two open sets $U, V \subset X$, with $x \in U$, $y \in V$, and $U \cap V = \varnothing$. Since $U$ is a neighborhood of $x$, we must have $U \in \mathcal{U}$. Likewise, we must have $V \in \mathcal{U}$. But this is impossible, since it will force $\mathcal{U} \ni U \cap V = \varnothing$.

(ii) ⇒ (i). Assume $X$ satisfies condition (ii), but the topology is not Hausdorff. This means that there exist two points $x, y \in X$, with $x \neq y$, such that

(∗) for any open sets $U, V \subset X$, with $U \ni x$ and $V \ni y$, we have $U \cap V \neq \varnothing$.

Let $\mathcal{N}_x$ denote the collection of all neighborhoods of $x$, and $\mathcal{N}_y$ denote the collection of all neighborhoods of $y$. By condition (∗) we have

$$M \cap N \neq \varnothing, \ \forall\, M \in \mathcal{N}_x,\ N \in \mathcal{N}_y.$$

This proves that the collection

$$\mathcal{F} = \{ M \cap N : M \in \mathcal{N}_x,\ N \in \mathcal{N}_y \}$$

is a filter in $X$. Notice that, since $X$ is a neighborhood for both $x$ and $y$, we have the inclusion $\mathcal{F} \supset \mathcal{N}_x \cup \mathcal{N}_y$. So if we take $\mathcal{U}$ to be an ultrafilter, with $\mathcal{U} \supset \mathcal{F}$, it follows that $\mathcal{U} \supset \mathcal{N}_x$, hence $\mathcal{U}$ converges to $x$, but also $\mathcal{U} \supset \mathcal{N}_y$, hence $\mathcal{U}$ is also convergent to $y$. By condition (ii) this is impossible.

Examples 2.1. A. Let $x$ be a point in $X$. We can consider the collection $\mathcal{U}_x = \{ U \subset X \mid U \ni x \}$. Clearly $\mathcal{U}_x$ is an ultrafilter in $X$. This is called a constant ultrafilter at $x$. If $(X, \mathcal{T})$ is a topological space, then it is obvious that $\mathcal{U}_x$ is convergent to $x$.

B. (Example of a convergent non-constant ultrafilter.) Suppose $(X, \mathcal{T})$ is a topological space and $x$ is a point in $X$ such that for any neighborhood $N$ of $x$, we have $N \smallsetminus \{x\} \neq \varnothing$. Consider the collection

$$\mathcal{F} = \{ N \smallsetminus \{x\} \mid N \text{ neighborhood of } x \}.$$

Then $\mathcal{F}$ is a filter. If we take $\mathcal{U}$ to be any ultrafilter which contains $\mathcal{F}$, we get a non-constant (sometimes called free) ultrafilter. It is again clear that $\mathcal{U}$ is convergent to $x$.

C. (Example of a non-convergent ultrafilter.) Let $\mathbb{N}$ be the set of non-negative integers. Equip $\mathbb{N}$ with the discrete topology (in which every subset is open). Consider the collection $\mathcal{F}$ consisting of all subsets $F \subset \mathbb{N}$ which have finite complement $\mathbb{N} \smallsetminus F$. It is easy to check that $\mathcal{F}$ is a filter. Pick then $\mathcal{U}$ to be any ultrafilter with $\mathcal{U} \supset \mathcal{F}$. Since on $\mathbb{N}$ we use the discrete topology, it follows that the only convergent ultrafilters are the constant ones. Note however, that if $n \in \mathbb{N}$, then the set $\mathbb{N} \smallsetminus \{n\}$ belongs to $\mathcal{F}$, hence to $\mathcal{U}$. This means that the singleton set $\{n\}$ cannot belong to $\mathcal{U}$. Therefore $\mathcal{U}$ cannot be constant.

Remark 2.2. Maps between sets can be put to act on ultrafilters. More explicitly one has the following construction. Suppose $f : X \to Y$ is a map and $\mathcal{U}$ is an ultrafilter in $X$. Consider the collection

$$f_*(\mathcal{U}) = \{ V \subset Y \mid f^{-1}(V) \in \mathcal{U} \}.$$

Then $f_*(\mathcal{U})$ is an ultrafilter on $Y$. Indeed, it is easy to show that $f_*(\mathcal{U})$ is a filter. To prove that it is maximal, let us take a filter $\mathcal{F}$ on $Y$ with $\mathcal{F} \supset f_*(\mathcal{U})$ and let us consider an arbitrary set $F$ which belongs to $\mathcal{F}$. Since $\mathcal{U}$ is an ultrafilter on $X$, it follows that either $f^{-1}(F)$ or $X \smallsetminus f^{-1}(F)$ belongs to $\mathcal{U}$. If $X \smallsetminus f^{-1}(F)$ belongs to $\mathcal{U}$, using the equality $X \smallsetminus f^{-1}(F) = f^{-1}(Y \smallsetminus F)$ it follows that $Y \smallsetminus F$ belongs to $f_*(\mathcal{U})$, hence to $\mathcal{F}$. But this is impossible, since $F$ also belongs to $\mathcal{F}$ and this will force the empty set $F \cap (Y \smallsetminus F)$ to belong to the filter $\mathcal{F}$. This contradiction shows that the set $f^{-1}(F)$ belongs to $\mathcal{U}$, which means precisely that $F$ belongs to $f_*(\mathcal{U})$. This argument proves the inclusion $\mathcal{F} \subset f_*(\mathcal{U})$, so $f_*(\mathcal{U})$ is indeed a maximal filter.
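The construction $f_*(\mathcal{U})$ can be illustrated on finite sets, where the constant ultrafilters are easy to write down. The sketch below assumes the map $f$ is given as a Python dictionary; the names `principal`, `preimage`, and `pushforward` are hypothetical. In this setting one expects the push-forward of the constant ultrafilter at a point $x$ to be the constant ultrafilter at $f(x)$.

```python
from itertools import combinations

def powerset(X):
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def principal(x, X):
    """The constant ultrafilter at x: all subsets of X containing x."""
    return {U for U in powerset(X) if x in U}

def preimage(f, V, X):
    """f^{-1}(V) for a map f given as a dict on X."""
    return frozenset(x for x in X if f[x] in V)

def pushforward(f, U, X, Y):
    """f_*(U): all subsets V of Y whose preimage belongs to U."""
    return {V for V in powerset(Y) if preimage(f, V, X) in U}

X = {0, 1, 2}
Y = {"a", "b"}
f = {0: "a", 1: "b", 2: "b"}

image_filter = pushforward(f, principal(1, X), X, Y)
print(image_filter == principal("b", Y))
```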

Remark 2.3. With the above notations, one has

$$f(U) \in f_*(\mathcal{U}), \ \forall\, U \in \mathcal{U}.$$

One can prove this property by contradiction. Assume $f(U)$ does not belong to $f_*(\mathcal{U})$, for some $U \in \mathcal{U}$. Then $Y \smallsetminus f(U)$ belongs to $f_*(\mathcal{U})$, which means that the set

$$M = f^{-1}\big(Y \smallsetminus f(U)\big) = X \smallsetminus f^{-1}\big(f(U)\big)$$

belongs to $\mathcal{U}$. But using the obvious inclusion $U \subset f^{-1}\big(f(U)\big)$, this gives $M \cap U = \varnothing$, which is impossible.

Continuity can be nicely characterized using ultrafilters:

Proposition 2.3. Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces, and let $x$ be an element in $X$. For a function $f : X \to Y$, the following are equivalent:
(i) $f$ is continuous at $x$.
(ii) Whenever $\mathcal{U}$ is an ultrafilter on $X$ convergent to $x$, it follows that the ultrafilter $f_*(\mathcal{U})$ in $Y$ is convergent to $f(x)$.

Proof. (i) ⇒ (ii). Assume that $f$ is continuous at $x$. Start with an ultrafilter $\mathcal{U}$ on $X$, with $\mathcal{U} \to x$. Let $N$ be an arbitrary neighborhood of $f(x)$. Since $f$ is continuous at $x$, it follows that $f^{-1}(N)$ is a neighborhood of $x$. In particular we get $f^{-1}(N) \in \mathcal{U}$, which proves that $N \in f_*(\mathcal{U})$. Since the ultrafilter $f_*(\mathcal{U})$ contains all neighborhoods of $f(x)$, it means that indeed $f_*(\mathcal{U})$ is convergent to $f(x)$.

(ii) ⇒ (i). Assume $f$ satisfies condition (ii), but $f$ is not continuous at $x$. This means that there exists some neighborhood $V$ of $f(x)$ such that $f^{-1}(V)$ is not a neighborhood of $x$. Consider the collection

$$\mathcal{F} = \{ N \smallsetminus f^{-1}(V) : N \text{ neighborhood of } x \}.$$

Our assumption on $V$ shows that all the sets in $\mathcal{F}$ are non-empty. (Otherwise $f^{-1}(V)$ would contain some neighborhood of $x$, which would force $f^{-1}(V)$ itself to be a neighborhood of $x$.) It is also clear that $\mathcal{F}$ is a filter. Let $\mathcal{U}$ be an ultrafilter with $\mathcal{U} \supset \mathcal{F}$.

Claim: The ultrafilter $\mathcal{U}$ is convergent to $x$.
To prove this, start with some arbitrary neighborhood $N$ of $x$. If $N$ does not belong to $\mathcal{U}$, then $X \smallsetminus N$ belongs to $\mathcal{U}$. But then $(X \smallsetminus N) \cap (N \smallsetminus f^{-1}(V)) = \varnothing$ belongs to $\mathcal{U}$, which is impossible. So $\mathcal{U}$ contains all neighborhoods of $x$, which means that indeed $\mathcal{U}$ is convergent to $x$.

Since $V$ is a neighborhood of $f(x)$, the Claim combined with condition (ii) gives $V \in f_*(\mathcal{U})$, which means that $f^{-1}(V) \in \mathcal{U}$. But this leads to a contradiction, since $X \smallsetminus f^{-1}(V)$ clearly belongs to $\mathcal{F} \subset \mathcal{U}$.

Lecture 3

3. Constructing topologies

In this section we discuss several methods for constructing topologies on a given set.

Definition. If $\mathcal{T}$ and $\mathcal{T}'$ are two topologies on the same space $X$, such that $\mathcal{T}' \subset \mathcal{T}$ (as sets), then $\mathcal{T}$ is said to be stronger than $\mathcal{T}'$. Equivalently, we will say that $\mathcal{T}'$ is weaker than $\mathcal{T}$.

Remark that this condition is equivalent to the continuity of the map

$$\operatorname{Id} : (X, \mathcal{T}) \to (X, \mathcal{T}').$$

Comment. Given a (non-empty) set $X$, and a collection $\mathcal{S}$ of subsets of $X$, one can ask the following:

Question 1: Is there a topology on $X$ with respect to which all the sets in $\mathcal{S}$ are open?

Of course, this question has an affirmative answer, since we can take as the topology the collection of all subsets of $X$. Therefore the above question is more meaningful if stated as:

Question 2: Is there a weakest topology on $X$ with respect to which all the sets in $\mathcal{S}$ are open?

The answer to this question is again affirmative, and it is based on the following:

Remark 3.1. If $X$ is a non-empty set, and $(\mathcal{T}_i)_{i \in I}$ is a family of topologies on $X$, then the intersection

$$\bigcap_{i \in I} \mathcal{T}_i$$

is again a topology on $X$.

In particular, if one starts with an arbitrary family $\mathcal{S}$ of subsets of $X$, and if we take

$$\Theta(\mathcal{S}) = \{ \mathcal{T} : \mathcal{T} \text{ topology on } X \text{ with } \mathcal{T} \supset \mathcal{S} \},$$

then the intersection

$$\operatorname{top}(\mathcal{S}) = \bigcap_{\mathcal{T} \in \Theta(\mathcal{S})} \mathcal{T}$$

is the weakest (i.e. smallest) among all topologies with respect to which all sets in $\mathcal{S}$ are open.

The topology $\operatorname{top}(\mathcal{S})$ defined above can also be described constructively as follows.

Proposition 3.1. Let $\mathcal{S}$ be a collection of subsets of $X$. Then the sets in $\operatorname{top}(\mathcal{S})$ which are proper subsets of $X$ are those which can be written as (arbitrary) unions of finite intersections of sets in $\mathcal{S}$.

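Proof. It is useful to introduce the following notations. First we define $\mathcal{V}(\mathcal{S})$ to be the collection of all sets which are finite intersections of sets in $\mathcal{S}$. In other words,

$$B \in \mathcal{V}(\mathcal{S}) \iff \exists\, D_1, \dots, D_n \in \mathcal{S} \text{ such that } D_1 \cap \dots \cap D_n = B.$$

With the above notation, what we need to prove is that for a set $A \subsetneq X$, we have

$$A \in \operatorname{top}(\mathcal{S}) \iff \exists\, \mathcal{V}_A \subset \mathcal{V}(\mathcal{S}) \text{ such that } A = \bigcup_{B \in \mathcal{V}_A} B.$$

The implication “⇐” is pretty obvious. Since $\operatorname{top}(\mathcal{S})$ is a topology, and every set in $\mathcal{S}$ is open with respect to $\operatorname{top}(\mathcal{S})$, it follows that every finite intersection of sets in $\mathcal{S}$ is again in $\operatorname{top}(\mathcal{S})$, which means that every set in $\mathcal{V}(\mathcal{S})$ is again open with respect to $\operatorname{top}(\mathcal{S})$. But then arbitrary unions of sets in $\mathcal{V}(\mathcal{S})$ are again open with respect to $\operatorname{top}(\mathcal{S})$.

To prove the implication “⇒” we define

$$\mathcal{T}_0 = \Big\{ A \subset X : \exists\, \mathcal{V}_A \subset \mathcal{V}(\mathcal{S}) \text{ such that } A = \bigcup_{B \in \mathcal{V}_A} B \Big\},$$

and we will show that

$$\operatorname{top}(\mathcal{S}) \subset \{X\} \cup \mathcal{T}_0. \tag{1}$$

By the definition of $\operatorname{top}(\mathcal{S})$ it suffices to prove the following

Claim: The collection $\mathcal{T}_1 = \{X\} \cup \mathcal{T}_0$ is a topology on $X$, which contains all the sets in $\mathcal{S}$.

The fact that $\mathcal{T}_1 \supset \mathcal{S}$ is trivial.

The fact that $\varnothing, X \in \mathcal{T}_1$ is also clear.

The fact that arbitrary unions of sets in $\mathcal{T}_1$ again belong to $\mathcal{T}_1$ is again clear, by construction.

Finally, we need to show that if $A_1, A_2 \in \mathcal{T}_1$, then $A_1 \cap A_2 \in \mathcal{T}_1$. If either $A_1 = X$ or $A_2 = X$, there is nothing to prove. Assume that both $A_1$ and $A_2$ are proper subsets of $X$, so there are subsets $\mathcal{V}_1, \mathcal{V}_2 \subset \mathcal{V}(\mathcal{S})$, such that

$$A_1 = \bigcup_{B \in \mathcal{V}_1} B \text{ and } A_2 = \bigcup_{E \in \mathcal{V}_2} E.$$

Then it is clear that

$$A_1 \cap A_2 = \bigcup_{\substack{B \in \mathcal{V}_1 \\ E \in \mathcal{V}_2}} (B \cap E),$$

with all the sets $B \cap E$ in $\mathcal{V}(\mathcal{S})$, so $A_1 \cap A_2$ indeed belongs to $\mathcal{T}_1$.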
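Proposition 3.1 gives a recipe that is easy to run on a finite set: form the finite intersections $\mathcal{V}(\mathcal{S})$, take all unions, and add $X$. The sketch below compares this with the definition of $\operatorname{top}(\mathcal{S})$ as the intersection of all topologies containing $\mathcal{S}$, found by exhaustive search. The function names are hypothetical, and the brute-force comparison is only practical for sets with three or four points.

```python
from itertools import combinations

def finite_intersections(S):
    """V(S): all intersections of finitely many (at least one) sets in S."""
    S = [frozenset(A) for A in S]
    return {frozenset.intersection(*c)
            for r in range(1, len(S) + 1) for c in combinations(S, r)}

def top(X, S):
    """Constructive description: X together with all unions of sets in V(S)."""
    V = list(finite_intersections(S))
    T = {frozenset(X), frozenset()}          # the empty set is the empty union
    for r in range(1, len(V) + 1):
        for c in combinations(V, r):
            T.add(frozenset().union(*c))
    return T

def is_topology(X, T):
    X = frozenset(X)
    return (frozenset() in T and X in T
            and all(U | V in T and U & V in T for U in T for V in T))

def top_by_intersection(X, S):
    """Intersection of all topologies on X that contain S (exhaustive search)."""
    S = {frozenset(A) for A in S}
    subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
    result = set(subsets)                    # start from the discrete topology
    for r in range(len(subsets) + 1):
        for family in combinations(subsets, r):
            T = set(family)
            if S <= T and is_topology(X, T):
                result &= T
    return result

X = {0, 1, 2}
S = [{0, 1}, {1, 2}]
print(sorted(sorted(A) for A in top(X, S)))
```

For this sub-base both descriptions produce the same five open sets.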

Definition. Let $X$ be a (non-empty) set, let $\mathcal{T}$ be a topology on $X$. A collection $\mathcal{S}$ of subsets of $X$, with the property that

$$\mathcal{T} = \operatorname{top}(\mathcal{S}),$$

is called a sub-base for $\mathcal{T}$. According to the above remark, the above condition is equivalent to the fact that every open set $D \subsetneq X$ can be written as a union of finite intersections of sets in $\mathcal{S}$.

Convergence of ultrafilters is characterized using sub-bases as follows:

Proposition 3.2. Let $(X, \mathcal{T})$ be a topological space, let $\mathcal{S}$ be a sub-base for $\mathcal{T}$, and let $x$ be some point in $X$. For an ultrafilter $\mathcal{U}$ on $X$, the following are equivalent:

(i) $\mathcal{U}$ is convergent to $x$;
(ii) $\mathcal{U}$ contains all the sets $S \in \mathcal{S}$ with $S \ni x$.

Proof. The implication (i) ⇒ (ii) is trivial.

To prove the implication (ii) ⇒ (i), we assume $\mathcal{U}$ has property (ii), we consider some neighborhood $N$ of $x$, and let us prove that $N$ belongs to $\mathcal{U}$. Since $N$ is a neighborhood of $x$, there exists some open set $D$, with $x \in D \subset N$. Furthermore, by Proposition 3.1, either

(a) $D = X$, or
(b) there exist sets $S_1, S_2, \dots, S_n \in \mathcal{S}$ with

$$x \in S_1 \cap S_2 \cap \dots \cap S_n \subset D \subset N.$$

In case (a) we immediately have $N = X$, and we obviously get $N \in \mathcal{U}$. In case (b) it follows that $S_1, \dots, S_n \in \mathcal{U}$, so the intersection $S_1 \cap S_2 \cap \dots \cap S_n$ also belongs to $\mathcal{U}$. By Remark 2.1 it then follows that $N$ itself belongs to $\mathcal{U}$.

There are instances when sub-bases have a particular feature, which enables one to describe all open sets in an easier fashion.

Proposition 3.3. Let $(X, \mathcal{T})$ be a topological space. Suppose $\mathcal{V}$ is a collection of subsets of $X$. The following are equivalent:

(i) $\mathcal{V}$ is a sub-base for $\mathcal{T}$, and

$$\forall\, U, V \in \mathcal{V} \text{ and } x \in U \cap V,\ \exists\, W \in \mathcal{V} \text{ with } x \in W \subset U \cap V. \tag{2}$$

(ii) Every open set $A \subsetneq X$ is a union of sets in $\mathcal{V}$.

Proof. (i) ⇒ (ii). From property (i), it follows that every finite intersection of sets in $\mathcal{V}$ is a union of sets in $\mathcal{V}$. Then the desired implication is immediate from the previous result.

(ii) ⇒ (i). Assume (ii) and start with two sets $U, V \in \mathcal{V}$, and an element $x \in U \cap V$. Since $U \cap V$ is open, by (ii) either we have $U \cap V = X$, in which case we get $U = V = X$, and we take $W = X$, or $U \cap V \subsetneq X$, in which case $U \cap V$ is a union of sets in $\mathcal{V}$, so in particular there exists $W \in \mathcal{V}$ with $x \in W \subset U \cap V$.

Definition. If $(X, \mathcal{T})$ is a topological space, a collection $\mathcal{V}$ which satisfies the above equivalent conditions is called a base for $\mathcal{T}$.

The following is a useful technical result.

Lemma 3.1. Let $(Y, \mathcal{T})$ be a topological space, let $X$ be some (non-empty) set, and let $f : X \to Y$ be a function. Then the collection

$$f^*(\mathcal{T}) = \{ f^{-1}(D) : D \in \mathcal{T} \}$$

is a topology on $X$. Moreover, $f^*(\mathcal{T})$ is the weakest topology on $X$ with respect to which the map $f$ is continuous.

Proof. Clearly $\varnothing = f^{-1}(\varnothing)$ and $X = f^{-1}(Y)$ both belong to $f^*(\mathcal{T})$. If $(A_i)_{i \in I}$ is a family of sets in $f^*(\mathcal{T})$, say $A_i = f^{-1}(D_i)$, for some $D_i \in \mathcal{T}$, for all $i \in I$, then the equality

$$\bigcup_{i \in I} A_i = \bigcup_{i \in I} f^{-1}(D_i) = f^{-1}\Big( \bigcup_{i \in I} D_i \Big)$$

clearly shows that $\bigcup_{i \in I} A_i$ again belongs to $f^*(\mathcal{T})$. Likewise, if $A_1, A_2 \in f^*(\mathcal{T})$, say $A_1 = f^{-1}(D_1)$ and $A_2 = f^{-1}(D_2)$ for some $D_1, D_2 \in \mathcal{T}$, then the equality

$$A_1 \cap A_2 = f^{-1}(D_1) \cap f^{-1}(D_2) = f^{-1}(D_1 \cap D_2)$$

proves that $A_1 \cap A_2$ again belongs to $f^*(\mathcal{T})$.

Having proven that $f^*(\mathcal{T})$ is a topology on $X$, let us prove now the second statement. The fact that $f$ is continuous with respect to $f^*(\mathcal{T})$ is clear by construction. If $\mathcal{T}'$ is another topology which still makes $f$ continuous, then this will force all the sets of the form $f^{-1}(D)$, $D \in \mathcal{T}$, to belong to $\mathcal{T}'$, which means that $f^*(\mathcal{T}) \subset \mathcal{T}'$.
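The collection $f^*(\mathcal{T})$ from Lemma 3.1 is obtained by taking preimages of the open sets of $Y$. A small sketch, assuming finite sets and a map encoded as a dictionary (the names `preimage` and `pullback_topology` are illustrative, not from the notes):

```python
def preimage(f, D, X):
    """f^{-1}(D) for a map f: X -> Y given as a dict."""
    return frozenset(x for x in X if f[x] in D)

def pullback_topology(f, X, T):
    """f^*(T): the preimages of all open sets D in T."""
    return {preimage(f, D, X) for D in T}

X = {0, 1, 2, 3}
f = {0: "a", 1: "a", 2: "b", 3: "c"}
T_Y = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]

T_X = pullback_topology(f, X, T_Y)
print(sorted(sorted(A) for A in T_X))
```

Since the points 0 and 1 have the same image, no set in the resulting topology separates them, which reflects the fact that $f^*(\mathcal{T})$ contains only as many open sets as continuity of $f$ requires.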

Remark 3.2. Using the above notations, if $\mathcal{V}$ is a (sub)base for $\mathcal{T}$, then

$$f^*(\mathcal{V}) = \{ f^{-1}(V) : V \in \mathcal{V} \}$$

is a (sub)base for $f^*(\mathcal{T})$. This is pretty obvious since the correspondence

$$\{ \text{subsets of } Y \} \ni D \longmapsto f^{-1}(D)$$

is compatible with the operations of intersection and union (of arbitrary families).

Remark 3.3. As a consequence of the above remark, we see that (sub)bases can be useful for verifying continuity. More specifically, if $(X, \mathcal{T})$ and $(X', \mathcal{T}')$ are topological spaces, and $\mathcal{V}$ is a sub-base for $\mathcal{T}'$, then a function $f : X \to X'$ is continuous if and only if $f^{-1}(V)$ is open, for all $V \in \mathcal{V}$.
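Remark 3.3 reduces the continuity test of Proposition 1.3 to the sets of a sub-base. The following sketch runs both tests on a finite example and shows them agreeing; the topologies, the maps `f` and `g`, and the function names are all hypothetical choices made for illustration.

```python
def preimage(f, V, X):
    """f^{-1}(V) for a map f given as a dict on X."""
    return frozenset(x for x in X if f[x] in V)

def is_continuous(f, X, T_X, T_Y):
    """Proposition 1.3 (ii): the preimage of every open set is open."""
    T_X = {frozenset(U) for U in T_X}
    return all(preimage(f, V, X) in T_X for V in T_Y)

def is_continuous_on_subbase(f, X, T_X, subbase):
    """Remark 3.3: it is enough to test the sets of a sub-base of the target."""
    T_X = {frozenset(U) for U in T_X}
    return all(preimage(f, V, X) in T_X for V in subbase)

X = {0, 1, 2}
T_X = [set(), {0}, {0, 1}, {0, 1, 2}]

# Target space: the topology generated by the sub-base {a, b}, {b, c}.
subbase = [{"a", "b"}, {"b", "c"}]
T_Y = [set(), {"b"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}]

f = {0: "b", 1: "b", 2: "a"}     # preimages of both sub-base sets are open
g = {0: "a", 1: "b", 2: "c"}     # the preimage of {b, c} is {1, 2}, which is not open

print(is_continuous(f, X, T_X, T_Y), is_continuous_on_subbase(f, X, T_X, subbase))
print(is_continuous(g, X, T_X, T_Y), is_continuous_on_subbase(g, X, T_X, subbase))
```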

The construction outlined in Lemma 3.1 can be generalized as follows.

Proposition 3.4. Let $X$ be a set, and let $\Phi = (f_i, Y_i)_{i \in I}$ be a family consisting of maps $f_i : X \to Y_i$, where $Y_i$ is a topological space, for all $i \in I$. Then there is a unique topology $\mathcal{T}^\Phi$ on $X$, with the following properties:

(i) Each of the maps $f_i : X \to Y_i$, $i \in I$, is continuous with respect to $\mathcal{T}^\Phi$.
(ii) Given a topological space $(Z, \mathcal{S})$, and a map $g : Z \to X$, such that the composition $f_i \circ g : Z \to Y_i$ is continuous, for every $i \in I$, it follows that $g$ is continuous as a map $(Z, \mathcal{S}) \to (X, \mathcal{T}^\Phi)$.

Proof. For every $i \in I$ we define

$$\mathcal{D}_i = \{ f_i^{-1}(D) : D \text{ open subset of } Y_i \},$$

and we form the collection

$$\mathcal{D} = \bigcup_{i \in I} \mathcal{D}_i.$$

Take $\mathcal{T}^\Phi = \operatorname{top}(\mathcal{D})$. Property (i) follows from the simple observation that, by construction, every set in $\mathcal{D}$ is open.

To prove property (ii) start with a topological space $(Z, \mathcal{S})$, and a map $g : Z \to X$, such that the composition $f_i \circ g : Z \to Y_i$ is continuous, for every $i \in I$, and let us prove that $g$ is continuous. By Remark 3.3 it suffices to prove that $g^{-1}(A)$ is open (in $Z$) for every $A \in \mathcal{D}$. By the definition of $\mathcal{D}$ this is equivalent to proving the fact that, for each $i \in I$, and each open set $D \subset Y_i$, the set $g^{-1}\big(f_i^{-1}(D)\big)$ is open. But this is obvious, since we have

$$g^{-1}\big(f_i^{-1}(D)\big) = (f_i \circ g)^{-1}(D),$$

and $f_i \circ g : Z \to Y_i$ is continuous.

To prove the uniqueness, let $\mathcal{T}$ be another topology on $X$ with properties (i) and (ii). Consider the map $h = \operatorname{Id} : (X, \mathcal{T}) \to (X, \mathcal{T}^\Phi)$. Using property (i) for $\mathcal{T}$, combined with property (ii) for $\mathcal{T}^\Phi$, it follows that $h$ is continuous, which means that $\mathcal{T}^\Phi \subset \mathcal{T}$. Reversing the roles, and arguing exactly the same way, we also get the other inclusion $\mathcal{T} \subset \mathcal{T}^\Phi$.

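Remark 3.4. Using the above setting, assume that for each $i \in I$ a sub-base $\mathcal{S}_i$ for the topology of $Y_i$ is given. Consider the sets $f_i^* \mathcal{S}_i = \{ f_i^{-1}(S) : S \in \mathcal{S}_i \}$. Then the collection

$$\mathcal{S} = \bigcup_{i \in I} f_i^* \mathcal{S}_i$$

is a sub-base for the topology $\mathcal{T}^\Phi$.

To prove this, we take $\mathcal{T} = \operatorname{top}(\mathcal{S})$, so that we obviously have the inclusion $\mathcal{T} \subset \mathcal{T}^\Phi$. In order to prove the equality $\mathcal{T} = \mathcal{T}^\Phi$, all we have to prove are (using the notations from the proof of the above Proposition) the inclusions

$$\mathcal{D}_i \subset \mathcal{T}, \ \forall\, i \in I.$$

By construction however, we have $\mathcal{D}_i = f_i^* \mathcal{T}_i$, where $\mathcal{T}_i$ denotes the topology of $Y_i$, and since $\mathcal{S}_i$ is a sub-base for $\mathcal{T}_i$, it follows that $f_i^* \mathcal{S}_i$ is a sub-base for $\mathcal{D}_i$, which means that we have

$$\mathcal{D}_i = \operatorname{top}(f_i^* \mathcal{S}_i) \subset \operatorname{top}(\mathcal{S}) = \mathcal{T}.$$

Comment. Using the notations above, it is immediate that the topology $\mathcal{T}^\Phi$ can also be described as the weakest topology on $X$ with respect to which all the maps $f_i : X \to Y_i$, $i \in I$, are continuous. In the light of this remark, we will call the topology $\mathcal{T}^\Phi$ the weak topology defined by $\Phi$.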

Convergence for ultrafilters can be nicely characterized:

Proposition 3.5. Let $X$ be a set, and let $\Phi = (f_i, Y_i)_{i \in I}$ be a family consisting of maps $f_i : X \to Y_i$, where $Y_i$ is a topological space, for all $i \in I$. For an ultrafilter $\mathcal{U}$ on $X$ and a point $x \in X$, the following are equivalent:

(i) $\mathcal{U}$ is convergent to $x$, with respect to the topology $\mathcal{T}^\Phi$;
(ii) for every $i \in I$, the ultrafilter $f_{i*}(\mathcal{U})$ is convergent to $f_i(x)$.

Proof. (i) ⇒ (ii). This implication is clear, since all maps $f_i : (X, \mathcal{T}^\Phi) \to (Y_i, \mathcal{T}_i)$, $i \in I$, are continuous.

(ii) ⇒ (i). Suppose $\mathcal{U}$ satisfies (ii). Then for every $i \in I$, the ultrafilter $f_{i*}(\mathcal{U})$ contains all the open sets $D \subset Y_i$ with $D \ni f_i(x)$. This means that $f_i^{-1}(D) \in \mathcal{U}$. But by construction, the topology $\mathcal{T}^\Phi$ has

$$\mathcal{S} = \bigcup_{i \in I} \{ f_i^{-1}(D) : D \subset Y_i \text{ open} \}$$

as a sub-base, so if we define

$$\mathcal{S}_x = \{ S \in \mathcal{S} : S \ni x \},$$

we clearly have $\mathcal{U} \supset \mathcal{S}_x$. Then the fact that $\mathcal{U}$ converges to $x$ follows from Proposition 3.2.

Example 3.1. (The product topology) Suppose we have a family $(X_i, \mathcal{T}_i)$, $i \in I$, of topological spaces. Consider the Cartesian product $X = \prod_{i \in I} X_i$. For each $j \in I$ we consider the projection $\pi_j : X \to X_j$. The weak topology on $X$ defined by the family $\Phi = (\pi_j, X_j)_{j \in I}$ is called the product topology.

A sub-base for the product topology can be defined as follows. For each $i \in I$, we choose a sub-base $\mathcal{S}_i$ for $\mathcal{T}_i$ (for instance we can take $\mathcal{S}_i = \mathcal{T}_i$), and we take

$$\mathcal{S} = \bigcup_{i \in I} \pi_i^* \mathcal{S}_i = \bigcup_{i \in I} \{ \pi_i^{-1}(D) : D \in \mathcal{S}_i \}.$$

Then $\mathcal{S}$ is a sub-base for the product topology.

For a point $x = (x_i)_{i \in I} \in X$, and an ultrafilter $\mathcal{U}$ on $X$, the condition $\mathcal{U} \to x$ is equivalent to the fact that $\pi_{i*}(\mathcal{U}) \to x_i$, $\forall\, i \in I$.

Another method of constructing topologies is based on the following “dual” version of Lemma 3.1.

Lemma 3.2. Let $(Y, \mathcal{T})$ be a topological space, let $X$ be some (non-empty) set, and let $f : Y \to X$ be a function. Then the collection

$$f_*(\mathcal{T}) = \{ D \subset X : f^{-1}(D) \in \mathcal{T} \}$$

is a topology on $X$. Moreover, $f_*(\mathcal{T})$ is the strongest topology on $X$ with respect to which the map $f$ is continuous.

Proof. Since $f^{-1}(\varnothing) = \varnothing$ and $f^{-1}(X) = Y$, it follows that $\varnothing$ and $X$ both belong to $f_*(\mathcal{T})$. If $(A_i)_{i \in I}$ is a family of sets in $f_*(\mathcal{T})$, then the sets $f^{-1}(A_i)$, $i \in I$, belong to $\mathcal{T}$. In particular the set

$$f^{-1}\Big( \bigcup_{i \in I} A_i \Big) = \bigcup_{i \in I} f^{-1}(A_i)$$

will again belong to $\mathcal{T}$, which means that $\bigcup_{i \in I} A_i$ belongs to $f_*(\mathcal{T})$. Likewise, if $A_1, A_2 \in f_*(\mathcal{T})$, then the sets $f^{-1}(A_1)$ and $f^{-1}(A_2)$ both belong to $\mathcal{T}$. The intersection

$$f^{-1}(A_1 \cap A_2) = f^{-1}(A_1) \cap f^{-1}(A_2)$$

will then belong to $\mathcal{T}$, which means that $A_1 \cap A_2$ again belongs to $f_*(\mathcal{T})$.

Having proven that $f_*(\mathcal{T})$ is a topology on $X$, let us prove now the second statement. The fact that $f$ is continuous with respect to $f_*(\mathcal{T})$ is clear by construction. If $\mathcal{T}'$ is another topology on $X$ which still makes $f$ continuous, then this will force all the sets of the form $f^{-1}(A)$, $A \in \mathcal{T}'$, to belong to $\mathcal{T}$, which means that every $A \in \mathcal{T}'$ will in fact belong to $f_*(\mathcal{T})$. In other words, we have the inclusion $\mathcal{T}' \subset f_*(\mathcal{T})$.
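The construction in Lemma 3.2 can be carried out by exhaustive search when both sets are finite. The following is a minimal sketch under that assumption; the function names (`powerset`, `preimage`, `pushforward_topology`, `is_topology`) and the three-point example are hypothetical and serve only to illustrate the formula for $f_*(\mathcal{T})$.

```python
from itertools import chain, combinations

def powerset(points):
    """All subsets of a finite collection, as frozensets."""
    pts = list(points)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(pts, k) for k in range(len(pts) + 1))]

def preimage(f, domain, subset):
    """f^{-1}(subset) for a map f given as a dict defined on `domain`."""
    return frozenset(y for y in domain if f[y] in subset)

def pushforward_topology(f, Y, T, X):
    """f_*(T) = { D subset of X : f^{-1}(D) belongs to T }."""
    return {D for D in powerset(X) if preimage(f, Y, D) in T}

def is_topology(X, opens):
    """Check the topology axioms; pairwise closure suffices on a finite set."""
    if frozenset() not in opens or frozenset(X) not in opens:
        return False
    return all((A | B) in opens and (A & B) in opens
               for A in opens for B in opens)

# Hypothetical example: Y = {0, 1, 2} with open sets {}, {0}, {0, 1}, Y,
# and a map f onto the two-point set X = {"a", "b"}.
Y = {0, 1, 2}
T = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset(Y)}
X = {"a", "b"}
f = {0: "a", 1: "b", 2: "b"}

pushed = pushforward_topology(f, Y, T, X)
```

Here `{"a"}` is open in the pushed-forward topology because its preimage `{0}` is open in `Y`, whereas `{"b"}` is not, since its preimage `{1, 2}` is not open.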

A generalization of the above construction is given in the following.

Proposition 3.6. Let $X$ be a set, and let $\Phi = (f_i, Y_i)_{i \in I}$ be a family consisting of maps $f_i : Y_i \to X$, where $Y_i$ is a topological space, for all $i \in I$. Then there is a unique topology $\mathcal{T}_\Phi$ on $X$ with the following properties:

(i) Each of the maps $f_i : Y_i \to X$, $i \in I$, is continuous with respect to $\mathcal{T}_\Phi$.
(ii) Given a topological space $(Z, \mathcal{S})$, and a map $g : X \to Z$ such that the composition $g \circ f_i : Y_i \to Z$ is continuous for every $i \in I$, it follows that $g$ is continuous as a map $(X, \mathcal{T}_\Phi) \to (Z, \mathcal{S})$.

Proof. For each $i \in I$, let $\mathcal{T}_i$ denote the topology on $Y_i$. We define

$$\mathcal{T}_\Phi = \bigcap_{i \in I} f_{i*}(\mathcal{T}_i).$$

Property (i) is obvious by construction.


To prove property (ii), start with some topological space $(Z, \mathcal{S})$ and a map $g : X \to Z$ such that $g \circ f_i : Y_i \to Z$ is continuous for all $i \in I$. Take some open set $D \subset Z$, and let us prove that the set $A = g^{-1}(D)$ is open in $X$, i.e. $A \in \mathcal{T}_\Phi$. Notice that, for each $i \in I$, one has

$$f_i^{-1}(A) = f_i^{-1}\big( g^{-1}(D) \big) = (g \circ f_i)^{-1}(D),$$

so using the continuity of $g \circ f_i$ we get the fact that $f_i^{-1}(A)$ is open in $Y_i$, which means that $A \in f_{i*}(\mathcal{T}_i)$. Since this is true for all $i \in I$, we then get $A \in \mathcal{T}_\Phi$.

To prove uniqueness, let $\mathcal{T}$ be another topology on $X$ with properties (i) and (ii). Consider the map $h = \mathrm{Id} : (X, \mathcal{T}) \to (X, \mathcal{T}_\Phi)$. Using property (i) for $\mathcal{T}_\Phi$, combined with property (ii) for $\mathcal{T}$, it follows that $h$ is continuous, which means that $\mathcal{T}_\Phi \subset \mathcal{T}$. Reversing the roles, and arguing exactly the same way, we also get the other inclusion $\mathcal{T} \subset \mathcal{T}_\Phi$.

Comment. Using the notations above, it is immediate that the topology $\mathcal{T}_\Phi$ can also be described as the strongest topology on $X$ with respect to which all the maps $f_i : Y_i \to X$, $i \in I$, are continuous. In the light of this remark, we will call the topology $\mathcal{T}_\Phi$ the strong topology defined by $\Phi$.

Example 3.2. (The disjoint union topology) Suppose we have a family $(X_i, \mathcal{T}_i)$, $i \in I$, of topological spaces. Consider the disjoint union³ $X = \bigsqcup_{i \in I} X_i$. For each $i \in I$ we consider the inclusion $\varepsilon_i : X_i \to X$. The strong topology on $X$ defined by the family $\Phi = (\varepsilon_i)_{i \in I}$ is called the disjoint union topology.

If we think of each $X_i$ as a subset of $X$, then $X_i$ is open in $X$, for all $i \in I$. Moreover, a set $D \subset X$ is open if and only if $D \cap X_i$ is open (in $X_i$), for all $i \in I$. For a point $x \in X$, there exists a unique $i(x) \in I$ with $x \in X_{i(x)}$. With this notation, an ultrafilter $\mathcal{U}$ on $X$ is convergent to $x$ if and only if $X_{i(x)} \in \mathcal{U}$, and the collection

$$\mathcal{U}\big|_{X_{i(x)}} = \{ U \cap X_{i(x)} : U \in \mathcal{U} \}$$

is an ultrafilter on $X_{i(x)}$ which converges to $x$.

³ Formally one uses the sets $Z = \bigcup_{i \in I} X_i$ and $Y = I \times Z$, and one realizes the disjoint union as $X = \bigcup_{i \in I} \{i\} \times X_i$.

Lecture 4

4. Compactness

Definition. Let $X$ be a topological space. A subset $K \subset X$ is said to be a compact set in $X$ if it has the finite open cover property:

(f.o.c.) Whenever $\{D_i\}_{i \in I}$ is a collection of open sets such that $K \subset \bigcup_{i \in I} D_i$, there exists a finite sub-collection $D_{i_1}, \dots, D_{i_n}$ such that

$$K \subset D_{i_1} \cup \dots \cup D_{i_n}.$$

An equivalent description is the finite intersection property:

(f.i.p.) If $\{F_i\}_{i \in I}$ is a collection of closed sets such that for any finite sub-collection $F_{i_1}, \dots, F_{i_n}$ we have $K \cap F_{i_1} \cap \dots \cap F_{i_n} \neq \varnothing$, it follows that

$$K \cap \Big( \bigcap_{i \in I} F_i \Big) \neq \varnothing.$$

A topological space $(X, \mathcal{T})$ is called compact if $X$ itself is a compact set.

Remark 4.1. Suppose $(X, \mathcal{T})$ is a topological space, and $K$ is a subset of $X$. Equip $K$ with the induced topology $\mathcal{T}\big|_K$. Then it is straightforward from the definition that the following are equivalent:

• $K$ is compact, as a subset in $(X, \mathcal{T})$;
• $(K, \mathcal{T}\big|_K)$ is a compact space, that is, $K$ is compact as a subset in $(K, \mathcal{T}\big|_K)$.

The following three results give methods of constructing compact sets.

Proposition 4.1. A finite union of compact sets is compact.

Proof. Immediate from the definition.

Proposition 4.2. Suppose $(X, \mathcal{T})$ is a topological space and $K \subset X$ is a compact set. Then for every closed set $F \subset X$, the intersection $F \cap K$ is again compact.

Proof. Immediate, using the finite intersection property.

Proposition 4.3. Suppose $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ are topological spaces, $f : X \to Y$ is a continuous map, and $K \subset X$ is a compact set. Then $f(K)$ is compact.

Proof. Immediate from the definition.

Besides the two equivalent conditions (f.o.c.) and (f.i.p.), there are some other useful characterizations of compactness, listed in the following.

Theorem 4.1. Let $(X, \mathcal{T})$ be a topological space. The following are equivalent:
(i) $X$ is compact.


(ii) (Alexander sub-base Theorem) There exists a sub-base $\mathcal{S}$ with the finite open cover property:
(s) For any collection $\{ S_i \mid i \in I \} \subset \mathcal{S}$ with $X = \bigcup_{i \in I} S_i$, there exists a finite sub-collection $S_{i_1}, S_{i_2}, \dots, S_{i_n}$ (for some finite sequence of indices $i_1, i_2, \dots, i_n \in I$) such that $X = S_{i_1} \cup S_{i_2} \cup \dots \cup S_{i_n}$.

(iii) Every ultrafilter in X is convergent.

Proof. (i) ⇒ (ii). This is obvious. (In fact, when $X$ is compact, any sub-base has the finite open cover property.)

(ii) ⇒ (iii). Let $\mathcal{U}$ be an ultrafilter on $X$. Assume $\mathcal{U}$ is not convergent to any point $x \in X$. By Proposition 3.2 it follows that, for each $x \in X$, one can find a set $S_x \in \mathcal{S}$ with $S_x \ni x$, but such that $S_x \notin \mathcal{U}$. Using property (s), one can find a finite collection of points $x_1, \dots, x_n \in X$ such that

$$S_{x_1} \cup \dots \cup S_{x_n} = X. \tag{1}$$

Since $S_{x_p} \notin \mathcal{U}$, it means that $X \setminus S_{x_p}$ belongs to $\mathcal{U}$, for every $p = 1, \dots, n$. Then, using (1), we get

$$\mathcal{U} \ni (X \setminus S_{x_1}) \cap \dots \cap (X \setminus S_{x_n}) = \varnothing,$$

which is impossible.

(iii) ⇒ (i). Assuming property (iii), we will show that $X$ has the finite intersection property. Start with a family of closed sets $\{F_i\}_{i \in I}$, with the property that

$$\bigcap_{i \in J} F_i \neq \varnothing, \text{ for every finite subset } J \subset I. \tag{2}$$

We want to prove that $\bigcap_{i \in I} F_i \neq \varnothing$. For every finite subset $J \subset I$ we define the non-empty closed set $F_J = \bigcap_{i \in J} F_i$. It is clear that

$$\mathcal{F} = \{ F_J : J \text{ finite subset of } I \}$$

is a filter. Let then $\mathcal{U}$ be an ultrafilter with $\mathcal{U} \supset \mathcal{F}$. By (iii) there exists some $x \in X$ such that $\mathcal{U} \to x$, which means that $\mathcal{U}$ contains all neighborhoods of $x$. Start now with some arbitrary index $i \in I$. Since we clearly have $F_i \in \mathcal{F} \subset \mathcal{U}$, it follows that $X \setminus F_i$ cannot belong to $\mathcal{U}$, which means that $X \setminus F_i$ is not a neighborhood of $x$. Since $X \setminus F_i$ is already open, this forces $x \notin X \setminus F_i$, which means that $x \in F_i$. Since this is true for all $i \in I$, it proves that the intersection $\bigcap_{i \in I} F_i$ contains $x$, so it is non-empty.

An interesting application of the above result is the following:

Theorem 4.2 (Tihonov). Suppose one has a family $(X_i, \mathcal{T}_i)_{i \in I}$ of compact topological spaces. Then the product space $\prod_{i \in I} X_i$ is compact in the product topology.

Proof. We are going to use the ultrafilter characterization (iii) from the preceding Theorem. Let $\mathcal{U}$ be an ultrafilter on $X = \prod_{i \in I} X_i$. Denote by $\pi_i : X \to X_i$, $i \in I$, the coordinate maps. Since each $X_i$ is compact, it follows that, for every $i \in I$, the ultrafilter $\pi_{i*}(\mathcal{U})$ (in $X_i$) is convergent to some point $x_i \in X_i$. If we form the element $x = (x_i)_{i \in I} \in X$, this means that $\pi_{i*}(\mathcal{U})$ is convergent to $\pi_i(x)$, for every $i \in I$. Then, by the ultrafilter characterization of the product topology (see section 3) it follows that $\mathcal{U}$ is convergent to $x$.


Comment. Another interesting application of Theorem 4.1 is the following construction. Suppose $(X, \mathcal{T})$ is a compact Hausdorff space, and $(x_i)_{i \in I} \subset X$ is an arbitrary family of elements. (Here $I$ is an arbitrary set.) Suppose $\mathcal{U}$ is an ultrafilter on $I$. If we regard the family $(x_i)_{i \in I}$ simply as a function $f : I \to X$, then we can construct the ultrafilter $f_*(\mathcal{U})$ on $X$. More explicitly

$$f_*(\mathcal{U}) = \big\{ W \subset X : \text{the set } \{ i \in I : x_i \in W \} \text{ belongs to } \mathcal{U} \big\}.$$

Since $X$ is compact Hausdorff, the ultrafilter $f_*(\mathcal{U})$ is convergent to some unique point $x \in X$. This point is denoted by $\lim_{\mathcal{U}} x_i$.

We conclude this section with some results on compactness in Hausdorff spaces.

Proposition 4.4. Suppose $(X, \mathcal{T})$ is a Hausdorff topological space.

(i) Any compact set $K \subset X$ is closed.
(ii) If $K$ is a compact set, then a subset $F \subset K$ is compact if and only if $F$ is closed (in $X$).

Proof. (i) The key step is contained in the following

Claim: For every $x \in X \setminus K$, there exists some open set $D_x$ with $x \in D_x \subset X \setminus K$.

Fix $x \in X \setminus K$. For every $y \in K$, using the Hausdorff property, we can find two open sets $U_y$ and $V_y$ with $U_y \ni x$, $V_y \ni y$, and $U_y \cap V_y = \varnothing$. Since we obviously have $K \subset \bigcup_{y \in K} V_y$, by compactness there exist points $y_1, \dots, y_n \in K$ such that $K \subset V_{y_1} \cup \dots \cup V_{y_n}$. The claim immediately follows if we then define $D_x = U_{y_1} \cap \dots \cap U_{y_n}$.

Using the Claim we now see that we can write the complement of $K$ as a union of open sets:

$$X \setminus K = \bigcup_{x \in X \setminus K} D_x,$$

so $X \setminus K$ is open, which means that $K$ is indeed closed.

(ii) If $F$ is closed, then $F = F \cap K$ is compact by Proposition 4.2. Conversely, if $F$ is compact, then by (i) $F$ is closed.

Proposition 4.5. Every compact Hausdorff space is normal.

Proof. Let $X$ be a compact Hausdorff space. Let $A, B \subset X$ be two closed sets with $A \cap B = \varnothing$. We need to find two open sets $U, V \subset X$ with $A \subset U$, $B \subset V$, and $U \cap V = \varnothing$. We start with the following

Particular case: Assume $B$ is a singleton, $B = \{b\}$.

The proof follows line by line the first part of the proof of part (i) from Proposition 4.4. For every $a \in A$ we find open sets $U_a$ and $V_a$ such that $U_a \ni a$, $V_a \ni b$, and $U_a \cap V_a = \varnothing$. Using Proposition 4.4 we know that $A$ is compact, and since we clearly have $A \subset \bigcup_{a \in A} U_a$, there exist $a_1, \dots, a_n \in A$ such that $U_{a_1} \cup \dots \cup U_{a_n} \supset A$. Then we are done by taking $U = U_{a_1} \cup \dots \cup U_{a_n}$ and $V = V_{a_1} \cap \dots \cap V_{a_n}$.

Having proven the above particular case, we proceed now with the general case. For every $b \in B$, we use the particular case to find two open sets $U_b$ and $V_b$, with $U_b \supset A$, $V_b \ni b$, and $U_b \cap V_b = \varnothing$. Arguing as above, the set $B$ is compact, and we have $B \subset \bigcup_{b \in B} V_b$, so there exist $b_1, \dots, b_n \in B$ such that $V_{b_1} \cup \dots \cup V_{b_n} \supset B$. Then we are done by taking $U = U_{b_1} \cap \dots \cap U_{b_n}$ and $V = V_{b_1} \cup \dots \cup V_{b_n}$.

Lecture 5

5. Topology preliminaries V: Locally compact spaces

Definition. A locally compact space is a Hausdorff topological space with the property

(lc) Every point has a compact neighborhood.

One key feature of locally compact spaces is contained in the following.

Lemma 5.1. Let $X$ be a locally compact space, let $K$ be a compact set in $X$, and let $D$ be an open subset, with $K \subset D$. Then there exists an open set $E$ with:
(i) $\overline{E}$ compact;
(ii) $K \subset E \subset \overline{E} \subset D$.

Proof. Let us start with the following

Particular case: Assume $K$ is a singleton, $K = \{x\}$.

Start off by choosing a compact neighborhood $N$ of $x$. Using the results from section 4, when equipped with the induced topology, the set $N$ is normal. In particular, if we consider the closed sets $A = \{x\}$ and $B = N \setminus D$ (which are also closed in the induced topology), it follows that there exist sets $U, V \subset N$ such that

• $U \supset \{x\}$, $V \supset B$, $U \cap V = \varnothing$;
• $U$ and $V$ are open in the induced topology on $N$.

The second property means that there exist open sets $U_0, V_0 \subset X$ such that $U = N \cap U_0$ and $V = N \cap V_0$. Let $E = \operatorname{Int}(U)$. By construction $E$ is open, and $E \ni x$. Also, since $E \subset U \subset N$, it follows that

$$\overline{E} \subset \overline{N} = N. \tag{1}$$

In particular this gives the compactness of $\overline{E}$. Next, since we obviously have

$$E \cap V_0 \subset U \cap V_0 = N \cap U_0 \cap V_0 = U \cap V = \varnothing,$$

we get $E \subset X \setminus V_0$, so using the fact that $X \setminus V_0$ is closed, we also get the inclusion $\overline{E} \subset X \setminus V_0$. Finally, combining this with (1) and with the inclusion $N \setminus D \subset V \subset V_0$, we get

$$\overline{E} \subset N \cap (X \setminus V_0) \subset N \cap \big( X \setminus (N \setminus D) \big) = N \cap D \subset D,$$

and we are done.

Having proven the particular case, we proceed now with the general case. For every $x \in K$ we use the particular case to find an open set $E(x)$, with $\overline{E(x)}$ compact, and such that $x \in E(x) \subset \overline{E(x)} \subset D$. Since we clearly have $K \subset \bigcup_{x \in K} E(x)$, by compactness there exist $x_1, \dots, x_n \in K$ such that $K \subset E(x_1) \cup \dots \cup E(x_n)$. Notice that if we take $E = E(x_1) \cup \dots \cup E(x_n)$, then we clearly have

$$K \subset E \subset \overline{E} \subset \overline{E(x_1)} \cup \dots \cup \overline{E(x_n)} \subset D.$$

Since $\overline{E}$ is a closed subset of the compact set $\overline{E(x_1)} \cup \dots \cup \overline{E(x_n)}$, it is compact, and we are done.

One of the most useful results in the analysis on locally compact spaces is the following.

Theorem 5.1 (Urysohn's Lemma for locally compact spaces). Let $X$ be a locally compact space, and let $K, F \subset X$ be two disjoint sets, with $K$ compact and $F$ closed. Then there exists a continuous function $f : X \to [0, 1]$ such that $f\big|_K = 1$ and $f\big|_F = 0$.

Proof. Apply Lemma 5.1 for the pair $K \subset X \setminus F$ and find an open set $E$, with $\overline{E}$ compact, such that $K \subset E \subset \overline{E} \subset X \setminus F$. Apply again Lemma 5.1 for the pair $K \subset E$ and find another open set $G$ with $\overline{G}$ compact, such that $K \subset G \subset \overline{G} \subset E$.

Let us work for the moment in the space $\overline{E}$ (equipped with the induced topology). This is a compact Hausdorff space, hence it is normal. In particular, using Urysohn Lemma (see section 1) there exists a continuous function $g : \overline{E} \to [0, 1]$ such that $g\big|_K = 1$ and $g\big|_{\overline{E} \setminus G} = 0$. Let us now define the function $f : X \to [0, 1]$ by

$$f(x) = \begin{cases} g(x) & \text{if } x \in \overline{E} \\ 0 & \text{if } x \in X \setminus \overline{E} \end{cases}$$

Notice that $f\big|_E = g\big|_E$, so $f\big|_E$ is continuous. If we take the open set $A = X \setminus \overline{G}$, then it is also clear that $f\big|_A = 0$. So now we have two open sets $E$ and $A$, with $A \cup E = X$, and $f\big|_A$ and $f\big|_E$ both continuous. Then it is clear that $f$ is continuous. The other two properties $f\big|_K = 1$ and $f\big|_F = 0$ are obvious.

We now discuss an important notion which makes the linkage between locally compact spaces and compact spaces.

Definition. Let $X$ be a locally compact space. By a compactification of $X$ one means a pair $(\theta, T)$ consisting of a compact Hausdorff space $T$ and a continuous map $\theta : X \to T$, with the following properties:

(i) $\theta(X)$ is a dense open subset of $T$;
(ii) when we equip $\theta(X)$ with the induced topology, the map $\theta : X \to \theta(X)$ is a homeomorphism.

Notice that, when $X$ is already compact, any compactification $(\theta, T)$ of $X$ is necessarily made up of a compact space $T$ and a homeomorphism $\theta : X \to T$.

Example 5.1 (Alexandrov compactification). Suppose $X$ is a locally compact space which is not compact. We form a disjoint union with a singleton, $X^\alpha = X \sqcup \{\infty\}$, and we equip the space $X^\alpha$ with the topology in which a subset $D \subset X^\alpha$ is declared to be open if either $D$ is an open subset of $X$, or there exists some compact subset $K \subset X$ such that $D = (X \setminus K) \sqcup \{\infty\}$. Define the inclusion map $\iota : X \to X^\alpha$. Then $(\iota, X^\alpha)$ is a compactification of $X$, which is called the Alexandrov compactification. The fact that $\iota(X)$ is open in $X^\alpha$, and $\iota : X \to \iota(X)$ is a homeomorphism, is clear. The density of $\iota(X)$ in $X^\alpha$ is also clear, since every open set $D \subset X^\alpha$ with $D \ni \infty$ is of the form $(X \setminus K) \sqcup \{\infty\}$ for some compact set $K \subset X$, and then we have $D \cap \iota(X) = \iota(X \setminus K)$, which is non-empty because $X$ is not compact.

Remark that, if $X$ is already compact, we can still define the topological space $X^\alpha = X \sqcup \{\infty\}$, but this time the singleton set $\{\infty\}$ will also be open. Although $\iota(X)$ will still be open in $X^\alpha$, it will not be dense in $X^\alpha$.


One should regard the Alexandrov compactification as a minimal one. It turns out that there exists another compactification, described below, which can be regarded as the largest.

Theorem 5.2 (Stone-Čech). Let $X$ be a locally compact space. Consider the set

$$\mathcal{F} = \{ f : X \to [0, 1] : f \text{ continuous} \},$$

and consider the product space

$$T = \prod_{f \in \mathcal{F}} [0, 1],$$

equipped with the product topology, and define the map $\theta : X \to T$ by

$$\theta(x) = \big( f(x) \big)_{f \in \mathcal{F}}, \quad \forall\, x \in X.$$

Equip the closure $\overline{\theta(X)}$ with the topology induced from $T$. Then the pair $(\theta, \overline{\theta(X)})$ is a compactification of $X$.

Proof. For every $f \in \mathcal{F}$, let us denote by $\pi_f : T \to [0, 1]$ the coordinate map. Remark that $\theta : X \to T$ is continuous. This is immediate from the definition of the product topology, since the continuity of $\theta$ is equivalent to the continuity of all compositions $\pi_f \circ \theta$, $f \in \mathcal{F}$. The fact that these compositions are continuous is however trivial, since we have $\pi_f \circ \theta = f$, $\forall\, f \in \mathcal{F}$.

Denote for simplicity $\overline{\theta(X)}$ by $B$. By Tihonov's Theorem, the space $T$ is compact (and obviously Hausdorff), so the set $B$ is compact as well, being a closed subset of $T$. By construction, $\theta(X)$ is dense in $B$, and $\theta$ is continuous.

At this point, it is interesting to point out the following property.

Claim 1: For every $f \in \mathcal{F}$, there exists a unique continuous map $\tilde{f} : B \to [0, 1]$ such that $\tilde{f} \circ \theta = f$.

The uniqueness is trivial, since $\theta(X)$ is dense in $B$. The existence is also trivial, because we can take $\tilde{f} = \pi_f\big|_B$.

We can show now that $\theta$ is injective. If $x, y \in X$ are such that $x \neq y$, then using Urysohn Lemma we can find $f \in \mathcal{F}$ such that $f(x) \neq f(y)$. The function $\tilde{f}$ given by Claim 1 clearly satisfies

$$\tilde{f}\big( \theta(x) \big) = f(x) \neq f(y) = \tilde{f}\big( \theta(y) \big),$$

which forces $\theta(x) \neq \theta(y)$.

In order to show that $\theta(X)$ is open in $B$, we need some preparations. For every compact subset $K \subset X$, we define

$$\mathcal{F}_K = \big\{ f : X \to [0, 1] : f \text{ continuous},\ f\big|_{X \setminus K} = 0 \big\}.$$

One key observation is the following.

Claim 2: If $K \subset X$ is compact, and if $f \in \mathcal{F}_K$, then the continuous function $\tilde{f} : B \to [0, 1]$ given by Claim 1 has the property $\tilde{f}\big|_{B \setminus \theta(K)} = 0$.

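We start with some $\alpha \in B \setminus \theta(K)$, and we use Urysohn Lemma to find some continuous function $\varphi : B \to [0, 1]$ such that $\varphi(\alpha) = 1$ and $\varphi\big|_{\theta(K)} = 0$. Consider the function $\psi = \varphi \cdot \tilde{f}$. Notice that $(\varphi \circ \theta)\big|_K = 0$, which combined with the fact that $f\big|_{X \setminus K} = 0$, gives

$$\psi \circ \theta = (\varphi \circ \theta) \cdot (\tilde{f} \circ \theta) = (\varphi \circ \theta) \cdot f = 0,$$

so using Claim 1 (the uniqueness part), we have $\psi = 0$. In particular, since $\varphi(\alpha) = 1$, this forces $\tilde{f}(\alpha) = 0$, thus proving the Claim.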

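We define now the collection

$$\mathcal{F}_c = \bigcup_{\substack{K \subset X \\ K \text{ compact}}} \mathcal{F}_K.$$

Define the set

$$S = \bigcap_{f \in \mathcal{F}_c} \pi_f^{-1}\big( \{0\} \big).$$

By the definition of the product topology, it follows that $S$ is closed in $T$. The fact that $\theta(X)$ is open in $B$ is then a consequence of the following fact.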

Claim 3: One has the equality $\theta(X) = B \setminus S$.

Start first with some point $x \in X$, and let us show that $\theta(x) \notin S$. Choose some open set $D \subset X$, with $\overline{D}$ compact, such that $D \ni x$, and apply Urysohn Lemma to find some continuous map $f : X \to [0, 1]$ such that $f(x) = 1$ and $f\big|_{X \setminus D} = 0$. It is clear that $f \in \mathcal{F}_{\overline{D}} \subset \mathcal{F}_c$, but $\pi_f\big( \theta(x) \big) = f(x) = 1 \neq 0$, which means that $\theta(x) \notin \pi_f^{-1}\big( \{0\} \big)$, hence $\theta(x) \notin S$. Conversely, let us start with some point $\alpha = (\alpha_f)_{f \in \mathcal{F}} \in B \setminus S$, and let us prove that $\alpha \in \theta(X)$. Since $\alpha \notin S$, there exists some $f \in \mathcal{F}_c$ such that $\pi_f(\alpha) > 0$. Since $f \in \mathcal{F}_c$, there exists some compact subset $K \subset X$ such that $f\big|_{X \setminus K} = 0$. Using Claim 2, we know that $\tilde{f}\big|_{B \setminus \theta(K)} = 0$. Since $\tilde{f}(\alpha) = \pi_f(\alpha) \neq 0$, this forces $\alpha \in \theta(K) \subset \theta(X)$.

To finish the proof of the Theorem, all we need to prove now is the fact that $\theta : X \to \theta(X)$ is a homeomorphism, which amounts to proving that, whenever $D \subset X$ is open, it follows that $\theta(D)$ is open in $B$. Fix an open subset $D \subset X$. In order to show that $\theta(D)$ is open in $B$, we need to show that $\theta(D)$ is a neighborhood for each of its points. Fix some point $\alpha \in \theta(D)$, i.e. $\alpha = \theta(x)$ for some $x \in D$. Choose some compact subset $K \subset D$ such that $x \in \operatorname{Int}(K)$, and apply Urysohn Lemma to find a function $f \in \mathcal{F}_K$ with $f(x) = 1$. Consider the continuous function $\tilde{f} : B \to [0, 1]$ given by Claim 1, and apply Claim 2 to conclude that $\tilde{f}\big|_{B \setminus \theta(K)} = 0$. In particular the open set

$$N = \tilde{f}^{-1}\big( (1/2, \infty) \big) \subset B$$

is contained in $\theta(K) \subset \theta(D)$. Since $\tilde{f}(\alpha) = f(x) = 1$, we clearly have $\alpha \in N$.

Definition. The compactification $(\theta, \overline{\theta(X)})$, constructed in the above Theorem, is called the Stone-Čech compactification of $X$. The space $\overline{\theta(X)}$ will be denoted by $X^\beta$. Using the map $\theta$, we shall identify from now on $X$ with a dense open subset of $X^\beta$. Remark that if $X$ is compact, then $X^\beta = X$.

Comment. The Stone-Čech compactification is an inherent "Zorn Lemma type" construction. For example, if $X$ is a non-compact locally compact space, and if $\mathcal{U}$ is an ultrafilter on $X$, then either $\mathcal{U}$ is convergent to a point in $X$ (this happens when $\mathcal{U}$ contains at least one compact subset of $X$), or $\mathcal{U}$ produces a point in $X^\beta \setminus X$. If $\theta : X \to X^\beta$ denotes the inclusion map, then one considers the ultrafilter $\theta_*\mathcal{U}$ on $X^\beta$, and by compactness this ultrafilter converges to some (unique) point in $X^\beta$. This way one gets a correspondence

$$\lim\nolimits_X : \big\{ \mathcal{U} \subset \mathcal{P}(X) : \mathcal{U} \text{ ultrafilter on } X \big\} \to X^\beta.$$

This correspondence is surjective. The injectivity obstruction is characterized as follows. For two ultrafilters $\mathcal{U}_1$, $\mathcal{U}_2$, the condition $\lim_X(\mathcal{U}_1) \neq \lim_X(\mathcal{U}_2)$ is equivalent to the existence of two disjoint open sets $D_1 \in \mathcal{U}_1$ and $D_2 \in \mathcal{U}_2$.

Exercise 1. Suppose a set $X$ is equipped with the discrete topology. Prove that the correspondence $\lim_X$ is bijective.

The Stone-Čech compactification is functorial, in the following sense.

Proposition 5.1. If $X$ and $Y$ are locally compact spaces, and if $\Phi : X \to Y$ is a continuous map, then there exists a unique continuous map $\Phi^\beta : X^\beta \to Y^\beta$ such that $\Phi^\beta\big|_X = \Phi$.

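Proof. We use the notations from Theorem 5.2. Define

$$\mathcal{F} = \{ f : X \to [0, 1] : f \text{ continuous} \} \quad \text{and} \quad \mathcal{G} = \{ g : Y \to [0, 1] : g \text{ continuous} \},$$

the product spaces

$$T_X = \prod_{f \in \mathcal{F}} [0, 1] \quad \text{and} \quad T_Y = \prod_{g \in \mathcal{G}} [0, 1],$$

as well as the maps $\theta_X : X \to T_X$ and $\theta_Y : Y \to T_Y$, defined by

$$\theta_X(x) = \big( f(x) \big)_{f \in \mathcal{F}}, \quad \forall\, x \in X;$$

$$\theta_Y(y) = \big( g(y) \big)_{g \in \mathcal{G}}, \quad \forall\, y \in Y.$$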

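With these notations, we have $X^\beta = \overline{\theta_X(X)} \subset T_X$ and $Y^\beta = \overline{\theta_Y(Y)} \subset T_Y$. Using the fact that we have a correspondence $\mathcal{G} \ni g \longmapsto g \circ \Phi \in \mathcal{F}$, we define the map

$$\Psi : T_X \ni (\alpha_f)_{f \in \mathcal{F}} \longmapsto (\alpha_{g \circ \Phi})_{g \in \mathcal{G}} \in T_Y.$$

Remark that $\Psi$ is continuous. This fact is pretty obvious, because when we compose with coordinate projections $\pi_g : T_Y \to [0, 1]$, $g \in \mathcal{G}$, we have $\pi_g \circ \Psi = \pi_{g \circ \Phi}$, where $\pi_{g \circ \Phi} : T_X \to [0, 1]$ is the coordinate projection, which is automatically continuous. Remark that if we start with some point $x \in X$, then

$$\Psi\big( \theta_X(x) \big) = \big( (g \circ \Phi)(x) \big)_{g \in \mathcal{G}} = \theta_Y\big( \Phi(x) \big), \tag{2}$$

which means that we have the equality $\Psi \circ \theta_X = \theta_Y \circ \Phi$. Remark first that, since $Y^\beta$ is closed, it follows that $\Psi^{-1}(Y^\beta)$ is closed in $T_X$. Second, using (2), we clearly have the inclusion $\theta_X(X) \subset \Psi^{-1}\big( \theta_Y(Y) \big) \subset \Psi^{-1}(Y^\beta)$, so using the fact that $\Psi^{-1}(Y^\beta)$ is closed, we get the inclusion

$$X^\beta = \overline{\theta_X(X)} \subset \Psi^{-1}(Y^\beta).$$

In other words, we get now a continuous map $\Phi^\beta = \Psi\big|_{X^\beta} : X^\beta \to Y^\beta$, which clearly satisfies $\Phi^\beta \circ \theta_X = \theta_Y \circ \Phi$, which using our conventions means that $\Phi^\beta\big|_X = \Phi$. The uniqueness is obvious, by the density of $X$ in $X^\beta$.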

Exercise 2. The Alexandrov compactification is not functorial. In other words, given locally compact spaces $X$ and $Y$, and a continuous map $f : X \to Y$, in general there does not exist a continuous map $f^\alpha : X^\alpha \to Y^\alpha$ with $f^\alpha\big|_X = f$. Give an example of such a situation.

Hint: Consider $X = Y = \mathbb{N}$, equipped with the discrete topology, and define $f : \mathbb{N} \to \mathbb{N}$ by

$$f(n) = \begin{cases} 1 & \text{if } n \text{ is odd} \\ 2 & \text{if } n \text{ is even} \end{cases}$$

It turns out that one can define a certain type of continuous maps, with respect to which the Alexandrov compactification is functorial.


Definition. Let $X$, $Y$ be locally compact spaces, and let $\Phi : X \to Y$ be a continuous map. We say that $\Phi$ is proper if it satisfies the condition

$$K \subset Y \text{ compact} \ \Rightarrow\ \Phi^{-1}(K) \text{ compact in } X.$$

The following interesting property of proper maps will be exploited later.

Proposition 5.2. Let $X$, $Y$ be locally compact spaces, let $\Phi : X \to Y$ be a proper continuous map, and let $T \subset X$ be a closed subset. Then the set $\Phi(T)$ is closed in $Y$.

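Proof. Start with some point $y \in \overline{\Phi(T)}$. This means that

$$D \cap \Phi(T) \neq \varnothing, \text{ for every open set } D \subset Y \text{ with } D \ni y. \tag{3}$$

Denote by $\mathcal{V}$ the collection of all compact neighborhoods of $y$. In other words, $V \in \mathcal{V}$ if and only if $V \subset Y$ is compact and $y \in \operatorname{Int}(V)$. For each $V \in \mathcal{V}$ we define the set $\check{V} = \Phi^{-1}(V) \cap T$. Since $\Phi$ is proper, all sets $\check{V}$, $V \in \mathcal{V}$, are compact. Notice also that, for every finite number of sets $V_1, \dots, V_n \in \mathcal{V}$, if we form the intersection $V = V_1 \cap \dots \cap V_n$, then $V \in \mathcal{V}$, and $\check{V} \subset \check{V}_j$, $\forall\, j = 1, \dots, n$. Remark now that, by (3), we have $\check{V} \neq \varnothing$, $\forall\, V \in \mathcal{V}$. Indeed, if we start with some $V \in \mathcal{V}$, then (3) applied to $D = \operatorname{Int}(V)$ allows us to choose some point $x \in T$ such that $\Phi(x) \in V$, and then $x \in \check{V}$. Use now the finite intersection property to get the fact that $\bigcap_{V \in \mathcal{V}} \check{V} \neq \varnothing$. Pick now a point $x \in \bigcap_{V \in \mathcal{V}} \check{V}$. This means that $x \in T$, and

$$\Phi(x) \in V, \quad \forall\, V \in \mathcal{V}. \tag{4}$$

But now we are done, because this forces $\Phi(x) = y$. Indeed, if $\Phi(x) \neq y$, using the Hausdorff property, one could find some $V \in \mathcal{V}$ with $\Phi(x) \notin V$, thus contradicting (4).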

Exercise 3. Let $X$ be a locally compact space which is non-compact, let $Y$ be another locally compact space, and let $\Phi : X \to Y$ be a proper continuous map.

(i) If $Y$ is non-compact, prove that there exists a unique continuous map $\Phi^\alpha : X^\alpha \to Y^\alpha$ with $\Phi^\alpha\big|_X = \Phi$.
(ii) If $Y$ is compact, prove that there exists a unique continuous map $\Psi : X^\alpha \to Y$ with $\Psi\big|_X = \Phi$.

Hint: In case (i) define $\Phi^\alpha(\infty) = \infty$. In case (ii) consider the collection

$$\mathcal{W} = \{ \Phi(T) : T \subset X \text{ closed, with } X \setminus T \text{ compact} \}.$$

Use the above result, combined with the finite intersection property, to pick a point $y \in \bigcap_{W \in \mathcal{W}} W$. Define $\Psi(\infty) = y$.

Lecture 6

6. Metric spaces

In this section we review the basic facts about metric spaces.

Definitions. A metric on a non-empty set $X$ is a map

$$d : X \times X \to [0, \infty)$$

with the following properties:
(i) If $x, y \in X$ are points with $d(x, y) = 0$, then $x = y$;
(ii) $d(x, y) = d(y, x)$, for all $x, y \in X$;
(iii) $d(x, y) \leq d(x, z) + d(y, z)$, for all $x, y, z \in X$.

A metric space is a pair $(X, d)$, where $X$ is a set, and $d$ is a metric on $X$.

Notations. If $(X, d)$ is a metric space, then for any point $x \in X$ and any $r > 0$, we define the open and closed balls:

$$B_r(x) = \{ y \in X : d(x, y) < r \},$$

$$\overline{B}_r(x) = \{ y \in X : d(x, y) \leq r \}.$$

Definition. Suppose $(X, d)$ is a metric space. Then $X$ carries a natural topology constructed as follows. We say that a set $D \subset X$ is open if it has the property:

• for every $x \in D$, there exists some $r_x > 0$ such that $B_{r_x}(x) \subset D$.

One can prove that the collection

$$\mathcal{T}_d = \{ D \subset X : D \text{ open} \}$$

is indeed a topology, i.e. we have

• $\varnothing$ and $X$ are open;
• if $(D_i)_{i \in I}$ is a family of open sets, then $\bigcup_{i \in I} D_i$ is again open;
• if $D_1$ and $D_2$ are open, then $D_1 \cap D_2$ is again open.

The topology thus constructed is called the metric topology.
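The axioms (i)-(iii) and the two kinds of balls can be explored on a finite sample of points. The sketch below is illustrative only: the function names, the taxicab distance on a few points of the plane, and the deliberately bad candidate `squared_gap` are assumptions chosen for this example. A check on a finite sample can refute the axioms but of course cannot prove them for an infinite space.

```python
from itertools import product

def is_metric_on_sample(points, d):
    """Check axioms (i)-(iii) of a metric on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:
            return False
        if d(x, y) == 0 and x != y:      # axiom (i)
            return False
        if d(x, y) != d(y, x):           # axiom (ii)
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(y, z):  # axiom (iii)
            return False
    return True

def open_ball(points, d, center, r):
    """Points of the sample lying in the open ball B_r(center)."""
    return {y for y in points if d(center, y) < r}

def closed_ball(points, d, center, r):
    """Points of the sample lying in the closed ball of radius r."""
    return {y for y in points if d(center, y) <= r}

def taxicab(x, y):
    """Hypothetical example metric on pairs of integers."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def squared_gap(x, y):
    """Not a metric: fails the triangle inequality on 0, 1, 2."""
    return (x - y) ** 2

sample = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2)]
small_ball = open_ball(sample, taxicab, (0, 0), 2)
big_ball = closed_ball(sample, taxicab, (0, 0), 2)
```

With radius 2 around the origin, the point (1, 1) sits exactly at distance 2, so it belongs to the closed ball but not to the open one.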

Remark 6.1. Let $(X, d)$ be a metric space. Then for every $p \in X$, and for every $r > 0$, the set $B_r(p)$ is open, and the set $\overline{B}_r(p)$ is closed.

If we start with some $x \in B_r(p)$, and if we define $r_x = r - d(x, p)$, then for every $y \in B_{r_x}(x)$ we will have

$$d(y, p) \leq d(y, x) + d(x, p) < r_x + d(x, p) = r,$$

so $y$ belongs to $B_r(p)$. This means that $B_{r_x}(x) \subset B_r(p)$. Since this is true for all $x \in B_r(p)$, it follows that $B_r(p)$ is indeed open.

To prove that $\overline{B}_r(p)$ is closed, we need to show that its complement

$$X \setminus \overline{B}_r(p) = \{ x \in X : d(x, p) > r \}$$

is open. If we start with some $x \in X \setminus \overline{B}_r(p)$, and if we define $\rho_x = d(p, x) - r$, then for every $y \in B_{\rho_x}(x)$ we will have

$$d(y, p) \geq d(p, x) - d(y, x) > d(p, x) - \rho_x = r,$$

so $y$ belongs to $X \setminus \overline{B}_r(p)$. This means that $B_{\rho_x}(x) \subset X \setminus \overline{B}_r(p)$. Since this is true for all $x \in X \setminus \overline{B}_r(p)$, it follows that $X \setminus \overline{B}_r(p)$ is indeed open.

Remark 6.2. The metric topology on a metric space $(X, d)$ is Hausdorff. Indeed, if we start with two points $x, y \in X$ with $x \neq y$, and we choose $r$ to be a real number with

$$0 < r < \frac{d(x, y)}{2},$$

then we have $B_r(x) \cap B_r(y) = \varnothing$. (Otherwise, if we have a point $z \in B_r(x) \cap B_r(y)$, we would have $2r < d(x, y) \leq d(x, z) + d(y, z) < 2r$, which is impossible.)

Remark 6.3. Let $(X, d)$ be a metric space, and let $M$ be a subset of $X$. Then $d\big|_{M \times M}$ is a metric on $M$, and the metric topology on $M$ defined by this metric is precisely the induced topology from $X$. This means that a set $A \subset M$ is open in $M$ if and only if there exists some open set $D \subset X$ with $A = M \cap D$.

The metric space framework is particularly convenient because one can use convergence.

Definition. Let $(X, d)$ be a metric space. For a point $x \in X$, we say that a sequence $(x_n)_{n \geq 1} \subset X$ is convergent to $x$ if $\lim_{n \to \infty} d(x_n, x) = 0$.

Remark 6.4. If $(X, d)$ is a metric space, and if the sequence $(x_n)_{n \geq 1} \subset X$ is convergent to some point $x \in X$, then

$$\lim_{n \to \infty} d(x_n, y) = d(x, y), \quad \forall\, y \in X. \tag{1}$$

This is an immediate consequence of the inequalities

$$d(x, y) - d(x_n, x) \leq d(x_n, y) \leq d(x, y) + d(x_n, x).$$

Among other things, the equality (1) gives the fact that $(x_n)_{n \geq 1}$ cannot be convergent to any other point $y \neq x$. Therefore, if $(x_n)_{n \geq 1}$ is convergent to some $x$, then $x$ is uniquely determined, and will be denoted by $\lim_{n \to \infty} x_n$.

Convergence is useful for characterizing closure.

Proposition 6.1. Let $(X, d)$ be a metric space, and let $A \subset X$ be a non-empty subset. For a point $x \in X$, the following are equivalent:
(i) $x$ belongs to the closure $\overline{A}$ of $A$;
(ii) there exists some sequence $(x_n)_{n \geq 1} \subset A$ with $\lim_{n \to \infty} x_n = x$.

Proof. (i) ⇒ (ii). Assume $x \in \overline{A}$. This means that

(∗) For every open set $D \subset X$ with $D \ni x$, the intersection $D \cap A$ is non-empty.

We use this property for the open sets $B_{1/n}(x)$, $n = 1, 2, \dots$. So, for every integer $n \geq 1$, we can find a point $x_n \in B_{1/n}(x) \cap A$. This way we have built a sequence $(x_n)_{n \geq 1} \subset A$ such that

$$d(x_n, x) < \frac{1}{n}, \quad \forall\, n \geq 1.$$

It is clear that this gives $x = \lim_{n \to \infty} x_n$.

(ii) ⇒ (i). Assume $x$ satisfies property (ii). Fix $(x_n)_{n \geq 1} \subset A$ to be a sequence with $\lim_{n \to \infty} x_n = x$. We need to prove property (∗). Start with some arbitrary open set $D \subset X$ with $x \in D$. Let $\varepsilon > 0$ be chosen such that $B_\varepsilon(x) \subset D$. Since $\lim_{n \to \infty} d(x_n, x) = 0$, there exists some $n_\varepsilon$ such that $d(x_{n_\varepsilon}, x) < \varepsilon$. It is now clear that

$$x_{n_\varepsilon} \in B_\varepsilon(x) \cap A \subset D \cap A,$$

so the intersection $D \cap A$ is indeed non-empty.
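The step (i) ⇒ (ii) picks one point of $A$ in each ball $B_{1/n}(x)$. As a small illustration, assume (hypothetically, this example is not in the notes) that $X$ is the real line with the usual distance, $A = (0, 1]$ and $x = 0$, which lies in the closure of $A$ but not in $A$. The sketch below uses exact rational arithmetic and the explicit choice $x_n = 1/(n+1)$.

```python
from fractions import Fraction

def in_A(t):
    """Membership in the hypothetical set A = (0, 1]."""
    return 0 < t <= 1

def witness_sequence(n_terms):
    """x_n = 1/(n+1), a point of B_{1/n}(0) intersected with A, for n >= 1."""
    return [Fraction(1, n + 1) for n in range(1, n_terms + 1)]

x = Fraction(0)          # the point of the closure being approximated
xs = witness_sequence(50)

# Every term lies in A and satisfies d(x_n, x) < 1/n.
ok = all(in_A(t) and abs(t - x) < Fraction(1, n)
         for n, t in enumerate(xs, start=1))
```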

Continuity can be characterized using convergence, as follows.

Proposition 6.2. Let $X$ and $Y$ be metric spaces, and let $f : X \to Y$ be a function. For a point $p \in X$, the following are equivalent:
(i) $f$ is continuous at $p$;
(ii) for every $\varepsilon > 0$, there exists some $\delta_\varepsilon > 0$ such that

$$d\big( f(x), f(p) \big) < \varepsilon, \text{ for all } x \in X \text{ with } d(x, p) < \delta_\varepsilon;$$

(iii) if $(x_n)_{n \geq 1} \subset X$ is a sequence with $\lim_{n \to \infty} x_n = p$, then $\lim_{n \to \infty} f(x_n) = f(p)$.

Proof. (i) ⇒ (ii). The condition that $f$ is continuous at $p$ means:

(∗) for every open set $D \subset Y$, with $D \ni f(p)$, there exists some open set $E \subset X$, with $p \in E \subset f^{-1}(D)$.

Assume $f$ is continuous at $p$. For every $\varepsilon > 0$, we consider the open ball $B^Y_\varepsilon\big(f(p)\big)$. Using (∗), there exists some open set $E \subset X$, with $E \ni p$, and $f(E) \subset B^Y_\varepsilon\big(f(p)\big)$. In particular, there exists $\delta > 0$, such that $B^X_\delta(p) \subset E$, so now we have
$$f\big(B^X_\delta(p)\big) \subset B^Y_\varepsilon\big(f(p)\big),$$
which clearly gives (ii).

(ii) ⇒ (iii). Assume $f$ satisfies (ii), and start with some sequence $(x_n)_{n\ge 1} \subset X$, which converges to $p$. For every $\varepsilon > 0$, we choose $\delta_\varepsilon > 0$ as in (ii), and using the fact that $\lim_{n\to\infty} x_n = p$, we can also choose some $N_\varepsilon$ such that
$$d(x_n, p) < \delta_\varepsilon, \quad \forall\, n \ge N_\varepsilon.$$
Using (ii) this will give
$$d\big(f(x_n), f(p)\big) < \varepsilon, \quad \forall\, n \ge N_\varepsilon.$$
In other words, we get the fact that
$$\lim_{n\to\infty} d\big(f(x_n), f(p)\big) = 0,$$
which means that we indeed have $\lim_{n\to\infty} f(x_n) = f(p)$.

(iii) ⇒ (i). Assume $f$ satisfies (iii), but $f$ is not continuous at $p$. By (∗) this means that there exists some open set $D_0 \subset Y$ with $D_0 \ni f(p)$, such that

(∗′) for every open set $E \subset X$ with $E \ni p$, we have $f(E) \not\subset D_0$.

It is clear that any other open set $D$, with $f(p) \in D \subset D_0$, will again satisfy property (∗′). Fix then some $r > 0$, such that $B^Y_r\big(f(p)\big) \subset D_0$. Using condition (∗′) it follows that for every integer $n \ge 1$, we have
$$f\big(B^X_{1/n}(p)\big) \not\subset B^Y_r\big(f(p)\big).$$
This means that, for every integer $n \ge 1$, we can find a point $x_n \in X$ such that
$$d(x_n, p) < \frac{1}{n} \quad \text{and} \quad d\big(f(x_n), f(p)\big) \ge r.$$
It is then clear that the sequence $(x_n)_{n\ge 1} \subset X$ is convergent to $p$, but the sequence $\big(f(x_n)\big)_{n\ge 1} \subset Y$ is not convergent to $f(p)$. This contradicts (iii).
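Criterion (iii) of Proposition 6.2 can be probed numerically. The following is only a sketch under stated assumptions: the functions `step` and `square`, the helper `tail_gap` and the sample sequence $x_n = 1/n$ are hypothetical illustrations that do not appear in the notes, and a finite computation can suggest, but not prove, convergence.

```python
def step(x: float) -> float:
    """Hypothetical test function: 0 for x <= 0 and 1 for x > 0 (discontinuous at 0)."""
    return 1.0 if x > 0 else 0.0


def square(x: float) -> float:
    """A function that is continuous at every point."""
    return x * x


def tail_gap(f, p, seq, start):
    """Largest value of |f(x_n) - f(p)| over the stored terms seq[start:]."""
    return max(abs(f(x) - f(p)) for x in seq[start:])


# x_n = 1/n converges to p = 0 in R with the usual metric.
xs = [1.0 / n for n in range(1, 10001)]

gap_square = tail_gap(square, 0.0, xs, 1000)  # small, consistent with (iii)
gap_step = tail_gap(step, 0.0, xs, 1000)      # stays equal to 1, so (iii) fails at p = 0
```

For `step` the gap never drops below 1, which matches the contradiction argument in (iii) ⇒ (i): the points $x_n = 1/n$ satisfy $d(x_n, 0) < 1/n$ (up to the choice of index) while $d(f(x_n), f(0)) \ge r$ with $r = 1$.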

Convergence can also be used for characterizing compactness.

Theorem 6.1. Let $(X, d)$ be a metric space. The following are equivalent:

(i) $X$ is compact in the metric topology;

(ii) every sequence in $X$ has a convergent subsequence.

Proof. (i) ⇒ (ii). Assume $X$ is compact. Start with an arbitrary sequence $(x_n)_{n\ge 1} \subset X$. For every $n \ge 1$, we define the closed set
$$T_n = \overline{\{x_k : k > n\}}.$$
It is obvious that the family of closed sets $(T_n)_{n\ge 1}$ has the finite intersection property, i.e. for every finite set $F$ of indices, we have
$$\bigcap_{n\in F} T_n \ne \emptyset.$$
(This follows from the fact that the $T_n$'s form a decreasing sequence of non-empty sets.) By compactness, it follows that
$$\bigcap_{n\ge 1} T_n \ne \emptyset.$$
Take a point $x \in \bigcap_{n\ge 1} T_n$. The key feature of $x$ is given by the following:

Claim 1: For every $\varepsilon > 0$ and every integer $\ell \ge 1$, there exists some integer $N(\varepsilon, \ell) > \ell$ such that $d(x_{N(\varepsilon,\ell)}, x) < \varepsilon$.

This is a consequence of the fact that, for every $\ell \ge 1$, the point $x$ belongs to the closure $\overline{\{x_N : N > \ell\}}$, so for every $\varepsilon > 0$ we have
$$B_\varepsilon(x) \cap \{x_N : N > \ell\} \ne \emptyset.$$
Using Claim 1, we define a sequence $(k_n)_{n\ge 0}$ of integers, recursively by
$$k_n = N\big(\tfrac{1}{n}, k_{n-1}\big), \quad \forall\, n \ge 1.$$
(The initial term $k_0$ is chosen arbitrarily.) We have, by construction, $k_0 < k_1 < k_2 < \dots$, and
$$d(x_{k_n}, x) < \frac{1}{n}, \quad \forall\, n \ge 1,$$
so $(x_{k_n})_{n\ge 1}$ is indeed a subsequence of $(x_k)_{k\ge 1}$, which is convergent (to $x$).

(ii) ⇒ (i). Assume (ii). Before we start proving that $X$ is compact, we shall need some preparations.

Claim 2: For every $r > 0$ there exists a finite set $F \subset X$, such that
$$X = \bigcup_{x\in F} B_r(x).$$

We prove this by contradiction. Assume there exists some $r > 0$, such that
$$\bigcup_{x\in F} B_r(x) \subsetneq X,$$
for every finite set $F \subset X$. In particular, there exists a sequence $(x_n)_{n\ge 1}$ such that
$$x_{n+1} \in X \smallsetminus \big[B_r(x_1) \cup \dots \cup B_r(x_n)\big], \quad \forall\, n \ge 1.$$
This will force
$$d(x_m, x_n) \ge r, \quad \forall\, m > n \ge 1.$$
Notice that every subsequence $(x_{k_n})_{n\ge 1}$ will satisfy the same property
$$d(x_{k_m}, x_{k_n}) \ge r, \quad \forall\, m > n \ge 1.$$
This proves that no subsequence of $(x_n)_{n\ge 1}$ is Cauchy, so no subsequence of $(x_n)_{n\ge 1}$ can be convergent, thus contradicting (ii).

Having proven Claim 2, we choose, for every integer $n \ge 1$, a finite set $F_n \subset X$ such that
$$X = \bigcup_{x\in F_n} B_{1/n}(x).$$

Claim 3: The collection $\mathcal{W} = \{B_{1/n}(x) : n \in \mathbb{N},\ x \in F_n\}$ is a base for the metric topology.

What we need to show is that every open set is a union of sets in $\mathcal{W}$. Fix an open set $D$ and a point $p \in D$. Choose $r > 0$, such that $B_r(p) \subset D$. Choose then some integer $n \ge 1$, such that $\frac{1}{n} < \frac{r}{2}$, and choose some point $x \in F_n$, such that $p \in B_{1/n}(x)$. Notice that, for every $y \in B_{1/n}(x)$, we have
$$d(y, p) \le d(y, x) + d(x, p) < \frac{1}{n} + \frac{1}{n} < r,$$
which proves that $y \in B_r(p)$. Therefore we have
$$p \in B_{1/n}(x) \subset B_r(p) \subset D.$$

Since $p \in D$ is arbitrary, this proves that $D$ is a union of sets in $\mathcal{W}$.

We now begin proving that $X$ is compact. Start with a collection $(D_i)_{i\in I}$ of open sets, with $\bigcup_{i\in I} D_i = X$. We need to find a finite set of indices $I_0 \subset I$, such that $\bigcup_{i\in I_0} D_i = X$. First we show that:

Claim 4: There exists a countable set of indices $I_1 \subset I$, such that
$$\bigcup_{i\in I_1} D_i = X.$$

The key fact is that the base $\mathcal{W}$ is countable. Let us enumerate the base $\mathcal{W}$ as a sequence
$$\mathcal{W} = \{W_m : m \in \mathbb{N}\}.$$
For each $i \in I$, we define the set
$$M_i = \{m \ge 1 : W_m \subset D_i\}.$$
By Claim 3, we know that for every $x \in D_i$ there exists some $m \in M_i$ such that $x \in W_m \subset D_i$. In particular this proves the equality
$$D_i = \bigcup_{m\in M_i} W_m, \quad \forall\, i \in I.$$
Consider then the union $M = \bigcup_{i\in I} M_i$, which is countable, being a subset of the integers. We clearly have
$$\bigcup_{m\in M} W_m = \bigcup_{i\in I} \Big( \bigcup_{m\in M_i} W_m \Big) = \bigcup_{i\in I} D_i = X.$$
For every $m \in M$ we choose an $i_m \in I$, such that $m \in M_{i_m}$. If we take
$$I_1 = \{i_m : m \in M\},$$
then $I_1$ is obviously countable, and since we clearly have $W_m \subset D_{i_m}$, we get
$$X = \bigcup_{m\in M} W_m \subset \bigcup_{m\in M} D_{i_m} = \bigcup_{i\in I_1} D_i,$$
so the Claim is proven.

Let us list the countable set $I_1$ as

$$I_1 = \{i_k : k \ge 1\}.$$
(Of course, if $I_1$ is already finite, there is nothing to prove. So we will assume that $I_1$ is infinite.) In order to finish the proof, we must find some $k$, such that $D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k} = X$. Assume no such $k$ can be found, which means that
$$D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k} \subsetneq X, \quad \forall\, k \ge 1.$$
In other words, if we define for each $k \ge 1$ the closed set
$$A_k = X \smallsetminus (D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k}),$$
we have
$$A_k \ne \emptyset, \quad \forall\, k \ge 1.$$
For each $k \ge 1$ we choose a point $x_k \in A_k$. This way we have constructed a sequence $(x_k)_{k\ge 1} \subset X$, so using property (ii) we can find a convergent subsequence. This means that we have a sequence of integers
$$1 \le k_1 < k_2 < \dots$$
and a point $x \in X$, such that $\lim_{n\to\infty} x_{k_n} = x$. Notice that, since
$$k_n \ge n, \quad \forall\, n \ge 1,$$
and since the sequence $(A_k)_{k\ge 1}$ is decreasing, we get the fact that, for each $m \ge 1$, we have
$$x_{k_n} \in A_m, \quad \forall\, n \ge m.$$
Since $A_m$ is closed, this forces $x \in A_m$, for all $m \ge 1$. But this is clearly impossible, since
$$\bigcap_{m\ge 1} A_m = X \smallsetminus \Big( \bigcup_{m\ge 1} (D_{i_1} \cup \dots \cup D_{i_m}) \Big) = X \smallsetminus \Big( \bigcup_{i\in I_1} D_i \Big) = \emptyset.$$

Corollary 6.1 (of the proof). Every compact metric space is second countable, which means that there exists a sequence $(W_m)_{m\ge 1}$ of open sets, with the property

(b) for every open set $D$, there exists a subset $M \subset \mathbb{N}$ such that
$$D = \bigcup_{m\in M} W_m.$$

Proof. Use the implication (i) ⇒ (ii), and then the steps in the proof of (ii) ⇒ (i), up to the proof of Claim 3. The base $\mathcal{W}$ constructed there is countable.

Corollary 6.2. Let $(X, d)$ be a metric space. For a subset $K \subset X$ the following are equivalent:

(i) every sequence in $K$ has a subsequence which is convergent to some point in $K$;

(ii) $K$ is compact in $X$.

Proof. (i) ⇒ (ii). By the above Theorem, we know that when we equip $K$ with the metric $d\big|_{K\times K}$, then $K$ is compact. This means that $K$ is compact in the induced topology, which means exactly that $K$ is compact in $X$.

(ii) ⇒ (i). Argue as above. If $K$ is compact in $X$, then $K$ is compact when equipped with the induced topology, which means that $(K, d\big|_{K\times K})$ is compact.

Corollary 6.3. Let $X$ and $Y$ be metric spaces, and let $f : X \to Y$ be a continuous map. If $X$ is compact, then $f$ is uniformly continuous, that is,

• for every $\varepsilon > 0$, there exists some $\delta_\varepsilon > 0$, such that
$$d\big(f(x), f(x')\big) < \varepsilon, \quad \text{for all } x, x' \in X \text{ with } d(x, x') < \delta_\varepsilon.$$

Proof. Suppose $f$ is not uniformly continuous, so there exists some $\varepsilon_0 > 0$, with the property that for any $\delta > 0$ there exist $x, x' \in X$, with $d(x, x') < \delta$, but $d\big(f(x), f(x')\big) \ge \varepsilon_0$. In particular, one can construct two sequences $(x_n)_{n\ge 1}$ and $(x'_n)_{n\ge 1}$ with
$$d(x_n, x'_n) < \frac{1}{n} \quad \text{and} \quad d\big(f(x_n), f(x'_n)\big) \ge \varepsilon_0, \quad \forall\, n \ge 1. \tag{2}$$
Using compactness (Theorem 6.1), we can find a subsequence $(x_{n_k})_{k\ge 1}$ of $(x_n)_{n\ge 1}$ which converges to some point $p$. On the one hand, we have
$$d(p, x'_{n_k}) \le d(p, x_{n_k}) + d(x_{n_k}, x'_{n_k}) < d(p, x_{n_k}) + \frac{1}{n_k}, \quad \forall\, k \ge 1,$$
which proves that
$$\lim_{k\to\infty} x'_{n_k} = p. \tag{3}$$
On the other hand, using (2) we also have
$$\varepsilon_0 \le d\big(f(x_{n_k}), f(x'_{n_k})\big) \le d\big(f(p), f(x_{n_k})\big) + d\big(f(p), f(x'_{n_k})\big),$$
which leads to a contradiction, because the equalities
$$\lim_{k\to\infty} x_{n_k} = \lim_{k\to\infty} x'_{n_k} = p,$$
together with the continuity of $f$, will force
$$\lim_{k\to\infty} d\big(f(p), f(x_{n_k})\big) = \lim_{k\to\infty} d\big(f(p), f(x'_{n_k})\big) = 0.$$
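The role of compactness in Corollary 6.3 can be illustrated on a non-compact space. The sketch below is an assumption-laden example, not part of the notes: it takes $X = (0, 1]$ with the usual metric and the hypothetical function $f(x) = 1/x$, and exhibits pairs $(x_n, x'_n)$ with exactly the two properties listed in (2), for the hypothetical choice $\varepsilon_0 = 1/2$.

```python
def f(x: float) -> float:
    """f(x) = 1/x, continuous on the non-compact space (0, 1]."""
    return 1.0 / x


EPS0 = 0.5  # hypothetical choice of the constant epsilon_0 from the proof

# Pairs x_n = 1/n and x'_n = 1/(2n), stored together with the index n.
pairs = [(n, 1.0 / n, 1.0 / (2 * n)) for n in range(1, 201)]

# First half of (2): d(x_n, x'_n) < 1/n.
points_get_close = all(abs(x - y) < 1.0 / n for n, x, y in pairs)

# Second half of (2): d(f(x_n), f(x'_n)) >= eps_0 (here the distance is about n).
values_stay_apart = all(abs(f(x) - f(y)) >= EPS0 for n, x, y in pairs)
```

Both flags come out true, so $f$ is continuous but not uniformly continuous on $(0, 1]$. The proof above breaks down here because $(x_n)$ has no subsequence converging to a point of $(0, 1]$.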

Remark 6.5. Let $X$ be a metric space. Then any compact subset $K \subset X$ is closed (this is a consequence of the fact that $X$ is Hausdorff) and bounded, in the sense that for every $p \in X$ we have
$$\sup_{x\in K} d(x, p) < \infty.$$
This is a consequence of the continuity (see ??) of the map
$$K \ni x \longmapsto d(x, p) \in [0, \infty).$$
In general however the converse is not true, i.e. there are metric spaces in which closed bounded sets may fail to be compact.

Exercise 1. Equip $\mathbb{R}$ with the metric
$$d(x, y) = \frac{|x - y|}{1 + |x - y|}, \quad \forall\, x, y \in \mathbb{R}.$$
Prove that $d$ is indeed a metric on $\mathbb{R}$, and the metric topology on $\mathbb{R}$ defined by $d$ is the usual topology. Prove that $\mathbb{R}$ is bounded with respect to this metric.
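Before attempting the proof, the claims of Exercise 1 can be sanity-checked on random samples. This is only a numerical sketch (the sample size, the seed and the variable names are arbitrary assumptions); it does not replace the argument asked for in the exercise.

```python
import itertools
import random


def d(x: float, y: float) -> float:
    """The metric of Exercise 1: |x - y| / (1 + |x - y|)."""
    t = abs(x - y)
    return t / (1.0 + t)


random.seed(0)
pts = [random.uniform(-1e3, 1e3) for _ in range(60)]

# Largest violation of the triangle inequality d(x, z) <= d(x, y) + d(y, z)
# over all sampled triples; it should not be positive, up to rounding.
worst = max(d(x, z) - d(x, y) - d(y, z)
            for x, y, z in itertools.product(pts, repeat=3))

# Largest sampled distance; boundedness predicts a value below 1.
largest = max(d(x, y) for x in pts for y in pts)
```

On the sample, `worst` is non-positive up to rounding and `largest` stays below 1, in agreement with the statement that $\mathbb{R}$ is bounded for this metric.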

Exercise 2. Start with a metric space $X$, and let $(x_n)_{n\ge 1} \subset X$ be a sequence which is convergent to some point $x$. Prove that the set
$$K = \{x\} \cup \{x_n : n \ge 1\}$$
is compact in $X$.

Definition. Let $(X, d)$ be a metric space. For a point $x \in X$ and a non-empty subset $A \subset X$, one defines the distance from $x$ to $A$ as the number
$$d(x, A) = \inf\{d(x, a) : a \in A\}.$$

Exercise 3. Let $(X, d)$ be a metric space, and let $A$ be a non-empty subset of $X$.

(i) For a point $x \in X$, prove that the equality $d(x, A) = 0$ is equivalent to the fact that $x \in \overline{A}$.

(ii) Prove the inequality
$$\big| d(x, A) - d(y, A) \big| \le d(x, y), \quad \forall\, x, y \in X.$$

Using (ii) conclude that the map
$$X \ni x \longmapsto d(x, A) \in [0, \infty)$$
is continuous.

Proposition 6.3. Let $(X, d)$ be a metric space. When equipped with the metric topology, $X$ is normal.

Proof. Let $A$ and $B$ be non-empty closed subsets of $X$ with $A \cap B = \emptyset$ (if one of them is empty, the statement is trivial). We need to find open sets $U, V \subset X$, with $U \supset A$, $V \supset B$, and $U \cap V = \emptyset$. We are going to use a converse of Urysohn Lemma. More explicitly, let us define the function $f : X \to [0, 1]$ by
$$f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}, \quad x \in X.$$
Notice that by Exercise 3, both the numerator and denominator are continuous, and the denominator never vanishes: if $d(x, A) + d(x, B) = 0$, then Exercise 3 (i) gives $x \in \overline{A} \cap \overline{B} = A \cap B = \emptyset$, which is impossible. So $f$ is indeed continuous. It is obvious that $f\big|_A = 0$ and $f\big|_B = 1$, so if we take the open sets $U = f^{-1}\big((-\infty, \tfrac{1}{2})\big)$ and $V = f^{-1}\big((\tfrac{1}{2}, \infty)\big)$, we clearly get the desired result.
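The separating function from the proof of Proposition 6.3 is easy to compute when $A$ and $B$ are finite (hence closed) subsets of the plane. The sketch below assumes the Euclidean metric on $\mathbb{R}^2$ and two hypothetical sets `A` and `B`; the helper names are illustrative and not part of the notes.

```python
import math


def dist_to_set(x, S):
    """d(x, S) = inf{d(x, a) : a in S}; for a finite set S the infimum is a minimum."""
    return min(math.dist(x, a) for a in S)


def urysohn(x, A, B):
    """f(x) = d(x, A) / (d(x, A) + d(x, B)) for disjoint non-empty closed sets A, B."""
    dA = dist_to_set(x, A)
    dB = dist_to_set(x, B)
    return dA / (dA + dB)


# Two hypothetical disjoint finite sets in the Euclidean plane.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 3.0), (4.0, 3.0)]

values_on_A = [urysohn(a, A, B) for a in A]   # f vanishes on A
values_on_B = [urysohn(b, A, B) for b in B]   # f equals 1 on B
midpoint_value = urysohn((0.0, 1.5), A, B)    # a point equidistant from A and B
```

Points with value below $\tfrac12$ lie in $U$ and points with value above $\tfrac12$ lie in $V$; the equidistant point $(0, 1.5)$ lies in neither, which is consistent with $U \cap V = \emptyset$.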

We continue now with a discussion on completeness.

Definitions. Let $(X, d)$ be a metric space. A sequence $(x_n)_{n\ge 1} \subset X$ is said to be a Cauchy sequence, if it has the following property.

(C) For every $\varepsilon > 0$, there exists some integer $N_\varepsilon \ge 1$ such that
$$d(x_m, x_n) < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon.$$

The metric space $(X, d)$ is said to be complete, if every Cauchy sequence is convergent.

The following result summarizes some equivalent characterizations of completeness.

Proposition 6.4. Let $(X, d)$ be a metric space. The following are equivalent.

(i) $(X, d)$ is complete.

(ii) Every sequence $(x_n)_{n\ge 1} \subset X$, with
$$\sum_{n=1}^{\infty} d(x_{n+1}, x_n) < \infty, \tag{4}$$
is convergent.

(iii) Every Cauchy sequence has a convergent subsequence.

Proof. (i) ⇒ (ii). Assume $X$ is complete. Let $(x_n)_{n\ge 1} \subset X$ be a sequence with property (4). To prove (ii) it suffices to show that $(x_n)_{n\ge 1}$ is Cauchy. For every $N \ge 1$ we define
$$R_N = \sum_{n=N}^{\infty} d(x_{n+1}, x_n).$$
Using (4) we get $\lim_{N\to\infty} R_N = 0$, so for every $\varepsilon > 0$ there exists some $N(\varepsilon)$ with $R_{N(\varepsilon)} < \varepsilon$. Notice also that the sequence $(R_N)_{N\ge 1}$ is decreasing. If $m > n \ge N(\varepsilon)$, then
$$d(x_m, x_n) \le \sum_{k=n}^{m-1} d(x_{k+1}, x_k) \le \sum_{k=n}^{\infty} d(x_{k+1}, x_k) = R_n \le R_{N(\varepsilon)} < \varepsilon,$$
so $(x_n)_{n\ge 1}$ is indeed Cauchy.

(ii) ⇒ (iii). Start with some Cauchy sequence $(y_k)_{k\ge 1}$. For every $n \ge 1$ choose an integer $N(n) \ge 1$ such that
$$d(y_k, y_\ell) < \frac{1}{2^n}, \quad \forall\, k, \ell \ge N(n). \tag{5}$$
Start with some arbitrary $k_1 \ge N(1)$ and define recursively an entire sequence $(k_n)_{n\ge 1}$ of integers, by
$$k_{n+1} = \max\{k_n + 1, N(n+1)\}, \quad n \ge 1.$$
Clearly we have $k_1 < k_2 < \dots$, and since we have
$$k_{n+1} > k_n \ge N(n), \quad \forall\, n \ge 1,$$
using (5), we get
$$d(y_{k_{n+1}}, y_{k_n}) < \frac{1}{2^n}, \quad \forall\, n \ge 1.$$
So if we define the subsequence $x_n = y_{k_n}$, $n \ge 1$, we will have
$$\sum_{n=1}^{\infty} d(x_{n+1}, x_n) \le \sum_{n=1}^{\infty} \frac{1}{2^n} = 1,$$
so the subsequence $(x_n)_{n\ge 1}$ satisfies condition (4). By (ii) the subsequence $(x_n)_{n\ge 1}$ is convergent.

(iii) ⇒ (i). Assume condition (iii) holds. Start with some Cauchy sequence $(x_n)_{n\ge 1}$. For every integer $n \ge 1$ we put
$$S_n = \sup_{\ell, m \ge n} d(x_\ell, x_m).$$
Since $(x_n)_{n\ge 1}$ is Cauchy, we have
$$\lim_{n\to\infty} S_n = 0. \tag{6}$$
Using the assumption, we can find a subsequence $(x_{k_n})_{n\ge 1}$ (defined by an increasing sequence of integers $1 \le k_1 < k_2 < \dots$) which is convergent to some point $x$. We are going to prove that the entire sequence $(x_n)_{n\ge 1}$ is convergent to $x$. Fix for the moment $n \ge 1$. For every $m \ge n$, we have $k_m \ge m \ge n$, so we have
$$S_n \ge d(x_n, x_{k_m}), \quad \forall\, m \ge n. \tag{7}$$
By Remark 3.4, we also know that
$$\lim_{m\to\infty} d(x_n, x_{k_m}) = d(x_n, x),$$
so if we take $\lim_{m\to\infty}$ in (7) we will get
$$d(x_n, x) \le S_n.$$
Since this estimate holds for arbitrary $n \ge 1$, using (6) we immediately get the fact that $(x_n)_{n\ge 1}$ is indeed convergent to $x$.
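The recursive choice of indices in the step (ii) ⇒ (iii) can be carried out explicitly for a concrete Cauchy sequence. The sketch below is an illustration under assumptions that are not in the notes: the sequence is $y_k = 1/k$ in $\mathbb{R}$, and the function `N` is one valid choice of the integers $N(n)$ in (5), since $|1/k - 1/\ell| < 1/(2^n + 1) < 2^{-n}$ whenever $k, \ell \ge 2^n + 1$. Exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def y(k: int) -> Fraction:
    """The hypothetical Cauchy sequence y_k = 1/k."""
    return Fraction(1, k)


def N(n: int) -> int:
    """One valid modulus for (5): k, l >= N(n) implies |y_k - y_l| < 2^-n."""
    return 2 ** n + 1


def fast_indices(count: int) -> list:
    """Indices k_1 < k_2 < ... with k_1 = N(1) and k_{n+1} = max{k_n + 1, N(n+1)}."""
    ks = [N(1)]
    for n in range(1, count):
        ks.append(max(ks[-1] + 1, N(n + 1)))
    return ks


ks = fast_indices(12)  # ks[n - 1] plays the role of k_n
# steps[i] = d(x_{i+2}, x_{i+1}) for the subsequence x_n = y_{k_n}
steps = [abs(y(ks[i + 1]) - y(ks[i])) for i in range(len(ks) - 1)]
partial_sum = sum(steps)
```

Each entry `steps[i]` is below $2^{-(i+1)}$, so the partial sums of the series in (4) stay below 1, as the proof predicts.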

Proposition 6.5. Suppose $(X, d)$ is a complete metric space, and $Y$ is a subset of $X$. The following are equivalent:

(i) $Y$ is complete, when equipped with the metric from $X$;

(ii) $Y$ is closed in $X$, in the metric topology.

Proof. (i) ⇒ (ii). Assume $Y$ is complete, and let us prove that $Y$ is closed. Start with a point $x \in \overline{Y}$. Then there exists a sequence $(y_n)_{n\ge 1} \subset Y$ with $\lim_{n\to\infty} y_n = x$. Notice that $(y_n)_{n\ge 1}$ is Cauchy in $Y$, so by assumption, $(y_n)_{n\ge 1}$ is convergent to some point in $Y$. By the uniqueness of the limit, this forces $x \in Y$.

(ii) ⇒ (i). Assume $Y$ is closed, and let us prove that $Y$ is complete. Start with a Cauchy sequence $(y_n)_{n\ge 1} \subset Y$. Since $X$ is complete, the sequence $(y_n)_{n\ge 1}$ is convergent to some point $x \in X$. Since $Y$ is closed, this forces $x \in Y$.

Remark 6.6. Using Theorem 6.1, we immediately see that a metric space, which is compact in the metric topology, is automatically complete.

The next result identifies those complete metric spaces that are compact. In order to formulate it, we need the following:

Definition. Let $(X, d)$ be a metric space, and let $\varepsilon > 0$. A subset $A \subset X$ is said to be $\varepsilon$-rare, if
$$d(a, b) \ge \varepsilon, \quad \text{for all } a, b \in A \text{ with } a \ne b.$$

Proposition 6.6. Let $(X, d)$ be a complete metric space. The following are equivalent:

(i) $X$ is compact in the metric topology;

(ii) for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $X$ are finite;

(iii) for any $\varepsilon > 0$, there exist finitely many points $p_1, p_2, \dots, p_n \in X$, such that
$$X = B_\varepsilon(p_1) \cup B_\varepsilon(p_2) \cup \dots \cup B_\varepsilon(p_n).$$

Proof. (i) ⇒ (ii). Assume $X$ is compact. We prove (ii) by contradiction. Assume there exists some $\varepsilon > 0$ and an infinite $\varepsilon$-rare set $A \subset X$. It then follows that there exists a sequence $(a_n)_{n\ge 1} \subset A$, such that
$$d(a_m, a_n) \ge \varepsilon, \quad \forall\, m > n \ge 1.$$
It is clear that no subsequence of $(a_n)_{n\ge 1}$ is Cauchy, which means that $(a_n)_{n\ge 1}$ does not have any convergent subsequence, thus contradicting the fact that $X$ is compact (see Theorem 6.1).

(ii) ⇒ (iii). Assume property (ii) and let us prove (iii) by contradiction. Assume there exists some $\varepsilon > 0$, such that, for every finite set $F \subset X$, one has a strict inclusion
$$\bigcup_{x\in F} B_\varepsilon(x) \subsetneq X.$$
Start with some arbitrary point $a_1 \in X$, and construct recursively a sequence $(a_n)_{n\ge 1} \subset X$, by choosing
$$a_{n+1} \in X \smallsetminus \big[B_\varepsilon(a_1) \cup \dots \cup B_\varepsilon(a_n)\big], \quad \forall\, n \ge 1.$$
This will then force
$$d(a_m, a_n) \ge \varepsilon, \quad \forall\, m > n \ge 1,$$
so $A = \{a_n : n \in \mathbb{N}\}$ will be an infinite $\varepsilon$-rare set, thus contradicting (ii).

(iii) ⇒ (i). Assume property (iii), and let us prove that $X$ is compact. We are going to use Theorem 6.1. Start with an arbitrary sequence $(x_n)_{n\ge 1} \subset X$, and let us construct a convergent subsequence.

Claim: There exists a sequence $(p_n)_{n\ge 1} \subset X$, such that for every integer $k \ge 1$, the set
$$M_k = \Big\{ n \in \mathbb{N} : x_n \in \bigcap_{\ell=1}^{k} B_{1/\ell}(p_\ell) \Big\}$$
is infinite.

The sequence $(p_n)_{n\ge 1}$ is constructed recursively. To start, we use (iii) to find a finite set $F_1 \subset X$, such that
$$X = \bigcup_{p\in F_1} B_1(p).$$
If we define, for each $p \in F_1$, the set
$$S_1(p) = \{n \in \mathbb{N} : x_n \in B_1(p)\},$$
then we clearly have
$$\bigcup_{p\in F_1} S_1(p) = \mathbb{N},$$
so in particular one of the sets $S_1(p)$, $p \in F_1$, is infinite. We choose $p_1 \in F_1$ to be one point for which $S_1(p_1) = M_1$ is infinite.

Suppose now we have constructed points $p_1, p_2, \dots, p_{m-1}$, such that, for every $k \in \{1, \dots, m-1\}$, the set
$$M_k = \Big\{ n \in \mathbb{N} : x_n \in \bigcap_{\ell=1}^{k} B_{1/\ell}(p_\ell) \Big\}$$
is infinite, and let us indicate how the next term $p_m$ is to be constructed. Start with a finite set $F_m \subset X$, such that
$$X = \bigcup_{p\in F_m} B_{1/m}(p),$$
and define, for each $p \in F_m$, the set
$$S_m(p) = \{n \in M_{m-1} : x_n \in B_{1/m}(p)\}.$$
It is clear that
$$M_{m-1} = \bigcup_{p\in F_m} S_m(p),$$
and since $M_{m-1}$ is infinite, it follows that one of the sets $S_m(p)$, $p \in F_m$, is infinite. We then choose $p_m \in F_m$ to be one point for which $S_m(p_m)$ is infinite; notice that $M_m = S_m(p_m)$.

Having proven the Claim, let us construct a sequence of integers $1 \le n_1 < n_2 < \dots$ as follows. Start with some arbitrary $n_1 \in M_1$. Once $n_1 < n_2 < \dots < n_k$ have been constructed, we choose the integer $n_{k+1} \in M_{k+1}$, such that $n_{k+1} > n_k$. (It is here that we use the fact that $M_{k+1}$ is infinite.) By construction, we have $n_k \in M_k$, $\forall\, k \ge 1$.

Suppose $k \ge \ell \ge 1$. Then by construction we have $n_k \in M_k \subset M_\ell$ and $n_\ell \in M_\ell$. In particular we get
$$d(x_{n_k}, x_{n_\ell}) \le d(x_{n_k}, p_\ell) + d(x_{n_\ell}, p_\ell) < \frac{2}{\ell}.$$
The above estimate clearly proves that the subsequence $(x_{n_k})_{k\ge 1}$ is Cauchy. Since $X$ is complete, it follows that $(x_{n_k})_{k\ge 1}$ is convergent.
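The recursive choice in the step (ii) ⇒ (iii) is a greedy procedure, and on a finite sample of points it can be run to completion. The following sketch assumes a hypothetical random cloud in the unit square with the Euclidean metric; the function name `greedy_rare_subset`, the seed and the radius are illustrative choices, not part of the notes.

```python
import math
import random


def greedy_rare_subset(points, eps):
    """Keep a point whenever it lies at distance >= eps from every point kept so far.

    The kept points form an eps-rare set, and every input point lies in the open
    ball of radius eps around some kept point.
    """
    chosen = []
    for p in points:
        if all(math.dist(p, a) >= eps for a in chosen):
            chosen.append(p)
    return chosen


random.seed(1)
cloud = [(random.random(), random.random()) for _ in range(500)]
EPS = 0.2
net = greedy_rare_subset(cloud, EPS)

# The two properties used in the proof, checked on the sample.
is_rare = all(math.dist(a, b) >= EPS for i, a in enumerate(net) for b in net[:i])
covers = all(any(math.dist(p, a) < EPS for a in net) for p in cloud)
```

The procedure stops exactly when the balls $B_\varepsilon(a_1), \dots, B_\varepsilon(a_n)$ cover the sample, which mirrors why an infinite run of the recursion in the proof would produce an infinite $\varepsilon$-rare set.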

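Corollary 6.4. Let $(X, d)$ be a complete metric space, and let $A$ be a subset of $X$. The following are equivalent:

(i) the closure $\overline{A}$ is compact in $X$;

(ii) for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $A$ are finite.

Proof. (i) ⇒ (ii). This is trivial from the above result, since every $\varepsilon$-rare subset of $A$ is also an $\varepsilon$-rare subset of $\overline{A}$.

(ii) ⇒ (i). Assume (ii), and let us prove that $\overline{A}$ is compact. Since $\overline{A}$ is complete (by Proposition 6.5), it suffices to prove that, for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $\overline{A}$ are finite. Fix $\varepsilon > 0$, and let $B$ be an $\varepsilon$-rare subset of $\overline{A}$. For each $x \in B$, let us choose a point $a_x \in A$, such that $x \in B_{\varepsilon/3}(a_x)$. Suppose $x, y \in B$ are such that $x \ne y$. Then
$$d(a_x, a_y) \ge d(x, y) - d(a_x, x) - d(a_y, y) > \varepsilon - \frac{\varepsilon}{3} - \frac{\varepsilon}{3} = \frac{\varepsilon}{3}.$$
In particular, this shows that the map
$$f : B \ni x \longmapsto a_x \in A$$
is injective, and the set $f(B)$ is an $(\varepsilon/3)$-rare subset of $A$. By condition (ii) the set $f(B)$ is finite, and since $f$ is injective, this forces $B$ to be finite.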

We continue with an important construction.

Definitions. Let $(X, d)$ be a metric space. We define
$$\mathrm{cs}(X, d) = \big\{ \mathbf{x} = (x_n)_{n\ge 1} : \mathbf{x} \text{ Cauchy sequence in } X \big\}.$$
We say that two Cauchy sequences $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{y} = (y_n)_{n\ge 1}$ in $X$ are equivalent, if
$$\lim_{n\to\infty} d(x_n, y_n) = 0.$$
In this case we write $\mathbf{x} \sim \mathbf{y}$. (It is fairly obvious that $\sim$ is indeed an equivalence relation.) We define the quotient space
$$\tilde{X} = \mathrm{cs}(X, d)/\sim.$$
For an element $\mathbf{x} \in \mathrm{cs}(X, d)$, we denote its equivalence class by $\tilde{\mathbf{x}}$.

Finally, for a point $x \in X$, we define $\langle x \rangle \in \tilde{X}$ to be the equivalence class of the constant sequence $x$ (which is obviously Cauchy).

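Remark 6.7. Let $(X, d)$ be a metric space. If $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{y} = (y_n)_{n\ge 1}$ are Cauchy sequences in $X$, then the sequence of real numbers $\big(d(x_n, y_n)\big)_{n\ge 1}$ is convergent. Indeed, for any $m, n$ we have
$$\big| d(x_m, y_m) - d(x_n, y_n) \big| \le \big| d(x_m, y_m) - d(x_n, y_m) \big| + \big| d(x_n, y_m) - d(x_n, y_n) \big| \le d(x_m, x_n) + d(y_m, y_n),$$
so $\big(d(x_n, y_n)\big)_{n\ge 1}$ is a Cauchy sequence in $\mathbb{R}$. We can then define
$$\delta(\mathbf{x}, \mathbf{y}) = \lim_{n\to\infty} d(x_n, y_n).$$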

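Proposition 6.7. Let $(X, d)$ be a metric space.

A. The map $\delta : \mathrm{cs}(X, d) \times \mathrm{cs}(X, d) \to [0, \infty)$ has the following properties:

(i) $\delta(\mathbf{x}, \mathbf{y}) = \delta(\mathbf{y}, \mathbf{x})$, $\forall\, \mathbf{x}, \mathbf{y} \in \mathrm{cs}(X, d)$;

(ii) $\delta(\mathbf{x}, \mathbf{y}) \le \delta(\mathbf{x}, \mathbf{z}) + \delta(\mathbf{z}, \mathbf{y})$, $\forall\, \mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathrm{cs}(X, d)$;

(iii) $\delta(\mathbf{x}, \mathbf{y}) = 0 \Leftrightarrow \mathbf{x} \sim \mathbf{y}$;

(iv) if $\mathbf{x}, \mathbf{x}', \mathbf{y}, \mathbf{y}' \in \mathrm{cs}(X, d)$ are such that $\mathbf{x} \sim \mathbf{x}'$ and $\mathbf{y} \sim \mathbf{y}'$, then $\delta(\mathbf{x}, \mathbf{y}) = \delta(\mathbf{x}', \mathbf{y}')$.

B. The map $\tilde{d} : \tilde{X} \times \tilde{X} \to [0, \infty)$, correctly defined by
$$\tilde{d}(\tilde{\mathbf{x}}, \tilde{\mathbf{y}}) = \delta(\mathbf{x}, \mathbf{y}), \quad \forall\, \mathbf{x}, \mathbf{y} \in \mathrm{cs}(X, d),$$
is a metric on $\tilde{X}$.

C. The map $X \ni x \longmapsto \langle x \rangle \in \tilde{X}$ is isometric, in the sense that
$$\tilde{d}(\langle x \rangle, \langle y \rangle) = d(x, y), \quad \forall\, x, y \in X.$$

Proof. A. Properties (i), (ii) and (iii) are obvious. To prove property (iv) let $\mathbf{x} = (x_n)_{n\ge 1}$, $\mathbf{x}' = (x'_n)_{n\ge 1}$, $\mathbf{y} = (y_n)_{n\ge 1}$, and $\mathbf{y}' = (y'_n)_{n\ge 1}$. The inequality
$$d(x'_n, y'_n) \le d(x'_n, x_n) + d(x_n, y_n) + d(y_n, y'_n),$$
combined with $\lim_{n\to\infty} d(x'_n, x_n) = \lim_{n\to\infty} d(y_n, y'_n) = 0$ immediately gives
$$\delta(\mathbf{x}', \mathbf{y}') = \lim_{n\to\infty} d(x'_n, y'_n) \le \lim_{n\to\infty} d(x_n, y_n) = \delta(\mathbf{x}, \mathbf{y}).$$
By symmetry we also have $\delta(\mathbf{x}, \mathbf{y}) \le \delta(\mathbf{x}', \mathbf{y}')$, and we are done.

B. This is immediate from A.

C. Obvious, from the definition.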

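Proposition 6.8. Let $(X, d)$ be a metric space.

(i) For any Cauchy sequence $\mathbf{x} = (x_n)_{n\ge 1}$ in $X$, one has
$$\lim_{n\to\infty} \langle x_n \rangle = \tilde{\mathbf{x}}, \quad \text{in } \tilde{X}.$$

(ii) The metric space $(\tilde{X}, \tilde{d})$ is complete.

Proof. (i). For every $n \ge 1$, we have
$$\tilde{d}\big(\langle x_n \rangle, \tilde{\mathbf{x}}\big) = \lim_{m\to\infty} d(x_n, x_m). \tag{8}$$
Now if we start with some $\varepsilon > 0$, and we choose $N_\varepsilon$ such that
$$d(x_n, x_m) < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon,$$
then (8) shows that
$$\tilde{d}\big(\langle x_n \rangle, \tilde{\mathbf{x}}\big) \le \varepsilon, \quad \forall\, n \ge N_\varepsilon,$$
so we indeed have
$$\lim_{n\to\infty} \tilde{d}\big(\langle x_n \rangle, \tilde{\mathbf{x}}\big) = 0.$$

(ii). Let $(p_k)_{k\ge 1}$ be a Cauchy sequence in $\tilde{X}$. Using (i), we can choose, for each $k \ge 1$, an element $x_k \in X$, such that
$$\tilde{d}(\langle x_k \rangle, p_k) \le \frac{1}{2^k}.$$

Claim 1: The sequence $\mathbf{x} = (x_k)_{k\ge 1}$ is Cauchy in $X$.

Indeed, for $k \ge \ell \ge 1$ we have
$$d(x_k, x_\ell) = \tilde{d}\big(\langle x_k \rangle, \langle x_\ell \rangle\big) \le \tilde{d}\big(\langle x_k \rangle, p_k\big) + \tilde{d}(p_k, p_\ell) + \tilde{d}\big(p_\ell, \langle x_\ell \rangle\big) \le \tilde{d}(p_k, p_\ell) + \frac{2}{2^\ell}.$$
This clearly gives
$$\lim_{N\to\infty} \Big[ \sup_{k,\ell\ge N} d(x_k, x_\ell) \Big] \le \lim_{N\to\infty} \Big[ \sup_{k,\ell\ge N} \tilde{d}(p_k, p_\ell) + \frac{2}{2^N} \Big] = 0,$$
so $\mathbf{x} = (x_k)_{k\ge 1}$ is indeed Cauchy.

The proof of (ii) will then be finished, once we prove:

Claim 2: We have $\lim_{k\to\infty} p_k = \tilde{\mathbf{x}}$ in $\tilde{X}$.

To see this, we observe that, for $\ell \ge k \ge 1$ we have the inequality
$$\tilde{d}\big(p_k, \langle x_\ell \rangle\big) \le \tilde{d}\big(p_k, \langle x_k \rangle\big) + \tilde{d}\big(\langle x_k \rangle, \langle x_\ell \rangle\big) \le \frac{1}{2^k} + d(x_k, x_\ell). \tag{9}$$
If we now start with some $\varepsilon > 0$, and we choose $N_\varepsilon$ such that
$$d(x_k, x_\ell) < \varepsilon, \quad \forall\, k, \ell \ge N_\varepsilon,$$
then (9) gives
$$\tilde{d}\big(p_k, \langle x_\ell \rangle\big) \le \frac{1}{2^k} + \varepsilon, \quad \forall\, \ell \ge k \ge N_\varepsilon.$$
If we keep $k \ge N_\varepsilon$ fixed and take $\lim_{\ell\to\infty}$, using (i) we get
$$\tilde{d}(p_k, \tilde{\mathbf{x}}) = \lim_{\ell\to\infty} \tilde{d}(p_k, \langle x_\ell \rangle) \le \frac{1}{2^k} + \varepsilon, \quad \forall\, k \ge N_\varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, the above estimate clearly proves that
$$\lim_{k\to\infty} \tilde{d}(p_k, \tilde{\mathbf{x}}) = 0,$$
so the sequence $(p_k)_{k\ge 1}$ is convergent (to $\tilde{\mathbf{x}}$).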

Definition. The metric space $(\tilde{X}, \tilde{d})$ is called the completion of $(X, d)$.
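To make the construction concrete, one can take $X = \mathbb{Q}$ with the usual metric and look at rational Cauchy sequences that approximate $\sqrt{2}$. The sketch below rests on assumptions that are not in the notes: the two sequences (continued-fraction convergents and bisection endpoints) are hypothetical examples, and a single term $d(x_n, y_n)$ only approximates the limit $\delta(\mathbf{x}, \mathbf{y})$.

```python
from fractions import Fraction


def convergent(n: int) -> Fraction:
    """n-th continued-fraction convergent p/q of sqrt(2), starting from 1/1."""
    p, q = 1, 1
    for _ in range(n):
        p, q = p + 2 * q, p + q
    return Fraction(p, q)


def bisection(n: int) -> Fraction:
    """Left endpoint after n bisection steps for t^2 = 2 on the interval [1, 2]."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


def constant_one(n: int) -> Fraction:
    """The constant sequence 1, which represents the point <1> of the completion."""
    return Fraction(1)


def delta_term(x, y, n: int) -> Fraction:
    """The n-th term d(x_n, y_n) of the sequence whose limit defines delta(x, y)."""
    return abs(x(n) - y(n))


near_zero = delta_term(convergent, bisection, 40)    # the two sequences look equivalent
far_apart = delta_term(constant_one, bisection, 40)  # close to sqrt(2) - 1
```

The first two sequences have $d(x_n, y_n) \to 0$, so they define the same element of $\tilde{\mathbb{Q}}$, even though neither converges in $\mathbb{Q}$. The constant sequence defines a different element, at distance $\sqrt{2} - 1$ from it.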

The completion has a certain universality property. In order to formulate this property we need the following

Definition. Let $(X, d)$ and $(Y, \rho)$ be metric spaces. A map $f : X \to Y$ is said to be a Lipschitz function, if there exists some constant $C \ge 0$, such that
$$\rho\big(f(x), f(x')\big) \le C \cdot d(x, x'), \quad \forall\, x, x' \in X.$$
Such a constant $C$ is then called a Lipschitz constant for $f$.

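Proposition 6.9. Let $(X, d)$ be a metric space, and let $(\tilde{X}, \tilde{d})$ be its completion. If $(Y, \rho)$ is a complete metric space, and $f : X \to Y$ is a Lipschitz function with Lipschitz constant $C \ge 0$, then there exists a unique continuous function $\tilde{f} : \tilde{X} \to Y$, such that
$$\tilde{f}(\langle x \rangle) = f(x), \quad \forall\, x \in X.$$
Moreover, $\tilde{f}$ is Lipschitz, with Lipschitz constant $C$.

Proof. Start with some Cauchy sequence $\mathbf{x} = (x_n)_{n\ge 1}$ in $X$. Using the inequality
$$\rho\big(f(x_m), f(x_n)\big) \le C \cdot d(x_m, x_n), \quad \forall\, m, n \ge 1,$$
it is obvious that $\big(f(x_n)\big)_{n\ge 1}$ is a Cauchy sequence in $Y$. Since $Y$ is complete, this sequence is convergent. Define
$$\varphi(\mathbf{x}) = \lim_{n\to\infty} f(x_n).$$
This way we have constructed a map $\varphi : \mathrm{cs}(X, d) \to Y$.

Claim: If $\mathbf{x} \sim \mathbf{x}'$, then $\varphi(\mathbf{x}) = \varphi(\mathbf{x}')$.

Indeed, if $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{x}' = (x'_n)_{n\ge 1}$, then the Lipschitz property will give
$$\rho\big(f(x_n), f(x'_n)\big) \le C \cdot d(x_n, x'_n), \quad \forall\, n \ge 1,$$
and using the fact that $\lim_{n\to\infty} d(x_n, x'_n) = 0$, we get $\lim_{n\to\infty} \rho\big(f(x_n), f(x'_n)\big) = 0$. This clearly forces
$$\lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} f(x'_n).$$

Having proven the Claim, we now see that we have a correctly defined map $\tilde{f} : \tilde{X} \to Y$, with the property that
$$\tilde{f}(\tilde{\mathbf{x}}) = \varphi(\mathbf{x}), \quad \forall\, \mathbf{x} \in \mathrm{cs}(X, d).$$
The equality
$$\tilde{f}(\langle x \rangle) = f(x), \quad \forall\, x \in X$$
is trivially satisfied.

Let us check now that $\tilde{f}$ is Lipschitz, with Lipschitz constant $C$. Start with two points $p, p' \in \tilde{X}$, represented as $p = \tilde{\mathbf{x}}$ and $p' = \tilde{\mathbf{x}}'$, for two Cauchy sequences $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{x}' = (x'_n)_{n\ge 1}$ in $X$. Using the definition, we have
$$\tilde{f}(p) = \lim_{n\to\infty} f(x_n) \quad \text{and} \quad \tilde{f}(p') = \lim_{n\to\infty} f(x'_n).$$
This will give
$$\rho\big(\tilde{f}(p), \tilde{f}(p')\big) = \lim_{n\to\infty} \rho\big(f(x_n), f(x'_n)\big).$$
Notice however that
$$\rho\big(f(x_n), f(x'_n)\big) \le C \cdot d(x_n, x'_n), \quad \forall\, n \ge 1,$$
so taking the limit yields
$$\rho\big(\tilde{f}(p), \tilde{f}(p')\big) = \lim_{n\to\infty} \rho\big(f(x_n), f(x'_n)\big) \le C \cdot \lim_{n\to\infty} d(x_n, x'_n) = C \cdot \tilde{d}(p, p').$$

Finally, let us show that $\tilde{f}$ is unique. Let $F : \tilde{X} \to Y$ be another continuous function with $F(\langle x \rangle) = f(x)$, for all $x \in X$. Start with an arbitrary point $p \in \tilde{X}$, represented as $p = \tilde{\mathbf{x}}$, for some Cauchy sequence $\mathbf{x} = (x_n)_{n\ge 1}$ in $X$. Since $\lim_{n\to\infty} \langle x_n \rangle = p$ in $\tilde{X}$ (by Proposition 6.8), by continuity we have
$$F(p) = \lim_{n\to\infty} F(\langle x_n \rangle) = \lim_{n\to\infty} f(x_n) = \varphi(\mathbf{x}) = \tilde{f}(p).$$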

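Corollary 6.5. Let $(X, d)$ be a metric space, let $(Y, \rho)$ be a complete metric space, and let $f : X \to Y$ be an isometric map, that is
$$\rho\big(f(x), f(x')\big) = d(x, x'), \quad \forall\, x, x' \in X.$$
Then the map $\tilde{f} : \tilde{X} \to Y$, given by the above result (with Lipschitz constant $C = 1$), is isometric and $\tilde{f}(\tilde{X}) = \overline{f(X)}$, the closure of $f(X)$ in $Y$.

Proof. The inclusion $\tilde{f}(\tilde{X}) \subset \overline{f(X)}$ is clear, since every value of $\tilde{f}$ is a limit of points in $f(X)$. To show that $\tilde{f}(\tilde{X}) \supset \overline{f(X)}$, start with some arbitrary point $y \in \overline{f(X)}$. Then there exists a sequence $(x_n)_{n\ge 1} \subset X$, with $\lim_{n\to\infty} f(x_n) = y$. Since $\big(f(x_n)\big)_{n\ge 1}$ is Cauchy in $Y$, and
$$d(x_m, x_n) = \rho\big(f(x_m), f(x_n)\big), \quad \forall\, m, n \ge 1,$$
it follows that the sequence $\mathbf{x} = (x_n)_{n\ge 1}$ is Cauchy in $X$. We then have
$$y = \lim_{n\to\infty} f(x_n) = \tilde{f}(\tilde{\mathbf{x}}).$$

Finally, we show that $\tilde{f}$ is isometric. Start with two points $p, q \in \tilde{X}$, represented as $p = \tilde{\mathbf{x}}$ and $q = \tilde{\mathbf{z}}$, for some Cauchy sequences $\mathbf{x} = (x_n)_{n\ge 1}$ and $\mathbf{z} = (z_n)_{n\ge 1}$ in $X$. Then by construction we have
$$\rho\big(\tilde{f}(p), \tilde{f}(q)\big) = \lim_{n\to\infty} \rho\big(\tilde{f}(\langle x_n \rangle), \tilde{f}(\langle z_n \rangle)\big) = \lim_{n\to\infty} \rho\big(f(x_n), f(z_n)\big) = \lim_{n\to\infty} d(x_n, z_n) = \tilde{d}(\tilde{\mathbf{x}}, \tilde{\mathbf{z}}) = \tilde{d}(p, q).$$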

Corollary 6.6. If $(X, d)$ is a complete metric space, and $\tilde{X}$ is its completion, then the map $\iota : X \ni x \longmapsto \langle x \rangle \in \tilde{X}$ is bijective.

Proof. Apply the previous result to the map $\mathrm{Id} : X \to X$, to get a bijective (isometric) map $\widetilde{\mathrm{Id}} : \tilde{X} \to X$. Since the map $\widetilde{\mathrm{Id}}$ is obviously a left inverse for $\iota$, it follows that $\iota$ itself is bijective.

In the remainder of this section we will address the following question: Given a topological Hausdorff space $X$, when does there exist a metric $d$ on $X$, such that the given topology coincides with the metric topology defined by $d$? A topological Hausdorff space with the above property is said to be metrizable. It is difficult to give non-trivial necessary and sufficient conditions for metrizability. One instance in which this is possible is the compact case (see the Urysohn Metrizability Theorem later in these notes). Here is a useful result, which is an example of a sufficient condition for metrizability.

Proposition 6.10 (Metrizability of Countable Products). Let $(X_i, d_i)_{i\in I}$ be a countable family of metric spaces. Then the product space $X = \prod_{i\in I} X_i$, equipped with the product topology, is metrizable.

Proof. Denote by $\mathcal{T}$ the product topology on $X$. What we need is a metric $d$ on $X$, such that the maps
$$\mathrm{Id} : (X, d) \to (X, \mathcal{T}) \quad \text{and} \quad \mathrm{Id} : (X, \mathcal{T}) \to (X, d)$$
are continuous. (Here the notation $(X, d)$ signifies that $X$ is equipped with the metric topology defined by $d$.) For each $i \in I$, let $\pi_i : X \to X_i$ denote the projection onto the $i$th coordinate.

Case I: Assume $I$ is finite. In this case we define the metric $d$ on $X$ as follows. If $\mathbf{x} = (x_i)_{i\in I}$ and $\mathbf{y} = (y_i)_{i\in I}$ are elements in $X$, we put
$$d(\mathbf{x}, \mathbf{y}) = \max_{i\in I} d_i(x_i, y_i).$$
The continuity of the map $\mathrm{Id} : (X, d) \to (X, \mathcal{T})$ is equivalent to the fact that all maps
$$\pi_i : (X, d) \to (X_i, d_i), \quad i \in I,$$
are continuous. This is obvious, because by construction we have
$$d_i\big(\pi_i(\mathbf{x}), \pi_i(\mathbf{y})\big) \le d(\mathbf{x}, \mathbf{y}), \quad \forall\, \mathbf{x}, \mathbf{y} \in X.$$
Conversely, to prove the continuity of $\mathrm{Id} : (X, \mathcal{T}) \to (X, d)$, we are going to prove that every $d$-open set is open in the product topology. It suffices to prove this only for open balls. Fix then $\mathbf{x} = (x_i)_{i\in I} \in \prod_{i\in I} X_i$ and $r > 0$, and consider the open ball $B_r(\mathbf{x})$. If we define, for each $i \in I$, the open ball $B^{X_i}_r(x_i)$, then it is obvious that
$$B_r(\mathbf{x}) = \bigcap_{i\in I} \pi_i^{-1}\big(B^{X_i}_r(x_i)\big),$$
and since the $\pi_i$ are all continuous with respect to the product topology and $I$ is finite, this proves that $B_r(\mathbf{x})$ is indeed open in the product topology.

Case II: Assume $I$ is infinite. In this case we identify $I = \mathbb{N}$. For every $n \in \mathbb{N}$ we define a new metric $\delta_n$ on $X_n$, as follows. If
$$\sup_{p,q\in X_n} d_n(p, q) \le 1,$$
we put $\delta_n = d_n$. Otherwise, we define
$$\delta_n(p, q) = \frac{d_n(p, q)}{1 + d_n(p, q)}, \quad \forall\, p, q \in X_n.$$
It is not hard to see that the metric topology defined by $\delta_n$ coincides with the one defined by $d_n$. The advantage is that $\delta_n$ takes values in $[0, 1]$. We define the metric $d : X \times X \to [0, \infty)$, as follows. If $\mathbf{x} = (x_n)_{n\in\mathbb{N}}$ and $\mathbf{y} = (y_n)_{n\in\mathbb{N}}$ are elements in $\prod_{n\in\mathbb{N}} X_n$, we define
$$d(\mathbf{x}, \mathbf{y}) = \sum_{n=1}^{\infty} \frac{\delta_n(x_n, y_n)}{2^n}.$$
Due to the fact that $\delta_n$ takes values in $[0, 1]$, the above series is convergent, and it obviously defines a metric on $X$.

As above, the continuity of the map $\mathrm{Id} : (X, d) \to (X, \mathcal{T})$ is equivalent to the continuity of all the maps $\pi_n : (X, d) \to (X_n, d_n)$, or equivalently of $\pi_n : (X, d) \to (X_n, \delta_n)$, $n \in \mathbb{N}$. But this is an immediate consequence of the (obvious) inequalities
$$\delta_n\big(\pi_n(\mathbf{x}), \pi_n(\mathbf{y})\big) \le 2^n \cdot d(\mathbf{x}, \mathbf{y}), \quad \forall\, \mathbf{x}, \mathbf{y} \in X.$$

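As before, in order to prove the continuity of the other map $\mathrm{Id} : (X, \mathcal{T}) \to (X, d)$, we start with some $d$-open set $D$, and we show that $D$ is open in the product topology. Since $D$ is a union of open balls, we need to prove that for any $\mathbf{x} \in X$ and any $r > 0$, the open ball $B_r(\mathbf{x})$, in $(X, d)$, is a neighborhood of $\mathbf{x}$ in the product topology. Fix $\mathbf{x} = (x_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n$, as well as $r > 0$. Choose some integer $N \ge 1$, such that
$$\sum_{n=N+1}^{\infty} \frac{1}{2^n} < \frac{r}{2},$$
and define, for each $k \in \{1, 2, \dots, N\}$ the set
$$D_k = \Big\{ \mathbf{y} = (y_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n : \delta_k(x_k, y_k) < \frac{r}{2} \Big\}.$$
It is clear that $D_k$ is open in the product topology, for each $k = 1, 2, \dots, N$. (This is a consequence of the fact that $D_k = \pi_k^{-1}\big(B_{r/2}(x_k)\big)$, where $B_{r/2}(x_k)$ is the $\delta_k$-open ball in $X_k$ of radius $r/2$, centered at $x_k$.) Then the set $D = D_1 \cap D_2 \cap \dots \cap D_N$ is also open in the product topology. Obviously we have $\mathbf{x} \in D$. We now prove that $D \subset B_r(\mathbf{x})$. Start with some arbitrary $\mathbf{y} \in D$, say $\mathbf{y} = (y_n)_{n\in\mathbb{N}}$. On the one hand, we have
$$\delta_k(x_k, y_k) < \frac{r}{2}, \quad \forall\, k \in \{1, 2, \dots, N\},$$
so we get
$$\sum_{n=1}^{N} \frac{1}{2^n} \delta_n(x_n, y_n) < \frac{r}{2} \sum_{n=1}^{N} \frac{1}{2^n} < \frac{r}{2}.$$
On the other hand, since $\delta_n$ takes values in $[0, 1]$, we also have
$$\sum_{n=N+1}^{\infty} \frac{1}{2^n} \delta_n(x_n, y_n) \le \sum_{n=N+1}^{\infty} \frac{1}{2^n} < \frac{r}{2},$$
so we get
$$d(\mathbf{x}, \mathbf{y}) = \sum_{n=1}^{\infty} \frac{1}{2^n} \delta_n(x_n, y_n) < r,$$
thus proving that $\mathbf{y}$ indeed belongs to $B_r(\mathbf{x})$.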
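The metric of Case II can be approximated by truncating the series, and doing so shows why closeness in $d$ only constrains finitely many coordinates. The sketch below makes several assumptions not found in the notes: every factor is $X_n = \mathbb{R}$ with $d_n(p, q) = |p - q|$, points of the product are modelled as Python functions $n \mapsto x_n$, and the series is cut after `terms` summands (the omitted tail is at most $2^{-\text{terms}}$).

```python
def delta(p: float, q: float) -> float:
    """The bounded metric delta_n = d_n / (1 + d_n), here with d_n(p, q) = |p - q|."""
    t = abs(p - q)
    return t / (1.0 + t)


def product_metric(x, y, terms: int = 60) -> float:
    """Partial sum of sum_{n>=1} delta(x_n, y_n) / 2^n over n = 1, ..., terms."""
    return sum(delta(x(n), y(n)) / 2.0 ** n for n in range(1, terms + 1))


def zero(n: int) -> float:
    """The point (0, 0, 0, ...) of the product."""
    return 0.0


def far_after(N: int):
    """A point that agrees with `zero` in coordinates 1..N and is huge afterwards."""
    def y(n: int) -> float:
        return 0.0 if n <= N else 1e9
    return y


dist_10 = product_metric(zero, far_after(10))
dist_20 = product_metric(zero, far_after(20))
```

Although `far_after(N)` differs enormously from `zero` in all coordinates beyond $N$, its distance to `zero` is below $2^{-N}$. This is the numerical counterpart of the set $D = D_1 \cap \dots \cap D_N$ in the proof, which restricts only the first $N$ coordinates.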

Lecture 7

7. Baire theorem(s)

In this section we discuss a topological phenomenon that occurs in certain topological spaces. It deals with interiors of closed sets.

Exercise 1. Let $X$ be a topological space, and let $A$ and $B$ be closed sets with the property that $\mathrm{Int}(A \cup B) \ne \emptyset$. Prove that either $\mathrm{Int}(A) \ne \emptyset$, or $\mathrm{Int}(B) \ne \emptyset$.

Exercise 2. Give an example of a topological space $X$ and of two (non-closed) sets $A$ and $B$ such that $\mathrm{Int}(A \cup B) \ne \emptyset$, but $\mathrm{Int}(A) = \mathrm{Int}(B) = \emptyset$.

Theorem 7.1 (Baire's Theorem). Let $(X, \mathcal{T})$ be a topological Hausdorff space, which satisfies one (or both) of the following properties:

(a) There exists a metric $d$ on $X$, which makes $(X, d)$ a complete metric space, and $\mathcal{T}$ is the metric topology.

(b) $X$ is locally compact.

Suppose one has a sequence $(F_n)_{n\ge 1}$ of closed subsets of $X$, such that $X = \bigcup_{n=1}^{\infty} F_n$. Then there exists some integer $n \ge 1$, such that $\mathrm{Int}(F_n) \ne \emptyset$.

Proof. For every $n \ge 1$ we define the closed set $G_n = \bigcup_{k=1}^{n} F_k$, so that we still have $X = \bigcup_{n=1}^{\infty} G_n$, but we also have $G_1 \subset G_2 \subset \dots$. According to Exercise 1 (use an inductive argument) it suffices to show that there exists some $n \ge 1$, with $\mathrm{Int}(G_n) \ne \emptyset$. We are going to prove this property by contradiction.

(∗) Assume $\mathrm{Int}(G_n) = \emptyset$, for all $n \ge 1$.

Claim: Under the assumption (∗) there exists a sequence $(D_n)_{n\ge 1}$ of non-empty open sets, such that for all $n \ge 1$ we have:

(i) $D_n \cap G_n = \emptyset$;

(ii) $\overline{D_{n+1}} \subset D_n$;

(iii) in case (a) we have $\mathrm{diam}(D_n) \le 2^{-n}$; in case (b) $\overline{D_n}$ is compact.

The sequence is constructed recursively. To construct $D_1$ we use the fact that $\mathrm{Int}(G_1) = \emptyset$ forces $X \smallsetminus G_1 \ne \emptyset$. We then choose a point $x \in X \smallsetminus G_1$. In case (a) we know that there exists $r > 0$ such that $B_r(x) \subset X \smallsetminus G_1$. We put $\rho = \min\{r, \tfrac{1}{4}\}$ and we set $D_1 = B_\rho(x)$. In case (b) we apply Lemma 5.1 to find $D_1$ open with $\overline{D_1}$ compact, such that $x \in D_1 \subset \overline{D_1} \subset X \smallsetminus G_1$.

Let us assume now that we have constructed $D_1, D_2, \dots, D_k$, such that (i) and (iii) hold for all $n \in \{1, \dots, k\}$, and such that (ii) holds for all $n \in \{1, \dots, k-1\}$, and let us indicate how the next set $D_{k+1}$ is constructed. Using the assumption that $\mathrm{Int}(G_{k+1}) = \emptyset$, it follows that the open set $D_k \smallsetminus G_{k+1}$ is non-empty. Choose then a point $x \in D_k \smallsetminus G_{k+1}$. In case (a) there exists some $r > 0$ such that $B_r(x) \subset D_k \smallsetminus G_{k+1}$. We then put $\rho = \min\{\tfrac{r}{2}, \tfrac{1}{2^{k+2}}\}$, and we define $D_{k+1} = B_\rho(x)$. In case (b) we apply Lemma 5.1 and find an open set $D_{k+1}$ with $\overline{D_{k+1}}$ compact, and $x \in D_{k+1} \subset \overline{D_{k+1}} \subset D_k \smallsetminus G_{k+1}$. All properties (i)-(iii) are easily verified.
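For case (a), the verification of (i)-(iii) for the set $D_{k+1} = B_\rho(x)$ can be written out as follows. This is a short derivation that uses only the choices $B_r(x) \subset D_k \smallsetminus G_{k+1}$ and $\rho = \min\{r/2, 2^{-(k+2)}\}$ made above; no additional hypotheses are assumed.

```latex
% (iii): any two points of B_rho(x) are at distance less than 2*rho
\operatorname{diam}(D_{k+1}) \le 2\rho \le 2 \cdot 2^{-(k+2)} = 2^{-(k+1)}.

% (ii): the closure of an open ball lies in the closed ball of the same radius,
% and rho <= r/2 < r
\overline{D_{k+1}} \subset \{\, y \in X : d(y, x) \le \rho \,\}
  \subset B_r(x) \subset D_k \smallsetminus G_{k+1} \subset D_k.

% (i): D_{k+1} is contained in B_r(x), which misses G_{k+1}
D_{k+1} \cap G_{k+1} \subset B_r(x) \cap G_{k+1} = \emptyset.
```

For $D_1$ the same computation with $\rho = \min\{r, 1/4\}$ gives $\mathrm{diam}(D_1) \le 2\rho \le 2^{-1}$, and no closure condition is needed at the first step.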

Having proven the Claim, let us see now that the assumption (∗) produces a contradiction.

Case (a): In this case we choose, for each $n \ge 1$ a point $x_n \in D_n$. Notice that, for every $m \ge n \ge 1$ we have
$$x_m, x_n \in D_n \quad \text{and} \quad d(x_m, x_n) \le \mathrm{diam}(D_n) \le \frac{1}{2^n}.$$
In particular, this proves that the sequence $(x_n)_{n\ge 1}$ is Cauchy, hence convergent to some point $x$. Since $x_m \in D_n$, $\forall\, m \ge n \ge 1$, we see that $x \in \overline{D_n}$, for all $n \ge 1$. In other words we get
$$\bigcap_{n=1}^{\infty} \overline{D_n} \ne \emptyset. \tag{1}$$

Case (b): In this case we also get (1), this time as a consequence of the compactness of the sets $\overline{D_n}$ (and the finite intersection property).

Let us notice now that (1) combined with (ii) will also give $\bigcap_{n=1}^{\infty} D_n \ne \emptyset$. But this is impossible, since by (i) we have
$$\bigcap_{n=1}^{\infty} D_n \subset \bigcap_{n=1}^{\infty} (X \smallsetminus G_n) = X \smallsetminus \Big( \bigcup_{n=1}^{\infty} G_n \Big) = \emptyset.$$

Chapter II

Elements of Functional Analysis

Lecture 8

1. Hahn-Banach Theorems

The result we are going to discuss is one of the most fundamental theorems inthe whole field of Functional Analysis. Its statement is simple but quite technical.

Definitions. Let K be either of the fields R or C. Suppose X is a K-vector space.

A. A map q : X → R is said to be a quasi-seminorm, if

(i) q(x + y) ≤ q(x) + q(y), for all x, y ∈ X;

(ii) q(tx) = tq(x), for all x ∈ X and all t ∈ R with t ≥ 0.

B. A map q : X → R is said to be a seminorm if, in addition to the above two properties, it satisfies:

(ii') q(λx) = |λ|q(x), for all x ∈ X and all λ ∈ K.

Remark that if q : X → R is a seminorm, then q(x) ≥ 0, for all x ∈ X. (Use 2q(x) = q(x) + q(−x) ≥ q(0) = 0.)

There are several versions of the Hahn-Banach Theorem.

Theorem 1.1 (Hahn-Banach, R-version). Let $X$ be an $\mathbb{R}$-vector space. Suppose $q : X \to \mathbb{R}$ is a quasi-seminorm. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\varphi : Y \to \mathbb{R}$, such that

$$\varphi(y) \le q(y), \text{ for all } y \in Y.$$

Then there exists a linear map $\psi : X \to \mathbb{R}$ such that

(i) $\psi\big|_Y = \varphi$;
(ii) $\psi(x) \le q(x)$ for all $x \in X$.

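Proof. We first prove the Theorem in the following:

Particular Case: Assume $\dim X/Y = 1$.

This means there exists some vector $x_0 \in X$ such that

$$X = \{ y + s x_0 : y \in Y,\ s \in \mathbb{R} \}.$$

What we need is to prescribe the value $\psi(x_0)$. In other words, we need a number $\alpha \in \mathbb{R}$ such that, if we define $\psi : X \to \mathbb{R}$ by $\psi(y + s x_0) = \varphi(y) + s\alpha$, $\forall\, y \in Y$, $s \in \mathbb{R}$, then this map satisfies condition (ii). (For $s = 0$ condition (ii) is just the hypothesis $\varphi \le q$ on $Y$.) For $s > 0$, condition (ii) reads:

$$\varphi(y) + s\alpha \le q(y + s x_0), \quad \forall\, y \in Y,\ s > 0,$$

and, upon dividing by $s$ (set $z = s^{-1} y$), is equivalent to:

$$\alpha \le q(z + x_0) - \varphi(z), \quad \forall\, z \in Y. \tag{1}$$

For $s < 0$, condition (ii) reads (use $t = -s$):

$$\varphi(y) - t\alpha \le q(y - t x_0), \quad \forall\, y \in Y,\ t > 0,$$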


and, upon dividing by $t$ (set $w = t^{-1} y$), is equivalent to:

$$\alpha \ge \varphi(w) - q(w - x_0), \quad \forall\, w \in Y. \tag{2}$$

Consider the sets

$$Z = \{ q(z + x_0) - \varphi(z) : z \in Y \} \subset \mathbb{R}, \qquad W = \{ \varphi(w) - q(w - x_0) : w \in Y \} \subset \mathbb{R}.$$

The conditions (1) and (2) are equivalent to the inequalities

$$\sup W \le \alpha \le \inf Z. \tag{3}$$

This means that, in order to find a real number $\alpha$ with the desired property, it suffices to prove that $\sup W \le \inf Z$, which in turn is equivalent to

$$\varphi(w) - q(w - x_0) \le q(z + x_0) - \varphi(z), \quad \forall\, z, w \in Y. \tag{4}$$

But the condition (4) is equivalent to

$$\varphi(z + w) \le q(z + x_0) + q(w - x_0),$$

which is obviously satisfied because

$$\varphi(z + w) \le q(z + w) = q\big( (z + x_0) + (w - x_0) \big) \le q(z + x_0) + q(w - x_0).$$
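As a sanity check on the interval (3), here is a minimal Python sketch for one concrete choice of data. The setup is our own assumption and not part of the notes: $X = \mathbb{R}^2$, $Y = \mathbb{R} \times \{0\}$, $x_0 = (0, 1)$, $q$ the $\ell^1$ norm, and $\varphi(y, 0) = y$. Sampling $z, w$ on an integer grid only approximates $\sup W$ and $\inf Z$ in general; for this particular example the grid values happen to be exact.

```python
# Hypothetical example data (not from the notes):
# X = R^2, Y = R x {0}, x0 = (0, 1), q = l^1 norm, phi(y, 0) = y.

def q(x):
    """Quasi-seminorm on R^2 (here the l^1 norm)."""
    return abs(x[0]) + abs(x[1])

def phi(y):
    """Linear functional on Y, with (y, 0) identified with y."""
    return y

def alpha_interval(grid):
    """Approximate (sup W, inf Z) by sampling z, w in Y on a grid."""
    sup_w = max(phi(w) - q((w, -1)) for w in grid)   # w - x0 = (w, -1)
    inf_z = min(q((z, 1)) - phi(z) for z in grid)    # z + x0 = (z, 1)
    return sup_w, inf_z

def extension_ok(alpha, grid):
    """Check psi(y + s*x0) = phi(y) + s*alpha <= q(y + s*x0) on a grid."""
    return all(phi(y) + s * alpha <= q((y, s)) for y in grid for s in grid)

sup_w, inf_z = alpha_interval(range(-50, 51))
print(sup_w, inf_z)  # prints: -1 1
```

In this example every $\alpha \in [-1, 1]$ gives an admissible extension, which illustrates that the extension produced by the Theorem need not be unique.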

Having proved the Theorem in this particular case, let us proceed now with the general case. Let us consider the set $\Xi$ of all pairs $(Z, \nu)$ with

• $Z$ is a subspace of $X$ such that $Z \supset Y$;
• $\nu : Z \to \mathbb{R}$ is a linear functional such that
(i) $\nu\big|_Y = \varphi$;
(ii) $\nu(z) \le q(z)$, for all $z \in Z$.

Put an order relation on $\Xi$ as follows:

$$(Z_1, \nu_1) \succeq (Z_2, \nu_2) \iff Z_1 \supset Z_2 \ \text{ and } \ \nu_1\big|_{Z_2} = \nu_2.$$

Every totally ordered subset of $\Xi$ has an upper bound (take the union of the subspaces, with the functional defined in the obvious way), so by Zorn's Lemma, $\Xi$ possesses a maximal element $(Z, \psi)$. The proof of the Theorem is finished once we prove that $Z = X$. Assume $Z \subsetneq X$ and choose a vector $x_0 \in X \setminus Z$. Form the subspace $V = \{ z + t x_0 : z \in Z,\ t \in \mathbb{R} \}$ and apply the particular case of the Theorem for the inclusion $Z \subset V$, for $\psi : Z \to \mathbb{R}$ and for the quasi-seminorm $q\big|_V : V \to \mathbb{R}$. It follows that there exists some linear functional $\eta : V \to \mathbb{R}$ such that

(i) $\eta\big|_Z = \psi$ (in particular we will also have $\eta\big|_Y = \varphi$);
(ii) $\eta(v) \le q(v)$, for all $v \in V$.

But then the element $(V, \eta) \in \Xi$ will contradict the maximality of $(Z, \psi)$.

Theorem 1.2 (Hahn-Banach, C-version). Let $X$ be a $\mathbb{C}$-vector space. Suppose $q : X \to \mathbb{R}$ is a quasi-seminorm. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\varphi : Y \to \mathbb{C}$, such that

$$\operatorname{Re} \varphi(y) \le q(y), \text{ for all } y \in Y.$$

Then there exists a linear map $\psi : X \to \mathbb{C}$ such that

(i) $\psi\big|_Y = \varphi$;
(ii) $\operatorname{Re} \psi(x) \le q(x)$ for all $x \in X$.


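Proof. Regard for the moment both $X$ and $Y$ as $\mathbb{R}$-vector spaces. Define the $\mathbb{R}$-linear map $\varphi_1 : Y \to \mathbb{R}$ by $\varphi_1(y) = \operatorname{Re} \varphi(y)$, for all $y \in Y$, so that we have

$$\varphi_1(y) \le q(y), \quad \forall\, y \in Y.$$

Use Theorem 1.1 to find an $\mathbb{R}$-linear map $\psi_1 : X \to \mathbb{R}$ such that

(i) $\psi_1\big|_Y = \varphi_1$;
(ii) $\psi_1(x) \le q(x)$, for all $x \in X$.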

Define the map ψ : X → C by

ψ(x) = ψ1(x)− iψ1(ix), for all x ∈ X .

Claim 1: ψ is C-linear.

It is obvious that $\psi$ is $\mathbb{R}$-linear, so the only thing to prove is that $\psi(ix) = i\psi(x)$, for all $x \in X$. But this is quite obvious:

$$\psi(ix) = \psi_1(ix) - i\psi_1(i^2 x) = \psi_1(ix) - i\psi_1(-x) = -i^2 \psi_1(ix) + i\psi_1(x) = i\big( \psi_1(x) - i\psi_1(ix) \big) = i\psi(x), \quad \forall\, x \in X.$$

Because of the way $\psi$ is defined, and because $\psi_1$ is real-valued, condition (ii) in the Theorem follows immediately:

$$\operatorname{Re} \psi(x) = \psi_1(x) \le q(x), \quad \forall\, x \in X,$$

so in order to finish the proof, we need to prove condition (i) in the Theorem (i.e. $\psi\big|_Y = \varphi$). This follows from the fact that $\varphi_1 = \psi_1\big|_Y$, and from:

Claim 2: For every $y \in Y$, we have $\varphi(y) = \varphi_1(y) - i\varphi_1(iy)$.

But this is quite obvious, because

$$\operatorname{Im} \varphi(y) = -\operatorname{Re}\big( i\varphi(y) \big) = -\operatorname{Re} \varphi(iy) = -\varphi_1(iy), \quad \forall\, y \in Y.$$
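The passage from a real-linear functional back to a complex-linear one can be checked numerically. The following Python sketch is only an illustration under our own assumptions (the space $\mathbb{C}^2$, the coefficients of `phi`, and the helper names `make_phi` and `complexify` are hypothetical, not from the notes): it starts from a $\mathbb{C}$-linear functional, keeps only its real part, and rebuilds it through the formula $\psi(x) = \psi_1(x) - i\psi_1(ix)$.

```python
def make_phi(c):
    """C-linear functional on C^2: phi(x) = c[0]*x[0] + c[1]*x[1]."""
    return lambda x: c[0] * x[0] + c[1] * x[1]

def complexify(psi1):
    """Build psi(x) = psi1(x) - i*psi1(i*x) from a real-valued, R-linear psi1."""
    return lambda x: psi1(x) - 1j * psi1(tuple(1j * t for t in x))

phi = make_phi((2 - 1j, 0.5 + 3j))   # hypothetical coefficients
phi1 = lambda x: phi(x).real         # the real part, an R-linear map
psi = complexify(phi1)               # should coincide with phi (Claim 2)

x = (1 + 2j, -3 + 0.5j)
print(abs(psi(x) - phi(x)))          # numerically zero up to rounding
```

Evaluating `psi` at `i*x` and comparing with `1j * psi(x)` gives the same kind of check for Claim 1.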

Theorem 1.3 (Hahn-Banach, for seminorms). Let $X$ be a $\mathbb{K}$-vector space ($\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$). Suppose $q$ is a seminorm on $X$. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\varphi : Y \to \mathbb{K}$, such that

$$|\varphi(y)| \le q(y), \text{ for all } y \in Y.$$

Then there exists a linear map $\psi : X \to \mathbb{K}$ such that

(i) $\psi\big|_Y = \varphi$;
(ii) $|\psi(x)| \le q(x)$ for all $x \in X$.

Proof. We are going to apply Theorems 1.1 and 1.2, using the fact that $q$ is also a quasi-seminorm.

The case $\mathbb{K} = \mathbb{R}$. Remark that

$$\varphi(y) \le |\varphi(y)| \le q(y), \quad \forall\, y \in Y.$$

So we can apply Theorem 1.1 and find $\psi : X \to \mathbb{R}$ with

(i) $\psi\big|_Y = \varphi$;
(ii) $\psi(x) \le q(x)$, for all $x \in X$.


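Using condition (ii) we also get

$$-\psi(x) = \psi(-x) \le q(-x) = q(x), \text{ for all } x \in X.$$

In other words we get

$$\pm\psi(x) \le q(x), \text{ for all } x \in X,$$

which of course gives the desired property (ii) in the Theorem.

The case $\mathbb{K} = \mathbb{C}$. Remark that

$$\operatorname{Re} \varphi(y) \le |\varphi(y)| \le q(y), \quad \forall\, y \in Y.$$

So we can apply Theorem 1.2 and find $\psi : X \to \mathbb{C}$ with

(i) $\psi\big|_Y = \varphi$;
(ii) $\operatorname{Re} \psi(x) \le q(x)$, for all $x \in X$.

Using condition (ii) we also get

$$\operatorname{Re}\big( \lambda\psi(x) \big) = \operatorname{Re} \psi(\lambda x) \le q(\lambda x) = q(x), \text{ for all } x \in X \text{ and all } \lambda \in \mathbb{T}. \tag{5}$$

(Here $\mathbb{T} = \{ \lambda \in \mathbb{C} : |\lambda| = 1 \}$.) Fix for the moment $x \in X$. There exists some $\lambda \in \mathbb{T}$ such that $|\psi(x)| = \lambda\psi(x)$. For this particular $\lambda$ we will have $\operatorname{Re}\big( \lambda\psi(x) \big) = |\psi(x)|$, so the inequality (5) will give

$$|\psi(x)| \le q(x).$$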

In the remainder of this section we will discuss the geometric form of the Hahn-Banach theorems. We begin by describing a method of constructing quasi-seminorms.

Proposition 1.1. Let $X$ be a real vector space. Suppose $C \subset X$ is a convex subset, which contains 0, and has the property

$$\bigcup_{t > 0} tC = X. \tag{6}$$

For every $x \in X$ we define

$$Q_C(x) = \inf\{ t > 0 : x \in tC \}.$$

(By (6) the set in the right hand side is non-empty.) Then the map $Q_C : X \to \mathbb{R}$ is a quasi-seminorm.

Proof. For every $x \in X$, let us define the set

$$T_C(x) = \{ t > 0 : x \in tC \}.$$

It is pretty clear that, since $0 \in C$, we have

$$T_C(0) = (0, \infty),$$

so we get $Q_C(0) = \inf T_C(0) = 0$.

Claim 1: For every $x \in X$ and every $\lambda > 0$, one has the equality

$$T_C(\lambda x) = \lambda T_C(x).$$


Indeed, if $t \in T_C(\lambda x)$, we have $\lambda x \in tC$, which means that $x \in \lambda^{-1} t C$, i.e. $\lambda^{-1} t \in T_C(x)$. Consequently we have

$$t = \lambda(\lambda^{-1} t) \in \lambda T_C(x),$$

which proves the inclusion $T_C(\lambda x) \subset \lambda T_C(x)$.

To prove the other inclusion, we start with some $s \in \lambda T_C(x)$, which means that there exists some $t \in T_C(x)$ with $\lambda t = s$. The fact that $t = \lambda^{-1} s$ belongs to $T_C(x)$ means that $x \in \lambda^{-1} s C$, so we get $\lambda x \in sC$, so $s$ indeed belongs to $T_C(\lambda x)$.

Claim 2: For every $x, y \in X$, one has the inclusion (see footnote 4)

$$T_C(x + y) \supset T_C(x) + T_C(y).$$

Start with some $t \in T_C(x)$ and some $s \in T_C(y)$. Define the elements $u = t^{-1} x$ and $v = s^{-1} y$. Since $u, v \in C$, and $C$ is convex, it follows that $C$ contains the element

$$\frac{t}{t+s}\, u + \frac{s}{t+s}\, v = \frac{1}{t+s}\, (x + y),$$

which means that $x + y \in (t + s)C$, so $t + s$ indeed belongs to $T_C(x + y)$.

We can now conclude the proof. If $x \in X$ and $\lambda > 0$, then the equality

$$Q_C(\lambda x) = \lambda Q_C(x)$$

is an immediate consequence of Claim 1. (For $\lambda = 0$ both sides are equal to $Q_C(0) = 0$.) If $x, y \in X$, then the inequality

$$Q_C(x + y) \le Q_C(x) + Q_C(y)$$

is an immediate consequence of Claim 2.

Definition. Under the hypothesis of the above proposition, the quasi-seminorm $Q_C$ is called the Minkowski functional associated with the set $C$.
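To make the definition concrete, the Minkowski functional can be approximated by bisection whenever membership in $C$ can be tested. The sketch below is our own illustration, not part of the notes: the function name `minkowski_functional`, the membership test `in_box`, and the set $C = \{ (x, y) : -1 < x < 2,\ |y| < 1 \}$ are all assumptions made for the example. Bisection is justified because, for a convex $C$ containing 0, the set $T_C(x)$ is upward closed.

```python
def minkowski_functional(in_C, x, tol=1e-9, max_steps=200):
    """Approximate Q_C(x) = inf{t > 0 : x in tC} for a convex C containing 0.

    in_C is a membership test for C; x is a tuple of floats.
    """
    def member(t):
        # x in tC  <=>  x/t in C
        return in_C(tuple(c / t for c in x))

    hi = 1.0
    for _ in range(max_steps):      # find some t with x in tC (property (6))
        if member(hi):
            break
        hi *= 2.0
    else:
        raise ValueError("x does not seem to lie in any dilate tC")

    lo = 0.0
    for _ in range(max_steps):      # T_C(x) is upward closed, so bisect
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if member(mid):
            hi = mid
        else:
            lo = mid
    return hi

def in_box(v):
    """Hypothetical convex open set C = (-1, 2) x (-1, 1), which contains 0."""
    return -1 < v[0] < 2 and abs(v[1]) < 1

print(minkowski_functional(in_box, (2.0, 0.0)))    # approximately 1
print(minkowski_functional(in_box, (-2.0, 0.0)))   # approximately 2
```

Since this $C$ is not symmetric about 0, the two values differ, so $Q_C$ is a quasi-seminorm which is not a seminorm.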

Remark 1.1. Let $X$ be a real vector space. Suppose $C \subset X$ is a convex subset, which contains 0, and has the property (6). Then one has the inclusions

$$\{ x \in X : Q_C(x) < 1 \} \subset C \subset \{ x \in X : Q_C(x) \le 1 \}.$$

The second inclusion is pretty obvious, since if we start with some $x \in C$, using the notations from the proof of Proposition 1.1, we have $1 \in T_C(x)$, so

$$Q_C(x) = \inf T_C(x) \le 1.$$

To prove the first inclusion, start with some $x \in X$ with $Q_C(x) < 1$. In particular this means that there exists some $t \in (0, 1)$ such that $x \in tC$. Define the vector $y = t^{-1} x \in C$ and notice now that, since $C$ is convex, it will contain the convex combination $ty + (1 - t)0 = x$.

Exercise 1. Let $X$ be a real vector space, and let $q : X \to \mathbb{R}$ be a quasi-seminorm. Define the sets

$$C_0 = \{ x \in X : q(x) < 1 \}, \qquad C_1 = \{ x \in X : q(x) \le 1 \}.$$

(i) Prove that $C_0$ and $C_1$ are both convex, they contain 0, and they both have property (6).

Footnote 4: For subsets $T, S \subset \mathbb{R}$ we define $T + S = \{ t + s : t \in T,\ s \in S \}$.


(ii) Let $C$ be any convex set with

$$C_0 \subset C \subset C_1.$$

Analyze the relationship between $Q_C$ and $q$.

Definition. A topological vector space is a vector space $X$ over $\mathbb{K}$ (which is either $\mathbb{R}$ or $\mathbb{C}$), which is also a topological space, such that the maps

$$X \times X \ni (x, y) \longmapsto x + y \in X$$

$$\mathbb{K} \times X \ni (\lambda, x) \longmapsto \lambda x \in X$$

are continuous.

Remark 1.2. Let $X$ be a real topological vector space. Suppose $C \subset X$ is a convex open subset, which contains 0. Then $C$ has the property (6). Moreover (compare with Remark 1.1), one has the equality

$$\{ x \in X : Q_C(x) < 1 \} = C. \tag{7}$$

To prove this remark, we define for each $x \in X$, the function

$$F_x : \mathbb{R} \ni t \longmapsto tx \in X.$$

Since $X$ is a topological vector space, the maps $F_x$, $x \in X$, are continuous. To prove the property (6) we start with an arbitrary $x \in X$, and we use the continuity of the map $F_x$ at 0. Since $C$ is a neighborhood of 0, there exists some $\rho > 0$ such that

$$F_x(t) \in C, \quad \forall\, t \in [-\rho, \rho].$$

In particular we get $\rho x \in C$, which means that $x \in \rho^{-1} C$.

To prove the equality (7) we only need to prove the inclusion "⊃" (since the inclusion "⊂" holds in general, by Remark 1.1). Start with some element $x \in C$. Using the continuity of the map $F_x$ at 1, plus the fact that $F_x(1) = x \in C$, there exists some $\varepsilon > 0$, such that

$$F_x(t) \in C, \quad \forall\, t \in [1 - \varepsilon, 1 + \varepsilon].$$

In particular, we have $F_x(1 + \varepsilon) \in C$, which means precisely that

$$x \in (1 + \varepsilon)^{-1} C.$$

This gives the inequality

$$Q_C(x) \le (1 + \varepsilon)^{-1},$$

so we indeed get $Q_C(x) < 1$.

The first geometric version of the Hahn-Banach Theorem is:

Lemma 1.1. Let $X$ be a real topological vector space, and let $C \subset X$ be a convex open set which contains 0. If $x_0 \in X$ is some point which does not belong to $C$, then there exists a linear continuous map $\varphi : X \to \mathbb{R}$, such that

• $\varphi(x_0) = 1$;
• $\varphi(v) < 1$, $\forall\, v \in C$.

Proof. Consider the linear subspace

$$Y = \mathbb{R} x_0 = \{ t x_0 : t \in \mathbb{R} \},$$

and define $\psi : Y \to \mathbb{R}$ by

$$\psi(t x_0) = t, \quad \forall\, t \in \mathbb{R}.$$

It is obvious that $\psi$ is linear, and $\psi(x_0) = 1$.


Claim: One has the inequality

$$\psi(y) \le Q_C(y), \quad \forall\, y \in Y.$$

Let $y$ be represented as $y = t x_0$ for some $t \in \mathbb{R}$. If $t \le 0$, the inequality is clear, because $\psi(y) = t \le 0$ and the right hand side $Q_C(y)$ is always non-negative. Assume $t > 0$. Since $Q_C$ is a quasi-seminorm, we have

$$Q_C(y) = Q_C(t x_0) = t\, Q_C(x_0), \tag{8}$$

and the fact that $x_0 \notin C$ will give (by Remark 1.2) the inequality $Q_C(x_0) \ge 1$. Since $t > 0$, the computation (8) can be continued with

$$Q_C(y) = t\, Q_C(x_0) \ge t = \psi(y),$$

so the Claim follows also in this case.

Use now the Hahn-Banach Theorem, to find a linear map $\varphi : X \to \mathbb{R}$ such that

(i) $\varphi\big|_Y = \psi$;
(ii) $\varphi(x) \le Q_C(x)$, $\forall\, x \in X$.

It is obvious that (i) gives $\varphi(x_0) = \psi(x_0) = 1$. If $v \in C$, then by Remark 1.2 we have $Q_C(v) < 1$, so by (ii) we also get $\varphi(v) < 1$. This means that the only thing that remains to be proven is the continuity of $\varphi$. Since $\varphi$ is linear, we only need to prove that $\varphi$ is continuous at 0. Start with some $\varepsilon > 0$. We must find some open set $U_\varepsilon \subset X$, with $U_\varepsilon \ni 0$, such that

$$|\varphi(u)| < \varepsilon, \quad \forall\, u \in U_\varepsilon.$$

We take $U_\varepsilon = (\varepsilon C) \cap (-\varepsilon C)$. Notice that, for every $u \in U_\varepsilon$, we have $\pm u \in \varepsilon C$, which gives $\varepsilon^{-1}(\pm u) \in C$. By Remark 1.2 this gives $Q_C\big( \varepsilon^{-1}(\pm u) \big) < 1$, which gives

$$Q_C(\pm u) < \varepsilon.$$

Then using property (ii) we immediately get

$$\varphi(\pm u) < \varepsilon,$$

that is, $|\varphi(u)| < \varepsilon$, and we are done.

It turns out that the above result is a particular case of a more general result:

Theorem 1.4 (Hahn-Banach Separation Theorem - real case). Let $X$ be a real topological vector space, let $A, B \subset X$ be non-empty convex sets with $A$ open, and $A \cap B = \varnothing$. Then there exists a linear continuous map $\varphi : X \to \mathbb{R}$, and a real number $\alpha$, such that

$$\varphi(a) < \alpha \le \varphi(b), \quad \forall\, a \in A,\ b \in B.$$

Proof. Fix some points $a_0 \in A$, $b_0 \in B$, and define the set

$$C = A - B + b_0 - a_0 = \{ a - b + b_0 - a_0 : a \in A,\ b \in B \}.$$

It is straightforward that $C$ is convex and contains 0. The equality

$$C = \bigcup_{b \in B} (A - b + b_0 - a_0)$$

shows that $C$ is also open. Define the vector $x_0 = b_0 - a_0$. Since $A \cap B = \varnothing$, it is clear that $x_0 \notin C$.

Use Lemma 1.1 to produce a linear continuous map $\varphi : X \to \mathbb{R}$ such that

(i) $\varphi(x_0) = 1$;


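(ii) $\varphi(v) < 1$, $\forall\, v \in C$.

By the definition of $x_0$ and $C$, we have $\varphi(b_0) = \varphi(a_0) + 1$, and

$$\varphi(a) < \varphi(b) + \varphi(a_0) - \varphi(b_0) + 1, \quad \forall\, a \in A,\ b \in B,$$

which gives

$$\varphi(a) < \varphi(b), \quad \forall\, a \in A,\ b \in B. \tag{9}$$

Put

$$\alpha = \inf_{b \in B} \varphi(b).$$

The inequalities (9) give

$$\varphi(a) \le \alpha \le \varphi(b), \quad \forall\, a \in A,\ b \in B. \tag{10}$$

The proof will be complete once we prove the following

Claim: One has the inequality

$$\varphi(a) < \alpha, \quad \forall\, a \in A.$$

Suppose the contrary, i.e. there exists some $a_1 \in A$ with $\varphi(a_1) = \alpha$. Using the continuity of the map

$$\mathbb{R} \ni t \longmapsto a_1 + t x_0 \in X$$

and the fact that $A$ is open, there exists some $\varepsilon > 0$ such that

$$a_1 + t x_0 \in A, \quad \forall\, t \in [-\varepsilon, \varepsilon].$$

In particular, by (10) one has

$$\varphi(a_1 + \varepsilon x_0) \le \alpha,$$

which means that $\alpha + \varepsilon \le \alpha$, which is clearly impossible.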

Theorem 1.5 (Hahn-Banach Separation Theorem - complex case). Let $X$ be a complex topological vector space, let $A, B \subset X$ be non-empty convex sets with $A$ open, and $A \cap B = \varnothing$. Then there exists a linear continuous map $\varphi : X \to \mathbb{C}$, and a real number $\alpha$, such that

$$\operatorname{Re} \varphi(a) < \alpha \le \operatorname{Re} \varphi(b), \quad \forall\, a \in A,\ b \in B.$$

Proof. Regard $X$ as a real topological vector space, and apply the real version to produce an $\mathbb{R}$-linear continuous map $\varphi_1 : X \to \mathbb{R}$, and a real number $\alpha$, such that

$$\varphi_1(a) < \alpha \le \varphi_1(b), \quad \forall\, a \in A,\ b \in B.$$

Then the function $\varphi : X \to \mathbb{C}$ defined by

$$\varphi(x) = \varphi_1(x) - i\varphi_1(ix), \quad x \in X$$

will clearly satisfy the desired properties.

There is another version of the Hahn-Banach Separation Theorem, which holds for a special type of topological vector spaces. Before we discuss these, we shall need a technical result.


Lemma 1.2. Let $X$ be a topological vector space, let $C \subset X$ be a compact set, and let $D \subset X$ be a closed set. Then the set

$$C + D = \{ x + y : x \in C,\ y \in D \}$$

is closed.

Proof. Start with some point $p \in \overline{C + D}$, and let us prove that $p \in C + D$. For every neighborhood $U$ of 0, the set $p + U$ is a neighborhood of $p$, so by assumption, we have

$$(p + U) \cap (C + D) \neq \varnothing. \tag{11}$$

Define, for each neighborhood $U$ of 0, the set

$$A_U = (p + U - D) \cap C.$$

Using (11), it is clear that $A_U$ is non-empty. It is also clear that, if $U_1 \subset U_2$, then $A_{U_1} \subset A_{U_2}$. Using the compactness of $C$, it follows that

$$\bigcap_{U \text{ neighborhood of } 0} \overline{A_U} \neq \varnothing.$$

Choose then a point $q$ in the above intersection. It follows that

$$(q + V) \cap A_U \neq \varnothing,$$

for any two neighborhoods $U$ and $V$ of 0. In other words, for any two such neighborhoods of 0, we have

$$(q + V - U) \cap (p - D) \neq \varnothing. \tag{12}$$

Fix now an arbitrary neighborhood $W$ of 0. Using the continuity of the map

$$X \times X \ni (x_1, x_2) \longmapsto x_1 - x_2 \in X,$$

there exist neighborhoods $U$ and $V$ of 0, such that $U - V \subset W$. Then $q + V - U \subset q - W$, so (12) gives

$$(q - W) \cap (p - D) \neq \varnothing,$$

which yields

$$(p - q + W) \cap D \neq \varnothing.$$

Since this is true for all neighborhoods $W$ of 0, we get $p - q \in \overline{D}$, and since $D$ is closed, we finally get $p - q \in D$. Since, by construction we have $q \in C$, it follows that the point $p = q + (p - q)$ indeed belongs to $C + D$.

Definition. A topological vector space $X$ is said to be locally convex, if every point has a fundamental system of convex open neighborhoods. This means that for every $x \in X$ and every neighborhood $N$ of $x$, there exists a convex open set $D$, with $x \in D \subset N$.

Theorem 1.6 (Hahn-Banach Separation Theorem for Locally Convex Spaces). Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $X$ be a locally convex $\mathbb{K}$-vector space. Suppose $C, D \subset X$ are convex sets, with $C$ compact, $D$ closed, and $C \cap D = \varnothing$. Then there exists a linear continuous map $\varphi : X \to \mathbb{K}$, and two numbers $\alpha, \beta \in \mathbb{R}$, such that

$$\operatorname{Re} \varphi(x) \le \alpha < \beta \le \operatorname{Re} \varphi(y), \quad \forall\, x \in C,\ y \in D.$$


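Proof. Consider the convex set $B = D - C$. By Lemma ??, $B$ is closed. Since $C \cap D = \varnothing$, we have $0 \notin B$. Since $B$ is closed, its complement $X \setminus B$ will then be a neighborhood of 0. Since $X$ is locally convex, there exists a convex open set $A$, with $0 \in A \subset X \setminus B$. In particular we have $A \cap B = \varnothing$. Applying the suitable version of the Hahn-Banach Theorem (real or complex case), we find a linear continuous map $\varphi : X \to \mathbb{K}$, and a real number $\rho$, such that

$$\operatorname{Re} \varphi(a) < \rho \le \operatorname{Re} \varphi(b), \quad \forall\, a \in A,\ b \in B.$$

Notice that, since $A \ni 0$, we get $\rho > 0$. Then the inequality

$$\rho \le \operatorname{Re} \varphi(b), \quad b \in B$$

gives

$$\operatorname{Re} \varphi(y) - \operatorname{Re} \varphi(x) \ge \rho > 0, \quad \forall\, x \in C,\ y \in D.$$

Then if we define

$$\beta = \inf_{y \in D} \operatorname{Re} \varphi(y) \quad\text{and}\quad \alpha = \sup_{x \in C} \operatorname{Re} \varphi(x),$$

we get $\beta \ge \alpha + \rho$, and we are done.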

Lectures 9-11

2. Normed vector spaces

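Definition. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $X$ be a $\mathbb{K}$-vector space. A norm on $X$ is a map

$$X \ni x \longmapsto \|x\| \in [0, \infty)$$

with the following properties

(i) $\|x + y\| \le \|x\| + \|y\|$, $\forall\, x, y \in X$;
(ii) $\|\lambda x\| = |\lambda| \cdot \|x\|$, $\forall\, x \in X$, $\lambda \in \mathbb{K}$;
(iii) $\|x\| = 0 \Longrightarrow x = 0$.

(Note that conditions (i) and (ii) state that $\| \cdot \|$ is a seminorm.)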

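Example 2.1. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. Fix some non-empty set $I$, and define

$$c_0^{\mathbb{K}}(I) = \Big\{ \alpha : I \to \mathbb{K} \ :\ \inf_{F \subset I \text{ finite}} \Big[ \sup_{i \in I \setminus F} |\alpha(i)| \Big] = 0 \Big\}.$$

Remark that for a function $\alpha : I \to \mathbb{K}$, the fact that $\alpha$ belongs to $c_0^{\mathbb{K}}(I)$ is equivalent to the following condition:

• For every $\varepsilon > 0$, there exists some finite set $F \subset I$, such that

$$|\alpha(i)| < \varepsilon, \quad \forall\, i \in I \setminus F.$$

We equip the space $c_0^{\mathbb{K}}(I)$ with the $\mathbb{K}$-vector space structure defined by point-wise addition and point-wise scalar multiplication. We also define the norm $\| \cdot \|_\infty$ by

$$\|\alpha\|_\infty = \sup_{i \in I} |\alpha(i)|, \quad \alpha \in c_0^{\mathbb{K}}(I).$$

When $\mathbb{K} = \mathbb{C}$, the space $c_0^{\mathbb{C}}(I)$ is simply denoted by $c_0(I)$. When $I = \mathbb{N}$ - the set of natural numbers - the space $c_0^{\mathbb{K}}(\mathbb{N})$ can be equivalently described as

$$c_0^{\mathbb{K}}(\mathbb{N}) = \Big\{ \alpha = (\alpha_n)_{n \ge 1} \subset \mathbb{K} \ :\ \lim_{n \to \infty} \alpha_n = 0 \Big\}.$$

In this case instead of $c_0^{\mathbb{R}}(\mathbb{N})$ we simply write $c_0^{\mathbb{R}}$, and instead of $c_0(\mathbb{N})$ we simply write $c_0$.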

Exercise 1. Prove that ‖ . ‖∞ is indeed a norm on cK0 (I).

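Example 2.2. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. We define the space

$$\operatorname{fin}_{\mathbb{K}}(I) = \big\{ \alpha : I \to \mathbb{K} \ :\ \text{the set } \{ i \in I : \alpha(i) \neq 0 \} \text{ is finite} \big\}.$$

Then $\operatorname{fin}_{\mathbb{K}}(I)$ is a linear subspace in $c_0^{\mathbb{K}}(I)$.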


Definition. Suppose $X$ is a normed vector space, with norm $\| \cdot \|$. Then there is a natural metric $d$ on $X$, defined by

$$d(x, y) = \|x - y\|, \quad x, y \in X.$$

The topology on $X$, defined by this metric, is called the norm topology.

Exercise 2. Let $X$ be a normed vector space, over $\mathbb{K}$ ($= \mathbb{R}, \mathbb{C}$). Prove that, when equipped with the norm topology, $X$ becomes a topological vector space. That is, the maps

$$X \times X \ni (x, y) \longmapsto x + y \in X$$

$$\mathbb{K} \times X \ni (\lambda, x) \longmapsto \lambda x \in X$$

are continuous.

Exercise 3. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Prove that $\operatorname{fin}_{\mathbb{K}}(I)$ is dense in $c_0^{\mathbb{K}}(I)$ in the norm topology.

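Example 2.3. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Define

$$\ell^\infty_{\mathbb{K}}(I) = \Big\{ \alpha : I \to \mathbb{K} \ :\ \sup_{i \in I} |\alpha(i)| < \infty \Big\}.$$

We equip the space $\ell^\infty_{\mathbb{K}}(I)$ with the $\mathbb{K}$-vector space structure defined by point-wise addition and point-wise scalar multiplication. We also define the norm $\| \cdot \|_\infty$ by

$$\|\alpha\|_\infty = \sup_{i \in I} |\alpha(i)|, \quad \alpha \in \ell^\infty_{\mathbb{K}}(I).$$

When $\mathbb{K} = \mathbb{C}$, the space $\ell^\infty_{\mathbb{C}}(I)$ is simply denoted by $\ell^\infty(I)$. When $I = \mathbb{N}$ - the set of natural numbers - instead of $\ell^\infty_{\mathbb{R}}(\mathbb{N})$ we simply write $\ell^\infty_{\mathbb{R}}$, and instead of $\ell^\infty(\mathbb{N})$ we simply write $\ell^\infty$.

Exercise 4. Prove that $\| \cdot \|_\infty$ is indeed a norm on $\ell^\infty_{\mathbb{K}}(I)$.

Exercise 5. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Prove that $c_0^{\mathbb{K}}(I)$ is a linear subspace in $\ell^\infty_{\mathbb{K}}(I)$, which is closed in the norm topology.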

In preparation for the next class of examples, we introduce the following:

Definition. A map $\alpha : I \to \mathbb{K}$ is said to be summable, if there exists some number $s \in \mathbb{K}$ such that

(s) for every $\varepsilon > 0$ there exists some finite set $F_\varepsilon \subset I$ such that

$$\Big| s - \sum_{i \in F} \alpha(i) \Big| < \varepsilon, \text{ for all finite sets } F \text{ with } F_\varepsilon \subset F \subset I.$$

If such an $s$ exists, then it is unique, and it is denoted by $\sum_{i \in I} \alpha(i)$. In the case when $I$ is finite, every map $\alpha : I \to \mathbb{K}$ is summable, and the above notation agrees with the usual notation for the sum.

Exercise 6. Assume $\alpha : I \to \mathbb{K}$ is summable. Prove that, for every $\lambda \in \mathbb{K}$, the map $\lambda\alpha : I \to \mathbb{K}$ is summable, and

$$\sum_{i \in I} \lambda\alpha(i) = \lambda \sum_{i \in I} \alpha(i).$$

If $\beta : I \to \mathbb{K}$ is another summable map, prove that $\alpha + \beta : I \to \mathbb{K}$ is summable, and

$$\sum_{i \in I} [\alpha(i) + \beta(i)] = \Big[ \sum_{i \in I} \alpha(i) \Big] + \Big[ \sum_{i \in I} \beta(i) \Big].$$

The following result characterizes summability for non-negative terms.


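Lemma 2.1. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $\alpha : I \to [0, \infty)$. The following are equivalent:

(i) $\alpha$ is summable;
(ii) $\sup \Big\{ \sum_{i \in F} \alpha(i) : F \subset I, \text{ finite} \Big\} < \infty$.

Moreover, in this case we have

$$\sup \Big\{ \sum_{i \in F} \alpha(i) : F \subset I, \text{ finite} \Big\} = \sum_{i \in I} \alpha(i).$$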
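The characterization in Lemma 2.1 can be seen at work on an index set that carries no preferred ordering. The Python sketch below is an illustration under our own assumptions (the index set $I = \mathbb{N} \times \mathbb{N}$ with $\mathbb{N} = \{1, 2, \dots\}$, the map $\alpha(i, j) = 2^{-(i+j)}$, and the names `alpha` and `box_sum` are not from the notes). Since $\alpha \ge 0$ and every finite $F \subset I$ sits inside some square box $\{1, \dots, n\}^2$, the supremum over finite sets equals the supremum over boxes, and exact rational arithmetic shows the box sums increasing towards 1.

```python
from fractions import Fraction

def alpha(i, j):
    """Hypothetical non-negative map on I = N x N."""
    return Fraction(1, 2 ** (i + j))

def box_sum(n):
    """Sum of alpha over the finite set F_n = {1..n} x {1..n}."""
    return sum(alpha(i, j) for i in range(1, n + 1) for j in range(1, n + 1))

for n in (1, 2, 3, 10):
    # box_sum(n) equals (1 - 2**-n)**2, which increases to 1 as n grows
    print(n, box_sum(n), float(box_sum(n)))
```

Here condition (ii) holds with supremum 1, so by the Lemma the map is summable with sum 1.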

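Proof. We denote the quantity $\sup \Big\{ \sum_{i \in F} \alpha(i) : F \subset I, \text{ finite} \Big\}$ simply by $t$.

(i) ⇒ (ii). Assume $\alpha$ is summable, and denote $\sum_{i \in I} \alpha(i)$ simply by $s$. Choose, for each $\varepsilon > 0$, a finite set $F_\varepsilon \subset I$ such that

$$\Big| s - \sum_{i \in F} \alpha(i) \Big| < \varepsilon, \text{ for all finite subsets } F \subset I \text{ with } F \supset F_\varepsilon.$$

Claim: For any finite set $G \subset I$, and any $\varepsilon > 0$, one has the inequality

$$\sum_{i \in G} \alpha(i) < s + \varepsilon.$$

Indeed, if we take the finite set $G \cup F_\varepsilon$, then using the fact that all $\alpha$'s are non-negative, we get

$$\sum_{i \in G} \alpha(i) \le \sum_{i \in G \cup F_\varepsilon} \alpha(i) < s + \varepsilon.$$

Using the Claim, which holds for any $\varepsilon > 0$, we immediately get

$$\sum_{i \in G} \alpha(i) \le s, \text{ for all finite subsets } G \subset I,$$

so taking supremum yields $t \le s$, in particular $t < \infty$.

(ii) ⇒ (i). Assume condition (ii) is true. We are going to show that $\alpha$ is summable, by proving that the number $t$ satisfies the definition of summability. Consider the set

$$S = \Big\{ \sum_{i \in F} \alpha(i) : F \text{ finite subset of } I \Big\},$$

so that $\sup S = t < \infty$. Start with some $\varepsilon > 0$. Since $t - \varepsilon$ is no longer an upper bound for $S$, there exists some finite set $F_\varepsilon \subset I$, such that $\sum_{i \in F_\varepsilon} \alpha(i) > t - \varepsilon$. Notice that, for any finite set $F \subset I$ with $F \supset F_\varepsilon$, we have

$$t - \varepsilon < \sum_{i \in F_\varepsilon} \alpha(i) \le \sum_{i \in F} \alpha(i) \le t,$$

so we immediately get

$$\Big| t - \sum_{i \in F} \alpha(i) \Big| < \varepsilon.$$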


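Exercise 7. Let $\alpha : I \to [0, \infty)$ be summable. Prove that every map $\beta : I \to [0, \infty)$ with

$$\beta(j) \le \alpha(j), \quad \forall\, j \in I,$$

is summable, and $\sum_{j \in I} \beta(j) \le \sum_{j \in I} \alpha(j)$.

Remark 2.1. It is obvious that the above result has a version for non-positive maps as well. More explicitly, for a map $\alpha : I \to (-\infty, 0]$ the following are equivalent:

(i) $\alpha$ is summable;
(ii) $\inf \Big\{ \sum_{i \in F} \alpha(i) : F \subset I, \text{ finite} \Big\} > -\infty$.

Moreover, in this case we have

$$\inf \Big\{ \sum_{i \in F} \alpha(i) : F \subset I, \text{ finite} \Big\} = \sum_{i \in I} \alpha(i).$$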

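Lemma 2.2. Let $I$ be a non-empty set. For a function $\alpha : I \to \mathbb{C}$, the following are equivalent:

(i) $\alpha$ is summable;
(ii) both functions $\operatorname{Re} \alpha, \operatorname{Im} \alpha : I \to \mathbb{R}$ are summable.

Moreover, in this case we have the equality

$$\sum_{j \in I} \alpha(j) = \sum_{j \in I} \operatorname{Re} \alpha(j) + i \sum_{j \in I} \operatorname{Im} \alpha(j).$$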

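Proof. (i) ⇒ (ii). Assume $\alpha$ is summable. Denote the sum $\sum_{j \in I} \alpha(j)$ simply by $s$. For every $\varepsilon > 0$ choose a finite set $F_\varepsilon \subset I$ such that

$$\Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon, \text{ for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon.$$

Using the inequality

$$\max\{ |\operatorname{Re} z|, |\operatorname{Im} z| \} \le |z|, \quad \forall\, z \in \mathbb{C},$$

we immediately get the inequalities

$$\Big| \operatorname{Re} s - \sum_{j \in F} \operatorname{Re} \alpha(j) \Big| = \Big| \operatorname{Re} \Big[ s - \sum_{j \in F} \alpha(j) \Big] \Big| \le \Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon,$$

$$\Big| \operatorname{Im} s - \sum_{j \in F} \operatorname{Im} \alpha(j) \Big| = \Big| \operatorname{Im} \Big[ s - \sum_{j \in F} \alpha(j) \Big] \Big| \le \Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon,$$

for all finite sets $F \subset I$ with $F \supset F_\varepsilon$, so $\operatorname{Re} \alpha$ and $\operatorname{Im} \alpha$ are indeed summable and moreover, we have

$$\sum_{j \in I} \operatorname{Re} \alpha(j) = \operatorname{Re} s \quad\text{and}\quad \sum_{j \in I} \operatorname{Im} \alpha(j) = \operatorname{Im} s.$$

(ii) ⇒ (i). Assume $\operatorname{Re} \alpha$ and $\operatorname{Im} \alpha$ are both summable. Denote $\sum_{j \in I} \operatorname{Re} \alpha(j)$ by $u$ and denote $\sum_{j \in I} \operatorname{Im} \alpha(j)$ by $v$. Fix some $\varepsilon > 0$. Choose finite sets $E_\varepsilon, G_\varepsilon \subset I$ such that

$$\Big| u - \sum_{j \in E} \operatorname{Re} \alpha(j) \Big| < \frac{\varepsilon}{2}, \text{ for all finite sets } E \subset I \text{ with } E \supset E_\varepsilon,$$

$$\Big| v - \sum_{j \in G} \operatorname{Im} \alpha(j) \Big| < \frac{\varepsilon}{2}, \text{ for all finite sets } G \subset I \text{ with } G \supset G_\varepsilon.$$

Put $F_\varepsilon = E_\varepsilon \cup G_\varepsilon$. Suppose $F \subset I$ is a finite set with $F \supset F_\varepsilon$. Using the inclusions $F \supset E_\varepsilon$ and $F \supset G_\varepsilon$, we then get

$$\Big| u - \sum_{j \in F} \operatorname{Re} \alpha(j) \Big| < \frac{\varepsilon}{2} \quad\text{and}\quad \Big| v - \sum_{j \in F} \operatorname{Im} \alpha(j) \Big| < \frac{\varepsilon}{2},$$

so we get

$$\Big| [u + iv] - \sum_{j \in F} \alpha(j) \Big| = \Big| \Big[ u - \sum_{j \in F} \operatorname{Re} \alpha(j) \Big] + i \Big[ v - \sum_{j \in F} \operatorname{Im} \alpha(j) \Big] \Big| \le \Big| u - \sum_{j \in F} \operatorname{Re} \alpha(j) \Big| + \Big| v - \sum_{j \in F} \operatorname{Im} \alpha(j) \Big| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This proves that $\alpha$ is indeed summable, and $\sum_{j \in I} \alpha(j) = u + iv$.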

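Exercise 8. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Suppose one has two non-empty sets $I_1, I_2$ with $I = I_1 \cup I_2$ and $I_1 \cap I_2 = \varnothing$. Suppose $\alpha : I \to \mathbb{K}$ has the property that both $\alpha\big|_{I_1} : I_1 \to \mathbb{K}$ and $\alpha\big|_{I_2} : I_2 \to \mathbb{K}$ are summable. Prove that $\alpha$ is summable, and

$$\sum_{j \in I} \alpha(j) = \sum_{j \in I_1} \alpha(j) + \sum_{j \in I_2} \alpha(j).$$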

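Proposition 2.1. Let $I$ be a non-empty set, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. For a map $\alpha : I \to \mathbb{K}$, the following are equivalent:

(i) $\alpha$ is summable;
(ii) $|\alpha|$ is summable.

Moreover, in this case one has the inequality

$$\Big| \sum_{j \in I} \alpha(j) \Big| \le \sum_{j \in I} \big| \alpha(j) \big|. \tag{1}$$

Proof. (i) ⇒ (ii). Assume $\alpha$ is summable. We divide the proof in two cases:

Case $\mathbb{K} = \mathbb{R}$. Define the sets

$$I^+ = \{ j \in I : \alpha(j) > 0 \}, \quad I^- = \{ j \in I : \alpha(j) < 0 \}, \quad I^0 = \{ j \in I : \alpha(j) = 0 \}.$$

More generally, for any subset $F \subset I$ we define $F^\pm = F \cap I^\pm$ and $F^0 = F \cap I^0$.

Claim: Both maps $\alpha\big|_{I^+} : I^+ \to \mathbb{R}$ and $\alpha\big|_{I^-} : I^- \to \mathbb{R}$ are summable. Moreover, one has the equality

$$\sum_{j \in I} \alpha(j) = \sum_{j \in I^+} \alpha(j) + \sum_{j \in I^-} \alpha(j). \tag{2}$$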


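Denote the sum $\sum_{j \in I} \alpha(j)$ simply by $s$. Start by choosing some finite set $F \subset I$ such that

$$\Big| s - \sum_{j \in G} \alpha(j) \Big| < 1, \text{ for all finite sets } G \subset I \text{ with } G \supset F.$$

Let $E \subset I^+$ be a finite subset. Then the set $\tilde{E} = E \cup F$ will be a finite subset of $I$ with $\tilde{E} \supset F$, so we will have

$$\Big| s - \sum_{j \in \tilde{E}} \alpha(j) \Big| < 1,$$

so we get

$$\sum_{j \in E} \alpha(j) \le \sum_{j \in E \cup F^+} \alpha(j) = \Big[ \sum_{j \in E \cup F^+} \alpha(j) + \sum_{j \in F^0 \cup F^-} \alpha(j) \Big] - \Big[ \sum_{j \in F^0 \cup F^-} \alpha(j) \Big] = \Big[ \sum_{j \in \tilde{E}} \alpha(j) \Big] - \Big[ \sum_{j \in F^-} \alpha(j) \Big] < s + 1 - \Big[ \sum_{j \in F^-} \alpha(j) \Big].$$

In particular this gives

$$\sup \Big\{ \sum_{j \in E} \alpha(j) : E \subset I^+, \text{ finite} \Big\} \le s + 1 - \Big[ \sum_{j \in F^-} \alpha(j) \Big],$$

so by Lemma ??, the map $\alpha\big|_{I^+} : I^+ \to [0, \infty)$ is indeed summable. The fact that the map $\alpha\big|_{I^-} : I^- \to (-\infty, 0]$ is summable is proven the exact same way. The equality (2) follows from Exercise ??.

Having proven the Claim, we notice now that the map $-\alpha\big|_{I^-} : I^- \to [0, \infty)$ is also summable. Using Exercise ??, it is clear then that the map $|\alpha| : I \to [0, \infty)$ is summable, simply because all the three maps $|\alpha|\big|_{I^+} = \alpha\big|_{I^+}$, $|\alpha|\big|_{I^-} = -\alpha\big|_{I^-}$, and $|\alpha|\big|_{I^0} = 0$ are summable.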

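Case $\mathbb{K} = \mathbb{C}$. By Lemma ?? we know that the maps $\operatorname{Re} \alpha, \operatorname{Im} \alpha : I \to \mathbb{R}$ are summable. In particular, using the real case, we get the fact that the maps $|\operatorname{Re} \alpha|, |\operatorname{Im} \alpha| : I \to [0, \infty)$ are summable. Using the obvious inequality

$$|z| \le |\operatorname{Re} z| + |\operatorname{Im} z|, \quad \forall\, z \in \mathbb{C},$$

we get

$$\sum_{j \in F} |\alpha(j)| \le \sum_{j \in F} |\operatorname{Re} \alpha(j)| + \sum_{j \in F} |\operatorname{Im} \alpha(j)| \le \sum_{j \in I} |\operatorname{Re} \alpha(j)| + \sum_{j \in I} |\operatorname{Im} \alpha(j)|,$$

for every finite subset $F \subset I$. Then we get

$$\sup \Big\{ \sum_{j \in F} |\alpha(j)| : F \subset I, \text{ finite} \Big\} \le \sum_{j \in I} |\operatorname{Re} \alpha(j)| + \sum_{j \in I} |\operatorname{Im} \alpha(j)| < \infty,$$

so $|\alpha| : I \to [0, \infty)$ is indeed summable.

Having proven the implication (i) ⇒ (ii), let us prove the inequality (1). If $s$ denotes the sum $\sum_{j \in I} \alpha(j)$, then for every $\varepsilon > 0$ there exists $F_\varepsilon \subset I$ finite such that

$$\Big| s - \sum_{j \in F} \alpha(j) \Big| < \varepsilon, \text{ for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon.$$

In particular, we get

$$|s| \le \varepsilon + \Big| \sum_{j \in F_\varepsilon} \alpha(j) \Big| \le \varepsilon + \sum_{j \in F_\varepsilon} \big| \alpha(j) \big| \le \varepsilon + \sum_{j \in I} \big| \alpha(j) \big|.$$

Since this inequality holds for all $\varepsilon > 0$, we then get

$$|s| \le \sum_{j \in I} \big| \alpha(j) \big|.$$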

(ii) ⇒ (i). Assume now $|\alpha| : I \to [0, \infty)$ is summable.

Case $\mathbb{K} = \mathbb{R}$. It is obvious that $|\alpha|\big|_J : J \to [0, \infty)$ is summable, for any subset $J \subset I$. In particular, using the notations from the proof of (i) ⇒ (ii), it follows that $\alpha\big|_{I^+} = |\alpha|\big|_{I^+}$, $\alpha\big|_{I^-} = -|\alpha|\big|_{I^-}$, and $\alpha\big|_{I^0} = 0$ are all summable. Then the summability of $\alpha$ follows from Exercise ??.

Case $\mathbb{K} = \mathbb{C}$. Using the inequality

$$\max\{ |\operatorname{Re} z|, |\operatorname{Im} z| \} \le |z|, \quad \forall\, z \in \mathbb{C},$$

combined with Exercise ??, it follows that both maps $|\operatorname{Re} \alpha|, |\operatorname{Im} \alpha| : I \to [0, \infty)$ are summable. Using the real case it then follows that both maps $\operatorname{Re} \alpha, \operatorname{Im} \alpha : I \to \mathbb{R}$ are summable. Then the summability of $\alpha$ follows from Lemma ??.

The following result shows that summability is essentially the same as the summability of series.

Proposition 2.2. Suppose $\alpha : I \to \mathbb{K}$ is summable. Then the support set

$$[[\alpha]] = \{ j \in I : \alpha(j) \neq 0 \}$$

is at most countable.

Proof. For every integer $n \ge 1$, we define the set $J_n = \big\{ j \in I : |\alpha(j)| \ge \tfrac{1}{n} \big\}$. Since $|\alpha|$ is summable, the sets $J_n$, $n \ge 1$, are all finite. The desired result then follows from the obvious equality $[[\alpha]] = \bigcup_{n=1}^\infty J_n$.

We are now ready to discuss our next class of examples.

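Example 2.4. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $p \in [1, \infty)$ be a real number. We define

$$\ell^p_{\mathbb{K}}(I) = \big\{ \alpha : I \to \mathbb{K} \ :\ |\alpha|^p : I \to [0, \infty) \text{ summable} \big\}.$$

For $\alpha \in \ell^p_{\mathbb{K}}(I)$ we define

$$\|\alpha\|_p = \Big[ \sum_{j \in I} |\alpha(j)|^p \Big]^{1/p}.$$

When $\mathbb{K} = \mathbb{C}$, the space $\ell^p_{\mathbb{C}}(I)$ is simply denoted by $\ell^p(I)$. When $I = \mathbb{N}$ - the set of natural numbers - instead of $\ell^p_{\mathbb{R}}(\mathbb{N})$ we simply write $\ell^p_{\mathbb{R}}$, and instead of $\ell^p(\mathbb{N})$ we simply write $\ell^p$.

In order to show that the $\ell^p$ spaces ($1 \le p < \infty$) are normed vector spaces, we will need several preliminary results. The first result we are going to need is the (classical) Hölder inequality.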


Exercise 9. Let $q > 1$ and let $u, v \ge 0$. Define the function $f : [0, 1] \to \mathbb{R}$ by

$$f(t) = ut + v(1 - t^q)^{1/q}, \quad t \in [0, 1].$$

Prove that

$$\max_{t \in [0,1]} f(t) = (u^p + v^p)^{1/p},$$

where $p = \dfrac{q}{q - 1}$. Prove that, unless $u = v = 0$, there exists a unique $s \in [0, 1]$ such that

$$f(s) = \max_{t \in [0,1]} f(t).$$

Hint: Analyze the derivative: $f'(t) = u - v \Big( \dfrac{t^q}{1 - t^q} \Big)^{1/p}$, $t \in (0, 1)$.
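Before attempting the exercise, the claimed maximum can be checked numerically. The following Python sketch is only a plausibility check, not a proof; the function names `f`, `grid_max` and `closed_form`, and the sample values of $u$, $v$, $q$, are our own choices. It compares a brute-force maximum of $f$ over a fine grid in $[0, 1]$ with the closed form $(u^p + v^p)^{1/p}$.

```python
def f(t, u, v, q):
    """f(t) = u*t + v*(1 - t^q)^(1/q) on [0, 1]."""
    return u * t + v * (1.0 - t ** q) ** (1.0 / q)

def grid_max(u, v, q, n=100_000):
    """Brute-force maximum of f over the grid {0, 1/n, ..., 1}."""
    return max(f(i / n, u, v, q) for i in range(n + 1))

def closed_form(u, v, q):
    """The value (u^p + v^p)^(1/p) with p = q/(q - 1)."""
    p = q / (q - 1.0)
    return (u ** p + v ** p) ** (1.0 / p)

for (u, v, q) in [(3.0, 4.0, 2.0), (1.0, 2.0, 3.0), (2.0, 0.0, 1.5)]:
    print(u, v, q, grid_max(u, v, q), closed_form(u, v, q))
```

Setting $f'(t) = 0$ in the Hint suggests the maximizer $t^q = u^p / (u^p + v^p)$, which is the value of $s$ that appears in the proof of Hölder's inequality.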

Lemma 2.3 (Hölder's inequality). Let $a_1, a_2, \dots, a_n, b_1, b_2, \dots, b_n$ be non-negative numbers. Let $p, q > 1$ be real numbers with the property $\frac{1}{p} + \frac{1}{q} = 1$. Then:

$$\sum_{j=1}^n a_j b_j \le \Big( \sum_{j=1}^n a_j^p \Big)^{1/p} \cdot \Big( \sum_{j=1}^n b_j^q \Big)^{1/q}. \tag{3}$$

Moreover, one has equality only when the sequences $(a_1^p, \dots, a_n^p)$ and $(b_1^q, \dots, b_n^q)$ are proportional.
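A quick numerical illustration of inequality (3), including the equality case, is sketched below in Python. The helper name `holder_sides` and the sample vectors are our own assumptions. In the second call we take $b_j = a_j^{p-1}$, so that $b_j^q = a_j^p$ and the two sequences in the equality statement are proportional.

```python
def holder_sides(a, b, p):
    """Return (lhs, rhs) of Hoelder's inequality for non-negative a, b and p > 1."""
    q = p / (p - 1.0)                      # Hoelder conjugate: 1/p + 1/q = 1
    lhs = sum(x * y for x, y in zip(a, b))
    rhs = (sum(x ** p for x in a) ** (1.0 / p)
           * sum(y ** q for y in b) ** (1.0 / q))
    return lhs, rhs

# Generic case: strict inequality is expected.
print(holder_sides([1, 2, 3], [4, 0, 1], 3.0))

# Equality case: b_j = a_j^(p-1), hence b_j^q = a_j^p.
p = 3.0
a = [1, 2, 3]
b = [x ** (p - 1) for x in a]
print(holder_sides(a, b, p))               # both sides agree up to rounding
```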

Proof. The proof will be carried out by induction on $n$. The case $n = 1$ is trivial.

Case $n = 2$. Assume $(b_1, b_2) \neq (0, 0)$. (Otherwise everything is trivial.) Define the number

$$r = \frac{b_1}{(b_1^q + b_2^q)^{1/q}}.$$

Notice that $r \in [0, 1]$, and we have

$$\frac{b_2}{(b_1^q + b_2^q)^{1/q}} = (1 - r^q)^{1/q}.$$

Notice also that, upon dividing by $(b_1^q + b_2^q)^{1/q}$, the desired inequality

$$a_1 b_1 + a_2 b_2 \le (a_1^p + a_2^p)^{1/p} (b_1^q + b_2^q)^{1/q} \tag{4}$$

reads

$$a_1 r + a_2 (1 - r^q)^{1/q} \le (a_1^p + a_2^p)^{1/p},$$

and it follows immediately from Exercise 9, applied to the function

$$f(t) = a_1 t + a_2 (1 - t^q)^{1/q}, \quad t \in [0, 1].$$

Let us examine when equality holds. If $a_1 = a_2 = 0$, the equality obviously holds, and in this case $(a_1^p, a_2^p)$ is clearly proportional to $(b_1^q, b_2^q)$. Assume $(a_1, a_2) \neq (0, 0)$. Put

$$s = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$

and notice that

$$(1 - s^q)^{1/q} = \Big( 1 - \frac{a_1^p}{a_1^p + a_2^p} \Big)^{1/q} = \frac{a_2^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$

so we have (using $1 + \frac{p}{q} = p$)

$$f(s) = \frac{a_1^{1 + p/q} + a_2^{1 + p/q}}{(a_1^p + a_2^p)^{1/q}} = \frac{a_1^p + a_2^p}{(a_1^p + a_2^p)^{1/q}} = (a_1^p + a_2^p)^{1 - 1/q} = (a_1^p + a_2^p)^{1/p} = \max_{t \in [0,1]} f(t).$$

By Exercise 9, it follows that we have equality in (4) precisely when $r = s$, i.e.

$$\frac{b_1}{(b_1^q + b_2^q)^{1/q}} = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$

or equivalently

$$\frac{b_1^q}{b_1^q + b_2^q} = \frac{a_1^p}{a_1^p + a_2^p}.$$

Obviously this forces

$$\frac{b_2^q}{b_1^q + b_2^q} = \frac{a_2^p}{a_1^p + a_2^p},$$

so indeed $(a_1^p, a_2^p)$ and $(b_1^q, b_2^q)$ are proportional.

Having proven the case $n = 2$, we now proceed with the proof of:

The implication: Case $n = k$ ⇒ Case $n = k + 1$.

Start with two sequences $(a_1, a_2, \dots, a_k, a_{k+1})$ and $(b_1, b_2, \dots, b_k, b_{k+1})$. Define the numbers

$$a = \Big( \sum_{j=1}^k a_j^p \Big)^{1/p} \quad\text{and}\quad b = \Big( \sum_{j=1}^k b_j^q \Big)^{1/q}.$$

Using the assumption that the case $n = k$ holds, we have

$$\sum_{j=1}^{k+1} a_j b_j \le \Big( \sum_{j=1}^k a_j^p \Big)^{1/p} \cdot \Big( \sum_{j=1}^k b_j^q \Big)^{1/q} + a_{k+1} b_{k+1} = ab + a_{k+1} b_{k+1}. \tag{5}$$

Using the case $n = 2$ we also have

$$ab + a_{k+1} b_{k+1} \le (a^p + a_{k+1}^p)^{1/p} \cdot (b^q + b_{k+1}^q)^{1/q} = \Big( \sum_{j=1}^{k+1} a_j^p \Big)^{1/p} \cdot \Big( \sum_{j=1}^{k+1} b_j^q \Big)^{1/q}, \tag{6}$$

so combining with (5) we see that the desired inequality (3) holds for $n = k + 1$.

Assume now we have equality. Then we must have equality in both (5) and in (6). On the one hand, the equality in (5) forces $(a_1^p, a_2^p, \dots, a_k^p)$ and $(b_1^q, b_2^q, \dots, b_k^q)$ to be proportional (since we assume the case $n = k$). On the other hand, the equality in (6) forces $(a^p, a_{k+1}^p)$ and $(b^q, b_{k+1}^q)$ to be proportional (by the case $n = 2$). Since

$$a^p = \sum_{j=1}^k a_j^p \quad\text{and}\quad b^q = \sum_{j=1}^k b_j^q,$$

it is clear that $(a_1^p, a_2^p, \dots, a_k^p, a_{k+1}^p)$ and $(b_1^q, b_2^q, \dots, b_k^q, b_{k+1}^q)$ are proportional.

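Definition. Two numbers $p, q \in [1, \infty]$ are said to be Hölder conjugate, if $\frac{1}{p} + \frac{1}{q} = 1$. Here we use the convention $\frac{1}{\infty} = 0$.

Proposition 2.3. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $p, q \in [1, \infty]$ be two Hölder conjugate numbers. If $\alpha \in \ell^p_{\mathbb{K}}(I)$ and $\beta \in \ell^q_{\mathbb{K}}(I)$, then $\alpha\beta \in \ell^1_{\mathbb{K}}(I)$, and

$$\|\alpha\beta\|_1 \le \|\alpha\|_p \cdot \|\beta\|_q.$$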


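Proof. Using Lemma ??, it suffices to prove the inequality

$$\sum_{j \in F} \big| \alpha(j)\beta(j) \big| \le \|\alpha\|_p \cdot \|\beta\|_q, \tag{7}$$

for every finite set $F \subset I$.

Fix for the moment a finite subset $F \subset I$. Assume $p, q \in (1, \infty)$. Using Hölder's inequality we have

$$\sum_{j \in F} \big| \alpha(j)\beta(j) \big| = \sum_{j \in F} \big| \alpha(j) \big| \cdot \big| \beta(j) \big| \le \Big[ \sum_{j \in F} \big| \alpha(j) \big|^p \Big]^{1/p} \cdot \Big[ \sum_{j \in F} \big| \beta(j) \big|^q \Big]^{1/q}. \tag{8}$$

Notice however that

$$\sum_{j \in F} \big| \alpha(j) \big|^p \le \sum_{j \in I} \big| \alpha(j) \big|^p = \big( \|\alpha\|_p \big)^p, \qquad \sum_{j \in F} \big| \beta(j) \big|^q \le \sum_{j \in I} \big| \beta(j) \big|^q = \big( \|\beta\|_q \big)^q,$$

so we get

$$\Big[ \sum_{j \in F} \big| \alpha(j) \big|^p \Big]^{1/p} \le \|\alpha\|_p \quad\text{and}\quad \Big[ \sum_{j \in F} \big| \beta(j) \big|^q \Big]^{1/q} \le \|\beta\|_q,$$

so when we go back to (8) we immediately get the desired inequality (7).

In the case when $p = 1$ (so that $q = \infty$), we immediately have

$$\sum_{j \in F} \big| \alpha(j)\beta(j) \big| \le \Big[ \sum_{j \in F} \big| \alpha(j) \big| \Big] \cdot \Big[ \max_{j \in F} \big| \beta(j) \big| \Big] \le \Big[ \sum_{j \in I} \big| \alpha(j) \big| \Big] \cdot \Big[ \sup_{j \in I} \big| \beta(j) \big| \Big] = \|\alpha\|_1 \cdot \|\beta\|_\infty.$$

The case $p = \infty$ is proven in the exact same way.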

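Remark 2.2. Suppose p, q ∈ [1,∞] are Hölder conjugate numbers. For any $\alpha \in \ell^p_{\mathbb K}(I)$ and $\beta \in \ell^q_{\mathbb K}(I)$, the map αβ is summable (by Proposition ??). In particular, one can define the number
\[ \langle \alpha, \beta \rangle = \sum_{j \in I} \alpha(j)\beta(j) \in \mathbb K. \]
As a consequence we get the inequality
\[ \big| \langle \alpha, \beta \rangle \big| \le \|\alpha\|_p \cdot \|\beta\|_q, \quad \forall\, \alpha \in \ell^p_{\mathbb K}(I),\ \beta \in \ell^q_{\mathbb K}(I). \]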

Notations. Let K be either R or C, let I be a non-empty set, and let q ∈ [1,∞]. We define
\[ B^q_{\mathbb K}(I) = \big\{ \alpha \in \mathrm{fin}_{\mathbb K}(I) : \|\alpha\|_q \le 1 \big\}. \]
(Remark that $\mathrm{fin}_{\mathbb K}(I) \subset \ell^q_{\mathbb K}(I)$, for all q ∈ [1,∞].)

Theorem 2.1 (Dual definition of $\ell^p$ spaces). Let p, q ∈ (1,∞) be Hölder conjugate numbers, let K be one of the fields R or C, and let I be a non-empty set. For a function α : I → K, the following are equivalent:

(i) $\alpha \in \ell^p_{\mathbb K}(I)$;
(ii) $\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle| < \infty$.


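Moreover, one has the equality
\[ \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle| = \|\alpha\|_p, \quad \forall\, \alpha \in \ell^p_{\mathbb K}(I). \tag{9} \]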

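Proof. It will be convenient to introduce several notations. Given a function α : I → K, and a finite set F ⊂ I, we define the function $\beta^F_\alpha : I \to \mathbb K$, as follows:
\[ \beta^F_\alpha(i) = \begin{cases} \dfrac{|\alpha(i)|^{1+\frac{p}{q}}}{\alpha(i) \cdot \big( \sum_{j \in F} |\alpha(j)|^p \big)^{1/q}} & \text{if } i \in F \text{ and } \alpha(i) \ne 0, \\[2ex] 0 & \text{if } i \notin F \text{ or } \alpha(i) = 0. \end{cases} \]
Notice that $[[\beta^F_\alpha]] \subset F$, and unless $\beta^F_\alpha$ is identically zero, we have
\[ \sum_{i \in [[\beta^F_\alpha]]} |\beta^F_\alpha(i)|^q = 1. \]
So in any case we have $\beta^F_\alpha \in B^q_{\mathbb K}(I)$. Notice also that, unless $\beta^F_\alpha$ is identically zero, we have (recall that $1 + \frac{p}{q} = p$)
\[ \langle \alpha, \beta^F_\alpha \rangle = \sum_{i \in F} \alpha(i)\beta^F_\alpha(i) = \frac{\sum_{i \in F} |\alpha(i)|^{1+\frac{p}{q}}}{\big( \sum_{j \in F} |\alpha(j)|^p \big)^{1/q}} = \frac{\sum_{i \in F} |\alpha(i)|^p}{\big( \sum_{j \in F} |\alpha(j)|^p \big)^{1/q}} = \Big( \sum_{i \in F} |\alpha(i)|^p \Big)^{1 - \frac{1}{q}} = \Big( \sum_{i \in F} |\alpha(i)|^p \Big)^{1/p}. \tag{10} \]
It is clear that the equality (10) actually holds even when $\beta^F_\alpha$ is identically zero.

To make the exposition a bit clearer, we denote the quantity $\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle|$ simply by |||α|||.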

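We now proceed with the proof of the Theorem.

(i) ⇒ (ii). Assume $\alpha \in \ell^p_{\mathbb K}(I)$. In order to prove (ii) it suffices to prove the inequality
\[ |||\alpha||| \le \|\alpha\|_p. \tag{11} \]
Start with some arbitrary $\beta \in B^q_{\mathbb K}(I)$. Using Hölder's inequality we have
\[ |\langle \alpha, \beta \rangle| = \Big| \sum_{j \in [[\beta]]} \alpha(j)\beta(j) \Big| \le \sum_{j \in [[\beta]]} |\alpha(j)| \cdot |\beta(j)| \le \Big( \sum_{j \in [[\beta]]} |\alpha(j)|^p \Big)^{1/p} \cdot \Big( \sum_{j \in [[\beta]]} |\beta(j)|^q \Big)^{1/q} \le \Big( \sup_{F \subset I \text{ finite}} \Big[ \sum_{i \in F} |\alpha(i)|^p \Big] \Big)^{1/p} = \|\alpha\|_p. \]
Since this inequality holds for all $\beta \in B^q_{\mathbb K}(I)$, the inequality (11) follows.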

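(ii) ⇒ (i). Assume now |||α||| < ∞. In order to prove condition (i) it suffices to prove that
\[ \sum_{i \in F} |\alpha(i)|^p \le |||\alpha|||^p, \quad \text{for every finite subset } F \subset I. \tag{12} \]
By (10) we know that for every finite subset F ⊂ I we have
\[ \sum_{i \in F} |\alpha(i)|^p = \langle \alpha, \beta^F_\alpha \rangle^p. \tag{13} \]
In particular we get the fact that $\langle \alpha, \beta^F_\alpha \rangle = |\langle \alpha, \beta^F_\alpha \rangle|$, and the fact that $\beta^F_\alpha$ belongs to $B^q_{\mathbb K}(I)$, combined with (13), will give
\[ \sum_{i \in F} |\alpha(i)|^p = |\langle \alpha, \beta^F_\alpha \rangle|^p \le \Big( \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle| \Big)^p = |||\alpha|||^p. \]
Having proven the equivalence (i) ⇔ (ii), let us now observe that (9) is an immediate consequence of (11) and (12).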
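The role of the function $\beta^F_\alpha$ in the proof can be checked numerically when I is finite and F = I. The sketch below is only an illustration under that assumption; the names `p_norm` and `dual_witness` and the sample vector are hypothetical choices of ours.

```python
def p_norm(v, p):
    """(sum |v_j|^p)^(1/p) for a finite sequence v (real or complex entries)."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def dual_witness(alpha, p):
    """The function beta_alpha^F from the proof, with F = all indices of alpha.
    Assumes p in (1, infinity); q is the conjugate exponent."""
    q = p / (p - 1.0)
    s = sum(abs(x) ** p for x in alpha)
    if s == 0:
        return [0.0] * len(alpha)
    denom = s ** (1.0 / q)
    return [0.0 if x == 0 else abs(x) ** (1.0 + p / q) / (x * denom)
            for x in alpha]

p = 2.5
q = p / (p - 1.0)
alpha = [3.0, -4.0, 0.0, 1 + 2j]
beta = dual_witness(alpha, p)

beta_q_norm = p_norm(beta, q)                       # expected to be 1
pairing = sum(a * b for a, b in zip(alpha, beta))   # expected to be ||alpha||_p
alpha_p_norm = p_norm(alpha, p)
```

Here `beta_q_norm` equals 1 and `pairing` equals `alpha_p_norm` up to rounding, which is the content of (10) for this particular α.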

Exercise 10. Prove that Theorem 2.1 holds also in the cases (p, q) = (1,∞) and (p, q) = (∞, 1).

Corollary 2.1. Let K be either R or C, let I be a non-empty set, and let p ≥ 1.

(i) When equipped with point-wise addition and scalar multiplication, the set $\ell^p_{\mathbb K}(I)$ is a K-vector space.
(ii) The map $\ell^p_{\mathbb K}(I) \ni \alpha \longmapsto \|\alpha\|_p \in [0,\infty)$ is a norm.

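Proof. Let q be the Hölder conjugate of p. If $\alpha \in \ell^p_{\mathbb K}(I)$, and λ ∈ K, then
\[ \langle \lambda\alpha, \beta \rangle = \lambda \langle \alpha, \beta \rangle, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I), \]
so we get
\[ \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \lambda\alpha, \beta \rangle| = |\lambda| \cdot \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle|, \]
which gives the fact that $\lambda\alpha \in \ell^p_{\mathbb K}(I)$, as well as the equality $\|\lambda\alpha\|_p = |\lambda| \cdot \|\alpha\|_p$.

If $\alpha_1, \alpha_2 \in \ell^p_{\mathbb K}(I)$, then
\[ \langle \alpha_1 + \alpha_2, \beta \rangle = \langle \alpha_1, \beta \rangle + \langle \alpha_2, \beta \rangle, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I), \]
so we get
\[ \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha_1 + \alpha_2, \beta \rangle| = \sup_{\beta \in B^q_{\mathbb K}(I)} \big| \langle \alpha_1, \beta \rangle + \langle \alpha_2, \beta \rangle \big| \le \sup_{\beta \in B^q_{\mathbb K}(I)} \big( |\langle \alpha_1, \beta \rangle| + |\langle \alpha_2, \beta \rangle| \big) \le \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha_1, \beta \rangle| + \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha_2, \beta \rangle|, \]
which gives the fact that $\alpha_1 + \alpha_2 \in \ell^p_{\mathbb K}(I)$, as well as the inequality
\[ \|\alpha_1 + \alpha_2\|_p \le \|\alpha_1\|_p + \|\alpha_2\|_p. \]
The implication $\|\alpha\|_p = 0 \Rightarrow \alpha = 0$ is obvious.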

Exercise 11. Let p ≥ 1 be a real number, let K be one of the fields R or C, and let I be a non-empty set. Prove that $\mathrm{fin}_{\mathbb K}(I)$ is a dense linear subspace in $\ell^p_{\mathbb K}(I)$.

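Remark 2.3. Let p, q ∈ [1,∞] be Hölder conjugate. Then the map
\[ \ell^p_{\mathbb K}(I) \times \ell^q_{\mathbb K}(I) \ni (\alpha, \beta) \longmapsto \langle \alpha, \beta \rangle \in \mathbb K \]
is bilinear, in the sense that for any $\gamma \in \ell^p_{\mathbb K}(I)$ and any $\eta \in \ell^q_{\mathbb K}(I)$, the maps
\[ \ell^p_{\mathbb K}(I) \ni \alpha \longmapsto \langle \alpha, \eta \rangle \in \mathbb K, \qquad \ell^q_{\mathbb K}(I) \ni \beta \longmapsto \langle \gamma, \beta \rangle \in \mathbb K \]
are linear. These facts follow immediately from Exercise ??.

We now examine linear continuous maps between normed spaces.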


Proposition 2.4. Let K be either R or C, let X and Y be normed K-vector spaces, and let T : X → Y be a K-linear map. The following are equivalent:

(i) T is continuous;
(ii) $\sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} < \infty$;
(iii) $\sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \} < \infty$;
(iv) T is continuous at 0.

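Proof. (i) ⇒ (ii). Assume, by contradiction, that T is continuous, but
\[ \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} = \infty, \]
which means there exists some sequence $(x_n)_{n \ge 1} \subset X$ such that

(a) $\|x_n\| \le 1$, ∀n ≥ 1;
(b) $\lim_{n\to\infty} \|Tx_n\| = \infty$.

By (b), after discarding finitely many terms, we may assume $Tx_n \ne 0$ for all n. Put
\[ z_n = \|Tx_n\|^{-1} x_n, \quad \forall\, n \ge 1. \]
On the one hand, we have
\[ \|z_n\| = \frac{\|x_n\|}{\|Tx_n\|} \le \frac{1}{\|Tx_n\|}, \quad \forall\, n \ge 1, \]
which gives $\lim_{n\to\infty} \|z_n\| = 0$, i.e. $\lim_{n\to\infty} z_n = 0$. Since T is assumed to be continuous, we will get
\[ \lim_{n\to\infty} Tz_n = T0 = 0. \tag{14} \]
On the other hand, since T is linear, we have $Tz_n = \|Tx_n\|^{-1} Tx_n$, so in particular we get
\[ \|Tz_n\| = 1, \quad \forall\, n \ge 1, \]
which clearly contradicts (14).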

(ii) ⇒ (iii). This is obvious, since the supremum in (iii) is taken over a subset of the set used in (ii).

(iii) ⇒ (iv). Let $(x_n)_{n \ge 1} \subset X$ be a sequence with $\lim_{n\to\infty} x_n = 0$. For each n ≥ 1, define
\[ u_n = \begin{cases} \|x_n\|^{-1} x_n, & \text{if } x_n \ne 0, \\ \text{any vector of norm } 1, & \text{if } x_n = 0, \end{cases} \]
so that we have
\[ \|u_n\| = 1 \quad\text{and}\quad x_n = \|x_n\| u_n, \quad \forall\, n \ge 1. \]
Since T is linear, we have
\[ Tx_n = \|x_n\| Tu_n, \quad \forall\, n \ge 1. \tag{15} \]
If we define $M = \sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \}$, then $\|Tu_n\| \le M$, ∀n ≥ 1, so (15) will give
\[ \|Tx_n\| \le M \cdot \|x_n\|, \quad \forall\, n \ge 1, \]
and the condition $\lim_{n\to\infty} x_n = 0$ will force $\lim_{n\to\infty} Tx_n = 0$.

(iv) ⇒ (i). Assume T is continuous at 0, and let us prove that T is continuous at any point. Start with some arbitrary x ∈ X and an arbitrary sequence $(x_n)_{n \ge 1} \subset X$ with $\lim_{n\to\infty} x_n = x$. Put $z_n = x_n - x$, so that $\lim_{n\to\infty} z_n = 0$. Then we will have $\lim_{n\to\infty} Tz_n = 0$, which (use the linearity of T) means that
\[ 0 = \lim_{n\to\infty} \|Tz_n\| = \lim_{n\to\infty} \|Tx_n - Tx\|, \]
thus proving that $\lim_{n\to\infty} Tx_n = Tx$.
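Condition (ii) can fail for linear maps on infinite dimensional normed spaces. A standard example, not taken from these notes, is differentiation on the space of polynomials on [0, 1] with the sup norm: the monomial $x^n$ has norm 1 while its derivative has norm n. The sketch below illustrates this numerically; the helper names are our own, and the sup norm is only approximated on a grid (which happens to be exact for monomials, whose maximum is at x = 1).

```python
def derivative(coeffs):
    """Coefficients (lowest degree first) of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial with the given coefficients."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def sup_norm_on_grid(coeffs, steps=1000):
    """max |P(x)| over a uniform grid of [0, 1] (an approximation of the sup norm)."""
    return max(abs(evaluate(coeffs, i / steps)) for i in range(steps + 1))

def monomial(n):
    """Coefficient list of x^n."""
    return [0.0] * n + [1.0]

# Ratio ||D(x^n)|| / ||x^n|| for several degrees n; it grows without bound.
ratios = [sup_norm_on_grid(derivative(monomial(n))) / sup_norm_on_grid(monomial(n))
          for n in (1, 5, 25, 125)]
```

Since the ratios are unbounded, the supremum in (ii) is infinite for this map, so by the Proposition it is not continuous.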

Remark 2.4. Using the notations above, the quantities in (ii) and (iii) are in fact equal. Indeed, if we define
\[ M_1 = \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \}, \qquad M_2 = \sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \}, \]
then as observed during the proof, we have $M_2 \le M_1$. Conversely, if we start with some arbitrary x ∈ X with ‖x‖ ≤ 1, then we can always write x = ‖x‖u, for some u ∈ X with ‖u‖ = 1. In particular we will get
\[ \|Tx\| = \|x\| \cdot \|Tu\| \le \|x\| \cdot M_2 \le M_2. \]
Taking supremum in the above inequality, over all x ∈ X with ‖x‖ ≤ 1, will then give the inequality $M_1 \le M_2$.

Notations. Let K be either R or C, and let X and Y be normed K-vector spaces. We define
\[ L(X, Y) = \{ T : X \to Y : T \text{ is } \mathbb K\text{-linear and continuous} \}. \]
For T ∈ L(X,Y) we define (see the above remark)
\[ \|T\| = \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} = \sup\{ \|Tx\| : x \in X,\ \|x\| = 1 \}. \]
When Y = K (equipped with the absolute value as the norm), the space L(X,K) will be denoted simply by X∗, and will be called the topological dual of X.

Proposition 2.5. Let K be either R or C, and let X and Y be normed K-vector spaces.

(i) The space L(X,Y) is a K-vector space.
(ii) For T ∈ L(X,Y) we have
\[ \|T\| = \min\{ C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X \}. \tag{16} \]
In particular one has
\[ \|Tx\| \le \|T\| \cdot \|x\|, \quad \forall\, x \in X. \tag{17} \]
(iii) The map $L(X, Y) \ni T \longmapsto \|T\| \in [0,\infty)$ is a norm.

Proof. The fact that L(X,Y) is a vector space is clear.

(ii). Assume T ∈ L(X,Y). We begin by proving (17). Start with some arbitrary x ∈ X, and write it as x = ‖x‖u, for some u ∈ X with ‖u‖ = 1. Then by definition we have ‖Tu‖ ≤ ‖T‖, and by linearity we have
\[ \|Tx\| = \|x\| \cdot \|Tu\| \le \|x\| \cdot \|T\|. \]
To prove the equality (16) let us define the set
\[ \mathcal{C}_T = \{ C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X \}. \]
On the one hand, by (17) we know that $\|T\| \in \mathcal{C}_T$. On the other hand, if we take an arbitrary $C \in \mathcal{C}_T$, then for every u ∈ X with ‖u‖ = 1, we will have
\[ \|Tu\| \le C\|u\| = C, \]
so taking supremum, over all u with ‖u‖ = 1, will immediately give ‖T‖ ≤ C. Since we now have
\[ \|T\| \le C, \quad \forall\, C \in \mathcal{C}_T, \]
we clearly get $\|T\| = \min \mathcal{C}_T$.

(iii). Let T, S ∈ L(X,Y). Using (17), we have
\[ \|(T + S)x\| = \|Tx + Sx\| \le \|Tx\| + \|Sx\| \le (\|T\| + \|S\|) \cdot \|x\|, \quad \forall\, x \in X. \]
Then using (16) we get
\[ \|T + S\| \le \|T\| + \|S\|. \]
If T ∈ L(X,Y) and λ ∈ K, then the equality
\[ \|(\lambda T)x\| = |\lambda| \cdot \|Tx\|, \quad x \in X, \]
will immediately give ‖λT‖ = |λ| · ‖T‖.

Finally, if T ∈ L(X,Y) has ‖T‖ = 0, then using (17) one immediately gets T = 0.
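For a concrete feel of the operator norm, consider a linear map between finite dimensional spaces equipped with the max norm. In that setting it is a standard fact (not proven in these notes) that ‖T‖ equals the largest absolute row sum of the matrix of T. The sketch below, with hypothetical helper names and a made-up matrix, checks the estimate (17) on random points of the unit ball and exhibits a vector where the supremum is attained.

```python
import random

def max_norm(v):
    """The norm ||v||_infinity = max |v_j| on K^n."""
    return max(abs(x) for x in v)

def apply_matrix(matrix, v):
    """Matrix-vector product, i.e. the linear map T applied to v."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def operator_norm_max(matrix):
    """Operator norm with respect to the max norms: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in matrix)

matrix = [[1.0, -2.0, 0.5],
          [0.0, 3.0, -1.0]]
norm_T = operator_norm_max(matrix)

# A unit vector attaining the supremum: the signs of the heaviest row.
best_row = max(matrix, key=lambda row: sum(abs(a) for a in row))
x_star = [1.0 if a >= 0 else -1.0 for a in best_row]
attained = max_norm(apply_matrix(matrix, x_star))

# Random points of the unit ball never exceed ||T||, as in (17).
random.seed(1)
samples = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(1000)]
largest_sampled = max(max_norm(apply_matrix(matrix, x)) for x in samples)
```

In this example the supremum in the definition of ‖T‖ is actually a maximum; in infinite dimensions it need not be attained.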

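Notation. Let I be a non-empty set, let K be one of the fields R or C, and let p ∈ [1,∞]. Let q be the Hölder conjugate of p. For every element $\alpha \in \ell^p_{\mathbb K}(I)$ we define the map $\theta_\alpha : \ell^q_{\mathbb K}(I) \to \mathbb K$ by
\[ \theta_\alpha(\beta) = \langle \alpha, \beta \rangle = \sum_{i \in I} \alpha(i)\beta(i), \quad \beta \in \ell^q_{\mathbb K}(I). \]
We know that $\theta_\alpha$ is linear, and by Remark 2.2, we have
\[ \big| \theta_\alpha(\beta) \big| \le \|\alpha\|_p \cdot \|\beta\|_q, \quad \forall\, \beta \in \ell^q_{\mathbb K}(I), \]
so $\theta_\alpha$ is continuous, and we have the inequality
\[ \|\theta_\alpha\| \le \|\alpha\|_p. \tag{18} \]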

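Proposition 2.6. Using the above notations, but assuming p ∈ (1,∞], the map
\[ \Theta : \ell^p_{\mathbb K}(I) \ni \alpha \longmapsto \theta_\alpha \in \big( \ell^q_{\mathbb K}(I) \big)^* \]
is a linear isomorphism of K-vector spaces. Moreover, Θ is isometric, in the sense that
\[ \|\Theta\alpha\| = \|\alpha\|_p, \quad \forall\, \alpha \in \ell^p_{\mathbb K}(I). \tag{19} \]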

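Proof. We begin by proving (19). Since we have the inclusion
\[ \{ \beta \in \ell^q_{\mathbb K}(I) : \|\beta\|_q \le 1 \} \supset B^q_{\mathbb K}(I), \]
it follows that
\[ \|\theta_\alpha\| = \sup\{ |\theta_\alpha(\beta)| : \beta \in \ell^q_{\mathbb K}(I),\ \|\beta\|_q \le 1 \} \ge \sup\{ |\theta_\alpha(\beta)| : \beta \in B^q_{\mathbb K}(I) \}. \tag{20} \]
We know however (see Theorem 2.1 and Exercise 10) that
\[ \sup\{ |\theta_\alpha(\beta)| : \beta \in B^q_{\mathbb K}(I) \} = \|\alpha\|_p, \]
so using (20) we get
\[ \|\theta_\alpha\| \ge \|\alpha\|_p. \]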

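Combining this with (18) yields the desired equality.

The fact that Θ is linear is pretty obvious. Notice now that since Θ is isometric, it is clear that Θ is injective, so the only thing we need to prove is the fact that Θ is surjective. Start with an arbitrary linear continuous map $\phi : \ell^q_{\mathbb K}(I) \to \mathbb K$. For every i ∈ I we define the function $\delta_i : I \to \mathbb K$ by
\[ \delta_i(j) = \begin{cases} 1 & \text{if } j = i, \\ 0 & \text{if } j \ne i. \end{cases} \]
It is clear that $\delta_i \in \ell^q_{\mathbb K}(I)$, for all i ∈ I. (In fact $\delta_i \in \mathrm{fin}_{\mathbb K}(I)$.) We define α : I → K by
\[ \alpha(i) = \phi(\delta_i), \quad \forall\, i \in I. \]
Notice that, for every $\beta \in \mathrm{fin}_{\mathbb K}(I)$, we have
\[ \sum_{i \in I} \alpha(i)\beta(i) = \sum_{i \in I} \beta(i)\phi(\delta_i) = \phi\Big( \sum_{i \in \lfloor\beta\rfloor} \beta(i)\delta_i \Big) = \phi(\beta), \tag{21} \]
where $\lfloor\beta\rfloor = \{ i \in I : \beta(i) \ne 0 \}$. (Since $\beta \in \mathrm{fin}_{\mathbb K}(I)$, the set $\lfloor\beta\rfloor$ is finite.) Using the inequality (17) for φ, the above computation shows that
\[ \big| \langle \alpha, \beta \rangle \big| \le \|\phi\| \cdot \|\beta\|_q, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I). \]
By Theorem 2.1 and Exercise 10, this proves that $\alpha \in \ell^p_{\mathbb K}(I)$. Going back to (21) we now have
\[ \theta_\alpha(\beta) = \phi(\beta), \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I). \]
Since both $\theta_\alpha$ and φ are continuous, and $\mathrm{fin}_{\mathbb K}(I)$ is dense in $\ell^q_{\mathbb K}(I)$ (by Exercise 11), it follows that $\phi = \theta_\alpha$.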
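The surjectivity argument recovers α from φ by evaluating φ on the functions $\delta_i$. When I is finite this is completely explicit, as the following sketch shows. The index set I = {0, 1, 2, 3}, the functional `phi`, and the helper names are hypothetical choices made only for illustration.

```python
def delta(i, n):
    """The function delta_i on the index set {0, ..., n-1}."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def representing_sequence(phi, n):
    """alpha(i) = phi(delta_i), as in the proof of surjectivity."""
    return [phi(delta(i, n)) for i in range(n)]

def pairing(alpha, beta):
    """<alpha, beta> = sum_i alpha(i) beta(i)."""
    return sum(a * b for a, b in zip(alpha, beta))

# A made-up linear functional on K^4 (K = R here).
def phi(beta):
    return 2.0 * beta[0] - beta[1] + 0.5 * (beta[2] + beta[3])

alpha = representing_sequence(phi, 4)
beta = [0.3, -1.2, 4.0, 0.0]
difference = abs(pairing(alpha, beta) - phi(beta))
```

For this functional, `alpha` comes out as `[2.0, -1.0, 0.5, 0.5]`, and `difference` is zero up to rounding, which is the identity (21).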

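Remark 2.5. In the case p = 1, the map
\[ \Theta : \ell^1_{\mathbb K}(I) \ni \alpha \longmapsto \theta_\alpha \in \big( \ell^\infty_{\mathbb K}(I) \big)^* \]
is still isometric, but it is no longer surjective, unless I is finite. The explanation is the fact that when I is infinite, the subspace $\mathrm{fin}_{\mathbb K}(I)$ is not dense in $\ell^\infty_{\mathbb K}(I)$. For example, if we take $\mathbf 1 \in \ell^\infty_{\mathbb K}(I)$ to be the constant function 1, then it is pretty obvious that
\[ \|\mathbf 1 - \beta\| \ge 1, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I). \]
The above inequality can be immediately extended to
\[ \|\lambda \mathbf 1 + \beta\| \ge |\lambda|, \quad \forall\, \lambda \in \mathbb K,\ \beta \in \mathrm{fin}_{\mathbb K}(I). \tag{22} \]
If we then consider the subspace
\[ \widetilde{\mathrm{fin}}_{\mathbb K}(I) = \{ \lambda \mathbf 1 + \beta : \beta \in \mathrm{fin}_{\mathbb K}(I),\ \lambda \in \mathbb K \}, \]
we see that the map
\[ \phi_0 : \widetilde{\mathrm{fin}}_{\mathbb K}(I) \ni \lambda \mathbf 1 + \beta \longmapsto \lambda \in \mathbb K \]
is linear, continuous (by (22)), and has the properties
\[ \phi_0 \big|_{\mathrm{fin}_{\mathbb K}(I)} = 0, \quad \phi_0(\mathbf 1) = 1, \tag{23} \]
\[ |\phi_0(\gamma)| \le \|\gamma\|, \quad \forall\, \gamma \in \widetilde{\mathrm{fin}}_{\mathbb K}(I). \tag{24} \]
Using the Hahn-Banach Theorem, we can then extend $\phi_0$ to a linear map $\phi : \ell^\infty_{\mathbb K}(I) \to \mathbb K$ which will still satisfy (23) and (24); in particular we have $\phi \in \big( \ell^\infty_{\mathbb K}(I) \big)^*$. Notice however that if we had $\phi = \theta_\alpha$, for some $\alpha \in \ell^1_{\mathbb K}(I)$, then we must have $\alpha(i) = \phi(\delta_i) = 0$, for all i ∈ I, so this would force φ = 0, which is impossible, since $\phi(\mathbf 1) = 1$.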

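Exercise 12. Use the notations above. For every $\alpha \in \ell^1_{\mathbb K}(I)$, define
\[ \sigma_\alpha = \theta_\alpha \big|_{c_0^{\mathbb K}(I)} : c_0^{\mathbb K}(I) \to \mathbb K. \]
Prove that $\sigma_\alpha$ is linear and continuous. Prove that the map
\[ \Sigma : \ell^1_{\mathbb K}(I) \ni \alpha \longmapsto \sigma_\alpha \in \big( c_0^{\mathbb K}(I) \big)^* \]
is an isometric linear isomorphism of K-vector spaces.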

Lecture 12

3. Banach spaces

Definition. Let K be one of the fields R or C. A Banach space over K is a normed K-vector space (X, ‖ . ‖), which is complete with respect to the metric

d(x, y) = ‖x− y‖, x, y ∈ X.

Example 3.1. The field K, equipped with the absolute value norm, is a Banach space. More generally, the vector space $\mathbb K^n$, equipped with any of the norms
\[ \|(\lambda_1, \dots, \lambda_n)\|_\infty = \max\{ |\lambda_1|, \dots, |\lambda_n| \}, \qquad \|(\lambda_1, \dots, \lambda_n)\|_p = \big[ |\lambda_1|^p + \cdots + |\lambda_n|^p \big]^{1/p}, \quad p \ge 1, \]
is a Banach space.

Remark 3.1. Using the facts from the general theory of metric spaces, we know that for a normed vector space (X, ‖ . ‖), the following are equivalent:

(i) X is a Banach space;
(ii) given any sequence $(x_n)_{n \ge 1} \subset X$ with $\sum_{n=1}^{\infty} \|x_n\| < \infty$, the sequence $(y_n)_{n \ge 1}$ of partial sums, defined by $y_n = \sum_{k=1}^{n} x_k$, is convergent;
(iii) every Cauchy sequence in X has a convergent subsequence.

This is pretty obvious, since the sequence of partial sums has the property that
\[ d(y_{n+1}, y_n) = \|y_{n+1} - y_n\| = \|x_{n+1}\|, \quad \forall\, n \ge 1. \]

Exercise 1*. Let X be a finite dimensional normed vector space. Prove that X is a Banach space.

Hints: Use induction on dim X. The case dim X = 1 is trivial. Assume the statement is true for all normed vector spaces of dimension d, and let us prove it for a normed vector space of dimension d + 1. Fix such an X, and a linear basis $e_1, e_2, \dots, e_d, e_{d+1}$ for X. Start with a Cauchy sequence $(x_n)_{n \ge 1} \subset X$. Write each term as
\[ x_n = \sum_{k=1}^{d+1} \alpha_n(k) e_k. \]
Prove first that $\big( \alpha_n(d+1) \big)_{n \ge 1} \subset \mathbb K$ is bounded. Then extract a subsequence $(x_{n_p})_{p \ge 1}$ such that $\big( \alpha_{n_p}(d+1) \big)_{p \ge 1}$ is convergent. If we take $\alpha(d+1) = \lim_{p\to\infty} \alpha_{n_p}(d+1)$, then prove that the sequence $\big( x_{n_p} - \alpha_{n_p}(d+1) e_{d+1} \big)_{p \ge 1}$ is Cauchy in the space $\mathrm{Span}\{e_1, \dots, e_d\}$. Using the inductive hypothesis, conclude that $(x_{n_p})_{p \ge 1}$ is convergent in X. Thus, every Cauchy sequence in X has a convergent subsequence, hence X is Banach.

Exercise 2*. Let n ≥ 1 be an integer, and let ‖ · ‖ be a norm on $\mathbb K^n$. Prove that there exist constants C, D > 0, such that
\[ C\|x\|_\infty \le \|x\| \le D\|x\|_\infty, \quad \forall\, x \in \mathbb K^n. \]
Hint: Let $e_1, \dots, e_n$ be the standard basis vectors for $\mathbb K^n$, so that
\[ \alpha_1 e_1 + \cdots + \alpha_n e_n = (\alpha_1, \dots, \alpha_n), \quad \forall\, (\alpha_1, \dots, \alpha_n) \in \mathbb K^n. \]
Define $D = \|e_1\| + \cdots + \|e_n\|$. The existence of C is equivalent to the existence of some C′ > 0 such that
\[ \|x\|_\infty \le C'\|x\|, \quad \forall\, x \in \mathbb K^n. \]
(If such a C′ exists, then we take C = 1/C′.) To prove the existence of C′ as above, we consider the set $T = \{ x \in \mathbb K^n : \|x\| \le 1 \}$, and we need to prove that
\[ \sup_{x \in T} \|x\|_\infty < \infty. \]
Argue by contradiction (see also the hint from the preceding exercise).
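To see what the constants in Exercise 2 look like in one concrete case, take ‖ · ‖ to be the norm $\|\cdot\|_1$ on $\mathbb R^n$. The hint gives $D = \|e_1\|_1 + \cdots + \|e_n\|_1 = n$, and here C = 1 works. The following sketch, with helper names of our own choosing, checks both inequalities on random vectors; it illustrates the statement and is not a proof.

```python
import random

def max_norm(v):
    """||v||_infinity."""
    return max(abs(x) for x in v)

def one_norm(v):
    """||v||_1, playing the role of the arbitrary norm in the exercise."""
    return sum(abs(x) for x in v)

n = 5
basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
D = sum(one_norm(e) for e in basis)   # the constant D from the hint
C = 1.0                               # works for the 1-norm

random.seed(2)
vectors = [[random.uniform(-10.0, 10.0) for _ in range(n)] for _ in range(500)]
upper_ok = all(one_norm(x) <= D * max_norm(x) + 1e-12 for x in vectors)
lower_ok = all(C * max_norm(x) <= one_norm(x) + 1e-12 for x in vectors)
```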

Exercise 3. Let X and Y be normed vector spaces. Consider the product X × Y, equipped with the natural vector space structure.

(i) Prove that ‖(x, y)‖ = ‖x‖ + ‖y‖, (x, y) ∈ X × Y, defines a norm on X × Y.
(ii) Prove that, when equipped with the above norm, X × Y is a Banach space, if and only if both X and Y are Banach spaces.

There are two key constructions which enable one to construct new Banach spaces out of old ones.

Proposition 3.1. Let X be a normed vector space, and let Y be a Banach space. Then L(X,Y) is a Banach space, when equipped with the operator norm.

Proof. Start with a Cauchy sequence $(T_n)_{n \ge 1} \subset L(X, Y)$. This means that for every ε > 0, there exists some $N_\varepsilon$ such that
\[ \|T_m - T_n\| < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon. \tag{1} \]
Notice that, if one takes for example ε = 1, and we define
\[ C = 1 + \max\{ \|T_1\|, \|T_2\|, \dots, \|T_{N_1}\| \}, \]
then we clearly have
\[ \|T_n\| \le C, \quad \forall\, n \ge 1. \tag{2} \]
Notice that, using (1), we have
\[ \|T_m x - T_n x\| \le \varepsilon\|x\|, \quad \forall\, m, n \ge N_\varepsilon,\ x \in X, \tag{3} \]
which proves that

• for every x ∈ X, the sequence $(T_n x)_{n \ge 1} \subset Y$ is Cauchy.

Since Y is a Banach space, for each x ∈ X, the sequence $(T_n x)_{n \ge 1}$ will be convergent. We define the map T : X → Y by
\[ Tx = \lim_{n\to\infty} T_n x, \quad x \in X. \]
Using (2) we immediately get
\[ \|Tx\| \le C\|x\|, \quad \forall\, x \in X. \]
Since T is obviously linear, this proves that T is continuous. Finally, if we fix $n \ge N_\varepsilon$ and we take $\lim_{m\to\infty}$ in (3), we get
\[ \|T_n x - Tx\| \le \varepsilon\|x\|, \quad \forall\, n \ge N_\varepsilon,\ x \in X, \]
which proves precisely that we have the inequality
\[ \|T_n - T\| \le \varepsilon, \quad \forall\, n \ge N_\varepsilon, \]
hence $(T_n)_{n \ge 1}$ is convergent to T in the norm topology.


Corollary 3.1. If X is a normed vector space, then its topological dual X∗ = L(X,K) is a Banach space.

Proof. Immediate from the fact that K is a Banach space.

As a direct application of the above result we get

Corollary 3.2. If I is a non-empty set, and p ∈ [1,∞], then $\ell^p_{\mathbb K}(I)$ is a Banach space.

Proof. For p = 1 we know that $\ell^1 \simeq (c_0)^*$. For p ∈ (1,∞], we know that $\ell^p \simeq (\ell^q)^*$, where q is Hölder conjugate to p.

Proposition 3.2. Let X be a Banach space, and let Z ⊂ X be a linear subspace. The following are equivalent:

(i) Z is a Banach space, when equipped with the norm from X;
(ii) Z is closed in X, in the norm topology.

Proof. This is a particular case of a general result from the theory of complete metric spaces.

Corollary 3.3. Let I be a non-empty set, and let K be one of the fields R or C. Then $c_0^{\mathbb K}(I)$ is a Banach space.

Proof. Use the fact that $c_0^{\mathbb K}(I)$ is closed in $\ell^\infty_{\mathbb K}(I)$.

Exercise 4*. Let X be an infinite dimensional Banach space, and let B be a linear basis for X. Prove that B is uncountable.

Hint: If B is countable, say $B = \{ b_n : n \in \mathbb N \}$, then
\[ X = \bigcup_{n=1}^{\infty} F_n, \]
where $F_n = \mathrm{Span}\{ b_1, b_2, \dots, b_n \}$. Since the $F_n$'s are finite dimensional linear subspaces, they will be closed. Use Baire's Theorem to get a contradiction.

Comments. A third method of constructing Banach spaces is the completion. If we start with a normed K-vector space X, when we regard X as a metric space, its completion $\tilde X$ is constructed as follows. One defines
\[ \mathrm{cs}(X) = \{ \mathbf x = (x_n)_{n \ge 1} : (x_n)_{n \ge 1} \text{ Cauchy sequence in } X \}. \]
Two Cauchy sequences $\mathbf x = (x_n)_{n \ge 1}$ and $\mathbf x' = (x'_n)_{n \ge 1}$ are said to be equivalent, if $\lim_{n\to\infty} \|x_n - x'_n\| = 0$. In this case one writes $\mathbf x \sim \mathbf x'$. The completion $\tilde X$ is then defined as the space
\[ \tilde X = \mathrm{cs}(X)/\!\sim \]
of equivalence classes. For $\mathbf x \in \mathrm{cs}(X)$, one denotes by $\tilde{\mathbf x}$ its equivalence class in $\tilde X$. Finally, for an element x ∈ X one denotes by $\langle x \rangle \in \tilde X$ the equivalence class of the constant sequence x.

We know from general theory that $\tilde X$ is a complete metric space, with the distance $\tilde d$ (correctly) defined by
\[ \tilde d(\tilde{\mathbf x}, \tilde{\mathbf x}') = \lim_{n\to\infty} \|x_n - x'_n\|, \]
for any two Cauchy sequences $\mathbf x = (x_n)_{n \ge 1}$ and $\mathbf x' = (x'_n)_{n \ge 1}$.

It turns out that, in our situation, the space cs(X) carries a natural vector space structure, defined by pointwise addition and scalar multiplication. Moreover, the space $\tilde X$ is identified as a quotient vector space
\[ \tilde X = \mathrm{cs}(X)/\mathrm{ns}(X), \]
where
\[ \mathrm{ns}(X) = \{ \mathbf x = (x_n)_{n \ge 1} : (x_n)_{n \ge 1} \text{ sequence in } X \text{ with } \lim_{n\to\infty} x_n = 0 \} \]
is the linear subspace of null sequences. It then follows that $\tilde X$ carries a natural vector space structure. More explicitly, if we start with a scalar λ ∈ K, and with two elements $p, q \in \tilde X$, which are represented as $p = \tilde{\mathbf x}$ and $q = \tilde{\mathbf y}$, for two Cauchy sequences $\mathbf x = (x_n)_{n \ge 1}$ and $\mathbf y = (y_n)_{n \ge 1}$ in X, then the sequence
\[ \mathbf w = (\lambda x_n + y_n)_{n \ge 1} \]
is Cauchy in X, and the element $\lambda p + q \in \tilde X$ is then defined as $\lambda p + q = \tilde{\mathbf w}$.

Finally, there is a natural norm on $\tilde X$, (correctly) defined by
\[ \|\tilde{\mathbf x}\| = \tilde d(\tilde{\mathbf x}, \langle 0 \rangle) = \lim_{n\to\infty} \|x_n\|, \]
for all Cauchy sequences $\mathbf x = (x_n)_{n \ge 1}$. These considerations then prove that $\tilde X$ is a Banach space, and the map
\[ X \ni x \longmapsto \langle x \rangle \in \tilde X \]
is linear and isometric, in the sense that
\[ \|\langle x \rangle\| = \|x\|, \quad \forall\, x \in X. \]

In the context of normed vector spaces, the universality property of the completion is stated as follows:

Proposition 3.3. Let X be a normed vector space, let $\tilde X$ denote its completion, and let Y be a Banach space. For every linear continuous map T : X → Y, there exists a unique linear continuous map $\tilde T : \tilde X \to Y$, such that
\[ \tilde T \langle x \rangle = Tx, \quad \forall\, x \in X. \]
Moreover the map
\[ L(X, Y) \ni T \longmapsto \tilde T \in L(\tilde X, Y) \]
is an isometric linear isomorphism.

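Proof. If T : X → Y is linear and continuous, then T is a Lipschitz map with Lipschitz constant ‖T‖, because
\[ \|Tx - Tx'\| \le \|T\| \cdot \|x - x'\|, \quad \forall\, x, x' \in X. \]
We know, from the theory of metric spaces, that there exists a unique continuous map $\tilde T : \tilde X \to Y$, such that
\[ \tilde T \langle x \rangle = Tx, \quad \forall\, x \in X. \]
We also know that $\tilde T$ is Lipschitz, with Lipschitz constant ‖T‖. The only thing we need to prove is the fact that $\tilde T$ is linear. Start with two points $p, q \in \tilde X$, represented as $p = \tilde{\mathbf x}$ and $q = \tilde{\mathbf z}$, for some Cauchy sequences $\mathbf x = (x_n)_{n \ge 1}$ and $\mathbf z = (z_n)_{n \ge 1}$ in X. If λ ∈ K, then $\lambda p + q = \tilde{\mathbf w}$, where $\mathbf w = (\lambda x_n + z_n)_{n \ge 1}$. We then have
\[ \tilde T(\lambda p + q) = \lim_{n\to\infty} T(\lambda x_n + z_n) = \Big[ \lambda \cdot \lim_{n\to\infty} Tx_n \Big] + \Big[ \lim_{n\to\infty} Tz_n \Big] = \lambda \tilde T p + \tilde T q. \]
Let us prove now that $\|\tilde T\| = \|T\|$. Since $\tilde T$ is Lipschitz, with Lipschitz constant ‖T‖, we will have $\|\tilde T\| \le \|T\|$. To prove the other inequality, let us consider the sets
\[ B_0 = \{ p \in \tilde X : \|p\| \le 1 \}, \qquad B_1 = \{ \langle x \rangle : x \in X,\ \|x\| \le 1 \}. \]
By definition, we have
\[ \|\tilde T\| = \sup_{p \in B_0} \|\tilde T p\|. \]
Since we clearly have $B_0 \supset B_1$, we get
\[ \|\tilde T\| \ge \sup_{p \in B_1} \|\tilde T p\| = \sup\{ \|\tilde T \langle x \rangle\| : x \in X,\ \|x\| \le 1 \} = \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} = \|T\|. \]
The fact that the map $L(X, Y) \ni T \longmapsto \tilde T \in L(\tilde X, Y)$ is linear is obvious. To prove the surjectivity, start with some $S \in L(\tilde X, Y)$. Consider the map
\[ \iota : X \ni x \longmapsto \langle x \rangle \in \tilde X. \]
Since ι is linear and isometric, in particular it is continuous, so the composition $T = S \circ \iota$ is linear and continuous. Notice that
\[ S \langle x \rangle = S\big( \iota(x) \big) = (S \circ \iota)x = Tx, \quad \forall\, x \in X, \]
so by uniqueness we have $S = \tilde T$.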

Corollary 3.4. Let X be a normed space, let Y be a Banach space, and let T : X → Y be an isometric linear map.

(i) Let $\tilde T : \tilde X \to Y$ be the linear continuous map defined in the previous result. Then $\tilde T$ is linear, isometric, and $\tilde T(\tilde X) = \overline{T(X)}$.
(ii) X is complete, if and only if T(X) is closed in Y.

Proof. (i). The fact that $\tilde T$ is isometric, and has the range equal to $\overline{T(X)}$, is true in general (i.e. for X a metric space, and Y a complete metric space). The linearity follows from the previous result.

(ii). This is obvious.

Example 3.2. Let X be a normed vector space. For every x ∈ X define the map $\varepsilon_x : X^* \to \mathbb K$ by
\[ \varepsilon_x(\phi) = \phi(x), \quad \forall\, \phi \in X^*. \]
Then $\varepsilon_x$ is linear and continuous. This is an immediate consequence of the inequality
\[ |\varepsilon_x(\phi)| = |\phi(x)| \le \|x\| \cdot \|\phi\|, \quad \forall\, \phi \in X^*. \]
Notice that this also proves
\[ \|\varepsilon_x\| \le \|x\|, \quad \forall\, x \in X. \]
Interestingly enough, we actually have
\[ \|\varepsilon_x\| = \|x\|, \quad \forall\, x \in X. \tag{4} \]
To prove this fact, we start with an arbitrary x ∈ X, and we consider the linear subspace
\[ Y = \mathbb K x = \{ \lambda x : \lambda \in \mathbb K \}. \]
If we define $\phi_0 : Y \to \mathbb K$, by
\[ \phi_0(\lambda x) = \lambda\|x\|, \quad \forall\, \lambda \in \mathbb K, \]
then it is clear that $\phi_0(x) = \|x\|$, and
\[ |\phi_0(y)| \le \|y\|, \quad \forall\, y \in Y. \]
Use then the Hahn-Banach Theorem to find φ : X → K such that $\phi\big|_Y = \phi_0$, and
\[ |\phi(z)| \le \|z\|, \quad \forall\, z \in X. \]
This will clearly imply ‖φ‖ ≤ 1, while the first condition will give $\phi(x) = \phi_0(x) = \|x\|$. In particular, we will have
\[ \|x\| = |\phi(x)| = |\varepsilon_x(\phi)| \le \|\varepsilon_x\| \cdot \|\phi\| \le \|\varepsilon_x\|. \]
Having proven (4), we now have a linear isometric map
\[ E : X \ni x \longmapsto \varepsilon_x \in X^{**}. \]
Since X∗∗ is a Banach space, we now see that E : X → E(X) is an isometric linear isomorphism. In particular, X is Banach, if and only if E(X) is closed in X∗∗.

We conclude with a series of results, which are often regarded as the “principles of Banach space theory.” These results are consequences of Baire's Theorem.

Theorem 3.1 (Uniform Boundedness Principle). Let X be a Banach space, let Y be a normed vector space, and let $\mathcal M \subset L(X, Y)$. The following are equivalent:

(i) $\sup\{ \|T\| : T \in \mathcal M \} < \infty$;
(ii) $\sup\{ \|Tx\| : T \in \mathcal M \} < \infty$, ∀x ∈ X.

Proof. The implication (i) ⇒ (ii) is trivial, because if we define
\[ M = \sup\{ \|T\| : T \in \mathcal M \}, \]
then by the definition of the norm, we clearly have
\[ \sup\{ \|Tx\| : T \in \mathcal M \} \le M\|x\|, \quad \forall\, x \in X. \]
(ii) ⇒ (i). Assume $\mathcal M$ satisfies condition (ii). For each integer n ≥ 1, let us define the set
\[ F_n = \{ x \in X : \|Tx\| \le n,\ \forall\, T \in \mathcal M \}. \]
It is obvious that $F_n$ is a closed subset of X, for each n ≥ 1. Moreover, by (ii) we clearly have $\bigcup_{n=1}^{\infty} F_n = X$. Using Baire's Theorem, there exists some n ≥ 1, such that $\mathrm{Int}(F_n) \ne \emptyset$. This means that there exists some $x_0 \in X$ and some r > 0, such that
\[ F_n \supset B_r(x_0) = \{ y \in X : \|y - x_0\| \le r \}. \]
Put $M_0 = \sup\{ \|Tx_0\| : T \in \mathcal M \}$, which is finite by (ii). Fix for the moment some arbitrary x ∈ X, with ‖x‖ ≤ 1, and some arbitrary element $T \in \mathcal M$. The vector $y = x_0 + rx$ clearly belongs to $B_r(x_0)$, so we have ‖Ty‖ ≤ n. We then get
\[ \|Tx\| = \Big\| T\Big( \tfrac{1}{r}(y - x_0) \Big) \Big\| = \tfrac{1}{r}\|Ty - Tx_0\| \le \tfrac{1}{r}\big( \|Ty\| + \|Tx_0\| \big) \le \tfrac{1}{r}(n + M_0). \]
Keep T fixed, and use the above estimate, which gives
\[ \sup\{ \|Tx\| : x \in X,\ \|x\| \le 1 \} \le \frac{n + M_0}{r}, \]
to conclude that $\|T\| \le \frac{n + M_0}{r}$. Since $T \in \mathcal M$ is arbitrary, we finally get
\[ \sup\{ \|T\| : T \in \mathcal M \} \le \frac{n + M_0}{r} < \infty. \]
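The completeness of X is essential in the Theorem. A standard counterexample, which is not taken from these notes, uses the non-complete space $\mathrm{fin}_{\mathbb R}(\mathbb N)$ with the sup norm and the functionals $T_n(x) = n \cdot x(n)$: each x has finite support, so $\sup_n |T_n x| < \infty$, yet $\|T_n\| = n$. The sketch below represents finitely supported sequences as dictionaries; all names in it are our own.

```python
def T(n, x):
    """The functional T_n(x) = n * x(n) on finitely supported sequences,
    where x is a dict {index: value} listing the non-zero entries."""
    return n * x.get(n, 0.0)

def sup_norm(x):
    """||x||_infinity for a finitely supported sequence."""
    return max((abs(v) for v in x.values()), default=0.0)

def delta(n):
    """The sequence with value 1 at index n and 0 elsewhere."""
    return {n: 1.0}

# Pointwise boundedness: for a fixed x only indices in its support contribute.
x = {1: 0.5, 3: -2.0, 7: 1.0}
orbit_bound = max(abs(T(n, x)) for n in range(1, 1000))

# The operator norms are unbounded: |T_n(delta_n)| / ||delta_n|| = n.
norm_lower_bounds = [abs(T(n, delta(n))) / sup_norm(delta(n)) for n in (1, 10, 100)]
```

So condition (ii) holds for this family while condition (i) fails, which is only possible because the underlying space is not a Banach space.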


Theorem 3.2 (Inverse Mapping Theorem). Let X and Y be Banach spaces, and let T : X → Y be a bijective linear continuous map. Then the linear map $T^{-1} : Y \to X$ is also continuous.

Proof. Let us denote by A the open unit ball in X centered at the origin, i.e.
\[ A = \{ x \in X : \|x\| < 1 \}. \]
The first step in the proof is contained in the following.

Claim 1: The closure $\overline{T(A)}$ is a neighborhood of 0 in Y.

Consider the sequence of closed sets $\big( k\overline{T(A)} \big)_{k=1}^{\infty}$. (Here we use the notation $kM = \{ kv : v \in M \}$.) Since the map $v \longmapsto kv$ is a homeomorphism, one has the equalities
\[ k\overline{T(A)} = \overline{kT(A)} = \overline{T(kA)}, \quad \forall\, k \ge 1. \]
In particular, we have
\[ \bigcup_{k=1}^{\infty} k\overline{T(A)} = \bigcup_{k=1}^{\infty} \overline{T(kA)} \supset \bigcup_{k=1}^{\infty} T(kA) = T\Big( \bigcup_{k=1}^{\infty} [kA] \Big). \]
Since we obviously have $\bigcup_{k=1}^{\infty} [kA] = X$, and T is surjective, the above relation shows that $\bigcup_{k=1}^{\infty} k\overline{T(A)} = Y$. Using Baire's Theorem, there exists some k ≥ 1, such that $\mathrm{Int}\big( k\overline{T(A)} \big) \ne \emptyset$. Again using the fact that $v \longmapsto kv$ is a homeomorphism, this gives $\mathrm{Int}\big( \overline{T(A)} \big) \ne \emptyset$. Fix now some point $y \in \mathrm{Int}\big( \overline{T(A)} \big)$, and some r > 0, such that $\overline{T(A)}$ contains the open ball
\[ B_r(y) = \{ z \in Y : \|z - y\| < r \}. \tag{5} \]
The proof of the Claim is then finished, once we prove the inclusion
\[ \overline{T(A)} \supset B_{r/2}(0). \]
To prove this inclusion, start with some arbitrary $v \in B_{r/2}(0)$, i.e. v ∈ Y and $\|v\| < \frac{r}{2}$. Since ‖(2v + y) − y‖ = 2‖v‖ < r, using (5) it follows that $2v + y \in \overline{T(A)}$, i.e. there exists a sequence $(x_n)_{n=1}^{\infty} \subset X$ with $\|x_n\| < 1$, ∀n ≥ 1, and $2v + y = \lim_{n\to\infty} Tx_n$. Since y itself belongs to $\overline{T(A)}$, there also exists some sequence $(z_n)_{n=1}^{\infty} \subset X$, with $\|z_n\| < 1$, ∀n ≥ 1, and $y = \lim_{n\to\infty} Tz_n$. On the one hand, if we consider the sequence $(u_n)_{n=1}^{\infty} \subset X$ given by $u_n = \frac{1}{2}(x_n - z_n)$, then it is clear that
\[ \|u_n\| \le \tfrac{1}{2}\big( \|x_n\| + \|z_n\| \big) < 1, \quad \forall\, n \ge 1, \]
i.e. $(u_n)_{n=1}^{\infty} \subset A$. On the other hand, we have
\[ \lim_{n\to\infty} Tu_n = \lim_{n\to\infty} \tfrac{1}{2}(Tx_n - Tz_n) = \tfrac{1}{2}(2v + y - y) = v, \]
so v indeed belongs to $\overline{T(A)}$.

The next step is a slight (but crucial) improvement of Claim 1.

Claim 2: T(A) is a neighborhood of 0.

Start off by choosing ε > 0, such that
\[ \overline{T(A)} \supset B_\varepsilon(0). \tag{6} \]
The Claim will follow, once we prove the inclusion
\[ T(A) \supset B_{\varepsilon/2}(0). \tag{7} \]
To prove this inclusion, we start with some arbitrary $y \in B_{\varepsilon/2}(0)$. We want to construct a sequence of vectors $(x_n)_{n=1}^{\infty} \subset A$, such that, for every n ≥ 1, we have the inequality
\[ \Big\| y - \sum_{k=1}^{n} T\big( \tfrac{1}{2^k} x_k \big) \Big\| < \frac{\varepsilon}{2^{n+1}}. \tag{8} \]
This sequence is constructed inductively as follows. We start by using (6): since $2y \in B_\varepsilon(0)$, we can pick $x_1 \in A$ such that $\|2y - Tx_1\| < \frac{\varepsilon}{2}$. Once $x_1, \dots, x_p$ are constructed, such that (8) holds with n = p, we consider the vector
\[ z = 2^{p+1}\Big[ y - \sum_{k=1}^{p} T\big( \tfrac{1}{2^k} x_k \big) \Big] \in B_\varepsilon(0), \]
and we use again (6) to find $x_{p+1} \in A$, such that $\|z - Tx_{p+1}\| < \frac{\varepsilon}{2}$. We then clearly have
\[ \Big\| y - \sum_{k=1}^{p+1} T\big( \tfrac{1}{2^k} x_k \big) \Big\| = \frac{\| z - Tx_{p+1} \|}{2^{p+1}} < \frac{\varepsilon}{2^{p+2}}. \]
Consider now the series $\sum_{k=1}^{\infty} \frac{1}{2^k} x_k$. Since $\|x_k\| < 1$, ∀k ≥ 1, and X is a Banach space, by Remark 3.1, the sequence $(w_n)_{n=1}^{\infty} \subset X$ of partial sums
\[ w_n = \sum_{k=1}^{n} \tfrac{1}{2^k} x_k, \quad n \ge 1, \]
is convergent to some point x ∈ X. Moreover, since we have
\[ \|w_n\| \le \sum_{k=1}^{n} \frac{\|x_k\|}{2^k} \le \sum_{k=1}^{\infty} \frac{\|x_k\|}{2^k}, \quad \forall\, n \ge 1, \]
we get the inequality
\[ \|x\| \le \sum_{k=1}^{\infty} \frac{\|x_k\|}{2^k} < 1, \]
which means that x ∈ A. Note also that using these partial sums, the inequality (8) reads
\[ \|y - Tw_n\| < \frac{\varepsilon}{2^{n+1}}, \quad \forall\, n \ge 1, \]
so by the continuity of T, we have y = Tx ∈ T(A).

Let us show now that $T^{-1}$ is continuous. Use Claim 2, to find some r > 0 such that
\[ T(A) \supset B_r(0), \tag{9} \]
and let y ∈ Y be an arbitrary vector with ‖y‖ ≤ 1. Consider the vector $v = \frac{r}{2} y$, which has $\|v\| \le \frac{r}{2} < r$. By (9), there exists x ∈ A, such that Tx = v, which means that $T^{-1}y = \frac{2}{r} x$. This forces $\|T^{-1}y\| \le \frac{2}{r}$. This argument shows that
\[ \sup\{ \|T^{-1}y\| : y \in Y,\ \|y\| \le 1 \} \le \frac{2}{r} < \infty, \]
and the continuity of $T^{-1}$ follows from Proposition 2.4.

The following two exercises deal with two more “principles of Banach space theory.”

Exercise 5♦. (Closed Graph Theorem). Let X and Y be Banach spaces, and let T : X → Y be a linear map. Prove that the following are equivalent:

(i) T is continuous.
(ii) The graph of T
\[ G_T = \{ (x, Tx) : x \in X \} \]
is a closed subset of X × Y, in the product topology.

Hint: For the implication (ii) ⇒ (i), use Exercise 3, to get the fact that $G_T$ is a Banach space. Then T is exactly the inverse of $\pi_X\big|_{G_T}$, where $\pi_X : X \times Y \to X$ is the projection onto the first coordinate. Use Theorem 3.2.

Exercise 6♦. (Open Mapping Theorem). Let X and Y be Banach spaces, and let T : X → Y be a surjective linear continuous map. Prove that T is an open map, in the sense that

• whenever D ⊂ X is open, it follows that T(D) is open in Y.

Hint: Consider the linear map
\[ S : X \times Y \ni (x, y) \longmapsto (x, Tx + y) \in X \times Y. \]
Prove that S is linear, continuous, bijective, hence by Theorem 3.2, it is a homeomorphism. Use this fact to prove that for every open set D ⊂ X, there exists some open set E ⊂ X × Y, such that $T(D) = \pi_Y(E)$, where $\pi_Y : X \times Y \to Y$ is the projection onto the second coordinate. This reduces the problem to proving the fact that $\pi_Y$ is an open map.

Lecture 13

4. The weak dual topology

In this section we examine the topological duals of normed vector spaces. Besides the norm topology, there is another natural topology which is constructed as follows.

Definition. Let X be a normed vector space over K (= R, C). For every x ∈ X, let $\varepsilon_x : X^* \to \mathbb K$ be the linear map defined by
\[ \varepsilon_x(\phi) = \phi(x), \quad \forall\, \phi \in X^*. \]
We equip the vector space X∗ with the weak topology defined by the family $\Xi = (\varepsilon_x)_{x \in X}$. This topology is called the weak dual topology, and is denoted by w∗. Recall (see Section 3) that this topology is characterized by the following property:

(w∗) Given a topological space T, a map f : T → X∗ is continuous with respect to the w∗ topology, if and only if $\varepsilon_x \circ f : T \to \mathbb K$ is continuous, for each x ∈ X.

Remark that all the maps $\varepsilon_x : X^* \to \mathbb K$, x ∈ X, are already continuous with respect to the norm topology. This gives the fact that

• the w∗ topology on X∗ is weaker than the norm topology.

Remark 4.1. The w∗ topology is Hausdorff. Indeed, if φ, ψ ∈ X∗ are such that φ ≠ ψ, then there exists some x ∈ X such that
\[ \varepsilon_x(\phi) = \phi(x) \ne \psi(x) = \varepsilon_x(\psi). \]
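In infinite dimensions the w∗ topology is in general strictly weaker than the norm topology. A standard illustration, assuming the identification $\ell^1 \simeq (c_0)^*$ from Exercise 12 with I = N, is the sequence $(\delta_n)_{n \ge 1} \subset \ell^1$: one has $\varepsilon_x(\delta_n) = x(n) \to 0$ for every $x \in c_0$, while $\|\delta_n\|_1 = 1$ for all n. The sketch below evaluates this for the particular element x(n) = 1/n; the representation of finitely supported sequences as dictionaries and all names are our own choices.

```python
def pair(phi, x):
    """<phi, x> for a finitely supported phi (dict index -> value) in l^1
    and a function x on the positive integers (an element of c_0)."""
    return sum(c * x(i) for i, c in phi.items())

def one_norm(phi):
    """||phi||_1 for a finitely supported phi."""
    return sum(abs(c) for c in phi.values())

def x(n):
    """A sample element of c_0: x(n) = 1/n, which tends to 0."""
    return 1.0 / n

indices = (1, 10, 100, 1000)
values = [pair({n: 1.0}, x) for n in indices]   # eps_x(delta_n) = x(n)
norms = [one_norm({n: 1.0}) for n in indices]   # ||delta_n||_1
```

The list `values` tends to 0 while `norms` stays equal to 1, so $(\delta_n)$ converges to 0 in the w∗ topology (tested here against one x only) but not in norm.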

Proposition 4.1. Let X be a normed vector space over K. For every ε > 0, φ ∈ X∗, and x ∈ X, define the set
\[ W(\phi; \varepsilon, x) = \{ \psi \in X^* : |\psi(x) - \phi(x)| < \varepsilon \}. \]
Then the collection
\[ \mathcal W = \{ W(\phi; \varepsilon, x) : \varepsilon > 0,\ \phi \in X^*,\ x \in X \} \]
is a subbase for the w∗ topology. More precisely, given φ ∈ X∗, a set N ⊂ X∗ is a neighborhood of φ with respect to the w∗ topology, if and only if there exist ε > 0 and $x_1, \dots, x_n \in X$, such that
\[ N \supset W(\phi; \varepsilon, x_1) \cap \cdots \cap W(\phi; \varepsilon, x_n). \]

Proof. It is clearly sufficient to prove the second assertion, because it would imply the fact that any w∗ open set is a union of finite intersections of sets in $\mathcal W$.

If we define the collection
\[ \mathcal S = \{ \varepsilon_x^{-1}(D) : x \in X,\ D \subset \mathbb K \text{ open} \}, \]
then we know that $\mathcal S$ is a subbase for the w∗ topology.

Fix φ ∈ X∗. Start with some w∗ neighborhood N of φ, so there exists some w∗ open set E with φ ∈ E ⊂ N. Using the fact that $\mathcal S$ is a subbase for the w∗ topology, there exist open sets $D_1, \dots, D_n \subset \mathbb K$, and points $x_1, \dots, x_n \in X$, such that
\[ \phi \in \bigcap_{k=1}^{n} \varepsilon_{x_k}^{-1}(D_k) \subset E. \]
Fix for the moment k ∈ {1, . . . , n}. The fact that $\phi \in \varepsilon_{x_k}^{-1}(D_k)$ means that $\phi(x_k) \in D_k$. Since $D_k$ is open in K, there exists some $\varepsilon_k > 0$, such that
\[ D_k \supset B_{\varepsilon_k}\big( \phi(x_k) \big). \]
Then if we have an arbitrary $\psi \in W(\phi; \varepsilon_k, x_k)$, we will have
\[ |\psi(x_k) - \phi(x_k)| < \varepsilon_k, \]
which gives $\psi \in \varepsilon_{x_k}^{-1}(D_k)$. This proves that
\[ W(\phi; \varepsilon_k, x_k) \subset \varepsilon_{x_k}^{-1}(D_k). \]
Notice that, if one takes $\varepsilon = \min\{ \varepsilon_1, \dots, \varepsilon_n \}$, then we clearly have the inclusions
\[ W(\phi; \varepsilon, x_k) \subset W(\phi; \varepsilon_k, x_k) \subset \varepsilon_{x_k}^{-1}(D_k). \]
We then immediately get
\[ \bigcap_{k=1}^{n} W(\phi; \varepsilon, x_k) \subset \bigcap_{k=1}^{n} \varepsilon_{x_k}^{-1}(D_k) \subset E \subset N, \]
and we are done.

Corollary 4.1. Let X be a normed vector space. Then the w∗ topology on X∗ is locally convex, i.e.

• for every φ ∈ X∗ and every w∗-neighborhood N of φ, there exists a convex w∗-open set D such that φ ∈ D ⊂ N.

Proof. Apply the second part of the proposition, together with the obvious fact that each of the sets W(φ; ε, x) is convex and w∗-open.

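Proposition 4.2. Let X be a normed vector space. When equipped with the w∗ topology, the space X∗ is a topological vector space. This means that the maps
\[ X^* \times X^* \ni (\phi, \psi) \longmapsto \phi + \psi \in X^*, \qquad \mathbb K \times X^* \ni (\lambda, \phi) \longmapsto \lambda\phi \in X^* \]
are continuous with respect to the w∗ topology on the target space, and the w∗ product topology on the domain.

Proof. According to the definition of the w∗ topology, it suffices to prove that, for every x ∈ X, the maps
\[ \sigma_x : X^* \times X^* \ni (\phi, \psi) \longmapsto \varepsilon_x(\phi + \psi) \in \mathbb K, \qquad \gamma_x : \mathbb K \times X^* \ni (\lambda, \phi) \longmapsto \varepsilon_x(\lambda\phi) \in \mathbb K \]
are continuous. But the continuity of $\sigma_x$ and $\gamma_x$ is obvious, since we have
\[ \sigma_x(\phi, \psi) = \phi(x) + \psi(x) = \varepsilon_x(\phi) + \varepsilon_x(\psi), \quad \forall\, (\phi, \psi) \in X^* \times X^*; \]
\[ \gamma_x(\lambda, \phi) = \lambda\phi(x) = \lambda\varepsilon_x(\phi), \quad \forall\, (\lambda, \phi) \in \mathbb K \times X^*. \]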

CHAPTER II: ELEMENTS OF FUNCTIONAL ANALYSIS 91

Our next goal will be to describe the linear maps $X^*\to\mathbb{K}$ which are continuous in the $w^*$ topology.

Proposition 4.3. Let $X$ be a normed vector space over $\mathbb{K}$. For a linear map $\omega : X^*\to\mathbb{K}$, the following are equivalent:

(i) $\omega$ is continuous with respect to the $w^*$ topology;
(ii) there exists some $x\in X$ such that

$$\omega(\phi)=\phi(x),\quad\forall\,\phi\in X^*.$$

Proof. The implication (ii) ⇒ (i) is trivial, since condition (ii) gives $\omega=\epsilon_x$.

(i) ⇒ (ii). Suppose $\omega$ is continuous. In particular, $\omega$ is continuous at 0, so if we take the set

$$D=\{\lambda\in\mathbb{K} : |\lambda|<1\},$$

the set

$$\omega^{-1}(D)=\{\phi\in X^* : |\omega(\phi)|<1\}$$

is an open neighborhood of 0 in the $w^*$ topology. By Proposition ?? there exist $x_1,\dots,x_n\in X$ and $\varepsilon>0$ such that

$$W(0;\varepsilon,x_1)\cap\dots\cap W(0;\varepsilon,x_n)\subset\omega^{-1}(D).\tag{1}$$

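Claim 1: One has the inequality

$$|\omega(\phi)|\le\varepsilon^{-1}\cdot\max\{|\phi(x_1)|,\dots,|\phi(x_n)|\},\quad\forall\,\phi\in X^*.$$

Fix an arbitrary $\phi\in X^*$, and put $M=\max\{|\phi(x_1)|,\dots,|\phi(x_n)|\}$. For every integer $k\ge1$, define

$$\phi_k=\varepsilon\bigl(M+\tfrac1k\bigr)^{-1}\phi,$$

so that

$$|\phi_k(x_j)|=\varepsilon\bigl(M+\tfrac1k\bigr)^{-1}|\phi(x_j)|\le\varepsilon M\bigl(M+\tfrac1k\bigr)^{-1}<\varepsilon,\quad\forall\,k\ge1,\ j\in\{1,\dots,n\}.$$

This proves that $\phi_k\in W(0;\varepsilon,x_j)$, for all $k\ge1$ and all $j\in\{1,\dots,n\}$. By (1) this will give

$$|\omega(\phi_k)|<1,\quad\forall\,k\ge1,$$

which reads

$$\varepsilon\bigl(M+\tfrac1k\bigr)^{-1}|\omega(\phi)|<1,\quad\forall\,k\ge1.$$

This gives

$$|\omega(\phi)|\le\varepsilon^{-1}\bigl(M+\tfrac1k\bigr),\quad\forall\,k\ge1,$$

and, letting $k\to\infty$, it will obviously force

$$|\omega(\phi)|\le\varepsilon^{-1}M.$$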

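Having proven the Claim, we now define the linear map $T : X^*\to\mathbb{K}^n$ by

$$T\phi=\bigl(\phi(x_1),\dots,\phi(x_n)\bigr),\quad\forall\,\phi\in X^*.$$

Claim 2: There exists a linear map $\sigma : \mathbb{K}^n\to\mathbb{K}$ such that $\omega=\sigma\circ T$.

First we show that we have the inclusion

$$\operatorname{Ker}\omega\supset\operatorname{Ker}T.$$

If we start with $\phi\in\operatorname{Ker}T$, then $\phi(x_1)=\dots=\phi(x_n)=0$, and then by Claim 1 we immediately get $\omega(\phi)=0$, so $\phi$ indeed belongs to $\operatorname{Ker}\omega$. We now use a bit of linear algebra. On the one hand, since $\omega\big|_{\operatorname{Ker}T}=0$, there exists a linear map $\hat\omega : X^*/\operatorname{Ker}T\to\mathbb{K}$ such that $\omega=\hat\omega\circ\pi$, where $\pi : X^*\to X^*/\operatorname{Ker}T$ denotes the quotient map. On the other hand, by the Isomorphism Theorem for linear maps, there exists a linear isomorphism $\hat T : X^*/\operatorname{Ker}T\xrightarrow{\ \sim\ }\operatorname{Ran}T$ such that $\hat T\circ\pi=T$. We then define

$$\sigma_0=\hat\omega\circ\hat T^{-1} : \operatorname{Ran}T\to\mathbb{K},$$

and we will have

$$\sigma_0\circ T=(\hat\omega\circ\hat T^{-1})\circ(\hat T\circ\pi)=\hat\omega\circ\pi=\omega.$$

We finally extend (see footnote 5) $\sigma_0 : \operatorname{Ran}T\to\mathbb{K}$ to a linear map $\sigma : \mathbb{K}^n\to\mathbb{K}$.

Having proven Claim 2, we choose scalars $\alpha_1,\dots,\alpha_n\in\mathbb{K}$ such that

$$\sigma(\lambda_1,\dots,\lambda_n)=\alpha_1\lambda_1+\dots+\alpha_n\lambda_n,\quad\forall\,(\lambda_1,\dots,\lambda_n)\in\mathbb{K}^n.$$

We now have

$$\omega(\phi)=\sigma(T\phi)=\sigma\bigl(\phi(x_1),\dots,\phi(x_n)\bigr)=\alpha_1\phi(x_1)+\dots+\alpha_n\phi(x_n),\quad\forall\,\phi\in X^*,$$

so if we define $x=\alpha_1x_1+\dots+\alpha_nx_n$, we clearly have

$$\omega(\phi)=\phi(x),\quad\forall\,\phi\in X^*.$$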
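The last step of the proof, passing from $\omega=\sigma\circ T$ to evaluation at $x=\alpha_1x_1+\dots+\alpha_nx_n$, can be checked in a finite dimensional toy case. The sketch below assumes $X=\mathbb{R}^3$, with functionals stored as coefficient triples. The helper names (`apply`, `T`, `sigma`, `combine`) and the sample vectors are hypothetical, chosen only for illustration.

```python
def apply(phi, x):
    """Evaluate the functional phi (a coefficient tuple) at the vector x."""
    return sum(c * t for c, t in zip(phi, x))

def T(phi, points):
    """The map T from the proof: phi -> (phi(x_1), ..., phi(x_n))."""
    return tuple(apply(phi, x) for x in points)

def sigma(lams, alphas):
    """A linear map K^n -> K with coefficients alpha_1, ..., alpha_n."""
    return sum(a * l for a, l in zip(alphas, lams))

def combine(alphas, points):
    """The vector x = alpha_1 x_1 + ... + alpha_n x_n."""
    dim = len(points[0])
    return tuple(sum(a * p[i] for a, p in zip(alphas, points)) for i in range(dim))

points = [(1.0, 0.0, 2.0), (0.0, 1.0, -1.0)]   # hypothetical x_1, x_2
alphas = (3.0, -2.0)                            # hypothetical alpha_1, alpha_2
x = combine(alphas, points)

phi = (0.5, 4.0, -1.0)                          # a sample functional on R^3
omega_phi = sigma(T(phi, points), alphas)       # omega(phi) = sigma(T phi)
evaluation = apply(phi, x)                      # phi(x)
```

In this example both `omega_phi` and `evaluation` compute the same number, which is exactly the identity $\sigma(T\phi)=\phi(x)$ used at the end of the proof.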

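Corollary 4.2. Let $X$ be a normed vector space, let $C\subset X^*$ be a convex set, and let $\phi\in X^*\smallsetminus\overline{C}^{\,w^*}$. (Here $\overline{C}^{\,w^*}$ denotes the $w^*$-closure of $C$.) Then there exists an element $x\in X$, and a real number $\alpha$, such that

$$\operatorname{Re}\phi(x)<\alpha\le\operatorname{Re}\psi(x),\quad\forall\,\psi\in C.$$

Proof. Since the $w^*$ topology on $X^*$ is locally convex, there exists a convex $w^*$-open set $A\subset X^*$ such that $\phi\in A\subset X^*\smallsetminus\overline{C}^{\,w^*}$. In particular, we have $A\cap C=\emptyset$. Apply the Hahn-Banach separation theorem to find a linear map $\omega : X^*\to\mathbb{K}$, which is $w^*$-continuous, and a real number $\alpha$, such that

$$\operatorname{Re}\omega(\rho)<\alpha\le\operatorname{Re}\omega(\psi),\quad\forall\,\rho\in A,\ \psi\in C.$$

We then apply the above Proposition 4.3 to write $\omega(\psi)=\psi(x)$ for some $x\in X$, and take $\rho=\phi$.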

Comments. The definition of the $w^*$ topology can be used in a more general setting, when $X$ is just a topological vector space. The above results are still valid in this general setting.

In general the unit ball

$$(X^*)_1=\{\phi\in X^* : \|\phi\|\le1\},$$

although bounded and closed, is not compact in the norm topology. However, when the $w^*$ topology is used, we have

Theorem 4.1 (Alaoglu). If $X$ is a normed vector space, then the unit ball $(X^*)_1$, in the topological dual space, is compact in the $w^*$ topology.

Proof. Let us consider the unit ball in $\mathbb{K}$:

$$B=\{\lambda\in\mathbb{K} : |\lambda|\le1\}.$$

Let us also consider the unit ball in $X$:

$$(X)_1=\{x\in X : \|x\|\le1\}.$$

Footnote 5. One can invoke the Hahn-Banach Theorem here. In fact this is not necessary, since $\operatorname{Ran}T\subset\mathbb{K}^n$ are finite dimensional vector spaces.


Define the product space

$$P=\prod_{x\in(X)_1}B,$$

identified equivalently as the space of maps $(X)_1\to B$. By Tihonov's Theorem, when we equip $P$ with the product topology, it will become a compact topological space. We denote by $\pi_x : P\to B$, $x\in(X)_1$, the projection onto the factor with label $x$. By definition of the product topology, $\pi_x$ is continuous.

For any $x,y\in(X)_1$ define the map $\Delta_{x,y} : P\to\mathbb{K}$ by

$$\Delta_{x,y}(f)=\frac{f(x)+f(y)}{2}-f\Bigl(\frac{x+y}{2}\Bigr),\quad\forall\,f\in P.$$

Note that

$$\Delta_{x,y}=\tfrac12(\pi_x+\pi_y)-\pi_{(x+y)/2},$$

so $\Delta_{x,y} : P\to\mathbb{K}$ is obviously continuous. In particular, the set

$$A_{x,y}=\Delta_{x,y}^{-1}(\{0\})=\Bigl\{f\in P : \frac{f(x)+f(y)}{2}=f\Bigl(\frac{x+y}{2}\Bigr)\Bigr\}$$

is closed in $P$, for every $x,y\in(X)_1$.

Similarly, for every $x\in(X)_1$ and every $\lambda\in B$, we define the map $\Sigma_{\lambda,x} : P\to\mathbb{K}$ by

$$\Sigma_{\lambda,x}(f)=f(\lambda x)-\lambda f(x),\quad\forall\,f\in P.$$

Then $\Sigma_{\lambda,x}=\pi_{\lambda x}-\lambda\pi_x$ is continuous, so the set

$$B_{\lambda,x}=\Sigma_{\lambda,x}^{-1}(\{0\})=\{f\in P : f(\lambda x)=\lambda f(x)\}$$

is closed in $P$, for every $\lambda\in B$, $x\in(X)_1$.

Define the set

$$L=\Bigl(\bigcap_{x,y\in(X)_1}A_{x,y}\Bigr)\cap\Bigl(\bigcap_{\lambda\in B,\ x\in(X)_1}B_{\lambda,x}\Bigr).$$

Since $L$ is an intersection of closed sets, it follows that $L$ itself is closed. In particular, $L$ is compact. By construction, we have

$$L=\Bigl\{f : (X)_1\to B\ \Big|\ \tfrac12[f(x)+f(y)]=f\bigl(\tfrac12[x+y]\bigr)\ \text{and}\ f(\lambda x)=\lambda f(x),\ \forall\,x,y\in(X)_1,\ \lambda\in B\Bigr\}.$$

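For any $f\in L$, we define the map $\psi_f : X\to\mathbb{K}$ by

$$\psi_f(x)=\begin{cases}0&\text{if }x=0\\[2pt]\|x\|\cdot f\bigl(\frac{x}{\|x\|}\bigr)&\text{if }x\ne0\end{cases}$$

Claim 1: For any $f\in L$, the map $\psi_f : X\to\mathbb{K}$ is linear, and satisfies $\psi_f\big|_{(X)_1}=f$.

Fix $f\in L$. Start with some $x\in X$ and some $\lambda\in\mathbb{K}$. We have $\|\lambda x\|=|\lambda|\cdot\|x\|$, so we get

$$\psi_f(\lambda x)=\begin{cases}0&\text{if either }x=0\text{ or }\lambda=0\\[2pt]|\lambda|\cdot\|x\|\cdot f\bigl(\frac{\lambda}{|\lambda|}\cdot\frac{x}{\|x\|}\bigr)&\text{if }\lambda\ne0\text{ and }x\ne0\end{cases}$$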


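If $\lambda\ne0$ and $x\ne0$, we put

$$\mu=\frac{\lambda}{|\lambda|}\quad\text{and}\quad y=\frac{x}{\|x\|},$$

and the fact that $\mu\in B$, $y\in(X)_1$, and $f\in B_{\mu,y}$, will give

$$f\Bigl(\frac{\lambda}{|\lambda|}\cdot\frac{x}{\|x\|}\Bigr)=f(\mu y)=\mu f(y)=\frac{\lambda}{|\lambda|}\cdot f\Bigl(\frac{x}{\|x\|}\Bigr)=\frac{\lambda}{|\lambda|\cdot\|x\|}\,\psi_f(x),$$

so in this case we get

$$\psi_f(\lambda x)=|\lambda|\cdot\|x\|\cdot f\Bigl(\frac{\lambda}{|\lambda|}\cdot\frac{x}{\|x\|}\Bigr)=|\lambda|\cdot\|x\|\cdot\frac{\lambda}{|\lambda|\cdot\|x\|}\,\psi_f(x)=\lambda\psi_f(x).$$

In the case when either $\lambda=0$ or $x=0$, we also get the equality

$$\psi_f(\lambda x)=0=\lambda\psi_f(x).$$

This way we have proven the homogeneity of $\psi_f$:

$$\psi_f(\lambda x)=\lambda\psi_f(x),\quad\forall\,\lambda\in\mathbb{K},\ x\in X.\tag{2}$$

Let us prove now that $\psi_f\big|_{(X)_1}=f$. Fix $x\in(X)_1$. If $x=0$, then using the property

$$f(\mu y)=\mu f(y),\quad\forall\,\mu\in B,\ y\in(X)_1\tag{3}$$

with $\mu=0$ and $y=0$, we immediately get $f(x)=0=\psi_f(x)$. If $x\ne0$, we use (3) with $\mu=\|x\|$ and $y=\frac{x}{\|x\|}$, and we again get

$$f(x)=f(\|x\|\cdot y)=\|x\|\cdot f(y)=\|x\|\cdot f\Bigl(\frac{x}{\|x\|}\Bigr)=\psi_f(x).$$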

We now prove that $\psi_f$ is additive. Start with two elements $x,y\in X$. Define

$$v=\frac{x}{\|x\|+\|y\|+1}\quad\text{and}\quad w=\frac{y}{\|x\|+\|y\|+1},$$

so that we obviously have $v,w\in(X)_1$ and

$$x=(\|x\|+\|y\|+1)\cdot v\quad\text{and}\quad y=(\|x\|+\|y\|+1)\cdot w.$$

By homogeneity, and since $\psi_f=f$ on $(X)_1$, we have

$$\psi_f(x+y)=\psi_f\bigl(2(\|x\|+\|y\|+1)\cdot\tfrac12[v+w]\bigr)=2(\|x\|+\|y\|+1)\cdot f\bigl(\tfrac12[v+w]\bigr).$$

Using the fact that $f\in A_{v,w}$, the above computation can be continued to give:

$$\begin{aligned}\psi_f(x+y)&=2(\|x\|+\|y\|+1)\cdot f\bigl(\tfrac12[v+w]\bigr)\\&=2(\|x\|+\|y\|+1)\cdot\tfrac12[f(v)+f(w)]\\&=(\|x\|+\|y\|+1)\cdot f(v)+(\|x\|+\|y\|+1)\cdot f(w).\end{aligned}$$

Using the fact that $\psi_f\big|_{(X)_1}=f$, the above equality gives

$$\psi_f(x+y)=(\|x\|+\|y\|+1)\cdot\psi_f(v)+(\|x\|+\|y\|+1)\cdot\psi_f(w).$$

Finally, using the homogeneity property (2) we get

$$\psi_f(x+y)=\psi_f\bigl((\|x\|+\|y\|+1)\cdot v\bigr)+\psi_f\bigl((\|x\|+\|y\|+1)\cdot w\bigr)=\psi_f(x)+\psi_f(y).$$
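The construction of $\psi_f$ from $f$ can be tried out numerically. The following is a minimal sketch, assuming $X=\mathbb{R}^2$ with the Euclidean norm and taking for $f$ the restriction to the unit ball of a fixed linear functional of norm 1. The function names and the sample vectors are hypothetical and serve only as an illustration of the formula $\psi_f(x)=\|x\|\cdot f(x/\|x\|)$.

```python
import math

def norm(x):
    """Euclidean norm on R^2 (an assumption made only for this illustration)."""
    return math.hypot(x[0], x[1])

def f(x):
    """Hypothetical element of L: the restriction to the unit ball of the
    linear functional phi(x) = 0.6*x[0] - 0.8*x[1], which has norm 1."""
    return 0.6 * x[0] - 0.8 * x[1]

def psi(f, x):
    """The extension psi_f(x) = ||x|| * f(x / ||x||), with psi_f(0) = 0."""
    r = norm(x)
    if r == 0.0:
        return 0.0
    return r * f((x[0] / r, x[1] / r))

x, y, lam = (3.0, 4.0), (-1.0, 2.5), -2.0

# Homogeneity (2): psi_f(lam * x) should agree with lam * psi_f(x).
homogeneity_gap = abs(psi(f, (lam * x[0], lam * x[1])) - lam * psi(f, x))

# Additivity: psi_f(x + y) should agree with psi_f(x) + psi_f(y).
additivity_gap = abs(psi(f, (x[0] + y[0], x[1] + y[1])) - psi(f, x) - psi(f, y))
```

Both gaps vanish up to floating point rounding here because the sample $f$ comes from a linear functional, which is the situation described in Claim 3 of the proof.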

Having proven the Claim, let us now observe that, for $f\in L$, the fact that

$$\psi_f(x)=f(x)\in B,\quad\forall\,x\in(X)_1,$$

shows that $\psi_f$ is continuous, and $\|\psi_f\|\le1$. Therefore we have a correctly defined map

$$\Psi : L\ni f\longmapsto\psi_f\in(X^*)_1.$$

Claim 2: When $(X^*)_1$ is equipped with the $w^*$ topology, the map $\Psi$ is continuous.

By the definition of the $w^*$ topology, we need to prove that $\epsilon_x\circ\Psi : L\to\mathbb{K}$ is continuous, for every $x\in X$. If $x=0$, the composition $\epsilon_x\circ\Psi$ is the constant map 0, so there is nothing to prove. If $x\ne0$, we define

$$y=\frac{x}{\|x\|}\in(X)_1,$$

and using Claim 1, we have

$$(\epsilon_x\circ\Psi)(f)=\epsilon_x(\psi_f)=\psi_f(x)=\|x\|\cdot\psi_f\Bigl(\frac{x}{\|x\|}\Bigr)=\|x\|\cdot\psi_f(y)=\|x\|\cdot f(y),\quad\forall\,f\in L.$$

This proves that

$$\epsilon_x\circ\Psi=\|x\|\cdot\pi_y\big|_L,$$

and since $\pi_y : P\to B$ is continuous, the continuity of $\epsilon_x\circ\Psi$ follows.

Since $L$ is compact, and the continuous image of a compact space is compact, in order to finish the proof of the Theorem it then suffices to prove

Claim 3: The map $\Psi : L\to(X^*)_1$ is surjective.

Start with an arbitrary $\phi\in(X^*)_1$, which means that $\phi : X\to\mathbb{K}$ is linear, continuous, and

$$|\phi(x)|\le1,\quad\forall\,x\in(X)_1.$$

In particular, if we define $f=\phi\big|_{(X)_1}$, then

$$f(x)\in B,\quad\forall\,x\in(X)_1,$$

which means that $f\in P$. Using the fact that $\phi$ is linear, it is obvious that $f\in L$. Using Claim 1, we have

$$\psi_f(x)=f(x)=\phi(x),\quad\forall\,x\in(X)_1.$$

Now, since $\psi_f\big|_{(X)_1}=\phi\big|_{(X)_1}$, and both $\psi_f$ and $\phi$ are linear, we immediately get $\psi_f=\phi$.

Remarks 4.2. Using the notations from the above proof, the continuous map $\Psi : L\to(X^*)_1$ is in fact bijective. The only thing we need to prove is the injectivity. Suppose $\psi_f=\psi_g$, for some $f,g\in L$. Then

$$f=\psi_f\big|_{(X)_1}=\psi_g\big|_{(X)_1}=g.$$

Since $\Psi : L\to(X^*)_1$ is bijective, continuous, and the spaces $L$ and $(X^*)_1$ are compact Hausdorff, it follows that $\Psi$ is in fact a homeomorphism. The inverse map $\Psi^{-1} : (X^*)_1\to L$ is simply defined by

$$\Psi^{-1}(\phi)=\phi\big|_{(X)_1},\quad\forall\,\phi\in(X^*)_1.$$

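Proposition 4.4. Suppose $X$ is a normed vector space which is separable in the norm topology. When equipped with the $w^*$ topology, the compact space $(X^*)_1$ is metrizable.

Proof. Fix a countable dense subset $M\subset X$, and define $(M)_1=(X)_1\cap M$. Notice that $(M)_1$ is dense in $(X)_1$. Indeed, if we start with some $x\in(X)_1$ and some $\varepsilon>0$ (which we may assume to be less than 2), then we set $x_\varepsilon=\bigl(1-\frac{\varepsilon}{2}\bigr)x$, and we choose $y\in M$ such that $\|x_\varepsilon-y\|<\frac{\varepsilon}{2}$.

On the one hand, we have

$$\|y\|\le\|x_\varepsilon-y\|+\|x_\varepsilon\|<\frac{\varepsilon}{2}+\Bigl(1-\frac{\varepsilon}{2}\Bigr)\cdot\|x\|\le\frac{\varepsilon}{2}+1-\frac{\varepsilon}{2}=1,$$

so $y\in(M)_1$. On the other hand, we have

$$\|y-x\|\le\|y-x_\varepsilon\|+\|x-x_\varepsilon\|<\frac{\varepsilon}{2}+\Bigl\|\frac{\varepsilon}{2}x\Bigr\|\le\frac{\varepsilon}{2}\cdot\bigl(1+\|x\|\bigr)\le\varepsilon.$$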

Let us use the notations from the proof of Theorem 4.1. Let us then define the product space

$$\prod_{x\in(M)_1}B,$$

equipped with the product topology. Define also the map

$$\Upsilon : \prod_{x\in(X)_1}B\ni f\longmapsto f\big|_{(M)_1}\in\prod_{x\in(M)_1}B.$$

It is obvious that $\Upsilon$ is continuous. Let

$$\kappa : (X^*)_1\ni\phi\longmapsto\phi\big|_{(X)_1}\in\prod_{x\in(X)_1}B.$$

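We know that $\kappa$ is continuous and injective (being the inverse of $\Psi : L\to(X^*)_1$).

Claim: The composition $\Upsilon\circ\kappa : (X^*)_1\to\prod_{x\in(M)_1}B$ is injective.

Indeed, if $\phi,\psi\in(X^*)_1$ satisfy $(\Upsilon\circ\kappa)(\phi)=(\Upsilon\circ\kappa)(\psi)$, then we get $\phi\big|_{(M)_1}=\psi\big|_{(M)_1}$. Since $(M)_1$ is dense in $(X)_1$ and both $\phi$ and $\psi$ are continuous, this will force $\phi\big|_{(X)_1}=\psi\big|_{(X)_1}$, which finally forces $\phi=\psi$.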

Using the above Claim, we see that if we define $Q=(\Upsilon\circ\kappa)\bigl((X^*)_1\bigr)$, then $Q\subset\prod_{x\in(M)_1}B$ is compact, and $\Upsilon\circ\kappa : (X^*)_1\to Q$ is a homeomorphism. Notice that $\prod_{x\in(M)_1}B$ is a countable product of metric spaces, so it is metrizable. Therefore $Q$ is also metrizable, and so will be $(X^*)_1$.

Remark 4.3. Assume $X$ is separable, and $M\subset X$ is a countable dense subset. If we enumerate the countable set $(M)_1$ as

$$(M)_1=\{y_n : n\ge1\},$$

then a metric $d$ that defines the $w^*$ topology on $(X^*)_1$ can be constructed as

$$d(\phi,\psi)=\sum_{n=1}^{\infty}\frac{|\phi(y_n)-\psi(y_n)|}{2^n},\quad\forall\,\phi,\psi\in(X^*)_1.$$
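To get a feeling for the metric in Remark 4.3, one can compute a truncation of the series in a small case. The sketch below assumes $X=\mathbb{R}^2$ with the Euclidean norm, functionals stored as pairs of coefficients, and $(M)_1$ taken to be the rational points of the unit ball. The enumeration order, the truncation after `n_terms` terms, and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import islice

def enumerate_M1():
    """Yield rational points of the Euclidean unit ball of R^2, one common
    denominator at a time. Points may repeat; this is harmless here."""
    q = 1
    while True:
        for i in range(-q, q + 1):
            for j in range(-q, q + 1):
                if i * i + j * j <= q * q:
                    yield (Fraction(i, q), Fraction(j, q))
        q += 1

def evaluate(phi, y):
    """Apply the functional phi = (a, b) to the point y = (y0, y1)."""
    return phi[0] * y[0] + phi[1] * y[1]

def d_truncated(phi, psi, n_terms=200):
    """Partial sum of d(phi, psi) = sum_n |phi(y_n) - psi(y_n)| / 2^n."""
    total = Fraction(0)
    for n, y in enumerate(islice(enumerate_M1(), n_terms), start=1):
        total += abs(evaluate(phi, y) - evaluate(psi, y)) / 2 ** n
    return total

phi = (Fraction(1, 2), Fraction(0))
psi = (Fraction(0), Fraction(1, 3))
chi = (Fraction(-1, 4), Fraction(1, 4))
```

Exact rational arithmetic is used so that symmetry and the triangle inequality can be compared without rounding issues. The truncated sum is only an approximation of $d$, with a tail bounded by $2\cdot2^{-n}$ on $(X^*)_1$.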

Comments. Let $X$ be a normed vector space. One can extend the map $\kappa$ to a map

$$\kappa : X^*\ni\phi\longmapsto\phi\big|_{(X)_1}\in\prod_{x\in(X)_1}\mathbb{K}.$$

This map will still be injective and continuous, and one can show that

$$\kappa : X^*\to\kappa(X^*)$$


is a homeomorphism, when $\kappa(X^*)$ is equipped with the induced topology from the product space $\prod_{x\in(X)_1}\mathbb{K}$. In general however, the set $\kappa(X^*)$ is not closed in the product space $\prod_{x\in(X)_1}\mathbb{K}$.

If $X$ is separable, and if one takes a countable dense set $M\subset X$, then as before, one still has a continuous map

$$\Upsilon : \prod_{x\in(X)_1}\mathbb{K}\ni f\longmapsto f\big|_{(M)_1}\in\prod_{x\in(M)_1}\mathbb{K},$$

and the composition

$$\Upsilon\circ\kappa : X^*\to\prod_{x\in(M)_1}\mathbb{K}$$

will still be continuous and injective. In general however, it turns out that the map

$$\Upsilon\circ\kappa : X^*\to(\Upsilon\circ\kappa)(X^*)$$

is not a homeomorphism. The exercise below explains exactly when this is the case.

Exercise 1*. Let $X$ be a normed vector space which is of uncountable dimension (for example, an infinite dimensional Banach space). Prove that the topological space $(X^*,w^*)$ is not metrizable.

Hint: Assume $(X^*,w^*)$ is metrizable. Let $d$ be a metric which gives the $w^*$-topology. Then $0\in X^*$ will have a countable basic system of neighborhoods. In particular, there exist sequences $(x_n)_{n\ge1}\subset X$ and $(\varepsilon_n)_{n\ge1}\subset(0,\infty)$, such that the sets

$$B_n=\bigcap_{k=1}^nW(0;\varepsilon_n,x_k)$$

satisfy $B_n\subset\mathcal{B}_{1/n}(0)$, $\forall\,n\ge1$, where $\mathcal{B}_{1/n}(0)$ denotes the $d$-open ball of center 0 and radius $1/n$. Consider the set $M=\{x_n : n\in\mathbb{N}\}$. We know that $\operatorname{Span}M\subsetneq X$. Choose some vector $y\in X\smallsetminus\operatorname{Span}M$. For every $n\ge1$, choose a linear map $\psi_n : \operatorname{Span}\{y,x_1,\dots,x_n\}\to\mathbb{K}$, such that $\psi_n(y)=1$, and $\psi_n(x_k)=0$, $\forall\,k\in\{1,\dots,n\}$. Extend (use Hahn-Banach) $\psi_n$ to a linear continuous map $\phi_n : X\to\mathbb{K}$. Notice now that $\phi_n\in B_n$, for all $n\ge1$, which would then force $d\text{-}\lim_{n\to\infty}\phi_n=0$. In particular, this would force $\lim_{n\to\infty}\phi_n(x)=0$, $\forall\,x\in X$. But this is impossible, since $\phi_n(y)=1$, $\forall\,n\ge1$.

Comment. If $X$ is a normed vector space of countable dimension, then $(X^*,w^*)$ is metrizable. Indeed, if we take a linear basis $\{b_n : n\in\mathbb{N}\}$ for $X$, then the $w^*$ topology on $X^*$ is clearly defined by the metric

$$d(\phi,\psi)=\sum_{n=1}^{\infty}\frac{1}{2^n}\cdot\frac{|\phi(b_n)-\psi(b_n)|}{1+|\phi(b_n)-\psi(b_n)|},\quad\phi,\psi\in X^*.$$

Lectures 14-15

5. Banach spaces of continuous functions

In this section we discuss examples of Banach spaces coming from topology.

Notation. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $\Omega$ be a topological space. We define

$$C_b^{\mathbb{K}}(\Omega)=\{f : \Omega\to\mathbb{K} : f\text{ bounded and continuous}\}.$$

In the case when $\mathbb{K}=\mathbb{C}$ we use the notation $C_b(\Omega)$.

Proposition 5.1. With the notations above, if we define

$$\|f\|=\sup_{p\in\Omega}|f(p)|,\quad\forall\,f\in C_b^{\mathbb{K}}(\Omega),$$

then $C_b^{\mathbb{K}}(\Omega)$ is a Banach space.

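Proof. It is obvious that $C_b^{\mathbb{K}}(\Omega)$ is a linear subspace of $\ell^\infty_{\mathbb{K}}(\Omega)$, and the norm is precisely the one coming from $\ell^\infty_{\mathbb{K}}(\Omega)$. Therefore, it suffices to prove that $C_b^{\mathbb{K}}(\Omega)$ is closed in $\ell^\infty_{\mathbb{K}}(\Omega)$.

Start with some sequence $(f_n)_{n\ge1}\subset C_b^{\mathbb{K}}(\Omega)$, which converges in norm to some $f\in\ell^\infty_{\mathbb{K}}(\Omega)$, and let us prove that $f : \Omega\to\mathbb{K}$ is continuous (the fact that $f$ is bounded is automatic).

Fix some point $p_0\in\Omega$, and some $\varepsilon>0$. We need to find some neighborhood $V$ of $p_0$, such that

$$|f(p)-f(p_0)|<\varepsilon,\quad\forall\,p\in V.$$

Start by choosing $n$ such that $\|f_n-f\|<\frac{\varepsilon}{3}$. Use the fact that $f_n$ is continuous, to find a neighborhood $V$ of $p_0$, such that

$$|f_n(p)-f_n(p_0)|<\frac{\varepsilon}{3},\quad\forall\,p\in V.$$

Suppose now $p\in V$. We have

$$\begin{aligned}|f(p)-f(p_0)|&\le|f_n(p)-f(p)|+|f_n(p)-f_n(p_0)|+|f_n(p_0)-f(p_0)|\\&\le|f_n(p)-f_n(p_0)|+2\Bigl[\sup_{q\in\Omega}|f_n(q)-f(q)|\Bigr]<\frac{\varepsilon}{3}+2\cdot\frac{\varepsilon}{3}=\varepsilon.\end{aligned}$$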

A first application of Banach space techniques is the following:

Lemma 5.1 (Urysohn type density). Let $\Omega$ be a topological space, let $\mathcal{C}\subset C_b^{\mathbb{R}}(\Omega)$ be a linear subspace which contains the constant function 1. Assume

(u) for any two closed sets $A,B\subset\Omega$, with $A\cap B=\emptyset$, there exists a function $h\in\mathcal{C}$, such that $h\big|_A=0$, $h\big|_B=1$, and $h(p)\in[0,1]$, for all $p\in\Omega$.

Then $\mathcal{C}$ is dense in $C_b^{\mathbb{R}}(\Omega)$, in the norm topology.


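Proof. The key step in the proof will be the following:

Claim: For any $f\in C_b^{\mathbb{R}}(\Omega)$, there exists $g\in\mathcal{C}$, such that

$$\|g-f\|\le\tfrac23\|f\|.$$

To prove this claim we define

$$\alpha=\inf_{p\in\Omega}f(p)\quad\text{and}\quad\beta=\sup_{p\in\Omega}f(p),$$

so that $f(\Omega)\subset[\alpha,\beta]$, and $\|f\|=\max\{|\alpha|,|\beta|\}$. If $\alpha=\beta$, then $f$ is constant, so $f\in\mathcal{C}$ and we can take $g=f$. We may therefore assume $\alpha<\beta$. Define the sets

$$A=f^{-1}\Bigl(\Bigl[\alpha,\frac{2\alpha+\beta}{3}\Bigr]\Bigr)\quad\text{and}\quad B=f^{-1}\Bigl(\Bigl[\frac{\alpha+2\beta}{3},\beta\Bigr]\Bigr),$$

so that both $A$ and $B$ are closed, and $A\cap B=\emptyset$. Use the hypothesis to find a function $h\in\mathcal{C}$, such that $h\big|_A=0$, $h\big|_B=1$, and $h(p)\in[0,1]$, for all $p\in\Omega$. Define the function $g\in\mathcal{C}$ by

$$g=\tfrac13\bigl[\alpha1+(\beta-\alpha)h\bigr].$$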

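Let us examine the difference $g-f$. Start with some arbitrary point $p\in\Omega$. There are three cases to examine:

Case I: $p\in A$. In this case we have $h(p)=0$, so we get $g(p)=\frac{\alpha}{3}$. By the construction of $A$ we also have $\alpha\le f(p)\le\frac{2\alpha+\beta}{3}$, so we get

$$\frac{2\alpha}{3}\le f(p)-g(p)\le\frac{\alpha+\beta}{3}.$$

Case II: $p\in B$. In this case we have $h(p)=1$, so we get $g(p)=\frac{\beta}{3}$. We also have $\frac{2\beta+\alpha}{3}\le f(p)\le\beta$, so we get

$$\frac{\alpha+\beta}{3}\le f(p)-g(p)\le\frac{2\beta}{3}.$$

Case III: $p\in\Omega\smallsetminus(A\cup B)$. In this case we have $0\le h(p)\le1$, so we get $\frac{\alpha}{3}\le g(p)\le\frac{\beta}{3}$, and $\frac{2\alpha+\beta}{3}<f(p)<\frac{\alpha+2\beta}{3}$. In particular we get

$$f(p)-g(p)>\frac{2\alpha+\beta}{3}-\frac{\beta}{3}=\frac{2\alpha}{3};$$

$$f(p)-g(p)<\frac{\alpha+2\beta}{3}-\frac{\alpha}{3}=\frac{2\beta}{3}.$$

Since $\frac{2\alpha}{3}\le\frac{\alpha+\beta}{3}\le\frac{2\beta}{3}$, we see that in all three cases we have

$$\frac{2\alpha}{3}\le f(p)-g(p)\le\frac{2\beta}{3},$$

so we get

$$\frac{2\alpha}{3}\le\inf_{p\in\Omega}\bigl[f(p)-g(p)\bigr]\le\sup_{p\in\Omega}\bigl[f(p)-g(p)\bigr]\le\frac{2\beta}{3},$$

so we indeed get the desired inequality

$$\|g-f\|\le\tfrac23\|f\|.$$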


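Having proven the Claim, we now prove the density of $\mathcal{C}$ in $C_b^{\mathbb{R}}(\Omega)$. Start with some $f\in C_b^{\mathbb{R}}(\Omega)$, and we construct recursively two sequences $(g_n)_{n\ge1}\subset\mathcal{C}$ and $(f_n)_{n\ge1}\subset C_b^{\mathbb{R}}(\Omega)$, as follows. Set $f_1=f$. Apply the Claim to find $g_1\in\mathcal{C}$ such that

$$\|g_1-f_1\|\le\tfrac23\|f_1\|.$$

Once $f_1,f_2,\dots,f_n$ and $g_1,g_2,\dots,g_n$ have been constructed, we set

$$f_{n+1}=f_n-g_n,$$

and we choose $g_{n+1}\in\mathcal{C}$ such that

$$\|g_{n+1}-f_{n+1}\|\le\tfrac23\|f_{n+1}\|.$$

It is clear, by construction, that $\|f_{n+1}\|=\|g_n-f_n\|\le\frac23\|f_n\|$, hence

$$\|f_n\|\le\Bigl(\frac23\Bigr)^{n-1}\|f\|,\quad\forall\,n\ge1.$$

Consider the sequence $(s_n)_{n\ge1}\subset\mathcal{C}$ of partial sums, defined by

$$s_n=g_1+g_2+\dots+g_n,\quad\forall\,n\ge1.$$

Using the equalities

$$g_n=f_n-f_{n+1},\quad\forall\,n\ge1,$$

we get

$$s_n-f=g_1+g_2+\dots+g_n-f_1=-f_{n+1},$$

so we have

$$\|s_n-f\|\le\Bigl(\frac23\Bigr)^n\|f\|,\quad\forall\,n\ge1,$$

which clearly gives $f=\lim_{n\to\infty}s_n$, so $f$ indeed belongs to the closure $\overline{\mathcal{C}}$.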
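The recursive scheme of the proof can be simulated on a finite sample of points. The sketch below is only an illustration under stated assumptions: $\Omega$ is replaced by a finite grid in $[0,1]$, functions are lists of sample values, and the separating function $h$ is built directly from $f$ by a piecewise linear formula (in the Lemma, $h$ is supplied by hypothesis (u)). The function names and the test function are hypothetical.

```python
import math

def sup_norm(values):
    """Sup norm of a function given by its values on a finite sample set."""
    return max(abs(v) for v in values)

def one_step(f):
    """Return g with sup|g - f| <= (2/3) sup|f|, following the Claim."""
    alpha, beta = min(f), max(f)
    if alpha == beta:
        return list(f)                    # constant f: take g = f
    lo = (2 * alpha + beta) / 3           # A = {f <= lo}
    hi = (alpha + 2 * beta) / 3           # B = {f >= hi}
    # Assumed separating function: h = 0 on A, h = 1 on B, values in [0, 1].
    h = [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in f]
    return [(alpha + (beta - alpha) * hv) / 3 for hv in h]

def approximate(f, steps):
    """Build s_n = g_1 + ... + g_n with f_1 = f and f_{n+1} = f_n - g_n."""
    f_n = list(f)
    s = [0.0] * len(f)
    errors = []
    for _ in range(steps):
        g = one_step(f_n)
        s = [a + b for a, b in zip(s, g)]
        f_n = [a - b for a, b in zip(f_n, g)]
        errors.append(sup_norm([a - b for a, b in zip(s, f)]))
    return s, errors

grid = [k / 200 for k in range(201)]
f = [math.sin(7 * t) + 0.5 * t for t in grid]
s, errors = approximate(f, 20)
```

The list `errors` records $\|s_n-f\|$ for $n=1,\dots,20$, and these values can be compared with the bound $(2/3)^n\|f\|$ from the proof.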

We are now in position to prove the following

Theorem 5.1 (Tietze Extension Theorem). Let $\Omega$ be a normal topological space, let $T\subset\Omega$ be a closed subset. Let $f : T\to[0,1]$ be a continuous function. (Here $T$ is equipped with the induced topology.) Then there exists a continuous function $g : \Omega\to[0,1]$ such that $g\big|_T=f$.

Proof. Let us introduce the Banach space setting that will make the proof clearer. We consider the Banach spaces $C_b^{\mathbb{R}}(\Omega)$ and $C_b^{\mathbb{R}}(T)$. To avoid any confusion, the norms on these Banach spaces will be denoted by $\|\cdot\|_\Omega$ and $\|\cdot\|_T$. If we define the restriction map

$$R : C_b^{\mathbb{R}}(\Omega)\ni g\longmapsto g\big|_T\in C_b^{\mathbb{R}}(T),$$

then $R$ is obviously linear and continuous. We define the subspace $\mathcal{C}=R\bigl(C_b^{\mathbb{R}}(\Omega)\bigr)\subset C_b^{\mathbb{R}}(T)$.

Claim: For every $f\in\mathcal{C}$, there exists some $g\in C_b^{\mathbb{R}}(\Omega)$ such that $f=Rg$, and

$$\inf_{q\in T}f(q)\le g(p)\le\sup_{q\in T}f(q),\quad\forall\,p\in\Omega.$$


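To prove this fact, we start first with some arbitrary $g_0\in C_b^{\mathbb{R}}(\Omega)$, such that $f=Rg_0=g_0\big|_T$. Put

$$\alpha=\inf_{q\in T}f(q)\quad\text{and}\quad\beta=\sup_{q\in T}f(q),$$

so that $\|f\|_T=\max\{|\alpha|,|\beta|\}$. Define the function $\theta : \mathbb{R}\to[\alpha,\beta]$ by

$$\theta(t)=\begin{cases}\alpha&\text{if }t<\alpha\\t&\text{if }\alpha\le t\le\beta\\\beta&\text{if }t>\beta\end{cases}$$

Then obviously $\theta$ is continuous, and the composition $g=\theta\circ g_0 : \Omega\to[\alpha,\beta]$ will still satisfy $g\big|_T=f$, and we will clearly have

$$\alpha\le g(p)\le\beta,\quad\forall\,p\in\Omega.$$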
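The truncation trick in the Claim is easy to try out. The sketch below assumes $\Omega=[-1,2]$, $T=[0,1]$ and $f(t)=t^2$ on $T$, with the extension $g_0(p)=p^2$ overshooting the range of $f$ outside $T$. These choices, and the function names, are hypothetical and only illustrate the composition $g=\theta\circ g_0$.

```python
def theta(t, alpha, beta):
    """The truncation map theta: R -> [alpha, beta] from the proof."""
    if t < alpha:
        return alpha
    if t > beta:
        return beta
    return t

def g0(p):
    """A hypothetical extension to Omega = [-1, 2] of f(t) = t*t on T = [0, 1]."""
    return p * p

alpha, beta = 0.0, 1.0   # inf and sup of f over T = [0, 1]

def g(p):
    """The corrected extension g = theta o g0, with values in [alpha, beta]."""
    return theta(g0(p), alpha, beta)

omega_samples = [-1.0 + 3.0 * k / 300 for k in range(301)]
t_samples = [k / 100 for k in range(101)]
```

On the samples of $T$ the function `g` agrees with $f$, while on all of $\Omega$ its values stay between $\alpha$ and $\beta$, which is what the Claim asserts.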

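Having proven the Claim, we are going to prove that $\mathcal{C}$ is closed. We do this by showing that $\mathcal{C}$ is a Banach space, in the norm $\|\cdot\|_T$. To get this, we use Remark ??: we check that a series in $\mathcal{C}$ whose norms are summable is convergent in $\mathcal{C}$. Start with some sequence $(f_n)_{n\ge1}\subset\mathcal{C}$, with $\sum_{n=1}^\infty\|f_n\|_T<\infty$. Apply the Claim, to construct a sequence $(g_n)_{n\ge1}\subset C_b^{\mathbb{R}}(\Omega)$, such that $Rg_n=f_n$, and

$$\inf_{q\in T}f_n(q)\le g_n(p)\le\sup_{q\in T}f_n(q),\quad\forall\,p\in\Omega,$$

for each $n\ge1$. Notice that this forces

$$\|g_n\|_\Omega\le\|f_n\|_T,\quad\forall\,n\ge1.$$

Define the sequences of partial sums $(h_n)_{n\ge1}\subset\mathcal{C}$ and $(s_n)_{n\ge1}\subset C_b^{\mathbb{R}}(\Omega)$, by

$$h_n=f_1+\dots+f_n\quad\text{and}\quad s_n=g_1+\dots+g_n,\quad\forall\,n\ge1.$$

Since

$$\sum_{n=1}^\infty\|g_n\|_\Omega\le\sum_{n=1}^\infty\|f_n\|_T<\infty,$$

and $C_b^{\mathbb{R}}(\Omega)$ is a Banach space, it follows that the sequence $(s_n)_{n\ge1}$ is convergent to some point $s\in C_b^{\mathbb{R}}(\Omega)$. Since $R : C_b^{\mathbb{R}}(\Omega)\to C_b^{\mathbb{R}}(T)$ is linear and continuous, we will have

$$Rs=\lim_{n\to\infty}[Rg_1+\dots+Rg_n]=\lim_{n\to\infty}[f_1+\dots+f_n]=\lim_{n\to\infty}h_n,$$

which proves that the sequence of partial sums $(h_n)_{n\ge1}\subset\mathcal{C}$ is indeed convergent to $Rs\in\mathcal{C}$.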

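Let us remark now that obviously $\mathcal{C}$ contains the constant function $1=R1$. Using Urysohn Lemma in the normal space $\Omega$ (two disjoint closed subsets of $T$ are also closed in $\Omega$, because $T$ is closed) and restricting to $T$, it is clear that $\mathcal{C}$ satisfies the condition (u) in the above lemma. Using Lemma ??, it follows that $\mathcal{C}$ is dense in $C_b^{\mathbb{R}}(T)$. Since $\mathcal{C}$ is closed, this gives $\mathcal{C}=C_b^{\mathbb{R}}(T)$, i.e. $R$ is surjective.

To finish the proof, start with some arbitrary continuous function $f : T\to[0,1]$. Use surjectivity of $R$, combined with the Claim, to find $g\in C_b^{\mathbb{R}}(\Omega)$, such that $Rg=f$, and

$$\inf_{q\in T}f(q)\le g(p)\le\sup_{q\in T}f(q),\quad\forall\,p\in\Omega.$$

This clearly forces $g$ to take values in $[0,1]$.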

Next we concentrate on the case when $\Omega$ is a compact Hausdorff space. In this case, every continuous function $F : \Omega\to\mathbb{K}$ is automatically bounded, and the Banach space $C_b^{\mathbb{K}}(\Omega)$ will be denoted simply by $C^{\mathbb{K}}(\Omega)$. (When $\mathbb{K}=\mathbb{C}$ this space will be denoted simply by $C(\Omega)$.)


Theorem 5.2 (Dini). Let $K$ be a compact Hausdorff space, let $(f_n)_{n\ge1}\subset C^{\mathbb{R}}(K)$ be a monotone sequence. Assume there is some $f\in C^{\mathbb{R}}(K)$, such that

$$\lim_{n\to\infty}f_n(p)=f(p),\quad\forall\,p\in K.$$

Then $\lim_{n\to\infty}f_n=f$, in the norm topology.

Proof. Replacing $f_n$ with $f_n-f$, we can assume that $\lim_{n\to\infty}f_n(p)=0$, $\forall\,p\in K$. Replacing (if necessary) $f_n$ with $-f_n$, we can also assume that the sequence $(f_n)_{n\ge1}$ is decreasing. In particular, each $f_n$ is non-negative.

We need to prove that $\lim_{n\to\infty}\|f_n\|=0$. Assume this is not true, so there exists some $\varepsilon>0$, such that the set

$$M=\{m\in\mathbb{N} : \|f_m\|\ge\varepsilon\}$$

is infinite. For each integer $n\ge1$, let us define the set

$$F_n=\{p\in K : f_n(p)\ge\varepsilon\}.$$

Then by the definition of $M$, we have

$$F_m\ne\emptyset,\quad\forall\,m\in M.$$

Claim: One has the inclusion $F_n\supset F_{n+1}$, $\forall\,n\ge1$.

Indeed, if $p\in F_{n+1}$, then

$$\varepsilon\le f_{n+1}(p)\le f_n(p),$$

which proves that $p\in F_n$.

Using the claim, plus the fact that the set $M$ is infinite, it follows that $F_n\ne\emptyset$, $\forall\,n\ge1$. (Indeed, if we start with some arbitrary $n$, then since $M$ is infinite, we can find $m\in M$, with $m\ge n$, and then using the Claim we have $\emptyset\ne F_m\subset F_n$.)

Since $K$ is compact, and the sets $F_1\supset F_2\supset\dots$ are closed and non-empty, by the finite intersection property, it follows that

$$\bigcap_{n=1}^\infty F_n\ne\emptyset.$$

But this leads to a contradiction, because if we pick an element $p\in\bigcap_{n=1}^\infty F_n$, then we will have $f_n(p)\ge\varepsilon$, $\forall\,n\ge1$, and then the equality $\lim_{n\to\infty}f_n(p)=0$ is impossible.

Exercise 1. Define the sequence $(P_n)_{n\ge1}$ of polynomials, by $P_1(t)=0$, and

$$P_{n+1}(t)=\tfrac12\bigl[t-P_n(t)^2\bigr]+P_n(t),\quad\forall\,n\ge1.$$

Prove that

$$\lim_{n\to\infty}\Bigl(\max_{t\in[0,1]}\bigl|P_n(t)-\sqrt t\bigr|\Bigr)=0.$$

Hint: Define the functions $f_n,f : [0,1]\to\mathbb{R}$ by $f_n(t)=P_n(t)$ and $f(t)=\sqrt t$. Prove that, for every $t\in[0,1]$, the sequence $\bigl(f_n(t)\bigr)_{n\ge1}$ is increasing, bounded, and $\lim_{n\to\infty}f_n(t)=f(t)$. Then apply Dini's Theorem.
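The uniform convergence asserted in Exercise 1 can be observed numerically before proving it. The sketch below evaluates $P_n$ through the recursion and measures the largest deviation from $\sqrt t$ on a finite grid. The grid size and the function names are hypothetical choices, and a grid maximum is of course only a lower estimate for the true maximum over $[0,1]$.

```python
import math

def P(n, t):
    """Evaluate P_n(t) via P_1 = 0 and P_{k+1} = P_k + (t - P_k**2) / 2."""
    value = 0.0
    for _ in range(n - 1):
        value += 0.5 * (t - value * value)
    return value

def max_error(n, samples=1001):
    """Largest value of |P_n(t) - sqrt(t)| over an evenly spaced grid in [0, 1]."""
    grid = [k / (samples - 1) for k in range(samples)]
    return max(abs(P(n, t) - math.sqrt(t)) for t in grid)

errs = [max_error(n) for n in (1, 10, 100)]
```

The entries of `errs` decrease as $n$ grows, in line with the monotone convergence that the hint asks for.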

Theorem 5.3 (Stone-Weierstrass). Let $K$ be a compact Hausdorff space. Let $\mathcal{A}\subset C^{\mathbb{R}}(K)$ be a unital subalgebra, i.e.
• $1\in\mathcal{A}$ (the constant function 1);
• $\mathcal{A}$ is a linear subspace;
• if $f,g\in\mathcal{A}$, then $fg\in\mathcal{A}$.

Assume $\mathcal{A}$ separates the points of $K$, i.e. for any $p,q\in K$, with $p\ne q$, there exists $f\in\mathcal{A}$ such that $f(p)\ne f(q)$.

Then $\mathcal{A}$ is dense in $C^{\mathbb{R}}(K)$, in the norm topology.

Proof. Let $\mathcal{C}$ denote the closure of $\mathcal{A}$. Remark that $\mathcal{C}$ is again a unital sub-algebra and it still separates the points.

The proof will eventually use the Urysohn density Lemma. Before we get to that point, we need several preparations.

Step 1. If $f\in\mathcal{C}$, then $|f|\in\mathcal{C}$.

To prove this fact (we may assume $f\ne0$), we define $g=f^2\in\mathcal{C}$, and we set $h=\|g\|^{-1}g$, so that $h\in\mathcal{C}$, and $h(p)\in[0,1]$, for all $p\in K$. Let $P_n(t)$, $n\ge1$, be the polynomials defined in the above exercise. The functions $h_n=P_n\circ h$, $n\ge1$, are clearly all in $\mathcal{C}$. By the above Exercise, we clearly get

$$\lim_{n\to\infty}\Bigl(\max_{p\in K}\bigl|h_n(p)-\sqrt{h(p)}\bigr|\Bigr)=0,$$

which means that $\lim_{n\to\infty}h_n=\sqrt h$, in the norm topology. In particular, since $\mathcal{C}$ is closed, $\sqrt h$ belongs to $\mathcal{C}$. Obviously we have

$$\sqrt h=\|f\|^{-1}\cdot|f|,$$

so $|f|$ indeed belongs to $\mathcal{C}$.

so |f | indeed belongs to C.Step 2: Given two functions f, g ∈ C, the continuous functions maxf, g and

minf, g both belong to C.This follows immediately from Step 1, and the equalities

maxf, g =12(f + g + |f − g|

)and minf, g =

12(f + g − |f − g|

).
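The two identities used in Step 2 can be checked pointwise. The following sketch builds $\max\{f,g\}$ and $\min\{f,g\}$ only from sums and absolute values, which are exactly the operations that Step 1 makes available in $\mathcal{C}$. The sample functions on $[0,1]$ and the helper names are hypothetical.

```python
def pointwise_max(f, g):
    """max{f, g} written as (f + g + |f - g|) / 2."""
    return lambda p: 0.5 * (f(p) + g(p) + abs(f(p) - g(p)))

def pointwise_min(f, g):
    """min{f, g} written as (f + g - |f - g|) / 2."""
    return lambda p: 0.5 * (f(p) + g(p) - abs(f(p) - g(p)))

f = lambda p: p          # sample functions on [0, 1]
g = lambda p: 1.0 - p
hi = pointwise_max(f, g)
lo = pointwise_min(f, g)
samples = [0.0, 0.25, 0.5, 0.75, 1.0]
```

At each sample point, `hi` and `lo` reproduce the larger and the smaller of the two values $f(p)$ and $g(p)$.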

Step 3: For any two points $p,q\in K$, $p\ne q$, there exists $h\in\mathcal{C}$, such that $h(p)=0$, $h(q)=1$, and $h(s)\in[0,1]$, $\forall\,s\in K$.

Use the assumption on $\mathcal{A}$, to find first a function $f\in\mathcal{A}$, such that $f(p)\ne f(q)$. Put $\alpha=f(p)$ and $\beta=f(q)$, and define

$$g=\frac{1}{\beta-\alpha}\bigl(f-\alpha1\bigr).$$

The function $g$ still belongs to $\mathcal{A}$, but now we have $g(p)=0$ and $g(q)=1$. Define the function $h=\min\{g^2,1\}$. By Step 2, $h\in\mathcal{C}$, and it clearly satisfies the required properties.

Step 4: Given a closed subset $A\subset K$, and a point $p\in K\smallsetminus A$, there exists a function $h\in\mathcal{C}$, such that $h(p)=0$, $h\big|_A=1$, and $h(q)\in[0,1]$, $\forall\,q\in K$.

For every $q\in A$, we use Step 3 to find a function $h_q\in\mathcal{C}$, such that $h_q(p)=0$, $h_q(q)=1$, and $h_q(s)\in[0,1]$, $\forall\,s\in K$, and we define the open set

$$D_q=\{s\in K : h_q(s)>0\}.$$

Using the compactness of $A$, we find points $q_1,\dots,q_n\in A$, such that

$$A\subset D_{q_1}\cup\dots\cup D_{q_n}.$$

Define the function $f=h_{q_1}+\dots+h_{q_n}\in\mathcal{C}$, so that $f(p)=0$, $f(q)>0$, for all $q\in A$, and $f(s)\ge0$, $\forall\,s\in K$. If we define

$$m=\min_{q\in A}f(q),$$

then $m>0$ (because $A$ is compact and $f>0$ on $A$), the function $g=m^{-1}f$ again belongs to $\mathcal{C}$, and it satisfies $g(p)=0$, $g(q)\ge1$, $\forall\,q\in A$, and $g(s)\ge0$, $\forall\,s\in K$. Finally, the function

$$h=\min\{g,1\}$$

will satisfy the required properties.

Step 5: Given closed sets $A,B\subset K$ with $A\cap B=\emptyset$, there exists $h\in\mathcal{C}$, such that $h\big|_A=1$, $h\big|_B=0$, and $h(q)\in[0,1]$, $\forall\,q\in K$.

Use Step 4, to find for every $p\in A$, a function $h_p\in\mathcal{C}$, such that $h_p\big|_B=1$, $h_p(p)=0$, and $h_p(s)\in[0,1]$, $\forall\,s\in K$. Put $g_p=1-h_p$, so that $g_p(p)=1$, $g_p\big|_B=0$, and $g_p(s)\in[0,1]$, $\forall\,s\in K$. We then proceed as above. For each $p\in A$ we define the open set

$$D_p=\{s\in K : g_p(s)>0\}.$$

Using the compactness of $A$, we find points $p_1,\dots,p_n\in A$, such that

$$A\subset D_{p_1}\cup\dots\cup D_{p_n}.$$

Define the function $f=g_{p_1}+\dots+g_{p_n}\in\mathcal{C}$, so that $f\big|_B=0$, $f(q)>0$, for all $q\in A$, and $f(s)\ge0$, $\forall\,s\in K$. If we define

$$m=\min_{q\in A}f(q),$$

then the function $g=m^{-1}f$ again belongs to $\mathcal{C}$, and it satisfies $g\big|_B=0$, $g(q)\ge1$, $\forall\,q\in A$, and $g(s)\ge0$, $\forall\,s\in K$. Finally, the function

$$h=\min\{g,1\}$$

will satisfy the required properties.

We now apply the Urysohn density Lemma, to conclude that $\mathcal{C}$ is dense in $C^{\mathbb{R}}(K)$. Since $\mathcal{C}$ is already closed, this forces $\mathcal{C}=C^{\mathbb{R}}(K)$, i.e. $\mathcal{A}$ is dense in $C^{\mathbb{R}}(K)$.

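Corollary 5.1 (Complex version of Stone-Weierstrass Theorem). Let $K$ be a compact Hausdorff space. Let $\mathcal{A}\subset C(K)$ be a unital subalgebra, which satisfies:
• if $f\in\mathcal{A}$, then $\bar f\in\mathcal{A}$.
Assume $\mathcal{A}$ separates the points of $K$. Then $\mathcal{A}$ is dense in $C(K)$, in the norm topology.

Proof. Consider the sub-algebra

$$\mathcal{A}_{\mathbb{R}}=\{f\in\mathcal{A} : \bar f=f\}.$$

It is clear that

$$\mathcal{A}=\mathcal{A}_{\mathbb{R}}+i\mathcal{A}_{\mathbb{R}},$$

and $\mathcal{A}_{\mathbb{R}}$ is a unital sub-algebra of $C^{\mathbb{R}}(K)$, which separates the points of $K$. Using the real version, we know that $\mathcal{A}_{\mathbb{R}}$ is dense in $C^{\mathbb{R}}(K)$. Then $\mathcal{A}$ is clearly dense in $C(K)$.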

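Example 5.1. Consider the unit disk

$$\mathbb{D}=\{\lambda\in\mathbb{C} : |\lambda|<1\},$$

and let $\overline{\mathbb{D}}$ denote its closure. Consider the algebra $\mathcal{A}\subset C(\overline{\mathbb{D}})$ consisting of all polynomial functions. Notice that, although $\mathcal{A}$ is unital and separates the points of $\overline{\mathbb{D}}$, it does not have the property

$$f\in\mathcal{A}\Rightarrow\bar f\in\mathcal{A}.$$

In fact, one way to see that this property fails is by inspecting the closure of $\mathcal{A}$ in $C(\overline{\mathbb{D}})$. This closure is denoted by $A(\mathbb{D})$ and is called the disk algebra. The main feature of $A(\mathbb{D})$ is the following:

Exercise 2*. Prove that

$$A(\mathbb{D})=\bigl\{f : \overline{\mathbb{D}}\to\mathbb{C} : f\text{ continuous, and }f\big|_{\mathbb{D}}\text{ holomorphic}\bigr\}.$$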

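We now examine the topological dual of $C(K)$.

Notations. Let $K$ be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. We define the space

$$M^{\mathbb{K}}(K)=C^{\mathbb{K}}(K)^*=\{\phi : C^{\mathbb{K}}(K)\to\mathbb{K} : \phi\ \mathbb{K}\text{-linear continuous}\}.$$

The unit ball will be denoted by $M^{\mathbb{K}}(K)_1$. When $\mathbb{K}=\mathbb{C}$, the superscript $\mathbb{C}$ will be omitted from the notation.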

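Remarks 5.1. Let $K$ be a compact Hausdorff space. The space $M(K)=C(K)^*$ carries a natural involution, defined as follows. For $\phi\in M(K)$, we define the map $\phi^\star : C(K)\to\mathbb{C}$ by

$$\phi^\star(f)=\overline{\phi(\bar f)},\quad\forall\,f\in C(K).$$

For every $\phi\in M(K)$, the map $\phi^\star : C(K)\to\mathbb{C}$ is again linear, continuous, and has

$$\|\phi^\star\|=\|\phi\|.$$

The map $\phi^\star$ will be called the adjoint of $\phi$. We used the term involution, because the map

$$M(K)\ni\phi\longmapsto\phi^\star\in M(K)$$

has the following properties:
• $(\phi^\star)^\star=\phi$, $\forall\,\phi\in M(K)$;
• $(\phi+\psi)^\star=\phi^\star+\psi^\star$, $\forall\,\phi,\psi\in M(K)$;
• $(\lambda\phi)^\star=\bar\lambda\phi^\star$, $\forall\,\phi\in M(K)$, $\lambda\in\mathbb{C}$.

If we define the space of self-adjoint maps

$$M_{sa}(K)=\{\phi\in M(K) : \phi^\star=\phi\},$$

then it is clear that, for any $\phi\in M_{sa}(K)$, the restriction $\phi\big|_{C^{\mathbb{R}}(K)}$ is real-valued. In fact, for $\phi\in M(K)$, one has

$$\phi^\star=\phi\iff\phi\big|_{C^{\mathbb{R}}(K)}\text{ is real-valued.}$$

Moreover, one has a map

$$M_{sa}(K)\ni\phi\longmapsto\phi\big|_{C^{\mathbb{R}}(K)}\in M^{\mathbb{R}}(K),\tag{1}$$

which is an isomorphism of $\mathbb{R}$-vector spaces. The inverse of this map is defined as follows. Start with some $\phi\in M^{\mathbb{R}}(K)$, i.e. $\phi : C^{\mathbb{R}}(K)\to\mathbb{R}$ is $\mathbb{R}$-linear and continuous, and we define $\tilde\phi : C(K)\to\mathbb{C}$ by

$$\tilde\phi(f)=\phi(\operatorname{Re}f)+i\phi(\operatorname{Im}f),\quad\forall\,f\in C(K).$$

It turns out that $\tilde\phi$ is again linear, continuous, and self-adjoint. Moreover, the correspondence

$$M^{\mathbb{R}}(K)\ni\phi\longmapsto\tilde\phi\in M_{sa}(K)$$

is the inverse of (1).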
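The passage from a real functional to its complex extension can be tested in the simplest possible case. The sketch below assumes $K$ is a three point space, so that functions on $K$ are lists of three values and a real functional is given by real weights. The weights, the sample function, and the helper names are hypothetical, and the adjoint is computed as the conjugate of the value at the conjugate function, which is the reading of the definition adopted above.

```python
def make_phi(weights):
    """A real functional on C^R(K) for a finite K: phi(f) = sum_k w_k f(k)."""
    return lambda f: sum(w * v for w, v in zip(weights, f))

def complexify(phi):
    """The extension phi~(f) = phi(Re f) + i * phi(Im f)."""
    return lambda f: complex(phi([z.real for z in f]), phi([z.imag for z in f]))

def adjoint(psi):
    """The adjoint psi*(f) = conjugate(psi(conjugate(f)))."""
    return lambda f: psi([z.conjugate() for z in f]).conjugate()

phi = make_phi([2.0, -1.0, 0.5])     # hypothetical real weights on K = {0, 1, 2}
phi_c = complexify(phi)
f = [1 + 2j, 3 - 1j, -4j]            # a sample complex function on K

value = phi_c(f)
times_i = phi_c([1j * z for z in f]) # should agree with 1j * value
star_value = adjoint(phi_c)(f)       # should agree with value
```

Here `times_i` illustrates complex linearity of the extension and `star_value` illustrates that the extension is self-adjoint.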


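Proposition 5.2. Let $K$ be a compact Hausdorff space. Then the map

$$M_{sa}(K)\ni\phi\longmapsto\phi\big|_{C^{\mathbb{R}}(K)}\in M^{\mathbb{R}}(K)$$

is isometric. Moreover, when the two spaces are equipped with the $w^*$ topology, this map is a homeomorphism.

Proof. To prove the first statement, fix $\phi\in M_{sa}(K)$. It is obvious that $\bigl\|\phi\big|_{C^{\mathbb{R}}(K)}\bigr\|\le\|\phi\|$. To prove the other inequality, fix for the moment $\varepsilon>0$, and choose $f\in C(K)$ such that $\|f\|\le1$, and

$$|\phi(f)|\ge\|\phi\|-\varepsilon.$$

Choose a complex number $\lambda$ with $|\lambda|=1$, such that

$$|\phi(f)|=\lambda\phi(f)=\phi(\lambda f).$$

If we write $\lambda f=g+ih$, with $g,h\in C^{\mathbb{R}}(K)$, then using the fact that $\phi$ is self-adjoint (so that $\phi(g)$ and $\phi(h)$ are real, while $\phi(\lambda f)=\phi(g)+i\phi(h)$ is real as well), we will have

$$|\phi(f)|=\phi(g).$$

Since $\|g\|\le\|\lambda f\|=\|f\|\le1$, we will get

$$|\phi(f)|\le\bigl\|\phi\big|_{C^{\mathbb{R}}(K)}\bigr\|,$$

so our choice of $f$ will give

$$\|\phi\|-\varepsilon\le\bigl\|\phi\big|_{C^{\mathbb{R}}(K)}\bigr\|.$$

Since this holds for all $\varepsilon>0$, we get

$$\|\phi\|\le\bigl\|\phi\big|_{C^{\mathbb{R}}(K)}\bigr\|.$$

The $w^*$ continuity (both ways) is obvious.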

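Convention. From now on, we will identify the space $M^{\mathbb{R}}(K)$ with $M_{sa}(K)$.

Proposition 5.3. Let $K$ be a compact Hausdorff space. For every $p\in K$, let $\gamma_p : C(K)\to\mathbb{C}$ be the map

$$\gamma_p : C(K)\ni f\longmapsto f(p)\in\mathbb{C}.$$

(i) For every $p\in K$, the maps $\gamma_p$ and $\gamma_p^{\mathbb{R}}=\gamma_p\big|_{C^{\mathbb{R}}(K)} : C^{\mathbb{R}}(K)\to\mathbb{R}$ are linear and continuous.
(ii) For every $p\in K$, one has $\|\gamma_p\|=\|\gamma_p^{\mathbb{R}}\|=1$.
(iii) The maps

$$\Gamma_K : K\ni p\longmapsto\gamma_p\in M(K)_1$$

$$\Gamma_K^{\mathbb{R}} : K\ni p\longmapsto\gamma_p^{\mathbb{R}}\in M^{\mathbb{R}}(K)_1$$

are injective and continuous, when the target spaces $M(K)_1$ and $M^{\mathbb{R}}(K)_1$ are equipped with the $w^*$ topology.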

Proof. (i)-(ii). The fact that $\gamma_p$ is $\mathbb{C}$-linear is obvious. This will also give the $\mathbb{R}$-linearity of $\gamma_p^{\mathbb{R}}$. The continuity follows from the obvious inequality

$$|\gamma_p(f)|=|f(p)|\le\max_{q\in K}|f(q)|=\|f\|,\quad\forall\,f\in C(K).$$

Among other things, the above inequality also proves

$$\|\gamma_p\|\le1\quad\text{and}\quad\|\gamma_p^{\mathbb{R}}\|\le1.$$

The fact that we have in fact equalities follows from $\gamma_p(1)=1$.

(iii) Let us first prove the injectivity. Assume we have two points $p,q\in K$, with $p\ne q$. Use Urysohn Lemma to find $f : K\to[0,1]$ continuous, such that $f(p)=0$ and $f(q)=1$. Then $f\in C^{\mathbb{R}}(K)$ and $\gamma_p^{\mathbb{R}}(f)=f(p)=0$, and $\gamma_q^{\mathbb{R}}(f)=f(q)=1$, so we indeed have $\gamma_p^{\mathbb{R}}\ne\gamma_q^{\mathbb{R}}$. (This will also imply $\gamma_p\ne\gamma_q$.)

To prove the continuity of the maps $\Gamma_K : K\to M(K)_1$ and $\Gamma_K^{\mathbb{R}} : K\to M^{\mathbb{R}}(K)_1$, we need to prove the continuity of the maps $\epsilon_f\circ\Gamma_K : K\to\mathbb{C}$, $f\in C(K)$, and of the maps $\epsilon_f\circ\Gamma_K^{\mathbb{R}} : K\to\mathbb{R}$, $f\in C^{\mathbb{R}}(K)$. (Recall that $\epsilon_f(\phi)=\phi(f)$, $\forall\,\phi\in C^{\mathbb{K}}(K)^*$.) Notice however that we have in fact equalities

$$\epsilon_f\circ\Gamma_K=f,\quad\forall\,f\in C(K),$$

$$\epsilon_f\circ\Gamma_K^{\mathbb{R}}=f,\quad\forall\,f\in C^{\mathbb{R}}(K),$$

so the desired continuity is automatic.

Corollary 5.2. With the above notations, the spaces

$$\Gamma(K)=\{\gamma_p : p\in K\}\subset M(K)_1\quad\text{and}\quad\Gamma^{\mathbb{R}}(K)=\{\gamma_p^{\mathbb{R}} : p\in K\}\subset M^{\mathbb{R}}(K)_1$$

are $w^*$ compact, and the maps

$$\Gamma_K : K\to\Gamma(K)\quad\text{and}\quad\Gamma_K^{\mathbb{R}} : K\to\Gamma^{\mathbb{R}}(K)$$

are homeomorphisms.

Here is an interesting application of the above result to topology.

Theorem 5.4 (Urysohn Metrizability Theorem). Let $K$ be a compact Hausdorff space. The following are equivalent:
(i) $K$ is metrizable;
(ii) $K$ is second countable, i.e. the topology has a countable base;
(iiiR) the Banach space $C^{\mathbb{R}}(K)$ is separable;
(iiiC) the Banach space $C(K)$ is separable.

Proof. (i) ⇒ (ii). We already know this fact. (See the section on metric spaces).

(ii) ⇒ (iiiR). Assume $K$ is second countable. Fix a countable base $\{D_n : n\in\mathbb{N}\}$ for the topology. Consider the countable set

$$\Delta=\{(m,n)\in\mathbb{N}^2 : \overline{D_m}\cap\overline{D_n}=\emptyset\}.$$

Claim: For any two points $p,q\in K$, with $p\ne q$, there exists a pair $(m,n)\in\Delta$ with $p\in D_m$ and $q\in D_n$.

Indeed, since $K$ is Hausdorff, there exist open sets $U_0,V_0\subset K$ with $p\in U_0$, $q\in V_0$, and $U_0\cap V_0=\emptyset$. Since $K$ is (locally) compact, there exist open sets $U,V\subset K$, such that $p\in U\subset\overline U\subset U_0$ and $q\in V\subset\overline V\subset V_0$. Finally, since $\{D_n : n\in\mathbb{N}\}$ is a basis for the topology, there exist $m,n\in\mathbb{N}$ such that $p\in D_m\subset U$ and $q\in D_n\subset V$. Then clearly we have $\overline{D_m}\subset\overline U\subset U_0$, and $\overline{D_n}\subset\overline V\subset V_0$, which forces $\overline{D_m}\cap\overline{D_n}=\emptyset$.

Having proven the Claim, for every pair $(m,n)\in\Delta$ we choose (use Urysohn Lemma) a continuous function $h_{mn} : K\to[0,1]$ such that $h_{mn}\big|_{\overline{D_m}}=0$ and $h_{mn}\big|_{\overline{D_n}}=1$, and we define the countable family

$$\mathcal{F}=\{h_{mn} : (m,n)\in\Delta\}.$$

Using the Claim, we know that $\mathcal{F}$ separates the points of $K$. We set

$$\mathcal{P}=\{h\in C^{\mathbb{R}}(K) : h\text{ is a finite product of functions in }\mathcal{F}\}.$$


Notice that P is still countable and still separates the points of K, and in addition it has the property:

$$f, g \in P \Rightarrow fg \in P.$$

If we define

$$A = \operatorname{Span}(\{1\} \cup P),$$

then $A \subset C^{\mathbb{R}}(K)$ satisfies the hypothesis of the Stone-Weierstrass Theorem, hence A is dense in $C^{\mathbb{R}}(K)$. Notice that if we define

$$A_{\mathbb{Q}} = \operatorname{Span}_{\mathbb{Q}}(\{1\} \cup P),$$

i.e. the set of linear combinations of elements in $\{1\} \cup P$ with rational coefficients, then clearly $A_{\mathbb{Q}}$ is dense in A, and so $A_{\mathbb{Q}}$ is dense in $C^{\mathbb{R}}(K)$. But now we are done, since $A_{\mathbb{Q}}$ is obviously countable.

(iiiR) ⇒ (iiiC). Assume $C^{\mathbb{R}}(K)$ is separable. Let $S \subset C^{\mathbb{R}}(K)$ be a countable dense set. Then the set

$$S + iS = \{f + ig : f, g \in S\}$$

is clearly countable, and dense in $C(K)$.

(iiiC) ⇒ (i). Assume $C(K)$ is separable. By the results from the previous section, it follows that, when equipped with the w∗ topology, the compact space $M(K)_1$ is metrizable. Then the compact subset $\Gamma(K) \subset M(K)_1$ is also metrizable. Since K is homeomorphic to $\Gamma(K)$, it follows that K itself is metrizable.

Definition. Let K be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A $\mathbb{K}$-linear map $\phi : C^{\mathbb{K}}(K) \to \mathbb{K}$ is said to be positive, if it has the property

$$f \in C^{\mathbb{R}}(K),\ f \geq 0 \implies \phi(f) \geq 0.$$

Proposition 5.4 (Automatic continuity for positive linear maps). Let K be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. Any positive $\mathbb{K}$-linear map $\phi : C^{\mathbb{K}}(K) \to \mathbb{K}$ is continuous. Moreover, one has the equality $\|\phi\| = \phi(1)$.

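Proof. In the case when $\mathbb{K} = \mathbb{C}$, it suffices to prove that $\phi\big|_{C^{\mathbb{R}}(K)}$ is continuous. Therefore, it suffices to prove the statement for $\mathbb{K} = \mathbb{R}$. Start with some arbitrary $f \in C^{\mathbb{R}}(K)$, and define the functions $f^\pm \in C^{\mathbb{R}}(K)$ by

$$f^+ = \max\{f, 0\} \quad\text{and}\quad f^- = \max\{-f, 0\},$$

so that $f^\pm \geq 0$, $f = f^+ - f^-$, and $\|f\| = \max\{\|f^+\|, \|f^-\|\}$. On the one hand, by positivity, we have the inequalities $\phi(f^\pm) \geq 0$, so we get

$$-\phi(f^-) \leq \phi(f^+) - \phi(f^-) \leq \phi(f^+),$$

which gives

$$|\phi(f)| = |\phi(f^+) - \phi(f^-)| \leq \max\{\phi(f^+), \phi(f^-)\}. \tag{2}$$

On the other hand, we have

$$\|f^\pm\| \cdot 1 - f^\pm \geq 0,$$

so by positivity we get

$$\|f^\pm\| \cdot \phi(1) \geq \phi(f^\pm).$$

Using this in (2) gives

$$|\phi(f)| \leq \phi(1) \cdot \max\{\|f^+\|, \|f^-\|\} = \phi(1) \cdot \|f\|.$$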


Since this holds for all f ∈ CR(K), the continuity of φ follows, together with the estimate

‖φ‖ ≤ φ(1).

Since φ(1) ≤ ‖φ‖ · ‖1‖ = ‖φ‖, the desired norm equality follows.
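As a finite sanity check of Proposition 5.4, one can take K to be a finite set {0, ..., n-1}, so that a real function on K is a list of numbers and a positive linear map is given by non-negative weights. The sketch below is only an illustration under that assumption; the names `phi`, `operator_norm` and the sample weights are hypothetical and not part of the notes.

```python
import itertools

def phi(weights, f):
    """Evaluate phi(f) = sum_i w_i * f(i) on the finite space K = {0, ..., n-1}."""
    return sum(w * v for w, v in zip(weights, f))

def sup_norm(f):
    """Uniform norm of a function on the finite space K."""
    return max(abs(v) for v in f)

def operator_norm(weights):
    """sup{|phi(f)| : ||f|| <= 1}. Since phi is linear, the supremum over the
    cube [-1, 1]^n is attained at a vertex, i.e. at a vector of signs."""
    n = len(weights)
    signs = itertools.product((-1.0, 1.0), repeat=n)
    return max(abs(phi(weights, s)) for s in signs)

weights = [0.5, 2.0, 0.0, 1.25]   # hypothetical non-negative weights (phi is positive)
ones = [1.0] * len(weights)       # the constant function 1
f = [3.0, -1.0, 4.0, -2.0]        # an arbitrary real function on K

norm_phi = operator_norm(weights)
phi_one = phi(weights, ones)
print(norm_phi, phi_one)                               # the two values agree
print(abs(phi(weights, f)) <= phi_one * sup_norm(f))   # the estimate |phi(f)| <= phi(1) ||f||
```

The brute-force search over sign vectors is only feasible for small n; it is used here because it computes the norm directly from the definition rather than assuming the result.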

Notations. Let K be a compact Hausdorff space. We define

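$$M^{\mathbb{K}}_+(K) = \{\phi : C^{\mathbb{K}}(K) \to \mathbb{K} : \phi \text{ is } \mathbb{K}\text{-linear, positive}\};$$

$$M^{\mathbb{K}}_+(K)_1 = \{\phi \in M^{\mathbb{K}}_+(K) : \|\phi\| \leq 1\} = M^{\mathbb{K}}_+(K) \cap M^{\mathbb{K}}(K)_1.$$

When $\mathbb{K} = \mathbb{C}$, the superscript $\mathbb{C}$ will be omitted.

Remarks 5.2. Let K be a compact Hausdorff space. We have the inclusion $M_+(K) \subset M_{sa}(K)$. Indeed, if we start with $\phi \in M_+(K)$, then using the fact that every real-valued continuous function $f \in C(K)$ is a difference of non-negative continuous functions $f = f^+ - f^-$, it follows that $\phi(f) = \phi(f^+) - \phi(f^-)$ is a difference of two non-negative (hence real) numbers, so $\phi(f) \in \mathbb{R}$. This implies $\phi^\star = \phi$.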

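The set $M^{\mathbb{R}}_+(K)$ is w∗-closed in $M^{\mathbb{R}}(K)$, and the set $M_+(K)$ is w∗-closed in $M(K)$. This follows from the fact that, for each $f \in C^{\mathbb{R}}(K)$, the set

$$A^{\mathbb{K}}_f = \{\phi \in M^{\mathbb{K}}(K) : \phi(f) \geq 0\} = \varepsilon_f^{-1}\big([0,\infty)\big)$$

is w∗-closed, being the preimage of a closed set under a w∗-continuous map. Then everything is a consequence of the equality

$$M^{\mathbb{K}}_+(K) = \bigcap_{\substack{f \in C^{\mathbb{R}}(K) \\ f \geq 0}} A^{\mathbb{K}}_f.$$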

In particular, the sets MR+(K)1 and M+(K)1 are w∗-compact.

The sets MR+(K)1 and M+(K)1 are convex.

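Using the identification $M^{\mathbb{R}}(K) \simeq M_{sa}(K)$, we have the following hierarchies:

$$\begin{array}{ccc} M^{\mathbb{R}}_+(K) & \simeq & M_+(K) \\ \cap & & \cap \\ M^{\mathbb{R}}(K) & \simeq & M_{sa}(K) \\ & & \cap \\ & & M(K) \end{array} \qquad\qquad \begin{array}{ccc} M^{\mathbb{R}}_+(K)_1 & \simeq & M_+(K)_1 \\ \cap & & \cap \\ M^{\mathbb{R}}(K)_1 & \simeq & M_{sa}(K)_1 \\ & & \cap \\ & & M(K)_1 \end{array}$$

with each $\simeq$ being an isometric w∗-homeomorphism.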

Proposition 5.5. Let K be a compact Hausdorff space. Then one has the equality

$$M_{sa}(K)_1 = \operatorname{conv}\big(M_+(K)_1 \cup -M_+(K)_1\big).$$

(Here conv denotes the convex cover.)

Proof. Denote the set $\operatorname{conv}\big(M_+(K)_1 \cup -M_+(K)_1\big)$ simply by C.

Claim: One has the equality:

$$C = \{t\phi - (1-t)\psi : \phi, \psi \in M_+(K)_1,\ t \in [0,1]\}. \tag{3}$$

In particular, the set C is w∗-compact.


Denote the set on the right hand side of (3) simply by D. The inclusion $C \supset D$ is clear. To prove the inclusion $C \subset D$, we only need to prove that D is convex and that it contains $M_+(K)_1 \cup -M_+(K)_1$. The second property is clear. The convexity of D is also clear, being a consequence of the convexity of $\pm M_+(K)_1$.

The w∗-compactness of C is then a consequence of the compactness of the product space

$$M_+(K)_1 \times M_+(K)_1 \times [0,1],$$

and of the fact that C is the range of the continuous map

$$M_+(K)_1 \times M_+(K)_1 \times [0,1] \ni (\phi, \psi, t) \longmapsto t\phi - (1-t)\psi \in M_{sa}(K).$$

Having proven the Claim, we now proceed with the equality

Msa(K)1 = C.

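The inclusion ⊃ is clear, since $M_{sa}(K)_1$ is convex, and it contains both $M_+(K)_1$ and $-M_+(K)_1$.

We prove the other inclusion by contradiction. Assume there is some $\phi \in M_{sa}(K)_1 \smallsetminus C$. Apply Corollary II.4.2 to find some $f \in C(K)$ and a real number α, such that

$$\operatorname{Re} \phi(f) < \alpha \leq \operatorname{Re} \sigma(f), \quad \forall\, \sigma \in C.$$

If we take $g = \operatorname{Re} f$, then this gives

$$\phi(g) < \alpha \leq \sigma(g), \quad \forall\, \sigma \in C.$$

Notice that $0 \in C$, so we get $\alpha \leq 0$. If we define $\beta = -\alpha\ (\geq 0)$, and $h = -g$, the above inequality gives

$$\phi(h) > \beta \geq \sigma(h), \quad \forall\, \sigma \in C.$$

Using the obvious inclusions $\pm\Gamma(K) \subset C$, we get

$$\beta \geq \pm\gamma_p(h) = \pm h(p), \quad \forall\, p \in K.$$

Since h is real-valued, this forces $\|h\| \leq \beta$. But then we get a contradiction, because we also have

$$\beta < \phi(h) \leq \|\phi\| \cdot \|h\| \leq \|h\|.$$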

Corollary 5.3. Let K be a compact Hausdorff space, and let $\phi \in M_{sa}(K)$. Then there exist $\phi_1, \phi_2 \in M_+(K)$, such that $\phi = \phi_1 - \phi_2$, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$.

Proof. If $\phi \in M_+(K) \cup -M_+(K)$, there is nothing to prove. Assume $\phi \notin M_+(K) \cup -M_+(K)$, in particular $\phi \neq 0$. We define $\psi = \phi / \|\phi\|$, so that $\psi \in M_{sa}(K)_1$. Find $\psi_1, \psi_2 \in M_+(K)_1$ and $t \in [0,1]$, such that

$$\psi = t\psi_1 - (1-t)\psi_2.$$

Since $\psi \notin M_+(K) \cup -M_+(K)$, it follows that $0 < t < 1$. Notice that

$$1 = \|\psi\| = \|t\psi_1 - (1-t)\psi_2\| \leq t\|\psi_1\| + (1-t)\|\psi_2\|.$$

If $\|\psi_1\| < 1$, or $\|\psi_2\| < 1$, then this would imply $t\|\psi_1\| + (1-t)\|\psi_2\| < 1$, which is impossible by the above estimate. This argument proves that we must have $\|\psi_1\| = \|\psi_2\| = 1$. If we define

$$\phi_1 = t\|\phi\|\psi_1 \quad\text{and}\quad \phi_2 = (1-t)\|\phi\|\psi_2,$$


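then $\|\phi_1\| = t\|\phi\|$ and $\|\phi_2\| = (1-t)\|\phi\|$, so we indeed have $\|\phi_1\| + \|\phi_2\| = \|\phi\|$. Obviously $\phi_1$ and $\phi_2$ are positive, and

$$\phi_1 - \phi_2 = \|\phi\| \cdot \big[t\psi_1 - (1-t)\psi_2\big] = \|\phi\| \cdot \psi = \phi.$$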
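For a finite space K = {0, ..., n-1} the decomposition in Corollary 5.3 can be written down explicitly: a real linear functional is a list of weights, its norm (with respect to the uniform norm on functions) is the sum of the absolute values of the weights, and the positive and negative parts of the weights give $\phi_1$ and $\phi_2$. The sketch below illustrates only this finite case; the helper names and the sample weights are hypothetical.

```python
def split_weights(weights):
    """Return (pos, neg) with weights = pos - neg and pos, neg >= 0 entrywise."""
    pos = [max(w, 0.0) for w in weights]
    neg = [max(-w, 0.0) for w in weights]
    return pos, neg

def functional_norm(weights):
    """Norm of f -> sum_i w_i f(i) on (R^n, sup norm): the sum of |w_i|."""
    return sum(abs(w) for w in weights)

weights = [1.5, -2.0, 0.0, 0.25, -0.75]   # hypothetical self-adjoint functional
pos, neg = split_weights(weights)

print(functional_norm(weights))                    # norm of phi
print(functional_norm(pos) + functional_norm(neg)) # norm of phi_1 plus norm of phi_2
```

In the general compact case no such explicit formula is used in the proof above; the existence of $\phi_1, \phi_2$ comes from Proposition 5.5.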

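Proposition 5.6. Let K be a compact Hausdorff space. The set

$$\operatorname{conv}\big(\Gamma(K) \cup \{0\}\big)$$

is w∗-dense in $M_+(K)_1$.

Proof. Let C be the w∗-closure of $\operatorname{conv}\big(\Gamma(K) \cup \{0\}\big)$. It is obvious that $C \subset M_+(K)_1$, so we only need to prove the inclusion $M_+(K)_1 \subset C$. We do this by contradiction. Assume there exists some $\phi \in M_+(K)_1 \smallsetminus C$. Since C is w∗-closed and convex, there exist some $f \in C(K)$ and a real number α, such that

$$\operatorname{Re} \phi(f) < \alpha \leq \operatorname{Re} \sigma(f), \quad \forall\, \sigma \in C.$$

In particular, if we take $h = -\operatorname{Re} f$, and $\beta = -\alpha$, we get

$$\phi(h) > \beta \geq \sigma(h), \quad \forall\, \sigma \in C. \tag{4}$$

Since $0 \in C$, we have $\beta \geq 0$. Since $\Gamma(K) \subset C$, we also get

$$\beta \geq \gamma_p(h) = h(p), \quad \forall\, p \in K,$$

which means that $\beta 1 - h \geq 0$. Since φ is positive, this forces $\phi(\beta 1 - h) \geq 0$, which gives

$$\phi(h) \leq \phi(\beta 1) = \beta\phi(1) = \beta\|\phi\|.$$

Finally, since $\|\phi\| \leq 1$, this gives

$$\phi(h) \leq \beta,$$

thus contradicting (4).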

The results for the Banach spaces of the form C(K), with K a compact Hausdorff space, can be generalized, with suitable modifications, to the situation when K is replaced with a locally compact space. The following result in fact reduces the analysis to the compact case.

Theorem 5.5. Let Ω be a locally compact space, and let $\Omega^\beta$ be the Stone-Čech compactification of Ω. Then the restriction map

$$R : C^{\mathbb{K}}(\Omega^\beta) \ni f \longmapsto f\big|_\Omega \in C^{\mathbb{K}}_b(\Omega)$$

is an isometric linear isomorphism.

Proof. The linearity is obvious.

We show that R is bijective, by exhibiting an inverse for it. For every $h \in C^{\mathbb{K}}_b(\Omega)$, we consider the compact set

$$K_h = \{z \in \mathbb{K} : |z| \leq \|h\|\},$$

so that we can regard h as a continuous map $\Omega \to K_h$. We know from the functoriality of the Stone-Čech compactification that there exists a unique continuous map $h^\beta : \Omega^\beta \to K_h^\beta$, with $h^\beta\big|_\Omega = h$. Since $K_h$ is compact, we have $K_h^\beta = K_h$. In particular, this gives the inequality

$$|h^\beta(x)| \leq \|h\|, \quad \forall\, x \in \Omega^\beta. \tag{5}$$


Define the map $T : C^{\mathbb{K}}_b(\Omega) \ni h \longmapsto h^\beta \in C^{\mathbb{K}}(\Omega^\beta)$, and let us show that T is an inverse for R. The equality $R \circ T = \mathrm{Id}$ is trivial, by construction. To prove the equality $T \circ R = \mathrm{Id}$, we start with some $f \in C^{\mathbb{K}}(\Omega^\beta)$, and we consider $h = Rf$. Then $Th = h^\beta$, and since $h^\beta\big|_\Omega = h = f\big|_\Omega$, the density of Ω in $\Omega^\beta$ clearly forces

$$f = h^\beta = Th = T(Rf).$$

The fact that R is isometric is now clear, because on the one hand we clearly have $\|Rf\| \leq \|f\|$, $\forall\, f \in C^{\mathbb{K}}(\Omega^\beta)$, and on the other hand, by (5), we also have $\|Th\| \leq \|h\|$, $\forall\, h \in C^{\mathbb{K}}_b(\Omega)$.

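If Ω is a locally compact space, the above result suggests that the space $C^{\mathbb{K}}_b(\Omega)$ is quite “large.” It is then natural to look at smaller spaces.

Definitions. Let Ω be a locally compact space. If $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$, and $f : \Omega \to \mathbb{K}$ is a continuous function, we define the support of f by

$$\operatorname{supp} f = \overline{\{\omega \in \Omega : f(\omega) \neq 0\}}.$$

We define the space

$$C^{\mathbb{K}}_c(\Omega) = \{f : \Omega \to \mathbb{K} : f \text{ continuous, with compact support}\}.$$

When $\mathbb{K} = \mathbb{C}$, this space will be denoted simply by $C_c(\Omega)$. Remark that, when equipped with pointwise addition and multiplication, the space $C^{\mathbb{K}}_c(\Omega)$ becomes a $\mathbb{K}$-algebra. One has obviously the inclusion $C^{\mathbb{K}}_c(\Omega) \subset C^{\mathbb{K}}_b(\Omega)$.

We define $C^{\mathbb{K}}_0(\Omega) = \overline{C^{\mathbb{K}}_c(\Omega)}$, the closure of $C^{\mathbb{K}}_c(\Omega)$ in $C^{\mathbb{K}}_b(\Omega)$. (When $\mathbb{K} = \mathbb{C}$, we will denote this space simply by $C_0(\Omega)$.) The Banach space $C^{\mathbb{K}}_0(\Omega)$ can be regarded as the completion of $C^{\mathbb{K}}_c(\Omega)$. Of course, when Ω is compact, we have the equality $C^{\mathbb{K}}_0(\Omega) = C^{\mathbb{K}}(\Omega)$.

The following result characterizes the Banach space $C^{\mathbb{K}}_0(\Omega)$.

Proposition 5.7. Let Ω be a locally compact space. For a function $f \in C^{\mathbb{K}}_b(\Omega)$, the following are equivalent:

(i) $f \in C^{\mathbb{K}}_0(\Omega)$;
(ii) for every $\varepsilon > 0$, there exists some compact subset $K_\varepsilon \subset \Omega$, such that

$$\sup_{\omega \in \Omega \smallsetminus K_\varepsilon} |f(\omega)| \leq \varepsilon.$$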

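Proof. (i) ⇒ (ii). Suppose $f \in C^{\mathbb{K}}_0(\Omega)$, which means that there exists some sequence $(f_n)_{n=1}^\infty \subset C^{\mathbb{K}}_c(\Omega)$, such that $\lim_{n\to\infty} f_n = f$, in the norm topology in $C^{\mathbb{K}}_b(\Omega)$. Fix some $\varepsilon > 0$, and choose $k \geq 1$, such that $\|f - f_k\| \leq \varepsilon$. If we define $K_\varepsilon = \operatorname{supp} f_k$, then, for every $\omega \in \Omega \smallsetminus K_\varepsilon$, we have $f_k(\omega) = 0$, so the inequality $\|f - f_k\| \leq \varepsilon$ forces $|f(\omega)| \leq \varepsilon$.

(ii) ⇒ (i). Suppose f satisfies property (ii). Fix for the moment an integer $n \geq 1$. Use condition (ii) to find a compact subset $K_n \subset \Omega$, such that

$$|f(\omega)| \leq \tfrac{1}{n}, \quad \forall\, \omega \in \Omega \smallsetminus K_n.$$

Use Urysohn Lemma to choose some continuous function $h_n : \Omega \to [0,1]$, with compact support, such that $h_n\big|_{K_n} = 1$. Define the function $f_n = h_n f$, so that $f_n \in C^{\mathbb{K}}_c(\Omega)$. If $\omega \in \Omega \smallsetminus K_n$, then, using the inequality $0 \leq h_n \leq 1$, and the choice of $K_n$, we have

$$|f(\omega) - f_n(\omega)| = |f(\omega)| \cdot [1 - h_n(\omega)] \leq |f(\omega)| \leq \tfrac{1}{n}.$$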


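Using the fact that $f_n\big|_{K_n} = f\big|_{K_n}$, the above inequality proves that $\|f - f_n\| \leq \frac{1}{n}$. This way we have constructed a sequence $(f_n)_{n=1}^\infty \subset C^{\mathbb{K}}_c(\Omega)$, such that $\lim_{n\to\infty} f_n = f$, in $C^{\mathbb{K}}_b(\Omega)$, so by the definition it follows that $f \in C^{\mathbb{K}}_0(\Omega)$.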
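The construction in (ii) ⇒ (i) can be tried numerically on the real line. The sketch below assumes Ω = R and the sample function f(x) = 1/(1 + x²) (neither is taken from the notes); it picks the compact set $K_\varepsilon = [-r, r]$, uses a piecewise linear cutoff in the role of the Urysohn function $h_n$, and measures $|f - hf|$ on a finite grid, which only approximates the supremum norm.

```python
import math

def f(x):
    """Sample function vanishing at infinity (an assumption for illustration)."""
    return 1.0 / (1.0 + x * x)

def radius(eps):
    """Radius r such that |f(x)| <= eps whenever |x| >= r."""
    return math.sqrt(max(1.0 / eps - 1.0, 0.0))

def cutoff(x, r):
    """Continuous function with values in [0, 1]: equal to 1 on [-r, r],
    equal to 0 outside [-r - 1, r + 1], linear in between."""
    return min(1.0, max(0.0, r + 1.0 - abs(x)))

def approx_error(eps, grid):
    """max over the grid of |f(x) - h(x) f(x)|, with h the cutoff for K_eps."""
    r = radius(eps)
    return max(abs(f(x) - cutoff(x, r) * f(x)) for x in grid)

grid = [k / 100.0 for k in range(-5000, 5001)]   # sample points in [-50, 50]
for eps in (0.5, 0.1, 0.01):
    print(eps, approx_error(eps, grid) <= eps)
```

The compactly supported functions `cutoff(., r) * f` play the role of the $f_n$ in the proof, with ε in place of 1/n.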

The following establishes an interesting connection with the Alexandrov compactification.

Proposition 5.8. Let Ω be a locally compact space, which is non-compact, and let $\Omega^\alpha = \Omega \sqcup \{\infty\}$ denote the Alexandrov compactification.

(i) For every function $f \in C^{\mathbb{K}}_0(\Omega)$, the function $f^\alpha : \Omega^\alpha \to \mathbb{K}$, defined by $f^\alpha\big|_\Omega = f$, and $f^\alpha(\infty) = 0$, is continuous.
(ii) The correspondence $U : C^{\mathbb{K}}_0(\Omega) \ni f \longmapsto f^\alpha \in C^{\mathbb{K}}(\Omega^\alpha)$ is an isometric linear map.
(iii) One has the equality

$$\operatorname{Ran} U = \{g \in C^{\mathbb{K}}(\Omega^\alpha) : g(\infty) = 0\}. \tag{6}$$

Proof. (i). We know that Ω is open in $\Omega^\alpha$, which immediately gives the fact that $f^\alpha$ is continuous at every point $\omega \in \Omega$. So all we need to show is the continuity of $f^\alpha$ at ∞. This amounts to showing that for every neighborhood N of $f^\alpha(\infty) = 0$ in $\mathbb{K}$, there exists a neighborhood V of ∞ in $\Omega^\alpha$, such that $f^\alpha(V) \subset N$. Start with a neighborhood N of 0, and choose $\varepsilon > 0$, such that the set $B_\varepsilon = \{z \in \mathbb{K} : |z| \leq \varepsilon\}$ is contained in N. Choose some compact set $K_\varepsilon \subset \Omega$, such that

$$\sup_{\omega \in \Omega \smallsetminus K_\varepsilon} |f(\omega)| \leq \varepsilon.$$

Define the set $D = (\Omega \smallsetminus K_\varepsilon) \cup \{\infty\}$. By the definition of the topology on $\Omega^\alpha$, the set D is an open neighborhood of ∞. We are now done, because we clearly have

$$|f^\alpha(x)| \leq \varepsilon, \quad \forall\, x \in D,$$

which gives the inclusion $f^\alpha(D) \subset B_\varepsilon \subset N$.

(ii). This part is trivial.

(iii). Denote the right hand side of (6) by A. The inclusion $\operatorname{Ran} U \subset A$ is trivial, by definition. Conversely, let us start with some $g \in A$, and let us consider the function $f = g\big|_\Omega$. Let us show that $f \in C^{\mathbb{K}}_0(\Omega)$, using Proposition 5.7. Start with some $\varepsilon > 0$, and choose some open neighborhood $D_\varepsilon$ of ∞, in $\Omega^\alpha$, such that

$$|g(x)| \leq \varepsilon, \quad \forall\, x \in D_\varepsilon.$$

By definition, there exists a compact subset $K_\varepsilon \subset \Omega$, such that $D_\varepsilon = \Omega^\alpha \smallsetminus K_\varepsilon$, so it is immediate that f satisfies condition (ii) from Proposition 5.7. Notice now that, by construction, we have $f^\alpha\big|_\Omega = g\big|_\Omega$, and $f^\alpha(\infty) = 0 = g(\infty)$, so we indeed get $g = Uf$.

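Remark 5.3. Let Ω be a locally compact space, which is non-compact. Use the map U defined above, to identify $C^{\mathbb{K}}_0(\Omega)$ with the subspace $\operatorname{Ran} U \subset C^{\mathbb{K}}(\Omega^\alpha)$. With this identification, we have the equality

$$C^{\mathbb{K}}(\Omega^\alpha) = \mathbb{K}1 + C^{\mathbb{K}}_0(\Omega) = \{\lambda 1 + f : \lambda \in \mathbb{K},\ f \in C^{\mathbb{K}}_0(\Omega)\}.$$

Indeed, if we start with some function $g \in C^{\mathbb{K}}(\Omega^\alpha)$ and we take $\lambda = g(\infty)$ and $f = g - \lambda 1$, then $f(\infty) = 0$. Note that this argument proves that in fact every $g \in C^{\mathbb{K}}(\Omega^\alpha)$ can be uniquely represented as $g = \lambda 1 + f$, with $\lambda \in \mathbb{K}$, and $f \in C^{\mathbb{K}}_0(\Omega)$.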


We conclude with a couple of generalizations of the various results in this section. The first two are proven, the rest are stated as exercises. The following result is a generalization of Proposition 5.4.

Proposition 5.9. Let Ω be a locally compact space, and let $\phi : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$ be a positive linear map. Then φ is continuous, and one has the equality

$$\|\phi\| = \sup\{\phi(f) : f \in C^{\mathbb{R}}_0(\Omega),\ 0 \leq f \leq 1\}. \tag{7}$$

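Proof. Let us denote the right hand side of (7) by M. First we show that $M < \infty$. If $M = \infty$, there exists a sequence $(f_n)_{n=1}^\infty \subset C^{\mathbb{R}}_0(\Omega)$, such that

$$0 \leq f_n \leq 1 \quad\text{and}\quad \phi(f_n) \geq 4^n, \quad \forall\, n \geq 1.$$

Consider then the function $f = \sum_{n=1}^\infty \frac{1}{2^n} f_n$. Since

$$\sum_{n=1}^\infty \big\|\tfrac{1}{2^n} f_n\big\| \leq \sum_{n=1}^\infty \tfrac{1}{2^n} = 1,$$

it follows that $f \in C^{\mathbb{R}}_0(\Omega)$. Notice however that, since we obviously have $\frac{1}{2^n} f_n \leq f$, by the positivity of φ, we get

$$\phi(f) \geq \phi\big(\tfrac{1}{2^n} f_n\big) = \tfrac{1}{2^n}\phi(f_n) \geq 2^n, \quad \forall\, n \geq 1,$$

which is clearly impossible. Let us show now that φ is continuous, by proving the inequality

$$|\phi(f)| \leq M, \quad \forall\, f \in C^{\mathbb{R}}_0(\Omega), \text{ with } \|f\| \leq 1. \tag{8}$$

Start with some arbitrary function $f \in C^{\mathbb{R}}_0(\Omega)$ with $\|f\| \leq 1$. The functions $g_\pm = |f| \pm f \in C^{\mathbb{R}}_0(\Omega)$ clearly satisfy $g_\pm \geq 0$, so we get $\phi(|f| \pm f) \geq 0$, hence $\phi(|f|) \geq \pm\phi(f)$. This gives $|\phi(f)| \leq \phi(|f|)$, and since $0 \leq |f| \leq 1$, we immediately get (8).

The inequality (8) proves the inequality $\|\phi\| \leq M$. Since we obviously have $M \leq \|\phi\|$, we get in fact the equality (7).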

Corollary 5.4. Let Ω be a locally compact space, which is non-compact, and let $\Omega^\alpha$ be the Alexandrov compactification of Ω. Using the inclusion $C^{\mathbb{R}}_0(\Omega) \subset C^{\mathbb{R}}(\Omega^\alpha)$, given by Proposition 5.8, every positive linear map $\phi : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$ can be uniquely extended to a positive linear map $\psi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$, such that $\|\psi\| = \|\phi\|$.

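Proof. For every $g \in C^{\mathbb{R}}(\Omega^\alpha)$, we know that there exist unique $\lambda \in \mathbb{R}$ and $f \in C^{\mathbb{R}}_0(\Omega)$, such that $g = \lambda 1 + f$ (namely $\lambda = g(\infty)$ and $f = g - \lambda 1$). We then define $\psi(g) = \lambda\|\phi\| + \phi(f)$. Notice that $\psi(1) = \|\phi\|$. It is obvious that $\psi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$ is linear, and $\psi\big|_{C^{\mathbb{R}}_0(\Omega)} = \phi$. Let us show that ψ is positive. Start with some $g \in C^{\mathbb{R}}(\Omega^\alpha)$ with $g \geq 0$, and let us prove that $\psi(g) \geq 0$. Write $g = \lambda 1 + f$ with $\lambda \in \mathbb{R}$ and $f \in C^{\mathbb{R}}_0(\Omega)$. We know that $\lambda = g(\infty) \geq 0$. If $\lambda = 0$, there is nothing to prove. If $\lambda > 0$, we define the function $h = \lambda^{-1} f \in C^{\mathbb{R}}_0(\Omega)$, so that $g = \lambda(1 + h)$. The positivity of g forces $1 + h \geq 0$, which means that if we consider the function $h^- = \max\{-h, 0\} \in C^{\mathbb{R}}_0(\Omega)$, then we have $0 \leq h^- \leq 1$, as well as $h^- + h \geq 0$. Using the above result, this will then give

$$\|\phi\| + \phi(h) \geq \phi(h^-) + \phi(h) = \phi(h^- + h) \geq 0,$$

which means that $\psi(1 + h) \geq 0$. Consequently we also get

$$\psi(g) = \psi(\lambda(1 + h)) = \lambda\psi(1 + h) \geq 0.$$

Having shown the positivity of ψ, we know that

$$\|\psi\| = \psi(1) = \|\phi\|.$$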


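To prove uniqueness, start with another positive linear map $\xi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$, such that $\|\xi\| = \|\phi\|$, with $\xi\big|_{C^{\mathbb{R}}_0(\Omega)} = \phi$. Since ξ is positive, this forces $\xi(1) = \|\xi\| = \|\phi\| = \psi(1)$. But then we have

$$\xi(\lambda 1 + f) = \lambda\|\phi\| + \phi(f) = \psi(\lambda 1 + f), \quad \forall\, \lambda \in \mathbb{R},\ f \in C^{\mathbb{R}}_0(\Omega),$$

which proves that $\xi = \psi$.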

Remark 5.4. Let Ω be a locally compact space, which is not compact, and let $\phi : C^{\mathbb{R}}_c(\Omega) \to \mathbb{R}$ be a positive linear map. Then the following are equivalent:

(i) φ is continuous;
(ii) $\sup\{\phi(f) : f \in C^{\mathbb{R}}_c(\Omega),\ 0 \leq f \leq 1\} < \infty$.

The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i) we follow the exact same steps as in the proof of the equality (7) in Proposition 5.9. Denote the quantity in (ii) by M, and using the inequality $|\phi(f)| \leq \phi(|f|)$, we immediately get $|\phi(f)| \leq M$, $\forall\, f \in C^{\mathbb{R}}_c(\Omega)$, with $\|f\| \leq 1$.

Remark also that if φ is as above, then we have in fact the equality

$$\|\phi\| = \sup\{\phi(f) : f \in C^{\mathbb{R}}_c(\Omega),\ 0 \leq f \leq 1\}.$$

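The following is a generalization of Corollary 5.3.

Proposition 5.10. Let Ω be a locally compact space, and let $\phi : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$ be a linear continuous map. Then there exist positive linear maps $\phi_1, \phi_2 : C^{\mathbb{R}}_0(\Omega) \to \mathbb{R}$, such that $\phi = \phi_1 - \phi_2$, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$.

Proof. If Ω is compact there is nothing to prove (this is Corollary 5.3). Assume Ω is non-compact. Use Hahn-Banach Theorem to find a linear continuous map $\psi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$, with $\|\psi\| = \|\phi\|$ and $\psi\big|_{C^{\mathbb{R}}_0(\Omega)} = \phi$. Apply Corollary 5.3 to find two positive linear maps $\psi_1, \psi_2 : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$ such that $\psi = \psi_1 - \psi_2$ and $\|\psi\| = \|\psi_1\| + \|\psi_2\|$. Define the positive linear maps $\phi_k = \psi_k\big|_{C^{\mathbb{R}}_0(\Omega)}$, $k = 1, 2$. We clearly have $\phi = \phi_1 - \phi_2$, and

$$\|\phi_1\| + \|\phi_2\| \leq \|\psi_1\| + \|\psi_2\| = \|\psi\| = \|\phi\| = \|\phi_1 - \phi_2\| \leq \|\phi_1\| + \|\phi_2\|,$$

which forces $\|\phi\| = \|\phi_1\| + \|\phi_2\|$.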

Exercise 3. (Dini’s Theorem for locally compact spaces) Let Ω be a locally compact space, and let $(f_n)_{n \geq 1} \subset C^{\mathbb{R}}_0(\Omega)$ be a monotone sequence. Assume there is some $f \in C^{\mathbb{R}}_0(\Omega)$, such that

$$\lim_{n\to\infty} f_n(\omega) = f(\omega), \quad \forall\, \omega \in \Omega.$$

Then $\lim_{n\to\infty} f_n = f$, in the norm topology.

Exercise 4. (Stone-Weierstrass Theorems) Let Ω be a locally compact space, which is non-compact, and let $A \subset C^{\mathbb{K}}_0(\Omega)$ be a subalgebra, with the following separation properties:

• For any two points $\omega_1, \omega_2 \in \Omega$, with $\omega_1 \neq \omega_2$, there exists $f \in A$ such that $f(\omega_1) \neq f(\omega_2)$.
• For any $\omega \in \Omega$, there exists $f \in A$ with $f(\omega) \neq 0$.

A. Prove that, if $\mathbb{K} = \mathbb{R}$, then A is dense in $C^{\mathbb{R}}_0(\Omega)$, in the norm topology.
B. Prove that, if $\mathbb{K} = \mathbb{C}$, and if A has the property $f \in A \Rightarrow \bar{f} \in A$, then A is dense in $C_0(\Omega)$.

Hint: Work in $\Omega^\alpha$ (use Remark 5.3), and prove that $\mathbb{K}1 + A$ is dense in $C^{\mathbb{K}}(\Omega^\alpha)$.

Lectures 16-17

6. Hilbert spaces

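In this section we examine a special type of Banach spaces.

Definition. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let X be a vector space over $\mathbb{K}$. An inner product on X is a map

$$X \times X \ni (\xi, \eta) \longmapsto (\xi \,|\, \eta) \in \mathbb{K},$$

with the following properties:

• $(\xi \,|\, \xi) \geq 0$, $\forall\, \xi \in X$;
• if $\xi \in X$ satisfies $(\xi \,|\, \xi) = 0$, then $\xi = 0$;
• for any $\xi \in X$, the map $X \ni \eta \longmapsto (\xi \,|\, \eta) \in \mathbb{K}$ is $\mathbb{K}$-linear;
• $(\eta \,|\, \xi) = \overline{(\xi \,|\, \eta)}$, $\forall\, \xi, \eta \in X$.

Comments. Combining the last two properties, one gets

$$(\xi \,|\, \lambda\eta_1 + \eta_2) = \lambda(\xi \,|\, \eta_1) + (\xi \,|\, \eta_2), \quad \forall\, \xi, \eta_1, \eta_2 \in X,\ \lambda \in \mathbb{K};$$

$$(\lambda\xi_1 + \xi_2 \,|\, \eta) = \bar{\lambda}(\xi_1 \,|\, \eta) + (\xi_2 \,|\, \eta), \quad \forall\, \xi_1, \xi_2, \eta \in X,\ \lambda \in \mathbb{K}.$$

In particular, one has

$$(\lambda\xi \,|\, \lambda\xi) = \bar{\lambda}\lambda(\xi \,|\, \xi) = |\lambda|^2 \cdot (\xi \,|\, \xi), \quad \forall\, \xi \in X,\ \lambda \in \mathbb{K}. \tag{1}$$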

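Proposition 6.1 (Cauchy-Bunyakowski-Schwarz Inequality). Let $(\,\cdot \,|\, \cdot\,)$ be an inner product on the $\mathbb{K}$-vector space X. Then

$$\big|(\xi \,|\, \eta)\big|^2 \leq (\xi \,|\, \xi) \cdot (\eta \,|\, \eta), \quad \forall\, \xi, \eta \in X. \tag{2}$$

Moreover, if equality holds then ξ and η are proportional, in the sense that either $\xi = 0$, or $\eta = 0$, or $\xi = \lambda\eta$ for some $\lambda \in \mathbb{K}$.

Proof. Fix $\xi, \eta \in X$. Assume $\eta \neq 0$. (In the case when $\eta = 0$, both statements are trivial.) Choose a number $\lambda \in \mathbb{K}$, with $|\lambda| = 1$, such that

$$\big|(\xi \,|\, \eta)\big| = \lambda(\xi \,|\, \eta) = (\xi \,|\, \lambda\eta).$$

Define the map $F : \mathbb{K} \to \mathbb{K}$ by

$$F(z) = (z\lambda\eta + \xi \,|\, z\lambda\eta + \xi), \quad \forall\, z \in \mathbb{K}.$$

A simple computation gives

$$\begin{aligned} F(z) &= \bar{z}\bar{\lambda}z\lambda(\eta \,|\, \eta) + z\lambda(\xi \,|\, \eta) + \bar{z}\bar{\lambda}(\eta \,|\, \xi) + (\xi \,|\, \xi) \\ &= |z|^2|\lambda|^2(\eta \,|\, \eta) + z\lambda(\xi \,|\, \eta) + \overline{z\lambda(\xi \,|\, \eta)} + (\xi \,|\, \xi) \\ &= |z|^2(\eta \,|\, \eta) + z\big|(\xi \,|\, \eta)\big| + \bar{z}\big|(\xi \,|\, \eta)\big| + (\xi \,|\, \xi), \quad \forall\, z \in \mathbb{K}. \end{aligned}$$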


In particular, when we restrict F to $\mathbb{R}$, it becomes a quadratic function:

$$F(t) = at^2 + bt + c, \quad \forall\, t \in \mathbb{R},$$

where $a = (\eta \,|\, \eta) > 0$, $b = 2\big|(\xi \,|\, \eta)\big|$, $c = (\xi \,|\, \xi)$. Notice that we have

$$F(t) \geq 0, \quad \forall\, t \in \mathbb{R}.$$

This forces $b^2 - 4ac \leq 0$. This last inequality gives

$$4\big|(\xi \,|\, \eta)\big|^2 - 4(\xi \,|\, \xi) \cdot (\eta \,|\, \eta) \leq 0,$$

so we get

$$\big|(\xi \,|\, \eta)\big|^2 \leq (\xi \,|\, \xi) \cdot (\eta \,|\, \eta),$$

and the inequality (2) is proven. Let us examine now when we have equality. The equality in (2) gives $b^2 - 4ac = 0$, which in terms of quadratic equations says that the equation

$$F(t) = at^2 + bt + c = 0$$

has a (unique) solution $t_0$. This will give

$$(t_0\lambda\eta + \xi \,|\, t_0\lambda\eta + \xi) = F(t_0) = 0,$$

which forces $t_0\lambda\eta + \xi = 0$, i.e. $\xi = (-t_0\lambda)\eta$.
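The inequality (2) and its equality case can be checked numerically in the finite dimensional space of complex n-tuples. The sketch below uses the convention of these notes (linear in the second variable, conjugate linear in the first); the function names and the sample vectors are hypothetical and chosen only for illustration.

```python
def inner(xi, eta):
    """(xi | eta) = sum_j conj(xi_j) * eta_j, linear in the second variable."""
    return sum(x.conjugate() * e for x, e in zip(xi, eta))

def cbs_gap(xi, eta):
    """(xi|xi) * (eta|eta) - |(xi|eta)|^2, which inequality (2) says is >= 0."""
    return inner(xi, xi).real * inner(eta, eta).real - abs(inner(xi, eta)) ** 2

xi = [1 + 2j, -0.5j, 3 + 0j]
eta = [2 - 1j, 1 + 1j, -1j]

gap = cbs_gap(xi, eta)                      # strictly positive: xi, eta not proportional
proportional = [(2 - 1j) * x for x in xi]   # a scalar multiple of xi
gap_equal = cbs_gap(xi, proportional)       # zero up to rounding: the equality case

print(gap, gap_equal)
```

Floating point arithmetic means the equality case is only reproduced up to rounding error, so comparisons should use a small tolerance.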

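Corollary 6.1. Let $(\,\cdot \,|\, \cdot\,)$ be an inner product on the $\mathbb{K}$-vector space X. Then the map

$$X \ni \xi \longmapsto \sqrt{(\xi \,|\, \xi)} \in [0, \infty)$$

is a norm on X.

Proof. Denote $\sqrt{(\xi \,|\, \xi)}$ simply by $\|\xi\|$. The fact that $\|\xi\|$ is non-negative is clear. The implication $\|\xi\| = 0 \Rightarrow \xi = 0$ is also clear. Using (1) we have

$$\|\lambda\xi\| = \sqrt{(\lambda\xi \,|\, \lambda\xi)} = \sqrt{|\lambda|^2(\xi \,|\, \xi)} = |\lambda| \cdot \sqrt{(\xi \,|\, \xi)} = |\lambda| \cdot \|\xi\|, \quad \forall\, \xi \in X,\ \lambda \in \mathbb{K}.$$

Finally, for $\xi, \eta \in X$, we have

$$\begin{aligned} \|\xi + \eta\|^2 &= (\xi + \eta \,|\, \xi + \eta) = (\xi \,|\, \xi) + (\eta \,|\, \eta) + (\xi \,|\, \eta) + (\eta \,|\, \xi) \\ &= \|\xi\|^2 + \|\eta\|^2 + (\xi \,|\, \eta) + \overline{(\xi \,|\, \eta)} = \|\xi\|^2 + \|\eta\|^2 + 2\operatorname{Re}(\xi \,|\, \eta). \end{aligned}$$

We now use the C-B-S inequality, which reads

$$\big|(\xi \,|\, \eta)\big| \leq \|\xi\| \cdot \|\eta\|, \tag{3}$$

so the above computation gives

$$\begin{aligned} \|\xi + \eta\|^2 &= \|\xi\|^2 + \|\eta\|^2 + 2\operatorname{Re}(\xi \,|\, \eta) \leq \|\xi\|^2 + \|\eta\|^2 + 2\big|(\xi \,|\, \eta)\big| \\ &\leq \|\xi\|^2 + \|\eta\|^2 + 2\|\xi\| \cdot \|\eta\| = \big(\|\xi\| + \|\eta\|\big)^2, \end{aligned}$$

so we immediately get $\|\xi + \eta\| \leq \|\xi\| + \|\eta\|$.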

Definition. The norm constructed in the above result is called the norm defined by the inner product $(\,\cdot \,|\, \cdot\,)$.

Exercise 1. Use the above notations, and assume we have two vectors $\xi, \eta \neq 0$, such that $\|\xi + \eta\| = \|\xi\| + \|\eta\|$. Prove that there exists some $\lambda > 0$ such that $\xi = \lambda\eta$.

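Lemma 6.1. Let X be a $\mathbb{K}$-vector space, equipped with an inner product.

(i) [Parallelogram Law] $\|\xi + \eta\|^2 + \|\xi - \eta\|^2 = 2\big(\|\xi\|^2 + \|\eta\|^2\big)$, $\forall\, \xi, \eta \in X$.

(ii) [Polarization Identities]

(a) If $\mathbb{K} = \mathbb{R}$, then

$$(\xi \,|\, \eta) = \tfrac{1}{4}\big[\|\xi + \eta\|^2 - \|\xi - \eta\|^2\big], \quad \forall\, \xi, \eta \in X.$$

(b) If $\mathbb{K} = \mathbb{C}$, then

$$(\xi \,|\, \eta) = \tfrac{1}{4}\sum_{k=0}^{3} i^{-k}\|\xi + i^k\eta\|^2, \quad \forall\, \xi, \eta \in X.$$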

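Proof. (i). This is obvious, since (see the computations from the proof of Corollary 6.1)

$$\|\xi \pm \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 \pm 2\operatorname{Re}(\xi \,|\, \eta).$$

(ii).(a). In the real case, the above identity gives

$$\|\xi \pm \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 \pm 2(\xi \,|\, \eta),$$

so we immediately get

$$\|\xi + \eta\|^2 - \|\xi - \eta\|^2 = 4(\xi \,|\, \eta).$$

(b). For every $k \in \{0, 1, 2, 3\}$, we have

$$\|\xi + i^k\eta\|^2 = \|\xi\|^2 + \|\eta\|^2 + 2\operatorname{Re}(\xi \,|\, i^k\eta) = \|\xi\|^2 + \|\eta\|^2 + i^k(\xi \,|\, \eta) + i^{-k}(\eta \,|\, \xi).$$

Then, when we sum up, we have

$$\sum_{k=0}^{3} i^{-k}\|\xi + i^k\eta\|^2 = \big(\|\xi\|^2 + \|\eta\|^2\big)\sum_{k=0}^{3} i^{-k} + 4(\xi \,|\, \eta) + (\eta \,|\, \xi)\sum_{k=0}^{3} i^{-2k}.$$

Since

$$\sum_{k=0}^{3} i^{-k} = \sum_{k=0}^{3} i^{-2k} = 0,$$

the above computation proves that we indeed have

$$\sum_{k=0}^{3} i^{-k}\|\xi + i^k\eta\|^2 = 4(\xi \,|\, \eta).$$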
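Both identities of Lemma 6.1 can be tested numerically for complex n-tuples. The sketch below recovers the inner product from norms alone via the complex polarization identity, using $i^{-k} = \overline{i^k}$; the names `inner`, `norm_sq`, `polarization` and the sample vectors are hypothetical.

```python
def inner(xi, eta):
    """(xi | eta) = sum_j conj(xi_j) * eta_j, linear in the second variable."""
    return sum(x.conjugate() * e for x, e in zip(xi, eta))

def norm_sq(v):
    """Square of the norm defined by the inner product."""
    return inner(v, v).real

def polarization(xi, eta):
    """(1/4) * sum_{k=0}^{3} i^{-k} * ||xi + i^k eta||^2."""
    total = 0j
    for k in range(4):
        ik = 1j ** k
        shifted = [x + ik * e for x, e in zip(xi, eta)]
        total += ik.conjugate() * norm_sq(shifted)   # i^{-k} equals conj(i^k)
    return total / 4

xi = [1 + 2j, -0.5j, 3 + 0j]
eta = [2 - 1j, 1 + 1j, -1j]

plus = [x + e for x, e in zip(xi, eta)]
minus = [x - e for x, e in zip(xi, eta)]

print(polarization(xi, eta), inner(xi, eta))                             # agree up to rounding
print(norm_sq(plus) + norm_sq(minus), 2 * (norm_sq(xi) + norm_sq(eta)))  # Parallelogram Law
```

With an inner product that is linear in the first variable instead, the factors $i^{-k}$ and $i^k$ in the identity would trade places.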

Corollary 6.2. Let X be a $\mathbb{K}$-vector space equipped with an inner product $(\,\cdot \,|\, \cdot\,)$. Then the map

$$X \times X \ni (\xi, \eta) \longmapsto (\xi \,|\, \eta) \in \mathbb{K}$$

is continuous, with respect to the product topology.

Proof. Immediate from the polarization identities.

Corollary 6.3. Let X and Y be two $\mathbb{K}$-vector spaces equipped with inner products $(\,\cdot \,|\, \cdot\,)_X$ and $(\,\cdot \,|\, \cdot\,)_Y$. If $T : X \to Y$ is an isometric linear map, then

$$(T\xi \,|\, T\eta)_Y = (\xi \,|\, \eta)_X, \quad \forall\, \xi, \eta \in X.$$

Proof. Immediate from the polarization identities.


Exercise 2. Let X be a normed $\mathbb{K}$-vector space. Assume the norm satisfies the Parallelogram Law. Prove that there exists an inner product $(\,\cdot \,|\, \cdot\,)$ on X, such that

$$\|\xi\| = \sqrt{(\xi \,|\, \xi)}, \quad \forall\, \xi \in X.$$

Hint: Define the inner product by the Polarization Identity, and then prove that it is indeed an inner product.

Proposition 6.2. Let X be a $\mathbb{K}$-vector space, equipped with an inner product $(\,\cdot \,|\, \cdot\,)_X$. Let Z be the completion of X with respect to the norm defined by the inner product. Then Z carries a unique inner product $(\,\cdot \,|\, \cdot\,)_Z$, so that the norm on Z is defined by $(\,\cdot \,|\, \cdot\,)_Z$. Moreover, this inner product extends $(\,\cdot \,|\, \cdot\,)_X$, in the sense that

$$(\langle\xi\rangle \,|\, \langle\eta\rangle)_Z = (\xi \,|\, \eta)_X, \quad \forall\, \xi, \eta \in X.$$

Proof. It is obvious that the norm on Z satisfies the Parallelogram Law. We then apply Exercise 2.

Definitions. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A Hilbert space over $\mathbb{K}$ is a $\mathbb{K}$-vector space, equipped with an inner product, which is complete with respect to the norm defined by the inner product. Some textbooks use the term Euclidean for real Hilbert spaces, and reserve the term Hilbert only for the complex case.

Examples 6.1. For I a non-empty set, the space $\ell^2_{\mathbb{K}}(I)$ is a Hilbert space. We know that this is a Banach space. The inner product defining the norm is

$$(\alpha \,|\, \beta) = \sum_{j \in I} \overline{\alpha(j)}\,\beta(j), \quad \forall\, \alpha, \beta \in \ell^2_{\mathbb{K}}(I).$$

The fact that the function $\bar{\alpha}\beta : I \to \mathbb{K}$ is summable is a consequence of Hölder’s inequality.

More generally, a Banach space whose norm satisfies the Parallelogram Law is a Hilbert space.

Definitions. Let X be a $\mathbb{K}$-vector space, equipped with an inner product $(\,\cdot \,|\, \cdot\,)$. Two vectors $\xi, \eta \in X$ are said to be orthogonal, if $(\xi \,|\, \eta) = 0$. In this case we write $\xi \perp \eta$. Given a set $M \subset X$, and a vector $\xi \in X$, we write $\xi \perp M$, if

$$\xi \perp \eta, \quad \forall\, \eta \in M.$$

Finally, two subsets $M, N \subset X$ are said to be orthogonal, and we write $M \perp N$, if

$$\xi \perp \eta, \quad \forall\, \xi \in M,\ \eta \in N.$$

Notation. Let X be a vector space equipped with an inner product. For a subset $M \subset X$, we define the set

$$M^\perp = \{\xi \in X : \xi \perp M\}.$$

Remarks 6.1. Let X be a $\mathbb{K}$-vector space equipped with an inner product.

A. The relation ⊥ is symmetric.

B. If $\xi, \eta \in X$ satisfy $\xi \perp \eta$, then one has the Pythagorean Theorem:

$$\|\xi + \eta\|^2 = \|\xi\|^2 + \|\eta\|^2.$$

This is a consequence of the equality $\|\xi + \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 + 2\operatorname{Re}(\xi \,|\, \eta)$.


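C. If $M \subset X$ is an arbitrary subset, then $M^\perp$ is a closed linear subspace of X. This follows from the linearity of the inner product in the second variable, and from the continuity.

D. For sets $M \subset N \subset X$, one has

$$M^\perp \supset N^\perp.$$

E. For any set $M \subset X$, one has

$$M^\perp = \big(\overline{\operatorname{Span}}\, M\big)^\perp,$$

where $\overline{\operatorname{Span}}\, M$ denotes the norm closure of the linear span of M. The inclusion

$$M^\perp \supset \big(\overline{\operatorname{Span}}\, M\big)^\perp$$

is trivial, since we have $M \subset \overline{\operatorname{Span}}\, M$. Conversely, if $\xi \in M^\perp$, then $M \subset \{\xi\}^\perp$. But since $\{\xi\}^\perp$ is a closed linear subspace, this gives

$$\overline{\operatorname{Span}}\, M \subset \{\xi\}^\perp,$$

i.e. $\xi \in \big(\overline{\operatorname{Span}}\, M\big)^\perp$.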

The following result gives a very interesting property of Hilbert spaces.

Proposition 6.3. Let H be a Hilbert space, and let $C \subset H$ be a non-empty closed convex set. For every $\xi \in H$, there exists a unique vector $\xi' \in C$, such that

$$\|\xi - \xi'\| = \operatorname{dist}(\xi, C).$$

Proof. Denote $\operatorname{dist}(\xi, C)$ simply by δ. By definition, we have

$$\delta = \inf_{\eta \in C} \|\xi - \eta\|.$$

Choose a sequence $(\eta_n)_{n \geq 1} \subset C$, such that $\lim_{n\to\infty} \|\xi - \eta_n\| = \delta$.

Claim: One has the inequality

$$\|\eta_m - \eta_n\|^2 \leq 2\|\xi - \eta_m\|^2 + 2\|\xi - \eta_n\|^2 - 4\delta^2, \quad \forall\, m, n \geq 1.$$

Use the Parallelogram Law

$$2\|\xi - \eta_m\|^2 + 2\|\xi - \eta_n\|^2 = \|2\xi - \eta_m - \eta_n\|^2 + \|\eta_m - \eta_n\|^2. \tag{4}$$

We notice that, since $\frac{1}{2}(\eta_m + \eta_n) \in C$, we have

$$\big\|\xi - \tfrac{1}{2}(\eta_m + \eta_n)\big\| \geq \delta,$$

so we get

$$\|2\xi - \eta_m - \eta_n\|^2 = 4\big\|\xi - \tfrac{1}{2}(\eta_m + \eta_n)\big\|^2 \geq 4\delta^2,$$

so if we go back to (4) we get

$$2\|\xi - \eta_m\|^2 + 2\|\xi - \eta_n\|^2 = \|2\xi - \eta_m - \eta_n\|^2 + \|\eta_m - \eta_n\|^2 \geq 4\delta^2 + \|\eta_m - \eta_n\|^2,$$

and the Claim follows.

Having proven the Claim, we now notice that, since $\lim_{n\to\infty} \|\xi - \eta_n\| = \delta$, we immediately get the fact that the sequence $(\eta_n)_{n \geq 1}$ is Cauchy. Since H is complete, it follows that the sequence is convergent to some point $\xi'$. Since C is closed, it follows that $\xi' \in C$. So far we have

$$\|\xi - \xi'\| = \lim_{n\to\infty} \|\xi - \eta_n\| = \delta = \operatorname{dist}(\xi, C),$$

thus proving the existence.


Let us prove now the uniqueness. Assume $\xi'' \in C$ is another point such that $\|\xi - \xi''\| = \delta$. Using the Parallelogram Law, we have

$$4\delta^2 = 2\|\xi - \xi'\|^2 + 2\|\xi - \xi''\|^2 = \|2\xi - \xi' - \xi''\|^2 + \|\xi' - \xi''\|^2.$$

If $\xi' \neq \xi''$, then we will have

$$4\delta^2 > \|2\xi - \xi' - \xi''\|^2 = 4\big\|\xi - \tfrac{1}{2}(\xi' + \xi'')\big\|^2,$$

so we have a new vector $\eta = \frac{1}{2}(\xi' + \xi'') \in C$, such that

$$\|\xi - \eta\| < \delta,$$

thus contradicting the definition of δ.
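A concrete instance of Proposition 6.3 is the closed unit square $C = [0,1]^2$ in the Euclidean plane, where the nearest point is obtained by clamping each coordinate. The clamping formula is a standard fact about boxes and is not derived in these notes; the sketch below only compares it against a finite grid of points of C, with hypothetical names and a hypothetical sample point.

```python
import math

def project_to_box(point, lower=0.0, upper=1.0):
    """Nearest point of the box [lower, upper]^n: clamp every coordinate."""
    return [min(upper, max(lower, c)) for c in point]

def dist(a, b):
    """Euclidean distance between two points of R^n."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

xi = [1.5, -0.25]                       # a point outside C = [0, 1]^2
nearest = project_to_box(xi)            # candidate for the vector xi' of the proposition
delta = dist(xi, nearest)

# Compare with the distances to a finite grid of points of C.
grid = [[i / 10.0, j / 10.0] for i in range(11) for j in range(11)]
min_on_grid = min(dist(xi, eta) for eta in grid)

print(nearest, delta, min_on_grid)
```

A grid comparison is of course not a proof of minimality; it only shows that no sampled point of C is closer to ξ than the clamped point.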

Definition. Let H be a Hilbert space, and let $X \subset H$ be a closed linear subspace. For every $\xi \in H$, using the above result, we let $P_X\xi \in X$ denote the unique vector in X with the property

$$\|\xi - P_X\xi\| = \operatorname{dist}(\xi, X).$$

This way we have constructed a map $P_X : H \to H$, which is called the orthogonal projection onto X.

The properties of the orthogonal projection are summarized in the following result.

Proposition 6.4. Let H be a Hilbert space, and let $X \subset H$ be a closed linear subspace.

(i) For vectors $\xi \in H$ and $\zeta \in X$ one has the equivalence

$$\zeta = P_X\xi \iff (\xi - \zeta) \perp X.$$

(ii) $P_X\big|_X = \mathrm{Id}_X$.
(iii) The map $P_X : H \to X$ is linear, continuous. If $X \neq \{0\}$, then $\|P_X\| = 1$.
(iv) $\operatorname{Ran} P_X = X$ and $\operatorname{Ker} P_X = X^\perp$.

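Proof. (i). “⇒.” Assume $\zeta = P_X\xi$. Fix an arbitrary vector $\eta \in X \smallsetminus \{0\}$, and choose a number $\lambda \in \mathbb{K}$, with $|\lambda| = 1$, such that

$$\lambda(\xi - \zeta \,|\, \eta) = \big|(\xi - \zeta \,|\, \eta)\big|.$$

In particular, we have

$$\big|(\xi - \zeta \,|\, \eta)\big| = \operatorname{Re}(\xi - \zeta \,|\, \lambda\eta).$$

Define the map $F : \mathbb{R} \to \mathbb{R}$ by

$$F(t) = \|\xi - \zeta - t\lambda\eta\|^2 - \|\xi - \zeta\|^2.$$

Since $\zeta + t\lambda\eta \in X$ and $\zeta + t\lambda\eta \neq \zeta$ for $t \neq 0$, the definition of $\zeta = P_X\xi$ (together with the uniqueness in Proposition 6.3) gives

$$F(t) > 0, \quad \forall\, t \in \mathbb{R} \smallsetminus \{0\}. \tag{5}$$

Notice that $F(t) = at^2 + bt$, $\forall\, t \in \mathbb{R}$, where $a = (\lambda\eta \,|\, \lambda\eta) = \|\eta\|^2$, and $b = -2\operatorname{Re}(\xi - \zeta \,|\, \lambda\eta) = -2\big|(\xi - \zeta \,|\, \eta)\big|$. Of course, the property

$$at^2 + bt > 0, \quad \forall\, t \in \mathbb{R} \smallsetminus \{0\}$$

forces $b = 0$, so we indeed get $(\xi - \zeta \,|\, \eta) = 0$.

“⇐.” Assume $(\xi - \zeta) \perp X$. For any $\eta \in X$, we have $(\xi - \zeta) \perp (\zeta - \eta)$, so using the Pythagorean Theorem, we get

$$\|\xi - \eta\|^2 = \|\xi - \zeta\|^2 + \|\zeta - \eta\|^2,$$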


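which forces

$$\|\xi - \eta\| \geq \|\xi - \zeta\|, \quad \forall\, \eta \in X.$$

This proves that

$$\|\xi - \zeta\| = \operatorname{dist}(\xi, X),$$

i.e. $\zeta = P_X\xi$.

(ii). This property is pretty clear. If $\xi \in X$, then $0 = \xi - \xi$ is orthogonal to X, so by (i) we get $\xi = P_X\xi$.

(iii). We prove the linearity of $P_X$. Start with vectors $\xi_1, \xi_2 \in H$ and a scalar $\lambda \in \mathbb{K}$. Take $\zeta_1 = P_X\xi_1$ and $\zeta_2 = P_X\xi_2$. Consider the vector $\zeta = \lambda\zeta_1 + \zeta_2$. By (i) we have $(\xi_1 - \zeta_1) \perp X$ and $(\xi_2 - \zeta_2) \perp X$, so for any $\eta \in X$, we have

$$\begin{aligned} (\lambda\xi_1 + \xi_2 - \zeta \,|\, \eta) &= \big((\lambda\xi_1 - \lambda\zeta_1) + (\xi_2 - \zeta_2) \,\big|\, \eta\big) \\ &= (\lambda\xi_1 - \lambda\zeta_1 \,|\, \eta) + (\xi_2 - \zeta_2 \,|\, \eta) = \bar{\lambda}(\xi_1 - \zeta_1 \,|\, \eta) + (\xi_2 - \zeta_2 \,|\, \eta) = 0. \end{aligned}$$

This computation proves that

$$(\lambda\xi_1 + \xi_2 - \zeta) \perp X,$$

so using (i) again we get

$$P_X(\lambda\xi_1 + \xi_2) = \zeta = \lambda\zeta_1 + \zeta_2 = \lambda P_X\xi_1 + P_X\xi_2,$$

so $P_X$ is indeed linear.

To prove the continuity, we start with an arbitrary vector $\xi \in H$ and we use the fact that $(\xi - P_X\xi) \perp P_X\xi$. By the Pythagorean Theorem we then have

$$\|\xi\|^2 = \|(\xi - P_X\xi) + P_X\xi\|^2 = \|\xi - P_X\xi\|^2 + \|P_X\xi\|^2 \geq \|P_X\xi\|^2.$$

In other words, we have

$$\|P_X\xi\| \leq \|\xi\|, \quad \forall\, \xi \in H,$$

so $P_X$ is indeed continuous, and we have $\|P_X\| \leq 1$. Using (ii) we immediately get that, when $X \neq \{0\}$, we have $\|P_X\| = 1$.

(iv). The equality $\operatorname{Ran} P_X = X$ is trivial by the construction of $P_X$ and by (ii). If $\xi \in \operatorname{Ker} P_X$, then by (i), we have $\xi \in X^\perp$. Conversely, if $\xi \perp X$, then $\zeta = 0$ satisfies the condition in (i), i.e. $P_X\xi = 0$.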
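Property (i) gives a practical test for a candidate projection: a vector ζ in X equals $P_X\xi$ exactly when ξ − ζ is orthogonal to X. The sketch below works in real 3-space and assumes X is given by an orthonormal basis $(e_k)$, in which case the standard formula $\zeta = \sum_k (e_k \,|\, \xi)\, e_k$ (not derived in the text above) produces such a ζ. The basis, the vector and the function names are hypothetical.

```python
import math

def inner(xi, eta):
    """Real inner product on R^n."""
    return sum(x * e for x, e in zip(xi, eta))

def project(xi, orthonormal_basis):
    """Candidate for P_X xi: sum_k (e_k | xi) e_k, for an orthonormal basis of X."""
    result = [0.0] * len(xi)
    for e in orthonormal_basis:
        coeff = inner(e, xi)
        result = [r + coeff * c for r, c in zip(result, e)]
    return result

s = 1.0 / math.sqrt(2.0)
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]   # orthonormal basis of a plane X in R^3
xi = [3.0, 1.0, -2.0]

zeta = project(xi, basis)
residual = [x - z for x, z in zip(xi, zeta)]

print(zeta)                                     # approximately [2, 2, -2]
print([inner(residual, e) for e in basis])      # approximately [0, 0]: (xi - zeta) is orthogonal to X
print(inner(zeta, zeta) <= inner(xi, xi))       # consistent with ||P_X xi|| <= ||xi||
```

Applying `project` to `zeta` again returns `zeta` up to rounding, in line with property (ii).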

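Corollary 6.4. If H is a Hilbert space, and $X \subset H$ is a closed linear subspace, then

$$X + X^\perp = H \quad\text{and}\quad X \cap X^\perp = \{0\}.$$

In other words the map

$$X \times X^\perp \ni (\eta, \zeta) \longmapsto \eta + \zeta \in H \tag{6}$$

is a linear isomorphism.

Proof. If $\xi \in H$ then $P_X\xi \in X$, and $\xi - P_X\xi \in X^\perp$, and then the equality

$$\xi = P_X\xi + (\xi - P_X\xi)$$

proves that $\xi \in X + X^\perp$. The equality $X \cap X^\perp = \{0\}$ is trivial, since for $\zeta \in X \cap X^\perp$, we must have $\zeta \perp \zeta$, which forces $\zeta = 0$.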

Exercise 3. Let H be a Hilbert space.

(i) Prove that, for any closed subspace X ⊂ H, one has the equality

PX⊥ = I − PX.


(ii) Prove that, for two closed subspaces $X, Y \subset H$, the following are equivalent:
– $X \perp Y$;
– $P_X P_Y = 0$;
– $P_Y P_X = 0$.

(iii) Prove that, for two closed subspaces $X, Y \subset H$, the following are equivalent:
– $X \subset Y$;
– $P_X P_Y = P_X$;
– $P_Y P_X = P_X$.

(iv) Prove that, if $X, Y \subset H$ are closed subspaces such that $X \perp Y$, then
– $X + Y$ is a closed linear subspace of H;
– $P_{X+Y} = P_X + P_Y$.

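Corollary 6.5. Let H be a Hilbert space, and let $X \subset H$ be a linear (not necessarily closed) subspace. Then one has the equality

$$\overline{X} = \big(X^\perp\big)^\perp.$$

Proof. Denote the closed subspace $\big(X^\perp\big)^\perp$ by Z. Since $X^\perp = \overline{X}^{\,\perp}$, by the previous exercise we have

$$P_Z = I - P_{X^\perp} = I - P_{\overline{X}^{\,\perp}} = I - (I - P_{\overline{X}}) = P_{\overline{X}},$$

which forces

$$Z = \operatorname{Ran} P_Z = \operatorname{Ran} P_{\overline{X}} = \overline{X}.$$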

Theorem 6.1 (Riesz’ Representation Theorem). Let H be a Hilbert space overK, and let φ : H → K be a linear continuous map. Then there exists a uniquevector ξ ∈ H, such that

φ(η) =(ξ∣∣ η )

, ∀ η ∈ H.

Moreover one has ‖ξ‖ = ‖φ‖.

Proof. First we show the existence. If φ = 0, we simply take ξ = 0. Assume φ ≠ 0. Define the subspace X = Ker φ. Notice that X is closed. Using the linear isomorphism (6) we see that the composition

X⊥ ↪ H → H/X

(the inclusion followed by the quotient map) is a linear isomorphism. Since

H/X = H/Ker φ ≃ Ran φ = K,

it follows that dim(X⊥) = 1. In other words, there exists ξ0 ∈ X⊥, ξ0 ≠ 0, such that

X⊥ = Kξ0.

Start now with some arbitrary vector η ∈ H. On the one hand, using the equality Kξ0 + X = H, there exist λ ∈ K and ζ ∈ X, such that

η = λξ0 + ζ,

and since ζ ∈ X = Kerφ, we get

φ(η) = φ(λξ0) = λφ(ξ0).

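On the other hand, we have

\[ (\xi_0 \mid \eta) = (\xi_0 \mid \lambda\xi_0) + (\xi_0 \mid \zeta) = \lambda\|\xi_0\|^2, \]

so if we define ξ = \overline{φ(ξ0)} ‖ξ0‖⁻² ξ0 (recall that the inner product is conjugate linear in the first variable) we will have

\[ (\xi \mid \eta) = \big(\overline{\varphi(\xi_0)}\,\|\xi_0\|^{-2}\xi_0 \mid \eta\big) = \varphi(\xi_0)\|\xi_0\|^{-2}(\xi_0 \mid \eta) = \lambda\varphi(\xi_0) = \varphi(\eta). \]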

To prove uniqueness, assume ξ′ ∈ H is another vector with

\[ \varphi(\eta) = (\xi' \mid \eta), \quad \forall\, \eta \in H. \]

In particular, we have

\[ \|\xi - \xi'\|^2 = (\xi - \xi' \mid \xi - \xi') = (\xi \mid \xi - \xi') - (\xi' \mid \xi - \xi') = \varphi(\xi - \xi') - \varphi(\xi - \xi') = 0, \]

which forces ξ = ξ′.

Finally, to prove the norm equality, we first observe that when ξ = 0, the equality is trivial. If ξ ≠ 0, then on the one hand, using the C-B-S inequality we have

\[ |\varphi(\eta)| = |(\xi \mid \eta)| \le \|\xi\| \cdot \|\eta\|, \quad \forall\, \eta \in H, \]

so we immediately get ‖φ‖ ≤ ‖ξ‖. If we take the vector ζ = ‖ξ‖⁻¹ξ, then ‖ζ‖ = 1, and

\[ \varphi(\zeta) = (\xi \mid \|\xi\|^{-1}\xi) = \|\xi\|, \]

so we also have ‖φ‖ ≥ ‖ξ‖.
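In finite dimensions the theorem can be verified directly. The following sketch assumes H = C³ with the inner product that is conjugate linear in the first variable; the coefficients `a` of the functional and the test vector `eta` are made up for illustration.

```python
import math

def inner(x, y):
    """(x | y) on C^n: conjugate linear in x, linear in y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

# A linear functional phi(eta) = sum_k a_k * eta_k (hypothetical coefficients).
a = [2 + 1j, -1j, 3 + 0j]

def phi(eta):
    return sum(ak * ek for ak, ek in zip(a, eta))

# Representing vector: xi_k = conj(a_k), so that (xi | eta) = phi(eta).
xi = [ak.conjugate() for ak in a]

eta = [1 - 2j, 4 + 0j, 0.5j]

# The unit vector zeta = xi / ||xi|| attains the norm of phi.
zeta = [x / norm(xi) for x in xi]
```

With these choices `inner(xi, eta)` agrees with `phi(eta)`, and `phi(zeta)` equals `norm(xi)` up to rounding, matching the two inequalities ‖φ‖ ≤ ‖ξ‖ and ‖φ‖ ≥ ‖ξ‖ in the proof.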

In the remainder of this section we discuss a Hilbert space notion of linear independence. This should be thought of as a “rigid” linear independence.

Definition. Let X be a K-vector space, equipped with an inner product. A set F ⊂ X is said to be orthogonal, if 0 ∉ F, and

ξ ⊥ η, ∀ ξ, η ∈ F, with ξ ≠ η.

A set F ⊂ X is said to be orthonormal, if it is orthogonal, but it also satisfies:

‖ξ‖ = 1, ∀ ξ ∈ F.

Remark that, if one starts with an orthogonal set F ⊂ X, then the set

F(1) = {‖ξ‖⁻¹ξ : ξ ∈ F}

is orthonormal.

Proposition 6.5. Let X be a K-vector space equipped with an inner product. Any orthogonal set F ⊂ X is linearly independent.

Proof. Indeed, if one starts with a vanishing linear combination

λ1ξ1 + · · · + λnξn = 0,

with λ1, . . . , λn ∈ K, ξ1, . . . , ξn ∈ F, such that ξk ≠ ξℓ, for all k, ℓ ∈ {1, . . . , n} with k ≠ ℓ, then for each k ∈ {1, . . . , n} we clearly have

\[ \lambda_k \|\xi_k\|^2 = (\xi_k \mid \lambda_1\xi_1 + \cdots + \lambda_n\xi_n) = 0, \]

and since ξk ≠ 0, we get λk = 0.

Lemma 6.2. Let X be a K-vector space equipped with an inner product, and let F ⊂ X be an orthogonal set. Then there exists a maximal (with respect to inclusion) orthogonal set G ⊂ X with F ⊂ G.

Proof. Consider the sets

A = {G : G orthogonal subset of X},

AF = {G ∈ A : G ⊃ F},

ordered with the inclusion. We are going to apply Zorn’s Lemma to AF. Let T ⊂ AF be a subcollection, which is totally ordered, i.e. for any G1, G2 ∈ T one has G1 ⊂ G2 or G1 ⊃ G2. Define the set

\[ M = \bigcup_{G \in T} G. \]

Since G ⊂ X ∖ {0}, for all G ∈ T, it is clear that M ⊂ X ∖ {0}. If ξ1, ξ2 ∈ M are vectors with ξ1 ≠ ξ2, then we can find G1, G2 ∈ T with ξ1 ∈ G1 and ξ2 ∈ G2. Using the fact that T is totally ordered, it follows that there is k ∈ {1, 2} such that ξ1, ξ2 ∈ Gk, so we indeed get ξ1 ⊥ ξ2. It is now clear that M ∈ AF, and M ⊃ G, for all G ∈ T. In other words, we have shown that every totally ordered subset of AF has an upper bound, in AF. By Zorn’s Lemma, AF has a maximal element. Finally, it is clear that any maximal element for AF is also a maximal element in A.

Remark 6.2. Using the notations from the proof above, given an orthonormal set M ⊂ X, the following are equivalent:

(i) M is maximal in A;
(ii) M is maximal in

A(1) = {G : G orthonormal subset of X}.

The implication (i) ⇒ (ii) is trivial. Conversely, if M is maximal in A(1), we use the Lemma to find a maximal N ∈ A with N ⊃ M. But then N(1) is orthonormal, and N(1) ⊃ M, which by the maximality of M in A(1) will force N(1) = M. Since N is linearly independent, the relations

N(1) = M ⊂ N,

will force N = N(1) = M.

Comment. In linear algebra we know that a linearly independent set is maximal, if and only if it spans the whole space. In the case of orthogonal sets, this statement has a version described by the following result.

Theorem 6.2. Let H be a Hilbert space, and let F be an orthogonal set in H.The following are equivalent:

(i) F is maximal among all orthogonal subsets of H;
(ii) Span F is dense in H in the norm topology.

Proof. (i) ⇒ (ii). Assume F is maximal. We are going to show that Span F is dense in H, by contradiction. Denote the closure of Span F simply by X, and assume X ⊊ H. Since

X = (X⊥)⊥,

we see that the strict inclusion X ⊊ H forces X⊥ ≠ {0}. But now if we take a non-zero vector ξ ∈ X⊥, we immediately see that the set F ∪ {ξ} is still orthogonal, thus contradicting the maximality of F.

(ii) ⇒ (i). Assume Span F is dense in H, and let us prove that F is maximal. We do this by contradiction. If F is not maximal, then there exists ξ ∈ H ∖ F, such that F ∪ {ξ} is still orthogonal. This would force ξ ⊥ F, so we will also have

ξ ⊥ Span F.

But since Span F is dense in H, this will give ξ ⊥ H. In particular we have ξ ⊥ ξ, which would force ξ = 0, thus contradicting the fact that F ∪ {ξ} is orthogonal. (Recall that all elements of an orthogonal set are non-zero.)

Definition. Let H be a Hilbert space. An orthonormal set B ⊂ H, which is maximal among all orthogonal (or orthonormal) subsets of H, is called an orthonormal basis for H.

By Lemma 6.2 and Remark 6.2, we know that given any orthonormal set F ⊂ H, there exists an orthonormal basis B ⊃ F.

By Theorem 6.2, an orthonormal set B ⊂ H is an orthonormal basis for H, if and only if Span B is dense in H.

Example 6.2. Let I be a non-empty set. Consider the Hilbert space ℓ²K(I). Consider (see section II.2) the set

B = {δi : i ∈ I}.

Then

Span B = finK(I),

which is dense in ℓ²K(I). The above result then says that B is an orthonormal basis for ℓ²K(I).

The following exercise will be useful in the discussion of another interesting example.

Exercise 4. Equip the space C([0, 1]) with the inner product

\[ (f \mid g) = \int_0^1 \overline{f(t)}\, g(t)\, dt, \quad f, g \in C([0,1]). \]

The norm defined by this inner product is

\[ \|f\|_2 = \Big( \int_0^1 |f(t)|^2\, dt \Big)^{1/2}, \quad f \in C([0,1]). \]

Define the maps en : [0, 1] ∋ t ⟼ exp(2nπit) ∈ T, n ∈ Z. (Here T denotes the unit circle in C.) Prove that the set

B = {en : n ∈ Z}

is orthonormal in C([0, 1]), and Span B is dense in C([0, 1]) in the topology defined by the norm ‖ · ‖2.

Hints: Define the space

P = {f ∈ C([0, 1]) : f(0) = f(1)}.

Prove that P is dense in C([0, 1]) in the topology defined by the norm ‖ · ‖2.

Prove that the map

Φ : C(T) ∋ F ⟼ F ∘ e1 ∈ P

is a linear isomorphism, which is isometric with respect to the uniform norms.

In order to prove that Span B is dense in C([0, 1]) with respect to ‖ · ‖2, it suffices to show that Span B is dense in P in the uniform norm. Equivalently, it suffices to show that

Φ⁻¹(Span B)

is dense in C(T), with respect to the uniform norm. To get this density use the Stone-Weierstrass Theorem, plus the fact that the functions ζn = Φ⁻¹(en) ∈ C(T) are defined by

ζn(z) = zⁿ, ∀ z ∈ T, n ∈ Z.

Example 6.3. We define L²([0, 1]) to be the completion of C([0, 1]) with respect to the norm ‖ · ‖2. Regard C([0, 1]) as a dense linear subspace in L²([0, 1]), so we also regard

B = {en : n ∈ Z}

as a subset in L²([0, 1]). Then Span B is dense in L²([0, 1]), so B is an orthonormal basis for L²([0, 1]).

Lemma 6.3. Let B be an orthonormal basis for the Hilbert space H, and let F ⊊ B be an arbitrary non-empty subset. Below, \overline{Span} denotes the closed linear span.

(i) F is an orthonormal basis for the Hilbert space \overline{Span} F.
(ii) (\overline{Span} F)⊥ = \overline{Span}(B ∖ F).

Proof. (i). This is clear, since F is orthonormal and has dense span.

(ii). Denote for simplicity \overline{Span} F = X and \overline{Span}(B ∖ F) = Y. Since

ξ ⊥ η, ∀ ξ ∈ F, η ∈ B ∖ F,

it is pretty obvious that X ⊥ Y. Since X + Y clearly contains Span B, it follows that X + Y is dense in H. We know however (see Exercise 3 (iv)) that X + Y is closed, so we have in fact the equality

X + Y = H.

This will then give

I = PH = PX + PY,

so we get

PY = I − PX = PX⊥,

so

X⊥ = Ran PX⊥ = Ran PY = Y.

Theorem 6.3. Let H be a Hilbert space, and let B be an orthonormal basis for H, labelled⁶ as B = {ξj : j ∈ I}. For every vector η ∈ H, let αη : I → K be the map defined by

\[ \alpha_\eta(j) = (\xi_j \mid \eta), \quad \forall\, j \in I. \]

(i) For every η ∈ H, the map αη belongs to ℓ²K(I).
(ii) The map

T : H ∋ η ⟼ αη ∈ ℓ²K(I)

is an isometric linear isomorphism.

Proof. (i). Fix for the moment η ∈ H. We must show that

\[ \sup\Big\{ \sum_{j\in F} |\alpha_\eta(j)|^2 : F \subset I,\ F \text{ finite} \Big\} < \infty. \]

For any non-empty finite subset F ⊂ I, we define the subspace

HF = Span{ξj : j ∈ F},

and define the vector

\[ \eta_F = \sum_{j\in F} (\xi_j \mid \eta)\, \xi_j. \]

⁶ This notation implicitly assumes that ξj ≠ ξk, for all j, k ∈ I with j ≠ k.

Claim: For every finite set F ⊂ I, one has the equality

\[ \eta_F = P_{H_F}\eta. \]

It suffices to prove that

(η − ηF) ⊥ HF.

But this is obvious, since if we start with some k ∈ F, then using the fact that (ξk | ξj) = 0, for all j ∈ F ∖ {k}, together with the equality ‖ξk‖ = 1, we get

\[ (\xi_k \mid \eta - \eta_F) = (\xi_k \mid \eta) - \sum_{j\in F} (\xi_j \mid \eta)\,(\xi_k \mid \xi_j) = (\xi_k \mid \eta) - (\xi_k \mid \eta)\,(\xi_k \mid \xi_k) = 0. \]

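Having proven the Claim, let us observe that, since the terms in the sum that defines ηF are all orthogonal, we get

\[ \|\eta_F\|^2 = \sum_{j\in F} \|(\xi_j \mid \eta)\,\xi_j\|^2 = \sum_{j\in F} |(\xi_j \mid \eta)|^2 \cdot \|\xi_j\|^2 = \sum_{j\in F} |\alpha_\eta(j)|^2. \]

Combining this computation with the Claim, we now have

\[ \sum_{j\in F} |\alpha_\eta(j)|^2 = \|\eta_F\|^2 = \|P_{H_F}\eta\|^2 \le \|\eta\|^2, \]

which proves that

\[ \sup\Big\{ \sum_{j\in F} |\alpha_\eta(j)|^2 : F \subset I,\ F \text{ finite} \Big\} \le \|\eta\|^2. \]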

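(ii). The linearity of T is obvious. The above inequality actually proves that

‖Tη‖ ≤ ‖η‖, ∀ η ∈ H.

We now prove that in fact T is isometric. Since T is linear and continuous, and Span B is dense in H, it suffices to prove that the restriction of T to Span B is isometric. Start with some vector η ∈ Span B, which means that there exists some finite set F ⊂ I, and scalars (λk)k∈F ⊂ K, such that η = ∑_{k∈F} λkξk. Remark that

\[ (\xi_j \mid \eta) = \sum_{k\in F} \lambda_k (\xi_j \mid \xi_k) = \begin{cases} \lambda_j & \text{if } j \in F \\ 0 & \text{if } j \in I \smallsetminus F \end{cases} \]

so the element αη = Tη ∈ ℓ²K(I) is defined by

\[ \alpha_\eta(k) = \begin{cases} \lambda_k & \text{if } k \in F \\ 0 & \text{if } k \in I \smallsetminus F \end{cases} \]

This gives

\[ \|\eta\|^2 = \sum_{j,k\in F} \overline{\lambda_j}\,\lambda_k\,(\xi_j \mid \xi_k) = \sum_{k\in F} |\lambda_k|^2 = \sum_{k\in F} |\alpha_\eta(k)|^2 = \|\alpha_\eta\|^2, \]

so we indeed get

‖η‖ = ‖Tη‖, ∀ η ∈ Span B.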

Let us prove that T is surjective. Notice that the above computation, applied to singleton sets F = {k}, k ∈ I, proves that

Tξk = δk, ∀ k ∈ I.

In particular, we have

\[ \operatorname{Ran} T \supset T(\operatorname{Span} B) = \operatorname{Span} T(B) = \operatorname{Span}\{T\xi_k : k \in I\} = \operatorname{Span}\{\delta_k : k \in I\} = \operatorname{fin}_K(I), \]

which proves that Ran T is dense in ℓ²K(I). We know however that T is isometric, so Ran T ⊂ ℓ²K(I) is closed. This forces Ran T = ℓ²K(I).

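Corollary 6.6 (Parseval Identity). Let H be a Hilbert space, and let B = {ξj : j ∈ I} be an orthonormal basis for H. One has:

\[ (\zeta \mid \eta) = \sum_{j\in I} (\zeta \mid \xi_j)\,(\xi_j \mid \eta), \quad \forall\, \zeta, \eta \in H. \]

Proof. If we define α(j) = (ξj | ζ) and β(j) = (ξj | η), ∀ j ∈ I, then by construction we have α = Tζ and β = Tη. Since (ζ | ξj) is the complex conjugate of α(j), and T is isometric (hence preserves inner products), the right hand side of the above equality is then equal to

\[ \sum_{j\in I} \overline{\alpha(j)}\,\beta(j) = (\alpha \mid \beta) = (T\zeta \mid T\eta) = (\zeta \mid \eta). \]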
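A two dimensional sanity check of the Parseval Identity may help fix the conventions. The sketch below assumes H = C² with the inner product conjugate linear in the first variable; the orthonormal basis and the vectors `zeta`, `eta` are arbitrary illustrative choices.

```python
import math

def inner(x, y):
    """(x | y) on C^n: conjugate linear in x, linear in y."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

s = 1 / math.sqrt(2)
# An orthonormal basis of C^2 (hypothetical example): (1, i)/sqrt(2), (1, -i)/sqrt(2).
basis = [[s, s * 1j], [s, -s * 1j]]

zeta = [1 + 2j, 3 - 1j]
eta = [-2j, 0.5 + 1j]

alpha = [inner(xi_j, zeta) for xi_j in basis]   # alpha = T(zeta)
beta = [inner(xi_j, eta) for xi_j in basis]     # beta  = T(eta)

lhs = inner(zeta, eta)
rhs = sum(a.conjugate() * b for a, b in zip(alpha, beta))
```

The two numbers `lhs` and `rhs` agree up to rounding. Taking `zeta = eta` gives the isometry statement ‖η‖² = ∑ |αη(j)|² of the theorem.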

Notation. Let H be a Hilbert space, let B = {ξj : j ∈ I} be an orthonormal basis for H, and let T : H → ℓ²K(I) be the isometric linear isomorphism defined in the previous theorem. Given an element α ∈ ℓ²K(I), we denote the vector T⁻¹α ∈ H by

\[ \sum_{j\in I} \alpha(j)\,\xi_j. \]

The summation notation is justified by the following fact.

Proposition 6.6. With the above notations, for every ε > 0, there exists some finite subset Fε ⊂ I, such that

\[ \Big\| \sum_{j\in I} \alpha(j)\xi_j - \sum_{k\in F} \alpha(k)\xi_k \Big\| < \varepsilon, \quad \text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon. \]

Proof. Define the vector η = ∑_{j∈I} α(j)ξj. By construction we have Tη = α. Likewise, if we define, for each finite set F ⊂ I, the element αF ∈ ℓ²K(I) by

\[ \alpha_F(k) = \begin{cases} \alpha(k) & \text{if } k \in F \\ 0 & \text{if } k \in I \smallsetminus F \end{cases} \]

then T⁻¹αF = ∑_{k∈F} α(k)ξk. Using the fact that T is an isometry, we have

‖η − T⁻¹αF‖ = ‖Tη − αF‖ = ‖α − αF‖,

and the desired property follows from the well-known properties of ℓ²K(I).

Exercise 5. Let H be a Hilbert space, let F = {ξj : j ∈ J} be an orthonormal set. Define the closed linear subspace HF to be the closure of Span F. Prove that the orthogonal projection PHF is defined by

\[ P_{H_F}\eta = \sum_{j\in J} (\xi_j \mid \eta)\,\xi_j, \quad \forall\, \eta \in H. \]

Hints: Extend F to an orthonormal basis B. Let B be labelled as {ξi : i ∈ I} for some set I ⊃ J. First prove that for any η ∈ H, the map βη = Tη|J (the restriction of Tη to J) belongs to ℓ²K(J). In particular, the sum

\[ \eta_F = \sum_{j\in J} (\xi_j \mid \eta)\,\xi_j \]

is “legitimate” and defines an element in HF (use the fact that F is an orthonormal basis for HF). Finally, prove that (η − ηF) ⊥ F, using Parseval Identity.

Example 6.4. Let us analyze the space L²([0, 1]). Use the orthonormal basis {en : n ∈ Z} defined by

en(t) = exp(2nπit), ∀ t ∈ [0, 1], n ∈ Z.

For any f ∈ C([0, 1]) we define

\[ \hat f(n) = \int_0^1 \exp(-2n\pi i t)\, f(t)\, dt = (e_n \mid f). \]

We then know that

\[ f = \sum_{n\in\mathbb{Z}} \hat f(n)\, e_n. \]

One can think of the right hand side as a series, but the reader should be aware of the fact that this series is convergent only in the norm ‖ · ‖2. One can define for example, for any N ≥ 1, a partial sum fN : [0, 1] → C by

\[ f_N(t) = \sum_{n=-N}^{N} \hat f(n) \exp(2n\pi i t), \quad t \in [0,1]. \]

We will have

\[ \lim_{N\to\infty} \|f - f_N\|_2 = 0, \]

but in general there are (many) values of t ∈ [0, 1] for which the limit lim_{N→∞} fN(t) does not exist. One can consider a formal infinite series

\[ (7) \qquad \sum_{n=-\infty}^{\infty} \hat f(n) \exp(2n\pi i t). \]

Although this series is not convergent (pointwise) for all t ∈ [0, 1], it plays an important role in analysis. The series (7) is called the complex Fourier series of f.

Note that Parseval’s Identity gives

\[ \int_0^1 \overline{f(t)}\, g(t)\, dt = \sum_{n=-\infty}^{\infty} \overline{\hat f(n)}\, \hat g(n). \]

One can construct another orthonormal basis for L²([0, 1]), by taking real and imaginary parts of en. More explicitly, we define the sequences of functions (gn)n≥0 and (hn)n≥1 by

g0(t) = 1, ∀ t ∈ [0, 1];
gn(t) = √2 cos(2nπt), ∀ t ∈ [0, 1], n ≥ 1;
hn(t) = √2 sin(2nπt), ∀ t ∈ [0, 1], n ≥ 1.

Then B′ = {gn : n ≥ 0} ∪ {hn : n ≥ 1} is again an orthonormal basis for L²([0, 1]). (It is clear that B′ is orthonormal, and Span B′ ∋ en, ∀ n ∈ Z, so Span B′ is dense in L²([0, 1]).) For f ∈ C([0, 1]) one can then define its real Fourier series

\[ \hat f(0) + \sum_{n=1}^{\infty} \big[ a_n g_n(t) + b_n h_n(t) \big] = \hat f(0) + \sqrt{2} \sum_{n=1}^{\infty} \big[ a_n \cos(2n\pi t) + b_n \sin(2n\pi t) \big], \]

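where

\[ a_n = \sqrt{2} \int_0^1 f(t)\cos(2n\pi t)\, dt \quad\text{and}\quad b_n = \sqrt{2} \int_0^1 f(t)\sin(2n\pi t)\, dt, \quad \forall\, n \ge 1. \]

Note that

\[ a_n = \frac{\sqrt{2}}{2}\,\big[\hat f(-n) + \hat f(n)\big] \quad\text{and}\quad b_n = \frac{\sqrt{2}}{2i}\,\big[\hat f(-n) - \hat f(n)\big], \quad \forall\, n \ge 1. \]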
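The relations between the complex and the real coefficients, and the norm identity behind Parseval, can be checked numerically. The sketch below is an illustration only: the sample function f(t) = t, the midpoint quadrature and the truncation level N = 50 are all assumptions made for the example, and the integrals are approximations.

```python
import cmath
import math

def integrate(g, m=4000):
    """Midpoint rule approximation of the integral of g over [0, 1]."""
    h = 1.0 / m
    return h * sum(g((k + 0.5) * h) for k in range(m))

def fourier_coeff(f, n):
    """Approximates (e_n | f) = integral of exp(-2 n pi i t) f(t) dt."""
    return integrate(lambda t: cmath.exp(-2j * math.pi * n * t) * f(t))

def f(t):
    return t   # sample function, chosen only for illustration

n = 3
a_n = math.sqrt(2) * integrate(lambda t: f(t) * math.cos(2 * n * math.pi * t))
b_n = math.sqrt(2) * integrate(lambda t: f(t) * math.sin(2 * n * math.pi * t))

# The same numbers obtained from the complex coefficients.
a_from_hat = math.sqrt(2) / 2 * (fourier_coeff(f, -n) + fourier_coeff(f, n))
b_from_hat = math.sqrt(2) / 2j * (fourier_coeff(f, -n) - fourier_coeff(f, n))

# Truncated Parseval sum versus ||f||_2^2 (which is 1/3 for f(t) = t).
N = 50
bessel_sum = sum(abs(fourier_coeff(f, k)) ** 2 for k in range(-N, N + 1))
norm_sq = integrate(lambda t: abs(f(t)) ** 2)
```

The truncated sum `bessel_sum` stays below `norm_sq` and approaches it as N grows, which reflects convergence in the norm ‖ · ‖2 and says nothing about pointwise convergence.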

The next result discusses the appropriate notion of dimension for Hilbert spaces.

Theorem 6.4. Let H be a Hilbert space. Then any two orthonormal bases of H have the same cardinality.

Proof. Fix two orthonormal bases B and B′. There are two possible cases.

Case I: One of the sets B or B′ is finite.

In this case H is finite dimensional, since the linear span of a finite set is automatically closed. Since both B and B′ are linearly independent, it follows that both B and B′ are finite, hence their linear spans are both closed. It follows that

Span B = Span B′ = H,

so B and B′ are in fact linear bases for H, and then we get

Card B = Card B′ = dim H.

Case II: Both B and B′ are infinite.

The key step we need in this case is the following.

Claim 1: There exists a dense subset Z ⊂ H, with

Card Z = Card B′.

To prove this fact, we define the set

X = SpanQ B′.

It is clear that

Card X = Card B′.

Notice that X is dense in SpanR B′. If we work over K = R, then we are done. If we work over K = C, we define

Z = X + iX,

and we will still have

Card Z = Card X = Card B′.

Now we are done, since clearly Z is dense in SpanC B′.

Choose Z as in Claim 1. For every ξ ∈ B we choose a vector ζξ ∈ Z, such that

\[ \|\xi - \zeta_\xi\| \le \frac{\sqrt{2} - 1}{2}. \]

Claim 2: The map B ∋ ξ ⟼ ζξ ∈ Z is injective.

Start with two vectors ξ1, ξ2 ∈ B, such that ξ1 ≠ ξ2. In particular, ξ1 ⊥ ξ2, so we also have ξ1 ⊥ (−ξ2), and using the Pythagorean Theorem we get

‖ξ1 − ξ2‖² = ‖ξ1‖² + ‖−ξ2‖² = 2,

which gives

‖ξ1 − ξ2‖ = √2.

Using the triangle inequality, we now have

\[ \sqrt{2} = \|\xi_1 - \xi_2\| \le \|\xi_1 - \zeta_{\xi_1}\| + \|\xi_2 - \zeta_{\xi_2}\| + \|\zeta_{\xi_1} - \zeta_{\xi_2}\| \le \sqrt{2} - 1 + \|\zeta_{\xi_1} - \zeta_{\xi_2}\|. \]

This gives

‖ζξ1 − ζξ2‖ ≥ 1,

which forces ζξ1 ≠ ζξ2.

Using Claim 2, we have constructed an injective map B → Z. In particular, using Claim 1 and the cardinal arithmetic rules, we get

Card B ≤ Card Z = Card B′.

By symmetry we also have

Card B′ ≤ Card B,

and then using the Cantor-Bernstein Theorem, we finally get

Card B = Card B′.

Corollary 6.7 (of the proof). A Hilbert space is separable, in the norm topology, if and only if it has an orthonormal basis which is at most countable.

Proof. Use Claims 1 and 2 from the proof of the Theorem.

Definition. Let H be a Hilbert space, and let B be an orthonormal basis for H. By the above theorem, the cardinal number Card B does not depend on the choice of B. This number is called the hilbertian (or orthogonal) dimension of H, and is denoted by h-dim H.

Corollary 6.8. For two Hilbert spaces H and H′, the following are equivalent:
(i) h-dim H = h-dim H′;
(ii) There exists an isometric linear isomorphism U : H → H′.

Proof. (i) ⇒ (ii). Choose a set I with h-dim H = h-dim H′ = Card I. Apply Theorem 6.3 to produce isometric linear isomorphisms T : H → ℓ²K(I) and T′ : H′ → ℓ²K(I). Then define U = T′⁻¹ ∘ T.

(ii) ⇒ (i). Assume one has an isometric linear isomorphism U : H → H′. Choose an orthonormal basis B for H. Then U(B) is clearly an orthonormal basis for H′, and since U : B → U(B) is bijective, we get

h-dim H = Card B = Card U(B) = h-dim H′.

Chapter III

Measure Theory

Lecture 18

1. Set arithmetic: (σ-)rings, (σ-)algebras, and monotone classes

In this section we discuss various types of set collections used in Measure Theory.

Notation. Given a (non-empty) set X, we denote by P(X) the collection of all subsets of X.

Definition. Let X be a non-empty set. For A ∈ P(X), we define the function κA : X → {0, 1} by

\[ \kappa_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \in X \smallsetminus A \end{cases} \]

The function κA is called the characteristic function of A.

The basic properties of characteristic functions are summarized in the following.

Exercise 1. Let X be a non-empty set. Prove:

(i) κ∅ = 0 and κX = 1.
(ii) For A, B ∈ P(X) one has
A ⊂ B ⇔ κA ≤ κB;
A = B ⇔ κA = κB.
(iii) κA∩B = κA · κB, ∀ A, B ∈ P(X).
(iv) κA∖B = κA · (1 − κB), ∀ A, B ∈ P(X).
(v) κA∪B = κA + κB − κA · κB, ∀ A, B ∈ P(X).
(vi) For all A1, . . . , An ∈ P(X):

\[ \kappa_{A_1\cup\cdots\cup A_n} = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1\le i_1<\cdots<i_k\le n} \kappa_{A_{i_1}} \cdots \kappa_{A_{i_k}}. \]

(vii) κAΔB = |κA − κB| = κA + κB − 2κA · κB, ∀ A, B ∈ P(X). Here Δ stands for the symmetric set difference, defined by AΔB = (A ∖ B) ∪ (B ∖ A).

Property (vi) is called the Inclusion-Exclusion Formula.

Hint: (vi). Show that the right hand side is equal to 1 − (1 − κA1) · · · (1 − κAn).
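On a finite ambient set these identities can be tested mechanically. The sketch below is a spot check on one example, not a proof: the ambient set and the sets A, B, C are arbitrary choices, and characteristic functions are represented as tuples of 0/1 values. The last function sums the Inclusion-Exclusion Formula over X to count the elements of a union.

```python
from itertools import combinations
from math import prod

X = range(12)                                   # ambient finite set (arbitrary choice)
A, B, C = {0, 1, 2, 3, 4}, {3, 4, 5, 6}, {0, 4, 6, 7, 11}

def kappa(S):
    """Characteristic function of S, as a tuple of 0/1 values indexed by X."""
    return tuple(1 if x in S else 0 for x in X)

def pointwise(op, *fs):
    """Apply op pointwise to the functions fs."""
    return tuple(op(*vals) for vals in zip(*fs))

kA, kB = kappa(A), kappa(B)
checks = {
    "intersection": kappa(A & B) == pointwise(lambda a, b: a * b, kA, kB),
    "difference": kappa(A - B) == pointwise(lambda a, b: a * (1 - b), kA, kB),
    "union": kappa(A | B) == pointwise(lambda a, b: a + b - a * b, kA, kB),
    "symmetric difference": kappa(A ^ B) == pointwise(lambda a, b: abs(a - b), kA, kB),
}

def union_indicator(sets):
    """Alternating sum of products of characteristic functions, as in (vi)."""
    ks = [kappa(S) for S in sets]
    total = [0] * len(ks[0])
    for k in range(1, len(ks) + 1):
        for combo in combinations(ks, k):
            term = pointwise(lambda *v: prod(v), *combo)
            total = [t + (-1) ** (k - 1) * c for t, c in zip(total, term)]
    return tuple(total)

def count_union(sets):
    """Number of elements of the union, via k-fold intersections."""
    return sum((-1) ** (k - 1) * len(set.intersection(*combo))
               for k in range(1, len(sets) + 1)
               for combo in combinations(sets, k))
```

For instance, `union_indicator([A, B, C])` coincides with `kappa(A | B | C)`, and `count_union([A, B, C])` agrees with `len(A | B | C)`.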

Remark 1.1. The Inclusion-Exclusion formula has an interesting application in Combinatorics. If the ambient set X is finite, then the number of elements of any subset A ⊂ X is given by

\[ |A| = \sum_{x\in X} \kappa_A(x). \]

Using the Inclusion-Exclusion formula, we then get

\[ |A_1 \cup \cdots \cup A_n| = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1\le i_1<\cdots<i_k\le n} |A_{i_1} \cap \cdots \cap A_{i_k}|. \]

This is known as the Inclusion-Exclusion Principle.

Definition. Let X be a non-empty set, and let K be one of the fields⁷ Q, R or C. A function φ : X → K is said to be elementary, if its range φ(X) is finite. Remark that this gives

\[ \varphi = \sum_{\lambda\in\varphi(X)} \lambda \cdot \kappa_{\varphi^{-1}(\lambda)} = \sum_{\lambda\in\varphi(X)\smallsetminus\{0\}} \lambda \cdot \kappa_{\varphi^{-1}(\lambda)}. \]

We define

ElemK(X) = {φ : X → K : φ elementary}.

Given a collection M ⊂ P(X), a function φ : X → K is said to be M-elementary, if φ is elementary, and moreover,

φ⁻¹(λ) ∈ M, ∀ λ ∈ K ∖ {0}.

We define

M-ElemK(X) = {φ : X → K : φ M-elementary}.

Exercise 2. With the above notations, prove that ElemK(X) is a unital K-algebra.

Proposition 1.1. Given a non-empty set X, the collection P(X) is a unital ring, with the operations

A + B = AΔB and A · B = A ∩ B, A, B ∈ P(X).

Proof. First of all, it is clear that Δ is commutative.

To prove the associativity of Δ, we simply observe that

\[ \begin{aligned} \kappa_{(A\Delta B)\Delta C} &= \kappa_{A\Delta B} + \kappa_C - 2\kappa_{A\Delta B}\kappa_C \\ &= \kappa_A + \kappa_B - 2\kappa_A\kappa_B + \kappa_C - 2(\kappa_A + \kappa_B - 2\kappa_A\kappa_B)\cdot\kappa_C \\ &= \kappa_A + \kappa_B + \kappa_C - 2(\kappa_A\kappa_B + \kappa_A\kappa_C + \kappa_B\kappa_C) + 4\kappa_A\kappa_B\kappa_C. \end{aligned} \]

Since the final result is symmetric in A, B, C, we see that we get

κAΔ(BΔC) = κ(AΔB)ΔC,

so we indeed get

(AΔB)ΔC = AΔ(BΔC).

The neutral element for Δ is the empty set ∅. Since we obviously have AΔA = ∅, it follows that (P(X), Δ) is indeed an abelian group.

The operation ∩ is clearly commutative, associative, and has the total set X as the unit.

To check distributivity, we again use characteristic functions:

\[ \begin{aligned} \kappa_{(A\cap C)\Delta(B\cap C)} &= \kappa_{A\cap C} + \kappa_{B\cap C} - 2\kappa_{A\cap C}\kappa_{B\cap C} \\ &= \kappa_A\kappa_C + \kappa_B\kappa_C - 2\kappa_A\kappa_B\kappa_C = (\kappa_A + \kappa_B - 2\kappa_A\kappa_B)\kappa_C \\ &= \kappa_{A\Delta B}\kappa_C = \kappa_{(A\Delta B)\cap C}, \end{aligned} \]

so we indeed have the equality

(A ∩ C)Δ(B ∩ C) = (AΔB) ∩ C.
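For a small ambient set the ring axioms can be verified by brute force. The sketch below assumes X = {1, 2, 3} (an arbitrary choice) and models the sum as symmetric difference and the product as intersection, using Python's `frozenset` operators `^` and `&`.

```python
from itertools import combinations

def powerset(X):
    """All subsets of X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

X = frozenset({1, 2, 3})
P = powerset(X)

def add(A, B):
    return A ^ B      # symmetric difference plays the role of the sum

def mul(A, B):
    return A & B      # intersection plays the role of the product

associative = all(add(add(A, B), C) == add(A, add(B, C))
                  for A in P for B in P for C in P)
distributive = all(mul(add(A, B), C) == add(mul(A, C), mul(B, C))
                   for A in P for B in P for C in P)
neutral = all(add(A, frozenset()) == A and mul(A, X) == A for A in P)
self_inverse = all(add(A, A) == frozenset() for A in P)
```

All four flags come out `True`. The last one shows that every element is its own additive inverse, so P(X) is a ring of characteristic 2.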

⁷ K can be any field.

Definitions. Let X be a non-empty set. A ring on X is a non-empty sub-ring R ⊂ P(X). We do not require the unit X to belong to R, but we do require ∅ ∈ R.

An algebra on X is a ring A which contains the unit X.

Rings and algebras of sets are characterized as follows.

Proposition 1.2. Let X be a non-empty set.

A. For a non-empty collection R ⊂ P(X), the following are equivalent:
(i) R is a ring on X;
(ii) For any A, B ∈ R, we have A ∖ B ∈ R and A ∪ B ∈ R.

B. For a non-empty collection A ⊂ P(X), the following are equivalent:
(i) A is an algebra on X;
(ii) For any A ∈ A, we have X ∖ A ∈ A, and for any A, B ∈ A, we have A ∪ B ∈ A.

Proof. A. (i) ⇒ (ii). Assume R is a ring on X, and let A, B ∈ R. Then A ∩ B belongs to R, so

A ∖ B = AΔ(A ∩ B)

also belongs to R. It then follows that

A ∪ B = (AΔB)Δ(A ∩ B)

again belongs to R.

(ii) ⇒ (i). Assume R satisfies property (ii). Start with A, B ∈ R. Then A ∖ B belongs to R, and

A ∩ B = A ∖ (A ∖ B)

again belongs to R. Since A ∪ B also belongs to R, it follows that the set

AΔB = (A ∪ B) ∖ (A ∩ B)

again belongs to R.

B. (i) ⇒ (ii). This is clear from the implication A.(i) ⇒ (ii).

(ii) ⇒ (i). Assume A satisfies property (ii). Start with two sets A, B ∈ A. Then the complements X ∖ A and X ∖ B both belong to A, hence their union

(X ∖ A) ∪ (X ∖ B) = X ∖ (A ∩ B)

belongs to A, and the complement of this union

X ∖ [X ∖ (A ∩ B)] = A ∩ B

will also belong to A.

If A, B ∈ A, then since X ∖ B belongs to A, by the above considerations, it follows that the intersection

A ∩ (X ∖ B) = A ∖ B

also belongs to A. Likewise, the difference B ∖ A also belongs to A, hence the union

(A ∖ B) ∪ (B ∖ A) = AΔB

also belongs to A. By part A, it follows that A is a ring.

Finally, since A is non-empty, if we choose some A ∈ A, then AΔA = ∅ belongs to A, so its complement X ∖ ∅ = X also belongs to A.

It will be useful to introduce the following terminology.

Definition. A system of sets (Ai)i∈I is said to be pair-wise disjoint, if Ai ∩ Aj = ∅, for all i, j ∈ I with i ≠ j.

Lemma 1.1. Let X be a non-empty set, let K be one of the fields Q, R or C, and let R be a ring on X. For a function φ : X → K, the following are equivalent:

(i) φ is R-elementary;
(ii) there exist an integer n ≥ 1, sets A1, . . . , An ∈ R, and numbers λ1, . . . , λn ∈ K, such that

\[ \varphi = \lambda_1\kappa_{A_1} + \cdots + \lambda_n\kappa_{A_n}. \]

(iii) there exist an integer m ≥ 1, a finite pair-wise disjoint system (B_j)_{j=1}^m ⊂ R, and numbers µ1, . . . , µm ∈ K, such that

\[ \varphi = \mu_1\kappa_{B_1} + \cdots + \mu_m\kappa_{B_m}. \]

Proof. (i) ⇒ (ii). Assume φ is R-elementary. If φ = 0, there is nothing to prove, because we have φ = κ∅. If φ is not identically zero, then we can obviously write

\[ \varphi = \sum_{\lambda\in\varphi(X)\smallsetminus\{0\}} \lambda\,\kappa_{\varphi^{-1}(\lambda)}, \]

with all sets φ⁻¹(λ) in R.

(ii) ⇒ (iii). Define

E = {ψ : X → K : ψ satisfies property (iii)}.

Assume φ satisfies (ii), i.e.

\[ \varphi = \lambda_1\kappa_{A_1} + \cdots + \lambda_n\kappa_{A_n}, \]

with A1, . . . , An ∈ R and λ1, . . . , λn ∈ K. We are going to prove that φ ∈ E, by induction on n. The case n = 1 is trivial (either φ = 0, so φ = κ∅ ∈ E, or φ = λκA for some A ∈ R and λ ≠ 0, in which case we also have φ ∈ E).

Assume

\[ \alpha_1\kappa_{D_1} + \cdots + \alpha_k\kappa_{D_k} \in E, \]

for all D1, . . . , Dk ∈ R, α1, . . . , αk ∈ K. Start with a function

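\[ \varphi = \lambda_1\kappa_{A_1} + \cdots + \lambda_k\kappa_{A_k} + \lambda_{k+1}\kappa_{A_{k+1}}, \]

with A1, . . . , Ak+1 ∈ R and λ1, . . . , λk+1 ∈ K, and based on the above inductive hypothesis, let us show that φ ∈ E. Using the inductive hypothesis, the function

\[ \psi = \lambda_2\kappa_{A_2} + \cdots + \lambda_k\kappa_{A_k} + \lambda_{k+1}\kappa_{A_{k+1}} \]

belongs to E, so there exist an integer p ≥ 1, scalars η1, . . . , ηp ∈ K, and a pair-wise disjoint system (C_j)_{j=1}^p ⊂ R, such that

\[ \psi = \eta_1\kappa_{C_1} + \cdots + \eta_p\kappa_{C_p}. \]

With this notation, we have

\[ \varphi = \lambda_1\kappa_{A_1} + \eta_1\kappa_{C_1} + \cdots + \eta_p\kappa_{C_p}. \]

Put then

\[ B_{2j-1} = A_1 \cap C_j \ \text{ and } \ B_{2j} = C_j \smallsetminus A_1, \ \text{ for all } j \in \{1,\dots,p\}; \qquad B_{2p+1} = A_1 \smallsetminus (C_1 \cup \cdots \cup C_p). \]

It is clear that (B_k)_{k=1}^{2p+1} ⊂ R is pair-wise disjoint. Notice now that the equalities

\[ C_j = B_{2j-1} \cup B_{2j}, \ \forall\, j \in \{1,\dots,p\}, \qquad A_1 = B_1 \cup B_3 \cup \cdots \cup B_{2p+1}, \]

combined with the fact that the B’s are pairwise disjoint, give

\[ \kappa_{C_j} = \kappa_{B_{2j-1}} + \kappa_{B_{2j}}, \ \forall\, j \in \{1,\dots,p\}, \qquad \kappa_{A_1} = \kappa_{B_1} + \kappa_{B_3} + \cdots + \kappa_{B_{2p+1}}, \]

which give

\[ \varphi = \sum_{j=1}^{p} \eta_j\kappa_{B_{2j}} + \sum_{j=1}^{p} (\eta_j + \lambda_1)\kappa_{B_{2j-1}} + \lambda_1\kappa_{B_{2p+1}}, \]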

which proves that φ indeed belongs to E.

(iii) ⇒ (i). Assume there exists a finite pair-wise disjoint system (B_j)_{j=1}^m ⊂ R, and numbers µ1, . . . , µm ∈ K, such that

\[ \varphi = \mu_1\kappa_{B_1} + \cdots + \mu_m\kappa_{B_m}, \]

and let us prove that φ is R-elementary.

If all the µ’s are zero, there is nothing to prove, since φ = 0.

Assume the µ’s are not all equal to zero. Since the µ’s that are equal to zero do not have any contribution, we can in fact assume that all the µ’s are non-zero. Notice that

φ(X) ∖ {0} ⊂ {µj : 1 ≤ j ≤ m}.

In particular φ is elementary.

If we start with an arbitrary λ ∈ K ∖ {0}, then either λ ∉ φ(X), or λ ∈ φ(X) ∖ {0}. In the first case we clearly have φ⁻¹(λ) = ∅ ∈ R. In the second case, we have the equality

\[ \varphi^{-1}(\lambda) = \bigcup_{j\in M_\lambda} B_j, \]

where

Mλ = {j : 1 ≤ j ≤ m and µj = λ}.

Since all B’s belong to R, it follows that φ⁻¹(λ) again belongs to R. Having shown that φ is elementary, and φ⁻¹(λ) ∈ R, for all λ ∈ K ∖ {0}, it follows that φ is indeed R-elementary.
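The passage from an arbitrary linear combination of characteristic functions to one over pairwise disjoint sets can be carried out mechanically. The sketch below applies the splitting step from the inductive argument (intersection, difference, uncovered remainder) one term at a time. The function names and the sample data are hypothetical, and empty pieces are simply discarded.

```python
def disjointify(terms):
    """terms: list of (coefficient, frozenset) pairs.
    Returns an equivalent list whose sets are pairwise disjoint and non-empty."""
    result = []                                   # disjoint form of the terms seen so far
    for lam, A in terms:
        new = []
        covered = frozenset()
        for eta, C in result:
            new.append((eta + lam, C & A))        # on C ∩ A both coefficients add up
            new.append((eta, C - A))              # on C \ A only the old coefficient acts
            covered |= C
        new.append((lam, A - covered))            # part of A not met by any earlier set
        result = [(c, S) for c, S in new if S]    # drop empty pieces
    return result

def evaluate(terms, x):
    """Value at x of the function sum of c * (characteristic function of S)."""
    return sum(c for c, S in terms if x in S)

# Sample data (arbitrary): 2*k_{1,2,3} + 5*k_{2,3,4} - 1*k_{3,5}.
terms = [(2, frozenset({1, 2, 3})), (5, frozenset({2, 3, 4})), (-1, frozenset({3, 5}))]
disjoint_terms = disjointify(terms)
```

Both representations define the same function, which can be confirmed by comparing `evaluate(terms, x)` and `evaluate(disjoint_terms, x)` at every point. Every piece is built from the original sets by intersections and differences, so it stays inside any ring containing them.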

Proposition 1.3. Let X be a non-empty set, and let K be one of the fields Q, R, or C.

A. For a non-empty collection R ⊂ P(X), the following are equivalent:
(i) R is a ring on X;
(ii) R-ElemK(X) is a K-subalgebra of ElemK(X).

B. For a non-empty collection A ⊂ P(X), the following are equivalent:
(i) A is an algebra on X;
(ii) A-ElemK(X) is a K-subalgebra of ElemK(X), which contains the constant function 1.

Proof. A. (i) ⇒ (ii). Assume R is a ring on X. Using Lemma 1.1 we see that we have the equality:

R-ElemK(X) = Span{κA : A ∈ R}.

In particular, this shows that R-ElemK(X) is a K-linear subspace of ElemK(X). Moreover, in order to prove that R-ElemK(X) is a K-subalgebra, it suffices to prove the implication

A, B ∈ R ⟹ κA · κB ∈ R-ElemK(X).

But this implication is trivial, since κA · κB = κA∩B, and A ∩ B belongs to R.

(ii) ⇒ (i). Assume R-ElemK(X) is a K-subalgebra of ElemK(X). First of all, since κ∅ = 0 ∈ R-ElemK(X), it follows that ∅ ∈ R.

Start now with two sets A, B ∈ R. Then κA and κB belong to R-ElemK(X). Since R-ElemK(X) is an algebra, the function

κA∩B = κA · κB

belongs to R-ElemK(X), so we immediately see that A ∩ B ∈ R.

Likewise, the function

κAΔB = κA + κB − 2κAκB

belongs to R-ElemK(X), so we also get AΔB ∈ R.

B. This equivalence is clear from part A, plus the identity κX = 1.

Algebras of elementary functions give in fact a complete description for rings or algebras of sets, as indicated in the result below.

Proposition 1.4. Let X be a non-empty set, and let K be one of the fields Q, R, or C.

A. The map

R ⟼ R-ElemK(X)

is a bijective correspondence between the collection of all rings on X, and the collection of all K-subalgebras of ElemK(X).

B. The map

A ⟼ A-ElemK(X)

is a bijective correspondence between the collection of all algebras on X, and the collection of all K-subalgebras of ElemK(X) that contain 1.

Proof. A. We start by proving surjectivity. Let E ⊂ ElemK(X) be an arbitrary K-subalgebra. Define the collection

R = {A ⊂ X : κA ∈ E}.

If A, B ∈ R, then the equalities

κA∩B = κAκB and κAΔB = κA + κB − 2κAκB,

combined with the fact that E is a subalgebra, prove that κA∩B and κAΔB both belong to E, hence A ∩ B and AΔB both belong to R. This shows that R is a ring.

It is pretty clear (see Lemma 1.1) that R-ElemK(X) ⊂ E. To prove the other inclusion, start with some arbitrary function φ ∈ E, and let us prove that φ ∈ R-ElemK(X). If φ = 0, there is nothing to prove. Assume φ is not identically zero. We write φ(X) ∖ {0} as {λ1, . . . , λn}, with λi ≠ λj for all i, j ∈ {1, . . . , n} with i ≠ j. For each i ∈ {1, . . . , n}, we set Ai = φ⁻¹(λi), so that

\[ \varphi = \sum_{i=1}^{n} \lambda_i \cdot \kappa_{A_i}. \]

Since all λ’s are non-zero and different, the matrix

\[ T = \begin{pmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_n \\ \lambda_1^2 & \lambda_2^2 & \cdots & \lambda_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1^n & \lambda_2^n & \cdots & \lambda_n^n \end{pmatrix} \]

is invertible. Take [α_{ij}]_{i,j=1}^n to be the inverse of T. The obvious equalities

\[ \varphi^k = \sum_{j=1}^{n} \lambda_j^k \kappa_{A_j}, \quad \forall\, k = 1, \dots, n \]

can be written in matrix form as

\[ \begin{pmatrix} \varphi \\ \varphi^2 \\ \vdots \\ \varphi^n \end{pmatrix} = T \cdot \begin{pmatrix} \kappa_{A_1} \\ \kappa_{A_2} \\ \vdots \\ \kappa_{A_n} \end{pmatrix}, \]

so multiplying by T⁻¹ yields

\[ \kappa_{A_j} = \sum_{k=1}^{n} \alpha_{jk}\varphi^k, \quad \forall\, j = 1, \dots, n, \]

which proves that κA1, . . . , κAn ∈ E, so A1, . . . , An ∈ R. This then shows that φ ∈ R-ElemK(X).

We now prove injectivity. Suppose that R and S are rings such that R-ElemK(X) = S-ElemK(X), and let us prove that R = S. For every A ∈ R, the function κA ∈ R-ElemK(X) is also S-elementary, which means that A ∈ S. This proves the inclusion R ⊂ S. By symmetry we also have the inclusion S ⊂ R, so indeed R = S.

B. This part is obvious from A.
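The key step of the surjectivity argument, recovering each κAj as a combination of the powers φ, φ², . . . , φⁿ, can be illustrated on a small example. Instead of inverting the matrix T, the sketch below uses an equivalent route that is an assumption of this illustration, not the method of the proof: for each j it evaluates the polynomial p_j of degree n with p_j(0) = 0 and p_j(λi) = δij, which has no constant term, so p_j(φ) is a combination of φ, . . . , φⁿ. The sample function `phi` is made up.

```python
from fractions import Fraction

# An elementary function on X = {0,...,6} (hypothetical sample data).
phi = {0: 0, 1: 2, 2: 2, 3: -1, 4: 5, 5: 0, 6: 5}
values = sorted({v for v in phi.values() if v != 0})   # distinct non-zero values

def p(j, t):
    """Polynomial with p(0) = 0, p(values[j]) = 1, p(values[i]) = 0 for i != j.
    It has no constant term, so p(j, phi) lies in the span of phi, ..., phi^n."""
    lam_j = values[j]
    out = Fraction(t, lam_j)
    for i, lam_i in enumerate(values):
        if i != j:
            out *= Fraction(t - lam_i, lam_j - lam_i)
    return out

# Level sets recovered from p_j(phi), compared below with phi^{-1}(lambda_j).
recovered = [{x for x in phi if p(j, phi[x]) == 1} for j in range(len(values))]
level_sets = [{x for x in phi if phi[x] == lam} for lam in values]
```

Here `recovered` coincides with `level_sets`, and each `p(j, phi[x])` takes only the values 0 and 1, so p_j(φ) is exactly the characteristic function of Aj.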

Definitions. Let X be a (non-empty) set. A collection U ⊂ P(X) is called a σ-ring, if it is a ring, and it has the property:

(σ) Whenever (An)n≥1 is a sequence in U, it follows that ⋃_{n=1}^∞ An also belongs to U.

A collection S ⊂ P(X) is called a σ-algebra, if it is an algebra, and it has property (σ).

Clearly, every σ-algebra is a σ-ring.

Remarks 1.2. A. For σ-rings and σ-algebras, one of the properties in the definition of rings and algebras is redundant. More explicitly:
(i) A collection U ⊂ P(X) is a σ-ring, if and only if it has the property (σ) and the property: A, B ∈ U ⟹ A ∖ B ∈ U.
(ii) A collection S ⊂ P(X) is a σ-algebra, if and only if it has the property (σ) and the property: A ∈ S ⟹ X ∖ A ∈ S.

B. If U is a σ-ring, then it also has the property
(δ) (An)n≥1 ⊂ U ⟹ ⋂_{n=1}^∞ An ∈ U.
Since σ-algebras are σ-rings, they will also have property (δ).

Definitions. Let X be a non-empty set. A sequence (An)n≥1 of subsets of X is said to be monotone, if it satisfies one of the following conditions:
(↑) An ⊂ An+1, ∀ n ≥ 1,
(↓) An ⊃ An+1, ∀ n ≥ 1.

In the case (↑) the sequence is said to be increasing, and we define

\[ \lim_{n\to\infty} A_n = \bigcup_{n=1}^{\infty} A_n. \]

In the case (↓) the sequence is said to be decreasing, and we define

\[ \lim_{n\to\infty} A_n = \bigcap_{n=1}^{\infty} A_n. \]

A collection M ⊂ P(X) is said to be a monotone class on X, if it satisfies the condition:

(m) whenever (An)n≥1 is a monotone sequence in M, it follows that its limit lim_{n→∞} An also belongs to M.

Proposition 1.5. Let R be a ring on X. Then the following are equivalent:
(i) R is a σ-ring;
(ii) R is a monotone class.

Proof. (i) ⇒ (ii). This is immediate from the definition and Remark 1.2.B.

(ii) ⇒ (i). Assume R is a monotone class, and let us prove that it is a σ-ring. By Remark 1.2.A, we only need to prove that R has property (σ). Start with an arbitrary sequence (An)n≥1 in R, and let us prove that ⋃_{n=1}^∞ An again belongs to R. For every integer n ≥ 1, we define Bn = ⋃_{k=1}^n Ak. Since R is a ring, it follows that Bn ∈ R, ∀ n ≥ 1. Moreover, the sequence (Bn)n≥1 is increasing, so by assumption, the set ⋃_{n=1}^∞ An = lim_{n→∞} Bn indeed belongs to R.

Lecture 19

2. Constructing (σ-)rings and (σ-)algebras

In this section we outline three methods of constructing (σ-)rings and (σ-)algebras. It turns out that one can devise some general procedures, which work for all the types of set collections considered, so it will be natural to begin with some very general considerations.

Definition. Suppose one has a type Θ of set collections. In other words, for any set X, one defines what it means for a collection C ⊂ P(X) to be of type Θ. The type Θ is said to be consistent, if for every set X, one has the following conditions:

• the collection P(X), of all subsets of X, is of type Θ;
• if Ci, i ∈ I are collections of type Θ, then the intersection ⋂_{i∈I} Ci is again of type Θ.

Examples 2.1. The following types are consistent:
• The type R of rings;
• The type A of algebras;
• The type S of σ-rings;
• The type Σ of σ-algebras;
• The type M of monotone classes.

The reason for the consistency is simply the fact that each of these types is defined by means of set operations.

Definition. Let Θ be a consistent type, let X be a set, and let E ⊂ P(X) be an arbitrary collection of sets. Define

FΘ(E, X) = {C ⊂ P(X) : C ⊃ E, and C is of type Θ on X}.

Notice that the family FΘ(E, X) is non-empty, since it contains at least the collection P(X). The collection

\[ \Theta_X(E) = \bigcap_{C \in F_\Theta(E, X)} C \]

is of type Θ on X, and is called the type Θ class generated by E. When there is no danger of confusion, the ambient set X will be omitted.

Comment. In the above setting, the class Θ(E) is the smallest collection of type Θ on X, which contains E. In other words, if C is a collection of type Θ on X, with C ⊃ E, then C ⊃ Θ(E). This follows immediately from the fact that C belongs to $F_\Theta(E, X)$.

Examples 2.2. Let X be a (non-empty) set, and let E be an arbitrary collection of subsets of X. According to the previous list of consistent types R, A, S, Σ, and M, one can construct the following collections.

(i) R(E), the ring generated by E; this is the smallest ring that contains E.


(ii) A(E), the algebra generated by E; this is the smallest algebra that contains E.

(iii) S(E), the σ-ring generated by E; this is the smallest σ-ring that contains E.

(iv) Σ(E), the σ-algebra generated by E; this is the smallest σ-algebra that contains E.

(v) M(E), the monotone class generated by E; this is the smallest monotone class that contains E.
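On a finite set X the generated ring and algebra can be computed by brute force, which may help to build intuition for the abstract definition of Θ(E). The following is only a minimal sketch, under the assumption that a ring is a collection closed under unions and differences, and that an algebra is a ring containing X; the function names `close_under`, `generated_ring` and `generated_algebra` are illustrative and are not part of the notes.

```python
def close_under(family, ops):
    """Smallest collection containing `family` and closed under the binary `ops`."""
    sets = {frozenset(s) for s in family}
    while True:
        # Apply every operation to every ordered pair, keep only the new sets.
        new = {op(a, b) for a in sets for b in sets for op in ops} - sets
        if not new:
            return sets
        sets |= new


def generated_ring(family):
    # Assumption: a ring is closed under union and difference. Seeding with
    # the empty set covers the degenerate case of an empty generating family.
    seeds = list(family) + [set()]
    return close_under(seeds, [frozenset.union, frozenset.difference])


def generated_algebra(family, universe):
    # Assumption: an algebra on X is a ring that contains X.
    return generated_ring(list(family) + [universe])


X = {1, 2, 3, 4}
E = [{1, 2}, {2, 3}]
R = generated_ring(E)        # all subsets of {1, 2, 3}
A = generated_algebra(E, X)  # all subsets of X
print(len(R), len(A))        # prints "8 16"
```

In this example the point 4 is not covered by any set in E, so R(E) is strictly smaller than A(E).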

Comment. Assume Θ is a consistent type. Suppose E is an arbitrary collection of subsets of some fixed non-empty set X. There are instances when we would like to decide whether a class C ⊃ E coincides with Θ(E). The following is a useful test:

(i) check that C is of type Θ;
(ii) check the inclusion C ⊂ Θ(E).

By (i) we must have C ⊃ Θ(E), so by (ii) we will indeed have equality.

A simple illustration of the above technique allows one to describe the ring and the algebra generated by a collection of sets.

Proposition 2.1. Let X be a non-empty set, and let E be an arbitrary collection of subsets of X.

A. For a set A ⊂ X, the following are equivalent:
(i) A ∈ R(E);
(ii) There exist sets $A_1, A_2, \dots, A_n$ such that $A = A_1 \triangle A_2 \triangle \dots \triangle A_n$, and each $A_k$, k = 1, . . . , n, is a finite intersection of sets in E.

B. The algebra generated by E is

\[ A(E) = R(E) \cup \{\, X \setminus A : A \in R(E) \,\} = R\big(E \cup \{X\}\big). \]

Proof. A. Define R to be the class of all subsets A ⊂ X, which satisfy property (ii), so that what we have to prove is the equality

R = R(E).

It is clear that E ⊂ R. Since every finite intersection of sets in E belongs to R(E), and the latter is a ring, it follows that R ⊂ R(E). So in order to prove the desired equality, all we have to do is to prove that R is a ring. But this is pretty clear, if we think of $\triangle$ as the sum operation, and ∩ as the product operation. More explicitly, let us take Π(E) to be the collection of all finite intersections of sets in E, so that

\[ A \cap B \in \Pi(E), \quad \forall A, B \in \Pi(E). \tag{1} \]

Now if we start with two sets A, B ∈ R, written as $A = A_1 \triangle \dots \triangle A_m$ and $B = B_1 \triangle \dots \triangle B_n$, with $A_1, \dots, A_m, B_1, \dots, B_n \in \Pi(E)$, then the equality

\[ A \cap B = \big[(A_1 \cap B_1) \triangle \dots \triangle (A_m \cap B_1)\big] \triangle \big[(A_1 \cap B_2) \triangle \dots \triangle (A_m \cap B_2)\big] \triangle \dots \triangle \big[(A_1 \cap B_n) \triangle \dots \triangle (A_m \cap B_n)\big], \]

combined with (1) proves that A ∩ B ∈ R, while the equality

\[ A \triangle B = A_1 \triangle \dots \triangle A_m \triangle B_1 \triangle \dots \triangle B_n \]

proves that $A \triangle B$ also belongs to R.

B. Define

\[ A = R(E) \cup \{\, X \setminus A : A \in R(E) \,\}. \]


Since we clearly have E ⊂ A ⊂ A(E), all we need to prove is the fact that A is an algebra. It is clear that, whenever A ∈ A, it follows that $X \setminus A$ ∈ A. Therefore (see Section III.1), we only need to show that

A, B ∈ A ⇒ A ∪ B ∈ A.

There are four cases to examine: (i) A, B ∈ R(E); (ii) A ∈ R(E) and $X \setminus B$ ∈ R(E); (iii) $X \setminus A$ ∈ R(E) and B ∈ R(E); (iv) $X \setminus A$ ∈ R(E) and $X \setminus B$ ∈ R(E).

Case (i) is clear, since it will force A ∪ B ∈ R(E).

In case (ii), we use

\[ X \setminus (A \cup B) = (X \setminus A) \cap (X \setminus B) = (X \setminus B) \setminus A, \]

which proves that $X \setminus (A \cup B)$ ∈ R(E).

Case (iii) is proven exactly as case (ii).

In case (iv) we use

\[ X \setminus (A \cup B) = (X \setminus A) \cap (X \setminus B), \]

which proves that $X \setminus (A \cup B)$ ∈ R(E).

The equality $A(E) = R\big(E \cup \{X\}\big)$ is trivial.
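The description of R(E) in part A can be checked by brute force on a small example. The sketch below is only an illustration for one particular finite collection, under the assumption that a ring is closed under unions and differences; the helper `close_under` and the variable names are illustrative and do not come from the notes.

```python
def close_under(family, ops):
    """Smallest collection containing `family` and closed under the binary `ops`."""
    sets = {frozenset(s) for s in family}
    while True:
        new = {op(a, b) for a in sets for b in sets for op in ops} - sets
        if not new:
            return sets
        sets |= new


E = [{1, 2}, {2, 3}, {3, 4, 5}]

# R(E), computed from closure under union and difference.
ring = close_under(E, [frozenset.union, frozenset.difference])

# Pi(E): all finite intersections of sets in E.
pi_E = close_under(E, [frozenset.intersection])

# All sets A_1 Δ ... Δ A_n with each A_k in Pi(E), as in property (ii).
delta_closure = close_under(pi_E, [frozenset.symmetric_difference])

print(ring == delta_closure, len(ring))  # prints "True 16"
```

Here both collections consist of all unions of the four "atoms" {1}, {2}, {3}, {4, 5}, which is why they have 16 elements.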

Comment. Unfortunately, for σ-rings and σ-algebras, no easy constructive description is available. There is an analogue of Proposition 2.1, which uses transfinite induction. In order to formulate such a statement, we introduce the following notations. For every collection C of subsets of X, we define

\[ C^* = \Big\{\, \bigcup_{n=1}^\infty (A_n \setminus B_n) : A_n, B_n \in C \cup \{\emptyset\}, \ \forall n \ge 1 \,\Big\}. \]

Notice that

\[ C \cup \{\emptyset\} \subset C^* \subset S(C). \tag{2} \]

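Theorem 2.1. Let X be a non-empty set, and let E be an arbitrary collection of subsets of X. For every ordinal number η define the set

\[ P_\eta = \{\, \alpha : \alpha \text{ ordinal number with } \alpha < \eta \,\}. \]

Let Ω denote the smallest uncountable ordinal number, and define the classes $E_\alpha$, $\alpha \in P_\Omega$, recursively by $E_0 = E$, and

\[ E_\alpha = \Big( \bigcup_{\beta \in P_\alpha} E_\beta \Big)^*, \quad \forall \alpha \in P_\Omega \setminus \{0\}. \]

Then the σ-ring generated by E is

\[ S(E) = \bigcup_{\alpha \in P_\Omega} E_\alpha. \]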

Proof. Denote the union $\bigcup_{\alpha \in P_\Omega} E_\alpha$ simply by U. It is obvious that E ⊂ U.

Let us prove that U ⊂ S(E). We do this by showing that $E_\alpha \subset S(E)$, $\forall \alpha \in P_\Omega$. We use transfinite induction. The case α = 0 is clear. Assume $\alpha \in P_\Omega$ has the property that $E_\beta \subset S(E)$, for all $\beta \in P_\alpha$, and let us show that we also have the inclusion $E_\alpha \subset S(E)$. On the one hand, if we take the class

\[ C = \bigcup_{\beta \in P_\alpha} E_\beta, \]


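then $E_\alpha = C^*$. On the other hand, by the inductive hypothesis, we have C ⊂ S(E), which clearly forces S(C) ⊂ S(E). Then the desired inclusion follows from (2).

In order to finish the proof, we only need to prove that U is a σ-ring. It suffices to prove the equality $U^* = U$ (indeed, a collection which coincides with its own ∗-class is closed under differences and countable unions), which in turn is equivalent to the inclusion $U^* \subset U$. Start with some $U \in U^*$, written as

\[ U = \bigcup_{n=1}^\infty (A_n \setminus B_n), \]

for two sequences $(A_n)_{n=1}^\infty$ and $(B_n)_{n=1}^\infty$ in U. For each n ≥ 1 choose $\alpha_n, \beta_n \in P_\Omega$, such that $A_n \in E_{\alpha_n}$ and $B_n \in E_{\beta_n}$. Form then the countable set

\[ Z = \{\alpha_n : n \in N\} \cup \{\beta_n : n \in N\} \subset P_\Omega. \]

Then we clearly have

\[ U \in \Big( \bigcup_{\nu \in Z} E_\nu \Big)^*. \]

Since Z is countable, there is a strict upper bound for Z in $P_\Omega$, i.e. there exists $\gamma \in P_\Omega$, such that $\alpha_n < \gamma$ and $\beta_n < \gamma$, ∀ n ≥ 1. In other words we have $Z \subset P_\gamma$, so

\[ U \in \Big( \bigcup_{\nu \in P_\gamma} E_\nu \Big)^* = E_\gamma, \]

so U indeed belongs to U.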

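Corollary 2.1. Given a non-empty set X, and an arbitrary collection E of subsets of X, with card E ≥ 2, one has the inequality

\[ \operatorname{card} S(E) \le (\operatorname{card} E)^{\aleph_0}. \]

Proof. Using the notations from the proof of the above theorem, we will first prove, by transfinite induction, that

\[ \operatorname{card} E_\alpha \le (\operatorname{card} E)^{\aleph_0}, \quad \forall \alpha \in P_\Omega. \tag{3} \]

The case α = 0 is clear. Assume now we have $\alpha \in P_\Omega \setminus \{0\}$, such that

\[ \operatorname{card} E_\beta \le (\operatorname{card} E)^{\aleph_0}, \quad \forall \beta \in P_\alpha, \]

and let us prove that we also have the inequality $\operatorname{card} E_\alpha \le (\operatorname{card} E)^{\aleph_0}$. If we take $C = \bigcup_{\beta \in P_\alpha} E_\beta$, we know that C is a countable union of sets, each having cardinality $\le (\operatorname{card} E)^{\aleph_0}$, so we immediately get

\[ \operatorname{card} C \le \aleph_0 \cdot (\operatorname{card} E)^{\aleph_0} = (\operatorname{card} E)^{\aleph_0}. \]

Then the collection

\[ D(C) = \{\, A \setminus B : A, B \in C \,\} \]

has cardinality at most $(\operatorname{card} C)^2$, so we also have

\[ \operatorname{card} D(C) \le (\operatorname{card} E)^{\aleph_0}. \]

Finally, the collection $E_\alpha = C^*$ has cardinality at most $\big(\operatorname{card} D(C)\big)^{\aleph_0}$, so we get

\[ \operatorname{card} E_\alpha \le \big[(\operatorname{card} E)^{\aleph_0}\big]^{\aleph_0} = (\operatorname{card} E)^{\aleph_0}. \]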


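Having proven (3), we now have

\[ \operatorname{card} S(E) = \operatorname{card} \Big( \bigcup_{\alpha \in P_\Omega} E_\alpha \Big) \le (\operatorname{card} P_\Omega) \cdot (\operatorname{card} E)^{\aleph_0} = \aleph_1 \cdot (\operatorname{card} E)^{\aleph_0}. \]

Since $\aleph_1 \le c = 2^{\aleph_0} \le (\operatorname{card} E)^{\aleph_0}$, the above estimate gives

\[ \operatorname{card} S(E) \le \big[(\operatorname{card} E)^{\aleph_0}\big]^2 = (\operatorname{card} E)^{\aleph_0}. \]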

Comment. Suppose Θ is a consistent type. There is a very useful technique for proving results on classes of the form Θ(E). More explicitly, suppose E is an arbitrary collection of subsets of X, and (p) is a certain property which refers to subsets of X. Suppose now we want to prove a statement like:

(∗) Every set A ∈ Θ(E) has property (p).

In order to prove such a statement, one defines

\[ U = \{\, A \in \Theta(E) : A \text{ has property (p)} \,\}, \]

and it suffices to prove that:
(i) U is of type Θ;
(ii) U ⊃ E, i.e. every set A ∈ E has property (p).

Indeed, if we prove the above two facts, that would force U ⊃ Θ(E), and since by construction we have U ⊂ Θ(E), we will in fact get U = Θ(E), thus proving (∗).

As a first illustration of the above technique, we prove the following.

Proposition 2.2. Let X be a non-empty set, and let R be a ring on X. Then the σ-ring generated by R is the same as the monotone class generated by R, that is, one has the equality

S(R) = M(R).

Proof. Since S(R) is a monotone class, and contains R, we have the inclusion S(R) ⊃ M(R).

To prove the other inclusion, using the fact that M(R) contains R, it suffices to show that M(R) is a σ-ring. Since M(R) is already a monotone class, we only need to show that it is a ring (see Proposition 1.5). In other words, we need to show that whenever A, B ∈ M(R), it follows that both $A \setminus B$ and A ∪ B belong to M(R). Define then, for every A ∈ M(R), the set

\[ M_A = \{\, B \in M(R) : A \cap B,\ A \setminus B,\ B \setminus A \in M(R) \,\}, \]

so that what we need to prove is:

(∗) $M_A = M(R)$, ∀ A ∈ M(R).

Before we proceed with the proof of (∗), let us first remark that, for A, B ∈ M(R), one has

\[ B \in M_A \iff A \in M_B. \tag{4} \]

Secondly, we have the following.

Claim 1: For every A ∈ M(R), the collection $M_A$ is a monotone class.

To prove this, we start with a monotone sequence $(B_n)_{n=1}^\infty$ in $M_A$, and we prove that the limit $B = \lim_{n\to\infty} B_n$ again belongs to $M_A$. First of all, clearly B belongs to M(R). Second, since the sequences $(A \cap B_n)_{n=1}^\infty$, $(A \setminus B_n)_{n=1}^\infty$, and $(B_n \setminus A)_{n=1}^\infty$ are all monotone sequences in M(R), and since M(R) is a monotone class, it follows that the limits $A \cap B = \lim_{n\to\infty}(A \cap B_n)$, $A \setminus B = \lim_{n\to\infty}(A \setminus B_n)$, and $B \setminus A = \lim_{n\to\infty}(B_n \setminus A)$ all belong to M(R), so B indeed belongs to $M_A$.

Having proven Claim 1, we now prove (∗) in a particular case:

Claim 2: $M_A = M(R)$, ∀ A ∈ R.

Fix A ∈ R. We know that $M_A \subset M(R)$ is a monotone class, so it suffices to prove that $M_A \supset R$. But this is obvious, since R is a ring.

We now proceed with the proof of (∗) in the general case. If we define

\[ U = \{\, A \in M(R) : M_A = M(R) \,\}, \]

all we need to prove is the equality U = M(R). By Claim 2, we know that U ⊃ R, so it suffices to prove that U is a monotone class. Start then with a monotone sequence $(A_n)_{n=1}^\infty$ in U, and let us show that the limit $A = \lim_{n\to\infty} A_n$ again belongs to U. First of all, A belongs to M(R). What we then have to prove is that $M_A = M(R)$. Start with some arbitrary B ∈ M(R). We know that $B \in M_{A_n}$, ∀ n ≥ 1. Using (4) we have $A_n \in M_B$, ∀ n ≥ 1, and using the fact that $M_B$ is a monotone class (see Claim 1), it follows that $A = \lim_{n\to\infty} A_n$ belongs to $M_B$. Using (4) again, this gives $B \in M_A$. This way we have proven that any B ∈ M(R) also belongs to $M_A$, so we indeed have the equality $M_A = M(R)$.

Corollary 2.2. Let X be a non-empty set, and let E be an arbitrary family of subsets of X. Then the σ-ring, and the σ-algebra generated by E respectively, are given as the monotone classes generated by the ring, and by the algebra generated by E respectively. That is, one has the equalities:

(i) $S(E) = M\big(R(E)\big)$;
(ii) $\Sigma(E) = M\big(A(E)\big)$.

Proof. (i). By the above result, since R(E) is a ring, we have

\[ M\big(R(E)\big) = S\big(R(E)\big). \tag{5} \]

Since $S\big(R(E)\big)$ is a σ-ring, and contains E, it follows that

\[ S\big(R(E)\big) \supset S(E). \]

Conversely, since S(E) is a ring, and contains E, we get the inclusion

\[ S(E) \supset R(E), \]

and since S(E) is a σ-ring, we will now get

\[ S(E) \supset S\big(R(E)\big), \]

so we get

\[ S\big(R(E)\big) = S(E). \]

Using (5), the desired equality follows.

(ii). This follows from Proposition 2.1 and part (i) applied to $E \cup \{X\}$, combined with the obvious equality $\Sigma(E) = S\big(E \cup \{X\}\big)$.

The σ-ring and the σ-algebra, generated by an arbitrary collection of sets, are related by means of the following result.


Proposition 2.3. Let X be a non-empty set, and let E be an arbitrary collection of subsets of X. Define the collection

\[ P^E_\sigma(X) = \Big\{\, A \subset X : \text{there exists } (E_n)_{n=1}^\infty \subset E, \text{ with } A \subset \bigcup_{n=1}^\infty E_n \,\Big\}. \]

Then:

(i) $P^E_\sigma(X)$ is a σ-ring on X;

(ii) the σ-ring S(E) and the σ-algebra Σ(E), generated by E, satisfy the equality

\[ S(E) = \Sigma(E) \cap P^E_\sigma(X). \]

Proof. Part (i) is trivial.

To prove part (ii), we first observe that the intersection $\Sigma(E) \cap P^E_\sigma(X)$ is a σ-ring, which obviously contains E, so we immediately get the inclusion

\[ S(E) \subset \Sigma(E) \cap P^E_\sigma(X). \]

The key ingredient in proving the inclusion “⊃” is contained in the following.

Claim: Given a set E ∈ E, the collection

\[ A_E(X) = \{\, A \subset X : A \cap E \in S(E) \,\} \]

is a σ-algebra on X.

To prove this we need to check:
(a) if A belongs to $A_E(X)$, then $X \setminus A$ also belongs to $A_E(X)$;
(b) whenever $(A_n)_{n=1}^\infty$ is a sequence of sets in $A_E(X)$, it follows that the union $\bigcup_{n=1}^\infty A_n$ also belongs to $A_E(X)$.

To check (a) we simply remark that, since both E and A ∩ E belong to S(E), it follows immediately that $(X \setminus A) \cap E = E \setminus (A \cap E)$ also belongs to S(E), which means that $X \setminus A$ indeed belongs to $A_E(X)$.

Property (b) is clear, since the fact that $A_n \cap E$ belongs to S(E), for all n, immediately gives the fact that $\big(\bigcup_{n=1}^\infty A_n\big) \cap E = \bigcup_{n=1}^\infty (A_n \cap E)$ belongs to S(E), which means precisely that $\bigcup_{n=1}^\infty A_n$ belongs to $A_E(X)$.

Having proven the Claim, we now proceed with the proof of the inclusion $S(E) \supset \Sigma(E) \cap P^E_\sigma(X)$. Start with some set $A \in \Sigma(E) \cap P^E_\sigma(X)$, and we will show that A belongs to S(E). First of all, there exists a sequence $(E_n)_{n=1}^\infty \subset E$, such that

\[ A \subset \bigcup_{n=1}^\infty E_n. \tag{6} \]

Using the Claim, we know that for each n ∈ N, the collection $A_{E_n}(X)$ is a σ-algebra. This σ-algebra clearly contains E, so we have

\[ \Sigma(E) \subset A_{E_n}(X), \quad \forall n \in N. \]

In particular, we get the fact that $A \in A_{E_n}(X)$, which means that $A \cap E_n$ belongs to S(E), for all n ∈ N. But then the inclusion (6) forces the equality

\[ A = \bigcup_{n=1}^\infty (A \cap E_n), \]

which then gives the fact that A indeed belongs to S(E).
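For a finite set X, σ-rings and σ-algebras are the same as rings and algebras, and a set is covered by countably many members of E exactly when it is contained in the union of E (for non-empty E). Under these assumptions, the equality in part (ii) can be tested directly. The following is only a sketch for one example; the helper `close_under` and the variable names are illustrative.

```python
def close_under(family, ops):
    """Smallest collection containing `family` and closed under the binary `ops`."""
    sets = {frozenset(s) for s in family}
    while True:
        new = {op(a, b) for a in sets for b in sets for op in ops} - sets
        if not new:
            return sets
        sets |= new


OPS = [frozenset.union, frozenset.difference]

X = {1, 2, 3, 4}
E = [{1, 2}, {2, 3}]

S = close_under(E, OPS)            # plays the role of S(E)
Sigma = close_under(E + [X], OPS)  # plays the role of Sigma(E)

# A set lies in P^E_sigma(X) exactly when it is contained in the union of E.
cover = frozenset().union(*E)
rhs = {A for A in Sigma if A <= cover}

print(S == rhs)  # prints "True"
```

Since the point 4 is not covered by E, the collection E is not σ-total in this example, and S(E) is strictly smaller than Σ(E).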

The above result motivates the following.


Definition. A collection E of subsets of X is said to be σ-total in X, if $X \in P^E_\sigma(X)$, i.e. there exists some sequence $(E_n)_{n=1}^\infty \subset E$ with $\bigcup_{n=1}^\infty E_n = X$. By the above result, this is equivalent to the fact that X belongs to the σ-ring S(E) generated by E, which in turn is equivalent to the equality Σ(E) = S(E).

We discuss now two more methods of constructing (σ-)rings, (σ-)algebras, or monotone classes.

Notations. Let f : X → Y be a function, and let E ⊂ P(X) and G ⊂ P(Y) be two arbitrary collections of sets. We define

\[ f_*E = \{\, A \in P(Y) : f^{-1}(A) \in E \,\} \subset P(Y); \]

\[ f^*G = \{\, f^{-1}(G) : G \in G \,\} \subset P(X). \]

Definitions. Let Θ be a type of set collections. We say that Θ is natural, if for any map f : X → Y, one has the implications

(i) C of type Θ on X ⟹ $f_*C$ of type Θ on Y;
(ii) D of type Θ on Y ⟹ $f^*D$ of type Θ on X.

Examples 2.3. The types R, A, S, Σ, and M are natural.

The term “natural” is justified by the following.

Exercise 1. Let $X \xrightarrow{\ f\ } Y \xrightarrow{\ g\ } Z$ be maps.
(i) Prove that, for any collection C ⊂ P(X), one has the equality $g_*(f_*C) = (g \circ f)_*C$.
(ii) Prove that, for any collection D ⊂ P(Z), one has the equality $f^*(g^*D) = (g \circ f)^*D$.

Theorem 2.2 (Generating Theorem). Suppose Θ is a consistent class type, which is natural. Let X and Y be non-empty sets, and let f : X → Y be a map. For any collection G ⊂ P(Y), one has the equality

\[ f^*\Theta(G) = \Theta(f^*G). \]

Proof. On the one hand, by naturality, we know that $f^*\Theta(G)$ is of type Θ. On the other hand, it is pretty clear that, since Θ(G) ⊃ G, we also have the inclusion $f^*\Theta(G) \supset f^*G$. Since Θ is consistent, it then follows that we have the inclusion

\[ f^*\Theta(G) \supset \Theta(f^*G). \]

To prove the other inclusion, we consider the class

\[ C = f_*\big[\Theta(f^*G)\big] \subset P(Y). \]

By naturality, it follows that C is of type Θ on Y. For any G ∈ G, the obvious relation

\[ f^{-1}(G) \in f^*G \subset \Theta(f^*G) \]

proves that G ∈ C. This means that we have the inclusion C ⊃ G, and since C is of type Θ, it follows that we have the inclusion

\[ \Theta(G) \subset C. \]

This means that, for every A ∈ Θ(G), we have $f^{-1}(A) \in \Theta(f^*G)$, which means precisely that we have the desired inclusion

\[ f^*\Theta(G) \subset \Theta(f^*G). \]
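The Generating Theorem can be illustrated on finite sets, taking Θ to be the type of algebras. The sketch below checks the equality for one hypothetical map f and one collection G; it assumes that an algebra is a collection containing the ambient set and closed under unions and differences, and the names `close_under`, `algebra` and `pullback` are illustrative only.

```python
def close_under(family, ops):
    """Smallest collection containing `family` and closed under the binary `ops`."""
    sets = {frozenset(s) for s in family}
    while True:
        new = {op(a, b) for a in sets for b in sets for op in ops} - sets
        if not new:
            return sets
        sets |= new


def algebra(family, universe):
    # Algebra on `universe` generated by `family` (finite setting).
    seeds = list(family) + [universe]
    return close_under(seeds, [frozenset.union, frozenset.difference])


def pullback(f, domain, family):
    # The collection of all preimages f^{-1}(G), G in `family`; f is a dict on `domain`.
    return {frozenset(x for x in domain if f[x] in G) for G in family}


X = {1, 2, 3}
Y = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "b"}   # hypothetical map X -> Y
G = [{"a"}]

lhs = pullback(f, X, algebra(G, Y))   # pullback of the generated algebra
rhs = algebra(pullback(f, X, G), X)   # algebra generated by the pullback
print(lhs == rhs)  # prints "True"
```

Note that f is neither injective nor surjective here; the theorem makes no such assumption for the pull-back construction.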


Example 2.4. Let Θ be a consistent class type, which is natural. Let Y be some set, and let C be a collection of type Θ on Y. Given a subset X ⊂ Y, we consider the inclusion map ι : X → Y. The collection $\iota^*C$ is then of type Θ on X. It will be denoted by $C\big|_X$. Since $\iota^{-1}(A) = A \cap X$, ∀ A ∈ P(Y), we have

\[ C\big|_X = \{\, A \cap X : A \in C \,\}. \]

If E ⊂ P(Y) is a collection with C = Θ(E), then by the Generating Theorem we have the equality

\[ \Theta(E)\big|_X = \Theta\big(\{\, E \cap X : E \in E \,\}\big). \tag{7} \]

Comment. The exercise below shows that a “forward” version of the Generating Theorem does not hold in general. In other words, an equality of the type $f_*\Theta(G) = \Theta(f_*G)$ may fail. The reason is the fact that the collection $f_*G$ may be relatively “small.”

Exercise 2. Consider the sets X = {1, 2, 3}, Y = {1, 2}, the function f : X → Y, defined by f(1) = f(2) = 1, f(3) = 2, and the collection $C = \big\{\{1\}, \{2\}, \emptyset\big\}$. Describe the collection $f_*C$, the algebra A(C) generated by C (on X), and the algebra $A(f_*C)$ generated by $f_*C$ (on Y). Prove that one has a strict inclusion $A(f_*C) \subsetneq f_*A(C)$.

Exercise 3. Let Θ be a consistent natural type, let f : X → Y be a surjective map, and let G be a collection of subsets of X. Assume one has the inclusion

\[ G \subset f^*\Theta(f_*G). \tag{8} \]

Prove that one has the equality

\[ f_*\Theta(G) = \Theta(f_*G). \]

(One instance when (8) holds is for example when $f^{-1}\big(f(G)\big) = G$, ∀ G ∈ G.)

Exercise 4*. Let Θ be one of the types A, R, S, Σ, or M. Let f : X → Y be an injective map, and let G ⊂ P(X) be some arbitrary collection. Prove the equality

\[ f_*\Theta(G) = \Theta(f_*G). \]

Natural consistent types are useful, because it is possible to construct product structures.

Definition. Let Θ be a consistent type which is natural. Let $(X_i)_{i\in I}$ be a collection of non-empty sets. Assume that, for each i ∈ I, a collection $E_i \subset P(X_i)$ of type Θ is given. Consider the cartesian product $X = \prod_{i\in I} X_i$, together with the projection maps $\pi_i : X \to X_i$, i ∈ I. The collection

\[ \Theta\text{-}\bigtimes_{i\in I} E_i = \Theta\Big( \bigcup_{i\in I} \pi_i^* E_i \Big) \]

is a collection of type Θ on X, which is called the Θ-product. When there is no danger of confusion, we use the notation $\bigtimes_{i\in I} E_i$.

Remark 2.1. Use the notations from the above definition. Assume that, for each i ∈ I, a collection $G_i \subset P(X_i)$ is given. Then one has the equality

\[ \bigtimes_{i\in I} \Theta(G_i) = \Theta\Big( \bigcup_{i\in I} \pi_i^* G_i \Big). \]


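Indeed, if we define $E_i = \Theta(G_i)$, the inclusion ⊃ follows from the obvious inclusions

\[ \bigtimes_{j\in I} E_j \supset \pi_i^* E_i \supset \pi_i^* G_i, \quad \forall i \in I. \]

The inclusion ⊂ follows from the inclusions

\[ \pi_i^* G_i \subset \Theta\Big( \bigcup_{j\in I} \pi_j^* G_j \Big), \]

which combined with the fact that the right hand side is of type Θ, and the Generating Theorem, give the inclusions

\[ \pi_i^* E_i = \pi_i^* \Theta(G_i) = \Theta(\pi_i^* G_i) \subset \Theta\Big( \bigcup_{j\in I} \pi_j^* G_j \Big). \]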

Natural consistent types also allow one to define disjoint union structures.

Definitions. Let $(X_i)_{i\in I}$ be a collection of non-empty sets. Assume that, for each i ∈ I, a collection $C_i \subset P(X_i)$ is given. On the disjoint union $X = \bigsqcup_{i\in I} X_i$ one defines the collection

\[ \bigvee_{i\in I} C_i = \{\, C \subset X : C \cap X_i \in C_i, \ \forall i \in I \,\}. \]

Assume now Θ is a natural consistent type, and $C_i$ is of type Θ on $X_i$, for each i ∈ I. If we consider the inclusion maps $\varepsilon_i : X_i \to X$, i ∈ I, then one clearly has the equality

\[ \bigvee_{i\in I} C_i = \bigcap_{i\in I} \varepsilon_{i*} C_i, \]

which means that $\bigvee_{i\in I} C_i$ is a collection of type Θ on X.

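Exercise 5. Let I be countable, and let $(X_i)_{i\in I}$ be a collection of non-empty sets. Assume that, for each i ∈ I, a collection $C_i \subset P(X_i)$ is given, such that $\emptyset \in C_i$. Prove the equalities

\[ \bigvee_{i\in I} S(C_i) = S\Big( \bigvee_{i\in I} C_i \Big) \quad \text{and} \quad \bigvee_{i\in I} \Sigma(C_i) = \Sigma\Big( \bigvee_{i\in I} C_i \Big). \]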

We conclude with a discussion on certain constructions related to topology.

Definitions. Let X be a topological Hausdorff space. We consider the collection T of all open sets in X. The σ-algebra Σ(T) on X, generated by T, is denoted by Bor(X). The sets in Bor(X) are called Borel sets.

Remark that singleton sets are Borel, since they are closed. Moreover
• every countable set B ⊂ X is Borel.

One also defines the σ-algebra $\mathrm{Bor}_c(X) = \Sigma(C_X)$ generated by the class $C_X$ of all compact subsets of X.

Another class of sets will also be of interest. Its construction uses the following terminology.

A subset A ⊂ X is said to be σ-compact, if there exists a sequence $(K_n)_{n=1}^\infty$ of compact subsets of X, such that $A = \bigcup_{n=1}^\infty K_n$. A set B ⊂ X is said to be relatively σ-compact, if there exists a σ-compact set A with B ⊂ A. We set

\[ P_{\sigma c}(X) = \{\, B \in P(X) : B \text{ relatively } \sigma\text{-compact} \,\}, \]

and we define

\[ \mathrm{Bor}_{\sigma c}(X) = \mathrm{Bor}(X) \cap P_{\sigma c}(X). \]


Proposition 2.4. Let X be a topological Hausdorff space.
(i) $P_{\sigma c}(X)$ is a σ-ring on X;
(ii) the σ-ring $\mathrm{Bor}_{\sigma c}(X)$ coincides with the σ-ring $S(C_X)$ generated by the collection $C_X$ of all compact subsets of X.

Proof. Using the notations from Proposition 2.3, we have $P_{\sigma c}(X) = P^{C_X}_\sigma(X)$, so part (i) is a consequence of Proposition 2.3.(i). By Proposition 2.3.(ii) we also know that

\[ S(C_X) = \Sigma(C_X) \cap P_{\sigma c}(X) = \mathrm{Bor}_c(X) \cap P_{\sigma c}(X), \]

and since $\mathrm{Bor}_c(X) \subset \mathrm{Bor}(X)$, we have the inclusion

\[ S(C_X) \subset \mathrm{Bor}_{\sigma c}(X). \]

To prove the other inclusion, all we need to show is the inclusion

\[ \mathrm{Bor}_{\sigma c}(X) \subset \mathrm{Bor}_c(X). \]

Start with some arbitrary set $B \in \mathrm{Bor}_{\sigma c}(X)$, and let us prove that $B \in \mathrm{Bor}_c(X)$. Since B is relatively σ-compact, there exists a sequence $(K_n)_{n=1}^\infty$ of compact sets, such that $B \subset \bigcup_{n=1}^\infty K_n$. Define, for each integer n ≥ 1, the set $B_n = B \cap K_n$. Since $B = \bigcup_{n=1}^\infty B_n$, it suffices to show that

\[ B_n \in \mathrm{Bor}_c(X), \quad \forall n \in N. \tag{9} \]

Fix n, and let us analyze the inclusion $\iota_n : K_n \to X$. Denote by T the collection of all open sets in X, and denote by $T_{K_n}$ the collection of all sets $D \subset K_n$, which are open in the induced topology, that is,

\[ T_{K_n} = \{\, D \cap K_n : D \in T \,\}. \]

By the Generating Theorem (Example 2.4), we know that

\[ \mathrm{Bor}(X)\big|_{K_n} = \Sigma(T)\big|_{K_n} = \Sigma_{K_n}\big(\{\, D \cap K_n : D \in T \,\}\big) = \Sigma_{K_n}\big(T_{K_n}\big) = \mathrm{Bor}(K_n). \]

(Here the notation $\Sigma_{K_n}$ indicates that the σ-algebra is taken on $K_n$.) In particular, we get

\[ B_n = B \cap K_n \in \mathrm{Bor}(X)\big|_{K_n} = \mathrm{Bor}(K_n), \quad \forall n \in N. \tag{10} \]

Since $K_n$ is compact, the σ-ring $S(C_{K_n})$, generated by all compact subsets of $K_n$, is a σ-algebra on $K_n$ (simply because it contains $K_n$). Notice that every set $D \in T_{K_n}$ is of the form $K_n \setminus F$, with $F \subset K_n$ compact (in X), therefore D belongs to $S(C_{K_n})$. Since $S(C_{K_n})$ is a σ-algebra on $K_n$, which contains $T_{K_n}$, we have

\[ \mathrm{Bor}(K_n) = \Sigma_{K_n}\big(T_{K_n}\big) \subset S(C_{K_n}) \subset S(C_X) \subset \mathrm{Bor}_c(X), \quad \forall n \in N. \]

Now (9) immediately follows from the above inclusions, combined with (10).

Remark 2.2. For a topological Hausdorff space, we always have the inclusions

\[ \mathrm{Bor}_{\sigma c}(X) \subset \mathrm{Bor}_c(X) \subset \mathrm{Bor}(X). \]

The following are equivalent:
(i) $\mathrm{Bor}_{\sigma c}(X) = \mathrm{Bor}(X)$;
(ii) X is σ-compact.

The following result explains when a minimal set of generators can be chosen for the Borel sets.


Proposition 2.5. Let X be a topological space which is second countable, i.e. there is a countable base for the topology. If S is any sub-base for the topology (countable or not), then

Bor(X) = Σ(S).

Proof. Denote by T the collection of all open sets in X. Denote by V the collection of all subsets of X, which can be written as finite intersections of sets in S. It is obvious that S ⊂ V ⊂ Σ(S), so we have the equality Σ(S) = Σ(V). This means that it suffices to prove the equality

\[ \Sigma(T) = \Sigma(V). \tag{11} \]

Notice that V is a base for the topology, which means that every open subset $D \subsetneq X$ can be written as a union of sets in V. What we want to prove is

Claim: Every open set $D \subsetneq X$ is a countable union of sets in V.

To prove this fact, we fix an open set $D \subsetneq X$, as well as a countable base $B = \{B_n\}_{n=1}^\infty$ for the topology. For every x ∈ D we define the set

\[ M_x = \{\, n \in N : \text{there exists } V \in V, \text{ such that } x \in B_n \subset V \subset D \,\}. \]

It is pretty clear that $M_x \ne \emptyset$, ∀ x ∈ D. (First use the fact that V is a base, to find V ∈ V such that x ∈ V ⊂ D, and then use the fact that B is a base to find n such that $x \in B_n \subset V$.) If we put $M = \bigcup_{x\in D} M_x$, then it is pretty obvious that $\bigcup_{n\in M} B_n = D$. For every n ∈ M we choose some $V_n \in V$ with $B_n \subset V_n \subset D$ (use the fact that n must belong to some $M_x$). It is then clear that $D = \bigcup_{n\in M} V_n$, and the claim follows.

As a consequence of the Claim, we see that any open set $D \subsetneq X$ automatically belongs to Σ(V), and then we have the inclusion T ⊂ Σ(V) ⊂ Σ(T). This clearly forces the equality (11).

Corollary 2.3. Let I be a set which is at most countable, and let $(X_i)_{i\in I}$ be a collection of second countable topological spaces. Then one has the equality

\[ \mathrm{Bor}\Big( \prod_{i\in I} X_i \Big) = \Sigma\text{-}\bigtimes_{i\in I} \mathrm{Bor}(X_i), \tag{12} \]

where the product space $\prod_{i\in I} X_i$ is equipped with the product topology.

Proof. By the definition of the product σ-algebra, we know that

\[ \Sigma\text{-}\bigtimes_{i\in I} \mathrm{Bor}(X_i) = \Sigma\Big( \bigcup_{j\in I} \pi_j^* \mathrm{Bor}(X_j) \Big), \]

where $\pi_j : \prod_{i\in I} X_i \to X_j$, j ∈ I, denote the projection maps. Choose, for each j ∈ I, a countable sub-base $S_j$ for $X_j$, so that we have the equalities

\[ \mathrm{Bor}(X_j) = \Sigma(S_j), \quad \forall j \in I. \]

By Remark 2.1 we have the equality

\[ \Sigma\text{-}\bigtimes_{i\in I} \mathrm{Bor}(X_i) = \Sigma\Big( \bigcup_{j\in I} \pi_j^* S_j \Big). \]

Since the collection $\bigcup_{i\in I} \pi_i^* S_i$ is a countable sub-base for the product topology, the above equality, combined with Proposition 2.5, immediately gives (12).


Exercise 6. A. Prove that, if X is second countable, and S is a sub-base for its topology (countable or not), with $\bigcup_{S\in S} S = X$, then we have in fact the equality

Bor(X) = S(S).

B. Prove that, if X is Hausdorff, second countable, with card X ≥ 2, then for any sub-base S (countable or not), we have the equality

Bor(X) = S(S).

Hints: Follow the proof above. Remark that every open set D ⊂ X, which is a countable union of sets in V, belongs in fact to the σ-ring S(V) = S(S). So in either case, we only have to show that X is a countable union of sets in V.

In case A, we trace the proof of the Claim, and we notice that the only property that we used was the fact that, for every x ∈ D, there exists V ∈ V with x ∈ V ⊂ D, i.e. D is a (possibly uncountable) union of sets in V. Since X itself satisfies this property, it follows that X is also a countable union of sets in V.

In case B, we use the Hausdorff property to write $X = D_1 \cup D_2$, with $D_1, D_2 \subsetneq X$ open.

Corollary 2.4. If X is a topological Hausdorff space, which is second countable, and X is infinite (as a set), then card Bor(X) = c.

Proof. First of all, since X is infinite, one can choose an infinite countable subset A ⊂ X. Then A, and all its subsets are Borel, i.e. we have the inclusion P(A) ⊂ Bor(X), thus proving the inequality

\[ \operatorname{card} \mathrm{Bor}(X) \ge \operatorname{card} P(A) = 2^{\aleph_0} = c. \]

Secondly, one can choose a base V for the topology, which is countable. We now have $\mathrm{Bor}(X) = S\big(V \cup \{X\}\big)$, so by Corollary 2.1 we get

\[ \operatorname{card} \mathrm{Bor}(X) \le \big(\operatorname{card}(V \cup \{X\})\big)^{\aleph_0} \le \aleph_0^{\aleph_0} = c, \]

and the desired equality follows.

Examples 2.5. A. Consider the extended real line [−∞,∞] = R ∪ {−∞,∞}, thought as a compact space, homeomorphic to the interval [−π/2, π/2], via the map f : [−π/2, π/2] → [−∞,∞], defined by

\[ f(t) = \begin{cases} -\infty & \text{if } t = -\pi/2 \\ \tan t & \text{if } -\pi/2 < t < \pi/2 \\ \infty & \text{if } t = \pi/2 \end{cases} \]

Notice that, when restricted to R = (−∞,∞), this topology agrees with the usual topology. In particular, this gives the equality $\mathrm{Bor}([-\infty,\infty])\big|_R = \mathrm{Bor}(R)$.

Let A ⊂ R be a dense subset. Consider the collections

\[ E_1 = \{\, (a,\infty] : a \in A \,\}; \qquad E_2 = \{\, [a,\infty] : a \in A \,\}; \]

\[ E_3 = \{\, [-\infty,a) : a \in A \,\}; \qquad E_4 = \{\, [-\infty,a] : a \in A \,\}. \]

With these notations we have the equalities

\[ \mathrm{Bor}([-\infty,\infty]) = \Sigma(E_1) = \Sigma(E_2) = \Sigma(E_3) = \Sigma(E_4). \]

First of all, we notice that each set in $E_1 \cup E_2 \cup E_3 \cup E_4$ is either open or closed, which means that

\[ E_1 \cup E_2 \cup E_3 \cup E_4 \subset \mathrm{Bor}([-\infty,\infty]), \]

thus giving the inclusions

\[ \Sigma(E_k) \subset \mathrm{Bor}([-\infty,\infty]), \quad \forall k \in \{1, 2, 3, 4\}. \]


Second, we observe that $E_1 \cup E_3$ is a sub-base for the topology, and since [−∞,∞] is obviously second countable, we will have the equality

\[ \mathrm{Bor}([-\infty,\infty]) = \Sigma(E_1 \cup E_3). \]

So, in order to finish the proof we only need to show the inclusions

\[ E_1 \cup E_3 \subset \Sigma(E_k), \quad \forall k \in \{1, 2, 3, 4\}. \tag{13} \]

Since every set in $E_2$ has its complement in $E_3$, and vice versa, we have the inclusions

\[ E_2 \subset \Sigma(E_3) \quad \text{and} \quad E_3 \subset \Sigma(E_2), \]

which prove the equality

\[ \Sigma(E_2) = \Sigma(E_3). \tag{14} \]

Likewise, we have the equality

\[ \Sigma(E_1) = \Sigma(E_4). \tag{15} \]

This means that we only have to prove (13) for k = 2 and k = 4. The case k = 2 amounts to proving that $E_1 \subset \Sigma(E_2)$. Fix some a ∈ A. For every integer n ≥ 1 we choose $a_n \in (a, a + \tfrac{1}{n}) \cap A$. Then the equality

\[ (a,\infty] = \bigcup_{n=1}^\infty [a_n,\infty] \]

clearly shows that $(a,\infty] \in \Sigma(E_2)$.

The case k = 4 amounts to proving that $E_3 \subset \Sigma(E_4)$. Fix some a ∈ A. For every integer n ≥ 1 we choose $a_n \in (a - \tfrac{1}{n}, a) \cap A$. Then the equality

\[ [-\infty,a) = \bigcup_{n=1}^\infty [-\infty,a_n] \]

clearly shows that $[-\infty,a) \in \Sigma(E_4)$.

B. If we work on R, and we consider the collections

\[ E^0_k = \{\, E \cap R : E \in E_k \,\}, \quad k = 1, 2, 3, 4, \]

then by the Generating Theorem (Example 2.4) we have the equalities

\[ \mathrm{Bor}(R) = \Sigma(E^0_k) = S(E^0_k), \quad k = 1, 2, 3, 4. \]

(The fact that the σ-algebra $\Sigma(E^0_k)$ and the σ-ring $S(E^0_k)$ coincide is a consequence of the fact that $E^0_k$ is σ-total in R.)

C. Let X be a separable metric space. Let A ⊂ X be a dense set, and let R ⊂ (0,∞) be a subset with inf R = 0. Then the collection

\[ S_{A,R} = \{\, B_r(a) : r \in R, \ a \in A \,\} \]

is clearly a base for the metric topology. Since X is separable, one can choose both A and R to be countable, which proves that X is automatically second countable. Then for any choice of A and R, we will have the equality

\[ \mathrm{Bor}(X) = \Sigma\big(\{\, B_r(a) : r \in R, \ a \in A \,\}\big) = S\big(\{\, B_r(a) : r \in R, \ a \in A \,\}\big). \tag{16} \]

(The equality between the generated σ-algebra and σ-ring follows from Exercise 6.A.) As particular cases when the equality (16) holds, one has the metric spaces which are σ-compact.


Exercise 7*. Let I be an uncountable set, and let $(X_i)_{i\in I}$ be a collection of topological spaces. Assume that for each i ∈ I, there exists at least one non-empty closed subset $F_i \subsetneq X_i$. (This is the case for example when $X_i$ is Hausdorff, and card $X_i$ ≥ 2.) Prove that one has a strict inclusion

\[ \mathrm{Bor}\Big( \prod_{i\in I} X_i \Big) \supsetneq \Sigma\text{-}\bigtimes_{i\in I} \mathrm{Bor}(X_i). \]

Hint: For every subset J ⊂ I, define the projection map $\pi_J : \prod_{i\in I} X_i \to \prod_{i\in J} X_i$. Consider the collection

\[ A = \Big\{\, A \subset \prod_{i\in I} X_i : \text{there exists } J \subset I \text{ countable, such that } A = \pi_J^{-1}\big(\pi_J(A)\big) \,\Big\}. \]

Prove that $A \cup \{\emptyset\}$ is a σ-algebra, which contains $\bigcup_{i\in I} \pi_i^* \mathrm{Bor}(X_i)$. Prove that one has a strict inclusion $\mathrm{Bor}\big( \prod_{i\in I} X_i \big) \supsetneq A \cup \{\emptyset\}$, by constructing a non-empty closed set $F \subset \prod_{i\in I} X_i$, which does not belong to A.

Lecture 20

3. Measurable spaces and measurable maps

In this section we discuss a certain type of maps related to σ-algebras.

Definitions. A measurable space is a pair (X,A) consisting of a (non-empty) set X and a σ-algebra A on X.

Given two measurable spaces (X,A) and (Y,B), a measurable map T : (X,A) → (Y,B) is simply a map T : X → Y, with the property

\[ T^{-1}(B) \in A, \quad \forall B \in B. \tag{1} \]

Remark 3.1. In terms of the constructions outlined in Section 2, measurability for maps can be characterized as follows. Given measurable spaces (X,A) and (Y,B), and a map T : X → Y, the following are equivalent:

(i) T : (X,A) → (Y,B) is measurable;
(ii) $T^*B \subset A$;
(iii) $T_*A \supset B$.

Recall

\[ T^*B = \{\, T^{-1}(B) : B \in B \,\}; \]

\[ T_*A = \{\, B \subset Y : T^{-1}(B) \in A \,\}. \]

With these equalities, everything is immediate.

The following summarizes some useful properties of measurable maps.

Proposition 3.1. Let (X,A) be a measurable space.

(i) If A′ is any σ-algebra on X, with A′ ⊂ A, then the identity map $\mathrm{Id}_X$ : (X,A) → (X,A′) is measurable.

(ii) For any subset M ⊂ X, the inclusion map $\iota : (M, A\big|_M) \to (X,A)$ is measurable.

(iii) If (Y,B) and (Z,C) are measurable spaces, and if $(X,A) \xrightarrow{\ T\ } (Y,B) \xrightarrow{\ S\ } (Z,C)$ are measurable maps, then the composition $S \circ T$ : (X,A) → (Z,C) is again a measurable map.

Proof. (i). This is trivial, since $(\mathrm{Id}_X)^*A' = A' \subset A$.

(ii). This is again trivial, since $\iota^*A = A\big|_M$.

(iii). Start with some set C ∈ C, and let us prove that $(S \circ T)^{-1}(C) \in A$. We know that $(S \circ T)^{-1}(C) = T^{-1}\big(S^{-1}(C)\big)$. Since S is measurable, we have $S^{-1}(C) \in B$, and since T is measurable, we have $T^{-1}\big(S^{-1}(C)\big) \in A$.

Often, one would like to check the measurability condition (1) on a small collection of B’s. Such a criterion is the following.


Lemma 3.1. Let (X,A) and (Y,B) be measurable spaces. Assume B = Σ(E), for some collection of sets E ⊂ P(Y). For a map T : X → Y, the following are equivalent:

(i) T : (X,A) → (Y,B) is measurable;
(ii) $T^{-1}(E) \in A$, ∀ E ∈ E.

Proof. The implication (i) ⇒ (ii) is trivial.

To prove the implication (ii) ⇒ (i), assume (ii) holds. We first observe that condition (ii) reads $T^*E \subset A$. Since A is a σ-algebra, we get the inclusion

\[ \Sigma(T^*E) \subset A. \]

Using the Generating Theorem 2.2, we have

\[ T^*B = T^*\Sigma(E) = \Sigma(T^*E) \subset A, \]

and, by Remark 3.1, we are done.
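The content of the Lemma can be illustrated on finite sets, where σ-algebras coincide with algebras: checking preimages of the generators gives the same answer as checking preimages of every set in the generated σ-algebra. The following sketch uses two hypothetical maps `T_good` and `T_bad`; the helper names are illustrative and not part of the notes.

```python
def close_under(family, ops):
    """Smallest collection containing `family` and closed under the binary `ops`."""
    sets = {frozenset(s) for s in family}
    while True:
        new = {op(a, b) for a in sets for b in sets for op in ops} - sets
        if not new:
            return sets
        sets |= new


def algebra(family, universe):
    # Algebra (= sigma-algebra, in the finite setting) generated by `family`.
    seeds = list(family) + [universe]
    return close_under(seeds, [frozenset.union, frozenset.difference])


def preimage(T, domain, B):
    return frozenset(x for x in domain if T[x] in B)


def all_preimages_in(T, domain, A, family):
    # True when the preimage of every set in `family` belongs to A.
    return all(preimage(T, domain, B) in A for B in family)


X, Y = {1, 2, 3, 4}, {0, 1, 2}
A = algebra([{1, 2}], X)   # sigma-algebra on X
E = [{0}]                  # generators on Y
B = algebra(E, Y)          # sigma-algebra generated by E

T_good = {1: 0, 2: 0, 3: 1, 4: 2}
T_bad = {1: 0, 2: 1, 3: 1, 4: 2}

for T in (T_good, T_bad):
    on_generators = all_preimages_in(T, X, A, E)
    on_everything = all_preimages_in(T, X, A, B)
    print(on_generators, on_everything)
```

For each of the two maps the two printed values agree, which is what the equivalence (i) ⇔ (ii) predicts.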

Corollary 3.1. Let (X,A) be a measurable space, let Y be a topological Hausdorff space which is second countable, and let S be a sub-base for the topology of Y. For a map T : X → Y, the following are equivalent:

(i) $T : (X,A) \to \big(Y, \mathrm{Bor}(Y)\big)$ is a measurable map;
(ii) $T^{-1}(S) \in A$, ∀ S ∈ S.

Proof. Immediate from the above Lemma, and Proposition 2.5, which states that Bor(Y) = Σ(S).

We know (see Section 2) that the type Σ is consistent and natural. In particular, measurability behaves nicely with respect to products and disjoint unions. More explicitly one has the following.

Proposition 3.2. Let $(X_i, A_i)_{i\in I}$ be a collection of measurable spaces. Consider the sets $X = \prod_{i\in I} X_i$ and $Y = \bigsqcup_{i\in I} X_i$, and the σ-algebras

\[ A = \Sigma\text{-}\bigtimes_{i\in I} A_i \quad \text{and} \quad B = \bigvee_{i\in I} A_i. \]

Let (Z,G) be a measurable space.

(i) If we denote by $\pi_i : X \to X_i$, i ∈ I, the projection maps, then a map f : (Z,G) → (X,A) is measurable, if and only if all the maps $\pi_i \circ f : (Z,G) \to (X_i, A_i)$, i ∈ I, are measurable.

(ii) If we denote by $\varepsilon_i : X_i \to Y$, i ∈ I, the inclusion maps, then a map g : (Y,B) → (Z,G) is measurable, if and only if all the maps $g \circ \varepsilon_i : (X_i, A_i) \to (Z,G)$, i ∈ I, are measurable.

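Proof. (i). By the definition of the product σ-algebra, we know that
$$\mathcal{A} = \Sigma\Big(\bigcup_{i\in I}\pi_i^*\mathcal{A}_i\Big). \tag{2}$$
If we fix some index $i \in I$, then the obvious inclusion $\pi_i^*\mathcal{A}_i \subset \mathcal{A}$ immediately shows that $\pi_i : (X,\mathcal{A}) \to (X_i,\mathcal{A}_i)$ is measurable. Therefore, if $f : (Z,\mathcal{G}) \to (X,\mathcal{A})$ is measurable, then by Proposition 3.1 it follows that all compositions $\pi_i\circ f : (Z,\mathcal{G}) \to (X_i,\mathcal{A}_i)$, $i \in I$, are measurable.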

CHAPTER III: MEASURE THEORY 163

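Conversely, assume all the compositions $\pi_i\circ f$ are measurable, and let us show that $f : (Z,\mathcal{G}) \to (X,\mathcal{A})$ is measurable. By Lemma 3.1 and (2), all we need to prove is the fact that
$$f^*\Big(\bigcup_{i\in I}\pi_i^*\mathcal{A}_i\Big) \subset \mathcal{G},$$
which is equivalent to
$$f^*\big(\pi_i^*\mathcal{A}_i\big) \subset \mathcal{G}, \quad \forall\, i \in I.$$
But this is obvious, because $f^*\big(\pi_i^*\mathcal{A}_i\big) = (\pi_i\circ f)^*\mathcal{A}_i$, and $\pi_i\circ f$ is measurable, for all $i \in I$.

(ii). By the definition of the σ-algebra sum, we know that
$$\mathcal{B} = \bigcap_{i\in I}\varepsilon_{i*}\mathcal{A}_i. \tag{3}$$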

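If we fix some index $i \in I$, then the obvious inclusion $\varepsilon_{i*}\mathcal{A}_i \supset \mathcal{B}$ immediately shows that $\varepsilon_i : (X_i,\mathcal{A}_i) \to (Y,\mathcal{B})$ is measurable. Therefore, if $g : (Y,\mathcal{B}) \to (Z,\mathcal{G})$ is measurable, then by Proposition 3.1 it follows that all compositions $g\circ\varepsilon_i : (X_i,\mathcal{A}_i) \to (Z,\mathcal{G})$, $i \in I$, are measurable.

Conversely, assume all the compositions $g\circ\varepsilon_i$ are measurable, and let us show that $g : (Y,\mathcal{B}) \to (Z,\mathcal{G})$ is measurable. This is equivalent to the inclusion $g_*\mathcal{B} \supset \mathcal{G}$. By (3) we immediately have
$$g_*\mathcal{B} = g_*\Big(\bigcap_{i\in I}\varepsilon_{i*}\mathcal{A}_i\Big) = \bigcap_{i\in I} g_*\big(\varepsilon_{i*}\mathcal{A}_i\big). \tag{4}$$
We know however that, since the maps $g\circ\varepsilon_i$ are all measurable, we have
$$g_*\big(\varepsilon_{i*}\mathcal{A}_i\big) = (g\circ\varepsilon_i)_*\mathcal{A}_i \supset \mathcal{G}, \quad \forall\, i \in I,$$
so the desired inclusion is an immediate consequence of (4).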

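Conventions. Let $(X,\mathcal{A})$ be a measurable space. An extended real-valued function $f : (X,\mathcal{A}) \to [-\infty,\infty]$ is said to be a measurable function, if it is measurable in the above sense as a map $f : (X,\mathcal{A}) \to \big([-\infty,\infty], \mathrm{Bor}([-\infty,\infty])\big)$. If $f$ has values in $\mathbb{R}$, this is equivalent to the fact that $f : (X,\mathcal{A}) \to \big(\mathbb{R}, \mathrm{Bor}(\mathbb{R})\big)$ is a measurable map. Likewise, a complex valued function $f : (X,\mathcal{A}) \to \mathbb{C}$ is measurable, if it is measurable as a map $f : (X,\mathcal{A}) \to \big(\mathbb{C}, \mathrm{Bor}(\mathbb{C})\big)$. If $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$, we define the set
$$B_{\mathbb{K}}(X,\mathcal{A}) = \big\{f : (X,\mathcal{A}) \to \mathbb{K} \,:\, f \text{ measurable function}\big\}.$$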

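Remark 3.2. Let $(X,\mathcal{A})$ be a measurable space. If $A \subset \mathbb{R}$ is a dense subset, then the results from Section 2, combined with Lemma 2.1, show that the measurability of a function $f : (X,\mathcal{A}) \to [-\infty,\infty]$ is equivalent to any of the following conditions:

• $f^{-1}\big((a,\infty]\big) \in \mathcal{A}$, $\forall\, a \in A$;
• $f^{-1}\big([a,\infty]\big) \in \mathcal{A}$, $\forall\, a \in A$;
• $f^{-1}\big([-\infty,a)\big) \in \mathcal{A}$, $\forall\, a \in A$;
• $f^{-1}\big([-\infty,a]\big) \in \mathcal{A}$, $\forall\, a \in A$.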

Definition. If $X$ and $Y$ are topological Hausdorff spaces, a map $T : X \to Y$ is said to be Borel measurable, if $T$ is measurable as a map
$$T : \big(X,\mathrm{Bor}(X)\big) \to \big(Y,\mathrm{Bor}(Y)\big).$$
In the cases when $Y = \mathbb{R}$, $\mathbb{C}$, $[-\infty,\infty]$, a Borel measurable map will be simply called a Borel measurable function.

For $\mathbb{K} = \mathbb{R}, \mathbb{C}$, we define
$$B_{\mathbb{K}}(X) = \big\{f : X \to \mathbb{K} \,:\, f \text{ Borel measurable function}\big\}.$$

Remark 3.3. If $X$ and $Y$ are topological Hausdorff spaces, then any continuous map $T : X \to Y$ is Borel measurable. This follows from Lemma 3.1, from the fact that
$$\mathrm{Bor}(Y) = \Sigma\big(\{D \subset Y : D \text{ open}\}\big),$$
and the fact that $T^{-1}(D)$ is open, hence in $\mathrm{Bor}(X)$, for every open set $D \subset Y$.

Measurable maps behave nicely with respect to “measurable countable operations,” as suggested by the following result.

Proposition 3.3. Let $(X,\mathcal{A})$ and $(Z,\mathcal{B})$ be measurable spaces, let $I$ be a set which is at most countable, and let $(Y_i)_{i\in I}$ be a family of topological Hausdorff spaces, each of which is second countable. Suppose a measurable map $T_i : (X,\mathcal{A}) \to \big(Y_i, \mathrm{Bor}(Y_i)\big)$ is given, for each $i \in I$. Define the map $T : X \to \prod_{i\in I} Y_i$ by
$$T(x) = \big(T_i(x)\big)_{i\in I}, \quad \forall\, x \in X.$$
Equip the product space $Y = \prod_{i\in I} Y_i$ with the product topology.

For any measurable map $g : \big(Y,\mathrm{Bor}(Y)\big) \to (Z,\mathcal{B})$, the composition $g\circ T : (X,\mathcal{A}) \to (Z,\mathcal{B})$ is measurable.

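Proof. We know (see Corollary 2.3) that we have the equality
$$\mathrm{Bor}(Y) = \Sigma\text{-}\mathop{\times}\limits_{i\in I}\mathrm{Bor}(Y_i).$$
By Proposition 3.2, the map $T : (X,\mathcal{A}) \to \big(Y,\mathrm{Bor}(Y)\big)$ is measurable, so by Proposition 3.1, the composition $g\circ T : (X,\mathcal{A}) \to (Z,\mathcal{B})$ is also measurable.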

The above result has many useful applications.

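Corollary 3.2. Suppose $(X,\mathcal{A})$ is a measurable space, and $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$. Then, when equipped with point-wise addition and multiplication, the set $B_{\mathbb{K}}(X,\mathcal{A})$ is a unital $\mathbb{K}$-algebra.

Proof. Clearly the constant function 1 is measurable.

Also, if $f \in B_{\mathbb{K}}(X,\mathcal{A})$ and $\lambda \in \mathbb{K}$, then the function $\lambda f$ is again measurable, since it can be written as the composition $M_\lambda\circ f$, where $M_\lambda : \mathbb{K} \ni \alpha \longmapsto \lambda\alpha \in \mathbb{K}$ is obviously continuous.

Finally, let us show that if $f_1, f_2 \in B_{\mathbb{K}}(X,\mathcal{A})$, then $f_1 + f_2$ and $f_1\cdot f_2$ again belong to $B_{\mathbb{K}}(X,\mathcal{A})$. This is however immediate from Proposition 3.3, applied to the index set $I = \{1,2\}$, the spaces $Y_1 = Y_2 = \mathbb{K}$, the maps $T_1 = f_1$ and $T_2 = f_2$, and the continuous maps
$$g_1 : \mathbb{K}^2 \ni (\lambda_1,\lambda_2) \longmapsto \lambda_1 + \lambda_2 \in \mathbb{K}, \qquad g_2 : \mathbb{K}^2 \ni (\lambda_1,\lambda_2) \longmapsto \lambda_1\cdot\lambda_2 \in \mathbb{K},$$
since $f_1 + f_2 = g_1\circ T$ and $f_1\cdot f_2 = g_2\circ T$.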

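Corollary 3.3. If $(X,\mathcal{A})$ is a measurable space, then a complex valued function $f : X \to \mathbb{C}$ is measurable, if and only if the real valued functions $\mathrm{Re}\, f, \mathrm{Im}\, f : X \to \mathbb{R}$ are measurable.

Proof. If $f$ is measurable, then composing $f$ with the continuous maps
$$\rho : \mathbb{C} \ni z \longmapsto \mathrm{Re}\, z \in \mathbb{R} \quad\text{and}\quad \gamma : \mathbb{C} \ni z \longmapsto \mathrm{Im}\, z \in \mathbb{R},$$
immediately gives the measurability of $\mathrm{Re}\, f = \rho\circ f$ and $\mathrm{Im}\, f = \gamma\circ f$.

Conversely, if both $\mathrm{Re}\, f, \mathrm{Im}\, f : X \to \mathbb{R}$ are measurable, then the measurability of $f$ follows from Proposition 3.3, applied to $Y_1 = Y_2 = \mathbb{R}$, the functions $f_1 = \mathrm{Re}\, f$ and $f_2 = \mathrm{Im}\, f$, and to the continuous function
$$g : \mathbb{R}^2 \ni (a,b) \longmapsto a + bi \in \mathbb{C}.$$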

Corollary 3.4. Let $(X,\mathcal{A})$ be a measurable space, let $I$ be a set which is at most countable, and let $f_i : (X,\mathcal{A}) \to [-\infty,\infty]$, $i \in I$, be a collection of measurable functions. Then the functions $g, h : X \to [-\infty,\infty]$, defined by
$$g(x) = \inf\{f_i(x) : i \in I\} \quad\text{and}\quad h(x) = \sup\{f_i(x) : i \in I\}, \quad \forall\, x \in X,$$
are both measurable.

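Proof. Define the maps $m, M : \prod_{i\in I}[-\infty,\infty] \to [-\infty,\infty]$ by
$$m(x) = \inf\{x_i : i \in I\} \quad\text{and}\quad M(x) = \sup\{x_i : i \in I\}, \quad \forall\, x = (x_i)_{i\in I} \in \prod_{i\in I}[-\infty,\infty].$$
By Proposition 3.3, it suffices to prove the (Borel) measurability of the maps $m$ and $M$.

To prove the measurability of $m$, we are going to show that
$$m^{-1}\big([-\infty,a)\big) \in \mathrm{Bor}\Big(\prod_{i\in I}[-\infty,\infty]\Big), \quad \forall\, a \in \mathbb{R}.$$
But this is quite obvious, since a point $x = (x_i)_{i\in I}$ belongs to $m^{-1}\big([-\infty,a)\big)$, if and only if there exists some $j \in I$ with $x_j < a$. In other words, if we define the projections $\pi_j : \prod_{i\in I}[-\infty,\infty] \to [-\infty,\infty]$, then we have
$$m^{-1}\big([-\infty,a)\big) = \bigcup_{j\in I}\pi_j^{-1}\big([-\infty,a)\big).$$
This shows that in fact $m^{-1}\big([-\infty,a)\big)$ is open, hence clearly Borel.

To prove the measurability of $M$, we are going to show that
$$M^{-1}\big((a,\infty]\big) \in \mathrm{Bor}\Big(\prod_{i\in I}[-\infty,\infty]\Big), \quad \forall\, a \in \mathbb{R}.$$
But this is again clear, since, as before, we have the equality
$$M^{-1}\big((a,\infty]\big) = \bigcup_{j\in I}\pi_j^{-1}\big((a,\infty]\big),$$
which shows that in fact $M^{-1}\big((a,\infty]\big)$ is open, hence Borel.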

Corollary 3.5. Let $(X,\mathcal{A})$ be a measurable space, and let $f_n : (X,\mathcal{A}) \to [-\infty,\infty]$, $n \in \mathbb{N}$, be a sequence of measurable functions. Then the functions $g, h : X \to [-\infty,\infty]$, defined by
$$g(x) = \liminf_{n\to\infty} f_n(x) \quad\text{and}\quad h(x) = \limsup_{n\to\infty} f_n(x), \quad \forall\, x \in X,$$
are both measurable.

Proof. For every $n \in \mathbb{N}$, define the functions $g_n, h_n : X \to [-\infty,\infty]$ by
$$g_n(x) = \inf\{f_k(x) : k \ge n\} \quad\text{and}\quad h_n(x) = \sup\{f_k(x) : k \ge n\}, \quad \forall\, x \in X.$$
By Corollary 3.4, we know that $g_n$ and $h_n$ are measurable for all $n \in \mathbb{N}$. Since
$$g(x) = \sup\{g_n(x) : n \in \mathbb{N}\} \quad\text{and}\quad h(x) = \inf\{h_n(x) : n \in \mathbb{N}\}, \quad \forall\, x \in X,$$
the fact that both $g$ and $h$ are measurable follows again from Corollary 3.4.

Corollary 3.6. Let $(X,\mathcal{A})$ be a measurable space, and let
$$f_n : (X,\mathcal{A}) \to [-\infty,\infty], \quad n \in \mathbb{N},$$
be a sequence of measurable functions, with the property that, for each $x \in X$, the sequence $\big(f_n(x)\big)_{n=1}^\infty \subset [-\infty,\infty]$ has a limit. Then the function $f : X \to [-\infty,\infty]$, defined by
$$f(x) = \lim_{n\to\infty} f_n(x), \quad \forall\, x \in X,$$
is again measurable.

Proof. Immediate from the above result.

Exercise 1. If $f_n : \mathbb{R} \to \mathbb{R}$, $n \in \mathbb{N}$, are continuous functions, and if $f(x) = \lim_{n\to\infty} f_n(x)$ exists, for every $x \in \mathbb{R}$, then by the above Corollary we know that $f : \mathbb{R} \to [-\infty,\infty]$ is Borel measurable. Prove that the converse is not true. More explicitly, prove that there is no sequence $(f_n)_{n=1}^\infty$ of continuous functions, with
$$\lim_{n\to\infty} f_n(x) = \kappa_{\mathbb{Q}}(x), \quad \forall\, x \in \mathbb{R}.$$
Hint: Use Baire’s Theorem.

Exercise 2. Prove that a function $f : \mathbb{R} \to \mathbb{R}$, which is continuous everywhere, except for a countable set of points, is Borel measurable. As an application, prove that any monotone function is Borel measurable.

Corollary 3.6 can be generalized, as follows.

Theorem 3.1. Let $(X,\mathcal{A})$ be a measurable space, let $Y$ be a separable metric space, and let
$$T_n : (X,\mathcal{A}) \to \big(Y,\mathrm{Bor}(Y)\big), \quad n \in \mathbb{N},$$
be a sequence of measurable maps. Assume that, for every $x \in X$, the sequence $\big(T_n(x)\big)_{n=1}^\infty \subset Y$ is convergent. Define the map $T : X \to Y$ by
$$T(x) = \lim_{n\to\infty} T_n(x), \quad \forall\, x \in X.$$
Then $T : (X,\mathcal{A}) \to \big(Y,\mathrm{Bor}(Y)\big)$ is a measurable map.

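Proof. Denote by $d$ the metric on $Y$. The collection
$$\mathcal{V} = \big\{B_r(y) : y \in Y,\ r > 0\big\}$$
is a base for the topology of $Y$. Since $Y$ is second countable, it suffices then to show that
$$T^{-1}\big(B_r(y)\big) \in \mathcal{A}, \quad \forall\, y \in Y,\ r > 0. \tag{5}$$
Claim: For every $y \in Y$ and $r > 0$ one has the equality
$$T^{-1}\big(B_r(y)\big) = \bigcup_{m,n=1}^\infty \Big[\bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac1n}(y)\big)\Big]. \tag{6}$$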

Denote the set in the right hand side simply by $A$. Start first with some $x \in A$. There exist some $m, n \in \mathbb{N}$ such that
$$x \in \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac1n}(y)\big),$$
which means that
$$T_k(x) \in B_{r-\frac1n}(y), \quad \forall\, k \ge m,$$
that is,
$$d\big(T_k(x), y\big) < r - \frac1n, \quad \forall\, k \ge m.$$
Passing to the limit ($k \to \infty$) then yields
$$d\big(T(x), y\big) \le r - \frac1n < r,$$
which means that $T(x) \in B_r(y)$, i.e. $x \in T^{-1}\big(B_r(y)\big)$, thus proving the inclusion
$$A \subset T^{-1}\big(B_r(y)\big).$$

Conversely, if $x \in T^{-1}\big(B_r(y)\big)$, we get $T(x) \in B_r(y)$, i.e. $d\big(T(x), y\big) < r$. Choose an integer $n \ge 1$ such that
$$d\big(T(x), y\big) < r - \frac2n. \tag{7}$$
Since $\lim_{k\to\infty} T_k(x) = T(x)$, there exists some $m \in \mathbb{N}$ such that
$$d\big(T_k(x), T(x)\big) < \frac1n, \quad \forall\, k \ge m.$$
Combining this with (7) then gives
$$d\big(T_k(x), y\big) \le d\big(T(x), y\big) + d\big(T_k(x), T(x)\big) < r - \frac2n + \frac1n = r - \frac1n, \quad \forall\, k \ge m,$$
which means that
$$x \in \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac1n}(y)\big),$$
hence $x$ indeed belongs to $A$.

Having proven (6) we now observe that, since the $T_k$'s are measurable, it follows that
$$T_k^{-1}\big(B_{r-\frac1n}(y)\big) \in \mathcal{A}, \quad \forall\, k, n \in \mathbb{N},\ r > 0.$$
Using the fact that $\mathcal{A}$ is closed under countable intersections, it follows that
$$\bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac1n}(y)\big) \in \mathcal{A}, \quad \forall\, m, n \in \mathbb{N},\ r > 0.$$
Finally, using the fact that $\mathcal{A}$ is closed under countable unions, the desired property (5) follows.

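Exercise 3. Let $(X,\mathcal{A})$ be a measurable space, and let $(X_n)_{n=1}^\infty$ be a sequence of sets in $\mathcal{A}$, with $X = \bigcup_{n=1}^\infty X_n$. Suppose $(Y,\mathcal{B})$ is a measurable space, and $F : X \to Y$ is a map, such that
$$F\big|_{X_n} : \big(X_n, \mathcal{A}\big|_{X_n}\big) \to (Y,\mathcal{B})$$
is measurable, for all $n \in \mathbb{N}$. Prove that $F : (X,\mathcal{A}) \to (Y,\mathcal{B})$ is measurable.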

Exercise 4*. Let $\Omega_1 \subset \mathbb{R}^n$ be an open set, and let $f_1, \dots, f_n : \Omega_1 \to \mathbb{R}$ be $C^1$ functions, with the property that the matrix
$$A(p) = \Big[\frac{\partial f_j}{\partial x_k}(p)\Big]_{j,k=1}^n$$
is invertible, for every point $p \in \Omega_1$. Define the map
$$F : \Omega_1 \ni p \longmapsto \big(f_1(p), \dots, f_n(p)\big) \in \mathbb{R}^n.$$
(i) Prove that the set $\Omega_2 = F(\Omega_1)$ is open in $\mathbb{R}^n$.
(ii) Although $F : \Omega_1 \to \Omega_2$ may fail to be injective, prove that there exists a Borel measurable map $\phi : \Omega_2 \to \Omega_1$, with $F\circ\phi = \mathrm{Id}_{\Omega_2}$.

Hint: Use the Inverse Function Theorem, combined with Exercises 2 and 3.

Exercise 5*. Let $P(z)$ be a non-constant polynomial with complex coefficients. Prove that there exists a Borel measurable function $f : \mathbb{C} \to \mathbb{C}$, such that
$$P\big(f(z)\big) = z, \quad \forall\, z \in \mathbb{C}.$$
Hint: Use the preceding exercise, applied to the set $\Omega_1 = \{z \in \mathbb{C} : P'(z) \ne 0\}$.

The preceding exercise can be generalized:

Exercise 6*. Let $\Omega_1 \subset \mathbb{C}$ be a connected open set, and let $f : \Omega_1 \to \mathbb{C}$ be a non-constant holomorphic function. By the Open Mapping Theorem we know that the set $\Omega_2 = f(\Omega_1)$ is open. Prove that there exists a Borel measurable function $\phi : \Omega_2 \to \Omega_1$, such that $f\circ\phi = \mathrm{Id}\big|_{\Omega_2}$.

Hint: Use Exercise 4, applied to the set $\Omega_0 = \{z \in \Omega_1 : f'(z) \ne 0\}$. Since $f$ is non-constant, the set $\Omega_1 \smallsetminus \Omega_0$ is countable.

We continue with a discussion on the role of elementary functions.

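Proposition 3.4. Let $(X,\mathcal{A})$ be a measurable space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. For an elementary function $f \in \mathrm{Elem}_{\mathbb{K}}(X)$, the following are equivalent:

(i) $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$;
(ii) $f : (X,\mathcal{A}) \to \mathbb{K}$ is measurable.

Proof. (i) ⇒ (ii). We know that $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X) = \mathrm{Span}_{\mathbb{K}}\{\kappa_A : A \in \mathcal{A}\}$. Since $B_{\mathbb{K}}(X,\mathcal{A})$ is a vector space, it suffices to show only that $\kappa_A : (X,\mathcal{A}) \to \mathbb{K}$ is measurable, for all $A \in \mathcal{A}$. But this is trivial, since for every Borel set $B \subset \mathbb{K}$ one has either $\kappa_A^{-1}(B) = \varnothing$, or $\kappa_A^{-1}(B) = A$, or $\kappa_A^{-1}(B) = X \smallsetminus A$, or $\kappa_A^{-1}(B) = X$, according to which of the values 0 and 1 belong to $B$.

(ii) ⇒ (i). Assume now $f$ is measurable. List the range of $f$ as
$$f(X) = \{\lambda_1, \dots, \lambda_n\},$$
with $\lambda_j \ne \lambda_k$, for all $j, k \in \{1, \dots, n\}$ with $j \ne k$. Since $f$ is measurable, and the singleton sets $\{\lambda_1\}, \dots, \{\lambda_n\}$ are in $\mathrm{Bor}(\mathbb{K})$, it follows that the sets $A_j = f^{-1}\big(\{\lambda_j\}\big)$, $j = 1, \dots, n$, are all in $\mathcal{A}$. Since we clearly have
$$f = \lambda_1\kappa_{A_1} + \cdots + \lambda_n\kappa_{A_n},$$
it follows that $f$ indeed belongs to $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$.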
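The decomposition $f = \lambda_1\kappa_{A_1} + \cdots + \lambda_n\kappa_{A_n}$ used in the implication (ii) ⇒ (i) can be computed mechanically for a finitely-valued function on a finite set. This is only an illustrative sketch; the names `canonical_form` and `evaluate`, and the sample function, are assumptions made for the example.

```python
def canonical_form(f, X):
    """Group the points of X by the value of f.
    Returns {lambda_j: A_j} with A_j = f^{-1}({lambda_j})."""
    levels = {}
    for x in X:
        levels.setdefault(f(x), set()).add(x)
    return levels

def evaluate(levels, x):
    """Evaluate sum_j lambda_j * kappa_{A_j}(x)."""
    return sum(value for value, level_set in levels.items() if x in level_set)

# Hypothetical example: an elementary function with three values.
X = range(10)
f = lambda x: x % 3
levels = canonical_form(f, X)

# The level sets are pairwise disjoint, cover X, and rebuild f.
rebuilds_f = all(evaluate(levels, x) == f(x) for x in X)
```

Since the level sets $A_j$ are pairwise disjoint, exactly one term of the sum is non-zero at each point, which is why `evaluate` returns $f(x)$.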

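Remarks 3.4. A. If $(X,\mathcal{A})$ and $(Y,\mathcal{B})$ are measurable spaces, if $T : (X,\mathcal{A}) \to (Y,\mathcal{B})$ is a measurable map, and if $f \in \mathcal{B}\text{-}\mathrm{Elem}_{\mathbb{K}}(Y)$, then $f\circ T \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$. This follows from the fact that the composition $f\circ T : (X,\mathcal{A}) \to \mathbb{K}$ is measurable, and elementary.

B. If $(X,\mathcal{A})$ is a measurable space, if $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$, and if $g : f(X) \to \mathbb{K}$ is an arbitrary function, then $g\circ f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$. This follows from the fact that, if one considers the finite set $Y = f(X)$, and the σ-algebra $\mathcal{P}(Y)$ on it, then
$$(X,\mathcal{A}) \xrightarrow{\;f\;} \big(Y,\mathcal{P}(Y)\big) \xrightarrow{\;g\;} \mathbb{K}$$
are measurable. So $g\circ f$ is also measurable, and obviously elementary.

The following is an interesting converse of Corollary 3.6.

Theorem 3.2. Let $(X,\mathcal{A})$ be a measurable space, and let $f : (X,\mathcal{A}) \to [-\infty,\infty]$ be a measurable function. Then there exists a sequence $(f_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that

• $\inf\{f(y) : y \in X\} \le f_n(x) \le \sup\{f(z) : z \in X\}$, $\forall\, x \in X$, $n \ge 1$;
• $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X$.

Moreover,

(i) if $\inf\{f(x) : x \in X\} > -\infty$, then the sequence $(f_n)_{n=1}^\infty$ can be chosen to be non-decreasing, i.e. $f_n \le f_{n+1}$, $\forall\, n \in \mathbb{N}$;
(ii) if $\sup\{f(x) : x \in X\} < \infty$, then the sequence $(f_n)_{n=1}^\infty$ can be chosen to be non-increasing, i.e. $f_n \ge f_{n+1}$, $\forall\, n \in \mathbb{N}$;
(iii) if $\inf\{f(x) : x \in X\} > -\infty$ and $\sup\{f(x) : x \in X\} < \infty$, then the sequence $(f_n)_{n=1}^\infty$ can be chosen either non-decreasing, or non-increasing, and such that it converges uniformly to $f$, i.e.
$$\lim_{n\to\infty}\Big[\sup_{x\in X}\big|f_n(x) - f(x)\big|\Big] = 0.$$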

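Proof. We begin with a special case of (iii). Assume $X = [0,1]$, $\mathcal{A} = \mathrm{Bor}([0,1])$, and consider the inclusion $F : [0,1] \to [-\infty,\infty]$. For each $n \in \mathbb{N}$, define the intervals $I^n_k$, $J^n_k$, $0 \le k \le 2^n - 1$, by
$$I^n_k = \big[k/2^n, (k+1)/2^n\big), \text{ if } 0 \le k \le 2^n - 2; \qquad I^n_{2^n-1} = \big[(2^n-1)/2^n, 1\big],$$
$$J^n_k = \big(k/2^n, (k+1)/2^n\big], \text{ if } 1 \le k \le 2^n - 1; \qquad J^n_0 = \big[0, 1/2^n\big].$$
We then define, for each $n \in \mathbb{N}$, the functions $g_n, h_n : [0,1] \to \mathbb{R}$ by
$$g_n = 2^{-n}\sum_{k=0}^{2^n-1} k\,\kappa_{I^n_k} \quad\text{and}\quad h_n = 2^{-n}\sum_{k=0}^{2^n-1} (k+1)\,\kappa_{J^n_k}.$$
Remark that
$$0 \le g_n(s) < 1 \quad\text{and}\quad 0 < h_n(s) \le 1, \quad \forall\, s \in [0,1]. \tag{8}$$
Note that, for every $n \in \mathbb{N}$, we have
$$g_n(0) = 0; \qquad g_n(1) = (2^n-1)/2^n; \tag{9}$$
$$h_n(0) = 1/2^n; \qquad h_n(1) = 1. \tag{10}$$
Claim 1: The sequence $(g_n)_{n=1}^\infty$ is non-decreasing, and the sequence $(h_n)_{n=1}^\infty$ is non-increasing.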

Using (9) and (10), we only need to examine the restrictions to the open interval $(0,1)$. Fix some point $s \in (0,1)$. For every integer $n \ge 1$, define
$$p^s_n = \max\Big\{k \in \mathbb{Z} : 0 \le \frac{k}{2^n} < s\Big\}.$$
We clearly have $p^s_n < 2^n$ and
$$\frac{p^s_n}{2^n} < s \le \frac{p^s_n + 1}{2^n}. \tag{11}$$
We then have
$$g_n(s) = \begin{cases} p^s_n/2^n & \text{if } s \ne (p^s_n+1)/2^n \\ (p^s_n+1)/2^n & \text{if } s = (p^s_n+1)/2^n \end{cases} \quad\text{and}\quad h_n(s) = \frac{p^s_n+1}{2^n}. \tag{12}$$
We now estimate $g_{n+1}(s)$ and $h_{n+1}(s)$. First of all, using (11), we have
$$\frac{2p^s_n}{2^{n+1}} < s \le \frac{2p^s_n + 2}{2^{n+1}},$$
which means that either $p^s_{n+1} = 2p^s_n$, or $p^s_{n+1} = 2p^s_n + 1$. This immediately gives
$$h_{n+1}(s) = \frac{p^s_{n+1}+1}{2^{n+1}} \le \frac{2p^s_n+2}{2^{n+1}} = \frac{p^s_n+1}{2^n} = h_n(s).$$
Note that, if $s = (p^s_n+1)/2^n$, we will have $p^s_{n+1} = 2p^s_n + 1$ and $s = (p^s_{n+1}+1)/2^{n+1}$, so we get
$$g_{n+1}(s) = (p^s_{n+1}+1)/2^{n+1} = (2p^s_n+2)/2^{n+1} = (p^s_n+1)/2^n = g_n(s).$$
If $s \ne (p^s_n+1)/2^n$, then
$$g_n(s) = \frac{p^s_n}{2^n} = \frac{2p^s_n}{2^{n+1}} \le \frac{p^s_{n+1}}{2^{n+1}} \le g_{n+1}(s).$$

Claim 2: One has
$$\lim_{n\to\infty}\Big[\sup_{s\in[0,1]}\big|g_n(s) - s\big|\Big] = \lim_{n\to\infty}\Big[\sup_{s\in[0,1]}\big|h_n(s) - s\big|\Big] = 0.$$
To prove this fact we are going to estimate the differences $|g_n(s) - s|$ and $|h_n(s) - s|$. If $s = 0$ or $s = 1$, then the equalities (9) and (10) immediately show that
$$|g_n(s) - s| \le \frac{1}{2^n} \quad\text{and}\quad |h_n(s) - s| \le \frac{1}{2^n}, \quad \forall\, n \in \mathbb{N}. \tag{13}$$
If $s \in (0,1)$, then the definitions of $g_n(s)$ and $h_n(s)$ clearly show that
$$s,\ g_n(s),\ h_n(s) \in \big[p^s_n/2^n, (p^s_n+1)/2^n\big],$$
and then we see that we again have the inequalities (13). Since (13) now holds for all $s \in [0,1]$, the Claim immediately follows.
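Claims 1 and 2 can be sanity-checked with exact rational arithmetic. The sketch below is not from the notes: the function names `g` and `h` and the floor/ceiling formulas are one possible way (an assumption of this example) to evaluate $g_n$ and $h_n$ from the definitions of the intervals $I^n_k$ and $J^n_k$.

```python
from fractions import Fraction
import math

def g(n, s):
    """g_n(s) = k/2^n for s in I^n_k (the last interval is closed at 1)."""
    k = min(math.floor(s * 2**n), 2**n - 1)
    return Fraction(k, 2**n)

def h(n, s):
    """h_n(s) = (k+1)/2^n for s in J^n_k (the first interval is closed at 0)."""
    k = max(math.ceil(s * 2**n) - 1, 0)
    return Fraction(k + 1, 2**n)

# Sample points of [0, 1], including dyadic points and both end-points.
samples = [Fraction(j, 96) for j in range(97)]

# Claim 1: g_n is non-decreasing in n, h_n is non-increasing in n.
claim1 = all(g(n, s) <= g(n + 1, s) and h(n, s) >= h(n + 1, s)
             for s in samples for n in range(1, 8))

# Inequalities (13): both approximations are within 1/2^n of s.
claim2 = all(abs(g(n, s) - s) <= Fraction(1, 2**n) and
             abs(h(n, s) - s) <= Fraction(1, 2**n)
             for s in samples for n in range(1, 8))
```

Using `Fraction` avoids rounding at dyadic points, which is exactly where the two cases in (12) differ.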

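We proceed now with the proof of the theorem. Define
$$\alpha = \inf\{f(x) : x \in X\} \quad\text{and}\quad \beta = \sup\{f(x) : x \in X\}.$$
If $\alpha = \beta$, there is nothing to prove. Assume $\alpha < \beta$. Depending on the finitude of $\alpha$ and $\beta$, we define a homeomorphism $\Phi : [\alpha,\beta] \to [0,1]$, as follows.

(a) If $\alpha > -\infty$ and $\beta < \infty$, we define
$$\Phi(s) = \frac{s - \alpha}{\beta - \alpha}, \quad \forall\, s \in [\alpha,\beta].$$
(b) If $\alpha > -\infty$ and $\beta = \infty$, we define
$$\Phi(s) = \begin{cases} \frac{2}{\pi}\arctan(s - \alpha) & \text{if } s \ne \beta \\ 1 & \text{if } s = \beta \end{cases}$$
(c) If $\alpha = -\infty$ and $\beta < \infty$, we define
$$\Phi(s) = \begin{cases} 1 + \frac{2}{\pi}\arctan(s - \beta) & \text{if } s \ne \alpha \\ 0 & \text{if } s = \alpha \end{cases}$$
(d) If $\alpha = -\infty$ and $\beta = \infty$, we define
$$\Phi(s) = \begin{cases} 0 & \text{if } s = \alpha \\ \frac12 + \frac{1}{\pi}\arctan s & \text{if } \alpha < s < \beta \\ 1 & \text{if } s = \beta \end{cases}$$
Notice that $\Phi(\alpha) = 0$, $\Phi(\beta) = 1$, and
$$\alpha \le s < t \le \beta \ \Rightarrow\ \Phi(s) < \Phi(t).$$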

After these preparations, we proceed with the proof. We begin with the special cases (i), (ii) and (iii).

If $\alpha > -\infty$, we define the functions $f_n = \Phi^{-1}\circ g_n\circ\Phi\circ f$. Since $\Phi$ and $\Phi^{-1}$ are increasing, and $(g_n)_{n=1}^\infty$ is non-decreasing, it follows that $(f_n)_{n=1}^\infty$ is non-decreasing. Since $0 \le g_n(s) < 1$, $\forall\, s \in [0,1]$, we see that $\alpha \le f_n(x) < \beta$, $\forall\, x \in X$. In particular, we have $-\infty < f_n(x) < \infty$, for all $n$ and $x$. It is obvious that $f_n$ is elementary and measurable, and since $\lim_{n\to\infty} g_n(s) = s$, $\forall\, s \in [0,1]$ (by Claim 2), we immediately get $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X$.

If $\beta < \infty$, we define the functions $f_n = \Phi^{-1}\circ h_n\circ\Phi\circ f$. Since $\Phi$ and $\Phi^{-1}$ are increasing, and $(h_n)_{n=1}^\infty$ is non-increasing, it follows that $(f_n)_{n=1}^\infty$ is non-increasing. Since $0 < h_n(s) \le 1$, $\forall\, s \in [0,1]$, we see that $\alpha < f_n(x) \le \beta$, $\forall\, x \in X$. In particular, we have $-\infty < f_n(x) < \infty$, for all $n$ and $x$. It is obvious that $f_n$ is elementary and measurable, and since $\lim_{n\to\infty} h_n(s) = s$, $\forall\, s \in [0,1]$ (by Claim 2), we immediately get $\lim_{n\to\infty} f_n(x) = f(x)$, $\forall\, x \in X$.

If $\alpha > -\infty$ and $\beta < \infty$, then we can take $f_n = \Phi^{-1}\circ g_n\circ\Phi\circ f$, $\forall\, n$, or we can take $f_n = \Phi^{-1}\circ h_n\circ\Phi\circ f$, $\forall\, n$. The inequalities (13), combined with the definition (a) of $\Phi$, show that
$$|f_n(x) - f(x)| \le \frac{\beta - \alpha}{2^n}, \quad \forall\, x \in X,\ n \in \mathbb{N},$$
with any of the above choices for $(f_n)_{n=1}^\infty$.

Having proven the cases (i), (ii) and (iii), we now examine the general situation, when $\alpha = -\infty$ and $\beta = \infty$. Consider the functions $f', f'' : X \to [-\infty,\infty]$ defined by
$$f'(x) = \max\{f(x), 0\} \quad\text{and}\quad f''(x) = \min\{f(x), 0\}, \quad \forall\, x \in X.$$
By Corollary 3.4, both $f'$ and $f''$ are measurable. Since $\inf_{x\in X} f'(x) \ge 0$, by part (i), there exists a sequence $(f'_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that $\lim_{n\to\infty} f'_n(x) = f'(x)$, $\forall\, x \in X$. Since $\sup_{x\in X} f''(x) \le 0$, by part (ii), there exists a sequence $(f''_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that $\lim_{n\to\infty} f''_n(x) = f''(x)$, $\forall\, x \in X$. Define the elementary functions $f_n = f'_n + f''_n$, $n \in \mathbb{N}$. Clearly the $f_n$'s are all in $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$.

We now check that
$$\lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X. \tag{14}$$
There are two cases to examine: (a) $f(x) \ge 0$; (b) $f(x) \le 0$.

In case (a), we have $f'(x) = f(x)$ and $f''(x) = 0$, so $\lim_{n\to\infty} f'_n(x) = f(x)$ and $\lim_{n\to\infty} f''_n(x) = 0$.

In case (b), we have $f'(x) = 0$ and $f''(x) = f(x)$, so $\lim_{n\to\infty} f'_n(x) = 0$ and $\lim_{n\to\infty} f''_n(x) = f(x)$.

In either case, the equality (14) follows.
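For a bounded function, the construction $f_n = \Phi^{-1}\circ g_n\circ\Phi\circ f$ with the affine map $\Phi$ of case (a) can be tried out directly. The following is a minimal sketch under stated assumptions: the sample set `X`, the function `f`, and the helper name `simple_approximation` are hypothetical, and the infimum and supremum are taken over the finite sample only.

```python
from fractions import Fraction
import math

def simple_approximation(value, alpha, beta, n):
    """Return Phi^{-1}(g_n(Phi(value))) for Phi(s) = (s - alpha)/(beta - alpha)."""
    t = (value - alpha) / (beta - alpha)            # Phi(f(x)), a point of [0, 1]
    k = min(math.floor(t * 2**n), 2**n - 1)         # g_n(t) = k / 2^n
    return alpha + (beta - alpha) * Fraction(k, 2**n)

# Hypothetical data: f(x) = x^2 sampled on a finite grid of [-1, 2].
X = [Fraction(j, 10) for j in range(-10, 21)]
f = lambda x: x * x
alpha, beta = min(map(f, X)), max(map(f, X))

def worst_error(n):
    """sup over the sample of |f_n(x) - f(x)|."""
    return max(abs(simple_approximation(f(x), alpha, beta, n) - f(x)) for x in X)

# Uniform bound (beta - alpha)/2^n from the proof of case (iii).
bound_holds = all(worst_error(n) <= (beta - alpha) / 2**n for n in range(1, 10))

# Each f_n takes at most 2^n values, i.e. it is elementary.
few_values = all(
    len({simple_approximation(f(x), alpha, beta, n) for x in X}) <= 2**n
    for n in range(1, 10)
)
```

Replacing the floor-based step by the ceiling-based $h_n$ would give the non-increasing variant of the same approximation.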

We conclude this section with a discussion on an interesting measurable space,that appears often in connection with probability theory.

Example 3.1. Consider the space $T = \{0,1\}^{\aleph_0}$, i.e.
$$T = \big\{a = (\alpha_n)_{n=1}^\infty : \alpha_n \in \{0,1\},\ \forall\, n \in \mathbb{N}\big\}.$$
We call $T$ the space of infinite coin flippings, having in mind that an element of $T$ is the same as the outcome of an infinite sequence of coin flips (think of 0 as corresponding to tails, and 1 as corresponding to heads). Equip $T$ with the product topology. By Tihonov’s Theorem, $T$ is compact. The product topology on $T$ is in fact given by a metric $d$ defined by
$$d(a,b) = \sum_{n=1}^\infty \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty,\ b = (\beta_n)_{n=1}^\infty \in T.$$
For every number $r \ge 2$ we define a map $\phi_r : T \to [0,1]$ by
$$\phi_r(a) = (r-1)\sum_{n=1}^\infty \frac{\alpha_n}{r^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty \in T.$$
It is pretty clear that
$$\big|\phi_r(a) - \phi_r(b)\big| \le (r-1)\,d(a,b), \quad \forall\, a, b \in T,$$
so the maps $\phi_r : T \to [0,1]$, $r \ge 2$, are continuous. In particular, the set $K_r = \phi_r(T)$ is a compact subset of $[0,1]$.

Define
$$T_0 = \big\{a = (\alpha_n)_{n\in\mathbb{N}} \in T : \text{the set } \{n \in \mathbb{N} : \alpha_n = 0\} \text{ is infinite}\big\}.$$
The set $T \smallsetminus T_0$ can be described as:
$$T \smallsetminus T_0 = \big\{(\alpha_n)_{n\in\mathbb{N}} \in T : \text{there exists } N \in \mathbb{N}, \text{ such that } \alpha_n = 1,\ \forall\, n \ge N\big\}.$$
The following are well known (see Appendix B, the proof of Proposition B.2).

Facts: 1. The set $T \smallsetminus T_0$ is countable.

2. For any $r \ge 2$, and elements $a = (\alpha_n)_{n=1}^\infty$, $b = (\beta_n)_{n=1}^\infty \in T_0$, the following are equivalent:
• there exists $N \in \mathbb{N}$ such that $\alpha_N = 1$, $\beta_N = 0$, and $\alpha_n = \beta_n$, for all $n \in \mathbb{N}$ with $n < N$;
• $\phi_r(a) > \phi_r(b)$.
In particular, the map $\phi_r\big|_{T_0} : T_0 \to [0,1]$ is injective.
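The maps $\phi_r$ can be evaluated exactly on eventually constant sequences, which is enough to see why injectivity is only claimed on $T_0$. The sketch below is illustrative only: the encoding of a sequence as a finite `prefix` followed by a constant `tail` bit, and the function name `phi`, are assumptions of this example. It uses the geometric series identity $(r-1)\sum_{n>L} r^{-n} = r^{-L}$.

```python
from fractions import Fraction
from itertools import product

def phi(r, prefix, tail=0):
    """phi_r of the sequence prefix[0], prefix[1], ..., followed by `tail` forever."""
    head = sum(Fraction(bit, r**n) for n, bit in enumerate(prefix, start=1))
    return (r - 1) * head + Fraction(tail, r**len(prefix))

# (0,1,1,1,...) lies in T \ T_0 and (1,0,0,0,...) lies in T_0:
# phi_2 identifies them, while phi_3 does not.
same_for_r2 = phi(2, [0], tail=1) == phi(2, [1], tail=0)
differ_for_r3 = phi(3, [0], tail=1) != phi(3, [1], tail=0)

# Fact 2 on finite sequences (tail 0): lexicographic order matches the order of phi_r.
words = list(product((0, 1), repeat=5))
order_preserved = all(
    (a > b) == (phi(r, a) > phi(r, b))
    for r in (2, 3) for a in words for b in words
)
```

Python compares tuples lexicographically, so `a > b` is exactly the first condition of Fact 2 for sequences that vanish after the fifth term.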

The above constructions have a remarkable feature.

Theorem 3.3. Use the notations above. For a number $r \ge 2$ and subset $A \subset T$, the following are equivalent:

(i) $A \in \mathrm{Bor}(T)$;
(ii) $\phi_r(A) \in \mathrm{Bor}(K_r)$.

Proof. Throughout the proof the number $r$ will be fixed. The map $\phi_r$ will be denoted by $\phi$, and the compact set $K_r$ will be denoted by $K$.

Since $\phi : T \to K$ is continuous, it is measurable, i.e. we have the implication
$$B \in \mathrm{Bor}(K) \ \Rightarrow\ \phi^{-1}(B) \in \mathrm{Bor}(T). \tag{15}$$
Before we proceed with the actual proof, we need some preparations. Remark that, since $\phi : T \to K$ is surjective, we have the equality
$$\phi\big(\phi^{-1}(C)\big) = C, \quad \forall\, C \subset K. \tag{16}$$
Claim 1: A subset $C \subset K$ is at most countable, if and only if the set $\phi^{-1}(C) \subset T$ is at most countable.

Suppose $C$ is at most countable. If we take $A_0 = \phi^{-1}(C) \cap T_0$, and $A_1 = \phi^{-1}(C) \smallsetminus T_0$, then obviously $\phi^{-1}(C) = A_0 \cup A_1$. Since $A_1 \subset T \smallsetminus T_0$, and $T \smallsetminus T_0$ is countable, it follows that $A_1$ is at most countable, so we only need to prove that $A_0$ is at most countable. But since $\phi\big|_{T_0}$ is injective, and $A_0 \subset T_0$, it follows that $\phi\big|_{A_0} : A_0 \to C$ is injective, and then the fact that $C$ is at most countable forces $A_0$ to be at most countable.

Conversely, if $\phi^{-1}(C)$ is at most countable, then so is $\phi\big(\phi^{-1}(C)\big)$. By (16) we are done.

For each subset $A \subset T$, we define
$$\langle A\rangle = \phi^{-1}\big(\phi(A)\big).$$
Remark that $A \subset \langle A\rangle$, $\forall\, A \subset T$. Note also that, for any family $(A_i)_{i\in I}$ of subsets of $T$, one has the equality
$$\Big\langle\bigcup_{i\in I} A_i\Big\rangle = \phi^{-1}\Big(\phi\Big(\bigcup_{i\in I} A_i\Big)\Big) = \phi^{-1}\Big(\bigcup_{i\in I}\phi(A_i)\Big) = \bigcup_{i\in I}\phi^{-1}\big(\phi(A_i)\big) = \bigcup_{i\in I}\langle A_i\rangle. \tag{17}$$
As an application of Claim 1, to the set $C = \phi(T \smallsetminus T_0)$, we see that

(∗) the set $\langle T \smallsetminus T_0\rangle$ is at most countable.

Claim 2: For any subset $A \subset T_0$, one has the inclusion
$$\langle A\rangle \smallsetminus A \subset \langle T \smallsetminus T_0\rangle.$$
In particular, the difference $\langle A\rangle \smallsetminus A$ is at most countable.

Start with an arbitrary element $x \in \langle A\rangle \smallsetminus A$. This means that $x \notin A$, but $\phi(x) \in \phi(A)$, which means that there exists some $a \in A$, with $\phi(x) = \phi(a)$. Assume now $x \notin T \smallsetminus T_0$, which means that $x \in T_0$. But then, the fact that $x, a \in T_0$, combined with the injectivity of $\phi\big|_{T_0}$, will force $x = a$, which is impossible since $a \in A$ and $x \notin A$. Therefore $x \in T \smallsetminus T_0 \subset \langle T \smallsetminus T_0\rangle$.

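Claim 3: For any set $A \subset T$, the difference $\langle A\rangle \smallsetminus A$ is at most countable.

Take $A_0 = A \cap T_0$ and $A_1 = A \smallsetminus A_0$. Notice that, since $A_1 \subset T \smallsetminus T_0$, we have
$$\langle A_1\rangle = \phi^{-1}\big(\phi(A_1)\big) \subset \phi^{-1}\big(\phi(T \smallsetminus T_0)\big) = \langle T \smallsetminus T_0\rangle,$$
so it follows that $\langle A_1\rangle$ is at most countable. We obviously have $A = A_0 \cup A_1$, so by (17)
$$\langle A\rangle = \langle A_0\rangle \cup \langle A_1\rangle.$$
But now we are done, since
$$\langle A\rangle \smallsetminus A = \big(\langle A_0\rangle \cup \langle A_1\rangle\big) \smallsetminus \big(A_0 \cup A_1\big) \subset \big(\langle A_0\rangle \smallsetminus A_0\big) \cup \langle A_1\rangle,$$
and both $\langle A_0\rangle \smallsetminus A_0$ (by Claim 2) and $\langle A_1\rangle$ are at most countable.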

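Claim 4: For any subset $A \subset T$, one has the inclusion
$$\phi(T \smallsetminus A) \supset K \smallsetminus \phi(A), \tag{18}$$
and the difference $\phi(T \smallsetminus A) \smallsetminus \big(K \smallsetminus \phi(A)\big)$ is at most countable.

The inclusion (18) is pretty obvious, from the surjectivity of $\phi$. In order to prove that the difference
$$C = \phi(T \smallsetminus A) \smallsetminus \big(K \smallsetminus \phi(A)\big) = \phi(T \smallsetminus A) \cap \phi(A)$$
is at most countable, by Claim 1, it suffices to prove that $\phi^{-1}(C)$ is at most countable. We have
$$\phi^{-1}(C) = \phi^{-1}\big(\phi(T \smallsetminus A) \cap \phi(A)\big) = \phi^{-1}\big(\phi(T \smallsetminus A)\big) \cap \phi^{-1}\big(\phi(A)\big) = \langle T \smallsetminus A\rangle \cap \langle A\rangle.$$
We can write $\phi^{-1}(C) = A_1 \cup A_2$, where
$$A_1 = (T \smallsetminus A) \cap \langle A\rangle \quad\text{and}\quad A_2 = \big[\langle T \smallsetminus A\rangle \smallsetminus (T \smallsetminus A)\big] \cap \langle A\rangle,$$
so it suffices to prove that both $A_1$ and $A_2$ are at most countable. But these facts are immediate from Claim 3, since $A_1 = \langle A\rangle \smallsetminus A$, and $A_2 \subset \langle T \smallsetminus A\rangle \smallsetminus (T \smallsetminus A)$.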

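We can now proceed with the proof of the theorem. Define
$$\mathcal{A} = \big\{A \subset T : \phi(A) \in \mathrm{Bor}(K)\big\},$$
so that what we need to prove is the equality $\mathcal{A} = \mathrm{Bor}(T)$.

First, remark that, if $A \in \mathcal{A}$, then $\phi(A) \in \mathrm{Bor}(K)$, and the fact that $\phi$ is Borel measurable will force $\langle A\rangle = \phi^{-1}\big(\phi(A)\big)$ to be a Borel set in $T$. But since $\langle A\rangle \smallsetminus A$ is at most countable, hence Borel, it follows that
$$A = \langle A\rangle \smallsetminus \big(\langle A\rangle \smallsetminus A\big)$$
is again Borel. Therefore, we have the inclusion $\mathcal{A} \subset \mathrm{Bor}(T)$.

Second, remark that if $F \subset T$ is a compact subset, then the continuity of $\phi$ gives the fact that $\phi(F)$ is compact, hence Borel. This then forces $F \in \mathcal{A}$. Therefore $\mathcal{A}$ contains the collection $\mathcal{C}_T$ of all compact subsets of $T$.

Now we have
$$\mathcal{C}_T \subset \mathcal{A} \subset \mathrm{Bor}(T) = \Sigma(\mathcal{C}_T),$$
so all we need to prove is the fact that $\mathcal{A}$ is a σ-algebra, i.e. we have the properties

(a) $A \in \mathcal{A} \Rightarrow T \smallsetminus A \in \mathcal{A}$;
(b) for any sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}$, the union $\bigcup_{n=1}^\infty A_n$ also belongs to $\mathcal{A}$.

To check (a) start with some set $A \in \mathcal{A}$. We know that $\phi(A) \in \mathrm{Bor}(K)$, and we want to show that $\phi(T \smallsetminus A)$ is again Borel. By Claim 4, we know we can write
$$\phi(T \smallsetminus A) = \big[K \smallsetminus \phi(A)\big] \cup C,$$
for some set $C \subset K$ which is at most countable. Since $C$ and $K \smallsetminus \phi(A)$ are Borel, this shows that $\phi(T \smallsetminus A)$ is also Borel.

Property (b) is obvious, since $\phi(A_n)$, $n \ge 1$, are all Borel, and
$$\phi\Big(\bigcup_{n=1}^\infty A_n\Big) = \bigcup_{n=1}^\infty \phi(A_n).$$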

Corollary 3.7. Use the above notations. For a number $r \ge 2$ and a subset $B \subset K_r$, the following are equivalent:

(i) $B \in \mathrm{Bor}(K_r)$;
(ii) $\phi_r^{-1}(B) \in \mathrm{Bor}(T)$.

Proof. The implication (i) ⇒ (ii) is trivial, since $\phi_r$ is continuous, hence measurable.

Conversely, if the set $A = \phi_r^{-1}(B)$ is Borel, then by the Theorem, $\phi_r(A)$ is Borel. But since $\phi_r$ is surjective, we have $B = \phi_r(A)$.

Comments. From the above results, we see that $\phi_r : T \to K_r$ “almost preserves Borel structures.” More explicitly, if one considers the maps
$$\Phi_r : \mathcal{P}(T) \ni A \longmapsto \phi_r(A) \in \mathcal{P}(K_r),$$
$$\Psi_r : \mathcal{P}(K_r) \ni B \longmapsto \phi_r^{-1}(B) \in \mathcal{P}(T),$$
then

• $(\Phi_r\circ\Psi_r)(B) = B$, for all $B \subset K_r$;
• $(\Psi_r\circ\Phi_r)(A) \supset A$, and $(\Psi_r\circ\Phi_r)(A) \smallsetminus A$ is at most countable, for all $A \subset T$;
• $B \in \mathrm{Bor}(K_r) \Leftrightarrow \Psi_r(B) \in \mathrm{Bor}(T)$;
• $A \in \mathrm{Bor}(T) \Leftrightarrow \Phi_r(A) \in \mathrm{Bor}(K_r)$.

In the particular case $r = 2$, we know that $K_2 = [0,1]$, so we can think of the measurable space $\big([0,1], \mathrm{Bor}([0,1])\big)$ as “approximately the same” as the measurable space $\big(T,\mathrm{Bor}(T)\big)$.

The case $r = 3$ will be an interesting one, especially for constructing various counter-examples. The compact set $K_3 \subset [0,1]$ is called the ternary Cantor set.

It turns out that there exists another useful description of the ternary Cantor set $K_3$, which yields some interesting properties.

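Notations. We keep the notations above. An element $a = (\alpha_n)_{n=1}^\infty \in T$ will be called finite, if there exists some $N \in \mathbb{N}$, such that $\alpha_n = 0$, $\forall\, n \ge N$. We define
$$T_{\mathrm{fin}} = \big\{a \in T : a \text{ finite}\big\}.$$
Remark that $T_{\mathrm{fin}} \subset T_0$. In particular the map $\phi_3\big|_{T_{\mathrm{fin}}} : T_{\mathrm{fin}} \to K_3$ is injective. For $a \in T_{\mathrm{fin}}$ we define its length as
$$\ell(a) = \min\{N \in \mathbb{N} : \alpha_n = 0,\ \forall\, n \ge N\} - 1.$$
With this definition, for every $a = (\alpha_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, we have
$$\alpha_{\ell(a)} = 1 \ (\text{if } \ell(a) \ge 1) \quad\text{and}\quad \alpha_n = 0,\ \forall\, n > \ell(a). \tag{19}$$
We define
$$\Lambda = \big\{(k,a) \in \mathbb{Z}\times T_{\mathrm{fin}} : k \ge \ell(a)\big\}.$$
Finally, for every pair $\lambda = (k,a) \in \Lambda$, we define the open interval
$$I_\lambda = \Big(\phi_3(a) + \frac{1}{3^{k+1}},\ \phi_3(a) + \frac{2}{3^{k+1}}\Big).$$
Remark that, using (19) we have
$$\phi_3(a) \le \sum_{n=1}^{\ell(a)} \frac{2}{3^n} = 1 - \frac{1}{3^{\ell(a)}},$$
with the convention that the sum is 0, if $\ell(a) = 0$. We then get
$$\phi_3(a) + \frac{2}{3^{k+1}} \le 1 - \frac{1}{3^{\ell(a)}} + \frac{2}{3^{k+1}} < 1 - \frac{1}{3^{\ell(a)}} + \frac{1}{3^k} \le 1,$$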

which gives the inclusion $I_\lambda \subset (0,1)$.

The following result describes an alternative construction of $K_3$.

Theorem 3.4. Use the notations above.

(i) The set $T_{\mathrm{fin}}$ is dense in $T$.
(ii) The system $(I_\lambda)_{\lambda\in\Lambda}$ is pair-wise disjoint.
(iii) $\bigcup_{\lambda\in\Lambda} I_\lambda = [0,1] \smallsetminus K_3$.

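Proof. The map $\phi_3$ will be simply denoted by $\phi$, and the Cantor set $K_3$ will be denoted simply by $K$.

(i). Fix some element $a = (\alpha_n)_{n=1}^\infty \in T$. For every integer $k \ge 1$ define the element $a^k = (\alpha^k_n)_{n=1}^\infty \in T$, by
$$\alpha^k_n = \begin{cases} \alpha_n & \text{if } n \le k \\ 0 & \text{if } n > k \end{cases}$$
It is obvious that $a^k \in T_{\mathrm{fin}}$, $\forall\, k \in \mathbb{N}$. The inequality
$$d(a, a^k) = \sum_{n=k+1}^\infty \frac{\alpha_n}{2^n} \le \sum_{n=k+1}^\infty \frac{1}{2^n} = \frac{1}{2^k}, \quad \forall\, k \in \mathbb{N},$$
then immediately shows that $\lim_{k\to\infty} a^k = a$.

(ii). Assume $\lambda, \mu \in \Lambda$ are such that $\lambda \ne \mu$, and let us prove that $I_\lambda \cap I_\mu = \varnothing$.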

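Let $\lambda = (j,a)$ and $\mu = (k,b)$, where $a = (\alpha_n)_{n=1}^\infty$ and $b = (\beta_n)_{n=1}^\infty$ are elements in $T_{\mathrm{fin}}$ with $\ell(a) \le j$ and $\ell(b) \le k$. Since $\lambda \ne \mu$, we have one (or both) of the following cases: (a) $a \ne b$, or (b) $j \ne k$.

In case (a) we take
$$m = \min\{n \in \mathbb{N} : \alpha_n \ne \beta_n\}.$$
Without any loss of generality, we can assume that $\alpha_m = 0$ and $\beta_m = 1$. Note that $k \ge \ell(b) \ge m \ge 1$. We are going to prove that $I_\lambda \cap I_\mu = \varnothing$, by showing that the right end-point of $I_\lambda$ is not greater than the left end-point of $I_\mu$, that is,
$$\phi(a) + \frac{2}{3^{j+1}} \le \phi(b) + \frac{1}{3^{k+1}}. \tag{20}$$
Define the number
$$M = \sum_{n=1}^{m-1}\frac{\alpha_n}{3^n} = \sum_{n=1}^{m-1}\frac{\beta_n}{3^n},$$
with the convention that $M = 0$, if $m = 1$. We have:
$$\phi(a) = 2M + 2\sum_{n=m+1}^{\ell(a)}\frac{\alpha_n}{3^n} \le 2M + 2\sum_{n=m+1}^{\ell(a)}\frac{1}{3^n} = 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}};$$
$$\phi(b) = 2M + \frac{2}{3^m} + 2\sum_{n=m+1}^{\ell(b)}\frac{\beta_n}{3^n} \ge 2M + \frac{2}{3^m}.$$
The inequality (20) then follows immediately from:
$$\phi(a) + \frac{2}{3^{j+1}} \le 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}} + \frac{2}{3^{j+1}} < 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}} + \frac{1}{3^j} \le 2M + \frac{1}{3^m} < 2M + \frac{2}{3^m} \le \phi(b) < \phi(b) + \frac{1}{3^{k+1}}.$$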

In case (b), based on the fact that we have proven case (a), we can assume, without any loss of generality, that $a = b$ and $j < k$. In this case we have
$$\phi(b) + \frac{2}{3^{k+1}} = \phi(a) + \frac{2}{3^{k+1}} < \phi(a) + \frac{1}{3^k} \le \phi(a) + \frac{1}{3^{j+1}},$$
which means that the right end-point of $I_\mu$ is not greater than the left end-point of $I_\lambda$, so again we get $I_\lambda \cap I_\mu = \varnothing$.

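For the proof of (iii) we are going to use the space
$$P = \{0,1,2\}^{\aleph_0} = \big\{(\alpha_n)_{n=1}^\infty : \alpha_n \in \{0,1,2\},\ \forall\, n \in \mathbb{N}\big\}.$$
Exactly as is the case with $T$, the product space $P$ is compact with respect to the product topology, which is given by the metric
$$d(a,b) = \sum_{n=1}^\infty \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty,\ b = (\beta_n)_{n=1}^\infty \in P.$$
The map $\psi : P \to [0,1]$, defined by
$$\psi(a) = \sum_{n=1}^\infty \frac{\alpha_n}{3^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty \in P,$$
satisfies
$$\big|\psi(a) - \psi(b)\big| \le d(a,b), \quad \forall\, a, b \in P,$$
hence it is continuous. Note also that $\psi$ is surjective. We can write $\phi = \psi\circ\rho$, where
$$\rho : \{0,1\}^{\aleph_0} \ni (\alpha_n)_{n=1}^\infty \longmapsto (2\alpha_n)_{n=1}^\infty \in \{0,1,2\}^{\aleph_0}.$$
Note also that $\rho : T \to P$ is continuous, since we clearly have
$$d\big(\rho(a), \rho(b)\big) \le 2\,d(a,b), \quad \forall\, a, b \in T.$$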

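We now proceed with the proof of (iii). Denote the open set $\bigcup_{\lambda\in\Lambda} I_\lambda$ simply by $D$. Since $T_{\mathrm{fin}}$ is dense in $T$, it follows that $\phi(T_{\mathrm{fin}})$ is dense in $K = \phi(T)$. Since $D$ is open, the set $[0,1] \smallsetminus D$ is closed, so in order to prove the inclusion $K \subset [0,1] \smallsetminus D$ it suffices to prove the inclusion
$$\phi(T_{\mathrm{fin}}) \subset [0,1] \smallsetminus D.$$
Using the map $\psi : P \to [0,1]$ and the equality $\phi = \psi\circ\rho$, the above inclusion is equivalent to
$$P \smallsetminus \rho(T_{\mathrm{fin}}) \supset \psi^{-1}(D). \tag{21}$$
In order to prove the inclusion $[0,1] \smallsetminus D \subset K$, using the surjectivity of $\psi$, it suffices to prove the inclusion
$$\psi^{-1}(D) \supset P \smallsetminus \psi^{-1}(K). \tag{22}$$
To prove (21) start with some element $a = (\alpha_n)_{n=1}^\infty \in \psi^{-1}(D)$, which means that there exists some $b = (\beta_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, and an integer $k \ge \ell(b)$, such that $\psi(a) \in I_{(k,b)}$, i.e.
$$\frac{2\beta_1}{3} + \cdots + \frac{2\beta_k}{3^k} + \frac{1}{3^{k+1}} < \sum_{n=1}^\infty \frac{\alpha_n}{3^n} < \frac{2\beta_1}{3} + \cdots + \frac{2\beta_k}{3^k} + \frac{2}{3^{k+1}}. \tag{23}$$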

178 LECTURE 20

We prove that a 6∈ ρ(Tfin) by contradiction. Assume a ∈ ρ(Tfin), which means thatthere exists c = (γn)∞n=1 ∈ Tfin, such that αn = 2γn, ∀n ∈ N. Define the elementb = (βn)∞n=1 ∈ Tfin by

βn =

βn if n ≤ k1 if n = k + 10 if n > k + 1

With this definition, the inequalities (23) give

(24) φ(b) < φ(b) +1

3k+1< φ(c) < φ(b).

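By Fact 2 above, there exist $N, N' \in \mathbb{N}$ such that

• $\gamma_N = 1$, $\beta_N = 0$, and $\gamma_n = \beta_n$, for all $n \in \mathbb{N}$ with $n < N$;
• $\gamma_{N'} = 0$, $\tilde\beta_{N'} = 1$, and $\gamma_n = \tilde\beta_n$, for all $n \in \mathbb{N}$ with $n < N'$.

We will examine three cases: (a) $N < N'$, (b) $N = N'$, or (c) $N > N'$.

Case (b) is clearly impossible. In case (a), the inequality $N < N'$ forces $\beta_N = 0$, $\gamma_N = 1$ and $\tilde\beta_N = \gamma_N$, which means that $\tilde\beta_N = 1 \ne \beta_N = 0$. This clearly forces $N = k+1 > \ell(b)$, which in particular gives $\beta_n = \tilde\beta_n = 0$, $\forall\, n > N$, so we clearly have $\gamma_n \ge \tilde\beta_n$, $\forall\, n \in \mathbb{N}$, so we get $\phi(c) \ge \phi(\tilde b)$, thus contradicting (24). In case (c), we have $\gamma_{N'} = 0$, $\tilde\beta_{N'} = 1$, and since $N' < N$, we also have $\beta_{N'} = \gamma_{N'} = 0$. As before this would force $N' = k+1$. We then have

\begin{align*}
\phi(c) = 2\sum_{n=1}^\infty \frac{\gamma_n}{3^n}
&= 2\sum_{n=1}^{N'-1} \frac{\gamma_n}{3^n} + \frac{2\gamma_{N'}}{3^{N'}} + 2\sum_{n=N'+1}^\infty \frac{\gamma_n}{3^n}
 = 2\sum_{n=1}^{k} \frac{\beta_n}{3^n} + 0 + 2\sum_{n=k+2}^\infty \frac{\gamma_n}{3^n} \\
&= \phi(b) + 2\sum_{n=k+2}^\infty \frac{\gamma_n}{3^n}
 \le \phi(b) + 2\sum_{n=k+2}^\infty \frac{1}{3^n}
 = \phi(b) + \frac{1}{3^{k+1}},
\end{align*}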

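again contradicting (24).

To prove (22), we start with some element $a \in P \smallsetminus \psi^{-1}(K)$, and we show that $\psi(a) \in D$. The fact that $a \notin \psi^{-1}(K)$ forces the fact that $a \notin \rho(T)$. In particular, this gives the fact that $a = (\alpha_n)_{n=1}^\infty \in \{0,1,2\}^{\aleph_0}$ and there exists some $n \in \mathbb{N}$ such that $\alpha_n = 1$. Put

\[ N = \min\{ n \in \mathbb{N} : \alpha_n = 1 \}. \]

Define the element $b = (\beta_n)_{n=1}^\infty \in \{0,1\}^{\aleph_0}$ by

\[ \beta_n = \begin{cases} \alpha_n/2 & \text{if } n < N \\ 0 & \text{if } n \ge N \end{cases} \]

Notice that $b \in T_{fin}$, and $\ell(b) \le N-1$. Notice also that $2\beta_n = \alpha_n$, for all $n \in \mathbb{N}$ with $n \le N-1$. In particular, using the equality $\alpha_N = 1$, this gives

\[ \phi(b) + \frac{1}{3^N} = 2\sum_{n=1}^{N-1} \frac{\beta_n}{3^n} + \frac{\alpha_N}{3^N} = \sum_{n=1}^{N} \frac{\alpha_n}{3^n} \le \sum_{n=1}^{\infty} \frac{\alpha_n}{3^n} = \psi(a); \tag{25} \]

\[ \phi(b) + \frac{2}{3^N} = 2\sum_{n=1}^{N-1} \frac{\beta_n}{3^n} + \frac{\alpha_N}{3^N} + \sum_{n=N+1}^{\infty} \frac{2}{3^n} = \sum_{n=1}^{N} \frac{\alpha_n}{3^n} + \sum_{n=N+1}^{\infty} \frac{2}{3^n} \ge \sum_{n=1}^{\infty} \frac{\alpha_n}{3^n} = \psi(a). \tag{26} \]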


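Consider the pair $\lambda = (N-1, b) \in \Lambda$. We are going to show that $\psi(a) \in I_\lambda$, i.e. we have the inequalities

\[ \phi(b) + \frac{1}{3^N} < \psi(a) < \phi(b) + \frac{2}{3^N}. \tag{27} \]

By (25) and (26) it suffices to prove only that

\[ \psi(a) \ne \phi(b) + \frac{1}{3^N} \quad\text{and}\quad \psi(a) \ne \phi(b) + \frac{2}{3^N}. \]

If $\psi(a) = \phi(b) + \frac{1}{3^N}$, then by the inequalities (25), we are forced to have

\[ \alpha_n = 0, \quad \forall\, n > N. \tag{28} \]

If $\psi(a) = \phi(b) + \frac{2}{3^N}$, then by the inequalities (26), we are forced to have

\[ \alpha_n = 2, \quad \forall\, n > N. \tag{29} \]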

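If (28) holds, we define $c = (\gamma_n)_{n=1}^\infty \in T$, by

\[ \gamma_n = \begin{cases} \alpha_n/2 & \text{if } n < N \\ 0 & \text{if } n = N \\ 1 & \text{if } n > N \end{cases} \]

and we will have

\[ \phi(c) = 2\sum_{n=1}^\infty \frac{\gamma_n}{3^n} = \sum_{n=1}^{N-1} \frac{2\gamma_n}{3^n} + 2\sum_{n=N+1}^\infty \frac{1}{3^n} = \sum_{n=1}^{N-1} \frac{\alpha_n}{3^n} + \frac{1}{3^N} = \psi(a), \]

thus forcing $\psi(a) \in K$, which is impossible.

If (29) holds, we define $c = (\gamma_n)_{n=1}^\infty \in T$, by

\[ \gamma_n = \begin{cases} \alpha_n/2 & \text{if } n < N \\ 1 & \text{if } n = N \\ 0 & \text{if } n > N \end{cases} \]

and we will have

\[ \phi(c) = 2\sum_{n=1}^{N-1} \frac{\gamma_n}{3^n} + \frac{2}{3^N} = \sum_{n=1}^{N-1} \frac{2\gamma_n}{3^n} + \frac{1}{3^N} + \sum_{n=N+1}^\infty \frac{2}{3^n} = \sum_{n=1}^\infty \frac{\alpha_n}{3^n} = \psi(a), \]

thus forcing again $\psi(a) \in K$, which is impossible.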

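Exercise 7. Using the notations above, prove that the set

\[ [0,1] \smallsetminus K_3 = \bigcup_{\lambda\in\Lambda} I_\lambda \]

is dense in $[0,1]$.

Hints: Define the set

\[ P_0 = \big\{ (\alpha_n)_{n=1}^\infty \in \{0,1,2\}^{\aleph_0} : \text{the set } \{ n \in \mathbb{N} : \alpha_n = 1 \} \text{ is infinite} \big\}. \]

Prove that $P_0$ is dense in $P$, and prove that $\psi(P_0) \subset [0,1] \smallsetminus K$. (Use the arguments employed in the proof of part (iii).)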

Remarks 3.5. If we set $\Lambda_n = \Lambda \cap (\{n\} \times P)$, then we can write the complement of the ternary Cantor set as

\[ [0,1] \smallsetminus K_3 = \bigcup_{n=0}^\infty D_n, \]


where

\[ D_n = \bigcup_{\lambda\in\Lambda_n} I_\lambda. \]

Then the system of open sets $(D_n)_{n\ge 0}$ is pair-wise disjoint. Moreover, each $D_n$ is a union of $2^n$ disjoint intervals of length $1/3^{n+1}$.
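The count and the lengths in the remark above can be checked numerically. The following is a minimal Python sketch, assuming (consistently with (23)) that the removed interval attached to a finite 0/1 string b of length n is (φ(b) + 1/3^(n+1), φ(b) + 2/3^(n+1)), with φ(b) = 2 Σ β_k/3^k. The function names `phi`, `removed_intervals` and `check_level` are hypothetical and used only for illustration.

```python
from fractions import Fraction
from itertools import product

def phi(bits):
    """phi(b) = 2 * sum_k b_k / 3^k for a finite 0/1 tuple b (trailing zeros omitted)."""
    return sum(Fraction(2 * b, 3 ** k) for k, b in enumerate(bits, start=1))

def removed_intervals(n):
    """Open intervals (phi(b) + 1/3^(n+1), phi(b) + 2/3^(n+1)), b ranging over {0,1}^n."""
    step = Fraction(1, 3 ** (n + 1))
    return sorted((phi(b) + step, phi(b) + 2 * step)
                  for b in product((0, 1), repeat=n))

def check_level(n):
    """Verify the claims about D_n and return its total length."""
    ivs = removed_intervals(n)
    assert len(ivs) == 2 ** n                      # 2^n intervals
    assert all(hi - lo == Fraction(1, 3 ** (n + 1)) for lo, hi in ivs)
    # sorted open intervals are disjoint when each ends before the next begins
    assert all(ivs[i][1] <= ivs[i + 1][0] for i in range(len(ivs) - 1))
    return sum(hi - lo for lo, hi in ivs)
```

Under these assumptions, level n contributes total length 2^n/3^(n+1), and the lengths of levels 0 through N add up to 1 - (2/3)^(N+1).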

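Since $\operatorname{card} T_0 = \mathfrak{c}$, and the map $\phi_3\big|_{T_0} : T_0 \to K_3$ is injective, we get $\operatorname{card} K_3 \ge \mathfrak{c}$. Since we also have $\operatorname{card} K_3 \le \operatorname{card} \mathbb{R} = \mathfrak{c}$, we get in fact the equality

\[ \operatorname{card} K_3 = \mathfrak{c}. \]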

Lecture 21

4. The concept of measure

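Definition. Let $X$ be a non-empty set, and let $\mathcal{E}$ be an arbitrary collection of subsets of $X$. Assume $\emptyset \in \mathcal{E}$. A measure on $\mathcal{E}$ is a map $\mu : \mathcal{E} \to [0,\infty]$ with the following properties:

(0) $\mu(\emptyset) = 0$.
(add$_\sigma$) Whenever $(E_n)_{n=1}^\infty \subset \mathcal{E}$ is a pair-wise disjoint sequence, with $\bigcup_{n=1}^\infty E_n \in \mathcal{E}$, it follows that we have the equality

\[ \mu\Big( \bigcup_{n=1}^\infty E_n \Big) = \sum_{n=1}^\infty \mu(E_n). \]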

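Property (add$_\sigma$) is called σ-additivity.

Convention. For a sequence $(\alpha_n)_{n=1}^\infty \subset [0,\infty]$ we define

\[ \sum_{n=1}^\infty \alpha_n = \begin{cases} \displaystyle\sum_{n=1}^\infty \alpha_n & \text{if } \alpha_n \in [0,\infty),\ \forall\, n \in \mathbb{N} \\ \infty & \text{if there exists } n \in \mathbb{N} \text{ with } \alpha_n = \infty. \end{cases} \]

(Of course, in the first case, it is still possible to have $\sum_{n=1}^\infty \alpha_n = \infty$.)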

Remark 4.1. If $\mu$ is a measure on $\mathcal{E}$, then $\mu$ is additive, i.e.

(add) Whenever $(E_n)_{n=1}^N \subset \mathcal{E}$ is a finite pair-wise disjoint system, such that $E_1 \cup \cdots \cup E_N \in \mathcal{E}$, it follows that we have the equality

\[ \mu(E_1 \cup \cdots \cup E_N) = \mu(E_1) + \cdots + \mu(E_N). \]

This follows from (add$_\sigma$) and (0), after completing the sequence $E_1, \dots, E_N$ to an infinite sequence by taking $E_n = \emptyset$, $\forall\, n > N$.
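As a sanity check of Remark 4.1, here is a minimal Python sketch that uses the cardinality of finite sets as the map µ (an illustrative choice, not something the remark depends on). It pads a finite disjoint family with empty sets, as in the remark, and compares both sides of (add). The names `counting_measure`, `is_pairwise_disjoint` and `check_additivity` are hypothetical.

```python
def counting_measure(E):
    """Cardinality of a finite Python set, playing the role of mu(E)."""
    return len(E)

def is_pairwise_disjoint(sets):
    """True when no two distinct members of the list share an element."""
    return all(a.isdisjoint(b) for i, a in enumerate(sets) for b in sets[i + 1:])

def check_additivity(sets, pad=5):
    """Pad E_1, ..., E_N with empty sets (E_n = empty for n > N) and test (add)."""
    assert is_pairwise_disjoint(sets)
    padded = list(sets) + [set() for _ in range(pad)]
    union = set().union(*padded)
    return counting_measure(union) == sum(counting_measure(E) for E in padded)
```

The padding does not change the union and contributes only zeros to the sum, which is exactly why (add) follows from (add$_\sigma$) together with µ(∅) = 0.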

Comment. The most natural setting for measures is the one when $\mathcal{E}$ is a σ-ring. In this case, the stipulation that $\bigcup_{n=1}^\infty E_n \in \mathcal{E}$, which appears in the definition, is superfluous.

The purpose of this section is to study measures on more rudimentary collections.

Examples 4.1. Let $X$ be a non-empty set.

A. If we take $\mathcal{E} = \{\emptyset, X\}$ and we define $\mu(\emptyset) = 0$ and $\mu(X)$ to be any element in $[0,\infty]$, then $\mu$ is obviously a measure on $\{\emptyset, X\}$.

B. If we take $\mathcal{E} = \mathcal{P}(X)$ and we define

\[ \mu(E) = \begin{cases} 0 & \text{if } E = \emptyset \\ \infty & \text{if } E \ne \emptyset \end{cases} \]

then $\mu$ is a measure on $\mathcal{P}(X)$.


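C. If we take $\mathcal{E} = \mathcal{P}(X)$ and we define

\[ \mu(E) = \begin{cases} \operatorname{card} E & \text{if } E \text{ is finite} \\ \infty & \text{if } E \text{ is infinite} \end{cases} \]

then $\mu$ is a measure on $\mathcal{P}(X)$. This is called the counting measure.

Exercise 1. Let $X_1, X_2$ be non-empty sets, let $f : X_1 \to X_2$ be a map, and let $\mathcal{E}_k \subset \mathcal{P}(X_k)$ be arbitrary collections with $\emptyset \in \mathcal{E}_k$, $k = 1, 2$. Let $\mu_1$ be a measure on $\mathcal{E}_1$ and $\mu_2$ be a measure on $\mathcal{E}_2$. Consider the collections

\[ f_*\mathcal{E}_1 = \{ A \subset X_2 : f^{-1}(A) \in \mathcal{E}_1 \} \subset \mathcal{P}(X_2); \]

\[ f^*\mathcal{E}_2 = \{ f^{-1}(A) : A \in \mathcal{E}_2 \} \subset \mathcal{P}(X_1). \]

A. Prove that the map $f_*\mu_1 : f_*\mathcal{E}_1 \to [0,\infty]$, defined by

\[ (f_*\mu_1)(A) = \mu_1\big( f^{-1}(A) \big), \quad \forall\, A \in f_*\mathcal{E}_1, \]

is a measure on $f_*\mathcal{E}_1$.

B. If $f$ is surjective, prove that the map $f^*\mu_2 : f^*\mathcal{E}_2 \to [0,\infty]$, defined by

\[ (f^*\mu_2)(B) = \mu_2\big( f(B) \big), \quad \forall\, B \in f^*\mathcal{E}_2, \]

is a measure on $f^*\mathcal{E}_2$.

We now concentrate on the most rudimentary types of collections $\mathcal{E}$ on which measures can be defined with relative ease. Actually, what we have in mind is a set of easy conditions on a map $\mu : \mathcal{E} \to [0,\infty]$ which would guarantee that $\mu$ is a measure.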

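Definition. Let $X$ be a non-empty set. A collection $\mathcal{J} \subset \mathcal{P}(X)$ is called a semiring, if it satisfies the following properties:

• $\emptyset \in \mathcal{J}$;
• if $A, B \in \mathcal{J}$, then $A \cap B \in \mathcal{J}$;
• if $A, B \in \mathcal{J}$ and $A \subset B$, then there exists an integer $n \ge 1$, and sets $D_0, D_1, \dots, D_n \in \mathcal{J}$, such that $A = D_0 \subset D_1 \subset \cdots \subset D_n = B$, and $D_k \smallsetminus D_{k-1} \in \mathcal{J}$, $\forall\, k \in \{1, \dots, n\}$.

Remark that every ring is a semiring.

Exercise 2. Prove that the semiring type is not consistent. Give an example of two semirings $\mathcal{J}_1, \mathcal{J}_2 \subset \mathcal{P}(X)$, such that $\mathcal{J}_1 \cap \mathcal{J}_2$ is not a semiring.
Hint: Use the set $X = \{1, 2, 3\}$.

Exercise 3. Let $X_1, \dots, X_n$ be non-empty sets, and let $\mathcal{J}_k \subset \mathcal{P}(X_k)$, $k = 1, \dots, n$, be semirings. Prove that

\[ \mathcal{J} = \{ A_1 \times \cdots \times A_n : A_1 \in \mathcal{J}_1, \dots, A_n \in \mathcal{J}_n \} \subset \mathcal{P}(X_1 \times \cdots \times X_n) \]

is a semiring.
Hint: First prove the case $n = 2$, and then use induction.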

Example 4.2. Take $X = \mathbb{R}$. The collection

\[ \mathcal{J} = \{\emptyset\} \cup \{ [a,b) : a, b \in \mathbb{R},\ a < b \} \subset \mathcal{P}(\mathbb{R}) \]

is a semiring.

Indeed, the first two axioms are pretty clear. To prove the third axiom, we start with two intervals $A = [a,b)$ and $B = [c,d)$ with $A \subset B$. This means that $a \ge c$ and $b \le d$. If $a = c$ or $b = d$, we set $D_0 = A$ and $D_1 = B$. If $a > c$ and $b < d$, we set $D_0 = A$, $D_1 = [a,d)$ and $D_2 = B$.
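The case analysis above translates directly into code. The following minimal Python sketch encodes a half-open interval [a, b) as the pair (a, b); the helper names `chain` and `difference` are hypothetical, and `None` is used here as an ad hoc encoding of the empty set.

```python
def chain(A, B):
    """For A = [a, b) contained in B = [c, d), return the chain D_0, ..., D_n of the text."""
    (a, b), (c, d) = A, B
    assert c <= a < b <= d, "A must be a non-empty subinterval of B"
    if a == c or b == d:
        return [A, B]              # D_0 = A, D_1 = B
    return [A, (a, d), B]          # D_0 = A, D_1 = [a, d), D_2 = B

def difference(outer, inner):
    """outer minus inner, for nested intervals that share an endpoint."""
    (c, d), (a, b) = outer, inner
    assert c <= a and b <= d and (a == c or b == d)
    if (a, b) == (c, d):
        return None                # empty set
    return (b, d) if a == c else (c, a)
```

For A = [1, 2) and B = [0, 3) the chain is [1, 2) ⊂ [1, 3) ⊂ [0, 3), and the successive differences [2, 3) and [0, 1) are again half-open intervals, as the third axiom requires.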


More generally, by Exercise 3, the collection of "half-open boxes"

\[ \mathcal{J}_n = \{\emptyset\} \cup \Big\{ \prod_{j=1}^n [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Big\} \subset \mathcal{P}(\mathbb{R}^n) \]

is a semiring.

Exercise 4. Let $\mathcal{J}_n \subset \mathcal{P}(\mathbb{R}^n)$ be the semiring defined above. Prove that the σ-ring $\mathcal{S}(\mathcal{J}_n)$ generated by $\mathcal{J}_n$ coincides with $\mathrm{Bor}(\mathbb{R}^n)$.

The ring generated by a semiring has a particularly nice description (compare to Proposition 2.1):

Proposition 4.1. Let $\mathcal{J}$ be a semiring on $X$. For a subset $A \subset X$, the following are equivalent:

(i) $A$ belongs to $\mathcal{R}(\mathcal{J})$, the ring generated by $\mathcal{J}$;
(ii) There exists an integer $n \ge 1$, and a pair-wise disjoint system $(A_j)_{j=1}^n \subset \mathcal{J}$, such that $A = A_1 \cup \cdots \cup A_n$.
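For the semiring of half-open intervals from Example 4.2, condition (ii) can be made concrete: any finite union of intervals [a, b) can be rewritten as a disjoint union of such intervals. The following is a minimal Python sketch of one way to do this (sorting and merging), with intervals encoded as pairs (a, b); the function name `disjoint_decomposition` is hypothetical.

```python
def disjoint_decomposition(intervals):
    """Rewrite a finite union of half-open intervals [a, b) as a disjoint union."""
    result = []
    for a, b in sorted(intervals):
        if result and a <= result[-1][1]:
            # overlaps or touches the previous piece: extend that piece
            result[-1] = (result[-1][0], max(result[-1][1], b))
        else:
            result.append((a, b))
    return result
```

Merging touching intervals such as [0, 1) and [1, 2) is harmless here, since their union [0, 2) is again a member of the semiring.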

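Proof. Denote by $\mathcal{R}$ the collection of all subsets $A \subset X$ that satisfy condition (ii). It is obvious that

\[ \mathcal{J} \subset \mathcal{R} \subset \mathcal{R}(\mathcal{J}), \]

so (see Section III.2) we only need to prove that $\mathcal{R}$ is a ring.

Let us first remark that we obviously have the property:

(i) if $A, B \in \mathcal{R}$, and $A \cap B = \emptyset$, then $A \cup B \in \mathcal{R}$.

Secondly, we remark that we have the implication:

(ii) $A, B \in \mathcal{J} \Rightarrow A \smallsetminus B \in \mathcal{R}$.

Indeed, since $A \cap B \in \mathcal{J}$, by the definition of a semiring, there exist $D_0, D_1, \dots, D_n \in \mathcal{J}$ with $A \cap B = D_0 \subset D_1 \subset \cdots \subset D_n = A$, and $D_k \smallsetminus D_{k-1} \in \mathcal{J}$, $\forall\, k \in \{1, \dots, n\}$. Then the equality

\[ A \smallsetminus B = \bigcup_{k=1}^n (D_k \smallsetminus D_{k-1}) \]

shows that $A \smallsetminus B$ indeed belongs to $\mathcal{R}$.

Thirdly, we prove the implication:

(iii) $A, B \in \mathcal{R} \Rightarrow A \cap B \in \mathcal{R}$.

Write $A = A_1 \cup \cdots \cup A_m$ and $B = B_1 \cup \cdots \cup B_n$, with $(A_i)_{i=1}^m, (B_k)_{k=1}^n \subset \mathcal{J}$ pair-wise disjoint systems. If we define the sets $D_{ik} = A_i \cap B_k \in \mathcal{J}$, $(i,k) \in \{1, \dots, m\} \times \{1, \dots, n\}$, then it is obvious that

\[ A \cap B = \bigcup_{i=1}^m \bigcup_{k=1}^n D_{ik}, \]

and $(D_{ik})_{1 \le i \le m,\ 1 \le k \le n} \subset \mathcal{J}$ is a pair-wise disjoint system, therefore $A \cap B$ indeed belongs to $\mathcal{R}$.

Finally, we show the implication:

(iv) if $A, B \in \mathcal{R}$ and $A \supset B$, then $A \smallsetminus B \in \mathcal{R}$.

Write $A = A_1 \cup \cdots \cup A_m$, with $(A_i)_{i=1}^m \subset \mathcal{J}$ a pair-wise disjoint system. Notice that

\[ A \smallsetminus B = \bigcup_{i=1}^m (A_i \smallsetminus B), \]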


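with $(A_i \smallsetminus B)_{i=1}^m$ a pair-wise disjoint system, so by (i) it suffices to show that $A_i \smallsetminus B \in \mathcal{R}$, $\forall\, i \in \{1, \dots, m\}$. To prove this, we fix $i$ and we write $B = B_1 \cup \cdots \cup B_n$, with $(B_k)_{k=1}^n \subset \mathcal{J}$ a pair-wise disjoint system. Then

\[ A_i \smallsetminus B = (A_i \smallsetminus B_1) \cap \cdots \cap (A_i \smallsetminus B_n), \]

and the fact that $A_i \smallsetminus B$ belongs to $\mathcal{R}$ follows from (ii) and (iii).

Having proven (i)-(iv), we now prove that $\mathcal{R}$ is a ring. By (iii), we only need to prove the implication

($\ast$) $A, B \in \mathcal{R} \Rightarrow A \triangle B \in \mathcal{R}$.

Using (iv), it follows that the sets $A \smallsetminus B = A \smallsetminus (A \cap B)$ and $B \smallsetminus A = B \smallsetminus (A \cap B)$ both belong to $\mathcal{R}$. Since $A \triangle B = (A \smallsetminus B) \cup (B \smallsetminus A)$, and $(A \smallsetminus B) \cap (B \smallsetminus A) = \emptyset$, by (i) it follows that $A \triangle B$ indeed belongs to $\mathcal{R}$.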

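Theorem 4.1 (Semiring-to-ring extension). Let $\mathcal{J}$ be a semiring on $X$, and let $\mu : \mathcal{J} \to [0,\infty]$ be an additive map with $\mu(\emptyset) = 0$.

(i) There exists a unique additive map $\bar\mu : \mathcal{R}(\mathcal{J}) \to [0,\infty]$, such that $\bar\mu\big|_{\mathcal{J}} = \mu$.
(ii) If $\mu$ is σ-additive, then so is $\bar\mu$.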

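Proof. The key step is contained in the following

Claim: If $(A_i)_{i=1}^m \subset \mathcal{J}$ and $(B_j)_{j=1}^n \subset \mathcal{J}$ are pair-wise disjoint systems, with $A_1 \cup \cdots \cup A_m = B_1 \cup \cdots \cup B_n$, then $\mu(A_1) + \cdots + \mu(A_m) = \mu(B_1) + \cdots + \mu(B_n)$.

To prove this fact, we define the pair-wise disjoint system $(D_{ij})_{1 \le i \le m,\ 1 \le j \le n}$ by $D_{ij} = A_i \cap B_j$, $\forall\, (i,j) \in \{1, \dots, m\} \times \{1, \dots, n\}$. Since

\[ \bigcup_{j=1}^n D_{ij} = A_i, \quad \forall\, i \in \{1, \dots, m\}, \]

\[ \bigcup_{i=1}^m D_{ij} = B_j, \quad \forall\, j \in \{1, \dots, n\}, \]

using additivity, we have the equalities

\[ \sum_{j=1}^n \mu(D_{ij}) = \mu(A_i), \quad \forall\, i \in \{1, \dots, m\}, \]

\[ \sum_{i=1}^m \mu(D_{ij}) = \mu(B_j), \quad \forall\, j \in \{1, \dots, n\}, \]

and then we get

\[ \sum_{i=1}^m \mu(A_i) = \sum_{i=1}^m \Big[ \sum_{j=1}^n \mu(D_{ij}) \Big] = \sum_{j=1}^n \Big[ \sum_{i=1}^m \mu(D_{ij}) \Big] = \sum_{j=1}^n \mu(B_j). \]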
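The common-refinement argument in the Claim can be traced on a small example. The following minimal Python sketch uses half-open intervals (a, b) standing for [a, b) with µ = length, which is an illustrative choice; the helper names `length`, `intersect` and `refinement_sums` are hypothetical, and `None` encodes the empty set.

```python
def length(iv):
    """Length of the half-open interval [a, b) given as (a, b); None is the empty set."""
    return 0 if iv is None else iv[1] - iv[0]

def intersect(p, q):
    """Intersection of two half-open intervals, or None when it is empty."""
    lo, hi = max(p[0], q[0]), min(p[1], q[1])
    return (lo, hi) if lo < hi else None

def refinement_sums(As, Bs):
    """Build D_ij = A_i ∩ B_j and return the row sums and the column sums of mu(D_ij)."""
    D = [[intersect(A, B) for B in Bs] for A in As]
    rows = [sum(length(d) for d in D[i]) for i in range(len(As))]
    cols = [sum(length(D[i][j]) for i in range(len(As))) for j in range(len(Bs))]
    return rows, cols
```

With [0, 3) = [0, 1) ∪ [1, 3) = [0, 2) ∪ [2, 3), the row sums recover the lengths of the A_i, the column sums recover the lengths of the B_j, and both families therefore have the same total.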

To prove (i), for any set $A \in \mathcal{R}(\mathcal{J})$ we choose (use Proposition 4.1) a finite pair-wise disjoint system $(A_i)_{i=1}^n \subset \mathcal{J}$, with $A = A_1 \cup \cdots \cup A_n$, and we define

\[ \bar\mu(A) = \mu(A_1) + \cdots + \mu(A_n). \tag{1} \]

By the above Claim, the number $\bar\mu(A)$ is independent of the particular choice of the pair-wise disjoint system $(A_i)_{i=1}^n$. Also, it is clear that $\bar\mu\big|_{\mathcal{J}} = \mu$, and $\bar\mu$ is additive.


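The uniqueness is also clear, because the equality $\bar\mu\big|_{\mathcal{J}} = \mu$ and additivity of $\bar\mu$ force (1).

(ii). Assume now that $\mu$ is σ-additive, and let us prove that $\bar\mu$ is again σ-additive. Start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal{R}(\mathcal{J})$, with $\bigcup_{n=1}^\infty A_n \in \mathcal{R}(\mathcal{J})$, and let us prove the equality

\[ \bar\mu\Big( \bigcup_{n=1}^\infty A_n \Big) = \sum_{n=1}^\infty \bar\mu(A_n). \tag{2} \]

Since $\bigcup_{n=1}^\infty A_n \in \mathcal{R}(\mathcal{J})$, there exists a finite pair-wise disjoint system $(B_i)_{i=1}^p \subset \mathcal{J}$, such that $\bigcup_{n=1}^\infty A_n = B_1 \cup \cdots \cup B_p$. With this choice we have

\[ \bar\mu\Big( \bigcup_{n=1}^\infty A_n \Big) = \sum_{i=1}^p \mu(B_i). \tag{3} \]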

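For each $i \in \{1, \dots, p\}$, we have $B_i = \bigcup_{n=1}^\infty (B_i \cap A_n)$. Fix for the moment a pair $(n,i) \in \mathbb{N} \times \{1, \dots, p\}$. Since $B_i \cap A_n \in \mathcal{R}(\mathcal{J})$, it follows that there exist an integer $N_{ni} \ge 1$ and a finite pair-wise disjoint system $(C^{ni}_k)_{k=1}^{N_{ni}} \subset \mathcal{J}$, such that $B_i \cap A_n = \bigcup_{k=1}^{N_{ni}} C^{ni}_k$.

Since, for each $i \in \{1, \dots, p\}$, the countable system $(C^{ni}_k)_{n \in \mathbb{N},\ 1 \le k \le N_{ni}} \subset \mathcal{J}$ is pair-wise disjoint, and we have the equality

\[ \bigcup_{n=1}^\infty \bigcup_{k=1}^{N_{ni}} C^{ni}_k = \bigcup_{n=1}^\infty (B_i \cap A_n) = B_i \in \mathcal{J}, \]

by the σ-additivity of $\mu$, we have

\[ \mu(B_i) = \sum_{n=1}^\infty \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k), \quad \forall\, i \in \{1, \dots, p\}. \tag{4} \]

Since, for each $n \in \mathbb{N}$, the finite system $(C^{ni}_k)_{1 \le i \le p,\ 1 \le k \le N_{ni}} \subset \mathcal{J}$ is pair-wise disjoint, and we have the equality

\[ \bigcup_{i=1}^p \bigcup_{k=1}^{N_{ni}} C^{ni}_k = \bigcup_{i=1}^p (B_i \cap A_n) = A_n \in \mathcal{R}(\mathcal{J}), \]

by the definition of $\bar\mu$, we have

\[ \bar\mu(A_n) = \sum_{i=1}^p \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k), \quad \forall\, n \in \mathbb{N}. \]

Combining this with (4) yields

\[ \sum_{n=1}^\infty \bar\mu(A_n) = \sum_{n=1}^\infty \sum_{i=1}^p \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k) = \sum_{i=1}^p \mu(B_i), \]

and the equality (2) follows from (3).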

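Definition. Let $X$ be a non-empty set, and let $\mathcal{E} \subset \mathcal{P}(X)$ be a collection of sets. We say that a map $\mu : \mathcal{E} \to [0,\infty]$ is sub-additive, if

(add$^-$) whenever $A \in \mathcal{E}$, and $(A_k)_{k=1}^n$ is a finite sequence in $\mathcal{E}$ with $A \subset \bigcup_{k=1}^n A_k$, it follows that $\mu(A) \le \sum_{k=1}^n \mu(A_k)$.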


Note that we do not require the $A_k$'s to be pair-wise disjoint. With this terminology, Theorem 4.1 has the following consequence.

Corollary 4.1. Let $X$ be a non-empty set, and let $\mathcal{J} \subset \mathcal{P}(X)$ be a semiring. Then any additive map $\mu : \mathcal{J} \to [0,\infty]$ is sub-additive.

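Proof. Let $\bar\mu : \mathcal{R}(\mathcal{J}) \to [0,\infty]$ be the additive extension of $\mu$ to the ring generated by $\mathcal{J}$. It suffices to prove that $\bar\mu$ is sub-additive. Start with sets $A, A_1, \dots, A_n \in \mathcal{R}(\mathcal{J})$ such that $A \subset A_1 \cup \cdots \cup A_n$. Define the sets $B_1 = A_1$, and

\[ B_k = A_k \smallsetminus (A_1 \cup \cdots \cup A_{k-1}), \quad \text{for all } k \in \{2, \dots, n\}. \]

Since we work in a ring, the sets $B_k$, $B_k \cap A$, $B_k \smallsetminus A$, and $A_k \smallsetminus B_k$, $k \in \{1, \dots, n\}$, all belong to $\mathcal{R}(\mathcal{J})$. Moreover, the sequence $(B_k)_{k=1}^n$ is pair-wise disjoint and it satisfies

• $B_k \subset A_k$, $\forall\, k \in \{1, \dots, n\}$,
• $\bigcup_{k=1}^n B_k = \bigcup_{k=1}^n A_k \supset A$,

so by the additivity of $\bar\mu$, we get

\begin{align*}
\sum_{k=1}^n \bar\mu(A_k) &= \sum_{k=1}^n \bar\mu\big( (A_k \smallsetminus B_k) \cup B_k \big) = \sum_{k=1}^n \big[ \bar\mu(A_k \smallsetminus B_k) + \bar\mu(B_k) \big] \\
&\ge \sum_{k=1}^n \bar\mu(B_k) = \sum_{k=1}^n \bar\mu\big( (B_k \smallsetminus A) \cup (B_k \cap A) \big) = \sum_{k=1}^n \big[ \bar\mu(B_k \smallsetminus A) + \bar\mu(B_k \cap A) \big] \\
&\ge \sum_{k=1}^n \bar\mu(B_k \cap A) = \bar\mu\Big( \bigcup_{k=1}^n [B_k \cap A] \Big) = \bar\mu(A).
\end{align*}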
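The disjointification step $B_k = A_k \smallsetminus (A_1 \cup \cdots \cup A_{k-1})$ used in the proof is easy to experiment with. Below is a minimal Python sketch on finite sets, with cardinality standing in for the additive map (an illustrative assumption); the function name `disjointify` is hypothetical.

```python
def disjointify(sets):
    """Return B_1 = A_1 and B_k = A_k minus (A_1 ∪ ... ∪ A_{k-1}) for k >= 2."""
    seen, out = set(), []
    for A in sets:
        out.append(set(A) - seen)   # remove everything already covered
        seen |= set(A)
    return out

# A small cover A ⊂ A_1 ∪ A_2 ∪ A_3, with overlapping A_k's.
As = [{1, 2, 3}, {2, 3, 4}, {4, 5}]
A = {1, 4, 5}
Bs = disjointify(As)
```

Here the chain of inequalities from the proof reads 8 ≥ 5 ≥ 3: the sum of the card A_k, then the sum of the card B_k, then the sum of the card (B_k ∩ A), which equals card A.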

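Exercise 5*. Let $X_1, X_2$ be non-empty sets, let $\mathcal{J}_k \subset \mathcal{P}(X_k)$, $k = 1, 2$, be semirings, and let $\mu_k : \mathcal{J}_k \to [0,\infty]$ be additive maps. Consider the semiring (see Exercise 3)

\[ \mathcal{J} = \{ A_1 \times A_2 : A_1 \in \mathcal{J}_1,\ A_2 \in \mathcal{J}_2 \} \subset \mathcal{P}(X_1 \times X_2). \]

Prove that the map $\mu : \mathcal{J} \to [0,\infty]$ defined by

\[ \mu(A_1 \times A_2) = \mu_1(A_1) \cdot \mu_2(A_2) \]

is additive. Here we use the convention $0 \cdot \infty = \infty \cdot 0 = 0$.

Hints: One wants to show that, whenever $A_1 \times A_2 \in \mathcal{J}$ is written as a union

\[ A_1 \times A_2 = \bigcup_{k=1}^n (A_1^k \times A_2^k), \]

with $(A_1^k \times A_2^k)_{k=1}^n \subset \mathcal{J}$ pair-wise disjoint, it follows that

\[ \mu_1(A_1) \cdot \mu_2(A_2) = \sum_{k=1}^n \mu_1(A_1^k) \cdot \mu_2(A_2^k). \]

Analyze first the case of "strips," that is, when $A_1^1 = \cdots = A_1^n = A_1$ or $A_2^1 = \cdots = A_2^n = A_2$. In the general case, use induction, by picking some $k$ such that $A_1^k \subsetneq A_1$ and splitting $A_1 \times A_2$ into "strips" of the form $B_\ell \times A_2$, where $B_1, \dots, B_m \in \mathcal{J}_1$ are pairwise disjoint, with $B_1 = A_1^k$ and $B_1 \cup \cdots \cup B_m = A_1$.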

Comment. In connection with the above exercise, one can ask the following

Question: With the notations above, is it true that, if both $\mu_1$ and $\mu_2$ are measures, then $\mu$ is also a measure?

As we shall see a bit later in the course, the answer is "yes."


Definition. Let $X$ be a non-empty set, and let $\mathcal{E} \subset \mathcal{P}(X)$ be a collection of sets. We say that a map $\mu : \mathcal{E} \to [0,\infty]$ is σ-sub-additive, if

(add$^-_\sigma$) whenever $A \in \mathcal{E}$, and $(A_n)_{n=1}^\infty$ is a sequence in $\mathcal{E}$ with $A \subset \bigcup_{n=1}^\infty A_n$, it follows that $\mu(A) \le \sum_{n=1}^\infty \mu(A_n)$.

Note that we do not require the $A_n$'s to be pair-wise disjoint.

Proposition 4.2 (characterization of semiring measures). Let $X$ be a non-empty set, let $\mathcal{J} \subset \mathcal{P}(X)$ be a semiring, and let $\mu : \mathcal{J} \to [0,\infty]$ be a map with $\mu(\emptyset) = 0$. The following are equivalent:

(i) $\mu$ is a measure on $\mathcal{J}$;
(ii) $\mu$ is additive, and σ-sub-additive.

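Proof. (i) ⇒ (ii). Assume $\mu$ is a measure on $\mathcal{J}$. It is clear that $\mu$ is additive, so we only need to prove σ-sub-additivity. Use Theorem 4.1 to find a measure $\bar\mu$ on the ring $\mathcal{R}(\mathcal{J})$ generated by $\mathcal{J}$, such that

\[ \bar\mu(A) = \mu(A), \quad \forall\, A \in \mathcal{J}. \]

Then it suffices to show that $\bar\mu$ is σ-sub-additive. Start with a set $A \in \mathcal{R}(\mathcal{J})$, and a sequence $(A_n)_{n=1}^\infty \subset \mathcal{R}(\mathcal{J})$, such that $A \subset \bigcup_{n=1}^\infty A_n$. Define the sets $B_1 = A_1$, and

\[ B_n = A_n \smallsetminus (A_1 \cup \cdots \cup A_{n-1}), \quad \text{for all } n \ge 2. \]

Since we work in a ring, the sets $B_n$, $B_n \cap A$, $B_n \smallsetminus A$, and $A_n \smallsetminus B_n$, $n \in \mathbb{N}$, all belong to $\mathcal{R}(\mathcal{J})$. Moreover, the sequence $(B_n)_{n=1}^\infty$ is pair-wise disjoint and it satisfies

• $B_n \subset A_n$, $\forall\, n \in \mathbb{N}$,
• $\bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n \supset A$,

so by σ-additivity of $\bar\mu$, we get

\begin{align*}
\sum_{n=1}^\infty \bar\mu(A_n) &= \sum_{n=1}^\infty \bar\mu\big( (A_n \smallsetminus B_n) \cup B_n \big) = \sum_{n=1}^\infty \big[ \bar\mu(A_n \smallsetminus B_n) + \bar\mu(B_n) \big] \\
&\ge \sum_{n=1}^\infty \bar\mu(B_n) = \sum_{n=1}^\infty \bar\mu\big( (B_n \smallsetminus A) \cup (B_n \cap A) \big) = \sum_{n=1}^\infty \big[ \bar\mu(B_n \smallsetminus A) + \bar\mu(B_n \cap A) \big] \\
&\ge \sum_{n=1}^\infty \bar\mu(B_n \cap A) = \bar\mu\Big( \bigcup_{n=1}^\infty [B_n \cap A] \Big) = \bar\mu(A).
\end{align*}

(ii) ⇒ (i). Assume $\mu : \mathcal{J} \to [0,\infty]$ is additive and σ-sub-additive, and let us show that $\mu$ is σ-additive. We again use Theorem 4.1, to find an additive map $\bar\mu : \mathcal{R}(\mathcal{J}) \to [0,\infty]$, such that $\bar\mu\big|_{\mathcal{J}} = \mu$. Start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal{J}$, such that the union $A = \bigcup_{n=1}^\infty A_n$ belongs to $\mathcal{J}$. On the one hand, by σ-sub-additivity, we have the inequality

\[ \mu(A) \le \sum_{n=1}^\infty \mu(A_n). \tag{5} \]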


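On the other hand, for any integer $N \ge 1$, we have

\[ \mu(A) = \bar\mu(A) = \bar\mu\Big( \Big[ \bigcup_{n=1}^N A_n \Big] \cup \Big( A \smallsetminus \Big[ \bigcup_{n=1}^N A_n \Big] \Big) \Big) \ge \bar\mu\Big( \bigcup_{n=1}^N A_n \Big) = \sum_{n=1}^N \bar\mu(A_n) = \sum_{n=1}^N \mu(A_n), \]

which then gives

\[ \mu(A) \ge \sup_{N \in \mathbb{N}} \sum_{n=1}^N \mu(A_n) = \sum_{n=1}^\infty \mu(A_n), \]

so using (5) we immediately get $\mu(A) = \sum_{n=1}^\infty \mu(A_n)$.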

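The following technical result will be often employed in subsequent sections.

Lemma 4.1 (Continuity). Let $\mathcal{J}$ be a semiring, and let $\mu$ be a measure on $\mathcal{J}$.

(i) If $(A_n)_{n=1}^\infty \subset \mathcal{J}$ is a sequence of sets, with $A_1 \subset A_2 \subset \dots$, and $\bigcup_{n=1}^\infty A_n \in \mathcal{J}$, then

\[ \mu\Big( \bigcup_{n=1}^\infty A_n \Big) = \lim_{n\to\infty} \mu(A_n). \]

(ii) If $(B_n)_{n=1}^\infty \subset \mathcal{J}$ is a sequence of sets, with $B_1 \supset B_2 \supset \dots$, and $\bigcap_{n=1}^\infty B_n \in \mathcal{J}$, and $\mu(B_1) < \infty$, then

\[ \mu\Big( \bigcap_{n=1}^\infty B_n \Big) = \lim_{n\to\infty} \mu(B_n). \]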

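Proof. Using Theorem 4.1, we can assume that $\mathcal{J}$ is already a ring. (Otherwise we replace $\mathcal{J}$ by $\mathcal{R}(\mathcal{J})$, and $\mu$ by its extension $\bar\mu$.)

(i). Consider the sets $D_1 = A_1$, and $D_k = A_k \smallsetminus A_{k-1}$, $\forall\, k \ge 2$. It is clear that $(D_k)_{k=1}^\infty$ is a pairwise disjoint sequence in $\mathcal{J}$, and we have the equality

\[ \bigcup_{k=1}^n D_k = A_n, \quad \forall\, n \ge 1. \tag{6} \]

This gives of course the equality

\[ \bigcup_{k=1}^\infty D_k = \bigcup_{n=1}^\infty A_n \in \mathcal{J}. \]

Using this equality, combined with the (σ-)additivity of $\mu$, and with (6), we get

\[ \mu\Big( \bigcup_{n=1}^\infty A_n \Big) = \sum_{k=1}^\infty \mu(D_k) = \lim_{n\to\infty} \Big[ \sum_{k=1}^n \mu(D_k) \Big] = \lim_{n\to\infty} \mu\Big( \bigcup_{k=1}^n D_k \Big) = \lim_{n\to\infty} \mu(A_n). \]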

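(ii). Consider the sets $B = \bigcap_{n=1}^\infty B_n$, and $A_n = B_1 \smallsetminus B_n$, $\forall\, n \ge 1$. It is clear that $(A_n)_{n=1}^\infty \subset \mathcal{J}$, and we have $A_1 \subset A_2 \subset \dots$. Moreover, we have $\bigcup_{n=1}^\infty A_n = B_1 \smallsetminus B$, so by part (i), we get

\[ \mu(B_1 \smallsetminus B) = \lim_{n\to\infty} \mu(B_1 \smallsetminus B_n). \tag{7} \]

Using the fact that $\mu(B_1) < \infty$, it follows that

\[ \mu(B) \le \mu(B_n) \le \mu(B_1) < \infty, \quad \forall\, n \ge 1. \]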


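This gives then the equalities

\[ \mu(B_1 \smallsetminus B) = \mu(B_1) - \mu(B) \quad\text{and}\quad \mu(B_1 \smallsetminus B_n) = \mu(B_1) - \mu(B_n), \quad \forall\, n \ge 1, \]

so the equality (7) immediately gives $\mu(B) = \lim_{n\to\infty} \mu(B_n)$.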
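Part (i) of the lemma, and the telescoping sets $D_k$ from its proof, can be followed on a concrete increasing sequence. The following minimal Python sketch takes $A_n = [0, 1 - 1/n)$ with µ = length (an illustrative choice, so that the union is [0, 1) and $A_1$ is empty); the names `A`, `mu` and `D` are hypothetical.

```python
from fractions import Fraction

def A(n):
    """A_n = [0, 1 - 1/n), encoded by its endpoints (A_1 is the empty interval [0, 0))."""
    return (Fraction(0), 1 - Fraction(1, n))

def mu(iv):
    """Length of the half-open interval [a, b) given as (a, b)."""
    return iv[1] - iv[0]

def D(k):
    """D_1 = A_1 and D_k = A_k minus A_{k-1} = [1 - 1/(k-1), 1 - 1/k) for k >= 2."""
    return A(1) if k == 1 else (A(k - 1)[1], A(k)[1])

def partial_sum(n):
    """mu(D_1) + ... + mu(D_n), which should equal mu(A_n) by equality (6)."""
    return sum(mu(D(k)) for k in range(1, n + 1))
```

In this example µ(A_n) = 1 - 1/n, so the gap to µ([0, 1)) = 1 is exactly 1/n and shrinks to 0, in agreement with part (i).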

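The above result has a (minor) generalization, which we record for future use. To formulate it we introduce the following.

Notation. Let $\mathcal{R}$ be a ring, and let $\mu$ be a measure on $\mathcal{R}$. For two sets $A, B \in \mathcal{R}$, we write $A \subset_\mu B$, if $\mu(A \smallsetminus B) = 0$.

Using this notation, we have the following generalization of Lemma 4.1.

Proposition 4.3. Let $\mathcal{R}$ be a ring, and let $\mu$ be a measure on $\mathcal{R}$.

(i) If $(A_n)_{n=1}^\infty \subset \mathcal{R}$ is a sequence of sets, with $A_1 \subset_\mu A_2 \subset_\mu \dots$, and $\bigcup_{n=1}^\infty A_n \in \mathcal{R}$, then

\[ \mu\Big( \bigcup_{n=1}^\infty A_n \Big) = \lim_{n\to\infty} \mu(A_n). \]

(ii) If $(B_n)_{n=1}^\infty \subset \mathcal{R}$ is a sequence of sets, with $B_1 \supset_\mu B_2 \supset_\mu \dots$, and $\bigcap_{n=1}^\infty B_n \in \mathcal{R}$, and $\mu(B_1) < \infty$, then

\[ \mu\Big( \bigcap_{n=1}^\infty B_n \Big) = \lim_{n\to\infty} \mu(B_n). \]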

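Proof. (i). Define the sequence of sets $(E_n)_{n=1}^\infty \subset \mathcal{R}$, by $E_n = \bigcup_{k=1}^n A_k$, $\forall\, n \ge 1$. Notice that $A_1 = E_1$, and for each $n \ge 2$, we have $A_n \subset E_n$, as well as the equality

\[ E_n \smallsetminus A_n = \bigcup_{k=1}^{n-1} [A_k \smallsetminus A_n]. \]

Using sub-additivity, it follows that

\[ \mu(E_n \smallsetminus A_n) \le \sum_{k=1}^{n-1} \mu(A_k \smallsetminus A_n), \]

which forces $\mu(E_n \smallsetminus A_n) = 0$ (each $A_k \smallsetminus A_n$ is contained in $\bigcup_{j=k}^{n-1} (A_j \smallsetminus A_{j+1})$, a finite union of sets of measure zero). This gives

\[ \mu(E_n) = \mu(A_n) + \mu(E_n \smallsetminus A_n) = \mu(A_n), \quad \forall\, n \ge 1. \tag{8} \]

Since $\bigcup_{n=1}^\infty E_n = \bigcup_{n=1}^\infty A_n$, and we have the inclusions $E_1 \subset E_2 \subset \dots$, by Lemma 4.1, combined with (8), we get

\[ \mu\Big( \bigcup_{n=1}^\infty A_n \Big) = \mu\Big( \bigcup_{n=1}^\infty E_n \Big) = \lim_{n\to\infty} \mu(E_n) = \lim_{n\to\infty} \mu(A_n). \]

Part (ii) is proven exactly as part (ii) from Lemma 4.1.

Exercise 6. Let $\mu$ be a measure on a ring $\mathcal{R}$. Prove that, for $A, B \in \mathcal{R}$, one has the implication

\[ A \subset_\mu B \Rightarrow \mu(A) \le \mu(B). \]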


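Example 4.3. Fix some integer $n \ge 1$. Consider the semiring of "half-open boxes" in $\mathbb{R}^n$

\[ \mathcal{J}_n = \{\emptyset\} \cup \Big\{ \prod_{j=1}^n [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Big\} \subset \mathcal{P}(\mathbb{R}^n). \]

For a non-empty box $A = [a_1, b_1) \times \cdots \times [a_n, b_n) \in \mathcal{J}_n$, we define

\[ \mathrm{vol}_n(A) = \prod_{k=1}^n (b_k - a_k). \]

We also define $\mathrm{vol}_n(\emptyset) = 0$.

Theorem 4.2. With the above notations, the map $\mathrm{vol}_n : \mathcal{J}_n \to [0,\infty]$ is a measure on $\mathcal{J}_n$.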

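Proof. First we prove additivity. Using Exercise ?? (and induction on $n$) it suffices to analyze only the case $n = 1$, i.e. the case of half-open intervals in $\mathbb{R}$. We need to show the implication

\[ \left. \begin{array}{l} [a,b) = \bigcup_{k=1}^p [a_k, b_k) \\ \big( [a_k, b_k) \big)_{k=1}^p \text{ pair-wise disjoint} \end{array} \right\} \Longrightarrow b - a = \sum_{k=1}^p (b_k - a_k). \tag{9} \]

We can prove this using induction on $p$. The case $p = 1$ is trivial. Assuming that the above fact holds for $p = N$, let us prove it for $p = N+1$. Pick $k_1 \in \{1, \dots, N+1\}$ such that $a_{k_1} = a$. Then we clearly have

\[ \bigcup_{\substack{1 \le k \le N+1 \\ k \ne k_1}} [a_k, b_k) = [b_{k_1}, b), \]

so by the inductive hypothesis we get

\[ b - b_{k_1} = \sum_{\substack{1 \le k \le N+1 \\ k \ne k_1}} (b_k - a_k), \]

so we get

\[ \sum_{k=1}^{N+1} (b_k - a_k) = (b_{k_1} - a_{k_1}) + (b - b_{k_1}) = b - a_{k_1} = b - a, \]

and we are done.

We now prove that $\mathrm{vol}_n$ is σ-sub-additive. Suppose we have $A \in \mathcal{J}_n$ and a sequence $(A_k)_{k=1}^\infty \subset \mathcal{J}_n$, such that $A \subset \bigcup_{k=1}^\infty A_k$, and let us prove the inequality

\[ \mathrm{vol}_n(A) \le \sum_{k=1}^\infty \mathrm{vol}_n(A_k). \tag{10} \]

It will be helpful to introduce the following notations. For every half-open box

\[ B = [x_1, y_1) \times \cdots \times [x_n, y_n), \]

and every $\delta > 0$, we define the enlarged box $B^\delta$ and the shrunken box $B_\delta$ by

\[ B^\delta = [x_1 - \delta, y_1) \times \cdots \times [x_n - \delta, y_n) \quad\text{and}\quad B_\delta = [x_1, y_1 - \delta) \times \cdots \times [x_n, y_n - \delta). \]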


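It is clear that, for any box $B \in \mathcal{J}_n$ we have (with $B^\delta$ the enlarged box, $B_\delta$ the shrunken box, and the bar denoting closure)

\[ \overline{B_\delta} \subset B \subset \mathrm{Int}(B^\delta), \tag{11} \]

\[ \mathrm{vol}_n(B) = \lim_{\delta\to 0^+} \mathrm{vol}_n(B^\delta) = \lim_{\delta\to 0^+} \mathrm{vol}_n(B_\delta). \tag{12} \]

To prove (10), we fix some $\varepsilon > 0$, and we choose positive numbers $\delta$ and $(\delta_k)_{k=1}^\infty$, such that

\[ \mathrm{vol}_n(A_\delta) > \mathrm{vol}_n(A) - \varepsilon, \quad\text{and}\quad \mathrm{vol}_n\big( (A_k)^{\delta_k} \big) < \frac{\varepsilon}{2^k} + \mathrm{vol}_n(A_k), \quad \forall\, k \in \mathbb{N}. \tag{13} \]

Notice now that, using (11), we have the inclusions

\[ \overline{A_\delta} \subset A \subset \bigcup_{k=1}^\infty A_k \subset \bigcup_{k=1}^\infty \mathrm{Int}\big( (A_k)^{\delta_k} \big), \]

and using the compactness of $\overline{A_\delta}$, there exists some $N \ge 1$, such that

\[ \overline{A_\delta} \subset \bigcup_{k=1}^N \mathrm{Int}\big( (A_k)^{\delta_k} \big). \]

This immediately gives the inclusion

\[ A_\delta \subset \bigcup_{k=1}^N (A_k)^{\delta_k}. \]

Using sub-additivity (see Corollary 4.1) we now get

\[ \mathrm{vol}_n(A_\delta) \le \sum_{k=1}^N \mathrm{vol}_n\big( (A_k)^{\delta_k} \big), \]

and using (13) we have

\[ \mathrm{vol}_n(A) - \varepsilon \le \sum_{k=1}^N \Big[ \frac{\varepsilon}{2^k} + \mathrm{vol}_n(A_k) \Big] \le \varepsilon + \sum_{k=1}^N \mathrm{vol}_n(A_k) \le \varepsilon + \sum_{k=1}^\infty \mathrm{vol}_n(A_k). \]

This gives

\[ \mathrm{vol}_n(A) - 2\varepsilon \le \sum_{k=1}^\infty \mathrm{vol}_n(A_k). \]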

But since this inequality holds for all ε > 0, the inequality (10) immediately follows.
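The definition of $\mathrm{vol}_n$ and its additivity under a single cut can be illustrated numerically. The following is a minimal Python sketch in which a half-open box is a list of pairs (a_j, b_j); the function names `vol` and `split` are hypothetical, and the single-hyperplane cut is only the simplest instance of the additivity proved above.

```python
from math import prod

def vol(box):
    """Volume of the half-open box given as a list of pairs (a_j, b_j), a_j < b_j."""
    return prod(b - a for a, b in box)

def split(box, axis, t):
    """Cut the box by the hyperplane x_axis = t into two disjoint half-open boxes."""
    a, b = box[axis]
    assert a < t < b, "the cut must pass through the interior of the side"
    left = box[:axis] + [(a, t)] + box[axis + 1:]
    right = box[:axis] + [(t, b)] + box[axis + 1:]
    return left, right
```

For the box [0, 2) × [1, 4), of volume 6, cutting at x_1 = 1 produces two boxes of volume 3 each, so the volumes add up as expected.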

Lecture 22

5. Outer measures

Although measures can be defined on arbitrary collections of sets, the most natural domain of a measure is a σ-ring. In the previous section we dealt however only with (semi)rings. Therefore it is natural to ask the following

Question 1: Given a measure $\mu$ on a (semi)ring $\mathcal{J}$, is it possible to extend it to a measure on the σ-ring $\mathcal{S}(\mathcal{J})$ generated by $\mathcal{J}$?

As a particular case of the above question, we can specifically ask if there exists a measure on $\mathrm{Bor}(\mathbb{R}^n)$, which agrees with $\mathrm{vol}_n$ on "half-open boxes."

As a consequence of a remarkably clever construction, due to Caratheodory, we will be able to answer the above general question in the affirmative. Caratheodory's approach is based on the following concept.

Definition. Given a non-empty set $X$, an outer measure on $X$ is simply a map $\nu : \mathcal{P}(X) \to [0,\infty]$ with the following properties.

(0) $\nu(\emptyset) = 0$.
(m) If $A, B \in \mathcal{P}(X)$ are such that $A \subset B$, then $\nu(A) \le \nu(B)$.
(add$^-_\sigma$) $\nu$ is σ-sub-additive, i.e. whenever $A \in \mathcal{P}(X)$, and $(A_n)_{n=1}^\infty$ is a sequence in $\mathcal{P}(X)$ with $A \subset \bigcup_{n=1}^\infty A_n$, it follows that $\nu(A) \le \sum_{n=1}^\infty \nu(A_n)$.

The property (m) is called monotonicity.

Remark that $\nu$ is automatically sub-additive, in the sense that, whenever $A, A_1, \dots, A_n \in \mathcal{P}(X)$ are such that $A \subset A_1 \cup \cdots \cup A_n$, it follows that $\nu(A) \le \nu(A_1) + \cdots + \nu(A_n)$.

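The following result explains how a measure on a semiring can be naturally extended to an outer measure on the ambient space.

Proposition 5.1. Let $X$ be a non-empty set, let $\mathcal{J}$ be a semiring on $X$, and let $\mu : \mathcal{J} \to [0,\infty]$ be a measure on $\mathcal{J}$. Consider the collection

\[ \mathcal{P}^{\mathcal{J}}_\sigma(X) = \Big\{ A \subset X : \text{there exists } (B_n)_{n=1}^\infty \subset \mathcal{J}, \text{ with } A \subset \bigcup_{n=1}^\infty B_n \Big\}. \]

Define the map $\bar\mu : \mathcal{P}^{\mathcal{J}}_\sigma(X) \to [0,\infty]$ by

\[ \bar\mu(A) = \inf\Big\{ \sum_{n=1}^\infty \mu(B_n) : (B_n)_{n=1}^\infty \subset \mathcal{J},\ A \subset \bigcup_{n=1}^\infty B_n \Big\}, \quad \forall\, A \in \mathcal{P}^{\mathcal{J}}_\sigma(X). \]

Then the map $\mu^* : \mathcal{P}(X) \to [0,\infty]$, defined by

\[ \mu^*(A) = \begin{cases} \bar\mu(A) & \text{if } A \in \mathcal{P}^{\mathcal{J}}_\sigma(X) \\ \infty & \text{if } A \notin \mathcal{P}^{\mathcal{J}}_\sigma(X) \end{cases} \]

is an outer measure on $X$, and $\mu^*\big|_{\mathcal{J}} = \mu$.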


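Proof. It is obvious that $\mu^*(\emptyset) = 0$. It is also clear that $\mu^*$ is monotone. To prove that $\mu^*$ is σ-sub-additive, start with $A \in \mathcal{P}(X)$ and a sequence $(A_n)_{n=1}^\infty \subset \mathcal{P}(X)$, such that $A \subset \bigcup_{n=1}^\infty A_n$, and let us prove the inequality $\mu^*(A) \le \sum_{n=1}^\infty \mu^*(A_n)$. If there exists some $n$ with $A_n \notin \mathcal{P}^{\mathcal{J}}_\sigma(X)$, there is nothing to prove. Assume $A_n \in \mathcal{P}^{\mathcal{J}}_\sigma(X)$, for all $n$. Then it is clear that $A \in \mathcal{P}^{\mathcal{J}}_\sigma(X)$. Fix for the moment some $\varepsilon > 0$. For every $n \in \mathbb{N}$ choose a sequence $(B^n_k)_{k=1}^\infty \subset \mathcal{J}$, such that $A_n \subset \bigcup_{k=1}^\infty B^n_k$ and

\[ \sum_{k=1}^\infty \mu(B^n_k) < \frac{\varepsilon}{2^n} + \bar\mu(A_n). \]

It is clear that, if we list the countable family $(B^n_k)_{n,k=1}^\infty$ as a sequence $(D_m)_{m=1}^\infty$, then $A \subset \bigcup_{m=1}^\infty D_m$, and

\[ \bar\mu(A) \le \sum_{m=1}^\infty \mu(D_m) = \sum_{n=1}^\infty \sum_{k=1}^\infty \mu(B^n_k) \le \sum_{n=1}^\infty \Big[ \frac{\varepsilon}{2^n} + \bar\mu(A_n) \Big] = \varepsilon + \sum_{n=1}^\infty \bar\mu(A_n). \]

Since the above inequality holds for all $\varepsilon > 0$, we conclude that

\[ \mu^*(A) = \bar\mu(A) \le \sum_{n=1}^\infty \bar\mu(A_n) = \sum_{n=1}^\infty \mu^*(A_n), \]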

so $\mu^*$ is indeed σ-sub-additive.

Finally, we must show that $\mu^*\big|_{\mathcal{J}} = \mu$. Start with some $A \in \mathcal{J}$. On the one hand, since $\mu$ is a measure on $\mathcal{J}$, we know that $\mu$ is σ-sub-additive (see Proposition 4.2). This means that, for any sequence $(B_n)_{n=1}^\infty \subset \mathcal{J}$ with $A \subset \bigcup_{n=1}^\infty B_n$, we have $\sum_{n=1}^\infty \mu(B_n) \ge \mu(A)$. Since $A$ obviously belongs to $\mathcal{P}^{\mathcal{J}}_\sigma(X)$, this will force

\[ \mu^*(A) = \bar\mu(A) \ge \mu(A). \]

On the other hand, if we consider the sequence $B_1 = A$, $B_2 = B_3 = \cdots = \emptyset$, then we clearly have $\sum_{n=1}^\infty \mu(B_n) = \mu(A)$, which gives $\bar\mu(A) \le \mu(A)$, so in fact we must have equality $\bar\mu(A) = \mu(A)$.
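On a finite set with a finite semiring, the infimum defining $\mu^*$ can be computed by brute force. The following minimal Python sketch rests on one stated assumption: since the semiring is finite and all values are non-negative, repeating a set in a cover never lowers the sum, so it is enough to minimize over finite subfamilies. The function name `outer_measure` and the dictionary encoding of µ are hypothetical.

```python
from itertools import combinations

def outer_measure(A, mu):
    """Brute-force mu*(A); mu maps each member of the semiring (a frozenset) to its value."""
    A = frozenset(A)
    members = list(mu)
    best = float("inf")            # value kept when A admits no cover by members of J
    for r in range(len(members) + 1):
        for cover in combinations(members, r):
            if A <= frozenset().union(*cover):
                best = min(best, sum(mu[B] for B in cover))
    return best

# The semiring J = {emptyset, X} on X = {1, 2}, with mu(X) = 1.
mu = {frozenset(): 0, frozenset({1, 2}): 1}
```

With this µ, the only way to cover a non-empty subset of X is to use X itself, so every non-empty subset gets outer measure 1, while a set such as {3}, which cannot be covered at all, gets the value ∞.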

Definition. The outer measure $\mu^*$, defined in the above result, is called the maximal outer extension of $\mu$. This terminology is justified by the following.

Exercise 1. Let $\mathcal{J}$ be a semiring on $X$, and let $\mu$ be a measure on $\mathcal{J}$. Prove that any outer measure $\nu$ on $X$, with $\nu\big|_{\mathcal{J}} = \mu$, satisfies $\nu \le \mu^*$, in the sense that

\[ \nu(A) \le \mu^*(A), \quad \forall\, A \subset X. \]

Exercise 2. Let $\mathcal{J}_1$ and $\mathcal{J}_2$ be semirings on $X$ with $\mathcal{J}_1 \subset \mathcal{J}_2$, and let $\mu_1, \mu_2$ be respectively measures on $\mathcal{J}_1, \mathcal{J}_2$, such that $\mu_2\big|_{\mathcal{J}_1} \le \mu_1$. Let $\mu_1^*, \mu_2^*$ respectively be the maximal outer extensions of $\mu_1, \mu_2$. Prove the inequality $\mu_2^* \le \mu_1^*$.

Given a measure $\mu$ on a semiring $\mathcal{J}$ on $X$, one can ask whether there exists a unique outer measure on $X$, which extends $\mu$. The answer is no, even in the most trivial cases.

Example 5.1. Work on the set $X = \{1, 2\}$. Take the semiring $\mathcal{J} = \{\emptyset, X\}$ and define a measure $\mu$ on $\mathcal{J}$ by $\mu(\emptyset) = 0$ and $\mu(X) = 1$. Choose now any number $a \in (0,1)$ and define $\nu_a : \mathcal{P}(X) \to [0,1]$ by $\nu_a(A) = a\kappa_A(1) + (1-a)\kappa_A(2)$. Then $\nu_a$ is an outer measure on $X$ - in fact $\nu_a$ is a measure on $\mathcal{P}(X)$ - and $\nu_a\big|_{\mathcal{J}} = \mu$. It is obvious that $\mu^*(\{1\}) = 1 \ne a = \nu_a(\{1\})$ and $\mu^*(\{2\}) = 1 \ne 1 - a = \nu_a(\{2\})$.

We introduce now another concept, which is very important in our analysis.


Definition. Let $\nu$ be an outer measure on a non-empty set $X$. A subset $A \subset X$ is said to be ν-measurable, if it satisfies the condition

(m) $\nu(S) = \nu(S \cap A) + \nu(S \smallsetminus A)$, $\forall\, S \subset X$.

For a given $S$, it is useful to think of the equality $\nu(S) = \nu(S \cap A) + \nu(S \smallsetminus A)$ in unorthodox terms as "$A$ sharply cuts $S$," so that saying that $A$ is ν-measurable means that "$A$ sharply cuts every set $S \subset X$."

Remarks 5.1. Let $\nu$ be an outer measure on $X$.

A. Since $\nu$ is (finitely) sub-additive, for any two sets $A, S \subset X$, one always has the inequality $\nu(S) \le \nu(S \cap A) + \nu(S \smallsetminus A)$. Therefore, a set $A \subset X$ is ν-measurable, if and only if

\[ \nu(S) \ge \nu(S \cap A) + \nu(S \smallsetminus A), \quad \forall\, S \subset X. \]

B. Any subset $N \subset X$, with $\nu(N) = 0$, is ν-measurable. Indeed, from the monotonicity of $\nu$, we see that for every $S \subset X$, we have

\[ \nu(S \cap N) + \nu(S \smallsetminus N) \le \nu(N) + \nu(S) = \nu(S), \]

so by the preceding remark, $N$ is indeed ν-measurable. Such a set $N$ is called ν-negligible.
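On a finite set, the measurability condition (m) can be tested exhaustively. The following minimal Python sketch does this for the outer measure on X = {1, 2} that takes the value 0 on the empty set and 1 on every non-empty subset (this is the maximal outer extension $\mu^*$ of Example 5.1). The helper names `subsets`, `is_measurable` and `nu` are hypothetical.

```python
from itertools import chain, combinations

def subsets(X):
    """All subsets of the finite set X, as frozensets, ordered by size."""
    X = list(X)
    return [frozenset(c)
            for c in chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))]

def is_measurable(A, X, nu):
    """Check nu(S) = nu(S ∩ A) + nu(S minus A) for every subset S of X."""
    A = frozenset(A)
    return all(nu(S) == nu(S & A) + nu(S - A) for S in subsets(X))

def nu(S):
    """Outer measure equal to 0 on the empty set and 1 on every non-empty set."""
    return 0 if not S else 1

X = {1, 2}
measurable = [A for A in subsets(X) if is_measurable(A, X, nu)]
```

In this example the set {1} fails to cut S = X sharply, since ν(X) = 1 while ν({1}) + ν({2}) = 2, so only ∅ and X turn out to be ν-measurable.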

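The first key result in this section is the following.

Theorem 5.1. Let $\nu$ be an outer measure on a non-empty set $X$. Then the collection

\[ m_\nu(X) = \{ A \subset X : A \text{ is } \nu\text{-measurable} \} \]

is a σ-algebra on $X$. Moreover, the restriction

\[ \nu\big|_{m_\nu(X)} : m_\nu(X) \to [0,\infty] \]

is a measure on $m_\nu(X)$.

Proof. The proof will be carried on in several steps.

Step 1: If $A \in m_\nu(X)$, then $X \smallsetminus A \in m_\nu(X)$.

This is trivial, since for every $S \subset X$, one has the equalities

\[ S \cap (X \smallsetminus A) = S \smallsetminus A \quad\text{and}\quad S \smallsetminus (X \smallsetminus A) = S \cap A. \]

Step 2: If $A, B \in m_\nu(X)$, then $A \cap B \in m_\nu(X)$.

Start with some arbitrary $S \subset X$. Since $B$ is ν-measurable, it "sharply cuts the set $S \smallsetminus (A \cap B)$," which means that

\[ \nu\big( S \smallsetminus (A \cap B) \big) = \nu\big( [S \smallsetminus (A \cap B)] \cap B \big) + \nu\big( [S \smallsetminus (A \cap B)] \smallsetminus B \big). \]

Since we clearly have $[S \smallsetminus (A \cap B)] \cap B = (S \cap B) \smallsetminus A$, and $[S \smallsetminus (A \cap B)] \smallsetminus B = S \smallsetminus B$, the above equality gives

\[ \nu\big( S \smallsetminus (A \cap B) \big) = \nu\big( (S \cap B) \smallsetminus A \big) + \nu(S \smallsetminus B). \]

Adding $\nu\big( (S \cap B) \cap A \big)$, and using the fact that $A$ "sharply cuts $S \cap B$," we now get

\begin{align*}
\nu\big( S \cap (A \cap B) \big) + \nu\big( S \smallsetminus (A \cap B) \big)
&= \nu\big( (S \cap B) \cap A \big) + \nu\big( (S \cap B) \smallsetminus A \big) + \nu(S \smallsetminus B) \\
&= \nu(S \cap B) + \nu(S \smallsetminus B).
\end{align*}

Finally, using the fact that $B$ "sharply cuts $S$," we get

\[ \nu\big( S \cap (A \cap B) \big) + \nu\big( S \smallsetminus (A \cap B) \big) = \nu(S \cap B) + \nu(S \smallsetminus B) = \nu(S), \]

so $A \cap B$ is indeed ν-measurable.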


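So far, Steps 1 and 2 prove that $m_\nu(X)$ is an algebra on $X$.

Step 3: For any pair-wise disjoint finite sequence $(A_n)_{n=1}^N \subset m_\nu(X)$, one has the equality

\[ \nu\big( S \cap [A_1 \cup \cdots \cup A_N] \big) = \sum_{n=1}^N \nu(S \cap A_n), \quad \forall\, S \subset X. \]

Since $m_\nu(X)$ is an algebra, it suffices to prove the above equality only for $N = 2$. (The case of arbitrary $N$ follows immediately by induction.) To prove that $\nu\big( S \cap (A_1 \cup A_2) \big) = \nu(S \cap A_1) + \nu(S \cap A_2)$, we simply use the fact that $A_1$ "sharply cuts $S \cap (A_1 \cup A_2)$," which gives

\[ \nu\big( S \cap (A_1 \cup A_2) \big) = \nu\big( [S \cap (A_1 \cup A_2)] \cap A_1 \big) + \nu\big( [S \cap (A_1 \cup A_2)] \smallsetminus A_1 \big). \]

The desired equality then immediately follows from the obvious equalities

\[ [S \cap (A_1 \cup A_2)] \cap A_1 = S \cap A_1 \quad\text{and}\quad [S \cap (A_1 \cup A_2)] \smallsetminus A_1 = S \cap A_2. \]

The preceding step can be in fact extended to infinite sequences.

Step 4: For any pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset m_\nu(X)$, one has the equality

\[ \nu\Big( S \cap \Big[ \bigcup_{n=1}^\infty A_n \Big] \Big) = \sum_{n=1}^\infty \nu(S \cap A_n), \quad \forall\, S \subset X. \]

To prove this fact, we fix a sequence $(A_n)_{n=1}^\infty$ as above, as well as $S \subset X$. By σ-sub-additivity, we already know that

\[ \nu\Big( S \cap \Big[ \bigcup_{n=1}^\infty A_n \Big] \Big) = \nu\Big( \bigcup_{n=1}^\infty [S \cap A_n] \Big) \le \sum_{n=1}^\infty \nu(S \cap A_n), \]

so the only thing we have to show is the inequality

\[ \sum_{n=1}^N \nu(S \cap A_n) \le \nu\Big( S \cap \Big[ \bigcup_{n=1}^\infty A_n \Big] \Big), \quad \forall\, N \in \mathbb{N}. \]

This follows immediately from Step 3 and the monotonicity:

\[ \sum_{n=1}^N \nu(S \cap A_n) = \nu\Big( S \cap \Big[ \bigcup_{n=1}^N A_n \Big] \Big) \le \nu\Big( S \cap \Big[ \bigcup_{n=1}^\infty A_n \Big] \Big). \]

Step 5: $m_\nu(X)$ is a monotone class.

We need to prove the properties:

(i) whenever $(A_n)_{n=1}^\infty \subset m_\nu(X)$ is a sequence with $A_n \subset A_{n+1}$, $\forall\, n \in \mathbb{N}$, it follows that $\bigcup_{n=1}^\infty A_n$ belongs to $m_\nu(X)$;
(ii) whenever $(A_n)_{n=1}^\infty \subset m_\nu(X)$ is a sequence with $A_n \supset A_{n+1}$, $\forall\, n \in \mathbb{N}$, it follows that $\bigcap_{n=1}^\infty A_n$ belongs to $m_\nu(X)$.

Since $m_\nu(X)$ is an algebra, it suffices only to prove (i). Start with an arbitrary subset $S$, and a sequence $(A_n)_{n=1}^\infty \subset m_\nu(X)$ with $A_n \subset A_{n+1}$, $\forall\, n \in \mathbb{N}$, and denote the union $\bigcup_{n=1}^\infty A_n$ simply by $A$. Define the sets $B_1 = A_1$ and $B_n = A_n \smallsetminus A_{n-1}$, $\forall\, n \ge 2$. It is obvious that $(B_n)_{n=1}^\infty$ is a pair-wise disjoint sequence. Since $m_\nu(X)$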


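is an algebra, all the $B_n$'s belong to $m_\nu(X)$. We have $\bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n = A$, which, using Step 4 gives

\[ \nu(S \cap A) = \nu\Big( S \cap \Big[ \bigcup_{n=1}^\infty A_n \Big] \Big) = \nu\Big( S \cap \Big[ \bigcup_{n=1}^\infty B_n \Big] \Big) = \sum_{n=1}^\infty \nu(S \cap B_n). \tag{1} \]

Using Step 3, combined with the equality $\bigcup_{n=1}^N B_n = A_N$, we also have

\[ \sum_{n=1}^N \nu(S \cap B_n) = \nu\Big( S \cap \Big[ \bigcup_{n=1}^N B_n \Big] \Big) = \nu(S \cap A_N), \quad \forall\, N \in \mathbb{N}, \]

so by (1) we have

\[ \nu(S \cap A) = \sum_{n=1}^\infty \nu(S \cap B_n) = \lim_{N\to\infty} \nu(S \cap A_N). \tag{2} \]

Notice now that, using the fact that $A_N$ "sharply cuts $S$," combined with the monotonicity of $\nu$ and the obvious inclusion $S \smallsetminus A \subset S \smallsetminus A_N$, we have

\[ \nu(S \cap A_N) + \nu(S \smallsetminus A) \le \nu(S \cap A_N) + \nu(S \smallsetminus A_N) = \nu(S), \quad \forall\, N \in \mathbb{N}, \]

so using (2), we immediately get

\[ \nu(S \cap A) + \nu(S \smallsetminus A) \le \nu(S). \]

Since the above inequality holds for all $S \subset X$, by Remark 5.1.A it follows that $A$ indeed belongs to $m_\nu(X)$.

By the results from Section 1, we know that the fact that $m_\nu(X)$ is simultaneously an algebra, and a monotone class, implies the fact that $m_\nu(X)$ is a σ-algebra.

We now show that $\nu\big|_{m_\nu(X)}$ is a measure. If we start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset m_\nu(X)$, then the equality

\[ \nu\Big( \bigcup_{n=1}^\infty A_n \Big) = \sum_{n=1}^\infty \nu(A_n) \]

is an immediate consequence of Step 4, applied to the set $S = \bigcup_{n=1}^\infty A_n$, which clearly satisfies $S \cap A_n = A_n$, $\forall\, n \in \mathbb{N}$.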

We are now in position to answer Question 1.

Theorem 5.2. Let $X$ be a non-empty set, let $\mathcal{J}$ be a semiring on $X$, let $\mu$ be a measure on $\mathcal{J}$, and let $\mu^*$ be the maximal outer extension of $\mu$. Then $\mathcal{J} \subset m_{\mu^*}(X)$. In particular, $m_{\mu^*}(X)$ contains the σ-algebra $\Sigma(\mathcal{J})$ on $X$, generated by $\mathcal{J}$, and $\mu^*\big|_{\Sigma(\mathcal{J})}$ is a measure on $\Sigma(\mathcal{J})$.

Proof. What we need to prove is the fact that every set $A \in \mathcal{J}$ is $\mu^*$-measurable. Start with an arbitrary set $S \subset X$. As noticed before (Remark 5.1.A), we only need to prove the inequality

\[ \mu^*(S \cap A) + \mu^*(S \smallsetminus A) \le \mu^*(S). \tag{3} \]

If $\mu^*(S) = \infty$, there is nothing to prove, so we can assume that $\mu^*(S) < \infty$. In particular this means that $S \in \mathcal{P}^{\mathcal{J}}_\sigma(X)$. Fix for the moment $\varepsilon > 0$. By the definition


of µ∗(S) = µ(S), there exists a sequence (Bn)∞n=1 ⊂ J, such that S ⊂⋃∞n=1Bn,

and

(4)∞∑n=1

µ(Bn) ≤ µ∗(S) + ε.

Since J is a semiring, for each n ∈ N, we can find some integer pn ≥ 1, and asequence (Dn

j )pn

j=0 ⊂ J, such that• Bn ∩A = Dn

0 ⊂ Dn1 ⊂ · · · ⊂ Dn

pn= Bn,

• Dj rDj−1 ∈ J, ∀ j ∈ 1, . . . , pn.Define the numbers k0 = 0, and kn =

∑nj=1 pj , ∀n ∈ N, and the sequence

(Cm)∞m=1 ⊂ J, by

Cm = Dnm−kn−1

rDnm−1−kn−1

, if kn−1 < m ≤ kn, n ∈ N.

By construction, for each n ∈ N, we havekn⋃

m=kn−1+1

Cm =pn⋃j=1

(Dnj rDn

j−1) = Bn rAn.

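Moreover, for each $n \in \mathbb{N}$ the system

\[ \big( D^n_0, C_{k_{n-1}+1}, C_{k_{n-1}+2}, \dots, C_{k_n} \big) = \big( D^n_0,\ D^n_1 \setminus D^n_0,\ D^n_2 \setminus D^n_1,\ \dots,\ D^n_{p_n} \setminus D^n_{p_n-1} \big) \]

in $J$ is pairwise disjoint, and has

\[ D^n_0 \cup \bigcup_{m=k_{n-1}+1}^{k_n} C_m = B_n, \]

so we get the equality

\[ \mu(D^n_0) + \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m) = \mu(B_n). \]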

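Using (4) we now get

\[
\begin{aligned}
\sum_{n=1}^\infty \mu(D^n_0) + \sum_{m=1}^\infty \mu(C_m)
 &= \sum_{n=1}^\infty \mu(D^n_0) + \sum_{n=1}^\infty \Big( \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m) \Big) \\
 &= \sum_{n=1}^\infty \Big( \mu(D^n_0) + \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m) \Big)
  = \sum_{n=1}^\infty \mu(B_n) \le \mu^*(S) + \varepsilon.
\end{aligned}
\tag{5}
\]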

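On the one hand, we clearly have

\[
\begin{aligned}
\bigcup_{m=1}^\infty C_m
 &= \bigcup_{n=1}^\infty \Big( \bigcup_{m=k_{n-1}+1}^{k_n} C_m \Big)
  = \bigcup_{n=1}^\infty \Big( \bigcup_{j=1}^{p_n} \big( D^n_j \setminus D^n_{j-1} \big) \Big) \\
 &= \bigcup_{n=1}^\infty \big( D^n_{p_n} \setminus D^n_0 \big)
  = \bigcup_{n=1}^\infty (B_n \setminus A)
  = \Big( \bigcup_{n=1}^\infty B_n \Big) \setminus A \supset S \setminus A,
\end{aligned}
\]

which gives the inequality

\[ \sum_{m=1}^\infty \mu(C_m) \ge \mu^*(S \setminus A). \tag{6} \]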


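On the other hand, we also have

\[ \bigcup_{n=1}^\infty D^n_0 = \bigcup_{n=1}^\infty (B_n \cap A) = \Big( \bigcup_{n=1}^\infty B_n \Big) \cap A \supset S \cap A, \]

which gives the inequality

\[ \sum_{n=1}^\infty \mu(D^n_0) \ge \mu^*(S \cap A). \tag{7} \]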

Combining (6) and (7) with (5) immediately gives the desired inequality (3).
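Theorem 5.2 can be checked by brute force on a small finite example. The sketch below is illustrative only: the set `X`, the semiring (the keys of `mu`) and the weights are made-up data, and the helper names `powerset`, `outer_measure` and `is_measurable` are hypothetical. Because `X` is finite, it is assumed enough to minimise over finite families of sets from the semiring rather than over sequences.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, returned as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def outer_measure(S, mu):
    """mu*(S): least total mass of a family from the semiring covering S (inf if none)."""
    S = frozenset(S)
    sets = list(mu)
    best = float("inf")
    for family in powerset(range(len(sets))):
        cover = frozenset().union(*(sets[i] for i in family))
        if S <= cover:
            best = min(best, sum(mu[sets[i]] for i in family))
    return best

def is_measurable(A, X, mu):
    """Caratheodory test: A sharply cuts every subset S of X."""
    A = frozenset(A)
    return all(
        outer_measure(S & A, mu) + outer_measure(S - A, mu) == outer_measure(S, mu)
        for S in powerset(X)
    )

# Assumed toy data: a semiring J = {∅, {0,1}, {2}} on X = {0,1,2,3} with a measure mu.
X = {0, 1, 2, 3}
mu = {frozenset(): 0, frozenset({0, 1}): 2, frozenset({2}): 1}

for A in mu:
    print(sorted(A), is_measurable(A, X, mu))   # every set of J passes the test
print(is_measurable({0}, X, mu))                # {0} does not sharply cut {0, 1}
```

In this toy example the set {0} fails the test because it splits {0, 1} into two pieces that each cost the full mass of {0, 1}.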

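The construction

\[
\underbrace{\mu}_{\text{measure on } J}
\ \xrightarrow{\ \text{maximal outer extension}\ }\
\underbrace{\mu^*}_{\text{outer measure on } X}
\ \xrightarrow{\ \text{restriction}\ }\
\underbrace{\mu^*\big|_{\Sigma(J)}}_{\text{measure on } \Sigma(J)}
\]

is referred to as the Caratheodory construction.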

Definitions. Let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$. The Caratheodory construction provides us with two measures. The first measure, $\mu^*\big|_{S(J)}$, is a measure on the σ-ring $S(J)$ generated by $J$, and is called the maximal σ-ring extension of $\mu$. The second measure, $\mu^*\big|_{\Sigma(J)}$, is a measure on the σ-algebra $\Sigma(J)$ generated by $J$, and is called the maximal σ-algebra extension of $\mu$.

The above terminology is justified by the following result.

Proposition 5.2. Let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$.

(i) If $\nu$ is a measure on the σ-ring $S(J)$ generated by $J$, with $\nu\big|_J = \mu$, then $\nu \le \mu^*\big|_{S(J)}$.

(ii) If $\nu$ is a measure on the σ-algebra $\Sigma(J)$ generated by $J$, with $\nu\big|_J = \mu$, then $\nu \le \mu^*\big|_{\Sigma(J)}$.

Proof. We prove both statements simultaneously. Let $J_1$ denote either the σ-ring, or the σ-algebra generated by $J$. In particular $J_1$ is a semiring, and $J \subset J_1$. Since $\nu$ is a measure on $J_1$ with $\nu\big|_J = \mu$, if we denote by $\nu^*$ its maximal outer extension, then by Exercise 2 we know that $\nu^* \le \mu^*$. In particular, by Proposition 5.1 and Theorem 5.2, we get $\nu = \nu^*\big|_{J_1} \le \mu^*\big|_{J_1}$.

We now discuss the uniqueness of extensions of a semiring measure. In orderto clarify this matter, we have to introduce a technical condition, which turns outto be very helpful not only here, but in many other situations.

Definitions. Let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$.

A. We say that a subset $A \subset X$ is $J$-$\mu$-σ-finite, if there exists a sequence $(B_n)_{n=1}^\infty \subset J$, such that $A \subset \bigcup_{n=1}^\infty B_n$, and $\mu(B_n) < \infty$, $\forall\, n \in \mathbb{N}$. (When there is no danger of confusion, we will use the terms “$\mu$-σ-finite,” or simply “σ-finite.”)

B. We say that the measure $\mu$ is σ-finite, if every $A \in J$ is σ-finite.

C. We say that the measure $\mu$ is finite, if $\mu(A) < \infty$, $\forall\, A \in J$.

Clearly every finite measure on $J$ is σ-finite.

Remark 5.2. Let $J$ be a semiring on $X$, let $\mu$ be a measure on $J$, and let $A$ be a set which belongs to the σ-algebra $\Sigma(J)$ generated by $J$. If $A$ is $J$-$\mu$-σ-finite, then $A$ in fact belongs to the σ-ring $S(J)$ generated by $J$. The only thing that is actually needed here is the existence of a sequence $(B_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty B_n$. This gives the fact that $A$ belongs to $\mathcal{P}^J_\sigma(X)$, so by Proposition 2.3, the set $A$ belongs to the intersection $\Sigma(J) \cap \mathcal{P}^J_\sigma(X) = S(J)$.

Using the above terminology, we have the following uniqueness result.

Theorem 5.3. Let $J$ be a semiring on $X$, let $\mu$ be a measure on $J$, let $\mu^*$ be the maximal outer extension of $\mu$, and let $\nu$ be a measure on the σ-ring $S(J)$ generated by $J$, with $\nu\big|_J = \mu$. Then one has $\nu(A) = \mu^*(A)$, for all $J$-$\mu$-σ-finite sets $A \in S(J)$.

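Proof. Fix a $J$-$\mu$-σ-finite set $A \in S(J)$.

Claim: There exists a pairwise disjoint sequence $(D_n)_{n=1}^\infty \subset S(J)$ such that $A \subset \bigcup_{n=1}^\infty D_n$, and $\nu(D_n) = \mu^*(D_n) < \infty$, $\forall\, n \in \mathbb{N}$.

To prove the above statement, start with a sequence $(B_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty B_n$ and $\mu(B_n) < \infty$, $\forall\, n \in \mathbb{N}$. Define the sets $D_n$, $n \in \mathbb{N}$, by $D_1 = B_1$, and $D_n = B_n \setminus (B_1 \cup \dots \cup B_{n-1})$, $\forall\, n \ge 2$. It is clear that the sequence $(D_n)_{n=1}^\infty$ is pairwise disjoint, and

\[ A \subset \bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty D_n. \]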

Moreover, all the $D_n$'s belong to the ring $R(J)$ generated by $J$. The inclusions $D_n \subset B_n$ then prove that

\[ \mu^*(D_n) \le \mu^*(B_n) = \mu(B_n) < \infty, \quad \forall\, n \in \mathbb{N}. \]

Finally, since both $\mu^*\big|_{R(J)}$ and $\nu\big|_{R(J)}$ are measures on $R(J)$, which have the same values on $J$, using the Semiring-to-Ring Extension Theorem 4.1, it follows that

\[ \mu^*\big|_{R(J)} = \nu\big|_{R(J)}. \tag{8} \]

In particular we have the equalities

\[ \nu(D_n) = \mu^*(D_n), \quad \forall\, n \in \mathbb{N}. \]

Having proven the Claim, we now show that $\nu(A) = \mu^*(A)$. We choose a sequence $(D_n)_{n=1}^\infty \subset S(J)$ as in the Claim. On the one hand, since the $D_n$'s are pairwise disjoint, and both $\nu$ and $\mu^*\big|_{S(J)}$ are measures on the σ-ring $S(J)$, one has the equalities

\[ \nu(A) = \sum_{n=1}^\infty \nu(A \cap D_n) \quad\text{and}\quad \mu^*(A) = \sum_{n=1}^\infty \mu^*(A \cap D_n). \]

So, in order to prove the equality $\nu(A) = \mu^*(A)$, it suffices to prove that

\[ \nu(A \cap D_n) = \mu^*(A \cap D_n), \quad \forall\, n \in \mathbb{N}. \tag{9} \]

Fix $n \in \mathbb{N}$. On the one hand, by Proposition 5.2(i), we have the inequalities

\[ \nu(A \cap D_n) \le \mu^*(A \cap D_n) < \infty \quad\text{and}\quad \nu(D_n \setminus A) \le \mu^*(D_n \setminus A) < \infty. \tag{10} \]

On the other hand, we have

\[ \nu(A \cap D_n) + \nu(D_n \setminus A) = \nu(D_n) = \mu^*(D_n) = \mu^*(A \cap D_n) + \mu^*(D_n \setminus A). \]

Now if we go back to (10), we see that neither of the two inequalities can be strict, because in that case we would get $\nu(D_n) < \mu^*(D_n)$. (The assumption that $\mu^*(D_n) < \infty$ is essential here.) So we must have (9), and we are done.
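The last step rests on an elementary fact about finite sums, which can be spelled out as a short worked inequality. The letters $a, a', b, b'$ are shorthand introduced here for illustration and do not appear in the original text.

```latex
% Shorthand (an assumption of this sketch, not the author's notation):
%   a  = \nu(A \cap D_n),        a' = \mu^*(A \cap D_n),
%   b  = \nu(D_n \setminus A),   b' = \mu^*(D_n \setminus A).
% By (10):        a \le a' < \infty  and  b \le b' < \infty.
% By additivity:  a + b = \nu(D_n) = \mu^*(D_n) = a' + b'.
\[
  0 \;\le\; a' - a \;=\; b - b' \;\le\; 0
  \quad\Longrightarrow\quad
  a = a' \ \text{ and } \ b = b'.
\]
% The subtraction is legitimate only because all four quantities are finite,
% which is exactly where \mu^*(D_n) < \infty is used.
```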


Corollary 5.1. If $\mu$ is a σ-finite measure on a semiring $J$, then there exists a unique measure $\nu$ on the σ-ring $S(J)$ generated by $J$, such that $\nu\big|_J = \mu$. Moreover, $\nu$ is σ-finite.

Proof. The existence is given by the Caratheodory construction. The uniqueness follows from Theorem 5.3.

To prove σ-finiteness, start with some $A \in S(J)$, and let us find a sequence $(B_n)_{n=1}^\infty \subset S(J)$ with $A \subset \bigcup_{n=1}^\infty B_n$ and $\nu(B_n) < \infty$, $\forall\, n \in \mathbb{N}$. First of all, since $\mathcal{P}^J_\sigma(X)$ is a σ-ring which contains $J$, it follows that $S(J) \subset \mathcal{P}^J_\sigma(X)$. In particular, there exists $(D_n)_{n=1}^\infty \subset J$ such that $A \subset \bigcup_{n=1}^\infty D_n$. Using the fact that $\mu$ is σ-finite, we see that for each $n$ we can find a sequence $(D^n_k)_{k=1}^\infty \subset J$, with $D_n \subset \bigcup_{k=1}^\infty D^n_k$ and $\mu(D^n_k) < \infty$, $\forall\, k \in \mathbb{N}$. If we list all the sets $D^n_k$, $k, n \in \mathbb{N}$, as a sequence $(B_m)_{m=1}^\infty$, then we are done.

In the absence of the σ-finiteness condition the uniqueness of the σ-ring extension fails, as illustrated by the following.

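Example 5.2. Consider the set $X = \mathbb{Q}$, and the semiring of rational half-open intervals

\[ J_1 = \{\varnothing\} \cup \big\{ [a, b) \cap \mathbb{Q} : a, b \in \mathbb{R},\ a < b \big\}. \]

We equip $J_1$ with the measure $\mu$ defined by

\[ \mu(A) = \begin{cases} 0 & \text{if } A = \varnothing \\ \infty & \text{if } A \ne \varnothing \end{cases} \]

Notice that, if we look at the inclusion $\iota : \mathbb{Q} \hookrightarrow \mathbb{R}$, then $J_1 = J\big|_{\mathbb{Q}}$, where $J$ is the semiring of half-open intervals in $\mathbb{R}$. By the Generating Theorem we then have

\[ S(J_1) = S\big(J\big|_{\mathbb{Q}}\big) = S(J)\big|_{\mathbb{Q}} = \mathrm{Bor}(\mathbb{R})\big|_{\mathbb{Q}} = \mathcal{P}(\mathbb{Q}). \]

Define now the measures $\nu_1, \nu_2 : S(J_1) \to [0, \infty]$ by

\[
\nu_1(A) = \begin{cases} \operatorname{card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases}
\qquad
\nu_2(A) = \begin{cases} 2 \cdot \operatorname{card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases}
\]

It is obvious that both $\nu_1$ and $\nu_2$ satisfy $\nu_1\big|_{J_1} = \nu_2\big|_{J_1} = \mu$, but obviously $\nu_1$ and $\nu_2$ are not equal.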
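The two "obvious" claims in Example 5.2 can be spelled out as a short worked check. This is only a sketch of the verification; the choice of the singleton $\{0\}$ as a witness is an illustrative assumption, and any non-empty finite set of rationals would do.

```latex
% Agreement on J_1: a non-empty set in J_1 has the form [a,b) \cap \mathbb{Q} with a < b,
% so it contains infinitely many rationals. Hence, for every non-empty A \in J_1,
\[
  \nu_1(A) = \infty = \mu(A)
  \qquad\text{and}\qquad
  \nu_2(A) = \infty = \mu(A),
\]
% while \nu_1(\varnothing) = \nu_2(\varnothing) = 0 = \mu(\varnothing).
%
% Disagreement on S(J_1) = \mathcal{P}(\mathbb{Q}): the singleton \{0\} is finite, so
\[
  \nu_1(\{0\}) = 1 \ne 2 = \nu_2(\{0\}).
\]
```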

Comment. In connection with the Caratheodory construction, it is legitimate to ask the following.

Question 2: What happens if we do the Caratheodory construction twice?

This problem has in fact two aspects.

Question 2A: Suppose $\omega$ is an outer measure on $X$. Take $I = \mathfrak{m}_\omega(X)$ and $\nu = \omega\big|_I$, so that $I$ is a semiring (in fact it is a σ-algebra) on $X$, and $\nu$ is a measure on $I$. Let $\nu^*$ be the maximal outer extension of $\nu$. Is it true that $\nu^* = \omega$?

By Exercise 2, we always have $\omega \le \nu^*$. In general the answer to Question 2A is negative, as shown in Exercise ??? below. One can ask however the following.

Question 2B: Same question as 2A, but suppose $\omega = \mu^*$, the maximal outer extension of a measure $\mu$ on a semiring $J$.

The following result shows that Question 2B always has an affirmative answer.


Proposition 5.3. Let $X$ be a non-empty set, let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$. Let $\mu^*$ be the maximal outer extension of $\mu$. Let $I$ be a semiring, with $I \supset J$. Consider the measure $\nu = \mu^*\big|_I$, and let $\nu^*$ be the maximal outer extension of $\nu$. Then $\nu^* = \mu^*$.

Proof. First of all, since $\nu\big|_J = \mu^*\big|_J = \mu$, by Exercise 2, we have the inequality $\nu^* \le \mu^*$.

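To prove the other inequality, we start with an arbitrary set $A \subset X$, and we prove that $\mu^*(A) \le \nu^*(A)$. If $\nu^*(A) = \infty$, there is nothing to prove, so we may assume $\nu^*(A) < \infty$. In particular, $A \in \mathcal{P}^I_\sigma(X)$, i.e. there exists at least one sequence $(B_n)_{n=1}^\infty \subset I$, with $A \subset \bigcup_{n=1}^\infty B_n$, and we have

\[ \nu^*(A) = \inf \Big\{ \sum_{n=1}^\infty \nu(B_n) : (B_n)_{n=1}^\infty \subset I,\ A \subset \bigcup_{n=1}^\infty B_n \Big\}. \]

Fix for the moment some $\varepsilon > 0$, and choose a sequence $(B^\varepsilon_n)_{n=1}^\infty \subset I$, such that

\[ A \subset \bigcup_{n=1}^\infty B^\varepsilon_n \quad\text{and}\quad \sum_{n=1}^\infty \nu(B^\varepsilon_n) \le \nu^*(A) + \varepsilon. \tag{11} \]

By σ-subadditivity of $\mu^*$, we have

\[ \mu^*(A) \le \sum_{n=1}^\infty \mu^*(B^\varepsilon_n). \]

Using the fact that $\nu = \mu^*\big|_I$, the above inequality, combined with (11), yields

\[ \mu^*(A) \le \nu^*(A) + \varepsilon. \]

Since this inequality holds for all $\varepsilon > 0$, it forces the inequality $\mu^*(A) \le \nu^*(A)$.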

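Exercise 3. Let $X$ be an uncountable set, and define $\omega : \mathcal{P}(X) \to [0, \infty]$ by

\[ \omega(A) = \begin{cases} 0 & \text{if } A = \varnothing \\ 1 & \text{if } 0 < \operatorname{card} A \le \aleph_0 \\ 2 & \text{if } A \text{ is uncountable} \end{cases} \]

(i) Prove that $\omega$ is an outer measure on $X$.

(ii) Take $I = \mathfrak{m}_\omega(X)$. Prove that $I = \{\varnothing, X\}$.

(iii) Consider the measure $\nu = \omega\big|_I$, and let $\nu^*$ be the maximal outer extension of $\nu$. Prove that there are sets $A \subset X$, with $\omega(A) < \nu^*(A)$.

Hints: For (ii) start with some $A$ with $\varnothing \subsetneq A \subsetneq X$. Prove that $A$ is not $\omega$-measurable, by showing that $A$ does not “sharply cut” sets of the form $\{a, b\}$ with $a \in A$ and $b \in X \setminus A$.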

Comment. Suppose $J$ is a semiring on $X$, and $\mu$ is a measure on $J$. We have used the maximal outer extension $\mu^*$ as a tool in defining measures on the σ-ring $S(J)$ and the σ-algebra $\Sigma(J)$ generated by $J$, by employing the Caratheodory construction, which uses the σ-algebra $\mathfrak{m}_{\mu^*}(X)$ of $\mu^*$-measurable sets. A legitimate question is then

Question 3: Is the inclusion $\Sigma(J) \subset \mathfrak{m}_{\mu^*}(X)$ strict?

In most cases this inclusion is indeed strict (see Examples ?? below, or the discussion in the next section). This can be seen by looking at $\mu^*$-negligible sets $N \subset X$, which are automatically $\mu^*$-measurable. The following result gives some useful information.


Proposition 5.4. Suppose $J$ is a semiring on $X$, and $\mu$ is a measure on $J$. Let $\mu^*$ be the maximal outer extension of $\mu$. For any set $A \in \mathcal{P}^J_\sigma(X)$, there exists some set $B$ in the σ-ring $S(J)$ generated by $J$, such that $A \subset B$, and $\mu^*(A) = \mu^*(B)$.

In particular, a subset $N \subset X$ is $\mu^*$-negligible, i.e. $\mu^*(N) = 0$, if and only if there exists a $\mu^*$-negligible set $B \in S(J)$, such that $N \subset B$.

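Proof. Since $A \in \mathcal{P}^J_\sigma(X)$, there exists a sequence $(D_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty D_n$. Moreover, we have

\[ \mu^*(A) = \inf \Big\{ \sum_{n=1}^\infty \mu(D_n) : (D_n)_{n=1}^\infty \subset J,\ A \subset \bigcup_{n=1}^\infty D_n \Big\}. \]

For each integer $k \ge 1$, we can then choose a sequence $(B^k_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty B^k_n$ and $\sum_{n=1}^\infty \mu(B^k_n) \le \mu^*(A) + 1/k$. For each integer $k \ge 1$, we define the set $B_k = \bigcup_{n=1}^\infty B^k_n$. It is clear that $B_k \in S(J)$, and $B_k \supset A$, for all $k \in \mathbb{N}$. Moreover, by σ-subadditivity of $\mu^*$, and the equality $\mu^*\big|_J = \mu$, we have the inequalities

\[ \mu^*(B_k) \le \sum_{n=1}^\infty \mu^*(B^k_n) = \sum_{n=1}^\infty \mu(B^k_n) \le \mu^*(A) + \frac{1}{k}, \quad \forall\, k \in \mathbb{N}. \]

If we then form $B = \bigcap_{k=1}^\infty B_k$, then $B$ still belongs to $S(J)$, and we have $A \subset B \subset B_k$, which gives

\[ \mu^*(A) \le \mu^*(B) \le \mu^*(B_k) \le \mu^*(A) + \frac{1}{k}, \quad \forall\, k \in \mathbb{N}, \]

thus forcing $\mu^*(B) = \mu^*(A)$.

To prove the second assertion, we see that the “only if” part is a particular case of the first part. The “if” part is trivial, since the inclusion $N \subset B$ forces the inequality $\mu^*(N) \le \mu^*(B)$.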

In connection to Question 3, it is useful to introduce the following terminology.

Definition. Let $X$ be a non-empty set, and let $J$ be a semiring on $X$. A measure $\mu$ on $J$ is said to be complete, if it satisfies the condition

(c) whenever $N \in J$ has $\mu(N) = 0$, it follows that $J$ contains all the subsets of $N$.

Remarks 5.3. A. Given an outer measure $\nu$ on a set $X$, the measure $\nu\big|_{\mathfrak{m}_\nu(X)} : \mathfrak{m}_\nu(X) \to [0, \infty]$ is always complete, as a consequence of monotonicity, and of Remark 5.1.B.

B. Given a semiring $J$ on $X$, and a measure $\mu$ on $J$, we now see that a sufficient condition for having a strict inclusion $\Sigma(J) \subsetneq \mathfrak{m}_{\mu^*}(X)$ is the lack of completeness of the measure $\mu^*\big|_{\Sigma(J)}$. Later on (see Corollary 5.2) we shall see that in the case of σ-finite measures, defined on σ-total semirings, this condition is also necessary.

The lack of completeness of a σ-ring measure can be compensated by the following result.

Theorem 5.4. Let $X$ be a non-empty set, let $S$ be a σ-ring on $X$, and let $\nu$ be a measure on $S$.

(i) The collection

\[ \mathfrak{N}(S, \nu) = \big\{ N \subset X : \text{there exists } D \in S \text{ with } N \subset D \text{ and } \nu(D) = 0 \big\} \]

is a σ-ring on $X$. Moreover, if $N \in \mathfrak{N}(S, \nu)$, then $\mathfrak{N}(S, \nu)$ contains all subsets of $N$.

(ii) For a subset $A \subset X$, the following are equivalent:

(a) there exist $B \in S$ and $N \in \mathfrak{N}(S, \nu)$, such that $A = B \setminus N$;

(b) there exist $F \in S$ and $M \in \mathfrak{N}(S, \nu)$, such that $A = F \cup M$.

(iii) The collection $\overline{S}$ of all subsets $A \subset X$, satisfying the equivalent conditions in (ii), is a σ-ring. We have the equality

\[ \overline{S} = S\big( \mathfrak{N}(S, \nu) \cup S \big). \]

(iv) There exists a unique measure $\bar\nu$ on $\overline{S}$, such that $\bar\nu\big|_{\mathfrak{N}(S,\nu)} = 0$ and $\bar\nu\big|_S = \nu$. The measure $\bar\nu$ is complete.

(v) If $E$ is a σ-ring with $E \supset S$, and if $\lambda$ is a complete measure on $E$ with $\lambda\big|_S = \nu$, then $E \supset \overline{S}$ and $\lambda\big|_{\overline{S}} = \bar\nu$.

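Proof. (i). This is pretty clear. In fact, if one takes $E = \{ B \in S : \nu(B) = 0 \}$, then one has the equality $\mathfrak{N}(S, \nu) = \mathcal{P}^E_\sigma(X)$.

(ii). (a) ⇒ (b). Assume $A = B \setminus N$ with $B \in S$ and $N \in \mathfrak{N}(S, \nu)$. Choose $D \in S$ with $\nu(D) = 0$ and $N \subset D$. We now have

\[ B \setminus D \subset B \setminus N = A, \]

so if we put $F = B \setminus D$, we have the equality $A = F \cup M$, where

\[ M = A \setminus F = (B \setminus N) \setminus (B \setminus D) \subset D. \]

Notice that $F \in S$, while the inclusion $M \subset D$ shows that $M \in \mathfrak{N}(S, \nu)$.

(b) ⇒ (a). Assume $A = F \cup M$ with $F \in S$ and $M \in \mathfrak{N}(S, \nu)$. Choose $D \in S$ with $M \subset D$ and $\nu(D) = 0$. Define $B = F \cup D$. It is clear that $B \in S$, and $A \subset B$. Define $N = B \setminus A$, so we clearly have $A = B \setminus N$. We have

\[ N = (F \cup D) \setminus (F \cup M) \subset D \setminus M \subset D, \]

so $N$ clearly belongs to $\mathfrak{N}(S, \nu)$.

(iii). We need to prove the following properties:

(∗) whenever $A_1, A_2$ are sets in $\overline{S}$, it follows that the difference $A_1 \setminus A_2$ also belongs to $\overline{S}$;

(∗∗) whenever $(A_n)_{n=1}^\infty$ is a sequence of sets in $\overline{S}$, it follows that the union $\bigcup_{n=1}^\infty A_n$ also belongs to $\overline{S}$.

To prove (∗), we write $A_1 = B \setminus N$ and $A_2 = F \cup M$, with $B, F \in S$ and $M, N \in \mathfrak{N}(S, \nu)$. Then we have

\[ A_1 \setminus A_2 = (B \setminus N) \setminus (F \cup M) = B \setminus (F \cup M \cup N) = (B \setminus F) \setminus (M \cup N). \]

The difference $B \setminus F$ belongs to $S$, and, using (i), the union $N \cup M$ belongs to $\mathfrak{N}(S, \nu)$. By (ii) it follows that $A_1 \setminus A_2$ belongs to $\overline{S}$.

To prove (∗∗), we write, for each $n \in \mathbb{N}$, the set $A_n$ as $A_n = F_n \cup M_n$ with $F_n \in S$ and $M_n \in \mathfrak{N}(S, \nu)$. Then

\[ \bigcup_{n=1}^\infty A_n = \Big( \bigcup_{n=1}^\infty F_n \Big) \cup \Big( \bigcup_{n=1}^\infty M_n \Big). \]

The union $\bigcup_{n=1}^\infty F_n$ belongs to $S$, and, using (i), the union $\bigcup_{n=1}^\infty M_n$ belongs to $\mathfrak{N}(S, \nu)$. By (ii), the union $\bigcup_{n=1}^\infty A_n$ belongs to $\overline{S}$.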


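Since $\overline{S}$ is a σ-ring, which clearly contains both $\mathfrak{N}(S, \nu)$ and $S$, it follows that $\overline{S} \supset S\big( \mathfrak{N}(S, \nu) \cup S \big)$. The other inclusion $\overline{S} \subset S\big( \mathfrak{N}(S, \nu) \cup S \big)$ is trivial, by the definition of $\overline{S}$.

(iv). To prove the existence, we consider the maximal outer extension $\nu^*$. When restricted to the σ-algebra $\mathfrak{m}_{\nu^*}(X)$ of all $\nu^*$-measurable sets, $\nu^*$ gives a measure. Notice that $\nu^*(N) = 0$, $\forall\, N \in \mathfrak{N}(S, \nu)$, which gives the inclusion $\mathfrak{N}(S, \nu) \subset \mathfrak{m}_{\nu^*}(X)$. In particular, since $\mathfrak{m}_{\nu^*}(X)$ is a σ-algebra, which contains both $\mathfrak{N}(S, \nu)$ and $S$, it follows that

\[ \mathfrak{m}_{\nu^*}(X) \supset S\big( \mathfrak{N}(S, \nu) \cup S \big) = \overline{S}. \]

In particular, $\bar\nu = \nu^*\big|_{\overline{S}}$ is a measure on $\overline{S}$, which clearly satisfies the required properties.

To prove uniqueness, let $\mu$ be another measure on $\overline{S}$, such that $\mu\big|_{\mathfrak{N}(S,\nu)} = 0$ and $\mu\big|_S = \nu$. If we start with an arbitrary set $A \in \overline{S}$, and we write it as $A = F \cup M$, with $F \in S$ and $M \in \mathfrak{N}(S, \nu)$, then using the fact that $A \setminus F \subset M$, we see that $A \setminus F$ belongs to $\mathfrak{N}(S, \nu)$, so we have

\[ \mu(A) = \mu(F) + \mu(A \setminus F) = \mu(F) = \nu(F) = \bar\nu(F) = \bar\nu(F) + \bar\nu(A \setminus F) = \bar\nu(A). \]

Finally, we prove that the measure $\bar\nu$ is complete. Let $A \in \overline{S}$ be a set with $\bar\nu(A) = 0$, and let $U$ be an arbitrary subset of $A$. Using (ii) we write $A = F \cup M$, with $F \in S$ and $M \in \mathfrak{N}(S, \nu)$. Notice that we have

\[ 0 \le \nu(F) = \bar\nu(F) \le \bar\nu(F \cup M) = \bar\nu(A) = 0, \]

which forces $F \in \mathfrak{N}(S, \nu)$, so using (i), we see that $A$ itself belongs to $\mathfrak{N}(S, \nu)$. By (i), it follows that $U \in \mathfrak{N}(S, \nu) \subset \overline{S}$.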

(v) Let $E$ and $\lambda$ be as indicated. In order to prove the inclusion $E \supset \overline{S}$, it suffices to prove the inclusion $\mathfrak{N}(S, \nu) \subset E$. But this inclusion is pretty obvious. If we start with some $N \in \mathfrak{N}(S, \nu)$, then there exists $A \in S$ with $N \subset A$ and $\nu(A) = 0$. In particular, we have $A \in E$ and $\lambda(A) = 0$, and then the completeness of $\lambda$ forces $N \in E$. Notice that this also forces $\lambda(N) = \bar\nu(N) = 0$. Using (iv) it then follows that $\lambda\big|_{\overline{S}} = \bar\nu$.

Definition. Using the notations above, the σ-ring $\overline{S}$ is called the completion of $S$ with respect to $\nu$. The correspondence $(S, \nu) \longmapsto (\overline{S}, \bar\nu)$ is referred to as the measure completion. Remark that, if $\nu$ is already complete, then $\overline{S} = S$ and $\bar\nu = \nu$.
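On a finite set, the completion described in Theorem 5.4 can be computed directly from characterisation (ii)(b): every completed set is a union of a set from the σ-ring and a subset of a null set. The following is a minimal sketch under that assumption; the function names `subsets` and `completion` and the toy measure `nu` are hypothetical, and the dictionary `nu` is assumed to list every set of the σ-ring (including the empty set) with its measure.

```python
from itertools import combinations

def subsets(s):
    """All subsets of s, returned as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def completion(nu):
    """Completed measure as a dict {F ∪ M: nu(F)}, with M ranging over null sets.

    nu maps each set of a sigma-ring (as a frozenset) to its measure.
    """
    # Null sets: all subsets of members of the sigma-ring that have measure zero.
    null = {N for D, value in nu.items() if value == 0 for N in subsets(D)}
    # By part (iv) of the theorem the value does not depend on the decomposition,
    # so overwriting an existing key always writes the same number.
    return {F | M: value for F, value in nu.items() for M in null}

# Assumed toy data: a sigma-ring on {1, 2, 3} in which {2, 3} is a null set.
nu = {
    frozenset(): 0,
    frozenset({1}): 1,
    frozenset({2, 3}): 0,
    frozenset({1, 2, 3}): 1,
}
nu_bar = completion(nu)
for A in sorted(nu_bar, key=lambda s: (len(s), sorted(s))):
    print(sorted(A), nu_bar[A])
```

In this example the completed σ-ring is the full power set of {1, 2, 3}, since every subset of the null set {2, 3} is adjoined.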

Exercise 4. Using the notations from Theorem 5.4, prove that for a set $A \subset X$, the condition $A \in \overline{S}$ is equivalent to any of the following:

(a′) there exist $B \in S$ and $N \in \mathfrak{N}(S, \nu)$, with $A = B \setminus N$, and $N \subset B$;

(b′) there exist $F \in S$ and $M \in \mathfrak{N}(S, \nu)$, such that $A = F \cup M$ and $F \cap M = \varnothing$;

(c) there exist $E \in S$ and $Z \in \mathfrak{N}(S, \nu)$, such that $A = E \,\triangle\, Z$;

(d) there exist $B, F \in S$ such that $F \subset A \subset B$, and $\nu(B \setminus F) = 0$.

The $\mu^*$-measurable sets of a special type can be completely characterized using $\mu^*$-negligible ones.

Theorem 5.5. Suppose $J$ is a semiring on $X$, and $\mu$ is a measure on $J$. Let $\mu^*$ be the maximal outer extension of $\mu$. For a $J$-$\mu$-σ-finite subset $A \subset X$, the following are equivalent:

(i) $A$ is $\mu^*$-measurable;

(ii) there exists $B$ in the σ-ring $S(J)$ generated by $J$, and a $\mu^*$-negligible set $N \subset X$, such that $A = B \setminus N$.

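Proof. (i) ⇒ (ii). Start by choosing a sequence $(D_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty D_n$ and $\mu(D_n) < \infty$, $\forall\, n \in \mathbb{N}$. Since $\mathfrak{m}_{\mu^*}(X)$ is an algebra, which contains $J$, it follows that all the intersections $A_n = A \cap D_n$, $n \in \mathbb{N}$, belong to $\mathfrak{m}_{\mu^*}(X)$. For each $n \in \mathbb{N}$, we use Proposition 5.4 to find some set $B_n \in S(J)$ such that $A_n \subset B_n$, and $\mu^*(B_n) = \mu^*(A_n)$. On the one hand, if we put $V_n = B_n \setminus A_n$, then $V_n \in \mathfrak{m}_{\mu^*}(X)$, so we will have

\[ \mu^*(B_n) = \mu^*(A_n) + \mu^*(V_n). \]

On the other hand, we know that $\mu^*(B_n) = \mu^*(A_n) \le \mu^*(D_n) < \infty$, so the above equality forces $\mu^*(V_n) = 0$.

Since we have $B_n = A_n \cup V_n$, $\forall\, n \in \mathbb{N}$, we will get

\[ \bigcup_{n=1}^\infty B_n = \Big( \bigcup_{n=1}^\infty A_n \Big) \cup \Big( \bigcup_{n=1}^\infty V_n \Big) = A \cup \Big( \bigcup_{n=1}^\infty V_n \Big), \]

so if we define $B = \bigcup_{n=1}^\infty B_n$ and $V = \bigcup_{n=1}^\infty V_n$, then $B$ belongs to $S(J)$, we have the equality $B = A \cup V$, and $V$ is $\mu^*$-negligible, because of the inequality

\[ \mu^*(V) \le \sum_{n=1}^\infty \mu^*(V_n) = 0. \]

The set $N = B \setminus A \subset V$ is clearly $\mu^*$-negligible, because $\mu^*(N) \le \mu^*(V)$. Now we are done because $B \setminus N = A$.

(ii) ⇒ (i). This part is trivial, since $\mathfrak{m}_{\mu^*}(X)$ is an algebra which contains both $S(J)$ and all $\mu^*$-negligible sets.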

Remarks 5.4. A. The implication (ii) ⇒ (i) holds without the assumption that $A$ is $J$-$\mu$-σ-finite. In fact, for any $A \subset X$, one has the implications (ii) ⇒ (ii′) ⇒ (i), where

(ii′) there exists $B$ in the σ-algebra $\Sigma(J)$ generated by $J$, and a $\mu^*$-negligible set $N \subset X$, such that $A = B \setminus N$.

B. Consider the measure $\mu^*\big|_{S(J)}$ on the σ-ring $S(J)$. Using the notations from Theorem 5.4, by Proposition 5.4, we clearly have the equality

\[ \big\{ N \subset X : N \ \mu^*\text{-negligible} \big\} = \mathfrak{N}\big( S(J), \mu^*\big|_{S(J)} \big). \]

So, if we denote by $\overline{S(J)}$ the completion of $S(J)$ with respect to $\mu^*\big|_{S(J)}$, condition (ii) from Theorem 5.5 reads: $A \in \overline{S(J)}$. Similarly, if we denote by $\overline{\Sigma(J)}$ the completion of $\Sigma(J)$ with respect to the measure $\mu^*\big|_{\Sigma(J)}$, condition (ii′) above reads: $A \in \overline{\Sigma(J)}$. With these notations, we have the inclusions

\[ \overline{S(J)} \subset \overline{\Sigma(J)} \subset \mathfrak{m}_{\mu^*}(X). \tag{12} \]

With these notations, Theorem 5.5 states that

\[ \overline{S(J)} \cap \big\{ A \subset X : A \ J\text{-}\mu\text{-}\sigma\text{-finite} \big\} = \mathfrak{m}_{\mu^*}(X) \cap \big\{ A \subset X : A \ J\text{-}\mu\text{-}\sigma\text{-finite} \big\}. \tag{13} \]

Theorem 5.5, written in the form (13), has the following consequence.

Corollary 5.2. If the semiring $J$ is σ-total in $X$, and $\mu$ is a σ-finite measure on $J$, then one has the equalities

\[ \overline{S(J)} = \overline{\Sigma(J)} = \mathfrak{m}_{\mu^*}(X). \tag{14} \]


Proof. Indeed, under the given assumptions on $J$ and $\mu$, it follows that every set $A \subset X$ is $J$-$\mu$-σ-finite.

Examples 5.3. A. The implication (i) ⇒ (ii) from Theorem 5.5 may fail, if $A$ is not σ-finite. Start with an arbitrary set $X$, consider the semiring $J = \{\varnothing, X\}$ and the measure $\mu$ on $J$ defined by $\mu(\varnothing) = 0$ and $\mu(X) = \infty$. Notice that $J$ is a σ-algebra, so it is trivial that $J$ is σ-total in $X$. The maximal outer extension $\mu^*$ of $\mu$ is defined by

\[ \mu^*(A) = \begin{cases} 0 & \text{if } A = \varnothing \\ \infty & \text{if } A \ne \varnothing \end{cases} \]

It is clear that, since $\mu^*$ is a measure on $\mathcal{P}(X)$, we have the equality $\mathfrak{m}_{\mu^*}(X) = \mathcal{P}(X)$, but the only $\mu^*$-negligible set is the empty set $\varnothing$. This means that the sets satisfying condition (ii) in Theorem 5.5 are only the sets $\varnothing$ and $X$, so, if $\varnothing \ne A \subsetneq X$, the implication (i) ⇒ (ii) fails, although $J$ is σ-total in $X$. What occurs here is the total lack of non-empty $J$-$\mu$-σ-finite sets.

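B. Let $X$ be an uncountable set, and let $J$ be the semiring of all finite subsets of $X$. We have

\[ S(J) = \big\{ A \subset X : \operatorname{card} A \le \aleph_0 \big\}, \]

\[ \Sigma(J) = \big\{ A \subset X : \text{either } \operatorname{card} A \le \aleph_0, \text{ or } \operatorname{card}(X \setminus A) \le \aleph_0 \big\}. \]

Equip $J$ with the trivial measure $\mu(A) = 0$, $\forall\, A \in J$. The maximal outer extension $\mu^*$ is then defined by

\[ \mu^*(A) = \begin{cases} 0 & \text{if } \operatorname{card} A \le \aleph_0 \\ \infty & \text{if } A \text{ is uncountable} \end{cases} \]

It is clear that $\mu^*$ is a measure on $\mathcal{P}(X)$, so we have $\mathfrak{m}_{\mu^*}(X) = \mathcal{P}(X) \supsetneq J$. Notice that both measures $\mu^*\big|_{S(J)}$ and $\mu^*\big|_{\Sigma(J)}$ are complete, so using the notations from Remark 5.4.B, we have the equalities

\[ \overline{S(J)} = S(J) \quad\text{and}\quad \overline{\Sigma(J)} = \Sigma(J). \]

It is clear however that both inclusions in (12) are strict, although $\mu$ is finite. What happens here is the fact that $J$ is not σ-total in $X$.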

C. In the same setting as in Example B, if we take $I = \Sigma(J)$, and $\nu = \mu^*\big|_I$, then $I$ is σ-total in $X$, simply because $I$ is a σ-algebra. In this case, by Proposition 5.3, the maximal outer extension $\nu^*$ of $\nu$ coincides with $\mu^*$. We have

\[ I = S(I) = \overline{S(I)} = \Sigma(I) = \overline{\Sigma(I)} \subsetneq \mathfrak{m}_{\nu^*}(X), \]

the reason for the strict inclusion being this time the fact that $\nu$ is not σ-finite.

Comment. In the remainder of this section we take another look at Question 3, trying to generalize the answer given by Corollary 5.2. To simplify matters a little bit, we start with a σ-algebra $\mathcal{B}$ on $X$ (which is clearly σ-total in $X$), and a measure $\mu$ on $\mathcal{B}$. If we take $\mu^*$ to be the maximal outer extension of $\mu$, and consider the completion $\overline{\mathcal{B}}$, we have the inclusion

\[ \overline{\mathcal{B}} \subset \mathfrak{m}_{\mu^*}(X), \tag{15} \]

so we can ask whether this inclusion is strict. Of course, if $\mu$ is σ-finite, then by Corollary 5.2 the inclusion (15) is not strict. As Example 5.3.C suggests, in the absence of the σ-finiteness assumption, the inclusion (15) may indeed be strict. As it turns out, the fact that the inclusion (15) is strict in Example 5.3.C is a consequence of the fact that there are “new” measurable sets which are not necessarily of the form $B \setminus N$ with $B \in \mathcal{B}$ and $N$ negligible. The existence of such sets is suggested by the following.

Remark 5.5. Suppose $\nu$ is an outer measure on $X$. For a set $A \subset X$, the following are equivalent:

(i) $A$ is $\nu$-measurable;

(ii) $\nu(S) \ge \nu(S \cap A) + \nu(S \setminus A)$, for all $S \subset X$ with $\nu(S) < \infty$.

The implication (i) ⇒ (ii) is trivial. To prove the converse, by Remark 5.1.A, we need to show that

\[ \nu(S) \ge \nu(S \cap A) + \nu(S \setminus A), \quad \forall\, S \subset X. \]

But this is trivial, when $\nu(S) = \infty$. If $\nu(S) < \infty$, then this is exactly condition (ii).

The “new” sets, that were mentioned above, are of a type covered by the following.

Definition. Let $\nu$ be an outer measure on $X$. A subset $N \subset X$ is said to be locally $\nu$-negligible, if

\[ \nu(N \cap A) = 0, \quad \text{for all } A \subset X \text{ with } \nu(A) < \infty. \]

It is clear that every subset of a locally $\nu$-negligible set is also locally $\nu$-negligible.

Remark 5.5 shows that every locally $\nu$-negligible set $N$ is $\nu$-measurable: if $\nu(S) < \infty$, then $\nu(S \cap N) = 0$, so $\nu(S \cap N) + \nu(S \setminus N) = \nu(S \setminus N) \le \nu(S)$.

The term “local” will be used in connection with properties that hold when the subject set is cut down by sets of finite measure. For example, one can formulate the following.

Definitions. Let $\mathcal{B}$ be a σ-algebra on $X$, and $\mu$ be a measure on $\mathcal{B}$. We say that a set $N \in \mathcal{B}$ is locally $\mu$-null, if

\[ \mu(F \cap N) = 0, \quad \text{for all } F \in \mathcal{B} \text{ with } \mu(F) < \infty. \tag{16} \]

Remark that locally $\mu$-null sets do not necessarily have zero measure (see Example 5.3.C).

We say that $\mu$ is locally complete, if it satisfies the condition

(lc) whenever $N \in \mathcal{B}$ is a locally $\mu$-null set, it follows that $\mathcal{B}$ contains all subsets of $N$.

Remarks 5.6. Use the notations above.

A. If the measure $\mu$ is σ-finite, the local completeness of $\mu$ is equivalent to completeness. The reason is the fact that, in the σ-finite case, condition (16) is equivalent to $\mu(N) = 0$.

B. Given an outer measure $\nu$ on $X$, the measure $\nu\big|_{\mathfrak{m}_\nu(X)}$ is locally complete.

Comment. If we look at Example 5.3.C, we now see that although the measure $\nu$ on $I$ is complete, it is not locally complete, thus giving another explanation for the strict inclusion $I \subsetneq \mathfrak{m}_{\nu^*}(X)$.

We are now in position to analyze Question 3, in the simplified given setting. The following fact will be helpful.

Lemma 5.1. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$, and let $\mu^*$ be the maximal outer extension of $\mu$. Then, for every subset $S \subset X$, one has the equality

\[ \mu^*(S) = \inf \big\{ \mu(B) : B \in \mathcal{B},\ B \supset S \big\}. \tag{17} \]


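Proof. Since $\mathcal{B}$ is σ-total in $X$, by definition we have

\[ \mu^*(S) = \inf \Big\{ \sum_{n=1}^\infty \mu(B_n) : (B_n)_{n=1}^\infty \subset \mathcal{B},\ S \subset \bigcup_{n=1}^\infty B_n \Big\}. \tag{18} \]

If we denote the right hand side of (17) by $\nu(S)$, then using (18) we clearly have $\mu^*(S) \le \nu(S)$. Conversely, if we start with any sequence $(B_n)_{n=1}^\infty \subset \mathcal{B}$ with $S \subset \bigcup_{n=1}^\infty B_n$, then we clearly have

\[ \sum_{n=1}^\infty \mu(B_n) \ge \mu\Big( \bigcup_{n=1}^\infty B_n \Big) \ge \nu(S), \]

so taking the infimum yields $\mu^*(S) \ge \nu(S)$.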
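Lemma 5.1 can be sanity-checked on a finite σ-algebra, where both infima are minima over finitely many candidates. The sketch below is an illustration under stated assumptions: the σ-algebra is generated by a made-up partition into atoms, the weights are arbitrary toy values, the helper names are hypothetical, and finite families without repetition are assumed to suffice in place of sequences (repeating a set can only increase the sum).

```python
from itertools import combinations

def sigma_algebra(atoms):
    """All unions of atoms: the sigma-algebra generated by a finite partition."""
    atoms = [frozenset(a) for a in atoms]
    return [frozenset().union(*c)
            for r in range(len(atoms) + 1)
            for c in combinations(atoms, r)]

def measure(B, weight):
    """mu(B) for a union of atoms B; weight maps each atom to its mass."""
    return sum(w for atom, w in weight.items() if atom <= B)

def outer_by_covers(S, algebra, weight):
    """Formula (18) on a finite space: minimise over families that cover S."""
    best = float("inf")
    for r in range(len(algebra) + 1):
        for family in combinations(algebra, r):
            if S <= frozenset().union(*family):
                best = min(best, sum(measure(B, weight) for B in family))
    return best

def outer_by_single_set(S, algebra, weight):
    """Formula (17): minimise mu(B) over single sets B of the algebra containing S."""
    return min(measure(B, weight) for B in algebra if S <= B)

# Assumed toy data: three atoms partitioning {0, 1, 2, 3, 4}.
weight = {frozenset({0, 1}): 3, frozenset({2}): 0, frozenset({3, 4}): 2}
algebra = sigma_algebra(weight)

S = frozenset({1, 2})
print(outer_by_covers(S, algebra, weight), outer_by_single_set(S, algebra, weight))
```

The two printed numbers agree for this `S`, and the same comparison can be repeated over every subset of the underlying set.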

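Proposition 5.5. Let $\mathcal{B}$ be a σ-algebra on $X$, and let $\mu$ be a measure on $\mathcal{B}$. Define the collection

\[ \mathcal{B}_{\mathrm{fin}} = \big\{ F \in \mathcal{B} : \mu(F) < \infty \big\}. \]

For every $F \in \mathcal{B}_{\mathrm{fin}}$, denote by $\mathcal{M}_F$ the completion of the σ-algebra $\mathcal{B}\big|_F$ (on $F$) with respect to the measure⁸ $\mu\big|_F$. Denote by $\mu^*$ the maximal outer extension of $\mu$.

A. For a subset $A \subset X$, the following are equivalent:

(i) $A$ is $\mu^*$-measurable;

(ii) $A \cap F \in \mathcal{M}_F$, for each $F \in \mathcal{B}_{\mathrm{fin}}$;

(iii) $A \cap F$ is $\mu^*$-measurable, for each $F \in \mathcal{B}_{\mathrm{fin}}$.

B. For a subset $N \subset X$, the following are equivalent:

(i) $N$ is locally $\mu^*$-negligible;

(ii) $\mu^*(N \cap F) = 0$, for all $F \in \mathcal{B}_{\mathrm{fin}}$.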

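Proof. Let us fix some useful notations. By construction, for every $F \in \mathcal{B}_{\mathrm{fin}}$, we have

\[ \mathcal{B}\big|_F = \big\{ A \cap F : A \in \mathcal{B} \big\} = \big\{ B \in \mathcal{B} : B \subset F \big\}. \]

For each $F \in \mathcal{B}_{\mathrm{fin}}$, we denote the σ-ring $\mathfrak{N}\big( \mathcal{B}\big|_F, \mu\big|_F \big)$ simply by $\mathcal{N}_F$. With the above identification we have

\[ \mathcal{N}_F = \big\{ N \subset F : \text{there exists } D \in \mathcal{B} \text{ with } N \subset D \subset F \text{ and } \mu(D) = 0 \big\}, \]

so (see Theorem 5.4) the σ-algebra $\mathcal{M}_F$ is given as

\[ \mathcal{M}_F = \big\{ B \setminus N : B \in \mathcal{B},\ B \subset F,\ N \in \mathcal{N}_F \big\}. \]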

A. To prove the implication (i) ⇒ (ii), start with a $\mu^*$-measurable set $A$, and with some $F \in \mathcal{B}_{\mathrm{fin}}$. Since $F$ is $\mu^*$-measurable, the intersection $A \cap F$ is $\mu^*$-measurable. Since $\mu^*(A \cap F) \le \mu^*(F) = \mu(F) < \infty$, by Theorem 5.5 there exist $B_0 \in \mathcal{B}$ and $N_0 \subset X$ with $\mu^*(N_0) = 0$ and $A \cap F = B_0 \setminus N_0$. If we then define $B = B_0 \cap F$ and $N = N_0 \cap F$, then we clearly have $B \in \mathcal{B}\big|_F$, $N \in \mathcal{N}_F$, and $A \cap F = B \setminus N$, so $A \cap F$ indeed belongs to $\mathcal{M}_F$.

The implication (ii) ⇒ (iii) is trivial, since every set in $\mathcal{M}_F$ is clearly $\mu^*$-measurable.

To prove the implication (iii) ⇒ (i), assume $A$ has property (iii), and let us show that $A$ is $\mu^*$-measurable. We are going to use Remark 5.5, which means that it suffices to prove the inequality

\[ \mu^*(S) \ge \mu^*(S \cap A) + \mu^*(S \setminus A), \tag{19} \]

only for those subsets $S \subset X$ with $\mu^*(S) < \infty$. Fix such a subset $S$. Since $\mu^*(S) < \infty$, Lemma 5.1 gives

\[ \mu^*(S) = \inf \big\{ \mu(F) : F \in \mathcal{B}_{\mathrm{fin}},\ F \supset S \big\}. \tag{20} \]

Start with some arbitrary $\varepsilon > 0$, and choose some $F \in \mathcal{B}_{\mathrm{fin}}$ with $F \supset S$ and $\mu(F) \le \mu^*(S) + \varepsilon$. By (iii) the set $A \cap F$ is $\mu^*$-measurable, so we have

\[ \mu^*(F) = \mu^*\big( F \cap [A \cap F] \big) + \mu^*\big( F \setminus [A \cap F] \big) = \mu^*(F \cap A) + \mu^*(F \setminus A). \]

Since $F \cap A \supset S \cap A$, and $F \setminus A \supset S \setminus A$, we have the inequalities $\mu^*(F \cap A) \ge \mu^*(S \cap A)$ and $\mu^*(F \setminus A) \ge \mu^*(S \setminus A)$, so the above equality gives

\[ \mu^*(F) \ge \mu^*(S \cap A) + \mu^*(S \setminus A). \]

By the choice of $F$, this gives

\[ \mu^*(S) + \varepsilon \ge \mu^*(S \cap A) + \mu^*(S \setminus A). \]

Since this inequality holds for all $\varepsilon > 0$, we immediately get the desired inequality (19).

⁸ Here $\mu\big|_F$ denotes the restriction of $\mu$ to the σ-algebra $\mathcal{B}\big|_F$.

B. The condition (i) says that

\[ \mu^*(N \cap S) = 0, \quad \text{for all } S \subset X \text{ with } \mu^*(S) < \infty. \tag{21} \]

It is obvious that we have the implication (i) ⇒ (ii). Conversely, suppose $N$ satisfies (ii), and let us prove (21). Start with some arbitrary subset $S \subset X$ with $\mu^*(S) < \infty$. Using (20), there exists some $F \in \mathcal{B}_{\mathrm{fin}}$ with $S \subset F$. By (ii), and the monotonicity of $\mu^*$, we have $0 = \mu^*(N \cap F) \ge \mu^*(N \cap S)$, which clearly forces $\mu^*(N \cap S) = 0$.

The above result suggests that the σ-algebra $\mathfrak{m}_{\mu^*}(X)$ can be regarded as some sort of “local” completion of $\mathcal{B}$. To simplify the exposition a little bit, we introduce the following.

Notation. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$, and let $\mu^*$ be the maximal outer extension of $\mu$. The σ-algebra $\mathfrak{m}_{\mu^*}(X)$, of all $\mu^*$-measurable subsets of $X$, will be denoted by $\mathcal{M}_\mu(\mathcal{B})$ (or just $\mathcal{M}_\mu$, when there is no danger of confusion). The measure $\mu^*\big|_{\mathcal{M}_\mu}$ will be denoted by $\bar\mu$. The pair $(\mathcal{M}_\mu, \bar\mu)$ will be called the quasi-completion of $\mathcal{B}$ with respect to $\mu$.

Unfortunately, analogues of Theorem 5.4 are not available, unless some (otherwise natural) restrictions are imposed. The type of restrictions we have in mind are also aimed at making the test conditions A.(iii) and B.(ii) easier to check. We would like to check them on a “small” sub-collection of $\mathcal{B}_{\mathrm{fin}}$. This naturally suggests the following.

Definition. Let $\mathcal{B}$ be a σ-algebra on $X$, and let $\mu$ be a measure on $\mathcal{B}$. A sufficient $\mu$-finite $\mathcal{B}$-partition of $X$ is a collection $\mathcal{F}$ of non-empty subsets of $X$, with the following properties:

(i) $\mathcal{F}$ is pairwise disjoint, and $\bigcup_{F \in \mathcal{F}} F = X$;

(ii) $\mathcal{F} \subset \mathcal{B}$, and $\mu(F) < \infty$, for all $F \in \mathcal{F}$;

(iii) for every set $B \in \mathcal{B}$, with $\mu(B) < \infty$, one has the equality

\[ \mu(B) = \sum_{F \in \mathcal{F}} \mu(B \cap F). \]

Condition (iii) uses the summation convention from II.2. (The sum is defined as the supremum of all finite partial sums.)
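A standard illustration of this definition, not taken from the original text, is the counting measure on an arbitrary set with the partition into singletons. The worked check below is a sketch; the choice of example and the notation $\operatorname{card}$ for the counting measure are assumptions made here.

```latex
% Assumed example: X arbitrary, \mathcal{B} = \mathcal{P}(X), \mu = counting measure,
% and \mathcal{F} = \{ \{x\} : x \in X \}.
%
% (i)   The singletons are pairwise disjoint and their union is X.
% (ii)  Each \{x\} belongs to \mathcal{P}(X) and \mu(\{x\}) = 1 < \infty.
% (iii) If \mu(B) < \infty, then B is finite, and
\[
  \sum_{F \in \mathcal{F}} \mu(B \cap F)
  \;=\; \sum_{x \in X} \mu\big(B \cap \{x\}\big)
  \;=\; \sum_{x \in B} 1
  \;=\; \operatorname{card} B
  \;=\; \mu(B).
\]
% When X is uncountable, \mu is not \sigma-finite and \mathcal{F} is uncountable,
% so such partitions can exist outside the \sigma-finite setting.
```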


Remarks 5.7. A. Suppose $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$. For every set $A \in \mathcal{B}$, we define the collection

\[ S^\mu_{\mathcal{F}}(A) = \big\{ F \in \mathcal{F} : \mu(A \cap F) > 0 \big\}. \]

If $\mu(A) < \infty$, then

(a) $S^\mu_{\mathcal{F}}(A)$ is at most countable, and

(b) $\mu\Big( A \setminus \Big[ \bigcup_{F \in S^\mu_{\mathcal{F}}(A)} (A \cap F) \Big] \Big) = 0$.

By condition (iii) in the definition, it follows that the family $\big( \mu(A \cap F) \big)_{F \in S^\mu_{\mathcal{F}}(A)}$ is summable, and

\[ \sum_{F \in S^\mu_{\mathcal{F}}(A)} \mu(A \cap F) = \mu(A). \tag{22} \]

Since $\mu(A \cap F) > 0$, $\forall\, F \in S^\mu_{\mathcal{F}}(A)$, property (a) follows from Proposition II.2.2. If we denote the union $\bigcup_{F \in S^\mu_{\mathcal{F}}(A)} (A \cap F)$ by $A_0$, then by the σ-additivity of $\mu$ (it is here where we use (a) in an essential way) the equality (22) gives

\[ \mu(A_0) = \sum_{F \in S^\mu_{\mathcal{F}}(A)} \mu(A \cap F) = \mu(A), \]

which combined with $\mu(A) < \infty$ forces $\mu(A \setminus A_0) = 0$.

B. The existence of a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$ is a generalization of σ-finiteness. In fact the following are equivalent ($\mathcal{B}$ is a σ-algebra on $X$):

• $\mu$ is σ-finite;

• there exists a countable sufficient $\mu$-finite $\mathcal{B}$-partition of $X$.

In the presence of a sufficient $\mu$-finite $\mathcal{B}$-partition, the properties that appear in Proposition 5.5 are simplified.

Proposition 5.6. Let $\mathcal{B}$ be a σ-algebra on $X$, let $\mu$ be a measure on $\mathcal{B}$. Assume $\mathcal{F}$ is a sufficient $\mu$-finite $\mathcal{B}$-partition of $X$. Denote by $\mu^*$ the maximal outer extension of $\mu$.

A. For a subset $A \subset X$, the following are equivalent:

(i) $A$ is $\mu^*$-measurable;

(ii) $A \cap F$ is $\mu^*$-measurable, for each $F \in \mathcal{F}$.

B. For a subset $N \subset X$, the following are equivalent:

(i) $N$ is locally $\mu^*$-negligible;

(ii) $\mu^*(N \cap F) = 0$, for all $F \in \mathcal{F}$.

C. If $A \subset X$ is a subset with $\mu^*(A) < \infty$, then

\[ \mu^*(A) = \sum_{F \in \mathcal{F}} \mu^*(A \cap F). \tag{23} \]

Proof. It will be useful to introduce the following notations (use also the notations from Proposition 5.5). For every $B \in \mathcal{B}_{\mathrm{fin}}$, we define

\[ B' = \bigcup_{F \in S^\mu_{\mathcal{F}}(B)} (B \cap F). \]

By Remark 5.7 we know that $\mu(B \setminus B') = 0$.

A. The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i), we start with a set $A \subset X$ satisfying condition (ii), and we show that $A$ satisfies condition (iii) from Proposition 5.5.A. Start with some arbitrary set $B \in \mathcal{B}_{\mathrm{fin}}$, and let us show that $A \cap B$ is $\mu^*$-measurable. Using the above notation, and the monotonicity of $\mu^*$, we have

\[ \mu^*\big( A \cap [B \setminus B'] \big) \le \mu^*(B \setminus B') = \mu(B \setminus B') = 0, \]

which in particular shows that $A \cap [B \setminus B']$ is $\mu^*$-measurable. Since we have $A \cap B = (A \cap B') \cup (A \cap [B \setminus B'])$, it then suffices to show that $A \cap B'$ is $\mu^*$-measurable. Notice that

\[ A \cap B' = \bigcup_{F \in S^\mu_{\mathcal{F}}(B)} (A \cap F \cap B), \]

and since the indexing set $S^\mu_{\mathcal{F}}(B)$ is at most countable, it then suffices to show that $A \cap F \cap B$ is $\mu^*$-measurable, for each $F$. But this is obvious, since $A \cap F$ is $\mu^*$-measurable, by condition (ii), and $B \in \mathcal{B}$.

C. Let A ⊂ X be a subset with µ∗(A) < ∞. Using Lemma 5.1, we can find, for every ε > 0, some set B_ε ∈ B_fin, such that B_ε ⊃ A, and µ(B_ε) ≤ µ∗(A) + ε. Fix for the moment ε. Since the family $(\mu(B_\varepsilon \cap F))_{F \in \mathcal{F}}$ is summable, and µ∗(A ∩ F) ≤ µ(B_ε ∩ F), ∀ F ∈ F, it follows that the family $(\mu^*(A \cap F))_{F \in \mathcal{F}}$ is summable, and moreover one has the inequality

$$\sum_{F \in \mathcal{F}} \mu^*(A \cap F) \le \sum_{F \in \mathcal{F}} \mu(B_\varepsilon \cap F) = \mu(B_\varepsilon) \le \mu^*(A) + \varepsilon.$$

Since we have $\sum_{F \in \mathcal{F}} \mu^*(A \cap F) \le \mu^*(A) + \varepsilon$, for all ε > 0, it follows that we have in fact the inequality

$$\sum_{F \in \mathcal{F}} \mu^*(A \cap F) \le \mu^*(A).$$

To prove the reverse inequality, we fix ε = 1 and we define the set

$$G = \bigcup_{F \in S^{\mu}_{\mathcal{F}}(B_1)} F.$$

Since $S^{\mu}_{\mathcal{F}}(B_1)$ is at most countable, the set G belongs to B. With the above notation, we have the equality B′₁ = B₁ ∩ G, and by Remark 5.7.A, we have µ(B₁ ∖ G) = µ(B₁ ∖ B′₁) = 0. Since A ∖ G ⊂ B₁ ∖ G, it follows that µ∗(A ∖ G) = 0. Since G is µ∗-measurable, we get

$$\mu^*(A) = \mu^*(A \cap G) + \mu^*(A \setminus G) = \mu^*(A \cap G).$$

Since G is a countable union of F's, by the σ-subadditivity of µ∗, we have

$$\mu^*(A) = \mu^*(A \cap G) = \mu^*\Big(\bigcup_{F \in S^{\mu}_{\mathcal{F}}(B_1)} [A \cap F]\Big) \le \sum_{F \in S^{\mu}_{\mathcal{F}}(B_1)} \mu^*(A \cap F) \le \sum_{F \in \mathcal{F}} \mu^*(A \cap F).$$

B. The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i), we must show that condition (ii) implies

µ∗(N ∩ B) = 0, ∀ B ∈ B_fin.

But if we fix some B ∈ B_fin, then of course we have µ∗(N ∩ B) ≤ µ∗(B) = µ(B) < ∞, so using part C, we have

$$\mu^*(N \cap B) = \sum_{F \in \mathcal{F}} \mu^*(N \cap B \cap F) \le \sum_{F \in \mathcal{F}} \mu^*(N \cap F) = 0,$$

and we are done.

Comments. Let B be a σ-algebra on X, let µ be a measure on B. Assume F is a sufficient µ-finite B-partition of X.

By Proposition 5.6.C, it follows that F is also a sufficient µ-finite M_µ-partition of X.

We see now that M_µ may contain more “new” sets, apart from the “natural candidates,” which are of the form B ∖ N, with B ∈ B and N locally µ∗-negligible. Such “new” sets are those which belong (see Section 2) to the σ-algebra $\bigvee_{F \in \mathcal{F}} (\mathcal{B}|_F)$. More precisely, we have the following.

Corollary 5.3. Let B be a σ-algebra on X, let µ be a measure on B. Assume F is a sufficient µ-finite B-partition of X.

A. One has the equality

$$\mathcal{M}_\mu = \bigvee_{F \in \mathcal{F}} \big(\mathcal{M}_\mu|_F\big).$$

B. For a subset A ⊂ X, the following are equivalent:
(i) A ∈ M_µ;
(ii) there exist a set $B \in \bigvee_{F \in \mathcal{F}} (\mathcal{B}|_F)$, and a locally µ∗-negligible set N ⊂ X, such that A = B ∖ N.

Proof. A. This is exactly property A from Proposition 5.6.

B. (i) ⇒ (ii). Assume A ∈ M_µ, i.e. A is µ∗-measurable. For every F ∈ F, the set A ∩ F is µ∗-measurable. Since µ∗(A ∩ F) < ∞, by Theorem 5.5, it follows that A ∩ F = B_F ∖ N_F, with B_F ∈ B and µ∗(N_F) = 0. Replacing B_F with B_F ∩ F, and N_F with N_F ∩ F, we can assume that B_F, N_F ⊂ F. Form then the sets $B = \bigcup_{F \in \mathcal{F}} B_F$ and $N = \bigcup_{F \in \mathcal{F}} N_F$. On the one hand, we have B ∩ F = B_F ∈ B, ∀ F ∈ F, which means precisely that $B \in \bigvee_{F \in \mathcal{F}} (\mathcal{B}|_F)$. On the other hand, we also have N ∩ F = N_F, so we get µ∗(N ∩ F) = 0, ∀ F ∈ F. By Proposition 5.6.B, it follows that N is locally µ∗-negligible. We clearly have A = B ∖ N.

The implication (ii) ⇒ (i) is obvious.

There is yet another nice consequence of Proposition 5.6, for which we are going to use the following terminology.

Definition. Let A be a σ-algebra on X, and let µ be a measure on A. A family F is called a µ-finite decomposition for A, if
(i) F is a sufficient µ-finite A-partition of X, and
(ii) one has the equality $\bigvee_{F \in \mathcal{F}} (\mathcal{A}|_F) = \mathcal{A}$.

(Given a collection F ⊂ A, one always has the inclusion $\bigvee_{F \in \mathcal{F}} (\mathcal{A}|_F) \supset \mathcal{A}$, since A ∩ F ∈ A for every A ∈ A and every F ∈ F.)

A measure µ on A is said to be decomposable, if there exists at least one µ-finite decomposition for A.

Remark 5.8. Decomposability is a generalization of σ-finiteness. This follows from Remark 5.6.B, combined with the fact that whenever F ⊂ A is a countable sub-collection, one always has the equality $\bigvee_{F \in \mathcal{F}} (\mathcal{A}|_F) = \mathcal{A}$.

With this terminology, Corollary 5.3 states that if F is a sufficient µ-finite B-partition of X, then F is a µ-finite decomposition for M_µ.

With the above terminology, Corollary 5.2 has the following generalization.

Theorem 5.6. Let µ be a decomposable measure on the σ-algebra B.

A. For a subset A ⊂ X, the following are equivalent:
(i) A is µ∗-measurable;
(ii) there exist B ∈ B, and some locally µ∗-negligible set N, such that A = B ∖ N.

B. For a subset N ⊂ X, the following are equivalent:
(i) N is locally µ∗-negligible;
(ii) there exists a locally µ-null set D ∈ B with N ⊂ D.

Proof. A. This is clear, by Corollary 5.3.

B. The implication (ii) ⇒ (i) is trivial, because any locally µ-null set D is locally µ∗-negligible, and so is every subset of D.

To prove the implication (i) ⇒ (ii), start with a locally µ∗-negligible set N, and fix a µ-finite decomposition F of B. We know that µ∗(N ∩ F) = 0, ∀ F ∈ F. In particular, using Remark 5.4.B, for each F ∈ F, there exists some set E_F ∈ B, with N ∩ F ⊂ E_F, and µ(E_F) = 0. Consider now the set $D = \bigcup_{F \in \mathcal{F}} (E_F \cap F)$. By construction, we have D ∩ F = E_F ∩ F ∈ B, ∀ F ∈ F, which means that $D \in \bigvee_{F \in \mathcal{F}} (\mathcal{B}|_F)$. It is here where we use condition (ii) in the definition of µ-finite decompositions, to conclude that D belongs to B. Of course, we have

µ(D ∩ F) = µ(E_F ∩ F) ≤ µ(E_F) = 0, ∀ F ∈ F,

which by Proposition 5.6 means that D is locally µ∗-negligible. This means that

µ(D ∩ B) = µ∗(D ∩ B) = 0, ∀ B ∈ B_fin,

which means that D is locally µ-null. Since N ∩ F ⊂ E_F ∩ F ⊂ D, ∀ F ∈ F, and F is a partition of X, we get N ⊂ D.

Lectures 23-25

6. The Lebesgue measure

In this section we apply various results from the previous sections to a very basic example: the Lebesgue measure on Rⁿ.

Notations. We fix an integer n ≥ 1. In Section 21 we introduced the semiring of “half-open boxes” in Rⁿ:

$$\mathcal{J}_n = \{\emptyset\} \cup \Big\{ \prod_{j=1}^n [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Big\} \subset \mathcal{P}(\mathbb{R}^n).$$

For a non-empty box A = [a₁, b₁) × · · · × [aₙ, bₙ) ∈ Jₙ, we defined its n-dimensional volume by

$$\mathrm{vol}_n(A) = \prod_{k=1}^n (b_k - a_k).$$

We also defined volₙ(∅) = 0.

By Theorem 4.2, we know that volₙ is a finite measure on Jₙ.

Definitions. The maximal outer extension of volₙ is called the n-dimensional outer Lebesgue measure, and is denoted by λ∗ₙ.

The λ∗ₙ-measurable sets in Rⁿ will be called n-Lebesgue measurable. The σ-algebra $\mathfrak{m}_{\lambda_n^*}(\mathbb{R}^n)$ will be denoted simply by m(Rⁿ). The measure $\lambda_n^*|_{\mathfrak{m}(\mathbb{R}^n)}$ is simply denoted by λₙ, and is called the n-dimensional Lebesgue measure. Although this notation may appear to be confusing, it turns out (see Proposition 5.3) that λ∗ₙ is indeed the maximal outer extension of λₙ. In the case when n = 1, the subscript will be omitted.

We know (see Section 21) that

S(Jn) = Σ(Jn) = Bor(Rn).

Using the fact that the semiring Jₙ is σ-total in Rⁿ, by the definition of the outer Lebesgue measure, we have

$$\lambda_n^*(A) = \inf\Big\{ \sum_{k=1}^\infty \mathrm{vol}_n(B_k) : (B_k)_{k=1}^\infty \subset \mathcal{J}_n,\ \bigcup_{k=1}^\infty B_k \supset A \Big\}, \quad \forall A \subset \mathbb{R}^n. \tag{1}$$

Using Corollary 5.2, we have the equality

$$\mathfrak{m}(\mathbb{R}^n) = \overline{\mathrm{Bor}(\mathbb{R}^n)},$$

where $\overline{\mathrm{Bor}(\mathbb{R}^n)}$ is the completion of Bor(Rⁿ) with respect to the measure $\lambda_n|_{\mathrm{Bor}(\mathbb{R}^n)}$. This means that a subset A ⊂ Rⁿ is Lebesgue measurable, if and only if there exist a Borel set B and a negligible set N such that A = B ∪ N. (The fact that N is negligible means that λ∗ₙ(N) = 0, and is equivalent to the existence of a Borel set C ⊃ N with λₙ(C) = 0.)

Exercise 1. Let A = [a₁, b₁) × · · · × [aₙ, bₙ) be a half-open box in Rⁿ. Assume A ≠ ∅ (which means that a₁ < b₁, . . . , aₙ < bₙ). Consider the open box Int(A) and the closed box $\overline{A}$, which are given by

Int(A) = (a₁, b₁) × · · · × (aₙ, bₙ) and $\overline{A}$ = [a₁, b₁] × · · · × [aₙ, bₙ].

Prove the equalities

$$\lambda_n\big(\mathrm{Int}(A)\big) = \lambda_n\big(\overline{A}\big) = \mathrm{vol}_n(A).$$

Remarks 6.1. If D ⊂ Rⁿ is a non-empty open set, then λₙ(D) > 0. This is a consequence of the above exercise, combined with the fact that D contains at least one non-empty open box.

The Lebesgue measure of a countable subset C ⊂ Rⁿ is zero. Using σ-additivity, it suffices to prove this only in the case of singletons C = {x}. If we write x in coordinates x = (x₁, . . . , xₙ), and if we consider half-open boxes of the form

J_ε = [x₁, x₁ + ε) × · · · × [xₙ, xₙ + ε),

then the obvious inclusion {x} ⊂ J_ε will force

0 ≤ λₙ({x}) ≤ λₙ(J_ε) = εⁿ,

so taking the limit as ε → 0, we indeed get λₙ({x}) = 0.

The (outer) Lebesgue measure is completely determined by its values on open sets. More explicitly, one has the following result.

Proposition 6.1. Let n ≥ 1 be an integer. For every subset A ⊂ Rⁿ one has:

$$\lambda_n^*(A) = \inf\{ \lambda_n(D) : D \text{ open subset of } \mathbb{R}^n, \text{ with } D \supset A \}. \tag{2}$$

Proof. Throughout the proof the set A will be fixed. Let us denote, for simplicity, the right hand side of (2) by ν(A). First of all, since every open set is Lebesgue measurable (being Borel), we have λₙ(D) = λ∗ₙ(D), for all open sets D, so by the monotonicity of λ∗ₙ, we get the inequality

λ∗ₙ(A) ≤ ν(A).

We now prove the inequality λ∗ₙ(A) ≥ ν(A). Fix for the moment some ε > 0, and use (1) to get the existence of a sequence $(B_k)_{k=1}^\infty \subset \mathcal{J}_n$, such that $\bigcup_{k=1}^\infty B_k \supset A$, and

$$\sum_{k=1}^\infty \mathrm{vol}_n(B_k) < \lambda_n^*(A) + \varepsilon.$$

For every k ≥ 1, we write

$$B_k = [a_1^{(k)}, b_1^{(k)}) \times \dots \times [a_n^{(k)}, b_n^{(k)}),$$

so that $\mathrm{vol}_n(B_k) = \prod_{j=1}^n (b_j^{(k)} - a_j^{(k)})$. Using the obvious continuity of the map

$$\mathbb{R} \ni t \longmapsto \prod_{j=1}^n (b_j^{(k)} - a_j^{(k)} - t) \in \mathbb{R},$$

we can find, for each k ≥ 1, some numbers $c_1^{(k)} < a_1^{(k)}, \dots, c_n^{(k)} < a_n^{(k)}$, with

$$\prod_{j=1}^n (b_j^{(k)} - c_j^{(k)}) < \frac{\varepsilon}{2^k} + \prod_{j=1}^n (b_j^{(k)} - a_j^{(k)}). \tag{3}$$

Notice that, if we define the half-open boxes

$$E_k = [c_1^{(k)}, b_1^{(k)}) \times \dots \times [c_n^{(k)}, b_n^{(k)}),$$

then for every k ≥ 1, we clearly have Bₖ ⊂ Int(Eₖ), and by Exercise 1, combined with (3), we also have the inequality

$$\lambda_n\big(\mathrm{Int}(E_k)\big) = \mathrm{vol}_n(E_k) < \frac{\varepsilon}{2^k} + \mathrm{vol}_n(B_k).$$

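Summing up we then get

$$\sum_{k=1}^\infty \lambda_n\big(\mathrm{Int}(E_k)\big) < \sum_{k=1}^\infty \Big[ \frac{\varepsilon}{2^k} + \mathrm{vol}_n(B_k) \Big] = \varepsilon + \sum_{k=1}^\infty \mathrm{vol}_n(B_k) < 2\varepsilon + \lambda_n^*(A). \tag{4}$$

Now we observe that by σ-sub-additivity we have

$$\lambda_n\Big( \bigcup_{k=1}^\infty \mathrm{Int}(E_k) \Big) \le \sum_{k=1}^\infty \lambda_n\big(\mathrm{Int}(E_k)\big),$$

so if we define the open set $D = \bigcup_{k=1}^\infty \mathrm{Int}(E_k)$, then using (4) we get

$$\lambda_n(D) < 2\varepsilon + \lambda_n^*(A). \tag{5}$$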

It is clear that we have the inclusions

$$A \subset \bigcup_{k=1}^\infty B_k \subset \bigcup_{k=1}^\infty \mathrm{Int}(E_k) = D,$$

so by the definition of ν(A), combined with (5), we finally get

ν(A) ≤ λₙ(D) < 2ε + λ∗ₙ(A).

Up to this moment ε > 0 was fixed. Since the inequality ν(A) < 2ε + λ∗ₙ(A) holds for any ε > 0, however, we finally get the desired inequality ν(A) ≤ λ∗ₙ(A).
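
Comment. The enlargement step used in the above proof can be checked numerically in the case n = 1. The Python sketch below is only an illustration, not part of the argument: the function names, the three sample intervals, and the particular choice cₖ = aₖ − ε/2ᵏ⁺¹ are assumptions made here. It replaces each half-open interval [aₖ, bₖ) by the open interval (cₖ, bₖ), and records by how much the total length grows.

```python
from fractions import Fraction

def enlarge_to_open(intervals, eps):
    """Replace [a_k, b_k) by the open interval (c_k, b_k), with c_k = a_k - eps/2^(k+1)."""
    enlarged = []
    for k, (a, b) in enumerate(intervals, start=1):
        c = a - eps / 2 ** (k + 1)   # c_k < a_k, so [a_k, b_k) is contained in (c_k, b_k)
        enlarged.append((c, b))
    return enlarged

def total_length(intervals):
    """Sum of the lengths b - a (overlaps are counted with multiplicity)."""
    return sum(b - a for a, b in intervals)

eps = Fraction(1, 10)
boxes = [(Fraction(0), Fraction(1, 2)),
         (Fraction(1, 2), Fraction(1)),
         (Fraction(2), Fraction(3))]
open_cover = enlarge_to_open(boxes, eps)

# The k-th interval grows by eps/2^(k+1) < eps/2^k, so the total growth is below eps.
extra = total_length(open_cover) - total_length(boxes)
```

With exact rational arithmetic the growth is ε(1/4 + 1/8 + 1/16), which stays below ε, in agreement with inequality (4).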

The Lebesgue measure can also be recovered from its values on compact sets.

Proposition 6.2. Let n ≥ 1 be an integer. For every Lebesgue measurable subset A ⊂ Rⁿ one has:

$$\lambda_n(A) = \sup\{ \lambda_n(K) : K \text{ compact subset of } \mathbb{R}^n, \text{ with } K \subset A \}. \tag{6}$$

Proof. Let us denote, for simplicity, the right hand side of (6) by µ(A). First of all, by monotonicity we clearly have the inequality

λₙ(A) ≥ µ(A).

To prove the inequality λₙ(A) ≤ µ(A), we shall first use a reduction to the bounded case. For each integer k ≥ 1, we define the compact box

Bₖ = [−k, k] × · · · × [−k, k].

Notice that we have B₁ ⊂ B₂ ⊂ . . . , with $\bigcup_{k=1}^\infty B_k = \mathbb{R}^n$. We then have

B₁ ∩ A ⊂ B₂ ∩ A ⊂ . . . ,

with $\bigcup_{k=1}^\infty (B_k \cap A) = A$, so using the Continuity Lemma 4.1, we have

$$\lambda_n(A) = \lim_{k \to \infty} \lambda_n(B_k \cap A) = \sup\{ \lambda_n(B_k \cap A) : k \ge 1 \}. \tag{7}$$

Fix for the moment some ε > 0, and use (7) to find some k ≥ 1, such that λₙ(A) ≤ λₙ(Bₖ ∩ A) + ε. Apply Proposition 6.1 to the set Bₖ ∖ A, to find an open set D, with D ⊃ Bₖ ∖ A, and λₙ(Bₖ ∖ A) ≥ λₙ(D) − ε. On the one hand, we have

$$\lambda_n(B_k) = \lambda_n(B_k \cap A) + \lambda_n(B_k \setminus A) \ge \lambda_n(B_k \cap A) + \lambda_n(D) - \varepsilon \ge \lambda_n(B_k \cap A) + \lambda_n(B_k \cap D) - \varepsilon. \tag{8}$$

On the other hand, we have

λₙ(Bₖ) = λₙ(Bₖ ∖ D) + λₙ(Bₖ ∩ D),

so using (8) we get the inequality

λₙ(Bₖ ∖ D) + λₙ(Bₖ ∩ D) ≥ λₙ(Bₖ ∩ A) + λₙ(Bₖ ∩ D) − ε,

and since all numbers involved in the above inequality are finite, we conclude that

λₙ(Bₖ ∖ D) ≥ λₙ(Bₖ ∩ A) − ε ≥ λₙ(A) − 2ε.

Obviously the set K = Bₖ ∖ D is compact, with K ⊂ Bₖ ∩ A ⊂ A, so we have µ(A) ≥ λₙ(K), hence we get the inequality

µ(A) ≥ λₙ(A) − 2ε.

Since this is true for all ε > 0, the desired inequality µ(A) ≥ λₙ(A) follows.

Corollary 6.1. For a set A ⊂ Rⁿ, the following are equivalent:

(i) A is Lebesgue measurable;
(ii) there exist a negligible set N and a sequence $(K_j)_{j=1}^\infty$ of compact subsets of Rⁿ, such that

$$A = N \cup \bigcup_{j=1}^\infty K_j.$$

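Proof. (i) ⇒ (ii). Start by using the boxes

Bₖ = [−k, k] × · · · × [−k, k],

which have the property that $\bigcup_{k=1}^\infty B_k = \mathbb{R}^n$, so we get $A = \bigcup_{k=1}^\infty (B_k \cap A)$. Fix for the moment k. Apply Proposition 6.2 to find a sequence $(C^k_r)_{r=1}^\infty$ of compact subsets of Bₖ ∩ A, such that $\lim_{r \to \infty} \lambda_n(C^k_r) = \lambda_n(B_k \cap A)$. Consider the countable family $(C^k_r)_{k,r=1}^\infty$ of compact sets, and enumerate it as a sequence $(K_j)_{j=1}^\infty$, so that we have

$$\bigcup_{j=1}^\infty K_j = \bigcup_{k=1}^\infty \bigcup_{r=1}^\infty C^k_r.$$

If we define, for each k ≥ 1, the sets $E_k = \bigcup_{r=1}^\infty C^k_r \subset B_k \cap A$ and Nₖ = (Bₖ ∩ A) ∖ Eₖ, then, because of the inclusion $C^k_r \subset E_k \subset B_k \cap A$, we have the inequalities

$$0 \le \lambda_n(N_k) = \lambda_n(B_k \cap A) - \lambda_n(E_k) \le \lambda_n(B_k \cap A) - \lambda_n(C^k_r), \quad \forall r \ge 1. \tag{9}$$

Using the fact that

$$\lim_{r \to \infty} \lambda_n(C^k_r) = \lambda_n(B_k \cap A) \le \lambda_n(B_k) < \infty,$$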

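the inequalities (9) force λₙ(Nₖ) = 0, ∀ k ≥ 1. Now if we define the set $N = A \setminus \big( \bigcup_{j=1}^\infty K_j \big)$, we have

$$N = \bigcup_{k=1}^\infty \Big[ (B_k \cap A) \setminus \Big( \bigcup_{j=1}^\infty K_j \Big) \Big] = \bigcup_{k=1}^\infty \Big[ (B_k \cap A) \setminus \Big( \bigcup_{p=1}^\infty E_p \Big) \Big] \subset \bigcup_{k=1}^\infty \big[ (B_k \cap A) \setminus E_k \big] = \bigcup_{k=1}^\infty N_k,$$

which proves that λₙ(N) = 0.

The implication (ii) ⇒ (i) is trivial.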

Proposition 6.2 does not hold if A ⊂ Rⁿ is non-measurable. In fact the equality (6), with λₙ replaced by λ∗ₙ, essentially forces A to be measurable, as shown by the following.

Exercise 2. Let A ⊂ Rⁿ be an arbitrary subset, with λ∗ₙ(A) < ∞. Prove that the following are equivalent:

(i) A is Lebesgue measurable;
(ii) λ∗ₙ(A) = sup{λₙ(K) : K compact subset of Rⁿ, with K ⊂ A}.

Propositions 6.1 and 6.2 are regularity properties. The following terminology is useful:

Definitions. Suppose A is a σ-algebra on X, and µ is a measure on A. Suppose we have a sub-collection F ⊂ A.

(i) We say that µ is regular from below, with respect to F, if

$$\mu(A) = \sup\{ \mu(F) : F \subset A,\ F \in \mathcal{F} \}, \quad \forall A \in \mathcal{A}.$$

(ii) We say that µ is regular from above, with respect to F, if

$$\mu(A) = \inf\{ \mu(F) : F \supset A,\ F \in \mathcal{F} \}, \quad \forall A \in \mathcal{A}.$$

With this terminology, Proposition 6.1 gives the fact that the Lebesgue measure is regular from above with respect to open sets, while Proposition 6.2 gives the fact that the Lebesgue measure is regular from below with respect to compact sets.

Exercise 3. For a subset A ⊂ Rⁿ, prove that the following are equivalent:
(i) A is Lebesgue measurable;
(ii) There exist a sequence of compact sets $(K_j)_{j=1}^\infty$, and a sequence of open sets $(D_j)_{j=1}^\infty$, such that $\bigcup_{j=1}^\infty K_j \subset A \subset \bigcap_{j=1}^\infty D_j$, and the difference $\big( \bigcap_{j=1}^\infty D_j \big) \setminus \big( \bigcup_{j=1}^\infty K_j \big)$ is negligible.

Hint: For the implication (i) ⇒ (ii) analyze first the case when λ∗(A) < ∞. Then write A as a countable union of sets of finite outer measure.

In the one-dimensional case n = 1, the Lebesgue measure of open sets can be computed with the aid of the following result.

Proposition 6.3. For every open set D ⊂ R, there exists a countable (or finite) pair-wise disjoint collection $(J_i)_{i \in I}$ of open intervals with $D = \bigcup_{i \in I} J_i$.

Proof. For every point x ∈ D, we define

a_x = inf{a < x : (a, x) ⊂ D} and b_x = sup{b > x : (x, b) ⊂ D}.

(The fact that D is open guarantees that both sets above are non-empty.) It is clear that, for every x ∈ D, the open interval J_x = (a_x, b_x) is contained in D, so we have the equality $D = \bigcup_{x \in D} J_x$. The problem at this point is the fact that the collection $(J_x)_{x \in D}$ is not pair-wise disjoint. What we need to find is a countable (or finite) subset X ⊂ D, such that the sub-collection $(J_x)_{x \in X}$ is pair-wise disjoint, and we still have $D = \bigcup_{x \in X} J_x$. One way to do this is based on the following

Claim: For two points x, y ∈ D, the following are equivalent:
(i) x ∈ J_y;
(ii) J_x ⊃ J_y;
(iii) J_x ∩ J_y ≠ ∅;
(iv) J_x = J_y.

To prove the implication (i) ⇒ (ii) we observe that if x ∈ J_y, then a_y < x < b_y, so we have (a_y, x) ⊂ D and (x, b_y) ⊂ D, which means that a_x ≤ a_y and b_x ≥ b_y, therefore we have the inclusion J_x = (a_x, b_x) ⊃ (a_y, b_y) = J_y. The implication (ii) ⇒ (iii) is trivial. To prove (iii) ⇒ (iv), assume J_x ∩ J_y ≠ ∅, and pick a point z ∈ J_x ∩ J_y. Using the implication (i) ⇒ (ii) we have the inclusions J_z ⊃ J_x and J_z ⊃ J_y. In particular we have x ∈ J_z, so again using the implication (i) ⇒ (ii) we get J_x ⊃ J_z, which means that we have in fact the equality J_x = J_z. Likewise we have the equality J_y = J_z, so (iv) follows. The implication (iv) ⇒ (i) is trivial.

Going back to the proof of the Proposition, we now see that, using the fact that any open interval contains a rational number, if we put X₀ = D ∩ Q, then for any y ∈ D, there exists x ∈ X₀, such that J_x = J_y. This gives the equality $D = \bigcup_{x \in X_0} J_x$, this time with the indexing set X₀ countable. Finally, we equip the set X₀ with the equivalence relation

x ∼ y ⇐⇒ J_x = J_y,

and we choose X ⊂ X₀ to be a complete set of representatives for the equivalence classes. This means that, for every y ∈ X₀, there exists a unique x ∈ X with J_x = J_y. It is clear now that we still have $D = \bigcup_{x \in X} J_x$, but now if x, x′ ∈ X are such that x ≠ x′, then x ≁ x′, so we have J_x ≠ J_x′, which by the Claim gives J_x ∩ J_x′ = ∅.

Comments. When we want to compute the Lebesgue measure of an open set D ⊂ R, we should first try to write $D = \bigcup_{i \in I} J_i$ with $(J_i)_{i \in I}$ a countable (or finite) pair-wise disjoint collection of open intervals. If we succeed, then we would have

$$\lambda(D) = \sum_{i \in I} \lambda(J_i).$$

For intervals (open or not) the Lebesgue measure is the same as the length.

There are instances when we can manage only to write a given open set D as a union $D = \bigcup_{k=1}^\infty J_k$, with the J's not necessarily disjoint. In that case we can only get the estimate

$$\lambda(D) \le \sum_{k=1}^\infty \lambda(J_k).$$
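
For a finite union of open intervals, the two situations described above can be compared directly. The following Python sketch is an illustration under assumptions made here (the function name and the sample intervals are hypothetical): it merges overlapping open intervals into the pair-wise disjoint intervals provided by Proposition 6.3, and compares the exact measure with the estimate obtained by summing all the lengths.

```python
def disjoint_components(intervals):
    """Merge finitely many open intervals (a, b) into the pair-wise disjoint
    open intervals having the same union."""
    components = []
    for a, b in sorted(intervals):
        if components and a < components[-1][1]:
            # (a, b) meets the last component, so that component is extended.
            last_a, last_b = components[-1]
            components[-1] = (last_a, max(last_b, b))
        else:
            # Open intervals such as (5, 6) and (6, 7) do not meet: 6 is in neither.
            components.append((a, b))
    return components

J = [(0, 2), (1, 3), (5, 6), (6, 7)]
components = disjoint_components(J)

exact_measure = sum(b - a for a, b in components)   # lambda(D), disjoint case
sum_of_lengths = sum(b - a for a, b in J)           # upper estimate only
```

Here the intervals (0, 2) and (1, 3) overlap, so the sum of the lengths strictly exceeds λ(D).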

Example 6.1. Consider the ternary Cantor set K₃ ⊂ [0, 1], discussed in III.3. We know (see Remarks 3.5) that one can find a pair-wise disjoint sequence $(D_n)_{n=0}^\infty$ of open subsets of (0, 1) such that $K_3 = [0, 1] \setminus \bigcup_{n=0}^\infty D_n$, and such that, for each n ≥ 0, the open set Dₙ is a disjoint union of 2ⁿ intervals of length 1/3ⁿ⁺¹. In particular, this means that λ(Dₙ) = 2ⁿ/3ⁿ⁺¹, so

$$\lambda(K_3) = \lambda\big([0, 1]\big) - \lambda\Big( \bigcup_{n=0}^\infty D_n \Big) = 1 - \sum_{n=0}^\infty \lambda(D_n) = 1 - \sum_{n=0}^\infty \frac{2^n}{3^{n+1}} = 0.$$

What is interesting here (see Remarks 3.5) is the fact that card K₃ = c.
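
The geometric series in the above computation can be checked with exact rational arithmetic. The Python sketch below is only a sanity check (the function names are chosen here for illustration): after removing D₀, . . . , D_{N−1} from [0, 1], the remaining set has measure (2/3)ᴺ, which tends to 0.

```python
from fractions import Fraction

def removed_measure(N):
    """Total length of D_0, ..., D_{N-1}, where D_n consists of
    2^n disjoint intervals of length 1/3^(n+1)."""
    return sum(Fraction(2 ** n, 3 ** (n + 1)) for n in range(N))

def remaining_measure(N):
    """Measure of [0, 1] with D_0, ..., D_{N-1} removed."""
    return 1 - removed_measure(N)

# Partial sums of the series: the remaining measure is exactly (2/3)^N.
remaining = [remaining_measure(N) for N in range(6)]
```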

Remark 6.2. An interesting consequence of the above computation is the fact that all subsets of K₃ are Lebesgue measurable, i.e. one has the inclusion P(K₃) ⊂ m(R). This gives the inequality

$$\mathrm{card}\, \mathfrak{m}(\mathbb{R}) \ge \mathrm{card}\, \mathcal{P}(K_3) = 2^{\mathrm{card}\, K_3} = 2^{\mathfrak{c}}.$$

Since we also have m(R) ⊂ P(R), we get

$$\mathrm{card}\, \mathfrak{m}(\mathbb{R}) \le \mathrm{card}\, \mathcal{P}(\mathbb{R}) = 2^{\mathrm{card}\, \mathbb{R}} = 2^{\mathfrak{c}},$$

so using the Cantor-Bernstein Theorem we get the equality

$$\mathrm{card}\, \mathfrak{m}(\mathbb{R}) = 2^{\mathfrak{c}}.$$

We also know (see Corollary 2.5) that card Bor(R) = c.

As a consequence of this difference in cardinalities, one gets the fact that we have a strict inclusion

$$\mathrm{Bor}(\mathbb{R}) \subsetneq \mathfrak{m}(\mathbb{R}). \tag{10}$$

Later on we shall construct (more or less) explicitly a Lebesgue measurable set which is not Borel.

Exercise 4. The strict inclusion (10) holds also if R is replaced with Rⁿ, with n ≥ 2. In this case, instead of using Cantor sets, one can proceed as follows. Consider the set S = Rⁿ⁻¹ × {0}. Prove that λₙ(S) = 0. Conclude that card m(Rⁿ) = 2^c.

One key feature of the Lebesgue (outer) measure is the translation invariance property, described in the following result. To formulate it we introduce the following notation. For an integer n ≥ 1, a point x ∈ Rⁿ, and a subset A ⊂ Rⁿ, we define the set

A + x = {a + x : a ∈ A}.

Remark that the map Θ_x : Rⁿ ∋ a ⟼ a + x ∈ Rⁿ is a homeomorphism. In particular, both Θ_x and Θ_x⁻¹ = Θ_{−x} are Borel measurable, which means that, for a set A ⊂ Rⁿ, one has the equivalence

A ∈ Bor(Rⁿ) ⇐⇒ A + x ∈ Bor(Rⁿ).

Proposition 6.4. Let n ≥ 1 be an integer. For any set A ⊂ Rⁿ and any point x ∈ Rⁿ one has the equality

λ∗ₙ(A + x) = λ∗ₙ(A).

Proof. Fix A and x. First remark that, for every half-open box B ∈ Jₙ, its translation B + x is again a half-open box, and we have the equality

volₙ(B + x) = volₙ(B).

Fix for the moment ε > 0, and choose a sequence $(B_k)_{k=1}^\infty \subset \mathcal{J}_n$, such that $A \subset \bigcup_{k=1}^\infty B_k$, and

$$\sum_{k=1}^\infty \mathrm{vol}_n(B_k) \le \lambda_n^*(A) + \varepsilon.$$

Then, using the obvious inclusion $A + x \subset \bigcup_{k=1}^\infty (B_k + x)$, by the remark made at the beginning of the proof, combined with the monotonicity and the σ-sub-additivity of the outer Lebesgue measure, we have

$$\lambda_n^*(A + x) \le \lambda_n^*\Big( \bigcup_{k=1}^\infty (B_k + x) \Big) \le \sum_{k=1}^\infty \lambda_n^*(B_k + x) = \sum_{k=1}^\infty \mathrm{vol}_n(B_k + x) = \sum_{k=1}^\infty \mathrm{vol}_n(B_k) \le \lambda_n^*(A) + \varepsilon.$$

Since the inequality λ∗ₙ(A + x) ≤ λ∗ₙ(A) + ε holds for all ε > 0, we get

λ∗ₙ(A + x) ≤ λ∗ₙ(A).

The other inequality follows from the above one applied to the set A + x and the translation by −x.

Corollary 6.2. For a subset A ⊂ Rn, one has the equivalence

A ∈m(Rn) ⇐⇒ A+ x ∈m(Rn).

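Proof. Assume A ∈ m(Rⁿ), and write A = B ∪ N, with B Borel, and N negligible. Then we have A + x = (B + x) ∪ (N + x). The set B + x is Borel. By the above result we have λ∗ₙ(N + x) = λ∗ₙ(N) = 0, i.e. N + x is negligible. Therefore A + x is Lebesgue measurable. The converse implication follows by applying the same argument to the set A + x and the translation by −x.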

As we have seen, the fact that there exist Lebesgue measurable sets that are not Borel is explained by the difference in cardinalities. Since card m(Rⁿ) = 2^c = card P(Rⁿ), it is legitimate to ask whether the inclusion m(Rⁿ) ⊂ P(Rⁿ) is strict. In other words, do there exist sets that are not Lebesgue measurable? The answer is affirmative, as discussed in the following.

Example 6.2. Equip R with the equivalence relation

x ∼ y ⇐⇒ x − y ∈ Q.

Denote by R/Q the quotient space (this is in fact the quotient group of (R, +) with respect to the subgroup Q), and denote by π : R → R/Q the quotient map. Since for every x ∈ R, one can find some y ∼ x, with y ∈ [0, 1), it follows that the map π|[0,1) : [0, 1) → R/Q is surjective. Choose then a map φ : R/Q → [0, 1), such that π ∘ φ = Id, and put E = φ(R/Q). The set E is a complete set of representatives for the equivalence relation ∼. In other words, E ⊂ [0, 1) has the property that, for every x ∈ R, there exists exactly one element y ∈ E, with x ∼ y. In particular, the collection of sets $(E + q)_{q \in \mathbb{Q}}$ is pair-wise disjoint, and satisfies $\bigcup_{q \in \mathbb{Q}} (E + q) = \mathbb{R}$. Using σ-sub-additivity, we get

$$\infty = \lambda(\mathbb{R}) \le \sum_{q \in \mathbb{Q}} \lambda^*(E + q).$$

Since (by Proposition 6.4) we have λ∗(E + q) = λ∗(E), the above inequality forces λ∗(E) > 0.

Claim: The set E is not Lebesgue measurable.

Assume E is Lebesgue measurable. If we define the set X = Q ∩ [0, 1), then the sets E + q, q ∈ X, are pair-wise disjoint. On the one hand, the measurability of E, combined with Corollary 6.2, would imply the measurability of the set $S = \bigcup_{q \in X} (E + q)$. On the other hand, the equalities λ(E + q) = λ(E) > 0 would force λ(S) = ∞. But this is impossible, since we obviously have S ⊂ [0, 2), which forces λ(S) ≤ 2.

Exercise 5. Let E ∈ m(Rⁿ). Prove that the map

Rⁿ ∋ x ⟼ λ(E ∪ (E + x)) ∈ [0, ∞]

is continuous.

Hint: Analyze first the case when E is compact. In this particular case, show that for every x₀ ∈ Rⁿ and every open set D ⊃ E ∪ (E + x₀), there exists some neighborhood V of x₀, such that

D ⊃ E ∪ (E + x), ∀ x ∈ V.

Use then regularity from above, combined with the inequality⁹

|λ(A) − λ(B)| ≤ λ(A △ B), for all A, B ∈ m(Rⁿ), with λ(A), λ(B) < ∞.

In the general case, use regularity from below. (The case λ(E) = ∞ is trivial.)

Exercise 6. Let E ∈ m(Rⁿ) be such that λₙ(E) > 0. Prove that the set

E − E = {x − y : x, y ∈ E}

is a neighborhood of 0.

Hint: Assume the contrary, which means that there exists a sequence $(x_p)_{p=1}^\infty \subset \mathbb{R}^n \setminus (E - E)$, with $\lim_{p \to \infty} x_p = 0$. This will force E ∩ (E + xₚ) = ∅, ∀ p ≥ 1. Use the preceding Exercise to get a contradiction.

We are now in position to construct a Lebesgue measurable set which is not Borel.

Example 6.3. In Section 3 we discussed the compact space T = {0, 1}^ℵ₀ and the maps

$$\phi_r : T \ni (\alpha_n)_{n=1}^\infty \longmapsto (r - 1) \sum_{n=1}^\infty \frac{\alpha_n}{r^n} \in [0, 1].$$

For each r ≥ 2 the map φᵣ : T → [0, 1] is continuous, so the set Kᵣ = φᵣ(T) is compact. We have K₂ = [0, 1], and K₃ is the ternary Cantor set. We also know (see Theorem 3.5) that, for a set A ⊂ T, one has the equivalence

(11) A ∈ Bor(T) ⇐⇒ φᵣ(A) ∈ Bor(Kᵣ).

Choose now a set E ⊂ [0, 1] which is not Lebesgue measurable. In particular, E is not Borel, so E ∉ Bor([0, 1]). Since φ₂ : T → [0, 1] is surjective, by (11) it follows that the set A = φ₂⁻¹(E) is not in Bor(T). Again, by (11) it follows that the set S = φ₃(A) is not in Bor(K₃). Since

Bor(K₃) = Bor(R)|K₃,

this gives S ∉ Bor(R). Notice however that since S ⊂ K₃, it follows that S is Lebesgue measurable.
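
To get a feeling for the maps φᵣ used above, one can evaluate them on sequences with only finitely many non-zero terms. The Python sketch below is an illustration only (the function name `phi` and the sample sequence are assumptions made here): the same 0-1 sequence is sent by φ₂ to a number with the corresponding binary expansion, and by φ₃ to a point whose ternary expansion uses only the digits 0 and 2, hence a point of the Cantor set K₃.

```python
from fractions import Fraction

def phi(r, alpha):
    """Value of phi_r on the 0-1 sequence that starts with `alpha` and is
    zero afterwards: (r - 1) * sum over n >= 1 of alpha_n / r^n."""
    return (r - 1) * sum(Fraction(a, r ** n) for n, a in enumerate(alpha, start=1))

alpha = (1, 0, 1)
binary_value = phi(2, alpha)    # 1/2 + 1/8, binary expansion 0.101
ternary_value = phi(3, alpha)   # 2/3 + 2/27, ternary expansion 0.202
```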

Comment. When one wants to prove that a Lebesgue measurable set M ⊂ R has positive measure, a sufficient condition for this property is that Int(M) ≠ ∅ (see Remark 6.1). It turns out however that this condition is not necessary, as seen from the following:

9 This inequality holds for any additive map defined on a ring.


Exercise 7. Start with the interval [0, 1], and list all rational numbers in [0, 1] as a sequence $\mathbb{Q} \cap [0, 1] = \{x_n\}_{n=1}^\infty$. Fix some ε > 0, and consider the open set

$$D = \bigcup_{n=1}^\infty \Big( x_n - \frac{\varepsilon}{2^{n+1}},\ x_n + \frac{\varepsilon}{2^{n+1}} \Big).$$

Consider the compact set K = [0, 1] ∖ D.
(i) Prove that λ(D) ≤ ε.
(ii) Prove that λ(K) ≥ 1 − ε.
(iii) Prove that Int(K) = ∅.

Hint: For (iii) use the fact that K ∩ Q = ∅.
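
The estimate in part (i) can be explored numerically for finitely many of the intervals. The Python sketch below is only a sanity check, not a solution: the enumeration of the rationals by increasing denominators, the function names, and the values ε = 1/4 and 200 terms are all assumptions made here. It compares the exact measure of the union of the first 200 intervals with the sum of their lengths.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    found = [Fraction(0), Fraction(1)]
    q = 2
    while len(found) < count:
        found.extend(Fraction(p, q) for p in range(1, q) if gcd(p, q) == 1)
        q += 1
    return found[:count]

def union_length(intervals):
    """Exact Lebesgue measure of a finite union of open intervals."""
    total, right_end = Fraction(0), None
    for a, b in sorted(intervals):
        if right_end is None or a >= right_end:
            total += b - a              # new component
            right_end = b
        elif b > right_end:
            total += b - right_end      # only the part not yet covered
            right_end = b
    return total

eps = Fraction(1, 4)
xs = rationals_in_unit_interval(200)
cover = [(x - eps / 2 ** (n + 1), x + eps / 2 ** (n + 1))
         for n, x in enumerate(xs, start=1)]
sum_of_lengths = sum(b - a for a, b in cover)
measure_of_union = union_length(cover)
```

The n-th interval has length ε/2ⁿ, so the sum of the lengths stays below ε no matter how many intervals are taken, and the measure of the union cannot exceed that sum.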

Exercise 8*. Prove that, for every non-empty open set D ⊂ R, and any two positive numbers α, β with α + β < λ(D), there exist compact sets A, B ⊂ D, with λ(A) > α, λ(B) > β, such that A ∩ B = ∅ and (A ∪ B) ∩ Q = ∅.

Hint: Write D as a union of a pair-wise disjoint sequence $(J_n)_{n=1}^\infty$ of open intervals, so that $\lambda(D) = \sum_{n=1}^\infty \lambda(J_n)$. Find then two sequences $(\alpha_n)_{n=1}^\infty$ and $(\beta_n)_{n=1}^\infty$ of positive numbers, such that $\sum_{n=1}^\infty \alpha_n > \alpha$, $\sum_{n=1}^\infty \beta_n > \beta$, and αₙ + βₙ < λ(Jₙ), for all n ≥ 1. This reduces essentially the problem to the case when D is an open interval, for which one can use the construction outlined in Exercise 7.

Exercise 9*. Construct a Borel set A ⊂ R, such that, for every open interval I ⊂ R one has λ(I ∩ A) > 0 and λ(I ∖ A) > 0.

Hints: List all open intervals with rational endpoints as a sequence $(I_n)_{n=1}^\infty$. Start off (use Exercise 8) by choosing two compact sets A₁, B₁ ⊂ I₁, with A₁ ∩ B₁ = ∅, (A₁ ∪ B₁) ∩ Q = ∅, and λ(A₁), λ(B₁) > 0. Use Exercise 5 to construct two sequences $(A_n)_{n=1}^\infty$ and $(B_n)_{n=1}^\infty$ of compact sets, such that, for all n ≥ 1 we have: (i) Aₙ ∩ Bₙ = ∅; (ii) (Aₙ ∪ Bₙ) ∩ Q = ∅; (iii) λ(Aₙ), λ(Bₙ) > 0; (iv) $A_{n+1} \cup B_{n+1} \subset I_{n+1} \setminus \big[ \bigcup_{k=1}^n (A_k \cup B_k) \big]$. Put $A = \bigcup_{n=1}^\infty A_n$ and $B = \bigcup_{n=1}^\infty B_n$. Notice that A ∩ B = ∅, λ(A), λ(B) > 0, and λ(A ∩ Iₙ), λ(B ∩ Iₙ) > 0, ∀ n ≥ 1.

In the remainder of this section we discuss some applications of the Lebesgue measure to the theory of Riemann integration. The following technical result will be very useful.

Lemma 6.1. Let f : [a, b] → R be a non-negative Riemann integrable function, let A, B ⊂ [a, b] be two disjoint sets, with A ∪ B = [a, b]. Then one has the estimates

$$\lambda^*(A) \cdot \inf_{z \in A} f(z) \le \int_a^b f(t)\, dt \le (b - a) \cdot \sup_{x \in A} f(x) + \lambda^*(B) \cdot \sup_{y \in B} f(y).$$

Proof. Define the numbers

$$\alpha = \sup_{x \in A} f(x), \quad \beta = \sup_{y \in B} f(y), \quad \text{and} \quad \gamma = \inf_{z \in A} f(z).$$

Recall first that, if for each partition ∆ = (a = x₀ < x₁ < · · · < xₙ = b) of [a, b], we define the lower and the upper Darboux sums of f with respect to ∆:

$$L(\Delta, f) = \sum_{k=1}^n (x_k - x_{k-1}) \cdot \inf_{t \in [x_{k-1}, x_k]} f(t),$$

$$U(\Delta, f) = \sum_{k=1}^n (x_k - x_{k-1}) \cdot \sup_{t \in [x_{k-1}, x_k]} f(t),$$

then one has the equalities

$$\int_a^b f(t)\, dt = \sup\{ L(\Delta, f) : \Delta \text{ partition of } [a, b] \} = \inf\{ U(\Delta, f) : \Delta \text{ partition of } [a, b] \}. \tag{12}$$

Fix now a partition ∆ = (a = x₀ < x₁ < · · · < xₙ = b) of [a, b], and define the set

S = {k ∈ {1, . . . , n} : [xₖ₋₁, xₖ] ∩ A ≠ ∅}.

It is clear that

$$\inf_{x \in [x_{k-1}, x_k]} f(x) \le \alpha, \quad \sup_{x \in [x_{k-1}, x_k]} f(x) \ge \gamma, \quad \forall k \in S,$$

$$\inf_{y \in [x_{k-1}, x_k]} f(y) \le \beta, \quad \sup_{y \in [x_{k-1}, x_k]} f(y) \ge 0, \quad \forall k \in \{1, \dots, n\} \setminus S,$$

so we get

$$L(\Delta, f) \le \Big[ \sum_{k \in S} (x_k - x_{k-1}) \Big] \cdot \alpha + \Big[ \sum_{k \notin S} (x_k - x_{k-1}) \Big] \cdot \beta, \tag{13}$$

$$U(\Delta, f) \ge \Big[ \sum_{k \in S} (x_k - x_{k-1}) \Big] \cdot \gamma. \tag{14}$$

Consider now the sets

$$M = \bigcup_{k \in S} [x_{k-1}, x_k] \quad \text{and} \quad N = \bigcup_{k \notin S} [x_{k-1}, x_k].$$

Since the intervals involved in both M and N have at most singleton overlaps, it follows that we have the equalities

$$\sum_{k \in S} (x_k - x_{k-1}) = \lambda(M) \quad \text{and} \quad \sum_{k \notin S} (x_k - x_{k-1}) = \lambda(N),$$

so the estimates (13) and (14) read

$$L(\Delta, f) \le \lambda(M) \cdot \alpha + \lambda(N) \cdot \beta, \tag{15}$$

$$U(\Delta, f) \ge \lambda(M) \cdot \gamma. \tag{16}$$

Since we clearly have A ⊂ M ⊂ [a, b] and N ⊂ B, we have the inequalities

λ∗(A) ≤ λ(M) ≤ b − a and λ(N) ≤ λ∗(B),

so the inequalities (15) and (16) give

L(∆, f) ≤ (b − a) · α + λ∗(B) · β and U(∆, f) ≥ λ∗(A) · γ.

Since ∆ is arbitrary, the desired inequality then follows from (12).
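
The Darboux sums recalled in the above proof are easy to compute exactly for a monotone function on a uniform partition. The Python sketch below is an illustration under assumptions made here (the function names, the uniform partition, and the test function t ↦ t² on [0, 1] are not taken from the text): for an increasing function the infimum on each subinterval is attained at the left endpoint and the supremum at the right endpoint.

```python
from fractions import Fraction

def darboux_sums(f, a, b, n):
    """Lower and upper Darboux sums of an increasing function f with respect
    to the uniform partition of [a, b] into n subintervals."""
    xs = [a + (b - a) * Fraction(k, n) for k in range(n + 1)]
    # For increasing f: inf on [x_{k-1}, x_k] is f(x_{k-1}), sup is f(x_k).
    lower = sum((xs[k] - xs[k - 1]) * f(xs[k - 1]) for k in range(1, n + 1))
    upper = sum((xs[k] - xs[k - 1]) * f(xs[k]) for k in range(1, n + 1))
    return lower, upper

def square(t):
    return t * t

lower, upper = darboux_sums(square, Fraction(0), Fraction(1), 100)
gap = upper - lower   # equals (f(1) - f(0)) / n for an increasing f
```

In agreement with (12), the two sums bracket the value 1/3 of the integral, and the gap between them shrinks like 1/n.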

One application of the above result is the following.

Proposition 6.5. If f : [a, b] → R is Riemann integrable, and the set

N = {x ∈ [a, b] : f(x) ≠ 0}

is negligible, then

$$\int_a^b f(x)\, dx = 0. \tag{17}$$

Proof. Since f is bounded, there exists some constant C > 0, such that the Riemann integrable functions C + f and C − f are both non-negative. Apply Lemma 6.1 to these two functions with A = [a, b] ∖ N and B = N. Since f|[a,b]∖N = 0, we get (C ± f)|[a,b]∖N = C, so, using λ∗(N) = 0, we get

$$\int_a^b [C \pm f(x)]\, dx \le (b - a) \cdot C,$$

which yields

$$\pm \int_a^b f(x)\, dx = \int_a^b \big( [C \pm f(x)] - C \big)\, dx = \int_a^b [C \pm f(x)]\, dx - (b - a) \cdot C \le 0,$$

from which (17) immediately follows.

In order to make the exposition a bit easier to follow, it will be helpful to introduce the following

Convention. Given two functions f₁, f₂ : [a, b] → R, and a relation r on R (in our case r will be either “=,” or “≥,” or “≤”), we write

f₁ r f₂, a.e.

if the set

A = {x ∈ [a, b] : f₁(x) r f₂(x)}

has negligible complement in [a, b], i.e. λ∗([a, b] ∖ A) = 0. The abbreviation “a.e.” stands for “almost everywhere.”

For example, using this convention, Proposition 6.5 reads: if f : [a, b] → R is Riemann integrable, and f = 0, a.e., then $\int_a^b f(x)\, dx = 0$.

Exercise 10. A. Prove that “= a.e.” is an equivalence relation, and “≥ a.e.” and “≤ a.e.” are transitive relations on the collection of all functions [a, b] → R.

B. Prove that f₁ ≥ f₂, a.e. and f₁ ≤ f₂, a.e. imply f₁ = f₂, a.e.

C. Prove that these relations are compatible with the arithmetic operations, in the exact way as their “honest” versions. For example, if r is one of “=,” or “≥,” or “≤”, and if f₁ r f₂, a.e. and g₁ r g₂, a.e., then (f₁ + g₁) r (f₂ + g₂), a.e.

Exercise 11. Let f, g : [a, b] → R be continuous functions, such that f ≥ g, a.e. Prove that f ≥ g.

Exercise 12. Let f : [a, b] → R be a non-negative Riemann integrable function, with $\int_a^b f(x)\, dx = 0$. Prove that f = 0, a.e.

Comment. Riemann integrability is quite a rigid condition. For example the characteristic function $\kappa_{\mathbb{Q} \cap [a, b]}$ of the set of rational numbers in [a, b] is not Riemann integrable. By the above result however, we can introduce a slightly weaker notion, which will make such functions integrable, in a weaker sense. This will be a first “improvement” of the Riemann integration theory. Eventually (see Chapter IV), a more sophisticated theory - the Lebesgue integral - will emerge.

Definition. We say that a function f : [a, b] → R is almost Riemann integrable, if there exists a Riemann integrable function g : [a, b] → R, with f = g, a.e. Of course, such a g is not unique. Notice however that, if h : [a, b] → R is another Riemann integrable function, with f = h, a.e., then g = h, a.e., so by Proposition 6.5, applied to the function g − h, we immediately get the equality

$$\int_a^b g(x)\, dx = \int_a^b h(x)\, dx.$$

This observation shows that we can unambiguously define

$${}^{\approx}\!\!\int_a^b f(x)\, dx = \int_a^b g(x)\, dx.$$

Example 6.4. Consider the function $f = \kappa_{\mathbb{Q} \cap [a, b]}$. Since Q ∩ [a, b] is negligible, we have f = 0, a.e. So f is almost Riemann integrable (although it is not Riemann integrable), and we have

$${}^{\approx}\!\!\int_a^b f(x)\, dx = 0.$$

We now focus our attention on (honest) Riemann integrability, with an eye on the role played by continuity. For a function f : [a, b] → R we define the set

D_f = {x ∈ [a, b] : f not continuous at x}.

It is well-known that continuous functions are Riemann integrable. There are discontinuous functions which are still Riemann integrable; for instance we know that

(18) D_f finite =⇒ f Riemann integrable.

Notations. Let f : [a, b] → R be a bounded function. Suppose ∆ = (a = x₀ < x₁ < · · · < xₙ = b) is a partition. For each k ∈ {1, . . . , n} we consider the numbers

$$M_k = \sup_{t \in [x_{k-1}, x_k]} f(t) \quad \text{and} \quad m_k = \inf_{t \in [x_{k-1}, x_k]} f(t),$$

and we define the functions

$$f_\Delta = m_1 \cdot \kappa_{[x_0, x_1]} + m_2 \cdot \kappa_{(x_1, x_2]} + \dots + m_n \cdot \kappa_{(x_{n-1}, x_n]},$$

$$f^\Delta = M_1 \cdot \kappa_{[x_0, x_1]} + M_2 \cdot \kappa_{(x_1, x_2]} + \dots + M_n \cdot \kappa_{(x_{n-1}, x_n]}.$$

Clearly the functions $f_\Delta$ and $f^\Delta$ have only finitely many points of discontinuity, so they are Riemann integrable.

With these notations we have the following

Proposition 6.6. For a bounded function f : [a, b] → R, the following are equivalent:
(i) f is Riemann integrable;
(ii) $\inf\big\{ \int_a^b [f^\Delta(x) - f_\Delta(x)]\, dx : \Delta \text{ partition of } [a, b] \big\} = 0$;
(iii) there exists a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of [a, b], with ∆₁ ⊂ ∆₂ ⊂ . . . , and $\lim_{p \to \infty} \int_a^b [f^{\Delta_p}(x) - f_{\Delta_p}(x)]\, dx = 0$.

Proof. From the definition of Riemann integrability, we know that (i) is equivalent to any of the following two conditions:
(ii') $\inf\{ U(\Delta,f) - L(\Delta,f) : \Delta \text{ partition of } [a,b] \} = 0$;
(iii') there exists a sequence (∆_p)_{p=1}^∞ of partitions of [a, b], with ∆_1 ⊂ ∆_2 ⊂ . . . , and $\lim_{p\to\infty} [U(\Delta_p,f) - L(\Delta_p,f)] = 0$.

Then the Proposition follows immediately from the fact that, for every partition ∆ one has the equalities

\[ \int_a^b f_\Delta(x)\,dx = L(\Delta,f) \quad\text{and}\quad \int_a^b f^\Delta(x)\,dx = U(\Delta,f). \]
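As a small numerical illustration of condition (iii), the following sketch computes the lower and upper Darboux sums of a non-decreasing function along nested dyadic partitions. The function `square`, the helper names, and the choice of dyadic partitions are assumptions made only for this example; monotonicity is used so that the infimum and supremum on each subinterval are attained at its endpoints and can be computed exactly.

```python
from fractions import Fraction


def darboux_sums(f, partition):
    """Lower and upper Darboux sums of a non-decreasing f over a partition.

    For a non-decreasing f, the infimum m_k and supremum M_k on
    [x_{k-1}, x_k] are f(x_{k-1}) and f(x_k), so no sampling is needed.
    """
    lower = Fraction(0)
    upper = Fraction(0)
    for left, right in zip(partition, partition[1:]):
        width = right - left
        lower += f(left) * width   # m_k * (x_k - x_{k-1})
        upper += f(right) * width  # M_k * (x_k - x_{k-1})
    return lower, upper


def dyadic_partition(a, b, p):
    """Partition of [a, b] into 2**p equal pieces; these are nested in p."""
    pieces = 2 ** p
    return [a + (b - a) * Fraction(k, pieces) for k in range(pieces + 1)]


def square(x):
    """Hypothetical test function, non-decreasing on [0, 1]."""
    return x * x


gaps = []
for p in range(1, 6):
    lower, upper = darboux_sums(square, dyadic_partition(Fraction(0), Fraction(1), p))
    gaps.append(upper - lower)  # equals the integral of f^Delta - f_Delta

print(gaps)
```

For a monotone function on equal pieces the gap U(∆, f) − L(∆, f) telescopes to (f(b) − f(a))·|∆|, which is why it tends to zero along this sequence.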

The following result gives a complete description of the relationship between Riemann integrability and continuity.

Theorem 6.1 (Lebesgue’s criterion for Riemann integrability). Let f : [a, b] → R be a bounded function. The following are equivalent:
(i) f is Riemann integrable;
(ii) the discontinuity set D_f is neglijeable.

Proof. (i) ⇒ (ii). Assume f is Riemann integrable. Using Proposition 6.6, there exists a sequence (∆_p)_{p=1}^∞ of partitions of [a, b], such that ∆_1 ⊂ ∆_2 ⊂ . . . and

\[ \lim_{p\to\infty} \int_a^b [f^{\Delta_p}(x) - f_{\Delta_p}(x)]\,dx = 0. \]

Notice that

\[ f^{\Delta_1} \ge f^{\Delta_2} \ge f^{\Delta_3} \ge \cdots \ge f \ge \cdots \ge f_{\Delta_3} \ge f_{\Delta_2} \ge f_{\Delta_1}. \tag{19} \]

Define the Riemann integrable functions h_p = f^{∆_p} − f_{∆_p}, p ∈ N. We then clearly have
(α) h_p ≥ h_{p+1} ≥ 0, ∀ p ∈ N;
(β) $\lim_{p\to\infty} \int_a^b h_p(x)\,dx = 0$.

Using (α) we can define the function h : [a, b] → R by

\[ h(x) = \lim_{p\to\infty} h_p(x), \quad \forall\, x \in [a,b]. \]

Claim 1: The set N = {x ∈ [a, b] : h(x) ≠ 0} is neglijeable.

First of all, the functions h_p are all Lebesgue measurable. Secondly, since h is a point-wise limit of a sequence of Lebesgue measurable functions, it follows (see Theorem 3.2) that h itself is Lebesgue measurable. In particular N is Lebesgue measurable. For every integer j ≥ 1, define

\[ N_j = \{ x \in [a,b] : h(x) > \tfrac{1}{j} \}, \]

so that the sets N_j, j ≥ 1, are again Lebesgue measurable, and N = ⋃_{j=1}^∞ N_j. In order to prove that N is neglijeable, it then suffices to prove that λ(N_j) = 0, for all j ≥ 1. Fix for the moment j ≥ 1. Since h_p ≥ h ≥ 0, it follows that

\[ \inf_{x \in N_j} h_p(x) \ge \frac{1}{j}, \quad \forall\, p \ge 1, \]

so by Lemma 6.1 we get the inequality

\[ \frac{\lambda(N_j)}{j} \le \int_a^b h_p(x)\,dx, \quad \forall\, p \ge 1, \]

so by (β) we indeed get λ(N_j) = 0.

Define the set S = ⋃_{p=1}^∞ ∆_p.

Claim 2: If y ∈ [a, b] ∖ (N ∪ S), then f is continuous at y.

Fix y ∈ [a, b] ∖ (N ∪ S). In order to prove that f is continuous at y, we must find, for every ε > 0, some open interval J_ε ∋ y, such that

\[ |f(z) - f(y)| < \varepsilon, \quad \forall\, z \in J_\varepsilon \cap [a,b]. \tag{20} \]

Since y ∉ N, we have lim_{p→∞} h_p(y) = 0. Fix ε and choose p ≥ 1, such that 0 ≤ h_p(y) < ε. Write the partition ∆_p as

\[ \Delta_p = (a = x_0 < x_1 < \cdots < x_n = b). \]

Using the fact that y ∉ ∆_p, if we define k = min{j ∈ {1, . . . , n} : y < x_j}, we have y ∈ (x_{k−1}, x_k). In particular, we get

\[ f^{\Delta_p}(y) = \sup_{t \in [x_{k-1}, x_k]} f(t) \quad\text{and}\quad f_{\Delta_p}(y) = \inf_{s \in [x_{k-1}, x_k]} f(s), \]

so the inequality 0 ≤ h_p(y) < ε gives

\[ \Big[ \sup_{t \in [x_{k-1}, x_k]} f(t) \Big] - \Big[ \inf_{s \in [x_{k-1}, x_k]} f(s) \Big] < \varepsilon, \]

so if we choose J_ε = (x_{k−1}, x_k), we clearly have (20).

Now we are done, because using the fact that S is countable, it follows that S is neglijeable, so N ∪ S is also neglijeable. Since by Claim 2, we have D_f ⊂ N ∪ S, it follows that D_f itself is neglijeable.

(ii) ⇒ (i). Assume now the discontinuity set D_f is neglijeable, and let us prove that f is Riemann integrable. Fix a sequence (∆_p)_{p=1}^∞ of partitions of [a, b], with ∆_1 ⊂ ∆_2 ⊂ . . . , and¹⁰ lim_{p→∞} |∆_p| = 0. As before, we define the set S = ⋃_{p=1}^∞ ∆_p.

Claim 3: For any point y ∈ [a, b] ∖ (D_f ∪ S), one has the equalities

\[ \lim_{p\to\infty} f^{\Delta_p}(y) = \lim_{p\to\infty} f_{\Delta_p}(y) = f(y). \]

Fix for the moment ε > 0. Since f is continuous at y, there exists some δ_ε > 0, such that

\[ |f(z) - f(y)| < \varepsilon, \quad \forall\, z \in (y - \delta_\varepsilon, y + \delta_\varepsilon) \cap [a,b]. \tag{21} \]

Choose now q ≥ 1, such that |∆_q| < δ_ε. Write ∆_q = (a = x_0 < x_1 < · · · < x_n = b). Using the fact that y ∉ ∆_q, we can find k ∈ {1, . . . , n} such that y ∈ (x_{k−1}, x_k). Since x_k − x_{k−1} < δ_ε, we have the inclusion [x_{k−1}, x_k] ⊂ (y − δ_ε, y + δ_ε), so by (21) we immediately get

\[ f(y) \le f^{\Delta_q}(y) = \sup_{z \in [x_{k-1}, x_k]} f(z) \le f(y) + \varepsilon; \]

\[ f(y) \ge f_{\Delta_q}(y) = \inf_{z \in [x_{k-1}, x_k]} f(z) \ge f(y) - \varepsilon. \]

Since the sequence (f^{∆_p}(y))_{p=1}^∞ is non-increasing, and the sequence (f_{∆_p}(y))_{p=1}^∞ is non-decreasing, the above inequalities give

\[ |f^{\Delta_p}(y) - f(y)| \le \varepsilon \quad\text{and}\quad |f_{\Delta_p}(y) - f(y)| \le \varepsilon, \quad\text{for all } p \ge q, \]

and the Claim follows.

Going back to the proof of the Theorem, we will now prove that f satisfies condition (iii) in Proposition 6.6. Fix ε > 0. Since D_f ∪ S is also neglijeable, using regularity from above with respect to open sets, we can find an open set E ⊂ R such that E ⊃ D_f ∪ S, and λ(E) < ε. Define the compact set A = [a, b] ∖ E, and put B = [a, b] ∩ E. We clearly have

\[ \lambda(B) \le \lambda(E) < \varepsilon. \tag{22} \]

10 Recall that, for a partition ∆ = (a = x_0 < · · · < x_n = b), the number |∆| is defined as |∆| = max{x_k − x_{k−1} : 1 ≤ k ≤ n}.

Define the sequence (h_p)_{p=1}^∞ by h_p = f^{∆_p} − f_{∆_p}. Since A ∩ ∆_p = ∅, it follows that h_p|_A is continuous, for each p ≥ 1. Since A ∩ (D_f ∪ S) = ∅, by Claim 3, we know that lim_{p→∞} h_p(y) = 0, ∀ y ∈ A. Since (h_p)_{p=1}^∞ is monotone, by Dini’s Theorem (see ??) it follows that

\[ \lim_{p\to\infty} \Big[ \max_{y \in A} h_p(y) \Big] = 0. \]

In particular, there exists p_ε ≥ 1, such that

\[ h_{p_\varepsilon}(y) \le \varepsilon, \quad \forall\, y \in A. \tag{23} \]
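The role of compactness in the Dini step can be seen on a toy sequence. The sketch below uses the hypothetical functions h_p(x) = x^p (not taken from these notes), which are continuous, decrease in p, and tend to 0 point-wise on [0, 1). On the compact set [0, 0.9] the maxima tend to 0, whereas on the non-compact set [0, 1) there is always a point where h_p equals 1/2, so the convergence is not uniform there.

```python
def h(p, x):
    """Hypothetical example h_p(x) = x**p, non-increasing in p on [0, 1)."""
    return x ** p


exponents = (1, 10, 100, 1000)

# On the compact set [0, 0.9], h_p is increasing in x, so max h_p = h_p(0.9).
compact_max = [h(p, 0.9) for p in exponents]

# On [0, 1), the point x_p = 2**(-1/p) satisfies h_p(x_p) = 1/2 for every p.
escaping = [h(p, 0.5 ** (1.0 / p)) for p in exponents]

print(compact_max)
print(escaping)
```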

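Let

\[ M = \sup_{x \in [a,b]} f(x) \quad\text{and}\quad m = \inf_{x \in [a,b]} f(x). \]

Using Lemma 6.1 for h_{p_ε} and the sets A and B, combined with (22), we have

\[ \int_a^b h_{p_\varepsilon}(x)\,dx \le (b-a) \cdot \sup_{y \in A} h_{p_\varepsilon}(y) + \lambda^*(B) \cdot \sup_{z \in B} h_{p_\varepsilon}(z) \le \varepsilon(b-a) + \lambda^*(B)(M-m) \le \varepsilon(b-a+M-m). \]

Since h_{p_ε} ≥ h_p ≥ 0, for all p ≥ p_ε, we get the inequalities

\[ 0 \le \int_a^b h_p(x)\,dx \le \varepsilon(b-a+M-m), \quad \forall\, p \ge p_\varepsilon. \]

The above argument proves that lim_{p→∞} ∫_a^b h_p(x) dx = 0, i.e.

\[ \lim_{p\to\infty} \int_a^b [f^{\Delta_p}(x) - f_{\Delta_p}(x)]\,dx = 0. \]

By Proposition 6.6, it follows that f is Riemann integrable.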
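Lebesgue's criterion can be tried out on a function with a dense, but countable, discontinuity set. The sketch below uses Thomae's function on [0, 1] (an example assumed here, not taken from these notes): f(p/q) = 1/q for a fraction p/q in lowest terms, with f(0) = 1, and f = 0 at irrational points. Its discontinuity set is Q ∩ [0, 1], which is neglijeable. Every subinterval contains irrationals, so all lower Darboux sums are 0, and the supremum of f on a subinterval is 1/q for the smallest denominator q occurring in it, so the upper sums can be computed exactly.

```python
import math
from fractions import Fraction


def smallest_denominator(left, right):
    """Smallest q >= 1 such that some fraction p/q lies in [left, right]."""
    q = 1
    # p/q lies in [left, right] iff ceil(left*q) <= p <= floor(right*q).
    while math.ceil(left * q) > math.floor(right * q):
        q += 1
    return q


def thomae_upper_sum(n):
    """Upper Darboux sum of Thomae's function on [0, 1], n equal pieces."""
    total = Fraction(0)
    for k in range(n):
        left, right = Fraction(k, n), Fraction(k + 1, n)
        # sup of f over [left, right] is 1/q for the smallest denominator q
        total += Fraction(1, smallest_denominator(left, right)) * (right - left)
    return total


for n in (4, 8, 64, 512):
    print(n, float(thomae_upper_sum(n)))
```

Since the lower sums vanish, the printed values are exactly the gaps U(∆, f) − L(∆, f); they decrease along the nested partitions with 4, 8, 64 and 512 pieces.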

Exercise 13. Prove that a Riemann integrable function f : [a, b] → R is Lebesgue measurable.
Hint: Use a sequence of partitions (∆_p)_{p=1}^∞, with ∆_1 ⊂ ∆_2 ⊂ . . . , and lim_{p→∞} |∆_p| = 0. Use the arguments given in the proof of the implication (ii) ⇒ (i), to find a neglijeable set N ⊂ [a, b], such that

\[ \lim_{p\to\infty} f_{\Delta_p}(x) = f(x), \quad \forall\, x \in [a,b] \smallsetminus N. \]

The sequence (f_{∆_p})_{p=1}^∞ is non-decreasing, so it has a point-wise limit, say g, which is Lebesgue measurable. Use the fact that

\[ f(x) = g(x), \quad \forall\, x \in [a,b] \smallsetminus N, \]

to show that f itself is Lebesgue measurable.

Exercise 14. Let K ⊂ [0, 1] be a compact set with K ∩ Q = ∅, and λ(K) > 0 (see Exercise 7 for the existence of such sets). Prove that the characteristic function κ_K : [0, 1] → R is not Riemann integrable. In fact, κ_K cannot be almost Riemann integrable either.
Hint: Examine the discontinuity set D_f of f = κ_K, and prove that K ⊂ D_f.

Exercise 15. Let f_n : [a, b] → R, n ≥ 1, be a sequence of Riemann integrable functions. Consider the product space P = ∏_{n=1}^∞ Ran f_n, equipped with the product topology (the sets Ran f_n, n ≥ 1, are equipped with the topology induced from R), and the function F : [a, b] → P, defined by F(x) = (f_n(x))_{n=1}^∞. Prove that, for every bounded continuous function g : P → R, the composition g ∘ F : [a, b] → R is Riemann integrable. In other words, the result of a bounded continuous operation, involving a sequence of Riemann integrable functions, is again a Riemann integrable function.
Hint: Examine the relationship between the discontinuity set D_{g∘F} and the discontinuity sets D_{f_n}, n ≥ 1.

Exercise 16. Let M be an arbitrary subset of [a, b], and let f : [a, b] → R be a Riemann integrable function, such that f ≤ κ_M. Prove the inequality

\[ \int_a^b f(x)\,dx \le \lambda^*(M). \]

Hint: Consider the function g : [a, b] → R defined by g(x) = max{f(x), 0}. Then f ≤ g ≤ κ_M, and g is still Riemann integrable. Apply Lemma 6.1 (the first inequality) to the function 1 − g.

Exercise 17*. Let f : [a, b] → R be a bounded function. Prove that the following are equivalent:
(i) f is Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with g ≥ f ≥ h, and ∫_a^b [g(x) − h(x)] dx < ε;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R with g ≥ f ≥ h, and ∫_a^b [g(x) − h(x)] dx < ε.
Hints: For the implication (i) ⇒ (ii) analyze first the particular case when f = κ_J, with J a sub-interval of [a, b]. Then analyze the functions of the type f_∆ and f^∆. For the implication (iii) ⇒ (i), analyze the relationship among lower/upper Darboux sums of f, g and h.

Comment. The statement of Theorem 6.1 shows that, apart from trivial cases, the problem of checking that a function f : [a, b] → R is Riemann integrable, is a rather difficult one. The main difficulty arises from the fact that, if N ⊂ [a, b] is a neglijeable set, and f|_{[a,b]∖N} is continuous, then f need not be continuous at all points in [a, b] ∖ N. For instance, if we consider the characteristic function f = κ_{Q∩[a,b]} of the rationals in [a, b], and N = Q ∩ [a, b], then clearly N is neglijeable, f|_{[a,b]∖N} is continuous (because it is constant zero), but D_f = [a, b].

As earlier suggested, in the hope that such an anomaly can be eliminated, it is reasonable to consider the slightly weaker notion of almost Riemann integrability. In the remainder of this section, we take a closer look at this notion, and we will eventually show (see Theorem 6.2) that this indeed removes the above anomaly.

We begin with an “almost” version of Exercise 17.

Lemma 6.2. For a function f : [a, b] → R, the following are equivalent:
(i) f is almost Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with g ≥ f ≥ h a.e., and ∫_a^b [g(x) − h(x)] dx < ε;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R with g ≥ f ≥ h a.e., and ∫_a^b [g(x) − h(x)] dx < ε.

Proof. The implication (i) ⇒ (iii) is trivial.

The implication (iii) ⇒ (ii) follows from Exercise 17.

We now prove (ii) ⇒ (i). Assume f has property (ii). For each integer n ≥ 1, choose continuous functions g_n, h_n : [a, b] → R, such that g_n ≥ f ≥ h_n, a.e., and ∫_a^b [g_n(x) − h_n(x)] dx ≤ 1/n. Define the functions G_n, H_n : [a, b] → R, n ≥ 1, by

\[ G_n(x) = \min\{ g_1(x), \dots, g_n(x) \}, \]

\[ H_n(x) = \max\{ h_1(x), \dots, h_n(x) \}. \]

It is clear that
(α) G_m ≥ f ≥ H_n, a.e., ∀ m, n ≥ 1;
(β) G_1 ≥ G_2 ≥ . . . and H_1 ≤ H_2 ≤ . . . ;
(γ) ∫_a^b [G_n(x) − H_n(x)] dx ≤ ∫_a^b [g_n(x) − h_n(x)] dx ≤ 1/n, ∀ n ≥ 1.

Notice that, since the G_m’s and the H_n’s are continuous, by Exercise ??, we also have
(α′) G_m ≥ H_n (everywhere!), ∀ m, n ≥ 1.

Use (β) to define the functions G, H : [a, b] → R, by

\[ G(x) = \lim_{n\to\infty} G_n(x) \quad\text{and}\quad H(x) = \lim_{n\to\infty} H_n(x), \quad \forall\, x \in [a,b], \]

so by (α′) we clearly have G_n ≥ G ≥ H ≥ H_n, ∀ n ≥ 1. Using then (γ), by Exercise 17 it follows that both G and H are Riemann integrable. Moreover, we have G − H ≥ 0 and

\[ 0 \le \int_a^b [G(x) - H(x)]\,dx \le \int_a^b [G_n(x) - H_n(x)]\,dx \le \frac{1}{n}, \quad \forall\, n \ge 1, \]

which forces ∫_a^b [G(x) − H(x)] dx = 0, so by Exercise ??, we get G = H, a.e. By (α) it follows that f = G, a.e., so f is indeed almost Riemann integrable.

We are now in position to prove the “almost” version of Theorem 6.1.

Theorem 6.2. Let f : [a, b] → R be a bounded function. The following are equivalent:
(i) f is almost Riemann integrable;
(ii) there exists a neglijeable set N ⊂ [a, b] such that f|_{[a,b]∖N} is continuous.

Proof. (i) ⇒ (ii). Assume f is almost Riemann integrable, so there exists a Riemann integrable function g : [a, b] → R, such that f = g, a.e. By Theorem 6.1, the discontinuity set D_g is neglijeable. Take

\[ M = \{ x \in [a,b] : f(x) \ne g(x) \}. \]

Since f = g, a.e., the set M is neglijeable, and so is the set N = M ∪ D_g. On the one hand, since D_g ⊂ N, the restriction g|_{[a,b]∖N} is continuous. On the other hand, since M ⊂ N, we have f|_{[a,b]∖N} = g|_{[a,b]∖N}, so (ii) follows.

(ii) ⇒ (i). We are going to imitate the proof of Theorem 6.1, with some minor modifications. Fix N ⊂ [a, b] neglijeable, such that f|_{[a,b]∖N} is continuous. Fix also a sequence (∆_p)_{p=1}^∞ of partitions, with ∆_1 ⊂ ∆_2 ⊂ . . . , and lim_{p→∞} |∆_p| = 0. Put S = ⋃_{p=1}^∞ ∆_p. Since S is countable, the set N ∪ S is still neglijeable. We put T = [a, b] ∖ (N ∪ S), and we define the analogues of the functions f_{∆_p} and f^{∆_p} as follows. Write each partition as ∆_p = (a = x^p_0 < x^p_1 < · · · < x^p_{n_p} = b), and define, for each k ∈ {1, . . . , n_p}, the numbers

\[ M^p_k = \sup\{ f(t) : t \in [x^p_{k-1}, x^p_k] \cap T \} \quad\text{and}\quad m^p_k = \inf\{ f(t) : t \in [x^p_{k-1}, x^p_k] \cap T \}. \]

We then define, for each p ≥ 1, the functions

\[ g_p = m^p_1 \cdot \kappa_{[x^p_0, x^p_1]} + m^p_2 \cdot \kappa_{(x^p_1, x^p_2]} + \cdots + m^p_{n_p} \cdot \kappa_{(x^p_{n_p-1}, x^p_{n_p}]}, \]

\[ g^p = M^p_1 \cdot \kappa_{[x^p_0, x^p_1]} + M^p_2 \cdot \kappa_{(x^p_1, x^p_2]} + \cdots + M^p_{n_p} \cdot \kappa_{(x^p_{n_p-1}, x^p_{n_p}]}. \]

Note that we have the inequalities g^p(x) ≥ f(x) ≥ g_p(x), ∀ x ∈ T, which give

\[ g^p \ge f \ge g_p, \ \text{a.e.}, \quad \forall\, p \ge 1. \tag{24} \]

It is obvious that g_p and g^p, p ≥ 1, are all Riemann integrable. We are now going to estimate the integrals ∫_a^b [g^p(x) − g_p(x)] dx. Put h_p = g^p − g_p, p ≥ 1. First we observe that, since f|_T is continuous, and T ∩ ∆_p = ∅, ∀ p ≥ 1, we clearly have the equalities lim_{p→∞} g^p(x) = lim_{p→∞} g_p(x) = f(x), ∀ x ∈ T, which give

\[ \lim_{p\to\infty} h_p(x) = 0, \quad \forall\, x \in T. \tag{25} \]

Fix some ε > 0, and use regularity from above, to find an open set D with D ⊃ N ∪ S and λ(D) < ε. Take the compact set A = [a, b] ∖ D. Note that f|_A is continuous, since A ⊂ [a, b] ∖ N. Note also that, since A ⊂ [a, b] ∖ S, the functions g_p|_A and g^p|_A are also continuous, and so will be h_p|_A, for every p ≥ 1. Since (g^p(x))_{p=1}^∞ is non-increasing, and (g_p(x))_{p=1}^∞ is non-decreasing, for all x, it follows that the sequence (h_p)_{p=1}^∞ is monotone, so by Dini’s Theorem, (25) gives

\[ \lim_{p\to\infty} \Big[ \max_{x \in A} h_p(x) \Big] = 0. \]

In particular, there exists some p_ε ≥ 1, such that

\[ h_p(x) \le \varepsilon, \quad \forall\, p \ge p_\varepsilon,\ x \in A. \tag{26} \]

Put B = [a, b] ∖ A, and take M = sup_{x∈[a,b]} f(x) and m = inf_{x∈[a,b]} f(x). Using the inclusion B ⊂ D, we get λ*(B) ≤ λ(D) ≤ ε, so by Lemma 6.1 (the functions h_p, p ≥ 1, are clearly non-negative), combined with (26), we get

\[ \int_a^b h_p(x)\,dx \le (b-a) \cdot \sup_{x \in A} h_p(x) + \lambda^*(B) \cdot \sup_{x \in B} h_p(x) \le (b-a)\varepsilon + \lambda^*(B)(M-m) \le \varepsilon(b-a+M-m), \quad \forall\, p \ge p_\varepsilon. \]

This estimate then proves that lim_{p→∞} ∫_a^b h_p(x) dx = 0, i.e.

\[ \lim_{p\to\infty} \int_a^b [g^p(x) - g_p(x)]\,dx = 0. \]

Combining this with (24), and applying Lemma 6.2, yields the fact that f is almost Riemann integrable.

Comment. The hypothesis that f is bounded can be replaced with a slightly weaker one, which assumes that f is almost bounded, meaning that there exists a neglijeable set U ⊂ [a, b], such that f|_{[a,b]∖U} is bounded.

Exercise 18. Let f_n : [a, b] → R, n ≥ 1, be almost Riemann integrable functions, such that
(i) f_n ≥ f_{n+1} ≥ 0, a.e., ∀ n ≥ 1;
(ii) lim_{n→∞} f_n(x) = 0, for “almost all” x ∈ [a, b], i.e. there exists a neglijeable set N ⊂ [a, b], such that lim_{n→∞} f_n(x) = 0, ∀ x ∈ [a, b] ∖ N.
Prove that

\[ \lim_{n\to\infty} {\approx}\!\!\int_a^b f_n(x)\,dx = 0. \]

Lectures 26-29

7. Measure theory on locally compact spaces

Earlier in this chapter we discussed the construction of (outer) measures, starting with more primitive objects: semiring measures. The main application was the construction of the (outer) Lebesgue measure on R^n. In this section we describe an alternative construction, which has as its starting point another primitive object: a regular content. The idea is again to start with the measure defined on a “small” class of sets, extend it to an outer measure, and then use the Caratheodory construction. Among other applications, we will get an alternative construction of the (outer) Lebesgue measure on R^n.

Definition. Let X be a locally compact space. Denote by C_X the collection of all compact subsets of X. A content on X is a map ω : C_X → [0,∞), with the following properties:
(i) ω(∅) = 0;
(ii) if K, L ∈ C_X are such that K ⊂ L, then ω(K) ≤ ω(L);
(iii) ω(K ∪ L) ≤ ω(K) + ω(L), for all K, L ∈ C_X;
(iv) ω(K ∪ L) = ω(K) + ω(L), for all K, L ∈ C_X, with K ∩ L = ∅.

Comments. Note that ω takes finite values. The collection C_X does not have any nice set-arithmetic properties, except for the following: (i) the union of any finite collection of sets in C_X is again in C_X; (ii) an arbitrary intersection of sets in C_X is again in C_X.

Examples 7.1. A. If µ is a measure on Bor(X), then µ|_{C_X} is a content.

B. Take X = R, and for a compact subset K ⊂ R, define

\[ \omega(K) = \begin{cases} 1 & \text{if } 0 \in \mathrm{Int}(K) \\ 0 & \text{if } 0 \notin \mathrm{Int}(K) \end{cases} \]

It is obvious that ω is a content on R. Notice however that if we consider the compact sets K_n = [−1/n, 1/n], then ω(⋂_{n=1}^∞ K_n) = 0, but ω(K_n) = 1, ∀ n ≥ 1. This shows that, in general, a content cannot be extended to a measure on Bor(X).

One useful property, which will be invoked several times in this section, is contained in the following:

Exercise 1. Let X be a locally compact space, let K ⊂ X be compact, and let D_1, D_2 ⊂ X be open subsets, with K ⊂ D_1 ∪ D_2. Show there exist compact sets K_1 and K_2, such that K_1 ⊂ D_1, K_2 ⊂ D_2, and K = K_1 ∪ K_2.

As Example 7.1.B suggests, one obstruction for the extendability of a content on X, to a measure on Bor(X), is its behaviour with respect to interiors. The following notion isolates an important property, which will be shown to be sufficient for the extendability property.


Definition. Let X be a locally compact space. A content ω on X is said to be regular, if for any K ∈ C_X, one has the equality

\[ \omega(K) = \inf\{ \omega(L) : L \in \mathcal{C}_X,\ \mathrm{Int}(L) \supset K \}. \]

The following exercise shows how the lack of regularity can always be repaired.

Exercise 2. Let X be a locally compact space, and let ω be a content on X. Define ω̄ : C_X → [0,∞), by

\[ \bar\omega(K) = \inf\{ \omega(L) : L \in \mathcal{C}_X,\ \mathrm{Int}(L) \supset K \}, \quad \forall\, K \in \mathcal{C}_X. \]

Prove that:
(i) ω̄ is a regular content on X;
(ii) ω̄(K) ≥ ω(K), ∀ K ∈ C_X;
(iii) if η is a regular content on X, with η(K) ≥ ω(K), ∀ K ∈ C_X, then η(K) ≥ ω̄(K), ∀ K ∈ C_X;
(iv) ω is regular, if and only if ω = ω̄.

Definition. With the notations from Exercise 2, the regular content ω̄ is called the regularization of ω.

Theorem 7.1. Let X be a locally compact space, and let ω be a content on X. Denote by T_X the collection of all open subsets of X. Define the map ω̂ : T_X → [0,∞] by

\[ \hat\omega(D) = \sup\{ \omega(K) : K \in \mathcal{C}_X,\ K \subset D \}, \quad \forall\, D \in \mathcal{T}_X, \]

and define the map ω* : P(X) → [0,∞], by

\[ \omega^*(A) = \inf\{ \hat\omega(D) : D \in \mathcal{T}_X,\ D \supset A \}, \quad \forall\, A \subset X. \]

Then ω* is an outer measure on X.

Proof. We begin by collecting the useful properties of the map ω̂.

Claim: The map ω̂ has the following properties:
(i) ω̂(∅) = 0;
(ii) ω̂ is monotone, i.e. whenever D, E ∈ T_X satisfy D ⊂ E, it follows that ω̂(D) ≤ ω̂(E);
(iii) ω̂ is σ-sub-additive, i.e., for any sequence (D_n)_{n=1}^∞ ⊂ T_X, one has the inequality ω̂(⋃_{n=1}^∞ D_n) ≤ Σ_{n=1}^∞ ω̂(D_n).

Properties (i) and (ii) are trivial.

To prove property (iii), let us start with some sequence (D_n)_{n=1}^∞ of open sets, and let us denote for simplicity the union ⋃_{n=1}^∞ D_n by D. Start with some arbitrary compact set K ⊂ D. Using compactness, there exists some index p ≥ 1, such that K ⊂ D_1 ∪ D_2 ∪ · · · ∪ D_p. Use Exercise 1 (and induction) to find compact sets K_1 ⊂ D_1, K_2 ⊂ D_2, . . . , K_p ⊂ D_p, such that K = K_1 ∪ K_2 ∪ · · · ∪ K_p. We then clearly have the inequalities

\[ \omega(K) \le \sum_{n=1}^{p} \omega(K_n) \le \sum_{n=1}^{p} \hat\omega(D_n) \le \sum_{n=1}^{\infty} \hat\omega(D_n). \]

Since we have

\[ \omega(K) \le \sum_{n=1}^{\infty} \hat\omega(D_n), \quad\text{for all } K \in \mathcal{C}_X \text{ with } K \subset D, \]

by the definition of ω̂, we immediately get

\[ \hat\omega(D) \le \sum_{n=1}^{\infty} \hat\omega(D_n). \]

Having proven the Claim, we now check the conditions in the definition of an outer measure. It is clear that ω*(∅) = 0. It is also clear, from the definition, and property (ii) from the Claim, that

\[ A \subset B \implies \omega^*(A) \le \omega^*(B). \]

Finally, we need to show σ-sub-additivity, i.e.

\[ \omega^*\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \sum_{n=1}^{\infty} \omega^*(A_n). \tag{1} \]

Start with some sequence (A_n)_{n=1}^∞ of subsets of X. Of course, if one of the terms in the right hand side of (1) is infinite, there is nothing to prove. Assume that ω*(A_n) < ∞, ∀ n ≥ 1. Fix some ε > 0, and choose, for each n ≥ 1, an open set D_n ⊃ A_n, such that ω̂(D_n) ≤ ω*(A_n) + ε/2^n. Put D = ⋃_{n=1}^∞ D_n. Using part (iii) of the Claim, we have

\[ \hat\omega(D) \le \sum_{n=1}^{\infty} \hat\omega(D_n) \le \sum_{n=1}^{\infty} \Big[ \omega^*(A_n) + \frac{\varepsilon}{2^n} \Big] = \varepsilon + \sum_{n=1}^{\infty} \omega^*(A_n). \]

Since we obviously have the inclusion D ⊃ ⋃_{n=1}^∞ A_n, the above inequality gives

\[ \omega^*\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \hat\omega(D) \le \varepsilon + \sum_{n=1}^{\infty} \omega^*(A_n). \]

Now we have

\[ \omega^*\Big( \bigcup_{n=1}^{\infty} A_n \Big) \le \varepsilon + \sum_{n=1}^{\infty} \omega^*(A_n), \]

for all ε > 0, so the inequality (1) follows.
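The two-step construction in Theorem 7.1 (inner approximation of open sets by compact sets, then outer approximation of arbitrary sets by open sets) can be traced by brute force on a toy space. The sketch below assumes a three-point discrete space, in which every subset is both compact and open, and a content given by hypothetical point masses; the names `content`, `inner_open` and `outer` are illustrative only. In this degenerate setting the construction simply returns the content itself, which makes the monotonicity and sub-additivity checks easy to follow.

```python
from fractions import Fraction
from itertools import combinations

POINTS = (0, 1, 2)
# Hypothetical point masses defining a content on the discrete space POINTS.
MASS = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2)}


def subsets(points):
    """All subsets of a finite collection of points, as frozensets."""
    items = tuple(points)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


ALL_SETS = subsets(POINTS)


def content(compact):
    """The content of a compact (here: arbitrary) subset."""
    return sum((MASS[x] for x in compact), Fraction(0))


def inner_open(open_set):
    """Supremum of the content over all compact subsets of an open set."""
    return max(content(k) for k in subsets(open_set))


def outer(a):
    """Infimum of inner_open over all open sets containing a."""
    return min(inner_open(d) for d in ALL_SETS if a <= d)


for a in ALL_SETS:
    print(sorted(a), outer(a))
```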

Definition. Let X be a locally compact space, and let ω be a content on X. The outer measure ω* on X, defined in Theorem 7.1, is called the outer measure induced by ω.

Remarks 7.1. Let X be a locally compact space, let ω be a content on X, and let ω* be the outer measure induced by ω.

A. The map ω̂ : T_X → [0,∞], defined in the statement of Theorem 7.1, is given by ω̂ = ω*|_{T_X}. To see that this is the case, start with some open set D. On the one hand, by the definition of ω*, we know that

\[ \omega^*(D) = \inf\{ \hat\omega(E) : E \in \mathcal{T}_X,\ E \supset D \}, \]

which (using E = D) immediately gives the inequality ω*(D) ≤ ω̂(D). On the other hand, using property (ii) from the Claim stated in the proof, we also know that

\[ \hat\omega(E) \ge \hat\omega(D), \quad\text{for all } E \in \mathcal{T}_X \text{ with } E \supset D, \]

which gives the reverse inequality, ω*(D) ≥ ω̂(D).

B. As a consequence of the above remark, we get the fact that ω* is regular from above, with respect to the collection T_X of all open sets in X, i.e.

\[ \omega^*(A) = \inf\{ \omega^*(D) : D \in \mathcal{T}_X,\ D \supset A \}, \quad \forall\, A \subset X. \]

C. If one denotes by ω̄ the regularization of ω (see Exercise 2), then ω̄* = ω*. In fact, using the notations from Theorem 7.1, we have the equality

\[ \hat\omega = \hat{\bar\omega}. \]

Indeed, on the one hand, since we have the inequality

\[ \bar\omega(K) \ge \omega(K), \quad \forall\, K \in \mathcal{C}_X, \]

it follows immediately by the definitions, that

\[ \hat{\bar\omega}(D) \ge \hat\omega(D), \quad \forall\, D \in \mathcal{T}_X. \]

To prove the other inequality, fix some open set D ⊂ X. Suppose K ⊂ D is some compact subset. Using the well-known properties of locally compact spaces, there exists some compact set L, with

\[ K \subset \mathrm{Int}(L) \subset L \subset D, \]

so by the definitions of ω̄ and ω̂, we get

\[ \hat\omega(D) \ge \omega(L) \ge \bar\omega(K). \]

Since we have the inequality

\[ \hat\omega(D) \ge \bar\omega(K), \quad\text{for all } K \in \mathcal{C}_X \text{ with } K \subset D, \]

taking supremum in the right hand side yields

\[ \hat\omega(D) \ge \sup\{ \bar\omega(K) : K \in \mathcal{C}_X,\ K \subset D \} = \hat{\bar\omega}(D). \]

Proposition 7.1. Let X be a locally compact space, let ω be a content on X, and let ω* be the outer measure induced by ω. If we denote by ω̄ the regularization of ω, then one has the equality

\[ \omega^*\big|_{\mathcal{C}_X} = \bar\omega. \]

Proof. Using Remark 7.1.C, we can assume that ω is regular, and in this case we need to prove that ω*|_{C_X} = ω. Start with some compact set K ⊂ X. By the definition of ω*, using the notations from Theorem 7.1, we know that

\[ \omega^*(K) = \inf\{ \hat\omega(D) : D \in \mathcal{T}_X,\ D \supset K \}. \tag{2} \]

It is clear that, for every open set D ⊃ K, we have the inequality

\[ \hat\omega(D) \ge \omega(K), \]

so taking infimum in the left hand side, and using (2), immediately gives the inequality

\[ \omega^*(K) \ge \omega(K). \]

To prove the reverse inequality, we start by fixing ε > 0, and we use regularity to find some compact set L with K ⊂ Int(L), and ω(L) ≤ ω(K) + ε. Consider the open set D = Int(L). On the one hand, for every compact set F ⊂ D, we have the obvious inclusion F ⊂ L, which gives ω(F) ≤ ω(L). Taking supremum over all compact sets F ⊂ D then gives ω̂(D) ≤ ω(L). By the choice of L, by the definition of ω*, and using the inclusion D ⊃ K, we then get

\[ \omega^*(K) \le \hat\omega(D) \le \omega(L) \le \omega(K) + \varepsilon. \]

Since the inequality

\[ \omega^*(K) \le \omega(K) + \varepsilon \]

holds for all ε > 0, we then must have ω*(K) ≤ ω(K).

The above result gives a nice characterization for the regularity of a content, in terms of the induced outer measure.

Corollary 7.1. Let X be a locally compact space. A content ω is regular, if and only if ω*|_{C_X} = ω.

Proof. Immediate from Proposition 7.1 and Exercise 2.

Theorem 7.2. Let X be a locally compact space, let ω be a content on X, and let ω* be the outer measure induced by ω. Then every open set D ⊂ X is ω*-measurable.

Proof. Fix an open set D ⊂ X. We need to prove (see Section 5) that D “sharply cuts” every subset of X, which is equivalent to the fact that, for every A ⊂ X, one has the inequality:

\[ \omega^*(A) \ge \omega^*(A \cap D) + \omega^*(A \smallsetminus D). \tag{3} \]

This will be shown in several steps.

Claim 1: For any open set E ⊂ X, and any compact set K ⊂ E, one has the inequality

\[ \omega^*(E) \ge \omega(K) + \omega^*(E \smallsetminus K). \]

To prove this inequality, we first note that, since both E and E ∖ K are open, by Remark 7.1.A, we have the equalities ω*(E) = ω̂(E) and ω*(E ∖ K) = ω̂(E ∖ K), where ω̂ : T_X → [0,∞] is the map defined in the statement of Theorem 7.1. If L ⊂ E ∖ K is an arbitrary compact set, then we obviously have K ∩ L = ∅, so using the inclusion K ∪ L ⊂ E, we get

\[ \omega(K) + \omega(L) = \omega(K \cup L) \le \hat\omega(E) = \omega^*(E), \]

which then gives

\[ \omega^*(E) - \omega(K) \ge \omega(L), \quad\text{for all } L \in \mathcal{C}_X \text{ with } L \subset E \smallsetminus K. \]

Taking supremum in the right hand side then gives

\[ \omega^*(E) - \omega(K) \ge \sup\{ \omega(L) : L \in \mathcal{C}_X,\ L \subset E \smallsetminus K \} = \hat\omega(E \smallsetminus K) = \omega^*(E \smallsetminus K), \]

and the Claim follows.

Claim 2: The inequality (3) holds for all open subsets A ⊂ X.

Assume A is open. If the left hand side of (3) is infinite, there is nothing to prove. Assume ω*(A) < ∞, so both ω*(A ∩ D) and ω*(A ∖ D) are also finite. Since A ∩ D is open, we have

\[ \omega^*(A \cap D) = \hat\omega(A \cap D) = \sup\{ \omega(K) : K \in \mathcal{C}_X,\ K \subset A \cap D \}. \tag{4} \]

Fix for the moment a compact subset K ⊂ A ∩ D. Using Claim 1 we have the inequality

\[ \omega^*(A) \ge \omega(K) + \omega^*(A \smallsetminus K). \]

Since we obviously have the inclusion A ∖ K ⊃ A ∖ D, the above inequality gives ω*(A) ≥ ω(K) + ω*(A ∖ D), which can be re-written as

\[ \omega^*(A) - \omega^*(A \smallsetminus D) \ge \omega(K), \quad\text{for all } K \in \mathcal{C}_X \text{ with } K \subset A \cap D. \]

Taking supremum in the right hand side, and using (4), we immediately get the desired inequality (3).

We now proceed with the proof of (3) for arbitrary A’s. Fix A, and consider an arbitrary open set E ⊃ A. By Claim 2, we have

\[ \omega^*(E) \ge \omega^*(E \cap D) + \omega^*(E \smallsetminus D). \]

Using the obvious inclusions E ∩ D ⊃ A ∩ D and E ∖ D ⊃ A ∖ D, we then get

\[ \omega^*(E) \ge \omega^*(A \cap D) + \omega^*(A \smallsetminus D). \]

The desired inequality (3) follows now by taking infimum in the left hand side, and using Remark 7.1.B.

The most important consequence of Theorem 7.2 is the following.

Corollary 7.2. Let X be a locally compact space, and let ω be a regular content on X. Then ω can be extended uniquely to a measure µ_ω on Bor(X), with the following properties.
(i) µ_ω is regular from above, with respect to the collection T_X of all open sets, that is

\[ \mu_\omega(B) = \inf\{ \mu_\omega(D) : D \in \mathcal{T}_X,\ D \supset B \}, \quad \forall\, B \in \mathrm{Bor}(X); \]

(ii) for every open set D ⊂ X, one has the equality

\[ \mu_\omega(D) = \sup\{ \mu_\omega(K) : K \in \mathcal{C}_X,\ K \subset D \}. \]

Conversely, if µ is a measure on Bor(X) with properties (i) and (ii), and such that µ(K) < ∞, ∀ K ∈ C_X, then µ|_{C_X} is a regular content.

Proof. If we denote by m_{ω*}(X) the σ-algebra of ω*-measurable sets, then Theorem 7.2 gives the inclusion Bor(X) ⊂ m_{ω*}(X), so the existence follows by taking µ_ω = ω*|_{Bor(X)}. The fact that µ_ω has properties (i) and (ii) is trivial, by construction and by Remarks 7.1 and Proposition 7.1.

The uniqueness is trivial, since property (ii) uniquely defines µ_ω on open sets, and (i) then uniquely defines µ_ω on all Borel sets.

To prove the second assertion, assume µ is a measure on Bor(X) with properties (i) and (ii), and let us show that ω = µ|_{C_X} is a regular content. The fact that ω is a content is trivial, so the only thing we must show is regularity. Fix some compact set K ⊂ X. It is clear that

\[ \omega(K) \le \inf\{ \omega(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L) \}. \]

To prove the reverse inequality, we use property (i), to find, for each ε > 0, an open set D_ε ⊃ K, such that µ(D_ε) ≤ µ(K) + ε. If we choose, for each ε > 0, a compact set L_ε, such that

\[ K \subset \mathrm{Int}(L_\varepsilon) \subset L_\varepsilon \subset D_\varepsilon, \]

then we obviously have

\[ \mu(K) \le \mu(L_\varepsilon) \le \mu(D_\varepsilon) \le \mu(K) + \varepsilon, \]

so we get the inequality

\[ \inf\{ \omega(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L) \} \le \omega(K) + \varepsilon. \]

Since this holds for all ε > 0, we get in fact the inequality

\[ \inf\{ \omega(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L) \} \le \omega(K), \]

and we are done.

Definition. Let X be a locally compact space. A Radon measure on X is a measure µ on Bor(X) with the following properties:
(i) µ(K) < ∞, for all compact sets K ⊂ X;
(ii) for every open set D one has

\[ \mu(D) = \sup\{ \mu(K) : K \subset D,\ K \text{ compact} \}; \]

(iii) for every Borel set B one has

\[ \mu(B) = \inf\{ \mu(D) : D \supset B,\ D \text{ open} \}. \]

By Corollary 7.2, the map ω ⟼ µ_ω establishes a bijective correspondence between the set of all regular contents on X, and the set of all Radon measures on X. For a regular content ω, the measure µ_ω is called the Radon measure extension of ω.

Proposition 7.2. Let X be a locally compact space.
(a) If µ is a Radon measure on X, and t ∈ [0,∞), then tµ is also a Radon measure on X.
(b) If µ_1 and µ_2 are Radon measures on X, then µ_1 + µ_2 is also a Radon measure on X.

Proof. Property (a) is trivial.

To prove property (b) let us denote µ_1 + µ_2 simply by µ. We first observe that µ is indeed a measure on Bor(X), and we clearly have

\[ \mu(K) = \mu_1(K) + \mu_2(K) < \infty, \quad \forall\, K \in \mathcal{C}_X. \]

Let us show that µ satisfies condition (ii). Fix some open set D ⊂ X, and let us prove that

\[ \mu(D) = \sup\{ \mu(K) : K \in \mathcal{C}_X,\ K \subset D \}. \tag{5} \]

If µ(D) = ∞, then either µ_1(D) = ∞ or µ_2(D) = ∞, so we get

\[ \sup\{ \max(\mu_1(K), \mu_2(K)) : K \in \mathcal{C}_X,\ K \subset D \} = \infty, \]

and since µ(K) ≥ max(µ_1(K), µ_2(K)), ∀ K ∈ C_X, the equality (5) immediately follows. Suppose now µ(D) < ∞, which is equivalent to the fact that µ_1(D), µ_2(D) < ∞. Denote the right hand side of (5) by ν(D). For every ε > 0, using the fact that µ_1 and µ_2 are Radon measures, we can find two compact sets K^ε_1, K^ε_2 ⊂ D, such that µ_1(K^ε_1) ≥ µ_1(D) − ε/2 and µ_2(K^ε_2) ≥ µ_2(D) − ε/2. Of course, the compact set K^ε = K^ε_1 ∪ K^ε_2 is still a subset of D, and satisfies

\[ \mu_1(K^\varepsilon) \ge \mu_1(K^\varepsilon_1) \ge \mu_1(D) - \frac{\varepsilon}{2}, \]

\[ \mu_2(K^\varepsilon) \ge \mu_2(K^\varepsilon_2) \ge \mu_2(D) - \frac{\varepsilon}{2}, \]

so we get µ(K^ε) = µ_1(K^ε) + µ_2(K^ε) ≥ µ_1(D) + µ_2(D) − ε = µ(D) − ε. This proves that ν(D) ≥ µ(D) − ε, and since this inequality is true for all ε > 0, we get ν(D) ≥ µ(D). The inequality ν(D) ≤ µ(D) is trivial.

We now show that µ satisfies condition (iii). Fix some set A ∈ Bor(X), and let us prove that

\[ \mu(A) = \inf\{ \mu(D) : D \in \mathcal{T}_X,\ D \supset A \}. \tag{6} \]

If µ(A) = ∞, there is nothing to prove. Suppose now µ(A) < ∞, which is equivalent to the fact that µ_1(A), µ_2(A) < ∞. Denote the right hand side of (6) by λ(A). For every ε > 0, using the fact that µ_1 and µ_2 are Radon measures, we can find two open sets D^ε_1, D^ε_2 ⊃ A, such that µ_1(D^ε_1) ≤ µ_1(A) + ε/2 and µ_2(D^ε_2) ≤ µ_2(A) + ε/2. The open set D^ε = D^ε_1 ∩ D^ε_2 still contains A, and satisfies

\[ \mu_1(D^\varepsilon) \le \mu_1(D^\varepsilon_1) \le \mu_1(A) + \frac{\varepsilon}{2}, \]

\[ \mu_2(D^\varepsilon) \le \mu_2(D^\varepsilon_2) \le \mu_2(A) + \frac{\varepsilon}{2}, \]

so we get µ(D^ε) = µ_1(D^ε) + µ_2(D^ε) ≤ µ_1(A) + µ_2(A) + ε = µ(A) + ε. This proves that λ(A) ≤ µ(A) + ε, and since this inequality is true for all ε > 0, we get λ(A) ≤ µ(A). The inequality λ(A) ≥ µ(A) is trivial.

Radon measures are also functorial with respect to proper maps, in the following sense.

Proposition 7.3. Let X and Y be locally compact spaces, let Φ : X → Y be a proper continuous map, and let µ be a Radon measure on X. Then the map ν : Bor(Y) → [0,∞], defined by

\[ \nu(B) = \mu(\Phi^{-1}(B)), \quad \forall\, B \in \mathrm{Bor}(Y), \]

is a Radon measure on Y.

Proof. First of all, remark that since Φ is continuous, it is Borel measurable, which means that

\[ \Phi^{-1}(B) \in \mathrm{Bor}(X), \quad \forall\, B \in \mathrm{Bor}(Y). \]

Secondly, by the well known properties of measures, the map ν is a measure.

We now check that ν is a Radon measure. First of all, if K ⊂ Y is compact, then using the fact that Φ is proper, it means that Φ^{−1}(K) is compact in X, so we clearly get

\[ \nu(K) = \mu(\Phi^{-1}(K)) < \infty. \]

To prove that ν satisfies condition (ii), start with some open set D ⊂ Y, and let us find a sequence (L_n)_{n=1}^∞ of compact subsets of D, such that ν(D) = lim_{n→∞} ν(L_n). The set Φ^{−1}(D) is open, so there exists a sequence (K_n)_{n=1}^∞ of compact subsets of Φ^{−1}(D), with

\[ \nu(D) = \mu(\Phi^{-1}(D)) = \lim_{n\to\infty} \mu(K_n). \tag{7} \]

If we define the subsets L_n = Φ(K_n), then (L_n)_{n≥1} is a sequence of compact subsets of D, and the inclusion K_n ⊂ Φ^{−1}(L_n) immediately gives ν(D) ≥ ν(L_n) ≥ µ(K_n), so by (7) we also get ν(D) = lim_{n→∞} ν(L_n).

To prove condition (iii) start with some arbitrary subset B ∈ Bor(Y), and let us find a sequence (E_n)_{n=1}^∞ of open subsets of Y, such that ν(B) = lim_{n→∞} ν(E_n), and E_n ⊃ B, ∀ n ≥ 1. Use the fact that µ is a Radon measure, to find a sequence (D_n)_{n=1}^∞ of open subsets of X, such that

\[ \nu(B) = \mu(\Phi^{-1}(B)) = \lim_{n\to\infty} \mu(D_n), \tag{8} \]

and D_n ⊃ Φ^{−1}(B), ∀ n ≥ 1. Put T_n = X ∖ D_n, so that T_n is closed, for each n ≥ 1. By Proposition I.5.2, the sets Φ(T_n) are closed in Y, hence their complements E_n = Y ∖ Φ(T_n), n ≥ 1, are open. Remark that we have the inclusions B ⊂ E_n, ∀ n ≥ 1. Otherwise, we would have B ∩ Φ(T_n) ≠ ∅, forcing T_n ∩ Φ^{−1}(B) ≠ ∅, which is impossible, since Φ^{−1}(B) ⊂ D_n = X ∖ T_n. Moreover, we also have the inclusions

\[ \Phi^{-1}(B) \subset \Phi^{-1}(E_n) \subset D_n, \quad \forall\, n \ge 1, \]

which then force

\[ \nu(B) \le \nu(E_n) \le \mu(D_n), \quad \forall\, n \ge 1. \]

Using (8), this gives the equality lim_{n→∞} ν(E_n) = ν(B).
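The formula ν(B) = µ(Φ^{−1}(B)) can be tried out on finite discrete spaces, where every map is continuous and proper, and every finite measure is a Radon measure. The sketch below is only an illustration under these assumptions; the point masses `MU`, the map `phi`, and the helper `pushforward` are hypothetical names chosen for this example.

```python
from fractions import Fraction

X = range(6)
# Hypothetical point masses of a measure mu on the discrete space X.
MU = {x: Fraction(x + 1, 10) for x in X}


def phi(x):
    """Hypothetical map from X = {0,...,5} onto Y = {0, 1, 2}."""
    return x % 3


def pushforward(mu, mapping, domain):
    """Return nu with nu(B) = mu(mapping^{-1}(B)) for subsets B of the target."""
    def nu(borel_set):
        preimage = [x for x in domain if mapping(x) in borel_set]
        return sum((mu[x] for x in preimage), Fraction(0))
    return nu


nu = pushforward(MU, phi, X)

for y in range(3):
    print(y, nu({y}))
```

Additivity of ν and the preservation of total mass follow directly from the fact that preimages of disjoint sets are disjoint and cover X.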

Of course, if X is a compact Hausdorff space, then every Radon measure µ onX is finite. The following gives an interesting converse of this property, which alsoshows that sometimes functoriality can be present beyond the proper case describedabove.

Proposition 7.4. Let X be a locally compact space, let µ be a Radon measure on X, and let (θ, T) be a compactification of X. The following are equivalent:

(i) µ(X) < ∞;
(ii) the map ν : Bor(T) → [0,∞], defined by

\[ \nu(B) = \mu\bigl(\theta^{-1}(B)\bigr), \quad \forall\, B \in \mathrm{Bor}(T), \]

is a Radon measure on T.

Proof. Recall that the fact that (θ, T) is a compactification of X means that
• T is a compact Hausdorff space;
• θ : X → T is continuous;
• θ(X) is open and dense in T;
• θ : X → θ(X) is a homeomorphism.

Without any loss of generality, we can assume that X is a dense open subset of T, and θ is the inclusion map. With this convention, the map ν is defined by

(9) ν(B) = µ(B ∩X), ∀B ∈ Bor(T ).

(i) ⇒ (ii). Assume µ(X) < ∞. It is clear that ν is a finite measure on Bor(T), and in fact we have ν(T ∖ X) = 0.

The fact that ν(K) < ∞, for every compact subset K ⊂ T, is of course trivial.

We now check the second condition in the definition. Fix some open subset D ⊂ T, and let us show that

\[ \nu(D) = \sup\bigl\{ \nu(K) : K \text{ compact},\ K \subset D \bigr\}. \]

All we need is a sequence (K_n)_{n=1}^∞ of compact subsets of D, with lim_{n→∞} ν(K_n) = ν(D). To get this sequence we simply use the fact that D ∩ X is open (in X), so we can find a sequence (K_n)_{n=1}^∞ of compact subsets of D ∩ X, with lim_{n→∞} µ(K_n) = µ(D ∩ X) = ν(D). Now we are done, because the fact that K_n ⊂ X gives µ(K_n) = ν(K_n), ∀n ≥ 1.

We now check the third condition in the definition. Fix some set B ∈ Bor(T), and let us show that

\[ \nu(B) = \inf\bigl\{ \nu(D) : D \subset T \text{ open},\ D \supset B \bigr\}. \]

All we need is a sequence (D_n)_{n=1}^∞ of open subsets of T, with D_n ⊃ B, ∀n ≥ 1, and lim_{n→∞} ν(D_n) = ν(B). Start off by choosing a sequence (K_n)_{n=1}^∞ of compact subsets of X, such that lim_{n→∞} µ(K_n) = µ(X). Then we get lim_{n→∞} µ(X ∖ K_n) = 0 (the condition that µ(X) < ∞ is essential here). If we then define the open sets A_n = T ∖ K_n, we will have ν(A_n) = µ(A_n ∩ X) = µ(X ∖ K_n), ∀n ≥ 1, so we have

\[ \lim_{n\to\infty} \nu(A_n) = 0. \tag{10} \]


Notice also that

\[ A_n \supset T \smallsetminus X, \quad \forall\, n \ge 1. \tag{11} \]

Use now the fact that µ is a Radon measure on X, and the fact that B ∩ X ∈ Bor(X), to find a sequence (E_n)_{n=1}^∞ of open subsets of X, with E_n ⊃ B ∩ X, ∀n ≥ 1, and

\[ \lim_{n\to\infty} \nu(E_n) = \mu(B \cap X). \tag{12} \]

Since X is open in T, it follows that all the E_n's are open in T. If we define D_n = E_n ∪ A_n, then using (11) we have the inclusions

\[ D_n = E_n \cup A_n \supset (B \cap X) \cup (T \smallsetminus X) \supset B, \quad \forall\, n \ge 1, \]

as well as the inequalities

\[ \mu(B \cap X) = \nu(B) \le \nu(D_n) \le \nu(E_n) + \nu(A_n) = \mu(E_n \cap X) + \nu(A_n) = \mu(E_n) + \nu(A_n), \quad \forall\, n \ge 1, \]

which, combined with (10) and (12), clearly give lim_{n→∞} ν(D_n) = µ(B ∩ X) = ν(B).

(ii) ⇒ (i). This implication is trivial, because the fact that ν is a Radon measure forces µ(X) = ν(X) ≤ ν(T) < ∞.
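Two simple cases, chosen here only as an illustration, show both sides of the equivalence. The spaces and measures below are assumptions of this sketch, not examples taken from the notes.

```latex
% Case 1 (finite measure): X = (0,1), T = [0,1], \theta = inclusion, \mu = \lambda restricted to (0,1).
\[
\mu(X) = 1 < \infty, \qquad
\nu(B) = \lambda\bigl(B \cap (0,1)\bigr), \qquad
\nu\bigl(\{0,1\}\bigr) = \nu(T \smallsetminus X) = 0 .
\]
% Case 2 (infinite measure): X = \mathbb{R}, T = [-\infty,+\infty] (two-point compactification), \mu = \lambda.
\[
\nu(T) = \lambda(T \cap \mathbb{R}) = \lambda(\mathbb{R}) = \infty ,
\]
% so \nu is infinite on the compact set T and therefore cannot be a Radon measure on T.
```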

Comment. Assume µ is a Radon measure on a locally compact space X. Although the measure µ is regular from above with respect to open sets by (iii), in general, one cannot conclude that it is regular from below with respect to compact sets. The following example illustrates such an anomaly.

Exercise 3*. Equip the space X = R² with the disjoint union topology defined by the decomposition

\[ X = \bigcup_{y \in \mathbb{R}} \bigl( \mathbb{R} \times \{y\} \bigr). \]

More explicitly, if we define, for each A ⊂ X, and each y ∈ R, the set

\[ A_y = \{ x \in \mathbb{R} : (x,y) \in A \}, \]

then a set D ⊂ X is declared to be open if and only if all subsets D_y ⊂ R, y ∈ R, are open (in the usual topology on R). For each subset A ⊂ X, define its support

\[ S_A = \{ y \in \mathbb{R} : A_y \ne \varnothing \}. \]

Prove the following.

(i) A set K ⊂ X is compact if and only if its support S_K is finite and, for each y ∈ S_K, the set K_y ⊂ R is compact (in the usual topology on R).
(ii) X is a locally compact space.
(iii) If we define, for every compact subset K ⊂ X, the number

\[ \omega(K) = \sum_{y \in S_K} \lambda(K_y), \]

where λ is the Lebesgue measure on R, then ω is a regular content on X.
(iv) Let µ denote the Radon measure extension of ω. Then for every open set D ⊂ X, one has the equality

\[ \mu(D) = \sum_{y \in \mathbb{R}} \lambda(D_y), \]

where one uses the summation conventions discussed in II.2. (The sum on the right-hand side is defined as the supremum of all finite sums.)
(v) If B ∈ Bor(X) has uncountable support S_B, then µ(B) = ∞.


(vi) Consider the y-axis F = {0} × R ⊂ X. Show that F is closed in X (hence Borel), that it has infinite measure µ(F) = ∞, but that µ(K) = 0 for all compact subsets K ⊂ F.

Hints: Using regularity from above, it suffices to prove (v) only when B is open. In this case use the fact that if a map α : R → R is summable, then the set {t ∈ R : α(t) ≠ 0} is countable. For (vi), the equality µ(F) = ∞ is a consequence of (v). To get the fact that all compact subsets of F have measure zero, use part (i).
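To get a feel for the content ω before attempting the exercise, here is a small computation. The particular compact set K is a choice made for this sketch, and the statement about the unit square assumes part (i).

```latex
% Illustration only: K consists of two horizontal segments at heights y = 0 and y = 1.
\[
K = \bigl([0,1] \times \{0\}\bigr) \cup \bigl([2,5] \times \{1\}\bigr),
\qquad S_K = \{0,1\}, \quad K_0 = [0,1], \quad K_1 = [2,5],
\]
\[
\omega(K) = \lambda(K_0) + \lambda(K_1) = 1 + 3 = 4 .
\]
% By contrast, Q = [0,1] \times [0,1] has support S_Q = [0,1], which is infinite,
% so (assuming part (i)) Q is not compact in this topology, although it is compact in the usual one.
```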

Remark 7.2. Let X be a locally compact space, and let µ be a Radon measure on X. We define the maximal outer extension of µ (see Section 5) by

\[ \mu^*(A) = \inf\bigl\{ \mu(B) : B \in \mathrm{Bor}(X),\ B \supset A \bigr\}, \quad \forall\, A \subset X. \]

By the regularity from above, one has the equality

\[ \mu^*(A) = \inf\bigl\{ \mu(D) : D \in \mathcal{T}_X,\ D \supset A \bigr\}, \quad \forall\, A \subset X. \tag{13} \]

If one considers the regular content ω = µ|_{𝒞_X}, then µ∗ = ω∗, the outer measure induced by ω. We also know that if we consider the σ-algebra m_{µ∗}(X) of all µ∗-measurable subsets of X, we have the inclusion Bor(X) ⊂ m_{µ∗}(X), and µ∗|_{Bor(X)} = µ.

Exercise 4. Consider the collection 𝒟 of all subsets of R^n of the form

\[ D = (a_1, b_1) \times \cdots \times (a_n, b_n), \quad a_1 < b_1, \dots, a_n < b_n. \]

For every such D we define

\[ \mathrm{vol}_n(D) = \prod_{j=1}^{n} (b_j - a_j). \]

We define, for every bounded subset B ⊂ R^n, the number

\[ v(B) = \inf\Bigl\{ \sum_{p=1}^{N} \mathrm{vol}_n(D_p) : (D_p)_{p=1}^{N} \subset \mathcal{D},\ B \subset \bigcup_{p=1}^{N} D_p \Bigr\}. \tag{14} \]

(i) If we define ℬ = {B ⊂ R^n : B bounded}, then ℬ is a ring, the map v : ℬ → [0,∞) is sub-additive, but not σ-sub-additive. In particular, v does not extend to an outer measure on R^n.
(ii) If we consider the unit cube S = [0,1]^n, then the collection 𝒩_v(S) = {N ⊂ S : v(N) = 0} is a ring, but not a σ-ring.
(iii) When restricted to the collection 𝒞_{R^n} of all compact subsets of R^n, the map ω = v|_{𝒞_{R^n}} defines a regular content on R^n.
(iv) The outer measure ω∗, defined by ω, is precisely the outer Lebesgue measure λ_n∗.

The above construction somehow belongs to the “prehistory” of measure theory. The map v : ℬ → [0,∞) is called the Jordan content. Bounded sets N ⊂ R^n with v(N) = 0 are called Jordan negligible. The theory of Riemann integration (especially for functions of several variables) relies heavily on the use of Jordan negligible sets. Part (ii) shows that, when restricted to Bor(S), the map v fails to be a measure. Parts (iii) and (iv) explain how the construction can be “fixed.” The regular content ω = v|_{𝒞_{R^n}} is called the Lebesgue content. The correspondence ω ⟼ ω∗ gives an alternative construction of the outer Lebesgue measure, which starts with its definition on compact sets as the Jordan content.
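The failure of σ-sub-additivity in part (i) can be seen on a standard example, sketched here for n = 1. The set N and the covering argument are an illustration added to the notes, not the author's solution.

```latex
% Illustration only (n = 1): N = \mathbb{Q} \cap [0,1].
% Each singleton has Jordan content zero, since \{q\} \subset (q-\varepsilon, q+\varepsilon):
\[
v(\{q\}) \le 2\varepsilon \quad \forall\, \varepsilon > 0
\quad\Longrightarrow\quad v(\{q\}) = 0 .
\]
% If N \subset \bigcup_{p=1}^{N_0} (a_p, b_p) (a FINITE cover), then taking closures gives
% [0,1] = \overline{N} \subset \bigcup_{p=1}^{N_0} [a_p, b_p], hence
\[
\sum_{p=1}^{N_0} (b_p - a_p) \ge 1
\quad\Longrightarrow\quad v(N) \ge 1 ,
\]
% while the single interval (-\varepsilon, 1+\varepsilon) shows v(N) \le 1 + 2\varepsilon. Therefore
\[
v(N) = 1 > 0 = \sum_{q \in N} v(\{q\}) .
\]
```

The same set N is a countable union of v-null subsets of S = [0,1] that is not v-null, which is the phenomenon behind part (ii).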


For Radon measures, the lack of regularity from below, with respect to compact sets, is somewhat compensated by the following result (compare with Exercise 2 from Section 6).

Lemma 7.1. Let X be a locally compact space, let µ be a Radon measure on X, and let µ∗ be the maximal outer extension of µ. For a subset A ⊂ X, with µ∗(A) < ∞, the following are equivalent:

(i) A is µ∗-measurable;
(ii) µ∗(A) = sup{µ(K) : K ∈ 𝒞_X, K ⊂ A};
(iii) there exists a sequence (K_n)_{n=1}^∞ of compact subsets of A, such that

\[ \mu^*\Bigl( A \smallsetminus \Bigl[ \bigcup_{n=1}^{\infty} K_n \Bigr] \Bigr) = 0. \]

Proof. (i) ⇒ (ii). Suppose A is µ∗-measurable, and let us prove the equality (ii). Denote the right-hand side of (ii) simply by ν(A). It is obvious, by the monotonicity of µ∗, and the fact that µ∗|_{Bor(X)} = µ, that we have the inequality µ∗(A) ≥ ν(A). To prove the other inequality we fix for the moment some ε > 0. Using (13), there exists an open set D ⊃ A, such that µ(D) ≤ µ∗(A) + ε. Use property (ii) in the definition of Radon measures to find some compact set L ⊂ D such that

\[ \mu(D) \le \mu(L) + \varepsilon. \]

Since µ(D) = µ(D ∖ L) + µ(L), and µ(L) ≤ µ(D) < ∞, this inequality gives

\[ \mu(D \smallsetminus L) \le \varepsilon, \]

which, combined with the obvious inclusion A ∖ L ⊂ D ∖ L, yields

\[ \mu^*(A \smallsetminus L) \le \mu^*(D \smallsetminus L) = \mu(D \smallsetminus L) \le \varepsilon. \tag{15} \]

Using (13) we can also find an open set E ⊃ L ∖ A, such that

\[ \mu(E) \le \mu^*(L \smallsetminus A) + \varepsilon. \tag{16} \]

Since L ∖ A is µ∗-measurable, we have

\[ \mu(E) = \mu^*(E) = \mu^*\bigl(E \smallsetminus (L \smallsetminus A)\bigr) + \mu^*(L \smallsetminus A). \]

Since µ∗(L ∖ A) ≤ µ∗(E) = µ(E) < ∞, the inequality (16) gives

\[ \mu^*\bigl(E \smallsetminus (L \smallsetminus A)\bigr) \le \varepsilon. \tag{17} \]

Consider the set K = L ∖ E. It is obvious that K is compact, and we have the inclusion

\[ K \subset L \smallsetminus (L \smallsetminus A) = L \cap A \subset A. \]

Moreover, we have

\[ (L \cap A) \smallsetminus K \subset E \smallsetminus (L \smallsetminus A). \]

Using the inequality (17), we then get

\[ \mu^*\bigl((L \cap A) \smallsetminus K\bigr) \le \varepsilon. \]

Finally, the above inequality, combined with (15), gives

\[ \mu^*(A \smallsetminus K) \le \mu^*\bigl((L \cap A) \smallsetminus K\bigr) + \mu^*\bigl((A \smallsetminus L) \smallsetminus K\bigr) \le \varepsilon + \mu^*(A \smallsetminus L) \le 2\varepsilon. \]

Since K ⊂ A, we get

\[ \mu^*(A) \le \mu^*(A \smallsetminus K) + \mu^*(K) \le 2\varepsilon + \mu(K) \le 2\varepsilon + \nu(A). \]

Since the inequality µ∗(A) ≤ 2ε + ν(A) holds for all ε > 0, we get µ∗(A) ≤ ν(A), so (ii) follows.


(ii) ⇒ (iii). Assume A satisfies (ii), and let us show that A has property (iii). For every integer n ≥ 1, we use (ii) to find a compact set K_n ⊂ A, such that

\[ \mu^*(A) \le \mu(K_n) + \frac{1}{n}. \tag{18} \]

On the one hand, we have the inclusions A ∖ [⋃_{n=1}^∞ K_n] ⊂ A ∖ K_p, which give

\[ \mu^*\Bigl( A \smallsetminus \Bigl[ \bigcup_{n=1}^{\infty} K_n \Bigr] \Bigr) \le \mu^*(A \smallsetminus K_p), \quad \forall\, p \ge 1. \tag{19} \]

On the other hand, since K_p is measurable, we have the equality

\[ \mu^*(A) = \mu^*(A \smallsetminus K_p) + \mu^*(K_p) = \mu^*(A \smallsetminus K_p) + \mu(K_p), \]

and then the fact that µ∗(A) < ∞, combined with (18), will force

\[ \mu^*(A \smallsetminus K_p) \le \frac{1}{p}, \quad \forall\, p \ge 1. \]

Using (19), this forces µ∗(A ∖ [⋃_{n=1}^∞ K_n]) = 0.

(iii) ⇒ (i). This is pretty obvious. We define the sets B = ⋃_{n=1}^∞ K_n ⊂ A, and N = A ∖ B. Then µ∗(N) = 0, so in particular, N is µ∗-measurable. Since B is Borel, it is also µ∗-measurable, so A = B ∪ N is indeed µ∗-measurable.
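A classical instance of Lemma 7.1, added here as an illustration, is the set of irrational points of [0,1] with the Lebesgue measure on R. The enumeration (q_k) and the sets U_ε, K_ε below are notation introduced only for this sketch.

```latex
% Illustration only: X = \mathbb{R}, \mu = \lambda, A = [0,1] \smallsetminus \mathbb{Q}, so \mu^*(A) = 1.
% Enumerate \mathbb{Q} \cap [0,1] = \{q_1, q_2, \dots\} and, for \varepsilon > 0, put
\[
U_\varepsilon = \bigcup_{k=1}^{\infty} \bigl( q_k - \varepsilon 2^{-k-1},\ q_k + \varepsilon 2^{-k-1} \bigr),
\qquad
\lambda(U_\varepsilon) \le \sum_{k=1}^{\infty} \varepsilon 2^{-k} = \varepsilon .
\]
% The set K_\varepsilon = [0,1] \smallsetminus U_\varepsilon is compact, contains no rational point, and
\[
K_\varepsilon \subset A, \qquad \lambda(K_\varepsilon) \ge 1 - \varepsilon ,
\]
% which gives (ii). Taking \varepsilon = 1/n and K_n = K_{1/n} gives (iii):
\[
\lambda^*\Bigl( A \smallsetminus \bigcup_{n=1}^{\infty} K_n \Bigr)
\le \lambda(A \cap U_{1/n}) \le \frac{1}{n} \quad \forall\, n \ge 1 .
\]
```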

The following result generalizes Lemma 7.1 to the σ-finite case.

Theorem 7.3. Let X be a locally compact space, let µ be a Radon measure on X, and let µ∗ be the maximal outer extension of µ. For a set A ⊂ X, the following are equivalent:

(i) A is µ∗-measurable, and µ∗-σ-finite.
(ii) There exist sequences (K_n)_{n=1}^∞ ⊂ 𝒞_X and (D_n)_{n=1}^∞ ⊂ 𝒯_X, such that

\[ \bigcup_{n=1}^{\infty} K_n \subset A \subset \bigcap_{n=1}^{\infty} D_n \quad\text{and}\quad \mu\Bigl( \Bigl[ \bigcap_{n=1}^{\infty} D_n \Bigr] \smallsetminus \Bigl[ \bigcup_{n=1}^{\infty} K_n \Bigr] \Bigr) = 0. \]

(The condition that A is µ∗-σ-finite means that there exists a sequence (A_n)_{n=1}^∞ of subsets of X, with A = ⋃_{n=1}^∞ A_n, and µ∗(A_n) < ∞, for all n ≥ 1.)

Proof. (i) ⇒ (ii). Assume A is µ∗-measurable and µ∗-σ-finite.

Claim 1: There exists a sequence (A_n)_{n=1}^∞ of µ∗-measurable sets, such that A = ⋃_{n=1}^∞ A_n, and µ∗(A_n) < ∞, ∀n ≥ 1.

A priori, we only know that there exists a sequence (A_n^0)_{n=1}^∞ of subsets of X (not assumed to be µ∗-measurable), with A = ⋃_{n=1}^∞ A_n^0, and µ∗(A_n^0) < ∞, ∀n ≥ 1. Using (13), we can choose however, for each n ≥ 1, an open set E_n, with A_n^0 ⊂ E_n, and µ(E_n) < ∞. In particular, E_n is µ∗-measurable, and so will be A_n = A ∩ E_n. We clearly have A = ⋃_{n=1}^∞ A_n, and µ∗(A_n) ≤ µ∗(E_n) < ∞, ∀n ≥ 1.

Using Claim 1, we start off by writing A = ⋃_{n=1}^∞ A_n, with the A_n's µ∗-measurable, and µ∗(A_n) < ∞. For each n ≥ 1, we use Lemma 7.1 to find a sequence (L_n^p)_{p=1}^∞ of compact subsets of A_n, such that

\[ \mu^*\Bigl( A_n \smallsetminus \Bigl[ \bigcup_{p=1}^{\infty} L_n^p \Bigr] \Bigr) = 0. \]


Let us list the countable collection {L_n^p : p, n ≥ 1} as a sequence (K_n)_{n=1}^∞, so that we have

\[ \bigcup_{n=1}^{\infty} K_n = \bigcup_{n=1}^{\infty} \bigcup_{p=1}^{\infty} L_n^p \subset \bigcup_{n=1}^{\infty} A_n = A. \]

Claim 2: The set M = A ∖ (⋃_{n=1}^∞ K_n) is µ∗-negligible, i.e. µ∗(M) = 0.

Indeed, if we define, for each k ≥ 1, the set M_k = A_k ∖ [⋃_{n=1}^∞ K_n], then we have the obvious equality M = ⋃_{k=1}^∞ M_k, and the inclusions

\[ M_k = A_k \smallsetminus \Bigl[ \bigcup_{p=1}^{\infty} \bigcup_{n=1}^{\infty} L_n^p \Bigr] \subset A_k \smallsetminus \Bigl[ \bigcup_{p=1}^{\infty} L_k^p \Bigr], \quad \forall\, k \ge 1, \]

which, by the choice of the L's, prove that µ∗(M_k) = 0, ∀k ≥ 1.

We proceed now with the construction of the D's. For each pair of integers (p, n), we use (13) to find an open set E_n^p ⊃ A_n, such that µ(E_n^p) ≤ µ∗(A_n) + 1/2^{p+n}. Since the A_n's are µ∗-measurable, we have

\[ \mu(E_n^p) = \mu^*(E_n^p) = \mu^*(A_n) + \mu^*(E_n^p \smallsetminus A_n). \]

Since µ∗(A_n) < ∞, by the choice of the E's, we will get

\[ \mu^*(E_n^p \smallsetminus A_n) \le \frac{1}{2^{p+n}}, \quad \forall\, p, n \ge 1. \tag{20} \]

We then define, for each p ≥ 1, the open set D_p = ⋃_{n=1}^∞ E_n^p. Notice that, for each p ≥ 1, we have the inclusion A = ⋃_{n=1}^∞ A_n ⊂ ⋃_{n=1}^∞ E_n^p = D_p, and

\[ D_p \smallsetminus A = \bigcup_{n=1}^{\infty} [E_n^p \smallsetminus A] \subset \bigcup_{n=1}^{\infty} [E_n^p \smallsetminus A_n]. \]

Using (20), we then get

\[ \mu^*(D_p \smallsetminus A) \le \sum_{n=1}^{\infty} \mu^*(E_n^p \smallsetminus A_n) \le \sum_{n=1}^{\infty} \frac{1}{2^{p+n}} = \frac{1}{2^p}, \quad \forall\, p \ge 1. \tag{21} \]

Since A ⊂ D_p, ∀p ≥ 1, we get A ⊂ ⋂_{p=1}^∞ D_p. Moreover, if we define the set N = [⋂_{p=1}^∞ D_p] ∖ A, we obviously have the inclusions N ⊂ D_p ∖ A, ∀p ≥ 1, and then (21) clearly forces µ∗(N) = 0.

Now we have ⋃_{n=1}^∞ K_n ⊂ A ⊂ ⋂_{p=1}^∞ D_p, and [⋂_{p=1}^∞ D_p] ∖ [⋃_{n=1}^∞ K_n] = N ∪ M, with µ∗(M) = µ∗(N) = 0, so we indeed have (ii).

The implication (ii) ⇒ (i) is pretty obvious. If there exist sequences (K_n)_{n=1}^∞ and (D_n)_{n=1}^∞ as in (ii), then the sets B = ⋃_{n=1}^∞ K_n and G = ⋂_{n=1}^∞ D_n are Borel. Moreover, the inclusions B ⊂ A ⊂ G give A ∖ B ⊂ G ∖ B, so we have µ∗(A ∖ B) ≤ µ∗(G ∖ B). By the second feature in (ii) we know that µ∗(G ∖ B) = 0, therefore the set P = A ∖ B is µ∗-negligible, hence µ∗-measurable. Since A = B ∪ P, it follows that A is indeed µ∗-measurable.

Comment. The implication (ii) ⇒ (i) in Theorem 7.3 holds without the µ∗-σ-finiteness assumption on A. In fact, condition (ii) actually forces A to be µ∗-σ-finite.

Corollary 7.3. If µ is a Radon measure on X, and the set A is µ∗-measurable, and µ∗-σ-finite, then one has the equality

\[ \mu^*(A) = \sup\bigl\{ \mu(K) : K \in \mathcal{C}_X,\ K \subset A \bigr\}. \]


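Proof. Follow the first part of the proof of the implication (i) ⇒ (ii) in Theorem 7.3 to find a sequence (K_n)_{n=1}^∞ of compact subsets of A, such that

\[ \mu^*\Bigl( A \smallsetminus \Bigl[ \bigcup_{n=1}^{\infty} K_n \Bigr] \Bigr) = 0. \]

Since ⋃_{n=1}^∞ K_n is µ∗-measurable, this forces the equality

\[ \mu^*(A) = \mu^*\Bigl( \bigcup_{n=1}^{\infty} K_n \Bigr) = \lim_{n\to\infty} \mu^*(K_1 \cup \cdots \cup K_n). \]

Since each K_1 ∪ ··· ∪ K_n is a compact subset of A, the desired equality follows.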
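The σ-finiteness hypothesis cannot be dropped. A quick check against the space of Exercise 3, taking the claims (iv)-(vi) of that exercise for granted (they are left to the reader in the notes), reads as follows.

```latex
% Illustration only, assuming the statements of Exercise 3: F = \{0\} \times \mathbb{R}.
\[
\mu(F) = \infty
\qquad\text{but}\qquad
\sup\bigl\{ \mu(K) : K \in \mathcal{C}_X,\ K \subset F \bigr\} = 0 .
\]
% This does not contradict Corollary 7.3, because F is not \mu^*-\sigma-finite:
% if \mu^*(A) < \infty, pick an open D \supset A with \mu(D) = \sum_{y} \lambda(D_y) < \infty;
% then D_y \ne \varnothing for at most countably many y, so S_A \subset S_D is countable.
% A countable union of such sets has countable support, whereas
\[
S_F = \mathbb{R} \quad\text{is uncountable.}
\]
```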

Exercise 5*. Let X be a locally compact space, and let µ be a Radon measure on X. Suppose ν : Bor(X) → [0,∞] is a measure satisfying the following conditions:

(a) ν(B) ≤ µ(B), ∀B ∈ Bor(X);
(b) for every B ∈ Bor(X), one has the implication ν(B) < ∞ ⇒ µ(B) < ∞.

Prove that ν is a Radon measure on X. (Notice that, in the case when µ is finite, the condition (b) is superfluous.)

Hints: To prove condition (ii) in the definition of Radon measures, start with some open set D ⊂ X, and choose a sequence K_1 ⊂ K_2 ⊂ ··· ⊂ D of compact subsets, such that

\[ \lim_{n\to\infty} \mu(K_n) = \mu(D), \]

and define the Borel set B = ⋃_{n=1}^∞ K_n ⊂ D. Notice that we have the equalities µ(B) = lim_{n→∞} µ(K_n) and ν(B) = lim_{n→∞} ν(K_n). Argue that, when ν(D) = ∞, we must have ν(B) = ∞. When ν(D) < ∞, show that µ(D ∖ B) = 0. In either case we get ν(B) = ν(D).

The next result explains somehow the anomaly illustrated by Exercise 3.

Proposition 7.5. Let µ be a Radon measure on X, and let µ∗ denote its maximal outer extension. For a subset N ⊂ X, the following are equivalent:

(i) N is µ∗-measurable, and for every compact subset K ⊂ N, one has the equality µ(K) = 0;
(ii) µ∗(D ∩ N) = 0, for all open subsets D ⊂ X with µ(D) < ∞;
(iii) N is locally µ∗-negligible, i.e.

\[ \mu^*(A \cap N) = 0, \quad\text{for all subsets } A \subset X \text{ with } \mu^*(A) < \infty. \]

Proof. (i) ⇒ (ii). Assume N satisfies condition (i). Fix some open set D ⊂ X, with µ(D) < ∞. Then the set D ∩ N is measurable, and µ∗(D ∩ N) ≤ µ(D) < ∞. The equality µ∗(D ∩ N) = 0 then follows from (i), combined with Corollary 7.3.

(ii) ⇒ (iii). Assume N satisfies condition (ii). Fix some arbitrary subset A ⊂ X, with µ∗(A) < ∞. Using (13), there exists some open set D ⊃ A with µ(D) < ∞. Then we have the inequality µ∗(A ∩ N) ≤ µ∗(D ∩ N), so condition (ii) will force µ∗(A ∩ N) = 0.

(iii) ⇒ (i). Let N be locally µ∗-negligible. We know that local µ∗-negligibility implies µ∗-measurability (see Section 5). The fact that µ(K) = 0, for all compact subsets K ⊂ N, is also trivial.

Notation. Let µ be a Radon measure on the locally compact space X, and let µ∗ be the maximal outer extension of µ. We denote the σ-algebra m_{µ∗}(X), of all µ∗-measurable subsets of X, simply by M_µ(X), and we define the measure µ̄ = µ∗|_{m_{µ∗}(X)}. Using the terminology introduced in Section 5, the pair (M_µ(X), µ̄) is the quasi-completion of Bor(X) with respect to µ.


Our next goal is to examine the inclusion Bor(X) ⊂ M_µ(X) along the same lines used in the final part of Section 5. In preparation for the results that follow, it is helpful to introduce the following terminology.

Definition. Let µ be a Radon measure on the locally compact space X. A non-empty compact subset K ⊂ X is said to be µ-tight, if it has the property
• there is no compact non-empty proper subset L ⊊ K, with µ(K) = µ(L).

Remark 7.3. Singleton sets are always µ-tight. If K is µ-tight, and µ(K) = 0, then K must be a singleton.

For a non-empty compact set K with µ(K) > 0, the µ-tightness is equivalent to the following condition¹¹:

\[ D \subset X \text{ open},\ D \cap K \ne \varnothing \ \Longrightarrow\ \mu(D \cap K) > 0. \tag{22} \]

Indeed, if K is µ-tight, and D ⊂ X is an open set, such that D ∩ K ≠ ∅, then the compact set L = K ∖ D is either empty, or a proper subset of K. In either case, we get µ(L) < µ(K), and then the equality D ∩ K = K ∖ L gives µ(D ∩ K) = µ(K) − µ(L) > 0. Conversely, if K satisfies (22) and if L is a non-empty proper compact subset of K, then the set D = X ∖ L is open, and satisfies D ∩ K ≠ ∅. By (22) this forces µ(D ∩ K) > 0, and since we have L = K ∖ (D ∩ K), we get µ(L) = µ(K) − µ(D ∩ K) < µ(K).
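To see criterion (22) at work, here are two compact subsets of R tested against the Lebesgue measure. Both sets are chosen only for this illustration.

```latex
% Illustration only: X = \mathbb{R}, \mu = \lambda.
% (a) K = [0,1] is \lambda-tight: if D is open and D \cap [0,1] \ne \varnothing,
%     then D \cap [0,1] contains a non-degenerate interval, hence
\[
\lambda\bigl(D \cap [0,1]\bigr) > 0 .
\]
% (b) K' = [0,1] \cup \{2\} is NOT \lambda-tight: the open set D = (3/2, 5/2) meets K', but
\[
\lambda(D \cap K') = \lambda(\{2\}) = 0 ,
\]
% equivalently, L = [0,1] \subsetneq K' is a compact proper subset with \lambda(L) = \lambda(K') = 1.
```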

A µ-tight compact set K, with µ(K) > 0, will be called non-degenerate.

Lemma 7.2. Let X be a locally compact space, and let µ be a Radon measure on X. Every non-empty compact set K ⊂ X has a µ-tight compact subset K_0 ⊂ K, with µ(K_0) = µ(K).

Proof. If K is already tight, there is nothing to prove. Also, if µ(K) = 0, then we can pick K_0 to be of the form {x}, with x any point in K.

For the remainder of the proof, we are going to assume that K is not µ-tight, and µ(K) > 0. Consider the collection

\[ \mathcal{L} = \bigl\{ L \in \mathcal{C}_X : \varnothing \ne L \subsetneq K \text{ and } \mu(L) = \mu(K) \bigr\}. \]

Since K is not µ-tight, the collection ℒ is non-empty. One key property of the collection ℒ is the following.

Claim 1: If L_1, ..., L_n ∈ ℒ, then L_1 ∩ ··· ∩ L_n ∈ ℒ.

Indeed, if we define the sets A_j = K ∖ L_j, j = 1, ..., n, then µ(A_1) = ··· = µ(A_n) = 0, and then the equality

\[ K \smallsetminus [L_1 \cap \cdots \cap L_n] = A_1 \cup \cdots \cup A_n \]

will force µ(K ∖ [L_1 ∩ ··· ∩ L_n]) = 0, thus giving µ(L_1 ∩ ··· ∩ L_n) = µ(K) > 0. (The last inequality forces of course L_1 ∩ ··· ∩ L_n ≠ ∅.)

Using the finite intersection property, it follows that the intersection K_0 = ⋂_{L∈ℒ} L is non-empty.

Claim 2: K_0 ∈ ℒ.

Obviously K_0 is a compact non-empty proper subset of K, so the only thing we need to prove is the equality µ(K_0) = µ(K). Consider the Borel subset

\[ B = K \smallsetminus K_0 \subset K. \]

11 Notice that using D = X, condition (22) actually forces µ(K) > 0.


Since B ⊂ K, it follows that µ(B) < ∞. By Corollary 7.3 we have

\[ \mu(B) = \sup\bigl\{ \mu(P) : P \text{ compact},\ P \subset B \bigr\}. \tag{23} \]

Notice however that if P ⊂ B is compact, then we have, by the definition of B, the equality

\[ \bigcap_{L \in \mathcal{L}} (L \cap P) = \Bigl( \bigcap_{L \in \mathcal{L}} L \Bigr) \cap P = (K \smallsetminus B) \cap P = \varnothing, \]

so again by the finite intersection property, combined with Claim 1, it follows that there exists L ∈ ℒ, such that P ∩ L = ∅. Then we have µ(K ∖ L) = 0, so the inclusion P ⊂ K ∖ L will force µ(P) = 0. Using (23), this forces µ(B) = 0.

We now show that K_0 is µ-tight. Indeed, if K_0 were not tight, we could find some non-empty compact proper subset L ⊊ K_0, with µ(L) = µ(K_0) = µ(K). This will of course force L to belong to ℒ, and therefore it will force the inclusion K_0 ⊂ L, which is impossible.
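The set K_0 produced by the proof can be computed by hand in simple cases. The measure and the compact set in the following sketch are assumptions made only for illustration.

```latex
% Illustration only: X = \mathbb{R}, \mu = \delta_0 + \lambda|_{[1,2]}, K = [0,3], so \mu(K) = 2.
% K is not \mu-tight, e.g. L = [0,2] \subsetneq K has \mu(L) = 2.
% A compact L \subset K with \mu(L) = 2 must contain 0 (otherwise the mass of \delta_0 is lost)
% and must contain every x \in [1,2] (otherwise an open neighbourhood of x misses L
% and carries positive \lambda|_{[1,2]}-measure). Hence
\[
K_0 = \bigcap_{L \in \mathcal{L}} L = \{0\} \cup [1,2],
\qquad \mu(K_0) = 1 + 1 = 2 = \mu(K),
\]
% and K_0 satisfies (22): every open D meeting K_0 has \mu(D \cap K_0) > 0.
```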

Lemma 7.3. Let X be a locally compact space, let µ be a Radon measure on X, and let 𝒢 be a pairwise disjoint collection of non-degenerate µ-tight compact sets. For any set A ⊂ X, with µ∗(A) < ∞, the collection

\[ S_{\mathcal{G}}(A) = \bigl\{ G \in \mathcal{G} : G \cap A \ne \varnothing \bigr\} \]

is at most countable.

Proof. Since µ∗(A) < ∞, by (13), there exists some open set D ⊃ A with µ(D) < ∞. It is obvious that S_𝒢(A) ⊂ S_𝒢(D), so it suffices to prove that S_𝒢(D) is at most countable.

On the one hand, we notice that, for every finite subset F ⊂ S_𝒢(D), one has

\[ \sum_{G \in F} \mu(G \cap D) = \mu\Bigl( \bigcup_{G \in F} [G \cap D] \Bigr) \le \mu(D) < \infty. \]

This means that the family (µ(G ∩ D))_{G∈S_𝒢(D)} is summable, and we have

\[ \sum_{G \in S_{\mathcal{G}}(D)} \mu(G \cap D) \le \mu(D) < \infty. \]

On the other hand, by Remark 7.3, we know that all the terms µ(G ∩ D), G ∈ S_𝒢(D), are strictly positive. Using Proposition II.2.2, this forces S_𝒢(D) to be countable.
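The countability step can be spelled out directly. The following is the usual counting argument, written as a sketch under the assumption that this is what the cited Proposition II.2.2 encodes; the sets S_m are notation introduced here.

```latex
% Sketch of the counting argument (S_m is notation introduced only for this sketch).
% For each integer m \ge 1 put
\[
S_m = \bigl\{ G \in S_{\mathcal{G}}(D) : \mu(G \cap D) > \tfrac{1}{m} \bigr\} .
\]
% If G_1, \dots, G_k are distinct elements of S_m, pairwise disjointness gives
\[
\frac{k}{m} < \sum_{i=1}^{k} \mu(G_i \cap D) \le \mu(D)
\quad\Longrightarrow\quad k < m\,\mu(D) ,
\]
% so each S_m is finite. Since every term \mu(G \cap D) is strictly positive,
\[
S_{\mathcal{G}}(D) = \bigcup_{m=1}^{\infty} S_m
\]
% is a countable union of finite sets, hence at most countable.
```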

The main application of the above result is the following.

Theorem 7.4. Let X be a locally compact space, and let µ be a Radon measure on X. Then there exists a partition ℱ of X into µ-tight compact sets, with the property that the set

\[ N_{\mathcal{F}} = \bigcup_{\substack{F \in \mathcal{F} \\ \mu(F) = 0}} F \]

is locally µ∗-negligible.

Proof. Define the set

\[ \Omega = \bigl\{ \mathcal{F} : \mathcal{F} \text{ pairwise disjoint collection of non-degenerate } \mu\text{-tight compact sets} \bigr\}. \]

We agree to consider the empty collection as an element of Ω, so that Ω is non-empty. Equip the set Ω with the order relation ⊂ given by inclusion.


Claim 1: The ordered set (Ω, ⊂) contains a maximal element.

This is a straightforward application of Zorn's Lemma. Start with some subset Λ of Ω, which is totally ordered with respect to ⊂, and let us show that there is an upper bound for Λ (in Ω). If we write Λ = {𝒢_i : i ∈ I}, we define the collection 𝒢 = ⋃_{i∈I} 𝒢_i. It is clear that every element in 𝒢 is a non-degenerate µ-tight compact set. If K, L ∈ 𝒢 are different elements, then there exist i, j ∈ I with K ∈ 𝒢_i and L ∈ 𝒢_j. Since Λ is totally ordered, we either have 𝒢_i ⊂ 𝒢_j, or 𝒢_j ⊂ 𝒢_i. In either case, we conclude that there exists some k ∈ I, such that K, L ∈ 𝒢_k, and then K ∩ L = ∅. This shows that 𝒢 is pairwise disjoint, hence 𝒢 belongs to Ω. It is obvious that 𝒢 is an upper bound for Λ.

Having proven Claim 1, we fix a maximal collection 𝒢 ∈ Ω, and we define the set T = ⋃_{G∈𝒢} G. It is quite possible that 𝒢 = ∅. In that case we define T = ∅.

Claim 2: For every compact subset K ⊂ X ∖ T, one has µ(K) = 0.

We prove this by contradiction. Assume µ(K) > 0. By Lemma 7.2 there exists a µ-tight compact subset K_0 ⊂ K, with µ(K_0) = µ(K) > 0 (in particular K_0 is non-degenerate). But then the collection 𝒢 ∪ {K_0} would obviously contradict the maximality of 𝒢.

Claim 3: Whenever D ⊂ X is an open set with µ(D) < ∞, it follows that the set D ∖ T is Borel, and µ(D ∖ T) = 0.

By Lemma 7.3, the collection

\[ S_{\mathcal{G}}(D) = \bigl\{ G \in \mathcal{G} : G \cap D \ne \varnothing \bigr\} \]

is at most countable. Now we have

\[ D \cap T = \bigcup_{G \in S_{\mathcal{G}}(D)} (D \cap G), \]

so D ∩ T is a countable union of Borel sets, hence D ∩ T itself is Borel, and so will be D ∖ T = D ∖ (D ∩ T). Since µ(D ∖ T) ≤ µ(D) < ∞, by Corollary 7.3, we have

\[ \mu(D \smallsetminus T) = \sup\bigl\{ \mu(K) : K \text{ compact},\ K \subset D \smallsetminus T \bigr\}. \]

By Claim 2, this forces µ(D ∖ T) = 0.

Going back to the proof of the theorem, we notice that, by Claim 2, none of the singletons {x}, x ∈ X ∖ T, has positive measure. We can then define the collection

\[ \mathcal{F} = \mathcal{G} \cup \bigl\{ \{x\} : x \in X \smallsetminus T \bigr\}, \]

which is obviously a partition of X into µ-tight compact sets. For this partition, we obviously have the equality N_ℱ = X ∖ T. By Claim 3, we have

\[ \mu^*(N_{\mathcal{F}} \cap D) = 0, \quad\text{for all open sets } D \subset X \text{ with } \mu(D) < \infty. \]

By Proposition 7.5, it follows that N_ℱ is indeed locally µ∗-negligible.

Definition. Let X be a locally compact space, and let µ be a Radon measure on X. A partition ℱ of X into µ-tight compact sets, with the property stated in Theorem 7.4, will be called non-degenerate.

The existence of such partitions is significant, as indicated below.

Theorem 7.5. Let X be a locally compact space, let µ be a Radon measure on X, and let ℱ be a non-degenerate partition of X into µ-tight compact sets. Then¹² ℱ is a sufficient µ-finite Bor(X)-partition of X.

12 See Section 5 for the terminology.


Proof. What we need to prove are the following properties:

(i) ℱ is pairwise disjoint, and ⋃_{F∈ℱ} F = X;
(ii) ℱ ⊂ Bor(X), and µ(F) < ∞, for all F ∈ ℱ;
(iii) for every set B ∈ Bor(X), with µ(B) < ∞, one has the equality¹³

\[ \mu(B) = \sum_{F \in \mathcal{F}} \mu(B \cap F). \tag{24} \]

Conditions (i) and (ii) are obvious.

To prove condition (iii) we define the sub-collection 𝒢 = {F ∈ ℱ : µ(F) > 0}, so that 𝒢 consists of non-degenerate µ-tight compact sets, and the set

\[ N_{\mathcal{F}} = X \smallsetminus \Bigl( \bigcup_{F \in \mathcal{G}} F \Bigr) \]

is locally µ∗-negligible. Assume now B ∈ Bor(X) has µ(B) < ∞. By Lemma 7.3, the collection

\[ S_{\mathcal{G}}(B) = \bigl\{ F \in \mathcal{G} : B \cap F \ne \varnothing \bigr\} \]

is at most countable. In particular, the set

\[ B_0 = \bigcup_{F \in S_{\mathcal{G}}(B)} (B \cap F) = B \smallsetminus N_{\mathcal{F}} \]

is Borel, and so will be B ∖ B_0 = B ∩ N_ℱ. On the one hand, since B ∖ B_0 is a subset of N_ℱ, it follows that B ∖ B_0 is locally µ∗-negligible. On the other hand, since B ∖ B_0 is a subset of B, it follows that µ(B ∖ B_0) < ∞. This clearly forces µ(B ∖ B_0) = 0, so we have the equality

\[ \mu(B) = \mu(B_0) = \sum_{F \in S_{\mathcal{G}}(B)} \mu(B \cap F). \tag{25} \]

Notice that, if F ∈ ℱ ∖ S_𝒢(B), then either F ∉ 𝒢, in which case we have µ(F) = 0, or F ∈ 𝒢 ∖ S_𝒢(B), in which case we have B ∩ F = ∅. This shows that

\[ \mu(B \cap F) = 0, \quad \forall\, F \in \mathcal{F} \smallsetminus S_{\mathcal{G}}(B), \]

so the equality (25) immediately gives (24).

Corollary 7.4. Under the hypothesis above, the collection ℱ is a µ-finite decomposition for M_µ(X).

Proof. Immediate from Corollary 5.3.

In the remainder of this section we discuss two basic examples of methods for constructing (regular) contents.

To introduce the first construction, let us recall some notations and terminology introduced in II.5. For a locally compact space X, and K one of the fields R or C, we denote by C_c^K(X) the space of all continuous functions f : X → K with compact support. An R-linear map φ : C_c^R(X) → R is said to be positive, if it has the property:

\[ f \in C_c^{\mathbb{R}}(X),\ f \ge 0 \ \Longrightarrow\ \phi(f) \ge 0. \]

With these notations, we have the following result.

13 Here we use the summation convention from II.2


Proposition 7.6. Let X be a locally compact space, and let φ : C_c^R(X) → R be a positive R-linear map. For every compact subset K ⊂ X, define the number

\[ \omega_\phi(K) = \inf\bigl\{ \phi(f) : f \in C_c^{\mathbb{R}}(X),\ f \ge \kappa_K \bigr\}. \]

Then the map 𝒞_X ∋ K ⟼ ω_φ(K) ∈ [0,∞) is a regular content on X.

Proof. The inequality f ≥ κ_K forces f ≥ 0, so we indeed have ω_φ(K) ≥ 0, ∀K ∈ 𝒞_X. We now check conditions (i)-(iv) in the definition of a content.

The constant function 0 satisfies 0 ≥ κ_∅, which immediately gives the equality ω_φ(∅) = 0, so condition (i) is satisfied.

By the definition of ω_φ, it is clear that one has the implication

\[ K, L \in \mathcal{C}_X,\ K \subset L \ \Longrightarrow\ \omega_\phi(K) \le \omega_\phi(L), \]

thus giving condition (ii).

To check condition (iii), suppose K, L ∈ 𝒞_X, and let us prove the inequality

\[ \omega_\phi(K \cup L) \le \omega_\phi(K) + \omega_\phi(L). \tag{26} \]

Start with some ε > 0, and choose functions f, g ∈ C_c^R(X), such that f ≥ κ_K, g ≥ κ_L, φ(f) ≤ ω_φ(K) + ε, and φ(g) ≤ ω_φ(L) + ε. If we consider the function h = f + g ∈ C_c^R(X), then we clearly have h ≥ κ_{K∪L}, so we will have

\[ \omega_\phi(K \cup L) \le \phi(h) = \phi(f + g) = \phi(f) + \phi(g) \le \omega_\phi(K) + \omega_\phi(L) + 2\varepsilon. \]

Since the inequality ω_φ(K ∪ L) ≤ ω_φ(K) + ω_φ(L) + 2ε holds for arbitrary ε > 0, it will clearly force (26).

Finally, to check condition (iv) we start with two disjoint sets K, L ∈ 𝒞_X, and we prove the equality

\[ \omega_\phi(K \cup L) = \omega_\phi(K) + \omega_\phi(L). \tag{27} \]

By (26) it suffices to show the inequality

\[ \omega_\phi(K \cup L) \ge \omega_\phi(K) + \omega_\phi(L). \tag{28} \]

Start with some arbitrary ε > 0, and choose a function f ∈ C_c^R(X), with f ≥ κ_{K∪L} and φ(f) ≤ ω_φ(K ∪ L) + ε. Use Urysohn Lemma for locally compact spaces (Theorem I.5.1) to find a continuous map θ : X → [0,1], such that θ|_K = 1 and θ|_L = 0. The functions g = fθ and h = f(1 − θ) are obviously continuous, and have compact supports. Moreover, one has the inequalities g ≥ κ_K and h ≥ κ_L. Since g + h = f, we get

\[ \omega_\phi(K \cup L) + \varepsilon \ge \phi(f) = \phi(g + h) = \phi(g) + \phi(h) \ge \omega_\phi(K) + \omega_\phi(L). \]

Since the inequality ω_φ(K ∪ L) + ε ≥ ω_φ(K) + ω_φ(L) holds for all ε > 0, it will clearly force the inequality (28).

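So far, we have shown that ω_φ is a content. We now prove that ω_φ is regular, which means that, for every K ∈ 𝒞_X, one has the equality

\[ \omega_\phi(K) = \inf\bigl\{ \omega_\phi(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L) \bigr\}. \]

By property (ii) we always have the inequality

\[ \omega_\phi(K) \le \inf\bigl\{ \omega_\phi(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L) \bigr\}, \]

so all we need to prove is the inequality

\[ \omega_\phi(K) \ge \inf\bigl\{ \omega_\phi(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L) \bigr\}. \tag{29} \]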


Start with some arbitrary ε > 0, and choose a function f ∈ C_c^R(X) with f ≥ κ_K, and φ(f) ≤ ω_φ(K) + ε. Consider the function g = (1 + ε)f, and the set

\[ D = \{ x \in X : g(x) > 1 \}. \]

Obviously D is an open set, and since f(x) ≥ 1, ∀x ∈ K, we get g(x) ≥ 1 + ε > 1, ∀x ∈ K. In particular, this gives the inclusion K ⊂ D. Apply then Lemma I.5.1 to find some compact set L ⊂ D, with K ⊂ Int(L). Since g(x) > 1, ∀x ∈ L, we clearly have

\[ \omega_\phi(L) \le \phi(g) = (1 + \varepsilon)\phi(f) \le (1 + \varepsilon)\bigl(\omega_\phi(K) + \varepsilon\bigr). \]

This argument shows that, if we denote the right-hand side of (29) by ν(K), then we have the inequality

\[ \nu(K) \le (1 + \varepsilon)\bigl(\omega_\phi(K) + \varepsilon\bigr). \]

Since this inequality holds for all ε > 0, it will force the inequality ν(K) ≤ ω_φ(K), thus proving (29).

Definition. Let X be a locally compact space, and let φ : C_c^R(X) → R be a positive R-linear map. We apply Corollary 7.2 to the regular content ω_φ, and we will denote the Radon measure extension of ω_φ simply by µ_φ. The measure µ_φ on Bor(X) is called the Riesz measure associated with φ.
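The simplest example, included here as an illustration rather than as part of the notes, is evaluation at a fixed point x_0 ∈ X. The symbols ev_{x_0} and δ_{x_0} are introduced only for this sketch.

```latex
% Illustration only: \phi = \mathrm{ev}_{x_0}, i.e. \phi(f) = f(x_0), which is positive and linear.
% If x_0 \in K, every f \ge \kappa_K has f(x_0) \ge 1, and Urysohn's Lemma provides
% f \in C_c^{\mathbb{R}}(X) with 0 \le f \le 1 and f|_K = 1, so the infimum equals 1.
% If x_0 \notin K, Urysohn's Lemma (applied to K \subset D = X \smallsetminus \{x_0\}) provides
% f \in C_c^{\mathbb{R}}(X) with f \ge \kappa_K and f(x_0) = 0, so the infimum equals 0. Hence
\[
\omega_\phi(K) =
\begin{cases}
1, & x_0 \in K, \\
0, & x_0 \notin K,
\end{cases}
\]
% which is the restriction to compact sets of the Dirac measure \delta_{x_0};
% the associated Riesz measure is therefore \mu_\phi = \delta_{x_0}.
```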

An interesting property, which will later be generalized, is the following.

Lemma 7.4 (Mean Value Property). Let X be a locally compact space, let φ : C_c^R(X) → R be a positive R-linear map, and let µ_φ be the Riesz measure associated with φ. For any function f ∈ C_c^R(X), and any compact subset K ⊂ X, with K ⊃ supp f, one has the inequality

\[ \Bigl[ \min_{x \in K} f(x) \Bigr] \cdot \mu_\phi(K) \le \phi(f) \le \Bigl[ \max_{x \in K} f(x) \Bigr] \cdot \mu_\phi(K). \tag{30} \]

Proof. Since min_{x∈K} f(x) = −max_{x∈K} (−f)(x), it suffices to prove only the inequality

\[ \phi(f) \le \Bigl[ \max_{x \in K} f(x) \Bigr] \cdot \mu_\phi(K). \tag{31} \]

Fix f ∈ C_c^R(X), as well as the compact set K ⊃ supp f. Denote the number max_{x∈K} f(x) simply by M.

If M < 0 the inequality is pretty clear, because the function g = f/M satisfies g ≥ κ_K, which gives φ(g) ≥ ω_φ(K) = µ_φ(K), and then multiplying by M immediately gives (31).

The case M = 0 is also trivial, since this forces f ≤ 0, so we get φ(f) ≤ 0.

Assume M > 0. Fix for the moment some ε > 0, and choose some function h ∈ C_c^R(X), with h ≥ κ_K, and φ(h) ≤ µ_φ(K) + ε.

Let us observe that Mh − f ≥ 0. Indeed, if we start with some arbitrary point x ∈ X, then either x ∈ K, in which case we have Mh(x) ≥ M ≥ f(x), or we have x ∈ X ∖ K, in which case Mh(x) ≥ 0 = f(x).

Using the positivity of φ we then get φ(Mh − f) ≥ 0, which by the choice of h gives

\[ \phi(f) \le \phi(Mh) = M\phi(h) \le M\bigl(\mu_\phi(K) + \varepsilon\bigr). \]

Since the inequality φ(f) ≤ M(µ_φ(K) + ε) holds for arbitrary ε > 0, it will clearly force φ(f) ≤ Mµ_φ(K).
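A quick numerical sanity check of (30), offered as an illustration: take X = R and let φ be the Riemann integral. The sketch assumes the standard fact (not proved at this point of the notes) that the Riesz measure of this φ is the Lebesgue measure λ; the tent function f is a choice made here.

```latex
% Illustration only: \phi(f) = \int_{\mathbb{R}} f(x)\,dx, assuming \mu_\phi = \lambda.
% Tent function: f(x) = \max(0,\, 1 - |x|), with \operatorname{supp} f = [-1,1] = K.
\[
\phi(f) = \int_{-1}^{1} (1 - |x|)\,dx = 1,
\qquad \min_{K} f = 0, \quad \max_{K} f = 1, \quad \lambda(K) = 2,
\]
\[
0 \cdot 2 \ \le\ 1 \ \le\ 1 \cdot 2 .
\]
% For -f one has \min_K(-f) = -1 and \max_K(-f) = 0, and (30) reads
\[
(-1) \cdot 2 \ \le\ -1 \ \le\ 0 \cdot 2 .
\]
```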

The Riesz measure can be implicitly characterized by the following result.


Proposition 7.7. With the notations above, the Riesz measure µ_φ is the unique Radon measure which has the interpolation property:

(i_φ) whenever F ⊂ X is compact, D ⊂ X is open, and f ∈ C_c^R(X) satisfies κ_F ≤ f ≤ κ_D, it follows that one has the inequality

\[ \mu_\phi(F) \le \phi(f) \le \mu_\phi(D). \]

Proof. Let us first show that µ_φ has property (i_φ). Start with F, D and f as in (i_φ). Since µ_φ(F) = ω_φ(F), by the definition of ω_φ, we immediately get the inequality µ_φ(F) ≤ φ(f).

To prove the inequality φ(f) ≤ µ_φ(D), we need some preparations. For every integer n ≥ 1 we define the sets

\[ A_n = \bigl\{ x \in X : f(x) > \tfrac{1}{n} \bigr\} \quad\text{and}\quad B_n = \bigl\{ x \in X : f(x) \ge \tfrac{1}{n} \bigr\}. \]

Define also the set E = {x ∈ X : f(x) > 0}, so that its closure Ē equals supp f. (Here we use the obvious fact that f ≥ 0.) The sets A_n, n ≥ 1, are open. The sets B_n, n ≥ 1, are closed subsets of E ⊂ Ē, hence they are compact. Notice also that we have the inclusions

\[ A_1 \subset B_1 \subset A_2 \subset B_2 \subset \cdots \subset E \subset D. \]

For every n ≥ 1, we use Urysohn Lemma to find a continuous function h_n : X → [0,1], with h_n|_{B_n} = 1 and h_n|_{X∖A_{n+1}} = 0. On the one hand, we notice that the function f(1 − h_n) has its support contained in the compact set Ē ∖ A_n ⊂ X ∖ A_n. Moreover, since we clearly have f(x) ≤ 1/n, ∀x ∈ X ∖ A_n, by Lemma 7.4 we get the inequality

\[ \phi(f) = \phi(f h_n) + \phi\bigl(f(1 - h_n)\bigr) \le \phi(f h_n) + \frac{\mu_\phi(\overline{E} \smallsetminus A_n)}{n} \le \phi(f h_n) + \frac{\mu_\phi(\overline{E})}{n}, \quad \forall\, n \ge 1, \]

which shows that

\[ \phi(f) \le \limsup_{n\to\infty} \phi(f h_n). \tag{32} \]

On the other hand, for each n ≥ 1, the function f h_n has support contained in B_{n+1}, and (f h_n)(x) ≤ 1, ∀x ∈ B_{n+1}, so again by Lemma 7.4 combined with the inclusion B_{n+1} ⊂ D, we get

\[ \phi(f h_n) \le \mu_\phi(B_{n+1}) \le \mu_\phi(D). \]

Using (32) we immediately get φ(f) ≤ µ_φ(D).

We now prove the uniqueness. Let µ be a Radon measure with property (i_φ).

Claim 1: For any compact set K ⊂ X and any open set D ⊂ X, with K ⊂ D, one has the inequality

\[ \mu_\phi(K) \le \mu(D). \]

Choose a compact set L ⊂ X, with K ⊂ Int(L) ⊂ L ⊂ D, and use Urysohn Lemma to find a continuous function f : X → [0,1] such that f|_K = 1 and f|_{X∖Int(L)} = 0. In particular, f has compact support, and satisfies κ_K ≤ f ≤ κ_D. Using (i_φ) for µ_φ and for µ, we then get µ_φ(K) ≤ φ(f) ≤ µ(D), and we are done.

Claim 2: For every compact set K ⊂ X, one has the equality µ_φ(K) = µ(K).


On the one hand, by the definition of the Radon measure, we have

\[
\mu(K) = \inf\{\mu(D) : D \subset X \text{ open, with } D \supset K\}.
\]

By Claim 1, this immediately gives the inequality µφ(K) ≤ µ(K). On the other hand, if we choose, for every ε > 0, a function $f_\varepsilon \in C_c^{\mathbb{R}}(X)$ with $f_\varepsilon \ge \kappa_K$ and $\phi(f_\varepsilon) \le \mu_\phi(K) + \varepsilon$, then the function $g_\varepsilon = \min\{f_\varepsilon, 1\}$ will also satisfy $g_\varepsilon \ge \kappa_K$, and $\phi(g_\varepsilon) \le \phi(f_\varepsilon) \le \mu_\phi(K) + \varepsilon$. Applying (iφ) for µ, with F = K and D = X, will then force $\mu(K) \le \phi(g_\varepsilon) \le \mu_\phi(K) + \varepsilon$. Since the inequality µ(K) ≤ µφ(K) + ε holds for all ε > 0, it will force µ(K) ≤ µφ(K).

Having proven Claim 2, we now see that, using condition (iii) in the definition of Radon measures, we get the equality µ(D) = µφ(D), for all open sets D ⊂ X. Using condition (ii) from the definition, it then follows that µ(B) = µφ(B), ∀B ∈ Bor(X).

Comment. The Riesz correspondence

\[
\{\text{positive } \mathbb{R}\text{-linear maps } C_c^{\mathbb{R}}(X) \to \mathbb{R}\} \ni \phi \longmapsto \mu_\phi \in \{\text{Radon measures on } X\}
\]

will be studied in Chapter IV, where we will eventually prove the fact that it is bijective. At this point we simply regard it as a method of constructing Radon measures.

Proposition 7.8. Let X be a locally compact space. Then the Riesz correspondence is “linear” in the following sense.

(i) If $\phi : C_c^{\mathbb{R}}(X) \to \mathbb{R}$ is a positive R-linear map, and t ∈ [0,∞), then tφ is also a positive R-linear map, and one has the equality $\mu_{t\phi} = t\mu_\phi$.

(ii) If $\phi_1, \phi_2 : C_c^{\mathbb{R}}(X) \to \mathbb{R}$ are positive R-linear maps, then $\phi_1 + \phi_2$ is also a positive R-linear map, and one has the equality $\mu_{\phi_1+\phi_2} = \mu_{\phi_1} + \mu_{\phi_2}$.

Proof. (i). Assume φ is positive and t ∈ [0,∞). The fact that tφ is positive is trivial. We know, by Proposition 7.2, that $t\mu_\phi$ is a Radon measure. Then the equality $\mu_{t\phi} = t\mu_\phi$ follows from Proposition 7.5, combined with the obvious fact that $t\mu_\phi$ has the interpolation property $(i_{t\phi})$.

(ii). If $\phi_1$ and $\phi_2$ are positive, then so is $\phi_1 + \phi_2$. Define $\psi = \phi_1 + \phi_2$, and $\nu = \mu_{\phi_1} + \mu_{\phi_2}$. By Proposition 7.2, we again know that ν is a Radon measure. The equality $\mu_\psi = \nu$ follows from Proposition 7.5, combined with the obvious fact that ν has the interpolation property $(i_\psi)$.

The Riesz correspondence is also functorial, with respect to proper maps, in the following sense.

Proposition 7.9. Let X and Y be locally compact spaces, let Φ : X → Y be a proper continuous map, and let $\phi : C_c^{\mathbb{R}}(X) \to \mathbb{R}$ be a positive linear map.

(i) Whenever f : Y → R is a continuous function with compact support, it follows that the composition f ∘ Φ : X → R is also a continuous function with compact support.

(ii) The map

\[
\psi : C_c^{\mathbb{R}}(Y) \ni f \longmapsto \phi(f \circ \Phi) \in \mathbb{R}
\]

is R-linear and positive.


(iii) If µφ is the Riesz measure on X defined by φ, and if µψ is the Riesz measure on Y defined by ψ, then one has the equality

\[
\mu_\psi(B) = \mu_\phi\bigl(\Phi^{-1}(B)\bigr), \quad \forall\, B \in \mathrm{Bor}(Y).
\]

Proof. (i). This statement is trivial, since Φ is proper.

(ii). The linearity of ψ is a consequence of the linearity of the map

\[
T : C_c^{\mathbb{R}}(Y) \ni f \longmapsto f \circ \Phi \in C_c^{\mathbb{R}}(X),
\]

and of the obvious equality ψ = φ ∘ T. Positivity is clear, since f ≥ 0 implies f ∘ Φ ≥ 0.

(iii). Use Proposition 7.3, which states that the map ν : Bor(Y) → [0,∞], defined by

\[
\nu(B) = \mu_\phi\bigl(\Phi^{-1}(B)\bigr), \quad \forall\, B \in \mathrm{Bor}(Y),
\]

is a Radon measure. In order to prove statement (iii), which reads µψ = ν, we observe that, using Proposition 7.5, it suffices to prove that ν has the interpolation property $(i_\psi)$. Fix then a compact set K ⊂ Y and an open set D ⊂ Y, as well as a function $f \in C_c^{\mathbb{R}}(Y)$, such that $\kappa_K \le f \le \kappa_D$, and let us prove the inequalities

\[
\nu(K) \le \psi(f) \le \nu(D). \tag{33}
\]

Define the compact set $L = \Phi^{-1}(K)$ (here we use the fact that Φ is proper), and define the open set $E = \Phi^{-1}(D) \subset X$, so that ν(K) = µφ(L) and ν(D) = µφ(E). If we define, using (i), the function $g = f \circ \Phi \in C_c^{\mathbb{R}}(X)$, then we have ψ(f) = φ(g), and the inequalities (33) are the same as the inequalities

\[
\mu_\phi(L) \le \phi(g) \le \mu_\phi(E).
\]

But these inequalities follow immediately from the interpolation property of µφ, combined with the obvious inequalities $\kappa_L \le g \le \kappa_E$.

Remarks 7.4. Let X be a locally compact space, let $\phi : C_c^{\mathbb{R}}(X) \to \mathbb{R}$ be a positive R-linear map, and let µφ be the Riesz measure defined by φ.

A. One has the equality

\[
\mu_\phi(X) = \sup\{\phi(f) : f \in C_c^{\mathbb{R}}(X),\ 0 \le f \le 1\}. \tag{34}
\]

Indeed, if we denote the right hand side of (34) by M, then the inequality µφ(X) ≥ M is immediate from the interpolation property. Conversely, if for each compact set K ⊂ X we choose (using the Urysohn Lemma) some continuous function $f_K : X \to [0,1]$, with compact support, such that $f_K|_K = 1$, then by the interpolation property we get $M \ge \phi(f_K) \ge \mu_\phi(K)$, so we have

\[
M \ge \sup\{\mu_\phi(K) : K \in \mathcal{C}_X\} = \mu_\phi(X).
\]

B. As a consequence of the equality (34), and of Remark II.5.4, we get the equivalence

φ continuous ⇔ µφ(X) < ∞.

Moreover, in this case one has the equality ‖φ‖ = µφ(X).

C. Assume X is non-compact, and φ is continuous. Then φ can be extended to a positive linear function φ′ on the completion $C_0^{\mathbb{R}}(X)$ of $C_c^{\mathbb{R}}(X)$. In this case the Riesz correspondence has a nice connection with the Alexandrov compactification $X^\alpha = X \sqcup \{\infty\}$ (see I.5 and II.5). Recall that $C_0^{\mathbb{R}}(X)$ is identified with the space of all continuous functions $f : X^\alpha \to \mathbb{R}$ with f(∞) = 0. Moreover, φ′ has a unique extension to a positive linear map $\psi : C^{\mathbb{R}}(X^\alpha) \to \mathbb{R}$, with ‖ψ‖ = ‖φ′‖ = ‖φ‖.


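We can then consider two Riesz measures: µφ on X, and µψ on $X^\alpha$. One has the equality

\[
\mu_\psi(B) = \mu_\phi(B \cap X), \quad \forall\, B \in \mathrm{Bor}(X^\alpha). \tag{35}
\]

First of all, remark that

\[
\mu_\psi(K) = \mu_\phi(K), \quad \forall\, K \in \mathcal{C}_X. \tag{36}
\]

This is a consequence of the fact that for every $g \in C^{\mathbb{R}}(X^\alpha)$ with $g \ge \kappa_K$, there exists some $f \in C_c^{\mathbb{R}}(X)$, with $g \ge f \ge \kappa_K$. (Simply take f = gh, for some continuous function h : X → [0, 1] with compact support, with $h|_K = 1$.) Using (36), we immediately get the equality

\[
\mu_\psi(B \cap X) = \mu_\phi(B \cap X), \quad \forall\, B \in \mathrm{Bor}(X^\alpha). \tag{37}
\]

Using this with B = X, we get

\[
\mu_\psi(X) = \mu_\phi(X) = \|\phi\| = \|\psi\| = \mu_\psi(X^\alpha),
\]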

which forces $\mu_\psi(\{\infty\}) = 0$, and then (35) is immediate from (37).

Exercise 6. Consider the case when $X = \mathbb{R}^n$. For every continuous function $f : \mathbb{R}^n \to \mathbb{R}$, with compact support, we define

\[
\phi(f) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n,
\]

where the numbers $a_1 < b_1, \dots, a_n < b_n$ are chosen (arbitrarily) such that

\[
\operatorname{supp} f \subset [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n].
\]

(One can show that the multiple integral is independent of the choice of the a’s and the b’s.) It is obvious that this way we have constructed a positive R-linear map $\phi : C_c^{\mathbb{R}}(\mathbb{R}^n) \to \mathbb{R}$. The Riesz measure µφ, defined by φ, is precisely the Lebesgue measure $\lambda_n$.
Hint: Compute the values of µφ on compact boxes.

We conclude this section with an important result from harmonic analysis. The main object of study is explained in the following.

Definition. A topological group is a group G, which comes also equipped with a topology, which is compatible with the group structure in the sense that the map

\[
G \times G \ni (g, h) \longmapsto gh^{-1} \in G
\]

is continuous. Remark that this is equivalent to the fact that both maps G × G ∋ (g, h) ⟼ gh ∈ G and $G \ni g \longmapsto g^{-1} \in G$ are continuous. To avoid any complications, all topological groups are assumed to be Hausdorff.

Examples 7.2. A. Any group becomes a topological group, when equipped with the discrete topology. (This is the topology in which every subset is open.)

B. The group $(\mathbb{R}^n, +)$ is a topological group, when equipped with the norm topology.

C. The unit circle $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ is a topological group, when equipped with the usual multiplication, and the topology induced from C. More generally, for an integer n ≥ 1, the n-dimensional torus $\mathbb{T}^n$, equipped with coordinate-wise multiplication, and the product topology, is a topological group.

D. Given an integer n ≥ 1, the group $GL_n(\mathbb{R})$, of all invertible n × n matrices (with matrix multiplication as the group operation), is a topological group, when


equipped with the topology coming from the identification of $GL_n(\mathbb{R})$ as an open subset in $\mathbb{R}^{n^2}$.

Notations. Let G be a group. For a subset A ⊂ G and an element g ∈ G, we define the left and right translations of A by g, as the sets

\[
gA = \{gh : h \in A\} \quad\text{and}\quad Ag = \{hg : h \in A\}.
\]

For two subsets A, B ⊂ G, we define

\[
A \cdot B = \{hk : h \in A,\ k \in B\}.
\]

Finally, for a subset A ⊂ G, we define $A^{-1} = \{h^{-1} : h \in A\}$.

Remark 7.5. There is some similarity between topological groups and metric spaces. The subsets that play the role of open balls are the open neighborhoods of the identity. More explicitly, if G is a topological group, with identity element e, then one has the equalities

\begin{align*}
\{N \subset G : N \text{ open neighborhood of } g\}
  &= \{gV \subset G : V \text{ open neighborhood of } e\} \\
  &= \{Wg \subset G : W \text{ open neighborhood of } e\}.
\end{align*}

For example, given a metric space (X, d), a map f : G → X is continuous at some point g ∈ G, if and only if, for every ε > 0, there exists some neighborhood $V_\varepsilon$ of e, such that

\[
d\bigl(f(gh), f(g)\bigr) < \varepsilon, \quad \forall\, h \in V_\varepsilon.
\]

The following two results will be used several times.

Lemma 7.5. Suppose G is a topological group, with identity element e. For any open neighborhood U of e, there exists an open neighborhood V of e, such that $V = V^{-1}$ and V · V ⊂ U.

Proof. Fix the open neighborhood U. Use the continuity of the map G × G ∋ (g, h) ⟼ gh ∈ G, at (e, e), to find an open neighborhood D of (e, e) in G × G, such that

gh ∈ U, ∀ (g, h) ∈ D.

Since D is open in the product topology, there exist open neighborhoods $U_1$ and $U_2$, of e, such that $U_1 \times U_2 \subset D$. Then we obviously have

\[
U_1 \cdot U_2 \subset U.
\]

Consider the open neighborhood $W = U_1 \cap U_2$. We still have W · W ⊂ U. Finally, using the continuity of the map $G \ni g \longmapsto g^{-1} \in G$, it follows that $W^{-1}$ is also an open neighborhood of e. Then we are done, if we take $V = W \cap W^{-1}$.
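
As a concrete illustration of Lemma 7.5 (an example added here, not taken from the notes), consider the group (R, +) written additively, so that the identity is 0, the set $V^{-1}$ becomes −V, and V · V becomes V + V. Assuming, for simplicity, that the given neighborhood is an interval U = (−ε, ε), one possible choice of V is the interval of half the radius:

```latex
\[
  V = \Bigl(-\tfrac{\varepsilon}{2}, \tfrac{\varepsilon}{2}\Bigr)
  \quad\Longrightarrow\quad
  -V = V
  \quad\text{and}\quad
  V + V = (-\varepsilon, \varepsilon) = U .
\]
```

A general open neighborhood of 0 contains such an interval, so the same choice works after shrinking U.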

Proposition 7.10. Let G be a topological group, and let K, L ⊂ G be two compact disjoint sets. Then there exists an open neighborhood V of the identity element e, such that $V = V^{-1}$ and (K · V) ∩ (L · V) = (V · K) ∩ (V · L) = ∅.

Proof. Consider the continuous maps φ, φ′ : G × G → G, defined by $\phi(g, h) = gh^{-1}$ and $\phi'(g, h) = g^{-1}h$, and the compact set C = (K × L) ∪ (L × K) ⊂ G × G. Since φ and φ′ are continuous, it follows that φ(C) ∪ φ′(C) is a compact subset of G. The condition K ∩ L = ∅ obviously gives the fact that e ∉ φ(C) ∪ φ′(C). Since φ(C) ∪ φ′(C) is closed, there exists some open neighborhood U of e, such that (φ(C) ∪ φ′(C)) ∩ U = ∅. Use Lemma 7.5 to find some open neighborhood V of e, such that $V = V^{-1}$ and V · V ⊂ U.

We now show that (K · V) ∩ (L · V) = ∅. Suppose the contrary, i.e. there exist g ∈ K, h ∈ L, and v, w ∈ V, such that gv = hw. Then we get $h^{-1}g = wv^{-1} \in V \cdot V^{-1} = V \cdot V \subset U$, which is impossible, since $h^{-1}g = \phi'(h, g)$ also belongs to φ′(C).

Finally, let us show that we also have (V · K) ∩ (V · L) = ∅. Suppose the contrary, i.e. there exist g ∈ K, h ∈ L, and v, w ∈ V, such that vg = wh. Then we get $hg^{-1} = w^{-1}v \in V^{-1} \cdot V = V \cdot V \subset U$, which is impossible, since $hg^{-1} = \phi(h, g)$ also belongs to φ(C).

In what follows we are going to restrict our attention to those topological groups which are locally compact in their respective topology. The topological groups listed in Examples 7.2.A-D are all locally compact.

Definition. Let G be a locally compact group. A Radon measure µ on G is called a Haar measure on G, if µ(G) > 0, and µ has the left invariance property:

µ(gA) = µ(A), ∀ g ∈ G, A ∈ Bor(G).

Remark that, for every g ∈ G the map $\ell_g : G \ni h \longmapsto gh \in G$ is a homeomorphism, so for a subset A ⊂ G, one has the equivalence $A \in \mathrm{Bor}(G) \Leftrightarrow gA = \ell_g(A) \in \mathrm{Bor}(G)$. Likewise, the map $r_g : G \ni h \longmapsto hg \in G$ is a homeomorphism, so A ∈ Bor(G) ⇔ Ag ∈ Bor(G).

Remark 7.6. Let G be a locally compact group. For any element g ∈ G, and any function $F \in C_c^{\mathbb{R}}(G)$, we define the continuous functions $L_gF, R_gF : G \to \mathbb{R}$ by $L_gF = F \circ \ell_{g^{-1}}$ and $R_gF = F \circ r_g$. In other words,

\[
(L_gF)(h) = F(g^{-1}h) \quad\text{and}\quad (R_gF)(h) = F(hg), \quad \forall\, h \in G.
\]

It is fairly obvious that $L_gF$ and $R_gF$ both have compact support. Moreover, for a fixed g ∈ G, the maps $L_g, R_g : C_c^{\mathbb{R}}(G) \to C_c^{\mathbb{R}}(G)$ are linear, and continuous in the norm defined in Exercise 5. One has the equalities

\[
L_{gh} = L_g \circ L_h \quad\text{and}\quad R_{gh} = R_g \circ R_h, \quad \forall\, g, h \in G,
\]

as well as $L_e = R_e = \mathrm{Id}$, where e denotes the identity element in G.

The following result gives a sufficient condition for a Riesz measure to be a Haar measure.

Proposition 7.11. Let G be a locally compact group, and let $\phi : C_c^{\mathbb{R}}(G) \to \mathbb{R}$ be a positive R-linear map, which is not identically zero, and has the left invariance property:

\[
\phi \circ L_g = \phi, \quad \forall\, g \in G.
\]

Then the Riesz measure µφ is a Haar measure on G.

Proof. The key property we need is contained in the following

Claim 1: For any g ∈ G, and any compact subset K ⊂ G, one has the equality µφ(gK) = µφ(K).

Fix for the moment g ∈ G, as well as the compact set K ⊂ G. The set gK is compact, so we have

\[
\mu_\phi(gK) = \inf\{\phi(F) : F \in C_c^{\mathbb{R}}(G),\ F \ge \kappa_{gK}\}. \tag{38}
\]

Notice that if $F \in C_c^{\mathbb{R}}(G)$ satisfies $F \ge \kappa_{gK}$, this means that $F(gh) \ge \kappa_{gK}(gh)$, ∀ h ∈ G. Notice that, for any h ∈ G, one has the equivalences

\[
\kappa_{gK}(gh) = 1 \Leftrightarrow gh \in gK \Leftrightarrow h \in K,
\]

which means that

\[
\kappa_{gK}(gh) = \kappa_K(h), \quad \forall\, h \in G.
\]

The inequality $F \ge \kappa_{gK}$ then gives

\[
F(gh) \ge \kappa_K(h), \quad \forall\, h \in G,
\]

which reads

\[
L_{g^{-1}}F \ge \kappa_K.
\]

Using the invariance property, we get

\[
\mu_\phi(K) \le \phi\bigl(L_{g^{-1}}F\bigr) = (\phi \circ L_{g^{-1}})(F) = \phi(F).
\]

In other words, we have

\[
\phi(F) \ge \mu_\phi(K), \quad\text{for all } F \in C_c^{\mathbb{R}}(G) \text{ with } F \ge \kappa_{gK}.
\]

Using (38) this immediately gives

\[
\mu_\phi(K) \le \mu_\phi(gK).
\]

Applying the same inequality with g replaced by $g^{-1}$ and K replaced by gK, yields

\[
\mu_\phi(gK) \le \mu_\phi\bigl(g^{-1}(gK)\bigr) = \mu_\phi(K),
\]

so the Claim follows.

Claim 2: For any g ∈ G, and any open subset D ⊂ G, one has the equality µφ(gD) = µφ(D).

For a compact subset L ⊂ G, one clearly has the equivalence $L \subset gD \Leftrightarrow g^{-1}L \subset D$. So, using Claim 1, for every compact subset L ⊂ gD, one has

\[
\mu_\phi(L) = \mu_\phi(g^{-1}L) \le \mu_\phi(D),
\]

and using property (iii) for Radon measures, we immediately get the inequality

\[
\mu_\phi(gD) = \sup\{\mu_\phi(L) : L \text{ compact},\ L \subset gD\} \le \mu_\phi(D).
\]

The inequality µφ(D) ≤ µφ(gD) is proven by replacing g with $g^{-1}$ and D with gD, in the above inequality.

We now prove that µφ is a Haar measure. Start with some Borel set A ⊂ G and some g ∈ G. For every open set D ⊃ gA, one has the inclusion $g^{-1}D \supset A$, which using Claim 2, gives $\mu_\phi(D) = \mu_\phi(g^{-1}D) \ge \mu_\phi(A)$. Using property (ii) in the definition of Radon measures, we then have

\[
\mu_\phi(gA) = \inf\{\mu_\phi(D) : D \text{ open},\ D \supset gA\} \ge \mu_\phi(A).
\]

The inequality µφ(A) ≥ µφ(gA) is proven by replacing g with $g^{-1}$ and A with gA, in the above inequality. Finally, since φ is not identically zero, the equality (34) gives µφ(G) > 0.

Comment. Later on, in Chapter IV, we are going to prove that the left invariance property of φ is also a necessary condition for µφ to be a Haar measure.

Examples 7.3. Let us examine the examples 7.2.A-D and let us construct Haar measures on these groups.

A. On a discrete group G, one has the counting measure µ(A) = Card A, ∀A ⊂ G, which is obviously a Haar measure.

B. On (Rn,+), the Lebesgue measure is a Haar measure.


C. On the n-dimensional torus $\mathbb{T}^n$, we consider the Riesz measure $\mu_\Lambda$, associated with the positive R-linear map $\Lambda : C^{\mathbb{R}}(\mathbb{T}^n) \to \mathbb{R}$, defined by

\[
\Lambda(F) = \int_0^1 \cdots \int_0^1 F(e^{2\pi i\theta_1}, \dots, e^{2\pi i\theta_n})\, d\theta_1 \dots d\theta_n.
\]

It is not hard to see that $\Lambda \circ L_g = \Lambda$, $\forall\, g \in \mathbb{T}^n$. One easy way is to check directly the equality $(\Lambda \circ L_g)(P) = \Lambda(P)$, for functions of the form $P(z_1, \dots, z_n) = z_1^{m_1} \cdots z_n^{m_n}$, with $m_1, \dots, m_n \in \mathbb{Z}$, and then use continuity and the Stone-Weierstrass Theorem, which gives the fact that the linear span of all these P’s is dense in $C^{\mathbb{R}}(\mathbb{T}^n)$. Using Proposition 7.11 it follows that $\mu_\Lambda$ is a Haar measure on $\mathbb{T}^n$.
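
The check on monomials suggested in Example C can be written out as follows. This is a sketch added for illustration; it assumes that Λ is extended by linearity to complex-valued continuous functions, and it writes the translating element as $g = (e^{2\pi i\alpha_1}, \dots, e^{2\pi i\alpha_n})$, where the numbers $\alpha_k$ are notation introduced here.

```latex
\begin{align*}
  (L_g P)(z) &= P(g^{-1}z)
    = \prod_{k=1}^n \bigl(e^{-2\pi i\alpha_k} z_k\bigr)^{m_k}
    = e^{-2\pi i (m_1\alpha_1 + \dots + m_n\alpha_n)}\, P(z), \\
  \Lambda(P) &= \prod_{k=1}^n \int_0^1 e^{2\pi i m_k \theta_k}\, d\theta_k
    = \begin{cases}
        1 & \text{if } m_1 = \dots = m_n = 0, \\
        0 & \text{otherwise},
      \end{cases} \\
  \Lambda(L_g P) &= e^{-2\pi i (m_1\alpha_1 + \dots + m_n\alpha_n)}\, \Lambda(P) = \Lambda(P).
\end{align*}
```

The last equality holds in both cases: if all exponents vanish the exponential factor equals 1, and otherwise both sides are zero.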

D. The construction of a Haar measure on GLn(R) is outlined in the following.

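Exercise 7*. Identify $GL_n(\mathbb{R})$ as an open subset in $\mathbb{R}^{n^2}$. For every continuous function $F : GL_n(\mathbb{R}) \to \mathbb{R}$, with compact support, we define the function $\tilde{F} : \mathbb{R}^{n^2} \to \mathbb{R}$ by

\[
\tilde{F}(x) =
\begin{cases}
F(x) \cdot |\det x|^{-n} & \text{if } x \in GL_n(\mathbb{R}) \\
0 & \text{if } x \in \mathbb{R}^{n^2} \smallsetminus GL_n(\mathbb{R})
\end{cases}
\]

and we define

\[
\psi(F) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_{n^2}}^{b_{n^2}} \tilde{F}(x_1, x_2, \dots, x_{n^2})\, dx_1\, dx_2 \cdots dx_{n^2},
\]

where the numbers $a_1 < b_1, \dots, a_{n^2} < b_{n^2}$ are chosen (arbitrarily) such that

\[
\operatorname{supp} \tilde{F} \subset [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_{n^2}, b_{n^2}].
\]

(One has the equality $\operatorname{supp} \tilde{F} = \operatorname{supp} F$, and the multiple integral is independent of the choice of the a’s and the b’s.) Prove that $\psi \circ L_s = \psi$, $\forall\, s \in GL_n(\mathbb{R})$. Conclude that the Riesz measure µψ associated with ψ is a Haar measure on $GL_n(\mathbb{R})$.

Hints: Fix $s \in GL_n(\mathbb{R})$. The map $\ell_{s^{-1}} : GL_n(\mathbb{R}) \to GL_n(\mathbb{R})$ has an obvious linear extension $\Phi_s : \mathbb{R}^{n^2} \to \mathbb{R}^{n^2}$, defined by

\[
\Phi_s(x) = s^{-1}x, \quad \forall\, x \in \mathbb{R}^{n^2},
\]

where the vector space $\mathbb{R}^{n^2}$ is identified with $\mathrm{Mat}_{n\times n}(\mathbb{R})$. Fix now $F \in C_c^{\mathbb{R}}\bigl(GL_n(\mathbb{R})\bigr)$ and consider the function $H = F \circ \ell_{s^{-1}}$, so that $(\psi \circ L_s)(F) = \psi(H)$. Prove the equality

\[
\tilde{H}(x) = \tilde{F}\bigl(\Phi_s(x)\bigr) \cdot |\det s|^{-n}, \quad \forall\, x \in \mathbb{R}^{n^2}.
\]

Prove that the Jacobian of $\Phi_s$ is given as

\[
\bigl|\det[(D\Phi_s)(x)]\bigr| = |\det s|^{-n}, \quad \forall\, x \in \mathbb{R}^{n^2}.
\]

Use this equality, combined with the above formula for $\tilde{H}$, to get the equality ψ(H) = ψ(F), as a result of the change of variable theorem. (Use the fact that in the definition of ψ, instead of integrating over rectangles one can integrate over arbitrary compact sets $\Omega \subset GL_n(\mathbb{R})$, with Jordan negligible boundary, and Int(Ω) ⊃ supp F.)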

Comments. The Haar measures defined in Examples 7.3.A-D are peculiar in the sense that they also have the right invariance property:

µ(Ag) = µ(A), ∀ g ∈ G, A ∈ Bor(G).

In general such a property does not hold. At this point, we can only speculate on this matter, by examining the following example.


Exercise 8*. Consider the group G of all orientation preserving affine transformations of R, i.e. the collection

\[
G = \{T_{ab} : a, b \in \mathbb{R},\ a > 0\},
\]

where $T_{ab} : \mathbb{R} \ni x \longmapsto ax + b \in \mathbb{R}$. (Some people call this the “ax + b” group.) It is not hard to see that compositions and inverses of such transformations are again of this form. In fact one can identify G as the subgroup of $GL_2(\mathbb{R})$ given by

\[
G = \left\{ \begin{bmatrix} a & b \\ 0 & 1 \end{bmatrix} : a, b \in \mathbb{R},\ a > 0 \right\}.
\]

The topology on G is the one induced from this inclusion. Equivalently, G can be identified with the right half-plane (0,∞) × R. We use this identification to define a positive R-linear map $\Lambda : C_c^{\mathbb{R}}(G) \to \mathbb{R}$ as follows. For every $F \in C_c^{\mathbb{R}}(G)$, we choose $0 < c_1 < d_1$ and $c_2 < d_2$, such that $\operatorname{supp} F \subset [c_1, d_1] \times [c_2, d_2]$, and we define

\[
\Lambda(F) = \int_{c_1}^{d_1} \int_{c_2}^{d_2} \frac{F(a, b)}{a^2}\, da\, db.
\]

The integral does not depend on the particular choice of the rectangle. Prove that $\Lambda \circ L_g = \Lambda$, ∀ g ∈ G, so that the Riesz measure $\mu_\Lambda$ is a Haar measure. In general the equality $\Lambda \circ R_g = \Lambda$ fails. As indicated in the comment that followed Proposition 7.11, the fact that $\Lambda \circ R_g \ne \Lambda$ would prevent the Riesz measure $\mu_\Lambda$ from having the right invariance property.

Hints: Use similar arguments to the ones in Exercise 7. If $g = T_{ab} \in G$, then the map

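$\ell_{g^{-1}} : G \to G$ extends to an affine map $\Phi_g : \mathbb{R}^2 \to \mathbb{R}^2$, defined by

\[
\Phi_g(x, y) = \Bigl(\frac{x}{a}, \frac{y - b}{a}\Bigr), \quad \forall\, (x, y) \in \mathbb{R}^2.
\]

Argue as in Exercise 7, and use the change of variable theorem.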
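
Before attempting the proof, the invariance claims for the “ax + b” group can be sanity-checked numerically. The sketch below is an illustration added here, not part of the exercise. It assumes the identification of $T_{ab}$ with the point (a, b), the composition rule $T_{\alpha\beta} \circ T_{ab} = T_{\alpha a,\, \alpha b + \beta}$ (obtained by composing the two affine maps), and it replaces the exact integral by a midpoint Riemann sum. The names `bump`, `haar_integral`, `left_translate` and `right_translate` are hypothetical helpers introduced only for this example.

```python
def bump(a, b):
    """Hypothetical test function: continuous, supported in the disc of radius 1 around (2, 0)."""
    r2 = (a - 2.0) ** 2 + b ** 2
    return (1.0 - r2) ** 2 if r2 < 1.0 else 0.0


def haar_integral(F, a_range=(0.1, 6.0), b_range=(-4.0, 4.0), n=400):
    """Midpoint Riemann sum for Lambda(F), the double integral of F(a, b) / a^2.

    The box a_range x b_range is assumed to contain the support of F.
    """
    a_lo, a_hi = a_range
    b_lo, b_hi = b_range
    ha = (a_hi - a_lo) / n
    hb = (b_hi - b_lo) / n
    total = 0.0
    for i in range(n):
        a = a_lo + (i + 0.5) * ha
        row = 0.0
        for j in range(n):
            b = b_lo + (j + 0.5) * hb
            row += F(a, b)
        total += row / (a * a)
    return total * ha * hb


def left_translate(F, alpha, beta):
    """L_g F for g = T_{alpha,beta}: h = (a, b) is sent to g^{-1} h = (a/alpha, (b - beta)/alpha)."""
    return lambda a, b: F(a / alpha, (b - beta) / alpha)


def right_translate(F, alpha, beta):
    """R_g F for g = T_{alpha,beta}: h = (a, b) is sent to h g = (a*alpha, a*beta + b)."""
    return lambda a, b: F(a * alpha, a * beta + b)


if __name__ == "__main__":
    base = haar_integral(bump)
    left = haar_integral(left_translate(bump, 1.5, 0.5))
    right = haar_integral(right_translate(bump, 1.5, 0.5))
    print("Lambda(F)     =", base)
    print("Lambda(L_g F) =", left)
    print("Lambda(R_g F) =", right)
```

With $g = T_{1.5,\,0.5}$, the value for the left translate should agree with Λ(F) up to discretization error, while the value for the right translate should differ visibly (a change of variables suggests the factor a = 1.5), which is consistent with the failure of right invariance.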

Exercise 9. As indicated above, in general, Haar measures need not have the right invariance property. Prove that when µ is a Haar measure on G, then the map ν : Bor(G) → [0,∞] defined by

\[
\nu(B) = \mu(B^{-1}), \quad \forall\, B \in \mathrm{Bor}(G),
\]

is a Radon measure, which has the right invariance property.
Hint: The map $G \ni g \longmapsto g^{-1} \in G$ is a homeomorphism.

The main result we are interested in is the existence of a Haar measure. The following result reduces the problem to the existence of a left invariant content.

Lemma 7.6. Let G be a locally compact group, and let ω be a content on G, with the left invariance property:

\[
\omega(gK) = \omega(K), \quad \forall\, g \in G,\ K \in \mathcal{C}_G.
\]

If ω is not identically zero, then the outer measure $\omega^*$, induced by ω, also has the left invariance property:

\[
\omega^*(gA) = \omega^*(A), \quad \forall\, g \in G,\ A \subset G.
\]

The measure $\mu = \omega^*\big|_{\mathrm{Bor}(G)}$ is a Haar measure on G.

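Proof. We trace the construction outlined in Theorem 7.1. Denote by $\mathcal{T}_G$ the collection of all open subsets of G, and define the map $\hat{\omega} : \mathcal{T}_G \to [0,\infty]$ (written with a hat, to distinguish it from the content ω) by

\[
\hat{\omega}(D) = \sup\{\omega(K) : K \in \mathcal{C}_G,\ K \subset D\}, \quad \forall\, D \in \mathcal{T}_G.
\]

The outer measure $\omega^*$ is then defined by

\[
\omega^*(A) = \inf\{\hat{\omega}(D) : D \in \mathcal{T}_G,\ D \supset A\}, \quad \forall\, A \subset G.
\]

Claim: The map $\hat{\omega} : \mathcal{T}_G \to [0,\infty]$ has the left invariance property:

\[
\hat{\omega}(gD) = \hat{\omega}(D), \quad \forall\, g \in G,\ D \in \mathcal{T}_G.
\]

Start with some arbitrary compact subset K ⊂ gD. Then $g^{-1}K$ is a compact subset of D, so by the left invariance property of ω, we get

\[
\omega(K) = \omega(g^{-1}K) \le \hat{\omega}(D).
\]

This means that we have $\omega(K) \le \hat{\omega}(D)$, for all compact subsets K ⊂ gD, so by the definition of $\hat{\omega}$ we get

\[
\hat{\omega}(gD) = \sup\{\omega(K) : K \in \mathcal{C}_G,\ K \subset gD\} \le \hat{\omega}(D).
\]

The other inequality $\hat{\omega}(D) \le \hat{\omega}(gD)$ follows from the one above if we replace g with $g^{-1}$ and D with gD.

We are now in position to prove that $\omega^*$ has the left invariance property. Fix for the moment A ⊂ G and g ∈ G. For every open set D ⊃ gA, one has $g^{-1}D \supset A$, so by the Claim we get

\[
\hat{\omega}(D) = \hat{\omega}(g^{-1}D) \ge \omega^*(A).
\]

Since we have $\hat{\omega}(D) \ge \omega^*(A)$, for all open sets D ⊃ gA, by the definition of $\omega^*$, we get

\[
\omega^*(gA) = \inf\{\hat{\omega}(D) : D \in \mathcal{T}_G,\ D \supset gA\} \ge \omega^*(A).
\]

The other inequality $\omega^*(A) \ge \omega^*(gA)$ follows from the one above if we replace g with $g^{-1}$ and A with gA.

In order to prove that µ is a Haar measure, all we need to prove is the fact that µ(G) > 0. Start with some compact subset K ⊂ G, with ω(K) > 0. Every open set D ⊃ K satisfies $\hat{\omega}(D) \ge \omega(K)$, so we have

\[
\mu(G) \ge \mu(K) = \omega^*(K) \ge \omega(K) > 0,
\]

and we are done.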

Before we prove the existence of Haar measures, we need more preparations.

Notations. Let G be a group. For two non-empty subsets A, B ⊂ G, we write A ≺ B, if there exist elements $g_1, \dots, g_n \in G$, such that $A \subset g_1B \cup \dots \cup g_nB$. In this case we define the number

\[
[A : B] = \min\{n \in \mathbb{N} : \text{there exist } g_1, \dots, g_n \in G \text{ with } A \subset g_1B \cup \dots \cup g_nB\}.
\]

The following result will be useful.

Lemma 7.7. Let G be a group.

(i) If A, B ⊂ G are non-empty sets with A ⊂ B, then A ≺ B, and [A : B] = 1.

(ii) The relation ≺ is transitive, i.e. whenever A, B, C ⊂ G are non-empty subsets satisfying A ≺ B and B ≺ C, it follows that A ≺ C. Moreover, in this case one has the inequality

[A : C] ≤ [A : B] · [B : C].

(iii) The relation ≺ is compatible with left translations. This means that for any two elements g, h ∈ G, and any two non-empty subsets A, B ⊂ G, one has the equivalence A ≺ B ⇔ gA ≺ hB. Moreover, in this case one has

[gA : hB] = [A : B].

(iv) If A, B, C ⊂ G are non-empty subsets such that A ≺ C and B ≺ C, then A ∪ B ≺ C. Moreover, in this case one has the inequality

[A ∪ B : C] ≤ [A : C] + [B : C].

(v) If A, B, C ⊂ G are non-empty sets, such that A ≺ C, B ≺ C, and $(A \cdot C^{-1}) \cap (B \cdot C^{-1}) = \emptyset$, then one has the equality

[A ∪ B : C] = [A : C] + [B : C].

Proof. (i) This part is trivial.

(ii) Put m = [A : B] and n = [B : C]. Choose $g_1, \dots, g_m, h_1, \dots, h_n \in G$, such that $A \subset g_1B \cup \dots \cup g_mB$, and $B \subset h_1C \cup \dots \cup h_nC$. We then obviously have the inclusion

\[
A \subset \bigcup_{i=1}^m \bigcup_{j=1}^n (g_ih_j)C,
\]

which proves that A ≺ C, but also shows that [A : C] ≤ mn.

(iii) This follows immediately from (ii) plus the obvious relations A ≺ gA ≺ A, B ≺ hB ≺ B, and the equalities

[A : gA] = [gA : A] = [B : hB] = [hB : B] = 1.

(iv) Let m = [A : C] and n = [B : C]. Choose $g_1, \dots, g_m, g_{m+1}, \dots, g_{m+n} \in G$ such that $A \subset g_1C \cup \dots \cup g_mC$ and $B \subset g_{m+1}C \cup \dots \cup g_{m+n}C$. This clearly shows that A ∪ B ≺ C and [A ∪ B : C] ≤ m + n.

(v) Let p = [A ∪ B : C], and choose $g_1, \dots, g_p \in G$, such that $A \cup B \subset g_1C \cup \dots \cup g_pC$. Define the sets

\[
M = \{j \in \{1, \dots, p\} : A \cap g_jC \ne \emptyset\} \quad\text{and}\quad N = \{k \in \{1, \dots, p\} : B \cap g_kC \ne \emptyset\}.
\]

Notice that M ∩ N = ∅. Indeed, if there exists j ∈ M ∩ N, this means that on the one hand, we have $A \cap g_jC \ne \emptyset$, which gives $g_j \in A \cdot C^{-1}$, and on the other hand, we have $B \cap g_jC \ne \emptyset$ which gives $g_j \in B \cdot C^{-1}$. But this clearly contradicts the assumption that $(A \cdot C^{-1}) \cap (B \cdot C^{-1}) = \emptyset$.

By the definition of M and N, we clearly have the inclusions

\[
A \subset \bigcup_{j \in M} g_jC \quad\text{and}\quad B \subset \bigcup_{k \in N} g_kC.
\]

These immediately give the inequalities [A : C] ≤ card M and [B : C] ≤ card N. Since M and N are disjoint, and M ∪ N ⊂ {1, . . . , p}, these inequalities give

[A : C] + [B : C] ≤ card M + card N = card(M ∪ N) ≤ p = [A ∪ B : C].

Using part (iv), we see that in fact we have equality [A : C] + [B : C] = [A ∪ B : C].
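
The statements of Lemma 7.7 can be checked by brute force in a small finite group. The sketch below is an illustration added here, not part of the notes. It assumes the cyclic group Z/n written additively, so that gB becomes g + B and $A \cdot C^{-1}$ becomes the difference set A − C. The function names are hypothetical, and the exhaustive search is only practical for small n.

```python
from itertools import combinations


def translate(g, B, n):
    """The translate g + B inside the cyclic group Z/n."""
    return frozenset((g + b) % n for b in B)


def difference_set(A, C, n):
    """The set A - C = {a - c}, the additive analogue of A . C^{-1}."""
    return frozenset((a - c) % n for a in A for c in C)


def covering_number(A, B, n):
    """Brute-force [A : B]: the least k such that k translates of B cover A."""
    A = frozenset(A)
    translates = [translate(g, B, n) for g in range(n)]
    for k in range(1, n + 1):
        for combo in combinations(translates, k):
            if A <= frozenset().union(*combo):
                return k
    raise ValueError("A cannot be covered; B must be non-empty")


if __name__ == "__main__":
    n = 12
    A, B, C = {0, 1, 2, 3, 4, 5}, {0, 1, 2}, {0, 1}
    # Part (ii): [A : C] <= [A : B] * [B : C].
    print(covering_number(A, C, n), covering_number(A, B, n), covering_number(B, C, n))
    # Part (v): additivity when K - C and L - C are disjoint.
    K, L = {0, 1}, {6, 7}
    print(difference_set(K, C, n) & difference_set(L, C, n))
    print(covering_number(K | L, C, n), covering_number(K, C, n), covering_number(L, C, n))
```

In this example the three covering numbers in part (ii) can be compared directly with the product bound, and the sets K and L satisfy the disjointness hypothesis of part (v), so the covering number of their union is the sum of the individual ones.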

Remark 7.7. If G is a topological group with identity element e, and if V is a neighborhood of e, then K ≺ V, for every non-empty compact subset K of G. Indeed, if we choose some open set D with e ∈ D ⊂ V, then using the compactness of K, and the obvious inclusion $K \subset \bigcup_{g \in K} gD$, it follows that there exist $g_1, \dots, g_n \in K$, such that $K \subset g_1D \cup \dots \cup g_nD \subset g_1V \cup \dots \cup g_nV$.

With these preparations we are in position to prove the following fundamental result.


Theorem 7.6. Let G be a locally compact group, and let A be a compact neighborhood of the identity element. Then there exists a Haar measure µ on G, such that µ(A) = 1.

Proof. Denote the identity element of G by e. Throughout the proof the compact neighborhood A of e will be fixed. For every non-empty compact set K ⊂ G, we define m(K) = [K : A]. We also put m(∅) = 0.

Let us define $\mathcal{V}$ to be the collection of all neighborhoods of e. For every $V \in \mathcal{V}$, we denote by Ω(V) the set of all maps $\omega : \mathcal{C}_G \to [0,\infty)$ with the following properties:

(i) 0 ≤ ω(K) ≤ m(K), $\forall\, K \in \mathcal{C}_G$;
(ii) ω(A) = 1;
(iii) $K, L \in \mathcal{C}_G$, K ⊂ L ⇒ ω(K) ≤ ω(L);
(iv) ω(K ∪ L) ≤ ω(K) + ω(L), $\forall\, K, L \in \mathcal{C}_G$;
(v) ω(gK) = ω(K), ∀ g ∈ G, $K \in \mathcal{C}_G$;
(vi) $K, L \in \mathcal{C}_G$, (K · V) ∩ (L · V) = ∅ ⇒ ω(K ∪ L) = ω(K) + ω(L).

Claim 1: For every $V \in \mathcal{V}$, the set Ω(V) is non-empty.

Fix V. We shall prove this Claim by an explicit construction of an element ω ∈ Ω(V). Define ω(∅) = 0, and define

\[
\omega(K) = \frac{[K : V^{-1}]}{[A : V^{-1}]},
\]

for all non-empty compact subsets K ⊂ G. The fact that ω has properties (i)-(vi) is immediate from Lemma 7.7.
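
For the reader who wants the details behind “immediate from Lemma 7.7”, here is one way to match the properties with the parts of the lemma, for non-empty compact sets K and L (the cases involving the empty set being trivial). This is a sketch added for illustration; it uses Remark 7.7 to guarantee that all the numbers $[\,\cdot : V^{-1}]$ are defined, and property (ii) holds by the very definition of ω.

```latex
\begin{align*}
  \text{(i)}\quad & [K : V^{-1}] \le [K : A]\cdot[A : V^{-1}]
     \ \Longrightarrow\ \omega(K) \le [K : A] = m(K)
     && \text{by Lemma 7.7 (ii)}, \\
  \text{(iii)}\quad & K \subset L \ \Longrightarrow\
     [K : V^{-1}] \le [K : L]\cdot[L : V^{-1}] = [L : V^{-1}]
     && \text{by Lemma 7.7 (i), (ii)}, \\
  \text{(iv)}\quad & [K \cup L : V^{-1}] \le [K : V^{-1}] + [L : V^{-1}]
     && \text{by Lemma 7.7 (iv)}, \\
  \text{(v)}\quad & [gK : V^{-1}] = [K : V^{-1}]
     && \text{by Lemma 7.7 (iii)}, \\
  \text{(vi)}\quad & (K \cdot V) \cap (L \cdot V) = \emptyset \ \Longrightarrow\
     [K \cup L : V^{-1}] = [K : V^{-1}] + [L : V^{-1}]
     && \text{by Lemma 7.7 (v)}.
\end{align*}
```

In the last line, part (v) of the lemma is applied with $C = V^{-1}$, so that $C^{-1} = V$ and the hypothesis of the lemma is exactly the disjointness of K · V and L · V.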

Let us regard the sets Ω(V), $V \in \mathcal{V}$, as subsets of the product space

\[
P = \prod_{K \in \mathcal{C}_G} [0, m(K)].
\]

Notice that, when we equip P with the product topology, it becomes a compact space, by Tihonov’s Theorem.

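Claim 2: For every $V \in \mathcal{V}$, the set Ω(V) is closed in P.

Define, for any $K \in \mathcal{C}_G$, the map

\[
\pi_K : P \ni \omega \longmapsto \omega(K) \in \mathbb{R}.
\]

By the definition of the topology of P, all maps $\pi_K : P \to \mathbb{R}$ are continuous. For any two sets $K, L \in \mathcal{C}_G$, consider the functions $F_{KL}, T_{KL} : P \to \mathbb{R}$, defined by

\[
F_{KL}(\omega) = \omega(K) - \omega(L) \quad\text{and}\quad T_{KL}(\omega) = \omega(K \cup L) - \omega(K) - \omega(L), \quad \forall\, \omega \in P.
\]

Since we have $F_{KL} = \pi_K - \pi_L$ and $T_{KL} = \pi_{K \cup L} - \pi_K - \pi_L$, it follows that the maps $F_{KL}, T_{KL} : P \to \mathbb{R}$, $K, L \in \mathcal{C}_G$, are all continuous. As a consequence of the continuity of these maps, it follows that, for any two sets $K, L \in \mathcal{C}_G$, the sets

\begin{align*}
\Gamma(K, L) &= \{\omega \in P : \omega(K) \le \omega(L)\} = F_{KL}^{-1}\bigl((-\infty, 0]\bigr), \\
\Theta^-(K, L) &= \{\omega \in P : \omega(K \cup L) \le \omega(K) + \omega(L)\} = T_{KL}^{-1}\bigl((-\infty, 0]\bigr), \\
\Theta^+(K, L) &= \{\omega \in P : \omega(K \cup L) \ge \omega(K) + \omega(L)\} = T_{KL}^{-1}\bigl([0, \infty)\bigr)
\end{align*}

are closed subsets of P. It then follows that the sets

\begin{align*}
\Omega_1 &= \{\omega \in P : \omega(A) = 1\} = \pi_A^{-1}\bigl(\{1\}\bigr), \\
\Omega_2 &= \bigcap_{\substack{(K,L) \in \mathcal{C}_G \times \mathcal{C}_G \\ K \subset L}} \Gamma(K, L), \\
\Omega_3 &= \bigcap_{(K,L) \in \mathcal{C}_G \times \mathcal{C}_G} \Theta^-(K, L), \\
\Omega_4 &= \bigcap_{K \in \mathcal{C}_G} \bigcap_{g \in G} \bigl[\Gamma(K, gK) \cap \Gamma(gK, K)\bigr],
\end{align*}

are all closed, so the intersection

\[
\Omega_5 = \Omega_1 \cap \Omega_2 \cap \Omega_3 \cap \Omega_4
\]

is again closed. Notice that

\[
\Omega_5 = \{\omega \in P : \omega \text{ has properties (i)-(v)}\}.
\]

Finally, if we define, for every $V \in \mathcal{V}$, the set

\[
\Omega^6_V = \bigcap_{\substack{(K,L) \in \mathcal{C}_G \times \mathcal{C}_G \\ (K \cdot V) \cap (L \cdot V) = \emptyset}} \Theta^+(K, L),
\]

then $\Omega^6_V$ is also closed, and so will then be the intersection $\Omega_5 \cap \Omega^6_V = \Omega(V)$.

Claim 3: The intersection $\bigcap_{V \in \mathcal{V}} \Omega(V)$ is non-empty.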

Remark that, if $V_1, V_2 \in \mathcal{V}$ are such that $V_1 \subset V_2$, then we have the inclusion $\Omega(V_1) \subset \Omega(V_2)$. Indeed, if ω belongs to $\Omega(V_1)$, then properties (i)-(v) are clear. To check property (vi) for $V_2$ we need to show that whenever K, L ⊂ G are compact sets, with $(K \cdot V_2) \cap (L \cdot V_2) = \emptyset$, it follows that ω(K ∪ L) = ω(K) + ω(L). This is however trivial, since the inclusion $V_1 \subset V_2$ forces $(K \cdot V_1) \cap (L \cdot V_1) = \emptyset$, and then the desired equality follows from the property (vi) for $V_1$. We now see that, for any finite number of sets $V_1, \dots, V_n \in \mathcal{V}$, we have the inclusion

\[
\Omega(V_1 \cap \dots \cap V_n) \subset \Omega(V_1) \cap \dots \cap \Omega(V_n),
\]

which by Claim 1, proves that $\Omega(V_1) \cap \dots \cap \Omega(V_n) \ne \emptyset$. Using Claim 2, and the compactness of P, the Claim immediately follows.

Pick now an element $\omega \in \bigcap_{V \in \mathcal{V}} \Omega(V)$.

Claim 4: The map $\omega : \mathcal{C}_G \to [0,\infty)$ is a content on G with the left invariance property

\[
\omega(gK) = \omega(K), \quad \forall\, g \in G,\ K \in \mathcal{C}_G.
\]

Moreover, one has the equality ω(A) = 1.

The fact that ω(A) = 1 is clear, from condition (ii) in the definition of Ω(V). The left invariance property follows from condition (v). In order to prove that ω is a content, we need to prove

(a) ω(∅) = 0;
(b) $K, L \in \mathcal{C}_G$, K ⊂ L ⇒ ω(K) ≤ ω(L);
(c) ω(K ∪ L) ≤ ω(K) + ω(L), $\forall\, K, L \in \mathcal{C}_G$;
(d) $K, L \in \mathcal{C}_G$, K ∩ L = ∅ ⇒ ω(K ∪ L) = ω(K) + ω(L).

Properties (a), (b), and (c) are clear, because every element in Ω(V), $V \in \mathcal{V}$, satisfies them. (Property (a) is a consequence of condition (i), property (b) is a consequence of (iii), and property (c) is a consequence of (iv).) To prove property (d), we start with two disjoint compact sets K and L, and we use Proposition 7.10 to find some $V \in \mathcal{V}$ such that (K · V) ∩ (L · V) = ∅. Then we use the fact that ω belongs to Ω(V), and by condition (vi) we indeed get ω(K ∪ L) = ω(K) + ω(L).

Having proven Claim 4, we now define the measure $\mu_0 = \omega^*\big|_{\mathrm{Bor}(G)}$. By Lemma 7.6, $\mu_0$ is a Haar measure on G. Notice that $\mu_0(A) = \omega^*(A) \ge \omega(A) = 1$, so if we define µ : Bor(G) → [0,∞] by (use the convention $\infty/\mu_0(A) = \infty$)

\[
\mu(B) = \frac{\mu_0(B)}{\mu_0(A)}, \quad \forall\, B \in \mathrm{Bor}(G),
\]

then µ is a Haar measure on G, and satisfies µ(A) = 1.

Comment. Eventually (see Chapter IV) we are going to improve on the above result by proving the uniqueness of µ.

In concrete examples, it is possible to prove uniqueness.

Exercise 10*. Let $S = [0, 1]^n$ be the unit cube in $\mathbb{R}^n$, and let µ be a Haar measure on $(\mathbb{R}^n, +)$, with µ(S) = 1. Prove that µ coincides with the n-dimensional Lebesgue measure $\lambda_n$.

Hint: Consider first the half open box $S_0 = [0, 1)^n$, and its measure $\beta = \mu(S_0)$. Prove that for a half open box of the form

\[
B = [a_1, b_1) \times \cdots \times [a_n, b_n)
\]

with $a_1, \dots, a_n, b_1, \dots, b_n \in \mathbb{Q}$, one has $\mu(B) = \beta\lambda_n(B)$. Conclude that if a subset $A \subset \mathbb{R}^n$ is contained in a hyperplane of the form

\[
\Pi_k(a) = \{(x_1, \dots, x_n) \in \mathbb{R}^n : x_k = a\},
\]

then µ(A) = 0. Use this to get β = 1, so

\[
\mu(B) = \lambda_n(B),
\]

for every “rational” half-open box. Prove that this equality holds for all half-open boxes. Use Corollary 5.1 to conclude that $\mu = \lambda_n$.

The following two exercises show how a Haar measure can be used to get some topological information.

Exercise 11. Let G be a locally compact group, and let µ be a Haar measure on G. Prove that µ(D) > 0, for every non-empty open subset D ⊂ G.
Hint: Use the inequality µ(K) ≤ [K : D] · µ(D), for all compact K ⊂ G.

Exercise 12*. Let G be a locally compact group, and let µ be a Haar measure on G. Prove that the following are equivalent:

(i) G is compact;
(ii) µ(G) < ∞.

Hint: For the implication (ii) ⇒ (i), start with some compact neighborhood V of the identity, and choose a maximal subset A ⊂ G, such that the sets gV, g ∈ A, are disjoint. Prove that A is finite. Conclude that $G = \bigcup_{g \in A} (gV \cdot V^{-1})$, so G is a finite union of compact sets.

Lectures 30-31

8. Signed measures and complex measures

In this section we discuss a generalization of the notion of a measure, to the case where the values are allowed to be outside [0,∞]. The first notion is described by the following.

Definition. Suppose $\mathcal{A}$ is a σ-algebra on a non-empty set X. A function $\mu : \mathcal{A} \to [-\infty, \infty]$ is called a signed measure on $\mathcal{A}$, if it has the properties below.

(i) Either one of the following is true:
  • µ(A) < ∞, $\forall\, A \in \mathcal{A}$;
  • µ(A) > −∞, $\forall\, A \in \mathcal{A}$.
(ii) µ(∅) = 0.
(iii) For any pairwise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}$, one has the equality

\[
\mu\Bigl(\bigcup_{n=1}^\infty A_n\Bigr) = \sum_{n=1}^\infty \mu(A_n). \tag{1}
\]

Here we adopt the convention that if one term in the right hand side of (1) is equal to ±∞, then the entire sum is equal to ±∞. It is important to use condition (i), which avoids situations when one term is ∞ and another term is −∞.

Examples 8.1. Let us agree, in this section only, to use the term “honest” measure, for a measure in the usual sense.

A. Any “honest” measure is of course a signed measure.

B. If µ is a signed measure, then −µ is again a signed measure.

C. If $\mu_1$ and $\mu_2$ are “honest” measures, one of which is finite, then $\mu_1 - \mu_2$ is a signed measure. Eventually (see Theorem 8.2) we are going to show that any signed measure can be written in this form.
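
To make Example C concrete, here is a small worked instance, added for illustration and not taken from the notes. It assumes X = [−1, 1], equipped with the Borel σ-algebra and the Lebesgue measure λ, and the names $\mu_1$, $\mu_2$ below are chosen only for this example.

```latex
\[
  \mu(A) = \int_A x \, d\lambda(x) = \mu_1(A) - \mu_2(A),
  \qquad
  \mu_1(A) = \int_{A \cap [0,1]} x \, d\lambda(x),
  \quad
  \mu_2(A) = \int_{A \cap [-1,0]} |x| \, d\lambda(x),
\]
\[
  \sup_{A} \mu(A) = \mu\bigl([0,1]\bigr) = \tfrac12,
  \qquad
  \inf_{A} \mu(A) = \mu\bigl([-1,0]\bigr) = -\tfrac12 .
\]
```

Both $\mu_1$ and $\mu_2$ are finite “honest” measures, and in this example the largest and smallest values of µ are attained on the sets [0, 1] and [−1, 0].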

One key technical result about signed measures is the following.

Theorem 8.1. Let $\mathcal{A}$ be a σ-algebra on a non-empty set X, and let µ be a signed measure on $\mathcal{A}$. Then there exist sets $L, M \in \mathcal{A}$, such that

\begin{align}
\mu(L) &= \inf\{\mu(A) : A \in \mathcal{A}\}; \tag{2} \\
\mu(M) &= \sup\{\mu(A) : A \in \mathcal{A}\}. \tag{3}
\end{align}

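Proof. Since −µ is also a signed measure, it suffices to prove only the existence of M satisfying (3). Denote the right hand side of (3) by α, and choose a sequence $(\alpha_n)_{n \ge 1} \subset \mathbb{R}$, such that $\lim_{n\to\infty} \alpha_n = \alpha$, and $\alpha_n < \alpha$, ∀n ≥ 1. The key construction we need is contained in the following.

Claim 1: There exists a family of sets $\{B^n_k : k, n \in \mathbb{N},\ 1 \le k \le n\} \subset \mathcal{A}$, with the following properties:

(i) for every n ≥ 1, one has the inclusions

\[
\begin{array}{ccccccccc}
B^n_1 & \subset & B^n_2 & \subset & \cdots & \subset & B^n_n & & \\
\cup & & \cup & & & & \cup & & \\
B^{n+1}_1 & \subset & B^{n+1}_2 & \subset & \cdots & \subset & B^{n+1}_n & \subset & B^{n+1}_{n+1}
\end{array}
\]

(here each vertical symbol ∪ means that the set in the upper row contains the set below it);

(ii) for every k ≥ 1 one has the inequalities

\[
\mu(B^n_k \smallsetminus B^{n+1}_k) \le 0, \quad \forall\, n \ge k;
\]

(iii) $\mu(B^n_n) \ge \alpha_n$, ∀n ≥ 1.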

We construct this family inductively, one row at a time (the rows are indexed by the upper index n). Choose $B^1_1 \in \mathcal{A}$ to be any set with $\mu(B^1_1) \ge \alpha_1$. Suppose we have constructed the first m rows, i.e. we have defined the sets $B^n_k$, 1 ≤ k ≤ n ≤ m, so that property (i) holds for all n = 1, …, m − 1, property (ii) holds in the form

$$\alpha_k \le \mu(B^k_k) \le \mu(B^{k+1}_k) \le \dots \le \mu(B^m_k), \quad \forall\, k = 1, \dots, m,$$

and property (iii) holds for all n = 1, …, m. Let us explain now how the next row $B^{m+1}_1 \subset B^{m+1}_2 \subset \dots \subset B^{m+1}_m \subset B^{m+1}_{m+1}$ is constructed. Define the sets $E_1, E_2, \dots, E_m \in \mathcal{A}$ by

$$E_1 = B^m_1, \quad\text{and}\quad E_k = B^m_k \smallsetminus B^m_{k-1}, \quad \forall\, k = 2, \dots, m.$$

The sets $E_k$, k = 1, …, m are pairwise disjoint, and we have

$$B^m_k = \bigcup_{j=1}^{k} E_j, \quad \forall\, k = 1, \dots, m.$$

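Choose now an arbitrary set D ∈ A, with $\mu(D) \ge \alpha_{m+1}$, and define, for each j ∈ {1, …, m}, the set

$$G_j = \begin{cases} E_j & \text{if } \mu(E_j \smallsetminus D) > 0, \\ E_j \cap D & \text{if } \mu(E_j \smallsetminus D) \le 0. \end{cases}$$

Notice that we have $E_j \supset G_j$, and using the equality $\mu(E_j) = \mu(E_j \cap D) + \mu(E_j \smallsetminus D)$, we also have

$$\mu(E_j \smallsetminus G_j) \le 0 \quad\text{and}\quad \mu(G_j) \ge \mu(E_j \cap D), \quad \forall\, j = 1, \dots, m. \tag{4}$$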

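Define also the set $G_{m+1} = D \smallsetminus B^m_m$. It is clear that the sets $G_1, G_2, \dots, G_{m+1}$ are pairwise disjoint. Construct now the row m + 1 by taking

$$B^{m+1}_k = \bigcup_{j=1}^{k} G_j, \quad \forall\, k = 1, 2, \dots, m+1.$$

It is obvious that one has the inclusions

$$B^{m+1}_1 \subset B^{m+1}_2 \subset \dots \subset B^{m+1}_{m+1}.$$

Since $E_k \supset G_k$, ∀k = 1, …, m, it is also clear that we have the vertical inclusions $B^m_k \supset B^{m+1}_k$, ∀k = 1, …, m. Using (4), for each k = 1, …, m, we have

$$\mu\bigl(B^m_k \smallsetminus B^{m+1}_k\bigr) = \mu\Bigl(\bigcup_{j=1}^{k} [E_j \smallsetminus G_j]\Bigr) = \sum_{j=1}^{k} \mu(E_j \smallsetminus G_j) \le 0.$$

Finally, again by (4), we have

$$\begin{aligned}
\mu\bigl(B^{m+1}_{m+1}\bigr) &= \mu\Bigl(\bigcup_{k=1}^{m+1} G_k\Bigr) = \sum_{k=1}^{m+1} \mu(G_k) \ge \mu(G_{m+1}) + \sum_{k=1}^{m} \mu(E_k \cap D) \\
&= \mu(G_{m+1}) + \mu\Bigl(\bigcup_{k=1}^{m} [E_k \cap D]\Bigr) = \mu(D \smallsetminus B^m_m) + \mu(B^m_m \cap D) = \mu(D) \ge \alpha_{m+1}.
\end{aligned}$$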

Claim 2: There exists a sequence $(A_k)_{k=1}^{\infty} \subset \mathcal{A}$, such that

(i) $A_1 \subset A_2 \subset A_3 \subset \dots$;
(ii) $\mu(A_k) \ge \alpha_k$, ∀k ≥ 1.

We fix a family $\{B^n_k : k, n \in \mathbb{N},\ 1 \le k \le n\}$ satisfying the properties in Claim 1. For every k ≥ 1, we define $A_k = \bigcap_{n=k}^{\infty} B^n_k$. Notice that, using property (i) from Claim 1 (the vertical inclusions), we have

$$B^k_k = A_k \cup \Bigl(\bigcup_{n=k}^{\infty} [B^n_k \smallsetminus B^{n+1}_k]\Bigr),$$

and the sets $A_k$, $B^n_k \smallsetminus B^{n+1}_k$, n ≥ k, are pairwise disjoint, so using property (ii) from Claim 1, we have

$$\mu(B^k_k) = \mu(A_k) + \sum_{n=k}^{\infty} \mu\bigl(B^n_k \smallsetminus B^{n+1}_k\bigr) \le \mu(A_k).$$

Using property (iii) from Claim 1, we then get $\mu(A_k) \ge \alpha_k$. The fact that we have the inclusions $A_1 \subset A_2 \subset \dots$ is clear, from property (i) in Claim 1 (the horizontal inclusions).

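Fix now the sequence $(A_k)_{k=1}^{\infty} \subset \mathcal{A}$ as in Claim 2, and let us consider the set $M = \bigcup_{k=1}^{\infty} A_k$. If we define the sets

$$M_1 = A_1 \quad\text{and}\quad M_k = A_k \smallsetminus A_{k-1}, \quad \forall\, k \ge 2,$$

then we have $M = \bigcup_{k=1}^{\infty} M_k$, and the sets $M_1, M_2, M_3, \dots$ are pairwise disjoint. In particular, this gives

$$\mu(M) = \sum_{k=1}^{\infty} \mu(M_k) = \lim_{k\to\infty} \Bigl[\sum_{j=1}^{k} \mu(M_j)\Bigr] = \lim_{k\to\infty} \mu\Bigl(\bigcup_{j=1}^{k} M_j\Bigr).$$

Since we obviously have $\bigcup_{j=1}^{k} M_j = A_k$, ∀k ≥ 1, the above equality proves that

$$\mu(M) = \lim_{k\to\infty} \mu(A_k). \tag{5}$$

Since we have $\alpha_k \le \mu(A_k) \le \alpha$, ∀k ≥ 1, as well as $\lim_{k\to\infty} \alpha_k = \alpha$, the equality (5) forces µ(M) = α.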

Remark 8.1. One interesting application of the above result is the fact that, whenever µ is a signed measure on A, such that

$$-\infty < \mu(A) < \infty, \quad \forall\, A \in \mathcal{A}, \tag{6}$$

then

$$-\infty < \inf\{\mu(A) : A \in \mathcal{A}\} \le \sup\{\mu(A) : A \in \mathcal{A}\} < \infty.$$

A signed measure with property (6) is called finite.

We are now in position to prove the statement made in Example 8.1.C.

Theorem 8.2. Let X be a non-empty set, let A be a σ-algebra on X, and let µ be a signed measure on A. Then there exist subsets X+, X− ∈ A, with the following properties:

(i) X+ ∩ X− = ∅, and X+ ∪ X− = X;
(ii) the maps µ± : A → [−∞,∞], defined by

µ±(A) = ±µ(A ∩ X±), ∀A ∈ A,

are “honest” measures on A;
(iii) one of the measures µ± is finite, and one has the equality µ = µ+ − µ−.

Proof. Without any loss of generality, we can assume that

µ(A) < ∞, ∀A ∈ A.

(Otherwise, we replace µ with −µ, and the conclusion does not essentially change.) Put $\alpha = \sup\{\mu(A) : A \in \mathcal{A}\}$. By Theorem 8.1, it follows that 0 ≤ α < ∞, and there exists a set X+ ∈ A, such that µ(X+) = α. Define X− = X ∖ X+.

Claim: The sets X± have the following properties:

(a) 0 ≤ µ(A) ≤ α, for all A ∈ A, with A ⊂ X+;
(b) 0 ≥ µ(B), for all B ∈ A, with B ⊂ X−.

To prove (a) start with some arbitrary set A ∈ A with A ⊂ X+. First of all, by the definition of α, it is clear that µ(A) ≤ α. Second, using the equality µ(X+) = µ(A) + µ(X+ ∖ A), it is clear that µ(A), µ(X+ ∖ A) > −∞, so we have

α ≥ µ(X+ ∖ A) = µ(X+) − µ(A) = α − µ(A),

which clearly forces µ(A) ≥ 0. To prove (b), we start with some set B ∈ A with B ⊂ X−. Using the fact that X+ ∩ B = ∅, we have

α ≥ µ(X+ ∪ B) = µ(X+) + µ(B) = α + µ(B),

which clearly forces µ(B) ≤ 0.

Having proven the Claim, we define the maps µ± : A → [−∞,∞] as in the statement of the Theorem. By the Claim, we get µ±(A) ≥ 0, ∀A ∈ A. It is also pretty clear that both µ+ and µ− are σ-additive, so they define “honest” measures. Also, by the Claim, we have µ+(A) ≤ α, ∀A ∈ A, so µ+ is a finite measure. Finally, if we start with some arbitrary A ∈ A, and we write it as A = (A ∩ X+) ∪ (A ∩ X−), then using the fact that (A ∩ X+) ∩ (A ∩ X−) = ∅, we get

µ(A) = µ(A ∩ X+) + µ(A ∩ X−) = µ+(A) − µ−(A).

It will be helpful not only here, but also in some future discussions, to isolate a certain feature identified by the above result.

Definition. Given a σ-algebra A on a non-empty set X, and two “honest” measures µ and ν on A, we say that µ and ν are mutually singular, if there exist sets M, N ∈ A, with M ∪ N = X and M ∩ N = ∅, such that µ(N) = ν(M) = 0. Notice that this implies the equalities

µ(A) = µ(A ∩ M) and ν(A) = ν(A ∩ N), ∀A ∈ A.

If this situation occurs, we write µ ⊥ ν.

With this terminology, Theorem 8.2 states that any signed measure µ can be written as µ = µ+ − µ−, with µ+ and µ− “honest” mutually singular measures, and one of them finite.
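As an illustration only, the decomposition of Theorem 8.2 can be checked by brute force on a finite set, where a signed measure is determined by its point weights. The weights, and the helper names `mu`, `subsets`, `mu_plus`, `mu_minus`, are hypothetical choices made for this sketch; they do not appear in the notes.

```python
from itertools import chain, combinations

# Hypothetical point weights of a signed measure on X = {"a", "b", "c", "d"}.
weights = {"a": 2.0, "b": -1.5, "c": 0.0, "d": 3.0}
X = set(weights)

def mu(A):
    """Signed measure of a subset A of X: the sum of its point weights."""
    return sum(weights[x] for x in A)

def subsets(S):
    """All subsets of S, as tuples (the sigma-algebra is the full power set)."""
    xs = sorted(S)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# A set on which mu attains its supremum (its existence is Theorem 8.1);
# here it is simply found by exhaustive search.
X_plus = set(max(subsets(X), key=mu))
X_minus = X - X_plus

def mu_plus(A):
    """mu^+(A) = mu(A ∩ X^+)."""
    return mu(set(A) & X_plus)

def mu_minus(A):
    """mu^-(A) = -mu(A ∩ X^-)."""
    return -mu(set(A) & X_minus)
```

In this toy case both µ+ and µ− take only non-negative values, they vanish on X− and X+ respectively, and µ = µ+ − µ− on every subset. The point "c" of weight 0 could be placed in either X+ or X−, which reflects the non-uniqueness of the sets X±.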


Although the sets X± may not be uniquely determined, the decomposition µ = µ+ − µ− is unique, as indicated by the following result.

Theorem 8.3 (Minimality). Let X be a non-empty set, let A be a σ-algebra on X, and let µ be a signed measure on A. Suppose µ+ and µ− are mutually singular “honest” measures on A, one of them being finite, such that µ = µ+ − µ−. Suppose ν and η are two “honest” measures on A, one of which is finite, such that µ = ν − η. Then one has the inequalities µ+ ≤ ν and µ− ≤ η.

Proof. Fix sets X+, X− ∈ A, such that X+ ∪ X− = X, X+ ∩ X− = ∅, and µ+(X−) = µ−(X+) = 0.

Start with some arbitrary set A ∈ A. On the one hand, since A = (A ∩ X+) ∪ (A ∩ X−), with (A ∩ X+) ∩ (A ∩ X−) = ∅, we see that if λ is either one of the measures µ, µ+, or µ−, we have the equality

(7) λ(A) = λ(A ∩ X+) + λ(A ∩ X−), ∀A ∈ A.

On the other hand, since µ+ is an “honest” measure, and µ+(X−) = 0, the inclusion A ∩ X− ⊂ X− will force

µ+(A ∩ X−) = 0, ∀A ∈ A.

Likewise, we have the equality

µ−(A ∩ X+) = 0, ∀A ∈ A.

These equalities, combined with µ = µ+ − µ−, and with (7), give the equalities

(8) µ+(A) = µ+(A ∩ X+) = µ+(A ∩ X+) − µ−(A ∩ X+) = µ(A ∩ X+),

(9) µ−(A) = µ−(A ∩ X−) = −µ+(A ∩ X−) + µ−(A ∩ X−) = −µ(A ∩ X−),

for all A ∈ A. Fix now some set A ∈ A. Since ν is an “honest” measure, and η(A ∩ X+) ≥ 0, using (8) we get

ν(A) ≥ ν(A ∩ X+) ≥ ν(A ∩ X+) − η(A ∩ X+) = µ(A ∩ X+) = µ+(A).

Likewise, using (9), we have

η(A) ≥ η(A ∩ X−) ≥ η(A ∩ X−) − ν(A ∩ X−) = −µ(A ∩ X−) = µ−(A).

Corollary 8.1. Let A be a σ-algebra on X, let µ be a signed measure on A, and let µ+, µ−, ν+ and ν− be “honest” measures on A with

• µ+ ⊥ µ−, and one of the measures µ+ and µ− is finite;
• ν+ ⊥ ν−, and one of the measures ν+ and ν− is finite;
• µ = µ+ − µ− = ν+ − ν−.

Then one has the equalities µ+ = ν+ and µ− = ν−.

Proof. Apply Theorem 8.3 “both ways” to get µ+ ≤ ν+ and µ− ≤ ν−, as well as ν+ ≤ µ+ and ν− ≤ µ−.

Definition. Given a signed measure µ, the decomposition µ = µ+ − µ−, whose existence is shown in Theorem 8.2, and whose uniqueness is shown above, is called the Hahn-Jordan decomposition of µ. A pair of sets (X+, X−), with X± ∈ A, X+ ∪ X− = X, X+ ∩ X− = ∅, and µ+(X−) = µ−(X+) = 0, is called a Hahn-Jordan set decomposition of X relative to µ.

Exercise 1. Let µ be a signed measure, and let µ = µ+ − µ− be the Hahn-Jordan decomposition. Prove that the following are equivalent:

(i) µ is finite, i.e. −∞ < µ(A) < ∞, ∀A ∈ A;
(ii) both “honest” measures µ+ and µ− are finite.

The following result characterizes mutual singularity in an approximate fashion.

Lemma 8.1. Let A be a σ-algebra on X, and let µ and ν be “honest” measures on A. The following are equivalent:

(i) µ ⊥ ν;
(ii) for every ε > 0, there exist sets D, E ∈ A, such that µ(D) < ε, ν(E) < ε, and D ∪ E = X.

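Proof. The implication (i) ⇒ (ii) is trivial.

To prove the implication (ii) ⇒ (i) construct, for each ε > 0, two sequences $(D^{\varepsilon}_n)_{n=1}^{\infty}$ and $(E^{\varepsilon}_n)_{n=1}^{\infty}$ of sets in A, such that $\mu(D^{\varepsilon}_n) < \varepsilon/2^n$, $\nu(E^{\varepsilon}_n) < \varepsilon/2^n$, and $D^{\varepsilon}_n \cup E^{\varepsilon}_n = X$. Put $A_{\varepsilon} = \bigcap_{n=1}^{\infty} D^{\varepsilon}_n$ and $B_{\varepsilon} = \bigcup_{n=1}^{\infty} E^{\varepsilon}_n$. Fix for the moment ε > 0.

On the one hand, using the inclusion $A_{\varepsilon} \subset D^{\varepsilon}_n$, ∀n ≥ 1, we get $\mu(A_{\varepsilon}) \le \varepsilon/2^n$, ∀n ≥ 1, which clearly forces

$$\mu(A_{\varepsilon}) = 0. \tag{10}$$

On the other hand, using σ-subadditivity, we have

$$\nu(B_{\varepsilon}) = \nu\Bigl(\bigcup_{n=1}^{\infty} E^{\varepsilon}_n\Bigr) \le \sum_{n=1}^{\infty} \nu(E^{\varepsilon}_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon. \tag{11}$$

Finally, since we have $X \smallsetminus D^{\varepsilon}_n \subset E^{\varepsilon}_n$, ∀n ≥ 1, we get

$$X \smallsetminus A_{\varepsilon} = \bigcup_{n=1}^{\infty} (X \smallsetminus D^{\varepsilon}_n) \subset \bigcup_{n=1}^{\infty} E^{\varepsilon}_n = B_{\varepsilon},$$

which gives

$$A_{\varepsilon} \supset X \smallsetminus B_{\varepsilon}. \tag{12}$$

Define now the sets $N = \bigcup_{n=1}^{\infty} A_{1/n}$ and $M = X \smallsetminus N$. On the one hand, using σ-subadditivity, combined with (10), we get µ(N) = 0. On the other hand, using (12), we have

$$M = X \smallsetminus N = X \smallsetminus \Bigl(\bigcup_{n=1}^{\infty} A_{1/n}\Bigr) = \bigcap_{n=1}^{\infty} (X \smallsetminus A_{1/n}) \subset \bigcap_{n=1}^{\infty} B_{1/n} \subset B_{1/k}, \quad \forall\, k \ge 1.$$

By (11) this gives $\nu(M) \le \nu(B_{1/k}) < 1/k$, ∀k ≥ 1, which forces ν(M) = 0.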

Although the next technical result seems a bit out of context at this point, we prove it here, and record it for future use.

Lemma 8.2. Let A be a σ-algebra on some non-empty set X, and let µ, η be signed measures on A. Assume there is an “honest” finite measure ν on A, with µ + ν = η.

(i) If µ = µ+ − µ− and η = η+ − η− are the Hahn-Jordan decompositions of µ and η respectively, then one has the inequalities

(13) µ+ ≤ η+ ≤ µ+ + ν

(14) η− ≤ µ− ≤ η− + ν.

(ii) If (X+, X−) is a Hahn-Jordan set decomposition of X relative to µ, and if (Y+, Y−) is a Hahn-Jordan set decomposition of X relative to η, then one has the relations $X^+ \subset_{\nu} Y^+$ and $Y^- \subset_{\nu} X^-$.

Proof. (i). On the one hand, the signed measure η has a decomposition

η = µ + ν = (µ+ + ν) − µ−,

with µ+ + ν and µ− “honest” measures (one of them finite). Using the minimality Theorem 8.3, we get the inequalities

(15) η+ ≤ µ+ + ν and η− ≤ µ−.

On the other hand, we can also consider the signed measure µ = η − ν, which has a decomposition

µ = η+ − (η− + ν),

with η+ and η− + ν “honest” measures (one of them finite). Using again the minimality Theorem 8.3, we get the inequalities

(16) µ+ ≤ η+ and µ− ≤ η− + ν.

Clearly the inequalities (15) and (16) cover the desired inequalities (13) and (14).

(ii). Recall (see Section 4) that the relation $A \subset_{\nu} B$ means that ν(A ∖ B) = 0. In our case, we have to look at the set

N = X+ ∖ Y+ = Y− ∖ X−,

for which we have to show that ν(N) = 0. On the one hand, since N ⊂ Y−, we get η+(N) = 0. Using (13) this forces µ+(N) = 0. On the other hand, since N ⊂ X+, we get µ−(N) = 0, and using (14) we also get η−(N) = 0. In other words, we get the equalities

µ(N) = µ+(N) − µ−(N) = 0,

η(N) = η+(N) − η−(N) = 0,

and then the equality η = µ + ν clearly forces ν(N) = 0.

The Hahn-Jordan decomposition has the following interesting application to the properties of the natural order relation on “honest” measures. The result below gives the existence of an “infimum” and a “supremum” for a pair of “honest” measures, one of which is finite.

Proposition 8.1 (Lattice Property). Let A be a σ-algebra on a non-empty set X, and let µ and ν be “honest” measures on A, with one of them finite.

(i) There exists a unique measure µ ∨ ν with:
(a) µ ∨ ν ≥ µ and µ ∨ ν ≥ ν;
(b) whenever ω is an “honest” measure on A, with µ ≤ ω and ν ≤ ω, it follows that one has the inequality µ ∨ ν ≤ ω.
(ii) There exists a unique measure µ ∧ ν with:
(a) µ ≥ µ ∧ ν and ν ≥ µ ∧ ν;
(b) whenever λ is an “honest” measure on A, with µ ≥ λ and ν ≥ λ, it follows that one has the inequality µ ∧ ν ≥ λ.

Proof. Since the statement of the Proposition is “symmetric,” without any loss of generality we can assume that µ is finite.

Consider the signed measure η = µ − ν, and its Hahn-Jordan decomposition η = η+ − η−. Let (X+, X−) be a Hahn-Jordan set decomposition of X relative to η. This means that, for every A ∈ A, one has

(17) 0 ≤ η+(A) = η(A ∩ X+) = µ(A ∩ X+) − ν(A ∩ X+);

(18) 0 ≤ η−(A) = −η(A ∩ X−) = ν(A ∩ X−) − µ(A ∩ X−).

In particular we get

(19) µ(A ∩ X+) ≥ ν(A ∩ X+) and µ(A ∩ X−) ≤ ν(A ∩ X−), ∀A ∈ A.

(i). Define the measure µ ∨ ν = µ + η−. Using (18) we have

(20) (µ ∨ ν)(A) = µ(A ∩ X+) + ν(A ∩ X−), ∀A ∈ A.

Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities

(µ ∨ ν)(A ∩ X+) = µ(A ∩ X+) ≥ ν(A ∩ X+),

(µ ∨ ν)(A ∩ X−) = ν(A ∩ X−) ≥ µ(A ∩ X−).

In particular, this gives

(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X+) + (µ ∨ ν)(A ∩ X−) ≥ µ(A ∩ X+) + µ(A ∩ X−) = µ(A),

(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X+) + (µ ∨ ν)(A ∩ X−) ≥ ν(A ∩ X+) + ν(A ∩ X−) = ν(A),

for every A ∈ A, so µ ∨ ν indeed has property (a).

To prove property (b), start with some “honest” measure ω on A, with µ, ν ≤ ω, and let us show that µ ∨ ν ≤ ω. This is quite clear, since for any A ∈ A, using (20) we have

ω(A) = ω(A ∩ X+) + ω(A ∩ X−) ≥ µ(A ∩ X+) + ν(A ∩ X−) = (µ ∨ ν)(A).

The uniqueness of µ ∨ ν is now clear from (a) and (b).

(ii). Remark that, using the Minimality Theorem 8.3, for the measure η = µ − ν, it follows that η+ ≤ µ. In particular, η+ is a finite “honest” measure, and so is the difference µ − η+. Put µ ∧ ν = µ − η+. Using (17) we have

(21) (µ ∧ ν)(A) = µ(A ∩ X−) + ν(A ∩ X+), ∀A ∈ A.

Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities

(µ ∧ ν)(A ∩ X+) = ν(A ∩ X+) ≤ µ(A ∩ X+),

(µ ∧ ν)(A ∩ X−) = µ(A ∩ X−) ≤ ν(A ∩ X−).

In particular, this gives

(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X+) + (µ ∧ ν)(A ∩ X−) ≤ µ(A ∩ X+) + µ(A ∩ X−) = µ(A),

(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X+) + (µ ∧ ν)(A ∩ X−) ≤ ν(A ∩ X+) + ν(A ∩ X−) = ν(A),

for every A ∈ A, so µ ∧ ν indeed has property (a).

To prove property (b), start with some “honest” measure λ on A, with λ ≤ µ and λ ≤ ν, and let us show that µ ∧ ν ≥ λ. This is quite clear, since for any A ∈ A, using (21) we have

λ(A) = λ(A ∩ X+) + λ(A ∩ X−) ≤ ν(A ∩ X+) + µ(A ∩ X−) = (µ ∧ ν)(A).

The uniqueness of µ ∧ ν is now clear from (a) and (b).
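The construction in the proof can be traced on a finite set, where formulas (20) and (21) reduce to taking the larger, respectively the smaller, of the two point weights. The sketch below is only an illustration under that assumption; the weights and the names `join` and `meet` are hypothetical and not part of the notes.

```python
from itertools import chain, combinations

# Hypothetical point weights of two finite "honest" measures on X = {"a", "b", "c"}.
mu_w = {"a": 1.0, "b": 4.0, "c": 2.0}
nu_w = {"a": 3.0, "b": 0.5, "c": 2.0}
X = set(mu_w)

def mu(A):
    return sum(mu_w[x] for x in A)

def nu(A):
    return sum(nu_w[x] for x in A)

def subsets(S):
    """All subsets of S, as tuples."""
    xs = sorted(S)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# A Hahn-Jordan set decomposition for eta = mu - nu: eta >= 0 on X_plus, eta <= 0 on X_minus.
X_plus = {x for x in X if mu_w[x] - nu_w[x] >= 0}
X_minus = X - X_plus

def join(A):
    """(mu v nu)(A) = mu(A ∩ X^+) + nu(A ∩ X^-), as in formula (20)."""
    A = set(A)
    return mu(A & X_plus) + nu(A & X_minus)

def meet(A):
    """(mu ^ nu)(A) = mu(A ∩ X^-) + nu(A ∩ X^+), as in formula (21)."""
    A = set(A)
    return mu(A & X_minus) + nu(A & X_plus)
```

With these weights, `join` dominates both measures and `meet` is dominated by both on every subset, and the two always add up to µ + ν.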


We conclude with a series of results that make a connection with the theory of Radon measures discussed in Section 7.

Definition. Suppose X is a locally compact space, and µ is a signed measure on Bor(X). We call µ a signed Radon measure on X, if there exist “honest” Radon measures ν and η on X, one of which is finite, such that µ = ν − η.

Exercise 2*. Let X be a locally compact space, and let µ be a signed measure on Bor(X). Prove that the following are equivalent:

(i) µ is a signed Radon measure on X;
(ii) if µ = µ+ − µ− denotes the Hahn-Jordan decomposition of µ, then both µ+ and µ− are Radon measures on X.

Hint: To prove the implication (i) ⇒ (ii) use the fact that µ+ ≤ ν and µ− ≤ η. Moreover, show that, for any B ∈ Bor(X), one has the implications µ+(B) < ∞ ⇒ ν(B) < ∞ and µ−(B) < ∞ ⇒ η(B) < ∞. Then use Exercise 5 from Section 7.

Remark 8.2. Suppose X is a locally compact space. In Section 7 we discussed the Riesz correspondence, which associates to each linear positive map $\phi : C^{\mathbb{R}}_c(X) \to \mathbb{R}$, a Radon measure $\mu_\phi$ on X. As already suggested, this correspondence is in fact a bijection, although the proof of this fact will come later in Chapter IV. At this point we would like to analyze the Riesz correspondence in a simpler situation, namely the case when X is compact. In this case it is interesting to point out that the Riesz correspondence can be extended beyond the positive case. The key fact (see Corollary II.5.3) is that every linear continuous map $\phi : C^{\mathbb{R}}(X) \to \mathbb{R}$ can be written as a difference $\phi = \phi_1 - \phi_2$, with $\phi_1, \phi_2 : C^{\mathbb{R}}(X) \to \mathbb{R}$ positive linear maps. (In fact $\phi_1$ and $\phi_2$ can be chosen such that $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. This fact will be heavily exploited a little later.) We would like then to define a finite signed Radon measure $\mu_\phi$ by the formula $\mu_\phi = \mu_{\phi_1} - \mu_{\phi_2}$. There is a minor problem here: What if we find another pair of continuous positive linear maps $\psi_1, \psi_2 : C^{\mathbb{R}}(X) \to \mathbb{R}$, such that $\phi = \psi_1 - \psi_2$? Is it true that $\mu_{\psi_1} - \mu_{\psi_2} = \mu_{\phi_1} - \mu_{\phi_2}$? The answer is affirmative, and this is an easy consequence of Proposition 7.6, which gives the equalities

$$\mu_{\phi_1} + \mu_{\psi_2} = \mu_{\phi_1+\psi_2} = \mu_{\psi_1+\phi_2} = \mu_{\psi_1} + \mu_{\phi_2}.$$

Notations. Suppose X is a compact Hausdorff space. We define

$$M^{\mathbb{R}}(X) = \{\phi : C^{\mathbb{R}}(X) \to \mathbb{R} : \phi \ \mathbb{R}\text{-linear continuous}\},$$

$$\mathfrak{R}^{\mathbb{R}}(X) = \{\mu \text{ signed Radon measure on } X\}.$$

The correspondence

$$M^{\mathbb{R}}(X) \ni \phi \longmapsto \mu_\phi \in \mathfrak{R}^{\mathbb{R}}(X) \tag{22}$$

defined above, will still be referred to as the extended Riesz correspondence.

Remark 8.3. If X is a compact Hausdorff space, then the extended Riesz correspondence (22) is a linear map. This is a consequence of Proposition 7.6.

Given $\phi \in M^{\mathbb{R}}(X)$, the existence of a decomposition of φ, of the particular type described in Corollary II.5.3, is extremely significant, as suggested by the following result.

Theorem 8.4. Let X be a compact Hausdorff space, let $\phi_1, \phi_2 : C^{\mathbb{R}}(X) \to \mathbb{R}$ be positive linear maps, and let $\mu_{\phi_1}$ and $\mu_{\phi_2}$ be the corresponding Riesz measures. Consider the linear continuous map $\phi = \phi_1 - \phi_2$, and the finite signed measure

$$\mu_\phi = \mu_{\phi_1} - \mu_{\phi_2}. \tag{23}$$

If $\|\phi\| = \|\phi_1\| + \|\phi_2\|$, then $\mu_{\phi_1} \perp \mu_{\phi_2}$, so (23) represents the Hahn-Jordan decomposition of $\mu_\phi$.

Proof. We are going to show that the decomposition (23) satisfies condition (ii) in Lemma 8.1. The key step in proving this fact is contained in the following.

Claim: For every ε > 0, there exist functions $f_1, f_2 \in C^{\mathbb{R}}(X)$, with $f_1, f_2 \ge 0$, $f_1 + f_2 \ge 1$, and such that $\phi_1(f_2) < \varepsilon$ and $\phi_2(f_1) < \varepsilon$.

To prove this we fix ε > 0, and we use the definition of the norm, to find some function $g \in C^{\mathbb{R}}(X)$, with ‖g‖ ≤ 1, and |φ(g)| ≥ ‖φ‖ − ε. Replacing g with −g, if necessary, we can assume that

$$\phi(g) \ge \|\phi\| - \varepsilon. \tag{24}$$

Consider the functions $g^+ = \max\{g, 0\}$ and $g^- = \max\{-g, 0\}$, so that $g = g^+ - g^-$, and we clearly have $0 \le g^{\pm} \le 1$. On the one hand, since $\|\phi_k\| = \phi_k(1)$ (see Proposition II.5.4), we have $\phi_k(g^{\pm}) \le \|\phi_k\|$, k = 1, 2. On the other hand, by (24), and the positivity of $\phi_1$ and $\phi_2$, we know that

$$\begin{aligned}
\|\phi\| - \varepsilon \le \phi(g) &= \phi_1(g) - \phi_2(g) = \phi_1(g^+) + \phi_2(g^-) - \phi_1(g^-) - \phi_2(g^+) \\
&\le \phi_1(g^+) + \phi_2(g^-) \le \|\phi_1\| + \|\phi_2\| = \|\phi\|,
\end{aligned}$$

so we get

$$\begin{aligned}
\varepsilon &\ge \|\phi\| - \phi_1(g^+) - \phi_2(g^-) = \|\phi_1\| + \|\phi_2\| - \phi_1(g^+) - \phi_2(g^-) \\
&= \phi_1(1) + \phi_2(1) - \phi_1(g^+) - \phi_2(g^-) = \phi_1(1 - g^+) + \phi_2(1 - g^-).
\end{aligned}$$

If we define $f_1 = 1 - g^-$ and $f_2 = 1 - g^+$, then it is clear that $f_1, f_2 \ge 0$. Using the fact that $g^+ + g^- = |g| \le 1$, we get $f_1 + f_2 = 2 - |g| \ge 1$. Finally, the above estimate gives $\phi_1(f_2) + \phi_2(f_1) \le \varepsilon$, and so the Claim immediately follows.

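Having proven the Claim, we are now in position to prove that the two measures $\mu_{\phi_1}$ and $\mu_{\phi_2}$ satisfy condition (ii) in Lemma 8.1. Start with some arbitrary ε > 0, and use the Claim to find two functions $f_1, f_2 \in C^{\mathbb{R}}(X)$ with $f_1, f_2 \ge 0$, $f_1 + f_2 \ge 1$, such that $\phi_1(f_2) \le \varepsilon/2$ and $\phi_2(f_1) \le \varepsilon/2$. Consider the compact subsets

$$K_1 = \{x \in X : f_1(x) \ge \tfrac12\} \quad\text{and}\quad K_2 = \{x \in X : f_2(x) \ge \tfrac12\}.$$

Since $f_1 + f_2 \ge 1$, it follows immediately that we have $K_1 \cup K_2 = X$. By construction, we have $2f_1 \ge \kappa_{K_1}$ and $2f_2 \ge \kappa_{K_2}$, so using the interpolation property (see Proposition 7.5), we get

$$\mu_{\phi_1}(K_2) \le \phi_1(2f_2) = 2\phi_1(f_2) \le \varepsilon;$$

$$\mu_{\phi_2}(K_1) \le \phi_2(2f_1) = 2\phi_2(f_1) \le \varepsilon.$$

Since ε > 0 is arbitrary, condition (ii) in Lemma 8.1 is satisfied, with $D = K_2$ and $E = K_1$.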

The above result has several interesting consequences.

Corollary 8.2. Suppose X is a compact Hausdorff space. Then the extended Riesz correspondence (22) is injective.

Proof. Since the correspondence (22) is linear, it suffices to prove the implication $\mu_\phi = 0 \Rightarrow \phi = 0$. Start with some linear continuous map $\phi : C^{\mathbb{R}}(X) \to \mathbb{R}$, such that $\mu_\phi = 0$. Use Corollary II.5.3 to find two positive linear maps $\phi_1, \phi_2 : C^{\mathbb{R}}(X) \to \mathbb{R}$, such that $\phi = \phi_1 - \phi_2$, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. By Theorem 8.4 the difference $\mu_{\phi_1} - \mu_{\phi_2} = \mu_\phi = 0$ is the Hahn-Jordan decomposition of the zero measure. By the uniqueness (see Corollary 8.1) it follows that $\mu_{\phi_1} = \mu_{\phi_2} = 0$. By the interpolation property, we know that $\|\phi_k\| = \phi_k(1) = \mu_{\phi_k}(X) = 0$, k = 1, 2, so we get $\phi_1 = \phi_2 = 0$, thus forcing φ = 0.

The injectivity of the extended Riesz correspondence has as a consequence the uniqueness of the decomposition of linear continuous maps as differences of positive ones, of the type described in Corollary II.5.3.

Corollary 8.3. Let X be a compact Hausdorff space, and let $\phi : C^{\mathbb{R}}(X) \to \mathbb{R}$ be a linear continuous map. Assume one has positive linear maps $\phi_1, \phi_2, \psi_1, \psi_2 : C^{\mathbb{R}}(X) \to \mathbb{R}$, such that

• $\phi = \phi_1 - \phi_2 = \psi_1 - \psi_2$;
• $\|\phi_1\| + \|\phi_2\| = \|\psi_1\| + \|\psi_2\| = \|\phi\|$.

Then one has the equalities $\phi_1 = \psi_1$ and $\phi_2 = \psi_2$.

Proof. Consider the signed measure $\mu_\phi$. By Theorem 8.4, the decompositions

$$\mu_\phi = \mu_{\phi_1} - \mu_{\phi_2} = \mu_{\psi_1} - \mu_{\psi_2}$$

both represent the Hahn-Jordan decomposition of $\mu_\phi$. By the uniqueness (Corollary 8.1) we have $\mu_{\phi_1} = \mu_{\psi_1}$ and $\mu_{\phi_2} = \mu_{\psi_2}$. By Corollary 8.2 this forces $\phi_1 = \psi_1$ and $\phi_2 = \psi_2$.

Comment. If X is a compact Hausdorff space, and $\phi : C^{\mathbb{R}}(X) \to \mathbb{R}$ is a linear continuous map, then by the above result, combined with Corollary II.5.3, we know that there exist unique positive linear maps $\phi^{\pm} : C^{\mathbb{R}}(X) \to \mathbb{R}$ such that $\|\phi\| = \|\phi^+\| + \|\phi^-\|$, and

$$\phi = \phi^+ - \phi^-. \tag{25}$$

The decomposition (25) will be referred to as the Hahn-Jordan decomposition of φ. This notation and terminology are used for the following reason. If we take $\mu_\phi$ the measure given by the extended Riesz correspondence, then

$$\mu_\phi = \mu_{\phi^+} - \mu_{\phi^-}$$

is precisely the Hahn-Jordan decomposition of $\mu_\phi$.

Remarks 8.4. There is a version of the extended Riesz correspondence which works for general locally compact spaces. Start with a locally compact space X, and define the spaces

$$M^{\mathbb{R}}_0(X) = \{\phi : C^{\mathbb{R}}_0(X) \to \mathbb{R} : \phi \text{ linear continuous}\},$$

$$\mathfrak{R}^{\mathbb{R}}_0(X) = \{\mu \text{ finite signed Radon measure on } X\}.$$

Since $C^{\mathbb{R}}_0(X)$ is the completion of $C^{\mathbb{R}}_c(X)$, the correspondence

$$M^{\mathbb{R}}_0(X) \ni \phi \longmapsto \phi\big|_{C^{\mathbb{R}}_c(X)}$$

establishes an isometric linear isomorphism between $M^{\mathbb{R}}_0(X)$ and the space of all continuous linear maps $C^{\mathbb{R}}_c(X) \to \mathbb{R}$. For every positive $\phi \in M^{\mathbb{R}}_0(X)$, we denote by $\mu_\phi$ the Riesz measure associated with the restriction $\phi_c = \phi\big|_{C^{\mathbb{R}}_c(X)}$. Since $\|\phi_c\| = \|\phi\|$, we have the equality $\mu_\phi(X) = \|\phi\|$.

We know (see Proposition II.5.10) that for every linear continuous map $\phi : C^{\mathbb{R}}_0(X) \to \mathbb{R}$, there exist linear positive continuous maps $\phi_1, \phi_2 : C^{\mathbb{R}}_0(X) \to \mathbb{R}$, with $\phi = \phi_1 - \phi_2$. (In fact $\phi_1$ and $\phi_2$ can be chosen such that $\|\phi_1\| + \|\phi_2\| = \|\phi\|$.) We use this fact to define the finite signed Radon measure $\mu_\phi = \mu_{\phi_1} - \mu_{\phi_2}$. Exactly as in Remark 8.2, this definition is independent of the particular choice of $\phi_1$ and $\phi_2$. This way we have constructed a map

$$M^{\mathbb{R}}_0(X) \ni \phi \longmapsto \mu_\phi \in \mathfrak{R}^{\mathbb{R}}_0(X) \tag{26}$$

which we will call the extended finite Riesz correspondence. Of course, if X is already compact, we have $C^{\mathbb{R}}_0(X) = C^{\mathbb{R}}(X)$, $M^{\mathbb{R}}_0(X) = M^{\mathbb{R}}(X)$, and $\mathfrak{R}^{\mathbb{R}}_0(X) = \mathfrak{R}^{\mathbb{R}}(X)$, so (26) is the extended Riesz correspondence previously defined.

The following result generalizes the statements of Remark 8.3, Theorem 8.4, and Corollaries 8.2 and 8.3.

Theorem 8.5. Let X be a locally compact space.

A. The extended finite Riesz correspondence (26) is an injective linear map.
B. For every $\phi \in M^{\mathbb{R}}_0(X)$, there exist unique positive maps $\phi^+, \phi^- \in M^{\mathbb{R}}_0(X)$, such that $\phi = \phi^+ - \phi^-$, and $\|\phi\| = \|\phi^+\| + \|\phi^-\|$. Moreover, in this case

$$\mu_\phi = \mu_{\phi^+} - \mu_{\phi^-}$$

is precisely the Hahn-Jordan decomposition of $\mu_\phi$.

Proof. First of all, the correspondence (26) is clearly linear, again as a consequence of Proposition 7.6.

Second, we remark that the existence part in B is already known, from Proposition II.5.10. We are going to use the following version of Theorem 8.4.

Claim: Suppose $\phi \in M^{\mathbb{R}}_0(X)$ is written as a difference $\phi = \phi_1 - \phi_2$, with $\phi_1, \phi_2 \in M^{\mathbb{R}}_0(X)$ positive, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. Then

$$\mu_\phi = \mu_{\phi_1} - \mu_{\phi_2}$$

is the Hahn-Jordan decomposition of $\mu_\phi$.

One way to prove this is by employing the Alexandrov compactification $X^{\alpha} = X \sqcup \{\infty\}$. We use the identification

$$C^{\mathbb{R}}_0(X) = \{f \in C^{\mathbb{R}}(X^{\alpha}) : f(\infty) = 0\}.$$

We know that there exist positive linear maps $\psi_1, \psi_2 : C^{\mathbb{R}}(X^{\alpha}) \to \mathbb{R}$, such that $\psi_k\big|_{C^{\mathbb{R}}_0(X)} = \phi_k$, and $\|\psi_k\| = \|\phi_k\|$, k = 1, 2. If we define $\psi : C^{\mathbb{R}}(X^{\alpha}) \to \mathbb{R}$ by $\psi = \psi_1 - \psi_2$, it is not hard to see that $\|\psi\| = \|\psi_1\| + \|\psi_2\|$, so if we consider the Radon measures $\mu_\psi$, $\mu_{\psi_1}$ and $\mu_{\psi_2}$ on the compact space $X^{\alpha}$, then using Theorem 8.4, we get the fact that

$$\mu_\psi = \mu_{\psi_1} - \mu_{\psi_2}$$

is precisely the Hahn-Jordan decomposition of $\mu_\psi$. This means that there are sets $B_1, B_2 \in \mathrm{Bor}(X^{\alpha})$, with $B_1 \cup B_2 = X^{\alpha}$, $B_1 \cap B_2 = \varnothing$, and $\mu_{\psi_1}(B_2) = \mu_{\psi_2}(B_1) = 0$. We know (see Remarks 7.4) that

$$\mu_{\psi_k}(B) = \mu_{\phi_k}(B \cap X), \quad \forall\, B \in \mathrm{Bor}(X^{\alpha}),\ k = 1, 2,$$

so if we define $A_k = B_k \cap X$, we immediately get $A_1 \cup A_2 = X$, $A_1 \cap A_2 = \varnothing$, and $\mu_{\phi_1}(A_2) = \mu_{\phi_2}(A_1) = 0$, thus proving that $\mu_{\phi_1} \perp \mu_{\phi_2}$.

Having proven the above Claim, the proof follows line by line the proofs of Corollaries 8.2 and 8.3.

The notion of a finite signed measure can be generalized to the complex case.

Definition. Suppose A is a σ-algebra on a non-empty set X. A function µ : A → C is called a complex measure on A, if it is σ-additive in the sense that

(addσ) for any pairwise disjoint sequence $(A_n)_{n=1}^{\infty} \subset \mathcal{A}$, one has the equality

$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \mu(A_n). \tag{27}$$

Remark that the condition µ(∅) = 0 is automatic in this case. Note also that a map µ : A → C is a complex measure, if and only if the maps Re µ and Im µ are finite signed measures.

The following result describes an important construction.

Theorem 8.6. Let A be a σ-algebra, and let µ be either a signed measure, or a complex measure on A. For every A ∈ A, we define

$$\nu(A) = \sup\Bigl\{\sum_{k=1}^{\infty} |\mu(A_k)| : (A_k)_{k=1}^{\infty} \subset \mathcal{A} \text{ pairwise disjoint},\ \bigcup_{k=1}^{\infty} A_k = A\Bigr\}. \tag{28}$$

The map ν : A → [0,∞] is an “honest” measure on A.

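Proof. The first step in the proof is contained in the following.

Claim 1: For any pairwise disjoint sequence $(A_n)_{n=1}^{\infty} \subset \mathcal{A}$, one has the inequality

$$\nu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) \le \sum_{n=1}^{\infty} \nu(A_n). \tag{29}$$

Denote the right hand side of (29) by S, and denote the union $\bigcup_{n=1}^{\infty} A_n$ simply by A. Start now with some pairwise disjoint sequence $(D_k)_{k=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{k=1}^{\infty} D_k = A$. For every k ≥ 1, we have $D_k = \bigcup_{n=1}^{\infty} (D_k \cap A_n)$, with $(D_k \cap A_n)_{n=1}^{\infty} \subset \mathcal{A}$ pairwise disjoint, so we have

$$|\mu(D_k)| = \Bigl|\sum_{n=1}^{\infty} \mu(D_k \cap A_n)\Bigr| \le \sum_{n=1}^{\infty} |\mu(D_k \cap A_n)|, \quad \forall\, k \ge 1.$$

Summing up then yields

$$\sum_{k=1}^{\infty} |\mu(D_k)| \le \sum_{k=1}^{\infty} \Bigl[\sum_{n=1}^{\infty} |\mu(D_k \cap A_n)|\Bigr] = \sum_{n=1}^{\infty} \Bigl[\sum_{k=1}^{\infty} |\mu(D_k \cap A_n)|\Bigr]. \tag{30}$$

Since for each n ≥ 1, the sequence $(D_k \cap A_n)_{k=1}^{\infty} \subset \mathcal{A}$ is pairwise disjoint, and satisfies $\bigcup_{k=1}^{\infty} (D_k \cap A_n) = A_n$, by the definition of ν, we get

$$\sum_{k=1}^{\infty} |\mu(D_k \cap A_n)| \le \nu(A_n), \quad \forall\, n \ge 1.$$

Using these estimates in (30), we then get

$$\sum_{k=1}^{\infty} |\mu(D_k)| \le \sum_{n=1}^{\infty} \nu(A_n).$$

Since the inequality $\sum_{k=1}^{\infty} |\mu(D_k)| \le S$ holds for all pairwise disjoint sequences $(D_k)_{k=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{k=1}^{\infty} D_k = A$, by the definition of ν we get ν(A) ≤ S, and the Claim is proven.

Claim 2: For any finite pairwise disjoint collection $(A_n)_{n=1}^{N} \subset \mathcal{A}$, one has the inequality

$$\nu(A_1 \cup \dots \cup A_N) \ge \nu(A_1) + \dots + \nu(A_N).$$

We use induction on N, and we see immediately that it suffices only to prove the case N = 2. Fix for the moment a pairwise disjoint sequence $(D_k)_{k=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{k=1}^{\infty} D_k = A_1$, and denote the sum $\sum_{k=1}^{\infty} |\mu(D_k)|$ by R. Suppose we have a pairwise disjoint sequence $(E_j)_{j=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{j=1}^{\infty} E_j = A_2$. If we combine it with the $D_k$'s, i.e. we define

$$F_p = \begin{cases} D_{p/2} & \text{if } p \text{ is even}, \\ E_{(p+1)/2} & \text{if } p \text{ is odd}, \end{cases}$$

then we get a new pairwise disjoint sequence $(F_p)_{p=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{p=1}^{\infty} F_p = A_1 \cup A_2$. By the definition of ν we will then get

$$\nu(A_1 \cup A_2) \ge \sum_{p=1}^{\infty} |\mu(F_p)| = \sum_{k=1}^{\infty} |\mu(D_k)| + \sum_{j=1}^{\infty} |\mu(E_j)| = R + \sum_{j=1}^{\infty} |\mu(E_j)|.$$

Taking supremum over all pairwise disjoint sequences $(E_j)_{j=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{j=1}^{\infty} E_j = A_2$, the above inequality yields $\nu(A_1 \cup A_2) \ge R + \nu(A_2)$, so now we have

$$\nu(A_1 \cup A_2) \ge \nu(A_2) + \sum_{k=1}^{\infty} |\mu(D_k)|.$$

Taking supremum over all pairwise disjoint sequences $(D_k)_{k=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{k=1}^{\infty} D_k = A_1$, the above inequality finally gives $\nu(A_1 \cup A_2) \ge \nu(A_2) + \nu(A_1)$, and the Claim is proven.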

We are now in position to prove that ν is a measure on A. The equality ν(∅) = 0 is trivial. To prove σ-additivity, we start with some pairwise disjoint sequence $(A_n)_{n=1}^{\infty} \subset \mathcal{A}$, and we must prove the equality

$$\nu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \nu(A_n).$$

On the one hand, using Claim 1, we know that we have the inequality $\nu\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) \le \sum_{n=1}^{\infty} \nu(A_n)$. On the other hand, if we denote the union $\bigcup_{n=1}^{\infty} A_n$ simply by A, then using Claim 2, we see that

$$\nu(A) \ge \nu\bigl(A \smallsetminus [A_1 \cup \dots \cup A_N]\bigr) + \nu(A_1) + \dots + \nu(A_N) \ge \nu(A_1) + \dots + \nu(A_N), \quad \forall\, N \ge 1,$$

which immediately gives the other inequality $\nu(A) \ge \sum_{n=1}^{\infty} \nu(A_n)$.

Definition. With the notations above, and under the hypothesis of Theorem 8.6, the “honest” measure ν, defined by (28), is called the variation measure of µ, and will be denoted by |µ|. By construction, we have the inequality

|µ(A)| ≤ |µ|(A), ∀A ∈ A.

Remark 8.5. Let µ be either a signed measure, or a complex measure on the σ-algebra A. Exactly as with numbers (or functions), the measure |µ| has a minimality property, which can be stated as follows. Whenever ν is an “honest” measure on A with

|µ(A)| ≤ ν(A), ∀A ∈ A,

it follows that we have

|µ|(A) ≤ ν(A), ∀A ∈ A.

This is quite clear, because for any pairwise disjoint sequence $(A_n)_{n=1}^{\infty} \subset \mathcal{A}$, with $\bigcup_{n=1}^{\infty} A_n = A$, one has the inequality

$$\sum_{n=1}^{\infty} |\mu(A_n)| \le \sum_{n=1}^{\infty} \nu(A_n) = \nu(A),$$

and then the desired inequality follows by taking the supremum in the left hand side.

In the case of signed measures, the variation measure is also given by the following.

Proposition 8.2. Let µ be a signed measure on the σ-algebra A. Then one has the equality

|µ| = µ+ + µ−,

where µ = µ+ − µ− is the Hahn-Jordan decomposition of µ.

Proof. Denote the measure µ+ + µ− simply by ν. Remark that we obviously have

−ν(A) = −µ+(A) − µ−(A) ≤ µ+(A) − µ−(A) ≤ µ+(A) + µ−(A) = ν(A), ∀A ∈ A,

which gives

|µ(A)| ≤ ν(A), ∀A ∈ A.

By Remark 8.5, this forces the inequality |µ| ≤ ν.

To prove the other inequality, we start by fixing sets X+, X− ∈ A as in Theorem 8.2. We decompose each set A ∈ A as A = A+ ∪ A−, where A± = A ∩ X±, so that we have

ν(A) = ν(A+) + ν(A−) = µ+(A+) + µ+(A−) + µ−(A+) + µ−(A−) = µ+(A+) + µ−(A−).

Notice now that µ(A+) = µ+(A+) ≥ 0, and −µ(A−) = µ−(A−) ≥ 0, which means that we have the equalities µ+(A+) = |µ(A+)| and µ−(A−) = |µ(A−)|, so the above equality reads

ν(A) = |µ(A+)| + |µ(A−)|,

and by the definition of |µ| we then immediately get ν(A) ≤ |µ|(A).
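For a signed measure on a finite set the supremum in (28) runs over finitely many partitions, so the equality |µ| = µ+ + µ− can be verified by exhaustive search. This is a sketch under that finiteness assumption only; the weights and the names `partitions`, `variation`, `jordan_sum` are hypothetical.

```python
# Hypothetical point weights of a signed measure on X = {"a", "b", "c", "d"}.
weights = {"a": 2.0, "b": -1.5, "c": 0.0, "d": 3.0}

def mu(A):
    """Signed measure of A: the sum of its point weights."""
    return sum(weights[x] for x in A)

def partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + p

def variation(A):
    """|mu|(A) as in (28): the largest value of sum |mu(block)| over partitions of A."""
    return max(sum(abs(mu(block)) for block in p) for p in partitions(sorted(A)))

def jordan_sum(A):
    """mu^+(A) + mu^-(A), read off from the signs of the point weights."""
    pos = sum(w for x, w in weights.items() if x in A and w > 0)
    neg = -sum(w for x, w in weights.items() if x in A and w < 0)
    return pos + neg
```

Empty sets in a partition contribute nothing to the sum in (28), which is why only partitions into non-empty blocks are enumerated. On the subset {"a", "b"} the sketch gives |µ({a, b})| = 0.5 while |µ|({a, b}) = 3.5, so the inequality |µ(A)| ≤ |µ|(A) can be strict.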

An interesting consequence is the following.

Corollary 8.4. Let µ be either a finite signed measure, or a complex measure on the σ-algebra A. Then the variation measure |µ| is finite.

Proof. The signed measure case is clear from the above result.

In the complex case, we write µ = ν + iη, with ν and η finite signed measures on A. We apply the signed case, to get the fact that both |ν| and |η| are finite. Notice that we have

|µ(A)| = |ν(A) + iη(A)| ≤ |ν(A)| + |η(A)| ≤ |ν|(A) + |η|(A), ∀A ∈ A,

so by Remark 8.5 we get |µ| ≤ |ν| + |η|, and then the finiteness of |µ| is a consequence of the finiteness of |ν| and |η|.
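The estimate |µ| ≤ |ν| + |η| from the proof can be seen numerically for a complex measure carried by finitely many points. The sketch assumes the standard fact that, for such a discrete measure, the supremum in (28) is attained on the partition into single points, so that |µ|(A) is the sum of the moduli of the point weights; the weights and function names are hypothetical.

```python
# Hypothetical complex point weights on X = {"a", "b", "c"}.
weights = {"a": 3 + 4j, "b": -1j, "c": -2 + 0j}

def mu(A):
    """Complex measure of A: the sum of its point weights."""
    return sum(weights[x] for x in A)

def total_variation(A):
    """|mu|(A) for a discrete measure: sum of the moduli of the point weights."""
    return sum(abs(weights[x]) for x in A)

def tv_real(A):
    """|Re mu|(A): variation of the real part."""
    return sum(abs(weights[x].real) for x in A)

def tv_imag(A):
    """|Im mu|(A): variation of the imaginary part."""
    return sum(abs(weights[x].imag) for x in A)
```

For the whole set, these weights give |µ(X)| = |1 + 3i| ≈ 3.16, |µ|(X) = 8 and |Re µ|(X) + |Im µ|(X) = 10, in line with the chain |µ(A)| ≤ |µ|(A) ≤ |ν|(A) + |η|(A).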

Exercise 3. Let A be a σ-algebra, and let K be one of the fields R or C. For the purpose of this exercise, let us agree to use the term K-measure for designating either a finite signed measure (when K = R), or a complex measure (when K = C). Prove the following.

(i) The collection of all K-measures on A is a vector space.
(ii) For any two K-measures µ and ν, one has the inequality

|µ + ν| ≤ |µ| + |ν|.

(iii) For any K-measure µ and any α ∈ K, one has the equality

|αµ| = |α| · |µ|.

Proposition 8.2 has another interesting consequence, which is relevant for the study of the extended finite Riesz correspondence.

Corollary 8.5. Let X be a locally compact space. Then the extended finite Riesz correspondence (26) has the property

$$|\mu_\phi|(X) = \|\phi\|, \quad \forall\, \phi \in M^{\mathbb{R}}_0(X). \tag{31}$$

Proof. From Proposition 8.2 and Theorem 8.5, we know that $|\mu_\phi| = \mu_{\phi^+} + \mu_{\phi^-}$. Using Remarks 8.4, and Theorem 8.5 again, we have

$$|\mu_\phi|(X) = \mu_{\phi^+}(X) + \mu_{\phi^-}(X) = \|\phi^+\| + \|\phi^-\| = \|\phi\|.$$

Comments. Given a locally compact space X, we can define a complex Radon measure on X as being a complex measure on Bor(X), whose real and imaginary parts are both (finite) signed Radon measures. The extended finite Riesz correspondence can be then defined also over the complex numbers, as a map

$$M_0(X) \ni \phi \longmapsto \mu_\phi \in \mathfrak{R}_0(X),$$

where

$$M_0(X) = \{\phi : C_0(X) \to \mathbb{C} : \phi \text{ linear continuous}\},$$

$$\mathfrak{R}_0(X) = \{\mu \text{ complex Radon measure on } X\}.$$

This correspondence is again linear. One will still have the equality (31), but the proof of this fact will appear later in Chapter IV.

Chapter IV

Integration Theory

Lectures 32-33

1. Construction of the integral

In this section we construct the abstract integral. As a matter of terminology, we define a measure space as being a triple (X,A, µ), where X is some (non-empty) set, A is a σ-algebra on X, and µ is a measure on A. The measure space (X,A, µ) is said to be finite, if µ(X) < ∞.

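Definition. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A $\mathbb{K}$-valued elementary µ-integrable function on $(X,\mathcal{A},\mu)$ is a function $f : X \to \mathbb{K}$, with the following properties:

• the range $f(X)$ of $f$ is a finite set;
• $f^{-1}(\{\alpha\}) \in \mathcal{A}$, and $\mu\big(f^{-1}(\{\alpha\})\big) < \infty$, for all $\alpha \in f(X) \smallsetminus \{0\}$.

We denote by $L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$ the collection of all such functions.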

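Remarks 1.1. Let $(X,\mathcal{A},\mu)$ be a measure space.

A. Every $\mathbb{K}$-valued elementary µ-integrable function $f$ on $(X,\mathcal{A},\mu)$ is measurable, as a map $f : (X,\mathcal{A}) \to \big(\mathbb{K}, \mathrm{Bor}(\mathbb{K})\big)$. In fact, any such $f$ can be written as

$$f = \alpha_1\kappa_{A_1} + \cdots + \alpha_n\kappa_{A_n},$$

with $\alpha_k \in \mathbb{K}$, $A_k \in \mathcal{A}$ and $\mu(A_k) < \infty$, $\forall\, k = 1, \dots, n$. Using the notations from III.1, we have the inclusion

$$L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X).$$

B. If we consider the collection $\mathcal{R} = \{A \in \mathcal{A} : \mu(A) < \infty\}$, then $\mathcal{R}$ is a ring, and we have the equality

$$L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) = \mathcal{R}\text{-}\mathrm{Elem}_{\mathbb{K}}(X).$$

In particular, it follows that $L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$ is a $\mathbb{K}$-vector space.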

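The following result is the first step in the construction of the integral.

Theorem 1.1. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. Then there exists a unique $\mathbb{K}$-linear map $I^\mu_{\mathrm{elem}} : L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \to \mathbb{K}$, such that

$$I^\mu_{\mathrm{elem}}(\kappa_A) = \mu(A), \tag{1}$$

for all $A \in \mathcal{A}$, with $\mu(A) < \infty$.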

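Proof. For every $f \in L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$, we define

$$I^\mu_{\mathrm{elem}}(f) = \sum_{\alpha \in f(X)\smallsetminus\{0\}} \alpha \cdot \mu\big(f^{-1}(\{\alpha\})\big),$$

with the convention that, when $f(X) = \{0\}$ (which is the same as $f = 0$), we define $I^\mu_{\mathrm{elem}}(f) = 0$. It is obvious that $I^\mu_{\mathrm{elem}}$ satisfies the equality (1) for all $A \in \mathcal{A}$ with $\mu(A) < \infty$.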

One key feature we are going to use is the following.

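Claim 1: Whenever we have a finite pairwise disjoint sequence $(A_k)_{k=1}^n \subset \mathcal{A}$, with $\mu(A_k) < \infty$, $\forall\, k = 1, \dots, n$, one has the equality

$$I^\mu_{\mathrm{elem}}(\alpha_1\kappa_{A_1} + \cdots + \alpha_n\kappa_{A_n}) = \alpha_1\mu(A_1) + \cdots + \alpha_n\mu(A_n), \quad \forall\, \alpha_1, \dots, \alpha_n \in \mathbb{K}.$$

It is obvious that we can assume $\alpha_j \neq 0$, $\forall\, j = 1, \dots, n$. To prove the above equality, we consider the elementary µ-integrable function $f = \alpha_1\kappa_{A_1} + \cdots + \alpha_n\kappa_{A_n}$, and we observe that $f(X) \smallsetminus \{0\} = \{\alpha_1\} \cup \cdots \cup \{\alpha_n\}$. It may be the case that some of the $\alpha$'s are equal. We list $f(X) \smallsetminus \{0\} = \{\beta_1, \dots, \beta_p\}$, with $\beta_j \neq \beta_k$, for all $j, k \in \{1, \dots, p\}$ with $j \neq k$. For each $k \in \{1, \dots, p\}$, we define the set

$$J_k = \big\{ j \in \{1, \dots, n\} : \alpha_j = \beta_k \big\}.$$

It is obvious that the sets $(J_k)_{k=1}^p$ are pairwise disjoint, and we have $J_1 \cup \cdots \cup J_p = \{1, \dots, n\}$. Moreover, for each $k \in \{1, \dots, p\}$, one has the equality

$$f^{-1}(\{\beta_k\}) = \bigcup_{j \in J_k} A_j,$$

so we get

$$\beta_k\,\mu\big(f^{-1}(\{\beta_k\})\big) = \beta_k \sum_{j \in J_k} \mu(A_j) = \sum_{j \in J_k} \alpha_j\mu(A_j), \quad \forall\, k \in \{1, \dots, p\}.$$

By the definition of $I^\mu_{\mathrm{elem}}$ we then get

$$I^\mu_{\mathrm{elem}}(f) = \sum_{k=1}^p \beta_k\,\mu\big(f^{-1}(\{\beta_k\})\big) = \sum_{k=1}^p \Big[ \sum_{j \in J_k} \alpha_j\mu(A_j) \Big] = \sum_{j=1}^n \alpha_j\mu(A_j).$$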

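Claim 2: For every $f \in L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$, and every $A \in \mathcal{A}$ with $\mu(A) < \infty$, one has the equality

$$I^\mu_{\mathrm{elem}}(f + \alpha\kappa_A) = I^\mu_{\mathrm{elem}}(f) + \alpha\mu(A), \quad \forall\, \alpha \in \mathbb{K}. \tag{2}$$

Write $f = \alpha_1\kappa_{A_1} + \cdots + \alpha_n\kappa_{A_n}$, with $(A_j)_{j=1}^n \subset \mathcal{A}$ pairwise disjoint, and $\mu(A_j) < \infty$, $\forall\, j = 1, \dots, n$. In order to prove (2), we are going to write the function $f + \alpha\kappa_A$ in a similar way, and we are going to apply Claim 1. Consider the sets $B_1, B_2, \dots, B_{2n}, B_{2n+1} \in \mathcal{A}$ defined by $B_{2n+1} = A \smallsetminus (A_1 \cup \cdots \cup A_n)$, and $B_{2k-1} = A_k \cap A$, $B_{2k} = A_k \smallsetminus A$, $\forall\, k = 1, \dots, n$. It is obvious that the sets $(B_p)_{p=1}^{2n+1}$ are pairwise disjoint. Moreover, one has the equalities

$$B_{2k-1} \cup B_{2k} = A_k, \quad \forall\, k \in \{1, \dots, n\}, \tag{3}$$

as well as the equality

$$A = \bigcup_{k=1}^{n+1} B_{2k-1}. \tag{4}$$

Using these equalities, we now have $f + \alpha\kappa_A = \sum_{p=1}^{2n+1} \beta_p\kappa_{B_p}$, where $\beta_{2n+1} = \alpha$, and $\beta_{2k} = \alpha_k$ and $\beta_{2k-1} = \alpha_k + \alpha$, $\forall\, k \in \{1, \dots, n\}$. Using these equalities, combined with Claim 1, and (3) and (4), we now get

$$\begin{aligned}
I^\mu_{\mathrm{elem}}(f + \alpha\kappa_A) &= \sum_{p=1}^{2n+1} \beta_p\mu(B_p) \\
&= \alpha\mu(B_{2n+1}) + \sum_{k=1}^n \big[ (\alpha_k + \alpha)\mu(B_{2k-1}) + \alpha_k\mu(B_{2k}) \big] \\
&= \Big[ \alpha \sum_{k=1}^{n+1} \mu(B_{2k-1}) \Big] + \Big[ \sum_{k=1}^n \alpha_k \big[ \mu(B_{2k-1}) + \mu(B_{2k}) \big] \Big] \\
&= \alpha\mu\Big( \bigcup_{k=1}^{n+1} B_{2k-1} \Big) + \sum_{k=1}^n \alpha_k\mu(B_{2k-1} \cup B_{2k}) \\
&= \alpha\mu(A) + \sum_{k=1}^n \alpha_k\mu(A_k) = \alpha\mu(A) + I^\mu_{\mathrm{elem}}(f),
\end{aligned}$$

and the Claim is proven.

We now prove that $I^\mu_{\mathrm{elem}}$ is linear. The equality

$$I^\mu_{\mathrm{elem}}(f + g) = I^\mu_{\mathrm{elem}}(f) + I^\mu_{\mathrm{elem}}(g), \quad \forall\, f, g \in L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$$

follows from Claim 2, using an obvious inductive argument. The equality

$$I^\mu_{\mathrm{elem}}(\alpha f) = \alpha I^\mu_{\mathrm{elem}}(f), \quad \forall\, \alpha \in \mathbb{K},\ f \in L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$$

is also pretty obvious, from the definition.

The uniqueness is also clear.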
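The definition of the elementary integral, and the two Claims used in the proof, can be tried out on a small example. The following is a minimal sketch under the assumption that X is a hypothetical four-point set with all subsets measurable and with finite point masses; the names `weights`, `combo` and `integral_elem` are illustrative and do not come from the notes. Exact rational arithmetic is used so that the equalities can be compared directly.

```python
from fractions import Fraction as F

# Hypothetical finite model: the points of X with their masses mu({x}).
weights = {"a": F(1, 2), "b": F(2), "c": F(1, 3), "d": F(5)}

def mu(A):
    """Measure of a subset A of X."""
    return sum((weights[x] for x in A), F(0))

def indicator(A):
    """Characteristic function kappa_A, stored as a dict on X."""
    return {x: F(1) if x in A else F(0) for x in weights}

def combo(terms):
    """The function sum_k alpha_k * kappa_{A_k}, from pairs (alpha_k, A_k)."""
    return {x: sum((a for a, A in terms if x in A), F(0)) for x in weights}

def integral_elem(f):
    """Sum over the nonzero values alpha of alpha * mu(f^{-1}({alpha}))."""
    values = set(f.values()) - {0}
    return sum((a * mu({x for x in f if f[x] == a}) for a in values), F(0))

# Claim 1: disjoint sets, with a repeated coefficient (3 appears twice).
terms = [(F(3), {"a"}), (F(3), {"b"}), (F(-1), {"c", "d"})]
f = combo(terms)
lhs = integral_elem(f)
rhs = sum((a * mu(A) for a, A in terms), F(0))

# Claim 2: add alpha * kappa_A for a set A that cuts across the A_k.
A, alpha = {"b", "c"}, F(7)
kappa_A = indicator(A)
g = {x: f[x] + alpha * kappa_A[x] for x in weights}
```

Here `lhs` and `rhs` agree (both equal 13/6), and `integral_elem(g)` equals `integral_elem(f) + alpha * mu(A)`, which is equality (2) for this particular data.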

Definition. With the notations above, the linear map

$$I^\mu_{\mathrm{elem}} : L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \to \mathbb{K}$$

is called the elementary µ-integral.

In what follows we are going to encounter also situations when certain relations among measurable functions hold “almost everywhere.” We are going to use the following.

Convention. Let T be one of the spaces [−∞,∞] or C, and let r be some relation on T (in our case r will be either “=,” or “≥,” or “≤,” on [−∞,∞]). Given a measure space (X,A, µ), and two measurable functions f1, f2 : X → T, we write

f1 r f2, µ-a.e.

if the set

A = {x ∈ X : f1(x) r f2(x)}

belongs to A, and it has µ-null complement in X, i.e. µ(X ∖ A) = 0. (If r is one of the relations listed above, the set A automatically belongs to A.) The abbreviation “µ-a.e.” stands for “µ-almost everywhere.”

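Remark 1.2. Let $(X,\mathcal{A},\mu)$ be a measure space, let $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$ be such that

$$f = 0, \ \mu\text{-a.e.}$$

Then $f \in L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$, and $I^\mu_{\mathrm{elem}}(f) = 0$. Indeed, if we define the set

$$N = \{x \in X : f(x) \neq 0\},$$

then $N \in \mathcal{A}$ and $\mu(N) = 0$. Since $f^{-1}(\{\alpha\}) \subset N$, $\forall\, \alpha \in f(X) \smallsetminus \{0\}$, it follows that $\mu\big(f^{-1}(\{\alpha\})\big) = 0$, $\forall\, \alpha \in f(X) \smallsetminus \{0\}$, and then by the definition of the elementary µ-integral, we get $I^\mu_{\mathrm{elem}}(f) = 0$.

One useful property of elementary integrable functions is the following.

Proposition 1.1. Let $(X,\mathcal{A},\mu)$ be a measure space, let $f, g \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and let $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ be such that

$$f \leq h \leq g, \ \mu\text{-a.e.}$$

Then $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and

$$I^\mu_{\mathrm{elem}}(f) \leq I^\mu_{\mathrm{elem}}(h) \leq I^\mu_{\mathrm{elem}}(g). \tag{5}$$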

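Proof. Consider the sets

$$A = \{x \in X : f(x) > h(x)\} \quad\text{and}\quad B = \{x \in X : h(x) > g(x)\},$$

which both belong to $\mathcal{A}$, and have $\mu(A) = \mu(B) = 0$. The set $M = A \cup B$ also belongs to $\mathcal{A}$ and has $\mu(M) = 0$. Define the functions $f_0 = f(1 - \kappa_M)$, $g_0 = g(1 - \kappa_M)$, and $h_0 = h(1 - \kappa_M)$. It is clear that $f_0$, $g_0$, and $h_0$ are all in $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$. Moreover, we have the equalities $f_0 = f$, µ-a.e., $g_0 = g$, µ-a.e., and $h_0 = h$, µ-a.e., so by Remark 1.2, combined with Theorem 1.1, the functions $f_0 = f + (f_0 - f)$ and $g_0 = (g_0 - g) + g$ both belong to $L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and we have the equalities

$$I^\mu_{\mathrm{elem}}(f_0) = I^\mu_{\mathrm{elem}}(f) \quad\text{and}\quad I^\mu_{\mathrm{elem}}(g_0) = I^\mu_{\mathrm{elem}}(g). \tag{6}$$

Notice now that we have the (absolute) inequality

$$f_0 \leq h_0 \leq g_0.$$

Let us show that $h_0$ is elementary integrable. Start with some $\alpha \in h_0(X) \smallsetminus \{0\}$. If $\alpha > 0$, then, using the inequality $h_0 \leq g_0$, we get

$$h_0^{-1}(\{\alpha\}) \subset g_0^{-1}\big((0,\infty)\big) \subset \bigcup_{\lambda \in g_0(X) \smallsetminus \{0\}} g_0^{-1}(\{\lambda\}),$$

which proves that $\mu\big(h_0^{-1}(\{\alpha\})\big) < \infty$. Likewise, if $\alpha < 0$, then, using the inequality $h_0 \geq f_0$, we get

$$h_0^{-1}(\{\alpha\}) \subset f_0^{-1}\big((-\infty,0)\big) \subset \bigcup_{\lambda \in f_0(X) \smallsetminus \{0\}} f_0^{-1}(\{\lambda\}),$$

which proves again that $\mu\big(h_0^{-1}(\{\alpha\})\big) < \infty$.

Having shown that $h_0$ is elementary integrable, we now compare the numbers $I^\mu_{\mathrm{elem}}(f)$, $I^\mu_{\mathrm{elem}}(h_0)$, and $I^\mu_{\mathrm{elem}}(g)$. Define the functions $f_1 = h_0 - f_0$, and $g_1 = g_0 - h_0$. By Theorem 1.1, we know that $f_1, g_1 \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$. Since $f_1, g_1 \geq 0$, we have $f_1(X), g_1(X) \subset [0,\infty)$, so it follows immediately that $I^\mu_{\mathrm{elem}}(f_1) \geq 0$ and $I^\mu_{\mathrm{elem}}(g_1) \geq 0$. Now, again using Theorem 1.1, and (6), we get

$$I^\mu_{\mathrm{elem}}(h_0) = I^\mu_{\mathrm{elem}}(f_0 + f_1) = I^\mu_{\mathrm{elem}}(f_0) + I^\mu_{\mathrm{elem}}(f_1) \geq I^\mu_{\mathrm{elem}}(f_0) = I^\mu_{\mathrm{elem}}(f);$$

$$I^\mu_{\mathrm{elem}}(h_0) = I^\mu_{\mathrm{elem}}(g_0 - g_1) = I^\mu_{\mathrm{elem}}(g_0) - I^\mu_{\mathrm{elem}}(g_1) \leq I^\mu_{\mathrm{elem}}(g_0) = I^\mu_{\mathrm{elem}}(g).$$

Since $h = h_0$, µ-a.e., by the above Remark it follows that $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and $I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h_0)$, so the desired inequality (5) follows immediately.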
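The role of the null set in this argument can be illustrated on a toy example. The sketch below assumes a hypothetical three-point measure space in which the point `"n"` carries mass zero; the helper names `integral_elem` and `holds_ae` are illustrative only. The three functions are ordered only almost everywhere, yet their elementary integrals are ordered as in (5).

```python
from fractions import Fraction as F

# Hypothetical finite model; the point "n" is a mu-null point.
weights = {"a": F(1), "b": F(2), "n": F(0)}

def integral_elem(f):
    """Sum over the nonzero values alpha of alpha * mu(f^{-1}({alpha}))."""
    values = set(f.values()) - {0}
    return sum(
        (a * sum((weights[x] for x in f if f[x] == a), F(0)) for a in values),
        F(0),
    )

def holds_ae(relation, f1, f2):
    """True when the set where the relation fails has measure zero."""
    bad = [x for x in weights if not relation(f1[x], f2[x])]
    return sum((weights[x] for x in bad), F(0)) == 0

def le(s, t):
    return s <= t

# f <= h <= g fails at the null point "n", and only there.
f = {"a": F(-1), "b": F(0), "n": F(5)}
h = {"a": F(0), "b": F(1), "n": F(-9)}
g = {"a": F(2), "b": F(1), "n": F(-20)}

ordered_ae = holds_ae(le, f, h) and holds_ae(le, h, g)
ordered_everywhere = all(f[x] <= h[x] <= g[x] for x in weights)
integrals = (integral_elem(f), integral_elem(h), integral_elem(g))
```

For this data `ordered_ae` is true while `ordered_everywhere` is false, and `integrals` is the increasing triple (−1, 2, 4). The values taken on the null point have no influence, which is exactly what the passage from f, g, h to f0, g0, h0 expresses.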

We now define another type of integral.


Definition. Let $(X,\mathcal{A},\mu)$ be a measure space. A measurable function $f : X \to [0,\infty]$ is said to be µ-integrable, if

(a) every $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \leq h \leq f$, is elementary µ-integrable;
(b) $\sup\big\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \leq h \leq f \big\} < \infty$.

If this is the case, the above supremum is denoted by $I^\mu_+(f)$. The space of all such functions is denoted by $L^1_+(X,\mathcal{A},\mu)$. The map

$$I^\mu_+ : L^1_+(X,\mathcal{A},\mu) \to [0,\infty)$$

is called the positive µ-integral.

The first (legitimate) question is whether there is an overlap between the two definitions. This is answered by the following.

Proposition 1.2. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ be a function with $f \geq 0$. The following are equivalent:

(i) $f \in L^1_+(X,\mathcal{A},\mu)$;
(ii) $f \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$.

Moreover, if $f$ is as above, then $I^\mu_{\mathrm{elem}}(f) = I^\mu_+(f)$.

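Proof. The implication (i) ⇒ (ii) is trivial.

To prove the implication (ii) ⇒ (i) we start with an arbitrary elementary $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \leq h \leq f$. Using Proposition 1.1, we clearly get

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$;
(b) $I^\mu_{\mathrm{elem}}(h) \leq I^\mu_{\mathrm{elem}}(f)$.

Using these two facts, it follows that $f \in L^1_+(X,\mathcal{A},\mu)$, as well as the equality

$$\sup\big\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \leq h \leq f \big\} = I^\mu_{\mathrm{elem}}(f),$$

which gives $I^\mu_+(f) = I^\mu_{\mathrm{elem}}(f)$.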

We now examine properties of the positive integral, which are similar to those of the elementary integral. The following is an analogue of Proposition 1.1.

Proposition 1.3. Let $(X,\mathcal{A},\mu)$ be a measure space, let $f \in L^1_+(X,\mathcal{A},\mu)$, and let $g : X \to [0,\infty]$ be a measurable function, such that $g \leq f$, µ-a.e. Then $g \in L^1_+(X,\mathcal{A},\mu)$, and $I^\mu_+(g) \leq I^\mu_+(f)$.

Proof. Start with some elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \leq h \leq g$. Consider the sets

$$M = \{x \in X : h(x) > f(x)\} \quad\text{and}\quad N = \{x \in X : g(x) > f(x)\},$$

which obviously belong to $\mathcal{A}$. Since $M \subset N$, and $\mu(N) = 0$, we have $\mu(M) = 0$. If we define the elementary function $h_0 = h(1 - \kappa_M)$, then we have $h = h_0$, µ-a.e., and $0 \leq h_0 \leq f$, so it follows that $h_0 \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and $I^\mu_{\mathrm{elem}}(h_0) \leq I^\mu_+(f)$. Since $h = h_0$, µ-a.e., by Proposition 1.1, it follows that $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and $I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h_0) \leq I^\mu_+(f)$. By definition, this gives $g \in L^1_+(X,\mathcal{A},\mu)$ and $I^\mu_+(g) \leq I^\mu_+(f)$.

Remark 1.3. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $f \in L^1_+(X,\mathcal{A},\mu)$. Although $f$ is allowed to take the value $\infty$, it turns out that this is inessential. More precisely one has

$$\mu\big(f^{-1}(\{\infty\})\big) = 0.$$

This is in fact a consequence of the equality

$$\lim_{t \to \infty} \mu\big(f^{-1}([t,\infty])\big) = 0. \tag{7}$$

Indeed, if we define, for each $t \in (0,\infty)$, the set $A_t = f^{-1}([t,\infty]) \in \mathcal{A}$, then we have $0 \leq t\kappa_{A_t} \leq f$. This forces the functions $t\kappa_{A_t}$, $t \in (0,\infty)$, to be elementary integrable, and

$$\mu(A_t) \leq \frac{I^\mu_+(f)}{t}, \quad \forall\, t \in (0,\infty).$$

This forces $\lim_{t \to \infty} \mu(A_t) = 0$.

The next result explains the fact that positive integrability is a “decomposable” property.

Proposition 1.4. Let $(X,\mathcal{A},\mu)$ be a measure space. Suppose $(A_k)_{k=1}^n \subset \mathcal{A}$ is a pairwise disjoint finite sequence, with $A_1 \cup \cdots \cup A_n = X$. For a measurable function $f : X \to [0,\infty]$, the following are equivalent.

(i) $f \in L^1_+(X,\mathcal{A},\mu)$;
(ii) $f\kappa_{A_k} \in L^1_+(X,\mathcal{A},\mu)$, $\forall\, k = 1, \dots, n$.

Moreover, if $f$ satisfies these equivalent conditions, one has

$$I^\mu_+(f) = \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}).$$

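Proof. The implication (i) ⇒ (ii) is trivial, since we have $0 \leq f\kappa_{A_k} \leq f$, so we can apply Proposition 1.3.

To prove the implication (ii) ⇒ (i), start by assuming that $f$ satisfies condition (ii). We first observe that every elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \leq h \leq f$, has the properties:

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$;
(b) $I^\mu_{\mathrm{elem}}(h) \leq \sum_{k=1}^n I^\mu_+(f\kappa_{A_k})$.

This is immediate from the fact that we have the equality $h = \sum_{k=1}^n h\kappa_{A_k}$, and all the functions $h\kappa_{A_k}$ are elementary, and satisfy $0 \leq h\kappa_{A_k} \leq f\kappa_{A_k}$, and then everything follows from Theorem 1.1 and the definition of the positive integral, which gives $I^\mu_{\mathrm{elem}}(h\kappa_{A_k}) \leq I^\mu_+(f\kappa_{A_k})$.

Of course, the properties (a) and (b) above prove that $f \in L^1_+(X,\mathcal{A},\mu)$, as well as the inequality

$$I^\mu_+(f) \leq \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}).$$

To prove that we have in fact equality, we start with some $\varepsilon > 0$, and we choose, for each $k \in \{1, \dots, n\}$, a function $h_k \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, such that $0 \leq h_k \leq f\kappa_{A_k}$, and $I^\mu_{\mathrm{elem}}(h_k) \geq I^\mu_+(f\kappa_{A_k}) - \frac{\varepsilon}{n}$. By Theorem 1.1, the function $h = h_1 + \cdots + h_n$ belongs to $L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and has

$$I^\mu_{\mathrm{elem}}(h) = \sum_{k=1}^n I^\mu_{\mathrm{elem}}(h_k) \geq \Big( \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}) \Big) - \varepsilon. \tag{8}$$

We obviously have

$$h = \sum_{k=1}^n h_k \leq \sum_{k=1}^n f\kappa_{A_k} = f,$$

so we get $I^\mu_{\mathrm{elem}}(h) \leq I^\mu_+(f)$, thus the inequality (8) gives

$$I^\mu_+(f) \geq \Big( \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}) \Big) - \varepsilon.$$

Since this inequality holds for all $\varepsilon > 0$, we get $I^\mu_+(f) \geq \sum_{k=1}^n I^\mu_+(f\kappa_{A_k})$, and we are done.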

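Remark 1.4. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $S \in \mathcal{A}$. We can define the collection

$$\mathcal{A}\big|_S = \{A \cap S : A \in \mathcal{A}\} = \{A \in \mathcal{A} : A \subset S\},$$

so that $\mathcal{A}\big|_S \subset \mathcal{A}$ is a σ-algebra on $S$. The restriction of $\mu$ to $\mathcal{A}\big|_S$ will be denoted by $\mu\big|_S$. With these notations, $(S, \mathcal{A}\big|_S, \mu\big|_S)$ is a measure space. It is not hard to see that for a measurable function $f : X \to [0,\infty]$, the conditions

• $f\kappa_S \in L^1_+(X,\mathcal{A},\mu)$,
• $f\big|_S \in L^1_+(S, \mathcal{A}\big|_S, \mu\big|_S)$

are equivalent. Moreover, in this case one has the equality

$$I^\mu_+(f\kappa_S) = I^{\mu|_S}_+\big(f\big|_S\big).$$

This is a consequence of the fact that these two conditions are equivalent if $f$ is elementary, combined with the fact that the restriction map $h \mapsto h\big|_S$ establishes a bijection between the sets

$$\big\{ h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X) : 0 \leq h \leq f\kappa_S \big\},$$

$$\big\{ k \in \mathcal{A}\big|_S\text{-}\mathrm{Elem}_{\mathbb{R}}(S) : 0 \leq k \leq f\big|_S \big\}.$$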

The next result gives an alternative definition of the positive integral, for functions that are dominated by elementary integrable ones.

Proposition 1.5. Let $(X,\mathcal{A},\mu)$ be a measure space, let $f : X \to [0,\infty]$ be a measurable function. Assume there exists $h_0 \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, with $h_0 \geq f$. Then $f \in L^1_+(X,\mathcal{A},\mu)$, and one has the equality

$$I^\mu_+(f) = \inf\big\{ I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu),\ h \geq f \big\}. \tag{9}$$

Proof. Since $h_0 \geq 0$, by Proposition 1.2, we know that $h_0 \in L^1_+(X,\mathcal{A},\mu)$. The fact that $f \in L^1_+(X,\mathcal{A},\mu)$ then follows from Proposition 1.3, combined with the inequality $h_0 \geq f$. More generally, again by Propositions 1.2 and 1.3, we know that for any $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, with $h \geq f$, we have $h \in L^1_+(X,\mathcal{A},\mu)$, as well as the inequality

$$I^\mu_+(f) \leq I^\mu_+(h) = I^\mu_{\mathrm{elem}}(h).$$

So, if we denote the right hand side of (9) by $J(f)$, we have $I^\mu_+(f) \leq J(f) \leq I^\mu_{\mathrm{elem}}(h_0)$.

We now prove the other inequality $I^\mu_+(f) \geq J(f)$. If $h_0 = 0$, there is nothing to prove. Assume $h_0$ is not identically zero. Without any loss of generality, we can assume that $h_0 = \beta\kappa_B$, for some $\beta \in (0,\infty)$ and $B \in \mathcal{A}$ with $\mu(B) < \infty$. (If we define $B = h_0^{-1}\big((0,\infty)\big) = \bigcup_{\alpha \in h_0(X) \smallsetminus \{0\}} h_0^{-1}(\{\alpha\})$, and if we set $\beta = \max h_0(X)$, then we clearly have $\mu(B) < \infty$, and $h_0 \leq \beta\kappa_B$.)

For every integer $n \geq 1$, we define the sets $A^n_1, \dots, A^n_n \in \mathcal{A}$ by

$$A^n_k = f^{-1}\Big( \big( \tfrac{(k-1)\beta}{n}, \tfrac{k\beta}{n} \big] \Big), \quad \forall\, k = 1, \dots, n,$$


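and we define the elementary functions

$$g_n = \sum_{k=1}^n \tfrac{(k-1)\beta}{n}\,\kappa_{A^n_k} \quad\text{and}\quad h_n = \sum_{k=1}^n \tfrac{k\beta}{n}\,\kappa_{A^n_k}.$$

The main features of these constructions are collected in the following.

Claim: For every $n \geq 1$, the functions $g_n$ and $h_n$ are elementary integrable, and satisfy the inequalities $0 \leq g_n \leq f \leq h_n \leq h_0$, as well as

$$I^\mu_{\mathrm{elem}}(h_n) \leq I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n}.$$

To prove this fact, we fix $n \geq 1$, and we first remark that the sets $(A^n_k)_{k=1}^n$ are pairwise disjoint. Since $0 \leq f \leq h_0 = \beta\kappa_B$, we have

$$A^n_1 \cup \cdots \cup A^n_n = f^{-1}\big((0,\beta]\big) \subset B.$$

In particular, if we define $A^n = A^n_1 \cup \cdots \cup A^n_n \subset B$, we have

$$h_n = \sum_{k=1}^n \tfrac{k\beta}{n}\,\kappa_{A^n_k} \leq \beta \sum_{k=1}^n \kappa_{A^n_k} = \beta\kappa_{A^n} \leq \beta\kappa_B.$$

Let us prove the inequalities $g_n \leq f \leq h_n$. Start with some arbitrary point $x \in X$, and let us show that $g_n(x) \leq f(x) \leq h_n(x)$. If $f(x) = 0$, there is nothing to prove, because this forces $\kappa_{A^n_k}(x) = 0$, $\forall\, k = 1, \dots, n$. Assume now $f(x) > 0$. Since $f \leq \beta\kappa_B$, we know that $f(x) \in (0,\beta]$, so there exists a unique $k \in \{1, \dots, n\}$, such that $\tfrac{(k-1)\beta}{n} < f(x) \leq \tfrac{k\beta}{n}$, i.e. $x \in A^n_k$. We then obviously have

$$g_n(x) = \tfrac{(k-1)\beta}{n}\,\kappa_{A^n_k}(x) = \tfrac{(k-1)\beta}{n} < f(x) \leq \tfrac{k\beta}{n} = \tfrac{k\beta}{n}\,\kappa_{A^n_k}(x) = h_n(x),$$

and we are done. Finally, let us observe that since $g_n \leq h_n \leq h_0$, it follows that $g_n$ and $h_n$ are in $L^1_+(X,\mathcal{A},\mu)$, so $g_n$ and $h_n$ are elementary integrable. Notice that

$$h_n - g_n = \tfrac{\beta}{n} \sum_{k=1}^n \kappa_{A^n_k} = \tfrac{\beta}{n}\,\kappa_{A^n} \leq \tfrac{\beta}{n}\,\kappa_B,$$

so we have $I^\mu_{\mathrm{elem}}(h_n - g_n) \leq I^\mu_{\mathrm{elem}}\big(\tfrac{\beta}{n}\kappa_B\big) = \tfrac{\beta\mu(B)}{n}$, so using Theorem 1.1, we get

$$I^\mu_{\mathrm{elem}}(h_n) = I^\mu_{\mathrm{elem}}(g_n) + I^\mu_{\mathrm{elem}}(h_n - g_n) \leq I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n}.$$

Having proven the Claim, we immediately see that by the definition of the positive integral, we have

$$J(f) \leq I^\mu_{\mathrm{elem}}(h_n) \leq I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n} \leq I^\mu_+(f) + \frac{\beta\mu(B)}{n}.$$

Since the inequality $J(f) \leq I^\mu_+(f) + \frac{\beta\mu(B)}{n}$ holds for all $n \geq 1$, it will clearly force $J(f) \leq I^\mu_+(f)$.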
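The staircase functions from this proof are easy to compute explicitly. The sketch below assumes a hypothetical four-point measure space with finite point masses and takes B = X, so that every function is elementary and the elementary integral is a weighted sum; the names `staircase` and `integral` are illustrative only. It builds the lower and upper approximations for a given n and records the gap between their integrals.

```python
from fractions import Fraction as F
from math import ceil

# Hypothetical finite model: point masses, and a function f with 0 <= f <= beta.
weights = {0: F(1, 2), 1: F(1), 2: F(3, 2), 3: F(2)}
f = {0: F(0), 1: F(1, 3), 2: F(5, 7), 3: F(2)}
beta = F(2)
mu_B = sum(weights.values(), F(0))   # here B = X, so mu(B) = mu(X)

def integral(func):
    """On a finite set every function is elementary, so the elementary
    integral reduces to a weighted sum over the points."""
    return sum((func[x] * weights[x] for x in weights), F(0))

def staircase(n):
    """Lower and upper step functions g_n <= f <= h_n with step beta/n."""
    g, h = {}, {}
    for x, t in f.items():
        if t == 0:
            g[x] = h[x] = F(0)
        else:
            # the unique k with (k-1)*beta/n < t <= k*beta/n
            k = ceil(t * n / beta)
            g[x] = (k - 1) * beta / n
            h[x] = k * beta / n
    return g, h

gaps = {}
for n in (1, 2, 7, 50):
    g_n, h_n = staircase(n)
    gaps[n] = integral(h_n) - integral(g_n)
```

Each recorded gap is at most `beta * mu_B / n`, and `integral(f)` is squeezed between the two staircase integrals, which is the mechanism that forces J(f) ≤ I+(f) in the proof.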

Our next goal is to prove an analogue of Theorem 1.1, for the positive integral (Theorem 1.2 below). We discuss first a weaker version.

Lemma 1.1. Let $(X,\mathcal{A},\mu)$ be a measure space.

(i) If $f \in L^1_+(X,\mathcal{A},\mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ are such that $g + f \geq 0$, then $g + f \in L^1_+(X,\mathcal{A},\mu)$, and $I^\mu_+(g + f) = I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.
(ii) If $f \in L^1_+(X,\mathcal{A},\mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ are such that $g - f \geq 0$, then $g - f \in L^1_+(X,\mathcal{A},\mu)$, and $I^\mu_+(g - f) = I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f)$.


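Proof. (i). We start with a weaker version.

Claim: If $f \in L^1_+(X,\mathcal{A},\mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ are such that $g + f \geq 0$, then $g + f \in L^1_+(X,\mathcal{A},\mu)$, and $I^\mu_+(g + f) \leq I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.

What we need to prove is the fact that, for every $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \leq h \leq g + f$, we have:

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$;
(b) $I^\mu_{\mathrm{elem}}(h) \leq I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.

Consider the elementary function $h_1 = \max\{h - g, 0\}$. It is obvious that $0 \leq h_1 \leq f$, so by Proposition 1.3, it follows that $h_1 \in L^1_+(X,\mathcal{A},\mu)$, and $I^\mu_+(h_1) \leq I^\mu_+(f)$. By Proposition 1.2, this gives $h_1 \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and

$$I^\mu_{\mathrm{elem}}(h_1) = I^\mu_+(h_1) \leq I^\mu_+(f). \tag{10}$$

Using the obvious inequality $-g \leq h - g \leq h_1$, by Proposition 1.1, it follows that $h - g \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, and

$$I^\mu_{\mathrm{elem}}(h - g) \leq I^\mu_{\mathrm{elem}}(h_1). \tag{11}$$

Of course, by Theorem 1.1, this gives the fact that $h = (h - g) + g$ is elementary µ-integrable, as well as the equality

$$I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h - g) + I^\mu_{\mathrm{elem}}(g).$$

Combining this with (11) and (10) immediately gives

$$I^\mu_{\mathrm{elem}}(h) \leq I^\mu_{\mathrm{elem}}(h_1) + I^\mu_{\mathrm{elem}}(g) \leq I^\mu_+(f) + I^\mu_{\mathrm{elem}}(g),$$

and the Claim is proven.

Having proven the above Claim, we now proceed with the proof of (i). If $f \in L^1_+(X,\mathcal{A},\mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ are such that $g + f \geq 0$, then by the Claim, we already know that $g + f \in L^1_+(X,\mathcal{A},\mu)$, and $I^\mu_+(g + f) \leq I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$. We apply now again the Claim to the functions $f_1 = g + f$ and $g_1 = -g$, to get

$$I^\mu_+(f) = I^\mu_+(g_1 + f_1) \leq I^\mu_{\mathrm{elem}}(g_1) + I^\mu_+(f_1) = -I^\mu_{\mathrm{elem}}(g) + I^\mu_+(g + f),$$

which gives the other inequality $I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f) \leq I^\mu_+(g + f)$.

(ii). Start with $f \in L^1_+(X,\mathcal{A},\mu)$ and $g \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, with $g - f \geq 0$. First of all, since $0 \leq g - f \leq g$, by Proposition 1.5, it follows that $g - f \in L^1_+(X,\mathcal{A},\mu)$, and

$$I^\mu_+(g - f) = \inf\big\{ I^\mu_{\mathrm{elem}}(k) : k \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu),\ k \geq g - f \big\}. \tag{12}$$

Second, remark that, whenever $k \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ is such that $g - f \leq k$, it follows that $k + f \geq g$, so using part (i) combined with Proposition 1.3, we see that $k + f \in L^1_+(X,\mathcal{A},\mu)$, and

$$I^\mu_{\mathrm{elem}}(g) = I^\mu_+(g) \leq I^\mu_+(k + f) = I^\mu_{\mathrm{elem}}(k) + I^\mu_+(f).$$

This means that we have

$$I^\mu_{\mathrm{elem}}(k) \geq I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f),$$

for all $k \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, with $k \geq g - f$, and then by (12), we immediately get

$$I^\mu_+(g - f) \geq I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f).$$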


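To prove the other inequality, we use the definition of the positive integral, which gives

$$I^\mu_+(g - f) = \sup\big\{ I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu),\ 0 \leq h \leq g - f \big\}. \tag{13}$$

Remark that, whenever $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ is such that $0 \leq h \leq g - f$, it follows that $0 \leq h + f \leq g$, so using part (i) combined with Proposition 1.3, we see that $h + f \in L^1_+(X,\mathcal{A},\mu)$, and

$$I^\mu_{\mathrm{elem}}(g) = I^\mu_+(g) \geq I^\mu_+(h + f) = I^\mu_{\mathrm{elem}}(h) + I^\mu_+(f).$$

This means that we have

$$I^\mu_{\mathrm{elem}}(h) \leq I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f),$$

for all $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, with $0 \leq h \leq g - f$, and then by (13), we immediately get $I^\mu_+(g - f) \leq I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f)$.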

We are now in position to prove the following result (compare with Theorem 1.1).

Theorem 1.2. Let $(X,\mathcal{A},\mu)$ be a measure space.

(i) If $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$, then $f_1 + f_2 \in L^1_+(X,\mathcal{A},\mu)$, and one has the equality $I^\mu_+(f_1 + f_2) = I^\mu_+(f_1) + I^\mu_+(f_2)$.
(ii) If $f \in L^1_+(X,\mathcal{A},\mu)$, and $\alpha \in [0,\infty)$, then (see footnote 14) $\alpha f \in L^1_+(X,\mathcal{A},\mu)$, and one has the equality $I^\mu_+(\alpha f) = \alpha I^\mu_+(f)$.

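Proof. (i). Fix $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$.

Claim 1: Whenever $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ satisfies $0 \leq h \leq f_1 + f_2$, it follows that

(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$,
(b) $I^\mu_{\mathrm{elem}}(h) \leq I^\mu_+(f_1) + I^\mu_+(f_2)$.

Fix an elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \leq h \leq f_1 + f_2$, and let us first show that $h$ is elementary integrable. Fix some $\alpha \in h(X) \smallsetminus \{0\}$, and let us prove that $\mu\big(h^{-1}(\{\alpha\})\big) < \infty$. If we define the sets $A_j = f_j^{-1}\big([\alpha/2,\infty]\big) \in \mathcal{A}$, $j = 1, 2$, then the elementary functions $h_j = \tfrac{\alpha}{2}\kappa_{A_j}$ satisfy $0 \leq h_j \leq f_j$, $j = 1, 2$. In particular, it follows that $h_1, h_2 \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, which forces $\mu(A_1) < \infty$ and $\mu(A_2) < \infty$. Notice however that, for every $x \in h^{-1}(\{\alpha\})$, we have $f_1(x) + f_2(x) \geq h(x) = \alpha$, which forces either $f_1(x) \geq \tfrac{\alpha}{2}$ or $f_2(x) \geq \tfrac{\alpha}{2}$. This argument shows that we have the inclusion $h^{-1}(\{\alpha\}) \subset A_1 \cup A_2$, so it follows that we indeed have $\mu\big(h^{-1}(\{\alpha\})\big) < \infty$.

Having shown property (a), let us prove property (b). Define the sets

$$B = \{x \in X : h(x) \geq f_1(x)\} \quad\text{and}\quad D = X \smallsetminus B.$$

It is obvious that $B, D \in \mathcal{A}$ are disjoint, and $B \cup D = X$. Define the elementary functions $h' = h\kappa_B$, and $h'' = h - h' = h\kappa_D$. On the one hand, we have

$$f_1\kappa_B \leq h' \leq f_1\kappa_B + f_2\kappa_B,$$

which gives

$$0 \leq h' - f_1\kappa_B \leq f_2\kappa_B.$$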

14 Here we use the convention that when α = 0, we take αf = 0.


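By Lemma 1.1.(ii), combined with Proposition 1.4, it follows that $h' - f_1\kappa_B \in L^1_+(X,\mathcal{A},\mu)$ and $I^\mu_{\mathrm{elem}}(h') - I^\mu_+(f_1\kappa_B) = I^\mu_+(h' - f_1\kappa_B) \leq I^\mu_+(f_2\kappa_B)$, so we get

$$I^\mu_{\mathrm{elem}}(h') \leq I^\mu_+(f_1\kappa_B) + I^\mu_+(f_2\kappa_B). \tag{14}$$

On the other hand, we have

$$h'' = h\kappa_D \leq f_1\kappa_D,$$

which gives

$$I^\mu_{\mathrm{elem}}(h'') \leq I^\mu_+(f_1\kappa_D) \leq I^\mu_+(f_1\kappa_D) + I^\mu_+(f_2\kappa_D). \tag{15}$$

Since $h = h' + h''$, with $h'$ and $h''$ elementary integrable, using Theorem 1.1 combined with Proposition 1.4, by adding the inequalities (14) and (15) we get

$$\begin{aligned}
I^\mu_{\mathrm{elem}}(h) &= I^\mu_{\mathrm{elem}}(h') + I^\mu_{\mathrm{elem}}(h'') \\
&\leq I^\mu_+(f_1\kappa_B) + I^\mu_+(f_2\kappa_B) + I^\mu_+(f_1\kappa_D) + I^\mu_+(f_2\kappa_D) = I^\mu_+(f_1) + I^\mu_+(f_2),
\end{aligned}$$

and the Claim is proven.

Claim 1 obviously implies the fact that $f_1 + f_2 \in L^1_+(X,\mathcal{A},\mu)$, as well as the inequality

$$I^\mu_+(f_1 + f_2) \leq I^\mu_+(f_1) + I^\mu_+(f_2).$$

To prove the other inequality, we use the following.

Claim 2: For every $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \leq h \leq f_1$, one has the inequality

$$I^\mu_{\mathrm{elem}}(h) \leq I^\mu_+(f_1 + f_2) - I^\mu_+(f_2).$$

Indeed, if $h$ is as above, then $h$ is in $L^1_+(X,\mathcal{A},\mu)$, hence elementary integrable, and we obviously have $0 \leq h + f_2 \leq f_1 + f_2$. Then by Lemma 1.1.(i), combined with Proposition 1.3, we get

$$I^\mu_{\mathrm{elem}}(h) + I^\mu_+(f_2) = I^\mu_+(h + f_2) \leq I^\mu_+(f_1 + f_2),$$

and the Claim follows.

Using Claim 2, and the definition of the positive integral, we get

$$I^\mu_+(f_1) = \sup\big\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \leq h \leq f_1 \big\} \leq I^\mu_+(f_1 + f_2) - I^\mu_+(f_2),$$

which then gives

$$I^\mu_+(f_1) + I^\mu_+(f_2) \leq I^\mu_+(f_1 + f_2).$$

(ii). This part is obvious.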
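Additivity of the positive integral can be observed on a toy example in which one of the functions takes the value ∞ on a null set. The sketch below assumes a hypothetical three-point measure space with finite point masses, where the supremum defining the positive integral reduces to a weighted sum computed with the convention 0 · ∞ = 0. The helper names `times` and `positive_integral` are illustrative only.

```python
import math

# Hypothetical finite model; the point "z" has mass zero.
weights = {"a": 1.0, "b": 0.5, "z": 0.0}

def times(s, t):
    """Product on [0, inf] with the convention 0 * inf = 0.
    (Plain floating point arithmetic would give nan for 0.0 * inf.)"""
    return 0.0 if s == 0 or t == 0 else s * t

def positive_integral(f):
    """Weighted sum of the values of f; for finite point masses this is
    the value of the supremum over elementary functions below f."""
    return sum(times(f[x], weights[x]) for x in weights)

f1 = {"a": 2.0, "b": 4.0, "z": math.inf}   # infinite only on a null set
f2 = {"a": 1.0, "b": 0.0, "z": 3.0}
f_sum = {x: f1[x] + f2[x] for x in weights}

lhs = positive_integral(f_sum)
rhs = positive_integral(f1) + positive_integral(f2)
```

Here `lhs` and `rhs` are both 5.0. The helper `times` is needed because Python evaluates `0.0 * math.inf` to `nan`, so the convention has to be encoded by hand.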

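Definitions. Let $(X,\mathcal{A},\mu)$ be a measure space. Denote the extended real line $[-\infty,\infty]$ by $\bar{\mathbb{R}}$. A measurable function $f : X \to \bar{\mathbb{R}}$ is said to be µ-integrable, if there exist functions $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$, such that

$$f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \smallsetminus \big[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \big]. \tag{16}$$

By Remark 1.3, we know that the sets $f_k^{-1}(\{\infty\})$, $k = 1, 2$, have measure zero. The equality (16) gives then the fact that $f = f_1 - f_2$, µ-a.e. We define

$$L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) = \big\{ f : X \to \bar{\mathbb{R}} : f \ \mu\text{-integrable} \big\}.$$

We also define the space of “honest” real-valued µ-integrable functions, as

$$L^1_{\mathbb{R}}(X,\mathcal{A},\mu) = \big\{ f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) : -\infty < f(x) < \infty,\ \forall\, x \in X \big\}.$$

Finally, we define the space of complex-valued µ-integrable functions as

$$L^1_{\mathbb{C}}(X,\mathcal{A},\mu) = \big\{ f : X \to \mathbb{C} : \mathrm{Re}\, f,\ \mathrm{Im}\, f \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu) \big\}.$$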


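The next result collects the basic properties of $L^1_{\bar{\mathbb{R}}}$. Among other things, it states that it is an “almost” vector space.

Theorem 1.3. Let $(X,\mathcal{A},\mu)$ be a measure space.

(i) For a measurable function $f : X \to \bar{\mathbb{R}}$, the following are equivalent:
(a) $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$;
(b) $|f| \in L^1_+(X,\mathcal{A},\mu)$.

(ii) If $f, g \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and if $h : X \to \bar{\mathbb{R}}$ is a measurable function, such that

$$h(x) = f(x) + g(x), \quad \forall\, x \in X \smallsetminus \big[ f^{-1}(\{-\infty,\infty\}) \cup g^{-1}(\{-\infty,\infty\}) \big],$$

then $h \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.

(iii) If $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and $\alpha \in \mathbb{R}$, and if $g : X \to \bar{\mathbb{R}}$ is a measurable function, such that

$$g(x) = \alpha f(x), \quad \forall\, x \in X \smallsetminus f^{-1}(\{-\infty,\infty\}),$$

then $g \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.

(iv) One has the inclusion

$$L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu) \cup L^1_+(X,\mathcal{A},\mu) \subset L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu).$$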

Proof. (i). Consider the measurable functions $f^\pm : X \to [0,\infty]$ defined as

$$f^+ = \max\{f, 0\} \quad\text{and}\quad f^- = \max\{-f, 0\}.$$

To prove the implication (a) ⇒ (b), assume $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, which means there exist $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$, such that

$$f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \smallsetminus \big[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \big].$$

Notice that we have the inequalities

$$f^+ \leq f_1, \ \mu\text{-a.e.}, \tag{17}$$

$$f^- \leq f_2, \ \mu\text{-a.e.} \tag{18}$$

Indeed, if we put $N = f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\})$, then $\mu(N) = 0$, and if we start with some $x \in X \smallsetminus N$, we either have $f_1(x) \geq f_2(x) \geq 0$, in which case we get

$$f^+(x) = f(x) = f_1(x) - f_2(x) \leq f_1(x), \qquad f^-(x) = 0 \leq f_2(x),$$

or we have $f_1(x) \leq f_2(x)$, in which case we get

$$f^+(x) = 0 \leq f_1(x), \qquad f^-(x) = -f(x) = f_2(x) - f_1(x) \leq f_2(x).$$

In other words, we have

$$f^+(x) \leq f_1(x) \ \text{ and } \ f^-(x) \leq f_2(x), \quad \forall\, x \in X \smallsetminus N,$$

so we indeed get (17) and (18). Using these inequalities, and Proposition 1.3, it follows that $f^\pm \in L^1_+(X,\mathcal{A},\mu)$, so by Theorem 1.2, it follows that $f^+ + f^- = |f|$ also belongs to $L^1_+(X,\mathcal{A},\mu)$.


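To prove the implication (b) ⇒ (a), start by assuming that $|f| \in L^1_+(X,\mathcal{A},\mu)$. Then, since we obviously have the inequalities $0 \leq f^\pm \leq |f|$, again by Proposition 1.3, it follows that $f^\pm \in L^1_+(X,\mathcal{A},\mu)$. Since we obviously have

$$f(x) = f^+(x) - f^-(x), \quad \forall\, x \in X \smallsetminus f^{-1}(\{-\infty,\infty\}),$$

it follows that $f$ indeed belongs to $L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.

(ii). Assume $f$, $g$, and $h$ are as in (ii). By (i), both functions $|f|$ and $|g|$ are in $L^1_+(X,\mathcal{A},\mu)$. By Theorem 1.2, it follows that the function $k = |f| + |g|$ also belongs to $L^1_+(X,\mathcal{A},\mu)$. Notice that we have the equality

$$f^{-1}(\{-\infty,\infty\}) \cup g^{-1}(\{-\infty,\infty\}) = k^{-1}(\{\infty\}),$$

so the hypothesis on $h$ reads

$$h(x) = f(x) + g(x), \quad \forall\, x \in X \smallsetminus k^{-1}(\{\infty\}),$$

which then gives

$$|h(x)| = |f(x) + g(x)| \leq |f(x)| + |g(x)|, \quad \forall\, x \in X \smallsetminus k^{-1}(\{\infty\}).$$

Of course, since $\mu\big(k^{-1}(\{\infty\})\big) = 0$, this gives

$$|h| \leq k, \ \mu\text{-a.e.},$$

and using (i) it follows that $h$ indeed belongs to $L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.

(iii). Assume $f$, $\alpha$, and $g$ are as in (iii). Exactly as above, we have $|g| = |\alpha| \cdot |f|$, µ-a.e., and then by Theorem 1.2 it follows that $|g| \in L^1_+(X,\mathcal{A},\mu)$.

(iv). The inclusion $L^1_+(X,\mathcal{A},\mu) \subset L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$ is trivial. To prove the inclusion $L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu) \subset L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, we use parts (ii) and (iii) to reduce this to the fact that $\kappa_A \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, for all $A \in \mathcal{A}$, with $\mu(A) < \infty$. But this fact is now obvious, because any such function belongs to $L^1_+(X,\mathcal{A},\mu) \subset L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.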

Corollary 1.1. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$.

(i) For a $\mathbb{K}$-valued measurable function $f : X \to \mathbb{K}$, the following are equivalent:
(a) $f \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$;
(b) $|f| \in L^1_+(X,\mathcal{A},\mu)$.

(ii) When equipped with the pointwise addition and scalar multiplication, the space $L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$ becomes a $\mathbb{K}$-vector space.

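Proof. (i). The case $\mathbb{K} = \mathbb{R}$ is immediate from Theorem 1.3.

In the case when $\mathbb{K} = \mathbb{C}$, we use the obvious inequalities

$$\max\big\{ |\mathrm{Re}\, f|, |\mathrm{Im}\, f| \big\} \leq |f| \leq |\mathrm{Re}\, f| + |\mathrm{Im}\, f|. \tag{19}$$

If $f \in L^1_{\mathbb{C}}(X,\mathcal{A},\mu)$, then both $\mathrm{Re}\, f$ and $\mathrm{Im}\, f$ belong to $L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, so by Theorem 1.3, both $|\mathrm{Re}\, f|$ and $|\mathrm{Im}\, f|$ belong to $L^1_+(X,\mathcal{A},\mu)$. By Theorem 1.2, the function $g = |\mathrm{Re}\, f| + |\mathrm{Im}\, f|$ belongs to $L^1_+(X,\mathcal{A},\mu)$, and then using the second inequality in (19), it follows that $|f|$ belongs to $L^1_+(X,\mathcal{A},\mu)$.

Conversely, if $|f|$ belongs to $L^1_+(X,\mathcal{A},\mu)$, then using the first inequality in (19), it follows that both $|\mathrm{Re}\, f|$ and $|\mathrm{Im}\, f|$ belong to $L^1_+(X,\mathcal{A},\mu)$, so by Theorem 1.3, both $\mathrm{Re}\, f$ and $\mathrm{Im}\, f$ belong to $L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, i.e. $f$ belongs to $L^1_{\mathbb{C}}(X,\mathcal{A},\mu)$.

(ii). This part is pretty clear. If $f, g \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, then by (i) both $|f|$ and $|g|$ belong to $L^1_+(X,\mathcal{A},\mu)$, and by Theorem 1.2, the function $|f| + |g|$ will also belong to $L^1_+(X,\mathcal{A},\mu)$. Since $|f + g| \leq |f| + |g|$, it follows that $|f + g|$ itself belongs to $L^1_+(X,\mathcal{A},\mu)$, so using (i) again, it follows that $f + g$ indeed belongs to $L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$. If $f \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$ and $\alpha \in \mathbb{K}$, then $|f|$ belongs to $L^1_+(X,\mathcal{A},\mu)$, so $|\alpha f| = |\alpha| \cdot |f|$ again belongs to $L^1_+(X,\mathcal{A},\mu)$, which by (i) gives the fact that $\alpha f$ belongs to $L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$.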

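Remark 1.5. Let $(X,\mathcal{A},\mu)$ be a measure space. Then one has the equalities

$$L^1_+(X,\mathcal{A},\mu) = \big\{ f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) : f(X) \subset [0,\infty] \big\}; \tag{20}$$

$$L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) = L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X). \tag{21}$$

Indeed, by Theorem 1.3 we have the inclusion

$$L^1_+(X,\mathcal{A},\mu) \subset \big\{ f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) : f(X) \subset [0,\infty] \big\}.$$

The inclusion in the other direction follows again from Theorem 1.3, since any function that belongs to the right hand side of (20) satisfies $f = |f|$. The inclusion

$$L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \subset L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$$

is again contained in Theorem 1.3. To prove the inclusion in the other direction, it suffices to consider the case $\mathbb{K} = \mathbb{R}$. Start with $h \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, which gives $|h| \in L^1_+(X,\mathcal{A},\mu)$. The function $|h|$ is obviously in $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, so we get $|h| \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$. Since $L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ is a vector space, it will also contain the function $-|h|$. The fact that $h$ itself belongs to $L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$ then follows from Proposition 1.1, combined with the obvious inequalities

$$-|h| \leq h \leq |h|.$$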

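The following result deals with the construction of the integral.

Theorem 1.4. Let $(X,\mathcal{A},\mu)$ be a measure space. There exists a unique map $I^\mu_{\bar{\mathbb{R}}} : L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) \to \mathbb{R}$, with the following properties:

(i) Whenever $f, g, h \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$ are such that

$$h(x) = f(x) + g(x), \quad \forall\, x \in X \smallsetminus \big[ f^{-1}(\{-\infty,\infty\}) \cup g^{-1}(\{-\infty,\infty\}) \big],$$

it follows that $I^\mu_{\bar{\mathbb{R}}}(h) = I^\mu_{\bar{\mathbb{R}}}(f) + I^\mu_{\bar{\mathbb{R}}}(g)$.

(ii) Whenever $f, g \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$ and $\alpha \in \mathbb{R}$ are such that

$$g(x) = \alpha f(x), \quad \forall\, x \in X \smallsetminus f^{-1}(\{-\infty,\infty\}),$$

it follows that $I^\mu_{\bar{\mathbb{R}}}(g) = \alpha I^\mu_{\bar{\mathbb{R}}}(f)$.

(iii) $I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f)$, $\forall\, f \in L^1_+(X,\mathcal{A},\mu)$.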

Proof. Let us first show the existence. Start with some $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and define the functions $f^\pm : X \to [0,\infty]$ by $f^+ = \max\{f, 0\}$ and $f^- = \max\{-f, 0\}$, so that $f = f^+ - f^-$, and $f^+, f^- \in L^1_+(X,\mathcal{A},\mu)$. We then define

$$I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f^+) - I^\mu_+(f^-).$$

It is obvious that $I^\mu_{\bar{\mathbb{R}}}$ satisfies condition (iii).

The key fact that we need is contained in the following.


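Claim: Whenever $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$ are such that

$$f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \smallsetminus \big[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \big],$$

it follows that we have the equality

$$I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f_1) - I^\mu_+(f_2).$$

Indeed, since we have $f = f^+ - f^-$, it follows immediately that we have the equality

$$f_2(x) + f^+(x) = f_1(x) + f^-(x), \quad \forall\, x \in X \smallsetminus \big[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \big],$$

which gives

$$f_2 + f^+ = f_1 + f^-, \ \mu\text{-a.e.}$$

By Theorem 1.2, this immediately gives

$$I^\mu_+(f_2) + I^\mu_+(f^+) = I^\mu_+(f_1) + I^\mu_+(f^-),$$

which then gives

$$I^\mu_+(f_1) - I^\mu_+(f_2) = I^\mu_+(f^+) - I^\mu_+(f^-) = I^\mu_{\bar{\mathbb{R}}}(f).$$

Having proved the above Claim, let us show now that $I^\mu_{\bar{\mathbb{R}}}$ has properties (i) and (ii). Assume $f$, $g$ and $h$ are as in (i). Notice that if we define $h_1 = f^+ + g^+$ and $h_2 = f^- + g^-$, then we clearly have $0 \leq h_1 \leq |f| + |g|$ and $0 \leq h_2 \leq |f| + |g|$, so $h_1$ and $h_2$ both belong to $L^1_+(X,\mathcal{A},\mu)$. By Theorem 1.2, we then have

$$I^\mu_+(h_1) = I^\mu_+(f^+) + I^\mu_+(g^+) \quad\text{and}\quad I^\mu_+(h_2) = I^\mu_+(f^-) + I^\mu_+(g^-). \tag{22}$$

Notice also that, because of the equalities

$$h_1^{-1}(\{\infty\}) = f^{-1}(\{\infty\}) \cup g^{-1}(\{\infty\}) \quad\text{and}\quad h_2^{-1}(\{\infty\}) = f^{-1}(\{-\infty\}) \cup g^{-1}(\{-\infty\}),$$

we have

$$h(x) = h_1(x) - h_2(x), \quad \forall\, x \in X \smallsetminus \big[ h_1^{-1}(\{\infty\}) \cup h_2^{-1}(\{\infty\}) \big],$$

so by the above Claim, combined with (22), we get

$$I^\mu_{\bar{\mathbb{R}}}(h) = I^\mu_+(h_1) - I^\mu_+(h_2) = I^\mu_+(f^+) + I^\mu_+(g^+) - I^\mu_+(f^-) - I^\mu_+(g^-) = I^\mu_{\bar{\mathbb{R}}}(f) + I^\mu_{\bar{\mathbb{R}}}(g).$$

Property (ii) is pretty obvious.

The uniqueness is also obvious. If we start with a map $J : L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) \to \mathbb{R}$ with properties (i)-(iii), then for every $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, we must have

$$J(f) = J(f^+) - J(f^-) = I^\mu_+(f^+) - I^\mu_+(f^-).$$

(For the second equality we use condition (iii), combined with the fact that both $f^+$ and $f^-$ belong to $L^1_+(X,\mathcal{A},\mu)$.)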
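The content of the Claim, namely that the value of the integral does not depend on how f is written as a difference of two positive integrable functions, can be checked on a toy example. The sketch below assumes a hypothetical three-point measure space with finite point masses and real-valued functions, so that the positive integral is a weighted sum; the names `integral_plus` and `integral` are illustrative only.

```python
from fractions import Fraction as F

# Hypothetical finite model with finite point masses.
weights = {"a": F(1), "b": F(3), "c": F(1, 2)}

def integral_plus(f):
    """Positive integral of a function with values in [0, inf)."""
    return sum((f[x] * weights[x] for x in weights), F(0))

def integral(f):
    """Integral defined through the positive and negative parts of f."""
    f_plus = {x: max(f[x], F(0)) for x in weights}
    f_minus = {x: max(-f[x], F(0)) for x in weights}
    return integral_plus(f_plus) - integral_plus(f_minus)

f = {"a": F(2), "b": F(-1), "c": F(0)}

# Another decomposition f = f1 - f2 with f1, f2 >= 0,
# different from the canonical one f = f_plus - f_minus.
f1 = {"a": F(7), "b": F(4), "c": F(1)}
f2 = {x: f1[x] - f[x] for x in weights}

# Additivity, property (i) of the theorem, for real-valued functions.
g = {"a": F(-3), "b": F(2), "c": F(1)}
f_plus_g = {x: f[x] + g[x] for x in weights}
```

For this data `integral(f)` and `integral_plus(f1) - integral_plus(f2)` both equal −1, and `integral(f_plus_g)` equals `integral(f) + integral(g)`.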

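Corollary 1.2. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. There exists a unique linear map $I^\mu_{\mathbb{K}} : L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \to \mathbb{K}$, such that

$$I^\mu_{\mathbb{K}}(f) = I^\mu_+(f), \quad \forall\, f \in L^1_+(X,\mathcal{A},\mu) \cap L^1_{\mathbb{K}}(X,\mathcal{A},\mu).$$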

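Proof. Let us start with the case $\mathbb{K} = \mathbb{R}$. In this case, we have the inclusion

$$L^1_{\mathbb{R}}(X,\mathcal{A},\mu) \subset L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu),$$

so we can define $I^\mu_{\mathbb{R}}$ as the restriction of $I^\mu_{\bar{\mathbb{R}}}$ to $L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$. The uniqueness is again clear, because of the equalities

$$I^\mu_{\mathbb{R}}(f) = I^\mu_{\mathbb{R}}(f^+) - I^\mu_{\mathbb{R}}(f^-) = I^\mu_+(f^+) - I^\mu_+(f^-).$$

In the case $\mathbb{K} = \mathbb{C}$, we define

$$I^\mu_{\mathbb{C}}(f) = I^\mu_{\mathbb{R}}(\mathrm{Re}\, f) + i\, I^\mu_{\mathbb{R}}(\mathrm{Im}\, f).$$

The linearity is obvious. The uniqueness is also clear, because the restriction of $I^\mu_{\mathbb{C}}$ to $L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$ must agree with $I^\mu_{\mathbb{R}}$.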

Definition. Let (X,A, µ) be a measure space, and let K be one of the symbols R, R̄, or C. For any f ∈ L1K(X,A, µ), the number IµK(f) (which is real, if K = R or R̄, and is complex if K = C) will be denoted by

\[ \int_X f\,d\mu, \]

and is called the µ-integral of f. This notation is unambiguous, because if f ∈ L1R(X,A, µ), then we have IµR(f) = IµC(f) = IµR̄(f).

Remark 1.6. If (X,A, µ) is a measure space, then for every A ∈ A, with µ(A) < ∞, using the above Corollary, we get

\[ \int_X \kappa_A\,d\mu = I^\mu_+(\kappa_A) = \mu(A). \]

By linearity, if K = R, C, one has then the equality

\[ \int_X h\,d\mu = I^\mu_{\mathrm{elem}}(h), \quad \forall\, h \in L^1_{K,\mathrm{elem}}(X,\mathcal{A},\mu). \]

To make the exposition a bit easier, we will adopt the following convention.

Convention. If (X,A, µ) is a measure space, and if f : X → [0,∞] is a measurable function, which does not belong to L1+(X,A, µ), then we define

\[ \int_X f\,d\mu = \infty. \]

Remarks 1.7. Let (X,A, µ) be a measure space.

A. Using the above convention, when h ∈ A-ElemR(X) is a function with h(X) ⊂ [0,∞), the condition ∫X h dµ = ∞ is equivalent to the existence of some α ∈ h(X) ∖ {0}, with µ(h−1({α})) = ∞.

B. Using the above convention, for every measurable function f : X → [0,∞], one has the equality

\[ \int_X f\,d\mu = \sup\Big\{ \int_X h\,d\mu \,:\, h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \le h \le f \Big\}. \]

C. If f, g : X → [0,∞] are measurable, then one has the equalities

\[ \int_X (f+g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu, \]

\[ \int_X (\alpha f)\,d\mu = \alpha \int_X f\,d\mu, \quad \forall\, \alpha \in [0,\infty), \]

even in the case when some term is infinite. (We use the convention ∞ + t = ∞, ∀ t ∈ [0,∞], as well as α · ∞ = ∞, ∀α ∈ (0,∞), and 0 · ∞ = 0.)

D. If f, g : X → [0,∞] are measurable, and f ≤ g, µ-a.e., then (using B) one has the inequality

\[ \int_X f\,d\mu \le \int_X g\,d\mu, \]

even if one side (or both) is infinite.


E. Let K be one of the symbols R, R̄, or C, and let f : X → K be a measurable function. Then the function |f| : X → [0,∞] is measurable. Using the above convention, the condition that f belongs to L1K(X,A, µ) is equivalent to the inequality ∫X |f| dµ < ∞.

In the remainder of this section we discuss several properties of integration that are analogous to those of the positive/elementary integration. We begin with a useful estimate.

Proposition 1.6. Let (X,A, µ) be a measure space, and let K be one of the symbols R, R̄, or C. For every function f ∈ L1K(X,A, µ), one has the inequality

\[ \Big| \int_X f\,d\mu \Big| \le \int_X |f|\,d\mu. \]

Proof. Let us first examine the case when K = R, R̄. In this case we define f+ = max{f, 0} and f− = max{−f, 0}, so we have f = f+ − f−, as well as |f| = f+ + f−. Using the inequalities Iµ+(f±) ≥ 0, we have

\[ \int_X f\,d\mu = I^\mu_+(f^+) - I^\mu_+(f^-) \le I^\mu_+(f^+) + I^\mu_+(f^-) = \int_X |f|\,d\mu; \]

\[ -\int_X f\,d\mu = -I^\mu_+(f^+) + I^\mu_+(f^-) \le I^\mu_+(f^+) + I^\mu_+(f^-) = \int_X |f|\,d\mu. \]

In other words, we have

\[ \pm\int_X f\,d\mu \le \int_X |f|\,d\mu, \]

and the desired inequality immediately follows.

Let us consider now the case K = C. Consider the number λ = ∫X f dµ, and let us choose some complex number α ∈ C, with |α| = 1, and αλ = |λ|. (If λ ≠ 0, we take α = λ⁻¹|λ|; otherwise we take α = 1.) Consider the measurable function g = αf. Notice now that

\[ \Big( \int_X \mathrm{Re}\, g\,d\mu \Big) + i \Big( \int_X \mathrm{Im}\, g\,d\mu \Big) = \int_X g\,d\mu = \alpha \int_X f\,d\mu = \alpha\lambda = |\lambda| \ge 0, \]

so in particular we get

\[ |\lambda| = \int_X \mathrm{Re}\, g\,d\mu. \]

If we apply the real case, we then get

\[ |\lambda| \le \int_X |\mathrm{Re}\, g|\,d\mu. \tag{23} \]

Notice now that we have the inequality |Re g| ≤ |g| = |f|, which gives

\[ I^\mu_+\big(|\mathrm{Re}\, g|\big) \le I^\mu_+\big(|f|\big) = \int_X |f|\,d\mu, \]

so the inequality (23) immediately gives

\[ \Big| \int_X f\,d\mu \Big| = |\lambda| \le \int_X |f|\,d\mu. \]
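As a quick numerical sanity check of Proposition 1.6, the sketch below evaluates both sides of the inequality on a hypothetical four-point measure space, where the integral reduces to a weighted sum. The weights `mu`, the values `f`, and the helper `integral` are illustrative assumptions and do not come from the notes.

```python
# Hypothetical finite measure space: points 0..3, with mu[x] the mass of {x}.
mu = [0.5, 1.0, 2.0, 0.25]
# A complex-valued function on those four points.
f = [1 + 2j, -3j, -0.5 + 0.5j, 4.0]


def integral(values, weights):
    """Integral over a finite set: the sum of value * mass over all points."""
    return sum(v * w for v, w in zip(values, weights))


lhs = abs(integral(f, mu))               # |∫ f dµ|
rhs = integral([abs(v) for v in f], mu)  # ∫ |f| dµ
print(lhs <= rhs)
```

The inequality is strict here because the values of f point in different directions in the complex plane, so cancellation occurs inside the integral on the left.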

Corollary 1.3. Let (X,A, µ) be a measure space, and let K be one of the symbols R, R̄, or C. If a measurable function f : X → K satisfies f = 0, µ-a.e., then f ∈ L1K(X,A, µ), and ∫X f dµ = 0.


Proof. Consider the measurable function |f| : X → [0,∞], which satisfies |f| = 0, µ-a.e. By Proposition 1.3, it follows that |f| ∈ L1+(X,A, µ), hence f ∈ L1K(X,A, µ), and ∫X |f| dµ = 0. Of course, by Proposition 1.6, the last equality forces ∫X f dµ = 0.

Corollary 1.4. Let K be either R or C. If (X,A, µ) is a finite measure space, then every bounded measurable function f : X → K belongs to L1K(X,A, µ), and satisfies

\[ \Big| \int_X f\,d\mu \Big| \le \mu(X) \cdot \sup_{x \in X} |f(x)|. \]

Proof. If we put β = sup_{x∈X} |f(x)|, then we clearly have |f| ≤ βκX, which shows that |f| ∈ L1+(X,A, µ), and also gives

\[ \int_X |f|\,d\mu \le \int_X \beta\kappa_X\,d\mu = \mu(X) \cdot \beta. \]

Then everything follows from Proposition 1.6.

Comment. The introduction of the space L1R̄(X,A, µ), of extended real-valued µ-integrable functions, is useful mostly for technical reasons. In effect, everything can be reduced to the case when only “honest” real-valued functions are involved. The following result clarifies this matter.

Lemma 1.2. Let (X,A, µ) be a measure space, and let f : X → R̄ be a measurable function. The following are equivalent:

(i) f ∈ L1R̄(X,A, µ);

(ii) there exists g ∈ L1R(X,A, µ), such that g = f, µ-a.e.

Moreover, if f satisfies these equivalent conditions, then any function g, satisfying (ii), also has the property

\[ \int_X f\,d\mu = \int_X g\,d\mu. \]

Proof. Consider the set F = {x ∈ X : −∞ < f(x) < ∞}, which belongs to A. We obviously have the equality X ∖ F = |f|−1(∞).

(i) ⇒ (ii). Assume f ∈ L1R̄(X,A, µ), which means that |f| ∈ L1+(X,A, µ). In particular, we get µ(X ∖ F) = 0. Define the measurable function g = fκF. On the one hand, it is clear, by construction, that we have −∞ < g(x) < ∞, ∀x ∈ X. On the other hand, it is clear that g|F = f|F, so using µ(X ∖ F) = 0, we get the fact that f = g, µ-a.e. Finally, the inequality 0 ≤ |g| ≤ |f|, combined with Proposition 1.3, gives |g| ∈ L1+(X,A, µ), so g indeed belongs to L1R(X,A, µ).

(ii) ⇒ (i). Suppose there exists g ∈ L1R(X,A, µ), with f = g, µ-a.e., and let us prove that

(a) f ∈ L1R̄(X,A, µ);

(b) ∫X f dµ = ∫X g dµ.

The first assertion is clear, because by Proposition 1.3, the equality |f| = |g|, µ-a.e., combined with |g| ∈ L1+(X,A, µ), forces |f| ∈ L1+(X,A, µ), i.e. f ∈ L1R̄(X,A, µ). To prove (b), we consider the difference h = f − g, which is a measurable function h : X → R̄, and satisfies h = 0, µ-a.e. By Corollary 1.3, we know that h ∈ L1R̄(X,A, µ), and ∫X h dµ = 0. By Theorem 1.3, we get

\[ \int_X f\,d\mu = \int_X g\,d\mu + \int_X h\,d\mu = \int_X g\,d\mu. \]

The following result is an analogue of Proposition 1.1 (see also Proposition 1.3).


Proposition 1.7. Let (X,A, µ) be a measure space, and let f1, f2 ∈ L1R̄(X,A, µ). Suppose f : X → R̄ is a measurable function, such that f1 ≤ f ≤ f2, µ-a.e. Then f ∈ L1R̄(X,A, µ), and one has the inequality

\[ \int_X f_1\,d\mu \le \int_X f\,d\mu \le \int_X f_2\,d\mu. \]

Proof. First of all, since f1 and f2 belong to L1R̄(X,A, µ), it follows that |f1| and |f2|, hence also |f1| + |f2|, belong to L1+(X,A, µ). Second, since we have f2 ≤ |f2| ≤ |f1| + |f2| and f1 ≥ −|f1| ≥ −|f1| − |f2| (everywhere!), the inequalities f1 ≤ f ≤ f2, µ-a.e., give

\[ -|f_1| - |f_2| \le f \le |f_1| + |f_2|, \quad \mu\text{-a.e.}, \]

which reads

\[ |f| \le |f_1| + |f_2|, \quad \mu\text{-a.e.} \]

Since |f1| + |f2| ∈ L1+(X,A, µ), by Proposition 1.3, we get |f| ∈ L1+(X,A, µ), so f indeed belongs to L1R̄(X,A, µ).

To prove the inequality for integrals, we use Lemma 1.2, to find functions g1, g2, g ∈ L1R(X,A, µ), such that f1 = g1, µ-a.e., f2 = g2, µ-a.e., and f = g, µ-a.e. Lemma 1.2 also gives the equalities ∫X f1 dµ = ∫X g1 dµ, ∫X f2 dµ = ∫X g2 dµ, and ∫X f dµ = ∫X g dµ, so what we need to prove are the inequalities

\[ \int_X g_1\,d\mu \le \int_X g\,d\mu \le \int_X g_2\,d\mu. \tag{24} \]

Of course, we have g1 ≤ g ≤ g2, µ-a.e.

To prove the first inequality in (24), we consider the function h = g − g1 ∈ L1R(X,A, µ), and we prove that ∫X h dµ ≥ 0. But this is quite clear, because we have h ≥ 0, µ-a.e., which means that h = |h|, µ-a.e., so by Lemma 1.2, we get

\[ \int_X h\,d\mu = \int_X |h|\,d\mu = I^\mu_+(|h|) \ge 0. \]

The second inequality in (24) is proved in exactly the same way.

The next result is an analogue of Proposition 1.4.

Proposition 1.8. Let (X,A, µ) be a measure space, and let K be one of the symbols R, R̄, or C. Suppose (Ak)_{k=1}^n ⊂ A is a pairwise disjoint finite sequence, with A1 ∪ · · · ∪ An = X. For a measurable function f : X → K, the following are equivalent.

(i) f ∈ L1K(X,A, µ);

(ii) fκAk ∈ L1K(X,A, µ), ∀ k = 1, . . . , n.

Moreover, if f satisfies these equivalent conditions, one has

\[ \int_X f\,d\mu = \sum_{k=1}^n \int_X f\kappa_{A_k}\,d\mu. \tag{25} \]

Proof. It is fairly obvious that |fκAk| = |f|κAk. Then the equivalence (i) ⇔ (ii) follows from Proposition 1.4 applied to the function |f| : X → [0,∞]. In the cases when K = R, C, the equality (25) follows immediately from linearity, and the obvious equality f = Σ_{k=1}^n fκAk. In the case when K = R̄, we take g ∈ L1R(X,A, µ), such that f = g, µ-a.e. Then we obviously have fκAk = gκAk, µ-a.e., for all k = 1, . . . , n, and the equality (25) follows from the corresponding equality that holds for g.

Remark 1.8. The equality (25) also holds for arbitrary measurable functions f : X → [0,∞], if we use the convention that preceded Remarks 1.7. This is an immediate consequence of Proposition 1.4, because the left hand side is infinite, if and only if one of the terms in the right hand side is infinite.

The following is an obvious extension of Remark 1.4.

Remark 1.9. Let K be one of the symbols R, R̄, or C, and let (X,A, µ) be a measure space. For a set S ∈ A, and a measurable function f : X → K, one has the equivalence

\[ f\kappa_S \in L^1_K(X,\mathcal{A},\mu) \iff f\big|_S \in L^1_K\big(S, \mathcal{A}\big|_S, \mu\big|_S\big). \]

If this is the case, one has the equality

\[ \int_X f\kappa_S\,d\mu = \int_S f\big|_S\,d\mu\big|_S. \tag{26} \]

The above equality also holds for arbitrary measurable functions f : X → [0,∞], again using the convention that preceded Remarks 1.7.

Notation. The above remark states that, whenever the quantities in (26) are defined, they are equal. (This only requires the fact that f|S is measurable, and either f|S ∈ L1K(S, A|S, µ|S), or f(S) ⊂ [0,∞].) In this case, the equal quantities in (26) will be simply denoted by ∫S f dµ.

Exercise 1. Let I be some non-empty set. Consider the σ-algebra P(I), of all subsets of I, equipped with the counting measure

\[ \mu(A) = \begin{cases} \mathrm{Card}\, A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases} \]

Prove that L1R̄(I,P(I), µ) = L1R(I,P(I), µ). Prove that, if K is either R or C, then

\[ L^1_K(I, \mathcal{P}(I), \mu) = \ell^1_K(I), \]

the Banach space discussed in II.2 and II.3.

Exercise 2. There is an instance when the entire theory developed here is essentially vacuous. Let X be a non-empty set, and let A be a σ-algebra on X. For a measure µ on A, prove that the following are equivalent:

(i) L1+(X,A, µ) = {f : X → [0,∞] : f measurable, and f = 0, µ-a.e.};

(ii) for every A ∈ A, one has µ(A) ∈ {0,∞}.

A measure space (X,A, µ), with property (ii), is said to be degenerate.

Exercise 3♦. Let (X,A, µ) be a measure space, and let f : X → [0,∞] be a measurable function, with ∫X f dµ = 0. Prove that f = 0, µ-a.e.

Hint: Define the measurable sets An = {x ∈ X : f(x) ≥ 1/n}, and analyze the relationship between f and κAn.

Lecture 34

2. Convergence theorems

In this section we analyze the dynamics of integrability in the case when sequences of measurable functions are considered. Roughly speaking, a “convergence theorem” states that integrability is preserved under taking limits. In other words, if one has a sequence (fn)_{n=1}^∞ of integrable functions, and if f is some kind of a limit of the fn’s, then we would like to conclude that f itself is integrable, as well as the equality ∫ f = lim_{n→∞} ∫ fn.

Such results are often employed in two instances:

A. When we want to prove that some function f is integrable. In this case we would look for a sequence (fn)_{n=1}^∞ of integrable approximants for f.

B. When we want to construct an integrable function. In this case, we will produce first the approximants, and then we will examine the existence of the limit.

The first convergence result, which is somewhat primitive, but very useful, is the following.

Lemma 2.1. Let (X,A, µ) be a finite measure space, let a ∈ (0,∞) and let fn : X → [0, a], n ≥ 1, be a sequence of measurable functions satisfying

(a) f1 ≥ f2 ≥ · · · ≥ 0;

(b) lim_{n→∞} fn(x) = 0, ∀x ∈ X.

Then one has the equality

\[ \lim_{n\to\infty} \int_X f_n\,d\mu = 0. \tag{1} \]

Proof. Let us define, for each ε > 0, and each integer n ≥ 1, the set

\[ A^\varepsilon_n = \{x \in X : f_n(x) \ge \varepsilon\}. \]

Obviously, we have A^ε_n ∈ A, ∀ ε > 0, n ≥ 1. One key fact we are going to use is the following.

Claim 1: For every ε > 0, one has the equality

\[ \lim_{n\to\infty} \mu(A^\varepsilon_n) = 0. \]

Fix ε > 0. Let us first observe that, using (a), we have the inclusions

\[ A^\varepsilon_1 \supset A^\varepsilon_2 \supset \dots \tag{2} \]

Second, using (b), we clearly have the equality ⋂_{k=1}^∞ A^ε_k = ∅. Since µ is finite, using the Continuity Property (Lemma III.4.1), we have

\[ \lim_{n\to\infty} \mu(A^\varepsilon_n) = \mu\Big( \bigcap_{n=1}^\infty A^\varepsilon_n \Big) = \mu(\varnothing) = 0. \]

Claim 2: For every ε > 0 and every integer n ≥ 1, one has the inequality

\[ 0 \le \int_X f_n\,d\mu \le a\mu(A^\varepsilon_n) + \varepsilon\mu(X). \]

Fix ε and n, and let us consider the elementary function

\[ h^\varepsilon_n = a\kappa_{A^\varepsilon_n} + \varepsilon\kappa_{B^\varepsilon_n}, \]

where B^ε_n = X ∖ A^ε_n. Obviously, since µ(X) < ∞, the function h^ε_n is elementary integrable. By construction, we clearly have 0 ≤ fn ≤ h^ε_n, so using the properties of integration, we get

\[ 0 \le \int_X f_n\,d\mu \le \int_X h^\varepsilon_n\,d\mu = a\mu(A^\varepsilon_n) + \varepsilon\mu(B^\varepsilon_n) \le a\mu(A^\varepsilon_n) + \varepsilon\mu(X). \]

Using Claims 1 and 2, it follows immediately that

\[ 0 \le \liminf_{n\to\infty} \int_X f_n\,d\mu \le \limsup_{n\to\infty} \int_X f_n\,d\mu \le \varepsilon\mu(X). \]

Since the last inequality holds for arbitrary ε > 0, the desired equality (1) immediately follows.
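The two claims in the proof can be checked by hand on a small example. The sketch below assumes a hypothetical ten-point space with uniform masses and the decreasing sequence f_n(x) = (x/10)^n, which takes values in [0, a] with a = 1 and tends to 0 at every point; exact rational arithmetic avoids rounding issues. The names `mu`, `f`, `integral`, and `claim2_bound` are illustrative only.

```python
from fractions import Fraction

# Hypothetical finite measure space: X = {0, ..., 9}, each point of mass 1/10.
X = range(10)
mu = {x: Fraction(1, 10) for x in X}
a = Fraction(1)


def f(n, x):
    """f_n(x) = a * (x/10)^n: decreasing in n, values in [0, a], limit 0."""
    return a * Fraction(x, 10) ** n


def integral(n):
    """Integral of f_n: a finite weighted sum on this space."""
    return sum(f(n, x) * mu[x] for x in X)


def claim2_bound(n, eps):
    """a * mu(A_n^eps) + eps * mu(X), where A_n^eps = {x : f_n(x) >= eps}."""
    mass_of_A = sum(mu[x] for x in X if f(n, x) >= eps)
    return a * mass_of_A + eps * sum(mu.values())


for n in (1, 5, 50):
    print(n, float(integral(n)), float(claim2_bound(n, Fraction(1, 100))))
```

For each fixed ε the bound settles at εµ(X) once the set A_n^ε becomes empty, which is exactly the mechanism used at the end of the proof.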

We now turn our attention to a weaker notion of limit, for sequences of mea-surable functions.

Definition. Let (X,A, µ) be a measure space, and let K be one of the symbols R, R̄, or C. Suppose fn : X → K, n ≥ 1, are measurable functions. Given a measurable function f : X → K, we say that the sequence (fn)_{n=1}^∞ converges µ-almost everywhere to f, if there exists some set N ∈ A, with µ(N) = 0, such that

\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \smallsetminus N. \]

In this case we write

\[ f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n. \]

Remark 2.1. This notion of convergence has, among other things, a certain uniqueness feature. One way to describe this is to say that the limit of a µ-a.e. convergent sequence is µ-almost unique, in the sense that if f and g are measurable functions which satisfy the equalities f = µ-a.e.-lim_{n→∞} fn and g = µ-a.e.-lim_{n→∞} fn, then f = g, µ-a.e. This is quite obvious: if M, N ∈ A are sets with µ(M) = µ(N) = 0, such that

\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \smallsetminus M, \]

\[ \lim_{n\to\infty} f_n(x) = g(x), \quad \forall\, x \in X \smallsetminus N, \]

then it is obvious that µ(M ∪ N) = 0, and

\[ f(x) = g(x), \quad \forall\, x \in X \smallsetminus [M \cup N]. \]


Comment. The above definition makes sense if K is an arbitrary metric space. Any of the spaces R, R̄, and C is in fact a complete metric space. There are instances where the requirement that f is measurable is in fact redundant. This is somewhat clarified by the next two exercises.

Exercise 1*. Let (X,A, µ) be a measure space, let K be a complete separable metric space, and let fn : X → K, n ≥ 1, be measurable functions.

(i) Prove that the set

\[ L = \big\{ x \in X : \big(f_n(x)\big)_{n=1}^\infty \subset K \text{ is convergent} \big\} \]

belongs to A.

(ii) If we fix some point α ∈ K, and we define ℓ : X → K by

\[ \ell(x) = \begin{cases} \lim_{n\to\infty} f_n(x) & \text{if } x \in L \\ \alpha & \text{if } x \in X \smallsetminus L \end{cases} \]

then ℓ is measurable.

In particular, if µ(X ∖ L) = 0, then ℓ = µ-a.e.-lim_{n→∞} fn.

Hints: If d denotes the metric on K, then prove first that, for every ε > 0 and every m, n ≥ 1, the set

\[ D^\varepsilon_{mn} = \big\{ x \in X : d\big(f_m(x), f_n(x)\big) < \varepsilon \big\} \]

belongs to A (use the results from III.3). Based on this fact, prove that, for every p, k ≥ 1, the set

\[ E^p_k = \big\{ x \in X : d\big(f_m(x), f_n(x)\big) < \tfrac{1}{p},\ \forall\, m, n \ge k \big\} \]

belongs to A. Finally, use completeness to prove that

\[ L = \bigcap_{p=1}^\infty \Big( \bigcup_{k=1}^\infty E^p_k \Big). \]

Exercise 2. Let (X,A, µ), K, and (fn)_{n=1}^∞ be as in Exercise 1. Assume f : X → K is an arbitrary function, for which there exists some set N ∈ A with µ(N) = 0, and

\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \smallsetminus N. \]

Prove that, when µ is a complete measure on A (see III.5), the function f is automatically measurable.

Hint: Use the results from Exercise 1. We have X ∖ N ⊂ L, and f(x) = ℓ(x), ∀x ∈ X ∖ N. Prove that, for a Borel set B ⊂ K, one has the equality f−1(B) = ℓ−1(B) △ M, for some M ⊂ N. By completeness, we have M ∈ A, so f−1(B) ∈ A.

The following fundamental result is a generalization of Lemma 2.1.

Theorem 2.1 (Lebesgue Monotone Convergence Theorem). Let (X,A, µ) be a measure space, and let (fn)_{n=1}^∞ ⊂ L1+(X,A, µ) be a sequence with:

• fn ≤ fn+1, µ-a.e., ∀n ≥ 1;

• sup{∫X fn dµ : n ≥ 1} < ∞.

Assume f : X → [0,∞] is a measurable function, with f = µ-a.e.-lim_{n→∞} fn. Then f ∈ L1+(X,A, µ), and ∫X f dµ = lim_{n→∞} ∫X fn dµ.

Proof. Define αn = ∫X fn dµ, n ≥ 1. First of all, we clearly have

\[ 0 \le \alpha_1 \le \alpha_2 \le \dots, \]

so the sequence (αn)_{n=1}^∞ has a limit α = lim_{n→∞} αn, and we have in fact the equality

\[ \alpha = \sup\{\alpha_n : n \ge 1\} < \infty. \]

With these notations, all we need to prove is the fact that f ∈ L1+(X,A, µ), and that we have

\[ \int_X f\,d\mu = \alpha. \tag{3} \]

Fix a set M ∈ A, with µ(M) = 0, and such that lim_{n→∞} fn(x) = f(x), ∀x ∈ X ∖ M. For each n, we define the set

\[ M_n = \{x \in X : f_n(x) > f_{n+1}(x)\}. \]

Obviously Mn ∈ A, and by assumption, we have µ(Mn) = 0, ∀n ≥ 1. Define the set N = M ∪ (⋃_{n=1}^∞ Mn). It is clear that µ(N) = 0, and

• 0 ≤ f1(x) ≤ f2(x) ≤ · · · ≤ f(x), ∀x ∈ X ∖ N;

• f(x) = lim_{n→∞} fn(x), ∀x ∈ X ∖ N.

So if we put A = X ∖ N, and if we define the measurable functions gn = fnκA, n ≥ 1, and g = fκA, then we have

(a) 0 ≤ g1 ≤ g2 ≤ · · · ≤ g (everywhere!);

(b) lim_{n→∞} gn(x) = g(x), ∀x ∈ X;

(c) gn = fn, µ-a.e., ∀n ≥ 1;

(d) g = f, µ-a.e.

Notice that property (c) gives gn ∈ L1+(X,A, µ) and ∫X gn dµ = αn, ∀n ≥ 1. By property (d), we see that we have the equivalence

\[ f \in L^1_+(X,\mathcal{A},\mu) \iff g \in L^1_+(X,\mathcal{A},\mu). \]

Moreover, if g ∈ L1+(X,A, µ), then we will have ∫X g dµ = ∫X f dµ. These observations show that it suffices to prove the theorem with g’s in place of the f’s. The advantage is now the fact that we have the slightly stronger properties (a) and (b) above. The first step in the proof is the following.

Claim 1: For every t ∈ (0,∞), one has the inequality µ(g−1((t,∞])) ≤ α/t.

Denote the set g−1((t,∞]) simply by A_t. For each n ≥ 1, we also define the set A^n_t = gn−1((t,∞]). Using property (a) above, it is clear that we have the inclusions

\[ A^1_t \subset A^2_t \subset \dots \subset A_t. \tag{4} \]

Using property (b) above, we also have the equality A_t = ⋃_{n=1}^∞ A^n_t. Using the continuity Lemma 4.1, we then have

\[ \mu(A_t) = \lim_{n\to\infty} \mu(A^n_t), \]

so in order to prove the Claim, it suffices to prove the inequalities

\[ \mu(A^n_t) \le \frac{\alpha_n}{t}, \quad \forall\, n \ge 1. \tag{5} \]

But the above inequality is pretty obvious, since we clearly have 0 ≤ tκ_{A^n_t} ≤ gn, which gives

\[ t\mu(A^n_t) = \int_X t\kappa_{A^n_t}\,d\mu \le \int_X g_n\,d\mu = \alpha_n. \]

Claim 2: For any elementary function h ∈ A-ElemR(X), with 0 ≤ h ≤ g, one has

(i) h ∈ L1R,elem(X,A, µ);

(ii) ∫X h dµ ≤ α.

Start with some elementary function h, with 0 ≤ h ≤ g. Assume h is not identically zero, so we can write it as

\[ h = \beta_1\kappa_{B_1} + \dots + \beta_p\kappa_{B_p}, \]

with (Bj)_{j=1}^p ⊂ A pairwise disjoint, and 0 < β1 < · · · < βp. Define the set B = B1 ∪ · · · ∪ Bp. It is obvious that, if we put t = β1/2, we have the inclusion B ⊂ g−1((t,∞]), so by Claim 1, we get µ(B) < ∞. This gives, of course, µ(Bj) < ∞, ∀ j = 1, . . . , p, so h is indeed elementary integrable. To prove the estimate (ii), we define the measurable functions hn : X → [0,∞] by hn = min{gn, h}, ∀n ≥ 1. Since 0 ≤ hn ≤ gn, ∀n ≥ 1, it follows that hn ∈ L1+(X,A, µ), ∀n ≥ 1, and we have the inequalities

\[ \int_X h_n\,d\mu \le \int_X g_n\,d\mu = \alpha_n, \quad \forall\, n \ge 1. \tag{6} \]

It is obvious that we have

(∗) 0 ≤ h1 ≤ h2 ≤ · · · ≤ h ≤ βpκB (everywhere);

(∗∗) h(x) = lim_{n→∞} hn(x), ∀x ∈ X.

Let us restrict everything to B. We consider the σ-algebra ℬ = A|B, and the measure ν = µ|B. Consider the elementary function ψ = h|B ∈ ℬ-ElemR(B), as well as the measurable functions ψn = hn|B : B → [0,∞], n ≥ 1. It is clear that ψ ∈ L1R,elem(B,ℬ, ν), and we have the equality

\[ \int_B \psi\,d\nu = \int_X h\,d\mu. \tag{7} \]

Likewise, using (∗), which clearly forces hn = 0 on X ∖ B, it follows that, for each n ≥ 1, the function ψn belongs to L1+(B,ℬ, ν), and by (6), we have

\[ \int_B \psi_n\,d\nu = \int_X h_n\,d\mu, \quad \forall\, n \ge 1. \tag{8} \]

Let us analyze the differences ϕn = ψ − ψn. On the one hand, using (∗), we have ϕn(x) ∈ [0, βp], ∀x ∈ B, n ≥ 1. On the other hand, again by (∗), we have ϕ1 ≥ ϕ2 ≥ . . . . Finally, by (∗∗) we have lim_{n→∞} ϕn(x) = 0, ∀x ∈ B. Since ν(B) = µ(B) < ∞, we can apply Lemma 2.1, and we will get lim_{n→∞} ∫B ϕn dν = 0. This clearly gives

\[ \int_B \psi\,d\nu = \lim_{n\to\infty} \int_B \psi_n\,d\nu, \]

and then using (7) and (8), we get the equality

\[ \int_X h\,d\mu = \lim_{n\to\infty} \int_X h_n\,d\mu. \]

Combining this with (6) immediately gives the desired estimate ∫X h dµ ≤ α.

Having proven Claim 2, let us observe now that, using the definition of the positive integral, it follows immediately that g ∈ L1+(X,A, µ), and we have the inequality

\[ \int_X g\,d\mu \le \alpha. \]

The other inequality is pretty obvious, because the inequality g ≥ gn forces

\[ \int_X g\,d\mu \ge \int_X g_n\,d\mu = \alpha_n, \quad \forall\, n \ge 1, \]

so we immediately get

\[ \int_X g\,d\mu \ge \sup\{\alpha_n : n \ge 1\} = \alpha. \]

Comment. In the previous section we introduced the convention which defines ∫X f dµ = ∞, if f : X → [0,∞] is measurable, but f ∉ L1+(X,A, µ). Using this convention, the Lebesgue Monotone Convergence Theorem has the following general version.

Theorem 2.2 (General Lebesgue Monotone Convergence Theorem). Let (X,A, µ) be a measure space, and let f, fn : X → [0,∞], n ≥ 1, be measurable functions, such that

• fn ≤ fn+1, µ-a.e., ∀n ≥ 1;

• f = µ-a.e.-lim_{n→∞} fn.

Then

\[ \int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu. \tag{9} \]

Proof. As before, the sequence (αn)_{n=1}^∞ ⊂ [0,∞], defined by αn = ∫X fn dµ, ∀n ≥ 1, is non-decreasing, and it has a limit

\[ \alpha = \lim_{n\to\infty} \alpha_n = \sup\Big\{ \int_X f_n\,d\mu : n \ge 1 \Big\} \in [0,\infty]. \]

There are two cases to analyze.

Case I: α = ∞.

In this case the inequalities f ≥ fn ≥ 0, µ-a.e. will force

\[ \int_X f\,d\mu \ge \int_X f_n\,d\mu = \alpha_n, \quad \forall\, n \ge 1, \]

which will force ∫X f dµ ≥ α, so we indeed get

\[ \int_X f\,d\mu = \infty = \alpha. \]

Case II: α < ∞.

In this case we apply directly Theorem 2.1.
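To see the monotone convergence theorem at work on a concrete case, the sketch below assumes a hypothetical measure on the positive integers with µ({k}) = 2^(-k), the function f(k) = k, and the truncations f_n = min(f, n), which increase pointwise to f. The infinite sums are cut off at an assumed level `K`; the discarded tail is far below any tolerance used here. In this example the exact values are ∫ f_n dµ = 2 − 2^(1−n) and ∫ f dµ = 2.

```python
from fractions import Fraction

K = 200  # assumed truncation level for the sums over k = 1, 2, ...


def mu(k):
    """Mass of the point k in the hypothetical measure space."""
    return Fraction(1, 2 ** k)


def f_n(n, k):
    """Truncation f_n = min(f, n) of f(k) = k; increasing in n."""
    return min(k, n)


def integral_f_n(n):
    """Integral of f_n, computed as a (truncated) weighted sum."""
    return sum(f_n(n, k) * mu(k) for k in range(1, K + 1))


integrals = [integral_f_n(n) for n in range(1, 31)]
print([float(v) for v in integrals[:5]])
```

The integrals form an increasing sequence bounded by 2, and their supremum is the integral of the limit function, as Theorem 2.1 predicts.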

The following result provides an equivalent definition of integrability for non-negative functions (compare to the construction in Section 1).

Corollary 2.1. Let (X,A, µ) be a measure space, and let f : X → [0,∞] be a measurable function. The following are equivalent:

(i) f ∈ L1+(X,A, µ);

(ii) there exists a sequence (hn)_{n=1}^∞ ⊂ L1R,elem(X,A, µ), with

• 0 ≤ h1 ≤ h2 ≤ . . . ;

• lim_{n→∞} hn(x) = f(x), ∀x ∈ X;

• sup{∫X hn dµ : n ≥ 1} < ∞.

Moreover, if (hn)_{n=1}^∞ is as in (ii), then one has the equality

\[ \int_X f\,d\mu = \lim_{n\to\infty} \int_X h_n\,d\mu. \tag{10} \]

Proof. (i) ⇒ (ii). Assume f ∈ L1+(X,A, µ). Using Theorem III.3.2, we know there exists a sequence (hn)_{n=1}^∞ ⊂ A-ElemR(X), with

(a) 0 ≤ h1 ≤ h2 ≤ · · · ≤ f;

(b) lim_{n→∞} hn(x) = f(x), ∀x ∈ X.

Note that (a) forces hn ∈ L1R,elem(X,A, µ), as well as the inequalities ∫X hn dµ ≤ ∫X f dµ < ∞, ∀n ≥ 1, so the sequence (hn)_{n=1}^∞ clearly satisfies condition (ii).

The implication (ii) ⇒ (i), and the equality (10) immediately follow from the General Lebesgue Monotone Convergence Theorem.

Corollary 2.2 (Fatou Lemma). Let (X,A, µ) be a measure space, and let fn : X → [0,∞], n ≥ 1, be a sequence of measurable functions. Define the function f : X → [0,∞] by

\[ f(x) = \liminf_{n\to\infty} f_n(x), \quad \forall\, x \in X. \]

Then f is measurable, and one has the inequality

\[ \int_X f\,d\mu \le \liminf_{n\to\infty} \int_X f_n\,d\mu. \]

Proof. The fact that f is measurable is already known (see Corollary III.3.5). Define the sequence (αn)_{n=1}^∞ ⊂ [0,∞] by αn = ∫X fn dµ, ∀n ≥ 1.

Define, for each integer n ≥ 1, the function gn : X → [0,∞] by

\[ g_n(x) = \inf\{f_k(x) : k \ge n\}, \quad \forall\, x \in X. \]

By Corollary III.3.4, we know that gn, n ≥ 1, are all measurable. Moreover, it is clear that

• 0 ≤ g1 ≤ g2 ≤ . . . ;

• f(x) = lim_{n→∞} gn(x), ∀x ∈ X.

By the General Lebesgue Monotone Convergence Theorem 2.2, it follows that

\[ \int_X f\,d\mu = \lim_{n\to\infty} \int_X g_n\,d\mu. \tag{11} \]

Notice that, if we define the sequence (βn)_{n=1}^∞ ⊂ [0,∞], by βn = ∫X gn dµ, ∀n ≥ 1, then the obvious inequalities 0 ≤ gn ≤ fn give ∫X gn dµ ≤ ∫X fn dµ, so we get

\[ \beta_n \le \alpha_n, \quad \forall\, n \ge 1. \]

Using (11), we then get

\[ \int_X f\,d\mu = \lim_{n\to\infty} \beta_n = \liminf_{n\to\infty} \beta_n \le \liminf_{n\to\infty} \alpha_n. \]
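The inequality in the Fatou Lemma can be strict. A minimal sketch, assuming a hypothetical two-point space with mass 1/2 at each point and the alternating indicators f_n = κ_{n mod 2}: each f_n has integral 1/2, while the pointwise lim inf is identically zero. Because the sequence is 2-periodic, the minimum over a finite window of indices equals the lim inf exactly; the window length `N` is an arbitrary choice.

```python
from fractions import Fraction

# Hypothetical two-point measure space with mass 1/2 at each point.
mu = {0: Fraction(1, 2), 1: Fraction(1, 2)}


def f(n, x):
    """f_n is the indicator of the single point n mod 2."""
    return 1 if x == n % 2 else 0


def integral(g):
    """Integral of a function on the two-point space."""
    return sum(g(x) * mu[x] for x in mu)


N = 20  # window of indices; exact here because the sequence is 2-periodic
liminf_f = {x: min(f(n, x) for n in range(1, N + 1)) for x in mu}

lhs = integral(lambda x: liminf_f[x])                                # ∫ lim inf f_n dµ
rhs = min(integral(lambda x, n=n: f(n, x)) for n in range(1, N + 1))  # lim inf ∫ f_n dµ
print(lhs, rhs)
```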

The following is an important application of Theorem 2.1, that deals withRiemann integration.

Corollary 2.3. Let a < b be real numbers. Denote by λ the Lebesgue measure, and consider the Lebesgue space ([a, b],Mλ([a, b]), λ), where Mλ([a, b]) denotes the σ-algebra of all Lebesgue measurable subsets of [a, b]. Then every Riemann integrable function f : [a, b] → R belongs to L1R([a, b],Mλ([a, b]), λ), and one has the equality

\[ \int_{[a,b]} f\,d\lambda = \int_a^b f(x)\,dx. \tag{12} \]

Proof. We are going to use the results from III.6. First of all, the fact that f is Lebesgue integrable, i.e. f belongs to L1R([a, b],Mλ([a, b]), λ), is clear since f is Lebesgue measurable, and bounded. (Here we use the fact that the measure space ([a, b],Mλ([a, b]), λ) is finite.)

Next we prove the equality between the Riemann integral and the Lebesgue integral. Adding a constant, if necessary, we can assume that f ≥ 0. For every partition ∆ = (a = x0 < x1 < · · · < xn = b) of [a, b], we define the numbers

\[ m_k = \inf_{t \in [x_{k-1}, x_k]} f(t), \quad \forall\, k = 1, \dots, n, \]

and we define the function

\[ f_\Delta = m_1\kappa_{[x_0,x_1]} + m_2\kappa_{(x_1,x_2]} + \dots + m_n\kappa_{(x_{n-1},x_n]}. \]

Fix a sequence of partitions (∆p)_{p=1}^∞, with ∆1 ⊂ ∆2 ⊂ . . . , and lim_{p→∞} |∆p| = 0. We know (see III.6) that we have

\[ f = \lambda\text{-a.e.-}\lim_{p\to\infty} f_{\Delta_p}. \]

Clearly we have 0 ≤ f∆1 ≤ f∆2 ≤ · · · ≤ f, so by Theorem 2.1, we get

\[ \int_{[a,b]} f\,d\lambda = \lim_{p\to\infty} \int_{[a,b]} f_{\Delta_p}\,d\lambda. \tag{13} \]

Notice however that

\[ \int_{[a,b]} f_{\Delta_p}\,d\lambda = L(f, \Delta_p), \quad \forall\, p \ge 1, \]

where L(f,∆p) denotes the lower Darboux sum. Combining this with (13), and with the well known properties of Riemann integration, we immediately get (12).
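The mechanism of this proof can be illustrated with the lower Darboux sums of a specific function. The sketch below assumes f(x) = x² on [0, 1] and the nested dyadic partitions ∆_p with 2^p cells; both choices are illustrative and not taken from the notes. Since f is increasing on [0, 1], the infimum m_k on each cell is attained at the left endpoint. The sums increase with p and approach the Riemann integral 1/3.

```python
from fractions import Fraction


def lower_darboux_sum(p):
    """L(f, Δ_p) for f(x) = x^2 on [0, 1], with Δ_p the dyadic partition
    into 2^p cells. f is increasing, so m_k = f(x_{k-1})."""
    n = 2 ** p
    return sum(Fraction(k - 1, n) ** 2 * Fraction(1, n) for k in range(1, n + 1))


sums = [lower_darboux_sum(p) for p in range(1, 11)]
print([float(s) for s in sums])
```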

The following is another important convergence theorem.

Theorem 2.3 (Lebesgue Dominated Convergence Theorem). Let (X,A, µ) be a measure space, let K be one of the symbols R, R̄, or C, and let (fn)_{n=1}^∞ ⊂ L1K(X,A, µ). Assume f : X → K is a measurable function, such that

(i) f = µ-a.e.-lim_{n→∞} fn;

(ii) there exists some function g ∈ L1+(X,A, µ), such that

\[ |f_n| \le g, \quad \mu\text{-a.e.}, \ \forall\, n \ge 1. \]

Then f ∈ L1K(X,A, µ), and one has the equality

\[ \int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu. \tag{14} \]

Proof. The fact that f is integrable follows from the following

Claim: |f| ≤ g, µ-a.e.

To prove this fact, we define, for each n ≥ 1, the set

\[ M_n = \{x \in X : |f_n(x)| > g(x)\}. \]

It is clear that Mn ∈ A, and µ(Mn) = 0, ∀n ≥ 1. If we choose M ∈ A such that µ(M) = 0, and

\[ f(x) = \lim_{n\to\infty} f_n(x), \quad \forall\, x \in X \smallsetminus M, \]

then the set N = M ∪ (⋃_{n=1}^∞ Mn) ∈ A will satisfy

• µ(N) = 0;

• |fn(x)| ≤ g(x), ∀x ∈ X ∖ N;

• f(x) = lim_{n→∞} fn(x), ∀x ∈ X ∖ N.

We then clearly get

\[ |f(x)| \le g(x), \quad \forall\, x \in X \smallsetminus N, \]

and the Claim follows.

Having proven that f is integrable, we now concentrate on the equality (14).

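Case I: K = R̄.

First of all, without any loss of generality, we can assume that 0 ≤ g(x) < ∞, ∀x ∈ X. (See Lemma 1.2.) Let us define the functions gn = min{fn, g} and hn = max{fn, −g}, n ≥ 1. Since we have −g ≤ fn ≤ g, µ-a.e., we immediately get

\[ g_n = h_n = f_n, \quad \mu\text{-a.e.}, \ \forall\, n \ge 1, \tag{15} \]

thus giving the fact that gn, hn ∈ L1R̄(X,A, µ), ∀n ≥ 1, as well as the equalities

\[ \int_X g_n\,d\mu = \int_X h_n\,d\mu = \int_X f_n\,d\mu, \quad \forall\, n \ge 1. \tag{16} \]

Define the measurable functions ϕ, ψ : X → R̄ by

\[ \varphi(x) = \limsup_{n\to\infty} g_n(x) \quad\text{and}\quad \psi(x) = \liminf_{n\to\infty} h_n(x), \quad \forall\, x \in X. \]

Using (15), we clearly have f = ϕ = ψ, µ-a.e., so we get

\[ \int_X f\,d\mu = \int_X \varphi\,d\mu = \int_X \psi\,d\mu. \tag{17} \]

Remark also that we have the equalities

\[ g(x) - \varphi(x) = \liminf_{n\to\infty}\, [g(x) - g_n(x)] \quad\text{and}\quad g(x) + \psi(x) = \liminf_{n\to\infty}\, [g(x) + h_n(x)], \quad \forall\, x \in X. \tag{18} \]

Since we clearly have

\[ g - g_n \ge 0 \quad\text{and}\quad g + h_n \ge 0, \quad \forall\, n \ge 1, \]

using (18) and the Fatou Lemma (Corollary 2.2), we get the inequalities

\[ \int_X (g - \varphi)\,d\mu \le \liminf_{n\to\infty} \int_X (g - g_n)\,d\mu, \]

\[ \int_X (g + \psi)\,d\mu \le \liminf_{n\to\infty} \int_X (g + h_n)\,d\mu. \]

In other words, we get

\[ \int_X g\,d\mu - \int_X \varphi\,d\mu \le \liminf_{n\to\infty} \Big[ \int_X g\,d\mu - \int_X g_n\,d\mu \Big] = \int_X g\,d\mu - \limsup_{n\to\infty} \int_X g_n\,d\mu, \]

\[ \int_X g\,d\mu + \int_X \psi\,d\mu \le \liminf_{n\to\infty} \Big[ \int_X g\,d\mu + \int_X h_n\,d\mu \Big] = \int_X g\,d\mu + \liminf_{n\to\infty} \int_X h_n\,d\mu. \]

Using the equalities (16) and (17), the above inequalities give

\[ \int_X f\,d\mu = \int_X \varphi\,d\mu \ge \limsup_{n\to\infty} \int_X g_n\,d\mu = \limsup_{n\to\infty} \int_X f_n\,d\mu, \]

\[ \int_X f\,d\mu = \int_X \psi\,d\mu \le \liminf_{n\to\infty} \int_X h_n\,d\mu = \liminf_{n\to\infty} \int_X f_n\,d\mu. \]

In other words, we have

\[ \int_X f\,d\mu \le \liminf_{n\to\infty} \int_X f_n\,d\mu \le \limsup_{n\to\infty} \int_X f_n\,d\mu \le \int_X f\,d\mu, \]

thus giving the equality (14).

The case K = R is trivial (it is in fact contained in the case K = R̄).

The case K = C is also pretty clear, using real and imaginary parts, since for each n ≥ 1, we clearly have

\[ |\mathrm{Re}\, f_n| \le g, \quad \mu\text{-a.e.}, \]

\[ |\mathrm{Im}\, f_n| \le g, \quad \mu\text{-a.e.} \]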
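A small numerical illustration of the dominated convergence theorem, under assumptions chosen for convenience rather than taken from the notes: the measure on the positive integers with µ({k}) = 2^(-k), the sign-changing limit f(k) = (−1)^k k, the approximants f_n(k) = f(k) · n/(n + k), and the dominating function g(k) = k, whose integral equals 2. The sums are truncated at an assumed level `K`, and floating point arithmetic is used, so the printed values are approximations. The exact limit here is ∫ f dµ = Σ k(−1/2)^k = −2/9.

```python
K = 200  # assumed truncation level for the sums over k = 1, 2, ...


def mu(k):
    """Mass of the point k."""
    return 0.5 ** k


def f(k):
    """Pointwise limit function, changing sign with k."""
    return (-1) ** k * k


def f_n(n, k):
    """Approximants: f_n(k) -> f(k) as n -> infinity, and |f_n| <= g."""
    return f(k) * n / (n + k)


def g(k):
    """Dominating function; integrable since sum k * 2^-k = 2."""
    return float(k)


def integral(func):
    """Integral as a (truncated) weighted sum."""
    return sum(func(k) * mu(k) for k in range(1, K + 1))


limit = integral(f)
approx = [integral(lambda k, n=n: f_n(n, k)) for n in (10, 1000, 100000)]
print(limit, approx)
```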

Exercise 3. Give an example of a sequence of continuous functions fn : [0, 1] → [0,∞), such that

(a) lim_{n→∞} fn(x) = 0, ∀x ∈ [0, 1];

(b) ∫[0,1] fn dλ = 1, ∀n ≥ 1.

(Here λ denotes the Lebesgue measure.) This shows that the Lebesgue Dominated Convergence Theorem fails without the dominance condition (ii).

Hint: Consider the functions fn defined by

\[ f_n(x) = \begin{cases} n^2 x & \text{if } 0 \le x \le 1/n \\ n(2 - nx) & \text{if } 1/n \le x \le 2/n \\ 0 & \text{if } 2/n \le x \le 1 \end{cases} \]
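The tent functions from the hint can be examined exactly, since each f_n is piecewise linear with breakpoints at 1/n and 2/n. The sketch below (helper names are illustrative) integrates f_n with the trapezoid rule on the grid {k/n}, which contains both breakpoints and is therefore exact. Note that the formula only fits inside [0, 1] when 2/n ≤ 1: for n = 1 the function restricts to x on [0, 1] and its integral is 1/2, so one may want to start the sequence at n = 2.

```python
from fractions import Fraction


def f(n, x):
    """Tent function from the hint: peak value n at x = 1/n, zero beyond 2/n."""
    x = Fraction(x)
    if x <= Fraction(1, n):
        return n * n * x
    if x <= Fraction(2, n):
        return n * (2 - n * x)
    return Fraction(0)


def integral(n):
    """Exact integral of f_n over [0, 1]: trapezoid rule on the grid k/n,
    which contains the breakpoints of the piecewise linear function f_n."""
    nodes = [Fraction(k, n) for k in range(n + 1)]
    return sum((f(n, u) + f(n, v)) / 2 * (v - u) for u, v in zip(nodes, nodes[1:]))


print([integral(n) for n in range(1, 6)])
print([f(n, Fraction(1, 10)) for n in (5, 10, 20, 40)])
```

At a fixed point x > 0 the value f_n(x) vanishes as soon as 2/n ≤ x, while sup f_n = n is unbounded, which is why no integrable dominating function can exist.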

The Lebesgue Convergence Theorems 2.2 and 2.3 have many applications. They are among the most important results in Measure Theory. In many instances, these theorems are employed during proofs, at key steps. The next two results are good illustrations.

Proposition 2.1. Let (X,A, µ) be a measure space, and let f : X → [0,∞] be a measurable function. Then the map

\[ \nu : \mathcal{A} \ni A \longmapsto \int_A f\,d\mu \in [0,\infty] \]

defines a measure on A.

Proof. It is clear that ν(∅) = 0. To prove σ-additivity, start with a pairwise disjoint sequence (An)_{n=1}^∞ ⊂ A, and put A = ⋃_{n=1}^∞ An. For each integer n ≥ 1, define the set Bn = ⋃_{k=1}^n Ak, and the measurable function gn = fκBn. Define also the function g = fκA. It is obvious that

• 0 ≤ g1 ≤ g2 ≤ · · · ≤ g (everywhere),

• lim_{n→∞} gn(x) = g(x), ∀x ∈ X.

Using the General Lebesgue Monotone Convergence Theorem, it follows that

\[ \nu(A) = \int_X f\kappa_A\,d\mu = \int_X g\,d\mu = \lim_{n\to\infty} \int_X g_n\,d\mu. \tag{19} \]

Notice now that, for each n ≥ 1, one has the equality

\[ g_n = f\kappa_{A_1} + \dots + f\kappa_{A_n}, \]

so using Remark 1.7.C, we get

\[ \int_X g_n\,d\mu = \sum_{k=1}^n \int_X f\kappa_{A_k}\,d\mu = \sum_{k=1}^n \nu(A_k), \]

so the equality (19) immediately gives ν(A) = Σ_{n=1}^∞ ν(An).

The next result is a version of the previous one for K-valued functions.

Proposition 2.2. Let (X,A, µ) be a measure space, let K be one of the symbols R, R̄, or C, and let (An)_{n=1}^∞ ⊂ A be a pairwise disjoint sequence with ⋃_{n=1}^∞ An = X. For a function f : X → K, the following are equivalent.

(i) f ∈ L1K(X,A, µ);

(ii) f|An ∈ L1K(An, A|An, µ|An), ∀n ≥ 1, and

\[ \sum_{n=1}^\infty \int_{A_n} |f|\,d\mu < \infty. \]

Moreover, if f satisfies these equivalent conditions, then

\[ \int_X f\,d\mu = \sum_{n=1}^\infty \int_{A_n} f\,d\mu. \]

Proof. (i) ⇒ (ii). Assume f ∈ L1K(X,A, µ). Applying Proposition 2.1 to |f|, we immediately get

\[ \sum_{n=1}^\infty \int_{A_n} |f|\,d\mu = \int_X |f|\,d\mu < \infty, \]

which clearly proves (ii).

(ii) ⇒ (i). Assume f satisfies condition (ii). Define, for each n ≥ 1, the set Bn = A1 ∪ · · · ∪ An. First of all, since (An)_{n=1}^∞ ⊂ A, and ⋃_{n=1}^∞ An = X, it follows that f is measurable. Consider the functions fn = fκBn and gn = fκAn, n ≥ 1. Notice that, since f|An ∈ L1K(An, A|An, µ|An), it follows that gn ∈ L1K(X,A, µ), ∀n ≥ 1, and we also have

\[ \int_X g_n\,d\mu = \int_{A_n} f\,d\mu, \quad \forall\, n \ge 1. \]

In fact we also have

\[ \int_X |g_n|\,d\mu = \int_{A_n} |f|\,d\mu, \quad \forall\, n \ge 1. \]

Notice that we obviously have fn = g1 + · · · + gn, and |fn| = |g1| + · · · + |gn|, so if we define

\[ S = \sum_{n=1}^\infty \int_{A_n} |f|\,d\mu, \]

we get

\[ \int_X |f_n|\,d\mu = \sum_{k=1}^n \int_{A_k} |f|\,d\mu \le S < \infty, \quad \forall\, n \ge 1. \]

Notice however that we have 0 ≤ |f1| ≤ |f2| ≤ · · · ≤ |f|, as well as the equality lim_{n→∞} fn(x) = f(x), ∀x ∈ X. On the one hand, using the General Lebesgue Monotone Convergence Theorem, we will get

\[ \int_X |f|\,d\mu = \lim_{n\to\infty} \int_X |f_n|\,d\mu = \lim_{n\to\infty} \Big[ \sum_{k=1}^n \int_{A_k} |f|\,d\mu \Big] = \sum_{n=1}^\infty \int_{A_n} |f|\,d\mu = S < \infty, \]

which proves that |f| ∈ L1+(X,A, µ), so in particular f belongs to L1K(X,A, µ). On the other hand, since we have |fn| ≤ |f|, by the Lebesgue Dominated Convergence Theorem, we get

\[ \int_X f\,d\mu = \lim_{n\to\infty} \int_X f_n\,d\mu = \lim_{n\to\infty} \Big[ \sum_{k=1}^n \int_{A_k} f\,d\mu \Big] = \sum_{n=1}^\infty \int_{A_n} f\,d\mu. \]

Corollary 2.4. Let (X,A, µ) be a measure space, let K be one of the symbols R, R̄, or C, and let (Xn)_{n=1}^∞ ⊂ A be a sequence with ⋃_{n=1}^∞ Xn = X, and X1 ⊂ X2 ⊂ . . . . For a measurable function f : X → K, the following are equivalent.

(i) f ∈ L1K(X,A, µ);

(ii) f|Xn ∈ L1K(Xn, A|Xn, µ|Xn), ∀n ≥ 1, and

\[ \sup\Big\{ \int_{X_n} |f|\,d\mu : n \ge 1 \Big\} < \infty. \]

Moreover, if f satisfies these equivalent conditions, then

\[ \int_X f\,d\mu = \lim_{n\to\infty} \int_{X_n} f\,d\mu. \]

Proof. Apply the above result to the sequence (An)_{n=1}^∞ given by A1 = X1 and An = Xn ∖ Xn−1, ∀n ≥ 2.

Remark 2.2. Suppose (X,A, µ) is a measure space, K is one of the fields R or C, and f ∈ L1K(X,A, µ). By Proposition 2.2, we get the fact that the map

\[ \nu : \mathcal{A} \ni A \longmapsto \int_A f\,d\mu \in K \]

is a K-valued measure on A. By Proposition 2.1, we also know that

\[ \omega : \mathcal{A} \ni A \longmapsto \int_A |f|\,d\mu \in K \]

is a finite “honest” measure on A. Using Proposition 1.6, we clearly have

\[ |\nu(A)| = \Big| \int_A f\,d\mu \Big| \le \int_A |f|\,d\mu = \omega(A), \quad \forall\, A \in \mathcal{A}, \]

which by the results from III.8 gives the inequality |ν| ≤ ω. (Here |ν| denotes the variation measure of ν.) Later on (see Section 4) we are going to see that in fact we have the equality |ν| = ω.


Comment. It is important to understand the “sequential” nature of the convergence theorems discussed here. If we examine for instance the Monotone Convergence Theorem, we could easily formulate a “series” version, which states the equality
\[ \int_X \Big( \sum_{n=1}^{\infty} f_n \Big)\,d\mu = \sum_{n=1}^{\infty} \int_X f_n\,d\mu, \]
for any sequence of measurable functions $f_n : X \to [0,\infty]$.

Suppose now we have an arbitrary family $f_j : X \to [0,\infty]$, $j \in J$, of measurable functions, and we define
\[ f(x) = \sum_{j \in J} f_j(x), \quad \forall\, x \in X. \]
(Here we use the summability convention which defines the sum as the supremum of all finite sums.) In general, $f$ is not always measurable. But even if it is, one still cannot conclude that
\[ \int_X f\,d\mu = \sum_{j \in J} \int_X f_j\,d\mu. \]

The following example illustrates this anomaly.

Example 2.1. Take the measure space $([0,1], \mathfrak{M}_\lambda([0,1]), \lambda)$, and fix an arbitrary set $J \subset [0,1]$. For each $j \in J$ we consider the characteristic function $f_j = \kappa_{\{j\}}$. It is obvious that the function $f : X \to [0,\infty]$, defined by
\[ f(x) = \sum_{j \in J} f_j(x), \quad \forall\, x \in [0,1], \]
is equal to $\kappa_J$. If $J$ is non-measurable, this already gives an example when $f = \sum_{j \in J} f_j$ is non-measurable. But even if $J$ is measurable, the equality
\[ \int_X f\,d\lambda = \sum_{j \in J} \int_X f_j\,d\lambda \]
fails whenever $\lambda(J) > 0$, simply because the right hand side is zero, while the left hand side is equal to $\lambda(J)$.

The next two exercises illustrate straightforward (but nevertheless interesting) applications of the convergence theorems to quite simple situations.

Exercise 4. Let $\mathcal{A}$ be a σ-algebra on a (non-empty) set $X$, and let $(\mu_n)_{n=1}^{\infty}$ be a sequence of signed measures on $\mathcal{A}$. Assume that, for each $A \in \mathcal{A}$, the sequence $(\mu_n(A))_{n=1}^{\infty}$ has a limit denoted $\mu(A) \in [-\infty,\infty]$. Prove that the map $\mu : \mathcal{A} \to [0,\infty]$ defines a measure on $\mathcal{A}$, if the sequence $(\mu_n)_{n=1}^{\infty}$ satisfies one of the following hypotheses:

A. $0 \le \mu_1(A) \le \mu_2(A) \le \dots$, $\forall\, A \in \mathcal{A}$;

B. there exists a finite measure $\omega$ on $\mathcal{A}$, such that $|\mu_n(A)| \le \omega(A)$, $\forall\, n \ge 1$, $A \in \mathcal{A}$.

Hint: To prove σ-additivity, fix a pairwise disjoint sequence $(A_k)_{k=1}^{\infty} \subset \mathcal{A}$, and put $A = \bigcup_{k=1}^{\infty} A_k$. Treat the problem of proving the equality $\mu(A) = \sum_{k=1}^{\infty} \mu(A_k)$ as a convergence problem on the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \nu)$, with $\nu$ the counting measure, for the sequence of functions $f_n : \mathbb{N} \to [0,\infty]$ defined by $f_n(k) = \mu_n(A_k)$, $\forall\, k \in \mathbb{N}$.

Exercise 5*. Let $\mathcal{A}$ be a σ-algebra on a (non-empty) set $X$, and let $(\mu_j)_{j \in J}$ be a family of signed measures on $\mathcal{A}$. Assume either of the following is true:

A. $\mu_j(A) \ge 0$, $\forall\, j \in J$, $A \in \mathcal{A}$.

B. There exists a finite measure $\omega$ on $\mathcal{A}$, such that $\sum_{j \in J} |\mu_j(A)| \le \omega(A)$, $\forall\, A \in \mathcal{A}$.

Define the map $\mu : \mathcal{A} \to [0,\infty]$ by $\mu(A) = \sum_{j \in J} \mu_j(A)$, $\forall\, A \in \mathcal{A}$. (In Case A, the sum is defined as the supremum over finite sums. In Case B, it follows that the family $(\mu_j(A))_{j \in J}$ is summable.) Prove that $\mu$ is a measure on $\mathcal{A}$.

Hint: To prove σ-additivity, fix a pairwise disjoint sequence $(A_k)_{k=1}^{\infty} \subset \mathcal{A}$, and put $A = \bigcup_{k=1}^{\infty} A_k$. To prove the equality $\mu(A) = \sum_{k=1}^{\infty} \mu(A_k)$, analyze the following cases: (i) there is some $k \ge 1$ such that $\mu(A_k) = \infty$; (ii) $\mu(A_k) < \infty$, $\forall\, k \ge 1$. The first case is quite trivial. In the second case reduce the problem to the previous exercise, by observing that, for each $k \ge 1$, the set $J(A_k) = \{ j \in J : \mu_j(A_k) > 0 \}$ must be countable. Then the set $J(A) = \{ j \in J : \mu_j(A) > 0 \}$ is also countable.

Comment. One of the major drawbacks of the theory of Riemann integration is illustrated by the approach to improper integration. Recall that for a function $h : [a,b) \to \mathbb{R}$ the improper Riemann integral is defined as
\[ \int_a^{b-} h(t)\,dt = \lim_{x \to b^-} \int_a^x h(t)\,dt, \]
provided that

(a) $h|_{[a,x]}$ is Riemann integrable, $\forall\, x \in (a,b)$, and

(b) the above limit exists.

The problem is that although the improper integral may exist, and the function is actually defined on $[a,b]$, it may fail to be Riemann integrable, for example when it is unbounded.

In contrast to this situation, by Corollary 2.4, we see that if for example $h \ge 0$, then the Lebesgue integrability of $h$ on $[a,b]$ is equivalent to the fact that

(i) $h|_{[a,x]}$ is Lebesgue integrable, $\forall\, x \in (a,b)$, and

(ii) $\lim_{x \to b^-} \int_{[a,x]} h\,d\lambda$ exists.

Going back to the discussion on the improper Riemann integral, we can see that a sufficient condition for $h : [a,b) \to \mathbb{R}$ to be Riemann integrable in the improper sense is the fact that $h$ has property (a) above, and $h$ is Lebesgue integrable on $[a,b)$. In fact, if $h \ge 0$, then by Corollary 2.4, this is also necessary.

Notation. Let $-\infty \le a < b \le \infty$, and let $f$ be a Lebesgue integrable function, defined on some interval $J$ which is one of $(a,b)$, $[a,b)$, $(a,b]$, or $[a,b]$. Then the Lebesgue integral $\int_J f\,d\lambda$ will be denoted simply by $\int_a^b f\,d\lambda$.

Exercise 6*. Let $(X,\mathcal{A},\mu)$ be a finite measure space. Prove that for every $f \in \mathfrak{L}^1_+(X,\mathcal{A},\mu)$, one has the equality
\[ \int_X f\,d\mu = \int_0^{\infty} \mu\big( f^{-1}([t,\infty]) \big)\,dt, \]
where the second term is defined as an improper Riemann integral.

Hint: The function $\varphi : [0,\infty) \to [0,\infty)$ defined by $\varphi(t) = \mu\big( f^{-1}([t,\infty]) \big)$, $\forall\, t \ge 0$, is non-increasing, so it is Riemann integrable on every interval $[0,a]$, $a > 0$. Prove the inequalities
\[ \int_{X_a} f\,d\mu \le \int_0^a \varphi(t)\,dt \le \int_X f\,d\mu, \quad \forall\, a > 0, \]
where $X_a = f^{-1}([0,a))$, by analyzing lower and upper Darboux sums of $\varphi|_{[0,a]}$. Use Corollary 2.4 to get $\lim_{a\to\infty} \int_{X_a} f\,d\mu = \int_X f\,d\mu$.
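The identity in Exercise 6 can be sanity-checked (not proved) on a toy example. The following Python sketch assumes that $X$ is a finite set carrying point masses, so that both sides reduce to exact finite sums; the names `direct_integral`, `layer_cake_integral`, `values` and `weights` are illustrative only and do not come from the notes.

```python
from fractions import Fraction

def direct_integral(values, weights):
    """Integral of f over a finite set X: sum of f(x) * mu({x})."""
    return sum(v * w for v, w in zip(values, weights))

def layer_cake_integral(values, weights):
    """Integral over [0, inf) of phi(t) = mu({x : f(x) >= t}).

    phi is a non-increasing step function, constant on each interval
    (lo, hi] between consecutive values of f, so its Riemann integral
    is an exact finite sum.
    """
    breakpoints = sorted(set(values) | {Fraction(0)})
    total = Fraction(0)
    for lo, hi in zip(breakpoints, breakpoints[1:]):
        # Value of phi on the interval (lo, hi].
        phi = sum(w for v, w in zip(values, weights) if v >= hi)
        total += (hi - lo) * phi
    return total

# Hypothetical data: f(x) listed in `values`, mu({x}) listed in `weights`.
values = [Fraction(0), Fraction(1, 2), Fraction(3), Fraction(3), Fraction(7, 4)]
weights = [Fraction(2), Fraction(1, 3), Fraction(1), Fraction(5, 2), Fraction(4)]

lhs = direct_integral(values, weights)
rhs = layer_cake_integral(values, weights)
assert lhs == rhs
```

Exact rational arithmetic is used so that the comparison is an equality rather than an approximate one.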

Lecture 35

3. Banach spaces of integrable functions I: the Lp spaces

In this section we discuss an important construction, which is extremely useful in virtually all branches of Analysis. In Section 1, we have already introduced the space $\mathfrak{L}^1$. The first construction deals with a generalization of this space.

Definitions. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$.

A. For a number $p \in (1,\infty)$, we define the space
\[ \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu) = \Big\{ f : X \to \mathbb{K} \,:\, f \text{ measurable, and } \int_X |f|^p\,d\mu < \infty \Big\}. \]
Here we use the convention introduced in Section 1, which defines $\int_X h\,d\mu = \infty$ for those measurable functions $h : X \to [0,\infty]$ that are not integrable.

Of course, in this definition we can also allow the value $p = 1$, and in this case we get the familiar definition of $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$.

B. For $p \in [1,\infty)$, we define the map $Q_p : \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu) \to [0,\infty)$ by
\[ Q_p(f) = \Big[ \int_X |f|^p\,d\mu \Big]^{1/p}, \quad \forall\, f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu). \]

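Remark 3.1. The space $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$ was studied earlier (see Section 1). It has the following features:

(i) $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$ is a $\mathbb{K}$-vector space.

(ii) The map $Q_1 : \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu) \to [0,\infty)$ is a seminorm, i.e.

(a) $Q_1(f+g) \le Q_1(f) + Q_1(g)$, $\forall\, f, g \in \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$;

(b) $Q_1(\alpha f) = |\alpha| \cdot Q_1(f)$, $\forall\, f \in \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, $\alpha \in \mathbb{K}$.

(iii) $\big| \int_X f\,d\mu \big| \le Q_1(f)$, $\forall\, f \in \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$.

Property (b) is clear. Property (a) immediately follows from the inequality $|f+g| \le |f| + |g|$, which after integration gives
\[ \int_X |f+g|\,d\mu \le \int_X \big[ |f| + |g| \big]\,d\mu = \int_X |f|\,d\mu + \int_X |g|\,d\mu. \]

In what follows, we aim at proving similar features for the spaces $\mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ and $Q_p$, $1 < p < \infty$.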

The following will help us prove that Lp is a vector space.

Exercise 1♦. Let $p \in (1,\infty)$. Then one has the inequality
\[ (s+t)^p \le 2^{p-1}(s^p + t^p), \quad \forall\, s, t \in [0,\infty). \]

Hint: The inequality is trivial when $s = t = 0$. If $s + t > 0$, reduce the problem to the case $t + s = 1$, and prove, using elementary calculus techniques, that
\[ \min_{t \in [0,1]} \big[ t^p + (1-t)^p \big] = 2^{1-p}. \]
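As a numerical illustration of Exercise 1 and its hint (a plausibility check only, not a substitute for the calculus argument), the following Python sketch tests the inequality on random samples and compares the minimum of $t^p + (1-t)^p$ over a grid with $2^{1-p}$. The function names `lhs`, `rhs` and `h` are hypothetical helpers introduced here.

```python
import random

def lhs(s, t, p):
    """Left hand side (s + t)^p of the inequality."""
    return (s + t) ** p

def rhs(s, t, p):
    """Right hand side 2^(p-1) * (s^p + t^p) of the inequality."""
    return 2 ** (p - 1) * (s ** p + t ** p)

def h(t, p):
    """The function minimized in the hint, on the segment s + t = 1."""
    return t ** p + (1 - t) ** p

random.seed(0)
for _ in range(10_000):
    s, t = random.uniform(0, 10), random.uniform(0, 10)
    p = random.uniform(1.01, 6)
    # Relative tolerance absorbs floating-point rounding near s = t.
    assert lhs(s, t, p) <= rhs(s, t, p) * (1 + 1e-12)

for p in (1.5, 2.0, 3.0, 7.5):
    # The grid contains t = 1/2, where the minimum is attained.
    grid_min = min(h(k / 1000, p) for k in range(1001))
    assert abs(grid_min - 2 ** (1 - p)) < 1e-12
```

The equality case $s = t$ corresponds to the minimizer $t = 1/2$ in the hint.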

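Proposition 3.1. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $p \in (1,\infty)$. When equipped with pointwise addition and scalar multiplication, $\mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ is a $\mathbb{K}$-vector space.

Proof. If $f, g \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, then by Exercise 1 we have
\[ \int_X |f+g|^p\,d\mu \le \int_X \big( |f| + |g| \big)^p\,d\mu \le 2^{p-1} \Big[ \int_X |f|^p\,d\mu + \int_X |g|^p\,d\mu \Big] < \infty, \]
so $f + g$ indeed belongs to $\mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$.

If $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ and $\alpha \in \mathbb{K}$, then the equalities
\[ \int_X |\alpha f|^p\,d\mu = \int_X |\alpha|^p \cdot |f|^p\,d\mu = |\alpha|^p \cdot \int_X |f|^p\,d\mu \]
clearly prove that $\alpha f$ also belongs to $\mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$.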

Our next task will be to prove that $Q_p$ is a seminorm, for all $p > 1$. In this direction, the following is a key result. (The above mentioned convention will be used throughout this entire section.)

Theorem 3.1 (Hölder's Inequality for integrals). Let $(X,\mathcal{A},\mu)$ be a measure space, let $f, g : X \to [0,\infty]$ be measurable functions, and let $p, q \in (1,\infty)$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Then one has the inequality¹⁵
\[ (1) \qquad \int_X fg\,d\mu \le \Big[ \int_X f^p\,d\mu \Big]^{1/p} \cdot \Big[ \int_X g^q\,d\mu \Big]^{1/q}. \]

15 Here we use the convention $\infty^{1/p} = \infty^{1/q} = \infty$.

Proof. If either $\int_X f^p\,d\mu = \infty$ or $\int_X g^q\,d\mu = \infty$, then the inequality (1) is trivial, because in this case the right hand side is $\infty$. For the remainder of the proof we will assume that $\int_X f^p\,d\mu < \infty$ and $\int_X g^q\,d\mu < \infty$.

Use Corollary 2.1 to find two sequences $(\varphi_n)_{n=1}^{\infty}, (\psi_n)_{n=1}^{\infty} \subset \mathfrak{L}^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, such that

• $0 \le \varphi_1 \le \varphi_2 \le \dots$ and $0 \le \psi_1 \le \psi_2 \le \dots$;

• $\lim_{n\to\infty} \varphi_n(x) = f(x)^p$ and $\lim_{n\to\infty} \psi_n(x) = g(x)^q$, $\forall\, x \in X$.

By the Lebesgue Dominated Convergence Theorem, we will also get the equalities
\[ (2) \qquad \int_X f^p\,d\mu = \lim_{n\to\infty} \int_X \varphi_n\,d\mu \quad\text{and}\quad \int_X g^q\,d\mu = \lim_{n\to\infty} \int_X \psi_n\,d\mu. \]
Remark that the functions $f_n = \varphi_n^{1/p}$, $g_n = \psi_n^{1/q}$, $n \ge 1$, are also elementary (because they obviously have finite range). It is obvious that we have

• $0 \le f_1 \le f_2 \le \dots$ and $0 \le g_1 \le g_2 \le \dots$;

• $\lim_{n\to\infty} f_n(x) = f(x)$ and $\lim_{n\to\infty} g_n(x) = g(x)$, $\forall\, x \in X$.

With these notations, the equalities (2) read
\[ (3) \qquad \int_X f^p\,d\mu = \lim_{n\to\infty} \int_X (f_n)^p\,d\mu \quad\text{and}\quad \int_X g^q\,d\mu = \lim_{n\to\infty} \int_X (g_n)^q\,d\mu. \]
Of course, the products $f_n g_n$, $n \ge 1$, are again elementary, and satisfy

• $0 \le f_1 g_1 \le f_2 g_2 \le \dots$;

• $\lim_{n\to\infty} [f_n(x) g_n(x)] = f(x) g(x)$, $\forall\, x \in X$.

Using the General Lebesgue Monotone Convergence Theorem, we then get
\[ \int_X fg\,d\mu = \lim_{n\to\infty} \int_X f_n g_n\,d\mu. \]
Using (3) we now see that, in order to prove (1), it suffices to prove the inequalities
\[ \int_X f_n g_n\,d\mu \le \Big[ \int_X (f_n)^p\,d\mu \Big]^{1/p} \cdot \Big[ \int_X (g_n)^q\,d\mu \Big]^{1/q}, \quad \forall\, n \ge 1. \]
In other words, it suffices to prove (1) under the extra assumption that both $f$ and $g$ are elementary integrable.

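Suppose $f$ and $g$ are elementary integrable. Then (see III.1) there exist pairwise disjoint sets $(D_j)_{j=1}^{m} \subset \mathcal{A}$, with $\mu(D_j) < \infty$, $\forall\, j = 1, \dots, m$, and numbers $\alpha_1, \beta_1, \dots, \alpha_m, \beta_m \in [0,\infty)$, such that
\[ f = \alpha_1 \kappa_{D_1} + \dots + \alpha_m \kappa_{D_m}, \qquad g = \beta_1 \kappa_{D_1} + \dots + \beta_m \kappa_{D_m}. \]
Notice that we have
\[ fg = \alpha_1 \beta_1 \kappa_{D_1} + \dots + \alpha_m \beta_m \kappa_{D_m}, \]
so the left hand side of (1) is then given by
\[ \int_X fg\,d\mu = \sum_{j=1}^{m} \alpha_j \beta_j \mu(D_j). \]
Define the numbers $x_j = \alpha_j \mu(D_j)^{1/p}$, $y_j = \beta_j \mu(D_j)^{1/q}$, $j = 1, \dots, m$. Using these numbers, combined with $\frac{1}{p} + \frac{1}{q} = 1$, we clearly have
\[ (4) \qquad \int_X fg\,d\mu = \sum_{j=1}^{m} (x_j y_j). \]
At this point we are going to use the Hölder inequality for finite sequences (Lemma II.2.3), which gives
\[ \sum_{j=1}^{m} (x_j y_j) \le \Big[ \sum_{j=1}^{m} (x_j)^p \Big]^{1/p} \cdot \Big[ \sum_{j=1}^{m} (y_j)^q \Big]^{1/q}, \]
so the equality (4) continues with
\[ \int_X fg\,d\mu \le \Big[ \sum_{j=1}^{m} (x_j)^p \Big]^{1/p} \cdot \Big[ \sum_{j=1}^{m} (y_j)^q \Big]^{1/q} = \Big[ \sum_{j=1}^{m} (\alpha_j)^p \mu(D_j) \Big]^{1/p} \cdot \Big[ \sum_{j=1}^{m} (\beta_j)^q \mu(D_j) \Big]^{1/q} = \Big[ \int_X f^p\,d\mu \Big]^{1/p} \cdot \Big[ \int_X g^q\,d\mu \Big]^{1/q}. \]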
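The elementary case of the proof is a finite computation, so it can be illustrated numerically. The sketch below assumes $f = \sum_j \alpha_j \kappa_{D_j}$ and $g = \sum_j \beta_j \kappa_{D_j}$ with pairwise disjoint sets $D_j$ of finite measure, and evaluates both sides of (1); the function name `holder_sides` and the sample data are hypothetical.

```python
import math

def holder_sides(alpha, beta, mass, p):
    """Both sides of Holder's inequality for elementary functions.

    f = sum alpha[j] * 1_{D_j}, g = sum beta[j] * 1_{D_j}, where the
    D_j are pairwise disjoint and mu(D_j) = mass[j].
    """
    q = p / (p - 1)  # conjugate exponent, 1/p + 1/q = 1
    left = sum(a * b * m for a, b, m in zip(alpha, beta, mass))
    norm_f = sum(a ** p * m for a, m in zip(alpha, mass)) ** (1 / p)
    norm_g = sum(b ** q * m for b, m in zip(beta, mass)) ** (1 / q)
    return left, norm_f * norm_g

# Generic case: the inequality holds, here with room to spare.
left, right = holder_sides([1.0, 2.0, 0.5], [3.0, 0.0, 4.0], [0.2, 1.5, 2.0], p=3.0)
assert left <= right

# Equality case: beta_j = alpha_j^(p-1), so that g^q = f^p.
p = 3.0
alpha = [1.0, 2.0, 0.5]
beta = [a ** (p - 1) for a in alpha]
eq_left, eq_right = holder_sides(alpha, beta, [0.2, 1.5, 2.0], p)
assert math.isclose(eq_left, eq_right)
```

The equality case uses the same choice of $g$ (up to normalization) that appears later in the proof of Proposition 3.2.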


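Corollary 3.1. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $p, q \in (1,\infty)$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. For any two functions $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ and $g \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu)$, the product $fg$ belongs to $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$ and one has the inequality
\[ \Big| \int_X fg\,d\mu \Big| \le Q_p(f) \cdot Q_q(g). \]

Proof. By Hölder's inequality, applied to $|f|$ and $|g|$, we get
\[ \int_X |fg|\,d\mu \le Q_p(f) \cdot Q_q(g) < \infty, \]
so $|fg|$ belongs to $\mathfrak{L}^1_+(X,\mathcal{A},\mu)$, i.e. $fg$ belongs to $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$. The desired inequality then follows from the inequality $\big| \int_X fg\,d\mu \big| \le \int_X |fg|\,d\mu$.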

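Notation. Suppose $(X,\mathcal{A},\mu)$ is a measure space, $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$, and $p, q \in (1,\infty)$ are such that $\frac{1}{p} + \frac{1}{q} = 1$. For any pair of functions $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, $g \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu)$, we shall denote the number $\int_X fg\,d\mu \in \mathbb{K}$ simply by $\langle f, g \rangle$. With this notation, Corollary 3.1 reads:
\[ \big| \langle f, g \rangle \big| \le Q_p(f) \cdot Q_q(g), \quad \forall\, f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu),\ g \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu). \]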

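The following result gives an alternative description of the maps $Q_p$, $p \in (1,\infty)$.

Proposition 3.2. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $p, q \in (1,\infty)$ be such that $\frac{1}{p} + \frac{1}{q} = 1$, and let $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. Then one has the equality
\[ (5) \qquad Q_p(f) = \sup\big\{ \big| \langle f, g \rangle \big| \,:\, g \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu),\ Q_q(g) \le 1 \big\}. \]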

Proof. Let us denote the right hand side of (5) simply by $P(f)$. By Corollary 3.1, we clearly have the inequality
\[ P(f) \le Q_p(f). \]
To prove the other inequality, let us first observe that in the case when $Q_p(f) = 0$, there is nothing to prove, because the above inequality already forces $P(f) = 0$. Assume then $Q_p(f) > 0$, and define the function $h : X \to \mathbb{K}$ by
\[ h(x) = \begin{cases} \dfrac{|f(x)|^p}{f(x)} & \text{if } f(x) \ne 0 \\ 0 & \text{if } f(x) = 0 \end{cases} \]
It is obvious that $h$ is measurable. Moreover, one has the equality $|h| = |f|^{p-1}$, which using the equality $qp = p + q$ gives $|h|^q = |f|^{qp-q} = |f|^p$. This proves that $h \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu)$, as well as the equality
\[ Q_q(h) = \Big[ \int_X |h|^q\,d\mu \Big]^{1/q} = \Big[ \int_X |f|^p\,d\mu \Big]^{1/q} = Q_p(f)^{p/q}. \]
If we define the number $\alpha = Q_p(f)^{-p/q}$, then the function $g = \alpha h$ has $Q_q(g) = 1$, so we get
\[ P(f) \ge \Big| \int_X fg\,d\mu \Big| = \frac{1}{Q_p(f)^{p/q}} \Big| \int_X fh\,d\mu \Big|. \]
Notice that $fh = |f|^p$, so the above inequality can be continued with
\[ P(f) \ge \frac{1}{Q_p(f)^{p/q}} \int_X |f|^p\,d\mu = \frac{Q_p(f)^p}{Q_p(f)^{p/q}} = Q_p(f). \]
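The construction of the extremal function $g = \alpha h$ in the proof can be traced on a toy example. The Python sketch below assumes a finite set with point masses, takes $\mathbb{K} = \mathbb{C}$, and checks that $Q_q(g) = 1$ and $|\langle f, g \rangle| = Q_p(f)$ up to rounding; the names `q_norm` and `extremal_g` and the sample data are assumptions made for illustration, and the code requires $Q_p(f) > 0$.

```python
import math

def q_norm(values, mass, p):
    """Q_p of a function on a finite set: (sum |f(x)|^p mu({x}))^(1/p)."""
    return sum(abs(v) ** p * m for v, m in zip(values, mass)) ** (1 / p)

def extremal_g(f, mass, p):
    """The function g = alpha * h from the proof (assumes Q_p(f) > 0)."""
    q = p / (p - 1)
    # h = |f|^p / f where f != 0, and h = 0 elsewhere.
    h = [abs(v) ** p / v if v != 0 else 0 for v in f]
    alpha = q_norm(f, mass, p) ** (-p / q)
    return [alpha * hv for hv in h]

# Hypothetical complex-valued f and point masses.
f = [1 + 2j, -3.0, 0, 0.5j]
mass = [0.5, 1.0, 2.0, 4.0]
p = 2.5
q = p / (p - 1)

g = extremal_g(f, mass, p)
pairing = sum(fv * gv * m for fv, gv, m in zip(f, g, mass))

assert math.isclose(q_norm(g, mass, q), 1.0)
assert math.isclose(abs(pairing), q_norm(f, mass, p))
```

Since $\langle f, g \rangle$ is defined here without complex conjugation, the choice $h = |f|^p / f$ is exactly what makes $fh = |f|^p$ non-negative.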

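Corollary 3.2. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $p \in (1,\infty)$. Then $Q_p$ is a seminorm on $\mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, i.e.

(a) $Q_p(f_1 + f_2) \le Q_p(f_1) + Q_p(f_2)$, $\forall\, f_1, f_2 \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$;

(b) $Q_p(\alpha f) = |\alpha| \cdot Q_p(f)$, $\forall\, f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, $\alpha \in \mathbb{K}$.

Proof. (a). Take $q = \frac{p}{p-1}$, so that $\frac{1}{p} + \frac{1}{q} = 1$. Start with some arbitrary $g \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu)$, with $Q_q(g) \le 1$. Then the functions $f_1 g$ and $f_2 g$ belong to $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, and so $f_1 g + f_2 g$ also belongs to $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$. We then get
\[ \big| \langle f_1 + f_2, g \rangle \big| = \Big| \int_X (f_1 g + f_2 g)\,d\mu \Big| = \Big| \int_X f_1 g\,d\mu + \int_X f_2 g\,d\mu \Big| \le \Big| \int_X f_1 g\,d\mu \Big| + \Big| \int_X f_2 g\,d\mu \Big| = \big| \langle f_1, g \rangle \big| + \big| \langle f_2, g \rangle \big|. \]
Using Proposition 3.2, the above inequality gives
\[ \big| \langle f_1 + f_2, g \rangle \big| \le Q_p(f_1) + Q_p(f_2). \]
Since the above inequality holds for all $g \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu)$ with $Q_q(g) \le 1$, again by Proposition 3.2, we get
\[ Q_p(f_1 + f_2) \le Q_p(f_1) + Q_p(f_2). \]
Property (b) is obvious.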

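Remarks 3.2. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $p \in [1,\infty)$.

A. If $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ and if $g : X \to \mathbb{K}$ is a measurable function with $g = f$, µ-a.e., then $g \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, and $Q_p(g) = Q_p(f)$.

B. If we define the space
\[ \mathfrak{N}_{\mathbb{K}}(X,\mathcal{A},\mu) = \big\{ f : X \to \mathbb{K} \,:\, f \text{ measurable},\ f = 0,\ \mu\text{-a.e.} \big\}, \]
then $\mathfrak{N}_{\mathbb{K}}(X,\mathcal{A},\mu)$ is a linear subspace of $\mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. In fact one has the equality
\[ \mathfrak{N}_{\mathbb{K}}(X,\mathcal{A},\mu) = \big\{ f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu) \,:\, Q_p(f) = 0 \big\}. \]
The inclusion “⊂” is trivial. Conversely, if $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ has $Q_p(f) = 0$, then the measurable function $g : X \to [0,\infty)$ defined by $g = |f|^p$ will have $\int_X g\,d\mu = 0$. By Exercise 2.3 this forces $g = 0$, µ-a.e., which clearly gives $f = 0$, µ-a.e.

Definition. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $p \in [1,\infty)$. We define
\[ L^p_{\mathbb{K}}(X,\mathcal{A},\mu) = \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu) / \mathfrak{N}_{\mathbb{K}}(X,\mathcal{A},\mu). \]
In other words, $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ is the collection of equivalence classes associated with the relation “=, µ-a.e.” For a function $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ we denote by $[f]$ its equivalence class in $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. So the equality $[f] = [g]$ is equivalent to $f = g$, µ-a.e. By the above Remark, there exists a (unique) map $\| \cdot \|_p : L^p_{\mathbb{K}}(X,\mathcal{A},\mu) \to [0,\infty)$, such that
\[ \| [f] \|_p = Q_p(f), \quad \forall\, f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu). \]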


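By the above Remark, it follows that $\| \cdot \|_p$ is a norm on $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. When $\mathbb{K} = \mathbb{C}$ the subscript $\mathbb{C}$ will be omitted.

Conventions. Let $(X,\mathcal{A},\mu)$, $\mathbb{K}$, and $p$ be as above. We are going to abuse the notation a bit, by writing
\[ f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu), \]
if $f$ belongs to $\mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. (We will always have in mind the fact that this notation signifies that $f$ is almost uniquely determined.) Likewise, we are going to replace $Q_p(f)$ with $\|f\|_p$.

Given $p, q \in (1,\infty)$, with $\frac{1}{p} + \frac{1}{q} = 1$, we use the same notation for the (correctly defined) map
\[ \langle \,\cdot\,, \,\cdot\, \rangle : L^p_{\mathbb{K}}(X,\mathcal{A},\mu) \times L^q_{\mathbb{K}}(X,\mathcal{A},\mu) \to \mathbb{K}. \]

Remark 3.3. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$, and let $p, q \in (1,\infty)$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Given $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, we define the map
\[ \Lambda_f : L^q_{\mathbb{K}}(X,\mathcal{A},\mu) \ni g \longmapsto \langle f, g \rangle \in \mathbb{K}. \]
According to Proposition 3.2, the map $\Lambda_f$ is linear, continuous, and has norm $\|\Lambda_f\| = \|f\|_p$. If we denote by $L^q_{\mathbb{K}}(X,\mathcal{A},\mu)^*$ the Banach space of all linear continuous maps $L^q_{\mathbb{K}}(X,\mathcal{A},\mu) \to \mathbb{K}$, then we have a correspondence
\[ (6) \qquad L^p_{\mathbb{K}}(X,\mathcal{A},\mu) \ni f \longmapsto \Lambda_f \in L^q_{\mathbb{K}}(X,\mathcal{A},\mu)^* \]
which is linear and isometric. This correspondence will be analyzed later in Section 5.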

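Notation. Given a sequence $(f_n)_{n=1}^{\infty}$, and a function $f$, in $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, we are going to write
\[ f = L^p\text{-}\lim_{n\to\infty} f_n, \]
if $(f_n)_{n=1}^{\infty}$ converges to $f$ in the norm topology, i.e. $\lim_{n\to\infty} \|f_n - f\|_p = 0$.

The following technical result is very useful in the study of $L^p$ spaces.

Theorem 3.2 ($L^p$ Dominated Convergence Theorem). Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $p \in [1,\infty)$ and let $(f_n)_{n=1}^{\infty}$ be a sequence in $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. Assume $f : X \to \mathbb{K}$ is a measurable function, such that

(i) $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$;

(ii) there exists some function $g \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, such that
\[ |f_n| \le |g|, \quad \mu\text{-a.e.}, \ \forall\, n \ge 1. \]

Then $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, and one has the equality
\[ f = L^p\text{-}\lim_{n\to\infty} f_n. \]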

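Proof. Consider the functions $\varphi_n = |f_n|^p$, $n \ge 1$, $\varphi = |f|^p$, and $\psi = |g|^p$. Notice that

• $\varphi = \mu\text{-a.e.-}\lim_{n\to\infty} \varphi_n$;

• $|\varphi_n| \le \psi$, µ-a.e., $\forall\, n \ge 1$;

• $\psi \in \mathfrak{L}^1_+(X,\mathcal{A},\mu)$.

We can apply the Lebesgue Dominated Convergence Theorem, so we get the fact that $\varphi \in \mathfrak{L}^1_+(X,\mathcal{A},\mu)$, which gives the fact that $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. Now if we consider the functions $\eta_n = |f_n - f|^p$ and $\eta = 2^{p-1}\big( |g|^p + |f|^p \big)$, then we have (use Exercise 1):

• $0 = \mu\text{-a.e.-}\lim_{n\to\infty} \eta_n$;

• $|\eta_n| \le \eta$, µ-a.e., $\forall\, n \ge 1$;

• $\eta \in \mathfrak{L}^1_+(X,\mathcal{A},\mu)$.

Again using the Lebesgue Dominated Convergence Theorem, we get
\[ \lim_{n\to\infty} \int_X \eta_n\,d\mu = 0, \]
which means that
\[ \lim_{n\to\infty} \int_X |f_n - f|^p\,d\mu = 0, \]
which reads $\lim_{n\to\infty} \big( \|f_n - f\|_p \big)^p = 0$, so we clearly have $f = L^p\text{-}\lim_{n\to\infty} f_n$.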

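Our main goal is to prove that the $L^p$ spaces are Banach spaces. The key result which gives this, but also has some other interesting consequences, is the following.

Theorem 3.3. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $p \in [1,\infty)$ and let $(f_k)_{k=1}^{\infty}$ be a sequence in $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, such that
\[ \sum_{k=1}^{\infty} \|f_k\|_p < \infty. \]
Consider the sequence $(g_n)_{n=1}^{\infty} \subset L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ of partial sums:
\[ g_n = \sum_{k=1}^{n} f_k, \quad n \ge 1. \]
Then there exists a function $g \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, such that

(a) $g = \mu\text{-a.e.-}\lim_{n\to\infty} g_n$;

(b) $g = L^p\text{-}\lim_{n\to\infty} g_n$.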

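Proof. Denote the sum $\sum_{k=1}^{\infty} \|f_k\|_p$ simply by $S$. For each integer $n \ge 1$, define the function $h_n : X \to [0,\infty]$ by
\[ h_n(x) = \sum_{k=1}^{n} |f_k(x)|, \quad \forall\, x \in X. \]
It is clear that $h_n \in L^p_{\mathbb{R}}(X,\mathcal{A},\mu)$, and we also have
\[ (7) \qquad \|h_n\|_p \le \sum_{k=1}^{n} \|f_k\|_p \le S, \quad \forall\, n \ge 1. \]
Notice also that $0 \le h_1 \le h_2 \le \dots$. Define then the function $h : X \to [0,\infty]$ by
\[ h(x) = \lim_{n\to\infty} h_n(x), \quad \forall\, x \in X. \]

Claim: $h \in L^p_{\mathbb{R}}(X,\mathcal{A},\mu)$.

To prove this fact, we define the functions $\varphi = h^p$ and $\varphi_n = (h_n)^p$, $n \ge 1$. Notice that we have

• $0 \le \varphi_1 \le \varphi_2 \le \dots$;

• $\varphi_n \in \mathfrak{L}^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, $\forall\, n \ge 1$;

• $\lim_{n\to\infty} \varphi_n(x) = \varphi(x)$, $\forall\, x \in X$;

• $\sup\big\{ \int_X \varphi_n\,d\mu : n \ge 1 \big\} \le S^p$ (this is what (7) gives).

Using the Lebesgue Monotone Convergence Theorem, it then follows that $h^p = \varphi \in \mathfrak{L}^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, so $h$ indeed belongs to $L^p_{\mathbb{R}}(X,\mathcal{A},\mu)$.

Let us consider now the set $N = \{ x \in X : h(x) = \infty \}$. On the one hand, since we also have
\[ N = \{ x \in X : \varphi(x) = \infty \}, \]
and $\varphi$ is integrable, it follows that $N \in \mathcal{A}$, and $\mu(N) = 0$. On the other hand, since
\[ \sum_{k=1}^{\infty} |f_k(x)| = h(x) < \infty, \quad \forall\, x \in X \setminus N, \]
it follows that, for each $x \in X \setminus N$, the series $\sum_{k=1}^{\infty} f_k(x)$ is convergent. Let us define then $g : X \to \mathbb{K}$ by
\[ g(x) = \begin{cases} \sum_{k=1}^{\infty} f_k(x) & \text{if } x \in X \setminus N \\ 0 & \text{if } x \in N \end{cases} \]
It is obvious that $g$ is measurable, and we have
\[ g = \mu\text{-a.e.-}\lim_{n\to\infty} g_n. \]
Since we have
\[ |g_n| = \Big| \sum_{k=1}^{n} f_k \Big| \le \sum_{k=1}^{n} |f_k| = h_n \le h, \quad \forall\, n \ge 1, \]
using the Claim and Theorem 3.2, it follows that $g$ indeed belongs to $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ and we also have the equality $g = L^p\text{-}\lim_{n\to\infty} g_n$.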

Corollary 3.3. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. Then $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ is a Banach space, for each $p \in [1,\infty)$.

Proof. This is immediate from the above result, combined with the completeness criterion given by Remark II.3.1.

Another interesting consequence of Theorem 3.3 is the following.

Corollary 3.4. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $p \in [1,\infty)$, and let $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. Any sequence $(f_n)_{n=1}^{\infty} \subset L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, with $f = L^p\text{-}\lim_{n\to\infty} f_n$, has a subsequence $(f_{n_k})_{k=1}^{\infty}$ such that $f = \mu\text{-a.e.-}\lim_{k\to\infty} f_{n_k}$.

Proof. Without any loss of generality, we can assume that $f = 0$, so that we have
\[ \lim_{n\to\infty} \|f_n\|_p = 0. \]
Choose then integers $1 \le n_1 < n_2 < \dots$, such that
\[ \|f_{n_k}\|_p \le \frac{1}{2^k}, \quad \forall\, k \ge 1. \]
If we define the functions
\[ g_m = \sum_{k=1}^{m} f_{n_k}, \]
then by Theorem 3.3, it follows that there exists some $g \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, such that
\[ g = \mu\text{-a.e.-}\lim_{m\to\infty} g_m. \]
This means that there exists some $N \in \mathcal{A}$, with $\mu(N) = 0$, such that
\[ \lim_{m\to\infty} g_m(x) = g(x), \quad \forall\, x \in X \setminus N. \]
In other words, for each $x \in X \setminus N$, the series $\sum_{k=1}^{\infty} f_{n_k}(x)$ is convergent (to some number $g(x) \in \mathbb{K}$). In particular, it follows that
\[ \lim_{k\to\infty} f_{n_k}(x) = 0, \quad \forall\, x \in X \setminus N, \]
so we indeed have $0 = \mu\text{-a.e.-}\lim_{k\to\infty} f_{n_k}$.

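The following result collects some properties of $L^p$ spaces in the case when the underlying measure space is finite.

Proposition 3.3. Suppose $(X,\mathcal{A},\mu)$ is a finite measure space, and $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$.

(i) If $f : X \to \mathbb{K}$ is a bounded measurable function, then $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, $\forall\, p \in [1,\infty)$.

(ii) For any $p, q \in [1,\infty)$, with $p < q$, one has the inclusion $\mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu) \subset \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. So taking quotients by $\mathfrak{N}_{\mathbb{K}}(X,\mathcal{A},\mu)$, one gets an inclusion of vector spaces
\[ (8) \qquad L^q_{\mathbb{K}}(X,\mathcal{A},\mu) \hookrightarrow L^p_{\mathbb{K}}(X,\mathcal{A},\mu). \]
Moreover the above inclusion is a continuous linear map.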

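Proof. The key property that we are going to use here is the fact that the constant function $1 = \kappa_X$ is µ-integrable (being elementary µ-integrable).

(i). This part is pretty clear, because if we start with a bounded measurable function $f : X \to \mathbb{K}$ and we take $M = \sup_{x \in X} |f(x)|$, then the inequality $|f|^p \le M^p \cdot 1$, combined with the integrability of $1$, will force the integrability of $|f|^p$, i.e. $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$.

(ii). Fix $1 \le p < q < \infty$, as well as a function $f \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu)$. Consider the numbers $r = \frac{q}{p} > 1$ and $s = \frac{r}{r-1}$, so that we have $\frac{1}{r} + \frac{1}{s} = 1$. Since $f \in \mathfrak{L}^q_{\mathbb{K}}(X,\mathcal{A},\mu)$, the function $g = |f|^q$ belongs to $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$. If we define then the function $h = |f|^p$, then we obviously have $g = h^r$, so we get the fact that $h$ belongs to $\mathfrak{L}^r_{\mathbb{K}}(X,\mathcal{A},\mu)$. Using part (i), we get the fact that $1 \in \mathfrak{L}^s_{\mathbb{K}}(X,\mathcal{A},\mu)$, so by Corollary 3.1, it follows that $h = 1 \cdot h$ belongs to $\mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, and moreover, one has the inequality
\[ \int_X |f|^p\,d\mu = \int_X h\,d\mu \le \|1\|_s \cdot \|h\|_r = \Big[ \int_X 1\,d\mu \Big]^{1/s} \cdot \Big[ \int_X h^r\,d\mu \Big]^{1/r} = \mu(X)^{1/s} \cdot \Big[ \int_X |f|^q\,d\mu \Big]^{1/r} = \mu(X)^{1/s} \cdot \big( \|f\|_q \big)^{q/r}. \]
On the one hand, this inequality proves that $f \in \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. On the other hand, this also gives the inequality
\[ \big( \|f\|_p \big)^p \le \mu(X)^{1/s} \cdot \big( \|f\|_q \big)^{q/r} = \mu(X)^{1 - \frac{p}{q}} \cdot \big( \|f\|_q \big)^p, \]
which yields
\[ \|f\|_p \le \mu(X)^{\frac{1}{p} - \frac{1}{q}} \cdot \|f\|_q. \]
This proves that the linear map (8) is continuous (and has norm no greater than $\mu(X)^{\frac{1}{p} - \frac{1}{q}}$).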
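The norm estimate $\|f\|_p \le \mu(X)^{\frac{1}{p} - \frac{1}{q}} \cdot \|f\|_q$ can be observed numerically on a finite measure space. The Python sketch below assumes a finite set with randomly chosen point masses; the helper `lp_norm` and all sample data are hypothetical, and the check is only an illustration of the inequality proved above.

```python
import math
import random

def lp_norm(values, mass, p):
    """The L^p norm on a finite set: (sum |f(x)|^p mu({x}))^(1/p)."""
    return sum(abs(v) ** p * m for v, m in zip(values, mass)) ** (1 / p)

random.seed(1)
mass = [random.uniform(0.1, 2.0) for _ in range(50)]
total = sum(mass)  # plays the role of mu(X)

for _ in range(1000):
    f = [random.uniform(-5, 5) for _ in mass]
    p = random.uniform(1, 4)
    q = p + random.uniform(0.1, 4)  # guarantees p < q
    bound = total ** (1 / p - 1 / q) * lp_norm(f, mass, q)
    # Relative tolerance absorbs floating-point rounding.
    assert lp_norm(f, mass, p) <= bound * (1 + 1e-12)

# Constant functions give equality, so the constant cannot be improved.
const = [2.0] * len(mass)
sharp = total ** (1 / 2 - 1 / 3) * lp_norm(const, mass, 3.0)
assert math.isclose(lp_norm(const, mass, 2.0), sharp)
```

The last assertion reflects the fact that for $f = c \cdot 1$ one has $\|f\|_p = |c| \cdot \mu(X)^{1/p}$ for every $p$.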


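Exercise 2. Give an example of a sequence of continuous functions $f_n : [0,1] \to [0,\infty)$, $n \ge 1$, such that $L^p\text{-}\lim_{n\to\infty} f_n = 0$, $\forall\, p \in [1,\infty)$, but for which it is not true that $0 = \lambda\text{-a.e.-}\lim_{n\to\infty} f_n$. (Here we work on the measure space $([0,1], \mathfrak{M}_\lambda([0,1]), \lambda)$.)

Exercise 3. Let $\Omega \subset \mathbb{R}^n$ be an open set. Prove that $C_c^{\mathbb{K}}(\Omega)$ is dense in $L^p_{\mathbb{K}}(\Omega, \mathfrak{M}_\lambda(\Omega), \lambda)$, for every $p \in [1,\infty)$. (Here $\lambda$ denotes the $n$-dimensional Lebesgue measure, and $\mathfrak{M}_\lambda(\Omega)$ denotes the collection of all Lebesgue measurable subsets of $\Omega$.)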

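Notations. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. We define the space
\[ \mathfrak{N}_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) = \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \cap \mathfrak{N}_{\mathbb{K}}(X,\mathcal{A},\mu), \]
and we define the quotient space
\[ L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) = \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) / \mathfrak{N}_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu). \]
In other words, if one considers the quotient map
\[ \Pi_1 : \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu) \to L^1_{\mathbb{K}}(X,\mathcal{A},\mu), \]
then $L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) = \Pi_1\big( \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \big)$. Notice that we have the obvious inclusion
\[ \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \subset \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu), \quad \forall\, p \in [1,\infty), \]
so if we consider the quotient map
\[ \Pi_p : \mathfrak{L}^p_{\mathbb{K}}(X,\mathcal{A},\mu) \to L^p_{\mathbb{K}}(X,\mathcal{A},\mu), \]
we can also define the subspace
\[ L^p_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) = \Pi_p\big( \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) \big), \quad \forall\, p \in [1,\infty). \]
Remark that, as vector spaces, the spaces $L^p_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$ are identical, since
\[ \operatorname{Ker} \Pi_p = \mathfrak{N}_{\mathbb{K}}(X,\mathcal{A},\mu), \quad \forall\, p \in [1,\infty). \]

With these notations we have the following fact.

Proposition 3.4. $L^p_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$ is dense in $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$, for each $p \in [1,\infty)$.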

Proof. Fix $p \in [1,\infty)$, and start with some $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$. What we need to prove is the existence of a sequence $(f_n)_{n=1}^{\infty} \subset \mathfrak{L}^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$, such that $f = L^p\text{-}\lim_{n\to\infty} f_n$. Taking real and imaginary parts (in the case $\mathbb{K} = \mathbb{C}$), it suffices to consider the case when $f$ is real valued. Since $|f|$ also belongs to $L^p$, it follows that $f^+ = \max\{f, 0\} = \frac{1}{2}\big( |f| + f \big)$ and $f^- = \max\{-f, 0\} = \frac{1}{2}\big( |f| - f \big)$ both belong to $L^p$, so in fact we can assume that $f$ is non-negative. Consider the function $g = f^p \in \mathfrak{L}^1_+(X,\mathcal{A},\mu)$. Use the definition of the integral to find a sequence $(g_n)_{n=1}^{\infty} \subset \mathfrak{L}^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, such that

• $0 \le g_n \le g$, $\forall\, n \ge 1$;

• $\lim_{n\to\infty} \int_X g_n\,d\mu = \int_X g\,d\mu$.

This gives the fact that $g = L^1\text{-}\lim_{n\to\infty} g_n$. Using Corollary 3.4, after replacing $(g_n)_{n=1}^{\infty}$ with a subsequence, we can also assume that $g = \mu\text{-a.e.-}\lim_{n\to\infty} g_n$. If we put $f_n = (g_n)^{1/p}$, $\forall\, n \ge 1$, we now have

• $0 \le f_n \le f$, $\forall\, n \ge 1$;

• $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$.

Obviously, the $f_n$'s are still elementary integrable, and by the $L^p$ Dominated Convergence Theorem, we indeed get $f = L^p\text{-}\lim_{n\to\infty} f_n$.

Comments. A. The above result gives us the fact that $L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ is the completion of $L^p_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu)$. This allows for the following alternative construction of the $L^p$ spaces.

B. For a measurable function $f : X \to \mathbb{K}$, by the (proof of the) above result, it follows that the condition $f \in L^p_{\mathbb{K}}(X,\mathcal{A},\mu)$ is equivalent to the equality $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$, for some sequence $(f_n)_{n=1}^{\infty}$ of elementary integrable functions, which is Cauchy in the $L^p$ norm, i.e.

(c) for every $\varepsilon > 0$, there exists $N_\varepsilon$, such that
\[ \|f_m - f_n\|_p < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon. \]

One key feature, which will be heavily exploited in the next section, concerns the Banach space obtained for $p = 2$, for which we have the following.

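Proposition 3.5. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$.

(i) The map $( \,\cdot\, | \,\cdot\, ) : L^2_{\mathbb{K}}(X,\mathcal{A},\mu) \times L^2_{\mathbb{K}}(X,\mathcal{A},\mu) \to \mathbb{K}$, given by
\[ ( f \,|\, g ) = \langle \bar{f}, g \rangle = \int_X \bar{f} g\,d\mu, \quad \forall\, f, g \in L^2_{\mathbb{K}}(X,\mathcal{A},\mu), \]
defines an inner product on $L^2_{\mathbb{K}}(X,\mathcal{A},\mu)$.

(ii) One has the equality
\[ \|f\|_2 = \sqrt{( f \,|\, f )}, \quad \forall\, f \in L^2_{\mathbb{K}}(X,\mathcal{A},\mu). \]

Consequently, $L^2_{\mathbb{K}}(X,\mathcal{A},\mu)$ is a Hilbert space.

Proof. The properties of the inner product are immediate from the properties of integration. The second property is also clear.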

Remark 3.4. The main byproduct of the above feature is the fact that the correspondence (6) is an isometric isomorphism in the case $p = q = 2$. This follows from the Riesz Theorem (only the surjectivity is the issue here; the rest has been discussed in Remark 3.3). If $\phi : L^2_{\mathbb{K}}(X,\mathcal{A},\mu) \to \mathbb{K}$ is a linear continuous map, then there exists some $h \in L^2_{\mathbb{K}}(X,\mathcal{A},\mu)$, such that
\[ \phi(g) = ( h \,|\, g ), \quad \forall\, g \in L^2_{\mathbb{K}}(X,\mathcal{A},\mu). \]
If we put $f = \bar{h}$, then the above equality gives
\[ \phi(g) = \langle f, g \rangle, \quad \forall\, g \in L^2_{\mathbb{K}}(X,\mathcal{A},\mu), \]
i.e. $\phi = \Lambda_f$.

Comments. Eventually (see Section 5) we shall prove that the correspondence (6) is surjective also in the general case.

The correspondence (6) also has a version for $q = 1$. This would require the definition of an $L^p$ space for the case $p = \infty$. We shall postpone this until we reach Section 5. The next exercise hints towards such a construction.

Exercise 4♦. Let $(X,\mathcal{A},\mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $f : X \to \mathbb{K}$ be a bounded measurable function. Define $M = \sup_{x \in X} |f(x)|$. Prove the following.

(i) Whenever $g \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, it follows that the function $fg$ also belongs to $L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, and one has the inequality
\[ \|fg\|_1 \le M \cdot \|g\|_1. \]

(ii) The map
\[ \Lambda_f : L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \ni g \longmapsto \int_X fg\,d\mu \in \mathbb{K} \]
is linear and continuous. Moreover, one has the inequality $\|\Lambda_f\| \le M$.

Remark 3.5. If we apply the above Exercise to the constant function $f = 1$, we get the (already known) fact that the integration map
\[ (9) \qquad \Lambda_1 : L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \ni g \longmapsto \int_X g\,d\mu \in \mathbb{K} \]
is linear and continuous, and has norm $\|\Lambda_1\| \le 1$. The following exercise gives the exact value of the norm.

Exercise 5. With the notations above, prove that the following are equivalent:

(i) the measure space $(X,\mathcal{A},\mu)$ is non-degenerate, i.e. there exists $A \in \mathcal{A}$ with $0 < \mu(A) < \infty$;

(ii) $L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \ne \{0\}$;

(iii) the integration map (9) has norm $\|\Lambda_1\| = 1$.

Lectures 36-37

4. Radon-Nikodym Theorems

In this section we discuss a very important property which has many important applications.

Definition. Let $X$ be a non-empty set, and let $\mathcal{A}$ be a σ-algebra on $X$. Given two measures $\mu$ and $\nu$ on $\mathcal{A}$, we say that $\nu$ has the Radon-Nikodym property relative to $\mu$, if there exists a measurable function $f : X \to [0,\infty]$, such that
\[ (1) \qquad \nu(A) = \int_A f\,d\mu, \quad \forall\, A \in \mathcal{A}. \]
Here we use the convention which defines the integral in the right hand side by
\[ \int_A f\,d\mu = \begin{cases} \int_X f\kappa_A\,d\mu & \text{if } f\kappa_A \in \mathfrak{L}^1_+(X,\mathcal{A},\mu) \\ \infty & \text{if } f\kappa_A \notin \mathfrak{L}^1_+(X,\mathcal{A},\mu) \end{cases} \]
In this case, we say that $f$ is a density for $\nu$ relative to $\mu$.

The Radon-Nikodym property has an equivalent useful formulation.

Proposition 4.1 (Change of Variables). Let $X$ be a non-empty set, let $\mathcal{A}$ be a σ-algebra on $X$, let $\mu$ and $\nu$ be measures on $\mathcal{A}$, and let $f : X \to [0,\infty]$ be a measurable function.

A. The following are equivalent:

(i) $\nu$ has the Radon-Nikodym property relative to $\mu$, and $f$ is a density for $\nu$ relative to $\mu$;

(ii) for every measurable function $h : X \to [0,\infty]$, one has the equality¹⁶
\[ (2) \qquad \int_X h\,d\nu = \int_X hf\,d\mu. \]

B. If $\nu$ and $f$ are as above, and $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, then the equality (2) also holds for those measurable functions $h : X \to \mathbb{K}$ with $h \in \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\nu)$ and $hf \in \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$.

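Proof. A. (i) ⇒ (ii). Assume property (i) holds, which means that we have (1). Fix a measurable function $h : X \to [0,\infty]$, and use Theorem III.3.2 to find a sequence $(h_n)_{n=1}^{\infty} \subset \mathcal{A}\text{-Elem}_{\mathbb{R}}(X)$, with

(a) $0 \le h_1 \le h_2 \le \dots \le h$;

(b) $\lim_{n\to\infty} h_n(x) = h(x)$, $\forall\, x \in X$.

Of course, we also have

(a′) $0 \le h_1 f \le h_2 f \le \dots \le hf$;

(b′) $\lim_{n\to\infty} h_n(x) f(x) = h(x) f(x)$, $\forall\, x \in X$.

Using the Monotone Convergence Theorem, we then get the equalities
\[ (3) \qquad \int_X h\,d\nu = \lim_{n\to\infty} \int_X h_n\,d\nu \quad\text{and}\quad \int_X hf\,d\mu = \lim_{n\to\infty} \int_X h_n f\,d\mu. \]
Notice that, if we fix $n$ and we write $h_n = \sum_{k=1}^{p} \alpha_k \kappa_{A_k}$, for some $A_1, \dots, A_p \in \mathcal{A}$ and $\alpha_1 > \dots > \alpha_p > 0$, then
\[ \int_X h_n\,d\nu = \sum_{k=1}^{p} \alpha_k \nu(A_k) = \sum_{k=1}^{p} \int_X \alpha_k \kappa_{A_k} f\,d\mu = \int_X h_n f\,d\mu, \]
so using (3), we immediately get (2).

The implication (ii) ⇒ (i) is trivial, using functions of the form $h = \kappa_A$, $A \in \mathcal{A}$.

B. Suppose $\nu$ has the Radon-Nikodym property relative to $\mu$, and $f$ is a density for $\nu$ relative to $\mu$, and let $h : X \to \mathbb{K}$ be a measurable function with $h \in \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\nu)$ and $hf \in \mathfrak{L}^1_{\mathbb{K}}(X,\mathcal{A},\mu)$. In the complex case, using the inequalities $|\operatorname{Re} h| \le |h|$ and $|\operatorname{Im} h| \le |h|$, it is clear that both functions $\operatorname{Re} h$ and $\operatorname{Im} h$ belong to $\mathfrak{L}^1(X,\mathcal{A},\nu)$, and also the products $(\operatorname{Re} h) f$ and $(\operatorname{Im} h) f$ belong to $\mathfrak{L}^1(X,\mathcal{A},\mu)$. This shows that it suffices to prove (2) under the additional hypothesis that $h$ is real-valued. In this case we consider the functions $h^{\pm}$, defined by
\[ h^+ = \max\{h, 0\} \quad\text{and}\quad h^- = \max\{-h, 0\}. \]
Since we have $0 \le h^{\pm} \le |h|$, it follows that $h^{\pm} \in \mathfrak{L}^1_+(X,\mathcal{A},\nu)$, as well as $h^{\pm} f \in \mathfrak{L}^1_+(X,\mathcal{A},\mu)$. In particular, we get the equalities
\[ (4) \qquad \int_X h\,d\nu = \int_X h^+\,d\nu - \int_X h^-\,d\nu \quad\text{and}\quad \int_X hf\,d\mu = \int_X h^+ f\,d\mu - \int_X h^- f\,d\mu. \]
Since $h^{\pm} \ge 0$, we can use property A.(ii) above, and we have
\[ \int_X h^{\pm}\,d\nu = \int_X h^{\pm} f\,d\mu, \]
and then the desired equality (2) immediately follows from (4).

16 For the product $hf$ we use the conventions $0 \cdot \infty = \infty \cdot 0 = 0$, and $t \cdot \infty = \infty \cdot t = \infty$, $\forall\, t \in (0,\infty]$.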
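The change of variables formula (2) becomes a finite identity when $X$ is a finite set. The following Python sketch assumes point masses stored in dictionaries, builds $\nu$ from a density $f$, and compares $\int_X h\,d\nu$ with $\int_X hf\,d\mu$ for a real-valued $h$; the helper `integral` and the sample data are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def integral(h, weights):
    """Integral of h over a finite set with point masses `weights`."""
    return sum(h[x] * w for x, w in weights.items())

# Hypothetical measure mu and density f on X = {a, b, c, d}.
mu = {"a": Fraction(1, 2), "b": Fraction(3), "c": Fraction(0), "d": Fraction(2)}
f = {"a": Fraction(4), "b": Fraction(1, 3), "c": Fraction(7), "d": Fraction(0)}

# nu(A) is the integral of f over A, so on points nu({x}) = f(x) * mu({x}).
nu = {x: f[x] * mu[x] for x in mu}

# A real-valued h, covered by part B of the proposition.
h = {"a": Fraction(-1), "b": Fraction(5), "c": Fraction(2), "d": Fraction(9)}
hf = {x: h[x] * f[x] for x in mu}

assert integral(h, nu) == integral(hf, mu)
```

Note that the point `c` is µ-null and the density vanishes at `d`, so neither contributes to either side.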

One important issue is the uniqueness of the density. For this purpose, it will be helpful to introduce the following.

Definition. Let $T$ be one of the spaces $[-\infty,\infty]$ or $\mathbb{C}$, and let $r$ be some relation on $T$ (in our case $r$ will be either “=,” or “≥,” or “≤,” on $[-\infty,\infty]$). Given a measure space $(X,\mathcal{A},\mu)$, and two measurable functions $f_1, f_2 : X \to T$, we write
\[ f_1 \; r \; f_2, \quad \mu\text{-l.a.e.} \]
if the set
\[ A = \big\{ x \in X : f_1(x) \; r \; f_2(x) \big\} \]
belongs to $\mathcal{A}$, and it has locally µ-null complement in $X$, i.e. $\mu\big( [X \setminus A] \cap F \big) = 0$, for every set $F \in \mathcal{A}$ with $\mu(F) < \infty$. (If $r$ is one of the relations listed above, the set $A$ automatically belongs to $\mathcal{A}$, so all intersections $[X \setminus A] \cap F$, $F \in \mathcal{A}$, also belong to $\mathcal{A}$.) The abbreviation “µ-l.a.e.” stands for “µ-locally-almost everywhere.” Remark that one has the implication
\[ f_1 \; r \; f_2,\ \mu\text{-a.e.} \ \Rightarrow\ f_1 \; r \; f_2,\ \mu\text{-l.a.e.} \]
Remark that, when $\mu$ is σ-finite, the other implication also holds:
\[ f_1 \; r \; f_2,\ \mu\text{-l.a.e.} \ \Rightarrow\ f_1 \; r \; f_2,\ \mu\text{-a.e.} \]


With this terminology, one has the following uniqueness result.

Proposition 4.2. Suppose $\mathcal{A}$ is a $\sigma$-algebra on some non-empty set $X$, and $\mu$ and $\nu$ are measures on $\mathcal{A}$, such that $\nu$ has the Radon-Nikodym property relative to $\mu$. If $f, g : X \to [0, \infty]$ are densities for $\nu$ relative to $\mu$, then

\[ f = g, \ \mu\text{-l.a.e.} \]

In particular, if $\mu$ is $\sigma$-finite, then

\[ f = g, \ \mu\text{-a.e.} \]

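Proof. Consider the set $B = \{ x \in X : f(x) \ne g(x) \}$, which belongs to $\mathcal{A}$. We need to prove that $B$ is locally $\mu$-null, i.e. one has $\mu(B \cap F) = 0$, for all $F \in \mathcal{A}$ with $\mu(F) < \infty$. Fix $F \in \mathcal{A}$ with $\mu(F) < \infty$, and let us write $B \cap F = D \cup E$, where

\[ D = \{ x \in B \cap F : f(x) < g(x) \} \quad \text{and} \quad E = \{ x \in B \cap F : f(x) > g(x) \}. \]

If we define, for each integer $n \ge 1$, the sets

\[ D_n = \{ x \in B \cap F : f(x) + \tfrac{1}{n} \le g(x) \} \quad \text{and} \quad E_n = \{ x \in B \cap F : f(x) \ge g(x) + \tfrac{1}{n} \}, \]

then it is clear that

\[ B \cap F = D \cup E = \bigcup_{n=1}^\infty (D_n \cup E_n), \]

so in order to prove that $\mu(B \cap F) = 0$, it suffices to show that $\mu(D_n) = \mu(E_n) = 0$, $\forall\, n \ge 1$.

Fix $n \ge 1$. It is obvious that $f(x) < \infty$, $\forall\, x \in D_n$, so if we define the sequence $(D_n^k)_{k=1}^\infty \subset \mathcal{A}$, by

\[ D_n^k = \{ x \in D_n : f(x) \le k \}, \ \forall\, k \ge 1, \]

we have the equality $D_n = \bigcup_{k=1}^\infty D_n^k$, so in order to prove that $\mu(D_n) = 0$, it suffices to show that $\mu(D_n^k) = 0$, $\forall\, k \ge 1$. On the one hand, since $f(x) \le k$, $\forall\, x \in D_n^k$, using the inclusion $D_n^k \subset F$, we get

\[ \nu(D_n^k) = \int_{D_n^k} f \, d\mu \le \int_X k \kappa_{D_n^k} \, d\mu = k \mu(D_n^k) \le k \mu(F) < \infty. \]

On the other hand, since $g(x) \ge f(x) + \tfrac{1}{n}$, $\forall\, x \in D_n^k$, we get

\[ \nu(D_n^k) = \int_{D_n^k} g \, d\mu \ge \int_X \big( f \kappa_{D_n^k} + \tfrac{1}{n} \kappa_{D_n^k} \big) \, d\mu = \int_X f \kappa_{D_n^k} \, d\mu + \int_X \tfrac{1}{n} \kappa_{D_n^k} \, d\mu = \nu(D_n^k) + \tfrac{1}{n} \mu(D_n^k). \]

Since $\nu(D_n^k) < \infty$, the above inequality forces $\mu(D_n^k) = 0$.

The fact that $\mu(E_n) = 0$, $\forall\, n \ge 1$, is proven the exact same way.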

In general, the uniqueness of the density does not hold $\mu$-a.e., as it is seen from the following.

Example 4.1. Take $X$ to be some non-empty set, put $\mathcal{A} = \{\emptyset, X\}$, and define the measure $\mu$ on $\mathcal{A}$, by $\mu(\emptyset) = 0$ and $\mu(X) = \infty$. It is clear that $\mu$ has the Radon-Nikodym property relative to itself, but as densities one can choose for instance the constant functions $f = 1$ and $g = 2$. Clearly, the equality $f = g$, $\mu$-a.e. is not true.
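Example 4.1 is small enough to be spelled out mechanically. The sketch below is an illustration only: it models the two-element σ-algebra by the hypothetical labels `"empty"` and `"X"`, and uses the usual convention $0 \cdot \infty = 0$ when integrating a constant.

```python
import math

# Hypothetical model of Example 4.1: the sigma-algebra is {emptyset, X},
# with mu(emptyset) = 0 and mu(X) = infinity.
MU = {"empty": 0.0, "X": math.inf}

def integral_of_constant(c, A):
    """Integral over A of the constant function c >= 0, with 0 * inf = 0."""
    m = MU[A]
    return 0.0 if (c == 0 or m == 0) else c * m

# Set functions A -> integral over A of f = 1 and of g = 2, with respect to mu.
density_f = {A: integral_of_constant(1, A) for A in MU}
density_g = {A: integral_of_constant(2, A) for A in MU}
print(density_f == MU, density_g == MU)
```

Both constants reproduce $\mu$ on every measurable set, although they differ at every point of $X$, and $X$ itself is not $\mu$-null.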


Remark 4.1. The local almost uniqueness result, given in Proposition 4.2, holds under slightly weaker assumptions. Namely, if $(X, \mathcal{A}, \mu)$ is a measure space, and if $f, g : X \to [0, \infty]$ are measurable functions for which we have the equality

\[ \int_A f \, d\mu = \int_A g \, d\mu, \]

for all $A \in \mathcal{A}$ with $\mu(A) < \infty$, then we still have the equality $f = g$, $\mu$-l.a.e. This follows actually from Proposition 4.2, applied to functions of the form $f\big|_A$ and $g\big|_A$.

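Let us introduce the following.

Notations. For a measure space $(X, \mathcal{A}, \mu)$ we define

\[ \mathcal{A}^\mu_0 = \{ N \in \mathcal{A} : \mu(N) = 0 \}; \]

\[ \mathcal{A}^\mu_{\mathrm{fin}} = \{ F \in \mathcal{A} : \mu(F) < \infty \}; \]

\[ \mathcal{A}^\mu_{0,\mathrm{loc}} = \{ A \in \mathcal{A} : \mu(A \cap F) = 0, \ \forall\, F \in \mathcal{A}^\mu_{\mathrm{fin}} \}. \]

With these notations, we have the inclusions

\[ \mathcal{A}^\mu_0 = \mathcal{A}^\mu_{0,\mathrm{loc}} \cap \mathcal{A}^\mu_{\mathrm{fin}} \subset \mathcal{A}^\mu_{0,\mathrm{loc}} \subset \mathcal{A}, \]

and $\mathcal{A}^\mu_0$ and $\mathcal{A}^\mu_{0,\mathrm{loc}}$ are in fact $\sigma$-rings.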

Comment. The “locally-almost everywhere” terminology is actually designed to “hide some pathologies under the rug.” For instance, if $(X, \mathcal{A}, \mu)$ is a degenerate measure space, i.e. $\mu(A) \in \{0, \infty\}$, $\forall\, A \in \mathcal{A}$, then “anything happens locally almost-everywhere,” which means that we have the equality $\mathcal{A}^\mu_{0,\mathrm{loc}} = \mathcal{A}$.

At the other end, there is a particular type of measure spaces on which, even in the absence of $\sigma$-finiteness, the notions of “locally-almost everywhere” and “almost everywhere” coincide, i.e. we have the equality $\mathcal{A}^\mu_{0,\mathrm{loc}} = \mathcal{A}^\mu_0$. Such spaces are described by the following.

Definition. A measure space $(X, \mathcal{A}, \mu)$ is said to be nowhere degenerate, or with finite subset property, if

(f) for every set $A \in \mathcal{A}$ with $\mu(A) > 0$, there exists some set $F \in \mathcal{A}$, with $F \subset A$, and $0 < \mu(F) < \infty$.

With this terminology, one has the following result.

Proposition 4.3. For a measure space $(X, \mathcal{A}, \mu)$, the following are equivalent:

(i) $\mathcal{A}^\mu_{0,\mathrm{loc}} = \mathcal{A}^\mu_0$;

(ii) $(X, \mathcal{A}, \mu)$ has the finite subset property.

Proof. (i) ⇒ (ii). Assume $\mathcal{A}^\mu_{0,\mathrm{loc}} = \mathcal{A}^\mu_0$, and let us prove that $(X, \mathcal{A}, \mu)$ has the finite subset property. We argue by contradiction, so let us assume there exists some set $A \in \mathcal{A}$, with $\mu(A) = \infty$, such that $\mu(B) \in \{0, \infty\}$, for every $B \in \mathcal{A}$, with $B \subset A$. In particular, if we start with some arbitrary $F \in \mathcal{A}^\mu_{\mathrm{fin}}$, using the fact that $\mu(A \cap F) \le \mu(F) < \infty$, we see that we must have $\mu(A \cap F) = 0$. This proves precisely that $A \in \mathcal{A}^\mu_{0,\mathrm{loc}}$. By assumption, it follows that $A \in \mathcal{A}^\mu_0$, i.e. $\mu(A) = 0$, which is impossible.

(ii) ⇒ (i). Assume that $(X, \mathcal{A}, \mu)$ has the finite subset property, and let us prove the equality (i). Since one inclusion is always true, all we need to prove is the inclusion $\mathcal{A}^\mu_{0,\mathrm{loc}} \subset \mathcal{A}^\mu_0$, which is equivalent to the inclusion $\mathcal{A}^\mu_{0,\mathrm{loc}} \subset \mathcal{A}^\mu_{\mathrm{fin}}$. Start with some set $A \in \mathcal{A}^\mu_{0,\mathrm{loc}}$, but assume $\mu(A) = \infty$. On the one hand, using the finite subset property, there exists some set $F \in \mathcal{A}$ with $F \subset A$ and $0 < \mu(F) < \infty$. On the other hand, since $A \in \mathcal{A}^\mu_{0,\mathrm{loc}}$ and $F \in \mathcal{A}^\mu_{\mathrm{fin}}$, we have $\mu(F) = \mu(A \cap F) = 0$, which is impossible.
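The three families of sets and the dichotomy of Proposition 4.3 can be explored on small discrete examples. The following is a rough sketch under the assumption that the space is finite, every subset is measurable, and the measure is given by point masses (possibly infinite); the helper names `subsets`, `mu_of` and `null_families` are hypothetical.

```python
import math
from itertools import chain, combinations

def subsets(points):
    """All subsets of a finite list of points, as frozensets."""
    return [frozenset(c) for c in chain.from_iterable(
        combinations(points, r) for r in range(len(points) + 1))]

def mu_of(A, mass):
    """Measure of A for a measure given by point masses."""
    return sum(mass[x] for x in A)

def null_families(mass):
    """Return (null sets, finite-measure sets, locally null sets)."""
    sigma = subsets(list(mass))
    a_fin = {A for A in sigma if mu_of(A, mass) < math.inf}
    a_0 = {A for A in sigma if mu_of(A, mass) == 0}
    a_0loc = {A for A in sigma
              if all(mu_of(A & F, mass) == 0 for F in a_fin)}
    return a_0, a_fin, a_0loc

counting = {"a": 1, "b": 1, "c": 1}            # has the finite subset property
with_infinite_atom = {"a": 1, "b": math.inf}   # {"b"} has no finite positive part

a0, afin, a0loc = null_families(counting)
b0, bfin, b0loc = null_families(with_infinite_atom)
print(a0 == a0loc, b0 == b0loc)
```

In the second space the singleton `{"b"}` is locally null without being null, which is the failure of the finite subset property; in both spaces the null sets coincide with the locally null sets of finite measure.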

Example 4.2. Take $X$ to be an uncountable set, let $\mathcal{A} = \mathcal{P}(X)$, and let $\mu$ be the counting measure, i.e.

\[ \mu(A) = \begin{cases} \operatorname{Card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases} \]

Then $(X, \mathcal{P}(X), \mu)$ has the finite subset property, but is not $\sigma$-finite.

When we restrict to integrable functions, the two notions $\mu$-l.a.e. and $\mu$-a.e. coincide. More precisely, we have the following.

Proposition 4.4. Let $(X, \mathcal{A}, \mu)$ be a measure space, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $p \in [1, \infty)$. For a function $f \in L^p_{\mathbb{K}}(X, \mathcal{A}, \mu)$, the following are equivalent:

(i) $f = 0$, $\mu$-l.a.e.

(ii) $f = 0$, $\mu$-a.e.

Proof. Of course, we only need to prove the implication (i) ⇒ (ii). Assume $f = 0$, $\mu$-l.a.e. Using the function $g = |f|^p$, we can assume that $p = 1$ and $f(x) \ge 0$, $\forall\, x \in X$. Consider then the set $N = \{ x \in X : f(x) > 0 \}$, and write it as a union $N = \bigcup_{n=1}^\infty N_n$, where

\[ N_n = \{ x \in X : f(x) \ge \tfrac{1}{n} \}, \ \forall\, n \ge 1. \]

Of course, all we need is the fact that $\mu(N_n) = 0$, $\forall\, n \ge 1$. Fix $n \ge 1$. On the one hand, by the assumption on $f$, it follows that $N_n \in \mathcal{A}^\mu_{0,\mathrm{loc}}$. On the other hand, the inequality $\tfrac{1}{n} \kappa_{N_n} \le f$ forces the elementary function $\tfrac{1}{n} \kappa_{N_n}$ to be $\mu$-integrable, i.e. $\mu(N_n) < \infty$. Consequently we have

\[ N_n \in \mathcal{A}^\mu_{0,\mathrm{loc}} \cap \mathcal{A}^\mu_{\mathrm{fin}} = \mathcal{A}^\mu_0. \]

Comment. In what follows we will discuss several results, which all have as conclusion the fact that one measure has the Radon-Nikodym property with respect to another one. All such results will be called “Radon-Nikodym Theorems.”

The first result is in fact quite general, in the sense that it works for finite signed or complex measures.

Theorem 4.1 (“Easy” Radon-Nikodym Theorem). Let $(X, \mathcal{A}, \mu)$ be a finite measure space, let $\mathbb{K}$ denote one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $C > 0$ be some constant. Suppose $\nu$ is a $\mathbb{K}$-valued measure on $\mathcal{A}$, such that

\[ |\nu(A)| \le C \mu(A), \ \forall\, A \in \mathcal{A}. \]

Then there exists some function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

\[ \nu(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}. \tag{5} \]

Moreover:

(i) Any function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, satisfying (5) has the property $|f| \le C$, $\mu$-a.e. If $\nu$ is an “honest” measure, then one also has the inequality $f \ge 0$, $\mu$-a.e.

(ii) A function satisfying (5) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (5), it follows that $f_1 = f_2$, $\mu$-a.e.


Proof. The idea is to somehow make sense of $\int_X h \, d\nu$, for suitable measurable functions $h$, and to examine the properties of such a number relative to the integral $\int_X h \, d\mu$. The second integral is of course defined, for instance for $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, but the first integral is not, because $\nu$ is not an “honest” measure. The proof will be carried out in several steps.

Step 1: There exist four “honest” finite measures $\nu_k$, $k = 1, 2, 3, 4$, and numbers $\alpha_k$, $k = 1, 2, 3, 4$, such that $\nu = \alpha_1 \nu_1 + \alpha_2 \nu_2 + \alpha_3 \nu_3 + \alpha_4 \nu_4$, and

\[ \nu_k \le C \mu, \ \forall\, k = 1, 2, 3, 4. \tag{6} \]

In the case $\mathbb{K} = \mathbb{R}$ we use the Hahn-Jordan decomposition $\nu = \nu^+ - \nu^-$. We also know that $\nu^\pm \le |\nu|$, the variation measure of $\nu$. In this case we take $\alpha_1 = 1$, $\nu_1 = \nu^+$, $\alpha_2 = -1$, $\nu_2 = \nu^-$, and we set $\nu_3 = \nu_4 = 0$, $\alpha_3 = \alpha_4 = 0$.

In the case $\mathbb{K} = \mathbb{C}$, we write $\nu = \eta + i\lambda$, with $\eta$ and $\lambda$ finite signed measures, and we use the Hahn-Jordan decompositions $\eta = \eta^+ - \eta^-$ and $\lambda = \lambda^+ - \lambda^-$. We also know that the variation measures of $\eta$ and $\lambda$ satisfy $|\eta| \le |\nu|$ and $|\lambda| \le |\nu|$, so we also have $\eta^\pm \le |\nu|$ and $\lambda^\pm \le |\nu|$. In this case we can then take $\alpha_1 = 1$, $\nu_1 = \eta^+$, $\alpha_2 = -1$, $\nu_2 = \eta^-$, $\alpha_3 = i$, $\nu_3 = \lambda^+$, $\alpha_4 = -i$, $\nu_4 = \lambda^-$.

Notice that in either case we have

\[ \nu_k \le |\nu|, \ \forall\, k = 1, 2, 3, 4. \]

By Remark III.8.5 it follows that we have $|\nu| \le C\mu$, so we immediately get the inequalities (6).

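Step 2: For any measurable function $h : X \to [0, \infty]$, one has the inequality

\[ \int_X h \, d\nu_k \le C \int_X h \, d\mu, \ \forall\, k = 1, 2, 3, 4. \tag{7} \]

To prove this, we choose a sequence of elementary functions $(h_n)_{n=1}^\infty \subset \mathcal{A}\text{-Elem}_{\mathbb{R}}(X)$, with

• $0 \le h_1 \le h_2 \le \dots$ (everywhere),
• $\lim_{n\to\infty} h_n(x) = h(x)$, $\forall\, x \in X$,

so that by the General Monotone Convergence Theorem, we get the equalities

\[ \int_X h \, d\mu = \lim_{n\to\infty} \int_X h_n \, d\mu \quad \text{and} \quad \int_X h \, d\nu_k = \lim_{n\to\infty} \int_X h_n \, d\nu_k, \ \forall\, k = 1, 2, 3, 4. \]

This means that, in order to prove (7), it suffices to prove it under the extra assumption that $h$ is elementary. In this case, we have

\[ h = \beta_1 \kappa_{B_1} + \dots + \beta_p \kappa_{B_p}, \]

with $\beta_1, \dots, \beta_p \ge 0$ and $B_1, \dots, B_p \in \mathcal{A}$. The inequality is then immediate from (6), since we have

\[ \int_X h \, d\nu_k = \sum_{j=1}^p \beta_j \nu_k(B_j) \le C \sum_{j=1}^p \beta_j \mu(B_j) = C \int_X h \, d\mu. \]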

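As a consequence of Step 2, we get the fact that, for every $k = 1, 2, 3, 4$, one has the inclusions

\[ L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset L^1_{\mathbb{K}}(X, \mathcal{A}, \nu_k) \quad \text{and} \quad N_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset N_{\mathbb{K}}(X, \mathcal{A}, \nu_k). \]

Taking quotients, this gives rise to correctly defined linear maps

\[ \Phi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu_k), \ k = 1, 2, 3, 4. \tag{8} \]

(Here we use the abusive notation that identifies an element of the quotient space $L^1$ with a representing function, which is defined almost uniquely.) Moreover, one has the inequality

\[ \int_X |h| \, d\nu_k \le C \int_X |h| \, d\mu, \ \forall\, h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu), \ k = 1, 2, 3, 4, \]

in other words, the linear maps (8) are all continuous. For every $k = 1, 2, 3, 4$, let $\phi_k$ denote the integration map

\[ \phi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \nu_k) \ni h \longmapsto \int_X h \, d\nu_k \in \mathbb{K}. \]

We know (see Remark 3.5) that the $\phi_k$'s are continuous. In particular, the compositions $\psi_k = \phi_k \circ \Phi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \to \mathbb{K}$, which are defined by

\[ \psi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto \int_X h \, d\nu_k, \ k = 1, 2, 3, 4, \]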

are linear and continuous.

We now use Proposition 3.3 which states that one has an inclusion

\[ \Theta : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \hookrightarrow L^1_{\mathbb{K}}(X, \mathcal{A}, \mu), \tag{9} \]

which is in fact a linear continuous map. So if we consider the compositions $\theta_k = \psi_k \circ \Theta$, which are defined by

\[ \theta_k : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto \int_X h \, d\nu_k, \ k = 1, 2, 3, 4, \]

then these compositions are linear and continuous. Apply then Riesz Theorem (in the form given in Remark 3.4), to find functions $f_1, f_2, f_3, f_4 \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

\[ \theta_k(h) = \langle f_k, h \rangle, \ \forall\, h \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu), \ k = 1, 2, 3, 4. \]

In particular, using functions of the form $h = \kappa_A$, $A \in \mathcal{A}$ (which all belong to $L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$, due to the finiteness of $\mu$), we get

\[ \nu_k(A) = \int_X \kappa_A \, d\nu_k = \int_X f_k \kappa_A \, d\mu, \ \forall\, A \in \mathcal{A}, \ k = 1, 2, 3, 4. \]

Finally, if we define the function $f = \alpha_1 f_1 + \alpha_2 f_2 + \alpha_3 f_3 + \alpha_4 f_4 \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$, then the above equalities immediately give the equality (5).

At this point we only know that $f$ belongs to $L^2_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Using the inclusion (9), it turns out that $f$ indeed belongs to $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

Let us prove now the additional properties (i) and (ii).

To prove the first assertion in (i), we start off by fixing some function f ∈

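$L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, which satisfies (5), and we define the set

\[ A = \{ x \in X : |f(x)| > C \}, \]

for which we must prove that $\mu(A) = 0$. Since $f$ is measurable, it follows that $A$ belongs to $\mathcal{A}$. Consider the “rational unit sphere” $S^1_{\mathbb{Q}}$ in $\mathbb{K}$, defined as

\[ S^1_{\mathbb{Q}} = \begin{cases} \{-1, 1\} & \text{if } \mathbb{K} = \mathbb{R} \\ \{ e^{2\pi i t} : t \in \mathbb{Q} \} & \text{if } \mathbb{K} = \mathbb{C} \end{cases} \tag{10} \]

The point is that $S^1_{\mathbb{Q}}$ is dense in the unit sphere $S^1$ in $\mathbb{K}$:

\[ S^1 = \{ \alpha \in \mathbb{K} : |\alpha| = 1 \}, \]

so we immediately have the equality $A = \bigcup_{\alpha \in S^1_{\mathbb{Q}}} A_\alpha$, where

\[ A_\alpha = \{ x \in X : \mathrm{Re}[\alpha f(x)] > C \}. \]

Since $S^1_{\mathbb{Q}}$ is countable, in order to prove that $\mu(A) = 0$, it then suffices to show that $\mu(A_\alpha) = 0$, $\forall\, \alpha \in S^1_{\mathbb{Q}}$. Fix then $\alpha \in S^1_{\mathbb{Q}}$, and consider the $\mathbb{K}$-valued measure $\eta = \alpha\nu$. It is clear that we still have

\[ |\eta(A)| = |\nu(A)| \le C\mu(A), \ \forall\, A \in \mathcal{A}, \tag{11} \]

as well as the equality

\[ \eta(A) = \int_A \alpha f \, d\mu, \ \forall\, A \in \mathcal{A}. \tag{12} \]

For each integer $n \ge 1$, let us define the set

\[ A^n_\alpha = \{ x \in X : \mathrm{Re}[\alpha f(x)] \ge C + \tfrac{1}{n} \}, \]

so that we obviously have the equality $A_\alpha = \bigcup_{n=1}^\infty A^n_\alpha$. In particular, in order to prove $\mu(A_\alpha) = 0$, it suffices to prove that $\mu(A^n_\alpha) = 0$, $\forall\, n \ge 1$. Fix for the moment $n \ge 1$. Using (12), it follows that

\[ \mathrm{Re}\, \eta(A^n_\alpha) = \mathrm{Re}\Big[ \int_{A^n_\alpha} \alpha f \, d\mu \Big] = \int_{A^n_\alpha} \mathrm{Re}[\alpha f] \, d\mu = \int_X \mathrm{Re}[\alpha f] \kappa_{A^n_\alpha} \, d\mu. \]

Since we have $\mathrm{Re}[\alpha f] \kappa_{A^n_\alpha} \ge (C + \tfrac{1}{n}) \kappa_{A^n_\alpha}$, the above equality can be continued with

\[ \mathrm{Re}\, \eta(A^n_\alpha) \ge \int_X (C + \tfrac{1}{n}) \kappa_{A^n_\alpha} \, d\mu = (C + \tfrac{1}{n}) \mu(A^n_\alpha). \]

Of course, this will give

\[ |\eta(A^n_\alpha)| \ge \mathrm{Re}\, \eta(A^n_\alpha) \ge (C + \tfrac{1}{n}) \mu(A^n_\alpha). \]

Note now that, using (11), this will finally give

\[ C\mu(A^n_\alpha) \ge (C + \tfrac{1}{n}) \mu(A^n_\alpha), \]

which, since $\mu$ is finite, clearly forces $\mu(A^n_\alpha) = 0$.

Having proven that $|f| \le C$, $\mu$-a.e., let us turn our attention now to the uniqueness property (ii). Suppose $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ are such that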

\[ \nu(A) = \int_A f_1 \, d\mu = \int_A f_2 \, d\mu, \ \forall\, A \in \mathcal{A}. \]

Consider then the difference $f = f_1 - f_2$ and the trivial measure $\nu_0 = 0$. Obviously we have

\[ |\nu_0(A)| \le \tfrac{1}{n} \mu(A), \ \forall\, A \in \mathcal{A}, \]

for every integer $n \ge 1$, as well as

\[ \nu_0(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}. \]

By the first assertion in (i), it follows that

\[ |f_1 - f_2| = |f| \le \tfrac{1}{n}, \ \mu\text{-a.e.}, \]

for every $n \ge 1$. So if we take the sets $(N_n)_{n=1}^\infty \subset \mathcal{A}$ defined by

\[ N_n = \{ x \in X : |f_1(x) - f_2(x)| > \tfrac{1}{n} \}, \]

then $\mu(N_n) = 0$, $\forall\, n \ge 1$. Of course, if we put $N = \bigcup_{n=1}^\infty N_n$, then on the one hand we have $\mu(N) = 0$, and on the other hand, we have

\[ f_1(x) - f_2(x) = 0, \ \forall\, x \in X \smallsetminus N, \]

which means that we indeed have $f_1 = f_2$, $\mu$-a.e.

Finally, let us prove the second assertion in (i), which starts with the assumption

that $\nu$ is an “honest” measure. Let $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (5). By the uniqueness property (ii), it follows immediately that

\[ f = \mathrm{Re}\, f, \ \mu\text{-a.e.}, \]

so we can assume that $f$ is already real-valued. Consider the “honest” measure $\omega = C\mu - \nu$, and notice that the function $g : X \to \mathbb{R}$ defined by

\[ g(x) = C - f(x), \ \forall\, x \in X, \]

clearly has the property

\[ \omega(A) = \int_A g \, d\mu, \ \forall\, A \in \mathcal{A}. \]

Since we obviously have

\[ 0 \le \omega(A) \le C\mu(A), \ \forall\, A \in \mathcal{A}, \]

by the first assertion of (i), applied to the measure $\omega$ and the function $g$, it follows that $|g| \le C$, $\mu$-a.e. In other words, we have now a combined inequality:

\[ \max\{ |f|, |C - f| \} \le C, \ \mu\text{-a.e.} \]

Of course, since $f$ is real valued, this forces $f \ge 0$, $\mu$-a.e.
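On a finite set the conclusions of Theorem 4.1 can be verified directly, since a density is then just a ratio of point masses. The sketch below is only an illustration, assuming a hypothetical three-point space where every point has positive $\mu$-mass; the data `mu`, `nu`, `C` and the helper `measure` are made up for the example.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical data: point masses of a finite measure mu and of a signed measure nu.
X = ["a", "b", "c"]
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(1, 2)}
nu = {"a": Fraction(-3), "b": Fraction(4), "c": Fraction(1)}
C = Fraction(3)

def measure(A, mass):
    """Value on A of the (signed) measure with the given point masses."""
    return sum((mass[x] for x in A), Fraction(0))

all_sets = [set(c) for c in chain.from_iterable(
    combinations(X, r) for r in range(len(X) + 1))]

# The domination hypothesis |nu(A)| <= C * mu(A), checked on every subset.
dominated = all(abs(measure(A, nu)) <= C * measure(A, mu) for A in all_sets)

# Candidate density: ratio of point masses (every mu({x}) is positive here).
f = {x: nu[x] / mu[x] for x in X}
print(dominated, f)
```

Here the ratio of point masses plays the role of the function produced by the Riesz Theorem in the proof, and the bound $|f| \le C$ from part (i) is visible pointwise.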

In what follows we are going to offer various generalizations of Theorem 4.1. There are several directions in which Theorem 4.1 can be generalized. The main direction, which we present here, will aim at weakening the condition $|\nu| \le C\mu$. The following result explains that in fact the case of $\mathbb{K}$-valued measures can be always reduced to the case of “honest” finite ones.

Proposition 4.5 (Polar Decomposition). Let $\mathcal{A}$ be a $\sigma$-algebra on a non-empty set $X$, let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $\nu$ be a $\mathbb{K}$-valued measure on $\mathcal{A}$. Let $|\nu|$ denote the variation measure of $\nu$. There exists some function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$, such that

\[ \nu(A) = \int_A f \, d|\nu|, \ \forall\, A \in \mathcal{A}. \tag{13} \]

Moreover:

(i) Any function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$, satisfying (13) has the property $|f| = 1$, $|\nu|$-a.e.

(ii) A function satisfying (13) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$ satisfy (13), it follows that $f_1 = f_2$, $|\nu|$-a.e.

Proof. We know that

\[ |\nu(A)| \le |\nu|(A), \ \forall\, A \in \mathcal{A}. \]

So if we apply Theorem 4.1 for the finite measure $\mu = |\nu|$ and $C = 1$, we immediately get the existence of $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\nu|)$, satisfying (13). Again by Theorem 4.1, the uniqueness property (ii) is automatic, and we also have $|f| \le 1$, $|\nu|$-a.e. To prove the fact that we have in fact the equality $|f| = 1$, $|\nu|$-a.e., we define the set

\[ A = \{ x \in X : |f(x)| < 1 \}, \]

which belongs to $\mathcal{A}$, and we prove that $|\nu|(A) = 0$. If we define the sequence of sets $(A_n)_{n=1}^\infty \subset \mathcal{A}$, by

\[ A_n = \{ x \in X : |f(x)| \le 1 - \tfrac{1}{n} \}, \ \forall\, n \ge 1, \]

then we clearly have $A = \bigcup_{n=1}^\infty A_n$, so all we have to show is the fact that $|\nu|(A_n) = 0$, $\forall\, n \ge 1$. Fix $n \ge 1$. For every $B \in \mathcal{A}$, with $B \subset A_n$, we have

\[ |f(x)| \le 1 - \tfrac{1}{n}, \ \forall\, x \in B, \]

so using (13) we get

\[ |\nu(B)| = \Big| \int_B f \, d|\nu| \Big| \le \int_B |f| \, d|\nu| \le \int_B (1 - \tfrac{1}{n}) \, d|\nu| = (1 - \tfrac{1}{n}) |\nu|(B). \]

Now if we take an arbitrary pairwise disjoint sequence $(B_k)_{k=1}^\infty \subset \mathcal{A}$, with $\bigcup_{k=1}^\infty B_k = A_n$, then the above estimate will give

\[ \sum_{k=1}^\infty |\nu(B_k)| \le (1 - \tfrac{1}{n}) \sum_{k=1}^\infty |\nu|(B_k) = (1 - \tfrac{1}{n}) |\nu|(A_n). \]

Taking supremum in the left hand side, and using the definition of the variation measure, the above estimate will finally give

\[ |\nu|(A_n) \le (1 - \tfrac{1}{n}) |\nu|(A_n), \]

which clearly forces $|\nu|(A_n) = 0$.
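For a real-valued measure on a finite set, the polar decomposition reduces to taking signs of point masses. The following sketch assumes a hypothetical four-point space with all subsets measurable, where the variation measure is given by the absolute values of the point masses; on points of zero mass the value of the density is an arbitrary choice (here 1).

```python
from fractions import Fraction

# Hypothetical signed measure nu, given by its point masses.
nu = {"a": Fraction(-3), "b": Fraction(4), "c": Fraction(0), "d": Fraction(-1, 2)}

def sign(t):
    """Sign of t as an integer in {-1, 0, 1}."""
    return (t > 0) - (t < 0)

# Point masses of the variation measure |nu| in this discrete setting.
total_variation = {x: abs(m) for x, m in nu.items()}

# Polar density: the sign of the point mass, and 1 on |nu|-null points.
f = {x: sign(m) if m != 0 else 1 for x, m in nu.items()}

def integrate(g, mass, A):
    """Integral of g over A with respect to the measure with point masses `mass`."""
    return sum((g[x] * mass[x] for x in A), Fraction(0))

A = ["a", "b", "c"]
nu_A = sum((nu[x] for x in A), Fraction(0))
print(nu_A, integrate(f, total_variation, A))
```

The density has modulus one everywhere, and changing it on the null point `"c"` would not affect equality (13), which is the discrete face of the essential uniqueness in (ii).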

Remark 4.2. The case $\mathbb{K} = \mathbb{R}$ can be slightly generalized, to include the case of infinite signed measures. If $\nu$ is a signed measure on $\mathcal{A}$ and if we consider the Hahn-Jordan set decomposition $(X^+, X^-)$, then the density $f$ is simply the function

\[ f(x) = \begin{cases} 1 & \text{if } x \in X^+ \\ -1 & \text{if } x \in X^- \end{cases} \]

The equality (13) will then hold only for those sets $A \in \mathcal{A}$ with $|\nu|(A) < \infty$. Since $|\nu|$ is allowed to be infinite, as explained in Example 4.1, the uniqueness property (ii) will hold only with “$|\nu|$-l.a.e.” in place of “$|\nu|$-a.e.” Likewise, the absolute value property (i) will have to be replaced with “$|f| = 1$, $|\nu|$-l.a.e.”

Comment. Up to this point, it seems that the hypotheses from Theorem 4.1 are essential, particularly the dominance condition $|\nu| \le C\mu$. It is worth discussing this property in a bit more detail, especially having in mind that we plan to weaken it as much as possible.

Notation. Suppose $\mathcal{A}$ is a $\sigma$-algebra on some non-empty set $X$, and suppose $\mu$ and $\nu$ are “honest” (not necessarily finite) measures on $\mathcal{A}$. We shall write

\[ \nu \Subset \mu, \]

if there exists some constant $C > 0$, such that

\[ \nu(A) \le C\mu(A), \ \forall\, A \in \mathcal{A}. \]

A few steps in the proof of Theorem 4.1 hold even without the finiteness assumption, as indicated by the following.


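Exercise 1*. Suppose $\mathcal{A}$ is a $\sigma$-algebra on some non-empty set $X$, and suppose $\mu$ and $\nu$ are “honest” measures on $\mathcal{A}$. Prove the following.

(i) If $\nu \Subset \mu$, then one has the inclusions

\[ N_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset N_{\mathbb{K}}(X, \mathcal{A}, \nu) \quad \text{and} \quad L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset L^p_{\mathbb{K}}(X, \mathcal{A}, \nu), \ \forall\, p \in [1, \infty). \]

Consequently (see the proof of Theorem 4.1) one has linear maps

\[ L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto h \in L^p_{\mathbb{K}}(X, \mathcal{A}, \nu), \ \forall\, p \in [1, \infty). \]

Show that these linear maps are continuous.

(ii) Conversely, assuming one has the inclusion

\[ L^{p_0}_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset L^{p_0}_{\mathbb{K}}(X, \mathcal{A}, \nu), \]

for some $p_0 \in [1, \infty)$, prove that $\nu \Subset \mu$.

Hint: To prove (ii) show first one has the inclusion $L^1_+(X, \mathcal{A}, \mu) \subset L^1_+(X, \mathcal{A}, \nu)$. Then show that the quantity

\[ C = \sup\Big\{ \int_X h \, d\nu : h \in L^1_+(X, \mathcal{A}, \mu), \ \int_X h \, d\mu \le 1 \Big\} \]

is finite. If $C = \infty$, there exists some sequence $(h_n)_{n=1}^\infty \subset L^1_+(X, \mathcal{A}, \mu)$, with

\[ \int_X h_n \, d\mu \le 1 \quad \text{and} \quad \int_X h_n \, d\nu \ge 4^n, \ \forall\, n \ge 1. \]

Consider then the series $\sum_{n=1}^\infty \tfrac{1}{2^n} h_n$, and get a contradiction. Finally prove that $\nu(A) \le C\mu(A)$, $\forall\, A \in \mathcal{A}$.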

It is the moment now to introduce the following relation, which is a highly non-trivial weakening of the relation $\Subset$.

Definition. Let $\mathcal{A}$ be a $\sigma$-algebra on some non-empty set $X$, and suppose $\mu$ and $\nu$ are “honest” (not necessarily finite) measures on $\mathcal{A}$. We say that $\nu$ is absolutely continuous with respect to $\mu$, if for every $A \in \mathcal{A}$, one has the implication

\[ \mu(A) = 0 \implies \nu(A) = 0. \tag{14} \]

In this case we are going to use the notation

\[ \nu \ll \mu. \]

It is obvious that one always has the implication

\[ \nu \Subset \mu \ \Rightarrow \ \nu \ll \mu. \]

Remarks 4.3. Let $(X, \mathcal{A}, \mu)$ be a measure space.

A. If $\nu$ is an “honest” measure on $\mathcal{A}$, which has the Radon-Nikodym property relative to $\mu$, then $\nu \ll \mu$. This is pretty obvious, since if we pick $f : X \to [0, \infty]$ to be a density for $\nu$ relative to $\mu$, then for every $A \in \mathcal{A}$ with $\mu(A) = 0$, we have $f\kappa_A = 0$, $\mu$-a.e., so we get

\[ \nu(A) = \int_A f \, d\mu = \int_X f\kappa_A \, d\mu = 0. \]

B. For an “honest” measure $\nu$ on $\mathcal{A}$, the relation $\nu \ll \mu$ is equivalent to the inclusion

\[ N_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset N_{\mathbb{K}}(X, \mathcal{A}, \nu). \]

By Exercise 1, this already suggests that the relation $\ll$ is much weaker than $\Subset$ (see Exercise 2 below).

C. If $\nu$ is either a signed or a complex measure on $\mathcal{A}$, then the following are equivalent:

(i) the variation measure $|\nu|$ is absolutely continuous with respect to $\mu$;

(ii) for every $A \in \mathcal{A}$, one has the implication (14).

The implication (i) ⇒ (ii) is trivial, since one has

\[ |\nu(A)| \le |\nu|(A), \ \forall\, A \in \mathcal{A}. \]

The implication (ii) ⇒ (i) is also clear, since if we start with some $A \in \mathcal{A}$ with $\mu(A) = 0$, then we get $|\nu(B)| = 0$, for all $B \in \mathcal{A}$ with $B \subset A$, and then arguing exactly as in the proof of Proposition 4.5, we get $|\nu|(A) = 0$.

Convention. Using Remark 4.3.C, we extend the definition of absolute continuity, and the notation $\nu \ll \mu$, to include the case when $\nu$ is either a signed measure, or a complex measure on $\mathcal{A}$. In other words, the notation $\nu \ll \mu$ means that $|\nu| \ll \mu$.

The following technical result is key for the second Radon-Nikodym Theorem.

Lemma 4.1. Let $(X, \mathcal{A}, \mu)$ be a finite measure space, and let $\nu$ be an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$. Then there exists a sequence $(\nu_n)_{n=1}^\infty$, of “honest” measures on $\mathcal{A}$, such that

(i) $\nu_n \Subset \mu$, $\forall\, n \ge 1$; in particular the measures $\nu_n$, $n \ge 1$ are all finite;

(ii) $\nu_1 \le \nu_2 \le \dots$;

(iii) $\lim_{n\to\infty} \nu_n(A) = \nu(A)$, $\forall\, A \in \mathcal{A}$.

Proof. Let us define

\[ \nu_n = (n\mu) \wedge \nu, \ \forall\, n \ge 1. \]

Recall (see III.8, the Lattice Property; it is essential here that one of the measures, namely $n\mu$, is finite) that by construction $\nu_n$ has the following properties:

(a) $\nu_n \le n\mu$ and $\nu_n \le \nu$;

(b) whenever $\omega$ is a measure with $\omega \le n\mu$ and $\omega \le \nu$, it follows that $\omega \le \nu_n$.

Property (a) above already gives condition (i). It will be helpful to notice that property (a) also gives the inequality

\[ \nu_n \le \nu, \ \forall\, n \ge 1. \tag{15} \]

The monotonicity condition (ii) is now trivial, since by (b) the inequalities $\nu_{n-1} \le (n-1)\mu \le n\mu$ and $\nu_{n-1} \le \nu$, imply $\nu_{n-1} \le (n\mu) \wedge \nu = \nu_n$.

To derive property (iii), it will be helpful to recall the actual definition of the operation $\wedge$. Fix for the moment $n \ge 1$. One first considers the signed measure $\lambda_n = n\mu - \nu$, and its Hahn-Jordan decomposition $\lambda_n = \lambda_n^+ - \lambda_n^-$. In our case, we get $\lambda_n^+ \le n\mu$ and $\lambda_n^- \le \nu$. With these notations the measures $\nu_n$ are defined by $\nu_n = n\mu - \lambda_n^+$, $\forall\, n \ge 1$. If we fix, for each $n \ge 1$, a Hahn-Jordan set decomposition $(X_n^+, X_n^-)$ for $X$ relative to $\lambda_n$, then we have

\[ \nu_n(A) = \nu(A \cap X_n^+) + n\mu(A \cap X_n^-), \ \forall\, A \in \mathcal{A}, \ n \ge 1. \tag{16} \]

Consider then the sets $X_\infty^+ = \bigcup_{n=1}^\infty X_n^+$ and $X_\infty^- = \bigcap_{n=1}^\infty X_n^-$. It is clear that $X_\infty^\pm \in \mathcal{A}$, and $X_\infty^- = X \smallsetminus X_\infty^+$.

Fix now a set $A \in \mathcal{A}$, and let us prove the equality (iii). On the one hand, the obvious inclusions $X_n^- \supset X_\infty^-$, combined with (16), give the inequalities

\[ \nu_n(A) \ge \nu(A \cap X_n^+) + n\mu(A \cap X_\infty^-), \ \forall\, n \ge 1. \tag{17} \]

On the other hand, since $\lambda_{n+1} = \mu + \lambda_n$, $\forall\, n \ge 1$, using Lemma III.8.2, we get the relations

\[ X_1^+ \underset{\mu}{\subset} X_2^+ \underset{\mu}{\subset} \dots \]

(Recall that the notation $D \underset{\mu}{\subset} E$ stands for $\mu(D \smallsetminus E) = 0$.) Since $\nu \ll \mu$, we also have the relations

\[ A \cap X_1^+ \underset{\nu}{\subset} A \cap X_2^+ \underset{\nu}{\subset} \dots, \]

so using Proposition III.4.3, one gets the equality

\[ \nu(A \cap X_\infty^+) = \lim_{n\to\infty} \nu(A \cap X_n^+). \]

Combining this with the inequalities (15) and (17) then yields the inequality

\[ \nu(A) \ge \limsup_{n\to\infty} \nu_n(A) \ge \liminf_{n\to\infty} \nu_n(A) \ge \nu(A \cap X_\infty^+) + \lim_{n\to\infty} \big[ n\mu(A \cap X_\infty^-) \big]. \tag{18} \]

There are two possibilities here.

Case I: $\mu(A \cap X_\infty^-) > 0$. In this case, the estimate (18) forces

\[ \nu(A) = \limsup_{n\to\infty} \nu_n(A) = \liminf_{n\to\infty} \nu_n(A) = \infty. \]

Case II: $\mu(A \cap X_\infty^-) = 0$. In this case, using absolute continuity, we get $\nu(A \cap X_\infty^-) = 0$, and the equality $A = (A \cap X_\infty^+) \cup (A \cap X_\infty^-)$ yields

\[ \nu(A) = \nu(A \cap X_\infty^+). \]

Then (18) forces

\[ \limsup_{n\to\infty} \nu_n(A) = \liminf_{n\to\infty} \nu_n(A) = \nu(A). \]

In either case, the conclusion is the same: $\lim_{n\to\infty} \nu_n(A) = \nu(A)$.
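The truncation $\nu_n = (n\mu) \wedge \nu$ becomes very concrete on a finite set. The sketch below rests on an assumption that is not taken from the notes: on a finite set with all subsets measurable, the meet of two measures is given by the pointwise minimum of their point masses. The data `mu`, `nu` and the helpers `nu_n`, `total` are hypothetical.

```python
import math

# Hypothetical point masses; nu vanishes wherever mu does, so nu << mu.
mu = {"a": 1.0, "b": 0.5, "c": 0.0, "d": 2.0}
nu = {"a": 3.0, "b": math.inf, "c": 0.0, "d": 0.25}

def nu_n(n):
    """Point masses of (n*mu) meet nu, assumed here to be a pointwise minimum."""
    return {x: min(n * mu[x], nu[x]) for x in mu}

def total(mass, A):
    """Measure of the set A for a measure given by point masses."""
    return sum(mass[x] for x in A)

A = ["a", "d"]
for n in (1, 2, 3, 4):
    print(n, total(nu_n(n), A), nu_n(n)["b"])
```

On the set `A` the truncations reach $\nu(A)$ after finitely many steps, which corresponds to Case II of the proof, while on the point `"b"`, where $\nu$ is infinite and $\mu$ is positive, the values $n\mu$ grow without bound, which corresponds to Case I.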

After the above preparation, we are now in position to prove the following.

Theorem 4.2 (Radon-Nikodym Theorem: the finite case). Let $(X, \mathcal{A}, \mu)$ be a finite measure space.

A. If $\nu$ is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that

\[ \nu(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}. \tag{19} \]

Moreover, such a function is essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions, that satisfy (19), it follows that $f_1 = f_2$, $\mu$-a.e.

B. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. If $\lambda$ is a $\mathbb{K}$-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

\[ \lambda(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}. \tag{20} \]

Moreover:

(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (20) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (20), it follows that $f_1 = f_2$, $\mu$-a.e.

(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (20), then the variation measure $|\lambda|$ of $\lambda$ is given by

\[ |\lambda|(A) = \int_A |f| \, d\mu, \ \forall\, A \in \mathcal{A}. \]


Proof. A. Use Lemma 4.1 to find a sequence $(\nu_n)_{n=1}^\infty$ of “honest” measures on $\mathcal{A}$, such that

• $\nu_n \Subset \mu$, $\forall\, n \ge 1$; in particular the measures $\nu_n$, $n \ge 1$ are all finite;
• $\nu_1 \le \nu_2 \le \dots$;
• $\lim_{n\to\infty} \nu_n(A) = \nu(A)$, $\forall\, A \in \mathcal{A}$.

For each $n \ge 1$, we apply the “Easy” Radon-Nikodym Theorem 4.1, to find some measurable function $f_n : X \to \mathbb{R}$, such that

\[ \nu_n(A) = \int_A f_n \, d\mu, \ \forall\, A \in \mathcal{A}. \]

Claim: The sequence $(f_n)_{n=1}^\infty$ satisfies

\[ 0 \le f_n \le f_{n+1}, \ \mu\text{-a.e.}, \ \forall\, n \ge 1. \]

Fix $n \ge 1$. On the one hand, since the $\nu_n$'s are “honest” finite measures, and $\nu_n \Subset \mu$, by part (i) of Theorem 4.1, it follows that $f_n \ge 0$, $\mu$-a.e. On the other hand, since $\nu_{n+1} - \nu_n$ is also an “honest” finite measure with $\nu_{n+1} - \nu_n \Subset \mu$, and with density $f_{n+1} - f_n$, again by part (i) of Theorem 4.1, it follows that $f_{n+1} - f_n \ge 0$, $\mu$-a.e.

Having proven the above Claim, let us define the function $f : X \to [0, \infty]$, by

\[ f(x) = \liminf_{n\to\infty} \big[ \max\{ f_n(x), 0 \} \big], \ \forall\, x \in X. \]

It is obvious that $f$ is measurable. By the Claim, we have in fact the equality

\[ f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n. \]

Since we also have

\[ f\kappa_A = \mu\text{-a.e.-}\lim_{n\to\infty} f_n \kappa_A, \ \forall\, A \in \mathcal{A}, \]

using the Claim and the Monotone Convergence Theorem, we get

\[ \int_A f \, d\mu = \int_X f\kappa_A \, d\mu = \lim_{n\to\infty} \int_X f_n \kappa_A \, d\mu = \lim_{n\to\infty} \int_A f_n \, d\mu = \lim_{n\to\infty} \nu_n(A) = \nu(A), \ \forall\, A \in \mathcal{A}. \]

Having shown that $f$ satisfies (19), let us observe that the uniqueness property stated in part A is a consequence of Proposition 4.2.

B. Let $\lambda$ be a $\mathbb{K}$-valued measure. In particular, the variation measure $|\lambda|$ is finite, so by the Polar Decomposition (Proposition 4.5) there exists some measurable function $h : X \to \mathbb{K}$, such that

\[ \lambda(A) = \int_A h \, d|\lambda|, \ \forall\, A \in \mathcal{A}, \tag{21} \]

and such that $|h| = 1$, $|\lambda|$-a.e. Replacing $h$ with the measurable function $h' : X \to \mathbb{K}$, defined by

\[ h'(x) = \begin{cases} h(x) & \text{if } |h(x)| = 1 \\ 1 & \text{if } |h(x)| \ne 1 \end{cases} \]

we can assume that in fact we have

\[ |h(x)| = 1, \ \forall\, x \in X. \]

Apply then part A, to the measure $|\lambda|$, which is again absolutely continuous with respect to $\mu$, to find some measurable function $g : X \to [0, \infty]$, such that

\[ |\lambda|(A) = \int_A g \, d\mu, \ \forall\, A \in \mathcal{A}. \]

Remark that, since

\[ \int_X g \, d\mu = |\lambda|(X) < \infty, \]

it follows that $g \in L^1_+(X, \mathcal{A}, \mu)$. Fix for the moment some set $A \in \mathcal{A}$. On the one hand, since

\[ |h\kappa_A| \le 1, \tag{22} \]

and $|\lambda|$ is finite, it follows that $h\kappa_A \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\lambda|)$. On the other hand, since $g \in L^1_+(X, \mathcal{A}, \mu)$, using (22) we get the fact that $h\kappa_A g \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Using the Change of Variable formula (Proposition 4.1) we then get the equality

\[ \int_X h\kappa_A \, d|\lambda| = \int_X h\kappa_A g \, d\mu, \]

which by (21) reads:

\[ \lambda(A) = \int_A hg \, d\mu. \]

Now the function $f_0 = hg$ (which has $|f_0| = g$) belongs to $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and clearly satisfies (20).

To prove the uniqueness property (i), we start with two functions $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ which satisfy

\[ \int_A f_1 \, d\mu = \int_A f_2 \, d\mu = \lambda(A), \ \forall\, A \in \mathcal{A}. \]

If we define the function $\varphi = f_1 - f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, then we clearly have

\[ \int_A \varphi \, d\mu = \int_A 0 \, d\mu = \omega(A), \ \forall\, A \in \mathcal{A}, \]

where $\omega$ is the zero measure. Since $\omega \le \mu$, using Theorem 4.1 it follows that $\varphi = 0$, $\mu$-a.e.

To prove (ii) we start with some $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ that satisfies (20), and we use the uniqueness property (i) to get the equality $f = f_0$, $\mu$-a.e., where $f_0$ is the function constructed above. In particular, using the construction of $f_0$, the fact that $|f_0| = g$, and the fact that $g$ is a density for $|\lambda|$ relative to $\mu$, we get

\[ \int_A |f| \, d\mu = \int_A |f_0| \, d\mu = \int_A g \, d\mu = |\lambda|(A), \ \forall\, A \in \mathcal{A}. \]

At this point we would like to go further, beyond the finite case. The following generalization of Theorem 4.2 is pretty straightforward.

Corollary 4.1 (Radon-Nikodym Theorem: the $\sigma$-finite case). Let $(X, \mathcal{A}, \mu)$ be a $\sigma$-finite measure space.

A. If $\nu$ is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that

\[ \nu(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}. \tag{23} \]

Moreover, such a function is essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions, that satisfy (23), it follows that $f_1 = f_2$, $\mu$-a.e.

B. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. If $\lambda$ is a $\mathbb{K}$-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

\[ \lambda(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}. \tag{24} \]

Moreover:

(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (24) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (24), it follows that $f_1 = f_2$, $\mu$-a.e.

(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (24), then the variation measure $|\lambda|$ of $\lambda$ is given by

\[ |\lambda|(A) = \int_A |f| \, d\mu, \ \forall\, A \in \mathcal{A}. \]

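Proof. Since $\mu$ is $\sigma$-finite, there exists a sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$, with $\bigcup_{n=1}^\infty A_n = X$. Put $X_1 = A_1$ and $X_n = A_n \smallsetminus (A_1 \cup \dots \cup A_{n-1})$, $\forall\, n \ge 2$. Then $(X_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$ is pairwise disjoint, and we still have $\bigcup_{n=1}^\infty X_n = X$. The Corollary follows then immediately from Theorem 4.2, applied to the measure spaces $(X_n, \mathcal{A}\big|_{X_n}, \mu\big|_{X_n})$ and the measures $\nu\big|_{X_n}$ and $\lambda\big|_{X_n}$ respectively. What is used here is the fact that, if $K$ denotes one of the sets $[0, \infty]$, $\mathbb{R}$ or $\mathbb{C}$, then for a function $f : X \to K$ the fact that $f$ is measurable, is equivalent to the fact that $f\big|_{X_n}$ is measurable for each $n \ge 1$. Moreover, given two functions $f_1, f_2 : X \to K$, the condition $f_1 = f_2$, $\mu$-a.e. is equivalent to the fact that $f_1\big|_{X_n} = f_2\big|_{X_n}$, $\mu$-a.e., $\forall\, n \ge 1$. Finally, for $f : X \to \mathbb{K}$ $(= \mathbb{R}, \mathbb{C})$, the condition $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, is equivalent to the fact that $f\big|_{X_n} \in L^1_{\mathbb{K}}(X_n, \mathcal{A}\big|_{X_n}, \mu\big|_{X_n})$, $\forall\, n \ge 1$, and

\[ \sum_{n=1}^\infty \int_{X_n} \big| f\big|_{X_n} \big| \, d\big(\mu\big|_{X_n}\big) < \infty. \]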
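The disjointification step $X_1 = A_1$, $X_n = A_n \smallsetminus (A_1 \cup \dots \cup A_{n-1})$ used at the start of the proof is a purely set-theoretic device. A minimal sketch with Python sets, where the function name `disjointify` and the sample cover are hypothetical:

```python
def disjointify(sets):
    """Return X_1 = A_1 and X_n = A_n minus (A_1 u ... u A_{n-1}), in order."""
    seen, pieces = set(), []
    for A in sets:
        pieces.append(set(A) - seen)   # keep only the points not covered so far
        seen |= set(A)
    return pieces

# A hypothetical overlapping cover of {1, ..., 5}.
cover = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
pieces = disjointify(cover)
print(pieces)
```

The pieces are pairwise disjoint, each is contained in the corresponding original set (so finiteness of measure is inherited), and their union is unchanged; some pieces may be empty, which is harmless for the argument.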

Comment. The $\sigma$-finite case of the Radon-Nikodym Theorem, given above, is in fact a particular case of a more general version (Theorem 4.3 below). In order to formulate this, we need a concept which has already appeared earlier in III.5. Recall that a measure space $(X, \mathcal{A}, \mu)$ is said to be decomposable, if there exists a pairwise disjoint subcollection $\mathcal{F} \subset \mathcal{A}^\mu_{\mathrm{fin}}$, such that

(i) $\bigcup_{F \in \mathcal{F}} F = X$;

(ii) for a set $A \subset X$, the condition $A \in \mathcal{A}$ is equivalent to the condition

\[ A \cap F \in \mathcal{A}, \ \forall\, F \in \mathcal{F}; \]

(iii) one has the equality

\[ \mu(A) = \sum_{F \in \mathcal{F}} \mu(A \cap F), \ \forall\, A \in \mathcal{A}^\mu_{\mathrm{fin}}. \]

Such a collection $\mathcal{F}$ is then called a decomposition of $(X, \mathcal{A}, \mu)$. Condition (ii) is referred to as the patching property, because it characterizes measurability as follows.


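(p) Given a measurable space $(Y, \mathcal{B})$, a function $f : (X, \mathcal{A}) \to (Y, \mathcal{B})$ is measurable, if and only if all restrictions $f\big|_F : (F, \mathcal{A}\big|_F) \to (Y, \mathcal{B})$, $F \in \mathcal{F}$, are measurable.

Theorem 4.3 (Radon-Nikodym Theorem: the decomposable case). Let $(X, \mathcal{A}, \mu)$ be a decomposable measure space. Let $\mathcal{A}^\mu_{\sigma\text{-fin}}$ be the collection of all $\mu$-$\sigma$-finite sets in $\mathcal{A}$, that is,

\[ \mathcal{A}^\mu_{\sigma\text{-fin}} = \Big\{ A \in \mathcal{A} : \text{there exists } (A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}, \text{ with } A = \bigcup_{n=1}^\infty A_n \Big\}. \]

A. If $\nu$ is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that

\[ \nu(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{25} \]

Moreover, such a function is locally essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions, that satisfy (25), it follows that $f_1 = f_2$, $\mu$-l.a.e.

B. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. If $\lambda$ is a $\mathbb{K}$-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that

\[ \lambda(A) = \int_A f \, d\mu, \ \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{26} \]

Moreover:

(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (26) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (26), it follows that $f_1 = f_2$, $\mu$-a.e.

(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (26), then the variation measure $|\lambda|$, of $\lambda$, satisfies

\[ |\lambda|(A) = \int_A |f| \, d\mu, \ \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \]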

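Proof. Fix $\mathcal{F}$ to be a decomposition for $(X, \mathcal{A}, \mu)$.

A. For every $F \in \mathcal{F}$, we apply Theorem 4.2 to the measure space $(F, \mathcal{A}\big|_F, \mu\big|_F)$ and the measure $\nu\big|_F$, to find some measurable function $f_F : F \to [0, \infty]$, such that

\[ \nu(A) = \int_A f_F \, d\mu, \ \forall\, A \in \mathcal{A}\big|_F. \]

Using the patching property, there exists a measurable function $f : X \to [0, \infty]$, such that $f\big|_F = f_F$, $\forall\, F \in \mathcal{F}$. The key feature we are going to prove is a particular case of (25).

Claim 1: $\nu(A) = \int_A f \, d\mu$, $\forall\, A \in \mathcal{A}^\mu_{\mathrm{fin}}$.

Fix $A \in \mathcal{A}^\mu_{\mathrm{fin}}$. On the one hand, we know that

\[ \mu(A) = \sum_{F \in \mathcal{F}} \mu(A \cap F). \]

Since the sum is finite, it follows that the subcollection

\[ \mathcal{F}(A) = \{ F \in \mathcal{F} : \mu(A \cap F) > 0 \} \]

is at most countable. We then form the set $\tilde{A} = \bigcup_{F \in \mathcal{F}(A)} [A \cap F]$, which is clearly a subset of $A$. The difference $D = A \smallsetminus \tilde{A}$ has again $\mu(D) < \infty$, so its measure is also given as

\[ \mu(D) = \sum_{F \in \mathcal{F}} \mu(D \cap F). \]

Notice however that we have $\mu(D \cap F) = 0$, $\forall\, F \in \mathcal{F}$. (If $F \in \mathcal{F}(A)$, we already have $D \cap F = \emptyset$, whereas if $F \in \mathcal{F} \smallsetminus \mathcal{F}(A)$, we have $D \cap F \subset A \cap F$, with $\mu(A \cap F) = 0$.) Using then the above equality, we get $\mu(D) = 0$. By absolute continuity we also get $\nu(D) = 0$. Using the equality $A = \tilde{A} \cup D$, and $\sigma$-additivity (it is essential here that $\mathcal{F}(A)$ is countable), it follows that

\[ \nu(A) = \nu(\tilde{A}) = \sum_{F \in \mathcal{F}(A)} \nu(A \cap F). \]

Using the defining property of the functions $f_F$, we then get

\[ \nu(A) = \sum_{F \in \mathcal{F}(A)} \int_{A \cap F} f \, d\mu. \tag{27} \]

Now if we list $\mathcal{F}(A) = \{F_k\}_{k=1}^\infty$, and if we take a partial sum, we have

\[ \sum_{k=1}^n \int_{A \cap F_k} f \, d\mu = \int_{G_n} f \, d\mu = \int_X f\kappa_{G_n} \, d\mu, \]

where

\[ G_n = \bigcup_{k=1}^n [A \cap F_k], \ \forall\, n \ge 1. \]

It is clear that we have

• $f\kappa_{G_1} \le f\kappa_{G_2} \le \dots$,
• $\lim_{n\to\infty} (f\kappa_{G_n})(x) = (f\kappa_{\tilde{A}})(x)$, $\forall\, x \in X$,

so using the Monotone Convergence Theorem, it follows that

\[ \lim_{n\to\infty} \int_X f\kappa_{G_n} \, d\mu = \int_X f\kappa_{\tilde{A}} \, d\mu = \int_{\tilde{A}} f \, d\mu. \]

Using (27) we then get

\[ \nu(A) = \lim_{n\to\infty} \int_X f\kappa_{G_n} \, d\mu = \int_{\tilde{A}} f \, d\mu. \]

On the other hand, since $\mu(A \smallsetminus \tilde{A}) = 0$, it follows that

\[ \int_{\tilde{A}} f \, d\mu = \int_A f \, d\mu, \]

so the preceding equality immediately gives the desired equality

\[ \nu(A) = \int_A f \, d\mu. \]

At this point let us remark that the local almost uniqueness of $f$ already follows from Remark 4.1.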

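Let us prove now the equality (25). Start with some set $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, and choose a sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\text{fin}}$, such that $A = \bigcup_{n=1}^\infty A_n$. Define the sequence $(B_n)_{n=1}^\infty$ by

$$B_n = A_1 \cup \dots \cup A_n, \quad \forall n \geq 1,$$

so that we still have $B_n \in \mathcal{A}^\mu_{\text{fin}}$, $\forall n \geq 1$, as well as $A = \bigcup_{n=1}^\infty B_n$, but moreover we have $B_1 \subset B_2 \subset \dots$. For each $n \geq 1$, using Claim 1, we have the equality

$$\nu(B_n) = \int_{B_n} f\, d\mu.$$

Using these equalities, combined with

• $0 \leq f \kappa_{B_1} \leq f \kappa_{B_2} \leq \dots$,
• $\lim_{n \to \infty} (f \kappa_{B_n})(x) = (f \kappa_A)(x)$, $\forall x \in X$,

the Monotone Convergence Theorem, combined with the continuity of $\nu$ along increasing sequences of sets, yields

$$\int_A f\, d\mu = \int_X f \kappa_A\, d\mu = \lim_{n \to \infty} \int_X f \kappa_{B_n}\, d\mu = \lim_{n \to \infty} \int_{B_n} f\, d\mu = \lim_{n \to \infty} \nu(B_n) = \nu(A).$$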

B. We start off by choosing a measurable function $h : X \to \mathbb{K}$, with $|h| = 1$, such that

$$\lambda(A) = \int_A h\, d|\lambda|, \quad \forall A \in \mathcal{A}.$$

Using part A, there exists some measurable function $g_0 : X \to [0, \infty]$, such that

$$|\lambda|(A) = \int_A g_0\, d\mu, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{28}$$

At this point, $g_0$ may not be integrable, but we have the freedom to perturb it ($\mu$-l.a.e.) to try to make it integrable. This is done as follows. Consider the collection

$$\mathcal{F}_0 = \{ F \in \mathcal{F} : |\lambda|(F) > 0 \}.$$

Since $|\lambda|$ is finite, it follows that $\mathcal{F}_0$ is at most countable. Define then the set $X_0 = \bigcup_{F \in \mathcal{F}_0} F \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. Since $X_0$ is $\mu$-$\sigma$-finite, every set $A \in \mathcal{A}$ with $A \subset X_0$ is $\mu$-$\sigma$-finite, so we have

$$|\lambda|(A) = \int_A g_0\, d\mu, \quad \forall A \in \mathcal{A}|_{X_0}.$$

Applying the σ-finite version of the Radon-Nikodym Theorem to the σ-finite measure space $(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$ and the finite measure $|\lambda|\big|_{X_0}$, it follows that the density $g_0|_{X_0}$ belongs to $L^1_+(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$, which means that the function $g = g_0 \kappa_{X_0}$ belongs to $L^1_+(X, \mathcal{A}, \mu)$. With this choice of $g$, let us prove now that the equality (28) still holds, with $g$ in place of $g_0$. Exactly as in the proof of part A, it suffices to prove only the equality

$$|\lambda|(A) = \int_A g\, d\mu, \quad \forall A \in \mathcal{A}^\mu_{\text{fin}}. \tag{29}$$

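Claim 2: $|\lambda|(A) = |\lambda|(A \cap X_0)$, $\forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$.

Since (use the fact that $|\lambda|$ is finite) the equality is equivalent to

$$|\lambda|(A \setminus X_0) = 0, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}},$$

it suffices to prove it only for $A \in \mathcal{A}^\mu_{\text{fin}}$. If $A \in \mathcal{A}^\mu_{\text{fin}}$, using the properties of the decomposition $\mathcal{F}$, we have

$$\begin{aligned}
|\lambda|(A) &= \sum_{F \in \mathcal{F}} |\lambda|(A \cap F) = \sum_{F \in \mathcal{F}_0} |\lambda|(A \cap F) + \sum_{F \in \mathcal{F} \setminus \mathcal{F}_0} |\lambda|(A \cap F) \\
&= |\lambda|\Big( \bigcup_{F \in \mathcal{F}_0} [A \cap F] \Big) + \sum_{F \in \mathcal{F} \setminus \mathcal{F}_0} |\lambda|(A \cap F) \\
&= |\lambda|(A \cap X_0) + \sum_{F \in \mathcal{F} \setminus \mathcal{F}_0} |\lambda|(A \cap F).
\end{aligned}$$

Notice now that, for $F \in \mathcal{F} \setminus \mathcal{F}_0$, we have $|\lambda|(F) = 0$, which gives $|\lambda|(A \cap F) = 0$, so the Claim follows immediately from the above computation.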

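Having proven the above Claim, let us prove now (29). Fix $A \in \mathcal{A}^\mu_{\text{fin}}$. The desired equality is now immediate from Claim 2, combined with (28):

$$\begin{aligned}
|\lambda|(A) = |\lambda|(A \cap X_0) &= \int_{A \cap X_0} g_0\, d\mu = \int_X g_0 \kappa_{A \cap X_0}\, d\mu \\
&= \int_X g_0 \kappa_{X_0} \kappa_A\, d\mu = \int_X g \kappa_A\, d\mu = \int_A g\, d\mu.
\end{aligned}$$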

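Define now the function $f_0 = hg$. Since $|f_0| = g \in L^1_+(X, \mathcal{A}, \mu)$, it follows that $f_0 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Let us prove that $f_0$ satisfies the equality (26). Start with some $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. On the one hand, using Claim 2, we have

$$|\lambda(A \setminus X_0)| \leq |\lambda|(A \setminus X_0) = 0,$$

so we get $\lambda(A) = \lambda(A \cap X_0)$. Using the σ-finite version of the Radon-Nikodym Theorem for $(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$ and $\lambda|_{X_0}$, we then have

$$\begin{aligned}
\lambda(A) = \lambda(A \cap X_0) &= \int_{A \cap X_0} h g_0\, d\mu = \int_X h g_0 \kappa_{A \cap X_0}\, d\mu \\
&= \int_X h g_0 \kappa_{X_0} \kappa_A\, d\mu = \int_X h g \kappa_A\, d\mu = \int_A h g\, d\mu = \int_A f_0\, d\mu.
\end{aligned}$$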

We now prove the uniqueness property (i) of $f$ ($\mu$-a.e.!). Assume $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is another function, such that

$$\lambda(A) = \int_A f\, d\mu, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}}.$$

Claim 3: $f = f_0$, $\mu$-l.a.e.

What we need to show here is the fact that

$$f \kappa_B = f_0 \kappa_B, \ \mu\text{-a.e.}, \quad \forall B \in \mathcal{A}^\mu_{\text{fin}}.$$

But this follows immediately from the uniqueness from part B of Theorem 4.2, applied to the finite measure space $(B, \mathcal{A}|_B, \mu|_B)$ and the measure $\lambda|_B$, which has both $f|_B$ and $f_0|_B$ as densities.

Using Claim 3, we now have $f - f_0 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, with $f - f_0 = 0$, $\mu$-l.a.e., so we can apply Proposition 4.4, which forces $f - f_0 = 0$, $\mu$-a.e., so we indeed get $f = f_0$, $\mu$-a.e.

Property (ii) is obvious, since by (i), any function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ that satisfies (26) automatically satisfies $|f| = |f_0| = g$, $\mu$-a.e., and (28) holds with $g$ in place of $g_0$.


Comment. One should be aware of the (severe) limitations of Theorem 4.3, notably the fact that the equalities (25) and (26) hold only for $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. For example, consider the measure space $(X, \mathcal{P}(X), \mu)$, with $X$ uncountable, and $\mu$ defined by

$$\mu(A) = \begin{cases} \infty & \text{if } A \text{ is uncountable} \\ 0 & \text{if } A \text{ is countable} \end{cases}$$

This measure space is decomposable, with a decomposition consisting of singletons: $\mathcal{F} = \{ \{x\} : x \in X \}$. For a measure $\nu$ on $\mathcal{P}(X)$, the condition $\nu \ll \mu$ means precisely that $\nu(A) = 0$ for all countable subsets $A \subset X$. In this case the equality (25) says practically nothing, since it is restricted solely to countable sets $A \subset X$, when both sides are zero.

In this example, it is also instructive to analyze the case when $\nu$ is finite (see part B in Theorem 4.3). If we follow the proof of the Theorem, we see that at some point we have constructed a certain set $X_0 = \bigcup_{F \in \mathcal{F}_0} F$, where $\mathcal{F}_0 = \{ F \in \mathcal{F} : \nu(F) > 0 \}$. In our situation however it turns out that $X_0 = \emptyset$. This example brings up a very interesting question, which turns out to sit at the very foundation of set theory.

Question: Does there exist an uncountable set $X$, and a finite measure $\nu$ on $\mathcal{P}(X)$, such that $\nu(X) > 0$, but $\nu(A) = 0$, for every countable subset $A \subset X$?

(The above vanishing condition is of course equivalent to the fact that $\nu(\{x\}) = 0$, $\forall x \in X$.) It turns out that not only is the answer to this question unknown, but in fact several mathematicians are seriously thinking of proposing it as an axiom to be added to the current system of axioms used in set theory!

The limitations of Theorem 4.3 also force limitations in the Change of Variables property (see Proposition 4.1), which in this case has the following statement.

Proposition 4.6 (Local Change of Variables). Let $(X, \mathcal{A}, \mu)$ be a measure space, let $\nu$ be a measure on $\mathcal{A}$, and let $f : X \to [0, \infty]$ be a measurable function.

A. The following are equivalent:

(i) one has

$$\nu(A) = \int_A f\, d\mu, \quad \forall A \in \mathcal{A}^\mu_{\sigma\text{-fin}};$$

(ii) for every measurable function $h : X \to [0, \infty]$, with the property that the set $E_h = \{ x \in X : h(x) \neq 0 \}$ belongs to $\mathcal{A}^\mu_{\sigma\text{-fin}}$, one has the equality

$$\int_X h\, d\nu = \int_X h f\, d\mu. \tag{30}$$

B. If $\nu$ and $f$ are as above, and $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, then the equality (30) also holds for those measurable functions $h : X \to \mathbb{K}$ with $E_h \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, for which $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $hf \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

Proof. A. (i) ⇒ (ii). Assume (i) holds. Start with some measurable function $h : X \to [0, \infty]$, such that the set $E_h = \{ x \in X : h(x) \neq 0 \}$ belongs to $\mathcal{A}^\mu_{\sigma\text{-fin}}$. The equality (30) is then immediate from Proposition 4.1, applied to the measure space $(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has density $f|_{E_h}$.

(ii) ⇒ (i). Assume (ii) holds. If we start with some $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, then obviously the measurable function $h = \kappa_A$ will have $E_h = A$, so by (ii) we immediately get

$$\nu(A) = \int_X \kappa_A\, d\nu = \int_X \kappa_A f\, d\mu = \int_A f\, d\mu.$$

B. Assume now $\nu$ and $f$ satisfy the equivalent conditions (i) and (ii). Suppose $h : X \to \mathbb{K}$ is measurable, with $E_h \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, such that $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $hf \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Then the equality (30) follows again from Proposition 4.1, applied to the measure space $(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has density $f|_{E_h}$.

Appendix A

Zorn Lemma

In this Appendix we review basic set theoretical results, which are consequences of the following postulate:

Axiom of Choice. Given any non-empty collection¹⁷ $\{X_i : i \in I\}$ of non-empty sets, the cartesian product $\prod_{i \in I} X_i$ is non-empty.

Recall that the cartesian product is defined as

$$\prod_{i \in I} X_i = \Big\{ f : I \to \bigcup_{i \in I} X_i \ : \ f(i) \in X_i, \ \forall i \in I \Big\}.$$

In order to formulate several consequences of the Axiom of Choice, we need several concepts.

Definitions. Given a set $X$, by a relation on $X$ one means simply a subset $R \subset X \times X$. The standard notation for relations is:

$$x R y \iff (x, y) \in R.$$

An order relation on $X$ is a relation $\prec$ with the following properties:
• $x \prec x$, $\forall x \in X$;
• if $x, y, z \in X$ satisfy $x \prec y$ and $y \prec z$, then $x \prec z$;
• if $x, y \in X$ satisfy $x \prec y$ and $y \prec x$, then $x = y$.

In this case the pair $(X, \prec)$ is called an ordered set.

An ordered set $(X, \prec)$ is said to be totally ordered, if
• for any elements $x, y \in X$ one has either $x \prec y$ or $y \prec x$.

More generally, given an (arbitrary) ordered set $(X, \prec)$, by a totally ordered subset of $(X, \prec)$, one means a subset $T \subset X$, which becomes totally ordered with respect to the order relation $\prec|_T$.

Example A.1. Fix a set $M$, and take $X$ to be the collection of all subsets of $M$. Then $X$ carries a natural order relation defined by inclusion:

$$A \prec B \iff A \subset B.$$

A totally ordered subset $\mathcal{C}$ of $(X, \subset)$ is called a chain of subsets of $M$. Two subsets $A, B \subset M$ will be said to be comparable, if either $A \subset B$, or $B \subset A$, i.e. the collection $\{A, B\}$ is a chain of subsets of $M$.

Definition. Let $M$ be a set. A collection $\mathcal{F}$ of subsets of $M$ is said to have the chain property, if

(c) whenever $\mathcal{C} \subset \mathcal{F}$ is a chain, it follows that the union $\bigcup_{C \in \mathcal{C}} C$ also belongs to $\mathcal{F}$.

¹⁷ By a “collection of sets” one simply means a set whose elements are sets themselves.

Lemma A.1. Let $M$ be a set, let $\mathcal{F}$ be a collection of subsets of $M$ with the chain property. For every set $A \in \mathcal{F}$, the collection

$$\operatorname{comp}(A; \mathcal{F}) = \{ B \in \mathcal{F} : B \text{ comparable to } A \}$$

has the chain property.

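Proof. Let $\mathcal{C} \subset \operatorname{comp}(A; \mathcal{F})$ be a chain, and put $T = \bigcup_{C \in \mathcal{C}} C$. Since $\mathcal{F}$ has the chain property, we have $T \in \mathcal{F}$. To show that $T$ is comparable with $A$, we consider the two possibilities (which are exhaustive, since every $C \in \mathcal{C}$ is comparable with $A$):

Case 1: $A \supset C$, for all $C \in \mathcal{C}$. In this case we have $A \supset \bigcup_{C \in \mathcal{C}} C = T$.

Case 2: There exists $C_0 \in \mathcal{C}$, such that $A \subset C_0$. In this case we have $A \subset C_0 \subset T$.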

Lemma A.2. Let $M$ be some non-empty set, and let $\mathcal{F}$ be a non-empty collection of subsets of $M$, with the chain property. Suppose one has a map

$$\mathcal{F} \ni A \longmapsto x_A \in M,$$

with the property that

$$A \cup \{x_A\} \in \mathcal{F}, \quad \forall A \in \mathcal{F}.$$

Then there exists $A \in \mathcal{F}$ such that $x_A \in A$.

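Proof. For each $A \in \mathcal{F}$ we define $A^+ = A \cup \{x_A\}$. Call a subset $\mathcal{G} \subset \mathcal{F}$ inductive, if it has the chain property, and

(+) $A \in \mathcal{G} \Rightarrow A^+ \in \mathcal{G}$.

It is quite clear that if $\{\mathcal{G}_i : i \in I\}$ is a collection of inductive subsets of $\mathcal{F}$, then the intersection $\bigcap_{i \in I} \mathcal{G}_i$ is again an inductive subset of $\mathcal{F}$.

Fix now some subset $A_0 \in \mathcal{F}$, and define

$$\mathcal{G}_0 = \bigcap_{\substack{\mathcal{G} \text{ inductive} \\ A_0 \in \mathcal{G}}} \mathcal{G}.$$

Note that the subset $\mathcal{F}_0 = \{ A \in \mathcal{F} : A \supset A_0 \}$ is an inductive subset of $\mathcal{F}$, so in particular, $\mathcal{G}_0$ is non-empty, and $\mathcal{G}_0 \subset \mathcal{F}_0$, i.e.

(1) $A \supset A_0$, $\forall A \in \mathcal{G}_0$.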

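Claim: The set $\mathcal{G}_0$ is a chain.

What we need to prove is the fact that $\mathcal{G}_0$ is totally ordered by inclusion. Consider the set

$$\mathcal{T} = \{ T \in \mathcal{G}_0 : T \text{ is comparable with every } A \in \mathcal{G}_0 \} = \bigcap_{A \in \mathcal{G}_0} \operatorname{comp}(A; \mathcal{G}_0),$$

and we try to prove that $\mathcal{T} = \mathcal{G}_0$. By Lemma A.1 it is clear that $\mathcal{T}$ has the chain property. Using (1), it is clear that $A_0 \in \mathcal{T}$. Finally, we need to prove property (+). We prove this indirectly as follows. Fix $T \in \mathcal{T}$, consider the collection

$$\mathcal{V}_T = \operatorname{comp}(T^+; \mathcal{G}_0) = \{ A \in \mathcal{G}_0 : A \text{ comparable with } T^+ \},$$

and let us prove that $\mathcal{V}_T = \mathcal{G}_0$, by showing that $\mathcal{V}_T$ is an inductive set, and contains $A_0$. First of all, by Lemma A.1, it follows that $\mathcal{V}_T$ has the chain property. Secondly, using (1) we have $A_0 \subset T \subset T^+$, so $A_0 \in \mathcal{V}_T$. Finally, to check property (+), we start with some $V \in \mathcal{V}_T$, and we show that $V^+ \in \mathcal{V}_T$. In the case when $T^+ \subset V$, we are done, because we have $T^+ \subset V \subset V^+$. Assume $T^+ \not\subset V$, so that $V \subsetneq T^+$; since $T$ is comparable with $V$ and $T^+ = T \cup \{x_T\}$, this forces $V \subset T$. Since $T$ is comparable with $V^+$, we either have $V^+ \subset T$, in which case we are done, or we have $T \subset V^+$. In the latter case, we have

$$V \subset T \subset V^+.$$

Since $V^+ = V \cup \{x_V\}$, the above inclusions force either $T = V$, which gives $T^+ = V^+$, or $T = V^+$. Clearly, either case gives $V^+ \in \mathcal{V}_T$. Having shown that $\mathcal{V}_T$ is inductive, the inclusion $\mathcal{V}_T \subset \mathcal{G}_0$ will force the equality $\mathcal{V}_T = \mathcal{G}_0$. In turn, the definition of $\mathcal{V}_T$ proves that $T^+ \in \mathcal{T}$, so $\mathcal{T}$ is indeed inductive. Finally, the inclusion $\mathcal{T} \subset \mathcal{G}_0$ then forces $\mathcal{T} = \mathcal{G}_0$, and by the definition of $\mathcal{T}$, it follows that $\mathcal{G}_0$ is indeed a chain.

Having proven the Claim, we now take $A = \bigcup_{G \in \mathcal{G}_0} G$. Since $\mathcal{G}_0$ has the chain property, it follows that $A \in \mathcal{G}_0$. By construction we have

$$A \supset G, \quad \forall G \in \mathcal{G}_0.$$

In particular we have $A \supset A^+$, which clearly forces $x_A \in A$.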

Definitions. Let $(X, \prec)$ be an ordered set. By a maximal element for $X$ one means an element $x \in X$ with the property:

$$\{ y \in X : x \prec y \} = \{x\}.$$

In other words, this means that there is no element $y \in X$, with $x \prec y$ and $y \neq x$.

Given a subset $S \subset X$, an element $x \in X$ is said to be an upper bound for $S$, if

$$s \prec x, \quad \forall s \in S.$$

If such an $x$ exists, we say that $S$ has an upper bound. (It is not assumed that $x$ belongs to $S$!)
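To make the definitions of maximal element and upper bound concrete, here is a small Python sketch on a finite ordered set. The choice of the set `X` and of divisibility as the order relation `leq` is an illustrative assumption; in a finite setting no appeal to the Axiom of Choice is needed.

```python
def leq(x, y):
    """Order relation x ≺ y, taken here to be 'x divides y'."""
    return y % x == 0

X = [1, 2, 3, 4, 6, 8, 12]

def maximal_elements(X, leq):
    """Elements x for which {y in X : x ≺ y} = {x}."""
    return [x for x in X if all(y == x for y in X if leq(x, y))]

def upper_bounds(S, X, leq):
    """Elements x of X with s ≺ x for every s in S (x need not lie in S)."""
    return [x for x in X if all(leq(s, x) for s in S)]

def is_totally_ordered(T, leq):
    """True if any two elements of T are comparable."""
    return all(leq(a, b) or leq(b, a) for a in T for b in T)

maxima = maximal_elements(X, leq)        # 8 and 12: neither divides another element
bounds_23 = upper_bounds([2, 3], X, leq)  # 6 and 12
bounds_max = upper_bounds([8, 12], X, leq)  # empty: no common multiple lies in X
```

Note that this ordered set has two maximal elements and no largest element, and that the totally ordered subset `[1, 2, 4, 8]` has an upper bound (namely 8), as required by the hypothesis of the Zorn Lemma.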

Lemma A.3 (“Easy” Zorn Lemma). Let $M$ be a set, and let $\mathcal{F}$ be a collection of subsets of $M$. Assume
• the Axiom of Choice is true;
• $\mathcal{F}$ has the chain property;
• $\mathcal{F}$ is hereditary, in the sense that, whenever $A \in \mathcal{F}$, it follows that all subsets of $A$ belong to $\mathcal{F}$.

Then, when equipped with the inclusion relation, $(\mathcal{F}, \subset)$ has at least one maximal element.

Proof. The proof will be carried out by contradiction. Assume no $A \in \mathcal{F}$ is maximal. For each $A \in \mathcal{F}$, define

$$X_A = \{ x \in M \setminus A : A \cup \{x\} \in \mathcal{F} \}.$$

Claim: For every $A \in \mathcal{F}$, the set $X_A$ is non-empty.

Indeed, since $A$ is not maximal, there exists some $B \in \mathcal{F}$, with $A \subsetneq B$. In particular, there exists some $x \in B \setminus A$, and since $A \cup \{x\} \subset B$, by the hereditary property, it follows that $x \in X_A$.

Use now the Axiom of Choice, to find a map

$$\mathcal{F} \ni A \longmapsto x_A \in M,$$

such that $x_A \in X_A$, $\forall A \in \mathcal{F}$. This means that $A \cup \{x_A\} \in \mathcal{F}$, and $x_A \notin A$, for all $A \in \mathcal{F}$. By Lemma A.2 this is however impossible.


Theorem A.1 (Zorn Lemma). Assume the Axiom of Choice is true. Let $(X, \prec)$ be a non-empty ordered set, with the following property

(z) every totally ordered subset $A \subset X$ has an upper bound.

Then $X$ has at least one maximal element.

Proof. Define the collection

$$\mathcal{F} = \{ A \subset X : A \text{ totally ordered subset} \}.$$

Clearly $\mathcal{F}$ is non-empty (it contains, for instance, all singletons).

It is quite clear that $\mathcal{F}$ satisfies the hypothesis of Lemma A.3. So $(\mathcal{F}, \subset)$ has a maximal element $A$. Take now $x$ to be an upper bound for $A$, i.e. $a \prec x$, $\forall a \in A$.

Now we prove that $x$ is maximal for $(X, \prec)$. Suppose $y \in X$ satisfies $x \prec y$. Then clearly $A \cup \{y\}$ will still be a totally ordered subset of $X$, i.e. $A \cup \{y\} \in \mathcal{F}$. The maximality of $A$ in $(\mathcal{F}, \subset)$ will force $A \cup \{y\} = A$, so we get $y \in A$, hence $y \prec x$. Since we also have $x \prec y$, this forces $y = x$.

Appendix B

Cardinal Arithmetic

In this Appendix we discuss cardinal arithmetic. We assume the Axiom of Choice is true.

Definitions. Two sets $A$ and $B$ are said to have the same cardinality, if there exists a bijective map $A \to B$. It is clear that this defines an equivalence relation on the class¹⁸ of all sets.

A cardinal number is thought of as an equivalence class of sets. In other words, if we write a cardinal number as $\mathfrak{a}$, it is understood that $\mathfrak{a}$ consists of all sets of a given cardinality. So when we write $\operatorname{card} A = \mathfrak{a}$ we understand that $A$ belongs to this class, and for another set $B$ we write $\operatorname{card} B = \mathfrak{a}$, exactly when $B$ has the same cardinality as $A$. In this case we write $\operatorname{card} B = \operatorname{card} A$.

Notations. The cardinality of the empty set $\emptyset$ is zero. More generally the cardinality of a finite set is equal to its number of elements. The cardinality of the set $\mathbb{N}$, of all natural numbers, is denoted by $\aleph_0$.

Definition. Let $\mathfrak{a}$ and $\mathfrak{b}$ be cardinal numbers. We write $\mathfrak{a} \leq \mathfrak{b}$ if there exist sets $A \subset B$ with $\operatorname{card} A = \mathfrak{a}$ and $\operatorname{card} B = \mathfrak{b}$.

This is equivalent to the fact that, for any sets $A$ and $B$, with $\operatorname{card} A = \mathfrak{a}$ and $\operatorname{card} B = \mathfrak{b}$, one of the following equivalent conditions holds:

• there exists an injective function $f : A \to B$;
• there exists a surjective function $g : B \to A$.

For two cardinal numbers $\mathfrak{a}$ and $\mathfrak{b}$, we use the notation $\mathfrak{a} < \mathfrak{b}$ to indicate that $\mathfrak{a} \leq \mathfrak{b}$ and $\mathfrak{a} \neq \mathfrak{b}$.

Theorem B.1 (Cantor-Bernstein). Suppose two cardinal numbers $\mathfrak{a}$ and $\mathfrak{b}$ satisfy $\mathfrak{a} \leq \mathfrak{b}$ and $\mathfrak{b} \leq \mathfrak{a}$. Then $\mathfrak{a} = \mathfrak{b}$.

Proof. Fix two sets $A$ and $B$ with $\operatorname{card} A = \mathfrak{a}$ and $\operatorname{card} B = \mathfrak{b}$, so there exist injective functions $f : A \to B$ and $g : B \to A$. We shall construct a bijective function $h : A \to B$. Define the sets

$$A_0 = A \setminus g(B) \quad \text{and} \quad B_0 = B \setminus f(A).$$

Then define recursively the sequences $(A_n)_{n \geq 0}$ and $(B_n)_{n \geq 0}$ by

$$A_n = g(B_{n-1}) \quad \text{and} \quad B_n = f(A_{n-1}), \quad \forall n \geq 1.$$

Claim 1: One has $A_m \cap A_n = B_m \cap B_n = \emptyset$, $\forall m > n \geq 0$.

Let us first observe that the case when $n = 0$ is trivial, since we have the inclusions $A_m = g(B_{m-1}) \subset g(B) = A \setminus A_0$ and $B_m = f(A_{m-1}) \subset f(A) = B \setminus B_0$. Next we prove the desired property by induction on $m$. The case $m = 1$ is clear (this forces $n = 0$). Suppose the statement is true for $m = k$, and let us prove it for $m = k + 1$. Start with some $n < k + 1$. If $n = 0$, we are done, by the above discussion. Assume $n \geq 1$. Since $f$ and $g$ are injective we have

$$A_{k+1} \cap A_n = g(B_k) \cap g(B_{n-1}) = g(B_k \cap B_{n-1}) = \emptyset,$$
$$B_{k+1} \cap B_n = f(A_k) \cap f(A_{n-1}) = f(A_k \cap A_{n-1}) = \emptyset,$$

and we are done.

¹⁸ The term class is used, because there is no such thing as the “set of all sets.”

Put $C = A \setminus \bigcup_{n \geq 0} A_n$ and $D = B \setminus \bigcup_{n \geq 0} B_n$.

Claim 2: One has the equality $f(C) = D$.

First we prove the inclusion $f(C) \subset D$. Start with some point $c \in C$, but assume $f(c) \notin D$. This means that there exists some $n \geq 0$ such that $f(c) \in B_n$. Since $f(c) \in f(A) = B \setminus B_0$, we must have $n \geq 1$. But then we get $f(c) \in B_n = f(A_{n-1})$, and the injectivity of $f$ will force $c \in A_{n-1}$, which is impossible.

Second, we prove that $D \subset f(C)$. Start with some $d \in D$. First of all, since $D \subset B \setminus B_0 = f(A)$, there exists some $c \in A$ with $d = f(c)$. If $c \notin C$, then there exists some $n \geq 0$, such that $c \in A_n$, and then we would get $d = f(c) \in f(A_n) = B_{n+1}$, which is impossible.

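We now begin constructing the desired bijection. First we define $\phi : \bigcup_{n \geq 0} B_n \to B$ by

$$\phi(b) = \begin{cases} b & \text{if } b \in B_n \text{ and } n \text{ is odd} \\ (f \circ g)(b) & \text{if } b \in B_n \text{ and } n \text{ is even} \end{cases}$$

Claim 3: The map $\phi$ defines a bijection

$$\phi : \bigcup_{n \geq 0} B_n \to \bigcup_{n \geq 1} B_n.$$

Notice that, if $n \geq 0$ is even, then $\phi(B_n) = f\big(g(B_n)\big) = f(A_{n+1}) = B_{n+2}$. When $n \geq 0$ is odd we have $\phi(B_n) = B_n$. Since each $\phi|_{B_n}$ is injective, and the images $\phi(B_n)$, $n \geq 0$, are pairwise disjoint by Claim 1, the map $\phi$ is injective, and we have indeed the equality

$$\phi\Big( \bigcup_{n \geq 0} B_n \Big) = \bigcup_{n \geq 1} B_n.$$

Now we define $\psi : \bigcup_{n \geq 0} A_n \to B$ by $\psi = \phi^{-1} \circ f$. Clearly $\psi$ is injective, and

$$\psi\Big( \bigcup_{n \geq 0} A_n \Big) = \phi^{-1}\Big( \bigcup_{n \geq 0} f(A_n) \Big) = \phi^{-1}\Big( \bigcup_{n \geq 0} B_{n+1} \Big) = \phi^{-1}\Big( \bigcup_{n \geq 1} B_n \Big) = \bigcup_{n \geq 0} B_n,$$

so $\psi$ defines a bijection

$$\psi : \bigcup_{n \geq 0} A_n \to \bigcup_{n \geq 0} B_n.$$

We then combine $\psi$ with the bijection $f : C \to D$, i.e. we define the map $h : A \to B$ by

$$h(x) = \begin{cases} \psi(x) & \text{if } x \in \bigcup_{n \geq 0} A_n \\ f(x) & \text{if } x \in A \setminus \bigcup_{n \geq 0} A_n = C. \end{cases}$$

Clearly $h$ is injective, and

$$h(A) = \psi\Big( \bigcup_{n \geq 0} A_n \Big) \cup f(C) = \Big( \bigcup_{n \geq 0} B_n \Big) \cup D = B,$$

so $h$ is indeed bijective.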
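The construction in this proof can be traced computationally. The Python sketch below follows the same recipe: it finds the level $n$ with $f(a) \in B_n$ by repeatedly pulling back through $f$ and $g$, and then applies $\phi^{-1}$. The function name `cantor_bernstein`, the partial inverses `f_inv`, `g_inv` (returning `None` outside the range), the cut-off `max_depth`, and the example $f = g = $ doubling on the positive integers are all assumptions made for illustration; a point whose level is not found within `max_depth` steps is simply treated as lying in $C$.

```python
from typing import Callable, Optional

Partial = Callable[[int], Optional[int]]

def cantor_bernstein(f: Callable[[int], int], f_inv: Partial, g_inv: Partial,
                     max_depth: int = 10_000) -> Callable[[int], int]:
    """Build h : A -> B from injections f : A -> B and g : B -> A (g given via g_inv)."""

    def level_in_B(b: int) -> Optional[int]:
        # Returns n with b in B_n, using B_0 = B \ f(A), B_n = f(A_{n-1}),
        # A_0 = A \ g(B), A_n = g(B_{n-1}); None if no level is found.
        x, on_B_side, n = b, True, 0
        while n <= max_depth:
            prev = f_inv(x) if on_B_side else g_inv(x)
            if prev is None:
                return n
            x, on_B_side, n = prev, not on_B_side, n + 1
        return None

    def h(a: int) -> int:
        b = f(a)
        n = level_in_B(b)
        if n is None or n % 2 == 1:
            # a in C (h = f), or f(a) in an odd B_n where phi is the identity.
            return b
        # f(a) in an even B_n with n >= 2: apply (f o g)^{-1}.
        return g_inv(f_inv(b))

    return h

# Illustrative example: A = B = positive integers, f = g = doubling.
double = lambda x: 2 * x
halve = lambda x: x // 2 if x % 2 == 0 else None
h = cantor_bernstein(double, halve, halve)
values = [h(a) for a in range(1, 65)]
```

In this example $h$ doubles a number whose 2-adic valuation is even and halves one whose valuation is odd, so it is a bijection of the positive integers even though neither $f$ nor $g$ is surjective.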


Theorem B.2 (Total ordering for cardinal numbers). Let $\mathfrak{a}$ and $\mathfrak{b}$ be cardinal numbers. Then one has either $\mathfrak{a} \leq \mathfrak{b}$, or $\mathfrak{b} \leq \mathfrak{a}$.

Proof. Choose two sets $A$ and $B$ with $\operatorname{card} A = \mathfrak{a}$ and $\operatorname{card} B = \mathfrak{b}$. In order to prove the theorem, it suffices to construct either an injective function $A \to B$, or an injective function $B \to A$.

We define the set

$$X = \{ (C, D, g) : C \subset A, \ D \subset B, \ g : C \to D \text{ bijection} \}.$$

We equip $X$ with the following order relation:

$$(C, D, g) \prec (C', D', g') \iff \begin{cases} C \subset C' \\ D \subset D' \\ g = g'|_C \end{cases}$$

We now check that $(X, \prec)$ satisfies the hypothesis of Zorn Lemma. Let $\mathcal{A} \subset X$ be a totally ordered subset, say $\mathcal{A} = \{ (C_i, D_i, g_i) : i \in I \}$. Define $C = \bigcup_{i \in I} C_i$, $D = \bigcup_{i \in I} D_i$, and $g : C \to D$ to be the unique function with the property that $g|_{C_i} = g_i$, $\forall i \in I$. (We use here the fact that for $i, j \in I$ we either have $C_i \subset C_j$ and $g_j|_{C_i} = g_i$, or $C_j \subset C_i$ and $g_i|_{C_j} = g_j$. In either case, this proves that $g_i|_{C_i \cap C_j} = g_j|_{C_i \cap C_j}$, $\forall i, j \in I$, so such a $g$ exists.) It is then pretty clear that $(C, D, g) \in X$ and $(C_i, D_i, g_i) \prec (C, D, g)$, $\forall i \in I$, i.e. $(C, D, g)$ is an upper bound for $\mathcal{A}$. Use now Zorn Lemma, to find a maximal element $(A_0, B_0, f)$ in $X$.

Claim: Either $A_0 = A$ or $B_0 = B$.

We prove this by contradiction. If we have strict inclusions $A_0 \subsetneq A$ and $B_0 \subsetneq B$, then if we choose $a \in A \setminus A_0$ and $b \in B \setminus B_0$, we can define a bijection $g : A_0 \cup \{a\} \to B_0 \cup \{b\}$ by $g(a) = b$ and $g|_{A_0} = f$. This would then produce a new element $(A_0 \cup \{a\}, B_0 \cup \{b\}, g) \in X$, which would contradict the maximality of $(A_0, B_0, f)$.

The theorem now follows immediately from the Claim. If $A_0 = A$, then $f : A \to B_0 \subset B$ is injective, and if $B_0 = B$, then $f^{-1} : B \to A_0 \subset A$ is injective.

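We now define the operations with cardinal numbers.

Definitions. Let $\mathfrak{a}$ and $\mathfrak{b}$ be cardinal numbers.
• We define $\mathfrak{a} + \mathfrak{b} = \operatorname{card} S$, where $S$ is any set which is of the form $S = A \cup B$ with $\operatorname{card} A = \mathfrak{a}$, $\operatorname{card} B = \mathfrak{b}$, and $A \cap B = \emptyset$.
• We define $\mathfrak{a} \cdot \mathfrak{b} = \operatorname{card} P$, where $P$ is any set which is of the form $P = A \times B$ with $\operatorname{card} A = \mathfrak{a}$ and $\operatorname{card} B = \mathfrak{b}$.
• We define $\mathfrak{a}^{\mathfrak{b}} = \operatorname{card} X$, where $X$ is any set which is of the form $X = \prod_{i \in I} A_i$, with $\operatorname{card} I = \mathfrak{b}$ and $\operatorname{card} A_i = \mathfrak{a}$, $\forall i \in I$. Equivalently, if we take two sets $A$ and $B$ with $\operatorname{card} A = \mathfrak{a}$, and $\operatorname{card} B = \mathfrak{b}$, and if we define

$$A^B = \prod_B A = \{ f : f \text{ function from } B \text{ to } A \},$$

then $\mathfrak{a}^{\mathfrak{b}} = \operatorname{card}(A^B)$.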


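It is pretty easy to show that these definitions are correct, in the sense that they do not depend on the particular choices of the sets involved. Moreover, these operations are consistent with the usual operations with natural numbers.

Remark B.1. The operations with cardinal numbers, defined above, satisfy:
• $\mathfrak{a} + \mathfrak{b} = \mathfrak{b} + \mathfrak{a}$,
• $(\mathfrak{a} + \mathfrak{b}) + \mathfrak{d} = \mathfrak{a} + (\mathfrak{b} + \mathfrak{d})$,
• $\mathfrak{a} + 0 = \mathfrak{a}$,
• $\mathfrak{a} \cdot \mathfrak{b} = \mathfrak{b} \cdot \mathfrak{a}$,
• $(\mathfrak{a} \cdot \mathfrak{b}) \cdot \mathfrak{d} = \mathfrak{a} \cdot (\mathfrak{b} \cdot \mathfrak{d})$,
• $\mathfrak{a} \cdot 1 = \mathfrak{a}$,
• $\mathfrak{a} \cdot (\mathfrak{b} + \mathfrak{d}) = (\mathfrak{a} \cdot \mathfrak{b}) + (\mathfrak{a} \cdot \mathfrak{d})$,
• $(\mathfrak{a} \cdot \mathfrak{b})^{\mathfrak{d}} = (\mathfrak{a}^{\mathfrak{d}}) \cdot (\mathfrak{b}^{\mathfrak{d}})$,
• $\mathfrak{a}^{\mathfrak{b} + \mathfrak{d}} = (\mathfrak{a}^{\mathfrak{b}}) \cdot (\mathfrak{a}^{\mathfrak{d}})$,
• $(\mathfrak{a}^{\mathfrak{b}})^{\mathfrak{d}} = \mathfrak{a}^{\mathfrak{b} \cdot \mathfrak{d}}$,

for all cardinal numbers $\mathfrak{a}, \mathfrak{b}, \mathfrak{d} \geq 1$.

Remark B.2. The order relation $\leq$ is compatible with all the operations, in the sense that, if $\mathfrak{a}_1$, $\mathfrak{a}_2$, $\mathfrak{b}_1$, and $\mathfrak{b}_2$ are cardinal numbers with $\mathfrak{a}_1 \leq \mathfrak{a}_2$ and $\mathfrak{b}_1 \leq \mathfrak{b}_2$, then

• $\mathfrak{a}_1 + \mathfrak{b}_1 \leq \mathfrak{a}_2 + \mathfrak{b}_2$,
• $\mathfrak{a}_1 \cdot \mathfrak{b}_1 \leq \mathfrak{a}_2 \cdot \mathfrak{b}_2$,
• $\mathfrak{a}_1^{\mathfrak{b}_1} \leq \mathfrak{a}_2^{\mathfrak{b}_2}$.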

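Proposition B.1. Let $\mathfrak{a} \geq 1$ be a cardinal number.

(i) If $A$ is a set with $\operatorname{card} A = \mathfrak{a}$, and if we define

$$\mathcal{P}(A) = \{ B : B \text{ subset of } A \},$$

then $2^{\mathfrak{a}} = \operatorname{card} \mathcal{P}(A)$.

(ii) $\mathfrak{a} < 2^{\mathfrak{a}}$.

Proof. (i). Put

$$P = \{0, 1\}^A = \{ f : f \text{ function from } A \text{ to } \{0, 1\} \},$$

so that $2^{\mathfrak{a}} = \operatorname{card} P$. We need to define a bijection $\phi : P \to \mathcal{P}(A)$. We take

$$\phi(f) = \{ a \in A : f(a) = 1 \}, \quad \forall f \in P.$$

It is clear that, since a function $f : A \to \{0, 1\}$ is completely determined by the set $\{ a \in A : f(a) = 1 \}$, the map $\phi$ is indeed bijective.

(ii). The map $A \ni a \longmapsto \{a\} \in \mathcal{P}(A)$ is clearly injective. This proves the inequality $\mathfrak{a} \leq 2^{\mathfrak{a}}$. We now prove that $\mathfrak{a} \neq 2^{\mathfrak{a}}$, by contradiction. Assume there is a bijection $\theta : A \to \mathcal{P}(A)$. Define the set

$$B = \{ a \in A : a \notin \theta(a) \},$$

and choose $b \in A$ such that $B = \theta(b)$. If $b \in B$, then by construction we get $b \notin \theta(b) = B$, which is impossible. If $b \notin B$, we have $b \notin \theta(b)$, which forces $b \in B$, again an impossibility.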
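The diagonal argument in part (ii) only uses surjectivity of $\theta$, and it can be checked exhaustively on a small finite set. The Python sketch below (with the hypothetical helper `diagonal_set` and a three-element set `A` chosen purely for illustration) runs through every map $\theta : A \to \mathcal{P}(A)$ and confirms that the set $B = \{a \in A : a \notin \theta(a)\}$ is never a value of $\theta$.

```python
from itertools import product

def diagonal_set(A, theta):
    """The set B = {a in A : a not in theta(a)} from the proof."""
    return frozenset(a for a in A if a not in theta[a])

A = [0, 1, 2]
# All 2**3 = 8 subsets of A, built from their indicator functions as in part (i).
subsets = [frozenset(x for x, bit in zip(A, bits) if bit)
           for bits in product([0, 1], repeat=len(A))]

# Try every map theta : A -> P(A); there are 8**3 = 512 of them.
missed_every_time = all(
    diagonal_set(A, dict(zip(A, images))) not in images
    for images in product(subsets, repeat=len(A))
)
```

The list `subsets` is produced from 0/1 tuples, which is exactly the bijection $\phi$ between $\{0,1\}^A$ and $\mathcal{P}(A)$ used in part (i).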

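We now discuss the properties of these operations, when infinite cardinal numbers are used.

Lemma B.1 (Properties of $\aleph_0$).
(i) For any infinite cardinal number $\mathfrak{a}$, one has the inequality $\aleph_0 \leq \mathfrak{a}$.
(ii) $\aleph_0 + \aleph_0 = \aleph_0$;
(iii) $\aleph_0 \cdot \aleph_0 = \aleph_0$.

Proof. (i). Let $\mathfrak{a}$ be an infinite cardinal number, and let $A$ be an infinite set, with $\operatorname{card} A = \mathfrak{a}$. Since for every finite subset $F \subset A$, there exists some $x \in A \setminus F$, one can construct a sequence $(x_n)_{n \in \mathbb{N}} \subset A$, with $x_m \neq x_n$, $\forall m > n \geq 1$. Then the subset $B = \{ x_n : n \in \mathbb{N} \}$ has $\operatorname{card} B = \aleph_0$, so the inclusion $B \subset A$ gives the desired inequality.

(ii). Consider the sets

$$A_0 = \{ n \in \mathbb{N} : n \text{ even} \} \quad \text{and} \quad A_1 = \{ n \in \mathbb{N} : n \text{ odd} \}.$$

Then clearly $\operatorname{card} A_0 = \operatorname{card} A_1 = \aleph_0$, and the equality $A_0 \cup A_1 = \mathbb{N}$ (with $A_0 \cap A_1 = \emptyset$) gives

$$\aleph_0 + \aleph_0 = \operatorname{card} A_0 + \operatorname{card} A_1 = \operatorname{card}(A_0 \cup A_1) = \operatorname{card} \mathbb{N} = \aleph_0.$$

(iii). Take the set $P = \mathbb{N} \times \mathbb{N}$, so that $\aleph_0 \cdot \aleph_0 = \operatorname{card} P$. It is obvious that $\operatorname{card} P \geq \aleph_0$. To prove the other inequality, we define a surjection $\phi : \mathbb{N} \to P$ as follows. For each $n \geq 1$ we take $s_n = n(n-1)/2$, we set

$$B_n = \{ m \in \mathbb{N} : s_n < m \leq s_{n+1} \},$$

and we define $\phi_n : B_n \to P$ by

$$\phi_n(m) = (n + s_n - m + 1, \ m - s_n), \quad \forall m \in B_n.$$

(Since $s_{n+1} = s_n + n$, the number $m - s_n$ runs through $1, \dots, n$, so both coordinates lie in $\mathbb{N}$.) Notice that

(1) $\phi_n(B_n) = \{ (p, q) \in \mathbb{N} \times \mathbb{N} : p + q = n + 1 \}$.

Notice also that $\bigcup_{n \geq 1} B_n = \mathbb{N}$, and $B_j \cap B_k = \emptyset$, $\forall j > k \geq 1$, so there exists a (unique) function $\phi : \mathbb{N} \to P$, such that $\phi|_{B_n} = \phi_n$, for all $n \geq 1$. By (1) it is clear that $\phi$ is surjective.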
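The enumeration of $\mathbb{N} \times \mathbb{N}$ along anti-diagonals is easy to test numerically. The Python sketch below takes $\mathbb{N} = \{1, 2, \dots\}$ and uses the block formula $\phi_n(m) = (n + s_n - m + 1, \ m - s_n)$ for $s_n < m \leq s_{n+1}$, which keeps both coordinates at least 1; the function names `s` and `phi` are illustrative choices.

```python
def s(n: int) -> int:
    """Triangular offset s_n = n(n-1)/2, so that s_{n+1} = s_n + n."""
    return n * (n - 1) // 2

def phi(m: int) -> tuple:
    """Map m >= 1 to a pair (p, q) with p, q >= 1, walking the anti-diagonals p + q = n + 1."""
    n = 1
    while s(n + 1) < m:  # find the block B_n = {s_n < m <= s_{n+1}} containing m
        n += 1
    return (n + s(n) - m + 1, m - s(n))

first_six = [phi(m) for m in range(1, 7)]
# The first s_11 = 55 values should be exactly the pairs with p + q <= 11.
pairs = [phi(m) for m in range(1, s(11) + 1)]
expected = {(p, q) for p in range(1, 11) for q in range(1, 11) if p + q <= 11}
```

On each block the map is injective, and different blocks land on different anti-diagonals, so $\phi$ is in fact a bijection, which is more than the surjectivity needed in the proof.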

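Theorem B.3. Let $\mathfrak{a}$ and $\mathfrak{b}$ be cardinal numbers, with $1 \leq \mathfrak{b} \leq \mathfrak{a}$, and $\mathfrak{a}$ infinite. Then:

(i) $\mathfrak{a} + \mathfrak{b} = \mathfrak{a}$;
(ii) $\mathfrak{a} \cdot \mathfrak{b} = \mathfrak{a}$.

Proof. It is clear that

$$\mathfrak{a} \leq \mathfrak{a} + \mathfrak{b} \leq \mathfrak{a} + \mathfrak{a},$$
$$\mathfrak{a} \leq \mathfrak{a} \cdot \mathfrak{b} \leq \mathfrak{a} \cdot \mathfrak{a},$$

so in order to prove the theorem, we can assume that $\mathfrak{a} = \mathfrak{b}$.

(i). Fix some set $A$ with $\operatorname{card} A = \mathfrak{a}$. Use Zorn Lemma, to find a maximal non-empty family $\{A_i : i \in I\}$ of subsets of $A$ with

(a) $\operatorname{card} A_i = \aleph_0$, for all $i \in I$;
(b) $A_i \cap A_j = \emptyset$, for all $i, j \in I$ with $i \neq j$.

If we put $B = A \setminus \big( \bigcup_{i \in I} A_i \big)$, then by maximality it follows that $B$ is finite. In particular, if we take $i_0 \in I$ then obviously $\operatorname{card}(A_{i_0} \cup B) = \aleph_0$, so if we replace $A_{i_0}$ with $A_{i_0} \cup B$, we will still have the above properties (a) and (b), but also $A = \bigcup_{i \in I} A_i$. This proves that $\mathfrak{a} = \operatorname{card} A = \aleph_0 \cdot \mathfrak{d}$, where $\mathfrak{d} = \operatorname{card} I$. In other words, we have $\mathfrak{a} = \operatorname{card}(\mathbb{N} \times I)$. Consider then the sets

$$C_0 = \{ n \in \mathbb{N} : n \text{ even} \} \quad \text{and} \quad C_1 = \{ n \in \mathbb{N} : n \text{ odd} \},$$

so that $(C_0 \times I) \cup (C_1 \times I) = \mathbb{N} \times I$, and $(C_0 \times I) \cap (C_1 \times I) = \emptyset$. In particular, we get

$$\begin{aligned}
\mathfrak{a} &= \operatorname{card}(C_0 \times I) + \operatorname{card}(C_1 \times I) \\
&= (\operatorname{card} C_0) \cdot (\operatorname{card} I) + (\operatorname{card} C_1) \cdot (\operatorname{card} I) \\
&= \aleph_0 \cdot \mathfrak{d} + \aleph_0 \cdot \mathfrak{d} = \mathfrak{a} + \mathfrak{a}.
\end{aligned}$$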

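(ii). Fix $A$ a set with $\operatorname{card} A = \mathfrak{a}$. We are going to employ Zorn Lemma to find a bijection $A \to A \times A$. Define

$$X = \{ (D, f) : D \subset A, \ f : D \to D \times D \text{ bijective} \}.$$

Equip $X$ with the following order

$$(D, f) \prec (D', f') \iff \begin{cases} D \subset D' \\ f = f'|_D \end{cases}$$

Notice that $X$ is non-empty, since we can find at least one set $D \subset A$ with $\operatorname{card} D = \aleph_0$. We now check that $X$ satisfies the hypothesis of Zorn Lemma. Let $\mathcal{T} = \{ (D_i, f_i) : i \in I \}$ be a totally ordered subset of $X$. It is fairly clear that if one takes $D = \bigcup_{i \in I} D_i$ and one defines $f : D \to D \times D$ as the unique function with $f|_{D_i} = f_i$, $\forall i \in I$, then $f$ is injective, and

$$f(D) = \bigcup_{i \in I} f(D_i) = \bigcup_{i \in I} f_i(D_i) = \bigcup_{i \in I} (D_i \times D_i) = D \times D,$$

so the pair $(D, f)$ indeed belongs to $X$, and is an upper bound for $\mathcal{T}$.

Use Zorn Lemma to produce a maximal element $(D, f) \in X$. Notice that, if we take $\mathfrak{d} = \operatorname{card} D$, then by construction we have

(2) $\mathfrak{d} \cdot \mathfrak{d} = \mathfrak{d}$.

We would like to prove that $D = A$. In general this is not the case (for example, when $A = \mathbb{N}$, every $(D, f) \in X$, with $\mathbb{N} \setminus D$ finite, is automatically maximal). We notice however that all we need to show is the equality

(3) $\mathfrak{d} = \mathfrak{a}$.

We prove this equality by contradiction. We know that we already have $\mathfrak{d} \leq \mathfrak{a}$. Suppose $\mathfrak{d} < \mathfrak{a}$. Put $G = A \setminus D$, and notice that $\mathfrak{d} + \operatorname{card} G = \mathfrak{a}$. Since $\mathfrak{d} < \mathfrak{a}$, by (i) we see that we must have the equality $\operatorname{card} G = \mathfrak{a}$. Then there exists a subset $E \subset G$ with $\operatorname{card} E = \mathfrak{d}$. Consider the set

$$P = (E \times E) \cup (E \times D) \cup (D \times E).$$

Since $E \cap D = \emptyset$, the three sets above are pairwise disjoint, so using (2) combined again with part (i), we get

$$\begin{aligned}
\operatorname{card} P &= \operatorname{card}(E \times E) + \operatorname{card}(E \times D) + \operatorname{card}(D \times E) \\
&= \mathfrak{d} \cdot \mathfrak{d} + \mathfrak{d} \cdot \mathfrak{d} + \mathfrak{d} \cdot \mathfrak{d} = \mathfrak{d} + \mathfrak{d} + \mathfrak{d} = \mathfrak{d} = \operatorname{card} E.
\end{aligned}$$

This means that there exists a bijection $g : E \to P$, which combined with the fact that $E \cap D = P \cap (D \times D) = \emptyset$, will produce a bijection $h : D \cup E \to P \cup (D \times D)$, such that $h|_D = f$ and $h|_E = g$. Since we have $P \cup (D \times D) = (D \cup E) \times (D \cup E)$, the pair $(D \cup E, h) \in X$ will contradict the maximality of $(D, f)$.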


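Corollary B.1. If $\mathfrak{a}$ is an infinite cardinal number, and if $\mathfrak{b}$ is a cardinal number with $2 \leq \mathfrak{b} \leq 2^{\mathfrak{a}}$, then

$$\mathfrak{b}^{\mathfrak{a}} = 2^{\mathfrak{a}}.$$

Proof. We have

$$2^{\mathfrak{a}} \leq \mathfrak{b}^{\mathfrak{a}} \leq (2^{\mathfrak{a}})^{\mathfrak{a}} = 2^{\mathfrak{a} \cdot \mathfrak{a}} = 2^{\mathfrak{a}},$$

and the desired equality follows from the Cantor-Bernstein Theorem.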

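Corollary B.2. Let $\mathfrak{a}$ be an infinite cardinal number, let $A$ be a set with $\operatorname{card} A = \mathfrak{a}$, and define

$$\mathcal{P}_{\text{fin}}(A) = \{ F \in \mathcal{P}(A) : F \text{ finite} \}.$$

Then $\operatorname{card} \mathcal{P}_{\text{fin}}(A) = \mathfrak{a}$.

Proof. First of all, the map $A \ni a \longmapsto \{a\} \in \mathcal{P}_{\text{fin}}(A)$ is injective, so $\mathfrak{a} \leq \operatorname{card} \mathcal{P}_{\text{fin}}(A)$.

We now prove the other inequality. For every integer $n \geq 1$, let $A^n$ denote the $n$-fold cartesian product. We treat the sequence $A^1, A^2, \dots$ as pairwise disjoint. For every $n \geq 1$ we define the map

$$\phi_n : A^n \to \mathcal{P}_{\text{fin}}(A),$$

by

$$\phi_n(a_1, \dots, a_n) = \{a_1, \dots, a_n\},$$

and we define the map $\phi : \bigcup_{n=1}^\infty A^n \to \mathcal{P}_{\text{fin}}(A)$ as the unique map such that $\phi|_{A^n} = \phi_n$, $\forall n \geq 1$. Notice now that, since

$$\operatorname{card} A^n = \mathfrak{a}^n = \mathfrak{a}, \quad \forall n \geq 1,$$

it follows that

$$\operatorname{card}\Big( \bigcup_{n=1}^\infty A^n \Big) = \aleph_0 \cdot \mathfrak{a} = \mathfrak{a},$$

which gives

$$\operatorname{card}(\operatorname{Range} \phi) \leq \mathfrak{a}.$$

But it is clear that

$$\{\emptyset\} \cup \operatorname{Range} \phi = \mathcal{P}_{\text{fin}}(A),$$

and the fact that $\mathcal{P}_{\text{fin}}(A)$ is infinite proves that

$$\operatorname{card} \mathcal{P}_{\text{fin}}(A) = \operatorname{card}(\operatorname{Range} \phi) \leq \mathfrak{a}.$$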

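We conclude with a result on the cardinal number $\mathfrak{c} = \operatorname{card} \mathbb{R}$.

Proposition B.2.

(i) For two real numbers $a < b$, one has

$$\operatorname{card}(a, b) = \operatorname{card}[a, b) = \operatorname{card}(a, b] = \operatorname{card}[a, b] = \mathfrak{c}.$$

(ii) $\mathfrak{c} = 2^{\aleph_0}$.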


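Proof. (i). It is clear that, since $(a, b)$ is infinite, we have

$$\operatorname{card}[a, b] = 2 + \operatorname{card}(a, b) = \operatorname{card}(a, b).$$

The inclusions $(a, b) \subset [a, b) \subset [a, b]$ and $(a, b) \subset (a, b] \subset [a, b]$, combined with the Cantor-Bernstein Theorem, immediately give

$$\operatorname{card}[a, b) = \operatorname{card}(a, b] = \operatorname{card}(a, b).$$

Finally, the bijection

$$(a, b) \ni t \longmapsto \tan\Big( \frac{\pi (2t - a - b)}{2(b - a)} \Big) \in \mathbb{R}$$

shows that $\operatorname{card}(a, b) = \mathfrak{c}$.

(ii). The proof of this result uses a certain construction, which is useful for many other purposes. Therefore we choose to work in full generality. Consider the set

$$T = \{0, 1\}^{\aleph_0} = \{ a = (\alpha_n)_{n \in \mathbb{N}} : \alpha_n \in \{0, 1\}, \ \forall n \in \mathbb{N} \},$$

so $2^{\aleph_0} = \operatorname{card} T$. For any real number $r \geq 2$, we define the map $\phi_r : T \to [0, 1]$ by

$$\phi_r(a) = (r - 1) \sum_{n=1}^\infty \frac{\alpha_n}{r^n}, \quad \forall a = (\alpha_n)_{n \in \mathbb{N}} \in T.$$

The maps $\phi_r$, $r \geq 2$, are “almost” injective. To clarify this, we define the set

$$T_0 = \{ a = (\alpha_n)_{n \in \mathbb{N}} \in T : \text{the set } \{ n \in \mathbb{N} : \alpha_n = 0 \} \text{ is infinite} \}.$$

Note that

$$T \setminus T_0 = \{ (\alpha_n)_{n \in \mathbb{N}} \in T : \text{there exists } N \in \mathbb{N}, \text{ such that } \alpha_n = 1, \ \forall n \geq N \}.$$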

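For $r = 2$ the map $\phi_2$ is clearly surjective. In fact $\phi_2$ is “almost” bijective.

Claim 1: Fix $r \geq 2$. For elements $a = (\alpha_n)_{n \in \mathbb{N}}$, $b = (\beta_n)_{n \in \mathbb{N}} \in T_0$, the following are equivalent

(∗) $\phi_r(a) > \phi_r(b)$;

(∗∗) there exists $k \in \mathbb{N}$, such that $\alpha_k > \beta_k$, and $\alpha_j = \beta_j$, for all $j \in \mathbb{N}$ with $j < k$.

We first prove the implication (∗∗) ⇒ (∗). If $a, b \in T_0$ satisfy (∗∗), then

$$\phi_r(a) - \phi_r(b) = \frac{r - 1}{r^k} + (r - 1) \sum_{n=k+1}^\infty \frac{\alpha_n - \beta_n}{r^n} \geq \frac{r - 1}{r^k} - (r - 1) \sum_{n=k+1}^\infty \frac{\beta_n}{r^n}. \tag{4}$$

Notice now that there are infinitely many indices $n \geq k + 1$ such that $\beta_n = 0$. This gives the fact that

$$\sum_{n=k+1}^\infty \frac{\beta_n}{r^n} < \sum_{n=k+1}^\infty \frac{1}{r^n} = \frac{1}{(r - 1) r^k},$$

so if we go back to (4) we get

$$\phi_r(a) - \phi_r(b) \geq \frac{r - 1}{r^k} - (r - 1) \sum_{n=k+1}^\infty \frac{\beta_n}{r^n} > \frac{r - 1}{r^k} - \frac{1}{r^k} = \frac{r - 2}{r^k} \geq 0,$$

so in particular we get $\phi_r(a) > \phi_r(b)$.

Conversely, if $\phi_r(a) > \phi_r(b)$, we choose

$$k = \min \{ n \in \mathbb{N} : \alpha_n \neq \beta_n \}.$$

Using the implication (∗∗) ⇒ (∗) we see that we cannot have $\beta_k > \alpha_k$, because this would force $\phi_r(b) > \phi_r(a)$. Therefore we must have $\alpha_k > \beta_k$, and we are done.

Using Claim 1, we now see that $\phi_r|_{T_0} : T_0 \to [0, 1]$ is injective.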
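The difference between $T$ and $T_0$ can be seen on eventually constant sequences, where $\phi_r$ has a closed form. In the Python sketch below a sequence is encoded (as an assumption made for this illustration) by a finite `prefix` of bits followed by a constant `tail` bit, and $r$ is restricted to integers so that exact `Fraction` arithmetic can be used; the tail contributes $(r-1) \sum_{n > N} \text{tail} / r^n = \text{tail} / r^N$.

```python
from fractions import Fraction
from itertools import product

def phi(r: int, prefix, tail: int) -> Fraction:
    """phi_r of the 0/1 sequence prefix + (tail, tail, tail, ...), for an integer r >= 2."""
    N = len(prefix)
    head = sum(Fraction(bit, r**n) for n, bit in enumerate(prefix, start=1))
    return (r - 1) * head + Fraction(tail, r**N)

# a = (1, 0, 0, ...) lies in T_0, while b = (0, 1, 1, ...) lies in T \ T_0.
collide_r2 = phi(2, [1], 0) == phi(2, [0], 1)     # both equal 1/2: phi_2 is not injective on T
values_r3 = (phi(3, [1], 0), phi(3, [0], 1))       # 2/3 and 1/3: no collision for r = 3

# Claim 1 on T_0: lexicographic order of sequences matches the order of phi_r values.
seqs = list(product([0, 1], repeat=4))             # prefixes, each continued by zeros
order_ok = all(
    sorted(seqs) == sorted(seqs, key=lambda p: phi(r, p, 0))
    for r in (2, 3, 5)
)
```

For $r = 2$ this is the familiar ambiguity of binary expansions ($0.0111\ldots = 0.1000\ldots$), which is exactly why the proof removes the countable set $T \setminus T_0$ before claiming injectivity.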

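Claim 2: $\operatorname{card}(T \setminus T_0) = \aleph_0$.

This is pretty clear, since we can write

$$T \setminus T_0 = \bigcup_{k=1}^\infty R_k,$$

where

$$R_k = \{ a = (\alpha_n)_{n \in \mathbb{N}} \in T : \alpha_n = 1, \ \forall n \geq k \}.$$

Since each $R_k$ is finite, the desired result follows.

Using Claim 2, we have

$$2^{\aleph_0} = \operatorname{card} T = \operatorname{card}(T \setminus T_0) + \operatorname{card} T_0 = \aleph_0 + \operatorname{card} T_0.$$

Since $\aleph_0 < 2^{\aleph_0}$, the above equality forces

$$2^{\aleph_0} = \operatorname{card} T_0.$$

For every $r \geq 2$, we also have $\operatorname{card} \phi_r(T \setminus T_0) \leq \aleph_0$, which then gives $\operatorname{card}\big[ \phi_r(T) \setminus \phi_r(T_0) \big] \leq \aleph_0$, hence using the injectivity of $\phi_r|_{T_0}$, we have $\operatorname{card} \phi_r(T_0) = \operatorname{card} T_0 = 2^{\aleph_0}$, so we get

$$2^{\aleph_0} = \operatorname{card} \phi_r(T_0) \leq \operatorname{card} \phi_r(T) = \operatorname{card} \phi_r(T_0) + \operatorname{card}\big[ \phi_r(T) \setminus \phi_r(T_0) \big] \leq \operatorname{card} \phi_r(T_0) + \aleph_0 = 2^{\aleph_0} + \aleph_0 = 2^{\aleph_0}.$$

By the Cantor-Bernstein Theorem this forces $\operatorname{card} \phi_r(T) = 2^{\aleph_0}$.

Now we are done, since for $r = 2$ we clearly have $\phi_2(T) = [0, 1]$.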

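Corollary B.3.

(i) $\mathfrak{c}^{\aleph_0} = \mathfrak{c}$.

(ii) If we define the set

$$\mathcal{P}_{\text{count}}(\mathbb{R}) = \{ C \subset \mathbb{R} : \operatorname{card} C \leq \aleph_0 \},$$

then $\operatorname{card} \mathcal{P}_{\text{count}}(\mathbb{R}) = \mathfrak{c}$.

Proof. (i). This is immediate from the equality $2^{\aleph_0} = \mathfrak{c}$ and from Corollary B.1.

(ii). Using the inclusion $\mathcal{P}_{\text{fin}}(\mathbb{R}) \subset \mathcal{P}_{\text{count}}(\mathbb{R})$, combined with Corollary B.2, we see that we have the inequality

$$\mathfrak{c} \leq \operatorname{card} \mathcal{P}_{\text{count}}(\mathbb{R}).$$

To prove the other inequality, we define a map $\phi : \mathbb{R}^{\mathbb{N}} \to \mathcal{P}_{\text{count}}(\mathbb{R})$, as follows. If $a \in \mathbb{R}^{\mathbb{N}}$ is a sequence, say $a = (\alpha_n)_{n \in \mathbb{N}}$, we put

$$\phi(a) = \{ \alpha_n : n \in \mathbb{N} \}.$$

Since the range of $\phi$ clearly contains every non-empty set in $\mathcal{P}_{\text{count}}(\mathbb{R})$, using part (i) we get

$$\operatorname{card} \mathcal{P}_{\text{count}}(\mathbb{R}) \leq \operatorname{card} \mathbb{R}^{\mathbb{N}} = \mathfrak{c}^{\aleph_0} = \mathfrak{c}.$$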

Appendix C

Ordinal numbers

In this Appendix we discuss ordinal number arithmetic. The Axiom of Choice is assumed to be true.

Definition. Let $X$ be a non-empty set. A well ordering on $X$ is a total order relation $\prec$ on $X$ with the following property:

(w) every non-empty subset $A \subset X$ has a smallest element, i.e. there exists $a \in A$, such that $a \prec x$, $\forall x \in A$.

In this case the pair $(X, \prec)$ is called a well-ordered set.

Notations. Let $(W, \prec)$ be a well-ordered set. For any $a \in W$, we define

$$W(a) = \{ x \in W : x \prec a \text{ and } x \neq a \}.$$

Remark that $(W(a), \prec)$ is well-ordered.

Lemma C.1. Let (W,≺) be a well-ordered set. For a subset S ⊂ W, the following are equivalent:

(i) for every s ∈ S, one has the inclusion W(s) ⊂ S;
(ii) either S = W, or there exists some a ∈ W, such that S = W(a).

Proof. (i) ⇒ (ii). Assume S ⊊ W. Take a to be the smallest element of the set W ∖ S. If s ∈ S, then a ≠ s, and by (i) we cannot have a ≺ s, since this would force a ∈ W(s) ⊂ S. Therefore we must have s ≺ a, i.e. s ∈ W(a). This proves the inclusion S ⊂ W(a). Conversely, if s ∈ W(a), then s must belong to S. Otherwise s ∈ W ∖ S would contradict the minimality of a.

(ii) ⇒ (i). This is trivial.
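For a finite well-ordered set the lemma can be checked by brute force. The following is only a sketch on a hypothetical four-element chain of integers with the usual order; the helper names `initial_segment`, `is_full` and `full_subsets` are ours and do not appear in the notes.

```python
from itertools import combinations

def initial_segment(W, a):
    """W(a): the elements of W strictly below a."""
    return {x for x in W if x < a}

def is_full(W, S):
    """Condition (i) of Lemma C.1: W(s) is contained in S for every s in S."""
    return all(initial_segment(W, s) <= S for s in S)

def full_subsets(W):
    """Brute-force list of all subsets of the finite chain W satisfying (i)."""
    subsets = (set(c) for k in range(len(W) + 1) for c in combinations(W, k))
    return [S for S in subsets if is_full(W, S)]

W = [2, 3, 5, 7]
# Condition (ii): the sets W(a), a in W, together with W itself.
expected = [initial_segment(W, a) for a in W] + [set(W)]
print(full_subsets(W) == expected)
```

In this example the subsets satisfying (i) are exactly the sets listed in (ii), one of each size from 0 to 4.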

Definition. A subset S, as above, is called a full subset.

The key feature of well-ordered sets is the following.

Lemma C.2 (Transfinite Induction Principle). Let (W,≺) be a well-ordered set. Let w1 ∈ W be the smallest element of W. Assume A ⊂ W is a set with the property

(i) If w ∈ W has the property that W(w) ⊂ A, then w ∈ A.

Then A = W.

Proof. Consider the set

\[ S = \{ s \in A : W(s) \subset A \}. \]

It is obvious that S is full, and S ⊂ A. By Lemma C.1, either S = W, in which case we clearly get A = W, or there exists w ∈ W, such that S = W(w). In this case we have W(w) ⊂ A. By (i) this forces w ∈ A, so we get w ∈ S, which is impossible, since w ∉ W(w).


Another useful feature is

Lemma C.3 (Recursion Principle). Let (W,≺) be a well-ordered set, and let w1 be the smallest element in W. Let X be a set, and assume one has a family of maps Φa : ∏_{W(a)} X → X, a ∈ W ∖ {w1}. Then for any element x1 ∈ X, there exists a unique function f : W → X, such that

\[ f(w_1) = x_1 \quad\text{and}\quad f(a) = \Phi_a\bigl(f|_{W(a)}\bigr),\ \forall\, a \in W \setminus \{w_1\}. \tag{1} \]

Proof. For every a ∈ W let us denote the set W(a) ∪ {a} simply by Wa, and let us define the set

\[ F_a = \bigl\{ g : W_a \to X \;:\; g(w_1) = x_1 \text{ and } g(b) = \Phi_b\bigl(g|_{W(b)}\bigr),\ \forall\, b \in W_a \setminus \{w_1\} \bigr\}. \]

Remark that, for any a, b ∈ W, with a ≺ b, one has

\[ f|_{W_a} \in F_a,\ \forall\, f \in F_b. \tag{2} \]

Claim: For every a ∈ W, the set Fa is a singleton.

We prove this statement using transfinite induction. Define

\[ A = \{ a \in W : F_a \text{ is a singleton} \}. \]

Suppose a ∈ W has the property W(a) ⊂ A, which means that Fb is a singleton, for all b ∈ W(a). (If a = w1, then W(a) = ∅ and Fa consists of the single function w1 ↦ x1, so we may assume a ≠ w1.) For each b ∈ W(a), let fb : Wb → X be the unique element in Fb. We notice that, for any b, c ∈ W(a), with b ≺ c, using (2), we have

\[ f_c|_{W_b} = f_b. \tag{3} \]

This follows immediately from the fact that fc|Wb belongs to Fb. Using the obvious equality

\[ W(a) = \bigcup_{b \in W(a)} W_b, \]

we define g : W(a) → X as the unique function with the property that g|Wb = fb, ∀ b ∈ W(a). Finally, we define fa : Wa → X by fa|W(a) = g, and fa(a) = Φa(g). It is clear that fa ∈ Fa, so Fa has at least one element. If h ∈ Fa is another function, then for every b ∈ W(a) we have h|Wb ∈ Fb, which forces h|Wb = fb, in particular giving h|W(a) = g = fa|W(a). Then h(a) = Φa(g), which means that we also have h(a) = fa(a), so we must have h = fa.

Having proven the Claim, we now have a family of functions fa : Wa → X, a ∈ W, with fb|Wa = fa, for all a, b ∈ W with a ≺ b. Using the equality

\[ W = \bigcup_{a \in W} W_a, \]

we then define f : W → X to be the unique function such that f|Wa = fa, ∀ a ∈ W.

Notice that, for each a ∈ W ∖ {w1}, we have f(a) = fa(a), and since fa ∈ Fa, we immediately get (1). The uniqueness of f with property (1) is also clear, since any such f will automatically satisfy f|Wa ∈ Fa, for all a ∈ W.

Comment. The system of maps Φa : ∏_{W(a)} X → X, a ∈ W, is to be thought of as a “recurrence relation,” in the sense that it is used to define the value f(a) in terms of all “preceding” values f(w), w ≺ a, w ≠ a.
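On a finite well-ordered set the recursion can be carried out literally, one element at a time. The sketch below is only an illustration of the finite case, where ordinary induction already suffices. The function `define_by_recursion`, the use of a single map `Phi(a, g)` in place of the family Φa, and the sample recurrence `one_plus_sum` are all hypothetical choices.

```python
def define_by_recursion(W, x1, Phi):
    """Build f : W -> X with f(w1) = x1 and f(a) = Phi(a, f restricted to W(a)).

    W is a finite list sorted increasingly, so W[0] plays the role of w1.
    Phi(a, g) receives the dict g = f|W(a) of all previously defined values.
    """
    f = {W[0]: x1}
    for a in W[1:]:
        restriction = dict(f)          # f|W(a): every value defined so far
        f[a] = Phi(a, restriction)
    return f

# Hypothetical recurrence: each value is 1 plus the sum of all preceding values.
def one_plus_sum(a, g):
    return 1 + sum(g.values())

f = define_by_recursion([10, 20, 30, 40, 50], 1, one_plus_sum)
print(f)
```

Here each new value depends on the whole restriction f|W(a), not just on the immediate predecessor, which is the feature of Lemma C.3 that survives in the transfinite setting.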


Definitions. Given two well-ordered sets (W1,≺1) and (W2,≺2), a map f : (W1,≺1) → (W2,≺2) is called a full embedding, if

• f is injective.
• For any two elements x, y ∈ W1, one has

x ≺1 y ⇒ f(x) ≺2 f(y).

• f(W1) is a full subset of W2.

If f is a full embedding, with f(W1) = W2, then f is called an order isomorphism.

The properties of these types of maps are contained in the following

Proposition C.1. A. Suppose (W1,≺1) and (W2,≺2) are well-ordered sets.

(i) If f : (W1,≺1) → (W2,≺2) is a full embedding, then

\[ f\bigl(W_1(a)\bigr) = W_2\bigl(f(a)\bigr),\ \forall\, a \in W_1. \]

In particular, if w1 is the smallest element in W1, and w2 is the smallest element in W2, then f(w1) = w2.

(ii) If f : (W1,≺1) → (W2,≺2) is an order isomorphism, then f⁻¹ : (W2,≺2) → (W1,≺1) is again an order isomorphism.
(iii) There exists at most one full embedding f : (W1,≺1) → (W2,≺2).

B. Suppose (W1,≺1), (W2,≺2), (W3,≺3) are well-ordered sets, and

\[ (W_1,\prec_1) \xrightarrow{\ f\ } (W_2,\prec_2) \xrightarrow{\ g\ } (W_3,\prec_3) \]

are full embeddings.

(i) The composition g ∘ f : (W1,≺1) → (W3,≺3) is again a full embedding.
(ii) The composition g ∘ f is an order isomorphism, if and only if both f and g are order isomorphisms.

Proof. A. (i). Start first with some element x ∈ W1(a). Since x ≺1 a, we have f(x) ≺2 f(a). Since f is injective, and x ≠ a, we must have f(x) ≠ f(a), hence f(x) ∈ W2(f(a)). Conversely, if y ∈ W2(f(a)), then using the fact that f(W1) is full in W2, it follows that y ∈ f(W1), so there exists some x ∈ W1, with y = f(x). If a ≺1 x, then we would get f(a) ≺2 f(x), which is impossible. Therefore we must have x ≺1 a and x ≠ a, i.e. x ∈ W1(a), so y indeed belongs to f(W1(a)). The second assertion is now clear since we have

\[ W_2\bigl(f(w_1)\bigr) = f\bigl(W_1(w_1)\bigr) = f(\emptyset) = \emptyset, \]

which clearly forces f(w1) = w2.

(ii). This is obvious.

(iii). Suppose f, g : (W1,≺1) → (W2,≺2) are full embeddings, and let us show that we must have f = g. We use transfinite induction. Define the set

\[ A = \{ w \in W_1 : f(w) = g(w) \}. \]

Let w ∈ W1 be some element such that W1(w) ⊂ A, and let us prove that w ∈ A, i.e. f(w) = g(w). Denote f(w) by a, and g(w) by b. Using the fact that f|W1(w) = g|W1(w), combined with (i), we have

\[ W_2(a) = W_2\bigl(f(w)\bigr) = f\bigl(W_1(w)\bigr) = g\bigl(W_1(w)\bigr) = W_2\bigl(g(w)\bigr) = W_2(b). \]

This clearly forces a = b. Indeed, if a ≠ b, then either a ≺2 b, in which case a ∈ W2(b) ∖ W2(a), or b ≺2 a, in which case b ∈ W2(a) ∖ W2(b). In either case, we will get W2(a) ≠ W2(b).


B. (i). It is clear that g ∘ f is injective, and satisfies the second condition in the definition, so the only thing we need to prove is the fact that (g ∘ f)(W1) is full. If f(W1) = W2, there is nothing to prove, since we would get (g ∘ f)(W1) = g(W2), which is full.

Assume f(W1) = W2(a), for some a ∈ W2. Then by part A (i) we have

\[ (g \circ f)(W_1) = g\bigl(f(W_1)\bigr) = g\bigl(W_2(a)\bigr) = W_3\bigl(g(a)\bigr), \]

so again (g ∘ f)(W1) is full.

(ii). Assume first that both f and g are order isomorphisms. Then g ∘ f : (W1,≺1) → (W3,≺3) is a full embedding, by (i), and it is clearly surjective, hence g ∘ f is indeed an order isomorphism.

Conversely, assume g ∘ f : (W1,≺1) → (W3,≺3) is an order isomorphism. This clearly forces g to be surjective, hence an order isomorphism. But then g⁻¹ is an order isomorphism, and so will be g⁻¹ ∘ (g ∘ f) = f.

Corollary C.1. If (W,≺) is a well-ordered set, and a ∈ W, then there is no full embedding (W,≺) → (W(a),≺).

Proof. Suppose there exists a full embedding f : (W,≺) → (W(a),≺). Since the inclusion ι : (W(a),≺) → (W,≺) is obviously a full embedding, the composition ι ∘ f : (W,≺) → (W,≺) is a full embedding. Since we also have IdW : (W,≺) → (W,≺) as a full embedding, Proposition C.1.A (iii) would force ι ∘ f = IdW, which would force ι to be surjective. But this is obviously impossible, because a ∉ W(a).
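Both the uniqueness statement in Proposition C.1.A (iii) and Corollary C.1 can be observed on small finite chains. The sketch below models W1 and W2 as {0, ..., m−1} and {0, ..., n−1} with the usual order and enumerates all injective maps; the function names are ours, and the brute-force search is only meant as an illustration, not as part of the proof.

```python
from itertools import permutations

def is_full_embedding(f):
    """f is a tuple (f(0), ..., f(m-1)) of distinct non-negative integers.

    Checks that f is strictly increasing (injective and order preserving)
    and that its image is full: whenever y is in the image, so is every
    smaller non-negative integer.
    """
    increasing = all(f[i] < f[i + 1] for i in range(len(f) - 1))
    image = set(f)
    full = all(x in image for y in image for x in range(y))
    return increasing and full

def full_embeddings(m, n):
    """All full embeddings {0..m-1} -> {0..n-1}, found by brute force."""
    return [f for f in permutations(range(n), m) if is_full_embedding(f)]

print(full_embeddings(3, 5))   # at most one full embedding
print(full_embeddings(4, 3))   # none from a chain into a proper initial segment
```

In this finite model the only full embedding, when one exists, is the inclusion of an initial segment.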

Definitions. Two well-ordered sets (W1,≺1) and (W2,≺2) are said to have the same order type, if there exists an order isomorphism (W1,≺1) → (W2,≺2). By the above considerations, this defines an equivalence relation on the class of all well-ordered sets.

An ordinal number is thought of as an equivalence class of well-ordered sets. In other words, if we write an ordinal number as α, it is understood that α consists of all well-ordered sets of a given order type. So when we write ord(W,≺) = α we understand that (W,≺) belongs to this class, and for another well-ordered set (W′,≺′) we write ord(W′,≺′) = α, exactly when (W′,≺′) has the same order type as (W,≺). In this case we write ord(W′,≺′) = ord(W,≺).

We regard the empty set ∅ as a well-ordered set, with the empty relation. We write ord(∅) = 0.

Comments. If (W1,≺1) and (W2,≺2) are well-ordered sets, then one has the obvious implication

\[ \operatorname{ord}(W_1,\prec_1) = \operatorname{ord}(W_2,\prec_2) \implies \operatorname{card} W_1 = \operatorname{card} W_2. \]

Conversely, if the well-ordered sets (W1,≺1) and (W2,≺2) are finite, and card W1 = card W2, then ord(W1,≺1) = ord(W2,≺2). Indeed, if we take n = card W1, then one can define recursively a finite sequence (wk), k = 1, . . . , n, in W1, by taking w1 to be the smallest element of W1, and defining, for each k ∈ {2, 3, . . . , n}, the element wk to be the smallest element of the set W1 ∖ {w1, w2, . . . , wk−1}. The obvious bijection

\[ \{1, 2, \dots, n\} \ni k \longmapsto w_k \in W_1 \]

will then define an order isomorphism ({1, . . . , n},≤) → (W1,≺1).

Likewise (W2,≺2) has the same order type as ({1, . . . , n},≤).
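The recursive enumeration w1, w2, . . . , wn used above is easy to carry out explicitly. The following sketch assumes the ordering is supplied as a Python predicate `precedes(a, b)` encoding a ≺ b; the two three-element well-ordered sets at the end are hypothetical examples chosen for illustration.

```python
def enumerate_increasing(W, precedes):
    """Return [w_1, ..., w_n], where w_k is the smallest element of
    W minus {w_1, ..., w_{k-1}}.

    precedes(a, b) encodes the (non-strict) well ordering on the finite set W.
    """
    remaining = set(W)
    sequence = []
    while remaining:
        smallest = next(a for a in remaining
                        if all(precedes(a, x) for x in remaining))
        sequence.append(smallest)
        remaining.remove(smallest)
    return sequence

def order_isomorphism(W1, precedes1, W2, precedes2):
    """Match the k-th smallest element of W1 with the k-th smallest of W2."""
    s1 = enumerate_increasing(W1, precedes1)
    s2 = enumerate_increasing(W2, precedes2)
    assert len(s1) == len(s2), "the two sets must have the same cardinality"
    return dict(zip(s1, s2))

# Hypothetical examples: words ordered by length, numbers in reversed order.
iso = order_isomorphism({"aaa", "b", "cc"}, lambda a, b: len(a) <= len(b),
                        {1, 2, 3}, lambda a, b: a >= b)
print(iso)
```

Composing the two enumerations in this way is the finite counterpart of passing through the common model ({1, . . . , n},≤).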


Using the above notations, we can then regard all non-negative integers as ordinal numbers, by identifying ord(W,≺) = card W, for all finite well-ordered sets (W,≺).

Notation. If α is an ordinal number, say α = ord(W,≺), for some well-ordered set (W,≺), then the cardinal number card W does not depend on the particular choice of (W,≺). We will denote it by card α. As discussed above, if

\[ \operatorname{card}\alpha = \operatorname{card}\beta = \text{finite cardinal}, \]

then α = β. As we shall see later, this implication holds only for finite ordinal numbers.

Definitions. Let α1 and α2 be ordinal numbers, say α1 = ord(W1,≺1) and α2 = ord(W2,≺2), where (W1,≺1) and (W2,≺2) are two well-ordered sets. We write α1 ≤ α2, if there exists a full embedding f : (W1,≺1) → (W2,≺2). By Proposition C.1, this definition is independent of the choices of (W1,≺1) and (W2,≺2).

We write α1 < α2 if α1 ≤ α2 and α1 ≠ α2.

Remark C.1. If α1 and α2 are ordinal numbers, with α1 ≤ α2, then card α1 ≤ card α2.

Proposition C.2. The relation ≤ is an order relation, on any set of ordinal numbers.

Proof. It is obvious that α ≤ α, for any ordinal number α.

Assume α1 and α2 are ordinal numbers with α1 ≤ α2 and α2 ≤ α1, and let us show that this forces α1 = α2. Let (W1,≺1) and (W2,≺2) be well-ordered sets with α1 = ord(W1,≺1) and α2 = ord(W2,≺2). Since α1 ≤ α2, there exists a full embedding f : (W1,≺1) → (W2,≺2). Since α2 ≤ α1, there exists a full embedding g : (W2,≺2) → (W1,≺1). By Proposition C.1.B, the composition g ∘ f : (W1,≺1) → (W1,≺1) is a full embedding. Since we already have a full embedding IdW1 : (W1,≺1) → (W1,≺1), by Proposition C.1.A, we must have g ∘ f = IdW1. Using Proposition C.1.B this forces f (and g) to be order isomorphisms, so we indeed have α1 = α2.

Finally, suppose α1, α2 and α3 are ordinal numbers such that α1 ≤ α2 and α2 ≤ α3. The fact that α1 ≤ α3 follows immediately from Proposition C.1.B.

Theorem C.1 (Ordinal Comparability Theorem). Let α1 and α2 be ordinal numbers. Then either α1 ≤ α2, or α2 ≤ α1.

Proof. Let (W1,≺1) and (W2,≺2) be well-ordered sets with α1 = ord(W1,≺1) and α2 = ord(W2,≺2). For every a ∈ W1 we denote the set W1(a) ∪ {a} simply by W_1^a. It is clear that (W_1^a,≺1) is well-ordered. Consider the set

\[ A = \{ a \in W_1 : \text{there exists a full embedding } (W_1^a,\prec_1) \to (W_2,\prec_2) \}. \]

By Proposition C.1.A, we know that for any a ∈ A, there exists a unique full embedding (W_1^a,≺1) → (W2,≺2). We denote this full embedding by fa.

Claim 1: The set A is full. Moreover, for any a, b ∈ A, with b ≺1 a, we have fb = fa|W_1^b.

Start with some a ∈ A, and let us prove that W1(a) ⊂ A. Fix some arbitrary b ∈ W1(a). Then the inclusion ι : (W_1^b,≺1) → (W_1^a,≺1) is obviously a full embedding, since we can write

\[ W_1^b = W_1(c), \]


where c is the smallest element of the set

\[ D_b = \{ x \in W_1 : b \prec_1 x \text{ and } b \ne x \}. \]

(The fact that a ∈ Db shows that Db ≠ ∅.) Then the composition

\[ f_a \circ \iota : (W_1^b,\prec_1) \to (W_2,\prec_2) \]

is a full embedding, so b indeed belongs to A. Moreover, we will have fb = fa ∘ ι = fa|W_1^b.

Define the map φ : A → W2 by

\[ \phi(a) = f_a(a),\ \forall\, a \in A. \]

Remark that

\[ \phi|_{W_1^a} = f_a,\ \forall\, a \in A. \tag{4} \]

Indeed, if we take some b ∈ W1(a), then by Claim 1, we have φ(b) = fb(b) = fa(b), so we get φ|W1(a) = fa|W1(a).

Claim 2: φ : (A,≺1) → (W2,≺2) is a full embedding.

We start by proving the first two conditions. Let a, b ∈ A be such that b ≺1 a and b ≠ a, and let us show that φ(b) ≺2 φ(a) and φ(b) ≠ φ(a). We have b ∈ W_1^a and φ|W_1^a = fa, so using the fact that fa : (W_1^a,≺1) → (W2,≺2) is a full embedding, we indeed get φ(b) = fa(b) ≺2 fa(a) = φ(a), and φ(b) ≠ φ(a).

We now show that φ(A) is full in (W2,≺2). Start with some y ∈ φ(A), and let us show that W2(y) ⊂ φ(A). On the one hand, since we obviously have

\[ A = \bigcup_{a \in A} W_1^a, \]

we also have

\[ \phi(A) = \phi\Bigl( \bigcup_{a \in A} W_1^a \Bigr) = \bigcup_{a \in A} \phi(W_1^a), \]

so there exists some a ∈ A, such that y ∈ φ(W_1^a) = fa(W_1^a). On the other hand, since fa : (W_1^a,≺1) → (W2,≺2) is a full embedding, it follows that fa(W_1^a) is full, so we get W2(y) ⊂ fa(W_1^a) = φ(W_1^a) ⊂ φ(A).

We now finish the proof. Since both A and φ(A) are full, there are three cases to examine.

Case 1: A = W1. In this case φ : (W1,≺1) → (W2,≺2) is a full embedding, so we get α1 ≤ α2.

Case 2: φ(A) = W2. In this case φ : (A,≺1) → (W2,≺2) is an order isomorphism, so φ⁻¹ : (W2,≺2) → (W1,≺1) is a full embedding (its range A is full in W1), and we get α2 ≤ α1.

Case 3: A ⊊ W1 and φ(A) ⊊ W2. This means there exist a1 ∈ W1 and a2 ∈ W2 such that A = W1(a1) and φ(A) = W2(a2). This case turns out to be impossible. To see this, we define ψ : W_1^{a1} → W2 by ψ|W1(a1) = φ and ψ(a1) = a2. Then ψ : (W_1^{a1},≺1) → (W2,≺2) will still be a full embedding. Indeed, the first two conditions in the definition are clear, while the equality

\[ \psi(W_1^{a_1}) = W_2^{a_2} = \{ y \in W_2 : y \prec_2 a_2 \} \]

proves that ψ(W_1^{a1}) is full. The existence of ψ then forces a1 ∈ A, which contradicts the equality A = W1(a1).


Theorem C.2. Let α be an ordinal number. Then the class Pα of all ordinal numbers β with β < α is a set. More explicitly, if (W,≺) is a well-ordered set with ord(W,≺) = α, then the map

\[ \phi : W \ni a \longmapsto \operatorname{ord}(W(a),\prec) \in P_\alpha \]

is a bijection. Moreover, (Pα,≤) is well-ordered, and φ : (W,≺) → (Pα,≤) is an order isomorphism.

Proof. Let β be an ordinal number with β < α. Then there exists a well-ordered set (W1,≺1), and a full embedding φ : (W1,≺1) → (W,≺), such that

• β = ord(W1,≺1),
• φ(W1) = W(a1),

for some a1 ∈ W. This fact already proves that Pα is a set.

Claim: The element a1 ∈ W does not depend on the particular choice of (W1,≺1).

Indeed, if (W2,≺2) is another well-ordered set, and ψ : (W2,≺2) → (W,≺) is another full embedding with

• β = ord(W2,≺2),
• ψ(W2) = W(a2),

for some a2 ∈ W, then we would get the existence of an order isomorphism γ : (W(a1),≺) → (W(a2),≺). We can assume (otherwise we replace γ with γ⁻¹) that a1 ≺ a2. If a1 ≠ a2, we would have a1 ∈ W(a2), so if we work with the well-ordered set Z = W(a2) we would have an order isomorphism (Z,≺) → (Z(a1),≺). By Corollary C.1 this is impossible. Therefore, we must have a1 = a2.

Using the Claim, we then define aβ as the unique element in W, such that ord(W(aβ),≺) = β. Define the map ψ : Pα ∋ β ↦ aβ ∈ W. It is clear that φ ∘ ψ = IdPα.

Let us prove now that ψ ∘ φ = IdW. Start with some arbitrary a ∈ W, and put β = φ(a) = ord(W(a),≺). Since ord(W(a),≺) = β, by the Claim, we must have aβ = a, i.e. ψ(β) = a, which means that (ψ ∘ φ)(a) = a.

Finally, we note that, if a, b ∈ W are elements with a ≺ b, then the obvious full embedding (W(a),≺) → (W(b),≺) proves that ord(W(a),≺) ≤ ord(W(b),≺), i.e. φ(a) ≤ φ(b).

Since φ is bijective, it is clear that, for a, b ∈ W, we have in fact the equivalence

\[ a \prec b \iff \phi(a) \le \phi(b). \]

This proves that (Pα,≤) is well-ordered, and φ : (W,≺) → (Pα,≤) is an order isomorphism.

Corollary C.2. If S is a set of ordinal numbers, then (S,≤) is well-ordered.

Proof. By Theorem C.1, (S,≤) is totally ordered. Fix some non-empty subset A ⊂ S, and let us show that A has a smallest element. Start with some arbitrary α ∈ A. If α ≤ β, ∀ β ∈ A, we are done. Otherwise, the intersection A ∩ Pα is non-empty. We then use the fact that (Pα,≤) is well-ordered, to choose α1 to be the smallest element of A ∩ Pα. If we start with some arbitrary β ∈ A, then either α ≤ β, in which case we immediately get α1 < β, or β < α, in which case β ∈ A ∩ Pα, and we again get α1 ≤ β. So α1 is in fact the smallest element of A.


Theorem C.3 (Well-Ordering Theorem). Every non-empty set has a well ordering.

Proof. Let X be a non-empty set, and let

\[ \mathcal{W} = \{ (W,\prec) : (W,\prec) \text{ well-ordered, and } W \subset X \}. \]

For two elements (W1,≺1) and (W2,≺2) of 𝒲, we define (W1,≺1) ⊏ (W2,≺2), if and only if W1 ⊂ W2, and the inclusion map (W1,≺1) → (W2,≺2) is a full embedding. (This is equivalent to the fact that W1 is a full subset of (W2,≺2), and ≺1 = ≺2|W1.)

It is obvious that (𝒲,⊏) is an ordered set. We want to apply Zorn Lemma to this set. We need to check the hypothesis. Start with a totally ordered subset T = {(Wi,≺i) : i ∈ I} ⊂ 𝒲, and let us show that T has an upper bound in 𝒲.

Define W = ⋃_{i∈I} Wi. For a, b ∈ W, we define a ≺ b, if and only if there exists i ∈ I, such that a, b ∈ Wi, and a ≺i b. Let us check that (W,≺) is a well-ordered set. First of all, we need to show that ≺ is an order relation on W. It is clear that a ≺ a, ∀ a ∈ W. Suppose a, b ∈ W satisfy a ≺ b and b ≺ a, and let us show that a = b. We know there exist i, j ∈ I such that a, b ∈ Wi and a ≺i b, and a, b ∈ Wj and b ≺j a. Now there are two possibilities: either (Wi,≺i) ⊏ (Wj,≺j), or (Wj,≺j) ⊏ (Wi,≺i). In the first case we get a ≺i b and b ≺i a, so we would get a = b. In the other case, by symmetry, we again get a = b. Let us show now transitivity. Suppose a, b, c ∈ W satisfy a ≺ b and b ≺ c, and let us show that a ≺ c. We know there exist i, j ∈ I, such that a, b ∈ Wi and a ≺i b, and b, c ∈ Wj and b ≺j c. As above, we have two possibilities: either (Wi,≺i) ⊏ (Wj,≺j), or (Wj,≺j) ⊏ (Wi,≺i). In the first case we get a, b, c ∈ Wj and a ≺j b ≺j c, so we get a ≺j c. In the second case, we get a, b, c ∈ Wi and a ≺i b ≺i c, so we get a ≺i c. In either case we get a ≺ c.

Next we show that (W,≺) is totally ordered. Start with arbitrary a, b ∈ W, and let us prove that either a ≺ b or b ≺ a. If we choose i, j ∈ I such that a ∈ Wi and b ∈ Wj, then using the two possibilities (Wi,≺i) ⊏ (Wj,≺j) or (Wj,≺j) ⊏ (Wi,≺i) we immediately see that we can find k ∈ I (k is either i or j), such that a, b ∈ Wk. Then using the fact that (Wk,≺k) is totally ordered, we either have a ≺k b, or b ≺k a. This gives either a ≺ b, or b ≺ a.

In order to prove that (W,≺) is well-ordered, and (Wi,≺i) ⊏ (W,≺), ∀ i ∈ I, we shall use the following

Claim: For any i ∈ I, one has the implication:

\[ a \in W_i \implies W(a) \subset W_i. \]

Indeed, if there exists some b ∈ W(a), but b ∉ Wi, this would mean that there exists some j ∈ I, with a, b ∈ Wj, b ≺j a, and b ≠ a. This would then force (Wi,≺i) ⊏ (Wj,≺j), and b ∈ Wj(a). But this is impossible, since the fact that Wi is full in (Wj,≺j) would force b ∈ Wj(a) ⊂ Wi.

Let us show now that (W,≺) is well-ordered. Start with some arbitrary non-empty subset A ⊂ W. Choose i ∈ I, such that A ∩ Wi ≠ ∅, and take a to be the smallest element in A ∩ Wi, in the well-ordered set (Wi,≺i), i.e.

\[ a \in A \cap W_i, \text{ and } a \prec_i x,\ \forall\, x \in A \cap W_i. \tag{5} \]

Let us prove that a is in fact the smallest element of A, in (W,≺). Start with some arbitrary element b ∈ A, and let us prove that a ≺ b. Assume the opposite, which, using the fact that (W,≺) is totally ordered, means that b ≺ a, and b ≠ a,


i.e. b ∈ W(a). By the Claim however, this will force b ∈ Wi, so we would get b ∈ A ∩ Wi, and the choice of a would give a ≺i b, which would then give a ≺ b, thus contradicting the assumption on b.

We now prove (Wi,≺i) ⊏ (W,≺), ∀ i ∈ I. It is clear that the inclusion map ι : (Wi,≺i) → (W,≺) satisfies the first two conditions in the definition of full embeddings, so the only thing we need is the fact that Wi is full in (W,≺). But this is precisely the content of the above Claim.

Having shown that every totally ordered subset T ⊂ 𝒲 has an upper bound, we now invoke Zorn Lemma, to get the existence of a maximal element (W,≺) ∈ 𝒲. The proof of the Theorem will be finished once we prove that W = X. We prove this equality by contradiction. Assume W ⊊ X. Pick an element x ∈ X ∖ W, and define the set W1 = W ∪ {x}. Equip W1 with the order relation ≺1 defined by

\[ a \prec_1 b \iff \bigl( a, b \in W \text{ and } a \prec b \bigr) \text{ or } b = x. \]

It is pretty obvious that W = W1(x) and ≺ = ≺1|W, so (W1,≺1) is well-ordered and (W,≺) ⊏ (W1,≺1). Since W ⊊ W1, this would contradict the maximality.

Comment. An interesting consequence of the Well-Ordering Theorem is the following: For any cardinal number a, there exists an ordinal number α, such that card α = a.

Another interesting application is the following:

Corollary C.3. If C is a set of cardinal numbers, then (C,≤) is well-ordered.

Proof. For any a ∈ C we choose a well-ordered set (Wa,≺a) with card Wa = a. Choose any set X with

\[ a < \operatorname{card} X,\ \forall\, a \in C. \]

(For example, we can take Y = ⋃_{a∈C} Wa, so that a ≤ card Y, ∀ a ∈ C, and then we define X = {0, 1}^Y.) Choose a well-ordering ≺ on the set X. Define α = ord(X,≺) and αa = ord(Wa,≺a), ∀ a ∈ C. Since

\[ \operatorname{card}\alpha_a = a < \operatorname{card} X = \operatorname{card}\alpha,\ \forall\, a \in C, \]

it follows that we have αa < α, i.e. αa ∈ Pα, ∀ a ∈ C.

Apply now the fact that the set of ordinals (Pα,≤) is well-ordered, to find some a0 ∈ C, such that

\[ \alpha_{a_0} \le \alpha_a,\ \forall\, a \in C. \]

This will clearly imply

\[ a_0 = \operatorname{card}\alpha_{a_0} \le \operatorname{card}\alpha_a = a,\ \forall\, a \in C. \]

Examples C.1. As previously discussed, for every finite cardinal number n ≥ 0, there exists exactly one ordinal number with n as its cardinality.

The next interesting case is the class

\[ A = \{ \alpha : \alpha \text{ ordinal number with } \operatorname{card}\alpha \le \aleph_0 \}. \]

This class A is a set. Indeed if we choose an ordinal number γ1 with card γ1 = c, then A is a subset of Pγ1. Moreover, if we choose an ordinal number γ2 with card γ2 = 2^c, then we see that γ1 ∈ Pγ2 ∖ A. We can then take Ω to be the smallest element of the non-empty set Pγ2 ∖ A, and we have

\[ A = P_\Omega. \]


The inclusion A ⊂ PΩ is clear. To prove the other inclusion, we start with some ordinal number α < Ω and we see that this forces α ∈ Pγ2, so it will be impossible to have ℵ0 < card α, because this would give α ∈ Pγ2 ∖ A, contradicting the minimality of Ω.

The ordinal number Ω is called the smallest uncountable ordinal number.

Fact 1: The set PΩ is uncountable.

This follows from the fact that

\[ \Omega = \operatorname{ord}(P_\Omega,\le), \]

which gives ℵ0 < card Ω = card PΩ.

Fact 2: The cardinal number ℵ1 = card Ω = card PΩ is the smallest uncountable cardinal number.

Indeed, if one starts with some cardinal number a < ℵ1, and we choose a well-ordered set (W,≺) with card W = a, then, since we have card W < card Ω, we must have ord(W,≺) < Ω, which then forces a ≤ ℵ0.

Fact 3: Any countable subset A ⊂ PΩ has a strict upper bound in PΩ, that is, there exists β ∈ PΩ, such that α < β, ∀ α ∈ A.

We prove this by contradiction. Assume A has no strict upper bound in PΩ, which means that for every β ∈ PΩ, there exists some α ∈ A such that β ≤ α. This gives

\[ P_\Omega = \bigcup_{\alpha \in A} \bigl( P_\alpha \cup \{\alpha\} \bigr). \tag{6} \]

But for every α ∈ PΩ we have ord Pα = α, which forces card Pα ≤ ℵ0. Then the fact that A is countable, combined with (6), will force PΩ to be countable, which is impossible.

The above construction can be generalized to arbitrary cardinal numbers, giving the following

Fact 4: Given any cardinal number a, there exists a smallest ordinal number Ωa with a < card Ωa, and the cardinal number a′ = card Ωa is the smallest cardinal number with a < a′. Any set A ⊂ PΩa, with card A ≤ a, has a strict upper bound in PΩa.