my personal notes for the course

Math 3001 - Analysis 1

Agnes Beaudry

May 3, 2017

Contents

1 Introduction
  1.1 Disclaimers
  1.2 Logic Reminders
    1.2.1 Truth Values
    1.2.2 “Or” and “And”
    1.2.3 Negation
    1.2.4 Conditionals
    1.2.5 Quantifiers

2 Set Theory
  2.1 The very basics
    2.1.1 Sets
    2.1.2 Algebra of Sets
    2.1.3 DeMorgan’s Laws
    2.1.4 Families of sets
    2.1.5 Power Set
    2.1.6 Cartesian Product
  2.2 Relations, Orders, Equivalences
    2.2.1 Relations
    2.2.2 Order Relations
    2.2.3 Equivalence Relations
  2.3 Functions and what you can do with them
    2.3.1 Functions
    2.3.2 Image and Preimage
    2.3.3 Injectivity, Surjectivity, Bijectivity
    2.3.4 Inverse Function
    2.3.5 Operations
  2.4 Cardinality and Countability
    2.4.1 Finite sets
    2.4.2 Countability

3 Sets of Numbers
  3.1 From the Natural Numbers to the Integers
    3.1.1 Peano’s Axioms and Mathematical Induction
    3.1.2 Integers and Rings
    3.1.3 The Rational Numbers and Fields
    3.1.4 Totally Ordered Fields
  3.2 Towards the Real Numbers
    3.2.1 Irrationality of √2
    3.2.2 Least Upper Bounds and Greatest Lower Bounds
    3.2.3 Axiomatic definition of the real numbers R
    3.2.4 Archimedean Property
    3.2.5 Nested Interval Property
    3.2.6 Density of Q
    3.2.7 Square Roots
  3.3 Absolute Value and ε–Neighborhoods

4 Sequences
  4.1 Sequences
    4.1.1 Convergence
    4.1.2 Monotone Sequences
    4.1.3 Subsequences
  4.2 Bolzano-Weierstrass
  4.3 Cauchy Sequences

5 Topology on R
  5.1 Limit Points
  5.2 Closed sets
  5.3 Open Sets
  5.4 Compactness
  5.5 Connectedness

6 Continuity and Differentiability
  6.1 Continuous Functions
    6.1.1 Functional Limits
    6.1.2 Continuity
    6.1.3 Continuity and open sets
    6.1.4 Extreme Value Theorem
    6.1.5 Intermediate Value Theorem
    6.1.6 Uniform Continuity
  6.2 Differentiability
    6.2.1 Derivatives
    6.2.2 The Chain Rule
    6.2.3 Derivative at minima and maxima
    6.2.4 The Mean Value Theorem

7 The Riemann Integral
  7.1 Definition and properties of the Riemann integral
  7.2 The Fundamental Theorem of Calculus
  7.3 Integration by parts and change of variables

8 Series and power series
  8.1 Series
  8.2 Lebesgue’s Theorem
  8.3 Sequences and Series of Functions
  8.4 Power Series

Chapter 1

Introduction

1.1 Disclaimers

These notes are a mix between notes I have written based on Paul Sally’s book:

Sally, Paul J., Jr., Tools of the trade. Introduction to advanced mathematics. American

Mathematical Society, Providence, RI, 2008.

things I’ve added myself, and things I have taken from UChicago’s wonderful set of Math 160s IBL

- Honors Calculus I-III scripts. A lot of this is not original at all and I take responsibility for the

mistakes.

1.2 Logic Reminders

Mathematical language is very rigid and depends on simple logic. So let’s make sure we are all on

the same page.

1.2.1 Truth Values

Suppose that P is a statement. For example, P = “the dog is brown”. Alternatively, we can have

statements that depend on an input, like P (x) = “x is brown”. Here, P (x) is a property of the

input x. Then the truth value of P is true or false depending on whether the statement is correct

or wrong. For example, if P = “the Sun never rises”, then most likely, the truth value of P is false.

If the statement P (x) depends on x, the truth value also depends on x. For example, P (my dog)

could be true while P (the Sun) is false.


http://bookstore.ams.org/mbk-55

http://math.uchicago.edu/~boller/IBL/


1.2.2 “Or” and “And”

Now take P and Q any statements. We can make various sentences out of P and Q. For example,

P and Q

This sentence as a whole has a truth value, and this truth value depends on that of P and Q. In

fact, the sentence is true only if

• P is true and Q is true

In any other case (for example, P is false and Q is true), the sentence itself is false.

Another example is

P or Q

When is this sentence true? In math, “or” is not exclusive. This means that there are three cases
in which this sentence is true.

• If P is true and Q is true.

• If P is true and Q is false.

• If P is false and Q is true.

I.e., you only need one of the two to be true, but they can also both be true.

1.2.3 Negation

Now, negating sentences (determining their opposite) is extremely important. For example, (not
P) is true exactly when P is false and (not P) is false exactly when P is true. But sometimes, we
have more elaborate statements to negate.

For example, when is “P or Q” a false sentence? From what we saw above, there is only one

way this can be false, i.e.,

• if P is false and Q is false

That is

not (P or Q) = (not P ) and (not Q)

Similarly, when is “P and Q” a false sentence? Well, when one of P and Q is false, so

not (P and Q) = (not P ) or (not Q).
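Since P and Q can each only be true or false, both negation rules can be checked by running through all four rows of the truth table. Here is a small Python sketch of that check; the names `negation_rules_hold` and `truth_table` are made up for illustration.

```python
from itertools import product

def negation_rules_hold(p: bool, q: bool) -> bool:
    """Compare both sides of the two negation rules for one choice of truth values."""
    rule_or = (not (p or q)) == ((not p) and (not q))
    rule_and = (not (p and q)) == ((not p) or (not q))
    return rule_or and rule_and

# One entry per row of the truth table: (P, Q) -> do both rules hold?
truth_table = {(p, q): negation_rules_hold(p, q)
               for p, q in product([True, False], repeat=2)}
all_rows_hold = all(truth_table.values())
print(all_rows_hold)
```

Because there are only finitely many rows, this exhaustive check is a complete verification of the two rules, not just a spot check.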


1.2.4 Conditionals

Often, we make sentences such as

if P , then Q

This is the same as saying “P implies Q”.

When is such a statement true? Well, the requirement is that Q is true whenever P is true.
So, the sentence is true when

• P is true and Q is true

• P is false and Q is true

• P is false and Q is false

In the last two cases, since P is false, we ask for nothing! For example,

if 2 is odd, I am 10 years old

That’s just always true. Whether I’m 10 or not, since 2 is just not odd!

So, when is “if P , then Q” false then? Well, there’s only one way for this to be false. That is, if

• P is true and Q is false

That is,

not (if P , then Q) = P and not Q

Exercise 1.2.1 (Contrapositive). Check that

if P , then Q = if (not Q), then (not P )

This is called the contrapositive.
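The truth table of “if P, then Q” can be tabulated in the same way. The sketch below encodes the conditional as a hypothetical helper `implies` (my own name, chosen for illustration) and compares it with its contrapositive and with the negation rule above on all four rows.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of 'if p, then q': false only when p is true and q is false."""
    return (not p) or q

rows = list(product([True, False], repeat=2))

# The rows (P, Q) on which the conditional is false.
false_rows = [(p, q) for p, q in rows if not implies(p, q)]

# 'if P, then Q' versus 'if (not Q), then (not P)', row by row.
contrapositive_ok = all(implies(p, q) == implies(not q, not p) for p, q in rows)

# not (if P, then Q) versus (P and not Q), row by row.
negation_ok = all((not implies(p, q)) == (p and not q) for p, q in rows)
```

The single entry of `false_rows` is the case where P is true and Q is false, matching the discussion above.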

Another popular phrasing is

P if and only if Q

This sentence is true exactly when P and Q have the same truth value. That is, when

• P is true and Q is true

• P is false and Q is false


Therefore, we have

not (P if and only if Q) = (P and (not Q)) or ((not P ) and Q)

Exercise 1.2.2. Check that

P if and only if Q = (if P then Q) and (if Q then P )

and that

P if and only if Q = (not P ) if and only if (not Q)

1.2.5 Quantifiers

Finally, we turn to sentences P(x) that depend on the meaning of x. If we plug in a specific x, like
x = “my dog”, then P(x) becomes a precise statement and we can go to the previous sections to
discuss its truth value and how it behaves within sentences. However, we often make statements
like:

for all x, P (x)

or

there exists x such that P (x)

When are such statements false? Well, “for all x, P (x)” is false if there is at least one x such that

P (x) is false. In other words

not (for all x, P (x)) = there exists x such that (not P (x))

Similarly, when is “there exists x such that P (x)” false? Well, exactly when all x make P (x) false.

That is,

not (there exists x such that P (x)) = for all x, (not P (x))
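Over a finite range of inputs, “for all” and “there exists” behave like Python's `all` and `any`, so the two negation rules can be tried out directly. This is only a sketch: the domain and the property `P` below are arbitrary choices made for illustration.

```python
domain = range(-3, 4)  # hypothetical finite domain: -3, -2, ..., 3

def P(x: int) -> bool:
    """Sample property: x squared is greater than 2."""
    return x * x > 2

# not (for all x, P(x))  versus  there exists x such that (not P(x))
not_forall = not all(P(x) for x in domain)
exists_not = any(not P(x) for x in domain)

# not (there exists x such that P(x))  versus  for all x, (not P(x))
not_exists = not any(P(x) for x in domain)
forall_not = all(not P(x) for x in domain)
```

Here `not_forall` and `exists_not` agree, and so do `not_exists` and `forall_not`, whatever property is substituted for `P`.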


Chapter 2

Set Theory

2.1 The very basics

2.1.1 Sets

Definition 2.1.1. A set is a collection of elements or objects. If A is a set and x is an element of
A, we write

x ∈ A or A ∋ x.

The symbols { and } are used to denote a set. For example {a, b, c} is a set containing three

elements called a, b and c.

Example 2.1.2. • N = {1, 2, 3, . . .}

• Z = {0, 1,−1, 2,−2, . . .}

• X = {A,B,C,. . . , X,Y,Z}

• Y = {x | x is a letter in the English alphabet}

• Ø = { }.

If P (x) is any property that describes x, and A is a set, we can form the set:

{x ∈ A | P (x)}.

Warning 2.1.3. Sets don’t have repetitions and are not ordered. For example

{1, 1, 2, 3} = {1, 2, 3} = {2, 3, 1}.


Question 2.1.4. Is {0, 1} equal to {0, {1}}?

Answer. No, these sets do not contain the same elements. Although they both contain 0, the first

set contains 1 while the second set contains the “set containing 1”. Here is an imperfect analogy:

1 is to {1} as a dog is to a picture of a dog.
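Python's built-in sets follow the same conventions, which makes the last warning and question easy to try out. In this sketch, `frozenset({1})` stands in for the set {1} as an element, because Python does not allow an ordinary (mutable) set to be an element of another set.

```python
# Repetitions and order do not matter.
a = {1, 1, 2, 3}
b = {1, 2, 3}
c = {2, 3, 1}
repetition_and_order_ignored = (a == b) and (b == c)

# {0, 1} versus {0, {1}}: frozenset({1}) plays the role of the element {1}.
flat = {0, 1}
nested = {0, frozenset({1})}
sets_differ = flat != nested
```

The number 1 is an element of `flat` but not of `nested`, while `frozenset({1})` is an element of `nested` but not of `flat`.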

Definition 2.1.5. Suppose that A and B are sets. We say that A = B if the elements of A are

the same as the elements of B.

Example 2.1.6. For the examples above, X = Y .

Warning 2.1.7. When making a definition, we often use the sentence structure “if ..., then...”.
For example, as above, “We say that A = B if the elements of A are the same as the elements of B”. Since
this is a definition, it automatically behaves like an “if and only if”. That is, from this point on,
A = B implies (or more precisely means) that the elements of A are the same as the elements of
B, and the elements of A being the same as those of B implies that A = B.

Definition 2.1.8. Let A and B be sets. Then A is a subset of B, denoted A ⊆ B (or A ⊂ B), if
whenever x ∈ A, then x ∈ B. If A ⊆ B but A ≠ B, then we write A ⊊ B.

Example 2.1.9. For the examples above, N ⊆ Z, N ⊊ Z, X ⊆ Y and Y ⊆ X.

Lemma 2.1.10. If A ⊆ B and B ⊆ C (statement P), then A ⊆ C (statement Q).

Remark 2.1.11. How to write a proof? Derive from the given what is wanted. The given are:

• Definitions

• Axioms

• Previously known facts

• Logic

To prove an “if P , then Q” statement, you must show that whenever P is true, Q is also true.

Here are three techniques:

1. You can assume that P is true, and from this assumption, derive that Q is true.

2. You can assume that Q is not true, and from this assumption derive that P is not true. This

is proving the contrapositive.


3. You can assume that P is true and that Q is not true, and from these two assumptions, derive

a contradiction. This is called a proof by contradiction.

Proof. (1) Suppose that A ⊆ B and B ⊆ C. Suppose that x ∈ A. Then x ∈ B since A ⊆ B. Since

x ∈ B, then x ∈ C since B ⊆ C. So if x ∈ A, then x ∈ C and therefore A ⊆ C.

Lemma 2.1.12. Let A be any set. Then Ø ⊆ A.

This is not an “if . . . , then . . . ” statement, so we must find a different idea. We will do a proof

by contradiction.

Proof. Suppose that Ø ⊈ A. Then there exists x ∈ Ø such that x ∉ A. But there are no elements
in the empty set. So this is a contradiction. Therefore, Ø ⊆ A.

Lemma 2.1.13. Let A and B be sets. Then A = B if and only if A ⊆ B and B ⊆ A.

Remark 2.1.14. Recall that the statement “P if and only if Q” is the same as having “if P , then

Q” and “if Q, then P”. So when you prove an “if and only if”, one simple way is to do two proofs,

one for “if P , then Q” and another for “if Q, then P”.

Proof. Suppose that A = B. Assume for the sake of contradiction that either A ⊈ B or B ⊈ A
(this is exactly “not (A ⊆ B and B ⊆ A)”). If A ⊈ B, then there exists x ∈ A such that x ∉ B.
So x is not a common element of A and B, a contradiction, since A = B. Similarly, if B ⊈ A, then
there exists x ∈ B such that x ∉ A. Again, this is a contradiction.

Conversely, suppose that A ⊆ B and B ⊆ A. Suppose for the sake of contradiction that A ≠ B. Then
there exists x such that either x ∈ A and x ∉ B, or x ∈ B and x ∉ A. If x ∈ A and x ∉ B, then
A ⊈ B. If x ∈ B and x ∉ A, then B ⊈ A. In either case, this is a contradiction.

Exercise 2.1.15. Redo this proof, but for each direction, use a direct proof. Redo it again by

proving the contrapositives. Which is the “nicer” proof?

2.1.2 Algebra of Sets

Definition 2.1.16. Let A and B be sets. The union of A and B, denoted A ∪B is defined by

A ∪B = {x | x ∈ A or x ∈ B}.


The intersection of A and B, denoted A ∩B is defined by

A ∩B = {x | x ∈ A and x ∈ B}

The sets A and B are disjoint if

A ∩B = Ø

The difference of A and B is

A \ B = {x ∈ A | x ∉ B}.

Let A ⊆ X. Then the complement of A in X, denoted A^c, is

A^c = X \ A.

Exercise 2.1.17. The union and the intersection are commutative and associative!

A ∪B = B ∪A commutativity

A ∩B = B ∩A commutativity

(A ∪B) ∪ C = A ∪ (B ∪ C) associativity

(A ∩B) ∩ C = A ∩ (B ∩ C) associativity

Lemma 2.1.18 (Distributivity of ∩ over ∪). Let A,B,C be sets. Then

A ∩ (B ∪ C) = (A ∩B) ∪ (A ∩ C)

Proof. First we show that A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C). Let x ∈ A ∩ (B ∪ C). Then x ∈ A and
x ∈ B ∪ C. So x ∈ B or x ∈ C. Hence, (x ∈ A and x ∈ B) or (x ∈ A and x ∈ C). So x ∈ A ∩ B or
x ∈ A ∩ C. Therefore, x ∈ (A ∩ B) ∪ (A ∩ C).

Next we show that (A ∩ B) ∪ (A ∩ C) ⊆ A ∩ (B ∪ C). Let x ∈ (A ∩ B) ∪ (A ∩ C). Then
x ∈ (A ∩ B) or x ∈ (A ∩ C). So (x ∈ A and x ∈ B) or (x ∈ A and x ∈ C). In both cases x ∈ A, and also x ∈ B
or x ∈ C. So x ∈ A and x ∈ B ∪ C. Hence x ∈ A ∩ (B ∪ C).

Since A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C) and (A ∩ B) ∪ (A ∩ C) ⊆ A ∩ (B ∪ C), we conclude that A ∩ (B ∪ C) =
(A ∩ B) ∪ (A ∩ C).

2.1.3 DeMorgan’s Laws

Theorem 2.1.19. (DeMorgan’s Laws) Let X be a set, and let A,B ⊂ X. Then:

1. X \ (A ∪B) = (X \A) ∩ (X \B)

2. X \ (A ∩B) = (X \A) ∪ (X \B)

Proof. Exercise.
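Before writing the proof, it can be reassuring to test the identities on a small example. The sketch below runs through every choice of subsets of a four-element set and checks the distributivity of ∩ over ∪ together with both DeMorgan laws. The set `X` and the helper `subsets` are illustrative choices, and a check on one finite set is of course not a proof.

```python
from itertools import combinations

X = {1, 2, 3, 4}  # hypothetical small universe

def subsets(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

all_subsets = subsets(X)

# A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) for every choice of A, B, C.
distributive = all((A & (B | C)) == ((A & B) | (A & C))
                   for A in all_subsets for B in all_subsets for C in all_subsets)

# X \ (A ∪ B) = (X \ A) ∩ (X \ B)  and  X \ (A ∩ B) = (X \ A) ∪ (X \ B).
de_morgan = all((X - (A | B)) == ((X - A) & (X - B))
                and (X - (A & B)) == ((X - A) | (X - B))
                for A in all_subsets for B in all_subsets)
```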

2.1.4 Families of sets

Sometimes we will encounter arbitrary families of sets. The definitions of intersection/union can

be extended to infinitely many sets.

Definition 2.1.20. Let A = {A_λ | λ ∈ I} be a collection of sets indexed by a nonempty set I.
Then the intersection and union of A are the sets

⋂_{λ∈I} A_λ = {x | x ∈ A_λ for all λ ∈ I},

and

⋃_{λ∈I} A_λ = {x | x ∈ A_λ for some λ ∈ I}.

Theorem 2.1.21. (DeMorgan’s Laws) Let X be a set, and let A = {A_λ | λ ∈ I} be a collection of
subsets of X. Then:

1. X \ (⋃_{λ∈I} A_λ) = ⋂_{λ∈I} (X \ A_λ)

2. X \ (⋂_{λ∈I} A_λ) = ⋃_{λ∈I} (X \ A_λ).

2.1.5 Power Set

Definition 2.1.22. Let X be a set. The set P(X) is the set of all subsets of X. I.e.

P(X) = {Y | Y ⊆ X}.

Remark 2.1.23. P(X) is never empty since Ø ∈ P(X).

Example 2.1.24.

P({1, 2}) = {Ø, {1}, {2}, {1, 2}}.

Exercise 2.1.25. If X has n elements, prove that P(X) has 2^n elements. (Hint: Use induction,
see Theorem 3.1.2).
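One way to get a feeling for the count 2^n is to build the power set one element at a time: each new element x doubles the collection, since every subset found so far appears once without x and once with x. The following sketch does exactly that; `power_set` is an illustrative helper, and computing a few sizes is a sanity check rather than the requested proof.

```python
def power_set(xs):
    """Build P(xs) step by step: each new element doubles the collection."""
    result = [frozenset()]  # P(Ø) = {Ø}
    for x in xs:
        # Keep every subset found so far, and add a copy of each with x included.
        result = result + [s | {x} for s in result]
    return result

p12 = set(power_set([1, 2]))
sizes = [len(power_set(range(n))) for n in range(6)]
```

For the set {1, 2} this reproduces the four subsets of Example 2.1.24, and `sizes` lists the number of subsets for sets with 0 up to 5 elements.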

2.1.6 Cartesian Product

Definition 2.1.26. The symbol (x, y) is called an ordered pair. Two ordered pairs (x, y) and (z, w)

are equal if and only if x = z and y = w.

Let A and B be sets. The cartesian product of A and B, denoted A×B is defined by

A×B = {(a, b) | a ∈ A and b ∈ B}

Similarly,

A_1 × A_2 × . . . × A_n = {(a_1, a_2, . . . , a_n) | a_i ∈ A_i}.

Notation 2.1.27. We often denote A × A by A^2 and A × . . . × A (n factors) by A^n.

Example 2.1.28. (a) If A = B = R, then A × B = R × R = R^2 is just the Cartesian plane.

(b) If A = {1, 2, 3} and B = {a, b}, then

A×B = {(1, a), (2, a), (3, a), (1, b), (2, b), (3, b)}

and

B ×A = {(a, 1), (a, 2), (a, 3), (b, 1), (b, 2), (b, 3)}.


Warning 2.1.29. Note that, in general, A × B ≠ B × A as the ordering of the elements in the
pair (x, y) matters. Indeed, (x, y) = (y, x) if and only if x = y.

Lemma 2.1.30. Let A be a set. Then A×Ø = Ø×A = Ø

Proof. There are no pairs (x, y) such that x ∈ Ø since there are no x ∈ Ø, so Ø × A = Ø. In the same way, there are no pairs (x, y) such that y ∈ Ø, so A × Ø = Ø.
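For finite sets, Cartesian products can be listed with `itertools.product` from the Python standard library, with tuples playing the role of ordered pairs. The sketch below redoes Example 2.1.28(b) and also looks at a product with the empty set; the variable names are illustrative.

```python
from itertools import product

A = {1, 2, 3}
B = {"a", "b"}

AxB = set(product(A, B))  # pairs (number, letter)
BxA = set(product(B, A))  # pairs (letter, number)

# A product with the empty set has no pairs at all.
A_times_empty = set(product(A, set()))
empty_times_A = set(product(set(), A))
```

Both products have six elements, but they are different sets, since (1, "a") and ("a", 1) are different ordered pairs.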

2.2 Relations, Orders, Equivalences

2.2.1 Relations

Definition 2.2.1. Let X and Y be sets.

(a) A relation is a subset of X × Y .

(b) A relation on X is a subset of X ×X.

Example 2.2.2. (a) Equality is a relation. We can think of = as the set

= = {(x, x) | x ∈ X}.

(b) Given any function f : R→ R, we get a relation

f = {(x, f(x)) | x ∈ R} ⊆ R×R .

This is often called the graph of f .

(c) The ordering on the integers is an example of a relation:

< = {(a, b) ∈ Z×Z | a is (strictly) less than b}.

The last example is a special case of the following more general concept.

2.2.2 Order Relations

Definition 2.2.3. Let X be a set. A partial order on the set X is a subset ⪯ of X × X, with
elements (x, y) ∈ ⪯ written as x ⪯ y, satisfying the following properties:

(PO1) (Reflexivity) For all x ∈ X, x ⪯ x.

(PO2) (Antisymmetry) For all x, y ∈ X, if x ⪯ y and y ⪯ x, then x = y.

(PO3) (Transitivity) For all x, y, z ∈ X, if x ⪯ y and y ⪯ z, then x ⪯ z.

Example 2.2.4. The relation A ⊆ B on the set of subsets of a set X is a partial order.

Definition 2.2.5. Let X be a set. A strict partial order on the set X is a subset ≺ of X × X,
with elements (x, y) ∈ ≺ written as x ≺ y, satisfying the following properties:

(SPO1) (Irreflexivity) a ⊀ a for all a ∈ X.

(SPO2) (Asymmetry) If a ≺ b, then b ⊀ a.

(SPO3) (Transitivity) For all x, y, z ∈ X, if x ≺ y and y ≺ z, then x ≺ z.

Example 2.2.6. The relation A ⊊ B on the power set P(X) of X is a strict partial order.

Definition 2.2.7. Let X be a set. A total order on the set X is a subset ⪯ of X × X, with elements
(x, y) ∈ ⪯ written as x ⪯ y, satisfying the following properties:

(TO1) (Reflexivity) For all x ∈ X, x ⪯ x.

(TO2) (Antisymmetry) For all x, y ∈ X, if x ⪯ y and y ⪯ x, then x = y.

(TO3) (Transitivity) For all x, y, z ∈ X, if x ⪯ y and y ⪯ z, then x ⪯ z.

(TO4) (Totality) For all x, y ∈ X, x ⪯ y or y ⪯ x.

Example 2.2.8. The relation a ≤ b on the set of integers Z is a total order.

Definition 2.2.9. Let X be a set. A strict total order on the set X is a subset ≺ of X ×X, with

elements (x, y) ∈≺ written as x ≺ y, satisfying the following properties:

(STO1) (Trichotomy) For all x, y ∈ X exactly one of the following holds: x ≺ y, y ≺ x or x = y.

(STO2) (Transitivity) For all x, y, z ∈ X, if x ≺ y and y ≺ z then x ≺ z.

Example 2.2.10. The relation a < b on the set of integers Z is a strict total order.

Remark 2.2.11. Let X be a set and ⪯ be a partial order on X. Then we can define

a ≺ b if a ⪯ b and a ≠ b.

This defines a strict partial order on X. Similarly, if ⪯ is a total order, then this defines a strict
total order on X.

Conversely, let X be a set and ≺ be a strict partial order on X. Then we can define

a ⪯ b if a ≺ b or a = b.

This defines a partial order on X. Similarly, if ≺ is a strict total order, then this defines a total
order on X.

Therefore, given a partial or a total order ⪯, we get an associated strict partial or strict total
order ≺. Similarly, given a strict partial or total order ≺, we get an associated partial or total
order ⪯.

Definition 2.2.12. In these notes, we will say that a set X is an ordered set if it has a strict total

order or a total order.

Exercise 2.2.13. Prove that the lexicographical relation ≺ on R^2, defined by

(a, b) ≺ (c, d) if a < c, or if a = c and b < d,

is a strict total order on R^2.
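The lexicographical relation of Exercise 2.2.13 is the same rule Python uses to compare tuples, which makes it easy to experiment with. The sketch below tests trichotomy and transitivity on a small grid of integer points. The helper `lex_less` and the grid are my own choices for illustration, and a finite test does not replace the proof asked for in the exercise.

```python
from itertools import product

def lex_less(p, q):
    """(a, b) precedes (c, d) if a < c, or if a = c and b < d."""
    (a, b), (c, d) = p, q
    return a < c or (a == c and b < d)

# A 5 x 5 grid of sample points with integer coordinates.
points = list(product(range(-2, 3), repeat=2))

# Exactly one of p < q, q < p, p = q holds for every pair of points.
trichotomy = all([lex_less(p, q), lex_less(q, p), p == q].count(True) == 1
                 for p in points for q in points)

# Whenever p < q and q < r, also p < r.
transitive = all(lex_less(p, r)
                 for p in points for q in points for r in points
                 if lex_less(p, q) and lex_less(q, r))

# Python's built-in tuple comparison follows the same rule.
matches_builtin = all(lex_less(p, q) == (p < q) for p in points for q in points)
```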

Remark 2.2.14. Since a complex number a+bi ∈ C is specified by an ordered pair of real numbers,

(a, b), the lexicographical ordering of the previous exercise gives an ordering on C. However, this

strict total order does not respect multiplication, so is not the right kind of ordering given all the

structure we have on C.

2.2.3 Equivalence Relations

Definition 2.2.15. Let ∼⊆ X × X be a relation on X. Then ∼ is an equivalence relation if ∼

satisfies the following properties:

(a) (Reflexivity) For all a ∈ X, a ∼ a.

(b) (Symmetry) If a ∼ b then b ∼ a.

(c) (Transitivity) If a ∼ b and b ∼ c, then a ∼ c.

Example 2.2.16. (1) =, the subset {(x, x) | x ∈ X} ⊆ X ×X is an equivalence relation.


(2) < is not an equivalence relation. It fails reflexivity and symmetry. ≤ fails symmetry.

Exercise 2.2.17. Let

Z × Z_{≠0} = {(a, b) | a, b ∈ Z and b ≠ 0}.

Say that

(a, b) ∼ (c, d) if and only if ad = bc.

Then ∼ is an equivalence relation on Z × Z_{≠0}.

In fact, this example gives rise to Q as we will see.

Exercise 2.2.18. Let X be any set and X_1, . . . , X_n be a collection of subsets of X which are
pairwise disjoint. That is,

X_i ∩ X_j = Ø, if i ≠ j.

Further, suppose that

X = ⋃_{i=1}^{n} X_i = X_1 ∪ X_2 ∪ . . . ∪ X_n.

This is called a partition of X. Let x ∈ X. Then note that x ∈ X_i for exactly one i. Now suppose
that y, z ∈ X. We say that

y ∼ z, if and only if y, z ∈ X_i for some i = 1, . . . , n.

Check that this is an equivalence relation.

Definition 2.2.19. Let X be any set and ∼ be an equivalence relation on X. If x ∈ X, then

[x] = x̄ = {y ∈ X | y ∼ x}

is called the equivalence class of x in X. Note that [x] ⊆ X. We often write

X/∼ = {[x] | x ∈ X}

for the set of equivalence classes. The elements of X/∼ are subsets of X.

Example 2.2.20. If Z × Z_{≠0} and ∼ are as in Exercise 2.2.17, then

a/b := [(a, b)] = {(c, d) ∈ Z × Z_{≠0} | ad = bc}.

We can define the set Q as the set of equivalence classes in Z × Z_{≠0} for ∼. That is,

Q = (Z × Z_{≠0})/∼.
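To see these equivalence classes concretely, one can take a small finite window of pairs (a, b) with b ≠ 0 and sort them into classes under the rule ad = bc. This is only a sketch: the window of pairs and the helper names `equivalent` and `equivalence_classes` are illustrative, and the grouping step compares each pair with a single representative, which is only legitimate because ∼ is an equivalence relation.

```python
def equivalent(p, q):
    """(a, b) ~ (c, d) if and only if ad = bc."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def equivalence_classes(elements, related):
    """Group elements into classes, comparing each one with one representative per class."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])  # x starts a new class
    return classes

# Finite window of Z x Z_{≠0}: numerators and denominators between -2 and 2.
pairs = [(a, b) for a in range(-2, 3) for b in range(-2, 3) if b != 0]
classes = equivalence_classes(pairs, equivalent)
class_of_one_half = next(sorted(cls) for cls in classes if (1, 2) in cls)
```

Each class collects the pairs in the window that represent the same fraction; for instance the class of (1, 2) also contains (−1, −2).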

Example 2.2.21. In the example of a partition X = X_1 ∪ . . . ∪ X_n, if x ∈ X_i, we have

[x] = {y ∈ X | y ∼ x} = {y ∈ X | y ∈ X_i} = X_i.

The equivalence classes of X are exactly the subsets X_i.

2.3 Functions and what you can do with them

2.3.1 Functions

Definition 2.3.1. Let A and B be sets. A function f : A → B is a subset of A × B such that,

for all a ∈ A, there exists a unique b ∈ B such that (a, b) ∈ f . One thinks of a function as an

assignment f : A→ B, which sends a to a unique element b = f(a), written

a ↦ b = f(a).

Remark 2.3.2. A function f is determined by three pieces of data:

(1) The set A, called the source or the domain.

(2) The set B, called the target or the codomain. This is also sometimes called the range. Note

that this is different from the image, see Definition 2.3.5.

(3) The subset f ⊂ A×B satisfying the properties of Definition 2.3.1.

Example 2.3.3.

• f : N → Z, such that f(n) = 2n. This corresponds to {(n, 2n) | n ∈ N} ⊂ N × Z.

• g : N → N, such that g(n) = n. This corresponds to {(n, n) | n ∈ N} ⊂ N × N.

• h : R → R, such that h(x) = x^2. This corresponds to {(x, x^2) | x ∈ R} ⊂ R × R.


• Let A ⊆ X. f_A : X → {0, 1} such that f_A(x) = 0 if x ∉ A and f_A(x) = 1 if x ∈ A. This is
called an indicator function for the set A in X.

• Non-example: r = {(x, y) ∈ R × R | y^2 = x} ⊂ R × R. This is not a function since it contains both
(1, −1) and (1, 1). So for a = 1, there is not a unique b such that (a, b) ∈ r. Further, for
a = −1, there is no b such that (a, b) ∈ r. So r fails to be a function in two different ways.
However, √ = {(x, y) | x ≥ 0, y^2 = x, y ≥ 0} ⊂ R_{≥0} × R is a function

√ : R_{≥0} → R.

Remark 2.3.4. Let f, g : A→ B. Then f = g if f(a) = g(a) for all a ∈ A. That is

{(a, f(a)) | a ∈ A} = {(a, g(a)) | a ∈ A}

as subsets of A×B.

2.3.2 Image and Preimage

Definition 2.3.5. Let f : A→ B be a function.

1. The set A is called the domain or the source and the set B is called the codomain or the

target (sometimes also called the range).

2. If A′ ⊆ A is a subset, then the image of A′ in B under f is

f(A′) = {b ∈ B | there exists a ∈ A′ such that f(a) = b} = {f(a) | a ∈ A′} ⊆ B.

In particular, f(A) is also called the image of f .

3. If B′ ⊆ B, then the preimage of B′ is

f^{-1}(B′) = {a ∈ A | f(a) ∈ B′} ⊆ A.

Exercise 2.3.6. Let f : A → B be a function. Prove that f^{-1}(B) is always equal to A.

Example 2.3.7. Let f : N → Q, a ↦ 1/a. Let {1, 2, 3} ⊆ N. Then

f({1, 2, 3}) = {1/1, 1/2, 1/3}.

Let {−1/1, 1/5, 2/3, 1/2} ⊆ Q. Then

f^{-1}({−1/1, 1/5, 2/3, 1/2}) = {5, 2},

since neither −1/1 nor 2/3 is of the form 1/a with a ∈ N.

Let N ⊆ Q, then

f^{-1}(N) = {1}.

Warning 2.3.8. f^{-1}(B′) is a subset of A. If f^{-1}(B′) = A′, then f(A′) = f(f^{-1}(B′)) is a subset
of B, which may or may not be equal to B′! For example, consider f : N → Z, f(n) = 2n. Let
B′ = {. . . , −2, 0, 2, 4}, that is, all the even integers less than or equal to 4. Then

f^{-1}(B′) = {1, 2}.

Let A′ = f^{-1}(B′). Then

f(A′) = f(f^{-1}(B′)) = {2, 4} ≠ B′.
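The computation in this warning can be replayed on a computer once the infinite sets are cut down to finite windows. In the sketch below, `N_window` and `B_prime` are finite stand-ins for N and B′ (an assumption needed to enumerate them), and `image` and `preimage` are illustrative helpers that follow Definition 2.3.5.

```python
def image(f, subset):
    """f(A') = {f(a) | a in A'}."""
    return {f(a) for a in subset}

def preimage(f, domain, subset):
    """f^{-1}(B') = {a in A | f(a) in B'}; the domain must be finite to enumerate it."""
    return {a for a in domain if f(a) in subset}

def f(n):
    return 2 * n

N_window = range(1, 51)           # finite stand-in for N (assumption)
B_prime = set(range(-10, 5, 2))   # finite stand-in for {..., -2, 0, 2, 4}

A_prime = preimage(f, N_window, B_prime)
round_trip = image(f, A_prime)
```

The round trip `image(f, preimage(...))` always lands inside `B_prime`, but here it loses every element of `B_prime` that is not a value of `f`.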

Exercise 2.3.9. Let f : A → B. Let A_1, A_2 ⊆ A and B_1, B_2 ⊆ B. Then

1. f^{-1}(B_1 ∪ B_2) = f^{-1}(B_1) ∪ f^{-1}(B_2)

2. f^{-1}(B_1 ∩ B_2) = f^{-1}(B_1) ∩ f^{-1}(B_2)

3. f(A_1 ∪ A_2) = f(A_1) ∪ f(A_2)

4. f(A_1 ∩ A_2) ⊆ f(A_1) ∩ f(A_2). Give two sets A, B and a function f : A → B where this is not
an equality.

Proof sample. We show that f^{-1}(B_1 ∪ B_2) ⊆ f^{-1}(B_1) ∪ f^{-1}(B_2). Let a ∈ f^{-1}(B_1 ∪ B_2). Then
f(a) ∈ B_1 ∪ B_2. Hence f(a) ∈ B_1 or f(a) ∈ B_2. Therefore, a ∈ f^{-1}(B_1) or a ∈ f^{-1}(B_2), so that
a ∈ f^{-1}(B_1) ∪ f^{-1}(B_2).

2.3.3 Injectivity, Surjectivity, Bijectivity

Definition 2.3.10. Let f : A → B be a function.

1. f is surjective or onto if f(A) = B. That is, for every b ∈ B, there exists a ∈ A such that
f(a) = b.

2. f is injective or one-to-one if, for all a_1, a_2 ∈ A, if a_1 ≠ a_2, then f(a_1) ≠ f(a_2). This is
equivalent to saying

if f(a_1) = f(a_2), then a_1 = a_2.

3. A function is bijective if it is injective and surjective. That is, for each b in B, there is a
unique a ∈ A such that f(a) = b.

4. If there exists a bijection between the sets A and B, we say that they are in bijective
correspondence.

Example 2.3.11. 1. f : Z → Z, a ↦ |a| is not surjective and not injective.

2. g : Z → N ∪ {0}, a ↦ |a| is surjective but not injective.

3. h : Z → Z, a ↦ 1 + 2a is injective but not surjective.

Proof. An integer x is of the form h(a) only if it is odd, so h is not surjective. Suppose that h(a) = h(b).
Then 1 + 2a = 1 + 2b. Hence, 2a = 2b. Since 2 ≠ 0, this implies that a = b.

4. k : Q → Q, a ↦ 2a is injective and surjective, so it is bijective.
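For functions between finite sets, injectivity and surjectivity can be tested by brute force. The sketch below applies this to finite analogues of the examples above: the absolute value and a ↦ 1 + 2a restricted to {−2, . . . , 2}. The restriction to finite sets and the helper names are assumptions made so that everything can be enumerated.

```python
def is_injective(f, domain):
    """No two elements of the domain share a value."""
    values = [f(a) for a in domain]
    return len(values) == len(set(values))

def is_surjective(f, domain, codomain):
    """Every element of the codomain is a value (assumes f maps into the codomain)."""
    return {f(a) for a in domain} == set(codomain)

domain = range(-2, 3)  # {-2, -1, 0, 1, 2}, a finite stand-in for Z

def h(a):
    return 1 + 2 * a

abs_injective = is_injective(abs, domain)
abs_surjective = is_surjective(abs, domain, {0, 1, 2})
h_injective = is_injective(h, domain)
h_surjective = is_surjective(h, domain, range(-3, 6))
```

On this window the absolute value is surjective onto {0, 1, 2} but not injective, while `h` is injective but misses the even numbers in its codomain.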

Definition 2.3.12. Let f : A → B and g : B → C. Then the composite of f and g, denoted g ◦ f,
is the function

g ◦ f : A→ C

(g ◦ f)(a) = g(f(a)).

In other words,

g ◦ f = {(a, g(f(a))) | a ∈ A} ⊆ A× C.

Example 2.3.13. Let f, g, h be as above.

1. h ◦ f : Z → Z is a ↦ 1 + 2|a|.

2. f ◦ h : Z → Z is a ↦ |1 + 2a|. So

h ◦ f ≠ f ◦ h.

Warning 2.3.14. In general, even if g ◦ f makes sense, f ◦ g may not make any sense. For example,
if A ≠ C, I can’t apply f to g(b). Even when they do make sense, it may not be the case that f ◦ g
is equal to g ◦ f. Composition of functions is not commutative.

Definition 2.3.15. For any set A, the function id_A : A → A, where id_A(a) = a or

id_A = {(a, a) | a ∈ A} ⊆ A × A

is called the identity of A. If f : A → B and g : C → A are functions, then

id_A ◦ g = g

and

f ◦ id_A = f.

Exercise 2.3.16. Let A, B, and C be sets and suppose that there are functions f : A → B and

g : B → C. Then

(a) if f and g are injective, then so is g ◦ f .

(b) if f and g are surjective, then so is g ◦ f .

(c) if f and g are bijective, then so is g ◦ f .

2.3.4 Inverse Function

Exercise 2.3.17. Suppose that f : A→ B is bijective. Prove that

g = {(f(a), a) | a ∈ A} ⊂ B ×A

is a function. Check that

g ◦ f = id_A and f ◦ g = id_B.

Definition 2.3.18. Suppose that f : A → B is bijective. The function f^{-1} : B → A defined by

f^{-1} = {(f(a), a) | a ∈ A} ⊂ B × A

is called the inverse of f.

Warning 2.3.19. Do not confuse the function f^{-1} and its values f^{-1}(b) on elements of B, which
are only defined for bijective functions f, with the preimage f^{-1}(B′) of a subset B′ ⊂ B, which is
defined for any subset B′ of B and any function f (whether bijective or not).

Example 2.3.20. 1. Let f : Z → Z be the function f(x) = x + 1. Check that this is bijective.
The inverse of f is the function f^{-1} : Z → Z given by f^{-1}(x) = x − 1.

2. Let g : R_{≥0} → R_{≥0} be the function g(x) = x^2. This is a bijective function. Its inverse
g^{-1} : R_{≥0} → R_{≥0} is the function g^{-1}(x) = √x. (In fact, that’s the definition of the square
root! It is defined as the inverse of the squaring function!)
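When a function on a finite set is stored as a table of pairs (a, f(a)), the inverse is obtained literally by swapping each pair, as in Definition 2.3.18. The sketch below stores such a table as a Python dictionary and takes the codomain to be the set of values, so that the function is automatically surjective. The helper `inverse` and the sample table are illustrative.

```python
def inverse(f: dict) -> dict:
    """Swap each pair (a, f(a)) to (f(a), a); this only works if f is injective."""
    g = {b: a for a, b in f.items()}
    if len(g) != len(f):
        # Two inputs shared a value, so the swapped pairs are not a function.
        raise ValueError("f is not injective")
    return g

f = {1: "x", 2: "y", 3: "z"}  # a bijection from {1, 2, 3} to {"x", "y", "z"}
g = inverse(f)

g_after_f_is_identity = all(g[f[a]] == a for a in f)
f_after_g_is_identity = all(f[g[b]] == b for b in g)
```

Calling `inverse` on a table such as `{1: "x", 2: "x"}` raises an error, which reflects the fact that the inverse function is only defined for bijections.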

2.3.5 Operations

Definition 2.3.21. A binary operation on a set X is a function

∗ : X ×X → X.

We write ∗((a, b)) = a ∗ b. Similarly, an n-ary operation on a set X is a function

m : X^n → X.

Example 2.3.22. Addition + : R × R → R and multiplication · : R × R → R are the canonical
examples of binary operations.

2.4 Cardinality and Countability

2.4.1 Finite sets

Let n ∈ N. For the following section, we write [n] = {1, 2, . . . , n} and [0] = Ø.

Definition 2.4.1. 1. A set A is finite if there is a bijective correspondence between A and the

set [n] for some n.

2. If A is not finite, then it is called infinite.

The following theorem is classical, but not easy to prove rigorously.

Theorem 2.4.2. (The Pigeonhole Principle) Let n,m ∈ N with n < m.

Then there does not exist an injective function f : [m]→ [n].

Exercise 2.4.3. Let A be a finite set. Suppose that A is in bijective correspondence both with

[m] and with [n]. Then m = n.

Because of this last exercise, we can make the following rigorous definition.


Definition 2.4.4 (Cardinality of a finite set). If A is a finite set that is in bijective correspondence
with [n], then we say that the cardinality of A is n, and we write |A| = n. (By Exercise 2.4.3,
there is exactly one such natural number n.)

Here is a list of exercise to make sure you understand the concept of cardinality and good

practice at writing proofs.

Exercise 2.4.5. (a) Let A and B be finite sets. If A ⊂ B, then |A| ≤ |B|.

(b) Let A and B be finite sets such that A ∩B = Ø. Then |A ∪B| = |A|+ |B|.

(c) (The Inclusion/Exclusion Principle) Let A and B be two finite sets. Then:

|A ∪B|+ |A ∩B| = |A|+ |B|

(d) Let A and B be two finite sets. Then |A×B| = |A| · |B|.

(e) Show that if A is a finite set, then |P(A)| = 2^{|A|}.

2.4.2 Countability

Definition 2.4.6. 1. A set A is countably infinite if it is in bijective correspondence with N. It

is countable if it is finite or countably infinite.

2. A set is uncountable if it is not countable.

Example 2.4.7. Some fun facts:

1. The natural numbers are countable. This is clear from the definition.

2. N∪{0} is countable. A bijection is given by f : N→ N∪{0}, f(n) = n− 1.

3. The integers are countable. A bijection is given by composing f above with g : N∪{0} → Z,

g(0) = 0, g(2n) = n and g(2n− 1) = −n.

4. A subset of a countable set is countable (exercise).

5. N×N is countable. Indeed, we can do a diagonal counting argument:

(1, 1) → (1, 2) → (2, 1) → (3, 1) → (2, 2) → (1, 3) → (1, 4) → (2, 3) → (3, 2) → (4, 1) → (5, 1) → . . .

(The path zig-zags through the anti-diagonals of the grid of pairs, one after another.)

6. Similar ideas can be used to prove that Q is countable, and that if A and B are countable,

so is A×B.
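The counting arguments in items 3 and 5 can be sketched in code. This is only an illustration, assuming N = {1, 2, 3, . . .} as in this section; the function names `zigzag_pairs` and `nat_to_int` are ours and do not appear in the notes.

```python
from itertools import islice

def zigzag_pairs():
    """Yield each pair (i, j) of positive integers exactly once,
    walking the anti-diagonals i + j = d in alternating directions."""
    d = 2
    while True:
        rows = range(1, d)            # possible first coordinates on this diagonal
        if d % 2 == 0:
            rows = reversed(rows)     # alternate direction to follow the zig-zag path
        for i in rows:
            yield (i, d - i)
        d += 1

def nat_to_int(n):
    """The bijection N -> Z of item 3: apply f(n) = n - 1, then g."""
    m = n - 1                         # f(n) = n - 1 lands in N ∪ {0}
    if m % 2 == 0:
        return m // 2                 # g(2k) = k, including g(0) = 0
    return -((m + 1) // 2)            # g(2k - 1) = -k

first_pairs = list(islice(zigzag_pairs(), 11))
first_ints = [nat_to_int(n) for n in range(1, 8)]
```

Here `first_pairs` reproduces the start of the diagonal path, and `first_ints` lists the integers in the order 0, −1, 1, −2, 2, . . . in which the bijection reaches them.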

Theorem 2.4.8 (Cantor’s Theorem). A set A and its power set P(A) are not in bijective corre-

spondence.

Proof. Suppose that they are and let f : A→ P(A) be a bijection. Let

B = {a ∈ A | a ∉ f(a)}.

Since f is a bijection, B = f(b) for some b ∈ A. Looking at the definition of B, we have that b ∈ B if and only if b ∉ f(b). But f(b) = B, so b ∈ B if and only if b ∉ B. This is a contradiction.
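For a small finite set, the diagonal construction in this proof can be checked by brute force. The sketch below is an illustration only (the proof itself covers arbitrary sets): it runs through every function f : A → P(A) for a three-element set A and confirms that the set B = {a ∈ A | a ∉ f(a)} is never a value of f. The helper names `power_set` and `diagonal_set` are ours.

```python
from itertools import combinations, product

def power_set(elements):
    """All subsets of the given finite collection, as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(f, A):
    """B = {a in A | a not in f(a)}, where f is a dict from A to subsets of A."""
    return frozenset(a for a in A if a not in f[a])

A = [0, 1, 2]
subsets = power_set(A)

# Each tuple `images` encodes one function f with f(A[i]) = images[i].
missed_every_time = all(
    diagonal_set(dict(zip(A, images)), A) not in images
    for images in product(subsets, repeat=len(A))
)
```

Since B is missed by every one of the 8^3 functions, none of them is surjective, in line with the theorem.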

Exercise 2.4.9. Use ideas similar to the proof of Cantor’s Theorem, or the theorem itself, to prove

1. The set of binary sequences

S = {(a_1, a_2, a_3, . . .) | a_i ∈ {0, 1}}

is uncountable.

2. (0, 1) ⊆ R is uncountable.

3. R is uncountable.

Chapter 3

Sets of Numbers

3.1 From the Natural Numbers to the Integers

3.1.1 Peano’s Axioms and Mathematical Induction

Here is a quick summary on this subject. We recommend the online notes Construction of Number

Systems by N. Mohan Kumar for more details.

Axioms 3.1.1 (Peano’s Postulates). The natural numbers are defined as a set N together with a “successor” function s : N → N and a special element 0 ∈ N satisfying the following axioms:

I. 0 ∈ N.

II. If n ∈ N, then s(n) ∈ N.

III. There is no n ∈ N such that s(n) = 0.

IV. If n, m ∈ N and s(n) = s(m), then n = m.

V. If A ⊂ N is a subset satisfying the two properties:

• 0 ∈ A

• if n ∈ A, then s(n) ∈ A,

then A = N.

Theorem 3.1.2 (Weak Mathematical Induction). For each n ∈ N, let P (n) be a proposition.

Suppose the following two results:

(a) P (0) is true.

(b) If P (n) is true, then P (s(n)) is true.

Then P (n) is true for all n ∈ N.

http://www.math.wustl.edu/~kumar/courses/310-2011/Peano.pdf

Proof. Let A = {n ∈ N | P(n) is true}. Then by (a), 0 ∈ A. By (b), if n ∈ A, then so is s(n). Therefore, by Axiom V, A = N. That is, for all n ∈ N, P(n) is true.

This is the set that we know and love, N = {0, 1, 2, 3, . . .}. We often think of the successor function as the function

s : N → N

given by s(n) = n + 1. In fact, this can be extended to two binary operations, addition and multiplication,

+ : N × N → N, (m, n) ↦ m + n,

· : N × N → N, (m, n) ↦ m · n,

which satisfy the properties

(A1) (Associativity of addition) For any k, m, n in N,

k + (m+ n) = (k +m) + n.

(A2) (Commutativity for addition) For any n, m in N,

m+ n = n+m.

(A3) (Additive Identity) There exists 0 ∈ N such that for any n ∈ N,

0 + n = n+ 0 = n.

(M0) (0 Annihilates) For any n in N,

0 · n = n · 0 = 0.

(M1) (Associativity of multiplication) For any k, m, n in N,

k · (m · n) = (k ·m) · n.

(M2) (Commutativity of multiplication) For any n, m in N,

m · n = n ·m.

(M3) (Multiplicative Identity) For any n,

1 · n = n · 1 = n.

(D) (Distributivity) For any k, m, n in N,

k · (m+ n) = (k ·m) + k · n.

Definition 3.1.3. We can define a strict total order relation on N by declaring m < n if and only if there exists a non-zero natural number k such that m + k = n. This ordering satisfies

(STO3) (Monotony of addition) If m < n, then m + k < n + k.

(STO4) (Monotony of multiplication) If m < n and 0 < k, then m · k < n · k.

The following two results are equivalent to Weak Mathematical Induction, but giving a rigorous proof would require a deeper study of the consequences of Peano’s axioms than we have time for.

Theorem 3.1.4 (Strong Mathematical Induction). For each n ∈ N, let P(n) be a proposition. Suppose the following two results:

(i) P(0) is true.

(ii) If P(k) is true for 0 ≤ k ≤ n, then P(n + 1) is true.

Then P(n) is true for all n ∈ N.

Theorem 3.1.5 (Well-Ordering Principle). If A is a non-empty subset of N, then A has a least

element. That is, there exists a0 ∈ A such that a0 ≤ a for all a ∈ A.

Now, here are some exercises to remind us how to use mathematical induction. These are results

which we will use later on.

Exercise 3.1.6 (Bernoulli’s Inequality). If 1 + x > 0, then (1 + x)^n ≥ 1 + nx for any n ∈ N.

Exercise 3.1.7 (Binomial Theorem). Recall that a^0 = 1 and that

\[ \binom{n}{i} = \frac{n!}{(n-i)!\, i!}, \]

where n! = n · (n − 1) · · · 2 · 1 and 0! = 1. For any n ∈ N,

\[ (x+y)^n = \sum_{i=0}^{n} \binom{n}{i} x^{n-i} y^{i} = x^n + \binom{n}{1} x^{n-1} y + \binom{n}{2} x^{n-2} y^2 + \dots + \binom{n}{n-1} x y^{n-1} + y^n. \]

Exercise 3.1.8 (Geometric Series Formula). Suppose that r ≠ 1. Prove that for all n ∈ N,

\[ \sum_{i=0}^{n} r^{i} = \frac{r^{n+1} - 1}{r - 1}. \]
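Before attempting the induction proofs, it can help to test the three statements above on sample values. The sketch below does this with exact rational arithmetic; it is a sanity check on a handful of inputs, not a proof, and the function names are ours.

```python
from fractions import Fraction
from math import comb

def bernoulli_holds(x, n):
    """Check (1 + x)^n >= 1 + n*x for one pair (x, n) with 1 + x > 0."""
    return (1 + x) ** n >= 1 + n * x

def binomial_rhs(x, y, n):
    """Right-hand side of the Binomial Theorem: sum of C(n, i) x^(n-i) y^i."""
    return sum(comb(n, i) * x ** (n - i) * y ** i for i in range(n + 1))

def geometric_closed_form(r, n):
    """(r^(n+1) - 1) / (r - 1); only meaningful for r != 1."""
    return (r ** (n + 1) - 1) / (r - 1)

samples = [Fraction(-1, 2), Fraction(0), Fraction(1, 3), Fraction(2)]
all_ok = all(
    bernoulli_holds(x, n)
    and binomial_rhs(x, Fraction(3, 4), n) == (x + Fraction(3, 4)) ** n
    and geometric_closed_form(x, n) == sum(x ** i for i in range(n + 1))
    for x in samples
    for n in range(8)
)
```

Using `Fraction` rather than floating point keeps every comparison exact, so a failure would point at the formula and not at rounding.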

3.1.2 Integers and Rings

The integers Z = {0, 1,−1, 2,−2, 3,−3, . . .} are built from N ⊆ Z by adding an additive identity

and additive inverses. As for N, there are binary operations + and · where the addition and

multiplication are associative and commutative (A1, A2, A3, M1, M2) and satisfy the distributive

law (D). Multiplication has an identity (M3), namely 1. However, in addition, we have

(A4) (Additive Inverses) For any n ∈ Z, there exists −n ∈ Z such that

(−n) + n = n + (−n) = 0.

Definition 3.1.9. A set R with two binary operations + and · where the addition and multiplication

are associative and commutative (A1, A2, M1, M2) and satisfy the distributive law (D), such that

addition and multiplication have an identity (A3, M3) called 0 and 1 and such that addition has

inverses (A4) is called a commutative ring.

Remark 3.1.10. We did not assume (M0) for rings! The reason is that we don’t need to. Once

we add (A4), it follows from the other axioms.

Exercise 3.1.11. Assume that R is a ring. Prove that a · 0 = 0 for all a ∈ R.

One way to construct Z is to consider the following equivalence relation on N^2:

(m, n) ∼ (m′, n′) if and only if m + n′ = m′ + n.

Then Z is the set of equivalence classes:

Z = N^2 /∼.

You should think of the equivalence class [(m, n)] as representing the integer m − n. Further, there is a function

i : N → Z, i(n) = [(n, 0)].

We can define an addition + and a multiplication · on Z that respect the corresponding operations on N. Let

−n = [(0, n)].

Then −n is the additive inverse of i(n) = [(n, 0)].
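The construction of Z from pairs of natural numbers can be made concrete as follows. The notes do not spell out the formulas for addition and multiplication on classes, so the ones used here are the usual choices suggested by reading [(m, n)] as m − n; treat them, and all function names, as assumptions of this sketch.

```python
def canon(pair):
    """Representative of the class [(m, n)] with at least one coordinate 0."""
    m, n = pair
    k = min(m, n)
    return (m - k, n - k)

def equivalent(p, q):
    """(m, n) ~ (m', n') if and only if m + n' = m' + n."""
    return p[0] + q[1] == q[0] + p[1]

def add(p, q):
    # (m - n) + (m' - n') = (m + m') - (n + n')
    return canon((p[0] + q[0], p[1] + q[1]))

def mul(p, q):
    # (m - n)(m' - n') = (m m' + n n') - (m n' + n m')
    return canon((p[0] * q[0] + p[1] * q[1], p[0] * q[1] + p[1] * q[0]))

def neg(p):
    """Additive inverse: swap the two coordinates."""
    return canon((p[1], p[0]))

def embed(n):
    """The map i : N -> Z, i(n) = [(n, 0)]."""
    return (n, 0)
```

For example, `add(embed(3), neg(embed(3)))` returns `(0, 0)`, the class of 0, and `mul(neg(embed(2)), neg(embed(3)))` returns `embed(6)`, matching (−2) · (−3) = 6.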

3.1.3 The Rational Numbers and Fields

Now, you’ll note that there are no multiplicative inverses in Z. To get those, we must pass to the rational numbers

Q = (Z × Z_{≠0})/∼ = {a/b},

where the equivalence relation ∼ identifies pairs (a, b) and (c, d) when ad = bc. Note that there is an injective function

i : Z → Q, i(n) = n/1.

As for N and Z, there are binary operations + and · which give Q the structure of a ring.

Further, the multiplication satisfies

(M4) (Multiplicative Inverses) For any n ∈ Q with n ≠ 0, there exists n^{-1} ∈ Q such that

n^{-1} · n = n · n^{-1} = 1.

Finally, the elements 0 and 1 are different:

(F) The additive identity 0 and the multiplicative identity 1 are not equal.

This is an example of a field :

Definition 3.1.12. A ring F such that the multiplication also satisfies (M4) and (F) is called a

field.

3.1.4 Totally Ordered Fields

Both Z and Q have strict total orders in the sense of Definition 2.2.9, where the order relation <

satisfies trichotomy and transitivity (STO1, STO2). Further, this order respects the addition and

multiplication in the sense that they satisfy (STO3) and (STO4).

Definition 3.1.13. A field F with an order relation that satisfies STO1-STO4 is called a totally

ordered field.

Example 3.1.14. Q is a totally ordered field.

Definition 3.1.15. Let F be an ordered field and x ∈ F . If 0 < x, we say that x is positive. If

x < 0, we say that x is negative.

Theorem 3.1.16. Suppose that F is a field. Then additive and multiplicative inverses are unique.

This means:

1. Let x ∈ F . If y, y′ ∈ F satisfy x+ y = 0 and x+ y′ = 0, then y = y′.

2. Let x ∈ F . If y, y′ ∈ F satisfy x · y = 1 and x · y′ = 1, then y = y′.

Exercise 3.1.17. Let F be a field.

1. The additive and multiplicative inverses are unique.

2. If x ∈ F , then −(−x) = x.

3. If x ∈ F and x ≠ 0, then (x^{-1})^{-1} = x.

4. If a + b = a + c, then b = c.

5. If a · b = a · c and a ≠ 0, then b = c.

6. If a ∈ F , then a · 0 = 0.

7. If a · b = 0, then a = 0 or b = 0.

8. a · (−b) = −(a · b) = (−a) · b and a · b = (−a) · (−b)

Now, suppose that F is ordered.

1. If 0 < x, then −x < 0. Similarly, if x < 0, then 0 < −x.

2. If x > 0 and y < z, then xy < xz.

3. If x < 0 and y < z then xz < xy.

4. For all x ∈ F, 0 ≤ x^2. Further, if x ≠ 0, then 0 < x^2.

5. 0 < 1

Theorem 3.1.18. Let F be an ordered field with addition +F , multiplication ·F , additive identity

0F and multiplicative identity 1F . Let <F be the strict total order on F . Let +Q be the addition in

Q, ·Q the multiplication in Q, 0Q and 1Q the additive and multiplicative identities in Q. Let <Q be

the usual strict total order on Q.

There exists an injective map i : Q→ F that respects all of the axioms for an ordered field. In

particular:

• i(0Q) = 0F

• i(1Q) = 1F

• If a, b ∈ Q, then i(a+Q b) = i(a) +F i(b).

• If a, b ∈ Q, then i(a ·Q b) = i(a) ·F i(b).

• If a, b ∈ Q and a <Q b, then i(a) <F i(b).

Proof. This is a tedious exercise. It is good to at least try to figure out an idea for the proof.

Remark 3.1.19. The image i(Q) ⊆ F can be thought of as a copy of the rational numbers sitting

inside F . In this sense, every ordered field contains a “copy” of the rational numbers.

3.2 Towards the Real Numbers

3.2.1 Irrationality of √2

Lemma 3.2.1. There does not exist a/b ∈ Q such that

(a/b)^2 = 2.

For the proof, I will ask you to assume that we have shown that any integer is either even or

odd.

Proof. Suppose that (a/b)^2 = 2, where a and b are chosen to be positive with no common factors. Then,

a^2 = 2b^2.

Hence, 2 | a^2. Therefore, a^2 is even. Suppose that 2 ∤ a. Then

a = 2k + 1

for some k ∈ Z. Therefore,

a^2 = (2k + 1)^2 = 4k^2 + 4k + 1.

So, a^2 = 2(2k^2 + 2k) + 1. So a^2 is odd, a contradiction. Hence, a = 2k for some k ∈ Z. Substituting into a^2 = 2b^2 gives

(2k)^2 = 2b^2

4k^2 = 2b^2

2k^2 = b^2.

Hence, b^2 is even and therefore, by the same argument, so is b. But this means that 2 | b and 2 | a, a contradiction since a and b were chosen to have no common factors.

The next goal is to give an axiomatic description of R that “fills the holes” in Q.

3.2.2 Least Upper Bounds and Greatest Lower Bounds

The next section is devoted to the study of ordered fields and the definition of R.

Definition 3.2.2. Let F be an ordered field and A be a subset of F . Then A is bounded above if

there exists M ∈ F such that

a ≤M for all a ∈ A.

Then M is called an upper bound for A. The set A is bounded below if there exists m ∈ F such

that

m ≤ a for all a ∈ A.

Then m is called a lower bound for A. The set A is bounded if it is bounded above and bounded

below.

Example 3.2.3. 1. Let A ⊆ Q,

A = {1 + (−1)^n/n | n ∈ N}.

For a ∈ A, −10 ≤ a ≤ 10. We can do better though, in fact:

0 ≤ a ≤ 3/2.

2. The set A = {a ∈ Q | a^2 < 2} is bounded in Q, since for a ∈ A, −2 < a < 2. What’s the best we can do? Well, we would like to say −√2 ≤ a ≤ √2, but √2 is not defined, and we know that when we define it, it won’t be in Q.
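For the set A = {1 + (−1)^n/n | n ∈ N} of item 1, the bounds 0 ≤ a ≤ 3/2 can be checked on the first few hundred elements. The sketch below assumes N starts at 1 here (n appears in a denominator); a finite check like this only suggests the bounds, and the name `element` is ours.

```python
from fractions import Fraction

def element(n):
    """The n-th element 1 + (-1)^n / n of the set A, computed exactly."""
    return 1 + Fraction((-1) ** n, n)

values = [element(n) for n in range(1, 201)]
smallest, largest = min(values), max(values)
within_bounds = all(0 <= v <= Fraction(3, 2) for v in values)
```

On this sample the smallest value is attained at n = 1 and the largest at n = 2, which hints at which elements to look at when proving the bounds.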

Definition 3.2.4. Let F be an ordered field and A be a subset of F . We say that L ∈ F is a least

upper bound or supremum of A if

1. L is an upper bound for A

2. if M is an upper bound for A, then L ≤M .

Suppose that B is a subset of F . Then l ∈ F is a greatest lower bound or infimum of B if

1. l is a lower bound for B

2. if m is a lower bound for B, then m ≤ l.

We often denote

L = sup(A), l = inf(B).

Exercise 3.2.5. If it exists, the least upper bound of a set is unique.

Example 3.2.6. 1. Show that

sup(A) = sup({1 + (−1)^n/n | n ∈ N}) = 3/2.

Proof. Since 1 + (−1)^n/n ≤ 1 + 1/2 = 3/2 for all n ∈ N, 3/2 is an upper bound. Suppose that M is any other upper bound. Then a ≤ M for all a ∈ A. Since 3/2 ∈ A (take n = 2), we must have 3/2 ≤ M. So 3/2 is the least upper bound.

2. Exercise: prove that inf(A) = 0.

3. Let

B = {x ∈ Q | 0 < x < 1} ⊆ Q.

Show that the greatest lower bound of B in Q is inf(B) = 0.

Proof. Certainly, if x ∈ B, then 0 < x, so 0 is a lower bound. Let m be a lower bound for B. Then m ≤ x for all x ∈ B. Suppose that m > 0, so that we can choose positive integers a, b ∈ Z such that m = a/b. Since m is a lower bound for B and 1/2 ∈ B, we also have m ≤ 1/2 < 1. Then,

0 < a/(b + 1) < a/b = m < 1.

Hence, a/(b + 1) ∈ B and a/(b + 1) < m. This is a contradiction, since m was a lower bound. Hence, we must have m ≤ 0. So 0 is inf(B).

Exercise: prove that sup(B) = 1.

Warning 3.2.7. sup(X) and inf(X), if they exist, may or may not be in the set X. In the previous example, sup(A) ∈ A and inf(A) ∈ A. However, inf(B) ∉ B and sup(B) ∉ B.

In general, if sup(A) ∈ A, the proofs will be easier. But you don’t always know that it is the

case. We will need more machinery to be able to give a lot of examples.

Here is a useful criterion to recognize supremums.

Lemma 3.2.8. Let A be a subset of an ordered field F and s be an upper bound for A. Then

s = sup(A) if and only if, for every x ∈ F such that x < s, there exists a ∈ A such that x < a.

Proof. Suppose that s = sup(A) and let x ∈ F with x < s. Since s is the least upper bound, x is not an upper bound for A. By definition, there exists a ∈ A such that x < a.

Conversely, suppose that for every x ∈ F such that x < s, there exists a ∈ A such that x < a.

We already know that s is an upper bound for A, so we just need to show that it’s the smallest

one.

Let x ∈ F be an upper bound for A and suppose that x < s. Then, by assumption, there exists

a ∈ A with x < a. So x is not an upper bound for A. Therefore, if x is an upper bound for A,

s ≤ x and s is the least upper bound.

Corollary 3.2.9. Let A be a subset of an ordered field F and s be an upper bound for A. Then

s = sup(A) if and only if, for every ε > 0, ε ∈ F , there exists a ∈ A such that s− ε < a.

Proof. Exercise.

Exercise 3.2.10. Let F be any ordered field. Then the empty set is bounded. However, it does

not have a greatest lower bound nor a least upper bound.

Definition 3.2.11 (Axiom of Completeness). An ordered field F is complete (or has the least upper

bound property) if every non-empty subset of F that is bounded above has a least upper bound.

Warning 3.2.12. The definition says that if S ⊆ F , and ∃M ∈ F such that s ≤ M for all s ∈ S,

then S has a least upper bound. It does not say that the field F itself has a least upper bound. It

is a property of the subsets of F that are bounded above.

Definition 3.2.13. Let A ⊆ F for F an ordered field. A maximum or greatest element for A is

an element M ∈ A such that, for all a ∈ A, a ≤ M . If such an M exists write M = max(A). A

minimum or least element for A is an element m ∈ A such that, for all a ∈ A, m ≤ a. If such an

m exists write m = min(A).

Remark 3.2.14. Maximums and minimums don’t necessarily exist. Further, sup(A) may exist

while max(A) does not and similarly for inf(A) and min(A).

Exercise 3.2.15. (a) If max(A) exists, then so does sup(A) and in this case, max(A) = sup(A).

Similarly, if min(A) exists, then so does inf(A) and in this case, min(A) = inf(A).

(b) Show that the converse is not true. That is, produce a set A such that sup(A) exists, but

max(A) does not. Do the same for inf(A) and min(A).

3.2.3 Axiomatic definition of the real numbers R

Definition 3.2.16 (Real Numbers). The real numbers R is the (essentially) unique complete

ordered field.

Since every ordered field contains Q (see Theorem 3.1.18),

N ⊆ Z ⊆ Q ⊆ R .

In fact, Q is a proper subset of R,

Q ⊊ R.

Indeed, the set {x ∈ R | x^2 ≤ 2} is non-empty (since 1^2 < 2) and bounded above (since 2, for example, is an upper bound). Therefore,

sup({x ∈ R | x^2 ≤ 2}) ∈ R.

We will show later:

Exercise 3.2.17. Let x = sup({y ∈ R | y^2 ≤ 2}) ∈ R. Then x^2 = 2.

But, we saw that there was no rational number x such that x^2 = 2. Hence, x ∈ R but x ∉ Q.

Corollary 3.2.18. Q does not have the least upper bound property.

Definition 3.2.19. A number x ∈ R is irrational if x 6∈ Q.

Definition 3.2.20. Let a ∈ R, then if {x ∈ R | x^2 ≤ a} is non-empty,

√a := sup({x ∈ R | x^2 ≤ a}).

Proving that this is the right definition takes some time and will be done below.

3.2.4 Archimedean Property

Theorem 3.2.21 (Archimedean Property for N). Let m and k be non-zero natural numbers. Then

there exists n ∈ N such that n ·m > k.

Proof. If k < m, let n = 1 and this satisfies the requirement. Suppose that m ≤ k. Then there

exists l ∈ N such that k = m+ l. Let n = l + 2. Then

m · n = m · (l + 1 + 1)

= ml +m+m

≥ l +m+ 1

= k + 1

> k.

Remark 3.2.22. From the Archimedean Property for N, we can deduce that the set N has no

upper bound, although we already knew that. If m ∈ N, then m < m+ 1...

Let F be an ordered field with addition +_F and multiplicative identity 1_F. For N ∈ N, let N also denote the element

N := 1_F +_F . . . +_F 1_F (N times)

of F. (This is i(N) in the sense of Theorem 3.1.18.)

Definition 3.2.23. Let F be an ordered field. Then, F is an Archimedean ordered field if, for

every a > 0 and b > 0 in F , there exists n ∈ N such that na > b.

Exercise 3.2.24. Q is an Archimedean ordered field.

Lemma 3.2.25. Let F be an Archimedean ordered field. Then N is unbounded in F . That is, for

every element x ∈ F , there exists a natural number n such that x < n.

Proof. If x ≤ 0, take n = 1. If x > 0, let a = 1 and b = x in the definition of an Archimedean ordered field. Then there exists n ∈ N such that n = n · 1 > x.

Theorem 3.2.26 (Archimedean Property for R). R is an Archimedean ordered field. That is, if a

and b are real numbers such that a > 0 and b > 0, there exists n ∈ N such that na > b.

Proof. If a > b, choose n = 1. If a = b, choose n = 2. If a < b, let

S = {na | n ∈ N}.

Note that S is not empty since 1 · a = a ∈ S.

Our claim fails if b happens to be an upper bound for S. So, for the sake of contradiction,

suppose that for all s ∈ S, s ≤ b. Since S is not-empty and bounded above, it has a least upper

bound. Let L = sup(S) so that

∀s ∈ S, s ≤ L

Note that

L− a < L.

So there is n0a ∈ S such that L− a < n0a. So

L < n0a+ a = (n0 + 1)a.

But n0 + 1 ∈ N, so (n0 + 1)a ∈ S. This is a contradiction since L was an upper bound for S.

Corollary 3.2.27. N is unbounded in R. That is, for every a ∈ R, there exists n ∈ N such that

a < n.

Exercise 3.2.28. If x ∈ R, there exists N > 0 such that −N < x.

Corollary 3.2.29. Suppose that ε > 0. There exists n ∈ N such that 1/n < ε.

Proof. Consider 1/ε. Then 1/ε and 1 are positive real numbers. Letting a = 1 and b = 1/ε in the previous theorem, there exists n ∈ N such that

1/ε < 1 · n.

So,

1/n < ε.

3.2.5 Nested Interval Property

Definition 3.2.30. We let (a, b) = {x ∈ R | a < x < b}. This is called an open interval. Similarly,

[a, b] = {x ∈ R | a ≤ x ≤ b}, and is called a closed interval. The sets [a, b) and (a, b] are defined

similarly. Finally, (a,∞) = {x ∈ R | a < x} and similarly for [a,∞), (−∞, a) and (−∞, a].

Definition 3.2.31 (Nested Interval Property). An ordered field F satisfies the Nested Interval Property if the following holds. Suppose that for each n ∈ N, we have closed intervals in F

I_n = [a_n, b_n] = {x ∈ F | a_n ≤ x ≤ b_n}

such that a_n < b_n. Further, suppose that the intervals are nested, that is, I_{n+1} ⊆ I_n. Then

⋂_{n=0}^{∞} I_n = {x ∈ F | x ∈ I_n for every n ∈ N} ≠ Ø.

Theorem 3.2.32. R satisfies the Nested Interval Property.

Proof. Let In be as in Definition 3.2.31. Consider the set A = {ak | k ∈ N}. This set is non-empty,

since a0 ∈ A. We show that it is bounded above by bi for any i ∈ N.

Let ak ∈ A. If k ≤ i, since [ai, bi] ⊆ [ak, bk] we have

ak ≤ ai < bi.

If i < k, then since [ak, bk] ⊆ [ai, bi]

ak < bk ≤ bi.

In both cases, ak ≤ bi. So bi is an upper bound for A.

Therefore, A is non-empty and bounded above. By completeness sup(A) exists. We prove that

sup(A) ∈ In for every n ∈ N.

Since sup(A) is an upper bound for A, for every n ∈ N , an ≤ sup(A). Further, we showed that

for every n ∈ N, bn is an upper bound for A, so since sup(A) is the least upper bound, sup(A) ≤ bn.

Therefore,

an ≤ sup(A) ≤ bn

and so sup(A) ∈ [a_n, b_n] = I_n for all n ∈ N. Hence sup(A) ∈ ⋂_{n=0}^{∞} I_n, and this set is not empty.
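To see the hypotheses of the theorem in a concrete case, the sketch below builds nested closed intervals with rational endpoints by repeated bisection, keeping a_n^2 ≤ 2 < b_n^2 at every step. This is an illustration under our own choice of intervals (the function name `nested_intervals` and the starting interval [1, 2] are assumptions, not part of the notes).

```python
from fractions import Fraction

def nested_intervals(steps):
    """Return [(a_0, b_0), ..., (a_steps, b_steps)] with a_n^2 <= 2 < b_n^2,
    each interval being one half of the previous one."""
    a, b = Fraction(1), Fraction(2)       # 1^2 <= 2 < 2^2
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid <= 2:
            a = mid                        # keep the right half
        else:
            b = mid                        # keep the left half
        intervals.append((a, b))
    return intervals

intervals = nested_intervals(10)
final_width = intervals[-1][1] - intervals[-1][0]
```

The left endpoints form the set A used in the proof, and each b_n is an upper bound for it. Because the widths shrink to 0, any point common to all the intervals must have square 2, which is why the same intervals taken inside Q have empty intersection.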

Remark 3.2.33. For Archimedean ordered fields, the Nested Interval Property is equivalent to completeness (i.e., the least upper bound property). That is, if F is an Archimedean ordered field, it satisfies the Nested Interval Property if and only if it is complete. We will talk about Cauchy sequences later and show that, again for Archimedean ordered fields, completeness is also equivalent to the criterion that every Cauchy sequence in F converges.

3.2.6 Density of Q

The next goal is to show that Q is “everywhere” in R.

Theorem 3.2.34. Let a ∈ R. There exists n_0 ∈ Z such that

n_0 − 1 ≤ a < n_0.

Proof. Let

S = {n ∈ Z | n > a}.

Since there exists m ∈ N such that −m < a, we have that S is bounded below by −m. Further, S is non-empty since there exists N ∈ N with a < N. By the well-ordering principle (applied to the non-empty set {n + m | n ∈ S} of positive integers), the set S has a least element n_0 ∈ S. Hence,

a < n_0.

Since n_0 − 1 < n_0, n_0 − 1 ∉ S. So n_0 − 1 ≤ a.

Theorem 3.2.35. Let a and b be real numbers with a < b. There exists a rational number p/q such that

a < p/q < b.

Proof. Since a < b, then 0 < b − a. So there exists q ∈ N such that

1/q < b − a,

so that

a + 1/q < b.

Now, consider qa ∈ R. There exists p ∈ Z such that

p − 1 ≤ qa < p.

So,

(p − 1)/q ≤ a < p/q.

Therefore,

p/q ≤ a + 1/q.

Therefore,

a < p/q ≤ a + 1/q < b.
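The proof is constructive, and its two steps translate directly into a short procedure. In the sketch below the inputs are exact `Fraction` values standing in for real numbers (an assumption made so that the arithmetic is exact), and the function name `rational_between` is ours.

```python
from fractions import Fraction
from math import floor

def rational_between(a, b):
    """Follow the proof: pick q with 1/q < b - a, then the integer p
    with p - 1 <= q*a < p, and return p/q."""
    if not a < b:
        raise ValueError("need a < b")
    q = floor(1 / (b - a)) + 1      # q > 1/(b - a), hence 1/q < b - a
    p = floor(q * a) + 1            # p - 1 <= q*a < p
    return Fraction(p, q)

r1 = rational_between(Fraction(1, 3), Fraction(1, 2))
r2 = rational_between(Fraction(-5, 2), Fraction(-9, 4))
```

The choice of q uses the Archimedean property (Corollary 3.2.29) and the choice of p uses Theorem 3.2.34. The second call shows why that theorem is needed for negative numbers as well.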

Definition 3.2.36. A subset A ⊆ R is dense in R if, given any two real numbers a, b ∈ R such

that a < b, there exists r ∈ A such that a < r < b.

Corollary 3.2.37. The rationals Q are dense in R.

Exercise 3.2.38. The product of an irrational number with a non-zero rational is irrational.

Corollary 3.2.39. The irrationals are dense in R.

Proof. Suppose that a and b are in R and that a < b. Let p/q ∈ Q be such that

a/√2 < p/q < b/√2.

If p/q ≠ 0, then

a < (p/q) · √2 < b,

and we are done, since (p/q) · √2 is irrational by Exercise 3.2.38. If p/q = 0, then choose p′/q′ ∈ Q such that

0 < p′/q′ < b/√2.

Then,

a < 0 < (p′/q′) · √2 < b.

3.2.7 Square Roots

For the next three lemmas, we let a ∈ R, and

L = √a := sup({x ∈ R | x^2 ≤ a}).

In the extra problems, you show that:

Lemma 3.2.40. The square root √a exists in R if and only if a ≥ 0.

Proof. The set {x ∈ R | x^2 ≤ a} is non-empty if and only if a ≥ 0. In that case, it is bounded above (for example, by a + 1), so by the least upper bound property for R, it has a least upper bound.

Lemma 3.2.41. If √a exists, then √a ≥ 0.

Proof. Since 0^2 ≤ a for any a ≥ 0, then 0 ∈ {x ∈ R | x^2 ≤ a}, hence

0 ≤ sup({x ∈ R | x^2 ≤ a}) = L.

Lemma 3.2.42. Let a ≥ 0. Then (√a)^2 = a.

Proof. If a = 0, L = 0 so the claim holds. So we assume that a > 0.

Case 1

Suppose that L^2 < a. Then a − L^2 > 0. Since L ≥ 0, 2L + 1 > 0. Hence by the Archimedean property for the real numbers, there exists n ∈ N such that

2L + 1 < n(a − L^2).

Hence,

(2L + 1)/n < a − L^2.

But (2L + 1)/n = 2L/n + 1/n ≥ 2L/n + 1/n^2. Hence,

2L/n + 1/n^2 < a − L^2.

This implies that

(L + 1/n)^2 = L^2 + 2L/n + 1/n^2 < a.

So L + 1/n ∈ {x ∈ R | x^2 ≤ a}, but L + 1/n > L, a contradiction. So L^2 ≮ a.

Case 2

Suppose that L^2 > a. Then L^2 − a > 0. As before, choose n such that

(2L + 1)/n < L^2 − a.

However,

2L/n − 1/n^2 < (2L + 1)/n.

Hence,

2L/n − 1/n^2 < L^2 − a.

Then,

a < (L − 1/n)^2.

So, if x^2 ≤ a, then x^2 ≤ (L − 1/n)^2. One can show easily that this implies that |x| ≤ L − 1/n (choosing n larger if necessary, we may assume that L − 1/n > 0). Since x ≤ |x|, we conclude that for all x in {x ∈ R | x^2 ≤ a},

x ≤ L − 1/n.

So L − 1/n is an upper bound for {x ∈ R | x^2 ≤ a}, yet L − 1/n < L. A contradiction, hence L^2 ≯ a.

∴ L^2 = a.

Definition 3.2.43. Let a ∈ R, then if {x ∈ R | x^2 ≤ a} is non-empty,

√a := sup({x ∈ R | x^2 ≤ a}).

3.3 Absolute Value and ε–Neighborhoods

Definition 3.3.1. Let x ∈ R. The absolute value of x is

|x| := x if x ≥ 0,

|x| := −x if x < 0.

Theorem 3.3.2 (Properties of absolute value).

1. For x ∈ R, |x| ≥ 0 and |x| = 0 if and only if x = 0.

2. For x ∈ R, −|x| ≤ x ≤ |x|.

3. For any x, y ∈ R, |xy| = |x||y|.

4. (Triangle inequality) For any x, y ∈ R, |x+ y| ≤ |x|+ |y|.

5. (The Reverse Triangle Inequality) For any x, y ∈ R, ||x| − |y|| ≤ |x− y|.

Proof. We will only prove (4) and (5).

For (4), we have

(|x + y|)^2 = (x + y)^2

= x^2 + 2xy + y^2

≤ x^2 + 2|xy| + y^2

= |x|^2 + 2|x||y| + |y|^2

= (|x| + |y|)^2.

Since |x + y| ≥ 0 and |x| + |y| ≥ 0, this implies that |x + y| ≤ |x| + |y|.

For (5), by the triangle inequality,

|x| = |x− y + y| ≤ |x− y|+ |y|.

Therefore, |x| − |y| ≤ |x− y|. Similarly,

|y| = |x+ y − x| ≤ |x|+ |y − x| = |x|+ |x− y|.

Therefore, |y| − |x| ≤ |x − y|, or −|x − y| ≤ |x| − |y|. We have

−|x − y| ≤ |x| − |y| ≤ |x − y|,

so ||x| − |y|| ≤ |x − y|.

Exercise 3.3.3. Let x, y and z be real numbers. If |x− z| < y, then |x| < y + |z|.

Lemma 3.3.4. Let a ∈ R. Suppose that |a| < ε for all ε > 0. Then a = 0.

Proof. If a ≠ 0, then a > 0 or −a > 0. In either case, |a| > 0. Let ε = |a|. By assumption,

|a| < |a|, a contradiction. So a = 0.

Definition 3.3.5. Let ε > 0. The open ε–neighborhood of a real number a is

Vε(a) = (a− ε, a+ ε).

The closed ε–neighborhood of a real number a is

V ε(a) = [a− ε, a+ ε].

Remark 3.3.6. (a)

Vε(a) = (a− ε, a+ ε) = {x ∈ R : |x− a| < ε}.

In particular, |x− a| < ε if and only if a− ε < x and x < a+ ε.

(b)

V ε(a) = [a− ε, a+ ε] = {x ∈ R : |x− a| ≤ ε}.

In particular, |x− a| ≤ ε if and only if a− ε ≤ x and x ≤ a+ ε.

(c)

(−∞, a− ε) ∪ (a+ ε,∞) = {x ∈ R : |x− a| > ε}.

In particular, |x − a| > ε if and only if x < a − ε or a + ε < x.

(d)

(−∞, a− ε] ∪ [a+ ε,∞) = {x ∈ R : |x− a| ≥ ε}.

In particular, |x − a| ≥ ε if and only if x ≤ a − ε or a + ε ≤ x.

Chapter 4

Sequences

4.1 Sequences

Definition 4.1.1. A sequence of real numbers is a function a : N→ R.

Remark 4.1.2. Let (an) be a sequence a : N→ R. Then {an | n ∈ N} is the image of a.

We write a_n = a(n) and think of a sequence a as a list (a_1, a_2, a_3, . . . ) of real numbers. Other notations you will see are

(a_n)_{n=1}^{∞} or (a_k)_{k∈N} or (a_n).

For functions defined on {n ∈ N | n ≥ n_0} for a fixed n_0 ∈ N, we write (a_n)_{n=n_0}^{∞} and also call this a sequence.

Example 4.1.3. (a) Let a_n = (−1)^n. Then

(a_n)_{n∈N} = (−1, 1, −1, 1, −1, . . .)

(b) Let a_n = 2n. Then

(a_n)_{n∈N} = (2, 4, 6, 8, 10, . . .)

(c) Let a_n = √(3/n) · (2^n + 1). Then

(a_k)_{k∈N} = (3√3, 5√(3/2), 9, . . .)

(d) We can have other kinds of behavior too

(1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, . . .)

4.1.1 Convergence

Definition 4.1.4. A sequence (a_n)_{n∈N} of real numbers is convergent if there exists a ∈ R such that, given any ε > 0, there exists N_ε ∈ N such that, for all n ≥ N_ε,

|a − a_n| < ε.

We say that (a_n) converges to a, and write a_n → a or

lim_{n→∞} a_n = a.

Remark 4.1.5. In other words, (an) converges to a if, for all ε > 0, there exists Nε ∈ N such that

{an | n ≥ Nε} ⊆ (a− ε, a+ ε) = Vε(a).

Example 4.1.6. Show that lim_{n→∞} (1 − 1/n) = 1.

Proof. Let ε > 0. Then, there exists N_ε ∈ N such that 1/N_ε < ε. Let n ≥ N_ε. Then,

|1 − (1 − 1/n)| = 1/n ≤ 1/N_ε < ε.
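The role of N_ε in this proof can be made concrete. The sketch below picks one valid N_ε for a given ε (any larger natural number would do as well) and checks the inequality |1 − a_n| < ε on a stretch of indices n ≥ N_ε. The names `N_for` and `a` are ours, and the check covers only finitely many n, so it illustrates the definition and does not replace the proof.

```python
from fractions import Fraction
from math import floor

def N_for(eps):
    """A natural number N with 1/N < eps, as provided by Corollary 3.2.29."""
    return floor(1 / eps) + 1

def a(n):
    """The sequence a_n = 1 - 1/n, computed exactly."""
    return 1 - Fraction(1, n)

def tail_is_close(eps, how_many=500):
    """Check |1 - a_n| < eps for n = N_eps, ..., N_eps + how_many - 1."""
    N = N_for(eps)
    return all(abs(1 - a(n)) < eps for n in range(N, N + how_many))

checks = [tail_is_close(eps) for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000))]
```

Note that N_ε depends on ε: a smaller ε forces a larger N_ε, which is exactly what the subscript in the definition is meant to record.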

Exercise 4.1.7. Use induction to show that n < 2^n for all n ∈ N.

Example 4.1.8. Show that lim_{n→∞} 1/2^n = 0.

Proof. Let ε > 0. Choose N_ε ∈ N such that 1/N_ε < ε. Then, for n ≥ N_ε,

|1/2^n − 0| = 1/2^n < 1/n ≤ 1/N_ε < ε.

So, lim 1/2^n = 0.

Lemma 4.1.9. Let (a_n)_{n∈N} be a convergent sequence in R. Then lim_{n→∞} a_n is unique.

Proof. Suppose lim_{n→∞} a_n = a and lim_{n→∞} a_n = a′. We will show that for all ε > 0, |a − a′| < ε. By Lemma 3.3.4, this will imply that a − a′ = 0, that is, that a = a′.

Fix ε > 0. Then ε/2 > 0, so there exists N_{ε/2} ∈ N such that, for all n ≥ N_{ε/2},

|a − a_n| < ε/2.

Also, there exists M_{ε/2} ∈ N such that, for all n ≥ M_{ε/2},

|a′ − a_n| < ε/2.

Let n ≥ max(N_{ε/2}, M_{ε/2}). Then,

|a − a_n| < ε/2 and |a′ − a_n| < ε/2.

So,

|a − a′| = |a − a_n + a_n − a′| ≤ |a − a_n| + |a_n − a′| < ε/2 + ε/2 = ε.

So, for all ε > 0, |a − a′| < ε.

What does it mean for a sequence not to converge to a?

Definition 4.1.10 (Divergence). Let (a_n) be a sequence. Then we say that (a_n) diverges or is a divergent sequence if it is not convergent. That is, for every a ∈ R, there exists ε_a > 0 such that, for every n ∈ N, there exists k_n ≥ n such that |a_{k_n} − a| ≥ ε_a.

Example 4.1.11. We show that lim_{n→∞} (−1)^n ≠ 1, where a_n = (−1)^n.

Let ε_1 = 1. Let n ∈ N. If n is odd, let k_n = n. Then

|1 − a_{k_n}| = |1 − (−1)^n| = |1 + 1| = 2 ≥ ε_1.

If n is even, let k_n = n + 1. Then

|1 − a_{k_n}| = |1 − (−1)^{n+1}| = |1 + 1| = 2 ≥ ε_1.

So,

lim_{n→∞} a_n ≠ 1.

Remark 4.1.12. To prove that ((−1)^n) is divergent, we would have to show that for every a ∈ R,

lim_{n→∞} (−1)^n ≠ a.

Example 4.1.13. The sequence (n) is divergent. Let a ∈ R.

Choose εa = 1. Let n ∈ N. Choose k ∈ N such that a + 1 < k. Let kn = max(n + 1, k). Then

kn > n and

a+ εa = a+ 1 < kn

and it follows that |kn − a| > εa.

Definition 4.1.14. A sequence (ak)k∈N is bounded below (resp. bounded above) if {ak | k ∈ N} is

bounded below (resp. bounded above). A sequence (ak)k∈N is bounded if it is both bounded below

and bounded above.

Example 4.1.15.

((−1)n)n∈N is bounded.

(n)n∈N is not bounded. It is bounded below, but not bounded above.

Exercise 4.1.16. Let A ⊆ R be bounded (above and below). Then there exists M ∈ R such that

|a| ≤M for all a ∈ A.

Lemma 4.1.17. Let (an) be a convergent sequence. Then (an) is bounded.

Proof. Suppose that lim an = a. Let ε = 1. Then there exists N > 0 such that

|a− an| < 1

for all n ≥ N . In particular,

a− 1 < an < a+ 1

which implies that

−|a| − 1 < an < |a|+ 1.

So, |an| < |a|+ 1 for all n ≥ N . Let M = max{|a|+ 1, |a1|, . . . , |aN−1|} (this is a finite set, so the

maximum exists). Then

|an| ≤M

for all n ∈ N so {an | n ∈ N} is bounded.

Theorem 4.1.18 (Manipulating Limits). Let lim an = a and lim bn = b (in particular, both limits

exist). Let c ∈ R.

1. lim(can) = c · a

2. lim(an + bn) = a+ b

3. lim(anbn) = a · b

4. If b ≠ 0 and b_n ≠ 0 for all n, then lim a_n/b_n = a/b.

Proof. 1. If c = 0, then (ca_n) = (0) which converges to 0. Suppose c ≠ 0. Let ε > 0. Then ε/|c| > 0. Since (a_n) converges, there exists N ∈ N such that, for all n ≥ N,

|a_n − a| < ε/|c|.

Therefore, for all n ≥ N,

|ca_n − ca| = |c||a_n − a| < |c| ε/|c| = ε.

Hence, lim ca_n = ca.

2. Let ε > 0. Since (an) and (bn) converge, there exists Na, Nb ∈ N such that for all n ≥ Na,

|an − a| < ε /2

and for all n ≥ Nb,

|bn − b| < ε /2.

Let N = max(Na, Nb). Let n ≥ N . Then

|an + bn − (a+ b)| = |an − a+ bn − b| ≤ |an − a|+ |bn − b|.

Since n ≥ Na, and n ≥ Nb, we have

|an − a|+ |bn − b| < ε /2 + ε /2 = ε .

Hence, for n ≥ N ,

|an + bn − (a+ b)| < ε

and lim(an + bn) = a + b.

3. Since (an) converges, it is bounded. Let M ≥ |an| for all n ∈ N. Note that, for all n ∈ N

|anbn − ab| = |anbn − anb+ anb− ab|

= |an(bn − b) + b(an − a)|

≤ |an||bn − b|+ |b||an − a|

≤M |bn − b|+ |b||an − a|

Let ε > 0. Then ε /(M + 1 + |b|) > 0. Since (an) converges, there exists Na ∈ N such that for

all n ≥ Na,

|an − a| < ε /(M + 1 + |b|).

Since (bn) converges, there exists Nb ∈ N such that for all n ≥ Nb,

|bn − b| < ε /(M + 1 + |b|).

So, for all n ≥ N = max(Na, Nb),

|anbn − ab| ≤M |bn − b|+ |b||an − a|

≤M ε/(M + 1 + |b|) + |b| ε /(M + 1 + |b|)

= ε(M + |b|)/(M + 1 + |b|)

≤ ε .

4. We prove that if b ≠ 0, then lim 1/bn = 1/b. Then use 3 to finish the proof.

First, consider

|1/bn − 1/b| = |(b − bn)/(b bn)| = |b − bn| / |b bn|.

Since b ≠ 0, we have |b| > 0. Further, since bn → b, there exists N1 ∈ N such that, for all

n ≥ N1,

||bn| − |b|| ≤ |b− bn| < |b|/2.

Hence,

−|b|/2 < |bn| − |b|

and therefore,

|b|/2 < |bn|.

This implies that

|b|^2/2 < |b bn|

so that

1/|b bn| < 2/|b|^2.

Hence, for all n ≥ N1,

|1/bn − 1/b| = |b − bn| / |b bn| < (2/|b|^2) |bn − b|.

Let ε > 0. Then ε|b|^2/2 > 0. So there exists N ∈ N, with N ≥ N1, such that, for all n ≥ N ,

|bn − b| < (ε|b|^2)/2.

So, for all n ≥ N ,

|1/bn − 1/b| < (2/|b|^2) |b − bn| < (2/|b|^2) · (ε|b|^2/2) = ε.

So, lim 1/bn = 1/b.

Lemma 4.1.19. Let lim an = a and suppose that there exists K ∈ N such that an ≥ 0 for all


n ≥ K. Then a ≥ 0.

Proof. Suppose a < 0 and let ε = −a. Then there exists Nε ∈ N such that, for all n ≥ Nε,

an ∈ (a− ε, a+ ε) = (a− (−a), a+ (−a)) = (2a, 0).

In particular, for all n ≥ Nε, an < 0. However, if M = max(K,Nε), then M ≥ Nε, so aM < 0, but

M ≥ K, so aM ≥ 0, a contradiction. Hence, a ≥ 0.

Theorem 4.1.20. Let lim an = a and lim bn = b. Suppose that there exists K ∈ N such that

an ≤ bn for all n ≥ K, then a ≤ b.

Proof. Consider the sequence cn = bn − an. Then (cn) is convergent since both (an) and (bn) are.

Further, c = lim cn = b− a. Now, note that bn ≥ an for all n ≥ K, hence, cn = bn − an ≥ 0 for all

n ≥ K. Hence, c = b− a ≥ 0 by the previous result. So b ≥ a.

4.1.2 Monotone Sequences

Definition 4.1.21. (i) A sequence (ak)k∈N is monotonic increasing if ak ≤ ak+1. It is strictly

monotonic increasing if ak < ak+1.

Example 4.1.22. (1, 1, 2, 2, 3, 3, 4, 4, 5, 5, . . .) is monotonic increasing.

(1, 2, 3, 4, 5, 6, . . .) is strictly monotonic increasing.

(ii) A sequence (ak)k∈N is monotonic decreasing if ak ≥ ak+1. It is strictly monotonic decreasing

if ak > ak+1.

Example 4.1.23. (1/n)n∈N is strictly monotonic decreasing.

Example 4.1.24. (1, 1, 1, 1, . . .) is monotonic decreasing. It is also monotonic increasing.


Lemma 4.1.25. If (an)n∈N is a monotonic increasing sequence of real numbers which is bounded

above, then it converges to a = sup({an}n∈N). Similarly, every sequence (bn)n∈N which is monotonic

decreasing and bounded below converges to b = inf({bn}n∈N)

Proof. We suppose that (an)n∈N is monotonic increasing. The proof for monotonic decreasing is

similar.

Since {an}n∈N is bounded above and non-empty, it has a least upper bound sup ({an}n∈N) = a.

Let ε > 0. Then a − ε < a. So there exists aNε ∈ {an}n∈N such that a − ε < aNε ≤ a. Since

(an) is monotonic increasing, for all n ≥ Nε,

a− ε < aNε ≤ an ≤ a < a+ ε

so an ∈ Vε(a). Therefore, for all n ≥ Nε,

|a− an| < ε .

So, lim an = a.

Corollary 4.1.26. Every bounded monotonic sequence of real numbers converges.
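As a numerical illustration of Lemma 4.1.25, here is a minimal sketch. The sequence a1 = √2, an+1 = √(2 + an) is our own choice of example (it is not taken from the notes); one can check by induction that it is monotonic increasing and bounded above by 2, and the lemma then says it converges to its supremum, which is 2.

```python
import math

def nested_radical_terms(count):
    """Return the first `count` terms of a_1 = sqrt(2), a_{n+1} = sqrt(2 + a_n)."""
    terms = [math.sqrt(2.0)]
    while len(terms) < count:
        terms.append(math.sqrt(2.0 + terms[-1]))
    return terms

terms = nested_radical_terms(12)
# Check the two hypotheses of the lemma on the computed terms.
is_increasing = all(x <= y for x, y in zip(terms, terms[1:]))
is_bounded_above_by_2 = all(x <= 2.0 for x in terms)
print(is_increasing, is_bounded_above_by_2, terms[-1])
```

The check only inspects finitely many terms, so it is evidence and not a proof; the proof is the induction mentioned above.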

4.1.3 Subsequences

Definition 4.1.27. Let (an) be a sequence. A subsequence of (an) is a sequence b : N→ R defined

by the composition b = a ◦ i, where i : N → N is a strictly increasing function. If (an) has a

subsequence with limit p, we call p a subsequential limit of (an).

We can write bk = a(i(k)) = a(ik) = aik , so that (bk) is the sequence b1, b2, b3, . . . , which is equal

to the sequence ai1 , ai2 , ai3 , . . . , where i1 < i2 < i3 < · · · .

So, if you think of a sequence (an)n∈N as a list, then a subsequence of (an)n∈N is an infinite

sublist.

Example 4.1.28. • ((−1)^{2k})k∈N is a subsequence of ((−1)^n)n∈N.

• Consider (an)n∈N = (1, 2, 3, 4, . . .). Then (1, 1, 3, 5, 7, . . .) is not a subsequence. It is not a

sublist since 1 is not repeated in (an)n∈N. However, (1, 3, 5, 7, . . .) is a subsequence of (an)n∈N.


• For any k ∈ N, the sequences (an)n≥k and (an)n>k are subsequences of (an)n∈N.

• Let (an)n∈N be a sequence of real numbers. If (a_{n_k})k∈N is a subsequence of (an)n∈N and

(a_{n_{k_i}})i∈N is a subsequence of (a_{n_k})k∈N, then (a_{n_{k_i}})i∈N is a subsequence of (an)n∈N.

Lemma 4.1.29. Let (an)n∈N be a sequence of real numbers and (ank)k∈N be a subsequence of

(an)n∈N. If (an) converges to a, then so does (ank).

Lemma 4.1.30. Let 0 ≤ q < 1. Then lim q^n = 0.

Proof. If q = 0, then this is a constant sequence at 0 hence converges to 0. If 0 < q < 1, then

note that 0 < q^n < q^{n−1} < . . . < q < 1 for all n ∈ N. Hence, (q^n) is a monotone decreasing

sequence which is bounded below by 0. Therefore, lim q^n = L exists. Now, remark that (q^{n+1}) is

a subsequence of (q^n), so lim q^{n+1} = L. Now

L = lim q^{n+1} = lim q · q^n = q lim q^n = q · L.

Hence, (1 − q)L = 0. Since q < 1, we must have L = 0.
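To make the statement lim q^n = 0 concrete, the following sketch finds, for given q and ε, the first index N with q^N < ε. The helper name first_index_below is hypothetical (introduced only for this illustration). Because (q^n) is decreasing, every later term is also below ε.

```python
def first_index_below(q, eps):
    """Smallest n >= 1 with q**n < eps, assuming 0 <= q < 1 and eps > 0."""
    n, power = 1, q
    while power >= eps:
        n += 1
        power *= q          # power now equals q**n
    return n

N = first_index_below(0.5, 0.1)
print(N, 0.5 ** N)
```

The loop terminates precisely because q^n → 0; for q ≥ 1 it would run forever.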

4.2 Bolzano-Weierstrass

Definition 4.2.1. Let [a, b] be an interval. Then the length of [a, b] is b− a. Similarly, the length

of (a, b), [a, b) or (a, b] is b− a.

Recall that a sequence (an) converges to a if, for all ε > 0, there exists N ∈ N such that for all

n ≥ N ,

|a− an| < ε .

Theorem 4.2.2 (Bolzano-Weierstrass). Every bounded sequence (an)n∈N in R has a convergent

subsequence.

Proof. Since (an) is bounded, there exists M > 0 such that |an| ≤ M for all n ∈ N. Let b0 = −M

and c0 = M . Consider the interval I0 = [b0, c0] = [−M,M ]. If [0,M ] contains infinitely many

terms of (an), let I1 = [b1, c1] = [0,M ]. Otherwise, let I1 = [b1, c1] = [−M, 0]. Suppose that

Ik = [bk, ck] has been constructed. Then if [bk + (ck − bk)/2, ck] contains infinitely many terms of (an),

let Ik+1 = [bk+1, ck+1] = [bk + (ck − bk)/2, ck]. Otherwise, let Ik+1 = [bk+1, ck+1] = [bk, bk + (ck − bk)/2].


We get a nested sequence of closed intervals Ik, each containing infinitely many terms of (an).

Further, Ik has length

ck − bk = M/2^{k−1}.

Let an1 be any term of (an) in I1. For k ≥ 2, let ank ∈ Ik be such that nk > nk−1. This is possible since

there are infinitely many ai's in Ik and only a finite number of them can have i ≤ nk−1. This gives

us a subsequence (ank) of (an) with ank ∈ Ik.

By the Nested Interval Theorem, ⋂k Ik ≠ Ø. Let a ∈ ⋂k Ik. We prove that limk ank = a.

Let ε > 0. Choose K ∈ N such that M/2^{K−1} < ε. Then, a ∈ IK and, for all k ≥ K, ank ∈ Ik ⊆ IK .

So,

|ank − a| ≤ |cK − bK | = M/2^{K−1} < ε.

So, limk ank = a.
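The bisection in the proof can be imitated on a finite sample of terms. This is only a sketch under a loud assumption: a computer cannot test whether an interval contains infinitely many terms, so the code keeps the half that contains at least as many sample terms as the other half. The function name bisection_subsequence and its parameters are hypothetical.

```python
def bisection_subsequence(terms, bound, steps):
    """Pick increasing indices by repeatedly halving [-bound, bound].

    ASSUMPTION: "contains infinitely many terms" is replaced by
    "contains at least as many of the sample terms as the other half".
    """
    lo, hi = -bound, bound
    indices, last = [], -1
    for _ in range(steps):
        mid = lo + (hi - lo) / 2
        right = [i for i, t in enumerate(terms) if mid <= t <= hi]
        left = [i for i, t in enumerate(terms) if lo <= t <= mid]
        if len(right) >= len(left):
            lo, candidates = mid, right      # keep the right half
        else:
            hi, candidates = mid, left       # keep the left half
        later = [i for i in candidates if i > last]
        if not later:
            break                            # the finite sample ran out
        last = later[0]                      # next index, larger than the previous one
        indices.append(last)
    return indices

terms = [(-1) ** n for n in range(1, 1001)]  # a_n = (-1)^n for n = 1..1000
picked = bisection_subsequence(terms, bound=1.0, steps=10)
values = [terms[i] for i in picked]
print(picked, values)
```

For this sample the selected terms are all equal to 1, matching the convergent subsequence ((−1)^{2k}) of ((−1)^n).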

4.3 Cauchy Sequences

Definition 4.3.1. A sequence of real numbers (ak)k∈N is a Cauchy sequence if, for any ε > 0, there

exists Nε ∈ N such that, for all n,m ≥ Nε,

|an − am| < ε .

So a sequence is Cauchy if the terms of the sequence are getting closer and closer together. You

will show the following lemma in your assignment:

Lemma 4.3.2. If (ak)k∈N is convergent, then it is Cauchy.

Proof. Let lim ak = a. Let ε > 0. There exists N ∈ N such that for all k ≥ N ,

|a − ak| < ε/2.

Let n,m ≥ N. Then,

|an − am| = |an − a + a − am| ≤ |an − a| + |a − am| < ε/2 + ε/2 = ε.

So (ak) is a Cauchy sequence.

Lemma 4.3.3. If a sequence (an)n∈N is Cauchy, it is convergent.


Proof. We show the following two things:

1. (an)n∈N is bounded, so it has a convergent subsequence (ank)k∈N.

2. limn→∞ an = limk→∞ ank .

1. Let ε = 1. There exists N1 ∈ N such that for all n,m ≥ N1,

|an − am| < 1.

So, in particular, for all n ≥ N1,

|an − aN1 | < 1, and hence |an| < 1 + |aN1 |.

Let M = max{|a1|, |a2|, . . . , |aN1−1|, 1 + |aN1 |}. If n < N1, then |an| ≤ M . If n ≥ N1,

then |an| ≤ 1 + |aN1 | ≤ M . So, (an)n∈N is bounded by M . Therefore, by the Bolzano-Weierstrass theorem, it has a convergent

subsequence (ank) with limk→∞ ank = a.

2. Let ε > 0. Use the convergence of (ank) to choose K ∈ N such that, for all k ≥ K,

|a − ank | < ε/2.

Use the fact that (an) is Cauchy to choose N ∈ N such that, for all n,m ≥ N ,

|am − an| < ε/2.

Choose k0 big enough such that n_{k0} ≥ N and k0 ≥ K. Then, for all n ≥ N ,

|a − an| = |a − a_{n_{k0}} + a_{n_{k0}} − an| ≤ |a − a_{n_{k0}}| + |a_{n_{k0}} − an| < ε/2 + ε/2 = ε.

So,

lim an = a.

The two previous results together give us the following theorem.

Theorem 4.3.4 (Cauchy Criterion). A sequence (ak)k∈N of real numbers is convergent if and only

if it is a Cauchy sequence.
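The Cauchy condition can be checked numerically on a finite range. The sketch below uses the partial sums sn = 1/1^2 + . . . + 1/n^2, an example of our own choosing. For m > n ≥ N one has sm − sn < 1/n − 1/m < 1/N (compare 1/k^2 with 1/(k(k − 1))), so Nε can be taken to be any N with 1/N ≤ ε.

```python
def partial_sums(count):
    """Partial sums s_n = 1/1^2 + ... + 1/n^2 for n = 1..count (index 0 holds s_1)."""
    sums, total = [], 0.0
    for k in range(1, count + 1):
        total += 1.0 / (k * k)
        sums.append(total)
    return sums

def max_tail_gap(sums, N):
    """Largest |s_n - s_m| over all n, m >= N within the computed range."""
    tail = sums[N - 1:]
    return max(tail) - min(tail)

sums = partial_sums(2000)
for N in (10, 100, 1000):
    print(N, max_tail_gap(sums, N), 1.0 / N)
```

Only indices up to 2000 are inspected, so this illustrates the definition and does not replace the estimate above.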

We won’t prove the following result, but it is a good exercise.

Theorem 4.3.5. Let F be an ordered field. The following statements are equivalent.

(a) Every non-empty subset of F which is bounded above has a supremum.

(b) Every Cauchy sequence in F converges.

(c) The intersection of a sequence of nested closed intervals in F is non-empty.

If F satisfies these equivalent properties, then F is complete.


Chapter 5

Topology on R

5.1 Limit Points

Definition 5.1.1. Let S ⊆ R and x ∈ R. Then x is a limit point of S if, for all ε > 0, Vε(x) =

(x− ε, x+ ε) contains an element s ∈ S such that s ≠ x. That is,

(Vε(x)\{x}) ∩ S ≠ Ø.

Warning 5.1.2. The point x ∈ R could be in S or it could not be in S. There is no requirement

on x. For every x ∈ R, one can ask the question: “is x a limit point of S?”

Example 5.1.3. 1. Consider the set S = {1/n | n ∈ N}. Then 0 is a limit point for S. Indeed,

let (−ε, ε) be an open neighborhood of 0. Then 0 < ε. So there exists n ∈ N such that

0 < 1/n < ε. Hence,

0 ≠ 1/n and 1/n ∈ (−ε, ε).

2. For [0, 1), the point 1 is a limit point. Indeed, let ε > 0. We need a point of [0, 1) in

(1− ε, 1 + ε) other than 1. Since ε > 0, there exists n ∈ N such that 1/n < ε, and then 1 − ε < 1 − 1/n < 1. So,

1 − 1/n ∈ ((1 − ε, 1 + ε)\{1}) ∩ [0, 1).

3. The set S = {1} has no limit points. Let x ∈ R. If x = 1, then the interval (0, 2) is an open

neighborhood of 1, but it does not contain points of S other than 1. So 1 is not a limit

point.

Suppose that x ≠ 1. Then |x − 1| > 0. Let ε = |x − 1|. Then 1 is not in (x − ε, x + ε), so

((x− ε, x+ ε)\{x}) ∩ S = Ø. So x is not a limit point.

Theorem 5.1.4. Let S ⊆ R. Then x ∈ R is a limit point for S if and only if every ε-neighborhood

of x contains infinitely many points in S.

Proof. Suppose that x is a limit point for S and let ε > 0. Since x is a limit point, S ∩ (Vε(x)\{x}) contains at

least one point, say s0. Suppose, for contradiction, that Vε(x) contains only finitely many points of S. Then we can let

δ = min({|x− s| | s ∈ S ∩ Vε(x), s ≠ x}).

So, δ > 0 and δ ≤ |x− s0| < ε. Since Vδ(x) ⊆ Vε(x), we have

(Vδ(x)\{x}) ∩ S ⊆ (Vε(x)\{x}) ∩ S.

However, if s ∈ (Vε(x)\{x}) ∩ S, then |x− s| ≥ δ, so s ∉ (Vδ(x)\{x}) ∩ S. Hence,

(Vδ(x)\{x}) ∩ S = Ø.

This is a contradiction since x is a limit point of S. So, Vε(x) contains infinitely many points of S.

Conversely, if for every ε > 0, Vε(x) contains infinitely many points of S, then it contains at least

one point of S which is not equal to x.

Theorem 5.1.5. Every infinite bounded subset S of R has a limit point.

Proof. Choose any sequence (sn)n∈N such that sn ∈ S and such that sn ≠ sm whenever n ≠ m. This

is possible since S is infinite. Then (sn) is bounded, so it has a convergent subsequence (snk)k∈N.

The limit x = lim snk is a limit point for S.

Indeed, let ε > 0. Then there exists K ∈ N such that for all k ≥ K,

|x− snk | < ε.

So, for all k ≥ K, snk ∈ (x− ε, x+ ε). Since the snk's are all distinct and are in S, there are infinitely

many elements of S in (x− ε, x+ ε). This holds for all ε > 0, so x is a limit point of S by Theorem 5.1.4.

Theorem 5.1.6. Let S ⊆ R. A point x ∈ R is a limit point of S if and only if there is a sequence

in S\{x} which converges to x.


Proof. Exercise.

5.2 Closed sets

Definition 5.2.1. A set S is closed if it contains all of its limit points.

Definition 5.2.2. Let S ⊆ R and let S′ denote the set of limit points of S. Then

S̄ := S ∪ S′

is the closure of S in R.

Exercise 5.2.3. S̄ is a closed set.

Example 5.2.4. • Let S = (0, 1), then S̄ = [0, 1].

• Let S = Z, then Z̄ = Z.

• Let S = Q, then Q̄ = R.

Exercise 5.2.5. Show that S is dense in R if and only if S̄ = R.

5.3 Open Sets

Definition 5.3.1. A set U is open if the set R \U is closed.

Lemma 5.3.2. A set U is open if and only if, for every x ∈ U , there exists ε > 0 such that

Vε(x) ⊆ U .

Proof. Exercise.

5.4 Compactness

Definition 5.4.1. A set K ⊆ R is compact if, given any sequence (an) with an ∈ K, then (an) has

a convergent subsequence (ank) such that lim ank ∈ K.

Theorem 5.4.2 (Heine-Borel). A subset K ⊆ R is compact if and only if it is closed and bounded.

Proof. Suppose that K is closed and bounded. Let (an) be a sequence with an ∈ K. Since K is

bounded, so is (an). So it has a convergent subsequence (ank). Let limk ank = a. If a ∉ K, then (ank) is a sequence in K\{a} converging to a, so a

is a limit point for K (Theorem 5.1.6). But K is closed, so contains all its limit points, a contradiction. So a must

be in K.

Conversely, suppose that K is compact. If K is not bounded, then for any n ∈ N, there exists

an ∈ K such that |an| > n. Since (an) is a sequence in K, it has a convergent subsequence (ank).

However, |ank | > nk ≥ k, so (ank) is not bounded, hence cannot be convergent, a contradiction. So K is

bounded.

Now, let a be a limit point of K. Since a is a limit point of K, there is a sequence (an) with

an ∈ K\{a} such that limn an = a. Then (an) has a convergent subsequence whose limit is in K.

However, any subsequence of (an) converges to a, so a ∈ K.

Exercise 5.4.3. Suppose that K is a non-empty compact set. Then K has a maximum and a

minimum.

5.5 Connectedness

Definition 5.5.1. A set X is disconnected if there exist open sets A and B such that A ∩ X ≠ Ø,

B ∩ X ≠ Ø, but A ∩ B ∩ X = Ø and X = (A ∪ B) ∩ X. We say that X is connected if it is not

disconnected.

Exercise 5.5.2. Prove that if X is disconnected, then there are closed sets S and R such that

S ∩X ≠ Ø, R ∩X ≠ Ø, but S ∩R ∩X = Ø and X = (S ∪R) ∩X.

Theorem 5.5.3. Let X ⊆ R. The following are equivalent:

(a) X is connected.

(b) If a and b are in X and a < b, then [a, b] ⊆ X.

Proof. Suppose that X is connected but that there exists a < b with a, b ∈ X but [a, b] ⊄ X. Then

there is a < c < b such that c ∉ X. Let A = (−∞, c) and B = (c,∞). Then A and B are open.

Further, a ∈ A ∩ X, b ∈ B ∩ X, X = (A ∪ B) ∩ X and X ∩ A ∩ B = Ø. So X is disconnected, which contradicts the fact that X is connected.

Now, we show that (b) implies (a). Suppose that (b) holds but that X is disconnected. Let A,

B be as in Definition 5.5.1. Let a ∈ X ∩ A and b ∈ X ∩B, and suppose without loss of generality

that a < b. Then, by assumption [a, b] ⊆ X.

Consider the set

U = {x ∈ B ∩X | a < x ≤ b}.


Then U is bounded below by a. Further, b ∈ U so it is not empty. Hence, U has a greatest lower

bound, say c = inf(U) and note that a ≤ c ≤ b. Since

c ∈ X = X ∩ (A ∪B),

either c ∈ A or c ∈ B.

Suppose that c ∈ A. Then c ≠ b since A ∩B ∩X = Ø. Since A is open, there exists ε > 0 such

that (c− ε, c+ ε) ⊆ A. Further, we can choose ε ≤ |b− c|/2 so that [c, c+ ε) ⊆ [a, b] ⊆ X. Hence,

[c, c+ ε) ⊆ A ∩X.

However, if x ∈ U , then x ∈ B ∩X so x ∉ A. Since c ≤ x, we must have c+ ε ≤ x. Hence, c+ ε is

a lower bound for U , contradicting the fact that c = inf(U).

Suppose that c ∈ B. Then c ≠ a. Since B is open, there exists ε > 0 such that (c−ε, c+ε) ⊆ B.

Further, we can choose ε ≤ |c− a|/2 so that (c− ε, c] ⊆ [a, b] ⊆ X. Further,

(c− ε, c] ⊆ B.

Therefore, c − ε /2 ∈ X ∩ B and a < c − ε /2, so c − ε /2 ∈ U . This contradicts the fact that

c = inf(U).

So, our assumption must be false, and X is connected.


Chapter 6

Continuity and Differentiability

6.1 Continuous Functions

6.1.1 Functional Limits

Definition 6.1.1. Let f : A → R be a function. Let a ∈ R be a limit point of A. A limit of f at a is

a real number ℓ ∈ R such that, for all ε > 0, there exists δ > 0 such that, if x ∈ A and 0 < |x − a| < δ, then

|f(x)− ℓ| < ε. If this is the case, we write

limx→a f(x) = ℓ.

Exercise 6.1.2. If limx→a f(x) = ℓ1 and limx→a f(x) = ℓ2, then ℓ1 = ℓ2.

Example 6.1.3. (a) Let f : R→ R be the function f(x) = b. Then limx→c f(x) = b for all c ∈ R.

(b) Let f : R→ R be given by

f(x) = 1 if x < 0, and f(x) = 0 if x ≥ 0.

Then, limx→0 f(x) does not exist. Indeed, suppose that the limit exists and equals b ∈ R, and let ε = 1/2. Then there exists δ > 0

such that |f(x)− b| < 1/2 for 0 < |x− 0| < δ. Choose x < 0 in (−δ, δ), then

|f(x)− b| = |1− b| < 1/2.


So, b ∈ (1/2, 3/2). If x > 0 and x ∈ (−δ, δ), then

|f(x)− b| = |0− b| < 1/2

so b ∈ (−1/2, 1/2). This is a contradiction.

(c) Let f : R→ R be given by

f(x) = 1 if x ≠ 0, and f(x) = 0 if x = 0.

Then, limx→0 f(x) = 1. Indeed, let ε > 0 and δ = 1. If 0 < |x−0| < 1, then in particular,

x ≠ 0. So, |f(x)− 1| = |1− 1| = 0 < ε.

Theorem 6.1.4. Let f : A → R be a function. Suppose that c is a limit point of A. Then the

following are equivalent:

(i) limx→c f(x) = L

(ii) Given any sequence (xn) such that xn ∈ A\{c} and limn xn = c, limn f(xn) = L.

Proof. Suppose that limx→c f(x) = L and assume that (xn) is as above. Note that since xn ≠ c,

0 < |xn − c| for all n ∈ N. Let ε > 0. Then there exists δ > 0 such that, if 0 < |x − c| < δ, then

|f(x)− L| < ε. Since (xn)→ c, there exists N ∈ N such that, for all n ≥ N ,

0 < |xn − c| < δ.

So, for all n ≥ N, |f(xn)− L| < ε and lim f(xn) = L.

Conversely, suppose that for all sequences (xn) as above, lim f(xn) = L. Suppose, for contradiction, that limx→c f(x) = L

fails. Then, there exists ε0 > 0 such that, for all δ > 0, there exists x ∈ A such that 0 < |x − c| < δ but

|f(x) − L| ≥ ε0. For each n ∈ N, let δ = 1/n and choose xn ∈ A such that 0 < |xn − c| < 1/n but |f(xn) − L| ≥ ε0. Then,

lim xn = c, but (f(xn)) does not converge to L, a contradiction. So limx→c f(x) = L.
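The sequence criterion gives a quick way to see that a limit fails to exist. As a sketch, take the step function of Example 6.1.3(b) and the sequence xn = (−1)^n/n (our own choice), which lies in R\{0} and converges to 0. The values f(xn) alternate, so (f(xn)) does not converge, and by the theorem limx→0 f(x) cannot exist.

```python
def step(x):
    """The function of Example 6.1.3(b): 1 for x < 0, 0 for x >= 0."""
    return 1 if x < 0 else 0

# x_n = (-1)^n / n for n = 1..8: nonzero terms converging to 0.
xs = [(-1) ** n / n for n in range(1, 9)]
values = [step(x) for x in xs]
print(values)
```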

Given two functions f : A → R and g : A → R, we can define new functions (f + g) : A → R,

(fg) : A → R, f/g : A \ g^{−1}(0) → R by (f + g)(a) = f(a) + g(a), (fg)(a) = f(a)g(a) and

(f/g)(a) = f(a)/g(a).

Theorem 6.1.5. Let f1, f2 : A → R be functions. Let c be a limit point of A. Suppose that

limx→c f1(x) = b1 and limx→c f2(x) = b2. Then

(a) If k ∈ R, then limx→c kf1 = kb1

(b) limx→c (f1 + f2) = b1 + b2

(c) limx→c f1f2 = b1b2

(d) If b2 ≠ 0, then limx→c f1/f2 = b1/b2.

Proof. Let (xn) → c be a sequence as in Theorem 6.1.4. Then the sequence (f1(xn)) converges

to b1 and the sequence (f2(xn)) converges to b2. We apply the theorem for limits of sequences (Theorem 4.1.18) in

each case to conclude that limn→∞ kf1(xn) = kb1, etc. Since the sequence (xn) was arbitrary, we use

Theorem 6.1.4 to conclude that the limits exist and are as stated.

6.1.2 Continuity

Definition 6.1.6. Let A,B ⊆ R, let f : A → B and a ∈ A. We say that f is continuous at a if,

for every ε > 0, there exists δ > 0 such that, if x ∈ A and |x− a| < δ, then |f(x)− f(a)| < ε. The

function f is continuous if it is continuous at a for every a ∈ A.

Example 6.1.7. (a) Let f : R→ R be the function f(x) = b. Then f is continuous at a for every

a ∈ R. Indeed, if ε > 0, let δ be any positive real number. Then if |x− a| < δ,

|f(x)− f(a)| = |b− b| = 0 < ε .

(b) Let f : R → R be the function f(x) = x. Then f is continuous at a for every a ∈ R. Indeed,

let ε > 0. Let δ = ε. Then if |x− a| < ε,

|f(x)− f(a)| = |x− a| < ε.

(c) Let f : R→ R be the function f(x) = |x|. Then f is continuous at a for every a ∈ R. Indeed,

let ε > 0. Let δ = ε. Then if |x− a| < δ = ε, we have

|f(x)− f(a)| = ||x| − |a|| ≤ |x− a| < ε.

(d) Let f : R→ R be given by

f(x) = 1 if x ∈ Q, and f(x) = 0 if x ∉ Q.

Then f is not continuous at any a ∈ R. Let ε = 1/2. Let δ > 0. If a ∈ Q, choose x ∈ R \Q such

that x ∈ Vδ(a). If a ∈ R \Q, choose x ∈ Q in Vδ(a). In either case, |x− a| < δ. However,

|f(x)− f(a)| = 1 > 1/2.

Remark 6.1.8. If a ∈ A is not a limit point of A, then we can choose δ > 0 such that (a− δ, a+

δ) ∩A = {a}. Then, f is necessarily continuous at a since, for any ε > 0, if x ∈ A and |x− a| < δ,

then x = a so |f(x)− f(a)| = |f(a)− f(a)| = 0 < ε.

Theorem 6.1.9. Let f : A → R be a function and let a ∈ A. Then the following are equivalent:

(a) f is continuous at a

(b) If a is a limit point of A, then limx→a f(x) = f(a)

(c) Given any sequence (xn) with xn ∈ A and limn→∞ xn = a, limn→∞ f(xn) = f(a). (Note, we are not asking that xn ≠ a

here.)

Proof. Exercise.

Corollary 6.1.10. Let f, g : A→ R be functions. Suppose that f, g are continuous at a ∈ A. Then

so are the functions

(a) f + g

(b) fg

(c) f/g provided that g(a) ≠ 0.

Corollary 6.1.11. Suppose that f : R→ R is defined by f(x) = an x^n + . . .+ a1x+ a0. Then f is

continuous on all of R.

Theorem 6.1.12. Suppose that f : A→ B and g : B → C are continuous. Then so is g ◦ f .

Proof. Let a ∈ A. Let ε > 0. Choose δ1 > 0 such that, if |y− f(a)| < δ1, then |g(y)− g(f(a))| < ε.

Let δ > 0 be such that if |x−a| < δ, then |f(x)−f(a)| < δ1. Then, for such x, |g(f(x))−g(f(a))| <

ε.

Remark 6.1.13. Continuity is a local concept. That is, the continuity of a function at a ∈ R only

depends on what happens in a small neighborhood of a. Indeed, let f : A→ B and g : A→ B be

two functions. Suppose that f is continuous at a ∈ A. Suppose that there exists γ > 0 such that

f(x) = g(x) for all x ∈ Vγ(a). Then one can show that g is continuous at a.


Exercise 6.1.14. Let α, β ∈ R be such that α ≠ β. Let γ ∈ R and let f : R→ R be given by

f(x) = α if x ≤ γ, and f(x) = β if x > γ.

Decide where on R the function f is continuous.

Exercise 6.1.15. If f : A→ R is continuous at a and f(a) ≠ 0, then there exists γ > 0 and M > 0

such that |f(x)| ≥M for all x ∈ Vγ(a) ∩ A.

6.1.3 Continuity and open sets

Remark 6.1.16. We can rephrase the definition of continuity only using the language of sets.

Indeed, a function f : A→ R is continuous at a ∈ A if and only if, for all ε > 0, there exists δ > 0

such that, if x ∈ Vδ(a), then f(x) ∈ Vε(f(a)).

Theorem 6.1.17. Let A ⊆ R and f : A→ R be a function. Then the following are equivalent:

(a) f is continuous on A.

(b) For every open set U ⊆ R, f^{−1}(U) = A ∩ V for some open set V ⊆ R.

(c) For every closed set S ⊆ R, f^{−1}(S) = A ∩ T for some closed set T ⊆ R.

Proof. Suppose (a) holds. Let U be an open set and a ∈ f^{−1}(U). Then f(a) ∈ U , so there exists

ε > 0 such that Vε(f(a)) ⊆ U . Since f is continuous at a, there exists δa > 0 such that, for all

x ∈ A with |x−a| < δa, we have |f(x)−f(a)| < ε. So, for all x ∈ Vδa(a)∩A, f(x) ∈ Vε(f(a)) ⊆ U .

This means that

Vδa(a) ∩A ⊆ f^{−1}(U).

Let V be the union of the sets Vδa(a) over all a ∈ f^{−1}(U). Then, V ∩A = f^{−1}(U) and V is open.

Now, suppose that (b) holds. Let a ∈ A. Let ε > 0. Then, U = Vε(f(a)) is an open set, so

there is an open set V ⊆ R such that f^{−1}(U) = A ∩ V . Since a ∈ f^{−1}(U), a ∈ V . Since V is open, there

exists δ > 0 such that Vδ(a) ⊆ V . So, if x ∈ A and |x− a| < δ, then

x ∈ A ∩ Vδ(a) ⊆ A ∩ V = f^{−1}(U),

and hence f(x) ∈ U = Vε(f(a)). In other words, |f(x) − f(a)| < ε. So f is continuous at a for

every a ∈ A.


That (b) holds if and only if (c) holds is left as an exercise.

6.1.4 Extreme Value Theorem

Recall from Exercise 5.4.3 that a non-empty compact set K ⊆ R has a maximum and a minimum.

Theorem 6.1.18. Let f : K → R be a continuous function. Suppose that K is compact. Then so

is f(K).

Proof. Let K be a compact set. Let (yn) be a sequence in f(K). We will show that it has a

convergent subsequence (ynk) whose limit is in f(K).

For each n ∈ N, there is xn ∈ K such that f(xn) = yn. The sequence (xn) is in K, so it has a convergent

subsequence (xnk) whose limit limk→∞ xnk = x is in K. Since f is continuous, limk→∞ f(xnk) = f(x).

Hence, since ynk = f(xnk),

limk→∞ ynk = f(x) ∈ f(K).

Therefore, f(K) is compact.

Theorem 6.1.1 (Extreme Value Theorem). Let f : K → R be a continuous function on a non-empty

compact set K. Then f(K) has a maximum and a minimum. That is, there exist x0 and

x1 in K such that, for all x ∈ K, f(x0) ≤ f(x) ≤ f(x1).

Proof. Any non-empty compact set has a maximum and a minimum. Since f(K) is compact and non-empty, there are

y0 and y1 in f(K) such that, for all y ∈ f(K), y0 ≤ y ≤ y1. We let x0 and x1 in K be such that f(x0) = y0 and

f(x1) = y1. Then, if x ∈ K, f(x) ∈ f(K), hence f(x0) ≤ f(x) ≤ f(x1).

Corollary 6.1.19. Any continuous function f : [a, b]→ R achieves its maximum and its minimum.

Definition 6.1.20. Let A ⊂ R and f : A → R be a function. Then f is bounded if f(A) is

a bounded subset of R. That is, f is bounded if there exists M ∈ R such that, for all a ∈ A,

|f(a)| ≤M .

Corollary 6.1.21. If f : K → R is continuous and K is compact, then f is bounded.

6.1.5 Intermediate Value Theorem

Theorem 6.1.22. Let f : C → R be a continuous function where C ⊆ R. Suppose that C is

connected. Then so is f(C).


Remark 6.1.23. Recall that C is connected if and only if, for every x, y ∈ C with x < y, [x, y] ⊆ C.

The theorem says that if C is connected, then so is f(C), so that, for every f(s), f(t) ∈ f(C) with

f(s) < f(t), then [f(s), f(t)] ⊆ f(C). This implies that every real number between f(s) and f(t)

is achieved by the function f .

First Proof. Suppose that f(C) is not connected. Then there exists f(a) ∈ f(C), f(b) ∈ f(C) and

y ∈ R such that, f(a) < y < f(b) but y ∉ f(C).

First, suppose that a < b. Consider

U = {x ∈ C | x ≤ b and, for all z such that x ≤ z ≤ b, y < f(z)}.

Since b ∈ U and U is bounded below by a, c = inf(U) exists. Further, c ∈ [a, b], so c ∈ C.

Suppose that f(c) < y (in particular, c ≠ b). Let ε = y − f(c). Since f is continuous, there

exists δ > 0 with δ ≤ |c− b| such that, if |x− c| < δ, then |f(x)− f(c)| < ε. So, if x ∈ (c− δ, c+ δ),

then

f(x)− f(c) < y − f(c),

so f(x) < y. So, letting z = c + δ/2, we have that f(z) < y, so if x < z, then x 6∈ U . So, z is a

lower bound for U , but c < z, a contradiction.

So suppose that y < f(c). In particular, c ≠ a and [c, b] ⊆ U . Let ε = f(c) − y. Since f is

continuous, there exists δ > 0 with δ ≤ |c− a| such that, if |x− c| < δ, then |f(x)− f(c)| < ε. So,

if x ∈ (c− δ, c+ δ), then

f(c)− f(x) < ε = f(c)− y,

hence, f(x) > y. In particular, for all z ∈ [c − δ/2, c], then y < f(z). Since this also holds for all

z ∈ [c, b], we have that c− δ/2 ∈ U . But c− δ/2 < c and c = inf(U), a contradiction.

The case b < a leads to similar contradictions.

We conclude that no such y can exist, and therefore, that f(C) is connected.

Second Proof. We prove the contrapositive. Suppose that f(C) is disconnected. (Recall that this

means that there are points f(a) ∈ f(C) and f(b) ∈ f(C) such that a < b and there exists

f(a) < c < f(b) such that c ∉ f(C).) Then for

A = {x ∈ R | x < c} and B = {x ∈ R | x > c}


we have that f(C) ⊆ A ∪B and

f(a) ∈ f(C) ∩A ≠ Ø and f(b) ∈ f(C) ∩B ≠ Ø

but f(C) ∩A ∩B = Ø. Since A and B are open, by Theorem 6.1.17 there exist open sets U and V such that

f^{−1}(A) = U ∩ C, f^{−1}(B) = V ∩ C.

Therefore, C ⊆ U ∪ V . Further,

a ∈ U ∩ C ≠ Ø and b ∈ V ∩ C ≠ Ø.

Finally,

U ∩ V ∩ C = f^{−1}(A) ∩ f^{−1}(B) = f^{−1}(A ∩B) = f^{−1}(Ø) = Ø.

So C is disconnected.

Theorem 6.1.24 (Intermediate Value Theorem). Let f : [a, b] → R be a continuous function. If

r ∈ R is such that f(a) < r < f(b) or f(b) < r < f(a), then there exists c ∈ (a, b) such that

f(c) = r.

Proof. Since [a, b] is connected, f([a, b]) is connected. Since f(a) and f(b) are in f([a, b]), so is

any real number r between them. Therefore, such an r is in the image of f . Hence, there exists

c ∈ [a, b] such that f(c) = r. Since r ≠ f(a) and r ≠ f(b), c ≠ a and c ≠ b so c ∈ (a, b).
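The Intermediate Value Theorem only asserts that c exists. A standard way to approximate such a c numerically is bisection, sketched below. This is a different technique from the connectedness argument in the proof, and the function name bisect_value and the tolerance parameter are hypothetical. The code assumes f is continuous on [a, b] and that r lies strictly between f(a) and f(b).

```python
def bisect_value(f, a, b, r, tol=1e-10):
    """Approximate some c in [a, b] with f(c) = r.

    ASSUMPTIONS: f is continuous on [a, b] and r lies between f(a) and f(b).
    """
    lo, hi = a, b
    starts_below = f(a) < f(b)   # True if f(a) < r < f(b), False if f(b) < r < f(a)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half whose endpoint values still have r between them.
        if (f(mid) < r) == starts_below:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = bisect_value(lambda x: x * x, 1.0, 2.0, 2.0)
print(c, c * c)
```

Each step halves an interval whose endpoint values still bracket r, so the intervals shrink to a point c, and continuity gives f(c) = r.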

6.1.6 Uniform Continuity

Definition 6.1.25. A function f : A → R is uniformly continuous if for all ε > 0, there exists a

δ > 0 such that for all x and y in A, if |x− y| < δ, then |f(x)− f(y)| < ε.

Lemma 6.1.26. If f : A→ R is a uniformly continuous function, then it is continuous.

Proof. Let ε > 0. Choose δ > 0 such that if x, y ∈ A and |x − y| < δ, then |f(x) − f(y)| < ε. Let a ∈ A. If x ∈ A and

|x− a| < δ, then |f(x)− f(a)| < ε. So f is continuous at a for every a ∈ A.

Example 6.1.27. (a) The function f : R → R given by f(x) = x is uniformly continuous. Let

ε > 0 and let δ = ε. Then if |x− y| < δ, we have

|f(x)− f(y)| = |x− y| < δ = ε .


(b) The function f : R → R given by f(x) = x^2 is not uniformly continuous. Let ε = 1. Suppose

that for some δ > 0, if |x− y| < δ, then |x^2 − y^2| < 1. Then, for any x ∈ R, since |x+ δ/2− x| < δ,

we have

|(x+ δ/2)^2 − x^2| < 1.

That is, |δx+ δ^2/4| < 1. In particular, δx+ δ^2/4 < 1, that is,

x < (1− δ^2/4)/δ

for all x ∈ R. This is a contradiction (for example, it contradicts the Archimedean principle).

(c) The function f : (1, 2)→ R given by f(x) = x^2 is uniformly continuous. Let ε > 0. Note that

for x, y ∈ (1, 2), |x+ y| ≤ |x|+ |y| < 4. Hence, if |x− y| < ε/4, then

|x^2 − y^2| = |x− y||x+ y| ≤ 4|x− y| < 4 · ε/4 = ε.

So f is uniformly continuous on (1, 2).
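The contrast between parts (b) and (c) can be seen numerically. In this sketch (helper names are hypothetical), the step length δ/2 is fixed, and the change in x^2 over that step grows with x on all of R, while on (1, 2) the change is controlled by 4|x − y| at every sampled pair of points.

```python
def square_gap(x, delta):
    """|(x + delta/2)^2 - x^2|: change of x^2 over a step of length delta/2 < delta."""
    return abs((x + delta / 2) ** 2 - x ** 2)

delta = 0.01
# On R: the same step produces larger and larger changes as x grows.
gaps = [square_gap(x, delta) for x in (1.0, 10.0, 100.0, 1000.0)]

# On (1, 2): |x^2 - y^2| <= 4|x - y| at all sampled pairs.
points = [1 + k / 100 for k in range(1, 100)]
bounded_on_interval = all(
    abs(x * x - y * y) <= 4 * abs(x - y) for x in points for y in points
)
print(gaps, bounded_on_interval)
```

Since the gap equals δx + δ^2/4 for x > 0, no single δ can keep it below 1 for every x, which is exactly the contradiction in part (b).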

Exercise 6.1.28. Which of the following functions is uniformly continuous?

(a) f : (0,∞)→ R given by f(x) = 1/x

(b) f : [1,∞)→ R given by f(x) = 1/x

(c) f : [0,∞)→ R given by f(x) = √x

(d) f : R→ R given by f(x) = x^n for n ≥ 0.

Exercise 6.1.29 (Hard). Let f : R→ R be given by f(x) = p(x) where p(x) = an x^n + . . .+ a0 is

a polynomial. Then f is uniformly continuous if and only if deg(p) ≤ 1.

Exercise 6.1.30. If f : A→ R is uniformly continuous and B ⊆ A, then the restriction of f to B,

f : B → R is uniformly continuous.

Theorem 6.1.31. Let f, g : A → R and suppose that both functions are uniformly continuous.

Then

(a) If k ∈ R, kf is uniformly continuous.

(b) f + g is uniformly continuous.


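Proof. We prove part (b). Let ε > 0. Using the uniform continuity of f and of g (and taking the smaller of the two resulting values), choose δ > 0 such that, if |x− y| < δ, then both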

|f(x)− f(y)| < ε /2 and |g(x)− g(y)| < ε /2.

Then, if |x− y| < δ,

|f(x) + g(x)− (f(y) + g(y))| = |f(x)− f(y) + g(x)− g(y)|

≤ |f(x)− f(y)|+ |g(x)− g(y)|

< ε /2 + ε /2 = ε .

So f + g is uniformly continuous.

Remark 6.1.32. Note that if f, g : A → R are uniformly continuous, then fg is not necessarily

uniformly continuous. For example, f(x) = g(x) = x are uniformly continuous on R, yet (fg)(x) =

x^2 is not.

Exercise 6.1.33. Let f, g : A → R be bounded, uniformly continuous functions. Show that

fg : A→ R is uniformly continuous.

Theorem 6.1.34. Let f : K → R be a continuous function. If K is a compact set, then f is

uniformly continuous.

Proof. Suppose that f is not uniformly continuous. Then there exists ε0 > 0 such that, for every

δ > 0, there exist x, y ∈ K such that |x − y| < δ and |f(x) − f(y)| ≥ ε0. For each n ∈ N, choose

xn, yn ∈ K such that |xn − yn| < 1/n and |f(xn)− f(yn)| ≥ ε0.

Since K is compact, (xn) has a convergent subsequence (xnk) whose limit x is in K. Since

|xnk − ynk | < 1/nk, it’s simple to show that lim ynk = x. Therefore, since f is continuous,

limk→∞ f(xnk) = f(x) = limk→∞ f(ynk).

This contradicts the fact that |f(xnk) − f(ynk)| ≥ ε0 for all k ∈ N. Therefore, our assumption is

wrong and f is uniformly continuous.

Example 6.1.35. Any continuous function f : [a, b] → R is uniformly continuous since [a, b] is

compact.


6.2 Differentiability

6.2.1 Derivatives

In this section, we will assume that the domains of our functions are open sets. That is, given

f : A→ R, we suppose that for each a ∈ A, there is an εa > 0 such that (a− εa, a+ εa) ⊆ A.

Definition 6.2.1. Let f : A→ R. Let a ∈ A. Then, if it exists, the limit

f ′(a) = limx→a (f(x) − f(a))/(x − a)

is called the derivative of f at a and we say that f is differentiable at a. If f is differentiable at a

for every a ∈ A, then f is differentiable on A.

Exercise 6.2.2. Let f : A→ R. Prove that the derivative of f at a ∈ A can also be defined as

limh→0 (f(a+ h) − f(a))/h.
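The difference quotient in the definition can be evaluated directly for small h. The sketch below uses f(x) = x^2 at a = 3 (our own example; the helper name is hypothetical). Here the quotient equals 6 + h exactly, so it approaches f ′(3) = 6 as h → 0.

```python
def difference_quotient(f, a, h):
    """(f(a + h) - f(a)) / h for h != 0."""
    return (f(a + h) - f(a)) / h

def square(x):
    return x * x

steps = [10.0 ** -k for k in range(1, 7)]            # h = 0.1, 0.01, ..., 1e-6
quotients = [difference_quotient(square, 3.0, h) for h in steps]
for h, q in zip(steps, quotients):
    print(h, q)
```

In floating point, h cannot be taken arbitrarily small, because the subtraction f(a + h) − f(a) loses accuracy; this is a limitation of the computation, not of the definition.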

Lemma 6.2.3. If f : A→ R is differentiable at a ∈ A, then f is continuous at a ∈ A.

Proof. Suppose f is differentiable at a ∈ A, then

limx→a (f(x) − f(a))/(x − a)

exists. Since limx→a (x − a) exists and equals zero,

limx→a (f(x) − f(a)) = limx→a (x − a) · (f(x) − f(a))/(x − a)

= limx→a (x − a) · limx→a (f(x) − f(a))/(x − a) = 0 · f ′(a).

Therefore, limx→a (f(x) − f(a)) = 0. However, limx→a f(a) = f(a) hence,

limx→a f(x) = limx→a (f(x) − f(a) + f(a)) = limx→a (f(x) − f(a)) + limx→a f(a) = 0 + f(a).

Since limx→a f(x) = f(a), f is continuous at a.

Exercise 6.2.4. Prove this using a δ-ε argument.


Theorem 6.2.5. Suppose that f, g : A→ R are differentiable at a ∈ A. Then

(a) f + g is differentiable at a and (f + g)′(a) = f ′(a) + g′(a)

(b) fg is differentiable at a and (fg)′(a) = f ′(a)g(a) + f(a)g′(a)

(c) If g(a) ≠ 0, then f/g is differentiable at a and (f/g)′(a) = (f ′(a)g(a) − g′(a)f(a))/g(a)^2.

Proof. (a)

limx→a ((f + g)(x) − (f + g)(a))/(x − a) = limx→a (f(x) + g(x) − f(a) − g(a))/(x − a)

= limx→a (f(x) − f(a) + g(x) − g(a))/(x − a)

= limx→a ( (f(x) − f(a))/(x − a) + (g(x) − g(a))/(x − a) ).

Since both limits exist, so does the limit of their sum, and

limx→a ((f + g)(x) − (f + g)(a))/(x − a) = limx→a (f(x) − f(a))/(x − a) + limx→a (g(x) − g(a))/(x − a) = f ′(a) + g′(a).

So, (f + g)′(a) = f ′(a) + g′(a).

(b)

(f · g)′(a) = limx→a ((f · g)(x) − (f · g)(a))/(x − a)

= limx→a (f(x)g(x) − f(a)g(a))/(x − a)

= limx→a (f(x)g(x) − f(a)g(x) + f(a)g(x) − f(a)g(a))/(x − a)

= limx→a ( (f(x)g(x) − f(a)g(x))/(x − a) + (f(a)g(x) − f(a)g(a))/(x − a) )

= limx→a ( (f(x) − f(a))/(x − a) · g(x) + f(a) · (g(x) − g(a))/(x − a) ).

Since g is differentiable at a, it is continuous at a, and the constant function F (x) = f(a) is continuous, so their limits at a exist. So since all limits involved

exist,

(f · g)′(a) = limx→a (f(x) − f(a))/(x − a) · limx→a g(x) + limx→a f(a) · limx→a (g(x) − g(a))/(x − a) = f ′(a)g(a) + f(a)g′(a).

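(c) Suppose that g(a) ≠ 0. Since g is differentiable at a, it is continuous at a. So there is some δ > 0 such that, if $x \in V_\delta(a) \cap A$, then g(x) ≠ 0. Therefore, to take the limit, we can restrict the domain of g to $V_\delta(a)$. Then,

\[ \lim_{x \to a} \frac{1/g(x) - 1/g(a)}{x - a} = \lim_{x \to a} \left( -\frac{1}{g(a) g(x)} \cdot \frac{g(x) - g(a)}{x - a} \right). \]

Since g is continuous at a and g(a) ≠ 0, $\lim_{x \to a} \frac{1}{g(x)}$ exists and is equal to $\frac{1}{g(a)}$. Similarly, the constant function $F(x) = -\frac{1}{g(a)}$ has a limit at a. Therefore, all limits involved exist, and

\begin{align*}
\lim_{x \to a} \frac{1/g(x) - 1/g(a)}{x - a} &= \lim_{x \to a} \left( -\frac{1}{g(a)} \right) \cdot \lim_{x \to a} \frac{1}{g(x)} \cdot \lim_{x \to a} \frac{g(x) - g(a)}{x - a} \\
&= -\frac{1}{g(a)} \cdot \frac{1}{g(a)} \cdot g'(a) \\
&= -\frac{g'(a)}{g(a)^2}.
\end{align*}

We can now use part (b), applied to f and 1/g, to obtain the result for (f/g)′(a).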

6.2.2 The Chain Rule

Theorem 6.2.6 (Chain Rule). Let f : A → R and g : B → R and suppose that f(A) ⊆ B. Let

a ∈ A. Suppose that f is differentiable at a and g is differentiable at f(a). Then g ◦ f : A→ R is

differentiable at a and

(g ◦ f)′(a) = g′(f(a)) · f ′(a).

Proof. We want to show that

\[ \lim_{x \to a} \frac{g(f(x)) - g(f(a))}{x - a} \]

exists. First note that, for x ≠ a,

\[ \frac{g(f(x)) - g(f(a))}{x - a} = \begin{cases} \dfrac{g(f(x)) - g(f(a))}{f(x) - f(a)} \cdot \dfrac{f(x) - f(a)}{x - a} & f(x) \neq f(a) \\ 0 & f(x) = f(a). \end{cases} \]

Motivated by this, we define F : B → R by

\[ F(y) = \begin{cases} \dfrac{g(y) - g(f(a))}{y - f(a)} & y \neq f(a) \\ g'(f(a)) & y = f(a). \end{cases} \]

Then, for all x ∈ A\{a},

\[ \frac{g(f(x)) - g(f(a))}{x - a} = F(f(x)) \frac{f(x) - f(a)}{x - a}, \]

by definition of F(f(x)) if f(x) ≠ f(a) and since both sides are zero if f(x) = f(a).

Now, note that

\[ \lim_{y \to f(a)} F(y) = \lim_{y \to f(a)} \frac{g(y) - g(f(a))}{y - f(a)} = g'(f(a)). \]

Since F(f(a)) = g′(f(a)), the function F is continuous at f(a). Since f is continuous at a and F is continuous at f(a), F ◦ f is continuous at a. Hence, $\lim_{x \to a} F(f(x)) = F(f(a))$. So:

\begin{align*}
\lim_{x \to a} \frac{g(f(x)) - g(f(a))}{x - a} &= \lim_{x \to a} F(f(x)) \frac{f(x) - f(a)}{x - a} \\
&= \lim_{x \to a} F(f(x)) \cdot \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = F(f(a)) f'(a) \\
&= g'(f(a)) f'(a).
\end{align*}
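As a sanity check on the formula (not a replacement for the proof), the sketch below compares both sides of the Chain Rule numerically. The helper `central_difference` and the sample functions f(x) = x² and g = sin are assumptions chosen for illustration.

```python
import math

def central_difference(func, a, h=1e-6):
    """Approximate func'(a) by the symmetric quotient (func(a+h) - func(a-h)) / (2h)."""
    return (func(a + h) - func(a - h)) / (2 * h)

def f(x):
    return x * x

g = math.sin
a = 0.7

# Left side: derivative of the composition g(f(x)) at a.
lhs = central_difference(lambda x: g(f(x)), a)

# Right side: g'(f(a)) * f'(a), each factor approximated separately.
rhs = central_difference(g, f(a)) * central_difference(f, a)

# Exact value for this particular example: cos(a^2) * 2a.
exact = math.cos(a * a) * 2 * a
```

Up to rounding and discretization error, `lhs`, `rhs` and `exact` agree.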

6.2.3 Derivative at minima and maxima

Definition 6.2.7. Let f : A→ R.

• f attains its maximum (resp. minimum) at a ∈ A if f(a) ≥ f(x) (resp. f(a) ≤ f(x)) for all x ∈ A.

• f attains a local maximum (resp. local minimum) at a ∈ A if there is an interval (b, c) ⊆ A with a ∈ (b, c) such that f(a) ≥ f(x) (resp. f(a) ≤ f(x)) for all x ∈ (b, c).

Theorem 6.2.8. Suppose that f : A → R is differentiable on A (with A open). Suppose that f

attains a local maximum or local minimum at a ∈ A. Then f ′(a) = 0.

Proof. Let (b, c) ⊆ A be an interval containing a on which f(a) is the maximum value of f. Consider the function $g(x) = \frac{f(x) - f(a)}{x - a}$ on A\{a}. Since f is differentiable on A,

\[ f'(a) = \lim_{x \to a} g(x). \]

So, for any sequence $(x_n)$ in (b, c)\{a} such that $\lim_{n \to \infty} x_n = a$, we have $\lim_{n \to \infty} g(x_n) = f'(a)$. Choose a sequence $(x_n)$ such that $\lim_{n \to \infty} x_n = a$ and with the property that, for all n ∈ N, $x_n < a$ and $x_n \in (b, c)$. Similarly, choose a sequence $(y_n)$ such that $\lim_{n \to \infty} y_n = a$ and with the property that, for all n ∈ N, $y_n > a$ and $y_n \in (b, c)$.

Then,

\[ \lim_{n \to \infty} g(x_n) = f'(a) = \lim_{n \to \infty} g(y_n). \]

However,

\[ g(x_n) = \frac{f(x_n) - f(a)}{x_n - a} \ge 0 \]

since $f(x_n) - f(a) \le 0$ (f(a) is a maximum) and $x_n - a < 0$, and

\[ g(y_n) = \frac{f(y_n) - f(a)}{y_n - a} \le 0 \]

since $f(y_n) - f(a) \le 0$ and $y_n - a > 0$. Hence, $\lim g(x_n) \ge 0$ and $\lim g(y_n) \le 0$. So, we have both f ′(a) ≥ 0 and f ′(a) ≤ 0, hence f ′(a) = 0.

The proof for the minimum is similar.

6.2.4 The Mean Value Theorem

Theorem 6.2.9 (Rolle’s Theorem). Let f : [a, b] → R be continuous and differentiable on (a, b).

If f(a) = f(b) = 0, then there exists c ∈ (a, b) such that f ′(c) = 0.

Proof. Since [a, b] is compact and f is continuous, f attains its maximum and minimum on [a, b]. If the maximum of f is f(a), then f(b) = f(a) is also a maximum. If f(a) = f(b) is also the minimum, then f(x) = 0 on [a, b], so f ′(c) = 0 for all c ∈ (a, b).

So suppose, without loss of generality, that f(a) = f(b) is not the maximum. Then there is a point c ∈ (a, b) such that f(c) is the maximum of f. Then f ′(c) = 0 by Theorem 6.2.8.

Theorem 6.2.10 (Mean Value Theorem). Let f : [a, b] → R be continuous and differentiable on

(a, b). There exists c ∈ (a, b) such that

f(b)− f(a) = f ′(c)(b− a).

Proof. The trick is to modify f a little to put us in the situation of Rolle’s Theorem. The line which goes through (a, f(a)) and (b, f(b)) has equation

\[ y = (x - a) \frac{f(b) - f(a)}{b - a} + f(a). \]

Let

\[ F(x) = f(x) - \left( (x - a) \frac{f(b) - f(a)}{b - a} + f(a) \right). \]

Then F(a) = F(b) = 0. By Rolle’s theorem, there exists c ∈ (a, b) such that

\[ 0 = F'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}. \]

Hence,

\[ f'(c)(b - a) = f(b) - f(a). \]
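The theorem only asserts that such a point c exists. For a concrete function one can sometimes locate it numerically. The sketch below does this by bisection for the assumed example f(x) = x³ on [0, 2]; the helper name `mvt_point` is hypothetical, and the method relies on f ′(c) minus the slope changing sign on the interval, which is an extra assumption and not part of the theorem.

```python
def mvt_point(f_prime, slope, lo, hi, tol=1e-12):
    """Bisection for a root of f_prime(c) - slope.

    Assumes f_prime(c) - slope is continuous and changes sign on [lo, hi].
    """
    def g(c):
        return f_prime(c) - slope

    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change on the given interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def f(x):
    return x ** 3

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)          # slope of the secant line, here 4
c = mvt_point(lambda x: 3 * x * x, slope, a, b)
```

For this example 3c² = 4, so `c` should be close to 2/√3.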

Corollary 6.2.11. Let f, g : [a, b] → R be continuous and differentiable on (a, b). Suppose that

for all x ∈ (a, b), f ′(x) = g′(x). Then, there exists a constant c ∈ R such that, for all x ∈ [a, b],

f(x) = g(x) + c.

Proof. Consider the function F = f − g. Let a < r ≤ b. Then by the MVT, there exists c ∈ (a, r) such that

\[ 0 = F'(c) = \frac{F(r) - F(a)}{r - a}. \]

Hence, F(r) = F(a). It follows that, for any r ∈ (a, b],

\[ f(r) - g(r) = f(a) - g(a). \]

So, letting c = f(a) − g(a), we conclude that for any r ∈ (a, b],

\[ f(r) = g(r) + f(a) - g(a). \]

Further, this equality holds trivially at r = a, hence on all of [a, b].

Definition 6.2.12. A function f : A→ R is

• increasing if, for all a, b ∈ A with a < b, f(a) ≤ f(b). It is strictly increasing if, for all a, b ∈ A

with a < b,f(a) < f(b).

• decreasing if, for all a, b ∈ A with a < b, f(a) ≥ f(b). It is strictly decreasing if, for all

a, b ∈ A with a < b,f(a) > f(b).

Theorem 6.2.13. Let f : [a, b]→ R be continuous and differentiable on (a, b).

(a) If f ′(x) > 0 for all x ∈ (a, b), then f is strictly increasing on [a, b].

(b) If f ′(x) < 0 for all x ∈ (a, b), then f is strictly decreasing on [a, b].

(c) If f ′(x) = 0 for all x ∈ (a, b), then f is constant on [a, b].

Proof. For (a), choose s, t ∈ [a, b] with s < t. By the MVT, there exists c ∈ (s, t) such that

f ′(c)(t− s) = f(t)− f(s).

Since f ′(c) > 0 and (t− s) > 0, then f(t)− f(s) > 0, so f(t) > f(s). This holds for all s, t in [a, b]

with s < t. So the claim holds.

The proof for (b) is similar, and (c) follows from Corollary 6.2.11 by taking g(x) = 0 on [a, b].

Exercise 6.2.14. Find a strictly increasing function on an interval (a, b) whose derivative vanishes

at some point in (a, b). Conclude that condition (a) of Theorem 6.2.13 is not an if and only if.

Theorem 6.2.15 (Cauchy Mean Value Theorem). Let f, g : [a, b] → R be continuous and differentiable on (a, b). Then there exists c ∈ (a, b) such that

(f(b)− f(a))g′(c) = (g(b)− g(a))f ′(c).


Proof. Apply Rolle’s theorem to

F (x) = (f(b)− f(a))g(x)− (g(b)− g(a))f(x) + f(a)g(b)− f(b)g(a).

Theorem 6.2.16 (L’Hospital’s Rule). Let A be an open set. Let a ∈ A. Let f, g : A → R be

continuous and suppose that f and g are differentiable on A\{a}. Then, if f(a) = g(a) = 0 and

\[ \lim_{x \to a} \frac{f'(x)}{g'(x)} = \ell, \]

then

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \ell. \]

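Proof. If $\lim_{x \to a} \frac{f'(x)}{g'(x)} = \ell$, then this limit exists and there is some interval $I = (a - \delta_0, a + \delta_0)$, $\delta_0 > 0$, such that g′(x) ≠ 0 for all x ∈ I\{a}. For any b ∈ I\{a}, we have g(b) ≠ g(a) = 0, since otherwise the Mean Value Theorem would give a point strictly between a and b where g′ vanishes. The Cauchy Mean Value Theorem then implies that there is a c ∈ (b, a) or (a, b) such that

\[ \frac{f(b)}{g(b)} = \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(c)}{g'(c)}. \]

Let ε > 0. Choose $\delta_0 > \delta > 0$ such that, if 0 < |y − a| < δ, then

\[ \left| \frac{f'(y)}{g'(y)} - \ell \right| < \varepsilon. \]

Then, if 0 < |x − a| < δ, x ∈ I\{a} and by the previous remarks, there is a y ∈ (x, a) or (a, x) such that

\[ \frac{f(x)}{g(x)} = \frac{f'(y)}{g'(y)}. \]

But, since 0 < |x − a| < δ, then 0 < |y − a| < δ, so

\[ \left| \frac{f(x)}{g(x)} - \ell \right| = \left| \frac{f'(y)}{g'(y)} - \ell \right| < \varepsilon. \]

Hence, $\lim_{x \to a} \frac{f(x)}{g(x)} = \ell$.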


Chapter 7

The Riemann Integral

7.1 Definition and properties of the Riemann integral

Definition 7.1.1. A partition of [a, b] is a finite subset P of [a, b] which contains a and b.

Definition 7.1.2. If P and Q are partitions of the interval [a, b] and P ⊂ Q, then Q is called a

refinement of P .

If P is a partition of [a, b], then it is ordered,

\[ a = t_0 < t_1 < \ldots < t_{n-1} < t_n = b. \]

We label the elements of $P = \{t_0, t_1, \ldots, t_n\}$ with $t_{i-1} < t_i$ for 1 ≤ i ≤ n.

Definition 7.1.3. Let f : [a, b] → R be a bounded function. Let $P = \{t_0, t_1, \ldots, t_n\}$ be a partition of [a, b]. Let

\[ m_i = m([t_{i-1}, t_i]) = \inf\{f(x) \mid t_{i-1} \le x \le t_i\} \]

\[ M_i = M([t_{i-1}, t_i]) = \sup\{f(x) \mid t_{i-1} \le x \le t_i\}. \]

The lower sum of f for P is

\[ L(f, P) = \sum_{i=1}^{n} m_i (t_i - t_{i-1}). \]

The upper sum of f for P is

\[ U(f, P) = \sum_{i=1}^{n} M_i (t_i - t_{i-1}). \]
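In general the infimum and supremum on each subinterval cannot be computed by a program. The sketch below therefore makes a simplifying assumption: f is increasing, so that $m_i = f(t_{i-1})$ and $M_i = f(t_i)$. The function names are hypothetical and the example f(x) = x² on [0, 1] is chosen only for illustration.

```python
def lower_sum_increasing(f, partition):
    """L(f, P) for an increasing f: the infimum on [t_{i-1}, t_i] is f(t_{i-1})."""
    return sum(
        f(partition[i - 1]) * (partition[i] - partition[i - 1])
        for i in range(1, len(partition))
    )

def upper_sum_increasing(f, partition):
    """U(f, P) for an increasing f: the supremum on [t_{i-1}, t_i] is f(t_i)."""
    return sum(
        f(partition[i]) * (partition[i] - partition[i - 1])
        for i in range(1, len(partition))
    )

def square(x):
    return x * x

P = [0.0, 0.5, 1.0]                       # a partition of [0, 1]
lower = lower_sum_increasing(square, P)   # 0 * 0.5 + 0.25 * 0.5
upper = upper_sum_increasing(square, P)   # 0.25 * 0.5 + 1 * 0.5
```

Passing a finer partition such as `[0.0, 0.25, 0.5, 0.75, 1.0]` raises the lower sum and lowers the upper sum.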


Remark 7.1.4. For any partition P , L(f, P ) ≤ U(f, P ).

Lemma 7.1.5. If P and Q are partitions of [a, b] and Q ⊆ P (that is, P is a refinement of Q), then

L(f, Q) ≤ L(f, P)

and

U(f, P) ≤ U(f, Q).

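Proof. Let $Q = \{t_0, \ldots, t_n\}$. Then if P is a refinement of Q, $P = Q_1 \cup Q_2 \cup \ldots \cup Q_n$ where $Q_i = \{t_{i,0}, \ldots, t_{i,k_i}\}$, with $t_{i-1} = t_{i,0}$ and $t_{i,k_i} = t_i$, is a partition of $[t_{i-1}, t_i]$. Further, since $m([s', t']) \le m([s, t])$ whenever $[s, t] \subseteq [s', t']$, we have:

\begin{align*}
L(f, P) &= \sum_{i=1}^{n} \sum_{r=1}^{k_i} m([t_{i,r-1}, t_{i,r}]) (t_{i,r} - t_{i,r-1}) \\
&\ge \sum_{i=1}^{n} \sum_{r=1}^{k_i} m([t_{i,0}, t_{i,k_i}]) (t_{i,r} - t_{i,r-1}) \\
&= \sum_{i=1}^{n} m([t_{i,0}, t_{i,k_i}]) \sum_{r=1}^{k_i} (t_{i,r} - t_{i,r-1}) \\
&= \sum_{i=1}^{n} m([t_{i,0}, t_{i,k_i}]) (t_{i,k_i} - t_{i,0}) \\
&= \sum_{i=1}^{n} m([t_{i,0}, t_{i,k_i}]) (t_i - t_{i-1}) \\
&= L(f, Q).
\end{align*}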

The proof is similar for the upper sum.

Theorem 7.1.6. Let f : [a, b] → R be a bounded function. Let P1 and P2 be partitions of [a, b].

Then

L(f, P1) ≤ U(f, P2).

Proof. Let P = P1 ∪ P2. Then P is a refinement of P1 and a refinement of P2. Hence,

L(f, P1) ≤ L(f, P ) ≤ U(f, P ) ≤ U(f, P2).


Remark 7.1.7. Consider the set

X = {L(f, P ) | P is a partition of [a, b]}.

Let Q be any partition of [a, b]. Then, by Theorem 7.1.6, U(f, Q) is greater than or equal to every element of the set X. Since X is non-empty and bounded above, it has a supremum. Similarly,

Y = {U(f, P ) | P is a partition of [a, b]}

has an infimum.

Definition 7.1.8. Let f : [a, b]→ R be a bounded function. Then

L(f) = sup{L(f, P ) | P is a partition of [a, b]}

is called the lower integral of f from a to b and

U(f) = inf{U(f, P ) | P is a partition of [a, b]}

is called the upper integral of f from a to b.

Exercise 7.1.9. Prove that L(f) ≤ U(f).

Definition 7.1.10. Let f : [a, b] → R be a bounded function. If L(f) = U(f), then f is integrable on [a, b] and L(f) = U(f) is called the integral of f from a to b and is denoted

\[ \int_a^b f. \]

Example 7.1.11. Let f : [a, b]→ R be given by f(x) = c. Then, for any partition P ,

\[ U(f, P) = \sum_{i=1}^{n} M_i (t_i - t_{i-1}) = \sum_{i=1}^{n} c (t_i - t_{i-1}) = c(b - a). \]

The last equality follows from the fact that the sum $\sum_{i=1}^{n} (t_i - t_{i-1})$ telescopes to b − a. Similarly, L(f, P) = c(b − a). So,

U(f) = L(f) = c(b− a).

Theorem 7.1.12 (Criterion for Integrability). Let f : [a, b] → R be a bounded function. Then f is integrable if and only if for every ε > 0, there exists a partition P of [a, b] such that

U(f, P) − L(f, P) < ε.

Proof. Suppose f is integrable, so that U(f) = L(f). Let ε > 0. Then there is a partition P1 such that U(f) ≤ U(f, P1) < U(f) + ε/2 and a partition P2 such that L(f) − ε/2 < L(f, P2) ≤ L(f). Since P = P1 ∪ P2 is a refinement of both, we have

L(f) − ε/2 < L(f, P2) ≤ L(f, P) ≤ L(f)

and

U(f) ≤ U(f, P) ≤ U(f, P1) < U(f) + ε/2.

So, since U(f) = L(f), both U(f, P) and L(f, P) are in (L(f) − ε/2, L(f) + ε/2), and hence U(f, P) − L(f, P) < ε.

Suppose that for every ε > 0, there is such a partition. Suppose that U(f) > L(f). Let

U(f)− L(f) = ε > 0. Choose P such that

U(f, P )− L(f, P ) < ε.

Then

U(f) ≤ U(f, P ) < ε+ L(f, P ) ≤ ε+ L(f).

Hence, U(f) − L(f) < ε = U(f) − L(f), a contradiction. Since L(f) ≤ U(f) always holds, we conclude that U(f) = L(f).

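Example 7.1.13. Let f : [0, 1] → R be given by

\[ f(x) = \begin{cases} 0 & x \in \mathbb{Q} \\ 1 & x \notin \mathbb{Q}. \end{cases} \]

Then, for any partition P, every subinterval contains both rational and irrational points, so

\[ U(f, P) = \sum_{i=1}^{n} M_i (t_i - t_{i-1}) = \sum_{i=1}^{n} 1 \cdot (t_i - t_{i-1}) = 1 \]

and

\[ L(f, P) = \sum_{i=1}^{n} m_i (t_i - t_{i-1}) = \sum_{i=1}^{n} 0 \cdot (t_i - t_{i-1}) = 0. \]

So,

U(f, P) − L(f, P) = 1

and for ε = 1, the criterion for integrability fails.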

Exercise 7.1.14. Prove that f : [a, b] → R is integrable if and only if there exists a sequence $(P_n)$ of partitions of [a, b] such that

\[ \lim_{n \to \infty} \left( U(f, P_n) - L(f, P_n) \right) = 0. \]

Further, for such a sequence, show that

\[ \int_a^b f = \lim_{n \to \infty} U(f, P_n) = \lim_{n \to \infty} L(f, P_n). \]

Theorem 7.1.15. If f : [a, b]→ R is continuous, then f is integrable.

Proof. A continuous function f : [a, b] → R is uniformly continuous. Now let ε > 0. Choose δ > 0 such that, if |x − y| < δ, then |f(x) − f(y)| < ε/(b − a). Let $P = \{t_0 = a, t_1, \ldots, t_n = b\}$ be any partition such that $|t_i - t_{i-1}| < \delta$ for all i. Let $x_i \in [t_{i-1}, t_i]$ be such that $f(x_i) = M_i$ and $y_i \in [t_{i-1}, t_i]$ be such that $f(y_i) = m_i$. This is possible by the Extreme Value Theorem, since f is continuous and $[t_{i-1}, t_i]$ is compact. Since $x_i, y_i \in [t_{i-1}, t_i]$, we have $|x_i - y_i| < \delta$. Hence, $M_i - m_i = f(x_i) - f(y_i) = |f(x_i) - f(y_i)| < \varepsilon/(b - a)$. Then,

\[ U(f, P) - L(f, P) = \sum_{i=1}^{n} (M_i - m_i)(t_i - t_{i-1}) < \sum_{i=1}^{n} \frac{\varepsilon}{b - a} (t_i - t_{i-1}) = \varepsilon. \]

So, U(f) = L(f) by Theorem 7.1.12.

Example 7.1.16. Let f : [0, a] → R be given by f(x) = x. Then f is integrable on [0, a] and $\int_0^a f = \frac{a^2}{2}$.

Proof. Since f is continuous, it is integrable. So, $U(f) = L(f) = \int_0^a f$. Choose a partition $P = \{t_0 = 0, \ldots, t_n = a\}$. Then $m_i = t_{i-1}$ and $M_i = t_i$. Hence,

\begin{align*}
L(f, P) &= t_0 (t_1 - t_0) + t_1 (t_2 - t_1) + \ldots + t_{n-1} (t_n - t_{n-1}) \\
&\le t_0 (t_1 - t_0) + \frac{(t_1 - t_0)^2}{2} + t_1 (t_2 - t_1) + \frac{(t_2 - t_1)^2}{2} + \ldots + t_{n-1} (t_n - t_{n-1}) + \frac{(t_n - t_{n-1})^2}{2} \\
&= \frac{t_n^2 - t_0^2}{2} \\
&= a^2/2.
\end{align*}

Therefore, $L(f, P) \le a^2/2$ for all partitions P of [0, a], and hence $\int_0^a f = L(f) \le a^2/2$. Similarly,

\begin{align*}
U(f, P) &= t_1 (t_1 - t_0) + t_2 (t_2 - t_1) + \ldots + t_n (t_n - t_{n-1}) \\
&\ge t_1 (t_1 - t_0) - \frac{(t_1 - t_0)^2}{2} + t_2 (t_2 - t_1) - \frac{(t_2 - t_1)^2}{2} + \ldots + t_n (t_n - t_{n-1}) - \frac{(t_n - t_{n-1})^2}{2} \\
&= \frac{t_n^2 - t_0^2}{2} \\
&= a^2/2.
\end{align*}

So, $\int_0^a f = U(f) \ge a^2/2$. Hence $\int_0^a f = a^2/2$.
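The squeeze in this example can be watched numerically. The sketch below computes the lower and upper sums of f(x) = x for uniform partitions; the function name `sums_for_identity` and the values a = 3 and n = 1000 are assumptions made for illustration only.

```python
def sums_for_identity(a, n):
    """Lower and upper sums of f(x) = x on [0, a] for the uniform partition with n pieces.

    Since f is increasing, m_i = t_{i-1} and M_i = t_i, as in Example 7.1.16.
    """
    t = [a * i / n for i in range(n + 1)]
    lower = sum(t[i - 1] * (t[i] - t[i - 1]) for i in range(1, n + 1))
    upper = sum(t[i] * (t[i] - t[i - 1]) for i in range(1, n + 1))
    return lower, upper

a = 3.0
lower, upper = sums_for_identity(a, 1000)
gap = upper - lower   # for a uniform partition this equals a * a / n up to rounding
```

Both sums bracket a²/2 and the gap between them shrinks like a²/n, which matches the Criterion for Integrability.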

Exercise 7.1.17. Let A ⊆ R and B ⊆ R be non-empty sets which are bounded below. Let

A+B = {a+ b | a ∈ A, b ∈ B}.

Then, inf(A+B) = inf(A) + inf(B).

Theorem 7.1.18. Let a < b < c. Then f : [a, c] → R is integrable on [a, c] if and only if f is

integrable on [a, b] and [b, c]. Further, if this is the case, then

\[ \int_a^c f = \int_a^b f + \int_b^c f. \]

Proof. Let ε > 0. If f is integrable on [a, b] and [b, c], we can choose Q a partition of [a, b] and R a partition of [b, c] such that

U(f, Q) − L(f, Q) < ε/2

and

U(f, R) − L(f, R) < ε/2.

But P = Q ∪ R is a partition of [a, c], and

U(f, P) − L(f, P) = U(f, Q) − L(f, Q) + U(f, R) − L(f, R) < ε.

Hence, f is integrable on [a, c].

Conversely, suppose that f is integrable on [a, c]. Let ε > 0. Then there exists a partition

P of [a, c] such that U(f, P ) − L(f, P ) < ε. Let P ′ = P ∪ {b}. Then, U(f, P ′) ≤ U(f, P ) and

L(f, P ) ≤ L(f, P ′). Hence, U(f, P ′)− L(f, P ′) ≤ U(f, P )− L(f, P ) < ε.

Further, suppose $b = t_k$ in $P' = \{t_0, \ldots, t_n\}$. Let $Q = \{t_0, \ldots, t_k\}$, which is a partition of [a, b], and $R = \{t_k, \ldots, t_n\}$, which is a partition of [b, c]. Then

\[ U(f, P') = \sum_{i=1}^{k} M_i (t_i - t_{i-1}) + \sum_{i=k+1}^{n} M_i (t_i - t_{i-1}) = U(f, Q) + U(f, R) \]

and

\[ L(f, P') = \sum_{i=1}^{k} m_i (t_i - t_{i-1}) + \sum_{i=k+1}^{n} m_i (t_i - t_{i-1}) = L(f, Q) + L(f, R). \]

Hence,

\[ 0 \le U(f, Q) - L(f, Q) + U(f, R) - L(f, R) = U(f, P') - L(f, P') < \varepsilon. \]

Since both differences are nonnegative, both U(f, Q) − L(f, Q) < ε and U(f, R) − L(f, R) < ε. Hence, f is integrable on [a, b] and on [b, c].

Now, to finish the proof, we prove that $\int_a^c f = \int_a^b f + \int_b^c f$. First, it’s an exercise (see Exercise 7.1.17) to show that $\int_a^b f + \int_b^c f$ is the infimum of

\[ X = \{U(f, R) + U(f, Q) \mid R \text{ a partition of } [a, b] \text{ and } Q \text{ a partition of } [b, c]\}. \]

But, for such R and Q, we have that R ∪ Q is a partition of [a, c] and

\[ \int_a^c f \le U(f, R \cup Q) = U(f, R) + U(f, Q). \]

So, $\int_a^c f$ is a lower bound for X, hence

\[ \int_a^c f \le \int_a^b f + \int_b^c f. \]

Similarly, $\int_a^b f + \int_b^c f$ is the supremum of

\[ Y = \{L(f, R) + L(f, Q) \mid R \text{ a partition of } [a, b] \text{ and } Q \text{ a partition of } [b, c]\} \]

and for any R, Q as above,

\[ \int_a^c f \ge L(f, R \cup Q) = L(f, R) + L(f, Q). \]

So, $\int_a^c f$ is an upper bound for Y and it follows that:

\[ \int_a^c f \ge \int_a^b f + \int_b^c f. \]

So the two must be equal.

Definition 7.1.19. If a < b and f is integrable on [a, b], then

\[ \int_b^a f = -\int_a^b f. \]

Corollary 7.1.20. Let f be integrable on an interval containing three distinct points a, b, c. Then

\[ \int_a^c f = \int_a^b f + \int_b^c f \]

(where we make no assumption on the ordering of the set {a, b, c}).

Theorem 7.1.21. Let f and g be integrable on [a, b].

(a) f + g is integrable on [a, b] and

\[ \int_a^b (f + g) = \int_a^b f + \int_a^b g. \]

(b) If k ∈ R, kf is integrable on [a, b] and

\[ \int_a^b k \cdot f = k \int_a^b f. \]

Proof. We prove (a) and leave (b) as an exercise (see Homework). Let P = {t0, . . . , tn} be a

partition of [a, b]. Then

\[ m_i(f) + m_i(g) \le m_i(f + g) \le M_i(f + g) \le M_i(f) + M_i(g). \]

Hence, for any partition, we have

L(f, P ) + L(g, P ) ≤ L(f + g, P ) ≤ U(f + g, P ) ≤ U(f, P ) + U(g, P ). (7.1)

Let ε > 0. Choose P1 such that

U(f, P1)− L(f, P1) < ε/2

and P2 such that

U(g, P2)− L(g, P2) < ε/2.

For P = P1 ∪ P2, we have

U(f, P ) + U(g, P )− (L(f, P ) + L(g, P )) < ε.

Hence,

U(f + g, P )− L(f + g, P ) < ε

and f + g is integrable on [a, b].

To show that $\int_a^b (f + g) = \int_a^b f + \int_a^b g$, we first note that

\begin{align*}
\int_a^b f + \int_a^b g &= \inf\{U(f, P) + U(g, Q) \mid P, Q \text{ partitions of } [a, b]\} \\
&= \sup\{L(f, P) + L(g, Q) \mid P, Q \text{ partitions of } [a, b]\}
\end{align*}

(see again Exercise 7.1.17). Let P and Q be partitions of [a, b]; then P ∪ Q is a common refinement. Hence,

\[ U(f, P \cup Q) + U(g, P \cup Q) \le U(f, P) + U(g, Q). \]

Further, we showed above that

\[ U(f + g, P \cup Q) \le U(f, P \cup Q) + U(g, P \cup Q). \]

So,

\[ \int_a^b (f + g) \le U(f + g, P \cup Q) \le U(f, P) + U(g, Q) \]

and $\int_a^b (f + g)$ is a lower bound for $\{U(f, P) + U(g, Q) \mid P, Q \text{ partitions of } [a, b]\}$. Therefore,

\[ \int_a^b (f + g) \le \int_a^b f + \int_a^b g. \]

Similarly, using lower sums, we conclude that

\[ \int_a^b (f + g) \ge \int_a^b f + \int_a^b g. \]

So the two must be equal.

Theorem 7.1.22. Let f and g be integrable on [a, b].

(a) If m ≤ f ≤ M, then $m(b - a) \le \int_a^b f \le M(b - a)$.

(b) If f ≤ g, then $\int_a^b f \le \int_a^b g$.

(c) |f| is integrable and $\left| \int_a^b f \right| \le \int_a^b |f|$.

Proof. First, we show that if f ≥ 0, then $\int_a^b f \ge 0$. Indeed, choose any partition P; then

\[ L(f, P) \le \int_a^b f. \]

However,

\[ L(f, P) = \sum m_i (t_i - t_{i-1}) \ge 0 \]

since $m_i \ge 0$ for all i.

Now, apply this to

(a) $\int_a^b (M - f)$ and $\int_a^b (f - m)$.

(b) $\int_a^b (g - f)$.

(c) $\int_a^b (|f| - f)$ and $\int_a^b (f + |f|)$.

(a) and (b) clearly imply the claims. For (c), it’s an exercise (see Homework) to show that if f is integrable, so is |f|. We then get that

\[ \int_a^b f \le \int_a^b |f| \]

and

\[ -\int_a^b |f| = \int_a^b -|f| \le \int_a^b f \]

which together imply the statement of (c).

Exercise 7.1.23. Let f, g : [a, b]→ R be integrable functions. Then fg is integrable.

7.2 The Fundamental Theorem of Calculus

Theorem 7.2.1. Let f : [a, b]→ R be an integrable function. Let F : [a, b]→ R be given by

\[ F(x) = \int_a^x f. \]

Then F is uniformly continuous on [a, b].

Proof. Let ε > 0. Let |f(x)| ≤M on [a, b] for some M > 0. Choose δ = ε/M . Suppose |x− y| < δ,

and without loss of generality, assume that x < y. Then

\[ |F(y) - F(x)| = \left| \int_a^y f - \int_a^x f \right| = \left| \int_x^y f \right| \le M(y - x) < \varepsilon. \]

Theorem 7.2.2 (The Fundamental Theorem of Calculus). Let f be a bounded function which is

integrable on [a, b].

1. Suppose that f = F ′ on (a, b) for a continuous function F : [a, b] → R. Then

\[ \int_a^b f = F(b) - F(a). \]

2. Let F : [a, b] → R be given by

\[ F(x) = \int_a^x f. \]

If f is continuous at p ∈ (a, b), then F is differentiable at p and

\[ F'(p) = f(p). \]

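Proof. For (1), choose any partition $P = \{t_0, \ldots, t_n\}$. Use the Mean Value Theorem to choose $c_i \in (t_{i-1}, t_i)$ such that

\[ F(t_i) - F(t_{i-1}) = f(c_i)(t_i - t_{i-1}). \]

Then

\[ L(f, P) = \sum m_i (t_i - t_{i-1}) \le \sum f(c_i)(t_i - t_{i-1}) \le \sum M_i (t_i - t_{i-1}) = U(f, P). \]

But,

\[ \sum f(c_i)(t_i - t_{i-1}) = \sum \left( F(t_i) - F(t_{i-1}) \right) = F(b) - F(a). \]

Hence, for any partition,

L(f, P) ≤ F(b) − F(a) ≤ U(f, P).

Taking the supremum over lower sums and the infimum over upper sums gives L(f) ≤ F(b) − F(a) ≤ U(f). Since f is integrable, L(f) = U(f), and therefore

\[ \int_a^b f = F(b) - F(a). \]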

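For (2), we will show that

\[ \lim_{x \to p} \frac{F(x) - F(p)}{x - p} = f(p). \]

Let ε > 0. Since f is continuous at p, there exists δ > 0 such that, if x ∈ [a, b] and |x − p| < δ, then |f(x) − f(p)| < ε/2. Suppose that 0 < |x − p| < δ. Then

\begin{align*}
\left| \frac{F(x) - F(p)}{x - p} - f(p) \right| &= \left| \frac{\int_a^x f - \int_a^p f}{x - p} - f(p) \right| \\
&= \left| \frac{1}{x - p} \int_p^x f - \frac{1}{x - p} \int_p^x f(p) \right| \\
&= \frac{1}{|x - p|} \left| \int_p^x (f - f(p)) \right| \\
&\le \frac{1}{|x - p|} \left| \int_p^x |f - f(p)| \right| \\
&\le \frac{1}{|x - p|} \left| \int_p^x \varepsilon/2 \right| = \varepsilon/2 < \varepsilon.
\end{align*}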

7.3 Integration by parts and change of variables

Corollary 7.3.1 (Integration by Parts). Let f, g : A→ R for an open set A containing [a, b], and

such that f ′ and g′ are continuous on [a, b]. Then

\[ \int_a^b f g' = (f(b) g(b) - f(a) g(a)) - \int_a^b f' g. \]

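Proof. Let F = fg. Then F ′ = f ′g + fg′, and by assumption this is a continuous function, hence integrable on [a, b]. By part 1 of Theorem 7.2.2 and Theorem 7.1.21,

\[ \int_a^b f' g + \int_a^b f g' = \int_a^b (f' g + f g') = F(b) - F(a) = f(b) g(b) - f(a) g(a). \]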

Corollary 7.3.2 (Change of Variables). Let g : A → R where A is an open set containing [a, b].

Let f : [c, d]→ R and suppose that g([a, b]) ⊆ [c, d]. Suppose that g′ is continuous on [a, b] and f is

continuous. Then,

\[ \int_a^b (f \circ g) \cdot g' = \int_{g(a)}^{g(b)} f. \]

Proof. Let F : [c, d] → R be given by $F(x) = \int_c^x f$. By part 2 of the Fundamental Theorem of Calculus (Theorem 7.2.2), since f is continuous, F is differentiable and F ′(p) = f(p). Now, using the chain rule for differentiation,

\[ (F \circ g)' = (F' \circ g) g' = (f \circ g) g'. \]

By part 1 of the Fundamental Theorem of Calculus, using the fact that (f ◦ g)g′ is continuous,

\[ \int_a^b (f \circ g) g' = \int_a^b (F \circ g)' = F(g(b)) - F(g(a)) = \int_c^{g(b)} f - \int_c^{g(a)} f = \int_{g(a)}^{g(b)} f. \]


Chapter 8

Series and power series

8.1 Series

Definition 8.1.1. Let (an) be a sequence of real numbers. Let

\[ s_n = a_1 + \cdots + a_n = \sum_{i=1}^{n} a_i. \]

We call the sequence $(s_n)$ a series. We let

\[ \sum_{n=1}^{\infty} a_n = \lim_{n \to \infty} s_n \]

and say that the series converges if this limit exists. We call the limit the sum of the series. If the

limit does not exist, we say that the series diverges or call it divergent.

From the properties of limits of sequences, we deduce the following result.

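Theorem 8.1.2. Suppose that $\sum_{n=1}^{\infty} a_n = \alpha$ and $\sum_{n=1}^{\infty} b_n = \beta$. Then

(a) $\sum_{n=1}^{\infty} (a_n + b_n) = \alpha + \beta$

(b) For c ∈ R, $\sum_{n=1}^{\infty} (c \cdot a_n) = c \cdot \alpha$.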

Theorem 8.1.3 (Geometric Series). Let −1 < x < 1. Then:

\[ \sum_{n=1}^{\infty} x^n = \frac{x}{1 - x}. \]

Proof. For each n ∈ N,

\[ \sum_{i=1}^{n} x^i = x \sum_{i=0}^{n-1} x^i = \frac{x(1 - x^n)}{1 - x}. \]

Since $(x^n)$ converges to 0 as n → ∞, this proves the claim.
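A quick numerical sketch of the geometric series formula follows. The helper names `geometric_partial_sum` and `closed_form` are hypothetical, and the sample values x = 0.5 and x = −0.5 are chosen only for illustration.

```python
def geometric_partial_sum(x, n):
    """Return x^1 + x^2 + ... + x^n, the partial sum used in the proof."""
    return sum(x ** i for i in range(1, n + 1))

def closed_form(x, n):
    """Return x (1 - x^n) / (1 - x), valid for x != 1."""
    return x * (1 - x ** n) / (1 - x)

# Partial sums for x = 1/2 approach x / (1 - x) = 1.
half_sums = [geometric_partial_sum(0.5, n) for n in (1, 5, 10, 50)]

# The same works for negative x, for instance x = -1/2 with limit -1/3.
negative_sum = geometric_partial_sum(-0.5, 60)
```

The partial sums agree with the closed form for each n and approach x/(1 − x) as n grows.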

Recall that a sequence (pn) converges if and only if it is Cauchy. We apply this to series to

obtain:

Theorem 8.1.4 (Cauchy Criterion for Series). A series $\sum_{n=1}^{\infty} a_n$ converges if and only if for all ε > 0 there is some N ∈ N such that, for all n ≥ m ≥ N,

\[ \left| \sum_{i=m}^{n} a_i \right| < \varepsilon. \]

Definition 8.1.5. We say that a series $\sum_{n=1}^{\infty} a_n$ converges absolutely if $\sum_{n=1}^{\infty} |a_n|$ converges.

Theorem 8.1.6. If a series converges absolutely, then it converges.

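Proof. Let ε > 0. By the Cauchy Criterion applied to $\sum |a_n|$, choose N such that

\[ \left| \sum_{k=m}^{n} |a_k| \right| < \varepsilon \quad \text{for all } n \ge m \ge N. \]

Since

\[ \left| \sum_{k=m}^{n} a_k \right| \le \sum_{k=m}^{n} |a_k| = \left| \sum_{k=m}^{n} |a_k| \right| < \varepsilon, \]

the series $\sum a_n$ satisfies the Cauchy Criterion, and this proves the claim.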

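Theorem 8.1.7. If $\sum_{n=1}^{\infty} a_n$ converges, then $\lim_{n \to \infty} a_n = 0$.

Proof. Let $L = \sum_{n=1}^{\infty} a_n$ and let $s_m$ denote the partial sums, as in Definition 8.1.1. Let ε > 0. Choose N such that $|s_m - L| < \varepsilon/2$ for all m ≥ N. Then, for m > N,

\begin{align*}
|a_m| &= |s_m - s_{m-1}| \\
&= |s_m - L + L - s_{m-1}| \\
&\le |s_m - L| + |L - s_{m-1}| < \varepsilon.
\end{align*}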

Warning 8.1.8. The converse is not true! There are divergent series whose terms $a_n$ go to zero!

(See Theorem 8.1.9)


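Theorem 8.1.9. The series

\[ \sum_{n=1}^{\infty} \frac{1}{n} \]

(called the harmonic series) is divergent.

Proof. We show that it fails the Cauchy Criterion for Series. Let ε = 1/2. Let N ∈ N. Then

\[ \sum_{n=N}^{2N-1} \frac{1}{n} \ge \sum_{n=N}^{2N-1} \frac{1}{2N} = \frac{1}{2} \sum_{n=N}^{2N-1} \frac{1}{N} = \frac{1}{2} \cdot N \cdot \frac{1}{N} = \frac{1}{2}. \]

So, there is no N ∈ N such that, for all n ≥ m ≥ N,

\[ \left| \sum_{i=m}^{n} \frac{1}{i} \right| < \varepsilon. \]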
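The block estimate in this proof is easy to check numerically. In the sketch below, `harmonic_block` is a hypothetical helper name and the sample values of N are arbitrary.

```python
def harmonic_block(N):
    """Return 1/N + 1/(N+1) + ... + 1/(2N-1), the block of N terms used in the proof."""
    return sum(1.0 / n for n in range(N, 2 * N))

# Every block stays at or above 1/2, no matter how far out it starts.
blocks = {N: harmonic_block(N) for N in (1, 10, 100, 1000)}
all_large = all(value >= 0.5 for value in blocks.values())
```

Since these tail blocks never drop below 1/2, the partial sums cannot be Cauchy, which is exactly what the proof uses.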

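Theorem 8.1.10 (Alternating Series Test). If $(a_n)$ is a decreasing sequence of positive numbers such that $\lim a_n = 0$, then $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ converges.

Proof. Let ε > 0. Since $(a_n)$ converges, it is Cauchy, so we can choose N such that, if n ≥ N, $|a_n - a_{n+m}| < \varepsilon/2$ for all m ≥ 0 and $|a_n| < \varepsilon/2$. Now, note that for m ≥ 1,

\[ \left| \sum_{i=0}^{m} (-1)^{n+i} a_{n+i} \right| = \begin{cases} |a_n - a_{n+1} + a_{n+2} - a_{n+3} + \ldots + a_{n+m-2} - a_{n+m-1} + a_{n+m}| & m \text{ is even} \\ |a_n - a_{n+1} + a_{n+2} - a_{n+3} + \ldots + a_{n+m-1} - a_{n+m}| & m \text{ is odd.} \end{cases} \]

Since $(a_n)$ is decreasing and positive, each consecutive difference $a_{n+2j} - a_{n+2j+1}$ is nonnegative, so the absolute values can be dropped. Replacing each $a_{n+2j}$ with j ≥ 1 by the larger term $a_{n+2j-1}$ then makes the sum telescope:

\begin{align*}
\left| \sum_{i=0}^{m} (-1)^{n+i} a_{n+i} \right| &= \begin{cases} a_n - a_{n+1} + a_{n+2} - a_{n+3} + \ldots + a_{n+m-2} - a_{n+m-1} + a_{n+m} & m \text{ is even} \\ a_n - a_{n+1} + a_{n+2} - a_{n+3} + \ldots + a_{n+m-1} - a_{n+m} & m \text{ is odd} \end{cases} \\
&\le \begin{cases} a_n - a_{n+1} + a_{n+1} - a_{n+3} + \ldots + a_{n+m-3} - a_{n+m-1} + a_{n+m} & m \text{ is even} \\ a_n - a_{n+1} + a_{n+1} - a_{n+3} + \ldots + a_{n+m-2} - a_{n+m} & m \text{ is odd} \end{cases} \\
&= \begin{cases} a_n - a_{n+m-1} + a_{n+m} & m \text{ is even} \\ a_n - a_{n+m} & m \text{ is odd.} \end{cases}
\end{align*}

However,

\[ a_n - a_{n+m-1} + a_{n+m} < \varepsilon/2 + \varepsilon/2 \]

and

\[ a_n - a_{n+m} < \varepsilon/2. \]

In either case, $\left| \sum_{i=0}^{m} (-1)^{n+i} a_{n+i} \right| < \varepsilon$ (for m = 0 the sum is just $a_n < \varepsilon/2$). By the Cauchy Criterion, the series converges.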

Theorem 8.1.11 (Comparison Test). Suppose that $(c_n)$ is a sequence of positive numbers and that $(a_n)$ is any sequence. Suppose that there exists M ∈ N such that $|a_n| \le c_n$ for all n ≥ M. Then, if $\sum_{n=1}^{\infty} c_n$ converges, so does $\sum_{n=1}^{\infty} a_n$.

Proof. Let ε > 0. Since $\sum_{n=1}^{\infty} c_n$ converges, there exists some N ∈ N (and we can choose it so that N ≥ M) such that

\[ \left| \sum_{k=m}^{n} c_k \right| = \sum_{k=m}^{n} c_k < \varepsilon \]

for all n ≥ m ≥ N. However,

\[ \left| \sum_{k=m}^{n} a_k \right| \le \sum_{k=m}^{n} |a_k| \le \sum_{k=m}^{n} c_k < \varepsilon. \]

Therefore, $\sum a_n$ satisfies the Cauchy criterion for convergence.

Example 8.1.12. The series $\sum_{n=1}^{\infty} \frac{1}{n 2^n}$ converges by the comparison test.

Exercise 8.1.13. A series $\sum_{n=0}^{\infty} a_n$ converges if and only if there exists k ∈ N such that the series $\sum_{n=k}^{\infty} a_n$ converges.

Theorem 8.1.14 (Ratio Test). Let $(a_n)$ be a sequence with $a_n \neq 0$. Suppose that $r = \lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right|$ exists. If r < 1, the series $\sum_{n=1}^{\infty} a_n$ converges. If r > 1, the series diverges.

Proof. Suppose that r < 1. Let r < α < 1. Then, for some N ∈ N, if n ≥ N, then

\[ \left| \frac{a_{n+1}}{a_n} \right| < \alpha. \]

That is,

\[ |a_{n+1}| < \alpha |a_n|. \]

Inductively, this implies that, for k ≥ 1,

\[ |a_{N+k}| < \alpha |a_{N+k-1}| < \ldots < \alpha^k |a_N|. \]

By the comparison test with the geometric series,

\[ \sum_{k=0}^{\infty} |a_{N+k}| \le \sum_{k=0}^{\infty} |a_N| \alpha^k = |a_N| \sum_{k=0}^{\infty} \alpha^k = |a_N| \frac{1}{1 - \alpha}. \]

Hence, the series converges.

Suppose that r > 1. Let r > α > 1. Then, for some N, if n ≥ N, then

\[ \left| \frac{a_{n+1}}{a_n} \right| > \alpha. \]

That is,

\[ |a_{n+1}| > \alpha |a_n|. \]

Therefore, for k ≥ 1,

\[ |a_{N+k}| > \alpha |a_{N+k-1}| > \ldots > \alpha^k |a_N| > |a_N|. \]

Since $a_N \neq 0$, we have $|a_n| > |a_N| > 0$ for all n > N. Hence, the sequence $(a_n)$ does not converge to zero. So, by Theorem 8.1.7, the series does not converge.

Example 8.1.15. The following facts are shown using the ratio test.

(a) For any x ∈ R, the series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges. Its limit is denoted by $e^x$. (Note: e can also be defined as $\lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n$.)

(b) For any x ∈ R, the series $\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}$ converges. Its limit is denoted by sin(x).

(c) For any x ∈ R, the series $\sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$ converges. Its limit is denoted by cos(x).
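To see the convergence in (a) concretely, the sketch below sums the series term by term and compares with the library value `math.exp`. The helper name `exp_partial_sum` is hypothetical. Each term is obtained from the previous one by multiplying by x/(k+1), which is the ratio that appears in the ratio test.

```python
import math

def exp_partial_sum(x, n):
    """Return the partial sum of x^k / k! for k = 0, ..., n."""
    total = 0.0
    term = 1.0                 # the k = 0 term, x^0 / 0!
    for k in range(n + 1):
        total += term
        term *= x / (k + 1)    # ratio of consecutive terms, which tends to 0
    return total

approx_e = exp_partial_sum(1.0, 20)
approx_exp_minus_two = exp_partial_sum(-2.0, 40)
```

Both values agree with `math.e` and `math.exp(-2.0)` to within floating point accuracy.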

Theorem 8.1.16 (Root Test). Let $(a_n)$ be a sequence of real numbers. Suppose that $r = \lim_{n \to \infty} \sqrt[n]{|a_n|}$ exists. If r < 1, the series $\sum_{n=1}^{\infty} a_n$ converges. If r > 1, the series diverges.

Proof. Suppose r < 1 and let r < α < 1. Then for some N, $\sqrt[n]{|a_n|} \le \alpha < 1$ for all n ≥ N. Hence, $|a_n| \le \alpha^n$. Now use the comparison test with the geometric series.

Suppose r > 1 and let r > α > 1. Then for some N, $\sqrt[n]{|a_n|} \ge \alpha > 1$ for all n ≥ N. So $|a_n| \ge \alpha^n$. Now we can compare the sequence to the geometric sequence and conclude that the terms do not go to zero.

Remark 8.1.17. To use the root test, it’s useful to know that $\lim_{n \to \infty} \sqrt[n]{n} = 1$.

8.2 Lebesgue’s Theorem

Definition 8.2.1. A set A ⊆ R has measure zero if, for all ε > 0, there exists a sequence $((a_n, b_n))$ of open intervals such that

\[ A \subseteq \bigcup_{n=0}^{\infty} (a_n, b_n) \]

and

\[ \sum_{n=0}^{\infty} (b_n - a_n) \le \varepsilon. \]

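Example 8.2.2. Let X be a countable set. Then X has measure zero. Indeed, let $X = \{x_n\}_{n \in \mathbb{N}}$ and let ε > 0. Recall that

\[ \sum_{n=0}^{\infty} \frac{1}{2^n} = \frac{1}{1 - 1/2} = 2. \]

Let

\[ (a_n, b_n) = \left( x_n - \frac{\varepsilon}{2^{n+2}},\ x_n + \frac{\varepsilon}{2^{n+2}} \right). \]

Then

\[ b_n - a_n = \frac{2\varepsilon}{2^{n+2}} = \frac{\varepsilon}{2^{n+1}} \]

so

\[ \sum_{n=0}^{\infty} (b_n - a_n) = \sum_{n=0}^{\infty} \frac{\varepsilon}{2^{n+1}} = \frac{\varepsilon}{2} \sum_{n=0}^{\infty} \frac{1}{2^n} = \varepsilon. \]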
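The bookkeeping in this example can be checked numerically. In the sketch below, `cover_length` is a hypothetical helper that adds up the lengths of the first few covering intervals; the points $x_n$ themselves play no role in the total, so they are not needed.

```python
def cover_length(eps, count):
    """Total length of the first `count` intervals (x_n - eps/2^(n+2), x_n + eps/2^(n+2))."""
    return sum(2 * eps / 2 ** (n + 2) for n in range(count))

eps = 0.1
partial_lengths = [cover_length(eps, count) for count in (1, 5, 20, 60)]
```

The partial totals increase towards ε and never exceed it (up to floating point rounding), in line with the geometric series computation above.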

Example 8.2.3 (Thomae’s function). The function f : [0, 1] → R given by

\[ f(x) = \begin{cases} 1 & x = 0 \\ \frac{1}{q} & x \in \mathbb{Q},\ x = \frac{p}{q},\ (p, q) = 1 \text{ and } p, q > 0 \\ 0 & x \notin \mathbb{Q} \end{cases} \]

is continuous at every irrational point of [0, 1] and discontinuous at every rational point. Let’s prove that it’s continuous at the irrationals. Let a ∈ [0, 1] be irrational. Let ε > 0. Choose n ≥ 2 such that 1/n < ε. Note that, for each fixed q, there are finitely many rational numbers in (0, 1] which, in reduced form p/q, have denominator q, since, for such a rational, we have 0 < p ≤ q. So, there are finitely many rationals in [0, 1] with q < n (counting 0 = 0/1). Let $X_n$ be the finite set of such rationals, and let $\delta = \min\{|x - a| \mid x \in X_n\}$. Then δ > 0, since $a \notin X_n$, and if |x − a| < δ, then $x \notin X_n$. So either x is irrational and |f(x) − f(a)| = 0, or x = p/q with q ≥ n, so that

\[ |f(x) - f(a)| = \frac{1}{q} \le \frac{1}{n} < \varepsilon. \]

We may not prove the following theorem, but it gives a good idea of what kind of functions are

Riemann Integrable.

Theorem 8.2.4 (Lebesgue’s Criterion for Integration). Let f be bounded on [a, b]. Then f is

Riemann integrable if and only if the set of points where f is not continuous has measure zero.

Example 8.2.5. The Cantor set is an example of an uncountable set of measure zero. It is produced as follows.

Let

\[ C_1 = [0, 1] \setminus (1/3, 2/3) = [0, 1/3] \cup [2/3, 1], \]

let

\[ C_2 = [0, 1/9] \cup [2/9, 3/9] \cup [6/9, 7/9] \cup [8/9, 9/9]. \]

The set $C_n$ is inductively defined by removing the open middle third of each of the disjoint closed intervals comprising $C_{n-1}$. Then the Cantor set is defined as:

\[ C = \bigcap_{n=1}^{\infty} C_n. \]

Here are some facts.

• Given an element α ∈ C, we associate a sequence $(\alpha_n) : \mathbb{N} \to \{0, 1\}$. We let

\[ \alpha_1 = \begin{cases} 0 & \alpha \in [0, 1/3] \\ 1 & \alpha \in [2/3, 1]. \end{cases} \]

Assuming that $\alpha_n$ has been defined and that $\alpha \in \left[ \frac{x}{3^n}, \frac{x+1}{3^n} \right]$, one of the intervals comprising $C_n$, we let

\[ \alpha_{n+1} = \begin{cases} 0 & \alpha \in \left[ \frac{x}{3^n}, \frac{3x+1}{3^{n+1}} \right] \\ 1 & \alpha \in \left[ \frac{3x+2}{3^{n+1}}, \frac{x+1}{3^n} \right]. \end{cases} \]

Check that this gives a bijection from C to sequences of 0’s and 1’s, which, by Cantor’s diagonal argument, is an uncountable set.

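• The Cantor set has measure zero. Let ε > 0. We have that

\[ C \subseteq C_k \]

for each k. Note that $C_k$ is the union of $2^k$ closed intervals $\left[ \frac{x_s}{3^k}, \frac{x_s + 1}{3^k} \right]$ of length $\frac{1}{3^k}$. So, the total length is $\frac{2^k}{3^k}$.

Let k be such that $\frac{2^k}{3^k} + \frac{2^{k+1}}{3^{k+1}} < \varepsilon$. Let

\[ (a_s, b_s) = \left( \frac{x_s}{3^k} - \frac{1}{3^{k+1}},\ \frac{x_s + 1}{3^k} + \frac{1}{3^{k+1}} \right). \]

Then,

\[ C \subseteq C_k \subseteq \bigcup_{s=1}^{2^k} (a_s, b_s) = X_k \]

and

\[ \sum_{s=1}^{2^k} (b_s - a_s) = \sum_{s=1}^{2^k} \left( \frac{1}{3^k} + \frac{2}{3^{k+1}} \right) = \frac{2^k}{3^k} + \frac{2^{k+1}}{3^{k+1}} < \varepsilon. \]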
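The construction of the sets $C_k$ is mechanical enough to sketch in code. The helper names `cantor_intervals` and `total_length` below are hypothetical; exact rational arithmetic from Python's `fractions` module is used so that the endpoints match the ones written above.

```python
from fractions import Fraction

def cantor_intervals(k):
    """Return the closed intervals making up C_k as (left, right) pairs; C_0 is [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            # Keep the two outer thirds and drop the open middle third.
            next_intervals.append((left, left + third))
            next_intervals.append((right - third, right))
        intervals = next_intervals
    return intervals

def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(right - left for left, right in intervals)

c2 = cantor_intervals(2)
length_c5 = total_length(cantor_intervals(5))
```

Here `c2` reproduces the four intervals of $C_2$, and the total length of $C_k$ comes out as $(2/3)^k$, which tends to 0 as k grows.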

Proposition 8.2.6. Let f : [0, 1] → R be given by

\[ f(x) = \begin{cases} 1 & x \in C \\ 0 & x \notin C. \end{cases} \]

Then f is Riemann Integrable and $\int_0^1 f = 0$.

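Proof. Let

\[ f_k(x) = \begin{cases} 1 & x \in C_k \\ 0 & x \notin C_k. \end{cases} \]

Then, $0 \le f \le f_k$ for all k. In particular, for any partition P,

\[ 0 \le L(f, P) \le U(f, P) \le U(f_k, P). \]

Now

\[ \int_0^1 f_k = \frac{2^k}{3^k} \]

so we can choose $P_k$ so that

\[ U(f_k, P_k) \le \frac{2^k}{3^k} + \frac{1}{2^k}. \]

Hence,

\[ 0 \le L(f, P_k) \le U(f, P_k) \le U(f_k, P_k) \le \frac{2^k}{3^k} + \frac{1}{2^k} \]

so,

\[ 0 \le \lim_{k} \left( U(f, P_k) - L(f, P_k) \right) \le \lim_{k} \left( \frac{2^k}{3^k} + \frac{1}{2^k} \right) = 0. \]

Hence, f is integrable and in fact, $\int_0^1 f = 0$.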

8.3 Sequences and Series of Functions

Definition 8.3.1. Let A ⊆ R and let $\mathbb{R}^A$ be the set of real valued functions from A to R. A sequence of functions on A is a function $f : \mathbb{N} \to \mathbb{R}^A$. We write $f_n = f(n)$. So, in other words, a sequence of functions is a countably infinite list $(f_0, f_1, f_2, \ldots)$ with $f_n : A \to \mathbb{R}$ for each n ∈ N.

Definition 8.3.2. A sequence $(f_n)$ converges pointwise to f : A → R if, for all a ∈ A and ε > 0, there exists $N_{a,\varepsilon} \in \mathbb{N}$ such that if $n \ge N_{a,\varepsilon}$, then $|f(a) - f_n(a)| < \varepsilon$.

Remark 8.3.3. $(f_n)$ converges pointwise to f if $\lim_{n \to \infty} f_n(a) = f(a)$ for every a ∈ A.

Example 8.3.4. Let $f_n : [0, 1] \to \mathbb{R}$ be the function $f_n(x) = x^n$. Let

\[ f(x) = \begin{cases} 0 & 0 \le x < 1 \\ 1 & x = 1. \end{cases} \]

Then, $\lim_n f_n(x) = f(x)$ for every x ∈ [0, 1].

Example 8.3.5. Power series are examples of sequences of functions. For example, $s_n : \mathbb{R} \to \mathbb{R}$,

\[ s_n(x) = \sum_{k=0}^{n} \frac{x^k}{k!} \]

is a sequence of functions. It converges pointwise to s : R → R given by $s(x) = e^x$.

Definition 8.3.6. The sequence (fn) converges uniformly to f : A → R if, for all ε > 0, there

exists $N_\varepsilon \in \mathbb{N}$ such that, if $n \ge N_\varepsilon$, then

|f(x)− fn(x)| < ε

for all x ∈ A.

Example 8.3.7. Let fn : R → R be given by fn(x) = sin(nx)/n. Then (fn) converges uniformly

to 0. Indeed, if ε > 0 and 1/N < ε, then, for n ≥ N and all x ∈ R,

\[ |f_n(x)| \le \frac{1}{n} < \varepsilon. \]

Remark 8.3.8. Suppose that the sequence (fn) converges pointwise to f : A → R. Then it does

not converge uniformly if, for some $\varepsilon_0 > 0$, for all n ∈ N, there exists $x_n \in A$ such that

\[ |f(x_n) - f_n(x_n)| \ge \varepsilon_0. \]

Example 8.3.9. The sequence (fn) from Example 8.3.4 does not converge uniformly. Indeed, let

ε = 1/2 and, for n ∈ N, choose $x_n$ such that

\[ (1/2)^{1/n} < x_n < 1. \]

Then $1/2 < f_n(x_n)$ so that

\[ |f_n(x_n) - f(x_n)| = f_n(x_n) \ge 1/2 = \varepsilon. \]
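The witnesses $x_n$ can be written down explicitly. The sketch below uses one possible choice, the midpoint between $(1/2)^{1/n}$ and 1; the helper name `witness` and this particular choice are assumptions made for illustration.

```python
def witness(n):
    """A point x_n with (1/2)^(1/n) < x_n < 1, chosen here as the midpoint."""
    root = 0.5 ** (1.0 / n)
    return (root + 1.0) / 2.0

ns = (1, 10, 100, 1000)

# f_n(x_n) = x_n^n, while the pointwise limit f vanishes at x_n < 1,
# so each entry is the gap |f_n(x_n) - f(x_n)|.
gaps = [witness(n) ** n for n in ns]
```

Every entry of `gaps` stays above 1/2 however large n is, so no single N can work for all x at once.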

Example 8.3.10. For any bounded domain A ⊆ R, the sequence of restrictions $s_n : A \to \mathbb{R}$ of the functions $(s_n)$ from Example 8.3.5 converges uniformly. We will return to the proof when we discuss power series.

Theorem 8.3.11. Let (fn) be a sequence of continuous functions fn : A → R. If (fn) converges

uniformly to f : A→ R, then f is continuous.

Proof. We prove that f is continuous at a ∈ A. Let ε > 0. Let N be such that, for all x ∈ A, if

n ≥ N ,

|fn(x)− f(x)| < ε/3.


Since fN : A→ R is continuous, there exists δ > 0 such that, if |x− a| < δ, then

|fN (x)− fN (a)| < ε/3.

Suppose that |x− a| < δ and x ∈ A. Then

|f(x)− f(a)| = |f(x)− fN (x) + fN (x)− fN (a) + fN (a)− f(a)|

≤ |f(x)− fN (x)|+ |fN (x)− fN (a)|+ |fN (a)− f(a)|

< ε.

Theorem 8.3.12. Suppose that (fn) is a sequence of integrable functions fn : [a, b] → R. If (fn)

converges uniformly to f : [a, b]→ R, then f is integrable and

\[ \int_a^b f = \lim_{n\to\infty} \int_a^b f_n. \]
In fact, for $y \in [a, b]$ and $F_n(y) = \int_a^y f_n$ and $F(y) = \int_a^y f$, the sequence $(F_n)$ converges uniformly to $F$ on $[a, b]$.

Proof. Let $\varepsilon > 0$. Let $N$ be such that, if $n \ge N$, then
\[ |f(x) - f_n(x)| < \frac{\varepsilon}{4(b - a)} \]
for all $x \in [a, b]$.

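First, we show that $f$ is integrable on $[a, b]$. Let $P$ be a partition of $[a, b]$ and let $n \ge N$. Then, for all $x \in [t_{i-1}, t_i]$,
\[ f(x) < f_n(x) + \frac{\varepsilon}{4(b - a)} \le M_i(f_n) + \frac{\varepsilon}{4(b - a)}. \]
Hence,
\[ M_i(f) \le M_i(f_n) + \frac{\varepsilon}{4(b - a)}. \]
Similarly,
\[ m_i(f_n) - \frac{\varepsilon}{4(b - a)} \le m_i(f). \]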


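So,
\[ M_i(f) - m_i(f) \le M_i(f_n) - m_i(f_n) + \frac{\varepsilon}{2(b - a)}. \]
Hence, for any partition $P$, we have
\[ U(f, P) - L(f, P) \le U(f_n, P) - L(f_n, P) + \frac{\varepsilon}{2}. \]
Let $P$ be such that $U(f_N, P) - L(f_N, P) < \varepsilon/2$. Then
\[ U(f, P) - L(f, P) \le U(f_N, P) - L(f_N, P) + \frac{\varepsilon}{2} < \varepsilon. \]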

Finally, since $f$ is integrable on $[a, y]$, if $n \ge N$,
\[ \left| \int_a^y f - \int_a^y f_n \right| \le \left| \int_a^y |f - f_n| \right| \le |y - a| \frac{\varepsilon}{4(b - a)} < \varepsilon. \]
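As an informal numerical check of Theorem 8.3.12, the sketch below compares integrals of a uniformly convergent sequence with the integral of its limit. The functions $f_n(x) = x + \sin(nx)/n$ on $[0, 1]$ and the midpoint-rule helper `midpoint_integral` are illustrative assumptions, not taken from the notes; the midpoint sum only approximates the integral.

```python
import math

def midpoint_integral(g, a, b, pieces=20000):
    """Midpoint Riemann sum of g over [a, b]; a numerical stand-in for the integral."""
    h = (b - a) / pieces
    return h * sum(g(a + (i + 0.5) * h) for i in range(pieces))

def f(x):
    """The uniform limit f(x) = x."""
    return x

def make_f_n(n):
    """f_n(x) = x + sin(n x)/n, so |f_n(x) - f(x)| <= 1/n for every x."""
    return lambda x: x + math.sin(n * x) / n

a, b = 0.0, 1.0
target = midpoint_integral(f, a, b)
for n in (1, 10, 100):
    gap = abs(midpoint_integral(make_f_n(n), a, b) - target)
    # The gap is controlled by (b - a) * sup|f - f_n|, which is at most (b - a)/n.
    print(n, gap, (b - a) / n)
```

The printed gap stays below $(b - a)/n$, mirroring the last estimate in the proof, where the integral of $|f - f_n|$ is bounded by the length of the interval times the uniform bound.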

Theorem 8.3.13. Let $A$ be an open interval containing $[a, b]$. Let $(f_n)$ be a sequence of functions, $f_n : A \to \mathbb{R}$. Suppose that, for all $n \in \mathbb{N}$, the function $f_n$ is differentiable and that the derivative $f_n'$ is continuous on $[a, b]$. Suppose that $\lim_{n\to\infty} f_n(a)$ exists, and that $(f_n')$ converges uniformly to a function $g$ on $[a, b]$. Then $(f_n)$ converges uniformly on $[a, b]$ to a differentiable function $f$ and $f' = g$ on $(a, b)$.

Proof. Note that $f_n'$ is continuous, hence integrable. Since $(f_n')$ converges uniformly to $g$ it follows that $g$ is integrable and that
\[ \lim_{n\to\infty} \int_a^x f_n' = \int_a^x g \]
uniformly on $[a, b]$. However, by the Fundamental Theorem of Calculus, if we let $F_n(x) = \int_a^x f_n'$, then
\[ F_n'(x) = f_n'(x). \]


So, $f_n(x) = F_n(x) + c$ where $c$ is a constant. Since $F_n(a) = 0$, $c = f_n(a)$. So,
\[ f_n(x) = \int_a^x f_n' + f_n(a). \]

Since $\lim f_n(a)$ exists, we let $f(a) = \lim_{n\to\infty} f_n(a)$ and define
\[ f(x) = \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \left( \int_a^x f_n' + f_n(a) \right) = \int_a^x g + f(a). \]

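Further, $(f_n)$ converges uniformly to $f$. Indeed, let $\varepsilon > 0$ and let $N$ be such that for all $n \ge N$
\[ \left| \int_a^x g - \int_a^x f_n' \right| < \varepsilon/2 \]
for all $x \in [a, b]$ and
\[ |f(a) - f_n(a)| < \varepsilon/2. \]
Then, for all $n \ge N$ and $x \in [a, b]$,
\[ |f(x) - f_n(x)| = \left| \int_a^x g + f(a) - \int_a^x f_n' - f_n(a) \right| < \varepsilon. \]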

Finally, note that since the $f_n'$ are continuous and converge uniformly to $g$, the function $g$ is continuous by Theorem 8.3.11. By the FTC, $f$ is differentiable and $f'(x) = g(x)$ for $x \in (a, b)$.
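A small numerical illustration of Theorem 8.3.13 follows. The sequence $f_n(x) = x^2 + \sin(nx)/n^2$ on $[0, 1]$ is an assumed example, chosen because its derivatives $2x + \cos(nx)/n$ converge uniformly to $g(x) = 2x$ and $f_n(0) = 0$ for every $n$; the grid maxima only sample the suprema.

```python
import math

def f_n(n, x):
    """Illustrative sequence f_n(x) = x**2 + sin(n x)/n**2."""
    return x * x + math.sin(n * x) / n ** 2

def df_n(n, x):
    """Derivative of f_n, computed by hand: 2x + cos(n x)/n."""
    return 2 * x + math.cos(n * x) / n

def g(x):
    """Uniform limit of the derivatives."""
    return 2 * x

def f(x):
    """Integral of g from 0 to x, plus lim f_n(0) = 0."""
    return x * x

grid = [i / 1000 for i in range(1001)]  # sample points in [0, 1]
for n in (1, 10, 100):
    sup_df = max(abs(df_n(n, x) - g(x)) for x in grid)
    sup_f = max(abs(f_n(n, x) - f(x)) for x in grid)
    # sup_df is at most 1/n and sup_f is at most 1/n**2.
    print(n, sup_df, sup_f)
```

Here the limit $f(x) = x^2$ is built exactly as in the proof, by integrating $g$ and adding the limit of the values at the left endpoint.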

Remark 8.3.14. If we drop the assumption that $f_n'$ is continuous, the theorem is still true, but trickier to prove.

Theorem 8.3.15 (Cauchy’s criterion for uniform convergence). Let (fn) be a sequence of functions,

fn : A→ R. Then (fn) converges uniformly on A if and only if, for all ε > 0, there exists an N ∈ N

such that, if m,n ≥ N , then

|fn(x)− fm(x)| < ε

for every x ∈ A.

Proof. Suppose that (fn) converges uniformly to f . Let ε > 0. There exists N such that if n ≥ N ,

|fn(x)− f(x)| < ε/2

for all x ∈ A. Hence, if n,m ≥ N ,

|fn(x)− fm(x)| ≤ |fn(x)− f(x)|+ |fm(x)− f(x)| < ε.


Conversely, suppose that for all ε > 0, there exists an N in N such that, if m,n ≥ N , then

|fn(x)− fm(x)| < ε

for every $x \in A$. For a given $x \in A$, the sequence $(f_n(x))$ is then a Cauchy sequence, hence it is convergent. Define $f : A \to \mathbb{R}$ by
\[ f(x) = \lim_{n\to\infty} f_n(x). \]

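Now, let $\varepsilon > 0$ and choose $N$ such that, if $n, m \ge N$, then
\[ |f_n(x) - f_m(x)| < \varepsilon/2 \]
for all $x \in A$. Fix $x \in A$ and $n \ge N$. Using $f(x) = \lim_{m\to\infty} f_m(x)$, we can choose some $m \ge N$ such that
\[ |f_m(x) - f(x)| < \varepsilon/2. \]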

Then,

|fn(x)− f(x)| ≤ |fn(x)− fm(x)|+ |fm(x)− f(x)| < ε.

Since $N$ does not depend on $x$, the sequence $(f_n)$ converges uniformly to $f$.
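Cauchy's criterion also gives a way to see that a sequence fails to converge uniformly without knowing its limit. As a sketch, take $f_n(x) = x^n$ on $[0, 1]$ from Example 8.3.4 and compare $f_n$ with $f_{2n}$ at the point $x = (1/2)^{1/n}$ (a choice made here for illustration), where $x^n = 1/2$ and $x^{2n} = 1/4$.

```python
def gap(n, m, x):
    """|f_n(x) - f_m(x)| for f_n(x) = x**n."""
    return abs(x ** n - x ** m)

for n in (1, 10, 100, 1000):
    x = 0.5 ** (1.0 / n)  # chosen so that x**n = 1/2 and x**(2n) = 1/4
    # The gap between f_n and f_{2n} at this point is about 1/4, however large n is.
    print(n, gap(n, 2 * n, x))
```

Since the gap does not shrink as $n$ grows, no $N$ can work for $\varepsilon = 1/4$, so the criterion fails for this sequence.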

8.4 Power Series

Definition 8.4.1. Let $(f_n)$ be a sequence of functions $f_n : A \to \mathbb{R}$. Then
\[ s_n(x) = \sum_{k=0}^{n} f_k(x) \]
is called the sequence of partial sums. We define the series as
\[ \sum_{n=0}^{\infty} f_n = \lim_{n\to\infty} s_n. \]
If $(s_n)$ converges pointwise, we say that the series $\sum_{n=0}^{\infty} f_n$ converges pointwise. If $(s_n)$ converges uniformly, we say that the series converges uniformly.

Exercise 8.4.2. Let $\sum_{k=n}^{\infty} a_k$ be $\lim_{m\to\infty} \sum_{k=n}^{m} a_k$. Prove that if $\sum_{k=0}^{\infty} a_k$ converges, then $\lim_{n\to\infty} \sum_{k=n}^{\infty} a_k = 0$.


(Hint: Prove that $\sum_{k=n}^{\infty} a_k = \sum_{k=0}^{\infty} a_k - s_{n-1}$, where $s_{n-1} = \sum_{k=0}^{n-1} a_k$.)

Theorem 8.4.3 (Weierstrass M-test). Let $(f_n)$ be a sequence of functions $f_n : A \to \mathbb{R}$. Let $(M_n)$ be a sequence of real numbers such that $M_n \ge 0$ and $|f_n(x)| \le M_n$ for all $x \in A$ and all $n \in \mathbb{N}$. If the series $\sum_{n=0}^{\infty} M_n$ converges, then $\sum_{n=0}^{\infty} f_n(x)$ converges absolutely for all $x \in A$ and the series $\sum_{n=0}^{\infty} f_n$ converges uniformly on $A$.

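Proof. By the comparison test, $\sum_{n=0}^{\infty} f_n(x)$ converges absolutely for each $x \in A$. Let $f(x) = \sum_{n=0}^{\infty} f_n(x)$. We must show that $\sum_{k=0}^{n} f_k(x)$ converges uniformly to $f$. Let $\varepsilon > 0$. Since $\sum_{k=0}^{\infty} M_k$ converges, it follows from Exercise 8.4.2 that $\lim_{n\to\infty} \sum_{k=n+1}^{\infty} M_k$ is zero. Let $N$ be such that, for all $n \ge N$,
\[ \left| \sum_{k=n+1}^{\infty} M_k \right| = \sum_{k=n+1}^{\infty} M_k < \varepsilon. \]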

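Then, for all $n \ge N$ and all $x \in A$,
\[ \left| \sum_{k=0}^{n} f_k(x) - f(x) \right| = \left| \sum_{k=n+1}^{\infty} f_k(x) \right| \le \sum_{k=n+1}^{\infty} M_k < \varepsilon. \]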

So, the series converges uniformly.

Example 8.4.4. The sequence $s_n(x) = \sum_{k=0}^{n} \frac{x^k}{k!}$ converges pointwise to $s(x) = e^x$ on $\mathbb{R}$. In fact, for any bounded domain $A$, the restriction $s_n : A \to \mathbb{R}$ of $(s_n)$ converges uniformly. Indeed, let $M$ be such that $|x| \le M$ for all $x \in A$. Then $|x^k/k!| \le M^k/k!$ on $A$, and we can apply the Weierstrass M-test with $\sum_{k=0}^{\infty} \frac{M^k}{k!}$ to conclude the result.
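The estimate from the Weierstrass M-test can be observed numerically for the exponential series. The sketch below assumes $A = [-3, 3]$, so $M = 3$, and approximates the tail $\sum_{k=n+1}^{\infty} M^k/k!$ by truncating it after 200 further terms; the helper names and the grid are illustrative choices.

```python
import math

def exp_terms(x, count):
    """First `count` terms x**k / k! (k = 0, ..., count - 1), built iteratively."""
    terms, term = [], 1.0
    for k in range(count):
        terms.append(term)
        term *= x / (k + 1)
    return terms

def s(n, x):
    """Partial sum s_n(x) = sum_{k=0}^{n} x**k / k!."""
    return sum(exp_terms(x, n + 1))

def tail(n, M, extra=200):
    """Truncated stand-in for sum_{k=n+1}^{infinity} M**k / k!."""
    return sum(exp_terms(M, n + 1 + extra)[n + 1:])

M = 3.0
grid = [-M + i * (2 * M) / 600 for i in range(601)]  # sample points in [-M, M]
for n in (5, 10, 20):
    worst = max(abs(s(n, x) - math.exp(x)) for x in grid)
    # The worst error on the grid is bounded by the tail of the M-series.
    print(n, worst, tail(n, M))
```

The same tail bound works for every $x$ in $[-M, M]$, which is why the convergence is uniform on bounded domains even though it is only pointwise on all of $\mathbb{R}$.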

Definition 8.4.5. A power series is a function of the form
\[ f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n \]
for $a, c_n \in \mathbb{R}$. We say that the power series is centered at $a$ with coefficients $c_n$.

Theorem 8.4.6. Let
\[ f(x) = \sum_{n=0}^{\infty} c_n x^n \]
and suppose there exists $a \in \mathbb{R}$, $a \ne 0$, such that the series
\[ f(a) = \sum_{n=0}^{\infty} c_n a^n \]


converges. Let $R$ be any number such that $0 < R < |a|$. Then the series defining $f$ converges uniformly on $[-R, R]$ and $f$ is continuous, differentiable and integrable on $[-R, R]$. Further, for $x \in [-R, R]$,
\[ \int_0^x f = \sum_{n=0}^{\infty} \frac{c_n x^{n+1}}{n + 1} \]
and
\[ f'(x) = \sum_{n=1}^{\infty} n c_n x^{n-1}. \]

Proof. Let $R < r < |a|$. Let
\[ f_k(x) = \sum_{n=0}^{k} c_n x^n. \]
Since $\sum_{n=0}^{\infty} c_n a^n$ converges, $\lim |c_n a^n| = 0$. Hence, there exists $M$ such that $|c_n a^n| < M$ for all $n \in \mathbb{N}$. It follows that $|c_n| \le M/|a|^n$ for all $n \in \mathbb{N}$. Therefore, $|c_n r^n| \le M(r/|a|)^n$ for all $n$, and if $|x| \le r$, then
\[ |c_n x^n| \le M(r/|a|)^n. \]
Let $M_n = M(r/|a|)^n$. Since $r/|a| < 1$, the series $\sum M_n$ converges. Hence, by the Weierstrass M-test, for all $x \in [-r, r]$ the series $\sum_{n=0}^{\infty} c_n x^n$ converges absolutely and the sequence $(f_k)$ converges uniformly on $[-r, r]$.

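Further, since the functions $f_k$ are continuous, $f$ is continuous and since the $f_k$ are integrable on $[-r, r]$, $f$ is integrable and
\[ \int_0^x f = \lim_{k\to\infty} \int_0^x \sum_{n=0}^{k} c_n t^n \, dt = \lim_{k\to\infty} \sum_{n=0}^{k} \frac{c_n x^{n+1}}{n + 1}. \]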

For the derivative, note that
\[ f_n'(x) = \sum_{m=1}^{n} m c_m x^{m-1}. \]
These are continuous functions. So, it suffices to prove that $(f_n')$ converges uniformly to $g(x) = \sum_{m=1}^{\infty} m c_m x^{m-1}$ on $[-r, r]$ and use Theorem 8.3.13. Further, by the Weierstrass M-test, it's enough to prove that $\sum_{m=1}^{\infty} m |c_m| r^{m-1}$ converges.

We know that for $|x| \le r$
\[ |n c_n x^{n-1}| \le |n c_n| r^{n-1} \le \frac{M}{r} \left( \frac{r}{|a|} \right)^n n. \]


So, it's enough to prove that if $|\ell| < 1$, then
\[ \sum n \ell^n \]
converges and this is an easy application of the Ratio Test.
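Term-by-term integration and differentiation can be checked on the geometric series, where closed forms are known. The sketch below assumes $c_n = 1$ for all $n$, so that $f(x) = 1/(1 - x)$ for $|x| < 1$, and evaluates everything at the sample point $x = 0.5$; the truncation at 200 terms is an arbitrary choice.

```python
import math

def series(term, x, upto):
    """Sum of term(n, x) for n = 0, 1, ..., upto."""
    return sum(term(n, x) for n in range(upto + 1))

def power_term(n, x):
    """c_n x**n with c_n = 1."""
    return x ** n

def integrated_term(n, x):
    """c_n x**(n+1) / (n+1) with c_n = 1."""
    return x ** (n + 1) / (n + 1)

def differentiated_term(n, x):
    """n c_n x**(n-1) with c_n = 1; the n = 0 term is zero."""
    return n * x ** (n - 1) if n >= 1 else 0.0

x, upto = 0.5, 200
f_val = series(power_term, x, upto)
int_val = series(integrated_term, x, upto)
der_val = series(differentiated_term, x, upto)

print(f_val, 1 / (1 - x))          # series vs. 1/(1 - x)
print(int_val, -math.log(1 - x))   # integrated series vs. integral of 1/(1 - t) from 0 to x
print(der_val, 1 / (1 - x) ** 2)   # differentiated series vs. derivative of 1/(1 - x)
```

Each pair of printed values should agree closely, as the theorem predicts for any $x$ strictly inside the interval where the series is known to converge.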

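Definition 8.4.7. Let
\[ f(x) = \sum_{n=0}^{\infty} c_n x^n. \]
Consider the set $A = \{a \in \mathbb{R} \mid f(a) \text{ converges}\}$. Let
\[ R = \begin{cases} \sup(A) & A \text{ is bounded} \\ \infty & A = \mathbb{R}. \end{cases} \]
Then $R$ is called the radius of convergence of $f$.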
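For a feel of what the radius of convergence measures, consider again the geometric series with $c_n = 1$ (an assumed example), for which $A = (-1, 1)$ and $R = 1$. The sketch compares partial sums at two truncation levels; finitely many partial sums cannot prove convergence or divergence, so this is only suggestive.

```python
def partial_sum(x, n):
    """s_n(x) = sum_{k=0}^{n} x**k, the partial sums of the series with c_k = 1."""
    return sum(x ** k for k in range(n + 1))

for x in (0.5, 0.9, 1.01, 1.5):
    # For |x| < 1 the two values nearly agree; for |x| > 1 they keep growing.
    print(x, partial_sum(x, 200), partial_sum(x, 400))
```

Inside the radius the partial sums settle down, while just outside it they grow without any sign of stabilizing.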
