Compendium of Results in Advanced Calculus

A draft.
Contents

1 Linear Algebra
  1.1 Matrix Algebra
    1.1.1 Some Special Matrices
    1.1.2 Operations with Matrices
    1.1.3 Inverses
    1.1.4 The dot or inner product
    1.1.5 Some more special matrices
  1.2 Linear Maps or Transformations
  1.3 Determinant & Trace
  1.4 Systems of Linear Equations
    1.4.1 Solution of a System of Linear Equations
  1.5 Eigenvalues and Eigenvectors
2 Calculus
  2.1 Mean-Value Theorems & their Consequences
  2.2 Limits & indeterminate forms
  2.3 Partial derivatives
  2.4 Maxima & Minima
  2.5 Theorems on Integration
  2.6 Improper Integrals
    2.6.1 Convergence tests for Type 1a & 1b integrals
    2.6.2 Convergence tests for Type 2 integrals
  2.7 Uniform convergence & improper integrals
  2.8 The Gamma Function
  2.9 Multiple Integrals
  2.10 Vector identities
  2.11 Line, Surface & Volume Integrals
  2.12 Green's, Stokes' & Gauss' theorems
3 Fourier series
4 Integral Transforms
  4.1 Laplace Transforms
  4.2 Fourier Transforms
  4.3 Z-Transforms
5 Ordinary Differential Equations (ODEs)
  5.1 First-order equations
  5.2 First-order equations in separable form
  5.3 Exact First-order ODEs
  5.4 General first-order first-degree linear equations
    5.4.1 Integrating factors
  5.5 First-order nth degree equations
  5.6 Linear ODEs
    5.6.1 Linear ODE of Euler- (or Cauchy-) type
    5.6.2 nth order constant coefficient homogeneous equations
6 Partial Differential Equations (PDEs)
  6.1 Formation of a PDE
  6.2 First-order PDEs
    6.2.1 Special types of first-order equations
  6.3 Linear PDEs with constant coefficients
  6.4 Some special linear PDEs
    6.4.1 The one-dimensional wave equation
    6.4.2 The two-dimensional wave equation
    6.4.3 The three-dimensional wave equation
    6.4.4 Two-dimensional Laplace equation in a rectangle
    6.4.5 Two-dimensional Laplace equation in a circle with Dirichlet conditions
    6.4.6 Laplace's equation in three dimensions
    6.4.7 One-dimensional heat equation
    6.4.8 Two-dimensional heat equation
7 Complex Variables
  7.1 Preliminaries
  7.2 Linear Fractional Transformations
  7.3 Elementary Functions
  7.4 Analytic Functions
  7.5 Complex integration
  7.6 Series
  7.7 Residues & Poles
8 Probability & Statistics
  8.1 Probability
    8.1.1 Conditional probability
    8.1.2 Random variables & probability distributions
    8.1.3 Some standard distributions
  8.2 Statistics
9 Numerical Methods
  9.1 Errors
  9.2 Solution of algebraic & transcendental equations
  9.3 Numerical differentiation
  9.4 Numerical integration
  9.5 Numerical solution of ODEs
Notation

The following notation will be used throughout.

N, R and C will represent the natural numbers {1, 2, . . . }, the real numbers and the complex numbers respectively; these will often be referred to as scalars. R^n is real n-dimensional space.

The symbol " := " means that the LHS is defined by the RHS.

∐ An is the "disjoint union" of the sets An, viz. the union of the sets An, which are assumed to be disjoint.

If (an) is an infinite sequence, then

    lim sup an := lim_{n→∞} [ sup{an, an+1, . . . } ]

Functions will usually be described in the f : X −→ Y format. The domain [a, b] or (a, b) of a real-valued function will be assumed to be bounded unless otherwise specified, i.e. −∞ < a < b < ∞.

A function f : (a, b) −→ R, −∞ ≤ a < b ≤ ∞, is C^n if f is differentiable n times and its derivatives f^(1), f^(2), . . . , f^(n) are continuous. Continuous functions are C^0.
Chapter 1
Linear Algebra
1.1 Matrix Algebra
Definition 1. A matrix is a collection of objects
or symbols displayed in a rectangular arrangement.
The objects may be, for example, numbers, func-
tions or other matrices. We will be concerned only
with matrices whose entries are numbers. The ob-
jects are called the entries of the matrix. The col-
lection of entries may be finite or infinite.
Example. The matrices

    [ a  b  c ]        and        [ a ]
    [ d  e  f ]

are finite 2×3 and 1×1 matrices respectively. The matrix

    [ 1    1/2  1/3  · · · ]
    [ 1/2  1/3  1/4  · · · ]
    [ ...  ...  ...        ]

is infinite.
We will be concerned only with finite matrices. A
matrix with real-number entries will be called a
real matrix and one with complex-number entries a
complex matrix. Every real matrix is also a complex
matrix. If the matrix has m rows and n columns
(m,n ≥ 1), the matrix is said to be of order m×n.
If the matrix has m rows and m columns, then its
order is said to be m. A matrix is usually conveniently specified by a typical entry. Thus, the m×n matrix

    A = [ a11  a12  · · ·  a1j  · · ·  a1n ]
        [ a21  a22  · · ·  a2j  · · ·  a2n ]
        [ ...  ...         ...         ... ]
        [ ai1  ai2  · · ·  aij  · · ·  ain ]
        [ ...  ...         ...         ... ]
        [ am1  am2  · · ·  amj  · · ·  amn ]
is written A = [aij ], where aij is the entry in the
ith row and jth column and is said to be in the
(i, j) position. Sometimes the order of the matrix
is made explicit by writing Am,n or Am×n. If
m = n, then the abbreviation An is also used.
A submatrix of a matrix is a matrix formed by the entries from the intersection of some set of rows {i1, i2, . . . , ik} and some set of columns {j1, j2, . . . , jl} of the original matrix. A principal submatrix of a square matrix (see below) is formed by selecting the same rows and columns {i1, i2, . . . , ik}.
1.1.1 Some Special Matrices
Zero matrix The matrix all of whose entries are
0. It is usually denoted by 0 or 0m×n.
    0 := [ 0  0  · · ·  0 ]
         [ ...          ...]
         [ 0  0  · · ·  0 ]
Matrix unit (of order m×n) The matrix unit
Eij of order m×n is defined to have 1 in the
(i, j) position and 0 elsewhere:
    Eij := [ 0  · · ·  0  · · ·  0 ]
           [ ...      ...      ...]
           [ 0  · · ·  1  · · ·  0 ]    (the 1 is in the (i, j) position)
           [ ...      ...      ...]
           [ 0  · · ·  0  · · ·  0 ]
In terms of the Kronecker delta or symbol, δij ,
defined by
    δij = { 1 if i = j
          { 0 if i ≠ j
the (k, l)th entry of the matrix unit Eij is
δikδjl.
Row- & column vectors A 1×n matrix is called
a row-vector or -matrix and an n×1 matrix is
called a column vector, column matrix or simply a vector. An n×1 column vector is often termed an n-dimensional vector. If the vector is assumed to have entries only from R (resp. C), it is said to be a real (resp. complex) vector. The column vector

    [ a1 ]
    [ a2 ]
    [ ...]
    [ an ]

is sometimes written as [a1 a2 · · · an]^T for typographical convenience. The symbol T (the transpose) is defined in subsection 1.1.2. A row vector is also written (a1, a2, . . . , an). The entries of a column vector are called its components.
Row-reduced Echelon matrix Its defining
characteristics are:
(i) If a row has non-zero entries (i.e. a non-
zero row), then its first non-zero entry is
1, i.e. the leading term of the row is 1.
(ii) All the entries in the column containing a
leading 1 are 0.
(iii) If rows i and j are non-zero, i < j, then the leading 1 of the ith row occurs to the left of the leading 1 of the jth row.

(iv) Any zero row occurs below every non-zero row.

An example is the matrix

    [ 0  1  0  0  a  b ]
    [ 0  0  0  1  c  d ]
    [ 0  0  0  0  0  0 ]
    [ 0  0  0  0  0  0 ]
Square matrix An m×n matrix A = [aij ] with
m = n, i.e. the number of rows of the matrix
is equal to the number of its columns. The list
of entries {aii : i = 1, 2, . . . , n} is the diagonal
of A. Note that diagonals can only be defined
for square matrices. Entries which do not lie
on the diagonal are said to be off-diagonal.
The list {a12, a23, . . . , ai,i+1, . . . , an−1,n} is the superdiagonal and the list {a21, a32, . . . , ai,i−1, . . . , an,n−1} is the subdiagonal of A.
Identity matrix A square matrix whose di-
agonal is {1, 1, . . . , 1} and all off-diagonal
entries are 0. It is usually denoted by I or
if the order is to be emphasised, In. Thus
    In := [ 1  0  · · ·  0 ]
          [ 0  1  · · ·  0 ]
          [ ...          ...]
          [ 0  0  · · ·  1 ]
Diagonal matrix A square matrix whose
off-diagonal entries are all 0. The
identity matrix is diagonal. A gen-
eral diagonal matrix is often written as
diag[a1, a2, . . . , an].
Triangular matrix An n×n square matrix
is upper triangular if all entries below the
diagonal are 0:
    [ a11  a12  a13  · · ·  a1n ]
    [ 0    a22  a23  · · ·  a2n ]
    [ 0    0    a33  · · ·  a3n ]
    [ ...            ...     ...]
    [ 0    0    0    · · ·  ann ]
The definition of lower triangular is similar.
Partitioned or Block matrices A block or par-
titioned matrix is a matrix whose entries are
themselves matrices which satisfy the condi-
tion that the matrices in a given row of the
block-matrix each have the same number of
rows and those in a given column, the same
number of columns. In other words, if the
entries of the individual matrices are written
out, the result should be a matrix. For example,

    [ A2×3  B2×4  C2×1 ]
    [ D3×3  E3×4  F3×1 ]

is a block-matrix whereas

    [ A2×4  B2×4  C2×1 ]
    [ D3×3  E3×3  F3×3 ]

is not. For convenience, a matrix is sometimes written in block form. An n×n block diagonal matrix has the form

    [ A1  0   0   · · ·  0  ]
    [ 0   A2  0   · · ·  0  ]
    [ ...            ...    ]
    [ 0   0   0   · · ·  An ]
Here each 0 is a zero matrix of appropri-
ate order. A matrix can be partitioned in
many different ways by drawing horizontal
lines between rows and vertical lines between
columns. The terminology of matrices with
scalar entries can usually be carried over to
block matrices.
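As a quick illustration of the definitions above, here is a minimal Python sketch (the names kronecker_delta, matrix_unit and identity are ours, not the text's; matrices are plain lists of lists, with 0-based indices where the text uses 1-based):

```python
def kronecker_delta(i, j):
    # delta_ij = 1 if i = j, 0 otherwise
    return 1 if i == j else 0

def matrix_unit(i, j, m, n):
    # The m x n matrix unit E_ij: its (k, l) entry is delta_ik * delta_jl,
    # i.e. 1 in the (i, j) position and 0 elsewhere.
    return [[kronecker_delta(i, k) * kronecker_delta(j, l)
             for l in range(n)] for k in range(m)]

def identity(n):
    # I_n: diagonal entries 1, off-diagonal entries 0
    return [[kronecker_delta(i, j) for j in range(n)] for i in range(n)]

E01 = matrix_unit(0, 1, 2, 3)   # 1 in row 0, column 1 (0-based)
```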
1.1.2 Operations with Matrices
Addition Given two m×n matrices A = [aij ] and
B = [bij ], the matrix A+B is defined to be the
m×n matrix whose (i, j)th entry is aij + bij ,
i.e. matrix addition is defined entrywise.
Analogously, if X = [Xij ] and Y = [Yij ], are
partitioned matrices (each entry is a matrix
of suitable order), then X + Y = [Xij + Yij ],
provided Xij and Yij are of the same order for
each i, j.
The following results are immediate and hold
for matrices with scalar entries and for block
matrices:
Theorem 2. Matrix addition is
1. commutative: A+B = B +A.
2. associative A + (B + C) = (A + B) + C
for any three matrices A, B and C of the
same order.
For this reason the addition as above is often
written without parentheses: A + B + C.
Clearly, addition can be extended to any finite
number of summands.
Any m×n matrix A = [aij ] can be written as

    A = ∑_{i=1}^{m} ∑_{j=1}^{n} aij Eij
where the Eij ’s are the matrix units defined
earlier.
Multiplication by scalars For any scalar c and
any matrix A = [aij ], the matrix cA has the
(i, j)th entry caij , i.e. every entry of A is
multiplied by c. When c = −1, the matrix
(−1)A is usually denoted by −A. Note that
A + (−A) = 0 = (−A) + A. The definition is
the same for block matrices. The square mat-
rix cIn is called a scalar matrix.
Transpose & Hermitian Adjoint If A = [aij ]
is an m× n matrix with entries from R, its
transpose is defined to be the n×m matrix
denoted AT (also At or tA) and defined by
AT = [aji]. That is, the entry in the (i, j) pos-
ition of AT is the entry in the (j, i) position of
A. Another way of describing the transposed
matrix is to say that it is obtained from A
by replacing its ith row by its ith column,
i = 1, 2, . . . ,m. The corresponding operation
for a block-matrix X = [Xij ] is X^T = [Xji^T],
i.e. both the block-matrix and its individual
matrix entries are transposed.
If A = [aij ] has entries from C, then it
is convenient to define a slightly modified
version of the transpose, viz. the hermitian
adjoint (or often simply, adjoint). This is the
matrix denoted A∗ and defined by A∗ = [āji], where āji is the conjugate of the complex number aji. When the entries are all real, A^T = A∗. The notation Ā^T for A∗ is also
used. The corresponding definition for block
matrices is X∗ = [Xji∗].
Theorem 3.
1. (AT)T = A and (A∗)∗ = A.
2. (A+B)T
= AT + BT and (A + B)∗ =
A∗ +B∗.
3. (AB)T
= BTAT and (AB)∗ = B∗A∗.
Multiplication of matrices The product of two
matrices Am×n = [aij ] and Bn×p = [bjk], denoted by AB, is the m×p matrix whose (i, k)th entry is ∑_{r=1}^{n} air brk. Matrix multiplication is defined only when the number of
columns of the first factor is equal to the num-
ber of rows of the second factor. For block
matrices Xm×n = [Xij ] and Yn×p = [Yij ],
XY = [∑_{r=1}^{n} Xir Yrj ], provided all the multiplications Xir Yrj are valid, and is of order
m×p.
Theorem 4.
1. Matrix multiplication is not in general
commutative: it is not necessarily true
that AB = BA.
2. Multiplication is associative: If A, B
and C are three matrices such that the
products BC and A(BC) are defined,
then so are AB and (AB)C. Moreover,
A(BC) = (AB)C.
3. Multiplication is distributive: given
matrices A, B and C, A(B + C) =
AB + AC, if the sum B + C and the
products AB, AC are defined. Analog-
ously, (B + C)A = BA+ CA.
4. Am×n0n×p = 0m×p and 0p×mAm×n = 0p×n.
5. Am×nIn = Am×n = ImAm×n.
6. c(AB) = (cA)B = A(cB), c a scalar.
7. (Product of matrix units) EijEkl =
δjkEil, where δjk is the Kronecker sym-
bol defined earlier.
When a product AB is formed, B is said
to be pre- or left-multiplied by A which in
turn is said to be post- or right-multiplied by B.
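The entrywise product rule above translates directly into code. A minimal sketch (plain Python lists; matmul is our name, not the text's), also exhibiting the failure of commutativity from Theorem 4:

```python
def matmul(A, B):
    # (i, k) entry of AB is the sum over r of A[i][r] * B[r][k];
    # defined only when columns(A) == rows(B).
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][r] * B[r][k] for r in range(n)) for k in range(p)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)   # [[2, 1], [4, 3]]
BA = matmul(B, A)   # [[3, 4], [1, 2]] -- so AB != BA in general
```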
1.1.3 Inverses
Let Am×n, A′n×m and A′′n×m be matrices such
that A′A = In and AA′′ = Im. Then
A′ is a left inverse of A and A′′ a right
inverse of A. In general, there will exist
many left (or right) inverses.
Theorem 5. 1. If Am×n has a left in-
verse and a right inverse, then A
must be square and the two inverses
are equal and unique.
2. If A is square and has a left in-
verse (resp. a right inverse), then it
also has a right inverse (resp. a left
inverse) and the two are equal and
unique.
When A is square, the unique matrix B,
if it exists, satisfying BA = I = AB is
called the inverse of A and is denoted
A−1. A matrix which has an inverse
is said to be invertible or non-singular.
Otherwise it is said to be non-invertible
or singular. The basic properties of
inverses are summarised in the next
theorem.
Theorem 6. Let A and B be n×n in-
vertible matrices and A−1 and B−1 their
respective inverses. Then
1. (A−1)−1 = A.
2. (AB)−1
= B−1A−1. Thus, a product
of invertible matrices is invertible.
This result extends to any finite num-
ber of factors.
3. (Left- and Right-cancellation) AX =
AY ⇒ X = Y and XA = Y A ⇒X = Y , where X, Y are arbitrary
matrices for which the products are
defined.
4. (A−1)T = (AT)−1 and (A∗)−1 =
(A−1)∗.
5. A is invertible ⇐⇒ detA ≠ 0, where
detA is the determinant of A.
6. A diagonal matrix D := diag[a1, a2, . . . , an] is invertible iff each ai ≠ 0. The inverse, if it exists, is D−1 = diag[1/a1, 1/a2, . . . , 1/an].
7. If a real or complex square matrix
A = [aij ] is strictly diagonally dom-
inant, i.e. for each i = 1, 2, . . . , n,
    |aii| > ∑_{j=1, j≠i}^{n} |aij |
then A is invertible.
8. Let

       X = [ Am×m  Bm×n ]
           [ Cn×m  Dn×n ]

   be a block (or partitioned) matrix with A invertible. Then X is invertible if and only if Y := D − CA−1B is invertible, in which case X−1 is given by

       X−1 = [ A−1 + A−1BY−1CA−1    −A−1BY−1 ]
             [ −Y−1CA−1              Y−1     ]

9. An important special case of the above result occurs when C = 0. Then X is invertible iff A and D are invertible and

       X−1 = [ A−1   −A−1BD−1 ]
             [ 0      D−1     ]

A pair of square matrices A and B of the
same order are similar if B = PAP−1 for
some invertible P . They are orthogonally
(resp. unitarily) similar if P is orthogonal
(resp. unitary).
Theorem 7. Similar matrices have the
same determinant.
A and B are congruent or cogredient if
B = PAPT (or PAP ∗ in the case of com-
plex matrices) for some invertible P . Two
m×n matrices A and B are equivalent if
there exist invertible matrices Pm and Qn such that B = Pm A Qn.
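For 2×2 matrices the inverse can be written down explicitly via the adjugate, which gives a quick numerical check of Theorem 6.2, (AB)−1 = B−1A−1. A sketch with exact rational arithmetic (inv2 and matmul are our names, not the text's):

```python
from fractions import Fraction

def matmul(A, B):
    # product of square matrices given as lists of lists
    return [[sum(A[i][r] * B[r][k] for r in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def inv2(A):
    # [[a, b], [c, d]]^-1 = (1/(ad - bc)) [[d, -b], [-c, a]]
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    assert det != 0, "matrix is singular"
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 1]]
B = [[1, 2], [0, 1]]
```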
1.1.4 The dot or inner product
Definition 8.

1. The dot product or inner product of two real n×1 column vectors x = [x1, . . . , xn]^T and y = [y1, . . . , yn]^T is denoted by x · y (or 〈x, y〉) and defined to be

       x · y := ∑_{k=1}^{n} xk yk = x^T y

2. If x and y have complex entries then

       x · y := ∑_{k=1}^{n} xk ȳk = x^T ȳ

   where ȳ := [ȳ1, . . . , ȳn]^T is the vector whose entries are the complex conjugates of those of y.
Definition 9. The length or magnitude or norm of a vector x = [x1 x2 . . . xn]^T with real or complex entries is defined to be

    ||x|| := [ ∑_{k=1}^{n} |xk|^2 ]^{1/2} = (x^T x)^{1/2}    (1.1)

for real vectors; for complex vectors the last expression must be replaced by (x^T x̄)^{1/2}. A vector x such that ||x|| = 1 is called a unit vector.
Definition 10.
1. Non-zero vectors x, y ∈ Rn are ortho-
gonal if x · y = 0. They are parallel or
collinear if y = cx for some scalar c.
2. A set of vectors {x1, . . . , xn} is orthonormal if each xi is a unit vector and xi · xj = δij (Kronecker delta; see subsection 1.1.1).
Theorem 11. Let x, y be two n-dimensional
real or complex vectors. The norm has the fol-
lowing properties:
1. ||x|| = 0⇐⇒ x = 0.
2. ||cx|| = |c| ||x||, c ∈ R,C.
3. (Triangle inequality)
| ||x|| − ||y|| | ≤ ||x+ y|| ≤ ||x||+ ||y||
4. (Cauchy-Buniakowsky-Schwarz inequal-
ity)
|x · y| ≤ ||x|| ||y||
Equality holds iff y = cx for some scalar
c.
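A small numerical check of Definition 9 and Theorem 11 for real vectors (dot and norm are our names, not the text's):

```python
import math

def dot(x, y):
    # x . y = sum of x_k * y_k (real case)
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # ||x|| = (sum of |x_k|^2)^(1/2)
    return math.sqrt(sum(abs(a) ** 2 for a in x))

x, y = [3, 4], [1, 2]
xy = [a + b for a, b in zip(x, y)]   # the vector x + y
```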
1.1.5 Some more special matrices
Let A be a square matrix.
1. It is symmetric if A^T = A. If A is complex,
then A is Hermitian if A∗ = A. Note that a
complex matrix A may be symmetric without
being hermitian.
2. It is Normal if A∗A = AA∗; in particular, for
matrices with real entries, this reduces to the
condition ATA = AAT.
3. It is Orthogonal if
AAT = I (= ATA)
and unitary if
AA∗ = I (= A∗A)
Orthogonal matrices are thus invertible.
4. Rotation matrices in 2 dimensions. The ortho-
gonal matrices
    Rθ := [ cos θ  −sin θ ]        (0 ≤ θ ≤ 2π)
          [ sin θ   cos θ ]
represent rotations about the origin by an
angle θ in the anticlockwise or positive direc-
tion: if x ∈ R2, then Rθx is the vector obtained
by rotating x through θ about the origin.
5. Rotation matrices in 3-dimensions. The ortho-
gonal matrices
    Rx := [ 1  0       0      ]        Ry := [ cos θ   0  sin θ ]
          [ 0  cos θ  −sin θ  ]              [ 0       1  0     ]
          [ 0  sin θ   cos θ  ]              [ −sin θ  0  cos θ ]

    Rz := [ cos θ  −sin θ  0 ]
          [ sin θ   cos θ  0 ]
          [ 0       0      1 ]
represent rotations through an angle θ about
the x-axis, the y-axis and the z-axis respect-
ively. For example, Rxv is the vector v rotated
through θ about the x-axis.
7. General Rotation matrix. The orthogonal matrix

    [ c + (1−c)a1²       (1−c)a1a2 − sa3    (1−c)a1a3 + sa2 ]
    [ (1−c)a1a2 + sa3    c + (1−c)a2²       (1−c)a2a3 − sa1 ]
    [ (1−c)a1a3 − sa2    (1−c)a2a3 + sa1    c + (1−c)a3²    ]

where c := cos θ and s := sin θ, represents a rotation through an angle θ about the axis a = (a1, a2, a3), a unit vector.
7. Reflection matrix. Let u = (u1, u2) (u =
(u1, u2, u3)) be a unit vector in R2 (resp. R3).
Then the matrix representing reflection in the
line (resp. plane) perpendicular to u is given
by

    [ 1 − 2u1²   −2u1u2  ]        [ 1 − 2u1²   −2u2u1    −2u3u1   ]
    [ −2u1u2     1 − 2u2² ]   &   [ −2u1u2     1 − 2u2²  −2u3u2   ]
                                  [ −2u1u3     −2u2u3    1 − 2u3² ]
8. It is idempotent if A² = A and nilpotent if A^k = 0 for some integer k > 0.
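The rotation matrices above are easy to test numerically. A sketch (rotation2 and apply are our names, not the text's) verifying that R_{π/2} sends e1 = (1, 0) to (0, 1) and that Rθ is orthogonal:

```python
import math

def rotation2(theta):
    # R_theta = [[cos t, -sin t], [sin t, cos t]]: anticlockwise rotation
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(M, v):
    # matrix-vector product Mv
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

v = apply(rotation2(math.pi / 2), [1, 0])        # approximately (0, 1)
R = rotation2(0.3)
RT = [[R[j][i] for j in range(2)] for i in range(2)]   # R transposed
```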
1.2 Linear Maps or Transformations
A map T : Rn −→ Rm is said to be linear if
T (x+ y) = T (x) + T (y) for all x, y ∈ Rn
T (cx) = c T (x) for all x ∈ Rn, c ∈ R
In the definition above, R can be replaced by C.
The two conditions can be combined to state that
T is linear if T (cx + y) = c T (x) + T (y) for all
x, y ∈ Rn and c ∈ R.
Theorem 12.
1. If T is linear, then T (0n) = 0m.
2. If A is an m×n matrix, then T (x) := Ax is a
linear map from Rn −→ Rm.
3. If A is m× n, B is n× p, then AB is a linear
map from Rp −→ Rm.
Definition 13.
1. Let T : Rn −→ Rm be a linear map. The set
{x ∈ Rn : Tx = 0} is called the null space or
kernel of T . The set {Tx ∈ Rm : x ∈ Rn} is
called the range space or image space of T .
Definition 14.
1. A set {v1, v2, . . . , vk} of non-zero (column) vec-
tors is said to be linearly independent if given
scalars c1, c2, . . . , ck,
    ∑_{i=1}^{k} ci vi = 0 ⇒ ci = 0
for i = 1, 2, . . . , k. A set of row vec-
tors {u1, u2, . . . , uk} is linearly independent
if {u1^T, u2^T, . . . , uk^T} is. If the vectors are
not linearly independent, they are said to be
linearly dependent. In this case, there exist scalars c1, c2, . . . , ck, not all zero, such that ∑_{i=1}^{k} ci vi = 0.
2. The number of vectors in any maximal sub-
set of linearly independent vectors of the null
space of a linear transformation T is called the
nullity of T and sometimes denoted null(T ).
The number of vectors in any maximal sub-
set of linearly independent vectors in the range
space of T is called the rank of T and variously
denoted rank(T ) or rk(T ).
Theorem 15. (The Rank-Nullity theorem) Let T :
Rn −→ Rm be a linear transformation. Then
n = null(T ) + rk(T )
1.3 Determinant & Trace
Given a 1×1 matrix A = [a11], its determinant is detA := a11. If A = [aij ] is 2×2, detA := a11a22 − a12a21. Assuming that the determinant of an (n−1)×(n−1) matrix has been defined, the determinant of an n×n matrix A = [aij ] is defined as follows: let Mij be the determinant of the submatrix of A obtained from A by omitting its ith row and jth column. Then

    detA = ∑_{j=1}^{n} (−1)^{i+j} aij Mij    (1.2)

Mij is called the minor corresponding to the entry aij and the determinant is said to be expressed in terms of an expansion by minors along the ith row. In particular, we may take i = 1 in (1.2). The term (−1)^{i+j} Mij is called the cofactor corresponding to the entry aij.
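Formula (1.2) with i = 1 gives a direct, if inefficient, recursive algorithm. A minimal Python sketch (det is our name, not the text's; with 0-based indexing the sign becomes (−1)^j):

```python
def det(A):
    # expansion by minors along the first row: det A = sum over j of
    # (-1)^j * A[0][j] * M_0j, where M_0j deletes row 0 and column j
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(n))
```

This recomputes minors repeatedly (about n! work), so it is illustrative only; Gaussian elimination (subsection 1.4.1) is the practical route.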
Definition 16. Let A = [aij ] be an n × n square
matrix. The trace of A is the sum of the diagonal
entries of A, i.e. Trace(A) := ∑_{k=1}^{n} akk. It is also denoted tr(A), etc.
Theorem 17.
1. The “trace map” Trace : A 7→ Trace(A) is lin-
ear, i.e. given matrices A and B, and c any
scalar,
Trace(cA + B) = c Trace(A) + Trace(B)
2. Trace(AB) = Trace(BA), where A,B are
square.
3. If A and B are similar, i.e. there is an in-
vertible matrix P such that B = PAP−1, then
Trace(A) = Trace(B).
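A quick numerical check of Theorem 17 (trace and matmul are our names, not the text's):

```python
def trace(A):
    # sum of the diagonal entries
    return sum(A[k][k] for k in range(len(A)))

def matmul(A, B):
    return [[sum(A[i][r] * B[r][k] for r in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
# Theorem 17.2: Trace(AB) = Trace(BA), even though AB != BA in general
```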
Properties of determinants Let A = [aij ] be an
n×n square matrix.
1. (Expansion along columns) detA = ∑_{i=1}^{n} (−1)^{i+j} aij Mij for any fixed column j.
2. (Multiplying a column by a constant) Writing A in block form as A = [A1 A2 · · · Aj · · · An], where Aj is the jth column of A,

       det[A1 A2 · · · (cAj) · · · An] = c detA

   Similarly for rows. Hence, if any row or column of a square matrix has only zero entries, then its determinant is zero. Also,

       det(cA) = c^n detA
3. If A′ is the matrix obtained by interchanging
two rows or two columns of A, then detA′ =
−detA.
4. If A has any two rows or any two columns
identical, then detA = 0.
5. If A′ = [A1 A2 · · · (Aj + cAk) · · · An], k ≠ j, where A = [A1 A2 · · · Aj · · · An], then detA′ = detA.
6. det(AB) = (detA)(detB) = det(BA).
7. detAT = detA and detA∗ = detA if A is
complex.
8. det In = 1.
9. If A is upper or lower triangular,
detA = a11a22 · · · ann
If A is block upper triangular, i.e.

    A = [ X  Y ]
        [ 0  Z ]

then detA = (detX)(detZ).
10. A is invertible ⇐⇒ detA ≠ 0. Moreover,
detA−1 = 1/(detA).
11. If A and B are similar, then detA = detB.
1.4 Systems of Linear Equations
A collection of m simultaneous equations:
a11x1 + a12x2 + a13x3 + · · · + a1nxn = b1
a21x1 + a22x2 + a23x3 + · · · + a2nxn = b2
... (1.3)
am1x1 + am2x2 + am3x3 + · · ·+ amnxn = bm
in the unknowns x1, x2, x3, . . . , xn is called a non-
homogeneous system of linear equations. If all the
bi’s are zero, then the system is said to be ho-
mogeneous. The scalars aij , i = 1, 2, . . . ,m and
j = 1, 2, . . . , n are the coefficients of the unknowns.
The system (1.3) can be rewritten in matrix form
as Ax = b, where
    A := [ a11  a12  a13  · · ·  a1n ]
         [ a21  a22  a23  · · ·  a2n ]
         [ ...  ...  ...         ... ]
         [ am1  am2  am3  · · ·  amn ]    (m×n)

is the coefficient matrix and

    x := [ x1 ]              b := [ b1 ]
         [ x2 ]                   [ b2 ]
         [ ...]                   [ ...]
         [ xn ]    (n×1)          [ bm ]    (m×1)
Augmented Matrix The block-matrix [A b] is
the augmented matrix associated with the sys-
tem (1.3).
Any n-tuple (c1, c2, . . . , cn) such that

    A [c1 c2 · · · cn]^T = b
is said to be a solution of the system (1.3). A linear
system of equations may have no solution, a unique
solution or infinitely many solutions. In the first
case, the system is said to be inconsistent. Oth-
erwise it is said to be consistent. A homogeneous
system is consistent since x = 0 is always a solution;
such a solution is said to be trivial.
1.4.1 Solution of a System of Linear Equations
Elementary row operations These are certain
operations performed on the rows of a matrix
to obtain another closely related matrix called
an elementary matrix. We first define the operations on the identity matrix In.
1. Elementary Permutation. The ith row
of In is exchanged with its jth row. The
elementary matrix thus obtained is de-
noted Pij . In terms of the matrix units
(see subsection 1.1.1) we can write this as:
Pij := In − Eii − Ejj + Eij + Eji
2. Elementary Dilation or Dilatation.
This replaces the ith 1 on the diagonal
of In by some scalar c ≠ 0. The corresponding elementary matrix is Di(c):
Di(c) := In − Eii + cEii
3. Elementary Transvection. This oper-
ation replaces the ith row by the ith row +
c × (jth row), i ≠ j, c an arbitrary scalar.
The corresponding matrix is Tij(c):
Tij(c) := In + cEij
It differs from In in having c instead of 0
in the (i, j) position.
Elementary matrices are often denoted E.
Theorem 18.
1. Pre-multiplying any m×n matrix by an
elementary matrix from the list above re-
produces the corresponding row operation
on the given matrix.
2. Each of Pij, Di(c) and Tij(c) is invertible
with corresponding inverse Pji, Di(1/c)
and Tij(−c).
Gaussian Elimination This is a method for solv-
ing a consistent system or for demonstrating
that a given system is inconsistent. The sys-
tem Ax = b is row-reduced to echelon form
(see subsection (1.1.1)) by a finite sequence
of row operations, or equivalently, by pre-
multiplying the augmented matrix [A b] (p. 9)
by a finite sequence of elementary matrices,
say E1, E2,. . . , Ek, to obtain the block-matrix
[A′ b′] := Ek . . . E2E1[A b]. Note that A′ =
Ek . . . E2E1A and b′ = Ek . . . E2E1b. The sys-
tem A′x = b′ can be easily solved or determ-
ined to be inconsistent. Since the Ei’s are in-
vertible, it is clear that the systems Ax = b
and A′x = b′ are equivalent in the sense that
either both are inconsistent or both have ex-
actly the same solutions. The following ex-
ample illustrates the procedure. Consider the
system Ax = b:
    [ 1  −2   1   2 ] [ x1 ]   [ b1 ]
    [ 1   1  −1   1 ] [ x2 ] = [ b2 ]
    [ 1   7  −5  −1 ] [ x3 ]   [ b3 ]
                      [ x4 ]

We apply the following sequence of row operations to the augmented matrix [A b]:

    [ 1  −2   1   2  | b1 ]
    [ 1   1  −1   1  | b2 ]
    [ 1   7  −5  −1  | b3 ]

    T21(−1) →
    [ 1  −2   1   2  | b1      ]
    [ 0   3  −2  −1  | b2 − b1 ]
    [ 1   7  −5  −1  | b3      ]

    T31(−1) →
    [ 1  −2   1   2  | b1      ]
    [ 0   3  −2  −1  | b2 − b1 ]
    [ 0   9  −6  −3  | b3 − b1 ]

    D2(1/3) →
    [ 1  −2    1     2   | b1           ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3  ]
    [ 0   9   −6    −3   | b3 − b1      ]

    D3(1/9) →
    [ 1  −2    1     2   | b1           ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3  ]
    [ 0   1  −2/3  −1/3  | (b3 − b1)/9  ]

    T12(2) →
    [ 1   0  −1/3   4/3  | (b1 + 2b2)/3 ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3  ]
    [ 0   1  −2/3  −1/3  | (b3 − b1)/9  ]

    T32(−1) →
    [ 1   0  −1/3   4/3  | (b1 + 2b2)/3       ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3        ]
    [ 0   0    0     0   | (2b1 − 3b2 + b3)/9 ]

We obtain the equivalent system of equations plus a consistency condition (1.4):

    x1 − (1/3)x3 + (4/3)x4 = (b1 + 2b2)/3
    x2 − (2/3)x3 − (1/3)x4 = (b2 − b1)/3
    0 = (2b1 − 3b2 + b3)/9    (1.4)

If for example b1 = 0, b2 = 0, b3 = 1, then the system has no solution. Assuming that (1.4) is satisfied, we may assign arbitrary values x3 = s and x4 = t to obtain the general solution

    x = [ (1/3)s − (4/3)t + (b1 + 2b2)/3 ]
        [ (2/3)s + (1/3)t + (b2 − b1)/3  ]
        [ s                              ]
        [ t                              ]
Every solution is obtained by giving particular
values to s and t: real values if the system is
regarded as being over R and complex values
if it is over C.
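The worked example can be checked with exact rational arithmetic: for any b satisfying the consistency condition 2b1 − 3b2 + b3 = 0 and any parameters s, t, the general solution should satisfy Ax = b. A sketch (the particular values of b, s, t are our arbitrary choices):

```python
from fractions import Fraction as F

A = [[1, -2, 1, 2], [1, 1, -1, 1], [1, 7, -5, -1]]
b1, b2, b3 = 1, 1, 1          # 2*1 - 3*1 + 1 = 0, so condition (1.4) holds
s, t = F(5), F(-7)            # arbitrary values for x3 and x4

# general solution from the text
x = [F(1, 3) * s - F(4, 3) * t + F(b1 + 2 * b2, 3),
     F(2, 3) * s + F(1, 3) * t + F(b2 - b1, 3),
     s,
     t]
Ax = [sum(F(A[i][j]) * x[j] for j in range(4)) for i in range(3)]
```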
Rank of a matrix The maximum number of
linearly independent columns (regarded as
column vectors) of a matrix Am×n is called
the column-rank of A. The 0 matrix has column-rank 0. The row-rank is defined analogously. If the column-rank of A is n, then A is said to be of full column-rank, and if its row-rank is m, it is said to be of full row-rank.
The column-rank of a block-matrix is the rank
of the matrix obtained when the block-matrix
is written without the partitioning.
Theorem 19.
1. The row-rank of an m×n matrix is equal
to its column-rank. This common integer
is called the rank of the matrix and is de-
noted variously as ρ(A), rk(A) etc. Ob-
viously, rk(A) ≤ min{m,n}. If rk(A) =
min{m,n}, it is said to be of full rank.
2. If A is of order m×n and B of order n×p,
then
rk(A)+rk(B)−n ≤ rk(AB) ≤ min{rk(A), rk(B)}
The lower bound is Sylvester’s inequality.
3. (Frobenius’ inequality)
rk(ABC) ≥ rk(AB) + rk(BC)− rk(B)
4. rk(A) = rk(AX) = rk(YA), if X and Y
are invertible.
5. If A is similar to B, then rk(A) = rk(B).
6. rk(AB) = rk(A) ⇐⇒ A = ABX for some
matrix X.
7. rk(AB) = rk(B) ⇐⇒ B = YAB for some
matrix Y .
8. rk(A + B) ≤ rk(A) + rk(B). Note that
rk(A + (−A)) = 0. This inequality ex-
tends to any finite number of summands:
rk(A1+A2+· · ·+Ak) ≤ rk(A1)+rk(A2)+
· · ·+ rk(Ak).
9. The system Ax = 0, where A is m×n,
has a non-trivial solution (i.e. a non-zero
solution) iff rk(A) < n. In particular, this
is the case if m < n.
10. The system Ax = b is consistent iff
rk(A) = rk([A b]), the matrix on the RHS
being the augmented matrix.
Cramer’s Rule If in the system Ax = b, A is in-
vertible, then the solution exists and is unique:
x = A−1b. An alternative way of describing the solution is Cramer's rule: if x = [x1 x2 · · · xn]^T is the solution of the given system, then

xi = det Ai / det A

where Ai is the matrix obtained by replacing the ith column of A by b.
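Cramer's rule is mechanical to apply in code. A small sketch, with an invertible example matrix chosen here, checked against a library solver:

```python
import numpy as np

# Cramer's rule: x_i = det(A_i) / det(A), where A_i is A with its
# i-th column replaced by b (example data chosen for illustration).
A = np.array([[2., 1., 1.],
              [1., 3., 2.],
              [1., 0., 0.]])
b = np.array([4., 5., 6.])

detA = np.linalg.det(A)
x = np.empty(3)
for i in range(3):
    Ai = A.copy()
    Ai[:, i] = b                      # replace i-th column by b
    x[i] = np.linalg.det(Ai) / detA

assert np.allclose(x, np.linalg.solve(A, b))
print(x)
```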
1.5 Eigenvalues and Eigenvectors

In this section, unless otherwise mentioned, all matrices are assumed to be square.

Eigenvalue, Eigenvector & Eigenspace Let A be an n×n matrix over the real or complex numbers. Any scalar λ such that Ax = λx for some column vector x ≠ 0 is called an eigenvalue of the matrix A corresponding to the eigenvector x. A matrix may not have any real eigenvalues, but it always has complex ones. The set of all the eigenvectors associated with a given eigenvalue, together with the zero vector, is called the eigenspace of the eigenvalue.
Spectrum & Spectral Radius The collection of
all distinct eigenvalues of a matrix A is called
its spectrum and is usually denoted σ(A). The
nonnegative number
ρ(A) := max{|λ| : λ ∈ σ(A)}
is said to be the spectral radius of A. In the
event a real matrix A does not have any real
eigenvalues, the spectral radius is defined
using its complex eigenvalues.
Theorem 20. Let A be an n×n matrix.
1. The eigenvalues of a diagonal or a tri-
angular matrix (upper or lower) are the
entries on the diagonal.
2. A is singular (i.e. noninvertible) iff 0 ∈ σ(A). If A is invertible, then

σ(A−1) = σ(A)−1 := {1/λ : λ ∈ σ(A)}

If x is an eigenvector of A associated with λ, then x is also an eigenvector of A−1 associated with the eigenvalue 1/λ.
3. Let p(x) = a0 + a1x + a2x² + · · · + anx^n be a polynomial. Then for every λ ∈ σ(A) with an eigenvector x, p(λ) is an eigenvalue of

p(A) := a0I + a1A + a2A² + · · · + anA^n

with the same eigenvector x: p(A)x = p(λ)x. However, if only the real eigenvalues of a matrix are taken into account, then in general p(σ(A)) := {p(λ) : λ ∈ σ(A)} ≠ σ(p(A)).
4. (Spectral mapping theorem) If all the ei-
genvalues (real and complex) are con-
sidered, then if p is a polynomial,
p(σ(A)) = σ(p(A))
As a special case of the above, taking
p(x) = cx, c a scalar,
σ(cA) = cσ(A)
Eigenvalues of special matrices
5. Every real matrix of odd order has at least one real eigenvalue.
6. The complex eigenvalues of a real matrix
occur in conjugate pairs: if λ is a complex
eigenvalue of A, then so is λ.
7. The spectrum of an idempotent matrix
(see section 8) is a subset of {0, 1}. Every
eigenvalue of a nilpotent matrix is 0.
8. n×n real symmetric and hermitian matrices have n real eigenvalues which, however, may not be all distinct.
9. The eigenvalues of a real skew-symmetric matrix are purely imaginary or zero.
10. Let A be orthogonal. Then λ ∈ σ(A) ⇒ |λ| = 1.
11. If A is strictly diagonally dominant, then
(a) The diagonal entries of A are all pos-
itive ⇒ all the eigenvalues of A have
positive real part.
(b) A is hermitian and the diagonal
entries of A are all positive ⇒ the ei-
genvalues of A are real and positive.
Characteristic Polynomial The polynomial
pA(x) := det (A− xI)
(sometimes defined as det (xI −A)) is called
the characteristic polynomial of A. It is of de-
gree n if A is n×n.
Theorem 21.
1. The roots of pA(x) are the eigenvalues of A. Similar matrices have the same characteristic polynomial.
2. If pA(x) = Σ_{k=0}^{n} ak x^k (taking the monic convention pA(x) = det(xI − A)), then

Trace(A) = −a_{n−1} = Σ_{k=1}^{n} λk

where the λk's are all the eigenvalues (real and complex) of A.
3. det A = (−1)^n a0 = λ1 λ2 · · · λn.
Theorem 22. (Cayley–Hamilton) Any n×n matrix A satisfies its characteristic polynomial, i.e. if pA(x) = a0 + a1x + · · · + anx^n is the characteristic polynomial of A, then

pA(A) := a0I + a1A + · · · + anA^n = 0
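The theorem can be verified numerically for a sample matrix. A sketch assuming NumPy, whose np.poly returns the coefficients of the monic characteristic polynomial det(xI − A), highest degree first:

```python
import numpy as np

# Verify Cayley-Hamilton for an example matrix chosen here.
A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])
coeffs = np.poly(A)        # monic characteristic polynomial coefficients

# Evaluate p_A(A) by Horner's rule with a matrix argument.
n = A.shape[0]
P = np.zeros((n, n))
for c in coeffs:
    P = P @ A + c * np.eye(n)

assert np.allclose(P, np.zeros((n, n)))
print("p_A(A) = 0 verified")
```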
Eigenvalue Multiplicity Suppose that the characteristic polynomial pA(x) factors as (x − λ1)^{m1} (x − λ2)^{m2} · · · (x − λk)^{mk}, in which the λi's are possibly complex. Then, for each i = 1, 2, . . . , k, the eigenvalue λi is said to have (algebraic) multiplicity mi. Alternatively, there are said to be mi eigenvalues equal to λi, counting multiplicities. If A is n×n, then m1 + m2 + · · · + mk = n.
Theorem 23.
1. AT has the same eigenvalues as A count-
ing multiplicities. The eigenvalues of A∗
are the complex conjugates of the eigen-
values of A counting multiplicities (i.e. if
λ is an eigenvalue of A with multiplicity
m, then λ is an eigenvalue of A∗ with the
same multiplicity m).
2. Similar matrices have the same eigenval-
ues counting multiplicity.
3. Given Am×n and Bn×m (m ≤ n),

pBA(x) = x^{n−m} pAB(x)

AB and BA have the same non-zero eigenvalues counting multiplicities; BA has an additional n − m eigenvalues equal to 0. If A and B are square and at least one of A and B is invertible, then AB and BA are similar.
4. Let

M = [ A 0 ]
    [ 0 B ]

Then the eigenvalues of M are those of A and of B, counting multiplicities.
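Item 3 can be illustrated numerically. A sketch with randomly chosen A (2×3) and B (3×2); the seed and sizes are arbitrary:

```python
import numpy as np

# AB and BA share their non-zero eigenvalues; BA picks up n - m
# extra zero eigenvalues (here m = 2, n = 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))

ev_AB = np.sort_complex(np.linalg.eigvals(A @ B))   # 2 eigenvalues
ev_BA = np.sort_complex(np.linalg.eigvals(B @ A))   # 3 eigenvalues

# BA is 3x3 of rank at most 2, so 0 is among its eigenvalues;
# the remaining ones match those of AB.
nonzero_BA = np.array([z for z in ev_BA if abs(z) > 1e-8])
assert np.allclose(np.sort_complex(nonzero_BA), ev_AB)
assert np.min(np.abs(ev_BA)) < 1e-8
print("non-zero spectra of AB and BA agree")
```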
Chapter 2
Calculus
All functions in this chapter will be real-valued
unless otherwise mentioned.
2.1 Mean-Value Theorems & their Consequences
Rolle’s theorem Let f : [a, b] −→ R, −∞ <
a, b < ∞, be continuous and differentiable in
(a, b) with finite or infinite derivative. Suppose
f(a) = f(b). Then there exists a < c < b such
that f ′(c) = 0.
Mean Value Theorem Let f : [a, b] −→ R be a
continuous function which is differentiable in
(a, b) with finite or infinite derivatives. Then
there exists c ∈ (a, b) such that
f(b)− f(a) = (b− a)f ′(c)
Generalised Mean Value Theorem Let f, g :
[a, b] −→ R be continuous on [a, b] and differ-
entiable on (a, b) with finite or infinite derivat-
ives. Suppose f ′ and g′ are not simultaneously
infinite at any point of (a, b). Then, there ex-
ists c ∈ (a, b) such that
[g(b)− g(a)]f ′(c) = [f(b)− f(a)]g′(c)
The mean value theorem is recovered by taking
g(x) = x.
Monotonicity If f : [a, b] −→ R is continuous and
differentiable on (a, b) with possibly infinite de-
rivatives, then
1. f ′(x) > 0 on (a, b)⇒ f is strictly increas-
ing on [a, b].
2. f ′(x) < 0 on (a, b) ⇒ f is strictly de-
creasing on [a, b].
3. f ′(x) = 0 on (a, b) ⇒ f is constant on
[a, b].
Applications to maxima and minima See
(2.4).
Intermediate Value Property Suppose f : [a, b] −→ R is differentiable with finite or infinite derivative on [a, b], the one-sided derivatives f′(a+) and f′(b−) at the endpoints a and b respectively being assumed to be finite and unequal: f′(a+) ≠ f′(b−). Then, for any α such that f′(a+) < α < f′(b−) or f′(b−) < α < f′(a+), there is c ∈ (a, b) for which f′(c) = α.
Corollary 1 f ′ cannot have jump discon-
tinuities.
Corollary 2 If f is continuous on [a, b] and differentiable on (a, b) with f′(x) ≠ 0 (but possibly infinite) at every point, then f is strictly monotonic: increasing if f′ > 0 and decreasing if f′ < 0.
Continuity of derivatives If f : (a, b) −→ R is differentiable and f′ is monotonic, then f′ is continuous. (By the intermediate value property above, a monotone derivative can have no jump discontinuities.)
Derivative of a vector-valued function The
ith projection function pi : Rn −→ R is the
function defined by
pi(x1, x2, . . . , xn) = xi
Let f : (a, b) −→ Rn be a vector-valued func-
tion. f can be written in terms of its compon-
ent functions fi := pi ◦ f as
f(x) = (f1(x), f2(x), . . . , fn(x))
We say that f is differentiable if each fi is and
define
f ′(x) := (f ′1(x), f ′2(x), . . . , f ′n(x))
In matrix notation, if

f(x) = [ f1(x) ]        then    f′(x) := [ f′1(x) ]
       [ f2(x) ]                         [ f′2(x) ]
       [  ...  ]                         [  ...   ]
       [ fn(x) ]                         [ f′n(x) ]
Taylor's theorem Let f : [a, b] −→ R be continuous with finite nth order derivative f^(n) in (a, b). Suppose that f^(n−1) is continuous on [a, b] and that x0 ∈ [a, b] is arbitrary. Then, for every x ∈ [a, b], x ≠ x0, there exists ξ strictly between x and x0 such that

f(x) − f(x0) = Σ_{k=1}^{n−1} (f^(k)(x0)/k!) (x − x0)^k + (f^(n)(ξ)/n!) (x − x0)^n

The special case n = 1 is the mean-value theorem. The polynomial

p(x) = Σ_{k=0}^{n−1} (f^(k)(x0)/k!) (x − x0)^k

is called the Taylor polynomial of degree n − 1 at x0 associated with f.
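The Lagrange form of the remainder gives a computable error bound. A sketch for the example f = exp at x0 = 0 (function, point, and degree chosen here for illustration):

```python
import math

# Taylor polynomial of degree n-1 for exp at x0 = 0; the Lagrange
# remainder is e^xi * x^n / n! for some xi between 0 and x, so for
# x > 0 it is bounded by e^x * x^n / n!.
def taylor_exp(x, n):
    """Sum_{k=0}^{n-1} x^k / k!  (Taylor polynomial of degree n-1)."""
    return sum(x**k / math.factorial(k) for k in range(n))

x, n = 0.5, 8
approx = taylor_exp(x, n)
remainder_bound = math.exp(x) * abs(x)**n / math.factorial(n)

assert abs(math.exp(x) - approx) <= remainder_bound
print(approx, math.exp(x))
```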
Generalised Taylor's theorem (for a pair of functions) Let the hypotheses of the previous theorem hold for each of two functions f, g : [a, b] −→ R. Then, for every x ∈ [a, b], x ≠ x0, there exists ξ strictly between x and x0 such that

[f(x) − Σ_{k=0}^{n−1} (f^(k)(x0)/k!) (x − x0)^k] g^(n)(ξ)
    = [g(x) − Σ_{k=0}^{n−1} (g^(k)(x0)/k!) (x − x0)^k] f^(n)(ξ)

Taylor's theorem for a single function is recovered by taking g(x) = (x − x0)^n.
Integral form of Taylor's theorem Let f : [x0 − δ, x0 + δ] −→ R be continuously differentiable of order n: f^(n) exists on (x0 − δ, x0 + δ) and is continuous there. Then for every x ∈ [x0 − δ, x0 + δ]

f(x) = f(x0) + Σ_{k=1}^{n−1} (f^(k)(x0)/k!) (x − x0)^k + ∫_{x0}^{x} (f^(n)(t)/(n−1)!) (x − t)^{n−1} dt

This form avoids introducing an unspecified value ξ.
2.2 Limits & indeterminate forms

Form 0/0
1. Suppose f, g : [a, b] −→ R are continuously differentiable and f(x0) = 0 = g(x0) for some a < x0 < b. If g′(x0) ≠ 0, then

lim_{x→x0} f(x)/g(x) = f′(x0)/g′(x0)

2. (L'Hospital's rule) Let f, g : [a, b] −→ R be continuously differentiable, f(x0) = 0 = g(x0), g′(x) ≠ 0 for all x ≠ x0 in [a, b], and lim_{x→x0} f′(x)/g′(x) = L, where −∞ ≤ L ≤ ∞. Then

lim_{x→x0} f(x)/g(x) = L
Form ∞/∞
Suppose f, g : (a, b] −→ R are continuously differentiable,

lim_{x→a+} f(x) = ∞ = lim_{x→a+} g(x)    (g′(x) ≠ 0)

and

lim_{x→a+} f′(x)/g′(x) = L

where −∞ ≤ L ≤ ∞. Then

lim_{x→a+} f(x)/g(x) = L

A similar result is true for limits x → b−.
Form 0·∞
With similar differentiability assumptions as above, if f, g : [a, b] −→ R are such that lim_{x→a} f(x) = 0 and lim_{x→a} g(x) = ∞, then reduce to the earlier forms by writing either f(x)g(x) = f(x)/g(x)^{−1} or f(x)g(x) = g(x)/f(x)^{−1}, depending on the convenience in applying L'Hospital's rule.

Forms 0^0, 0^∞, ∞^0, ∞^∞, 1^∞
If

lim_{x→a} f(x) = 0 = lim_{x→a} g(x),    lim_{x→a} h(x) = ∞

with f(x) ≥ 0, then

lim_{x→a} f(x)^{g(x)} = e^{lim_{x→a} g(x) log f(x)}

The form 0^0 is thus reduced to 0·∞. Taking logarithms as above, the forms 0^∞ and ∞^∞ are seen to be not indeterminate: 0^∞ evaluates directly to 0 and ∞^∞ to ∞, without requiring a passage to derivatives. The same logarithmic formulation reduces ∞^0 and 1^∞ to 0·∞.
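For instance, the 0^0 form x^x as x → 0+ is handled by the exponential rewriting above: x log x → 0, so the limit is e^0 = 1. A numerical sketch:

```python
import math

# x^x = exp(x * log x); since x*log x -> 0 as x -> 0+, the limit is 1.
def x_to_the_x(x):
    return math.exp(x * math.log(x))

values = [x_to_the_x(10.0**-k) for k in (2, 4, 6, 8)]
assert abs(values[-1] - 1.0) < 1e-6
# the deviation from 1 shrinks monotonically along this sequence
assert all(abs(v - 1.0) >= abs(w - 1.0) for v, w in zip(values, values[1:]))
print(values)
```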
2.3 Partial derivatives

An open ball of radius r centred on a in Rn is a set of the form {x := (x1, x2, . . . , xn) : ||x − a|| < r} for some 0 < r < ∞ and a := (a1, a2, . . . , an) (for the definition of || · || see (1.1)). It is usually denoted B(a, r). Rn may be thought of as a ball of infinite radius. Let U ⊂ Rn be a set. Then x ∈ U is an interior point of U if there is some open ball B(x, r) ⊂ U. If every point of U is an interior point, then U is said to be open.
Directional Derivatives Let f : U ⊂ Rn −→ Rm be a function, a ∈ U an interior point, and u ≠ 0 an arbitrary vector of Rn. Then the limit, if it exists,

f′(a; u) := lim_{t→0} (f(a + tu) − f(a))/t

is the directional derivative of f at a in the direction u.
Partial derivatives The kth unit vector ek of Rn is the vector [0, 0, . . . , 1, . . . , 0]^T with 1 in the kth place.
Let f : U ⊂ Rn −→ R be a function and a ∈ U be any point. The kth partial derivative, or simply the kth partial, of f at a is defined to be the directional derivative of f at a in the direction ek, viz. f′(a; ek), if the derivative exists. It is variously denoted by ∂f/∂xk(a), f_{xk}(a), Dkf(a), ∂kf(a) etc. The partial derivative at a can be seen as the usual one-variable derivative of the function F(xk) := f(a1, a2, . . . , ak−1, xk, ak+1, . . . , an) with respect to xk. The domain of F is Uk := {xk ∈ R : (a1, a2, . . . , ak−1, xk, ak+1, . . . , an) ∈ U}. Informally, f is differentiated with respect to xk treating the other variables as fixed.
Gradient If f : U ⊂ Rn −→ R is a function whose partial derivatives exist at a ∈ U, then the gradient of f at a is the (row-)vector

∇f(a) := [∂f/∂x1(a), ∂f/∂x2(a), . . . , ∂f/∂xn(a)]

It is also sometimes denoted by grad f.
Successive partial differentiation Consider f : U ⊂ R2 −→ R. Assuming the possibility of differentiation, each partial derivative ∂f/∂xi, which is a function

(x1, x2) ↦ ∂f/∂xi(x1, x2),    i = 1, 2,

can be differentiated again, generating the second-order derivatives ∂²f/∂x1², ∂²f/∂x2², ∂²f/∂x1∂x2 and ∂²f/∂x2∂x1. The partials of the form ∂²f/∂xi∂xj, i ≠ j, are said to be mixed. This process can be continued finitely many times to produce nth-order partial derivatives, provided the differentiations are possible.
Theorem 24. Let f : U ⊂ R2 −→ R, U open. Suppose the partials ∂f/∂x1, ∂f/∂x2 and ∂²f/∂x1∂x2 exist, and ∂²f/∂x1∂x2 is continuous at some point a := (a1, a2) ∈ U. Then ∂²f/∂x2∂x1 exists at a and

∂²f/∂x1∂x2(a) = ∂²f/∂x2∂x1(a)
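The symmetry of mixed partials can be illustrated with nested central differences. A sketch for a smooth example function chosen here (the step sizes are a numerical compromise, not part of the theorem):

```python
import math

# Example function; analytically d2f/dxdy = d2f/dydx = 2*x*cos(y).
def f(x, y):
    return x**2 * math.sin(y)

def d_dx(g, x, y, h=1e-5):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y = 1.3, 0.7
fxy = d_dy(lambda u, v: d_dx(f, u, v), x, y)   # d/dy of df/dx
fyx = d_dx(lambda u, v: d_dy(f, u, v), x, y)   # d/dx of df/dy
exact = 2 * x * math.cos(y)

assert abs(fxy - fyx) < 1e-3
assert abs(fxy - exact) < 1e-3
print(fxy, fyx, exact)
```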
Cn functions Suppose f : U ⊂ Rn −→ R has all partial derivatives of order n which, moreover, are continuous. Then f is said to be Cn.
Taylor's theorem for functions of two variables We use the notation

[(h ∂/∂x + k ∂/∂y)^n f](a, b) := Σ_{i=0}^{n} (n choose i) h^i k^{n−i} (∂^n f/∂x^i ∂y^{n−i})(a, b)

where h, k ∈ R.

Theorem 25. Let U ⊂ R2 be the rectangular open set (a − h, a + h) × (b − k, b + k), a, b, h, k ∈ R, and f : U −→ R be C^{n+1}. Then

f(a + h, b + k) = Σ_{r=0}^{n} (1/r!) [(h ∂/∂x + k ∂/∂y)^r f](a, b)
    + (1/(n+1)!) [(h ∂/∂x + k ∂/∂y)^{n+1} f](a + ξh, b + ξk)

where 0 < ξ < 1.
Partial differentiation of composite functions
Let φ, ψ : A ⊂ R2 −→ R and f : B ⊂ R2 −→ Rbe functions such that (φ(u, v), ψ(u, v)) ∈ B
for all (u, v) ∈ A. Let a general element of B
be denoted (x, y), i.e
x := φ(u, v) y := ψ(u, v)
Define (f ◦ (φ, ψ))(u, v) := f(φ(u, v), ψ(u, v)).
Theorem 26. Let φ, ψ and f be as above and continuously differentiable. Then

∂/∂u (f ∘ (φ, ψ)) = (∂f/∂x)(∂φ/∂u) + (∂f/∂y)(∂ψ/∂u)
∂/∂v (f ∘ (φ, ψ)) = (∂f/∂x)(∂φ/∂v) + (∂f/∂y)(∂ψ/∂v)
Partial differentiation of implicit functions

Theorem 27.
1. Let F : U ⊂ R2 −→ R be C1 and suppose F(x, f(x)) = 0 for some differentiable f. Writing F = F(x, y), assume that ∂F/∂y ≠ 0 on U. Then

f′(x) = − (∂F/∂x) / (∂F/∂y)

2. Let F : U ⊂ R3 −→ R be a continuously differentiable function. Suppose
(a) f : V ⊂ R2 −→ R is continuously differentiable.
(b) If F = F(x, y, z), then F(x, y, f(x, y)) = 0, assuming that (x, y, f(x, y)) ∈ U for all (x, y) ∈ V.
(c) ∂F/∂z (x, y, f(x, y)) ≠ 0.
Then

∂f/∂x (x, y) = − (∂F/∂x (x, y, f(x, y))) / (∂F/∂z (x, y, f(x, y)))
∂f/∂y (x, y) = − (∂F/∂y (x, y, f(x, y))) / (∂F/∂z (x, y, f(x, y)))
Homogeneous functions f : U ⊂ R2 −→ R is homogeneous of degree n if for any t > 0 and all (x, y) ∈ U, f(tx, ty) = t^n f(x, y).

Theorem 28. (Euler's theorem & its converse) Suppose f : U ⊂ R2 −→ R is continuously differentiable. Then f is homogeneous of degree n ⇐⇒ x ∂f/∂x + y ∂f/∂y = nf.

Caution In the definition of homogeneity, sometimes t is allowed to be any real. In this case, the converse to Euler's theorem, viz. the "⇐=" part, can be false.
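Euler's relation can be checked numerically for a sample homogeneous function. A sketch, with an example function of degree 3 chosen here and derivatives estimated by central differences:

```python
import math

# Example homogeneous function of degree 3: f(tx, ty) = t^3 f(x, y).
def f(x, y):
    return x**3 + x * y**2

def partial(g, x, y, wrt, h=1e-6):
    if wrt == 'x':
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y, n = 1.2, -0.7, 3
lhs = x * partial(f, x, y, 'x') + y * partial(f, x, y, 'y')
assert abs(lhs - n * f(x, y)) < 1e-6          # Euler: x f_x + y f_y = n f

t = 2.5                                        # homogeneity itself
assert math.isclose(f(t * x, t * y), t**n * f(x, y))
print(lhs, n * f(x, y))
```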
2.4 Maxima & Minima
If f : A ⊂ Rn −→ R is a function, then f is said
to have a local or relative maximum (resp. local
or relative minimum) at a ∈ A if f(x) ≤ f(a)
(resp. f(x) ≥ f(a)) for all x in some open ball
B(a, r) of radius r centred on a. A function may
have several local maxima and local minima. A
local maximum and a local minimum are also each
termed a local extremum. If f has a maximum or
minimum over its entire domain, that value is said
to be a global or absolute maximum or global or ab-
solute minimum respectively.
The one variable case
Theorem 29.
1. (Necessary condition for the existence of a local extremum) Let f : (a, b) −→ R and suppose f has a local extremum (maximum or minimum) at x ∈ (a, b). If f is differentiable at x, then f′(x) = 0.
2. (Sufficient condition for the existence of a local extremum) Let f : (a, b) −→ R be Cn on (a, b). Suppose at c ∈ (a, b)

f′(c) = f′′(c) = · · · = f^(n−1)(c) = 0 but f^(n)(c) ≠ 0.

Then if n ∈ N is even,

f^(n)(c) > 0 ⇒ f(c) is a local minimum
f^(n)(c) < 0 ⇒ f(c) is a local maximum

If n is odd, there is no local extremum at c.
Definition 30. A point x0 at which f ′(x0) = 0
but which is not an extremal point, is called an
inflection point.
The multivariate case
Let f : U ⊂ Rn −→ R be a function whose
first-order partial derivatives exist at c ∈ U .
If ∇f(c) = 0, c is called a stationary point of
f . A stationary point is called a saddle point
if every ball B(c, r) contains a point where
f(x) > f(c) and a point where f(x) < f(c).
If f has second-order partial derivatives at a point c ∈ U, the Hessian of f is the n×n matrix

Hf(c) := [ ∂²f/∂x1²(c)     ∂²f/∂x1∂x2(c)   . . .   ∂²f/∂x1∂xn(c) ]
         [ ∂²f/∂x2∂x1(c)   ∂²f/∂x2²(c)     . . .   ∂²f/∂x2∂xn(c) ]
         [     ...              ...         . . .       ...       ]
         [ ∂²f/∂xn∂x1(c)   ∂²f/∂xn∂x2(c)   . . .   ∂²f/∂xn²(c)   ]

If the second-order partials are continuous at c, then Hf(c) is symmetric.
Theorem 31.
1. Let f : U ⊂ Rn −→ R have second-order partial derivatives which are continuous at a stationary point c ∈ U. Define the "quadratic form" Q : Rn −→ R by

Q(x) := (1/2) x^T Hf(c) x = (1/2) Σ_{i,j=1}^{n} ∂²f/∂xi∂xj(c) xi xj

where x = [x1 x2 . . . xn]^T. Then
(a) Q(x) > 0 for all x ≠ 0 ⇒ f has a local minimum at c.
(b) Q(x) < 0 for all x ≠ 0 ⇒ f has a local maximum at c.
(c) Q(x) takes both positive and negative values for x ≠ 0 ⇒ f has a saddle point at c.
2. As a special case of the above, take n = 2 and assume that the second-order partials of f are continuous at c. Then
(a) det Hf(c) > 0 and ∂²f/∂x1²(c) > 0 ⇒ f has a local minimum at c.
(b) det Hf(c) > 0 and ∂²f/∂x1²(c) < 0 ⇒ f has a local maximum at c.
(c) det Hf(c) < 0 ⇒ f has a saddle point at c.
(d) If det Hf(c) = 0, then f may have a local minimum or a local maximum or a saddle point at c.
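The n = 2 test is mechanical to apply. A sketch classifying stationary points at the origin from the Hessian (example functions and their Hessians chosen here):

```python
import numpy as np

# Second-derivative test for n = 2 via the Hessian determinant.
def classify(hessian):
    d = np.linalg.det(hessian)
    if d > 0:
        return "minimum" if hessian[0, 0] > 0 else "maximum"
    if d < 0:
        return "saddle"
    return "inconclusive"

H_min = np.array([[2.0, 0.0], [0.0, 2.0]])      # f = x^2 + y^2 at (0, 0)
H_max = np.array([[-2.0, 0.0], [0.0, -2.0]])    # f = -(x^2 + y^2)
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])  # f = x^2 - y^2

assert classify(H_min) == "minimum"
assert classify(H_max) == "maximum"
assert classify(H_saddle) == "saddle"
print("classification checks passed")
```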
2.5 Theorems on Integration

Let f : [a, b] −→ R be a function. We will say that f ∈ R if ∫_a^b f(x) dx exists. f is said to be absolutely integrable if ∫_a^b |f(x)| dx exists.

Theorem 32.
1. If f is bounded (i.e. |f(x)| ≤ C for some C ≥ 0 and all x ∈ [a, b]) with possibly only finitely many points of discontinuity, then f ∈ R.
2. If f ∈ R, m ≤ f(x) ≤ M for all x ∈ [a, b], and g : [m, M] −→ R is continuous, then g ∘ f ∈ R.
3. (Additivity) f, g ∈ R ⇒ f + g ∈ R and

∫_a^b (f + g)(x) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx

4. c ∈ R and f ∈ R ⇒ cf ∈ R.
5. (Monotonicity) f, g ∈ R and f ≤ g on [a, b] implies

∫_a^b f(x) dx ≤ ∫_a^b g(x) dx

6. f ∈ R on [a, b] and a < ξ < b implies

∫_a^ξ f(x) dx + ∫_ξ^b f(x) dx = ∫_a^b f(x) dx

7. f ∈ R on [a, b] and |f| ≤ K on [a, b] implies

|∫_a^b f(x) dx| ≤ K(b − a)

8. f, g ∈ R, then
(a) fg ∈ R
(b) |f| ∈ R and

|∫_a^b f(x) dx| ≤ ∫_a^b |f(x)| dx
Continuity of the integral The integral as a function of its upper limit is continuous: if f ∈ R and a ≤ x ≤ b, define

F(x) := ∫_a^x f(t) dt

Then F : [a, b] −→ R is (uniformly) continuous. If f is continuous at x ∈ [a, b], then F is differentiable at x and F′(x) = f(x). Thus, if f is continuous on [a, b], F is continuously differentiable. If

F(x) := ∫_x^b f(t) dt

then F′(x) = −f(x).
The fundamental theorem of calculus Suppose f : [a, b] −→ R, f ∈ R, has a primitive, viz. a function F : [a, b] −→ R such that F′ = f (one-sided derivatives at the endpoints). Then

∫_a^b f(x) dx = F(b) − F(a)
Change of variable theorem Let f : [a, b] −→ R be in R and α : [c, d] −→ [a, b]. If either
1. α is a strictly increasing continuous function (thus, α(c) = a, α(d) = b) with α′ ∈ R, or
2. α is continuously differentiable on [c, d] and f is continuous on α([c, d]),
then f ∘ α ∈ R and

∫_{α(c)}^{α(d)} f(x) dx = ∫_c^d f(α(t)) α′(t) dt.
Integration by parts Let f, g : [a, b] −→ R be differentiable on [a, b] and such that f′, g′ ∈ R. Then

∫_a^b f(x) g′(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f′(x) g(x) dx
Integration of vector-valued functions Let f1, f2, . . . , fn : [a, b] −→ R be functions in R and f : [a, b] −→ Rn be defined by f(x) := (f1(x), f2(x), . . . , fn(x)). Then ∫_a^b f(x) dx is said to exist (or to be in R) iff fi ∈ R for i = 1, 2, . . . , n. We then define

∫_a^b f(x) dx := (∫_a^b f1(x) dx, . . . , ∫_a^b fn(x) dx)
Integration of a sequence of functions

Theorem 33.
1. Consider a sequence {fn} of real-valued functions defined on [a, b]. Assume that fn ∈ R for n = 1, 2, . . . and that fn → f uniformly on [a, b]. Then f ∈ R and

lim_{n→∞} ∫_a^b fn(x) dx = ∫_a^b f(x) dx

2. A special case of the above is the following. If Σ_{n=1}^∞ fn(x) is a uniformly convergent series, then

∫_a^b Σ_{n=1}^∞ fn(x) dx = Σ_{n=1}^∞ ∫_a^b fn(x) dx.
Integral as a function of a parameter Let f : [a, b] × [c, d] −→ R be continuous and define F : [a, b] −→ R by F(x) := ∫_c^d f(x, t) dt. Then,
1. F is continuous.
2. (Interchanging limit and integral)

lim_{x→x0} ∫_c^d f(x, t) dt = ∫_c^d lim_{x→x0} f(x, t) dt = ∫_c^d f(x0, t) dt
Differentiation under the integral Let f : [a, b] × [c, d] −→ R and ∂f/∂x be continuous. If we define F : [a, b] −→ R by F(x) := ∫_c^d f(x, t) dt, then F is continuously differentiable and

F′(x) = ∫_c^d ∂f/∂x (x, t) dt
Differentiation of integrals with variable limits

Theorem 34.
1. Let f be as above. Define F : [a, b] × [c, d] × [c, d] −→ R by F(x, y, z) := ∫_y^z f(x, t) dt. Then

∂F/∂x = ∫_y^z ∂f/∂x (x, t) dt,    ∂F/∂y = −f(x, y),    ∂F/∂z = f(x, z)

2. As a useful special case of the above, let

g(x) := ∫_{φ(x)}^{ψ(x)} f(x, t) dt

In terms of the function F defined above, g(x) = F(x, φ(x), ψ(x)). Then

g′(x) = ∫_{φ(x)}^{ψ(x)} ∂f/∂x (x, t) dt − f(x, φ(x)) φ′(x) + f(x, ψ(x)) ψ′(x)
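The formula for g′(x) can be checked against a finite difference. A sketch with the simple example choices f(x, t) = xt, φ(x) = 0 and ψ(x) = x, for which g(x) = x · x²/2 = x³/2 in closed form:

```python
import math

# g(x) = integral of x*t for t in [0, x] = x^3 / 2
def g(x):
    return x**3 / 2

x, h = 1.1, 1e-6
numeric = (g(x + h) - g(x - h)) / (2 * h)      # finite-difference g'(x)

# Rule with variable limits:
#   integral of df/dx = integral of t dt over [0, x] = x^2/2
#   - f(x, phi(x)) * phi'(x) = 0
#   + f(x, psi(x)) * psi'(x) = (x * x) * 1 = x^2
leibniz = x**2 / 2 + x**2

assert abs(numeric - leibniz) < 1e-6
print(numeric, leibniz)
```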
2.6 Improper Integrals

Let f : [a, ∞) −→ R be a function.

Type 1a Suppose that ∫_a^x f(t) dt exists for every x ≥ a. Define Ia : [a, ∞) −→ R by Ia(x) := ∫_a^x f(t) dt. The function Ia(x) is said to be an improper integral of Type 1a. It converges if lim_{x→∞} Ia(x) = lim_{x→∞} ∫_a^x f(t) dt exists, and the limit is denoted by

∫_a^∞ f(t) dt := lim_{x→∞} ∫_a^x f(t) dt

If the integral does not converge, it is said to diverge. When the integrand is entirely non-negative (or non-positive), convergence is indicated by ∫_a^∞ f(t) dt < ∞ and divergence by ∫_a^∞ f(t) dt = ∞ (or −∞).
Type 1b Here Ib : (−∞, b] −→ R, Ib(x) := ∫_x^b f(t) dt, is said to converge if lim_{x→−∞} Ib(x) = lim_{x→−∞} ∫_x^b f(t) dt exists, and we write

∫_{−∞}^b f(t) dt := lim_{x→−∞} ∫_x^b f(t) dt

Type 1a and Type 1b integrals can be transformed into one another by the change of variable φ : t ↦ −t, giving ∫_x^b f(t) dt = ∫_{−b}^{−x} (f ∘ φ)(t) dt = ∫_{−b}^{−x} f(−t) dt. Consequently, the Type 1b integral converges iff the corresponding Type 1a integral does, in which case we may write ∫_{−∞}^b f(t) dt = ∫_{−b}^∞ f(−t) dt.
Type 2 If lim_{x→a+} f(x) does not exist but ∫_{a+ε}^b f(t) dt exists for every ε > 0 such that a < a + ε < b, define the improper integral of Type 2 to be the function Ia+ : (0, b − a) −→ R, Ia+(ε) := ∫_{a+ε}^b f(t) dt. If lim_{ε→0} Ia+(ε) exists, then the improper integral of Type 2 is said to converge and we write

∫_{a+}^b f(t) dt := lim_{ε→0} Ia+(ε)

The notation ∫_a^b f(t) dt is also commonly used, with the understanding that lim_{x→a+} f(x) does not exist. If lim_{x→a+} f(x) exists, then Ia+ is said to be proper even if f is discontinuous at a. An integral of Type 2 can be converted to one of Type 1a (or equivalently, of Type 1b) by a suitable change of variable and vice versa: for example, φ : t ↦ e^{1−t} transforms ∫_{0+}^1 f(t) dt to ∫_1^∞ f(e^{1−t}) e^{1−t} dt. Nevertheless, for calculations, it is useful to retain the Type 2 integrals as a separate class.
2.6.1 Convergence tests for Type 1a & 1b integrals

Comparison tests Let f, g : [a, ∞) −→ R be continuous, 0 ≤ f(x) ≤ g(x) for all x. Then

∫_a^∞ g(x) dx < ∞ ⇒ ∫_a^∞ f(x) dx < ∞

and

∫_a^∞ f(x) dx = ∞ ⇒ ∫_a^∞ g(x) dx = ∞

For Type 1b integrals, the domain of the functions becomes (−∞, b] and the integrals become ∫_{−∞}^b.
Tests for absolute convergence We first state them for Type 1a integrals. The integral ∫_a^∞ f(x) dx is said to converge absolutely if ∫_a^∞ |f(x)| dx < ∞. The integral converges conditionally if it converges but ∫_a^∞ |f(x)| dx = ∞.

Theorem 35. Let f : [a, ∞) −→ R be continuous. Then ∫_a^∞ |f(x)| dx < ∞ ⇒ ∫_a^∞ f(x) dx converges.
Limit tests for Type 1a integrals

Theorem 36.
1. Let f : [a, ∞) −→ R be continuous. Then, for any p > 1, lim_{x→∞} x^p f(x) converges ⇒ ∫_a^∞ |f(x)| dx < ∞.
2. With f as before, lim_{x→∞} x f(x) converges to a non-zero limit or diverges to ±∞ ⇒ ∫_a^∞ f(x) dx diverges. If the limit is 0, the test is inconclusive.

In the setting of Type 1b integrals, replace the limit in the test for convergence by lim_{x→−∞} (−x)^p f(x) and in the test for divergence by lim_{x→−∞} x f(x).
Tests for conditional convergence

Theorem 37. Let f : [a, ∞) −→ (0, ∞) be continuous and monotonically decreasing to 0 as x → ∞, i.e. lim_{x→∞} f(x) = 0. Then ∫_a^∞ f(x) sin x dx converges.

Corollary 38.
1. With the same function and hypotheses as above, ∫_a^∞ f(x) dx diverges (converges) iff ∫_a^∞ f(x) |sin x| dx diverges (resp. converges).
2. With the same hypotheses as before, ∫_a^∞ f(x) sin(αx + β) dx and ∫_a^∞ f(x) cos(αx + β) dx (α ≠ 0) both converge.
3. With the previous hypotheses, if n ∈ N, n > a/π, then |∫_{nπ}^∞ f(x) sin x dx| ≤ 2f(nπ).
2.6.2 Convergence tests for Type 2 integrals

We first consider Type 2a integrals.

Comparison tests Let f, g : (a, b] −→ R be continuous, 0 ≤ f(x) ≤ g(x) for all x. Then

∫_{a+}^b g(x) dx < ∞ ⇒ ∫_{a+}^b f(x) dx < ∞

and

∫_{a+}^b f(x) dx = ∞ ⇒ ∫_{a+}^b g(x) dx = ∞

The corresponding results for Type 2b integrals are obtained by replacing the domain of f by [a, b) and ∫_{a+}^b by ∫_a^{b−}.

Absolute convergence
1. Let f : (a, b] −→ R be continuous. Then ∫_{a+}^b |f(x)| dx < ∞ ⇒ ∫_{a+}^b f(x) dx converges.
2. The integral ∫_{a+}^b f(x) dx is said to converge absolutely if ∫_{a+}^b |f(x)| dx < ∞. The integral converges conditionally if it converges but ∫_{a+}^b |f(x)| dx = ∞.
Limit tests for Type 2a integrals

Theorem 39.
1. Let f : (a, b] −→ R be continuous. Then, for any 0 < p < 1, lim_{x→a+} (x − a)^p f(x) converges ⇒ ∫_{a+}^b |f(x)| dx < ∞.
2. With f as before, lim_{x→a+} (x − a) f(x) converges to a non-zero limit or diverges to ±∞ ⇒ ∫_{a+}^b f(x) dx diverges. If the limit is 0, then the test is inconclusive.

For testing the convergence of Type 2b integrals, take limits lim_{x→b−} (b − x)^p f(x) and, for divergence, lim_{x→b−} (b − x) f(x).
Tests for conditional convergence

Theorem 40. Let f : (a, b] −→ R be continuous, (x − a)² f(x) monotonically increasing, and lim_{x→a+} (x − a)² f(x) = 0. Then ∫_{a+}^b f(x) sin(1/(x − a)) dx converges.
Combination of types An integral which can be
written as a finite sum of integrals of the above
types, is said to converge if each of the sum-
mand integrals converges. It diverges if at least
one of the summand integrals diverges.
2.7 Uniform convergence & improper integrals

Consider the integral ∫_a^∞ f(x, t) dt, which is assumed to converge for each x in some interval [A, B] to a value denoted by F(x), i.e.

F(x) := ∫_a^∞ f(x, t) dt,    A ≤ x ≤ B    (2.1)

Let SR(x) := ∫_a^R f(x, t) dt be its "partial integral".

Definition 41. The integral (2.1) is said to converge uniformly to F(x) in [A, B] if given any ε > 0, there exists R′ depending only on ε and independent of x ∈ [A, B] such that

R > R′ ⇒ |F(x) − SR(x)| = |∫_a^∞ f(x, t) dt − ∫_a^R f(x, t) dt| = |∫_R^∞ f(x, t) dt| < ε

for all x ∈ [A, B].
The definitions are analogous for improper integ-
rals of the other types.
Theorem 42. Let f : [A, B] × [a, ∞) −→ R be continuous and suppose ∫_a^∞ f(x, t) dt converges uniformly to a function F : [A, B] −→ R. Then F is continuous.
Theorem 43. (Interchange of order of integration) Let f be as in Theorem 42 and let the improper integral occurring there converge uniformly to F(x) in [A, B]. Then

∫_A^B F(x) dx = ∫_A^B (∫_a^∞ f(x, t) dt) dx = ∫_a^∞ (∫_A^B f(x, t) dx) dt
Theorem 44. (Differentiation under the integral sign) Let f and F be as in Theorem 42. Suppose that ∂f/∂x (x, t) is continuous and that ∫_a^∞ ∂f/∂x (x, t) dt converges uniformly in [A, B]. Then F is differentiable and

F′(x) = ∫_a^∞ ∂f/∂x (x, t) dt
2.8 The Gamma Function

The function Γ : (0, ∞) −→ R defined by

Γ(x) := ∫_{0+}^∞ t^{x−1} e^{−t} dt

is called the gamma function. This improper integral is also commonly written as ∫_0^∞. Its properties are as follows.

Theorem 45.
1. Γ is differentiable to infinite order. In fact,

d^n/dx^n log Γ(x) = Σ_{k=0}^∞ (−1)^n (n − 1)! (x + k)^{−n},    n ≥ 2, x > 0.

2. Γ(x + 1) = xΓ(x). Consequently, the domain of definition of Γ can be extended to R ∖ {0, −1, −2, . . . }. Also, Γ(x + n) = (x + n − 1)(x + n − 2) · · · x Γ(x), where n ∈ N.
3. Γ(n + 1) = n!, n = 0, 1, 2, . . . . In particular, Γ(1) = 1.
4. Γ(1/2) = √π.
5. Γ(0+) = ∞ and lim_{x→∞} Γ(x) = ∞.
6. lim_{x→0+} x Γ(x) = 1.
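Several of these properties can be checked directly with the standard-library gamma function. A sketch:

```python
import math

# Gamma(n + 1) = n!
for n in range(7):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))

# Functional equation Gamma(x + 1) = x * Gamma(x)
x = 2.7
assert math.isclose(math.gamma(x + 1), x * math.gamma(x))

# Gamma(1/2) = sqrt(pi)
assert math.isclose(math.gamma(0.5), math.sqrt(math.pi))

# lim_{x -> 0+} x * Gamma(x) = 1, checked at a small sample point
assert abs(1e-6 * math.gamma(1e-6) - 1.0) < 1e-5
print("gamma checks passed")
```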
The beta function The Beta function, denoted B(x, y), is defined by

B : (0, ∞) × (0, ∞) −→ R
B(x, y) := ∫_{0+}^{1−} t^{x−1} (1 − t)^{y−1} dt

The integral is improper if x < 1 or y < 1 or both. With this understanding the integral is usually written as ∫_0^1.

Properties
1. B(x, y) = B(y, x).
2. B(x, y) = 2 ∫_0^{π/2} (sin t)^{2x−1} (cos t)^{2y−1} dt.
3. B(x, y) = ∫_0^∞ t^{x−1}/(1 + t)^{x+y} dt.
4. B(x, y) = Γ(x)Γ(y)/Γ(x + y).
5. B(x, y) = B(x + 1, y) + B(x, y + 1).
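Property 4 can be checked against a crude numerical evaluation of the defining integral. A sketch with x, y ≥ 1 (so the integral is proper; the sample values are chosen here):

```python
import math

# Midpoint-rule approximation of B(x, y) = integral of
# t^(x-1) * (1-t)^(y-1) over [0, 1].
def beta_integral(x, y, n=100_000):
    h = 1.0 / n
    return sum(((i + 0.5) * h)**(x - 1) * (1 - (i + 0.5) * h)**(y - 1)
               for i in range(n)) * h

def beta_gamma(x, y):
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

x, y = 2.5, 3.0
assert abs(beta_integral(x, y) - beta_gamma(x, y)) < 1e-6
assert math.isclose(beta_gamma(x, y), beta_gamma(y, x))          # property 1
assert math.isclose(beta_gamma(x, y),
                    beta_gamma(x + 1, y) + beta_gamma(x, y + 1))  # property 5
print(beta_gamma(x, y))
```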
2.9 Multiple Integrals

Theorem 46. Let R := [a, b] × [c, d] be a "rectangle" in R2 and f : R −→ R be a function that is integrable on R, i.e. assume that ∫∫_R f(x, y) dx dy exists (here "dx dy" represents integration with respect to the two-dimensional variable (x, y)). Suppose that for each y ∈ [c, d], I(y) := ∫_a^b f(x, y) dx exists. Then ∫_c^d I(y) dy exists and ∫_c^d I(y) dy = ∫∫_R f(x, y) dx dy. Thus,

∫∫_R f(x, y) dx dy = ∫_c^d [∫_a^b f(x, y) dx] dy

The integral on the right is called an iterated integral. An analogous theorem holds with y replaced by x.
Theorem 47. (Changing the order of integration) Let f : R −→ R, R as above, be continuous. Then f is integrable and ∫∫_R f(x, y) dx dy is given by

∫∫_R f(x, y) dx dy = ∫_c^d [∫_a^b f(x, y) dx] dy = ∫_a^b [∫_c^d f(x, y) dy] dx
Definition 48. If A ⊂ R2 is bounded (i.e. A is con-
tained in some open ball), it is said to have content
0 if for every ε > 0, there is a finite collection of
rectangles which contain A and are such that the
sum of their areas is < ε.
Theorem 49. Suppose f : R −→ R is bounded (i.e. f(R) is contained in some ball) on the rectangle R. If the set of discontinuities of f has content 0, then ∫∫_R f(x, y) dx dy exists.
Definition 50. Let φ, ψ : [a, b] −→ R be continuous. Define the following types of regions enclosed by the graphs of the two functions:

R_{φ,ψ} := {(x, y) : a ≤ x ≤ b, φ(x) ≤ y ≤ ψ(x)}

and

R^{φ,ψ} := {(x, y) : φ(y) ≤ x ≤ ψ(y), a ≤ y ≤ b}
Theorem 51. Let R be a region of type R_{φ,ψ} and f : R −→ R be bounded on R and continuous on the interior of R. Then ∫∫_R f(x, y) dx dy exists and can be evaluated by the iterated integral:

∫∫_R f(x, y) dx dy = ∫_a^b [∫_{φ(x)}^{ψ(x)} f(x, y) dy] dx

An analogous theorem is true for regions of type R^{φ,ψ}:

∫∫_R f(x, y) dx dy = ∫_a^b [∫_{φ(y)}^{ψ(y)} f(x, y) dx] dy

If a region is simultaneously of both types, i.e. of type R_{φ1,ψ1} and R^{φ2,ψ2}, then ∫∫_R f(x, y) dx dy can be evaluated by either of the two formulas above.
Triple Integrals Consider the 3-dimensional analogue of the region of type R_{φ,ψ}, viz.

V := {(x, y, z) ∈ R3 : (x, y) ∈ R, φ(x, y) ≤ z ≤ ψ(x, y)}

where R is some region in R2.

Theorem 52. If f : V −→ R is continuous, then

∫∫∫_V f(x, y, z) dx dy dz = ∫∫_R [∫_{φ(x,y)}^{ψ(x,y)} f(x, y, z) dz] dx dy
Change of Variable Suppose we have functions x, y : B ⊂ R2 −→ R defined by x = φ(u, v) and y = ψ(u, v). The Jacobian or Jacobian determinant of the mapping F : (u, v) ↦ (x, y) = (φ(u, v), ψ(u, v)) is

J(u, v) = | ∂φ/∂u  ∂φ/∂v |
          | ∂ψ/∂u  ∂ψ/∂v |

The notations ∂(φ, ψ)/∂(u, v) and JF(u, v) are also used.
Definition 53. A mapping α : V ⊂ Rn −→ Rn (n = 1, 2, 3) is said to be a coordinate transformation or diffeomorphism if
1. α is one-to-one.
2. Each component function αi of α = (α1, . . . , αn) has continuously differentiable partial derivatives.
3. Jα(u1, . . . , un) ≠ 0 for all u = (u1, . . . , un) ∈ V.
The image set α(V) is then open.
Special Coordinate Transformations
Linear coordinate transformations Let V := R2 and α(u, v) = (au + bv, cu + dv) for some a, b, c, d ∈ R satisfying ad − bc ≠ 0. Then Jα(u, v) = ad − bc and α(V) = R2. The extension to n = 3 is obvious.
Polar coordinates in R2 Let V := {(r, θ) : r > 0, 0 < θ < 2π} and

α(r, θ) = (α1(r, θ), α2(r, θ)) = (r cos θ, r sin θ)

Here Jα(r, θ) = r and α(V) = R2 ∖ {(x, 0) : x ≥ 0}.

Cylindrical coordinates in R3 Let V := {(r, θ, z) : r > 0, 0 < θ < 2π, z ∈ R} and

α(r, θ, z) = (α1(r, θ, z), α2(r, θ, z), α3(r, θ, z)) = (r cos θ, r sin θ, z)

Now Jα(r, θ, z) = r and α(V) = R3 ∖ {(x, 0, z) : x ≥ 0, z ∈ R}.

Spherical coordinates in R3 Let V := {(r, θ, φ) : r > 0, 0 < θ < 2π, 0 < φ < π} and

α(r, θ, φ) = (r cos θ sin φ, r sin θ sin φ, r cos φ)

Then Jα(r, θ, φ) = −r² sin φ and α(V) = R3 ∖ {(x, 0, z) : x ≥ 0, z ∈ R} (the excluded set contains the z-axis, since sin φ ≠ 0 on V).
Theorem 54. Let α be a coordinate transformation mapping the open set V ⊂ R² onto U := α(V) ⊂ R², and let f : U −→ R be integrable over U. Then

∫∫_U f(x, y) dx dy = ∫∫_V (f ∘ α)(u, v) |J_α(u, v)| du dv
                   = ∫∫_V f(α(u, v)) |J_α(u, v)| du dv
                   = ∫∫_V f(α₁(u, v), α₂(u, v)) |J_α(u, v)| du dv

Here x := α₁(u, v), y := α₂(u, v).

In the case of triple integrals (α now a coordinate transformation in R³),

∫∫∫_U f(x, y, z) dx dy dz = ∫∫∫_V (f ∘ α)(u, v, w) |J_α(u, v, w)| du dv dw
                          = ∫∫∫_V f(α(u, v, w)) |J_α(u, v, w)| du dv dw
                          = ∫∫∫_V f(α₁(u, v, w), α₂(u, v, w), α₃(u, v, w)) |J_α(u, v, w)| du dv dw
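As an illustrative numerical check of the change-of-variables formula (this sketch is ours, not from the text), integrate f(x, y) = x² + y² over the unit disc by passing to polar coordinates, where |J_α| = r and the exact answer is π/2:

```python
import math

def polar_integral(f, r_max, n=500):
    """Approximate the double integral of f over the disc of radius r_max
    via polar coordinates: sum f(r cos t, r sin t) * r * dr * dt,
    with |J_alpha(r, t)| = r (Theorem 54), using midpoints in r and t."""
    hr = r_max / n
    ht = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            total += f(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total

# f(x, y) = x^2 + y^2 over the unit disc: exact value pi/2
approx = polar_integral(lambda x, y: x * x + y * y, 1.0)
```

The half-line removed from α(V) has measure zero, so it does not affect the value of the integral.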
Area & Volume

1. Assume as before that R is the rectangle [a, b] × [c, d]. Let f : R −→ [0,∞) be a function and let S := {(x, y, z) ∈ R³ : (x, y) ∈ R, z = f(x, y)} be the associated surface (or graph) lying above the x-y plane. The ordinate set of f over R is

   {(x, y, z) ∈ R³ : (x, y) ∈ R, 0 ≤ z ≤ f(x, y)}

   It may be thought of as the "cylinder" under the graph of f. For each y ∈ [c, d], ∫_a^b f(x, y) dx is the area of the cross-section cut out by a plane parallel to the x-z plane; similarly for each x ∈ [a, b]. The volume of the ordinate set is

   ∫_a^b [ ∫_c^d f(x, y) dy ] dx
2. If R = R_{φ,ψ} for some continuous φ, ψ, then the area of R, denoted ∫∫_R dx dy, is given by

   ∫_a^b [ψ(x) − φ(x)] dx

   The area of the cross-section (at x) of the associated ordinate set of a function f over R is then given by

   ∫_{φ(x)}^{ψ(x)} f(x, y) dy

   and the volume of the ordinate set is given by

   ∫_a^b [ ∫_{φ(x)}^{ψ(x)} f(x, y) dy ] dx
3. A figure or surface of revolution is obtained by revolving a curve, or the graph of a function, about an axis. Let f(y, z) = c, ∇f ≠ 0, be a curve in the upper y-z plane in R³ (z ≥ 0). Then the surface of revolution S obtained by rotating the set {(y, z) : f(y, z) = c} about the y-axis is

   S = {(x, y, z) ∈ R³ : F(x, y, z) = c}

   where F(x, y, z) = f(y, √(x² + z²)). In particular, if z = φ(y), a ≤ y ≤ b (so that f(y, z) = z − φ(y) and c = 0), then

   S = {(x, y, z) ∈ R³ : x² + z² = φ(y)²}

   The surface area of S is

   area(S) = 2π ∫_a^b φ(y) √(1 + φ′(y)²) dy

   and the volume of the solid of revolution

   V := {(x, y, z) ∈ R³ : 0 ≤ x² + z² ≤ φ(y)²}

   obtained by rotating z = φ(y) as before is

   vol(V) = π ∫_a^b φ(y)² dy

   The volume of the solid of revolution V obtained by rotating the region between two curves z = φ(y) and z = ψ(y) (φ ≤ ψ) is

   vol(V) = π ∫_a^b (ψ(y)² − φ(y)²) dy
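A quick numerical sketch of the volume formula (ours, not from the text): rotating z = √(1 − y²), −1 ≤ y ≤ 1, about the y-axis produces the unit ball, whose volume is 4π/3.

```python
import math

def revolution_volume(phi, a, b, n=20000):
    """vol(V) = pi * integral of phi(y)^2 over [a, b], midpoint rule."""
    h = (b - a) / n
    return math.pi * sum(phi(a + (i + 0.5) * h) ** 2 for i in range(n)) * h

# Rotating z = sqrt(1 - y^2) on [-1, 1] gives the unit ball: volume 4*pi/3
vol = revolution_volume(lambda y: math.sqrt(max(0.0, 1 - y * y)), -1.0, 1.0)
```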
Definition 55. A set A ⊂ R² is simply connected if every simple closed curve in A encloses only points of A. Alternatively, in such a set, every simple closed curve can be shrunk continuously to a point.
Definition 56. A parametric or parametrised surface S in R³ is a set of points (x, y, z) satisfying the three equations

x = φ(u, v),  y = ψ(u, v),  z = ρ(u, v)

where φ, ψ and ρ are defined on a common domain R ⊂ R² which is simply connected and bounded by a simple closed curve (see p. 30). Alternatively, it is the range of the map

r : R ⊂ R² −→ R³,  r(u, v) = φ(u, v) i + ψ(u, v) j + ρ(u, v) k

with x, y, z as before. The surface is said to be simple if r is one-to-one.
Definition 57. A (regular) surface S may also be defined as the level set of a function f : U ⊂ R³ −→ R such that ∇f ≠ 0 on S and

S := f⁻¹(c)

for some c ∈ R. It may be possible to obtain such an expression by eliminating u and v from the parametric equations in Definition 56. By considering the function F := f − c, we may assume without loss of generality that c = 0.

If a surface (a two-dimensional object) has a "boundary", then the boundary will be a curve, since it has to be of one dimension less.
Definition 58. Let S be a parametric surface described as above by

r(u, v) = φ(u, v) i + ψ(u, v) j + ρ(u, v) k

The surface area of S is

area(S) := ∫∫_R ‖ ∂r/∂u × ∂r/∂v ‖ du dv

See Sec. 1.1 for the definition of ‖ ‖.

Definition 59. The unit normal to S attached at any point r(u, v) ∈ S is defined as follows:

(a) when S is given in parametric form, by

    n := (∂r/∂u × ∂r/∂v) / ‖ ∂r/∂u × ∂r/∂v ‖

    at all points (u, v) ∈ R where the cross product, sometimes called the fundamental vector product of the surface, is nonzero;

(b) when S is given as a level set f⁻¹(0), by

    n := ∇f(a) / ‖∇f(a)‖  for all a ∈ S.
Theorem 60. Suppose a surface S is given by the explicit equation z = f(x, y), (x, y) ∈ R. Then

(a) area(S) = ∫∫_R √(1 + (∂f/∂x)² + (∂f/∂y)²) dx dy

(b) Let θ ∈ [0, π/2) be the angle between the normal vector ∂r/∂u × ∂r/∂v to S and k. Then the expression for the area becomes

    area(S) = ∫∫_R (1/cos θ) dx dy

    Hence, if S and R are regions lying in two planes which are at an angle of θ to one another, R being the projection of S, then

    area(R) = area(S) cos θ
Theorem 61. Let the surface S be given in implicit form as f(x, y, z) = 0. Suppose that one of x, y, z, say z, can be written as a function of x, y, viz. z = φ(x, y), (x, y) ∈ R. Then, assuming that ∂f/∂z ≠ 0 on R,

area(S) = ∫∫_R [ (∂f/∂x)² + (∂f/∂y)² + (∂f/∂z)² ]^{1/2} / |∂f/∂z| dx dy
2.10 Vector identities

In just this section vectors and vector-valued functions will be denoted by bold letters for clarity and emphasis. Vectors will be 2- or 3-dimensional. We repeat the definition of the dot product and state its properties.

Dot Product The dot product or inner product of vectors a and b is

a · b := a₁b₁ + a₂b₂   or   a · b := a₁b₁ + a₂b₂ + a₃b₃

according as a and b are two- or three-dimensional and expressed in terms of their components. It is also denoted 〈a, b〉.

If a = (a₁, a₂) or (a₁, a₂, a₃), the magnitude or norm of a is, as in (1.1),

‖a‖ := (a₁² + a₂²)^{1/2}   or   ‖a‖ := (a₁² + a₂² + a₃²)^{1/2}

respectively. The norm is also sometimes denoted |a| in analogy with the absolute value of real or complex numbers.
Properties of the dot product

1. Given two vectors a and b in R² or R³, a · b = ‖a‖ ‖b‖ cos θ, where θ ∈ [0, π] is the angle between the two vectors.
2. (Commutativity) a · b = b · a
3. t(a · b) = (ta) · b = a · (tb), t ∈ R.
4. (Bilinearity) (sa + tb) · c = s(a · c) + t(b · c) and a · (sb + tc) = s(a · b) + t(a · c), s, t ∈ R.
5. If i, j and k are the standard unit vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1) respectively in R³, then i · i = j · j = k · k = 1 and i · j = j · k = k · i = 0. In R², there would only be i = (1, 0) and j = (0, 1), with analogous properties.
6. a · a = ‖a‖²
7. a · b = 0 and a, b ≠ 0 ⇒ a and b are perpendicular or orthogonal to one another.
8. (Cauchy-Buniakowsky-Schwarz inequality) |a · b| ≤ ‖a‖ ‖b‖. Equality holds iff b = ca for some scalar c.
Cross Product Given a and b in R³, the cross product of a and b, denoted a×b (also sometimes a ∧ b), is defined to be the vector

a×b := (‖a‖ ‖b‖ sin θ) e

where as before θ is the angle between a and b and e is a unit vector orthogonal to both a and b directed so that {a, b, e} form a right-handed system or are positively oriented, i.e. if a = (a₁, a₂, a₃), b = (b₁, b₂, b₃) and e = (e₁, e₂, e₃), then the determinant

| a₁ a₂ a₃ |
| b₁ b₂ b₃ | > 0
| e₁ e₂ e₃ |

If the determinant were negative, the vectors would be said to form a left-handed system or to be negatively oriented. The terminology can be applied to any three (resp. two) linearly independent vectors in R³ (resp. R²).
Properties of the cross product

1. (Anti-commutativity) a×b = −b×a
2. t(a×b) = (ta)×b = a×(tb), t ∈ R
3. (Bilinearity) For s, t ∈ R,
   (sa + tb)×c = s(a×c) + t(b×c)
   a×(sb + tc) = s(a×b) + t(a×c)
4. i×i = j×j = k×k = 0 and i×j = k, j×k = i, k×i = j
5. If a = (a₁, a₂, a₃) and b = (b₁, b₂, b₃), then

   a×b = | i  j  k  |
         | a₁ a₂ a₃ |
         | b₁ b₂ b₃ |

   The expression on the RHS is to be expanded as if it were a determinant to obtain the components along i, j and k.
6. The area of a parallelogram with sides a and b is ‖a×b‖.

Scalar triple product The scalar triple product or box product of a, b and c is defined to be the scalar a · (b×c) or simply a · b×c, often denoted by [a b c]. We have

1. [a b c] = [b c a] = [c a b]
2. [a b c] = | a₁ a₂ a₃ |
             | b₁ b₂ b₃ |
             | c₁ c₂ c₃ |
where a = (a1, a2, a3), b = (b1, b2, b3)
and c = (c1, c2, c3).
3. The scalar triple product a · b×c is the "oriented" volume of the parallelepiped with edges a, b, c: if the vectors {a, b, c} are positively oriented, the volume is positive, whereas if they are negatively oriented, the volume has a negative sign attached to it. The usual (unoriented) volume is given by the absolute value |[a b c]|.
Vector triple product This is the vector a×(b×c). We have the relations

a×(b×c) = (a · c)b − (a · b)c
(a×b)×c = (a · c)b − (b · c)a
(a×b) · (c×d) = (a · c)(b · d) − (a · d)(b · c)   [Lagrange identity]
(a×b)×(c×d) = [a c d]b − [b c d]a = [a b d]c − [a b c]d
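The identities above are easy to verify numerically. A small sketch (ours, not from the text) with component-wise helpers:

```python
def dot(a, b):
    """Dot product of same-length tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product in R^3, via the determinant expansion (property 5)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(t, a):
    return tuple(t * x for x in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

a, b, c, d = (1.0, 2.0, 3.0), (-4.0, 0.5, 2.0), (2.0, -1.0, 5.0), (0.0, 1.0, -2.0)

# a x (b x c) = (a.c) b - (a.b) c
lhs1 = cross(a, cross(b, c))
rhs1 = sub(scale(dot(a, c), b), scale(dot(a, b), c))

# Lagrange identity: (a x b).(c x d) = (a.c)(b.d) - (a.d)(b.c)
lhs2 = dot(cross(a, b), cross(c, d))
rhs2 = dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)
```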
Reciprocal systems of vectors Two sets of vectors (a₁, a₂, a₃) and (b₁, b₂, b₃) are said to be reciprocal systems if aᵢ · bⱼ = δᵢⱼ, where δᵢⱼ is the Kronecker symbol (see p. 3).

Theorem 62. Assume that the vectors (a₁, a₂, a₃) are linearly independent (or equivalently that [a₁ a₂ a₃] ≠ 0). The two sets of vectors {a₁, a₂, a₃} and {b₁, b₂, b₃} are reciprocal iff

b₁ = a₂×a₃ / [a₁ a₂ a₃],  b₂ = a₃×a₁ / [a₁ a₂ a₃],  b₃ = a₁×a₂ / [a₁ a₂ a₃]
Vector differentiation Let f : U ⊂ R −→ Rⁿ (n = 2, 3) be a function. Such functions are called vector-valued functions or vector functions. If

lim_{h→0} [f(t + h) − f(t)] / h

exists, then f is said to be differentiable at t ∈ U and the limit is said to be the derivative of f at t. It is variously denoted by f′(t), df/dt, ḟ(t) or Df(t), etc. Writing f(t) = x(t) i + y(t) j (in R²) or x(t) i + y(t) j + z(t) k (in R³), the derivative is given by

df/dt = (dx/dt) i + (dy/dt) j                in R²
df/dt = (dx/dt) i + (dy/dt) j + (dz/dt) k    in R³
Properties of vector differentiation Let f, g and h be differentiable vector functions defined on the same domain in R. Then

1. (d/dt)(f(t) + g(t)) = df/dt + dg/dt
2. (d/dt)(f(t) · g(t)) = f(t) · dg/dt + df/dt · g(t)
3. (d/dt)(f(t)×g(t)) = f(t)×(dg/dt) + (df/dt)×g(t)
4. If φ : U ⊂ R −→ R is a differentiable function whose domain is that of f, define the function φf : U −→ Rⁿ (n = 2, 3) by (φf)(t) = φ(t)f(t). Then

   (d/dt)(φf) = φ(t)(df/dt) + (dφ/dt)f(t)

5. (Derivative of the scalar triple product)

   (d/dt)[f(t) g(t) h(t)] = [f(t) g(t) dh/dt] + [f(t) dg/dt h(t)] + [df/dt g(t) h(t)]

6. (Derivative of the vector triple product)

   (d/dt)(f(t)×(g(t)×h(t))) = f(t)×(g(t)×(dh/dt)) + f(t)×((dg/dt)×h(t)) + (df/dt)×(g(t)×h(t))
Partial derivatives If f : U ⊂ R² −→ Rⁿ (n = 2, 3), then the partial derivatives at a point (a, b) ∈ U are defined in terms of the limits, if they exist:

∂f/∂x (a, b) = lim_{h→0} [f(a + h, b) − f(a, b)] / h
∂f/∂y (a, b) = lim_{k→0} [f(a, b + k) − f(a, b)] / k

If

f(x, y) = (f₁(x, y), f₂(x, y), . . . , fₙ(x, y))

then

∂f/∂x = (∂f₁/∂x, ∂f₂/∂x, . . . , ∂fₙ/∂x)

and similarly for ∂f/∂y. The definition can be extended to more than two variables in the obvious way. The basic properties of partial differentiation are as follows. Suppose f, g are vector functions on a common domain U. Then
1. (∂/∂x)(f · g) = f · (∂g/∂x) + (∂f/∂x) · g. A similar formula holds with respect to y.
2. (∂/∂x)(f×g) = f×(∂g/∂x) + (∂f/∂x)×g. A similar formula holds with respect to y.
Gradient, divergence & curl

The gradient

Definition 63. If φ : U ⊂ R³ −→ R is a function whose partial derivatives exist at a point a = (a₁, a₂, a₃) ∈ U, then the gradient of φ at a, denoted by ∇φ(a), is

∇φ(a) = (∂φ/∂x)(a) i + (∂φ/∂y)(a) j + (∂φ/∂z)(a) k

∇φ can be thought of as the "differential operator"

∇ := (∂/∂x) i + (∂/∂y) j + (∂/∂z) k

acting on φ.

The gradient can also be defined on R². If u is a unit vector in R³ and ∇φ(a) ≠ 0, then ∇φ(a) · u is the "projection" of ∇φ(a) on u, or the component of ∇φ(a) in the direction of u. It is also the directional derivative (p. 17) of φ in the direction of u. It measures the rate of change of φ at a in the direction of u. Hence, the rate of change is maximum when u = ∇φ(a)/‖∇φ(a)‖, and its magnitude is then equal to ‖∇φ(a)‖.

Scalar & vector fields
1. A function F : U ⊂ R³ −→ R is called a scalar field.
2. A function F : U ⊂ R³ −→ R³ is called a vector field. In terms of components,

   F(x, y, z) = F₁(x, y, z) i + F₂(x, y, z) j + F₃(x, y, z) k

   for suitable functions Fᵢ : R³ −→ R (i = 1, 2, 3). We will say that the vector field F is differentiable if each Fᵢ is. The gradient ∇φ : U −→ R³, a 7→ ∇φ(a), is a vector field.
Definition 64. Let F be a differentiable three-dimensional vector field. The divergence of F is the scalar field

∇ · F := ∂F₁/∂x + ∂F₂/∂y + ∂F₃/∂z

It can be viewed as the "formal" dot product of ∇ := (∂/∂x) i + (∂/∂y) j + (∂/∂z) k and F = F₁ i + F₂ j + F₃ k. A common notation for the divergence is div F.

Caution We can define a formal dot product F · ∇ by (F · ∇)φ := F₁(∂φ/∂x) + F₂(∂φ/∂y) + F₃(∂φ/∂z). Clearly ∇ · F ≠ F · ∇, since the LHS is a scalar field (it acts on points of R³) while the RHS is an operator acting on scalar fields.
The curl Let F be a vector field as above. Then the curl of F at a is the vector field defined by

(curl F)(a) = (∂F₃/∂y (a) − ∂F₂/∂z (a)) i + (∂F₁/∂z (a) − ∂F₃/∂x (a)) j + (∂F₂/∂x (a) − ∂F₁/∂y (a)) k

Alternatively, the curl may be viewed as the "cross product":

curl F = ∇×F = | i      j      k     |
               | ∂/∂x   ∂/∂y   ∂/∂z  |
               | F₁     F₂     F₃    |

The curl is also denoted rot F.

The curl is also defined for two-dimensional vector fields, by

∂F₂/∂x − ∂F₁/∂y

which makes it a scalar field.
Properties Let φ and ψ be differentiable scalar fields, and F and G differentiable vector fields on the same domain in R³. Let c ∈ R be arbitrary.

1. (Linearity of the gradient) ∇(cφ + ψ) = c∇φ + ∇ψ
2. (Linearity of the divergence) div(cF + G) = c(div F) + div G
3. (Linearity of the curl) curl(cF + G) = c(curl F) + curl G

For the remaining properties we assume that φ, F and G have continuous second-order partial derivatives.

4. div(φF) = (∇φ · F) + φ(div F). Here φF is the vector field defined by (φF)(a) := φ(a)F(a).
5. curl(φF) = (∇φ)×F + φ(curl F)
6. div(F×G) = G · curl F − F · curl G
7. curl(F×G) = (G · ∇)F − G(div F) − (F · ∇)G + F(div G)
8. ∇(F · G) = (G · ∇)F + (F · ∇)G + (G×curl F) + (F×curl G)
9. ∇²φ := div(∇φ) = ∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z². The symbol ∇²φ is also denoted ∆φ and is called the Laplacian of φ. The equation ∆φ = 0 is Laplace's equation.
10. curl(∇φ) = 0
11. div(curl F) = 0
12. curl(curl F) = ∇(div F) − ∆F, where ∆F = (∆F₁, ∆F₂, ∆F₃).
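Property 10 (curl ∇φ = 0) can be sanity-checked numerically. The sketch below (ours, not from the text) differentiates an exact gradient field by central finite differences; the computed curl should vanish to within discretisation error:

```python
import math

h = 1e-5  # finite-difference step (an assumption of this sketch)

def partial(F, i, p):
    """Central-difference partial derivative of scalar function F
    with respect to coordinate i at the point p (a list of floats)."""
    q1, q2 = list(p), list(p)
    q1[i] += h
    q2[i] -= h
    return (F(q1) - F(q2)) / (2 * h)

def curl(F, p):
    """Numerical curl of a vector field F at p, component by component."""
    Fi = lambda i: (lambda q: F(q)[i])
    return (partial(Fi(2), 1, p) - partial(Fi(1), 2, p),
            partial(Fi(0), 2, p) - partial(Fi(2), 0, p),
            partial(Fi(1), 0, p) - partial(Fi(0), 1, p))

# F := grad(phi) for phi(x, y, z) = x^2 y + sin z; by property 10, curl F = 0
grad_phi = lambda q: (2 * q[0] * q[1], q[0] ** 2, math.cos(q[2]))
c = curl(grad_phi, [0.7, -1.3, 0.4])
```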
2.11 Line, Surface & Volume Integrals

Paths or Curves in Rⁿ A map

γ : [a, b] ⊂ R −→ Rⁿ (n = 2, 3),  γ(t) = (γ₁(t), · · · , γₙ(t))

is said to be a continuous (resp. Cᵏ, 1 ≤ k ≤ ∞) path or curve if its component functions γᵢ are each continuous (resp. Cᵏ). It is piecewise-continuous or piecewise-Cᵏ if [a, b] can be partitioned into finitely many sub-intervals on each of which γ is continuous or Cᵏ and the only discontinuities are jump discontinuities; as a special case, it may consist of finitely many continuous or Cᵏ curves joined end to end. The curve is said to be closed if γ(a) = γ(b). It is said to be simple closed or a Jordan curve if γ is one-to-one on (a, b], i.e. the only self-intersection of the curve occurs when γ(a) = γ(b). The domain of γ is sometimes called a parameter domain. The parameter domain is not always explicitly mentioned but is assumed to be given. Sometimes, for convenience, the same symbol γ represents not only the curve (which by the definition is a function) but also the range γ([a, b]) of the function.
Tangents Let γ : [a, b] ⊂ R −→ Rⁿ (n = 2, 3) be a differentiable curve. Then the tangent to the curve is the function γ̇ : [a, b] −→ Rⁿ defined by

γ̇(t) = (γ̇₁(t), · · · , γ̇ₙ(t))

where the dot is traditional notation for differentiation with respect to the parameter t, which is sometimes thought of as representing 'time'.
Line Integrals Let γ : [a, b] −→ Rⁿ (n = 2, 3) be a piecewise-smooth curve with image C := γ([a, b]). If f : C −→ Rⁿ is bounded, then the line integral of f along γ is the integral (assuming it exists)

∫_γ f · dγ := ∫_a^b f(γ(t)) · γ̇(t) dt   (dot product in the integrand)

An alternative notation (with n = 3, for example) is

∫_γ f₁(x, y, z) dx + f₂(x, y, z) dy + f₃(x, y, z) dz

where x := γ₁(t), y := γ₂(t), z := γ₃(t). If γ is closed, then the line integral is sometimes denoted by ∮_γ f · dγ, or, if the curve is traversed in the "anticlockwise" direction, by ↺∮_γ f · dγ.
Theorem 65.

1. (Linearity) For constants a, b and functions f, g,

   ∫_γ (af + bg) · dγ = a ∫_γ f · dγ + b ∫_γ g · dγ

2. If γ and ρ are "concatenated" (or joined) curves, i.e. the end-point of γ is the starting point of ρ, then, denoting the resulting curve by γ + ρ,

   ∫_{γ+ρ} f · dγ = ∫_γ f · dγ + ∫_ρ f · dγ
Two piecewise-smooth curves γ : [a, b] −→ Rⁿ and ρ : [c, d] −→ Rⁿ are said to be equivalent if there exists an onto (i.e. surjective) function u : [c, d] −→ [a, b] such that its derivative u′ ≠ 0 on [c, d] and ρ = γ ∘ u. If u′ > 0 on [c, d], the curves ρ and γ are said to have the same orientation and u is said to be orientation-preserving, whereas if u′ < 0 on [c, d], they are said to have opposite orientations and u is said to be orientation-reversing.

Example. Define u : [0, 1] −→ [0, 1] by u(t) = 1 − t. If γ(t) is a curve on [0, 1], then (γ ∘ u)(t) = γ(1 − t) is an equivalent curve of opposite orientation to γ. In fact, both have the same range or "trace" (viz. γ([0, 1])), except that γ ∘ u covers it moving from γ(1) to γ(0).
Theorem 66. (Line integrals under a change of parameter) Let γ and ρ be equivalent curves. Then

∫_γ f · dγ = ∫_ρ f · dρ   if γ and ρ have the same orientation;
∫_γ f · dγ = −∫_ρ f · dρ  if γ and ρ have opposite orientations.
Line integrals with respect to arc length Let γ : [a, b] −→ Rⁿ be a C¹ curve. Then the arc length along γ is given by

s(t) = ∫_a^t ‖γ̇(u)‖ du,  so that  ṡ(t) = ‖γ̇(t)‖

If φ is a scalar field defined and bounded on the range Γ of γ, then the line integral of φ with respect to arc length along γ is defined by

∫_Γ φ ds := ∫_a^b (φ ∘ γ)(t) ṡ(t) dt = ∫_a^b φ(γ(t)) ‖γ̇(t)‖ dt

if the integral on the RHS exists.
Definition 67. An open subset U of Rⁿ is said to be connected if any two points in it can be joined by a polygonal line, i.e. a continuous curve consisting of line segments joined end to end.

Theorem 68. Let D be a connected open set in Rⁿ (n = 2, 3) and φ : D −→ R be a differentiable scalar field with continuous gradient ∇φ. If γ is a piecewise smooth curve in D and γ(a), γ(b) are two points on it, then

∫_γ ∇φ · dγ = φ(γ(b)) − φ(γ(a))

where γ is the portion of the curve starting from γ(a) and ending at γ(b). Thus, the line integral of a gradient is independent of the curve joining the endpoints γ(a) and γ(b) (this is usually termed path-independence). Moreover, the line integral of a gradient along a piecewise-smooth closed curve is 0.
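Path-independence of gradient integrals is easy to observe numerically. In this sketch (ours, not from the text), the same gradient field is integrated along two different curves from (0, 0) to (1, 2); both values agree with φ(γ(b)) − φ(γ(a)):

```python
def line_integral(f, gamma, dgamma, a, b, n=20000):
    """Midpoint-rule approximation of the line integral of f along gamma."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(u * v for u, v in zip(f(gamma(t)), dgamma(t))) * h
    return total

phi = lambda p: p[0] ** 2 * p[1] + p[1] ** 3                         # a potential
grad_phi = lambda p: (2 * p[0] * p[1], p[0] ** 2 + 3 * p[1] ** 2)    # its gradient

# two different piecewise smooth curves from (0, 0) to (1, 2)
seg = lambda t: (t, 2 * t);       dseg = lambda t: (1.0, 2.0)        # straight segment
par = lambda t: (t, 2 * t * t);   dpar = lambda t: (1.0, 4 * t)      # parabolic arc

i1 = line_integral(grad_phi, seg, dseg, 0.0, 1.0)
i2 = line_integral(grad_phi, par, dpar, 0.0, 1.0)
expected = phi((1.0, 2.0)) - phi((0.0, 0.0))   # phi(end) - phi(start) = 10
```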
Theorem 69. Let f : D ⊂ Rⁿ −→ Rⁿ (n = 2, 3) be a continuous vector field on the connected open domain D. Suppose the line integral ∫_γ f · dγ is path-independent in D (i.e. with respect to any pair of points of D). Fix a point x₀ ∈ D and define a scalar field φ : D −→ R by

φ(x) = ∫_γ f · dγ

where γ is any piecewise smooth curve joining x₀ to x and lying in D (i.e. the range of γ is contained in D). Then ∇φ exists and ∇φ(x) = f(x) for all x ∈ D. Under these circumstances, φ is called a potential function (for f) and f is said to be the gradient of the potential (function) φ.
Theorem 70. Let f : D ⊂ Rⁿ −→ Rⁿ (n = 2, 3) be a continuous vector field, D connected and open. Then the following statements are equivalent:

1. f is the gradient of some potential φ.
2. ∫_γ f · dγ is path-independent: γ can be replaced by any other piecewise smooth curve in D, provided the two curves have the same starting point and the same ending point.
3. ∫_γ f · dγ = 0 for every closed piecewise smooth curve γ in D.

Theorem 71. Suppose f : D ⊂ R² −→ R², f(x, y) = P(x, y) i + Q(x, y) j, is a continuously differentiable vector field on the open, connected and simply connected set D. Then f is a gradient on D (i.e. f = ∇φ for some scalar field φ) iff

∂P/∂y = ∂Q/∂x
Surface Integrals Let S = r(R) be a parametric surface (see Definition 56), r : R ⊂ R² −→ R³ being differentiable. Let f : S −→ R be a bounded scalar field (bounded as a function). Then the surface integral of f over S is

∫∫_S f dS := ∫∫_R f(r(u, v)) ‖ ∂r/∂u × ∂r/∂v ‖ du dv

if the double integral on the RHS exists. If f ≡ 1 in the above formula, we obtain the expression for surface area:

∫∫_S dS = ∫∫_R ‖ ∂r/∂u × ∂r/∂v ‖ du dv

Cf. Definition 58.
2.12 Green’s, Stokes’ &Gauss’ theorems
Green’s Theorem 1 This version of the theorem
is for regions in R2 bounded by piecewise
smooth Jordan curves.
Theorem 72. Let P,Q : U ⊂ R2 −→ R be
continuously differentiable scalar fields. Let γ
be a piecewise smooth Jordan curve and R de-
note the region in R2 consisting of all points
on the curve and enclosed by it. Assume that
R ⊂ U . Then∫∫R
(∂Q
∂x− ∂P
∂y
)dxdy = ∨©
∫γ
P dx+Qdy
γ is traversed in the anticlockwise direction.
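Green's theorem can be verified numerically on a simple region. In this sketch (ours, not from the text), P = −xy² and Q = x²y on the unit square, so ∂Q/∂x − ∂P/∂y = 4xy and both sides equal 1:

```python
def midpoint(f, a, b, n=400):
    """Midpoint rule for a one-dimensional integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

P = lambda x, y: -x * y * y
Q = lambda x, y: x * x * y

# area integral of dQ/dx - dP/dy = 4xy over the unit square (exact value 1)
area_side = midpoint(lambda x: midpoint(lambda y: 4 * x * y, 0.0, 1.0), 0.0, 1.0)

# boundary integral of P dx + Q dy, unit square traversed anticlockwise
bottom = midpoint(lambda x: P(x, 0.0), 0.0, 1.0)    # y = 0, left to right
right  = midpoint(lambda y: Q(1.0, y), 0.0, 1.0)    # x = 1, upwards
top    = midpoint(lambda x: -P(x, 1.0), 0.0, 1.0)   # y = 1, right to left
left   = midpoint(lambda y: -Q(0.0, y), 0.0, 1.0)   # x = 0, downwards
boundary = bottom + right + top + left
```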
Green’s Theorem 2
Definition 73. (Multiply connected do-
mains) Suppose γ, γ1, γ2, . . . , γn are piece-
wise smooth closed Jordan curves satisfying the
properties:
1. No pair of curves intersects.
2. γ1, γ2, . . . , γn lie in the interior of the re-
gion bounded by γ.
3. Any γi lies outside the region bounded by
γj if i 6= j.
If R and Ri are the regions bounded by γ and
the γi, then define R := Rr (∪ni=1Ri). The set
R is said to be multiply or n-connected with
boundary curves γ, γ1, . . . , γn.
Theorem 74. If P,Q : U ⊂ R2 −→ R are
continuously differentiable scalar fields, R ⊂ Uis a multiply connected region as above, then∫∫R
(∂Q
∂x− ∂P
∂y
)dxdy = ∨©
∫γ
P dx+Qdy
+
n∑i=1
∨©
∫γi
P dx+Qdy
Stokes’ Theorem
Theorem 75. Let S = r(R) be a smooth simple
parametric surface, R ⊂ R2, bounded by a
piecewise smooth Jordan curve γ. Suppose also
that r is C2 on some open set U which con-
tains R and its boundary. Let Γ(t) := r(γ(t))
be the curve bounding S. If F = P i+Q j+Rk,
P,Q,R continuously differentiable, is a vector
field on S, then∫∫S
(curl F) · n dS =
∫Γ
F · dΓ
where n is the usual unit vector normal to S
(p.26).
Gauss’ or the Divergence theorem The 3-
dimensional analogue of a closed curve will
be called a closed or compact surface. A
2-dimensional surface has two normals at
each point. Call the one pointing outside the
surface as the “outward normal”. This might
be the standard unit normal n defined on
p. 26 or −n. Regardless of the choice made,
denote it by n.
Definition 76. A surface S is orientable if the
normal vector field defined as follows:
N : S −→ R3
a 7→ ∇f(a)
||∇f(a)||if (S = f−1(0))
a 7→(∂r∂u×
∂r∂v
)(a)∣∣∣∣( ∂r
∂u×∂r∂v
)(a)∣∣∣∣
(if S = r(R), is simple)
is smooth, i.e. its component functions are
smooth. In the latter case when S is a simple
parametric surface, a = r(ua, va) for unique
(ua, va) in the parameter domain R.
Theorem 77. Let V be a “solid” in R3 bounded
by an orientable closed surface S. If F is a con-
tinuously differentiable vector field on V, then∫∫∫V
(div F) dxdy dz =
∫∫S
F · n dS
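The divergence theorem can be checked numerically on a unit cube. In this sketch (ours, not from the text), F = (x², y², z²), so div F = 2x + 2y + 2z, and both the volume integral and the total outward flux equal 3:

```python
def midpoint2(f, n=200):
    """Midpoint rule for a double integral of f over the unit square."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

F = lambda x, y, z: (x * x, y * y, z * z)
divF = lambda x, y, z: 2 * x + 2 * y + 2 * z

# volume integral of div F over the unit cube (midpoint rule in 3D)
n3 = 40
h = 1.0 / n3
vol = sum(divF((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
          for i in range(n3) for j in range(n3) for k in range(n3)) * h ** 3

# outward flux of F through the six faces of the cube
flux = (midpoint2(lambda y, z: F(1, y, z)[0]) - midpoint2(lambda y, z: F(0, y, z)[0])
      + midpoint2(lambda x, z: F(x, 1, z)[1]) - midpoint2(lambda x, z: F(x, 0, z)[1])
      + midpoint2(lambda x, y: F(x, y, 1)[2]) - midpoint2(lambda x, y: F(x, y, 0)[2]))
```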
————————–
Chapter 3
Fourier series
Trigonometric series An infinite series of the form

a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)     (3.1)

is called a trigonometric series.
Fourier series The trigonometric series is called the Fourier series of a function f : [−π, π] −→ R if, for n = 0, 1, 2, . . .,

aₙ = (1/π) ∫_{−π}^{π} f(x) cos nx dx,   bₙ = (1/π) ∫_{−π}^{π} f(x) sin nx dx     (3.2)

This is sometimes expressed by writing

f(x) ∼ a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)

and the series is said to be generated by or associated with f. The numbers aₙ and bₙ are called the Fourier coefficients of f. More generally, if the domain of f is [−l, l], then by considering the Fourier series associated with f(lx/π), the Fourier series of f(x) becomes

a₀/2 + ∑_{n=1}^∞ (aₙ cos(nπx/l) + bₙ sin(nπx/l))     (3.3)

and the formulas for the Fourier coefficients become, for n = 0, 1, 2, . . .,

aₙ = (1/l) ∫_{−l}^{l} f(x) cos(nπx/l) dx,   bₙ = (1/l) ∫_{−l}^{l} f(x) sin(nπx/l) dx     (3.4)
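The coefficient formulas (3.4) can be evaluated numerically. In this sketch (ours, not from the text), for f(x) = x on [−π, π] the known coefficients are aₙ = 0 and bₙ = 2(−1)ⁿ⁺¹/n:

```python
import math

def fourier_coeffs(f, l, nmax, n=20000):
    """Fourier coefficients a_0..a_nmax and b_1..b_nmax of f on [-l, l]
    via the midpoint rule applied to the integrals (3.4)."""
    h = 2 * l / n
    xs = [-l + (i + 0.5) * h for i in range(n)]
    a = [sum(f(x) * math.cos(k * math.pi * x / l) for x in xs) * h / l
         for k in range(nmax + 1)]
    b = [0.0] + [sum(f(x) * math.sin(k * math.pi * x / l) for x in xs) * h / l
                 for k in range(1, nmax + 1)]
    return a, b

# f(x) = x on [-pi, pi]: a_n = 0, b_n = 2(-1)^(n+1)/n
a, b = fourier_coeffs(lambda x: x, math.pi, 3)
```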
Not every trigonometric series is the Fourier series of a function. However, we have the result:

Theorem 78.

1. If the series (3.1) converges uniformly in [−π, π] to a function f : [−π, π] −→ R, then it is the Fourier series of f.
2. In particular, if p > 1 and lim_{n→∞} nᵖaₙ < ∞ and lim_{n→∞} nᵖbₙ < ∞, then the trigonometric series above is a Fourier series.
Orthogonality Functions f, g : [a, b] −→ R are said to be orthogonal if ∫_a^b f(x)g(x) dx = 0. The function f is said to be normalised on [a, b] if ∫_a^b f²(x) dx = 1. The following is an important list of orthogonal pairs of functions, together with two standard normalisation integrals.

∫_{−l}^{l} cos(mπx/l) cos(nπx/l) dx = 0     (m ≠ n; m, n = 0, ±1, ±2, . . .)
∫_{−l}^{l} sin(mπx/l) sin(nπx/l) dx = 0     (m ≠ n; m, n = 0, ±1, ±2, . . .)
∫_{−l}^{l} cos(mπx/l) sin(nπx/l) dx = 0     (m, n = 0, ±1, ±2, . . .)
(1/l) ∫_{−l}^{l} sin²(nπx/l) dx = 1         (n = 1, 2, . . .)
(1/l) ∫_{−l}^{l} cos²(nπx/l) dx = 1         (n = 1, 2, . . .)
Fourier series of even & odd functions f : [−l, l] −→ R is said to be even (resp. odd) if f(−x) = f(x) (resp. f(−x) = −f(x)).

Definition 79. Let f : [−l, l] −→ R be an even function. Then the associated Fourier series is called the Fourier cosine series:

f(x) ∼ a₀/2 + ∑_{n=1}^∞ aₙ cos(nπx/l)     (3.5)

a₀ = (2/l) ∫_0^l f(x) dx,   aₙ = (2/l) ∫_0^l f(x) cos(nπx/l) dx
Definition 80. Let f : [−l, l] −→ R be an odd function. Then the associated Fourier series is called the Fourier sine series:

f(x) ∼ ∑_{n=1}^∞ bₙ sin(nπx/l)     (3.6)

bₙ = (2/l) ∫_0^l f(x) sin(nπx/l) dx
Bessel’s inequality Assume that f : [0, π] −→ Ris piecewise continuous.
For the cosine series Let f be even and
have the associated cosine series (3.5). Then,
for N = 1, 2, . . .
a20
2+
N∑n=1
a2n ≤
2
π
∫ π
0
f2(x) dx
For the sine series Let f be odd and have
the associated sine series (3.6). Then, for N =
1, 2, . . .
N∑n=1
b2n ≤2
π
∫ π
0
f2(x) dx
Property of the Fourier coefficients With f as above,

aₙ → 0 as n → ∞ if f is even;   bₙ → 0 as n → ∞ if f is odd.
Half-range series & periodic extensions Suppose f : [0, l] −→ R. Then f can be extended to f̃ : [−l, l] −→ R in two ways: f̃(x) = f(−x) or f̃(x) = −f(−x) on [−l, 0] (in either case f̃(x) = f(x) on [0, l]). In the first situation, the extension of f to [−l, l] is even, whereas in the second, it is odd.

Definition 81. A function F : R −→ R is periodic with period l if F(x + l) = F(x) for all x ∈ R.

The periodic extension of an even function f defined on [−l, l] to a function F defined on R is obtained by repeatedly shifting the graph of f by 2l units to the right and to the left. To periodically extend an odd function, essentially the same procedure is followed, but using the graph of f on the open interval (−l, l) and defining F(x) = 0 at the points {nl : n = ±1, ±2, . . .}. Depending on which extension is chosen, F has either an associated Fourier cosine series or a sine series.
Convergence of Fourier series

Definition 82. A function f : [a, b] −→ R is piecewise smooth if f is piecewise continuous and differentiable with piecewise continuous derivative on [a, b]. (At the endpoints, the one-sided derivatives are taken.)

Theorem 83.

1. Let f : (−l, l) −→ R be piecewise smooth. At each x ∈ (−l, l), the Fourier series of f on (−l, l) converges:

   a₀/2 + ∑_{n=1}^∞ (aₙ cos(nπx/l) + bₙ sin(nπx/l)) = [f(x+) + f(x−)] / 2

   the Fourier coefficients aₙ and bₙ being given by (3.4). Hence, at an interior point of continuity x of f, the series converges to f(x).

2. If F is the periodic extension of f with period 2l, then by the result for f, at each x ∈ (−∞, ∞),

   a₀/2 + ∑_{n=1}^∞ (aₙ cos(nπx/l) + bₙ sin(nπx/l)) = [F(x+) + F(x−)] / 2

   The Fourier coefficients aₙ and bₙ are again given by (3.4), and at a point of continuity x of F, the series converges to F(x).

Corollary 84. The Fourier series of f : (−l, l) −→ R at x = ±l converges to [f(−l+) + f(l−)] / 2. That of F at x = nl, n = ±1, ±2, . . ., converges to [F(nl+) + F(nl−)] / 2.
Differentiation of a Fourier series

Theorem 85. Let f : [−π, π] −→ R be continuous and such that f(−π) = f(π). Suppose that f′(x) is piecewise continuous on (−π, π). Then the Fourier series representation of f,

f(x) = a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)

with aₙ and bₙ as in (3.2), is differentiable at those points x ∈ (−π, π) where f′′(x) exists, and

f′(x) = ∑_{n=1}^∞ n(−aₙ sin nx + bₙ cos nx)     (3.7)

At points x where f′′(x) does not exist but the left-handed and right-handed derivatives of f′(x) exist, the series (3.7) converges to [f′(x−) + f′(x+)] / 2. The interval [−π, π] can be replaced by [−l, l] and the series above by (3.3), etc.
Integration of a Fourier series

Theorem 86. Let f : [−π, π] −→ R be piecewise continuous on (−π, π). If

f(x) ∼ a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)

without the convergence of the series being assumed, then for all x ∈ [−π, π],

∫_{−π}^x f(t) dt = (a₀/2)(x + π) + ∑_{n=1}^∞ (1/n)[aₙ sin nx − bₙ(cos nx + (−1)ⁿ⁺¹)]
Riemann-Lebesgue theorem Suppose f : [a, b] −→ R is continuous except for finitely many jump discontinuities, the jumps being of finite magnitude. Then

lim_{x→∞} ∫_a^b f(t) sin xt dt = lim_{x→∞} ∫_a^b f(t) cos xt dt = 0

In particular, the result is true if x is replaced by n ∈ N and lim_{x→∞} by lim_{n→∞}.
————————–
Chapter 4
Integral Transforms
4.1 Laplace Transforms
Let f : (0,∞) −→ R be a function. Its Laplace transform, denoted L{f(t)}, is defined by

L{f(t)} = ∫_0^∞ f(t) e^{−st} dt     (4.1)

provided the improper integral (improper with respect to both endpoints) converges for some value of s ∈ R. The following notation is used to emphasise the role of L{f(t)} as a function of s:

F(s) := L{f(t)}

F and f are sometimes referred to as the generating function and the determining function respectively.
Theorem 87. If the integral (4.1) converges at s = s₀, then it converges for s > s₀. If it diverges at s = s₀, then it diverges for s < s₀.

Corollary 88. lim_{s→∞} F(s) = 0. Hence a polynomial which is not the zero polynomial cannot be the Laplace transform of any function.

As a consequence of Theorem 87, there are three possibilities:

1. Integral (4.1) converges for all s.
2. Integral (4.1) diverges for all s.
3. There exists s_c such that (4.1) converges for s > s_c and diverges for s < s_c. Then s_c is called the abscissa of convergence.

In the first two cases, we set s_c = −∞ and s_c = ∞ respectively.
Theorem 89.

1. If the integral (4.1) converges absolutely at s = s₀, then it converges absolutely for s ≥ s₀.
2. If (4.1) converges for s = s₀, it converges uniformly for s₀ ≤ s ≤ R, where R ∈ R is arbitrary.

As before, an abscissa of absolute convergence s_a can be defined, which is such that absolute convergence occurs for s > s_a and absolute divergence for s < s_a. Clearly, s_c ≤ s_a.
Laplace transform of an infinite series

Theorem 90. Let f(t) = ∑_{n=0}^∞ aₙtⁿ converge for all t > 0. Suppose the coefficients aₙ satisfy the condition

|aₙ| ≤ Cαⁿ/n!

for all n sufficiently large, where C, α > 0. Then

L{f(t)} = ∑_{n=0}^∞ aₙ L{tⁿ} = ∑_{n=0}^∞ aₙ n!/sⁿ⁺¹     (s > α)
Differentiating a Laplace transform

Theorem 91. Let

F(s) := L{f(t)} = ∫_0^∞ f(t) e^{−st} dt     (s > s_c)

Then

(d/ds) F(s) = −∫_0^∞ t f(t) e^{−st} dt     (s > s_c)

Corollary 92. F : (s_c, ∞) −→ R is infinitely differentiable and

F⁽ⁿ⁾(s) = ∫_0^∞ (−t)ⁿ f(t) e^{−st} dt     (s > s_c)

Integrating a Laplace transform

Theorem 93. Let F(s) = L{f(t)} for s > s_a and suppose ∫_0^∞ |f(t)|/t dt < ∞. Then

∫_s^∞ F(u) du = L{f(t)/t}     (s > s_a)
Linearity of the Laplace transform If f, g : (0,∞) −→ R are Laplace transformable for s > s₀, and a, b ∈ R are arbitrary, then

L{(af + bg)(t)} = aL{f(t)} + bL{g(t)}     (s > s₀)

Linear change of variables Let λ > 0.

1. L{f(λt)} = (1/λ) F(s/λ)
2. If f(t) = 0 on −∞ < t < 0, then L{f(t − λ)} = e^{−λs} L{f(t)}.
3. F(λs) = L{(1/λ) f(t/λ)} and F(s − λ) = L{e^{λt} f(t)}.
Differentiating the determining function f

Theorem 94. Let f : [0,∞) −→ R be continuously differentiable and satisfy the growth condition lim_{t→∞} f(t)e^{−st} = 0 for all s > s_c. Then

L{f′(t)} = sF(s) − f(0)     (s > s_c)

Corollary 95. Let f be Cⁿ and lim_{t→∞} f⁽ᵏ⁾(t)e^{−st} = 0 for all s > s_c and k = 0, 1, . . . , n − 1. Then

L{f⁽ⁿ⁾(t)} = sⁿF(s) − ∑_{k=1}^n f⁽ᵏ⁻¹⁾(0) sⁿ⁻ᵏ     (s > s_c)
Periodic functions Let f : [0,∞) −→ R be periodic, i.e. there exists T > 0 such that f(t + T) = f(t) for all t ≥ 0. Then

L{f(t)} = F(s) = [1/(1 − e^{−sT})] ∫_0^T f(t)e^{−st} dt
The convolution theorem The convolution of two functions f, g : (0, ∞) −→ R, denoted f ∗ g, is the function defined by
f ∗ g (t) := ∫_0^t f(x) g(t − x) dx   (4.2)
if the integral exists: for example, if f and g are piecewise continuous. The integral (4.2) may also be improper at the upper limit, in which case the integral is actually ∫_0^{t−}. The properties of the convolution “product” are as follows.
1. c(f ∗ g) = (cf) ∗ g = f ∗ (cg), c a constant.
2. (Commutativity) f ∗ g = g ∗ f.
3. (Associativity) (f ∗ g) ∗ h = f ∗ (g ∗ h).
4. (Distributivity) (f + g) ∗ h = f ∗ h + g ∗ h.
Theorem 96. Suppose L{f(t)} and L{g(t)} converge absolutely at s = a. Then
L{f ∗ g (t)} = L{f(t)} L{g(t)}   (s ≥ a)
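A numerical illustration of Theorem 96, under the assumed test functions f(t) = e^{−t} and g(t) = e^{−2t}, whose convolution has transform 1/((s+1)(s+2)):

```python
import math

def laplace(f, s, upper=20.0, n=2_000):
    # trapezoidal approximation of ∫_0^upper f(t) e^{-st} dt
    h = upper / n
    total = 0.5 * (f(0.0) + f(upper) * math.exp(-s * upper))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.exp(-s * t)
    return total * h

def conv(f, g, t, n=200):
    # (f*g)(t) = ∫_0^t f(x) g(t-x) dx, as in (4.2)
    if t == 0.0:
        return 0.0
    h = t / n
    total = 0.5 * (f(0.0) * g(t) + f(t) * g(0.0))
    for k in range(1, n):
        x = k * h
        total += f(x) * g(t - x)
    return total * h

f = lambda t: math.exp(-t)
g = lambda t: math.exp(-2.0 * t)
s = 3.0
lhs = laplace(lambda t: conv(f, g, t), s)
rhs = laplace(f, s) * laplace(g, s)
assert abs(lhs - rhs) < 1e-3                       # L{f*g} = L{f} L{g}
assert abs(lhs - 1.0 / ((s + 1) * (s + 2))) < 1e-3
```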
Rational functions are Laplace transforms
Theorem 97. Let R(s) be a rational function, i.e. R(s) = P(s)/Q(s), where P, Q are polynomials. Suppose lim_{s→∞} R(s) = 0. Then R is the Laplace transform of some determining function.
Theorem 98. (Power series in 1/s) Let
F(s) = ∑_{n=0}^∞ a_n / s^{n+1}   (s > r for some r)
be the Laplace transform of a function f with s_a ≤ r. Then
f(t) = ∑_{n=0}^∞ a_n t^n / n!   (0 ≤ t < ∞)
Laplace transform of the Dirac delta Let δ(t − a) be the Dirac delta “function” (or “pulse”) at t = a. Then its Laplace transform is given by
F(s) = L{δ(t − a)} = e^{−as}
Hence, if a = 0, L{δ(t)} = 1 and L^{−1}[1] = δ(t). This last statement does not contradict the property stated in Corollary 88 because the Dirac delta is not a function in the conventional sense.
Uniqueness
Theorem 99. A function F (s) cannot be the
Laplace transform of more than one continuous
function f(t).
The inversion formula If F(s) is a given function, then the inverse transform of F, if it exists, denoted by L^{−1}[F], is a function f(t) such that L{f(t)} = F(s).
Theorem 100. Let f : (0, ∞) −→ R be a continuous function whose Laplace transform converges to F(s) absolutely for s > s_a. Then
L^{−1}[F(s)] = f(t)
4.2 Fourier Transforms
A function f : R −→ C is absolutely integrable if ∫_{−∞}^∞ |f(t)| dt < ∞, i.e. the improper integral converges.
Definition 101. The Fourier transform of an absolutely integrable function f : R −→ C, denoted by f̂, F[f(t)] or F, is defined to be the function
f̂ : R −→ C,   f̂(ω) := ∫_{−∞}^∞ f(t) e^{−iωt} dt   (4.3)
The function f is most commonly taken to be real-valued, f : R −→ R.
Caution The Fourier transform has variant definitions in the literature. These can be summarised as follows. If
f̂(ω) = (1/a) ∫_{−∞}^∞ f(t) e^{ibωt} dt
then the common choices for the pair (a, b) are (√(2π), ±1), (1, ±√(2π)), (1, ±1). (4.3) above corresponds to (a, b) = (1, −1).
Basic Properties Let f : R −→ R be absolutely integrable. Then
1. |f̂(ω)| ≤ ∫_{−∞}^∞ |f(t)| dt. Thus, f̂ is bounded.
2. f̂ is continuous.
3. lim_{ω→−∞} f̂(ω) = 0 = lim_{ω→+∞} f̂(ω).
4. If the function f is even, then
f̂(ω) = 2 ∫_0^∞ f(t) cos ωt dt
and hence f̂ is even, whereas if f is odd, then
f̂(ω) = −2i ∫_0^∞ f(t) sin ωt dt
and f̂ is thus odd.
5. The Fourier transform is linear: (cf + g)^ = c f̂ + ĝ, c ∈ R.
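Property 4 can be exercised numerically. The even test function f(t) = e^{−|t|}, whose transform is known to be 2/(1 + ω²), is an assumption made for illustration:

```python
import math

def ft_even(f, w, upper=40.0, n=40_000):
    # property 4: f̂(ω) = 2 ∫_0^∞ f(t) cos ωt dt for even f (trapezoidal rule)
    h = upper / n
    total = 0.5 * (f(0.0) + f(upper) * math.cos(w * upper))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.cos(w * t)
    return 2.0 * total * h

f = lambda t: math.exp(-abs(t))
w = 1.3
assert abs(ft_even(f, w) - 2.0 / (1.0 + w * w)) < 1e-4
```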
Definition 102. (Convolution) Given two functions f, g : R −→ C, their convolution, denoted f ∗ g, is defined to be the function
f ∗ g : R −→ C,   f ∗ g (t) = ∫_{−∞}^∞ f(x) g(t − x) dx
if the integral exists. Cf. (4.2).
The convolution theorem Let f, g : R −→ C be functions which are piecewise continuous, absolutely integrable and bounded. Then f ∗ g is absolutely integrable and (f ∗ g)^ = f̂ ĝ. Convolution is commutative, associative and distributive.
Definition 103. The cross-correlation ρ_fg of functions f, g is defined to be ρ_fg := g ∗ f̃, where f̃(x) is the complex conjugate of f(−x). The autocorrelation is the function ρ_ff.
Definition 104. (Fourier sine & cosine transforms) For a function f which is even or odd, we define
f̂_s(ω) := ∫_0^∞ f(t) sin ωt dt   (Fourier sine transform)
f̂_c(ω) := ∫_0^∞ f(t) cos ωt dt   (Fourier cosine transform)
Theorem 105. (Transform of the conjugate) Let f : R −→ C have Fourier transform f̂, and let g(t) be the complex conjugate of f(t). Then ĝ(ω) is the complex conjugate of f̂(−ω).
Theorem 106. (Shift in the time & frequency domain) Suppose f : R −→ C is absolutely integrable and a ∈ R. Then the translated function f_a(t) := f(t − a) and e^{iat} f(t) are also absolutely integrable and
f̂_a(ω) = e^{−iaω} f̂(ω)   (time shift)
[e^{iat} f(t)]^(ω) = f̂(ω − a)   (frequency shift)
Corollary 107. (Modulation) Let f : R −→ C have Fourier transform f̂. Then, denoting φ(t) := f(t) cos ω_0 t and ψ(t) := f(t) sin ω_0 t, we have
φ̂(ω) = ½ [f̂(ω + ω_0) + f̂(ω − ω_0)]
ψ̂(ω) = (i/2) [f̂(ω + ω_0) − f̂(ω − ω_0)]
Theorem 108. (Scaling) Let f : R −→ C be Fourier transformable. If c ∈ R, c ≠ 0, and φ(x) := f(cx), then
φ̂(ω) = |c|^{−1} f̂(c^{−1} ω)
Theorem 109. (Uniqueness of the Fourier transform) Let f, g : R −→ C be absolutely integrable with Fourier transforms f̂, ĝ. If f̂ = ĝ on R, then f = g at all the common points of continuity of f and g.
Theorem 110. (Fourier transform of the time derivative) Suppose f : R −→ C is n times continuously differentiable and lim_{t→±∞} f^(k)(t) = 0 for k = 0, 1, . . . , n − 1 (here f^(0) := f). Then the Fourier transform of f^(n) exists and
[f^(n)]^(ω) = (iω)^n f̂(ω)
Theorem 111. (Frequency derivative of the Fourier transform) If f : R −→ C is differentiable and f and f′ are both absolutely integrable, then
1. (f′)^(ω) = iω f̂(ω).
2. If both f(t) and t f(t) are absolutely integrable, then f̂ is differentiable and
(f̂)′(ω) = −i [t f(t)]^(ω)
Theorem 112. (Fourier transform of the indefinite integral) Let f be continuous and absolutely integrable with Fourier transform f̂. Suppose lim_{t→∞} ∫_{−∞}^t f(τ) dτ = 0. Then for all ω ≠ 0,
(∫_{−∞}^t f(τ) dτ)^(ω) = f̂(ω) / (iω)
Theorem 113. (Fourier transform of the Dirac delta) Define δ_a(t) := δ(t − a). Then
δ̂_a(ω) = e^{−iaω}
Moreover, if 1 denotes the constant function f(t) ≡ 1, then 1̂(ω) = 2π δ(ω).
Theorem 114. (Parseval’s identity) Let f, g : R −→ C be piecewise smooth, absolutely integrable and square-integrable (i.e. ∫_{−∞}^∞ |f(t)|² dt < ∞; similarly for g). Then
∫_{−∞}^∞ f(t) g*(t) dt = (1/2π) ∫_{−∞}^∞ f̂(ω) ĝ*(ω) dω
where * denotes complex conjugation.
Theorem 115. (Plancherel’s identity) Let f have Fourier transform f̂. Then
∫_{−∞}^∞ |f(t)|² dt = (1/2π) ∫_{−∞}^∞ |f̂(ω)|² dω
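Plancherel's identity can be spot-checked with the pair f(t) = e^{−|t|}, f̂(ω) = 2/(1 + ω²), a standard transform pair used here as an illustrative assumption; both sides equal 1:

```python
import math

f = lambda t: math.exp(-abs(t))
fhat = lambda w: 2.0 / (1.0 + w * w)     # known transform of e^{-|t|}

def integral(g, a, b, n=200_000):
    # composite trapezoidal rule
    h = (b - a) / n
    return h * (0.5 * g(a) + sum(g(a + k * h) for k in range(1, n)) + 0.5 * g(b))

time_energy = integral(lambda t: f(t) ** 2, -30.0, 30.0)
freq_energy = integral(lambda w: fhat(w) ** 2, -500.0, 500.0) / (2.0 * math.pi)
assert abs(time_energy - 1.0) < 1e-6
assert abs(time_energy - freq_energy) < 1e-3
```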
Theorem 116. (Fourier sine & cosine transforms of time-derivatives) Let f : [0, ∞) −→ R be continuously differentiable with second derivative f′′(t) piecewise continuous on each subinterval [0, b]. Suppose also that lim_{t→∞} f(t) = 0 = lim_{t→∞} f′(t). Then
(f′′)^_c = −ω² f̂_c − f′(0)
(f′′)^_s = −ω² f̂_s + ω f(0)
Definition 117. (Finite Fourier sine & cosine transforms) Let f : [0, π] −→ R be piecewise continuous. Then
f̂_s(n) := ∫_0^π f(t) sin nt dt   (n = 1, 2, . . .)   (finite Fourier sine transform)
f̂_c(n) := ∫_0^π f(t) cos nt dt   (n = 0, 1, 2, . . .)   (finite Fourier cosine transform)
Definition 118. (The Cauchy principal value) Let f : R −→ C be a function. Then
lim_{R→∞} ∫_{−R}^R f(t) dt
provided the limit exists, is the Cauchy principal value of ∫_{−∞}^∞ f(t) dt (which may not exist as an improper integral). The integral is also said to exist in the Cauchy sense.
Theorem 119. (Fourier inversion) Let f : R −→ C be absolutely integrable and piecewise smooth with Fourier transform f̂. Then
(1/2π) ∫_{−∞}^∞ f̂(ω) e^{iωt} dω = ½ [f(t+) + f(t−)]
Here, the Cauchy principal value of the integral is taken on the LHS for each t.
4.3 Z-Transforms
(One-sided) Z-Transforms Let f : (−∞, ∞) −→ C be a function and T > 0 be fixed. Then the Z-transform of the sequence {f(nT) : n = 0, 1, . . .} is defined to be
Z{f(nT)} := Z{f_n} := F(z) := ∑_{n=0}^∞ f(nT) z^{−n} ∈ C   (4.4)
when the series is convergent. The radius of convergence R (see Definition (178)) is determined by applying the ratio test or the root test. For example, applying the root test to (4.4), we obtain that the series converges if
|z| > lim sup_{n→∞} |f(nT)|^{1/n} = R
T is called the sampling time. Note that F : {z ∈ C : |z| > R} −→ C. Sometimes Z-transforms are defined by setting T = 1 in (4.4).
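A quick numerical illustration of definition (4.4) with T = 1 and the assumed sequence f(n) = aⁿ, whose Z-transform is the geometric series z/(z − a), convergent for |z| > |a|:

```python
def ztransform_partial(seq, z, terms=200):
    # partial sum of the series (4.4) with T = 1
    return sum(seq(n) * z ** (-n) for n in range(terms))

a = 0.5
seq = lambda n: a ** n        # radius of convergence R = |a|
z = 2.0
closed_form = z / (z - a)
assert abs(ztransform_partial(seq, z) - closed_form) < 1e-12
```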
Properties of the Z-transform
Let f_n := f(nT) and g_n := g(nT) be two sequences such that Z{f_n} = F(z) and Z{g_n} = G(z) for some functions F, G with radii of convergence R_f and R_g respectively.
Linearity Given any a, b ∈ C,
Z{a f_n + b g_n} = a Z{f_n} + b Z{g_n}
and the radius of convergence of the LHS is R := max{R_f, R_g}.
Shifting The right shift f(nT − kT) of f(nT) has transform
Z{f_{n−k}} = z^{−k} Z{f_n} if f(−nT) = 0 for all n ∈ N, and
Z{f_{n−k}} = z^{−k} Z{f_n} + ∑_{n=1}^k f(−nT) z^{−(k−n)} otherwise.
The left shift f(nT + kT) has transform
Z{f_{n+k}} = z^k Z{f_n} − ∑_{n=0}^{k−1} f(nT) z^{k−n} = z^k [Z{f_n} − ∑_{n=0}^{k−1} f(nT) z^{−n}]
As a special case, if k = 1, we have Z{f_{n+1}} = z [Z{f_n} − f(0)].
Time-scaling If a ∈ C, then
Z{a^{±nT} f_n} = F(a^{∓T} z) = ∑_{n=0}^∞ f(nT) (a^{∓T} z)^{−n}
Periodic sequences Let f_n be periodic with period N: f_{n+N} = f_n, and assume that f(−nT) = 0 for all n ∈ N. Define the first block of periodic values by
f_1(nT) := f(nT) for 0 ≤ n ≤ N − 1, and f_1(nT) := 0 for n ∉ {0, 1, . . . , N − 1}.
Then
Z{f(nT)} = z^N / (z^N − 1) · Z{f_1(nT)}
Multiplication by n, nT & (nT)^k Let n, k ≥ 0 be integers. Then
Z{n f(nT)} = −z dF/dz
Z{nT f(nT)} = −Tz dF/dz
Z{(nT)^k f(nT)} = −Tz (d/dz) Z{(nT)^{k−1} f(nT)}
Convolution Let f_n = f(nT) and g_n = g(nT) be sequences with Z-transforms Z{f_n} and Z{g_n} respectively. Then the convolution of f_n and g_n, denoted f_n ∗ g_n, is the sequence
f_n ∗ g_n := ∑_{k=0}^n f(kT) g(nT − kT)
Theorem 120. (The convolution theorem) If f_n and g_n are two sequences as above, then
Z{f_n ∗ g_n} = Z{f_n} Z{g_n}
The convergence of the LHS is valid in the region |z| > max{R_f, R_g}. Just as for the Laplace and the Fourier transforms, convolution is commutative, associative and distributive.
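Theorem 120 can be confirmed numerically for two assumed geometric sequences; truncation at 300 terms keeps the series tails far below the tolerance:

```python
def zt(seq, z):
    # Z-transform of a finite (truncated) sequence
    return sum(x * z ** (-n) for n, x in enumerate(seq))

N = 300
f = [0.5 ** n for n in range(N)]
g = [0.25 ** n for n in range(N)]
# discrete convolution: (f*g)_n = sum_{k=0}^n f_k g_{n-k}
fg = [sum(f[k] * g[n - k] for k in range(n + 1)) for n in range(N)]
z = 2.0
assert abs(zt(fg, z) - zt(f, z) * zt(g, z)) < 1e-9
```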
The initial value theorem Let the sequence f_n have Z-transform F(z) in some region of convergence. Then
f_n = lim_{z→∞} z^n [F(z) − ∑_{k=0}^{n−1} f_k z^{−k}]
In particular, f_0 = lim_{z→∞} F(z).
The final value theorem If the sequence f_n has Z-transform F(z) in some region of convergence, then lim_{n→∞} f_n = lim_{z→1} (z − 1) F(z), provided the limit on the LHS exists.
Transform of the complex conjugate If f_n is a sequence with Z-transform F(z), then the transform of the conjugate sequence f̄_n is the complex conjugate of F(z̄), valid in the same region of convergence as that of f_n.
Transform of a product Let f_n and g_n be sequences having Z-transforms F(z) = Z{f_n} and G(z) = Z{g_n}, valid in the regions |z| > R_f and |z| > R_g respectively. Then
Z{f(nT) g(nT)} = ∑_{n=0}^∞ f(nT) g(nT) z^{−n} = (1/2πi) ∫_C F(w) G(z/w) dw/w
where C is a simple closed contour enclosing the origin and oriented anticlockwise. The region of convergence is |z| > R_f R_g.
Transforms with parameters If f_n = f(nT, a), a ∈ R, has Z-transform F(z, a), then
Z{∂/∂a f(nT, a)} = ∂/∂a F(z, a)
Z{lim_{a→a_0} f(nT, a)} = lim_{a→a_0} F(z, a)
Z{∫_{a_0}^{a_1} f(nT, a) da} = ∫_{a_0}^{a_1} F(z, a) da
assuming the integrals are finite.
Inverse transforms. Power series method
Let F(z) be the Z-transform of a sequence f_n := f(nT). If F is analytic in |z| > R (including at z = ∞ in the extended complex plane), then f_n can be recovered as the coefficients of the Taylor series expansion of F(z) in terms of z^{−1}. In particular, suppose F is a rational function (quotient of polynomials):
F(z) = (a_0 + a_1 z^{−1} + · · · + a_n z^{−n}) / (b_0 + b_1 z^{−1} + · · · + b_n z^{−n})
     = f(0) + f(T) z^{−1} + f(2T) z^{−2} + · · · + f(nT) z^{−n} + · · ·
Then the sequence f_n is obtained from the relations
a_0 = f(0) b_0
a_1 = f(0) b_1 + f(T) b_0
...
a_n = ∑_{k=0}^n f((n − k)T) b_k
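The relations above amount to long division, and can be unwound to recover the sequence. The example below is an assumption made for illustration: F(z) = z/(z − 1/2) = 1/(1 − 0.5 z^{−1}), whose inverse is f(n) = (1/2)ⁿ:

```python
def inverse_z_rational(a, b, n_terms):
    # long division: a_n = sum_{k=0}^n f(n-k) b_k  =>  solve for f(n)
    a = a + [0.0] * (n_terms - len(a))
    b = b + [0.0] * (n_terms - len(b))
    f = []
    for n in range(n_terms):
        acc = sum(b[k] * f[n - k] for k in range(1, n + 1))
        f.append((a[n] - acc) / b[0])
    return f

# F(z) = 1/(1 - 0.5 z^{-1}):  numerator [1], denominator [1, -0.5]
seq = inverse_z_rational([1.0], [1.0, -0.5], 10)
assert all(abs(seq[n] - 0.5 ** n) < 1e-12 for n in range(10))
```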
Inverse transforms. Partial fractions
Suppose F(z) is a rational function and the Z-transform of a sequence f_n. Then F can be written as a sum: F(z) = F_1(z) + F_2(z) + · · · + F_n(z) + · · · . In this case Z^{−1}{F(z)} = ∑_{n=1}^∞ Z^{−1}{F_n(z)} and f_n is the sum of the nth terms of the inverse transforms Z^{−1}{F_n(z)}.
Now suppose
F(z) = a_1/(z − c) + a_2/(z − c)² + · · · + a_n/(z − c)^n
Then the a_j’s are given by the relations
a_n = (z − c)^n F(z) |_{z=c}
a_{n−1} = d/dz [(z − c)^n F(z)] |_{z=c}
...
a_k = 1/(n − k)! · d^{n−k}/dz^{n−k} [(z − c)^n F(z)] |_{z=c}
...
a_1 = 1/(n − 1)! · d^{n−1}/dz^{n−1} [(z − c)^n F(z)] |_{z=c}
————————–
Chapter 5
Ordinary Differential Equations (ODEs)
5.1 First-order equations
Let f : U ⊂ R −→ R be differentiable on an open set U. An ODE which contains no derivative of order higher than the first is called a first-order equation. Such an equation appears in one of two forms:
y′ = F(x, y)   (explicit form)   (5.1)
F(x, y, y′) = 0   (implicit form)   (5.2)
where y = f(x) (often written as y = y(x)). An initial value problem is an ODE together with a prescribed value of y(x_0) at a given point x = x_0. If a function f satisfying the ODE exists, it is called a solution of the ODE.
5.2 First-order equations in separable form
Suppose the ODE is given as (5.1) and that
F (x, y) = f(x)g(y) for some functions f and g.
Then the solution of (5.1) is given by
∫ dy/g(y) = ∫ f(x) dx + c
where c is an arbitrary constant of integration. Particular solutions are obtained by giving values to c.
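The separable recipe can be checked against a numerical integration of the ODE. Below, dy/dx = x·y (so f(x) = x, g(y) = y, an assumed example) is solved with a classical Runge-Kutta step and compared with the closed form y = y_0 e^{x²/2}:

```python
import math

def rk4(F, x0, y0, x1, n=1_000):
    # classical 4th-order Runge-Kutta for y' = F(x, y)
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = F(x, y)
        k2 = F(x + h / 2, y + h * k1 / 2)
        k3 = F(x + h / 2, y + h * k2 / 2)
        k4 = F(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

F = lambda x, y: x * y                 # separable: f(x) = x, g(y) = y
y_numeric = rk4(F, 0.0, 1.0, 1.0)
y_closed = math.exp(0.5)               # from ∫ dy/y = ∫ x dx + c, with y(0) = 1
assert abs(y_numeric - y_closed) < 1e-8
```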
The following equations are reducible to separable form, with solutions as follows:
1. dy/dx = f(ax + by + c), a, b, c ∈ R. With u := ax + by + c, the solution is:
∫ du/(a + b f(u)) = x + C   (C: const. of integration)
2. (Homogeneous equations) Let F in (5.1) be homogeneous of degree 0 (see p. (2.3)). Then (5.1) can be written as y′ = f(x/y) or y′ = g(y/x). Substitute y = vx or x = uy respectively to reduce the equation to separable form. Assuming the substitution y = vx, the solution is:
∫ dv/(g(v) − v) = log |Cx|   (y = vx)
where C is an arbitrary constant of integration.
3. dy/dx = (ax + by + c)/(a′x + b′y + c′), where a, b, c and a′, b′, c′ ∈ R.
Case 1 a/a′ = b/b′ =: k. Substitute a′x + b′y = u to obtain the following equation in separable form:
∫ (u + c′)/(a′(u + c′) + b′(ku + c)) du = x + C
Case 2 a/a′ ≠ b/b′. Substitute u := x − h and v := y − k in the equation and choose h, k such that ah + bk + c = 0 and a′h + b′k + c′ = 0. The equation reduces to the form:
dv/du = (au + bv)/(a′u + b′v)
which is homogeneous in u and v.
5.3 Exact First-order ODEs
Consider the ODE
M(x, y) + N(x, y) y′ = 0   (5.3)
where M and N are functions defined on a certain common open set U ⊂ R². The ODE is said to be exact if there exists a function f : U −→ R such that ∂f/∂x = M and ∂f/∂y = N. In differential notation, the exact equation can be written as
df = M(x, y) dx + N(x, y) dy = 0   (5.4)
Theorem 121. Consider the ODE (5.3) defined on a rectangle U := [a, b] × [c, d] ⊂ R². Suppose that M and N have continuous partial derivatives in U. Then (5.3) is exact iff
∂M/∂y = ∂N/∂x   (5.5)
in U. If (5.3) is exact, its solution has the form f(x, y) = c, where
f(x, y) = ∫ M(x, y) dx + ψ(y)
ψ(y) := ∫ [N(x, y) − ∂/∂y ∫ M(x, y) dx] dy
Theorem 122. Suppose (5.3) is exact and M,N
are homogeneous of degree n 6= 1 (see p. (2.3)).
Then the solution is Mx + Ny = C (constant of
integration).
5.4 General first-order first-degree linear equations
A general first-order first-degree ODE (i.e. one in which the highest power of y′ := dy/dx is 1) can be expressed as
dy/dx + P(x) y = Q(x)   (5.6)
We have the following cases. C will denote an arbitrary constant of integration.
(P, Q constants, or P = 0 or Q = 0) The variables are separable.
(P ≠ 0, Q = 0) The solution is
y = C e^{−∫P(x) dx}
(Q ≠ 0) Then the solution is
y = e^{−∫P(x) dx} [C + ∫ Q e^{∫P(x) dx} dx]
Bernoulli’s equation The equation
dy/dx + P(x) y = Q(x) y^n
has the solution
y^{1−n} = e^{−(1−n)∫P dx} [C + ∫ (1 − n) Q e^{(1−n)∫P dx} dx]
C an arbitrary constant.
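The (Q ≠ 0) formula can be evaluated by quadrature and compared with the known solution of an assumed test problem, y′ + y = x with y(0) = 1, whose exact solution is y = x − 1 + 2e^{−x}:

```python
import math

def integral(g, a, b, n):
    # composite trapezoidal rule
    h = (b - a) / n
    return h * (0.5 * g(a) + sum(g(a + k * h) for k in range(1, n)) + 0.5 * g(b))

P = lambda x: 1.0
Q = lambda x: x

def y(x, C=1.0):
    # y = e^{-∫P dx} [C + ∫ Q e^{∫P dx} dx], with integrals taken from 0 to x
    I = lambda t: integral(P, 0.0, t, 200)          # ∫_0^t P
    return (C + integral(lambda t: Q(t) * math.exp(I(t)), 0.0, x, 2_000)) / math.exp(I(x))

x = 2.0
assert abs(y(x) - (x - 1.0 + 2.0 * math.exp(-x))) < 1e-4
```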
5.4.1 Integrating factors
A function φ(x, y) is said to be an integrating factor
of (5.3) if
φ(x, y)[M(x, y)dx+N(x, y)dy] = df
for some function f .
1. If the equation (5.4) is homogeneous (but not exact), then 1/(M(x, y) x + N(x, y) y) is an integrating factor of the equation, provided the denominator is not zero.
2. If (1/N)(∂M/∂y − ∂N/∂x) =: f(x), i.e. solely a function of x, then e^{∫f(x) dx} is an integrating factor for (5.4).
3. If (1/M)(∂M/∂y − ∂N/∂x) =: g(y), then e^{−∫g(y) dy} is an integrating factor for (5.4).
4. If M(x, y) = y f(xy) and N(x, y) = x g(xy), with Mx − Ny ≠ 0, then 1/(Mx − Ny) is an integrating factor of (5.4).
5. If M(x, y) = py and N(x, y) = qx, p, q ∈ R, then (5.4) has integrating factor x^{p−1} y^{q−1}.
6. If (5.4) has the form x^p y^q (ay dx + bx dy), a, b, p, q ∈ R, then (5.4) has integrating factor x^{a−p−1} y^{b−q−1}.
7. If (5.4) can be expressed as py dx + qx dy + x^m y^n (ay dx + bx dy) = 0, then x^α y^β is an integrating factor if
(α + 1)/p = (β + 1)/q
(α + m + 1)/a = (β + n + 1)/b
8. If (5.4) can be rewritten as x^m y^n (ay dx + bx dy) + x^{m′} y^{n′} (a′y dx + b′x dy) = 0, then x^α y^β is an integrating factor provided the following conditions are satisfied:
(α + m + 1)/a = (β + n + 1)/b
(α + m′ + 1)/a′ = (β + n′ + 1)/b′
ab′ − a′b ≠ 0
5.5 First-order nth-degree equations
The general equation of this type has the form
(y′)^n + P_1(x, y)(y′)^{n−1} + P_2(x, y)(y′)^{n−2} + · · · + P_{n−1}(x, y) y′ + P_n(x, y) = 0   (5.7)
Three particular cases of (5.7) are the following.
Equations solvable for y′. The equation (5.7), being an nth-degree polynomial in y′, may have n roots, say, u_1, . . . , u_n. Then the LHS of (5.7) factors to give (y′ − u_1)(y′ − u_2) · · · (y′ − u_n) = 0. Let y′ − u_i(x, y) = 0 have the solution f_i(x, y, C_i) = 0, C_i an arbitrary constant. Then, depending on the domain of the function y, one set of solutions of (5.7) can be obtained as
f_{i_1}(x, y, C_{i_1}) f_{i_2}(x, y, C_{i_2}) · · · f_{i_k}(x, y, C_{i_k}) = 0
(1 ≤ k ≤ n), i.e. a product of some or all of the factors f_i(x, y, C_i) = 0. Or else, solutions could be formed by joining some of these together, for example, f_i(x, y, C_i) = 0 on one subinterval and f_j(x, y, C_j) = 0 (j ≠ i) on the adjacent subinterval, etc., with a differentiable overlap.
Equations solvable for y. Suppose (5.7) can be solved for y to get y = f(x, y′). Differentiating this w.r.t. x yields
y′ = ∂f/∂x + (∂f/∂y′)(dy′/dx) = F(x, y′, dy′/dx)   (say)
Then solve y′ = F(x, y′, dy′/dx) to obtain φ(x, y′, C) = 0 (C an arbitrary constant). The general solution is obtained by eliminating y′ between
y = f(x, y′)
φ(x, y′, C) = 0
Equations solvable for x. If we have x = f(y, y′), differentiate the relation w.r.t. y to obtain a relation of the form
1/y′ = G(y, y′, dy′/dy)
Solve this equation (if possible) to obtain a relation ψ(y, y′, C) = 0. Eliminate y′ between x = f(y, y′) and ψ(y, y′, C) = 0 to get the general solution.
Clairaut’s equation This equation has the form y = x y′ + f(y′). Its general solution is the one-parameter family of lines y = Cx + f(C), C an arbitrary constant.
5.6 Linear ODEs
The general linear equation of order n defined on (α, β) ⊂ R is
L(y) := a_0(x) y^(n) + a_1(x) y^(n−1) + · · · + a_{n−1}(x) y′ + a_n(x) y = b(x)   (5.8)
Each a_i : (α, β) −→ R. A standing assumption will be that a_0 ≠ 0 on (α, β). If b(x) ≡ 0, (5.8) is said to be linear homogeneous, and linear non-homogeneous otherwise.
Theorem 123. (“Superposition”) Any finite linear combination of solutions of the homogeneous equation L(y) = 0 is again a solution of it. In other words, if y_1, . . . , y_n are solutions of L(y) = 0, then so is c_1 y_1 + c_2 y_2 + · · · + c_n y_n, where the c_i’s are arbitrary reals.
Definition 124. Suppose that y_1, . . . , y_n are solutions of (5.8) with common domain U. If ∑_{i=1}^n c_i y_i ≡ 0 on U =⇒ c_i = 0 for i = 1, . . . , n, the solutions y_i are said to be linearly independent. Otherwise, they are said to be linearly dependent. Thus, the solutions y_i are linearly dependent iff there exist c_i ∈ R, i = 1, . . . , n, not all zero, such that ∑_{i=1}^n c_i y_i ≡ 0.
Theorem 125. Consider the ODE (5.8) such that the functions a_i, b : (α, β) −→ R are continuous. Let x_0 ∈ (α, β) be arbitrary and y_0, y_1, . . . , y_{n−1} ∈ R be arbitrary. Then (5.8) has a unique solution satisfying the initial conditions
y(x_0) = y_0, y′(x_0) = y_1, . . . , y^(n−1)(x_0) = y_{n−1}
Definition 126. The Wronskian of a set of functions f_i : [α, β] −→ R (i = 1, . . . , n) which are differentiable n − 1 times, is the determinant
W(f_1, . . . , f_n)(x) :=
| f_1(x)         f_2(x)         · · ·   f_n(x)         |
| f_1′(x)        f_2′(x)        · · ·   f_n′(x)        |
| ...            ...                    ...            |
| f_1^(n−1)(x)   f_2^(n−1)(x)   · · ·   f_n^(n−1)(x)   |
Note that W(f_1, . . . , f_n) : [α, β] −→ R.
Theorem 127.
1. W(f_1, . . . , f_n) ≠ 0 ⇒ the functions f_1, . . . , f_n are linearly independent. In the opposite direction, even if f_1, . . . , f_n are linearly independent, the Wronskian W(f_1, . . . , f_n) may vanish.
2. If in the ODE (5.8) the coefficient functions a_i : (α, β) −→ R are continuous, and y_1, . . . , y_n are solutions of (5.8), then the y_i’s are linearly independent ⇐⇒ W(y_1, . . . , y_n)(x) ≠ 0 on (α, β).
Theorem 128.
1. The homogeneous equation L(y) = 0 (obtained
by setting b(x) = 0 in (5.8)) has n linearly
independent solutions.
2. If y1, . . . , yn are linearly independent solutions
of L(y) = 0, and y : (α, β) −→ R is any other
solution, then y is a linear combination of the
yi’s:
y(x) = c_1 y_1(x) + c_2 y_2(x) + · · · + c_n y_n(x)   (5.9)
for some constants c_1, . . . , c_n. Thus, together with Theorem (123), it follows that all solutions of L(y) = 0 are obtained when each of the coefficients in (5.9) is varied over R.
Theorem 129. (The non-homogeneous equation) If y_p is any solution, called the particular solution or integral, of the non-homogeneous equation L(y) = b, then all the solutions are given by y = y_p + y_c, where y_c, called the complementary function, is any solution of the homogeneous equation L(y) = 0.
5.6.1 Linear ODE of Euler (or Cauchy) type
This is a linear equation with variable coefficients
of the form
x^n y^(n) + a_1 x^{n−1} y^(n−1) + · · · + a_n y = 0   (5.10)
The associated indicial polynomial is
q(r) := [r(r − 1) · · · (r − n + 1)] + [a_1 r(r − 1) · · · (r − n + 2)] + · · · + a_n   (5.11)
It has degree n. In the special case of the 2nd-order Euler-type equation
x² y′′ + a_1 x y′ + a_2 y = 0
the indicial polynomial becomes
q(r) = r(r − 1) + a_1 r + a_2   (5.12)
Theorem 130. Let the indicial polynomial (5.11) have distinct roots r_1, r_2, . . . , r_k, and let root r_i have multiplicity n_i (so that n_1 + · · · + n_k = n). Then all the solutions of (5.10) are given by all possible linear combinations of the n linearly independent functions
|x|^{r_1}, |x|^{r_1} log |x|, . . . , |x|^{r_1} (log |x|)^{n_1−1}
|x|^{r_2}, |x|^{r_2} log |x|, . . . , |x|^{r_2} (log |x|)^{n_2−1}
...
|x|^{r_k}, |x|^{r_k} log |x|, . . . , |x|^{r_k} (log |x|)^{n_k−1}
This solution is valid in any interval not containing x = 0. In the special case of n = 2, the general solution is given by
y(x) = c_1 |x|^{r_1} + c_2 |x|^{r_2}   (distinct roots of (5.12))
y(x) = c_1 |x|^r + c_2 |x|^r log |x|   (double root r of (5.12))
where c_1 and c_2 are arbitrary reals.
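A sketch of Theorem 130 for the second-order case: the Euler equation x²y′′ + xy′ − y = 0 has indicial polynomial q(r) = r(r − 1) + r − 1 = r² − 1 with roots ±1, and the corresponding power solutions can be verified by finite differences. The sample equation and test point are assumptions for illustration:

```python
import cmath

def indicial_roots(a1, a2):
    # q(r) = r(r-1) + a1 r + a2 = r^2 + (a1 - 1) r + a2, as in (5.12)
    b, c = a1 - 1.0, a2
    d = cmath.sqrt(b * b - 4.0 * c)
    return (-b + d) / 2.0, (-b - d) / 2.0

r1, r2 = indicial_roots(1.0, -1.0)
assert {round(r1.real), round(r2.real)} == {1, -1}

def residual(r, x, h=1e-5):
    # does y = x^r satisfy x^2 y'' + x y' - y = 0 at x?
    y = lambda t: t ** r
    yp = (y(x + h) - y(x - h)) / (2.0 * h)
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h ** 2
    return x * x * ypp + x * yp - y(x)

assert all(abs(residual(r.real, 2.0)) < 1e-5 for r in (r1, r2))
```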
5.6.2 nth-order constant-coefficient homogeneous equations
These are equations of the form L(y) = 0 in which b ≡ 0 and each a_i(x) =: a_i is a constant function. The polynomial
P(u) := a_0 u^n + a_1 u^{n−1} + · · · + a_{n−1} u + a_n   (5.13)
is called the characteristic or auxiliary polynomial associated with the given homogeneous equation. The equation P(u) = 0 has n complex roots, some of which may be real. The following possibilities can occur.
Roots all real & distinct Let m_1, . . . , m_n be the distinct real roots of (5.13). Then the n linearly independent solutions are given by e^{m_1 x}, . . . , e^{m_n x}. The general solution is
y(x) = c_1 e^{m_1 x} + · · · + c_n e^{m_n x}
where the c_j’s are arbitrary real constants.
Roots all real with multiplicity Let the root m have multiplicity r. Then all the linearly independent solutions corresponding to m are e^{mx}, x e^{mx}, . . . , x^{r−1} e^{mx}. Thus, if the auxiliary polynomial has roots m_j with multiplicity r_j, j = 1, . . . , s, the general solution is:
∑_{k=1}^{r_1} c_{1k} x^{k−1} e^{m_1 x} + ∑_{k=1}^{r_2} c_{2k} x^{k−1} e^{m_2 x} + · · · + ∑_{k=1}^{r_s} c_{sk} x^{k−1} e^{m_s x}
where the c_{jk} ∈ R are arbitrary and r_1 + r_2 + · · · + r_s = n.
Roots complex & simple Since the auxiliary polynomial has real coefficients, if p + iq is a root of (5.13), then so is its complex conjugate p − iq. If there are r pairs of complex conjugate roots p_k ± i q_k, then all the basic linearly independent solutions are listed as follows:
e^{p_1 x} cos q_1 x, e^{p_1 x} sin q_1 x   (5.14)
e^{p_2 x} cos q_2 x, e^{p_2 x} sin q_2 x   (5.15)
...
e^{p_r x} cos q_r x, e^{p_r x} sin q_r x   (5.17)
The general solution is an arbitrary linear combination of the above solutions:
∑_{k=1}^r c_k e^{p_k x} cos q_k x + ∑_{k=1}^r d_k e^{p_k x} sin q_k x
c_k, d_k ∈ R.
Roots complex with multiplicity Let m = p + iq be a complex root of multiplicity r. Then so is p − iq. The linearly independent solutions corresponding to this pair of roots are
e^{px} cos qx, e^{px} sin qx
x e^{px} cos qx, x e^{px} sin qx
...
x^{r−1} e^{px} cos qx, x^{r−1} e^{px} sin qx
Similarly for the other complex roots with multiplicity. The general solution is a linear combination of all these solutions.
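As a concrete sketch (the equation is an assumed example): y′′ − 3y′ + 2y = 0 has auxiliary polynomial u² − 3u + 2 with distinct real roots 1 and 2, so every y = c_1 eˣ + c_2 e^{2x} should annihilate the operator; a finite-difference residual confirms this:

```python
import math

c1, c2 = 0.7, -1.3                       # arbitrary constants
y = lambda x: c1 * math.exp(x) + c2 * math.exp(2.0 * x)

def residual(x, h=1e-5):
    # y'' - 3y' + 2y via central differences
    yp = (y(x + h) - y(x - h)) / (2.0 * h)
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h ** 2
    return ypp - 3.0 * yp + 2.0 * y(x)

assert all(abs(residual(x)) < 1e-3 for x in (0.0, 0.5, 1.0))
```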
————————–
Chapter 6
Partial Differential Equations (PDEs)
In what follows, if f : U ⊂ R² −→ R is a function whose first-order partial derivatives exist, then
p := ∂f/∂x,   q := ∂f/∂y
6.1 Formation of a PDE
Let u = f(x, y) be a function.
Elimination of constants We consider the case of two constants. Let F(x, y, u, a, b) = 0 be a relation. Then a PDE is obtained if a and b can be eliminated from the three equations
F(x, y, u, a, b) = 0
∂F/∂x + (∂F/∂u) p = 0
∂F/∂y + (∂F/∂u) q = 0
to obtain a relation of the form G(x, y, u, p, q) = 0. Here the number of constants equals the number of independent variables. If the number of constants is greater, then higher-order partial derivatives must be taken to obtain sufficiently many equations to eliminate the constants.
Elimination of functions Suppose u := u(x, y, z) and v := v(x, y, z), where z = z(x, y) is the dependent variable, are connected by a relation of the form φ(u, v) = 0. If the first-order partial derivatives of φ exist, then an equation results which is of the form
P p + Q q = R
where, with p := ∂z/∂x and q := ∂z/∂y,
P := (∂u/∂y)(∂v/∂z) − (∂u/∂z)(∂v/∂y)
Q := (∂u/∂z)(∂v/∂x) − (∂u/∂x)(∂v/∂z)
R := (∂u/∂x)(∂v/∂y) − (∂u/∂y)(∂v/∂x)
6.2 First-order PDEs
Some terminology associated with solutions
Let
F(x, y, u, p, q) = 0   (6.1)
be a PDE in which the partial derivatives with respect to the independent variables (taken to be x and y) are of order at most 1. Such an equation is said to be of first order. The dependent variable is generally denoted by u. A solution of the form
f(x, y, u, a, b) = 0
where a, b are arbitrary constants or parameters, is said to be a complete integral of (6.1). If we restrict b = φ(a) for arbitrary functions φ, then
f(x, y, u, a, φ(a)) = 0
is said to be a general integral of (6.1). If the envelope of the two-parameter family of surfaces defined by f(x, y, u, a, b) = 0 exists, then it is also a solution of (6.1) and it is called a singular integral or singular solution of the PDE.
Linear PDEs
A PDE of the form
Pp+Qq = R (6.2)
where P = P(x, y, u), Q = Q(x, y, u) and R = R(x, y, u) are functions in which p and q do not occur and u is the dependent variable, is called a linear PDE. In general, a linear equation is one of the form
P_1 p_1 + P_2 p_2 + · · · + P_n p_n = R   (6.3)
where P_i = P_i(x_1, x_2, . . . , x_n, u) and R = R(x_1, x_2, . . . , x_n, u) are functions of the n independent variables x_1, . . . , x_n and the dependent variable u; p_i := ∂u/∂x_i for i = 1, 2, . . . , n.
Theorem 131.
1. The general solution of (6.2) is Φ(ξ, η) = 0, where Φ has first-order partial derivatives with respect to ξ and η but is otherwise arbitrary, and
ξ(x, y, u) = a,   η(x, y, u) = b
(a and b constants) are independent solutions of the ODEs
dx/P = dy/Q = du/R   (6.4)
2. The general solution of (6.3) is Φ(ξ_1, . . . , ξ_n, u) = 0, where Φ has first-order partial derivatives with respect to the ξ_i but is otherwise arbitrary, and
ξ_i(x_1, . . . , x_n, u) = a_i   (i = 1, 2, . . . , n)
(a_i’s constants) are linearly independent solutions of the ODEs
dx_1/P_1 = dx_2/P_2 = · · · = dx_n/P_n = du/R
6.2.1 Special types of first-order equations
Equations in p, q not involving x, y explicitly
Equations of the type
f(p, q) = 0
have the complete integral
u = ax + Q(a) y + b
where a, b are arbitrary constants and q = Q(a) is the function explicitly solving
f(a, q) = 0
Equations involving u but not x, y
Consider the equation
f(u, p, q) = 0
Solve the simultaneous equations f(u, p, q) = 0 and p = aq, a an arbitrary constant, to obtain formulas for p and q, which in turn are integrated to form the complete integral.
Separable equations
These are equations which can be written as
f(x, p) = g(y, q)
Solve for p and q from the simultaneous equations f(x, p) = a and g(y, q) = a, where a is an arbitrary constant, and construct the complete integral as above.
Clairaut’s equations
These are equations of the type
u = px+ qy + f(p, q)
The corresponding complete integral is
u = ax+ by + f(a, b)
6.3 Linear PDEs with constant coefficients
A PDE of the form
F(D, D′)u := ∑_{i=1}^m ∑_{j=1}^n a_{ij} D^i D′^j u = f(x, y)   (6.5)
where
D := ∂/∂x,   D′ := ∂/∂y,   D^i := ∂^i/∂x^i,   D′^j := ∂^j/∂y^j
and the a_{ij} are constants, is called a linear PDE with constant coefficients. The most general solution of the homogeneous PDE
F(D, D′)u = 0   (6.6)
is called the complementary function (CF) of (6.5). Any solution of (6.5) is termed a particular integral of the equation.
Theorem 132.
1. If u0 is the CF and u1 a particular integral of
(6.5), then u0 + u1 is the general solution of
(6.5).
2. If u1, . . . , un are solutions of (6.6), then so is
a1u1 + a2u2 + · · ·+ anun
A linear PDE is said to be reducible if it can be
factorised into factors of the form
D + aD′ + b
where a, b are constants.
Theorem 133. If (6.6) is reducible, then the
factors may be permuted without altering the equa-
tion.
Theorem 134. (Superposition) If u_1 and u_2 are solutions of a homogeneous linear PDE in the unknown function u = u(x, y, . . .) of finitely many variables defined on some domain U, then c_1 u_1 + c_2 u_2 is also a solution of the PDE in that domain.
Theorem 135.
1. If for a ≠ 0, aD + bD′ + c is a factor of F(D, D′) and φ(t) is a differentiable but otherwise arbitrary function, then
u(x, y) = e^{−(c/a)x} φ(bx − ay)
is a solution of (6.6), i.e. it is the CF of (6.5).
2. If in the above case a = 0 but b ≠ 0 and the other hypotheses are unchanged, then
u(x, y) = e^{−(c/b)y} φ(bx)
is the CF of (6.5).
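Theorem 135(1) is easy to probe numerically; φ(t) = sin t and the factor coefficients below are arbitrary assumptions for illustration:

```python
import math

a, b, c = 2.0, 3.0, 1.0
phi = math.sin                                    # arbitrary differentiable φ
u = lambda x, y: math.exp(-(c / a) * x) * phi(b * x - a * y)

def residual(x, y, h=1e-6):
    # (aD + bD' + c) u, via central differences
    ux = (u(x + h, y) - u(x - h, y)) / (2.0 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2.0 * h)
    return a * ux + b * uy + c * u(x, y)

pts = [(0.0, 0.0), (1.0, -0.5), (0.3, 2.0)]
assert all(abs(residual(x, y)) < 1e-8 for x, y in pts)
```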
For factors with multiplicity the following results
hold.
Theorem 136.
1. If for a ≠ 0, (aD + bD′ + c)^n is a factor of F(D, D′) and if φ_i(t) (i = 1, 2, . . . , n) are arbitrary differentiable functions, then
u(x, y) = e^{−(c/a)x} ∑_{i=1}^n x^{i−1} φ_i(bx − ay)
is the CF of (6.5).
2. If in the above case a = 0 but b ≠ 0 and the other hypotheses are unchanged, then
u(x, y) = e^{−(c/b)y} ∑_{i=1}^n x^{i−1} φ_i(bx)
is the CF of (6.5).
In general, the following theorem is true.
Theorem 137. Suppose F(D, D′) is reducible to linear factors:
F(D, D′) = ∏_{i=1}^n (a_i D + b_i D′ + c_i)^{n_i}
and φ_{ij}(t), i = 1, 2, . . . , n and j = 1, 2, . . . , n_i, are arbitrary differentiable functions.
1. If a_i ≠ 0 for all i = 1, 2, . . . , n, then the CF is given by
u(x, y) = ∑_{i=1}^n e^{−(c_i/a_i) x} ∑_{j=1}^{n_i} x^{j−1} φ_{ij}(b_i x − a_i y)
2. If some of the a_i’s are zero but the corresponding b_i’s are not, then the appropriate factors from (2) of Theorem 136 are used.
Since
F(D, D′) e^{ax+by} = F(a, b) e^{ax+by}
and
F(D, D′)(e^{ax+by} φ(x, y)) = e^{ax+by} F(D + a, D′ + b) φ(x, y)
where φ is differentiable with respect to x and y to the orders of D and D′ respectively, we have, for the factors of (6.6) which are not reducible to linear factors:
Theorem 138. e^{ax+by} is a solution of (6.6) if F(a, b) = 0. Hence, a general solution is given by
u(x, y) = ∑_{j=1}^n c_j e^{a_j x + b_j y}
where a_j, b_j, c_j are constants, F(a_j, b_j) = 0 for all j and n ∈ N ∪ {∞}. If the series is infinite, then it is a solution if it is uniformly convergent.
When the non-homogeneous term has the form
f(x, y) = sin(ax + by)
and F is of the form F(D², DD′, D′²), then one can use the relation
F(D², DD′, D′²) sin(ax + by) = F(−a², −ab, −b²) sin(ax + by)
An analogous result is true for cos(ax + by). In general, the trigonometric function f(x, y) can be expressed as a sum of complex exponentials and the theorems above applied. Alternatively, one may assume the solution (of a second-order equation, for example, with f(x, y) = sin(ax + by)) in the form
u(x, y) = α sin(ax + by) + β cos(ax + by)
and determine the unknown constants α and β by substituting the assumed solution in the given equation.
Finding a particular integral of (6.5)
Interpreting D^{−1} and D′^{−1} to mean integration with respect to x and y respectively, and writing (6.5) as
u = (1/F(D, D′)) f(x, y)
it may be convenient in some cases to expand 1/F(D, D′) by the binomial theorem and perform the integrations on f(x, y).
6.4 Some special linear PDEs
6.4.1 The one-dimensional wave equation
Let u := u(x, t) and f(x, t) be functions and c > 0 a constant. The PDE
∂²u/∂t² − c² ∂²u/∂x² = f(x, t)
is called the one-dimensional non-homogeneous wave equation. If f(x, t) ≡ 0, it is said to be homogeneous and is usually written as
u_tt := ∂²u/∂t² = c² ∂²u/∂x² =: c² u_xx   (6.7)
The Cauchy problem for an infinite string
This is the homogeneous wave equation (6.7) with “initial conditions” as follows. Assume that U = R × [0, ∞) (the second factor interpreted as time) and f, g are certain prescribed functions.
u_tt = c² u_xx
u(x, 0) = f(x), u_t(x, 0) = g(x)   (initial conditions at t = 0)
The solution, called D’Alembert’s solution, is given by
u(x, t) = ½ [f(x + ct) + f(x − ct)] + (1/2c) ∫_{x−ct}^{x+ct} g(s) ds
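D'Alembert's formula can be checked numerically. Below, c = 1, f(x) = e^{−x²} and g ≡ 0 are test data chosen for illustration; u should satisfy u_tt = u_xx and the initial condition u(x, 0) = f(x):

```python
import math

c = 1.0
f = lambda x: math.exp(-x * x)        # assumed initial displacement
# with g ≡ 0 the integral term vanishes, leaving two travelling waves
u = lambda x, t: 0.5 * (f(x + c * t) + f(x - c * t))

def wave_residual(x, t, h=1e-4):
    # u_tt - c^2 u_xx via central differences
    utt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
    uxx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    return utt - c * c * uxx

assert abs(u(0.7, 0.0) - f(0.7)) < 1e-12                 # u(x, 0) = f(x)
assert all(abs(wave_residual(x, t)) < 1e-5 for x, t in [(0.0, 0.5), (1.0, 1.0)])
```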
Semi-infinite string with a fixed end
This is an initial-cum-boundary-value problem. As-
sume that U = R× (0,∞], f is C2, g is C1 and
f(0) = f ′′(0) = g(0) = 0
utt = c2uxx
u(x, 0) = f(x) x ≥ 0ut(x, 0) = g(x) x ≥ 0u(0, t) = 0 t ≥ 0
The solution is given by
\[
u(x, t) = \frac{1}{2}\,[f(x + ct) + f(x - ct)] + \frac{1}{2c} \int_{x - ct}^{x + ct} g(s)\, ds \qquad (x > ct) \tag{6.8}
\]
\[
u(x, t) = \frac{1}{2}\,[f(ct + x) - f(ct - x)] + \frac{1}{2c} \int_{ct - x}^{ct + x} g(s)\, ds \qquad (x < ct)
\]
Semi-infinite string with a free end
This is another initial-cum-boundary-value problem. Assume that the free end is at x = 0, U = (0, ∞) × (0, ∞), f is $C^2$, g is $C^1$ and
\[
f'(0) = g'(0) = 0
\]
\[
u_{tt} = c^2 u_{xx}
\]
\[
u(x, 0) = f(x), \quad u_t(x, 0) = g(x) \qquad (x \ge 0)
\]
\[
u_x(0, t) = 0 \qquad (t \ge 0)
\]
The solution is given by
\[
u(x, t) = \frac{1}{2}\,[f(x + ct) + f(x - ct)] + \frac{1}{2c} \int_{x - ct}^{x + ct} g(s)\, ds \qquad (x > ct)
\]
\[
u(x, t) = \frac{1}{2}\,[f(ct + x) + f(ct - x)] + \frac{1}{2c} \left[ \int_0^{ct + x} g(s)\, ds + \int_0^{ct - x} g(s)\, ds \right] \qquad (x < ct)
\]
(For the free end the even extension of the data is used, so the two f-terms add.)
Nonhomogeneous boundary conditions
Let the following equation hold on the domain U = (0, ∞) × (0, ∞), f be $C^2$, g be $C^1$, p be $C^2$ and
\[
p(0) = f(0); \quad p'(0) = g(0); \quad p''(0) = c^2 f''(0)
\]
\[
u_{tt} = c^2 u_{xx}
\]
\[
u(x, 0) = f(x), \quad u_t(x, 0) = g(x) \qquad (x \ge 0)
\]
\[
u(0, t) = p(t) \qquad (t \ge 0)
\]
The solution is given by
\[
u(x, t) = p\!\left(t - \frac{x}{c}\right) + \phi(x + ct) - \psi(ct - x) \qquad (0 \le x < ct)
\]
where
\[
\phi(\xi) = \frac{1}{2} f(\xi) + \frac{1}{2c} \int_0^{\xi} g(s)\, ds, \qquad
\psi(\eta) = \frac{1}{2} f(\eta) + \frac{1}{2c} \int_0^{\eta} g(s)\, ds
\]
The solution for x > ct is given by (6.8).
The vibrating string. Separation of variables
Consider a string of length l stretched along the x-
axis from 0 to l. Assume that the string is under
constant tension τ and has density ρ. The PDE
on the domain U = (0, l) × (0,∞) describing the
vibration of the string with prescribed initial and
boundary conditions is the following.
\[
u_{tt} = c^2 u_{xx}
\]
\[
u(x, 0) = f(x), \quad u_t(x, 0) = g(x) \qquad (0 \le x \le l)
\]
\[
u(0, t) = 0, \quad u(l, t) = 0 \qquad (t \ge 0)
\]
Let f and g be representable by Fourier sine series. Assuming a solution of the form
\[
u(x, t) = X(x)\,T(t) \ne 0
\]
we obtain
\[
u(x, t) = \sum_{n=1}^{\infty} \left[ a_n \cos\frac{n\pi c t}{l} + b_n \sin\frac{n\pi c t}{l} \right] \sin\frac{n\pi x}{l}
\]
where
\[
a_n = \frac{2}{l} \int_0^l f(x) \sin\frac{n\pi x}{l}\, dx, \qquad
b_n = \frac{2}{n\pi c} \int_0^l g(x) \sin\frac{n\pi x}{l}\, dx
\]
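The coefficient integrals above are easy to check numerically. The sketch below (an illustration, not part of the text) computes the sine coefficients for the single-mode initial shape f(x) = sin(πx/l), for which a₁ = 1 and every other aₙ vanishes; the trapezoidal rule and the value l = 3 are choices of this example.

```python
import math

def sine_coeff(f, l, n, m=4000):
    """Approximate a_n = (2/l) ∫_0^l f(x) sin(nπx/l) dx by the trapezoidal rule.
    Endpoint terms vanish because sin(0) = sin(nπ) = 0."""
    h = l / m
    s = sum(f(k * h) * math.sin(n * math.pi * k * h / l) for k in range(1, m))
    return 2.0 / l * h * s

l = 3.0
f = lambda x: math.sin(math.pi * x / l)   # one pure mode as initial shape
a1 = sine_coeff(f, l, 1)                  # the mode is reproduced: close to 1
a2 = sine_coeff(f, l, 2)                  # no other mode is excited: close to 0
```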
6.4.2 The two-dimensional wave equation
Vibrating rectangular membrane
The situation is modelled by the following equation
on the domain U = (0, a) × (0, b) × (0,∞) with
associated initial and boundary conditions. Here
the function is u = u(x, y, t) and the membrane
has length a and width b.
\[
u_{tt} = c^2 (u_{xx} + u_{yy})
\]
\[
u(x, y, 0) = f(x, y), \quad u_t(x, y, 0) = g(x, y) \qquad (0 \le x \le a,\ 0 \le y \le b)
\]
\[
u(0, y, t) = 0; \quad u(a, y, t) = 0; \quad u(x, 0, t) = 0; \quad u(x, b, t) = 0
\]
The solution by separation of variables is given by
\[
u(x, y, t) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} (a_{mn} \cos\theta_{mn} c t + b_{mn} \sin\theta_{mn} c t) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}
\]
where
\[
\theta_{mn} = \pi \sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}
\]
and
\[
a_{mn} = \frac{4}{ab} \int_0^a \!\! \int_0^b f(x, y) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}\, dx\, dy
\]
\[
b_{mn} = \frac{4}{\theta_{mn} abc} \int_0^a \!\! \int_0^b g(x, y) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}\, dx\, dy
\]
6.4.3 The three-dimensional wave equation
The equation on U = (0, a)× (0, b)× (0, d)× (0,∞)
with initial and boundary conditions has the form
\[
u_{tt} = c^2 \nabla^2 u = c^2 (u_{xx} + u_{yy} + u_{zz})
\]
\[
u(x, y, z, 0) = f(x, y, z), \quad u_t(x, y, z, 0) = g(x, y, z) \quad \text{on } [0, a] \times [0, b] \times [0, d]
\]
\[
u(0, y, z, t) = u(a, y, z, t) = 0; \quad u(x, 0, z, t) = u(x, b, z, t) = 0; \quad u(x, y, 0, t) = u(x, y, d, t) = 0
\]
Assuming a solution of the form
\[
u(x, y, z, t) = U(x, y, z)\,T(t)
\]
we have
\[
u(x, y, z, t) = \sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} (a_{lmn} \cos\sqrt{\lambda}\, c t + b_{lmn} \sin\sqrt{\lambda}\, c t) \sin\frac{l\pi x}{a} \sin\frac{m\pi y}{b} \sin\frac{n\pi z}{d}
\]
where
\[
\lambda = \left( \frac{l^2}{a^2} + \frac{m^2}{b^2} + \frac{n^2}{d^2} \right) \pi^2
\]
and
\[
a_{lmn} = \frac{8}{abd} \int_0^a \!\! \int_0^b \!\! \int_0^d f(x, y, z) \sin\frac{l\pi x}{a} \sin\frac{m\pi y}{b} \sin\frac{n\pi z}{d}\, dx\, dy\, dz
\]
\[
b_{lmn} = \frac{8}{\sqrt{\lambda}\, abcd} \int_0^a \!\! \int_0^b \!\! \int_0^d g(x, y, z) \sin\frac{l\pi x}{a} \sin\frac{m\pi y}{b} \sin\frac{n\pi z}{d}\, dx\, dy\, dz
\]
6.4.4 Two-dimensional Laplace equation in a rectangle
This is the equation on the domain U = (0, a) × (0, b):
\[
\nabla^2 u = u_{xx} + u_{yy} = 0
\]
\[
u(x, 0) = f(x); \quad u(x, b) = g(x) \qquad (0 < x < a)
\]
\[
u(0, y) = 0; \quad u(a, y) = 0 \qquad (0 < y < b)
\]
The solution is given by
\[
u(x, y) = \sum_{n=1}^{\infty} \left( a_n \cosh\frac{n\pi y}{a} + b_n \sinh\frac{n\pi y}{a} \right) \sin\frac{n\pi x}{a}
\]
where for all n ∈ N
\[
a_n = \frac{2}{a} \int_0^a f(s) \sin\frac{n\pi s}{a}\, ds
\]
\[
b_n = \frac{1}{\sinh\frac{n\pi b}{a}} \left[ \frac{2}{a} \int_0^a g(s) \sin\frac{n\pi s}{a}\, ds - \left( \cosh\frac{n\pi b}{a} \right) \frac{2}{a} \int_0^a f(s) \sin\frac{n\pi s}{a}\, ds \right]
\]
6.4.5 Two-dimensional Laplace equation in a circle with Dirichlet conditions
Transforming the coordinate system to polar (r, θ), Laplace's equation becomes
\[
u_{rr} + \frac{1}{r} u_r + \frac{1}{r^2} u_{\theta\theta} = 0
\]
On the domain U = (0, a) × (−π, π] and subject to the boundary conditions
\[
u(a, \theta) = f(\theta) \qquad (-\pi < \theta \le \pi)
\]
\[
|u(r, \theta)| < \infty; \quad u(r, \pi) = u(r, -\pi), \quad u_\theta(r, \pi) = u_\theta(r, -\pi) \qquad (0 < r < a)
\]
f(θ) and u(r, θ) are assumed to be periodic in θ with period 2π. The solution is
\[
u(r, \theta) = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} r^n (a_n \cos n\theta + b_n \sin n\theta)
\]
where
\[
a_n = \frac{1}{\pi a^n} \int_{-\pi}^{\pi} f(s) \cos ns\, ds \quad (n \ge 0), \qquad
b_n = \frac{1}{\pi a^n} \int_{-\pi}^{\pi} f(s) \sin ns\, ds \quad (n \ge 1)
\]
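The coefficients above can be verified numerically. The following illustrative Python sketch (not part of the compendium) takes the boundary data f(θ) = cos θ on a circle of radius a = 2, for which a₁ = 1/a and all other coefficients vanish, and checks that the series reproduces the boundary data at r = a; the trapezoidal rule and the numbers are this example's assumptions.

```python
import math

def disc_coeffs(f, a, n, m=4000):
    """a_n, b_n of the disc solution, by the trapezoidal rule over one
    full period [-π, π] (the endpoints are identified, so k runs 0..m-1)."""
    h = 2 * math.pi / m
    an = sum(f(-math.pi + k * h) * math.cos(n * (-math.pi + k * h)) for k in range(m)) * h
    bn = sum(f(-math.pi + k * h) * math.sin(n * (-math.pi + k * h)) for k in range(m)) * h
    return an / (math.pi * a ** n), bn / (math.pi * a ** n)

a = 2.0
f = lambda t: math.cos(t)                 # boundary data u(a, θ) = cos θ
a1, b1 = disc_coeffs(f, a, 1)             # a1 ≈ 1/a = 0.5, b1 ≈ 0
theta = 0.3
# first term of the series, evaluated back on the boundary r = a
u_boundary = a ** 1 * (a1 * math.cos(theta) + b1 * math.sin(theta))
```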
6.4.6 Laplace's equation in three dimensions
This is the equation
\[
\nabla^2 u = u_{xx} + u_{yy} + u_{zz} = 0 \tag{6.9}
\]
When the values of u are prescribed on the boundary of the domain, it is termed a Dirichlet problem.
Laplace's equation in a box with Dirichlet conditions
Consider the following equation on the domain U =
(0, a)× (0, b)× (0, c).
\[
\nabla^2 u = 0
\]
\[
u(0, y, z) = u(a, y, z) = 0 \quad \text{on } (0, b) \times (0, c)
\]
\[
u(x, 0, z) = u(x, b, z) = 0 \quad \text{on } (0, a) \times (0, c)
\]
\[
u(x, y, c) = 0; \quad u(x, y, 0) = f(x, y) \quad \text{on } (0, a) \times (0, b)
\]
Assuming a solution of the form u(x, y, z) = X(x)Y(y)Z(z), we obtain the solution as
\[
u(x, y, z) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} \sinh\lambda_{mn}(c - z) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}
\]
\[
\lambda_{mn} := \pi \sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}
\]
\[
a_{mn} = \frac{4}{ab \sinh(c\lambda_{mn})} \int_0^a \!\! \int_0^b f(s, t) \sin\frac{m\pi s}{a} \sin\frac{n\pi t}{b}\, ds\, dt
\]
Laplace's equation in a sphere with Dirichlet conditions
In spherical polar coordinates (r, θ, φ), 0 ≤ r ≤ R, 0 ≤ θ ≤ π and 0 ≤ φ ≤ 2π, Laplace's equation takes the form
\[
\frac{\partial}{\partial r}\!\left( r^2 \frac{\partial u}{\partial r} \right) + \frac{1}{\sin\theta} \frac{\partial}{\partial\theta}\!\left( \sin\theta\, \frac{\partial u}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2 u}{\partial\phi^2} = 0
\]
Subject to the boundary condition u(R, θ, φ) = f(θ, φ) on the surface, the solution is
\[
u(r, \theta, \phi) = \sum_{n=0}^{\infty} (r/R)^n Z_n(\theta, \phi) \qquad (r < R)
\]
\[
Z_n(\theta, \phi) = \sum_{k=0}^{n} (a_{nk} \cos k\phi + b_{nk} \sin k\phi) P_n^k(\cos\theta)
\]
where
\[
P_n^k(x) = (1 - x^2)^{k/2} \frac{d^k}{dx^k} P_n(x)
\]
is the Legendre function associated with the Legendre polynomial $P_n(x)$ of degree n, and
\[
a_{00} = \frac{1}{4\pi} \int_0^{2\pi} \!\! \int_0^{\pi} f(\theta, \phi) \sin\theta\, d\theta\, d\phi
\]
\[
a_{nk} = \frac{(2n + 1)(n - k)!}{2\pi (n + k)!} \int_0^{2\pi} \!\! \int_0^{\pi} f(\theta, \phi)\, P_n^k(\cos\theta) \cos k\phi \sin\theta\, d\theta\, d\phi \qquad (n > 0)
\]
\[
b_{nk} = \frac{(2n + 1)(n - k)!}{2\pi (n + k)!} \int_0^{2\pi} \!\! \int_0^{\pi} f(\theta, \phi)\, P_n^k(\cos\theta) \sin k\phi \sin\theta\, d\theta\, d\phi
\]
6.4.7 One-dimensional heat equation
This is the following equation defined on the domain U = (0, a) × (0, ∞).
\[
u_t = c^2 u_{xx} \qquad (c > 0)
\]
\[
u(x, 0) = f(x) \qquad (0 < x < a)
\]
\[
u(0, t) = 0 = u(a, t) \qquad (t > 0)
\]
The formal solution is given by
\[
u(x, t) = \sum_{n=1}^{\infty} c_n e^{-(n^2\pi^2 c^2/a^2)t} \sin\frac{n\pi x}{a}
\]
\[
c_n = \frac{2}{a} \int_0^a f(x) \sin\frac{n\pi x}{a}\, dx \qquad (n \in \mathrm{N})
\]
6.4.8 Two-dimensional heat equation
Heat equation on a strip in Cartesian coordinates
This equation with its initial and boundary conditions defined on U = (0, a) × (0, b) × (0, ∞) is
\[
u_t = c^2 (u_{xx} + u_{yy}) \qquad (c > 0)
\]
\[
u(x, y, 0) = f(x, y) \quad \text{on } (0, a) \times (0, b)
\]
\[
u(x, 0, t) = 0 = u(x, b, t) \quad \text{on } (0, a) \times (0, \infty)
\]
\[
u(0, y, t) = 0 = u(a, y, t) \quad \text{on } (0, b) \times (0, \infty)
\]
It has a solution by separation of variables given by
\[
u(x, y, t) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}\, e^{-\lambda_{mn}^2 c^2 t}
\]
where
\[
\lambda_{mn}^2 = \frac{m^2\pi^2}{a^2} + \frac{n^2\pi^2}{b^2}
\]
and
\[
a_{mn} = \frac{4}{ab} \int_0^a \!\! \int_0^b f(s, t) \sin\frac{m\pi s}{a} \sin\frac{n\pi t}{b}\, ds\, dt
\]
Heat equation in a sphere
In spherical polar coordinates the equation is
\[
u_t = c^2 \left( u_{rr} + \frac{2}{r} u_r \right)
\]
Subject to the initial and boundary conditions
\[
u(r, 0) = f(r); \qquad u(0, t) = 0, \quad u(R, t) = 0 \quad (t > 0)
\]
the solution is
\[
u(r, t) = \sum_{n=1}^{\infty} a_n e^{-(n\pi c/R)^2 t}\, \frac{1}{r} \sin\frac{n\pi r}{R} \qquad (r > 0)
\]
where
\[
a_n = \frac{2}{R} \int_0^R r f(r) \sin\frac{n\pi r}{R}\, dr
\]
Heat equation in a cylinder in cylindrical polar coordinates
The equation
\[
u_t = c^2\, \frac{1}{r} \frac{\partial}{\partial r}\!\left( r \frac{\partial u}{\partial r} \right)
\]
subject to the initial and boundary conditions
\[
u(r, 0) = 0 \quad (0 \le r < R); \qquad u(R, t) = U, \text{ a constant} \quad (t > 0)
\]
has the solution
\[
u(r, t) = U \left[ 1 - 2 \sum_{n=1}^{\infty} e^{-\mu_n^2 c^2 t / R^2}\, \frac{J_0(\mu_n r / R)}{\mu_n J_1(\mu_n)} \right]
\]
where the $\mu_n$'s are the positive roots of the Bessel function equation $J_0(\mu) = 0$.
Chapter 7
Complex Variables
7.1 Preliminaries
1. In many circumstances, a complex number z = x + i y can be regarded as the point (x, y) of R². The real part x of z is denoted Re z and the imaginary part y is denoted Im z.
2. An (open) ball of radius r centred on z0 is the set B(z0, r) := {z ∈ C : |z − z0| < r}, where |·| is the complex modulus or absolute value. A set U ⊂ C is said to be open if for each z ∈ U there is an open ball B(z, r) ⊂ U, r depending on z.
3. A function f : A ⊂ C −→ C can be de-
composed into its real and imaginary parts:
f(z) = f(x, y) = u(x, y) + i v(x, y), where
u, v : A ⊂ R2 −→ R.
4. Let f : A ⊂ C −→ C be a function and z0 ∈ A be such that B(z0, r) ⊂ A for some ball centred on z0 (i.e. z0 is an interior point of A). If there is l ∈ C, |l| < ∞, such that given any ε > 0, |f(z) − l| < ε for all 0 < |z − z0| < δ for some δ = δ(ε) > 0, then we say that lim_{z→z0} f(z) = l. If no such l exists, then we say that the limit does not exist.
Theorem 139.
\[
\lim_{z \to z_0} f(z) = l \iff \lim_{z \to z_0} u(x, y) = \operatorname{Re} l \ \text{ and } \ \lim_{z \to z_0} v(x, y) = \operatorname{Im} l
\]
5. Let f : A ⊂ C −→ C be a function on an unbounded set A, i.e. there is no ball B(0, r) ⊃ A. If there is l ∈ C, |l| < ∞, such that given any ε > 0, |f(z) − l| < ε for all |z| > r for some r > 0, then we say that lim_{z→∞} f(z) = l.
Theorem 140.
\[
\lim_{z \to \infty} f(z) = l \iff \lim_{z \to \infty} u(x, y) = \operatorname{Re} l \ \text{ and } \ \lim_{z \to \infty} v(x, y) = \operatorname{Im} l
\]
6. Let f : A ⊂ C −→ C and z0 ∈ A be an in-
terior point. f is continuous at z0 if given
any ε > 0 there exists δ = δ(ε) such that
|f(z) − f(z0)| < ε whenever |z − z0| < δ. If
f is continuous at every point of its domain, it
is said to be continuous. If f is not continuous
at z0, it is said to be discontinuous at z0. f is
discontinuous if it is discontinuous at at least
one point in its domain.
Theorem 141. f is continuous at z0 = x0 +
i y0 iff u(x, y) and v(x, y) are continuous at
(x0, y0).
7. f : A ⊂ R −→ C is said to be differentiable at an interior point t0 ∈ A if
\[
f'(t_0) := \lim_{t \to t_0} \frac{f(t) - f(t_0)}{t - t_0}
\]
exists. f′(t0) is called the derivative of f at t0. f′(t0) exists iff u′(t0) and v′(t0) exist, and f′(t0) = u′(t0) + i v′(t0).
8. f : A ⊂ C −→ C is differentiable at an interior point z0 ∈ A if
\[
f'(z_0) := \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}
\]
exists. f′(z0) is called the derivative of f at z0. If f is differentiable at every point of C, it is said to be entire.
7.2 Linear Fractional Transformations
The map
\[
T : \mathrm{C} \setminus \{-d/c\} \longrightarrow \mathrm{C} \tag{7.1}
\]
\[
z \mapsto \frac{az + b}{cz + d} \tag{7.2}
\]
a, b, c, d ∈ C, c ≠ 0, is called a linear fractional transformation or a Möbius transformation. Sometimes the latter term is reserved for the situation ad − bc ≠ 0. Another name is bilinear transformation. When c = 0, it is called a linear or affine linear transformation.
Definition 142. T(z) = z + b is called a translation, T(z) = az (a ≠ 0) is a dilation and T(z) = $e^{i\theta}$z is a rotation. T(z) = 1/z is an inversion.
If ∞ denotes the usual point at infinity of the ex-
tended complex plane, we extend the domain of T
by writing T (∞) = ∞ if c = 0, and T (∞) = a/c
and T (−d/c) =∞ if c 6= 0.
Theorem 143.
1. When the domain of T is extended as above, the map T : C ∪ {∞} −→ C ∪ {∞} is bijective (one-to-one and onto).
2. A linear fractional transformation always maps circles and lines to circles or lines.
3. Given distinct points z1, z2, z3 ∈ C and distinct points w1, w2, w3 ∈ C, there exists a linear fractional transformation mapping zk to wk (k = 1, 2, 3). This transformation w = T(z) is given by the equation
\[
\frac{(w - w_1)(w_2 - w_3)}{(w - w_3)(w_2 - w_1)} = \frac{(z - z_1)(z_2 - z_3)}{(z - z_3)(z_2 - z_1)}
\]
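The cross-ratio equation can be solved for w to give the transformation explicitly. The Python sketch below does this for sample points chosen for this example; the data 0, 1, 2 ↦ i, 2i, 3i happen to be matched by the affine map w = (z + 1)i, which the construction recovers.

```python
def mobius_through(z1, z2, z3, w1, w2, w3):
    """Solve the cross-ratio equation
    (w-w1)(w2-w3)/((w-w3)(w2-w1)) = (z-z1)(z2-z3)/((z-z3)(z2-z1))  for w."""
    def T(z):
        R = (z - z1) * (z2 - z3) / ((z - z3) * (z2 - z1))
        A, B = (w2 - w3), R * (w2 - w1)
        return (w1 * A - w3 * B) / (A - B)
    return T

# example points: T should agree with w = (z + 1)i wherever it is defined
T = mobius_through(0, 1, 2, 1j, 2j, 3j)
```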
7.3 Elementary Functions
Definition 144.
Argument Let z = x + i y ≠ 0 have polar form $re^{i\theta}$. The multivalued function
\[
\arg z = \theta + 2n\pi \qquad n = 0, \pm 1, \pm 2, \ldots
\]
is called the argument of z.
Principal value The well-defined function denoted Arg z is called the principal value of arg z and is defined to be the unique value of arg z such that −π < Arg z ≤ π.
Theorem 145. Given any z, w ∈ C,
arg(zw) = arg(z) + arg(w)
in the sense that any value of arg z plus any value
of argw is a value of arg zw. Conversely, any value
of arg(zw) is a sum of a value of arg(z) and a value
of arg(w).
If arg z is replaced by Arg z throughout, then the above equality need not hold.
Definition 146. The exponential function, denoted $e^z$ or exp z, is defined to be
\[
e^z := e^x(\cos y + i \sin y)
\]
where z = x + i y.
Theorem 147. (Properties of the exponential) For any z, w ∈ C
1. $e^{z+w} = e^z e^w$.
2. $e^{z + 2\pi i} = e^z$. This is called periodicity. The exponential function is periodic with period 2πi.
3. $|e^z| = e^x$ and $\arg(e^z) = y + 2n\pi$, n = 0, ±1, ±2, . . . .
4. The range of $e^z$ is C \ {0}.
5. $\frac{d}{dz} e^z = e^z$.
6. $e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}$ with disc of convergence C (see section 7.6). Thus $e^z$ is entire (see section 7.1).
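Properties 1-3 are easy to confirm numerically with Python's `cmath` module; the sample points z and w below are arbitrary choices for this illustration.

```python
import cmath, math

z, w = 0.3 + 1.2j, -0.7 + 0.4j
law = cmath.exp(z + w) - cmath.exp(z) * cmath.exp(w)   # property 1: should be ~0
period = cmath.exp(z + 2j * math.pi) - cmath.exp(z)    # property 2: should be ~0
modulus = abs(cmath.exp(z)) - math.exp(z.real)         # property 3: should be ~0
```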
Definition 148. The trigonometric functions sin z, cos z and tan z are defined by
\[
\sin z = \frac{e^{iz} - e^{-iz}}{2i}, \qquad \cos z = \frac{e^{iz} + e^{-iz}}{2}, \qquad \tan z = \frac{\sin z}{\cos z}
\]
The associated functions sec, csc, cot are defined as in the real case.
Theorem 149. (Properties of the trigonometric
functions)
1. The basic algebraic formulas of real trigono-
metric functions such as those involving the
sums of angles and integral multiples of angles
are also true in the complex case.
2. They are periodic with period 2π (tan z has the smaller period π).
3. $\overline{\sin z} = \sin\bar{z}$, $\overline{\cos z} = \cos\bar{z}$ and $\overline{\tan z} = \tan\bar{z}$.
4. sin z, cos z and tan z are unbounded.
5. The following series expansions are valid for all z ∈ C:
\[
\sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!}, \qquad
\cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}
\]
Thus, sin z and cos z are entire functions.
6. $\frac{d}{dz}\sin z = \cos z$, $\frac{d}{dz}\cos z = -\sin z$ and $\frac{d}{dz}\tan z = \sec^2 z$.
7.
\[
\sin z = \sin x \cosh y + i \cos x \sinh y
\]
\[
\cos z = \cos x \cosh y - i \sin x \sinh y
\]
\[
\sin(iy) = i \sinh y \quad \& \quad \cos(iy) = \cosh y
\]
\[
|\sin z|^2 = \sin^2 x + \sinh^2 y
\]
\[
|\cos z|^2 = \cos^2 x + \sinh^2 y
\]
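The identities of item 7 can be checked numerically with `cmath`; the point x = 0.8, y = −0.5 is an arbitrary choice for this illustration.

```python
import cmath, math

x, y = 0.8, -0.5
z = complex(x, y)
s = cmath.sin(z)
# item 7: sin z = sin x cosh y + i cos x sinh y
expected = complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))
# item 7: |sin z|^2 = sin^2 x + sinh^2 y
modulus_sq = abs(s) ** 2
```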
Definition 150. The complex hyperbolic functions sinh z and cosh z are defined as follows:
\[
\sinh z = \frac{e^z - e^{-z}}{2}, \qquad \cosh z = \frac{e^z + e^{-z}}{2}
\]
The associated hyperbolic functions tanh, coth etc. are defined by analogy with the real case.
Theorem 151. (Properties of hyperbolic functions)
1. −i sinh(iz) = sin z and cosh(iz) = cos z.
2. The basic algebraic identities in the real case are also true in the complex case.
3. $|\sinh z|^2 = \sinh^2 x + \sin^2 y$ and $|\cosh z|^2 = \sinh^2 x + \cos^2 y$.
4. $\frac{d}{dz}\sinh z = \cosh z$ and $\frac{d}{dz}\cosh z = \sinh z$. Both sinh z and cosh z are entire.
Definition 152.
Logarithm The (complex) logarithm is a multivalued function log : C \ {0} −→ C given by
\[
\log z = \log|z| + i \arg z = \log|z| + i(\operatorname{Arg} z + 2n\pi)
\]
for all n = 0, ±1, ±2, . . . . If z = $re^{i\theta}$ with r > 0 and −π < θ ≤ π, then
\[
\log z = \log r + i(\theta + 2n\pi)
\]
Branch of the logarithm Let α ∈ R be arbitrary but fixed. The function
\[
\log z = \log r + i\theta
\]
is a branch of the logarithm at all points z = $re^{i\theta}$ with (r, θ) ∈ (0, ∞) × (α, α + 2π).
Principal branch The principal branch of the logarithm on C \ (−∞, 0] is the function defined by
\[
\log z = \log|z| + i \operatorname{Arg} z
\]
It is sometimes denoted Log z.
Theorem 153. (Properties of the logarithm)
1. $e^{\log z} = z$ for all z ≠ 0.
2. $\log e^z = z + 2n\pi i$ (n = 0, ±1, . . . ), and $\operatorname{Log} e^z = z$ provided −π < y ≤ π.
3. log(zw) = log z + log w in the sense that any value of log z plus any value of log w is a value of log(zw). Conversely, any value of log(zw) can be expressed as a sum of a value of log z and a value of log w.
4. As a special case there is the equality of functions:
\[
\operatorname{Log} |zw| = \operatorname{Log} |z| + \operatorname{Log} |w|
\]
5. $\log\left(\frac{z}{w}\right) = \log z - \log w$, w ≠ 0, under a similar interpretation as for log(zw).
Definition 154. The complex powers or exponents of a complex z ≠ 0 are defined as follows:
\[
z^w = e^{w \log z}
\]
for any w ∈ C. It is multivalued. The principal value or branch of $z^w$ is obtained by replacing log by Log in the definition and is thus valid in −π < Arg z < π.
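The multivaluedness is visible numerically: different branches of log z give genuinely different values of $z^w$. In the sketch below (illustrative points only), Python's built-in complex `**` agrees with the principal value $e^{w \operatorname{Log} z}$ computed via `cmath.log`.

```python
import cmath, math

z, w = 1 + 1j, 0.5 - 0.25j
principal = cmath.exp(w * cmath.log(z))   # e^{w Log z}, the principal value
builtin = z ** w                          # Python's ** uses the principal branch
# a different value of the multivalued power, from another branch of the log
other = cmath.exp(w * (cmath.log(z) + 2j * math.pi))
```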
7.4 Analytic Functions
Let U ⊂ C be an open set. A function f : U −→ C is said to be analytic or holomorphic at a point z0 ∈ U if f is differentiable everywhere in some ball B(z0, r) ⊂ U centred on z0, i.e.
\[
f'(z) = \lim_{w \to z} \frac{f(w) - f(z)}{w - z}
\]
exists for all z ∈ B(z0, r). It is analytic or holomorphic if it is analytic at every point of its domain. An analytic function whose domain is C is thus entire.
Theorem 155. The sum and product of analytic
functions defined on a common domain are again
analytic on that domain. The quotient of two ana-
lytic functions is analytic if the function in the de-
nominator is nonzero on its domain.
The Cauchy-Riemann equations Let f : U ⊂ C −→ C, f(z) = u(x, y) + i v(x, y) and z0 = x0 + i y0 ∈ U. The partial differential equations
\[
u_x(x_0, y_0) := \frac{\partial u}{\partial x}(x_0, y_0) = \frac{\partial v}{\partial y}(x_0, y_0) =: v_y(x_0, y_0) \tag{7.3}
\]
\[
u_y(x_0, y_0) := \frac{\partial u}{\partial y}(x_0, y_0) = -\frac{\partial v}{\partial x}(x_0, y_0) =: -v_x(x_0, y_0)
\]
are called the Cauchy-Riemann (CR) equations satisfied by the functions u, v at (x0, y0). The CR equations can be written in the alternative form
\[
\frac{\partial f}{\partial x}(x_0, y_0) = -i\, \frac{\partial f}{\partial y}(x_0, y_0)
\]
The CR equations & differentiability
Theorem 156. (Necessity) Suppose f : A ⊂ C −→ C, f = u + iv, is differentiable at z0 = x0 + i y0 ∈ A. Then the first order partial derivatives of u and v exist at (x0, y0) and satisfy the CR equations (7.3). Moreover,
\[
f'(z_0) = u_x(x_0, y_0) + i\, v_x(x_0, y_0) = v_y(x_0, y_0) - i\, u_y(x_0, y_0) \tag{7.4}
\]
Theorem 157. (Sufficiency) Suppose f :
A ⊂ C −→ C and let z0 = x0 + i y0 ∈ A be
an interior point. Suppose that the partial de-
rivatives occurring in the CR equations (7.3)
exist in some ball B(z0, r) ⊂ A and are con-
tinuous there. If they satisfy the CR equations
at z0, then f ′(z0) exists and is given by (7.4).
These theorems have obvious extensions to
analytic functions.
The CR equations in polar coordinates The substitutions x = r cos θ and y = r sin θ in (7.3) give
\[
u_r = \frac{1}{r} v_\theta, \qquad \frac{1}{r} u_\theta = -v_r
\]
Here, $u_r := \frac{\partial u}{\partial r}$, $u_\theta := \frac{\partial u}{\partial \theta}$ etc.
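The Cartesian CR equations (7.3) can be checked numerically for a concrete analytic function. The sketch below uses f(z) = z², so u = x² − y² and v = 2xy, with central differences at an arbitrarily chosen point; all parameter values are this example's assumptions.

```python
def parts(x, y):
    """u and v for f(z) = z^2, i.e. u = x^2 - y^2 and v = 2xy."""
    z2 = complex(x, y) ** 2
    return z2.real, z2.imag

x0, y0, h = 0.6, -1.1, 1e-6
ux = (parts(x0 + h, y0)[0] - parts(x0 - h, y0)[0]) / (2 * h)
uy = (parts(x0, y0 + h)[0] - parts(x0, y0 - h)[0]) / (2 * h)
vx = (parts(x0 + h, y0)[1] - parts(x0 - h, y0)[1]) / (2 * h)
vy = (parts(x0, y0 + h)[1] - parts(x0, y0 - h)[1]) / (2 * h)
# CR equations: ux = vy and uy = -vx
```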
Harmonic functions A function f : U ⊂ R² −→ R is said to be harmonic in U if it has continuous second-order partial derivatives everywhere in U and satisfies the partial differential equation
\[
f_{xx} + f_{yy} = 0
\]
which is called Laplace's equation. Here, $f_{xx} = \frac{\partial^2 f}{\partial x^2}$ etc. In alternative notation, Δf = 0 (see p. 9).
Theorem 158. If f : U ⊂ C −→ C is analytic
in U , then its real and imaginary parts u(x, y)
and v(x, y) are each harmonic: ∆u = 0 and
∆v = 0.
Definition 159. If u, v : U ⊂ R2 −→ R are
harmonic in U and satisfy the CR equations in
U , then v is said to be a harmonic conjugate
of u in U .
Theorem 160. f : U ⊂ C −→ C, f(z) =
u(x, y) + i v(x, y), is analytic in U ⇐⇒ v is a
harmonic conjugate of u in U .
7.5 Complex integration
Integrals of complex functions on R If f : [a, b] −→ C, −∞ ≤ a < b ≤ ∞, is a function whose real and imaginary parts u and v are integrable (see section 2.5), then
\[
\int_a^b f(t)\, dt := \int_a^b u(t)\, dt + i \int_a^b v(t)\, dt
\]
Theorem 161. Let f : [a, b] −→ C, −∞ ≤ a < b ≤ ∞. Then
\[
\left| \int_a^b f(t)\, dt \right| \le \int_a^b |f(t)|\, dt
\]
if both integrals exist.
Contours & Line Integrals A curve or arc or path in C is a function γ : [a, b] −→ C for which it is generally assumed that the real and imaginary component functions are continuous. γ is simple if it does not intersect itself: γ(s) = γ(t) ⇒ s = t. It is simple closed or Jordan if the only self-intersection occurs at the endpoints: γ(a) = γ(b). A differentiable curve γ is one such that γ′ exists on its domain. A contour is a piecewise differentiable curve, i.e. finitely many smooth curves joined end to end. A simple closed contour has no self-intersections except at the starting point and the ending point. A simple closed contour γ : [a, b] −→ C is said to be positively oriented if an observer traversing the contour as per the parametrisation has the interior points enclosed by the contour on his left.
Definition 162. The length of a differentiable curve γ : [a, b] −→ C, γ(t) = x(t) + i y(t), is
\[
L := \int_a^b |\gamma'(t)|\, dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2}\, dt
\]
The length of a contour is the sum of the lengths of the constituent curves.
Definition 163. Let f : U ⊂ C −→ C be piecewise continuous and γ : [a, b] −→ U be a contour. Then
\[
\int_\gamma f(z)\, dz := \int_a^b f(\gamma(t))\, \gamma'(t)\, dt
= \int_a^b (u\, x' - v\, y')\, dt + i \int_a^b (v\, x' + u\, y')\, dt
= \int_\gamma u\, dx - v\, dy + i \int_\gamma v\, dx + u\, dy
\]
where u, v, x′, y′ are evaluated along γ. The last expression uses the notation of real line integrals (section 2.11).
Reversing a contour Suppose a contour is given by γ : [a, b] −→ C which is traced from γ(a) to γ(b). The contour traversed in the reverse direction is described by (−γ)(t) := γ(−t), −b ≤ t ≤ −a. The contour integral satisfies
\[
\int_{-\gamma} f(z)\, dz = -\int_\gamma f(z)\, dz
\]
Estimating a contour integral
Theorem 164. Let f : U ⊂ C −→ C be a function which is continuous on a contour γ having length L. Suppose max{|f(z)| : z ∈ γ} = M. Then
\[
\left| \int_\gamma f(z)\, dz \right| \le ML
\]
Definition 165. A set A ⊂ C is simply con-
nected if every simple closed curve in A en-
closes only points of A. Alternatively, in such
a set, every simple closed curve can be shrunk
continuously to a point.
The Cauchy-Goursat theorem
Theorem 167. (Simply connected domains) Let f : U ⊂ C −→ C be analytic on the simply connected open set U. Then
\[
\int_\gamma f(z)\, dz = 0
\]
along every simple closed contour γ in U.
Corollary 168. (Path independence) Let f be as in Theorem 167 and γ1, γ2 : [a, b] −→ C be two simple curves with the same starting point and the same ending point: z1 := γ1(a) = γ2(a) and z2 := γ1(b) = γ2(b), with no other points of intersection. Then
\[
\int_{\gamma_1} f(z)\, dz = \int_{\gamma_2} f(z)\, dz
\]
and thus the integral depends only on the endpoints z1 and z2.
Theorem 169. (Multiply connected domains) Let f : U ⊂ C −→ C be analytic on an open set U containing a multiply connected set S (p. 73) with boundary curves γ, γ1, . . . , γn. Then
\[
\int_\gamma f(z)\, dz = \sum_{k=1}^{n} \int_{\gamma_k} f(z)\, dz
\]
Definition 170. If f : U ⊂ C −→ C is con-
tinuous on the open set U and if there exists
an analytic function F : U −→ C such that
F ′(z) = f(z), then F is said to be a primitive
or antiderivative of f in U .
Theorem 171. Let the continuous function f : U −→ C (U connected) have a primitive F in U and let z1, z2 ∈ U. If γ is any contour in U joining z1 to z2, then
\[
\int_\gamma f(z)\, dz = F(z_2) - F(z_1)
\]
and the integral is thus independent of the contour joining z1 to z2.
The Cauchy integral formula Let f : U ⊂ C −→ C be analytic and γ a simple closed positively oriented contour in U whose interior is contained in U. Then for any point z0 in the interior of γ,
\[
f(z_0) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z_0}\, dw
\]
Integral representation of derivatives
Theorem 172. Let f : U ⊂ C −→ C be analytic and γ a simple closed positively oriented contour inside U. Then
1. f has derivatives of all orders in U, which are consequently analytic in U.
2. The nth derivative of f, denoted by $f^{(n)}$, has the integral representation
\[
f^{(n)}(z) = \frac{n!}{2\pi i} \int_\gamma \frac{f(w)}{(w - z)^{n+1}}\, dw
\]
for z in the interior of γ.
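This integral representation lends itself to a numerical check: discretising the contour integral over a circle recovers the derivative. The sketch below (an illustration, with routine and parameters chosen for this example) recovers the second derivative of $e^z$, which is again $e^z$.

```python
import cmath, math

def nth_derivative(f, z, n, r=1.0, m=2000):
    """f^{(n)}(z) = (n!/(2πi)) ∮_{|w-z|=r} f(w)/(w-z)^{n+1} dw,
    with the contour integral approximated by the trapezoidal rule."""
    total = 0j
    for k in range(m):
        t = 2 * math.pi * k / m
        w = z + r * cmath.exp(1j * t)
        total += f(w) / (w - z) ** (n + 1) * 1j * r * cmath.exp(1j * t)
    total *= 2 * math.pi / m
    return math.factorial(n) / (2j * math.pi) * total

z0 = 0.3 + 0.2j
d2 = nth_derivative(cmath.exp, z0, 2)     # second derivative of e^z is e^z
```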
Morera's theorem If f : U ⊂ C −→ C is continuous on the open set U and is such that
\[
\int_\Delta f(z)\, dz = 0
\]
for every triangular contour Δ ⊂ U, then f is analytic in U.
Further properties of analytic functions
Maximum Modulus theorem Suppose f : U −→ C is analytic on the connected open set U and z0 ∈ U. Let B(z0, r) ⊂ U be some open ball. Then
\[
|f(z_0)| \le \max_\theta \{ |f(z_0 + re^{i\theta})| : 0 \le \theta \le 2\pi \}
\]
Equality holds iff f is constant. Thus, if f is nonconstant, its maximum (if it exists) can occur only on the boundary of U and never in its interior. Similarly, if f = u + i v, then u and v attain their respective maxima (if they exist) only on the boundary of U.
Corollary 173. (Minimum Modulus theorem) Under the same assumptions as above and if f(z) ≠ 0 in B(z0, r), then
\[
|f(z_0)| \ge \min_\theta \{ |f(z_0 + re^{i\theta})| : 0 \le \theta \le 2\pi \}
\]
and hence, the minimum (if it exists) can occur only on the boundary of U and never in its interior. Similarly, if f = u + i v, then u and v attain their respective minima (if they exist) only on the boundary of U.
Liouville's theorem Let f : C −→ C be analytic (such a function is termed entire) and bounded, i.e. |f(z)| ≤ C for some C > 0 and all z ∈ C. Then f is constant.
Cauchy's estimates If B(z0, r) ⊂ C is some open ball and f : B(z0, r) −→ C is analytic and bounded by M, viz. |f(z)| ≤ M for all z ∈ B(z0, r), then
\[
|f^{(n)}(z_0)| \le \frac{n!\, M}{r^n}
\]
Open Mapping theorem Given an analytic
function f : U −→ C, where U is an open
connected set, then f(U) is either open
or a point (in which case f is constant).
Thus, the image or range of a nonconstant
analytic function is an open subset of C.
Argument principle Let γ ⊂ U be a simple closed positively oriented contour and f : U −→ C be a function which is analytic at all points on and inside γ except possibly for poles in the interior of γ. Suppose f has no zeros on γ. If N and P denote the number of zeros and poles respectively of f inside γ, counted with multiplicity, then
\[
\frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\, dz = N - P
\]
Rouché's theorem Suppose f, g : U −→ C are analytic and γ ⊂ U is a simple closed contour. If |f(z)| > |g(z)| on γ, then f(z) and f(z) + g(z) have the same number of zeros inside γ, counting multiplicities. (Here z0 is a zero of multiplicity m of f if, for some positive integer m, f(z) = (z − z0)^m p(z) with p analytic and p(z0) ≠ 0, and each zero is counted as many times as its multiplicity.)
7.6 Series
An infinite sequence z1, . . . , zn, . . . in C is said to converge to the limit z if given any ε > 0, there is n0 := n0(ε) ∈ N such that n ≥ n0 ⇒ |z − zn| < ε. If a sequence does not converge, it is said to diverge.
Theorem 174. Let zn = xn + i yn (n ∈ N) and z = x + i y be in C. Then
\[
\lim_{n \to \infty} z_n = z \iff \lim_{n \to \infty} x_n = x \ \text{ and } \ \lim_{n \to \infty} y_n = y
\]
Definition 175. An infinite series $\sum_{n=1}^{\infty} z_n$ converges to z ∈ C if the sequence $s_n$ of partial sums, $s_n := \sum_{k=1}^{n} z_k$, converges to z. This is denoted by $\sum_{n=1}^{\infty} z_n = z$.
Theorem 176.
64
1. If the series $\sum_{n=1}^{\infty} z_n$ converges, then $\lim_{n\to\infty} z_n = 0$.
2. Given a convergent series $\sum_{n=1}^{\infty} z_n = z$, where zn = xn + i yn and z = x + i y, it follows that $\sum_{n=1}^{\infty} x_n = x$ and $\sum_{n=1}^{\infty} y_n = y$.
Definition 177. A series of the form
\[
\sum_{n=0}^{\infty} a_n (z - z_0)^n
\]
is called a power series. Here the partial sums are
\[
s_n(z) := \sum_{k=0}^{n} a_k (z - z_0)^k
\]
and convergence to f(z) := $\lim_{n\to\infty} s_n(z)$ is expressed by saying that given any ε > 0, there exists n0(z, ε) such that n ≥ n0(z, ε) ⇒ |sn(z) − f(z)| < ε for each z.
Definition 178. If there exists R ∈ [0, ∞] such that
\[
|z - z_0| < R \Rightarrow \text{the series converges absolutely}, \qquad
|z - z_0| > R \Rightarrow \text{the series diverges},
\]
then R is called the radius of convergence and the circle {z ∈ C : |z − z0| = R} is called the circle of convergence. The open ball B(z0, R) is called the disc of convergence.
Taylor series Let f : U ⊂ C −→ C be analytic on the open simply connected set U, z0 ∈ U and C := {z : |z − z0| = r} be a circle contained in U. Then at each point z such that |z − z0| < r (i.e. strictly within C),
\[
f(z) = f(z_0) + \frac{f'(z_0)}{1!}(z - z_0) + \frac{f''(z_0)}{2!}(z - z_0)^2 + \cdots + \frac{f^{(n)}(z_0)}{n!}(z - z_0)^n + \cdots \tag{7.5}
\]
The series (7.5) is called the Taylor series expansion of f about z0. The special case z0 = 0 is called the Maclaurin series of f.
Absolute & uniform convergence A series $\sum_{n=1}^{\infty} z_n$ is said to be absolutely convergent if $\sum_{n=1}^{\infty} |z_n|$ converges.
Theorem 179.
1. Absolute convergence implies convergence.
2. If a power series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ converges for some z such that |z − z0| = r, 0 < r < ∞, then it converges whenever |z − z0| < r, i.e. everywhere inside the circle of radius r.
3. The radius of convergence of a power series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ is given by
\[
R = \lim_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right|
\]
if the limit exists. If it does not, then it is given by $R = 1/\limsup |a_n|^{1/n}$.
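The ratio formula is easy to try numerically: for $\sum 2^n (z - z_0)^n$ the ratio is exactly 1/2, while for $\sum z^n/n!$ the ratio n + 1 grows without bound, reflecting R = ∞. The index n = 50 below is an arbitrary choice for this illustration.

```python
import math

def ratio_estimate(a, n):
    """Approximate R by |a_n / a_{n+1}| for a single (large) index n."""
    return abs(a(n) / a(n + 1))

R_geom = ratio_estimate(lambda n: 2.0 ** n, 50)                 # R = 1/2
R_exp = ratio_estimate(lambda n: 1.0 / math.factorial(n), 50)   # ratio = n + 1 = 51
```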
The series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ is said to converge uniformly to f(z) if given ε > 0, there exists n0 = n0(ε), independent of z, such that |sn(z) − f(z)| < ε whenever n ≥ n0.
Theorem 180. (The Weierstrass M test) Given a sequence of functions fn : U ⊂ C −→ C which are uniformly bounded in the sense that |fn(z)| ≤ Mn for all z ∈ U and n ∈ N, the series $\sum_{n=1}^{\infty} f_n(z)$ converges uniformly on U if $\sum_{n=1}^{\infty} M_n$ converges.
Theorem 181. Suppose a power series has a nonzero radius of convergence R about a point z0. Then it converges uniformly in the set {z ∈ C : |z − z0| ≤ r} where 0 < r < R.
Laurent series Let z0 ∈ C and C1 := {z ∈ C : |z − z0| = r1} and C2 := {z ∈ C : |z − z0| = r2} be two concentric positively oriented circles centred on z0, 0 ≤ r1 < r2 ≤ ∞. Let U := {z ∈ C : r1 < |z − z0| < r2}. If f : U −→ C is analytic, then
\[
f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n
\]
where
\[
a_n = \frac{1}{2\pi i} \int_C \frac{f(z)}{(z - z_0)^{n+1}}\, dz \qquad (n \in \mathrm{Z})
\]
and C is the circle γ(θ) := z0 + $re^{i\theta}$, r1 < r < r2 and 0 ≤ θ ≤ 2π. The series, which is uniquely determined by f and U, converges absolutely on U and uniformly on compact subsets of U.
Theorem 182. The function f : U −→ C defined by $f(z) := \sum_{n=0}^{\infty} a_n(z - z_0)^n$ is analytic on the disc of convergence U of the power series.
65
Theorem 183. (Integrating a power series) Let $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ be a convergent power series and C be any contour lying in the disc of convergence. If g : C −→ C is continuous, then
\[
\int_C g(z) \left[ \sum_{n=0}^{\infty} a_n (z - z_0)^n \right] dz = \sum_{n=0}^{\infty} a_n \int_C g(z)(z - z_0)^n\, dz
\]
In particular, by taking g(z) ≡ 1, it follows that the power series can be integrated term-by-term:
\[
\int_C \sum_{n=0}^{\infty} a_n (z - z_0)^n\, dz = \sum_{n=0}^{\infty} a_n \int_C (z - z_0)^n\, dz
\]
Theorem 184. (Differentiating a power series) A power series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ can be differentiated term-by-term in its disc of convergence and the resulting series has the same disc of convergence:
\[
\frac{d}{dz} \sum_{n=0}^{\infty} a_n (z - z_0)^n = \sum_{n=1}^{\infty} n a_n (z - z_0)^{n-1}
\]
Theorem 185. (Uniqueness of a series representation) The function f defined by $f(z) := \sum_{n=0}^{\infty} a_n(z - z_0)^n$ on the disc of convergence of the power series is the Taylor expansion of f about z0. In other words,
\[
a_n = \frac{f^{(n)}(z_0)}{n!}
\]
Similarly, if $\sum_{n=-\infty}^{\infty} c_n(z - z_0)^n$ converges to a function f(z) in some annular open set, then it is the Laurent series of f.
Sum & product of series If $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ and $\sum_{n=0}^{\infty} b_n(z - z_0)^n$ are two power series converging on the same disc, their sum is the series $\sum_{n=0}^{\infty} (a_n + b_n)(z - z_0)^n$, convergent on the same disc.
Theorem 186. Given convergent power series $f(z) := \sum_{n=0}^{\infty} a_n(z - z_0)^n$ and $g(z) := \sum_{n=0}^{\infty} b_n(z - z_0)^n$ converging within the same disc D, the product f(z)g(z) is also a power series, with the expansion
\[
f(z)g(z) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) (z - z_0)^n
\]
valid in D. The product series is sometimes called the Cauchy product.
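The Cauchy product formula can be checked on a familiar pair of series: multiplying the series for $e^z$ by itself must give the series for $e^{2z}$, whose nth coefficient is $2^n/n!$. The truncation at ten terms below is an arbitrary choice for this illustration.

```python
import math

N = 10
a = [1.0 / math.factorial(n) for n in range(N)]      # coefficients of e^z
# Cauchy product of the series with itself
c = [sum(a[k] * a[n - k] for k in range(n + 1)) for n in range(N)]
# e^z * e^z = e^{2z}, whose nth coefficient is 2^n / n!
expected = [2.0 ** n / math.factorial(n) for n in range(N)]
```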
Theorem 187. Let f : U −→ C be analytic
and z0 ∈ U . If f is nonconstant and f(z0) = 0,
then there is an open ball B(z0, r) ⊂ U within
which f has no other zero, i.e. the zeros of f
are “isolated”.
7.7 Residues & Poles
Singularities Let f : U −→ C and z0 ∈ U. If f is not analytic (or possibly not even defined) at z0 but is analytic at some point in every ball centred on z0, then z0 is said to be a singularity of f.
The singularity z0 is isolated if there is a ball B(z0, r) ⊂ C such that f is analytic on B(z0, r) \ {z0}. It is a removable singularity if there exists an analytic function $\tilde{f}$ : B(z0, r) −→ C such that $\tilde{f}(z) = f(z)$ on B(z0, r) \ {z0} (thus f can be redefined at z0 to be $\tilde{f}(z_0)$ to become analytic). The isolated singularity z0 is said to be a pole if $\lim_{z\to z_0} |f(z)| = \infty$. If an isolated singularity is neither removable nor a pole, it is said to be an essential singularity.
By (7.6), f has a Laurent series expansion in an annular domain about an isolated singularity z0.
Theorem 188. Let f : U \ {z0} −→ C be analytic on a connected open set. The following criteria classify the isolated singularities.
Removable singularity If z0 ∈ U is an isolated singularity of f, then it is removable iff
\[
\lim_{z \to z_0} (z - z_0) f(z) = 0
\]
Pole If f has a pole at z0, then there are m ∈ N and an analytic g : U −→ C with g(z0) ≠ 0 such that
\[
f(z) = \frac{g(z)}{(z - z_0)^m}
\]
and m is the smallest positive integer such that (z − z0)^m f(z) has a removable singularity at z0. m is called the order of the pole at z0. A pole of order 1 is said to be simple. Moreover, if B(z0, r) ⊂ U, we can express f as follows:
\[
f(z) = h(z) + \sum_{k=1}^{m} \frac{b_k}{(z - z_0)^k}
\]
where h : B(z0, r) −→ C is analytic and $b_m \ne 0$. The m summed terms on the RHS are called the singular or principal part of f at z0.
Essential singularity If z0 is an essential singularity of f, then for any r > 0, every z ∈ C can be approximated arbitrarily closely by elements of the set f(B(z0, r) \ {z0}) in the following sense: given any ε > 0 and z ∈ C, there is w ∈ B(z0, r) \ {z0} such that |z − f(w)| < ε. (This result is called the Casorati-Weierstrass theorem.)
Alternatively, the Laurent series expansion can be used to identify the nature of an isolated singularity.
Theorem 189. Let z0 be an isolated singularity of a function f which has the Laurent series development $f(z) = \sum_{n=-\infty}^{\infty} c_n(z - z_0)^n$ in a suitable domain. Then
1. z0 is removable iff cn = 0 for n ≤ −1.
2. It is a pole of order m iff $c_{-m} \ne 0$ and cn = 0 for all n ≤ −(m + 1).
3. It is an essential singularity iff cn ≠ 0 for infinitely many (but not necessarily all) negative n.
Residues Let f have an isolated singularity at z0 with Laurent series expansion $f(z) = \sum_{n=-\infty}^{\infty} c_n(z - z_0)^n$. The coefficient $c_{-1}$ is called the residue of f at z0. It is often denoted by Res(f; z0).
Theorem 190.
1. $\operatorname{Res}(f; z_0) := c_{-1} = \frac{1}{2\pi i} \int_C f(z)\, dz$, where C is a positively oriented circle centred on z0 and contained in the domain of f.
2. Suppose f has a pole of order m at z0. If g(z) := (z − z0)^m f(z), then
\[
\operatorname{Res}(f; z_0) = \frac{1}{(m - 1)!}\, g^{(m-1)}(z_0)
\]
The Residue Theorem Suppose f : U −→ C is analytic except at the isolated singular points z1, z2, . . . , zn ∈ U and let C be a positively oriented simple closed contour lying in U such that the singularities are in the interior of the region enclosed by C. Then

∫_C f(z) dz = 2πi Σ_{k=1}^{n} Res(f ; zk)
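The residue theorem lends itself to a direct numerical check. A minimal Python sketch: the circle parametrisation below and the test function e^z/z² (a pole of order 2 at 0 with residue g′(0) = e⁰ = 1, by Theorem 190) are illustrative choices, not part of the compendium.

```python
import cmath

def contour_integral(f, center, radius, n=20000):
    """Approximate the integral of f over the positively oriented circle
    |z - center| = radius via the parametrization z = center + R*e^{i*theta},
    so dz = i*R*e^{i*theta} dtheta."""
    total = 0.0 + 0.0j
    dtheta = 2 * cmath.pi / n
    for j in range(n):
        theta = j * dtheta
        z = center + radius * cmath.exp(1j * theta)
        dz = 1j * radius * cmath.exp(1j * theta) * dtheta
        total += f(z) * dz
    return total

# f(z) = e^z / z^2 has residue 1 at z0 = 0, so the integral should be 2*pi*i.
I = contour_integral(lambda z: cmath.exp(z) / z**2, 0.0, 1.0)
print(I)  # ≈ 2πi ≈ 6.28319j
```

Because the integrand is periodic and analytic on the contour, the plain Riemann sum converges very rapidly here.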
Theorem 191.

1. Let f : U −→ C, f(z) = p(z)/q(z), with p, q analytic at z0 ∈ U . Suppose p(z0) ≠ 0 but q has a “zero of order m at z0”, i.e. q(z) = (z − z0)^m r(z) with r(z0) ≠ 0. Then f has a pole of order m at z0.

2. In particular, if p, q are as before and q(z0) = 0 but q′(z0) ≠ 0, then z0 is a simple pole of f and

Res(f ; z0) = p(z0)/q′(z0)
Jordan’s inequality

∫_0^{π/2} e^{−R sin θ} dθ < π/(2R)   (R > 0)
Chapter 8
Probability & Statistics
8.1 Probability
Definition 192. An outcome of an experiment is
called an event. It may be possible to regard cer-
tain more complex events as combinations of sim-
pler events. If an event cannot be exhibited as a
combination of other events associated with the ex-
periment, it is said to be simple or elementary.
Otherwise it is compound. A sample space S is the
collection of all elementary events which are also
called sample points. A subset A of S is, in general,
a compound event. The event A is said to occur if
all its constituent events simultaneously occur. An
elementary event is a singleton set. The event S is
called the sure event and the null set ∅ is called the
impossible event.
Henceforth S will denote some sample space.
Definition 193.

1. Given an event A, the non-occurrence of A is also an event called the complementary event. It corresponds to the set Ac, the complement of A. This is sometimes denoted by Ā or ∼A.

2. Given events A and B, the event “either A occurs or B occurs (or both occur)” is represented by the set A ∪ B. Given a finite or infinite sequence of events A1, A2, . . . , the corresponding representation is ∪n An. The event “both A and B occur (simultaneously)” is represented by the set A ∩ B. For a finite or infinite sequence of events A1, A2, . . . , the corresponding representation is ∩n An.

3. A collection of events A1, . . . , An in S is mutually exclusive if the sets Ai are pairwise disjoint. They are exhaustive if ∪_{i=1}^{n} Ai = S.

4. The event “if A occurs then B occurs” or “the occurrence of A implies the occurrence of B” is represented by the relation A ⊂ B.

5. The event “A occurs but not B” is the set A \ B.
Definition 194. A sample space S is said to be
discrete if S is finite or countably infinite, i.e. its
points can be arranged in an infinite sequence.
Let S = {x1, x2, . . . , xn, . . . } be a finite or infinite
discrete sample space.
Definition 195. Let p1, p2, . . . be a finite or infinite sequence of real numbers, 0 ≤ pi ≤ 1 for all i = 1, 2, . . . , such that Σ_i pi = 1. For any A ⊂ S, define

P (A) := Σ_{xi∈A} pi
(The sum is over all i such that xi ∈ A.) In par-
ticular, P ({xi}) = pi. Then P (A) is the assigned
probability that the event A occurs. Thus, P is a
function:
P : {all subsets of S} −→ [0, 1]
The sample space S together with the probability
function P is said to be a probability space.
A very important special case is the following.
Definition 196. Let S = {x1, x2, . . . , xn} be finite and take pi = 1/n for all i = 1, 2, . . . , n. Then the probability of the occurrence of an event A can be described as

P (A) = (no. of points in A)/(no. of points in S) = (no. of outcomes favourable to A)/(total no. of possible outcomes)
Theorem 197. Let S be a sample space and let
P (A) be the probability assigned to the event A as
in Definition 195. Then
1. P (∅) = 0.

2. P (⊔_{n=1}^{∞} An) = Σ_{n=1}^{∞} P (An), i.e. the probability of the union of a sequence of mutually exclusive events is the sum of the probabilities of the individual events.

3. P (A ∪ B) = P (A) + P (B) − P (A ∩ B) where the events A and B are not necessarily exclusive. In particular, P (A ∪ B) ≤ P (A) + P (B) (Boole’s inequality).

4. P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (A ∩ B) − P (B ∩ C) − P (C ∩ A) + P (A ∩ B ∩ C).

5. More generally,

P (∪_{k=1}^{n} Ak) = Σ_{k=1}^{n} (−1)^{k+1} Σ_{1≤i1<···<ik≤n} P (Ai1 ∩ · · · ∩ Aik)

Also, Boole’s inequality holds:

P (∪_{i=1}^{n} Ai) ≤ Σ_{i=1}^{n} P (Ai)
6. If A and B are mutually exclusive, then P (A∩B) = P (∅) = 0.
7. P (Ac) = 1− P (A).
8. If event A implies event B so that A ⊂ B, then
P (A) ≤ P (B).
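The identities of Theorem 197 can be verified on a concrete equiprobable space (Definition 196). A Python sketch; the two-dice sample space and the three events are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

# Equiprobable space: two throws of a fair die (Definition 196).
S = {(i, j) for i in range(1, 7) for j in range(1, 7)}

def P(A):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(A), len(S))

A = {s for s in S if s[0] == 6}          # first die shows 6
B = {s for s in S if s[1] == 6}          # second die shows 6
C = {s for s in S if s[0] + s[1] == 7}   # the sum is 7

# Item 3: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).  (Here `|`, `&` are set union/intersection.)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Item 5: inclusion-exclusion for the three events.
events = [A, B, C]
total = Fraction(0)
for k in range(1, 4):
    for sub in combinations(events, k):
        total += (-1) ** (k + 1) * P(set.intersection(*sub))
assert P(A | B | C) == total

# Boole's inequality.
assert P(A | B | C) <= P(A) + P(B) + P(C)
print(P(A | B), total)  # 11/36 5/12
```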
Definition 198. Let S be a finite sample space with
probability function P .
1. Events A and B are independent if P (A∩B) =
P (A)P (B).
2. A collection of events A1, . . . , An is said to be independent if for any finite subset i1, . . . , ik of indices, we have

P (Ai1 ∩ · · · ∩ Aik) = P (Ai1) · · · P (Aik)

Theorem 199. Let A1, . . . , An be a collection of independent events. Then, if any event Aj is replaced by its complementary event Aj^c, the resulting collection of events is again independent.
Definition 200.
1. Suppose an experiment with exactly two pos-
sible outcomes (usually termed “success” and
“failure”) with associated probabilities p and
q := 1 − p, is repeated n times. Assume
that any repetition is independent of any other.
Such an experiment is said to be a Bernoulli
trial.
2. If each repetition of the experiment has precisely k > 2 possible outcomes (labelled ω1, ω2, . . . , ωk) with associated probabilities p1, . . . , pk, Σ_{i=1}^{k} pi = 1, then the sequence of independent experiments is said to be a generalised Bernoulli trial.
Theorem 201.

1. In a sequence of n Bernoulli trials, the probability of obtaining exactly k successes (and hence, n − k failures), k = 0, 1, . . . , n, is given by

p(k) = (n choose k) p^k q^{n−k}

where as before q := 1 − p.

2. In a sequence of n generalised Bernoulli trials, the probability that outcome ω1 occurs n1 times, . . . , ωk occurs nk times, n1 + · · · + nk = n, is given by

p(n1, . . . , nk) = (n!/(n1! · · · nk!)) p1^{n1} · · · pk^{nk}
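Both formulas of Theorem 201 translate directly into code. A Python sketch; the coin and die examples are illustrative choices.

```python
from math import comb, factorial, prod

def binom_pmf(k, n, p):
    """p(k) = C(n, k) p^k (1-p)^(n-k): exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def multinom_pmf(counts, probs):
    """p(n1,...,nk) = n!/(n1!...nk!) * p1^n1 ... pk^nk (generalised Bernoulli trials)."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p**c for p, c in zip(probs, counts))

# Exactly 2 heads in 4 tosses of a fair coin: C(4,2)/2^4 = 3/8.
print(binom_pmf(2, 4, 0.5))          # 0.375
# A fair die thrown 6 times, each face exactly once: 6!/6^6.
print(multinom_pmf([1]*6, [1/6]*6))  # ≈ 0.0154
```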
8.1.1 Conditional probability
Let (S, P ) be a given probability space and A,B two
events. The conditional probability of the event A
occurring given that event B has already occurred
is defined to be
P (A |B) :=P (A ∩B)
P (B)
Here (in the context of discrete sample spaces) it
can be safely assumed that P (B) > 0. Alternat-
ively, we may write
P (A ∩B) = P (B)P (A |B) = P (A)P (B |A)
Theorem 202.
1. If A and B are independent events, then
P (A |B) = P (A).
2. (Chain Rule) Let A1, . . . , An be events such
that P (A1 ∩ · · · ∩An−1) > 0. Then
P (A1 ∩ · · · ∩An)
= P (A1)P (A2 |A1)P (A3 |A1 ∩A2)
· · ·P (An |A1 ∩ · · · ∩An−1)
Definition 203. Given a sample space S, a parti-
tion of S is a (finite or infinite) collection of events
An such that the events are mutually exclusive and
exhaustive.
Theorem 204. Let {Ei : i = 1, . . . , n} be a finite
partition of a sample space S and such that P (Ei) >
0 for all i.
1. (Theorem of total probability) If A is an arbitrary event, then

P (A) = Σ_{i=1}^{n} P (A ∩ Ei) = Σ_{i=1}^{n} P (A |Ei)P (Ei)

An exactly analogous theorem with the finite sum replaced by an infinite series is true for an infinite partition.

2. If A and B are any two events in S with P (B) > 0, then

P (A |B) = Σ_{i=1}^{n} P (A |B ∩ Ei)P (Ei |B)
Theorem 205. (Bayes’ theorem) Let {Ei : i = 1, . . . , n} be a partition of S with P (Ei) > 0 for all i. Suppose A is an event such that P (A) > 0. Then for each Ek

P (Ek |A) = P (A |Ek)P (Ek) / Σ_{i=1}^{n} P (A |Ei)P (Ei)

Corollary 206. If the partition consists of the two events {E,Ec}, then with the same assumptions as above,

P (E |A) = P (A |E)P (E) / [P (A |E)P (E) + P (A |Ec)P (Ec)]
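Corollary 206 is the form most often met in applications. A Python sketch with hypothetical numbers (the prevalence 0.01, sensitivity 0.99 and false-positive rate 0.05 are invented for illustration): E is the event "condition present", A the event "test positive".

```python
p_E = 0.01            # P(E): prevalence (hypothetical)
p_A_given_E = 0.99    # P(A|E): sensitivity (hypothetical)
p_A_given_Ec = 0.05   # P(A|E^c): false-positive rate (hypothetical)

# Theorem of total probability: P(A) = P(A|E)P(E) + P(A|E^c)P(E^c).
p_A = p_A_given_E * p_E + p_A_given_Ec * (1 - p_E)

# Bayes' theorem (Corollary 206): P(E|A) = P(A|E)P(E) / P(A).
p_E_given_A = p_A_given_E * p_E / p_A
print(round(p_E_given_A, 4))  # 0.1667
```

Despite the high sensitivity, the posterior probability is only 1/6, because the condition is rare.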
8.1.2 Random variables & probability distributions
The one-dimensional case
Definition 207. A (one-dimensional) discrete ran-
dom variable X is a function X : S −→ R :=
{a1, a2, . . . , an, . . . } ⊂ R. Here, the co-domain R
may be finite or infinite.
A (one-dimensional) continuous random variable X
is a function X : S −→ R.
The term is frequently abbreviated as r.v. We
use the following notation: if X is an r.v. on S and
A is any subset of the co-domain of X, then
{X ∈ A} := {s ∈ S : X(s) ∈ A} = X−1(A)
An alternative notation is (X ∈ A). The event
{X ∈ {a}} is written as {X = a}.
Definition 208. A distribution of an r.v. is the as-
signment of probabilities to all possible events when
they are defined in terms of X. In other words, a
distribution of X assigns probabilities to all events
of the form {X ∈ A}, where A is a subset of the co-domain of X such that X−1(A) is an event
(possibly the sure event S or the impossible event
∅) in the sample space S.
Definition 209. Let X : S −→ R be a r.v. The
function
FX(t) = P{X ≤ t}
is called the (cumulative) distribution function of
X. If the r.v. X is understood then FX is simply
denoted F . The abbreviations c.d.f. and d.f. are
common.
Definition 210.
1. Let X be a discrete r.v. The function p : R −→ [0, 1] defined by p(t) := P{X = t} is called the probability mass function of the r.v. X. Thus, p(t) is the probability that X takes the value t.

2. Let X be a continuous r.v. A function f : R −→ [0,∞) is called the probability density function or simply, density, of X if

FX(t) = ∫_{−∞}^{t} f(x) dx   (−∞ < t < ∞)

f is usually taken to be integrable on every finite interval [s, t], in which case FX becomes continuous.
Definition 211. Let X and Y be r.v.’s defined on
the same sample space. X and Y are said to be
independent iff whenever E := {X ∈ A} and F :=
{Y ∈ B} (A,B ⊂ R) are events,
P (E ∩ F ) = P (E)P (F )
General properties of distributions
Theorem 212. Let X be a discrete r.v. and A ⊂ R, the co-domain of X. Then

P{X ∈ A} = Σ_{t∈A} p(t)

where the sum may be restricted to all t such that p(t) > 0.
Theorem 213. Let FX be the distribution function
of the r.v. X. Then
1. 0 ≤ FX(t) ≤ 1 for all t ∈ R.
2. P{a < X ≤ b} = FX(b)− FX(a) when a < b.
3. a < b⇒ FX(a) ≤ FX(b).
4. P{a ≤ X ≤ b} = FX(b)−FX(a) +P{X = a}.
5. P{a < X < b} = FX(b)− FX(a)− P{X = b}.
6. P{a ≤ X < b} = FX(b) − FX(a) + P{X = a} − P{X = b}.
Theorem 214. Let FX be the distribution function
of a r.v. X. Then
1. F is nondecreasing.
2. lim_{t→−∞} FX(t) = 0 and lim_{t→∞} FX(t) = 1.

3. FX is “continuous from the right”:

lim_{t→a+} FX(t) = FX(a)

and

lim_{t→a−} FX(t) = FX(a) − P{X = a}
Theorem 215.

1. Let X be a continuous r.v. with density f . Then whenever a < b,

P{a ≤ X ≤ b} = P{a < X ≤ b} = P{a < X < b} = P{a ≤ X < b} = F (b) − F (a) = ∫_a^b f(x) dx

2. ∫_{−∞}^{∞} f(x) dx = 1.
Functions of a random variable
Let X : S −→ R be a r.v. with distribution function
FX . Let φ : R −→ R be any function. Then Y := φ(X) is a new r.v. defined by φ(X)(s) := φ(X(s)) for s ∈ S.
Theorem 216. Let X be a r.v. with density fX . Suppose φ is continuously differentiable with inverse ψ. Then the density fY of Y := φ(X) is given by

fY (y) = fX(ψ(y)) |ψ′(y)|

In particular, if φ(t) = at + b (a ≠ 0) so that Y = aX + b, then

fY (y) = (1/|a|) fX((y − b)/a)
Definition 217. A collection {X1, X2, . . . , Xn} of
r.v.’s defined on the same sample space S is said to
be identically distributed if FXi = FXj for all i, j.
Definition 218. (Expectation or Expected
Value or Mean)
1. (The discrete case) Let X be a discrete r.v. taking values {x1, x2, . . . , xn} with probability distribution pi := P{X = xi}. Then the expectation of X is

E(X) := Σ_{i=1}^{n} xi pi

If the set of values {x1, x2, . . . , xn, . . . } is infinite, then the expectation is again defined to be

E(X) := Σ_{i=1}^{∞} xi pi

under the assumption that Σ_{i=1}^{∞} |xi| pi < ∞.

2. If X is a continuous r.v. with density f(x), then the expectation is defined to be

E(X) := ∫_{−∞}^{∞} x f(x) dx

subject to the condition that ∫_{−∞}^{∞} |x| f(x) dx < ∞.
An alternative term for the expectation is first (ordinary) moment. The expectation of a r.v. is often denoted by E[X] or µX or mX or m1 (for “mean” or “moment”).
Another notion of a “mean” or “central value” is
the median.
Definition 219. Let X be a r.v. Then a number m ∈ R such that P{X ≤ m} ≥ 0.5 and P{X ≥ m} ≥ 0.5 is called the median of X.
Some properties of the expectation are listed be-
low.
Theorem 220. Let X and Y be r.v.’s on the same
sample space.
1. If X ≥ 0, then E(X) ≥ 0.

2. E(aX + b) = aE(X) + b for all a, b ∈ R.

3. |E(X)| ≤ E(|X|).

4. If E|X| < ∞ and E|Y | < ∞, then

E(X + Y ) = E(X) + E(Y )

This result extends to finitely many r.v.’s.

5. If X and Y are independent (see Definition 211) and E|XY | < ∞, then

E(XY ) = E(X)E(Y )

6. If X is a non-negative integer-valued discrete r.v., then

E(X) = Σ_{n=0}^{∞} P{X > n}

and if X is a non-negative continuous r.v. with c.d.f. F (t), then

E(X) = ∫_0^{∞} [1 − F (t)] dt
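Item 6 (the tail-sum formula) can be checked against a small binomial distribution, for which E(X) = np is known (Theorem 227). A Python sketch; Bin(4, 0.5) is an illustrative choice.

```python
from math import comb

# X ~ Bin(4, 0.5), so E(X) = np = 2.
n_trials, p = 4, 0.5
pmf = [comb(n_trials, k) * p**k * (1 - p)**(n_trials - k)
       for k in range(n_trials + 1)]

# Direct definition: E(X) = sum_k k * P{X = k}.
expectation = sum(k * pmf[k] for k in range(n_trials + 1))

# Tail-sum formula: E(X) = sum_{n >= 0} P{X > n}.
tail_sum = sum(sum(pmf[k] for k in range(n + 1, n_trials + 1))
               for n in range(n_trials))

print(expectation, tail_sum)  # 2.0 2.0
```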
Definition 221. (Variance) Let X be a r.v. such
that E(X2) < ∞. Then the variance of X is the
quantity
Var(X) := E[(X − E(X))2]
Definition 222. The standard deviation (s.d.) of a r.v. X is defined to be

σX := √Var(X)
Definition 223. Given a pair of r.v.’s X, Y , their covariance is defined as

Cov(X,Y ) := E(XY ) − E(X)E(Y ) = E[(X − E(X))(Y − E(Y ))]

If Cov(X,Y ) = 0, then X and Y are said to be uncorrelated or orthogonal.
Definition 224. Let X and Y be r.v.’s such that
E(X2) <∞ and E(Y 2) <∞. Then
ρ := ρX,Y := Cov(X,Y )/√(Var(X) Var(Y ))

is called the coefficient of correlation between X and Y .
Theorem 225.
1. −1 ≤ ρX,Y ≤ 1.
2. |ρX,Y | = 1 iff there are a, b ∈ R, a 6= 0, such
that P{Y = aX + b} = 1.
3. |ρX,Y | is invariant under a linear change of (random) variables: if a, b, c, d ∈ R are such that ac ≠ 0, then ρ_{aX+b, cY+d} = ρX,Y if ac > 0, and −ρX,Y if ac < 0.
The following are the properties of the variance,
the covariance and some related facts.
Theorem 226.
1. Var(X) = E(X2)− [E(X)]2.
2. If Var(X) exists, then
Var(aX + b) = a2Var(X)
for any a, b ∈ R.
3. If E(X2) and E(Y 2) exist, then

Var(X + Y ) = Var(X) + Var(Y ) + 2 Cov(X,Y )

4. Suppose the n r.v.’s X1, . . . , Xn are pairwise uncorrelated. Then

Var(X1 + · · · + Xn) = Var(X1) + · · · + Var(Xn)

5. If the mean square deviation from t is the quantity E[(X − t)2], it is minimised when t = E(X) and the minimum value is Var(X).

6. The quantity E(|X − t|), called the mean absolute deviation from t, is minimised at t = m, where m is the median of X.

7. (Schwarz inequality) Given two r.v.’s X and Y ,

[E(XY )]2 ≤ E(X2)E(Y 2)
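Several of these identities can be verified on an empirical distribution, where expectations become averages. A Python sketch; the uniform samples and the constants 3, −7 are arbitrary choices.

```python
import random

random.seed(0)
xs = [random.random() for _ in range(1000)]
ys = [random.random() for _ in range(1000)]

def E(vals): return sum(vals) / len(vals)
def var(vals): return E([v * v for v in vals]) - E(vals) ** 2        # item 1
def cov(a, b): return E([u * v for u, v in zip(a, b)]) - E(a) * E(b)

# Item 3: Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
lhs = var([u + v for u, v in zip(xs, ys)])
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
assert abs(lhs - rhs) < 1e-9

# Item 2: Var(aX + b) = a^2 Var(X).
assert abs(var([3 * u - 7 for u in xs]) - 9 * var(xs)) < 1e-9

# Item 7 (Schwarz): [E(XY)]^2 <= E(X^2) E(Y^2).
assert E([u * v for u, v in zip(xs, ys)]) ** 2 <= \
       E([u * u for u in xs]) * E([v * v for v in ys])
print("identities verified")
```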
8.1.3 Some standard distributions
Discrete distributions
Binomial distribution Let X be a r.v. defined to be the total number of successes in an experiment with only two outcomes, “success” (with some probability p) and “failure” (with probability 1 − p), conducted n times. Thus, X takes values 0, 1, 2, . . . , n. Then X is said to be binomially distributed and its probability mass function is given by

b(k; n, p) := P{X = k} = (n choose k) p^k (1 − p)^{n−k}

The fact that a r.v. has the binomial distribution is sometimes indicated by writing X ∼ Bin(n, p).
Theorem 227. If X ∼ Bin(n, p), then
E(X) = np and Var(X) = np(1− p).
Poisson distribution A r.v. X is said to have the Poisson distribution with parameter λ > 0 if X : S −→ {0, 1, 2, . . . } and

P{X = n} = (λ^n/n!) e^{−λ}

This is sometimes indicated by writing X ∼ Poi(λ).

Theorem 228. If X has the Poisson distribution with parameter λ, then E(X) = λ and Var(X) = λ.
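The moments in Theorem 228 can be recovered numerically from the pmf. A Python sketch; λ = 3 and the truncation point 60 are arbitrary choices (the tail beyond is negligible at this λ).

```python
from math import exp, factorial

lam = 3.0
# Truncated Poisson pmf: P{X = n} = (lam^n / n!) e^{-lam}.
pmf = [lam**n / factorial(n) * exp(-lam) for n in range(60)]

mean = sum(n * p for n, p in enumerate(pmf))
second_moment = sum(n * n * p for n, p in enumerate(pmf))
variance = second_moment - mean**2   # Theorem 226, item 1

print(round(mean, 6), round(variance, 6))  # 3.0 3.0
```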
Continuous distributions
Uniform distribution Let X be a continuous r.v. and [a, b] ⊂ R be a given interval. We define a probability density function (p.d.f.) of X as follows: f(x) = c if a ≤ x ≤ b, and f(x) = 0 otherwise, where c = 1/(b − a) so that the density integrates to 1. Then X is said to be uniformly distributed on [a, b] and this is symbolised by X ∼ U [a, b].

Theorem 229. If X ∼ U [a, b], then E(X) = (a + b)/2 and Var(X) = (b − a)²/12.
Theorem 230. If the distribution F of X is
strictly increasing, then Y := F (X) ∼ U [0, 1]
(see subsection (8.1.3)).
Normal or Gaussian distribution A continuous r.v. X is said to have the normal (or Gaussian) distribution with parameters µ and σ > 0 if it has density

f(x) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)}

This distribution is indicated by writing X ∼ N(µ, σ²).

Theorem 231. If X ∼ N(µ, σ²), then E(X) = µ and Var(X) = σ². A linear transformation of a normally distributed r.v. is normally distributed. If X1, . . . , Xn are independent r.v.’s such that Xi ∼ N(µi, σi²), then

c1X1 + · · · + cnXn ∼ N(Σ_{i=1}^{n} ciµi, Σ_{i=1}^{n} ci²σi²)

In particular, if X′ := (X − µ)/σ, then X′ ∼ N(0, 1).
8.2 Statistics
Statistic A population being analysed has certain
statistical measures such as the expectation (or
mean) and variance defined on it. In prac-
tice, such measures can only be computed for
samples. Any measure of interest determined
for a sample is called a statistic.
Error estimates Since statistical quantities computed from a sample are likely to differ from those calculated for the whole population, an idea of the errors incurred is required.
Definition 232. The probability distribution
of a given statistic (e.g. mean or variance) is
called its sampling distribution. The standard
deviation of the sampling distribution is called
the standard error (SE) of the statistic.
Theorem 233. Suppose random samples of size n are drawn from a population assumed to have mean µ and variance σ². Let Xi (i = 1, . . . , n) be independent identically distributed (iid) r.v.’s representing the value of the ith member of a sample. Suppose Xi ∼ N(µi, σi²). Then the standard errors of the various statistics are as follows.

Sample mean vs population mean Suppose µi = µ and σi = σ for all i. The sample mean and the sample variance are defined to be the mean and the variance of the r.v. X̄ = (1/n) Σ_{i=1}^{n} Xi. Hence

E(X̄) = µ and Var(X̄) = σ²/n

Thus, X̄ ∼ N(µ, σ²/n) and the SE is σ/√n.
Difference between means of two samples Let samples of size n1 and n2 be drawn from the same or two different populations which have mean µ and variances σ1² and σ2² (both may be equal). If X̄1 and X̄2 are the means of the samples respectively, assume that X̄1 ∼ N(µ, σ1²/n1) and X̄2 ∼ N(µ, σ2²/n2). Then the SE of X̄1 − X̄2 is

√(σ1²/n1 + σ2²/n2)
Linear regression
The method of least squares
Let Ax = b be a linear non-homogeneous system of m equations in n unknowns, where A = (aij) is the m × n coefficient matrix and b = (b1, b2, . . . , bm)^T. Then the approximate solution according to the method of least squares is that x which minimises the quantity Σ_{i=1}^{m} Ei², where

Ei := ai1x1 + ai2x2 + · · · + ainxn − bi

This gives the system of normal equations which can be solved to obtain the minimising x:

Σ_{i=1}^{m} ai1Ei = 0,  Σ_{i=1}^{m} ai2Ei = 0,  . . . ,  Σ_{i=1}^{m} ainEi = 0
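The normal equations above are equivalent to the n × n system (AᵀA)x = Aᵀb. A self-contained Python sketch; the line-fitting data points are invented for illustration.

```python
def least_squares(A, b):
    """Solve the normal equations (A^T A) x = A^T b by Gauss-Jordan elimination."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[i][r] * A[i][c] for i in range(m)) for c in range(n)]
           for r in range(n)]
    Atb = [sum(A[i][r] * b[i] for i in range(m)) for r in range(n)]
    # Augmented matrix [AtA | Atb], reduced with partial pivoting.
    M = [row[:] + [rhs] for row, rhs in zip(AtA, Atb)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]

# Fit y = x1 + x2*t to four noisy points lying roughly on y = 1 + 2t.
ts = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.1, 6.9]
A = [[1.0, t] for t in ts]
x = least_squares(A, ys)
print([round(v, 3) for v in x])  # [1.06, 1.96]
```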
Linear regression Let (X,Y ) be a bivariate r.v. Let the N data-pairs (xi, yi) have frequencies fi for i = 1, 2, . . . , n, so that N = Σ_{i=1}^{n} fi. The following are the equations of the lines of regression.

Deviations parallel to the y-axis minimised This is called the line of regression of y on x:

y − ȳ = (µ11/σx²)(x − x̄)

where we use the following notation:

x̄ := (1/N) Σ_{i=1}^{n} fixi

ȳ := (1/N) Σ_{i=1}^{n} fiyi

σx² := Var({x1, . . . , xn}) = ((1/N) Σ_{i=1}^{n} fixi²) − x̄² = E(X²) − E(X)²

µ11 := ((1/N) Σ_{i=1}^{n} fixiyi) − x̄ȳ = Cov(X,Y )

Deviations parallel to the x-axis minimised With similar notation as above, the equation of the line of regression of x on y is

x − x̄ = (µ11/σy²)(y − ȳ)
Definition 234. The numbers byx := µ11/σx² and bxy := µ11/σy² are called the coefficients of regression of y on x and x on y respectively.

Theorem 235. If ρX,Y denotes the correlation coefficient (see Definition 224), then

ρX,Y = ±√(byx bxy)

the sign being that of µ11.
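The regression quantities above can be computed directly. A Python sketch with invented data (all frequencies fi = 1, so N = n):

```python
# Hypothetical data pairs, each with frequency 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.5, 6.0, 8.5, 10.0]
N = len(xs)

xbar = sum(xs) / N
ybar = sum(ys) / N
mu11 = sum(x * y for x, y in zip(xs, ys)) / N - xbar * ybar   # product moment Cov(X, Y)
var_x = sum(x * x for x in xs) / N - xbar**2
var_y = sum(y * y for y in ys) / N - ybar**2

b_yx = mu11 / var_x   # slope of the line of regression of y on x
b_xy = mu11 / var_y   # slope of the line of regression of x on y
rho = mu11 / (var_x * var_y) ** 0.5

# Theorem 235: |rho| = sqrt(b_yx * b_xy).
assert abs(abs(rho) - (b_yx * b_xy) ** 0.5) < 1e-12
print(b_yx, rho)
```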
Chapter 9
Numerical Methods
9.1 Errors
Definition 236. Let x be a real or complex number and x̄ be an approximation to it. Then

1. The absolute error is the number |x − x̄|.

2. The relative error is |x − x̄|/|x| (for x ≠ 0).

3. The percentage error is (|x − x̄|/|x|) × 100.
Theorem 237. Let f(x1, x2, . . . , xn) be a function with continuous partial derivatives. If there are errors ∆xi in each xi such that (∆xi)² can be neglected for all i = 1, 2, . . . , n, then the approximate error ∆f in f is given by

∆f ≈ Σ_{i=1}^{n} (∂f/∂xi) ∆xi

and the relative error in computing f is given by

∆f/f ≈ Σ_{i=1}^{n} (∂f/∂xi) ∆xi / f(x1, . . . , xn)
9.2 Solution of algebraic & transcendental equations
The bisection method Let f : [a, b] −→ R be continuous and such that f(a)f(b) < 0. To find a root of f , i.e. a point c such that f(c) ≈ 0, the following algorithm is used. Let ε > 0 be small and represent the maximum acceptable absolute error of the approximate root c.

1. Let c := (a + b)/2.

2. If b − c ≤ ε, then accept c as the approximate root and exit.

3. If f(b)f(c) ≤ 0, then set a := c; else set b := c.

4. Return to step 1.
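The algorithm translates directly into code. A Python sketch applied to the transcendental equation cos x = x (an illustrative choice with a root in [0, 1]):

```python
from math import cos

def bisect(f, a, b, eps=1e-10):
    """Bisection method as in the algorithm above; requires f(a)f(b) < 0."""
    assert f(a) * f(b) < 0
    while True:
        c = (a + b) / 2          # step 1
        if b - c <= eps:         # step 2: accept c
            return c
        if f(b) * f(c) <= 0:     # step 3: root lies between c and b
            a = c
        else:
            b = c

root = bisect(lambda x: cos(x) - x, 0.0, 1.0)
print(round(root, 6))  # 0.739085
```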
Newton’s method Also called the Newton-Raphson method and the tangent method. Let f : R −→ R be differentiable. An approximate root is obtained by the following algorithm:

1. Pick any x0 ∈ R. If f ′(x0) ≠ 0, set x1 := x0 − f(x0)/f ′(x0). Else pick another x0.

2. In general, xn+1 = xn − f(xn)/f ′(xn) for all n ≥ 0, assuming f ′(xn) ≠ 0.
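A Python sketch of the iteration; the stopping tolerance and iteration cap are practical additions not specified above.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        d = fprime(x)
        if d == 0:
            raise ZeroDivisionError("f'(x) = 0; pick another starting point")
        x = x - fx / d
    return x

# sqrt(2) as the positive root of f(x) = x^2 - 2, starting from x0 = 1.
r = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(r)  # ≈ 1.4142135623730951
```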
9.3 Numerical differentiation

Let f : [a, b] −→ R be differentiable to whatever order necessary. Let

a ≤ x0 < x1 < · · · < xn ≤ b and h = xi − xi−1

for all i, meaning that the n + 1 nodal or tabular points xi are equally spaced a distance h apart.

Theorem 238.

1. f ′(xk) ≈ (1/2h)[−3f(xk) + 4f(xk+1) − f(xk+2)] for k ≤ n − 2.

2. f ′′(xk) ≈ (1/h²)[f(xk) − 2f(xk+1) + f(xk+2)] for k ≤ n − 2.

3. Near the centre, the following formulas are useful:

f ′(xk) ≈ [f(xk−2) − 8f(xk−1) + 8f(xk+1) − f(xk+2)]/(12h)

provided terms in h⁴ and higher order can be neglected, and

f ′′(xk) ≈ [f(xk+1) − 2f(xk) + f(xk−1)]/h²

provided h² is small.
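The formulas of Theorem 238 can be tested on f = exp, for which every derivative is again exp. A Python sketch; x = 1 and h = 0.01 are arbitrary choices.

```python
from math import exp

f = exp
x, h = 1.0, 0.01
exact = exp(1.0)   # f'(1) = f''(1) = e

fwd_d1 = (-3*f(x) + 4*f(x + h) - f(x + 2*h)) / (2*h)                   # item 1
fwd_d2 = (f(x) - 2*f(x + h) + f(x + 2*h)) / h**2                       # item 2
cen_d1 = (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12*h)  # item 3
cen_d2 = (f(x + h) - 2*f(x) + f(x - h)) / h**2                         # item 3

for approx in (fwd_d1, fwd_d2, cen_d1, cen_d2):
    print(abs(approx - exact))  # the central formulas are markedly more accurate
```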
9.4 Numerical integration
Let f : [a, b] −→ R be integrable.
The trapezoidal rule If the interval [a, b] is small, the following approximation holds:

∫_a^b f(x) dx ≈ ((b − a)/2)[f(a) + f(b)]

If the interval [a, b] is not small, it is partitioned into n subintervals of length h := (b − a)/n:

a =: x0 < x1 < · · · < xn = b

with xk := x0 + kh, k = 1, . . . , n. Then

∫_a^b f(x) dx ≈ (h/2)[f(x0) + 2 Σ_{k=1}^{n−1} f(xk) + f(xn)]

This formula is also called the composite trapezoidal rule.
Simpson’s rule If [a, b] is small, divide it into two equal parts, each of length h := (b − a)/2. Given the three points (x0, f(x0)), (x1, f(x1)) and (x2, f(x2)), where a = x0 < x1 = (a + b)/2 < x2 = b,

∫_a^b f(x) dx ≈ (h/3)[f(x0) + 4f(x1) + f(x2)]

This method is also known as Simpson’s 1/3 Rule. If the interval is not small, it is partitioned into an even number 2n of subintervals of length h by the points

a = x0 < x1 := x0 + h < · · · < x2n := x0 + 2nh = b

The approximation is now given by

∫_a^b f(x) dx ≈ (h/3) Σ_{k=0,2,...,2n−2} [f(xk) + 4f(xk+1) + f(xk+2)]
= (h/3)[f(x0) + 4 Σ_{k=1}^{n} f(x2k−1) + 2 Σ_{k=1}^{n−1} f(x2k) + f(x2n)]
9.5 Numerical solution of ODEs
Let
y′ = f(x, y), y(x0) = y0 (9.1)
be an initial value problem.
Single-step methods In this class of methods
(also known as single-point methods), the pro-
cess starts with an initial choice (x0, y0) and at
the points xk := x0 +kh (k = 0, 1, . . . , n and h
represents the step size of the nodal or tabular
points) successively computes the approxima-
tions yk to the exact values y(xk) by means of
the iteration scheme
yk+1 = yk + hF (xk, yk, h, f)
where F (. . . ) is a function which specifies the
method.
Definition 239. In the Euler method
F (x, y, h, f) = f(x, y). This is a so-called
first-order method.
Let y(x) represent the exact solution of the initial value problem (9.1) at an initial point x0 with y(x0) = y0. Let h represent a step size. Define

∆(x0, y0, h, f) := (y(x0 + h) − y0)/h if h ≠ 0, and f(x0, y0) if h = 0.

To obtain a general two-stage Runge-Kutta method, let

F (x0, y0, h) := af(x0, y0) + bf(x0 + ph, y0 + qhf(x0, y0))

where a, b, p, q are chosen so that the Taylor expansion of

∆(x0, y0, h, f) − F (x0, y0, h)

in terms of h begins with the highest possible power.
Definition 240.

1. (Heun’s method or the Euler-Cauchy method) This method is defined by choosing a = b = 1/2 and p = q = 1, and setting

yk+1 = yk + hF (xk, yk, h, f) = yk + (h/2)[f(xk, yk) + f(xk + h, yk + hf(xk, yk))]

2. (Modified Euler-Cauchy method) Here a = 0, b = 1, p = q = 1/2. We obtain the method

yk+1 = yk + hf(xk + h/2, yk + (h/2)f(xk, yk))
Definition 241. (Four-stage explicit Runge-Kutta method) Here

F (x, y, h) = (1/6)(k1 + 2k2 + 2k3 + k4)

and so

yk+1 = yk + hF (xk, yk, h)

where

k1 := f(x, y)
k2 := f(x + h/2, y + (h/2)k1)
k3 := f(x + h/2, y + (h/2)k2)
k4 := f(x + h, y + hk3)
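A Python sketch of one Runge-Kutta step, applied to y′ = y, y(0) = 1 (an illustrative problem with exact solution e^x):

```python
def rk4_step(f, x, y, h):
    """One step of the four-stage explicit Runge-Kutta method:
    y_{k+1} = y_k + (h/6)(k1 + 2*k2 + 2*k3 + k4)."""
    k1 = f(x, y)
    k2 = f(x + h/2, y + h/2 * k1)
    k3 = f(x + h/2, y + h/2 * k2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2*k2 + 2*k3 + k4) / 6

# Integrate y' = y from x = 0 to x = 1 in ten steps of h = 0.1.
x, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = rk4_step(lambda t, u: u, x, y, h)
    x += h
print(y)  # ≈ 2.71828, close to e ≈ 2.718282 (global error O(h^4))
```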