Compendium of Results in Advanced Calculus

A draft.
Contents

1 Linear Algebra
  1.1 Matrix Algebra
    1.1.1 Some Special Matrices
    1.1.2 Operations with Matrices
    1.1.3 Inverses
    1.1.4 The dot or inner product
    1.1.5 Some more special matrices
  1.2 Linear Maps or Transformations
  1.3 Determinant & Trace
  1.4 Systems of Linear Equations
    1.4.1 Solution of a System of Linear Equations
  1.5 Eigenvalues and Eigenvectors
2 Calculus
  2.1 Mean-Value Theorems & their Consequences
  2.2 Limits & indeterminate forms
  2.3 Partial derivatives
  2.4 Maxima & Minima
  2.5 Theorems on Integration
  2.6 Improper Integrals
    2.6.1 Convergence tests for Type 1a & 1b integrals
    2.6.2 Convergence tests for Type 2 integrals
  2.7 Uniform convergence & improper integrals
  2.8 The Gamma Function
  2.9 Multiple Integrals
  2.10 Vector identities
  2.11 Line, Surface & Volume Integrals
  2.12 Green's, Stokes' & Gauss' theorems
3 Fourier series
4 Integral Transforms
  4.1 Laplace Transforms
  4.2 Fourier Transforms
  4.3 Z-Transforms
5 Ordinary Differential Equations (ODEs)
  5.1 First-order equations
  5.2 First-order equations in separable form
  5.3 Exact First-order ODEs
  5.4 General first-order first-degree linear equations
    5.4.1 Integrating factors
  5.5 First-order nth degree equations
  5.6 Linear ODEs
    5.6.1 Linear ODE of Euler- (or Cauchy-) type
    5.6.2 nth order constant coefficient homogeneous equations
6 Partial Differential Equations (PDEs)
  6.1 Formation of a PDE
  6.2 First-order PDEs
    6.2.1 Special types of first-order equations
  6.3 Linear PDEs with constant coefficients
  6.4 Some special linear PDEs
    6.4.1 The one-dimensional wave equation
    6.4.2 The two-dimensional wave equation
    6.4.3 The three-dimensional wave equation
    6.4.4 Two-dimensional Laplace equation in a rectangle
    6.4.5 Two-dimensional Laplace equation in a circle with Dirichlet conditions
    6.4.6 Laplace's equation in three dimensions
    6.4.7 One-dimensional heat equation
    6.4.8 Two-dimensional heat equation
7 Complex Variables
  7.1 Preliminaries
  7.2 Linear Fractional Transformations
  7.3 Elementary Functions
  7.4 Analytic Functions
  7.5 Complex integration
  7.6 Series
  7.7 Residues & Poles
8 Probability & Statistics
  8.1 Probability
    8.1.1 Conditional probability
    8.1.2 Random variables & probability distributions
    8.1.3 Some standard distributions
  8.2 Statistics
9 Numerical Methods
  9.1 Errors
  9.2 Solution of algebraic & transcendental equations
  9.3 Numerical differentiation
  9.4 Numerical integration
  9.5 Numerical solution of ODEs
Notation

The following notation will be used throughout.

N, R and C will represent the natural numbers {1, 2, . . . }, the real numbers and the complex numbers respectively; these will often be referred to as scalars. R^n is real n-dimensional space.

The symbol " := " means that the LHS is defined by the RHS.

∐ An is the "disjoint union" of the sets An, viz. the union of the sets An, which are assumed to be disjoint.

If (an) is an infinite sequence, then

    lim sup an := lim_{n→∞} [ sup{an, an+1, . . . } ]

Functions will usually be described in the f : X −→ Y format. The domain [a, b] or (a, b) of a real-valued function will be assumed to be bounded unless otherwise specified, i.e. −∞ < a < b < ∞.

A function f : (a, b) −→ R, −∞ ≤ a < b ≤ ∞, is C^n if f is differentiable n times and its derivatives f^(1), f^(2), . . . , f^(n) are continuous. Continuous functions are C^0.
Chapter 1
Linear Algebra
1.1 Matrix Algebra
Definition 1. A matrix is a collection of objects
or symbols displayed in a rectangular arrangement.
The objects may be, for example, numbers, func-
tions or other matrices. We will be concerned only
with matrices whose entries are numbers. The ob-
jects are called the entries of the matrix. The col-
lection of entries may be finite or infinite.
Example. The matrices

    [ a  b  c ]        and        [ a ]
    [ d  e  f ]

are finite 2×3 and 1×1 matrices respectively. The matrix

    [ 1    1/2  1/3  · · · ]
    [ 1/2  1/3  1/4  · · · ]
    [ ...  ...  ...        ]

is infinite.
We will be concerned only with finite matrices. A
matrix with real-number entries will be called a
real matrix and one with complex-number entries a
complex matrix. Every real matrix is also a complex
matrix. If the matrix has m rows and n columns
(m,n ≥ 1), the matrix is said to be of order m×n.
If the matrix has m rows and m columns, then its
order is said to be m. A matrix is usually conveniently specified by a typical entry. Thus, the m×n matrix

    A = [ a11  a12  · · ·  a1j  · · ·  a1n ]
        [ a21  a22  · · ·  a2j  · · ·  a2n ]
        [ ...  ...         ...         ... ]
        [ ai1  ai2  · · ·  aij  · · ·  ain ]
        [ ...  ...         ...         ... ]
        [ am1  am2  · · ·  amj  · · ·  amn ]
is written A = [aij ], where aij is the entry in the
ith row and jth column and is said to be in the
(i, j) position. Sometimes the order of the matrix
is made explicit by writing Am,n or Am×n. If
m = n, then the abbreviation An is also used.
A submatrix of a matrix is a matrix formed by the entries from the intersection of some set of rows {i1, i2, . . . , ik} and some set of columns {j1, j2, . . . , jl} of the original matrix. A principal submatrix of a square matrix (see below) is formed by selecting the same rows and columns {i1, i2, . . . , ik}.
1.1.1 Some Special Matrices
Zero matrix The matrix all of whose entries are
0. It is usually denoted by 0 or 0m×n.
    0 := [ 0  0  · · ·  0 ]
         [ ...          ...]
         [ 0  0  · · ·  0 ]
Matrix unit (of order m×n) The matrix unit
Eij of order m×n is defined to have 1 in the
(i, j) position and 0 elsewhere:
    Eij := [ 0  · · ·  0  · · ·  0 ]
           [ ...      ...      ...]
           [ 0  · · ·  1  · · ·  0 ]    (the 1 is in the (i, j) position)
           [ ...      ...      ...]
           [ 0  · · ·  0  · · ·  0 ]
In terms of the Kronecker delta or symbol, δij ,
defined by
    δij = { 1 if i = j
          { 0 if i ≠ j
the (k, l)th entry of the matrix unit Eij is
δikδjl.
Row- & column vectors A 1×n matrix is called
a row-vector or -matrix and an n×1 matrix is
called a column vector, column matrix or simply a vector. An n×1 column vector is often termed an n-dimensional vector. If the vector is assumed to have entries only from R (resp. C), it is said to be a real (resp. complex) vector. The column vector

    [ a1 ]
    [ a2 ]
    [ ...]
    [ an ]

is sometimes written as [a1 a2 · · · an]^T for typographical convenience. The symbol T (the transpose) is defined in subsection 1.1.2. A row vector is also written (a1, a2, . . . , an). The entries of a column vector are called its components.
Row-reduced Echelon matrix Its defining
characteristics are:
(i) If a row has non-zero entries (i.e. a non-
zero row), then its first non-zero entry is
1, i.e. the leading term of the row is 1.
(ii) All the entries in the column containing a
leading 1 are 0.
(iii) If rows i and j are non-zero, i < j, then the leading 1 of the ith row occurs to the left of the leading 1 of the jth row.

(iv) Any zero row occurs below every non-zero row.

An example is the matrix

    [ 0  1  0  0  a  b ]
    [ 0  0  0  1  c  d ]
    [ 0  0  0  0  0  0 ]
    [ 0  0  0  0  0  0 ]
Square matrix An m×n matrix A = [aij ] with
m = n, i.e. the number of rows of the matrix
is equal to the number of its columns. The list
of entries {aii : i = 1, 2, . . . , n} is the diagonal
of A. Note that diagonals can only be defined
for square matrices. Entries which do not lie
on the diagonal are said to be off-diagonal.
The list {a12, a23, . . . , ai,i+1, . . . , an−1,n} is the superdiagonal and the list {a21, a32, . . . , ai,i−1, . . . , an,n−1} is the subdiagonal of A.
Identity matrix A square matrix whose di-
agonal is {1, 1, . . . , 1} and all off-diagonal
entries are 0. It is usually denoted by I or
if the order is to be emphasised, In. Thus
    In := [ 1  0  · · ·  0 ]
          [ 0  1  · · ·  0 ]
          [ ...          ...]
          [ 0  0  · · ·  1 ]
Diagonal matrix A square matrix whose
off-diagonal entries are all 0. The
identity matrix is diagonal. A gen-
eral diagonal matrix is often written as
diag[a1, a2, . . . , an].
Triangular matrix An n×n square matrix
is upper triangular if all entries below the
diagonal are 0:
    [ a11  a12  a13  · · ·  a1n ]
    [ 0    a22  a23  · · ·  a2n ]
    [ 0    0    a33  · · ·  a3n ]
    [ ...            ...     ...]
    [ 0    0    0    · · ·  ann ]
The definition of lower triangular is similar.
Partitioned or Block matrices A block or par-
titioned matrix is a matrix whose entries are
themselves matrices which satisfy the condi-
tion that the matrices in a given row of the
block-matrix each have the same number of
rows and those in a given column, the same
number of columns. In other words, if the
entries of the individual matrices are written
out, the result should be a matrix. For example,

    [ A2×3  B2×4  C2×1 ]
    [ D3×3  E3×4  F3×1 ]

is a block-matrix whereas

    [ A2×4  B2×4  C2×1 ]
    [ D3×3  E3×3  F3×3 ]

is not. For convenience, a matrix is sometimes written in block form. An n×n block diagonal matrix has the form

    [ A1  0   0   · · ·  0  ]
    [ 0   A2  0   · · ·  0  ]
    [ ...            ...    ]
    [ 0   0   0   · · ·  An ]
Here each 0 is a zero matrix of appropri-
ate order. A matrix can be partitioned in
many different ways by drawing horizontal
lines between rows and vertical lines between
columns. The terminology of matrices with
scalar entries can usually be carried over to
block matrices.
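As a quick illustration of the definitions above, here is a minimal Python sketch (the names kronecker_delta, matrix_unit and identity are ours, not the text's; matrices are plain lists of lists, with 0-based indices where the text uses 1-based):

```python
def kronecker_delta(i, j):
    # delta_ij = 1 if i = j, 0 otherwise
    return 1 if i == j else 0

def matrix_unit(i, j, m, n):
    # The m x n matrix unit E_ij: its (k, l) entry is delta_ik * delta_jl,
    # i.e. 1 in the (i, j) position and 0 elsewhere.
    return [[kronecker_delta(i, k) * kronecker_delta(j, l)
             for l in range(n)] for k in range(m)]

def identity(n):
    # I_n: diagonal entries 1, off-diagonal entries 0
    return [[kronecker_delta(i, j) for j in range(n)] for i in range(n)]

E01 = matrix_unit(0, 1, 2, 3)   # 1 in row 0, column 1 (0-based)
```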
1.1.2 Operations with Matrices
Addition Given two m×n matrices A = [aij ] and
B = [bij ], the matrix A+B is defined to be the
m×n matrix whose (i, j)th entry is aij + bij ,
i.e. matrix addition is defined entrywise.
Analogously, if X = [Xij ] and Y = [Yij ], are
partitioned matrices (each entry is a matrix
of suitable order), then X + Y = [Xij + Yij ],
provided Xij and Yij are of the same order for
each i, j.
The following results are immediate and hold
for matrices with scalar entries and for block
matrices:
Theorem 2. Matrix addition is
1. commutative: A+B = B +A.
2. associative A + (B + C) = (A + B) + C
for any three matrices A, B and C of the
same order.
For this reason the addition as above is often
written without parentheses: A + B + C.
Clearly, addition can be extended to any finite
number of summands.
Any m×n matrix A = [aij ] can be written as

    A = ∑_{i=1}^{m} ∑_{j=1}^{n} aij Eij
where the Eij ’s are the matrix units defined
earlier.
Multiplication by scalars For any scalar c and
any matrix A = [aij ], the matrix cA has the
(i, j)th entry caij , i.e. every entry of A is
multiplied by c. When c = −1, the matrix
(−1)A is usually denoted by −A. Note that
A + (−A) = 0 = (−A) + A. The definition is
the same for block matrices. The square mat-
rix cIn is called a scalar matrix.
Transpose & Hermitian Adjoint If A = [aij ]
is an m× n matrix with entries from R, its
transpose is defined to be the n×m matrix
denoted AT (also At or tA) and defined by
AT = [aji]. That is, the entry in the (i, j) pos-
ition of AT is the entry in the (j, i) position of
A. Another way of describing the transposed
matrix is to say that it is obtained from A
by replacing its ith row by its ith column,
i = 1, 2, . . . ,m. The corresponding operation
for a block-matrix X = [Xij ] is X^T = [Xji^T],
i.e. both the block-matrix and its individual
matrix entries are transposed.
If A = [aij ] has entries from C, then it
is convenient to define a slightly modified
version of the transpose, viz. the hermitian
adjoint (or often simply, adjoint). This is the
matrix denoted A∗ and defined by A∗ = [āji], where āji is the conjugate of the complex number aji. When the entries are all real, A^T = A∗. The notation Ā^T for A∗ is also
used. The corresponding definition for block
matrices is X∗ = [Xji∗].
Theorem 3.
1. (AT)T = A and (A∗)∗ = A.
2. (A+B)T
= AT + BT and (A + B)∗ =
A∗ +B∗.
3. (AB)T
= BTAT and (AB)∗ = B∗A∗.
Multiplication of matrices The product of two
matrices Am×n = [aij ] and Bn×p = [bjk], denoted by AB, is the m×p matrix whose (i, k)th entry is ∑_{r=1}^{n} air brk. Matrix multiplication is defined only when the number of
columns of the first factor is equal to the num-
ber of rows of the second factor. For block
matrices Xm×n = [Xij ] and Yn×p = [Yij ],
XY = [∑_{r=1}^{n} Xir Yrj ], provided all the multiplications Xir Yrj are valid, and is of order
m×p.
Theorem 4.
1. Matrix multiplication is not in general
commutative: it is not necessarily true
that AB = BA.
2. Multiplication is associative: If A, B
and C are three matrices such that the
products BC and A(BC) are defined,
then so are AB and (AB)C. Moreover,
A(BC) = (AB)C.
3. Multiplication is distributive: given
matrices A, B and C, A(B + C) =
AB + AC, if the sum B + C and the
products AB, AC are defined. Analog-
ously, (B + C)A = BA+ CA.
4. Am×n0n×p = 0m×p and 0p×mAm×n = 0p×n.
5. Am×nIn = Am×n = ImAm×n.
6. c(AB) = (cA)B = A(cB), c a scalar.
7. (Product of matrix units) EijEkl =
δjkEil, where δjk is the Kronecker sym-
bol defined earlier.
When a product AB is formed, B is said
to be pre- or left-multiplied by A which in
turn is said to be post- or right-multiplied by B.
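The entrywise product rule above translates directly into code. A minimal sketch (plain Python lists; matmul is our name, not the text's), also exhibiting the failure of commutativity from Theorem 4:

```python
def matmul(A, B):
    # (i, k) entry of AB is the sum over r of A[i][r] * B[r][k];
    # defined only when columns(A) == rows(B).
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][r] * B[r][k] for r in range(n)) for k in range(p)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)   # [[2, 1], [4, 3]]
BA = matmul(B, A)   # [[3, 4], [1, 2]] -- so AB != BA in general
```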
1.1.3 Inverses
Let Am×n, A′n×m and A′′n×m be matrices such
that A′A = In and AA′′ = Im. Then
A′ is a left inverse of A and A′′ a right
inverse of A. In general, there will exist
many left (or right) inverses.
Theorem 5. 1. If Am×n has a left in-
verse and a right inverse, then A
must be square and the two inverses
are equal and unique.
2. If A is square and has a left in-
verse (resp. a right inverse), then it
also has a right inverse (resp. a left
inverse) and the two are equal and
unique.
When A is square, the unique matrix B,
if it exists, satisfying BA = I = AB is
called the inverse of A and is denoted
A−1. A matrix which has an inverse
is said to be invertible or non-singular.
Otherwise it is said to be non-invertible
or singular. The basic properties of
inverses are summarised in the next
theorem.
Theorem 6. Let A and B be n×n in-
vertible matrices and A−1 and B−1 their
respective inverses. Then
1. (A−1)−1 = A.
2. (AB)−1
= B−1A−1. Thus, a product
of invertible matrices is invertible.
This result extends to any finite num-
ber of factors.
3. (Left- and Right-cancellation) AX =
AY ⇒ X = Y and XA = Y A ⇒X = Y , where X, Y are arbitrary
matrices for which the products are
defined.
4. (A−1)T = (AT)−1 and (A∗)−1 =
(A−1)∗.
5. A is invertible ⇐⇒ detA ≠ 0, where
detA is the determinant of A.
6. A diagonal matrix D := diag[a1, a2, . . . , an] is invertible iff each ai ≠ 0. The inverse, if it exists, is D−1 = diag[1/a1, 1/a2, . . . , 1/an].
7. If a real or complex square matrix
A = [aij ] is strictly diagonally dom-
inant, i.e. for each i = 1, 2, . . . , n,
    |aii| > ∑_{j=1, j≠i}^{n} |aij |
then A is invertible.
8. Let

       X = [ Am×m  Bm×n ]
           [ Cn×m  Dn×n ]

   be a block (or partitioned) matrix with A invertible. Then X is invertible if and only if Y := D − CA−1B is invertible, in which case X−1 is given by

       X−1 = [ A−1 + A−1BY−1CA−1    −A−1BY−1 ]
             [ −Y−1CA−1              Y−1     ]

9. An important special case of the above result occurs when C = 0. Then X is invertible iff A and D are invertible and

       X−1 = [ A−1   −A−1BD−1 ]
             [ 0      D−1     ]

A pair of square matrices A and B of the
same order are similar if B = PAP−1 for
some invertible P . They are orthogonally
(resp. unitarily) similar if P is orthogonal
(resp. unitary).
Theorem 7. Similar matrices have the
same determinant.
A and B are congruent or cogredient if
B = PAPT (or PAP ∗ in the case of com-
plex matrices) for some invertible P . Two
m×n matrices A and B are equivalent if
there exist invertible matrices Pm and Qn such that B = Pm A Qn.
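For 2×2 matrices the inverse can be written down explicitly via the adjugate, which gives a quick numerical check of Theorem 6.2, (AB)−1 = B−1A−1. A sketch with exact rational arithmetic (inv2 and matmul are our names, not the text's):

```python
from fractions import Fraction

def matmul(A, B):
    # product of square matrices given as lists of lists
    return [[sum(A[i][r] * B[r][k] for r in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def inv2(A):
    # [[a, b], [c, d]]^-1 = (1/(ad - bc)) [[d, -b], [-c, a]]
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    assert det != 0, "matrix is singular"
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 1]]
B = [[1, 2], [0, 1]]
```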
1.1.4 The dot or inner product
Definition 8.

1. The dot product or inner product of two real n×1 column vectors x = [x1, . . . , xn]^T and y = [y1, . . . , yn]^T is denoted by x · y (or 〈x, y〉) and defined to be

       x · y := ∑_{k=1}^{n} xk yk = x^T y

2. If x and y have complex entries then

       x · y := ∑_{k=1}^{n} xk ȳk = x^T ȳ

   where ȳ := [ȳ1, . . . , ȳn]^T is the vector whose entries are the complex conjugates of those of y.
Definition 9. The length or magnitude or norm of a vector x = [x1 x2 . . . xn]^T with real or complex entries is defined to be

    ||x|| := [ ∑_{k=1}^{n} |xk|^2 ]^{1/2} = (x^T x)^{1/2}    (1.1)

for real vectors; for complex vectors the last expression must be replaced by (x^T x̄)^{1/2}. A vector x such that ||x|| = 1 is called a unit vector.
Definition 10.
1. Non-zero vectors x, y ∈ Rn are ortho-
gonal if x · y = 0. They are parallel or
collinear if y = cx for some scalar c.
2. A set of vectors {x1, . . . , xn} is orthonormal if each xi is a unit vector and xi · xj = δij (Kronecker delta; see subsection 1.1.1).
Theorem 11. Let x, y be two n-dimensional
real or complex vectors. The norm has the fol-
lowing properties:
1. ||x|| = 0⇐⇒ x = 0.
2. ||cx|| = |c| ||x||, c ∈ R,C.
3. (Triangle inequality)
| ||x|| − ||y|| | ≤ ||x+ y|| ≤ ||x||+ ||y||
4. (Cauchy-Buniakowsky-Schwarz inequal-
ity)
|x · y| ≤ ||x|| ||y||
Equality holds iff y = cx for some scalar
c.
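A small numerical check of Definition 9 and Theorem 11 for real vectors (dot and norm are our names, not the text's):

```python
import math

def dot(x, y):
    # x . y = sum of x_k * y_k (real case)
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # ||x|| = (sum of |x_k|^2)^(1/2)
    return math.sqrt(sum(abs(a) ** 2 for a in x))

x, y = [3, 4], [1, 2]
xy = [a + b for a, b in zip(x, y)]   # the vector x + y
```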
1.1.5 Some more special matrices
Let A be a square matrix.
1. It is symmetric if A^T = A. If A is complex,
then A is Hermitian if A∗ = A. Note that a
complex matrix A may be symmetric without
being hermitian.
2. It is Normal if A∗A = AA∗; in particular, for
matrices with real entries, this reduces to the
condition ATA = AAT.
3. It is Orthogonal if
AAT = I (= ATA)
and unitary if
AA∗ = I (= A∗A)
Orthogonal matrices are thus invertible.
4. Rotation matrices in 2 dimensions. The ortho-
gonal matrices
    Rθ := [ cos θ  −sin θ ]        (0 ≤ θ ≤ 2π)
          [ sin θ   cos θ ]
represent rotations about the origin by an
angle θ in the anticlockwise or positive direc-
tion: if x ∈ R2, then Rθx is the vector obtained
by rotating x through θ about the origin.
5. Rotation matrices in 3-dimensions. The ortho-
gonal matrices
    Rx := [ 1  0       0      ]        Ry := [ cos θ   0  sin θ ]
          [ 0  cos θ  −sin θ  ]              [ 0       1  0     ]
          [ 0  sin θ   cos θ  ]              [ −sin θ  0  cos θ ]

    Rz := [ cos θ  −sin θ  0 ]
          [ sin θ   cos θ  0 ]
          [ 0       0      1 ]
represent rotations through an angle θ about
the x-axis, the y-axis and the z-axis respect-
ively. For example, Rxv is the vector v rotated
through θ about the x-axis.
7. General Rotation matrix. The orthogonal matrix

    [ c + (1−c)a1²       (1−c)a1a2 − sa3    (1−c)a1a3 + sa2 ]
    [ (1−c)a1a2 + sa3    c + (1−c)a2²       (1−c)a2a3 − sa1 ]
    [ (1−c)a1a3 − sa2    (1−c)a2a3 + sa1    c + (1−c)a3²    ]

where c := cos θ and s := sin θ, represents a rotation through an angle θ about the axis a = (a1, a2, a3), a unit vector.
7. Reflection matrix. Let u = (u1, u2) (u =
(u1, u2, u3)) be a unit vector in R2 (resp. R3).
Then the matrix representing reflection in the
line (resp. plane) perpendicular to u is given
by

    [ 1 − 2u1²   −2u1u2  ]        [ 1 − 2u1²   −2u2u1    −2u3u1   ]
    [ −2u1u2     1 − 2u2² ]   &   [ −2u1u2     1 − 2u2²  −2u3u2   ]
                                  [ −2u1u3     −2u2u3    1 − 2u3² ]
8. It is idempotent if A² = A and nilpotent if A^k = 0 for some integer k > 0.
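The rotation matrices above are easy to test numerically. A sketch (rotation2 and apply are our names, not the text's) verifying that R_{π/2} sends e1 = (1, 0) to (0, 1) and that Rθ is orthogonal:

```python
import math

def rotation2(theta):
    # R_theta = [[cos t, -sin t], [sin t, cos t]]: anticlockwise rotation
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(M, v):
    # matrix-vector product Mv
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

v = apply(rotation2(math.pi / 2), [1, 0])        # approximately (0, 1)
R = rotation2(0.3)
RT = [[R[j][i] for j in range(2)] for i in range(2)]   # R transposed
```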
1.2 Linear Maps or Transformations
A map T : Rn −→ Rm is said to be linear if
T (x+ y) = T (x) + T (y) for all x, y ∈ Rn
T (cx) = c T (x) for all x ∈ Rn, c ∈ R
In the definition above, R can be replaced by C.
The two conditions can be combined to state that
T is linear if T (cx + y) = c T (x) + T (y) for all
x, y ∈ Rn and c ∈ R.
Theorem 12.
1. If T is linear, then T (0n) = 0m.
2. If A is an m×n matrix, then T (x) := Ax is a
linear map from Rn −→ Rm.
3. If A is m× n, B is n× p, then AB is a linear
map from Rp −→ Rm.
Definition 13.
1. Let T : Rn −→ Rm be a linear map. The set
{x ∈ Rn : Tx = 0} is called the null space or
kernel of T . The set {Tx ∈ Rm : x ∈ Rn} is
called the range space or image space of T .
Definition 14.
1. A set {v1, v2, . . . , vk} of non-zero (column) vec-
tors is said to be linearly independent if given
scalars c1, c2, . . . , ck,
    ∑_{i=1}^{k} ci vi = 0 ⇒ ci = 0
for i = 1, 2, . . . , k. A set of row vec-
tors {u1, u2, . . . , uk} is linearly independent
if {u1^T, u2^T, . . . , uk^T} is. If the vectors are
not linearly independent, they are said to be
linearly dependent. In this case, there exist scalars c1, c2, . . . , ck, not all zero, such that ∑_{i=1}^{k} ci vi = 0.
2. The number of vectors in any maximal sub-
set of linearly independent vectors of the null
space of a linear transformation T is called the
nullity of T and sometimes denoted null(T ).
The number of vectors in any maximal sub-
set of linearly independent vectors in the range
space of T is called the rank of T and variously
denoted rank(T ) or rk(T ).
Theorem 15. (The Rank-Nullity theorem) Let T :
Rn −→ Rm be a linear transformation. Then
n = null(T ) + rk(T )
1.3 Determinant & Trace
Given a 1×1 matrix A = [a11], its determinant is detA := a11. If A = [aij ] is 2×2, detA := a11a22 − a12a21. Assuming that the determinant of an (n−1)×(n−1) matrix has been defined, the determinant of an n×n matrix A = [aij ] is defined as follows: let Mij be the determinant of the submatrix of A obtained from A by omitting its ith row and jth column. Then

    detA = ∑_{j=1}^{n} (−1)^{i+j} aij Mij    (1.2)

Mij is called the minor corresponding to the entry aij and the determinant is said to be expressed in terms of an expansion by minors along the ith row. In particular, we may take i = 1 in (1.2). The term (−1)^{i+j} Mij is called the cofactor corresponding to the entry aij.
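Formula (1.2) with i = 1 gives a direct, if inefficient, recursive algorithm. A minimal Python sketch (det is our name, not the text's; with 0-based indexing the sign becomes (−1)^j):

```python
def det(A):
    # expansion by minors along the first row: det A = sum over j of
    # (-1)^j * A[0][j] * M_0j, where M_0j deletes row 0 and column j
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(n))
```

This recomputes minors repeatedly (about n! work), so it is illustrative only; Gaussian elimination (subsection 1.4.1) is the practical route.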
Definition 16. Let A = [aij ] be an n × n square
matrix. The trace of A is the sum of the diagonal
entries of A, i.e. Trace(A) := ∑_{k=1}^{n} akk. It is also denoted tr(A), etc.
Theorem 17.
1. The “trace map” Trace : A 7→ Trace(A) is lin-
ear, i.e. given matrices A and B, and c any
scalar,
Trace(cA + B) = c Trace(A) + Trace(B)
2. Trace(AB) = Trace(BA), where A,B are
square.
3. If A and B are similar, i.e. there is an in-
vertible matrix P such that B = PAP−1, then
Trace(A) = Trace(B).
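A quick numerical check of Theorem 17 (trace and matmul are our names, not the text's):

```python
def trace(A):
    # sum of the diagonal entries
    return sum(A[k][k] for k in range(len(A)))

def matmul(A, B):
    return [[sum(A[i][r] * B[r][k] for r in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
# Theorem 17.2: Trace(AB) = Trace(BA), even though AB != BA in general
```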
Properties of determinants Let A = [aij ] be an
n×n square matrix.
1. (Expansion along columns) detA = ∑_{i=1}^{n} (−1)^{i+j} aij Mij for any fixed column j.
2. (Multiplying a column by a constant) Writing A in block form as A = [A1 A2 · · · Aj · · · An], where Aj is the jth column of A,

       det[A1 A2 · · · (cAj) · · · An] = c detA

   Similarly for rows. Hence, if any row or column of a square matrix has only zero entries, then its determinant is zero. Also,

       det(cA) = c^n detA
3. If A′ is the matrix obtained by interchanging
two rows or two columns of A, then detA′ =
−detA.
4. If A has any two rows or any two columns
identical, then detA = 0.
5. If A′ = [A1 A2 · · · (Aj + cAk) · · · An], k ≠ j, where A = [A1 A2 · · · Aj · · · An], then detA′ = detA.
6. det(AB) = (detA)(detB) = det(BA).
7. detAT = detA and detA∗ = detA if A is
complex.
8. det In = 1.
9. If A is upper or lower triangular,
detA = a11a22 · · · ann
If A is block upper triangular, i.e.

    A = [ X  Y ]
        [ 0  Z ]

then detA = (detX)(detZ).
10. A is invertible ⇐⇒ detA ≠ 0. Moreover,
detA−1 = 1/(detA).
11. If A and B are similar, then detA = detB.
1.4 Systems of Linear Equations
A collection of m simultaneous equations:
a11x1 + a12x2 + a13x3 + · · · + a1nxn = b1
a21x1 + a22x2 + a23x3 + · · · + a2nxn = b2
... (1.3)
am1x1 + am2x2 + am3x3 + · · ·+ amnxn = bm
in the unknowns x1, x2, x3, . . . , xn is called a non-
homogeneous system of linear equations. If all the
bi’s are zero, then the system is said to be ho-
mogeneous. The scalars aij , i = 1, 2, . . . ,m and
j = 1, 2, . . . , n are the coefficients of the unknowns.
The system (1.3) can be rewritten in matrix form
as Ax = b, where
    A := [ a11  a12  a13  · · ·  a1n ]
         [ a21  a22  a23  · · ·  a2n ]
         [ ...  ...  ...         ... ]
         [ am1  am2  am3  · · ·  amn ]    (m×n)

is the coefficient matrix and

    x := [ x1 ]              b := [ b1 ]
         [ x2 ]                   [ b2 ]
         [ ...]                   [ ...]
         [ xn ]    (n×1)          [ bm ]    (m×1)
Augmented Matrix The block-matrix [A b] is
the augmented matrix associated with the sys-
tem (1.3).
Any n-tuple (c1, c2, . . . , cn) such that

    A [c1 c2 · · · cn]^T = b
is said to be a solution of the system (1.3). A linear
system of equations may have no solution, a unique
solution or infinitely many solutions. In the first
case, the system is said to be inconsistent. Oth-
erwise it is said to be consistent. A homogeneous
system is consistent since x = 0 is always a solution;
such a solution is said to be trivial.
1.4.1 Solution of a System of Linear Equations
Elementary row operations These are certain
operations performed on the rows of a matrix
to obtain another closely related matrix called
an elementary matrix. We first define the operations on the identity matrix In.
1. Elementary Permutation. The ith row
of In is exchanged with its jth row. The
elementary matrix thus obtained is de-
noted Pij . In terms of the matrix units
(see subsection 1.1.1) we can write this as:
Pij := In − Eii − Ejj + Eij + Eji
2. Elementary Dilation or Dilatation.
This replaces the ith 1 on the diagonal
of In by some scalar c ≠ 0. The corresponding elementary matrix is Di(c):
Di(c) := In − Eii + cEii
3. Elementary Transvection. This oper-
ation replaces the ith row by the ith row +
c × (jth row), i ≠ j, c an arbitrary scalar.
The corresponding matrix is Tij(c):
Tij(c) := In + cEij
It differs from In in having c instead of 0
in the (i, j) position.
Elementary matrices are often denoted E.
Theorem 18.
1. Pre-multiplying any m×n matrix by an
elementary matrix from the list above re-
produces the corresponding row operation
on the given matrix.
2. Each of Pij, Di(c) and Tij(c) is invertible
with corresponding inverse Pji, Di(1/c)
and Tij(−c).
Gaussian Elimination This is a method for solv-
ing a consistent system or for demonstrating
that a given system is inconsistent. The sys-
tem Ax = b is row-reduced to echelon form
(see subsection (1.1.1)) by a finite sequence
of row operations, or equivalently, by pre-
multiplying the augmented matrix [A b] (p. 9)
by a finite sequence of elementary matrices,
say E1, E2,. . . , Ek, to obtain the block-matrix
[A′ b′] := Ek . . . E2E1[A b]. Note that A′ =
Ek . . . E2E1A and b′ = Ek . . . E2E1b. The sys-
tem A′x = b′ can be easily solved or determ-
ined to be inconsistent. Since the Ei’s are in-
vertible, it is clear that the systems Ax = b
and A′x = b′ are equivalent in the sense that
either both are inconsistent or both have ex-
actly the same solutions. The following ex-
ample illustrates the procedure. Consider the
system Ax = b:
    [ 1  −2   1   2 ] [ x1 ]   [ b1 ]
    [ 1   1  −1   1 ] [ x2 ] = [ b2 ]
    [ 1   7  −5  −1 ] [ x3 ]   [ b3 ]
                      [ x4 ]

We apply the following sequence of row operations to the augmented matrix [A b]:

    [ 1  −2   1   2  | b1 ]
    [ 1   1  −1   1  | b2 ]
    [ 1   7  −5  −1  | b3 ]

    T21(−1) →
    [ 1  −2   1   2  | b1      ]
    [ 0   3  −2  −1  | b2 − b1 ]
    [ 1   7  −5  −1  | b3      ]

    T31(−1) →
    [ 1  −2   1   2  | b1      ]
    [ 0   3  −2  −1  | b2 − b1 ]
    [ 0   9  −6  −3  | b3 − b1 ]

    D2(1/3) →
    [ 1  −2    1     2   | b1           ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3  ]
    [ 0   9   −6    −3   | b3 − b1      ]

    D3(1/9) →
    [ 1  −2    1     2   | b1           ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3  ]
    [ 0   1  −2/3  −1/3  | (b3 − b1)/9  ]

    T12(2) →
    [ 1   0  −1/3   4/3  | (b1 + 2b2)/3 ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3  ]
    [ 0   1  −2/3  −1/3  | (b3 − b1)/9  ]

    T32(−1) →
    [ 1   0  −1/3   4/3  | (b1 + 2b2)/3       ]
    [ 0   1  −2/3  −1/3  | (b2 − b1)/3        ]
    [ 0   0    0     0   | (2b1 − 3b2 + b3)/9 ]

We obtain the equivalent system of equations plus a consistency condition (1.4):

    x1 − (1/3)x3 + (4/3)x4 = (b1 + 2b2)/3
    x2 − (2/3)x3 − (1/3)x4 = (b2 − b1)/3
    0 = (2b1 − 3b2 + b3)/9    (1.4)

If for example b1 = 0, b2 = 0, b3 = 1, then the system has no solution. Assuming that (1.4) is satisfied, we may assign arbitrary values x3 = s and x4 = t to obtain the general solution

    x = [ (1/3)s − (4/3)t + (b1 + 2b2)/3 ]
        [ (2/3)s + (1/3)t + (b2 − b1)/3  ]
        [ s                              ]
        [ t                              ]
Every solution is obtained by giving particular
values to s and t: real values if the system is
regarded as being over R and complex values
if it is over C.
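The worked example can be checked with exact rational arithmetic: for any b satisfying the consistency condition 2b1 − 3b2 + b3 = 0 and any parameters s, t, the general solution should satisfy Ax = b. A sketch (the particular values of b, s, t are our arbitrary choices):

```python
from fractions import Fraction as F

A = [[1, -2, 1, 2], [1, 1, -1, 1], [1, 7, -5, -1]]
b1, b2, b3 = 1, 1, 1          # 2*1 - 3*1 + 1 = 0, so condition (1.4) holds
s, t = F(5), F(-7)            # arbitrary values for x3 and x4

# general solution from the text
x = [F(1, 3) * s - F(4, 3) * t + F(b1 + 2 * b2, 3),
     F(2, 3) * s + F(1, 3) * t + F(b2 - b1, 3),
     s,
     t]
Ax = [sum(F(A[i][j]) * x[j] for j in range(4)) for i in range(3)]
```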
Rank of a matrix The maximum number of
linearly independent columns (regarded as
column vectors) of a matrix Am×n is called
the column-rank of A. The 0 matrix has column-rank 0. The row-rank is defined analogously. If the column-rank of A is n, then A is said to be of full column-rank, and if its row-rank is m, it is said to be of full row-rank.
The column-rank of a block-matrix is the rank
of the matrix obtained when the block-matrix
is written without the partitioning.
Theorem 19.
1. The row-rank of an m×n matrix is equal
to its column-rank. This common integer
is called the rank of the matrix and is de-
noted variously as ρ(A), rk(A) etc. Ob-
viously, rk(A) ≤ min{m,n}. If rk(A) =
min{m,n}, it is said to be of full rank.
2. If A is of order m×n and B of order n×p,
then
rk(A)+rk(B)−n ≤ rk(AB) ≤ min{rk(A), rk(B)}
The lower bound is Sylvester’s inequality.
3. (Frobenius’ inequality)
rk(ABC) ≥ rk(AB) + rk(BC)− rk(B)
4. rk(A) = rk(AX) = rk(YA), if X and Y
are invertible.
5. If A is similar to B, then rk(A) = rk(B).
6. rk(AB) = rk(A) ⇐⇒ A = ABX for some
matrix X.
7. rk(AB) = rk(B) ⇐⇒ B = YAB for some
matrix Y .
8. rk(A + B) ≤ rk(A) + rk(B). Note that
rk(A + (−A)) = 0. This inequality ex-
tends to any finite number of summands:
rk(A1+A2+· · ·+Ak) ≤ rk(A1)+rk(A2)+
· · ·+ rk(Ak).
9. The system Ax = 0, where A is m×n,
has a non-trivial solution (i.e. a non-zero
solution) iff rk(A) < n. In particular, this
is the case if m < n.
10. The system Ax = b is consistent iff
rk(A) = rk([A b]), the matrix on the RHS
being the augmented matrix.
Cramer’s Rule If in the system Ax = b, A is in-
vertible, then the solution exists and is unique:
x = A−1b. An alternative way of describing the solution is Cramer's rule: if x = [x1 x2 · · · xn]^T is the solution of the given system, then

xi = det Ai / det A

where Ai is the matrix obtained by replacing the ith column of A by b.
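Cramer's rule is mechanical to apply in code. A small sketch, with an invertible example matrix chosen here, checked against a library solver:

```python
import numpy as np

# Cramer's rule: x_i = det(A_i) / det(A), where A_i is A with its
# i-th column replaced by b (example data chosen for illustration).
A = np.array([[2., 1., 1.],
              [1., 3., 2.],
              [1., 0., 0.]])
b = np.array([4., 5., 6.])

detA = np.linalg.det(A)
x = np.empty(3)
for i in range(3):
    Ai = A.copy()
    Ai[:, i] = b                      # replace i-th column by b
    x[i] = np.linalg.det(Ai) / detA

assert np.allclose(x, np.linalg.solve(A, b))
print(x)
```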
1.5 Eigenvalues and Eigenvectors

In this section, unless otherwise mentioned, all matrices are assumed to be square.

Eigenvalue, Eigenvector & Eigenspace Let A be an n×n matrix over the real or complex numbers. Any scalar λ such that Ax = λx for some column vector x ≠ 0 is called an eigenvalue of the matrix A corresponding to the eigenvector x. A matrix may not have any real eigenvalues, but it always has complex ones. The set of all the eigenvectors associated with a given eigenvalue, together with the zero vector, is called the eigenspace of the eigenvalue.
Spectrum & Spectral Radius The collection of
all distinct eigenvalues of a matrix A is called
its spectrum and is usually denoted σ(A). The
nonnegative number
ρ(A) := max{|λ| : λ ∈ σ(A)}
is said to be the spectral radius of A. In the
event a real matrix A does not have any real
eigenvalues, the spectral radius is defined
using its complex eigenvalues.
Theorem 20. Let A be an n×n matrix.
1. The eigenvalues of a diagonal or a tri-
angular matrix (upper or lower) are the
entries on the diagonal.
2. A is singular (i.e. noninvertible) iff 0 ∈ σ(A). If A is invertible, then

σ(A−1) = σ(A)−1 := {1/λ : λ ∈ σ(A)}

If x is an eigenvector of A associated with λ, then x is also an eigenvector of A−1 associated with the eigenvalue 1/λ.
3. Let p(x) = a0 + a1x + a2x² + · · · + anx^n be a polynomial. Then for every λ ∈ σ(A) with an eigenvector x, p(λ) is an eigenvalue of

p(A) := a0I + a1A + a2A² + · · · + anA^n

with the same eigenvector x: p(A)x = p(λ)x. However, if only the real eigenvalues of a matrix are taken into account, then in general p(σ(A)) := {p(λ) : λ ∈ σ(A)} ≠ σ(p(A)).
4. (Spectral mapping theorem) If all the ei-
genvalues (real and complex) are con-
sidered, then if p is a polynomial,
p(σ(A)) = σ(p(A))
As a special case of the above, taking
p(x) = cx, c a scalar,
σ(cA) = cσ(A)
Eigenvalues of special matrices
5. Every real matrix of odd order has at least one real eigenvalue.
6. The complex eigenvalues of a real matrix
occur in conjugate pairs: if λ is a complex
eigenvalue of A, then so is λ.
7. The spectrum of an idempotent matrix
(see section 8) is a subset of {0, 1}. Every
eigenvalue of a nilpotent matrix is 0.
8. n×n real symmetric and hermitian matrices have n real eigenvalues which, however, may not be all distinct.
9. The eigenvalues of a real skew-symmetric matrix are purely imaginary or zero.
10. Let A be orthogonal. Then λ ∈ σ(A) ⇒ |λ| = 1.
11. If A is strictly diagonally dominant, then
(a) The diagonal entries of A are all pos-
itive ⇒ all the eigenvalues of A have
positive real part.
(b) A is hermitian and the diagonal
entries of A are all positive ⇒ the ei-
genvalues of A are real and positive.
Characteristic Polynomial The polynomial
pA(x) := det (A− xI)
(sometimes defined as det (xI −A)) is called
the characteristic polynomial of A. It is of de-
gree n if A is n×n.
Theorem 21.
1. The roots of pA(x) are the eigenvalues of A. Similar matrices have the same characteristic polynomial.
2. If pA(x) = Σ_{k=0}^{n} ak x^k (taking the monic convention pA(x) = det(xI − A)), then

Trace(A) = −a_{n−1} = Σ_{k=1}^{n} λk

where the λk's are all the eigenvalues (real and complex) of A.
3. det A = (−1)^n a0 = λ1 λ2 · · · λn.
Theorem 22. (Cayley–Hamilton) Any n×n matrix A satisfies its characteristic polynomial, i.e. if pA(x) = a0 + a1x + · · · + anx^n is the characteristic polynomial of A, then

pA(A) := a0I + a1A + · · · + anA^n = 0
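The theorem can be verified numerically for a sample matrix. A sketch assuming NumPy, whose np.poly returns the coefficients of the monic characteristic polynomial det(xI − A), highest degree first:

```python
import numpy as np

# Verify Cayley-Hamilton for an example matrix chosen here.
A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])
coeffs = np.poly(A)        # monic characteristic polynomial coefficients

# Evaluate p_A(A) by Horner's rule with a matrix argument.
n = A.shape[0]
P = np.zeros((n, n))
for c in coeffs:
    P = P @ A + c * np.eye(n)

assert np.allclose(P, np.zeros((n, n)))
print("p_A(A) = 0 verified")
```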
Eigenvalue Multiplicity Suppose that the characteristic polynomial pA(x) factors as (x − λ1)^{m1} (x − λ2)^{m2} · · · (x − λk)^{mk}, in which the λi's are possibly complex. Then, for each i = 1, 2, . . . , k, the eigenvalue λi is said to have (algebraic) multiplicity mi. Alternatively, there are said to be mi eigenvalues equal to λi, counting multiplicities. If A is n×n, then m1 + m2 + · · · + mk = n.
Theorem 23.
1. AT has the same eigenvalues as A count-
ing multiplicities. The eigenvalues of A∗
are the complex conjugates of the eigen-
values of A counting multiplicities (i.e. if
λ is an eigenvalue of A with multiplicity
m, then λ is an eigenvalue of A∗ with the
same multiplicity m).
2. Similar matrices have the same eigenval-
ues counting multiplicity.
3. Given Am×n and Bn×m (m ≤ n),

pBA(x) = x^{n−m} pAB(x)

AB and BA have the same non-zero eigenvalues counting multiplicities; BA has an additional n − m eigenvalues equal to 0. If A and B are square and at least one of A and B is invertible, then AB and BA are similar.
4. Let

M = [ A 0 ]
    [ 0 B ]

Then the eigenvalues of M are those of A and of B, counting multiplicities.
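Item 3 can be illustrated numerically. A sketch with randomly chosen A (2×3) and B (3×2); the seed and sizes are arbitrary:

```python
import numpy as np

# AB and BA share their non-zero eigenvalues; BA picks up n - m
# extra zero eigenvalues (here m = 2, n = 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))

ev_AB = np.sort_complex(np.linalg.eigvals(A @ B))   # 2 eigenvalues
ev_BA = np.sort_complex(np.linalg.eigvals(B @ A))   # 3 eigenvalues

# BA is 3x3 of rank at most 2, so 0 is among its eigenvalues;
# the remaining ones match those of AB.
nonzero_BA = np.array([z for z in ev_BA if abs(z) > 1e-8])
assert np.allclose(np.sort_complex(nonzero_BA), ev_AB)
assert np.min(np.abs(ev_BA)) < 1e-8
print("non-zero spectra of AB and BA agree")
```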
Chapter 2
Calculus
All functions in this chapter will be real-valued
unless otherwise mentioned.
2.1 Mean-Value Theorems & their Consequences
Rolle’s theorem Let f : [a, b] −→ R, −∞ <
a, b < ∞, be continuous and differentiable in
(a, b) with finite or infinite derivative. Suppose
f(a) = f(b). Then there exists a < c < b such
that f ′(c) = 0.
Mean Value Theorem Let f : [a, b] −→ R be a
continuous function which is differentiable in
(a, b) with finite or infinite derivatives. Then
there exists c ∈ (a, b) such that
f(b)− f(a) = (b− a)f ′(c)
Generalised Mean Value Theorem Let f, g :
[a, b] −→ R be continuous on [a, b] and differ-
entiable on (a, b) with finite or infinite derivat-
ives. Suppose f ′ and g′ are not simultaneously
infinite at any point of (a, b). Then, there ex-
ists c ∈ (a, b) such that
[g(b)− g(a)]f ′(c) = [f(b)− f(a)]g′(c)
The mean value theorem is recovered by taking
g(x) = x.
Monotonicity If f : [a, b] −→ R is continuous and
differentiable on (a, b) with possibly infinite de-
rivatives, then
1. f ′(x) > 0 on (a, b)⇒ f is strictly increas-
ing on [a, b].
2. f ′(x) < 0 on (a, b) ⇒ f is strictly de-
creasing on [a, b].
3. f ′(x) = 0 on (a, b) ⇒ f is constant on
[a, b].
Applications to maxima and minima See
(2.4).
Intermediate Value Property Suppose f : [a, b] −→ R is differentiable with finite or infinite derivative on [a, b], the one-sided derivatives f′(a+) and f′(b−) at the endpoints a and b respectively being assumed to be finite and unequal: f′(a+) ≠ f′(b−). Then, for any α such that f′(a+) < α < f′(b−) or f′(b−) < α < f′(a+), there is c ∈ (a, b) for which f′(c) = α.
Corollary 1 f ′ cannot have jump discon-
tinuities.
Corollary 2 If f is continuous on [a, b] and differentiable on (a, b) with f′(x) ≠ 0 (but possibly infinite) at every point, then f is strictly monotonic: increasing if f′ > 0 and decreasing if f′ < 0.
Continuity of derivatives If f : (a, b) −→ R is differentiable and f′ is monotonic, then f′ is continuous. (By the intermediate value property above, a monotone derivative can have no jump discontinuities.)
Derivative of a vector-valued function The
ith projection function pi : Rn −→ R is the
function defined by
pi(x1, x2, . . . , xn) = xi
Let f : (a, b) −→ Rn be a vector-valued func-
tion. f can be written in terms of its compon-
ent functions fi := pi ◦ f as
f(x) = (f1(x), f2(x), . . . , fn(x))
We say that f is differentiable if each fi is and
define
f ′(x) := (f ′1(x), f ′2(x), . . . , f ′n(x))
In matrix notation, if

f(x) = [ f1(x) ]        then    f′(x) := [ f′1(x) ]
       [ f2(x) ]                         [ f′2(x) ]
       [  ...  ]                         [  ...   ]
       [ fn(x) ]                         [ f′n(x) ]
Taylor's theorem Let f : [a, b] −→ R be continuous with finite nth order derivative f^(n) in (a, b). Suppose that f^(n−1) is continuous on [a, b] and that x0 ∈ [a, b] is arbitrary. Then, for every x ∈ [a, b], x ≠ x0, there exists ξ strictly between x and x0 such that

f(x) − f(x0) = Σ_{k=1}^{n−1} (f^(k)(x0)/k!) (x − x0)^k + (f^(n)(ξ)/n!) (x − x0)^n

The special case n = 1 is the mean-value theorem. The polynomial

p(x) = Σ_{k=0}^{n−1} (f^(k)(x0)/k!) (x − x0)^k

is called the Taylor polynomial of degree n − 1 at x0 associated with f.
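The Lagrange form of the remainder gives a computable error bound. A sketch for the example f = exp at x0 = 0 (function, point, and degree chosen here for illustration):

```python
import math

# Taylor polynomial of degree n-1 for exp at x0 = 0; the Lagrange
# remainder is e^xi * x^n / n! for some xi between 0 and x, so for
# x > 0 it is bounded by e^x * x^n / n!.
def taylor_exp(x, n):
    """Sum_{k=0}^{n-1} x^k / k!  (Taylor polynomial of degree n-1)."""
    return sum(x**k / math.factorial(k) for k in range(n))

x, n = 0.5, 8
approx = taylor_exp(x, n)
remainder_bound = math.exp(x) * abs(x)**n / math.factorial(n)

assert abs(math.exp(x) - approx) <= remainder_bound
print(approx, math.exp(x))
```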
Generalised Taylor's theorem (for a pair of functions) Let the hypotheses of the previous theorem hold for each of two functions f, g : [a, b] −→ R. Then, for every x ∈ [a, b], x ≠ x0, there exists ξ strictly between x and x0 such that

[f(x) − Σ_{k=0}^{n−1} (f^(k)(x0)/k!) (x − x0)^k] g^(n)(ξ)
    = [g(x) − Σ_{k=0}^{n−1} (g^(k)(x0)/k!) (x − x0)^k] f^(n)(ξ)

Taylor's theorem for a single function is recovered by taking g(x) = (x − x0)^n.
Integral form of Taylor's theorem Let f : [x0 − δ, x0 + δ] −→ R be continuously differentiable of order n: f^(n) exists on (x0 − δ, x0 + δ) and is continuous there. Then for every x ∈ [x0 − δ, x0 + δ]

f(x) = f(x0) + Σ_{k=1}^{n−1} (f^(k)(x0)/k!) (x − x0)^k + ∫_{x0}^{x} (f^(n)(t)/(n−1)!) (x − t)^{n−1} dt

This form avoids introducing an unspecified value ξ.
2.2 Limits & indeterminate forms

Form 0/0
1. Suppose f, g : [a, b] −→ R are continuously differentiable and f(x0) = 0 = g(x0) for some a < x0 < b. If g′(x0) ≠ 0, then

lim_{x→x0} f(x)/g(x) = f′(x0)/g′(x0)

2. (L'Hospital's rule) Let f, g : [a, b] −→ R be continuously differentiable, f(x0) = 0 = g(x0), g′(x) ≠ 0 for all x ≠ x0 in [a, b], and lim_{x→x0} f′(x)/g′(x) = L, where −∞ ≤ L ≤ ∞. Then

lim_{x→x0} f(x)/g(x) = L
Form ∞/∞
Suppose f, g : (a, b] −→ R are continuously differentiable,

lim_{x→a+} f(x) = ∞ = lim_{x→a+} g(x)    (g′(x) ≠ 0)

and

lim_{x→a+} f′(x)/g′(x) = L

where −∞ ≤ L ≤ ∞. Then

lim_{x→a+} f(x)/g(x) = L

A similar result is true for limits x → b−.
Form 0·∞
With similar differentiability assumptions as above, if f, g : [a, b] −→ R are such that lim_{x→a} f(x) = 0 and lim_{x→a} g(x) = ∞, then reduce to the earlier forms by writing either f(x)g(x) = f(x)/g(x)^{−1} or f(x)g(x) = g(x)/f(x)^{−1}, depending on the convenience in applying L'Hospital's rule.

Forms 0^0, 0^∞, ∞^0, ∞^∞, 1^∞
If

lim_{x→a} f(x) = 0 = lim_{x→a} g(x),    lim_{x→a} h(x) = ∞

with f(x) ≥ 0, then

lim_{x→a} f(x)^{g(x)} = e^{lim_{x→a} g(x) log f(x)}

The form 0^0 is thus reduced to 0·∞. Taking logarithms as above, the forms 0^∞ and ∞^∞ are seen to be not indeterminate: 0^∞ evaluates directly to 0 and ∞^∞ to ∞, without requiring a passage to derivatives. The same logarithmic formulation reduces ∞^0 and 1^∞ to 0·∞.
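For instance, the 0^0 form x^x as x → 0+ is handled by the exponential rewriting above: x log x → 0, so the limit is e^0 = 1. A numerical sketch:

```python
import math

# x^x = exp(x * log x); since x*log x -> 0 as x -> 0+, the limit is 1.
def x_to_the_x(x):
    return math.exp(x * math.log(x))

values = [x_to_the_x(10.0**-k) for k in (2, 4, 6, 8)]
assert abs(values[-1] - 1.0) < 1e-6
# the deviation from 1 shrinks monotonically along this sequence
assert all(abs(v - 1.0) >= abs(w - 1.0) for v, w in zip(values, values[1:]))
print(values)
```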
2.3 Partial derivatives

An open ball of radius r centred on a in Rn is a set of the form {x := (x1, x2, . . . , xn) : ||x − a|| < r} for some 0 < r < ∞ and a := (a1, a2, . . . , an) (for the definition of || · || see (1.1)). It is usually denoted B(a, r). Rn may be thought of as a ball of infinite radius. Let U ⊂ Rn be a set. Then x ∈ U is an interior point of U if there is some open ball B(x, r) ⊂ U. If every point of U is an interior point, then U is said to be open.
Directional Derivatives Let f : U ⊂ Rn −→ Rm be a function, a ∈ U an interior point, and u ≠ 0 an arbitrary vector of Rn. Then the limit, if it exists,

f′(a; u) := lim_{t→0} (f(a + tu) − f(a))/t

is the directional derivative of f at a in the direction u.
Partial derivatives The kth unit vector ek of Rn is the vector [0, 0, . . . , 1, . . . , 0]^T with 1 in the kth place.
Let f : U ⊂ Rn −→ R be a function and a ∈ U be any point. The kth partial derivative, or simply the kth partial, of f at a is defined to be the directional derivative of f at a in the direction ek, viz. f′(a; ek), if the derivative exists. It is variously denoted by ∂f/∂xk(a), f_{xk}(a), Dkf(a), ∂kf(a) etc. The partial derivative at a can be seen as the usual one-variable derivative of the function F(xk) := f(a1, a2, . . . , ak−1, xk, ak+1, . . . , an) with respect to xk. The domain of F is Uk := {xk ∈ R : (a1, a2, . . . , ak−1, xk, ak+1, . . . , an) ∈ U}. Informally, f is differentiated with respect to xk treating the other variables as fixed.
Gradient If f : U ⊂ Rn −→ R is a function whose partial derivatives exist at a ∈ U, then the gradient of f at a is the (row-)vector

∇f(a) := [∂f/∂x1(a), ∂f/∂x2(a), . . . , ∂f/∂xn(a)]

It is also sometimes denoted by grad f.
Successive partial differentiation Consider f : U ⊂ R2 −→ R. Assuming the possibility of differentiation, each partial derivative ∂f/∂xi, which is a function

(x1, x2) ↦ ∂f/∂xi(x1, x2),    i = 1, 2,

can be differentiated again, generating the second-order derivatives ∂²f/∂x1², ∂²f/∂x2², ∂²f/∂x1∂x2 and ∂²f/∂x2∂x1. The partials of the form ∂²f/∂xi∂xj, i ≠ j, are said to be mixed. This process can be continued finitely many times to produce nth-order partial derivatives, provided the differentiations are possible.
Theorem 24. Let f : U ⊂ R2 −→ R, U open. Suppose the partials ∂f/∂x1, ∂f/∂x2 and ∂²f/∂x1∂x2 exist, and ∂²f/∂x1∂x2 is continuous at some point a := (a1, a2) ∈ U. Then ∂²f/∂x2∂x1 exists at a and

∂²f/∂x1∂x2(a) = ∂²f/∂x2∂x1(a)
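The symmetry of mixed partials can be illustrated with nested central differences. A sketch for a smooth example function chosen here (the step sizes are a numerical compromise, not part of the theorem):

```python
import math

# Example function; analytically d2f/dxdy = d2f/dydx = 2*x*cos(y).
def f(x, y):
    return x**2 * math.sin(y)

def d_dx(g, x, y, h=1e-5):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y = 1.3, 0.7
fxy = d_dy(lambda u, v: d_dx(f, u, v), x, y)   # d/dy of df/dx
fyx = d_dx(lambda u, v: d_dy(f, u, v), x, y)   # d/dx of df/dy
exact = 2 * x * math.cos(y)

assert abs(fxy - fyx) < 1e-3
assert abs(fxy - exact) < 1e-3
print(fxy, fyx, exact)
```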
Cn functions Suppose f : U ⊂ Rn −→ R has all partial derivatives of order n which, moreover, are continuous. Then f is said to be Cn.
Taylor's theorem for functions of two variables We use the notation

[(h ∂/∂x + k ∂/∂y)^n f](a, b) := Σ_{i=0}^{n} (n choose i) h^i k^{n−i} (∂^n f/∂x^i ∂y^{n−i})(a, b)

where h, k ∈ R.

Theorem 25. Let U ⊂ R2 be the rectangular open set (a − h, a + h) × (b − k, b + k), a, b, h, k ∈ R, and f : U −→ R be C^{n+1}. Then

f(a + h, b + k) = Σ_{r=0}^{n} (1/r!) [(h ∂/∂x + k ∂/∂y)^r f](a, b)
    + (1/(n+1)!) [(h ∂/∂x + k ∂/∂y)^{n+1} f](a + ξh, b + ξk)

where 0 < ξ < 1.
Partial differentiation of composite functions
Let φ, ψ : A ⊂ R2 −→ R and f : B ⊂ R2 −→ Rbe functions such that (φ(u, v), ψ(u, v)) ∈ B
for all (u, v) ∈ A. Let a general element of B
be denoted (x, y), i.e
x := φ(u, v) y := ψ(u, v)
Define (f ◦ (φ, ψ))(u, v) := f(φ(u, v), ψ(u, v)).
Theorem 26. Let φ, ψ and f be as above and continuously differentiable. Then

∂/∂u (f ∘ (φ, ψ)) = (∂f/∂x)(∂φ/∂u) + (∂f/∂y)(∂ψ/∂u)
∂/∂v (f ∘ (φ, ψ)) = (∂f/∂x)(∂φ/∂v) + (∂f/∂y)(∂ψ/∂v)
Partial differentiation of implicit functions

Theorem 27.
1. Let F : U ⊂ R2 −→ R be C1 and suppose F(x, f(x)) = 0 for some differentiable f. Writing F = F(x, y), assume that ∂F/∂y ≠ 0 on U. Then

f′(x) = − (∂F/∂x) / (∂F/∂y)

2. Let F : U ⊂ R3 −→ R be a continuously differentiable function. Suppose
(a) f : V ⊂ R2 −→ R is continuously differentiable.
(b) If F = F(x, y, z), then F(x, y, f(x, y)) = 0, assuming that (x, y, f(x, y)) ∈ U for all (x, y) ∈ V.
(c) ∂F/∂z (x, y, f(x, y)) ≠ 0.
Then

∂f/∂x (x, y) = − (∂F/∂x (x, y, f(x, y))) / (∂F/∂z (x, y, f(x, y)))
∂f/∂y (x, y) = − (∂F/∂y (x, y, f(x, y))) / (∂F/∂z (x, y, f(x, y)))
Homogeneous functions f : U ⊂ R2 −→ R is homogeneous of degree n if for any t > 0 and all (x, y) ∈ U, f(tx, ty) = t^n f(x, y).

Theorem 28. (Euler's theorem & its converse) Suppose f : U ⊂ R2 −→ R is continuously differentiable. Then f is homogeneous of degree n ⇐⇒ x ∂f/∂x + y ∂f/∂y = nf.

Caution In the definition of homogeneity, sometimes t is allowed to be any real. In this case, the converse to Euler's theorem, viz. the "⇐=" part, can be false.
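Euler's relation can be checked numerically for a sample homogeneous function. A sketch, with an example function of degree 3 chosen here and derivatives estimated by central differences:

```python
import math

# Example homogeneous function of degree 3: f(tx, ty) = t^3 f(x, y).
def f(x, y):
    return x**3 + x * y**2

def partial(g, x, y, wrt, h=1e-6):
    if wrt == 'x':
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x, y, n = 1.2, -0.7, 3
lhs = x * partial(f, x, y, 'x') + y * partial(f, x, y, 'y')
assert abs(lhs - n * f(x, y)) < 1e-6          # Euler: x f_x + y f_y = n f

t = 2.5                                        # homogeneity itself
assert math.isclose(f(t * x, t * y), t**n * f(x, y))
print(lhs, n * f(x, y))
```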
2.4 Maxima & Minima
If f : A ⊂ Rn −→ R is a function, then f is said
to have a local or relative maximum (resp. local
or relative minimum) at a ∈ A if f(x) ≤ f(a)
(resp. f(x) ≥ f(a)) for all x in some open ball
B(a, r) of radius r centred on a. A function may
have several local maxima and local minima. A
local maximum and a local minimum are also each
termed a local extremum. If f has a maximum or
minimum over its entire domain, that value is said
to be a global or absolute maximum or global or ab-
solute minimum respectively.
The one variable case
Theorem 29.
1. (Necessary condition for the existence of a local extremum) Let f : (a, b) −→ R and suppose f has a local extremum (maximum or minimum) at x ∈ (a, b). If f is differentiable at x, then f′(x) = 0.
2. (Sufficient condition for the existence of a local extremum) Let f : (a, b) −→ R be Cn on (a, b). Suppose at c ∈ (a, b)

f′(c) = f′′(c) = · · · = f^(n−1)(c) = 0 but f^(n)(c) ≠ 0.

Then if n ∈ N is even,

f^(n)(c) > 0 ⇒ f(c) is a local minimum
f^(n)(c) < 0 ⇒ f(c) is a local maximum

If n is odd, there is no local extremum at c.
Definition 30. A point x0 at which f ′(x0) = 0
but which is not an extremal point, is called an
inflection point.
The multivariate case
Let f : U ⊂ Rn −→ R be a function whose
first-order partial derivatives exist at c ∈ U .
If ∇f(c) = 0, c is called a stationary point of
f . A stationary point is called a saddle point
if every ball B(c, r) contains a point where
f(x) > f(c) and a point where f(x) < f(c).
If f has second-order partial derivatives at a point c ∈ U, the Hessian of f is the n×n matrix

Hf(c) := [ ∂²f/∂x1²(c)     ∂²f/∂x1∂x2(c)   . . .   ∂²f/∂x1∂xn(c) ]
         [ ∂²f/∂x2∂x1(c)   ∂²f/∂x2²(c)     . . .   ∂²f/∂x2∂xn(c) ]
         [     ...              ...         . . .       ...       ]
         [ ∂²f/∂xn∂x1(c)   ∂²f/∂xn∂x2(c)   . . .   ∂²f/∂xn²(c)   ]

If the second-order partials are continuous at c, then Hf(c) is symmetric.
Theorem 31.
1. Let f : U ⊂ Rn −→ R have second-order partial derivatives which are continuous at a stationary point c ∈ U. Define the "quadratic form" Q : Rn −→ R by

Q(x) := (1/2) x^T Hf(c) x = (1/2) Σ_{i,j=1}^{n} ∂²f/∂xi∂xj(c) xi xj

where x = [x1 x2 . . . xn]^T. Then
(a) Q(x) > 0 for all x ≠ 0 ⇒ f has a local minimum at c.
(b) Q(x) < 0 for all x ≠ 0 ⇒ f has a local maximum at c.
(c) Q(x) takes both positive and negative values for x ≠ 0 ⇒ f has a saddle point at c.
2. As a special case of the above, take n = 2 and assume that the second-order partials of f are continuous at c. Then
(a) det Hf(c) > 0 and ∂²f/∂x1²(c) > 0 ⇒ f has a local minimum at c.
(b) det Hf(c) > 0 and ∂²f/∂x1²(c) < 0 ⇒ f has a local maximum at c.
(c) det Hf(c) < 0 ⇒ f has a saddle point at c.
(d) If det Hf(c) = 0, then f may have a local minimum or a local maximum or a saddle point at c.
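The n = 2 test is mechanical to apply. A sketch classifying stationary points at the origin from the Hessian (example functions and their Hessians chosen here):

```python
import numpy as np

# Second-derivative test for n = 2 via the Hessian determinant.
def classify(hessian):
    d = np.linalg.det(hessian)
    if d > 0:
        return "minimum" if hessian[0, 0] > 0 else "maximum"
    if d < 0:
        return "saddle"
    return "inconclusive"

H_min = np.array([[2.0, 0.0], [0.0, 2.0]])      # f = x^2 + y^2 at (0, 0)
H_max = np.array([[-2.0, 0.0], [0.0, -2.0]])    # f = -(x^2 + y^2)
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])  # f = x^2 - y^2

assert classify(H_min) == "minimum"
assert classify(H_max) == "maximum"
assert classify(H_saddle) == "saddle"
print("classification checks passed")
```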
2.5 Theorems on Integration

Let f : [a, b] −→ R be a function. We will say that f ∈ R if ∫_a^b f(x) dx exists. f is said to be absolutely integrable if ∫_a^b |f(x)| dx exists.

Theorem 32.
1. If f is bounded (i.e. |f(x)| ≤ C for some C ≥ 0 and all x ∈ [a, b]) with possibly only finitely many points of discontinuity, then f ∈ R.
2. If f ∈ R, m ≤ f(x) ≤ M for all x ∈ [a, b], and g : [m, M] −→ R is continuous, then g ∘ f ∈ R.
3. (Additivity) f, g ∈ R ⇒ f + g ∈ R and

∫_a^b (f + g)(x) dx = ∫_a^b f(x) dx + ∫_a^b g(x) dx

4. c ∈ R and f ∈ R ⇒ cf ∈ R.
5. (Monotonicity) f, g ∈ R and f ≤ g on [a, b] implies

∫_a^b f(x) dx ≤ ∫_a^b g(x) dx

6. f ∈ R on [a, b] and a < ξ < b implies

∫_a^ξ f(x) dx + ∫_ξ^b f(x) dx = ∫_a^b f(x) dx

7. f ∈ R on [a, b] and |f| ≤ K on [a, b] implies

|∫_a^b f(x) dx| ≤ K(b − a)

8. f, g ∈ R, then
(a) fg ∈ R
(b) |f| ∈ R and

|∫_a^b f(x) dx| ≤ ∫_a^b |f(x)| dx
Continuity of the integral The integral as a function of its upper limit is continuous: if f ∈ R and a ≤ x ≤ b, define

F(x) := ∫_a^x f(t) dt

Then F : [a, b] −→ R is (uniformly) continuous. If f is continuous at x ∈ [a, b], then F is differentiable at x and F′(x) = f(x). Thus, if f is continuous on [a, b], F is continuously differentiable. If

F(x) := ∫_x^b f(t) dt

then F′(x) = −f(x).
The fundamental theorem of calculus Suppose f : [a, b] −→ R, f ∈ R, has a primitive, viz. a function F : [a, b] −→ R such that F′ = f (one-sided derivatives at the endpoints). Then

∫_a^b f(x) dx = F(b) − F(a)
Change of variable theorem Let f : [a, b] −→ R be in R and α : [c, d] −→ [a, b]. If either
1. α is a strictly increasing continuous function (thus, α(c) = a, α(d) = b) with α′ ∈ R, or
2. α is continuously differentiable on [c, d] and f is continuous on α([c, d]),
then f ∘ α ∈ R and

∫_{α(c)}^{α(d)} f(x) dx = ∫_c^d f(α(t)) α′(t) dt.
Integration by parts Let f, g : [a, b] −→ R be differentiable on [a, b] and such that f′, g′ ∈ R. Then

∫_a^b f(x) g′(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f′(x) g(x) dx
Integration of vector-valued functions Let f1, f2, . . . , fn : [a, b] −→ R be functions in R and f : [a, b] −→ Rn be defined by f(x) := (f1(x), f2(x), . . . , fn(x)). Then ∫_a^b f(x) dx is said to exist (or to be in R) iff fi ∈ R for i = 1, 2, . . . , n. We then define

∫_a^b f(x) dx := (∫_a^b f1(x) dx, . . . , ∫_a^b fn(x) dx)
Integration of a sequence of functions

Theorem 33.
1. Consider a sequence {fn} of real-valued functions defined on [a, b]. Assume that fn ∈ R for n = 1, 2, . . . and that fn → f uniformly on [a, b]. Then f ∈ R and

lim_{n→∞} ∫_a^b fn(x) dx = ∫_a^b f(x) dx

2. A special case of the above is the following. If Σ_{n=1}^∞ fn(x) is a uniformly convergent series, then

∫_a^b Σ_{n=1}^∞ fn(x) dx = Σ_{n=1}^∞ ∫_a^b fn(x) dx.
Integral as a function of a parameter Let f : [a, b] × [c, d] −→ R be continuous and define F : [a, b] −→ R by F(x) := ∫_c^d f(x, t) dt. Then,
1. F is continuous.
2. (Interchanging limit and integral)

lim_{x→x0} ∫_c^d f(x, t) dt = ∫_c^d lim_{x→x0} f(x, t) dt = ∫_c^d f(x0, t) dt
Differentiation under the integral Let f : [a, b] × [c, d] −→ R and ∂f/∂x be continuous. If we define F : [a, b] −→ R by F(x) := ∫_c^d f(x, t) dt, then F is continuously differentiable and

F′(x) = ∫_c^d ∂f/∂x (x, t) dt
Differentiation of integrals with variable limits

Theorem 34.
1. Let f be as above. Define F : [a, b] × [c, d] × [c, d] −→ R by F(x, y, z) := ∫_y^z f(x, t) dt. Then

∂F/∂x = ∫_y^z ∂f/∂x (x, t) dt,    ∂F/∂y = −f(x, y),    ∂F/∂z = f(x, z)

2. As a useful special case of the above, let

g(x) := ∫_{φ(x)}^{ψ(x)} f(x, t) dt

In terms of the function F defined above, g(x) = F(x, φ(x), ψ(x)). Then

g′(x) = ∫_{φ(x)}^{ψ(x)} ∂f/∂x (x, t) dt − f(x, φ(x)) φ′(x) + f(x, ψ(x)) ψ′(x)
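The formula for g′(x) can be checked against a finite difference. A sketch with the simple example choices f(x, t) = xt, φ(x) = 0 and ψ(x) = x, for which g(x) = x · x²/2 = x³/2 in closed form:

```python
import math

# g(x) = integral of x*t for t in [0, x] = x^3 / 2
def g(x):
    return x**3 / 2

x, h = 1.1, 1e-6
numeric = (g(x + h) - g(x - h)) / (2 * h)      # finite-difference g'(x)

# Rule with variable limits:
#   integral of df/dx = integral of t dt over [0, x] = x^2/2
#   - f(x, phi(x)) * phi'(x) = 0
#   + f(x, psi(x)) * psi'(x) = (x * x) * 1 = x^2
leibniz = x**2 / 2 + x**2

assert abs(numeric - leibniz) < 1e-6
print(numeric, leibniz)
```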
2.6 Improper Integrals

Let f : [a, ∞) −→ R be a function.

Type 1a Suppose that ∫_a^x f(t) dt exists for every x ≥ a. Define Ia : [a, ∞) −→ R by Ia(x) := ∫_a^x f(t) dt. The function Ia(x) is said to be an improper integral of Type 1a. It converges if lim_{x→∞} Ia(x) = lim_{x→∞} ∫_a^x f(t) dt exists, and the limit is denoted by

∫_a^∞ f(t) dt := lim_{x→∞} ∫_a^x f(t) dt

If the integral does not converge, it is said to diverge. When the integrand is entirely non-negative (or non-positive), convergence is indicated by ∫_a^∞ f(t) dt < ∞ and divergence by ∫_a^∞ f(t) dt = ∞ (or −∞).
Type 1b Here Ib : (−∞, b] −→ R, Ib(x) := ∫_x^b f(t) dt, is said to converge if lim_{x→−∞} Ib(x) = lim_{x→−∞} ∫_x^b f(t) dt exists, and we write

∫_{−∞}^b f(t) dt := lim_{x→−∞} ∫_x^b f(t) dt

Type 1a and Type 1b integrals can be transformed into one another by the change of variable φ : t ↦ −t, giving ∫_x^b f(t) dt = ∫_{−b}^{−x} (f ∘ φ)(t) dt = ∫_{−b}^{−x} f(−t) dt. Consequently, the Type 1b integral converges iff the corresponding Type 1a integral does, in which case we may write ∫_{−∞}^b f(t) dt = ∫_{−b}^∞ f(−t) dt.
Type 2 If lim_{x→a+} f(x) does not exist but ∫_{a+ε}^b f(t) dt exists for every ε > 0 such that a < a + ε < b, define the improper integral of Type 2 to be the function Ia+ : (0, b − a) −→ R, Ia+(ε) := ∫_{a+ε}^b f(t) dt. If lim_{ε→0} Ia+(ε) exists, then the improper integral of Type 2 is said to converge and we write

∫_{a+}^b f(t) dt := lim_{ε→0} Ia+(ε)

The notation ∫_a^b f(t) dt is also commonly used, with the understanding that lim_{x→a+} f(x) does not exist. If lim_{x→a+} f(x) exists, then Ia+ is said to be proper even if f is discontinuous at a. An integral of Type 2 can be converted to one of Type 1a (or equivalently, of Type 1b) by a suitable change of variable and vice versa: for example, φ : t ↦ e^{1−t} transforms ∫_{0+}^1 f(t) dt to ∫_1^∞ f(e^{1−t}) e^{1−t} dt. Nevertheless, for calculations, it is useful to retain the Type 2 integrals as a separate class.
2.6.1 Convergence tests for Type 1a & 1b integrals

Comparison tests Let f, g : [a, ∞) −→ R be continuous, 0 ≤ f(x) ≤ g(x) for all x. Then

∫_a^∞ g(x) dx < ∞ ⇒ ∫_a^∞ f(x) dx < ∞

and

∫_a^∞ f(x) dx = ∞ ⇒ ∫_a^∞ g(x) dx = ∞

For Type 1b integrals, the domain of the functions becomes (−∞, b] and the integrals become ∫_{−∞}^b.
Tests for absolute convergence We first state them for Type 1a integrals. The integral ∫_a^∞ f(x) dx is said to converge absolutely if ∫_a^∞ |f(x)| dx < ∞. The integral converges conditionally if it converges but ∫_a^∞ |f(x)| dx = ∞.

Theorem 35. Let f : [a, ∞) −→ R be continuous. Then ∫_a^∞ |f(x)| dx < ∞ ⇒ ∫_a^∞ f(x) dx converges.
Limit tests for Type 1a integrals

Theorem 36.
1. Let f : [a, ∞) −→ R be continuous. Then, for any p > 1, lim_{x→∞} x^p f(x) converges ⇒ ∫_a^∞ |f(x)| dx < ∞.
2. With f as before, lim_{x→∞} x f(x) converges to a non-zero limit or diverges to ±∞ ⇒ ∫_a^∞ f(x) dx diverges. If the limit is 0, the test is inconclusive.

In the setting of Type 1b integrals, replace the limit in the test for convergence by lim_{x→−∞} (−x)^p f(x) and in the test for divergence by lim_{x→−∞} x f(x).
Tests for conditional convergence

Theorem 37. Let f : [a, ∞) −→ (0, ∞) be continuous and monotonically decreasing to 0 as x → ∞, i.e. lim_{x→∞} f(x) = 0. Then ∫_a^∞ f(x) sin x dx converges.

Corollary 38.
1. With the same function and hypotheses as above, ∫_a^∞ f(x) dx diverges (converges) iff ∫_a^∞ f(x) |sin x| dx diverges (resp. converges).
2. With the same hypotheses as before, ∫_a^∞ f(x) sin(αx + β) dx and ∫_a^∞ f(x) cos(αx + β) dx (α ≠ 0) both converge.
3. With the previous hypotheses, if n ∈ N, n > a/π, then |∫_{nπ}^∞ f(x) sin x dx| ≤ 2f(nπ).
2.6.2 Convergence tests for Type 2 integrals

We first consider Type 2a integrals.

Comparison tests Let f, g : (a, b] −→ R be continuous, 0 ≤ f(x) ≤ g(x) for all x. Then

∫_{a+}^b g(x) dx < ∞ ⇒ ∫_{a+}^b f(x) dx < ∞

and

∫_{a+}^b f(x) dx = ∞ ⇒ ∫_{a+}^b g(x) dx = ∞

The corresponding results for Type 2b integrals are obtained by replacing the domain of f by [a, b) and ∫_{a+}^b by ∫_a^{b−}.

Absolute convergence
1. Let f : (a, b] −→ R be continuous. Then ∫_{a+}^b |f(x)| dx < ∞ ⇒ ∫_{a+}^b f(x) dx converges.
2. The integral ∫_{a+}^b f(x) dx is said to converge absolutely if ∫_{a+}^b |f(x)| dx < ∞. The integral converges conditionally if it converges but ∫_{a+}^b |f(x)| dx = ∞.
Limit tests for Type 2a integrals

Theorem 39.
1. Let f : (a, b] −→ R be continuous. Then, for any 0 < p < 1, lim_{x→a+} (x − a)^p f(x) converges ⇒ ∫_{a+}^b |f(x)| dx < ∞.
2. With f as before, lim_{x→a+} (x − a) f(x) converges to a non-zero limit or diverges to ±∞ ⇒ ∫_{a+}^b f(x) dx diverges. If the limit is 0, then the test is inconclusive.

For testing the convergence of Type 2b integrals, take limits lim_{x→b−} (b − x)^p f(x) and, for divergence, lim_{x→b−} (b − x) f(x).
Tests for conditional convergence

Theorem 40. Let f : (a, b] −→ R be continuous, (x − a)² f(x) monotonically increasing, and lim_{x→a+} (x − a)² f(x) = 0. Then ∫_{a+}^b f(x) sin(1/(x − a)) dx converges.
Combination of types An integral which can be
written as a finite sum of integrals of the above
types, is said to converge if each of the sum-
mand integrals converges. It diverges if at least
one of the summand integrals diverges.
2.7 Uniform convergence & improper integrals

Consider the integral ∫_a^∞ f(x, t) dt, which is assumed to converge for each x in some interval [A, B] to a value denoted by F(x), i.e.

F(x) := ∫_a^∞ f(x, t) dt,    A ≤ x ≤ B    (2.1)

Let SR(x) := ∫_a^R f(x, t) dt be its "partial integral".

Definition 41. The integral (2.1) is said to converge uniformly to F(x) in [A, B] if given any ε > 0, there exists R′ depending only on ε and independent of x ∈ [A, B] such that

R > R′ ⇒ |F(x) − SR(x)| = |∫_a^∞ f(x, t) dt − ∫_a^R f(x, t) dt| = |∫_R^∞ f(x, t) dt| < ε

for all x ∈ [A, B].
The definitions are analogous for improper integ-
rals of the other types.
Theorem 42. Let f : [A, B] × [a, ∞) −→ R be continuous and suppose ∫_a^∞ f(x, t) dt converges uniformly to a function F : [A, B] −→ R. Then F is continuous.
Theorem 43. (Interchange of order of integration) Let f be as in Theorem 42 and let the improper integral occurring there converge uniformly to F(x) in [A, B]. Then

∫_A^B F(x) dx = ∫_A^B (∫_a^∞ f(x, t) dt) dx = ∫_a^∞ (∫_A^B f(x, t) dx) dt
Theorem 44. (Differentiation under the integral sign) Let f and F be as in Theorem 42. Suppose that ∂f/∂x (x, t) is continuous and that ∫_a^∞ ∂f/∂x (x, t) dt converges uniformly in [A, B]. Then F is differentiable and

F′(x) = ∫_a^∞ ∂f/∂x (x, t) dt
2.8 The Gamma Function

The function Γ : (0, ∞) −→ R defined by

Γ(x) := ∫_{0+}^∞ t^{x−1} e^{−t} dt

is called the gamma function. This improper integral is also commonly written as ∫_0^∞. Its properties are as follows.

Theorem 45.
1. Γ is differentiable to infinite order. In fact,

d^n/dx^n log Γ(x) = Σ_{k=0}^∞ (−1)^n (n − 1)! (x + k)^{−n},    n ≥ 2, x > 0.

2. Γ(x + 1) = xΓ(x). Consequently, the domain of definition of Γ can be extended to R ∖ {0, −1, −2, . . . }. Also, Γ(x + n) = (x + n − 1)(x + n − 2) · · · x Γ(x), where n ∈ N.
3. Γ(n + 1) = n!, n = 0, 1, 2, . . . . In particular, Γ(1) = 1.
4. Γ(1/2) = √π.
5. Γ(0+) = ∞ and lim_{x→∞} Γ(x) = ∞.
6. lim_{x→0+} x Γ(x) = 1.
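Several of these properties can be checked directly with the standard-library gamma function. A sketch:

```python
import math

# Gamma(n + 1) = n!
for n in range(7):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))

# Functional equation Gamma(x + 1) = x * Gamma(x)
x = 2.7
assert math.isclose(math.gamma(x + 1), x * math.gamma(x))

# Gamma(1/2) = sqrt(pi)
assert math.isclose(math.gamma(0.5), math.sqrt(math.pi))

# lim_{x -> 0+} x * Gamma(x) = 1, checked at a small sample point
assert abs(1e-6 * math.gamma(1e-6) - 1.0) < 1e-5
print("gamma checks passed")
```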
The beta function The Beta function, denoted B(x, y), is defined by

B : (0, ∞) × (0, ∞) −→ R
B(x, y) := ∫_{0+}^{1−} t^{x−1} (1 − t)^{y−1} dt

The integral is improper if x < 1 or y < 1 or both. With this understanding the integral is usually written as ∫_0^1.

Properties
1. B(x, y) = B(y, x).
2. B(x, y) = 2 ∫_0^{π/2} (sin t)^{2x−1} (cos t)^{2y−1} dt.
3. B(x, y) = ∫_0^∞ t^{x−1}/(1 + t)^{x+y} dt.
4. B(x, y) = Γ(x)Γ(y)/Γ(x + y).
5. B(x, y) = B(x + 1, y) + B(x, y + 1).
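Property 4 can be checked against a crude numerical evaluation of the defining integral. A sketch with x, y ≥ 1 (so the integral is proper; the sample values are chosen here):

```python
import math

# Midpoint-rule approximation of B(x, y) = integral of
# t^(x-1) * (1-t)^(y-1) over [0, 1].
def beta_integral(x, y, n=100_000):
    h = 1.0 / n
    return sum(((i + 0.5) * h)**(x - 1) * (1 - (i + 0.5) * h)**(y - 1)
               for i in range(n)) * h

def beta_gamma(x, y):
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

x, y = 2.5, 3.0
assert abs(beta_integral(x, y) - beta_gamma(x, y)) < 1e-6
assert math.isclose(beta_gamma(x, y), beta_gamma(y, x))          # property 1
assert math.isclose(beta_gamma(x, y),
                    beta_gamma(x + 1, y) + beta_gamma(x, y + 1))  # property 5
print(beta_gamma(x, y))
```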
2.9 Multiple Integrals

Theorem 46. Let R := [a, b] × [c, d] be a "rectangle" in R2 and f : R −→ R be a function that is integrable on R, i.e. assume that ∫∫_R f(x, y) dx dy exists (here "dx dy" represents integration with respect to the two-dimensional variable (x, y)). Suppose that for each y ∈ [c, d], I(y) := ∫_a^b f(x, y) dx exists. Then ∫_c^d I(y) dy exists and ∫_c^d I(y) dy = ∫∫_R f(x, y) dx dy. Thus,

∫∫_R f(x, y) dx dy = ∫_c^d [∫_a^b f(x, y) dx] dy

The integral on the right is called an iterated integral. An analogous theorem holds with y replaced by x.
Theorem 47. (Changing the order of integration) Let f : R −→ R, R as above, be continuous. Then f is integrable and ∫∫_R f(x, y) dx dy is given by

∫∫_R f(x, y) dx dy = ∫_c^d [∫_a^b f(x, y) dx] dy = ∫_a^b [∫_c^d f(x, y) dy] dx
Definition 48. If A ⊂ R2 is bounded (i.e. A is con-
tained in some open ball), it is said to have content
0 if for every ε > 0, there is a finite collection of
rectangles which contain A and are such that the
sum of their areas is < ε.
Theorem 49. Suppose f : R −→ R is bounded (i.e. f(R) is contained in some ball) on the rectangle R. If the set of discontinuities of f has content 0, then ∫∫_R f(x, y) dx dy exists.
Definition 50. Let φ, ψ : [a, b] −→ R be continuous. Define the following types of regions enclosed by the graphs of the two functions:

R_{φ,ψ} := {(x, y) : a ≤ x ≤ b, φ(x) ≤ y ≤ ψ(x)}

and

R^{φ,ψ} := {(x, y) : φ(y) ≤ x ≤ ψ(y), a ≤ y ≤ b}
Theorem 51. Let R be a region of type R_{φ,ψ} and f : R −→ R be bounded on R and continuous on the interior of R. Then ∫∫_R f(x, y) dx dy exists and can be evaluated by the iterated integral:

∫∫_R f(x, y) dx dy = ∫_a^b [∫_{φ(x)}^{ψ(x)} f(x, y) dy] dx

An analogous theorem is true for regions of type R^{φ,ψ}:

∫∫_R f(x, y) dx dy = ∫_a^b [∫_{φ(y)}^{ψ(y)} f(x, y) dx] dy

If a region is simultaneously of both types, i.e. of type R_{φ1,ψ1} and R^{φ2,ψ2}, then ∫∫_R f(x, y) dx dy can be evaluated by either of the two formulas above.
Triple Integrals Consider the 3-dimensional analogue of the region of type R_{φ,ψ}, viz.

V := {(x, y, z) ∈ R3 : (x, y) ∈ R, φ(x, y) ≤ z ≤ ψ(x, y)}

where R is some region in R2.

Theorem 52. If f : V −→ R is continuous, then

∫∫∫_V f(x, y, z) dx dy dz = ∫∫_R [∫_{φ(x,y)}^{ψ(x,y)} f(x, y, z) dz] dx dy
Change of Variable Suppose we have functions x, y : B ⊂ R2 −→ R defined by x = φ(u, v) and y = ψ(u, v). The Jacobian or Jacobian determinant of the mapping F : (u, v) ↦ (x, y) = (φ(u, v), ψ(u, v)) is

J(u, v) = | ∂φ/∂u  ∂φ/∂v |
          | ∂ψ/∂u  ∂ψ/∂v |

The notations ∂(φ, ψ)/∂(u, v) and JF(u, v) are also used.
Definition 53. A mapping α : V ⊂ Rn −→ Rn (n = 1, 2, 3) is said to be a coordinate transformation or diffeomorphism if
1. α is one-to-one.
2. Each component function αi of α = (α1, . . . , αn) has continuously differentiable partial derivatives.
3. Jα(u1, . . . , un) ≠ 0 for all u = (u1, . . . , un) ∈ V.
The image set α(V) is then open.
Special Coordinate Transformations
Linear coordinate transformations Let V := R2 and α(u, v) = (au + bv, cu + dv) for some a, b, c, d ∈ R satisfying ad − bc ≠ 0. Then Jα(u, v) = ad − bc and α(V) = R2. The extension to n = 3 is obvious.
Polar coordinates in R2 Let V := {(r, θ) : r > 0, 0 < θ < 2π} and

α(r, θ) = (α1(r, θ), α2(r, θ)) = (r cos θ, r sin θ)

Here Jα(r, θ) = r and α(V) = R2 ∖ {(x, 0) : x ≥ 0}.

Cylindrical coordinates in R3 Let V := {(r, θ, z) : r > 0, 0 < θ < 2π, z ∈ R} and

α(r, θ, z) = (α1(r, θ, z), α2(r, θ, z), α3(r, θ, z)) = (r cos θ, r sin θ, z)

Now Jα(r, θ, z) = r and α(V) = R3 ∖ {(x, 0, z) : x ≥ 0, z ∈ R}.

Spherical coordinates in R3 Let V := {(r, θ, φ) : r > 0, 0 < θ < 2π, 0 < φ < π} and

α(r, θ, φ) = (r cos θ sin φ, r sin θ sin φ, r cos φ)

Then Jα(r, θ, φ) = −r² sin φ and α(V) = R3 ∖ {(x, 0, z) : x ≥ 0, z ∈ R} (the excluded set contains the z-axis, since sin φ ≠ 0 on V).
Theorem 54. Let α be a coordinate transformation mapping the open set V ⊂ R² onto U := α(V) ⊂ R², and let f : U −→ R be integrable over U. Then

∫∫_U f(x, y) dx dy = ∫∫_V (f ∘ α)(u, v) |J_α(u, v)| du dv
                   = ∫∫_V f(α(u, v)) |J_α(u, v)| du dv
                   = ∫∫_V f(α₁(u, v), α₂(u, v)) |J_α(u, v)| du dv

Here x := α₁(u, v), y := α₂(u, v).

In the case of triple integrals (α now a coordinate transformation in R³),

∫∫∫_U f(x, y, z) dx dy dz = ∫∫∫_V (f ∘ α)(u, v, w) |J_α(u, v, w)| du dv dw
                          = ∫∫∫_V f(α(u, v, w)) |J_α(u, v, w)| du dv dw
                          = ∫∫∫_V f(α₁(u, v, w), α₂(u, v, w), α₃(u, v, w)) |J_α(u, v, w)| du dv dw
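As an illustrative numerical check of the change-of-variables formula (this sketch is ours, not from the text), integrate f(x, y) = x² + y² over the unit disc by passing to polar coordinates, where |J_α| = r and the exact answer is π/2:

```python
import math

def polar_integral(f, r_max, n=500):
    """Approximate the double integral of f over the disc of radius r_max
    via polar coordinates: sum f(r cos t, r sin t) * r * dr * dt,
    with |J_alpha(r, t)| = r (Theorem 54), using midpoints in r and t."""
    hr = r_max / n
    ht = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            total += f(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total

# f(x, y) = x^2 + y^2 over the unit disc: exact value pi/2
approx = polar_integral(lambda x, y: x * x + y * y, 1.0)
```

The half-line removed from α(V) has measure zero, so it does not affect the value of the integral.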
Area & Volume

1. Assume as before that R is the rectangle [a, b] × [c, d]. Let f : R −→ [0,∞) be a function and let S := {(x, y, z) ∈ R³ : (x, y) ∈ R, z = f(x, y)} be the associated surface (or graph) lying above the x-y plane. The ordinate set of f over R is

   {(x, y, z) ∈ R³ : (x, y) ∈ R, 0 ≤ z ≤ f(x, y)}

   It may be thought of as the "cylinder" under the graph of f. For each y ∈ [c, d], ∫_a^b f(x, y) dx is the area of the cross-section cut out by a plane parallel to the x-z plane; similarly for each x ∈ [a, b]. The volume of the ordinate set is

   ∫_a^b [ ∫_c^d f(x, y) dy ] dx
2. If R = R_{φ,ψ} for some continuous φ, ψ, then the area of R, denoted ∫∫_R dx dy, is given by

   ∫_a^b [ψ(x) − φ(x)] dx

   The area of the cross-section (at x) of the associated ordinate set of a function f over R is then given by

   ∫_{φ(x)}^{ψ(x)} f(x, y) dy

   and the volume of the ordinate set is given by

   ∫_a^b [ ∫_{φ(x)}^{ψ(x)} f(x, y) dy ] dx
3. A figure or surface of revolution is obtained by revolving a curve, or the graph of a function, about an axis. Let f(y, z) = c, ∇f ≠ 0, be a curve in the upper y-z plane in R³ (z ≥ 0). Then the surface of revolution S obtained by rotating the set {(y, z) : f(y, z) = c} about the y-axis is

   S = {(x, y, z) ∈ R³ : F(x, y, z) = c}

   where F(x, y, z) = f(y, √(x² + z²)). In particular, if z = φ(y), a ≤ y ≤ b (so that f(y, z) = z − φ(y) and c = 0), then

   S = {(x, y, z) ∈ R³ : x² + z² = φ(y)²}

   The surface area of S is

   area(S) = 2π ∫_a^b φ(y) √(1 + φ′(y)²) dy

   and the volume of the solid of revolution

   V := {(x, y, z) ∈ R³ : 0 ≤ x² + z² ≤ φ(y)²}

   obtained by rotating z = φ(y) as before is

   vol(V) = π ∫_a^b φ(y)² dy

   The volume of the solid of revolution V obtained by rotating the region between two curves z = φ(y) and z = ψ(y) (φ ≤ ψ) is

   vol(V) = π ∫_a^b (ψ(y)² − φ(y)²) dy
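A quick numerical sketch of the volume formula (ours, not from the text): rotating z = √(1 − y²), −1 ≤ y ≤ 1, about the y-axis produces the unit ball, whose volume is 4π/3.

```python
import math

def revolution_volume(phi, a, b, n=20000):
    """vol(V) = pi * integral of phi(y)^2 over [a, b], midpoint rule."""
    h = (b - a) / n
    return math.pi * sum(phi(a + (i + 0.5) * h) ** 2 for i in range(n)) * h

# Rotating z = sqrt(1 - y^2) on [-1, 1] gives the unit ball: volume 4*pi/3
vol = revolution_volume(lambda y: math.sqrt(max(0.0, 1 - y * y)), -1.0, 1.0)
```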
Definition 55. A set A ⊂ R² is simply connected if every simple closed curve in A encloses only points of A. Alternatively, in such a set, every simple closed curve can be shrunk continuously to a point.
Definition 56. A parametric or parametrised surface S in R³ is a set of points (x, y, z) satisfying the three equations

x = φ(u, v),  y = ψ(u, v),  z = ρ(u, v)

where φ, ψ and ρ are defined on a common domain R ⊂ R² which is simply connected and bounded by a simple closed curve (see p. 30). Alternatively, it is the range of the map

r : R ⊂ R² −→ R³,  r(u, v) = φ(u, v) i + ψ(u, v) j + ρ(u, v) k

with x, y, z as before. The surface is said to be simple if r is one-to-one.
Definition 57. A (regular) surface S may also be defined as the level set of a function f : U ⊂ R³ −→ R such that ∇f ≠ 0 on S and

S := f⁻¹(c)

for some c ∈ R. It may be possible to obtain such an expression by eliminating u and v from the parametric equations in Definition 56. By considering the function F := f − c, we may assume without loss of generality that c = 0.

If a surface (a two-dimensional object) has a "boundary", then the boundary will be a curve, since it has to be of one dimension less.
Definition 58. Let S be a parametric surface described as above by

r(u, v) = φ(u, v) i + ψ(u, v) j + ρ(u, v) k

The surface area of S is

area(S) := ∫∫_R ‖ ∂r/∂u × ∂r/∂v ‖ du dv

See Sec. 1.1 for the definition of ‖ ‖.

Definition 59. The unit normal to S attached at any point r(u, v) ∈ S is defined as follows:

(a) when S is given in parametric form, by

    n := (∂r/∂u × ∂r/∂v) / ‖ ∂r/∂u × ∂r/∂v ‖

    at all points (u, v) ∈ R where the cross product, sometimes called the fundamental vector product of the surface, is nonzero;

(b) when S is given as a level set f⁻¹(0), by

    n := ∇f(a) / ‖∇f(a)‖  for all a ∈ S.
Theorem 60. Suppose a surface S is given by the explicit equation z = f(x, y), (x, y) ∈ R. Then

(a) area(S) = ∫∫_R √(1 + (∂f/∂x)² + (∂f/∂y)²) dx dy

(b) Let θ ∈ [0, π/2) be the angle between the normal vector ∂r/∂u × ∂r/∂v to S and k. Then the expression for the area becomes

    area(S) = ∫∫_R (1/cos θ) dx dy

    Hence, if S and R are regions lying in two planes which are at an angle of θ to one another, R being the projection of S, then

    area(R) = area(S) cos θ
Theorem 61. Let the surface S be given in implicit form as f(x, y, z) = 0. Suppose that one of x, y, z, say z, can be written as a function of x, y, viz. z = φ(x, y), (x, y) ∈ R. Then, assuming that ∂f/∂z ≠ 0 on R,

area(S) = ∫∫_R [ (∂f/∂x)² + (∂f/∂y)² + (∂f/∂z)² ]^{1/2} / |∂f/∂z| dx dy
2.10 Vector identities

In just this section vectors and vector-valued functions will be denoted by bold letters for clarity and emphasis. Vectors will be 2- or 3-dimensional. We repeat the definition of the dot product and state its properties.

Dot Product The dot product or inner product of vectors a and b is

a · b := a₁b₁ + a₂b₂   or   a · b := a₁b₁ + a₂b₂ + a₃b₃

according as a and b are two- or three-dimensional and expressed in terms of their components. It is also denoted 〈a, b〉.

If a = (a₁, a₂) or (a₁, a₂, a₃), the magnitude or norm of a is, as in (1.1),

‖a‖ := (a₁² + a₂²)^{1/2}   or   ‖a‖ := (a₁² + a₂² + a₃²)^{1/2}

respectively. The norm is also sometimes denoted |a| in analogy with the absolute value of real or complex numbers.
Properties of the dot product

1. Given two vectors a and b in R² or R³, a · b = ‖a‖ ‖b‖ cos θ, where θ ∈ [0, π] is the angle between the two vectors.
2. (Commutativity) a · b = b · a
3. t(a · b) = (ta) · b = a · (tb), t ∈ R.
4. (Bilinearity) (sa + tb) · c = s(a · c) + t(b · c) and a · (sb + tc) = s(a · b) + t(a · c), s, t ∈ R.
5. If i, j and k are the standard unit vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1) respectively in R³, then i · i = j · j = k · k = 1 and i · j = j · k = k · i = 0. In R², there would only be i = (1, 0) and j = (0, 1), with analogous properties.
6. a · a = ‖a‖²
7. a · b = 0 and a, b ≠ 0 ⇒ a and b are perpendicular or orthogonal to one another.
8. (Cauchy-Buniakowsky-Schwarz inequality) |a · b| ≤ ‖a‖ ‖b‖. Equality holds iff b = ca for some scalar c.
Cross Product Given a and b in R³, the cross product of a and b, denoted a×b (also sometimes a ∧ b), is defined to be the vector

a×b := (‖a‖ ‖b‖ sin θ) e

where as before θ is the angle between a and b and e is a unit vector orthogonal to both a and b directed so that {a, b, e} form a right-handed system or are positively oriented, i.e. if a = (a₁, a₂, a₃), b = (b₁, b₂, b₃) and e = (e₁, e₂, e₃), then the determinant

| a₁ a₂ a₃ |
| b₁ b₂ b₃ | > 0
| e₁ e₂ e₃ |

If the determinant were negative, the vectors would be said to form a left-handed system or to be negatively oriented. The terminology can be applied to any three (resp. two) linearly independent vectors in R³ (resp. R²).
Properties of the cross product

1. (Anti-commutativity) a×b = −b×a
2. t(a×b) = (ta)×b = a×(tb), t ∈ R
3. (Bilinearity) For s, t ∈ R,
   (sa + tb)×c = s(a×c) + t(b×c)
   a×(sb + tc) = s(a×b) + t(a×c)
4. i×i = j×j = k×k = 0 and i×j = k, j×k = i, k×i = j
5. If a = (a₁, a₂, a₃) and b = (b₁, b₂, b₃), then

   a×b = | i  j  k  |
         | a₁ a₂ a₃ |
         | b₁ b₂ b₃ |

   The expression on the RHS is to be expanded as if it were a determinant to obtain the components along i, j and k.
6. The area of a parallelogram with sides a and b is ‖a×b‖.

Scalar triple product The scalar triple product or box product of a, b and c is defined to be the scalar a · (b×c) or simply a · b×c, often denoted by [a b c]. We have

1. [a b c] = [b c a] = [c a b]
2. [a b c] = | a₁ a₂ a₃ |
             | b₁ b₂ b₃ |
             | c₁ c₂ c₃ |
where a = (a1, a2, a3), b = (b1, b2, b3)
and c = (c1, c2, c3).
3. The scalar triple product a · b×c is the "oriented" volume of the parallelepiped with edges a, b, c: if the vectors {a, b, c} are positively oriented, the volume is positive, whereas if they are negatively oriented, the volume has a negative sign attached to it. The usual (unoriented) volume is given by the absolute value |[a b c]|.
Vector triple product This is the vector a×(b×c). We have the relations

a×(b×c) = (a · c)b − (a · b)c
(a×b)×c = (a · c)b − (b · c)a
(a×b) · (c×d) = (a · c)(b · d) − (a · d)(b · c)   [Lagrange identity]
(a×b)×(c×d) = [a c d]b − [b c d]a = [a b d]c − [a b c]d
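The identities above are easy to verify numerically. A small sketch (ours, not from the text) with component-wise helpers:

```python
def dot(a, b):
    """Dot product of same-length tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product in R^3, via the determinant expansion (property 5)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(t, a):
    return tuple(t * x for x in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

a, b, c, d = (1.0, 2.0, 3.0), (-4.0, 0.5, 2.0), (2.0, -1.0, 5.0), (0.0, 1.0, -2.0)

# a x (b x c) = (a.c) b - (a.b) c
lhs1 = cross(a, cross(b, c))
rhs1 = sub(scale(dot(a, c), b), scale(dot(a, b), c))

# Lagrange identity: (a x b).(c x d) = (a.c)(b.d) - (a.d)(b.c)
lhs2 = dot(cross(a, b), cross(c, d))
rhs2 = dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)
```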
Reciprocal systems of vectors Two sets of vectors (a₁, a₂, a₃) and (b₁, b₂, b₃) are said to be reciprocal systems if aᵢ · bⱼ = δᵢⱼ, where δᵢⱼ is the Kronecker symbol (see p. 3).

Theorem 62. Assume that the vectors (a₁, a₂, a₃) are linearly independent (or equivalently that [a₁ a₂ a₃] ≠ 0). The two sets of vectors {a₁, a₂, a₃} and {b₁, b₂, b₃} are reciprocal iff

b₁ = a₂×a₃ / [a₁ a₂ a₃],  b₂ = a₃×a₁ / [a₁ a₂ a₃],  b₃ = a₁×a₂ / [a₁ a₂ a₃]
Vector differentiation Let f : U ⊂ R −→ Rⁿ (n = 2, 3) be a function. Such functions are called vector-valued functions or vector functions. If

lim_{h→0} [f(t + h) − f(t)] / h

exists, then f is said to be differentiable at t ∈ U and the limit is said to be the derivative of f at t. It is variously denoted by f′(t), df/dt, ḟ(t) or Df(t), etc. Writing f(t) = x(t) i + y(t) j (in R²) or x(t) i + y(t) j + z(t) k (in R³), the derivative is given by

df/dt = (dx/dt) i + (dy/dt) j                in R²
df/dt = (dx/dt) i + (dy/dt) j + (dz/dt) k    in R³
Properties of vector differentiation Let f, g and h be differentiable vector functions defined on the same domain in R. Then

1. (d/dt)(f(t) + g(t)) = df/dt + dg/dt
2. (d/dt)(f(t) · g(t)) = f(t) · dg/dt + df/dt · g(t)
3. (d/dt)(f(t)×g(t)) = f(t)×(dg/dt) + (df/dt)×g(t)
4. If φ : U ⊂ R −→ R is a differentiable function whose domain is that of f, define the function φf : U −→ Rⁿ (n = 2, 3) by (φf)(t) = φ(t)f(t). Then

   (d/dt)(φf) = φ(t)(df/dt) + (dφ/dt)f(t)

5. (Derivative of the scalar triple product)

   (d/dt)[f(t) g(t) h(t)] = [f(t) g(t) dh/dt] + [f(t) dg/dt h(t)] + [df/dt g(t) h(t)]

6. (Derivative of the vector triple product)

   (d/dt)(f(t)×(g(t)×h(t))) = f(t)×(g(t)×(dh/dt)) + f(t)×((dg/dt)×h(t)) + (df/dt)×(g(t)×h(t))
Partial derivatives If f : U ⊂ R² −→ Rⁿ (n = 2, 3), then the partial derivatives at a point (a, b) ∈ U are defined in terms of the limits, if they exist:

∂f/∂x (a, b) = lim_{h→0} [f(a + h, b) − f(a, b)] / h
∂f/∂y (a, b) = lim_{k→0} [f(a, b + k) − f(a, b)] / k

If

f(x, y) = (f₁(x, y), f₂(x, y), . . . , fₙ(x, y))

then

∂f/∂x = (∂f₁/∂x, ∂f₂/∂x, . . . , ∂fₙ/∂x)

and similarly for ∂f/∂y. The definition can be extended to more than two variables in the obvious way. The basic properties of partial differentiation are as follows. Suppose f, g are vector functions on a common domain U. Then
1. (∂/∂x)(f · g) = f · (∂g/∂x) + (∂f/∂x) · g. A similar formula holds with respect to y.
2. (∂/∂x)(f×g) = f×(∂g/∂x) + (∂f/∂x)×g. A similar formula holds with respect to y.
Gradient, divergence & curl

The gradient

Definition 63. If φ : U ⊂ R³ −→ R is a function whose partial derivatives exist at a point a = (a₁, a₂, a₃) ∈ U, then the gradient of φ at a, denoted by ∇φ(a), is

∇φ(a) = (∂φ/∂x)(a) i + (∂φ/∂y)(a) j + (∂φ/∂z)(a) k

∇φ can be thought of as the "differential operator"

∇ := (∂/∂x) i + (∂/∂y) j + (∂/∂z) k

acting on φ.

The gradient can also be defined on R². If u is a unit vector in R³ and ∇φ(a) ≠ 0, then ∇φ(a) · u is the "projection" of ∇φ(a) on u, or the component of ∇φ(a) in the direction of u. It is also the directional derivative (p. 17) of φ in the direction of u. It measures the rate of change of φ at a in the direction of u. Hence, the rate of change is maximum when u = ∇φ(a)/‖∇φ(a)‖, and its magnitude is then equal to ‖∇φ(a)‖.

Scalar & vector fields
1. A function F : U ⊂ R³ −→ R is called a scalar field.
2. A function F : U ⊂ R³ −→ R³ is called a vector field. In terms of components,

   F(x, y, z) = F₁(x, y, z) i + F₂(x, y, z) j + F₃(x, y, z) k

   for suitable functions Fᵢ : R³ −→ R (i = 1, 2, 3). We will say that the vector field F is differentiable if each Fᵢ is. The gradient ∇φ : U −→ R³, a 7→ ∇φ(a), is a vector field.
Definition 64. Let F be a differentiable three-dimensional vector field. The divergence of F is the scalar field

∇ · F := ∂F₁/∂x + ∂F₂/∂y + ∂F₃/∂z

It can be viewed as the "formal" dot product of ∇ := (∂/∂x) i + (∂/∂y) j + (∂/∂z) k and F = F₁ i + F₂ j + F₃ k. A common notation for the divergence is div F.

Caution We can define a formal dot product F · ∇ by (F · ∇)φ := F₁(∂φ/∂x) + F₂(∂φ/∂y) + F₃(∂φ/∂z). Clearly ∇ · F ≠ F · ∇, since the LHS is a scalar field (it acts on points of R³) while the RHS is an operator acting on scalar fields.
The curl Let F be a vector field as above. Then the curl of F at a is the vector field defined by

(curl F)(a) = (∂F₃/∂y (a) − ∂F₂/∂z (a)) i + (∂F₁/∂z (a) − ∂F₃/∂x (a)) j + (∂F₂/∂x (a) − ∂F₁/∂y (a)) k

Alternatively, the curl may be viewed as the "cross product":

curl F = ∇×F = | i      j      k     |
               | ∂/∂x   ∂/∂y   ∂/∂z  |
               | F₁     F₂     F₃    |

The curl is also denoted rot F.

The curl is also defined for two-dimensional vector fields, by

∂F₂/∂x − ∂F₁/∂y

which makes it a scalar field.
Properties Let φ and ψ be differentiable scalar fields, and F and G differentiable vector fields on the same domain in R³. Let c ∈ R be arbitrary.

1. (Linearity of the gradient) ∇(cφ + ψ) = c∇φ + ∇ψ
2. (Linearity of the divergence) div(cF + G) = c(div F) + div G
3. (Linearity of the curl) curl(cF + G) = c(curl F) + curl G

For the remaining properties we assume that φ, F and G have continuous second-order partial derivatives.

4. div(φF) = (∇φ · F) + φ(div F). Here φF is the vector field defined by (φF)(a) := φ(a)F(a).
5. curl(φF) = (∇φ)×F + φ(curl F)
6. div(F×G) = G · curl F − F · curl G
7. curl(F×G) = (G · ∇)F − G(div F) − (F · ∇)G + F(div G)
8. ∇(F · G) = (G · ∇)F + (F · ∇)G + (G×curl F) + (F×curl G)
9. ∇²φ := div(∇φ) = ∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z². The symbol ∇²φ is also denoted ∆φ and is called the Laplacian of φ. The equation ∆φ = 0 is Laplace's equation.
10. curl(∇φ) = 0
11. div(curl F) = 0
12. curl(curl F) = ∇(div F) − ∆F, where ∆F = (∆F₁, ∆F₂, ∆F₃).
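Property 10 (curl ∇φ = 0) can be sanity-checked numerically. The sketch below (ours, not from the text) differentiates an exact gradient field by central finite differences; the computed curl should vanish to within discretisation error:

```python
import math

h = 1e-5  # finite-difference step (an assumption of this sketch)

def partial(F, i, p):
    """Central-difference partial derivative of scalar function F
    with respect to coordinate i at the point p (a list of floats)."""
    q1, q2 = list(p), list(p)
    q1[i] += h
    q2[i] -= h
    return (F(q1) - F(q2)) / (2 * h)

def curl(F, p):
    """Numerical curl of a vector field F at p, component by component."""
    Fi = lambda i: (lambda q: F(q)[i])
    return (partial(Fi(2), 1, p) - partial(Fi(1), 2, p),
            partial(Fi(0), 2, p) - partial(Fi(2), 0, p),
            partial(Fi(1), 0, p) - partial(Fi(0), 1, p))

# F := grad(phi) for phi(x, y, z) = x^2 y + sin z; by property 10, curl F = 0
grad_phi = lambda q: (2 * q[0] * q[1], q[0] ** 2, math.cos(q[2]))
c = curl(grad_phi, [0.7, -1.3, 0.4])
```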
2.11 Line, Surface & Volume Integrals

Paths or Curves in Rⁿ A map

γ : [a, b] ⊂ R −→ Rⁿ (n = 2, 3),  γ(t) = (γ₁(t), · · · , γₙ(t))

is said to be a continuous (resp. Cᵏ, 1 ≤ k ≤ ∞) path or curve if its component functions γᵢ are each continuous (resp. Cᵏ). It is piecewise-continuous or piecewise-Cᵏ if [a, b] can be partitioned into finitely many sub-intervals on each of which γ is continuous or Cᵏ and the only discontinuities are jump discontinuities; as a special case, it may consist of finitely many continuous or Cᵏ curves joined end to end. The curve is said to be closed if γ(a) = γ(b). It is said to be simple closed or a Jordan curve if γ is one-to-one on (a, b], i.e. the only self-intersection of the curve occurs when γ(a) = γ(b). The domain of γ is sometimes called a parameter domain. The parameter domain is not always explicitly mentioned but is assumed to be given. Sometimes, for convenience, the same symbol γ represents not only the curve (which by the definition is a function) but also the range γ([a, b]) of the function.
Tangents Let γ : [a, b] ⊂ R −→ Rⁿ (n = 2, 3) be a differentiable curve. Then the tangent to the curve is the function γ̇ : [a, b] −→ Rⁿ defined by

γ̇(t) = (γ̇₁(t), · · · , γ̇ₙ(t))

where the dot is traditional notation for differentiation with respect to the parameter t, which is sometimes thought of as representing 'time'.
Line Integrals Let γ : [a, b] −→ Rⁿ (n = 2, 3) be a piecewise-smooth curve with image C := γ([a, b]). If f : C −→ Rⁿ is bounded, then the line integral of f along γ is the integral (assuming it exists)

∫_γ f · dγ := ∫_a^b f(γ(t)) · γ̇(t) dt   (dot product in the integrand)

An alternative notation (with n = 3, for example) is

∫_γ f₁(x, y, z) dx + f₂(x, y, z) dy + f₃(x, y, z) dz

where x := γ₁(t), y := γ₂(t), z := γ₃(t). If γ is closed, then the line integral is sometimes denoted by ∮_γ f · dγ, or, if the curve is traversed in the "anticlockwise" direction, by ↺∮_γ f · dγ.
Theorem 65.

1. (Linearity) For constants a, b and functions f, g,

   ∫_γ (af + bg) · dγ = a ∫_γ f · dγ + b ∫_γ g · dγ

2. If γ and ρ are "concatenated" (or joined) curves, i.e. the end-point of γ is the starting point of ρ, then, denoting the resulting curve by γ + ρ,

   ∫_{γ+ρ} f · dγ = ∫_γ f · dγ + ∫_ρ f · dγ
Two piecewise-smooth curves γ : [a, b] −→ Rⁿ and ρ : [c, d] −→ Rⁿ are said to be equivalent if there exists an onto (i.e. surjective) function u : [c, d] −→ [a, b] such that its derivative u′ ≠ 0 on [c, d] and ρ = γ ∘ u. If u′ > 0 on [c, d], the curves ρ and γ are said to have the same orientation and u is said to be orientation-preserving, whereas if u′ < 0 on [c, d], they are said to have opposite orientations and u is said to be orientation-reversing.

Example. Define u : [0, 1] −→ [0, 1] by u(t) = 1 − t. If γ(t) is a curve on [0, 1], then (γ ∘ u)(t) = γ(1 − t) is an equivalent curve of opposite orientation to γ. In fact, both have the same range or "trace" (viz. γ([0, 1])), except that γ ∘ u covers it moving from γ(1) to γ(0).
Theorem 66. (Line integrals under a change of parameter) Let γ and ρ be equivalent curves. Then

∫_γ f · dγ = ∫_ρ f · dρ   if γ and ρ have the same orientation;
∫_γ f · dγ = −∫_ρ f · dρ  if γ and ρ have opposite orientations.
Line integrals with respect to arc length Let γ : [a, b] −→ Rⁿ be a C¹ curve. Then the arc length along γ is given by

s(t) = ∫_a^t ‖γ̇(u)‖ du,  so that  ṡ(t) = ‖γ̇(t)‖

If φ is a scalar field defined and bounded on the range Γ of γ, then the line integral of φ with respect to arc length along γ is defined by

∫_Γ φ ds := ∫_a^b (φ ∘ γ)(t) ṡ(t) dt = ∫_a^b φ(γ(t)) ‖γ̇(t)‖ dt

if the integral on the RHS exists.
Definition 67. An open subset U of Rⁿ is said to be connected if any two points in it can be joined by a polygonal line, i.e. a continuous curve consisting of line segments joined end to end.

Theorem 68. Let D be a connected open set in Rⁿ (n = 2, 3) and φ : D −→ R be a differentiable scalar field with continuous gradient ∇φ. If γ is a piecewise smooth curve in D and γ(a), γ(b) are two points on it, then

∫_γ ∇φ · dγ = φ(γ(b)) − φ(γ(a))

where γ is the portion of the curve starting from γ(a) and ending at γ(b). Thus, the line integral of a gradient is independent of the curve joining the endpoints γ(a) and γ(b) (this is usually termed path-independence). Moreover, the line integral of a gradient along a piecewise-smooth closed curve is 0.
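Path-independence of gradient integrals is easy to observe numerically. In this sketch (ours, not from the text), the same gradient field is integrated along two different curves from (0, 0) to (1, 2); both values agree with φ(γ(b)) − φ(γ(a)):

```python
def line_integral(f, gamma, dgamma, a, b, n=20000):
    """Midpoint-rule approximation of the line integral of f along gamma."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(u * v for u, v in zip(f(gamma(t)), dgamma(t))) * h
    return total

phi = lambda p: p[0] ** 2 * p[1] + p[1] ** 3                         # a potential
grad_phi = lambda p: (2 * p[0] * p[1], p[0] ** 2 + 3 * p[1] ** 2)    # its gradient

# two different piecewise smooth curves from (0, 0) to (1, 2)
seg = lambda t: (t, 2 * t);       dseg = lambda t: (1.0, 2.0)        # straight segment
par = lambda t: (t, 2 * t * t);   dpar = lambda t: (1.0, 4 * t)      # parabolic arc

i1 = line_integral(grad_phi, seg, dseg, 0.0, 1.0)
i2 = line_integral(grad_phi, par, dpar, 0.0, 1.0)
expected = phi((1.0, 2.0)) - phi((0.0, 0.0))   # phi(end) - phi(start) = 10
```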
Theorem 69. Let f : D ⊂ Rⁿ −→ Rⁿ (n = 2, 3) be a continuous vector field on the connected open domain D. Suppose the line integral ∫_γ f · dγ is path-independent in D (i.e. with respect to any pair of points of D). Fix a point x₀ ∈ D and define a scalar field φ : D −→ R by

φ(x) = ∫_γ f · dγ

where γ is any piecewise smooth curve joining x₀ to x and lying in D (i.e. the range of γ is contained in D). Then ∇φ exists and ∇φ(x) = f(x) for all x ∈ D. Under these circumstances, φ is called a potential function (for f) and f is said to be the gradient of the potential (function) φ.
Theorem 70. Let f : D ⊂ Rⁿ −→ Rⁿ (n = 2, 3) be a continuous vector field, D connected and open. Then the following statements are equivalent:

1. f is the gradient of some potential φ.
2. ∫_γ f · dγ is path-independent: γ can be replaced by any other piecewise smooth curve in D, provided the two curves have the same starting point and the same ending point.
3. ∫_γ f · dγ = 0 for every closed piecewise smooth curve γ in D.

Theorem 71. Suppose f : D ⊂ R² −→ R², f(x, y) = P(x, y) i + Q(x, y) j, is a continuously differentiable vector field on the open, connected and simply connected set D. Then f is a gradient on D (i.e. f = ∇φ for some scalar field φ) iff

∂P/∂y = ∂Q/∂x
Surface Integrals Let S = r(R) be a parametric surface (see Definition 56), r : R ⊂ R² −→ R³ being differentiable. Let f : S −→ R be a bounded scalar field (bounded as a function). Then the surface integral of f over S is

∫∫_S f dS := ∫∫_R f(r(u, v)) ‖ ∂r/∂u × ∂r/∂v ‖ du dv

if the double integral on the RHS exists. If f ≡ 1 in the above formula, we obtain the expression for surface area:

∫∫_S dS = ∫∫_R ‖ ∂r/∂u × ∂r/∂v ‖ du dv

Cf. Definition 58.
2.12 Green’s, Stokes’ &Gauss’ theorems
Green’s Theorem 1 This version of the theorem
is for regions in R2 bounded by piecewise
smooth Jordan curves.
Theorem 72. Let P,Q : U ⊂ R2 −→ R be
continuously differentiable scalar fields. Let γ
be a piecewise smooth Jordan curve and R de-
note the region in R2 consisting of all points
on the curve and enclosed by it. Assume that
R ⊂ U . Then∫∫R
(∂Q
∂x− ∂P
∂y
)dxdy = ∨©
∫γ
P dx+Qdy
γ is traversed in the anticlockwise direction.
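Green's theorem can be verified numerically on a simple region. In this sketch (ours, not from the text), P = −xy² and Q = x²y on the unit square, so ∂Q/∂x − ∂P/∂y = 4xy and both sides equal 1:

```python
def midpoint(f, a, b, n=400):
    """Midpoint rule for a one-dimensional integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

P = lambda x, y: -x * y * y
Q = lambda x, y: x * x * y

# area integral of dQ/dx - dP/dy = 4xy over the unit square (exact value 1)
area_side = midpoint(lambda x: midpoint(lambda y: 4 * x * y, 0.0, 1.0), 0.0, 1.0)

# boundary integral of P dx + Q dy, unit square traversed anticlockwise
bottom = midpoint(lambda x: P(x, 0.0), 0.0, 1.0)    # y = 0, left to right
right  = midpoint(lambda y: Q(1.0, y), 0.0, 1.0)    # x = 1, upwards
top    = midpoint(lambda x: -P(x, 1.0), 0.0, 1.0)   # y = 1, right to left
left   = midpoint(lambda y: -Q(0.0, y), 0.0, 1.0)   # x = 0, downwards
boundary = bottom + right + top + left
```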
Green’s Theorem 2
Definition 73. (Multiply connected do-
mains) Suppose γ, γ1, γ2, . . . , γn are piece-
wise smooth closed Jordan curves satisfying the
properties:
1. No pair of curves intersects.
2. γ1, γ2, . . . , γn lie in the interior of the re-
gion bounded by γ.
3. Any γi lies outside the region bounded by
γj if i 6= j.
If R and Ri are the regions bounded by γ and
the γi, then define R := Rr (∪ni=1Ri). The set
R is said to be multiply or n-connected with
boundary curves γ, γ1, . . . , γn.
Theorem 74. If P,Q : U ⊂ R2 −→ R are
continuously differentiable scalar fields, R ⊂ Uis a multiply connected region as above, then∫∫R
(∂Q
∂x− ∂P
∂y
)dxdy = ∨©
∫γ
P dx+Qdy
+
n∑i=1
∨©
∫γi
P dx+Qdy
Stokes’ Theorem
Theorem 75. Let S = r(R) be a smooth simple
parametric surface, R ⊂ R2, bounded by a
piecewise smooth Jordan curve γ. Suppose also
that r is C2 on some open set U which con-
tains R and its boundary. Let Γ(t) := r(γ(t))
be the curve bounding S. If F = P i+Q j+Rk,
P,Q,R continuously differentiable, is a vector
field on S, then∫∫S
(curl F) · n dS =
∫Γ
F · dΓ
where n is the usual unit vector normal to S
(p.26).
Gauss’ or the Divergence theorem The 3-
dimensional analogue of a closed curve will
be called a closed or compact surface. A
2-dimensional surface has two normals at
each point. Call the one pointing outside the
surface as the “outward normal”. This might
be the standard unit normal n defined on
p. 26 or −n. Regardless of the choice made,
denote it by n.
Definition 76. A surface S is orientable if the
normal vector field defined as follows:
N : S −→ R3
a 7→ ∇f(a)
||∇f(a)||if (S = f−1(0))
a 7→(∂r∂u×
∂r∂v
)(a)∣∣∣∣( ∂r
∂u×∂r∂v
)(a)∣∣∣∣
(if S = r(R), is simple)
is smooth, i.e. its component functions are
smooth. In the latter case when S is a simple
parametric surface, a = r(ua, va) for unique
(ua, va) in the parameter domain R.
Theorem 77. Let V be a “solid” in R3 bounded
by an orientable closed surface S. If F is a con-
tinuously differentiable vector field on V, then∫∫∫V
(div F) dxdy dz =
∫∫S
F · n dS
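The divergence theorem can be checked numerically on a unit cube. In this sketch (ours, not from the text), F = (x², y², z²), so div F = 2x + 2y + 2z, and both the volume integral and the total outward flux equal 3:

```python
def midpoint2(f, n=200):
    """Midpoint rule for a double integral of f over the unit square."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

F = lambda x, y, z: (x * x, y * y, z * z)
divF = lambda x, y, z: 2 * x + 2 * y + 2 * z

# volume integral of div F over the unit cube (midpoint rule in 3D)
n3 = 40
h = 1.0 / n3
vol = sum(divF((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
          for i in range(n3) for j in range(n3) for k in range(n3)) * h ** 3

# outward flux of F through the six faces of the cube
flux = (midpoint2(lambda y, z: F(1, y, z)[0]) - midpoint2(lambda y, z: F(0, y, z)[0])
      + midpoint2(lambda x, z: F(x, 1, z)[1]) - midpoint2(lambda x, z: F(x, 0, z)[1])
      + midpoint2(lambda x, y: F(x, y, 1)[2]) - midpoint2(lambda x, y: F(x, y, 0)[2]))
```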
————————–
Chapter 3
Fourier series
Trigonometric series An infinite series of the form

a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)     (3.1)

is called a trigonometric series.
Fourier series The trigonometric series is called the Fourier series of a function f : [−π, π] −→ R if, for n = 0, 1, 2, . . .,

aₙ = (1/π) ∫_{−π}^{π} f(x) cos nx dx,   bₙ = (1/π) ∫_{−π}^{π} f(x) sin nx dx     (3.2)

This is sometimes expressed by writing

f(x) ∼ a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)

and the series is said to be generated by or associated with f. The numbers aₙ and bₙ are called the Fourier coefficients of f. More generally, if the domain of f is [−l, l], then by considering the Fourier series associated with f(lx/π), the Fourier series of f(x) becomes

a₀/2 + ∑_{n=1}^∞ (aₙ cos(nπx/l) + bₙ sin(nπx/l))     (3.3)

and the formulas for the Fourier coefficients become, for n = 0, 1, 2, . . .,

aₙ = (1/l) ∫_{−l}^{l} f(x) cos(nπx/l) dx,   bₙ = (1/l) ∫_{−l}^{l} f(x) sin(nπx/l) dx     (3.4)
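The coefficient formulas (3.4) can be evaluated numerically. In this sketch (ours, not from the text), for f(x) = x on [−π, π] the known coefficients are aₙ = 0 and bₙ = 2(−1)ⁿ⁺¹/n:

```python
import math

def fourier_coeffs(f, l, nmax, n=20000):
    """Fourier coefficients a_0..a_nmax and b_1..b_nmax of f on [-l, l]
    via the midpoint rule applied to the integrals (3.4)."""
    h = 2 * l / n
    xs = [-l + (i + 0.5) * h for i in range(n)]
    a = [sum(f(x) * math.cos(k * math.pi * x / l) for x in xs) * h / l
         for k in range(nmax + 1)]
    b = [0.0] + [sum(f(x) * math.sin(k * math.pi * x / l) for x in xs) * h / l
                 for k in range(1, nmax + 1)]
    return a, b

# f(x) = x on [-pi, pi]: a_n = 0, b_n = 2(-1)^(n+1)/n
a, b = fourier_coeffs(lambda x: x, math.pi, 3)
```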
Not every trigonometric series is the Fourier series of a function. However, we have the result:

Theorem 78.

1. If the series (3.1) converges uniformly in [−π, π] to a function f : [−π, π] −→ R, then it is the Fourier series of f.
2. In particular, if p > 1 and lim_{n→∞} nᵖaₙ < ∞ and lim_{n→∞} nᵖbₙ < ∞, then the trigonometric series above is a Fourier series.
Orthogonality Functions f, g : [a, b] −→ R are said to be orthogonal if ∫_a^b f(x)g(x) dx = 0. The function f is said to be normalised on [a, b] if ∫_a^b f²(x) dx = 1. The following is an important list of orthogonal pairs of functions, together with two standard normalisation integrals.

∫_{−l}^{l} cos(mπx/l) cos(nπx/l) dx = 0     (m ≠ n; m, n = 0, ±1, ±2, . . .)
∫_{−l}^{l} sin(mπx/l) sin(nπx/l) dx = 0     (m ≠ n; m, n = 0, ±1, ±2, . . .)
∫_{−l}^{l} cos(mπx/l) sin(nπx/l) dx = 0     (m, n = 0, ±1, ±2, . . .)
(1/l) ∫_{−l}^{l} sin²(nπx/l) dx = 1         (n = 1, 2, . . .)
(1/l) ∫_{−l}^{l} cos²(nπx/l) dx = 1         (n = 1, 2, . . .)
Fourier series of even & odd functions f : [−l, l] −→ R is said to be even (resp. odd) if f(−x) = f(x) (resp. f(−x) = −f(x)).

Definition 79. Let f : [−l, l] −→ R be an even function. Then the associated Fourier series is called the Fourier cosine series:

f(x) ∼ a₀/2 + ∑_{n=1}^∞ aₙ cos(nπx/l)     (3.5)

a₀ = (2/l) ∫_0^l f(x) dx,   aₙ = (2/l) ∫_0^l f(x) cos(nπx/l) dx
Definition 80. Let f : [−l, l] −→ R be an odd function. Then the associated Fourier series is called the Fourier sine series:

f(x) ∼ ∑_{n=1}^∞ bₙ sin(nπx/l)     (3.6)

bₙ = (2/l) ∫_0^l f(x) sin(nπx/l) dx
Bessel’s inequality Assume that f : [0, π] −→ Ris piecewise continuous.
For the cosine series Let f be even and
have the associated cosine series (3.5). Then,
for N = 1, 2, . . .
a20
2+
N∑n=1
a2n ≤
2
π
∫ π
0
f2(x) dx
For the sine series Let f be odd and have
the associated sine series (3.6). Then, for N =
1, 2, . . .
N∑n=1
b2n ≤2
π
∫ π
0
f2(x) dx
Property of the Fourier coefficients With f as above,

aₙ → 0 as n → ∞ if f is even;   bₙ → 0 as n → ∞ if f is odd.
Half-range series & periodic extensions Suppose f : [0, l] −→ R. Then f can be extended to f̃ : [−l, l] −→ R in two ways: f̃(x) = f(−x) or f̃(x) = −f(−x) on [−l, 0] (in either case f̃(x) = f(x) on [0, l]). In the first situation, the extension of f to [−l, l] is even, whereas in the second, it is odd.

Definition 81. A function F : R −→ R is periodic with period l if F(x + l) = F(x) for all x ∈ R.

The periodic extension of an even function f defined on [−l, l] to a function F defined on R is obtained by repeatedly shifting the graph of f by 2l units to the right and to the left. To periodically extend an odd function, essentially the same procedure is followed, but using the graph of f on the open interval (−l, l) and defining F(x) = 0 at the points {nl : n = ±1, ±2, . . .}. Depending on which extension is chosen, F has either an associated Fourier cosine series or a sine series.
Convergence of Fourier series

Definition 82. A function f : [a, b] −→ R is piecewise smooth if f is piecewise continuous and differentiable with piecewise continuous derivative on [a, b]. (At the endpoints, the one-sided derivatives are taken.)

Theorem 83.

1. Let f : (−l, l) −→ R be piecewise smooth. At each x ∈ (−l, l), the Fourier series of f on (−l, l) converges:

   a₀/2 + ∑_{n=1}^∞ (aₙ cos(nπx/l) + bₙ sin(nπx/l)) = [f(x+) + f(x−)] / 2

   the Fourier coefficients aₙ and bₙ being given by (3.4). Hence, at an interior point of continuity x of f, the series converges to f(x).

2. If F is the periodic extension of f with period 2l, then by the result for f, at each x ∈ (−∞, ∞),

   a₀/2 + ∑_{n=1}^∞ (aₙ cos(nπx/l) + bₙ sin(nπx/l)) = [F(x+) + F(x−)] / 2

   The Fourier coefficients aₙ and bₙ are again given by (3.4), and at a point of continuity x of F, the series converges to F(x).

Corollary 84. The Fourier series of f : (−l, l) −→ R at x = ±l converges to [f(−l+) + f(l−)] / 2. That of F at x = nl, n = ±1, ±2, . . ., converges to [F(nl+) + F(nl−)] / 2.
Differentiation of a Fourier series

Theorem 85. Let f : [−π, π] −→ R be continuous and such that f(−π) = f(π). Suppose that f′(x) is piecewise continuous on (−π, π). Then the Fourier series representation of f,

f(x) = a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)

with aₙ and bₙ as in (3.2), is differentiable at those points x ∈ (−π, π) where f′′(x) exists, and

f′(x) = ∑_{n=1}^∞ n(−aₙ sin nx + bₙ cos nx)     (3.7)

At points x where f′′(x) does not exist but the left-handed and right-handed derivatives of f′(x) exist, the series (3.7) converges to [f′(x−) + f′(x+)] / 2. The interval [−π, π] can be replaced by [−l, l] and the series above by (3.3), etc.
Integration of a Fourier series

Theorem 86. Let f : [−π, π] −→ R be piecewise continuous on (−π, π). If

f(x) ∼ a₀/2 + ∑_{n=1}^∞ (aₙ cos nx + bₙ sin nx)

without the convergence of the series being assumed, then for all x ∈ [−π, π],

∫_{−π}^x f(t) dt = (a₀/2)(x + π) + ∑_{n=1}^∞ (1/n)[aₙ sin nx − bₙ(cos nx + (−1)ⁿ⁺¹)]
Riemann-Lebesgue theorem Suppose f : [a, b] −→ R is continuous except for finitely many jump discontinuities, the jumps being of finite magnitude. Then

lim_{x→∞} ∫_a^b f(t) sin xt dt = lim_{x→∞} ∫_a^b f(t) cos xt dt = 0

In particular, the result is true if x is replaced by n ∈ N and lim_{x→∞} by lim_{n→∞}.
————————–
Chapter 4
Integral Transforms
4.1 Laplace Transforms
Let f : (0,∞) −→ R be a function. Its Laplace transform, denoted L{f(t)}, is defined by

L{f(t)} = ∫_0^∞ f(t) e^{−st} dt     (4.1)

provided the improper integral (improper with respect to both endpoints) converges for some value of s ∈ R. The following notation is used to emphasise the role of L{f(t)} as a function of s:

F(s) := L{f(t)}

F and f are sometimes referred to as the generating function and the determining function respectively.
Theorem 87. If the integral (4.1) converges at s = s₀, then it converges for s > s₀. If it diverges at s = s₀, then it diverges for s < s₀.

Corollary 88. lim_{s→∞} F(s) = 0. Hence a polynomial which is not the zero polynomial cannot be the Laplace transform of any function.

As a consequence of Theorem 87, there are three possibilities:

1. Integral (4.1) converges for all s.
2. Integral (4.1) diverges for all s.
3. There exists s_c such that (4.1) converges for s > s_c and diverges for s < s_c. Then s_c is called the abscissa of convergence.

In the first two cases, we set s_c = −∞ and s_c = ∞ respectively.
Theorem 89.

1. If the integral (4.1) converges absolutely at s = s₀, then it converges absolutely for s ≥ s₀.
2. If (4.1) converges for s = s₀, it converges uniformly for s₀ ≤ s ≤ R, where R ∈ R is arbitrary.

As before, an abscissa of absolute convergence s_a can be defined, which is such that absolute convergence occurs for s > s_a and absolute divergence for s < s_a. Clearly, s_c ≤ s_a.
Laplace transform of an infinite series

Theorem 90. Let f(t) = ∑_{n=0}^∞ aₙtⁿ converge for all t > 0. Suppose the coefficients aₙ satisfy the condition

|aₙ| ≤ Cαⁿ/n!

for all n sufficiently large, where C, α > 0. Then

L{f(t)} = ∑_{n=0}^∞ aₙ L{tⁿ} = ∑_{n=0}^∞ aₙ n!/sⁿ⁺¹     (s > α)
Differentiating a Laplace transform

Theorem 91. Let

F(s) := L{f(t)} = ∫_0^∞ f(t) e^{−st} dt     (s > s_c)

Then

(d/ds) F(s) = −∫_0^∞ t f(t) e^{−st} dt     (s > s_c)

Corollary 92. F : (s_c, ∞) −→ R is infinitely differentiable and

F⁽ⁿ⁾(s) = ∫_0^∞ (−t)ⁿ f(t) e^{−st} dt     (s > s_c)

Integrating a Laplace transform

Theorem 93. Let F(s) = L{f(t)} for s > s_a and suppose ∫_0^∞ |f(t)|/t dt < ∞. Then

∫_s^∞ F(u) du = L{f(t)/t}     (s > s_a)
Linearity of the Laplace transform If f, g : (0,∞) −→ R are Laplace transformable for s > s₀, and a, b ∈ R are arbitrary, then

L{(af + bg)(t)} = aL{f(t)} + bL{g(t)}     (s > s₀)

Linear change of variables Let λ > 0.

1. L{f(λt)} = (1/λ) F(s/λ)
2. If f(t) = 0 on −∞ < t < 0, then L{f(t − λ)} = e^{−λs} L{f(t)}.
3. F(λs) = L{(1/λ) f(t/λ)} and F(s − λ) = L{e^{λt} f(t)}.
Differentiating the determining function f

Theorem 94. Let f : [0,∞) −→ R be continuously differentiable and satisfy the growth condition lim_{t→∞} f(t)e^{−st} = 0 for all s > s_c. Then

L{f′(t)} = sF(s) − f(0)     (s > s_c)

Corollary 95. Let f be Cⁿ and lim_{t→∞} f⁽ᵏ⁾(t)e^{−st} = 0 for all s > s_c and k = 0, 1, . . . , n − 1. Then

L{f⁽ⁿ⁾(t)} = sⁿF(s) − ∑_{k=1}^n f⁽ᵏ⁻¹⁾(0) sⁿ⁻ᵏ     (s > s_c)
Periodic functions Let f : [0,∞) −→ R be periodic, i.e. there exists T > 0 such that f(t + T) = f(t) for all t ≥ 0. Then

L{f(t)} = F(s) = [1/(1 − e^{−sT})] ∫_0^T f(t)e^{−st} dt
The convolution theorem The convolution of two functions f, g : (0, ∞) −→ R, denoted f ∗ g, is the function defined by
f ∗ g (t) := ∫_0^t f(x) g(t − x) dx   (4.2)
if the integral exists: for example, if f and g are piecewise continuous. The integral (4.2) may also be improper at the upper limit, in which case the integral is actually ∫_0^{t−}. The properties of the convolution “product” are as follows.
1. c(f ∗ g) = (cf) ∗ g = f ∗ (cg), c a constant.
2. (Commutativity) f ∗ g = g ∗ f.
3. (Associativity) (f ∗ g) ∗ h = f ∗ (g ∗ h).
4. (Distributivity) (f + g) ∗ h = f ∗ h + g ∗ h.
Theorem 96. Suppose L{f(t)} and L{g(t)} converge absolutely at s = a. Then
L{f ∗ g (t)} = L{f(t)} L{g(t)}   (s ≥ a)
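A numerical illustration of Theorem 96, under the assumed test functions f(t) = e^{−t} and g(t) = e^{−2t}, whose convolution has transform 1/((s+1)(s+2)):

```python
import math

def laplace(f, s, upper=20.0, n=2_000):
    # trapezoidal approximation of ∫_0^upper f(t) e^{-st} dt
    h = upper / n
    total = 0.5 * (f(0.0) + f(upper) * math.exp(-s * upper))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.exp(-s * t)
    return total * h

def conv(f, g, t, n=200):
    # (f*g)(t) = ∫_0^t f(x) g(t-x) dx, as in (4.2)
    if t == 0.0:
        return 0.0
    h = t / n
    total = 0.5 * (f(0.0) * g(t) + f(t) * g(0.0))
    for k in range(1, n):
        x = k * h
        total += f(x) * g(t - x)
    return total * h

f = lambda t: math.exp(-t)
g = lambda t: math.exp(-2.0 * t)
s = 3.0
lhs = laplace(lambda t: conv(f, g, t), s)
rhs = laplace(f, s) * laplace(g, s)
assert abs(lhs - rhs) < 1e-3                       # L{f*g} = L{f} L{g}
assert abs(lhs - 1.0 / ((s + 1) * (s + 2))) < 1e-3
```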
Rational functions are Laplace transforms
Theorem 97. Let R(s) be a rational function, i.e. R(s) = P(s)/Q(s), where P, Q are polynomials. Suppose lim_{s→∞} R(s) = 0. Then R is the Laplace transform of some determining function.
Theorem 98. (Power series in 1/s) Let
F(s) = ∑_{n=0}^∞ a_n / s^{n+1}   (s > r for some r)
be the Laplace transform of a function f with s_a ≤ r. Then
f(t) = ∑_{n=0}^∞ a_n t^n / n!   (0 ≤ t < ∞)
Laplace transform of the Dirac delta Let δ(t − a) be the Dirac delta “function” (or “pulse”) at t = a. Then its Laplace transform is given by
F(s) = L{δ(t − a)} = e^{−as}
Hence, if a = 0, L{δ(t)} = 1 and L^{−1}[1] = δ(t). This last statement does not contradict the property stated in Corollary 88 because the Dirac delta is not a function in the conventional sense.
Uniqueness
Theorem 99. A function F (s) cannot be the
Laplace transform of more than one continuous
function f(t).
The inversion formula If F(s) is a given function, then the inverse transform of F, if it exists, denoted by L^{−1}[F], is a function f(t) such that L{f(t)} = F(s).
Theorem 100. Let f : (0, ∞) −→ R be a continuous function whose Laplace transform converges to F(s) absolutely for s > s_a. Then
L^{−1}[F(s)] = f(t)
4.2 Fourier Transforms
A function f : R −→ C is absolutely integrable if ∫_{−∞}^∞ |f(t)| dt < ∞, i.e. the improper integral converges.
Definition 101. The Fourier transform of an absolutely integrable function f : R −→ C, denoted by f̂, F[f(t)] or F, is defined to be the function
f̂ : R −→ C,   f̂(ω) := ∫_{−∞}^∞ f(t) e^{−iωt} dt   (4.3)
The function f is most commonly taken to be real-valued, f : R −→ R.
Caution The Fourier transform has variant definitions in the literature. These can be summarised as follows. If
f̂(ω) = (1/a) ∫_{−∞}^∞ f(t) e^{ibωt} dt
then the common choices for the pair (a, b) are (√(2π), ±1), (1, ±√(2π)), (1, ±1). (4.3) above corresponds to (a, b) = (1, −1).
Basic Properties Let f : R −→ R be absolutely integrable. Then
1. |f̂(ω)| ≤ ∫_{−∞}^∞ |f(t)| dt. Thus, f̂ is bounded.
2. f̂ is continuous.
3. lim_{ω→−∞} f̂(ω) = 0 = lim_{ω→+∞} f̂(ω).
4. If the function f is even, then
f̂(ω) = 2 ∫_0^∞ f(t) cos ωt dt
and hence f̂ is even, whereas if f is odd, then
f̂(ω) = −2i ∫_0^∞ f(t) sin ωt dt
and f̂ is thus odd.
5. The Fourier transform is linear: (cf + g)^ = c f̂ + ĝ, c ∈ R.
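Property 4 can be exercised numerically. The even test function f(t) = e^{−|t|}, whose transform is known to be 2/(1 + ω²), is an assumption made for illustration:

```python
import math

def ft_even(f, w, upper=40.0, n=40_000):
    # property 4: f̂(ω) = 2 ∫_0^∞ f(t) cos ωt dt for even f (trapezoidal rule)
    h = upper / n
    total = 0.5 * (f(0.0) + f(upper) * math.cos(w * upper))
    for k in range(1, n):
        t = k * h
        total += f(t) * math.cos(w * t)
    return 2.0 * total * h

f = lambda t: math.exp(-abs(t))
w = 1.3
assert abs(ft_even(f, w) - 2.0 / (1.0 + w * w)) < 1e-4
```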
Definition 102. (Convolution) Given two functions f, g : R −→ C, their convolution, denoted f ∗ g, is defined to be the function
f ∗ g : R −→ C,   f ∗ g (t) = ∫_{−∞}^∞ f(x) g(t − x) dx
if the integral exists. Cf. (4.2).
The convolution theorem Let f, g : R −→ C be functions which are piecewise continuous, absolutely integrable and bounded. Then f ∗ g is absolutely integrable and (f ∗ g)^ = f̂ ĝ. Convolution is commutative, associative and distributive.
Definition 103. The cross-correlation ρ_fg of functions f, g is defined to be ρ_fg := g ∗ f̃, where f̃(x) is the complex conjugate of f(−x). The autocorrelation is the function ρ_ff.
Definition 104. (Fourier sine & cosine transforms) For a function f which is even or odd, we define
f̂_s(ω) := ∫_0^∞ f(t) sin ωt dt   (Fourier sine transform)
f̂_c(ω) := ∫_0^∞ f(t) cos ωt dt   (Fourier cosine transform)
Theorem 105. (Transform of the conjugate) Let f : R −→ C have Fourier transform f̂, and let g(t) be the complex conjugate of f(t). Then ĝ(ω) is the complex conjugate of f̂(−ω).
Theorem 106. (Shift in the time & frequency domain) Suppose f : R −→ C is absolutely integrable and a ∈ R. Then the translated function f_a(t) := f(t − a) and e^{iat} f(t) are also absolutely integrable and
f̂_a(ω) = e^{−iaω} f̂(ω)   (time shift)
[e^{iat} f(t)]^(ω) = f̂(ω − a)   (frequency shift)
Corollary 107. (Modulation) Let f : R −→ C have Fourier transform f̂. Then, denoting φ(t) := f(t) cos ω_0 t and ψ(t) := f(t) sin ω_0 t, we have
φ̂(ω) = ½ [f̂(ω + ω_0) + f̂(ω − ω_0)]
ψ̂(ω) = (i/2) [f̂(ω + ω_0) − f̂(ω − ω_0)]
Theorem 108. (Scaling) Let f : R −→ C be Fourier transformable. If c ∈ R, c ≠ 0, and φ(x) := f(cx), then
φ̂(ω) = |c|^{−1} f̂(c^{−1} ω)
Theorem 109. (Uniqueness of the Fourier transform) Let f, g : R −→ C be absolutely integrable with Fourier transforms f̂, ĝ. If f̂ = ĝ on R, then f = g at all the common points of continuity of f and g.
Theorem 110. (Fourier transform of the time derivative) Suppose f : R −→ C is n times continuously differentiable and lim_{t→±∞} f^(k)(t) = 0 for k = 0, 1, . . . , n − 1 (here f^(0) := f). Then the Fourier transform of f^(n) exists and
[f^(n)]^(ω) = (iω)^n f̂(ω)
Theorem 111. (Frequency derivative of the Fourier transform) If f : R −→ C is differentiable and f and f′ are both absolutely integrable, then
1. (f′)^(ω) = iω f̂(ω).
2. If both f(t) and t f(t) are absolutely integrable, then f̂ is differentiable and
(f̂)′(ω) = −i [t f(t)]^(ω)
Theorem 112. (Fourier transform of the indefinite integral) Let f be continuous and absolutely integrable with Fourier transform f̂. Suppose lim_{t→∞} ∫_{−∞}^t f(τ) dτ = 0. Then for all ω ≠ 0,
(∫_{−∞}^t f(τ) dτ)^(ω) = f̂(ω) / (iω)
Theorem 113. (Fourier transform of the Dirac delta) Define δ_a(t) := δ(t − a). Then
δ̂_a(ω) = e^{−iaω}
Moreover, if 1 denotes the constant function f(t) ≡ 1, then 1̂(ω) = 2π δ(ω).
Theorem 114. (Parseval’s identity) Let f, g : R −→ C be piecewise smooth, absolutely integrable and square-integrable (i.e. ∫_{−∞}^∞ |f(t)|² dt < ∞; similarly for g). Then
∫_{−∞}^∞ f(t) g*(t) dt = (1/2π) ∫_{−∞}^∞ f̂(ω) ĝ*(ω) dω
where * denotes complex conjugation.
Theorem 115. (Plancherel’s identity) Let f have Fourier transform f̂. Then
∫_{−∞}^∞ |f(t)|² dt = (1/2π) ∫_{−∞}^∞ |f̂(ω)|² dω
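Plancherel's identity can be spot-checked with the pair f(t) = e^{−|t|}, f̂(ω) = 2/(1 + ω²), a standard transform pair used here as an illustrative assumption; both sides equal 1:

```python
import math

f = lambda t: math.exp(-abs(t))
fhat = lambda w: 2.0 / (1.0 + w * w)     # known transform of e^{-|t|}

def integral(g, a, b, n=200_000):
    # composite trapezoidal rule
    h = (b - a) / n
    return h * (0.5 * g(a) + sum(g(a + k * h) for k in range(1, n)) + 0.5 * g(b))

time_energy = integral(lambda t: f(t) ** 2, -30.0, 30.0)
freq_energy = integral(lambda w: fhat(w) ** 2, -500.0, 500.0) / (2.0 * math.pi)
assert abs(time_energy - 1.0) < 1e-6
assert abs(time_energy - freq_energy) < 1e-3
```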
Theorem 116. (Fourier sine & cosine transforms of time-derivatives) Let f : [0, ∞) −→ R be continuously differentiable with second derivative f′′(t) piecewise continuous on each subinterval [0, b]. Suppose also that lim_{t→∞} f(t) = 0 = lim_{t→∞} f′(t). Then
(f′′)^_c = −ω² f̂_c − f′(0)
(f′′)^_s = −ω² f̂_s + ω f(0)
Definition 117. (Finite Fourier sine & cosine transforms) Let f : [0, π] −→ R be piecewise continuous. Then
f̂_s(n) := ∫_0^π f(t) sin nt dt   (n = 1, 2, . . .)   (finite Fourier sine transform)
f̂_c(n) := ∫_0^π f(t) cos nt dt   (n = 0, 1, 2, . . .)   (finite Fourier cosine transform)
Definition 118. (The Cauchy principal value) Let f : R −→ C be a function. Then
lim_{R→∞} ∫_{−R}^R f(t) dt
provided the limit exists, is the Cauchy principal value of ∫_{−∞}^∞ f(t) dt (which may not exist as an improper integral). The integral is also said to exist in the Cauchy sense.
Theorem 119. (Fourier inversion) Let f : R −→ C be absolutely integrable and piecewise smooth with Fourier transform f̂. Then
(1/2π) ∫_{−∞}^∞ f̂(ω) e^{iωt} dω = ½ [f(t+) + f(t−)]
Here, the Cauchy principal value of the integral is taken on the LHS for each t.
4.3 Z-Transforms
(One-sided) Z-Transforms Let f : (−∞, ∞) −→ C be a function and T > 0 be fixed. Then the Z-transform of the sequence {f(nT) : n = 0, 1, . . .} is defined to be
Z{f(nT)} := Z{f_n} := F(z) := ∑_{n=0}^∞ f(nT) z^{−n} ∈ C   (4.4)
when the series is convergent. The radius of convergence R (see Definition (178)) is determined by applying the ratio test or the root test. For example, applying the root test to (4.4), we obtain that the series converges if
|z| > lim sup_{n→∞} |f(nT)|^{1/n} = R
T is called the sampling time. Note that F : {z ∈ C : |z| > R} −→ C. Sometimes Z-transforms are defined by setting T = 1 in (4.4).
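A quick numerical illustration of definition (4.4) with T = 1 and the assumed sequence f(n) = aⁿ, whose Z-transform is the geometric series z/(z − a), convergent for |z| > |a|:

```python
def ztransform_partial(seq, z, terms=200):
    # partial sum of the series (4.4) with T = 1
    return sum(seq(n) * z ** (-n) for n in range(terms))

a = 0.5
seq = lambda n: a ** n        # radius of convergence R = |a|
z = 2.0
closed_form = z / (z - a)
assert abs(ztransform_partial(seq, z) - closed_form) < 1e-12
```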
Properties of the Z-transform
Let f_n := f(nT) and g_n := g(nT) be two sequences such that Z{f_n} = F(z) and Z{g_n} = G(z) for some functions F, G with radii of convergence R_f and R_g respectively.
Linearity Given any a, b ∈ C,
Z{a f_n + b g_n} = a Z{f_n} + b Z{g_n}
and the radius of convergence of the LHS is R := max{R_f, R_g}.
Shifting The right shift f(nT − kT) of f(nT) has transform
Z{f_{n−k}} = z^{−k} Z{f_n} if f(−nT) = 0 for all n ∈ N, and
Z{f_{n−k}} = z^{−k} Z{f_n} + ∑_{n=1}^k f(−nT) z^{−(k−n)} otherwise.
The left shift f(nT + kT) has transform
Z{f_{n+k}} = z^k Z{f_n} − ∑_{n=0}^{k−1} f(nT) z^{k−n} = z^k [Z{f_n} − ∑_{n=0}^{k−1} f(nT) z^{−n}]
As a special case, if k = 1, we have Z{f_{n+1}} = z [Z{f_n} − f(0)].
Time-scaling If a ∈ C, then
Z{a^{±nT} f_n} = F(a^{∓T} z) = ∑_{n=0}^∞ f(nT) (a^{∓T} z)^{−n}
Periodic sequences Let f_n be periodic with period N: f_{n+N} = f_n, and assume that f(−nT) = 0 for all n ∈ N. Define the first block of periodic values by
f_1(nT) := f(nT) for 0 ≤ n ≤ N − 1, and f_1(nT) := 0 for n ∉ {0, 1, . . . , N − 1}.
Then
Z{f(nT)} = z^N / (z^N − 1) · Z{f_1(nT)}
Multiplication by n, nT & (nT)^k Let n, k ≥ 0 be integers. Then
Z{n f(nT)} = −z dF/dz
Z{nT f(nT)} = −Tz dF/dz
Z{(nT)^k f(nT)} = −Tz (d/dz) Z{(nT)^{k−1} f(nT)}
Convolution Let f_n = f(nT) and g_n = g(nT) be sequences with Z-transforms Z{f_n} and Z{g_n} respectively. Then the convolution of f_n and g_n, denoted f_n ∗ g_n, is the sequence
f_n ∗ g_n := ∑_{k=0}^n f(kT) g(nT − kT)
Theorem 120. (The convolution theorem) If f_n and g_n are two sequences as above, then
Z{f_n ∗ g_n} = Z{f_n} Z{g_n}
The convergence of the LHS is valid in the region |z| > max{R_f, R_g}. Just as for the Laplace and the Fourier transforms, convolution is commutative, associative and distributive.
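Theorem 120 can be confirmed numerically for two assumed geometric sequences; truncation at 300 terms keeps the series tails far below the tolerance:

```python
def zt(seq, z):
    # Z-transform of a finite (truncated) sequence
    return sum(x * z ** (-n) for n, x in enumerate(seq))

N = 300
f = [0.5 ** n for n in range(N)]
g = [0.25 ** n for n in range(N)]
# discrete convolution: (f*g)_n = sum_{k=0}^n f_k g_{n-k}
fg = [sum(f[k] * g[n - k] for k in range(n + 1)) for n in range(N)]
z = 2.0
assert abs(zt(fg, z) - zt(f, z) * zt(g, z)) < 1e-9
```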
The initial value theorem Let the sequence f_n have Z-transform F(z) in some region of convergence. Then
f_n = lim_{z→∞} z^n [F(z) − ∑_{k=0}^{n−1} f_k z^{−k}]
In particular, f_0 = lim_{z→∞} F(z).
The final value theorem If the sequence f_n has Z-transform F(z) in some region of convergence, then lim_{n→∞} f_n = lim_{z→1} (z − 1) F(z), provided the limit on the LHS exists.
Transform of the complex conjugate If f_n is a sequence with Z-transform F(z), then the transform of the conjugate sequence f̄_n is the complex conjugate of F(z̄), valid in the same region of convergence as that of f_n.
Transform of a product Let f_n and g_n be sequences having Z-transforms F(z) = Z{f_n} and G(z) = Z{g_n}, valid in the regions |z| > R_f and |z| > R_g respectively. Then
Z{f(nT) g(nT)} = ∑_{n=0}^∞ f(nT) g(nT) z^{−n} = (1/2πi) ∫_C F(w) G(z/w) dw/w
where C is a simple closed contour enclosing the origin and oriented anticlockwise. The region of convergence is |z| > R_f R_g.
Transforms with parameters If f_n = f(nT, a), a ∈ R, has Z-transform F(z, a), then
Z{∂/∂a f(nT, a)} = ∂/∂a F(z, a)
Z{lim_{a→a_0} f(nT, a)} = lim_{a→a_0} F(z, a)
Z{∫_{a_0}^{a_1} f(nT, a) da} = ∫_{a_0}^{a_1} F(z, a) da
assuming the integrals are finite.
Inverse transforms. Power series method
Let F(z) be the Z-transform of a sequence f_n := f(nT). If F is analytic in |z| > R (including at z = ∞ in the extended complex plane), then f_n can be recovered as the coefficients of the Taylor series expansion of F(z) in terms of z^{−1}. In particular, suppose F is a rational function (quotient of polynomials):
F(z) = (a_0 + a_1 z^{−1} + · · · + a_n z^{−n}) / (b_0 + b_1 z^{−1} + · · · + b_n z^{−n})
     = f(0) + f(T) z^{−1} + f(2T) z^{−2} + · · · + f(nT) z^{−n} + · · ·
Then the sequence f_n is obtained from the relations
a_0 = f(0) b_0
a_1 = f(0) b_1 + f(T) b_0
...
a_n = ∑_{k=0}^n f((n − k)T) b_k
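The relations above amount to long division, and can be unwound to recover the sequence. The example below is an assumption made for illustration: F(z) = z/(z − 1/2) = 1/(1 − 0.5 z^{−1}), whose inverse is f(n) = (1/2)ⁿ:

```python
def inverse_z_rational(a, b, n_terms):
    # long division: a_n = sum_{k=0}^n f(n-k) b_k  =>  solve for f(n)
    a = a + [0.0] * (n_terms - len(a))
    b = b + [0.0] * (n_terms - len(b))
    f = []
    for n in range(n_terms):
        acc = sum(b[k] * f[n - k] for k in range(1, n + 1))
        f.append((a[n] - acc) / b[0])
    return f

# F(z) = 1/(1 - 0.5 z^{-1}):  numerator [1], denominator [1, -0.5]
seq = inverse_z_rational([1.0], [1.0, -0.5], 10)
assert all(abs(seq[n] - 0.5 ** n) < 1e-12 for n in range(10))
```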
Inverse transforms. Partial fractions
Suppose F(z) is a rational function and the Z-transform of a sequence f_n. Then F can be written as a sum: F(z) = F_1(z) + F_2(z) + · · · + F_n(z) + · · · . In this case Z^{−1}{F(z)} = ∑_{n=1}^∞ Z^{−1}{F_n(z)} and f_n is the sum of the nth terms of the inverse transforms Z^{−1}{F_n(z)}.
Now suppose
F(z) = a_1/(z − c) + a_2/(z − c)² + · · · + a_n/(z − c)^n
Then the a_j’s are given by the relations
a_n = (z − c)^n F(z) |_{z=c}
a_{n−1} = d/dz [(z − c)^n F(z)] |_{z=c}
...
a_k = 1/(n − k)! · d^{n−k}/dz^{n−k} [(z − c)^n F(z)] |_{z=c}
...
a_1 = 1/(n − 1)! · d^{n−1}/dz^{n−1} [(z − c)^n F(z)] |_{z=c}
————————–
Chapter 5
Ordinary Differential Equations (ODEs)
5.1 First-order equations
Let f : U ⊂ R −→ R be differentiable on an open set U. An ODE which contains no derivative of order higher than the first is called a first-order equation. Such an equation appears in one of two forms:
y′ = F(x, y)   (explicit form)   (5.1)
F(x, y, y′) = 0   (implicit form)   (5.2)
where y = f(x) (often written as y = y(x)). An initial value problem is an ODE together with a prescribed value of y(x_0) at a given point x = x_0. If a function f satisfying the ODE exists, it is called a solution of the ODE.
5.2 First-order equations in separable form
Suppose the ODE is given as (5.1) and that
F (x, y) = f(x)g(y) for some functions f and g.
Then the solution of (5.1) is given by
∫ dy/g(y) = ∫ f(x) dx + c
where c is an arbitrary constant of integration. Particular solutions are obtained by giving values to c.
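The separable recipe can be checked against a numerical integration of the ODE. Below, dy/dx = x·y (so f(x) = x, g(y) = y, an assumed example) is solved with a classical Runge-Kutta step and compared with the closed form y = y_0 e^{x²/2}:

```python
import math

def rk4(F, x0, y0, x1, n=1_000):
    # classical 4th-order Runge-Kutta for y' = F(x, y)
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = F(x, y)
        k2 = F(x + h / 2, y + h * k1 / 2)
        k3 = F(x + h / 2, y + h * k2 / 2)
        k4 = F(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

F = lambda x, y: x * y                 # separable: f(x) = x, g(y) = y
y_numeric = rk4(F, 0.0, 1.0, 1.0)
y_closed = math.exp(0.5)               # from ∫ dy/y = ∫ x dx + c, with y(0) = 1
assert abs(y_numeric - y_closed) < 1e-8
```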
The following equations are reducible to separable form, with solutions as follows:
1. dy/dx = f(ax + by + c), a, b, c ∈ R. With u := ax + by + c, the solution is:
∫ du/(a + b f(u)) = x + C   (C: const. of integration)
2. (Homogeneous equations) Let F in (5.1) be homogeneous of degree 0 (see p. (2.3)). Then (5.1) can be written as y′ = f(x/y) or y′ = g(y/x). Substitute y = vx or x = uy respectively to reduce the equation to separable form. Assuming the substitution y = vx, the solution is:
∫ dv/(g(v) − v) = log |Cx|   (y = vx)
where C is an arbitrary constant of integration.
3. dy/dx = (ax + by + c)/(a′x + b′y + c′), where a, b, c and a′, b′, c′ ∈ R.
Case 1 a/a′ = b/b′ =: k. Substitute a′x + b′y = u to obtain the following equation in separable form:
∫ (u + c′)/(a′(u + c′) + b′(ku + c)) du = x + C
Case 2 a/a′ ≠ b/b′. Substitute u := x − h and v := y − k in the equation and choose h, k such that ah + bk + c = 0 and a′h + b′k + c′ = 0. The equation reduces to the form:
dv/du = (au + bv)/(a′u + b′v)
which is homogeneous in u and v.
5.3 Exact First-order ODEs
Consider the ODE
M(x, y) + N(x, y) y′ = 0   (5.3)
where M and N are functions defined on a certain common open set U ⊂ R². The ODE is said to be exact if there exists a function f : U −→ R such that ∂f/∂x = M and ∂f/∂y = N. In differential notation, the exact equation can be written as
df = M(x, y) dx + N(x, y) dy = 0   (5.4)
Theorem 121. Consider the ODE (5.3) defined on a rectangle U := [a, b] × [c, d] ⊂ R². Suppose that M and N have continuous partial derivatives in U. Then (5.3) is exact iff
∂M/∂y = ∂N/∂x   (5.5)
in U. If (5.3) is exact, its solution has the form f(x, y) = c, where
f(x, y) = ∫ M(x, y) dx + ψ(y)
ψ(y) := ∫ [N(x, y) − ∂/∂y ∫ M(x, y) dx] dy
Theorem 122. Suppose (5.3) is exact and M,N
are homogeneous of degree n 6= 1 (see p. (2.3)).
Then the solution is Mx + Ny = C (constant of
integration).
5.4 General first-order first-degree linear equations
A general first-order first-degree ODE (i.e. one in which the highest power of y′ := dy/dx is 1) can be expressed as
dy/dx + P(x) y = Q(x)   (5.6)
We have the following cases. C will denote an arbitrary constant of integration.
(P, Q constants, or P = 0 or Q = 0) The variables are separable.
(P ≠ 0, Q = 0) The solution is
y = C e^{−∫P(x) dx}
(Q ≠ 0) Then the solution is
y = e^{−∫P(x) dx} [C + ∫ Q e^{∫P(x) dx} dx]
Bernoulli’s equation The equation
dy/dx + P(x) y = Q(x) y^n
has the solution
y^{1−n} = e^{−(1−n)∫P dx} [C + ∫ (1 − n) Q e^{(1−n)∫P dx} dx]
C an arbitrary constant.
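The (Q ≠ 0) formula can be evaluated by quadrature and compared with the known solution of an assumed test problem, y′ + y = x with y(0) = 1, whose exact solution is y = x − 1 + 2e^{−x}:

```python
import math

def integral(g, a, b, n):
    # composite trapezoidal rule
    h = (b - a) / n
    return h * (0.5 * g(a) + sum(g(a + k * h) for k in range(1, n)) + 0.5 * g(b))

P = lambda x: 1.0
Q = lambda x: x

def y(x, C=1.0):
    # y = e^{-∫P dx} [C + ∫ Q e^{∫P dx} dx], with integrals taken from 0 to x
    I = lambda t: integral(P, 0.0, t, 200)          # ∫_0^t P
    return (C + integral(lambda t: Q(t) * math.exp(I(t)), 0.0, x, 2_000)) / math.exp(I(x))

x = 2.0
assert abs(y(x) - (x - 1.0 + 2.0 * math.exp(-x))) < 1e-4
```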
5.4.1 Integrating factors
A function φ(x, y) is said to be an integrating factor
of (5.3) if
φ(x, y)[M(x, y)dx+N(x, y)dy] = df
for some function f .
1. If the equation (5.4) is homogeneous (but not exact), then 1/(M(x, y) x + N(x, y) y) is an integrating factor of the equation, provided the denominator is not zero.
2. If (1/N)(∂M/∂y − ∂N/∂x) =: f(x), i.e. solely a function of x, then e^{∫f(x) dx} is an integrating factor for (5.4).
3. If (1/M)(∂M/∂y − ∂N/∂x) =: g(y), then e^{−∫g(y) dy} is an integrating factor for (5.4).
4. If M(x, y) = y f(xy) and N(x, y) = x g(xy), with Mx − Ny ≠ 0, then 1/(Mx − Ny) is an integrating factor of (5.4).
5. If M(x, y) = py and N(x, y) = qx, p, q ∈ R, then (5.4) has integrating factor x^{p−1} y^{q−1}.
6. If (5.4) has the form x^p y^q (ay dx + bx dy), a, b, p, q ∈ R, then (5.4) has integrating factor x^{a−p−1} y^{b−q−1}.
7. If (5.4) can be expressed as py dx + qx dy + x^m y^n (ay dx + bx dy) = 0, then x^α y^β is an integrating factor if
(α + 1)/p = (β + 1)/q
(α + m + 1)/a = (β + n + 1)/b
8. If (5.4) can be rewritten as x^m y^n (ay dx + bx dy) + x^{m′} y^{n′} (a′y dx + b′x dy) = 0, then x^α y^β is an integrating factor provided the following conditions are satisfied:
(α + m + 1)/a = (β + n + 1)/b
(α + m′ + 1)/a′ = (β + n′ + 1)/b′
ab′ − a′b ≠ 0
5.5 First-order nth-degree equations
The general equation of this type has the form
(y′)^n + P_1(x, y)(y′)^{n−1} + P_2(x, y)(y′)^{n−2} + · · · + P_{n−1}(x, y) y′ + P_n(x, y) = 0   (5.7)
Three particular cases of (5.7) are the following.
Equations solvable for y′. The equation (5.7), being an nth-degree polynomial in y′, may have n roots, say, u_1, . . . , u_n. Then the LHS of (5.7) factors to give (y′ − u_1)(y′ − u_2) · · · (y′ − u_n) = 0. Let y′ − u_i(x, y) = 0 have the solution f_i(x, y, C_i) = 0, C_i an arbitrary constant. Then, depending on the domain of the function y, one set of solutions of (5.7) can be obtained as
f_{i_1}(x, y, C_{i_1}) f_{i_2}(x, y, C_{i_2}) · · · f_{i_k}(x, y, C_{i_k}) = 0
(1 ≤ k ≤ n), i.e. a product of some or all of the factors f_i(x, y, C_i) = 0. Or else, solutions could be formed by joining some of these together, for example, f_i(x, y, C_i) = 0 on one subinterval and f_j(x, y, C_j) = 0 (j ≠ i) on the adjacent subinterval, etc., with a differentiable overlap.
Equations solvable for y. Suppose (5.7) can be solved for y to get y = f(x, y′). Differentiating this w.r.t. x yields
y′ = ∂f/∂x + (∂f/∂y′)(dy′/dx) = F(x, y′, dy′/dx)   (say)
Then solve y′ = F(x, y′, dy′/dx) to obtain φ(x, y′, C) = 0 (C an arbitrary constant). The general solution is obtained by eliminating y′ between
y = f(x, y′)
φ(x, y′, C) = 0
Equations solvable for x. If we have x = f(y, y′), differentiate the relation w.r.t. y to obtain a relation of the form
1/y′ = G(y, y′, dy′/dy)
Solve this equation (if possible) to obtain a relation ψ(y, y′, C) = 0. Eliminate y′ between x = f(y, y′) and ψ(y, y′, C) = 0 to get the general solution.
Clairaut’s equation This equation has the form y = x y′ + f(y′). Its general solution is the one-parameter family of lines y = Cx + f(C), C an arbitrary constant.
5.6 Linear ODEs
The general linear equation of order n defined on (α, β) ⊂ R is
L(y) := a_0(x) y^(n) + a_1(x) y^(n−1) + · · · + a_{n−1}(x) y′ + a_n(x) y = b(x)   (5.8)
Each a_i : (α, β) −→ R. A standing assumption will be that a_0 ≠ 0 on (α, β). If b(x) ≡ 0, (5.8) is said to be linear homogeneous, and linear non-homogeneous otherwise.
Theorem 123. (“Superposition”) Any finite linear combination of solutions of the homogeneous equation L(y) = 0 is again a solution of it. In other words, if y_1, . . . , y_n are solutions of L(y) = 0, then so is c_1 y_1 + c_2 y_2 + · · · + c_n y_n, where the c_i’s are arbitrary reals.
Definition 124. Suppose that y_1, . . . , y_n are solutions of (5.8) with common domain U. If ∑_{i=1}^n c_i y_i ≡ 0 on U =⇒ c_i = 0 for i = 1, . . . , n, the solutions y_i are said to be linearly independent. Otherwise, they are said to be linearly dependent. Thus, the solutions y_i are linearly dependent iff there exist c_i ∈ R, i = 1, . . . , n, not all zero, such that ∑_{i=1}^n c_i y_i ≡ 0.
Theorem 125. Consider the ODE (5.8) such that the functions a_i, b : (α, β) −→ R are continuous. Let x_0 ∈ (α, β) be arbitrary and y_0, y_1, . . . , y_{n−1} ∈ R be arbitrary. Then (5.8) has a unique solution satisfying the initial conditions
y(x_0) = y_0, y′(x_0) = y_1, . . . , y^(n−1)(x_0) = y_{n−1}
Definition 126. The Wronskian of a set of functions f_i : [α, β] −→ R (i = 1, . . . , n) which are differentiable n − 1 times, is the determinant
W(f_1, . . . , f_n)(x) :=
| f_1(x)         f_2(x)         · · ·   f_n(x)         |
| f_1′(x)        f_2′(x)        · · ·   f_n′(x)        |
| ...            ...                    ...            |
| f_1^(n−1)(x)   f_2^(n−1)(x)   · · ·   f_n^(n−1)(x)   |
Note that W(f_1, . . . , f_n) : [α, β] −→ R.
Theorem 127.
1. W(f_1, . . . , f_n) ≠ 0 ⇒ the functions f_1, . . . , f_n are linearly independent. In the opposite direction, even if f_1, . . . , f_n are linearly independent, the Wronskian W(f_1, . . . , f_n) may vanish.
2. If in the ODE (5.8) the coefficient functions a_i : (α, β) −→ R are continuous, and y_1, . . . , y_n are solutions of (5.8), then the y_i’s are linearly independent ⇐⇒ W(y_1, . . . , y_n)(x) ≠ 0 on (α, β).
Theorem 128.
1. The homogeneous equation L(y) = 0 (obtained
by setting b(x) = 0 in (5.8)) has n linearly
independent solutions.
2. If y1, . . . , yn are linearly independent solutions
of L(y) = 0, and y : (α, β) −→ R is any other
solution, then y is a linear combination of the
yi’s:
y(x) = c_1 y_1(x) + c_2 y_2(x) + · · · + c_n y_n(x)   (5.9)
for some constants c_1, . . . , c_n. Thus, together with Theorem (123), it follows that all solutions of L(y) = 0 are obtained when each of the coefficients in (5.9) is varied over R.
Theorem 129. (The non-homogeneous equation) If y_p is any solution, called the particular solution or integral, of the non-homogeneous equation L(y) = b, then all the solutions are given by y = y_p + y_c, where y_c, called the complementary function, is any solution of the homogeneous equation L(y) = 0.
5.6.1 Linear ODE of Euler (or Cauchy) type
This is a linear equation with variable coefficients
of the form
x^n y^(n) + a_1 x^{n−1} y^(n−1) + · · · + a_n y = 0   (5.10)
The associated indicial polynomial is
q(r) := [r(r − 1) · · · (r − n + 1)] + [a_1 r(r − 1) · · · (r − n + 2)] + · · · + a_n   (5.11)
It has degree n. In the special case of the 2nd-order Euler-type equation
x² y′′ + a_1 x y′ + a_2 y = 0
the indicial polynomial becomes
q(r) = r(r − 1) + a_1 r + a_2   (5.12)
Theorem 130. Let the indicial polynomial (5.11) have distinct roots r_1, r_2, . . . , r_k, and let root r_i have multiplicity n_i (so that n_1 + · · · + n_k = n). Then all the solutions of (5.10) are given by all possible linear combinations of the n linearly independent functions
|x|^{r_1}, |x|^{r_1} log |x|, . . . , |x|^{r_1} (log |x|)^{n_1−1}
|x|^{r_2}, |x|^{r_2} log |x|, . . . , |x|^{r_2} (log |x|)^{n_2−1}
...
|x|^{r_k}, |x|^{r_k} log |x|, . . . , |x|^{r_k} (log |x|)^{n_k−1}
This solution is valid in any interval not containing x = 0. In the special case of n = 2, the general solution is given by
y(x) = c_1 |x|^{r_1} + c_2 |x|^{r_2}   (distinct roots of (5.12))
y(x) = c_1 |x|^r + c_2 |x|^r log |x|   (double root r of (5.12))
where c_1 and c_2 are arbitrary reals.
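A sketch of Theorem 130 for the second-order case: the Euler equation x²y′′ + xy′ − y = 0 has indicial polynomial q(r) = r(r − 1) + r − 1 = r² − 1 with roots ±1, and the corresponding power solutions can be verified by finite differences. The sample equation and test point are assumptions for illustration:

```python
import cmath

def indicial_roots(a1, a2):
    # q(r) = r(r-1) + a1 r + a2 = r^2 + (a1 - 1) r + a2, as in (5.12)
    b, c = a1 - 1.0, a2
    d = cmath.sqrt(b * b - 4.0 * c)
    return (-b + d) / 2.0, (-b - d) / 2.0

r1, r2 = indicial_roots(1.0, -1.0)
assert {round(r1.real), round(r2.real)} == {1, -1}

def residual(r, x, h=1e-5):
    # does y = x^r satisfy x^2 y'' + x y' - y = 0 at x?
    y = lambda t: t ** r
    yp = (y(x + h) - y(x - h)) / (2.0 * h)
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h ** 2
    return x * x * ypp + x * yp - y(x)

assert all(abs(residual(r.real, 2.0)) < 1e-5 for r in (r1, r2))
```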
5.6.2 nth-order constant-coefficient homogeneous equations
These are equations of the form L(y) = 0 in which b ≡ 0 and each a_i(x) =: a_i is a constant function. The polynomial
P(u) := a_0 u^n + a_1 u^{n−1} + · · · + a_{n−1} u + a_n   (5.13)
is called the characteristic or auxiliary polynomial associated with the given homogeneous equation. The equation P(u) = 0 has n complex roots, some of which may be real. The following possibilities can occur.
Roots all real & distinct Let m_1, . . . , m_n be the distinct real roots of (5.13). Then the n linearly independent solutions are given by e^{m_1 x}, . . . , e^{m_n x}. The general solution is
y(x) = c_1 e^{m_1 x} + · · · + c_n e^{m_n x}
where the c_j’s are arbitrary real constants.
Roots all real with multiplicity Let the root m have multiplicity r. Then all the linearly independent solutions corresponding to m are e^{mx}, x e^{mx}, . . . , x^{r−1} e^{mx}. Thus, if the auxiliary polynomial has roots m_j with multiplicity r_j, j = 1, . . . , s, the general solution is:
∑_{k=1}^{r_1} c_{1k} x^{k−1} e^{m_1 x} + ∑_{k=1}^{r_2} c_{2k} x^{k−1} e^{m_2 x} + · · · + ∑_{k=1}^{r_s} c_{sk} x^{k−1} e^{m_s x}
where the c_{jk} ∈ R are arbitrary and r_1 + r_2 + · · · + r_s = n.
Roots complex & simple Since the auxiliary polynomial has real coefficients, if p + iq is a root of (5.13), then so is its complex conjugate p − iq. If there are r pairs of complex conjugate roots p_k ± i q_k, then all the basic linearly independent solutions are listed as follows:
e^{p_1 x} cos q_1 x, e^{p_1 x} sin q_1 x   (5.14)
e^{p_2 x} cos q_2 x, e^{p_2 x} sin q_2 x   (5.15)
...
e^{p_r x} cos q_r x, e^{p_r x} sin q_r x   (5.17)
The general solution is an arbitrary linear combination of the above solutions:
∑_{k=1}^r c_k e^{p_k x} cos q_k x + ∑_{k=1}^r d_k e^{p_k x} sin q_k x
c_k, d_k ∈ R.
Roots complex with multiplicity Let m = p + iq be a complex root of multiplicity r. Then so is p − iq. The linearly independent solutions corresponding to this pair of roots are
e^{px} cos qx, e^{px} sin qx
x e^{px} cos qx, x e^{px} sin qx
...
x^{r−1} e^{px} cos qx, x^{r−1} e^{px} sin qx
Similarly for the other complex roots with multiplicity. The general solution is a linear combination of all these solutions.
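As a concrete sketch (the equation is an assumed example): y′′ − 3y′ + 2y = 0 has auxiliary polynomial u² − 3u + 2 with distinct real roots 1 and 2, so every y = c_1 eˣ + c_2 e^{2x} should annihilate the operator; a finite-difference residual confirms this:

```python
import math

c1, c2 = 0.7, -1.3                       # arbitrary constants
y = lambda x: c1 * math.exp(x) + c2 * math.exp(2.0 * x)

def residual(x, h=1e-5):
    # y'' - 3y' + 2y via central differences
    yp = (y(x + h) - y(x - h)) / (2.0 * h)
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h ** 2
    return ypp - 3.0 * yp + 2.0 * y(x)

assert all(abs(residual(x)) < 1e-3 for x in (0.0, 0.5, 1.0))
```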
————————–
Chapter 6
Partial Differential Equations (PDEs)
In what follows, if f : U ⊂ R² −→ R is a function whose first-order partial derivatives exist, then
p := ∂f/∂x,   q := ∂f/∂y
6.1 Formation of a PDE
Let u = f(x, y) be a function.
Elimination of constants We consider the case of two constants. Let F(x, y, u, a, b) = 0 be a relation. Then a PDE is obtained if a and b can be eliminated from the three equations
F(x, y, u, a, b) = 0
∂F/∂x + (∂F/∂u) p = 0
∂F/∂y + (∂F/∂u) q = 0
to obtain a relation of the form G(x, y, u, p, q) = 0. Here the number of constants equals the number of independent variables. If the number of constants is greater, then higher-order partial derivatives must be taken to obtain sufficiently many equations to eliminate the constants.
Elimination of functions Suppose u := u(x, y, z) and v := v(x, y, z), where z = z(x, y) is the dependent variable, are connected by a relation of the form φ(u, v) = 0. If the first-order partial derivatives of φ exist, then an equation results which is of the form
P p + Q q = R
where, with p := ∂z/∂x and q := ∂z/∂y,
P := (∂u/∂y)(∂v/∂z) − (∂u/∂z)(∂v/∂y)
Q := (∂u/∂z)(∂v/∂x) − (∂u/∂x)(∂v/∂z)
R := (∂u/∂x)(∂v/∂y) − (∂u/∂y)(∂v/∂x)
6.2 First-order PDEs
Some terminology associated with solutions
Let
F(x, y, u, p, q) = 0   (6.1)
be a PDE in which the partial derivatives with respect to the independent variables (taken to be x and y) are of order at most 1. Such an equation is said to be of first order. The dependent variable is generally denoted by u. A solution of the form
f(x, y, u, a, b) = 0
where a, b are arbitrary constants or parameters, is said to be a complete integral of (6.1). If we restrict b = φ(a) for arbitrary functions φ, then
f(x, y, u, a, φ(a)) = 0
is said to be a general integral of (6.1). If the envelope of the two-parameter family of surfaces defined by f(x, y, u, a, b) = 0 exists, then it is also a solution of (6.1) and it is called a singular integral or singular solution of the PDE.
Linear PDEs
A PDE of the form
Pp+Qq = R (6.2)
where P = P(x, y, u), Q = Q(x, y, u) and R = R(x, y, u) are functions in which p and q do not occur and u is the dependent variable, is called a linear PDE. In general, a linear equation is one of the form
P_1 p_1 + P_2 p_2 + · · · + P_n p_n = R   (6.3)
where P_i = P_i(x_1, x_2, . . . , x_n, u) and R = R(x_1, x_2, . . . , x_n, u) are functions of the n independent variables x_1, . . . , x_n and the dependent variable u; p_i := ∂u/∂x_i for i = 1, 2, . . . , n.
Theorem 131.
1. The general solution of (6.2) is Φ(ξ, η) = 0, where Φ has first-order partial derivatives with respect to ξ and η but is otherwise arbitrary, and
ξ(x, y, u) = a,   η(x, y, u) = b
(a and b constants) are independent solutions of the ODEs
dx/P = dy/Q = du/R   (6.4)
2. The general solution of (6.3) is Φ(ξ_1, . . . , ξ_n, u) = 0, where Φ has first-order partial derivatives with respect to the ξ_i but is otherwise arbitrary, and
ξ_i(x_1, . . . , x_n, u) = a_i   (i = 1, 2, . . . , n)
(a_i’s constants) are linearly independent solutions of the ODEs
dx_1/P_1 = dx_2/P_2 = · · · = dx_n/P_n = du/R
6.2.1 Special types of first-order equations
Equations in p, q not involving x, y explicitly
Equations of the type
f(p, q) = 0
have the complete integral
u = ax + Q(a) y + b
where a, b are arbitrary constants and q = Q(a) is the function explicitly solving
f(a, q) = 0
Equations involving u but not x, y
Consider the equation
f(u, p, q) = 0
Solve the simultaneous equations f(u, p, q) = 0 and p = aq, a an arbitrary constant, to obtain formulas for p and q, which in turn are integrated to form the complete integral.
Separable equations
These are equations which can be written as
f(x, p) = g(y, q)
Solve for p and q from the simultaneous equations f(x, p) = a and g(y, q) = a, where a is an arbitrary constant, and construct the complete integral as above.
Clairaut’s equations
These are equations of the type
u = px+ qy + f(p, q)
The corresponding complete integral is
u = ax+ by + f(a, b)
6.3 Linear PDEs with constant coefficients
A PDE of the form
F(D, D′)u := ∑_{i=1}^m ∑_{j=1}^n a_{ij} D^i D′^j u = f(x, y)   (6.5)
where
D := ∂/∂x,   D′ := ∂/∂y,   D^i := ∂^i/∂x^i,   D′^j := ∂^j/∂y^j
and the a_{ij} are constants, is called a linear PDE with constant coefficients. The most general solution of the homogeneous PDE
F(D, D′)u = 0   (6.6)
is called the complementary function (CF) of (6.5). Any solution of (6.5) is termed a particular integral of the equation.
Theorem 132.
1. If u0 is the CF and u1 a particular integral of
(6.5), then u0 + u1 is the general solution of
(6.5).
2. If u1, . . . , un are solutions of (6.6), then so is
a1u1 + a2u2 + · · ·+ anun
A linear PDE is said to be reducible if it can be
factorised into factors of the form
D + aD′ + b
where a, b are constants.
Theorem 133. If (6.6) is reducible, then the
factors may be permuted without altering the equa-
tion.
Theorem 134. (Superposition) If u_1 and u_2 are solutions of a homogeneous linear PDE in the unknown function u = u(x, y, . . .) of finitely many variables defined on some domain U, then c_1 u_1 + c_2 u_2 is also a solution of the PDE in that domain.
Theorem 135.
1. If for a ≠ 0, aD + bD′ + c is a factor of F(D, D′) and φ(t) is a differentiable but otherwise arbitrary function, then
u(x, y) = e^{−(c/a)x} φ(bx − ay)
is a solution of (6.6), i.e. it is the CF of (6.5).
2. If in the above case a = 0 but b ≠ 0 and the other hypotheses are unchanged, then
u(x, y) = e^{−(c/b)y} φ(bx)
is the CF of (6.5).
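Theorem 135(1) is easy to probe numerically; φ(t) = sin t and the factor coefficients below are arbitrary assumptions for illustration:

```python
import math

a, b, c = 2.0, 3.0, 1.0
phi = math.sin                                    # arbitrary differentiable φ
u = lambda x, y: math.exp(-(c / a) * x) * phi(b * x - a * y)

def residual(x, y, h=1e-6):
    # (aD + bD' + c) u, via central differences
    ux = (u(x + h, y) - u(x - h, y)) / (2.0 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2.0 * h)
    return a * ux + b * uy + c * u(x, y)

pts = [(0.0, 0.0), (1.0, -0.5), (0.3, 2.0)]
assert all(abs(residual(x, y)) < 1e-8 for x, y in pts)
```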
For factors with multiplicity the following results
hold.
Theorem 136.
1. If for a ≠ 0, (aD + bD′ + c)^n is a factor of F(D, D′) and if φ_i(t) (i = 1, 2, . . . , n) are arbitrary differentiable functions, then
u(x, y) = e^{−(c/a)x} ∑_{i=1}^n x^{i−1} φ_i(bx − ay)
is the CF of (6.5).
2. If in the above case a = 0 but b ≠ 0 and the other hypotheses are unchanged, then
u(x, y) = e^{−(c/b)y} ∑_{i=1}^n x^{i−1} φ_i(bx)
is the CF of (6.5).
In general, the following theorem is true.
Theorem 137. Suppose F(D, D′) is reducible to linear factors:
F(D, D′) = ∏_{i=1}^n (a_i D + b_i D′ + c_i)^{n_i}
and φ_{ij}(t), i = 1, 2, . . . , n and j = 1, 2, . . . , n_i, are arbitrary differentiable functions.
1. If a_i ≠ 0 for all i = 1, 2, . . . , n, then the CF is given by
u(x, y) = ∑_{i=1}^n e^{−(c_i/a_i) x} ∑_{j=1}^{n_i} x^{j−1} φ_{ij}(b_i x − a_i y)
2. If some of the a_i’s are zero but the corresponding b_i’s are not, then the appropriate factors from (2) of Theorem 136 are used.
Since
F(D, D′) e^{ax+by} = F(a, b) e^{ax+by}
and
F(D, D′)(e^{ax+by} φ(x, y)) = e^{ax+by} F(D + a, D′ + b) φ(x, y)
where φ is differentiable with respect to x and y to the orders of D and D′ respectively, we have, for the factors of (6.6) which are not reducible to linear factors:
Theorem 138. e^{ax+by} is a solution of (6.6) if F(a, b) = 0. Hence, a general solution is given by
u(x, y) = ∑_{j=1}^n c_j e^{a_j x + b_j y}
where a_j, b_j, c_j are constants, F(a_j, b_j) = 0 for all j and n ∈ N ∪ {∞}. If the series is infinite, then it is a solution if it is uniformly convergent.
When the non-homogeneous term has the form
f(x, y) = sin(ax + by)
and F is of the form F(D², DD′, D′²), then one can use the relation
F(D², DD′, D′²) sin(ax + by) = F(−a², −ab, −b²) sin(ax + by)
An analogous result is true for cos(ax + by). In general, the trigonometric function f(x, y) can be expressed as a sum of complex exponentials and the theorems above applied. Alternatively, one may assume the solution (of a second-order equation, for example, with f(x, y) = sin(ax + by)) in the form
u(x, y) = α sin(ax + by) + β cos(ax + by)
and determine the unknown constants α and β by substituting the assumed solution in the given equation.
Finding a particular integral of (6.5)
Interpreting D^{−1} and D′^{−1} to mean integration with respect to x and y respectively, and writing (6.5) as
u = (1/F(D, D′)) f(x, y)
it may be convenient in some cases to expand 1/F(D, D′) by the binomial theorem and perform the integrations on f(x, y).
6.4 Some special linear PDEs
6.4.1 The one-dimensional wave equation
Let u := u(x, t) and f(x, t) be functions and c > 0 a constant. The PDE
∂²u/∂t² − c² ∂²u/∂x² = f(x, t)
is called the one-dimensional non-homogeneous wave equation. If f(x, t) ≡ 0, it is said to be homogeneous and is usually written as
u_tt := ∂²u/∂t² = c² ∂²u/∂x² =: c² u_xx   (6.7)
The Cauchy problem for an infinite string
This is the homogeneous wave equation (6.7) with “initial conditions” as follows. Assume that U = R × [0, ∞) (the second factor interpreted as time) and f, g are certain prescribed functions.
u_tt = c² u_xx
u(x, 0) = f(x), u_t(x, 0) = g(x)   (initial conditions at t = 0)
The solution, called D’Alembert’s solution, is given by
u(x, t) = ½ [f(x + ct) + f(x − ct)] + (1/2c) ∫_{x−ct}^{x+ct} g(s) ds
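D'Alembert's formula can be checked numerically. Below, c = 1, f(x) = e^{−x²} and g ≡ 0 are test data chosen for illustration; u should satisfy u_tt = u_xx and the initial condition u(x, 0) = f(x):

```python
import math

c = 1.0
f = lambda x: math.exp(-x * x)        # assumed initial displacement
# with g ≡ 0 the integral term vanishes, leaving two travelling waves
u = lambda x, t: 0.5 * (f(x + c * t) + f(x - c * t))

def wave_residual(x, t, h=1e-4):
    # u_tt - c^2 u_xx via central differences
    utt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
    uxx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    return utt - c * c * uxx

assert abs(u(0.7, 0.0) - f(0.7)) < 1e-12                 # u(x, 0) = f(x)
assert all(abs(wave_residual(x, t)) < 1e-5 for x, t in [(0.0, 0.5), (1.0, 1.0)])
```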
Semi-infinite string with a fixed end
This is an initial-cum-boundary-value problem. As-
sume that U = R× (0,∞], f is C2, g is C1 and
f(0) = f ′′(0) = g(0) = 0
utt = c2uxx
u(x, 0) = f(x) x ≥ 0ut(x, 0) = g(x) x ≥ 0u(0, t) = 0 t ≥ 0
The solution is given by
\[
u(x, t) = \frac{1}{2}\,[f(x + ct) + f(x - ct)] + \frac{1}{2c} \int_{x - ct}^{x + ct} g(s)\, ds \qquad (x > ct) \tag{6.8}
\]
\[
u(x, t) = \frac{1}{2}\,[f(ct + x) - f(ct - x)] + \frac{1}{2c} \int_{ct - x}^{ct + x} g(s)\, ds \qquad (x < ct)
\]
Semi-infinite string with a free end
This is another initial-cum-boundary-value problem. Assume that the free end is at x = 0, U = (0, ∞) × (0, ∞), f is $C^2$, g is $C^1$ and
\[
f'(0) = g'(0) = 0
\]
\[
u_{tt} = c^2 u_{xx}
\]
\[
u(x, 0) = f(x), \quad u_t(x, 0) = g(x) \qquad (x \ge 0)
\]
\[
u_x(0, t) = 0 \qquad (t \ge 0)
\]
The solution is given by
\[
u(x, t) = \frac{1}{2}\,[f(x + ct) + f(x - ct)] + \frac{1}{2c} \int_{x - ct}^{x + ct} g(s)\, ds \qquad (x > ct)
\]
\[
u(x, t) = \frac{1}{2}\,[f(ct + x) + f(ct - x)] + \frac{1}{2c} \left[ \int_0^{ct + x} g(s)\, ds + \int_0^{ct - x} g(s)\, ds \right] \qquad (x < ct)
\]
(For the free end the even extension of the data is used, so the two f-terms add.)
Nonhomogeneous boundary conditions
Let the following equation hold on the domain U = (0, ∞) × (0, ∞), f be $C^2$, g be $C^1$, p be $C^2$ and
\[
p(0) = f(0); \quad p'(0) = g(0); \quad p''(0) = c^2 f''(0)
\]
\[
u_{tt} = c^2 u_{xx}
\]
\[
u(x, 0) = f(x), \quad u_t(x, 0) = g(x) \qquad (x \ge 0)
\]
\[
u(0, t) = p(t) \qquad (t \ge 0)
\]
The solution is given by
\[
u(x, t) = p\!\left(t - \frac{x}{c}\right) + \phi(x + ct) - \psi(ct - x) \qquad (0 \le x < ct)
\]
where
\[
\phi(\xi) = \frac{1}{2} f(\xi) + \frac{1}{2c} \int_0^{\xi} g(s)\, ds, \qquad
\psi(\eta) = \frac{1}{2} f(\eta) + \frac{1}{2c} \int_0^{\eta} g(s)\, ds
\]
The solution for x > ct is given by (6.8).
The vibrating string. Separation of variables
Consider a string of length l stretched along the x-
axis from 0 to l. Assume that the string is under
constant tension τ and has density ρ. The PDE
on the domain U = (0, l) × (0,∞) describing the
vibration of the string with prescribed initial and
boundary conditions is the following.
\[
u_{tt} = c^2 u_{xx}
\]
\[
u(x, 0) = f(x), \quad u_t(x, 0) = g(x) \qquad (0 \le x \le l)
\]
\[
u(0, t) = 0, \quad u(l, t) = 0 \qquad (t \ge 0)
\]
Let f and g be representable by Fourier sine series. Assuming a solution of the form
\[
u(x, t) = X(x)\,T(t) \ne 0
\]
we obtain
\[
u(x, t) = \sum_{n=1}^{\infty} \left[ a_n \cos\frac{n\pi c t}{l} + b_n \sin\frac{n\pi c t}{l} \right] \sin\frac{n\pi x}{l}
\]
where
\[
a_n = \frac{2}{l} \int_0^l f(x) \sin\frac{n\pi x}{l}\, dx, \qquad
b_n = \frac{2}{n\pi c} \int_0^l g(x) \sin\frac{n\pi x}{l}\, dx
\]
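The coefficient integrals above are easy to check numerically. The sketch below (an illustration, not part of the text) computes the sine coefficients for the single-mode initial shape f(x) = sin(πx/l), for which a₁ = 1 and every other aₙ vanishes; the trapezoidal rule and the value l = 3 are choices of this example.

```python
import math

def sine_coeff(f, l, n, m=4000):
    """Approximate a_n = (2/l) ∫_0^l f(x) sin(nπx/l) dx by the trapezoidal rule.
    Endpoint terms vanish because sin(0) = sin(nπ) = 0."""
    h = l / m
    s = sum(f(k * h) * math.sin(n * math.pi * k * h / l) for k in range(1, m))
    return 2.0 / l * h * s

l = 3.0
f = lambda x: math.sin(math.pi * x / l)   # one pure mode as initial shape
a1 = sine_coeff(f, l, 1)                  # the mode is reproduced: close to 1
a2 = sine_coeff(f, l, 2)                  # no other mode is excited: close to 0
```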
6.4.2 The two-dimensional wave equation
Vibrating rectangular membrane
The situation is modelled by the following equation
on the domain U = (0, a) × (0, b) × (0,∞) with
associated initial and boundary conditions. Here
the function is u = u(x, y, t) and the membrane
has length a and width b.
\[
u_{tt} = c^2 (u_{xx} + u_{yy})
\]
\[
u(x, y, 0) = f(x, y), \quad u_t(x, y, 0) = g(x, y) \qquad (0 \le x \le a,\ 0 \le y \le b)
\]
\[
u(0, y, t) = 0; \quad u(a, y, t) = 0; \quad u(x, 0, t) = 0; \quad u(x, b, t) = 0
\]
The solution by separation of variables is given by
\[
u(x, y, t) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} (a_{mn} \cos\theta_{mn} c t + b_{mn} \sin\theta_{mn} c t) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}
\]
where
\[
\theta_{mn} = \pi \sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}
\]
and
\[
a_{mn} = \frac{4}{ab} \int_0^a \!\! \int_0^b f(x, y) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}\, dx\, dy
\]
\[
b_{mn} = \frac{4}{\theta_{mn} abc} \int_0^a \!\! \int_0^b g(x, y) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}\, dx\, dy
\]
6.4.3 The three-dimensional wave equation
The equation on U = (0, a)× (0, b)× (0, d)× (0,∞)
with initial and boundary conditions has the form
\[
u_{tt} = c^2 \nabla^2 u = c^2 (u_{xx} + u_{yy} + u_{zz})
\]
\[
u(x, y, z, 0) = f(x, y, z), \quad u_t(x, y, z, 0) = g(x, y, z) \quad \text{on } [0, a] \times [0, b] \times [0, d]
\]
\[
u(0, y, z, t) = u(a, y, z, t) = 0; \quad u(x, 0, z, t) = u(x, b, z, t) = 0; \quad u(x, y, 0, t) = u(x, y, d, t) = 0
\]
Assuming a solution of the form
\[
u(x, y, z, t) = U(x, y, z)\,T(t)
\]
we have
\[
u(x, y, z, t) = \sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} (a_{lmn} \cos\sqrt{\lambda}\, c t + b_{lmn} \sin\sqrt{\lambda}\, c t) \sin\frac{l\pi x}{a} \sin\frac{m\pi y}{b} \sin\frac{n\pi z}{d}
\]
where
\[
\lambda = \left( \frac{l^2}{a^2} + \frac{m^2}{b^2} + \frac{n^2}{d^2} \right) \pi^2
\]
and
\[
a_{lmn} = \frac{8}{abd} \int_0^a \!\! \int_0^b \!\! \int_0^d f(x, y, z) \sin\frac{l\pi x}{a} \sin\frac{m\pi y}{b} \sin\frac{n\pi z}{d}\, dx\, dy\, dz
\]
\[
b_{lmn} = \frac{8}{\sqrt{\lambda}\, abcd} \int_0^a \!\! \int_0^b \!\! \int_0^d g(x, y, z) \sin\frac{l\pi x}{a} \sin\frac{m\pi y}{b} \sin\frac{n\pi z}{d}\, dx\, dy\, dz
\]
6.4.4 Two-dimensional Laplace equation in a rectangle
This is the equation on the domain U = (0, a) × (0, b):
\[
\nabla^2 u = u_{xx} + u_{yy} = 0
\]
\[
u(x, 0) = f(x); \quad u(x, b) = g(x) \qquad (0 < x < a)
\]
\[
u(0, y) = 0; \quad u(a, y) = 0 \qquad (0 < y < b)
\]
The solution is given by
\[
u(x, y) = \sum_{n=1}^{\infty} \left( a_n \cosh\frac{n\pi y}{a} + b_n \sinh\frac{n\pi y}{a} \right) \sin\frac{n\pi x}{a}
\]
where for all n ∈ N
\[
a_n = \frac{2}{a} \int_0^a f(s) \sin\frac{n\pi s}{a}\, ds
\]
\[
b_n = \frac{1}{\sinh\frac{n\pi b}{a}} \left[ \frac{2}{a} \int_0^a g(s) \sin\frac{n\pi s}{a}\, ds - \left( \cosh\frac{n\pi b}{a} \right) \frac{2}{a} \int_0^a f(s) \sin\frac{n\pi s}{a}\, ds \right]
\]
6.4.5 Two-dimensional Laplace equation in a circle with Dirichlet conditions
Transforming the coordinate system to polar (r, θ), Laplace's equation becomes
\[
u_{rr} + \frac{1}{r} u_r + \frac{1}{r^2} u_{\theta\theta} = 0
\]
On the domain U = (0, a) × (−π, π] and subject to the boundary conditions
\[
u(a, \theta) = f(\theta) \qquad (-\pi < \theta \le \pi)
\]
\[
|u(r, \theta)| < \infty; \quad u(r, \pi) = u(r, -\pi), \quad u_\theta(r, \pi) = u_\theta(r, -\pi) \qquad (0 < r < a)
\]
f(θ) and u(r, θ) are assumed to be periodic in θ with period 2π. The solution is
\[
u(r, \theta) = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} r^n (a_n \cos n\theta + b_n \sin n\theta)
\]
where
\[
a_n = \frac{1}{\pi a^n} \int_{-\pi}^{\pi} f(s) \cos ns\, ds \quad (n \ge 0), \qquad
b_n = \frac{1}{\pi a^n} \int_{-\pi}^{\pi} f(s) \sin ns\, ds \quad (n \ge 1)
\]
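The coefficients above can be verified numerically. The following illustrative Python sketch (not part of the compendium) takes the boundary data f(θ) = cos θ on a circle of radius a = 2, for which a₁ = 1/a and all other coefficients vanish, and checks that the series reproduces the boundary data at r = a; the trapezoidal rule and the numbers are this example's assumptions.

```python
import math

def disc_coeffs(f, a, n, m=4000):
    """a_n, b_n of the disc solution, by the trapezoidal rule over one
    full period [-π, π] (the endpoints are identified, so k runs 0..m-1)."""
    h = 2 * math.pi / m
    an = sum(f(-math.pi + k * h) * math.cos(n * (-math.pi + k * h)) for k in range(m)) * h
    bn = sum(f(-math.pi + k * h) * math.sin(n * (-math.pi + k * h)) for k in range(m)) * h
    return an / (math.pi * a ** n), bn / (math.pi * a ** n)

a = 2.0
f = lambda t: math.cos(t)                 # boundary data u(a, θ) = cos θ
a1, b1 = disc_coeffs(f, a, 1)             # a1 ≈ 1/a = 0.5, b1 ≈ 0
theta = 0.3
# first term of the series, evaluated back on the boundary r = a
u_boundary = a ** 1 * (a1 * math.cos(theta) + b1 * math.sin(theta))
```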
6.4.6 Laplace's equation in three dimensions
This is the equation
\[
\nabla^2 u = u_{xx} + u_{yy} + u_{zz} = 0 \tag{6.9}
\]
When the values of u are prescribed on the boundary of the domain, it is termed a Dirichlet problem.
Laplace's equation in a box with Dirichlet conditions
Consider the following equation on the domain U =
(0, a)× (0, b)× (0, c).
\[
\nabla^2 u = 0
\]
\[
u(0, y, z) = u(a, y, z) = 0 \quad \text{on } (0, b) \times (0, c)
\]
\[
u(x, 0, z) = u(x, b, z) = 0 \quad \text{on } (0, a) \times (0, c)
\]
\[
u(x, y, c) = 0; \quad u(x, y, 0) = f(x, y) \quad \text{on } (0, a) \times (0, b)
\]
Assuming a solution of the form u(x, y, z) = X(x)Y(y)Z(z), we obtain the solution as
\[
u(x, y, z) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} \sinh\lambda_{mn}(c - z) \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}
\]
\[
\lambda_{mn} := \pi \sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}
\]
\[
a_{mn} = \frac{4}{ab \sinh(c\lambda_{mn})} \int_0^a \!\! \int_0^b f(s, t) \sin\frac{m\pi s}{a} \sin\frac{n\pi t}{b}\, ds\, dt
\]
Laplace's equation in a sphere with Dirichlet conditions
In spherical polar coordinates (r, θ, φ), 0 ≤ r ≤ R, 0 ≤ θ ≤ π and 0 ≤ φ ≤ 2π, Laplace's equation takes the form
\[
\frac{\partial}{\partial r}\!\left( r^2 \frac{\partial u}{\partial r} \right) + \frac{1}{\sin\theta} \frac{\partial}{\partial\theta}\!\left( \sin\theta\, \frac{\partial u}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2 u}{\partial\phi^2} = 0
\]
Subject to the boundary condition u(R, θ, φ) = f(θ, φ) on the surface, the solution is
\[
u(r, \theta, \phi) = \sum_{n=0}^{\infty} (r/R)^n Z_n(\theta, \phi) \qquad (r < R)
\]
\[
Z_n(\theta, \phi) = \sum_{k=0}^{n} (a_{nk} \cos k\phi + b_{nk} \sin k\phi) P_n^k(\cos\theta)
\]
where
\[
P_n^k(x) = (1 - x^2)^{k/2} \frac{d^k}{dx^k} P_n(x)
\]
is the Legendre function associated with the Legendre polynomial $P_n(x)$ of degree n, and
\[
a_{00} = \frac{1}{4\pi} \int_0^{2\pi} \!\! \int_0^{\pi} f(\theta, \phi) \sin\theta\, d\theta\, d\phi
\]
\[
a_{nk} = \frac{(2n + 1)(n - k)!}{2\pi (n + k)!} \int_0^{2\pi} \!\! \int_0^{\pi} f(\theta, \phi)\, P_n^k(\cos\theta) \cos k\phi \sin\theta\, d\theta\, d\phi \qquad (n > 0)
\]
\[
b_{nk} = \frac{(2n + 1)(n - k)!}{2\pi (n + k)!} \int_0^{2\pi} \!\! \int_0^{\pi} f(\theta, \phi)\, P_n^k(\cos\theta) \sin k\phi \sin\theta\, d\theta\, d\phi
\]
6.4.7 One-dimensional heat equation
This is the following equation defined on the domain U = (0, a) × (0, ∞).
\[
u_t = c^2 u_{xx} \qquad (c > 0)
\]
\[
u(x, 0) = f(x) \qquad (0 < x < a)
\]
\[
u(0, t) = 0 = u(a, t) \qquad (t > 0)
\]
The formal solution is given by
\[
u(x, t) = \sum_{n=1}^{\infty} c_n e^{-(n^2\pi^2 c^2/a^2)t} \sin\frac{n\pi x}{a}
\]
\[
c_n = \frac{2}{a} \int_0^a f(x) \sin\frac{n\pi x}{a}\, dx \qquad (n \in \mathrm{N})
\]
6.4.8 Two-dimensional heat equation
Heat equation on a strip in Cartesian coordinates
This equation with its initial and boundary conditions defined on U = (0, a) × (0, b) × (0, ∞) is
\[
u_t = c^2 (u_{xx} + u_{yy}) \qquad (c > 0)
\]
\[
u(x, y, 0) = f(x, y) \quad \text{on } (0, a) \times (0, b)
\]
\[
u(x, 0, t) = 0 = u(x, b, t) \quad \text{on } (0, a) \times (0, \infty)
\]
\[
u(0, y, t) = 0 = u(a, y, t) \quad \text{on } (0, b) \times (0, \infty)
\]
It has a solution by separation of variables given by
\[
u(x, y, t) = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} \sin\frac{m\pi x}{a} \sin\frac{n\pi y}{b}\, e^{-\lambda_{mn}^2 c^2 t}
\]
where
\[
\lambda_{mn}^2 = \frac{m^2\pi^2}{a^2} + \frac{n^2\pi^2}{b^2}
\]
and
\[
a_{mn} = \frac{4}{ab} \int_0^a \!\! \int_0^b f(s, t) \sin\frac{m\pi s}{a} \sin\frac{n\pi t}{b}\, ds\, dt
\]
Heat equation in a sphere
In spherical polar coordinates the equation is
\[
u_t = c^2 \left( u_{rr} + \frac{2}{r} u_r \right)
\]
Subject to the initial and boundary conditions
\[
u(r, 0) = f(r); \qquad u(0, t) = 0, \quad u(R, t) = 0 \quad (t > 0)
\]
the solution is
\[
u(r, t) = \sum_{n=1}^{\infty} a_n e^{-(n\pi c/R)^2 t}\, \frac{1}{r} \sin\frac{n\pi r}{R} \qquad (r > 0)
\]
where
\[
a_n = \frac{2}{R} \int_0^R r f(r) \sin\frac{n\pi r}{R}\, dr
\]
Heat equation in a cylinder in cylindrical polar coordinates
The equation
\[
u_t = c^2\, \frac{1}{r} \frac{\partial}{\partial r}\!\left( r \frac{\partial u}{\partial r} \right)
\]
subject to the initial and boundary conditions
\[
u(r, 0) = 0 \quad (0 \le r < R); \qquad u(R, t) = U, \text{ a constant} \quad (t > 0)
\]
has the solution
\[
u(r, t) = U \left[ 1 - 2 \sum_{n=1}^{\infty} e^{-\mu_n^2 c^2 t / R^2}\, \frac{J_0(\mu_n r / R)}{\mu_n J_1(\mu_n)} \right]
\]
where the $\mu_n$'s are the positive roots of the Bessel function equation $J_0(\mu) = 0$.
Chapter 7
Complex Variables
7.1 Preliminaries
1. In many circumstances, a complex number z = x + i y can be regarded as the point (x, y) of R². The real part x of z is denoted Re z and the imaginary part y is denoted Im z.
2. An (open) ball of radius r centred on z0 is the set B(z0, r) := {z ∈ C : |z − z0| < r}, where |·| is the complex modulus or absolute value. A set U ⊂ C is said to be open if for each z ∈ U there is an open ball B(z, r) ⊂ U, r depending on z.
3. A function f : A ⊂ C −→ C can be de-
composed into its real and imaginary parts:
f(z) = f(x, y) = u(x, y) + i v(x, y), where
u, v : A ⊂ R2 −→ R.
4. Let f : A ⊂ C −→ C be a function and z0 ∈ A be such that B(z0, r) ⊂ A for some ball centred on z0 (i.e. z0 is an interior point of A). If there is l ∈ C, |l| < ∞, such that given any ε > 0, |f(z) − l| < ε for all 0 < |z − z0| < δ for some δ = δ(ε) > 0, then we say that lim_{z→z0} f(z) = l. If no such l exists, then we say that the limit does not exist.
Theorem 139.
\[
\lim_{z \to z_0} f(z) = l \iff \lim_{z \to z_0} u(x, y) = \operatorname{Re} l \ \text{ and } \ \lim_{z \to z_0} v(x, y) = \operatorname{Im} l
\]
5. Let f : A ⊂ C −→ C be a function on an unbounded set A, i.e. there is no ball B(0, r) ⊃ A. If there is l ∈ C, |l| < ∞, such that given any ε > 0, |f(z) − l| < ε for all |z| > r for some r > 0, then we say that lim_{z→∞} f(z) = l.
Theorem 140.
\[
\lim_{z \to \infty} f(z) = l \iff \lim_{z \to \infty} u(x, y) = \operatorname{Re} l \ \text{ and } \ \lim_{z \to \infty} v(x, y) = \operatorname{Im} l
\]
6. Let f : A ⊂ C −→ C and z0 ∈ A be an in-
terior point. f is continuous at z0 if given
any ε > 0 there exists δ = δ(ε) such that
|f(z) − f(z0)| < ε whenever |z − z0| < δ. If
f is continuous at every point of its domain, it
is said to be continuous. If f is not continuous
at z0, it is said to be discontinuous at z0. f is
discontinuous if it is discontinuous at at least
one point in its domain.
Theorem 141. f is continuous at z0 = x0 +
i y0 iff u(x, y) and v(x, y) are continuous at
(x0, y0).
7. f : A ⊂ R −→ C is said to be differentiable at an interior point t0 ∈ A if
\[
f'(t_0) := \lim_{t \to t_0} \frac{f(t) - f(t_0)}{t - t_0}
\]
exists. f′(t0) is called the derivative of f at t0. f′(t0) exists iff u′(t0) and v′(t0) exist, and f′(t0) = u′(t0) + i v′(t0).
8. f : A ⊂ C −→ C is differentiable at an interior point z0 ∈ A if
\[
f'(z_0) := \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}
\]
exists. f′(z0) is called the derivative of f at z0. If f is differentiable at every point of C, it is said to be entire.
7.2 Linear Fractional Transformations
The map
\[
T : \mathrm{C} \setminus \{-d/c\} \longrightarrow \mathrm{C} \tag{7.1}
\]
\[
z \mapsto \frac{az + b}{cz + d} \tag{7.2}
\]
a, b, c, d ∈ C, c ≠ 0, is called a linear fractional transformation or a Möbius transformation. Sometimes the latter term is reserved for the situation ad − bc ≠ 0. Another name is bilinear transformation. When c = 0, it is called a linear or affine linear transformation.
Definition 142. T(z) = z + b is called a translation, T(z) = az (a ≠ 0) is a dilation and T(z) = $e^{i\theta}$z is a rotation. T(z) = 1/z is an inversion.
If ∞ denotes the usual point at infinity of the ex-
tended complex plane, we extend the domain of T
by writing T (∞) = ∞ if c = 0, and T (∞) = a/c
and T (−d/c) =∞ if c 6= 0.
Theorem 143.
1. When the domain of T is extended as above, the map T : C ∪ {∞} −→ C ∪ {∞} is bijective (one-to-one and onto).
2. A linear fractional transformation always maps circles and lines to circles or lines.
3. Given distinct points z1, z2, z3 ∈ C and distinct points w1, w2, w3 ∈ C, there exists a linear fractional transformation mapping zk to wk (k = 1, 2, 3). This transformation w = T(z) is given by the equation
\[
\frac{(w - w_1)(w_2 - w_3)}{(w - w_3)(w_2 - w_1)} = \frac{(z - z_1)(z_2 - z_3)}{(z - z_3)(z_2 - z_1)}
\]
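The cross-ratio equation can be solved for w to give the transformation explicitly. The Python sketch below does this for sample points chosen for this example; the data 0, 1, 2 ↦ i, 2i, 3i happen to be matched by the affine map w = (z + 1)i, which the construction recovers.

```python
def mobius_through(z1, z2, z3, w1, w2, w3):
    """Solve the cross-ratio equation
    (w-w1)(w2-w3)/((w-w3)(w2-w1)) = (z-z1)(z2-z3)/((z-z3)(z2-z1))  for w."""
    def T(z):
        R = (z - z1) * (z2 - z3) / ((z - z3) * (z2 - z1))
        A, B = (w2 - w3), R * (w2 - w1)
        return (w1 * A - w3 * B) / (A - B)
    return T

# example points: T should agree with w = (z + 1)i wherever it is defined
T = mobius_through(0, 1, 2, 1j, 2j, 3j)
```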
7.3 Elementary Functions
Definition 144.
Argument Let z = x + i y ≠ 0 have polar form $re^{i\theta}$. The multivalued function
\[
\arg z = \theta + 2n\pi \qquad n = 0, \pm 1, \pm 2, \ldots
\]
is called the argument of z.
Principal value The well-defined function denoted Arg z is called the principal value of arg z and is defined to be the unique value of arg z such that −π < Arg z ≤ π.
Theorem 145. Given any z, w ∈ C,
arg(zw) = arg(z) + arg(w)
in the sense that any value of arg z plus any value
of argw is a value of arg zw. Conversely, any value
of arg(zw) is a sum of a value of arg(z) and a value
of arg(w).
If arg z is replaced by Arg z throughout, then the above equality need not hold.
Definition 146. The exponential function, denoted $e^z$ or exp z, is defined to be
\[
e^z := e^x(\cos y + i \sin y)
\]
where z = x + i y.
Theorem 147. (Properties of the exponential) For any z, w ∈ C
1. $e^{z+w} = e^z e^w$.
2. $e^{z + 2\pi i} = e^z$. This is called periodicity. The exponential function is periodic with period 2πi.
3. $|e^z| = e^x$ and $\arg(e^z) = y + 2n\pi$, n = 0, ±1, ±2, . . . .
4. The range of $e^z$ is C \ {0}.
5. $\frac{d}{dz} e^z = e^z$.
6. $e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}$ with disc of convergence C (see section 7.6). Thus $e^z$ is entire (see section 7.1).
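Properties 1-3 are easy to confirm numerically with Python's `cmath` module; the sample points z and w below are arbitrary choices for this illustration.

```python
import cmath, math

z, w = 0.3 + 1.2j, -0.7 + 0.4j
law = cmath.exp(z + w) - cmath.exp(z) * cmath.exp(w)   # property 1: should be ~0
period = cmath.exp(z + 2j * math.pi) - cmath.exp(z)    # property 2: should be ~0
modulus = abs(cmath.exp(z)) - math.exp(z.real)         # property 3: should be ~0
```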
Definition 148. The trigonometric functions sin z, cos z and tan z are defined by
\[
\sin z = \frac{e^{iz} - e^{-iz}}{2i}, \qquad \cos z = \frac{e^{iz} + e^{-iz}}{2}, \qquad \tan z = \frac{\sin z}{\cos z}
\]
The associated functions sec, csc, cot are defined as in the real case.
Theorem 149. (Properties of the trigonometric
functions)
1. The basic algebraic formulas of real trigono-
metric functions such as those involving the
sums of angles and integral multiples of angles
are also true in the complex case.
2. They are periodic with period 2π (tan z has the smaller period π).
3. $\overline{\sin z} = \sin\bar{z}$, $\overline{\cos z} = \cos\bar{z}$ and $\overline{\tan z} = \tan\bar{z}$.
4. sin z, cos z and tan z are unbounded.
5. The following series expansions are valid for all z ∈ C:
\[
\sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!}, \qquad
\cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}
\]
Thus, sin z and cos z are entire functions.
6. $\frac{d}{dz}\sin z = \cos z$, $\frac{d}{dz}\cos z = -\sin z$ and $\frac{d}{dz}\tan z = \sec^2 z$.
7.
\[
\sin z = \sin x \cosh y + i \cos x \sinh y
\]
\[
\cos z = \cos x \cosh y - i \sin x \sinh y
\]
\[
\sin(iy) = i \sinh y \quad \& \quad \cos(iy) = \cosh y
\]
\[
|\sin z|^2 = \sin^2 x + \sinh^2 y
\]
\[
|\cos z|^2 = \cos^2 x + \sinh^2 y
\]
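The identities of item 7 can be checked numerically with `cmath`; the point x = 0.8, y = −0.5 is an arbitrary choice for this illustration.

```python
import cmath, math

x, y = 0.8, -0.5
z = complex(x, y)
s = cmath.sin(z)
# item 7: sin z = sin x cosh y + i cos x sinh y
expected = complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))
# item 7: |sin z|^2 = sin^2 x + sinh^2 y
modulus_sq = abs(s) ** 2
```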
Definition 150. The complex hyperbolic functions sinh z and cosh z are defined as follows:
\[
\sinh z = \frac{e^z - e^{-z}}{2}, \qquad \cosh z = \frac{e^z + e^{-z}}{2}
\]
The associated hyperbolic functions tanh, coth etc. are defined by analogy with the real case.
Theorem 151. (Properties of hyperbolic functions)
1. −i sinh(iz) = sin z and cosh(iz) = cos z.
2. The basic algebraic identities in the real case are also true in the complex case.
3. $|\sinh z|^2 = \sinh^2 x + \sin^2 y$ and $|\cosh z|^2 = \sinh^2 x + \cos^2 y$.
4. $\frac{d}{dz}\sinh z = \cosh z$ and $\frac{d}{dz}\cosh z = \sinh z$. Both sinh z and cosh z are entire.
Definition 152.
Logarithm The (complex) logarithm is a multivalued function log : C \ {0} −→ C given by
\[
\log z = \log|z| + i \arg z = \log|z| + i(\operatorname{Arg} z + 2n\pi)
\]
for all n = 0, ±1, ±2, . . . . If z = $re^{i\theta}$ with r > 0 and −π < θ ≤ π, then
\[
\log z = \log r + i(\theta + 2n\pi)
\]
Branch of the logarithm Let α ∈ R be arbitrary but fixed. The function
\[
\log z = \log r + i\theta
\]
is a branch of the logarithm at all points z = $re^{i\theta}$ with (r, θ) ∈ (0, ∞) × (α, α + 2π).
Principal branch The principal branch of the logarithm on C \ (−∞, 0] is the function defined by
\[
\log z = \log|z| + i \operatorname{Arg} z
\]
It is sometimes denoted Log z.
Theorem 153. (Properties of the logarithm)
1. $e^{\log z} = z$ for all z ≠ 0.
2. $\log e^z = z + 2n\pi i$ (n = 0, ±1, . . . ), and $\operatorname{Log} e^z = z$ provided −π < y ≤ π.
3. log(zw) = log z + log w in the sense that any value of log z plus any value of log w is a value of log(zw). Conversely, any value of log(zw) can be expressed as a sum of a value of log z and a value of log w.
4. As a special case there is the equality of functions:
\[
\operatorname{Log} |zw| = \operatorname{Log} |z| + \operatorname{Log} |w|
\]
5. $\log\left(\frac{z}{w}\right) = \log z - \log w$, w ≠ 0, under a similar interpretation as for log(zw).
Definition 154. The complex powers or exponents of a complex z ≠ 0 are defined as follows:
\[
z^w = e^{w \log z}
\]
for any w ∈ C. It is multivalued. The principal value or branch of $z^w$ is obtained by replacing log by Log in the definition and is thus valid in −π < Arg z < π.
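The multivaluedness is visible numerically: different branches of log z give genuinely different values of $z^w$. In the sketch below (illustrative points only), Python's built-in complex `**` agrees with the principal value $e^{w \operatorname{Log} z}$ computed via `cmath.log`.

```python
import cmath, math

z, w = 1 + 1j, 0.5 - 0.25j
principal = cmath.exp(w * cmath.log(z))   # e^{w Log z}, the principal value
builtin = z ** w                          # Python's ** uses the principal branch
# a different value of the multivalued power, from another branch of the log
other = cmath.exp(w * (cmath.log(z) + 2j * math.pi))
```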
7.4 Analytic Functions
Let U ⊂ C be an open set. A function f : U −→ C is said to be analytic or holomorphic at a point z0 ∈ U if f is differentiable everywhere in some ball B(z0, r) ⊂ U centred on z0, i.e.
\[
f'(z) = \lim_{w \to z} \frac{f(w) - f(z)}{w - z}
\]
exists for all z ∈ B(z0, r). It is analytic or holomorphic if it is analytic at every point of its domain. An analytic function whose domain is C is thus entire.
Theorem 155. The sum and product of analytic
functions defined on a common domain are again
analytic on that domain. The quotient of two ana-
lytic functions is analytic if the function in the de-
nominator is nonzero on its domain.
The Cauchy-Riemann equations Let f : U ⊂ C −→ C, f(z) = u(x, y) + i v(x, y) and z0 = x0 + i y0 ∈ U. The partial differential equations
\[
u_x(x_0, y_0) := \frac{\partial u}{\partial x}(x_0, y_0) = \frac{\partial v}{\partial y}(x_0, y_0) =: v_y(x_0, y_0) \tag{7.3}
\]
\[
u_y(x_0, y_0) := \frac{\partial u}{\partial y}(x_0, y_0) = -\frac{\partial v}{\partial x}(x_0, y_0) =: -v_x(x_0, y_0)
\]
are called the Cauchy-Riemann (CR) equations satisfied by the functions u, v at (x0, y0). The CR equations can be written in the alternative form
\[
\frac{\partial f}{\partial x}(x_0, y_0) = -i\, \frac{\partial f}{\partial y}(x_0, y_0)
\]
The CR equations & differentiability
Theorem 156. (Necessity) Suppose f : A ⊂ C −→ C, f = u + iv, is differentiable at z0 = x0 + i y0 ∈ A. Then the first order partial derivatives of u and v exist at (x0, y0) and satisfy the CR equations (7.3). Moreover,
\[
f'(z_0) = u_x(x_0, y_0) + i\, v_x(x_0, y_0) = v_y(x_0, y_0) - i\, u_y(x_0, y_0) \tag{7.4}
\]
Theorem 157. (Sufficiency) Suppose f :
A ⊂ C −→ C and let z0 = x0 + i y0 ∈ A be
an interior point. Suppose that the partial de-
rivatives occurring in the CR equations (7.3)
exist in some ball B(z0, r) ⊂ A and are con-
tinuous there. If they satisfy the CR equations
at z0, then f ′(z0) exists and is given by (7.4).
These theorems have obvious extensions to
analytic functions.
The CR equations in polar coordinates The substitutions x = r cos θ and y = r sin θ in (7.3) give
\[
u_r = \frac{1}{r} v_\theta, \qquad \frac{1}{r} u_\theta = -v_r
\]
Here, $u_r := \frac{\partial u}{\partial r}$, $u_\theta := \frac{\partial u}{\partial \theta}$ etc.
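The Cartesian CR equations (7.3) can be checked numerically for a concrete analytic function. The sketch below uses f(z) = z², so u = x² − y² and v = 2xy, with central differences at an arbitrarily chosen point; all parameter values are this example's assumptions.

```python
def parts(x, y):
    """u and v for f(z) = z^2, i.e. u = x^2 - y^2 and v = 2xy."""
    z2 = complex(x, y) ** 2
    return z2.real, z2.imag

x0, y0, h = 0.6, -1.1, 1e-6
ux = (parts(x0 + h, y0)[0] - parts(x0 - h, y0)[0]) / (2 * h)
uy = (parts(x0, y0 + h)[0] - parts(x0, y0 - h)[0]) / (2 * h)
vx = (parts(x0 + h, y0)[1] - parts(x0 - h, y0)[1]) / (2 * h)
vy = (parts(x0, y0 + h)[1] - parts(x0, y0 - h)[1]) / (2 * h)
# CR equations: ux = vy and uy = -vx
```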
Harmonic functions A function f : U ⊂ R² −→ R is said to be harmonic in U if it has continuous second-order partial derivatives everywhere in U and satisfies the partial differential equation
\[
f_{xx} + f_{yy} = 0
\]
which is called Laplace's equation. Here, $f_{xx} = \frac{\partial^2 f}{\partial x^2}$ etc. In alternative notation, Δf = 0 (see p. 9).
Theorem 158. If f : U ⊂ C −→ C is analytic
in U , then its real and imaginary parts u(x, y)
and v(x, y) are each harmonic: ∆u = 0 and
∆v = 0.
Definition 159. If u, v : U ⊂ R2 −→ R are
harmonic in U and satisfy the CR equations in
U , then v is said to be a harmonic conjugate
of u in U .
Theorem 160. f : U ⊂ C −→ C, f(z) =
u(x, y) + i v(x, y), is analytic in U ⇐⇒ v is a
harmonic conjugate of u in U .
7.5 Complex integration
Integrals of complex functions on R If f : [a, b] −→ C, −∞ ≤ a < b ≤ ∞, is a function whose real and imaginary parts u and v are integrable (see section 2.5), then
\[
\int_a^b f(t)\, dt := \int_a^b u(t)\, dt + i \int_a^b v(t)\, dt
\]
Theorem 161. Let f : [a, b] −→ C, −∞ ≤ a < b ≤ ∞. Then
\[
\left| \int_a^b f(t)\, dt \right| \le \int_a^b |f(t)|\, dt
\]
if both integrals exist.
Contours & Line Integrals A curve or arc or path in C is a function γ : [a, b] −→ C for which it is generally assumed that the real and imaginary component functions are continuous. γ is simple if it does not intersect itself: γ(s) = γ(t) ⇒ s = t. It is simple closed or Jordan if the only self-intersection occurs at the endpoints: γ(a) = γ(b). A differentiable curve γ is one such that γ′ exists on its domain. A contour is a piecewise differentiable curve, i.e. finitely many smooth curves joined end to end. A simple closed contour has no self-intersections except at the starting point and the ending point. A simple closed contour γ : [a, b] −→ C is said to be positively oriented if an observer traversing the contour as per the parametrisation has the interior points enclosed by the contour on his left.
Definition 162. The length of a differentiable curve γ : [a, b] −→ C, γ(t) = x(t) + i y(t), is
\[
L := \int_a^b |\gamma'(t)|\, dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2}\, dt
\]
The length of a contour is the sum of the lengths of the constituent curves.
Definition 163. Let f : U ⊂ C −→ C be piecewise continuous and γ : [a, b] −→ U be a contour. Then
\[
\int_\gamma f(z)\, dz := \int_a^b f(\gamma(t))\, \gamma'(t)\, dt
= \int_a^b (u\, x' - v\, y')\, dt + i \int_a^b (v\, x' + u\, y')\, dt
= \int_\gamma u\, dx - v\, dy + i \int_\gamma v\, dx + u\, dy
\]
where u, v, x′, y′ are evaluated along γ. The last expression uses the notation of real line integrals (section 2.11).
Reversing a contour Suppose a contour is given by γ : [a, b] −→ C which is traced from γ(a) to γ(b). The contour traversed in the reverse direction is described by (−γ)(t) := γ(−t), −b ≤ t ≤ −a. The contour integral satisfies
\[
\int_{-\gamma} f(z)\, dz = -\int_\gamma f(z)\, dz
\]
Estimating a contour integral
Theorem 164. Let f : U ⊂ C −→ C be a function which is continuous on a contour γ having length L. Suppose max{|f(z)| : z ∈ γ} = M. Then
\[
\left| \int_\gamma f(z)\, dz \right| \le ML
\]
Definition 165. A set A ⊂ C is simply con-
nected if every simple closed curve in A en-
closes only points of A. Alternatively, in such
a set, every simple closed curve can be shrunk
continuously to a point.
The Cauchy-Goursat theorem
Theorem 167. (Simply connected domains) Let f : U ⊂ C −→ C be analytic on the simply connected open set U. Then
\[
\int_\gamma f(z)\, dz = 0
\]
along every simple closed contour γ in U.
Corollary 168. (Path independence) Let f be as in Theorem 167 and γ1, γ2 : [a, b] −→ C be two simple curves with the same starting point and the same ending point: z1 := γ1(a) = γ2(a) and z2 := γ1(b) = γ2(b), with no other points of intersection. Then
\[
\int_{\gamma_1} f(z)\, dz = \int_{\gamma_2} f(z)\, dz
\]
and thus the integral depends only on the endpoints z1 and z2.
Theorem 169. (Multiply connected domains) Let f : U ⊂ C −→ C be analytic on an open set U containing a multiply connected set S (p. 73) with boundary curves γ, γ1, . . . , γn. Then
\[
\int_\gamma f(z)\, dz = \sum_{k=1}^{n} \int_{\gamma_k} f(z)\, dz
\]
Definition 170. If f : U ⊂ C −→ C is con-
tinuous on the open set U and if there exists
an analytic function F : U −→ C such that
F ′(z) = f(z), then F is said to be a primitive
or antiderivative of f in U .
Theorem 171. Let the continuous function f : U −→ C (U connected) have a primitive F in U and let z1, z2 ∈ U. If γ is any contour in U joining z1 to z2, then
\[
\int_\gamma f(z)\, dz = F(z_2) - F(z_1)
\]
and the integral is thus independent of the contour joining z1 to z2.
The Cauchy integral formula Let f : U ⊂ C −→ C be analytic and γ a simple closed positively oriented contour in U whose interior is contained in U. Then for any point z0 in the interior of γ,
\[
f(z_0) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)}{w - z_0}\, dw
\]
Integral representation of derivatives
Theorem 172. Let f : U ⊂ C −→ C be analytic and γ a simple closed positively oriented contour inside U. Then
1. f has derivatives of all orders in U, which are consequently analytic in U.
2. The nth derivative of f, denoted by $f^{(n)}$, has the integral representation
\[
f^{(n)}(z) = \frac{n!}{2\pi i} \int_\gamma \frac{f(w)}{(w - z)^{n+1}}\, dw
\]
for z in the interior of γ.
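This integral representation lends itself to a numerical check: discretising the contour integral over a circle recovers the derivative. The sketch below (an illustration, with routine and parameters chosen for this example) recovers the second derivative of $e^z$, which is again $e^z$.

```python
import cmath, math

def nth_derivative(f, z, n, r=1.0, m=2000):
    """f^{(n)}(z) = (n!/(2πi)) ∮_{|w-z|=r} f(w)/(w-z)^{n+1} dw,
    with the contour integral approximated by the trapezoidal rule."""
    total = 0j
    for k in range(m):
        t = 2 * math.pi * k / m
        w = z + r * cmath.exp(1j * t)
        total += f(w) / (w - z) ** (n + 1) * 1j * r * cmath.exp(1j * t)
    total *= 2 * math.pi / m
    return math.factorial(n) / (2j * math.pi) * total

z0 = 0.3 + 0.2j
d2 = nth_derivative(cmath.exp, z0, 2)     # second derivative of e^z is e^z
```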
Morera's theorem If f : U ⊂ C −→ C is continuous on the open set U and is such that
\[
\int_\Delta f(z)\, dz = 0
\]
for every triangular contour Δ ⊂ U, then f is analytic in U.
Further properties of analytic functions
Maximum Modulus theorem Suppose f : U −→ C is analytic on the connected open set U and z0 ∈ U. Let B(z0, r) ⊂ U be some open ball. Then
\[
|f(z_0)| \le \max_\theta \{ |f(z_0 + re^{i\theta})| : 0 \le \theta \le 2\pi \}
\]
Equality holds iff f is constant. Thus, if f is nonconstant, its maximum (if it exists) can occur only on the boundary of U and never in its interior. Similarly, if f = u + i v, then u and v attain their respective maxima (if they exist) only on the boundary of U.
Corollary 173. (Minimum Modulus theorem) Under the same assumptions as above and if f(z) ≠ 0 in B(z0, r), then
\[
|f(z_0)| \ge \min_\theta \{ |f(z_0 + re^{i\theta})| : 0 \le \theta \le 2\pi \}
\]
and hence, the minimum (if it exists) can occur only on the boundary of U and never in its interior. Similarly, if f = u + i v, then u and v attain their respective minima (if they exist) only on the boundary of U.
Liouville's theorem Let f : C −→ C be analytic (such a function is termed entire) and bounded, i.e. |f(z)| ≤ C for some C > 0 and all z ∈ C. Then f is constant.
Cauchy's estimates If B(z0, r) ⊂ C is some open ball and f : B(z0, r) −→ C is analytic and bounded by M, viz. |f(z)| ≤ M for all z ∈ B(z0, r), then
\[
|f^{(n)}(z_0)| \le \frac{n!\, M}{r^n}
\]
Open Mapping theorem Given an analytic
function f : U −→ C, where U is an open
connected set, then f(U) is either open
or a point (in which case f is constant).
Thus, the image or range of a nonconstant
analytic function is an open subset of C.
Argument principle Let γ ⊂ U be a simple closed positively oriented contour and f : U −→ C be a function which is analytic at all points on and inside γ except possibly for poles in the interior of γ. Suppose f has no zeros on γ. If N and P denote the number of zeros and poles respectively of f inside γ, counted with multiplicity, then
\[
\frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\, dz = N - P
\]
Rouché's theorem Suppose f, g : U −→ C are analytic and γ ⊂ U is a simple closed contour. If |f(z)| > |g(z)| on γ, then f(z) and f(z) + g(z) have the same number of zeros inside γ, counting multiplicities. (Here z0 is a zero of multiplicity m of f if, for some positive integer m, f(z) = (z − z0)^m p(z) with p analytic and p(z0) ≠ 0, and each zero is counted as many times as its multiplicity.)
7.6 Series
An infinite sequence z1, . . . , zn, . . . in C is said to converge to the limit z if given any ε > 0, there is n0 := n0(ε) ∈ N such that n ≥ n0 ⇒ |z − zn| < ε. If a sequence does not converge, it is said to diverge.
Theorem 174. Let zn = xn + i yn (n ∈ N) and z = x + i y be in C. Then
\[
\lim_{n \to \infty} z_n = z \iff \lim_{n \to \infty} x_n = x \ \text{ and } \ \lim_{n \to \infty} y_n = y
\]
Definition 175. An infinite series $\sum_{n=1}^{\infty} z_n$ converges to z ∈ C if the sequence $s_n$ of partial sums, $s_n := \sum_{k=1}^{n} z_k$, converges to z. This is denoted by $\sum_{n=1}^{\infty} z_n = z$.
Theorem 176.
64
1. If the series $\sum_{n=1}^{\infty} z_n$ converges, then $\lim_{n\to\infty} z_n = 0$.
2. Given a convergent series $\sum_{n=1}^{\infty} z_n = z$, where zn = xn + i yn and z = x + i y, it follows that $\sum_{n=1}^{\infty} x_n = x$ and $\sum_{n=1}^{\infty} y_n = y$.
Definition 177. A series of the form
\[
\sum_{n=0}^{\infty} a_n (z - z_0)^n
\]
is called a power series. Here the partial sums are
\[
s_n(z) := \sum_{k=0}^{n} a_k (z - z_0)^k
\]
and convergence to f(z) := $\lim_{n\to\infty} s_n(z)$ is expressed by saying that given any ε > 0, there exists n0(z, ε) such that n ≥ n0(z, ε) ⇒ |sn(z) − f(z)| < ε for each z.
Definition 178. If there exists R ∈ [0, ∞] such that
\[
|z - z_0| < R \Rightarrow \text{the series converges absolutely}, \qquad
|z - z_0| > R \Rightarrow \text{the series diverges},
\]
then R is called the radius of convergence and the circle {z ∈ C : |z − z0| = R} is called the circle of convergence. The open ball B(z0, R) is called the disc of convergence.
Taylor series Let f : U ⊂ C −→ C be analytic on the open simply connected set U, z0 ∈ U and C := {z : |z − z0| = r} be a circle contained in U. Then at each point z such that |z − z0| < r (i.e. strictly within C),
\[
f(z) = f(z_0) + \frac{f'(z_0)}{1!}(z - z_0) + \frac{f''(z_0)}{2!}(z - z_0)^2 + \cdots + \frac{f^{(n)}(z_0)}{n!}(z - z_0)^n + \cdots \tag{7.5}
\]
The series (7.5) is called the Taylor series expansion of f about z0. The special case z0 = 0 is called the Maclaurin series of f.
Absolute & uniform convergence A series $\sum_{n=1}^{\infty} z_n$ is said to be absolutely convergent if $\sum_{n=1}^{\infty} |z_n|$ converges.
Theorem 179.
1. Absolute convergence implies convergence.
2. If a power series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ converges for some z such that |z − z0| = r, 0 < r < ∞, then it converges whenever |z − z0| < r, i.e. everywhere inside the circle of radius r.
3. The radius of convergence of a power series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ is given by
\[
R = \lim_{n \to \infty} \left| \frac{a_n}{a_{n+1}} \right|
\]
if the limit exists. If it does not, then it is given by $R = 1/\limsup |a_n|^{1/n}$.
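The ratio formula is easy to try numerically: for $\sum 2^n (z - z_0)^n$ the ratio is exactly 1/2, while for $\sum z^n/n!$ the ratio n + 1 grows without bound, reflecting R = ∞. The index n = 50 below is an arbitrary choice for this illustration.

```python
import math

def ratio_estimate(a, n):
    """Approximate R by |a_n / a_{n+1}| for a single (large) index n."""
    return abs(a(n) / a(n + 1))

R_geom = ratio_estimate(lambda n: 2.0 ** n, 50)                 # R = 1/2
R_exp = ratio_estimate(lambda n: 1.0 / math.factorial(n), 50)   # ratio = n + 1 = 51
```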
The series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ is said to converge uniformly to f(z) if given ε > 0, there exists n0 = n0(ε), independent of z, such that |sn(z) − f(z)| < ε whenever n ≥ n0.
Theorem 180. (The Weierstrass M test) Given a sequence of functions fn : U ⊂ C −→ C which are uniformly bounded in the sense that |fn(z)| ≤ Mn for all z ∈ U and n ∈ N, the series $\sum_{n=1}^{\infty} f_n(z)$ converges uniformly on U if $\sum_{n=1}^{\infty} M_n$ converges.
Theorem 181. Suppose a power series has a nonzero radius of convergence R about a point z0. Then it converges uniformly in the set {z ∈ C : |z − z0| ≤ r} where 0 < r < R.
Laurent series Let z0 ∈ C and C1 := {z ∈ C : |z − z0| = r1} and C2 := {z ∈ C : |z − z0| = r2} be two concentric positively oriented circles centred on z0, 0 ≤ r1 < r2 ≤ ∞. Let U := {z ∈ C : r1 < |z − z0| < r2}. If f : U −→ C is analytic, then
\[
f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n
\]
where
\[
a_n = \frac{1}{2\pi i} \int_C \frac{f(z)}{(z - z_0)^{n+1}}\, dz \qquad (n \in \mathrm{Z})
\]
and C is the circle γ(θ) := z0 + $re^{i\theta}$, r1 < r < r2 and 0 ≤ θ ≤ 2π. The series, which is uniquely determined by f and U, converges absolutely on U and uniformly on compact subsets of U.
Theorem 182. The function f : U −→ C defined by $f(z) := \sum_{n=0}^{\infty} a_n(z - z_0)^n$ is analytic on the disc of convergence U of the power series.
65
Theorem 183. (Integrating a power series) Let $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ be a convergent power series and C be any contour lying in the disc of convergence. If g : C −→ C is continuous, then
\[
\int_C g(z) \left[ \sum_{n=0}^{\infty} a_n (z - z_0)^n \right] dz = \sum_{n=0}^{\infty} a_n \int_C g(z)(z - z_0)^n\, dz
\]
In particular, by taking g(z) ≡ 1, it follows that the power series can be integrated term-by-term:
\[
\int_C \sum_{n=0}^{\infty} a_n (z - z_0)^n\, dz = \sum_{n=0}^{\infty} a_n \int_C (z - z_0)^n\, dz
\]
Theorem 184. (Differentiating a power series) A power series $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ can be differentiated term-by-term in its disc of convergence and the resulting series has the same disc of convergence:
\[
\frac{d}{dz} \sum_{n=0}^{\infty} a_n (z - z_0)^n = \sum_{n=1}^{\infty} n a_n (z - z_0)^{n-1}
\]
Theorem 185. (Uniqueness of a series representation) The function f defined by $f(z) := \sum_{n=0}^{\infty} a_n(z - z_0)^n$ on the disc of convergence of the power series is the Taylor expansion of f about z0. In other words,
\[
a_n = \frac{f^{(n)}(z_0)}{n!}
\]
Similarly, if $\sum_{n=-\infty}^{\infty} c_n(z - z_0)^n$ converges to a function f(z) in some annular open set, then it is the Laurent series of f.
Sum & product of series If $\sum_{n=0}^{\infty} a_n(z - z_0)^n$ and $\sum_{n=0}^{\infty} b_n(z - z_0)^n$ are two power series converging on the same disc, their sum is the series $\sum_{n=0}^{\infty} (a_n + b_n)(z - z_0)^n$, convergent on the same disc.
Theorem 186. Given convergent power series $f(z) := \sum_{n=0}^{\infty} a_n(z - z_0)^n$ and $g(z) := \sum_{n=0}^{\infty} b_n(z - z_0)^n$ converging within the same disc D, the product f(z)g(z) is also a power series, with the expansion
\[
f(z)g(z) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) (z - z_0)^n
\]
valid in D. The product series is sometimes called the Cauchy product.
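The Cauchy product formula can be checked on a familiar pair of series: multiplying the series for $e^z$ by itself must give the series for $e^{2z}$, whose nth coefficient is $2^n/n!$. The truncation at ten terms below is an arbitrary choice for this illustration.

```python
import math

N = 10
a = [1.0 / math.factorial(n) for n in range(N)]      # coefficients of e^z
# Cauchy product of the series with itself
c = [sum(a[k] * a[n - k] for k in range(n + 1)) for n in range(N)]
# e^z * e^z = e^{2z}, whose nth coefficient is 2^n / n!
expected = [2.0 ** n / math.factorial(n) for n in range(N)]
```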
Theorem 187. Let f : U −→ C be analytic
and z0 ∈ U . If f is nonconstant and f(z0) = 0,
then there is an open ball B(z0, r) ⊂ U within
which f has no other zero, i.e. the zeros of f
are “isolated”.
7.7 Residues & Poles
Singularities Let f : U −→ C and z0 ∈ U. If f is not analytic (or possibly not even defined) at z0 but is analytic at some point in every ball centred on z0, then z0 is said to be a singularity of f.
The singularity z0 is isolated if there is a ball B(z0, r) ⊂ C such that f is analytic on B(z0, r) \ {z0}. It is a removable singularity if there exists an analytic function $\tilde{f}$ : B(z0, r) −→ C such that $\tilde{f}(z) = f(z)$ on B(z0, r) \ {z0} (thus f can be redefined at z0 to be $\tilde{f}(z_0)$ to become analytic). The isolated singularity z0 is said to be a pole if $\lim_{z\to z_0} |f(z)| = \infty$. If an isolated singularity is neither removable nor a pole, it is said to be an essential singularity.
By (7.6), f has a Laurent series expansion in an annular domain about an isolated singularity z0.
Theorem 188. Let f : U \ {z0} −→ C be analytic on a connected open set. The following criteria classify the isolated singularities.
Removable singularity If z0 ∈ U is an isolated singularity of f, then it is removable iff
\[
\lim_{z \to z_0} (z - z_0) f(z) = 0
\]
Pole If f has a pole at z0, then there are m ∈ N and an analytic g : U −→ C with g(z0) ≠ 0 such that
\[
f(z) = \frac{g(z)}{(z - z_0)^m}
\]
and m is the smallest positive integer such that (z − z0)^m f(z) has a removable singularity at z0. m is called the order of the pole at z0. A pole of order 1 is said to be simple. Moreover, if B(z0, r) ⊂ U, we can express f as follows:
\[
f(z) = h(z) + \sum_{k=1}^{m} \frac{b_k}{(z - z_0)^k}
\]
where h : B(z0, r) −→ C is analytic and $b_m \ne 0$. The m summed terms on the RHS are called the singular or principal part of f at z0.
Essential singularity If z0 is an essential singularity of f, then for any r > 0, every z ∈ C can be approximated arbitrarily closely by elements of the set f(B(z0, r) \ {z0}) in the following sense: given any ε > 0 and z ∈ C, there is w ∈ B(z0, r) \ {z0} such that |z − f(w)| < ε. (This result is called the Casorati-Weierstrass theorem.)
Alternatively, the Laurent series expansion can be used to identify the nature of an isolated singularity.
Theorem 189. Let z0 be an isolated singularity of a function f which has the Laurent series development $f(z) = \sum_{n=-\infty}^{\infty} c_n(z - z_0)^n$ in a suitable domain. Then
1. z0 is removable iff cn = 0 for n ≤ −1.
2. It is a pole of order m iff $c_{-m} \ne 0$ and cn = 0 for all n ≤ −(m + 1).
3. It is an essential singularity iff cn ≠ 0 for infinitely many (but not necessarily all) negative n.
Residues Let f have an isolated singularity at z0 with Laurent series expansion $f(z) = \sum_{n=-\infty}^{\infty} c_n(z - z_0)^n$. The coefficient $c_{-1}$ is called the residue of f at z0. It is often denoted by Res(f; z0).
Theorem 190.
1. $\operatorname{Res}(f; z_0) := c_{-1} = \frac{1}{2\pi i} \int_C f(z)\, dz$, where C is a positively oriented circle centred on z0 and contained in the domain of f.
2. Suppose f has a pole of order m at z0. If g(z) := (z − z0)^m f(z), then
\[
\operatorname{Res}(f; z_0) = \frac{1}{(m - 1)!}\, g^{(m-1)}(z_0)
\]
The Residue Theorem Suppose f : U −→ C is analytic except at the isolated singular points z1, z2, . . . , zn ∈ U and let C be a positively oriented simple closed contour lying in U such that the singularities are in the interior of the region enclosed by C. Then

∫_C f(z) dz = 2πi Σ_{k=1}^{n} Res(f ; zk)
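The residue theorem lends itself to a direct numerical check. A minimal Python sketch: the circle parametrisation below and the test function e^z/z² (a pole of order 2 at 0 with residue g′(0) = e⁰ = 1, by Theorem 190) are illustrative choices, not part of the compendium.

```python
import cmath

def contour_integral(f, center, radius, n=20000):
    """Approximate the integral of f over the positively oriented circle
    |z - center| = radius via the parametrization z = center + R*e^{i*theta},
    so dz = i*R*e^{i*theta} dtheta."""
    total = 0.0 + 0.0j
    dtheta = 2 * cmath.pi / n
    for j in range(n):
        theta = j * dtheta
        z = center + radius * cmath.exp(1j * theta)
        dz = 1j * radius * cmath.exp(1j * theta) * dtheta
        total += f(z) * dz
    return total

# f(z) = e^z / z^2 has residue 1 at z0 = 0, so the integral should be 2*pi*i.
I = contour_integral(lambda z: cmath.exp(z) / z**2, 0.0, 1.0)
print(I)  # ≈ 2πi ≈ 6.28319j
```

Because the integrand is periodic and analytic on the contour, the plain Riemann sum converges very rapidly here.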
Theorem 191.

1. Let f : U −→ C, f(z) = p(z)/q(z), with p, q analytic at z0 ∈ U . Suppose p(z0) ≠ 0 but q has a “zero of order m at z0”, i.e. q(z) = (z − z0)^m r(z) with r(z0) ≠ 0. Then f has a pole of order m at z0.

2. In particular, if p, q are as before and q(z0) = 0 but q′(z0) ≠ 0, then z0 is a simple pole of f and

Res(f ; z0) = p(z0)/q′(z0)
Jordan’s inequality

∫_0^{π/2} e^{−R sin θ} dθ < π/(2R)   (R > 0)
Chapter 8
Probability & Statistics
8.1 Probability
Definition 192. An outcome of an experiment is
called an event. It may be possible to regard cer-
tain more complex events as combinations of sim-
pler events. If an event cannot be exhibited as a
combination of other events associated with the ex-
periment, it is said to be simple or elementary.
Otherwise it is compound. A sample space S is the
collection of all elementary events which are also
called sample points. A subset A of S is, in general,
a compound event. The event A is said to occur if
all its constituent events simultaneously occur. An
elementary event is a singleton set. The event S is
called the sure event and the null set ∅ is called the
impossible event.
Henceforth S will denote some sample space.
Definition 193.

1. Given an event A, the non-occurrence of A is also an event called the complementary event. It corresponds to the set Ac, the complement of A. This is sometimes denoted by Ā or ∼A.

2. Given events A and B, the event “either A occurs or B occurs (or both occur)” is represented by the set A ∪ B. Given a finite or infinite sequence of events A1, A2, . . . , the corresponding representation is ∪n An. The event “both A and B occur (simultaneously)” is represented by the set A ∩ B. For a finite or infinite sequence of events A1, A2, . . . , the corresponding representation is ∩n An.

3. A collection of events A1, . . . , An in S is mutually exclusive if the sets Ai are pairwise disjoint. They are exhaustive if ∪_{i=1}^{n} Ai = S.

4. The event “if A occurs then B occurs” or “the occurrence of A implies the occurrence of B” is represented by the relation A ⊂ B.

5. The event “A occurs but not B” is the set A \ B.
Definition 194. A sample space S is said to be
discrete if S is finite or countably infinite, i.e. its
points can be arranged in an infinite sequence.
Let S = {x1, x2, . . . , xn, . . . } be a finite or infinite
discrete sample space.
Definition 195. Let p1, p2, . . . be a finite or infinite sequence of real numbers, 0 ≤ pi ≤ 1 for all i = 1, 2, . . . , such that Σ_i pi = 1. For any A ⊂ S, define

P (A) := Σ_{xi∈A} pi
(The sum is over all i such that xi ∈ A.) In par-
ticular, P ({xi}) = pi. Then P (A) is the assigned
probability that the event A occurs. Thus, P is a
function:
P : {all subsets of S} −→ [0, 1]
The sample space S together with the probability
function P is said to be a probability space.
A very important special case is the following.
Definition 196. Let S = {x1, x2, . . . , xn} be finite and take pi = 1/n for all i = 1, 2, . . . , n. Then the probability of the occurrence of an event A can be described as

P (A) = (no. of points in A)/(no. of points in S) = (no. of outcomes favourable to A)/(total no. of possible outcomes)
Theorem 197. Let S be a sample space and let
P (A) be the probability assigned to the event A as
in Definition 195. Then
1. P (∅) = 0.

2. P (⊔_{n=1}^{∞} An) = Σ_{n=1}^{∞} P (An), i.e. the probability of the union of a sequence of mutually exclusive events is the sum of the probabilities of the individual events.

3. P (A ∪ B) = P (A) + P (B) − P (A ∩ B) where the events A and B are not necessarily exclusive. In particular, P (A ∪ B) ≤ P (A) + P (B) (Boole’s inequality).

4. P (A ∪ B ∪ C) = P (A) + P (B) + P (C) − P (A ∩ B) − P (B ∩ C) − P (C ∩ A) + P (A ∩ B ∩ C).

5. More generally,

P (∪_{k=1}^{n} Ak) = Σ_{k=1}^{n} (−1)^{k+1} Σ_{1≤i1<···<ik≤n} P (Ai1 ∩ · · · ∩ Aik)

Also, Boole’s inequality holds:

P (∪_{i=1}^{n} Ai) ≤ Σ_{i=1}^{n} P (Ai)
6. If A and B are mutually exclusive, then P (A∩B) = P (∅) = 0.
7. P (Ac) = 1− P (A).
8. If event A implies event B so that A ⊂ B, then
P (A) ≤ P (B).
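The identities of Theorem 197 can be verified on a concrete equiprobable space (Definition 196). A Python sketch; the two-dice sample space and the three events are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

# Equiprobable space: two throws of a fair die (Definition 196).
S = {(i, j) for i in range(1, 7) for j in range(1, 7)}

def P(A):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(A), len(S))

A = {s for s in S if s[0] == 6}          # first die shows 6
B = {s for s in S if s[1] == 6}          # second die shows 6
C = {s for s in S if s[0] + s[1] == 7}   # the sum is 7

# Item 3: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).  (Here `|`, `&` are set union/intersection.)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Item 5: inclusion-exclusion for the three events.
events = [A, B, C]
total = Fraction(0)
for k in range(1, 4):
    for sub in combinations(events, k):
        total += (-1) ** (k + 1) * P(set.intersection(*sub))
assert P(A | B | C) == total

# Boole's inequality.
assert P(A | B | C) <= P(A) + P(B) + P(C)
print(P(A | B), total)  # 11/36 5/12
```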
Definition 198. Let S be a finite sample space with
probability function P .
1. Events A and B are independent if P (A∩B) =
P (A)P (B).
2. A collection of events A1, . . . , An is said to be independent if for any finite subset i1, . . . , ik of indices, we have

P (Ai1 ∩ · · · ∩ Aik) = P (Ai1) · · · P (Aik)

Theorem 199. Let A1, . . . , An be a collection of independent events. Then, if any event Aj is replaced by its complementary event Aj^c, the resulting collection of events is again independent.
Definition 200.
1. Suppose an experiment with exactly two pos-
sible outcomes (usually termed “success” and
“failure”) with associated probabilities p and
q := 1 − p, is repeated n times. Assume
that any repetition is independent of any other.
Such an experiment is said to be a Bernoulli
trial.
2. If each repetition of the experiment has precisely k > 2 possible outcomes (labelled ω1, ω2, . . . , ωk) with associated probabilities p1, . . . , pk, Σ_{i=1}^{k} pi = 1, then the sequence of independent experiments is said to be a generalised Bernoulli trial.
Theorem 201.

1. In a sequence of n Bernoulli trials, the probability of obtaining exactly k successes (and hence, n − k failures), k = 0, 1, . . . , n, is given by

p(k) = (n choose k) p^k q^{n−k}

where as before q := 1 − p.

2. In a sequence of n generalised Bernoulli trials, the probability that outcome ω1 occurs n1 times, . . . , ωk occurs nk times, n1 + · · · + nk = n, is given by

p(n1, . . . , nk) = (n!/(n1! · · · nk!)) p1^{n1} · · · pk^{nk}
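Both formulas of Theorem 201 translate directly into code. A Python sketch; the coin and die examples are illustrative choices.

```python
from math import comb, factorial, prod

def binom_pmf(k, n, p):
    """p(k) = C(n, k) p^k (1-p)^(n-k): exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def multinom_pmf(counts, probs):
    """p(n1,...,nk) = n!/(n1!...nk!) * p1^n1 ... pk^nk (generalised Bernoulli trials)."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p**c for p, c in zip(probs, counts))

# Exactly 2 heads in 4 tosses of a fair coin: C(4,2)/2^4 = 3/8.
print(binom_pmf(2, 4, 0.5))          # 0.375
# A fair die thrown 6 times, each face exactly once: 6!/6^6.
print(multinom_pmf([1]*6, [1/6]*6))  # ≈ 0.0154
```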
8.1.1 Conditional probability
Let (S, P ) be a given probability space and A,B two
events. The conditional probability of the event A
occurring given that event B has already occurred
is defined to be
P (A |B) :=P (A ∩B)
P (B)
Here (in the context of discrete sample spaces) it
can be safely assumed that P (B) > 0. Alternat-
ively, we may write
P (A ∩B) = P (B)P (A |B) = P (A)P (B |A)
Theorem 202.
1. If A and B are independent events, then
P (A |B) = P (A).
2. (Chain Rule) Let A1, . . . , An be events such
that P (A1 ∩ · · · ∩An−1) > 0. Then
P (A1 ∩ · · · ∩An)
= P (A1)P (A2 |A1)P (A3 |A1 ∩A2)
· · ·P (An |A1 ∩ · · · ∩An−1)
Definition 203. Given a sample space S, a parti-
tion of S is a (finite or infinite) collection of events
An such that the events are mutually exclusive and
exhaustive.
Theorem 204. Let {Ei : i = 1, . . . , n} be a finite
partition of a sample space S and such that P (Ei) >
0 for all i.
1. (Theorem of total probability) If A is an arbitrary event, then

P (A) = Σ_{i=1}^{n} P (A ∩ Ei) = Σ_{i=1}^{n} P (A |Ei)P (Ei)

An exactly analogous theorem with the finite sum replaced by an infinite series is true for an infinite partition.

2. If A and B are any two events in S with P (B) > 0, then

P (A |B) = Σ_{i=1}^{n} P (A |B ∩ Ei)P (Ei |B)
Theorem 205. (Bayes’ theorem) Let {Ei : i = 1, . . . , n} be a partition of S with P (Ei) > 0 for all i. Suppose A is an event such that P (A) > 0. Then for each Ek

P (Ek |A) = P (A |Ek)P (Ek) / Σ_{i=1}^{n} P (A |Ei)P (Ei)

Corollary 206. If the partition consists of the two events {E,Ec}, then with the same assumptions as above,

P (E |A) = P (A |E)P (E) / [P (A |E)P (E) + P (A |Ec)P (Ec)]
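Corollary 206 is the form most often met in applications. A Python sketch with hypothetical numbers (the prevalence 0.01, sensitivity 0.99 and false-positive rate 0.05 are invented for illustration): E is the event "condition present", A the event "test positive".

```python
p_E = 0.01            # P(E): prevalence (hypothetical)
p_A_given_E = 0.99    # P(A|E): sensitivity (hypothetical)
p_A_given_Ec = 0.05   # P(A|E^c): false-positive rate (hypothetical)

# Theorem of total probability: P(A) = P(A|E)P(E) + P(A|E^c)P(E^c).
p_A = p_A_given_E * p_E + p_A_given_Ec * (1 - p_E)

# Bayes' theorem (Corollary 206): P(E|A) = P(A|E)P(E) / P(A).
p_E_given_A = p_A_given_E * p_E / p_A
print(round(p_E_given_A, 4))  # 0.1667
```

Despite the high sensitivity, the posterior probability is only 1/6, because the condition is rare.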
8.1.2 Random variables & probability distributions
The one-dimensional case
Definition 207. A (one-dimensional) discrete ran-
dom variable X is a function X : S −→ R :=
{a1, a2, . . . , an, . . . } ⊂ R. Here, the co-domain R
may be finite or infinite.
A (one-dimensional) continuous random variable X
is a function X : S −→ R.
The term is frequently abbreviated as r.v. We
use the following notation: if X is an r.v. on S and
A is any subset of the co-domain of X, then
{X ∈ A} := {s ∈ S : X(s) ∈ A} = X−1(A)
An alternative notation is (X ∈ A). The event
{X ∈ {a}} is written as {X = a}.
Definition 208. A distribution of an r.v. is the as-
signment of probabilities to all possible events when
they are defined in terms of X. In other words, a
distribution of X assigns probabilities to all events
of the form {X ∈ A}, where A is a subset of the co-domain of X such that X−1(A) is an event
(possibly the sure event S or the impossible event
∅) in the sample space S.
Definition 209. Let X : S −→ R be a r.v. The
function
FX(t) = P{X ≤ t}
is called the (cumulative) distribution function of
X. If the r.v. X is understood then FX is simply
denoted F . The abbreviations c.d.f. and d.f. are
common.
Definition 210.
1. Let X be a discrete r.v. The function p : R −→ [0, 1] defined by p(t) := P{X = t} is called the probability mass function of the r.v. X. Thus, p(t) is the probability that X takes the value t.

2. Let X be a continuous r.v. A function f : R −→ [0,∞) is called the probability density function or simply, density, of X if

FX(t) = ∫_{−∞}^{t} f(x) dx   (−∞ < t < ∞)

f is usually taken to be integrable on every finite interval [s, t], in which case FX becomes continuous.
Definition 211. Let X and Y be r.v.’s defined on
the same sample space. X and Y are said to be
independent iff whenever E := {X ∈ A} and F :=
{Y ∈ B} (A,B ⊂ R) are events,
P (E ∩ F ) = P (E)P (F )
General properties of distributions
Theorem 212. Let X be a discrete r.v. and A ⊂ R, the co-domain of X. Then

P{X ∈ A} = Σ_{t∈A} p(t)

where the sum may be restricted to all t such that p(t) > 0.
Theorem 213. Let FX be the distribution function
of the r.v. X. Then
1. 0 ≤ FX(t) ≤ 1 for all t ∈ R.
2. P{a < X ≤ b} = FX(b)− FX(a) when a < b.
3. a < b⇒ FX(a) ≤ FX(b).
4. P{a ≤ X ≤ b} = FX(b)−FX(a) +P{X = a}.
5. P{a < X < b} = FX(b)− FX(a)− P{X = b}.
6. P{a ≤ X < b} = FX(b) − FX(a) + P{X = a} − P{X = b}.
Theorem 214. Let FX be the distribution function
of a r.v. X. Then
1. F is nondecreasing.
2. lim_{t→−∞} FX(t) = 0 and lim_{t→∞} FX(t) = 1.

3. FX is “continuous from the right”:

lim_{t→a+} FX(t) = FX(a)

and

lim_{t→a−} FX(t) = FX(a) − P{X = a}
Theorem 215.

1. Let X be a continuous r.v. with density f . Then whenever a < b,

P{a ≤ X ≤ b} = P{a < X ≤ b} = P{a < X < b} = P{a ≤ X < b} = F (b) − F (a) = ∫_a^b f(x) dx

2. ∫_{−∞}^{∞} f(x) dx = 1.
Functions of a random variable
Let X : S −→ R be a r.v. with distribution function
FX . Let φ : R −→ R be any function. Then Y := φ(X) is a new r.v. defined by φ(X)(s) := φ(X(s)) for s ∈ S.
Theorem 216. Let X be a r.v. with density fX . Suppose φ is continuously differentiable with inverse ψ. Then the density fY of Y := φ(X) is given by

fY (y) = fX(ψ(y)) |ψ′(y)|

In particular, if φ(t) = at + b (a ≠ 0) so that Y = aX + b, then

fY (y) = (1/|a|) fX((y − b)/a)
Definition 217. A collection {X1, X2, . . . , Xn} of
r.v.’s defined on the same sample space S is said to
be identically distributed if FXi = FXj for all i, j.
Definition 218. (Expectation or Expected
Value or Mean)
1. (The discrete case) Let X be a discrete r.v. taking values {x1, x2, . . . , xn} with probability distribution pi := P{X = xi}. Then the expectation of X is

E(X) := Σ_{i=1}^{n} xi pi

If the set of values {x1, x2, . . . , xn, . . . } is infinite, then the expectation is again defined to be

E(X) := Σ_{i=1}^{∞} xi pi

under the assumption that Σ_{i=1}^{∞} |xi| pi < ∞.

2. If X is a continuous r.v. with density f(x), then the expectation is defined to be

E(X) := ∫_{−∞}^{∞} x f(x) dx

subject to the condition that ∫_{−∞}^{∞} |x| f(x) dx < ∞.
An alternative term for the expectation is first (ordinary) moment. The expectation of a r.v. is often denoted by E[X] or µX or mX or m1 (for “mean” or “moment”).
Another notion of a “mean” or “central value” is
the median.
Definition 219. Let X be a r.v. Then a number m ∈ R such that P{X ≤ m} ≥ 0.5 and P{X ≥ m} ≥ 0.5 is called the median of X.
Some properties of the expectation are listed be-
low.
Theorem 220. Let X and Y be r.v.’s on the same
sample space.
1. If X ≥ 0, then E(X) ≥ 0.

2. E(aX + b) = aE(X) + b for all a, b ∈ R.

3. |E(X)| ≤ E(|X|).

4. If E|X| < ∞ and E|Y | < ∞, then

E(X + Y ) = E(X) + E(Y )

This result extends to finitely many r.v.’s.

5. If X and Y are independent (see Definition 211) and E|XY | < ∞, then

E(XY ) = E(X)E(Y )

6. If X is a non-negative integer-valued discrete r.v., then

E(X) = Σ_{n=0}^{∞} P{X > n}

and if X is a non-negative continuous r.v. with c.d.f. F (t), then

E(X) = ∫_0^{∞} [1 − F (t)] dt
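Item 6 (the tail-sum formula) can be checked against a small binomial distribution, for which E(X) = np is known (Theorem 227). A Python sketch; Bin(4, 0.5) is an illustrative choice.

```python
from math import comb

# X ~ Bin(4, 0.5), so E(X) = np = 2.
n_trials, p = 4, 0.5
pmf = [comb(n_trials, k) * p**k * (1 - p)**(n_trials - k)
       for k in range(n_trials + 1)]

# Direct definition: E(X) = sum_k k * P{X = k}.
expectation = sum(k * pmf[k] for k in range(n_trials + 1))

# Tail-sum formula: E(X) = sum_{n >= 0} P{X > n}.
tail_sum = sum(sum(pmf[k] for k in range(n + 1, n_trials + 1))
               for n in range(n_trials))

print(expectation, tail_sum)  # 2.0 2.0
```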
Definition 221. (Variance) Let X be a r.v. such
that E(X2) < ∞. Then the variance of X is the
quantity
Var(X) := E[(X − E(X))2]
Definition 222. The standard deviation (s.d.) of a r.v. X is defined to be

σX := √Var(X)
Definition 223. Given a pair of r.v.’s X, Y , their covariance is defined as

Cov(X,Y ) := E(XY ) − E(X)E(Y ) = E[(X − E(X))(Y − E(Y ))]

If Cov(X,Y ) = 0, then X and Y are said to be uncorrelated or orthogonal.
Definition 224. Let X and Y be r.v.’s such that
E(X2) <∞ and E(Y 2) <∞. Then
ρ := ρX,Y := Cov(X,Y )/√(Var(X) Var(Y ))

is called the coefficient of correlation between X and Y .
Theorem 225.
1. −1 ≤ ρX,Y ≤ 1.
2. |ρX,Y | = 1 iff there are a, b ∈ R, a 6= 0, such
that P{Y = aX + b} = 1.
3. |ρX,Y | is invariant under a linear change of (random) variables: if a, b, c, d ∈ R are such that ac ≠ 0, then ρ_{aX+b, cY+d} = ρX,Y if ac > 0, and −ρX,Y if ac < 0.
The following are the properties of the variance,
the covariance and some related facts.
Theorem 226.
1. Var(X) = E(X2)− [E(X)]2.
2. If Var(X) exists, then
Var(aX + b) = a2Var(X)
for any a, b ∈ R.
3. If E(X2) and E(Y 2) exist, then

Var(X + Y ) = Var(X) + Var(Y ) + 2 Cov(X,Y )

4. Suppose the n r.v.’s X1, . . . , Xn are pairwise uncorrelated. Then

Var(X1 + · · · + Xn) = Var(X1) + · · · + Var(Xn)

5. If the mean square deviation from t is the quantity E[(X − t)2], it is minimised when t = E(X) and the minimum value is Var(X).

6. The quantity E(|X − t|), called the mean absolute deviation from t, is minimised at t = m, where m is the median of X.

7. (Schwarz inequality) Given two r.v.’s X and Y ,

[E(XY )]2 ≤ E(X2)E(Y 2)
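Several of these identities can be verified on an empirical distribution, where expectations become averages. A Python sketch; the uniform samples and the constants 3, −7 are arbitrary choices.

```python
import random

random.seed(0)
xs = [random.random() for _ in range(1000)]
ys = [random.random() for _ in range(1000)]

def E(vals): return sum(vals) / len(vals)
def var(vals): return E([v * v for v in vals]) - E(vals) ** 2        # item 1
def cov(a, b): return E([u * v for u, v in zip(a, b)]) - E(a) * E(b)

# Item 3: Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
lhs = var([u + v for u, v in zip(xs, ys)])
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
assert abs(lhs - rhs) < 1e-9

# Item 2: Var(aX + b) = a^2 Var(X).
assert abs(var([3 * u - 7 for u in xs]) - 9 * var(xs)) < 1e-9

# Item 7 (Schwarz): [E(XY)]^2 <= E(X^2) E(Y^2).
assert E([u * v for u, v in zip(xs, ys)]) ** 2 <= \
       E([u * u for u in xs]) * E([v * v for v in ys])
print("identities verified")
```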
8.1.3 Some standard distributions
Discrete distributions
Binomial distribution Let X be a r.v. defined to be the total number of successes in an experiment with only two outcomes, “success” (with some probability p) and “failure” (with probability 1 − p), conducted n times. Thus, X takes values 0, 1, 2, . . . , n. Then X is said to be binomially distributed and its probability mass function is given by

b(k; n, p) := P{X = k} = (n choose k) p^k (1 − p)^{n−k}

The fact that a r.v. has the binomial distribution is sometimes indicated by writing X ∼ Bin(n, p).
Theorem 227. If X ∼ Bin(n, p), then
E(X) = np and Var(X) = np(1− p).
Poisson distribution A r.v. X is said to have the Poisson distribution with parameter λ > 0 if X : S −→ {0, 1, 2, . . . } and

P{X = n} = (λ^n/n!) e^{−λ}

This is sometimes indicated by writing X ∼ Poi(λ).

Theorem 228. If X has the Poisson distribution with parameter λ, then E(X) = λ and Var(X) = λ.
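The moments in Theorem 228 can be recovered numerically from the pmf. A Python sketch; λ = 3 and the truncation point 60 are arbitrary choices (the tail beyond is negligible at this λ).

```python
from math import exp, factorial

lam = 3.0
# Truncated Poisson pmf: P{X = n} = (lam^n / n!) e^{-lam}.
pmf = [lam**n / factorial(n) * exp(-lam) for n in range(60)]

mean = sum(n * p for n, p in enumerate(pmf))
second_moment = sum(n * n * p for n, p in enumerate(pmf))
variance = second_moment - mean**2   # Theorem 226, item 1

print(round(mean, 6), round(variance, 6))  # 3.0 3.0
```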
Continuous distributions
Uniform distribution Let X be a continuous r.v. and [a, b] ⊂ R be a given interval. We define a probability density function (p.d.f.) of X as follows: f(x) = c if a ≤ x ≤ b, and f(x) = 0 otherwise, where c = 1/(b − a) so that the density integrates to 1. Then X is said to be uniformly distributed on [a, b] and this is symbolised by X ∼ U [a, b].

Theorem 229. If X ∼ U [a, b], then E(X) = (a + b)/2 and Var(X) = (b − a)²/12.
Theorem 230. If the distribution F of X is
strictly increasing, then Y := F (X) ∼ U [0, 1]
(see subsection (8.1.3)).
Normal or Gaussian distribution A continuous r.v. X is said to have the normal (or Gaussian) distribution with parameters µ and σ > 0 if it has density

f(x) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)}

This distribution is indicated by writing X ∼ N(µ, σ²).

Theorem 231. If X ∼ N(µ, σ²), then E(X) = µ and Var(X) = σ². A linear transformation of a normally distributed r.v. is normally distributed. If X1, . . . , Xn are independent r.v.’s such that Xi ∼ N(µi, σi²), then

c1X1 + · · · + cnXn ∼ N(Σ_{i=1}^{n} ciµi, Σ_{i=1}^{n} ci²σi²)

In particular, if X′ := (X − µ)/σ, then X′ ∼ N(0, 1).
8.2 Statistics
Statistic A population being analysed has certain
statistical measures such as the expectation (or
mean) and variance defined on it. In prac-
tice, such measures can only be computed for
samples. Any measure of interest determined
for a sample is called a statistic.
Error estimates Since statistical quantities computed from a sample are likely to differ from those calculated for the whole population, an idea of the errors incurred is required.
Definition 232. The probability distribution
of a given statistic (e.g. mean or variance) is
called its sampling distribution. The standard
deviation of the sampling distribution is called
the standard error (SE) of the statistic.
Theorem 233. Suppose random samples of size n are drawn from a population assumed to have mean µ and variance σ². Let Xi (i = 1, . . . , n) be independent identically distributed (iid) r.v.’s representing the value of the ith member of a sample. Suppose Xi ∼ N(µi, σi²). Then the standard errors of the various statistics are as follows.

Sample mean vs population mean Suppose µi = µ and σi = σ for all i. The sample mean and the sample variance are defined to be the mean and the variance of the r.v. X̄ = (1/n) Σ_{i=1}^{n} Xi. Hence

E(X̄) = µ and Var(X̄) = σ²/n

Thus, X̄ ∼ N(µ, σ²/n) and the SE is σ/√n.
Difference between means of two samples Let samples of size n1 and n2 be drawn from the same or two different populations which have mean µ and variances σ1² and σ2² (both may be equal). If X̄1 and X̄2 are the means of the samples respectively, assume that X̄1 ∼ N(µ, σ1²/n1) and X̄2 ∼ N(µ, σ2²/n2). Then the SE of X̄1 − X̄2 is

√(σ1²/n1 + σ2²/n2)
Linear regression
The method of least squares
Let Ax = b be a linear non-homogeneous system of m equations in n unknowns, where A = (aij) is the m × n coefficient matrix and b = (b1, b2, . . . , bm)^T. Then the approximate solution according to the method of least squares is that x which minimises the quantity Σ_{i=1}^{m} Ei², where

Ei := ai1x1 + ai2x2 + · · · + ainxn − bi

This gives the system of normal equations which can be solved to obtain the minimising x:

Σ_{i=1}^{m} ai1Ei = 0,  Σ_{i=1}^{m} ai2Ei = 0,  . . . ,  Σ_{i=1}^{m} ainEi = 0
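The normal equations above are equivalent to the n × n system (AᵀA)x = Aᵀb. A self-contained Python sketch; the line-fitting data points are invented for illustration.

```python
def least_squares(A, b):
    """Solve the normal equations (A^T A) x = A^T b by Gauss-Jordan elimination."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[i][r] * A[i][c] for i in range(m)) for c in range(n)]
           for r in range(n)]
    Atb = [sum(A[i][r] * b[i] for i in range(m)) for r in range(n)]
    # Augmented matrix [AtA | Atb], reduced with partial pivoting.
    M = [row[:] + [rhs] for row, rhs in zip(AtA, Atb)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]

# Fit y = x1 + x2*t to four noisy points lying roughly on y = 1 + 2t.
ts = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.1, 6.9]
A = [[1.0, t] for t in ts]
x = least_squares(A, ys)
print([round(v, 3) for v in x])  # [1.06, 1.96]
```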
Linear regression Let (X,Y ) be a bivariate r.v. Let the N data-pairs (xi, yi) have frequencies fi for i = 1, 2, . . . , n, so that N = Σ_{i=1}^{n} fi. The following are the equations of the lines of regression.

Deviations parallel to the y-axis minimised This is called the line of regression of y on x:

y − ȳ = (µ11/σx²)(x − x̄)

where we use the following notation:

x̄ := (1/N) Σ_{i=1}^{n} fixi

ȳ := (1/N) Σ_{i=1}^{n} fiyi

σx² := Var({x1, . . . , xn}) = ((1/N) Σ_{i=1}^{n} fixi²) − x̄² = E(X²) − E(X)²

µ11 := ((1/N) Σ_{i=1}^{n} fixiyi) − x̄ȳ = Cov(X,Y )

Deviations parallel to the x-axis minimised With similar notation as above, the equation of the line of regression of x on y is

x − x̄ = (µ11/σy²)(y − ȳ)
Definition 234. The numbers byx := µ11/σx² and bxy := µ11/σy² are called the coefficients of regression of y on x and x on y respectively.

Theorem 235. If ρX,Y denotes the correlation coefficient (see Definition 224), then

ρX,Y = ±√(byx bxy)

the sign being that of µ11.
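The regression quantities above can be computed directly. A Python sketch with invented data (all frequencies fi = 1, so N = n):

```python
# Hypothetical data pairs, each with frequency 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.5, 6.0, 8.5, 10.0]
N = len(xs)

xbar = sum(xs) / N
ybar = sum(ys) / N
mu11 = sum(x * y for x, y in zip(xs, ys)) / N - xbar * ybar   # product moment Cov(X, Y)
var_x = sum(x * x for x in xs) / N - xbar**2
var_y = sum(y * y for y in ys) / N - ybar**2

b_yx = mu11 / var_x   # slope of the line of regression of y on x
b_xy = mu11 / var_y   # slope of the line of regression of x on y
rho = mu11 / (var_x * var_y) ** 0.5

# Theorem 235: |rho| = sqrt(b_yx * b_xy).
assert abs(abs(rho) - (b_yx * b_xy) ** 0.5) < 1e-12
print(b_yx, rho)
```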
Chapter 9
Numerical Methods
9.1 Errors
Definition 236. Let x be a real or complex number and x̄ be an approximation to it. Then

1. The absolute error is the number |x − x̄|.

2. The relative error is |x − x̄|/|x| (for x ≠ 0).

3. The percentage error is (|x − x̄|/|x|) × 100.
Theorem 237. Let f(x1, x2, . . . , xn) be a function with continuous partial derivatives. If there are errors ∆xi in each xi such that (∆xi)² can be neglected for all i = 1, 2, . . . , n, then the approximate error ∆f in f is given by

∆f ≈ Σ_{i=1}^{n} (∂f/∂xi) ∆xi

and the relative error in computing f is given by

∆f/f ≈ Σ_{i=1}^{n} (∂f/∂xi) ∆xi / f(x1, . . . , xn)
9.2 Solution of algebraic & transcendental equations
The bisection method Let f : [a, b] −→ R be continuous and such that f(a)f(b) < 0. To find a root of f , i.e. a point c such that f(c) ≈ 0, the following algorithm is used. Let ε > 0 be small and represent the maximum acceptable absolute error of the approximate root c.

1. Let c := (a + b)/2.

2. If b − c ≤ ε, then accept c as the approximate root and exit.

3. If f(b)f(c) ≤ 0, then set a := c; else set b := c.

4. Return to step 1.
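The algorithm translates directly into code. A Python sketch applied to the transcendental equation cos x = x (an illustrative choice with a root in [0, 1]):

```python
from math import cos

def bisect(f, a, b, eps=1e-10):
    """Bisection method as in the algorithm above; requires f(a)f(b) < 0."""
    assert f(a) * f(b) < 0
    while True:
        c = (a + b) / 2          # step 1
        if b - c <= eps:         # step 2: accept c
            return c
        if f(b) * f(c) <= 0:     # step 3: root lies between c and b
            a = c
        else:
            b = c

root = bisect(lambda x: cos(x) - x, 0.0, 1.0)
print(round(root, 6))  # 0.739085
```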
Newton’s method Also called the Newton-Raphson method and the tangent method. Let f : R −→ R be differentiable. An approximate root is obtained by the following algorithm:

1. Pick any x0 ∈ R. If f ′(x0) ≠ 0, set x1 := x0 − f(x0)/f ′(x0). Else pick another x0.

2. In general, xn+1 = xn − f(xn)/f ′(xn) for all n ≥ 0, assuming f ′(xn) ≠ 0.
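A Python sketch of the iteration; the stopping tolerance and iteration cap are practical additions not specified above.

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        d = fprime(x)
        if d == 0:
            raise ZeroDivisionError("f'(x) = 0; pick another starting point")
        x = x - fx / d
    return x

# sqrt(2) as the positive root of f(x) = x^2 - 2, starting from x0 = 1.
r = newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(r)  # ≈ 1.4142135623730951
```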
9.3 Numerical differentiation

Let f : [a, b] −→ R be differentiable to whatever order necessary. Let

a ≤ x0 < x1 < · · · < xn ≤ b and h = xi − xi−1

for all i, meaning that the n + 1 nodal or tabular points xi are equally spaced a distance h apart.

Theorem 238.

1. f ′(xk) ≈ (1/2h)[−3f(xk) + 4f(xk+1) − f(xk+2)] for k ≤ n − 2.

2. f ′′(xk) ≈ (1/h²)[f(xk) − 2f(xk+1) + f(xk+2)] for k ≤ n − 2.

3. Near the centre, the following formulas are useful:

f ′(xk) ≈ [f(xk−2) − 8f(xk−1) + 8f(xk+1) − f(xk+2)]/(12h)

provided terms in h⁴ and higher order can be neglected, and

f ′′(xk) ≈ [f(xk+1) − 2f(xk) + f(xk−1)]/h²

provided h² is small.
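The formulas of Theorem 238 can be tested on f = exp, for which every derivative is again exp. A Python sketch; x = 1 and h = 0.01 are arbitrary choices.

```python
from math import exp

f = exp
x, h = 1.0, 0.01
exact = exp(1.0)   # f'(1) = f''(1) = e

fwd_d1 = (-3*f(x) + 4*f(x + h) - f(x + 2*h)) / (2*h)                   # item 1
fwd_d2 = (f(x) - 2*f(x + h) + f(x + 2*h)) / h**2                       # item 2
cen_d1 = (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12*h)  # item 3
cen_d2 = (f(x + h) - 2*f(x) + f(x - h)) / h**2                         # item 3

for approx in (fwd_d1, fwd_d2, cen_d1, cen_d2):
    print(abs(approx - exact))  # the central formulas are markedly more accurate
```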
9.4 Numerical integration
Let f : [a, b] −→ R be integrable.
The trapezoidal rule If the interval [a, b] is small, the following approximation holds:

∫_a^b f(x) dx ≈ ((b − a)/2)[f(a) + f(b)]

If the interval [a, b] is not small, it is partitioned into n subintervals of length h := (b − a)/n:

a =: x0 < x1 < · · · < xn = b

with xk := x0 + kh, k = 1, . . . , n. Then

∫_a^b f(x) dx ≈ (h/2)[f(x0) + 2 Σ_{k=1}^{n−1} f(xk) + f(xn)]

This formula is also called the composite trapezoidal rule.
Simpson’s rule If [a, b] is small, divide it into two equal parts, each of length h := (b − a)/2. Given the three points (x0, f(x0)), (x1, f(x1)) and (x2, f(x2)), where a = x0 < x1 = (a + b)/2 < x2 = b,

∫_a^b f(x) dx ≈ (h/3)[f(x0) + 4f(x1) + f(x2)]

This method is also known as Simpson’s 1/3 Rule. If the interval is not small, it is partitioned into an even number 2n of subintervals of length h by the points

a = x0 < x1 := x0 + h < · · · < x2n := x0 + 2nh = b

The approximation is now given by

∫_a^b f(x) dx ≈ (h/3) Σ_{k=0,2,...,2n−2} [f(xk) + 4f(xk+1) + f(xk+2)]
= (h/3)[f(x0) + 4 Σ_{k=1}^{n} f(x2k−1) + 2 Σ_{k=1}^{n−1} f(x2k) + f(x2n)]
9.5 Numerical solution of ODEs
Let
y′ = f(x, y), y(x0) = y0 (9.1)
be an initial value problem.
Single-step methods In this class of methods
(also known as single-point methods), the pro-
cess starts with an initial choice (x0, y0) and at
the points xk := x0 +kh (k = 0, 1, . . . , n and h
represents the step size of the nodal or tabular
points) successively computes the approxima-
tions yk to the exact values y(xk) by means of
the iteration scheme
yk+1 = yk + hF (xk, yk, h, f)
where F (. . . ) is a function which specifies the
method.
Definition 239. In the Euler method
F (x, y, h, f) = f(x, y). This is a so-called
first-order method.
Let y(x) represent the exact solution of the initial value problem (9.1) at an initial point x0 with y(x0) = y0. Let h represent a step size. Define

∆(x0, y0, h, f) := (y(x0 + h) − y0)/h if h ≠ 0, and f(x0, y0) if h = 0.

To obtain a general two-stage Runge-Kutta method, let

F (x0, y0, h) := af(x0, y0) + bf(x0 + ph, y0 + qhf(x0, y0))

where a, b, p, q are chosen so that the Taylor expansion of

∆(x0, y0, h, f) − F (x0, y0, h)

in terms of h begins with the highest possible power.
Definition 240.

1. (Heun’s method or the Euler-Cauchy method) This method is defined by choosing a = b = 1/2 and p = q = 1, and setting

yk+1 = yk + hF (xk, yk, h, f) = yk + (h/2)[f(xk, yk) + f(xk + h, yk + hf(xk, yk))]

2. (Modified Euler-Cauchy method) Here a = 0, b = 1, p = q = 1/2. We obtain the method

yk+1 = yk + hf(xk + h/2, yk + (h/2)f(xk, yk))
Definition 241. (Four-stage explicit Runge-Kutta method) Here

F (x, y, h) = (1/6)(k1 + 2k2 + 2k3 + k4)

and so

yk+1 = yk + hF (xk, yk, h)

where

k1 := f(x, y)
k2 := f(x + h/2, y + (h/2)k1)
k3 := f(x + h/2, y + (h/2)k2)
k4 := f(x + h, y + hk3)
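A Python sketch of one Runge-Kutta step, applied to y′ = y, y(0) = 1 (an illustrative problem with exact solution e^x):

```python
def rk4_step(f, x, y, h):
    """One step of the four-stage explicit Runge-Kutta method:
    y_{k+1} = y_k + (h/6)(k1 + 2*k2 + 2*k3 + k4)."""
    k1 = f(x, y)
    k2 = f(x + h/2, y + h/2 * k1)
    k3 = f(x + h/2, y + h/2 * k2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2*k2 + 2*k3 + k4) / 6

# Integrate y' = y from x = 0 to x = 1 in ten steps of h = 0.1.
x, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = rk4_step(lambda t, u: u, x, y, h)
    x += h
print(y)  # ≈ 2.71828, close to e ≈ 2.718282 (global error O(h^4))
```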