Chapter 2: Linear Transformations
TRANSCRIPT
8/2/2019 Chapter 2 Linear Transformations
LECTURE 8
Linear maps
In this lecture, we introduce the idea of linear maps, also known as linear mappings or
linear transformations, with examples. The aim of our study of linear maps is two-fold: to
understand linear maps in R, R^2 and R^3, and to bring this understanding to bear on more
complex examples.
1. Defining linear maps
Linear maps are (mathematical abstractions of) very common types of function.
Exercise 8.1. Consider the function R : R^2 → R^2 that rotates a vector v anti-clockwise
around the origin through an angle θ to give a vector Rv. Let v, w ∈ R^2 and λ ∈ R.
Find R(v + w) in terms of Rv and Rw, and find R(λv) in terms of Rv.
Answer. When we rotate a line, we get a line, and when we rotate parallel lines, we
get parallel lines. So when we rotate a parallelogram, we get a parallelogram.
Consider the parallelogram with vertices 0, v, w and v + w. When we rotate this
parallelogram about the origin through the angle θ, we obtain a new parallelogram, with
vertices R0, Rv, Rw and R(v + w). Note that R0 = 0.
By the parallelogram law for adding vectors, the fourth vertex of the parallelogram
with vertices 0, Rv and Rw is Rv + Rw. Hence
R(v + w) = Rv + Rw.
[Figure 8.1. Rotating sums of vectors: the parallelogram with vertices 0, v, w and v + w
is rotated to the parallelogram with vertices R0, Rv, Rw and R(v + w).]
Next, if λ > 0, then multiplication by λ is a dilation about the origin. We get the same
result when we dilate first, then rotate, as when we rotate, then dilate. This means that
R(λv) = λ(Rv).
If we consider reflections in the origin as well, which correspond to multiplication by -1,
then we get this formula for all scalars λ.
Exercise 8.2. The simple harmonic oscillator is the quantum mechanical version of a
mass oscillating in simple harmonic motion on a spring. The operator H is a function
from functions to functions. Given a function f : R → R, we define the new function
Hf : R → R by the rule
(Hf)(x) = -(d^2/dx^2)f(x) + x^2 f(x)   for all x ∈ R.
For instance, if f(x) = e^{-x^2/2}, then Hf(x) = e^{-x^2/2}. Find H(f + g) in terms of Hf and
Hg, and find H(λf) in terms of Hf.
Answer. It is easy to see that
H(f + g)(x) = -(d^2/dx^2)(f + g)(x) + x^2 (f + g)(x)
= (-(d^2/dx^2)f(x) + x^2 f(x)) + (-(d^2/dx^2)g(x) + x^2 g(x))
= Hf(x) + Hg(x),
that is, H(f + g) = Hf + Hg.
Similarly, H(λf) = λHf.
The functions R and H in these two examples enjoy the same algebraic properties: they
respect the basic vector operations of addition and scalar multiplication. This motivates
the following definition.
Definition 8.3. Let V and W be vector spaces. A linear transformation (or mapping
or map) from V to W is a function T : V → W such that
T(v + w) = Tv + Tw
T(λv) = λT(v)
for all vectors v and w and scalars λ.
Let us investigate the linear maps from R to R.
Exercise 8.4. Which of the following functions are linear?
f(x) = 3x + 1        g(x) = -x
h(x) = e^x + 2       k(x) = sin(x).
Answer. First, observe that f(x + y) = 3(x + y) + 1 = 3x + 3y + 1, while f(x) + f(y) =
(3x + 1) + (3y + 1) = 3x + 3y + 2. These expressions are different, so f(x + y) ≠ f(x) + f(y),
and f is not a linear map.
Next, g(x + y) = -(x + y) = -x - y, while g(x) + g(y) = -x - y. Hence
g(x + y) = g(x) + g(y). Further, g(λx) = -λx = λ(-x) = λg(x). Then g is a linear
map.
Third, h(λx) = e^{λx} + 2, while λh(x) = λ(e^x + 2) = λe^x + 2λ. There is no reason
why these should be the same for all values of x and λ in R. Indeed, when λ = -1, then
they are equal only when e^{-x} + 2 = -(e^x + 2), that is, if e^{-x} + e^x + 4 = 0, that is, never.
Then h is not a linear map.
We omit the proof that k is not a linear map.
In general, the linear functions from R to R are the functions of the form l(x) = cx, for
some fixed c ∈ R (including 0).
2. Linear Maps and Matrices
Suppose that A is an n × m matrix. Define T_A : R^m → R^n by the formula
T_A x = Ax   for all x ∈ R^m.
Then T_A is a linear map. Indeed,
T_A(x + y) = A(x + y) = Ax + Ay = T_A(x) + T_A(y)
for all x, y ∈ R^m, and similarly T_A(λx) = λT_A(x) for all x ∈ R^m and λ ∈ R.
We show now that multiplications by matrices are essentially the only examples of
linear maps from Rm to Rn. Recall that ej denotes the vector (0, . . . , 0, 1, 0, . . . , 0)T, with
only one nonzero entry, namely 1, in the jth place.
Theorem 8.5. Suppose that T : R^m → R^n is a linear map. Let aj denote the vector
Tej, and let A denote the n × m matrix whose jth column is aj, where 1 ≤ j ≤ m. Then
Tx = Ax
for all x in R^m.
Proof. Take x in R^m. We write x as (x1, x2, . . . , xm)^T, and then
x = x1e1 + x2e2 + ⋯ + xmem.
Since T is linear,
Tx = T(x1e1 + x2e2 + ⋯ + xmem)
= x1Te1 + x2Te2 + ⋯ + xmTem
= x1a1 + x2a2 + ⋯ + xmam = Ax,
as required. □
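The construction in Theorem 8.5 is easy to carry out numerically: apply T to each standard basis vector and use the results as columns. A minimal sketch in Python with numpy; the map `T` below (which swaps coordinates and scales them) is a made-up example, not one from the text:

```python
import numpy as np

def matrix_of(T, m):
    """Build the matrix of a linear map T : R^m -> R^n.

    Column j of the matrix is T(e_j), exactly as in Theorem 8.5.
    """
    return np.column_stack([T(np.eye(m)[:, j]) for j in range(m)])

# A sample linear map R^2 -> R^2: (x, y) |-> (2y, 3x).
def T(v):
    return np.array([2.0 * v[1], 3.0 * v[0]])

A = matrix_of(T, 2)
x = np.array([5.0, 7.0])
assert np.allclose(A @ x, T(x))  # multiplication by A reproduces T
```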
3. Examples in R2 and R3
Exercise 8.6. Let R : R^2 → R^2 be the rotation (anti-clockwise) through the angle θ.
Represent R as multiplication by a matrix.
Answer. Observe that
R(1, 0)^T = (cos θ, sin θ)^T   and   R(0, 1)^T = (-sin θ, cos θ)^T.
Then
R(x, y)^T = x R(1, 0)^T + y R(0, 1)^T = (x cos θ - y sin θ, x sin θ + y cos θ)^T,
that is, R is multiplication by the matrix
[ cos θ  -sin θ ]
[ sin θ   cos θ ] .
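As a quick numerical sanity check (an editorial sketch, with an arbitrary sample angle): the rotation matrix is additive and preserves lengths.

```python
import numpy as np

theta = 0.7  # an arbitrary sample angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 2.0])
w = np.array([-3.0, 0.5])

# R(v + w) = Rv + Rw, and rotation preserves length.
assert np.allclose(R @ (v + w), R @ v + R @ w)
assert np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))
```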
Exercise 8.7. Let Ds : R^3 → R^3 be dilation by the factor s in R+, that is, Ds x = sx
for all x in R^3. Show that Ds is linear and represent Ds as multiplication by a matrix.
Answer. We omit the proof that Ds is linear.
It is easy to see that
       [ s 0 0 ]
Ds x = [ 0 s 0 ] x   for all x ∈ R^3.
       [ 0 0 s ]
Exercise 8.8. What is the geometric effect of multiplying vectors in R^3 by the matrix
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 10^{-3} ] ?
Answer. If we think of the vectors (1, 0, 0)^T and (0, 1, 0)^T as being horizontal, and
(0, 0, 1)^T as being vertical, then, on the one hand, we do not change the horizontal
components of a vector, while on the other hand, we reduce the vertical component by a factor
of 10^3. This corresponds to squashing down.
Exercise 8.9. What is the geometrical effect of multiplying vectors in R^2 by the matrix
[ 1 1 ]
[ 0 1 ] ?
Answer. The effect of this transformation is to slide points horizontally: to the left when
y < 0 and to the right when y > 0. This is known as a shear transformation.
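A short sketch of the shear on two hypothetical sample points: the x-coordinate shifts by y, and the y-coordinate is unchanged.

```python
import numpy as np

S = np.array([[1.0, 1.0],
              [0.0, 1.0]])  # the shear matrix above

p = np.array([2.0, 3.0])    # y > 0: slides to the right
q = np.array([2.0, -3.0])   # y < 0: slides to the left

assert np.allclose(S @ p, [5.0, 3.0])    # x increased by y
assert np.allclose(S @ q, [-1.0, -3.0])  # x decreased by |y|
```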
Exercise 8.10. Let a be a unit vector in R^3. Define the map L : R^3 → R^3 by the
formula Lx = (x · a)a for all x ∈ R^3.
(a) Show that L is linear.
(b) Express L as multiplication by a matrix.
(c) Describe L geometrically.
Answer. First, observe that
L(x + y) = ((x + y) · a)a = (x · a + y · a)a = (x · a)a + (y · a)a = Lx + Ly,
for all x, y ∈ R^3. Moreover,
L(λx) = ((λx) · a)a = (λ(x · a))a = λ((x · a)a) = λLx,
for all x ∈ R^3 and λ ∈ R. Hence L is a linear map.
It is easy to check that Lej = aj a, and hence Lv = Av, where
    [ a1a1 a2a1 a3a1 ]
A = [ a1a2 a2a2 a3a2 ] .
    [ a1a3 a2a3 a3a3 ]
Geometrically, Lv is the projection of v onto a.
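In matrix language, A is the outer product of a with itself. A sketch (with a hypothetical sample unit vector) verifying the formula Lx = (x · a)a and the projection property A² = A:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0]) / 3.0     # a unit vector
assert np.isclose(np.linalg.norm(a), 1.0)

A = np.outer(a, a)                      # entry (i, j) is a_j * a_i
x = np.array([4.0, -1.0, 0.5])

assert np.allclose(A @ x, (x @ a) * a)  # Lx = (x . a) a
assert np.allclose(A @ A, A)            # projecting twice changes nothing
```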
Exercise 8.11. Let a be a unit vector in R^3. Define the map X : R^3 → R^3 by the
formula
Xv = a × v
for all v ∈ R^3.
(a) Show that X is linear.
(b) Express X as multiplication by a matrix.
(c) Describe X geometrically.
Answer. First, observe that
X(x + y) = a × (x + y) = a × x + a × y = Xx + Xy,
for all x, y ∈ R^3. Moreover,
X(λx) = a × (λx) = λ(a × x) = λXx,
for all x ∈ R^3 and λ ∈ R. Hence X is a linear map.
It is easy to check that Xv = Av, where
    [  0  -a3  a2 ]
A = [  a3  0  -a1 ] .
    [ -a2  a1  0  ]
Geometrically, if v is parallel to a, then Xv = 0, while if v is perpendicular to a, then
Xv is obtained by rotating v through π/2 around the a axis.
So X corresponds to projection onto the plane {v ∈ R^3 : v · a = 0} followed by a
rotation through π/2 around the a axis.
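The matrix A here is the standard skew-symmetric cross-product matrix. A sketch checking it against numpy's cross product, for the sample choice a = (0, 0, 1)^T:

```python
import numpy as np

a = np.array([0.0, 0.0, 1.0])  # unit vector along the z-axis

A = np.array([[    0, -a[2],  a[1]],
              [ a[2],     0, -a[0]],
              [-a[1],  a[0],     0]])

v = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ v, np.cross(a, v))
assert np.allclose(A @ a, 0)   # vectors parallel to a are sent to 0
```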
LECTURE 9
More on linear maps
We have defined linear maps, and seen some examples. In this lecture, we see more
examples and properties of linear maps.
1. Examples of linear maps
Exercise 9.1. Define the function S : R^2 → R^2 by
S(x, y)^T = (x^2 - y^2, 2xy)^T
for all (x, y)^T ∈ R^2. Is S linear?
Answer. Observe that
S(λ(x, y)^T) = S(λx, λy)^T = (λ^2 x^2 - λ^2 y^2, 2λ^2 xy)^T = λ^2 (x^2 - y^2, 2xy)^T = λ^2 S(x, y)^T.
Since λ ≠ λ^2 (except when λ = 0 or λ = 1),
S(λ(x, y)^T) ≠ λS(x, y)^T,
and S is not linear.
Exercise 9.2. Define the function I : C[R] → C[R] by
If(x) = ∫_0^x f(t) dt
for all x ∈ R. Is I linear?
Answer. Take continuous functions f and g on R and λ ∈ R. First,
I(f + g)(x) = ∫_0^x (f(t) + g(t)) dt = ∫_0^x f(t) dt + ∫_0^x g(t) dt = If(x) + Ig(x)
for all x ∈ R, that is, I(f + g) = If + Ig. Next,
I(λf)(x) = ∫_0^x λf(t) dt = λ ∫_0^x f(t) dt = λIf(x)
for all x ∈ R, that is, I(λf) = λ(If).
Hence I is linear.
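The same linearity can be seen in a discrete approximation of I: any fixed quadrature rule is itself linear in the integrand. A sketch with hypothetical sample functions, using a left-endpoint Riemann sum:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 101)
f = np.sin(x)
g = x ** 2
lam = 2.5

def I(h):
    """Left-endpoint Riemann approximation of x |-> integral_0^x h(t) dt."""
    dx = x[1] - x[0]
    return np.concatenate([[0.0], np.cumsum(h[:-1]) * dx])

# Linearity holds exactly for the discrete operator, not just approximately.
assert np.allclose(I(f + g), I(f) + I(g))
assert np.allclose(I(lam * f), lam * I(f))
```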
2. Algebra of linear maps
We gather together a number of useful facts about linear maps.
Lemma 9.3. Suppose that V and W are vector spaces and T : V → W is a linear map.
Then
T(0) = 0
T(-u) = -T(u)
for all u ∈ V.
Proof. For all vectors u in V and all scalars λ, we know that T(λu) = λT(u). Take λ
equal to 0 to prove the first identity, and λ equal to -1 to prove the second. □
This means that if T : V → W is a function between vector spaces, and T(0) ≠ 0 or
T(-v) ≠ -T(v) for just one vector v, then T is not linear.
We have already used the following results implicitly.
Lemma 9.4. Suppose that V and W are vector spaces and T : V → W is a function.
Then T is a linear transformation if and only if
T(λu + μv) = λT(u) + μT(v) (9.1)
for all u, v ∈ V and all scalars λ, μ.
Proof. If T is linear, then
T(λu + μv) = T(λu) + T(μv) = λT(u) + μT(v).
Conversely, if (9.1) holds, then taking λ = μ = 1 shows that
T(u + v) = T(u) + T(v)
and taking μ = 0 shows that
T(λu) = λT(u),
so T is linear. □
Corollary 9.5. Suppose that V and W are vector spaces and T : V → W is a linear
map. Then for all finite linear combinations ∑_{j=1}^n λj vj in V,
T( ∑_{j=1}^n λj vj ) = ∑_{j=1}^n λj T(vj).
Proof. This is proved by induction on n: the lemma above shows that the result
holds when n = 2, and if k ≥ 2 and
T( ∑_{j=1}^k λj vj ) = ∑_{j=1}^k λj T(vj),
then
T( ∑_{j=1}^{k+1} λj vj ) = T( ∑_{j=1}^k λj vj + λ_{k+1} v_{k+1} )
= T( ∑_{j=1}^k λj vj ) + T(λ_{k+1} v_{k+1})
= ∑_{j=1}^k λj T(vj) + λ_{k+1} T(v_{k+1})
= ∑_{j=1}^{k+1} λj T(vj).
The first equality holds by definition, the second because T is linear, the third by the
inductive hypothesis and the linearity of T, and the fourth by definition. It follows that
the result holds for all integers n ≥ 2 by induction. □
3. More on the geometry of linear maps
Exercise 9.6. Let D denote the diagonal n × n matrix
[ d1 0  . . . 0  ]
[ 0  d2 . . . 0  ]
[ .  .  . . . .  ]
[ 0  0  . . . dn ] .
Show that multiplication by D is linear. What is the geometric effect of multiplying by
the matrix D?
Answer. We omit the proof that multiplication by D is linear.
Multiplication by D changes the kth component of a vector by a factor of dk. If |dk| < 1, this
gives a compression; if |dk| > 1, this gives an expansion. If dk < 0, there is also a change
of orientation.
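A sketch of these three effects with a hypothetical diagonal matrix:

```python
import numpy as np

D = np.diag([0.5, 3.0, -1.0])  # compress, expand, flip orientation
v = np.array([2.0, 2.0, 2.0])

assert np.allclose(D @ v, [1.0, 6.0, -2.0])
# The sign of the determinant records the change of orientation.
assert np.linalg.det(D) < 0
```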
4. Images and kernels
Consider the linear system
a11 x1 + a12 x2 + ⋯ + a1m xm = b1
. . .
an1 x1 + an2 x2 + ⋯ + anm xm = bn,
or equivalently, in matrix form,
Ax = b.
What b can we solve this for? Is the solution unique?
We have seen that we can solve the equation if and only if b is a linear combination of
the columns of A; we write b ∈ col(A) for short. We also know that, if xpart is a particular
solution of Ax = b, then every solution is of the form xpart + xhom, where xhom is a solution
of the homogeneous equation
Ax = 0.
If Ax = b has any solutions, then it has as many solutions as the homogeneous equation.
Next, consider the integral equation
Iu = b,
where I is the integration operator introduced earlier, b is a known function and u is an
unknown continuous function. What b can we solve this for? Is the solution unique?
It is actually quite hard to answer the first question. But if u is a continuous function,
then Iu(0) = 0, so a necessary condition to be able to solve this problem is that b(0) = 0,
and certainly we cannot solve this for all functions b. It is convenient to have a notation
for the functions for which we can solve this equation. The set of vectors {If : f ∈ C[R]}
is called the image of I, and written image(I).
For the second question, we can show that, if upart is a particular solution of Iu = b,
then every solution of the equation is of the form upart + uhom, where uhom is a solution of
the homogeneous equation
Iu = 0.
If Iu = b has any solutions, then it has as many solutions as the homogeneous equation.
We unify these (and other examples) in the following definitions.
Definition 9.7. Suppose that V and W are vector spaces and T : V → W is a linear
map. The set of vectors {Tv : v ∈ V} is called the image or range of T, and written
image(T) or range(T) or T(V). In symbols,
image(T) = {Tv : v ∈ V}.
An equivalent form of the definition is that image(T) is the collection of vectors w in
W for which the equation Tx = w can be solved.
Let xpart be a particular solution of the equation Tx = w, and let xhom be any solution
of the homogeneous equation Txhom = 0. Then
T(xpart + xhom) = T(xpart) + T(xhom) = w + 0 = w,
so that xpart + xhom is also a solution of Tx = w. Further, every solution of Tx = w is of
this form. Indeed, if Tx = w and Txpart = w, then
T(x - xpart) = T(x) - T(xpart) = w - w = 0,
so x - xpart is a solution of the homogeneous equation, and x = xpart + (x - xpart).
The solutions to the homogeneous equation Tx = 0 are important in the discussion
above, and we give them a name.
Definition 9.8. Suppose that V and W are vector spaces and T : V → W is a linear
map. The kernel of T, written ker(T), also known as the null space of T, is the subset of
V of all vectors x such that Tx = 0. In symbols,
ker(T) = {x ∈ V : Tx = 0}.
Theorem 9.9. Suppose that V and W are vector spaces and T : V → W is a linear
map. Then ker(T) and image(T) are subspaces.
Proof. First we show that ker(T) is a subspace. We have just seen that T0 = 0, so
ker(T) is not empty. Next, if v, w ∈ ker(T), then
T(v + w) = Tv + Tw = 0 + 0 = 0,
so v + w ∈ ker(T). Thus ker(T) is closed under vector addition. Further, if λ is a scalar
and v ∈ ker(T), then
T(λv) = λ(Tv) = λ0 = 0,
so λv ∈ ker(T). Thus ker(T) is closed under scalar multiplication. By the Subspace
Theorem, ker(T) is a subspace.
Now we show that image(T) is a subspace. We have just seen that T0 = 0, so image(T)
is not empty. Next, if x, y ∈ image(T), then there are vectors u and v in V such that
x = Tu and y = Tv. Then
x + y = Tu + Tv = T(u + v),
and x + y ∈ image(T) since u + v ∈ V. Thus image(T) is closed under vector addition.
Further, if λ is a scalar and x ∈ image(T), then there is a vector u in V such that x = Tu,
whence
λx = λ(Tu) = T(λu),
and λx ∈ image(T) since λu ∈ V. Thus image(T) is closed under scalar multiplication.
By the Subspace Theorem, image(T) is a subspace. □
The spaces ker(T) and image(T) are vector spaces, and have dimensions. These tell us
something about T, and this is what we will investigate next.
LECTURE 10
Rank and nullity
1. Definitions and properties
Definition 10.1. Suppose that V and W are vector spaces and T : V → W is a linear
map. The nullity of T, written nullity(T), is the dimension of ker(T). The rank of T,
written rank(T), is the dimension of image(T). The co-rank of T, written co-rank(T), is
the number dim(W) - dim(image(T)).
Observe that the nullity of T is equal to the number of parameters in the general
solution of Tx = w. Observe also that the rank determines the co-rank, and vice versa.
To any matrix A, we have associated a linear map TA, namely, multiplication by A. It
is convenient to use the expressions ker(A), nullity(A), image(A), rank(A) and co-rank(A)
to mean ker(TA), nullity(TA), image(TA), rank(TA) and co-rank(TA).
However, this also applies to other kinds of vectors. For example, the general solution
of
(d^2/dx^2) u(x) + u(x) = x
is
u(x) = x + A sin x + B cos x.
This has two parameters, and the nullity of the linear differential operator T, given by
Tf(x) = f''(x) + f(x), is 2.
Proposition 10.2. Suppose that T : V → W is a linear map. Then
(i) T is one-to-one if and only if nullity(T) = 0.
(ii) T is onto if and only if rank(T) = dim(W), that is, if and only if co-rank(T) = 0.
Proof. First, we consider when T is one-to-one. On the one hand, suppose that
nullity(T) = 0, so ker(T) = {0}. By linearity, if x, y ∈ V and T(x) = T(y), then
T(x - y) = 0, and so x - y ∈ ker(T) = {0}. Thus x - y = 0, that is, x = y. Thus T is
one-to-one.
On the other hand, suppose that nullity(T) ≠ 0, so that ker(T) ≠ {0}. Take z ∈
ker(T) such that z ≠ 0. Then T(z) = 0 = T(0), and T is not one-to-one.
Now we consider when T is onto. On the one hand, suppose that rank(T) = dim(W) =
n, say. Now image(T) ⊆ W, and image(T) has a basis, {w1, . . . , wn}. This is a linearly
independent set in W of maximal size, since dim(W) = n, and hence is a basis of W.
Thus span{w1, . . . , wn} = image(T) and span{w1, . . . , wn} = W, so W = image(T). This
means that for any w ∈ W, we can find v ∈ V such that Tv = w, and T is onto.
On the other hand, suppose that rank(T) < dim(W). Then a basis for image(T)
is smaller than a basis for W. In particular, a basis {w1, . . . , wn} for image(T) can be
enlarged to form a basis {w1, . . . , wn, w_{n+1}, . . . } for W. A basis is linearly independent,
so w_{n+1} is not in the span of {w1, . . . , wn}. Thus w_{n+1} is not in image(T), and T is not
onto. □
Theorem 10.3. If U is a subspace of W, then the smallest number of linear equations
needed to describe U is dim(W) - dim(U).
We omit the proof of this result. It implies that the co-rank of a linear transformation
T : R^m → R^n is the number of linear equations needed to describe the image of T.
Exercise 10.4. Consider the line in R^3 with parametric equation x = λd, where
d = (2, 0, 1)^T. This line is a 1-dimensional subspace. It may also be described by the
equations y = 0 and x = 2z. It is of codimension 2.
Challenge Problem 10.5. Find equations which define span{ (1, 2, 0, 1)^T, (1, 0, 2, 0)^T }
in R^4. What is the minimal number of linear equations needed to do this?
2. Examples
Exercise 10.6. Consider D : Pn(R) → P_{n-1}(R), given by
D(an t^n + ⋯ + a0) = n an t^{n-1} + ⋯ + 2 a2 t + a1
(that is, D corresponds to differentiation). Find nullity(D) and rank(D).
Answer. First of all, take p(t) ∈ Pn(R) given by p(t) = an t^n + ⋯ + a0. If D(p)(t) = 0
(for all t), then n an t^{n-1} + ⋯ + 2 a2 t + a1 = 0, and so an, a_{n-1}, . . . , a1 are all 0. However a0
can be arbitrary. Thus ker(D) is the set of constant polynomials, which is of dimension 1.
Hence nullity(D) = 1.
It is easy to see that {1, t, t^2, . . . , t^{n-1}} is a basis for P_{n-1}(R), whence dim(P_{n-1}(R)) = n.
Suppose that q(t) ∈ P_{n-1}(R) is given by q(t) = b_{n-1} t^{n-1} + ⋯ + b1 t + b0. We choose
the coefficients an, a_{n-1}, . . . , a1 as follows:
an = b_{n-1}/n,   a_{n-1} = b_{n-2}/(n - 1),   . . . ,   a2 = b1/2,   a1 = b0,
and define p(t) = an t^n + ⋯ + a0. It is easy to check that Dp(t) = q(t) for all real t. Since
q is an arbitrary element of P_{n-1}(R), it follows that D is onto, image(D) = P_{n-1}(R), and
rank(D) = dim(P_{n-1}(R)) = n.
Observe that rank(D) + nullity(D) = n + 1 = dim(Pn(R)).
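This can be spot-checked for a particular n. The sketch below represents differentiation, in the monomial bases, as an n × (n+1) coefficient matrix (for the sample value n = 4) and computes its rank and nullity numerically:

```python
import numpy as np

n = 4
# Column k (for t^k) holds the coefficients of D(t^k) = k t^(k-1),
# written in the basis {1, t, ..., t^(n-1)}.
D = np.zeros((n, n + 1))
for k in range(1, n + 1):
    D[k - 1, k] = k

rank = np.linalg.matrix_rank(D)
nullity = (n + 1) - rank  # rank-nullity for matrices

assert rank == n      # D is onto P_{n-1}(R)
assert nullity == 1   # the constants form the kernel
```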
Exercise 10.7. Find the nullity of the matrix
[ 1  2 3 4 ]
[ 1  0 4 2 ]
[ 1 -1 0 0 ] .
Answer. Reduce to row-echelon form. The reduced matrix is of the form
[ 1  2  3    4  ]
[ 0 -2  1   -2  ]
[ 0  0 -9/2 -1  ] .
Then the rank of the matrix is 3; the nullity is 1.
Note that we do not have to find the kernel explicitly to show that it is 1-dimensional.
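This is easy to confirm numerically; the sketch below uses the matrix with rows (1, 2, 3, 4), (1, 0, 4, 2) and (1, -1, 0, 0):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0, 4.0],
              [1.0, 0.0, 4.0, 2.0],
              [1.0, -1.0, 0.0, 0.0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank  # rank + nullity = number of columns

assert rank == 3
assert nullity == 1
```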
Exercise 10.8. Find a basis for the kernel of the matrix
[ 1  2 3 4 ]
[ 1  0 4 2 ]
[ 1 -1 0 0 ] .
Answer. We need to find x1, . . . , x4 such that
[ 1  2 3 4 ] [ x1 ]
[ 1  0 4 2 ] [ x2 ] = 0,
[ 1 -1 0 0 ] [ x3 ]
             [ x4 ]
that is, to find the solutions of the system represented by the augmented matrix
[ 1  2 3 4 | 0 ]
[ 1  0 4 2 | 0 ]
[ 1 -1 0 0 | 0 ] .
Row-reduced, this is of the form
[ 1  2  3    4  | 0 ]
[ 0 -2  1   -2  | 0 ]
[ 0  0 -9/2 -1  | 0 ] .
The solution space has the parametric description
x = t (10, 10, 2, -9)^T,
where t ∈ R, and is 1-dimensional. Then {(10, 10, 2, -9)^T} is a basis for the kernel of
the matrix.
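One can check a kernel vector directly; the sketch below writes out the matrix and candidate vector explicitly:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0, 4.0],
              [1.0, 0.0, 4.0, 2.0],
              [1.0, -1.0, 0.0, 0.0]])
v = np.array([10.0, 10.0, 2.0, -9.0])

assert np.allclose(A @ v, 0)           # v lies in the kernel
assert np.linalg.matrix_rank(A) == 3   # so the kernel is 1-dimensional
```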
Exercise 10.9. Find a basis for the image of the matrix
[ 1  2 3 4 ]
[ 1  0 4 2 ]
[ 1 -1 0 0 ] .
Answer. We row-reduce this matrix, and get
[ 1  2  3    4  ]
[ 0 -2  1   -2  ]
[ 0  0 -9/2 -1  ] .
Thus the first three columns are linearly independent, and the fourth column depends
linearly on these. It follows that the vectors (1, 1, 1)^T, (2, 0, -1)^T and (3, 4, 0)^T are linearly
independent; since R^3 is 3-dimensional, they must form a basis.
Of course, other sets of three vectors, such as {(1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T},
are also bases for R^3.
Exercise 10.10. Define T : P3(R) → R^4 by
T(a3 x^3 + a2 x^2 + a1 x + a0) = (a0, a1, a2, a3)^T.
Show that T is a linear mapping, and find its rank and nullity.
Answer. Suppose that
p(x) = a3 x^3 + a2 x^2 + a1 x + a0
q(x) = b3 x^3 + b2 x^2 + b1 x + b0.
Then
T(p(x) + q(x)) = T((a3 + b3)x^3 + (a2 + b2)x^2 + (a1 + b1)x + (a0 + b0))
= (a0 + b0, a1 + b1, a2 + b2, a3 + b3)^T
= (a0, a1, a2, a3)^T + (b0, b1, b2, b3)^T
= T(p(x)) + T(q(x)),
and further
T(λp(x)) = T(λa3 x^3 + λa2 x^2 + λa1 x + λa0)
= (λa0, λa1, λa2, λa3)^T
= λ(a0, a1, a2, a3)^T
= λT(p(x)),
so T is linear.
Alternatively, by Lemma 9.4 it suffices to write
T(λp(x) + μq(x)) = T((λa3 + μb3)x^3 + (λa2 + μb2)x^2 + (λa1 + μb1)x + (λa0 + μb0))
= (λa0 + μb0, λa1 + μb1, λa2 + μb2, λa3 + μb3)^T
= λ(a0, a1, a2, a3)^T + μ(b0, b1, b2, b3)^T
= λT(p(x)) + μT(q(x)).
Finally, T sends a polynomial to 0 only if all of its coefficients are 0, so ker(T) = {0} and
nullity(T) = 0; and every vector (a0, a1, a2, a3)^T in R^4 is attained, so rank(T) = 4.
Exercise 10.11. Define T : Pn(R) → Pn(R) by
T(p(x)) = x^2 (d^2 p(x)/dx^2) + 4x (dp(x)/dx) - 4p(x).
Find rank(T) and nullity(T).
Answer. Suppose that p(x) = x^k, where 0 ≤ k ≤ n. Then
T(p(x)) = x^2 (d^2 x^k/dx^2) + 4x (d x^k/dx) - 4x^k = x^2 k(k - 1)x^{k-2} + 4x k x^{k-1} - 4x^k
= [k(k - 1) + 4k - 4] x^k = [k^2 + 3k - 4] x^k
= tk x^k,
say. Note that y^2 + 3y - 4 = 0 if and only if y = -4 or 1, so that tk = 0 when k = 1, but
not for other nonnegative integers k.
Now suppose that
p(x) = an x^n + a_{n-1} x^{n-1} + ⋯ + a1 x + a0.
Then
T(p(x)) = an tn x^n + a_{n-1} t_{n-1} x^{n-1} + ⋯ + a1 t1 x + a0 t0.
If T(p(x)) = 0, then an tn = a_{n-1} t_{n-1} = ⋯ = a1 t1 = a0 t0 = 0, and so an = a_{n-1} = ⋯ = a2 =
a0 = 0, while a1 can be arbitrary. Thus ker(T) = {cx : c ∈ R}, and nullity(T) = 1.
Given any q ∈ Pn(R), say
q(x) = bn x^n + b_{n-1} x^{n-1} + ⋯ + b1 x + b0,
then we can try to solve T(p(x)) = q(x) by taking
an = bn/tn,   a_{n-1} = b_{n-1}/t_{n-1},   . . . ,   a0 = b0/t0.
Of course, this is a problem for the coefficient of x, since t1 = 0, but fine otherwise. Hence
image(T) = {bn x^n + b_{n-1} x^{n-1} + ⋯ + b1 x + b0 : b1 = 0},
and rank(T) = n.
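The multipliers tk = k^2 + 3k - 4 can be tabulated directly; only t1 vanishes. A sketch for the sample range k = 0, . . . , 6:

```python
# t_k is the factor by which T scales the monomial x^k.
t = [k * (k - 1) + 4 * k - 4 for k in range(7)]

assert t == [k ** 2 + 3 * k - 4 for k in range(7)]
assert t[1] == 0                                   # x lies in the kernel
assert all(t[k] != 0 for k in range(7) if k != 1)  # no other monomial does
```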
LECTURE 11
The rank-nullity theorem
In this lecture, we prove the rank-nullity theorem for matrices and general linear maps.
And we show that the set of linear maps from a vector space V to a vector space W is
itself a vector space, and that the set of invertible linear maps on a vector space V is a
group.
1. Representing general linear maps by matrices
It is easier to calculate with matrices than with general linear maps. Now we show how
to represent a general linear map as a matrix.
Theorem 11.1. Suppose that V and W are vector spaces with bases {v1, . . . , vm} = A
and {w1, . . . , wn} = B respectively, and suppose that T : V → W is a linear map. Let tj
denote the vector Tvj, let [tj]_B denote its coordinates relative to the basis B, and let A denote
the matrix whose jth column is [tj]_B, where j = 1, . . . , m. Then for all x in V,
[Tx]_B = A[x]_A.
Proof. We start by clarifying the definitions: for j = 1, . . . , m, we may write tj as a
linear combination of the vectors w1, . . . , wn, that is,
tj = a_{1j} w1 + ⋯ + a_{nj} wn = ∑_{k=1}^n a_{kj} wk,
where the a_{kj} are scalars. By definition, the scalars a_{1j}, . . . , a_{nj} are the coordinates of tj
relative to the ordered basis B, that is, [tj]_B = (a_{1j}, . . . , a_{nj})^T. Hence the matrix A has
entries a_{kj}.
For x in V, we may write
x = x1 v1 + ⋯ + xm vm = ∑_{j=1}^m xj vj,
and then, by definition, (x1, . . . , xm)^T = [x]_A. It follows that
Tx = T( ∑_{j=1}^m xj vj ) = ∑_{j=1}^m xj T(vj) = ∑_{j=1}^m xj tj
= ∑_{j=1}^m xj ∑_{k=1}^n a_{kj} wk = ∑_{k=1}^n ( ∑_{j=1}^m a_{kj} xj ) wk = ∑_{k=1}^n (A[x]_A)_k wk,
so
[Tx]_B = A(x1, x2, . . . , xm)^T = A[x]_A,
as required. □
Exercise 11.2. Consider the differential operator D : Pn(R) → Pn(R) given by
Dp(x) = x (d^2/dx^2) p(x) + 3p(x).
Represent D as a matrix, using the basis {1, x, . . . , x^n} for Pn(R).
Answer. Observe that
Dx^k = x k(k - 1) x^{k-2} + 3x^k = k(k - 1) x^{k-1} + 3x^k.
It follows that, if p(x) = a0 x^0 + a1 x^1 + ⋯ + an x^n, then Dp(x) = b0 x^0 + b1 x^1 + ⋯ + bn x^n,
where b = Aa, and A is the matrix
[ 3 0 0 0 0  . . . ]
[ 0 3 2 0 0  . . . ]
[ 0 0 3 6 0  . . . ]
[ 0 0 0 3 12 . . . ]
[ 0 0 0 0 3  . . . ]
[ . . . . .  . . . ] .
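A sketch building this matrix for a sample value of n and applying it to a sample polynomial:

```python
import numpy as np

n = 4
A = np.zeros((n + 1, n + 1))
for k in range(n + 1):
    A[k, k] = 3.0                   # the 3 p(x) term
    if k + 1 <= n:
        A[k, k + 1] = (k + 1) * k   # x (x^(k+1))'' contributes (k+1)k x^k

# D applied to p(x) = x^2: x * 2 + 3x^2 = 2x + 3x^2.
a = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # coefficients, lowest degree first
b = A @ a
assert np.allclose(b, [0.0, 2.0, 3.0, 0.0, 0.0])
```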
A number of problems about linear equations can be reduced to problems about matrices
in this way.
2. The rank-nullity theorem
Theorem 11.3 (The rank-nullity theorem for matrices). Suppose that A is an n × m
matrix. Then
rank(A) + nullity(A) = m.
Proof. Suppose that A is row-reduced to row-echelon form. Then the columns of A
that correspond to the leading columns of the reduced matrix form a basis for range(A),
hence rank(A) is equal to the number of leading columns. The nonleading columns of the
reduced matrix correspond to the parameters of the general solution of Ax = 0, that is,
nullity(A) is equal to the number of nonleading columns. These numbers add to give the
total number of columns, that is, m. □
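The theorem is easy to spot-check numerically; the sketch below uses a random matrix of arbitrary shape, counting the kernel dimension via the singular value decomposition:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # a random matrix with 5 columns

rank = np.linalg.matrix_rank(A)

# Singular vectors with (numerically) zero singular value span ker(A),
# so the nullity is the column count minus the number of nonzero
# singular values.
s = np.linalg.svd(A, compute_uv=False)
nullity = A.shape[1] - int(np.sum(s > 1e-10))

assert rank + nullity == A.shape[1]   # rank + nullity = number of columns
```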
Corollary 11.4. Suppose that T : V → W is a linear map between (finite-dimensional)
vector spaces. Then rank(T) + nullity(T) = dim(V).
Proof. We have just seen that we may represent T by a matrix, and we may apply
the rank-nullity theorem for matrices to this matrix.
Alternatively, here is a more descriptive proof. Take a basis {v1, . . . , vk} for the kernel
of T and enlarge it to a basis {v1, . . . , vm} for V. We now claim that the m - k vectors
T(v_{k+1}), . . . , T(vm) are linearly independent and span image(T), so form a basis for this
space. It then follows that dim(image(T)) = m - k, which is essentially the desired result.
To see that the vectors T(v_{k+1}), . . . , T(vm) are linearly independent is quite easy: if
λ_{k+1} T(v_{k+1}) + ⋯ + λm T(vm) = 0,
then
T(λ_{k+1} v_{k+1} + ⋯ + λm vm) = 0,
that is,
λ_{k+1} v_{k+1} + ⋯ + λm vm ∈ ker(T).
Since v1, . . . , vk span ker(T), there exist λ1, . . . , λk such that
-(λ_{k+1} v_{k+1} + ⋯ + λm vm) = λ1 v1 + ⋯ + λk vk,
and hence
λ1 v1 + ⋯ + λk vk + λ_{k+1} v_{k+1} + ⋯ + λm vm = 0;
since the vectors v1, . . . , vm are linearly independent, the λj are all 0 when 1 ≤ j ≤ m.
To prove that the vectors T(v_{k+1}), . . . , T(vm) span image(T) is shorter and easier:
if y ∈ image(T), then there exists x ∈ V such that y = T(x); we write x as a linear
combination ∑_{j=1}^m xj vj, and then y is the corresponding linear combination:
y = T(x) = T( ∑_{j=1}^m xj vj ) = ∑_{j=1}^m xj T(vj) = ∑_{j=k+1}^m xj T(vj),
since T(vj) = 0 when 1 ≤ j ≤ k. Hence y ∈ span{T(v_{k+1}), . . . , T(vm)}. □
Corollary 11.5. If A is a square matrix, then nullity(A) = co-rank(A). Hence the
linear maps associated to square matrices are one-to-one if and only if they are onto.
Proof. We saw that a linear map is one-to-one if and only if its nullity is 0 and that
it is onto if and only if its co-rank is 0. The rank-nullity theorem implies that the nullity
and co-rank of a square matrix are equal. □
3. The algebra of linear maps
We discuss briefly how to get new linear maps from old.
Suppose that S and T are linear maps from V to W and λ and μ are scalars. We may
define a map λS + μT from V to W by
(λS + μT)(v) = λS(v) + μT(v)   for all v ∈ V.
Lemma 11.6. Suppose that S and T are linear maps from V to W and λ and μ are
scalars. Then λS + μT is a linear map.
Proof. By definition, if u, v ∈ V and α and β are scalars, then
(λS + μT)(αu + βv) = λS(αu + βv) + μT(αu + βv)
= λ(αS(u) + βS(v)) + μ(αT(u) + βT(v))
= αλS(u) + βλS(v) + αμT(u) + βμT(v)
= α(λS(u) + μT(u)) + β(λS(v) + μT(v))
= α(λS + μT)(u) + β(λS + μT)(v),
and so λS + μT is linear, by Lemma 9.4. □
We may also compose linear maps: if S : U → V and T : V → W are linear maps,
then we define T∘S : U → W by T∘S(u) = T(S(u)) for all u ∈ U.
Lemma 11.7. Suppose that S : U → V and T : V → W are linear maps. Then
T∘S : U → W is a linear map.
Proof. We leave this proof as an exercise. □
Exercise 11.8. Suppose that S : U → V and T : V → W are linear maps. How big,
in terms of the quantities nullity(S) and nullity(T), can nullity(T∘S) be?
Answer. First of all, ker(S) ⊆ ker(T∘S) ⊆ U. We take a basis for ker(S), and enlarge
it, first to a basis A of ker(T∘S), and then to a basis B of U. There are nullity(S) vectors in
the basis for ker(S). Any basis vector v in B that is part of the basis for ker(T∘S) but not
of the basis for ker(S) has the property that S(v) ≠ 0, but T(S(v)) = 0, so S(v) ∈ ker(T).
Such vectors S(v) are linearly independent in ker(T), so there are at most nullity(T) of
them.
In total, there are at least nullity(S) vectors in A, and at most nullity(S) + nullity(T)
vectors.
We conclude that
nullity(S) ≤ nullity(T∘S) ≤ nullity(S) + nullity(T).
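Both bounds are attained; a sketch with small matrices chosen by hand for illustration:

```python
import numpy as np

def nullity(M):
    return M.shape[1] - np.linalg.matrix_rank(M)

S = np.array([[1.0, 0.0],
              [0.0, 0.0]])   # nullity 1: kills the second coordinate
T1 = np.array([[1.0, 0.0],
               [0.0, 1.0]])  # invertible: nullity 0
T2 = np.array([[0.0, 0.0],
               [0.0, 1.0]])  # nullity 1, and it kills the image of S

# Lower bound attained: nullity(T1 S) = nullity(S).
assert nullity(T1 @ S) == nullity(S)
# Upper bound attained: nullity(T2 S) = nullity(S) + nullity(T2).
assert nullity(T2 @ S) == nullity(S) + nullity(T2)
```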
If T : V → V is a linear map on a vector space V, then T has an inverse T^{-1}, which is
a function on V, precisely when T is one-to-one and onto. This is when nullity(T) = 0 and
rank(T) = dim(V). By the rank-nullity theorem, it suffices to suppose that nullity(T) = 0
or that rank(T) = dim(V).
Lemma 11.9. If T : V → V is an invertible linear map, then T^{-1} is a linear map.
Proof. If y1, y2 ∈ V, then there exist x1, x2 ∈ V such that T(x1) = y1 and T(x2) = y2.
Then x1 = T^{-1}(y1) and x2 = T^{-1}(y2).
Further,
T(λ1 x1 + λ2 x2) = λ1 T(x1) + λ2 T(x2) = λ1 y1 + λ2 y2.
Hence
T^{-1}(λ1 y1 + λ2 y2) = T^{-1}T(λ1 x1 + λ2 x2) = λ1 x1 + λ2 x2 = λ1 T^{-1}(y1) + λ2 T^{-1}(y2),
which shows that T^{-1} is linear. □
Of course, T∘T^{-1} = I, the identity map, and I is linear.
Invertible linear maps are important in both pure and applied mathematics. Pure
mathematicians study the group GL(V) of invertible linear maps of a vector space, and
find that the properties of this group answer basic questions in number theory and in the
construction of efficient networks.
Exercise 11.10. Show that GL(V) has the following properties:
(a) GL(V) is closed under composition, that is, if S, T ∈ GL(V), then S∘T ∈ GL(V).
(b) GL(V) has an identity I, such that I∘T = T∘I = T for all T ∈ GL(V).
(c) GL(V) has inverses: for all T ∈ GL(V), there exists T^{-1} ∈ GL(V) such that
T∘T^{-1} = T^{-1}∘T = I.
(d) Composition in GL(V) is associative, that is,
(R∘S)∘T = R∘(S∘T)
for all R, S, T ∈ GL(V).
Engineers describe states of robotic systems by vectors, and are interested in invertible
linear maps of these systems. For instance, a television camera is mounted on a base which
allows it to swivel up and down in a vertical plane, while the base can rotate in a circle in
a horizontal plane. To get the camera pointing in a given direction, one has to both swivel
the camera and rotate the base. How much of each basic motion is needed to effect a given
three-dimensional motion? This is quite an easy question, but it does not take long to get
to rather difficult questions of this kind in robotics.