Chapter 2: Linear Transformations
TRANSCRIPT
8/2/2019 Chapter 2 Linear Transformations
LECTURE 8
Linear maps
In this lecture, we introduce the idea of linear maps, also known as linear mappings or
linear transformations, with examples. The aim of our study of linear maps is two-fold: to
understand linear maps in R, R^2 and R^3, and to bring this understanding to bear on more
complex examples.
1. Defining linear maps
Linear maps are (mathematical abstractions of) very common types of function.
Exercise 8.1. Consider the function R : R^2 → R^2 that rotates a vector v anti-clockwise
around the origin through an angle θ to give a vector Rv. Let v, w ∈ R^2 and λ ∈ R.
Find R(v + w) in terms of Rv and Rw, and find R(λv) in terms of Rv.
Answer. When we rotate a line, we get a line, and when we rotate parallel lines, we
get parallel lines. So when we rotate a parallelogram, we get a parallelogram.
Consider the parallelogram with vertices 0, v, w and v + w. When we rotate this
parallelogram about the origin through the angle θ, we obtain a new parallelogram, with
vertices R0, Rv, Rw and R(v + w). Note that R0 = 0.
By the parallelogram law for adding vectors, the fourth vertex of the parallelogram
with vertices 0, Rv and Rw is Rv + Rw. Hence
R(v + w) = Rv + Rw.
[Figure 8.1. Rotating sums of vectors: the parallelogram with vertices 0, v, w and v + w
is rotated to the parallelogram with vertices R0, Rv, Rw and R(v + w).]
Next, if λ > 0, then multiplication by λ is a dilation about the origin. We get the same
result when we dilate first, then rotate, as when we rotate, then dilate. This means that
R(λv) = λ(Rv).
If we consider reflections in the origin as well, which correspond to multiplication by -1,
then we get this formula for all scalars λ.
Exercise 8.2. The simple harmonic oscillator is the quantum mechanical version of a
mass oscillating in simple harmonic motion on a spring. The operator H is a function
from functions to functions. Given a function f : R → R, we define the new function
Hf : R → R by the rule
(Hf)(x) = -(d^2/dx^2)f(x) + x^2 f(x)   for all x ∈ R.
For instance, if f(x) = e^{-x^2/2}, then Hf(x) = e^{-x^2/2}. Find H(f + g) in terms of Hf and
Hg, and find H(λf) in terms of Hf.
Answer. It is easy to see that
H(f + g)(x) = -(d^2/dx^2)(f + g)(x) + x^2 (f + g)(x)
= (-(d^2/dx^2)f(x) + x^2 f(x)) + (-(d^2/dx^2)g(x) + x^2 g(x))
= Hf(x) + Hg(x),
that is, H(f + g) = Hf + Hg.
Similarly, H(λf) = λHf.
The functions R and H in these two examples enjoy the same algebraic properties: they
respect the basic vector operations of addition and scalar multiplication. This motivates
the following definition.
Definition 8.3. Let V and W be vector spaces. A linear transformation (or mapping
or map) from V to W is a function T : V → W such that
T(v + w) = Tv + Tw
T(λv) = λT(v)
for all vectors v and w and scalars λ.
Let us investigate the linear maps from R to R.
Exercise 8.4. Which of the following functions are linear?
f(x) = 3x + 1        g(x) = -x
h(x) = e^x + 2       k(x) = sin(x).
Answer. First, observe that f(x + y) = 3(x + y) + 1 = 3x + 3y + 1, while f(x) + f(y) =
(3x + 1) + (3y + 1) = 3x + 3y + 2. These expressions are different, so f(x + y) ≠ f(x) + f(y),
and f is not a linear map.
Next, g(x + y) = -(x + y) = -x - y, while g(x) + g(y) = -x - y. Hence
g(x + y) = g(x) + g(y). Further, g(λx) = -λx = λ(-x) = λg(x). Then g is a linear
map.
Third, h(λx) = e^{λx} + 2, while λh(x) = λ(e^x + 2) = λe^x + 2λ. There is no reason
why these should be the same for all values of x and λ in R. Indeed, when λ = -1, then
they are equal only when e^{-x} + 2 = -(e^x + 2), that is, if e^{-x} + e^x + 4 = 0, that is, never.
Then h is not a linear map.
We omit the proof that k is not a linear map.
In general, the linear functions from R to R are the functions of the form l(x) = cx, for
some fixed c ∈ R (including 0).
2. Linear Maps and Matrices
Suppose that A is an n × m matrix. Define T_A : R^m → R^n by the formula
T_A x = Ax   for all x ∈ R^m.
Then T_A is a linear map. Indeed,
T_A(x + y) = A(x + y) = Ax + Ay = T_A(x) + T_A(y)
for all x, y ∈ R^m, and similarly T_A(λx) = λT_A(x) for all x ∈ R^m and λ ∈ R.
We show now that multiplications by matrices are essentially the only examples of
linear maps from Rm to Rn. Recall that ej denotes the vector (0, . . . , 0, 1, 0, . . . , 0)T, with
only one nonzero entry, namely 1, in the jth place.
Theorem 8.5. Suppose that T : R^m → R^n is a linear map. Let aj denote the vector
Tej, and let A denote the n × m matrix whose jth column is aj, where 1 ≤ j ≤ m. Then
Tx = Ax
for all x in R^m.
Proof. Take x in R^m. We write x as (x1, x2, . . . , xm)^T, and then
x = x1e1 + x2e2 + ⋯ + xmem.
Since T is linear,
Tx = T(x1e1 + x2e2 + ⋯ + xmem)
= x1Te1 + x2Te2 + ⋯ + xmTem
= x1a1 + x2a2 + ⋯ + xmam = Ax,
as required. □
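The construction in Theorem 8.5 is easy to carry out numerically: apply T to each standard basis vector and use the results as columns. A minimal sketch in Python with numpy; the map `T` below (which swaps coordinates and scales them) is a made-up example, not one from the text:

```python
import numpy as np

def matrix_of(T, m):
    """Build the matrix of a linear map T : R^m -> R^n.

    Column j of the matrix is T(e_j), exactly as in Theorem 8.5.
    """
    return np.column_stack([T(np.eye(m)[:, j]) for j in range(m)])

# A sample linear map R^2 -> R^2: (x, y) |-> (2y, 3x).
def T(v):
    return np.array([2.0 * v[1], 3.0 * v[0]])

A = matrix_of(T, 2)
x = np.array([5.0, 7.0])
assert np.allclose(A @ x, T(x))  # multiplication by A reproduces T
```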
3. Examples in R2 and R3
Exercise 8.6. Let R : R^2 → R^2 be the rotation (anti-clockwise) through the angle θ.
Represent R as multiplication by a matrix.
Answer. Observe that
R(1, 0)^T = (cos θ, sin θ)^T   and   R(0, 1)^T = (-sin θ, cos θ)^T.
Then
R(x, y)^T = x R(1, 0)^T + y R(0, 1)^T = (x cos θ - y sin θ, x sin θ + y cos θ)^T,
that is, R is multiplication by the matrix
[ cos θ  -sin θ ]
[ sin θ   cos θ ] .
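As a quick numerical sanity check (an editorial sketch, with an arbitrary sample angle): the rotation matrix is additive and preserves lengths.

```python
import numpy as np

theta = 0.7  # an arbitrary sample angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 2.0])
w = np.array([-3.0, 0.5])

# R(v + w) = Rv + Rw, and rotation preserves length.
assert np.allclose(R @ (v + w), R @ v + R @ w)
assert np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))
```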
Exercise 8.7. Let Ds : R^3 → R^3 be dilation by the factor s in R+, that is, Ds x = sx
for all x in R^3. Show that Ds is linear and represent Ds as multiplication by a matrix.
Answer. We omit the proof that Ds is linear.
It is easy to see that
       [ s 0 0 ]
Ds x = [ 0 s 0 ] x   for all x ∈ R^3.
       [ 0 0 s ]
Exercise 8.8. What is the geometric effect of multiplying vectors in R^3 by the matrix
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 10^{-3} ] ?
Answer. If we think of the vectors (1, 0, 0)^T and (0, 1, 0)^T as being horizontal, and
(0, 0, 1)^T as being vertical, then, on the one hand, we do not change the horizontal
components of a vector, while on the other hand, we reduce the vertical component by a factor
of 10^3. This corresponds to squashing down.
Exercise 8.9. What is the geometrical effect of multiplying vectors in R^2 by the matrix
[ 1 1 ]
[ 0 1 ] ?
Answer. The effect of this transformation is to slide points horizontally: to the left when
y < 0 and to the right when y > 0. This is known as a shear transformation.
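A short sketch of the shear on two hypothetical sample points: the x-coordinate shifts by y, and the y-coordinate is unchanged.

```python
import numpy as np

S = np.array([[1.0, 1.0],
              [0.0, 1.0]])  # the shear matrix above

p = np.array([2.0, 3.0])    # y > 0: slides to the right
q = np.array([2.0, -3.0])   # y < 0: slides to the left

assert np.allclose(S @ p, [5.0, 3.0])    # x increased by y
assert np.allclose(S @ q, [-1.0, -3.0])  # x decreased by |y|
```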
Exercise 8.10. Let a be a unit vector in R^3. Define the map L : R^3 → R^3 by the
formula Lx = (x · a)a for all x ∈ R^3.
(a) Show that L is linear.
(b) Express L as multiplication by a matrix.
(c) Describe L geometrically.
Answer. First, observe that
L(x + y) = ((x + y) · a)a = (x · a + y · a)a = (x · a)a + (y · a)a = Lx + Ly,
for all x, y ∈ R^3. Moreover,
L(λx) = ((λx) · a)a = (λ(x · a))a = λ((x · a)a) = λLx,
for all x ∈ R^3 and λ ∈ R. Hence L is a linear map.
It is easy to check that Lej = aj a, and hence Lv = Av, where
    [ a1a1 a2a1 a3a1 ]
A = [ a1a2 a2a2 a3a2 ] .
    [ a1a3 a2a3 a3a3 ]
Geometrically, Lv is the projection of v onto a.
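In matrix language, A is the outer product of a with itself. A sketch (with a hypothetical sample unit vector) verifying the formula Lx = (x · a)a and the projection property A² = A:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0]) / 3.0     # a unit vector
assert np.isclose(np.linalg.norm(a), 1.0)

A = np.outer(a, a)                      # entry (i, j) is a_j * a_i
x = np.array([4.0, -1.0, 0.5])

assert np.allclose(A @ x, (x @ a) * a)  # Lx = (x . a) a
assert np.allclose(A @ A, A)            # projecting twice changes nothing
```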
Exercise 8.11. Let a be a unit vector in R^3. Define the map X : R^3 → R^3 by the
formula
Xv = a × v
for all v ∈ R^3.
(a) Show that X is linear.
(b) Express X as multiplication by a matrix.
(c) Describe X geometrically.
Answer. First, observe that
X(x + y) = a × (x + y) = a × x + a × y = Xx + Xy,
for all x, y ∈ R^3. Moreover,
X(λx) = a × (λx) = λ(a × x) = λXx,
for all x ∈ R^3 and λ ∈ R. Hence X is a linear map.
It is easy to check that Xv = Av, where
    [  0  -a3  a2 ]
A = [  a3  0  -a1 ] .
    [ -a2  a1  0  ]
Geometrically, if v is parallel to a, then Xv = 0, while if v is perpendicular to a, then
Xv is obtained by rotating v through π/2 around the a axis.
So X corresponds to projection onto the plane {v ∈ R^3 : v · a = 0} followed by a
rotation through π/2 around the a axis.
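The matrix A here is the standard skew-symmetric cross-product matrix. A sketch checking it against numpy's cross product, for the sample choice a = (0, 0, 1)^T:

```python
import numpy as np

a = np.array([0.0, 0.0, 1.0])  # unit vector along the z-axis

A = np.array([[    0, -a[2],  a[1]],
              [ a[2],     0, -a[0]],
              [-a[1],  a[0],     0]])

v = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ v, np.cross(a, v))
assert np.allclose(A @ a, 0)   # vectors parallel to a are sent to 0
```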
LECTURE 9
More on linear maps
We have defined linear maps, and seen some examples. In this lecture, we see more
examples and properties of linear maps.
1. Examples of linear maps
Exercise 9.1. Define the function S : R^2 → R^2 by
S(x, y)^T = (x^2 - y^2, 2xy)^T
for all (x, y)^T ∈ R^2. Is S linear?
Answer. Observe that
S(λ(x, y)^T) = S(λx, λy)^T = (λ^2 x^2 - λ^2 y^2, 2λ^2 xy)^T = λ^2 (x^2 - y^2, 2xy)^T = λ^2 S(x, y)^T.
Since λ ≠ λ^2 (except when λ = 0 or λ = 1),
S(λ(x, y)^T) ≠ λS(x, y)^T,
and S is not linear.
Exercise 9.2. Define the function I : C[R] → C[R] by
If(x) = ∫_0^x f(t) dt
for all x ∈ R. Is I linear?
Answer. Take continuous functions f and g on R and λ ∈ R. First,
I(f + g)(x) = ∫_0^x (f(t) + g(t)) dt = ∫_0^x f(t) dt + ∫_0^x g(t) dt = If(x) + Ig(x)
for all x ∈ R, that is, I(f + g) = If + Ig. Next,
I(λf)(x) = ∫_0^x λf(t) dt = λ ∫_0^x f(t) dt = λIf(x)
for all x ∈ R, that is, I(λf) = λ(If).
Hence I is linear.
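The same linearity can be seen in a discrete approximation of I: any fixed quadrature rule is itself linear in the integrand. A sketch with hypothetical sample functions, using a left-endpoint Riemann sum:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 101)
f = np.sin(x)
g = x ** 2
lam = 2.5

def I(h):
    """Left-endpoint Riemann approximation of x |-> integral_0^x h(t) dt."""
    dx = x[1] - x[0]
    return np.concatenate([[0.0], np.cumsum(h[:-1]) * dx])

# Linearity holds exactly for the discrete operator, not just approximately.
assert np.allclose(I(f + g), I(f) + I(g))
assert np.allclose(I(lam * f), lam * I(f))
```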
2. Algebra of linear maps
We gather together a number of useful facts about linear maps.
Lemma 9.3. Suppose that V and W are vector spaces and T : V → W is a linear map.
Then
T(0) = 0
T(-u) = -T(u)
for all u ∈ V.
Proof. For all vectors u in V and all scalars λ, we know that T(λu) = λT(u). Take λ
equal to 0 to prove the first identity, and λ equal to -1 to prove the second. □
This means that if T : V → W is a function between vector spaces, and T(0) ≠ 0 or
T(-v) ≠ -T(v) for just one vector v, then T is not linear.
We have already used the following results implicitly.
Lemma 9.4. Suppose that V and W are vector spaces and T : V → W is a function.
Then T is a linear transformation if and only if
T(λu + μv) = λT(u) + μT(v) (9.1)
for all u, v ∈ V and all scalars λ, μ.
Proof. If T is linear, then
T(λu + μv) = T(λu) + T(μv) = λT(u) + μT(v).
Conversely, if (9.1) holds, then taking λ = μ = 1 shows that
T(u + v) = T(u) + T(v)
and taking μ = 0 shows that
T(λu) = λT(u),
so T is linear. □
Corollary 9.5. Suppose that V and W are vector spaces and T : V → W is a linear
map. Then for all finite linear combinations ∑_{j=1}^n λj vj in V,
T( ∑_{j=1}^n λj vj ) = ∑_{j=1}^n λj T(vj).
Proof. This is proved by induction on n: the lemma above shows that the result
holds when n = 2, and if k ≥ 2 and
T( ∑_{j=1}^k λj vj ) = ∑_{j=1}^k λj T(vj),
then
T( ∑_{j=1}^{k+1} λj vj ) = T( ∑_{j=1}^k λj vj + λ_{k+1} v_{k+1} )
= T( ∑_{j=1}^k λj vj ) + T(λ_{k+1} v_{k+1})
= ∑_{j=1}^k λj T(vj) + λ_{k+1} T(v_{k+1})
= ∑_{j=1}^{k+1} λj T(vj).
The first equality holds by definition, the second because T is linear, the third by the
inductive hypothesis and the linearity of T, and the fourth by definition. It follows that
the result holds for all integers n ≥ 2 by induction. □
3. More on the geometry of linear maps
Exercise 9.6. Let D denote the diagonal n × n matrix
[ d1 0  . . . 0  ]
[ 0  d2 . . . 0  ]
[ .  .  . . . .  ]
[ 0  0  . . . dn ] .
Show that multiplication by D is linear. What is the geometric effect of multiplying by
the matrix D?
Answer. We omit the proof that multiplication by D is linear.
Multiplication by D changes the kth component of a vector by a factor of dk. If |dk| < 1, this
gives a compression; if |dk| > 1, this gives an expansion. If dk < 0, there is also a change
of orientation.
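A sketch of these three effects with a hypothetical diagonal matrix:

```python
import numpy as np

D = np.diag([0.5, 3.0, -1.0])  # compress, expand, flip orientation
v = np.array([2.0, 2.0, 2.0])

assert np.allclose(D @ v, [1.0, 6.0, -2.0])
# The sign of the determinant records the change of orientation.
assert np.linalg.det(D) < 0
```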
4. Images and kernels
Consider the linear system
a11 x1 + a12 x2 + ⋯ + a1m xm = b1
. . .
an1 x1 + an2 x2 + ⋯ + anm xm = bn,
or equivalently, in matrix form,
Ax = b.
What b can we solve this for? Is the solution unique?
We have seen that we can solve the equation if and only if b is a linear combination of
the columns of A; we write b ∈ col(A) for short. We also know that, if xpart is a particular
solution of Ax = b, then every solution is of the form xpart + xhom, where xhom is a solution
of the homogeneous equation
Ax = 0.
If Ax = b has any solutions, then it has as many solutions as the homogeneous equation.
Next, consider the integral equation
Iu = b,
where I is the integration operator introduced earlier, b is a known function and u is an
unknown continuous function. What b can we solve this for? Is the solution unique?
It is actually quite hard to answer the first question. But if u is a continuous function,
then Iu(0) = 0, so a necessary condition to be able to solve this problem is that b(0) = 0,
and certainly we cannot solve this for all functions b. It is convenient to have a notation
for the functions for which we can solve this equation. The set of vectors {If : f ∈ C[R]}
is called the image of I, and written image(I).
For the second question, we can show that, if upart is a particular solution of Iu = b,
then every solution of the equation is of the form upart + uhom, where uhom is a solution of
the homogeneous equation
Iu = 0.
If Iu = b has any solutions, then it has as many solutions as the homogeneous equation.
We unify these (and other examples) in the following definitions.
Definition 9.7. Suppose that V and W are vector spaces and T : V → W is a linear
map. The set of vectors {Tv : v ∈ V} is called the image or range of T, and written
image(T) or range(T) or T(V). In symbols,
image(T) = {Tv : v ∈ V}.
An equivalent form of the definition is that image(T) is the collection of vectors w in
W for which the equation Tx = w can be solved.
Let xpart be a particular solution of the equation Tx = w, and let xhom be any solution
of the homogeneous equation Txhom = 0. Then
T(xpart + xhom) = T(xpart) + T(xhom) = w + 0 = w,
so that xpart + xhom is also a solution of Tx = w. Further, every solution of Tx = w is of
this form. Indeed, if Tx = w and Txpart = w, then
T(x - xpart) = T(x) - T(xpart) = w - w = 0,
so x - xpart is a solution of the homogeneous equation, and x = xpart + (x - xpart).
The solutions to the homogeneous equation Tx = 0 are important in the discussion
above, and we give them a name.
Definition 9.8. Suppose that V and W are vector spaces and T : V → W is a linear
map. The kernel of T, written ker(T), also known as the null space of T, is the subset of
V of all vectors x such that Tx = 0. In symbols,
ker(T) = {x ∈ V : Tx = 0}.
Theorem 9.9. Suppose that V and W are vector spaces and T : V → W is a linear
map. Then ker(T) and image(T) are subspaces.
Proof. First we show that ker(T) is a subspace. We have just seen that T0 = 0, so
ker(T) is not empty. Next, if v, w ∈ ker(T), then
T(v + w) = Tv + Tw = 0 + 0 = 0,
so v + w ∈ ker(T). Thus ker(T) is closed under vector addition. Further, if λ is a scalar
and v ∈ ker(T), then
T(λv) = λ(Tv) = λ0 = 0,
so λv ∈ ker(T). Thus ker(T) is closed under scalar multiplication. By the Subspace
Theorem, ker(T) is a subspace.
Now we show that image(T) is a subspace. We have just seen that T0 = 0, so image(T)
is not empty. Next, if x, y ∈ image(T), then there are vectors u and v in V such that
x = Tu and y = Tv. Then
x + y = Tu + Tv = T(u + v),
and x + y ∈ image(T) since u + v ∈ V. Thus image(T) is closed under vector addition.
Further, if λ is a scalar and x ∈ image(T), then there is a vector u in V such that x = Tu,
whence
λx = λ(Tu) = T(λu),
and λx ∈ image(T) since λu ∈ V. Thus image(T) is closed under scalar multiplication.
By the Subspace Theorem, image(T) is a subspace. □
The spaces ker(T) and image(T) are vector spaces, and have dimensions. These tell us
something about T, and this is what we will investigate next.
LECTURE 10
Rank and nullity
1. Definitions and properties
Definition 10.1. Suppose that V and W are vector spaces and T : V → W is a linear
map. The nullity of T, written nullity(T), is the dimension of ker(T). The rank of T,
written rank(T), is the dimension of image(T). The co-rank of T, written co-rank(T), is
the number dim(W) - dim(image(T)).
Observe that the nullity of T is equal to the number of parameters in the general
solution of Tx = w. Observe also that the rank determines the co-rank, and vice versa.
To any matrix A, we have associated a linear map TA, namely, multiplication by A. It
is convenient to use the expressions ker(A), nullity(A), image(A), rank(A) and co-rank(A)
to mean ker(TA), nullity(TA), image(TA), rank(TA) and co-rank(TA).
However, this also applies to other kinds of vectors. For example, the general solution
of
(d^2/dx^2) u(x) + u(x) = x
is
u(x) = x + A sin x + B cos x.
This has two parameters, and the nullity of the linear differential operator T, given by
Tf(x) = f''(x) + f(x), is 2.
Proposition 10.2. Suppose that T : V → W is a linear map. Then
(i) T is one-to-one if and only if nullity(T) = 0.
(ii) T is onto if and only if rank(T) = dim(W), that is, if and only if co-rank(T) = 0.
Proof. First, we consider when T is one-to-one. On the one hand, suppose that
nullity(T) = 0, so ker(T) = {0}. By linearity, if x, y ∈ V and T(x) = T(y), then
T(x - y) = 0, and so x - y ∈ ker(T) = {0}. Thus x - y = 0, that is, x = y. Thus T is
one-to-one.
On the other hand, suppose that nullity(T) ≠ 0, so that ker(T) ≠ {0}. Take z ∈
ker(T) such that z ≠ 0. Then T(z) = 0 = T(0), and T is not one-to-one.
Now we consider when T is onto. On the one hand, suppose that rank(T) = dim(W) =
n, say. Now image(T) ⊆ W, and image(T) has a basis, {w1, . . . , wn}. This is a linearly
independent set in W of maximal size, since dim(W) = n, and hence is a basis of W.
Thus span{w1, . . . , wn} = image(T) and span{w1, . . . , wn} = W, so W = image(T). This
means that for any w ∈ W, we can find v ∈ V such that Tv = w, and T is onto.
On the other hand, suppose that rank(T) < dim(W). Then a basis for image(T)
is smaller than a basis for W. In particular, a basis {w1, . . . , wn} for image(T) can be
enlarged to form a basis {w1, . . . , wn, w_{n+1}, . . . } for W. A basis is linearly independent,
so w_{n+1} is not in the span of {w1, . . . , wn}. Thus w_{n+1} is not in image(T), and T is not
onto. □
Theorem 10.3. If U is a subspace of W, then the smallest number of linear equations
needed to describe U is dim(W) - dim(U).
We omit the proof of this result. It implies that the co-rank of a linear transformation
T : R^m → R^n is the number of linear equations needed to describe the image of T.
Exercise 10.4. Consider the line in R^3 with parametric equation x = λd, where
d = (2, 0, 1)^T. This line is a 1-dimensional subspace. It may also be described by the
equations y = 0 and x = 2z. It is of codimension 2.
Challenge Problem 10.5. Find equations which define span{ (1, 2, 0, 1)^T, (1, 0, 2, 0)^T }
in R^4. What is the minimal number of linear equations needed to do this?
2. Examples
Exercise 10.6. Consider D : Pn(R) → P_{n-1}(R), given by
D(an t^n + ⋯ + a0) = n an t^{n-1} + ⋯ + 2 a2 t + a1
(that is, D corresponds to differentiation). Find nullity(D) and rank(D).
Answer. First of all, take p(t) ∈ Pn(R) given by p(t) = an t^n + ⋯ + a0. If D(p)(t) = 0
(for all t), then n an t^{n-1} + ⋯ + 2 a2 t + a1 = 0, and so an, a_{n-1}, . . . , a1 are all 0. However a0
can be arbitrary. Thus ker(D) is the set of constant polynomials, which is of dimension 1.
Hence nullity(D) = 1.
It is easy to see that {1, t, t^2, . . . , t^{n-1}} is a basis for P_{n-1}(R), whence dim(P_{n-1}(R)) = n.
Suppose that q(t) ∈ P_{n-1}(R) is given by q(t) = b_{n-1} t^{n-1} + ⋯ + b1 t + b0. We choose
the coefficients an, a_{n-1}, . . . , a1 as follows:
an = b_{n-1}/n,   a_{n-1} = b_{n-2}/(n - 1),   . . . ,   a2 = b1/2,   a1 = b0,
and define p(t) = an t^n + ⋯ + a0. It is easy to check that Dp(t) = q(t) for all real t. Since
q is an arbitrary element of P_{n-1}(R), it follows that D is onto, image(D) = P_{n-1}(R), and
rank(D) = dim(P_{n-1}(R)) = n.
Observe that rank(D) + nullity(D) = n + 1 = dim(Pn(R)).
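This can be spot-checked for a particular n. The sketch below represents differentiation, in the monomial bases, as an n × (n+1) coefficient matrix (for the sample value n = 4) and computes its rank and nullity numerically:

```python
import numpy as np

n = 4
# Column k (for t^k) holds the coefficients of D(t^k) = k t^(k-1),
# written in the basis {1, t, ..., t^(n-1)}.
D = np.zeros((n, n + 1))
for k in range(1, n + 1):
    D[k - 1, k] = k

rank = np.linalg.matrix_rank(D)
nullity = (n + 1) - rank  # rank-nullity for matrices

assert rank == n      # D is onto P_{n-1}(R)
assert nullity == 1   # the constants form the kernel
```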
Exercise 10.7. Find the nullity of the matrix
[ 1  2 3 4 ]
[ 1  0 4 2 ]
[ 1 -1 0 0 ] .
Answer. Reduce to row-echelon form. The reduced matrix is of the form
[ 1  2  3    4  ]
[ 0 -2  1   -2  ]
[ 0  0 -9/2 -1  ] .
Then the rank of the matrix is 3; the nullity is 1.
Note that we do not have to find the kernel explicitly to show that it is 1-dimensional.
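This is easy to confirm numerically; the sketch below uses the matrix with rows (1, 2, 3, 4), (1, 0, 4, 2) and (1, -1, 0, 0):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0, 4.0],
              [1.0, 0.0, 4.0, 2.0],
              [1.0, -1.0, 0.0, 0.0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank  # rank + nullity = number of columns

assert rank == 3
assert nullity == 1
```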
Exercise 10.8. Find a basis for the kernel of the matrix
[ 1  2 3 4 ]
[ 1  0 4 2 ]
[ 1 -1 0 0 ] .
Answer. We need to find x1, . . . , x4 such that
[ 1  2 3 4 ] [ x1 ]
[ 1  0 4 2 ] [ x2 ] = 0,
[ 1 -1 0 0 ] [ x3 ]
             [ x4 ]
that is, to find the solutions of the system represented by the augmented matrix
[ 1  2 3 4 | 0 ]
[ 1  0 4 2 | 0 ]
[ 1 -1 0 0 | 0 ] .
Row-reduced, this is of the form
[ 1  2  3    4  | 0 ]
[ 0 -2  1   -2  | 0 ]
[ 0  0 -9/2 -1  | 0 ] .
The solution space has the parametric description
x = t (10, 10, 2, -9)^T,
where t ∈ R, and is 1-dimensional. Then {(10, 10, 2, -9)^T} is a basis for the kernel of
the matrix.
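One can check a kernel vector directly; the sketch below writes out the matrix and candidate vector explicitly:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0, 4.0],
              [1.0, 0.0, 4.0, 2.0],
              [1.0, -1.0, 0.0, 0.0]])
v = np.array([10.0, 10.0, 2.0, -9.0])

assert np.allclose(A @ v, 0)           # v lies in the kernel
assert np.linalg.matrix_rank(A) == 3   # so the kernel is 1-dimensional
```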
Exercise 10.9. Find a basis for the image of the matrix
[ 1  2 3 4 ]
[ 1  0 4 2 ]
[ 1 -1 0 0 ] .
Answer. We row-reduce this matrix, and get
[ 1  2  3    4  ]
[ 0 -2  1   -2  ]
[ 0  0 -9/2 -1  ] .
Thus the first three columns are linearly independent, and the fourth column depends
linearly on these. It follows that the vectors (1, 1, 1)^T, (2, 0, -1)^T and (3, 4, 0)^T are linearly
independent; since R^3 is 3-dimensional, they must form a basis.
Of course, other sets of three vectors, such as {(1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T},
are also bases for R^3.
Exercise 10.10. Define T : P3(R) → R^4 by
T(a3 x^3 + a2 x^2 + a1 x + a0) = (a0, a1, a2, a3)^T.
Show that T is a linear mapping, and find its rank and nullity.
Answer. Suppose that
p(x) = a3 x^3 + a2 x^2 + a1 x + a0
q(x) = b3 x^3 + b2 x^2 + b1 x + b0.
Then
T(p(x) + q(x)) = T((a3 + b3)x^3 + (a2 + b2)x^2 + (a1 + b1)x + (a0 + b0))
= (a0 + b0, a1 + b1, a2 + b2, a3 + b3)^T
= (a0, a1, a2, a3)^T + (b0, b1, b2, b3)^T
= T(p(x)) + T(q(x)),
and further
T(λp(x)) = T(λa3 x^3 + λa2 x^2 + λa1 x + λa0)
= (λa0, λa1, λa2, λa3)^T
= λ(a0, a1, a2, a3)^T
= λT(p(x)),
so T is linear.
Alternatively, by Lemma 9.4 it suffices to write
T(λp(x) + μq(x)) = T((λa3 + μb3)x^3 + (λa2 + μb2)x^2 + (λa1 + μb1)x + (λa0 + μb0))
= (λa0 + μb0, λa1 + μb1, λa2 + μb2, λa3 + μb3)^T
= λ(a0, a1, a2, a3)^T + μ(b0, b1, b2, b3)^T
= λT(p(x)) + μT(q(x)).
Finally, T sends a polynomial to 0 only if all of its coefficients are 0, so ker(T) = {0} and
nullity(T) = 0; and every vector (a0, a1, a2, a3)^T in R^4 is attained, so rank(T) = 4.
Exercise 10.11. Define T : Pn(R) → Pn(R) by
T(p(x)) = x^2 (d^2 p(x)/dx^2) + 4x (dp(x)/dx) - 4p(x).
Find rank(T) and nullity(T).
Answer. Suppose that p(x) = x^k, where 0 ≤ k ≤ n. Then
T(p(x)) = x^2 (d^2 x^k/dx^2) + 4x (d x^k/dx) - 4x^k = x^2 k(k - 1)x^{k-2} + 4x k x^{k-1} - 4x^k
= [k(k - 1) + 4k - 4] x^k = [k^2 + 3k - 4] x^k
= tk x^k,
say. Note that y^2 + 3y - 4 = 0 if and only if y = -4 or 1, so that tk = 0 when k = 1, but
not for other nonnegative integers k.
Now suppose that
p(x) = an x^n + a_{n-1} x^{n-1} + ⋯ + a1 x + a0.
Then
T(p(x)) = an tn x^n + a_{n-1} t_{n-1} x^{n-1} + ⋯ + a1 t1 x + a0 t0.
If T(p(x)) = 0, then an tn = a_{n-1} t_{n-1} = ⋯ = a1 t1 = a0 t0 = 0, and so an = a_{n-1} = ⋯ = a2 =
a0 = 0, while a1 can be arbitrary. Thus ker(T) = {cx : c ∈ R}, and nullity(T) = 1.
Given any q ∈ Pn(R), say
q(x) = bn x^n + b_{n-1} x^{n-1} + ⋯ + b1 x + b0,
then we can try to solve T(p(x)) = q(x) by taking
an = bn/tn,   a_{n-1} = b_{n-1}/t_{n-1},   . . . ,   a0 = b0/t0.
Of course, this is a problem for the coefficient of x, since t1 = 0, but fine otherwise. Hence
image(T) = {bn x^n + b_{n-1} x^{n-1} + ⋯ + b1 x + b0 : b1 = 0},
and rank(T) = n.
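The multipliers tk = k^2 + 3k - 4 can be tabulated directly; only t1 vanishes. A sketch for the sample range k = 0, . . . , 6:

```python
# t_k is the factor by which T scales the monomial x^k.
t = [k * (k - 1) + 4 * k - 4 for k in range(7)]

assert t == [k ** 2 + 3 * k - 4 for k in range(7)]
assert t[1] == 0                                   # x lies in the kernel
assert all(t[k] != 0 for k in range(7) if k != 1)  # no other monomial does
```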
LECTURE 11
The rank-nullity theorem
In this lecture, we prove the rank-nullity theorem for matrices and general linear maps.
And we show that the set of linear maps from a vector space V to a vector space W is
itself a vector space, and that the set of invertible linear maps on a vector space V is a
group.
1. Representing general linear maps by matrices
It is easier to calculate with matrices than with general linear maps. Now we show how
to represent a general linear map as a matrix.
Theorem 11.1. Suppose that V and W are vector spaces with bases {v1, . . . , vm} = A
and {w1, . . . , wn} = B respectively, and suppose that T : V → W is a linear map. Let tj
denote the vector Tvj, let [tj]_B denote its coordinates relative to the basis B, and let A denote
the matrix whose jth column is [tj]_B, where j = 1, . . . , m. Then for all x in V,
[Tx]_B = A[x]_A.
Proof. We start by clarifying the definitions: for j = 1, . . . , m, we may write tj as a
linear combination of the vectors w1, . . . , wn, that is,
tj = a_{1j} w1 + ⋯ + a_{nj} wn = ∑_{k=1}^n a_{kj} wk,
where the a_{kj} are scalars. By definition, the scalars a_{1j}, . . . , a_{nj} are the coordinates of tj
relative to the ordered basis B, that is, [tj]_B = (a_{1j}, . . . , a_{nj})^T. Hence the matrix A has
entries a_{kj}.
For x in V, we may write
x = x1 v1 + ⋯ + xm vm = ∑_{j=1}^m xj vj,
and then, by definition, (x1, . . . , xm)^T = [x]_A. It follows that
Tx = T( ∑_{j=1}^m xj vj ) = ∑_{j=1}^m xj T(vj) = ∑_{j=1}^m xj tj
= ∑_{j=1}^m xj ∑_{k=1}^n a_{kj} wk = ∑_{k=1}^n ( ∑_{j=1}^m a_{kj} xj ) wk = ∑_{k=1}^n (A[x]_A)_k wk,
so
[Tx]_B = A(x1, x2, . . . , xm)^T = A[x]_A,
as required. □
Exercise 11.2. Consider the differential operator D : Pn(R) → Pn(R) given by
Dp(x) = x (d^2/dx^2) p(x) + 3p(x).
Represent D as a matrix, using the basis {1, x, . . . , x^n} for Pn(R).
Answer. Observe that
Dx^k = x k(k - 1) x^{k-2} + 3x^k = k(k - 1) x^{k-1} + 3x^k.
It follows that, if p(x) = a0 x^0 + a1 x^1 + ⋯ + an x^n, then Dp(x) = b0 x^0 + b1 x^1 + ⋯ + bn x^n,
where b = Aa, and A is the matrix
[ 3 0 0 0 0  . . . ]
[ 0 3 2 0 0  . . . ]
[ 0 0 3 6 0  . . . ]
[ 0 0 0 3 12 . . . ]
[ 0 0 0 0 3  . . . ]
[ . . . . .  . . . ] .
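A sketch building this matrix for a sample value of n and applying it to a sample polynomial:

```python
import numpy as np

n = 4
A = np.zeros((n + 1, n + 1))
for k in range(n + 1):
    A[k, k] = 3.0                   # the 3 p(x) term
    if k + 1 <= n:
        A[k, k + 1] = (k + 1) * k   # x (x^(k+1))'' contributes (k+1)k x^k

# D applied to p(x) = x^2: x * 2 + 3x^2 = 2x + 3x^2.
a = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # coefficients, lowest degree first
b = A @ a
assert np.allclose(b, [0.0, 2.0, 3.0, 0.0, 0.0])
```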
A number of problems about linear equations can be reduced to problems about matrices
in this way.
2. The rank-nullity theorem
Theorem 11.3 (The rank-nullity theorem for matrices). Suppose that A is an n × m
matrix. Then
rank(A) + nullity(A) = m.
Proof. Suppose that A is row-reduced to row-echelon form. Then the columns of A
that correspond to the leading columns of the reduced matrix form a basis for range(A),
hence rank(A) is equal to the number of leading columns. The nonleading columns of the
reduced matrix correspond to the parameters of the general solution of Ax = 0, that is,
nullity(A) is equal to the number of nonleading columns. These numbers add to give the
total number of columns, that is, m. □
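The theorem is easy to spot-check numerically; the sketch below uses a random matrix of arbitrary shape, counting the kernel dimension via the singular value decomposition:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # a random matrix with 5 columns

rank = np.linalg.matrix_rank(A)

# Singular vectors with (numerically) zero singular value span ker(A),
# so the nullity is the column count minus the number of nonzero
# singular values.
s = np.linalg.svd(A, compute_uv=False)
nullity = A.shape[1] - int(np.sum(s > 1e-10))

assert rank + nullity == A.shape[1]   # rank + nullity = number of columns
```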
Corollary 11.4. Suppose that T : V → W is a linear map between (finite-dimensional)
vector spaces. Then rank(T) + nullity(T) = dim(V).
Proof. We have just seen that we may represent T by a matrix, and we may apply
the rank-nullity theorem for matrices to this matrix.
Alternatively, here is a more descriptive proof. Take a basis {v1, . . . , vk} for the kernel
of T and enlarge it to a basis {v1, . . . , vm} for V. We now claim that the m - k vectors
T(v_{k+1}), . . . , T(vm) are linearly independent and span image(T), so form a basis for this
space. It then follows that dim(image(T)) = m - k, which is essentially the desired result.
To see that the vectors T(v_{k+1}), . . . , T(vm) are linearly independent is quite easy: if
λ_{k+1} T(v_{k+1}) + ⋯ + λm T(vm) = 0,
then
T(λ_{k+1} v_{k+1} + ⋯ + λm vm) = 0,
that is,
λ_{k+1} v_{k+1} + ⋯ + λm vm ∈ ker(T).
Since v1, . . . , vk span ker(T), there exist λ1, . . . , λk such that
-(λ_{k+1} v_{k+1} + ⋯ + λm vm) = λ1 v1 + ⋯ + λk vk,
and hence
λ1 v1 + ⋯ + λk vk + λ_{k+1} v_{k+1} + ⋯ + λm vm = 0;
since the vectors v1, . . . , vm are linearly independent, the λj are all 0 when 1 ≤ j ≤ m.
To prove that the vectors T(v_{k+1}), . . . , T(vm) span image(T) is shorter and easier:
if y ∈ image(T), then there exists x ∈ V such that y = T(x); we write x as a linear
combination ∑_{j=1}^m xj vj, and then y is the corresponding linear combination:
y = T(x) = T( ∑_{j=1}^m xj vj ) = ∑_{j=1}^m xj T(vj) = ∑_{j=k+1}^m xj T(vj),
since T(vj) = 0 when 1 ≤ j ≤ k. Hence y ∈ span{T(v_{k+1}), . . . , T(vm)}. □
Corollary 11.5. If A is a square matrix, then nullity(A) = co-rank(A). Hence the
linear maps associated to square matrices are one-to-one if and only if they are onto.
Proof. We saw that a linear map is one-to-one if and only if its nullity is 0 and that
it is onto if and only if its co-rank is 0. The rank-nullity theorem implies that the nullity
and co-rank of a square matrix are equal. □
3. The algebra of linear maps
We discuss briefly how to get new linear maps from old.
Suppose that S and T are linear maps from V to W and λ and μ are scalars. We may
define a map λS + μT from V to W by
(λS + μT)(v) = λS(v) + μT(v)   for all v ∈ V.
Lemma 11.6. Suppose that S and T are linear maps from V to W and λ and μ are
scalars. Then λS + μT is a linear map.
Proof. By definition, if u, v ∈ V and α and β are scalars, then
(λS + μT)(αu + βv) = λS(αu + βv) + μT(αu + βv)
= λ(αS(u) + βS(v)) + μ(αT(u) + βT(v))
= αλS(u) + βλS(v) + αμT(u) + βμT(v)
= α(λS(u) + μT(u)) + β(λS(v) + μT(v))
= α(λS + μT)(u) + β(λS + μT)(v),
and so λS + μT is linear, by Lemma 9.4. □
We may also compose linear maps: if S : U → V and T : V → W are linear maps,
then we define T∘S : U → W by T∘S(u) = T(S(u)) for all u ∈ U.
Lemma 11.7. Suppose that S : U → V and T : V → W are linear maps. Then
T∘S : U → W is a linear map.
Proof. We leave this proof as an exercise. □
Exercise 11.8. Suppose that S : U → V and T : V → W are linear maps. How big,
in terms of the quantities nullity(S) and nullity(T), can nullity(T∘S) be?
Answer. First of all, ker(S) ⊆ ker(T∘S) ⊆ U. We take a basis for ker(S), and enlarge
it, first to a basis A of ker(T∘S), and then to a basis B of U. There are nullity(S) vectors in
the basis for ker(S). Any basis vector v in B that is part of the basis for ker(T∘S) but not
of the basis for ker(S) has the property that S(v) ≠ 0, but T(S(v)) = 0, so S(v) ∈ ker(T).
Such vectors S(v) are linearly independent in ker(T), so there are at most nullity(T) of
them.
In total, there are at least nullity(S) vectors in A, and at most nullity(S) + nullity(T)
vectors.
We conclude that
nullity(S) ≤ nullity(T∘S) ≤ nullity(S) + nullity(T).
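Both bounds are attained; a sketch with small matrices chosen by hand for illustration:

```python
import numpy as np

def nullity(M):
    return M.shape[1] - np.linalg.matrix_rank(M)

S = np.array([[1.0, 0.0],
              [0.0, 0.0]])   # nullity 1: kills the second coordinate
T1 = np.array([[1.0, 0.0],
               [0.0, 1.0]])  # invertible: nullity 0
T2 = np.array([[0.0, 0.0],
               [0.0, 1.0]])  # nullity 1, and it kills the image of S

# Lower bound attained: nullity(T1 S) = nullity(S).
assert nullity(T1 @ S) == nullity(S)
# Upper bound attained: nullity(T2 S) = nullity(S) + nullity(T2).
assert nullity(T2 @ S) == nullity(S) + nullity(T2)
```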
If T : V → V is a linear map on a vector space V, then T has an inverse T^{-1}, which is
a function on V, precisely when T is one-to-one and onto. This is when nullity(T) = 0 and
rank(T) = dim(V). By the rank-nullity theorem, it suffices to suppose that nullity(T) = 0
or that rank(T) = dim(V).
Lemma 11.9. If T : V → V is an invertible linear map, then T^{-1} is a linear map.
Proof. If y1, y2 ∈ V, then there exist x1, x2 ∈ V such that T(x1) = y1 and T(x2) = y2.
Then x1 = T^{-1}(y1) and x2 = T^{-1}(y2).
Further,
T(λ1 x1 + λ2 x2) = λ1 T(x1) + λ2 T(x2) = λ1 y1 + λ2 y2.
Hence
T^{-1}(λ1 y1 + λ2 y2) = T^{-1}T(λ1 x1 + λ2 x2) = λ1 x1 + λ2 x2 = λ1 T^{-1}(y1) + λ2 T^{-1}(y2),
which shows that T^{-1} is linear. □
Of course, T∘T^{-1} = I, the identity map, and I is linear.
Invertible linear maps are important in both pure and applied mathematics. Pure
mathematicians study the group GL(V) of invertible linear maps of a vector space, and
find that the properties of this group answer basic questions in number theory and in the
construction of efficient networks.
Exercise 11.10. Show that GL(V) has the following properties:
(a) GL(V) is closed under composition, that is, if S, T ∈ GL(V), then S∘T ∈ GL(V).
(b) GL(V) has an identity I, such that I∘T = T∘I = T for all T ∈ GL(V).
(c) GL(V) has inverses: for all T ∈ GL(V), there exists T^{-1} ∈ GL(V) such that
T∘T^{-1} = T^{-1}∘T = I.
(d) Composition in GL(V) is associative, that is,
(R∘S)∘T = R∘(S∘T)
for all R, S, T ∈ GL(V).
Engineers describe states of robotic systems by vectors, and are interested in invertible
linear maps of these systems. For instance, a television camera is mounted on a base which
allows it to swivel up and down in a vertical plane, while the base can rotate in a circle in
a horizontal plane. To get the camera pointing in a given direction, one has to both swivel
the camera and rotate the base. How much of each basic motion is needed to effect a given
three-dimensional motion? This is quite an easy question, but it does not take long to get
to rather difficult questions of this kind in robotics.