Sketch Solutions for Exercises
in the Main Text of
Vectors, Pure and Applied
T. W. Körner
Introduction
Here are what I believe to be sketch solutions to the bulk of exercises in the main text of the book (i.e. those not in the “Further Exercises”). I have written in haste in the hope that others will help me correct at leisure. I am sure that they are stuffed with errors ranging from the TEXtual through to the arithmetical and not excluding serious mathematical mistakes. I would appreciate the opportunity to correct at least some of these problems. Please tell me of any errors, unbridgeable gaps, misnumberings etc. I welcome suggestions for additions.
ALL COMMENTS GRATEFULLY RECEIVED.
If you can, please use LaTeX2ε or its relatives for mathematics. If not, please use plain text. My e-mail is [email protected]. You may safely assume that I am both lazy and stupid, so that a message saying ‘Presumably you have already realised the mistake in Exercise Z’ is less useful than one which says ‘I think you have made a mistake in Exercise Z because you have assumed that the sum is necessarily larger than the integral. One way round this problem is to assume that f is decreasing.’
When I was young, I used to be surprised when the answer in the back of the book was wrong. I could not believe that the wise and gifted people who wrote textbooks could possibly make mistakes. I am no longer surprised.
It may be easiest to navigate this document by using the table of contents which follows below. To avoid disappointment, observe that those exercises marked ⋆ have no solution given.
Contents

Introduction
Chapter 1: Exercises 1.1.2, 1.2.1, 1.2.2, 1.2.3, 1.2.5, 1.2.6, 1.2.7, 1.2.8, 1.2.9⋆, 1.2.10, 1.3.5, 1.3.9, 1.3.10, 1.4.3
Chapter 2: Exercises 2.1.3, 2.1.4, 2.1.6⋆, 2.1.7⋆, 2.1.9, 2.2.4⋆, 2.2.5⋆, 2.2.8, 2.2.10, 2.3.3, 2.3.4⋆, 2.3.11, 2.3.12, 2.3.13, 2.3.14, 2.3.16, 2.3.17, 2.3.18, 2.4.1, 2.4.2, 2.4.4⋆, 2.4.6, 2.4.9, 2.4.10, 2.4.11, 2.4.12
Chapter 3: Exercises 3.2.1, 3.3.4, 3.3.6, 3.3.9, 3.3.10, 3.3.12, 3.3.13, 3.3.14, 3.4.5, 3.4.7, 3.5.1, 3.5.2
Chapter 4: Exercises 4.1.1, 4.1.2, 4.1.3, 4.2.1, 4.2.2, 4.2.3⋆, 4.3.2, 4.3.3, 4.3.10, 4.3.13, 4.3.14, 4.3.15, 4.3.16⋆, 4.4.1, 4.4.4, 4.4.6, 4.4.7, 4.4.8, 4.4.9, 4.5.1, 4.5.3, 4.5.3, 4.5.5, 4.5.6, 4.5.7, 4.5.8, 4.5.10, 4.5.13, 4.5.14, 4.5.15, 4.5.16, 4.5.17
Chapter 5: Exercises 5.1.1, 5.1.2⋆, 5.2.7, 5.2.8, 5.2.11, 5.3.2, 5.3.3, 5.3.10, 5.3.11, 5.3.14, 5.3.16, 5.4.3, 5.4.11, 5.4.12, 5.4.13, 5.4.14, 5.5.3, 5.5.5, 5.5.6, 5.5.12⋆, 5.5.13, 5.5.14, 5.5.16, 5.5.17, 5.5.18, 5.5.19, 5.6.2, 5.6.3, 5.6.4, 5.6.5, 5.6.6
Chapter 6: Exercises 6.1.2, 6.1.3, 6.1.5, 6.1.7, 6.1.8, 6.1.9, 6.1.12, 6.2.3, 6.2.4, 6.2.9, 6.2.11, 6.2.14, 6.2.15, 6.2.16, 6.3.2, 6.4.2, 6.4.5, 6.4.7, 6.4.8, 6.4.9, 6.5.5, 6.5.2, 6.6.2, 6.6.7, 6.6.8, 6.6.9, 6.6.10, 6.6.11, 6.6.12, 6.6.13, 6.7.1, 6.7.2, 6.7.3, 6.7.6, 6.7.7, 6.7.9, 6.7.10, 6.7.11, 6.7.12
Chapter 7: Exercises 7.1.4, 7.1.6, 7.1.8, 7.1.9, 7.1.10, 7.2.2, 7.2.6, 7.2.10, 7.2.13, 7.3.2, 7.3.4, 7.3.6, 7.3.7, 7.4.3, 7.4.6, 7.4.9, 7.5.1, 7.5.2, 7.5.3, 7.5.4, 7.5.5, 7.5.6, 7.5.7
Chapter 8: Exercises 8.1.7, 8.2.2, 8.2.6, 8.2.7, 8.2.9, 8.2.10, 8.3.2, 8.3.3⋆, 8.3.4, 8.3.5, 8.4.2, 8.4.3, 8.4.4, 8.4.6, 8.4.7, 8.4.8, 8.4.9, 8.4.10, 8.4.11, 8.4.12, 8.4.13, 8.4.14, 8.4.15, 8.4.16, 8.4.17, 8.4.18, 8.4.19, 8.4.20
Chapter 9: Exercises 9.1.2, 9.1.4, 9.2.1, 9.3.1, 9.3.5, 9.3.6, 9.3.8, 9.4.3, 9.4.5, 9.4.6, 9.4.7, 9.4.9, 9.4.10, 9.4.11, 9.4.12, 9.4.13, 9.4.14, 9.4.15, 9.4.16, 9.4.17, 9.4.19, 9.4.21, 9.4.22
Chapter 10: Exercises 10.1.4, 10.1.2, 10.1.3, 10.1.6, 10.1.8, 10.2.1, 10.2.2, 10.2.3, 10.2.4, 10.3.2, 10.4.1, 10.4.3, 10.4.6
Chapter 11: Exercises 11.1.2, 11.1.3, 11.1.4, 11.1.5, 11.1.9, 11.1.15, 11.1.16, 11.2.1, 11.2.3, 11.2.4, 11.2.5, 11.3.1, 11.3.6, 11.3.7, 11.3.8, 11.3.9, 11.3.10, 11.4.2, 11.4.4, 11.4.5, 11.4.6, 11.4.9, 11.4.10, 11.4.12, 11.4.15, 11.4.19, 11.4.20
Chapter 12: Exercises 12.1.2, 12.1.3, 12.1.4, 12.1.6, 12.1.7, 12.1.11, 12.1.13, 12.1.14, 12.1.15, 12.1.16, 12.1.17, 12.1.18, 12.1.19, 12.2.1, 12.2.2, 12.2.3, 12.2.6, 12.2.7, 12.2.8, 12.2.9, 12.2.11, 12.2.12, 12.3.1, 12.3.3, 12.3.6, 12.3.7, 12.3.9, 12.3.11, 12.3.13, 12.3.14, 12.3.15, 12.4.3, 12.4.4, 12.4.7, 12.4.11, 12.4.12, 12.4.13, 12.4.14, 12.5.1, 12.5.2, 12.5.3, 12.5.4, 12.5.5, 12.5.6, 12.5.7, 12.5.8, 12.5.9, 12.5.10, 12.5.11
Chapter 13: Exercises 13.2.2, 13.2.3, 13.2.4, 13.2.6, 13.2.7, 13.2.8, 13.2.9, 13.3.1, 13.3.2, 13.3.3, 13.3.4, 13.3.5, 13.3.6, 13.3.7, 13.3.9, 13.3.10, 13.3.12, 13.3.13, 13.3.14, 13.3.15, 13.3.16, 13.3.17, 13.3.18
Chapter 14: Exercises 14.1.3, 14.1.4, 14.1.5, 14.1.11, 14.1.12, 14.1.14, 14.1.16, 14.2.1, 14.2.3, 14.2.5⋆, 14.2.6, 14.2.7, 14.2.10, 14.2.12, 14.3.2, 14.3.3, 14.3.4, 14.3.5, 14.3.7, 14.3.8, 14.3.9, 14.3.10⋆, 14.3.11, 14.3.12, 14.3.13, 14.3.14, 14.3.15
Chapter 15: Exercises 15.1.1, 15.1.3, 15.1.7, 15.1.9, 15.1.10, 15.1.11, 15.1.12, 15.1.13, 15.2.2, 15.2.4, 15.3.2, 15.3.3, 15.3.7, 15.3.8, 15.3.11, 15.4.3, 15.4.6, 15.4.7, 15.4.9
Chapter 16: Exercises 16.1.5, 16.1.9, 16.1.14, 16.1.16, 16.1.17, 16.1.23, 16.1.25, 16.1.26, 16.1.27, 16.2.4, 16.2.8, 16.2.9, 16.2.10, 16.2.12, 16.2.13, 16.3.2, 16.3.3, 16.3.4, 16.3.5, 16.3.6, 16.3.7, 16.3.11, 16.3.12, 16.3.14, 16.3.15, 16.3.16, 16.3.17, 16.3.19, 16.4.2, 16.4.3, 16.4.6, 16.4.7, 16.4.8, 16.4.9
Exercise 1.1.2

Starting with

x + y + z = 1
x + 2y + 3z = 2
x + 4y + 9z = 6,

we subtract the first equation from the second and from the third to get

x + y + z = 1
y + 2z = 1
3y + 8z = 5.

We now solve the equations

y + 2z = 1
3y + 8z = 5

for z by subtracting 3 times the first equation from the second to obtain

y + 2z = 1
2z = 2.

Thus z = 1 and, working backwards, y = 1 − 2z = −1, whence x = 1 − y − z = 1.
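The back-substitution is easily checked by machine; here is a small Python sketch of the computation above (my addition, not part of the original text):

```python
# The system of Exercise 1.1.2 and its back-substitution.
z = 1              # from 2z = 2
y = 1 - 2 * z      # from y + 2z = 1
x = 1 - y - z      # from x + y + z = 1
assert (x, y, z) == (1, -1, 1)
# Substitute back into the original equations.
assert x + y + z == 1
assert x + 2 * y + 3 * z == 2
assert x + 4 * y + 9 * z == 6
```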
Exercise 1.2.1

STEP 1 If aij = 0 for all i and j, then our equations have the form

0 = yi [1 ≤ i ≤ m].

Our equations are inconsistent unless y1 = y2 = . . . = ym = 0. If y1 = y2 = . . . = ym = 0, the equations impose no constraints on x1, x2, . . . , xn, which can take any value we want.

STEP 2 If the condition of STEP 1 does not hold, we can arrange, by reordering the equations and the unknowns if necessary, that a11 ≠ 0. We now subtract ai1/a11 times the first equation from the ith equation [2 ≤ i ≤ m] to obtain

⋆⋆    ∑_{j=2}^{n} bij xj = zi [2 ≤ i ≤ m],

where

bij = (a11 aij − ai1 a1j)/a11  and  zi = (a11 yi − ai1 y1)/a11.

STEP 3 If the new set of equations ⋆⋆ has no solution, then our old set ⋆ has no solution. If our new set of equations ⋆⋆ has a solution xi = x′i for 2 ≤ i ≤ n, then our old set ⋆ has the solution

x1 = (1/a11)(y1 − ∑_{j=2}^{n} a1j x′j),  xi = x′i [2 ≤ i ≤ n].

(Continued in Exercise 1.2.2.)
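STEP 2 is mechanical enough to code directly. The sketch below (an illustration of mine, not part of the text) applies the formulas for bij and zi to the system of Exercise 1.1.2:

```python
# One elimination pass (STEP 2): with a11 != 0, form
# b_ij = (a11*a_ij - a_i1*a_1j)/a11 and z_i = (a11*y_i - a_i1*y_1)/a11.
def eliminate_first_variable(a, y):
    a11 = a[0][0]
    m, n = len(a), len(a[0])
    b = [[(a11 * a[i][j] - a[i][0] * a[0][j]) / a11 for j in range(1, n)]
         for i in range(1, m)]
    z = [(a11 * y[i] - a[i][0] * y[0]) / a11 for i in range(1, m)]
    return b, z

# Applied to the system of Exercise 1.1.2:
b, z = eliminate_first_variable([[1, 1, 1], [1, 2, 3], [1, 4, 9]], [1, 2, 6])
assert b == [[1.0, 2.0], [3.0, 8.0]] and z == [1.0, 5.0]
```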
Exercise 1.2.2

(Continued from Exercise 1.2.1.)

This means that, if ⋆⋆ has exactly one solution, then ⋆ has exactly one solution and, if ⋆⋆ has infinitely many solutions, then ⋆ has infinitely many solutions. We have already remarked that, if ⋆⋆ has no solutions, then ⋆ has no solutions.
Exercise 1.2.3

(i)(a) If ai = 0, but yi ≠ 0 for some i, then there can be no solution, since the equation 0 = yi cannot be satisfied.

(b) If yr = 0 whenever ar = 0 and there exists an i such that ai = 0, then there are an infinity of solutions with xs = as⁻¹ys when as ≠ 0 and xs chosen freely otherwise.

(c) If ai ≠ 0 for all i, there is a unique solution xi = ai⁻¹yi for all i.

(ii)(a) If aj = 0 for all j, then either b = 0 and every choice of xj gives a solution (so there is an infinity of solutions), or b ≠ 0 and there is no solution.

(b) If ak ≠ 0 for some k, then, choosing xj freely for j ≠ k and setting

xk = ak⁻¹(b − ∑_{j≠k} aj xj)

gives an infinity of solutions.
Exercise 1.2.5

(i) If a = 1, b = 2, c = d = 4, the first and second equations are incompatible, so there is no solution.

(ii) If a = 1, b = 2, c = d = 4, then the third equation gives the same information as the first equation, so the system reduces to

x + y = 2
x + 2y = 4.

Subtracting the first equation from the second we see that y = 2. Thus x = 0. By inspection this is a solution.

(iii) If a = b = 2, c = d = 4, then the second and third equations give the same information as the first equation, so the system reduces to

x + y = 1,

with the infinite set of solutions given by choosing x arbitrarily and setting y = 1 − x.
Exercise 1.2.6

The two equations are incompatible if a = 1. There is then no solution.

If a ≠ 1, then subtracting the first equation from the second we get

x + y + z = 2
(a − 1)z = 2,

so z = 2/(a − 1). Knowing the value of z, the system reduces to

x + y = 2 − 2/(a − 1) = 2(a − 2)/(a − 1),

so x may be chosen freely and then

y = 2(a − 2)/(a − 1) − x.

There are an infinity of solutions.
Exercise 1.2.7

(i) Observe that, if n ≥ 4,

n³ ≥ ∑_{r=1}^{n} n² ≥ ∑_{r=1}^{n} r² ≥ ∑_{n/2 ≤ r ≤ n} (n/2)² ≥ (n/4) × (n/2)² = n³/16.

(ii) Let f(x) = x². Then f is increasing, so

f(r) ≤ f(x) ≤ f(r + 1) for r ≤ x ≤ r + 1.

Integrating, we get

f(r) = ∫_r^{r+1} f(r) dx ≤ ∫_r^{r+1} f(x) dx ≤ ∫_r^{r+1} f(r + 1) dx = f(r + 1).

Thus, summing,

∑_{r=1}^{n−1} f(r) ≤ ∫_1^n f(x) dx ≤ ∑_{r=2}^{n} f(r).

In other words

∑_{r=1}^{n−1} r² ≤ (n³ − 1)/3 ≤ ∑_{r=2}^{n} r²,

so

(∑_{r=1}^{n} r²) − n² ≤ (n³ − 1)/3 ≤ (∑_{r=1}^{n} r²) − 1.

Thus

(n³ − 1)/3 ≤ (∑_{r=1}^{n} r²) − 1 ≤ (n³ − 1)/3 + n².

Dividing by n³ and allowing n → ∞, we obtain

n⁻³ ∑_{r=1}^{n} r² → 1/3.
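The limit can be observed numerically; a short Python check of the squeeze above:

```python
# Numerical check that n**(-3) * (1**2 + ... + n**2) tends to 1/3.
def s(n):
    return sum(r * r for r in range(1, n + 1)) / n ** 3

assert abs(s(10) - 1 / 3) < 0.06
assert abs(s(1000) - 1 / 3) < 0.002
```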
Exercise 1.2.8

If we have the triangular system of size n, one operation is needed to get xn from the nth equation. Substitution of the value of xn in the remaining n − 1 equations reduces them to a triangular system of size n − 1 in about 2(n − 1) operations. We can repeat this process to obtain the complete solution in about

∑_{r=1}^{n} (2(r − 1) + 1) ≈ 2 ∑_{r=1}^{n} (r − 1) ≈ n²

operations.
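The count ∑(2(r − 1) + 1) works out to exactly n²; a one-line Python check:

```python
# Total operation count for a triangular system of size n, as in the text:
# one operation for x_n plus about 2(n-1) substitutions at each stage.
def ops(n):
    return sum(2 * (r - 1) + 1 for r in range(1, n + 1))

assert ops(10) == 100        # the sum is exactly n**2
assert ops(100) == 10000
```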
Exercise 1.2.9⋆
Rather you than me.
Exercise 1.2.10

The number of operations is about An³. I reckon A ≈ 1, so I need about 1000 operations. Reckoning about 100 operations an hour and a 5 hour day gives 2 days. The reader may disagree about everything, but we should still have about the same order of magnitude for the task.
Exercise 1.3.5

Row operations.

Subtract twice the first row from the second.

(1 −1  3)      (1 −1  3)
(2  5  2)  →   (0  6 −4)

Divide the second row by 6.

(1 −1  3)      (1 −1    3)
(0  6 −4)  →   (0  1 −4/3)

Add the second row to the first.

(1 −1    3)      (1  0  5/3)
(0  1 −4/3)  →   (0  1 −4/3)

Column operations.

Add the first column to the second. Subtract three times the first column from the third.

(1 −1  3)      (1  0   0)
(2  5  2)  →   (0  7 −10)

Divide the second column by 7.

(1  0   0)      (1  0     0)
(0  7 −10)  →   (0  1 −10/7)

Add 10/7 times the second column to the third.

(1  0     0)      (1  0  0)
(0  1 −10/7)  →   (0  1  0)
Exercise 1.3.9

(i) Subtract 2 times the first row from the second. Subtract 4 times the first row from the third.

(1 −1  3)      (1 −1  3)
(2  5  2)  →   (0  7 −4)
(4  3  8)      (0  7 −4)

Subtract the second row from the third.

(1 −1  3)      (1 −1  3)
(0  7 −4)  →   (0  7 −4)
(0  7 −4)      (0  0  0)

Add the first column to the second. Subtract 3 times the first column from the third.

(1 −1  3)      (1  0  0)
(0  7 −4)  →   (0  7 −4)
(0  0  0)      (0  0  0)

Add 4/7 times the second column to the third. Divide the second column by 7.

(1  0  0)      (1  0  0)
(0  7 −4)  →   (0  1  0)
(0  0  0)      (0  0  0)

(ii) Subtract the first row from the second and interchange the first and second rows.

(2  4  5)      (2  4  5)      (1 −2 −4)
(3  2  1)  →   (1 −2 −4)  →   (2  4  5)
(4  1  3)      (4  1  3)      (4  1  3)

Subtract 2 times the first row from the second and 4 times the first row from the third.

(1 −2 −4)      (1 −2 −4)
(2  4  5)  →   (0  8 13)
(4  1  3)      (0  9 19)

Subtract the second row from the third, then interchange the second and third rows.

(1 −2 −4)      (1 −2 −4)      (1 −2 −4)
(0  8 13)  →   (0  8 13)  →   (0  1  6)
(0  9 19)      (0  1  6)      (0  8 13)

Add twice the second row to the first and subtract 8 times the second row from the third.

(1 −2 −4)      (1  0   8)
(0  1  6)  →   (0  1   6)
(0  8 13)      (0  0 −35)

Divide the third row by −35. Subtract 6 times the third row from the second and 8 times the third row from the first.

(1  0   8)      (1  0  8)      (1  0  0)
(0  1   6)  →   (0  1  6)  →   (0  1  0)
(0  0 −35)      (0  0  1)      (0  0  1)

The same operations on the corresponding system of equations. Subtract the first row from the second and interchange the first and second rows.

2x + 4y + 5z = −3      2x + 4y + 5z = −3      x − 2y − 4z = 5
3x + 2y + z = 2    →   x − 2y − 4z = 5    →   2x + 4y + 5z = −3
4x + y + 3z = 1        4x + y + 3z = 1        4x + y + 3z = 1

Subtract 2 times the first row from the second and 4 times the first row from the third. Subtract the second row from the third, then interchange the second and third rows.

x − 2y − 4z = 5        x − 2y − 4z = 5        x − 2y − 4z = 5
8y + 13z = −13     →   8y + 13z = −13     →   y + 6z = −6
9y + 19z = −19         y + 6z = −6            8y + 13z = −13

Add twice the second row to the first and subtract 8 times the second row from the third. Divide the third row by −35. Subtract 6 times the third row from the second and 8 times the third row from the first.

x + 8z = −7        x + 8z = −7        x = 1
y + 6z = −6    →   y + 6z = −6    →   y = 0
−35z = 35          z = −1             z = −1
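A quick check, by substituting x = 1, y = 0, z = −1 back into the original system in Python:

```python
# Substituting the answer x = 1, y = 0, z = -1 back into the system of (ii).
x, y, z = 1, 0, -1
assert 2 * x + 4 * y + 5 * z == -3
assert 3 * x + 2 * y + z == 2
assert 4 * x + y + 3 * z == 1
```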
Exercise 1.3.10

By subtracting the second row from the first and the third from the second, we see that the matrix

(1 1 1 1)
(0 1 1 1)
(0 0 1 1)

has rank 3. By adding the first row to the second and the first row to the third, we obtain a matrix of the same rank 3 with all entries non-zero:

(1 1 1 1)
(1 2 2 2)
(1 1 2 2).

The same argument, starting with

(1 1 1 1)        (1 1 1 1)
(0 1 1 1)  and   (0 0 0 0)
(0 0 0 0)        (0 0 0 0),

yields matrices of rank 2 and 1 with all entries non-zero:

(1 1 1 1)        (1 1 1 1)
(1 2 2 2)  and   (1 1 1 1)
(1 1 1 1)        (1 1 1 1).

A rank 0 example is not possible. All elementary operations (which change a matrix) involve a non-zero row or column which remains non-zero (though possibly moved) after the operation. Thus a rank 0 matrix must be the zero matrix.
Exercise 1.4.3

(i) Observe that

πi((x + y) + z) = (xi + yi) + zi = xi + (yi + zi) = πi(x + (y + z)).

(ii) Observe that

πi(x + y) = xi + yi = yi + xi = πi(y + x).

(iii) Observe that

πi(x + 0) = xi + 0 = xi = πi(x).

(v) Observe that

πi((λ + µ)x) = (λ + µ)xi = λxi + µxi = πi(λx + µx).

(vi) Observe that

πi((λµ)x) = (λµ)xi = λ(µxi) = πi(λ(µx)).

(vii) Observe that

πi(1x) = 1xi = xi = πi(x)

and

πi(0x) = 0xi = 0 = πi(0).

(viii) We can argue as above, or

x − x = 1x + (−1)x = (1 + (−1))x = 0x = 0.
Exercise 2.1.3

If c ≠ 0, then u = (a/c, 0) and v = (0, b/c) are distinct vectors representing points on the line.

If c = 0, then suppose, without loss of generality, that a ≠ 0. If b = 0, then u = (0, 0) and v = (0, 1) are distinct vectors representing points on the line. If b ≠ 0, then u = (0, 0) and v = (1, −a/b) are distinct vectors representing points on the line.
Exercise 2.1.4

(i) Suppose that

{v + λw : λ ∈ R} ∩ {v′ + µw : µ ∈ R} ≠ ∅.

Then there exist λ0, µ0 ∈ R such that

v + λ0w = v′ + µ0w

and so

v + λw = v′ + (λ − λ0 + µ0)w.

Thus

{v + λw : λ ∈ R} ⊆ {v′ + µw : µ ∈ R}

and, similarly,

{v′ + µw : µ ∈ R} ⊆ {v + λw : λ ∈ R},

so

{v + λw : λ ∈ R} = {v′ + µw : µ ∈ R}.

(ii) The line joining u to u′ is

{u + λw : λ ∈ R}

with w = u − u′. Since w = (σ/τ)v − v′, the line joining v to v′ is

{v + µw : µ ∈ R}.

Thus part (ii) follows from part (i).

(iii) Observe that

u′′ = −(1/µ′′)(µu + µ′u′) = (1/(µ + µ′))(µu + µ′u′) = u + (µ′/(µ + µ′))(u′ − u).
Exercise 2.1.6⋆
Exercise 2.1.7⋆
Exercise 2.1.9

λ/(1 − λ) = −α/β ⇒ βλ = −α + λα ⇒ λ = α/(α − β).

Interchanging α with −(1 − α) and β with −(1 − β) yields

λ′ = (1 − α)/(β − α).
Exercise 2.2.4⋆
Exercise 2.2.5⋆
Exercise 2.2.8

Let y be the centroid of x1, x2, . . . , xq (and let yj be the centroid of the points xi with i ≠ j). Observe that

((q − 1)/q) yj + (1/q) xj = (1/q) ∑_{i≠j} xi + (1/q) xj = y,

so y lies on the line joining yj and xj.
Exercise 2.2.10

(i) Let y be the centre of mass of all the points x1, x2, . . . , xq and let M = ∑_{j=1}^{q} mj (and let yj be the centre of mass of the points xi with i ≠ j). Observe that

((M − mj)/M) yj + (mj/M) xj = (1/M) ∑_{i≠j} mi xi + (mj/M) xj = y,

so y lies on the line joining yj and xj.

(ii) We might have m1 + m2 + . . . + mq = 0 and we cannot divide by 0.
Exercise 2.3.3

(i) We take the positive square root.

(ii) ‖x‖ = 0 ⇒ ‖x‖² = 0 ⇒ ∑_{j=1}^{n} xj² = 0 ⇒ xj = 0 for all j ⇒ x = 0.

(iii) ‖λx‖² = ∑_{j=1}^{n} (λxj)² = λ² ∑_{j=1}^{n} xj², so ‖λx‖ = |λ|‖x‖.
Exercise 2.3.4⋆
Exercise 2.3.11

The result is trivial if a = b or b = c, so we assume this is not the case.

Setting x = a − b, y = b − c, we see that Theorem 2.3.10 gives

‖a − b‖ + ‖b − c‖ = ‖x‖ + ‖y‖ ≥ ‖x + y‖ = ‖a − c‖,

with equality if and only if λ(a − b) = µ(b − c) for some λ, µ > 0, i.e. if and only if a, b, c lie in order along a line.

The length of one side AC of a triangle is less than or equal to the sum of the lengths of the other two sides AB and BC, with equality if and only if the triangle is degenerate with A, B, C lying in order along a straight line.
Exercise 2.3.12

If x lies on the line through a and b then

x = ta + (1 − t)b

for some t ∈ R. The condition

‖x − a‖ = ‖x − b‖

yields

‖(1 − t)(a − b)‖ = ‖t(a − b)‖,

that is to say

|1 − t| ‖b − a‖ = |t| ‖b − a‖,

so |t| = |1 − t|, so t = 1 − t or t = t − 1. The second equation is insoluble, so t = 1/2 and x = (1/2)(a + b).

We have proved uniqueness. Direct substitution shows we have a solution.
Exercise 2.3.13

(i) Choose X on BC so that AX is perpendicular to BC. By the theorem of Pythagoras and the geometric definition of the cosine and sine,

|AC|² = |AX|² + |CX|² = |AX|² + (|BC| − |BX|)²
      = |BA|² sin²θ + (|BC| − |BA| cos θ)²
      = |BA|²(cos²θ + sin²θ) + |BC|² − 2|BA||BC| cos θ
      = |BC|² + |BA|² − 2|BC| × |BA| cos θ.

(If the reader is a very careful mathematician, she will not be entirely happy with this argument. She will ask, for example, how we choose the sign of some of the terms. However, in this book the algebra is primary and the geometry is illustrative.)

(ii) We have

‖a − c‖² = (a − c) · (a − c) = ‖a‖² + ‖c‖² − 2a · c.

(iii) The formula in (i) can be rewritten

‖a‖² + ‖c‖² − 2‖a‖ × ‖c‖ cos θ = ‖a − c‖²,

giving

a · c = ‖a‖‖c‖ cos θ.
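A numerical sanity check of (iii), using a pair of vectors of my own choosing:

```python
import math

# Spot-check of a.c = ||a|| ||c|| cos(theta) and the cosine rule.
a, c = (3.0, 0.0), (1.0, 1.0)
dot = a[0] * c[0] + a[1] * c[1]
na, nc = math.hypot(*a), math.hypot(*c)
theta = math.acos(dot / (na * nc))
assert abs(theta - math.pi / 4) < 1e-12       # the angle is 45 degrees
# cosine rule: ||a - c||^2 = ||a||^2 + ||c||^2 - 2 ||a|| ||c|| cos(theta)
lhs = (a[0] - c[0]) ** 2 + (a[1] - c[1]) ** 2
rhs = na ** 2 + nc ** 2 - 2 * na * nc * math.cos(theta)
assert abs(lhs - rhs) < 1e-12
```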
Exercise 2.3.14

Since

(v · u)/(‖u‖‖v‖) = (u · v)/(‖u‖‖v‖),

the angle between v and u is the same as the angle between u and v.

If θ is the angle between u and v and φ is the angle between u and −v, then

cos φ = (u · (−v))/(‖u‖‖v‖) = −(u · v)/(‖u‖‖v‖) = −cos θ,

and 0 ≤ θ, φ ≤ π, so φ = π − θ.
Exercise 2.3.16

Let us put the vertices at

a = (a1, a2, . . . , an) with each ai = ±1.

The diagonals join vertices a and −a (call this diagonal da). Diagonals da, db are perpendicular if and only if

0 = a · b = ∑_{i=1}^{n} ai bi.

Since ai bi = ±1, ∑_{i=1}^{n} ai bi is odd if n is odd and even if n is even. (Proof by induction or modular arithmetic.) Thus if n is odd, a · b ≠ 0 and no diagonals are perpendicular.

Now suppose n = 4. In finding the possible angles θ we may suppose without loss of generality that a = (1, 1, 1, 1). The possible angles are given by

cos θ = (a · b)/(‖a‖‖b‖) = (a · b)/4,

and, since a · b is even with |a · b| ≤ 4, the values a · b = −4, −2, 0, 2, 4 all occur (note that we may get different angles according to the direction assigned to the diagonals). The possible angles are

0, π/3, π/2, 2π/3, π,

the angles 0 and π being the angle between the diagonal and itself and the angle between the diagonal and itself reversed.
Exercise 2.3.17

(i) Always true:

u ⊥ v ⇒ u · v = 0 ⇒ v · u = 0 ⇒ v ⊥ u.

(ii) Sometimes false. Take u = w = (1, 0, 0) and v = (0, 1, 0).

(iii) Always true:

u ⊥ u ⇒ u · u = 0 ⇒ u = 0.
Exercise 2.3.18

(i) We have

‖u + v‖² = (u + v) · (u + v)
         = u · u + u · v + v · u + v · v
         = ‖u‖² + 0 + 0 + ‖v‖² = ‖u‖² + ‖v‖².

Consider the right angled triangle OUV with O at 0, U at u and V at v. Then ‖u + v‖ is the length of the hypotenuse.

(ii) We have

‖u + v + w‖² = (u + v + w) · (u + v + w)
             = u · u + v · v + w · w + 2u · v + 2v · w + 2w · u
             = ‖u‖² + ‖v‖² + ‖w‖² + 0 + 0 + 0 = ‖u‖² + ‖v‖² + ‖w‖².

(iii) If uj ∈ R⁴ and uj ⊥ uk for k ≠ j, we have

‖∑_{j=1}^{4} uj‖² = (∑_{j=1}^{4} uj) · (∑_{j=1}^{4} uj) = ∑_{j=1}^{4} ∑_{k=1}^{4} uj · uk
                  = ∑_{j=1}^{4} uj · uj = ∑_{j=1}^{4} ‖uj‖².
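Part (iii) can be spot-checked with four pairwise orthogonal vectors in R⁴ (a made-up example of mine):

```python
# Four pairwise orthogonal vectors in R^4 (scaled standard basis).
us = [(2, 0, 0, 0), (0, 3, 0, 0), (0, 0, 1, 0), (0, 0, 0, 5)]
total = tuple(sum(col) for col in zip(*us))          # u1 + u2 + u3 + u4
lhs = sum(t * t for t in total)                      # ||u1+u2+u3+u4||^2
rhs = sum(sum(x * x for x in u) for u in us)         # sum of ||uj||^2
assert lhs == rhs == 39
```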
Exercise 2.4.1

‖a + b‖² + ‖a − b‖² = (a + b) · (a + b) + (a − b) · (a − b)
                    = (‖a‖² + 2a · b + ‖b‖²) + (‖a‖² − 2a · b + ‖b‖²)
                    = 2(‖a‖² + ‖b‖²).

The sum of the squares of the lengths of the two diagonals of a parallelogram equals the sum of the squares of the lengths of the four sides.
Exercise 2.4.2

(i) Consider the parallelogram OACB with vertex O at 0, vertex A at a, vertex C at a + b and vertex B at b. The midpoint of the diagonal OC is at

x = (1/2)0 + (1/2)(a + b) = (1/2)(a + b).

The midpoint of the diagonal AB is at

y = (1/2)a + (1/2)b = (1/2)(a + b).

Thus x = y and we are done.

The midpoint M of the side CB opposite O is at

m = (1/2)(a + b) + (1/2)b = (1/2)a + b.

The point Z given by

z = (1/3)0 + (2/3)m = (1/3)a + (2/3)b

trisects OM and AB, so we are done.
Exercise 2.4.4⋆
Exercise 2.4.6

(i) If (n, m) is a unit vector perpendicular to the unit vector (u, v), we have nu = −mv, so n = −vx, m = ux and 1 = (u² + v²)x² = x². Thus

(n, m) = (−v, u) or (n, m) = (v, −u).

(ii) Let n be a unit vector perpendicular to ‖c‖⁻¹c as found in (i). Let p = n · a. Then

x = a + tc ⇒ x · n = p,

and if x · n = p we have (x − a) · n = 0, so x − a = tc for some t.

(iii) Let a = pn and let c = u be a unit vector perpendicular to n. Then

x = pn + tu ⇒ x · n = p,

and, if x · n = p, we have (x − pn) · n = 0, so x − pn = tu for some t.

(Or use a geometric argument.)
Exercise 2.4.9

We have

π1 = {(0, x2, x3) : x2, x3 ∈ R},
π2 = {(x1, 0, x3) : x1, x3 ∈ R},
π3 = {(x1, x2, x3) : x1 + x2 = 1, xj ∈ R}.

π1 and π2 meet in the line x1 = x2 = 0. π1 and π3 meet in the line x1 = 0, x2 = 1. π2 and π3 meet in the line x1 = 1, x2 = 0. The lines have no point in common.

(i) Now let πj be given by

nj · x = pj,

with

n1 = n2 = n3 = (1, 0, 0)

and p1 = −1, p2 = 0, p3 = 1. By inspection no two planes meet.

(ii) Finally let πj be given by

nj · x = pj,

with

n2 = n3 = (1, 0, 0), n1 = (0, 1, 0)

and p1 = p2 = 0, p3 = 1. We have that π1 and π2 meet in the line given by x1 = x2 = 0 and π1 and π3 meet in the line given by x1 = 1, x2 = 0, but π2 and π3 do not meet.
Exercise 2.4.10

{x ∈ R² : ‖x − a‖ = r}

is the singleton {a} if r = 0 and the empty set ∅ if r < 0.
Exercise 2.4.11

(i) We have

y(y(x)) = (1/‖y(x)‖²) y(x) = (‖x‖⁴/‖x‖²)(1/‖x‖²) x = x.

(ii) Observe that, if ‖a‖ ≠ r, then, setting C = ‖a‖² − r², we have, for x ≠ 0,

‖x − a‖² = r² ⇔ ‖x‖² − 2x · a + (‖a‖² − r²) = 0
           ⇔ 1 − 2y · a + (‖a‖² − r²)‖y‖² = 0
           ⇔ ‖y‖² − 2y · (C⁻¹a) + C⁻¹ = 0
           ⇔ ‖y − C⁻¹a‖² = C⁻¹(C⁻¹‖a‖² − 1),

so we transform the circle into a circle with centre C⁻¹a and radius

(C⁻¹(C⁻¹‖a‖² − 1))^{1/2}.

(Note that we are taking the square root of a positive number.)

(iii) Observe that, if ‖a‖ = r > 0, then, for x ≠ 0,

‖x − a‖² = r² ⇔ ‖x‖² − 2x · a = 0 ⇔ 1 − 2y · a = 0,

so we transform the circle into a line perpendicular to a with closest distance to the origin (2‖a‖)⁻¹.

(iv) The argument goes through without change, replacing ‘circle’ by ‘sphere’ and ‘line’ by ‘plane’.
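The centre and radius found in (ii) can be tested numerically. The sketch below uses a = (3, 0), r = 1 (my own choice, giving C = 8); note the radius (C⁻¹(C⁻¹‖a‖² − 1))^{1/2} simplifies to r/C here:

```python
import math

# Inversion y(x) = x/||x||^2 applied to the circle ||x - a|| = r, ||a|| != r.
def invert(p):
    n2 = p[0] ** 2 + p[1] ** 2
    return (p[0] / n2, p[1] / n2)

a, r = (3.0, 0.0), 1.0
C = a[0] ** 2 + a[1] ** 2 - r ** 2        # C = ||a||^2 - r^2 = 8
centre = (a[0] / C, a[1] / C)             # C^-1 a
radius = math.sqrt((1 / C) * ((1 / C) * (a[0] ** 2 + a[1] ** 2) - 1))
for k in range(12):                       # sample points on the original circle
    t = 2 * math.pi * k / 12
    x = (a[0] + r * math.cos(t), a[1] + r * math.sin(t))
    y = invert(x)
    assert abs(math.hypot(y[0] - centre[0], y[1] - centre[1]) - radius) < 1e-12
assert abs(radius - r / C) < 1e-12
```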
Exercise 2.4.12

‖x/‖x‖² − y/‖y‖²‖² = (x/‖x‖² − y/‖y‖²) · (x/‖x‖² − y/‖y‖²)
 = (x · x)/‖x‖⁴ − 2(x · y)/(‖x‖²‖y‖²) + (y · y)/‖y‖⁴
 = 1/‖x‖² − 2(x · y)/(‖x‖²‖y‖²) + 1/‖y‖²
 = (‖x‖² − 2x · y + ‖y‖²)/(‖x‖²‖y‖²)
 = (‖x − y‖/(‖x‖‖y‖))².

Thus

‖x/‖x‖² − y/‖y‖²‖ = ‖x − y‖/(‖x‖‖y‖).

Using this result, we have, by the triangle inequality (if x, y, z ≠ 0),

(‖x‖‖y‖‖z‖)⁻¹(‖y‖‖z − x‖ + ‖x‖‖y − z‖) = ‖z − x‖/(‖z‖‖x‖) + ‖y − z‖/(‖y‖‖z‖)
 = ‖z/‖z‖² − x/‖x‖²‖ + ‖y/‖y‖² − z/‖z‖²‖
 ≥ ‖x/‖x‖² − y/‖y‖²‖
 = (‖x‖‖y‖‖z‖)⁻¹ ‖z‖‖y − x‖,

with equality if and only if ‖x‖⁻²x − ‖y‖⁻²y and ‖z‖⁻²z − ‖y‖⁻²y are scalar multiples of each other (i.e. ‖x‖⁻²x, ‖y‖⁻²y, ‖z‖⁻²z lie on a straight line). Thus

‖z‖‖x − y‖ ≤ ‖y‖‖z − x‖ + ‖x‖‖y − z‖

if x, y, z ≠ 0. If at least one of the vectors is zero the result is trivial.

To obtain the Euclidean result, place D at the origin, let A have position vector x, B have position vector y and C position vector z.

We now use the notation of Exercise 2.4.11. The condition that ‖x‖⁻²x, ‖y‖⁻²y, ‖z‖⁻²z lie on a straight line is the same as saying that f(x), f(y), f(z) lie on a straight line, so x = f²(x), y = f²(y), z = f²(z) lie on a circle through the origin, i.e. A, B, C, D are concyclic.
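A random numerical check of the inequality ‖z‖‖x − y‖ ≤ ‖y‖‖z − x‖ + ‖x‖‖y − z‖ in R²:

```python
import math
import random

# Random spot-check of ||z|| ||x - y|| <= ||y|| ||z - x|| + ||x|| ||y - z||.
def norm(u):
    return math.hypot(u[0], u[1])

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

random.seed(0)
for _ in range(1000):
    x, y, z = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(3)]
    lhs = norm(z) * norm(sub(x, y))
    rhs = norm(y) * norm(sub(z, x)) + norm(x) * norm(sub(y, z))
    assert lhs <= rhs + 1e-12
```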
Exercise 3.2.1

zi = aik yk = aik(bkj xj) = (aik bkj)xj = cij xj.
Exercise 3.3.4

There are many proofs. We can work coordinatewise, e.g.

aij + (bij + cij) = (aij + bij) + cij,

so A + (B + C) = (A + B) + C.

Or direct from the definition:

(A + (B + C))x = Ax + (B + C)x = Ax + (Bx + Cx)
              = (Ax + Bx) + Cx = (A + B)x + Cx
              = ((A + B) + C)x

for all x, so (A + B) + C = A + (B + C).
Exercise 3.3.6

Suppose Cx = x for all x. Taking x to be the column vector with 1 in the kth place and zero elsewhere, we get

cik = δik.

Conversely, if y ∈ Rⁿ,

∑_{j=1}^{n} δij yj = yi,

so Iy = y.
Exercise 3.3.9

(iii) By definition. Observe that

((B + C)A)x = (B + C)(Ax) = B(Ax) + C(Ax) = (BA)x + (CA)x = (BA + CA)x

for all x, so (B + C)A = BA + CA.

(iii) By calculation. We have

∑_{j=1}^{n} (bij + cij)ajk = ∑_{j=1}^{n} (bij ajk + cij ajk) = ∑_{j=1}^{n} bij ajk + ∑_{j=1}^{n} cij ajk.

(iii) By summation convention. Observe that

(bij + cij)ajk = bij ajk + cij ajk.

(iv) By definition. Observe that

((λA)B)x = (λA)(Bx) = λ(A(Bx)) = λ((AB)x) = (λ(AB))x

for all x, so (λA)B = λ(AB). Again

(A(λB))x = A((λB)x) = A(λ(Bx)) = λ(A(Bx)) = λ((AB)x) = (λ(AB))x

for all x, so A(λB) = λ(AB).

(iv) By calculation. We have

∑_{j=1}^{n} (λaij)bjk = λ ∑_{j=1}^{n} aij bjk = ∑_{j=1}^{n} aij(λbjk).

(iv) By summation convention. Observe that

(λaij)bjk = λ(aij bjk) = aij(λbjk).

(v) By definition. Observe that

(IA)x = I(Ax) = Ax and (AI)x = A(Ix) = Ax

for all x, so IA = AI = A.

(v) By calculation. We have

∑_{j=1}^{n} δij ajk = aik = ∑_{j=1}^{n} aij δjk.

(v) By summation convention. Observe that

δij ajk = aik = aij δjk.
Exercise 3.3.10

We have

BA = A for all A ⇒ BI = I ⇒ B = I.
Exercise 3.3.12

(i) If A is an m × n matrix and λ ∈ R, then λA = C where C is the m × n matrix such that

λ(Ax) = Cx

for all x ∈ Rⁿ.

(ii) If A and B are m × n matrices, then A + B = C where C is the m × n matrix such that

Ax + Bx = Cx

for all x ∈ Rⁿ.

(iii) If A is an m × n matrix and B is an n × p matrix, then AB = C where C is the m × p matrix such that

A(Bx) = Cx

for all x ∈ Rᵖ.

Remark The matrices defined above are certainly unique. The definitions do not by themselves show existence, but the existence may be easily checked. In (i) take cij = λaij. In (ii) take cij = aij + bij. In (iii) take cij = ∑_{r=1}^{n} air brj.
Exercise 3.3.13

(i) Observe that

((A + B) + C)x = (A + B)x + Cx = (Ax + Bx) + Cx
             = Ax + (Bx + Cx) = Ax + (B + C)x
             = (A + (B + C))x

for all x ∈ Rⁿ, so (A + B) + C = A + (B + C).

(ii) Observe that

(A + B)x = Ax + Bx = Bx + Ax = (B + A)x

for all x ∈ Rⁿ, so A + B = B + A.

(iii) Observe that

(A + 0)x = Ax + 0x = Ax + 0 = Ax

for all x ∈ Rⁿ, so A + 0 = A.

(iv) Observe that

(λ(A + B))x = λ((A + B)x) = λ(Ax + Bx)
           = λ(Ax) + λ(Bx) = (λA)x + (λB)x
           = (λA + λB)x

for all x ∈ Rⁿ, so λ(A + B) = λA + λB.

(v) Observe that

((λ + µ)A)x = (λ + µ)(Ax) = λ(Ax) + µ(Ax)
           = (λA)x + (µA)x = (λA + µA)x

for all x ∈ Rⁿ, so (λ + µ)A = λA + µA.

(vi) Observe that

((λµ)A)x = (λµ)(Ax) = λ(µ(Ax)) = (λ(µA))x

for all x ∈ Rⁿ, so (λµ)A = λ(µA).

(vii) Observe that

(0A)x = 0(Ax) = 0 = 0x

for all x ∈ Rⁿ, so 0A = 0.

(ix) We have

A − A = (1 − 1)A = 0A = 0.
Exercise 3.3.14

(i) Observe that

((AC)F)x = (AC)(Fx) = A(C(Fx)) = A((CF)x) = (A(CF))x

for all x ∈ Rⁿ, so (AC)F = A(CF).

(ii) Observe that

(G(A + B))x = G((A + B)x) = G(Ax + Bx)
           = G(Ax) + G(Bx) = (GA)x + (GB)x
           = (GA + GB)x

for all x ∈ Rⁿ, so G(A + B) = GA + GB.

(iii) Observe that

((A + B)C)x = (A + B)(Cx) = A(Cx) + B(Cx)
           = (AC)x + (BC)x
           = (AC + BC)x

for all x ∈ Rⁿ, so (A + B)C = AC + BC.

(iv) Observe that

((λA)C)x = (λA)(Cx) = λ(A(Cx)) = λ((AC)x) = (λ(AC))x

for all x ∈ Rⁿ, so (λA)C = λ(AC). Observe that

(A(λC))x = A((λC)x) = A(λ(Cx)) = λ(A(Cx)) = λ((AC)x) = (λ(AC))x

for all x ∈ Rⁿ, so A(λC) = λ(AC).
Exercise 3.4.5

The conjecture is false. The matrices A and B given by

A = (1 1)   and   B = (1 0)
    (0 1)             (1 1)

are shear matrices, but

AB = (2 1)
     (1 1)

is not.
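Checking the product (and that its diagonal rules out a shear):

```python
# The two shears of the text and their product.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
AB = matmul(A, B)
assert AB == [[2, 1], [1, 1]]
# A 2x2 shear has both diagonal entries equal to 1; AB does not.
assert not (AB[0][0] == 1 and AB[1][1] == 1)
```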
Exercise 3.4.7

Since r ≠ s, we have δrs = 0 and

∑_{j=1}^{n} (δij + λδir δjs)(δjk + µδjr δks)
 = ∑_{j=1}^{n} (δij δjk + λδir δjs δjk + µδjr δks δij + λµδir δjs δjr δks)
 = δik + λδir δks + µδir δks + λµδsr δir δks
 = δik + (λ + µ)δir δks.

The result is also obvious geometrically.
Exercise 3.5.1

(Lk Lk−1 . . . L1 I)A = Lk Lk−1 . . . L1 A = I,

so A is invertible with

Lk Lk−1 . . . L1 I = A⁻¹.

Thus the same set of elementary operations, applied in the same order, which reduce A to I will transform I to A⁻¹.
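The recipe above (apply the same operations to I) is easy to code. A minimal Gauss–Jordan sketch (the example matrix is my own; no care is taken over zero pivots):

```python
from fractions import Fraction

# Gauss-Jordan sketch: the row operations that turn A into I are applied,
# in the same order, to a copy of I, which then becomes A^{-1}.
def inverse(A):
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    B = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        p = A[c][c]                       # divide the pivot row by the pivot
        A[c] = [x / p for x in A[c]]
        B[c] = [x / p for x in B[c]]
        for r in range(n):                # clear the rest of the column
            if r != c:
                f = A[r][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
                B[r] = [x - f * y for x, y in zip(B[r], B[c])]
    return B

Ainv = inverse([[1, 2], [3, 4]])
assert Ainv == [[Fraction(-2), Fraction(1)],
                [Fraction(3, 2), Fraction(-1, 2)]]
```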
Exercise 3.5.2

If we use row operations and column exchanges we can reduce A to I. If we omit the column exchanges, then the same row operations in the same order produce a matrix with exactly one 1 in each row and column and the remaining entries 0. By exchanging rows we can ensure that the first row is (1, 0, 0, . . . , 0), the second row (0, 1, 0, . . . , 0) and so on, i.e. we can produce I.
Exercise 4.1.1

(i) The vertices ABC, BCA, CAB are described in one sense (say anticlockwise) and the same vertices ACB, BAC, CBA in the opposite sense (say clockwise).

(ii) (Of course, a bit hand waving.) OAXB, OPQB, BXQ and OAP are anticlockwise but APQX is clockwise. Writing |area Σ| for the absolute value of the area, we have

|area OPQB| = |area OAXB| − |area APQX| + |area OAP| − |area BXQ|
            = |area OAXB| − |area APQX|.
61
Exercise 4.1.2
D(a,b) +D(b, a) = D(a+ b,b) +D(a+ b, a)
(By adding second entry to first.)
D(a+ b,b) +D(a+ b, a) = D(a+ b, a+ b)
(Since first entry the same, we can add as shown.)
D(a+ b, a+ b) = D(a+ b− (a+ b), a+ b)
(By subtracting second entry from first entry.)
D(a+ b− (a+ b), a+ b) = D(0, a+ b)
(Just do the calculation.)
D(0, a+ b) = 0
(Area of a degenerate parallelogram.) D(a, b) is the area of the parallelogram with vertices 0, a, a + b, b described in that order. D(b, a) is the area of the same parallelogram but with the vertices laid out in the opposite sense 0, b, a + b, a.
Exercise 4.1.3
D\left(\binom{a_1}{a_2}, \binom{b_1}{b_2}\right) = D\left(\binom{a_1}{0}, \binom{b_1}{b_2}\right) + D\left(\binom{0}{a_2}, \binom{b_1}{b_2}\right)
(Since the second columns are the same, we can add the first columns.)
D\left(\binom{a_1}{0}, \binom{b_1}{b_2}\right) + D\left(\binom{0}{a_2}, \binom{b_1}{b_2}\right) = D\left(\binom{a_1}{0}, \binom{b_1}{b_2} - \frac{b_1}{a_1}\binom{a_1}{0}\right) + D\left(\binom{0}{a_2}, \binom{b_1}{b_2} - \frac{b_2}{a_2}\binom{0}{a_2}\right)
(Since we may subtract multiples of the first column from the second.)
D\left(\binom{a_1}{0}, \binom{b_1}{b_2} - \frac{b_1}{a_1}\binom{a_1}{0}\right) + D\left(\binom{0}{a_2}, \binom{b_1}{b_2} - \frac{b_2}{a_2}\binom{0}{a_2}\right) = D\left(\binom{a_1}{0}, \binom{0}{b_2}\right) + D\left(\binom{0}{a_2}, \binom{b_1}{0}\right)
(Just doing the calculation.)
D\left(\binom{a_1}{0}, \binom{0}{b_2}\right) + D\left(\binom{0}{a_2}, \binom{b_1}{0}\right) = D\left(\binom{a_1}{0}, \binom{0}{b_2}\right) - D\left(\binom{b_1}{0}, \binom{0}{a_2}\right)
(Interchanging columns.)
D\left(\binom{a_1}{0}, \binom{0}{b_2}\right) - D\left(\binom{b_1}{0}, \binom{0}{a_2}\right) = (a_1b_2 - a_2b_1)\,D\left(\binom{1}{0}, \binom{0}{1}\right) = a_1b_2 - a_2b_1.
(Using the rules D(λa, b) = D(a, λb) = λD(a, b) and the fact that the area of a unit square is 1.)
Exercise 4.2.1
(Not a good idea, but possible as follows.)
D(AB) = D\begin{pmatrix}a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22}\\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}\end{pmatrix}
= (a_{11}b_{11} + a_{12}b_{21})(a_{21}b_{12} + a_{22}b_{22}) - (a_{11}b_{12} + a_{12}b_{22})(a_{21}b_{11} + a_{22}b_{21})
= a_{11}a_{22}(b_{11}b_{22} - b_{12}b_{21}) - a_{12}a_{21}(b_{11}b_{22} - b_{12}b_{21})
= D(A)D(B).
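The same algebra can be confirmed symbolically (a sketch using Python/sympy, not part of the original argument):

```python
import sympy as sp

a11, a12, a21, a22, b11, b12, b21, b22 = sp.symbols('a11 a12 a21 a22 b11 b12 b21 b22')
A = sp.Matrix([[a11, a12], [a21, a22]])
B = sp.Matrix([[b11, b12], [b21, b22]])
# D here is the 2x2 determinant ad - bc, as in the text.
assert sp.expand((A * B).det() - A.det() * B.det()) == 0
```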
Exercise 4.2.2
(i) We have
DI = D\begin{pmatrix}1&0\\0&1\end{pmatrix} = 1 × 1 − 0 × 0 = 1.
The area of a unit square is 1.
(ii) We have
DE = D\begin{pmatrix}0&1\\1&0\end{pmatrix} = 0 × 0 − 1 × 1 = −1,
and
E\binom{x}{y} = \begin{pmatrix}0&1\\1&0\end{pmatrix}\binom{x}{y} = \binom{0x + 1y}{1x + 0y} = \binom{y}{x}.
We observe that
E\binom{\cos t}{\sin t} = \binom{\sin t}{\cos t} = \binom{\cos(\frac{\pi}{2} - t)}{\sin(\frac{\pi}{2} - t)}
runs clockwise from (0, 1) back to (0, 1) as (cos t, sin t)ᵀ runs anticlockwise from (1, 0) back to (1, 0).
(iii) We have
D\begin{pmatrix}1&λ\\0&1\end{pmatrix} = 1 × 1 − λ × 0 = 1
and
D\begin{pmatrix}1&0\\λ&1\end{pmatrix} = 1 × 1 − 0 × λ = 1.
(iv) We have
D\begin{pmatrix}a&0\\0&b\end{pmatrix} = a × b − 0 × 0 = ab,
corresponding to the fact that a rectangle with sides of length a and b has area ab.
Exercise 4.2.3⋆
Exercise 4.3.2
As we hoped,
ε_{11}a_{11}a_{12} + ε_{12}a_{11}a_{22} + ε_{21}a_{21}a_{12} + ε_{22}a_{21}a_{22} = a_{11}a_{22} − a_{21}a_{12}.
Exercise 4.3.3
The result is automatic if r, s and t are not distinct.
When r, s and t are distinct, we check the case r = 1, when either s = 2, t = 3 or s = 3, t = 2, in Definition 4.3.1. We now do the same for r = 2 and r = 3.
Exercise 4.3.10
Observe that, writing C = AB, D = Cᵀ, A′ = Aᵀ, B′ = Bᵀ,
d_{ij} = c_{ji} = \sum_{r=1}^m a_{jr}b_{ri} = \sum_{r=1}^m a′_{rj}b′_{ir} = \sum_{r=1}^m b′_{ir}a′_{rj}
(for 1 ≤ i ≤ p, 1 ≤ j ≤ n). Thus (AB)ᵀ = BᵀAᵀ.
Exercise 4.3.13
x ↦ E_{r,s,λ}x is a shear which leaves volume unchanged.
Let D be a diagonal matrix with ith diagonal entry d_i. Then x ↦ Dx stretches by |d_i| in the 0x_i direction and reflects if d_i < 0. Thus the volume is multiplied by \prod_{i=1}^n d_i = det D.
x ↦ P(σ)x exchanges the handedness of the coordinates (or equivalently is a reflection in a particular plane), which multiplies volume by −1.
If A is any 3 × 3 matrix, then A = A_1A_2 · · · A_k with the A_j elementary matrices. Since
Ax = A_1(A_2(A_3 · · · (A_kx) · · ·)),
the transformation x ↦ Ax rescales volume by \prod_{j=1}^k det A_j = det A.
Exercise 4.3.14
The map x ↦ Mx changes the length scale by λ and the volume scale by λ³. (This is just a special case of the map x ↦ Dx considered in the previous question.)
Exercise 4.3.15
Let F(r, s, t) = ε_{ijk}a_{ir}a_{js}a_{kt}.
We observe that F(r, s, t) is the determinant of the matrix with first row the rth row of A, second row the sth row of A and third row the tth row of A.
Thus interchanging any two of r, s, t multiplies F(r, s, t) by −1 and so
F(r, s, t) = ε_{rst}K
for some constant K.
Since F(1, 2, 3) = det A, we have
ε_{ijk}a_{ir}a_{js}a_{kt} = F(r, s, t) = ε_{rst} det A.
(ii) Writing C = AB, we have
ε_{rst} det AB = ε_{rst} det C = ε_{ijk}c_{ir}c_{js}c_{kt}
= ε_{ijk}a_{iu}b_{ur}a_{jv}b_{vs}a_{kw}b_{wt} = ε_{ijk}a_{iu}a_{jv}a_{kw}b_{ur}b_{vs}b_{wt}
= ε_{uvw} det A b_{ur}b_{vs}b_{wt} = ε_{rst} det A det B.
Taking r = 1, s = 2, t = 3 we get detAB = detA detB.
Exercise 4.3.16⋆
Exercise 4.4.1
There is no non-trivial χ.
Observe that
χ_{ijkrs} = −χ_{kijrs} = χ_{jkirs} = −χ_{ijkrs},
so χ_{ijkrs} = 0.
Exercise 4.4.4
We have
ζ(σ) = \frac{(σ2 − σ1)(σ3 − σ1)(σ4 − σ1)(σ3 − σ2)(σ4 − σ2)(σ4 − σ3)}{(2 − 1)(3 − 1)(4 − 1)(3 − 2)(4 − 2)(4 − 3)},
ζ(τ) = \frac{(3 − 2)(1 − 2)(4 − 2)(1 − 3)(4 − 3)(4 − 1)}{(2 − 1)(3 − 1)(4 − 1)(3 − 2)(4 − 2)(4 − 3)} = 1,
ζ(ρ) = \frac{(3 − 2)(4 − 2)(1 − 2)(4 − 3)(1 − 3)(1 − 4)}{(2 − 1)(3 − 1)(4 − 1)(3 − 2)(4 − 2)(4 − 3)} = −1.
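These signs can be recomputed mechanically. In the sketch below (Python; the helper `zeta` is our own), τ and ρ are taken to be the permutations (2, 3, 1, 4) and (2, 3, 4, 1), which is how we have read the displayed products:

```python
from math import prod

def zeta(sigma):
    """Sign of a permutation of {1,...,n}, given as the tuple (sigma(1),...,sigma(n)),
    computed from the product formula prod_{i>j} (sigma(i)-sigma(j))/(i-j)."""
    n = len(sigma)
    num = prod(sigma[i] - sigma[j] for i in range(n) for j in range(i))
    den = prod(i - j for i in range(n) for j in range(i))
    return num // den

tau, rho = (2, 3, 1, 4), (2, 3, 4, 1)
assert zeta(tau) == 1 and zeta(rho) == -1
```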
Exercise 4.4.6
(i) If t ∉ {1, 2, i},
αρα1 = αρ2 = α1 = i = τ1
αρα2 = αρ1 = α2 = 2 = τ2
αραi = αρi = αi = 1 = τi
αραt = αρt = αt = t = τt
(ii) If j = 1, then the result follows from part (iv) of the lemma. Ifnot, let τ be as in part (iv) of the lemma and β ∈ Sn interchange 1 andj leaving the remaining integers unchanged. Then, by inspection,
βτβ(r) = κ(r)
for all 1 ≤ r ≤ n and so βτβ = κ. It follows that
ζ(κ) = ζ(β)ζ(τ)ζ(β) = ζ(β)2ζ(τ) = −1.
Exercise 4.4.7
If all the suffices i, j, k, l are distinct, then σ(1) = i, σ(2) = j, σ(3) = k, σ(4) = l gives a unique σ ∈ S4, so
ε_{ijkl} = ε_{σ(1)σ(2)σ(3)σ(4)} = ζ(σ)
does define ε_{ijkl}. If τ interchanges two suffices and leaves the rest unchanged,
ε_{τ(i)τ(j)τ(k)τ(l)} = ε_{τσ(1)τσ(2)τσ(3)τσ(4)} = ζ(τσ) = ζ(τ)ζ(σ) = −ζ(σ) = −ε_{ijkl}.
If i, j, k, l are not all distinct and i′, j′, k′, l′ is some rearrangement, then
ε_{ijkl} = 0 = −ε_{i′j′k′l′}.
Finally ε_{1234} = ζ(ι) = 1.
Exercise 4.4.8
Observe that 1−1 = 1 and (−1)−1 = −1.
Since ζ(σ)ζ(σ⁻¹) = ζ(σσ⁻¹) = ζ(ι) = 1, it follows that ζ(σ) = ζ(σ⁻¹).
Thus
det Aᵀ = \sum_{σ∈S_n} ζ(σ) a_{1σ(1)}a_{2σ(2)} · · · a_{nσ(n)}
= \sum_{σ∈S_n} ζ(σ) a_{σ⁻¹(1)1}a_{σ⁻¹(2)2} · · · a_{σ⁻¹(n)n}
= \sum_{σ∈S_n} ζ(σ⁻¹) a_{σ⁻¹(1)1}a_{σ⁻¹(2)2} · · · a_{σ⁻¹(n)n}
= \sum_{τ∈S_n} ζ(τ) a_{τ(1)1}a_{τ(2)2} · · · a_{τ(n)n}
= det A.
Exercise 4.4.9
(i) We have
det\begin{pmatrix}1&1\\x&y\end{pmatrix} = y − x.
(ii) Each term ζ(σ)a_{σ(1)1}a_{σ(2)2}a_{σ(3)3} in the standard expansion is a multinomial of degree 3.
F(x, x, z) = det\begin{pmatrix}1&1&1\\x&x&z\\x^2&x^2&z^2\end{pmatrix} = 0,
because two columns of the matrix are identical. Thus y − x must be a factor of F(x, y, z). Similarly z − y and z − x are factors. Since F has degree 3 and (y − x)(z − y)(z − x) has degree 3,
F(x, y, z) = A(y − x)(z − y)(z − x)
for some constant A. By considering the terms in the standard determinant expansion, we know that the coefficient of yz² is 1. Thus A = 1 and
F(x, y, z) = (y − x)(z − y)(z − x).
(iii) Since each term in the standard expansion of det V is a multinomial of degree 0 + 1 + · · · + (n − 1) = (n − 1)n/2, F is a multinomial of degree (n − 1)n/2.
F(x_1, x_1, x_3, x_4, . . . , x_n) = 0,
because two columns of the associated matrix are identical. Thus x_2 − x_1 must be a factor of F. Similarly x_j − x_k is a factor for each j > k. Since F has degree (n − 1)n/2 and \prod_{i>j}(x_i − x_j) has the same degree,
F(x_1, x_2, . . . , x_n) = A\prod_{i>j}(x_i − x_j)
for some constant A.
By considering the terms in the standard determinant expansion, we know that the coefficient of \prod_{j=2}^n x_j^{j−1} is 1. Thus A = 1 and
F(x_1, x_2, . . . , x_n) = \prod_{i>j}(x_i − x_j).
(iv) We have
ζ_x(σ) = \frac{F(x_{σ(1)}, x_{σ(2)}, . . . , x_{σ(n)})}{F(x_1, x_2, . . . , x_n)} = \prod_{i>j} \frac{x_{σ(i)} − x_{σ(j)}}{x_i − x_j}.
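Part (iii) can be checked symbolically for a small n (Python/sympy sketch; not part of the original argument):

```python
import sympy as sp

n = 4
xs = sp.symbols('x1:5')  # x1, x2, x3, x4
# Row i of the Vandermonde matrix holds the ith powers, as in the text.
V = sp.Matrix(n, n, lambda i, j: xs[j] ** i)
F = V.det()
product = sp.prod(xs[i] - xs[j] for i in range(n) for j in range(i))
assert sp.expand(F - product) == 0
```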
Now
\prod_{i>j}(x_{σ(i)} − x_{σ(j)}) = B_σ \prod_{i>j}(x_i − x_j),
where B_σ depends only on σ (and not on x). Thus
ζ_x(σ) = ζ_y(σ)
with y_j = j, and so
ζ_x(σ) = ζ(σ).
Thus ζ_x = ζ.
Exercise 4.5.1
Each of the n! terms involves n multiplications (in addition to finding the appropriate sign) so we need n × n! multiplications. We can either use a Stirling approximation or an electronic calculator (which probably uses some version of Stirling's formula) to obtain an estimate of about 36 000 000 for n = 10 and about 4.9 × 10¹⁹ for n = 20.
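The two estimates are easy to reproduce exactly (Python):

```python
from math import factorial

# n * n! multiplications for the full expansion of an n x n determinant.
assert 10 * factorial(10) == 36_288_000            # about 36 000 000
assert abs(20 * factorial(20) - 4.9e19) < 0.1e19   # about 4.9 * 10^19
```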
Exercise 4.5.3
If A = (a_{ij}) is an n × n matrix which is both upper and lower triangular, then a_{ij} = 0 if i < j or if j < i, so A is diagonal.
Exercise 4.5.5
(i) By row and column operations on the first r rows and columns we can reduce
C = \begin{pmatrix}A&0\\0&B\end{pmatrix} to C′ = \begin{pmatrix}A′&0\\0&B\end{pmatrix}
with det C = K det C′, det A = K det A′ and A′ lower triangular.
By row and column operations on the last s rows and columns we can reduce C′ to C′′ with
C′′ = \begin{pmatrix}A′&0\\0&B′\end{pmatrix}
and det C′ = K′ det C′′, det B = K′ det B′ and B′ lower triangular.
We now have
det C = KK′ det C′′ = KK′ det A′ det B′ = det A det B.
(ii) FALSE. Consider
F = \begin{pmatrix}0&1&0&0\\0&0&1&0\\1&0&0&0\\0&0&0&1\end{pmatrix}.
With the suggested notation,
det A = det B = det C = det D = 0,
but
det F = 1 ≠ 0 = det A det D − det B det C.
Exercise 4.5.6
Dividing the first row by 2,
det\begin{pmatrix}2&4&6\\3&1&2\\5&2&3\end{pmatrix} = 2 det\begin{pmatrix}1&2&3\\3&1&2\\5&2&3\end{pmatrix}.
Subtracting multiples of the first row from the second and third rows,
2 det\begin{pmatrix}1&2&3\\3&1&2\\5&2&3\end{pmatrix} = 2 det\begin{pmatrix}1&2&3\\0&-5&-7\\0&-8&-12\end{pmatrix}.
Expanding by the first column,
2 det\begin{pmatrix}1&2&3\\0&-5&-7\\0&-8&-12\end{pmatrix} = 2 det\begin{pmatrix}-5&-7\\-8&-12\end{pmatrix}.
Dividing the first row by −1 and the second row by −4,
2 det\begin{pmatrix}-5&-7\\-8&-12\end{pmatrix} = 8 det\begin{pmatrix}5&7\\2&3\end{pmatrix}.
Subtracting two times the second row from the first (hardly necessary, we could have finished the calculation here),
8 det\begin{pmatrix}5&7\\2&3\end{pmatrix} = 8 det\begin{pmatrix}1&1\\2&3\end{pmatrix}.
From the definition,
8 det\begin{pmatrix}1&1\\2&3\end{pmatrix} = 8(1 × 3 − 1 × 2) = 8.
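A direct check of the final value (Python/numpy; not part of the hand calculation):

```python
import numpy as np

A = np.array([[2, 4, 6], [3, 1, 2], [5, 2, 3]])
assert round(np.linalg.det(A)) == 8
```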
Exercise 4.5.7
(i) Observe that
det\begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix} = det\begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}^T = \sum_{σ∈S_3} ζ(σ)\prod_{i=1}^3 a_{iσ(i)}
= a_{11}a_{22}a_{33} − a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} − a_{12}a_{21}a_{33} + a_{13}a_{21}a_{32} − a_{13}a_{22}a_{31}
= a_{11} det\begin{pmatrix}a_{22}&a_{23}\\a_{32}&a_{33}\end{pmatrix} − a_{12} det\begin{pmatrix}a_{21}&a_{23}\\a_{31}&a_{33}\end{pmatrix} + a_{13} det\begin{pmatrix}a_{21}&a_{22}\\a_{31}&a_{32}\end{pmatrix}.
(ii) We have
det\begin{pmatrix}2&4&6\\3&1&2\\5&2&3\end{pmatrix} = 2 det\begin{pmatrix}1&2\\2&3\end{pmatrix} − 4 det\begin{pmatrix}3&2\\5&3\end{pmatrix} + 6 det\begin{pmatrix}3&1\\5&2\end{pmatrix}
= 2 × (−1) − 4 × (−1) + 6 × 1 = 8.
Exercise 4.5.8
(i) We look at each of the four terms a_{r1} det B_{r1}. Let A′ be the matrix obtained from A by interchanging the first and jth rows. If r ≠ 1, j, then det B_{r1} changes sign, giving −a_{r1} det B_{r1}; the term a_{11} det B_{11} changes to −a_{j1} det B_{j1}; and a_{j1} det B_{j1} changes to −a_{11} det B_{11}.
Thus F(A′) = −F(A).
(iii) The result is already proved if i or j takes the value 1. If not, interchanging row 1 with row i, then interchanging row 1 of the new matrix with row j of the new matrix, and finally interchanging rows 1 and i again, transforms A into the matrix A′ obtained from A by interchanging rows i and j. By part (i),
F(A′) = −F(A).
If rows i and j are the same, this gives F(A) = −F(A) and so F(A) = 0.
(iv) By inspection,
F(A′) = F(A) + F(B),
where A′ is the matrix A with the ith row added to the first row and B is the matrix A with the first row replaced by the ith row. By part (iii), F(B) = 0, so
F(A′) = F(A).
By considering the effect of interchanging the first and jth rows, we get the more general result that adding one row to another leaves F unchanged.
(v) If we now carry out the diagonalisation procedure of Theorem 3.4.8 on A, using the rules above, and observe that
F(I) = 1 = det I,
we get F(A) = det A.
Exercise 4.5.10
(i) The argument goes through essentially word for word (replacing 4 by n). The formula
\sum_{j=1}^n a_{1j}A_{1j} = det A
is the row expansion formula so obtained.
(ii) If i = 1 there is nothing to prove. If i ≠ 1, we argue as follows. If B is the matrix obtained from A by interchanging row 1 and row i, then
det A = −det B = −\sum_{j=1}^n b_{1j}B_{1j} = \sum_{j=1}^n a_{ij}A_{ij}.
(iii) If i ≠ k, then
\sum_{j=1}^n a_{ij}A_{kj} = det C,
where C is an n × n matrix with ith and kth rows the same. Thus
\sum_{j=1}^n a_{ij}A_{kj} = 0.
(iv) Conditions (ii) and (iii) together give
\sum_{j=1}^n a_{ij}A_{kj} = δ_{ik} det A.
Exercise 4.5.13
We have
det A det A⁻¹ = det AA⁻¹ = det I = 1,
so det A⁻¹ = (det A)⁻¹.
Exercise 4.5.14
The formula of Exercise 4.5.10 (v) shows that \sum_{k=1}^n b_kA_{kj} is the determinant of a matrix obtained from A by replacing the jth column of A by b.
Thus
\sum_{j=1}^n a_{ij} det B_j = \sum_{j=1}^n a_{ij} \sum_{k=1}^n b_kA_{kj}
= \sum_{j=1}^n \sum_{k=1}^n a_{ij}b_kA_{kj}
= \sum_{k=1}^n \sum_{j=1}^n a_{ij}b_kA_{kj}
= \sum_{k=1}^n b_k \sum_{j=1}^n a_{ij}A_{kj}
= \sum_{k=1}^n b_kδ_{ik} det A = b_i det A.
Thus, if det A ≠ 0,
x_j = \frac{det B_j}{det A}
gives a solution of Ax = b (and, since there is only one solution, it is the solution).
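The rule x_j = det B_j / det A can be sketched as code (Python/numpy; `cramer` is our own name, and this is only sensible for small systems, as the next exercise points out):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b via x_j = det(B_j)/det(A), where B_j is A with
    its jth column replaced by b (valid when det A != 0)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Bj = A.copy()
        Bj[:, j] = b      # replace the jth column by b
        x[j] = np.linalg.det(Bj) / d
    return x

A, b = [[2, 1], [5, 3]], [3, 8]
assert np.allclose(cramer(A, b), np.linalg.solve(A, b))
```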
Exercise 4.5.15
The statement is true, but you still need to find the determinants of two n × n matrices. Unless we deal with very small systems of equations (corresponding to a 3 × 3 matrix, say), the labour involved in computing det A (at least by the methods given in this book or any I am aware of) is, at best, comparable with the effort of finding all the solutions of Ax = b.
Exercise 4.5.16
(i) Observe that, since S_n is a group,
perm Aᵀ = \sum_{σ∈S_n} \prod_{i=1}^n a_{σ(i)i} = \sum_{σ∈S_n} \prod_{i=1}^n a_{iσ⁻¹(i)} = \sum_{τ∈S_n} \prod_{i=1}^n a_{iτ(i)} = perm A,
setting τ = σ⁻¹.
(ii) Both statements are false. If we set
A = \begin{pmatrix}1&1\\1&1\end{pmatrix}, B = \begin{pmatrix}1&1\\1&-1\end{pmatrix},
then perm A = 2 ≠ 0 but det A = 0, and det B = −2 but perm B = 0.
(iii) Write A_{ji} for the (n − 1) × (n − 1) matrix obtained by removing the ith row and jth column from A. We have
perm A = \sum_{i=1}^n a_{1i} perm A_{i1}.
(iv) By (iii),
|perm A| ≤ nK max_i |perm A_{i1}|,
so, by induction, |perm A| ≤ n!Kⁿ.
If we take A(n) to be the n × n matrix with all entries K (where K ≥ 0), then
perm A(n) = nK perm A(n − 1),
so, by induction, perm A(n) = n!Kⁿ.
(v) If B is the n × n matrix with b_{ij} = |a_{ij}|, then
|det A| ≤ perm B ≤ n!Kⁿ.
Hadamard’s inequality is proved later.
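The extreme case in part (iv) can be confirmed by brute force for small n (Python sketch; the helper `perm` simply expands the defining sum and is our own illustration):

```python
from itertools import permutations
from math import factorial, prod

def perm(A):
    """Permanent of a square matrix: the determinant expansion with every sign +1."""
    n = len(A)
    return sum(prod(A[i][s[i]] for i in range(n)) for s in permutations(range(n)))

K, n = 3, 4
A = [[K] * n for _ in range(n)]
assert perm(A) == factorial(n) * K ** n  # n! K^n, the case of equality
```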
Exercise 4.5.17
(i) True. If A is a 2 × 2 antisymmetric matrix, then
A = \begin{pmatrix}0&b\\-b&0\end{pmatrix}
and det A = b² ≠ 0 if A ≠ 0.
(ii) False. Consider
\begin{pmatrix}0&1&0&0\\-1&0&0&0\\0&0&0&0\\0&0&0&0\end{pmatrix}.
(iii) True. Since n is odd,
det A = det Aᵀ = det(−A) = (−1)ⁿ det A = −det A,
so det A = 0.
Exercise 5.1.1
If we work over Z we cannot always divide.
Thus the equation 2x = 1 has no solution, although the 1×1 matrix(2) has non-zero determinant.
Exercise 5.1.2⋆
Exercise 5.2.7
I would be inclined to pick (iv) and (vi).
(ii) (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x) for all x so f + g = g + f.
(iii) (f+0)(x) = f(x)+0(x) = f(x)+0 = f(x) for all x so f+0 = f .
(iv) (λ(f+g))(x) = λ((f+g)(x)) = λ(f(x)+g(x)) = λf(x)+λg(x) =(λf)(x) + (λg)(x) = (λf + λg)(x) for all x so λ(f + g) = λf + λg.
(v) ((λ + µ)f)(x) = (λ + µ)(f(x)) = λf(x) + µf(x) = (λf)(x) + (µf)(x) = (λf + µf)(x) for all x so (λ + µ)f = λf + µf.
(vi) ((λµ)f)(x) = (λµ)(f(x)) = (λµ)f(x) = λ(µf(x)) = λ((µf)(x)) =(λ(µf))(x) for all x so (λµ)f = λ(µf).
(vii) (1f)(x) = 1 × f(x) = f(x) and (0f)(x) = 0 × f(x) = 0 = 0(x)for all x and so 1f = f and 0f = 0.
Exercise 5.2.8
The correspondence FX ↔ Fn given by
f ↔ (f(1), f(2), f(3), . . . , f(n))
identifies FX with the known vector space Fn.
Exercise 5.2.11
Observe that all these sets are subsets of the vector space RR so wemay use Lemma 5.2.10.
(i) Subspace. If f and g are 3 times differentiable so is λf + µg.
(ii) Not a vector space. Let f(t) = t2. Then f is in the set but (−1)fis not.
(iii) Not a vector space. Let P (t) = t. Then P is in the set but(−1)P is not.
(iv) Subspace. If P , Q in set, then λP + µQ is a polynomial and(λP + µQ)′(1) = λP ′(1) + µQ′(1) = 0
(v) Subspace. If P, Q in the set, then λP + µQ is a polynomial and
\int_0^1 (λP + µQ)(t)\,dt = \int_0^1 (λP(t) + µQ(t))\,dt = λ\int_0^1 P(t)\,dt + µ\int_0^1 Q(t)\,dt = 0.
(vi) Not a vector space. Let
h(t) = max{(1 − 10|t|), 0}
and observe that \int h(t)³\,dt = A ≠ 0. If f(t) = h(t) − h(t + 1/5) and g(t) = h(t) − h(t + 2/5), then, since the supports are disjoint,
\int_{-1}^1 f(t)³\,dt = \int_{-1}^1 g(t)³\,dt = A − A = 0,
but
\int_{-1}^1 (f(t) + g(t))³\,dt = 8A − 2A ≠ 0.
(vii) Not a vector space. If f(t) = −g(t) = t3 then f and g havedegree exactly 3 but f + g = 0 does not.
(viii) Subspace. If P , Q in set, then so is λP + µQ.
Exercise 5.3.2
T (0) = T (00) = 0T (0) = 0.
Exercise 5.3.3
Let f, g ∈ D.
(i) We have
δ(λf + µg) = (λf + µg)(0) = λf(0) + µg(0) = λδf + µδg.
(ii) We have
D(λf + µg) = (λf + µg)′ = λf ′ + µg′ = λDf + µDg.
(iii) We have
(K(λf + µg))(x) = (x² + 1)(λf + µg)(x) = (x² + 1)(λf(x) + µg(x))
= λ(x² + 1)f(x) + µ(x² + 1)g(x)
= λ(Kf)(x) + µ(Kg)(x) = (λKf + µKg)(x)
for all x and so K(λf + µg) = λKf + µKg.
(iv) We have
(J(λf + µg))(x) = \int_0^x (λf(t) + µg(t))\,dt = λ\int_0^x f(t)\,dt + µ\int_0^x g(t)\,dt = (λJf + µJg)(x)
for all x and so J(λf + µg) = λJf + µJg.
Exercise 5.3.10
(i) We have
ι(λx + µy) = λx + µy = λιx + µιy,
so ι is linear.
(ii) Just a restatement of the definitions.
(iii) The fundamental theorem of the calculus states that DJ = ι. However, JD1 = J0 = 0 ≠ 1. To see that J is injective, observe that
Jf = Jg ⇒ DJf = DJg ⇒ f = g.
However, (Jf)(0) = 0 for every f, so 1 does not lie in the image of J and J is not surjective.
To see that D is surjective, observe that D(Jf) = f. However, D0 = D1 = 0, so D is not injective.
(iv) Observe that
(αβ)(β⁻¹α⁻¹) = α(ββ⁻¹)α⁻¹ = αια⁻¹ = αα⁻¹ = ι
and similarly (β⁻¹α⁻¹)(αβ) = ι.
Exercise 5.3.11
The ‘only if’ part is trivial.
To prove the if part suppose
Tu = 0 ⇒ u = 0.
Then
Tx = Ty ⇒ Tx− Ty = 0
⇒ T (x− y) = 0
⇒ x− y = 0
⇒ x = y.
Exercise 5.3.14
(i) If α, β ∈ GL(U), then α and β are bijective, so αβ is bijective, so αβ ∈ GL(U).
(ii) Observe that if α, β, γ ∈ GL(U), then
((αβ)γ)u = (αβ)(γu) = α(β(γu)) = α((βγ)u) = (α(βγ))u
for all u ∈ U, so (αβ)γ = α(βγ).
(iii) ι ∈ GL(U) and αι = ια = α for all α ∈ GL(U).
(iv) Use the definition of GL(U).
Let α(x, y) = (x, 0), β(x, y) = (y, x). Then αβ(x, y) = (y, 0),βα(x, y) = (0, x). (Could use matrices.)
Exercise 5.3.16
(i) Not a subgroup. Observe that if
A = \begin{pmatrix}1&1\\2&0\end{pmatrix},
then A is invertible (since det A ≠ 0), but
A² = \begin{pmatrix}3&1\\2&2\end{pmatrix}.
(ii) Yes, a subgroup, since det ι = 1 > 0,
det α, det β > 0 ⇒ det αβ = det α det β > 0
and
det α > 0 ⇒ det α⁻¹ = (det α)⁻¹ > 0.
(iii) Not a subgroup. If α = 2ι, then det α = 2ⁿ ∈ Z but (if n ≥ 2) det α⁻¹ = 2⁻ⁿ ∉ Z.
(iv) Yes, a subgroup. If α, β ∈ H4 with matrices A and B, then AB has integer entries and det AB = det A det B = 1. Further, since A⁻¹ = (det A)⁻¹ Adj A = Adj A and Adj A has integer entries, α⁻¹ ∈ H4. Finally, ι ∈ H4.
(v) Yes, a subgroup. Let S_n be the group of permutations
σ : {1, 2, . . . , n} → {1, 2, . . . , n}.
Let α_σ be the linear map whose matrix is (δ_{i,σ(i)}) (that is to say, (a_{ij}) with a_{iσ(i)} = 1 and a_{ij} = 0 otherwise). Then α_ι = ι, α_σα_τ = α_{τσ} and α_σ⁻¹ = α_{σ⁻¹}.
Exercise 5.4.3
Observe that
\sum_{j=1}^n λ_j(e_j + y) = 0 ⇒ \sum_{j=1}^n λ_je_j + \sum_{j=1}^n \sum_{k=1}^n λ_ja_ke_k = 0
⇒ \sum_{j=1}^n λ_je_j + \sum_{j=1}^n \sum_{k=1}^n λ_ka_je_j = 0
⇒ \sum_{j=1}^n \left(λ_j + a_j\sum_{r=1}^n λ_r\right)e_j = 0
⇒ λ_j + a_j\sum_{r=1}^n λ_r = 0 for all j.
Thus, if
\sum_{j=1}^n λ_j(e_j + y) = 0
and we write K = −\sum_{r=1}^n λ_r, then
λ_j = Ka_j.
Summing, we obtain
−K = \sum_{j=1}^n λ_j = K\sum_{j=1}^n a_j,
so, if a_1 + a_2 + · · · + a_n + 1 ≠ 0, K = 0 and λ_j = 0, showing that the vectors e_j + y are linearly independent.
If a_1 + a_2 + · · · + a_n + 1 = 0, then
\sum_{j=1}^n a_j(e_j + y) = 0
and (since y ≠ 0) not all the a_j are zero, so we do not have linear independence.
Exercise 5.4.11
U and V are the null spaces of linear maps from R4 to R2.
The elements of U ∩ V are given by
x+ y − 2z + t = 0
−x+ y + z − 3t = 0
x− 2y + z + 2t = 0
y + z − 3t = 0.
Subtracting the 4th equation from the 2nd yields x = 0 so the systembecomes
y − 2z + t = 0
x = 0
−2y + z + 2t = 0
y + z − 3t = 0.
Adding the 1st and 3rd equation reveals that the 4th equation is su-perfluous so the system is
y − 2z + t = 0
x = 0
−2y + z + 2t = 0,
or equivalently
x = 0
y − 2z + t = 0
−3z + 6t = 0
so x = 0, z = 2t, y = −2t. Thus U ∩ V has basis e1 = (0,−2, 2, 1)T .
The equations for U yield
x+ y − 2z + t = 0
2y − z − 2t = 0
so, by inspection, e2 = (1, 1, 2, 0)ᵀ is in U. Since e1 and e2 are linearly independent and dim U = 2, we have a basis for U.
(To see that dimU = 2 either quote general theorems or observe thatx and y determine z and t.)
The equations for V yield
x− 2y + z + 2t = 0
y + z − 2t = 0
so, by inspection, e3 = (3, 1, 2, 0)ᵀ is in V. Since e1 and e3 are linearly independent and dim V = 2 (argue as before), we have a basis for V.
The proof of Lemma 5.4.10 now shows that e1, e2, e3 form a basis for U + V.
Exercise 5.4.12
(i) Since U ⊇ V +W ⊇ V,W we have
dimU ≥ dim(V +W ) ≥ dim V, dimW
By Lemma 5.4.10
dim(V + W) = dim V + dim W − dim(V ∩ W) ≤ dim V + dim W.
Putting these two results together, we get
min{dimU, dimV + dimW} ≥ dim(V +W ) ≥ max{dimV, dimW}.
(ii) Consider a basis e1, e2, . . . , en for U .
Let
V = span{e_1, e_2, . . . , e_r}, W = span{e_{t−s+1}, e_{t−s+2}, . . . , e_t}.
Then
V + W = span{e_1, e_2, . . . , e_t}
and
dim V = r, dim W = s, dim(V + W) = t.
Exercise 5.4.13
We use row vectors.
U is the null space of a linear map from R³ to R and so a subspace.
Let e1 = (1, −1, 0), e2 = (1, 0, −1). We have e1, e2 ∈ U.
x_1e_1 + x_2e_2 = 0 ⇒ (x_1 + x_2, −x_1, −x_2) = 0 ⇒ x_1 = x_2 = 0,
so e1, e2 are independent.
If x ∈ U, then
x = (x_1, x_2, x_3) ⇒ x = (−x_2 − x_3, x_2, x_3) ⇒ x = −x_2e_1 − x_3e_2.
Thus e1, e2 span U and so form a basis.
I do not think everybody will choose this basis and so there cannot be a ‘standard basis’ in this case.
Exercise 5.4.14
(i) 0 ∈ V for all V ∈ V and so 0 ∈ ⋂V ∈V V .
Further, if λ, µ ∈ F,
u, v ∈ \bigcap_{V∈V} V ⇒ u, v ∈ V for all V ∈ V
⇒ λu + µv ∈ V for all V ∈ V
⇒ λu + µv ∈ \bigcap_{V∈V} V.
Thus \bigcap_{V∈V} V is a subspace of U.
(ii) By (i) W is a subspace of U . Since E ⊆ V for all V ∈ V, wehave E ⊆ W . If W ′ is as stated W ′ ∈ V so W ′ ⊇ W .
(iii) Let
W′ = \{\sum_{j=1}^n λ_je_j : λ_j ∈ F for 1 ≤ j ≤ n\}.
Observe that E ⊆ W′ (take λ_j = δ_{ij}) and that W′ is a subspace of U, since
λ\sum_{j=1}^n λ_je_j + µ\sum_{j=1}^n µ_je_j = \sum_{j=1}^n (λλ_j + µµ_j)e_j.
Any subspace of U containing E must certainly contain W′. Thus W = W′.
Exercise 5.5.3
Always false. 0 /∈ α−1v.
Exercise 5.5.5
We use the rank-nullity theorem.
α injective ⇔ α−1(0) = 0
⇔ dimα−1(0) = 0
⇔ n− dimα−1(0) = n
⇔ dimα(U) = n
⇔ α surjective
We now note that bijective means injective and surjective and thata map is bijective if and only if it is invertible.
Exercise 5.5.6
Adding the second and third rows to the top row, we get
det\begin{pmatrix}a&b&b\\b&a&b\\b&b&a\end{pmatrix} = det\begin{pmatrix}a+2b&a+2b&a+2b\\b&a&b\\b&b&a\end{pmatrix}
= (a + 2b) det\begin{pmatrix}1&1&1\\b&a&b\\b&b&a\end{pmatrix}
= (a + 2b) det\begin{pmatrix}1&1&1\\0&a-b&0\\0&0&a-b\end{pmatrix}
= (a + 2b)(a − b)².
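The factorisation can be confirmed symbolically (Python/sympy sketch; not part of the original text):

```python
import sympy as sp

a, b = sp.symbols('a b')
M = sp.Matrix([[a, b, b], [b, a, b], [b, b, a]])
# det = (a + 2b)(a - b)^2
assert sp.expand(M.det() - (a + 2*b) * (a - b)**2) == 0
```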
Thus, if a ≠ b and a ≠ −2b, α has rank 3, α⁻¹(0) = {0} has empty basis and α(U) = U has basis (1, 0, 0)ᵀ, (0, 1, 0)ᵀ, (0, 0, 1)ᵀ.
If a = b, there are two cases. If a = b = 0, α has rank 0, α(U) = {0} has empty basis and α⁻¹(0) = U has basis (1, 0, 0)ᵀ, (0, 1, 0)ᵀ, (0, 0, 1)ᵀ.
If a = b ≠ 0, then
(x, y, z)ᵀ ∈ α⁻¹(0) ⇔ x + y + z = 0,
so α has rank 3 − 2 = 1 and α⁻¹(0) has basis (1, 0, −1)ᵀ, (0, 1, −1)ᵀ. U has basis (1, 0, −1)ᵀ, (0, 1, −1)ᵀ, (0, 0, 1)ᵀ, so α(U) has basis α(0, 0, 1)ᵀ = (a, a, a)ᵀ, i.e. basis (1, 1, 1)ᵀ.
If a = −2b ≠ 0, then
(x, y, z)ᵀ ∈ α⁻¹(0) ⇔ x − 2y + z = 0, x + y − 2z = 0 ⇔ x = y = z,
so α has rank 3 − 1 = 2 and α⁻¹(0) has basis (1, 1, 1)ᵀ. U has basis (1, 1, 1)ᵀ, (0, 1, 0)ᵀ, (0, 0, 1)ᵀ, so α(U) has basis α(0, 1, 0)ᵀ = (b, a, b)ᵀ and α(0, 0, 1)ᵀ = (b, b, a)ᵀ, i.e. basis (1, −2, 1)ᵀ, (1, 1, −2)ᵀ.
Exercise 5.5.12⋆
Exercise 5.5.13
(i) If A is an m × n matrix with rows a_1, a_2, . . . , a_m then the row rank of A is the dimension of
span{a_1, a_2, . . . , a_m}.
(ii) If B is a non-singular n × n matrix and x is a row vector with n entries, then
xB = 0 ⇒ xBB⁻¹ = 0B⁻¹ ⇒ x = 0, and x = 0 ⇒ xB = 0,
so
x = 0 ⇔ xB = 0.
It follows that
\sum_{j=1}^m λ_ja_jB = 0 ⇔ \left(\sum_{j=1}^m λ_ja_j\right)B = 0 ⇔ \sum_{j=1}^m λ_ja_j = 0,
so
dim span{a_1B, a_2B, . . . , a_mB} = dim span{a_1, a_2, . . . , a_m}
and the row ranks of A and AB are the same.
Since the elementary column operations correspond to post-multiplication by invertible matrices, the row rank is unaltered by elementary column operations. A similar argument with pre-multiplication replacing post-multiplication shows that the row rank is unaltered by elementary row operations. Similarly the column rank is unaltered by elementary row and column operations.
(iii) By elementary row and column operations any matrix A can be transformed to a matrix C = (c_{ij}) with c_{ij} = 0 if i ≠ j. By inspection, C has row and column rank equal to the number of non-zero entries c_{ii}. Thus, by (ii), the row rank and column rank of A are the same.
Exercise 5.5.14
If a = b = 0, then r = 0.
If a ≠ 0 and b = 0, the matrix is
\begin{pmatrix}a&0&0&0\\0&a&0&0\\a&0&0&0\\0&0&0&a\end{pmatrix}.
The first and third rows are identical, so r ≤ 3. However, the matrix obtained by removing the third row and third column,
\begin{pmatrix}a&0&0\\0&a&0\\0&0&a\end{pmatrix},
is non-singular, so r = 3.
If a = 0 and b ≠ 0, we get
\begin{pmatrix}0&0&b&0\\0&0&0&b\\0&0&0&b\\0&b&0&0\end{pmatrix}.
The second and third rows are identical, so r ≤ 3. However, the matrix obtained by removing the third row and first column,
\begin{pmatrix}0&b&0\\0&0&b\\b&0&0\end{pmatrix},
is non-singular (for example, because the determinant is non-zero), so r = 3.
If a = b ≠ 0, the matrix is
\begin{pmatrix}a&0&a&0\\0&a&0&a\\a&0&0&a\\0&a&0&a\end{pmatrix}.
The second and fourth rows are identical, so r ≤ 3. However, the matrix obtained by removing the fourth row and fourth column,
\begin{pmatrix}a&0&a\\0&a&0\\a&0&0\end{pmatrix},
is non-singular (for example, because the determinant is non-zero), so r = 3.
If a, b ≠ 0 and a ≠ b, then subtracting the first row from the third row and b/a times the second row from the fourth,
\begin{pmatrix}a&0&b&0\\0&a&0&b\\a&0&0&b\\0&b&0&a\end{pmatrix} → \begin{pmatrix}a&0&b&0\\0&a&0&b\\0&0&-b&b\\0&b&0&a\end{pmatrix} → \begin{pmatrix}a&0&b&0\\0&a&0&b\\0&0&-b&b\\0&0&0&a-b^2/a\end{pmatrix},
which is non-singular provided a² ≠ b² (e.g. by looking at the determinant), so r = 4 and the original matrix was non-singular. (If a = −b ≠ 0, the final diagonal entry a − b²/a vanishes; the second and fourth rows of the original matrix are then proportional and, arguing as before with a 3 × 3 submatrix, r = 3.)
Exercise 5.5.16
Consider the linear maps α and β whose matrices with respect to a given basis are A = (a_{ij}) with a_{ii} = 1 if 1 ≤ i ≤ r, a_{ij} = 0 otherwise, and B = (b_{ij}) with b_{ii} = 1 if r − t + 1 ≤ i ≤ r + s − t, b_{ij} = 0 otherwise. (Note that we use the facts that r, s ≥ t and r + s − t ≤ n.) Then αβ has matrix AB = (c_{ij}) with c_{ii} = 1 if r − t + 1 ≤ i ≤ r, c_{ij} = 0 otherwise. Then α has rank r, β has rank s and αβ has rank t.
Exercise 5.5.17
By noting that (1, −1, 0, 0, . . . , 0)ᵀ, (1, 0, −1, 0, . . . , 0)ᵀ, . . . , (1, 0, 0, 0, . . . , −1)ᵀ are n − 1 linearly independent eigenvectors with eigenvalue 0 and (1, 1, . . . , 1)ᵀ is an eigenvector with eigenvalue n for the n × n matrix A with all entries 1, we see that the characteristic polynomial of A is det(tI − A) = t^{n−1}(t − n). Thus A + sI is invertible if and only if s ≠ 0, −n.
Let P be as stated. Then PPᵀ = Q with
q_{ij} = \sum_{r=1}^n p_{ir}p_{jr} = 1 if i ≠ j, and q_{ii} = k.
Thus PPᵀ = A + (k − 1)I and, since k ≥ 2, we have PPᵀ invertible, so P has full rank n.
If k = 1, then there can only be one party, P = (1 1 1 . . . 1)ᵀ, which has rank 1. The proof fails because PPᵀ = A is not of full rank.
Exercise 5.5.18
(i) Observe that (α + β)U ⊆ V , so dimV ≥ dim(α + β)U .
Since (α+β)u = αu+βu ∈ αU+βU , we have (α+β)U ⊆ αU+βU ,so, by Lemma 5.4.10,
dim(α+ β)U ≤ dim(αU + βU) ≤ dimαU + dim βU.
Thus
min{dim V, rank α + rank β} ≥ rank(α + β).
(ii) By (i),
rank(α + β) + rank β = rank(α + β) + rank(−β) ≥ rank((α + β) − β) = rank α.
Thus
rank(α + β) ≥ rank α − rank β
and
rank(α + β) = rank(β + α) ≥ rank β − rank α.
Thus
rank(α + β) ≥ |rank α − rank β|.
(iii) Since α + β = β + α, there is no loss of generality in assuming r ≥ s.
Fix bases for U and V. If t ≥ r, consider the linear map α given by the matrix (a_{ij}) with a_{ii} = 1 if 1 ≤ i ≤ r, a_{ij} = 0 otherwise, and the linear map β given by the matrix (b_{ij}) with b_{ii} = 1 if t − s + 1 ≤ i ≤ t, b_{ij} = 0 otherwise. Then α + β has matrix (c_{ij}) with c_{ii} = 1 if 1 ≤ i ≤ t − s, c_{ii} = 2 if t − s + 1 ≤ i ≤ r, c_{ii} = 1 if r + 1 ≤ i ≤ t, c_{ij} = 0 otherwise. Thus rank α = r, rank β = s and rank(α + β) = t.
If t ≤ r − 1, consider the linear map α given by the matrix (a_{ij}) with a_{ii} = 1 if 1 ≤ i ≤ r, a_{ij} = 0 otherwise, and the linear map β given by the matrix (b_{ij}) with b_{ii} = −1 if 1 ≤ i ≤ r − t, b_{ii} = 1 if r − t + 1 ≤ i ≤ s, b_{ij} = 0 otherwise. Then α + β has matrix (c_{ij}) with c_{ii} = 2 if r − t + 1 ≤ i ≤ s, c_{ii} = 1 if s + 1 ≤ i ≤ r, c_{ij} = 0 otherwise. Thus rank α = r, rank β = s and rank(α + β) = t.
Exercise 5.5.19
To avoid repetition, we go straight to (iii). We consider n × n matrices and say A ∈ Γ if and only if there is a K such that \sum_{r=1}^n a_{rj} = K for all j and \sum_{r=1}^n a_{ir} = K for all i.
(i)′ Observe that
A, B ∈ Γ ⇒ \sum_{r=1}^n a_{rj} = \sum_{r=1}^n a_{ir} and \sum_{r=1}^n b_{rj} = \sum_{r=1}^n b_{ir} for all i, j
⇒ \sum_{r=1}^n (λa_{rj} + µb_{rj}) = \sum_{r=1}^n (λa_{ir} + µb_{ir}) for all i, j
⇒ λA + µB ∈ Γ.
Since 0 ∈ Γ, we have that Γ is a vector space.
(ii)′ Let Γ_0 be the subset of Γ such that
\sum_{r=1}^n a_{rj} = \sum_{r=1}^n a_{ir} = 0.
We observe that Γ_0 is a subspace of Γ, since 0 ∈ Γ_0 and
A, B ∈ Γ_0 ⇒ \sum_{r=1}^n a_{rj} = \sum_{r=1}^n a_{ir} = 0 and \sum_{r=1}^n b_{rj} = \sum_{r=1}^n b_{ir} = 0 for all i, j
⇒ \sum_{r=1}^n (λa_{rj} + µb_{rj}) = \sum_{r=1}^n (λa_{ir} + µb_{ir}) = 0 for all i, j
⇒ λA + µB ∈ Γ_0.
We note that if E is the n × n matrix all of whose entries are 1, then any basis of Γ_0 together with E gives a basis for Γ.
If 1 ≤ i, j ≤ n − 1, let E(i, j) be the n × n matrix with entry 1 in the (i, j)th place and the (n, n)th place and entry −1 in the (i, n)th place and the (n, j)th place. We observe that E(i, j) ∈ Γ_0.
We observe, by looking at the (r, s)th entry, that
\sum_{1≤i,j≤n−1} a_{ij}E(i, j) = 0 ⇒ a_{rs} = 0 for all 1 ≤ r, s ≤ n − 1,
so the E(i, j) are linearly independent.
If A = (a_{ij}) ∈ Γ_0, then
A − \sum_{1≤i,j≤n−1} a_{ij}E(i, j) = 0.
(We check the (n, n)th entry by observing that \sum_{i=1}^n \sum_{j=1}^n a_{ij} = 0.)
Thus the E(i, j) form a basis for Γ_0 and the E(i, j) [1 ≤ i, j ≤ n − 1] together with E form a basis for Γ. Thus Γ has dimension (n − 1)² + 1 = n² − 2n + 2.
I cannot see any natural basis.
Exercise 5.6.2
We have
P(r) ≡ b_0 + b_1c_r,
P(r) ≡ 2 + 2c_r,
P(1) ≡ 2 + 2 × 2 ≡ 6,
P(2) ≡ 2 + 2 × 4 ≡ 3,
P(3) ≡ 2 + 2 × 5 ≡ 5
(all modulo 7).
(P(1), c_1) and (P(2), c_2) yield
6 ≡ b_0 + 2b_1
3 ≡ b_0 + 4b_1.
Subtracting the first equation from the second yields 3 ≡ −2b_1. Euclid’s algorithm (or inspection) tells us to multiply by 3 to recover b_1 = 2. Substitution yields b_0 = 2.
(P(1), c_1) and (P(3), c_3) yield
6 ≡ b_0 + 2b_1
5 ≡ b_0 + 5b_1.
Subtracting the first equation from the second yields −1 ≡ 3b_1. Euclid’s algorithm (or inspection) tells us to multiply by 5 to recover b_1 = 2. Substitution yields b_0 = 2.
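Both recoveries can be scripted (Python sketch; `recover` is our own helper, and `pow(x, -1, p)` supplies the modular inverse that Euclid's algorithm produces):

```python
p = 7

def recover(share1, share2):
    """Recover (b0, b1) from two shares (c, P(c)) of P(c) = b0 + b1*c (mod p)."""
    (c1, P1), (c2, P2) = share1, share2
    b1 = (P2 - P1) * pow(c2 - c1, -1, p) % p
    b0 = (P1 - b1 * c1) % p
    return b0, b1

assert recover((2, 6), (4, 3)) == (2, 2)  # persons 1 and 2
assert recover((2, 6), (5, 5)) == (2, 2)  # persons 1 and 3
```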
Exercise 5.6.3
If c_j = c_k with j ≠ k, then two people have the same secret. More people are needed to find the full secret.
The Vandermonde determinant
det\begin{pmatrix}1 & c_{r(1)} & c_{r(1)}^2 & \cdots & c_{r(1)}^{k-1}\\ 1 & c_{r(2)} & c_{r(2)}^2 & \cdots & c_{r(2)}^{k-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & c_{r(k)} & c_{r(k)}^2 & \cdots & c_{r(k)}^{k-1}\end{pmatrix}
does not vanish (and so the system is soluble) if and only if the c_{r(j)} are distinct.
If c_j = 0, then P(j) = b_0 and j knows the secret.
The proof that fewer cannot find the secret depended on
det\begin{pmatrix}c_{r(1)} & c_{r(1)}^2 & \cdots & c_{r(1)}^{k-1}\\ c_{r(2)} & c_{r(2)}^2 & \cdots & c_{r(2)}^{k-1}\\ \vdots & \vdots & \ddots & \vdots\\ c_{r(k-1)} & c_{r(k-1)}^2 & \cdots & c_{r(k-1)}^{k-1}\end{pmatrix} ≡ c_{r(1)}c_{r(2)} \cdots c_{r(k-1)} \prod_{1≤j<i≤k-1} (c_{r(i)} − c_{r(j)}) ≢ 0,
and this needs the c_{r(j)} distinct and non-zero.
Exercise 5.6.4
P(r) ≡ b_0 + b_1c_r
P(r) ≡ 1 + c_r
P(1) ≡ 1 + 1 × 1 ≡ 2
P(2) ≡ 1 + 1 × 4 ≡ 5,
all modulo 6. The recovery equations are
2 ≡ b_0 + b_1
5 ≡ b_0 + 4b_1.
There are two solutions, (b_0, b_1) = (1, 1) and (b_0, b_1) = (5, 3).
If p is not a prime, mn ≡ k may have more than one solution in n even if k ≢ 0.
Exercise 5.6.5
(i) The secret remains safe. The arguments continue to work if thecj are known.
(ii) If the b_s are known for s ≠ 0, the holder of any pair (c_r, P(r)) can compute b_0 directly as
b_0 ≡ P(r) − \sum_{s≥1} b_sc_r^s.
Exercise 5.6.6
With k rings, they can say nothing with high probability, since one of the rings may be a fake, so they would have only k − 1 truth-telling rings.
With k + 1 rings, either every set of k rings gives the same answer, in which case all the rings are true and the answer is correct, or each set of k rings will give a different answer (one of which, corresponding to the k true rings, will be correct). All they can say is that one ring is fake, but they do not know which one or what the Integer of Power is.
With k + 2 rings, there are (k + 2)(k + 1)/2 different sets of k rings. Either each set tells the same story, in which case all the rings are true and the answer is correct, or k + 1 sets will tell the same story, in which case they are the k + 1 sets containing the k true rings and their answer is correct, while the remaining k(k + 1)/2 sets give different answers. The heroes can identify the k + 1 correct rings since they belong to the truth-telling sets.
One prime to rule them all
One prime to bind them
In the land of composites
Where the factors are
Exercise 6.1.2
Observe that, writing A = (aij), we have
αej =
n∑
i=1
aijei =
a1ja2j...
anj
.
Exercise 6.1.3
Let A = (a_{ij}), B = (b_{ij}) and let αβ have matrix C = (c_{ij}). Then
\sum_{i=1}^n c_{ij}e_i = αβe_j = α(βe_j) = α\left(\sum_{k=1}^n b_{kj}e_k\right) = \sum_{k=1}^n b_{kj}αe_k = \sum_{k=1}^n b_{kj}\sum_{r=1}^n a_{rk}e_r = \sum_{r=1}^n \left(\sum_{k=1}^n a_{rk}b_{kj}\right)e_r,
so, by the uniqueness of the basis expansion,
c_{ij} = \sum_{k=1}^n a_{ik}b_{kj},
i.e. C = AB.
Exercise 6.1.5
Using the summation convention,
b_{ij}f_i = α(f_j) = α(p_{rj}e_r) = p_{rj}αe_r = p_{rj}a_{sr}e_s = p_{rj}a_{sr}q_{is}f_i = q_{is}a_{sr}p_{rj}f_i.
Thus b_{ij} = q_{is}a_{sr}p_{rj}.
Exercise 6.1.7
(i) Theorem 6.1.5 shows that, if A and B represent the same map with respect to two bases, then A and B are similar.
If A and B are similar then B = P−1AP . Let α be the linear mapwith matrix A corresponding to a basis
e1, e2, . . . , en.
Let f_j = \sum_{i=1}^n p_{ij}e_i. Since P has rank n, the f_j form a basis, and Theorem 6.1.5 tells us that α has matrix B with respect to this basis.
(ii) Follows at once from Theorem 6.1.5 and part (i).
(iii) Write A ∼ B if B = P−1AP for some non-singular P .
I−1AI = IAI = A so A ∼ A.
If A ∼ B then B = P−1AP for some non-singular P and writingQ = P−1 we have Q non-singular and A = Q−1BQ so B ∼ A.
If A ∼ B, B ∼ C then B = P−1AP , C = Q−1CQ for some non-singular P , Q. Then PQ is non-singular and C = (PQ)−1A(PQ) soC ∼ A.
(iv) If A represents α with respect to one basis then A represents αwith respect to the same basis. Thus A ∼ A.
If A ∼ B then there exist two bases with respect to which A and Brepresent the same map so B ∼ A.
By (i), if there exist two bases with respect to which A and B represent the same map, then, given any basis E_1, there exists a basis E_2 so that A and B represent the same map with respect to E_1 and E_2. Thus, if A ∼ B and B ∼ C, we can find bases E_j such that A and B represent the same map with respect to E_1 and E_2, and B and C represent the same map with respect to E_2 and E_3, so A and C represent the same map with respect to E_1 and E_3 and A ∼ C.
127
Exercise 6.1.8
We have
f_j = \sum_{i=1}^{n} p_{ij} e_i = (p_{1j}, p_{2j}, \dots, p_{nj})^T.
128
Exercise 6.1.9
Observe that e_1 = f_1, e_2 = f_2 − f_1, e_3 = f_3 − f_2, and
\alpha(f_1) = \begin{pmatrix} 1 & 1 & 2 \\ -1 & 2 & 1 \\ 0 & 1 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} = e_1 − e_2 = 2f_1 − f_2,
\alpha(f_2) = \begin{pmatrix} 1 & 1 & 2 \\ -1 & 2 & 1 \\ 0 & 1 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} = 2e_1 + e_2 + e_3 = f_1 + f_3,
\alpha(f_3) = \begin{pmatrix} 1 & 1 & 2 \\ -1 & 2 & 1 \\ 0 & 1 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \\ 2 \\ 4 \end{pmatrix} = 4e_1 + 2e_2 + 4e_3 = 2f_1 − 2f_2 + 4f_3.
The new matrix (whose jth column gives the coordinates of \alpha(f_j) with respect to the f-basis) is
B = \begin{pmatrix} 2 & 1 & 2 \\ -1 & 0 & -2 \\ 0 & 1 & 4 \end{pmatrix}.
129
Exercise 6.1.12
Without Theorem 6.1.10, our definition would depend on the choiceof basis.
130
Exercise 6.2.3
(i) Observe that
\det\Bigl(t\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} − \begin{pmatrix} a & b \\ c & d \end{pmatrix}\Bigr) = \det\begin{pmatrix} t−a & −b \\ −c & t−d \end{pmatrix}
= (t−a)(t−d) − (−b)(−c)
= t^2 − (a+d)t + (ad−bc)
= t^2 − (\mathrm{Tr}\,A)t + \det A.
(ii) We have
\det(tI − A) = \sum_{\sigma \in S_3} \zeta(\sigma) \prod_{i=1}^{3} (t\delta_{i\sigma(i)} − a_{i\sigma(i)}) = ut^3 + vt^2 + ct + w.
Looking at the coefficient of t^3, we have u = 1. Looking at the coefficient of t^2, we have v = −a_{11} − a_{22} − a_{33}. Setting t = 0,
w = \det(−A) = (−1)^3 \det A = −\det A.
131
Exercise 6.2.4
Write ∂P for the degree of a polynomial P, and write b_{i\sigma(i)}(t) = t\delta_{i\sigma(i)} − a_{i\sigma(i)}.
Observe that ∂b_{i\sigma(i)} = 0 if \sigma(i) ≠ i and ∂b_{i\sigma(i)} = 1 if \sigma(i) = i.
Thus, if \sigma ≠ \iota, we know that \sigma(i) ≠ i for at least two distinct values of i, so
∂\prod_{i=1}^{n} b_{i\sigma(i)} ≤ \sum_{i=1}^{n} ∂b_{i\sigma(i)} ≤ n − 2.
It follows that
\det(tI − A) = \sum_{\sigma \in S_n} \zeta(\sigma) \prod_{i=1}^{n} b_{i\sigma(i)}(t) = \prod_{i=1}^{n} b_{ii}(t) + \sum_{\sigma ≠ \iota} \zeta(\sigma) \prod_{i=1}^{n} b_{i\sigma(i)}(t) = \prod_{i=1}^{n} (t − a_{ii}) + Q(t),
where Q is a polynomial of degree at most n − 2.
Thus the coefficient of t^{n−1} in \det(tI − A) is the coefficient of t^{n−1} in \prod_{i=1}^{n} (t − a_{ii}), that is to say, −\mathrm{Tr}\,A.
The constant term of a polynomial P is P(0), so the constant term of \det(tI − A) is \det((−1)A) = (−1)^n \det A.
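As a numerical illustration (mine, not the book's), we can recover the t^{n−1} coefficient and the constant term of det(tI − A) for a concrete 3×3 matrix by evaluating the monic cubic at t = −1, 0, 1, and compare them with −Tr A and −det A.

```python
# Check: in det(tI - A) = t^3 + v t^2 + c t + w we have v = -Tr A, w = -det A.
def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
p = lambda t: det3([[t - A[i][j] if i == j else -A[i][j] for j in range(3)]
                    for i in range(3)])
trace = A[0][0] + A[1][1] + A[2][2]

v = (p(1) + p(-1)) // 2 - p(0)   # coefficient of t^2 (exact integer arithmetic)
w = p(0)                          # constant term
assert v == -trace
assert w == -det3(A)
```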
132
Exercise 6.2.9
We show that any linear map \alpha : R^{2n+1} → R^{2n+1} has an eigenvector. It follows that there exists some line l through 0 with \alpha(l) ⊆ l.
Proof. Since \det(t\iota − \alpha) is a real polynomial of odd degree, the equation \det(t\iota − \alpha) = 0 has a real root \lambda, say. We know that \lambda is an eigenvalue and so has an associated eigenvector u, say. Let
l = {su : s ∈ R}. ∎
133
Exercise 6.2.11
\det(t\iota − \rho_\theta) = \det\begin{pmatrix} t − \cos\theta & \sin\theta \\ −\sin\theta & t − \cos\theta \end{pmatrix} = (t − \cos\theta)^2 + \sin^2\theta = t^2 − 2t\cos\theta + 1.
The equation t^2 − 2t\cos\theta + 1 = 0 has a real root if and only if
4\cos^2\theta ≥ 4,
so if and only if \cos\theta = ±1, i.e. if and only if \theta ≡ 0 mod \pi.
If \theta ≡ 0 mod 2\pi, then \rho_\theta = \iota and every non-zero vector is an eigenvector with eigenvalue 1. If \theta ≡ \pi mod 2\pi, then \rho_\theta = −\iota and every non-zero vector is an eigenvector with eigenvalue −1.
134
Exercise 6.2.14
(i) P_{−A}(t) = \det(tI + A) = (−1)^n \det(−tI − A) = (−1)^n P_A(−t).
(ii) Let δ be as in Lemma 6.2.13. Let tn = δ/2n.
135
Exercise 6.2.15
If A is invertible,
\chi_{AB}(t) = \det(tI − AB) = \det(A(tA^{-1} − B))
= \det A \det(tA^{-1} − B) = \det(tA^{-1} − B) \det A
= \det((tA^{-1} − B)A) = \det(tI − BA) = \chi_{BA}(t).
In general, we can find s_n → 0 such that s_nI + A is invertible. Thus, if t is fixed,
\chi_{(s_nI+A)B}(t) = \chi_{B(s_nI+A)}(t).
But \chi_{(s_nI+A)B}(t) → \chi_{AB}(t) and \chi_{B(s_nI+A)}(t) → \chi_{BA}(t), so
\chi_{AB}(t) = \chi_{BA}(t).
Allowing t to vary, we see that \chi_{AB} = \chi_{BA}.
Observe that −\mathrm{Tr}\,C is the coefficient of t^{n−1} in \chi_C(t). Thus \chi_A = \chi_B ⇒ \mathrm{Tr}\,A = \mathrm{Tr}\,B. By the previous result this gives \mathrm{Tr}\,AB = \mathrm{Tr}\,BA.
(i) (False in general.) If
A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix},
then
ABC = (AB)C = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}
and
ACB = (AC)B = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
Thus \mathrm{Tr}(ABC) = 1 ≠ 0 = \mathrm{Tr}(ACB) and \chi_{ABC} ≠ \chi_{ACB}.
(ii) (True.) \chi_{ABC} = \chi_{A(BC)} = \chi_{(BC)A} = \chi_{BCA}.
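A quick numerical check of the trace consequence (my sketch, not from the book): Tr AB = Tr BA always, while the 2×2 counterexample above shows Tr ABC and Tr ACB can differ.

```python
# Tr AB = Tr BA, but Tr ABC need not equal Tr ACB.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def tr(M):
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 0], [0, 0]]
B = [[1, 1], [0, 0]]
C = [[0, 0], [1, 1]]

assert tr(matmul(A, B)) == tr(matmul(B, A))
assert tr(matmul(matmul(A, B), C)) == 1
assert tr(matmul(matmul(A, C), B)) == 0
```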
136
Exercise 6.2.16
Let e = (1, 1, \dots, 1)^T. Observe that A is magic with constant \kappa if and only if
Ae = A^Te = \kappa e.
In particular, e is an eigenvector of A.
If A is magic with constant \kappa and AB = BA = I, then
e = BAe = \kappa Be,
so \kappa ≠ 0 and Be = \kappa^{-1}e. Since B^TA^T = (AB)^T = I^T = I, the argument of the previous sentence gives B^Te = \kappa^{-1}e. Thus B is magic with constant \kappa^{-1}.
Suppose A is magic with constant \kappa. If A is invertible, \mathrm{Adj}\,A = (\det A)A^{-1}, so, by the previous paragraph with B = A^{-1}, \mathrm{Adj}\,A is magic with constant \kappa^{-1}\det A.
Choose t_n → 0 with A + t_nI invertible. If we write \mathrm{Adj}(A + t_nI) = (c_{ij}(n)) and \mathrm{Adj}\,A = (c_{ij}), then c_{ij}(n) → c_{ij} as n → ∞. Thus
0 = \sum_{r=1}^{n} c_{ir}(n) − \sum_{r=1}^{n} c_{kr}(n) → \sum_{r=1}^{n} c_{ir} − \sum_{r=1}^{n} c_{kr},
so \sum_{r=1}^{n} c_{ir} = \sum_{r=1}^{n} c_{kr} for all i and k. Similarly \sum_{r=1}^{n} c_{ri} = \sum_{r=1}^{n} c_{rk}. Thus \mathrm{Adj}\,A is magic.
137
Exercise 6.3.2
(i) If
x_1e_1 + x_2e_2 = 0,
then
0 = \alpha(x_1e_1 + x_2e_2) = x_1\alpha e_1 + x_2\alpha e_2 = \lambda_1x_1e_1 + \lambda_2x_2e_2.
Thus
0 = \lambda_1(x_1e_1 + x_2e_2) − (\lambda_1x_1e_1 + \lambda_2x_2e_2) = (\lambda_1 − \lambda_2)x_2e_2,
so that (\lambda_1 − \lambda_2)x_2 = 0 and x_2 = 0. Similarly x_1 = 0, so e_1 and e_2 are linearly independent.
(ii) Observe that
0 = (\alpha − \lambda_1\iota)(0) = (\alpha − \lambda_1\iota)(x_1e_1 + x_2e_2) = (\lambda_2 − \lambda_1)x_2e_2
and proceed as before.
(iii) Observe that
0 = (\alpha − \lambda_2\iota)(\alpha − \lambda_3\iota)(0)
= (\alpha − \lambda_2\iota)(\alpha − \lambda_3\iota)(x_1e_1 + x_2e_2 + x_3e_3)
= (\alpha − \lambda_2\iota)\bigl((\lambda_1 − \lambda_3)x_1e_1 + (\lambda_2 − \lambda_3)x_2e_2\bigr)
= (\lambda_1 − \lambda_2)(\lambda_1 − \lambda_3)x_1e_1,
so that (\lambda_1 − \lambda_2)(\lambda_1 − \lambda_3)x_1 = 0 and x_1 = 0. Similarly x_2 = x_3 = 0, so e_1, e_2, e_3 are linearly independent.
138
Exercise 6.4.2
(i) Observe that
\det(t\iota − \beta) = \det\begin{pmatrix} t & −1 \\ 0 & t \end{pmatrix} = t^2.
Thus the characteristic equation of \beta only has 0 as a root and \beta only has zero as an eigenvalue.
(ii) If x = (x, y)^T, then
\beta x = 0x ⇔ (y, 0)^T = (0, 0)^T ⇔ y = 0.
Thus the eigenvectors of \beta are the non-zero vectors of the form (x, 0)^T and these do not span R^2.
139
Exercise 6.4.5
(i) We have
Q(t) = \det\begin{pmatrix} t−\lambda & 0 \\ 0 & t−\mu \end{pmatrix} = (t−\lambda)(t−\mu) = t^2 − (\lambda+\mu)t + \lambda\mu,
so
Q(A) = \begin{pmatrix} \lambda^2 & 0 \\ 0 & \mu^2 \end{pmatrix} − \begin{pmatrix} (\lambda+\mu)\lambda & 0 \\ 0 & (\lambda+\mu)\mu \end{pmatrix} + \begin{pmatrix} \lambda\mu & 0 \\ 0 & \lambda\mu \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
Thus Q(\alpha) = 0.
(ii) We have
Q(t) = \det\begin{pmatrix} t−\lambda & −1 \\ 0 & t−\lambda \end{pmatrix} = (t−\lambda)^2 = t^2 − 2\lambda t + \lambda^2,
so
Q(A) = \begin{pmatrix} \lambda^2 & 2\lambda \\ 0 & \lambda^2 \end{pmatrix} − \begin{pmatrix} 2\lambda^2 & 2\lambda \\ 0 & 2\lambda^2 \end{pmatrix} + \begin{pmatrix} \lambda^2 & 0 \\ 0 & \lambda^2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
Thus Q(\alpha) = 0.
(iii) We know that there exists a basis with respect to which \alpha has one of the two forms discussed, so the result of Example 6.4.4 is correct.
(iii) We know that there exists a basis with respect to which α hasone of the two forms discussed, so the result of Example 6.4.4 is correct.
140
Exercise 6.4.7
Let P be as in Theorem 6.4.6. Since P is non-singular, detP 6= 0,so (working in C) we can find a κ with κ2 = detP . Set ν = κ−1 andQ = νP .
Then detQ = ν2 detP = 1 and
Q−1AQ = ν−1νP−1AP = P−1AP.
141
Exercise 6.4.8
(i) Since
\frac{d}{dt}\bigl(e^{−\lambda t}x(t)\bigr) = −\lambda e^{−\lambda t}x(t) + e^{−\lambda t}x'(t) = e^{−\lambda t}(x'(t) − \lambda x(t)) = 0,
the mean value theorem tells us that e^{−\lambda t}x(t) = C and so x(t) = Ce^{\lambda t} for some constant C.
(ii) Since
\frac{d}{dt}\bigl(e^{−\lambda t}x(t) − Kt\bigr) = e^{−\lambda t}(x'(t) − \lambda x(t)) − K = e^{−\lambda t}Ke^{\lambda t} − K = 0,
the mean value theorem tells us that e^{−\lambda t}x(t) − Kt = C and so x(t) = (C + Kt)e^{\lambda t} for some constant C.
142
Exercise 6.4.9
If x_1(t) = x(t) and x_2(t) = x'(t), then x_1'(t) = x_2(t) automatically, and
x_2'(t) = −bx_1(t) − ax_2(t) ⇔ x'' = −ax' − bx ⇔ x'' + ax' + bx = 0.
Further,
\det(\lambda I − A) = \lambda(\lambda + a) + b = \lambda^2 + a\lambda + b.
143
Exercise 6.5.5
By definition,
Q^{-1}R_\theta Q = \frac{1}{2}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}\begin{pmatrix} \cos\theta & −\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} 1 & −i \\ −i & 1 \end{pmatrix}
= \frac{1}{2}\begin{pmatrix} \cos\theta + i\sin\theta & −\sin\theta + i\cos\theta \\ \sin\theta + i\cos\theta & \cos\theta − i\sin\theta \end{pmatrix}\begin{pmatrix} 1 & −i \\ −i & 1 \end{pmatrix}
= \frac{1}{2}\begin{pmatrix} e^{i\theta} & ie^{i\theta} \\ ie^{−i\theta} & e^{−i\theta} \end{pmatrix}\begin{pmatrix} 1 & −i \\ −i & 1 \end{pmatrix}
= \frac{1}{2}\begin{pmatrix} 2e^{i\theta} & 0 \\ 0 & 2e^{−i\theta} \end{pmatrix}
= \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{−i\theta} \end{pmatrix}.
144
Exercise 6.5.2
Since A has distinct eigenvalues, it is diagonalisable and we can find an invertible matrix B such that
BAB^{-1} = D,
where D is the diagonal matrix with jth diagonal entry \lambda_j. If we set y = Bx (using column vectors), we obtain
y' = Bx' = BAx = BAB^{-1}y = Dy.
Thus y_j' = \lambda_jy_j and y_j = c_je^{\lambda_jt} for [1 ≤ j ≤ n], and so, writing F = B^{-1}, F = (f_{ij}),
x_1 = \sum_{j=1}^{n} f_{1j}y_j = \sum_{j=1}^{n} f_{1j}c_j\exp(\lambda_jt).
If we set \mu_j = f_{1j}c_j, we obtain the required result.
If A = D, then x = y and x_1 = c_1\exp(\lambda_1t) and, with the notation of the question, we have \mu_j = 0 for 2 ≤ j ≤ n.
145
Exercise 6.6.2
(i) D^q is the diagonal matrix with diagonal entries the qth powers of the diagonal entries of D.
(ii) A and D represent the same linear map \alpha with respect to two bases E_1 and E_2 with basis change rule U = PVP^{-1}. A^q and D^q represent the same linear map \alpha^q with respect to the two bases E_1 and E_2, so A^q = PD^qP^{-1}.
146
Exercise 6.6.7
(i) We have
\lambda^{−q}\alpha^q(x_1e_1 + x_2e_2) = x_1e_1 + (\mu/\lambda)^qx_2e_2 → x_1e_1
coordinatewise.
(i)′ We have
\lambda^{−q}\alpha^q(x_1e_1 + x_2e_2) = x_1e_1 + (\mu/\lambda)^qx_2e_2,
but (\mu/\lambda)^q does not tend to a limit, so \lambda^{−q}\alpha^q(x_1e_1 + x_2e_2) converges coordinatewise if and only if x_2 = 0.
(ii)′ Nothing to prove.
(iii) We have
\alpha\bigl((\lambda^qx_1 + q\lambda^{q−1}x_2)e_1 + \lambda^qx_2e_2\bigr) = \lambda(\lambda^qx_1 + q\lambda^{q−1}x_2)e_1 + \lambda^qx_2(e_1 + \lambda e_2) = \bigl(\lambda^{q+1}x_1 + (q+1)\lambda^qx_2\bigr)e_1 + \lambda^{q+1}x_2e_2,
so, by induction,
\alpha^q(x_1e_1 + x_2e_2) = (\lambda^qx_1 + q\lambda^{q−1}x_2)e_1 + \lambda^qx_2e_2
for all q ≥ 0.
Thus
q^{−1}\lambda^{−q}\alpha^q(x_1e_1 + x_2e_2) = (q^{−1}x_1 + \lambda^{−1}x_2)e_1 + q^{−1}x_2e_2 → \lambda^{−1}x_2e_1
coordinatewise.
(iv) Immediate.
(iv) Immediate.
147
Exercise 6.6.8
F1 = F2 = 1, F3 = 2, F4 = 3, F5 = 5, F6 = 8, F7 = 13, F8 = 21,F9 = 34, F10 = 55.
F_0 + 1 = F_0 + F_1 = F_2 = 1, so F_0 = 0.
F_{−1} + 0 = F_{−1} + F_0 = F_1 = 1, so F_{−1} = 1.
Write G_{−n} = (−1)^{n+1}F_n. Then
G_{−n+1} + G_{−n} = (−1)^n(F_{n−1} − F_n) = (−1)^{n−1}F_{n−2} = G_{−n+2}
and G_{−1} = F_{−1}, G_0 = F_0, so, by induction, F_{−n} = G_{−n} for all n ≥ 0. Thus F_{−n} = (−1)^{n+1}F_n for all n.
148
Exercise 6.6.9
If b = 0, but a ≠ 0, then u_n = (−a)^nu_0.
If a = b = 0, then un = 0 for all n.
149
Exercise 6.6.10
We look at the vectors
u(n) = \begin{pmatrix} u_n \\ u_{n+1} \end{pmatrix}.
We then have
u(n+1) = Au(n), \quad \text{where } A = \begin{pmatrix} 0 & 1 \\ −a & −b \end{pmatrix},
and so
\begin{pmatrix} u_n \\ u_{n+1} \end{pmatrix} = A^n\begin{pmatrix} u_0 \\ u_1 \end{pmatrix}.
Now A is not a multiple of the identity matrix, so (since the roots are equal) there is a basis e_1, e_2 with Ae_1 = \lambda e_1 and Ae_2 = e_1 + \lambda e_2.
Now
A^n(x_1e_1 + x_2e_2) = (\lambda^nx_1 + n\lambda^{n−1}x_2)e_1 + \lambda^nx_2e_2,
so, since u(n) = A^nu(0), we have
u_n = (c + c'n)\lambda^n
for some constants c and c′ depending on u0 and u1.
150
Exercise 6.6.11
We look at the vectors
u(n) = (u_n, u_{n+1}, \dots, u_{n+m−1})^T.
We then have u(n+1) = Au(n), where
A = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \\ −a_0 & −a_1 & −a_2 & \dots & −a_{m−1} \end{pmatrix}.
The characteristic polynomial of A is P(t), so, since all the roots of P are distinct and non-zero, the eigenvectors e_j (corresponding to the eigenvalues \lambda_j) of A form a basis.
Setting u(0) = \sum_{k=1}^{m} b_ke_k, we get
u(n) = A^nu(0) = \sum_{k=1}^{m} b_k\lambda_k^ne_k
and so, looking at the first entry in the vectors,
u_n = \sum_{k=1}^{m} c_k\lambda_k^n
for some constants c_k.
Conversely, if
u_n = \sum_{k=1}^{m} c_k\lambda_k^n,
then
u_r + \sum_{j=0}^{m−1} a_ju_{r−m+j} = \sum_{k=1}^{m} c_k\lambda_k^{r−m}P(\lambda_k) = 0.
Observe that, if |\lambda_1| > |\lambda_k| for 2 ≤ k ≤ m, then
\lambda_1^{−n}u_n = \sum_{k=1}^{m} c_k(\lambda_1^{−1}\lambda_k)^n → c_1
as n → ∞. Thus, if c_1 ≠ 0, we have u_{r−1} ≠ 0 for r large and
u_r/u_{r−1} → \lambda_1
as r → ∞.
A very similar proof shows that, if |\lambda_k| > |\lambda_m| for 1 ≤ k ≤ m − 1 and c_m ≠ 0, then u_n ≠ 0 for −n large and
u_{r−1}/u_r → \lambda_m.
151
Exercise 6.6.12
(i) Observe that
\frac{1+\sqrt5}{2} × \frac{−1+\sqrt5}{2} = \frac{(\sqrt5)^2 − 1^2}{4} = 1.
(ii) We have
\det(tI − A) = t(t−1) − 1 = t^2 − t − 1 = (t − \tau)(t + \tau^{−1}),
so the eigenvalues are \tau and −\tau^{−1}.
If (x, y)^T is an eigenvector with eigenvalue \tau, then
y = \tau x, \quad x + y = \tau y,
so e_1 = (1, \tau)^T is an eigenvector with eigenvalue \tau.
A similar argument shows that e_2 = (−1, \tau^{−1})^T is an eigenvector with eigenvalue −\tau^{−1}.
By inspection (or solving two simultaneous linear equations)
\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt5}(e_1 + e_2).
(iii) It follows that
\begin{pmatrix} F_n \\ F_{n+1} \end{pmatrix} = A^n\begin{pmatrix} F_0 \\ F_1 \end{pmatrix} = 5^{−1/2}\bigl(\tau^ne_1 + (−\tau)^{−n}e_2\bigr)
and so
F_n = 5^{−1/2}\bigl(\tau^n − (−\tau)^{−n}\bigr).
(iv) Since F_n is an integer and
|5^{−1/2}(−\tau)^{−n}| ≤ 5^{−1/2} < 1/2
for n ≥ 0, we know that F_n is the closest integer to \tau^n/5^{1/2}.
Both methods give F_{20} = 6765.
(v) Observe that A^n has first column
A^{n−1}(F_0, F_1)^T = (F_{n−1}, F_n)^T
and second column
A^{n−1}(F_1, F_2)^T = (F_n, F_{n+1})^T.
(Or use induction.) Hence
(−1)^n = (\det A)^n = \det A^n = F_{n−1}F_{n+1} − F_n^2.
152
(vi) True for all n because F_n = (−1)^{n+1}F_{−n}.
(vii) We have A^{2n} = A^nA^n, so
\begin{pmatrix} F_{n−1} & F_n \\ F_n & F_{n+1} \end{pmatrix}\begin{pmatrix} F_{n−1} & F_n \\ F_n & F_{n+1} \end{pmatrix} = \begin{pmatrix} F_{2n−1} & F_{2n} \\ F_{2n} & F_{2n+1} \end{pmatrix},
so, evaluating the upper left hand term of the matrix, we get
F_n^2 + F_{n−1}^2 = F_{2n−1}.
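All four identities for the Fibonacci numbers can be checked numerically; the following is a quick sketch of mine (not part of the solution), using floating point for the Binet formula and exact integers for the rest.

```python
# Check F_20, the closest-integer form of Binet, Cassini, and the doubling identity.
tau = (1 + 5 ** 0.5) / 2
F = [0, 1]
for _ in range(25):
    F.append(F[-1] + F[-2])

assert F[20] == 6765
assert all(round(tau ** n / 5 ** 0.5) == F[n] for n in range(1, 21))
assert all(F[n - 1] * F[n + 1] - F[n] ** 2 == (-1) ** n for n in range(1, 20))
assert all(F[n] ** 2 + F[n - 1] ** 2 == F[2 * n - 1] for n in range(1, 13))
```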
153
Exercise 6.6.13
(i) Let e^{(n)}_{ij} be the number of routes from i to j involving exactly n flights. We claim that e^{(n)}_{ij} = d^{(n)}_{ij}. We use induction on n. The result is clearly true if n = 1. Suppose it is true for n = r. Then
e^{(r+1)}_{ij} = number of journeys with exactly r + 1 flights from i to j
= \sum_{k=1}^{m} (number of journeys with exactly r + 1 flights from i to j ending with a flight from k to j)
= \sum_{k : d_{kj} ≠ 0} (number of journeys with exactly r flights from i to k)
= \sum_{k=1}^{m} d^{(r)}_{ik}d_{kj} = d^{(r+1)}_{ij}.
The result follows.
In particular, there is a journey from i to j which involves exactly n flights if and only if d^{(n)}_{ij} > 0.
(ii) We can produce an argument like (i) if we make the convention that a flight includes just staying at an airport (so a flight from i to i is allowed).
In particular, there is a journey from i to j which involves n flights or less (between different airports) if and only if d^{(n)}_{ij} > 0.
(iii) We write down the safe states, for example
i = (G|PWC)
(goat on one side; peasant, wolf and cabbage on the other) and write d_{ij} = 1 if the peasant can change the situation from i to j in one crossing, d_{ij} = 0 otherwise. The least number of crossings is the least N for which
d^N_{(GPWC|∅),(∅|GPWC)} ≠ 0.
Observe that, since the peasant will not repeat a state, if there are M safe states the problem is soluble, if at all, with at most M − 1 crossings, so the method will reveal if the problem is insoluble.
(The cabbage survived but was so distressed by its narrow escapethat it turned over a new leaf).
154
Exercise 6.7.1
(i) Suppose that A = (a_{ij})_{1 ≤ i,j ≤ n} is lower triangular. Then, by row expansion,
\det A = a_{11}\det B,
where B = (a_{i+1,j+1})_{1 ≤ i,j ≤ n−1}. Since B is lower triangular with diagonal entries a_{i+1,i+1} [1 ≤ i ≤ n − 1], the required result follows by induction.
(ii) We have
L invertible ⇔ \det L ≠ 0 ⇔ \prod_{j=1}^{n} l_{jj} ≠ 0 ⇔ l_{jj} ≠ 0 for all j.
(iii) Since tI − L is lower triangular,
\det(tI − L) = \prod_{j=1}^{n} (t − l_{jj}),
so the roots of the characteristic equation are the diagonal entries l_{jj} (multiple roots occurring with the correct multiplicity).
Observe that (0, 0, \dots, 0, 1)^T is an eigenvector with eigenvalue l_{nn}.
155
Exercise 6.7.2
Just repeat the earlier discussion mutatis mutandis.
The following is merely a variation.
If the system is
\sum_{j=1}^{n} u_{ij}x_j = y_i,
then setting x'_j = x_{n+1−j}, y'_i = y_{n+1−i} and l_{ij} = u_{n+1−i,n+1−j} gives the lower triangular system
\sum_{j=1}^{n} l_{ij}x'_j = y'_i.
156
Exercise 6.7.3
(i) Observe first that the product of two lower triangular matrices is lower triangular, since, if a_{ij} = b_{ij} = 0 for j ≥ i + 1, then
\sum_{r=1}^{n} a_{ir}b_{rj} = \sum_{j ≤ r ≤ i} a_{ir}b_{rj} = 0
for j ≥ i + 1.
We use induction to show that the inverse of an invertible n × n lower triangular matrix is lower triangular.
Since every 1 × 1 matrix is lower triangular, the result is certainly true for n = 1. Suppose it is true for n = m and A is an (m+1) × (m+1) invertible lower triangular matrix. Since A is invertible, a_{11} ≠ 0. If B = (b_{ij}) is the (m+1) × (m+1) matrix with b_{11} = a_{11}^{−1}, b_{i1} = −a_{i1}a_{11}^{−1} and b_{ii} = 1 for 2 ≤ i ≤ m + 1, and b_{ij} = 0 otherwise, then
BA = \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & L \end{pmatrix},
where L is an m × m invertible lower triangular matrix and \mathbf{0} ∈ R^m is the zero column vector.
If we now set
C = \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & L^{−1} \end{pmatrix},
then, since L^{−1} is lower triangular (by the inductive hypothesis), C is lower triangular, so CB is lower triangular. But
(CB)A = C(BA) = I,
so A has the lower triangular inverse CB and the induction is complete.
(ii) By the first paragraph of (i),
A,B ∈ L ⇒ AB ∈ L.
By inspection I ∈ L and, by part (i),
A ∈ L ⇒ A−1 ∈ L.
157
Exercise 6.7.6
Observe that
LU = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ −1 & −3 & 1 \end{pmatrix}\begin{pmatrix} 2 & 1 & 1 \\ 0 & −1 & −2 \\ 0 & 0 & −4 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 1 \\ 4 & 2−1 & 2−2 \\ −2 & −1+3 & −1+6−4 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 1 \\ 4 & 1 & 0 \\ −2 & 2 & 1 \end{pmatrix}.
158
Exercise 6.7.7
Consider an n × n matrix A. The first step will be impossible only if the entire first row consists of zeros, in which case it is clear that A is not invertible. If the first step can be done, then we repeat the process with an (n−1) × (n−1) square matrix B and the same remarks apply to B.
Thus either the process halts because the matrix with which we deal at the given stage has top row all zeros, or the process runs to completion, in which case we obtain an upper triangular matrix U and a lower triangular matrix L, both with all diagonal entries non-zero and so invertible. Since the product of invertible matrices is invertible, A = LU is invertible.
Thus the process works if and only if A is invertible.
159
Exercise 6.7.9
Suppose that
\begin{pmatrix} a & 0 \\ c & d \end{pmatrix}\begin{pmatrix} u & v \\ 0 & w \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
Then
\begin{pmatrix} au & av \\ cu & cv + dw \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
It follows that au = 0 and av = 1, so u = 0, which is incompatible with cu = 1. Our initial formula cannot hold.
160
Exercise 6.7.10
We use induction on n. The result is trivial for n = 1, since
(1)(u_1) = (1)(u_2) ⇒ (u_1) = (u_2) ⇒ u_1 = u_2.
Now suppose the result is true for n = m and that L_1, L_2 are lower triangular (with diagonal entries 1) and U_1, U_2 upper triangular (m+1) × (m+1) matrices with L_1U_1 = L_2U_2. Observe that we can write
L_j = \begin{pmatrix} 1 & \mathbf{0}^T \\ \mathbf{l}_j & \tilde L_j \end{pmatrix} \quad \text{and} \quad U_j = \begin{pmatrix} u(j) & \mathbf{u}_j^T \\ \mathbf{0} & \tilde U_j \end{pmatrix},
where the \tilde L_j are lower triangular (with diagonal entries 1) and the \tilde U_j upper triangular m × m matrices, u(j) ∈ F and \mathbf{l}_j, \mathbf{u}_j ∈ F^m are column vectors. Equating first rows in the equation L_1U_1 = L_2U_2, we get
(u(1), \mathbf{u}_1^T) = (u(2), \mathbf{u}_2^T),
and equating first columns we get
(u(1), u(1)\mathbf{l}_1^T)^T = (u(2), u(2)\mathbf{l}_2^T)^T,
so u(1) = u(2), \mathbf{u}_1 = \mathbf{u}_2 and (since u(1) ≠ 0, the U_j being invertible) \mathbf{l}_1 = \mathbf{l}_2.
The condition L_1U_1 = L_2U_2 now gives
\tilde L_1\tilde U_1 = \tilde L_2\tilde U_2,
so, by the inductive hypothesis,
\tilde L_1 = \tilde L_2, \quad \tilde U_1 = \tilde U_2.
We have shown that L_1 = L_2 and U_1 = U_2, and this completes the induction.
161
Exercise 6.7.11
Since detLU = detL detU and detL, detU are the product of thediagonal entries, detLU is cheap to calculate.
By solving the n sets of simultaneous linear equations
\sum_{r=1}^{n} l_{ir}x_{rj} = \delta_{ij},
each of which requires of the order of n^2 operations, we can compute X = L^{−1} in the order of n^3 operations, and we can compute U^{−1} similarly. We now compute (LU)^{−1} = U^{−1}L^{−1}, again in the order of n^3 operations.
162
Exercise 6.7.12
(i) It is probably most instructive to do this ab initio, but quicker to use known results.
We know that (after reordering columns) A = LU with L lower triangular with all diagonal entries 1 and U upper triangular. Since A is non-singular, so is U, and thus all its diagonal entries are non-zero. Let D be the diagonal matrix whose diagonal entries are the diagonal entries of U. Set \tilde U = D^{−1}U. Then \tilde U is upper triangular with all diagonal entries 1, D is non-singular and
A = LD\tilde U.
Suppose that L_j is a lower triangular n × n matrix with all diagonal entries 1, U_j is an upper triangular n × n matrix with all diagonal entries 1 and D_j is a non-singular diagonal n × n matrix [j = 1, 2]. We claim that, if
L_1D_1U_1 = L_2D_2U_2,
then L_1 = L_2, U_1 = U_2 and D_1 = D_2.
To see this, observe that D_jU_j is non-singular upper triangular, so, by Exercise 6.7.10, L_1 = L_2 and
D_1U_1 = D_2U_2.
By looking at the diagonal entries of both sides of this equation we get D_1 = D_2, so
U_1 = D_1^{−1}(D_1U_1) = D_2^{−1}(D_2U_2) = U_2.
(ii) Yes. Either repeat the argument mutatis mutandis or argue asfollows.
If G = (gij) is an n×n matrix, write G = (gn+1−i,n+1−j). If G = LU
with L lower triangular, U upper triangular, then G = LU and L isupper triangular whilst U is lower triangular.
(iii) No. Observe that(
a bc 0
)(
x 0y z
)
=
(
ax+ by bzcx 0
)
.
163
Exercise 7.1.4
(i) Any set of orthonormal vectors is linearly independent, and any set of k linearly independent vectors in U forms a basis for U.
(ii) By (i), we can find \lambda_j such that
x = \sum_{j=1}^{k} \lambda_je_j.
Now observe that
⟨x, e_i⟩ = \sum_{j=1}^{k} \lambda_j⟨e_j, e_i⟩ = \lambda_i.
164
Exercise 7.1.6
If we take e_1 = 3^{−1/2}(1, 1, 1), e_2 = 2^{−1/2}(1, −1, 0), then, by inspection, they form an orthonormal system.
We now use Gram–Schmidt for x = (1, 0, 0):
x − ⟨x, e_1⟩e_1 − ⟨x, e_2⟩e_2 = (1, 0, 0) − 3^{−1}(1, 1, 1) − 2^{−1}(1, −1, 0) = (1/6, 1/6, −1/3) = a,
say. Setting
e_3 = ‖a‖^{−1}a = 6^{−1/2}(1, 1, −2),
we have e_1, e_2, e_3 orthonormal.
It follows at once that 3^{−1/2}(1, 1, 1), 2^{−1/2}(0, −1, 1), 6^{−1/2}(−2, 1, 1) is another orthonormal system.
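As a floating-point sanity check of the Gram–Schmidt step (a sketch of my own, not from the book):

```python
# Verify that e1, e2, e3 above really are orthonormal and that the
# Gram-Schmidt residual of x = (1,0,0) is (1/6, 1/6, -1/3).
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

e1 = [3 ** -0.5] * 3
e2 = [2 ** -0.5, -(2 ** -0.5), 0.0]
x = [1.0, 0.0, 0.0]
a = [x[i] - dot(x, e1) * e1[i] - dot(x, e2) * e2[i] for i in range(3)]
assert all(abs(a[i] - t) < 1e-12 for i, t in enumerate([1/6, 1/6, -1/3]))

e3 = [t / dot(a, a) ** 0.5 for t in a]
for u in (e1, e2, e3):
    assert abs(dot(u, u) - 1) < 1e-12
assert abs(dot(e1, e2)) < 1e-12
assert abs(dot(e1, e3)) < 1e-12
assert abs(dot(e2, e3)) < 1e-12
```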
165
Exercise 7.1.8
(Dimension 2) If a point B does not lie in a plane \pi, then there exists a unique line l′ perpendicular to \pi passing through B. The point of intersection of l′ with \pi is the closest point in \pi to B. More briefly, the foot of the perpendicular dropped from B to \pi is the closest point in \pi to B.
(Dimension 1) If a point B does not lie on a line l, then there existsa unique line l′ perpendicular to l passing through B. The point ofintersection of l with l′ is the closest point in l to B. More briefly, thefoot of the perpendicular dropped from B to l is the closest point in lto B.
166
Exercise 7.1.9
(i) We have
\Bigl\|x − \sum_{j=1}^{k} \lambda_je_j\Bigr\|^2 = \Bigl\langle x − \sum_{j=1}^{k} \lambda_je_j, x − \sum_{j=1}^{k} \lambda_je_j\Bigr\rangle
= \|x\|^2 − 2\sum_{j=1}^{k} \lambda_j⟨x, e_j⟩ + \sum_{j=1}^{k} \lambda_j^2
= \|x\|^2 − \sum_{j=1}^{k} ⟨x, e_j⟩^2 + \sum_{j=1}^{k} (\lambda_j − ⟨x, e_j⟩)^2
≥ \|x\|^2 − \sum_{j=1}^{k} ⟨x, e_j⟩^2,
with equality if and only if \lambda_j = ⟨x, e_j⟩ for each j.
(ii) It follows from (i) that
\|x\|^2 ≥ \sum_{j=1}^{k} ⟨x, e_j⟩^2,
with equality if and only if, taking \lambda_j = ⟨x, e_j⟩,
\Bigl\|x − \sum_{j=1}^{k} \lambda_je_j\Bigr\| = 0,
i.e.
x = \sum_{j=1}^{k} \lambda_je_j,
and this occurs if and only if
x ∈ \mathrm{span}\{e_1, e_2, \dots, e_k\}.
167
Exercise 7.1.10
Observe first that
⟨0, u⟩ = 0
for all u, so 0 ∈ U^⊥.
Next observe that, if \lambda_1, \lambda_2 ∈ R, then
v_1, v_2 ∈ U^⊥ ⇒ ⟨\lambda_1v_1 + \lambda_2v_2, u⟩ = \lambda_1⟨v_1, u⟩ + \lambda_2⟨v_2, u⟩ = 0 + 0 = 0 for all u ∈ U ⇒ \lambda_1v_1 + \lambda_2v_2 ∈ U^⊥.
If we take a − b = v in Theorem 7.1.7, we see that a ∈ R^n can be written in one and only one way as a = u + v with u ∈ U, v ∈ U^⊥.
If e_1, \dots, e_k is a basis for U and e_{k+1}, \dots, e_l is a basis for U^⊥, the previous paragraph tells us that e_1, \dots, e_k, e_{k+1}, \dots, e_l is a basis for R^n, so l = n and
\dim U + \dim U^⊥ = n.
168
Exercise 7.2.2
⟨\alpha x, y⟩ = \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{n} a_{ij}x_j\Bigr)y_i = \sum_{j=1}^{n} \Bigl(\sum_{i=1}^{n} a_{ij}y_i\Bigr)x_j = \sum_{j=1}^{n} \Bigl(\sum_{i=1}^{n} c_{ji}y_i\Bigr)x_j = ⟨x, \alpha^*y⟩.
169
Exercise 7.2.6
Suppose that A, B represent \alpha, \beta with respect to some basis. Lemma 7.2.4 enables us to translate Lemma 7.2.5 into matrix form.
(i) Since (\alpha\beta)^* = \beta^*\alpha^*, we have (AB)^T = B^TA^T.
(ii) Since \alpha^{**} = (\alpha^*)^* = \alpha, we have A^{TT} = (A^T)^T = A.
(iii) Since (\lambda\alpha + \mu\beta)^* = \lambda\alpha^* + \mu\beta^*, we have (\lambda A + \mu B)^T = \lambda A^T + \mu B^T.
(iv) Since \iota^* = \iota, we have I^T = I.
In coordinates:
(i) If AB = C, then
c_{ij} = \sum_{r=1}^{n} a_{ir}b_{rj},
so
c_{ji} = \sum_{r=1}^{n} b_{rj}a_{ir}
and (AB)^T = C^T = B^TA^T.
(ii) A^{TT} = (a_{ji})^T = (a_{ij}) = A.
(iii) (\lambda A + \mu B)^T = (\lambda a_{ji} + \mu b_{ji}) = \lambda(a_{ji}) + \mu(b_{ji}) = \lambda A^T + \mu B^T.
(iv) \delta_{ij} = \delta_{ji}, so I = I^T.
170
Exercise 7.2.10
Let the jth row of A be \mathbf{a}_j. Then
A ∈ O(R^n) ⇔ AA^T = I ⇔ \mathbf{a}_i\mathbf{a}_j^T = \delta_{ij} ⇔ \mathbf{a}_i \cdot \mathbf{a}_j = \delta_{ij} ⇔ the \mathbf{a}_j are orthonormal.
Thus (i)⇔(ii). Since
A ∈ O(R^n) ⇒ A^{TT}A^T = AA^T = I ⇒ A^T ∈ O(R^n),
transposition gives (i)⇔(iii).
We turn to the statements.
(i) FALSE. Consider
\begin{pmatrix} 1 & −1 \\ 2 & 2 \end{pmatrix}.
(ii) FALSE. Consider
\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}.
(iii) FALSE. If \mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_{n−1}, \mathbf{a}_n are the rows of an orthogonal matrix, so are \mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_{n−1}, −\mathbf{a}_n (and \mathbf{a}_n ≠ 0).
(iv) TRUE. If \mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_{n−1} are orthonormal in R^n then there are exactly two choices ±\mathbf{a}_n giving an orthonormal set and of these exactly one will give \det A = 1.
171
Exercise 7.2.13
Take
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 1 \\ 0 & −1 \end{pmatrix}.
Note that the columns in each matrix are not orthogonal.
172
Exercise 7.3.2
Any pair of orthonormal vectors forms a basis for R^2. We observe that
⟨e_1, e_1⟩ = 1, \quad ⟨e_1, −e_2⟩ = −⟨e_1, e_2⟩ = 0, \quad ⟨−e_2, −e_2⟩ = ⟨e_2, e_2⟩ = 1.
Observe that
\alpha e_1 = \cos\theta\, e_1 + \sin\theta\, e_2 = \cos\theta\, e_1 − \sin\theta(−e_2)
and
\alpha(−e_2) = −\alpha e_2 = \sin\theta\, e_1 − \cos\theta\, e_2 = \sin\theta\, e_1 + \cos\theta(−e_2),
so, with respect to the new basis, \alpha has matrix
\begin{pmatrix} \cos\theta & \sin\theta \\ −\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} \cos(−\theta) & −\sin(−\theta) \\ \sin(−\theta) & \cos(−\theta) \end{pmatrix}.
173
Exercise 7.3.4
(i) We have
\begin{pmatrix} \cos\theta & −\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta − y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix},
which I was told in school was a rotation through \theta.
We have
\begin{pmatrix} −1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} −x \\ y \end{pmatrix},
which certainly looks like a reflection in the y axis to me.
Since
⟨e_1, u_1⟩^2 + ⟨e_1, u_2⟩^2 = 1,
we may take ⟨e_1, u_1⟩ = \cos\varphi, ⟨e_1, u_2⟩ = \sin\varphi. We have
e_1 = \cos\varphi\, u_1 + \sin\varphi\, u_2,
so
e_1 = −\alpha e_1 = −\cos\varphi\, \alpha u_1 − \sin\varphi\, \alpha u_2 = −\cos\varphi(\cos\theta\, u_1 + \sin\theta\, u_2) − \sin\varphi(\sin\theta\, u_1 + \cos\theta\, u_2).
Taking the inner product with u_1 we get
\cos\varphi = −\cos\varphi\cos\theta − \sin\varphi\sin\theta = −\cos(\theta − \varphi),
so
\varphi = \pi + \theta − \varphi
and \theta = 2\varphi − \pi.
174
Exercise 7.3.6
Since a reflection in a plane has perpendicular eigenvectors with eigenvalues −1, 1, 1, \alpha is a reflection only if \det\alpha = −1, so, by our discussion, \alpha has matrix
A = \begin{pmatrix} −1 & 0 & 0 \\ 0 & \cos\theta & −\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix},
so
\det(t\iota − \alpha) = \det\begin{pmatrix} t+1 & 0 & 0 \\ 0 & t−\cos\theta & \sin\theta \\ 0 & −\sin\theta & t−\cos\theta \end{pmatrix} = (t+1)(t^2 − 2t\cos\theta + 1)
and we can only have the correct eigenvalues if \cos\theta = 1. In this case we clearly have a reflection in the plane spanned by e_2 and e_3.
By definition \alpha is a reflection in the origin if and only if \alpha = −\iota, so if and only if \cos\theta = −1.
175
Exercise 7.3.7
\det(t\iota − \alpha) = \det\begin{pmatrix} t−\cos\theta & \sin\theta \\ −\sin\theta & t−\cos\theta \end{pmatrix}\det\begin{pmatrix} t−\cos\varphi & \sin\varphi \\ −\sin\varphi & t−\cos\varphi \end{pmatrix} = (t^2 − 2t\cos\theta + 1)(t^2 − 2t\cos\varphi + 1),
which has no real roots unless \cos\theta = ±1 or \cos\varphi = ±1, i.e. unless \theta ≡ 0 or \varphi ≡ 0 (mod \pi).
176
Exercise 7.4.3
Let u = ⟨x, n⟩n and v = x − u. Then
⟨v, n⟩ = ⟨x − u, n⟩ = ⟨x − ⟨x, n⟩n, n⟩ = ⟨x, n⟩ − ⟨x, n⟩⟨n, n⟩ = 0,
as required.
We observe that
\rho(x) = u + v − 2(⟨u, n⟩ + ⟨v, n⟩)n = u + v − 2⟨x, n⟩n = v − u.
177
Exercise 7.4.6
Let A be the point given by the vector a and B the point given by the vector b. Let OM be the bisector of the angle AOB, with OM of length 1, and let OM be given by the vector m. Let n be a unit vector perpendicular to m.
178
Exercise 7.4.9
The product of two reflections in lines OA, OB through the origin making an angle ∠AOB = \theta is a rotation through 2\theta.
To prove this, observe that, if \rho_1 and \rho_2 are reflections, then \rho_2\rho_1 preserves lengths and has determinant 1, so is a rotation. Now let e_1 and e_2 be orthonormal and let n = −e_1, m = −\cos\theta\, e_1 − \sin\theta\, e_2. If
\rho_1(x) = x − 2⟨x, n⟩n, \quad \rho_2(x) = x − 2⟨x, m⟩m,
then
\rho_2\rho_1e_1 = \rho_2(−e_1) = −e_1 + 2⟨e_1, m⟩m = −e_1 + 2\cos\theta(\cos\theta\, e_1 + \sin\theta\, e_2) = \cos 2\theta\, e_1 + \sin 2\theta\, e_2.
Thus the rotations through angle \theta are obtained from two reflections in lines at angle \theta/2 (in the correct sense).
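The two-reflections argument can be checked numerically for a sample angle (my sketch, with θ chosen arbitrarily; not part of the solution):

```python
# Compose two 2x2 reflection matrices and compare with rotation by 2*theta.
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def reflection(n):
    # matrix of x -> x - 2<x, n>n in R^2, n a unit vector
    return [[1 - 2 * n[0] * n[0], -2 * n[0] * n[1]],
            [-2 * n[1] * n[0], 1 - 2 * n[1] * n[1]]]

theta = 0.7
rho1 = reflection([-1.0, 0.0])                             # n = -e1
rho2 = reflection([-math.cos(theta), -math.sin(theta)])    # m
R = matmul(rho2, rho1)                                     # rho2 rho1
expected = [[math.cos(2 * theta), -math.sin(2 * theta)],
            [math.sin(2 * theta), math.cos(2 * theta)]]

assert all(abs(R[i][j] - expected[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```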
179
Exercise 7.5.1
(i) Observe that
\|Ax − b\|^2 = \sum_{j=1}^{n} (x − b_j)^2 = nx^2 − 2x\sum_{j=1}^{n} b_j + \sum_{j=1}^{n} b_j^2 = n\Bigl(x − n^{−1}\sum_{j=1}^{n} b_j\Bigr)^2 + \sum_{j=1}^{n} b_j^2 − n\Bigl(n^{−1}\sum_{j=1}^{n} b_j\Bigr)^2
is minimised by taking x = n^{−1}\sum_{j=1}^{n} b_j.
In the case of the mountain this amounts to the reasonable choice of the average of the measured heights.
(ii) Observe that
\|Ax − b\|^2 = f(x_1, x_2)
with
f(x_1, x_2) = \sum_{i=1}^{n} (x_1 − v_ix_2 − b_i)^2.
We observe that f(x_1, x_2) → ∞ as |x_1|, |x_2| → ∞. Since
\frac{∂f}{∂x_1}(x_1, x_2) = \sum_{i=1}^{n} 2(x_1 − v_ix_2 − b_i) = 2\Bigl(nx_1 − \sum_{i=1}^{n} b_i\Bigr)
and
\frac{∂f}{∂x_2}(x_1, x_2) = −\sum_{i=1}^{n} 2v_i(x_1 − v_ix_2 − b_i) = −2\Bigl(x_2\sum_{i=1}^{n} v_i^2 − \sum_{i=1}^{n} b_iv_i\Bigr)
(using the fact that \sum_{i=1}^{n} v_i = 0 in this set-up), we see that f, and therefore \|Ax − b\|, has a unique minimum when (x_1, x_2) = (\mu, \kappa).
We have sought to find a line u = x_1 + x_2v 'close to' the points (u_i, v_i) by minimising
\sum_{i=1}^{n} (u_i − x_1 − x_2v_i)^2.
180
Exercise 7.5.2
Observe that
f(x) = \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{m} a_{ij}x_j − b_i\Bigr)^2
is a smooth function of x with f(x) → ∞ as \|x\| → ∞. Differentiating and solving the resulting m linear equations in m unknowns will, in general (to see possible problems, take a_{ij} = 0 for all i and j), give a unique stationary point which will be a minimum.
The other two penalty functions cannot (in general) be minimised by calculus techniques, but could be attacked as linear programming problems.
181
Exercise 7.5.3
(i) If we write the columns of A as column vectors \mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_m, then (possibly after reordering the columns of A) the Gram–Schmidt method gives us orthonormal column vectors e_1, e_2, \dots, e_k such that
\mathbf{a}_1 = r_{11}e_1
\mathbf{a}_2 = r_{12}e_1 + r_{22}e_2
\mathbf{a}_3 = r_{13}e_1 + r_{23}e_2 + r_{33}e_3
⋮
\mathbf{a}_k = r_{1k}e_1 + r_{2k}e_2 + r_{3k}e_3 + \dots + r_{kk}e_k
for some r_{ij} [1 ≤ i ≤ j ≤ k] with r_{ii} ≠ 0 for 1 ≤ i ≤ k, and
\mathbf{a}_p = r_{1p}e_1 + r_{2p}e_2 + r_{3p}e_3 + \dots + r_{kp}e_k
for some r_{ip} [1 ≤ i ≤ k, k + 1 ≤ p ≤ m]. We set r_{ij} = 0 in all the cases with 1 ≤ i ≤ n and 1 ≤ j ≤ m where this has not previously been defined.
Using the Gram–Schmidt method again, we can now find e_{k+1}, e_{k+2}, \dots, e_n so that the vectors e_j with 1 ≤ j ≤ n form an orthonormal basis for the space R^n of column vectors. If we take Q to be the n × n matrix with jth column e_j, then Q is orthogonal and condition ⋆ gives
A = QR.
(ii) We have
r_{ii} ≠ 0 ∀ 1 ≤ i ≤ m ⇔ \mathrm{rank}\,R = m ⇔ \mathrm{rank}\,Q^TA = m ⇔ \mathrm{rank}\,A = m,
since the linear map corresponding to Q^T is an isomorphism.
182
Exercise 7.5.4
(i) We know that
\pi = \{At : t ∈ R^n\}
is a subspace of R^m. Thus (e.g. by Theorem 7.1.7) it contains a closest point y to b. Choose x_0 with Ax_0 = y to obtain a minimiser for our problem.
Since A has rank strictly less than n, we can find a non-zero u ∈ R^n with Au = 0. Then every choice x = x_0 + \lambda u minimises \|Ax − b\|.
(ii) The QL result is just the QR result with a renumbering of the basis.
If A is an m × n matrix, Q an n × n orthogonal matrix and R an n × m upper triangular matrix with A^T = QR, then A = R^TQ^T, Q^T is orthogonal and R^T is a lower triangular matrix. Thus the QR results imply the corresponding LQ results, and the QL results imply the corresponding RQ results.
We could have used the QL factorisation in the discussion. However, the RQ and LQ results give Q in the wrong place when we derive the opening formulae. In addition (at least if we follow the obvious path) we factorise an n × m matrix with n ≥ m, and we are not usually interested in cases where we have more unknowns than equations.
183
Exercise 7.5.5
Since \rho is norm preserving,
\rho(a) = b ⇒ \|a\| = \|\rho(a)\| = \|b\|.
Now take \|a\| = \|b\| with b ≠ ±a and set c = (a − b)/2. Suppose that
\rho x = \lambda x + \mu⟨c, x⟩c
describes a reflection with \rho a = b and \rho x = x whenever ⟨x, a⟩ = ⟨x, b⟩. If x ⊥ c, then
x = \rho(x) = \lambda x,
so \lambda = 1.
Since
⟨c, a⟩ = \tfrac12(\|a\|^2 − ⟨a, b⟩) = \tfrac12(\|b\|^2 − ⟨b, a⟩) = −⟨c, b⟩,
we have ⟨c, a + b⟩ = 0, so \rho((a+b)/2) = (a+b)/2 and
\rho c = \rho a − \rho((a+b)/2) = b − (a+b)/2 = −c.
Thus
−c = \rho c = c + \mu⟨c, c⟩c,
and \mu = −2/\|c\|^2. Thus
\rho x = x − \frac{2⟨c, x⟩}{\|c\|^2}c.
Conversely, if
\rho x = x − \frac{2⟨c, x⟩}{\|c\|^2}c,
then \rho c = −c and
d ⊥ c ⇒ \rho d = d,
so \rho is a reflection. Further
⟨x, a⟩ = ⟨x, b⟩ ⇔ ⟨x, c⟩ = 0 ⇔ \rho x = x
and
\rho a = a − \frac{2⟨c, a⟩}{\|c\|^2}c = a − \frac{⟨c, a⟩ − ⟨c, b⟩}{\|c\|^2}c = a − \frac{2⟨c, c⟩}{\|c\|^2}c = a − 2c = b
and, since \rho is a reflection, \rho b = \rho^2a = a.
We have
T_{ij}x_j = x_i − 2c_jx_j\|c\|^{−2}c_i = (\delta_{ij} − 2(c_kc_k)^{−1}c_ic_j)x_j,
so
T_{ij} = \delta_{ij} − 2(c_kc_k)^{−1}c_ic_j.
184
If a = (±\|a\|, 0, 0, \dots, 0)^T, take T_1 = I. Otherwise, take T_1 = T with T as above.
If C is an n × m matrix with c_{ij} = 0 for all i > j when j ≤ r, consider the (n−r) × (m−r) matrix A' formed by the c_{ij} with n ≥ i ≥ r + 1, m ≥ j ≥ r + 1. We can find an (n−r) × (n−r) reflection matrix (or the identity) T with TA' having first column zero except possibly the first entry. If S is the n × n matrix given by
S = \begin{pmatrix} I & 0 \\ 0 & T \end{pmatrix},
then H = SC is an n × m matrix with h_{ij} = 0 for all i > j when j ≤ r + 1.
Thus we can find reflection (or identity) matrices T_j such that
T_{n−1}T_{n−2} \dots T_1A = R
is upper triangular. The matrix Q = T_1T_2 \dots T_{n−1} is a product of reflection (or identity) matrices, so of orthogonal matrices, so (since O(R^n) is a group under multiplication) an orthogonal matrix.
We have
A = (T_{n−1}T_{n−2} \dots T_1)^{−1}R = T_1^{−1}T_2^{−1} \dots T_{n−1}^{−1}R = T_1T_2 \dots T_{n−1}R = QR.
185
Exercise 7.5.6
We are interested in the vector a = (1, 2, 2)^T of length 3, which we wish to reflect to the vector b = (3, 0, 0)^T. We thus want a reflection in the plane perpendicular to
c = (a − b)/2 = (−1, 1, 1)^T.
This reflection is given by
\rho(x) = x − 2\|c\|^{−2}(c \cdot x)c
with matrix
T = I − 2\|c\|^{−2}cc^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} − \frac{2}{3}\begin{pmatrix} 1 & −1 & −1 \\ −1 & 1 & 1 \\ −1 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1/3 & 2/3 & 2/3 \\ 2/3 & 1/3 & −2/3 \\ 2/3 & −2/3 & 1/3 \end{pmatrix}.
Now
TA = \begin{pmatrix} 3 & 11/3 & 4/3 \\ 0 & 0 & 1 \\ 0 & 1 & −1/3 \end{pmatrix},
so (interchanging the second and third rows) TA is upper triangular.
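The Householder matrix T above can be checked directly (a sketch of mine, not from the book): T must send a to b and, being a reflection, must be its own inverse.

```python
# Check the reflection matrix for Exercise 7.5.6: T a = b and T^2 = I.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

a = [1.0, 2.0, 2.0]
b = [3.0, 0.0, 0.0]
c = [-1.0, 1.0, 1.0]          # (a - b)/2, with ||c||^2 = 3
T = [[(1.0 if i == j else 0.0) - 2.0 * c[i] * c[j] / 3.0 for j in range(3)]
     for i in range(3)]

Ta = [sum(T[i][k] * a[k] for k in range(3)) for i in range(3)]
assert all(abs(Ta[i] - b[i]) < 1e-12 for i in range(3))

TT = matmul(T, T)             # a reflection is its own inverse
assert all(abs(TT[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(3) for j in range(3))
```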
186
Exercise 7.5.7
We are interested in the vector a = (2, 2, −1)^T of length 3, which we wish to reflect to the vector b = (3, 0, 0)^T. We thus want a reflection in the plane perpendicular to
c = (a − b)/2 = (−1/2, 1, −1/2)^T.
Setting d = 2c = (−1, 2, −1)^T, the reflection is given by
\rho(y) = y − 2\|d\|^{−2}(d \cdot y)d
with matrix
T = I − 2\|d\|^{−2}dd^T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} − \frac{1}{3}\begin{pmatrix} 1 & −2 & 1 \\ −2 & 4 & −2 \\ 1 & −2 & 1 \end{pmatrix} = \begin{pmatrix} 2/3 & 2/3 & −1/3 \\ 2/3 & −1/3 & 2/3 \\ −1/3 & 2/3 & 2/3 \end{pmatrix}.
Setting
Q = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 2/3 & 2/3 & −1/3 \\ 0 & 2/3 & −1/3 & 2/3 \\ 0 & −1/3 & 2/3 & 2/3 \end{pmatrix},
we see that (since Q is orthogonal) we seek the least squares fit for
Q\begin{pmatrix} 1 & 3 \\ 0 & 2 \\ 0 & 2 \\ 0 & −1 \end{pmatrix}\mathbf{x} = Q\begin{pmatrix} 4 \\ 1 \\ 4 \\ 1 \end{pmatrix},
that is to say
\begin{pmatrix} 1 & 3 \\ 0 & 3 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}\mathbf{x} = \begin{pmatrix} 4 \\ 3 \\ 0 \\ 3 \end{pmatrix},
so we require x_2 = 1, x_1 = 1.
Let f be the square of the norm of
\begin{pmatrix} 1 & 3 \\ 0 & 2 \\ 0 & 2 \\ 0 & −1 \end{pmatrix}\mathbf{x} − \begin{pmatrix} 4 \\ 1 \\ 4 \\ 1 \end{pmatrix}.
187
Then f(x_1, x_2) = (x_1 + 3x_2 − 4)^2 + (2x_2 − 1)^2 + (2x_2 − 4)^2 + (x_2 + 1)^2 and the unique stationary value of f (which must be a minimum) is given by
0 = \frac{∂f}{∂x_1} = 2(x_1 + 3x_2 − 4)
0 = \frac{∂f}{∂x_2} = 2\bigl(3(x_1 + 3x_2 − 4) + 2(2x_2 − 1) + 2(2x_2 − 4) + (x_2 + 1)\bigr),
that is to say, by
4 = x_1 + 3x_2
21 = 3x_1 + 18x_2,
that is to say, by
4 = x_1 + 3x_2
7 = x_1 + 6x_2,
so we require x_2 = 1, x_1 = 1, as stated earlier.
188
Exercise 8.1.7
(i) We have
A^T = A ⇒ (P^TAP)^T = P^TA^TP^{TT} = P^TAP
and
(P^TAP)^T = P^TAP ⇒ P^TA^TP = P^TAP ⇒ P(P^TA^TP)P^T = P(P^TAP)P^T ⇒ A^T = A.
(ii) If P ∈ SO(R^n), then set Q = P.
If P ∉ SO(R^n), then set Q = PD. We have
\det Q = \det P \det D = (−1)^2 = 1
and Q^TQ = D^TP^TPD = DID = I, so Q ∈ SO(R^n). Further, since P^TAP is diagonal,
Q^TAQ = D^TP^TAPD = D(P^TAP)D = P^TAP
and Q^TAQ is diagonal.
189
Exercise 8.2.2
If a_{kj} = a_{jk} for all j and k, \sum_{j=1}^{n} a_{kj}u_j = \lambda u_k and \sum_{j=1}^{n} a_{kj}v_j = \mu v_k for all k, but \lambda ≠ \mu, then
\lambda\sum_{k=1}^{n} u_kv_k = \sum_{k=1}^{n} \lambda u_kv_k
= \sum_{k=1}^{n} \sum_{j=1}^{n} a_{kj}u_jv_k = \sum_{k=1}^{n} \sum_{j=1}^{n} a_{jk}u_jv_k
= \sum_{j=1}^{n} \sum_{k=1}^{n} u_ja_{jk}v_k = \sum_{j=1}^{n} \mu u_jv_j
= \mu\sum_{j=1}^{n} u_jv_j = \mu\sum_{k=1}^{n} u_kv_k,
so \sum_{k=1}^{n} u_kv_k = 0.
190
Exercise 8.2.6
We have (by Cramer's rule or direct calculation)
P^{−1} = \begin{pmatrix} 1 & −1 \\ 0 & 1 \end{pmatrix},
so
PAP^{−1} = P(AP^{−1}) = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 2 & −2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & −2 \\ 2 & −1 \end{pmatrix}.
Observe that P is not orthogonal.
191
Exercise 8.2.7
Let

A = ( 1 i ; i −1 ),

so Aᵀ = A. Then

det(tI − A) = (t − 1)(t + 1) + 1 = t²,

so all the eigenvalues of A are 0 and, if A were diagonalisable, we would have A = 0, which is not the case.
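A small added check of this example: the matrix is non-zero but squares to zero, so 0 is its only eigenvalue and it cannot be diagonalisable.

```python
import numpy as np

A = np.array([[1.0, 1j], [1j, -1.0]])
assert np.allclose(A, A.T)                   # complex symmetric (not Hermitian)
assert np.allclose(A @ A, np.zeros((2, 2)))  # A^2 = 0, so char. poly is t^2
assert not np.allclose(A, np.zeros((2, 2)))  # yet A != 0
```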
Exercise 8.2.9
Interchanging the x2 and x3 coordinates is the same as reflecting in a plane passing through the x1 axis and the line bisecting the angle between the x2 and x3 axes (so perpendicular to (0, 1, −1)ᵀ).
Exercise 8.2.10
If there are n distinct eigenvalues λ_j with associated eigenvectors e_j, then

PᵀAP = D

with P orthogonal and D diagonal if and only if the columns of P consist of the eigenvectors ±e_j (with any choice of signs) in some order, so there are exactly 2ⁿn! such matrices.

If A has fewer than n distinct eigenvalues, then Rⁿ has a basis of eigenvectors e_j with e_{n−1} and e_n having the same eigenvalue. If P has rth column e_r for 1 ≤ r ≤ n − 2, (n−1)th column cos θ e_{n−1} − sin θ e_n and nth column sin θ e_{n−1} + cos θ e_n, then P is orthogonal and

PᵀAP = D,

so case (ii) occurs.
Exercise 8.3.2
Observe that there exist a diagonal matrix D and an orthogonal matrix P with

D = ( λ1 0 ; 0 λ2 ) and D = PᵀAP,

where λ1, λ2 are the eigenvalues of A. Parts (i), (ii) and (iii) can be read off from the observation that detA = detD = λ1λ2 and TrA = TrD = λ1 + λ2.

(iv) detA = uw − v² ≤ uw, so detA > 0 ⇒ u ≠ 0.

If u > 0 and detA > 0, then w > 0, so TrA > 0 and, by (iii), the eigenvalues of A are strictly positive.

If u < 0 and detA > 0, then w < 0, so TrA < 0 and, by (iii), the eigenvalues of A are strictly negative.

Exercise 8.3.3⋆
Exercise 8.3.4
By an orthogonal transformation we may reduce the equation ax² + 2bxy + cy² = d to Ax² + By² = K. By multiplying by −1 if necessary, we may assume K ≥ 0; by interchanging x and y if necessary, we may assume that A = 0 ⇒ B = 0 and that, if A and B are non-zero with opposite signs, then A > 0 > B.

The cases may now be enumerated as follows.
(1) K > 0, A > 0, B > 0: ellipse.
(2) K > 0, A < 0, B < 0: empty set.
(3) K > 0, A > 0, B < 0: hyperbola.
(4) K = 0, A > 0, B > 0: single point {0}.
(5) K = 0, A < 0, B < 0: single point {0}.
(6) K = 0, A > 0, B < 0: pair of lines meeting at 0.
(7) K > 0, A > 0, B = 0: pair of parallel lines.
(8) K > 0, A < 0, B = 0: empty set.
(9) K = 0, A > 0, B = 0: single line through 0.
(10) K = 0, A < 0, B = 0: single line through 0.
(11) K > 0, A = 0, B = 0: empty set.
(12) K = 0, A = 0, B = 0: whole plane.
Exercise 8.3.5
Let

A = ( 8 −6^{1/2} ; −6^{1/2} 7 ).

Then

det(tI − A) = (t − 8)(t − 7) − 6 = t² − 15t + 50 = (t − 10)(t − 5),

so A has eigenvalues 5 and 10.

A(x, y)ᵀ = 5(x, y)ᵀ ⇒ 3x = 6^{1/2}y and 2y = 6^{1/2}x,

so the eigenvector (1, (3/2)^{1/2})ᵀ gives one axis of symmetry.

A(x, y)ᵀ = 10(x, y)ᵀ ⇒ −2x = 6^{1/2}y and −3y = 6^{1/2}x,

so the eigenvector (1, −(2/3)^{1/2})ᵀ gives the other axis of symmetry.
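A numerical confirmation (an addition; it takes the off-diagonal entries to be −6^{1/2}, the sign consistent with the eigenvector equations above):

```python
import numpy as np

s = np.sqrt(6.0)
A = np.array([[8.0, -s], [-s, 7.0]])
lam, V = np.linalg.eigh(A)      # eigenvalues in ascending order
print(lam)                      # approximately [5, 10]
```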
Exercise 8.4.2
(i) ⟨z, z⟩ = Σ_j |z_j|² is always real and non-negative.

(ii) We have

⟨z, z⟩ = 0 ⇔ Σ_j |z_j|² = 0 ⇔ |z_j|² = 0 for 1 ≤ j ≤ n ⇔ |z_j| = 0 for 1 ≤ j ≤ n ⇔ z = 0.

(iii) We have

⟨λz, w⟩ = Σ_j (λz_j)w_j* = λ Σ_j z_j w_j* = λ⟨z, w⟩.

(iv) We have

⟨z + u, w⟩ = Σ_j (z_j + u_j)w_j* = Σ_j z_j w_j* + Σ_j u_j w_j* = ⟨z, w⟩ + ⟨u, w⟩.

(v) ⟨z, w⟩* = (Σ_j z_j w_j*)* = Σ_j z_j* w_j** = Σ_j z_j* w_j = ⟨w, z⟩.
Exercise 8.4.3
The result is trivial if z = 0 or w = 0, so we need only consider the case when z, w ≠ 0.

Suppose first that ⟨z, w⟩ is real and positive. Then, if λ is real,

⟨λz + w, λz + w⟩ = λ²⟨z, z⟩ + 2λ⟨z, w⟩ + ⟨w, w⟩
= (λ⟨z, z⟩^{1/2} + ⟨z, w⟩⟨z, z⟩^{−1/2})² + ⟨w, w⟩ − ⟨z, w⟩²⟨z, z⟩^{−1}.

Thus, taking λ = λ0 with λ0 = −⟨z, w⟩⟨z, z⟩^{−1}, we see that

⟨w, w⟩ − ⟨z, w⟩²⟨z, z⟩^{−1} ≥ 0,

with equality if and only if

⟨λ0 z + w, λ0 z + w⟩ = 0,

and so if and only if λ0 z + w = 0.

In general, we may choose θ so that e^{iθ}⟨z, w⟩ is real and positive. Then the result above, applied to e^{iθ}z, yields

⟨w, w⟩ − e^{2iθ}⟨z, w⟩²⟨z, z⟩^{−1} ≥ 0,

so

|⟨z, w⟩| ≤ ‖z‖‖w‖,

with equality only possible if

λ1 z + w = 0

for some λ1 ∈ C. By inspection, if this last condition holds, we do have equality.
Exercise 8.4.4
(i) ‖z‖ ≥ 0 since we take the positive square root.

(ii) ‖z‖ = 0 ⇔ ⟨z, z⟩ = 0 ⇔ z = 0.

(iii) ‖λz‖² = ⟨λz, λz⟩ = λλ*⟨z, z⟩ = (|λ|‖z‖)², so ‖λz‖ = |λ|‖z‖.

(iv) We have

‖z + w‖² = ⟨z + w, z + w⟩ = ‖z‖² + ⟨z, w⟩ + ⟨w, z⟩ + ‖w‖²
= ‖z‖² + 2ℜ⟨z, w⟩ + ‖w‖² ≤ ‖z‖² + 2‖z‖‖w‖ + ‖w‖² = (‖z‖ + ‖w‖)².

The result follows on taking square roots.
Exercise 8.4.6
(i) Observe that

Σ_j λ_j e_j = 0 ⇒ ⟨Σ_j λ_j e_j, e_k⟩ = 0 for all k ⇒ Σ_j λ_j δ_jk = 0 for all k ⇒ λ_k = 0 for all k,

so we have a set of n linearly independent vectors in a space of dimension n, which is thus a basis.

(ii) By (i),

z = Σ_j λ_j e_j

for some λ_j. We observe that

λ_k = Σ_j λ_j δ_jk = ⟨Σ_j λ_j e_j, e_k⟩ = ⟨z, e_k⟩.

(iii) False. If n > m, m vectors cannot span Cⁿ.
Exercise 8.4.7
If k = q, we are done, since a linearly independent set forms a basis if and only if the number of elements in the set equals the dimension of the space.

If not, then k < q and there exists a

u ∈ U \ span{e1, e2, . . . , ek},

so

v = u − Σ_{j=1}^k ⟨u, e_j⟩e_j

is a non-zero element of U. Setting e_{k+1} = ‖v‖⁻¹v, we have e1, e2, . . . , e_{k+1} orthonormal in U.

After repeating the process at most q times we obtain a basis for U.
Exercise 8.4.8
(i) Observe that

‖z − Σ_{j=1}^k λ_j e_j‖² = ‖z‖² − Σ_j λ_j*⟨z, e_j⟩ − Σ_j λ_j⟨e_j, z⟩ + Σ_j |λ_j|²
= Σ_j (λ_j − ⟨z, e_j⟩)(λ_j − ⟨z, e_j⟩)* + ‖z‖² − Σ_j |⟨z, e_j⟩|²
= Σ_j |λ_j − ⟨z, e_j⟩|² + ‖z‖² − Σ_j |⟨z, e_j⟩|²
≥ ‖z‖² − Σ_j |⟨z, e_j⟩|²,

with equality if and only if λ_j = ⟨z, e_j⟩ for all j.

(ii) Since

z = Σ_{j=1}^k ⟨z, e_j⟩e_j

if and only if z ∈ span{e1, e2, . . . , ek}, it follows from (i) that

‖z‖² ≥ Σ_{j=1}^k |⟨z, e_j⟩|²,

with equality if and only if z ∈ span{e1, e2, . . . , ek}.
Exercise 8.4.9
We have

‖z + w‖² − ‖z − w‖² = ⟨z + w, z + w⟩ − ⟨z − w, z − w⟩
= ‖z‖² + ⟨z, w⟩ + ⟨z, w⟩* + ‖w‖² − ‖z‖² + ⟨z, w⟩ + ⟨z, w⟩* − ‖w‖²
= 4ℜ⟨z, w⟩.

Thus

‖z + iw‖² − ‖z − iw‖² = 4ℜ⟨z, iw⟩ = 4ℑ⟨z, w⟩

and

‖z + w‖² − ‖z − w‖² + i‖z + iw‖² − i‖z − iw‖² = 4⟨z, w⟩.
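The polarisation identity can be checked on a random instance (an added illustration; note that with the convention ⟨z, w⟩ = Σ_j z_j w_j*, the inner product is `np.vdot(w, z)`):

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(5) + 1j * rng.standard_normal(5)
w = rng.standard_normal(5) + 1j * rng.standard_normal(5)

n2 = lambda v: np.linalg.norm(v) ** 2   # squared norm
lhs = n2(z + w) - n2(z - w) + 1j * (n2(z + 1j * w) - n2(z - 1j * w))
# <z, w> = sum_j z_j conj(w_j) = np.vdot(w, z)
assert np.allclose(lhs, 4 * np.vdot(w, z))
```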
Exercise 8.4.10
Existence. Choose an orthonormal basis e_j. If α has matrix A = (a_ij) with respect to this basis, let α* be the linear map with matrix A* = (b_ij), where b_ij = a_ji*. Then

⟨α Σ_j z_j e_j, Σ_r w_r e_r⟩ = ⟨Σ_j z_j Σ_i a_ij e_i, Σ_r w_r e_r⟩
= Σ_j Σ_i Σ_r z_j w_r* a_ij δ_ir
= Σ_i Σ_j z_j w_i* a_ij
= Σ_i Σ_j z_j w_i* b_ji*
= ⟨Σ_j z_j e_j, α* Σ_r w_r e_r⟩,

so

⟨αz, w⟩ = ⟨z, α*w⟩

for all z, w ∈ Cⁿ.

Uniqueness. Observe that, if β and γ are linear maps with

⟨αz, w⟩ = ⟨z, βw⟩ = ⟨z, γw⟩

for all z, w ∈ Cⁿ, then

(β − γ)w = 0

for all w ∈ Cⁿ (set z = (β − γ)w), so β = γ.

The required formula A* = (b_ij) with b_ij = a_ji* follows from the results already established in the question.
Finally,

det α* = det A* = Σ_{σ∈Sn} ζ(σ) Π_{i=1}^n a_{σ(i)i}*
= Σ_{σ∈Sn} ζ(σ) Π_{j=1}^n a_{jσ⁻¹(j)}*
= Σ_{τ∈Sn} ζ(τ) Π_{j=1}^n a_{jτ(j)}*
= (Σ_{τ∈Sn} ζ(τ) Π_{j=1}^n a_{jτ(j)})*
= (det A)* = (det α)*.
Exercise 8.4.11
(i)⇒(ii) If (i) holds then, using the polarisation identity,

4⟨αz, αw⟩ = ‖αz + αw‖² − ‖αz − αw‖² + i‖αz + iαw‖² − i‖αz − iαw‖²
= ‖α(z + w)‖² − ‖α(z − w)‖² + i‖α(z + iw)‖² − i‖α(z − iw)‖²
= ‖z + w‖² − ‖z − w‖² + i‖z + iw‖² − i‖z − iw‖² = 4⟨z, w⟩,

so (ii) holds.

(ii)⇒(iii) If (ii) holds, then

⟨α*αz, w⟩ = ⟨w, α*αz⟩* = ⟨αw, αz⟩* = ⟨αz, αw⟩ = ⟨z, w⟩

for all w, so α*αz = z for all z and α*α = ι.

(iii)⇒(iv) Immediate.

(iv)⇔(v)⇔(vi) Immediate.

(iv)⇒(i) If (iv) holds,

‖αz‖² = ⟨αz, αz⟩ = ⟨z, α*αz⟩ = ⟨z, z⟩ = ‖z‖².
Exercise 8.4.12
(i) U(Cⁿ) is a subset of the group GL(Cⁿ). We have

α, β ∈ U(Cⁿ) ⇒ (αβ)*(αβ) = (β*α*)(αβ) = β*(α*α)β = β*ιβ = β*β = ι ⇒ αβ ∈ U(Cⁿ)

and

α ∈ U(Cⁿ) ⇒ (α⁻¹)* = α** = α = (α⁻¹)⁻¹ ⇒ α⁻¹ ∈ U(Cⁿ),

whilst

ι*ι = ιι = ι,

so ι ∈ U(Cⁿ). Thus U(Cⁿ) is a subgroup of GL(Cⁿ).

(ii) 1 = det ι = det(αα*) = det α det α* = det α (det α)* = |det α|².

The converse is false. If

A = ( 1 0 ; 1 1 ),

then det A = 1, but

A⁻¹ = ( 1 0 ; −1 1 ) ≠ A*.
Exercise 8.4.13
If D is a real diagonal matrix with diagonal entries d_j, then

D ∈ O(Rⁿ) ⇔ DDᵀ = I ⇔ d_j² = 1 for all j ⇔ d_j = ±1 for all j.

If D is a complex diagonal matrix with diagonal entries d_j, then

D ∈ U(Cⁿ) ⇔ DD* = I ⇔ d_j d_j* = 1 for all j ⇔ |d_j| = 1 for all j.

The diagonal matrix with first diagonal entry e^{iθ} and all other diagonal entries 1 is unitary with determinant e^{iθ}.
Exercise 8.4.14
SU(Cⁿ) is a subset of the group U(Cⁿ) containing ι. We have

α, β ∈ SU(Cⁿ) ⇒ det(αβ) = det α det β = 1 × 1 = 1 ⇒ αβ ∈ SU(Cⁿ)

and

α ∈ SU(Cⁿ) ⇒ det α⁻¹ = (det α)⁻¹ = 1 ⇒ α⁻¹ ∈ SU(Cⁿ).

Thus SU(Cⁿ) is a subgroup of U(Cⁿ).
Exercise 8.4.15
Let e1, e2, . . . , en be an orthonormal basis for Cⁿ and let α have matrix A with respect to this basis. Then

A = A* ⇔ α = α* ⇔ ⟨z, α*w⟩ = ⟨z, αw⟩ for all w, z ∈ Cⁿ ⇔ ⟨αz, w⟩ = ⟨z, αw⟩ for all w, z ∈ Cⁿ.
Exercise 8.4.16
If A is Hermitian,

det A = det A* = (det A)*,

so det A is real.

If

A = ( 1 1 ; 0 1 ),

then det A = 1 is real, but A is not Hermitian.
Exercise 8.4.17
(i) If u is an eigenvector with eigenvalue λ, then

λ‖u‖² = ⟨λu, u⟩ = ⟨αu, u⟩ = ⟨u, αu⟩ = ⟨u, λu⟩ = λ*‖u‖²,

so λ = λ*.

(ii) If u, v are eigenvectors with distinct eigenvalues λ and μ, then

λ⟨u, v⟩ = ⟨λu, v⟩ = ⟨αu, v⟩ = ⟨u, αv⟩ = ⟨u, μv⟩ = μ*⟨u, v⟩ = μ⟨u, v⟩,

so ⟨u, v⟩ = 0.
Exercise 8.4.18
Part (ii) is equivalent to part (i) by the change of basis rule.

We prove part (i) by induction on n.

If n = 1 then, since every 1 × 1 matrix is diagonal, the result is trivial.

Suppose now that the result is true for n = m and that α : C^{m+1} → C^{m+1} is a Hermitian linear map. We know that the characteristic polynomial must have a root and that all its roots are real. Thus we can find an eigenvalue λ1 ∈ R and a corresponding eigenvector e1 of norm 1. Consider the subspace

e1⊥ = {u : ⟨e1, u⟩ = 0}.

We observe (and this is the key to the proof) that

u ∈ e1⊥ ⇒ ⟨e1, αu⟩ = ⟨αe1, u⟩ = λ1⟨e1, u⟩ = 0 ⇒ αu ∈ e1⊥.

Thus we can define α|_{e1⊥} : e1⊥ → e1⊥ to be the restriction of α to e1⊥. We observe that α|_{e1⊥} is Hermitian and e1⊥ has dimension m, so, by the inductive hypothesis, we can find m orthonormal eigenvectors of α|_{e1⊥} in e1⊥. Let us call them e2, e3, . . . , e_{m+1}. We observe that e1, e2, . . . , e_{m+1} are orthonormal eigenvectors of α, so α is diagonalisable. The induction is complete.
Exercise 8.4.19
We note that

det(tI − A) = (t − 5)(t − 2) − 4 = t² − 7t + 6,

so the eigenvalues of A are (7 ± 5)/2, that is to say, 6 and 1.

Eigenvectors (z, w)ᵀ corresponding to the eigenvalue 6 are given by

10z + 4iw = 12z
−4iz + 4w = 12w,

that is to say,

−2z + 4iw = 0
−4iz − 8w = 0,

so an eigenvector of norm 1 is given by 2⁻¹5^{−1/2}(4i, 2)ᵀ.

Eigenvectors (z, w)ᵀ corresponding to the eigenvalue 1 are given by

10z + 4iw = 2z
−4iz + 4w = 2w,

that is to say,

8z + 4iw = 0
−4iz + 2w = 0,

so an eigenvector of norm 1 is given by 2⁻¹5^{−1/2}(2, 4i)ᵀ.

Thus

U = 2⁻¹5^{−1/2} ( 4i 2 ; 2 4i )

is a unitary matrix with

U*AU = ( 6 0 ; 0 1 ).
Exercise 8.4.20
We can do this using matrices, but it is nicer to use maps.

If γ is unitary, set α = (γ + γ*)/2 and β = (γ − γ*)/(2i). Then γ = α + iβ and

α* = (γ* + γ**)/2 = (γ* + γ)/2 = α
β* = −(γ* − γ**)/(2i) = −(γ* − γ)/(2i) = β,

so α and β are Hermitian.

To prove uniqueness, suppose that γ is unitary, α and β are Hermitian and γ = α + iβ. Then

α − iβ = α* − iβ* = (α + iβ)* = γ*,

so

2α = (α − iβ) + (α + iβ) = γ* + γ

and α = (γ + γ*)/2, iβ = γ − α.

We observe that

αβ = (γ + γ*)(γ − γ*)/(4i) = (γ² − (γ*)²)/(4i) = (γ − γ*)(γ + γ*)/(4i) = βα

and

α² + β² = 4⁻¹((γ + γ*)² − (γ − γ*)²) = 4⁻¹(2γγ* + 2γ*γ) = ι.

If we look at 1 × 1 matrices we get G = (e^{iθ}), A = (cos θ), B = (sin θ) and recover Euler’s formula.
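A random-instance check (added; it builds a unitary γ via a QR factorisation) of the decomposition γ = α + iβ and of the relations αβ = βα and α² + β² = ι:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
G, _ = np.linalg.qr(M)              # G is unitary

A = (G + G.conj().T) / 2            # Hermitian
B = (G - G.conj().T) / 2j           # also Hermitian
assert np.allclose(A, A.conj().T) and np.allclose(B, B.conj().T)
assert np.allclose(A + 1j * B, G)                # G = A + iB
assert np.allclose(A @ B, B @ A)                 # A and B commute
assert np.allclose(A @ A + B @ B, np.eye(3))     # A^2 + B^2 = I
```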
Exercise 9.1.2
We have

LLᵀ = I and det L = det ( 2^{−1/2} −2^{−1/2} ; 2^{−1/2} 2^{−1/2} ) = 1,

so L ∈ SO(R³).

However, if we set x = (0, 1, 0)ᵀ, we have x′ = (0, 2^{−1/2}, 2^{−1/2})ᵀ, so

(x′₂)⁴ = 4⁻¹ ≠ 2^{−1/2} = Σ_{j=1}^3 l_{2j} x_j⁴.
Exercise 9.1.4
(i) We have

u̇′_i = (d/dt)u′_i = (d/dt)(l_ij u_j) = l_ij u̇_j,

so u̇ is a Cartesian tensor of order 1.

(ii) x is a Cartesian tensor of order 1, so ẋ is, so ẍ is, so F = mẍ is.
Exercise 9.2.1
(i) Observe that

a′_ij = u′_i v′_j = l_ir u_r l_js v_s = l_ir l_js a_rs.

(ii) Observe that

a′_ij = ∂u′_j/∂x′_i = (∂u′_j/∂x_r)(∂x_r/∂x′_i) = l_ir ∂(l_jk u_k)/∂x_r = l_ir l_jk ∂u_k/∂x_r = l_ir l_jk a_rk.

(iii) Observe that

a′_ij = ∂²φ/∂x′_i∂x′_j = (∂²φ/∂x_r∂x_s)(∂x_r/∂x′_i)(∂x_s/∂x′_j) = l_ir l_js a_rs.
Exercise 9.3.1
(i) If u_i, v_j, w_k are Cartesian tensors of order 1, then

(u_i v_j w_k)′ = u′_i v′_j w′_k = l_ir l_js l_kp u_r v_s w_p,

so u_i v_j w_k is a Cartesian tensor of order 3.

(ii) If u_ij is a Cartesian tensor of order 2, then

∂u′_ij/∂x′_k = (∂(l_ir l_js u_rs)/∂x_p)(∂x_p/∂x′_k) = l_ir l_js l_kp ∂u_rs/∂x_p,

so ∂u_ij/∂x_k is a Cartesian tensor of order 3.

(iii) If φ is a smooth tensor of order 0,

∂³φ/∂x′_i∂x′_j∂x′_k = (∂³φ/∂x_r∂x_s∂x_p)(∂x_r/∂x′_i)(∂x_s/∂x′_j)(∂x_p/∂x′_k) = l_ir l_js l_kp ∂³φ/∂x_r∂x_s∂x_p,

so ∂³φ/∂x_i∂x_j∂x_k is a Cartesian tensor of order 3.
Exercise 9.3.5
(i) Observe that, in our standard notation,

l_ik a′_i b_k = a′_i b′_i = (a_i b_i)′ = a_i b_i = a_k b_k,

and so

(l_ik a′_i − a_k)b_k = 0.

Since we can assign b_k any values we please in a particular coordinate system, we must have

l_ik a′_i − a_k = 0,

so

δ_ir a′_i − l_rk a_k = l_rk(l_ik a′_i − a_k) = 0

and we have shown that a′_r = l_rk a_k, so a_i is a Cartesian tensor of order 1.

(ii) By (i), a_ij b_i is a Cartesian tensor of order 1 whenever b_i is a Cartesian tensor of order 1 and so, by Theorem 9.3.4, a_ij is a Cartesian tensor of order 2.

(iii) Apply (ii) with u_ij = b_i c_j.

(iv) Observe that, in our standard notation,

l_kr l_ms a′_ijkm b_rs = a′_ijkm b′_km = (a_ijkm b_km)′ = l_ip l_jq (a_pqkm b_km) = l_ip l_jq a_pqrs b_rs,

and so

(l_kr l_ms a′_ijkm − l_ip l_jq a_pqrs) b_rs = 0.

Since we can assign b_rs any values we please in a particular coordinate system, we must have

l_kr l_ms a′_ijkm − l_ip l_jq a_pqrs = 0,

so

l_er l_fs (l_kr l_ms a′_ijkm − l_ip l_jq a_pqrs) = 0,

whence a′_ijef = l_er l_fs l_ip l_jq a_pqrs and we have shown that a_ijkm is a Cartesian tensor of order 4.
Exercise 9.3.6
(i) We have

c_ijkm e_km = (1/2)(c_ijkm + c_ijmk)e_km = (1/2)(c_ijkm e_km + c_ijmk e_km) = (1/2)(c_ijkm e_km + c_ijmk e_mk) = p_ij

and

c_ijmk = (1/2)(c_ijmk + c_ijkm) = (1/2)(c_ijkm + c_ijmk) = c_ijkm.

(ii) Set e_mk = (1/2)(b_km + b_mk) and f_mk = (1/2)(b_km − b_mk).

(iii) We have

c_ijmk f_mk = c_ijkm f_km = −c_ijmk f_mk,

so c_ijmk f_mk = 0 and

c_ijkm e_km = c_ijmk b_mk.

Thus c_ijmk b_mk is a Cartesian tensor of order 2 whenever b_mk is, so, by the quotient rule given in Exercise 9.3.5 (iv), c_ijmk is a Cartesian tensor of order 4.
Exercise 9.3.8
If we work with a fixed set of coordinates, Cartesian tensors of order n may be identified with the space of functions

f : {1, 2, 3}ⁿ → R

with pointwise addition and multiplication by scalars. They therefore form a vector space of dimension 3ⁿ.
Exercise 9.4.3
Using the Levi-Civita identity, we have

ǫ_ijk ǫ_klm ǫ_mni = ǫ_kij ǫ_klm ǫ_mni = (δ_il δ_jm − δ_im δ_jl)ǫ_mni = ǫ_jnl − δ_jl ǫ_mnm = ǫ_jnl = ǫ_ljn.
Exercise 9.4.4

Observe that

ǫ_{ijk...p} ǫ_{ijk...p} = Σ_{σ∈Sn} ζ(σ)² = Σ_{σ∈Sn} 1 = card Sn = n!.

(See the definition of the determinant on page 90 if necessary.)
Exercise 9.4.5
(i) We have

λ(ǫ_ijk a_j b_k) = ǫ_ijk (λa_j) b_k = ǫ_ijk a_j (λb_k),

so λ(a × b) = (λa) × b = a × (λb).

(ii) We have

ǫ_ijk a_j (b_k + c_k) = ǫ_ijk a_j b_k + ǫ_ijk a_j c_k,

so a × (b + c) = a × b + a × c.

(iii) We have

ǫ_ijk a_j b_k = −ǫ_ikj a_j b_k = −ǫ_ikj b_k a_j,

so a × b = −b × a.

(iv) Putting b = a in (iii), we get a × a = −a × a, so a × a = 0.
Exercise 9.4.6
There is no identity. For suppose e were an identity. Then

e = e × e = 0,

but

0 × a = 0

for all a, so (since R³ has more than one element) 0 is not an identity.
Exercise 9.4.7
(i) We have λ(a_j b_j) = (λa_j)b_j = a_j(λb_j), so λ(a · b) = (λa) · b = a · (λb).

(ii) We have a_k(b_k + c_k) = a_k b_k + a_k c_k, so a · (b + c) = a · b + a · c.

(iii) We have a_j b_j = b_j a_j, so a · b = b · a.
Exercise 9.4.9
(i) We observe that

a × (b × c) = (a1, a2, a3) × (b2c3 − b3c2, b3c1 − b1c3, b1c2 − b2c1)
= ( a2(b1c2 − b2c1) − a3(b3c1 − b1c3), a3(b2c3 − b3c2) − a1(b1c2 − b2c1), a1(b3c1 − b1c3) − a2(b2c3 − b3c2) )
= ( (a2c2 + a3c3)b1, (a3c3 + a1c1)b2, (a1c1 + a2c2)b3 ) − ( (a2b2 + a3b3)c1, (a3b3 + a1b1)c2, (a1b1 + a2b2)c3 )
= ( (a1c1 + a2c2 + a3c3)b1, (a1c1 + a2c2 + a3c3)b2, (a1c1 + a2c2 + a3c3)b3 ) − ( (a1b1 + a2b2 + a3b3)c1, (a1b1 + a2b2 + a3b3)c2, (a1b1 + a2b2 + a3b3)c3 )
= (a · c)b − (a · b)c.

(ii) ‘Search me, guv’ner!’ There is a geometric proof in the American Mathematical Monthly (Vol 72, Feb 1965) by K. Bishop, but I doubt if the reader will call it simple.
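A random-instance check (an addition to the text) of the identity a × (b × c) = (a · c)b − (a · b)c from part (i):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b, c = rng.standard_normal((3, 3))
lhs = np.cross(a, np.cross(b, c))
rhs = (a @ c) * b - (a @ b) * c
assert np.allclose(lhs, rhs)
```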
Exercise 9.4.10
We have

(a × b) × c = −c × (a × b) = −((c · b)a − (c · a)b) = (a · c)b − (b · c)a,

as required.

If x = (1, 0, 0), y = z = (0, 1, 0), then

(x × y) × z = (0, 0, 1) × (0, 1, 0) = (−1, 0, 0) ≠ (0, 0, 0) = (1, 0, 0) × (0, 0, 0) = x × (y × z).
Exercise 9.4.11
We have

a × (b × c) + b × (c × a) + c × (a × b)
= (a · c)b − (a · b)c + (b · a)c − (b · c)a + (c · b)a − (c · a)b = 0.
Exercise 9.4.12
We have

(a × b) · (a × b) + (a · b)² = ǫ_ijk a_j b_k ǫ_irs a_r b_s + a_i b_i a_j b_j
= (δ_jr δ_ks − δ_js δ_kr) a_j b_k a_r b_s + a_i b_i a_j b_j
= a_j a_j b_k b_k − a_j b_j a_k b_k + a_i b_i a_j b_j
= a_j a_j b_k b_k = ‖a‖²‖b‖².

If we use the geometric formulae for a × b and a · b, our formula becomes

a²b² cos²θ + a²b² sin²θ = a²b²,

where θ is the angle between a and b, a = ‖a‖ and b = ‖b‖. We thus recover the formula

cos²θ + sin²θ = 1.
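The Lagrange-type identity just proved can be tested on a random instance (an added illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
a, b = rng.standard_normal((2, 3))
ab = np.cross(a, b)
lhs = ab @ ab + (a @ b) ** 2           # ||a x b||^2 + (a.b)^2
assert np.allclose(lhs, (a @ a) * (b @ b))  # = ||a||^2 ||b||^2
```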
Exercise 9.4.13
(i) We have

(a × b) · a = ǫ_ijk a_j b_k a_i = −ǫ_jik a_j b_k a_i = −(a × b) · a,

so (a × b) · a = 0.

(ii) a × b ⊥ a, so (a × b) · a = 0.

(iii) The volume of a parallelepiped with one vertex at 0 and adjacent vertices at a, b and a is 0, so

(a × b) · a = [a, b, a] = 0.
Exercise 9.4.14
We have

(a × b) × (c × d) = ((a × b) · d)c − ((a × b) · c)d = [a, b, d]c − [a, b, c]d.

We also have

(a × b) × (c × d) = −(c × d) × (a × b)
= ((c × d) · a)b − ((c × d) · b)a = [c, d, a]b − [c, d, b]a
= −[c, a, d]b − [b, c, d]a.

Thus

[a, b, d]c − [a, b, c]d = −[c, a, d]b − [b, c, d]a

and the required result follows.

Our formulae also give

(a × b) × (a × c) = ((a × b) · c)a − ((a × b) · a)c = [a, b, c]a − 0 = [a, b, c]a,

since (a × b) ⊥ a.
Exercise 9.4.15
Since b × c ⊥ b, c, we have

x · (b × c) = (λa + μb + νc) · (b × c) = λ a · (b × c),

so

λ = [x, b, c]/[a, b, c].

Similarly

μ = [a, x, c]/[a, b, c] and ν = [a, b, x]/[a, b, c].
Exercise 9.4.16
(i) We work with row vectors:

[a, b, c]² = det( a ; b ; c )² = det( a ; b ; c ) det( aᵀ bᵀ cᵀ )
= det( a·a a·b a·c ; b·a b·b b·c ; c·a c·b c·c ).

(ii) In the case given, we first observe that

r² = ‖a + b + c‖² = ‖a‖² + ‖b‖² + ‖c‖² + 2(a·b + b·c + c·a),

so

a·b + b·c + c·a = −r²

and

[a, b, c]² = det( r² a·b a·c ; b·a r² b·c ; c·a c·b r² )
= r²(r⁴ − (b·c)²) − a·b(r² a·b − (b·c)(a·c)) + a·c((b·a)(b·c) − r²(c·a))
= 2r⁶ + 2r⁴(a·b + b·c + c·a) + r²(a·b + b·c + c·a)² − r²((a·b)² + (b·c)² + (c·a)²) + 2(a·b)(b·c)(c·a)
= 2(r² + a·b)(r² + b·c)(r² + c·a).
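The Gram-determinant formula of part (i) can be verified on a random instance (added check):

```python
import numpy as np

rng = np.random.default_rng(5)
a, b, c = rng.standard_normal((3, 3))
triple = a @ np.cross(b, c)     # the scalar triple product [a, b, c]
M = np.array([a, b, c])
gram = M @ M.T                  # matrix of pairwise inner products
assert np.allclose(triple ** 2, np.linalg.det(gram))
```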
Exercise 9.4.17
Observe, using the summation convention, that

(a × b) · (c × d) = ǫ_ijk a_j b_k ǫ_irs c_r d_s = (δ_jr δ_ks − δ_js δ_kr) a_j b_k c_r d_s
= a_j c_j b_k d_k − a_j d_j b_k c_k = (a · c)(b · d) − (a · d)(b · c).

(Or we could use the formula for the triple vector product.)

Thus

(a × b) · (c × d) + (a × c) · (d × b) + (a × d) · (b × c)
= (a · c)(b · d) − (a · d)(b · c) + (a · d)(b · c) − (a · b)(c · d) + (a · b)(c · d) − (a · c)(b · d)
= 0.
Exercise 9.4.19
(i) Using the summation convention, we have

((d/dt)(φa))_i = (d/dt)(φa_i) = φ̇a_i + φȧ_i = (φ̇a + φȧ)_i.

(ii) Using the summation convention, we have

(d/dt)(a · b) = (d/dt)(a_i b_i) = ȧ_i b_i + a_i ḃ_i = ȧ · b + a · ḃ.

(iii) By Lemma 9.4.18,

(d/dt)(a × ȧ) = ȧ × ȧ + a × ä = a × ä.
Exercise 9.4.21
(i) Using the summation convention, we have

(∇ × (φu))_i = ǫ_ijk ∂(φu_k)/∂x_j = ǫ_ijk (∂φ/∂x_j)u_k + ǫ_ijk (∂u_k/∂x_j)φ = ((∇φ) × u + φ∇ × u)_i.

(ii) Using the summation convention, we have

∇ · (∇ × u) = (∂/∂x_i)(ǫ_ijk ∂u_k/∂x_j) = ǫ_ijk ∂²u_k/∂x_i∂x_j = ǫ_ijk ∂²u_k/∂x_j∂x_i = −∇ · (∇ × u),

so ∇ · (∇ × u) = 0.

(iii) Using the summation convention, we have

(∇ × (u × v))_i = ǫ_ijk ∂(ǫ_krs u_r v_s)/∂x_j = ǫ_kij ǫ_krs ∂(u_r v_s)/∂x_j
= (δ_ir δ_js − δ_is δ_jr) ∂(u_r v_s)/∂x_j
= ∂(u_i v_j)/∂x_j − ∂(u_j v_i)/∂x_j
= v_j ∂u_i/∂x_j + u_i ∂v_j/∂x_j − v_i ∂u_j/∂x_j − u_j ∂v_i/∂x_j
= ((v · ∇)u + (∇ · v)u − (∇ · u)v − (u · ∇)v)_i.
Exercise 9.4.22
(i) Observe that the cancellation law continues to hold, so it is sufficient to show that the analogue of the scalar triple product (think of the 4-dimensional volume) is a scalar.

Let a_i = ǫ_ijkl x_j y_k z_l. Then, if b_i is a vector (i.e. an order 1 Cartesian tensor),

(a_i b_i)′ = ǫ_ijkl x′_j y′_k z′_l b′_i = ǫ_ijkl l_ju l_kv l_lw l_ip x_u y_v z_w b_p
= (det L) ǫ_puvw x_u y_v z_w b_p
= ǫ_puvw x_u y_v z_w b_p = a_i b_i,

since L ∈ SO(R⁴).

Thus, by the cancellation law, x ∧ y ∧ z is a vector.

(ii) We have

((λx + μu) ∧ y ∧ z)_i = ǫ_ijkl (λx_j + μu_j) y_k z_l
= λ ǫ_ijkl x_j y_k z_l + μ ǫ_ijkl u_j y_k z_l
= (λ x ∧ y ∧ z + μ u ∧ y ∧ z)_i.

Also

(y ∧ x ∧ z)_i = ǫ_ijkl y_j x_k z_l = −ǫ_ijkl x_j y_k z_l = −(x ∧ y ∧ z)_i,

so

y ∧ x ∧ z = −x ∧ y ∧ z.

The proof that x ∧ y ∧ z = −z ∧ y ∧ x follows the same lines.

(iii) Using the summation convention with range 1, 2, 3, 4, 5, we define

x ∧ y ∧ z ∧ w = a with a_i = ǫ_ijklu x_j y_k z_l w_u.
Exercise 10.1.4
Since the material is isotropic, so is a_ijk; thus a_ijk = κǫ_ijk and

σ_ij = a_ijk B_k = κǫ_ijk B_k = −κǫ_jik B_k = −σ_ji = −σ_ij.

Thus σ_ij = 0.
Exercise 10.1.2
It is sufficient to show that δ_ij δ_kl is isotropic, since it then follows that δ_ik δ_jl and δ_il δ_jk and all their linear combinations are.

But (δ_ij δ_kl)′ = δ′_ij δ′_kl = δ_ij δ_kl, so we are done.

If

αδ_ij δ_kl + βδ_ik δ_jl + γδ_il δ_jk = 0,

then

0 = αδ_ij δ_kk + βδ_ik δ_jk + γδ_ik δ_jk = (3α + β + γ)δ_ij.

Thus 3α + β + γ = 0 and, similarly, α + 3β + γ = 0 and α + β + 3γ = 0. Thus

2α = 2β = 2γ = −(α + β + γ)

and α = β = γ = 0. (Alternatively, just examine particular entries in the tensor αδ_ij δ_kl + βδ_ik δ_jl + γδ_il δ_jk.)

For the last part, observe that, since ǫ_ijk and ǫ_ist are isotropic, so is ǫ_ijk ǫ_ist. Thus

ǫ_ijk ǫ_ist = αδ_jk δ_st + βδ_js δ_kt + γδ_jt δ_ks.

We observe that

αδ_jk δ_st + βδ_js δ_kt + γδ_jt δ_ks = ǫ_ijk ǫ_ist = −ǫ_ikj ǫ_ist = −αδ_jk δ_st − βδ_jt δ_ks − γδ_js δ_kt,

so, by the linear independence proved earlier, β = −γ.

Next we observe that

0 = ǫ_ijj ǫ_ist = (3α + β + γ)δ_st,

so 3α + β + γ = 0 and α = 0. Finally,

6 = ǫ_ijk ǫ_ijk = 3α + 9β + 3γ,

so α = 0, β = 1 and γ = −1. We have shown that

ǫ_ijk ǫ_ist = δ_js δ_kt − δ_ks δ_jt.
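The identity just derived can be checked by brute force over all index values (an added verification):

```python
import numpy as np

# Build the Levi-Civita symbol on three indices.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

# epsilon_ijk epsilon_ist = delta_js delta_kt - delta_ks delta_jt
lhs = np.einsum('ijk,ist->jkst', eps, eps)
d = np.eye(3)
rhs = np.einsum('js,kt->jkst', d, d) - np.einsum('ks,jt->jkst', d, d)
assert np.allclose(lhs, rhs)
```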
Exercise 10.1.3
a′_ij = l_ir l_js a_rs = l_ir l_js a_sr = a′_ji.
Exercise 10.1.6
(i) We have

a′_ij = l_ir l_js a_rs = −l_ir l_js a_sr = −a′_ji.

(ii) We have

b_ij = (1/2)(b_ij + b_ji) + (1/2)(b_ij − b_ji)

and

(1/2)(b_ji + b_ij) = (1/2)(b_ij + b_ji), (1/2)(b_ji − b_ij) = −(1/2)(b_ij − b_ji).

(iii) Observe that the zero tensor of order two is both symmetric and antisymmetric. Further,

a_ij = a_ji, b_ij = b_ji ⇒ λa_ij + μb_ij = λa_ji + μb_ji

and

a_ij = −a_ji, b_ij = −b_ji ⇒ λa_ij + μb_ij = −(λa_ji + μb_ji).

Thus the symmetric tensors form a subspace and the antisymmetric tensors form a subspace.

We do not use the summation convention for α and β.

Let e(α, β) be the second order Cartesian tensor with

e_ij(α, β) = δ_αi δ_βj + δ_βi δ_αj

in a particular coordinate system S [1 ≤ α < β ≤ 3] and

e_ij(α, α) = δ_αi δ_αj

for 1 ≤ α ≤ 3.

Let f(α, β) be the second order Cartesian tensor with

f_ij(α, β) = δ_αi δ_βj − δ_βi δ_αj

in the same coordinate system S [1 ≤ α < β ≤ 3].

Then, in that coordinate system, e(α, β) is symmetric and f(α, β) is antisymmetric. If a is a second order symmetric tensor then, in the coordinate system S,

a = Σ_{1≤α≤β≤3} a_αβ e(α, β),

and, if b is an antisymmetric tensor then, in the coordinate system S,

b = Σ_{1≤α<β≤3} b_αβ f(α, β).

Since the result is true in one system, it is true in all.

Working in S, we see that

Σ_{1≤α≤β≤3} a_αβ e(α, β) = 0 ⇒ a_αβ = 0 [1 ≤ α ≤ β ≤ 3]

and

Σ_{1≤α<β≤3} b_αβ f(α, β) = 0 ⇒ b_αβ = 0 [1 ≤ α < β ≤ 3].

Thus the e(α, β) give a basis for the subspace of symmetric tensors, which thus has dimension 6, and the f(α, β) give a basis for the subspace of antisymmetric tensors, which thus has dimension 3.
Exercise 10.1.8
Since

ǫ_ijk ω_j x_k = (ω × x)_i

and ω × x ⊥ x, the only possible eigenvalue is 0, and this corresponds to the eigenvectors tω with t ≠ 0.
Exercise 10.2.1
Observe first that

Σ_α m_α r_α = Σ_α m_α x_α − M x_G = 0.

We have

H = Σ_α m_α x_α × ẋ_α
= Σ_α m_α (r_α + x_G) × (ṙ_α + ẋ_G)
= Σ_α m_α r_α × ṙ_α + x_G × Σ_α m_α ṙ_α + (Σ_α m_α r_α) × ẋ_G + M x_G × ẋ_G
= Σ_α m_α r_α × ṙ_α + x_G × (d/dt)(Σ_α m_α r_α) + 0 × ẋ_G + M x_G × ẋ_G
= Σ_α m_α r_α × ṙ_α + x_G × (d/dt)0 + M x_G × ẋ_G
= M x_G × ẋ_G + Σ_α m_α r_α × ṙ_α.

The total angular momentum about the origin is the angular momentum of the system around the centre of mass plus the angular momentum of a point mass, equal to the mass of the system, at the centre of gravity around the origin.
Exercise 10.2.2
We have

Ḣ = (d/dt) Σ_α m_α x_α × ẋ_α
= Σ_α m_α (ẋ_α × ẋ_α + x_α × ẍ_α)
= Σ_α m_α x_α × ẍ_α
= Σ_α x_α × m_α ẍ_α
= Σ_α x_α × F_α
= −k Σ_α m_α x_α × ẋ_α
= −kH.

Thus (you can work with the summation convention, if you prefer)

(d/dt)(e^{kt}H(t)) = 0,

so H(t)e^{kt} is a constant vector H₀ and we are done.
Exercise 10.2.3
Since the first lump has centre of mass at 0, we have

∫ x_i ρ(x) dV(x) = 0.

By a simple change of variable,

J_ij = ∫ ((x_k + a_k)(x_k + a_k)δ_ij − (x_i + a_i)(x_j + a_j)) ρ(x) dV(x)
= ∫ (x_k x_k δ_ij − x_i x_j) ρ(x) dV(x) + 2a_k δ_ij ∫ x_k ρ(x) dV(x)
  + (a_k a_k δ_ij − a_i a_j) ∫ ρ(x) dV(x) − a_i ∫ x_j ρ(x) dV(x) − a_j ∫ x_i ρ(x) dV(x)
= I_ij + M(a_k a_k δ_ij − a_i a_j).
Exercise 10.2.4
In the coordinate system chosen, taking x1 = x, x2 = y, x3 = z,

A = I₁₁ = ∫ (x_k x_k δ₁₁ − x₁x₁) ρ(x) dV(x) = ∫ (x₂x₂ + x₃x₃) ρ(x) dV(x)
= ∫∫∫ (y² + z²) ρ(x, y, z) dx dy dz ≥ 0.

We also have

B = ∫∫∫ (z² + x²) ρ(x, y, z) dx dy dz ≥ 0 and C = ∫∫∫ (x² + y²) ρ(x, y, z) dx dy dz ≥ 0.
Exercise 10.3.2
(i) We have

l_ir l_js δ_rs = l_ir l_jr = δ_ij,

since LLᵀ = I.

(ii) If a_ijk is an isotropic extended Cartesian tensor, then a_ijk is an isotropic Cartesian tensor, so a_ijk = Aǫ_ijk and so a_ijk = 0.

(iii) We have

(ǫ_rst a_r b_s c_t)′ = l_ri l_sj l_tk ǫ_rst a_i b_j c_k = (det L)ǫ_ijk a_i b_j c_k = −ǫ_ijk a_i b_j c_k.

(iv) We have

(a_r b_r)′ = a′_r b′_r = l_ri l_rj a_i b_j = δ_ij a_i b_j = a_i b_i.

(v) If a × b were an extended Cartesian tensor then, by (iv), so would be (a × b) · c. Now choose c non-zero and orthogonal to a and b.
Exercise 10.4.1
(i) We have

(λu_i + μv_i)′ = λu′_i + μv′_i = λl_ij u_j + μl_ij v_j = l_ij(λu_j + μv_j).

(ii) We have

(u̇′)_i = (l_ij u_j)˙ = l_ij u̇_j.
Exercise 10.4.3
m_ij = l_ij ⇔ L = (Lᵀ)⁻¹ ⇔ Lᵀ = L⁻¹ ⇔ L ∈ O(R³).
Exercise 10.4.6
(i) We have

l_i^u m_j^v δ_j^i = l_i^u m_i^v = δ_u^v = δ′_u^v.

Choose an L such that L² ≠ I, for example

L = ( 1 0 0 ; 0 4/5 −3/5 ; 0 3/5 4/5 ).

Then

l_i^u l_j^v δ_uv = l_i^u l_u^v ≠ δ_ij = δ′_ij.

(ii) The rule is

a′^{kn}_{ij} = l_i^u l_j^v m_p^k m_q^n a^{pq}_{uv}.

(iii) We have

l_i^u l_j^v m_p^k m_q^n a^{pq}_{uv} = l_i^u l_j^v m_p^k m_q^n δ^p_u δ^q_v = (l_i^u m_u^k)(l_j^v m_v^n) = δ^k_i δ^n_j = a^{kn}_{ij},

so a^{kn}_{ij} = δ^k_i δ^n_j is a 4th rank tensor.

Next observe that

a^{in}_{ij} = δ^i_i δ^n_j = 3δ^n_j,

a second rank tensor. However,

a^{kn}_{ii} = δ^k_i δ^n_i = δ^{kn}

is not a second rank tensor. The same argument shows that a^{nn}_{ij} is not.

(iv) The rule is

a′^n_{ijk} = l_i^u l_j^v l_k^p m_q^n a^q_{uvp}.

We observe that

a′^n_{ijn} = l_i^u l_j^v l_n^p m_q^n a^q_{uvp} = l_i^u l_j^v δ^p_q a^q_{uvp} = l_i^u l_j^v a^p_{uvp},

so a^n_{ijn} is a second rank tensor.
Exercise 11.1.2
Observe that

(α + β)(e_j) = α(e_j) + β(e_j) = Σ_{i=1}^n a_ij f_i + Σ_{i=1}^n b_ij f_i = Σ_{i=1}^n (a_ij + b_ij) f_i

and

(λα)(e_j) = λ Σ_{i=1}^n a_ij f_i = Σ_{i=1}^n λa_ij f_i.
Exercise 11.1.3
If

α(f_j) = Σ_{i=1}^p a_ij g_i and β(e_r) = Σ_{s=1}^n b_sr f_s,

then

αβ(e_r) = Σ_{s=1}^n b_sr α(f_s) = Σ_{s=1}^n b_sr Σ_{i=1}^p a_is g_i = Σ_{i=1}^p (Σ_{s=1}^n a_is b_sr) g_i,

so αβ has matrix AB with respect to the given bases.
Exercise 11.1.4
(i) Observe that, by definition,

((α + β)γ)u = α(γu) + β(γu) = (αγ)u + (βγ)u = (αγ + βγ)u

for all u ∈ U, so, by definition,

(α + β)γ = αγ + βγ.

(ii) and (iii) follow at once from Exercises 11.1.2 and 11.1.3.

(iv) Let U, V and W be (not necessarily finite dimensional) vector spaces over F and let α, β ∈ L(U, V), γ ∈ L(V, W). Then

γ(α + β) = γα + γβ.

If A and B are n × m matrices over F and C is a p × n matrix over F, we have

C(A + B) = CA + CB.
Exercise 11.1.5
Since the f_i are a basis for V, we can find p_ij ∈ F such that

f′_j = Σ_{i=1}^n p_ij f_i [1 ≤ j ≤ n]

and, similarly, we can find m_ij and q_rs such that

f_j = Σ_{i=1}^n m_ij f′_i [1 ≤ j ≤ n]

and

e′_s = Σ_{r=1}^m q_rs e_r [1 ≤ s ≤ m].

We have

f_j = Σ_{i=1}^n m_ij f′_i = Σ_{i=1}^n m_ij Σ_{k=1}^n p_ki f_k,

so, by uniqueness,

δ_kj = Σ_{i=1}^n p_ki m_ij,

that is to say, PM = I, so P is invertible with M = P⁻¹.

Similarly

α e′_s = α Σ_{r=1}^m q_rs e_r
= Σ_{r=1}^m q_rs α e_r
= Σ_{r=1}^m q_rs Σ_{p=1}^n a_pr f_p
= Σ_{r=1}^m q_rs Σ_{p=1}^n a_pr Σ_{i=1}^n m_ip f′_i
= Σ_{i=1}^n (Σ_{p=1}^n Σ_{r=1}^m m_ip a_pr q_rs) f′_i,

so

B = MAQ = P⁻¹AQ.
Exercise 11.1.9
(i) Since I ∈ G,

A ∈ X ⇒ A = IA ⇒ A ∼1 A.

If A ∼1 B, then A, B ∈ X and we can find a P ∈ G such that B = PA. But P⁻¹ ∈ G and A = P⁻¹B, so B ∼1 A.

If A ∼1 B and B ∼1 C, then A, B, C ∈ X and we can find P, Q ∈ G such that B = PA, C = QB. Since QP ∈ G and C = (QP)A, we have A ∼1 C. Thus ∼1 is an equivalence relation on X.

(ii) Since I ∈ G,

A ∈ X ⇒ A = I⁻¹AI ⇒ A ∼2 A.

If A ∼2 B, then A, B ∈ X and we can find P, Q ∈ G such that B = P⁻¹AQ. But P⁻¹, Q⁻¹ ∈ G and A = (P⁻¹)⁻¹BQ⁻¹, so B ∼2 A.

If A ∼2 B and B ∼2 C, then A, B, C ∈ X and we can find P, Q, R, S ∈ G such that B = P⁻¹AQ, C = R⁻¹BS. Since PR, QS ∈ G and C = (PR)⁻¹A(QS), we have A ∼2 C. Thus ∼2 is an equivalence relation on X.

(iii) If A ∼2 B, then A, B ∈ X and we can find P, Q ∈ GL(Fⁿ) with B = P⁻¹AQ, and so rank B = rank A. If rank B = rank A = k, then we can find P, Q, R, S ∈ GL(Fⁿ) such that D = P⁻¹AQ and D = R⁻¹BS, where D is the n × n diagonal matrix with first k diagonal entries 1 and remaining entries 0, so A ∼2 D, D ∼2 B and A ∼2 B. Thus there are n + 1 equivalence classes, corresponding to the matrices of rank k [0 ≤ k ≤ n].

(iv) Since I ∈ G,

A ∈ X ⇒ A = I⁻¹AI ⇒ A ∼3 A.

If A ∼3 B, then A, B ∈ X and we can find P ∈ G such that B = P⁻¹AP. But P⁻¹ ∈ G and A = (P⁻¹)⁻¹BP⁻¹, so B ∼3 A.

If A ∼3 B and B ∼3 C, then A, B, C ∈ X and we can find P, Q ∈ G such that B = P⁻¹AP, C = Q⁻¹BQ. Since PQ ∈ G and C = (PQ)⁻¹A(PQ), we have A ∼3 C. Thus ∼3 is an equivalence relation on X.

(v) Since A and P⁻¹AP have the same characteristic polynomial,

A ∼3 B ⇒ χ_A = χ_B.

Since, whenever A is symmetric, there exists an orthogonal matrix P such that P⁻¹AP = PᵀAP = D, where D is a diagonal matrix with the same characteristic polynomial,

χ_A = χ_B ⇒ A ∼3 B.

Thus A ∼3 B ⇔ χ_A = χ_B.

If D_λ is the diagonal matrix with first diagonal entry λ and all others 0, then

D_λ ∼3 D_μ ⇔ χ_{D_λ} = χ_{D_μ} ⇔ λ = μ.

Thus there are infinitely many equivalence classes for ∼3.

(vi) Since I ∈ G,

A ∈ X ⇒ A = IᵀAI ⇒ A ∼4 A.

If A ∼4 B, then A, B ∈ X and we can find P ∈ G such that B = PᵀAP. But P⁻¹ ∈ G and A = (P⁻¹)ᵀBP⁻¹, so B ∼4 A.

If A ∼4 B and B ∼4 C, then A, B, C ∈ X and we can find P, Q ∈ G such that B = PᵀAP, C = QᵀBQ. Since PQ ∈ G and C = (PQ)ᵀA(PQ), we have A ∼4 C. Thus ∼4 is an equivalence relation on X.

(vii) If A is a real symmetric n × n matrix, we can find P ∈ O(Rⁿ) such that D = PᵀAP with D = (d_ij) a diagonal matrix with d_ii > 0 for 1 ≤ i < u, d_ii < 0 for u ≤ i < v and d_ii = 0 otherwise. Let E = (e_ij) be a diagonal matrix with e_ii = d_ii^{−1/2} for 1 ≤ i < u, e_ii = (−d_ii)^{−1/2} for u ≤ i < v and e_ii = 1 otherwise. Then PE ∈ GL(Rⁿ) and (PE)ᵀA(PE) is a diagonal matrix with entries 1, −1 and 0. Thus there can only be finitely many equivalence classes.

[The reason that this does not complete the discussion is that we have not shown (as we will in Section 16.2) that different (u, v) correspond to different equivalence classes.]
Exercise 11.1.15
Let U and V be finite dimensional vector spaces and let α : U → V be linear.

By Theorem 11.1.13,

dim im α = dim(U/ker α)

and, by Lemma 11.1.14,

dim U = dim ker α + dim(U/ker α),

so

dim U = dim ker α + dim im α.
Exercise 11.1.16
Observe that

dim H_j = dim B_j − dim Z_j = rank(α_{j+1}) − (dim C_j − rank(α_j)),

and so

dim H_j + dim C_j = rank(α_{j+1}) + rank(α_j)

for 1 ≤ j ≤ n.

Now rank(α_{n+1}) = rank(α_0) = 0, since C_{n+1} = C_0 = {0}, so

Σ_{j=1}^n (−1)^j (dim H_j + dim C_j) = 0.
Exercise 11.2.1
The sum of two polynomials is a polynomial and the product of a polynomial with a real number is a polynomial, so we have a subspace of the vector space of functions f : R → R with pointwise addition and scalar multiplication.

(i) Let λ, μ ∈ R and P, Q ∈ P.

We have

D(λP + μQ) = λDP + μDQ

by the standard rules for differentiation.

We have

M(λP + μQ)(t) = t(λP(t) + μQ(t)) = λtP(t) + μtQ(t) = (λMP + μMQ)(t),

so M(λP + μQ) = λMP + μMQ.

We have

E_h(λP + μQ)(t) = λP(t + h) + μQ(t + h) = λE_hP(t) + μE_hQ(t) = (λE_hP + μE_hQ)(t),

so E_h(λP + μQ) = λE_hP + μE_hQ.

(ii) We have

((DM − MD)P)(t) = (d/dt)(tP(t)) − tP′(t) = P(t),

so (DM − MD)P = P for all P ∈ P and DM − MD = ι, the identity map.

(iii) The sum is well defined since, for each P, we only consider a finite sum: if α_jP = 0 for all j ≥ N, then

Σ_{j=0}^N α_jP = Σ_{j=0}^M α_jP

for all M ≥ N.

If λ, μ ∈ R and P, Q ∈ P, we can find an N such that α_jP = α_jQ = 0 for all j ≥ N, so

Σ_{j=0}^∞ α_j(λP + μQ) = Σ_{j=0}^N α_j(λP + μQ) = Σ_{j=0}^N (λα_jP + μα_jQ)
= λ Σ_{j=0}^N α_jP + μ Σ_{j=0}^N α_jQ = λ Σ_{j=0}^∞ α_jP + μ Σ_{j=0}^∞ α_jQ.

Thus Σ_{j=0}^∞ α_j ∈ L(P, P).
(iv) If P is a polynomial of degree N , then DjP = 0 for j ≥ N + 1.
If en = tn, then M je0 = ej 6= 0, so M does not have the requiredproperty.
Since Ejhe0 = ej 6= 0, Eh does not have the required property.
(ι − Eh)ek is a polynomial of degree k − 1 if k ≥ 1. and is zero ifk = 0. Thus, if P is a polynomial of degree n, (ι−Eh)ek is a polynomialof degree at most n − 1 if n ≥ 1. and is zero if n = 0. Thus if P is apolynomial of degree N (ι−Eh)
N+1P = 0 and we have desired property.
(v) Part (iii) tells us that exp(α) and log(ι − α) are well definedmembers of L(P,P).
Observe that
  exp α − ι = Σ_{j=1}^{∞} α^j/j! = αβ,
where β = Σ_{j=1}^{∞} α^{j−1}/j! is an endomorphism which commutes with α. Thus
  (exp α − ι)^N = α^N β^N = 0
if α^N = 0. It follows, by part (iii), that log(exp α) is well defined.
By the standard rules on Taylor expansions, we have
  log(exp t) = Σ_{j=0}^{∞} c_j t^j
for |t| small and (using those standard rules to compute the c_j)
  log(exp α) = Σ_{j=0}^{∞} c_j α^j.
However, we know from calculus that log(exp t) = t for |t| small, and so (by the uniqueness of Taylor expansions) c_1 = 1 and c_j = 0 otherwise. Thus
  log(exp α) = α.
(vi) We know that polynomials are their own Taylor series, so
  ((exp hD)P)(t) = Σ_{j=0}^{∞} (h^j/j!) P^{(j)}(t) = P(t + h) = (E_hP)(t)
and thus exp hD = E_h.
264
We know, from part (iv), that, for each P ∈ P, we can find an N(P) such that △_h^j(P) = 0 for all j > N(P). Thus (v) tells us that log E_h = hD, that is to say
  hD = Σ_{j=1}^{∞} ((−1)^{j+1}/j) △_h^j.
(vii If P has degree N ,
(ι−λD)∞∑
j=0
λjDrP = (ι−λD)∞∑
j=N+1
λjDrP = (ι−λN+2DN+2)P = P = ιP
Thus
(ι− λD)∞∑
j=0
λjDj = ι
and similarly(
∞∑
j=0
λjDj
)
(ι− λD) = ι
It follows that
  (ι − λD)P = Q ⇒ P = (Σ_{j=0}^{∞} λ^j D^j)(ι − λD)P = (Σ_{j=0}^{∞} λ^j D^j)Q.
We also have
  (ι − λD)(Σ_{j=0}^{∞} λ^j D^j)P = P.
Thus the unique polynomial solution of
  (ι − λD)P = Q
is
  P = Σ_{j=0}^{∞} λ^j D^j Q = Σ_{j=0}^{∞} λ^j Q^{(j)}.
In particular, the unique polynomial solution of
  f′(x) − f(x) = x²,
i.e. of (ι − D)f = −e_2 where e_2(x) = x², is
  f(x) = Σ_{j=0}^{∞} (d^j/dx^j)(−x²) = −x² − 2x − 2.
265
There is only one polynomial solution. (The general, not necessarily polynomial, solution is Ae^x − x² − 2x − 2 with A chosen freely.)
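The inversion of (ι − D) by summing derivatives can be checked by machine. A quick sketch (my own illustration, not from the text), using sympy:

```python
# Numerical sketch: summing derivatives inverts (iota - D) on polynomials,
# as in part (vii).  We solve (iota - D) f = Q with Q(x) = -x^2,
# i.e. f'(x) - f(x) = x^2.
import sympy as sp

x = sp.symbols('x')
Q = -x**2

# f = sum_j D^j Q; the sum terminates because Q is a polynomial.
f = sp.Integer(0)
term = Q
while term != 0:
    f += term
    term = sp.diff(term, x)

print(sp.expand(f))                      # -x**2 - 2*x - 2
print(sp.simplify(sp.diff(f, x) - f))    # x**2
```

The loop adds −x², −2x, −2 and then stops, reproducing the solution above.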
266
Exercise 11.2.3
If rank α^n ≠ 0, then, since rank α^m = 0 for some m,
  rank α^n − rank α^{n+1} ≥ 1.
Thus if n ≥ r ≥ 0,
  rank α^r − rank α^{r+1} ≥ rank α^n − rank α^{n+1} ≥ 1,
so
  rank α^n ≤ n − Σ_{r=0}^{n−1} (rank α^r − rank α^{r+1}) ≤ 0,
which contradicts our original hypothesis. By reductio ad absurdum, rank α^n = 0.
Now suppose that α has rank r and α^m = 0. We have
  n − r = n − rank α ≥ rank α^j − rank α^{j+1},
so
  m(n − r) ≥ Σ_{j=0}^{m−1} (rank α^j − rank α^{j+1}) = n − rank α^m = n,
so mn − n ≥ mr and r ≤ n(1 − m^{−1}).
267
Exercise 11.2.4
Observe that, if A_k = (a_{ij}(k)) is the k × k matrix with a_{i,i+1}(k) = 1 for 1 ≤ i ≤ k − 1 and a_{ij}(k) = 0 otherwise, then the associated linear map α_k : F^k → F^k has rank α_k^u = max(k − u, 0). Consider the n × n matrix A which has
  (s_j − s_{j+1}) − (s_{j+1} − s_{j+2})
copies of A_j and one s_m × s_m identity matrix along the diagonal and all other entries zero. If α has matrix A with respect to some basis, then α will have the desired property.
268
Exercise 11.2.5
If we write r_k = rank α^k, we know from general theorems that
  r_0 − r_1 ≥ r_1 − r_2 ≥ r_2 − r_3 ≥ … ≥ r_{n−1} − r_n ≥ 0.
We have r_0 − r_1 = 2, so 2 ≥ r_j − r_{j+1} ≥ 0. Since
  2n = r_0 − r_n = Σ_{j=0}^{n−1} (r_j − r_{j+1}),
it follows that r_j − r_{j+1} = 2 for 0 ≤ j ≤ n − 1 and rank α^j = 2n − 2j for 0 ≤ j ≤ n.
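A rank sequence of this kind is realised by two nilpotent Jordan blocks of size n. A quick numerical check (my own illustration):

```python
# An alpha on a 2n-dimensional space with alpha^n = 0 and
# rank(alpha^0) - rank(alpha^1) = 2 has rank(alpha^j) = 2n - 2j.
# Two n x n nilpotent shift blocks realise this.
import numpy as np

n = 4
J = np.diag(np.ones(n - 1), k=1)            # n x n nilpotent shift block
A = np.block([[J, np.zeros((n, n))],
              [np.zeros((n, n)), J]])       # alpha on a 2n-dim space

ranks = [np.linalg.matrix_rank(np.linalg.matrix_power(A, j))
         for j in range(n + 1)]
print(ranks)   # [8, 6, 4, 2, 0] for n = 4, i.e. rank alpha^j = 2n - 2j
```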
269
Exercise 11.3.1
(i) We have
δ_a(λf + µg) = (λf + µg)(a) = λf(a) + µg(a) = λδ_af + µδ_ag.
(ii) We have
δ′_a(λf + µg) = −(λf + µg)′(a) = −λf′(a) − µg′(a) = λδ′_af + µδ′_ag.
(iii) We have
J(λf + µg) = ∫_0^1 (λf(x) + µg(x)) dx = λ ∫_0^1 f(x) dx + µ ∫_0^1 g(x) dx = λJf + µJg.
270
Exercise 11.3.6
Φ(α) = 0 ⇒ α′ = 0 by definition.
α′ = 0 ⇒ α′(v′) = 0 ∀v′ ∈ V ′ by definition.
α′(v′) = 0 ∀v′ ∈ V ′ ⇒ α′(v′)u = 0 ∀v′ ∈ V ′ ∀u ∈ U by definition.
α′(v′)u = 0 ∀v′ ∈ V ′ ∀u ∈ U ⇒ v′αu = 0 ∀v′ ∈ V ′ ∀u ∈ U since,by definition, α′(v′)u = v′α(u).
v′αu = 0 ∀v′ ∈ V ′ ∀u ∈ U ⇒ αu = 0 ∀u ∈ U since V ′ separates V .
αu = 0 ∀u ∈ U ⇒ α = 0 by definition.
271
Exercise 11.3.7
(i) We have
  ((βα)′w′)u = w′((βα)u) = w′(β(αu)) = (β′w′)(αu) = (α′(β′w′))u = ((α′β′)w′)u
for all u ∈ U, so (βα)′w′ = (α′β′)w′ for all w′ ∈ W′, so (βα)′ = α′β′.
(ii) We have
  (ι′_U u′)v = u′(ι_U v) = u′v
for all v ∈ U, so ι′_U u′ = u′ for all u′ ∈ U′, so ι′_U = ι_{U′}.
(iii) If α is invertible, then αα^{−1} = ι, so, by the previous parts, (α^{−1})′α′ = ι′ = ι and similarly α′(α^{−1})′ = ι, so α′ is invertible and (α′)^{−1} = (α^{−1})′.
272
Exercise 11.3.8
We have
  α″(Θu)v′ = Θu(α′v′) = (α′v′)u = v′(αu) = (Θ(αu))v′
for all v′ ∈ V′. Since V″ is separated by V′,
  α″(Θu) = Θ(αu)
for all u ∈ U.
273
Exercise 11.3.9
(i) Certainly the zero sequence 0 ∈ c00.
If a, b ∈ c_00, we can find n and m such that a_r = 0 for r ≥ n and b_r = 0 for r ≥ m. Thus, if λ, µ ∈ R, we have λa_r + µb_r = 0 for all r ≥ max{n, m}, so λa + µb ∈ c_00. Thus c_00 is a subspace of s.
(ii) If x ∈ c_00, then we can find an n such that x_r = 0 for r ≥ n. We have
  x = Σ_{r=1}^{n−1} x_r e_r
and
  Tx = Σ_{r=1}^{n−1} x_r Te_r = Σ_{r=1}^{n−1} a_r x_r = Σ_{r=1}^{∞} a_r x_r.
(iii) If x ∈ c_00, then we can find an n such that x_r = 0 for r ≥ n. Thus
  T_a x = Σ_{r=1}^{n−1} a_r x_r = Σ_{r=1}^{∞} a_r x_r
is well defined.
If x, y ∈ c_00 and λ, µ ∈ R, we can find an n such that x_r = y_r = 0 for r ≥ n. Thus
  T_a(λx + µy) = Σ_{r=1}^{n−1} a_r(λx_r + µy_r) = λ Σ_{r=1}^{n−1} a_r x_r + µ Σ_{r=1}^{n−1} a_r y_r = λT_a x + µT_a y.
We have shown that T_a ∈ c′_00.
If x ∈ c_00 and x ≠ 0, then T_x x = Σ_{r=1}^{∞} x_r² ≠ 0. Thus c′_00 separates c_00.
(iv) Let θ : s → c′_00 be given by θa = T_a. By (iii), θ is well defined and, by (ii), θ is surjective.
If x ∈ c_00, then we can find an n such that x_r = 0 for all r ≥ n, and
  T_{λa+µb} x = Σ_{r=1}^{n} x_r(λa_r + µb_r) = λ Σ_{r=1}^{n} x_r a_r + µ Σ_{r=1}^{n} x_r b_r = λT_a x + µT_b x = (λT_a + µT_b)x.
Thus T_{λa+µb} = λT_a + µT_b and θ is linear. Further,
  T_a = 0 ⇔ T_a e_j = 0 ∀j ⇔ a_j = 0 ∀j ⇔ a = 0,
so θ is injective.
Thus c′00 is isomorphic to s.
274
Exercise 11.3.10
(i) If x ∈ c_00, then we can find an n such that x_r = 0 for r ≥ n. We have
  x = Σ_{r=1}^{n−1} x_r e_r.
(ii) Let w_j ∈ R^{n+1} be the row vector whose kth entry is the (m + k)th entry of f_j. Since n vectors cannot span a space of dimension n + 1, there exists a vector u ∈ R^{n+1} with
  u ∉ span{w_1, w_2, …, w_n}.
Set b_{m+r} = u_r.
(iii) Let m(1) = 0 and let m(n) = Σ_{r=1}^{n−1} (r + 1) for n ≥ 2. For each n ≥ 1, choose a_r with m(n) + 1 ≤ r ≤ m(n + 1) so that
  Σ_{j=1}^{n} λ_j f_j = a
has no solution with λ_j ∈ R [1 ≤ j ≤ n].
By construction, a does not lie in the span of the f_j. Thus s cannot be spanned by a countable set and is not isomorphic to c_00, which can. Thus c′_00, which is isomorphic to s, is not isomorphic to c_00.
275
Exercise 11.4.2
We use the fact that ej(xk) = δjk repeatedly.
To see that the e_j are linearly independent, observe that
  Σ_{j=0}^{n} λ_j e_j = 0 ⇒ Σ_{j=0}^{n} λ_j e_j(x_k) = 0 ∀k ⇒ λ_k = 0 ∀k.
To see that the e_j span, observe that, if P ∈ P_n, then
  P − Σ_{j=0}^{n} P(x_j)e_j
is a polynomial of degree at most n which vanishes at the n + 1 points x_k and is thus identically zero. Thus
  P = Σ_{j=0}^{n} P(x_j)e_j.
We have e_j(x_k) = δ_{jk} = ê_k e_j, so
  ê_k(Σ_{j=0}^{n} λ_j e_j) = Σ_{j=0}^{n} λ_j ê_k e_j = Σ_{j=0}^{n} λ_j e_j(x_k)
and ê_k P = P(x_k).
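The basis here is the Lagrange interpolation basis, and the expansion P = Σ_j P(x_j) e_j can be checked numerically. A sketch (my own illustration; the points and polynomial are arbitrary choices):

```python
# The Lagrange polynomials e_j satisfy e_j(x_k) = delta_jk, and every
# P of degree at most n satisfies P = sum_j P(x_j) e_j.
import numpy as np

xs = np.array([0.0, 1.0, 2.0, 3.0])   # n + 1 = 4 distinct points

def lagrange_basis(j, t):
    """e_j(t) = prod_{k != j} (t - x_k) / (x_j - x_k)."""
    num = np.prod([t - xs[k] for k in range(len(xs)) if k != j])
    den = np.prod([xs[j] - xs[k] for k in range(len(xs)) if k != j])
    return num / den

P = np.polynomial.Polynomial([1.0, -2.0, 0.0, 5.0])   # degree 3 <= n

t = 1.7   # arbitrary test point
approx = sum(P(xj) * lagrange_basis(j, t) for j, xj in enumerate(xs))
print(np.isclose(approx, P(t)))   # True
```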
276
Exercise 11.4.4
Observe thatejek = ekej = δjk
for all k so ˆej = ej.
277
Exercise 11.4.5
We have
  f_1 = ae_1 + be_2,  f_2 = ce_1 + de_2
and, since the ê_j form a basis of the dual space,
  f̂_1 = Aê_1 + Bê_2,  f̂_2 = Cê_1 + Dê_2.
Now
  1 = f̂_1f_1 = (Aê_1 + Bê_2)(ae_1 + be_2) = Aa + Bb
and
  0 = f̂_1f_2 = (Aê_1 + Bê_2)(ce_1 + de_2) = Ac + Bd.
Similarly Ca + Db = 0 and Cc + Dd = 1. In other words
  ( A B ; C D )( a c ; b d ) = I,
so
  Q = ( A B ; C D ) = ( a c ; b d )^{−1} = (ad − bc)^{−1} ( d −c ; −b a ),
and
  f̂_1 = (ad − bc)^{−1}(dê_1 − cê_2),  f̂_2 = (ad − bc)^{−1}(−bê_1 + aê_2).
278
Exercise 11.4.6
Using the summation convention,
  δ_{js} = f̂_s f_j = k_{rs} l_{ij} ê_r e_i = k_{rs} l_{ij} δ_{ri} = k_{rs} l_{rj}.
Thus I = L^T K and K = (L^T)^{−1}.
279
Exercise 11.4.9
Let A and B be n×n matrices and α and β the corresponding linearmaps for some fixed basis.
We know that (α + β)′ = α′ + β′, (λα)′ = λα′, α″ = α (since we work in a finite dimensional space), ι′ = ι and (αβ)′ = β′α′, so
  (A + B)^T = A^T + B^T, (λA)^T = λA^T, A^{TT} = A, I^T = I, (AB)^T = B^T A^T.
If A is invertible, then α is, so α′ is, with (α′)^{−1} = (α^{−1})′, so A^T is invertible with (A^T)^{−1} = (A^{−1})^T.
280
Exercise 11.4.10
Let α have matrix A with respect to some basis. Then α′ has matrix A^T with respect to the dual basis, so (by the matrix result)
  det α′ = det A^T = det A = det α.
It follows that
  det(tι − α′) = det((tι − α)′) = det(tι − α),
so α and α′ have the same characteristic polynomials and so the same eigenvalues.
The trace is minus the coefficient of t^{n−1} in the characteristic polynomial (supposed of degree n). Thus Tr α′ = Tr α.
281
Exercise 11.4.12
We certainly have the zero map 0 ∈ W^0. If λ, µ ∈ F, then
  u′, v′ ∈ W^0 ⇒ (λu′ + µv′)w = λu′w + µv′w = λ0 + µ0 = 0 ∀w ∈ W ⇒ λu′ + µv′ ∈ W^0.
Thus W^0 is a subspace of U′.
282
Exercise 11.4.15
  Σ_{j=1}^{n} x_j e_j ∈ W^{00} ⇔ (Σ_{j=1}^{n} x_j e_j)(Σ_{r=k+1}^{n} y_r ê_r) = 0 ∀y_r ∈ F
    ⇔ (Σ_{j=1}^{n} x_j e_j) ê_r = 0 ∀ k + 1 ≤ r ≤ n
    ⇔ x_r = 0 ∀ k + 1 ≤ r ≤ n
    ⇔ Σ_{j=1}^{n} x_j e_j ∈ W.
283
Exercise 11.4.19
  dim{u ∈ U : αu = λu} = dim (λι − α)^{−1}0
    = dim U − dim im(λι − α)
    = dim U − dim im((λι − α)′)
    = dim U − dim im(λι − α′)
    = dim{u′ ∈ U′ : α′u′ = λu′}.
The eigenvalues and the dimensions of the spaces spanned by the corresponding eigenvectors are the same for α and α′.
284
Exercise 11.4.20
  v′ ∈ (αU)^0 ⇒ v′(αu) = 0 ∀u ∈ U
    ⇒ (α′v′)u = v′(αu) = 0 ∀u ∈ U
    ⇒ α′v′ ∈ U^0,
so α′(αU)^0 is a subspace of U^0. (Remark that (αU)^0 is a subspace of V′, so α′(αU)^0 is a subspace of V′.)
Let V have basis e_1, e_2 and V′ dual basis ê_1, ê_2. Let α be the endomorphism given by α(e_1) = e_1, α(e_2) = 0. We observe that
  α′(ê_1) = ê_1,  α′(ê_2) = 0.
If U = span{e_1}, then αU = U, (αU)^0 = span{ê_2} and
  α′(αU)^0 = {0} ≠ U^0.
If W = span{e_2}, then αW = {0}, (αW)^0 = V′ and
  α′(αW)^0 = span{ê_1} = W^0.
285
Exercise 12.1.2
If U = U1 ⊕ U2 ⊕ . . .⊕ Um and u ∈ U then, by Definition 12.1.1 (i),we can find uj ∈ Uj such that
u = u1 + u2 + . . .+ um.
If v_j ∈ U_j and
  u = v_1 + v_2 + … + v_m,
then u_j − v_j ∈ U_j and
  0 = u − u = (u_1 − v_1) + (u_2 − v_2) + … + (u_m − v_m),
so, by Definition 12.1.1 (ii), u_j − v_j = 0 and u_j = v_j for all j.
If the equation
  u = u_1 + u_2 + … + u_m
has exactly one solution with u_j ∈ U_j, then the conditions of Definition 12.1.1 can be read off.
286
Exercise 12.1.3
(i) If u ∈ U, then we can find u_j ∈ U_j such that
  u = u_1 + u_2 + … + u_m.
For each j we can find λ_jk ∈ F such that
  u_j = Σ_{k=1}^{n(j)} λ_jk e_jk
and so
  u = Σ_{j=1}^{m} Σ_{k=1}^{n(j)} λ_jk e_jk.
Thus the e_jk span U.
If
  0 = Σ_{j=1}^{m} Σ_{k=1}^{n(j)} λ_jk e_jk,
then, setting u_j = Σ_{k=1}^{n(j)} λ_jk e_jk, we have u_j ∈ U_j and, applying the definition of direct sum, we obtain
  0 = u_j = Σ_{k=1}^{n(j)} λ_jk e_jk,
so λ_jk = 0 for all k and j. Thus the e_jk are linearly independent and so form a basis.
(ii) and (iii) Since every subspace of a finite dimensional space is finite dimensional,
  U finite dimensional ⇒ U_j finite dimensional.
If the U_j are finite dimensional, then we can choose bases as in (i) and observe that U is finite dimensional with
  dim U = Σ_{j=1}^{m} n(j) = Σ_{j=1}^{m} dim U_j.
287
Exercise 12.1.4
Observe that B = U ∩ W is a subspace of a finite dimensional space, so it has a basis e_1, e_2, …, e_k. Since U is a subspace of a finite dimensional space containing B, we can extend the basis of B to a basis of U
  e_1, e_2, …, e_k, e_{k+1}, e_{k+2}, …, e_{k+l}.
Similarly we can extend the basis of B to a basis of W
  e_1, e_2, …, e_k, e_{k+l+1}, e_{k+l+2}, …, e_{k+l+r}.
Let
  A = span{e_{k+1}, e_{k+2}, …, e_{k+l}} and C = span{e_{k+l+1}, e_{k+l+2}, …, e_{k+l+r}}.
We show that the e_j are linearly independent as follows. If
  Σ_{j=1}^{k+l+r} λ_j e_j = 0,
then
  Σ_{j=1}^{k+l} λ_j e_j = − Σ_{j=k+l+1}^{k+l+r} λ_j e_j ∈ U ∩ W = B.
Thus
  Σ_{j=1}^{k+l} λ_j e_j = Σ_{j=1}^{k} µ_j e_j
for some µ_j. Since e_1, …, e_{k+l} are linearly independent, λ_j = 0 for k + 1 ≤ j ≤ k + l. Thus
  Σ_{j=1}^{k} λ_j e_j + Σ_{j=k+l+1}^{k+l+r} λ_j e_j = 0,
so λ_j = 0 for k + l + 1 ≤ j ≤ k + l + r and for 1 ≤ j ≤ k.
We can now read off the desired results:
  U = A ⊕ B, W = B ⊕ C and U + W = U ⊕ C = W ⊕ A.
Observe that if
  U = A′ ⊕ B′, W = B′ ⊕ C′ and U + W = U ⊕ C′ = W ⊕ A′,
then B′ ⊆ U and B′ ⊆ W, so B′ ⊆ U ∩ W.
If v ∈ U ∩ W, then we can find b_1, b_2 ∈ B′ and a ∈ A′, c ∈ C′ such that
  v = a + b_1 = c + b_2,
so a − c = b_2 − b_1 ∈ B′. Thus a = c + (b_2 − b_1)
288
with a ∈ A′ ⊆ U and b_2 − b_1 ∈ B′ ⊆ U, so c = a − (b_2 − b_1) ∈ U ∩ C′. Since U + W = U ⊕ C′, we have c = 0, so
  v = c + b_2 = b_2 ∈ B′.
Thus B′ = U ∩ W and we have shown B unique.
A and C are not unique. Take V = R³ (using row vectors),
  U = {(x, y, 0) : x, y ∈ R}, W = {(x, 0, z) : x, z ∈ R}, B = {(x, 0, 0) : x ∈ R},
  A = {(0, y, 0) : y ∈ R}, A′ = {(y, y, 0) : y ∈ R},
  C = {(0, 0, z) : z ∈ R}, C′ = {(z, 0, z) : z ∈ R}.
Then
  U = A ⊕ B, W = B ⊕ C and U + W = U ⊕ C = W ⊕ A
and
  U = A′ ⊕ B, W = B ⊕ C′ and U + W = U ⊕ C′ = W ⊕ A′.
289
Exercise 12.1.6
(i) Can occur. Take V = V1 = V2 = F2.
(ii) Cannot occur. If e1, e2, . . . ek is a basis for V1 and ek+1, ek+2,. . . ek+m is a basis for V2 then e1, e2, . . . ek+m span V1 + V2 so
dim(V1 + V2) ≤ k +m = dimV1 + dimV2
(iii) Can occur. Let e_1, e_2 be a basis for V = F² and
V1 = V2 = span{e1}.
290
Exercise 12.1.7
We know from Theorem 11.2.2 that there exists an m such that α^{m+1}U = α^mU. Thus α^mU is an invariant subspace.
If W is an invariant subspace,
  W = α^mW ⊆ α^mU.
Thus α^mU is the unique maximal invariant subspace.
(ii) If x ∈ M ∩ N, then α^m x = 0. But α|_M : M → M is surjective, so invertible, so (α|_M)^m is invertible, so x = 0. By the rank–nullity theorem dim M + dim N = dim U, so U = M ⊕ N.
(iii) α(M) = M ⊆ M , αmα(N) = ααmN = 0 so αN ⊆ N .
(iv) As we observed in (ii), β : M → M is surjective, so β is invert-ible, so an isomorphism. We know that γm = 0.
(v) Observe that M′ is an invariant subspace, so M′ ⊆ M. If b ∈ N′, then
  α^r(b) = (γ′)^r b = 0
for r large, so b ∈ N. Thus N′ ⊆ N. We have dim M′ ≤ dim M, dim N′ ≤ dim N and
  dim U = dim M′ + dim N′ ≤ dim M + dim N = dim U,
so dim M′ = dim M, dim N′ = dim N and M = M′, N = N′. By considering b = 0, we have β = β′ and, similarly, by considering a = 0, we have γ = γ′.
(vi) Let A correspond to a linear map α for some choice of basis for a space U of appropriate dimension. Choose a basis M for M as defined above and a basis N for N. Then M, N together form a basis for U with respect to which α has the form
(
B 00 C
)
with B an invertible r × r matrix and C a nilpotent (n − r) × (n − r) matrix. The result now follows from the change of basis formula.
291
Exercise 12.1.11
Let dim V = n and dim U = r. If U is neither V nor {0}, then 0 < r < n. Choose a basis
  e_1, e_2, …, e_r
of U and extend it to a basis
  e_1, e_2, …, e_n
of V.
If
  W = span{e_{r+1}, e_{r+2}, …, e_n} and W′ = span{e_1 + e_{r+1}, e_{r+2}, …, e_n},
then V = U ⊕ W = U ⊕ W′ but e_{r+1} ∈ W \ W′, so W ≠ W′.
292
Exercise 12.1.13
(i) With respect to a suitable basis, θ has an n² × n² diagonal matrix with dim U diagonal entries taking the value 1 and dim V diagonal entries taking the value −1, since θ|_U = ι|_U and θ|_V = −ι|_V.
(ii) det θ = (−1)^{dim V} = (−1)^{n(n−1)/2}. Now
  (n + 4)((n + 4) − 1)/2 − n(n − 1)/2 = (n² + 8n + 16 − n − 4 − n² + n)/2 = 4n + 6
is divisible by 2, so det θ depends only on the value of n modulo 4. By inspection,
n ≡ 0 ⇒ det θ = 1,
n ≡ 1 ⇒ det θ = 1,
n ≡ 2 ⇒ det θ = −1,
n ≡ 3 ⇒ det θ = −1.
293
Exercise 12.1.14
Automatically 0 ∈ V and, if λ, µ ∈ R,
  f, g ∈ V ⇒ (λf + µg)(−x) = λf(−x) + µg(−x) = −λf(x) − µg(x) = −(λf + µg)(x) ∀x ⇒ λf + µg ∈ V,
so V is a subspace. A similar argument shows that U is a subspace.
If f ∈ C(R), then u(x) = (f(x) + f(−x))/2 and v(x) = (f(x) − f(−x))/2 define u ∈ U, v ∈ V. Since f = u + v, we have U + V = C(R).
Now if f ∈ U ∩ V, then f(x) = f(−x) = −f(x), so f(x) = 0 for all x ∈ R and f = 0. Thus U ∩ V = {0} and U and V are complementary subspaces.
294
Exercise 12.1.15
(i)⇒(ii) Let U = αV and W = (ι − α)V.
If u ∈ U, then u = αv for some v ∈ V, so
  αu = α²v = αv = u,
so α|_U = ι|_U. If u ∈ W, then u = (ι − α)v for some v ∈ V, so
  αu = (α − α²)v = αv − αv = 0,
so α|_W = 0|_W. Since v = αv + (ι − α)v for all v ∈ V, we have V = U + W.
If u ∈ U ∩ W, then
  u = α|_U u = αu = α|_W u = 0.
Thus U ∩ W = {0} and V = U ⊕ W.
(ii)⇒(iii) Let e_1, e_2, …, e_r be a basis of U and e_{r+1}, …, e_n be a basis of W. Then, with respect to the basis e_1, e_2, …, e_n of V, α has a matrix of the stated type.
(iii)⇒(i) Since A² = A, we have α² = α.
We have α_1(x, y) = (x, 0) = α_1²(x, y), so α_1 is a projection.
We have α_2(x, y) = (y, 0) ≠ (0, 0) = α_2²(x, y) if y = 1, so α_2 is not a projection.
We have α_3(x, y) = (y, x) ≠ (x, y) = α_3²(x, y) if x = 0, y = 1, so α_3 is not a projection.
We have α_4(x, y) = (x + y, 0) = α_4²(x, y), so α_4 is a projection.
We have α_5(x, y) = (x + y, x + y) ≠ (2(x + y), 2(x + y)) = α_5²(x, y) if x = 0, y = 1, so α_5 is not a projection.
We have α_6(x, y) = (½(x + y), ½(x + y)) = α_6²(x, y), so α_6 is a projection.
295
Exercise 12.1.16
If α is a projection, then
  (ι − α)² = ι − 2α + α² = ι − 2α + α = ι − α,
so ι − α is a projection.
If ι − α is a projection, then the first paragraph tells us that α =ι− (ι− α) is a projection.
296
Exercise 12.1.17
If α and β are projections and αβ = βα, then
  (αβ)² = (αβ)(αβ) = α(βα)β = α(αβ)β = α²β² = αβ.
We work in R² with row vectors. If
  β(x, y) = (x + y, 0), α(x, y) = (0, y),
then α² = α, β² = β,
  βα(x, y) = (y, 0), αβ(x, y) = (0, 0),
so (αβ)² = αβ but (βα)² ≠ βα.
If
  β(x, y) = (x + y, 0), α(x, y) = (0, x + y),
then α² = α, β² = β,
  βα(x, y) = (x + y, 0), αβ(x, y) = (0, x + y),
so (αβ)² = αβ and (βα)² = βα, but (looking e.g. at (x, y) = (1, 1)) αβ ≠ βα.
297
Exercise 12.1.18
(i) If αβ = −βα, then
  αβ = ααβ = α(−βα) = −(αβ)α = (βα)α = βα² = βα.
Thus 2αβ = 0, so αβ = 0 and βα = 0.
(ii) If α + β is a projection, then
  α + αβ + βα + β = (α + β)² = α + β,
so αβ = −βα and, by (i), αβ = βα = 0.
If αβ = βα = 0, then
  (α + β)² = α + αβ + βα + β = α + β,
so α + β is a projection.
(iii) If α − β is a projection, then
  α − αβ − βα + β = (α − β)² = α − β,
so αβ + βα = 2β, whence (ι − α)β = −β(ι − α). Since ι − α is a projection, (i) gives (ι − α)β = β(ι − α) = 0, so αβ = βα = β.
If αβ = βα = β, then
(α− β)2 = α− αβ − βα + β = α− β,
so α− β is a projection.
298
Exercise 12.1.19
If α is diagonalisable with distinct eigenvalues λ_1, λ_2, …, λ_m, then U is the direct sum of the spaces
  E_j = {e : αe = λ_j e}.
If u ∈ U, we can write u uniquely as
  u = e_1 + e_2 + … + e_m
with e_j ∈ E_j. If we set π_j(u) = e_j for 1 ≤ j ≤ m, then direct verification shows that π_j is linear and π_j is a projection. By inspection π_kπ_j = 0 when k ≠ j and
  α = λ_1π_1 + λ_2π_2 + … + λ_mπ_m.
Conversely, if the stated conditions hold and we write E_j = π_jU, we see that E_j is a subspace and π_iE_j = π_iπ_jU = {0} for i ≠ j. Since ι = π_1 + π_2 + … + π_m, we have
  u = ιu = Σ_{j=1}^{m} π_j u = Σ_{j=1}^{m} u_j
with u_j = π_j u ∈ E_j. On the other hand, if e_j ∈ E_j and
  Σ_{j=1}^{m} e_j = 0,
then applying π_i to both sides we get e_i = 0 for all i. Thus
  U = E_1 ⊕ E_2 ⊕ … ⊕ E_m.
Let B_j be a basis for E_j. The set B = ∪_{j=1}^{m} B_j is a basis for U. The conditions π_kπ_j = 0 when k ≠ j and
  α = λ_1π_1 + λ_2π_2 + … + λ_mπ_m
show that B is a basis of eigenvectors, so α is diagonalisable.
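The spectral projections for a diagonalisable map can also be written as polynomials in α, namely π_j = Π_{i≠j} (α − λ_iι)/(λ_j − λ_i), and the conditions above can then be checked by machine. A numerical sketch (my own illustration; the matrix and eigenvalues are an arbitrary example):

```python
# For a diagonalisable alpha with distinct eigenvalues lambda_j, the maps
# pi_j = prod_{i != j} (alpha - lambda_i I)/(lambda_j - lambda_i)
# are projections with pi_k pi_j = 0 (k != j) and alpha = sum lambda_j pi_j.
import numpy as np

A = np.diag([2.0, 2.0, 5.0, -1.0])        # diagonalisable; eigenvalues 2, 5, -1
lams = [2.0, 5.0, -1.0]
n = A.shape[0]

def proj(j):
    P = np.eye(n)
    for i, lam in enumerate(lams):
        if i != j:
            P = P @ (A - lam * np.eye(n)) / (lams[j] - lam)
    return P

pis = [proj(j) for j in range(len(lams))]
recon = sum(lam * P for lam, P in zip(lams, pis))
print(np.allclose(recon, A))                       # True: alpha = sum lambda_j pi_j
print(all(np.allclose(P @ P, P) for P in pis))     # True: each pi_j is a projection
print(np.allclose(pis[0] @ pis[1], 0))             # True: pi_k pi_j = 0 for k != j
```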
299
Exercise 12.2.1
Let E(rs) = (δ_{ir}δ_{js})_{1≤i,j≤n}. Then, if A = (a_{ij}),
  A = Σ_{r=1}^{n} Σ_{s=1}^{n} a_{rs} E(rs),
so the E(rs) span M_n(F).
Also
  Σ_{r=1}^{n} Σ_{s=1}^{n} b_{rs} E(rs) = 0 ⇒ B = 0 ⇒ b_{rs} = 0 ∀r ∀s,
so the E(rs) form a basis for M_n(F). Thus dim M_n(F) = n².
It follows that the n² + 1 elements A^j with 0 ≤ j ≤ n² must be linearly dependent, i.e. we can find a_j ∈ C, not all zero, such that Σ_{j=0}^{n²} a_j A^j = 0.
In other words, there is a non-trivial polynomial P of degree at most n² such that P(A) = 0.
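The dependence can be exhibited concretely by linear algebra on the vectorised powers of A. A sketch (my own illustration; the matrix is a random example):

```python
# The n^2 + 1 matrices I, A, ..., A^(n^2) lie in the n^2-dimensional
# space M_n, so a nonzero null vector of the "matrix of powers" gives
# coefficients a_j with sum_j a_j A^j = 0.
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))

powers = [np.linalg.matrix_power(A, j).ravel() for j in range(n * n + 1)]
M = np.column_stack(powers)                  # n^2 x (n^2 + 1) matrix

_, _, Vt = np.linalg.svd(M)
a = Vt[-1]                                   # a null-space direction of M
P_of_A = sum(aj * np.linalg.matrix_power(A, j) for j, aj in enumerate(a))
print(np.allclose(P_of_A, 0))                # True
```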
300
Exercise 12.2.2
(i) Observe that
  χ_D(t) = Π_{j=1}^{n} (t − λ_j).
The jth diagonal entry of D − λ_jI is zero, so the ith diagonal entry of Π_{j=1}^{n} (D − λ_jI), namely Π_{j=1}^{n} (d_{ii} − λ_j), vanishes (take j = i). Thus
  Σ_{k=0}^{n} b_k D^k = Π_{j=1}^{n} (D − λ_jI) = 0.
(ii) We have A = P^{−1}DP for some invertible matrix P. Then χ_D = χ_A and
  χ_A(A) = χ_D(A) = χ_D(P^{−1}DP) = P^{−1}χ_D(D)P = P^{−1}0P = 0,
as stated.
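The conclusion χ_A(A) = 0 is easy to check numerically. A sketch (my own illustration; a random matrix is generically diagonalisable over C):

```python
# Cayley-Hamilton check: chi_A(A) = 0.
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))       # generically diagonalisable over C

c = np.poly(A)                        # coefficients of det(tI - A), leading term first
chi_A = sum(cj * np.linalg.matrix_power(A, n - j) for j, cj in enumerate(c))
print(np.allclose(chi_A, 0))          # True
```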
301
Exercise 12.2.3
(i) Let C = B^{−1}. Using the summation convention,
  Tr B^{−1}AB = Tr CAB = c_{ij}a_{jk}b_{ki} = b_{ki}c_{ij}a_{jk} = δ_{kj}a_{jk} = a_{kk} = Tr A.
(Lots of other proofs are available.)
(ii) Q_A(t) = Tr(tI − A) = tδ_{ii} − a_{ii} = nt − Tr A. Part (i) yields Q_{B^{−1}AB} = Q_A.
(iii) Q_A(A) = nA − (Tr A)I. If a_{11} = −a_{22} = 1 and a_{ij} = 0 otherwise, then Q_A(A) ≠ 0.
If n = 1, Q_{(a_{11})}(t) = t − a_{11} and Q_{(a_{11})}((a_{11})) = (a_{11}) − a_{11}(1) = (0).
302
Exercise 12.2.6
The roots of the characteristic polynomial of a triangular matrix are the diagonal entries. If every real n × n matrix were triangularisable by change of basis, then, since the characteristic equation is unaltered by change of basis, every root of the characteristic equation would be real. Taking A = (a_{ij}) with a_{rr} = 1 for r ≠ 1, 2, a_{12} = −a_{21} = 1 and a_{ij} = 0 otherwise, we see that this is not the case.
If dimV = 1 every matrix is triangular.
303
Exercise 12.2.7
(i) Choose a basis such that α has an upper triangular matrix with respect to that basis. The eigenvalues of α are the diagonal entries.
With respect to the chosen basis, α^r has matrix A^r, which is upper triangular with diagonal entries the rth powers of the diagonal entries of A. Thus α^r has an eigenvalue µ if and only if α has an eigenvalue λ with λ^r = µ.
(ii) If α is invertible, then α^r has an eigenvalue µ if and only if α has an eigenvalue λ with λ^r = µ.
To prove this, observe that, if β is invertible with eigenvalue λ and associated eigenvector e, then
  β^{−1}βe = e,
so λ ≠ 0 and β^{−1} has eigenvalue λ^{−1} with associated eigenvector e.
(iii) False. Let
  A = ( 0 −1 ; 1 0 ).
Then A has no real eigenvalues but A² = −I has −1 as an eigenvalue.
304
Exercise 12.2.8
(i) We have
  ( 0 a_12 a_13 ; 0 a_22 a_23 ; 0 0 a_33 )( b_11 b_12 b_13 ; 0 0 b_23 ; 0 0 b_33 )( c_11 c_12 c_13 ; 0 c_22 c_23 ; 0 0 0 )
    = ( 0 a_12 a_13 ; 0 a_22 a_23 ; 0 0 a_33 )( b_11c_11  b_11c_12 + b_12c_22  b_11c_13 + b_12c_23 ; 0 0 0 ; 0 0 0 )
    = ( 0 0 0 ; 0 0 0 ; 0 0 0 ).
(ii) Let T_j be an n × n upper triangular matrix with the jth diagonal entry 0. Then T_1T_2 … T_n = 0.
We can prove this by induction on n using matrices, or as follows. Let τ_j be the linear map on an n dimensional vector space U which has matrix T_j with respect to some basis e_1, e_2, …, e_n. Then, writing
  E_j = span{e_1, …, e_j}
for 1 ≤ j ≤ n and E_0 = {0}, we have τ_jE_j ⊆ E_{j−1}. Thus
  τ_1τ_2 … τ_nU = τ_1τ_2 … τ_nE_n ⊆ E_0 = {0}
and τ_1τ_2 … τ_n = 0, so T_1T_2 … T_n = 0.
(iii) Not necessarily true:
  ( 1 1 1 ; 0 1 1 ; 0 0 0 )( 1 1 1 ; 0 0 1 ; 0 0 1 )( 0 1 1 ; 0 1 1 ; 0 0 1 )( 0 ; 0 ; 1 )
    = ( 1 1 1 ; 0 1 1 ; 0 0 0 )( 1 1 1 ; 0 0 1 ; 0 0 1 )( 1 ; 1 ; 1 )
    = ( 1 1 1 ; 0 1 1 ; 0 0 0 )( 3 ; 1 ; 1 )
    = ( 5 ; 2 ; 0 ),
so the matrix product is not zero.
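Both claims can be confirmed by machine. A sketch (my own check; in the first part the entries are instantiated with random values):

```python
# Part (ii): upper triangular T_j with jth diagonal entry 0, taken in
# order, give T1 T2 T3 = 0.  Part (iii): a reordered example is nonzero.
import numpy as np

rng = np.random.default_rng(2)
a, b, c = rng.standard_normal((3, 3, 3))
T1, T2, T3 = np.triu(a), np.triu(b), np.triu(c)
T1[0, 0] = T2[1, 1] = T3[2, 2] = 0    # jth diagonal entry of Tj is 0
print(np.allclose(T1 @ T2 @ T3, 0))   # True: the ordered product vanishes

M1 = np.array([[1, 1, 1], [0, 1, 1], [0, 0, 0]])
M2 = np.array([[1, 1, 1], [0, 0, 1], [0, 0, 1]])
M3 = np.array([[0, 1, 1], [0, 1, 1], [0, 0, 1]])
print(M1 @ M2 @ M3 @ np.array([0, 0, 1]))   # [5 2 0] -- not zero
```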
305
Exercise 12.2.9
By induction on r, βα^r = α^rβ, so, by induction on s, β^sα^r = α^rβ^s. Thus
  (Σ_{j=0}^{n} a_j α^j)(Σ_{k=0}^{m} b_k β^k) = Σ_{j=0}^{n} Σ_{k=0}^{m} a_j b_k α^j β^k = Σ_{k=0}^{m} Σ_{j=0}^{n} b_k a_j β^k α^j = (Σ_{k=0}^{m} b_k β^k)(Σ_{j=0}^{n} a_j α^j).
306
Exercise 12.2.11
Setting t = 0, we have
  0 ≠ det A, so a_0 = det(−A) = (−1)^n det A ≠ 0.
By the Cayley–Hamilton theorem,
  a_0 I = − Σ_{j=1}^{n} a_j A^j,
so, multiplying by a_0^{−1}A^{−1}, we get
  A^{−1} = −a_0^{−1} Σ_{j=1}^{n} a_j A^{j−1}.
This does not appear to be a good method. The computation of the a_j appears to require the evaluation of
  det(tI − A),
which appears to require at least as many operations as inverting a matrix by Gaussian elimination.
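The formula does work, whatever its cost. A numerical sketch (my own illustration; the matrix is a random example):

```python
# A^{-1} from the Cayley-Hamilton relation: with det(tI - A) = sum_j a_j t^j,
# a_0 I = -(a_1 A + ... + a_n A^n), so A^{-1} = -a_0^{-1} sum_{j>=1} a_j A^{j-1}.
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))       # generically invertible

c = np.poly(A)                        # det(tI - A), leading coefficient first
a = c[::-1]                           # a[j] = coefficient of t^j

Ainv = -sum(a[j] * np.linalg.matrix_power(A, j - 1)
            for j in range(1, n + 1)) / a[0]
print(np.allclose(Ainv @ A, np.eye(n)))   # True
```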
307
Exercise 12.2.12
Let M be the smallest value of m with α^m = 0.
We know that there exist a_j such that
  α^n = Σ_{j=0}^{n−1} a_j α^j,
and so either α^n = 0 or there exists a k with n − 1 ≥ k ≥ 0 such that
  α^n = Σ_{j=k}^{n−1} a_j α^j
and a_k ≠ 0. But then, applying α^{M−k−1} to both sides, we get
  0 = a_k α^{M−1},
so α^{M−1} = 0, contrary to our definition of M.
308
Exercise 12.3.1
(i) We have
det(tI − A1) = det(tI − A2) = t2.
A1 and A2 are not similar since they have different ranks.
(ii) We have
det(tI −A1) = det(tI −A2) = det(tI −A3) = t3.
None of A1, A2, A3 are similar since they have different ranks.
(iii) Set
A8 =
0 0 0 00 0 0 00 0 0 00 0 0 0
, A9 =
0 1 0 00 0 1 00 0 0 00 0 0 0
.
and
A10 =
0 1 0 00 0 1 00 0 0 10 0 0 0
Thendet(tI −Aj) = t4
for 6 ≤ j ≤ 10. Now A6 has rank 1, A7 and A9 have rank 2, A8 hasrank 0 and A9 has rank 3.
A27 = 0, A2
9 =
0 0 1 00 0 0 00 0 0 00 0 0 0
so A27 has rank 0 and A2
9 rank 1. None of the Aj can be similar.
309
Exercise 12.3.3
(i) Since the characteristic polynomials have the form t^n, so do the minimal polynomials. The minimal polynomial of A_k is m_k(t) = t^r where A_k^r = 0 and A_k^{r−1} ≠ 0. Thus
  m_1(t) = t, m_2(t) = t², m_3(t) = t, m_4(t) = t², m_5(t) = t³, m_6(t) = t², m_7(t) = t², m_8(t) = t, m_9(t) = t³, m_10(t) = t⁴.
(ii) We have
  det(tI − A) = (t − 1)²(t − 2)² = det(tI − B).
Now (A − I)(A − 2I)² ≠ 0 and (A − I)²(A − 2I) ≠ 0, so A has minimal polynomial (t − 1)²(t − 2)².
We also have (B − I)(B − 2I) ≠ 0 but (B − I)²(B − 2I) = 0, so B has minimal polynomial (t − 1)²(t − 2).
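The method of testing divisors of the characteristic polynomial can be carried out by machine. A sketch (my own illustration; the matrix below is an assumed example of the second kind, with one Jordan block J_2(1) and two blocks J_1(2)):

```python
# Among the divisors (t-1)^a (t-2)^b of the characteristic polynomial,
# the minimal polynomial is the least-degree one annihilating the matrix.
import numpy as np

A = np.array([[1.0, 1, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 2]])     # char poly (t-1)^2 (t-2)^2
I = np.eye(4)

def vanish(a, b):
    return np.allclose(np.linalg.matrix_power(A - I, a)
                       @ np.linalg.matrix_power(A - 2 * I, b), 0)

print(vanish(2, 1))   # True:  (A-I)^2 (A-2I) = 0
print(vanish(1, 1))   # False
print(vanish(2, 0))   # False
```

So the minimal polynomial of this A is (t − 1)²(t − 2).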
310
Exercise 12.3.6
(i) Π_{i=1}^{r} (D − λ_iI) is a diagonal matrix with jth diagonal entry Π_{i=1}^{r} (d_{jj} − λ_i) = 0, so Π_{i=1}^{r} (D − λ_iI) = 0.
Now suppose, without loss of generality, that d_{11} = λ_1. Then Π_{i=2}^{r} (D − λ_iI) is a diagonal matrix with first diagonal entry Π_{i=2}^{r} (d_{11} − λ_i) ≠ 0, so Π_{i=2}^{r} (D − λ_iI) ≠ 0. Thus Π_{i=1}^{r} (t − λ_i) is the minimal polynomial for D.
(ii) Choose a basis of eigenvectors e_1, e_2, …, e_n. With respect to this basis, α has a diagonal matrix D with jth diagonal entry the eigenvalue associated with e_j. Thus, if λ is associated with k of the basis vectors, det(tI − D) = det(tι − α) has (t − λ)^k as a factor but not (t − λ)^{k+1}.
Now, writing S for the set of eigenvalues which are not equal to λ,
  dim U_λ = dim(Π_{µ∈S} (µι − α)U) = dim span{Π_{µ∈S} (µι − α)e_j : 1 ≤ j ≤ n} = k.
311
Exercise 12.3.7
(i) If P has a repeated root θ, then P(t) = (t − θ)^kQ(t) with Q a polynomial and k ≥ 2. We have
  P′(t) = k(t − θ)^{k−1}Q(t) + (t − θ)^kQ′(t) = (t − θ)^{k−1}(kQ(t) + (t − θ)Q′(t)),
so t − θ is a factor of both P(t) and P′(t).
The converse is true. If P(t) = (t − θ)Q(t) with Q a polynomial having Q(θ) ≠ 0, then
  P′(t) = Q(t) + (t − θ)Q′(t),
so P′(θ) = Q(θ) ≠ 0.
(ii) Since A^m = I, the minimal polynomial m_A of A must divide t^m − 1. But t^m − 1 has no repeated roots (by part (i), or otherwise). Thus m_A has no repeated roots, so A is diagonalisable.
If A is a real symmetric matrix, then its eigenvalues are real, so we have A = LDL^{−1} with L invertible (indeed we may take L orthogonal) and D real diagonal. Then D^m = I, so d_{jj}^m = 1; since d_{jj} is real, d_{jj} = ±1, so D² = I and A² = I.
312
Exercise 12.3.9
(i) Let ∂Q denote the degree of Q. If P ≠ {0}, then the set
  E = {∂P : P ∈ P \ {0}}
is a non-empty set of non-negative integers and so has a least member N.
Let Q_0 ∈ P have degree N and leading coefficient a. If we set P_0 = a^{−1}Q_0, then P_0 is a monic polynomial of smallest degree in P.
If Q ∈ P, then Q = SP_0 + R with S, R polynomials and ∂R < N. Now R = Q − SP_0 ∈ P, so, by the definition of N, R = 0 and Q = SP_0.
(ii) Observe that P satisfies the hypotheses of (i).
If S divides each P_j, then S divides each Q_jP_j and so divides P_0.
(iii) Take P_j(t) = Π_{i≠j} (t − λ_i)^{m(i)} and apply (ii).
313
Exercise 12.3.11
Write α = α_1 ⊕ α_2 ⊕ … ⊕ α_r. The expansion
  u = u_1 + u_2 + … + u_r
with u_j ∈ U_j always exists and is unique, so
  α(u) = α_1u_1 + α_2u_2 + … + α_ru_r
is well defined.
If u, v ∈ U and λ, µ ∈ F, then we can write
  u = u_1 + u_2 + … + u_r,  v = v_1 + v_2 + … + v_r
with u_j, v_j ∈ U_j. We then have
  λu + µv = (λu_1 + µv_1) + (λu_2 + µv_2) + … + (λu_r + µv_r)
and λu_j + µv_j ∈ U_j, so
  α(λu + µv) = α_1(λu_1 + µv_1) + α_2(λu_2 + µv_2) + … + α_r(λu_r + µv_r)
    = (λα_1u_1 + µα_1v_1) + (λα_2u_2 + µα_2v_2) + … + (λα_ru_r + µα_rv_r)
    = λ(α_1u_1 + α_2u_2 + … + α_ru_r) + µ(α_1v_1 + α_2v_2 + … + α_rv_r)
    = λαu + µαv,
so α is linear.
314
Exercise 12.3.13
(i) We have
  ι = Σ_{j=1}^{r} Q_j(α) Π_{i≠j} (α − λ_iι)^{m(i)},
so, if u ∈ U, we have
  u = Σ_{j=1}^{r} Q_j(α) Π_{i≠j} (α − λ_iι)^{m(i)} u = Σ_{j=1}^{r} u_j
where
  u_j = Q_j(α) Π_{i≠j} (α − λ_iι)^{m(i)} u.
Since
  (α − λ_jι)^{m(j)} u_j = (α − λ_jι)^{m(j)} Q_j(α) Π_{i≠j} (α − λ_iι)^{m(i)} u = Q_j(α) Π_{i=1}^{r} (α − λ_iι)^{m(i)} u = Q_j(α)Q(α)u = Q_j(α)0 = 0,
we have u_j ∈ U_j. Thus
  U = U_1 + U_2 + … + U_r.
(ii) Next we observe that, if v ∈ U_k with k ≠ j, then
  Q_j(α) Π_{i≠j} (α − λ_iι)^{m(i)} v = 0.
Thus, if v ∈ U_k, we have
  v = Σ_{j=1}^{r} Q_j(α) Π_{i≠j} (α − λ_iι)^{m(i)} v = Q_k(α) Π_{i≠k} (α − λ_iι)^{m(i)} v.
In particular, Π_{i≠k} (α − λ_iι)^{m(i)} v ≠ 0 if v ≠ 0.
Now, if v_j ∈ U_j and
  Σ_{j=1}^{r} v_j = 0,
then
  Π_{i≠k} (α − λ_iι)^{m(i)} v_k = Π_{i≠k} (α − λ_iι)^{m(i)} Σ_{j=1}^{r} v_j = 0,
so v_k = 0 for each k. Thus
  U = U_1 ⊕ U_2 ⊕ … ⊕ U_r.
315
(iii) We have
  u ∈ U_j ⇒ (α − λ_jι)^{m(j)}u = 0 ⇒ (α − λ_jι)^{m(j)}αu = α(α − λ_jι)^{m(j)}u = 0 ⇒ αu ∈ U_j.
If u ∈ U, then
  u = Σ_{j=1}^{r} u_j
with u_j ∈ U_j and
  αu = α Σ_{j=1}^{r} u_j = Σ_{j=1}^{r} αu_j = Σ_{j=1}^{r} α_ju_j = (α_1 ⊕ α_2 ⊕ … ⊕ α_r)u,
so α = α_1 ⊕ α_2 ⊕ … ⊕ α_r.
(iv) By the definition of U_j, (α − λ_jι)^{m(j)}U_j = {0}, so the minimal polynomial for α_j must have the form (t − λ_j)^{p(j)} with p(j) ≤ m(j). But if u ∈ U, then u = Σ_{s=1}^{r} u_s with u_s ∈ U_s and
  Π_{j=1}^{r} (α − λ_jι)^{p(j)} u = Σ_{s=1}^{r} Π_{j=1}^{r} (α − λ_jι)^{p(j)} u_s = Σ_{s=1}^{r} 0 = 0,
so the minimal polynomial of α must divide Π_{j=1}^{r} (t − λ_j)^{p(j)}, so p(j) = m(j) for all j.
316
Exercise 12.3.14
Let e_1, e_2, …, e_r be a basis for U with respect to which α has matrix A, and let e_{r+1}, e_{r+2}, …, e_{r+s} be a basis for V with respect to which β has matrix B. Then e_1, e_2, …, e_{r+s} is a basis for W with respect to which α ⊕ β has matrix
  ( A 0 ; 0 B ).
Thus
  χ_{α⊕β}(t) = det( tI − A 0 ; 0 tI − B ) = det(tI − A) det(tI − B) = χ_α(t)χ_β(t).
If α has minimal polynomial m_α, β has minimal polynomial m_β and P is a polynomial, then
  P(α ⊕ β) = 0 ⇔ P(α ⊕ β)(u + v) = 0 ∀u ∈ U, v ∈ V
    ⇔ P(α)u + P(β)v = 0 ∀u ∈ U, v ∈ V
    ⇔ P(α)u = P(β)v = 0 ∀u ∈ U, v ∈ V
    ⇔ m_α and m_β divide P.
Thus the minimal polynomial of α ⊕ β is the lowest common multiple of m_α and m_β (i.e. the monic polynomial of lowest degree divisible by both).
317
Exercise 12.3.15
(i) If (t − λ) is a factor of the characteristic polynomial, then λ is an eigenvalue and (t − λ) must be a factor of the minimal polynomial. Thus we cannot choose S(t) = t + 1, Q(t) = t.
(ii) If m = 1, set A = 0. If m = n, let A = (a_{ij}) be the n × n matrix with a_{i,i+1} = 1 and a_{ij} = 0 otherwise.
If n > m > 1, let B = 0_{n−m}, the (n − m) × (n − m) zero matrix, and let C be the m × m matrix with c_{i,i+1} = 1 and c_{ij} = 0 otherwise. Take
  A = ( B 0 ; 0 C ).
Then A + λI has characteristic polynomial (t − λ)^n and minimal polynomial (t − λ)^m.
(iii) We can write P(t) = Π_{i=1}^{r} (t − λ_i)^{m(i)} and Q(t) = Π_{i=1}^{r} (t − λ_i)^{p(i)} with the λ_j distinct and m(j) ≥ p(j) ≥ 1. By (ii) we can find an m(j) × m(j) matrix A_j with characteristic polynomial (t − λ_j)^{m(j)} and minimal polynomial (t − λ_j)^{p(j)}. Take A to be the block diagonal matrix
  A = diag(A_1, A_2, …, A_r).
A has the required properties, either by direct computation or using Exercise 12.3.14.
318
Exercise 12.4.3
Since αe_j = e_{j+1} for 1 ≤ j ≤ n − 1 and αe_n = 0, we have A = (a_{ij}) with a_{i+1,i} = 1 and a_{ij} = 0 otherwise.
319
Exercise 12.4.4
If α^n ≠ 0, then e, αe, …, α^ne are linearly independent, and a space of dimension n would then contain n + 1 linearly independent vectors, which is impossible.
320
Exercise 12.4.7
We have α^k e ≠ 0 for only finitely many values of k.
321
Exercise 12.4.11
Observe that (d_{jj}) is a 1 × 1 Jordan matrix.
322
Exercise 12.4.12
(i) A is the matrix of a linear map α : U → U with
  α = α_1 ⊕ α_2 ⊕ … ⊕ α_r
and U = U_1 ⊕ U_2 ⊕ … ⊕ U_r, where α_j is a linear map on U_j with α_j = λ_jι + β_j, β_j a nilpotent linear map on U_j with β_j^{k_j} = 0, β_j^{k_j−1} ≠ 0 and dim U_j = k_j. Thus
  (λι − α)^{−k}0 = ⊕_{j=1}^{r} (λι − α_j)^{−k}0
and
  dim (λι − α)^{−k}0 = Σ_{j=1}^{r} dim (λι − α_j)^{−k}0 = Σ_{λ=λ_j} min{k, k_j}.
Thus
  dim{x ∈ C^n : (λI − A)^k x = 0} = Σ_{λ=λ_j} min{k, k_j}.
(ii) By (i),
  Σ_{λ=λ_j} min{k, k_j} = Σ_{λ=λ̃_i} min{k, k̃_i}
for all λ and k, so r = r̃ and, possibly after renumbering, λ_j = λ̃_j and k_j = k̃_j for 1 ≤ j ≤ r.
323
Exercise 12.4.13
Observe, by looking at the effect on z^n, that T^{n+1} = 0 but T^n ≠ 0. Thus, since P_n has dimension n + 1, T has Jordan normal form J_{n+1}(0), the (n + 1) × (n + 1) matrix with ones on the superdiagonal and zeros elsewhere.
We observe that e_r(z) = z^r is an eigenvector with eigenvalue r [n ≥ r ≥ 0]. Thus the Jordan normal form is the diagonal matrix diag(0, 1, 2, …, n − 1, n).
324
Exercise 12.4.14
(i) If λ is a root of χ_α, then we can find an associated eigenvector e. Since
  e ∈ {u : (α − λι)(u) = 0},
we have m_g(λ) ≥ 1.
Now take a basis e_1, e_2, …, e_{m_g(λ)} for
  {u : (α − λι)(u) = 0}
and extend it to a basis e_1, e_2, …, e_n for U. With respect to this basis, α has matrix
  A = ( λI_{m_g(λ)} R ; 0 S )
with I_{m_g(λ)} the m_g(λ) × m_g(λ) identity matrix, so
  χ_α(t) = (t − λ)^{m_g(λ)} χ_S(t)
and m_a(λ) ≥ m_g(λ).
(ii) Suppose s ≥ r ≥ 1. Let B = (b_{ij}) be the (s − r + 1) × (s − r + 1) matrix with b_{i,i+1} = 1 and b_{ij} = 0 otherwise, and let A(r, s) be the s × s matrix
  A(r, s) = ( B 0 ; 0 0 ).
If α_{r,s} is the linear map associated with A(r, s), then, for this linear map,
  m_g(λ) = r if λ = 0, m_g(λ) = 0 otherwise,  m_a(λ) = s if λ = 0, m_a(λ) = 0 otherwise.
If we now set β_{r,s} = µι − α_{r,s}, then for this linear map
  m_g(λ) = r if λ = µ, m_g(λ) = 0 otherwise,  m_a(λ) = s if λ = µ, m_a(λ) = 0 otherwise.
Now take a vector space
  U = U_1 ⊕ U_2 ⊕ … ⊕ U_r
with dim U_k = n_a(λ_k). By the previous paragraph, we can find γ_k : U_k → U_k linear such that, for γ_k,
  m_g(λ) = n_g(λ_k) if λ = λ_k, m_g(λ) = 0 otherwise,  m_a(λ) = n_a(λ_k) if λ = λ_k, m_a(λ) = 0 otherwise.
If we set
  α = γ_1 ⊕ γ_2 ⊕ … ⊕ γ_r,
then the required properties can be read off.
325
(iii) Suppose α has the associated Jordan normal form
  A = diag(J_{k_1}(λ_1), J_{k_2}(λ_2), …, J_{k_r}(λ_r))
(using the notation of Theorem 12.4.10).
We observe that the characteristic polynomial satisfies
  χ_α(t) = χ_A(t) = Π_{j=1}^{r} χ_{J_{k_j}(λ_j)}(t) = Π_{j=1}^{r} (t − λ_j)^{k_j},
so
  m_a(λ) = Σ_{λ_j=λ} k_j.
Observing that the dimension of the space of solutions of
  J_{k_j}(λ_j)x = λx
is 1 if λ = λ_j and zero otherwise, we see that
  m_g(λ) = Σ_{λ_j=λ} 1 = card{j : λ_j = λ}.
326
Exercise 12.5.1
Let U be a finite dimensional vector space over C. If α ∈ L(U, U), then the Jordan normal form theorem tells us that we can write U = ⊕_{j=1}^{r} U_j and find α_j ∈ L(U_j, U_j) such that α = ⊕_{j=1}^{r} α_j and U_j has a basis with respect to which α_j has the n(j) × n(j) matrix A_j = λ_jI + N_j, where N_j has ones on the superdiagonal and zeros elsewhere.
Now
  det(tι − α) = Π_{j=1}^{r} det(tι − α_j) = Π_{j=1}^{r} (t − λ_j)^{n(j)}
and
  Π_{j=1}^{r} (α − λ_jι)^{n(j)} = ⊕_{i=1}^{r} Π_{j=1}^{r} (α_i − λ_jι)^{n(j)} = 0
(since (α_j − λ_jι)^{n(j)} = 0), so χ_α(α) = 0.
Lemma 12.2.1 tells us that there exists a non-zero polynomial P with P(α) = 0. This is all we need to show that there is a minimal polynomial. The proof of the Jordan form theorem only uses the minimal polynomial.
327
Exercise 12.5.2
Consider the s × s matrix

B = ( λ 1 0 0 … 0 0
      0 λ 1 0 … 0 0
      0 0 λ 1 … 0 0
      ⋮             ⋮
      0 0 0 0 … λ 1
      0 0 0 0 … 0 λ ).

If λ ≠ 0, B is invertible, so rank B^j = s for all j ≥ 0. If λ = 0, then direct calculation shows that rank B^j = max{0, s − j} for all j ≥ 0.
If α = ⊕_{k=1}^{r} α_k, then rank α^j = Σ_{k=1}^{r} rank α_k^j.
Thus if the Jordan normal form of an n × n matrix A contains m nilpotent blocks

K_p = ( 0 1 0 0 … 0 0
        0 0 1 0 … 0 0
        0 0 0 1 … 0 0
        ⋮             ⋮
        0 0 0 0 … 0 1
        0 0 0 0 … 0 0 )

with K_p an n(p) × n(p) matrix, then

rank A^j = ( n − Σ_{p=1}^{m} n_p ) + Σ_{p=1}^{m} max{0, n_p − j}.
(ii) Observe that if we set

r_j = ( n − Σ_{p=1}^{m} n_p ) + Σ_{p=1}^{m} max{0, n_p − j},

then r_j = r_M for all j ≥ M, where M = max_p n(p) ≤ n, together with the condition

r_0 − r_1 ≥ r_1 − r_2 ≥ r_2 − r_3 ≥ … ≥ r_{M−1} − r_M.

Conversely, given a sequence s_j with these properties, let q(j) = (s_{j−1} − s_j) − (s_j − s_{j+1}). Let A be the n × n matrix with q(j) nilpotent Jordan blocks of size j × j and s(M) 1 × 1 Jordan blocks (1). If α has matrix A with respect to some basis, then α satisfies the required conditions.
328
Exercise 12.5.3
(i) We have A_1 = 0 (the 4 × 4 zero matrix) and

A_2 = ( 0 1 0 0        A_3 = ( 0 1 0 0
        0 0 0 0                0 0 1 0
        0 0 0 0                0 0 0 0
        0 0 0 0 ),             0 0 0 0 ),

A_4 = ( 0 1 0 0        A_5 = ( 0 1 0 0
        0 0 1 0                0 0 0 0
        0 0 0 1                0 0 0 1
        0 0 0 0 ),             0 0 0 0 ).

(ii) For A_1: x_r′ = 0 for all r, so x(t) = c, a constant vector.
For A_2: x_1′ = x_2, x_r′ = 0 otherwise, so x(t) = (c_2 t + c_1, c_2, c_3, c_4)^T with the c_j constants.
For A_3: x_1′ = x_2, x_2′ = x_3, x_r′ = 0 otherwise, so x(t) = (c_3 t²/2 + c_2 t + c_1, c_3 t + c_2, c_3, c_4)^T with the c_j constants.
Similarly for A_4, x(t) = (c_4 t³/6 + c_3 t²/2 + c_2 t + c_1, c_4 t²/2 + c_3 t + c_2, c_4 t + c_3, c_4)^T with the c_j constants.
Similarly for A_5, x(t) = (c_2 t + c_1, c_2, c_4 t + c_3, c_4)^T.
(iii) If we write y(t) = e^{λt} x(t), then

y′(t) = (λI + A_j) y(t) ⇔ x′(t) = A_j x(t).

Thus the general solution of

y′(t) = (λI + A_j) y(t)

is y(t) = e^{λt} x(t), where x(t) is the general solution of

x′(t) = A_j x(t).
(iv) Suppose

B = diag( B_1, B_2, …, B_r )

with B_j = λ_j I + C_j, where C_j is an s(j) × s(j) nilpotent Jordan matrix. Then the general solution of

x′(t) = B x(t)

is x = (y_1, y_2, …, y_r)^T with

y_j(t) = e^{λ_j t} ( P_j(t), P_j′(t), …, P_j^{(s(j)−1)}(t) )^T

with P_j an arbitrary polynomial of degree s(j) − 1.
(v) Find M invertible and B a matrix in Jordan form such that

M A M^{−1} = B.

Then, writing y = Mx, we have

⋆   y′ = B y.

The general solution of x′(t) = A x(t) is x = M^{−1} y, where y is the general solution of ⋆.
330
Exercise 12.5.4
Observe that, writing x_j = x^{(j)},

d/dt (x_0, x_1, …, x_{n−1})^T = C (x_0, x_1, …, x_{n−1})^T,

where

C = ( 0        1        0        …  0    0
      0        0        1        …  0    0
      ⋮                                  ⋮
      0        0        0        …  1    0
      0        0        0        …  0    1
      −a_{n−1} −a_{n−2} −a_{n−3} …  −a_1 −a_0 ),

if and only if

x_j′ = x_{j+1} [0 ≤ j ≤ n − 2],   x_{n−1}′ = −Σ_{j=0}^{n−1} a_j x_{n−1−j},

that is to say, if and only if

x^{(n)} + Σ_{j=0}^{n−1} a_j x^{(n−1−j)} = 0.
331
Exercise 12.5.5
Suppose U = ⊕_{j=1}^{r} U_j, U_j has dimension k_j, β_j : U_j → U_j has β_j^{k_j − 1} ≠ 0 and β_j^{k_j} = 0, α_j = λ_j ι + β_j and α = ⊕_{j=1}^{r} α_j.

Write s(λ) = max_{λ_j = λ} k_j. If P is a polynomial,

P(α) = 0 ⇔ P(α_j) = 0 for all j ⇔ ∏_{s(λ) ≠ 0} (t − λ)^{s(λ)} divides P(t).

Thus the characteristic polynomial equals the minimal polynomial if and only if

∏_{j=1}^{r} (t − λ_j)^{k_j} = ∏_{s(λ) ≠ 0} (t − λ)^{s(λ)},

that is to say, if and only if all the λ_j are distinct.
332
Exercise 12.5.6
Write e_j for the vector with 1 in the jth place and zero everywhere else. By induction,

A^j e_n = e_{n−j} + f_{n−j},

where f_{n−j} ∈ span{e_r : n ≥ r ≥ n − j + 1}.

Thus the vectors A^j e_n with n − 1 ≥ j ≥ 0 are linearly independent and

( Σ_{j=0}^{n−1} b_j A^j ) e_n = 0 ⇔ b_j = 0 for all 0 ≤ j ≤ n − 1.

Thus the minimal polynomial has degree at least n. Since the minimal polynomial divides the characteristic polynomial and the characteristic polynomial has degree n, it follows that the two polynomials are equal. By Exercise 12.5.5, it follows that there is a Jordan form associated with A in which all the blocks J_k(λ_k) have distinct λ_k.
333
Exercise 12.5.7
(i) If A is an n × n matrix consisting of a single Jordan block with associated eigenvalue λ and

x′ = A x,

then, writing y(t) = e^{−λt} x(t), we have

y′(t) = −λ e^{−λt} x(t) + e^{−λt} x′(t) = −λ e^{−λt} x(t) + e^{−λt} A x(t) = B y(t),

where B is a single Jordan block with associated eigenvalue 0. Thus

y^{(n)} = 0,

and y(t) = Σ_{j=0}^{n−1} c_j t^j for some constant vectors c_j. Substituting back into y′(t) = B y(t) determines which such y actually occur. Thus every solution has the form

x(t) = e^{λt} Σ_{j=0}^{n−1} c_j t^j.
(ii) Suppose that A is associated with a Jordan form B whose blocks are the n_k × n_k Jordan blocks J_{n_k}(λ_k) with all the λ_k distinct [1 ≤ k ≤ r]. Writing A = M B M^{−1} with M non-singular, we see that y is a solution of y′ = By if and only if x = My is a solution of x′ = Ax. Thus the only possible solutions of ⋆ are

x(t) = Σ_{k=1}^{r} e^{λ_k t} P_k(t),

with the P_k polynomials of degree n_k − 1.

To show that every x of this form is a solution, observe that (by linearity) it suffices to show that each

e^{λ_k t} P_k(t)

is a solution, with P_k a general polynomial of degree n_k − 1. By considering e^{−λ_k t} x(t), we see that it suffices to show that, if the characteristic equation of A has 0 as a p times repeated root, then every polynomial of degree p − 1 satisfies ⋆. But if 0 is a p times repeated root, then a_0 = a_1 = … = a_{p−1} = 0, so the previous sentence is automatic.
334
Exercise 12.5.8
(i) Writing C(r, k) for the binomial coefficient, we have

C(r, k−1) + C(r, k) = k · r!/(k!(r+1−k)!) + (r+1−k) · r!/(k!(r+1−k)!) = (r+1) · r!/(k!(r+1−k)!) = C(r+1, k).
(ii) If k = 0, our system becomes

u_r(0) = u_{r−1}(0),

with general solution u_r(0) = b_0.

If k = 1, we have u_r(0) = b_0 as before and

u_r(1) − u_{r−1}(1) = b_0.

If we set

v_r(1) = u_r(1) − b_0 r,

we obtain

v_r(1) − v_{r−1}(1) = 0,

so, as before, v_r(1) = b_1 for some freely chosen b_1. Thus u_r(1) = b_0 r + b_1.

Suppose that the solution for a given k ≥ 0 is

u_r(k) = Σ_{j=0}^{k} b_{k−j} C(r, j).
Then

u_r(k+1) − u_{r−1}(k+1) = Σ_{j=0}^{k} b_{k−j} C(r−1, j),

and setting

v_r(k+1) = u_r(k+1) − Σ_{j=0}^{k} b_{k−j} C(r, j+1),

we obtain (using the identity of (i))

v_r(k+1) − v_{r−1}(k+1) = 0,

so, as before, v_r(k+1) = b_{k+1} for some freely chosen b_{k+1}. Thus

u_r(k+1) = Σ_{j=0}^{k+1} b_{k+1−j} C(r, j).

Thus, by induction,

u_r(k) = Σ_{j=0}^{k} b_{k−j} C(r, j).
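The closed form can be sanity-checked numerically (a sketch; the b_j below are an arbitrary choice of the free constants): with u_r(k) = Σ_j b_{k−j} C(r, j), the difference u_r(k+1) − u_{r−1}(k+1) equals u_{r−1}(k).

```python
from math import comb

# Arbitrary free constants b_0, b_1, ... (any values work).
b = [3, 1, 4, 1, 5]

def u(r, k):
    # u_r(k) = sum_j b_{k-j} * C(r, j)
    return sum(b[k - j] * comb(r, j) for j in range(k + 1))

# Check the difference relation u_r(k+1) - u_{r-1}(k+1) = u_{r-1}(k).
ok = all(u(r, k + 1) - u(r - 1, k + 1) == u(r - 1, k)
         for r in range(1, 8) for k in range(4))
print(ok)  # True
```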
335
(iii) If we set u_r(k) = λ^{−r} v_r(k), then the u_r(k) satisfy the equations in (ii). Thus

v_r(k) = λ^r Σ_{j=0}^{k} b_{k−j} C(r, j).
(iv) The equation

u_r + Σ_{j=0}^{n−1} a_j u_{r−n+j} = 0

may be rewritten as

u_{t+1} + A u_t = 0,

where u_t is the column vector (u_t, u_{t+1}, …, u_{t+n−1})^T and

A = ( 0    1    0    …  0        0
      0    0    1    …  0        0
      ⋮                          ⋮
      0    0    0    …  1        0
      0    0    0    …  0        1
      −a_0 −a_1 −a_2 …  −a_{n−2} −a_{n−1} ).

By Exercise 12.5.6 we know that we can associate A with a Jordan form B in which the q blocks J_{s_k}(λ_k) are of size s_k × s_k and have distinct λ_k. Let B = M^{−1}AM with M invertible.

By (iii) the general solution of

v_{t+1} + B v_t = 0

is

v_t = Σ_{k=1}^{q} λ_k^t ( Σ_{j=0}^{s_k} b_{k, s_k − j} C(t, j) ) e_k,

so

u_t = Σ_{k=1}^{q} λ_k^t Σ_{j=0}^{s_k} c_{k, s_k − j} C(t, j).
We have shown that ut must be of the form just given. Direct com-putation shows that any ut of the form just given is a solution.
336
Exercise 12.5.9
All roots of the characteristic equation the same:

( λ 0 0     ( λ 1 0     ( λ 1 0
  0 λ 0       0 λ 0       0 λ 1
  0 0 λ ),    0 0 λ ),    0 0 λ ).

Characteristic equation has two distinct roots:

( λ 0 0     ( λ 1 0
  0 λ 0       0 λ 0
  0 0 µ ),    0 0 µ )

with λ ≠ µ. Characteristic equation has three distinct roots:

( λ 0 0
  0 µ 0
  0 0 ν )

with λ, µ, ν distinct.
337
Exercise 12.5.10
We have A⁴ − A² = 0, so the minimal polynomial m must divide t⁴ − t² = t²(t − 1)(t + 1).

Since (A − I)A = A² − A ≠ 0, the minimal polynomial cannot be 1, t, t − 1 or t(t − 1).

(i) m(t) = t². The characteristic polynomial must be t⁵. The Jordan blocks must be of size 2 × 2 or less, with at least one block

J = ( 0 1
      0 0 ),

so the possibilities are diag(J, 0, 0, 0) and diag(J, J, 0): two possibilities.

(ii) m(t) = t + 1. The Jordan normal form is −I, giving one possibility.

(iii) m(t) = t² − 1. The Jordan normal form is diagonal with r entries 1 and 5 − r entries −1 [1 ≤ r ≤ 4], and the characteristic polynomial is (t − 1)^r (t + 1)^{5−r}. There are four possibilities.

(iv) m(t) = t(t + 1). The Jordan normal form is diagonal with r entries −1 and 5 − r entries 0 [1 ≤ r ≤ 4], and the characteristic polynomial is (t + 1)^r t^{5−r}. There are four possibilities.

(v) m(t) = t²(t − e) with e = ±1. The Jordan normal form either has one block J together with r diagonal entries e and 3 − r diagonal entries 0 [1 ≤ r ≤ 3], giving characteristic polynomial t^{5−r}(t − e)^r, or two blocks J and a diagonal entry e, giving characteristic polynomial t⁴(t − e). There are eight possibilities.

(vi) m(t) = t(t − 1)(t + 1). The Jordan normal form is diagonal with r entries 1, s entries −1 and u entries 0 [r, s, u ≥ 1, r + s + u = 5], and the characteristic polynomial is (t − 1)^r (t + 1)^s t^u. There are six possibilities.

(vii) m(t) = t²(t − 1)(t + 1). The Jordan normal form has one block J, r diagonal entries 1, s diagonal entries −1 [r, s ≥ 1, r + s ≤ 3] and 3 − r − s diagonal entries 0. The characteristic polynomial is (t − 1)^r (t + 1)^s t^{5−r−s}. There are three possibilities.

There are 28 possible Jordan forms in all.
338
Exercise 12.5.11
det(tI − M) = (t − 1) det
( t−1  0    0
  1    t−2  0
  0    0    t−2 )
= (t − 1)²(t − 2)².

Looking at the eigenvalue 1, we have

( 1 0  1 0   ( x     ( x
  0 1  0 0     y       y
  0 −1 2 0     z   =   z
  0 0  0 2 )   w )     w ),

which implies

x + z = x,   y = y,   −y + 2z = z,   2w = w,

so

M(x, y, z, w)^T = (x, y, z, w)^T ⇔ (x, y, z, w) = (t, 0, 0, 0)

for general t.
Thus e_2 = (1, 0, 0, 0)^T is an eigenvector with eigenvalue 1. If (M − I)x = e_2, then

z = 1,   0 = 0,   −y + z = 0,   w = 0.

Thus, writing e_1 = (0, 1, 1, 0)^T, we have (M − I)e_1 = e_2 ≠ 0 and (M − I)²e_1 = 0.
Looking at the eigenvalue 2, we have

( 1 0  1 0   ( x       ( x
  0 1  0 0     y         y
  0 −1 2 0     z   = 2   z
  0 0  0 2 )   w )       w ),

which implies

x + z = 2x,   y = 2y,   −y + 2z = 2z,   2w = 2w,

so

M(x, y, z, w)^T = 2(x, y, z, w)^T ⇔ (x, y, z, w) = (t, 0, t, s)

for general t and s. Thus the eigenspace has dimension 2 and basis (1, 0, 1, 0)^T, (0, 0, 0, 1)^T.
The characteristic polynomial is (t − 1)²(t − 2)² and the minimal polynomial is (t − 1)²(t − 2). A Jordan form is given by

J = ( 1 1 0 0
      0 1 0 0
      0 0 2 0
      0 0 0 2 ).

An appropriate basis is (1, 0, 0, 0)^T, (0, 1, 1, 0)^T, (1, 0, 1, 0)^T, (0, 0, 0, 1)^T, and we can take P to be the matrix with these columns, that is to say,

P = ( 1 0 1 0
      0 1 0 0
      0 1 1 0
      0 0 0 1 ),

so that P^{−1} M P = J.
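The computation is easy to verify mechanically. A quick integer-arithmetic check (a sketch; A below is the matrix of the exercise, P has as columns the λ = 1 eigenvector (1,0,0,0)^T, the generalised vector (0,1,1,0)^T, and the λ = 2 eigenvectors (1,0,1,0)^T, (0,0,0,1)^T) that A·P = P·J, which is equivalent to P^{−1}AP = J:

```python
# Verify the Jordan form by checking A.P == P.J with exact integer arithmetic.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 0, 1, 0], [0, 1, 0, 0], [0, -1, 2, 0], [0, 0, 0, 2]]
P = [[1, 0, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 0, 0, 1]]  # basis as columns
J = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 2]]

print(matmul(A, P) == matmul(P, J))  # True
```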
340
Exercise 13.2.2
(i) The tables are

  + | 0 1        × | 0 1
  0 | 0 1        0 | 0 0
  1 | 1 0        1 | 0 1
By inspection of the addition table x+ x = 0 for all x.
(ii) x = −x ⇒ (1 + 1)x = 0 ⇒ x = 0.
341
Exercise 13.2.3
(i) If c ≠ 0, then

0 = c^{−1}(cd) = (c^{−1}c)d = d.

(ii) If a² = b², then (a + b)(a − b) = a² − b² = 0, so a + b = 0 and/or a − b = 0, i.e. a = −b and/or a = b.

(iii) By (ii) (and the fact that 1 + 1 ≠ 0), the equation x² = c has no solutions or two solutions unless c = 0, in which case x = 0 is the only solution. Thus every element y is a square root (just look at y²), but every square is 0 or has exactly two distinct square roots. Since k is odd, there are (k − 1)/2 non-zero squares and (k + 1)/2 squares in all.

(iv) If we work in Z_2, then 0² = 0 and 1² = 1. Every element is a square. (Note that −1 = 1.)
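The counting in (iii) is easy to confirm for a small odd field; here is a sketch for Z_7 (the choice p = 7 is just an illustrative example):

```python
# Count the squares in Z_p for an odd prime p; part (iii) predicts
# (p - 1)/2 non-zero squares, hence (p + 1)/2 squares in all.
p = 7
squares = {x * x % p for x in range(p)}
print(sorted(squares))               # the squares in Z_7
print(len(squares) == (p + 1) // 2)  # True
```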
342
Exercise 13.2.4
Suppose x, y, z ∈ R and λ, µ ∈ Q. Here + and × denote the proposed vector space operations, which coincide with the usual addition and multiplication in R.

(i) (x + y) + z = x + (y + z).
(ii) x + y = y + x.
(iii) x + 0 = x.
(iv) λ × (x + y) = λ(x + y) = λx + λy = λ × x + λ × y.
(v) (λ + µ) × x = (λ + µ)x = λx + µx = λ × x + µ × x.
(vi) (λµ) × x = (λµ)x = λ(µx) = λ × (µ × x).
(vii) 1 × x = 1x = x and 0 × x = 0x = 0.
343
Exercise 13.2.6
(i) If
MAM−1 = D with D =
(
λ 00 µ
)
then

t² − a = det(tI − A) = det(tI − D) = (t − λ)(t − µ),

so t² = a has a solution.

Suppose c² = a ≠ 0. The eigenvalues are ±c, which are distinct, so by our standard arguments A is diagonalisable.
(ii) Our standard argument shows that A is not diagonalisable.
(iii) We know that 2 has no square root in Q so by (i), no M exists.
(iv) Since some a do not have square roots the corresponding a arenot diagonalisable.
(v) If A = M^{−1}DM with D diagonal, then I = A² = M^{−1}D²M, so D² = I. Thus (since we work in Z_2) D = I, so A = I, which is absurd.
If we set e_1 = (1, 1)^T and e_2 = (1, 0)^T, we have a basis with A e_1 = e_1 and A e_2 = e_1 + e_2. Thus, setting

M = ( 1 1
      1 0 ),

we have

M A M^{−1} = ( 1 0
               1 1 ).
344
Exercise 13.2.7
The tables are

    +   | (0,0) (1,0) (0,1) (1,1)
  (0,0) | (0,0) (1,0) (0,1) (1,1)
  (1,0) | (1,0) (0,0) (1,1) (0,1)
  (0,1) | (0,1) (1,1) (0,0) (1,0)
  (1,1) | (1,1) (0,1) (1,0) (0,0)

    ×   | (0,0) (1,0) (0,1) (1,1)
  (0,0) | (0,0) (0,0) (0,0) (0,0)
  (1,0) | (0,0) (1,0) (0,1) (1,1)
  (0,1) | (0,0) (0,1) (1,1) (1,0)
  (1,1) | (0,0) (1,1) (1,0) (0,1)
We check the axioms for a field.
(i) (a, b) + (c, d) = (a+ c, b+ d) = (c, d) + (a, b).
(ii) We have
((a, b) + (c, d)) + (e, f) = (a+ c, b+ d) + (e, f)
=(
(a+ c) + e, (b+ d) + f)
=(
a+ (c+ e), b+ (d+ f))
.
(iii) (a, b) + (0, 0) = (a, b)
(iv) (a, b) + (a, b) = (0, 0)
(v) We have
(a, b)× (c, d) = (ac + bd, ad+ bc+ bd)
= (ca + db, da+ cb+ db) = (c, d)× (a, b).
(vi) We have(
(a, b)× (c, d))
× (e, f) = (ac+ bd, ad+ bc + bd)× (e, f)
=(
(ac+ bd)e + (ad+ bc+ bd)f,
(ac + bd)f + (ad+ bc+ bd)e + (ad+ bc+ bd)f)
=(
a(ce+ df) + b(de+ cf + df),
a(de+ cf + df) + b(ce+ df) + b(de+ cf + df))
= (a, b)×(
(c, d)× (e, f))
(vii) (a, b)× (1, 0) = (a, b).
(viii) By inspection of the multiplication table we see that if (a, b) ≠ (0, 0), we can find (c, d) with (a, b) × (c, d) = (1, 0).
345
(ix) We have

(a, b) × ( (c, d) + (e, f) ) = (a, b) × (c + e, d + f)
 = ( a(c + e) + b(d + f), a(d + f) + b(c + e) + b(d + f) )
 = (ac + bd, ad + bc + bd) + (ae + bf, af + be + bf)
 = (a, b) × (c, d) + (a, b) × (e, f).
(x) By inspection of the multiplication table, all the elements aresquares.
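The multiplication rule can be checked mechanically; a short sketch (elements are pairs of bits, as above) builds products with (a,b) × (c,d) = (ac + bd, ad + bc + bd) mod 2 and confirms that every non-zero element has an inverse, i.e. that we do get the field GF(4):

```python
# The four field elements are pairs (a, b) of bits; (1, 0) is the
# multiplicative identity.
elems = [(0, 0), (1, 0), (0, 1), (1, 1)]

def mul(x, y):
    a, b = x
    c, d = y
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

one = (1, 0)
# every non-zero element has a multiplicative inverse
invertible = all(any(mul(x, y) == one for y in elems)
                 for x in elems if x != (0, 0))
print(invertible)  # True
```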
346
Exercise 13.2.8
(i) Suppose 2 6= 0.
If A = B + C with BT = B and CT = −C then
2B = B + C + B − C = (B + C) + (B + C)T = A+ AT
so B = 2−1(A+ AT ) and C = A− B = 2−1(A− AT ).
Conversely if B = 2−1(A+AT ) and C = A−B = 2−1(A−AT ) then
BT = 2−1(AT + ATT ) = B
CT = 2−1(AT − ATT ) = −C
B + C = 2−1(A+ AT ) + 2−1(A− AT ) = A
(ii) Suppose 2 = 0. Then a + a = (1 + 1)a = 2a = 0a = 0, so a = −a, and (a_ij) = (−a_ij), so A = −A. Thus

A = A^T ⇔ A = −A^T.

We work over Z_2. If A = B + C with B^T = B and C^T = −C, then C^T = C and so A^T = A. Thus

( 0 1
  0 0 )

cannot be written as the sum of a symmetric and an antisymmetric matrix.

On the other hand, the zero matrix 0 and the identity matrix I are symmetric and so also antisymmetric, and

0 = 0 + 0 = I + I.
347
Exercise 13.2.9
(i) Observe that 0² + 0 = 0 and 1² + 1 = 1 + 1 = 0, so x² + x = 0 for all x ∈ Z_2.
(ii) If G is a finite field there are only a finite number of distinctfunctions f : G → G but there are infinitely many polynomials. Thusthere must exist distinct polynomials P and Q which give rise to thesame function. Consider P −Q.
348
Exercise 13.3.1
If U is a vector space over G, then W ⊆ U is a subspace if and only if:

(i) 0 ∈ W.
(ii) x, y ∈ W ⇒ x + y ∈ W.
(iii) λ ∈ G, x ∈ W ⇒ λx ∈ W.

If G = Z_2 then, for λ ∈ G and x ∈ W, either λ = 0 and λx = 0, so, if condition (i) is true, λx ∈ W, or λ = 1 and λx = x ∈ W.
Thus we can drop condition (iii) in this case.
349
Exercise 13.3.2
If U is infinite then all three statements are false.
If U is finite then (since U spans itself) it has a finite spanning setand so a basis e1, e2, . . . , em. The elements of U are the vectors
m∑
j=1
xjej
with xj ∈ Z2, each choice of the xj giving a distinct vector so U has2m elements.
If U is isomorphic to Zq2 they must have the same number of elements
so q = m.
The map θ : Z_2^m → U given by

θ(x_1, x_2, …, x_m) = Σ_{j=1}^{m} x_j e_j

is an isomorphism. Thus the three given statements are equivalent.
350
Exercise 13.3.3
We have

α(λe + µf) = Σ_{j=1}^{n} p_j (λe_j + µf_j) = λ Σ_{j=1}^{n} p_j e_j + µ Σ_{j=1}^{n} p_j f_j = λα(e) + µα(f),

so α is linear.

Let e(j) be the vector with 1 in the jth place and 0 elsewhere. If α is linear then, setting p_j = αe(j), we have

α(e) = α( Σ_{j=1}^{n} e_j e(j) ) = Σ_{j=1}^{n} e_j αe(j) = Σ_{j=1}^{n} p_j e_j.
351
Exercise 13.3.4
Codeword (0, 0, 0, 1, 1, 1, 1).
Error in fourth place gives (0, 0, 0, 0, 1, 1, 1).
z1 = 0, z2 = 0, z3 = 1 tells us error in 4th place. Recover (0, 0, 0, 1, 1, 1, 1).
352
Exercise 13.3.5
The tape will be accepted if each line contains an even number oferrors. Since the probability of errors is small (and the number of bitson each line is small) the probability of one error is much greater thanthe probability of an odd number of errors greater than 1. Thus
Pr(odd number of errors in one line) ≈ Pr(exactly one error in one line)
= 8× 10−4 × (1− 10−4)7 ≈ 8× 10−4.
Since the probability λ of an odd number of errors in one line is very small but there are a large number N of lines, we may use the Poisson approximation to get

Pr(acceptance) = 1 − Pr(odd number of errors in some line) ≈ e^{−λN} ≈ e^{−8} ≈ 0.00034,

and conclude that the probability of acceptance is less than 0.04%.
If we use the Hamming scheme then, instead of having 7 freely chosen bits (plus a check bit) on each line, we only have 4 freely chosen bits (plus three check bits plus an unused bit) per line, so we need approximately

(1/4) × 7 × 10⁴ = 1.75 × 10⁴

lines.
If a line contains at most one error, it will be correctly decoded. Aline will fail to be correctly decoded if it contains exactly two errorsand it may fail to be correctly decoded if it contains more than twoerrors. Since the probability of errors is small (and the number of bitson each line is small), the probability of two errors is much greater thanthe probability of more than two. Thus
Pr(decoding failure for one line) ≈ Pr(exactly two errors in one line) = C(7, 2) × (10^−4)² × (1 − 10^−4)^5 ≈ 21 × 10^−8.
Since the probability of a decoding error in one line is very small but there are a large number of lines, we may use the Poisson approximation to get

Pr(decoding error for some line) = 1 − Pr(no decoding error in any line) ≈ 1 − e^{−21×10^{−8}×17500} ≈ 1 − e^{−0.003675} ≈ 0.0037,

and conclude that the probability of a correct decode is greater than 99.6%.
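These estimates are easy to confirm numerically (a sketch; the line counts 10⁴ for the parity scheme and 17500 for the Hamming scheme are those used above):

```python
import math

q = 1e-4                                      # per-bit error probability
# Parity scheme: a line of 8 bits is accepted iff it has an even number
# of errors; one error dominates the odd cases.
p_odd = 8 * q * (1 - q)**7                    # ~ 8e-4 per line
p_accept = math.exp(-p_odd * 1e4)             # Poisson approx., 10^4 lines
# Hamming scheme: a line fails (at leading order) iff it has >= 2 errors.
p_fail = math.comb(7, 2) * q**2 * (1 - q)**5  # ~ 2.1e-7 per line
p_some_fail = 1 - math.exp(-p_fail * 17500)   # ~ 0.0037
print(round(p_accept, 5), round(p_some_fail, 4))
```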
353
Exercise 13.3.6
(i) Observe that

( 1 0 1 0 1 0 1     ( c_1      ( 0
  0 1 1 0 0 1 1       ⋮     =    0
  0 0 0 1 1 1 1 )     c_7 )      0 )

can be rewritten as

c_1 + c_3 + c_5 + c_7 ≡ 0
c_2 + c_3 + c_6 + c_7 ≡ 0
c_4 + c_5 + c_6 + c_7 ≡ 0.

(ii) Observe that A e_j^T is the jth column of A, and that the jth column of A is the binary digits of j written in reverse order.

By linearity, A(c + e_j)^T = A c^T + A e_j^T = A e_j^T = (a_1, a_2, a_3)^T, so the statement follows.
354
Exercise 13.3.7
(i) A e_i^T ≠ A e_j^T for i ≠ j, since otherwise an error in the jth place would have the same effect as an error in the ith place. Since

x + y = 0 ⇔ x = y,

we have

A(e_i^T + e_j^T) ≠ 0^T.

Thus, by linearity,

A(c + e_i + e_j)^T = A c^T + A e_i^T + A e_j^T = A(e_i^T + e_j^T) ≠ 0^T,

and c + e_i + e_j is not a codeword.

(ii) A e_j^T ∈ Z_2^3 \ {0}. Since the A e_j^T are distinct for distinct j, and {j : 1 ≤ j ≤ 7} has the same number of elements as Z_2^3 \ {0}, we have a bijection, and so, since

A(e_i^T + e_j^T) ≠ 0^T,

there must be a k with

A(e_i^T + e_j^T) = A e_k^T.

If k = j, we would have A e_i^T = 0^T, which is impossible. Thus k ≠ j and similarly k ≠ i.

If we have errors in the ith and jth places, the Hamming system will assume an error in the kth place.
355
Exercise 13.3.9
Observe that, if A = (a_ij),

c ∈ C ⇔ A c^T = 0^T
 ⇔ Σ_{j=1}^{n} a_ij c_j = 0 for 1 ≤ i ≤ r
 ⇔ α_i c = 0 for 1 ≤ i ≤ r,

where α_i ∈ (Z_2^n)′ is given by

α_i c = Σ_{j=1}^{n} a_ij c_j.
356
Exercise 13.3.10
The null space of a linear map is a subspace.
357
Exercise 13.3.12
Let α be the linear map associated with A with respect to the standard bases. Since A has rank r, so does α, and

dim C = dim α^{−1}(0) = n − rank(α) = n − r,

so C has 2^{n−r} elements.
358
Exercise 13.3.13
(i) Suppose C has basis g(j) [1 ≤ j ≤ r]. Consider the r × n matrix U with jth row g(j). We can use elementary row operations so that (after reordering coordinates if necessary) we get V = (I B). Since elementary row operations leave the space spanned by the rows of a matrix unchanged, C has as basis the rows e(j) of V. This gives the required result.
(1, 1, 1, 0, 0, 0, 0), (1, 0, 0, 1, 1, 0, 0)
(0, 1, 0, 1, 0, 1, 0), (1, 1, 0, 1, 0, 0, 1)
which therefore form a basis.
(iii) α is linear and surjective, so (since the spaces have the same dimension) an isomorphism. Since

( λx + µy, β(λx + µy) ) = α(λx + µy) = λαx + µαy = λ(x, βx) + µ(y, βy) = ( λx + µy, λβx + µβy ),

we have

β(λx + µy) = λβx + µβy,

and β is linear.
(iv) Neither is true. Take n = 2r and b(j) = 0.
359
Exercise 13.3.14
The probability of exactly 2 errors in a line (i.e. codeword), in which case the correction is erroneous, is

C(7, 2) (1/10)² (9/10)^5 ≈ 0.124.
The probability of no such problem in 17 500 lines is less than (9/10)17500
which is negligible.
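Numerically (a sketch; logarithms are used in the second comparison because the raw powers underflow ordinary floating point):

```python
import math

# probability of exactly 2 errors among 7 bits, per-bit error rate 1/10
p2 = math.comb(7, 2) * (1 / 10)**2 * (9 / 10)**5
print(round(p2, 3))  # 0.124

# log of the probability of no 2-error line in 17500 lines, versus the
# bound (9/10)**17500 quoted in the text: both are vanishingly small
print(17500 * math.log(1 - p2) < 17500 * math.log(9 / 10))  # True
```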
360
Exercise 13.3.15
If p = 1/2 then, whatever the message sent, the received sequence isa series of independent random variables each taking the value 0 withprobability 1/2 and the value 1 with probability 1/2.
If p > 1/2, then replacing each received 0 by 1 and each received 1 by 0 gives a system for which the probability of an error in one bit is 1 − p.
361
Exercise 13.3.16
By definition

A_1 = (1),   A_2 = ( 1 0 1
                     0 1 1 ),

A_3 = ( 1 0 1 0 1 0 1
        0 1 1 0 0 1 1
        0 0 0 1 1 1 1 )

and

A_4 = ( 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
        0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
        0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
        0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 ).

(ii) The 2^{r−1}th column has a 1 in the rth place with all other entries zero, so these columns are linearly independent and A_n has rank n.
(iii) Let us write j = Σ_{i=1}^{n} a_ij 2^{i−1} with a_ij ∈ {0, 1}. Then, if 0 ≤ j < 2^{n−1}, we have a_ij = a_{i,(j+2^{n−1})} for 1 ≤ i ≤ n − 1, and a_{n,j} = 0, a_{n,(j+2^{n−1})} = 1. We have a_{i,2^{n−1}} = δ_{in}.
362
Exercise 13.3.17
Let e_k be a row vector of length 2^n − 1 with 1 in the kth place and zero elsewhere. Write k = Σ_{i=1}^{n} b_i 2^{i−1} with b_i ∈ {0, 1}. Then, if c is a code word,

A(c + e_k)^T = A e_k^T = b^T,

so we can recover from a single mistake by the rules:

(1) If x is received and A x^T = 0^T, assume the message is correct.
(2) If x is received and A x^T = b^T, assume a mistake in the (Σ_{i=1}^{n} b_i 2^{i−1})th place.

Since A has full rank (see the previous question), the rank-nullity theorem tells us that C has 2^{2^n − 1 − n} elements.
363
Exercise 13.3.18
Since the error rate is very low, we can use the Poisson approximation. (But exact calculation is easy using a calculator.) We have a rate (5 × 10^7) × 10^−7 = 5, so the probability that no error occurs is roughly e^{−5}, which is very small.

By the Poisson approximation (or directly by bounding terms) the probability of more than one error in a line is roughly

p = (1/2!) λ² e^{−λ}

with λ = 63 × 10^−7. Thus

p ≈ 2 × 10^−11.

There are roughly 8 × 10^5 lines, so the probability of an error in the decoded message is (using the Poisson approximation) roughly 1 − exp(−2 × 10^−5) ≈ 2 × 10^−5, which is pretty small.
364
Exercise 14.1.3
C([a, b]) is a subset of the vector space R[a,b] of functions f : [a, b] →R. We observe that 0 ∈ C([a, b]) and that
f, g ∈ C([a, b]) ⇒ λf + µg ∈ C([a, b])
so C([a, b]) is a subspace of R[a,b] and so a vector space.
We have
⟨f, f⟩ = ∫_a^b f(x)² dx ≥ 0,

and, since the integral of a positive continuous function is zero if and only if the function is zero,

⟨f, f⟩ = 0 ⇒ ∫_a^b f(x)² dx = 0 ⇒ f × f = 0 ⇒ f = 0.

Also

⟨f, g⟩ = ∫_a^b f(x)g(x) dx = ∫_a^b g(x)f(x) dx = ⟨g, f⟩,

⟨f, g + h⟩ = ∫_a^b f(x)(g(x) + h(x)) dx = ∫_a^b f(x)g(x) + f(x)h(x) dx = ⟨f, g⟩ + ⟨f, h⟩,

and

⟨λf, g⟩ = ∫_a^b λf(x)g(x) dx = λ ∫_a^b f(x)g(x) dx = λ⟨f, g⟩.
365
Exercise 14.1.4
(i) If x = 0 or y = 0, we have equality. If λ is a real number, then

0 ≤ ‖λx + y‖² = ⟨λx + y, λx + y⟩ = λ²‖x‖² + 2λ⟨x, y⟩ + ‖y‖²
  = ( λ‖x‖ + ⟨x, y⟩/‖x‖ )² + ‖y‖² − ( ⟨x, y⟩/‖x‖ )².

If we set

λ = −⟨x, y⟩/‖x‖²,

we obtain

0 ≤ ⟨λx + y, λx + y⟩ = ‖y‖² − ( ⟨x, y⟩/‖x‖ )².

Thus

⋆   ‖y‖² − ( ⟨x, y⟩/‖x‖ )² ≥ 0,

with equality only if

0 = ‖λx + y‖,

so only if

λx + y = 0.

Rearranging the terms in ⋆, we obtain

⟨x, y⟩² ≤ ‖x‖²‖y‖²,

and so

|⟨x, y⟩| ≤ ‖x‖‖y‖,

with equality only if λx + y = 0.

We observe that, if λ′, µ′ ∈ R are not both zero and λ′x = µ′y, then |⟨x, y⟩| = ‖x‖‖y‖. Thus |⟨x, y⟩| ≤ ‖x‖‖y‖ with equality if and only if x and y are linearly dependent.

(ii) By definition ‖x‖ ≥ 0. Further

‖x‖ = 0 ⇔ ⟨x, x⟩ = 0 ⇔ x = 0.

Since ⟨λx, λx⟩ = λ²⟨x, x⟩, we have ‖λx‖ = |λ|‖x‖.

Finally

‖x + y‖² = ‖x‖² + 2⟨x, y⟩ + ‖y‖² ≤ ‖x‖² + 2‖x‖‖y‖ + ‖y‖² = (‖x‖ + ‖y‖)²,

so ‖x + y‖ ≤ ‖x‖ + ‖y‖.
(iii) If we use the inner product of Exercise 14.1.3, the Cauchy–Schwarz inequality takes the form

( ∫_a^b f(t)g(t) dt )² ≤ ( ∫_a^b f(t)² dt ) ( ∫_a^b g(t)² dt ).
367
Exercise 14.1.5
(i) We have p_0 = 1, so the result is true for n = 0. If p_r is a monic polynomial of degree r for r ≤ n − 1, then

Σ_{j=0}^{n−1} ( ⟨q_n, p_j⟩ / ‖p_j‖² ) p_j

is a polynomial of degree at most n − 1, so

p_n = q_n − Σ_{j=0}^{n−1} ( ⟨q_n, p_j⟩ / ‖p_j‖² ) p_j

is a monic polynomial of degree n. In particular

p_1(x) = x − ( ⟨q_1, p_0⟩ / ‖p_0‖² ) p_0(x) = x,

p_2(x) = x² − Σ_{j=0}^{1} ( ⟨q_2, p_j⟩ / ‖p_j‖² ) p_j(x) = x² − ( ∫_{−1}^{1} t² dt ) / ( ∫_{−1}^{1} 1² dt ) = x² − 1/3.
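This Gram–Schmidt computation is easy to check with exact rational arithmetic (a sketch; a polynomial is a list of coefficients and ⟨f, g⟩ = ∫_{−1}^{1} f(t)g(t) dt):

```python
from fractions import Fraction as F

# A polynomial is a coefficient list [a0, a1, ...] meaning a0 + a1 x + ...
# <f, g> = integral of f*g over [-1, 1]; the integral of x^k over [-1, 1]
# is 2/(k+1) for even k and 0 for odd k.
def inner(f, g):
    s = F(0)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            if (i + j) % 2 == 0:
                s += F(2, i + j + 1) * a * b
    return s

def gram_schmidt(n):
    ps = []
    for k in range(n + 1):
        q = [F(0)] * k + [F(1)]                 # the monomial x^k
        for p in ps:                            # subtract projections
            c = inner(q, p) / inner(p, p)
            q = [a - c * b
                 for a, b in zip(q, p + [F(0)] * (len(q) - len(p)))]
        ps.append(q)
    return ps

p0, p1, p2 = gram_schmidt(2)
print(p2)  # [Fraction(-1, 3), Fraction(0, 1), Fraction(1, 1)], i.e. x^2 - 1/3
```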
(ii) 0 is a polynomial of degree at most n, and if P and Q are polynomials of degree at most n, so is λP + µQ. Writing q_r(x) = x^r, we have the trivial equality

Σ_{r=0}^{n} a_r x^r = ( Σ_{r=0}^{n} a_r q_r )(x),

so the q_j with 0 ≤ j ≤ n span P_n. Since a non-zero polynomial of degree at most n can have at most n zeros,

Σ_{r=0}^{n} a_r q_r = 0 ⇒ a_r = 0 [0 ≤ r ≤ n],

so the q_r form a basis and P_n has dimension n + 1.

The p_j ‖p_j‖^{−1} form a collection of n + 1 orthonormal vectors and so form an orthonormal basis.
(iii) If P is monic of degree n + 1, then

P = p_{n+1} + Q

with Q a polynomial of degree at most n. Now

P ⊥ P_n ⇒ (P − p_{n+1}) ⊥ P_n ⇒ Q ⊥ P_n ⇒ Q ⊥ Q ⇒ ⟨Q, Q⟩ = 0 ⇒ Q = 0 ⇒ P = p_{n+1}.

By definition p_{n+1} is monic and

p_{n+1} ⊥ span{p_0, p_1, …, p_n},

and so, since P_n = span{p_0, p_1, …, p_n}, we have p_{n+1} ⊥ P_n.
369
Exercise 14.1.11
(i) Observe that

Σ_{j=1}^{n} |λ_j| ≥ Σ_{j=1}^{n} λ_j = ∫_{−1}^{1} 1 dt = 2.

Similarly Σ_{j=n+1}^{2n} |λ_j| ≥ 2.
(ii) We have

Σ_{j=1}^{2n} µ_j P(t_j) = (ǫ^{−1} + 1) Σ_{j=1}^{n} λ_j P(t_j) − ǫ^{−1} Σ_{j=n+1}^{2n} λ_j P(t_j) = ∫_{−1}^{1} P(t) dt

for all polynomials P of degree n or less.

Let f be the simplest piecewise linear function with

f(t_j) = { ǫ   if 1 ≤ j ≤ n and λ_j ≥ 0
         { −ǫ  if 1 ≤ j ≤ n and λ_j < 0
         { ǫ   if n + 1 ≤ j ≤ 2n and λ_j < 0
         { −ǫ  if n + 1 ≤ j ≤ 2n and λ_j ≥ 0

and with

f(−1) = 0 if −1 ∉ {t_j : 1 ≤ j ≤ 2n},   f(1) = 0 if 1 ∉ {t_j : 1 ≤ j ≤ 2n}.

Then |f(t)| ≤ ǫ for all t ∈ [−1, 1], but

Σ_{j=1}^{2n} µ_j f(t_j) = ǫ Σ_{j=1}^{2n} |µ_j| ≥ ǫ( 2(1 + ǫ^{−1}) + 2ǫ^{−1} ) = 2ǫ + 4.
370
Exercise 14.1.12
(i) Observe that

⟨f, f⟩ = ∫_a^b f(x)²r(x) dx ≥ 0,

and, since x ↦ f(x)²r(x) is continuous and positive,

⟨f, f⟩ = 0 ⇒ ∫_a^b f(x)²r(x) dx = 0 ⇒ f(x)²r(x) = 0 for all x ⇒ f(x) = 0 for all x.

Automatically

⟨f, g⟩ = ∫_a^b f(x)g(x)r(x) dx = ∫_a^b g(x)f(x)r(x) dx = ⟨g, f⟩

and

⟨λ_1 f_1 + λ_2 f_2, g⟩ = ∫_a^b (λ_1 f_1(x) + λ_2 f_2(x))g(x)r(x) dx
 = ∫_a^b λ_1 f_1(x)g(x)r(x) + λ_2 f_2(x)g(x)r(x) dx = λ_1⟨f_1, g⟩ + λ_2⟨f_2, g⟩.
(ii) This is just the Cauchy–Schwarz inequality

⟨f, f⟩⟨g, g⟩ ≥ ⟨f, g⟩².

(iii) We define P_n inductively. Set P_0 = 1. Suppose that P_j is a monic polynomial of degree j with the P_j orthogonal with respect to our inner product [0 ≤ j ≤ n − 1]. Then, if Q_n(x) = x^n,

P_n = Q_n − Σ_{j=0}^{n−1} ( ⟨Q_n, P_j⟩ / ⟨P_j, P_j⟩ ) P_j

is a monic polynomial of degree n orthogonal to all the P_j with j ≤ n − 1.

Since the subspace P_{n−1} of polynomials of degree at most n − 1 has dimension n and the P_j are non-zero and mutually orthogonal, the P_j with j ≤ n − 1 form a basis for P_{n−1}, and ⟨P_n, Q⟩ = 0 for all polynomials Q of degree n − 1 or less.
(iv) Let x_1, x_2, …, x_k be the real roots of P_n which have odd order and lie in (a, b). Setting Q(x) = ∏_{j=1}^{k} (x − x_j), we see that P_n Q is single signed, so

⟨P_n, Q⟩ = ∫_a^b P_n(x)Q(x)r(x) dx ≠ 0,

and so k ≥ n. Thus k = n and P_n has n distinct real roots, all lying in (a, b).
If we set e_v(x) = ∏_{j≠v, 1≤j≤n} (x − x_j)(x_v − x_j)^{−1} and

λ_v = ∫_a^b e_v(x)r(x) dx,

then (since f − Σ_{v=1}^{n} f(x_v)e_v is a polynomial of degree at most n − 1 vanishing at n points) we have f = Σ_{v=1}^{n} f(x_v)e_v and

∫_a^b f(x)r(x) dx = Σ_{v=1}^{n} λ_v f(x_v)

for all polynomials f of degree n − 1 or less.
If f is a polynomial of degree 2n − 1 or less, then f = gP_n + h with g and h polynomials of degree n − 1 or less, so

∫_a^b f(x)r(x) dx = ∫_a^b g(x)P_n(x)r(x) dx + ∫_a^b h(x)r(x) dx
 = 0 + Σ_{v=1}^{n} λ_v h(x_v)
 = Σ_{v=1}^{n} λ_v h(x_v) + Σ_{v=1}^{n} λ_v P_n(x_v)g(x_v)
 = Σ_{v=1}^{n} λ_v f(x_v).
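For the weight r = 1 on [−1, 1] and n = 2, this is the familiar two-point Gauss–Legendre rule: P_2(x) = x² − 1/3 has roots ±1/√3 and both weights equal 1. A quick numerical check (a sketch) that the rule integrates polynomials of degree 2n − 1 = 3 exactly:

```python
import math

# Two-point Gauss rule on [-1, 1] with weight r = 1: nodes are the roots
# of P_2(x) = x^2 - 1/3; both weights (integrals of the Lagrange basis) are 1.
nodes = [-1 / math.sqrt(3), 1 / math.sqrt(3)]
weights = [1.0, 1.0]

def quad(f):
    return sum(w * f(x) for w, x in zip(weights, nodes))

# exact integrals of 1, x, x^2, x^3 over [-1, 1] are 2, 0, 2/3, 0
exact = [2.0, 0.0, 2 / 3, 0.0]
approx = [quad(lambda t, k=k: t**k) for k in range(4)]
print(all(abs(a - e) < 1e-12 for a, e in zip(approx, exact)))  # True
```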
372
Exercise 14.1.14
We need to show that ⟨P_n, qP_{n−1}⟩ > 0.

Now qP_{n−1} and P_n are both monic polynomials of degree n, so qP_{n−1} − P_n has degree at most n − 1. Thus

⟨P_n, qP_{n−1} − P_n⟩ = 0

and

⟨P_n, qP_{n−1}⟩ = ⟨P_n, P_n⟩ > 0.
373
Exercise 14.1.16
Observe that

0 ≤ ∫_{−1}^{1} f_n(x)² dx = (2^{n+1} − 1) ∫_{−2^{−8n}}^{2^{−8n}} f_n(x)² dx ≤ 2^{n+1} · 2^{−8n+1} · 2^{2n} = 2^{−5n+2} → 0

as n → ∞, so ‖f_n‖ → 0.

However, if u is an integer with |u| < 2^m − 1, then f_n(u2^{−m}) = 2^n when n ≥ m, and so f_n(u2^{−m}) → ∞ as n → ∞.
374
Exercise 14.2.1
If dim U = 0, there is nothing to prove.

Otherwise, we construct the e_j inductively. Since dim U > 0, we can find a ≠ 0. Set e_1 = ‖a‖^{−1}a. If e_1, e_2, …, e_r have been found as orthonormal vectors, either r = n and we have an orthonormal basis, or we can find

b ∉ span{e_1, e_2, …, e_r}.

The Gram–Schmidt method now produces an e_{r+1} with e_1, e_2, …, e_{r+1} orthonormal. The process terminates in an orthonormal basis.
Since the e_j are linearly independent and span U, the map θ given by

θ( Σ_{j=1}^{n} x_j e_j ) = (x_1, x_2, …, x_n)^T

is a well defined bijection. Since

θ( λ Σ_{j=1}^{n} x_j e_j + µ Σ_{j=1}^{n} y_j e_j ) = θ( Σ_{j=1}^{n} (λx_j + µy_j) e_j )
 = (λx_1 + µy_1, λx_2 + µy_2, …, λx_n + µy_n)^T
 = λ(x_1, x_2, …, x_n)^T + µ(y_1, y_2, …, y_n)^T
 = λθ( Σ_{j=1}^{n} x_j e_j ) + µθ( Σ_{j=1}^{n} y_j e_j ),

θ is linear. Finally,

θ( Σ_{j=1}^{n} x_j e_j ) · θ( Σ_{j=1}^{n} y_j e_j ) = Σ_{j=1}^{n} x_j y_j = ⟨ Σ_{j=1}^{n} x_j e_j, Σ_{j=1}^{n} y_j e_j ⟩.
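The inductive construction above is just the Gram–Schmidt process; a minimal sketch in R^n (assuming the standard dot product and linearly independent input vectors):

```python
import math

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same space as the
    (linearly independent) input vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:                       # subtract component along e
            c = sum(wi * ei for wi, ei in zip(w, e))
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis

e = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
# check orthonormality: <e_i, e_j> = delta_ij
ok = all(abs(sum(a * b for a, b in zip(e[i], e[j])) - (i == j)) < 1e-12
         for i in range(3) for j in range(3))
print(ok)  # True
```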
375
Exercise 14.2.3
Choose an orthonormal basis

e_1, e_2, …, e_r

for V. Then, by Bessel's inequality,

‖a − Σ_{j=1}^{r} λ_j e_j‖

has a unique minimum when λ_j = ⟨a, e_j⟩. If x ∈ V,

⟨a − x, v⟩ = 0 for all v ∈ V ⇔ ⟨a − x, e_j⟩ = 0 for all 1 ≤ j ≤ r
 ⇔ ⟨x, e_j⟩ = ⟨a, e_j⟩ for all 1 ≤ j ≤ r
 ⇔ x = Σ_{j=1}^{r} ⟨a, e_j⟩ e_j,

so we are done.
Exercise 14.2.5⋆
376
Exercise 14.2.6
Observe that

⟨u, a⟩ = ⟨ u, Σ_{j=1}^{n} α(e_j) e_j ⟩ = Σ_{j=1}^{n} α(e_j)⟨u, e_j⟩ = α( Σ_{j=1}^{n} ⟨u, e_j⟩ e_j ) = αu.
377
Exercise 14.2.7
We have a map α : U → R so we only need check linearity.
But, if u, v ∈ U and λ, µ ∈ R we have
α(λu+ µv) = 〈λu+ µv, a〉 = λ〈u, a〉+ µ〈v, a〉 = λαu+ µαv,
so we are done.
378
Exercise 14.2.10
Observe that

⟨u, Φ(λα + µβ)v⟩ = ⟨(λα + µβ)u, v⟩
 = ⟨λα(u) + µβ(u), v⟩
 = λ⟨αu, v⟩ + µ⟨βu, v⟩
 = λ⟨u, α*v⟩ + µ⟨u, β*v⟩
 = ⟨u, (λα* + µβ*)v⟩
 = ⟨u, (λΦα + µΦβ)v⟩

for all u ∈ U, so

Φ(λα + µβ)v = (λΦα + µΦβ)v

for all v ∈ U, so

Φ(λα + µβ) = λΦα + µΦβ

for all α, β ∈ L(U, U) and λ, µ ∈ F, so Φ is linear.
379
Exercise 14.2.12
D is a subset of the inner product space C([0, 1]). 0 ∈ D and
λ, µ ∈ R, f, g ∈ D ⇒ λf + µg ∈ D.
Thus D is a subspace of the inner product space C([0, 1]) and so aninner product space.
Further,

α(λf + µg) = ( p(λf + µg)′ )′ = ( λpf′ + µpg′ )′ = λαf + µαg,

so α is linear.
By integration by parts,

⟨αf, g⟩ = ∫_0^1 (f′p)′(t)g(t) dt = [ f′(t)p(t)g(t) ]_0^1 − ∫_0^1 f′(t)p(t)g′(t) dt
 = −∫_0^1 f′(t)p(t)g′(t) dt = ⟨αg, f⟩ = ⟨f, αg⟩,

so α* = α.
If we write (βf)(t) = tf(t), then β is a self-adjoint linear map.
380
Exercise 14.3.2
We have

⟨z, λw⟩_P = ⟨λw, z⟩*_P = ( λ*⟨w, z⟩_P )* = λ⟨w, z⟩*_P = λ⟨z, w⟩_P.

Let ⟨ , ⟩_M be defined as in the question.

(i) Since ⟨z, z⟩_P is real and positive, so is ⟨z, z⟩_M = ⟨z, z⟩*_P.

(ii) ⟨z, z⟩_M = 0 ⇒ ⟨z, z⟩_P = 0 ⇒ z = 0.

(iii) ⟨λz, w⟩_M = ⟨λz, w⟩*_P = ( λ*⟨z, w⟩_P )* = λ⟨z, w⟩*_P = λ⟨z, w⟩_M.

(iv) We have

⟨z + u, w⟩_M = ⟨z + u, w⟩*_P = ( ⟨z, w⟩_P + ⟨u, w⟩_P )* = ⟨z, w⟩*_P + ⟨u, w⟩*_P = ⟨z, w⟩_M + ⟨u, w⟩_M.

(v) ⟨w, z⟩_M = ⟨w, z⟩*_P = ⟨z, w⟩**_P = ⟨z, w⟩*_M.

Essentially the same arguments show that if ⟨ , ⟩_M is a mathematician's inner product, then

⟨z, w⟩_P = ⟨z, w⟩*_M

defines a physicist's inner product.
381
Exercise 14.3.3
C([a, b]) is a subset of the vector space C^{[a,b]} of functions f : [a, b] → C. We observe that 0 ∈ C([a, b]) and that

f, g ∈ C([a, b]) ⇒ λf + µg ∈ C([a, b]),

so C([a, b]) is a subspace of C^{[a,b]} and so a vector space.

We have

⟨f, f⟩ = ∫_a^b |f(x)|² dx ≥ 0,

and, since the integral of a positive continuous function is zero if and only if the function is zero,

⟨f, f⟩ = 0 ⇒ ∫_a^b |f(x)|² dx = 0 ⇒ f f* = 0 ⇒ f = 0.

Also

⟨f, g⟩ = ∫_a^b f(x)g(x)* dx = ∫_a^b (g(x)f(x)*)* dx = ( ∫_a^b g(x)f(x)* dx )* = ⟨g, f⟩*,

and

⟨f, g + h⟩ = ∫_a^b f(x)(g(x) + h(x))* dx = ∫_a^b f(x)g(x)* + f(x)h(x)* dx = ⟨f, g⟩ + ⟨f, h⟩,

whilst

⟨λf, g⟩ = ∫_a^b λf(x)g(x)* dx = λ ∫_a^b f(x)g(x)* dx = λ⟨f, g⟩.
382
Exercise 14.3.4
(i) We have

⟨f, e_j⟩ = ⟨ Σ_{r=1}^{n} a_r e_r, e_j ⟩ = Σ_{r=1}^{n} a_r⟨e_r, e_j⟩ = Σ_{r=1}^{n} a_r δ_rj = a_j.

Note that

⟨e_j, f⟩ = ⟨f, e_j⟩* = a_j*.

(ii) We have

⟨e_r, e_s⟩ = ∫_0^1 exp(2πirt) exp(−2πist) dt = ∫_0^1 exp( 2πi(r − s)t ) dt = δ_rs.
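The orthonormality in (ii) is easy to check numerically: averaging exp(2πikt) over the N points t = j/N gives exactly 1 for k = 0 and (up to rounding) 0 for 0 < |k| < N (a sketch):

```python
import cmath

def disc_int(k, N=64):
    """Average of exp(2*pi*i*k*t) over t = j/N, j = 0..N-1; this equals
    the integral over [0, 1] whenever |k| < N."""
    return sum(cmath.exp(2j * cmath.pi * k * j / N) for j in range(N)) / N

# <e_r, e_s> corresponds to disc_int(r - s): 1 if r = s, ~0 otherwise
print(abs(disc_int(0) - 1) < 1e-12)  # True
print(all(abs(disc_int(r - s)) < 1e-12
          for r in range(4) for s in range(4) if r != s))  # True
```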
383
Exercise 14.3.5
(i) Observe that

⟨v, e_r⟩ = ⟨x, e_r⟩ − Σ_{j=1}^{k} ⟨x, e_j⟩⟨e_j, e_r⟩ = ⟨x, e_r⟩ − ⟨x, e_r⟩ = 0

for all 1 ≤ r ≤ k.

(ii) If v = 0, then

x = Σ_{j=1}^{k} ⟨x, e_j⟩ e_j ∈ span{e_1, e_2, …, e_k}.

If v ≠ 0, then ‖v‖ ≠ 0 and

x = v + Σ_{j=1}^{k} ⟨x, e_j⟩ e_j = ‖v‖e_{k+1} + Σ_{j=1}^{k} ⟨x, e_j⟩ e_j ∈ span{e_1, e_2, …, e_{k+1}}.
(iii) If e1, e2, . . . , ek do not form a basis for U , then we can find
x ∈ U \ span{e1, e2, . . . , ek}.Defining v as in part (i), we see that v ∈ U and so the vector ek+1
defined in (ii) lies in U . Thus we have found orthonormal vectors e1,e2, . . . , ek+1 in U . If they form a basis for U we stop. If not, we repeatthe process. Since no set of n + 1 vectors in U can be orthonormal(because no set of n+1 vectors in U can be linearly independent), theprocess must terminate with an orthonormal basis for U of the requiredform.
384
Exercise 14.3.7
Let V have an orthonormal basis e1, e2, . . . , en−1. By using theGram–Schmidt process, we can find en such that e1, e2, . . . , en areorthonormal and so a basis for U . Setting b = en, we have the result.
385
Exercise 14.3.8
Uniqueness. If αu = 〈u, a1〉 = 〈u, a2〉 for all u ∈ U , then
〈u, a1 − a2〉 = 0
for all u ∈ U and, choosing u = a1 − a2, we conclude, in the usual way, that a1 − a2 = 0.
Existence. If α = 0, then we set a = 0. If not, then α has rank 1 (since α(U) = R) and, by the rank–nullity theorem, α has nullity n − 1. In other words,
α⁻¹(0) = {u : αu = 0}
has dimension n − 1. By Lemma 14.2.2, we can find a b ≠ 0 such that
α⁻¹(0) = {x ∈ U : 〈x, b〉 = 0},
that is to say,
α(x) = 0 ⇔ 〈x, b〉 = 0.
If we now set a = ‖b‖⁻²α(b)b, we have
αx = 0 ⇔ 〈x, a〉 = 0
and
αa = ‖b‖⁻²α(b)² = 〈a, a〉.
Now suppose that u ∈ U. If we set
x = u − (αu/αa)a,
then αx = 0 so 〈x, a〉 = 0, that is to say,
0 = 〈u − (αu/αa)a, a〉 = 〈u, a〉 − (〈a, a〉/αa)αu = 〈u, a〉 − αu
and we are done.
386
Exercise 14.3.9
Since
〈λu + µv, a〉 = λ〈u, a〉 + µ〈v, a〉,
θ(a) ∈ U′. The previous exercise tells us that θ is surjective.
θ(a) = θ(b) ⇒ 〈b − a, b − a〉 = 0 ⇒ a = b
so θ is injective.
Finally,
θ(λa + µb)(u) = 〈u, λa + µb〉 = λ∗〈u, a〉 + µ∗〈u, b〉 = λ∗θ(a)u + µ∗θ(b)u = (λ∗θ(a) + µ∗θ(b))u
for all u ∈ U . Thus
θ(λa+ µb) = λ∗θ(a) + µ∗θ(b)
as required.
Exercise 14.3.10⋆
387
Exercise 14.3.11
Observe that, if v ∈ U , the map
θ(u) = 〈αu, v〉
lies in U′, so the Riesz representation theorem tells us that there is a unique α∗v ∈ U such that
〈αu, v〉 = 〈u, α∗v〉
for all u ∈ U.
Since
〈u, α∗(λv + µw)〉 = 〈αu, (λv + µw)〉= λ∗〈αu,v〉+ µ∗〈αu,w〉= λ∗〈u, α∗v〉+ µ∗〈u, α∗w〉= 〈u, λα∗v + µα∗w〉
for all u ∈ U , we have
α∗(λv + µw) = λα∗v + µα∗w
so α∗ ∈ L(U, U).
Again,
〈u, (λα + µβ)∗v〉 = 〈(λα+ µβ)u,v〉= λ〈αu,v〉+ µ〈βu,v〉= λ〈u, α∗v〉+ µ〈u, β∗v〉= 〈u, λ∗α∗v + µ∗β∗v〉= 〈u, (λ∗α∗ + µ∗β∗)v〉
for all u ∈ U , so
(λα+ µβ)∗v = (λ∗α∗ + µ∗β∗)v
for all v ∈ U, so
(λα + µβ)∗ = λ∗α∗ + µ∗β∗
and the map Ψ : L(U, U) → L(U, U) given by Ψα = α∗ satisfies
Ψ(λα+ µβ) = λ∗Ψα + µ∗Ψβ.
Finally,
〈α∗∗u, v〉 = 〈v, α∗∗u〉∗ = 〈α∗v, u〉∗ = 〈u, α∗v〉 = 〈αu, v〉
for all v ∈ U, so α∗∗u = αu
for all u ∈ U, so α∗∗ = α and Ψ is its own inverse, so bijective and so an anti-isomorphism.
388
Exercise 14.3.12
If α has matrix A = (aij) and α∗ has matrix B = (bij) with respectto the given basis, then
brj = 〈Σ_{i=1}^n bij ei, er〉 = 〈α∗ej, er〉 = 〈er, α∗ej〉∗ = 〈αer, ej〉∗ = 〈Σ_{k=1}^n akr ek, ej〉∗ = ajr∗,
so B = A∗.
389
Exercise 14.3.13
(i) Since V and V ⊥ are complementary,
dim V + dimV ⊥ = dimU.
For the same reason, dim V⊥ + dim V⊥⊥ = dim U, so dim V⊥⊥ = dim V.
Since
x ∈ V ⇒ 〈x, y〉 = 〈y, x〉 = 0 ∀y ∈ V⊥,
we have V ⊆ V⊥⊥ and so V = V⊥⊥.
(ii) The statement is equivalent to saying that, if u ∈ U, there are unique a ∈ V and b ∈ V⊥ such that u = a + b, and this is the same as saying that V and V⊥ are complementary.
(iii) We have
λ1u1 + λ2u2 = (λ1πu1 + λ2πu2) + (λ1(ι− π)u1 + λ2(ι− π)u2)
and λ1πu1 + λ2πu2 ∈ V , λ1(ι − π)u1 + λ2(ι − π)u2 ∈ V ⊥ so, byuniqueness,
π(λ1u1 + λ2u2) = λ1πu1 + λ2πu2.
Thus π and so ι− π are linear.
Since πu ∈ V we have, by uniqueness, π²u = π(π(u)) = πu for all u ∈ U, so π = π².
If x, y ∈ U , then
〈πx, y〉 = 〈πx, πy + (ι − π)y〉 = 〈πx, πy〉 + 〈πx, (ι − π)y〉 = 〈πx, πy〉
= 〈πx, πy〉 + 〈(ι − π)x, πy〉 = 〈x, πy〉,
so π = π∗.
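The identities π² = π and π = π∗ can be spot-checked numerically for a concrete orthogonal projection; the matrix below (projection of R² onto span{(1, 1)}) is a hand-picked illustration, not taken from the text:

```python
# Hand-picked example: P projects R² orthogonally onto span{(1, 1)}.
P = [[0.5, 0.5], [0.5, 0.5]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = [1.0, -2.0], [3.0, 0.5]
print(matmul(P, P) == P)                           # True: π² = π
print(dot(apply(P, x), y) == dot(x, apply(P, y)))  # True: 〈πx, y〉 = 〈x, πy〉
```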
390
Exercise 14.3.14
(i)⇒(ii)
Let U = α(V ) and W = (ι− α)(V ). Since
v = αv + (ι− α)v
U +W = V . If u ∈ U then u = αv for some v ∈ V so
αu = α²v = αv = u = ιUu.
If w ∈ W, then w = (ι − α)v for some v ∈ V, so
αw = (α− α2)v = 0.
Finally, if u ∈ U , w ∈ W the results of the previous paragraph give
〈u,w〉 = 〈αu, (ι− α)w〉 = 〈u, α(ι− α)w〉 = 〈u, 0〉 = 0.
(ii)⇒(iii) Immediate.
(iii)⇒(iv) Take an orthonormal basis for U and an orthonormal ba-sis for W . Together they form an orthonormal basis for V with therequired properties.
(iv)⇒(i) Let A be the matrix of α with respect to the specified basis.Then A∗ = A and A2 = A, so the results for α follow.
An orthogonal projection automatically obeys the condition α² = α.
Give C² the standard inner product. Let
α(z, w) = (z/3 + 2w/3, z/3 + 2w/3).
By inspection, α is linear and α2 = α. By observing that the matrixA of α with respect to the standard basis is real, but not symmetric,we see that A∗ 6= A so α∗ 6= α. Thus α is a projection but not anorthogonal projection.
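A sketch check, in exact rational arithmetic, that the map above is idempotent but not self-adjoint:

```python
from fractions import Fraction as F

# Matrix of α(z, w) = (z/3 + 2w/3, z/3 + 2w/3) in the standard basis.
A = [[F(1, 3), F(2, 3)], [F(1, 3), F(2, 3)]]
A2 = [[sum(A[i][k] * A[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(A2 == A)             # True: α² = α, so α is a projection
print(A[0][1] == A[1][0])  # False: A not symmetric, so α* ≠ α
```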
If α and β are orthogonal projections and αβ = βα, then
(αβ)² = (αβ)(αβ) = α(βα)β = α(αβ)β = α²β² = αβ
and (αβ)∗ = (βα)∗ = α∗β∗ = αβ.
391
Exercise 14.3.15
We have
ρW² = 4πW² − 4πW + ι = 4πW − 4πW + ι = ι.
Also
‖ρWx‖² = 〈(2πW − ι)x, (2πW − ι)x〉 = 4‖πWx‖² − 4〈πWx, x〉 + 〈x, x〉 = 〈x, x〉 = ‖x‖²
(since 〈πWx, x〉 = 〈πWx, πWx〉 = ‖πWx‖²), so ρW is an isometry.
ρWρW⊥ = (2πW − ι)(2πW⊥ − ι) = 4πWπW⊥ − 2πW − 2πW⊥ + ι = 0 − 2ι + ι = −ι.
392
Exercise 15.1.1
d1(A, B) = max_{i,j} |aij − bij| = ‖A − B‖₁.
(i) |aij| ≥ 0 ∀i, j, so ‖A‖₁ ≥ 0.
(ii) ‖A‖₁ = 0 ⇒ |aij| = 0 ∀i, j ⇒ aij = 0 ∀i, j ⇒ A = 0.
(iii) ‖λA‖₁ = max_{i,j} |λaij| = max_{i,j} |λ||aij| = |λ| max_{i,j} |aij| = |λ|‖A‖₁.
(iv) |aij + bij| ≤ |aij| + |bij| ∀i, j, so
max_{i,j} |aij + bij| ≤ max_{i,j} |aij| + max_{i,j} |bij|
and ‖A + B‖₁ ≤ ‖A‖₁ + ‖B‖₁.
d2(A, B) = Σ_{i,j} |aij − bij| = ‖A − B‖₂.
(i) |aij| ≥ 0 ∀i, j, so
‖A‖₂ = Σ_{i,j} |aij| ≥ 0.
(ii) ‖A‖₂ = 0 ⇒ |aij| = 0 ∀i, j ⇒ aij = 0 ∀i, j ⇒ A = 0.
(iii) We have
‖λA‖₂ = Σ_{i,j} |λaij| = Σ_{i,j} |λ||aij| = |λ| Σ_{i,j} |aij| = |λ|‖A‖₂.
(iv) We have
‖A + B‖₂ = Σ_{i,j} |aij + bij| ≤ Σ_{i,j} (|aij| + |bij|) = Σ_{i,j} |aij| + Σ_{i,j} |bij| = ‖A‖₂ + ‖B‖₂.
d3(A, B) = (Σ_{i,j} |aij − bij|²)^{1/2} = ‖A − B‖₃.
‖ ‖₃ is just the standard norm for an inner product space of dimension n².
393
Exercise 15.1.3
Observe that, using the Cauchy–Schwarz inequality,
‖αx‖ = (Σ_{i=1}^n |Σ_{j=1}^n aij xj|²)^{1/2}
≤ (Σ_{i=1}^n (Σ_{j=1}^n |aij|²)(Σ_{j=1}^n |xj|²))^{1/2}
= (Σ_{i=1}^n Σ_{j=1}^n |aij|² ‖x‖²)^{1/2}
= (Σ_{i=1}^n Σ_{j=1}^n |aij|²)^{1/2} ‖x‖.
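The bound ‖αx‖ ≤ (Σ_{i,j} |aij|²)^{1/2}‖x‖ can be spot-checked numerically; the matrix and the random sample below are illustrative assumptions:

```python
import math
import random

# Spot-check the Cauchy–Schwarz (Frobenius) bound for a sample matrix.
A = [[1.0, -2.0], [0.5, 3.0]]
frob = math.sqrt(sum(a * a for row in A for a in row))

def ratio(x):
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return math.sqrt(sum(t * t for t in Ax)) / math.sqrt(sum(t * t for t in x))

random.seed(0)
worst = max(ratio([random.uniform(-1, 1), random.uniform(-1, 1)])
            for _ in range(1000))
print(worst <= frob)  # True
```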
394
Exercise 15.1.7
Since U = (ι−α)−1(0) andW = α−1(0) are orthogonal complements,we can write any v ∈ V in the form
v = u+w
with u ∈ U and w ∈ W . Since u ⊥ w,
‖v‖2 = ‖u‖2 + ‖w‖2
and
‖πv‖ = ‖u‖ ≤ ‖v‖,
so ‖π‖ ≤ 1.
Since π ≠ 0, we can find x with u = πx ≠ 0, so ‖u‖ ≠ 0. Since π(u) = u, we have ‖π‖ = 1.
We use row vectors. Let K > 0. If α(x, y) = (x+Ky, 0), then α islinear and α2(x, y) = (x+Ky, 0) = α(x, y), so α is a projection. Since‖(0, 1)‖ = 1 and ‖α(0, 1)‖ = K, ‖α‖ ≥ K.
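A numerical sketch of the example: α is a projection whose operator norm is at least K (here K = 100 is an arbitrary choice):

```python
import math

# α(x, y) = (x + Ky, 0) is a projection with ‖α‖ ≥ K.
K = 100.0

def alpha(v):
    x, y = v
    return (x + K * y, 0.0)

v = (0.3, -1.7)
print(alpha(alpha(v)) == alpha(v))     # True: α² = α
print(math.hypot(*alpha((0.0, 1.0))))  # 100.0 = ‖α(0,1)‖, so ‖α‖ ≥ K
```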
395
Exercise 15.1.9
If e1, e2, . . . , en is an orthonormal basis for U , then the map
θ(Σ_{j=1}^n xj ej) = (x1, x2, . . . , xn)ᵀ
is a linear inner product preserving (and so norm preserving) isomor-phism θ : U → Fn where Fn is equipped with the standard dot product.
396
Exercise 15.1.10
Observe that‖Ix‖ = ‖x‖
so ‖I‖ = 1.
(i) ‖I + (−I)‖ = ‖0‖ = 0, ‖I‖ = ‖ − I‖ = 1.
(ii) ‖I + I‖ = ‖2I‖ = 2, ‖I‖ = 1.
(iv) ‖II‖ = ‖I‖ = 1.
We now attack (iii). If
J = ( 0 1
      0 0 ),
then
‖J(x, y)ᵀ‖ = ‖(y, 0)ᵀ‖ = |y| ≤ ‖(x, y)ᵀ‖,
so ‖J‖ ≤ 1. However, ‖(1, 0)ᵀ‖ = ‖(0, 1)ᵀ‖ = 1 and J(0, 1)ᵀ = (1, 0)ᵀ, so ‖J‖ = 1 but
‖JJ‖ = ‖0‖ = 0.
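The strictness of ‖JJ‖ ≤ ‖J‖‖J‖ here comes down to J² = 0, which is immediate to verify:

```python
# J = [[0, 1], [0, 0]] satisfies ‖J‖ = 1 but J² = 0, so ‖JJ‖ < ‖J‖‖J‖.
J = [[0, 1], [0, 0]]
J2 = [[sum(J[i][k] * J[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(J2)  # [[0, 0], [0, 0]]
```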
397
Exercise 15.1.11
det(tI − A) = (t − 1)² − µ² = t² − 2t + η²,
so the eigenvalues are 1 ± (1 − η²)^{1/2}.
The eigenvectors corresponding to 1 + (1 − η²)^{1/2} are given by
x + µy = (1 + (1 − η²)^{1/2})x
µx + y = (1 + (1 − η²)^{1/2})y,
i.e. x = y (as we could spot directly). The eigenvectors corresponding to 1 − (1 − η²)^{1/2} are given by
x + µy = (1 − (1 − η²)^{1/2})x
µx + y = (1 − (1 − η²)^{1/2})y,
i.e. x = −y (as we could spot directly).
Thus we have a basis of orthonormal eigenvectors e1 = 2^{−1/2}(1, 1)ᵀ with eigenvalue λ1 = 1 + (1 − η²)^{1/2} and e2 = 2^{−1/2}(1, −1)ᵀ with eigenvalue λ2 = 1 − (1 − η²)^{1/2}.
We have
‖A(x1e1 + x2e2)‖ = ‖λ1x1e1 + λ2x2e2‖ = ((λ1x1)² + (λ2x2)²)^{1/2} ≤ λ1(x1² + x2²)^{1/2} = λ1‖x1e1 + x2e2‖,
so ‖A‖ ≤ λ1 (in fact, since Ae1 = λ1e1, we have ‖A‖ = λ1) and small changes in x produce small changes in y.
Observe that, taking y = e2, we have x = A⁻¹y = λ2⁻¹e2, so ‖x‖ = λ2⁻¹, which is very large.
(ii) Observe that A⁻¹ has the same orthonormal basis of eigenvectors, e1 = 2^{−1/2}(1, 1)ᵀ with eigenvalue λ1⁻¹ and e2 = 2^{−1/2}(1, −1)ᵀ with eigenvalue λ2⁻¹. Thus, arguing as in (i), ‖A⁻¹‖ = λ2⁻¹.
Now detB = detA detA−1 = detAA−1 = det I = 1 and
B⁻¹ = ( A⁻¹ 0
        0   A ).
By looking at vectors of the form (0, 0, x, y)ᵀ we see that ‖B‖ ≥ ‖A⁻¹‖ and by looking at vectors of the form (x, y, 0, 0)ᵀ we see that ‖B⁻¹‖ ≥ ‖A⁻¹‖. Thus, if η is very small, both ‖B‖ and ‖B⁻¹‖ are very large.
398
Exercise 15.1.12
(i) We have c(A) = ‖A‖‖A⁻¹‖ ≥ ‖AA⁻¹‖ = ‖I‖ = 1.
(ii) We have
c(λA) = ‖λA‖‖(λA)⁻¹‖ = ‖λA‖‖λ⁻¹A⁻¹‖ = |λ||λ⁻¹|‖A‖‖A⁻¹‖ = c(A).
399
Exercise 15.1.13
Let us use an orthonormal basis e1, e2, . . . , en. Observe that
n⁻² Σ_{i,j} |aij| ≤ n⁻² Σ_{i,j} max_{r,s} |ars| = max_{r,s} |ars|
and that
|ars| ≤ (Σ_{j=1}^n |ajs|²)^{1/2} = ‖Σ_{j=1}^n ajs ej‖ = ‖αes‖ ≤ ‖α‖‖es‖ = ‖α‖.
If x = Σ_{j=1}^n xj ej, then |xj| = |〈x, ej〉| ≤ ‖x‖, so
‖αx‖ = ‖Σ_{r=1}^n xr α(er)‖ ≤ Σ_{r=1}^n |xr|‖α(er)‖
≤ ‖x‖ Σ_{r=1}^n ‖α(er)‖ = ‖x‖ Σ_{r=1}^n ‖Σ_{j=1}^n ajr ej‖
≤ ‖x‖ Σ_{r=1}^n Σ_{j=1}^n |ajr|‖ej‖ = ‖x‖ Σ_{r=1}^n Σ_{j=1}^n |ajr|,
so
‖α‖ ≤ Σ_{i,j} |aij|.
Finally,
Σ_{i,j} |aij| ≤ Σ_{i,j} max_{r,s} |ars| = n² max_{r,s} |ars|.
Since
‖αx‖ ≤ (Σ_{i=1}^n Σ_{j=1}^n |aij|²)^{1/2}
for all ‖x‖ ≤ 1, we have
‖αy‖ ≤ (Σ_{i=1}^n Σ_{j=1}^n |aij|²)^{1/2} ‖y‖
for all y and
‖α‖ ≤ (Σ_{i,j} |aij|²)^{1/2}.
400
Exercise 15.2.2
A triangular matrix A has characteristic polynomial
χA(t) = Π_{j=1}^n (t − ajj),
so, if A is real, its characteristic equation has all its roots real. Thus, if α is triangularisable, its characteristic equation has all its roots real. Consider the matrix
B = ( C 0
      0 I )
with I the (n − 2) × (n − 2) identity matrix and
C = ( 0 −1
      1  0 ).
B has characteristic polynomial (t − 1)^{n−2}(t² + 1), so not all its roots are real. If β is the linear map corresponding to B with respect to some basis, then β is not triangularisable.
If dimV = 1 a 1 × 1 matrix is triangular so all endomorphisms aretriangularisable.
The fact that A = QR with R upper triangular and Q orthogonal does not imply that there is an invertible matrix P and an upper triangular matrix T with A = PTP⁻¹.
401
Exercise 15.2.4
Let A be the matrix of an endomorphism α with respect to someorthonormal basis. We can find α(n) ∈ L(U, U) with ‖α(n)− α‖ → 0as n → ∞ such that the characteristic equations of the α(n) have norepeated roots. Let A(n) be the matrix of α(n) with respect to thegiven basis. Then
max_{i,j} |aij(n) − aij| ≤ ‖α(n) − α‖ → 0
as n → ∞.
402
Exercise 15.3.2
We have
det(tI − A) = (t − 1/2)(t − 1/4),
so A has eigenvalues 1/2 and 1/4.
By induction on n,
Aⁿ(0, 1)ᵀ = (2^{−n+1}K, 0)ᵀ
for all n ≥ 1. Thus, if K = 2^N L, we have ‖A^N(0, 1)ᵀ‖ > L.
403
Exercise 15.3.3
We consider column vectors. Observe that, if x, y ∈ Rn and z =x+ iy, then
‖z‖² = Σ_{j=1}^n |zj|² = Σ_{j=1}^n (|xj|² + |yj|²) = ‖x‖² + ‖y‖².
Using the notation introduced in the first paragraph,
‖Aⁿz‖² = ‖Aⁿ(x + iy)‖² = ‖Aⁿx + iAⁿy‖² = ‖Aⁿx‖² + ‖Aⁿy‖²,
so
‖Aⁿx‖, ‖Aⁿy‖ → 0 ⇒ ‖Aⁿz‖ → 0.
The result follows.
404
Exercise 15.3.7
(i) We have
cij = Σ_{k=1}^n aik bkj = Σ_{k=1}^{j+s} aik bkj = Σ_{j+s+1 ≤ k ≤ i−r+1} aik bkj = 0
if i − r ≤ j + s + 1.
(ii) If D = (dij) is diagonal, then ‖D‖ = max |djj| = ρ(D).
405
Exercise 15.3.8
The computations by which we obtained the result for r, s ≥ 0 re-main valid and the result remains true.
406
Exercise 15.3.11
Let A be an n×n matrix with non-zero diagonal entries and b ∈ Fn
a column vector. Suppose that the equation
Ax = b
has the solution x∗. Let us write D for the n× n diagonal matrix withdiagonal entries the same as those of A and set B = A−D. If x0 ∈ Rn
and
xj+1 = D−1(b− Bxj),
then
‖xj − x∗‖ ≤ ‖(D⁻¹B)ʲ‖‖x0 − x∗‖
and ‖xj − x∗‖ → 0 whenever ρ(D⁻¹B) < 1. If ρ(D⁻¹B) ≥ 1, we can find an x0 such that ‖xj − x∗‖ ↛ 0.
To prove this, observe that −D⁻¹Bx∗ = D⁻¹b − x∗ and so
xj+1 − x∗ = D⁻¹(b − Bxj) − x∗ = −D⁻¹B(xj − x∗).
Thus
‖xn − x∗‖ ≤ ‖(D⁻¹B)ⁿ‖‖x0 − x∗‖.
If ρ(D⁻¹B) < 1, then ‖(D⁻¹B)ⁿ‖ → 0 and the result follows.
If ρ(D⁻¹B) ≥ 1, we can find an eigenvector e with eigenvalue having absolute value at least 1. If we set x0 = x∗ + e, convergence fails.
To prove the last part, suppose, if possible, that we can find an eigenvector y of D⁻¹B with eigenvalue λ such that |λ| ≥ 1. Then D⁻¹By = λy and so By = λDy. Thus
Σ_{j≠i} aij yj = λ aii yi
and so
|aii||yi| ≤ Σ_{j≠i} |aij||yj|
for each i and (since y ≠ 0)
|aii||yi| < Σ_{j≠i} |aij||yj|
for at least one value of i.
407
Summing over all i and interchanging the order of summation, weget
Σ_{i=1}^n |aii||yi| ≤ Σ_{i=1}^n Σ_{j≠i} |aij||yj| = Σ_{j=1}^n Σ_{i≠j} |aij||yj|
= Σ_{j=1}^n |yj| Σ_{i≠j} |aij| < Σ_{j=1}^n |yj||ajj| = Σ_{i=1}^n |aii||yi|,
which is absurd.
Thus all the eigenvalues of D⁻¹B have absolute value less than 1 and we may apply the first part of the question.
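The scheme xj+1 = D⁻¹(b − Bxj) can be sketched for a small strictly diagonally dominant system (the matrix and right-hand side below are hand-picked, not from the text):

```python
# Jacobi iteration for a strictly diagonally dominant 2×2 system;
# the data are chosen so that the exact solution is x* = (1, 2).
A = [[4.0, 1.0], [2.0, 5.0]]
b = [6.0, 12.0]

x = [0.0, 0.0]
for _ in range(60):
    # both components use the old iterate (Jacobi, not Gauss–Seidel)
    x = [(b[0] - A[0][1] * x[1]) / A[0][0],
         (b[1] - A[1][0] * x[0]) / A[1][1]]
print(x)  # converges to (1, 2)
```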
408
Exercise 15.4.3
If α has matrix A with respect to a given orthonormal basis, then α∗ has matrix A∗, αα∗ has matrix AA∗ and α∗α has matrix A∗A.
Thus αα∗ = α∗α ⇔ A∗A = AA∗.
409
Exercise 15.4.6
If α is diagonalisable with distinct eigenvalues λ1, λ2, . . . , λm, thenU is the direct sum of the spaces
Ej = {e : αe = λje}.
Moreover Ej ⊥ Ek, that is to say,
ej ∈ Ej, ek ∈ Ek ⇒ 〈ej, ek〉 = 0
when j ≠ k. It follows that, if πj is the (unique) orthogonal projection with πjU = Ej, then πjπk = 0 for all j ≠ k. If u ∈ U, we can write u uniquely as
⋆ u = e1 + e2 + . . . + em
with ej ∈ Ej. Since πjπk = 0 for all j ≠ k, we have πjek = πjπkek = 0 for k ≠ j. Thus, applying πj to both sides of ⋆, πju = ej. Thus
ιu = Σ_{j=1}^m πju
and
αu = Σ_{j=1}^m αej = Σ_{j=1}^m λjej = Σ_{j=1}^m λjπju
for all u ∈ U. It follows that
ι = π1 + π2 + . . . + πm and α = λ1π1 + λ2π2 + . . . + λmπm.
Conversely, if the stated conditions hold and we write Ej = πjU, we see that Ej is a subspace and πiEj = πiπjEj = 0 for i ≠ j. Since ι = π1 + π2 + . . . + πm, we have
u = ιu = Σ_{j=1}^m πju = Σ_{j=1}^m uj
with uj = πju ∈ Ej . On the other hand, if ej ∈ Ej and
Σ_{j=1}^m ej = 0,
then applying πi to both sides we get ei = 0 for all i. Thus
U = E1 ⊕E2 ⊕ . . .⊕ Em.
Moreover, since πjπk = 0 we have Ej ⊥ Ek for k 6= j.
Let Ej be an orthonormal basis for Ej. The set E = ⋃_{j=1}^m Ej is an orthonormal basis for U. Since α has a diagonal matrix with respect to this basis, α is normal.
410
α is self-adjoint if and only if there exist distinct λj ∈ R and orthogonal projections πj such that πkπj = 0 when k ≠ j,
ι = π1 + π2 + . . . + πm and α = λ1π1 + λ2π2 + . . . + λmπm.
411
Exercise 15.4.7
If α is unitary, then α∗ = α⁻¹ so
αα∗ = ι = α∗α
and α is normal.
β unitary ⇔ ββ∗ = ι
⇔ α−1α∗(α−1α∗)∗ = ι
⇔ α−1α∗α∗∗(α∗)−1 = ι
⇔ α−1α∗α(α∗)−1 = ι
⇔ α∗α = αα∗
⇔ α normal
412
Exercise 15.4.9
(i) Choose an orthonormal basis such that α is represented by a diagonal matrix D with djj = e^{iθj}, θj real. Let f(t) be represented by the diagonal matrix D(t) with djj(t) = e^{iθjt}. Then
‖f(t) − f(s)‖ ≤ max_j |e^{iθjt} − e^{iθjs}| ≤ max_j |θj||t − s|,
so f is continuous, f(t) is unitary, f(0) = ι, f(1) = α.
Recall that the unitary maps form a group. Let α = β−1γ. Since αis unitary we can find f as in the first paragraph. Set g(t) = βf(t).
(ii) By considering matrix representations, we know that det : L(U, U) →R is continuous. Thus, if f is as stated, det f(t) is a continuous func-tion of t. But det f(1) = −1, det f(0) = 1 so, by the intermediate valuetheorem, we can find an s ∈ [0, 1] such that det f(s) = 0 and so f(s) isnot invertible.
413
Exercise 16.1.5
If U is finite dimensional and
α(x,y) = 0 for all x ∈ U ⇒ y = 0,
then θR : U → U ′ is an isomorphism.
Proof. We observe that the stated condition tells us that θR is injectivesince
θR(y) = 0 ⇒ θR(y)(x) = 0 for all x ∈ U
⇒ α(x,y) = 0 for all x ∈ U
⇒ y = 0.
Since dimU = dimU ′, it follows that θR is an isomorphism. �
414
Exercise 16.1.9
Observe, that if x 6= 0, then at least one of the following two thingsmust occur:-
(A) x1 ≠ 0, so β((x1, x2)ᵀ, (0, 1)ᵀ) = x1 ≠ 0.
(B) x2 ≠ 0, so β((x1, x2)ᵀ, (1, 0)ᵀ) = −x2 ≠ 0.
Thus β is non-degenerate, but
β(x,x) = x1x2 − x2x1 = 0
for all x.
We cannot find an α of the type required. If α is degenerate, thenthere exists an x 6= 0 with α(x,y) = 0 for all y so, in particular,α(x,x) = 0.
415
Exercise 16.1.14
Observe that, with the notation of Definition 16.1.12,
q(u + v) + q(u − v) = α(u + v, u + v) + α(u − v, u − v)
= α(u, u + v) + α(v, u + v) + α(u, u − v) − α(v, u − v)
= α(u, u) + α(u, v) + α(v, u) + α(v, v) + α(u, u) − α(u, v) − α(v, u) + α(v, v)
= 2α(u, u) + 2α(v, v)
= 2(q(u) + q(v)).
416
Exercise 16.1.16
Observe that
Σ_{i=1}^n Σ_{j=1}^n xi bij xj = Σ_{j=1}^n Σ_{i=1}^n xj bij xi,
so
Σ_{i=1}^n Σ_{j=1}^n xi bij xj = Σ_{i=1}^n Σ_{j=1}^n xi ((bij + bji)/2) xj
and q is a quadratic form with associated symmetric matrix
A = ½(B + Bᵀ).
417
Exercise 16.1.17
{x ∈ R³ : x1² + x2² + x3² = 1}
is a sphere.
{x ∈ R³ : x1² + x2² + x3² = 0}
is a point.
{x ∈ R³ : x1² + x2² + x3² = −1} = ∅.
{x ∈ R³ : x1² + x2² − x3² = 1}
is a one sheeted hyperboloid of revolution obtained by revolving the hyperbola
{x ∈ R³ : x1² − x3² = 1, x2 = 0}
about its semi-minor axis Ox3.
{x ∈ R³ : x1² + x2² − x3² = 0}
is a circular cone.
{x ∈ R³ : x1² + x2² − x3² = −1}
is a two sheeted hyperboloid of revolution obtained by revolving the hyperbola
{x ∈ R³ : x1² − x3² = −1, x2 = 0}
about its semi-major axis Ox3. (The two sheeted hyperboloid has two connected components. The one sheeted hyperboloid has one.)
{x ∈ R³ : x1² + x2² = 1}
is a circular cylinder.
{x ∈ R³ : x1² + x2² = 0}
is the straight line along the axis Ox3.
{x ∈ R³ : x1² + x2² = −1} = ∅.
{x ∈ R³ : x1² − x2² = 1} and {x ∈ R³ : x1² − x2² = −1}
are hyperbolic cylinders.
{x ∈ R³ : x1² − x2² = 0}
is a pair of planes intersecting at right angles.
{x ∈ R³ : x1² = 1}
418
is a pair of parallel planes.
{x ∈ R³ : x1² = 0}
is a plane.
{x ∈ R³ : x1² = −1} = ∅.
{x ∈ R³ : 0 = 1} = {x ∈ R³ : 0 = −1} = ∅
and
{x ∈ R³ : 0 = 0} = R³.
419
Exercise 16.1.23
By a rotation (which changes neither volume nor determinant) we may suppose that our ellipsoid is
Σ_{i=1}^n di yi² ≤ L.
By successive scale changes wi = di^{1/2} yi, each of which multiplies volume by di^{1/2}, we can transform our ellipsoid to
Σ_{i=1}^n wi² ≤ L.
Now make a scale change in every direction, ui = L^{−1/2} wi, to obtain the unit sphere of volume Vn.
Our original volume must have been L^{n/2} Π_{j=1}^n dj^{−1/2} Vn, i.e.
(det A)^{−1/2} L^{n/2} Vn.
420
Exercise 16.1.25
We know that
fY(y) = K exp(−½ Σ_{j=1}^n dj yj²)
is a density function and so
∫_{Rⁿ} fY(y) dV(y) = 1.
Thus
1 = K ∫_{−∞}^∞ ∫_{−∞}^∞ . . . ∫_{−∞}^∞ exp(−½ Σ_{j=1}^n dj yj²) dy1 dy2 . . . dyn
= K ∫_{−∞}^∞ ∫_{−∞}^∞ . . . ∫_{−∞}^∞ Π_{j=1}^n exp(−dj yj²/2) dy1 dy2 . . . dyn
= K Π_{j=1}^n ∫_{−∞}^∞ exp(−dj yj²/2) dyj
= K Π_{j=1}^n (dj^{−1/2} ∫_{−∞}^∞ exp(−tj²/2) dtj)
= K (2π)^{n/2} Π_{j=1}^n dj^{−1/2}.
Thus
K = (2π)^{−n/2} (Π_{j=1}^n dj)^{1/2}.
421
Exercise 16.1.26
(i) Suppose α sesquilinear, β Hermitian, γ skew-Hermitian and
α = β + γ.
Then
α(u, v) = β(u, v) + γ(u, v)
and
α(v, u)∗ = (β(v, u) + γ(v, u))∗ = β(u, v) − γ(u, v).
Thus
β(u, v) = ½(α(u, v) + α(v, u)∗)
γ(u, v) = ½(α(u, v) − α(v, u)∗).
Conversely, if α is sesquilinear, the definitions
β(u, v) = ½(α(u, v) + α(v, u)∗)
γ(u, v) = ½(α(u, v) − α(v, u)∗)
give sesquilinear forms (the point at issue is that, if (u, v) ↦ α(u, v) is sesquilinear, so is the map (u, v) ↦ α(v, u)∗) with
β(v, u) = ½(α(v, u) + α(u, v)∗) = β(u, v)∗
γ(v, u) = ½(α(v, u) − α(u, v)∗) = −γ(u, v)∗.
(ii) Since α(u, v) = α(v, u)∗ for all u, v, we have α(u, u) = α(u, u)∗, so α(u, u) is real for all u ∈ U.
(iii) Set β = iα. Then
α skew-Hermitian ⇔ α(v, u) = −α(u, v)∗ ∀v, u
⇔ β(v, u) = iα(v, u) = −iα(u, v)∗ = (iα(u, v))∗ = β(u, v)∗ ∀v, u
⇔ β Hermitian.
(iv) Observe that
α(u + v, u + v) = α(u, u) + α(u, v) + α(v, u) + α(v, v)
= α(u, u) + α(u, v) + α(u, v)∗ + α(v, v)
= α(u, u) + 2ℜα(u, v) + α(v, v).
Thus
α(u − v, u − v) = α(u, u) − 2ℜα(u, v) + α(v, v)
and
α(u + v, u + v) − α(u − v, u − v) = 4ℜα(u, v).
It follows that
α(u + iv, u + iv) − α(u − iv, u − iv) = 4ℜα(u, iv) = 4ℜ(−iα(u, v)) = 4ℑα(u, v).
The result follows.
422
(v) Choose any orthonormal basis f1, f2, . . . , fn for U. Define a matrix A = (apq) by the formula
apq = α(fp, fq).
By sesquilinearity,
α(Σ_{p=1}^n zp fp, Σ_{q=1}^n wq fq) = Σ_{p=1}^n Σ_{q=1}^n zp apq wq∗
for all zp, wq ∈ C. Since α is Hermitian,
aqp = α(fq, fp) = α(fp, fq)∗ = apq∗.
Thus A is Hermitian and we can find a unitary matrix M such that M∗AM = D, where D is a real diagonal matrix whose entries are the eigenvalues of A appearing with the appropriate multiplicities. If we set eq = Σ_{p=1}^n mpq fp, then (since M is unitary) e1, e2, . . . , en are an orthonormal basis and, by direct calculation,
α(Σ_{r=1}^n zr er, Σ_{s=1}^n ws es) = Σ_{t=1}^n dt zt wt∗
for all zr, ws ∈ C.
423
Exercise 16.1.27
Suppose that U is a vector space over C and α : U × U → C is a sesquilinear form. Let us set
θR(w)z = α(z,w)
for z, w ∈ U .
We observe that
θR(w)(λ1z1 + λ2z2) = α(λ1z1 + λ2z2,w)
= λ1α(z1,w) + λ2α(z2,w)
= λ1θR(w)z1 + λ2θR(w)z2
for all λ1, λ2 ∈ C and z1, z2 ∈ U . Thus θR(w) ∈ U ′ for all w ∈ U .
Now
θR(λ1w1 + λ2w2)z = α(z, λ1w1 + λ2w2)
= λ1∗α(z, w1) + λ2∗α(z, w2)
= λ1∗θR(w1)z + λ2∗θR(w2)z
= (λ1∗θR(w1) + λ2∗θR(w2))z
for all z ∈ U. Thus
θR(λ1w1 + λ2w2) = λ1∗θR(w1) + λ2∗θR(w2)
and θR : U → U ′ is an anti-linear map.
Suppose in addition we know that U is finite dimensional and that
α(z,w) = 0 for all z ∈ U ⇒ w = 0.
Then θR is injective since
θR(w) = 0 ⇒ θR(w)(z) = 0 for all z ∈ U
⇒ α(z, w) = 0 for all z ∈ U
⇒ w = 0.
Since dimU = dimU ′, it follows that θR is an isomorphism.
Alternatively, observe that, if e1, e2, . . . , en form a basis for U,
Σ_{j=1}^n λj θRej = 0 ⇒ θR(Σ_{j=1}^n λj∗ej) = 0
⇒ Σ_{j=1}^n λj∗ej = 0
⇒ λj∗ = 0 ∀j ⇒ λj = 0 ∀j,
so θRe1, θRe2, . . . , θRen are linearly independent, so a basis for U′, so span U′. Thus θR is surjective.
424
Exercise 16.2.4
Let f1, f2, . . . , fn be the basis associated with S. Define
α(Σ_{i=1}^n xi fi, Σ_{j=1}^n yj fj) = Σ_{k=1}^n xk yk
for all xi, yj ∈ R. Then, by inspection, α is a symmetric form on Rn.
By Theorem 16.2.1 we can find a basis e1, e2, . . . , en and positiveintegers p and m with p+m ≤ n such that
α(Σ_{i=1}^n xi ei, Σ_{j=1}^n yj ej) = Σ_{k=1}^p xk yk − Σ_{k=p+1}^{p+m} xk yk
for all xi, yj ∈ R. We have
q(Σ_{i=1}^n yi ei) = α(Σ_{i=1}^n yi ei, Σ_{j=1}^n yj ej) = Σ_{k=1}^p yk² − Σ_{k=p+1}^{p+m} yk²
as required.
425
Exercise 16.2.8
A is two quadrants, B two lines at right angles.
(1, 0), (−1, 1) ∈ A, but (0, 1) = (1, 0) + (−1, 1) /∈ A. Thus A is nota subspace.
(1, 0), (−1, 1) ∈ B, but (0, 1) = (1, 0) + (−1, 1) ∉ B. Thus B is not a subspace.
426
Exercise 16.2.9
(i) Choose M orthogonal so that MTAM = D with D diagonal.Since matrix rank is unchanged by multiplication on the left or rightby invertible matrices, D has the same matrix rank as A. But thesignature rank of A and the matrix rank of D are both the numberof non-zero entries of D. Thus the signature and matrix ranks areidentical.
(ii) We continue with the notation of (i). The rank of q is the numberof non-zero characteristic roots (multiple roots counted multiply). Now
signature q = (no. strictly pos. entries D)−(no. strictly neg. entries D),
so the signature of D is the number of strictly positive characteristicroots minus the number of strictly negative characteristic roots of A(multiple roots counted multiply).
427
Exercise 16.2.10
(i) Observe that
γ(Pz) = (Pz)ᵀA(Pz) = (zᵀPᵀ)A(Pz) = zᵀ(PᵀAP)z.
(ii) We cannot use results on the diagonalisation of real symmetricmatrices or Hermitian matrices, but we can use completing the square.
(α) Suppose that A = (aij)1≤i,j≤n is a symmetric n × n matrix with a11 ≠ 0, and we set bij = a11⁻¹(a11aij − a1ia1j). Then B = (bij)2≤i,j≤n is a symmetric matrix. Further, if z ∈ Cⁿ and (after choosing one square root a11^{1/2} of a11) we set
w1 = a11^{1/2}(z1 + Σ_{j=2}^n a11⁻¹a1j zj)
and wj = zj otherwise, we have
Σ_{i=1}^n Σ_{j=1}^n zi aij zj = w1² + Σ_{i=2}^n Σ_{j=2}^n wi bij wj.
(β) Suppose that A = (aij)1≤i,j≤n is a symmetric n × n matrix. Suppose further that σ : {1, 2, . . . , n} → {1, 2, . . . , n} is a permutation (that is to say, σ is a bijection) and we set cij = aσ(i)σ(j). Then C = (cij)1≤i,j≤n is a symmetric matrix. Further, if z ∈ Cⁿ and we set wj = zσ(j), we have
Σ_{i=1}^n Σ_{j=1}^n zi aij zj = Σ_{i=1}^n Σ_{j=1}^n wi cij wj.
(γ) Suppose that n ≥ 2, A = (aij)1≤i,j≤n is a symmetric n×n matrixand a11 = a22 = 0, but a12 6= 0. Then there exists a symmetric n × nmatrix C with c11 6= 0 such that, if z ∈ Cn and we set w1 = (z1+z2)/2,w2 = (z1 − z2)/2, wj = zj for j ≥ 3, we have
Σ_{i=1}^n Σ_{j=1}^n zi aij zj = Σ_{i=1}^n Σ_{j=1}^n wi cij wj.
Combining (α), (β) and (γ) we have the following.
(δ) Suppose that A = (aij)1≤i,j≤n is a non-zero symmetric n × nmatrix. Then we can find an n× n invertible matrix M = (mij)1≤i,j≤n
and a symmetric (n − 1) × (n − 1) matrix B = (bij)2≤i,j≤n such that, if z ∈ Cⁿ and we set wi = Σ_{j=1}^n mij zj, then
Σ_{i=1}^n Σ_{j=1}^n zi aij zj = w1² + Σ_{i=2}^n Σ_{j=2}^n wi bij wj.
428
Repeated use of (δ) gives the desired result.
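Step (α) can be sanity-checked numerically. The sample matrix and vector below are arbitrary test data; the w1 and bij formulas follow the step above (with a11 > 0 so that a real square root exists):

```python
import math

# One completing-the-square step as in (α), checked on an arbitrary
# real symmetric matrix with a11 > 0 and an arbitrary real vector.
A = [[2.0, 1.0, 3.0], [1.0, 4.0, 0.0], [3.0, 0.0, 5.0]]
z = [1.5, -2.0, 0.7]
n = 3

quad = sum(z[i] * A[i][j] * z[j] for i in range(n) for j in range(n))

w1 = math.sqrt(A[0][0]) * (z[0] + sum(A[0][j] * z[j] for j in range(1, n)) / A[0][0])
B = [[(A[0][0] * A[i][j] - A[0][i] * A[0][j]) / A[0][0] for j in range(1, n)]
     for i in range(1, n)]
rest = sum(z[1 + i] * B[i][j] * z[1 + j]
           for i in range(n - 1) for j in range(n - 1))

print(abs(quad - (w1 ** 2 + rest)) < 1e-12)  # True
```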
(iii) Observe that if P is invertible so is P T , so (using matrix rank)
rank(P TAP ) = rankA.
Thus, if
γ(Pz) = Σ_{u=1}^r zu²,
we have r = rankA.
(iv) Observe that γ(iz) = −γ(z).
(v) By (i) and (ii) we can choose a basis ej for Cn so that
γ(Σ_{j=1}^n zj ej) = Σ_{u=1}^r zu².
Let s be the integer part of r/2 and let F be the subspace spanned by the vectors e_{2k−1} − ie_{2k} with 1 ≤ k ≤ s and the vectors el with r + 1 ≤ l ≤ n. Then F has dimension at least m and any subspace E of F with dimension m will have the desired properties.
(vi) Let E0 = Cᵐ. Using (v) and proceeding inductively, we can find subspaces Ej such that Ej is a subspace of Ej−1, dim Ej ≥ [2⁻ʲn] (using the standard integer part notation) and γj|Ej = 0 [1 ≤ j ≤ k]. We have dim Ek ≥ 1, so we may choose a non-zero z ∈ Ek to obtain the required result.
429
Exercise 16.2.12
If a = 0, we have Example 16.2.11. The rank is 3 and the signature −1.
If a ≠ 0 then, setting y1 = x1 + 2⁻¹a⁻¹(x2 + x3), y2 = x2, y3 = x3, we have
x1x2 + x2x3 + x3x1 + ax1² = ay1² − 4⁻¹a⁻¹y2² − 4⁻¹a⁻¹y3² + (1 − 2⁻¹a⁻¹)y2y3.
Setting u1 = y1, u2 = y2 − 2a(1 − 2⁻¹a⁻¹)y3, u3 = y3, we have
x1x2 + x2x3 + x3x1 + ax1² = au1² − 4⁻¹a⁻¹u2² + (a − 1)u3².
If a = 1, q has rank 2 and signature 0.
If a > 1, q has rank 3 and signature 1.
If 0 < a < 1 or a < 0, q has rank 3 and signature −1.
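As a cross-check on the completed-square computation, one consistent decomposition is q = a·y1² − 4⁻¹a⁻¹u2² + (a − 1)u3², with y1 = x1 + (x2 + x3)/(2a) and u2 = x2 − (2a − 1)x3; the sketch below verifies this identity on random samples (the particular values of a and x are arbitrary, and the decomposition was derived independently, so treat it as a cross-check rather than the book's own):

```python
import random

# Cross-check q = a*y1² − u2²/(4a) + (a − 1)*x3² against
# q = x1x2 + x2x3 + x3x1 + a*x1² on random samples.
random.seed(2)
for _ in range(100):
    a = random.choice([-2.0, 0.5, 1.0, 3.0])
    x1, x2, x3 = (random.uniform(-3, 3) for _ in range(3))
    q = x1 * x2 + x2 * x3 + x3 * x1 + a * x1 ** 2
    y1 = x1 + (x2 + x3) / (2 * a)
    u2 = x2 - (2 * a - 1) * x3
    assert abs(q - (a * y1 ** 2 - u2 ** 2 / (4 * a) + (a - 1) * x3 ** 2)) < 1e-9
print("decomposition verified on all samples")
```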
430
Exercise 16.2.13
(i) We know that we can find a unitary matrix U such that C = U∗AU is diagonal with real entries. Let D be the diagonal matrix with dii = 1 if cii = 0 and dii = |cii|^{−1/2} (positive square root) otherwise. Setting P = UD, we see that P∗AP is diagonal with diagonal entries taking the values 1, −1 or 0.
(ii) We have
(Σ_{r=1}^n Σ_{s=1}^n zr ars zs∗)∗ = Σ_{r=1}^n Σ_{s=1}^n zr∗ ars∗ zs = Σ_{r=1}^n Σ_{s=1}^n zr∗ asr zs = Σ_{r=1}^n Σ_{s=1}^n zr ars zs∗,
so Σ_{r=1}^n Σ_{s=1}^n zr ars zs∗ is real.
(iii) If P1∗AP1 has k entries 1, then, if E is the subspace of Cⁿ spanned by the vectors P1e with e a column vector with ith entry 1 and all other entries 0, where the ith diagonal entry of P1∗AP1 is 1, we have
dim E = k, z∗Az > 0 ∀z ∈ E \ {0}.
If P2∗AP2 has k′ entries 1, then, if F is the subspace of Cⁿ spanned by the vectors P2e with e a column vector with ith entry 1 and all other entries 0, where the ith diagonal entry of P2∗AP2 is −1 or 0, we have
dim F = n − k′, z∗Az ≤ 0 ∀z ∈ F.
Thus E ∩ F = {0}, dim E + dim F ≤ n and n − k′ + k ≤ n. It follows that k ≤ k′. The same argument shows that k′ ≤ k, so k = k′.
In the same way (or by considering −A) we see that P1∗AP1 and P2∗AP2 have the same number of entries −1 and so, since they have the same number of entries 1 and −1, must have the same number of entries 0 along the diagonal.
431
Exercise 16.3.2
(i) Let U be a vector space over R. A quadratic form q : U → R issaid to be negative semi-definite if
q(u) ≤ 0 for all u ∈ U
and strictly negative definite if
q(u) < 0 for all u ∈ U with u 6= 0.
By inspection q : U → R is strictly negative definite (respectivelynegative semi-definite) if and only if −q is strictly positive definite(respectively positive semi-definite).
(ii) If q is a quadratic form over a finite dimensional real vector space,then we can find a basis e1, e2, . . . , en such that
q(Σ_{j=1}^n xj ej) = Σ_{j=1}^r xj² − Σ_{j=r+1}^{r+s} xj².
Setting
q1(Σ_{j=1}^n xj ej) = Σ_{j=1}^r xj² and q2(Σ_{j=1}^n xj ej) = −Σ_{j=r+1}^{r+s} xj²,
we see that q = q1 + q2 with q1 a positive semi-definite quadratic form and q2 a negative semi-definite quadratic form.
The observation
0 = 0 + 0 = x² − x²
shows that the decomposition is not unique even for a space of dimen-sion 1.
432
Exercise 16.3.3
We have
Σ_{i=1}^n Σ_{j=1}^n ci(EXiXj)cj = E(Σ_{j=1}^n cjXj)² ≥ 0
with equality if and only if
Pr(Σ_{j=1}^n cjXj = 0) = 1.
433
Exercise 16.3.4
We know that we can find a basis ej such that
q(Σ_{j=1}^n xj ej) = Σ_{j=1}^u xj² − Σ_{j=u+1}^v xj².
If v < n, then q(en) = 0 and q is not strictly positive definite. Ifv ≥ u+ 1, q(eu+1) = −1 so q is not positive semi-definite.
By inspection, u = v = n implies q strictly positive definite, so q is strictly positive definite if and only if u = v = n, i.e. q has rank and signature n.
By inspection, u = v implies q positive semi-definite so q is positivesemi-definite if and only if u = v, i.e. q has rank and signature equal.
Finally q is negative semi-definite if and only if −q is positive semi-definite, i.e. q has signature equal to minus its rank.
434
Exercise 16.3.5
The conditions
(i) α(u,v) = α(v,u)
(ii) α(λu,v) = λα(u,v)
(iii) α(u+w,v) = α(u,v) + α(w,v)
(for all u, v, w ∈ U and λ ∈ R) say that α is a symmetric bilinear form.
The additional conditions
(iv) α(u,u) ≥ 0 ∀u ∈ U
(v) α(u,u) = 0 ⇒ u = 0
say that α gives rise to a positive definite quadratic form.
Conditions (i) to (v) say that α is an inner product.
435
Exercise 16.3.6
By translation we may take a = 0 and by subtracting a constant wemay take f(0) = 0.
Since f is smooth we know by the local Taylor’s theorem that
f(h) = (∂f/∂xi)(0)hi + (∂²f/∂xi∂xj)(0)hihj + ǫ(h)‖h‖²
with ǫ(h) → 0 as ‖h‖ → 0. (We use the summation convention.)
(i) Suppose we have a minimum at 0. If ei is the vector with 1 in the ith place and 0 elsewhere,
(∂f/∂xi)(0) = lim_{h→0+} f(hei)/h ≥ 0
and
(∂f/∂xi)(0) = lim_{h→0−} f(hei)/h ≤ 0,
so (∂f/∂xi)(0) = 0.
We now have
f(h) = Hijhihj + ǫ(h)‖h‖²
with H the Hessian. Let u be any non-zero vector. We have
0 ≤ f(hu) = h²(Hijuiuj + ǫ(hu)‖u‖²),
so
0 ≤ Hijuiuj + ǫ(hu)‖u‖²
and, allowing h → 0,
0 ≤ Hijuiuj.
Thus H is positive semi-definite.
(ii) We have
f(h) = Hijhihj + ǫ(h)‖h‖²
with H the Hessian. Since u ↦ Hijuiuj is continuous and the surface S of the unit ball is compact, the quadratic form attains its minimum on S, and that minimum is strictly positive, so there exists an η > 0 with
Hijuiuj ≥ η
for all ‖u‖ = 1. Thus
f(h) ≥ ‖h‖²(η − |ǫ(h)|).
There exists a δ > 0 such that |ǫ(h)| < η/2 for all ‖h‖ < δ. If 0 < ‖h‖ < δ, then f(h) ≥ (η/2)‖h‖² > 0. Thus f has a strict minimum at 0.
436
(iii) Define f : R → R by f(x) = x⁴. We have a strict minimum at 0, but the Hessian H = (f′′(0)) = (0) is not strictly positive definite.
437
Exercise 16.3.7
Since
(∂f/∂x(x, y), ∂f/∂y(x, y)) = (cos x sin y, sin x cos y),
the stationary points are
(x, y) = (nπ, mπ) and (x, y) = ((n + ½)π, (m + ½)π).
The Hessian is
H = ( −sin x sin y   cos x cos y
      cos x cos y   −sin x sin y ),
so, if (x, y) = (nπ, mπ),
H = ( 0           (−1)^{n+m}
      (−1)^{n+m}  0 ),
which is neither positive semi-definite nor negative semi-definite (note, for example, that 4xy = (x + y)² − (x − y)²) and we have a saddle, whilst, if (x, y) = ((n + ½)π, (m + ½)π),
H = ( −(−1)^{n+m}  0
      0            −(−1)^{n+m} ),
which is strictly positive definite if n + m is odd (so we have a minimum) and strictly negative definite if n + m is even (so we have a maximum).
438
Exercise 16.3.11
We have
Aᵀ = (LLᵀ)ᵀ = (Lᵀ)ᵀLᵀ = LLᵀ = A,
so A is symmetric.
Observe that
xT Ax = xT LLTx = ‖LTx‖2 ≥ 0
so A is symmetric positive semi-definite. The case L = 0 shows that Aneed not be strictly positive definite.
If L is lower triangular with non-zero diagonal entries, then detL 6=0, so L is non-singular and
xTAx = 0 ⇒ xTLLTx = 0
⇒ ‖LTx‖2 = 0
⇒ ‖LTx‖ = 0
⇒ LTx = 0
⇒ x = 0,
so A = LLT is a symmetric strictly positive definite matrix.
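The inequality xᵀ(LLᵀ)x = ‖Lᵀx‖² ≥ 0 can be spot-checked numerically for an arbitrary sample L (the matrix and samples below are illustrative assumptions):

```python
import random

# Numeric spot-check that x ↦ x^T (LL^T) x is non-negative.
L = [[1.0, 0.0], [-3.0, 2.0]]
A = [[sum(L[i][k] * L[j][k] for k in range(2)) for j in range(2)]
     for i in range(2)]

random.seed(1)
vals = []
for _ in range(200):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    vals.append(sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2)))
print(min(vals) >= -1e-12)  # True: the quadratic form is non-negative
```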
439
Exercise 16.3.12
If, at the first stage a11 < 0, then A is not positive definite.
If a11 > 0, then either the Schur matrix B is positive definite, in which case A is, or the Schur matrix B is not positive definite, in which case A is not. (Consider vectors whose first entry is 0.)
If at each stage, the upper corner entry is strictly positive then ourproof shows that A is positive definite. If at some stage it is not, weknow that A is not positive definite.
440
Exercise 16.3.14
We need to look at l1 = (2, −3, 1)ᵀ. Now
A − l1l1ᵀ = ( 4 −6  2      ( 4 −6  2      ( 0  0  0
             −6  8 −5   −   −6  9 −3   =    0 −1 −2
              2 −5 14 )      2 −3  1 )      0 −2 13 ).
The Schur matrix
B = ( −1 −2
      −2 13 )
has strictly negative upper corner entry, so A is not positive semi-definite.
441
Exercise 16.3.15
(i) Look at the first formula of the proof of part (ii) of Theorem 16.3.10.
(ii) Looking at the proof of part (i) of Theorem 16.3.10, we see that the computation of l involves roughly n operations and the computation of B involves roughly n² operations. Thus the reduction of our problem from n × n matrices to (n − 1) × (n − 1) matrices involves roughly 2n² operations and the total number of operations required is roughly
2 Σ_{r=1}^n r² ≈ (2/3)n³
[If the reader’s definition of an operation agrees with mine, this argu-ment shows that we can certainly take K = 1 (and, if we really needed,as close to 2/3 as we wished) for n large.]
442
Exercise 16.3.16
If we take l1 = (1, 1/2, 1/3)^T, then
\[
H_3 - l_1 l_1^T =
\begin{pmatrix} 1 & \frac12 & \frac13 \\ \frac12 & \frac13 & \frac14 \\ \frac13 & \frac14 & \frac15 \end{pmatrix}
-
\begin{pmatrix} 1 & \frac12 & \frac13 \\ \frac12 & \frac14 & \frac16 \\ \frac13 & \frac16 & \frac19 \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac1{12} & \frac1{12} \\ 0 & \frac1{12} & \frac4{45} \end{pmatrix}.
\]
If we take l2 = ((1/12)^{1/2}, (1/12)^{1/2})^T, then
\[
\begin{pmatrix} \frac1{12} & \frac1{12} \\ \frac1{12} & \frac4{45} \end{pmatrix} - l_2 l_2^T
= \begin{pmatrix} 0 & 0 \\ 0 & \frac1{180} \end{pmatrix}.
\]
Thus H3 = LL^T with
\[
L = \begin{pmatrix} 1 & 0 & 0 \\ 2^{-1} & 12^{-1/2} & 0 \\ 3^{-1} & 12^{-1/2} & 180^{-1/2} \end{pmatrix}.
\]
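The hand computation can be compared with a library factorisation (a check I have added; numpy's cholesky returns the lower triangular factor, as here):

```python
import numpy as np

H3 = np.array([[1, 1/2, 1/3],
               [1/2, 1/3, 1/4],
               [1/3, 1/4, 1/5]])
L = np.array([[1, 0, 0],
              [1/2, 12**-0.5, 0],
              [1/3, 12**-0.5, 180**-0.5]])

assert np.allclose(np.linalg.cholesky(H3), L)  # matches the hand computation
assert np.allclose(L @ L.T, H3)                # and L L^T recovers H3
```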
Second part
If we take l1 = (1, 1/2, 1/3, 1/4)^T, then
\[
H_4 - l_1 l_1^T =
\begin{pmatrix} 1 & \frac12 & \frac13 & \frac14 \\ \frac12 & \frac13 & \frac14 & \frac15 \\ \frac13 & \frac14 & \frac15 & \frac16 \\ \frac14 & \frac15 & \frac16 & \lambda \end{pmatrix}
-
\begin{pmatrix} 1 & \frac12 & \frac13 & \frac14 \\ \frac12 & \frac14 & \frac16 & \frac18 \\ \frac13 & \frac16 & \frac19 & \frac1{12} \\ \frac14 & \frac18 & \frac1{12} & \frac1{16} \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & \frac1{12} & \frac1{12} & \frac3{40} \\ 0 & \frac1{12} & \frac4{45} & \frac1{12} \\ 0 & \frac3{40} & \frac1{12} & \lambda - \frac1{16} \end{pmatrix}.
\]
If we take l2 = (12^{-1/2}, 12^{-1/2}, 12^{1/2}·(3/40))^T, then
\[
\begin{pmatrix} \frac1{12} & \frac1{12} & \frac3{40} \\ \frac1{12} & \frac4{45} & \frac1{12} \\ \frac3{40} & \frac1{12} & \lambda - \frac1{16} \end{pmatrix}
- l_2 l_2^T =
\begin{pmatrix} \frac1{12} & \frac1{12} & \frac3{40} \\ \frac1{12} & \frac4{45} & \frac1{12} \\ \frac3{40} & \frac1{12} & \lambda - \frac1{16} \end{pmatrix}
-
\begin{pmatrix} \frac1{12} & \frac1{12} & \frac3{40} \\ \frac1{12} & \frac1{12} & \frac3{40} \\ \frac3{40} & \frac3{40} & \frac{27}{400} \end{pmatrix}
=
\begin{pmatrix} 0 & 0 & 0 \\ 0 & 180^{-1} & 120^{-1} \\ 0 & 120^{-1} & \lambda - \frac{13}{100} \end{pmatrix}.
\]
If we take l3 = (180^{-1/2}, 180^{1/2}·120^{-1})^T, then
\[
\begin{pmatrix} 180^{-1} & 120^{-1} \\ 120^{-1} & \lambda - \frac{13}{100} \end{pmatrix}
- l_3 l_3^T =
\begin{pmatrix} 0 & 0 \\ 0 & \lambda - \frac{57}{400} \end{pmatrix}.
\]
Thus λ0 = 57/400. We observe that
\[
\frac17 - \lambda_0 = \frac1{2800}.
\]
(The 'Hilbert matrices' Hn are 'only just positive definite' and 'only just invertible' and are often used to test computational methods.)
If λ ≥ λ0, then H4 = LL^T with
\[
L = \begin{pmatrix}
1 & 0 & 0 & 0 \\
2^{-1} & 12^{-1/2} & 0 & 0 \\
3^{-1} & 12^{-1/2} & 180^{-1/2} & 0 \\
4^{-1} & 12^{1/2}\cdot\frac3{40} & 180^{1/2}\cdot 120^{-1} & \bigl(\lambda - \frac{57}{400}\bigr)^{1/2}
\end{pmatrix}.
\]
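The arithmetic above can be confirmed exactly with rational arithmetic (a check of mine, not the book's): the symmetric elimination pivots of H4 with λ = 1/7, the true Hilbert entry, should be 1, 1/12, 1/180 and 1/7 − 57/400 = 1/2800.

```python
from fractions import Fraction

# Hilbert matrix H4 with its true (4,4) entry lambda = 1/7.
A = [[Fraction(1, i + j + 1) for j in range(4)] for i in range(4)]

pivots = []
for k in range(4):
    pivots.append(A[k][k])
    # Eliminate: Schur complement of the current upper corner entry.
    for i in range(k + 1, 4):
        for j in range(k + 1, 4):
            A[i][j] -= A[i][k] * A[k][j] / A[k][k]

assert pivots == [Fraction(1), Fraction(1, 12),
                  Fraction(1, 180), Fraction(1, 2800)]
assert Fraction(1, 7) - Fraction(1, 2800) == Fraction(57, 400)  # lambda_0
```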
Exercise 16.3.17
(i) The result is true when n = 1, since then the only root is a0, which is positive by hypothesis.
Suppose the result is true for m ≥ n ≥ 1 and P is a polynomial of the given form of degree m+1. We observe that P′ is a polynomial of the given form of degree m and so all its real zeros are strictly positive. Thus, if m+1 is even, P′(t) < 0 for t ≤ 0, so P is decreasing as t runs from −∞ to 0. But P(0) = a0 > 0, so P(t) > 0 for t ≤ 0 and P has no non-positive real zeros. If m+1 is odd, P′(t) > 0 for t ≤ 0, so P is increasing as t runs from −∞ to 0. But P(0) = −a0 < 0, so P(t) < 0 for t ≤ 0 and P has no non-positive real zeros.
The result follows by induction.
(ii) The 'only if' part is immediate. To obtain the 'if' part, apply part (i) to
\[
P(t) = \det(tI - A) = \prod_{j=1}^{n}(t - \lambda_j) = t^n + \sum_{j=0}^{n-1}(-1)^{n-j} a_j t^j,
\]
observing that
\[
a_{n-1} = \sum_{j=1}^{n} \lambda_j, \qquad a_{n-2} = \sum_{i<j} \lambda_i\lambda_j, \qquad a_{n-3} = \sum_{i<j<k} \lambda_i\lambda_j\lambda_k, \quad \dots.
\]
(iii) Let
\[
A = \begin{pmatrix} 1 & -3 \\ 3 & 1 \end{pmatrix}.
\]
Then
\[
\det(tI - A) = (t-1)^2 + 9 = t^2 - 2t + 10
\]
is a polynomial with no real roots (since (t−1)^2 + 9 > 0 for all real t).
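A quick numerical confirmation (my addition) that this characteristic polynomial has the stated coefficients but no real roots:

```python
import numpy as np

A = np.array([[1.0, -3.0], [3.0, 1.0]])
coeffs = np.poly(A)        # characteristic polynomial coefficients of A
roots = np.roots(coeffs)

assert np.allclose(coeffs, [1.0, -2.0, 10.0])   # t^2 - 2t + 10
assert np.all(np.abs(roots.imag) > 0)           # both roots non-real (1 ± 3i)
```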
(iv) Observe that
\[
\det(tI - A) = t^3 - b_2 t^2 + b_1 t - b_0
\]
with
\[
b_2 = a_{11} + a_{22} + a_{33},\qquad
b_1 = a_{11}a_{22} + a_{22}a_{33} + a_{33}a_{11} - a_{12}a_{21} - a_{23}a_{32} - a_{31}a_{13} > 0,\qquad
b_0 = \det A,
\]
and apply the results above.
Exercise 16.3.19
Suppose P is invertible with P^T A P and P^T B P diagonal. Since A is not positive semi-definite, P^T A P is not and, since A is not negative semi-definite, P^T A P is not either. Thus
\[
P^T A P = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
\]
with a and b non-zero and of opposite signs. Observing that
\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
= \begin{pmatrix} d & 0 \\ 0 & c \end{pmatrix}
\]
and
\[
\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}
\begin{pmatrix} c & 0 \\ 0 & d \end{pmatrix}
\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}
= \begin{pmatrix} \lambda^2 c & 0 \\ 0 & \mu^2 d \end{pmatrix},
\]
we see that there is a non-singular matrix Q with
\[
Q^T \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} Q = A
\]
and Q^T C Q diagonal whenever C is diagonal.
Setting M = PQ, we see that M is invertible, M^T A M = A and M^T B M is diagonal.
Let
\[
M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.
\]
Since M^T A M = A, we have
\[
\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = A = M^T A M
= \begin{pmatrix} a & c \\ b & d \end{pmatrix}
\begin{pmatrix} a & b \\ -c & -d \end{pmatrix}
= \begin{pmatrix} a^2 - c^2 & ab - cd \\ ba - cd & b^2 - d^2 \end{pmatrix}.
\]
Thus ab = cd, a^2 − c^2 = 1 and b^2 − d^2 = −1 (so, in particular, d ≠ 0).
On the other hand,
\[
M^T B M = \begin{pmatrix} a & c \\ b & d \end{pmatrix}
\begin{pmatrix} c & d \\ a & b \end{pmatrix}
= \begin{pmatrix} 2ac & ad + bc \\ bc + ad & 2bd \end{pmatrix},
\]
so, if M^T B M is diagonal, bc = −ad.
We now get
\[
c^2 d = abc = -a^2 d,
\]
so, since d ≠ 0, we have c = a = 0, which contradicts the condition a^2 − c^2 = 1.
(ii) Since det A, det B ≠ 0, A and B have rank 2. A has signature 0 by inspection. B has signature 0 since
\[
xy = \tfrac14\bigl((x+y)^2 - (x-y)^2\bigr).
\]
The transformation x ↦ Mx leaves the lines x = y and x = −y unchanged (or interchanges them), since these are the asymptotes to x^2 − y^2 = K for all K. It therefore gives a dilation along the directions x = y and x = −y (and possibly interchanges these two axes) and will therefore not transform xy = k in the desired manner.
(iii) C is strictly negative definite, and the same argument which gave simultaneous diagonalisation when one form is strictly positive definite will work if one form is strictly negative definite.
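To see the obstruction concretely (a numerical illustration of mine, not the book's argument): the hyperbolic rotations M = (cosh t, sinh t; sinh t, cosh t) satisfy M^T A M = A, but for these M^T B M has off-diagonal entry cosh 2t ≥ 1, so it is never diagonal.

```python
import numpy as np

A = np.diag([1.0, -1.0])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

t = 0.7
M = np.array([[np.cosh(t), np.sinh(t)],
              [np.sinh(t), np.cosh(t)]])

assert np.allclose(M.T @ A @ M, A)           # M preserves the form A
off = (M.T @ B @ M)[0, 1]
assert abs(off - np.cosh(2 * t)) < 1e-12     # off-diagonal entry = cosh 2t
assert off >= 1.0                            # so M^T B M is not diagonal
```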
Exercise 16.4.2
We have
\[
q\Bigl(\sum_{i=1}^{n} x_i e_i\Bigr)
= \alpha\Bigl(\sum_{i=1}^{n} x_i e_i, \sum_{j=1}^{n} x_j e_j\Bigr)
= \sum_{i=1}^{n}\sum_{j=1}^{n} x_i a_{ij} x_j
= \frac12 \sum_{i=1}^{n}\sum_{j=1}^{n} (x_i a_{ij} x_j + x_i a_{ji} x_j)
= \frac12 \sum_{i=1}^{n}\sum_{j=1}^{n} 0 = 0,
\]
since a_{ij} + a_{ji} = 0.
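A numerical illustration (my addition): the quadratic form of an antisymmetric matrix vanishes identically.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 6))
A = X - X.T                       # an antisymmetric matrix
x = rng.standard_normal(6)

assert np.allclose(A, -A.T)
assert abs(x @ A @ x) < 1e-10     # x^T A x = 0 up to rounding
```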
Exercise 16.4.3
Observe that
\[
(iA)^* = -iA^* = -iA^T = iA.
\]
Thus iA is Hermitian and so has real eigenvalues. If −λ is an eigenvalue of iA, then iλ = (−i)(−λ) is an eigenvalue of A = (−i)(iA). Thus the eigenvalues of A are purely imaginary.
Observe that
\[
(M^T B M)^T = M^T B^T M = -M^T B M,
\]
so the eigenvalues of M^T B M are purely imaginary. But M^T B M is a real diagonal matrix whose diagonal entries are its eigenvalues, which are thus real and so must be zero. Thus M^T B M = 0 and so B = 0.
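The first observation can be checked numerically (my sketch, assuming numpy): a real antisymmetric matrix has purely imaginary eigenvalues, and iA is Hermitian.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 5))
A = X - X.T                                    # a real antisymmetric matrix

assert np.allclose((1j * A).conj().T, 1j * A)  # iA is Hermitian
eigs = np.linalg.eigvals(A)
assert np.allclose(eigs.real, 0)               # eigenvalues purely imaginary
```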
Exercise 16.4.6
(i) Let u_j [1 ≤ j ≤ n] form a basis for an n-dimensional complex vector space U. Let α be the antisymmetric form associated with A for this basis.
Let e_j [1 ≤ j ≤ n] be the basis found in Theorem 16.4.5. If M is the appropriate basis change matrix, M^T A M has the required form.
(ii) Observe that 2m = rank M^T A M = rank A.
Exercise 16.4.7
Observe that, by Theorem 16.4.5, if α is non-singular, 2m = n, so n is even.
Exercise 16.4.8
(i) True. Since we can find a non-singular M such that M^T A M has m copies of
\[
\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
\]
along the diagonal and all other entries 0,
\[
\operatorname{rank} A = \operatorname{rank} M^T A M = 2m.
\]
(ii) False. Take
\[
A = \begin{pmatrix} 0 & 2 \\ -2 & 0 \end{pmatrix}.
\]
(iii) True. The non-real roots of a real polynomial occur in conjugate pairs. P_A is a real polynomial whose roots are the eigenvalues (over ℂ), and these are purely imaginary (see Exercise 16.4.3). Thus
\[
P_A(t) = t^{n-2m}\prod_{r=1}^{m}(t + i d_r)(t - i d_r) = t^{n-2m}\prod_{r=1}^{m}(t^2 + d_r^2)
\]
with d_r real.
(iv) False. Take
\[
A = \begin{pmatrix} 0 & 4 \\ -1 & 0 \end{pmatrix}.
\]
(v) True. Let A be the n × n matrix with the m matrices
\[
\begin{pmatrix} 0 & d_r \\ -d_r & 0 \end{pmatrix}
\]
along the diagonal and all other entries 0.
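Parts (i) and (iv) can be probed numerically (my sketch): random real antisymmetric matrices have even rank, while the (iv) counterexample has purely imaginary eigenvalues ±2i yet is not antisymmetric.

```python
import numpy as np

rng = np.random.default_rng(3)
for n in (3, 4, 5, 6):
    X = rng.standard_normal((n, n))
    A = X - X.T                                # real antisymmetric
    assert np.linalg.matrix_rank(A) % 2 == 0   # rank is even

A = np.array([[0.0, 4.0], [-1.0, 0.0]])
assert np.allclose(np.sort_complex(np.linalg.eigvals(A)), [-2j, 2j])
assert not np.allclose(A, -A.T)                # but A is not antisymmetric
```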
Exercise 16.4.9
We observe that
A is skew-Hermitian ⇔ A = iB where B is Hermitian.
(i) If A is skew-Hermitian, then, since iA is Hermitian, we can find a unitary P such that iP^*AP = P^*(iA)P is a real diagonal matrix D, and so P^*AP = −iD is diagonal with diagonal entries purely imaginary.
(ii) If A is skew-Hermitian, then, since iA is Hermitian, Exercise 16.2.13 tells us that we can find an invertible matrix P such that P^*(iA)P is diagonal with diagonal entries taking the values 1, −1 or 0, and so P^*AP is diagonal with diagonal entries taking the values i, −i or 0.
(iii) Suppose that A is a skew-Hermitian matrix and P_1, P_2 are invertible matrices such that P_1^*AP_1 and P_2^*AP_2 are diagonal with diagonal entries taking the values i, −i or 0. Then the number of entries of each type is the same for both diagonal matrices.
This follows at once from Exercise 16.2.13 (iii) by considering iA.
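Part (i) can be checked numerically (my sketch, using numpy's eigh on the Hermitian matrix iA):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = X - X.conj().T                 # skew-Hermitian: A* = -A

d, P = np.linalg.eigh(1j * A)      # unitary P with P*(iA)P = diag(d), d real
D = P.conj().T @ A @ P             # should equal -i diag(d)

assert np.allclose(D, np.diag(-1j * d))
assert np.allclose(D.real, 0)      # diagonal entries purely imaginary
```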