Applied Linear Algebra
Instructor’s Solutions Manual
by Peter J. Olver and Chehrzad Shakiban
Table of Contents
1. Linear Algebraic Systems
2. Vector Spaces and Bases
3. Inner Products and Norms
4. Minimization and Least Squares Approximation
5. Orthogonality
6. Equilibrium
7. Linearity
8. Eigenvalues
9. Linear Dynamical Systems
10. Iteration of Linear Systems
11. Boundary Value Problems in One Dimension
Solutions — Chapter 1
1.1.1.
(a) Reduce the system to x − y = 7, 3y = −4; then use Back Substitution to solve for x = 17/3, y = −4/3.
(b) Reduce the system to 6u + v = 5, −(5/2)v = 5/2; then use Back Substitution to solve for u = 1, v = −1.
(c) Reduce the system to p + q − r = 0, −3q + 5r = 3, −r = 6; then solve for p = 5, q = −11, r = −6.
(d) Reduce the system to 2u − v + 2w = 2, −(3/2)v + 4w = 2, −w = 0; then solve for u = 1/3, v = −4/3, w = 0.
(e) Reduce the system to 5x1 + 3x2 − x3 = 9, (1/5)x2 − (2/5)x3 = −2/5, 2x3 = −2; then solve for x1 = 4, x2 = −4, x3 = −1.
(f) Reduce the system to x + z − 2w = −3, −y + 3w = 1, −4z − 16w = −4, 6w = 6; then solve for x = 2, y = 2, z = −3, w = 1.
(g) Reduce the system to 3x1 + x2 = 1, (8/3)x2 + x3 = 2/3, (21/8)x3 + x4 = 3/4, (55/21)x4 = 5/7; then solve for x1 = 3/11, x2 = 2/11, x3 = 2/11, x4 = 3/11.
1.1.2. Plugging in the given values of x, y and z gives a + 2b − c = 3, a − 2 − c = 1, 1 + 2b + c = 2. Solving this system yields a = 4, b = 0, and c = 1.
♥ 1.1.3.
(a) With Forward Substitution, we just start with the top equation and work down. Thus 2x = −6 so x = −3. Plugging this into the second equation gives 12 + 3y = 3, and so y = −3. Plugging the values of x and y into the third equation yields −3 + 4(−3) − z = 7, and so z = −22.
(b) We will get a diagonal system with the same solution.
(c) Start with the last equation and, assuming the coefficient of the last variable is ≠ 0, use the operation to eliminate the last variable in all the preceding equations. Then, again assuming the coefficient of the next-to-last variable is non-zero, eliminate it from all but the last two equations, and so on.
(d) For the systems in Exercise 1.1.1, the method works in all cases except (c) and (f). Solving the reduced system by Forward Substitution reproduces the same solution (as it must):
(a) The system reduces to (3/2)x = 17/2, x + 2y = 3.
(b) The reduced system is (15/2)u = 15/2, 3u − 2v = 5.
(c) The method doesn't work since r doesn't appear in the last equation.
(d) Reduce the system to (3/2)u = 1/2, (7/2)u − v = 5/2, 3u − 2w = −1.
(e) Reduce the system to (2/3)x1 = 8/3, 4x1 + 3x2 = 4, x1 + x2 + x3 = −1.
(f) Doesn't work since, after the first reduction, z doesn't occur in the next-to-last equation.
(g) Reduce the system to (55/21)x1 = 5/7, x1 + (21/8)x2 = 3/4, x2 + (8/3)x3 = 2/3, x3 + 3x4 = 1.
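The Forward Substitution strategy of part (c) leaves a lower triangular system, which is then solved from the top equation down. A minimal Python sketch of the solving step (an illustration, not part of the manual; the data is the reduced system from (d)(b)):

```python
# Solve a lower triangular system L x = c from the top down,
# here (15/2) u = 15/2, 3u - 2v = 5, whose solution is u = 1, v = -1.
import numpy as np

L = np.array([[7.5,  0.0],
              [3.0, -2.0]])
c = np.array([7.5, 5.0])

def forward_substitute(L, c):
    """Solve L x = c for lower triangular L with nonzero diagonal."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n):
        # subtract the already-computed unknowns, then divide by the pivot
        x[i] = (c[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

x = forward_substitute(L, c)   # x = [1, -1], i.e. u = 1, v = -1
```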
1.2.1. (a) 3 × 4, (b) 7, (c) 6, (d) (−2 0 1 2), (e) (0, 2, −6)^T.
1.2.2. (a) [1 2 3; 4 5 6; 7 8 9], (b) [1 2 3; 1 4 5], (c) [1 2 3 4; 4 5 6 7; 7 8 9 3], (d) (1 2 3 4), (e) (1, 2, 3)^T, (f) (1).
1.2.3. x = −1/3, y = 4/3, z = −1/3, w = 2/3.
1.2.4.
(a) A = [1 −1; 1 2], x = (x, y)^T, b = (7, 3)^T;
(b) A = [6 1; 3 −2], x = (u, v)^T, b = (5, 5)^T;
(c) A = [1 1 −1; 2 −1 3; −1 −1 0], x = (p, q, r)^T, b = (0, 3, 6)^T;
(d) A = [2 1 2; −1 3 3; 4 −3 0], x = (u, v, w)^T, b = (3, −2, 7)^T;
(e) A = [5 3 −1; 3 2 −1; 1 1 2], x = (x1, x2, x3)^T, b = (9, 5, −1)^T;
(f) A = [1 0 1 −2; 2 −1 2 −1; 0 −6 −4 2; 1 3 2 −1], x = (x, y, z, w)^T, b = (−3, 3, 2, 1)^T;
(g) A = [3 1 0 0; 1 3 1 0; 0 1 3 1; 0 0 1 3], x = (x1, x2, x3, x4)^T, b = (1, 1, 1, 1)^T.
1.2.5.
(a) x − y = −1, 2x + 3y = −3. The solution is x = −6/5, y = −1/5.
(b) u + w = −1, u + v = −1, v + w = 2. The solution is u = −2, v = 1, w = 1.
(c) 3x1 − x3 = 1, −2x1 − x2 = 0, x1 + x2 − 3x3 = 1. The solution is x1 = 1/5, x2 = −2/5, x3 = −2/5.
(d) x + y − z − w = 0, −x + z + 2w = 4, x − y + z = 1, 2y − z + w = 5. The solution is x = 2, y = 1, z = 0, w = 3.
1.2.6.
(a) I = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1], O = the 4 × 4 zero matrix.
(b) I + O = I, I O = O I = O. No, it does not.
1.2.7. (a) undefined, (b) undefined, (c) [3 6 0; −1 4 2], (d) undefined, (e) undefined, (f) [1 11 9; 3 −12 −12; 7 8 8], (g) undefined, (h) [9 −2 14; −8 6 −17; 12 −3 28], (i) undefined.
1.2.8. Only the third pair commute.
1.2.9. 1, 6, 11, 16.
1.2.10. (a) [1 0 0; 0 0 0; 0 0 −1], (b) [2 0 0 0; 0 −2 0 0; 0 0 3 0; 0 0 0 −3].
1.2.11. (a) True, (b) true.
♥ 1.2.12. (a) Let A = [x y; z w]. Then AD = [ax by; az bw], while DA = [ax ay; bz bw], so if a ≠ b these are equal if and only if y = z = 0. (b) Every 2 × 2 matrix commutes with [a 0; 0 a] = a I. (c) Only 3 × 3 diagonal matrices. (d) Any matrix of the form A = [x 0 0; 0 y z; 0 u v]. (e) Let D = diag(d1, . . . , dn). The (i, j) entry of AD is aij dj. The (i, j) entry of DA is di aij. If di ≠ dj, this requires aij = 0, and hence, if all the di's are different, then A is diagonal.
1.2.13. We need A of size m × n and B of size n × m for both products to be defined. Further, AB has size m × m while BA has size n × n, so the sizes agree if and only if m = n.
1.2.14. B = [x y; 0 x], where x, y are arbitrary.
1.2.15. (a) (A + B)^2 = (A + B)(A + B) = AA + AB + BA + BB = A^2 + 2AB + B^2, since AB = BA. (b) An example: A = [1 2; 0 1], B = [0 0; 1 0].
1.2.16. If AB is defined and A is an m × n matrix, then B is an n × p matrix and AB is an m × p matrix; on the other hand, if BA is defined we must have p = m, and BA is an n × n matrix. Now, since AB = BA, we must have p = m = n.
1.2.17. A On×p = Om×p, Ol×m A = Ol×n.
1.2.18. The (i, j) entry of the matrix equation cA = O is c aij = 0. If any aij ≠ 0 then c = 0, so the only possible way that c ≠ 0 is if all aij = 0, and hence A = O.
1.2.19. False: for example, [1 0; 0 0][0 0; 1 0] = [0 0; 0 0].
1.2.20. False — unless they commute: AB = BA.
1.2.21. Let v be the column vector with 1 in its jth position and all other entries 0. Then A v is the same as the jth column of A. Thus, the hypothesis implies all columns of A are 0, and hence A = O.
1.2.22. (a) A must be a square matrix. (b) By associativity, AA^2 = AAA = A^2 A = A^3. (c) The naïve answer is n − 1. A more sophisticated answer is to note that you can compute A^2 = AA, A^4 = A^2 A^2, A^8 = A^4 A^4, and, by induction, A^(2^r) with only r matrix multiplications. More generally, if the binary expansion of n has r + 1 digits, with s nonzero digits, then we need r + s − 1 multiplications. For example, A^13 = A^8 A^4 A since 13 is 1101 in binary, for a total of 5 multiplications: 3 to compute A^2, A^4 and A^8, and 2 more to multiply them together to obtain A^13.
1.2.23. A = [0 1; 0 0].
♦ 1.2.24. (a) If the ith row of A has all zero entries, then the (i, j) entry of AB is ai1 b1j + · · · + ain bnj = 0 b1j + · · · + 0 bnj = 0, which holds for all j, so the ith row of AB will have all 0's.
(b) If A = [1 1; 0 0], B = [1 2; 3 4], then BA = [1 1; 3 3].
1.2.25. The same solution X = [−1 1; 3 −2] in both cases.
1.2.26. (a) [4 5; 1 2], (b) [5 −1; −2 1]. They are not the same.
1.2.27. (a) X = O. (b) Yes, for instance, A = [1 2; 0 1], B = [3 2; −2 −1], X = [1 0; 1 1].
1.2.28. A = (1/c) I when c 6= 0. If c = 0 there is no solution.
♦ 1.2.29.
(a) The ith entry of A z is 1 ai1 + 1 ai2 + · · · + 1 ain = ai1 + · · · + ain, which is the ith row sum.
(b) Each row of W has n − 1 entries equal to 1/n and one entry equal to (1 − n)/n, and so its row sums are (n − 1)(1/n) + (1 − n)/n = 0. Therefore, by part (a), W z = 0. Consequently, the row sums of B = AW are the entries of B z = AW z = A 0 = 0, and the result follows.
(c) z = (1, 1, 1)^T, and so A z = [1 2 −1; 2 1 3; −4 5 −1](1, 1, 1)^T = (2, 6, 0)^T, while B = AW = [1 2 −1; 2 1 3; −4 5 −1][−2/3 1/3 1/3; 1/3 −2/3 1/3; 1/3 1/3 −2/3] = [−1/3 −4/3 5/3; 0 1 −1; 4 −5 1], and so B z = (0, 0, 0)^T.
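Part (c) is easy to confirm numerically; the sketch below (not part of the manual) builds W for n = 3 and checks that A z gives the row sums while B = AW has zero row sums:

```python
# z is the all-ones vector; W = (1/n)J - I has off-diagonal entries 1/n
# and diagonal entries (1-n)/n, so every row of W sums to zero.
import numpy as np

n = 3
W = np.full((n, n), 1.0 / n) - np.eye(n)
A = np.array([[1.0, 2.0, -1.0],
              [2.0, 1.0,  3.0],
              [-4.0, 5.0, -1.0]])
z = np.ones(n)

row_sums_A = A @ z        # (2, 6, 0): the row sums of A
B = A @ W
row_sums_B = B @ z        # (0, 0, 0): B = AW has zero row sums
```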
♦ 1.2.30. Assume A has size m × n, B has size n × p and C has size p × q. The (k, j) entry of BC is Σ_{l=1}^p bkl clj, so the (i, j) entry of A(BC) is Σ_{k=1}^n aik ( Σ_{l=1}^p bkl clj ) = Σ_{k=1}^n Σ_{l=1}^p aik bkl clj. On the other hand, the (i, l) entry of AB is Σ_{k=1}^n aik bkl, so the (i, j) entry of (AB)C is Σ_{l=1}^p ( Σ_{k=1}^n aik bkl ) clj = Σ_{k=1}^n Σ_{l=1}^p aik bkl clj. The two results agree, and so A(BC) = (AB)C. Remark: A more sophisticated, simpler proof can be found in Exercise 7.1.44.
♥ 1.2.31.
(a) We need AB and BA to have the same size, and so this follows from Exercise 1.2.13.
(b) AB − BA = O if and only if AB = BA.
(c) (i) [−1 2; 6 1], (ii) [0 0; 0 0], (iii) [0 1 1; 1 0 1; −1 1 0];
(d) (i) [cA + dB, C] = (cA + dB)C − C(cA + dB) = c(AC − CA) + d(BC − CB) = c [A, C] + d [B, C],
[A, cB + dC] = A(cB + dC) − (cB + dC)A = c(AB − BA) + d(AC − CA) = c [A, B] + d [A, C].
(ii) [A, B] = AB − BA = −(BA − AB) = −[B, A].
(iii) [[A, B], C] = (AB − BA)C − C(AB − BA) = ABC − BAC − CAB + CBA,
[[C, A], B] = (CA − AC)B − B(CA − AC) = CAB − ACB − BCA + BAC,
[[B, C], A] = (BC − CB)A − A(BC − CB) = BCA − CBA − ABC + ACB.
Summing the three expressions produces O.
♦ 1.2.32. (a) (i) 4, (ii) 0. (b) tr(A + B) = Σ_{i=1}^n (aii + bii) = Σ_{i=1}^n aii + Σ_{i=1}^n bii = tr A + tr B.
(c) The diagonal entries of AB are Σ_{j=1}^n aij bji, so tr(AB) = Σ_{i=1}^n Σ_{j=1}^n aij bji; the diagonal entries of BA are Σ_{i=1}^n bji aij, so tr(BA) = Σ_{i=1}^n Σ_{j=1}^n bji aij. These double summations are clearly equal. (d) tr C = tr(AB − BA) = tr AB − tr BA = 0 by parts (b) and (c).
(e) Yes, by the same proof.
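A quick numerical illustration of parts (b)–(d) (a sketch, not part of the manual): the trace is additive, tr(AB) = tr(BA), and therefore every commutator has zero trace.

```python
# Check the trace identities on random integer matrices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, (4, 4))
B = rng.integers(-5, 5, (4, 4))

sum_ok = np.trace(A + B) == np.trace(A) + np.trace(B)     # part (b)
cyclic_ok = np.trace(A @ B) == np.trace(B @ A)            # part (c)
commutator_trace = np.trace(A @ B - B @ A)                # part (d): always 0
```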
♦ 1.2.33. If b = Ax, then bi = ai1x1 + ai2x2 + · · · + ainxn for each i. On the other hand,
cj = (a1j , a2j , . . . , anj)T , and so the ith entry of the right hand side of (1.13) is
x1ai1 + x2ai2 + · · ·+ xnain, which agrees with the expression for bi.
♥ 1.2.34.
(a) This follows by direct computation.
(b) (i) [−2 1; 3 2][1 −2; 1 0] = (−2, 3)^T (1 −2) + (1, 2)^T (1 0) = [−2 4; 3 −6] + [1 0; 2 0] = [−1 4; 5 −6].
(ii) [1 −2 0; −3 −1 2][2 5; −3 0; 1 −1] = (1, −3)^T (2 5) + (−2, −1)^T (−3 0) + (0, 2)^T (1 −1) = [2 5; −6 −15] + [6 0; 3 0] + [0 0; 2 −2] = [8 5; −1 −17].
(iii) [3 −1 1; −1 2 1; 1 1 −5][2 3 0; 3 −1 4; 0 4 1] = (3, −1, 1)^T (2 3 0) + (−1, 2, 1)^T (3 −1 4) + (1, 1, −5)^T (0 4 1) = [6 9 0; −2 −3 0; 2 3 0] + [−3 1 −4; 6 −2 8; 3 −1 4] + [0 4 1; 0 4 1; 0 −20 −5] = [3 14 −3; 4 −1 9; 5 −18 −1].
(c) If we set B = x, where x is an n × 1 matrix, then we obtain (1.14).
(d) The (i, j) entry of AB is Σ_{k=1}^n aik bkj. On the other hand, the (i, j) entry of ck rk equals the product of the ith entry of ck, namely aik, with the jth entry of rk, namely bkj. Summing these entries, aik bkj, over k yields the usual matrix product formula.
♥ 1.2.35.
(a) p(A) = A^3 − 3A + 2 I, q(A) = 2A^2 + I. (b) p(A) = [−2 −8; 4 6], q(A) = [−1 0; 0 −1].
(c) p(A) q(A) = (A^3 − 3A + 2 I)(2A^2 + I) = 2A^5 − 5A^3 + 4A^2 − 3A + 2 I, while p(x) q(x) = 2x^5 − 5x^3 + 4x^2 − 3x + 2.
(d) True, since powers of A mutually commute. For the particular matrix from (b), p(A) q(A) = q(A) p(A) = [2 8; −4 −6].
♥ 1.2.36.
(a) Check that S^2 = A by direct computation. Another example: S = [2 0; 0 2]. Or, more generally, 2 times any of the matrices in part (c).
(b) S^2 is only defined if S is square.
(c) Any of the matrices [±1 0; 0 ±1], [a b; c −a], where a is arbitrary and bc = 1 − a^2.
(d) Yes: for example [0 −1; 1 0].
♥ 1.2.37. (a) M has size (i + j) × (k + l). (b) M = [1 1 −1; 3 0 1; 1 1 3; −2 2 0; 1 1 −1]. (c) Since matrix addition is done entry-wise, adding the entries of each block is the same as adding the blocks. (d) X has size k × m, Y has size k × n, Z has size l × m, and W has size l × n. Then AX + BZ will have size i × m. Its (p, q) entry is obtained by multiplying the pth row of M times the qth column of P, which is ap1 x1q + · · · + apk xkq + bp1 z1q + · · · + bpl zlq, and equals the sum of the (p, q) entries of AX and BZ. A similar argument works for the remaining three blocks. (e) For example, if X = (1), Y = (2 0), Z = (0, 1)^T, W = [0 −1; 1 0], then P = [1 2 0; 0 0 −1; 1 1 0], and so M P = [0 1 −1; 4 7 0; 4 5 −1; −2 −4 −2; 0 1 −1]. The individual block products are
(0, 4)^T = (1, 3)^T (1) + [1 −1; 0 1](0, 1)^T,
(4, −2, 0)^T = (1, −2, 1)^T (1) + [1 3; 2 0; 1 −1](0, 1)^T,
[1 −1; 7 0] = (1, 3)^T (2 0) + [1 −1; 0 1][0 −1; 1 0],
[5 −1; −4 −2; 1 −1] = (1, −2, 1)^T (2 0) + [1 3; 2 0; 1 −1][0 −1; 1 0].
1.3.1.
(a) [1 7 | 4; −2 −9 | 2] →(2R1+R2) [1 7 | 4; 0 5 | 10]. Back Substitution yields x2 = 2, x1 = −10.
(b) [3 −5 | −1; 2 1 | 8] →(−(2/3)R1+R2) [3 −5 | −1; 0 13/3 | 26/3]. Back Substitution yields w = 2, z = 3.
(c) [1 −2 1 | 0; 0 2 −8 | 8; −4 5 9 | −9] →(4R1+R3) [1 −2 1 | 0; 0 2 −8 | 8; 0 −3 13 | −9] →((3/2)R2+R3) [1 −2 1 | 0; 0 2 −8 | 8; 0 0 1 | 3]. Back Substitution yields z = 3, y = 16, x = 29.
(d) [1 4 −2 | 1; −2 0 −3 | −7; 3 −2 2 | −1] →(2R1+R2) [1 4 −2 | 1; 0 8 −7 | −5; 3 −2 2 | −1] →(−3R1+R3) [1 4 −2 | 1; 0 8 −7 | −5; 0 −14 8 | −4] →((7/4)R2+R3) [1 4 −2 | 1; 0 8 −7 | −5; 0 0 −17/4 | −51/4]. Back Substitution yields r = 3, q = 2, p = −1.
(e) [1 0 −2 0 | −1; 0 1 0 −1 | 2; 0 −3 2 0 | 0; −4 0 0 7 | −5] reduces to [1 0 −2 0 | −1; 0 1 0 −1 | 2; 0 0 2 −3 | 6; 0 0 0 −5 | 15]. Solution: x4 = −3, x3 = −3/2, x2 = −1, x1 = −4.
(f) [−1 3 −1 1 | −2; 1 −1 3 −1 | 0; 0 1 −1 4 | 7; 4 −1 1 0 | 5] reduces to [−1 3 −1 1 | −2; 0 2 2 0 | −2; 0 0 −2 4 | 8; 0 0 0 −24 | −48]. Solution: w = 2, z = 0, y = −1, x = 1.
1.3.2.
(a) 3x + 2y = 2, −4x − 3y = −1; solution: x = 4, y = −5.
(b) x + 2y = −3, −x + 2y + z = −6, −2x − 3z = 1; solution: x = 1, y = −2, z = −1.
(c) 3x − y + 2z = −3, −2y − 5z = −1, 6x − 2y + z = −3; solution: x = 2/3, y = 3, z = −1.
(d) 2x − y = 0, −x + 2y − z = 1, −y + 2z − w = 1, −z + 2w = 0; solution: x = 1, y = 2, z = 2, w = 1.
1.3.3. (a) x = 17/3, y = −4/3; (b) u = 1, v = −1; (c) u = 3/2, v = −1/3, w = 1/6; (d) x1 = 11/3, x2 = −10/3, x3 = −2/3; (e) p = −2/3, q = 19/6, r = 5/2; (f) a = 1/3, b = 0, c = 4/3, d = −2/3; (g) x = 1/3, y = 7/6, z = −8/3, w = 9/2.
1.3.4. Solving 6 = a + b + c, 4 = 4a + 2b + c, 0 = 9a + 3b + c yields a = −1, b = 1, c = 6, so y = −x^2 + x + 6.
1.3.5.
(a) Regular: [2 1; 1 4] → [2 1; 0 7/2].
(b) Not regular.
(c) Regular: [3 −2 1; −1 4 −3; 3 −2 5] → [3 −2 1; 0 10/3 −8/3; 0 0 4].
(d) Not regular: [1 −2 3; −2 4 −1; 3 −1 2] → [1 −2 3; 0 0 5; 0 5 −7].
(e) Regular: [1 3 −3 0; −1 0 −1 2; 3 3 −6 1; 2 3 −3 5] → [1 3 −3 0; 0 3 −4 2; 0 −6 3 1; 0 −3 3 5] → [1 3 −3 0; 0 3 −4 2; 0 0 −5 5; 0 0 −1 7] → [1 3 −3 0; 0 3 −4 2; 0 0 −5 5; 0 0 0 6].
1.3.6.
(a) [−i 1+i | −1; 1−i 1 | −3i] → [−i 1+i | −1; 0 1−2i | 1−2i]; use Back Substitution to obtain the solution y = 1, x = 1 − 2i.
(b) [i 0 1−i | 2i; 0 2i 1+i | 2; −1 2i i | 1−2i] → [i 0 1−i | 2i; 0 2i 1+i | 2; 0 0 −2−i | 1−2i]; solution: z = i, y = −1/2 − (3/2)i, x = 1 + i.
(c) [1−i 2 | i; −i 1+i | −1] → [1−i 2 | i; 0 2i | −3/2 − (1/2)i]; solution: y = −1/4 + (3/4)i, x = 1/2.
(d) [1+i i 2+2i | 0; 1−i 2 i | 0; 3−3i i 3−11i | 6] → [1+i i 2+2i | 0; 0 1 −2+3i | 0; 0 0 −6+6i | 6]; solution: z = −1/2 − (1/2)i, y = −5/2 + (1/2)i, x = 5/2 + 2i.
1.3.7. (a) 2x = 3, −y = 4, 3z = 1, u = 6, 8v = −24. (b) x = 3/2, y = −4, z = 1/3, u = 6, v = −3. (c) You only have to divide by each coefficient to find the solution.
♦ 1.3.8. 0 is the (unique) solution since A0 = 0.
♠ 1.3.9.

Back Substitution

start
    set xn = cn / unn
    for i = n − 1 to 1 with increment −1
        set xi = (1/uii) ( ci − Σ_{j=i+1}^{n} uij xj )
    next i
end
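A direct Python transcription of the algorithm above (a sketch, not part of the manual), tried on the reduced system of Exercise 1.3.1(c):

```python
# Back Substitution: solve U x = c for upper triangular U
# with nonzero diagonal entries (pivots).
import numpy as np

def back_substitution(U, c):
    n = len(c)
    x = np.zeros(n)
    x[n - 1] = c[n - 1] / U[n - 1, n - 1]
    for i in range(n - 2, -1, -1):        # i = n-1 down to 1 in 1-based terms
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[1.0, -2.0,  1.0],
              [0.0,  2.0, -8.0],
              [0.0,  0.0,  1.0]])
c = np.array([0.0, 8.0, 3.0])
x = back_substitution(U, c)    # (29, 16, 3): x = 29, y = 16, z = 3
```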
1.3.10. Since
[a11 a12; 0 a22][b11 b12; 0 b22] = [a11 b11, a11 b12 + a12 b22; 0, a22 b22],
[b11 b12; 0 b22][a11 a12; 0 a22] = [a11 b11, a22 b12 + a12 b11; 0, a22 b22],
the matrices commute if and only if a11 b12 + a12 b22 = a22 b12 + a12 b11, or (a11 − a22) b12 = a12 (b11 − b22).
1.3.11. Clearly, any diagonal matrix is both lower and upper triangular. Conversely, A being lower triangular requires that aij = 0 for i < j; A upper triangular requires that aij = 0 for i > j. If A is both lower and upper triangular, aij = 0 for all i ≠ j, which implies A is a diagonal matrix.
♦ 1.3.12.
(a) Set lij = aij for i > j and lij = 0 for i ≤ j; uij = aij for i < j and uij = 0 for i ≥ j; dij = aij for i = j and dij = 0 for i ≠ j.
(b) L = [0 0 0; 1 0 0; −2 0 0], D = [3 0 0; 0 −4 0; 0 0 5], U = [0 1 −1; 0 0 2; 0 0 0].
♦ 1.3.13.
(a) By direct computation, A^2 = [0 0 1; 0 0 0; 0 0 0], and so A^3 = O.
(b) Let A have size n × n. By assumption, aij = 0 whenever i > j − 1. By induction, one proves that the (i, j) entries of A^k are all zero whenever i > j − k. Indeed, to compute the (i, j) entry of A^(k+1) = A A^k you multiply the ith row of A, whose first i entries are 0, by the jth column of A^k, whose only possibly non-zero entries are among the first j − k, all the rest being zero, according to the induction hypothesis; therefore, if i > j − k − 1, every term in the sum producing this entry is 0, and the induction is complete. In particular, for k = n, every entry of A^n is zero, and so A^n = O.
(c) The matrix A = [1 1; −1 −1] has A^2 = O.
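Part (a) is easy to confirm numerically (a sketch, not part of the manual; we take A to be the 3 × 3 matrix with 1's on the superdiagonal, consistent with the A^2 displayed above):

```python
# A strictly upper triangular 3x3 matrix is nilpotent: A^3 = O.
import numpy as np

A = np.diag([1.0, 1.0], k=1)   # ones on the superdiagonal, zeros elsewhere
A2 = A @ A                     # [[0,0,1],[0,0,0],[0,0,0]]
A3 = A2 @ A                    # the zero matrix
```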
1.3.14.
(a) Add −2 times the second row to the first row of a 2 × n matrix.
(b) Add 7 times the first row to the second row of a 2 × n matrix.
(c) Add −5 times the third row to the second row of a 3 × n matrix.
(d) Add 1/2 times the first row to the third row of a 3 × n matrix.
(e) Add −3 times the fourth row to the second row of a 4 × n matrix.
1.3.15. (a) [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 1 1], (b) [1 0 0 0; 0 1 0 0; 0 0 1 −1; 0 0 0 1], (c) [1 0 0 3; 0 1 0 0; 0 0 1 0; 0 0 0 1], (d) [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 −2 0 1].
1.3.16. L3 L2 L1 = [1 0 0; 2 1 0; 0 −1/2 1] ≠ L1 L2 L3.
1.3.17. E3 E2 E1 = [1 0 0; −2 1 0; −2 1/2 1], E1 E2 E3 = [1 0 0; −2 1 0; −1 1/2 1]. The second is easier to predict since its entries are the same as the corresponding entries of the Ei.
1.3.18.
(a) Suppose that E adds c ≠ 0 times row i to row j ≠ i, while Ẽ adds d ≠ 0 times row k to row l ≠ k. If r1, . . . , rn are the rows, then the effect of Ẽ E is to replace
(i) rj by rj + c ri + d rk for j = l;
(ii) rj by rj + c ri and rl by rl + (cd) ri + d rj for j = k;
(iii) rj by rj + c ri and rl by rl + d rk otherwise.
On the other hand, the effect of E Ẽ is to replace
(i) rj by rj + c ri + d rk for j = l;
(ii) rj by rj + c ri + (cd) rk and rl by rl + d rk for i = l;
(iii) rj by rj + c ri and rl by rl + d rk otherwise.
Comparing results, we see that E Ẽ = Ẽ E whenever i ≠ l and j ≠ k.
(b) E1 E2 = E2 E1, E1 E3 ≠ E3 E1, and E3 E2 = E2 E3.
(c) See the answer to part (a).
1.3.19. (a) Upper triangular; (b) both special upper and special lower triangular; (c) lower triangular; (d) special lower triangular; (e) none of the above.
1.3.20. (a) aij = 0 for all i ≠ j; (b) aij = 0 for all i > j; (c) aij = 0 for all i > j and aii = 1 for all i; (d) aij = 0 for all i < j; (e) aij = 0 for all i < j and aii = 1 for all i.
♦ 1.3.21.
(a) Consider the product L M of two lower triangular n × n matrices. The last n − i entries in the ith row of L are zero, while the first j − 1 entries in the jth column of M are zero. So if i < j, each summand in the product of the ith row times the jth column is zero, and so all entries above the diagonal of L M are zero.
(b) The ith diagonal entry of L M is the product of the ith diagonal entry of L times the ith diagonal entry of M.
(c) Special matrices have all 1's on the diagonal, and so, by part (b), does their product.
1.3.22. (a) L = [1 0; −1 1], U = [1 3; 0 3], (b) L = [1 0; 3 1], U = [1 3; 0 −8],
(c) L = [1 0 0; −1 1 0; 1 0 1], U = [−1 1 −1; 0 2 0; 0 0 3], (d) L = [1 0 0; 1/2 1 0; 0 1/3 1], U = [2 0 3; 0 3 −1/2; 0 0 7/6],
(e) L = [1 0 0; −2 1 0; −1 −1 1], U = [−1 0 0; 0 −3 0; 0 0 2], (f) L = [1 0 0; 2 1 0; −3 1/3 1], U = [1 0 −1; 0 3 4; 0 0 −13/3],
(g) L = [1 0 0 0; 0 1 0 0; −1 3/2 1 0; 0 −1/2 3 1], U = [1 0 −1 0; 0 2 −1 −1; 0 0 1/2 7/2; 0 0 0 −10], (h) L = [1 0 0 0; −1 1 0 0; −2 1 1 0; 3 −1 −2 1], U = [1 1 −2 3; 0 3 1 3; 0 0 −4 1; 0 0 0 1],
(i) L = [1 0 0 0; 1/2 1 0 0; 3/2 −3/7 1 0; 1/2 1/7 −5/22 1], U = [2 1 3 1; 0 7/2 −3/2 1/2; 0 0 −22/7 5/7; 0 0 0 35/22].
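Factorizations like these can be produced by a short Gaussian-elimination routine (a minimal sketch, not part of the manual; it assumes the matrix is regular, so no zero pivots appear):

```python
# Doolittle-style LU factorization without pivoting: A = L U with
# L special lower triangular and U upper triangular.
import numpy as np

def lu_no_pivot(A):
    A = A.astype(float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for j in range(n - 1):
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]     # pivot U[j, j] assumed nonzero
            U[i, :] -= L[i, j] * U[j, :]
    return L, U

A = np.array([[1.0, 3.0],
              [3.0, 1.0]])
L, U = lu_no_pivot(A)   # L = [[1,0],[3,1]], U = [[1,3],[0,-8]], as in (b)
```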
1.3.23. (a) Add 3 times first row to second row. (b) Add −2 times first row to third row. (c) Add 4 times second row to third row.
1.3.24.
(a) [1 0 0 0; 2 1 0 0; 3 4 1 0; 5 6 7 1]
(b) (1) Add −2 times first row to second row. (2) Add −3 times first row to third row. (3) Add −5 times first row to fourth row. (4) Add −4 times second row to third row. (5) Add −6 times second row to fourth row. (6) Add −7 times third row to fourth row.
(c) Use the order given in part (b).
♦ 1.3.25. See equation (4.51) for the general case.
[1, 1; t1, t2] = [1, 0; t1, 1][1, 1; 0, t2 − t1],
[1, 1, 1; t1, t2, t3; t1^2, t2^2, t3^2] = [1, 0, 0; t1, 1, 0; t1^2, t1 + t2, 1][1, 1, 1; 0, t2 − t1, t3 − t1; 0, 0, (t3 − t1)(t3 − t2)],
[1, 1, 1, 1; t1, t2, t3, t4; t1^2, t2^2, t3^2, t4^2; t1^3, t2^3, t3^3, t4^3] = [1, 0, 0, 0; t1, 1, 0, 0; t1^2, t1 + t2, 1, 0; t1^3, t1^2 + t1 t2 + t2^2, t1 + t2 + t3, 1][1, 1, 1, 1; 0, t2 − t1, t3 − t1, t4 − t1; 0, 0, (t3 − t1)(t3 − t2), (t4 − t1)(t4 − t2); 0, 0, 0, (t4 − t1)(t4 − t2)(t4 − t3)].
1.3.26. False. For instance, [1 1; 1 0] is regular. Only if the zero appears in the (1, 1) position does it automatically preclude regularity of the matrix.
1.3.27. (n − 1) + (n − 2) + · · · + 1 = n(n − 1)/2.
1.3.28. We solve the equation [1 0; l 1][u1 u2; 0 u3] = [a b; c d] for u1, u2, u3, l, where a ≠ 0 since A = [a b; c d] is regular. This matrix equation has a unique solution: u1 = a, u2 = b, u3 = d − bc/a, l = c/a.
♦ 1.3.29. The matrix factorization A = LU is [0 1; 1 0] = [1 0; a 1][x y; 0 z] = [x y; ax ay + z]. This implies x = 0 and ax = 1, which is impossible.
♦ 1.3.30.
(a) Let u11, . . . , unn be the pivots of A, i.e., the diagonal entries of U. Let D be the diagonal matrix whose diagonal entries are dii = sign uii. Then B = AD is the matrix obtained by multiplying each column of A by the sign of its pivot. Moreover, B = LUD = L Ũ, where Ũ = UD, is the LU factorization of B. Each column of Ũ is obtained by multiplying it by the sign of its pivot. In particular, the diagonal entries of Ũ, which are the pivots of B, are uii sign uii = |uii| > 0.
(b) Using the same notation as in part (a), we note that C = DA is the matrix obtained by multiplying each row of A by the sign of its pivot. Moreover, C = DLU. However, DL is not special lower triangular, since its diagonal entries are the pivot signs. But L̂ = DLD is special lower triangular, and so C = DLDDU = L̂ Û, where Û = DU, is the LU factorization of C. Each row of Û is obtained by multiplying it by the sign of its pivot. In particular, the diagonal entries of Û, which are the pivots of C, are uii sign uii = |uii| > 0.
(c) [−2 2 1; 1 0 1; 4 2 3] = [1 0 0; −1/2 1 0; −2 6 1][−2 2 1; 0 1 3/2; 0 0 −4],
[2 2 −1; −1 0 −1; −4 2 −3] = [1 0 0; −1/2 1 0; −2 6 1][2 2 −1; 0 1 −3/2; 0 0 4],
[2 −2 −1; 1 0 1; −4 −2 −3] = [1 0 0; 1/2 1 0; −2 −6 1][2 −2 −1; 0 1 3/2; 0 0 4].
1.3.31. (a) x = (−1/2, 3)^T, (b) x = (1/4, 1/4)^T, (c) x = (0, 1, 0)^T, (d) x = (−4/7, 2/7, 5/7)^T, (e) x = (−1, −1, 5/2)^T, (f) x = (0, 1, −1)^T, (g) x = (2, 1, 1, 0)^T, (h) x = (−37/12, −17/12, 1/4, 2)^T, (i) x = (3/35, 6/35, 1/7, 8/35)^T.
1.3.32.
(a) L = [1 0; −3 1], U = [−1 3; 0 11]; x1 = (−5/11, 2/11)^T, x2 = (1, 1)^T, x3 = (9/11, 3/11)^T;
(b) L = [1 0 0; −1 1 0; 1 0 1], U = [−1 1 −1; 0 2 0; 0 0 3]; x1 = (−1, 0, 0)^T, x2 = (−1/6, −3/2, 5/3)^T;
(c) L = [1 0 0; −2/3 1 0; 2/9 5/3 1], U = [9 −2 −1; 0 −1/3 1/3; 0 0 −1/3]; x1 = (1, 2, 3)^T, x2 = (−2, −9, −1)^T;
(d) L = [1 0 0; .15 1 0; .2 1.2394 1], U = [2.0 .3 .4; 0 .355 4.94; 0 0 −.2028]; x1 = (.6944, −1.3889, .0694)^T, x2 = (1.1111, −82.2222, 6.1111)^T, x3 = (−9.3056, 68.6111, −4.9306)^T;
(e) L = [1 0 0 0; 0 1 0 0; −1 3/2 1 0; 0 −1/2 −1 1], U = [1 0 −1 0; 0 2 3 −1; 0 0 −7/2 7/2; 0 0 0 4]; x1 = (5/4, −1/4, 1/4, 1/4)^T, x2 = (1/14, −5/14, 1/14, 1/2)^T;
(f) L = [1 0 0 0; 4 1 0 0; −8 −17/9 1 0; −4 −1 0 1], U = [1 −2 0 2; 0 9 −1 −9; 0 0 1/9 0; 0 0 0 1]; x1 = (1, 0, 4, 0)^T, x2 = (1, 1, 3, 2)^T, x3 = (10, 8, 41, 4)^T.
1.4.1. The nonsingular matrices are (a), (c), (d), (h).
1.4.2. (a) Regular and nonsingular, (b) singular, (c) nonsingular, (d) regular and nonsingular.
1.4.3. (a) x1 = −5/3, x2 = −10/3, x3 = 5; (b) x1 = 0, x2 = −1, x3 = 2; (c) x1 = −6, x2 = 2, x3 = −2; (d) x = −13/2, y = −9/2, z = −1, w = −3; (e) x1 = −11, x2 = −10/3, x3 = −5, x4 = −7.
1.4.4. Solve the equations −1 = 2b+c, 3 = −2a+4b+c, −3 = 2a−b+c, for a = −4, b = −2,c = 3, giving the plane z = −4x− 2y + 3.
1.4.5.
(a) Suppose A is nonsingular. If a ≠ 0 and c ≠ 0, then we subtract c/a times the first row from the second, producing the (2, 2) pivot entry (ad − bc)/a ≠ 0. If c = 0, then the pivot entry is d and so ad − bc = ad ≠ 0. If a = 0, then c ≠ 0 as otherwise the first column would not contain a pivot. Interchanging the two rows gives the pivots c and b, and so ad − bc = −bc ≠ 0.
(b) Regularity requires a ≠ 0. Proceeding as in part (a), we conclude that ad − bc ≠ 0 also.
1.4.6. True. All regular matrices are nonsingular.
♦ 1.4.7. Since A is nonsingular, we can reduce it to upper triangular form with nonzero diagonal entries (by applying the operations #1 and #2). The rest of the argument is the same as in Exercise 1.3.8.
1.4.8. By applying the operations #1 and #2 to the system Ax = b we obtain an equivalent upper triangular system Ux = c. Since A is nonsingular, uii ≠ 0 for all i, so by Back Substitution each solution component, namely xn = cn/unn and xi = (1/uii) ( ci − Σ_{k=i+1}^{n} uik xk ) for i = n − 1, n − 2, . . . , 1, is uniquely defined.
1.4.9. (a) P1 = [1 0 0 0; 0 0 0 1; 0 0 1 0; 0 1 0 0], (b) P2 = [0 0 0 1; 0 1 0 0; 0 0 1 0; 1 0 0 0],
(c) No, they do not commute. (d) P1 P2 arranges the rows in the order 4, 1, 3, 2, while P2 P1 arranges them in the order 2, 4, 3, 1.
1.4.10. (a) [0 1 0; 0 0 1; 1 0 0], (b) [0 0 0 1; 0 0 1 0; 1 0 0 0; 0 1 0 0], (c) [0 1 0 0; 1 0 0 0; 0 0 0 1; 0 0 1 0], (d) [0 0 0 1 0; 1 0 0 0 0; 0 0 1 0 0; 0 1 0 0 0; 0 0 0 0 1].
1.4.11. The (i, j) entry of the following Multiplication Table indicates the product Pi Pj, where
P1 = [1 0 0; 0 1 0; 0 0 1], P2 = [0 1 0; 0 0 1; 1 0 0], P3 = [0 0 1; 1 0 0; 0 1 0],
P4 = [0 1 0; 1 0 0; 0 0 1], P5 = [0 0 1; 0 1 0; 1 0 0], P6 = [1 0 0; 0 0 1; 0 1 0].
The commutative pairs are P1 Pi = Pi P1, i = 1, . . . , 6, and P2 P3 = P3 P2.
P1 P2 P3 P4 P5 P6
P1 P1 P2 P3 P4 P5 P6
P2 P2 P3 P1 P6 P4 P5
P3 P3 P1 P2 P5 P6 P4
P4 P4 P5 P6 P1 P2 P3
P5 P5 P6 P4 P3 P1 P2
P6 P6 P4 P5 P2 P3 P1
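One entry of the table is easy to spot-check numerically (a sketch, not part of the manual): the two 3-cycles P2 and P3 are inverses of one another, so their products in either order give the identity P1.

```python
# Verify P2 P3 = P3 P2 = P1 = I for the 3-cycles in the table above.
import numpy as np

P2 = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
P3 = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])

prod_23 = P2 @ P3    # the identity matrix
prod_32 = P3 @ P2    # also the identity matrix
```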
1.4.12. (a) [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1], [0 1 0 0; 0 0 0 1; 0 0 1 0; 1 0 0 0], [0 0 0 1; 1 0 0 0; 0 0 1 0; 0 1 0 0], [0 1 0 0; 1 0 0 0; 0 0 1 0; 0 0 0 1], [0 0 0 1; 0 1 0 0; 0 0 1 0; 1 0 0 0], [1 0 0 0; 0 0 0 1; 0 0 1 0; 0 1 0 0];
(b) [0 1 0 0; 1 0 0 0; 0 0 0 1; 0 0 1 0], [1 0 0 0; 0 0 0 1; 0 1 0 0; 0 0 1 0], [0 0 0 1; 0 1 0 0; 1 0 0 0; 0 0 1 0], [1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0], [0 0 0 1; 1 0 0 0; 0 1 0 0; 0 0 1 0], [0 1 0 0; 0 0 0 1; 1 0 0 0; 0 0 1 0];
(c) [1 0 0 0; 0 0 1 0; 0 1 0 0; 0 0 0 1], [0 0 0 1; 0 0 1 0; 0 1 0 0; 1 0 0 0].
1.4.13. (a) True, since interchanging the same pair of rows twice brings you back to where you started. (b) False; an example is the non-elementary permutation matrix [0 0 1; 1 0 0; 0 1 0]. (c) False; for example P = [−1 0; 0 −1] is not a permutation matrix. For a complete list of such matrices, see Exercise 1.2.36.
1.4.14. (a) Only when all the entries of v are different; (b) only when all the rows of A are different.
1.4.15. (a) [1 0 0; 0 0 1; 0 1 0]. (b) True. (c) False — A P permutes the columns of A according to the inverse (or transpose) permutation matrix P^−1 = P^T.
♥ 1.4.16.
(a) If P has a 1 in position (π(j), j), then it moves row j of A to row π(j) of P A, which is enough to establish the correspondence.
(b) (i) [0 1 0; 1 0 0; 0 0 1], (ii) [0 0 0 1; 0 1 0 0; 0 0 1 0; 1 0 0 0], (iii) [1 0 0 0; 0 0 1 0; 0 0 0 1; 0 1 0 0], (iv) [0 0 0 0 1; 0 0 0 1 0; 0 0 1 0 0; 0 1 0 0 0; 1 0 0 0 0]. Cases (i) and (ii) are elementary matrices.
(c) (i) (1 2 3; 2 3 1), (ii) (1 2 3 4; 3 4 1 2), (iii) (1 2 3 4; 4 1 2 3), (iv) (1 2 3 4 5; 2 5 3 1 4).
♦ 1.4.17. The first row of an n × n permutation matrix can have the 1 in any of the n positions, so there are n possibilities for the first row. Once the first row is set, the second row can have its 1 anywhere except in the column under the 1 in the first row, and so there are n − 1 possibilities. The 1 in the third row can be in any of the n − 2 positions not under either of the previous two 1's. And so on, leading to a total of n(n − 1)(n − 2) · · · 2 · 1 = n! possible permutation matrices.
1.4.18. Let ri, rj denote the rows of the matrix in question. After the first elementary row operation, the rows are ri and rj + ri. After the second, they are ri − (rj + ri) = −rj and rj + ri. After the third operation, we are left with −rj and rj + ri + (−rj) = ri.
1.4.19.
(a) [0 1; 1 0][0 1; 2 −1] = [1 0; 0 1][2 −1; 0 1], x = (5/2, 3)^T;
(b) [0 1 0; 0 0 1; 1 0 0][0 0 −4; 1 2 3; 0 1 7] = [1 0 0; 0 1 0; 0 0 1][1 2 3; 0 1 7; 0 0 −4], x = (5/4, 3/4, −1/4)^T;
(c) [0 0 1; 1 0 0; 0 1 0][0 1 −3; 0 2 3; 1 0 2] = [1 0 0; 0 1 0; 0 2 1][1 0 2; 0 1 −3; 0 0 9], x = (−1, 1, 0)^T;
(d) [1 0 0 0; 0 0 1 0; 0 1 0 0; 0 0 0 1][1 2 −1 0; 3 6 2 −1; 1 1 −7 2; 1 −1 2 1] = [1 0 0 0; 1 1 0 0; 3 0 1 0; 1 3 21/5 1][1 2 −1 0; 0 −1 −6 2; 0 0 5 −1; 0 0 0 −4/5], x = (22, −13, −5, −22)^T;
(e) [0 0 1 0; 1 0 0 0; 0 1 0 0; 0 0 0 1][0 1 0 0; 2 3 1 0; 1 4 −1 2; 7 −1 2 3] = [1 0 0 0; 0 1 0 0; 2 −5 1 0; 7 −29 3 1][1 4 −1 2; 0 1 0 0; 0 0 3 −4; 0 0 0 1], x = (−1, −1, 1, 3)^T;
(f) [0 0 1 0 0; 0 1 0 0 0; 0 0 0 1 0; 1 0 0 0 0; 0 0 0 0 1][0 0 2 3 4; 0 1 −7 2 3; 1 4 1 1 1; 0 0 1 0 2; 0 0 1 7 3] = [1 0 0 0 0; 0 1 0 0 0; 0 0 1 0 0; 0 0 2 1 0; 0 0 1 7/3 1][1 4 1 1 1; 0 1 −7 2 3; 0 0 1 0 2; 0 0 0 3 0; 0 0 0 0 1], x = (1, 0, 0, −1, 0)^T.
1.4.20.
(a) [1 0 0; 0 0 1; 0 1 0][4 −4 2; −3 3 1; −3 1 −2] = [1 0 0; −3/4 1 0; −3/4 0 1][4 −4 2; 0 −2 −1/2; 0 0 5/2]; solution: x1 = 5/4, x2 = 7/4, x3 = 3/2.
(b) [0 0 1 0; 0 1 0 0; 1 0 0 0; 0 0 0 1][0 1 −1 1; 0 1 1 0; 1 −1 1 −3; 1 2 −1 1] = [1 0 0 0; 0 1 0 0; 0 1 1 0; 1 3 5/2 1][1 −1 1 −3; 0 1 1 0; 0 0 −2 1; 0 0 0 3/2]; solution: x = 4, y = 0, z = 1, w = 1.
(c) [1 0 0 0; 0 0 0 1; 0 0 1 0; 0 1 0 0][1 −1 2 1; −1 1 −3 0; 1 −1 1 −3; 1 2 −1 1] = [1 0 0 0; 1 1 0 0; 1 0 1 0; −1 0 1 1][1 −1 2 1; 0 3 −3 0; 0 0 −1 −4; 0 0 0 5]; solution: x = 19/3, y = −5/3, z = −3, w = −2.
♦ 1.4.21.
(a) They are all of the form P A = LU, where P is a permutation matrix. In the first case, we interchange rows 1 and 2; in the second case, we interchange rows 1 and 3; in the third case, we interchange rows 1 and 3 first and then interchange rows 2 and 3.
(b) Same solution x = 1, y = 1, z = −2 in all cases. Each is done by a sequence of elementary row operations, which do not change the solution.
1.4.22. There are four in all:
[0 1 0; 1 0 0; 0 0 1][0 1 2; 1 0 −1; 1 1 3] = [1 0 0; 0 1 0; 1 1 1][1 0 −1; 0 1 2; 0 0 2],
[0 1 0; 0 0 1; 1 0 0][0 1 2; 1 0 −1; 1 1 3] = [1 0 0; 1 1 0; 0 1 1][1 0 −1; 0 1 4; 0 0 −2],
[0 0 1; 0 1 0; 1 0 0][0 1 2; 1 0 −1; 1 1 3] = [1 0 0; 1 1 0; 0 −1 1][1 1 3; 0 −1 −4; 0 0 −2],
[0 0 1; 1 0 0; 0 1 0][0 1 2; 1 0 −1; 1 1 3] = [1 0 0; 0 1 0; 1 −1 1][1 1 3; 0 1 2; 0 0 −2].
The other two permutation matrices are not regular.
1.4.23. The maximum is 6 since there are 6 different 3 × 3 permutation matrices. For example,
[1 0 0; 1 1 0; −1 1 1] = [1 0 0; 1 1 0; −1 1 1][1 0 0; 0 1 0; 0 0 1],
[1 0 0; 0 0 1; 0 1 0][1 0 0; 1 1 0; −1 1 1] = [1 0 0; −1 1 0; 1 1 1][1 0 0; 0 1 1; 0 0 −1],
[0 1 0; 1 0 0; 0 0 1][1 0 0; 1 1 0; −1 1 1] = [1 0 0; 1 1 0; −1 −2 1][1 1 0; 0 −1 0; 0 0 1],
[0 1 0; 0 0 1; 1 0 0][1 0 0; 1 1 0; −1 1 1] = [1 0 0; −1 1 0; 1 −1/2 1][1 1 0; 0 2 1; 0 0 1/2],
[0 0 1; 0 1 0; 1 0 0][1 0 0; 1 1 0; −1 1 1] = [1 0 0; −1 1 0; −1 1/2 1][−1 1 1; 0 2 1; 0 0 1/2],
[0 0 1; 1 0 0; 0 1 0][1 0 0; 1 1 0; −1 1 1] = [1 0 0; −1 1 0; −1 2 1][−1 1 1; 0 1 1; 0 0 −1].
1.4.24. False. Changing the permutation matrix typically changes the pivots.
♠ 1.4.25.

Permuted LU Factorization

start
  set P = I, L = I, U = A
  for j = 1 to n
    if u_kj = 0 for all k ≥ j, stop; print "A is singular"
    if u_jj = 0 but u_kj ≠ 0 for some k > j then
      interchange rows j and k of U
      interchange rows j and k of P
      for m = 1 to j − 1 interchange l_jm and l_km next m
    for i = j + 1 to n
      set l_ij = u_ij / u_jj
      add − l_ij times row j to row i of U
    next i
  next j
end
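The pseudocode above can be prototyped directly. The following Python sketch (function and variable names are ours; dense matrices stored as nested lists, with no attempt at efficiency) mirrors the algorithm step by step:

```python
def permuted_lu(A):
    """Permuted LU factorization: returns (P, L, U) with P A = L U,
    or raises ValueError if A is singular.  A is an n x n nested list."""
    n = len(A)
    U = [row[:] for row in A]                               # working copy, becomes U
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for j in range(n):
        # find a nonzero pivot in column j at or below the diagonal
        k = next((k for k in range(j, n) if U[k][j] != 0), None)
        if k is None:
            raise ValueError("A is singular")
        if k != j:                                          # interchange rows j and k
            U[j], U[k] = U[k], U[j]
            P[j], P[k] = P[k], P[j]
            for m in range(j):                              # swap the computed part of L
                L[j][m], L[k][m] = L[k][m], L[j][m]
        for i in range(j + 1, n):
            L[i][j] = U[i][j] / U[j][j]
            for m in range(n):                              # add -l_ij times row j to row i of U
                U[i][m] -= L[i][j] * U[j][m]
    return P, L, U
```

Applied to the matrix of Exercise 1.4.22, it reproduces the first of the four factorizations listed there.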
1.5.1.
(a) $\begin{pmatrix} 2&3\\ -1&-1 \end{pmatrix}\begin{pmatrix} -1&-3\\ 1&2 \end{pmatrix} = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix} = \begin{pmatrix} -1&-3\\ 1&2 \end{pmatrix}\begin{pmatrix} 2&3\\ -1&-1 \end{pmatrix}$,
(b) $\begin{pmatrix} 2&1&1\\ 3&2&1\\ 2&1&2 \end{pmatrix}\begin{pmatrix} 3&-1&-1\\ -4&2&1\\ -1&0&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} 3&-1&-1\\ -4&2&1\\ -1&0&1 \end{pmatrix}\begin{pmatrix} 2&1&1\\ 3&2&1\\ 2&1&2 \end{pmatrix}$,
(c) $\begin{pmatrix} -1&3&2\\ 2&2&-1\\ -2&1&3 \end{pmatrix}\begin{pmatrix} -1&1&1\\ \frac47&-\frac17&-\frac37\\ -\frac67&\frac57&\frac87 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} -1&1&1\\ \frac47&-\frac17&-\frac37\\ -\frac67&\frac57&\frac87 \end{pmatrix}\begin{pmatrix} -1&3&2\\ 2&2&-1\\ -2&1&3 \end{pmatrix}$.
1.5.2. $X = \begin{pmatrix} -5&16&6\\ 3&-8&-3\\ -1&3&1 \end{pmatrix}$; $\;XA = \begin{pmatrix} -5&16&6\\ 3&-8&-3\\ -1&3&1 \end{pmatrix}\begin{pmatrix} 1&2&0\\ 0&1&3\\ 1&-1&-8 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}$.
1.5.3. (a) $\begin{pmatrix} 0&1\\ 1&0 \end{pmatrix}$, (b) $\begin{pmatrix} 1&0\\ -5&1 \end{pmatrix}$, (c) $\begin{pmatrix} 1&2\\ 0&1 \end{pmatrix}$, (d) $\begin{pmatrix} 1&0&0\\ 0&1&3\\ 0&0&1 \end{pmatrix}$, (e) $\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&-6&1&0\\ 0&0&0&1 \end{pmatrix}$, (f) $\begin{pmatrix} 0&0&0&1\\ 0&1&0&0\\ 0&0&1&0\\ 1&0&0&0 \end{pmatrix}$.
1.5.4. $\begin{pmatrix} 1&0&0\\ a&1&0\\ b&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ -a&1&0\\ -b&0&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -a&1&0\\ -b&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ a&1&0\\ b&0&1 \end{pmatrix}$;
$M^{-1} = \begin{pmatrix} 1&0&0\\ -a&1&0\\ ac-b&-c&1 \end{pmatrix}$.
1.5.5. The $i$th row of the matrix multiplied by the $i$th column of the inverse should equal 1. This is not possible if all the entries of the $i$th row are zero; see Exercise 1.2.24.
1.5.6. (a) $A^{-1} = \begin{pmatrix} -1&1\\ 2&-1 \end{pmatrix}$, $B^{-1} = \begin{pmatrix} \frac23&\frac13\\ -\frac13&\frac13 \end{pmatrix}$.
(b) $C = \begin{pmatrix} 2&1\\ 3&0 \end{pmatrix}$, $C^{-1} = B^{-1}A^{-1} = \begin{pmatrix} 0&\frac13\\ 1&-\frac23 \end{pmatrix}$.
1.5.7. (a) $R_\theta^{-1} = \begin{pmatrix} \cos\theta&\sin\theta\\ -\sin\theta&\cos\theta \end{pmatrix}$. (b) $\begin{pmatrix} a\\ b \end{pmatrix} = R_\theta^{-1}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta + y\sin\theta\\ -x\sin\theta + y\cos\theta \end{pmatrix}$.
(c) $\det(R_\theta - a\,I) = \det\begin{pmatrix} \cos\theta - a & -\sin\theta\\ \sin\theta & \cos\theta - a \end{pmatrix} = (\cos\theta - a)^2 + (\sin\theta)^2 > 0$
provided $\sin\theta \ne 0$, which is valid when $0 < \theta < \pi$.
1.5.8.
(a) Setting $P_1 = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}$, $P_2 = \begin{pmatrix} 0&1&0\\ 0&0&1\\ 1&0&0 \end{pmatrix}$, $P_3 = \begin{pmatrix} 0&0&1\\ 1&0&0\\ 0&1&0 \end{pmatrix}$, $P_4 = \begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix}$, $P_5 = \begin{pmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{pmatrix}$, $P_6 = \begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}$,
we find $P_1^{-1} = P_1$, $P_2^{-1} = P_3$, $P_3^{-1} = P_2$, $P_4^{-1} = P_4$, $P_5^{-1} = P_5$, $P_6^{-1} = P_6$.
(b) $P_1, P_4, P_5, P_6$ are their own inverses.
(c) Yes: $P = \begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}$ interchanges two pairs of rows.
1.5.9. (a) $\begin{pmatrix} 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0 \end{pmatrix}$, (b) $\begin{pmatrix} 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix}$, (c) $\begin{pmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&0&0&1\\ 0&1&0&0 \end{pmatrix}$, (d) $\begin{pmatrix} 1&0&0&0&0\\ 0&0&0&1&0\\ 0&1&0&0&0\\ 0&0&0&0&1\\ 0&0&1&0&0 \end{pmatrix}$.
1.5.10.
(a) If $i$ and $j = \pi(i)$ are the entries in the $i$th column of the $2\times n$ matrix corresponding to the permutation, then the entries in the $j$th column of the $2\times n$ matrix corresponding to the inverse permutation are $j$ and $i = \pi^{-1}(j)$. Equivalently, permute the columns so that the second row is in order $1, 2, \dots, n$, and then switch the two rows.
(b) The permutations correspond to
(i) $\begin{pmatrix} 1&2&3&4\\ 4&3&2&1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1&2&3&4\\ 4&1&2&3 \end{pmatrix}$, (iii) $\begin{pmatrix} 1&2&3&4\\ 1&3&4&2 \end{pmatrix}$, (iv) $\begin{pmatrix} 1&2&3&4&5\\ 1&4&2&5&3 \end{pmatrix}$.
The inverse permutations correspond to
(i) $\begin{pmatrix} 1&2&3&4\\ 4&3&2&1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1&2&3&4\\ 2&3&4&1 \end{pmatrix}$, (iii) $\begin{pmatrix} 1&2&3&4\\ 1&4&2&3 \end{pmatrix}$, (iv) $\begin{pmatrix} 1&2&3&4&5\\ 1&3&5&2&4 \end{pmatrix}$.
1.5.11. If $a = 0$, the first row is all zeros, and so $A$ is singular. Otherwise, we make $d \to 0$ by an elementary row operation. If $e = 0$, the resulting matrix has a row of all zeros. Otherwise, we make $h \to 0$ by another elementary row operation, and the result is a matrix with a row of all zeros.
1.5.12. This is true if and only if $A^2 = I$, and so, according to Exercise 1.2.36, $A$ is either of the form $\begin{pmatrix} \pm1&0\\ 0&\pm1 \end{pmatrix}$ or $\begin{pmatrix} a&b\\ c&-a \end{pmatrix}$, where $a$ is arbitrary and $bc = 1 - a^2$.
1.5.13. (3 I −A)A = 3A−A2 = I , so 3 I −A is the inverse of A.
1.5.14. $\left(\dfrac1c\,A^{-1}\right)(c\,A) = \dfrac1c\,c\,A^{-1}A = I$.
1.5.15. Indeed, (An)−1 = (A−1)n.
1.5.16. If all the diagonal entries are nonzero, then $D^{-1}D = I$. On the other hand, if one of the diagonal entries is zero, then all the entries in that row are zero, and so $D$ is not invertible.
1.5.17. Since U−1 is also upper triangular, the only nonzero summand in the product of the ith
row of U and the ith column of U−1 is the product of their diagonal entries, which mustequal 1 since U U−1 = I .
♦ 1.5.18. (a) A = I−1A I . (b) If B = S−1AS, then A = S BS−1 = T−1BT , where T = S−1.
(c) If B = S−1AS and C = T−1BT , then C = T−1(S−1AS)T = (S T )−1A(S T ).
♥ 1.5.19. (a) Suppose $D^{-1} = \begin{pmatrix} X&Y\\ Z&W \end{pmatrix}$. Then, in view of Exercise 1.2.37, the equation $DD^{-1} = I = \begin{pmatrix} I&O\\ O&I \end{pmatrix}$ requires $AX = I$, $AY = O$, $BZ = O$, $BW = I$. Thus, $X = A^{-1}$, $W = B^{-1}$ and, since they are invertible, $Y = A^{-1}O = O$, $Z = B^{-1}O = O$.
(b) $\begin{pmatrix} -\frac13&\frac23&0\\ \frac23&-\frac13&0\\ 0&0&\frac13 \end{pmatrix}$, $\begin{pmatrix} -1&1&0&0\\ -2&1&0&0\\ 0&0&-5&3\\ 0&0&2&-1 \end{pmatrix}$.
1.5.20.
(a) $BA = \begin{pmatrix} 1&1&0\\ -1&-1&1 \end{pmatrix}\begin{pmatrix} 1&-1\\ 0&1\\ 1&1 \end{pmatrix} = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}$.
(b) $AX = I$ does not have a solution. Indeed, the first column of this matrix equation is the linear system $\begin{pmatrix} 1&-1\\ 0&1\\ 1&1 \end{pmatrix}\begin{pmatrix} x\\ y \end{pmatrix} = \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix}$, which has no solutions, since $x - y = 1$, $y = 0$, and $x + y = 0$ are incompatible.
(c) Yes: for instance, $B = \begin{pmatrix} 2&3&-1\\ -1&-1&1 \end{pmatrix}$. More generally, $BA = I$ if and only if $B = \begin{pmatrix} 1-z & 1-2z & z\\ -w & 1-2w & w \end{pmatrix}$, where $z, w$ are arbitrary.
1.5.21. The general solution to $AX = I$ is $X = \begin{pmatrix} -2y&1-2v\\ y&v\\ -1&1 \end{pmatrix}$, where $y, v$ are arbitrary. Any of these matrices serves as a right inverse. On the other hand, the linear system $YA = I$ is incompatible and there is no solution.
1.5.22.
(a) No. The only solutions are complex, with $a = \Bigl(-\frac12 \pm \frac{\sqrt3}{2}\,i\Bigr)b$, where $b \ne 0$ is any nonzero complex number.
(b) Yes. A simple example is $A = \begin{pmatrix} -1&1\\ -1&0 \end{pmatrix}$, $B = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}$. The general solution to the $2\times2$ matrix equation has the form $A = BM$, where $M = \begin{pmatrix} x&y\\ z&w \end{pmatrix}$ is any matrix with $\operatorname{tr} M = x + w = -1$ and $\det M = xw - yz = 1$. To see this, if we set $A = BM$, then $(I + M)^{-1} = I + M^{-1}$, which is equivalent to $I + M + M^{-1} = O$. Writing this out using the formula (1.38) for the inverse, we find that if $\det M = xw - yz = 1$, then $\operatorname{tr} M = x + w = -1$, while if $\det M \ne 1$, then $y = z = 0$ and $x + x^{-1} + 1 = 0 = w + w^{-1} + 1$, in which case, as in part (a), there are no real solutions.
1.5.23. $E = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&7&0\\ 0&0&0&1 \end{pmatrix}$, $E^{-1} = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&\frac17&0\\ 0&0&0&1 \end{pmatrix}$.
1.5.24. (a) $\begin{pmatrix} -1&\frac23\\ 1&\frac13 \end{pmatrix}$, (b) $\begin{pmatrix} -\frac18&\frac38\\ \frac38&-\frac18 \end{pmatrix}$, (c) $\begin{pmatrix} \frac35&\frac45\\ -\frac45&\frac35 \end{pmatrix}$, (d) no inverse,
(e) $\begin{pmatrix} 3&-2&-2\\ 9&-7&-6\\ 1&-1&-1 \end{pmatrix}$, (f) $\begin{pmatrix} -\frac58&\frac18&\frac58\\ -\frac12&\frac12&-\frac12\\ \frac78&-\frac38&\frac18 \end{pmatrix}$, (g) $\begin{pmatrix} -\frac52&\frac32&\frac12\\ 2&-1&-1\\ 2&-1&0 \end{pmatrix}$,
(h) $\begin{pmatrix} 0&2&1&1\\ 1&-6&-2&-3\\ 0&-5&0&-3\\ 0&2&0&1 \end{pmatrix}$, (i) $\begin{pmatrix} -51&8&12&3\\ -13&2&3&1\\ 21&-3&-5&-1\\ 5&-1&-1&0 \end{pmatrix}$.
1.5.25.
(a) $\begin{pmatrix} 1&0\\ 3&1 \end{pmatrix}\begin{pmatrix} 1&0\\ 0&3 \end{pmatrix}\begin{pmatrix} 1&-2\\ 0&1 \end{pmatrix} = \begin{pmatrix} 1&-2\\ 3&-3 \end{pmatrix}$,
(b) $\begin{pmatrix} 1&0\\ 3&1 \end{pmatrix}\begin{pmatrix} 1&0\\ 0&-8 \end{pmatrix}\begin{pmatrix} 1&3\\ 0&1 \end{pmatrix} = \begin{pmatrix} 1&3\\ 3&1 \end{pmatrix}$,
(c) $\begin{pmatrix} 1&0\\ \frac43&1 \end{pmatrix}\begin{pmatrix} \frac35&0\\ 0&1 \end{pmatrix}\begin{pmatrix} 1&0\\ 0&\frac53 \end{pmatrix}\begin{pmatrix} 1&-\frac43\\ 0&1 \end{pmatrix} = \begin{pmatrix} \frac35&-\frac45\\ \frac45&\frac35 \end{pmatrix}$,
(d) not possible,
(e) $\begin{pmatrix} 1&0&0\\ 3&1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ -2&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&-1&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&-1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&-1 \end{pmatrix}\begin{pmatrix} 1&0&-2\\ 0&1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&-6\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} 1&0&-2\\ 3&-1&0\\ -2&1&-3 \end{pmatrix}$,
(f) $\begin{pmatrix} 1&0&0\\ 3&1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 2&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&3&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&-1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&8 \end{pmatrix}\begin{pmatrix} 1&0&3\\ 0&1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&4\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&2&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} 1&2&3\\ 3&5&5\\ 2&1&2 \end{pmatrix}$,
(g) $\begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 2&0&1 \end{pmatrix}\begin{pmatrix} 2&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&-1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&-1 \end{pmatrix}\begin{pmatrix} 1&0&1\\ 0&1&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&1&-1\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&\frac12&0\\ 0&1&0\\ 0&0&1 \end{pmatrix} = \begin{pmatrix} 2&1&2\\ 4&2&3\\ 0&-1&1 \end{pmatrix}$,
(h) $\begin{pmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&1&0&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ \frac12&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&-2&1 \end{pmatrix}\begin{pmatrix} 2&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&-\frac12&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&\frac12\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&3\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&3\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&\frac12&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix} = \begin{pmatrix} 2&1&0&1\\ 0&0&1&3\\ 1&0&0&-1\\ 0&0&-2&-5 \end{pmatrix}$,
(i) $\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 2&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 3&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&2&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&-1&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&-1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&-1 \end{pmatrix}\begin{pmatrix} 1&0&0&1\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&-2\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&-5\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&1&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&1&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}\begin{pmatrix} 1&-2&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix} = \begin{pmatrix} 1&-2&1&1\\ 2&-3&3&0\\ 3&-7&2&4\\ 0&2&1&1 \end{pmatrix}$.
1.5.26. Applying Gaussian Elimination:
$E_1 = \begin{pmatrix} 1&0\\ -\frac{1}{\sqrt3}&1 \end{pmatrix}$, $E_1A = \begin{pmatrix} \frac{\sqrt3}{2}&-\frac12\\ 0&\frac{2}{\sqrt3} \end{pmatrix}$;
$E_2 = \begin{pmatrix} 1&0\\ 0&\frac{\sqrt3}{2} \end{pmatrix}$, $E_2E_1A = \begin{pmatrix} \frac{\sqrt3}{2}&-\frac12\\ 0&1 \end{pmatrix}$;
$E_3 = \begin{pmatrix} \frac{2}{\sqrt3}&0\\ 0&1 \end{pmatrix}$, $E_3E_2E_1A = \begin{pmatrix} 1&-\frac{1}{\sqrt3}\\ 0&1 \end{pmatrix}$;
$E_4 = \begin{pmatrix} 1&\frac{1}{\sqrt3}\\ 0&1 \end{pmatrix}$, $E_4E_3E_2E_1A = I = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}$;
and hence $A = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1} = \begin{pmatrix} 1&0\\ \frac{1}{\sqrt3}&1 \end{pmatrix}\begin{pmatrix} 1&0\\ 0&\frac{2}{\sqrt3} \end{pmatrix}\begin{pmatrix} \frac{\sqrt3}{2}&0\\ 0&1 \end{pmatrix}\begin{pmatrix} 1&-\frac{1}{\sqrt3}\\ 0&1 \end{pmatrix}$.
1.5.27. (a) $\begin{pmatrix} -\frac{i}{2}&\frac12\\ \frac12&-\frac{i}{2} \end{pmatrix}$, (b) $\begin{pmatrix} -1&1-i\\ 1+i&-1 \end{pmatrix}$, (c) $\begin{pmatrix} i&0&-1\\ 1-i&-i&1\\ -1&-1&-i \end{pmatrix}$, (d) $\begin{pmatrix} 3+i&-1-i&-i\\ -4+4i&2-i&2+i\\ -1+2i&1-i&1 \end{pmatrix}$.
1.5.28. No. If they have the same solution, then they both reduce to $\bigl(\,I \mid x\,\bigr)$ under elementary row operations. Thus, by applying the appropriate elementary row operations to reduce the augmented matrix of the first system to $\bigl(\,I \mid x\,\bigr)$, and then applying the inverse elementary row operations, we arrive at the augmented matrix for the second system. Thus, the first system can be changed into the second by the combined sequence of elementary row operations, proving equivalence. (See also Exercise 2.5.44 for the general case.)
♥ 1.5.29.
(a) If $\widetilde A = E_N E_{N-1}\cdots E_2E_1A$, where $E_1, \dots, E_N$ represent the row operations applied to $A$, then $\widetilde C = \widetilde A B = E_N E_{N-1}\cdots E_2E_1AB = E_N E_{N-1}\cdots E_2E_1C$, which represents the same sequence of row operations applied to $C$.
(b) $(EA)B = \begin{pmatrix} 1&2&-1\\ 2&-3&2\\ -2&-3&-2 \end{pmatrix}\begin{pmatrix} 1&-2\\ 3&0\\ -1&1 \end{pmatrix} = \begin{pmatrix} 8&-3\\ -9&-2\\ -9&2 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ -2&0&1 \end{pmatrix}\begin{pmatrix} 8&-3\\ -9&-2\\ 7&-4 \end{pmatrix} = E(AB)$.
1.5.30. (a) $\begin{pmatrix} \frac12&\frac12\\ \frac14&-\frac14 \end{pmatrix}\begin{pmatrix} 1\\ -2 \end{pmatrix} = \begin{pmatrix} -\frac12\\ \frac34 \end{pmatrix}$; (b) $\begin{pmatrix} \frac{5}{17}&\frac{2}{17}\\ -\frac{1}{17}&\frac{3}{17} \end{pmatrix}\begin{pmatrix} 2\\ 12 \end{pmatrix} = \begin{pmatrix} 2\\ 2 \end{pmatrix}$;
(c) $\begin{pmatrix} 2&-\frac52&\frac32\\ 1&-1&0\\ 0&\frac12&-\frac12 \end{pmatrix}\begin{pmatrix} 3\\ -2\\ 2 \end{pmatrix} = \begin{pmatrix} 14\\ 5\\ -2 \end{pmatrix}$; (d) $\begin{pmatrix} 9&-15&-8\\ 6&-10&-5\\ -1&2&1 \end{pmatrix}\begin{pmatrix} 3\\ -1\\ 5 \end{pmatrix} = \begin{pmatrix} 2\\ 3\\ 0 \end{pmatrix}$;
(e) $\begin{pmatrix} -4&3&1\\ 2&-1&0\\ 3&-1&1 \end{pmatrix}\begin{pmatrix} 3\\ 5\\ -7 \end{pmatrix} = \begin{pmatrix} -4\\ 1\\ -3 \end{pmatrix}$; (f) $\begin{pmatrix} 1&0&1&1\\ 0&0&-1&-1\\ 2&-1&-1&0\\ 2&-1&-1&-1 \end{pmatrix}\begin{pmatrix} 4\\ 11\\ -7\\ 6 \end{pmatrix} = \begin{pmatrix} 3\\ 1\\ 4\\ -2 \end{pmatrix}$;
(g) $\begin{pmatrix} 1&1&0&1\\ -\frac52&-2&\frac32&-\frac32\\ -4&-3&2&-3\\ -\frac12&-1&\frac12&-\frac12 \end{pmatrix}\begin{pmatrix} -2\\ 3\\ 3\\ 2 \end{pmatrix} = \begin{pmatrix} 3\\ \frac12\\ -1\\ -\frac32 \end{pmatrix}$.
1.5.31. (a) $\begin{pmatrix} -\frac13\\ -\frac23 \end{pmatrix}$, (b) $\begin{pmatrix} \frac14\\ \frac14 \end{pmatrix}$, (c) $\begin{pmatrix} \frac75\\ -\frac15 \end{pmatrix}$, (d) singular matrix,
(e) $\begin{pmatrix} -1\\ -4\\ -1 \end{pmatrix}$, (f) $\begin{pmatrix} \frac18\\ -\frac12\\ \frac58 \end{pmatrix}$, (g) $\begin{pmatrix} -\frac12\\ 0\\ 1 \end{pmatrix}$, (h) $\begin{pmatrix} 4\\ -10\\ -8\\ 3 \end{pmatrix}$, (i) $\begin{pmatrix} -28\\ -7\\ 12\\ 3 \end{pmatrix}$.
1.5.32.
(a) $\begin{pmatrix} 1&2\\ -3&1 \end{pmatrix} = \begin{pmatrix} 1&0\\ -3&1 \end{pmatrix}\begin{pmatrix} 1&0\\ 0&7 \end{pmatrix}\begin{pmatrix} 1&2\\ 0&1 \end{pmatrix}$;
(b) $\begin{pmatrix} 0&1\\ 1&0 \end{pmatrix}\begin{pmatrix} 0&4\\ -7&2 \end{pmatrix} = \begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}\begin{pmatrix} -7&0\\ 0&4 \end{pmatrix}\begin{pmatrix} 1&-\frac27\\ 0&1 \end{pmatrix}$;
(c) $\begin{pmatrix} 2&1&2\\ 2&4&-1\\ 0&-2&1 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 1&1&0\\ 0&-\frac23&1 \end{pmatrix}\begin{pmatrix} 2&0&0\\ 0&3&0\\ 0&0&-1 \end{pmatrix}\begin{pmatrix} 1&\frac12&1\\ 0&1&-1\\ 0&0&1 \end{pmatrix}$;
(d) $\begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}\begin{pmatrix} 1&1&5\\ 1&1&-2\\ 2&-1&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ 2&1&0\\ 1&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&-3&0\\ 0&0&-7 \end{pmatrix}\begin{pmatrix} 1&1&5\\ 0&1&\frac73\\ 0&0&1 \end{pmatrix}$;
(e) $\begin{pmatrix} 2&-3&2\\ 1&-1&1\\ 1&-1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ \frac12&1&0\\ \frac12&1&1 \end{pmatrix}\begin{pmatrix} 2&0&0\\ 0&\frac12&0\\ 0&0&1 \end{pmatrix}\begin{pmatrix} 1&-\frac32&1\\ 0&1&0\\ 0&0&1 \end{pmatrix}$;
(f) $\begin{pmatrix} 1&-1&1&2\\ 1&-4&1&5\\ 1&2&-1&-1\\ 3&1&1&6 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ 1&1&0&0\\ 1&-1&1&0\\ 3&-\frac43&1&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&-3&0&0\\ 0&0&-2&0\\ 0&0&0&4 \end{pmatrix}\begin{pmatrix} 1&-1&1&2\\ 0&1&0&-1\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}$;
(g) $\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}\begin{pmatrix} 1&0&2&-3\\ 2&-2&0&1\\ 1&-2&-2&-1\\ 0&1&1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ 2&1&0&0\\ 0&-\frac12&1&0\\ 1&1&0&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&-2&0&0\\ 0&0&-1&0\\ 0&0&0&-5 \end{pmatrix}\begin{pmatrix} 1&0&2&-3\\ 0&1&2&-\frac72\\ 0&0&1&-\frac{11}{2}\\ 0&0&0&1 \end{pmatrix}$.
1.5.33.
(a) $\begin{pmatrix} -\frac37\\ \frac57 \end{pmatrix}$, (b) $\begin{pmatrix} -8\\ 3 \end{pmatrix}$, (c) $\begin{pmatrix} \frac16\\ -\frac23\\ \frac23 \end{pmatrix}$, (d) $\begin{pmatrix} 1\\ -2\\ 0 \end{pmatrix}$, (e) $\begin{pmatrix} -12\\ -3\\ 7 \end{pmatrix}$, (f) $\begin{pmatrix} \frac73\\ \frac25\\ -\frac53 \end{pmatrix}$, (g) $\begin{pmatrix} 0\\ 1\\ 0\\ -2 \end{pmatrix}$.
1.6.1. (a) $(\,1\;\;5\,)$, (b) $\begin{pmatrix} 1&0\\ 1&2 \end{pmatrix}$, (c) $\begin{pmatrix} 1&2\\ 2&1 \end{pmatrix}$, (d) $\begin{pmatrix} 1&2\\ 2&0\\ -1&2 \end{pmatrix}$, (e) $\begin{pmatrix} 1\\ 2\\ -3 \end{pmatrix}$, (f) $\begin{pmatrix} 1&3&5\\ 2&4&6 \end{pmatrix}$, (g) $\begin{pmatrix} 1&0&1\\ 2&3&1\\ -1&2&5 \end{pmatrix}$.
1.6.2. $A^T = \begin{pmatrix} 3&1\\ -1&2\\ -1&1 \end{pmatrix}$, $B^T = \begin{pmatrix} -1&2&-3\\ 2&0&4 \end{pmatrix}$,
$(AB)^T = B^TA^T = \begin{pmatrix} -2&0\\ 2&6 \end{pmatrix}$, $(BA)^T = A^TB^T = \begin{pmatrix} -1&6&-5\\ 5&-2&11\\ 3&-2&7 \end{pmatrix}$.
1.6.3. If $A$ has size $m\times n$ and $B$ has size $n\times p$, then $(AB)^T$ has size $p\times m$. Further, $A^T$ has size $n\times m$ and $B^T$ has size $p\times n$, and so unless $m = p$ the product $A^TB^T$ is not defined. If $m = p$, then $A^TB^T$ has size $n\times n$, and so, to equal $(AB)^T$, we must have $m = n = p$, so the matrices are square. Finally, taking the transpose of both sides, $AB = (A^TB^T)^T = (B^T)^T(A^T)^T = BA$, and so they must commute.
♦ 1.6.4. The $(i,j)$ entry of $C = (AB)^T$ is the $(j,i)$ entry of $AB$, so
$c_{ij} = \displaystyle\sum_{k=1}^n a_{jk}b_{ki} = \sum_{k=1}^n \widetilde b_{ik}\,\widetilde a_{kj}$,
where $\widetilde a_{ij} = a_{ji}$ and $\widetilde b_{ij} = b_{ji}$ are the entries of $A^T$ and $B^T$, respectively. Thus, $c_{ij}$ equals the $(i,j)$ entry of the product $B^TA^T$.
1.6.5. (ABC)T = CT BT AT
1.6.6. False. For example, $\begin{pmatrix} 1&1\\ 0&1 \end{pmatrix}$ does not commute with its transpose.
♦ 1.6.7. If $A = \begin{pmatrix} a&b\\ c&d \end{pmatrix}$, then $A^TA = AA^T$ if and only if $b^2 = c^2$ and $(a-d)(b-c) = 0$. So either $b = c$, or $c = -b \ne 0$ and $a = d$. Thus all normal $2\times2$ matrices are of the form $\begin{pmatrix} a&b\\ b&d \end{pmatrix}$ or $\begin{pmatrix} a&b\\ -b&a \end{pmatrix}$.
1.6.8.
(a) $(AB)^{-T} = ((AB)^T)^{-1} = (B^TA^T)^{-1} = (A^T)^{-1}(B^T)^{-1} = A^{-T}B^{-T}$.
(b) $AB = \begin{pmatrix} 1&0\\ 2&1 \end{pmatrix}$, so $(AB)^{-T} = \begin{pmatrix} 1&-2\\ 0&1 \end{pmatrix}$, while $A^{-T} = \begin{pmatrix} 0&-1\\ 1&1 \end{pmatrix}$, $B^{-T} = \begin{pmatrix} 1&-1\\ -1&2 \end{pmatrix}$, so $A^{-T}B^{-T} = \begin{pmatrix} 1&-2\\ 0&1 \end{pmatrix}$.
1.6.9. If A is invertible, then so is AT by Lemma 1.32; then by Lemma 1.21 AAT and AT A areinvertible.
1.6.10. No; for example, $\begin{pmatrix} 1\\ 2 \end{pmatrix}(\,3\;\;4\,) = \begin{pmatrix} 3&4\\ 6&8 \end{pmatrix}$, while $\begin{pmatrix} 3\\ 4 \end{pmatrix}(\,1\;\;2\,) = \begin{pmatrix} 3&6\\ 4&8 \end{pmatrix}$.
1.6.11. No. In general, BT A is the transpose of AT B.
♦ 1.6.12.
(a) The $i$th entry of $Ae_j$ is the product of the $i$th row of $A$ with $e_j$. Since all the entries in $e_j$ are zero except the $j$th entry, the product will be equal to $a_{ij}$, i.e., the $(i,j)$ entry of $A$.
(b) By part (a), $e_i^TAe_j$ is the product of the row matrix $e_i^T$ and the $j$th column of $A$. Since all the entries in $e_i^T$ are zero except the $i$th entry, multiplication by the $j$th column of $A$ will produce $a_{ij}$.
♦ 1.6.13.
(a) Using Exercise 1.6.12, $a_{ij} = e_i^TAe_j = e_i^TBe_j = b_{ij}$ for all $i, j$.
(b) Two examples: $A = \begin{pmatrix} 1&2\\ 0&1 \end{pmatrix}$, $B = \begin{pmatrix} 1&1\\ 1&1 \end{pmatrix}$; $\;A = \begin{pmatrix} 0&0\\ 0&0 \end{pmatrix}$, $B = \begin{pmatrix} 0&-1\\ 1&0 \end{pmatrix}$.
♦ 1.6.14.
(a) If $p_{ij} = 1$, then $PA$ maps the $j$th row of $A$ to its $i$th row. Then $Q = P^T$ has $q_{ji} = 1$, and so it does the reverse, mapping the $i$th row of $A$ to its $j$th row. Since this holds for all such entries, the result follows.
(b) No. Any rotation matrix $\begin{pmatrix} \cos\theta&-\sin\theta\\ \sin\theta&\cos\theta \end{pmatrix}$ also has this property. See Section 5.3.
♦ 1.6.15.
(a) Note that $(AP^T)^T = PA^T$, which permutes the rows of $A^T$, which are the columns of $A$, according to the permutation $P$.
(b) The effect of multiplying $PAP^T$ is to simultaneously permute the rows and columns of $A$ according to the permutation $P$. Associativity of matrix multiplication implies that it does not matter whether the rows or the columns are permuted first.
♥ 1.6.16.
(a) Note that $w^Tv$ is a scalar, and so
$AA^{-1} = (I - vw^T)(I - c\,vw^T) = I - (1+c)\,vw^T + c\,v(w^Tv)w^T = I - (1 + c - c\,w^Tv)\,vw^T = I$
provided $c = 1/(w^Tv - 1)$, which works whenever $w^Tv \ne 1$.
(b) $A = I - vw^T = \begin{pmatrix} 2&-2\\ 3&-5 \end{pmatrix}$ and $c = \dfrac{1}{v^Tw - 1} = \dfrac14$, so $A^{-1} = I - \dfrac14\,vw^T = \begin{pmatrix} \frac54&-\frac12\\ \frac34&-\frac12 \end{pmatrix}$.
(c) If $w^Tv = 1$, then $A$ is singular, since $Av = 0$ and $v \ne 0$, and so the homogeneous system does not have a unique solution.
1.6.17. (a) a = 1; (b) a = −1, b = 2, c = 3; (c) a = −2, b = −1, c = −5.
1.6.18.
(a) $\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{pmatrix}$, $\begin{pmatrix} 1&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}$.
(b) $\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1 \end{pmatrix}$, $\begin{pmatrix} 0&0&0&1\\ 0&1&0&0\\ 0&0&1&0\\ 1&0&0&0 \end{pmatrix}$, $\begin{pmatrix} 1&0&0&0\\ 0&0&1&0\\ 0&1&0&0\\ 0&0&0&1 \end{pmatrix}$,
$\begin{pmatrix} 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0 \end{pmatrix}$, $\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}$, $\begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&1\\ 0&0&1&0 \end{pmatrix}$, $\begin{pmatrix} 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix}$, $\begin{pmatrix} 0&0&0&1\\ 0&0&1&0\\ 0&1&0&0\\ 1&0&0&0 \end{pmatrix}$.
1.6.19. True, since (A2)T = (AA)T = AT AT = AA = A2.
♦ 1.6.20. True. Invert both sides of the equation AT = A, and use Lemma 1.32.
♦ 1.6.21. False. For example, $\begin{pmatrix} 0&1\\ 1&0 \end{pmatrix}\begin{pmatrix} 2&1\\ 1&3 \end{pmatrix} = \begin{pmatrix} 1&3\\ 2&1 \end{pmatrix}$.
1.6.22.
(a) If $D$ is a diagonal matrix, then for all $i \ne j$ we have $a_{ij} = a_{ji} = 0$, so $D$ is symmetric.
(b) If $L$ is lower triangular, then $a_{ij} = 0$ for $i < j$; if it is also symmetric, then $a_{ji} = 0$ for $i < j$ as well, so $L$ is diagonal. Conversely, if $L$ is diagonal, then $a_{ij} = 0$ for $i < j$, so $L$ is lower triangular, and it is symmetric.
1.6.23.
(a) Since $A$ is symmetric, $(A^n)^T = (AA\cdots A)^T = A^TA^T\cdots A^T = AA\cdots A = A^n$.
(b) $(2A^2 - 3A + I)^T = 2(A^2)^T - 3A^T + I = 2A^2 - 3A + I$.
(c) If $p(A) = c_nA^n + \cdots + c_1A + c_0I$, then $p(A)^T = c_n(A^T)^n + \cdots + c_1A^T + c_0I = p(A^T)$. In particular, if $A = A^T$, then $p(A)^T = p(A^T) = p(A)$.
1.6.24. If A has size m × n, then AT has size n × m and so both products are defined. Also,
KT = (AT A)T = AT (AT )T = AT A = K and LT = (AAT )T = (AT )T AT = AAT = L.
1.6.25.
(a) $\begin{pmatrix} 1&1\\ 1&4 \end{pmatrix} = \begin{pmatrix} 1&0\\ 1&1 \end{pmatrix}\begin{pmatrix} 1&0\\ 0&3 \end{pmatrix}\begin{pmatrix} 1&1\\ 0&1 \end{pmatrix}$,
(b) $\begin{pmatrix} -2&3\\ 3&-1 \end{pmatrix} = \begin{pmatrix} 1&0\\ -\frac32&1 \end{pmatrix}\begin{pmatrix} -2&0\\ 0&\frac72 \end{pmatrix}\begin{pmatrix} 1&-\frac32\\ 0&1 \end{pmatrix}$,
(c) $\begin{pmatrix} 1&-1&-1\\ -1&3&2\\ -1&2&0 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -1&1&0\\ -1&\frac12&1 \end{pmatrix}\begin{pmatrix} 1&0&0\\ 0&2&0\\ 0&0&-\frac32 \end{pmatrix}\begin{pmatrix} 1&-1&-1\\ 0&1&\frac12\\ 0&0&1 \end{pmatrix}$,
(d) $\begin{pmatrix} 1&-1&0&3\\ -1&2&2&0\\ 0&2&-1&0\\ 3&0&0&1 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ -1&1&0&0\\ 0&2&1&0\\ 3&3&\frac65&1 \end{pmatrix}\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&-5&0\\ 0&0&0&-\frac{49}{5} \end{pmatrix}\begin{pmatrix} 1&-1&0&3\\ 0&1&2&3\\ 0&0&1&\frac65\\ 0&0&0&1 \end{pmatrix}$.
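Factorizations like the ones above come from the symmetric variant of Gaussian Elimination. A minimal Python sketch (our own naming; it assumes a regular symmetric input and does no pivoting):

```python
def ldlt(A):
    """Factor a regular symmetric matrix as A = L D L^T.
    Returns (L, D): L unit lower triangular, D the list of pivots."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n):
        # pivot d_j = a_jj minus the contributions of earlier columns
        D[j] = A[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        if D[j] == 0:
            raise ValueError("zero pivot: matrix is not regular")
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] * D[k]
                                     for k in range(j))) / D[j]
    return L, D
```

Applied to the matrix of part (c), it returns the pivots $1, 2, -\frac32$ and the multiplier $l_{32} = \frac12$ shown above.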
1.6.26. $M_2 = \begin{pmatrix} 1&0\\ \frac12&1 \end{pmatrix}\begin{pmatrix} 2&0\\ 0&\frac32 \end{pmatrix}\begin{pmatrix} 1&\frac12\\ 0&1 \end{pmatrix}$, $\;M_3 = \begin{pmatrix} 1&0&0\\ \frac12&1&0\\ 0&\frac23&1 \end{pmatrix}\begin{pmatrix} 2&0&0\\ 0&\frac32&0\\ 0&0&\frac43 \end{pmatrix}\begin{pmatrix} 1&\frac12&0\\ 0&1&\frac23\\ 0&0&1 \end{pmatrix}$,
$M_4 = \begin{pmatrix} 1&0&0&0\\ \frac12&1&0&0\\ 0&\frac23&1&0\\ 0&0&\frac34&1 \end{pmatrix}\begin{pmatrix} 2&0&0&0\\ 0&\frac32&0&0\\ 0&0&\frac43&0\\ 0&0&0&\frac54 \end{pmatrix}\begin{pmatrix} 1&\frac12&0&0\\ 0&1&\frac23&0\\ 0&0&1&\frac34\\ 0&0&0&1 \end{pmatrix}$.
♦ 1.6.27. The matrix is not regular, since after the first set of row operations the $(2,2)$ entry is 0. More explicitly, if
$L = \begin{pmatrix} 1&0&0\\ a&1&0\\ b&c&1 \end{pmatrix}$, $D = \begin{pmatrix} p&0&0\\ 0&q&0\\ 0&0&r \end{pmatrix}$, then $LDL^T = \begin{pmatrix} p&ap&bp\\ ap&a^2p+q&abp+cq\\ bp&abp+cq&b^2p+c^2q+r \end{pmatrix}$.
Equating this to $A$, the $(1,1)$ entry requires $p = 1$, and so the $(1,2)$ entry requires $a = 2$, but the $(2,2)$ entry then implies $q = 0$, which is not an allowed diagonal entry for $D$. Even if we ignore this, the $(1,3)$ entry would set $b = 1$, but then the $(2,3)$ entry says $abp + cq = 2 \ne -1$, which is a contradiction.
♦ 1.6.28. Write $A = LDV$; then $A^T = V^TDL^T = \widetilde L\,\widetilde U$, where $\widetilde L = V^T$ and $\widetilde U = DL^T$. Thus, $A^T$ is regular, since the diagonal entries of $\widetilde U$, which are the pivots of $A^T$, are the same as those of $D$ and $U$, which are the pivots of $A$.
♥ 1.6.29. (a) The diagonal entries satisfy $j_{ii} = -j_{ii}$, and so must be 0. (b) $\begin{pmatrix} 0&1\\ -1&0 \end{pmatrix}$. (c) No, because the $(1,1)$ entry is always 0. (d) Invert both sides of the equation $J^T = -J$ and use Lemma 1.32. (e) $(J^T)^T = J = -J^T$; $(J \pm K)^T = J^T \pm K^T = -J \mp K = -(J \pm K)$. $JK$ is not, in general, skew-symmetric; for instance, $\begin{pmatrix} 0&1\\ -1&0 \end{pmatrix}\begin{pmatrix} 0&1\\ -1&0 \end{pmatrix} = \begin{pmatrix} -1&0\\ 0&-1 \end{pmatrix}$.
(f) Since it is a scalar, $v^TJv = (v^TJv)^T = v^TJ^T(v^T)^T = -v^TJv$ equals its own negative, and so is zero.
1.6.30.
(a) Let $S = \frac12(A + A^T)$, $J = \frac12(A - A^T)$. Then $S^T = S$, $J^T = -J$, and $A = S + J$.
(b) $\begin{pmatrix} 1&2\\ 3&4 \end{pmatrix} = \begin{pmatrix} 1&\frac52\\ \frac52&4 \end{pmatrix} + \begin{pmatrix} 0&-\frac12\\ \frac12&0 \end{pmatrix}$; $\;\begin{pmatrix} 1&2&3\\ 4&5&6\\ 7&8&9 \end{pmatrix} = \begin{pmatrix} 1&3&5\\ 3&5&7\\ 5&7&9 \end{pmatrix} + \begin{pmatrix} 0&-1&-2\\ 1&0&-1\\ 2&1&0 \end{pmatrix}$.
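The decomposition of part (a) is a one-liner in code. A short Python sketch (our own naming):

```python
def symmetric_skew_split(A):
    """Split A = S + J with S = (A + A^T)/2 symmetric and
    J = (A - A^T)/2 skew-symmetric, as in Exercise 1.6.30."""
    n = len(A)
    S = [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]
    J = [[(A[i][j] - A[j][i]) / 2 for j in range(n)] for i in range(n)]
    return S, J
```

For $A = \begin{pmatrix} 1&2\\ 3&4 \end{pmatrix}$ it returns the two summands displayed in part (b).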
1.7.1.
(a) The solution is $x = -\frac{10}{7}$, $y = -\frac{19}{7}$. Gaussian Elimination and Back Substitution requires 2 multiplications and 3 additions; Gauss–Jordan also uses 2 multiplications and 3 additions; finding $A^{-1} = \begin{pmatrix} \frac17&\frac27\\ -\frac37&\frac17 \end{pmatrix}$ by the Gauss–Jordan method requires 2 additions and 4 multiplications, while computing the solution $x = \begin{pmatrix} \frac17&\frac27\\ -\frac37&\frac17 \end{pmatrix}\begin{pmatrix} 4\\ -7 \end{pmatrix} = \begin{pmatrix} -\frac{10}{7}\\ -\frac{19}{7} \end{pmatrix}$ takes another 4 multiplications and 2 additions.
(b) The solution is $x = -4$, $y = -5$, $z = -1$. Gaussian Elimination and Back Substitution requires 17 multiplications and 11 additions; Gauss–Jordan uses 20 multiplications and 11 additions; computing $A^{-1} = \begin{pmatrix} 0&-1&-1\\ 2&-8&-5\\ \frac32&-5&-3 \end{pmatrix}$ takes 27 multiplications and 12 additions, while multiplying $A^{-1}b = x$ takes another 9 multiplications and 6 additions.
(c) The solution is $x = 2$, $y = 1$, $z = \frac25$. Gaussian Elimination and Back Substitution requires 6 multiplications and 5 additions; Gauss–Jordan is the same: 6 multiplications and 5 additions; computing $A^{-1} = \begin{pmatrix} -\frac12&\frac32&\frac32\\ -\frac12&\frac12&\frac12\\ -\frac25&0&-\frac15 \end{pmatrix}$ takes 11 multiplications and 3 additions, while multiplying $A^{-1}b = x$ takes another 8 multiplications and 5 additions.
1.7.2.
(a) For a general matrix $A$, each entry of $A^2$ requires $n$ multiplications and $n-1$ additions, for a total of $n^3$ multiplications and $n^3 - n^2$ additions, and so, when compared with the efficient version of the Gauss–Jordan algorithm, takes exactly the same amount of computation.
(b) $A^3 = A^2A$ requires a total of $2n^3$ multiplications and $2n^3 - 2n^2$ additions, and so is about twice as slow.
(c) You can compute $A^4$ as $A^2A^2$, and so only 2 matrix multiplications are required. In general, if $2^r \le k < 2^{r+1}$ has $j$ ones in its binary representation, then you need $r$ multiplications to compute $A^2, A^4, A^8, \dots, A^{2^r}$, followed by $j - 1$ multiplications to form $A^k$ as a product of these particular powers, for a total of $r + j - 1$ matrix multiplications, and hence a total of $(r+j-1)\,n^3$ multiplications and $(r+j-1)\,n^2(n-1)$ additions. See Exercise 1.7.8 and [11] for more sophisticated ways to speed up the computation.
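The repeated-squaring idea of part (c) can be sketched in Python as follows (our own naming; matrices as nested lists):

```python
def mat_mult(X, Y):
    """Naive n x n matrix product (n^3 scalar multiplications)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(A, k):
    """Compute A^k with about log2(k) matrix products by scanning the
    binary digits of k, as described in Exercise 1.7.2(c)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    square = [row[:] for row in A]          # successively A, A^2, A^4, ...
    while k:
        if k & 1:                           # this binary digit of k contributes
            result = mat_mult(result, square)
        square = mat_mult(square, square)
        k >>= 1
    return result
```

For example, powers of $\begin{pmatrix} 1&1\\ 0&1 \end{pmatrix}$ satisfy $A^k = \begin{pmatrix} 1&k\\ 0&1 \end{pmatrix}$, which gives a quick correctness check.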
1.7.3. Back Substitution requires about one half the number of arithmetic operations as multi-plying a matrix times a vector, and so is twice as fast.
♦ 1.7.4. We begin by proving (1.61). We must show that $1 + 2 + 3 + \cdots + (n-1) = n(n-1)/2$ for $n = 2, 3, \dots$. For $n = 2$ both sides equal 1. Assume that (1.61) is true for $n = k$. Then $1 + 2 + 3 + \cdots + (k-1) + k = k(k-1)/2 + k = k(k+1)/2$, so (1.61) is true for $n = k+1$. Now the first equation in (1.62) follows if we note that $1 + 2 + 3 + \cdots + (n-1) + n = n(n+1)/2$.
Next we prove the first equation in (1.60), namely $2 + 6 + 12 + \cdots + (n-1)n = \frac13n^3 - \frac13n$ for $n = 2, 3, \dots$. For $n = 2$ both sides equal 2. Assume that the formula is true for $n = k$. Then $2 + 6 + 12 + \cdots + (k-1)k + k(k+1) = \frac13k^3 - \frac13k + k^2 + k = \frac13(k+1)^3 - \frac13(k+1)$, so the formula is true for $n = k+1$, which completes the induction step. The proof of the second equation is similar, or, alternatively, one can use the first equation and (1.61) to show that
$\displaystyle\sum_{j=1}^n (n-j)^2 = \sum_{j=1}^n (n-j)(n-j+1) - \sum_{j=1}^n (n-j) = \frac{n^3 - n}{3} - \frac{n^2 - n}{2} = \frac{2n^3 - 3n^2 + n}{6}.$
♥ 1.7.5. We may assume that the matrix is regular, so $P = I$, since row interchanges have no effect on the number of arithmetic operations.
(a) First, according to (1.60), it takes $\frac13n^3 - \frac13n$ multiplications and $\frac13n^3 - \frac12n^2 + \frac16n$ additions to factor $A = LU$. To solve $Lc_j = e_j$ by Forward Substitution, the first $j-1$ entries of $c_j$ are automatically 0, the $j$th entry is 1, and then, for $k = j+1, \dots, n$, we need $k - j - 1$ multiplications and the same number of additions to compute the $k$th entry, for a total of $\frac12(n-j)(n-j-1)$ multiplications and additions to find $c_j$. Similarly, to solve $Ux_j = c_j$ for the $j$th column of $A^{-1}$ requires $\frac12n^2 + \frac12n$ multiplications and, since the first $j-1$ entries of $c_j$ are 0, also $\frac12n^2 - \frac12n - j + 1$ additions. The grand total is $n^3$ multiplications and $n(n-1)^2$ additions.
(b) Starting with the large augmented matrix $M = \bigl(\,A \mid I\,\bigr)$, it takes $\frac12n^2(n-1)$ multiplications and $\frac12n(n-1)^2$ additions to reduce it to triangular form $\bigl(\,U \mid C\,\bigr)$, with $U$ upper triangular and $C$ lower triangular, then $n^2$ multiplications to obtain the special upper triangular form $\bigl(\,V \mid B\,\bigr)$, and then $\frac12n^2(n-1)$ multiplications and, since $B$ is upper triangular, $\frac12n(n-1)^2$ additions to produce the final matrix $\bigl(\,I \mid A^{-1}\,\bigr)$. The grand total is $n^3$ multiplications and $n(n-1)^2$ additions. Thus, both methods take the same amount of work.
1.7.6. Combining (1.60–61), we see that it takes $\frac13n^3 + \frac12n^2 - \frac56n$ multiplications and $\frac13n^3 - \frac13n$ additions to reduce the augmented matrix to upper triangular form $\bigl(\,U \mid c\,\bigr)$. Dividing the $j$th row by its pivot requires $n - j + 1$ multiplications, for a total of $\frac12n^2 + \frac12n$ multiplications to produce the special upper triangular form $\bigl(\,V \mid e\,\bigr)$. To produce the solved form $\bigl(\,I \mid d\,\bigr)$ requires an additional $\frac12n^2 - \frac12n$ multiplications and the same number of additions, for a grand total of $\frac13n^3 + \frac32n^2 - \frac56n$ multiplications and $\frac13n^3 + \frac12n^2 - \frac56n$ additions needed to solve the system.
1.7.7. Less efficient, by roughly a factor of $\frac32$: it takes $\frac12n^3 + n^2 - \frac12n$ multiplications and $\frac12n^3 - \frac12n$ additions.
♥ 1.7.8.
(a) $D_1 + D_3 - D_4 - D_6 = (A_1 + A_4)(B_1 + B_4) + (A_2 - A_4)(B_3 + B_4) - (A_1 + A_2)B_4 - A_4(B_1 - B_3) = A_1B_1 + A_2B_3 = C_1$,
$D_4 + D_7 = (A_1 + A_2)B_4 + A_1(B_2 - B_4) = A_1B_2 + A_2B_4 = C_2$,
$D_5 - D_6 = (A_3 + A_4)B_1 - A_4(B_1 - B_3) = A_3B_1 + A_4B_3 = C_3$,
$D_1 - D_2 - D_5 + D_7 = (A_1 + A_4)(B_1 + B_4) - (A_1 - A_3)(B_1 + B_2) - (A_3 + A_4)B_1 + A_1(B_2 - B_4) = A_3B_2 + A_4B_4 = C_4$.
(b) To compute $D_1, \dots, D_7$ requires 7 multiplications and 10 additions; then to compute $C_1, C_2, C_3, C_4$ requires an additional 8 additions, for a total of 7 multiplications and 18 additions. The traditional method for computing the product of two $2\times2$ matrices requires 8 multiplications and 4 additions.
(c) The method requires 7 multiplications and 18 additions of $n\times n$ matrices, for a total of $7n^3$ multiplications and $7n^2(n-1) + 18n^2 \approx 7n^3$ additions, versus $8n^3$ multiplications and $8n^2(n-1) \approx 8n^3$ additions for the direct method, so there is a savings by a factor of $\frac78$.
(d) Let $\mu_r$ denote the number of multiplications and $\alpha_r$ the number of additions to compute the product of $2^r\times2^r$ matrices using Strassen's Algorithm. Then $\mu_{r+1} = 7\mu_r$, while $\alpha_{r+1} = 7\alpha_r + 18\cdot4^{r-1}$, where the first term comes from multiplying the blocks and the second from adding them, with $\mu_1 = 1$, $\alpha_1 = 0$. Clearly $\mu_r = 7^{r}$, while an induction proves the formula $\alpha_r = 6(7^{r-1} - 4^{r-1})$, namely
$\alpha_{r+1} = 7\alpha_r + 18\cdot4^{r-1} = 6(7^r - 7\cdot4^{r-1}) + 18\cdot4^{r-1} = 6(7^r - 4^r)$.
Combining the operations, Strassen's Algorithm is faster by a factor of
$\dfrac{2n^3}{\mu_r + \alpha_r} = \dfrac{2^{3r+1}}{13\cdot7^{r-1} - 6\cdot4^{r-1}}$,
which, for $r = 10$, equals 4.1059; for $r = 25$, equals 30.3378; and, for $r = 100$, equals 678,234, which is a remarkable savings — but bear in mind that the matrices have size around $10^{30}$, which is astronomical!
(e) One way is to use block matrix multiplication in the trivial form $\begin{pmatrix} A&O\\ O&I \end{pmatrix}\begin{pmatrix} B&O\\ O&I \end{pmatrix} = \begin{pmatrix} C&O\\ O&I \end{pmatrix}$, where $C = AB$. Thus, choosing $I$ to be an identity matrix of the appropriate size, the overall size of the block matrices can be arranged to be a power of 2, and then the reduction algorithm can proceed on the larger matrices. Another approach, trickier to program, is to break the matrix up into blocks of nearly equal size, since the Strassen formulas do not, in fact, require the blocks to have the same size, and even apply to rectangular matrices whose rectangular blocks are of compatible sizes.
1.7.9.
(a) $\begin{pmatrix} 1&2&0\\ -1&-1&1\\ 0&-2&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -1&1&0\\ 0&-2&1 \end{pmatrix}\begin{pmatrix} 1&2&0\\ 0&1&1\\ 0&0&5 \end{pmatrix}$, $\;x = \begin{pmatrix} -2\\ 3\\ 0 \end{pmatrix}$;
(b) $\begin{pmatrix} 1&-1&0&0\\ -1&2&1&0\\ 0&-1&4&1\\ 0&0&-1&6 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ -1&1&0&0\\ 0&-1&1&0\\ 0&0&-\frac15&1 \end{pmatrix}\begin{pmatrix} 1&-1&0&0\\ 0&1&1&0\\ 0&0&5&1\\ 0&0&0&\frac{31}{5} \end{pmatrix}$, $\;x = \begin{pmatrix} 1\\ 0\\ 1\\ 2 \end{pmatrix}$;
(c) $\begin{pmatrix} 1&2&0&0\\ -1&-3&0&0\\ 0&-1&4&-1\\ 0&0&-1&-1 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ -1&1&0&0\\ 0&1&1&0\\ 0&0&-\frac14&1 \end{pmatrix}\begin{pmatrix} 1&2&0&0\\ 0&-1&0&0\\ 0&0&4&-1\\ 0&0&0&-\frac54 \end{pmatrix}$, $\;x = \begin{pmatrix} -4\\ 2\\ -\frac25\\ -\frac35 \end{pmatrix}$.
1.7.10.
(a) $\begin{pmatrix} 2&-1&0\\ -1&2&-1\\ 0&-1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -\frac12&1&0\\ 0&-\frac23&1 \end{pmatrix}\begin{pmatrix} 2&-1&0\\ 0&\frac32&-1\\ 0&0&\frac43 \end{pmatrix}$,
$\begin{pmatrix} 2&-1&0&0\\ -1&2&-1&0\\ 0&-1&2&-1\\ 0&0&-1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ -\frac12&1&0&0\\ 0&-\frac23&1&0\\ 0&0&-\frac34&1 \end{pmatrix}\begin{pmatrix} 2&-1&0&0\\ 0&\frac32&-1&0\\ 0&0&\frac43&-1\\ 0&0&0&\frac54 \end{pmatrix}$,
$\begin{pmatrix} 2&-1&0&0&0\\ -1&2&-1&0&0\\ 0&-1&2&-1&0\\ 0&0&-1&2&-1\\ 0&0&0&-1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0\\ -\frac12&1&0&0&0\\ 0&-\frac23&1&0&0\\ 0&0&-\frac34&1&0\\ 0&0&0&-\frac45&1 \end{pmatrix}\begin{pmatrix} 2&-1&0&0&0\\ 0&\frac32&-1&0&0\\ 0&0&\frac43&-1&0\\ 0&0&0&\frac54&-1\\ 0&0&0&0&\frac65 \end{pmatrix}$;
(b) $\bigl(\frac32, 2, \frac32\bigr)^T$, $\;(2, 3, 3, 2)^T$, $\;\bigl(\frac52, 4, \frac92, 4, \frac52\bigr)^T$.
(c) The subdiagonal entries of $L$ are $l_{i+1,i} = -\,i/(i+1) \to -1$, while the diagonal entries of $U$ are $u_{ii} = (i+1)/i \to 1$.
♠ 1.7.11.
(a) $\begin{pmatrix} 2&1&0\\ -1&2&1\\ 0&-1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -\frac12&1&0\\ 0&-\frac25&1 \end{pmatrix}\begin{pmatrix} 2&1&0\\ 0&\frac52&1\\ 0&0&\frac{12}{5} \end{pmatrix}$,
$\begin{pmatrix} 2&1&0&0\\ -1&2&1&0\\ 0&-1&2&1\\ 0&0&-1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ -\frac12&1&0&0\\ 0&-\frac25&1&0\\ 0&0&-\frac{5}{12}&1 \end{pmatrix}\begin{pmatrix} 2&1&0&0\\ 0&\frac52&1&0\\ 0&0&\frac{12}{5}&1\\ 0&0&0&\frac{29}{12} \end{pmatrix}$,
$\begin{pmatrix} 2&1&0&0&0\\ -1&2&1&0&0\\ 0&-1&2&1&0\\ 0&0&-1&2&1\\ 0&0&0&-1&2 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0\\ -\frac12&1&0&0&0\\ 0&-\frac25&1&0&0\\ 0&0&-\frac{5}{12}&1&0\\ 0&0&0&-\frac{12}{29}&1 \end{pmatrix}\begin{pmatrix} 2&1&0&0&0\\ 0&\frac52&1&0&0\\ 0&0&\frac{12}{5}&1&0\\ 0&0&0&\frac{29}{12}&1\\ 0&0&0&0&\frac{70}{29} \end{pmatrix}$;
(b) $\bigl(\frac13, \frac13, \frac23\bigr)^T$, $\;\bigl(\frac{8}{29}, \frac{13}{29}, \frac{11}{29}, \frac{20}{29}\bigr)^T$, $\;\bigl(\frac{3}{10}, \frac25, \frac12, \frac25, \frac{7}{10}\bigr)^T$.
(c) The subdiagonal entries of $L$ approach $1 - \sqrt2 = -.414214$, and the diagonal entries of $U$ approach $1 + \sqrt2 = 2.414214$.
1.7.12. Both false. For example,
$\begin{pmatrix} 1&1&0&0\\ 1&1&1&0\\ 0&1&1&1\\ 0&0&1&1 \end{pmatrix}\begin{pmatrix} 1&1&0&0\\ 1&1&1&0\\ 0&1&1&1\\ 0&0&1&1 \end{pmatrix} = \begin{pmatrix} 2&2&1&0\\ 2&3&2&1\\ 1&2&3&2\\ 0&1&2&2 \end{pmatrix}$, $\quad\begin{pmatrix} 1&1&0&0\\ 1&1&1&0\\ 0&1&1&1\\ 0&0&1&1 \end{pmatrix}^{-1} = \begin{pmatrix} 1&0&-1&1\\ 0&0&1&-1\\ -1&1&0&0\\ 1&-1&0&1 \end{pmatrix}$.
♠ 1.7.13.
$\begin{pmatrix} 4&1&1\\ 1&4&1\\ 1&1&4 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ \frac14&1&0\\ \frac14&\frac15&1 \end{pmatrix}\begin{pmatrix} 4&1&1\\ 0&\frac{15}{4}&\frac34\\ 0&0&\frac{18}{5} \end{pmatrix}$,
$\begin{pmatrix} 4&1&0&1\\ 1&4&1&0\\ 0&1&4&1\\ 1&0&1&4 \end{pmatrix} = \begin{pmatrix} 1&0&0&0\\ \frac14&1&0&0\\ 0&\frac{4}{15}&1&0\\ \frac14&-\frac{1}{15}&\frac27&1 \end{pmatrix}\begin{pmatrix} 4&1&0&1\\ 0&\frac{15}{4}&1&-\frac14\\ 0&0&\frac{56}{15}&\frac{16}{15}\\ 0&0&0&\frac{24}{7} \end{pmatrix}$,
$\begin{pmatrix} 4&1&0&0&1\\ 1&4&1&0&0\\ 0&1&4&1&0\\ 0&0&1&4&1\\ 1&0&0&1&4 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0\\ \frac14&1&0&0&0\\ 0&\frac{4}{15}&1&0&0\\ 0&0&\frac{15}{56}&1&0\\ \frac14&-\frac{1}{15}&\frac{1}{56}&\frac{5}{19}&1 \end{pmatrix}\begin{pmatrix} 4&1&0&0&1\\ 0&\frac{15}{4}&1&0&-\frac14\\ 0&0&\frac{56}{15}&1&\frac{1}{15}\\ 0&0&0&\frac{209}{56}&\frac{55}{56}\\ 0&0&0&0&\frac{66}{19} \end{pmatrix}$.
For the $6\times6$ version we have
$\begin{pmatrix} 4&1&0&0&0&1\\ 1&4&1&0&0&0\\ 0&1&4&1&0&0\\ 0&0&1&4&1&0\\ 0&0&0&1&4&1\\ 1&0&0&0&1&4 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0&0\\ \frac14&1&0&0&0&0\\ 0&\frac{4}{15}&1&0&0&0\\ 0&0&\frac{15}{56}&1&0&0\\ 0&0&0&\frac{56}{209}&1&0\\ \frac14&-\frac{1}{15}&\frac{1}{56}&-\frac{1}{209}&\frac{7}{26}&1 \end{pmatrix}\begin{pmatrix} 4&1&0&0&0&1\\ 0&\frac{15}{4}&1&0&0&-\frac14\\ 0&0&\frac{56}{15}&1&0&\frac{1}{15}\\ 0&0&0&\frac{209}{56}&1&-\frac{1}{56}\\ 0&0&0&0&\frac{780}{209}&\frac{210}{209}\\ 0&0&0&0&0&\frac{45}{13} \end{pmatrix}$.
The pattern is that the only nonzero entries of $L$ lie on the diagonal, the subdiagonal, and the last row, while the only nonzero entries of $U$ lie on its diagonal, superdiagonal, and last column.
♥ 1.7.14.
(a) Assuming regularity, the only row operations required to reduce $A$ to upper triangular form $U$ are, for each $j = 1, \dots, n-1$, to add multiples of the $j$th row to the $(j+1)$st and the $n$th rows. Thus, the only nonzero entries below the diagonal in $L$ are at positions $(j+1, j)$ and $(n, j)$. Moreover, these row operations only affect zero entries in the last column, leading to the final form of $U$.
(b) $\begin{pmatrix} 1&-1&-1\\ -1&2&-1\\ -1&-1&3 \end{pmatrix} = \begin{pmatrix} 1&0&0\\ -1&1&0\\ -1&-2&1 \end{pmatrix}\begin{pmatrix} 1&-1&-1\\ 0&1&-2\\ 0&0&-2 \end{pmatrix}$,
$\begin{pmatrix} 1&-1&0&0&-1\\ -1&2&-1&0&0\\ 0&-1&3&-1&0\\ 0&0&-1&4&-1\\ -1&0&0&-1&5 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0\\ -1&1&0&0&0\\ 0&-1&1&0&0\\ 0&0&-\frac12&1&0\\ -1&-1&-\frac12&-\frac37&1 \end{pmatrix}\begin{pmatrix} 1&-1&0&0&-1\\ 0&1&-1&0&-1\\ 0&0&2&-1&-1\\ 0&0&0&\frac72&-\frac32\\ 0&0&0&0&\frac{13}{7} \end{pmatrix}$,
$\begin{pmatrix} 1&-1&0&0&0&-1\\ -1&2&-1&0&0&0\\ 0&-1&3&-1&0&0\\ 0&0&-1&4&-1&0\\ 0&0&0&-1&5&-1\\ -1&0&0&0&-1&6 \end{pmatrix} = \begin{pmatrix} 1&0&0&0&0&0\\ -1&1&0&0&0&0\\ 0&-1&1&0&0&0\\ 0&0&-\frac12&1&0&0\\ 0&0&0&-\frac27&1&0\\ -1&-1&-\frac12&-\frac17&-\frac{8}{33}&1 \end{pmatrix}\begin{pmatrix} 1&-1&0&0&0&-1\\ 0&1&-1&0&0&-1\\ 0&0&2&-1&0&-1\\ 0&0&0&\frac72&-1&-\frac12\\ 0&0&0&0&\frac{33}{7}&-\frac87\\ 0&0&0&0&0&\frac{104}{33} \end{pmatrix}$.
The $4\times4$ case is a singular matrix.
♥ 1.7.15.(a) If matrix A is tridiagonal, then the only nonzero elements in ith row are ai,i−1, aii, ai,i+1.
So aij = 0 whenever | i− j | > 1.
(b) For example,
0BBBBBBB@
2 1 1 0 0 01 2 1 1 0 01 1 2 1 1 00 1 1 2 1 10 0 1 1 2 10 0 0 1 1 2
1CCCCCCCA
has band width 2;
0BBBBBBB@
2 1 1 1 0 01 2 1 1 1 01 1 2 1 1 11 1 1 2 1 10 1 1 1 2 10 0 1 1 1 2
1CCCCCCCA
has band
width 3.(c) U is a matrix that result from applying the row operation # 1 to A, so all zero entries
in A will produce corresponding zero entries in U . On the other hand, if A is of bandwidth k, then for each column of A we need to perform no more than k row replace-ments to obtain zero’s below the diagonal. Thus L which reflects these row replace-ments will have at most k nonzero entries below the diagonal.
(d)
0BBBBBBB@
2 1 1 0 0 01 2 1 1 0 01 1 2 1 1 00 1 1 2 1 10 0 1 1 2 10 0 0 1 1 2
1CCCCCCCA
=
0BBBBBBBBBBB@
1 0 0 0 0 012 1 0 0 0 012
13 1 0 0 0
0 23
12 1 0 0
0 0 34
12 1 0
0 0 0 1 12 1
1CCCCCCCCCCCA
0BBBBBBBBBBB@
2 1 1 0 0 0
0 32
12 1 0 0
0 0 43
23 1 0
0 0 0 1 12 1
0 0 0 0 1 12
0 0 0 0 0 34
1CCCCCCCCCCCA
,
0BBBBBBB@
2 1 1 1 0 01 2 1 1 1 01 1 2 1 1 11 1 1 2 1 10 1 1 1 2 10 0 1 1 1 2
1CCCCCCCA
=
0BBBBBBBBBBB@
1 0 0 0 0 012 1 0 0 0 012
13 1 0 0 0
12
13
14 1 0 0
0 23
12
25 1 0
0 0 34
35
14 1
1CCCCCCCCCCCA
0BBBBBBBBBBB@
2 1 1 1 0 0
0 32
12
12 1 0
0 0 43
13
23 1
0 0 0 54
12
34
0 0 0 0 45
15
0 0 0 0 0 34
1CCCCCCCCCCCA
.
(e) ( 1/3, 1/3, 0, 0, 1/3, 1/3 )^T, ( 2/3, 1/3, −1/3, −1/3, 1/3, 2/3 )^T.
(f ) For A we still need to compute k multipliers at each stage and update at most 2k^2 entries, so we have less than (n − 1)(k + 2k^2) multiplications and (n − 1)2k^2 additions. For the right-hand side we have to update at most k entries at each stage, so we have less than (n − 1)k multiplications and (n − 1)k additions. So we can get by with less than (n − 1)(2k + 2k^2) multiplications and (n − 1)(k + 2k^2) additions in total.
(g) The inverse of a banded matrix is not necessarily banded. For example, the inverse of
[ 2 1 0 ]        [  3/4 −1/2  1/4 ]
[ 1 2 1 ]   is   [ −1/2   1  −1/2 ].
[ 0 1 2 ]        [  1/4 −1/2  3/4 ]
1.7.16. (a) ( −8, 4 )^T, (b) ( −10, −4.1 )^T, (c) ( −8.1, −4.1 )^T. (d) Partial pivoting reduces the effect of round-off errors and results in a significantly more accurate answer.
1.7.17. (a) x = 11/7 ≈ 1.57143, y = 1/7 ≈ .142857, z = −1/7 ≈ −.142857,
(b) x = 3.357, y = .5, z = −.1429, (c) x = 1.572, y = .1429, z = −.1429.
1.7.18. (a) x = −2, y = 2, z = 3, (b) x = −7.3, y = 3.3, z = 2.9, (c) x = −1.9, y = 2.0, z = 2.9, (d) partial pivoting works markedly better, especially for the value of x.
1.7.19. (a) x = −220., y = 26, z = .91; (b) x = −190., y = 24, z = .84; (c) x = −210, y = 26, z = 1. (d) The exact solution is x = −213.658, y = 25.6537, z = .858586. Full pivoting is the most accurate. Interestingly, partial pivoting fares a little worse than regular elimination.
1.7.20. (a) ( 6/5, −13/5, −9/5 )^T = ( 1.2, −2.6, −1.8 )^T, (b) ( −1/4, −5/4, 1/8, 1/4 )^T, (c) ( 0, 1, 1, 0 )^T,
(d) ( −32/35, −19/35, −12/35, −76/35 )^T = ( −.9143, −.5429, −.3429, −2.1714 )^T.
1.7.21. (a) ( −1/13, 8/13 )^T = ( −.0769, .6154 )^T, (b) ( −4/5, −8/15, −19/15 )^T = ( −.8000, −.5333, −1.2667 )^T,
(c) ( 2/121, 38/121, 59/242, −56/121 )^T = ( .0165, .3141, .2438, −.4628 )^T, (d) ( −.732, −.002, .508 )^T.
1.7.22. The results are the same.
♠ 1.7.23.
Gaussian Elimination With Full Pivoting
start
for i = 1 to n
set σ(i) = τ(i) = i
next i
set τ(n + 1) = n + 1
for j = 1 to n
if m_{σ(i),τ(k)} = 0 for all i, k ≥ j, stop; print “A is singular”
choose i ≥ j and k ≥ j such that | m_{σ(i),τ(k)} | is maximal
interchange σ(i) ←→ σ(j)
interchange τ(k) ←→ τ(j)
for i = j + 1 to n
set z = m_{σ(i),τ(j)} / m_{σ(j),τ(j)}
set m_{σ(i),τ(j)} = 0
for k = j + 1 to n + 1
set m_{σ(i),τ(k)} = m_{σ(i),τ(k)} − z m_{σ(j),τ(k)}
next k
next i
next j
end
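The pseudocode above translates directly into Python. This is one possible realization, offered as a sketch (the function name and the separate handling of the right-hand-side column are implementation choices, not part of the algorithm statement):

```python
def solve_full_pivoting(A, b):
    """Solve A x = b by Gaussian elimination with full pivoting.
    Row order sigma and column order tau are tracked as permutations."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    sigma, tau = list(range(n)), list(range(n))
    for j in range(n):
        # choose the largest remaining entry (in absolute value) as pivot
        i, k = max(((i, k) for i in range(j, n) for k in range(j, n)),
                   key=lambda p: abs(M[sigma[p[0]]][tau[p[1]]]))
        if M[sigma[i]][tau[k]] == 0:
            raise ValueError("A is singular")
        sigma[i], sigma[j] = sigma[j], sigma[i]
        tau[k], tau[j] = tau[j], tau[k]
        for i in range(j + 1, n):
            z = M[sigma[i]][tau[j]] / M[sigma[j]][tau[j]]
            M[sigma[i]][tau[j]] = 0.0
            for k in range(j + 1, n):
                M[sigma[i]][tau[k]] -= z * M[sigma[j]][tau[k]]
            M[sigma[i]][n] -= z * M[sigma[j]][n]   # right-hand-side column
    # back substitution, undoing the column permutation
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        s = M[sigma[j]][n] - sum(M[sigma[j]][tau[k]] * x[tau[k]]
                                 for k in range(j + 1, n))
        x[tau[j]] = s / M[sigma[j]][tau[j]]
    return x

# x + 2y = 5, 3x + 4y = 11 has solution x = 1, y = 2
x = solve_full_pivoting([[1.0, 2.0], [3.0, 4.0]], [5.0, 11.0])
```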
♠ 1.7.24. We let x ∈ R^n be generated using a random number generator, compute b = H_n x, and then solve H_n y = b for y. The error is e = x − y, and we use e* = max | e_i | as a measure of the overall error. Using Matlab, running Gaussian Elimination with pivoting:

n    10          20        50        100
e*   .00097711   35.5111   318.3845  1771.1

Using Mathematica, running regular Gaussian Elimination:

n    10           20        50       100
e*   .000309257   19.8964   160.325  404.625

In Mathematica, using the built-in LinearSolve function, which is more accurate since it uses a more sophisticated solution method when confronted with an ill-posed linear system:

n    10          20        50       100
e*   .00035996   .620536   .65328   .516865

(Of course, the errors vary a bit each time the program is run due to the randomness of the choice of x.)
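A scaled-down version of this experiment can be reproduced in Python. This is a reconstruction, not the Matlab/Mathematica code used above; the exact error value will differ from run to run and from machine to machine, but the qualitative conclusion (the error dwarfs machine precision even at n = 10) is the same:

```python
import random

def hilbert(n):
    """Floating point n x n Hilbert matrix, entries 1/(i + j - 1)."""
    return [[1.0 / (i + j + 1) for j in range(n)] for i in range(n)]

def solve(A, b):
    """Gaussian elimination with partial pivoting, in floating point."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(M[i][j]))   # partial pivoting
        M[j], M[p] = M[p], M[j]
        for i in range(j + 1, n):
            z = M[i][j] / M[j][j]
            for k in range(j, n + 1):
                M[i][k] -= z * M[j][k]
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        x[j] = (M[j][n] - sum(M[j][k] * x[k] for k in range(j + 1, n))) / M[j][j]
    return x

random.seed(1)
n = 10
x = [random.random() for _ in range(n)]
H = hilbert(n)
b = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
y = solve(H, b)
err = max(abs(xi - yi) for xi, yi in zip(x, y))
# err is small compared to x, yet many orders of magnitude above machine
# precision, reflecting the ill-conditioning of H_10
```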
♠ 1.7.25.
(a) H_3^{−1} =
[   9  −36   30 ]
[ −36  192 −180 ]
[  30 −180  180 ],

H_4^{−1} =
[   16  −120   240  −140 ]
[ −120  1200 −2700  1680 ]
[  240 −2700  6480 −4200 ]
[ −140  1680 −4200  2800 ],

H_5^{−1} =
[    25   −300     1050    −1400     630 ]
[  −300   4080   −18900    26880  −12600 ]
[  1050 −18900    79380  −117600   56700 ]
[ −1400  26880  −117600   179200  −88200 ]
[   630 −12600    56700   −88200   44100 ].
(b) The same results are obtained when using floating point arithmetic in either Mathematica or Matlab.
(c) The product K̃_10 H_10, where K̃_10 is the computed inverse, is fairly close to the 10 × 10 identity matrix; the largest error is .0000801892 in Mathematica and .000036472 in Matlab. As for K̃_20 H_20, it is nowhere close to the identity matrix: in Mathematica the diagonal entries range from −1.34937 to 3.03755, while the largest (in absolute value) off-diagonal entry is 4.3505; in Matlab the diagonal entries range from −.4918 to 3.9942, while the largest (in absolute value) off-diagonal entry is −5.1994.
1.8.1.
(a) Unique solution: ( −1/2, −3/4 )^T;
(b) infinitely many solutions: ( 1 − 2z, −1 + z, z )^T, where z is arbitrary;
(c) no solutions;
(d) unique solution: ( 1, −2, 1 )^T;
(e) infinitely many solutions: ( 5 − 2z, 1, z, 0 )^T, where z is arbitrary;
(f ) infinitely many solutions: ( 1, 0, 1, w )^T, where w is arbitrary;
(g) unique solution: ( 2, 1, 3, 1 )^T.
1.8.2. (a) Incompatible; (b) incompatible; (c) ( 1, 0 )^T; (d) ( 1 + 3x2 − 2x3, x2, x3 )^T, where x2 and x3 are arbitrary; (e) ( −15/2, 23, −10 )^T; (f ) ( −5 − 3x4, 19 − 4x4, −6 − 2x4, x4 )^T, where x4 is arbitrary; (g) incompatible.
1.8.3. The planes intersect at (1, 0, 0).
1.8.4. (i) a ≠ b and b ≠ 0; (ii) b = 0, a ≠ −2; (iii) a = b ≠ 0, or a = −2 and b = 0.
1.8.5. (a) b = 2, c ≠ −1 or b = 1/2, c ≠ 2; (b) b ≠ 2, 1/2; (c) b = 2, c = −1, or b = 1/2, c = 2.
1.8.6.
(a) ( 1 + i − 1/2 (1 + i ) y, y, −i )^T, where y is arbitrary;
(b) ( 4 i z + 3 + i , i z + 2 − i , z )^T, where z is arbitrary;
(c) ( 3 + 2 i , −1 + 2 i , 3 i )^T;
(d) ( −z − (3 + 4 i ) w, −z − (1 + i ) w, z, w )^T, where z and w are arbitrary.
1.8.7. (a) 2, (b) 1, (c) 2, (d) 3, (e) 1, (f ) 1, (g) 2, (h) 2, (i) 3.
1.8.8.
(a)
[ 1  1 ]   [ 1 0 ] [ 1  1 ]
[ 1 −2 ] = [ 1 1 ] [ 0 −3 ],
(b)
[  2  1  3 ]   [  1 0 ] [ 2 1 3 ]
[ −2 −1 −3 ] = [ −1 1 ] [ 0 0 0 ],
(c)
[  1 −1 1 ]   [  1 0 0 ] [ 1 −1 1 ]
[  1 −1 2 ] = [  1 1 0 ] [ 0  0 1 ]
[ −1  1 0 ]   [ −1 1 1 ] [ 0  0 0 ],
(d)
[ 1 0 0 ] [ 2 −1  0 ]   [  1  0 0 ] [ 2 −1   0 ]
[ 0 0 1 ] [ 2 −1  1 ] = [ 1/2 1 0 ] [ 0 3/2 −1 ]
[ 0 1 0 ] [ 1  1 −1 ]   [  1  0 1 ] [ 0  0   1 ],
(e)
[  3 ]   [   1  0 0 ] [ 3 ]
[  0 ] = [   0  1 0 ] [ 0 ]
[ −2 ]   [ −2/3 0 1 ] [ 0 ],
(f ) ( 0 −1 2 5 ) = ( 1 )( 0 −1 2 5 ),
(g)
[ 0 1 0 0 ] [  0 −3 ]   [   1    0  0 0 ] [ 4 −1 ]
[ 1 0 0 0 ] [  4 −1 ] = [   0    1  0 0 ] [ 0 −3 ]
[ 0 0 1 0 ] [  1  2 ]   [  1/4 −3/4 1 0 ] [ 0  0 ]
[ 0 0 0 1 ] [ −1 −5 ]   [ −1/4  7/4 0 1 ] [ 0  0 ],
(h)
[ 1 −1  2  1 ]   [ 1 1 0 0 0 ]   [ 1 0 0 0 0 ] [ 1 −1  2  1 ]
[ 2  1 −1  0 ]                   [ 2 1 0 0 0 ] [ 0  3 −5 −2 ]
[ 1  2 −3 −1 ] =                 [ 1 1 1 0 0 ] [ 0  0  0  0 ]
[ 4 −1  3  2 ]                   [ 4 1 0 1 0 ] [ 0  0  0  0 ]
[ 0  3 −5 −2 ]                   [ 0 1 0 0 1 ] [ 0  0  0  0 ],
(i)
[ 0 1 0 ] [ 0 0  0 3  1 ]   [ 1 0 0 ] [ 1 2 −3  1 −2 ]
[ 0 0 1 ] [ 1 2 −3 1 −2 ] = [ 2 1 0 ] [ 0 0  4 −1  2 ]
[ 1 0 0 ] [ 2 4 −2 1 −2 ]   [ 0 0 1 ] [ 0 0  0  3  1 ].
1.8.9. (a) x = 1, y = 0, z = 0. (b) x + y = 1, y + z = 0, x − z = 1. (c) x + y = 1, y + z = 0, x − z = 0.
1.8.10.
(a)
[ 1 0 0 ]
[ 0 1 0 ],
(b)
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 0 ],
(c)
[ 1 0 ]
[ 0 1 ]
[ 0 0 ],
(d)
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ].
1.8.11. (a) x^2 + y^2 = 1, x^2 − y^2 = 2;
(b) y = x^2, x − y + 2 = 0; solutions: x = 2, y = 4 and x = −1, y = 1;
(c) y = x^3, x − y = 0; solutions: x = y = 0, x = y = −1, x = y = 1;
(d) y = sin x, y = 0; solutions: x = kπ, y = 0, for k any integer.
1.8.12. That variable does not appear anywhere in the system, and is automatically free (al-though it doesn’t enter into any of the formulas, and so is, in a sense, irrelevant).
1.8.13. True. For example, take a matrix in row echelon form with r pivots, e.g., the matrix Awith aii = 1 for i = 1, . . . , r, and all other entries equal to 0.
1.8.14. Both false. The zero matrix has no pivots, and hence has rank 0.
♥ 1.8.15.
(a) Each row of A = v w^T is a scalar multiple, namely v_i w^T, of the vector w^T. If necessary, we use a row interchange to ensure that the first row is non-zero. We then subtract the appropriate scalar multiple of the first row from all the others. This makes all rows below the first zero, and so the resulting matrix is in row echelon form with a single non-zero row, and hence a single pivot — proving that A has rank 1.
(b) (i)
[ −1 2 ]
[ −3 6 ],
(ii)
[ −8  4 ]
[  0  0 ]
[  4 −2 ],
(iii)
[  2  6 −2 ]
[ −3 −9  3 ].
(c) The row echelon form of A must have a single nonzero row, say w^T. Reversing the elementary row operations that led to the row echelon form, at each step we either interchange rows or add multiples of one row to another. Every row of every matrix obtained in such a fashion must be some scalar multiple of w^T, and hence the original matrix A = v w^T, where the entries v_i of the vector v are the indicated scalar multiples.
1.8.16. 1.
1.8.17. 2.
1.8.18. Example: A = [ 1 0 ; 0 0 ], B = [ 0 1 ; 0 0 ], so AB = [ 0 1 ; 0 0 ] has rank 1, but BA = [ 0 0 ; 0 0 ]
♦ 1.8.19.
(a) Under elementary row operations, the reduced form of C will be ( U Z ), where U is the row echelon form of A. Thus, C has at least r pivots, namely the pivots in A. Examples:
rank [ 1 2 1 ; 2 4 2 ] = 1 = rank [ 1 2 ; 2 4 ], while rank [ 1 2 1 ; 2 4 3 ] = 2 > 1 = rank [ 1 2 ; 2 4 ].
(b) Applying elementary row operations, we can reduce E to
[ U ]
[ W ],
where U is the row echelon form of A. If we can then use elementary row operations of type #1 to eliminate all entries of W, then the row echelon form of E has the same number of pivots as A, and so rank E = rank A. Otherwise, at least one new pivot appears in the rows below U, and rank E > rank A. Examples:
rank [ 1 2 ; 2 4 ; 3 6 ] = 1 = rank [ 1 2 ; 2 4 ], while rank [ 1 2 ; 2 4 ; 3 5 ] = 2 > 1 = rank [ 1 2 ; 2 4 ].
♦ 1.8.20. By Proposition 1.39, A can be reduced to row echelon form U by a sequence of elementary row operations. Therefore, as in the proof of the LU decomposition, A = E1 E2 · · · EN U, where E1^{−1}, . . . , EN^{−1} are the elementary matrices representing the row operations. If A is singular, then U = Z must have at least one all-zero row.
♦ 1.8.21. After row operations, the augmented matrix becomes N = ( U | c ), where the r = rank A nonzero rows of U contain the pivots of A. If the system is compatible, then the last m − r entries of c are all zero, and hence N is itself a row echelon matrix with r nonzero rows, and hence rank M = rank N = r. If the system is incompatible, then one or more of the last m − r entries of c are nonzero, and hence, by one more set of row operations, N is placed in row echelon form with a final pivot in row r + 1 of the last column. In this case, then, rank M = rank N = r + 1.
1.8.22. (a) x = z, y = z, where z is arbitrary; (b) x = −2/3 z, y = 7/9 z, where z is arbitrary;
(c) x = y = z = 0; (d) x = 1/3 z − 2/3 w, y = 5/6 z − 1/6 w, where z and w are arbitrary;
(e) x = 13z, y = 5z, w = 0, where z is arbitrary; (f ) x = 3/2 w, y = 1/2 w, z = 1/2 w, where w is arbitrary.
1.8.23. (a) ( 1/3 y, y )^T, where y is arbitrary; (b) ( −6/5 z, 8/5 z, z )^T, where z is arbitrary;
(c) ( −11/5 z + 3/5 w, 2/5 z − 6/5 w, z, w )^T, where z and w are arbitrary; (d) ( z, −2z, z )^T, where z is arbitrary; (e) ( −4z, 2z, z )^T, where z is arbitrary; (f ) ( 0, 0, 0 )^T; (g) ( 3z, 3z, z, 0 )^T, where z is arbitrary; (h) ( y − 3w, y, w, w )^T, where y and w are arbitrary.
1.8.24. If U has only nonzero entries on the diagonal, it must be nonsingular, and so the onlysolution is x = 0. On the other hand, if there is a diagonal zero entry, then U cannot haven pivots, and so must be singular, and the system will admit nontrivial solutions.
1.8.25. For the homogeneous case x1 = x3, x2 = 0, where x3 is arbitrary. For the inhomogeneous case x1 = x3 + 1/4 (a + b), x2 = 1/2 (a − b), where x3 is arbitrary. The solution to the homogeneous version is a line going through the origin, while the inhomogeneous solution is a parallel line going through the point ( 1/4 (a + b), 1/2 (a − b), 0 )^T. The dependence on the free variable x3 is the same as in the homogeneous case.
1.8.26. For the homogeneous case x1 = −1/6 x3 − 1/6 x4, x2 = −2/3 x3 + 4/3 x4, where x3 and x4 are arbitrary. For the inhomogeneous case x1 = −1/6 x3 − 1/6 x4 + 1/3 a + 1/6 b, x2 = −2/3 x3 + 4/3 x4 + 1/3 a + 1/6 b, where x3 and x4 are arbitrary. The dependence on the free variables x3 and x4 is the same as in the homogeneous case.
1.8.27. (a) k = 2 or k = −2; (b) k = 0 or k = 1/2; (c) k = 1.
1.9.1.
(a) Regular matrix, reduces to upper triangular form U = [ 2 −1 ; 0 1 ], so determinant is 2;
(b) Singular matrix, row echelon form U = [ −1 0 3 ; 0 1 −2 ; 0 0 0 ], so determinant is 0;
(c) Regular matrix, reduces to upper triangular form U = [ 1 2 3 ; 0 1 2 ; 0 0 −3 ], so determinant is −3;
(d) Nonsingular matrix, reduces to upper triangular form U = [ −2 1 3 ; 0 1 −1 ; 0 0 3 ] after one row interchange, so determinant is 6;
(e) Upper triangular matrix, so the determinant is the product of the diagonal entries: −180;
(f ) Nonsingular matrix, reduces to upper triangular form U = [ 1 −2 1 4 ; 0 2 −1 −7 ; 0 0 −2 −8 ; 0 0 0 10 ] after one row interchange, so determinant is 40;
(g) Nonsingular matrix, reduces to upper triangular form U = [ 1 −2 1 4 −5 ; 0 3 −3 −1 2 ; 0 0 4 −12 24 ; 0 0 0 −5 10 ; 0 0 0 0 1 ] after one row interchange, so determinant is 60.
1.9.2. det A = −2, det B = −11, and det AB = det [ 5 4 4 ; 1 5 1 ; −2 10 0 ] = 22.
1.9.3. (a) A = [ 2 3 ; −1 −2 ]; (b) by formula (1.82),
1 = det I = det(A^2) = det(AA) = det A det A = (det A)^2, so det A = ±1.

1.9.4. det A^2 = (det A)^2 = det A, and hence det A = 0 or 1.
1.9.5.
(a) True. By Theorem 1.52, A is nonsingular, so, by Theorem 1.18, A^{−1} exists.
(b) False. For A = [ 2 3 ; −1 −2 ], we have 2 det A = −2 and det 2A = −4. In general, det(2A) = 2^n det A.
(c) False. For A = [ 2 3 ; −1 −2 ] and B = [ 0 1 ; 0 0 ], we have det(A + B) = det [ 2 4 ; −1 −2 ] = 0 ≠ −1 = det A + det B.
(d) True. det A^{−T} = det(A^{−1})^T = det A^{−1} = 1/det A, where the second equality follows from Proposition 1.56, and the third equality follows from Proposition 1.55.
(e) True. det(AB^{−1}) = det A det B^{−1} = det A/det B, where the first equality follows from formula (1.82) and the second equality follows from Proposition 1.55.
(f ) False. If A = [ 2 3 ; −1 −2 ] and B = [ 0 1 ; 0 0 ], then det(A + B)(A − B) = det [ 0 −4 ; 0 2 ] = 0 ≠ det(A^2 − B^2) = det [ 1 0 ; 0 1 ] = 1. However, if AB = BA, then det(A + B)(A − B) = det(A^2 − AB + BA − B^2) = det(A^2 − B^2).
(g) True. Proposition 1.42 says rank A = n if and only if A is nonsingular, while Theorem 1.52 implies that det A ≠ 0.
(h) True. Since det A = 1 ≠ 0, Theorem 1.52 implies that A is nonsingular, and so B = A^{−1} O = O.
1.9.6. Never — its determinant is always zero.

1.9.7. By (1.82–83) and commutativity of numeric multiplication,
det B = det(S^{−1} A S) = det S^{−1} det A det S = (1/det S) det A det S = det A.

1.9.8. Multiplying one row of A by c multiplies its determinant by c. To obtain cA, we must multiply all n rows by c, and hence the determinant is multiplied by c a total of n times.
1.9.9. By Proposition 1.56, det L^T = det L. If L is a lower triangular matrix, then L^T is an upper triangular matrix. By Theorem 1.50, det L^T is the product of its diagonal entries, which are the same as the diagonal entries of L.
1.9.10. (a) See Exercise 1.9.8. (b) If n is odd, det(−A) = −det A. On the other hand, if A^T = −A, then det A = det A^T = −det A, and hence det A = 0. (c) A = [ 0 1 ; −1 0 ].
♦ 1.9.11. We have
det [ a b ; c + ka d + kb ] = ad + akb − bc − bka = ad − bc = det [ a b ; c d ],
det [ c d ; a b ] = cb − ad = −(ad − bc) = −det [ a b ; c d ],
det [ ka kb ; c d ] = kad − kbc = k (ad − bc) = k det [ a b ; c d ],
det [ a b ; 0 d ] = ad − b · 0 = ad.
♦ 1.9.12.(a) The product formula holds if A is an elementary matrix; this is a consequence of the
determinant axioms coupled with the fact that elementary matrices are obtained by ap-plying the corresponding row operation to the identity matrix, with det I = 1.
(b) By induction, if A = E1 E2 · · ·EN is a product of elementary matrices, then (1.82) alsoholds. Proposition 1.25 then implies that the product formula is valid whenever A isnonsingular.
(c) The first result is in Exercise 1.2.24(a), and so the formula follows by applying Lemma 1.51to Z and Z B.
(d) According to Exercise 1.8.20, every singular matrix can be written as A = E1 E2 · · ·EN Z,where the Ei are elementary matrices, while Z, its row echelon form, is a matrix with arow of zeros. But then Z B = W also has a row of zeros, and so AB = E1 E2 · · ·ENWis also singular. Thus, both sides of (1.82) are zero in this case.
1.9.13. Indeed, by (1.82), det A det A−1 = det(AA−1) = det I = 1.
♦ 1.9.14. Exercise 1.6.28 implies that, if A is regular, so is A^T, and they both have the same pivots. Since the determinant of a regular matrix is the product of the pivots, this implies det A = det A^T. If A is nonsingular, then we use the permuted LU decomposition to write A = P^T L U, where P^T = P^{−1} by Exercise 1.6.14. Thus, det A = det P^T det U = ±det U, while det A^T = det(U^T L^T P) = det U det P = ±det U, where det P^{−1} = det P = ±1. Finally, if A is singular, then the same computation holds, with U denoting the row echelon form of A, and so det A = det U = 0 = det A^T.
1.9.15.
det [ a11 a12 a13 a14 ; a21 a22 a23 a24 ; a31 a32 a33 a34 ; a41 a42 a43 a44 ] =
a11 a22 a33 a44 − a11 a22 a34 a43 − a11 a23 a32 a44 + a11 a23 a34 a42 − a11 a24 a33 a42
+ a11 a24 a32 a43 − a12 a21 a33 a44 + a12 a21 a34 a43 + a12 a23 a31 a44 − a12 a23 a34 a41
+ a12 a24 a33 a41 − a12 a24 a31 a43 + a13 a21 a32 a44 − a13 a21 a34 a42 − a13 a22 a31 a44
+ a13 a22 a34 a41 − a13 a24 a32 a41 + a13 a24 a31 a42 − a14 a21 a32 a43 + a14 a21 a33 a42
+ a14 a22 a31 a43 − a14 a22 a33 a41 + a14 a23 a32 a41 − a14 a23 a31 a42.
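This 24-term expansion can be generated and checked mechanically from the definition of the determinant as a signed sum over permutations. A Python sketch (the helper names `sign` and `det_perm` are introduced here):

```python
from itertools import permutations

def sign(p):
    """Parity of a permutation, computed from its inversion count."""
    n = len(p)
    inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return -1 if inv % 2 else 1

def det_perm(A):
    """Determinant as the signed sum over all n! permutations."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= A[i][p[i]]       # one factor a_{i, p(i)} from each row
        total += term
    return total

# a block diagonal 4 x 4 check: det = (-1) * (-2) = 2
assert det_perm([[2, 3, 0, 0], [-1, -2, 0, 0], [0, 0, 1, 2], [0, 0, 3, 4]]) == 2
```

For n = 4 the loop produces exactly the 24 signed products listed above.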
♦ 1.9.16.
(i) Suppose B is obtained from A by adding c times row k to row l, so that b_{lj} = a_{lj} + c a_{kj}, while b_{ij} = a_{ij} for i ≠ l. Thus, each summand in the determinantal formula for det B splits into two terms, and we find that det B = det A + c det C, where C is the matrix obtained from A by replacing row l by row k. But rows k and l of C are identical, and so, by axiom (ii), if we interchange the two rows, det C = −det C = 0. Thus, det B = det A.
(ii) Let B be obtained from A by interchanging rows k and l. Then each summand in theformula for det B equals minus the corresponding summand in the formula for det A,since the permutation has changed sign, and so det B = − det A.
(iii) Let B be obtained from A by multiplying rows k by c. Then each summand in the for-mula for det B contains one entry from row k, and so equals c times the correspondingterm in det A, hence det B = c det A.
(iv) The only term in det U that does not contain at least one zero entry lying below thediagonal is for the identity permutation π(i) = i, and so det U is the product of its diag-onal entries.
♦ 1.9.17. If U is nonsingular, then, by Gauss–Jordan elimination, it can be reduced to the iden-tity matrix by elementary row operations of types #1 and #3. Each operation of type #1doesn’t change the determinant, while operations of type #3 multiply the determinant bythe diagonal entry. Thus, det U = u11u22 · · ·unn det I . On the other hand, U is singular ifand only if one or more of its diagonal entries are zero, and so det U = 0 = u11u22 · · ·unn.
♦ 1.9.18. The determinant of an elementary matrix of type #2 is −1, whereas all elementary ma-trices of type #1 have determinant +1, and hence so does any product thereof.
♥ 1.9.19.
(a) Since A is regular, a ≠ 0 and ad − bc ≠ 0. Subtracting c/a times the first row from the second row reduces A to the upper triangular matrix [ a b ; 0 d − bc/a ], and its pivots are a and d − bc/a = (ad − bc)/a = det A/a.
(b) As in part (a), we reduce A to an upper triangular form. First, we subtract c/a times the first row from the second row, and g/a times the first row from the third row, resulting in the matrix
[ a      b             e           ]
[ 0  (ad − bc)/a   (af − ce)/a ]
[ 0  (ah − bg)/a   (aj − eg)/a ].
Performing the final row operation reduces the matrix to the upper triangular form
U =
[ a      b             e           ]
[ 0  (ad − bc)/a   (af − ce)/a ]
[ 0      0             p           ],
whose pivots are a, (ad − bc)/a, and
p = (aj − eg)/a − (af − ce)(ah − bg)/( a (ad − bc) ) = (adj + bfg + ech − afh − bcj − edg)/(ad − bc) = det A/(ad − bc).
(c) If A is a regular n × n matrix, then its first pivot is a11, and its kth pivot, for k = 2, . . . , n, is det A_k / det A_{k−1}, where A_k is the k × k upper left submatrix of A with entries a_{ij} for i, j = 1, . . . , k. A formal proof is done by induction.
(c) If A is a regular n × n matrix, then its first pivot is a11, and its kth pivot, for k =2, . . . , n, is det Ak/det Ak−1, where Ak is the k × k upper left submatrix of A with en-tries aij for i, j = 1, . . . , k. A formal proof is done by induction.
♥ 1.9.20. (a–c) Applying an elementary column operation to a matrix A is the same as applying the corresponding elementary row operation to its transpose A^T and then taking the transpose of the result. Moreover, Proposition 1.56 implies that taking the transpose does not affect the determinant, and so any elementary column operation has exactly the same effect as the corresponding elementary row operation.
(d) Apply the transposed version of the elementary row operations required to reduce A^T to upper triangular form. Thus, if the (1, 1) entry is zero, use a column interchange to place a nonzero pivot in the upper left position. Then apply elementary column operations of type #1 to make all entries to the right of the pivot zero. Next, make sure a nonzero pivot is in the (2, 2) position by a column interchange if necessary, and then apply elementary column operations of type #1 to make all entries to the right of the pivot zero. Continuing in this fashion, if the matrix is nonsingular, the result is a lower triangular matrix.
(e) We first interchange the first and second columns, and then use elementary column operations of type #1 to reduce the matrix to lower triangular form:
det [ 0 1 2 ; −1 3 5 ; 2 −3 1 ] = −det [ 1 0 2 ; 3 −1 5 ; −3 2 1 ]
= −det [ 1 0 0 ; 3 −1 −1 ; −3 2 7 ] = −det [ 1 0 0 ; 3 −1 0 ; −3 2 5 ] = 5.
♦ 1.9.21. Using the LU factorizations established in Exercise 1.3.25:
(a) det [ 1 1 ; t1 t2 ] = t2 − t1,
(b) det [ 1 1 1 ; t1 t2 t3 ; t1^2 t2^2 t3^2 ] = (t2 − t1)(t3 − t1)(t3 − t2),
(c) det [ 1 1 1 1 ; t1 t2 t3 t4 ; t1^2 t2^2 t3^2 t4^2 ; t1^3 t2^3 t3^3 t4^3 ] = (t2 − t1)(t3 − t1)(t3 − t2)(t4 − t1)(t4 − t2)(t4 − t3).
The general formula is found in Exercise 4.4.29.
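The general product formula can be spot-checked against elimination for particular nodes; a Python sketch (helper name `vandermonde_det` is introduced here, and exact rationals are used so the comparison is exact):

```python
from fractions import Fraction
from itertools import combinations

def vandermonde_det(ts):
    """Determinant of the Vandermonde matrix with nodes ts, by elimination."""
    n = len(ts)
    M = [[Fraction(t) ** i for t in ts] for i in range(n)]  # rows 1, t, t^2, ...
    d = Fraction(1)
    for j in range(n):
        d *= M[j][j]                      # distinct nodes give nonzero pivots
        for i in range(j + 1, n):
            z = M[i][j] / M[j][j]
            M[i] = [a - z * b for a, b in zip(M[i], M[j])]
    return d

ts = [2, 3, 5, 7]
product = 1
for ti, tj in combinations(ts, 2):        # every pair with ti before tj
    product *= tj - ti
assert vandermonde_det(ts) == product     # the product formula above
```

No row interchanges are ever needed here, since every leading principal minor is itself a Vandermonde determinant of distinct nodes, and hence nonzero.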
♥ 1.9.22.
(a) By direct substitution:
ax + by = a (pd − bq)/(ad − bc) + b (aq − pc)/(ad − bc) = p,  cx + dy = c (pd − bq)/(ad − bc) + d (aq − pc)/(ad − bc) = q.
(b) (i) x = −(1/10) det [ 13 3 ; 0 2 ] = −2.6, y = −(1/10) det [ 1 13 ; 4 0 ] = 5.2;
(ii) x = (1/12) det [ 4 −2 ; −2 6 ] = 5/3, y = (1/12) det [ 1 4 ; 3 −2 ] = −7/6.
(c) Proof by direct substitution, expanding all the determinants.
(d) (i) x = (1/9) det [ 3 4 0 ; 2 2 1 ; 0 1 −1 ] = −1/9, y = (1/9) det [ 1 3 0 ; 4 2 1 ; −1 0 −1 ] = 7/9,
z = (1/9) det [ 1 4 3 ; 4 2 2 ; −1 1 0 ] = 8/9;
(ii) x = −(1/2) det [ 1 2 −1 ; 2 −3 2 ; 3 −1 1 ] = 0, y = −(1/2) det [ 3 1 −1 ; 1 2 2 ; 2 3 1 ] = 4,
z = −(1/2) det [ 3 2 1 ; 1 −3 2 ; 2 −1 3 ] = 7.
(e) Assuming A is nonsingular, the solution to A x = b is x_i = det A_i / det A, where A_i is obtained by replacing the ith column of A by the right-hand side b. See [60] for a complete justification.
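The recipe in (e) is a few lines of code for a 3 × 3 system. In the sketch below, the coefficient matrix and right-hand side are read off from the column replacements displayed in (d)(ii) (an inference from the text, since the system itself is stated in the exercise book rather than here); the helper names `det3` and `cramer3` are implementation choices:

```python
from fractions import Fraction

def det3(A):
    """3 x 3 determinant by direct expansion."""
    a, b, c = A[0]; d, e, f = A[1]; g, h, k = A[2]
    return a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)

def cramer3(A, b):
    """Solve a 3 x 3 system by Cramer's rule: x_i = det A_i / det A."""
    dA = det3(A)
    xs = []
    for i in range(3):
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][i] = b[r]      # replace the ith column by the right-hand side
        xs.append(Fraction(det3(Ai), dA))
    return xs

A = [[3, 2, -1], [1, -3, 2], [2, -1, 1]]    # det A = -2, as in (d)(ii)
rhs = [1, 2, 3]
assert cramer3(A, rhs) == [0, 4, 7]         # matches the answer above
```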
♦ 1.9.23.
(a) We can individually reduce A and B to upper triangular forms U1 and U2 with determinants equal to the products of their respective diagonal entries. Applying the analogous elementary row operations to D will reduce it to the upper triangular form [ U1 O ; O U2 ], and its determinant is equal to the product of its diagonal entries, which are the diagonal entries of both U1 and U2, so det D = det U1 det U2 = det A det B.
(b) The same argument as in part (a) proves the result. The row operations applied to A are also applied to C, but this doesn't affect the final upper triangular form.
(c) (i) det [ 3 2 −2 ; 0 4 −5 ; 0 3 7 ] = det ( 3 ) det [ 4 −5 ; 3 7 ] = 3 · 43 = 129,
(ii) det [ 1 2 −2 5 ; −3 1 0 −5 ; 0 0 1 3 ; 0 0 2 −2 ] = det [ 1 2 ; −3 1 ] det [ 1 3 ; 2 −2 ] = 7 · (−8) = −56,
(iii) det [ 1 2 0 4 ; −3 1 4 −1 ; 0 3 1 8 ; 0 0 0 −3 ] = det [ 1 2 0 ; −3 1 4 ; 0 3 1 ] det ( −3 ) = (−5) · (−3) = 15,
(iv) det [ 5 −1 0 0 ; 2 5 0 0 ; 2 4 4 −2 ; 3 −2 9 −5 ] = det [ 5 −1 ; 2 5 ] det [ 4 −2 ; 9 −5 ] = 27 · (−2) = −54.
Solutions — Chapter 2
2.1.1. Commutativity of Addition:
(x + i y) + (u + i v) = (x + u) + i (y + v) = (u + i v) + (x + i y).
Associativity of Addition:
(x + i y) + [ (u + i v) + (p + i q) ] = (x + i y) + [ (u + p) + i (v + q) ]
= (x + u + p) + i (y + v + q)
= [ (x + u) + i (y + v) ] + (p + i q) = [ (x + i y) + (u + i v) ] + (p + i q).
Additive Identity: 0 = 0 + i 0, and
(x + i y) + 0 = x + i y = 0 + (x + i y).
Additive Inverse: −(x + i y) = (−x) + i (−y), and
(x + i y) + [ (−x) + i (−y) ] = 0 = [ (−x) + i (−y) ] + (x + i y).
Distributivity:
(c + d)(x + i y) = (c + d)x + i (c + d)y = (cx + dx) + i (cy + dy) = c(x + i y) + d(x + i y),
c [ (x + i y) + (u + i v) ] = c(x + u) + i c(y + v) = (cx + cu) + i (cy + cv) = c(x + i y) + c(u + i v).
Associativity of Scalar Multiplication:
c [ d(x + i y) ] = c [ (dx) + i (dy) ] = (cdx) + i (cdy) = (cd)(x + i y).
Unit for Scalar Multiplication: 1(x + i y) = (1x) + i (1y) = x + i y.
Note: Identifying the complex number x + i y with the vector ( x, y )^T ∈ R^2 respects the operations of vector addition and scalar multiplication, and so we are in effect reproving that R^2 is a vector space.
2.1.2. Commutativity of Addition:
(x1, y1) + (x2, y2) = (x1 x2, y1 y2) = (x2, y2) + (x1, y1).
Associativity of Addition:
(x1, y1) + [ (x2, y2) + (x3, y3) ] = (x1 x2 x3, y1 y2 y3) = [ (x1, y1) + (x2, y2) ] + (x3, y3).
Additive Identity: 0 = (1, 1), and
(x, y) + (1, 1) = (x, y) = (1, 1) + (x, y).
Additive Inverse:
−(x, y) = ( 1/x, 1/y ), and (x, y) + [ −(x, y) ] = (1, 1) = [ −(x, y) ] + (x, y).
Distributivity:
(c + d)(x, y) = (x^{c+d}, y^{c+d}) = (x^c x^d, y^c y^d) = (x^c, y^c) + (x^d, y^d) = c(x, y) + d(x, y),
c [ (x1, y1) + (x2, y2) ] = ((x1 x2)^c, (y1 y2)^c) = (x1^c x2^c, y1^c y2^c) = (x1^c, y1^c) + (x2^c, y2^c) = c(x1, y1) + c(x2, y2).
Associativity of Scalar Multiplication:
c(d(x, y)) = c(x^d, y^d) = (x^{cd}, y^{cd}) = (cd)(x, y).
Unit for Scalar Multiplication: 1(x, y) = (x, y).
Note: We can uniquely identify a point (x, y) ∈ Q with the vector ( log x, log y )^T ∈ R^2. Then the indicated operations agree with standard vector addition and scalar multiplication in R^2, and so Q is just a disguised version of R^2.
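The note's identification is easy to witness numerically. The following Python sketch (the helper names `add`, `mul`, and `log_map` are ad hoc; the tolerance merely absorbs floating point round-off) checks that the logarithm converts the exotic operations on Q into ordinary vector arithmetic in R^2:

```python
import math

# the exotic operations on Q = {(x, y) : x, y > 0}
def add(p, q):
    return (p[0] * q[0], p[1] * q[1])

def mul(c, p):
    return (p[0] ** c, p[1] ** c)

def log_map(p):
    return (math.log(p[0]), math.log(p[1]))

p, q, c = (2.0, 5.0), (3.0, 0.5), 1.7
# the logarithm turns "addition" on Q into ordinary vector addition in R^2 ...
lp, lq = log_map(p), log_map(q)
ls = log_map(add(p, q))
assert all(abs(ls[i] - (lp[i] + lq[i])) < 1e-12 for i in range(2))
# ... and "scalar multiplication" into ordinary scaling
lm = log_map(mul(c, p))
assert all(abs(lm[i] - c * lp[i]) < 1e-12 for i in range(2))
```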
♦ 2.1.3. We denote a typical function in F(S) by f(x) for x ∈ S.
Commutativity of Addition:
(f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x).
Associativity of Addition:
[f + (g + h)](x) = f(x) + (g + h)(x) = f(x) + g(x) + h(x) = (f + g)(x) + h(x) = [(f + g) + h](x).
Additive Identity: 0(x) = 0 for all x, and (f + 0)(x) = f(x) = (0 + f)(x).
Additive Inverse: (−f)(x) = −f(x), and
[f + (−f)](x) = f(x) + (−f)(x) = 0 = (−f)(x) + f(x) = [(−f) + f](x).
Distributivity:
[(c + d)f](x) = (c + d)f(x) = cf(x) + df(x) = (cf)(x) + (df)(x),
[c(f + g)](x) = cf(x) + cg(x) = (cf)(x) + (cg)(x).
Associativity of Scalar Multiplication:
[c(df)](x) = cdf(x) = [(cd)f](x).
Unit for Scalar Multiplication: (1f)(x) = f(x).
2.1.4. (a) ( 1, 1, 1, 1 )^T, ( 1, −1, 1, −1 )^T, ( 1, 1, 1, 1 )^T, ( 1, −1, 1, −1 )^T. (b) Obviously not.

2.1.5. One example is f(x) ≡ 0 and g(x) = x^3 − x.

2.1.6. (a) f(x) = −4x + 3; (b) f(x) = −2x^2 − x + 1.

2.1.7.
(a) ( x − y, xy )^T, ( e^x, cos y )^T, and ( 1, 3 )^T, which is a constant function.
(b) Their sum is ( x − y + e^x + 1, xy + cos y + 3 )^T. Multiplied by −5 it is ( −5x + 5y − 5e^x − 5, −5xy − 5 cos y − 15 )^T.
(c) The zero element is the constant function 0 = ( 0, 0 )^T.
♦ 2.1.8. This is the same as the space of functions F(R^2, R^2). Explicitly, writing v = ( v1(x, y), v2(x, y) )^T, and similarly u and w:
Commutativity of Addition:
v + w = ( v1(x, y) + w1(x, y), v2(x, y) + w2(x, y) )^T = w + v.
Associativity of Addition:
u + [ v + w ] = ( u1(x, y) + v1(x, y) + w1(x, y), u2(x, y) + v2(x, y) + w2(x, y) )^T = [ u + v ] + w.
Additive Identity: 0 = ( 0, 0 )^T for all x, y, and v + 0 = v = 0 + v.
Additive Inverse: −v = ( −v1(x, y), −v2(x, y) )^T, and v + (−v) = 0 = (−v) + v.
Distributivity:
(c + d)v = ( (c + d)v1(x, y), (c + d)v2(x, y) )^T = cv + dv,
c [ v + w ] = ( cv1(x, y) + cw1(x, y), cv2(x, y) + cw2(x, y) )^T = cv + cw.
Associativity of Scalar Multiplication:
c [ dv ] = ( cdv1(x, y), cdv2(x, y) )^T = (cd)v.
Unit for Scalar Multiplication: 1v = ( v1(x, y), v2(x, y) )^T = v.
♥ 2.1.9. We identify each sample value with the matrix entry m_{ij} = f(ih, jk). In this way, every sampled function corresponds to a uniquely determined m × n matrix and conversely. Addition of sample functions, (f + g)(ih, jk) = f(ih, jk) + g(ih, jk), corresponds to matrix addition, m_{ij} + n_{ij}, while scalar multiplication of sample functions, cf(ih, jk), corresponds to scalar multiplication of matrices, c m_{ij}.

2.1.10. a + b = (a1 + b1, a2 + b2, a3 + b3, . . . ), ca = (ca1, ca2, ca3, . . . ). Explicit verification of the vector space properties is straightforward. An alternative, smarter strategy is to identify R^∞ as the space of functions f : N → R, where N = {1, 2, 3, . . . } is the set of natural numbers, and to identify the function f with its sample vector f = (f(1), f(2), . . . ).
2.1.11. (i) v + (−1)v = 1v + (−1)v = ( 1 + (−1) )v = 0v = 0.
(j) Let z = c0. Then z + z = c(0 + 0) = c0 = z, and so, as in the proof of (h), z = 0.
(k) Suppose cv = 0 and c ≠ 0. Then v = 1v = ( (1/c) · c )v = (1/c)(cv) = (1/c)0 = 0.
♦ 2.1.12. If 0 and 0̃ both satisfy axiom (c), then 0 = 0̃ + 0 = 0 + 0̃ = 0̃.
♦ 2.1.13. Commutativity of Addition:
(v, w) + (v̂, ŵ) = (v + v̂, w + ŵ) = (v̂, ŵ) + (v, w).
Associativity of Addition:
(v, w) + [ (v̂, ŵ) + (ṽ, w̃) ] = (v + v̂ + ṽ, w + ŵ + w̃) = [ (v, w) + (v̂, ŵ) ] + (ṽ, w̃).
Additive Identity: the zero element is (0, 0), and
(v, w) + (0, 0) = (v, w) = (0, 0) + (v, w).
Additive Inverse: −(v, w) = (−v, −w), and
(v, w) + (−v, −w) = (0, 0) = (−v, −w) + (v, w).
Distributivity:
(c + d)(v, w) = ((c + d)v, (c + d)w) = c(v, w) + d(v, w),
c [ (v, w) + (v̂, ŵ) ] = (cv + cv̂, cw + cŵ) = c(v, w) + c(v̂, ŵ).
Associativity of Scalar Multiplication:
c(d(v, w)) = (cdv, cdw) = (cd)(v, w).
Unit for Scalar Multiplication: 1(v, w) = (1v, 1w) = (v, w).
2.1.14. Here V = C0 while W = R, and so the indicated pairs belong to the Cartesian prod-uct vector space C0 × R. The zero element is the pair 0 = (0, 0) where the first 0 denotesthe identically zero function, while the second 0 denotes the real number zero. The laws ofvector addition and scalar multiplication are
(f(x), a) + (g(x), b) = (f(x) + g(x), a + b), c(f(x), a) = (cf(x), ca).
2.2.1.
(a) If v = ( x, y, z )^T satisfies x − y + 4z = 0, and ṽ = ( x̃, ỹ, z̃ )^T also satisfies x̃ − ỹ + 4z̃ = 0, so does v + ṽ = ( x + x̃, y + ỹ, z + z̃ )^T, since (x + x̃) − (y + ỹ) + 4(z + z̃) = (x − y + 4z) + (x̃ − ỹ + 4z̃) = 0, as does cv = ( cx, cy, cz )^T, since (cx) − (cy) + 4(cz) = c(x − y + 4z) = 0.
(b) For instance, the zero vector 0 = ( 0, 0, 0 )^T does not satisfy the equation.
2.2.2. (b,c,d,g,i) are subspaces; the rest are not. Case (j) consists of the 3 coordinate axes andthe line x = y = z.
2.2.3. (Plots omitted.)
(a) Subspace. (b) Not a subspace. (c) Subspace. (d) Not a subspace. (e) Not a subspace.
(f ) Even though the cylinders are not subspaces, their intersection is the z axis, which is a subspace.
2.2.4. Any vector of the form
a ( 1, 2, −1 )^T + b ( 2, 0, 1 )^T + c ( 0, −1, 3 )^T = ( a + 2b, 2a − c, −a + b + 3c )^T = ( x, y, z )^T
will belong to W. The coefficient matrix [ 1 2 0 ; 2 0 −1 ; −1 1 3 ] is nonsingular, and so for any x = ( x, y, z )^T ∈ R^3 we can arrange suitable values of a, b, c by solving the linear system. Thus, every vector in R^3 belongs to W, and so W = R^3.
2.2.5. False, with two exceptions: [0, 0] = {0} and (−∞, ∞) = R.

2.2.6.
(a) Yes. For instance, the set S = { (x, 0) } ∪ { (0, y) } consisting of the coordinate axes has the required property, but is not a subspace. More generally, any (finite) collection of 2 or more lines going through the origin satisfies the property, but is not a subspace.
(b) For example, S = { (x, y) | x, y ≥ 0 } — the positive quadrant.

2.2.7. (a,c,d) are subspaces; (b,e) are not.
2.2.8. Since x = 0 must belong to the subspace, this implies b = A0 = 0. For a homogeneous system, if x, y are solutions, so that Ax = 0 = Ay, then so is x + y, since A(x + y) = Ax + Ay = 0, as is cx, since A(cx) = cAx = 0.
2.2.9. L and M are strictly lower triangular if lij = 0 = mij whenever i ≤ j. Then N = L + M
is strictly lower triangular since nij = lij + mij = 0 whenever i ≤ j, as is K = cL since
kij = c lij = 0 whenever i ≤ j.
♦ 2.2.10. Note tr(A + B) = Σ_{i=1}^n (a_{ii} + b_{ii}) = tr A + tr B, and tr(cA) = Σ_{i=1}^n c a_{ii} = c Σ_{i=1}^n a_{ii} = c tr A. Thus, if tr A = tr B = 0, then tr(A + B) = 0 = tr(cA), proving closure.
2.2.11.
(a) No. The zero matrix is not an element.
(b) No if n ≥ 2. For example, A = [ 1 0 ; 0 0 ], B = [ 0 0 ; 0 1 ] satisfy det A = 0 = det B, but det(A + B) = det [ 1 0 ; 0 1 ] = 1, so A + B does not belong to the set.
2.2.12. (d,f,g,h) are subspaces; the rest are not.
2.2.13. (a) Vector space; (b) not a vector space: (0, 0) does not belong; (c) vector space;(d) vector space; (e) not a vector space: If f is non-negative, then −1 f = −f is not (un-less f ≡ 0); (f ) vector space; (g) vector space; (h) vector space.
2.2.14. If f(1) = 0 = g(1), then (f + g)(1) = 0 and (cf)(1) = 0, so both f + g and cf belong to the subspace. The zero function does not satisfy f(0) = 1. For a subspace, a can be anything, while b = 0.
2.2.15. All cases except (e,g) are subspaces. In (g), |x | is not in C1.
2.2.16. (a) Subspace; (b) subspace; (c) not a subspace: the zero function does not satisfy the condition; (d) not a subspace: if f(0) = 0, f(1) = 1, and g(0) = 1, g(1) = 0, then f and g are in the set, but f + g is not; (e) subspace; (f ) not a subspace: the zero function does not satisfy the condition; (g) subspace; (h) subspace; (i) not a subspace: the zero function does not satisfy the condition.
2.2.17. If u′′ = xu, v′′ = xv, are solutions, and c, d constants, then (cu + dv)′′ = cu′′ + dv′′ =cxu + dxv = x(cu + dv), and hence cu + dv is also a solution.
2.2.18. For instance, the zero function u(x) ≡ 0 is not a solution.
2.2.19.(a) It is a subspace of the space of all functions f : [a, b] → R^2, which is a particular instance of Example 2.7. Note that f(t) = ( f1(t), f2(t) )^T is continuously differentiable if and only if its component functions f1(t) and f2(t) are. Thus, if f(t) = ( f1(t), f2(t) )^T and g(t) = ( g1(t), g2(t) )^T are continuously differentiable, so are
(f + g)(t) = ( f1(t) + g1(t), f2(t) + g2(t) )^T and (c f)(t) = ( c f1(t), c f2(t) )^T.
(b) Yes: if f(0) = 0 = g(0), then (c f + dg)(0) = 0 for any c, d ∈ R.
2.2.20. ∇ · (cv + dw) = c ∇ · v + d ∇ · w = 0 whenever ∇ · v = ∇ · w = 0 and c, d ∈ R.
2.2.21. Yes. The sum of two convergent sequences is convergent, as is any constant multiple ofa convergent sequence.
2.2.22.(a) If v, w ∈ W ∩ Z, then v, w ∈ W, so cv + dw ∈ W because W is a subspace, and v, w ∈ Z, so cv + dw ∈ Z because Z is a subspace; hence cv + dw ∈ W ∩ Z.
(b) If w + z, w̃ + z̃ ∈ W + Z, then c(w + z) + d(w̃ + z̃) = (cw + dw̃) + (cz + dz̃) ∈ W + Z, since it is the sum of an element of W and an element of Z.
(c) Given any w ∈ W and z ∈ Z, then w, z ∈ W ∪ Z. Thus, if W ∪ Z is a subspace, the sum w + z ∈ W ∪ Z. Thus, either w + z = w̃ ∈ W or w + z = z̃ ∈ Z. In the first case z = w̃ − w ∈ W, while in the second w = z̃ − z ∈ Z. We conclude that for any w ∈ W and z ∈ Z, either w ∈ Z or z ∈ W. Suppose W ⊄ Z. Then we can find w ∈ W \ Z, and so for any z ∈ Z we must have z ∈ W, which proves Z ⊂ W.
♦ 2.2.23. If v, w ∈ ⋂ Wi, then v, w ∈ Wi for each i, and so cv + dw ∈ Wi for any c, d ∈ R because Wi is a subspace. Since this holds for all i, we conclude that cv + dw ∈ ⋂ Wi.
♥ 2.2.24.(a) They clearly only intersect at the origin. Moreover, every v = ( x, y )^T = ( x, 0 )^T + ( 0, y )^T can be written as a sum of vectors on the two axes.
(b) Since the only common solution to x = y and x = 3y is x = y = 0, the lines only intersect at the origin. Moreover, every v = ( x, y )^T = ( a, a )^T + ( 3b, b )^T, where a = −(1/2) x + (3/2) y, b = (1/2) x − (1/2) y, can be written as a sum of vectors on each line.
(c) A vector v = ( a, 2a, 3a )^T in the line belongs to the plane if and only if a + 2(2a) + 3(3a) = 14a = 0, so a = 0 and the only common element is v = 0. Moreover, every
v = ( x, y, z )^T = (1/14) ( x + 2y + 3z, 2(x + 2y + 3z), 3(x + 2y + 3z) )^T + (1/14) ( 13x − 2y − 3z, −2x + 10y − 6z, −3x − 6y + 5z )^T
can be written as a sum of a vector in the line and a vector in the plane.
(d) If w + z = w̃ + z̃, then w − w̃ = z̃ − z. The left hand side belongs to W, while the right hand side belongs to Z, and so, by the first assumption, they must both be equal to 0. Therefore, w = w̃, z = z̃.
2.2.25.(a) (v, w) ∈ V0 ∩ W0 if and only if (v, w) = (v, 0) and (v, w) = (0, w), which means v = 0, w = 0, and hence (v, w) = (0, 0) is the only element of the intersection. Moreover, we can write any element (v, w) = (v, 0) + (0, w).
(b) (v, w) ∈ D ∩ A if and only if v = w and v = −w, hence (v, w) = (0, 0). Moreover, we can write
(v, w) = ( (1/2)(v + w), (1/2)(v + w) ) + ( (1/2)(v − w), −(1/2)(v − w) )
as the sum of an element of D and an element of A.
2.2.26.(a) If f(−x) = f(x) and f̃(−x) = f̃(x), then (cf + df̃)(−x) = cf(−x) + df̃(−x) = cf(x) + df̃(x) = (cf + df̃)(x) for any c, d ∈ R, and hence it is a subspace.
(b) If g(−x) = −g(x), g̃(−x) = −g̃(x), then (cg + dg̃)(−x) = cg(−x) + dg̃(−x) = −cg(x) − dg̃(x) = −(cg + dg̃)(x), proving it is a subspace. If f(x) is both even and odd, then f(x) = f(−x) = −f(x), and so f(x) ≡ 0 for all x. Moreover, we can write any function h(x) = f(x) + g(x) as a sum of an even function f(x) = (1/2)[ h(x) + h(−x) ] and an odd function g(x) = (1/2)[ h(x) − h(−x) ].
(c) This follows from part (b), and the uniqueness follows from Exercise 2.2.24(d).
2.2.27. If A = A^T and A = −A^T is both symmetric and skew-symmetric, then A = O. Given any square matrix, write A = S + J, where S = (1/2)(A + A^T) is symmetric and J = (1/2)(A − A^T) is skew-symmetric. This verifies the two conditions for complementary subspaces. Uniqueness of the decomposition A = S + J follows from Exercise 2.2.24(d).
♦ 2.2.28.(a) By induction, we can show that
f^(n)(x) = P_n(1/x) e^{−1/x} = Q_n(x) e^{−1/x} / x^n,
where P_n(y) and Q_n(x) = x^n P_n(1/x) are certain polynomials of degree n. Thus,
lim_{x→0} f^(n)(x) = lim_{x→0} Q_n(x) e^{−1/x} / x^n = Q_n(0) lim_{y→∞} y^n e^{−y} = 0,
because the exponential e^{−y} goes to zero faster than any power of y goes to ∞.
(b) The Taylor series at a = 0 is 0 + 0x + 0x^2 + · · · ≡ 0, which converges to the zero function, not to e^{−1/x}.
2.2.29.
(a) The Taylor series is the geometric series 1/(1 + x^2) = 1 − x^2 + x^4 − x^6 + · · · .
(b) The ratio test can be used to prove that the series converges precisely when |x| < 1.
(c) Convergence of the Taylor series to f(x) for x near 0 suffices to prove analyticity of the function at x = 0.
♥ 2.2.30.(a) If v + a, w + a ∈ A, then (v + a) + (w + a) = (v + w + a) + a ∈ A requires v + w + a = u ∈ V, and hence a = u − v − w ∈ V.
(b) (i)–(iii): [plots of the three affine subspaces omitted]
(c) Every subspace V ⊂ R^2 is either a point (the origin), or a line through the origin, or all of R^2. Thus, the corresponding affine subspaces are the point a; a line through a; or all of R^2, since in this case a ∈ V = R^2.
(d) Every vector in the plane can be written as ( x, y, z )^T = ( x̃, ỹ, z̃ )^T + ( 1, 0, 0 )^T, where ( x̃, ỹ, z̃ )^T is an arbitrary vector in the subspace defined by x̃ − 2ỹ + 3z̃ = 0.
(e) Every such polynomial can be written as p(x) = q(x) + 1, where q(x) is any element of the subspace of polynomials that satisfy q(1) = 0.
2.3.1. ( −1, 2, 3 )^T = 2 ( 2, −1, 2 )^T − ( 5, −4, 1 )^T.
2.3.2. ( −3, 7, 6, 1 )^T = 3 ( 1, −3, −2, 0 )^T + 2 ( −2, 6, 3, 4 )^T + ( −2, 4, 6, −7 )^T.
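Linear-combination claims like this one reduce to componentwise arithmetic, which a few lines of code can confirm; the helper below is a generic sketch, not from the manual.

```python
# Check of the linear combination in Exercise 2.3.2:
# (-3, 7, 6, 1)^T = 3(1,-3,-2,0)^T + 2(-2,6,3,4)^T + (-2,4,6,-7)^T.

def comb(coeffs, vectors):
    """Linear combination of equal-length vectors (lists)."""
    return [sum(c * v[i] for c, v in zip(coeffs, vectors))
            for i in range(len(vectors[0]))]

v1 = [1, -3, -2, 0]
v2 = [-2, 6, 3, 4]
v3 = [-2, 4, 6, -7]

assert comb([3, 2, 1], [v1, v2, v3]) == [-3, 7, 6, 1]
```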
2.3.3.
(a) Yes, since ( 1, −2, −3 )^T = ( 1, 1, 0 )^T − 3 ( 0, 1, 1 )^T;
(b) Yes, since ( 1, −2, −1 )^T = (3/10) ( 1, 2, 2 )^T + (7/10) ( 1, −2, 0 )^T − (2/5) ( 0, 3, 4 )^T;
(c) No, since the vector equation ( 3, 0, −1, −2 )^T = c1 ( 1, 2, 0, 1 )^T + c2 ( 0, −1, 3, 0 )^T + c3 ( 2, 0, 1, −1 )^T does not have a solution.
2.3.4. Cases (b), (c), (e) span R2.
2.3.5.
(a) The line ( 3t, 0, t )^T: [plot omitted]
(b) The plane z = −(3/5) x − (6/5) y: [plot omitted]
(c) The plane z = −x − y: [plot omitted]
2.3.6. They are the same. Indeed, since v1 = u1 + 2u2, v2 = u1 + u2, every vector v ∈ V can be written as a linear combination v = c1 v1 + c2 v2 = (c1 + c2) u1 + (2c1 + c2) u2, and hence belongs to U. Conversely, since u1 = −v1 + 2v2, u2 = v1 − v2, every vector u ∈ U can be written as a linear combination u = c1 u1 + c2 u2 = (−c1 + c2) v1 + (2c1 − c2) v2, and hence belongs to V.
2.3.7. (a) Every symmetric matrix has the form [a b; b c] = a [1 0; 0 0] + c [0 0; 0 1] + b [0 1; 1 0].
(b) [1 0 0; 0 0 0; 0 0 0], [0 0 0; 0 1 0; 0 0 0], [0 0 0; 0 0 0; 0 0 1], [0 1 0; 1 0 0; 0 0 0], [0 0 1; 0 0 0; 1 0 0], [0 0 0; 0 0 1; 0 1 0].
2.3.8.(a) They span P(2) since ax^2 + bx + c = (1/2)(a − 2b + c)(x^2 + 1) + (1/2)(a − c)(x^2 − 1) + b (x^2 + x + 1).
(b) They span P(3) since ax^3 + bx^2 + cx + d = a(x^3 − 1) + b(x^2 + 1) + c(x − 1) + (a − b + c + d) 1.
(c) They do not span P(3) since ax^3 + bx^2 + cx + d = c1 x^3 + c2 (x^2 + 1) + c3 (x^2 − x) + c4 (x + 1) cannot be solved when b + c − d ≠ 0.
2.3.9. (a) Yes. (b) No. (c) No. (d) Yes: cos2 x = 1− sin2 x. (e) No. (f ) No.
2.3.10. (a) sin 3x = cos( 3x − (1/2) π ); (b) cos x − sin x = √2 cos( x + (1/4) π );
(c) 3 cos 2x + 4 sin 2x = 5 cos( 2x − tan^{−1}(4/3) ); (d) cos x sin x = (1/2) sin 2x = (1/2) cos( 2x − (1/2) π ).
2.3.11. (a) If u1 and u2 are solutions, so is u = c1 u1 + c2 u2, since u'' − 4u' + 3u = c1 (u1'' − 4u1' + 3u1) + c2 (u2'' − 4u2' + 3u2) = 0. (b) span { e^x, e^{3x} }; (c) 2.
2.3.12. Each is a solution, and the general solution u(x) = c1 + c2 cos x + c3 sin x is a linearcombination of the three independent solutions.
2.3.13. (a) e^{2x}; (b) cos 2x, sin 2x; (c) e^{3x}, 1; (d) e^{−x}, e^{−3x}; (e) e^{−x/2} cos( (√3/2) x ), e^{−x/2} sin( (√3/2) x ); (f ) e^{5x}, 1, x; (g) e^{x/√2} cos( x/√2 ), e^{x/√2} sin( x/√2 ), e^{−x/√2} cos( x/√2 ), e^{−x/√2} sin( x/√2 ).
2.3.14. (a) If u1 and u2 are solutions, so is u = c1 u1 + c2 u2, since u'' + 4u = c1 (u1'' + 4u1) + c2 (u2'' + 4u2) = 0, u(0) = c1 u1(0) + c2 u2(0) = 0, u(π) = c1 u1(π) + c2 u2(π) = 0.
(b) span { sin 2x }.
2.3.15. (a) ( 2, 1 )^T = 2 f1(x) + f2(x) − f3(x); (b) not in the span; (c) ( 1 − 2x, −1 − x )^T = f1(x) − f2(x) − f3(x); (d) not in the span; (e) ( 2 − x, 0 )^T = 2 f1(x) − f3(x).
2.3.16. True, since 0 = 0v1 + · · ·+ 0vn.
2.3.17. False. For example, if z = ( 1, 1, 0 )^T, u = ( 1, 0, 0 )^T, v = ( 0, 1, 0 )^T, w = ( 0, 0, 1 )^T, then z = u + v, but the equation w = c1 u + c2 v + c3 z = ( c1 + c3, c2 + c3, 0 )^T has no solution.
♦ 2.3.18. By the assumption, any v ∈ V can be written as a linear combination
v = c1 v1 + · · · + cm vm = c1 v1 + · · · + cm vm + 0 vm+1 + · · · + 0 vn
of the combined collection.
♦ 2.3.19.
(a) If v = Σ_{j=1}^m cj vj and vj = Σ_{i=1}^n aij wi, then v = Σ_{i=1}^n bi wi, where bi = Σ_{j=1}^m aij cj, or, in vector language, b = A c.
(b) Every v ∈ V can be written as a linear combination of v1, . . . , vm, and hence, by part (a), a linear combination of w1, . . . , wn, which shows that w1, . . . , wn also span V.
♦ 2.3.20.
(a) If v = Σ_{i=1}^m ai vi and w = Σ_{i=1}^n bi vi are two finite linear combinations, so is
cv + dw = Σ_{i=1}^{max(m,n)} (c ai + d bi) vi, where we set ai = 0 if i > m and bi = 0 if i > n.
(b) The space P(∞) of all polynomials, since every polynomial is a finite linear combination of monomials and vice versa.
2.3.21. (a) Linearly independent; (b) linearly dependent; (c) linearly dependent;(d) linearly independent; (e) linearly dependent; (f ) linearly dependent;(g) linearly dependent; (h) linearly independent; (i) linearly independent.
2.3.22. (a) The only solution to the homogeneous linear system
c1 ( 1, 0, 2, 1 )^T + c2 ( −2, 3, −1, 1 )^T + c3 ( 2, −2, 1, −1 )^T = 0
is c1 = c2 = c3 = 0.
(b) All but the second lie in the span. (c) a − c + d = 0.
2.3.23.(a) The only solution to the homogeneous linear system
A c = c1 ( 1, 1, 1, 0 )^T + c2 ( 1, 1, −1, 0 )^T + c3 ( 1, −1, 0, 1 )^T + c4 ( 1, −1, 0, −1 )^T = 0,
with nonsingular coefficient matrix A = [1 1 1 1; 1 1 −1 −1; 1 −1 0 0; 0 0 1 −1], is c = 0.
(b) Since A is nonsingular, the inhomogeneous linear system
v = A c = c1 ( 1, 1, 1, 0 )^T + c2 ( 1, 1, −1, 0 )^T + c3 ( 1, −1, 0, 1 )^T + c4 ( 1, −1, 0, −1 )^T
has a solution c = A^{−1} v for any v ∈ R^4.
(c) ( 1, 0, 0, 1 )^T = (1/4) ( 1, 1, 1, 0 )^T + (1/4) ( 1, 1, −1, 0 )^T + (3/4) ( 1, −1, 0, 1 )^T − (1/4) ( 1, −1, 0, −1 )^T.
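The coefficients in part (c) can be verified by exact rational arithmetic:

```python
# Check of the expansion in Exercise 2.3.23(c):
# (1,0,0,1)^T = 1/4 v1 + 1/4 v2 + 3/4 v3 - 1/4 v4.
from fractions import Fraction as F

v1 = [1, 1, 1, 0]
v2 = [1, 1, -1, 0]
v3 = [1, -1, 0, 1]
v4 = [1, -1, 0, -1]
coeffs = [F(1, 4), F(1, 4), F(3, 4), F(-1, 4)]

combo = [sum(c * v[i] for c, v in zip(coeffs, [v1, v2, v3, v4]))
         for i in range(4)]
assert combo == [1, 0, 0, 1]
```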
2.3.24. (a) Linearly dependent; (b) linearly dependent; (c) linearly independent; (d) linearlydependent; (e) linearly dependent; (f ) linearly independent.
2.3.25. False:
[1 0 0; 0 1 0; 0 0 1] − [0 1 0; 1 0 0; 0 0 1] − [0 0 1; 0 1 0; 1 0 0] − [1 0 0; 0 0 1; 0 1 0] + [0 1 0; 0 0 1; 1 0 0] + [0 0 1; 1 0 0; 0 1 0] = O.
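The vanishing combination of the six permutation matrices is readily confirmed entrywise:

```python
# Check of Exercise 2.3.25: the six 3x3 permutation matrices are linearly
# dependent -- the stated signed combination sums to the zero matrix.

I    = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
P12  = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
P13  = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
P23  = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
C123 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
C132 = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

terms = [(1, I), (-1, P12), (-1, P13), (-1, P23), (1, C123), (1, C132)]
total = [[sum(c * M[i][j] for c, M in terms) for j in range(3)]
         for i in range(3)]
assert total == [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```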
2.3.26. False — the zero vector always belongs to the span.
2.3.27. Yes, when it is the zero vector.
2.3.28. Because x, y are linearly independent, 0 = c1 u + c2 v = (a c1 + c c2) x + (b c1 + d c2) y if and only if a c1 + c c2 = 0, b c1 + d c2 = 0. The latter linear system has a nonzero solution (c1, c2) ≠ 0, and so u, v are linearly dependent, if and only if the determinant of the coefficient matrix is zero: det [a c; b d] = ad − bc = 0, proving the result. The full collection x, y, u, v is linearly dependent since, for example, a x + b y − u + 0 v = 0 is a nontrivial linear combination.
2.3.29. The statement is false. For example, any set containing the zero element that does notspan V is linearly dependent.
♦ 2.3.30. (b) If the only solution to A c = 0 is the trivial one c = 0, then the only linear combination which adds up to zero is the trivial one with c1 = · · · = ck = 0, proving linear independence. (c) The vector b lies in the span if and only if b = c1 v1 + · · · + ck vk = A c for some c, which implies that the linear system A c = b has a solution.
♦ 2.3.31.(a) Since v1, . . . , vn are linearly independent,
0 = c1 v1 + · · · + ck vk = c1 v1 + · · · + ck vk + 0 vk+1 + · · · + 0 vn
if and only if c1 = · · · = ck = 0.
(b) This is false. For example, v1 = ( 1, 1 )^T, v2 = ( 2, 2 )^T are linearly dependent, but the subset consisting of just v1 is linearly independent.
2.3.32.(a) They are linearly dependent since (x2 − 3) + 2(2− x)− (x− 1)2 ≡ 0.
(b) They do not span P(2).
2.3.33. (a) Linearly dependent; (b) linearly independent; (c) linearly dependent; (d) linearlyindependent; (e) linearly dependent; (f ) linearly dependent; (g) linearly independent;(h) linearly independent; (i) linearly independent.
2.3.34. When x > 0, we have f(x)− g(x) ≡ 0, proving linear dependence. On the other hand, ifc1f(x) + c2g(x) ≡ 0 for all x, then at, say x = 1, we have c1 + c2 = 0 while at x = −1, wemust have −c1 + c2 = 0, and so c1 = c2 = 0, proving linear independence.
♥ 2.3.35.
(a) 0 = Σ_{i=1}^k ci pi(x) = Σ_{j=0}^n ( Σ_{i=1}^k ci aij ) x^j if and only if Σ_{i=1}^k ci aij = 0 for j = 0, . . . , n, or, in matrix notation, A^T c = 0. Thus, the polynomials are linearly independent if and only if the linear system A^T c = 0 has only the trivial solution c = 0, if and only if its (n+1) × k coefficient matrix has rank A^T = rank A = k.
(b) q(x) = Σ_{j=0}^n bj x^j = Σ_{i=1}^k ci pi(x) if and only if A^T c = b.
(c) A = [−1 0 0 1 0; 4 −2 0 1 0; 0 −4 0 0 1; 1 0 1 0 0; 1 2 0 4 −1] has rank 4, and so they are linearly dependent.
(d) q(x) is not in the span.
♦ 2.3.36. Suppose the linear combination p(x) = c0 + c1 x + c2 x^2 + · · · + cn x^n ≡ 0 for all x. Then every real x is a root of p(x), but the Fundamental Theorem of Algebra says this is only possible if p(x) is the zero polynomial, with coefficients c0 = c1 = · · · = cn = 0.
♥ 2.3.37.(a) If c1 f1(x) + · · · + cn fn(x) ≡ 0, then c1 f1(xi) + · · · + cn fn(xi) = 0 at all sample points, and so c1 f1 + · · · + cn fn = 0. Thus, linear dependence of the functions implies linear dependence of their sample vectors.
(b) Sampling f1(x) = 1 and f2(x) = x^2 at −1, 1 produces the linearly dependent sample vectors f1 = f2 = ( 1, 1 )^T.
(c) Sampling at 0, (1/4) π, (1/2) π, (3/4) π, π leads to the linearly independent sample vectors
( 1, 1, 1, 1, 1 )^T, ( 1, √2/2, 0, −√2/2, −1 )^T, ( 0, √2/2, 1, √2/2, 0 )^T, ( 1, 0, −1, 0, 1 )^T, ( 0, 1, 0, −1, 0 )^T.
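The independence of the five sample vectors in part (c) is equivalent to the 5 × 5 sample matrix having nonzero determinant, which the following sketch verifies with a simple cofactor expansion:

```python
# Check for Exercise 2.3.37(c): the five sample vectors (columns of A)
# are linearly independent, i.e. the sample matrix has nonzero determinant.
import math

s = math.sqrt(2) / 2
cols = [
    [1, 1, 1, 1, 1],
    [1, s, 0, -s, -1],
    [0, s, 1, s, 0],
    [1, 0, -1, 0, 1],
    [0, 1, 0, -1, 0],
]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(n))

A = [[cols[j][i] for j in range(5)] for i in range(5)]
assert abs(det(A)) > 1e-6   # nonzero, hence the columns are independent
```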
2.3.38.(a) Suppose c1 f1(t) + · · · + cn fn(t) ≡ 0 for all t. Then c1 f1(t0) + · · · + cn fn(t0) = 0, and hence, by linear independence of the sample vectors, c1 = · · · = cn = 0, which proves linear independence of the functions.
(b) c1 f1(t) + c2 f2(t) = ( 2 c2 t + (c1 − c2), 2 c2 t^2 + (c1 − c2) t )^T ≡ 0 if and only if c2 = 0, c1 − c2 = 0, and so c1 = c2 = 0, proving linear independence. However, at any t0, the vectors f1(t0) and f2(t0) = (2 t0 − 1) f1(t0) are scalar multiples of each other, and hence linearly dependent.
♥ 2.3.39.(a) Suppose c1 f(x) + c2 g(x) ≡ 0 for all x for some c = ( c1, c2 )^T ≠ 0. Differentiating, we find c1 f′(x) + c2 g′(x) ≡ 0 also, and hence [f(x) g(x); f′(x) g′(x)] ( c1, c2 )^T = 0 for all x. The homogeneous system has a nonzero solution if and only if the coefficient matrix is singular, which requires its determinant W[ f(x), g(x) ] = 0.
(b) This is the contrapositive of part (a), since if f, g were not linearly independent, then their Wronskian would vanish everywhere.
(c) Suppose c1 f(x) + c2 g(x) = c1 x^3 + c2 |x|^3 ≡ 0. Then, at x = 1, c1 + c2 = 0, whereas at x = −1, −c1 + c2 = 0. Therefore, c1 = c2 = 0, proving linear independence. On the other hand, W[ x^3, |x|^3 ] = x^3 (3 x^2 sign x) − (3 x^2) |x|^3 ≡ 0.
2.4.1. Only (a) and (c) are bases.
2.4.2. Only (b) is a basis.
2.4.3. (a) ( 1, 0, 0 )^T, ( 0, 1, 2 )^T; (b) ( 3/4, 1, 0 )^T, ( 1/4, 0, 1 )^T; (c) ( −2, 1, 0, 0 )^T, ( −1, 0, 1, 0 )^T, ( 1, 0, 0, 1 )^T.
2.4.4.(a) They do not span R^3 because the linear system A c = b with coefficient matrix A = [1 3 2 4; 0 −1 −1 −1; 2 1 −1 3] does not have a solution for all b, since rank A = 2.
(b) 4 vectors in R^3 are automatically linearly dependent.
(c) No, because if v1, v2, v3, v4 don't span R^3, no subset of them will span it either.
(d) 2, because v1 and v2 are linearly independent and span the subspace, and hence form a basis.
2.4.5.(a) They span R^3 because the linear system A c = b with coefficient matrix A = [1 2 0 1; −1 −2 −2 3; 2 5 1 −1] has a solution for all b, since rank A = 3.
(b) 4 vectors in R^3 are automatically linearly dependent.
(c) Yes, because v1, v2, v3 also span R^3 and so form a basis.
(d) 3, because they span all of R^3.
2.4.6.
(a) Solving the defining equation, the general vector in the plane is x = ( 2y + 4z, y, z )^T, where y, z are arbitrary. We can write
x = y ( 2, 1, 0 )^T + z ( 4, 0, 1 )^T = (y + 2z) ( 2, −1, 1 )^T + (y + z) ( 0, 2, −1 )^T,
and hence both pairs of vectors span the plane. Both pairs are linearly independent since they are not parallel, and hence both form a basis.
(b) ( 2, −1, 1 )^T = (−1) ( 2, 1, 0 )^T + ( 4, 0, 1 )^T, ( 0, 2, −1 )^T = 2 ( 2, 1, 0 )^T − ( 4, 0, 1 )^T;
(c) Any two linearly independent solutions, e.g., ( 6, 1, 1 )^T, ( 10, 1, 2 )^T, will form a basis.
♥ 2.4.7. (a) (i) Left handed basis; (ii) right handed basis; (iii) not a basis; (iv) right handed basis. (b) Switching two columns or multiplying a column by −1 changes the sign of the determinant. (c) If det A = 0, its columns are linearly dependent and hence can't form a basis.
2.4.8.
(a) ( −2/3, 5/6, 1, 0 )^T, ( 1/3, −2/3, 0, 1 )^T; dim = 2.
(b) The condition p(1) = 0 says a + b + c = 0, so p(x) = (−b − c) x^2 + b x + c = b (−x^2 + x) + c (−x^2 + 1). Therefore −x^2 + x, −x^2 + 1 is a basis, and so dim = 2.
(c) e^x, cos 2x, sin 2x is a basis, so dim = 3.
2.4.9. (a) ( 3, 1, −1 )^T, dim = 1; (b) ( 2, 0, 1 )^T, ( 0, −1, 3 )^T, dim = 2; (c) ( 1, 0, −1, 2 )^T, ( 0, 1, 1, 3 )^T, ( 1, −2, 1, 1 )^T, dim = 3.
2.4.10. (a) We have a + bt + ct^2 = c1 (1 + t^2) + c2 (t + t^2) + c3 (1 + 2t + t^2) provided a = c1 + c3, b = c2 + 2c3, c = c1 + c2 + c3. The coefficient matrix of this linear system, [1 0 1; 0 1 2; 1 1 1], is nonsingular, and hence there is a solution for any a, b, c, proving that they span the space of quadratic polynomials. Also, they are linearly independent since the linear combination is zero if and only if c1, c2, c3 satisfy the corresponding homogeneous linear system c1 + c3 = 0, c2 + 2c3 = 0, c1 + c2 + c3 = 0, and hence c1 = c2 = c3 = 0. (Or, you can use the fact that dim P(2) = 3 and the spanning property to conclude that they form a basis.)
(b) 1 + 4t + 7t^2 = 2 (1 + t^2) + 6 (t + t^2) − (1 + 2t + t^2).
2.4.11. (a) a + bt + ct^2 + dt^3 = c1 + c2 (1 − t) + c3 (1 − t)^2 + c4 (1 − t)^3 provided a = c1 + c2 + c3 + c4, b = −c2 − 2c3 − 3c4, c = c3 + 3c4, d = −c4. The coefficient matrix [1 1 1 1; 0 −1 −2 −3; 0 0 1 3; 0 0 0 −1] is nonsingular, and hence they span P(3). Also, they are linearly independent since the linear combination is zero if and only if c1 = c2 = c3 = c4 = 0 satisfy the corresponding homogeneous linear system. (Or, you can use the fact that dim P(3) = 4 and the spanning property to conclude that they form a basis.) (b) 1 + t^3 = 2 − 3 (1 − t) + 3 (1 − t)^2 − (1 − t)^3.
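The two explicit expansions in parts (b) can be confirmed by evaluating both sides at sample points (a cubic is determined by four values; we use six):

```python
# Check of the expansions in Exercises 2.4.10(b) and 2.4.11(b) by
# evaluating both sides at several integer sample points.

for t in [-2, -1, 0, 1, 2, 3]:
    # 2.4.10(b): 1 + 4t + 7t^2 = 2(1+t^2) + 6(t+t^2) - (1+2t+t^2)
    assert 1 + 4*t + 7*t**2 == 2*(1 + t**2) + 6*(t + t**2) - (1 + 2*t + t**2)
    # 2.4.11(b): 1 + t^3 = 2 - 3(1-t) + 3(1-t)^2 - (1-t)^3
    assert 1 + t**3 == 2 - 3*(1 - t) + 3*(1 - t)**2 - (1 - t)**3
```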
2.4.12. (a) They are linearly dependent because 2p1 − p2 + p3 ≡ 0. (b) The dimension is 2,since p1, p2 are linearly independent and span the subspace, and hence form a basis.
2.4.13.
(a) The sample vectors ( 1, 1, 1, 1 )^T, ( 1, √2/2, 0, −√2/2 )^T, ( 1, 0, −1, 0 )^T, ( 1, −√2/2, 0, √2/2 )^T are linearly independent and hence form a basis for R^4, the space of sample functions.
(b) Sampling x produces
( 0, 1/4, 1/2, 3/4 )^T = (1/2) ( 1, 1, 1, 1 )^T − ((2 + √2)/8) ( 1, √2/2, 0, −√2/2 )^T − ((2 − √2)/8) ( 1, −√2/2, 0, √2/2 )^T.
2.4.14.
(a) E11 = [1 0; 0 0], E12 = [0 1; 0 0], E21 = [0 0; 1 0], E22 = [0 0; 0 1] is a basis since we can uniquely write any [a b; c d] = a E11 + b E12 + c E21 + d E22.
(b) Similarly, the matrices Eij with a 1 in position (i, j) and all other entries 0, for i = 1, . . . , m, j = 1, . . . , n, form a basis for Mm×n, which therefore has dimension mn.
2.4.15. k 6= −1, 2.
2.4.16. A basis is given by the matrices Eii, i = 1, . . . , n which have a 1 in the ith diagonalposition and all other entries 0.
2.4.17.
(a) E11 = [1 0; 0 0], E12 = [0 1; 0 0], E22 = [0 0; 0 1]; dimension = 3.
(b) A basis is given by the matrices Eij with a 1 in position (i, j) and all other entries 0 for 1 ≤ i ≤ j ≤ n, so the dimension is (1/2) n(n + 1).
2.4.18. (a) Symmetric: dim = 3; skew-symmetric: dim = 1; (b) symmetric: dim = 6; skew-symmetric: dim = 3; (c) symmetric: dim = (1/2) n(n + 1); skew-symmetric: dim = (1/2) n(n − 1).
♥ 2.4.19.(a) If a row (column) of A adds up to a and the corresponding row (column) of B adds up to b, then the corresponding row (column) of C = A + B adds up to c = a + b. Thus, if all row and column sums of A and B are the same, the same is true for C. Similarly, the row (column) sums of cA are c times the row (column) sums of A, and hence all the same if A is a semi-magic square.
(b) A matrix A = [a b c; d e f; g h j] is a semi-magic square if and only if
a + b + c = d + e + f = g + h + j = a + d + g = b + e + h = c + f + j.
The general solution to this system is
A = e [1 −1 0; −1 1 0; 0 0 0] + f [1 0 −1; −1 0 1; 0 0 0] + g [−1 1 1; 1 0 0; 1 0 0] + h [0 0 1; 1 0 0; 0 1 0] + j [0 1 0; 1 0 0; 0 0 1]
= (e − g) [1 0 0; 0 1 0; 0 0 1] + (g + j − e) [0 1 0; 1 0 0; 0 0 1] + g [0 0 1; 0 1 0; 1 0 0] + f [1 0 0; 0 0 1; 0 1 0] + (h − f) [0 0 1; 1 0 0; 0 1 0],
which is a linear combination of permutation matrices.
(c) The dimension is 5, with any 5 of the 6 permutation matrices forming a basis.
(d) Yes, by the same reasoning as in part (a). Its dimension is 3, with basis
[2 2 −1; −2 1 4; 3 0 0], [2 −1 2; 1 1 1; 0 3 0], [−1 2 2; 4 1 −2; 0 0 3].
(e) A = c1 [2 2 −1; −2 1 4; 3 0 0] + c2 [2 −1 2; 1 1 1; 0 3 0] + c3 [−1 2 2; 4 1 −2; 0 0 3] for any c1, c2, c3.
♦ 2.4.20. For instance, take v1 = ( 1, 0 )^T, v2 = ( 0, 1 )^T, v3 = ( 1, 1 )^T. Then ( 2, 1 )^T = 2 v1 + v2 = v1 + v3. In fact, there are infinitely many different ways of writing this vector as a linear combination of v1, v2, v3.
♦ 2.4.21.(a) By Theorem 2.31, we only need prove linear independence. If 0 = c1 A v1 + · · · + cn A vn = A (c1 v1 + · · · + cn vn), then, since A is nonsingular, c1 v1 + · · · + cn vn = 0, and hence c1 = · · · = cn = 0.
(b) A ei is the ith column of A, and so a basis consists of the column vectors of the matrix.
♦ 2.4.22. Since V ≠ {0}, at least one vi ≠ 0. Let v_{i1} ≠ 0 be the first nonzero vector in the list v1, . . . , vn. Then, for each k = i1 + 1, . . . , n − 1, suppose we have selected linearly independent vectors v_{i1}, . . . , v_{ij} from among v1, . . . , vk. If v_{i1}, . . . , v_{ij}, v_{k+1} form a linearly independent set, we set v_{i_{j+1}} = v_{k+1}; otherwise, v_{k+1} is a linear combination of v_{i1}, . . . , v_{ij}, and is not needed in the basis. The resulting collection v_{i1}, . . . , v_{im} forms a basis for V since they are linearly independent by design, and span V since each vi either appears in the basis or is a linear combination of the basis elements that were selected before it. We have dim V = n if and only if v1, . . . , vn are linearly independent and so form a basis for V.
♦ 2.4.23. This is a special case of Exercise 2.3.31(a).
♦ 2.4.24.(a) m ≤ n, as otherwise v1, . . . , vm would be linearly dependent. If m = n, then v1, . . . , vn are linearly independent and hence, by Theorem 2.31, span all of R^n. Since every vector in their span also belongs to V, we must have V = R^n.
(b) Starting with the basis v1, . . . , vm of V with m < n, we choose any v_{m+1} ∈ R^n \ V. Since v_{m+1} does not lie in the span of v1, . . . , vm, the vectors v1, . . . , v_{m+1} are linearly independent and span an (m+1)-dimensional subspace of R^n. Unless m + 1 = n, we can then choose another vector v_{m+2} not in the span of v1, . . . , v_{m+1}, and so v1, . . . , v_{m+2} are also linearly independent. We continue on in this fashion until we arrive at n linearly independent vectors v1, . . . , vn, which necessarily form a basis of R^n.
(c) (i) ( 1, 1, 1/2 )^T, ( 1, 0, 0 )^T, ( 0, 1, 0 )^T; (ii) ( 1, 0, −1 )^T, ( 0, 1, −2 )^T, ( 1, 0, 0 )^T.
♦ 2.4.25.(a) If dim V = ∞, then the inequality is trivial. Also, if dim W = ∞, then one can find infinitely many linearly independent elements in W, but these are also linearly independent as elements of V, and so dim V = ∞ also. Otherwise, let w1, . . . , wn form a basis for W. Since they are linearly independent, Theorem 2.31 implies n ≤ dim V.
(b) Since w1, . . . , wn are linearly independent, if n = dim V, then by Theorem 2.31 they form a basis for V. Thus every v ∈ V can be written as a linear combination of w1, . . . , wn, and hence, since W is a subspace, v ∈ W too. Therefore, W = V.
(c) Example: V = C^0[a, b] and W = P(∞).
♦ 2.4.26. (a) Every v ∈ V can be uniquely decomposed as v = w + z, where w ∈ W, z ∈ Z. Write w = c1 w1 + · · · + cj wj and z = d1 z1 + · · · + dk zk. Then v = c1 w1 + · · · + cj wj + d1 z1 + · · · + dk zk, proving that w1, . . . , wj, z1, . . . , zk span V. Moreover, by uniqueness, v = 0 if and only if w = 0 and z = 0, and so the only linear combination that sums up to 0 ∈ V is the trivial one c1 = · · · = cj = d1 = · · · = dk = 0, which proves linear independence of the full collection. (b) This follows immediately from part (a): dim V = j + k = dim W + dim Z.
♦ 2.4.27. Suppose the functions are linearly independent. This means that for every 0 ≠ c = ( c1, c2, . . . , cn )^T ∈ R^n, there is a point x_c ∈ R such that Σ_{i=1}^n ci fi(x_c) ≠ 0. The assumption says that V_{x1,...,xm} ≠ {0} for all choices of sample points. Recursively define the following sample points. Choose x1 so that f1(x1) ≠ 0. (This is possible, since if f1(x) ≡ 0, then the functions are linearly dependent.) Thus V_{x1} ⊊ R^n, since e1 ∉ V_{x1}. Then, for each m = 1, 2, . . . , given x1, . . . , xm, choose 0 ≠ c0 ∈ V_{x1,...,xm}, and set x_{m+1} = x_{c0}. Then c0 ∉ V_{x1,...,x_{m+1}} ⊊ V_{x1,...,xm}, and hence, by induction, dim V_{x1,...,xm} ≤ n − m. In particular, dim V_{x1,...,xn} = 0, so V_{x1,...,xn} = {0}, which contradicts our assumption and proves the result. Note that the proof implies we only need check linear dependence at all possible collections of n sample points to conclude that the functions are linearly dependent.
2.5.1.
(a) Range: all b = ( b1, b2 )^T such that (3/4) b1 + b2 = 0; kernel spanned by ( 1/2, 1 )^T.
(b) Range: all b = ( b1, b2 )^T such that 2 b1 + b2 = 0; kernel spanned by ( 1, 1, 0 )^T, ( −2, 0, 1 )^T.
(c) Range: all b = ( b1, b2, b3 )^T such that −2 b1 + b2 + b3 = 0; kernel spanned by ( −5/4, −7/8, 1 )^T.
(d) Range: all b = ( b1, b2, b3, b4 )^T such that −2 b1 − b2 + b3 = 2 b1 + 3 b2 + b4 = 0; kernel spanned by ( 1, 1, 1, 0 )^T, ( −1, 0, 0, 1 )^T.
2.5.2. (a) ( −5/2, 0, 1 )^T, ( 1/2, 1, 0 )^T: plane; (b) ( 1/4, 3/8, 1 )^T: line; (c) ( 2, 0, 1 )^T, ( −3, 1, 0 )^T: plane; (d) ( −1, −2, 1 )^T: line; (e) ( 0, 0, 0 )^T: point; (f ) ( 1/3, 5/3, 1 )^T: line.
2.5.3.
(a) Kernel spanned by ( 3, 1, 0, 0 )^T; range spanned by ( 1, 2, 0 )^T, ( 2, 0, 1 )^T, ( 0, 2, −3 )^T;
(b) compatibility: −(1/2) a + (1/4) b + c = 0.
2.5.4. (a) b = ( −1, 2, −1 )^T; (b) x = ( 1 + t, 2 + t, 3 + t )^T, where t is arbitrary.
2.5.5. In each case, the solution is x = x* + z, where x* is the particular solution and z belongs to the kernel:
(a) x* = ( 1, 0, 0 )^T, z = y ( 1, 1, 0 )^T + z ( −3, 0, 1 )^T; (b) x* = ( 1, −1, 0 )^T, z = z ( −2/7, 1/7, 1 )^T;
(c) x* = ( −7/9, 2/9, 10/9 )^T, z = z ( 2, 2, 1 )^T; (d) x* = ( 5/6, 1, −2/3 )^T, z = 0; (e) x* = ( −1, 0 )^T, z = v ( 2, 1 )^T;
(f ) x* = ( 11/2, 1/2, 0, 0 )^T, z = r ( −13/2, −3/2, 1, 0 )^T + s ( −3/2, −1/2, 0, 1 )^T; (g) x* = ( 3, 2, 0, 0 )^T, z = z ( 6, 2, 1, 0 )^T + w ( −4, −1, 0, 1 )^T.
2.5.6. The ith entry of A ( 1, 1, . . . , 1 )^T is ai1 + · · · + ain, which is n times the average of the entries in the ith row. Thus, A ( 1, 1, . . . , 1 )^T = 0 if and only if each row of A has average 0.
2.5.7. The kernel has dimension n − 1, with basis −r^{k−1} e1 + ek = ( −r^{k−1}, 0, . . . , 0, 1, 0, . . . , 0 )^T for k = 2, . . . , n. The range has dimension 1, with basis ( 1, r^n, r^{2n}, . . . , r^{(n−1)n} )^T.
♦ 2.5.8. (a) If w = P w, then w ∈ rng P. On the other hand, if w ∈ rng P, then w = P v for some v. But then P w = P^2 v = P v = w. (b) Given v, set w = P v. Then v = w + z, where z = v − w ∈ ker P, since P z = P v − P w = P v − P^2 v = P v − P v = 0. Moreover, if w ∈ ker P ∩ rng P, then 0 = P w = w, and so ker P ∩ rng P = {0}, proving complementarity.
2.5.9. False. For example, if A = [1 1; −1 −1], then ( 1, −1 )^T is in both ker A and rng A.
♦ 2.5.10. Let r1, . . . , r_{m+k} be the rows of C, so r1, . . . , rm are the rows of A. For v ∈ ker C, the ith entry of C v = 0 is ri v = 0, but then this implies A v = 0, and so v ∈ ker A. As an example, A = ( 1 0 ) has kernel spanned by ( 0, 1 )^T, while C = [1 0; 0 1] has ker C = {0}.
♦ 2.5.11. If b = A x ∈ rng A, then b = C z where z = ( x, 0 )^T, and so b ∈ rng C. As an example, A = ( 0, 0 )^T has rng A = {0}, while the range of C = [0 1; 0 0] is the x axis.
2.5.12. x1* = ( −2, 3/2 )^T, x2* = ( −1, 1/2 )^T; x = x1* + 4 x2* = ( −6, 7/2 )^T.
2.5.13. x* = 2 x1* + x2* = ( −1, 3, 3 )^T.
2.5.14.(a) By direct matrix multiplication: A x1* = A x2* = ( 1, −3, 5 )^T.
(b) The general solution is x = x1* + t (x2* − x1*) = (1 − t) x1* + t x2* = ( 1, 1, 0 )^T + t ( −4, 2, −2 )^T.
2.5.15. 5 meters.
2.5.16. The mass will move 6 units in the horizontal direction and −6 units in the vertical di-rection.
2.5.17. x = c1 x1* + c2 x2*, where c1 = 1 − c2.
2.5.18. False: in general, (A + B) x* = (A + B) x1* + (A + B) x2* = c + d + B x1* + A x2*, and the third and fourth terms don't necessarily add up to 0.
♦ 2.5.19. rng A = R^n, and so A must be a nonsingular matrix.
♦ 2.5.20.(a) If A xi = ei, then xi = A^{−1} ei, which, by (2.13), is the ith column of the matrix A^{−1}.
(b) The solutions to A xi = ei in this case are x1 = ( 1/2, 2, −1/2 )^T, x2 = ( −1/2, −1, 1/2 )^T, x3 = ( 1/2, −1, 1/2 )^T, which are the columns of A^{−1} = [1/2 −1/2 1/2; 2 −1 −1; −1/2 1/2 1/2].
2.5.21.
(a) range: ( 1, 2 )^T; corange: ( 1, −3 )^T; kernel: ( 3, 1 )^T; cokernel: ( −2, 1 )^T.
(b) range: ( 0, 1, 2 )^T, ( −8, −1, 6 )^T; corange: ( 1, 2, −1 )^T, ( 0, 0, −8 )^T; kernel: ( −2, 1, 0 )^T; cokernel: ( 1, −2, 1 )^T.
(c) range: ( 1, 1, 2 )^T, ( 1, 0, 3 )^T; corange: ( 1, 1, 2, 1 )^T, ( 0, −1, −3, 2 )^T; kernel: ( 1, −3, 1, 0 )^T, ( −3, 2, 0, 1 )^T; cokernel: ( −3, 1, 1 )^T.
(d) range: ( 1, 0, 2, 3, 1 )^T, ( −3, 3, −3, −3, 0 )^T, ( 1, −2, 0, 3, 3 )^T; corange: ( 1, −3, 2, 2, 1 )^T, ( 0, 3, −6, 0, −2 )^T, ( 0, 0, 0, 0, 4 )^T; kernel: ( 4, 2, 1, 0, 0 )^T, ( −2, 0, 0, 1, 0 )^T; cokernel: ( −2, −1, 1, 0, 0 )^T, ( 2, 1, 0, −1, 1 )^T.
2.5.22. ( −1, 2, −3 )^T, ( 0, 1, 2 )^T, ( −3, 1, 0 )^T, which are its first, third and fourth columns;
second column: ( 2, −4, 6 )^T = −2 ( −1, 2, −3 )^T; fifth column: ( 5, −4, 8 )^T = −2 ( −1, 2, −3 )^T + ( 0, 1, 2 )^T − ( −3, 1, 0 )^T.
2.5.23. range: ( 1, 2, −3 )^T, ( 0, 4, 1 )^T; corange: ( 1, −3, 0 )^T, ( 0, 0, 4 )^T; second column: ( −3, −6, 9 )^T = −3 ( 1, 2, −3 )^T;
second and third rows: ( 2, −6, 4 ) = 2 ( 1, −3, 0 ) + ( 0, 0, 4 ), ( −3, 9, 1 ) = −3 ( 1, −3, 0 ) + (1/4) ( 0, 0, 4 ).
2.5.24.(i) rank = 1; dim rng A = dim corng A = 1, dim ker A = dim coker A = 1;
kernel basis:
−2
1
!; cokernel basis:
21
!; compatibility conditions: 2b1 + b2 = 0;
example: b =
1−2
!, with solution x =
10
!+ z
−2
1
!.
(ii) rank = 1; dim rng A = dim corng A = 1, dimker A = 2, dim coker A = 1; kernel basis:0B@
13
10
1CA,
0B@
23
01
1CA; cokernel basis:
21
!; compatibility conditions: 2b1 + b2 = 0;
example: b =
3−6
!, with solution x =
0B@
100
1CA+ y
0B@
13
10
1CA+ z
0B@
23
01
1CA.
(iii) rank = 2; dim rng A = dim corng A = 2, dim ker A = 0, dim coker A = 1;
kernel: 0; cokernel basis: ( −20/13, 3/13, 1 )T; compatibility conditions: −(20/13) b1 + (3/13) b2 + b3 = 0;
example: b = ( 1, −2, 2 )T, with solution x = ( 1, 0, 0 )T.
(iv) rank = 2; dim rng A = dim corng A = 2, dim ker A = dim coker A = 1;
kernel basis: ( −2, −1, 1 )T; cokernel basis: ( −2, 1, 1 )T; compatibility conditions: −2b1 + b2 + b3 = 0; example: b = ( 2, 1, 3 )T, with solution x = ( 1, 0, 0 )T + z ( −2, −1, 1 )T.
(v) rank = 2; dim rng A = dim corng A = 2, dim ker A = 1, dim coker A = 2; kernel basis: ( −1, −1, 1 )T; cokernel basis: ( −9/4, 1/4, 1, 0 )T, ( 1/4, −1/4, 0, 1 )T; compatibility: −(9/4) b1 + (1/4) b2 + b3 = 0, (1/4) b1 − (1/4) b2 + b4 = 0; example: b = ( 2, 6, 3, 1 )T, with solution x = ( 1, 0, 0 )T + z ( −1, −1, 1 )T.
(vi) rank = 3; dim rng A = dim corng A = 3, dim ker A = dim coker A = 1; kernel basis: ( 13/4, 13/8, −7/2, 1 )T; cokernel basis: ( −1, −1, 1, 1 )T; compatibility conditions: −b1 − b2 + b3 + b4 = 0;
example: b = ( 1, 3, 1, 3 )T, with solution x = ( 1, 0, 0, 0 )T + w ( 13/4, 13/8, −7/2, 1 )T.
(vii) rank = 4; dim rng A = dim corng A = 4, dim ker A = 1, dim coker A = 0; kernel basis: ( −2, 1, 0, 0, 0 )T; cokernel is 0; no conditions;
example: b = ( 2, 1, 3, −3 )T, with x = ( 1, 0, 0, 0, 0 )T + y ( −2, 1, 0, 0, 0 )T.
2.5.25.
(a) dim = 2; basis: ( 1, 2, −1 )T, ( 2, 2, 0 )T;
(b) dim = 1; basis: ( 1, 1, −1 )T;
(c) dim = 3; basis: ( 1, 0, 1, 0 )T, ( 1, 0, 0, 1 )T, ( 2, 2, 1, 0 )T;
(d) dim = 3; basis: ( 1, 0, −3, 2 )T, ( 0, 1, 2, −3 )T, ( 1, −3, −8, 7 )T;
(e) dim = 3; basis: ( 1, 1, −1, 1, 1 )T, ( 2, −1, 2, 2, 1 )T, ( 1, 3, −1, 2, 1 )T.
2.5.26. It’s the span of ( 1, 1, 0, 0 )T, ( −3, 0, 1, 0 )T, ( 0, 2, 3, 1 )T, ( 0, 4, −1, −1 )T; the dimension is 3.
2.5.27. (a) ( 2, 0, 1, 0 )T, ( 0, −1, 0, 1 )T; (b) ( 1, 1, 1, 0 )T, ( 0, −1, 0, 1 )T; (c) ( −1, 3, 0, 1 )T.
2.5.28. First method: ( 1, 0, 2, 1 )T, ( 2, 3, −4, 5 )T; second method: ( 1, 0, 2, 1 )T, ( 0, 3, −8, 3 )T. The first vectors are the same, while
( 2, 3, −4, 5 )T = 2 ( 1, 0, 2, 1 )T + ( 0, 3, −8, 3 )T and ( 0, 3, −8, 3 )T = −2 ( 1, 0, 2, 1 )T + ( 2, 3, −4, 5 )T.
2.5.29. Both sets are linearly independent and hence span a three-dimensional subspace of R^4. Moreover, w1 = v1 + v3, w2 = v1 + v2 + 2v3, w3 = v1 + v2 + v3 all lie in the span of v1, v2, v3 and hence, by Theorem 2.31(d), also form a basis for the subspace.
2.5.30.
(a) If A = AT, then ker A = { x | Ax = 0 } = { x | AT x = 0 } = coker A, and rng A = { Ax } = { AT x } = corng A.
(b) ker A = coker A has basis ( 2, −1, 1 )T; rng A = corng A has basis ( 1, 2, 0 )T, ( 2, 6, 2 )T.
(c) No. For instance, if A is any nonsingular matrix, then ker A = coker A = 0 and rng A = corng A = R^3.
2.5.31.
(a) Yes. This is our method of constructing the basis for the range, and the proof is outlined in the text.
(b) No. For example, if
A =
[ 1 0 0 0 ]
[ 1 0 0 0 ]
[ 0 1 0 0 ]
[ 0 0 1 0 ],  then  U =
[ 1 0 0 0 ]
[ 0 1 0 0 ]
[ 0 0 1 0 ]
[ 0 0 0 0 ],
and the first three rows of U form a basis for the three-dimensional corng U = corng A, but the first three rows of A only span a two-dimensional subspace.
(c) Yes, since ker U = ker A.
(d) No, since coker U ≠ coker A in general. For the example in part (b), coker A has basis ( −1, 1, 0, 0 )T while coker U has basis ( 0, 0, 0, 1 )T.
2.5.32. (a) Example:
[ 0 0 ]
[ 1 0 ].
(b) No, since then the first r rows of U are linear combinations of the first r rows of A. Hence these rows span corng A, which, by Theorem 2.31(c), implies that they form a basis for the corange.
2.5.33. Examples: any symmetric matrix; any permutation matrix, since the row echelon form is the identity. Yet another example is the complex matrix
[ 0 0 1 ]
[ 1 i i ]
[ 0 i i ].
♦ 2.5.34. The rows r1, . . . , rm of A span the corange. Reordering the rows — in particular interchanging two — will not change the span. Also, multiplying any of the rows by nonzero scalars, r̃i = ai ri for ai ≠ 0, will also span the same space, since
v = Σi ci ri = Σi (ci / ai) r̃i.
2.5.35. We know rng A ⊂ R^m is a subspace of dimension r = rank A. In particular, rng A = R^m if and only if it has dimension m = rank A.
2.5.36. This is false. If
A =
[ 1 1 ]
[ 1 1 ],
then rng A is spanned by ( 1, 1 )T, whereas the range of its row echelon form
U =
[ 1 1 ]
[ 0 0 ]
is spanned by ( 1, 0 )T.
♦ 2.5.37.
(a) Method 1: choose the nonzero rows in the row echelon form of A. Method 2: choose the columns of AT that correspond to pivot columns of its row echelon form.
(b) Method 1: ( 1, 2, 4 )T, ( 3, −1, 5 )T, ( 2, −4, 2 )T. Method 2: ( 1, 2, 4 )T, ( 0, −7, −7 )T, ( 0, 0, 2 )T. Not the same.
♦ 2.5.38. If v ∈ ker A then Av = 0, and so BAv = B0 = 0, so v ∈ ker(BA). The first statement follows from setting B = A.
♦ 2.5.39. If v ∈ rng AB then v = ABx for some vector x. But then v = Ay where y = Bx, and so v ∈ rng A. The first statement follows from setting B = A.
2.5.40. First note that BA and AC also have size m × n. To show rank A = rank BA, we prove that ker A = ker BA, and so rank A = n − dim ker A = n − dim ker BA = rank BA. Indeed, if v ∈ ker A, then Av = 0 and hence BAv = 0, so v ∈ ker BA. Conversely, if v ∈ ker BA then BAv = 0. Since B is nonsingular, this implies Av = 0 and hence v ∈ ker A, proving the first result. To show rank A = rank AC, we prove that rng A = rng AC, and so rank A = dim rng A = dim rng AC = rank AC. Indeed, if b ∈ rng AC, then b = ACx for some x, and so b = Ay where y = Cx, and so b ∈ rng A. Conversely, if b ∈ rng A then b = Ay for some y, and so b = ACx where x = C−1y, so b ∈ rng AC, proving the second result. The final equality is a consequence of the first two: rank A = rank BA = rank (BA)C.
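The invariance of rank under multiplication by nonsingular matrices can be illustrated computationally. A sketch in stdlib Python with exact rational arithmetic; the matrices A, B, C below are small examples chosen for illustration, not taken from the exercise:

```python
from fractions import Fraction

def rank(M):
    """Rank via Gaussian elimination over exact rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            if M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]    # rank 2 (second row = 2 * first row)
B = [[1, 1, 0], [0, 1, 0], [2, 0, 1]]    # nonsingular
C = [[0, 1, 0], [1, 0, 0], [0, 0, 2]]    # nonsingular

assert rank(A) == 2
assert rank(matmul(B, A)) == 2                 # rank BA = rank A
assert rank(matmul(A, C)) == 2                 # rank AC = rank A
assert rank(matmul(matmul(B, A), C)) == 2      # rank BAC = rank A
print("rank preserved under nonsingular factors")
```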
♦ 2.5.41. (a) Since they are spanned by the columns, the range of ( A B ) contains the range of A. But since A is nonsingular, rng A = R^n, and so rng ( A B ) = R^n also, which proves rank ( A B ) = n. (b) Same argument, using the fact that the corange is spanned by the rows.
2.5.42. True if the matrices have the same size, but false in general.
♦ 2.5.43. Since we know dim rng A = r, it suffices to prove that w1, . . . , wr are linearly independent. Given
0 = c1 w1 + · · · + cr wr = c1 Av1 + · · · + cr Avr = A(c1 v1 + · · · + cr vr),
we deduce that c1 v1 + · · · + cr vr ∈ ker A, and hence it can be written as a linear combination of the kernel basis vectors:
c1 v1 + · · · + cr vr = c(r+1) v(r+1) + · · · + cn vn.
But v1, . . . , vn are linearly independent, and so c1 = · · · = cr = c(r+1) = · · · = cn = 0, which proves linear independence of w1, . . . , wr.
♦ 2.5.44.
(a) Since they have the same kernel, their ranks are the same. Choose a basis v1, . . . , vn of R^n such that v(r+1), . . . , vn form a basis for ker A = ker B. Then w1 = Av1, . . . , wr = Avr form a basis for rng A, while y1 = Bv1, . . . , yr = Bvr form a basis for rng B. Let M be any nonsingular m × m matrix such that M wj = yj, j = 1, . . . , r, which exists since both sets of vectors are linearly independent. We claim MA = B. Indeed, MAvj = Bvj, j = 1, . . . , r, by design, while MAvj = 0 = Bvj, j = r + 1, . . . , n, since these vectors lie in the kernel. Thus, the matrices agree on a basis of R^n, which is enough to conclude that MA = B.
(b) If the systems have the same solutions x* + z where z ∈ ker A = ker B, then Bx = MAx = Mb = c. Since M can be written as a product of elementary matrices, we conclude that one can get from the augmented matrix ( A | b ) to the augmented matrix ( B | c ) by applying the elementary row operations that make up M.
♦ 2.5.45. (a) First, W ⊂ rng A, since every w ∈ W can be written as w = Av for some v ∈ V ⊂ R^n, and so w ∈ rng A. Second, if w1 = Av1 and w2 = Av2 are elements of W, then so is c w1 + d w2 = A(c v1 + d v2) for any scalars c, d, because c v1 + d v2 ∈ V, proving that W is a subspace. (b) First, using Exercise 2.4.25, dim W ≤ r = dim rng A, since it is a subspace of the range. Suppose v1, . . . , vk form a basis for V, so dim V = k. Let w = Av ∈ W. We can write v = c1 v1 + · · · + ck vk, and so, by linearity, w = c1 Av1 + · · · + ck Avk. Therefore, the k vectors w1 = Av1, . . . , wk = Avk span W, and therefore, by Proposition 2.33, dim W ≤ k.
♦ 2.5.46.
(a) To have a left inverse requires an n × m matrix B such that BA = I. Suppose dim rng A = rank A < n. Then, according to Exercise 2.5.45, the subspace W = { Bv | v ∈ rng A } has dim W ≤ dim rng A < n. On the other hand, w ∈ W if and only if w = Bv where v ∈ rng A, and so v = Ax for some x ∈ R^n. But then w = Bv = BAx = x, and therefore W = R^n since every vector x ∈ R^n lies in it; thus, dim W = n, contradicting the preceding result. We conclude that having a left inverse implies rank A = n. (The rank can't be larger than n.)
(b) To have a right inverse requires an m × n matrix C such that AC = I. Suppose dim rng A = rank A < m, and hence rng A ⊊ R^m. Choose y ∈ R^m \ rng A. Then y = ACy = Ax, where x = Cy. Therefore, y ∈ rng A, which is a contradiction. We conclude that having a right inverse implies rank A = m.
(c) By parts (a–b), having both inverses requires m = rank A = n, and A must be square and nonsingular.
2.6.1. (a)–(e) [The answers are drawings of the corresponding graphs, not reproduced here; for (e) two equivalent drawings are possible.]
2.6.2. (a) [The answer is a drawing of the digraph, not reproduced here.]
(b) ( 1, 1, 1, 1, 1, 1, 1 )T is a basis for the kernel. The cokernel is trivial, containing only the zero vector, and so has no basis. (c) Zero.
2.6.3.
(a)
[ −1  0  1  0 ]
[  0 −1  1  0 ]
[  0  1  0 −1 ]
[  0  0  1 −1 ]
(b)
[ −1  1  0  0 ]
[ −1  0  0  1 ]
[  1  0 −1  0 ]
[  0  1  0 −1 ]
[  0  0 −1  1 ]
(c)
[ −1  0  1  0  0 ]
[ −1  1  0  0  0 ]
[  0 −1  0  1  0 ]
[  0 −1  0  0  1 ]
[  0  0  1 −1  0 ]
[  0  0  0  1 −1 ]
(d)
[ 1 −1  0  0  0 ]
[ 1  0 −1  0  0 ]
[ 0 −1  0  1  0 ]
[ 0 −1  0  0  1 ]
[ 0  0  1 −1  0 ]
[ 0  0 −1  0  1 ]
[ 0  0  0  1 −1 ]
(e)
[ −1  0  0  1  0  0 ]
[  1  0  0  0 −1  0 ]
[  0  1 −1  0  0  0 ]
[  0 −1  0  0  0  1 ]
[  0  0  1  0  0 −1 ]
[  0  0  0 −1  1  0 ]
(f)
[  1 −1  0  0  0  0 ]
[  1  0 −1  0  0  0 ]
[  0  1  0 −1  0  0 ]
[ −1  0  0  1  0  0 ]
[  0  0  1  0  0 −1 ]
[  0  0 −1  0  1  0 ]
[  0  0  0 −1  0  1 ]
[  0  0  0  0 −1  1 ]
2.6.4.
(a) 1 circuit: ( 0, −1, −1, 1 )T;
(b) 2 circuits: ( −1, 1, 0, 1, 0 )T, ( 0, −1, −1, 0, 1 )T;
(c) 2 circuits: ( −1, 1, 1, 0, 1, 0 )T, ( 0, 0, −1, 1, 0, 1 )T;
(d) 3 circuits: ( −1, 1, 1, 0, 1, 0, 0 )T, ( 1, −1, 0, −1, 0, 1, 0 )T, ( 0, 0, −1, 1, 0, 0, 1 )T;
(e) 2 circuits: ( 0, 0, 1, 1, 1, 0 )T, ( 1, 1, 0, 0, 0, 1 )T;
(f) 3 circuits: ( 1, 0, 1, 1, 0, 0, 0, 0 )T, ( −1, 1, −1, 0, 1, 0, 1, 0 )T, ( 0, 0, 0, 0, 1, 1, 0, 1 )T.
♥ 2.6.5. (a)
[ 1 −1  0  0 ]
[ 1  0 −1  0 ]
[ 1  0  0 −1 ]
[ 0  1 −1  0 ]
[ 0  1  0 −1 ]
(b) rank = 3; (c) dim rng A = dim corng A = 3, dim ker A = 1, dim coker A = 2;
(d) kernel: ( 1, 1, 1, 1 )T; cokernel: ( 1, −1, 0, 1, 0 )T, ( 1, 0, −1, 0, 1 )T;
(e) b1 − b2 + b4 = 0, b1 − b3 + b5 = 0; (f) example: b = ( 1, 1, 1, 0, 0 )T; x = ( 1 + t, t, t, t )T.
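The kernel, cokernel, and compatibility data stated in 2.6.5 can be checked directly against the incidence matrix. A verification sketch in stdlib Python (the matrix and vectors are those of the solution; the helpers are introduced here for illustration):

```python
# Incidence matrix of Exercise 2.6.5(a): 5 edges (rows), 4 vertices (columns).
A = [[1, -1, 0, 0],
     [1, 0, -1, 0],
     [1, 0, 0, -1],
     [0, 1, -1, 0],
     [0, 1, 0, -1]]

def Av(M, v):
    """Matrix-vector product M v."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def vTA(v, M):
    """Row-vector-matrix product v^T M."""
    return [sum(vi * row[j] for vi, row in zip(v, M)) for j in range(len(M[0]))]

assert Av(A, [1, 1, 1, 1]) == [0] * 5           # kernel vector (1,1,1,1)^T
assert vTA([1, -1, 0, 1, 0], A) == [0] * 4      # first cokernel vector
assert vTA([1, 0, -1, 0, 1], A) == [0] * 4      # second cokernel vector

t = 7                                           # any value of the free parameter
assert Av(A, [1 + t, t, t, t]) == [1, 1, 1, 0, 0]   # A x = b for the sample b
print("2.6.5 data consistent")
```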
♦ 2.6.6.
(a)
[ 1 −1  0  0  0  0  0  0 ]
[ 1  0 −1  0  0  0  0  0 ]
[ 1  0  0 −1  0  0  0  0 ]
[ 0  1  0  0 −1  0  0  0 ]
[ 0  1  0  0  0 −1  0  0 ]
[ 0  0  1  0 −1  0  0  0 ]
[ 0  0  1  0  0  0 −1  0 ]
[ 0  0  0  1  0 −1  0  0 ]
[ 0  0  0  1  0  0 −1  0 ]
[ 0  0  0  0  1  0  0 −1 ]
[ 0  0  0  0  0  1  0 −1 ]
[ 0  0  0  0  0  0  1 −1 ]
Cokernel basis:
v1 = ( −1, 1, 0, −1, 0, 1, 0, 0, 0, 0, 0, 0 )T,
v2 = ( −1, 0, 1, 0, −1, 0, 0, 1, 0, 0, 0, 0 )T,
v3 = ( 0, −1, 1, 0, 0, 0, −1, 0, 1, 0, 0, 0 )T,
v4 = ( 0, 0, 0, −1, 1, 0, 0, 0, 0, −1, 1, 0 )T,
v5 = ( 0, 0, 0, 0, 0, −1, 1, 0, 0, −1, 0, 1 )T.
These vectors represent the circuits around 5 of the cube’s faces.
(b) Examples:
( 0, 0, 0, 0, 0, 0, 0, −1, 1, 0, −1, 1 )T = v1 − v2 + v3 − v4 + v5,
( 0, 1, −1, −1, 1, 1, 0, −1, 0, 0, 0, 0 )T = v1 − v2,
( 0, −1, 1, 1, −1, 0, −1, 0, 1, 1, −1, 0 )T = v3 − v4.
♥ 2.6.7.
(a) Tetrahedron:
[ 1 −1  0  0 ]
[ 1  0 −1  0 ]
[ 1  0  0 −1 ]
[ 0  1 −1  0 ]
[ 0  1  0 −1 ]
[ 0  0  1 −1 ]
number of circuits = dim coker A = 3, number of faces = 4;
(b) Octahedron:
[ 1 −1  0  0  0  0 ]
[ 1  0 −1  0  0  0 ]
[ 1  0  0 −1  0  0 ]
[ 1  0  0  0 −1  0 ]
[ 0  1 −1  0  0  0 ]
[ 0  1  0  0 −1  0 ]
[ 0  1  0  0  0 −1 ]
[ 0  0  1 −1  0  0 ]
[ 0  0  1  0  0 −1 ]
[ 0  0  0  1 −1  0 ]
[ 0  0  0  1  0 −1 ]
[ 0  0  0  0  1 −1 ]
number of circuits = dim coker A = 7, number of faces = 8.
(c) Dodecahedron:0BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB@
1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 00 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0 00 0 0 1 −1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 0 0−1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 0 00 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 0 00 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 0 0 0 00 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 0 00 0 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 0 0 00 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 0 00 0 0 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 0 00 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 0 00 0 0 0 0 0 0 0 1 0 0 0 0 0 −1 0 0 0 0 00 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 0 00 0 0 0 0 0 0 0 0 1 −1 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 0 00 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 0 00 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −1 00 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 −10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −1 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 −10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 −1 0 0 0 1
1CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCA
number of circuits = dim coker A = 11, number of faces = 12.
(d) Icosahedron:
0BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB@
1 −1 0 0 0 0 0 0 0 0 0 01 0 −1 0 0 0 0 0 0 0 0 01 0 0 −1 0 0 0 0 0 0 0 01 0 0 0 −1 0 0 0 0 0 0 01 0 0 0 0 −1 0 0 0 0 0 00 1 −1 0 0 0 0 0 0 0 0 00 1 0 0 0 0 −1 0 0 0 0 00 1 0 0 0 0 0 0 0 0 −1 00 0 1 −1 0 0 0 0 0 0 0 00 0 1 0 0 0 −1 0 0 0 0 00 0 1 0 0 0 0 −1 0 0 0 00 0 0 1 −1 0 0 0 0 0 0 00 0 0 1 0 0 0 −1 0 0 0 00 0 0 1 0 0 0 0 −1 0 0 00 0 0 0 1 −1 0 0 0 0 0 00 0 0 0 1 0 0 0 −1 0 0 00 0 0 0 1 0 0 0 0 −1 0 00 −1 0 0 0 1 0 0 0 0 0 00 0 0 0 0 1 0 0 0 −1 0 00 0 0 0 0 1 0 0 0 0 −1 00 0 0 0 0 0 1 −1 0 0 0 00 0 0 0 0 0 1 0 0 0 0 −10 0 0 0 0 0 0 1 −1 0 0 00 0 0 0 0 0 0 1 0 0 0 −10 0 0 0 0 0 0 0 1 −1 0 00 0 0 0 0 0 0 0 1 0 0 −10 0 0 0 0 0 0 0 0 1 −1 00 0 0 0 0 0 0 0 0 1 0 −10 0 0 0 0 0 −1 0 0 0 1 00 0 0 0 0 0 0 0 0 0 1 −1
1CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCA
number of circuits = dim coker A = 19, number of faces = 20.
♥ 2.6.8.
(a) (i)
[ −1  1  0  0 ]
[  0  1 −1  0 ]
[  0  1  0 −1 ]
(ii)
[ −1  1  0  0  0 ]
[  0 −1  1  0  0 ]
[  0  0 −1  1  0 ]
[  0  1  0  0 −1 ]
(iii)
[ −1  1  0  0  0  0 ]
[  0  1 −1  0  0  0 ]
[  0  0  1 −1  0  0 ]
[  0  1  0  0 −1  0 ]
[  0  1  0  0  0 −1 ]
(iv)
[ −1  1  0  0  0  0 ]
[  0 −1  1  0  0  0 ]
[  0  0 −1  1  0  0 ]
[  0  1  0  0 −1  0 ]
[  0  0  1  0  0 −1 ]
(b)
[ −1  1  0  0  0 ]
[  0 −1  1  0  0 ]
[  0  0 −1  1  0 ]
[  0  0  0 −1  1 ],
[ −1  1  0  0  0 ]
[  0 −1  1  0  0 ]
[  0  0 −1  1  0 ]
[  0  1  0  0 −1 ],
[ −1  1  0  0  0 ]
[  0  1 −1  0  0 ]
[  0  1  0 −1  0 ]
[  0  1  0  0 −1 ].
(c) Let m denote the number of edges. Since the graph is connected, its incidence matrix A has rank n − 1. There are no circuits if and only if coker A = 0, which implies 0 = dim coker A = m − (n − 1), and so m = n − 1.
♥ 2.6.9.
(a) [The answers are graph drawings, not reproduced here.]
(b)
[ 1 −1  0 ]
[ 1  0 −1 ]
[ 0  1 −1 ],
[ 1 −1  0  0 ]
[ 1  0 −1  0 ]
[ 1  0  0 −1 ]
[ 0  1 −1  0 ]
[ 0  1  0 −1 ]
[ 0  0  1 −1 ],
[ 1 −1  0  0  0 ]
[ 1  0 −1  0  0 ]
[ 1  0  0 −1  0 ]
[ 1  0  0  0 −1 ]
[ 0  1 −1  0  0 ]
[ 0  1  0 −1  0 ]
[ 0  1  0  0 −1 ]
[ 0  0  1 −1  0 ]
[ 0  0  1  0 −1 ]
[ 0  0  0  1 −1 ].
(c) (1/2) n(n − 1); (d) (1/2)(n − 1)(n − 2).
♥ 2.6.10.
(a) [The answers are graph drawings, not reproduced here.]
(b)
[ 1 0 −1  0  0 ]
[ 1 0  0 −1  0 ]
[ 1 0  0  0 −1 ]
[ 0 1 −1  0  0 ]
[ 0 1  0 −1  0 ]
[ 0 1  0  0 −1 ],
[ 1 0 −1  0  0  0 ]
[ 1 0  0 −1  0  0 ]
[ 1 0  0  0 −1  0 ]
[ 1 0  0  0  0 −1 ]
[ 0 1 −1  0  0  0 ]
[ 0 1  0 −1  0  0 ]
[ 0 1  0  0 −1  0 ]
[ 0 1  0  0  0 −1 ],
[ 1 0 0 −1  0  0 ]
[ 1 0 0  0 −1  0 ]
[ 1 0 0  0  0 −1 ]
[ 0 1 0 −1  0  0 ]
[ 0 1 0  0 −1  0 ]
[ 0 1 0  0  0 −1 ]
[ 0 0 1 −1  0  0 ]
[ 0 0 1  0 −1  0 ]
[ 0 0 1  0  0 −1 ].
(c) mn; (d) (m − 1)(n − 1).
♥ 2.6.11.
(a) A =
[  1 −1  0  0  0  0  0  0 ]
[ −1  0  0  1  0  0  0  0 ]
[  0 −1  0  1  0  0  0  0 ]
[  0  0 −1  0  0  1  0  0 ]
[  0  0  0  0 −1  0  1  0 ]
[  0  0  0  0 −1  0  0  1 ]
[  0  0  0  0  0  0  1 −1 ]
(b) The vectors v1 = ( 1, 1, 0, 1, 0, 0, 0, 0 )T, v2 = ( 0, 0, 1, 0, 0, 1, 0, 0 )T, v3 = ( 0, 0, 0, 0, 1, 0, 1, 1 )T form a basis for ker A.
(c) The entries of each vi are indexed by the vertices. Thus the nonzero entries in v1 correspond to the vertices 1, 2, 4 in the first connected component, v2 to the vertices 3, 6 in the second connected component, and v3 to the vertices 5, 7, 8 in the third connected component.
(d) Let A have k connected components. A basis for ker A consists of the vectors v1, . . . , vk, where vi has entries equal to 1 if the vertex lies in the ith connected component of the graph and 0 if it doesn't. To prove this, suppose Av = 0. If edge #ℓ connects vertex a to vertex b, then the ℓth component of the linear system is va − vb = 0. Thus, va = vb whenever the vertices are connected by an edge. If two vertices are in the same connected component, then they can be connected by a path, and the values va = vb = · · · at each vertex on the path must be equal. Thus, the values of va on all vertices in the connected component are equal, and hence v = c1 v1 + · · · + ck vk can be written as a linear combination of the basis vectors, with ci being the common value of the entries va corresponding to vertices in the ith connected component. Thus, v1, . . . , vk span the kernel. Moreover, since the coefficients ci coincide with certain entries va of v, the only linear combination giving the zero vector is when all ci are zero, proving their linear independence.
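The claim that the component indicator vectors lie in the kernel can be verified mechanically for the graph of part (a). A stdlib-Python sketch (the matrix and components are those stated in the solution):

```python
# Incidence matrix from Exercise 2.6.11(a): 7 edges, 8 vertices.
A = [[ 1, -1,  0,  0,  0,  0,  0,  0],
     [-1,  0,  0,  1,  0,  0,  0,  0],
     [ 0, -1,  0,  1,  0,  0,  0,  0],
     [ 0,  0, -1,  0,  0,  1,  0,  0],
     [ 0,  0,  0,  0, -1,  0,  1,  0],
     [ 0,  0,  0,  0, -1,  0,  0,  1],
     [ 0,  0,  0,  0,  0,  0,  1, -1]]

# Vertex labels of the three connected components, as in part (c).
components = [{1, 2, 4}, {3, 6}, {5, 7, 8}]

for comp in components:
    v = [1 if j + 1 in comp else 0 for j in range(8)]   # indicator vector
    # Each edge row has entries +1 and -1 on its two endpoint vertices,
    # which lie in the same component, so the row annihilates v.
    assert all(sum(a * x for a, x in zip(row, v)) == 0 for row in A)
print("component indicator vectors lie in ker A")
```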
♦ 2.6.12. If the incidence matrix has rank r, then
# circuits = dim coker A = n − r = dim ker A ≥ 1,
since ker A always contains the vector ( 1, 1, . . . , 1 )T.
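The circuit count used throughout this section, dim coker A = (#edges) − rank A, can be illustrated on a small example. A sketch in stdlib Python using the digraph of Exercise 2.6.3(a) (4 vertices, 4 edges, one independent circuit, matching 2.6.4(a)); the rank routine is my own illustration:

```python
from fractions import Fraction

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            if M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Incidence matrix of the connected digraph in Exercise 2.6.3(a).
A = [[-1, 0, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1], [0, 0, 1, -1]]
m, n = len(A), len(A[0])

assert rank(A) == n - 1            # connected graph: rank = n - 1
assert m - rank(A) == 1            # dim coker A = number of independent circuits
assert all(sum(row) == 0 for row in A)   # (1,...,1)^T always lies in ker A
print("circuit count = m - (n - 1) = 1")
```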
2.6.13. Changing the direction of an edge is the same as multiplying the corresponding row of the incidence matrix by −1. The dimension of the cokernel, being the number of independent circuits, does not change. Each entry of a cokernel vector that corresponds to an edge that has been reversed is multiplied by −1. This can be realized by left multiplying the incidence matrix by a diagonal matrix whose diagonal entries are −1 if the corresponding edge has been reversed, and +1 if it is unchanged.
♥ 2.6.14.
(a) Note that P permutes the rows of A, and corresponds to a relabeling of the vertices of the digraph, while Q permutes its columns, and so corresponds to a relabeling of the edges.
(b) (i), (ii), (v) represent equivalent digraphs; none of the others are equivalent.
(c) v = ( v1, . . . , vm )T ∈ coker A if and only if v̂ = Pv = ( vπ(1), . . . , vπ(m) )T ∈ coker B. Indeed, v̂T B = (Pv)T PAQ = vT PT PAQ = vT AQ = 0, since, according to Exercise 1.6.14, PT = P−1 is the inverse of the permutation matrix P.
2.6.15. False. For example, any two inequivalent trees, cf. Exercise 2.6.8, with the same number of nodes have incidence matrices of the same size, with trivial cokernels: coker A = coker B = 0. As another example, the incidence matrices
A =
[  1 −1  0  0  0 ]
[  0  1 −1  0  0 ]
[ −1  0  1  0  0 ]
[  1  0  0 −1  0 ]
[  1  0  0  0 −1 ]
and B =
[  1 −1  0  0  0 ]
[  0  1 −1  0  0 ]
[ −1  0  1  0  0 ]
[  1  0  0 −1  0 ]
[  0  1  0  0 −1 ]
both have cokernel basis ( 1, 1, 1, 0, 0 )T, but do not represent equivalent digraphs.
2.6.16.
(a) If the first k vertices belong to one component and the last n − k to the other, then there is no edge between the two sets of vertices, and so the entries aij = 0 whenever i = 1, . . . , k, j = k + 1, . . . , n, or when i = k + 1, . . . , n, j = 1, . . . , k, which proves that A has the indicated block form.
(b) The graph consists of two disconnected triangles. If we use 1, 2, 3 to label the vertices in one triangle and 4, 5, 6 for those in the second, the resulting incidence matrix has the indicated block form
[  1 −1  0  0  0  0 ]
[  0  1 −1  0  0  0 ]
[ −1  0  1  0  0  0 ]
[  0  0  0  1 −1  0 ]
[  0  0  0  0  1 −1 ]
[  0  0  0 −1  0  1 ],
with each block a 3 × 3 submatrix.
Solutions — Chapter 3
3.1.1. Bilinearity:
〈 cu + dv , w 〉 = (cu1 + dv1)w1 − (cu1 + dv1)w2 − (cu2 + dv2)w1 + b(cu2 + dv2)w2
= c(u1w1 − u1w2 − u2w1 + b u2w2) + d(v1w1 − v1w2 − v2w1 + b v2w2)
= c 〈u , w 〉 + d 〈v , w 〉,
〈u , cv + dw 〉 = u1(cv1 + dw1) − u1(cv2 + dw2) − u2(cv1 + dw1) + b u2(cv2 + dw2)
= c(u1v1 − u1v2 − u2v1 + b u2v2) + d(u1w1 − u1w2 − u2w1 + b u2w2)
= c 〈u , v 〉 + d 〈u , w 〉.
Symmetry:
〈v , w 〉 = v1w1 − v1w2 − v2w1 + b v2w2 = w1v1 − w1v2 − w2v1 + b w2v2 = 〈w , v 〉.
To prove positive definiteness, note
〈v , v 〉 = v1^2 − 2v1v2 + b v2^2 = (v1 − v2)^2 + (b − 1) v2^2 > 0 for all v = ( v1, v2 )T ≠ 0
if and only if b > 1. (If b = 1, the formula is only positive semi-definite, since v = ( 1, 1 )T gives 〈v , v 〉 = 0, for instance.)
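The completed-square identity behind the positive definiteness condition can be spot-checked with exact arithmetic. A stdlib-Python sketch (the formula is the one in the solution; the sample grid is my own choice):

```python
from fractions import Fraction
from itertools import product

def q(v1, v2, b):
    """The quadratic form <v,v> = v1^2 - 2 v1 v2 + b v2^2 from Exercise 3.1.1."""
    return v1 * v1 - 2 * v1 * v2 + b * v2 * v2

samples = [Fraction(k) for k in range(-3, 4)]
for v1, v2 in product(samples, repeat=2):
    for b in (Fraction(3, 2), 2, 5):
        # Completing the square: q = (v1 - v2)^2 + (b - 1) v2^2.
        assert q(v1, v2, b) == (v1 - v2) ** 2 + (b - 1) * v2 ** 2
        if (v1, v2) != (0, 0):
            assert q(v1, v2, b) > 0        # positive definite when b > 1

assert q(1, 1, 1) == 0                     # b = 1 degenerates at v = (1, 1)
print("positive definite iff b > 1 (checked on sample grid)")
```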
3.1.2. (a), (f ) and (g) define inner products; the others don’t.
3.1.3. It is not positive definite, since if v = ( 1,−1 )T , say, 〈v ,v 〉 = 0.
3.1.4.
(a) Bilinearity:
〈 cu + dv , w 〉 = (cu1 + dv1)w1 + 2(cu2 + dv2)w2 + 3(cu3 + dv3)w3
= c(u1w1 + 2u2w2 + 3u3w3) + d(v1w1 + 2v2w2 + 3v3w3)
= c 〈u , w 〉 + d 〈v , w 〉,
〈u , cv + dw 〉 = u1(cv1 + dw1) + 2u2(cv2 + dw2) + 3u3(cv3 + dw3)
= c(u1v1 + 2u2v2 + 3u3v3) + d(u1w1 + 2u2w2 + 3u3w3)
= c 〈u , v 〉 + d 〈u , w 〉.
Symmetry:
〈v , w 〉 = v1w1 + 2v2w2 + 3v3w3 = w1v1 + 2w2v2 + 3w3v3 = 〈w , v 〉.
Positivity:
〈v , v 〉 = v1^2 + 2v2^2 + 3v3^2 > 0 for all v = ( v1, v2, v3 )T ≠ 0,
because it is a sum of non-negative terms, at least one of which is strictly positive.
(b) Bilinearity:
〈 cu + dv , w 〉 = 4(cu1 + dv1)w1 + 2(cu1 + dv1)w2 + 2(cu2 + dv2)w1 + 4(cu2 + dv2)w2 + (cu3 + dv3)w3
= c(4u1w1 + 2u1w2 + 2u2w1 + 4u2w2 + u3w3) + d(4v1w1 + 2v1w2 + 2v2w1 + 4v2w2 + v3w3)
= c 〈u , w 〉 + d 〈v , w 〉,
〈u , cv + dw 〉 = 4u1(cv1 + dw1) + 2u1(cv2 + dw2) + 2u2(cv1 + dw1) + 4u2(cv2 + dw2) + u3(cv3 + dw3)
= c(4u1v1 + 2u1v2 + 2u2v1 + 4u2v2 + u3v3) + d(4u1w1 + 2u1w2 + 2u2w1 + 4u2w2 + u3w3)
= c 〈u , v 〉 + d 〈u , w 〉.
Symmetry:
〈v , w 〉 = 4v1w1 + 2v1w2 + 2v2w1 + 4v2w2 + v3w3 = 4w1v1 + 2w1v2 + 2w2v1 + 4w2v2 + w3v3 = 〈w , v 〉.
Positivity:
〈v , v 〉 = 4v1^2 + 4v1v2 + 4v2^2 + v3^2 = (2v1 + v2)^2 + 3v2^2 + v3^2 > 0 for all v = ( v1, v2, v3 )T ≠ 0,
because it is a sum of non-negative terms, at least one of which is strictly positive.
(c) Bilinearity:
〈 cu + dv , w 〉 = 2(cu1 + dv1)w1 − 2(cu1 + dv1)w2 − 2(cu2 + dv2)w1 + 3(cu2 + dv2)w2 − (cu2 + dv2)w3 − (cu3 + dv3)w2 + 2(cu3 + dv3)w3
= c(2u1w1 − 2u1w2 − 2u2w1 + 3u2w2 − u2w3 − u3w2 + 2u3w3) + d(2v1w1 − 2v1w2 − 2v2w1 + 3v2w2 − v2w3 − v3w2 + 2v3w3)
= c 〈u , w 〉 + d 〈v , w 〉,
〈u , cv + dw 〉 = 2u1(cv1 + dw1) − 2u1(cv2 + dw2) − 2u2(cv1 + dw1) + 3u2(cv2 + dw2) − u2(cv3 + dw3) − u3(cv2 + dw2) + 2u3(cv3 + dw3)
= c(2u1v1 − 2u1v2 − 2u2v1 + 3u2v2 − u2v3 − u3v2 + 2u3v3) + d(2u1w1 − 2u1w2 − 2u2w1 + 3u2w2 − u2w3 − u3w2 + 2u3w3)
= c 〈u , v 〉 + d 〈u , w 〉.
Symmetry:
〈v , w 〉 = 2v1w1 − 2v1w2 − 2v2w1 + 3v2w2 − v2w3 − v3w2 + 2v3w3 = 2w1v1 − 2w1v2 − 2w2v1 + 3w2v2 − w2v3 − w3v2 + 2w3v3 = 〈w , v 〉.
Positivity:
〈v , v 〉 = 2v1^2 − 4v1v2 + 3v2^2 − 2v2v3 + 2v3^2 = 2(v1 − v2)^2 + (v2 − v3)^2 + v3^2 > 0
for all v = ( v1, v2, v3 )T ≠ 0, because it is a sum of non-negative terms, at least one of which is strictly positive.
3.1.5. (a) ( cos t, sin t )T, ( (cos t)/√2 , (sin t)/√5 )T, ( cos t + (sin t)/√3 , (sin t)/√3 )T.
[Three plots of the corresponding unit circles, not reproduced here.]
(b) Note: By elementary analytical geometry, any quadratic equation of the form ax^2 + bxy + cy^2 = 1 defines an ellipse provided a > 0 and b^2 − 4ac < 0.
Case (b): The equation 2v1^2 + 5v2^2 = 1 defines an ellipse with semi-axes 1/√2, 1/√5.
Case (c): The equation v1^2 − 2v1v2 + 4v2^2 = 1 also defines an ellipse by the preceding remark.
♦ 3.1.6.
(a) The vector v = ( x, y )T can be viewed as the hypotenuse of a right triangle with side lengths x, y, and so by Pythagoras, ‖v‖^2 = x^2 + y^2.
(b) First, the projection p = ( x, y, 0 )T of v = ( x, y, z )T onto the xy plane has length ‖p‖ = √(x^2 + y^2) by Pythagoras, as in part (a). Second, the right triangle formed by 0, p and v has side lengths ‖p‖ and z, and, again by Pythagoras,
‖v‖^2 = ‖p‖^2 + z^2 = x^2 + y^2 + z^2.
♦ 3.1.7. ‖ cv ‖ = √〈 cv , cv 〉 = √( c^2 〈v , v 〉 ) = | c | ‖v‖.
3.1.8. By bilinearity and symmetry,
〈 av + bw , cv + dw 〉 = a 〈v , cv + dw 〉 + b 〈w , cv + dw 〉
= ac 〈v , v 〉 + ad 〈v , w 〉 + bc 〈w , v 〉 + bd 〈w , w 〉
= ac ‖v‖^2 + (ad + bc) 〈v , w 〉 + bd ‖w‖^2.
♦ 3.1.9. If we know the first bilinearity property and symmetry, then the second follows:
〈u , cv + dw 〉 = 〈 cv + dw ,u 〉 = c 〈v ,u 〉+ d 〈w ,u 〉 = c 〈u ,v 〉+ d 〈u ,w 〉.
♦ 3.1.10.
(a) Choosing v = x, we have 0 = 〈x , x 〉 = ‖x‖^2, and hence x = 0.
(b) Rewrite the condition as 0 = 〈x , v 〉 − 〈y , v 〉 = 〈x − y , v 〉 for all v ∈ V. Now use part (a) to conclude that x − y = 0, and so x = y.
(c) If v is any element of V, then we can write v = c1 v1 + · · · + cn vn as a linear combination of the basis elements, and so, by bilinearity, 〈x , v 〉 = c1 〈x , v1 〉 + · · · + cn 〈x , vn 〉 = 0. Since this holds for all v ∈ V, the result in part (a) implies x = 0.
♦ 3.1.11.
(a) ‖u + v‖^2 − ‖u − v‖^2 = 〈u + v , u + v 〉 − 〈u − v , u − v 〉
= ( 〈u , u 〉 + 2 〈u , v 〉 + 〈v , v 〉 ) − ( 〈u , u 〉 − 2 〈u , v 〉 + 〈v , v 〉 ) = 4 〈u , v 〉.
(b) 〈v , w 〉 = (1/4) [ (v1 + w1)^2 − 3(v1 + w1)(v2 + w2) + 5(v2 + w2)^2 ]
− (1/4) [ (v1 − w1)^2 − 3(v1 − w1)(v2 − w2) + 5(v2 − w2)^2 ]
= v1w1 − (3/2) v1w2 − (3/2) v2w1 + 5 v2w2.
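The polarization computation in part (b) can be cross-checked with exact arithmetic. A stdlib-Python sketch (the norm and the recovered inner product are those of the solution; the sample grid is my own choice):

```python
from fractions import Fraction
from itertools import product

def nsq(v):
    """||v||^2 = v1^2 - 3 v1 v2 + 5 v2^2, the norm of Exercise 3.1.11(b)."""
    return v[0] ** 2 - 3 * v[0] * v[1] + 5 * v[1] ** 2

def ip(v, w):
    """The recovered inner product v1 w1 - (3/2)(v1 w2 + v2 w1) + 5 v2 w2."""
    return v[0] * w[0] - Fraction(3, 2) * (v[0] * w[1] + v[1] * w[0]) + 5 * v[1] * w[1]

samples = [Fraction(k) for k in range(-2, 3)]
for v1, v2, w1, w2 in product(samples, repeat=4):
    v, w = (v1, v2), (w1, w2)
    s = (v1 + w1, v2 + w2)     # v + w
    d = (v1 - w1, v2 - w2)     # v - w
    # Polarization identity: <v,w> = ( ||v+w||^2 - ||v-w||^2 ) / 4.
    assert ip(v, w) == (nsq(s) - nsq(d)) / 4
print("polarization identity verified on sample grid")
```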
3.1.12.
(a) ‖x + y‖^2 + ‖x − y‖^2 = ( ‖x‖^2 + 2 〈x , y 〉 + ‖y‖^2 ) + ( ‖x‖^2 − 2 〈x , y 〉 + ‖y‖^2 ) = 2 ‖x‖^2 + 2 ‖y‖^2.
(b) The sum of the squared lengths of the diagonals in a parallelogram equals the sum of the squared lengths of all four sides:
[Figure of a parallelogram with sides ‖x‖, ‖y‖ and diagonals ‖x + y‖, ‖x − y‖, not reproduced here.]
3.1.13. By Exercise 3.1.12, ‖v‖^2 = (1/2) ( ‖u + v‖^2 + ‖u − v‖^2 ) − ‖u‖^2 = 17, so ‖v‖ = √17.
The answer is the same in all norms coming from inner products.
3.1.14. Using (3.2), v · (Aw) = vT Aw = (AT v)T w = (AT v) · w.
♦ 3.1.15. First, if A is symmetric, then
(Av) · w = (Av)T w = vT AT w = vT Aw = v · (Aw).
To prove the converse, note that A ej gives the jth column of A, and so
aij = ei · (A ej) = (A ei) · ej = aji for all i, j. Hence A = AT.
3.1.16. The inner product axioms continue to hold when restricted to vectors in W, since they hold for all vectors in V, including those in W.
3.1.17. Bilinearity:
〈〈〈 cu + dv , w 〉〉〉 = 〈 cu + dv , w 〉 + 〈〈 cu + dv , w 〉〉
= c 〈u , w 〉 + d 〈v , w 〉 + c 〈〈u , w 〉〉 + d 〈〈v , w 〉〉 = c 〈〈〈u , w 〉〉〉 + d 〈〈〈v , w 〉〉〉,
〈〈〈u , cv + dw 〉〉〉 = 〈u , cv + dw 〉 + 〈〈u , cv + dw 〉〉
= c 〈u , v 〉 + d 〈u , w 〉 + c 〈〈u , v 〉〉 + d 〈〈u , w 〉〉 = c 〈〈〈u , v 〉〉〉 + d 〈〈〈u , w 〉〉〉.
Symmetry:
〈〈〈v , w 〉〉〉 = 〈v , w 〉 + 〈〈v , w 〉〉 = 〈w , v 〉 + 〈〈w , v 〉〉 = 〈〈〈w , v 〉〉〉.
Positivity: 〈〈〈v , v 〉〉〉 = 〈v , v 〉 + 〈〈v , v 〉〉 > 0 for all v ≠ 0, since both terms are positive.
♦ 3.1.18. Bilinearity:
〈〈〈 c (v, w) + d (ṽ, w̃) , (v̂, ŵ) 〉〉〉 = 〈〈〈 (cv + dṽ, cw + dw̃) , (v̂, ŵ) 〉〉〉
= 〈 cv + dṽ , v̂ 〉 + 〈〈 cw + dw̃ , ŵ 〉〉
= c 〈v , v̂ 〉 + d 〈ṽ , v̂ 〉 + c 〈〈w , ŵ 〉〉 + d 〈〈w̃ , ŵ 〉〉
= c 〈〈〈 (v, w) , (v̂, ŵ) 〉〉〉 + d 〈〈〈 (ṽ, w̃) , (v̂, ŵ) 〉〉〉,
〈〈〈 (v, w) , c (ṽ, w̃) + d (v̂, ŵ) 〉〉〉 = 〈〈〈 (v, w) , (cṽ + dv̂, cw̃ + dŵ) 〉〉〉
= 〈v , cṽ + dv̂ 〉 + 〈〈w , cw̃ + dŵ 〉〉
= c 〈v , ṽ 〉 + d 〈v , v̂ 〉 + c 〈〈w , w̃ 〉〉 + d 〈〈w , ŵ 〉〉
= c 〈〈〈 (v, w) , (ṽ, w̃) 〉〉〉 + d 〈〈〈 (v, w) , (v̂, ŵ) 〉〉〉.
Symmetry:
〈〈〈 (v, w) , (ṽ, w̃) 〉〉〉 = 〈v , ṽ 〉 + 〈〈w , w̃ 〉〉 = 〈ṽ , v 〉 + 〈〈w̃ , w 〉〉 = 〈〈〈 (ṽ, w̃) , (v, w) 〉〉〉.
Positivity:
〈〈〈 (v, w) , (v, w) 〉〉〉 = 〈v , v 〉 + 〈〈w , w 〉〉 > 0
for all (v, w) ≠ (0, 0), since both terms are non-negative and at least one is positive because either v ≠ 0 or w ≠ 0.
3.1.19.
(a) 〈 1 , x 〉 = 1/2, ‖ 1 ‖ = 1, ‖ x ‖ = 1/√3;
(b) 〈 cos 2πx , sin 2πx 〉 = 0, ‖ cos 2πx ‖ = 1/√2, ‖ sin 2πx ‖ = 1/√2;
(c) 〈 x , e^x 〉 = 1, ‖ x ‖ = 1/√3, ‖ e^x ‖ = √( (e^2 − 1)/2 );
(d) 〈 (x + 1)^2 , 1/(x + 1) 〉 = 3/2, ‖ (x + 1)^2 ‖ = √(31/5), ‖ 1/(x + 1) ‖ = 1/√2.
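The values in part (c) can be confirmed by numerical quadrature. A sketch in stdlib Python, assuming the L^2 inner product 〈f , g〉 = ∫_0^1 f(x) g(x) dx used in this exercise; the Simpson-rule routine is my own illustration:

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

# <x, e^x> = 1 and ||e^x||^2 = (e^2 - 1)/2, as stated in 3.1.19(c).
assert abs(simpson(lambda x: x * math.exp(x), 0, 1) - 1) < 1e-9
assert abs(simpson(lambda x: math.exp(2 * x), 0, 1) - (math.e ** 2 - 1) / 2) < 1e-9
# ||x||^2 = 1/3, i.e. ||x|| = 1/sqrt(3).
assert abs(simpson(lambda x: x * x, 0, 1) - 1 / 3) < 1e-9
print("3.1.19(c) values confirmed numerically")
```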
3.1.20. (a) 〈 f , g 〉 = 3/4, ‖ f ‖ = 1/√3, ‖ g ‖ = √(28/15); (b) 〈 f , g 〉 = 0, ‖ f ‖ = √(2/3), ‖ g ‖ = √(56/15); (c) 〈 f , g 〉 = 8/15, ‖ f ‖ = 1/2, ‖ g ‖ = √(7/6).
3.1.21. All but (b) are inner products. (b) is not, because it fails positivity: for instance,
∫ from −1 to 1 of (1 − x)^2 x dx = −4/3.
3.1.22. If f(x) is any nonzero function that satisfies f(x) = 0 for all 0 ≤ x ≤ 1, then 〈 f , f 〉 = 0. An example is the function f(x) = x for −1 ≤ x ≤ 0, f(x) = 0 for 0 ≤ x ≤ 1. However, if the function f ∈ C0[0, 1] is only considered on [0, 1], its values outside the interval are irrelevant, and so the positivity is unaffected. The formula does define an inner product on the subspace of polynomial functions, because 〈 p , p 〉 = 0 if and only if p(x) = 0 for all 0 ≤ x ≤ 1, which implies p(x) ≡ 0 for all x, since only the zero polynomial can vanish on an interval.
3.1.23.
(a) No — positivity doesn't hold, since if f(0) = f(1) = 0 then 〈 f , f 〉 = 0 even if f(x) ≠ 0 for 0 < x < 1;
(b) Yes. Bilinearity and symmetry are readily established. As for positivity,
〈 f , f 〉 = f(0)^2 + f(1)^2 + ∫ from 0 to 1 of f(x)^2 dx ≥ 0
is a sum of three non-negative quantities, and is equal to 0 if and only if all three terms vanish, so f(0) = f(1) = 0 and ∫ from 0 to 1 of f(x)^2 dx = 0, which, by continuity, implies f(x) ≡ 0 for all 0 ≤ x ≤ 1.
3.1.24. No. For example, on [−1, 1], ‖ 1 ‖ = √2, but ‖ 1 ‖^2 = 2 ≠ ‖ 1^2 ‖ = √2.
♦ 3.1.25. Bilinearity:
〈 cf + dg , h 〉 = ∫ from a to b of [ (cf(x) + dg(x)) h(x) + (cf(x) + dg(x))′ h′(x) ] dx
= ∫ from a to b of [ cf(x)h(x) + dg(x)h(x) + cf′(x)h′(x) + dg′(x)h′(x) ] dx
= c ∫ from a to b of [ f(x)h(x) + f′(x)h′(x) ] dx + d ∫ from a to b of [ g(x)h(x) + g′(x)h′(x) ] dx
= c 〈 f , h 〉 + d 〈 g , h 〉,
〈 f , cg + dh 〉 = ∫ from a to b of [ f(x)(cg(x) + dh(x)) + f′(x)(cg(x) + dh(x))′ ] dx
= ∫ from a to b of [ cf(x)g(x) + df(x)h(x) + cf′(x)g′(x) + df′(x)h′(x) ] dx
= c ∫ from a to b of [ f(x)g(x) + f′(x)g′(x) ] dx + d ∫ from a to b of [ f(x)h(x) + f′(x)h′(x) ] dx
= c 〈 f , g 〉 + d 〈 f , h 〉.
Symmetry:
〈 f , g 〉 = ∫ from a to b of [ f(x)g(x) + f′(x)g′(x) ] dx = ∫ from a to b of [ g(x)f(x) + g′(x)f′(x) ] dx = 〈 g , f 〉.
Positivity:
〈 f , f 〉 = ∫ from a to b of [ f(x)^2 + f′(x)^2 ] dx > 0 for all f ≢ 0,
since the integrand is non-negative, and, by continuity, the integral is zero if and only if both f(x) ≡ 0 and f′(x) ≡ 0 for all a ≤ x ≤ b.
3.1.26.
(a) No, because if f(x) is any constant function, then 〈 f , f 〉 = 0, and so positive definiteness does not hold.
(b) Yes. To prove the first bilinearity condition:
〈 cf + dg , h 〉 = ∫ from −1 to 1 of [ cf′(x) + dg′(x) ] h′(x) dx
= c ∫ from −1 to 1 of f′(x)h′(x) dx + d ∫ from −1 to 1 of g′(x)h′(x) dx = c 〈 f , h 〉 + d 〈 g , h 〉.
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
〈 f , g 〉 = ∫ from −1 to 1 of f′(x) g′(x) dx = ∫ from −1 to 1 of g′(x) f′(x) dx = 〈 g , f 〉.
As for positivity, 〈 f , f 〉 = ∫ from −1 to 1 of f′(x)^2 dx ≥ 0. Moreover, since f′ is continuous, 〈 f , f 〉 = 0 if and only if f′(x) ≡ 0 for all x, and so f(x) ≡ c is constant. But the only constant function in W is the zero function, and so 〈 f , f 〉 > 0 for all 0 ≠ f ∈ W.
♦ 3.1.27. Suppose h(x0) = k > 0 for some a < x0 < b. Then, by continuity, h(x) ≥ k/2 for a < x0 − δ < x < x0 + δ < b, for some δ > 0. But then, since h(x) ≥ 0 everywhere,
∫ from a to b of h(x) dx ≥ ∫ from x0 − δ to x0 + δ of h(x) dx ≥ k δ > 0,
which is a contradiction. A similar contradiction occurs when h(x0) = k < 0 for some a < x0 < b. Thus h(x) = 0 for all a < x < b, which, by continuity, also implies h(a) = h(b) = 0. The function in (3.14) gives a discontinuous counterexample.
♦ 3.1.28.
(a) To prove the first bilinearity condition:
〈 cf + dg , h 〉 = ∫ from a to b of [ cf(x) + dg(x) ] h(x) w(x) dx
= c ∫ from a to b of f(x)h(x)w(x) dx + d ∫ from a to b of g(x)h(x)w(x) dx = c 〈 f , h 〉 + d 〈 g , h 〉.
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
〈 f , g 〉 = ∫ from a to b of f(x) g(x) w(x) dx = ∫ from a to b of g(x) f(x) w(x) dx = 〈 g , f 〉.
As for positivity, 〈 f , f 〉 = ∫ from a to b of f(x)^2 w(x) dx ≥ 0. Moreover, since w(x) > 0 and the integrand is continuous, Exercise 3.1.27 implies that 〈 f , f 〉 = 0 if and only if f(x)^2 w(x) ≡ 0 for all x, and so f(x) ≡ 0.
(b) If w(x0) < 0, then, by continuity, w(x) < 0 for x0 − δ ≤ x ≤ x0 + δ, for some δ > 0. Now choose f(x) ≢ 0 so that f(x) = 0 whenever | x − x0 | > δ. Then
〈 f , f 〉 = ∫ from a to b of f(x)^2 w(x) dx = ∫ from x0 − δ to x0 + δ of f(x)^2 w(x) dx < 0, violating positivity.
(c) Bilinearity and symmetry continue to hold. The positivity argument says that 〈 f , f 〉 = 0 implies that f(x) = 0 whenever w(x) > 0. By continuity, f(x) ≡ 0, provided w(x) ≢ 0 on any open subinterval a ≤ c < x < d ≤ b, and so under this assumption it remains an inner product. However, if w(x) ≡ 0 on a subinterval, then positivity is violated.
♥ 3.1.29.
(a) If f(x0, y0) = k > 0 then, by continuity, f(x, y) ≥ k/2 for (x, y) ∈ D = { ‖ x − x0 ‖ ≤ δ }, for some δ > 0. But then
∫∫ over Ω of f(x, y) dx dy ≥ ∫∫ over D of f(x, y) dx dy ≥ (1/2) π k δ^2 > 0.
(b) Bilinearity:
〈 cf + dg , h 〉 = ∫∫ over Ω of [ cf(x, y) + dg(x, y) ] h(x, y) dx dy
= c ∫∫ over Ω of f(x, y) h(x, y) dx dy + d ∫∫ over Ω of g(x, y) h(x, y) dx dy = c 〈 f , h 〉 + d 〈 g , h 〉.
The second bilinearity condition follows from the first and symmetry:
〈 f , g 〉 = ∫∫ over Ω of f(x, y) g(x, y) dx dy = ∫∫ over Ω of g(x, y) f(x, y) dx dy = 〈 g , f 〉.
Positivity: using part (a),
〈 f , f 〉 = ∫∫ over Ω of [ f(x, y) ]^2 dx dy > 0 for all f ≢ 0.
The formula for the norm is ‖ f ‖ = √( ∫∫ over Ω of [ f(x, y) ]^2 dx dy ).
3.1.30. (a) 〈 f , g 〉 = 23 , ‖ f ‖ = 1, ‖ g ‖ =
q2845 ; (b) 〈 f , g 〉 = 1
2 π, ‖ f ‖ =√π , ‖ g ‖ =
qπ3 .
♥ 3.1.31.(a) To prove the first bilinearity condition:
〈〈 cf + dg , h 〉〉 = ∫_0^1 [ (cf1(x) + dg1(x)) h1(x) + (cf2(x) + dg2(x)) h2(x) ] dx = c ∫_0^1 [ f1(x) h1(x) + f2(x) h2(x) ] dx + d ∫_0^1 [ g1(x) h1(x) + g2(x) h2(x) ] dx = c 〈〈 f , h 〉〉 + d 〈〈 g , h 〉〉.
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
〈〈 f , g 〉〉 = ∫_0^1 [ f1(x) g1(x) + f2(x) g2(x) ] dx = ∫_0^1 [ g1(x) f1(x) + g2(x) f2(x) ] dx = 〈〈 g , f 〉〉.
As for positivity, 〈〈 f , f 〉〉 = ∫_0^1 [ f1(x)² + f2(x)² ] dx ≥ 0, since the integrand is a non-negative function. Moreover, since f1(x) and f2(x) are continuous, so is f1(x)² + f2(x)², and hence 〈〈 f , f 〉〉 = 0 if and only if f1(x)² + f2(x)² = 0 for all x, and so f(x) = ( f1(x), f2(x) )^T ≡ 0.
(b) First bilinearity:
〈〈 cf + dg , h 〉〉 = ∫_0^1 〈 cf(x) + dg(x) , h(x) 〉 dx = c ∫_0^1 〈 f(x) , h(x) 〉 dx + d ∫_0^1 〈 g(x) , h(x) 〉 dx = c 〈〈 f , h 〉〉 + d 〈〈 g , h 〉〉.
Symmetry:
〈〈 f , g 〉〉 = ∫_0^1 〈 f(x) , g(x) 〉 dx = ∫_0^1 〈 g(x) , f(x) 〉 dx = 〈〈 g , f 〉〉.
Positivity: 〈〈 f , f 〉〉 = ∫_0^1 ‖ f(x) ‖² dx ≥ 0, since the integrand is non-negative. Moreover, 〈〈 f , f 〉〉 = 0 if and only if ‖ f(x) ‖² = 0 for all x, and so, in view of the continuity of ‖ f(x) ‖², we conclude that f(x) ≡ 0.
(c) This follows because 〈 v , w 〉 = v1 w1 − v1 w2 − v2 w1 + 3 v2 w2 = v^T K w, where K = ( 1, −1 ; −1, 3 ) > 0, defines an inner product on R².
3.2.1.(a) | v1 · v2 | = 3 ≤ 5 = √5 √5 = ‖ v1 ‖ ‖ v2 ‖; angle: cos⁻¹(3/5) ≈ .9273;
(b) | v1 · v2 | = 1 ≤ 2 = √2 √2 = ‖ v1 ‖ ‖ v2 ‖; angle: (2/3) π ≈ 2.0944;
(c) | v1 · v2 | = 0 ≤ 2 √6 = √2 √12 = ‖ v1 ‖ ‖ v2 ‖; angle: ½ π ≈ 1.5708;
(d) | v1 · v2 | = 3 ≤ 3 √2 = √3 √6 = ‖ v1 ‖ ‖ v2 ‖; angle: (3/4) π ≈ 2.3562;
(e) | v1 · v2 | = 4 ≤ 2 √15 = √10 √6 = ‖ v1 ‖ ‖ v2 ‖; angle: cos⁻¹( −2/√15 ) ≈ 2.1134.

3.2.2. (a) (1/3) π; (b) 0, (1/3) π, ½ π, (2/3) π, or π, depending upon whether −1 appears 0, 1, 2, 3 or 4 times in the second vector.

3.2.3. The side lengths are all equal to
‖ (1, 1, 0) − (0, 0, 0) ‖ = ‖ (1, 1, 0) − (1, 0, 1) ‖ = ‖ (1, 1, 0) − (0, 1, 1) ‖ = ··· = √2.
The edge angle is (1/3) π = 60°. The center angle satisfies cos θ = −1/3, so θ = 1.9106 = 109.4712°.
3.2.4.(a) | v · w | = 5 ≤ 7.0711 = √5 √10 = ‖ v ‖ ‖ w ‖.
(b) | v · w | = 11 ≤ 13.0767 = 3 √19 = ‖ v ‖ ‖ w ‖.
(c) | v · w | = 22 ≤ 23.6432 = √13 √43 = ‖ v ‖ ‖ w ‖.

3.2.5.(a) | v · w | = 6 ≤ 6.4807 = √14 √3 = ‖ v ‖ ‖ w ‖.
(b) | 〈 v , w 〉 | = 11 ≤ 11.7473 = √23 √6 = ‖ v ‖ ‖ w ‖.
(c) | 〈 v , w 〉 | = 19 ≤ 19.4936 = √38 √10 = ‖ v ‖ ‖ w ‖.

3.2.6. Set v = ( a, b )^T, w = ( cos θ, sin θ )^T, so that Cauchy–Schwarz gives
| v · w | = | a cos θ + b sin θ | ≤ √(a² + b²) = ‖ v ‖ ‖ w ‖.

3.2.7. Set v = ( a1, …, an )^T, w = ( 1, 1, …, 1 )^T, so that Cauchy–Schwarz gives
| v · w | = | a1 + a2 + ··· + an | ≤ √n √(a1² + a2² + ··· + an²) = ‖ v ‖ ‖ w ‖.
Equality holds if and only if v = a w, i.e., a1 = a2 = ··· = an.
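The inequality of Exercise 3.2.7 is easy to test numerically; this sketch (mine, not the manual's) checks it on random data and confirms the equality case a1 = ··· = an:

```python
# Check the Cauchy-Schwarz consequence of Exercise 3.2.7 on random vectors:
# |a_1 + ... + a_n| <= sqrt(n) * sqrt(a_1^2 + ... + a_n^2).
import numpy as np

rng = np.random.default_rng(0)
n = 50
a = rng.standard_normal(n)

lhs = abs(a.sum())                       # |v . w| with w = (1, ..., 1)^T
rhs = np.sqrt(n) * np.linalg.norm(a)     # ||v|| ||w||
ok = lhs <= rhs + 1e-12

# Equality case: all entries equal (v parallel to w).
b = np.full(n, 1.7)
eq_gap = np.sqrt(n) * np.linalg.norm(b) - abs(b.sum())
```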
♦ 3.2.8. Using (3.20), ‖ v − w ‖² = 〈 v − w , v − w 〉 = ‖ v ‖² − 2 〈 v , w 〉 + ‖ w ‖² = ‖ v ‖² + ‖ w ‖² − 2 ‖ v ‖ ‖ w ‖ cos θ.

♦ 3.2.9. Since a ≤ | a | for any real number a, we have 〈 v , w 〉 ≤ | 〈 v , w 〉 | ≤ ‖ v ‖ ‖ w ‖.

♦ 3.2.10. Expanding, ‖ v + w ‖² − ‖ v − w ‖² = 4 〈 v , w 〉 = 4 ‖ v ‖ ‖ w ‖ cos θ.
[Figure: parallelogram with sides of lengths ‖ v ‖ and ‖ w ‖ meeting at angle θ, and diagonals of lengths ‖ v + w ‖ and ‖ v − w ‖.]
♥ 3.2.11.(a) It is not an inner product. Bilinearity holds, but symmetry and positivity do not.
(b) Assuming v, w ≠ 0, we compute
sin² θ = 1 − cos² θ = ( ‖ v ‖² ‖ w ‖² − 〈 v , w 〉² ) / ( ‖ v ‖² ‖ w ‖² ) = (v × w)² / ( ‖ v ‖² ‖ w ‖² ).
The result follows by taking square roots of both sides, where the sign is fixed by the orientation of the angle.
(c) By (b), v × w = 0 if and only if sin θ = 0 or v = 0 or w = 0, which implies that they are parallel vectors. Alternative proof: v × w = det( v1, w1 ; v2, w2 ) = 0 if and only if the columns v, w of the matrix are linearly dependent, and hence parallel vectors.
(d) The parallelogram has side length ‖ v ‖ and height ‖ w ‖ | sin θ |, so its area is ‖ v ‖ ‖ w ‖ | sin θ | = | v × w |.

3.2.12.(a) | 〈 f , g 〉 | = 1 ≤ 1.03191 = √(1/3) √(½ e² − ½) = ‖ f ‖ ‖ g ‖;
(b) | 〈 f , g 〉 | = 2/e = .7358 ≤ 1.555 = √(2/3) √(½ (e² − e⁻²)) = ‖ f ‖ ‖ g ‖;
(c) | 〈 f , g 〉 | = ½ = .5 ≤ .5253 = √(2 − 5 e⁻¹) √(e − 1) = ‖ f ‖ ‖ g ‖.
3.2.13. (a) ½ π; (b) cos⁻¹( 2 √2 / π ) = .450301; (c) ½ π.

3.2.14. (a) | 〈 f , g 〉 | = 2/3 ≤ √(28/45) = ‖ f ‖ ‖ g ‖; (b) | 〈 f , g 〉 | = π/2 ≤ π/√3 = ‖ f ‖ ‖ g ‖.

3.2.15. (a) a = −4/3; (b) no.

3.2.16. All scalar multiples of ( 1/2, −7/4, 1 )^T.

3.2.17. 3.2.15: a = 0. 3.2.16: all scalar multiples of ( 1/6, −21/24, 1 )^T.
3.2.18. All vectors in the subspace spanned by ( 1, −2, 1, 0 )^T and ( 2, −3, 0, 1 )^T, so v = a ( 1, −2, 1, 0 )^T + b ( 2, −3, 0, 1 )^T.

3.2.19. ( −3, 0, 0, 1 )^T, ( 0, −3/2, 0, 1 )^T, ( 0, 0, 3, 1 )^T.

3.2.20. For example, u = ( 1, 0, 0 )^T, v = ( 0, 1, 0 )^T, w = ( 0, 1, 1 )^T are linearly independent, whereas u = ( 1, 0, 0 )^T, v = w = ( 0, 1, 0 )^T are linearly dependent.
3.2.21. (a) All solutions to a+ b = 1; (b) all solutions to a+ 3b = 2.
3.2.22. Only the zero vector satisfies 〈0 ,0 〉 = ‖0 ‖2 = 0.
♦ 3.2.23. Choose v = w; then 0 = 〈w ,w 〉 = ‖w ‖2, and hence w = 0.
3.2.24. 〈 v + w , v − w 〉 = ‖ v ‖² − ‖ w ‖² = 0 provided ‖ v ‖ = ‖ w ‖. They can't both be unit vectors: ‖ v + w ‖² = ‖ v ‖² + 2 〈 v , w 〉 + ‖ w ‖² = 2 + 2 〈 v , w 〉 = 1 if and only if 〈 v , w 〉 = −½, while ‖ v − w ‖² = 2 − 2 〈 v , w 〉 = 1 if and only if 〈 v , w 〉 = ½, and so θ = 60°.

♦ 3.2.25. If 〈 v , x 〉 = 0 = 〈 v , y 〉, then 〈 v , cx + dy 〉 = c 〈 v , x 〉 + d 〈 v , y 〉 = 0 for all c, d ∈ R, proving closure.
3.2.26. (a) 〈 p1 , p2 〉 = ∫_0^1 ( x − ½ ) dx = 0, 〈 p1 , p3 〉 = ∫_0^1 ( x² − x + 1/6 ) dx = 0,
〈 p2 , p3 〉 = ∫_0^1 ( x − ½ )( x² − x + 1/6 ) dx = 0.
(b) For n ≠ m,
〈 sin nπx , sin mπx 〉 = ∫_0^1 sin nπx sin mπx dx = ∫_0^1 ½ [ cos(n − m)πx − cos(n + m)πx ] dx = 0.
3.2.27. Any nonzero constant multiple of x² − 1/3.

3.2.28. p(x) = a ( (e − 1) x − 1 ) + b ( x² − (e − 2) x ) for any a, b ∈ R.
3.2.29. 1 is orthogonal to x, cos πx, sin πx; x is orthogonal to 1, cos πx; cos πx is orthogonal to 1, x, sin πx; sin πx is orthogonal to 1, cos πx; eˣ is not orthogonal to any of the others.
3.2.30. Example: 1 and x − 2/3.

3.2.31. (a) θ = cos⁻¹( 5/√84 ) ≈ 0.99376 radians; (b) v · w = 5 < 9.165 ≈ √84 = ‖ v ‖ ‖ w ‖, ‖ v + w ‖ = √30 ≈ 5.477 < 6.191 ≈ √14 + √6 = ‖ v ‖ + ‖ w ‖; (c) ( −(7/3) t, −(1/3) t, t )^T.
3.2.32.(a) ‖ v1 + v2 ‖ = 4 ≤ 2 √5 = ‖ v1 ‖ + ‖ v2 ‖;
(b) ‖ v1 + v2 ‖ = √2 ≤ 2 √2 = ‖ v1 ‖ + ‖ v2 ‖;
(c) ‖ v1 + v2 ‖ = √14 ≤ √2 + √12 = ‖ v1 ‖ + ‖ v2 ‖;
(d) ‖ v1 + v2 ‖ = √3 ≤ √3 + √6 = ‖ v1 ‖ + ‖ v2 ‖;
(e) ‖ v1 + v2 ‖ = √8 ≤ √10 + √6 = ‖ v1 ‖ + ‖ v2 ‖.

3.2.33.(a) ‖ v1 + v2 ‖ = √5 ≤ √5 + √10 = ‖ v1 ‖ + ‖ v2 ‖;
(b) ‖ v1 + v2 ‖ = √6 ≤ 3 + √19 = ‖ v1 ‖ + ‖ v2 ‖;
(c) ‖ v1 + v2 ‖ = √12 ≤ √13 + √43 = ‖ v1 ‖ + ‖ v2 ‖.

3.2.34.(a) ‖ f + g ‖ = √(11/6 + ½ e²) ≈ 2.35114 ≤ 2.36467 ≈ √(1/3) + √(½ e² − ½) = ‖ f ‖ + ‖ g ‖;
(b) ‖ f + g ‖ = √(2/3 + ½ e² + 4 e⁻¹ − ½ e⁻²) ≈ 2.40105 ≤ 2.72093 ≈ √(2/3) + √(½ (e² − e⁻²)) = ‖ f ‖ + ‖ g ‖;
(c) ‖ f + g ‖ = √(2 + e − 5 e⁻¹) ≈ 1.69673 ≤ 1.71159 ≈ √(2 − 5 e⁻¹) + √(e − 1) = ‖ f ‖ + ‖ g ‖.

3.2.35.(a) ‖ f + g ‖ = √(133/45) ≈ 1.71917 ≤ 1.78881 ≈ 1 + √(28/45) = ‖ f ‖ + ‖ g ‖;
(b) ‖ f + g ‖ = √(7π/3) ≈ 2.70747 ≤ 2.79578 ≈ √π + √(π/3) = ‖ f ‖ + ‖ g ‖.
3.2.36.(a) 〈 1 , x 〉 = 0, so θ = ½ π. Yes, they are orthogonal.
(b) Note ‖ 1 ‖ = √2, ‖ x ‖ = √(2/3), so 〈 1 , x 〉 = 0 < √(4/3) = ‖ 1 ‖ ‖ x ‖, and
‖ 1 + x ‖ = √(8/3) ≈ 1.63299 < 2.23071 ≈ √2 + √(2/3) = ‖ 1 ‖ + ‖ x ‖.
(c) 〈 1 , p 〉 = 2a + (2/3) c = 0 and 〈 x , p 〉 = (2/3) b = 0 if and only if p(x) = c ( x² − 1/3 ).
3.2.37.
(a) | ∫_0^1 f(x) g(x) eˣ dx | ≤ √( ∫_0^1 f(x)² eˣ dx ) √( ∫_0^1 g(x)² eˣ dx ),
√( ∫_0^1 [ f(x) + g(x) ]² eˣ dx ) ≤ √( ∫_0^1 f(x)² eˣ dx ) + √( ∫_0^1 g(x)² eˣ dx );
(b) 〈 f , g 〉 = ½ (e² − 1) = 3.1945 ≤ 3.3063 = √(e − 1) √((1/3)(e³ − 1)) = ‖ f ‖ ‖ g ‖,
‖ f + g ‖ = √((1/3) e³ + e² + e − 7/3) = 3.8038 ≤ 3.8331 = √(e − 1) + √((1/3)(e³ − 1)) = ‖ f ‖ + ‖ g ‖;
(c) cos θ = (√3/2) (e² − 1) / √((e − 1)(e³ − 1)) = .9662, so θ = .2607.
3.2.38.
(a) | ∫_0^1 [ f(x) g(x) + f′(x) g′(x) ] dx | ≤ √( ∫_0^1 [ f(x)² + f′(x)² ] dx ) √( ∫_0^1 [ g(x)² + g′(x)² ] dx );
√( ∫_0^1 [ (f(x) + g(x))² + (f′(x) + g′(x))² ] dx ) ≤ √( ∫_0^1 [ f(x)² + f′(x)² ] dx ) + √( ∫_0^1 [ g(x)² + g′(x)² ] dx ).
(b) 〈 f , g 〉 = e − 1 ≈ 1.7183 ≤ 2.5277 ≈ 1 · √(e² − 1) = ‖ f ‖ ‖ g ‖;
‖ f + g ‖ = √(e² + 2e − 2) ≈ 3.2903 ≤ 3.5277 ≈ 1 + √(e² − 1) = ‖ f ‖ + ‖ g ‖.
(c) cos θ = √((e − 1)/(e + 1)) ≈ .6798, so θ ≈ .8233.

3.2.39. Using the triangle inequality, ‖ v ‖ = ‖ (v − w) + w ‖ ≤ ‖ v − w ‖ + ‖ w ‖. Therefore, ‖ v − w ‖ ≥ ‖ v ‖ − ‖ w ‖. Switching v and w proves ‖ v − w ‖ ≥ ‖ w ‖ − ‖ v ‖, and the result is the combination of both inequalities. In the figure, the inequality states that the length of the side v − w of the triangle opposite the origin is at least as large as the difference between the other two lengths.
[Figure: triangle with two sides v and w emanating from the origin and third side v − w.]
3.2.40. True. By the triangle inequality, ‖ w ‖ = ‖ (−v) + (w + v) ‖ ≤ ‖ −v ‖ + ‖ v + w ‖ = ‖ v ‖ + ‖ v + w ‖.
♥ 3.2.41.(a) This follows immediately by identifying R^∞ with the space of all functions f : N → R, where N = {1, 2, 3, …} are the natural numbers. Or, one can tediously verify all the vector space axioms.
(b) If x, y ∈ ℓ², then, by the triangle inequality on Rⁿ,
Σ_{k=1}^n (x_k + y_k)² ≤ ( √(Σ_{k=1}^n x_k²) + √(Σ_{k=1}^n y_k²) )² ≤ ( √(Σ_{k=1}^∞ x_k²) + √(Σ_{k=1}^∞ y_k²) )² < ∞,
and hence, in the limit as n → ∞, the series of non-negative terms is also bounded: Σ_{k=1}^∞ (x_k + y_k)² < ∞, proving that x + y ∈ ℓ².
(c) ( 1, 0, 0, 0, … ), ( 1, 1/2, 1/4, 1/8, … ) and ( 1, 1/2, 1/3, 1/4, … ) are in ℓ², while ( 1, 1, 1, 1, … ) and ( 1, 1/√2, 1/√3, 1/√4, … ) are not.
(d) True: convergence of Σ_{k=1}^∞ x_k² requires x_k² → 0 as k → ∞, and hence x_k → 0.
(e) False — see the last example in part (c).
(f) Σ_{k=1}^∞ x_k² = Σ_{k=1}^∞ α^{2k} is a geometric series, which converges if and only if | α | < 1.
(g) Using the integral test, Σ_{k=1}^∞ x_k² = Σ_{k=1}^∞ k^{2α} converges if and only if 2α < −1, so α < −½.
(h) First, we need to prove that it is well-defined, which we do by proving that the series is absolutely convergent. If x, y ∈ ℓ², then, by the Cauchy–Schwarz inequality on Rⁿ applied to the vectors ( |x1|, …, |xn| )^T, ( |y1|, …, |yn| )^T,
Σ_{k=1}^n | x_k y_k | ≤ √(Σ_{k=1}^n x_k²) √(Σ_{k=1}^n y_k²) ≤ √(Σ_{k=1}^∞ x_k²) √(Σ_{k=1}^∞ y_k²) < ∞,
and hence, letting n → ∞, we conclude that Σ_{k=1}^∞ | x_k y_k | < ∞. Bilinearity, symmetry and positivity are now straightforward to verify.
(i) Σ_{k=1}^∞ | x_k y_k | ≤ √(Σ_{k=1}^∞ x_k²) √(Σ_{k=1}^∞ y_k²), √(Σ_{k=1}^∞ (x_k + y_k)²) ≤ √(Σ_{k=1}^∞ x_k²) + √(Σ_{k=1}^∞ y_k²).
3.3.1. ‖ v + w ‖₁ = 2 ≤ 2 = 1 + 1 = ‖ v ‖₁ + ‖ w ‖₁;
‖ v + w ‖₂ = √2 ≤ 2 = 1 + 1 = ‖ v ‖₂ + ‖ w ‖₂;
‖ v + w ‖₃ = ∛2 ≤ 2 = 1 + 1 = ‖ v ‖₃ + ‖ w ‖₃;
‖ v + w ‖∞ = 1 ≤ 2 = 1 + 1 = ‖ v ‖∞ + ‖ w ‖∞.
3.3.2.(a) ‖ v + w ‖₁ = 6 ≤ 6 = 3 + 3 = ‖ v ‖₁ + ‖ w ‖₁;
‖ v + w ‖₂ = 3 √2 ≤ 2 √5 = √5 + √5 = ‖ v ‖₂ + ‖ w ‖₂;
‖ v + w ‖₃ = ∛54 ≤ 2 ∛9 = ∛9 + ∛9 = ‖ v ‖₃ + ‖ w ‖₃;
‖ v + w ‖∞ = 3 ≤ 4 = 2 + 2 = ‖ v ‖∞ + ‖ w ‖∞.
(b) ‖ v + w ‖₁ = 2 ≤ 4 = 2 + 2 = ‖ v ‖₁ + ‖ w ‖₁;
‖ v + w ‖₂ = √2 ≤ 2 √2 = √2 + √2 = ‖ v ‖₂ + ‖ w ‖₂;
‖ v + w ‖₃ = ∛2 ≤ 2 ∛2 = ∛2 + ∛2 = ‖ v ‖₃ + ‖ w ‖₃;
‖ v + w ‖∞ = 1 ≤ 2 = 1 + 1 = ‖ v ‖∞ + ‖ w ‖∞.
(c) ‖ v + w ‖₁ = 10 ≤ 10 = 4 + 6 = ‖ v ‖₁ + ‖ w ‖₁;
‖ v + w ‖₂ = √34 ≤ √6 + √14 = ‖ v ‖₂ + ‖ w ‖₂;
‖ v + w ‖₃ = ∛118 ≤ ∛10 + ∛36 = ‖ v ‖₃ + ‖ w ‖₃;
‖ v + w ‖∞ = 4 ≤ 5 = 2 + 3 = ‖ v ‖∞ + ‖ w ‖∞.
3.3.3.(a) ‖ u − v ‖₁ = 5, ‖ u − w ‖₁ = 6, ‖ v − w ‖₁ = 7, so u, v are closest.
(b) ‖ u − v ‖₂ = √13, ‖ u − w ‖₂ = √12, ‖ v − w ‖₂ = √21, so u, w are closest.
(c) ‖ u − v ‖∞ = 3, ‖ u − w ‖∞ = 2, ‖ v − w ‖∞ = 4, so u, w are closest.

3.3.4. (a) ‖ f ‖∞ = 2/3, ‖ g ‖∞ = 1/4; (b) ‖ f + g ‖∞ = 2/3 ≤ 2/3 + 1/4 = ‖ f ‖∞ + ‖ g ‖∞.

3.3.5. (a) ‖ f ‖₁ = 5/18, ‖ g ‖₁ = 1/6; (b) ‖ f + g ‖₁ = 4/(9 √3) ≤ 5/18 + 1/6 = ‖ f ‖₁ + ‖ g ‖₁.
3.3.6. (a) ‖ f − g ‖₁ = ½ = .5, ‖ f − h ‖₁ = 1 − 2/π = .36338, ‖ g − h ‖₁ = ½ − 1/π = .18169, so g, h are closest. (b) ‖ f − g ‖₂ = √(1/3) = .57735, ‖ f − h ‖₂ = √(3/2 − 4/π) = .47619, ‖ g − h ‖₂ = √(5/6 − 2/π) = .44352, so g, h are closest. (c) ‖ f − g ‖∞ = 1, ‖ f − h ‖∞ = 1, ‖ g − h ‖∞ = 1, so they are equidistant.
3.3.7.(a) ‖ f + g ‖₁ = 3/4 = .75 ≤ 1.3125 ≈ 1 + 5/16 = ‖ f ‖₁ + ‖ g ‖₁;
(b) ‖ f + g ‖₂ = √(31/48) ≈ .8036 ≤ 1.3819 ≈ 1 + √(7/48) = ‖ f ‖₂ + ‖ g ‖₂;
(c) ‖ f + g ‖₃ = (∛39)/4 ≈ .8478 ≤ 1.4310 ≈ 1 + (∛41)/8 = ‖ f ‖₃ + ‖ g ‖₃;
(d) ‖ f + g ‖∞ = 5/4 = 1.25 ≤ 1.75 = 1 + 3/4 = ‖ f ‖∞ + ‖ g ‖∞.
3.3.8.(a) ‖ f + g ‖₁ = e − e⁻¹ ≈ 2.3504 ≤ 2.3504 ≈ (e − 1) + (1 − e⁻¹) = ‖ f ‖₁ + ‖ g ‖₁;
(b) ‖ f + g ‖₂ = √(½ e² + 2 − ½ e⁻²) ≈ 2.3721 ≤ 2.4448 ≈ √(½ e² − ½) + √(½ − ½ e⁻²) = ‖ f ‖₂ + ‖ g ‖₂;
(c) ‖ f + g ‖₃ = ∛((1/3) e³ + 3e − 3 e⁻¹ − (1/3) e⁻³) ≈ 2.3945 ≤ 2.5346 ≈ ∛((1/3) e³ − 1/3) + ∛(1/3 − (1/3) e⁻³) = ‖ f ‖₃ + ‖ g ‖₃;
(d) ‖ f + g ‖∞ = e + e⁻¹ ≈ 3.08616 ≤ 3.71828 ≈ e + 1 = ‖ f ‖∞ + ‖ g ‖∞.
3.3.9. Positivity: since both summands are non-negative, ‖ x ‖ ≥ 0. Moreover, ‖ x ‖ = 0 if and only if x = 0 = x − y, and so x = ( x, y )^T = 0.
Homogeneity: ‖ cx ‖ = | cx | + 2 | cx − cy | = | c | ( | x | + 2 | x − y | ) = | c | ‖ x ‖.
Triangle inequality: with v = ( v, w )^T,
‖ x + v ‖ = | x + v | + 2 | x + v − y − w | ≤ ( | x | + | v | ) + 2 ( | x − y | + | v − w | ) = ‖ x ‖ + ‖ v ‖.
3.3.10.(a) Comes from the weighted inner product 〈 v , w 〉 = 2 v1 w1 + 3 v2 w2.
(b) Comes from the inner product 〈 v , w 〉 = 2 v1 w1 − ½ v1 w2 − ½ v2 w1 + 2 v2 w2; positivity follows because 〈 v , v 〉 = 2 ( v1 − ¼ v2 )² + (15/8) v2².
(c) Clearly positive; ‖ cv ‖ = 2 | cv1 | + | cv2 | = | c | ( 2 | v1 | + | v2 | ) = | c | ‖ v ‖;
‖ v + w ‖ = 2 | v1 + w1 | + | v2 + w2 | ≤ 2 | v1 | + | v2 | + 2 | w1 | + | w2 | = ‖ v ‖ + ‖ w ‖.
(d) Clearly positive; ‖ cv ‖ = max{ 2 | cv1 |, | cv2 | } = | c | max{ 2 | v1 |, | v2 | } = | c | ‖ v ‖;
‖ v + w ‖ = max{ 2 | v1 + w1 |, | v2 + w2 | } ≤ max{ 2 | v1 | + 2 | w1 |, | v2 | + | w2 | } ≤ max{ 2 | v1 |, | v2 | } + max{ 2 | w1 |, | w2 | } = ‖ v ‖ + ‖ w ‖.
(e) Clearly non-negative, and equals zero if and only if v1 − v2 = 0 = v1 + v2, so v = 0;
‖ cv ‖ = max{ | cv1 − cv2 |, | cv1 + cv2 | } = | c | max{ | v1 − v2 |, | v1 + v2 | } = | c | ‖ v ‖;
‖ v + w ‖ = max{ | v1 + w1 − v2 − w2 |, | v1 + w1 + v2 + w2 | } ≤ max{ | v1 − v2 | + | w1 − w2 |, | v1 + v2 | + | w1 + w2 | } ≤ max{ | v1 − v2 |, | v1 + v2 | } + max{ | w1 − w2 |, | w1 + w2 | } = ‖ v ‖ + ‖ w ‖.
(f) Clearly non-negative, and equals zero if and only if v1 − v2 = 0 = v1 + v2, so v = 0;
‖ cv ‖ = | cv1 − cv2 | + | cv1 + cv2 | = | c | ( | v1 − v2 | + | v1 + v2 | ) = | c | ‖ v ‖;
‖ v + w ‖ = | v1 + w1 − v2 − w2 | + | v1 + w1 + v2 + w2 | ≤ | v1 − v2 | + | v1 + v2 | + | w1 − w2 | + | w1 + w2 | = ‖ v ‖ + ‖ w ‖.
3.3.11. Parts (a), (c) and (e) define norms. (b) doesn't since, for instance, ‖ ( 1, −1, 0 )^T ‖ = 0. (d) doesn't since, for instance, ‖ ( 1, −1, 1 )^T ‖ = 0.
3.3.12. Clearly, if v = 0, then w = 0, since only the zero vector has norm 0. If v ≠ 0, then w = c v, and ‖ w ‖ = | c | ‖ v ‖ = ‖ v ‖ if and only if | c | = 1.

3.3.13. True for an inner product norm, but false in general. For example, ‖ e1 + e2 ‖₁ = 2 = ‖ e1 ‖₁ + ‖ e2 ‖₁.
3.3.14. If x = ( 1, 0 )^T, y = ( 0, 1 )^T, say, then ‖ x + y ‖∞² + ‖ x − y ‖∞² = 1 + 1 = 2 ≠ 4 = 2 ( ‖ x ‖∞² + ‖ y ‖∞² ), which contradicts the identity in Exercise 3.1.12.

3.3.15. No — neither formula satisfies the bilinearity property. For example, if v = ( 1, 0 )^T, w = ( 1, 1 )^T, then
〈 2v , w 〉 = ¼ ( ‖ 2v + w ‖₁² − ‖ 2v − w ‖₁² ) = 3 ≠ 2 〈 v , w 〉 = ½ ( ‖ v + w ‖₁² − ‖ v − w ‖₁² ) = 4,
〈 2v , w 〉 = ¼ ( ‖ 2v + w ‖∞² − ‖ 2v − w ‖∞² ) = 2 ≠ 2 〈 v , w 〉 = ½ ( ‖ v + w ‖∞² − ‖ v − w ‖∞² ) = 3/2.
♦ 3.3.16. Let m = ‖ v ‖∞ = max{ | v1 |, …, | vn | }. Then ‖ v ‖_p = m ( Σ_{i=1}^n ( | v_i |/m )^p )^{1/p}. Now if | v_i | < m, then ( | v_i |/m )^p → 0 as p → ∞. Therefore, ‖ v ‖_p ∼ m k^{1/p} → m as p → ∞, where 1 ≤ k is the number of entries of v with | v_i | = m.
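The limiting argument in 3.3.16 can be watched numerically; this sketch (mine, not the manual's) shows ‖v‖_p approaching m = ‖v‖∞ and matching the asymptotic form m k^{1/p}:

```python
# Illustrate Exercise 3.3.16: ||v||_p -> ||v||_inf as p -> infinity,
# with asymptotics m * k**(1/p), where k entries attain the max modulus m.
import numpy as np

v = np.array([3.0, -3.0, 1.0, 2.0])   # m = 3, attained k = 2 times
a = np.abs(v)
m = a.max()
k = int((a == m).sum())

def pnorm(v, p):
    """p-norm, computed in scaled form m * (sum (|v_i|/m)^p)^(1/p) to avoid overflow."""
    a = np.abs(np.asarray(v, dtype=float))
    mx = a.max()
    return mx * ((a / mx) ** p).sum() ** (1.0 / p)

gaps = [pnorm(v, p) - m for p in (1, 10, 100, 1000)]       # shrinks toward 0
asym_err = abs(pnorm(v, 1000) - m * k ** (1.0 / 1000))     # matches m * k^(1/p)
```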
♦ 3.3.17.
(a) ‖ f + g ‖₁ = ∫_a^b | f(x) + g(x) | dx ≤ ∫_a^b [ | f(x) | + | g(x) | ] dx = ∫_a^b | f(x) | dx + ∫_a^b | g(x) | dx = ‖ f ‖₁ + ‖ g ‖₁.
(b) ‖ f + g ‖∞ = max | f(x) + g(x) | ≤ max { | f(x) | + | g(x) | } ≤ max | f(x) | + max | g(x) | = ‖ f ‖∞ + ‖ g ‖∞.
♦ 3.3.18.(a) Positivity follows since the integrand is non-negative; further,
‖ cf ‖_{1,w} = ∫_a^b | cf(x) | w(x) dx = | c | ∫_a^b | f(x) | w(x) dx = | c | ‖ f ‖_{1,w};
‖ f + g ‖_{1,w} = ∫_a^b | f(x) + g(x) | w(x) dx ≤ ∫_a^b | f(x) | w(x) dx + ∫_a^b | g(x) | w(x) dx = ‖ f ‖_{1,w} + ‖ g ‖_{1,w}.
(b) Positivity is immediate; further,
‖ cf ‖_{∞,w} = max_{a≤x≤b} { | cf(x) | w(x) } = | c | max_{a≤x≤b} { | f(x) | w(x) } = | c | ‖ f ‖_{∞,w};
‖ f + g ‖_{∞,w} = max_{a≤x≤b} { | f(x) + g(x) | w(x) } ≤ max_{a≤x≤b} { | f(x) | w(x) } + max_{a≤x≤b} { | g(x) | w(x) } = ‖ f ‖_{∞,w} + ‖ g ‖_{∞,w}.
3.3.19.(a) Clearly positive; ‖ cv ‖ = max{ ‖ cv ‖₁, ‖ cv ‖₂ } = | c | max{ ‖ v ‖₁, ‖ v ‖₂ } = | c | ‖ v ‖;
‖ v + w ‖ = max{ ‖ v + w ‖₁, ‖ v + w ‖₂ } ≤ max{ ‖ v ‖₁ + ‖ w ‖₁, ‖ v ‖₂ + ‖ w ‖₂ } ≤ max{ ‖ v ‖₁, ‖ v ‖₂ } + max{ ‖ w ‖₁, ‖ w ‖₂ } = ‖ v ‖ + ‖ w ‖.
(b) No. The triangle inequality is not necessarily valid. For example, in R² set ‖ v ‖₁ = | x | + | y |, ‖ v ‖₂ = (3/2) max{ | x |, | y | }. Then if v = ( 1, .4 )^T, w = ( 1, .6 )^T, we find ‖ v ‖ = min{ ‖ v ‖₁, ‖ v ‖₂ } = 1.4, ‖ w ‖ = min{ ‖ w ‖₁, ‖ w ‖₂ } = 1.5, but ‖ v + w ‖ = min{ ‖ v + w ‖₁, ‖ v + w ‖₂ } = 3 > 2.9 = ‖ v ‖ + ‖ w ‖.
(c) Yes.
(d) No. The triangle inequality is not necessarily valid. For example, if v = ( 1, 1 )^T, w = ( 1, 0 )^T, so v + w = ( 2, 1 )^T, then
√( ‖ v + w ‖₁ ‖ v + w ‖∞ ) = √6 > √2 + 1 = √( ‖ v ‖₁ ‖ v ‖∞ ) + √( ‖ w ‖₁ ‖ w ‖∞ ).
3.3.20. (a) ( 1/√14, 2/√14, −3/√14 )^T; (b) ( 1/3, 2/3, −1 )^T; (c) ( 1/6, 1/3, −1/2 )^T; (d) ( 1/3, 2/3, −1 )^T; (e) ( 1/6, 1/3, −1/2 )^T.

3.3.21.(a) ‖ v ‖² = cos²θ cos²φ + cos²θ sin²φ + sin²θ = cos²θ + sin²θ = 1;
(b) ‖ v ‖² = ½ (cos²θ + sin²θ + cos²φ + sin²φ) = 1;
(c) ‖ v ‖² = cos²θ cos²φ cos²ψ + cos²θ cos²φ sin²ψ + cos²θ sin²φ + sin²θ = cos²θ cos²φ + cos²θ sin²φ + sin²θ = cos²θ + sin²θ = 1.
3.3.22. 2 vectors, namely u = v/‖v ‖ and −u = −v/‖v ‖.
3.3.23. (a)–(c) [Plots of the respective unit circles, each drawn on axes running from −1 to 1.]

3.3.24. (a)–(f) [Plots of the respective unit circles; (a)–(e) on axes running from −1 to 1, and (f) on axes running from −2 to 2.]
3.3.25.
(a) Unit octahedron. (b) Unit cube. (c) Ellipsoid with semi-axes 1/√2, 1, 1/√3. (d) Parallelepiped. [Plots omitted.]
In the last case, the corners of the "top" face of the parallelepiped are at v1 = ( 1/2, 1/2, 1/2 )^T, v2 = ( −1/2, −1/2, 3/2 )^T, v3 = ( −3/2, 1/2, 1/2 )^T, v4 = ( −1/2, 3/2, −1/2 )^T, while the corners of the "bottom" (hidden) face are −v1, −v2, −v3, −v4.
3.3.26. Define |||x||| = ‖ x ‖/‖ v ‖ for any x ∈ V.
3.3.27. True. Having the same unit sphere means that ‖ u ‖₁ = 1 whenever ‖ u ‖₂ = 1. If v ≠ 0 is any other nonzero vector space element, then u = v/‖ v ‖₁ satisfies 1 = ‖ u ‖₁ = ‖ u ‖₂, and so ‖ v ‖₂ = ‖ ‖ v ‖₁ u ‖₂ = ‖ v ‖₁ ‖ u ‖₂ = ‖ v ‖₁. Finally, ‖ 0 ‖₁ = 0 = ‖ 0 ‖₂, and so the norms agree on all vectors in V.
3.3.28. (a) (18/5) x − 6/5; (b) 3x − 1; (c) (3/2) x − 1/2; (d) (9/10) x − 3/10; (e) (3/(2√2)) x − 1/(2√2); (f) (3/4) x − 1/4.
3.3.29. (a), (b), (c), (f), (i). In cases (g), (h), the norm of f is not finite.
♦ 3.3.30. If ‖ x ‖, ‖ y ‖ ≤ 1 and 0 ≤ t ≤ 1, then, by the triangle inequality, ‖ t x + (1 − t) y ‖ ≤ t ‖ x ‖ + (1 − t) ‖ y ‖ ≤ 1. The unit sphere is not convex since, for instance, ½ x + ½ (−x) = 0 ∉ S1 when x, −x ∈ S1.
3.3.31.(a) ‖ v ‖₂ = √2, ‖ v ‖∞ = 1, and (1/√2) √2 ≤ 1 ≤ √2;
(b) ‖ v ‖₂ = √14, ‖ v ‖∞ = 3, and (1/√3) √14 ≤ 3 ≤ √14;
(c) ‖ v ‖₂ = 2, ‖ v ‖∞ = 1, and ½ · 2 ≤ 1 ≤ 2;
(d) ‖ v ‖₂ = 2 √2, ‖ v ‖∞ = 2, and (1/√5) · 2 √2 ≤ 2 ≤ 2 √2.

3.3.32. (a) v = ( a, 0 )^T or ( 0, a )^T; (b) v = ( a, 0 )^T or ( 0, a )^T; (c) v = ( a, 0 )^T or ( 0, a )^T; (d) v = ( a, a )^T or ( a, −a )^T.
3.3.33. Let 0 < ε ≪ 1 be small. First, if the entries satisfy | v_j | < ε for all j, then ‖ v ‖∞ = max{ | v_j | } < ε; conversely, if ‖ v ‖∞ < ε, then | v_j | ≤ ‖ v ‖∞ < ε for every j. Thus, the entries of v are small if and only if its ∞ norm is small. Furthermore, by the equivalence of norms, any other norm satisfies c ‖ v ‖ ≤ ‖ v ‖∞ ≤ C ‖ v ‖, where C, c > 0 are fixed. Thus, if ‖ v ‖ < ε is small, then the entries | v_j | ≤ ‖ v ‖∞ ≤ C ε are also proportionately small, while if the entries are all bounded by | v_j | < ε, then ‖ v ‖ ≤ (1/c) ‖ v ‖∞ ≤ ε/c is also proportionately small.

3.3.34. If | v_i | = ‖ v ‖∞ is the maximal entry, so | v_j | ≤ | v_i | for all j, then
‖ v ‖∞² = v_i² ≤ ‖ v ‖₂² = v1² + ··· + vn² ≤ n v_i² = n ‖ v ‖∞².
3.3.35.
(i) ‖ v ‖₁² = ( Σ_{i=1}^n | v_i | )² = Σ_{i=1}^n | v_i |² + 2 Σ_{i<j} | v_i | | v_j | ≥ Σ_{i=1}^n | v_i |² = ‖ v ‖₂².
On the other hand, since 2xy ≤ x² + y²,
‖ v ‖₁² = Σ_{i=1}^n | v_i |² + 2 Σ_{i<j} | v_i | | v_j | ≤ n Σ_{i=1}^n | v_i |² = n ‖ v ‖₂².
(ii) (a) ‖ v ‖₂ = √2, ‖ v ‖₁ = 2, and √2 ≤ 2 ≤ √2 √2;
(b) ‖ v ‖₂ = √14, ‖ v ‖₁ = 6, and √14 ≤ 6 ≤ √3 √14;
(c) ‖ v ‖₂ = 2, ‖ v ‖₁ = 4, and 2 ≤ 4 ≤ 2 · 2;
(d) ‖ v ‖₂ = 2 √2, ‖ v ‖₁ = 6, and 2 √2 ≤ 6 ≤ √5 · 2 √2.
(iii) (a) v = c e_j for some j = 1, …, n; (b) | v1 | = | v2 | = ··· = | vn |.

3.3.36.
(i) ‖ v ‖∞ ≤ ‖ v ‖₁ ≤ n ‖ v ‖∞.
(ii) (a) ‖ v ‖∞ = 1, ‖ v ‖₁ = 2, and 1 ≤ 2 ≤ 2 · 1;
(b) ‖ v ‖∞ = 3, ‖ v ‖₁ = 6, and 3 ≤ 6 ≤ 3 · 3;
(c) ‖ v ‖∞ = 1, ‖ v ‖₁ = 4, and 1 ≤ 4 ≤ 4 · 1;
(d) ‖ v ‖∞ = 2, ‖ v ‖₁ = 6, and 2 ≤ 6 ≤ 5 · 2.
(iii) ‖ v ‖∞ = ‖ v ‖₁ if and only if v = c e_j for some j = 1, …, n; ‖ v ‖₁ = n ‖ v ‖∞ if and only if | v1 | = | v2 | = ··· = | vn |.
3.3.37. In each case, we minimize and maximize ‖ ( cos θ, sin θ )^T ‖ for 0 ≤ θ ≤ 2π:
(a) c⋆ = √2, C⋆ = √3; (b) c⋆ = 1, C⋆ = √2.

3.3.38. First, | v_i | ≤ ‖ v ‖∞. Furthermore, Theorem 3.17 implies ‖ v ‖∞ ≤ C ‖ v ‖, which proves the result.

♦ 3.3.39. Equality implies that ‖ u ‖₂ = c⋆ for all vectors u with ‖ u ‖₁ = 1. But then if v ≠ 0 is any other vector, setting u = v/‖ v ‖₁, we find ‖ v ‖₂ = ‖ v ‖₁ ‖ u ‖₂ = c⋆ ‖ v ‖₁, and hence the norms are merely constant multiples of each other.
3.3.40. If C = ‖ f ‖∞, then | f(x) | ≤ C for all a ≤ x ≤ b. Therefore,
‖ f ‖₂² = ∫_a^b f(x)² dx ≤ ∫_a^b C² dx = (b − a) C² = (b − a) ‖ f ‖∞².
♥ 3.3.41.(a) The maximum (absolute) value of f_n(x) is 1 = ‖ f_n ‖∞. On the other hand,
‖ f_n ‖₂ = √( ∫_{−∞}^∞ | f_n(x) |² dx ) = √( ∫_{−n}^n dx ) = √(2n) → ∞.
(b) Suppose there exists a constant C such that ‖ f ‖₂ ≤ C ‖ f ‖∞ for all functions. Then, in particular, √(2n) = ‖ f_n ‖₂ ≤ C ‖ f_n ‖∞ = C for all n, which is impossible.
(c) First, ‖ f_n ‖₂ = √( ∫_{−∞}^∞ | f_n(x) |² dx ) = √( ∫_{−1/n}^{1/n} (n/2) dx ) = 1. On the other hand, the maximum (absolute) value of f_n(x) is ‖ f_n ‖∞ = √(n/2) → ∞. Arguing as in part (b), we conclude that there is no constant C such that ‖ f ‖∞ ≤ C ‖ f ‖₂.
(d) (i) f_n(x) = n/2 for −1/n ≤ x ≤ 1/n, and 0 otherwise, has ‖ f_n ‖₁ = 1, ‖ f_n ‖∞ = n/2 → ∞;
(ii) f_n(x) = 1/√(2n) for −n ≤ x ≤ n, and 0 otherwise, has ‖ f_n ‖₂ = 1, ‖ f_n ‖₁ = √(2n) → ∞;
(iii) f_n(x) = n/2 for −1/n ≤ x ≤ 1/n, and 0 otherwise, has ‖ f_n ‖₁ = 1, ‖ f_n ‖₂ = √(n/2) → ∞.
♥ 3.3.42.(a) We can't use the functions in Exercise 3.3.41 directly, since they are not continuous. Instead, consider the continuous functions
f_n(x) = √n (1 − n | x |) for −1/n ≤ x ≤ 1/n, and 0 otherwise.
Then ‖ f_n ‖∞ = √n, while ‖ f_n ‖₂ = √( ∫_{−1/n}^{1/n} n (1 − n | x |)² dx ) = √(2/3). Thus, there is no constant C such that ‖ f ‖∞ ≤ C ‖ f ‖₂, as otherwise √n = ‖ f_n ‖∞ ≤ C ‖ f_n ‖₂ = √(2/3) C for all n, which is impossible.
(b) Yes: since, by the definition of the L∞ norm, | f(x) | ≤ ‖ f ‖∞ for all −1 ≤ x ≤ 1,
‖ f ‖₂ = √( ∫_{−1}^1 | f(x) |² dx ) ≤ √( ∫_{−1}^1 ‖ f ‖∞² dx ) = √2 ‖ f ‖∞.
(c) They are not equivalent. The functions f_n(x) = n (1 − n | x |) for −1/n ≤ x ≤ 1/n, and 0 otherwise, are continuous and satisfy ‖ f_n ‖∞ = n, while ‖ f_n ‖₁ = 1. Thus, there is no constant C such that ‖ f ‖∞ ≤ C ‖ f ‖₁ for all f ∈ C⁰[−1, 1].
3.3.43. First, since 〈 v , w 〉 is easily shown to be bilinear and symmetric, the only issue is positivity: is 0 < 〈 v , v 〉 = α ‖ v ‖₁² + β ‖ v ‖₂² for all 0 ≠ v ∈ V? Let† μ = min ‖ v ‖₂/‖ v ‖₁ over all 0 ≠ v ∈ V. Then 〈 v , v 〉 = α ‖ v ‖₁² + β ‖ v ‖₂² ≥ (α + β μ²) ‖ v ‖₁² > 0 provided α + β μ² > 0. Conversely, if α + β μ² ≤ 0 and 0 ≠ v achieves the minimum value, so ‖ v ‖₂ = μ ‖ v ‖₁, then 〈 v , v 〉 ≤ 0. (If there is no v that actually achieves the minimum value, then one can also allow α + β μ² = 0.)

† In infinite-dimensional situations, one should replace the minimum by the infimum, since the minimum value may not be achieved.
3.4.1. (a) Positive definite: 〈 v , w 〉 = v1 w1 + 2 v2 w2; (b) not positive definite; (c) not positive definite; (d) not positive definite; (e) positive definite: 〈 v , w 〉 = v1 w1 − v1 w2 − v2 w1 + 3 v2 w2; (f) not positive definite.
3.4.2. For instance, q(1, 0) = 1, while q(2,−1) = −1.
♦ 3.4.3.(a) The associated quadratic form q(x) = x^T D x = c1 x1² + c2 x2² + ··· + cn xn² is a sum of squares. If all c_i > 0, then q(x) > 0 for x ≠ 0, since q(x) is a sum of non-negative terms, at least one of which is strictly positive. If all c_i ≥ 0, then, by the same reasoning, D is positive semi-definite. If all the c_i < 0 are negative, then D is negative definite. If D has both positive and negative diagonal entries, then it is indefinite.
(b) 〈 v , w 〉 = v^T D w = c1 v1 w1 + c2 v2 w2 + ··· + cn vn wn, which is the weighted inner product (3.10).
♦ 3.4.4. (a) k_ii = e_i^T K e_i > 0. (b) For example, K = ( 1, 2 ; 2, 1 ) is not positive definite or even semi-definite. (c) For example, K = ( 1, 0 ; 0, 0 ).
3.4.5. | 4 x1 y1 − 2 x1 y2 − 2 x2 y1 + 3 x2 y2 | ≤ √(4 x1² − 4 x1 x2 + 3 x2²) √(4 y1² − 4 y1 y2 + 3 y2²),
√(4 (x1 + y1)² − 4 (x1 + y1)(x2 + y2) + 3 (x2 + y2)²) ≤ √(4 x1² − 4 x1 x2 + 3 x2²) + √(4 y1² − 4 y1 y2 + 3 y2²).
3.4.6. First, (cK)^T = c K^T = c K is symmetric. Second, x^T (cK) x = c x^T K x > 0 for any x ≠ 0, since c > 0 and K > 0.
♦ 3.4.7. (a) x^T (K + L) x = x^T K x + x^T L x > 0 for all x ≠ 0, since both summands are strictly positive. (b) For example, K = ( 2, 0 ; 0, −1 ), L = ( −1, 0 ; 0, 2 ), with K + L = ( 1, 0 ; 0, 1 ) > 0.

3.4.8. ( 3, 1 ; 1, 1 ) ( 1, 1 ; 1, 4 ) = ( 4, 7 ; 2, 5 ) is not even symmetric. Even the associated quadratic form
( x y ) ( 4, 7 ; 2, 5 ) ( x, y )^T = 4 x² + 9 x y + 5 y² is not positive definite.

3.4.9. Example: ( 0, 1 ; 1, 0 ).
♦ 3.4.10.(a) Since K⁻¹ is also symmetric, x^T K⁻¹ x = x^T K⁻¹ K K⁻¹ x = (K⁻¹ x)^T K (K⁻¹ x) = y^T K y, where y = K⁻¹ x.
(b) If K > 0, then y^T K y > 0 for all y = K⁻¹ x ≠ 0, and hence x^T K⁻¹ x > 0 for all x ≠ 0.

♦ 3.4.11. It suffices to note that K > 0 if and only if cos θ = (v · K v)/(‖ v ‖ ‖ K v ‖) = (v^T K v)/(‖ v ‖ ‖ K v ‖) > 0 for all v ≠ 0, which holds if and only if | θ | < ½ π.

♦ 3.4.12. If q(x) = x^T K x with K^T = K, then
q(x + y) − q(x) − q(y) = (x + y)^T K (x + y) − x^T K x − y^T K y = 2 x^T K y = 2 〈 x , y 〉.

3.4.13. (a) No, by continuity. Or, equivalently, q(c x₊) = c² q(x₊) > 0 for any c ≠ 0, so q is positive at any nonzero scalar multiple of x₊. (b) In view of the preceding calculation, this holds if and only if q(x) is either positive or negative definite and x0 = 0.
3.4.14.(a) The quadratic form for K = −N is x^T K x = −x^T N x > 0 for all x ≠ 0.
(b) a < 0 and det N = a c − b² > 0.
(c) The matrix ( −1, 1 ; 1, −2 ) is negative definite. The others are not.

3.4.15. x^T K x = ( 1 1 ) ( −1 ; 1 ) = 0, but K x = ( 1, −2 ; −2, 3 ) ( 1 ; 1 ) = ( −1 ; 1 ) ≠ 0.

3.4.16. If q(x) > 0 and q(y) < 0, then the scalar function f(t) = q(t x + (1 − t) y) satisfies f(0) = q(y) < 0 and f(1) = q(x) > 0, so, by continuity, there is a point 0 < t⋆ < 1 such that f(t⋆) = 0, and hence, setting z = t⋆ x + (1 − t⋆) y gives q(z) = 0. Moreover, z ≠ 0, as otherwise x = c y, with c = 1 − 1/t⋆, would be parallel vectors, but then q(x) = c² q(y) would have the same sign.

3.4.17. (a) False. For example, the nonsingular matrix K = ( 1, 0 ; 0, −1 ) has null directions, e.g., ( 1 ; 1 ) and ( 1 ; −1 ). (b) True; see Exercise 3.4.16.
♦ 3.4.18.
(a) x² − y² = (x − y)(x + y) = 0: [plot of the two lines y = ± x];
(b) x² + 4 x y + 3 y² = (x + y)(x + 3y) = 0: [plot of the two lines];
(c) x² − y² − z² = 0: [plot of the cone].
♦ 3.4.19.(a) First, k_ii = e_i^T K e_i = e_i^T L e_i = l_ii, so their diagonal entries are equal. Further,
k_ii + 2 k_ij + k_jj = (e_i + e_j)^T K (e_i + e_j) = (e_i + e_j)^T L (e_i + e_j) = l_ii + 2 l_ij + l_jj,
and hence k_ij = k_ji = l_ij = l_ji, and so K = L.
(b) Example: If K = ( 0, 1 ; 1, 0 ) and L = ( 0, 2 ; 0, 0 ), then x^T K x = x^T L x = 2 x1 x2.

♦ 3.4.20. Since q(x) is a scalar, q(x) = x^T A x = (x^T A x)^T = x^T A^T x, and hence
q(x) = ½ (x^T A x + x^T A^T x) = x^T K x.

♦ 3.4.21. (a) ℓ(c x) = a · c x = c (a · x) = c ℓ(x); (b) q(c x) = (c x)^T K (c x) = c² x^T K x = c² q(x); (c) Example: q(x) = ‖ x ‖², where ‖ x ‖ is any norm that does not come from an inner product.
3.4.22. (i) ( 10, 6 ; 6, 4 ); positive definite. (ii) ( 5, 4, −3 ; 4, 13, −1 ; −3, −1, 2 ); positive semi-definite; null vectors: all scalar multiples of ( 5, −1, 7 )^T. (iii) ( 6, −8 ; −8, 13 ); positive definite. (iv) ( 2, 1, 1 ; 1, 2, 1 ; 1, 1, 2 ); positive definite. (v) ( 9, 6, 3 ; 6, 6, 0 ; 3, 0, 3 ); positive semi-definite; null vectors: all scalar multiples of ( −1, 1, 1 )^T. (vi) ( 2, −1 ; −1, 3 ); positive definite. (vii) ( 30, 0, −6 ; 0, 30, 3 ; −6, 3, 15 ); positive definite. (viii) ( 2, −2, −1, 0 ; −2, 5, 2, 2 ; −1, 2, 2, 3 ; 0, 2, 3, 13 ); positive definite.
3.4.23. (iii) ( 9, −12 ; −12, 21 ), (iv) ( 3, 1, 2 ; 1, 4, 3 ; 2, 3, 5 ), (v) ( 21, 12, 9 ; 12, 9, 3 ; 9, 3, 6 ). Positive definiteness doesn't change, since it only depends upon the linear independence of the vectors.

3.4.24. (vi) ( 4/3, −1 ; −1, 7/4 ), (vii) ( 10, −2, −1 ; −2, 145/12, 10/3 ; −1, 10/3, 41/6 ), (viii) ( 5/4, −2, −1, 0 ; −2, 9/2, 2, 1 ; −1, 2, 4/3, 1 ; 0, 1, 1, 5 ).
3.4.25. K = ( 1, e−1, ½(e²−1) ; e−1, ½(e²−1), (1/3)(e³−1) ; ½(e²−1), (1/3)(e³−1), ¼(e⁴−1) ) is positive definite, since 1, eˣ, e²ˣ are linearly independent functions.

3.4.26. K = ( 1 − 1/e, 1, e−1 ; 1, e−1, ½(e²−1) ; e−1, ½(e²−1), (1/3)(e³−1) ) is also positive definite, since 1, eˣ, e²ˣ are (still) linearly independent.

3.4.27. K = ( 2, 0, 2/3, 0 ; 0, 2/3, 0, 2/5 ; 2/3, 0, 2/5, 0 ; 0, 2/5, 0, 2/7 ) is positive definite, since 1, x, x², x³ are linearly independent.

3.4.28. K = ( 2, 2/3, 2/3, 2/5 ; 2/3, 2/3, 2/5, 2/5 ; 2/3, 2/5, 2/5, 2/7 ; 2/5, 2/5, 2/7, 2/9 ) is positive definite, since 1, x, x², x³ are (still) linearly independent.

♦ 3.4.29. Let 〈 x , y 〉 = x^T K y be the corresponding inner product. Then k_ij = 〈 e_i , e_j 〉, and hence K is the Gram matrix associated with the standard basis vectors e1, …, en.
♦ 3.4.30.(a) This is a special case of (b), since positive definite matrices are symmetric.
(b) By Theorem 3.28, if S is any symmetric matrix, then S^T S = S² is always positive semi-definite, and positive definite if and only if ker S = {0}, i.e., S is nonsingular. In particular, if S = K > 0, then ker K = {0}, and so K² > 0.
♦ 3.4.31.(a) coker K = ker K since K is symmetric, and so part (a) follows from Proposition 3.36.
(b) By Exercise 2.5.39, rng K ⊂ rng A^T = corng A. Moreover, by part (a) and Theorem 2.49, both have the same dimension, and hence they must be equal.
3.4.32. 0 = z^T K z = z^T A^T C A z = y^T C y, where y = A z. Since C > 0, this implies y = 0, and hence z ∈ ker A = ker K.
3.4.33.(a) L is the Gram matrix corresponding to the columns of A^T, i.e., the rows of A.
(b) From Exercise 3.4.31 and Theorem 2.49, rank K = rank A = rank A^T = rank L.
(c) This is true if and only if both ker A and coker A are {0}, which, by Theorem 2.49, requires that A be square and nonsingular.
3.4.34. A Gram matrix is positive definite if and only if the vector space elements used to con-struct it are linearly independent. Linear independence doesn’t depend upon the innerproduct being used, and so if the Gram matrix for one inner product is positive definite,so is the Gram matrix for any other inner product on the vector space.
♦ 3.4.35.(a) As in Exercise 3.4.7, the sum of positive definite matrices is positive definite.
(b) Example: A1 = ( 1 0 ), A2 = ( 0 1 ), C1 = C2 = I, K = I.
(c) In block form, set A = ( A1 ; A2 ) and C = ( C1, O ; O, C2 ). Then A^T C A = A1^T C1 A1 + A2^T C2 A2 = K.
3.5.1. Only (a), (e) are positive definite.
3.5.2.
(a) ( 1, 2 ; 2, 3 ) = ( 1, 0 ; 2, 1 ) ( 1, 0 ; 0, −1 ) ( 1, 2 ; 0, 1 ); not positive definite.
(b) ( 5, −1 ; −1, 3 ) = ( 1, 0 ; −1/5, 1 ) ( 5, 0 ; 0, 14/5 ) ( 1, −1/5 ; 0, 1 ); positive definite.
(c) ( 3, −1, 3 ; −1, 5, 1 ; 3, 1, 5 ) = ( 1, 0, 0 ; −1/3, 1, 0 ; 1, 3/7, 1 ) ( 3, 0, 0 ; 0, 14/3, 0 ; 0, 0, 8/7 ) ( 1, −1/3, 1 ; 0, 1, 3/7 ; 0, 0, 1 ); positive definite.
(d) ( −2, 1, −1 ; 1, −2, 1 ; −1, 1, −2 ) = ( 1, 0, 0 ; −1/2, 1, 0 ; 1/2, −1/3, 1 ) ( −2, 0, 0 ; 0, −3/2, 0 ; 0, 0, −4/3 ) ( 1, −1/2, 1/2 ; 0, 1, −1/3 ; 0, 0, 1 ); not positive definite.
(e) ( 2, 1, −2 ; 1, 1, −3 ; −2, −3, 11 ) = ( 1, 0, 0 ; 1/2, 1, 0 ; −1, −4, 1 ) ( 2, 0, 0 ; 0, 1/2, 0 ; 0, 0, 1 ) ( 1, 1/2, −1 ; 0, 1, −4 ; 0, 0, 1 ); positive definite.
(f) ( 1, 1, 1, 0 ; 1, 2, 0, 1 ; 1, 0, 1, 1 ; 0, 1, 1, 2 ) = ( 1, 0, 0, 0 ; 1, 1, 0, 0 ; 1, −1, 1, 0 ; 0, 1, −2, 1 ) ( 1, 0, 0, 0 ; 0, 1, 0, 0 ; 0, 0, −1, 0 ; 0, 0, 0, 5 ) ( 1, 1, 1, 0 ; 0, 1, −1, 1 ; 0, 0, 1, −2 ; 0, 0, 0, 1 ); not positive definite.
(g) ( 3, 2, 1, 0 ; 2, 3, 0, 1 ; 1, 0, 3, 2 ; 0, 1, 2, 4 ) = ( 1, 0, 0, 0 ; 2/3, 1, 0, 0 ; 1/3, −2/5, 1, 0 ; 0, 3/5, 1, 1 ) ( 3, 0, 0, 0 ; 0, 5/3, 0, 0 ; 0, 0, 12/5, 0 ; 0, 0, 0, 1 ) ( 1, 2/3, 1/3, 0 ; 0, 1, −2/5, 3/5 ; 0, 0, 1, 1 ; 0, 0, 0, 1 ); positive definite.
(h) ( 2, 1, −2, 0 ; 1, 3, −3, 2 ; −2, −3, 4, −1 ; 0, 2, −1, 7 ) = ( 1, 0, 0, 0 ; 1/2, 1, 0, 0 ; −1, −4/5, 1, 0 ; 0, 4/5, 3/2, 1 ) ( 2, 0, 0, 0 ; 0, 5/2, 0, 0 ; 0, 0, 2/5, 0 ; 0, 0, 0, 9/2 ) ( 1, 1/2, −1, 0 ; 0, 1, −4/5, 4/5 ; 0, 0, 1, 3/2 ; 0, 0, 0, 1 ); positive definite.
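The factorizations in 3.5.2 all have the form K = L D L^T, produced by symmetric Gaussian Elimination, and K is positive definite exactly when every pivot on the diagonal of D is positive. A minimal sketch of the algorithm (my own implementation, checked here against the answer to 3.5.2(b)):

```python
# K = L D L^T by symmetric Gaussian Elimination; K > 0 iff all pivots > 0.
import numpy as np

def ldlt(K):
    """Return unit lower-triangular L and pivot vector d with K = L diag(d) L^T.
    Assumes no pivot vanishes (no pivoting is performed)."""
    K = np.array(K, dtype=float)
    n = K.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = K[j, j] - (L[j, :j] ** 2 * d[:j]).sum()
        for i in range(j + 1, n):
            L[i, j] = (K[i, j] - (L[i, :j] * L[j, :j] * d[:j]).sum()) / d[j]
    return L, d

K = [[5.0, -1.0], [-1.0, 3.0]]           # matrix of 3.5.2(b)
L, d = ldlt(K)
recon_ok = bool(np.allclose(L @ np.diag(d) @ L.T, K))
pos_def = bool(np.all(d > 0))            # pivots should be 5 and 14/5
```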
3.5.3.(a) Gaussian Elimination leads to U = [ 1 1 0 ; 0 c−1 1 ; 0 0 (c−2)/(c−1) ]; the pivots 1, c−1, (c−2)/(c−1) are all positive if and only if c > 2.
(b) [ 1 1 0 ; 1 3 1 ; 0 1 1 ] = [ 1 0 0 ; 1 1 0 ; 0 1/2 1 ] [ 1 0 0 ; 0 2 0 ; 0 0 1/2 ] [ 1 1 0 ; 0 1 1/2 ; 0 0 1 ];
(c) q(x, y, z) = (x + y)^2 + 2(y + 1/2 z)^2 + 1/2 z^2.
(d) The coefficients (pivots) 1, 2, 1/2 are positive, so the quadratic form is positive definite.
3.5.4. K = [ 1 1/2 −1/2 ; 1/2 2 0 ; −1/2 0 3 ]; yes, it is positive definite.
3.5.5.(a) (x + 4y)^2 − 15y^2; not positive definite.
(b) (x − 2y)^2 + 3y^2; positive definite.
(c) (x − y)^2 − 2y^2; not positive definite.
(d) (x + 3y)^2 − 9y^2; not positive definite.
3.5.6. (a) (x + 2z)^2 + 3y^2 + z^2, (b) (x + 3/2 y − z)^2 + 3/4 (y + 2z)^2 + 4z^2,
(c) 2(x1 + 1/4 x2 − 1/2 x3)^2 + 15/8 (x2 − 2/5 x3)^2 + 6/5 x3^2.
3.5.7.
(a) ( x y z ) [ 1 0 2 ; 0 2 4 ; 2 4 12 ] ( x, y, z )^T; not positive definite;
(b) ( x y z ) [ 3 −4 1/2 ; −4 −2 0 ; 1/2 0 1 ] ( x, y, z )^T; not positive definite;
(c) ( x y z ) [ 1 1 −2 ; 1 2 −3 ; −2 −3 6 ] ( x, y, z )^T; positive definite;
(d) ( x1 x2 x3 ) [ 3 2 −7/2 ; 2 −1 9/2 ; −7/2 9/2 5 ] ( x1, x2, x3 )^T; not positive definite;
(e) ( x1 x2 x3 x4 ) [ 1 2 −1 0 ; 2 5 0 −1 ; −1 0 6 −1/2 ; 0 −1 −1/2 4 ] ( x1, x2, x3, x4 )^T; positive definite.
3.5.8. When a^2 < 4 and a^2 + b^2 + c^2 − abc < 4.
3.5.9. True. Indeed, if a ≠ 0, then q(x, y) = a ( x + (b/a) y )^2 + ((ac − b^2)/a) y^2; if c ≠ 0, then q(x, y) = c ( y + (b/c) x )^2 + ((ac − b^2)/c) x^2; while if a = c = 0, then q(x, y) = 2bxy = 1/2 b(x + y)^2 − 1/2 b(x − y)^2.
3.5.10. (a) According to Theorem 1.52, det K is equal to the product of its pivots, which are all positive by Theorem 3.37. (b) tr K = Σ_{i=1}^n k_ii > 0 since, according to Exercise 3.4.4, every diagonal entry of K is positive. (c) For K = [ a b ; b c ], if tr K = a + c > 0 and a ≤ 0, then c > 0, but then det K = ac − b^2 ≤ 0, which contradicts the assumptions. Thus, both a > 0 and ac − b^2 > 0, which, by (3.62), implies K > 0. (d) Example: K = [ 3 0 0 ; 0 −1 0 ; 0 0 −1 ].
3.5.11. Writing x = ( x1 ; x2 ) ∈ R^{2n}, where x1, x2 ∈ R^n, we have x^T K x = x1^T K1 x1 + x2^T K2 x2 > 0 for all x ≠ 0 by positive definiteness of K1, K2. The converse is also true, because x1^T K1 x1 = x^T K x > 0 when x = ( x1 ; 0 ) with x1 ≠ 0.
3.5.12.(a) If x ≠ 0 then u = x/‖x ‖ is a unit vector, and so q(x) = x^T K x = ‖x ‖^2 u^T K u > 0.
(b) Using the Euclidean norm, let m = min{ u^T S u | ‖u ‖ = 1 } > −∞, which is finite since q(u) is continuous and the unit sphere in R^n is closed and bounded. Then u^T K u = u^T S u + c ‖u ‖^2 ≥ m + c > 0 for ‖u ‖ = 1 provided c > −m.
3.5.13. Write S = (S + c I ) + (−c I ) = K + N , where N = −c I is negative definite for any c > 0, while K = S + c I is positive definite provided c ≫ 0 is sufficiently large.
♦ 3.5.14.(a) The ith column of D L^T is d_i l_i. Hence writing K = L(D L^T ) and using formula (1.14) results in (3.69).
(b) [ 4 −1 ; −1 1 ] = 4 ( 1 ; −1/4 )( 1 −1/4 ) + 3/4 ( 0 ; 1 )( 0 1 ) = [ 4 −1 ; −1 1/4 ] + [ 0 0 ; 0 3/4 ],
[ 1 2 1 ; 2 6 1 ; 1 1 4 ] = ( 1 ; 2 ; 1 )( 1 2 1 ) + 2 ( 0 ; 1 ; −1/2 )( 0 1 −1/2 ) + 5/2 ( 0 ; 0 ; 1 )( 0 0 1 )
= [ 1 2 1 ; 2 4 2 ; 1 2 1 ] + [ 0 0 0 ; 0 2 −1 ; 0 −1 1/2 ] + [ 0 0 0 ; 0 0 0 ; 0 0 5/2 ].
♥ 3.5.15. According to Exercise 1.9.19, the pivots of a regular matrix are the ratios of the successive subdeterminants. For the 3 × 3 case, the pivots are a, (ad − b^2)/a, and det K/(ad − b^2). Therefore, in general the pivots are positive if and only if the subdeterminants are.
♦ 3.5.16. If a negative diagonal entry appears, it is either a pivot, or a diagonal entry of the remaining lower right symmetric (m − i) × (m − i) submatrix, which, by Exercise 3.4.4, must all be positive in order that the matrix be positive definite.
♦ 3.5.17. Use the fact that K = −N is positive definite. A 2 × 2 symmetric matrix N = [ a b ; b c ] is negative definite if and only if a < 0 and det N = ac − b^2 > 0. Similarly, a 3 × 3 matrix N = [ a b c ; b d e ; c e f ] < 0 if and only if a < 0, ad − b^2 > 0, det N < 0. In general, N is negative definite if and only if its upper left entry is negative and the sequence of square upper left i × i subdeterminants, i = 1, . . . , n, have alternating signs: −, +, −, +, . . . .
3.5.18. False: if N has size n× n then trN < 0 but detN > 0 if n is even and < 0 if n is odd.
3.5.19.
(a) [ 3 −2 ; −2 2 ] = [ √3 0 ; −2/√3 √2/√3 ] [ √3 −2/√3 ; 0 √2/√3 ],
(b) [ 4 −12 ; −12 45 ] = [ 2 0 ; −6 3 ] [ 2 −6 ; 0 3 ],
(c) [ 1 1 1 ; 1 2 −2 ; 1 −2 14 ] = [ 1 0 0 ; 1 1 0 ; 1 −3 2 ] [ 1 1 1 ; 0 1 −3 ; 0 0 2 ],
(d) [ 2 1 1 ; 1 2 1 ; 1 1 2 ] = [ √2 0 0 ; 1/√2 √3/√2 0 ; 1/√2 1/√6 2/√3 ] [ √2 1/√2 1/√2 ; 0 √3/√2 1/√6 ; 0 0 2/√3 ],
(e) [ 2 1 0 0 ; 1 2 1 0 ; 0 1 2 1 ; 0 0 1 2 ] = [ √2 0 0 0 ; 1/√2 √3/√2 0 0 ; 0 √2/√3 2/√3 0 ; 0 0 √3/2 √5/2 ] [ √2 1/√2 0 0 ; 0 √3/√2 √2/√3 0 ; 0 0 2/√3 √3/2 ; 0 0 0 √5/2 ].
3.5.20. (a) [ 4 −2 ; −2 4 ] = [ 2 0 ; −1 √3 ] [ 2 −1 ; 0 √3 ],
(e) [ 2 1 1 1 ; 1 2 1 1 ; 1 1 2 1 ; 1 1 1 2 ] = [ √2 0 0 0 ; 1/√2 √3/√2 0 0 ; 1/√2 1/√6 2/√3 0 ; 1/√2 1/√6 1/(2√3) √5/2 ] [ √2 1/√2 1/√2 1/√2 ; 0 √3/√2 1/√6 1/√6 ; 0 0 2/√3 1/(2√3) ; 0 0 0 √5/2 ].
3.5.21.(a) z1^2 + z2^2, where z1 = 4x1, z2 = 5x2;
(b) z1^2 + z2^2, where z1 = x1 − x2, z2 = √3 x2;
(c) z1^2 + z2^2, where z1 = √5 x1 − (2/√5) x2, z2 = √(11/5) x2;
(d) z1^2 + z2^2 + z3^2, where z1 = √3 x1 − (1/√3) x2 − (1/√3) x3, z2 = √(5/3) x2 − (1/√15) x3, z3 = √(28/5) x3;
(e) z1^2 + z2^2 + z3^2, where z1 = x1 + 1/2 x2, z2 = (√3/2) x2 + (1/√3) x3, z3 = √(2/3) x3;
(f ) z1^2 + z2^2 + z3^2, where z1 = 2x1 − 1/2 x2 − x3, z2 = 1/2 x2 − 2x3, z3 = x3;
(g) z1^2 + z2^2 + z3^2 + z4^2, where z1 = √3 x1 + (1/√3) x2, z2 = √(8/3) x2 + √(3/8) x3, z3 = √(21/8) x3 + √(8/21) x4, z4 = √(55/21) x4.
3.6.1. The equation is e^{π i} + 1 = 0, since e^{π i} = cos π + i sin π = −1.
3.6.2. e^{kπ i} = cos kπ + i sin kπ = (−1)^k = 1 for k even, −1 for k odd.
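The identity e^{kπ i} = (−1)^k is easy to spot-check numerically with Python's `cmath` (an illustrative aside, not part of the manual):

```python
import cmath

# e^{k pi i} = cos(k pi) + i sin(k pi) = (-1)^k, up to floating-point rounding:
for k in range(4):
    z = cmath.exp(k * cmath.pi * 1j)
    print(k, abs(z - (-1) ** k) < 1e-12)   # every line ends in True
```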
3.6.3. Not necessarily. Since 1 = e^{2kπ i} for any integer k, we could equally well compute
1^z = e^{2kπ i z} = e^{−2kπy + i (2kπx)} = e^{−2kπy} ( cos 2kπx + i sin 2kπx ).
If z = n is an integer, this always reduces to 1^n = 1, no matter what k is. If z = m/n is a rational number (in lowest terms) then 1^{m/n} has n different possible values. In all other cases, 1^z has an infinite number of possible values.
3.6.4. e^{2aπ i} = cos 2aπ + i sin 2aπ = 1 if and only if a is an integer. The problem is that, as in Exercise 3.6.3, the quantity 1^a is not necessarily equal to 1 when a is not an integer.
3.6.5. (a) i = e^{π i /2}; (b) √i = e^{π i /4} = 1/√2 + i/√2 and e^{5π i /4} = −1/√2 − i/√2.
(c) i^{1/3} = e^{π i /6}, e^{5π i /6}, e^{3π i /2} = − i ; i^{1/4} = e^{π i /8}, e^{5π i /8}, e^{9π i /8}, e^{13π i /8}.
3.6.6. Along the line through z̄ at the reciprocal radius 1/r = 1/| z |.
3.6.7.
(a) 1/z moves in a clockwise direction around a circle of radius 1/r;
(b) z̄ moves in a clockwise direction around a circle of radius r;
(c) Suppose the circle has radius r and is centered at a. If r < | a |, then 1/z moves in a counterclockwise direction around a circle of radius r/(| a |^2 − r^2) centered at ā/(| a |^2 − r^2); if r > | a |, then 1/z moves in a clockwise direction around a circle of radius r/(r^2 − | a |^2) centered at ā/(| a |^2 − r^2); if r = | a |, then 1/z moves along a straight line. On the other hand, z̄ moves in a clockwise direction around a circle of radius r centered at ā.
♦ 3.6.8. Set z = x + i y. We find |Re z | = |x | = √(x^2) ≤ √(x^2 + y^2) = | z |, which proves the first inequality. Similarly, | Im z | = | y | = √(y^2) ≤ √(x^2 + y^2) = | z |.
♦ 3.6.9. Write z = r e^{i θ}, so θ = ph z. Then Re ( e^{i ϕ} z ) = Re ( r e^{i (ϕ+θ)} ) = r cos(ϕ + θ) ≤ r = | z |, with equality if and only if ϕ + θ is an integer multiple of 2π.
3.6.10. Set z = r e^{i θ}, w = s e^{i ϕ}; then zw = r s e^{i (θ+ϕ)} has modulus | zw | = r s = | z | |w | and phase ph (zw) = θ + ϕ = ph z + ph w. Further, z̄ = r e^{− i θ} has modulus | z̄ | = r = | z | and phase ph z̄ = −θ = −ph z.
3.6.11. If z = r e^{i θ}, w = s e^{i ϕ} ≠ 0, then z/w = (r/s) e^{i (θ−ϕ)} has phase ph (z/w) = θ − ϕ = ph z − ph w, while z w̄ = r s e^{i (θ−ϕ)} also has phase ph (z w̄) = θ − ϕ = ph z − ph w.
3.6.12. Since tan(t + π) = tan t, the inverse tan^{−1} t is only defined up to multiples of π, whereas ph z is uniquely defined up to multiples of 2π.
3.6.13. Set z = x + i y, w = u + i v; then z w̄ = (x + i y)(u − i v) = (xu + yv) + i (yu − xv) has real part Re (z w̄) = xu + yv, which is the dot product between ( x, y )^T and ( u, v )^T.
3.6.14.(a) By Exercise 3.6.13, for z = x + i y, w = u + i v, the quantity Re (z w̄) = xu + yv is equal to the dot product between the vectors ( x, y )^T, ( u, v )^T, and hence equals 0 if and only if they are orthogonal.
(b) Since \overline{i z} = − i z̄, we have z \overline{i z} = − i z z̄ = − i | z |^2, which is purely imaginary, with zero real part, and so orthogonality follows from part (a). Alternatively, note that z = x + i y corresponds to ( x, y )^T while i z = −y + i x corresponds to the orthogonal vector (−y, x )^T.
3.6.15. e^{(x+ i y)+(u+ i v)} = e^{(x+u)+ i (y+v)} = e^{x+u} [ cos(y + v) + i sin(y + v) ]
= e^{x+u} [ (cos y cos v − sin y sin v) + i (cos y sin v + sin y cos v) ]
= [ e^x (cos y + i sin y) ] [ e^u (cos v + i sin v) ] = e^z e^w.
Use induction: e^{(m+1)z} = e^{mz+z} = e^{mz} e^z = (e^z)^m e^z = (e^z)^{m+1}.
3.6.16.(a) e^{2 i θ} = cos 2θ + i sin 2θ, while (e^{i θ})^2 = (cos θ + i sin θ)^2 = (cos^2 θ − sin^2 θ) + 2 i cos θ sin θ, and hence cos 2θ = cos^2 θ − sin^2 θ, sin 2θ = 2 cos θ sin θ.
(b) cos 3θ = cos^3 θ − 3 cos θ sin^2 θ, sin 3θ = 3 cos^2 θ sin θ − sin^3 θ.
(c) cos mθ = Σ_{0 ≤ j = 2k ≤ m} (−1)^k C(m, j) cos^{m−j} θ sin^j θ,
sin mθ = Σ_{0 < j = 2k+1 ≤ m} (−1)^k C(m, j) cos^{m−j} θ sin^j θ,
where C(m, j) denotes the binomial coefficient.
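De Moivre's formula behind these identities is straightforward to verify numerically; the following sketch (not part of the manual) checks the triple-angle formula from part (b) at an arbitrary angle:

```python
import math

# De Moivre: (cos t + i sin t)^3 = cos 3t + i sin 3t, so comparing real parts
# gives cos 3t = cos^3 t - 3 cos t sin^2 t.
t = 0.7   # arbitrary test angle
lhs = math.cos(3 * t)
rhs = math.cos(t) ** 3 - 3 * math.cos(t) * math.sin(t) ** 2
print(abs(lhs - rhs) < 1e-12)   # True
```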
♦ 3.6.17. cos((θ − ϕ)/2) cos((θ + ϕ)/2) = 1/4 [ e^{i (ϕ−θ)/2} + e^{− i (ϕ−θ)/2} ] [ e^{i (ϕ+θ)/2} + e^{− i (ϕ+θ)/2} ]
= 1/4 e^{i ϕ} + 1/4 e^{− i ϕ} + 1/4 e^{i θ} + 1/4 e^{− i θ} = 1/2 cos θ + 1/2 cos ϕ.
3.6.18. e^z = e^x cos y + i e^x sin y = r cos θ + i r sin θ implies r = | e^z | = e^x and θ = ph e^z = y.
3.6.19. cos(x + i y) = cos x cosh y − i sin x sinh y, sin(x + i y) = sin x cosh y + i cos x sinh y, where cosh y = (e^y + e^{−y})/2, sinh y = (e^y − e^{−y})/2. In particular, when y = 0, cosh y = 1 and sinh y = 0, and so these reduce to the usual real trigonometric functions. If x = 0 we obtain cos i y = cosh y, sin i y = i sinh y.
3.6.20.(a) cosh(x + i y) = cosh x cos y + i sinh x sin y, sinh(x + i y) = sinh x cos y + i cosh x sin y;
(b) Using Exercise 3.6.19,
cos i z = (e^{−z} + e^z)/2 = cosh z, sin i z = (e^{−z} − e^z)/(2 i) = i (e^z − e^{−z})/2 = i sinh z.
♥ 3.6.21.
(a) If j + k = n, then (cos θ)^j (sin θ)^k = (1/(2^n i^k)) (e^{i θ} + e^{− i θ})^j (e^{i θ} − e^{− i θ})^k. When multiplied out, each term has 0 ≤ l ≤ n factors of e^{i θ} and n − l factors of e^{− i θ}, which equals e^{i (2l−n)θ} with −n ≤ 2l − n ≤ n, and hence the product is a linear combination of the indicated exponentials.
(b) This follows from part (a), writing each e^{i kθ} = cos kθ + i sin kθ and e^{− i kθ} = cos kθ − i sin kθ for k ≥ 0.
(c) (i) cos^2 θ = 1/4 e^{2 i θ} + 1/2 + 1/4 e^{−2 i θ} = 1/2 + 1/2 cos 2θ,
(ii) cos θ sin θ = − 1/4 i e^{2 i θ} + 1/4 i e^{−2 i θ} = 1/2 sin 2θ,
(iii) cos^3 θ = 1/8 e^{3 i θ} + 3/8 e^{i θ} + 3/8 e^{− i θ} + 1/8 e^{−3 i θ} = 3/4 cos θ + 1/4 cos 3θ,
(iv) sin^4 θ = 1/16 e^{4 i θ} − 1/4 e^{2 i θ} + 3/8 − 1/4 e^{−2 i θ} + 1/16 e^{−4 i θ} = 3/8 − 1/2 cos 2θ + 1/8 cos 4θ,
(v) cos^2 θ sin^2 θ = − 1/16 e^{4 i θ} + 1/8 − 1/16 e^{−4 i θ} = 1/8 − 1/8 cos 4θ.
♦ 3.6.22. x^{a+ i b} = x^a e^{i b log x} = x^a cos(b log x) + i x^a sin(b log x).
♦ 3.6.23. First, using the power series for e^x, we have the complex power series e^{i x} = Σ_{n=0}^∞ (i x)^n / n!. Since i^n = 1 for n = 4k, i for n = 4k + 1, −1 for n = 4k + 2, and − i for n = 4k + 3, we can rewrite the preceding series as
e^{i x} = Σ_{n=0}^∞ (i x)^n / n! = Σ_{k=0}^∞ (−1)^k x^{2k} / (2k)! + i Σ_{k=0}^∞ (−1)^k x^{2k+1} / (2k+1)! = cos x + i sin x.
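The split of the series into its real and imaginary parts can be checked by summing the two sub-series numerically (an illustrative sketch, not part of the manual):

```python
import math

def exp_i(x, terms=30):
    """Partial sums of the even-index (real) and odd-index (imaginary)
    halves of the series for e^{ix}."""
    re = sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
             for k in range(terms))
    im = sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
             for k in range(terms))
    return re, im

re, im = exp_i(1.0)
print(abs(re - math.cos(1.0)) < 1e-12, abs(im - math.sin(1.0)) < 1e-12)
```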
♦ 3.6.24.
(a) d/dx e^{λx} = d/dx ( e^{µx} cos νx + i e^{µx} sin νx ) = (µ e^{µx} cos νx − ν e^{µx} sin νx) + i (µ e^{µx} sin νx + ν e^{µx} cos νx) = (µ + i ν)( e^{µx} cos νx + i e^{µx} sin νx ) = λ e^{λx}.
(b) This follows from the Fundamental Theorem of Calculus. Alternatively, one can calculate the integrals of the real and imaginary parts directly:
∫_a^b e^{µx} cos νx dx = (1/(µ^2 + ν^2)) ( µ e^{µb} cos νb + ν e^{µb} sin νb − µ e^{µa} cos νa − ν e^{µa} sin νa ),
∫_a^b e^{µx} sin νx dx = (1/(µ^2 + ν^2)) ( µ e^{µb} sin νb − ν e^{µb} cos νb − µ e^{µa} sin νa + ν e^{µa} cos νa ).
3.6.25. (a) 1/2 x + 1/4 sin 2x, (b) 1/2 x − 1/4 sin 2x, (c) − 1/4 cos 2x, (d) − 1/4 cos 2x − 1/16 cos 8x, (e) 3/8 x + 1/4 sin 2x + 1/32 sin 4x, (f ) 3/8 x − 1/4 sin 2x + 1/32 sin 4x, (g) 1/8 x − 1/32 sin 4x, (h) − 1/4 cos x + 1/20 cos 5x − 1/36 cos 9x − 1/60 cos 15x.
♠ 3.6.26. [Surface plots of Re z^2 and Im z^2.] Both have saddle points at the origin.
[Surface plots of Re (1/z) and Im (1/z).] Both have singularities (“poles”) at the origin.
♠ 3.6.27. [Surface plots of ph z and | z |.]
3.6.28. (a) Linearly independent; (b) linearly dependent; (c) linearly independent; (d) linearly dependent; (e) linearly independent; (f ) linearly independent; (g) linearly dependent.
3.6.29. (a) Linearly independent; (b) yes, they are a basis; (c) ‖v1 ‖ = √2, ‖v2 ‖ = √6, ‖v3 ‖ = √5; (d) v1 · v2 = 1 + i , v2 · v1 = 1 − i , v1 · v3 = 0, v2 · v3 = 0, so v1, v3 and v2, v3 are orthogonal, but not v1, v2. (e) No, since v1 and v2 are not orthogonal.
3.6.30.(a) Dimension = 1; basis: ( 1, i , 1 − i )^T.
(b) Dimension = 2; basis: ( i − 1, 0, 1 )^T, (− i , 1, 0 )^T.
(c) Dimension = 2; basis: ( 1, i + 2 )^T, ( i , 1 + 3 i )^T.
(d) Dimension = 1; basis: ( − 14/5 − 8/5 i , 13/5 − 4/5 i , 1 )^T.
(e) Dimension = 2; basis: ( 1 + i , 1, 0 )^T, ( i , 0, 1 )^T.
3.6.31. False — it is not closed under scalar multiplication. For instance, i ( z, z̄ )^T = ( i z, i z̄ )^T is not in the subspace, since \overline{i z} = − i z̄ ≠ i z̄ (unless z = 0).
3.6.32.
(a) Range: ( i , −1 )^T; corange: ( i , 2 )^T; kernel: ( 2 i , 1 )^T; cokernel: (− i , 1 )^T.
(b) Range: ( 2, −4 )^T, (−1 + i , 3 )^T; corange: ( 2, −1 + i , 1 − 2 i )^T, ( 0, 1 + i , 3 − 3 i )^T; kernel: ( 1 + 5/2 i , 3 i , 1 )^T; cokernel: {0}.
(c) Range: ( i , −1 + 2 i , i )^T, ( 2 − i , 3, 1 + i )^T; corange: ( i , −1, 2 − i )^T, ( 0, 0, −2 )^T; kernel: (− i , 1, 0 )^T; cokernel: ( 1 − 3/2 i , − 1/2 + i , 1 )^T.
3.6.33. If cv + dw = 0 where c = a + i b, d = e + i f , then, taking real and imaginary parts, (a + e)x + (−b + f)y = 0 = (b + f)x + (a − e)y. If c, d are not both zero, so v, w are linearly dependent, then a ± e, b ± f cannot all be zero, and so x, y are linearly dependent. Conversely, if ax + by = 0 with a, b not both zero, then (a − i b)v + (a + i b)w = 2(ax + by) = 0, and hence v, w are linearly dependent.
3.6.34. This can be proved directly, or by noting that the space can be identified with the vector space C^{mn}. The dimension is mn, with a basis provided by the mn matrices with a single entry of 1 and all other entries equal to 0.
3.6.35. Only (b) is a subspace.
3.6.36. False.
3.6.37. (a) Belongs: sin x = − 1/2 i e^{i x} + 1/2 i e^{− i x}; (b) belongs: cos x − 2 sin x = ( 1/2 + i ) e^{i x} + ( 1/2 − i ) e^{− i x}; (c) doesn’t belong; (d) belongs: sin^2 1/2 x = 1/2 − 1/4 e^{i x} − 1/4 e^{− i x}; (e) doesn’t belong.
3.6.38.(a) Sesquilinearity:
〈 cu + dv , w 〉 = (cu1 + dv1) w̄1 + 2(cu2 + dv2) w̄2 = c(u1 w̄1 + 2u2 w̄2) + d(v1 w̄1 + 2v2 w̄2) = c 〈u , w 〉 + d 〈v , w 〉,
〈u , cv + dw 〉 = u1 ( c̄ v̄1 + d̄ w̄1 ) + 2u2 ( c̄ v̄2 + d̄ w̄2 ) = c̄ (u1 v̄1 + 2u2 v̄2) + d̄ (u1 w̄1 + 2u2 w̄2) = c̄ 〈u , v 〉 + d̄ 〈u , w 〉.
Conjugate Symmetry: 〈v , w 〉 = v1 w̄1 + 2v2 w̄2 = \overline{ w1 v̄1 + 2w2 v̄2 } = \overline{〈w , v 〉}.
Positive definite: 〈v , v 〉 = | v1 |^2 + 2 | v2 |^2 > 0 whenever v = ( v1, v2 )^T ≠ 0.
(b) Sesquilinearity:
〈 cu + dv , w 〉 = (cu1 + dv1) w̄1 + i (cu1 + dv1) w̄2 − i (cu2 + dv2) w̄1 + 2(cu2 + dv2) w̄2
= c(u1 w̄1 + i u1 w̄2 − i u2 w̄1 + 2u2 w̄2) + d(v1 w̄1 + i v1 w̄2 − i v2 w̄1 + 2v2 w̄2) = c 〈u , w 〉 + d 〈v , w 〉,
〈u , cv + dw 〉 = u1 ( c̄ v̄1 + d̄ w̄1 ) + i u1 ( c̄ v̄2 + d̄ w̄2 ) − i u2 ( c̄ v̄1 + d̄ w̄1 ) + 2u2 ( c̄ v̄2 + d̄ w̄2 )
= c̄ (u1 v̄1 + i u1 v̄2 − i u2 v̄1 + 2u2 v̄2) + d̄ (u1 w̄1 + i u1 w̄2 − i u2 w̄1 + 2u2 w̄2) = c̄ 〈u , v 〉 + d̄ 〈u , w 〉.
Conjugate Symmetry: 〈v , w 〉 = v1 w̄1 + i v1 w̄2 − i v2 w̄1 + 2v2 w̄2 = \overline{ w1 v̄1 + i w1 v̄2 − i w2 v̄1 + 2w2 v̄2 } = \overline{〈w , v 〉}.
Positive definite: Let v = ( v1, v2 )^T = ( x1 + i y1, x2 + i y2 )^T:
〈v , v 〉 = | v1 |^2 + i v1 v̄2 − i v2 v̄1 + 2 | v2 |^2 = x1^2 + y1^2 + 2x1 y2 − 2x2 y1 + 2x2^2 + 2y2^2
= (x1 + y2)^2 + (y1 − x2)^2 + x2^2 + y2^2 > 0 provided v ≠ 0.
3.6.39. Only (d), (e) define Hermitian inner products.
♦ 3.6.40. (Av) · w = (Av)^T w̄ = v^T A^T w̄ = v^T \overline{A w} = v · (A w), since A^T = A = Ā for a real symmetric matrix A.
3.6.41.
(a) ‖ z ‖^2 = Σ_{j=1}^n | zj |^2 = Σ_{j=1}^n ( | xj |^2 + | yj |^2 ) = Σ_{j=1}^n | xj |^2 + Σ_{j=1}^n | yj |^2 = ‖x ‖^2 + ‖y ‖^2.
(b) No; for instance, the formula is not valid for the inner product in Exercise 3.6.38(b).
♦ 3.6.42.
(a) ‖ z + w ‖^2 = 〈 z + w , z + w 〉 = ‖ z ‖^2 + 〈 z , w 〉 + 〈w , z 〉 + ‖w ‖^2 = ‖ z ‖^2 + 〈 z , w 〉 + \overline{〈 z , w 〉} + ‖w ‖^2 = ‖ z ‖^2 + 2 Re 〈 z , w 〉 + ‖w ‖^2.
(b) Using (a),
‖ z + w ‖^2 − ‖ z − w ‖^2 + i ‖ z + i w ‖^2 − i ‖ z − i w ‖^2 = 4 Re 〈 z , w 〉 + 4 i Re (− i 〈 z , w 〉) = 4 Re 〈 z , w 〉 + 4 i Im 〈 z , w 〉 = 4 〈 z , w 〉.
♦ 3.6.43.
(a) The angle θ between v, w is defined by cos θ = | 〈v , w 〉 | / ( ‖v ‖ ‖w ‖ ). Note the modulus on the inner product term, which is needed in order to keep the angle real.
(b) ‖v ‖ = √11, ‖w ‖ = 2√2, 〈v , w 〉 = 2 i , so θ = cos^{−1}(1/√22) ≈ 1.3556 radians.
3.6.44. ‖v ‖ = ‖ cv ‖ if and only if | c | = 1, and so c = e^{i θ} for some 0 ≤ θ < 2π.
♦ 3.6.45. Assume w ≠ 0. Then, by Exercise 3.6.42(a), for t ∈ C,
0 ≤ ‖v + tw ‖^2 = ‖v ‖^2 + 2 Re ( t̄ 〈v , w 〉 ) + | t |^2 ‖w ‖^2.
With t = − 〈v , w 〉 / ‖w ‖^2, we find
0 ≤ ‖v ‖^2 − 2 | 〈v , w 〉 |^2 / ‖w ‖^2 + | 〈v , w 〉 |^2 / ‖w ‖^2 = ‖v ‖^2 − | 〈v , w 〉 |^2 / ‖w ‖^2,
which implies | 〈v , w 〉 |^2 ≤ ‖v ‖^2 ‖w ‖^2, proving Cauchy–Schwarz.
To establish the triangle inequality,
‖v + w ‖^2 = 〈v + w , v + w 〉 = ‖v ‖^2 + 2 Re 〈v , w 〉 + ‖w ‖^2 ≤ ‖v ‖^2 + 2 ‖v ‖ ‖w ‖ + ‖w ‖^2 = ( ‖v ‖ + ‖w ‖ )^2,
since, according to Exercise 3.6.8, Re 〈v , w 〉 ≤ | 〈v , w 〉 | ≤ ‖v ‖ ‖w ‖.
3.6.46.
(a) A norm on the complex vector space V assigns a real number ‖v ‖ to each vector v ∈ V , subject to the following axioms, for all v, w ∈ V and c ∈ C:
(i) Positivity: ‖v ‖ ≥ 0, with ‖v ‖ = 0 if and only if v = 0.
(ii) Homogeneity: ‖ cv ‖ = | c | ‖v ‖.
(iii) Triangle inequality: ‖v + w ‖ ≤ ‖v ‖ + ‖w ‖.
(b) ‖v ‖1 = | v1 | + · · · + | vn |; ‖v ‖2 = √( | v1 |^2 + · · · + | vn |^2 ); ‖v ‖∞ = max{ | v1 |, . . . , | vn | }.
3.6.47. (e) Infinitely many, namely u = e^{i θ} v/‖v ‖ for any 0 ≤ θ < 2π.
♦ 3.6.48.
(a) (A†)† = \overline{ ( Ā^T )^T } = \overline{ Ā } = A,
(b) (zA + wB)† = \overline{ (zA + wB)^T } = z̄ Ā^T + w̄ B̄^T = z̄ A† + w̄ B†,
(c) (AB)† = \overline{ (AB)^T } = \overline{ B^T A^T } = B̄^T Ā^T = B† A†.
♦ 3.6.49.(a) The entries of H satisfy hji = h̄ij; in particular, hii = h̄ii, and so hii is real.
(b) (H z) · w = (H z)^T w̄ = z^T H^T w̄ = z^T \overline{H w} = z · (H w), since H^T = H̄.
(c) Let z = Σ_{i=1}^n zi ei, w = Σ_{i=1}^n wi ei be vectors in C^n. Then, by sesquilinearity, 〈 z , w 〉 = Σ_{i,j=1}^n hij zi w̄j = z^T H w̄, where H has entries hij = 〈 ei , ej 〉 = \overline{〈 ej , ei 〉} = h̄ji, proving that it is a Hermitian matrix. Positive definiteness requires ‖ z ‖^2 = z^T H z̄ > 0 for all z ≠ 0.
(d) First check that the matrix is Hermitian: hij = h̄ji. Then apply Regular Gaussian Elimination, checking that all pivots are real and positive.
♦ 3.6.50.(a) The (i, j) entry of the Gram matrix K is kij = 〈vi , vj 〉 = \overline{〈vj , vi 〉} = k̄ji, and so K† = K is Hermitian.
(b) x^T K x̄ = Σ_{i,j=1}^n kij xi x̄j = Σ_{i,j=1}^n xi x̄j 〈vi , vj 〉 = ‖v ‖^2 ≥ 0, where v = Σ_{i=1}^n xi vi.
(c) Equality holds if and only if v = 0. If v1, . . . , vn are linearly independent, then v = Σ_{i=1}^n xi vi = 0 requires x = 0, proving positive definiteness.
3.6.51.(a) (i) 〈 1 , e^{i πx} 〉 = − (2/π) i , ‖ 1 ‖ = 1, ‖ e^{i πx} ‖ = 1;
(ii) | 〈 1 , e^{i πx} 〉 | = 2/π ≤ 1 = ‖ 1 ‖ ‖ e^{i πx} ‖, ‖ 1 + e^{i πx} ‖ = √2 ≤ 2 = ‖ 1 ‖ + ‖ e^{i πx} ‖.
(b) (i) 〈x + i , x − i 〉 = − 2/3 + i , ‖x + i ‖ = ‖x − i ‖ = 2/√3;
(ii) | 〈x + i , x − i 〉 | = (√13)/3 ≤ 4/3 = ‖x + i ‖ ‖x − i ‖, ‖ (x + i ) + (x − i ) ‖ = ‖ 2x ‖ = 2/√3 ≤ 4/√3 = ‖x + i ‖ + ‖x − i ‖.
(c) (i) 〈 i x^2 , (1 − 2 i )x + 3 i 〉 = 1/2 + 1/4 i , ‖ i x^2 ‖ = 1/√5, ‖ (1 − 2 i )x + 3 i ‖ = √(14/3);
(ii) | 〈 i x^2 , (1 − 2 i )x + 3 i 〉 | = (√5)/4 ≤ √(14/15) = ‖ i x^2 ‖ ‖ (1 − 2 i )x + 3 i ‖,
‖ i x^2 + (1 − 2 i )x + 3 i ‖ = √(88/15) ≈ 2.4221 ≤ 2.6075 ≈ 1/√5 + √(14/3) = ‖ i x^2 ‖ + ‖ (1 − 2 i )x + 3 i ‖.
3.6.52. w(x) > 0 must be real and positive. Less restrictively, one needs only require that w(x) ≥ 0 as long as w(x) ≢ 0 on any open subinterval a ≤ c < x < d ≤ b; see Exercise 3.1.28 for details.
Solutions — Chapter 4
4.1.1. We need to minimize (3x − 1)^2 + (2x + 1)^2 = 13x^2 − 2x + 2. The minimum value of 25/13 occurs when x = 1/13.
4.1.2. Note that f(x, y) ≥ 0; the minimum value f(x⋆, y⋆) = 0 is achieved when x⋆ = − 5/7, y⋆ = − 4/7.
4.1.3. (a) (−1, 0 )^T, (b) ( 0, 2 )^T, (c) ( 1/2, 1/2 )^T, (d) (− 3/2, 3/2 )^T, (e) (−1, 2 )^T.
4.1.4. Note: To minimize the distance from the point ( a, b )^T to the line y = mx + c: (i) in the ∞ norm we must minimize the scalar function f(x) = max{ |x − a |, |mx + c − b | }, while (ii) in the 1 norm we must minimize the scalar function f(x) = |x − a | + |mx + c − b |.
(i) (a) all points on the line segment ( x, 0 )^T for −3 ≤ x ≤ 1; (b) all points on the line segment ( 0, y )^T for 1 ≤ y ≤ 3; (c) ( 1/2, 1/2 )^T; (d) (− 3/2, 3/2 )^T; (e) (−1, 2 )^T.
(ii) (a) (−1, 0 )^T; (b) ( 0, 2 )^T; (c) all points on the line segment ( t, t )^T for −1 ≤ t ≤ 2; (d) all points on the line segment ( t, −t )^T for −2 ≤ t ≤ −1; (e) (−1, 2 )^T.
4.1.5.(a) Uniqueness is assured in the Euclidean norm. (See the following exercise.)
(b) Not unique. For instance, in the ∞ norm, every point on the x-axis of the form ( x, 0 )^T for −1 ≤ x ≤ 1 is at a minimum distance 1 from the point ( 0, 1 )^T.
(c) Not unique. For instance, in the 1 norm, every point on the line x = y of the form ( x, x )^T for −1 ≤ x ≤ 1 is at a minimum distance 1 from the point (−1, 1 )^T.
♥ 4.1.6.
(a) The closest point v is found by dropping a perpendicular from the point b to the line. [Figure omitted.]
(b) Any other point w on the line lies at a larger distance, since ‖w − b ‖ is the hypotenuse of the right triangle with corners b, v, w, and hence is longer than the side length ‖v − b ‖.
(c) Using the properties of the cross product, the distance is ‖b ‖ | sin θ | = |a × b |/‖a ‖, where θ is the angle between a and b. To prove the other formula, we note that
‖a ‖^2 ‖b ‖^2 − (a · b)^2 = (a1^2 + a2^2)(b1^2 + b2^2) − (a1 b1 + a2 b2)^2 = (a1 b2 − a2 b1)^2 = (a × b)^2.
4.1.7. This holds because the two triangles in the figure are congruent. According to Exercise 4.1.6(c), when ‖a ‖ = ‖b ‖ = 1, the distance is | sin θ |, where θ is the angle between a, b, as illustrated. [Figure omitted.]
4.1.8.
(a) Note that a = ( b, −a )^T lies in the line. By Exercise 4.1.6(a), the distance is |a × x0 |/‖a ‖ = | ax0 + by0 |/√(a^2 + b^2). (b) A similar geometric construction yields the distance | ax0 + by0 + c |/√(a^2 + b^2).
♥ 4.1.9. (a) The distance is given by | ax0 + by0 + cz0 + d |/√(a^2 + b^2 + c^2). (b) 1/√14.
4.1.10.(a) Let v⋆ be the minimizer. Since x^2 is a monotone, strictly increasing function for x ≥ 0, we have 0 ≤ x < y if and only if x^2 < y^2. Thus, for any other v, we have
x = ‖v⋆ − b ‖ < y = ‖v − b ‖ if and only if x^2 = ‖v⋆ − b ‖^2 < y^2 = ‖v − b ‖^2,
proving that v⋆ must minimize both quantities.
(b) F (x) must be strictly increasing: F (x) < F (y) whenever x < y.
4.1.11.(a) Assume V ≠ {0}, as otherwise the minimum and maximum distance is ‖b ‖. Given any 0 ≠ v ∈ V , by the triangle inequality, ‖b − tv ‖ ≥ | t | ‖v ‖ − ‖b ‖ → ∞ as t → ∞, and hence there is no maximum distance.
(b) Maximize the distance from a point to a closed, bounded (compact) subset of R^n, e.g., the unit sphere ‖v ‖ = 1. For example, the maximal distance between the point ( 1, 1 )^T and the unit circle x^2 + y^2 = 1 is 1 + √2, with x⋆ = ( − 1/√2, − 1/√2 )^T being the farthest point on the circle.
4.2.1. x = 1/2, y = 1/2, z = −2 with f(x, y, z) = − 3/2. This is the global minimum because the coefficient matrix [ 1 1 0 ; 1 3 1 ; 0 1 1 ] is positive definite.
4.2.2. At the point x⋆ = − 1/11, y⋆ = 5/22.
4.2.3. (a) Minimizer: x = − 2/3, y = 1/6; minimum value: − 4/3. (b) Minimizer: x = 2/9, y = 2/9; minimum value: 32/9. (c) No minimum. (d) Minimizer: x = − 1/2, y = −1, z = 1; minimum value: − 5/4. (e) No minimum. (f ) No minimum. (g) Minimizer: x = 7/5, y = − 4/5, z = 1/5, w = 2/5; minimum value: − 8/5.
4.2.4. (a) | b | < 2; (b) A = [ 1 b ; b 4 ] = [ 1 0 ; b 1 ] [ 1 0 ; 0 4 − b^2 ] [ 1 b ; 0 1 ]; (c) If | b | < 2, the minimum value is − 1/(4 − b^2), achieved at x⋆ = − b/(4 − b^2), y⋆ = 1/(4 − b^2). When | b | ≥ 2, the minimum is −∞.
4.2.5.
(a) p(x) = 4x^2 − 24xy + 45y^2 + x − 4y + 3; minimizer: x⋆ = ( 1/24, 1/18 )^T ≈ ( .0417, .0556 )^T; minimum value: p(x⋆) = 419/144 ≈ 2.9097.
(b) p(x) = 3x^2 + 4xy + y^2 − 8x − 2y; no minimizer since K is not positive definite.
(c) p(x) = 3x^2 − 2xy + 2xz + 2y^2 − 2yz + 3z^2 − 2x + 4z − 3; minimizer: x⋆ = ( 7/12, − 1/6, − 11/12 )^T ≈ ( .5833, −.1667, −.9167 )^T; minimum value: p(x⋆) = − 65/12 ≈ −5.4167.
(d) p(x) = x^2 + 2xy + 2xz + 2y^2 − 2yz + z^2 + 6x + 2y − 4z + 1; no minimizer since K is not positive definite.
(e) p(x) = x^2 + 2xy + 2y^2 + 2yz + 3z^2 + 2zw + 4w^2 + 2x − 4y + 6z − 8w; minimizer: x⋆ = (−8, 7, −4, 2 )^T; minimum: p(x⋆) = −42.
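Writing p(x) = x^T K x − 2 x^T f + c, the minimizer solves K x⋆ = f and the minimum value is c − x⋆ · f. The following exact-arithmetic sketch (an illustration, not the manual's text) reproduces part (a):

```python
from fractions import Fraction as F

# Part (a): p = 4x^2 - 24xy + 45y^2 + x - 4y + 3, so
# K = [[4,-12],[-12,45]], f = (-1/2, 2), c = 3.
K = [[F(4), F(-12)], [F(-12), F(45)]]
f = [F(-1, 2), F(2)]
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
x = (K[1][1] * f[0] - K[0][1] * f[1]) / det   # Cramer's rule for K x* = f
y = (K[0][0] * f[1] - K[1][0] * f[0]) / det
pmin = F(3) - (x * f[0] + y * f[1])           # p(x*) = c - x* . f
print(x, y, pmin)   # 1/24 1/18 419/144
```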
4.2.6. n = 2: minimizer x⋆ = (− 1/6, − 1/6 )^T; minimum value − 1/6.
n = 3: minimizer x⋆ = (− 5/28, − 3/14, − 5/28 )^T; minimum value − 2/7.
n = 4: minimizer x⋆ = (− 2/11, − 5/22, − 5/22, − 2/11 )^T; minimum value − 9/22.
4.2.7. (a) maximizer: x⋆ = ( 10/11, 3/11 )^T; maximum value: p(x⋆) = 16/11. (b) There is no maximum, since the coefficient matrix is not negative definite.
4.2.8. False. Even in the scalar case, p1(x) = x^2 has minimum at x1⋆ = 0, while p2(x) = x^2 − 2x has minimum at x2⋆ = 1, but the minimum of p1(x) + p2(x) = 2x^2 − 2x is at x⋆ = 1/2 ≠ x1⋆ + x2⋆.
♦ 4.2.9. Let x⋆ = K^{−1} f be the minimizer. When c = 0, according to the third expression in (4.12), p(x⋆) = −(x⋆)^T K x⋆ ≤ 0 because K is positive definite. The minimum value is 0 if and only if x⋆ = 0, which occurs if and only if f = 0.
4.2.10. First, using Exercise 3.4.20, we rewrite q(x) = x^T K x where K = K^T is symmetric. If K is positive definite or positive semi-definite, then the minimum value is 0, attained when x⋆ = 0 (and at other points in the semi-definite case). Otherwise, there is at least one vector v for which q(v) = v^T K v = a < 0. Then q(tv) = t^2 a can be made arbitrarily large and negative for t ≫ 0; in this case, there is no minimum value.
4.2.11. If and only if f = 0 and the function is constant, in which case every x is a minimizer.
♦ 4.2.12. p(x) has a maximum if and only if −p(x) has a minimum. Thus, we require either that K is negative definite, or negative semi-definite with f ∈ rng K. The maximizer x⋆ is obtained by solving K x⋆ = f , and the maximum value p(x⋆) is given as before by any of the expressions in (4.12).
4.2.13. The complex numbers do not have an ordering, i.e., an inequality z < w doesn’t make any sense. Thus, there is no “minimum” of a set of complex numbers. (One can, of course, minimize the modulus of a set of complex numbers, but this places us back in the situation of minimizing a real-valued function.)
4.3.1. Closest point: ( 6/7, 38/35, 36/35 )^T ≈ ( .85714, 1.08571, 1.02857 )^T; distance: 1/√35 ≈ .16903.
4.3.2.(a) Closest point: ( .8343, 1.0497, 1.0221 )^T; distance: .2575.
(b) Closest point: ( .8571, 1.0714, 1.0714 )^T; distance: .2673.
4.3.3. ( 7/9, 8/9, 11/9 )^T ≈ ( .7778, .8889, 1.2222 )^T.
4.3.4.
(a) Closest point: ( 7/4, 7/4, 7/4, 7/4 )^T; distance: √(11/4).
(b) Closest point: ( 2, 2, 3/2, 3/2 )^T; distance: √(5/2).
(c) Closest point: ( 3, 1, 2, 0 )^T; distance: 1.
(d) Closest point: ( 5/4, − 3/4, 1/4, − 3/4 )^T; distance: 7/2.
4.3.5. Since the vectors are linearly dependent, one must first reduce to a basis consisting of the first two. The closest point is ( 1/2, 1/2, 1/2, 1/2 )^T and the distance is 3.
4.3.6.
(i) 4.3.4: (a) Closest point: ( 3/2, 3/2, 3/2, 3/2 )^T; distance: √(7/4).
(b) Closest point: ( 5/3, 5/3, 4/3, 4/3 )^T; distance: √(5/3).
(c) Closest point: ( 3, 1, 2, 0 )^T; distance: 1.
(d) Closest point: ( 2/3, − 1/6, − 1/3, − 1/3 )^T; distance: 7/√6.
4.3.5: Closest point: ( 3/11, 13/22, − 4/11, − 1/22 )^T ≈ ( .2727, .5909, −.3636, −.0455 )^T; distance: √(155/22) ≈ 2.6543.
(ii) 4.3.4: (a) Closest point: ( 25/14, 25/14, 25/14, 25/14 )^T ≈ ( 1.7857, 1.7857, 1.7857, 1.7857 )^T; distance: √(215/14) ≈ 3.9188.
(b) Closest point: ( 66/35, 66/35, 59/35, 59/35 )^T ≈ ( 1.8857, 1.8857, 1.6857, 1.6857 )^T; distance: √(534/35) ≈ 3.9060.
(c) Closest point: ( 28/9, 11/9, 16/9, 0 )^T ≈ ( 3.1111, 1.2222, 1.7778, 0 )^T; distance: √(32/9) ≈ 1.8856.
(d) Closest point: ( 3/2, −1, 0, − 1/2 )^T; distance: √42.
4.3.5: Closest point: ( 26/259, − 107/259, − 292/259, 159/259 )^T ≈ ( .1004, −.4131, −1.1274, .6139 )^T; distance: 8√(143/259) ≈ 5.9444.
4.3.7. v = ( 6/5, 3/5, 3/2, 3/2 )^T = ( 1.2, .6, 1.5, 1.5 )^T.
4.3.8. (a) √(8/3); (b) 7/√6.
♥ 4.3.9.(a) P^2 = A(A^T A)^{−1} A^T A(A^T A)^{−1} A^T = A(A^T A)^{−1} A^T = P .
(b) (i) [ 1/2 −1/2 ; −1/2 1/2 ], (ii) [ 1 0 ; 0 1 ], (iii) [ 1/6 1/3 −1/6 ; 1/3 2/3 −1/3 ; −1/6 −1/3 1/6 ], (iv) [ 5/6 −1/6 −1/3 ; −1/6 5/6 −1/3 ; −1/3 −1/3 1/3 ].
(c) P^T = (A(A^T A)^{−1} A^T )^T = A((A^T A)^T )^{−1} A^T = A(A^T A)^{−1} A^T = P .
(d) Given v = P b ∈ rng P , then v = Ax where x = (A^T A)^{−1} A^T b, and so v ∈ rng A. Conversely, if v = Ax ∈ rng A, then we can write v = P b ∈ rng P where b solves the linear system A^T b = (A^T A)x ∈ R^n, which has a solution since rank A^T = n and so rng A^T = R^n.
(e) According to the formulas in the section, the closest point is w = Ax where x = K^{−1} A^T b = (A^T A)^{−1} A^T b, and so w = A(A^T A)^{−1} A^T b = P b.
(f ) If A is nonsingular, then (A^T A)^{−1} = A^{−1} A^{−T}, hence P = A(A^T A)^{−1} A^T = A A^{−1} A^{−T} A^T = I . In this case, the columns of A span R^n and the closest point to any b ∈ R^n is b = P b itself.
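The idempotence and symmetry established in (a) and (c) are easy to confirm on a concrete projection matrix. For a single column a, the formula collapses to P = a a^T/(a^T a); the sketch below uses a hypothetical column a = ( 1, −1 )^T, which yields the matrix appearing in (b)(i):

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)]
            for row in A]

# Rank-one case of P = A (A^T A)^{-1} A^T:  P = a a^T / (a^T a).
a = [F(1), F(-1)]                        # hypothetical column
n = sum(x * x for x in a)
P = [[x * y / n for y in a] for x in a]  # [[1/2, -1/2], [-1/2, 1/2]]
print(matmul(P, P) == P)                 # idempotent: True
print([list(col) for col in zip(*P)] == P)   # symmetric: True
```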
♦ 4.3.10.(a) The quadratic function to be minimized is
p(x) = Σ_{i=1}^n ‖x − ai ‖^2 = n ‖x ‖^2 − 2 x · ( Σ_{i=1}^n ai ) + Σ_{i=1}^n ‖ai ‖^2,
which has the form (4.22) with K = n I , f = Σ_{i=1}^n ai, c = Σ_{i=1}^n ‖ai ‖^2. Therefore, the minimizing point is x = K^{−1} f = (1/n) Σ_{i=1}^n ai, which is the center of mass of the points.
(b) (i) (− 1/2, 4 ), (ii) ( 1/3, 1/3 ), (iii) (− 1/4, 3/4 ).
4.3.11. In general, for the norm based on the positive definite matrix C, the quadratic function to be minimized is
p(x) = Σ_{i=1}^n ‖x − ai ‖^2 = n ‖x ‖^2 − 2n 〈x , b 〉 + nc = n ( x^T C x − 2 x^T C b + c ),
where b = (1/n) Σ_{i=1}^n ai is the center of mass and c = (1/n) Σ_{i=1}^n ‖ai ‖^2. Therefore, the minimizing point is still the center of mass: x = C^{−1} C b = b. Thus, all answers are the same as in Exercise 4.3.10.
4.3.12. The Cauchy–Schwarz inequality guarantees this.
♦ 4.3.13.
‖v − b ‖^2 = ‖Ax − b ‖^2 = (Ax − b)^T C (Ax − b) = (x^T A^T − b^T ) C (Ax − b)
= x^T A^T C A x − x^T A^T C b − b^T C A x + b^T C b = x^T A^T C A x − 2 x^T A^T C b + b^T C b
= x^T K x − 2 x^T f + c,
where the scalar quantity x^T A^T C b = (x^T A^T C b)^T = b^T C A x since C^T = C.
4.3.14. (a) x = 1/15, y = 41/45; (b) x = − 12/5, y = − 8/21; (c) u = 2/3, v = 5/3, w = 1; (d) x = 1/3, y = 2, z = 3/4; (e) x1 = 1/3, x2 = 2, x3 = − 1/3, x4 = − 4/3.
4.3.15. (a) 1/2, (b) ( 8/5, 28/65 )^T ≈ ( 1.6, .4308 )^T, (c) ( 2/5, 1/5, 0 )^T, (d) ( 227/941, 304/941 )^T ≈ ( .2412, .3231 )^T, (e) ( .0414, −.0680, −.0990 )^T.
4.3.16. The solution is ( 1, 0, 3 )^T. If A is nonsingular, then the least squares solution to the system Ax = b is given by x⋆ = K^{−1} f = (A^T A)^{−1} A^T b = A^{−1} A^{−T} A^T b = A^{−1} b, which coincides with the ordinary solution.
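That coincidence — the normal equations reproducing the exact solution when A is nonsingular — can be confirmed on a small hypothetical system (an illustrative sketch, not the exercise's data):

```python
from fractions import Fraction as F

def solve2(M, r):
    """Solve a 2x2 system M x = r by Cramer's rule, in exact rationals."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * r[0] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]

# Hypothetical nonsingular A and right-hand side b:
A = [[F(2), F(1)], [F(1), F(3)]]
b = [F(4), F(7)]
AT_A = [[F(5), F(5)], [F(5), F(10)]]   # A^T A, computed by hand
AT_b = [F(15), F(25)]                  # A^T b
# Normal equations (A^T A) x = A^T b give the same answer as A x = b:
print(solve2(AT_A, AT_b) == solve2(A, b))   # True
```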
4.3.17. The solution is x⋆ = (−1, 2, 3 )^T. The least squares error is 0 because b ∈ rng A and so x⋆ is an exact solution.
♦ 4.3.18.(a) This follows from Exercise 3.4.31 since f ∈ corng A = rng K.(b) This follows from Theorem 4.4.(c) Because if z ∈ ker A = ker K, then x + z is also a solution to the normal equations, and
has the same minimum value for the least squares error.
4.4.1. (a) y = 12/7 + 12/7 t = 1.7143 (1 + t); (b) y = 1.9 − 1.1 t; (c) y = −1.4 + 1.9 t.
4.4.2. (a) y = 30.6504 + 2.9675 t; (b) [Plot of the data together with the least squares line.] (c) profit: $179,024; (d) profit: $327,398.
4.4.3. (a) y = 3.9227 t− 7717.7; (b) $147, 359 and $166, 973.
♥ 4.4.4. (a) y = 2.2774 t − 4375; (b) 179.75 and 191.14; (c) y = e^{.0183 t − 31.3571}, with estimates 189.79 and 207.98. (d) The linear model has a smaller least squares error between its predictions and the data, 6.4422 versus 10.4470 for the exponential model, and also a smaller maximal error, namely 4.3679 versus 6.1943.
4.4.5. Assuming a linear increase in temperature, the least squares fit is y = 71.6 + .405 t, whichequals 165 at t = 230.62 minutes, so you need to wait another 170.62 minutes, just underthree hours.
4.4.6.(a) The least squares exponential is y = e4.6051−.1903 t and, at t = 10, y = 14.9059.
(b) Solving e4.6051−.1903 t = .01, we find t = 48.3897 ≈ 49 days.
4.4.7. (a) The least squares exponential is y = e2.2773−.0265 t. The half-life is log 2/.0265 =
26.1376 days. (b) Solving e2.2773−.0265 t = .01, we find t = 259.5292 ≈ 260 days.
4.4.8.(a) The least squares exponential is y = e.0132 t−20.6443, giving the population values (in
millions) y(2000) = 296, y(2010) = 337, y(2050) = 571.
(b) The revised least squares exponential is y = e.0127 t−19.6763, giving a smaller predictedvalue of y(2050) = 536.
4.4.9.
(a) The sample matrix for the functions 1, x, y is A = [ 1 1 1 ; 1 1 2 ; 1 2 1 ; 1 2 2 ; 1 3 2 ; 1 3 4 ], while z = ( 3, 6, 11, −2, 0, 3 )^T is the data vector. The least squares solution to the normal equations A^T A x = A^T z for x = ( a, b, c )^T gives the plane z = 6.9667 − .8 x − .9333 y.
(b) Every plane going through ( 2, 2, 0 )^T has an equation z = a(x − 2) + b(y − 2), i.e., a linear combination of the functions x − 2, y − 2. The sample matrix is A = [ −1 −1 ; −1 0 ; 0 −1 ; 0 0 ; 1 0 ; 1 2 ], and the least squares solution gives the plane
z = − 4/5 (x − 2) − 14/15 (y − 2) = 3.4667 − .8 x − .9333 y.
♦ 4.4.10. For two data points t1 = a, t2 = b, the mean of the squares is (a² + b²)/2, while the square of the mean is (a² + 2ab + b²)/4, and these are equal if and only if a = b. (For more data points, there is a single quadratic condition for equality.) Similarly, if y1 = p, y2 = q, then the mean of the products t_i y_i is (ap + bq)/2, while the product of the means is (a + b)(p + q)/4. Fixing a ≠ b, these are equal if and only if p = q.
♦ 4.4.11. (1/m) Σᵢ (tᵢ − t̄)² = (1/m) Σᵢ tᵢ² − (2 t̄/m) Σᵢ tᵢ + ((t̄)²/m) Σᵢ 1 = mean(t²) − 2 (t̄)² + (t̄)² = mean(t²) − (t̄)², where all sums run from i = 1 to m and mean(t²) denotes the average of the squares.
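The identity can be confirmed numerically for any sample; a quick check with arbitrary values:

```python
import numpy as np

# Check of 4.4.11: the mean squared deviation equals the mean of the
# squares minus the square of the mean, for any data values t_i.
t = np.array([1.0, 2.0, 4.0, 7.0])
tbar = t.mean()
lhs = ((t - tbar) ** 2).mean()
rhs = (t ** 2).mean() - tbar ** 2
print(lhs, rhs)
```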
4.4.12.
(a) p(t) = −(1/5)(t − 2) + (t + 3) = 17/5 + (4/5) t,
(b) p(t) = (1/3)(t − 1)(t − 3) − (1/4) t(t − 3) + (1/24) t(t − 1) = 1 − (5/8) t + (1/8) t²,
(c) p(t) = (1/2) t(t − 1) − 2(t − 1)(t + 1) − (1/2)(t + 1) t = 2 − t − 2 t²,
(d) p(t) = (1/2) t(t − 2)(t − 3) − 2 t(t − 1)(t − 3) + (3/2) t(t − 1)(t − 2) = t²,
(e) p(t) = −(1/24)(t + 1) t(t − 1)(t − 2) + (1/3)(t + 2) t(t − 1)(t − 2) + (1/2)(t + 2)(t + 1)(t − 1)(t − 2) − (1/6)(t + 2)(t + 1) t(t − 2) + (1/8)(t + 2)(t + 1) t(t − 1) = 2 + (5/3) t − (13/4) t² − (1/6) t³ + (3/4) t⁴.
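The interpolants above are built from the Lagrange basis polynomials. A short sketch evaluating the interpolant of part (d), through (0, 0), (1, 1), (2, 4), (3, 9), which must reproduce p(t) = t²:

```python
# Lagrange interpolation, as used throughout 4.4.12.
def lagrange(t, nodes, values):
    """Evaluate the Lagrange interpolating polynomial at t."""
    total = 0.0
    for k, (tk, yk) in enumerate(zip(nodes, values)):
        basis = 1.0
        for j, tj in enumerate(nodes):
            if j != k:
                basis *= (t - tj) / (tk - tj)
        total += yk * basis
    return total

nodes = [0.0, 1.0, 2.0, 3.0]
values = [v ** 2 for v in nodes]      # data from part (d): y = t^2
print(lagrange(1.5, nodes, values))   # 2.25, since the interpolant is t^2
```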
4.4.13.
(a) y = 2 t − 7; [graph]
(b) y = t² + 3 t + 6; [graph]
(c) y = −2 t − 1; [graph]
(d) y = −t³ + 2 t² − 1; [graph]
(e) y = t⁴ − t³ + 2 t − 3. [graph]
4.4.14. (a) y = −4/3 + 4 t.
(b) y = 2 + t². The error is zero because the parabola interpolates the points exactly.
4.4.15.
(a) y = 2.0384 + .6055 t; [graph]
(b) y = 2.14127 + .547248 t + .005374 t²; [graph]
(c) y = 2.63492 + .916799 t − .796131 t² + .277116 t³ − .034102 t⁴ + .001397 t⁵; [graph]
(d) The linear and quadratic models are practically identical, with almost the same least squares errors: .729045 and .721432, respectively. The fifth order interpolating polynomial, of course, has 0 least squares error since it goes exactly through the data points. On the other hand, it has to twist so much to do this that it is highly unlikely to be the correct theoretical model. Thus, one strongly suspects that this experimental data comes from a linear model.
4.4.16. The quadratic least squares polynomial is y = 4480.5 + 6.05 t − 1.825 t², and y = 1500 at 42.1038 seconds.
4.4.17. The quadratic least squares polynomial is y = 175.5357 + 56.3625 t − .7241 t², and y = 0 at 80.8361 seconds.
4.4.18. (a) p2(t) = 1 + t + (1/2) t², p4(t) = 1 + t + (1/2) t² + (1/6) t³ + (1/24) t⁴;
(b) The maximal error for p2(t) over the interval [0, 1] is .218282, while for p4(t) it is .0099485. The Taylor polynomials do a much better job near t = 0, but become significantly worse at larger values of t; the least squares approximants are better over the entire interval.
♥ 4.4.19. Note: In this solution t is measured in degrees! (Alternatively, one can set up and solve the problem in radians.) The error is the L∞ norm of the difference sin t − p(t) on the interval 0 ≤ t ≤ 60.
(a) p(t) = .0146352 t + .0243439; maximum error ≈ .0373.
(b) p(t) = −.000534346 + .0191133 t − .0000773989 t²; maximum error ≈ .00996.
(c) p(t) = (π/180) t; maximum error ≈ .181.
(d) p(t) = .0175934 t − 9.1214 × 10^{−6} t² − 7.25655 × 10^{−7} t³; maximum error ≈ .000649.
(e) p(t) = (π/180) t − (1/6)((π/180) t)³; maximum error ≈ .0102.
(f) The Taylor polynomials do a much better job at the beginning of the interval, but the least squares approximants are better over the entire range.
4.4.20. (a) For equally spaced data points, the least squares line is y = .1617 + .9263 t with a maximal error of .1617 on the interval 0 ≤ t ≤ 1. (b) The least squares quadratic polynomial is y = .0444 + 1.8057 t − .8794 t² with a slightly better maximal error of .1002. Interestingly, the line is closer to √t over a larger fraction of the interval than the quadratic polynomial, and only does significantly worse near t = 0.
Alternative solution: (a) The data points 0, 1/25, 1/16, 1/9, 1/4, 1 have exact square roots 0, 1/5, 1/4, 1/3, 1/2, 1. For these, we obtain the least squares line y = .1685 + .8691 t, with a maximal error of .1685. (b) The least squares quadratic polynomial is y = .0773 + 2.1518 t − 1.2308 t² with, strangely, a worse maximal error of .1509, although it does do better over a larger fraction of the interval.
4.4.21. p(t) = .9409 t + .4566 t² − .7732 t³ + .9330 t⁴. The graphs are very close over the interval 0 ≤ t ≤ 1; the maximum error is .005144 at t = .91916. The functions rapidly diverge above 1, with tan t → ∞ as t → π/2, whereas p(π/2) = 5.2882. The first graph is on the interval [0, 1] and the second on [0, π/2]. [graphs]
4.4.22. The exact value is log10 e ≈ .434294.
(a) p2(t) = −.4259 + .48835 t − .06245 t² and p2(e) = .440126;
(b) p3(t) = −.4997 + .62365 t − .13625 t² + .0123 t³ and p3(e) = .43585.
♦ 4.4.23. (a) q(t) = α + β t + γ t² where
α = −[ y0 t1 t2 (t2 − t1) + y1 t2 t0 (t0 − t2) + y2 t0 t1 (t1 − t0) ] / [ (t1 − t0)(t2 − t1)(t0 − t2) ],
β = [ y0 (t2² − t1²) + y1 (t0² − t2²) + y2 (t1² − t0²) ] / [ (t1 − t0)(t2 − t1)(t0 − t2) ],
γ = −[ y0 (t2 − t1) + y1 (t0 − t2) + y2 (t1 − t0) ] / [ (t1 − t0)(t2 − t1)(t0 − t2) ].
(b) The minimum is at t* = −β/(2γ), and m0 s1 − m1 s0 = −(1/2) β (t2 − t0), s1 − s0 = γ (t2 − t0).
(c) q(t*) = α − β²/(4γ)
= [ y0² (t2 − t1)⁴ + y1² (t0 − t2)⁴ + y2² (t1 − t0)⁴ − 2 y0 y1 (t2 − t1)²(t0 − t2)² − 2 y1 y2 (t0 − t2)²(t1 − t0)² − 2 y0 y2 (t2 − t1)²(t1 − t0)² ] / { 4 (t1 − t0)(t2 − t1)(t0 − t2) [ y0 (t2 − t1) + y1 (t0 − t2) + y2 (t1 − t0) ] }.
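The coefficient formulas in part (a) can be checked numerically: the quadratic q(t) = α + βt + γt² built from three data points must interpolate all three. A sketch; the sample points and data values below are arbitrary:

```python
# Numerical check of the coefficients in 4.4.23(a).
t0, t1, t2 = 0.0, 1.0, 3.0          # arbitrary distinct sample points
y0, y1, y2 = 2.0, -1.0, 5.0         # arbitrary data values

D = (t1 - t0) * (t2 - t1) * (t0 - t2)
alpha = -(y0*t1*t2*(t2 - t1) + y1*t2*t0*(t0 - t2) + y2*t0*t1*(t1 - t0)) / D
beta  =  (y0*(t2**2 - t1**2) + y1*(t0**2 - t2**2) + y2*(t1**2 - t0**2)) / D
gamma = -(y0*(t2 - t1) + y1*(t0 - t2) + y2*(t1 - t0)) / D

q = lambda t: alpha + beta*t + gamma*t**2
print(q(t0), q(t1), q(t2))          # reproduces y0, y1, y2
```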
♠ 4.4.24. When a < 2, the approximations are very good. At a = 2, a small amount of oscillation is noticed at the two ends of the intervals. When a > 2, the approximations are worthless for |x| > 2. The graphs are for n + 1 = 21 interpolation points, with a = 1.5, 2, 2.5, 3. [graphs]
Note: Choosing a large number of sample points, say n = 50, leads to an ill-conditioned matrix, and even the small values of a exhibit poor approximation properties near the ends of the intervals due to round-off errors when solving the linear system.
♠ 4.4.25. The conclusions are similar to those in Exercise 4.4.24, but here the critical value of a is around 2.4. The graphs are for n + 1 = 21 interpolation points, with a = 2, 2.5, 3, 4. [graphs]
4.4.26. x ∈ ker A if and only if p(t) vanishes at all the sample points: p(ti) = 0, i = 1, . . . , m.
4.4.27. (a) For example, the interpolating polynomial for the data (0, 0), (1, 1), (2, 2) is the straight line y = t.
(b) The Lagrange interpolating polynomials are zero at n of the sample points. But the only polynomial of degree < n that vanishes at n points is the zero polynomial, which does not interpolate the final nonzero data value.
♦ 4.4.28. (a) If p(x_k) = a0 + a1 x_k + a2 x_k² + · · · + a_n x_k^n = 0 for k = 1, . . . , n + 1, then V a = 0, where V is the (n + 1) × (n + 1) Vandermonde matrix with entries v_ij = x_j^{i−1} for i, j = 1, . . . , n + 1. According to Lemma 4.12, if the sample points are distinct, then V is a nonsingular matrix, and hence the only solution to the homogeneous linear system is a = 0, which implies p(x) ≡ 0.
(b) This is a special case of Exercise 2.3.37.
(c) This follows from part (b); linear independence of 1, x, x², . . . , x^n means that p(x) = a0 + a1 x + a2 x² + · · · + a_n x^n ≡ 0 if and only if a0 = · · · = a_n = 0.
♦ 4.4.29. This follows immediately from (4.51), since the determinant of a regular matrix is the product of the pivots, i.e., the diagonal entries of U. Every factor t_i − t_j appears once among the pivot entries.
4.4.30. Note that k_ij = 1 + x_i x_j + (x_i x_j)² + · · · + (x_i x_j)^{n−1} is the dot product of the ith and jth columns of the n × n Vandermonde matrix V = V(x1, . . . , x_n), and so K = V^T V is a Gram matrix. Moreover, V is nonsingular when the x_i's are distinct, which proves positive definiteness.
♥ 4.4.31.
(a) f′(x) ≈ [ f(x + h) − f(x − h) ] / (2h);
(b) f″(x) ≈ [ f(x + h) − 2 f(x) + f(x − h) ] / h²;
(c) f′(x) ≈ [ −f(x + 2h) + 4 f(x + h) − 3 f(x) ] / (2h);
(d) f′(x) ≈ [ −f(x + 2h) + 8 f(x + h) − 8 f(x − h) + f(x − 2h) ] / (12h),
f″(x) ≈ [ −f(x + 2h) + 16 f(x + h) − 30 f(x) + 16 f(x − h) − f(x − 2h) ] / (12h²),
f′′′(x) ≈ [ f(x + 2h) − 2 f(x + h) + 2 f(x − h) − f(x − 2h) ] / (2h³),
f^(iv)(x) ≈ [ f(x + 2h) − 4 f(x + h) + 6 f(x) − 4 f(x − h) + f(x − 2h) ] / h⁴.
(e) For f(x) = e^x at x = 0, using single precision arithmetic, we obtain the approximations (in each line, the seven values correspond, in order, to the formulas in parts (a)-(d)):
For h = .1: f′(x) ≈ 1.00166750019844, f″(x) ≈ 1.00083361116072, f′(x) ≈ .99640457071210, f′(x) ≈ .99999666269610, f″(x) ≈ .99999888789639, f′′′(x) ≈ 1.00250250140590, f^(iv)(x) ≈ 1.00166791722567.
For h = .01: f′(x) ≈ 1.00001666675000, f″(x) ≈ 1.00000833336050, f′(x) ≈ .99996641549580, f′(x) ≈ .99999999966665, f″(x) ≈ .99999999988923, f′′′(x) ≈ 1.00002500040157, f^(iv)(x) ≈ 1.00001665913362.
For h = .001: f′(x) ≈ 1.00000016666670, f″(x) ≈ 1.00000008336730, f′(x) ≈ .99999966641660, f′(x) ≈ .99999999999997, f″(x) ≈ 1.00000000002574, f′′′(x) ≈ 1.00000021522190, f^(iv)(x) ≈ .99969229415631.
For h = .0001: f′(x) ≈ 1.00000000166730, f″(x) ≈ 1.00000001191154, f′(x) ≈ .99999999666713, f′(x) ≈ .99999999999977, f″(x) ≈ .99999999171286, f′′′(x) ≈ .99998010138450, f^(iv)(x) ≈ −3.43719068489236.
When f(x) = tan x at x = 0, using single precision arithmetic:
For h = .1: f′(x) ≈ 1.00334672085451, f″(x) ≈ 3.505153914964787 × 10^{−16}, f′(x) ≈ .99314326416565, f′(x) ≈ .99994556862489, f″(x) ≈ 4.199852359823657 × 10^{−15}, f′′′(x) ≈ 2.04069133777138, f^(iv)(x) ≈ −1.144995345400589 × 10^{−12}.
For h = .01: f′(x) ≈ 1.00003333466672, f″(x) ≈ −2.470246229790973 × 10^{−15}, f′(x) ≈ .99993331466332, f′(x) ≈ .99999999466559, f″(x) ≈ −2.023971870815236 × 10^{−14}, f′′′(x) ≈ 2.00040006801198, f^(iv)(x) ≈ 7.917000388601991 × 10^{−10}.
For h = .001: f′(x) ≈ 1.00000033333347, f″(x) ≈ −3.497202527569243 × 10^{−15}, f′(x) ≈ .99999933333147, f′(x) ≈ .99999999999947, f″(x) ≈ 5.042978979811304 × 10^{−15}, f′′′(x) ≈ 2.00000400014065, f^(iv)(x) ≈ 1.010435775508413 × 10^{−7}.
For h = .0001: f′(x) ≈ 1.00000000333333, f″(x) ≈ 4.271860643001446 × 10^{−13}, f′(x) ≈ .99999999333333, f′(x) ≈ 1.00000000000000, f″(x) ≈ 9.625811347808891 × 10^{−13}, f′′′(x) ≈ 2.00000003874818, f^(iv)(x) ≈ −8.362027156624034 × 10^{−4}.
In most cases, the accuracy improves as the step size gets smaller, but not always. In particular, at the smallest step size, the approximation to the fourth derivative gets worse, indicating the increasing role played by round-off error.
(f) No — if the step size is too small, round-off error caused by dividing two very small quantities ruins the approximation.
(g) If n < k, the kth derivative of the degree n interpolating polynomial is identically 0.
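The difference quotients of parts (a)-(d) are easy to reproduce; the sketch below applies four of them to f(x) = e^x at x = 0 with h = .01, matching the corresponding entries of the table in part (e):

```python
import math

# Finite difference approximations from 4.4.31, for f(x) = e^x at x = 0,
# where every derivative equals 1.
f, x, h = math.exp, 0.0, 0.01

d1_centered = (f(x + h) - f(x - h)) / (2 * h)                      # (a)
d2_centered = (f(x + h) - 2 * f(x) + f(x - h)) / h**2              # (b)
d1_onesided = (-f(x + 2*h) + 4 * f(x + h) - 3 * f(x)) / (2 * h)    # (c)
d1_fourth = (-f(x + 2*h) + 8 * f(x + h)
             - 8 * f(x - h) + f(x - 2*h)) / (12 * h)               # (d)
print(d1_centered, d2_centered, d1_onesided, d1_fourth)
```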
♥ 4.4.32.
(a) Trapezoid Rule: ∫_a^b f(x) dx ≈ (1/2)(b − a) [ f(x0) + f(x1) ].
(b) Simpson's Rule: ∫_a^b f(x) dx ≈ (1/6)(b − a) [ f(x0) + 4 f(x1) + f(x2) ].
(c) Simpson's 3/8 Rule: ∫_a^b f(x) dx ≈ (1/8)(b − a) [ f(x0) + 3 f(x1) + 3 f(x2) + f(x3) ].
(d) Midpoint Rule: ∫_a^b f(x) dx ≈ (b − a) f(x0).
(e) Open Rule: ∫_a^b f(x) dx ≈ (1/2)(b − a) [ f(x0) + f(x1) ].
(f) (i) Exact: 1.71828; Trapezoid Rule: 1.85914; Simpson's Rule: 1.71886; Simpson's 3/8 Rule: 1.71854; Midpoint Rule: 1.64872; Open Rule: 1.67167.
(ii) Exact: 2.0000; Trapezoid Rule: 0.; Simpson's Rule: 2.0944; Simpson's 3/8 Rule: 2.04052; Midpoint Rule: 3.14159; Open Rule: 2.7207.
(iii) Exact: 1.0000; Trapezoid Rule: .859141; Simpson's Rule: .996735; Simpson's 3/8 Rule: .93804; Midpoint Rule: 1.06553; Open Rule: .964339.
(iv) Exact: 1.11145; Trapezoid Rule: 1.20711; Simpson's Rule: 1.10948; Simpson's 3/8 Rule: 1.11061; Midpoint Rule: 1.06066; Open Rule: 1.07845.
Note: For more details on numerical differentiation and integration, you are encouraged to consult a basic numerical analysis text, e.g., [10].
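The rules of parts (a)-(e) are one-liners; the sketch below evaluates four of them for the integral of e^x over [0, 1], reproducing the values listed in case (f)(i):

```python
import math

# Elementary quadrature rules from 4.4.32 applied to the integral of e^x
# over [0, 1], whose exact value is e - 1 = 1.71828...
a, b, f = 0.0, 1.0, math.exp

trapezoid = (b - a) / 2 * (f(a) + f(b))
simpson   = (b - a) / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))
simpson38 = (b - a) / 8 * (f(a) + 3 * f(a + (b - a) / 3)
                           + 3 * f(a + 2 * (b - a) / 3) + f(b))
midpoint  = (b - a) * f((a + b) / 2)
print(trapezoid, simpson, simpson38, midpoint)
```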
4.4.33. The sample matrix is A = [ 1 0; 0 1; −1 0 ]; the least squares solution to A x = y = ( 1, .5, .25 )^T gives g(t) = (3/8) cos π t + (1/2) sin π t.
4.4.34. g(t) = .9827 cosh t− 1.0923 sinh t.
4.4.35. (a) g(t) = .538642 e^t − .004497 e^(2t), (b) .735894. (c) The maximal error is .745159, which occurs at t = 3.66351.
(d) Now the least squares approximant is .58165 e^t − .0051466 e^(2t) − .431624; the least squares error has decreased to .486091, although the maximal error over the interval [0, 4] has increased to 1.00743, which occurs at t = 3.63383!
4.4.36.(a) 5 points: g(t) = −4.4530 cos t + 3.4146 sin t = 5.6115 cos(t− 2.4874);
9 points: g(t) = −4.2284 cos t + 3.6560 sin t = 5.5898 cos(t− 2.4287).
(b) 5 points: g(t) = −4.9348 cos t + 5.5780 sin t + 4.3267 cos 2 t + 1.0220 sin 2 t
= 4.4458 cos(t− .2320) + 7.4475 cos(2 t− 2.2952);
9 points: g(t) = −4.8834 cos t + 5.2873 sin t + 3.6962 cos 2 t + 1.0039 sin 2 t
= 3.8301 cos(t− .2652) + 7.1974 cos(2 t− 2.3165).
♥ 4.4.37.
(a) n = 1, k = 4: p(t) = .4172 + .4540 cos t; maximal error: .1722; [graph]
(b) n = 2, k = 8: p(t) = .4014 + .3917 cos t + .1288 cos 2t; maximal error: .0781; [graph]
(c) n = 2, k = 16: p(t) = .4017 + .389329 cos t + .1278 cos 2t; maximal error: .0812; [graph]
(d) n = 3, k = 16: p(t) = .4017 + .3893 cos t + .1278 cos 2t + .0537 cos 3t; maximal error: .0275; [graph]
(e) Because then, due to periodicity of the trigonometric functions, the columns of the sample matrix would be linearly dependent.
♥ 4.4.38. Since S_k(x_j) = { 1, j = k; 0, otherwise }, the same as the Lagrange polynomials, the coefficients are c_j = f(x_j). For each function and step size, we plot the sinc interpolant S(x) and a comparison with the graph of the function.
f(x) = x², h = .25, max error: .19078; [graphs]
f(x) = x², h = .1, max error: .160495; [graphs]
f(x) = x², h = .025, max error: .14591; [graphs]
f(x) = 1/2 − | x − 1/2 |, h = .25, max error: .05066; [graphs]
f(x) = 1/2 − | x − 1/2 |, h = .1, max error: .01877; [graphs]
f(x) = 1/2 − | x − 1/2 |, h = .025, max error: .004754; [graphs]
In the first case, the error is reasonably small except near the right end of the interval. In the second case, the approximation is much better; the larger errors occur near the corner of the function.
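The cardinal interpolation property quoted at the start of 4.4.38 is easy to verify, assuming the standard sinc function sinc(t) = sin(πt)/(πt) (the book's precise definition of S_k may differ in normalization):

```python
import math

# S_k(x) = sinc((x - x_k)/h) equals 1 at the node x_k and 0 at every other
# node x_j = j*h, so the interpolation coefficients are simply c_j = f(x_j).
def sinc(t):
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

h = 0.25
nodes = [k * h for k in range(5)]
row = [sinc((xj - nodes[2]) / h) for xj in nodes]  # S_2 evaluated at each node
print(row)
```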
4.4.39. (a) y = −.9231 + 3.7692 t, (b) The same interpolating parabola y = 2 + t². Note: When interpolating, the error is zero irrespective of the weights.
4.4.40. (a) .29 + .36 t, (b) 1.7565 + 1.6957 t, (c) −1.2308 + 1.9444 t, (d) 2.32 + .4143 t.
4.4.41. The weights are .3015, .1562, .0891, .2887, .2774, .1715. The weighted least squares plane has the equation z = 4.8680 − 1.6462 x + .2858 y.
♦ 4.4.42. When x = (A^T C A)^{−1} A^T C y is the least squares solution,
Error = ‖ y − A x ‖ = √( y^T C y − y^T C A (A^T C A)^{−1} A^T C y ).
4.4.43. (a) 3/7 + (9/14) t; (b) 9/28 + (9/7) t − (9/14) t²; (c) 24/91 + (180/91) t − (216/91) t² + (15/13) t³.
4.4.44. 1/8 + (27/20) t − (3/2) t²; the maximal error is 2/5, at t = ±1. [graph]
4.4.45. p1(t) = .11477 + .66444 t, p2(t) = −.024325 + 1.19575 t − .33824 t².
4.4.46. 1.00005 + .99845 t + .51058 t² + .13966 t³ + .069481 t⁴.
♥ 4.4.47. (a) 1.875 x² − .875 x, (b) 1.9420 x² − 1.0474 x + .0494, (c) 1.7857 x² − 1.0714 x + .1071. (d) The interpolating polynomial is the easiest to compute; it exactly coincides with the function at the interpolation points; the maximal error over the interval [0, 1] is .1728 at t = .8115. The least squares polynomial has a smaller maximal error of .1266 at t = .8018. The L² approximant does a better job on average across the interval, but its maximal error of .1786 at t = 1 is comparable to the quadratic interpolant.
4.4.48. g(x) = 2 sin x.
♦ 4.4.49. Form the n × n Gram matrix with entries k_ij = 〈 g_i , g_j 〉 = ∫_a^b g_i(x) g_j(x) w(x) dx and the vector f with entries f_i = 〈 f , g_i 〉 = ∫_a^b f(x) g_i(x) w(x) dx. The solution to the linear system K c = f then gives the required coefficients c = ( c1, c2, . . . , c_n )^T.
4.4.50.
(i) 3/28 − (15/14) t + (25/14) t² ≈ .10714 − 1.07143 t + 1.78571 t²; maximal error: 5/28 = .178571 at t = 1;
(ii) 2/7 − (25/14) t + (50/21) t² ≈ .28571 − 1.78571 t + 2.38095 t²; maximal error: 2/7 = .285714 at t = 0;
(iii) .0809 − .90361 t + 1.61216 t²; maximal error: .210524 at t = 1. Case (i) is the best.
♥ 4.4.51.
(a) ‖ f_a ‖₂² = ∫_{−∞}^{∞} a dx / (1 + a⁴ x²) = (1/a) tan^{−1}(a² x) |_{x = −∞}^{∞} = π/a.
(b) The maximum value of f_a(x) occurs at x = 0, where f_a(0) = √a = ‖ f_a ‖∞.
(c) The least squares error between f_a(x) and the zero function is ‖ f_a ‖₂ = √(π/a), which is small when a is large. But f_a(x) has a large maximum value and so is very far from zero near x = 0. Note that f_a(x) → 0 for all x ≠ 0 as a → ∞, but f_a(0) → ∞.
4.4.52. (a) z = x + y − 1/3, (b) z = (9/10)(x − y), (c) z = 4/π², a constant function.
4.4.53. p(x, y) ≡ 0.
Solutions — Chapter 5
5.1.1. (a) Orthogonal basis; (b) orthonormal basis; (c) not a basis; (d) basis; (e) orthogonal basis; (f) orthonormal basis.
5.1.2. (a) Basis; (b) orthonormal basis; (c) not a basis.
5.1.3. (a) Basis; (b) basis; (c) not a basis; (d) orthogonal basis; (e) orthonormal basis; (f ) basis.
5.1.4. 〈 e1 , e2 〉 = 〈 e1 , e3 〉 = 〈 e2 , e3 〉 = 0. Then e1, (1/√2) e2, (1/√3) e3 is an orthonormal basis.
5.1.5. (a) a = ±1; (b) a = ±√(2/3); (c) a = ±√(3/2).
5.1.6. a = 2b > 0.
5.1.7. (a) a = (3/2) b > 0; (b) no possible values, because they cannot be negative!
5.1.8. (1/2) x1 y1 + (1/8) x2 y2.
5.1.9. False. Consider the basis v1 = ( 1, 1, 0 )^T, v2 = ( 0, 1, 0 )^T, v3 = ( 0, 0, 1 )^T. Under the weighted inner product, 〈 v1 , v2 〉 = b > 0, since the coefficients a, b, c appearing in the inner product must be strictly positive.
♥ 5.1.10.
(a) By direct computation: u · v = u · w = 0.
(b) First, if w = c v, then we compute v × w = 0. Conversely, suppose v ≠ 0 (otherwise the result is trivial) and, in particular, v1 ≠ 0. Then v × w = 0 implies w_i = c v_i, i = 1, 2, 3, where c = w1/v1. The other cases, when v2 ≠ 0 and v3 ≠ 0, are handled in a similar fashion.
(c) If v, w are orthogonal and nonzero, then by Proposition 5.4, they are linearly independent, and so, by part (b), u = v × w is nonzero, and, by part (a), orthogonal to both. Thus, the three vectors are nonzero, mutually orthogonal, and so form an orthogonal basis of R³.
(d) Yes. In general, ‖ v × w ‖ = ‖ v ‖ ‖ w ‖ | sin θ |, where θ is the angle between them, and so when v, w are orthogonal unit vectors, ‖ v × w ‖ = ‖ v ‖ ‖ w ‖ = 1. This can also be shown by direct computation of ‖ v × w ‖ using orthogonality.
♦ 5.1.11. See Example 5.20.
♦ 5.1.12. We repeatedly use the identity sin²α + cos²α = 1 to simplify
〈 u1 , u2 〉 = −cos φ sin φ sin²θ + (−cos θ cos ψ sin φ − cos φ sin ψ)(cos θ cos ψ cos φ − sin φ sin ψ) + (cos ψ sin φ + cos θ cos φ sin ψ)(cos φ cos ψ − cos θ sin φ sin ψ) = 0.
By similar computations, 〈 u1 , u3 〉 = 〈 u2 , u3 〉 = 0, 〈 u1 , u1 〉 = 〈 u2 , u2 〉 = 〈 u3 , u3 〉 = 1.
♥ 5.1.13.
(a) The (i, j) entry of A^T K A is v_i^T K v_j = 〈 v_i , v_j 〉. Thus, A^T K A = I if and only if 〈 v_i , v_j 〉 = { 1, i = j; 0, i ≠ j }, and so the vectors form an orthonormal basis.
(b) According to part (a), orthonormality requires A^T K A = I, and so K = A^{−T} A^{−1} = (A A^T)^{−1} is the Gram matrix for A^{−1}, and K > 0 since A^{−1} is nonsingular. This also proves the uniqueness of the inner product.
(c) A = [ 1 2; 1 3 ], K = [ 10 −7; −7 5 ], with inner product 〈 v , w 〉 = v^T K w = 10 v1 w1 − 7 v1 w2 − 7 v2 w1 + 5 v2 w2;
(d) A = [ 1 1 1; 1 1 2; 1 2 3 ], K = [ 3 −2 0; −2 6 −3; 0 −3 2 ], with inner product 〈 v , w 〉 = v^T K w = 3 v1 w1 − 2 v1 w2 − 2 v2 w1 + 6 v2 w2 − 3 v2 w3 − 3 v3 w2 + 2 v3 w3.
5.1.14. One way to solve this is by direct computation. A more sophisticated approach is to apply the Cholesky factorization (3.70) to the inner product matrix, writing K = M^T M. Then 〈 v , w 〉 = v^T K w = v̂ · ŵ, where v̂ = M v, ŵ = M w. Therefore, v1, v2 form an orthonormal basis relative to 〈 v , w 〉 = v^T K w if and only if v̂1 = M v1, v̂2 = M v2 form an orthonormal basis for the dot product, and hence are of the form determined in Exercise 5.1.11. Using this we find:
(a) M = [ 1 0; 0 √2 ], so v1 = ( cos θ, (1/√2) sin θ )^T, v2 = ±( −sin θ, (1/√2) cos θ )^T, for any 0 ≤ θ < 2π.
(b) M = [ 1 −1; 0 1 ], so v1 = ( cos θ + sin θ, sin θ )^T, v2 = ±( cos θ − sin θ, cos θ )^T, for any 0 ≤ θ < 2π.
5.1.15. ‖ v + w ‖² = 〈 v + w , v + w 〉 = 〈 v , v 〉 + 2 〈 v , w 〉 + 〈 w , w 〉 = ‖ v ‖² + ‖ w ‖² if and only if 〈 v , w 〉 = 0. The vector v + w is the hypotenuse of the right triangle with sides v, w.
5.1.16. 〈 v1 + v2 , v1 − v2 〉 = ‖ v1 ‖² − ‖ v2 ‖² = 0 by assumption. Moreover, since v1, v2 are linearly independent, neither v1 − v2 nor v1 + v2 is zero, and hence Theorem 5.5 implies that they form an orthogonal basis for the two-dimensional vector space V.
5.1.17. By orthogonality, the Gram matrix is a k × k diagonal matrix whose diagonal entries are ‖ v1 ‖², . . . , ‖ v_k ‖². Since these are all nonzero, the Gram matrix is nonsingular. An alternative proof combines Propositions 3.30 and 5.4.
5.1.18.
(a) Bilinearity: for a, b constant,
〈 a p + b p̃ , q 〉 = ∫₀¹ t ( a p(t) + b p̃(t) ) q(t) dt = a ∫₀¹ t p(t) q(t) dt + b ∫₀¹ t p̃(t) q(t) dt = a 〈 p , q 〉 + b 〈 p̃ , q 〉.
The second bilinearity condition 〈 p , a q + b q̃ 〉 = a 〈 p , q 〉 + b 〈 p , q̃ 〉 follows similarly, or is a consequence of symmetry, as in Exercise 3.1.9.
Symmetry: 〈 q , p 〉 = ∫₀¹ t q(t) p(t) dt = ∫₀¹ t p(t) q(t) dt = 〈 p , q 〉.
Positivity: 〈 p , p 〉 = ∫₀¹ t p(t)² dt ≥ 0, since t ≥ 0 and p(t)² ≥ 0 for all 0 ≤ t ≤ 1. Moreover, since p(t) is continuous, so is t p(t)². Therefore, the integral can equal 0 if and only if t p(t)² ≡ 0 for all 0 ≤ t ≤ 1, and hence p(t) ≡ 0.
(b) p(t) = c ( 1 − (3/2) t ) for any c.
(c) p1(t) = √2, p2(t) = 4 − 6 t;
(d) p1(t) = √2, p2(t) = 4 − 6 t, p3(t) = √2 ( 3 − 12 t + 10 t² ).
121
5.1.19. Since ∫_{−π}^{π} sin x cos x dx = 0, the functions cos x and sin x are orthogonal under the L² inner product on [−π, π]. Moreover, they span the solution space of the differential equation, and hence, by Theorem 5.5, form an orthogonal basis.
5.1.20. They form a basis, but not an orthogonal basis, since 〈 e^{x/2} , e^{−x/2} 〉 = ∫₀¹ e^{x/2} e^{−x/2} dx = 1. An orthogonal basis is e^{x/2}, e^{−x/2} − e^{x/2}/(e − 1).
5.1.21.
(a) We compute 〈 v1 , v2 〉 = 〈 v1 , v3 〉 = 〈 v2 , v3 〉 = 0 and ‖ v1 ‖ = ‖ v2 ‖ = ‖ v3 ‖ = 1.
(b) 〈 v , v1 〉 = 7/5, 〈 v , v2 〉 = 11/13, and 〈 v , v3 〉 = −37/65, and so ( 1, 1, 1 )^T = (7/5) v1 + (11/13) v2 − (37/65) v3.
(c) (7/5)² + (11/13)² + (−37/65)² = 3 = ‖ v ‖².
♥ 5.1.22.
(a) By direct computation: v1 · v2 = 0, v1 · v3 = 0, v2 · v3 = 0.
(b) v = 2 v1 − (1/2) v2 + (1/2) v3, since v1 · v/‖ v1 ‖² = 6/3 = 2, v2 · v/‖ v2 ‖² = −3/6 = −1/2, v3 · v/‖ v3 ‖² = 1/2.
(c) ( v1 · v/‖ v1 ‖ )² + ( v2 · v/‖ v2 ‖ )² + ( v3 · v/‖ v3 ‖ )² = ( 6/√3 )² + ( −3/√6 )² + ( 1/√2 )² = 14 = ‖ v ‖².
(d) The orthonormal basis is u1 = ( 1/√3, 1/√3, 1/√3 )^T, u2 = ( 1/√6, 1/√6, −2/√6 )^T, u3 = ( −1/√2, 1/√2, 0 )^T.
(e) v = 2√3 u1 − (3/√6) u2 + (1/√2) u3, and ( 2√3 )² + ( −3/√6 )² + ( 1/√2 )² = 14 = ‖ v ‖².
5.1.23.
(a) Because 〈 v1 , v2 〉 = v1^T K v2 = 0.
(b) v = (〈 v , v1 〉/‖ v1 ‖²) v1 + (〈 v , v2 〉/‖ v2 ‖²) v2 = (7/3) v1 − (1/3) v2.
(c) ( 〈 v , v1 〉/‖ v1 ‖ )² + ( 〈 v , v2 〉/‖ v2 ‖ )² = ( 7/√3 )² + ( −5/√15 )² = 18 = ‖ v ‖².
(d) u1 = ( 1/√3, 1/√3 )^T, u2 = ( −2/√15, 1/√15 )^T.
(e) v = 〈 v , u1 〉 u1 + 〈 v , u2 〉 u2 = (7/√3) u1 − (√15/3) u2; ‖ v ‖² = 18 = ( 7/√3 )² + ( −√15/3 )².
5.1.24. Consider the non-orthogonal basis v1 = ( 1, 1 )^T, v2 = ( 0, 1 )^T. We have v = ( 1, 0 )^T = v1 − v2, but ‖ v ‖² = 1 ≠ 1² + (−1)².
5.1.25. 〈 1 , p1 〉/‖ p1 ‖² = 1, 〈 1 , p2 〉/‖ p2 ‖² = 0, 〈 1 , p3 〉/‖ p3 ‖² = 0, so 1 = p1(x) + 0 p2(x) + 0 p3(x) = p1(x).
〈 x , p1 〉/‖ p1 ‖² = 1/2, 〈 x , p2 〉/‖ p2 ‖² = 1, 〈 x , p3 〉/‖ p3 ‖² = 0, so x = (1/2) p1(x) + p2(x).
〈 x² , p1 〉/‖ p1 ‖² = 1/3, 〈 x² , p2 〉/‖ p2 ‖² = 1, 〈 x² , p3 〉/‖ p3 ‖² = 1, so x² = (1/3) p1(x) + p2(x) + p3(x).
5.1.26.
(a) 〈 P0 , P1 〉 = ∫_{−1}^{1} t dt = 0, 〈 P0 , P2 〉 = ∫_{−1}^{1} ( t² − 1/3 ) dt = 0, 〈 P0 , P3 〉 = ∫_{−1}^{1} ( t³ − (3/5) t ) dt = 0, 〈 P1 , P2 〉 = ∫_{−1}^{1} t ( t² − 1/3 ) dt = 0, 〈 P1 , P3 〉 = ∫_{−1}^{1} t ( t³ − (3/5) t ) dt = 0, 〈 P2 , P3 〉 = ∫_{−1}^{1} ( t² − 1/3 )( t³ − (3/5) t ) dt = 0.
(b) 1/√2, √(3/2) t, √(5/2) ( (3/2) t² − 1/2 ), √(7/2) ( (5/2) t³ − (3/2) t );
(c) 〈 t³ , P0 〉/‖ P0 ‖² = 0, 〈 t³ , P1 〉/‖ P1 ‖² = 3/5, 〈 t³ , P2 〉/‖ P2 ‖² = 0, 〈 t³ , P3 〉/‖ P3 ‖² = 1, so t³ = (3/5) P1(t) + P3(t).
5.1.27.
(a) 〈 P0 , P1 〉 = ∫₀¹ ( t − 2/3 ) t dt = 0, 〈 P0 , P2 〉 = ∫₀¹ ( t² − (6/5) t + 3/10 ) t dt = 0, 〈 P1 , P2 〉 = ∫₀¹ ( t − 2/3 )( t² − (6/5) t + 3/10 ) t dt = 0.
(b) √2, 6 t − 4, √6 ( 3 − 12 t + 10 t² ).
(c) 〈 t² , P0 〉/‖ P0 ‖² = 1/2, 〈 t² , P1 〉/‖ P1 ‖² = 6/5, 〈 t² , P2 〉/‖ P2 ‖² = 1, so t² = (1/2) P0(t) + (6/5) P1(t) + P2(t).
5.1.28. (a) cos²x = 1/2 + (1/2) cos 2x, (b) cos x sin x = (1/2) sin 2x, (c) sin³x = (3/4) sin x − (1/4) sin 3x, (d) cos²x sin³x = (1/8) sin x + (1/16) sin 3x − (1/16) sin 5x, (e) cos⁴x = 3/8 + (1/2) cos 2x + (1/8) cos 4x.
5.1.29. 1/√(2π), (cos x)/√π, (sin x)/√π, . . . , (cos nx)/√π, (sin nx)/√π.
♦ 5.1.30. 〈 e^{i kx} , e^{i lx} 〉 = (1/2π) ∫_{−π}^{π} e^{i kx} e^{−i lx} dx = (1/2π) ∫_{−π}^{π} e^{i (k−l)x} dx = { 1, k = l; 0, k ≠ l }.
♦ 5.1.31.
∫_{−π}^{π} cos kx cos lx dx = ∫_{−π}^{π} (1/2) [ cos(k − l)x + cos(k + l)x ] dx = { 0, k ≠ l; 2π, k = l = 0; π, k = l ≠ 0 },
∫_{−π}^{π} sin kx sin lx dx = ∫_{−π}^{π} (1/2) [ cos(k − l)x − cos(k + l)x ] dx = { 0, k ≠ l; π, k = l ≠ 0 },
∫_{−π}^{π} cos kx sin lx dx = ∫_{−π}^{π} (1/2) [ sin(k − l)x + sin(k + l)x ] dx = 0.
♦ 5.1.32. Given v = a1 v1 + · · · + a_n v_n, we have 〈 v , v_i 〉 = 〈 a1 v1 + · · · + a_n v_n , v_i 〉 = a_i ‖ v_i ‖², since, by orthogonality, 〈 v_j , v_i 〉 = 0 for all j ≠ i. This proves (5.7). Then, to prove (5.8),
‖ v ‖² = 〈 a1 v1 + · · · + a_n v_n , a1 v1 + · · · + a_n v_n 〉 = Σ_{i,j=1}^n a_i a_j 〈 v_i , v_j 〉 = Σ_{i=1}^n a_i² ‖ v_i ‖² = Σ_{i=1}^n ( 〈 v , v_i 〉/‖ v_i ‖² )² ‖ v_i ‖² = Σ_{i=1}^n ( 〈 v , v_i 〉/‖ v_i ‖ )².
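Formulas (5.7)-(5.8) are easy to test on a concrete orthogonal (not orthonormal) basis; the one below is the unnormalized basis underlying Exercise 5.1.22, with v = ( 1, 2, 3 )^T:

```python
import numpy as np

# Check of (5.7)-(5.8): a_i = <v, v_i>/||v_i||^2 recovers v, and the
# Pythagorean-type norm formula gives ||v||^2 = 14.
v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([1.0, 1.0, -2.0])
v3 = np.array([-1.0, 1.0, 0.0])
v  = np.array([1.0, 2.0, 3.0])

coeffs = [v @ w / (w @ w) for w in (v1, v2, v3)]
recon = sum(c * w for c, w in zip(coeffs, (v1, v2, v3)))
norm_sq = sum((v @ w) ** 2 / (w @ w) for w in (v1, v2, v3))
print(coeffs, recon, norm_sq)
```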
5.2.1.
(a) (1/√2) ( 1, 0, 1 )^T, ( 0, 1, 0 )^T, (1/√2) ( −1, 0, 1 )^T;
(b) (1/√2) ( 1, 1, 0 )^T, (1/√6) ( −1, 1, −2 )^T, (1/√3) ( 1, −1, −1 )^T;
(c) (1/√14) ( 1, 2, 3 )^T, (1/√3) ( 1, 1, −1 )^T, (1/√42) ( −5, 4, −1 )^T.
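All of the answers in this section come from the same Gram-Schmidt procedure; a minimal sketch for the dot product (the starting vectors below are arbitrary, not those of 5.2.1):

```python
import numpy as np

# Plain Gram-Schmidt with normalization at each step.
def gram_schmidt(vectors):
    basis = []
    for w in vectors:
        v = w - sum((w @ u) * u for u in basis)   # subtract projections
        basis.append(v / np.linalg.norm(v))
    return basis

w = [np.array([1.0, 1.0, 0.0]),
     np.array([0.0, 1.0, -2.0]),
     np.array([1.0, 0.0, 1.0])]
Q = np.column_stack(gram_schmidt(w))
print(Q.T @ Q)   # identity matrix, up to round-off
```

For a weighted inner product 〈 v , w 〉 = v^T K w, replace each dot product `a @ b` by `a @ K @ b`.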
5.2.2.
(a) ( 1/√2, 0, 1/√2, 0 )^T, ( 0, 1/√2, 0, −1/√2 )^T, ( 1/2, 1/2, −1/2, 1/2 )^T, ( −1/2, 1/2, 1/2, 1/2 )^T;
(b) ( 1/√2, 0, 0, 1/√2 )^T, ( 2/3, 1/3, 0, −2/3 )^T, ( 0, 0, 1, 0 )^T, ( −1/(3√2), 4/(3√2), 0, 1/(3√2) )^T.
5.2.3. The first two Gram–Schmidt vectors are legitimate, v1 = ( 1, −1, 0, 1 )^T, v2 = ( −1, 0, 1, 1 )^T, but then v3 = 0, and the algorithm breaks down. The reason is that the given vectors are linearly dependent, and do not, in fact, form a basis.
5.2.4.
(a) ( 0, 2/√5, 1/√5 )^T, ( 1, 0, 0 )^T.
(b) Starting with the basis ( 1/2, 1, 0 )^T, ( −1, 0, 1 )^T, the Gram–Schmidt process produces the orthonormal basis ( 1/√5, 2/√5, 0 )^T, ( −4/(3√5), 2/(3√5), 5/(3√5) )^T.
(c) Starting with the basis ( 1, 1, 0 )^T, ( 3, 0, 1 )^T, the Gram–Schmidt process produces the orthonormal basis ( 1/√2, 1/√2, 0 )^T, ( 3/√22, −3/√22, 2/√22 )^T.
5.2.5. ( 1,−1,−1, 1, 1 )T , ( 1, 0, 1,−1, 1 )T , ( 1, 0, 1, 1,−1 )T .
5.2.6.
(a) (1/√3) ( 1, 1, −1, 0 )^T, (1/√15) ( −1, 2, 1, 3 )^T, (1/√15) ( 3, −1, 2, 1 )^T.
(b) Solving the homogeneous system, we obtain the kernel basis ( −1, 2, 1, 0 )^T, ( 1, −1, 0, 1 )^T. The Gram–Schmidt process gives the orthonormal basis (1/√6) ( −1, 2, 1, 0 )^T, (1/√6) ( 1, 0, 1, 2 )^T.
(c) Applying Gram–Schmidt to the corange basis ( 2, 1, 0, −1 )^T, ( 0, 1/2, −1, 1/2 )^T gives the orthonormal basis (1/√6) ( 2, 1, 0, −1 )^T, (1/√6) ( 0, 1, −2, 1 )^T.
(d) Applying Gram–Schmidt to the range basis ( 1, 2, 0, −2 )^T, ( 2, 1, −1, 5 )^T gives the orthonormal basis (1/3) ( 1, 2, 0, −2 )^T, (1/(9√3)) ( 8, 7, −3, 11 )^T.
(e) Applying Gram–Schmidt to the cokernel basis ( 2/3, −1/3, 1, 0 )^T, ( −4, 3, 0, 1 )^T gives the orthonormal basis (1/√14) ( 2, −1, 3, 0 )^T, (1/(9√42)) ( −34, 31, 33, 14 )^T.
(f) Applying Gram–Schmidt to the basis ( −1, 1, 0, 0 )^T, ( 1, 0, 1, 0 )^T, ( 1, 0, 0, 1 )^T gives the orthonormal basis (1/√2) ( −1, 1, 0, 0 )^T, (1/√6) ( 1, 1, 2, 0 )^T, (1/(2√3)) ( 1, 1, −1, 3 )^T.
5.2.7.
(a) Range: (1/√10) ( 1, −3 )^T; kernel: (1/√2) ( 1, 1 )^T; corange: (1/√2) ( 1, −1 )^T; cokernel: (1/√10) ( 3, 1 )^T.
(b) Range: (1/√2) ( −1, 1, 0 )^T, (1/√6) ( 1, 1, 2 )^T; kernel: (1/√6) ( 2, −1, 1 )^T; corange: (1/√5) ( −1, 0, 2 )^T, (1/√30) ( 2, 5, 1 )^T; cokernel: (1/√3) ( −1, −1, 1 )^T.
(c) Range: (1/√3) ( 1, 1, −1 )^T, (1/√42) ( 1, 4, 5 )^T, (1/√14) ( 3, −2, 1 )^T; kernel: (1/2) ( −1, −1, 1, 1 )^T; corange: (1/√2) ( 1, 0, 1, 0 )^T, (1/√2) ( 0, 1, 0, 1 )^T, (1/2) ( −1, 1, 1, −1 )^T; the cokernel is {0}, so there is no basis.
(d) Range: (1/√3) ( 1, 0, −1, 1 )^T, (1/√3) ( 1, −1, 0, −1 )^T; kernel: (1/√21) ( −4, 1, 2 )^T; corange: (1/√6) ( 1, 2, 1 )^T, (1/√14) ( 1, −2, 3 )^T; cokernel: (1/√3) ( 1, 1, 1, 0 )^T, (1/√3) ( 0, −1, 1, 1 )^T.
5.2.8.
(i) (a) (1/2) ( 1, 0, 1 )^T, (1/√2) ( 0, 1, 0 )^T, (1/(2√3)) ( −1, 0, 3 )^T;
(b) (1/√5) ( 1, 1, 0 )^T, (1/√55) ( −2, 3, −5 )^T, (1/√66) ( 2, −3, −6 )^T;
(c) (1/(2√5)) ( 1, 2, 3 )^T, (1/√130) ( 4, 3, −8 )^T, (1/(2√39)) ( −5, 6, −3 )^T.
(ii) (a) ( 1/2, 0, 1/2 )^T, ( 1/2, 1, 1/2 )^T, ( −1/2, 0, 1/2 )^T;
(b) ( 1/√2, 1/√2, 0 )^T, ( −1/2, 0, −1/2 )^T, ( 0, −1/√2, −1/√2 )^T;
(c) (1/(2√3)) ( 1, 2, 3 )^T, (1/√42) ( 4, 5, 0 )^T, (1/√14) ( −2, 1, 0 )^T.
5.2.9. Applying the Gram–Schmidt process to the standard basis vectors e1, e2 gives
(a) ( 1/√3, 0 )^T, ( 0, 1/√5 )^T; (b) ( 1/2, 0 )^T, ( 1/(2√3), 2/√3 )^T; (c) ( 1/√2, 0 )^T, ( −1/√10, √2/√5 )^T.
5.2.10. Applying the Gram–Schmidt process to the standard basis vectors e1, e2, e3 gives
(a) ( 1/2, 0, 0 )^T, ( 1/(2√2), 1/√2, 0 )^T, ( 1/(2√6), 1/√6, √2/√3 )^T;
(b) (1/√3) ( 1, 0, 0 )^T, (1/√33) ( 1, 3, 0 )^T, (1/(4√22)) ( −2, 5, 11 )^T.
5.2.11. (a) 2, namely ±1; (b) infinitely many; (c) no.
♦ 5.2.12. The key is to make sure the inner products are in the correct order, as otherwise complex conjugates appear on the scalars. By induction, assume that we already know 〈 v_i , v_j 〉 = 0 for i ≠ j ≤ k − 1. Then, given v_k = w_k − Σ_{j=1}^{k−1} ( 〈 w_k , v_j 〉/‖ v_j ‖² ) v_j, for any i < k,
〈 v_k , v_i 〉 = 〈 w_k − Σ_{j=1}^{k−1} ( 〈 w_k , v_j 〉/‖ v_j ‖² ) v_j , v_i 〉 = 〈 w_k , v_i 〉 − Σ_{j=1}^{k−1} ( 〈 w_k , v_j 〉/‖ v_j ‖² ) 〈 v_j , v_i 〉 = 〈 w_k , v_i 〉 − ( 〈 w_k , v_i 〉/‖ v_i ‖² ) 〈 v_i , v_i 〉 = 0,
completing the induction step.
125
5.2.13.
(a) ( (1 + i)/2, (1 − i)/2 )^T, ( (3 − i)/(2√5), (1 + 3i)/(2√5) )^T;
(b) ( (1 + i)/3, (1 − i)/3, (2 − i)/3 )^T, ( (−2 + 9i)/15, (−9 − 7i)/15, (−1 + 3i)/15 )^T, ( (3 − i)/5, (1 − 2i)/5, (−1 + 3i)/5 )^T.
5.2.14.
(a) ( (1 − i)/√3, 1/√3, 0 )^T, ( (−1 + 2i)/(2√6), (3 − i)/(2√6), 3i/(2√6) )^T;
(b) (1/3) ( −1 − 2i, 2, 0 )^T, (1/(3√19)) ( 6 + 2i, 5 − 5i, 9 )^T;
(c) ( −(1/2) i, 1/2, −1/2, (1/2) i )^T, ( −1/2, (1/2) i, 1/2, (1/2) i )^T, ( −(1/2) i, (1/2) i, 0, 1/2 − (1/2) i )^T.
5.2.15. False. Any example that starts with a non-orthogonal basis will confirm this.
♦ 5.2.16. According to Exercise 2.4.24, we can find a basis of R^n of the form u1, . . . , u_m, v_{m+1}, . . . , v_n. When we apply the Gram–Schmidt process to this basis in the indicated order, it will not alter the orthonormal vectors u1, . . . , u_m, and so the result is the desired orthonormal basis. Note also that none of the orthonormal basis vectors u_{m+1}, . . . , u_n belongs to V, as otherwise it would be in the span of u1, . . . , u_m, and so the collection would not be linearly independent.
5.2.17.
(a) ( 0., .7071, .7071 )^T, ( .8165, −.4082, .4082 )^T, ( .57735, .57735, −.57735 )^T;
(b) ( .57735, .57735, −.57735, 0. )^T, ( −.2582, .5164, .2582, .7746 )^T, ( .7746, −.2582, .5164, .2582 )^T;
(c) ( .5164, .2582, .7746, .2582, 0. )^T, ( −.2189, −.5200, .4926, −.5200, .4105 )^T, ( .2529, .5454, −.2380, −.3372, .6843 )^T;
(d) ( 0., .7071, 0., .7071, 0. )^T, ( .6325, −.3162, .6325, .3162, 0. )^T, ( .1291, −.3873, −.5164, .3873, −.6455 )^T, ( .57735, 0., −.57735, 0., .57735 )^T.
5.2.18. Same solutions.
5.2.19. See previous solutions.
♦ 5.2.20. Clearly, each u_j = w_j^(j)/‖ w_j^(j) ‖ is a unit vector. We show by induction, first on k and then on j, that, for each 2 ≤ j ≤ k, the vector w_k^(j) is orthogonal to u1, . . . , u_{j−1}; this will imply that u_k = w_k^(k)/‖ w_k^(k) ‖ is also orthogonal to u1, . . . , u_{k−1}, which will establish the result. Indeed, by the formulas,
〈 w_k^(2) , u1 〉 = 〈 w_k , u1 〉 − 〈 w_k , u1 〉 〈 u1 , u1 〉 = 0.
Further, for i < j < k,
〈 w_k^(j+1) , u_i 〉 = 〈 w_k^(j) , u_i 〉 − 〈 w_k^(j) , u_j 〉 〈 u_j , u_i 〉 = 0,
since, by the induction hypothesis, both 〈 w_k^(j) , u_i 〉 = 0 and 〈 u_j , u_i 〉 = 〈 w_j^(j) , u_i 〉/‖ w_j^(j) ‖ = 0. Finally,
〈 w_k^(j+1) , u_j 〉 = 〈 w_k^(j) , u_j 〉 − 〈 w_k^(j) , u_j 〉 〈 u_j , u_j 〉 = 0,
since u_j is a unit vector. This completes the induction step, and the result follows.
5.2.21. Since u1, . . . , u_n form an orthonormal basis, if i < j,
〈 w_k^(j+1) , u_i 〉 = 〈 w_k^(j) , u_i 〉,
and hence, by induction, r_ik = 〈 w_k , u_i 〉 = 〈 w_k^(i) , u_i 〉. Furthermore,
‖ w_k^(j+1) ‖² = ‖ w_k^(j) ‖² − 〈 w_k^(j) , u_j 〉² = ‖ w_k^(j) ‖² − r_jk²,
and so, by (5.5), ‖ w_i^(i) ‖² = ‖ w_i ‖² − r_1i² − · · · − r_{i−1,i}² = r_ii².
5.3.1. (a) Neither; (b) proper orthogonal; (c) orthogonal; (d) proper orthogonal; (e) neither;(f ) proper orthogonal; (g) orthogonal.
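Each classification in 5.3.1 amounts to two checks: Q^T Q = I (orthogonal) and det Q = +1 (proper). A sketch for a generic 2 × 2 rotation and reflection:

```python
import numpy as np

# Orthogonality test: Q^T Q = I; properness test: det Q = +1.
theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[np.cos(theta),  np.sin(theta)],
                       [np.sin(theta), -np.cos(theta)]])

for Q in (rotation, reflection):
    print(np.allclose(Q.T @ Q, np.eye(2)), np.linalg.det(Q))
```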
5.3.2.
(a) By direct computation, R^T R = I, Q^T Q = I;
(b) Both R Q = [ cos θ, sin θ, 0; 0, 0, 1; −sin θ, cos θ, 0 ] and Q R = [ cos θ, 0, sin θ; −sin θ, 0, cos θ; 0, 1, 0 ] satisfy (RQ)^T (RQ) = I = (QR)^T (QR);
(c) Q is proper orthogonal, while R, RQ and QR are all improper.
5.3.3.
(a) True: using the formula (5.31) for an improper 2 × 2 orthogonal matrix,
[ cos θ, sin θ ; sin θ, − cos θ ]² = [ 1, 0 ; 0, 1 ].
(b) False: for example,
[ cos θ, − sin θ, 0 ; sin θ, cos θ, 0 ; 0, 0, −1 ]² ≠ I for θ ≠ 0, π.
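Claims like those in 5.3.1–5.3.3 can be verified mechanically: test Q^T Q = I, then check the sign of the determinant. A small Python sketch (2 × 2 and 3 × 3 only; the classification names follow the text):

```python
def classify(A):
    """Return 'proper orthogonal', 'improper orthogonal', or 'neither'
    for a 2x2 or 3x3 matrix A, using Q^T Q = I and det Q = +-1."""
    n = len(A)
    # (i, j) entry of Q^T Q is the dot product of columns i and j
    QtQ = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    if any(abs(QtQ[i][j] - (1.0 if i == j else 0.0)) > 1e-9
           for i in range(n) for j in range(n)):
        return "neither"
    if n == 2:
        det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    else:  # cofactor expansion along the first row
        det = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
             - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
             + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))
    return "proper orthogonal" if det > 0 else "improper orthogonal"
```

For instance, a rotation matrix is classified as proper, a reflection as improper, and a shear as neither.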
♥ 5.3.4.
(a) By direct computation using sin²α + cos²α = 1, we find Q^T Q = I and det Q = +1.
(b) Q^{−1} = Q^T = [ cos ϕ cos ψ − cos θ sin ϕ sin ψ, − cos ϕ sin ψ − cos θ sin ϕ cos ψ, sin θ sin ϕ ; sin ϕ cos ψ + cos θ cos ϕ sin ψ, − sin ϕ sin ψ + cos θ cos ϕ cos ψ, − sin θ cos ϕ ; sin θ sin ψ, sin θ cos ψ, cos θ ].
♥ 5.3.5.
(a) By a long direct computation, we find Q^T Q = (y₁² + y₂² + y₃² + y₄²)² I and det Q = (y₁² + y₂² + y₃² + y₄²)³ = 1.
(b) Q^{−1} = Q^T = [ y₁² + y₂² − y₃² − y₄², 2(y₂y₃ − y₁y₄), 2(y₂y₄ + y₁y₃) ; 2(y₂y₃ + y₁y₄), y₁² − y₂² + y₃² − y₄², 2(y₃y₄ − y₁y₂) ; 2(y₂y₄ − y₁y₃), 2(y₃y₄ + y₁y₂), y₁² − y₂² − y₃² + y₄² ];
(c) These follow by direct computation using standard trigonometric identities; e.g., the (1,1) entry is
y₁² + y₂² − y₃² − y₄²
  = cos²((ϕ+ψ)/2) cos²(θ/2) + cos²((ϕ−ψ)/2) sin²(θ/2) − sin²((ϕ−ψ)/2) sin²(θ/2) − sin²((ϕ+ψ)/2) cos²(θ/2)
  = cos(ϕ+ψ) cos²(θ/2) + cos(ϕ−ψ) sin²(θ/2)
  = cos ϕ cos ψ ( cos²(θ/2) + sin²(θ/2) ) − sin ϕ sin ψ ( cos²(θ/2) − sin²(θ/2) )
  = cos ϕ cos ψ − cos θ sin ϕ sin ψ.
5.3.6. Since the rows of Q are orthonormal (see Exercise 5.3.8), so are the rows of R, and hence R is also an orthogonal matrix. Moreover, interchanging two rows changes the sign of the determinant, and so if det Q = +1, then det R = −1.
5.3.7. In general, det(Q1Q2) = detQ1 detQ2. If both determinants are +1, so is their product.Improper times proper is improper, while improper times improper is proper.
♦ 5.3.8.(a) Use (5.30) to show (QT )−1 = Q = (QT )T .
(b) The rows of Q are the columns of QT , and hence since QT is an orthogonal matrix, therows of Q must form an orthonormal basis.
5.3.9. (Q−1)T = (QT )T = Q = (Q−1)−1, proving orthogonality.
5.3.10.(a) False — they must be an orthonormal basis.
(b) True, since then QT has orthonormal basis columns, and so is orthogonal. Exercise 5.3.8
then implies that Q = (QT )T is also orthogonal.
(c) False. For example, [ 0, 1 ; 1, 0 ] is symmetric and orthogonal.
5.3.11. All diagonal matrices whose diagonal entries are ±1.
5.3.12. Let U = (u1 u2 . . . un ), where the last n − j entries of the jth column uj are zero.
Since ‖u1 ‖ = 1, u1 = (±1, 0, . . . , 0 )T . Next, 0 = u1 · uj = ±u1,j for j 6= 1, and
so all non-diagonal entries in the first row of U are zero; in particular, since ‖u2 ‖ = 1,
u2 = ( 0,±1, 0, . . . , 0 )T . Then, 0 = u2 · uj = ±u2,j , j 6= 2, and so all non-diagonal entries
in the second row of U are zero; in particular, since ‖u3 ‖ = 1, u3 = ( 0, 0,±1, 0, . . . , 0 )T .The process continues in this manner, eventually proving that U is a diagonal matrix whosediagonal entries are ±1.
5.3.13. (a) Note that P 2 = I and P = PT , proving orthogonality. Moreover, detP = −1since P can be obtained from the identity matrix I by interchanging two rows. (b) Onlythe matrices corresponding to multiplying a row by −1.
5.3.14. False. This is true only for row interchanges or multiplication of a row by −1.
5.3.15.
(a) The columns of P are the standard basis vectors e_1, …, e_n, rewritten in a different order, which doesn't affect their orthonormality.
(b) Exactly half are proper, so there are (1/2) n! proper permutation matrices.
♦ 5.3.16.(a) ‖Qx ‖2 = (Qx)TQx = xTQTQx = xT Ix = xT x = ‖x ‖2.(b) According to Exercise 3.4.19, since both QTQ and I are symmetric matrices, the equa-
tion in part (a) holds for all x if and only if QTQ = I .
♥ 5.3.17.
(a) Q^TQ = ( I − 2uu^T )^T ( I − 2uu^T ) = I − 4uu^T + 4uu^Tuu^T = I, since ‖u‖² = u^Tu = 1 by assumption.
(b) (i) [ −1, 0 ; 0, 1 ], (ii) [ 7/25, −24/25 ; −24/25, −7/25 ], (iii) [ 1, 0, 0 ; 0, −1, 0 ; 0, 0, 1 ], (iv) [ 0, 0, −1 ; 0, 1, 0 ; −1, 0, 0 ].
(c) (i) v = c ( 0, 1 )^T, (ii) v = c ( −4, 3 )^T, (iii) v = c ( 1, 0, 0 )^T + d ( 0, 0, 1 )^T, (iv) v = c ( 1, 0, −1 )^T + d ( 0, 1, 0 )^T.
In general, Qv = v if and only if v is orthogonal to u.
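Part (a) can be illustrated directly by building elementary reflection matrices. The sketch below assumes the unit vector u = ( 3/5, 4/5 )^T, which appears to reproduce the matrix in part (b)(ii); that choice of u is my assumption, not stated here.

```python
def householder(u):
    # Elementary reflection H = I - 2 u u^T for a unit vector u.
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] for j in range(n)]
            for i in range(n)]

H = householder([0.6, 0.8])   # assumed u = (3/5, 4/5)^T
```

Since H is symmetric and orthogonal, H² = I, in agreement with part (a).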
♦ 5.3.18. Q^T = ( I + A )^T ( I − A )^{−T} = ( I + A^T )( I − A^T )^{−1} = ( I − A )( I + A )^{−1} = Q^{−1}. To prove that I − A is invertible, suppose ( I − A )v = 0, so Av = v. Multiplying by v^T and using Exercise 1.6.29(f) gives 0 = v^TAv = ‖v‖², proving v = 0 and hence ker( I − A ) = 0.
5.3.19.
(a) If S = ( v₁ v₂ … v_n ), then S^{−1} = S^TD, where D = diag( 1/‖v₁‖², …, 1/‖v_n‖² ).
(b) [ 1, 1, 1, 0 ; 1, 1, −1, 0 ; 1, −1, 0, 1 ; 1, −1, 0, −1 ]^{−1}
  = [ 1/4, 1/4, 1/4, 1/4 ; 1/4, 1/4, −1/4, −1/4 ; 1/2, −1/2, 0, 0 ; 0, 0, 1/2, −1/2 ]
  = [ 1, 1, 1, 1 ; 1, 1, −1, −1 ; 1, −1, 0, 0 ; 0, 0, 1, −1 ] · diag( 1/4, 1/4, 1/2, 1/2 ).
♦ 5.3.20. Set A = (v1 v2 . . . vn ), B = (w1 w2 . . . wn ). The dot products are the same if and
only if the two Gram matrices are the same: ATA = BTB. Therefore, Q = BA−1 =
B−TAT satisfies QT = A−TBT = Q−1, and hence Q is an orthogonal matrix. Theresulting matrix equation B = QA is the same as the vector equations wi = Qvi fori = 1, . . . , n.
5.3.21. (a) The (i, j) entry of Q^TQ is the product of the ith row of Q^T times the jth column of Q, namely u_i^Tu_j = u_i · u_j, which equals 1 if i = j and 0 if i ≠ j, and hence Q^TQ = I. (b) No. For instance, if Q = u = ( 1/√2, 1/√2 )^T, then QQ^T = [ 1/2, 1/2 ; 1/2, 1/2 ] is not the 2 × 2 identity matrix.
5.3.22.
(a) Assuming the columns are nonzero, Proposition 5.4 implies they are linearly independent. But there can be at most m linearly independent vectors in R^m, so n ≤ m.
(b) The (i, j) entry of A^TA is the dot product v_i^Tv_j = v_i · v_j of the ith and jth columns of A, and so by orthogonality this is zero if i ≠ j. The ith diagonal entry is the squared Euclidean norm of the ith column, ‖v_i‖².
(c) Not necessarily. If A = [ 1, 1 ; 1, −1 ; 0, 2 ], then A^TA = [ 2, 0 ; 0, 6 ], but AA^T = [ 2, 0, 2 ; 0, 2, −2 ; 2, −2, 4 ].
♦ 5.3.23. If S = ( v₁ v₂ … v_n ), then the (i, j) entry of S^TKS is v_i^TKv_j = 〈v_i, v_j〉, so S^TKS = I if and only if 〈v_i, v_j〉 = 0 for i ≠ j, while 〈v_i, v_i〉 = ‖v_i‖² = 1.
♥ 5.3.24.(a) Given any A ∈ G, we have A−1 ∈ G, and hence the product AA−1 = I ∈ G also.
(b) (i) If A,B are nonsingular, so are AB and A−1, with (AB)−1 = B−1A−1, (A−1)−1 = A.(ii) The product of two upper triangular matrices is upper triangular, as is the inverse
of any nonsingular upper triangular matrix.(iii) If detA = 1 = detB, then A,B are nonsingular; det(AB) = detA detB = 1 and
det(A−1) = 1/ detA = 1.
(iv) If P, Q are orthogonal matrices, so P^{−1} = P^T, Q^{−1} = Q^T, then (PQ)^{−1} = Q^{−1}P^{−1} = Q^TP^T = (PQ)^T, and (Q^{−1})^{−1} = Q = (Q^T)^T = (Q^{−1})^T, so both PQ and Q^{−1} are orthogonal matrices.
(v) According to part (iv), the product and inverse of orthogonal matrices are also orthogonal. Moreover, by part (iii), the product and inverse of matrices with determinant 1 also have determinant 1. Therefore, the product and inverse of proper orthogonal matrices are proper orthogonal.
(vi) The inverse of a permutation matrix is a permutation matrix, as is the product of two permutation matrices.
(vii) If A = [ a, b ; c, d ], B = [ x, y ; z, w ] have integer entries with det A = ad − bc = 1, det B = xw − yz = 1, then the product AB = [ ax + bz, ay + bw ; cx + dz, cy + dw ] also has integer entries and determinant det(AB) = det A det B = 1. Moreover, A^{−1} = [ d, −b ; −c, a ] also has integer entries and determinant det(A^{−1}) = 1/det A = 1.
(c) Because the inverse of a matrix with integer entries does not necessarily have integer entries. For instance, [ 1, 1 ; −1, 1 ]^{−1} = [ 1/2, −1/2 ; 1/2, 1/2 ].
(d) No, because the product of two positive definite matrices is not necessarily symmetric,let alone positive definite.
♥ 5.3.25.
(a) The defining equation U†U = I implies U^{−1} = U†.
(b) (i) U^{−1} = U† = [ 1/√2, −i/√2 ; −i/√2, 1/√2 ];
(ii) U^{−1} = U† = [ 1/√3, 1/√3, 1/√3 ; 1/√3, −1/(2√3) − i/2, −1/(2√3) + i/2 ; 1/√3, −1/(2√3) + i/2, −1/(2√3) − i/2 ];
(iii) U^{−1} = U† = [ 1/2, 1/2, 1/2, 1/2 ; 1/2, −i/2, −1/2, i/2 ; 1/2, −1/2, 1/2, −1/2 ; 1/2, i/2, −1/2, −i/2 ].
(c) (i) No, (ii) yes, (iii) yes.
(d) Let u₁, …, u_n denote the columns of U. The ith row of U† is the complex conjugate of the ith column of U, and so the (i, j) entry of U†U is the Hermitian dot product u_j · u_i of the column vectors. Thus, U†U = I if and only if u₁, …, u_n form an orthonormal basis of C^n.
(e) Note first that (UV)† = V†U†. If U^{−1} = U†, V^{−1} = V†, then (UV)^{−1} = V^{−1}U^{−1} = V†U† = (UV)†, so UV is also unitary. Also (U^{−1})^{−1} = U = (U†)† = (U^{−1})†, and so U^{−1} is also unitary.
5.3.26. [ 2, 0, 1 ; 2, 4, 2 ; −1, −1, −3 ] = [ 2/3, −1/√2, −1/(3√2) ; 2/3, 1/√2, −1/(3√2) ; −1/3, 0, −2√2/3 ] · [ 3, 3, 3 ; 0, 2√2, 1/√2 ; 0, 0, 3/√2 ].
5.3.27.
(a) [ 1, −3 ; 2, 1 ] = [ 1/√5, −2/√5 ; 2/√5, 1/√5 ] · [ √5, −1/√5 ; 0, 7/√5 ];
(b) [ 4, 3 ; 3, 2 ] = [ 4/5, 3/5 ; 3/5, −4/5 ] · [ 5, 18/5 ; 0, 1/5 ];
(c) [ 2, 1, −1 ; 0, 1, 3 ; −1, −1, 1 ] = [ 2/√5, −1/√30, 1/√6 ; 0, √(5/6), 1/√6 ; −1/√5, −√(2/15), √(2/3) ] · [ √5, 3/√5, −3/√5 ; 0, √(6/5), 7√(2/15) ; 0, 0, 2√(2/3) ];
(d) [ 0, 1, 2 ; −1, 1, 1 ; −1, 1, 3 ] = [ 0, 1, 0 ; −1/√2, 0, −1/√2 ; −1/√2, 0, 1/√2 ] · [ √2, −√2, −2√2 ; 0, 1, 2 ; 0, 0, √2 ];
(e) [ 0, 0, 2 ; 0, 4, 1 ; −1, 0, 1 ] = [ 0, 0, 1 ; 0, 1, 0 ; −1, 0, 0 ] · [ 1, 0, −1 ; 0, 4, 1 ; 0, 0, 2 ];
(f) [ 1, 1, 1, 1 ; 1, 2, 1, 0 ; 1, 1, 2, 1 ; 1, 0, 1, 1 ] = [ 1/2, 0, −1/(2√3), √(2/3) ; 1/2, 1/√2, −1/(2√3), −1/√6 ; 1/2, 0, √3/2, 0 ; 1/2, −1/√2, −1/(2√3), −1/√6 ] · [ 2, 2, 5/2, 3/2 ; 0, √2, 0, −1/√2 ; 0, 0, √3/2, 1/(2√3) ; 0, 0, 0, 1/√6 ].
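The factorizations above can be generated by applying classical Gram–Schmidt to the columns of A. A minimal Python sketch (my own rendering, not the book's code), checked here against 5.3.27(a):

```python
from math import sqrt

def qr_factor(A):
    # Classical Gram-Schmidt on the columns of A (a list of rows);
    # returns Q (with orthonormal columns) and upper triangular R with A = Q R.
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    qcols, R = [], [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i, u in enumerate(qcols):
            R[i][j] = sum(a * b for a, b in zip(u, cols[j]))   # r_ij = <w_j, u_i>
            v = [a - R[i][j] * b for a, b in zip(v, u)]
        R[j][j] = sqrt(sum(x * x for x in v))
        qcols.append([x / R[j][j] for x in v])
    Q = [[qcols[j][i] for j in range(n)] for i in range(m)]
    return Q, R

Q, R = qr_factor([[1.0, -3.0], [2.0, 1.0]])   # the matrix of part (a)
```

For part (a), R[0][0] = √5 and R[1][1] = 7/√5, as listed above.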
5.3.28.
(i) (a) [ 1, 2 ; −1, 3 ] = [ 1/√2, 1/√2 ; −1/√2, 1/√2 ] · [ √2, −1/√2 ; 0, 5/√2 ]; (b) ( x, y )^T = ( −7/5, 1/5 )^T;
(ii) (a) [ 2, 1, −1 ; 1, 0, 2 ; 2, −1, 3 ] = [ 2/3, 1/√2, −1/(3√2) ; 1/3, 0, 2√2/3 ; 2/3, −1/√2, −1/(3√2) ] · [ 3, 0, 2 ; 0, √2, −2√2 ; 0, 0, √2 ]; (b) ( x, y, z )^T = ( 1, −1, −1 )^T;
(iii) (a) [ 1, 1, 0 ; −1, 0, 1 ; 0, −1, 1 ] = [ 1/√2, 1/√6, 1/√3 ; −1/√2, 1/√6, 1/√3 ; 0, −√(2/3), 1/√3 ] · [ √2, 1/√2, −1/√2 ; 0, √(3/2), −1/√6 ; 0, 0, 2/√3 ]; (b) ( x, y, z )^T = ( −1/2, 1/2, 1/2 )^T.
♠ 5.3.29.
[ 4, 1, 0 ; 1, 4, 1 ; 0, 1, 4 ] = [ .9701, −.2339, .0643 ; .2425, .9354, −.2571 ; 0, .2650, .9642 ] · [ 4.1231, 1.9403, .2425 ; 0, 3.7730, 1.9956 ; 0, 0, 3.5998 ];
[ 4, 1, 0, 0 ; 1, 4, 1, 0 ; 0, 1, 4, 1 ; 0, 0, 1, 4 ] = [ .9701, −.2339, .0619, −.0172 ; .2425, .9354, −.2477, .0688 ; 0, .2650, .9291, −.2581 ; 0, 0, .2677, .9635 ] · [ 4.1231, 1.9403, .2425, 0 ; 0, 3.7730, 1.9956, .2650 ; 0, 0, 3.7361, 1.9997 ; 0, 0, 0, 3.5960 ];
[ 4, 1, 0, 0, 0 ; 1, 4, 1, 0, 0 ; 0, 1, 4, 1, 0 ; 0, 0, 1, 4, 1 ; 0, 0, 0, 1, 4 ] = [ .9701, −.2339, .0619, −.0166, .0046 ; .2425, .9354, −.2477, .0663, −.0184 ; 0, .2650, .9291, −.2486, .0691 ; 0, 0, .2677, .9283, −.2581 ; 0, 0, 0, .2679, .9634 ] · [ 4.1231, 1.9403, .2425, 0, 0 ; 0, 3.7730, 1.9956, .2650, 0 ; 0, 0, 3.7361, 1.9997, .2677 ; 0, 0, 0, 3.7324, 2.0000 ; 0, 0, 0, 0, 3.5956 ].
5.3.30. For the matrices of 5.3.27:
(a) v̂₁ = ( −1.2361, 2.0000 )^T, H₁ = [ .4472, .8944 ; .8944, −.4472 ];
Q = [ .4472, .8944 ; .8944, −.4472 ], R = [ 2.2361, −.4472 ; 0, −3.1305 ];
(b) v̂₁ = ( −1, 3 )^T, H₁ = [ .8, .6 ; .6, −.8 ], Q = [ .8, .6 ; .6, −.8 ], R = [ 5, 3.6 ; 0, .2 ];
(c) v̂₁ = ( −.2361, 0, −1 )^T, H₁ = [ .8944, 0, −.4472 ; 0, 1, 0 ; −.4472, 0, −.8944 ],
v̂₂ = ( 0, −.0954, .4472 )^T, H₂ = [ 1, 0, 0 ; 0, .9129, .4082 ; 0, .4082, −.9129 ],
Q = [ .8944, −.1826, .4082 ; 0, .9129, .4082 ; −.4472, −.3651, .8165 ], R = [ 2.2361, 1.3416, −1.3416 ; 0, 1.0954, 2.5560 ; 0, 0, 1.6330 ];
(d) v̂₁ = ( −1.4142, −1, −1 )^T, H₁ = [ 0, −.7071, −.7071 ; −.7071, .5, −.5 ; −.7071, −.5, .5 ],
v̂₂ = ( 0, −1.7071, −.7071 )^T, H₂ = [ 1, 0, 0 ; 0, −.7071, −.7071 ; 0, −.7071, .7071 ],
Q = [ 0, 1, 0 ; −.7071, 0, −.7071 ; −.7071, 0, .7071 ], R = [ 1.4142, −1.4142, −2.8284 ; 0, 1, 2 ; 0, 0, 1.4142 ];
(e) v̂₁ = ( −1, 0, −1 )^T, H₁ = [ 0, 0, −1 ; 0, 1, 0 ; −1, 0, 0 ], v̂₂ = ( 0, 0, 0 )^T, H₂ = [ 1, 0, 0 ; 0, 1, 0 ; 0, 0, 1 ],
Q = [ 0, 0, −1 ; 0, 1, 0 ; −1, 0, 0 ], R = [ 1, 0, −1 ; 0, 4, 1 ; 0, 0, −2 ];
(f) v̂₁ = ( −1, 1, 1, 1 )^T, H₁ = [ .5, .5, .5, .5 ; .5, .5, −.5, −.5 ; .5, −.5, .5, −.5 ; .5, −.5, −.5, .5 ],
v̂₂ = ( 0, −.4142, 0, −1 )^T, H₂ = [ 1, 0, 0, 0 ; 0, .7071, 0, −.7071 ; 0, 0, 1, 0 ; 0, −.7071, 0, −.7071 ],
v̂₃ = ( 0, 0, −.3660, .7071 )^T, H₃ = [ 1, 0, 0, 0 ; 0, 1, 0, 0 ; 0, 0, .5774, .8165 ; 0, 0, .8165, −.5774 ],
Q = [ .5, 0, −.2887, .8165 ; .5, .7071, −.2887, −.4082 ; .5, 0, .8660, 0 ; .5, −.7071, −.2887, −.4082 ], R = [ 2, 2, 2.5, 1.5 ; 0, 1.4142, 0, −.7071 ; 0, 0, .8660, .2887 ; 0, 0, 0, .4082 ].
For the matrices of 5.3.29:
3 × 3 case: v̂₁ = ( −.1231, 1, 0 )^T, H₁ = [ .9701, .2425, 0 ; .2425, −.9701, 0 ; 0, 0, 1 ],
v̂₂ = ( 0, −7.4110, 1 )^T, H₂ = [ 1, 0, 0 ; 0, −.9642, .2650 ; 0, .2650, .9642 ],
Q = [ .9701, −.2339, .0643 ; .2425, .9354, −.2571 ; 0, .2650, .9642 ], R = [ 4.1231, 1.9403, .2425 ; 0, 3.7730, 1.9956 ; 0, 0, 3.5998 ];
4 × 4 case: v̂₁ = ( −.1231, 1, 0, 0 )^T, H₁ = [ .9701, .2425, 0, 0 ; .2425, −.9701, 0, 0 ; 0, 0, 1, 0 ; 0, 0, 0, 1 ],
v̂₂ = ( 0, −7.4110, 1, 0 )^T, H₂ = [ 1, 0, 0, 0 ; 0, −.9642, .2650, 0 ; 0, .2650, .9642, 0 ; 0, 0, 0, 1 ],
v̂₃ = ( 0, 0, −.1363, 1 )^T, H₃ = [ 1, 0, 0, 0 ; 0, 1, 0, 0 ; 0, 0, .9635, .2677 ; 0, 0, .2677, −.9635 ],
Q = [ .9701, −.2339, .0619, .0172 ; .2425, .9354, −.2477, −.0688 ; 0, .2650, .9291, .2581 ; 0, 0, .2677, −.9635 ], R = [ 4.1231, 1.9403, .2425, 0 ; 0, 3.7730, 1.9956, .2650 ; 0, 0, 3.7361, 1.9997 ; 0, 0, 0, −3.5960 ];
5 × 5 case: v̂₁ = ( −.1231, 1, 0, 0, 0 )^T, H₁ = [ .9701, .2425, 0, 0, 0 ; .2425, −.9701, 0, 0, 0 ; 0, 0, 1, 0, 0 ; 0, 0, 0, 1, 0 ; 0, 0, 0, 0, 1 ],
v̂₂ = ( 0, −7.4110, 1, 0, 0 )^T, H₂ = [ 1, 0, 0, 0, 0 ; 0, −.9642, .2650, 0, 0 ; 0, .2650, .9642, 0, 0 ; 0, 0, 0, 1, 0 ; 0, 0, 0, 0, 1 ],
v̂₃ = ( 0, 0, −.1363, 1, 0 )^T, H₃ = [ 1, 0, 0, 0, 0 ; 0, 1, 0, 0, 0 ; 0, 0, .9635, .2677, 0 ; 0, 0, .2677, −.9635, 0 ; 0, 0, 0, 0, 1 ],
v̂₄ = ( 0, 0, 0, −7.3284, 1 )^T, H₄ = [ 1, 0, 0, 0, 0 ; 0, 1, 0, 0, 0 ; 0, 0, 1, 0, 0 ; 0, 0, 0, −.9634, .2679 ; 0, 0, 0, .2679, .9634 ],
Q = [ .9701, −.2339, .0619, −.0166, .0046 ; .2425, .9354, −.2477, .0663, −.0184 ; 0, .2650, .9291, −.2486, .0691 ; 0, 0, .2677, .9283, −.2581 ; 0, 0, 0, .2679, .9634 ],
R = [ 4.1231, 1.9403, .2425, 0, 0 ; 0, 3.7730, 1.9956, .2650, 0 ; 0, 0, 3.7361, 1.9997, .2677 ; 0, 0, 0, 3.7324, 2.0000 ; 0, 0, 0, 0, 3.5956 ].
♥ 5.3.31.
(a) QR factorization requires n³ + n² multiplications/divisions, n square roots, and n³ − (1/2)n² − (1/2)n additions/subtractions.
(b) Multiplication of Q^Tb requires an additional n² multiplications/divisions and n² − n additions/subtractions. Solving Rx = Q^Tb by Back Substitution requires (1/2)n² + (1/2)n multiplications/divisions and (1/2)n² − (1/2)n additions/subtractions.
(c) The QR method requires approximately 3 times as much computational effort as Gaus-sian Elimination.
♦ 5.3.32. If QR = Q̃R̃, then Q^{−1}Q̃ = R̃R^{−1}. The left hand side is orthogonal, while the right hand side is upper triangular. Thus, by Exercise 5.3.12, both sides must be diagonal with ±1 on the diagonal. Positivity of the diagonal entries of R and R̃ implies positivity of those of R̃R^{−1}, and hence Q^{−1}Q̃ = R̃R^{−1} = I, which implies Q = Q̃ and R = R̃.
♥ 5.3.33.
(a) If rank A = n, then the columns w₁, …, w_n of A are linearly independent, and so form a basis for its range. Applying the Gram–Schmidt process converts the column basis w₁, …, w_n to an orthonormal basis u₁, …, u_n of rng A.
(b) In this case, for the same reason as in (5.23), we can write
w₁ = r₁₁u₁,  w₂ = r₁₂u₁ + r₂₂u₂,  w₃ = r₁₃u₁ + r₂₃u₂ + r₃₃u₃,  …,  w_n = r₁ₙu₁ + r₂ₙu₂ + ⋯ + r_{nn}u_n.
The result is equivalent to the factorization A = QR, where Q = ( u₁ … u_n ) and R = (r_{ij}) is nonsingular since its diagonal entries are nonzero: r_{ii} ≠ 0.
(c)
(i) [ 1, −1 ; 2, 3 ; 0, 2 ] = [ 1/√5, −2/3 ; 2/√5, 1/3 ; 0, 2/3 ] · [ √5, √5 ; 0, 3 ];
(ii) [ −3, 1 ; 0, 1 ; 4, 2 ] = [ −3/5, 8/(5√5) ; 0, 1/√5 ; 4/5, 6/(5√5) ] · [ 5, 1 ; 0, √5 ];
(iii) [ −1, 1 ; 1, −2 ; −1, 2 ; 1, −1 ] = [ −1/2, −1/2 ; 1/2, −1/2 ; −1/2, 1/2 ; 1/2, 1/2 ] · [ 2, −3 ; 0, 1 ];
(iv) [ 0, 1, −1 ; −2, 1, 3 ; −1, 0, −2 ; 2, 1, −2 ] = [ 0, 1/√3, −3/(7√2) ; −2/3, 1/√3, 11/(21√2) ; −1/3, 0, −13√2/21 ; 2/3, 1/√3, −√2/21 ] · [ 3, 0, −8/3 ; 0, √3, 0 ; 0, 0, 7√2/3 ].
(d) The columns of A are linearly dependent, and so the algorithm breaks down, as in Exercise 5.2.3.
♥ 5.3.34.
(a) If A = ( w₁ w₂ … w_n ), then U = ( u₁ u₂ … u_n ) has orthonormal columns and hence is a unitary matrix. The Gram–Schmidt process takes the same form:
w₁ = r₁₁u₁,  w₂ = r₁₂u₁ + r₂₂u₂,  w₃ = r₁₃u₁ + r₂₃u₂ + r₃₃u₃,  …,  w_n = r₁ₙu₁ + r₂ₙu₂ + ⋯ + r_{nn}u_n,
which is equivalent to the factorization A = UR.
(b) (i) [ i, 1 ; −1, 2i ] = [ i/√2, −1/√2 ; −1/√2, i/√2 ] · [ √2, −3i/√2 ; 0, 1/√2 ];
(ii) [ 1+i, 2−i ; 1−i, −i ] = [ 1/2 + i/2, 1/2 − i/2 ; 1/2 − i/2, 1/2 + i/2 ] · [ 2, 1 − 2i ; 0, 1 ];
(iii) [ i, 1, 0 ; 1, i, 1 ; 0, 1, i ] = [ i/√2, 1/√3, −i/√6 ; 1/√2, i/√3, 1/√6 ; 0, 1/√3, i√(2/3) ] · [ √2, 0, 1/√2 ; 0, √3, 0 ; 0, 0, √(3/2) ];
(iv) [ i, 1, −i ; 1−i, 0, 1+i ; −1, 2+3i, 1 ] = [ i/2, i/6, (2−3i)/(3√2) ; 1/2 − i/2, 1/2 + i/6, √2/3 ; −1/2, 1/2 + 2i/3, −1/(3√2) ] · [ 2, −1−2i, −1+i ; 0, 3, 1 − i/3 ; 0, 0, 2√2/3 ].
(c) Each diagonal entry of R can be multiplied by any complex number of modulus 1. Thus,requiring them all to be real and positive will imply uniqueness of the U R factorization.The proof of uniqueness is modeled on the real version in Exercise 5.3.32.
5.3.35.
Householder's Method

start
  set R = A
  for j = 1 to n − 1
    for i = 1 to j − 1 set w_i = 0 next i
    for i = j to n set w_i = r_{ij} next i
    set v = w − ‖w‖ e_j
    if v ≠ 0
      set u_j = v/‖v‖, r_{jj} = ‖w‖
      for i = j + 1 to n set r_{ij} = 0 next i
      for k = j + 1 to n
        set s = Σ_{l=j}^{n} u_l r_{lk}
        for i = j to n set r_{ik} = r_{ik} − 2 u_i s next i
      next k
    else
      set u_j = 0
    endif
  next j
end

Here s, the inner product of u_j with the kth column, is computed once per column, before that column is overwritten.
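The pseudocode can be rendered in Python as follows. This is a sketch under my own conventions: for simplicity the orthogonal factor Q = H₁H₂⋯ is accumulated explicitly, which the pseudocode above does not do.

```python
from math import sqrt

def householder_qr(A):
    # Householder QR for a square matrix A (list of rows): each step
    # reflects the current column of R onto a coordinate axis via
    # H = I - 2 u u^T, giving A = Q R with R upper triangular.
    n = len(A)
    R = [row[:] for row in A]
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n - 1):
        w = [0.0] * j + [R[i][j] for i in range(j, n)]
        normw = sqrt(sum(x * x for x in w))
        v = w[:]
        v[j] -= normw                       # v = w - ||w|| e_j
        vnorm = sqrt(sum(x * x for x in v))
        if vnorm < 1e-14:
            continue
        u = [x / vnorm for x in v]
        for k in range(n):                  # R <- H R
            s = sum(u[l] * R[l][k] for l in range(n))
            for i in range(n):
                R[i][k] -= 2.0 * u[i] * s
        for i in range(n):                  # Q <- Q H
            s = sum(Q[i][l] * u[l] for l in range(n))
            for k in range(n):
                Q[i][k] -= 2.0 * s * u[k]
    return Q, R

Q, R = householder_qr([[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]])
```

Run on the 3 × 3 tridiagonal matrix of Exercise 5.3.29, the first step uses v = w − ‖w‖e₁ with w = ( 4, 1, 0 )^T, so r₁₁ = √17 ≈ 4.1231.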
5.4.1.
(a) t³ = q₃(t) + (3/5)q₁(t), where
1 = 〈t³, q₃〉/‖q₃‖² = (175/8) ∫_{−1}^{1} t³ ( t³ − (3/5)t ) dt,  0 = 〈t³, q₂〉/‖q₂‖² = (45/8) ∫_{−1}^{1} t³ ( t² − 1/3 ) dt,
3/5 = 〈t³, q₁〉/‖q₁‖² = (3/2) ∫_{−1}^{1} t³ · t dt,  0 = 〈t³, q₀〉/‖q₀‖² = (1/2) ∫_{−1}^{1} t³ dt;
(b) t⁴ + t² = q₄(t) + (13/7)q₂(t) + (8/15)q₀(t), where
1 = 〈t⁴+t², q₄〉/‖q₄‖² = (11025/128) ∫_{−1}^{1} (t⁴+t²)( t⁴ − (6/7)t² + 3/35 ) dt,
0 = 〈t⁴+t², q₃〉/‖q₃‖² = (175/8) ∫_{−1}^{1} (t⁴+t²)( t³ − (3/5)t ) dt,
13/7 = 〈t⁴+t², q₂〉/‖q₂‖² = (45/8) ∫_{−1}^{1} (t⁴+t²)( t² − 1/3 ) dt,
0 = 〈t⁴+t², q₁〉/‖q₁‖² = (3/2) ∫_{−1}^{1} (t⁴+t²) t dt,
8/15 = 〈t⁴+t², q₀〉/‖q₀‖² = (1/2) ∫_{−1}^{1} (t⁴+t²) dt;
(c) 7t⁴ + 2t³ − t = 7q₄(t) + 2q₃(t) + 6q₂(t) + (1/5)q₁(t) + (7/5)q₀(t), where
7 = 〈7t⁴+2t³−t, q₄〉/‖q₄‖² = (11025/128) ∫_{−1}^{1} (7t⁴+2t³−t)( t⁴ − (6/7)t² + 3/35 ) dt,
2 = 〈7t⁴+2t³−t, q₃〉/‖q₃‖² = (175/8) ∫_{−1}^{1} (7t⁴+2t³−t)( t³ − (3/5)t ) dt,
6 = 〈7t⁴+2t³−t, q₂〉/‖q₂‖² = (45/8) ∫_{−1}^{1} (7t⁴+2t³−t)( t² − 1/3 ) dt,
1/5 = 〈7t⁴+2t³−t, q₁〉/‖q₁‖² = (3/2) ∫_{−1}^{1} (7t⁴+2t³−t) t dt,
7/5 = 〈7t⁴+2t³−t, q₀〉/‖q₀‖² = (1/2) ∫_{−1}^{1} (7t⁴+2t³−t) dt.
5.4.2. (a) q₅(t) = t⁵ − (10/9)t³ + (5/21)t = (5!/10!) d⁵/dt⁵ (t² − 1)⁵, (b) t⁵ = q₅(t) + (10/9)q₃(t) + (3/7)q₁(t),
(c) q₆(t) = t⁶ − (15/11)t⁴ + (5/11)t² − 5/231 = (6!/12!) d⁶/dt⁶ (t² − 1)⁶, t⁶ = q₆(t) + (15/11)q₄(t) + (5/7)q₂(t) + (1/7)q₀(t).
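The defining property of the monic Legendre polynomials, orthogonality to all lower powers of t in L²[−1, 1], can be checked exactly with rational arithmetic, since every required integral is a moment ∫_{−1}^{1} t^k dt. A sketch using q₅ from part (a):

```python
from fractions import Fraction

def moment(k):
    # Exact integral of t^k over [-1, 1].
    return Fraction(0) if k % 2 else Fraction(2, k + 1)

def inner(p, q):
    # L^2([-1,1]) inner product of polynomials given as coefficient
    # lists [c0, c1, ...] meaning c0 + c1 t + c2 t^2 + ... .
    return sum(a * b * moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))

# q5(t) = t^5 - (10/9) t^3 + (5/21) t, from part (a).
q5 = [0, Fraction(5, 21), 0, Fraction(-10, 9), 0, Fraction(1)]
```

Each of 〈q₅, t^j〉 for j = 0, …, 4 evaluates to exactly zero, confirming the characterization used in Exercise 5.4.3.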
♦ 5.4.3. (a) We characterized q_n(t) as the unique monic polynomial of degree n that is orthogonal to q₀(t), …, q_{n−1}(t). Since these Legendre polynomials form a basis of P^{(n−1)}, this implies that q_n(t) is orthogonal to all polynomials of degree ≤ n−1; in particular 〈q_n, t^j〉 = 0 for j = 0, …, n−1. Conversely, if the latter condition holds, then q_n(t) is orthogonal to every polynomial of degree ≤ n−1, and, in particular, 〈q_n, q_j〉 = 0, j = 0, …, n−1.
(b) Set q₅(t) = t⁵ + c₄t⁴ + c₃t³ + c₂t² + c₁t + c₀. Then we require
0 = 〈q₅, 1〉 = 2c₀ + (2/3)c₂ + (2/5)c₄,  0 = 〈q₅, t〉 = (2/3)c₁ + (2/5)c₃ + 2/7,
0 = 〈q₅, t²〉 = (2/3)c₀ + (2/5)c₂ + (2/7)c₄,  0 = 〈q₅, t³〉 = (2/5)c₁ + (2/7)c₃ + 2/9,
0 = 〈q₅, t⁴〉 = (2/5)c₀ + (2/7)c₂ + (2/9)c₄.
The unique solution to this linear system is c₀ = 0, c₁ = 5/21, c₂ = 0, c₃ = −10/9, c₄ = 0, and so q₅(t) = t⁵ − (10/9)t³ + (5/21)t is the monic Legendre polynomial of degree 5.
5.4.4. Since even and odd powers of t are orthogonal with respect to the L² inner product on [−1, 1], when the Gram–Schmidt process is run, only even powers of t will contribute to the even order polynomials, whereas only odd powers of t will contribute to the odd order cases. Alternatively, one can prove this directly from the Rodrigues formula, noting that (t² − 1)^k is even, and the derivative of an even (odd) function is odd (even).
5.4.5. Use Exercise 5.4.4 and the fact that 〈f, g〉 = ∫_{−1}^{1} f(t)g(t) dt = 0 if f is odd and g is even, since their product is odd.
5.4.6. q_k(t) = (k!/(2k)!) d^k/dt^k (t² − 1)^k,  ‖q_k‖ = (2^k (k!)²/(2k)!) √(2/(2k+1)).
5.4.7. Q_k(t) = q_k(t)/‖q_k‖ = ((2k)!/(2^k (k!)²)) √((2k+1)/2) q_k(t) = (1/(2^k k!)) √((2k+1)/2) d^k/dt^k (t² − 1)^k.
♦ 5.4.8. Write P_k(t) = (1/(2^k k!)) d^k/dt^k [ (t−1)^k (t+1)^k ]. Differentiating using Leibniz' Rule, we conclude that the only term that does not contain a factor of t − 1 is the one in which all k derivatives are applied to (t − 1)^k. Thus, P_k(t) = (1/2^k)(t+1)^k + (t−1)S_k(t) for some polynomial S_k(t), and so P_k(1) = 1.
♥ 5.4.9.
(a) Integrating by parts k times, and noting that the boundary terms are zero by (5.50):
‖R_{k,k}‖² = ∫_{−1}^{1} [ d^k/dt^k (t² − 1)^k ]² dt = (−1)^k ∫_{−1}^{1} (t² − 1)^k d^{2k}/dt^{2k} (t² − 1)^k dt = (−1)^k (2k)! ∫_{−1}^{1} (t² − 1)^k dt.
(b) Since t = cos θ satisfies dt = − sin θ dθ, and takes θ = 0 to t = 1 and θ = π to t = −1, we find
∫_{−1}^{1} (t² − 1)^k dt = (−1)^k ∫₀^π sin^{2k+1} θ dθ = (−1)^k (2k/(2k+1)) ∫₀^π sin^{2k−1} θ dθ
 = (−1)^k (2k/(2k+1)) ((2k−2)/(2k−1)) ∫₀^π sin^{2k−3} θ dθ = ⋯
 = (−1)^k [ (2k)(2k−2) ⋯ 4 · 2 / ( (2k+1)(2k−1) ⋯ 5 · 3 ) ] ∫₀^π sin θ dθ = (−1)^k 2^{2k+1} (k!)² / (2k+1)!,
where we integrated by parts k times after making the trigonometric change of variables. Combining parts (a,b),
‖R_{k,k}‖² = (2k)! 2^{2k+1} (k!)² / (2k+1)! = 2^{2k+1} (k!)² / (2k+1).
Thus, by the Rodrigues formula,
‖P_k‖ = (1/(2^k k!)) ‖R_{k,k}‖ = (1/(2^k k!)) · 2^{k+1/2} k! / √(2k+1) = √( 2/(2k+1) ).
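The norm formula just derived can be verified exactly for small k by expanding the Rodrigues formula with rational arithmetic. The sketch below is my own check, not part of the exercise:

```python
from fractions import Fraction
from math import comb, factorial

def legendre_rodrigues(k):
    # P_k via the Rodrigues formula P_k = (1/(2^k k!)) d^k/dt^k (t^2 - 1)^k,
    # returned as exact coefficients [c0, c1, ...].
    p = [Fraction(0)] * (2 * k + 1)
    for j in range(k + 1):                     # binomial expansion of (t^2 - 1)^k
        p[2 * j] = Fraction(comb(k, j) * (-1) ** (k - j))
    for _ in range(k):                         # differentiate k times
        p = [Fraction(i) * c for i, c in enumerate(p)][1:]
    return [c / (2 ** k * factorial(k)) for c in p]

def norm_sq(p):
    # Exact L^2([-1,1]) norm squared of a coefficient list.
    return sum(a * b * (Fraction(2, i + j + 1) if (i + j) % 2 == 0 else 0)
               for i, a in enumerate(p) for j, b in enumerate(p))
```

For each k, norm_sq gives exactly 2/(2k+1), and the coefficients sum to 1, i.e. P_k(1) = 1, as in Exercise 5.4.8.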
♥ 5.4.10.
(a) The roots of P₂(t) are ±1/√3; the roots of P₃(t) are 0, ±√(3/5); the roots of P₄(t) are ±√( (15 ± 2√30)/35 ).
(b) We use induction on R_{j+1,k} = d/dt R_{j,k}. Differentiation reduces the order of a root by 1. Moreover, Rolle's Theorem says that at least one root of f′(t) lies strictly between any two roots of f(t). Thus, starting with R_{0,k}(t) = (1 − t)^k (1 + t)^k, which has roots of order k at ±1, we deduce that, for each j < k, R_{j,k}(t) has j roots lying between −1 and 1, along with roots of order k − j at ±1. Since the degree of R_{j,k}(t) is 2k − j, and the roots at ±1 have orders k − j, the j other roots between −1 and 1 must all be simple. Setting j = k, we conclude that P_k(t) = R_{k,k}(t) = d/dt R_{k−1,k}(t) has k simple roots strictly between −1 and 1.
5.4.11.
(a) P₀(t) = 1, P₁(t) = t − 3/2, P₂(t) = t² − 3t + 13/6, P₃(t) = t³ − (9/2)t² + (33/5)t − 63/20;
(b) P₀(t) = 1, P₁(t) = t − 2/3, P₂(t) = t² − (6/5)t + 3/10, P₃(t) = t³ − (12/7)t² + (6/7)t − 4/35;
(c) P₀(t) = 1, P₁(t) = t, P₂(t) = t² − 3/5, P₃(t) = t³ − (5/7)t;
(d) P₀(t) = 1, P₁(t) = t, P₂(t) = t² − 2, P₃(t) = t³ − 12t.
5.4.12. 1, t − 3/4, t² − (4/3)t + 2/5, t³ − (15/8)t² + (15/14)t − 5/28, t⁴ − (12/5)t³ + 2t² − (2/3)t + 1/14.
5.4.13. These are the rescaled Legendre polynomials:
1, (1/2)t, (3/8)t² − 1/2, (5/16)t³ − (3/4)t, (35/128)t⁴ − (15/16)t² + 3/8, (63/256)t⁵ − (35/32)t³ + (15/16)t.
5.4.14. Setting 〈f, g〉 = ∫₀¹ f(t)g(t) dt, ‖f‖² = ∫₀¹ f(t)² dt:
q₀(t) = 1 = P̃₀(t),
q₁(t) = t − (〈t, q₀〉/‖q₀‖²) q₀(t) = t − 1/2 = (1/2)P̃₁(t),
q₂(t) = t² − (〈t², q₀〉/‖q₀‖²) q₀(t) − (〈t², q₁〉/‖q₁‖²) q₁(t) = t² − (1/3)/1 − ( (1/12)/(1/12) )( t − 1/2 ) = t² − t + 1/6 = (1/6)P̃₂(t),
q₃(t) = t³ − (1/4)/1 − ( (3/40)/(1/12) )( t − 1/2 ) − ( (1/120)/(1/180) )( t² − t + 1/6 ) = t³ − (3/2)t² + (3/5)t − 1/20 = (1/20)P̃₃(t),
q₄(t) = t⁴ − (1/5)/1 − ( (1/15)/(1/12) )( t − 1/2 ) − ( (1/105)/(1/180) )( t² − t + 1/6 ) − ( (1/1400)/(1/2800) )( t³ − (3/2)t² + (3/5)t − 1/20 )
 = t⁴ − 2t³ + (9/7)t² − (2/7)t + 1/70 = (1/70)P̃₄(t).
5.4.15. p₀(t) = 1, p₁(t) = t − 1/2, p₂(t) = t² − t + 1/6, p₃(t) = t³ − (3/2)t² + (33/65)t − 1/260.
♦ 5.4.16. The formula for the norm follows from combining equations (5.48) and (5.59).
5.4.17. L₄(t) = t⁴ − 16t³ + 72t² − 96t + 24, ‖L₄‖ = 24;
L₅(t) = t⁵ − 25t⁴ + 200t³ − 600t² + 600t − 120, ‖L₅‖ = 120.
♦ 5.4.18. This is done by induction on k. For k = 0, we have ∫₀^∞ e^{−t} dt = −e^{−t} |₀^∞ = 1. Integration by parts implies
∫₀^∞ t^k e^{−t} dt = −t^k e^{−t} |₀^∞ + ∫₀^∞ k t^{k−1} e^{−t} dt = k ∫₀^∞ t^{k−1} e^{−t} dt = k · (k−1)! = k!.
♦ 5.4.19. p₀(t) = 1, p₁(t) = t, p₂(t) = t² − 1/2, p₃(t) = t³ − (3/2)t, p₄(t) = t⁴ − 3t² + 3/4.
♥ 5.4.20.
(a) To prove orthogonality, use the change of variables t = cos θ in the inner product integral, noting that dt = − sin θ dθ, and so dθ = dt/√(1 − t²):
〈T_m, T_n〉 = ∫_{−1}^{1} cos(m arccos t) cos(n arccos t) / √(1 − t²) dt = ∫₀^π cos mθ cos nθ dθ,
which equals π when m = n = 0, (1/2)π when m = n > 0, and 0 when m ≠ n.
(b) ‖T₀‖ = √π, ‖T_n‖ = √(π/2) for n > 0.
(c) T₀(t) = 1, T₁(t) = t, T₂(t) = 2t² − 1, T₃(t) = 4t³ − 3t, T₄(t) = 8t⁴ − 8t² + 1, T₅(t) = 16t⁵ − 20t³ + 5t, T₆(t) = 32t⁶ − 48t⁴ + 18t² − 1.
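The list in part (c) follows from the standard three-term recurrence T_{k+1}(t) = 2t T_k(t) − T_{k−1}(t), which is a consequence of cos(k+1)θ + cos(k−1)θ = 2 cos θ cos kθ. A short sketch generating the coefficient lists:

```python
def chebyshev(n):
    # First n+1 Chebyshev polynomials T_0, ..., T_n as coefficient lists
    # [c0, c1, ...], via T_{k+1}(t) = 2 t T_k(t) - T_{k-1}(t).
    T = [[1], [0, 1]]
    for _ in range(2, n + 1):
        tTk = [0] + [2 * c for c in T[-1]]              # 2 t T_k
        prev = T[-2] + [0] * (len(tTk) - len(T[-2]))    # pad T_{k-1}
        T.append([a - b for a, b in zip(tTk, prev)])
    return T[:n + 1]

T = chebyshev(6)
```

The output reproduces the coefficients listed above, e.g. T[6] corresponds to 32t⁶ − 48t⁴ + 18t² − 1.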
[Six graphs of the Chebyshev polynomials on −1 ≤ t ≤ 1, plotted with vertical range −1.5 to 1.5, omitted.]
5.4.21. The Gram–Schmidt process will lead to the monic Chebyshev polynomials q_n(t), obtained by dividing each T_n(t) by its leading coefficient: q₀(t) = 1, q₁(t) = t, q₂(t) = t² − 1/2, q₃(t) = t³ − (3/4)t, q₄(t) = t⁴ − t² + 1/8, etc. This follows from the characterization of each q_n(t) as the unique monic polynomial of degree n that is orthogonal to all polynomials of degree < n under the weighted inner product, cf. (5.43); any other degree n polynomial with the same property must be a scalar multiple of q_n(t), or, equivalently, of T_n(t).
5.4.22. A basis for the solution set is given by e^x and e^{2x}. The Gram–Schmidt process yields the orthogonal basis e^x and e^{2x} − ( 2(e³ − 1)/(3(e² − 1)) ) e^x.
5.4.23. cos x, sin x, e^x form a basis for the solution space. Applying the Gram–Schmidt process, we find the orthogonal basis cos x, sin x, and e^x + (sinh π/π) cos x − (sinh π/π) sin x.
5.4.24. Starting with a system of linearly independent functions f_j^{(1)}(t) = f_j(t), j = 1, …, n, in an inner product space, we recursively compute the orthonormal system u₁(t), …, u_n(t) by setting
u_j = f_j^{(j)}/‖f_j^{(j)}‖,  f_k^{(j+1)} = f_k^{(j)} − 〈f_k^{(j)}, u_j〉 u_j,  for j = 1, …, n, k = j+1, …, n.
The algorithm leads to the same orthogonal polynomials.
♥ 5.4.25.
(a) First, when s = 0, t = −1, while when s = 1, t = 1. Since dt/ds > 0 for s > 0, the function is monotone increasing, with inverse s = +√( (t+1)/2 ).
(b) If p(t) is any polynomial, so is q(s) = p(2s² − 1). The formulas are q₀(s) = 1, q₁(s) = 2s² − 1, q₂(s) = 4s⁴ − 4s² + 2/3, q₃(s) = 8s⁶ − 12s⁴ + (24/5)s² − 2/5.
(c) No. For example, 〈q₀, q₁〉 = ∫₀¹ (2s² − 1) ds = −1/3. They are orthogonal with respect to the weighted inner product 〈F, G〉 = ∫₀¹ F(s)G(s) s ds = (1/4) ∫_{−1}^{1} f(t)g(t) dt, provided F(s) = f(2s² − 1), G(s) = g(2s² − 1).
5.4.26.
(a) By the change of variables formula for integrals, since ds = −e^{−t} dt,
〈f, g〉 = ∫₀^∞ f(t)g(t) e^{−t} dt = ∫₀¹ F(s)G(s) ds when f(t) = F(e^{−t}), g(t) = G(e^{−t}).
The change of variables does not map polynomials to polynomials.
(b) The resulting exponential functions E_k(t) = P̃_k(e^{−t}) are orthogonal with respect to the L² inner product on [0, ∞).
(c) The resulting logarithmic polynomials Q_k(s) = q_k(− log s) are orthogonal with respect to the L² inner product on [0, 1]. Note that 〈Q_j, Q_k〉 = ∫₀¹ q_j(− log s) q_k(− log s) ds is finite since the logarithmic singularity at s = 0 is integrable.
5.5.1. (a) v2,v4, (b) v3, (c) v2, (d) v2,v3, (e) v1, (f ) v1,v3,v4.
5.5.2.
(a) ( −1/3, 1/3, 1/3 )^T, (b) ( 4/7, −2/7, 6/7 )^T ≈ ( .5714, −.2857, .8571 )^T, (c) ( 7/9, 11/9, 1/9 )^T ≈ ( .7778, 1.2222, .1111 )^T, (d) ≈ ( −.23833, 3.07123, 3.35503 )^T.
5.5.3. (5/7) ( 3, 2, 1 )^T − (2/3) ( 2, −2, −2 )^T = ( 17/21, 58/21, 43/21 )^T ≈ ( .8095, 2.7619, 2.0476 )^T.
5.5.4. Orthogonal basis: ( −1, 2, 1 )^T, ( 3/2, 2, −5/2 )^T;
orthogonal projection: (2/3) ( −1, 2, 1 )^T + (4/5) ( 3/2, 2, −5/2 )^T = ( 8/15, 44/15, −4/3 )^T ≈ ( .5333, 2.9333, −1.3333 )^T.
5.5.5. (a) ( 11/21, 10/21, −2/7, −10/21 )^T, (b) ( −3/5, 6/5, 3/5, 6/5 )^T, (c) ( 2/3, 7/3, −1, 5/3 )^T, (d) ( 2/3, 7/3, 0, 5/3 )^T.
5.5.6. (i) (a) ( −1/5, 1/5, 1/5 )^T, (b) ( 10/19, −5/19, 15/19 )^T ≈ ( .5263, −.2632, .7895 )^T, (c) ( 15/17, 19/17, 1/17 )^T ≈ ( .88235, 1.11765, .05882 )^T, (d) ( 191/2425, 463/2425, −12/97 )^T ≈ ( .0788, .1909, −.1237 )^T.
(ii) (a) ( 0, 0, 0 )^T, (b) ( 5/19, −5/38, 15/38 )^T ≈ ( .2632, −.1316, .3947 )^T, (c) ( 13/22, 9/22, −1/22 )^T ≈ ( .5909, .4091, −.0455 )^T, (d) ( 553/14282, −1602/7141, 48/193 )^T ≈ ( .0387, −.2243, .2487 )^T.
5.5.7. ( 1.3, .5, .2,−.1 )T .
♥ 5.5.8.
(a) The entries of c = A^Tv are c_i = u_i^Tv = u_i · v, and hence, by (1.11), w = Pv = Ac = c₁u₁ + ⋯ + c_ku_k, reproducing the projection formula (5.63).
(b) (i) [ 1/2, 1/2 ; 1/2, 1/2 ], (ii) [ 4/9, −4/9, 2/9 ; −4/9, 4/9, −2/9 ; 2/9, −2/9, 1/9 ], (iii) [ 1/2, 0, 1/2 ; 0, 1, 0 ; 1/2, 0, 1/2 ],
(iv) [ 1/9, −2/9, 2/9, 0 ; −2/9, 8/9, 0, −2/9 ; 2/9, 0, 8/9, −2/9 ; 0, −2/9, −2/9, 1/9 ],
(v) [ 3/4, 1/4, 1/4, 1/4 ; 1/4, 3/4, −1/4, −1/4 ; 1/4, −1/4, 3/4, −1/4 ; 1/4, −1/4, −1/4, 3/4 ].
(c) P^T = (AA^T)^T = AA^T = P.
(d) The entries of A^TA are the inner products u_i · u_j, and hence, by orthonormality, A^TA = I. Thus, P² = (AA^T)(AA^T) = A I A^T = AA^T = P. Geometrically, w = Pv is the orthogonal projection of v onto the subspace W, i.e., the closest point. In particular, if w ∈ W already, then Pw = w. Thus, P²v = Pw = w = Pv for all v ∈ R^n, and hence P² = P.
(e) Note that P is the Gram matrix for A^T, and so, by Proposition 3.36, rank P = rank A^T = rank A.
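The properties in parts (c)–(d) are easy to confirm numerically. The sketch below builds P = AA^T for a single assumed unit column u = ( 2/3, −2/3, 1/3 )^T, a choice of mine that appears to reproduce matrix (b)(ii):

```python
def projection_matrix(cols):
    # P = A A^T, where `cols` lists the (orthonormal) columns of A;
    # P projects orthogonally onto their span.
    n = len(cols[0])
    return [[sum(u[i] * u[j] for u in cols) for j in range(n)]
            for i in range(n)]

P = projection_matrix([[2/3, -2/3, 1/3]])   # assumed u for illustration
```

As expected, P is symmetric and idempotent: P^T = P and P² = P.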
5.5.9. (a) If x, y are orthogonal to V, then 〈x, v〉 = 0 = 〈y, v〉 for every v ∈ V. Thus, 〈cx + dy, v〉 = c〈x, v〉 + d〈y, v〉 = 0, and hence cx + dy is orthogonal to every v ∈ V.
(b) ( −3/2, 3/4, 1, 0 )^T, ( −1/2, 3/4, 0, 1 )^T.
5.5.10. ( 1/2, −1/2, 2 )^T.
5.5.11. (a) ( −1/7, 0 )^T, (b) ( 9/14, 4/31 )^T, (c) ( −4/7, 1/30, 2/7 )^T.
5.5.12. ( 4/7, 2/7, 25/14, 17/14 )^T.
5.5.13. Orthogonal basis: ( 1, 0, 2, 1 )^T, ( 1, 1, 0, −1 )^T, ( 1/2, −1, 0, −1/2 )^T;
closest point = orthogonal projection = ( −2/3, 2, 2/3, 4/3 )^T.
5.5.14. Orthogonal basis: ( 1, 0, 2, 1 )^T, ( 5/4, 1, 1/2, −3/4 )^T, ( 15/22, −21/22, 3/11, −9/22 )^T;
closest point = orthogonal projection = ( −8/7, 2, 8/7, 16/7 )^T.
5.5.15.
(a) p₁(t) = 14 + (7/2)t, p₂(t) = p₃(t) = 14 + (7/2)t + (1/14)(t² − 2);
(b) p₁(t) = .285714 + 1.01429 t, p₂(t) = .285714 + 1.01429 t − .0190476 (t² − 4),
p₃(t) = .285714 + 1.01429 t − .0190476 (t² − 4) − .008333 (t³ − 7t);
(c) p₁(t) = 100 + (80/7)t, p₂(t) = p₃(t) = 100 + (80/7)t − (20/21)(t² − 4).
♦ 5.5.16.
(a) The key point is that, since the sample points are symmetric, $\overline{t^k} = 0$ whenever $k$ is odd. Thus,
$$\langle q_0, q_1\rangle = \frac{1}{n}\sum_{i=1}^n t_i = \overline{t} = 0, \qquad \langle q_0, q_2\rangle = \frac{1}{n}\sum_{i=1}^n \big(t_i^2 - \overline{t^2}\big) = \overline{t^2} - \overline{t^2} = 0,$$
$$\langle q_0, q_3\rangle = \frac{1}{n}\sum_{i=1}^n \Big( t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i \Big) = \overline{t^3} - \frac{\overline{t^4}}{\overline{t^2}}\,\overline{t} = 0, \qquad \langle q_1, q_2\rangle = \frac{1}{n}\sum_{i=1}^n t_i \big(t_i^2 - \overline{t^2}\big) = \overline{t^3} - \overline{t}\,\overline{t^2} = 0,$$
$$\langle q_1, q_3\rangle = \frac{1}{n}\sum_{i=1}^n t_i \Big( t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i \Big) = \overline{t^4} - \frac{\overline{t^4}}{\overline{t^2}}\,\overline{t^2} = 0,$$
$$\langle q_2, q_3\rangle = \frac{1}{n}\sum_{i=1}^n \big(t_i^2 - \overline{t^2}\big)\Big( t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i \Big) = \overline{t^5} - \overline{t^2}\,\overline{t^3} - \frac{\overline{t^4}}{\overline{t^2}}\,\overline{t^3} + \overline{t^4}\,\overline{t} = 0.$$
(b) $$q_4(t) = t^4 - \frac{\overline{t^6} - \overline{t^2}\,\overline{t^4}}{\overline{t^4} - \big(\overline{t^2}\big)^2}\,\big(t^2 - \overline{t^2}\big) - \overline{t^4}, \qquad \|q_4\|^2 = \overline{t^8} - \big(\overline{t^4}\big)^2 - \frac{\big(\overline{t^6} - \overline{t^2}\,\overline{t^4}\big)^2}{\overline{t^4} - \big(\overline{t^2}\big)^2}.$$
(c) p₄(t) = .3429 + .7357 t + .07381 (t² − 4) − .008333 (t³ − 7t) + .007197 ( t⁴ − (67/7)t² + 72/7 ).
5.5.17.
(a) p₄(t) = 14 + (7/2)t + (1/14)(t² − 2) − (5/12)( t⁴ − (31/7)t² + 72/35 );
(b) p₄(t) = .2857 + 1.0143 t − .019048 (t² − 4) − .008333 (t³ − 7t) + .011742 ( t⁴ − (67/7)t² + 72/7 );
(c) p₄(t) = 100 + (80/7)t − (20/21)(t² − 4) − (5/66)( t⁴ − (67/7)t² + 72/7 ).
5.5.18. Because, according to (5.65), the kth Gram-Schmidt vector belongs to the subspacespanned by the first k of the original basis vectors.
♥ 5.5.19.
(a) Since t_i = t₀ + ih, we have t̄ = t₀ + (1/2)nh, and so s_i = ( i − (1/2)n )h. In particular, s_{n−i} = ( (1/2)n − i )h = −s_i, proving symmetry of the points.
(b) Since p(t_i) = q(t_i − t̄) = q(s_i), the least squares errors coincide:
Σ_{i=1}^{n} [ q(s_i) − y_i ]² = Σ_{i=1}^{n} [ p(t_i) − y_i ]²,
and hence q(s) minimizes the former if and only if p(t) = q(t − t̄) minimizes the latter.
(c) p₁(t) = −2 + (92/35)( t − 7/2 ) = −56/5 + (92/35)t = −11.2 + 2.6286 t,
p₂(t) = −2 + (92/35)( t − 7/2 ) + (9/56)[ ( t − 7/2 )² − 35/12 ] = −97/10 + (421/280)t + (9/56)t²
 = −9.7 + 1.503 t + .1607 t².
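The shift s = t − t̄ used in part (b) is exactly what makes discrete least squares fitting decouple at symmetric sample points: the constant and linear coefficients can then be computed independently. A small sketch (the sample data below are my own, chosen so the answer is checkable, not the data of this exercise):

```python
def fit_line(ts, ys):
    # Least squares line q(s) = a + b s in the shifted variable
    # s = t - mean(t); with symmetric sample points, a and b decouple.
    n = len(ts)
    tbar = sum(ts) / n
    ss = [t - tbar for t in ts]
    a = sum(ys) / n                                   # <q0, y> / ||q0||^2
    b = sum(s * y for s, y in zip(ss, ys)) / sum(s * s for s in ss)
    return a, b, tbar                                 # p(t) = a + b (t - tbar)

# Assumed sample data lying exactly on y = 1 + 2 t.
a, b, tbar = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Here the fit recovers p(t) = 4 + 2(t − 3/2) = 1 + 2t, as it must for data on a line.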
♦ 5.5.20.
$$q_0(t) = 1, \qquad q_1(t) = t - \overline{t}, \qquad q_2(t) = t^2 - \frac{\overline{t^3} - \overline{t}\,\overline{t^2}}{\overline{t^2} - \overline{t}^2}\,\big(t - \overline{t}\big) - \overline{t^2},$$
$$\mathbf{q}_0 = \mathbf{t}^0, \qquad \mathbf{q}_1 = \mathbf{t}^1 - \overline{t}\,\mathbf{t}^0, \qquad \mathbf{q}_2 = \mathbf{t}^2 - \frac{\overline{t^3} - \overline{t}\,\overline{t^2}}{\overline{t^2} - \overline{t}^2}\,\big(\mathbf{t}^1 - \overline{t}\,\mathbf{t}^0\big) - \overline{t^2}\,\mathbf{t}^0,$$
$$\|\mathbf{q}_0\|^2 = 1, \qquad \|\mathbf{q}_1\|^2 = \overline{t^2} - \overline{t}^2, \qquad \|\mathbf{q}_2\|^2 = \overline{t^4} - \big(\overline{t^2}\big)^2 - \frac{\big(\overline{t^3} - \overline{t}\,\overline{t^2}\big)^2}{\overline{t^2} - \overline{t}^2}.$$
5.5.21. For simplicity, we assume ker A = 0. According to Exercise 5.3.33, orthogonalizing the basis vectors for rng A is the same as factoring A = QR, where the columns of Q are the orthonormal basis vectors, while R is a nonsingular upper triangular matrix. The formula for the coefficients c = ( c₁, c₂, …, c_n )^T of v = b in (5.63) is equivalent to the matrix formula c = Q^Tb. But this is not the least squares solution
x* = (A^TA)^{−1}A^Tb = (R^TQ^TQR)^{−1}R^TQ^Tb = (R^TR)^{−1}R^TQ^Tb = R^{−1}Q^Tb = R^{−1}c.
Thus, to obtain the least squares solution, John needs to multiply his result by R^{−1}.
143
♦ 5.5.22. Note that Q^TQ = I, while R is a nonsingular square matrix. Therefore, the least squares solution is
x* = (A^TA)^{−1}A^Tb = (R^TQ^TQR)^{−1}R^TQ^Tb = (R^TR)^{−1}R^TQ^Tb = R^{−1}Q^Tb.
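Computationally, x* = R^{−1}Q^Tb means: form c = Q^Tb, then solve Rx = c by Back Substitution; no inverse is ever built. A minimal sketch (the Q, R, b used for the check are synthetic illustrations of mine, not data from the exercises):

```python
def lstsq_qr(Q, R, b):
    # Least squares solution x = R^{-1} Q^T b of A x ~ b, where A = Q R,
    # Q (m x n) has orthonormal columns, and R (n x n) is upper triangular.
    m, n = len(Q), len(R)
    c = [sum(Q[i][j] * b[i] for i in range(m)) for j in range(n)]  # c = Q^T b
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                                 # back substitution
        x[i] = (c[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x

# Assumed toy factors: Q = first two standard basis columns of R^3.
x = lstsq_qr([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
             [[2.0, 1.0], [0.0, 3.0]],
             [2.0, 3.0, 99.0])
```

In the toy example, the residual component of b outside the column space (the 99) is simply discarded by Q^T, which is exactly the geometry of least squares.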
5.5.23. The solutions are, of course, the same:
(a) Q = [ .30151, .79455 ; .90453, −.06356 ; −.30151, .60386 ], R = [ 3.31662, −.90453 ; 0, 2.86039 ], x = ( .06667, .91111 )^T;
(b) Q = [ .8, −.43644 ; .4, .65465 ; .2, −.43644 ; .4, .43644 ], R = [ 5, 0 ; 0, 4.58258 ], x = ( −.04000, −.38095 )^T;
(c) Q = [ .53452, .61721, .57735 ; .80178, −.15430, −.57735 ; .26726, −.77152, .57735 ], R = [ 3.74166, .26726, −1.87083 ; 0, 1.38873, −3.24037 ; 0, 0, 1.73205 ], x = ( .66667, 1.66667, 1.00000 )^T;
(d) Q = [ .18257, .36515, .12910 ; .36515, −.18257, .90370 ; 0, .91287, .12910 ; −.91287, 0, .38730 ], R = [ 5.47723, −2.19089, 0 ; 0, 1.09545, −3.65148 ; 0, 0, 2.58199 ], x = ( .33333, 2.00000, .75000 )^T;
(e) Q = [ .57735, .51640, −.15811, −.20412 ; 0, .77460, .15811, .20412 ; .57735, −.25820, .47434, .61237 ; 0, 0, .79057, −.61237 ; .57735, −.25820, −.31623, −.40825 ], R = [ 1.73205, .57735, .57735, −.57735 ; 0, 1.29099, −.25820, 1.03280 ; 0, 0, 1.26491, −.31623 ; 0, 0, 0, 1.22474 ], x = ( .33333, 2.00000, −.33333, −1.33333 )^T.
♦ 5.5.24. The second method is more efficient! Suppose the system is Ax = b, where A is an m × n matrix. Constructing the normal equations requires mn² multiplications and (m − 1)n² ≈ mn² additions to compute A^T A, and an additional nm multiplications and n(m − 1) additions to compute A^T b. To solve the normal equations A^T A x = A^T b by Gaussian Elimination requires (1/3)n³ + n² − (1/3)n ≈ (1/3)n³ multiplications and (1/3)n³ + (1/2)n² − (5/6)n ≈ (1/3)n³ additions. On the other hand, to compute the A = QR decomposition by Gram–Schmidt requires (m + 1)n² ≈ mn² multiplications and (1/2)(2m + 1)n(n − 1) ≈ mn² additions. To compute c = Q^T b requires mn multiplications and n(m − 1) additions, while solving Rx = c by Back Substitution requires (1/2)n² + (1/2)n multiplications and (1/2)n² − (1/2)n additions. Thus, the first step requires about the same amount of work as forming the normal equations, and the second two steps are considerably more efficient than Gaussian Elimination.
♦ 5.5.25.
(a) If A = Q has orthonormal columns u1, ..., un, then
‖Qx⋆ − b‖² = ‖b‖² − ‖Q^T b‖² = Σ_{i=1}^{m} b_i² − Σ_{i=1}^{n} (u_i · b)².
(b) If the columns v1, ..., vn of A are orthogonal, then A^T A is a diagonal matrix with the squared norms ‖v_i‖² along its diagonal, and so
‖Ax⋆ − b‖² = ‖b‖² − b^T A (A^T A)^{-1} A^T b = Σ_{i=1}^{m} b_i² − Σ_{i=1}^{n} (v_i · b)² / ‖v_i‖².
♥ 5.5.26.(a) The orthogonal projection is w = Ax where x = (ATA)−1AT b is the least squares
solution to Ax = b, and so w = A(ATA)−1AT b = P b.
(b) If the columns of A are orthonormal, then ATA = I , and so P = AAT .
(c) Since Q has orthonormal columns, QTQ = I while R is invertible, so
P = A(ATA)−1AT = QR (RTQTQR)−1RTQT = QR (RTR)−1RTQT = QQT .
Note: in the rectangular case, the rows of Q are not necessarily orthonormal vectors,
and so QQT is not necessarily the identity matrix.
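Part (c) can be confirmed numerically: the projection matrix P = A(AᵀA)⁻¹Aᵀ coincides with QQᵀ from the QR factorization. A sketch with a made-up full-rank A:

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])  # hypothetical full-rank A
Q, R = np.linalg.qr(A)

P1 = A @ np.linalg.inv(A.T @ A) @ A.T
P2 = Q @ Q.T
assert np.allclose(P1, P2)        # the two formulas for P agree
assert np.allclose(P1 @ P1, P1)   # P is idempotent: a projection
assert np.allclose(P1.T, P1)      # and symmetric
```

Note that Q here is 3 × 2, so QQᵀ is a rank-2 projection, not the identity, exactly as the remark above warns.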
5.5.27.
(a) P = [ .25 −.25 −.35 .05 ; −.25 .25 .35 −.05 ; −.35 .35 .49 −.07 ; .05 −.05 −.07 .01 ], P v = ( .25, −.25, −.35, .05 )^T;
(b) P = [ 1/3 −1/3 0 1/3 ; −1/3 7/9 −2/9 1/9 ; 0 −2/9 1/9 −2/9 ; 1/3 1/9 −2/9 7/9 ], P v = ( 1/3, −1/3, 0, 1/3 )^T;
(c) P = [ .28 −.4 .2 .04 ; −.4 .6 −.2 −.2 ; .2 −.2 .4 −.4 ; .04 −.2 −.4 .72 ], P v = ( .28, −.4, .2, .04 )^T;
(d) P = [ 7/15 −2/5 4/15 2/15 ; −2/5 7/10 1/5 1/10 ; 4/15 1/5 13/15 −1/15 ; 2/15 1/10 −1/15 29/30 ], P v = ( 7/15, −2/5, 4/15, 2/15 )^T.
5.5.28. Both are the same quadratic polynomial: 1/5 + (4/7)(−1/2 + (3/2)t²) = −3/35 + (6/7)t².

5.5.29.
Quadratic: 1/5 + (2/5)(2t − 1) + (2/7)(6t² − 6t + 1) = 3/35 − (32/35)t + (12/7)t² = .08571 − .91429 t + 1.71429 t²;
Cubic: 1/5 + (2/5)(2t − 1) + (2/7)(6t² − 6t + 1) + (1/10)(20t³ − 30t² + 12t − 1) = −1/70 + (2/7)t − (9/7)t² + 2t³ = −.01429 + .2857 t − 1.2857 t² + 2t³.
5.5.30. 1.718282 + .845155 (2t − 1) + .139864 (6t² − 6t + 1) + .013931 (20t³ − 30t² + 12t − 1) = .99906 + 1.0183 t + .421246 t² + .278625 t³.
5.5.31. Linear: 1/4 + (9/20)(2t − 1) = −1/5 + (9/10)t; minimum value: 9/700 = .01286. Quadratic: 1/4 + (9/20)(2t − 1) + (1/4)(6t² − 6t + 1) = 1/20 − (3/5)t + (3/2)t²; minimum value: 1/2800 = .0003571. Cubic: 1/4 + (9/20)(2t − 1) + (1/4)(6t² − 6t + 1) + (1/20)(20t³ − 30t² + 12t − 1) = t³; minimum value: 0.
5.5.32. (a) They are both the same quadratic polynomial:
(2/π) + ((10π² − 120)/π³)(1 − (6/π)t + (6/π²)t²) = −120/π³ + 12/π + ((720 − 60π²)/π⁴)t − ((720 − 60π²)/π⁵)t² = −.050465 + 1.312236 t − .417698 t².
(b) [graph of the approximation over 0 ≤ t ≤ π] The maximum error is .0504655 at the ends t = 0, π.
♠ 5.5.33. .459698 + .427919 (2t − 1) − .0392436 (6t² − 6t + 1) − .00721219 (20t³ − 30t² + 12t − 1) = −.000252739 + 1.00475 t − .0190961 t² − .144244 t³.

♠ 5.5.34.
p(t) = 1.175201 + 1.103638 t + .357814 ((3/2)t² − 1/2) + .070456 ((5/2)t³ − (3/2)t) + .009965 ((35/8)t⁴ − (15/4)t² + 3/8) + .00109959 ((63/8)t⁵ − (35/4)t³ + (15/8)t) + .00009945 ((231/16)t⁶ − (315/16)t⁴ + (105/16)t² − 5/16)
= 1. + 1.00002 t + .500005 t² + .166518 t³ + .0416394 t⁴ + .00865924 t⁵ + .00143587 t⁶.
5.5.35.
(a) 3/2 − (10/3)(t − 3/4) + (35/4)(t² − (4/3)t + 2/5) = 15/2 − 15t + (35/4)t²; it gives the smallest value to ‖p(t) − 1/t‖² = ∫₀¹ (p(t) − 1/t)² t² dt among all quadratic polynomials p(t).
(b) 3/2 − (10/3)(t − 3/4) + (35/4)(t² − (4/3)t + 2/5) − (126/5)(t³ − (15/8)t² + (15/14)t − 5/28) = 12 − 42t + 56t² − (126/5)t³.
(c) [graphs of 1/t and the quadratic and cubic approximants on 0 < t ≤ 1]
(d) Both do a reasonable job approximating from t = .2 to 1, but can't keep close near the singularity at 0, owing to the small value of the weight function w(t) = t² there. The cubic does a marginally better job near the singularity.

♠ 5.5.36. Quadratic: .6215 + .3434(t − 1) − .07705(t² − 4t + 2) = .1240 + .6516 t − .07705 t²;
Cubic: .6215 + .3434(t − 1) − .07705(t² − 4t + 2) + .01238(t³ − 9t² + 18t − 6) = .0497 + .8744 t − .1884 t² + .01238 t³.
The accuracy is reasonable up until t = 2 for the quadratic and t = 4 for the cubic polynomial. [graph of the two approximants on 0 ≤ t ≤ 8]
5.6.1. (a) W⊥ has basis ( 1/3, 1, 0 )^T, ( −1/3, 0, 1 )^T; dim W⊥ = 2. (b) W⊥ has basis ( −1/2, −5/4, 1 )^T; dim W⊥ = 1. (c) W⊥ has basis ( −2, 1, 0 )^T, ( −3, 0, 1 )^T; dim W⊥ = 2. (d) W⊥ has basis ( 2, 1, 1 )^T; dim W⊥ = 1. (e) W⊥ = {0}; dim W⊥ = 0.
5.6.2. (a) ( 3, 4, −5 )^T; (b) ( 1/2, 1, 0 )^T, ( 3/2, 0, 1 )^T; (c) ( −1, −1, 1 )^T; (d) ( −1, 1, 0 )^T, ( 1, 0, 1 )^T.
5.6.3. (a) ( −1, 3, 2, 1 )^T; (b) ( 1/2, 1/4, 1, 0 )^T, ( −1, −1, 0, 1 )^T; (c) ( −1, 2/7, 1, 0 )^T, ( 0, 4/7, 0, 1 )^T; (d) ( 1, 0, 1, 0 )^T, ( 1/2, −7/4, 0, 1 )^T.
5.6.4. (a) w = ( 3/10, −1/10 )^T, z = ( 7/10, 21/10 )^T; (b) w = ( −1/5, 8/25, −31/25 )^T, z = ( −6/5, 42/25, −6/25 )^T; (c) w = ( 1/3, −1/3, −1/3 )^T, z = ( 2/3, 1/3, 1/3 )^T; (d) w = ( 2/7, −3/7, −1/7 )^T, z = ( 5/7, 3/7, 1/7 )^T; (e) w = ( 4/11, −1/11, 1/11, −2/11 )^T, z = ( 7/11, 1/11, −1/11, 13/11 )^T.
5.6.5. (a) Span of ( 2/3, 1, 0 )^T, ( −1, 0, 1 )^T; dim W⊥ = 2. (b) Span of ( −1/28, −15/8, 1 )^T; dim W⊥ = 1. (c) Span of ( −4, 1, 0 )^T, ( −9, 0, 1 )^T; dim W⊥ = 2. (d) Span of ( 6, 3/2, 1 )^T; dim W⊥ = 1. (e) W⊥ = {0}, and dim W⊥ = 0.
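Orthogonal complements like those in 5.6.1–5.6.5 (in the dot product case) can be computed as null spaces: if the rows of M span W, then W⊥ = ker M. A minimal sketch using the SVD (the spanning vector below is hypothetical, not from the exercises):

```python
import numpy as np

# Rows of M span W; ker M is then the orthogonal complement W-perp.
M = np.array([[1.0, 2.0, -1.0, 3.0]])   # hypothetical spanning vector(s)

U, s, Vt = np.linalg.svd(M)
rank = int(np.sum(s > 1e-10))
W_perp = Vt[rank:].T                    # columns: orthonormal basis of W-perp

assert W_perp.shape[1] == M.shape[1] - rank   # dim W-perp = n - dim W
assert np.allclose(M @ W_perp, 0)             # every basis vector lies in ker M
```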
5.6.6. For the weighted inner product, the orthogonal complement W⊥ is the set of all vectors v = ( x, y, z, w )^T that satisfy the linear system
⟨v, w1⟩ = x + 2y + 4w = 0, ⟨v, w2⟩ = x + 2y + 3z − 8w = 0.
A non-orthogonal basis for W⊥ is z1 = ( −2, 1, 0, 0 )^T, z2 = ( −4, 0, 4, 1 )^T. Applying Gram–Schmidt, the corresponding orthogonal basis is y1 = ( −2, 1, 0, 0 )^T, y2 = ( −4/3, −4/3, 4, 1 )^T. We decompose v = w + z, where w = ( 13/43, 13/43, 4/43, 1/43 )^T ∈ W, z = ( 30/43, −13/43, −4/43, −1/43 )^T ∈ W⊥. Here, since w1, w2 are no longer orthogonal, it is easier to compute z first and then subtract w = v − z.
5.6.7. (a) ⟨p, q⟩ = ∫_{−1}^{1} p(x) q(x) dx = 0 for all q(x) = a + bx + cx², or, equivalently, ∫_{−1}^{1} p(x) dx = ∫_{−1}^{1} x p(x) dx = ∫_{−1}^{1} x² p(x) dx = 0. Writing p(x) = a + bx + cx² + dx³ + ex⁴, the orthogonality conditions require 2a + (2/3)c + (2/5)e = 0, (2/3)b + (2/5)d = 0, (2/3)a + (2/5)c + (2/7)e = 0.
(b) Basis: t³ − (3/5)t, t⁴ − (6/7)t² + 3/35; dim W⊥ = 2; (c) the preceding basis is orthogonal.
♦ 5.6.8. If u,v ∈W⊥, so 〈u ,w 〉 = 〈v ,w 〉 = 0 for all w ∈W , then
〈 cu + dv ,w 〉 = c〈u ,w 〉+ d〈v ,w 〉 = 0 also, and so cu + dv ∈W⊥.
5.6.9. (a) If w ∈ W ∩ W⊥ then w ∈ W⊥ must be orthogonal to every vector in W and sow ∈ W is orthogonal to itself, which implies w = 0. (b) If w ∈ W then w is orthogonal to
every z ∈W⊥ and so w ∈ (W⊥)⊥.
5.6.10. (a) The only element orthogonal to all v ∈ V is 0, and hence V ⊥ contains only the zero
vector. (b) Every v ∈ V is orthogonal to 0, and so belongs to 0⊥.
5.6.11. If z ∈ W2⊥ then ⟨z, w⟩ = 0 for every w ∈ W2. In particular, every w ∈ W1 ⊂ W2, and hence z is orthogonal to every vector w ∈ W1. Thus, z ∈ W1⊥, proving W2⊥ ⊂ W1⊥.
5.6.12.
(a) We are given that dim W + dim Z = n and W ∩ Z = {0}. Now, dim W⊥ = n − dim W, dim Z⊥ = n − dim Z, and hence dim W⊥ + dim Z⊥ = n. Furthermore, if v ∈ W⊥ ∩ Z⊥ then v is orthogonal to all vectors in both W and Z, and hence also orthogonal to any vector of the form w + z for w ∈ W and z ∈ Z. But since W, Z are complementary, every vector in R^n can be written as w + z, and hence v is orthogonal to all vectors in R^n, which implies v = 0.
(b) [sketch of complementary lines W, Z in the plane together with their orthogonal complements W⊥, Z⊥]
♦ 5.6.13. Suppose v ∈ (W⊥)⊥. Then we write v = w + z where w ∈W, z ∈W⊥. By assumption,
for every y ∈ W⊥, we must have 0 = 〈v ,y 〉 = 〈w ,y 〉 + 〈 z ,y 〉 = 〈 z ,y 〉. In particular,
when y = z, this implies ‖ z ‖2 = 0 and hence z = 0 which proves v = w ∈W .
5.6.14. Every w ∈ W can be written as w = Σ_{i=1}^{k} a_i w_i; every z ∈ Z can be written as z = Σ_{j=1}^{l} b_j z_j. Then, using bilinearity, ⟨w, z⟩ = Σ_{i=1}^{k} Σ_{j=1}^{l} a_i b_j ⟨w_i, z_j⟩ = 0, and hence W and Z are orthogonal subspaces.
♦ 5.6.15.(a) We are given that 〈wi ,wj 〉 = 0 for all i 6= j between 1 and m and between m+1 and n.
It is also 0 if 1 ≤ i ≤ m and m+1 ≤ j ≤ n since every vector in W is orthogonal to every
vector in W⊥. Thus, the vectors w1, . . . ,wn are non-zero and mutually orthogonal, andso form an orthogonal basis.
(b) This is clear: w ∈ W since it is a linear combination of the basis vectors w1, . . . ,wm,
similarly z ∈W⊥ since it is a linear combination of the basis vectors wm+1, . . . ,wn.
♥ 5.6.16.
(a) Let V = {α + βx} = P^(1) be the two-dimensional subspace of linear polynomials. Every u(x) ∈ C⁰(a, b) can be written as u(x) = α + βx + w(x), where β = (u(b) − u(a))/(b − a), α = u(a) − βa, while w(a) = w(b) = 0, so w ∈ W. Moreover, a linear polynomial α + βx vanishes at a and b if and only if it is identically zero, and so V satisfies the conditions for it to be a complementary subspace to W.
(b) The only continuous function which is orthogonal to all functions in W is the zero function. Indeed, suppose ⟨v, w⟩ = ∫_a^b v(x) w(x) dx = 0 for all w ∈ W, and v(c) > 0 for some a < c < b. Then, by continuity, v(x) > 0 for |x − c| < δ for some δ > 0. Choose w(x) ∈ W so that w(x) ≥ 0, w(c) > 0, but w(x) ≡ 0 for |x − c| ≥ δ. Then v(x) w(x) ≥ 0, with v(c) w(c) > 0, and so ∫_a^b v(x) w(x) dx > 0, which is a contradiction. The same proof works for v(c) < 0; only the inequalities are reversed. Therefore v(x) = 0 for all a < x < b, and, by continuity, v(x) ≡ 0 for all x. Thus, W⊥ = {0}, and there is no orthogonal complementary subspace.
5.6.17. Note: To show orthogonality of two subspaces, it suffices to check orthogonality of their respective basis vectors.
(a) (i) Range: ( 1, 2 )^T; cokernel: ( −2, 1 )^T; corange: ( 1, −2 )^T; kernel: ( 2, 1 )^T;
(ii) ( 1, 2 )^T · ( −2, 1 )^T = 0; (iii) ( 1, −2 )^T · ( 2, 1 )^T = 0.
(b) (i) Range: ( 5, 1, 0 )^T, ( 0, 2, 2 )^T; cokernel: ( 1/5, −1, 1 )^T; corange: ( 5, 0 )^T, ( 0, 2 )^T; kernel: {0};
(ii) ( 5, 1, 0 )^T · ( 1/5, −1, 1 )^T = ( 0, 2, 2 )^T · ( 1/5, −1, 1 )^T = 0; (iii) ( 5, 0 )^T · 0 = ( 0, 2 )^T · 0 = 0.
(c) (i) Range: ( 0, −1, −2 )^T, ( 1, 0, 3 )^T; cokernel: ( −3, −2, 1 )^T; corange: ( −1, 0, −3 )^T, ( 0, 1, 2 )^T; kernel: ( −3, −2, 1 )^T;
(ii) ( 0, −1, −2 )^T · ( −3, −2, 1 )^T = ( 1, 0, 3 )^T · ( −3, −2, 1 )^T = 0; (iii) ( −1, 0, −3 )^T · ( −3, −2, 1 )^T = ( 0, 1, 2 )^T · ( −3, −2, 1 )^T = 0.
(d) (i) Range: ( 1, −1, 0 )^T, ( 2, 1, 3 )^T; cokernel: ( −1, −1, 1 )^T; corange: ( 1, 2, 0, 1 )^T, ( 0, 3, 3, 2 )^T; kernel: ( 2, −1, 1, 0 )^T, ( 1/3, −2/3, 0, 1 )^T;
(ii) ( 1, −1, 0 )^T · ( −1, −1, 1 )^T = ( 2, 1, 3 )^T · ( −1, −1, 1 )^T = 0; (iii) each corange basis vector has zero dot product with each kernel basis vector.
(e) (i) Range: ( 3, 1, 5 )^T, ( 1, 1, 2 )^T; cokernel: ( −3, −1, 2 )^T; corange: ( 3, 1, 4, 2, 7 )^T, ( 0, 1, 1, −1, 1 )^T; kernel: ( −1, −1, 1, 0, 0 )^T, ( −1, 1, 0, 1, 0 )^T, ( −2, −1, 0, 0, 1 )^T;
(ii) ( 3, 1, 5 )^T · ( −3, −1, 2 )^T = ( 1, 1, 2 )^T · ( −3, −1, 2 )^T = 0; (iii) each corange basis vector has zero dot product with each kernel basis vector.
(f) (i) Range: ( 1, −2, −3, 1 )^T, ( 3, 1, 5, −4 )^T; cokernel: ( −1, −2, 1, 0 )^T, ( 1, 1, 0, 1 )^T; corange: ( 1, 3, 0, −2 )^T, ( 0, 7, 2, −1 )^T; kernel: ( 6/7, −2/7, 1, 0 )^T, ( 11/7, 1/7, 0, 1 )^T;
(ii) each range basis vector has zero dot product with each cokernel basis vector; (iii) each corange basis vector has zero dot product with each kernel basis vector.
(g) (i) Range: ( −1, 2, −3, 1, −2 )^T, ( 2, −5, 2, −3, −5 )^T; cokernel: ( −11, −4, 1, 0, 0 )^T, ( −1, −1, 0, 1, 0 )^T, ( −20, −9, 0, 0, 1 )^T; corange: ( −1, 2, 2, −1 )^T, ( 0, 0, 1, 0 )^T; kernel: ( 2, 1, 0, 0 )^T, ( −1, 0, 0, 1 )^T;
(ii) each range basis vector has zero dot product with each cokernel basis vector; (iii) each corange basis vector has zero dot product with each kernel basis vector.
5.6.18.
(a) The compatibility condition is (2/3)b1 + b2 = 0, and so the cokernel basis is ( 2/3, 1 )^T.
(b) The compatibility condition is −3b1 + b2 = 0, and so the cokernel basis is ( −3, 1 )^T.
(c) There are no compatibility conditions, and so the cokernel is {0}.
(d) The compatibility conditions are −2b1 − b2 + b3 = 2b1 − 2b2 + b4 = 0, and so the cokernel basis is ( −2, −1, 1, 0 )^T, ( 2, −2, 0, 1 )^T.
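Compatibility conditions like these come from a cokernel basis, which can be computed numerically as the null space of Aᵀ. A sketch (the singular 2 × 2 matrix A below is a made-up example, not one of the exercise's matrices):

```python
import numpy as np

def cokernel_basis(A, tol=1e-10):
    """Orthonormal basis of coker A = ker A^T, via the SVD of A^T."""
    U, s, Vt = np.linalg.svd(A.T)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

# Hypothetical singular system: the rows (1,2) and (3,6) are dependent.
A = np.array([[1.0, 2.0], [3.0, 6.0]])
Z = cokernel_basis(A)
# Ax = b is compatible iff b is orthogonal to every cokernel basis vector.
assert np.allclose(A.T @ Z, 0)
```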
5.6.19. (a) ( 1, −2, 2, −1 )^T, ( 0, 0, 1, 3 )^T; (b) ( 10, −21, −12 )^T, ( −1, 12, 21 )^T;
(c) ( 1, −2, −1 )^T = (10/99)( 10, −21, −12 )^T + (1/99)( −1, 12, 21 )^T,
( −2, 4, 2 )^T = −(20/99)( 10, −21, −12 )^T − (2/99)( −1, 12, 21 )^T,
( 2, −3, 0 )^T = (7/33)( 10, −21, −12 )^T + (4/33)( −1, 12, 21 )^T,
( −1, 5, 7 )^T = −(7/99)( 10, −21, −12 )^T + (29/99)( −1, 12, 21 )^T.
5.6.20.
(a) Cokernel basis: ( 2, −1, 1 )^T; compatibility condition: 2a − b + c = 0;
(b) cokernel basis: ( −1, 1, 1 )^T; compatibility condition: −a + b + c = 0;
(c) cokernel basis: ( −3, 1, 1, 0 )^T, ( 2, −5, 0, 1 )^T; compatibility conditions: −3b1 + b2 + b3 = 2b1 − 5b2 + b4 = 0;
(d) cokernel basis: ( −1, −1, 1, 0 )^T, ( 2, −1, 0, 1 )^T; compatibility conditions: −a − b + c = 2a − b + d = 0.
5.6.21.
(a) z = ( 1/2, 0, −1/2 )^T, w = ( 1/2, 0, 1/2 )^T = −(3/2)( 1, −2, 1 )^T + ( 2, −3, 2 )^T;
(b) z = ( 1/3, 1/3, −1/3 )^T, w = ( 2/3, −1/3, 1/3 )^T = −(1/3)( 1, 1, 2 )^T − ( −1, 0, −1 )^T;
(c) z = ( 14/17, −1/17, −4/17, −5/17 )^T, w = ( 3/17, 1/17, 4/17, 5/17 )^T = (1/51)( 1, −1, 0, 3 )^T + (4/51)( 2, 1, 3, 3 )^T;
(d) z = ( 1/2, 1/3, −1/6, −1/3, 0 )^T, w = ( 1/2, −1/3, 1/6, 1/3, 0 )^T = −(1/6)( −3, 2, −1, −2, 0 )^T.
5.6.22.
(a) (i) Fredholm requires that the cokernel basis ( 1/2, 1 )^T be orthogonal to the right hand side ( −6, 3 )^T; (ii) the general solution is x = −3 + 2y with y free; (iii) the minimum norm solution is x = −3/5, y = 6/5.
(b) (i) Fredholm requires that the cokernel basis ( 27, −13, 5 )^T be orthogonal to the right hand side ( −1, 1, 8 )^T; (ii) there is a unique solution: x = −2, y = 1; (iii) by uniqueness, the minimum norm solution is the same: x = −2, y = 1.
(c) (i) Fredholm requires that the cokernel basis ( −1, 3 )^T be orthogonal to the right hand side ( 12, 4 )^T; (ii) the general solution is x = 2 + (1/2)y − (3/2)z with y, z free; (iii) the minimum norm solution is x = 4/7, y = −2/7, z = 6/7.
(d) (i) Fredholm requires that the cokernel basis ( −11, 3, 7 )^T be orthogonal to the right hand side ( 3, 11, 0 )^T; (ii) the general solution is x = −3 + z, y = 2 − 2z with z free; (iii) the minimum norm solution is x = −11/6, y = −1/3, z = 7/6.
(e) (i) Fredholm requires that the cokernel basis ( −10, −9, 7, 0 )^T, ( 6, 4, 0, 7 )^T be orthogonal to the right hand side ( −8, 5, −5, 4 )^T; (ii) the general solution is x1 = 1 − t, x2 = 3 + 2t, x3 = t with t free; (iii) the minimum norm solution is x1 = 11/6, x2 = 4/3, x3 = −5/6.
(f) (i) Fredholm requires that the cokernel basis ( −13, 5, 1 )^T be orthogonal to the right hand side ( 5, 13, 0 )^T; (ii) the general solution is x = 1 + y + w, z = 2 − 2w with y, w free; (iii) the minimum norm solution is x = 9/11, y = −9/11, z = 8/11, w = 7/11.
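Minimum norm solutions like those above can be found mechanically with the pseudoinverse, which always returns the solution lying in corng A. A sketch on a made-up compatible underdetermined system (the exercise's coefficient matrices are not reproduced in this manual):

```python
import numpy as np

# Hypothetical compatible system: x + y + z = 3.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([3.0])

x_min = np.linalg.pinv(A) @ b    # minimum norm solution, lies in corng A
assert np.allclose(A @ x_min, b)             # it solves the system
assert np.allclose(x_min, [1.0, 1.0, 1.0])   # and has the smallest norm
```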
5.6.23. (a) ( 1, −1, 0, 2 )^T, ( −1/3, 1/3, 1, 1/3 )^T; (b) ( 1, 1, 0, 0 )^T, ( −1, 1, −1, 1 )^T;
(c) yes, because of the orthogonality of the corange and kernel; see Exercise 5.6.15.
5.6.24. If A is symmetric, ker A = ker A^T = coker A, and so this is an immediate consequence of Theorem 5.55.
♦ 5.6.25. Since rngA = span v1, . . . ,vn = V , the vector w is orthogonal to V if and only if
w ∈ V ⊥ = (rngA)⊥ = cokerA.
♦ 5.6.26. Since ker A and corng A are complementary subspaces of R^n, we can write any x ∈ R^n as a combination x = v + z with v = c1 v1 + · · · + cr vr ∈ corng A and z ∈ ker A. Then y = Ax = Av = c1 Av1 + · · · + cr Avr, and hence every y ∈ rng A can be written as a linear combination of Av1, ..., Avr, which proves that they span rng A. To prove linear independence, suppose 0 = c1 Av1 + · · · + cr Avr = A(c1 v1 + · · · + cr vr). This implies that c1 v1 + · · · + cr vr ∈ ker A. However, since they are orthogonal complements, the only vector in both corng A and ker A is the zero vector, and so c1 v1 + · · · + cr vr = 0, which, since v1, ..., vr are a basis, implies c1 = · · · = cr = 0.
5.6.27. False. The resulting basis is almost never orthogonal.
5.6.28. False. See Example 5.60 for a counterexample.
♦ 5.6.29. If f ∉ rng K, then there exists x ∈ ker K = coker K such that x^T f = x · f = b ≠ 0. But then p(sx) = −2s x^T f + c = −2bs + c can be made arbitrarily large negative by choosing s = tb with t ≫ 0. Thus, p(x) has no minimum value.
♦ 5.6.30. The result is not true if one defines the cokernel and corange in terms of the transposed matrix A^T. Rather, one needs to replace A^T by its Hermitian transpose A† = conj(A)^T, cf. Exercise 5.3.25, and define corng A = rng A†, coker A = ker A†. (These are the complex conjugates of the spaces defined using the transpose.) With these modifications, both Theorem 5.54 and the Fredholm Alternative Theorem 5.55 are true for complex matrices.
5.7.1.
(a) (i) c0 = 0, c1 = −(1/2) i, c2 = c−2 = 0, c3 = c−1 = (1/2) i; (ii) (1/2) i e^{−ix} − (1/2) i e^{ix} = sin x.
(b) (i) c0 = (1/2)π, c1 = (2/9)π, c2 = 0, c3 = c−3 = (1/18)π, c4 = c−2 = 0, c5 = c−1 = (2/9)π;
(ii) (1/18)π e^{−3ix} + (2/9)π e^{−ix} + (1/2)π + (2/9)π e^{ix} = (1/2)π + (4/9)π cos x + (1/18)π cos 3x − (1/18)π i sin 3x.
(c) (i) c0 = 1/3, c1 = (3 − √3 i)/12, c2 = (1 − √3 i)/12, c3 = c−3 = 0, c4 = c−2 = (1 + √3 i)/12, c5 = c−1 = (3 + √3 i)/12;
(ii) ((1 + √3 i)/12) e^{−2ix} + ((3 + √3 i)/12) e^{−ix} + 1/3 + ((3 − √3 i)/12) e^{ix} + ((1 − √3 i)/12) e^{2ix} = 1/3 + (1/2) cos x + (1/(2√3)) sin x + (1/6) cos 2x + (1/(2√3)) sin 2x.
(d) (i) c0 = −1/8, c1 = −1/8 + ((1 + √2)/4) i, c2 = −1/8, c3 = −1/8 − ((1 − √2)/4) i, c4 = c−4 = −1/8, c5 = c−3 = −1/8 + ((1 − √2)/4) i, c6 = c−2 = −1/8, c7 = c−1 = −1/8 − ((1 + √2)/4) i;
(ii) −(1/8) e^{−4ix} + (−1/8 + ((1 − √2)/4) i) e^{−3ix} − (1/8) e^{−2ix} + (−1/8 − ((1 + √2)/4) i) e^{−ix} − 1/8 + (−1/8 + ((1 + √2)/4) i) e^{ix} − (1/8) e^{2ix} + (−1/8 − ((1 − √2)/4) i) e^{3ix} = −1/8 − (1/4) cos x − ((√2 + 1)/2) sin x − (1/4) cos 2x − (1/4) cos 3x − ((√2 − 1)/2) sin 3x − (1/8) cos 4x + (1/8) i sin 4x.
5.7.2.
(a) (i) f0 = f3 = 2, f1 = −1, f2 = −1; (ii) e^{−ix} + e^{ix} = 2 cos x.
(b) (i) f0 = f5 = 1, f1 = 1 − √5, f2 = 1 + √5, f3 = 1 + √5, f4 = 1 − √5;
(ii) e^{−2ix} − e^{−ix} + 1 − e^{ix} + e^{2ix} = 1 − 2 cos x + 2 cos 2x.
(c) (i) f0 = f5 = 6, f1 = 2 + 2e^{2πi/5} + 2e^{−4πi/5} = 1 + .7265 i, f2 = 2 + 2e^{2πi/5} + 2e^{4πi/5} = 1 + 3.0777 i, f3 = 2 + 2e^{−2πi/5} + 2e^{−4πi/5} = 1 − 3.0777 i, f4 = 2 + 2e^{−2πi/5} + 2e^{4πi/5} = 1 − .7265 i;
(ii) 2e^{−2ix} + 2 + 2e^{ix} = 2 + 2 cos x + 2 i sin x + 2 cos 2x − 2 i sin 2x.
(d) (i) f0 = f1 = f2 = f4 = f5 = 0, f3 = 6; (ii) 1 − e^{ix} + e^{2ix} − e^{3ix} + e^{4ix} − e^{5ix} = 1 − cos x + cos 2x − cos 3x + cos 4x − cos 5x + i(−sin x + sin 2x − sin 3x + sin 4x − sin 5x).
♠ 5.7.3. [graphs of the three trigonometric interpolants on 0 ≤ x ≤ 2π] The interpolants are accurate along most of the interval, but there is a noticeable problem near the endpoints x = 0, 2π. (In Fourier theory, [16, 47], this is known as the Gibbs phenomenon.)
♠ 5.7.4. [for each of parts (a)–(f), three graphs of the trigonometric interpolants on 0 ≤ x ≤ 2π]
5.7.5.
(a) [picture of the sixth roots of unity ζ6⁰ = 1, ζ6, ζ6², ζ6³ = −1, ζ6⁴, ζ6⁵ on the unit circle]
(b) ζ6 = 1/2 + (√3/2) i;
(c) ζ6² = −1/2 + (√3/2) i, ζ6³ = −1, ζ6⁴ = −1/2 − (√3/2) i, ζ6⁵ = 1/2 − (√3/2) i, so 1 + ζ6 + ζ6² + ζ6³ + ζ6⁴ + ζ6⁵ = 0.
(d) We are adding three pairs of unit vectors pointing in opposite directions, so the sum cancels out. Equivalently, 1/6 times the sum is the center (of mass) of the regular hexagon, which is at the origin.
♦ 5.7.6.
(a) The roots all have modulus |ζ_k| = 1 and phase ph ζ_k = 2πk/n. The angle between successive roots is 2π/n. The sides meet at an angle of π − 2π/n.
(b) Every root has modulus |z|^{1/n} and the phases are (1/n)(ph z + 2πk), so the angle between successive roots is 2π/n, and the sides continue to meet at an angle of π − 2π/n. The n-gon has radius ρ = |z|^{1/n}. Its first vertex makes an angle of ϕ = (1/n) ph z with the horizontal.
[figure: the sixth roots ρ of z lie at the vertices of a regular hexagon, whose first vertex makes an angle with the horizontal that is 1/6 the angle made by the point z]
♦ 5.7.7. (a) (i) i, −i; (ii) e^{2πki/5} for k = 1, 2, 3 or 4; (iii) e^{2πki/9} for k = 1, 2, 4, 5, 7 or 8;
(b) e^{2πki/n} whenever k and n have no common factors, i.e., k is relatively prime to n.

5.7.8. (a) Yes, the discrete Fourier coefficients are real for all n. (b) A function f(x) has real discrete Fourier coefficients if and only if f(x_k) = f(2π − x_k) on the sample points x0, ..., x_{n−1}. In particular, this holds when f(x) = f(2π − x).

♥ 5.7.9. (a) In view of (2.13), formula (5.91) is equivalent to matrix multiplication f = F_n c, where F_n = ( ω0 ω1 ... ω_{n−1} ) is the n × n matrix whose columns are the sampled exponential vectors (5.90); its (j, k) entry is ζ^{jk}, where ζ = e^{2πi/n}. In particular,
F2 = [ 1 1 ; 1 −1 ],
F3 = [ 1 1 1 ; 1 −1/2 + (√3/2)i −1/2 − (√3/2)i ; 1 −1/2 − (√3/2)i −1/2 + (√3/2)i ],
F4 = [ 1 1 1 1 ; 1 i −1 −i ; 1 −1 1 −1 ; 1 −i −1 i ],
F8 = [ 1 1 1 1 1 1 1 1 ;
 1 (1+i)/√2 i (−1+i)/√2 −1 (−1−i)/√2 −i (1−i)/√2 ;
 1 i −1 −i 1 i −1 −i ;
 1 (−1+i)/√2 −i (1+i)/√2 −1 (1−i)/√2 i (−1−i)/√2 ;
 1 −1 1 −1 1 −1 1 −1 ;
 1 (−1−i)/√2 i (1−i)/√2 −1 (1+i)/√2 −i (−1+i)/√2 ;
 1 −i −1 i 1 −i −1 i ;
 1 (1−i)/√2 −i (−1−i)/√2 −1 (−1+i)/√2 i (1+i)/√2 ].
(b) Clearly, if f = F_n c, then c = F_n^{-1} f. Moreover, formula (5.91) implies that the (i, j) entry of F_n^{-1} is (1/n) ζ^{−ij}, which is 1/n times the complex conjugate of the (j, i) entry of F_n.
(c) By part (b), U_n^{-1} = √n F_n^{-1} = (1/√n) F_n† = U_n†.
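Parts (b) and (c) are easy to verify numerically for, say, n = 8: the inverse of Fₙ is (1/n) times its conjugate transpose, and Uₙ = Fₙ/√n is unitary.

```python
import numpy as np

def dft_matrix(n):
    """F_n with (j,k) entry zeta^{jk}, zeta = exp(2 pi i / n)."""
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(2j * np.pi * j * k / n)

n = 8
F = dft_matrix(n)
# Part (b): F_n^{-1} = (1/n) * conjugate transpose of F_n.
assert np.allclose(np.linalg.inv(F), F.conj().T / n)
# Part (c): U_n = F_n / sqrt(n) is unitary.
U = F / np.sqrt(n)
assert np.allclose(U @ U.conj().T, np.eye(n))
```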
♠ 5.7.10. [plots: original function, 11 mode compression, 21 mode compression] The average absolute errors are .018565 and .007981; the maximal errors are .08956 and .04836, so the 21 mode compression is about twice as accurate.
♠ 5.7.11.
(a) [plots: original function, 11 mode compression, 21 mode compression] The largest errors are at the endpoints, which represent discontinuities. The average absolute errors are .32562 and .19475; the maximal errors are 2.8716 and 2.6262, so, except near the ends, the 21 mode compression is slightly less than twice as accurate.
(b) [plots: original function, 11 mode compression, 21 mode compression] The error is much less and more uniform than in cases with discontinuities. The average absolute errors are .02575 and .002475; the maximal errors are .09462 and .013755, so the 21 mode compression is roughly 10 times as accurate.
(c) [plots: original function, 11 mode compression, 21 mode compression] The only noticeable error is at the endpoints and the corner, x = π. The average absolute errors are .012814 and .003612; the maximal errors are .06334 and .02823, so the 21 mode compression is 3 times as accurate.
♣ 5.7.12. l = 4, 27, 57.

♣ 5.7.13. Very few are needed. In fact, if you take too many modes, you do worse! For example, if ε = .1, [plots of the noisy signal and of the effect of retaining 2l + 1 = 3, 5, 11, 21 modes] only the first three give reasonable results. When ε = .5 the effect is even more pronounced: [corresponding plots for ε = .5]
♠ 5.7.14. For noise varying between ±1, and 256 = 2⁸ sample points, the errors are

# modes        3       5      7      9      11     13
average error  .8838   .1491  .0414  .0492  .0595  .0625
maximal error  1.5994  .2687  .1575  .1357  .1771  .1752

Thus, the optimal denoising is at 2l + 1 = 7 or 9 modes, after which the errors start to get worse. Sampling on fewer points, say 64 = 2⁶, leads to similar results with slightly worse performance:

# modes        3       5      7      9      11     13
average error  .8855   .1561  .0792  .0994  .1082  .1088
maximal error  1.5899  .3348  .1755  .3145  .3833  .4014

On the other hand, tripling the size of the error, to vary between ±3, leads to similar, and marginally worse, performance:

# modes        3       5      7      9      11     13
average error  .8830   .1636  .1144  .3148  .1627  .1708
maximal error  1.6622  .4306  .1755  .3143  .3398  .4280

Note: the numbers will vary slightly each time the random number generator is run.
♣ 5.7.15. The "compressed" function differs significantly from the original signal. [three plots: the function, that obtained by retaining the first l = 11 modes, and then the first l = 21 modes]
5.7.16. True for the odd case (5.103), but false for the even case (5.104). Since ω_{−k} is the complex conjugate of ω_k, when f is real, c_{−k} is the complex conjugate of c_k, and so the terms c_{−k} e^{−ikx} + c_k e^{ikx} = 2 Re(c_k e^{ikx}) combine to form a real function. Moreover, the constant term c0, which equals the average sample value of f, is also real. Thus, all terms in the odd sum pair up into real functions. However, in the even version, the initial term c_{−m} e^{−imx} has no match, and remains, in general, complex. See Exercise 5.7.1 for examples.
♠ 5.7.17.
(a) f = ( 0, 1/2, 1, 3/2 )^T, c(0) = ( 0, 1, 1/2, 3/2 )^T, c(1) = ( 1/2, −1/2, 1, −1/2 )^T, c = c(2) = ( 3/4, −1/4 + (1/4)i, −1/4, −1/4 − (1/4)i )^T;
(b) f = ( 0, 1/√2, 1, 1/√2, 0, −1/√2, −1, −1/√2 )^T, c(0) = ( 0, 0, 1, −1, 1/√2, −1/√2, 1/√2, −1/√2 )^T, c(1) = ( 0, 0, 0, 1, 0, 1/√2, 0, 1/√2 )^T, c(2) = ( 0, −(1/2)i, 0, (1/2)i, 0, (1 − i)/(2√2), 0, (1 + i)/(2√2) )^T, c = c(3) = ( 0, −(1/2)i, 0, 0, 0, 0, 0, (1/2)i )^T;
(c) f = ( π, (3/4)π, (1/2)π, (1/4)π, 0, (1/4)π, (1/2)π, (3/4)π )^T, c(0) = ( π, 0, (1/2)π, (1/2)π, (3/4)π, (1/4)π, (1/4)π, (3/4)π )^T, c(1) = ( (1/2)π, (1/2)π, (1/2)π, 0, (1/2)π, (1/4)π, (1/2)π, −(1/4)π )^T, c(2) = ( (1/2)π, (1/4)π, 0, (1/4)π, (1/2)π, ((1 + i)/8)π, 0, ((1 − i)/8)π )^T, c = c(3) = ( (1/2)π, ((√2 + 1)/(8√2))π, 0, ((√2 − 1)/(8√2))π, 0, ((√2 − 1)/(8√2))π, 0, ((√2 + 1)/(8√2))π )^T;
(d) f = ( 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )^T,
c(0) = ( 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0 )^T,
c(1) = ( .5, .5, .5, .5, .5, .5, 0, 0, .5, .5, .5, .5, .5, .5, 0, 0 )^T,
c(2) = ( .5, .25 − .25i, 0, .25 + .25i, .25, .25, .25, .25, .5, .25 − .25i, 0, .25 + .25i, .25, .25, .25, .25 )^T,
c(3) = ( .375, .2134 − .2134i, −.125i, .0366 + .0366i, .125, .0366 − .0366i, .125i, .2134 + .2134i, .375, .2134 − .2134i, −.125i, .0366 + .0366i, .125, .0366 − .0366i, .125i, .2134 + .2134i )^T,
c = c(4) = ( .375, .1644 − .2461i, −.0442 − .1067i, .0422 + .0084i, .0625 − .0625i, −.0056 − .0282i, .0442 + .0183i, .0490 − .0327i, 0, .0490 + .0327i, .0442 − .0183i, −.0056 + .0282i, .0625 + .0625i, .0422 − .0084i, −.0442 + .1067i, .1644 + .2461i )^T.
♠ 5.7.18.
(a) c = ( 1, −1, 1, −1 )^T, f(0) = ( 1, 1, −1, −1 )^T, f(1) = ( 2, 0, −2, 0 )^T, f = f(2) = ( 0, 0, 4, 0 )^T;
(b) c = ( 2, 2, 0, −1, 2, −1, 0, −1 )^T, f(0) = ( 2, 2, 0, 0, 2, −1, −1, −1 )^T, f(1) = ( 4, 0, 0, 0, 1, 3, −2, 0 )^T, f(2) = ( 4, 0, 4, 0, −1, 3, 3, 3 )^T, f = f(3) = ( 3, (3/√2)(1 + i), 4 + 3i, (3/√2)(−1 + i), 5, −(3/√2)(1 + i), 4 − 3i, (3/√2)(1 − i) )^T.
♥ 5.7.19.
(a) M0 is the 8 × 8 permutation matrix that reorders ( c0, c1, ..., c7 )^T into the bit-reversed order ( c0, c4, c2, c6, c1, c5, c3, c7 )^T;
M1 = diag( B, B, B, B ), where B = [ 1 1 ; 1 −1 ];
M2 = diag( C, C ), where C = [ 1 0 1 0 ; 0 1 0 i ; 1 0 −1 0 ; 0 1 0 −i ];
M3 = [ I D ; I −D ], where I is the 4 × 4 identity and D = diag( 1, (1+i)/√2, i, (−1+i)/√2 ).
(b) Because, by composition, f = M3 M2 M1 M0 c. On the other hand, according to Exercise 5.7.9, f = F8 c, and so M3 M2 M1 M0 c = F8 c. Since this holds for all c, the coefficient matrices must be equal: F8 = M3 M2 M1 M0.
(c) c(0) = N0 f, c(1) = N1 c(0), c(2) = N2 c(1), c = c(3) = N3 c(2), where N0 = M0 and Nj = (1/2) times the complex conjugate of Mj, for j = 1, 2, 3, so that F8^{-1} = (1/8) F8† = N3 N2 N1 N0.
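The factorization F₈ = M₃M₂M₁M₀ can be checked numerically. Rather than typing the 8 × 8 entries, the sketch below builds the bit-reversal permutation and the three butterfly stages generically (the helper names are ours, not the book's) and compares the product with the DFT matrix:

```python
import numpy as np

n = 8
zeta = np.exp(2j * np.pi / n)

def bit_reversal(n):
    """Permutation matrix sending (c0,...,c7) to bit-reversed order."""
    P = np.zeros((n, n))
    bits = n.bit_length() - 1
    for j in range(n):
        r = int(format(j, f"0{bits}b")[::-1], 2)
        P[j, r] = 1.0
    return P

def butterfly(n, m):
    """Stage combining blocks of size m into blocks of size 2m."""
    B = np.zeros((n, n), dtype=complex)
    w = np.exp(2j * np.pi / (2 * m))          # twiddle factor for this stage
    for blk in range(0, n, 2 * m):
        for j in range(m):
            B[blk + j, blk + j] = 1
            B[blk + j, blk + m + j] = w ** j
            B[blk + m + j, blk + j] = 1
            B[blk + m + j, blk + m + j] = -(w ** j)
    return B

# Product of the permutation and the three butterfly stages...
F = butterfly(n, 4) @ butterfly(n, 2) @ butterfly(n, 1) @ bit_reversal(n)
# ...equals the full DFT matrix F_8 with entries zeta^{jk}.
j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
F8 = zeta ** (j * k)
assert np.allclose(F, F8)
```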
Solutions — Chapter 6
6.1.1. (a) K = [ 3 −2 ; −2 3 ]; (b) u = ( 18/5, 17/5 )^T = ( 3.6, 3.4 )^T; (c) the first mass has moved the farthest; (d) e = ( 18/5, −1/5, −17/5 )^T = ( 3.6, −.2, −3.4 )^T, so the first spring has stretched the most, while the third spring experiences the most compression.

6.1.2. (a) K = [ 3 −1 ; −1 2 ]; (b) u = ( 11/5, 13/5 )^T = ( 2.2, 2.6 )^T; (c) the second mass has moved the farthest; (d) e = ( 11/5, 2/5, −13/5 )^T = ( 2.2, .4, −2.6 )^T, so the first spring has stretched the most, while the third spring experiences even more compression.

6.1.3. For 6.1.1: (a) K = [ 3 −2 ; −2 2 ]; (b) u = ( 7, 17/2 )^T = ( 7.0, 8.5 )^T; (c) the second mass has moved the farthest; (d) e = ( 7, 3/2 )^T = ( 7.0, 1.5 )^T, so the first spring has stretched the most.
For 6.1.2: (a) K = [ 3 −1 ; −1 1 ]; (b) u = ( 7/2, 13/2 )^T = ( 3.5, 6.5 )^T; (c) the second mass has moved the farthest; (d) e = ( 7/2, 3 )^T = ( 3.5, 3. )^T, so the first spring has stretched slightly farther.
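Exercise 6.1.1 can be reproduced numerically from the stiffness matrix K = AᵀCA. The spring constants c = (1, 2, 1) and the force f = (4, 3) are inferred here from the stated K and u, so treat them as assumptions rather than data quoted from the exercise:

```python
import numpy as np

c = np.array([1.0, 2.0, 1.0])   # assumed spring constants, both ends fixed
A = np.array([[1.0, 0.0],       # reduced incidence matrix: e = A u
              [-1.0, 1.0],
              [0.0, -1.0]])
K = A.T @ np.diag(c) @ A
assert np.allclose(K, [[3.0, -2.0], [-2.0, 3.0]])   # matches the stated K

f = np.array([4.0, 3.0])        # assumed external force
u = np.linalg.solve(K, f)       # equilibrium displacements
e = A @ u                       # spring elongations
assert np.allclose(u, [3.6, 3.4])
assert np.allclose(e, [3.6, -0.2, -3.4])
```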
6.1.4.(a) u = ( 1, 3, 3, 1 )T , e = ( 1, 2, 0,−2,−1 )T . The solution is unique since K is invertible.
(b) Now u = ( 2, 6, 7.5, 7.5 )T , e = ( 2, 4, 1.5, 0 )T . The masses have all moved farther, andthe springs are elongated more; in this case, no springs are compressed.
6.1.5. (a) Since e1 = u1, ej = uj − uj−1 for 2 ≤ j ≤ n, while en+1 = −un, so

    e1 + · · · + en+1 = u1 + (u2 − u1) + (u3 − u2) + · · · + (un − un−1) − un = 0.

Alternatively, note that z = ( 1, 1, . . . , 1 )^T ∈ coker A, and hence z · e = e1 + · · · + en+1 = 0 since e = Au ∈ rng A.
(b) Now there are only n springs, and so

    e1 + · · · + en = u1 + (u2 − u1) + (u3 − u2) + · · · + (un − un−1) = un.
Thus, the average elongation (e1 + · · · + en)/n = un/n equals the displacement of the last mass divided by the number of springs.
♦ 6.1.6. Since the stiffness matrix K is symmetric, so is its inverse K^(−1). The basis vector ei represents a unit force on the ith mass only; the resulting displacement is u = K^(−1) ei, which is the ith column of K^(−1). Thus, the (j, i) entry of K^(−1) is the displacement of the jth mass when subject to a unit force on the ith mass. Since K^(−1) is a symmetric matrix, this is equal to its (i, j) entry, which, for the same reason, is the displacement of the ith mass when subject to a unit force on the jth mass.
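This reciprocity can be verified numerically. The sketch below assumes, purely for illustration, a three-mass chain with both ends fixed and unit spring constants, so that K is the standard tridiagonal stiffness matrix (this particular configuration is not taken from the exercise):

```python
import numpy as np

# Illustrative stiffness matrix: three masses, both ends fixed, unit springs.
K = np.array([[ 2., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  2.]])
Kinv = np.linalg.inv(K)

# Symmetry of K implies symmetry of K^{-1}.
assert np.allclose(Kinv, Kinv.T)

# Displacement of mass 3 under a unit force on mass 1 equals the
# displacement of mass 1 under a unit force on mass 3.
u_from_force_on_0 = Kinv @ np.array([1., 0., 0.])
u_from_force_on_2 = Kinv @ np.array([0., 0., 1.])
assert np.isclose(u_from_force_on_0[2], u_from_force_on_2[0])
```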
♣ 6.1.7. [The solution consists of plots of displacements and elongations for a chain of 100 masses in six cases; only the case titles survive in this transcript:]
Top and bottom support; constant force: [plots]
Top and bottom support; linear force: [plots]
Top and bottom support; quadratic force: [plots]
Top support only; constant force: [plots]
Top support only; linear force: [plots]
Top support only; quadratic force: [plots]
6.1.8. (a) For maximum displacement of the bottom mass, the springs should be arranged from weakest at the top to strongest at the bottom, so c1 = c = 1, c2 = c′ = 2, c3 = c′′ = 3.
(b) In this case, the order giving maximum displacement of the bottom mass is c1 = c = 2, c2 = c′ = 3, c3 = c′′ = 1.
♣ 6.1.9. (a) When the bottom end is free, for maximum displacement of the bottom mass, the springs should be arranged from weakest at the top to strongest at the bottom. In fact, the ith elongation is ei = (n − i + 1)/ci. The displacement of the bottom mass is the sum

    un = Σ_{i=1}^n ei = Σ_{i=1}^n (n − i + 1)/ci

of the elongations of all the springs above it, and achieves its maximum value if and only if c1 ≤ c2 ≤ · · · ≤ cn.
(b) In this case, the weakest spring should be at the bottom, while the remaining springs are arranged in order from second weakest at the top to strongest just above the last mass. A proof that this is best would be interesting...
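The claim in part (a), that un = Σ (n − i + 1)/ci is maximized exactly when the spring constants increase from top to bottom, can be confirmed by brute force over all orderings; the spring stiffnesses 1, 2, 3, 4 in this sketch are illustrative assumptions:

```python
from itertools import permutations

def bottom_displacement(c):
    # u_n = e_1 + ... + e_n with e_i = (n - i + 1)/c_i (1-based indexing),
    # so the weight n - i + 1 multiplies 1/c_i; here i runs 0-based.
    n = len(c)
    return sum((n - i) / ci for i, ci in enumerate(c))

constants = [1.0, 2.0, 3.0, 4.0]          # illustrative spring stiffnesses
best = max(permutations(constants), key=bottom_displacement)
assert list(best) == sorted(constants)    # weakest spring at the top wins
```

This is an instance of the rearrangement inequality: the largest weight n − i + 1 should be paired with the smallest stiffness.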
6.1.10. The sub-diagonal entries of L are l_{i,i−1} = −(i − 1)/i, while the diagonal entries of D are d_{ii} = (i + 1)/i.
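These formulas can be checked by running the elimination directly; the sketch below assumes the matrix in question is the tridiagonal stiffness matrix with 2's on the diagonal and −1's off the diagonal (unit springs, both ends fixed):

```python
import numpy as np

n = 6
# Assumed matrix: tridiagonal stiffness matrix with both ends fixed.
K = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# Tridiagonal K = L D L^T by elimination: one multiplier per row.
d = np.zeros(n)
l = np.zeros(n)                  # l[i] holds L[i, i-1] for i >= 1
d[0] = K[0, 0]
for i in range(1, n):
    l[i] = K[i, i-1] / d[i-1]
    d[i] = K[i, i] - l[i]**2 * d[i-1]

# 1-based formulas: d_ii = (i+1)/i and l_{i,i-1} = -(i-1)/i.
for i in range(n):
    assert np.isclose(d[i], (i + 2) / (i + 1))
for i in range(1, n):
    assert np.isclose(l[i], -i / (i + 1))
```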
♥ 6.1.11. (a) Since y = Au, we have y ∈ rng A = corng A^T. Thus, according to Theorem 5.59, y has minimal Euclidean norm among all solutions to A^T y = f.
(b) To find the minimal norm solution to A^T y = f, we proceed as in Chapter 5, and append the condition that y is orthogonal to ker A^T = coker A. In the particular case of Example 6.1,

    A^T = [ 1 −1  0  0 ]
          [ 0  1 −1  0 ]
          [ 0  0  1 −1 ] ,

and ker A^T is spanned by z = ( 1, 1, 1, 1 )^T. To find the minimal norm solution, we solve the enlarged system

    [ 1 −1  0  0 ]       [ 0 ]
    [ 0  1 −1  0 ] y  =  [ 1 ]
    [ 0  0  1 −1 ]       [ 0 ]
    [ 1  1  1  1 ]       [ 0 ]

obtained by appending the compatibility condition z · y = 0, whose solution y = ( 1/2, 1/2, −1/2, −1/2 )^T reproduces the stress in the example.
6.1.12. Regular Gaussian Elimination reduces them to

    [ 2 −1   0  ]        [ 2 −1   0  ]
    [ 0 3/2 −1  ]   and  [ 0 3/2 −1  ]
    [ 0  0  4/3 ]        [ 0  0  1/3 ] ,

respectively. Since all three pivots are positive, the matrices are positive definite.
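The pivot test used here can be run mechanically; this sketch assumes the two matrices are the tridiagonal stiffness matrices of a chain with, respectively, both ends fixed and the bottom end free:

```python
import numpy as np

def pivots(A):
    # Pivots from regular Gaussian Elimination (no row interchanges).
    U = np.array(A, dtype=float)
    for j in range(len(U)):
        for i in range(j + 1, len(U)):
            U[i] -= (U[i, j] / U[j, j]) * U[j]
    return np.diag(U)

K1 = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # assumed: both ends fixed
K2 = [[2, -1, 0], [-1, 2, -1], [0, -1, 1]]   # assumed: bottom end free
p1, p2 = pivots(K1), pivots(K2)
assert np.allclose(p1, [2, 3/2, 4/3]) and np.allclose(p2, [2, 3/2, 1/3])
assert (p1 > 0).all() and (p2 > 0).all()     # all pivots positive
```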
6.1.13. Denoting the gravitational force by g:
(a) p(u) = 1/2 ( u1 u2 u3 ) [ 2 −1 0 ; −1 2 −1 ; 0 −1 1 ] ( u1, u2, u3 )^T − ( u1 u2 u3 ) ( g, g, g )^T
    = u1^2 − u1 u2 + u2^2 − u2 u3 + 1/2 u3^2 − g ( u1 + u2 + u3 ).
(b) p(u) = 1/2 ( u1 u2 u3 u4 ) [ 2 −1 0 0 ; −1 2 −1 0 ; 0 −1 2 −1 ; 0 0 −1 2 ] ( u1, u2, u3, u4 )^T − ( u1 u2 u3 u4 ) ( g, g, g, g )^T
    = u1^2 − u1 u2 + u2^2 − u2 u3 + u3^2 − u3 u4 + u4^2 − g ( u1 + u2 + u3 + u4 ),
(c) p(u) = 1/2 ( u1 u2 u3 u4 ) [ 2 −1 0 0 ; −1 2 −1 0 ; 0 −1 2 −1 ; 0 0 −1 1 ] ( u1, u2, u3, u4 )^T − ( u1 u2 u3 u4 ) ( g, g, g, g )^T
    = u1^2 − u1 u2 + u2^2 − u2 u3 + u3^2 − u3 u4 + 1/2 u4^2 − g ( u1 + u2 + u3 + u4 ).
6.1.14.
(a) p(u) = 1/2 ( u1 u2 ) [ 3 −2 ; −2 3 ] ( u1, u2 )^T − ( u1 u2 ) ( 4, 3 )^T
    = 3/2 u1^2 − 2 u1 u2 + 3/2 u2^2 − 4 u1 − 3 u2, so p(u*) = p( 3.6, 3.4 ) = −12.3.
(b) For instance, p(1, 0) = −2.5, p(0, 1) = −1.5, p(3, 3) = −12.
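The minimization principle behind parts (a) and (b) can be sketched numerically: the equilibrium solution of K u = f minimizes the quadratic function p(u) = 1/2 u^T K u − u^T f, and any other trial point gives a strictly larger value.

```python
import numpy as np

K = np.array([[3., -2.], [-2., 3.]])
f = np.array([4., 3.])

def p(u):
    # Quadratic energy function p(u) = (1/2) u^T K u - f . u
    u = np.asarray(u, dtype=float)
    return 0.5 * u @ K @ u - f @ u

u_star = np.linalg.solve(K, f)      # the equilibrium configuration
assert np.allclose(u_star, [3.6, 3.4])
assert np.isclose(p(u_star), -12.3)
for trial in ([1, 0], [0, 1], [3, 3]):
    assert p(trial) > p(u_star)     # every other point has larger energy
```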
6.1.15. (a) p(u) = 3/4 u1^2 − 1/2 u1 u2 + 7/12 u2^2 − 2/3 u2 u3 + 7/12 u3^2 − 1/2 u3 u4 + 3/4 u4^2 − u2 − u3, so p(u*) = p( 1, 3, 3, 1 ) = −3.
(b) For instance, p(1, 0, 0, 0) = p(0, 0, 0, 1) = .75, p(0, 1, 0, 0) = p(0, 0, 1, 0) = −.4167.
6.1.16. (a) Two masses, both ends fixed, c1 = 2, c2 = 4, c3 = 2, f = ( −1, 3 )^T;
equilibrium: u* = ( .3, .7 )^T.
(b) Two masses, top end fixed, c1 = 4, c2 = 6, f = ( 0, −2 )^T;
equilibrium: u* = ( −1/2, −5/6 )^T = ( −.5, −.8333 )^T.
(c) Three masses, top end fixed, c1 = 1, c2 = 3, c3 = 5, f = ( 1, 1, −1 )^T;
equilibrium: u* = ( 1, 1, 4/5 )^T = ( 1, 1, .8 )^T.
(d) Four masses, both ends fixed, c1 = 3, c2 = 1, c3 = 1, c4 = 1, c5 = 3, f = ( −1, 0, 2, 0 )^T;
equilibrium: u* = ( −.0606, .7576, 1.5758, .3939 )^T.
6.1.17. In both cases, the homogeneous system Au = 0 requires 0 = u1 = u2 = · · · = un, and so ker A = {0}, proving linear independence of its columns.
6.1.18. This is an immediate consequence of Exercise 4.2.9.
♥ 6.1.19. (a) When only the top end is supported, the potential energy is lowest when the springs are arranged from weakest at the top to strongest at the bottom: c1 = c = 1, c2 = c′ = 2, c3 = c′′ = 3, with energy −17/3 = −5.66667 under a unit gravitational force.
(b) When both ends are fixed, the potential energy is minimized when either the springs are in the order c1 = 1, c2 = 3, c3 = 2, or the reverse order c1 = 2, c2 = 3, c3 = 1, both of which have energy −15/22 = −.681818 under a unit gravitational force.
6.1.20. True. The potential energy function (6.16) uniquely determines the symmetric stiffness matrix K and the external force vector f. According to (6.12), (6.15), the off-diagonal entries of K determine the individual spring constants c2, . . . , cn of all but the first and (if there is one) last springs. But once we know c2 and cn, the remaining one or two constants, c1 and cn+1, are uniquely prescribed by the (1, 1) and (n, n) entries of K. If cn+1 = 0, then the bottom end is not attached to a support. Thus, the potential energy uniquely prescribes the entire mass–spring chain.
6.2.1. (a)–(e) [The answers are drawings of the corresponding digraphs, omitted from this transcript.]
6.2.2. (a) A = [ 1 −1 0 0 ; 1 0 −1 0 ; 1 0 0 −1 ; 0 1 −1 0 ; 0 1 0 −1 ];
(b) [ 3 −1 −1 ; −1 3 −1 ; −1 −1 2 ] ( u1, u2, u3 )^T = ( 3, 0, 0 )^T.
(c) u = ( 15/8, 9/8, 3/2 )^T = ( 1.875, 1.125, 1.5 )^T;
(d) y = v = Au = ( 3/4, 3/8, 15/8, −3/8, 9/8 )^T = ( .75, .375, 1.875, −.375, 1.125 )^T.
(e) The bulb will be brightest when connected to wire 3, which has the most current flowing through it.
6.2.3. The reduced incidence matrix is A* = [ 1 −1 ; 1 0 ; 1 0 ; 0 1 ; 0 1 ], and the equilibrium equations are [ 3 −1 ; −1 3 ] u = ( 3, 0 )^T, with solution u = ( 9/8, 3/8 )^T = ( 1.125, .375 )^T; the resulting currents are y = v = Au = ( 3/4, 9/8, 9/8, 3/8, 3/8 )^T = ( .75, 1.125, 1.125, .375, .375 )^T. Now, wires 2 and 3 both have the most current. Wire 1 is unchanged; the current in wire 2 has increased; the currents in wires 3 and 5 have decreased; and the current in wire 4 has reversed direction.
6.2.4. (a) A = [ 1 −1 0 0 0 ; 0 1 −1 0 0 ; 1 0 −1 0 0 ; 1 0 0 −1 0 ; 0 1 0 −1 0 ; 0 0 1 0 −1 ; 0 0 0 1 −1 ];
(b) u = ( 34/35, 23/35, 19/35, 16/35 )^T = ( .9714, .6571, .5429, .4571 )^T;
y = ( 11/35, 4/35, 3/7, 9/35, 1/5, 19/35, 16/35 )^T = ( .3143, .1143, .4286, .2571, .2000, .5429, .4571 )^T;
(c) wire 6.
6.2.5. (a) Same incidence matrix; (b) u = ( .4714,−.3429, .0429,−.0429 )T ;
y = ( .8143,−.3857, .4286, .2571,−.3000, .0429,−.0429 )T ; (c) wire 1.
♠ 6.2.6. None.
♠ 6.2.7. There is no current on the two wires connecting the same poles of the batteries (positive-positive and negative-negative) and 2.5 amps along all the other wires.
♠ 6.2.8.(a) The potentials remain the same, but the currents are all twice as large.
(b) The potentials are u = (−4.1804, 3.5996,−2.7675,−2.6396, .8490, .9376,−2.0416, 0. )T ,while the currents are
y = ( 1.2200,−.7064,−.5136, .6876, .5324,−.6027,−.1037,−.4472,−.0664, .0849, .0852,−.1701 )T .
♣ 6.2.9. Resistors 3 and 4 should be on the battery wire and the opposite wire, in either order. Resistors 1 and 6 are connected to one end of resistor 3, while resistors 2 and 5 are connected to its other end; also, resistors 1 and 5 are connected to one end of resistor 4, while resistors 2 and 6 are connected to its other end. Once the wires are labeled, there are 8 possible configurations. The current through the light bulb is .4523.
♣ 6.2.10. (a) For n = 2, the potentials are

    [ 1/16  1/8  1/16 ]   [ .0625  .125  .0625 ]
    [ 1/8   3/8  1/8  ] = [ .125   .375  .125  ]
    [ 1/16  1/8  1/16 ]   [ .0625  .125  .0625 ] .

The currents along the horizontal wires are

    [ −1/16  −1/16  1/16  1/16 ]   [ −.0625  −.0625  .0625  .0625 ]
    [ −1/8   −1/4   1/4   1/8  ] = [ −.125   −.25    .25    .125  ]
    [ −1/16  −1/16  1/16  1/16 ]   [ −.0625  −.0625  .0625  .0625 ] ,

where all wires are oriented from left to right, so the currents are all going away from the center. The currents in the vertical wires are given by the transpose of the matrix.

For n = 3, the potentials are

    [ .0288  .0577  .0769  .0577  .0288 ]
    [ .0577  .125   .1923  .125   .0577 ]
    [ .0769  .1923  .4423  .1923  .0769 ]
    [ .0577  .125   .1923  .125   .0577 ]
    [ .0288  .0577  .0769  .0577  .0288 ] .

The currents along the horizontal wires are

    [ −.0288  −.0288  −.0192  .0192  .0288  .0288 ]
    [ −.0577  −.0673  −.0673  .0673  .0673  .0577 ]
    [ −.0769  −.1153  −.25    .25    .1153  .0769 ]
    [ −.0577  −.0673  −.0673  .0673  .0673  .0577 ]
    [ −.0288  −.0288  −.0192  .0192  .0288  .0288 ] ,

where all wires are oriented from left to right, so the currents are all going away from the center. The currents in the vertical wires are given by the transpose of the matrix.

For n = 4, the potentials are

    [ .0165  .0331  .0478  .0551  .0478  .0331  .0165 ]
    [ .0331  .0680  .1029  .125   .1029  .0680  .0331 ]
    [ .0478  .1029  .1710  .2390  .1710  .1029  .0478 ]
    [ .0551  .125   .2390  .4890  .2390  .125   .0551 ]
    [ .0478  .1029  .1710  .2390  .1710  .1029  .0478 ]
    [ .0331  .0680  .1029  .125   .1029  .0680  .0331 ]
    [ .0165  .0331  .0478  .0551  .0478  .0331  .0165 ] .

The currents along the horizontal wires are

    [ −.0165  −.0165  −.0147  −.0074  .0074  .0147  .0165  .0165 ]
    [ −.0331  −.0349  −.0349  −.0221  .0221  .0349  .0349  .0331 ]
    [ −.0478  −.0551  −.0680  −.0680  .0680  .0680  .0551  .0478 ]
    [ −.0551  −.0699  −.1140  −.25    .25    .1140  .0699  .0551 ]
    [ −.0478  −.0551  −.0680  −.0680  .0680  .0680  .0551  .0478 ]
    [ −.0331  −.0349  −.0349  −.0221  .0221  .0349  .0349  .0331 ]
    [ −.0165  −.0165  −.0147  −.0074  .0074  .0147  .0165  .0165 ] ,

where all wires are oriented from left to right, so the currents are all going away from the center source. The currents in the vertical wires are given by the transpose of the matrix.

As n → ∞ the potentials approach a limit, which is, in fact, the fundamental solution to the Dirichlet boundary value problem for Laplace's equation on the square, [47, 59]. The horizontal and vertical currents tend to the gradient of the fundamental solution. But this, of course, is the result of a more advanced analysis beyond the scope of this text. [Graphs of the potentials and horizontal currents for n = 2, 3, 4, 10 omitted from this transcript.]
6.2.11. This is an immediate consequence of Theorem 5.59, which states that the minimum norm solution to A^T y = f is characterized by the condition y ∈ corng A^T = rng A. But solving the system A^T Au = f results in y = Au ∈ rng A.
6.2.12. (a) (i) u = ( 2, 1, 1, 0 )^T, y = ( 1, 0, 1 )^T; (ii) u = ( 3, 2, 1, 1, 0 )^T, y = ( 1, 1, 0, 1 )^T; (iii) u = ( 3, 2, 1, 1, 1, 0 )^T, y = ( 1, 1, 0, 0, 1 )^T; (iv) u = ( 3, 2, 2, 1, 1, 0 )^T, y = ( 1, 0, 1, 0, 1 )^T; (v) u = ( 3, 2, 2, 1, 1, 1, 0 )^T, y = ( 1, 0, 1, 0, 0, 1 )^T.
(b) In general, the current only goes through the wires directly connecting the top and bottom nodes. The potential at a node is equal to the number of wires transmitting the current that are between it and the grounded node.
6.2.13.
(i) u = ( 3/2, 1/2, 0, 0 )^T, y = ( 1, 1/2, 1/2 )^T;
(ii) u = ( 5/2, 3/2, 1/2, 0, 0 )^T, y = ( 1, 1, 1/2, 1/2 )^T;
(iii) u = ( 7/3, 4/3, 1/3, 0, 0, 0 )^T, y = ( 1, 1, 1/3, 1/3, 1/3 )^T;
(iv) u = ( 8/5, 3/5, 0, 1/5, 0, 0 )^T, y = ( 1, 3/5, 2/5, 1/5, 1/5 )^T;
(v) u = ( 11/7, 4/7, 0, 1/7, 0, 0 )^T, y = ( 1, 4/7, 3/7, 1/7, 1/7, 1/7 )^T.
6.2.14. According to Exercise 2.6.8(b), a tree with n nodes has n − 1 edges. Thus, the reduced incidence matrix A* is square, of size (n − 1) × (n − 1), and is nonsingular since the tree is connected.
6.2.15. (a) True, since they satisfy the same system of equilibrium equations Ku = −A^T C b = f.
(b) False, because the currents with the batteries are, by (6.37), y = C v = C Au + C b, while for the current sources they are y = C v = C Au.
6.2.16. In general, if v1, . . . , vm are the rows of the (reduced) incidence matrix A, then the resistivity matrix is

    K = A^T C A = Σ_{i=1}^m ci vi^T vi.

(This relies on the fact that C is a diagonal matrix.) In the situation described in the problem, two rows of the incidence matrix are the same, v1 = v2 = v, and so their contribution to the sum will be c1 v^T v + c2 v^T v = (c1 + c2) v^T v = c v^T v, which is the same contribution as a single wire between the two vertices with conductance c = c1 + c2. The combined resistance is

    R = 1/c = 1/(c1 + c2) = 1/( 1/R1 + 1/R2 ) = R1 R2 / ( R1 + R2 ).
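The final formula is the familiar rule for resistors in parallel, sketched here as a one-line function:

```python
def parallel(R1, R2):
    # Conductances of wires joining the same two nodes add: c = c1 + c2,
    # so the combined resistance is R = 1/c = R1*R2/(R1 + R2).
    return R1 * R2 / (R1 + R2)

assert parallel(2.0, 2.0) == 1.0          # two equal resistors halve R
assert abs(parallel(1.0, 3.0) - 0.75) < 1e-12
```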
6.2.17. (a) If f are the current sources at the nodes and b the battery terms, then the nodal voltage potentials satisfy A^T C Au = f − A^T C b.
(b) By linearity, the combined potentials (currents) are obtained by adding the potentials (currents) due to the batteries and those resulting from the current sources.
♦ 6.2.18. The resistivity matrix K* is symmetric, and so is its inverse. The (i, j) entry of (K*)^(−1) is the ith entry of uj = (K*)^(−1) ej, which is the potential at the ith node due to a unit current source at the jth node. By symmetry, this equals the (j, i) entry, which is the potential at the jth node due to a unit current source at the ith node.
6.2.19. If the graph has k connected subgraphs, then there are k independent compatibilityconditions on the unreduced equilibrium equations Ku = f . The conditions are that thesum of the current sources at the nodes on every connected subgraph must be equal tozero.
6.3.1. 8 cm
6.3.2. The bar will be stress-free provided the vertical force is 1.5 times the horizontal force.
6.3.3. (a) For a unit horizontal force on the two nodes, the displacement vector is u = ( 1.5, −.5, 2.5, 2.5 )^T, so the left node has moved slightly down and three times as far to the right, while the right node has moved five times as far up and to the right. Note that the force on the left node is transmitted through the top bar to the right node, which explains why it moves significantly further. The stresses are e = ( .7071, 1, 0, −1.5811 )^T, so the left and the top bar are elongated, the right bar is stress-free, and the reinforcing bar is significantly compressed.
(b) For a unit horizontal force on the two nodes, u = ( .75, −.25, .75, .25 )^T, so the left node has moved slightly down and three times as far to the right, while the right node has moved by the same amount up and to the right. The stresses are e = ( .353553, 0, −.353553, −.790569, .790569 )^T, so the diagonal bars fixed at node 1 are elongated, the horizontal bar is stress-free, while the bars fixed at node 4 are both compressed. The reinforcing bars experience a little over twice the stress of the other two diagonal bars.
6.3.4. The swing set cannot support a uniform horizontal force, since f1 = f2 = f, g1 = g2 = h1 = h2 = 0 does not satisfy the constraint for equilibrium. Thus, the swing set will collapse. For the reinforced version, under a horizontal force of magnitude f, the displacements of the two free nodes are ( 14.5 f, 0, −3 f )^T and ( 14.5 f, 0, 3 f )^T respectively, so the first node has moved down and in the direction of the force, while the second node has moved up and in the same horizontal direction. The corresponding elongation vector is

    e = ( 1.6583 f, 1.6583 f, 0, −1.6583 f, −1.6583 f, −3 f, −3 f )^T,

and so the horizontal bar experiences no elongation; the diagonal bars connecting the first node are stretched by an amount 1.6583 f, the diagonals connecting the second node are compressed by the same amount, while the reinforcing vertical bars are, respectively, compressed and stretched by an amount 3 f.
♥ 6.3.5. (a)

    A = [   0     1     0     0   ]
        [  −1     0     1     0   ]
        [   0     0     0     1   ]
        [   0     0    1/√2  1/√2 ]
        [ −1/√2  1/√2   0     0   ] ;

(b) 3/2 u1 − 1/2 v1 − u2 = 0,
    −1/2 u1 + 3/2 v1 = 0,
    −u1 + 3/2 u2 + 1/2 v2 = 0,
    1/2 u2 + 3/2 v2 = 0.
(c) Stable, statically indeterminate.
(d) Write down f = K e1, so f1 = ( 3/2, −1/2 )^T, f2 = ( −1, 0 )^T. The horizontal bar is compressed by −1; the upper left to lower right bar is compressed by −1/√2, while all other bars are stress free.
♥ 6.3.6. Under a uniform horizontal force, the displacements and stresses are:
Non-joined version: u = ( 3, 1, 3, −1 )^T, e = ( 1, 0, −1, √2, −√2 )^T;
Joined version: u = ( 5, 1, 5, −1, 2, 0 )^T, e = ( 1, 0, −1, √2, −√2, √2, −√2 )^T.
Thus, joining the nodes causes a larger horizontal displacement of the upper two nodes, but no change in the overall stresses on the bars.
Under a uniform vertical force, the displacements and elongations are:
Non-joined version:
u = ( 1/7, 5/7, −1/7, 5/7 )^T = ( .1429, .7143, −.1429, .7143 )^T,
e = ( 5/7, −2/7, 5/7, 2√2/7, 2√2/7 )^T = ( .7143, −.2857, .7143, .4041, .4041 )^T;
Joined version:
u = ( .0909, .8182, −.0909, .8182, 0, .3636 )^T,
e = ( .8182, −.1818, .8182, .2571, .2571, .2571, .2571 )^T.
Thus, joining the nodes causes a larger vertical displacement, but smaller horizontal displacement, of the upper two nodes. The stresses on the vertical bars increase, while the horizontal bar and the diagonal bars have less stress (in magnitude).
6.3.7.
(a) A = [   0      1      0     0     0      0   ]
        [ −1/√2  −1/√2   1/√2  1/√2   0      0   ]
        [   0      0    −1/√2  1/√2  1/√2  −1/√2 ]
        [   0      0      0     0     0      1   ] .
(b) There are two mechanisms: u1 = ( 1, 0, 1, 0, 1, 0 )^T, where all three nodes move by the same amount to the right, and u2 = ( 2, 0, 1, 1, 0, 0 )^T, where the upper left node moves to the right while the top node moves up and to the right.
(c) f1 + f2 + f3 = 0, i.e., no net horizontal force, and 2 f1 + f2 + g2 = 0.
(d) You need to add two additional reinforcing bars; any pair, e.g., connecting the fixed nodes to the top node, will stabilize the structure.
♥ 6.3.8.
(a) A = [    0       1       0      0      0       0    ]
        [ −3/√10  −1/√10   3/√10  1/√10    0       0    ]
        [   −1      0       0      0       1       0    ]
        [    0      0     −3/√10  1/√10  3/√10  −1/√10  ]
        [    0      0       0      0       0       1    ]

      = [   0       1       0      0      0       0     ]
        [ −.9487  −.3162   .9487  .3162   0       0     ]
        [  −1       0       0      0      1       0     ]
        [   0       0     −.9487  .3162  .9487  −.3162  ]
        [   0       0       0      0      0       1     ] ;

(b) One instability: the mechanism of simultaneous horizontal motion of the three nodes.
(c) No net horizontal force: f1 + f2 + f3 = 0. For example, if f1 = f2 = f3 = ( 0, 1 )^T, then e = ( 3/2, √(5/2), −3/2, √(5/2), 3/2 )^T = ( 1.5, 1.5811, −1.5, 1.5811, 1.5 )^T, so the compressed diagonal bars have slightly more stress than the compressed vertical bars or the elongated horizontal bar.
(d) To stabilize, add in one more bar starting at one of the fixed nodes and going to one of the two movable nodes not already connected to it.
(e) In every case, e = ( 3/2, √(5/2), −3/2, √(5/2), 3/2, 0 )^T = ( 1.5, 1.5811, −1.5, 1.5811, 1.5, 0 )^T, so the stresses on the previous bars are all the same, while the reinforcing bar experiences no stress. (See Exercise 6.3.21 for the general principle.)
♣ 6.3.9. Two-dimensional house:
(a) A = [  0   1    0      0     0     0     0     0    0   0 ]
        [  0  −1    0      1     0     0     0     0    0   0 ]
        [ −1   0    0      0     0     0     0     0    1   0 ]
        [  0   0  −2/√5  −1/√5  2/√5  1/√5   0     0    0   0 ]
        [  0   0    0      0   −2/√5  1/√5  2/√5 −1/√5  0   0 ]
        [  0   0    0      0     0     0     0     1    0  −1 ]
        [  0   0    0      0     0     0     0     0    0   1 ] ;

(b) 3 mechanisms: (i) simultaneous horizontal motion of the two middle nodes; (ii) simultaneous horizontal motion of the three upper nodes; (iii) the upper left node moves horizontally to the right by 1 unit, the top node moves vertically by 1 unit and to the right by 1/2 a unit.
(c) Numbering the five movable nodes in order starting at the middle left, the corresponding forces f1 = ( f1, g1 )^T, . . . , f5 = ( f5, g5 )^T must satisfy: f1 + f5 = 0, f2 + f3 + f4 = 0, f2 + 1/2 f3 + g3 = 0. When f1 = f2 = f4 = f5 = ( 0, 1 )^T, f3 = 0, then e = ( 2, 1, 0, 0, 0, 1, 2 )^T, so the lower vertical bars are compressed twice as much as the upper vertical bars, while the horizontal and diagonal bars experience no elongation or stress. When f1 = f5 = 0, −f2 = f4 = ( 1, 0 )^T, f3 = ( 0, 1 )^T, then e = ( 1/2, 1/2, √5/2, 0, √5/2, 1/2, 1/2 )^T, so all the vertical bars are compressed by .5, the diagonal bars slightly more than twice as compressed, while the horizontal bar has no stress.
(d) To stabilize the structure, you need to add in at least three more bars.
(e) Suppose we add in an upper horizontal bar and two diagonal bars going from lower left to upper right. For the first set of forces, e = ( 2, 1, 0, 0, 0, 1, 2, 0, 0, 0 )^T; for the second set of forces, e = ( 1/2, 1/2, √5/2, 0, √5/2, 1/2, 1/2, 0, 0, 0 )^T. In both cases, the stresses remain the same, and the reinforcing bars experience no stress.
Three-dimensional house:
(a) Numbering the six free nodes so that nodes 1–3 lie on the front wall and nodes 4–6 are the corresponding nodes on the back wall, A is the 13 × 18 matrix

  [ 0  0  1    0    0    0    0    0    0    0    0    0    0    0    0    0  0  0 ]
  [ 0 −1/√2 −1/√2  0   1/√2 1/√2  0    0    0    0    0    0    0    0    0    0  0  0 ]
  [ 0 −1  0    0    0    0    0    1    0    0    0    0    0    0    0    0  0  0 ]
  [ 0  0  0    0  −1/√2 1/√2  0   1/√2 −1/√2 0    0    0    0    0    0    0  0  0 ]
  [ 0  0  0    0    0    0    0    0    1    0    0    0    0    0    0    0  0  0 ]
  [ −1 0  0    0    0    0    0    0    0    1    0    0    0    0    0    0  0  0 ]
  [ 0  0  0   −1    0    0    0    0    0    0    0    0    1    0    0    0  0  0 ]
  [ 0  0  0    0    0    0   −1    0    0    0    0    0    0    0    0    1  0  0 ]
  [ 0  0  0    0    0    0    0    0    0    0    0    1    0    0    0    0  0  0 ]
  [ 0  0  0    0    0    0    0    0    0    0 −1/√2 −1/√2  0   1/√2 1/√2  0  0  0 ]
  [ 0  0  0    0    0    0    0    0    0    0   −1    0    0    0    0    0  1  0 ]
  [ 0  0  0    0    0    0    0    0    0    0    0    0    0  −1/√2 1/√2  0 1/√2 −1/√2 ]
  [ 0  0  0    0    0    0    0    0    0    0    0    0    0    0    0    0  0  1 ] ;

(b) 5 mechanisms: horizontal motion of (i) the two topmost nodes in the direction of the bar connecting them; (ii) the two right side nodes in the direction of the bar connecting them; (iii) the two left side nodes in the direction of the bar connecting them; (iv) the three front nodes in the direction of the bar connecting the lower pair; (v) the three back nodes in the direction of the bar connecting the lower pair.
(c) Equilibrium requires no net force on each unstable pair of nodes in the direction of the instability.
(d) For example, when the two topmost nodes are subject to a unit downwards vertical force, the vertical bars have elongation/stress −1/2, the diagonal bars have 1/√2 = .7071, while the front and back horizontal bars have 1/2. The longer horizontal bars have no stress.
(e) To stabilize, you need to add in at least five more bars, e.g., two diagonal bars across the front and back walls and a bar from a fixed node to the opposite topmost node. In all cases, if a minimal number of reinforcing bars are added, the stresses remain the same on the old bars, while the reinforcing bars experience no stress. See Exercise 6.3.21 for the general result.
♥ 6.3.10. (a) Letting wi denote the vertical displacement and hi the vertical component of the force on the ith mass: 2 w1 − w2 = h1, −w1 + 2 w2 − w3 = h2, −w2 + w3 = h3. The system is statically determinate and stable.
(b) Same equilibrium equations, but now the horizontal displacements u1, u2, u3 are arbitrary, and so the structure is unstable: there are three independent mechanisms corresponding to horizontal motions of each individual mass. To maintain equilibrium, the horizontal force components must vanish: f1 = f2 = f3 = 0.
(c) Same equilibrium equations, but now the six horizontal displacements u1, u2, u3, v1, v2, v3 are arbitrary, and so the structure is unstable: there are six independent mechanisms corresponding to the two independent horizontal motions of each individual mass. To maintain equilibrium, the horizontal force components must vanish: f1 = f2 = f3 = g1 = g2 = g3 = 0.
♥ 6.3.11. (a) The incidence matrix is the n × n matrix

    A = [  1  0  0  . . .  0 −1 ]
        [ −1  1  0  . . .  0  0 ]
        [  0 −1  1  . . .  0  0 ]
        [          . . .        ]
        [  0  0  . . .  0 −1  1 ] .

The stiffness matrix is

    K = A^T C A = [ c1+c2   −c2     0     . . .     0      −c1   ]
                  [  −c2   c2+c3   −c3    . . .     0       0    ]
                  [   0     −c3   c3+c4   −c4                    ]
                  [                . . .                         ]
                  [   0      0    . . .  −cn−1   cn−1+cn   −cn   ]
                  [  −c1     0    . . .    0      −cn     cn+c1  ] ,

and the equilibrium system is Ku = f.
(b) Observe that ker K = ker A is one-dimensional, with basis vector z = ( 1, 1, . . . , 1 )^T. Thus, the stiffness matrix is singular, and the system is not stable. To maintain equilibrium, the force f must be orthogonal to z, and so f1 + · · · + fn = 0, i.e., the net force on the ring is zero.
(c) For instance, if c1 = c2 = c3 = c4 = 1 and f = ( 1, −1, 0, 0 )^T, then the solution is u = ( 1/4, −1/2, −1/4, 0 )^T + t ( 1, 1, 1, 1 )^T for any t. Nonuniqueness is telling us that the masses can all be moved by the same amount, i.e., the entire ring is rotated, without affecting the force balance at equilibrium.
♣ 6.3.12.
(a) A = [ −1  0  0    1     0     0     0     0     0     0     0     0   ]
        [  0 −1  0    0     0     0     0     1     0     0     0     0   ]
        [  0  0 −1    0     0     0     0     0     0     0     0     1   ]
        [  0  0  0   1/√2 −1/√2   0   −1/√2  1/√2   0     0     0     0   ]
        [  0  0  0   1/√2   0   −1/√2   0     0     0   −1/√2   0    1/√2 ]
        [  0  0  0    0     0     0     0    1/√2 −1/√2   0   −1/√2  1/√2 ] ;

(b) v1 = ( 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0 )^T,
    v2 = ( 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0 )^T,
    v3 = ( 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1 )^T,
    v4 = ( 0, 0, 0, 0, 0, 0, 0, 0, −1, 0, 1, 0 )^T,
    v5 = ( 0, 0, 0, 0, 0, −1, 0, 0, 0, 1, 0, 0 )^T,
    v6 = ( 0, 0, 0, 0, −1, 0, 1, 0, 0, 0, 0, 0 )^T;
(c) v1, v2, v3 correspond to translations in, respectively, the x, y, z directions;
(d) v4, v5, v6 correspond to rotations around, respectively, the x, y, z coordinate axes;
(e) K = [  1    0    0   −1    0    0    0    0    0    0    0    0  ]
        [  0    1    0    0    0    0    0   −1    0    0    0    0  ]
        [  0    0    1    0    0    0    0    0    0    0    0   −1  ]
        [ −1    0    0    2  −1/2 −1/2 −1/2  1/2   0  −1/2   0   1/2 ]
        [  0    0    0  −1/2  1/2   0   1/2 −1/2   0    0    0    0  ]
        [  0    0    0  −1/2   0   1/2   0    0    0   1/2   0  −1/2 ]
        [  0    0    0  −1/2  1/2   0   1/2 −1/2   0    0    0    0  ]
        [  0   −1    0   1/2 −1/2   0  −1/2   2  −1/2   0  −1/2  1/2 ]
        [  0    0    0    0    0    0    0  −1/2  1/2   0   1/2 −1/2 ]
        [  0    0    0  −1/2   0   1/2   0    0    0   1/2   0  −1/2 ]
        [  0    0    0    0    0    0    0  −1/2  1/2   0   1/2 −1/2 ]
        [  0    0   −1   1/2   0  −1/2   0   1/2 −1/2 −1/2 −1/2   2  ] ;
(f) For fi = ( fi, gi, hi )^T we require f1 + f2 + f3 + f4 = 0, g1 + g2 + g3 + g4 = 0, h1 + h2 + h3 + h4 = 0, h3 = g4, h2 = f4, g2 = f3, i.e., there is no net force and no net moment of force around any axis.
(g) You need to fix three nodes. Fixing two still leaves a rotational motion around the line connecting them.
(h) Displacement of the top node: u4 = ( −1, −1, −1 )^T; since e = ( 0, 0, −1, 0, 0, 0 )^T, only the vertical bar experiences compression, of magnitude 1.
♣ 6.3.13.
(a) Placing the vertices at ( 1/√3, 0, 0 ), ( −1/(2√3), 1/2, 0 ), ( −1/(2√3), −1/2, 0 ), ( 0, 0, √2/√3 ), we obtain

A = [  √3/2  −1/2    0      −√3/2     1/2      0        0        0       0       0        0       0    ]
    [  √3/2   1/2    0        0        0       0      −√3/2    −1/2      0       0        0       0    ]
    [  1/√3    0   −√2/√3     0        0       0        0        0       0     −1/√3      0     √2/√3  ]
    [   0      0     0        0        1       0        0       −1       0       0        0       0    ]
    [   0      0     0     −1/(2√3)   1/2   −√2/√3      0        0       0    1/(2√3)   −1/2    √2/√3  ]
    [   0      0     0        0        0       0     −1/(2√3)  −1/2  −√2/√3   1/(2√3)    1/2    √2/√3  ] ;
(b) v1 = ( 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0 )^T,
    v2 = ( 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0 )^T,
    v3 = ( 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1 )^T,
    v4 = ( −√2, √6, −1, −2√2, 0, 1, 0, 0, 0, 0, 0, 0 )^T,
    v5 = ( 0, −2, 0, √3, 1, 0, −√3, 1, 0, 0, 0, 0 )^T,
    v6 = ( −√2, −√6, −1, 0, 0, 0, −2√2, 0, 1, 0, 0, 0 )^T;
(c) v1, v2, v3 correspond to translations in, respectively, the x, y, z directions;
(d) v4, v5, v6 correspond to rotations around the top node;
(e) K =
 [  11/6     0    −√2/3    −3/4     √3/4       0       −3/4    −√3/4       0       −1/3       0       √2/3    ]
 [   0      1/2     0       √3/4    −1/4       0      −√3/4     −1/4       0         0        0        0      ]
 [ −√2/3     0     2/3       0        0        0        0         0        0       √2/3       0      −2/3     ]
 [ −3/4    √3/4     0       5/6    −1/√3    1/(3√2)     0         0        0      −1/12    1/(4√3) −1/(3√2)   ]
 [  √3/4   −1/4     0     −1/√3     3/2     −1/√6       0        −1        0      1/(4√3)   −1/4     1/√6     ]
 [   0       0      0     1/(3√2)  −1/√6     2/3        0         0        0     −1/(3√2)   1/√6    −2/3      ]
 [ −3/4   −√3/4     0        0        0        0       5/6      1/√3    1/(3√2)   −1/12   −1/(4√3) −1/(3√2)   ]
 [ −√3/4   −1/4     0        0       −1        0      1/√3       3/2     1/√6    −1/(4√3)   −1/4    −1/√6     ]
 [   0       0      0        0        0        0     1/(3√2)    1/√6     2/3     −1/(3√2)  −1/√6    −2/3      ]
 [ −1/3      0    √2/3     −1/12   1/(4√3) −1/(3√2)   −1/12   −1/(4√3) −1/(3√2)    1/2        0        0      ]
 [   0       0      0     1/(4√3)   −1/4     1/√6    −1/(4√3)   −1/4    −1/√6       0        1/2       0      ]
 [  √2/3     0    −2/3   −1/(3√2)   1/√6    −2/3     −1/(3√2)  −1/√6    −2/3        0         0        2      ] .
(f) For fi = ( fi, gi, hi )^T we require

    f1 + f2 + f3 + f4 = 0,
    g1 + g2 + g3 + g4 = 0,
    h1 + h2 + h3 + h4 = 0,
    −√2 f1 + √6 g1 − h1 − 2√2 f2 + h2 = 0,
    −2 g1 + √3 f2 + g2 − √3 f3 + g3 = 0,
    −√2 f1 − √6 g1 − h1 − 2√2 f3 + h3 = 0,

i.e., there is no net force and no net moment of force around any axis.
(g) You need to fix three nodes. Fixing only two nodes still permits a rotational motion around the line connecting them.
(h) Displacement of the top node: u = ( 0, 0, −1/2 )^T; all the bars connecting the top node experience compression of magnitude 1/√6.
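The answer to part (h) can be checked numerically by rebuilding the incidence matrix from the vertex coordinates given in part (a); the bar ordering below (the six edges of the tetrahedron, base bars mixed in) is an assumption of this sketch:

```python
import numpy as np

s3 = np.sqrt(3)
verts = np.array([[1/s3, 0, 0],
                  [-1/(2*s3),  0.5, 0],
                  [-1/(2*s3), -0.5, 0],
                  [0, 0, np.sqrt(2)/s3]])
bars = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

# Row for bar (a, b): -t in node a's columns, +t in node b's, t a unit vector.
A = np.zeros((6, 12))
for k, (a, b) in enumerate(bars):
    t = verts[b] - verts[a]
    t /= np.linalg.norm(t)
    A[k, 3*a:3*a+3] = -t
    A[k, 3*b:3*b+3] = t

u = np.zeros(12)
u[9:12] = [0, 0, -0.5]                 # top node drops by 1/2
e = A @ u
assert np.allclose(e[[0, 1, 3]], 0)    # the base bars are unstrained
assert np.allclose(e[[2, 4, 5]], -1/np.sqrt(6))
```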
6.3.14. True, since stability only depends on whether the reduced incidence matrix has trivialkernel or not, which depends only on the geometry, not on the bar stiffnesses.
6.3.15. (a) True. Since Ku = f, if f ≠ 0 then u ≠ 0 also. (b) False if the structure is unstable, since any u ∈ ker A yields a zero elongation vector e = Au = 0.
6.3.16. (a) 3n. (b) Example: a triangle, each of whose nodes is connected to the ground by two additional, non-parallel bars.
♦ 6.3.17. As in Exercise 6.1.6, this follows from the symmetry of the stiffness matrix K, which implies that K^(−1) is a symmetric matrix. Let fi = ( 0, . . . , 0, n, 0, . . . , 0 )^T denote the force vector corresponding to a unit force at node i applied in the direction of the unit vector n. The resulting displacement is ui = K^(−1) fi, and we are interested in the displacement of node j in the direction n, which equals the dot product fj · ui = fj^T K^(−1) fi = ( K^(−1) fj )^T fi = fi · uj, proving equality.
6.3.18. False in general. If the nodes are collinear, a rotation around the line through the nodeswill define a rigid motion. If the nodes are not collinear, then the statement is true.
6.3.19. Since y = e = Au ∈ rng A = corng AT , which, according to Theorem 5.59, is thecondition for the solution of minimal norm to the adjoint equation.
♦ 6.3.20. (a) We are assuming that f ∈ rng K = corng A = rng A^T, cf. Exercise 3.4.31. Thus, we can write f = A^T h = A^T C g, where g = C^(−1) h.
(b) The equilibrium equations Ku = f are A^T C Au = A^T C g, which are the normal equations (4.57) for the weighted least squares solution to Au = g.
♥ 6.3.21. Let A be the reduced incidence matrix of the structure, so the equilibrium equations are A^T C Au = f, where we are assuming f ∈ rng K = rng A^T C A. We use Exercise 6.3.20 to write f = A^T C g and characterize u as the weighted least squares solution to Au = g, i.e., the vector that minimizes the weighted norm ‖ Au − g ‖^2.
Now, the reduced incidence matrix for the reinforced structure is Ã = [ A ; B ], with the rows of B representing the reinforcing bars. The structure will be stable if and only if corng Ã = corng A + corng B = R^n, and the number of bars is minimal if and only if corng A ∩ corng B = {0}. Thus, corng A and corng B are complementary subspaces of R^n, which implies, as in Exercise 5.6.12, that their orthogonal complements ker A and ker B are also complementary subspaces, so ker A + ker B = R^n, ker A ∩ ker B = {0}.
The reinforced equilibrium equations for the new displacement v are Ã^T C̃ Ã v = A^T C Av + B^T D Bv = f, where D is the diagonal matrix whose entries are the stiffnesses of the reinforcing bars, while C̃ = [ C O ; O D ]. Since f = A^T C g = A^T C g + B^T D 0 = Ã^T C̃ ( g, 0 )^T, again using Exercise 6.3.20, the reinforced displacement v is the least squares solution to the combined system Av = g, Bv = 0, i.e., the vector that minimizes the combined weighted norm ‖ Av − g ‖^2 + ‖ Bv ‖^2. Now, since we are using the minimal number of bars, we can uniquely decompose v = z + w, where z ∈ ker A and w ∈ ker B, and we find ‖ Av − g ‖^2 + ‖ Bv ‖^2 = ‖ Aw − g ‖^2 + ‖ B z ‖^2. Clearly this will be minimized if and only if B z = 0 and w minimizes ‖ Aw − g ‖^2. Therefore, w = u ∈ ker B, and so the entries of Bu = 0 are the elongations of the reinforcing bars.
♥ 6.3.22.
(a) A* = [ 1/√2  1/√2    0     0     0   ]
         [ −1     0      1     0     0   ]
         [  0     0    −1/√2  1/√2  1/√2 ] ;

K* u = f*, where

    K* = [ 3/2  1/2  −1     0     0  ]
         [ 1/2  1/2   0     0     0  ]
         [ −1    0   3/2  −1/2  −1/2 ]
         [  0    0  −1/2   1/2   1/2 ]
         [  0    0  −1/2   1/2   1/2 ] .

(b) Unstable: there are two mechanisms prescribed by the kernel basis elements ( 1, −1, 1, 1, 0 )^T, which represents the same mechanism as when the end is fixed, and ( 1, −1, 1, 0, 1 )^T, in which the roller and the right hand node move horizontally to the right, while the left node moves down and to the right.
174
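The structure of K⋆ = A⋆^T C A⋆ and the two kernel mechanisms of part (b) can be checked numerically. A minimal NumPy sketch, taking all bar stiffnesses equal to 1 (so C = I, an assumption made purely for illustration; since any C > 0 gives ker K⋆ = ker A⋆, the mechanism check is unaffected):

```python
import numpy as np

# Reduced incidence matrix A* from 6.3.22(a)
s = 1 / np.sqrt(2)
A = np.array([[ s,  s,  0, 0, 0],
              [-1,  0,  1, 0, 0],
              [ 0,  0, -s, s, s]])

# Stiffness matrix K* = A*^T C A* with unit stiffnesses C = I
K = A.T @ A

# The two kernel mechanisms listed in part (b)
v1 = np.array([1, -1, 1, 1, 0])
v2 = np.array([1, -1, 1, 0, 1])
print(np.allclose(K @ v1, 0), np.allclose(K @ v2, 0))  # True True
```

Both products vanish, confirming that the two displacement vectors are zero-stiffness mechanisms of the unreinforced structure.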
♥ 6.3.23.
(a) A⋆ = [ 1/√2  1/√2  0  0  0 ; −1  0  1  0  0 ; 0  0  −1/√2  1/√2  −1/√2 ]; K⋆ u = f⋆ where
K⋆ = [ 3/2  1/2  −1  0  0 ; 1/2  1/2  0  0  0 ; −1  0  3/2  −1/2  1/2 ; 0  0  −1/2  1/2  −1/2 ; 0  0  1/2  −1/2  1/2 ].
(b) Unstable: there are two mechanisms, prescribed by the kernel basis elements (1, −1, 1, 1, 0)^T, which represents the same mechanism as when the end is fixed, and (−1, 1, −1, 0, 1)^T, in which the roller moves up, the right hand node moves horizontally to the left, while the left node moves up and to the left.
♥ 6.3.24.
(a) Horizontal roller: A⋆ = [ .7071  .7071  0  0  0 ; −1  0  1  0  0 ; 0  0  −.7071  .7071  .7071 ; −.9487  .3162  0  0  .9487 ];
K⋆ u = f⋆ where K⋆ = [ 2.4  .2  −1  0  −.9 ; .2  .6  0  0  .3 ; −1  0  1.5  −.5  −.5 ; 0  0  −.5  .5  .5 ; −.9  .3  −.5  .5  1.4 ];
unstable: there is one mechanism, prescribed by the kernel basis element (.75, −.75, .75, −.25, 1)^T, in which the roller moves horizontally to the right, the right hand node moves right and slightly down, while the left node moves right and down.
(b) Vertical roller: A⋆ = [ .7071  .7071  0  0  0 ; −1  0  1  0  0 ; 0  0  −.7071  .7071  −.7071 ; −.9487  .3162  0  0  −.3162 ];
K⋆ u = f⋆ where K⋆ = [ 2.4  .2  −1  0  .3 ; .2  .6  0  0  −.1 ; −1  0  1.5  −.5  .5 ; 0  0  −.5  .5  −.5 ; .3  −.1  .5  −.5  .6 ];
unstable: there is one mechanism, prescribed by the kernel basis element (−.25, .25, −.25, .75, 1)^T, in which the roller moves up, the right hand node moves up and slightly to the left, while the left node moves slightly left and up.
6.3.25.
(a) Yes: if the direction of the roller is perpendicular to the vector between the two nodes, the structure admits an (infinitesimal) rotation around the fixed node.
(b) A total of six rollers is required to eliminate all six independent rigid motions. The rollers must not be "aligned". For instance, if they all point in the same direction, they do not eliminate a translational mode.
Solutions — Chapter 7
7.1.1. Only (a) and (d) are linear.
7.1.2. Only (a),(d) and (f ) are linear.
7.1.3. (a) F(0, 0) = (2, 0)^T ≠ (0, 0)^T; (b) F(2x, 2y) = 4F(x, y) ≠ 2F(x, y); (c) F(−x, −y) = F(x, y) ≠ −F(x, y); (d) F(2x, 2y) ≠ 2F(x, y); (e) F(0, 0) = (1, 0)^T ≠ (0, 0)^T.
7.1.4. Since T(0, 0)^T = (a, b)^T, linearity requires a = b = 0, so the only linear translation is the identity map.
7.1.5.
(a) [ 0 −1 0 ; 1 0 0 ; 0 0 1 ],
(b) [ 1 0 0 ; 0 1/2 −√3/2 ; 0 √3/2 1/2 ],
(c) [ 1 0 0 ; 0 1 0 ; 0 0 −1 ],
(d) [ 0 0 1 ; 1 0 0 ; 0 1 0 ],
(e) [ −1/3 2/3 2/3 ; 2/3 −1/3 2/3 ; 2/3 2/3 −1/3 ],
(f) [ 1 0 0 ; 0 1 0 ; 0 0 0 ],
(g) [ 5/6 1/6 −1/3 ; 1/6 5/6 1/3 ; −1/3 1/3 1/3 ].
7.1.6. L(x, y)^T = (5/2)x − (1/2)y. Yes, because (1, 1)^T, (1, −1)^T form a basis, so we can write any v ∈ R² as a linear combination v = c(1, 1)^T + d(1, −1)^T. Thus, by linearity, L[v] = cL(1, 1)^T + dL(1, −1)^T = 2c + 3d is uniquely determined by its values on the two basis vectors.
7.1.7. L(x, y) = ( −(2/3)x + (4/3)y , −(1/3)x − (1/3)y )^T.
7.1.8. The linear function exists and is unique if and only if (x1, y1)^T, (x2, y2)^T are linearly independent. In this case, the matrix form is A = [ a1 a2 ; b1 b2 ] [ x1 x2 ; y1 y2 ]^{−1}. On the other hand, if c1(x1, y1)^T + c2(x2, y2)^T = 0, then the linear function exists (but is not uniquely defined) if and only if c1(a1, b1)^T + c2(a2, b2)^T = 0.
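The formula in 7.1.8, A = (matrix of prescribed images)(matrix of inputs)^{−1}, is easy to test numerically. A minimal sketch with sample vectors of my own choosing (the exercise's actual data is not reproduced here):

```python
import numpy as np

# Columns of X are two linearly independent inputs x1, x2;
# columns of W are the prescribed images L[x1], L[x2].
X = np.array([[1.0,  2.0],
              [1.0, -1.0]])
W = np.array([[3.0,  0.0],
              [1.0,  2.0]])

# Matrix of the unique linear map sending x_i to w_i
A = W @ np.linalg.inv(X)
print(np.allclose(A @ X, W))  # True
```

Since X is invertible exactly when the inputs are linearly independent, the same code raises `numpy.linalg.LinAlgError` in the degenerate case, mirroring the dichotomy in the solution.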
7.1.9. No, because linearity would require
L(0, 1, −1)^T = L[ (1, 0, −1)^T − (1, −1, 0)^T ] = L(1, 0, −1)^T − L(1, −1, 0)^T = 3 ≠ −2.
♦ 7.1.10. L_a[cv + dw] = a × (cv + dw) = c a × v + d a × w = cL_a[v] + dL_a[w];
matrix representative: [ 0 −c b ; c 0 −a ; −b a 0 ].
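The skew-symmetric matrix representative of L_a[v] = a × v in 7.1.10 can be checked against the built-in cross product; a small NumPy sketch:

```python
import numpy as np

def cross_matrix(a):
    """Matrix representative of the linear map L_a[v] = a x v, for a = (a, b, c)."""
    ax, ay, az = a
    return np.array([[  0, -az,  ay],
                     [ az,   0, -ax],
                     [-ay,  ax,   0]])

a = np.array([1.0, 2.0, 3.0])
v = np.array([-4.0, 0.5, 2.0])
print(np.allclose(cross_matrix(a) @ v, np.cross(a, v)))  # True
```

The matrix is skew-symmetric, reflecting the fact that a × v is orthogonal to a and anti-commutes under exchange of its arguments.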
7.1.11. No, since N(−v) = N(v) ≠ −N(v).
7.1.12. False, since Q(cv) = c²v ≠ cv in general.
7.1.13. Set b = L(1). Then L(x) = L(x 1) = xL(1) = xb. The proof of linearity is straightforward; indeed, this is a special case of matrix multiplication.
7.1.14.
(a) L[cX + dY] = A(cX + dY) = cAX + dAY = cL[X] + dL[Y];
matrix representative: [ a 0 b 0 ; 0 a 0 b ; c 0 d 0 ; 0 c 0 d ].
(b) R[cX + dY] = (cX + dY)B = cXB + dYB = cR[X] + dR[Y];
matrix representative: [ p r 0 0 ; q s 0 0 ; 0 0 p r ; 0 0 q s ].
(c) K[cX + dY] = A(cX + dY)B = cAXB + dAYB = cK[X] + dK[Y];
matrix representative: [ ap ar bp br ; aq as bq bs ; cp cr dp dr ; cq cs dq ds ].
7.1.15. (a) Linear; target space = M_{n×n}. (b) Not linear; target space = M_{n×n}. (c) Linear; target space = M_{n×n}. (d) Not linear; target space = M_{n×n}. (e) Not linear; target space = R. (f) Linear; target space = R. (g) Linear; target space = R^n. (h) Linear; target space = R^n. (i) Linear; target space = R.
♦ 7.1.16. (a) If L satisfies (7.1), then L[cv + dw] = L[cv] + L[dw] = cL[v] + dL[w], proving (7.3). Conversely, given (7.3), the first equation in (7.1) is the special case c = d = 1, while the second corresponds to d = 0. (b) Equations (7.1, 3) prove (7.4) for k = 1, 2. By induction, assuming the formula is true for k, to prove it for k + 1, we compute
L[c1 v1 + ··· + ck vk + c_{k+1} v_{k+1}] = L[c1 v1 + ··· + ck vk] + c_{k+1} L[v_{k+1}] = c1 L[v1] + ··· + ck L[vk] + c_{k+1} L[v_{k+1}].
♦ 7.1.17. If v = c1 v1 + ··· + cn vn, then, by linearity,
L[v] = L[c1 v1 + ··· + cn vn] = c1 L[v1] + ··· + cn L[vn] = c1 w1 + ··· + cn wn.
Since v1, . . . , vn form a basis, the coefficients c1, . . . , cn are uniquely determined by v ∈ V, and hence the preceding formula uniquely determines L[v].
♥ 7.1.18.
(a) B(cv + c̃ṽ, w) = (cv1 + c̃ṽ1)w1 − 2(cv2 + c̃ṽ2)w2 = c(v1w1 − 2v2w2) + c̃(ṽ1w1 − 2ṽ2w2) = cB(v, w) + c̃B(ṽ, w), so B(v, w) is linear in v for fixed w. Similarly, B(v, cw + c̃w̃) = v1(cw1 + c̃w̃1) − 2v2(cw2 + c̃w̃2) = c(v1w1 − 2v2w2) + c̃(v1w̃1 − 2v2w̃2) = cB(v, w) + c̃B(v, w̃), proving linearity in w for fixed v.
(b) B(cv + c̃ṽ, w) = 2(cv1 + c̃ṽ1)w2 − 3(cv2 + c̃ṽ2)w3 = c(2v1w2 − 3v2w3) + c̃(2ṽ1w2 − 3ṽ2w3) = cB(v, w) + c̃B(ṽ, w), so B(v, w) is linear in v for fixed w. Similarly, B(v, cw + c̃w̃) = 2v1(cw2 + c̃w̃2) − 3v2(cw3 + c̃w̃3) = c(2v1w2 − 3v2w3) + c̃(2v1w̃2 − 3v2w̃3) = cB(v, w) + c̃B(v, w̃), proving bilinearity.
(c) B(cv + c̃ṽ, w) = ⟨cv + c̃ṽ, w⟩ = c⟨v, w⟩ + c̃⟨ṽ, w⟩ = cB(v, w) + c̃B(ṽ, w),
B(v, cw + c̃w̃) = ⟨v, cw + c̃w̃⟩ = c⟨v, w⟩ + c̃⟨v, w̃⟩ = cB(v, w) + c̃B(v, w̃).
(d) B(cv + c̃ṽ, w) = (cv + c̃ṽ)^T A w = cv^T A w + c̃ṽ^T A w = cB(v, w) + c̃B(ṽ, w),
B(v, cw + c̃w̃) = v^T A(cw + c̃w̃) = cv^T A w + c̃v^T A w̃ = cB(v, w) + c̃B(v, w̃).
(e) Set a_ij = B(e_i, e_j) for i = 1, . . . , m, j = 1, . . . , n. Then
B(v, w) = B(v1 e1 + ··· + vm em, w) = v1 B(e1, w) + ··· + vm B(em, w)
= v1 B(e1, w1 e1 + ··· + wn en) + ··· + vm B(em, w1 e1 + ··· + wn en)
= Σ_{i=1}^m Σ_{j=1}^n v_i w_j B(e_i, e_j) = Σ_{i=1}^m Σ_{j=1}^n v_i w_j a_ij = v^T A w.
(f) Let B(v, w) = (B1(v, w), . . . , Bk(v, w))^T. Then the bilinearity conditions B(cv + c̃ṽ, w) = cB(v, w) + c̃B(ṽ, w) and B(v, cw + c̃w̃) = cB(v, w) + c̃B(v, w̃) hold if and only if each component Bj(v, w) is bilinear.
(g) False. B(c(v, w) + c̃(ṽ, w̃)) = B(cv + c̃ṽ, cw + c̃w̃) = c²B(v, w) + cc̃B(v, w̃) + cc̃B(ṽ, w) + c̃²B(ṽ, w̃) ≠ cB(v, w) + c̃B(ṽ, w̃).
7.1.19. (a) Linear; target space = R. (b) Not linear; target space = R. (c) Linear; target space = R. (d) Linear; target space = R. (e) Linear; target space = C¹(R). (f) Linear; target space = C¹(R). (g) Not linear; target space = C¹(R). (h) Linear; target space = C⁰(R). (i) Linear; target space = C⁰(R). (j) Linear; target space = C⁰(R). (k) Not linear; target space = R. (l) Linear; target space = R. (m) Not linear; target space = R. (n) Linear; target space = C²(R). (o) Linear; target space = C²(R). (p) Not linear; target space = C¹(R). (q) Linear; target space = C¹(R). (r) Linear; target space = R. (s) Not linear; target space = C²(R).
7.1.20. True. For any constants c, d,
A[cf + dg] = (1/(b − a)) ∫_a^b [cf(x) + dg(x)] dx = (c/(b − a)) ∫_a^b f(x) dx + (d/(b − a)) ∫_a^b g(x) dx = cA[f] + dA[g].
7.1.21. M_h[cf(x) + dg(x)] = h(x)(cf(x) + dg(x)) = c h(x) f(x) + d h(x) g(x) = cM_h[f(x)] + dM_h[g(x)].
To show the target space is Cⁿ[a, b], you need the result that the product of two n times continuously differentiable functions is n times continuously differentiable.
7.1.22. I_w[cf + dg] = ∫_a^b [cf(x) + dg(x)] w(x) dx = c ∫_a^b f(x) w(x) dx + d ∫_a^b g(x) w(x) dx = cI_w[f] + dI_w[g].
7.1.23. (a) ∂_x[cf + dg] = ∂/∂x [cf(x) + dg(x)] = c ∂f/∂x + d ∂g/∂x = c∂_x[f] + d∂_x[g]. The same proof works for ∂_y. (b) Linearity requires d = 0.
7.1.24. Δ[cf + dg] = ∂²/∂x² [cf(x, y) + dg(x, y)] + ∂²/∂y² [cf(x, y) + dg(x, y)] = c( ∂²f/∂x² + ∂²f/∂y² ) + d( ∂²g/∂x² + ∂²g/∂y² ) = cΔ[f] + dΔ[g].
7.1.25. G[cf + dg] = ∇(cf + dg) = ( ∂[cf + dg]/∂x , ∂[cf + dg]/∂y )^T = c( ∂f/∂x , ∂f/∂y )^T + d( ∂g/∂x , ∂g/∂y )^T = c∇f + d∇g = cG[f] + dG[g].
7.1.26.
(a) Gradient: ∇(cf + dg) = c∇f + d∇g; the domain is the space of continuously differentiable scalar functions; the target is the space of continuous vector fields.
(b) Curl: ∇ × (cf + dg) = c∇ × f + d∇ × g; the domain is the space of continuously differentiable vector fields; the target is the space of continuous vector fields.
(c) Divergence: ∇ · (cf + dg) = c∇ · f + d∇ · g; the domain is the space of continuously differentiable vector fields; the target is the space of continuous scalar functions.
7.1.27.
(a) dimension = 3; basis: (1, 0, 0), (0, 1, 0), (0, 0, 1).
(b) dimension = 4; basis: [ 1 0 ; 0 0 ], [ 0 1 ; 0 0 ], [ 0 0 ; 1 0 ], [ 0 0 ; 0 1 ].
(c) dimension = mn; basis: E_ij with (i, j) entry equal to 1 and all other entries 0, for i = 1, . . . , m, j = 1, . . . , n.
(d) dimension = 4; basis given by L0, L1, L2, L3, where L_i[a3 x³ + a2 x² + a1 x + a0] = a_i.
(e) dimension = 6; basis given by L0, L1, L2, M0, M1, M2, where L_i[a2 x² + a1 x + a0] = (a_i, 0)^T, M_i[a2 x² + a1 x + a0] = (0, a_i)^T.
(f) dimension = 9; basis given by L0, L1, L2, M0, M1, M2, N0, N1, N2, where, for i = 0, 1, 2,
L_i[a2 x² + a1 x + a0] = a_i, M_i[a2 x² + a1 x + a0] = a_i x, N_i[a2 x² + a1 x + a0] = a_i x².
7.1.28. True. The dimension is 2, with basis [ 0 1 ; 0 0 ], [ 0 0 ; 0 1 ].
7.1.29. False. The zero function is not an element.
7.1.30. (a) a = (3, −1, 2)^T, (b) a = (3, −1/2, 2/3)^T, (c) a = (5/4, −1/2, 5/4)^T.
♦ 7.1.31. (a) a = K^{−1} r^T, since L[v] = r v = r K^{−1} K v = a^T K v and K^T = K. (b) (i) a = (2, −1)^T, (ii) a = [ 3 0 ; 0 2 ]^{−1} (2, −1)^T = (2/3, −1/2)^T, (iii) a = [ 2 −1 ; −1 3 ]^{−1} (2, −1)^T = (1, 0)^T.
♥ 7.1.32.
(a) By linearity, L_i[x1 v1 + ··· + xn vn] = x1 L_i[v1] + ··· + xn L_i[vn] = x_i.
(b) Every real-valued linear function L ∈ V* has the form L[v] = a1 x1 + ··· + an xn = a1 L1[v] + ··· + an Ln[v], and so L = a1 L1 + ··· + an Ln, proving that L1, . . . , Ln span V*. Moreover, they are linearly independent, since a1 L1 + ··· + an Ln = O gives the trivial linear function if and only if a1 x1 + ··· + an xn = 0 for all x1, . . . , xn, which implies a1 = ··· = an = 0.
(c) Let r_i denote the ith row of A^{−1}, which we identify with the linear function L_i[v] = r_i v. The (i, j) entry of the equation A^{−1} A = I says that L_i[v_j] = r_i v_j = 1 when i = j and 0 when i ≠ j, which is the requirement for being a dual basis.
7.1.33. In all cases, the dual basis consists of the linear functions L_i[v] = r_i v. (a) r1 = (1/2, 1/2), r2 = (1/2, −1/2); (b) r1 = (1/7, 3/7), r2 = (2/7, −1/7); (c) r1 = (1/2, 1/2, −1/2), r2 = (1/2, −1/2, 1/2), r3 = (−1/2, 1/2, 1/2); (d) r1 = (8, 1, 3), r2 = (10, 1, 4), r3 = (7, 1, 3); (e) r1 = (0, 1, −1, 1), r2 = (1, −1, 2, −2), r3 = (−2, 2, −2, 3), r4 = (1, −1, 1, −1).
7.1.34. (a) 9 − 36x + 30x², (b) 12 − 84x + 90x², (c) 1, (d) 38 − 192x + 180x².
7.1.35. 9 − 36x + 30x², −36 + 192x − 180x², 30 − 180x + 180x².
7.1.36. Let w1, . . . , wn be any basis of V. Write v = y1 w1 + ··· + yn wn, so, by linearity, L[v] = y1 L[w1] + ··· + yn L[wn] = b1 y1 + ··· + bn yn, where b_i = L[w_i]. On the other hand, if we write a = a1 w1 + ··· + an wn, then ⟨a, v⟩ = Σ_{i,j=1}^n a_i y_j ⟨w_i, w_j⟩ = Σ_{i,j=1}^n k_ij a_i y_j = a^T K y, where k_ij = ⟨w_i, w_j⟩ are the entries of the Gram matrix K based on w1, . . . , wn. Thus setting a = K^{−1} b gives L[v] = ⟨a, v⟩.
7.1.37.
(a) S ∘ T = T ∘ S = clockwise rotation by 60° = counterclockwise rotation by 300°;
(b) S ∘ T = T ∘ S = reflection in the line y = x;
(c) S ∘ T = T ∘ S = rotation by 180°;
(d) S ∘ T = counterclockwise rotation by cos^{−1}(−4/5) = π/2 + 2 tan^{−1}(1/2) radians; T ∘ S = clockwise rotation by the same angle;
(e) S ∘ T = T ∘ S = O;
(f) S ∘ T maps (x, y)^T to ( (x + y)/2, 0 )^T; T ∘ S maps (x, y)^T to ( x/2, x/2 )^T;
(g) S ∘ T maps (x, y)^T to (y, 0)^T; T ∘ S maps (x, y)^T to (0, x)^T;
(h) S ∘ T maps (x, y)^T to ( −(2/5)x + (1/5)y, (4/5)x − (2/5)y )^T; T ∘ S maps (x, y)^T to ( −(2/5)x + (4/5)y, (1/5)x − (2/5)y )^T.
7.1.38. (a) L = [ 1 −1 ; −3 2 ]; (b) M = [ −1 0 ; −3 2 ]; (c) N = [ 2 1 ; 0 1 ];
(d) Each linear transformation is uniquely determined by its action on a basis of R², and M[e_i] = N ∘ L[e_i] for i = 1, 2. (e) [ −1 0 ; −3 2 ] = [ 2 1 ; 0 1 ] [ 1 −1 ; −3 2 ].
7.1.39. (a) R = [ 1 0 0 ; 0 0 −1 ; 0 1 0 ], S = [ 0 −1 0 ; 1 0 0 ; 0 0 1 ]; (b) R ∘ S = [ 0 −1 0 ; 0 0 −1 ; 1 0 0 ] ≠ S ∘ R = [ 0 0 1 ; 1 0 0 ; 0 1 0 ]; under R ∘ S, the basis vectors e1, e2, e3 go to e3, −e1, −e2, respectively. Under S ∘ R, they go to e2, e3, e1. (c) Do it.
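The key fact behind 7.1.38(e) and 7.1.39, that composition of linear maps is matrix multiplication, can be confirmed directly with the matrices from 7.1.38; a small NumPy check:

```python
import numpy as np

# Matrices from 7.1.38
L = np.array([[ 1, -1],
              [-3,  2]])
N = np.array([[ 2,  1],
              [ 0,  1]])
M = np.array([[-1,  0],
              [-3,  2]])

# M = N o L as matrices: apply L first, then N
print(np.array_equal(N @ L, M))   # True
# Composition is generally not commutative
print(np.array_equal(L @ N, N @ L))  # False
```

The second check echoes 7.1.39: reversing the order of composition usually gives a different transformation.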
7.1.40. No: the matrix representatives for P, Q and R = Q ∘ P are, respectively,
P = [ 2/3 −1/3 1/3 ; −1/3 2/3 1/3 ; 1/3 1/3 2/3 ], Q = [ 2/3 1/3 1/3 ; 1/3 2/3 −1/3 ; 1/3 −1/3 2/3 ],
R = QP = [ 4/9 1/9 5/9 ; −1/9 2/9 1/9 ; 5/9 −1/9 4/9 ],
but orthogonal projection onto the line L = { t(1, 0, 1)^T } has matrix representative M = [ 1/2 0 1/2 ; 0 0 0 ; 1/2 0 1/2 ] ≠ R.
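The conclusion of 7.1.40, that composing two orthogonal projections does not give the orthogonal projection onto the intersection of their subspaces, is quick to verify numerically with the matrices above:

```python
import numpy as np

P = np.array([[ 2, -1,  1], [-1,  2,  1], [ 1,  1,  2]]) / 3.0
Q = np.array([[ 2,  1,  1], [ 1,  2, -1], [ 1, -1,  2]]) / 3.0
R = Q @ P

# Orthogonal projection onto the line spanned by (1, 0, 1)
M = np.array([[0.5, 0, 0.5], [0, 0, 0], [0.5, 0, 0.5]])

print(np.allclose(R, M))      # False: Q P is not the projection onto the line
print(np.allclose(R @ R, R))  # False: Q P is not even idempotent
```

Failure of idempotency alone already shows Q ∘ P cannot be any projection.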
7.1.41. (a) L = E ∘ D, where D[f(x)] = f′(x), E[g(x)] = g(0). No, they do not commute: D ∘ E is not even defined, since the target of E, namely R, is not the domain of D, the space of differentiable functions. (b) e = 0 is the only condition.
7.1.42. L ∘ M = xD² + (1 − x²)D − 2x, M ∘ L = xD² + (2 − x²)D − x. They do not commute.
7.1.43. (a) According to Lemma 7.11, M_a ∘ D is linear, and hence, for the same reason, L = D ∘ (M_a ∘ D) is also linear. (b) L = a(x)D² + a′(x)D.
♦ 7.1.44. (a) Given L: V → U, M: W → V and N: Z → W, we have, for z ∈ Z,
((L ∘ M) ∘ N)[z] = (L ∘ M)[N[z]] = L[M[N[z]]] = L[(M ∘ N)[z]] = (L ∘ (M ∘ N))[z]
as elements of U. (b) Lemma 7.11 says that M ∘ N is linear, and hence, for the same reason, (L ∘ M) ∘ N is linear. (c) When U = R^m, V = R^n, W = R^p, Z = R^q, then L is represented by an m × n matrix A, M is represented by an n × p matrix B, and N is represented by a p × q matrix C. Associativity of composition implies (AB)C = A(BC).
7.1.45. Given L = a_n Dⁿ + ··· + a1 D + a0, M = b_n Dⁿ + ··· + b1 D + b0, with a_i, b_i constant, the linear combination cL + dM = (ca_n + db_n)Dⁿ + ··· + (ca1 + db1)D + (ca0 + db0) is also a constant coefficient linear differential operator, proving that they form a subspace of the space of all linear operators. A basis is Dⁿ, D^{n−1}, . . . , D, 1, and so its dimension is n + 1.
7.1.46. If p(x, y) = Σ c_ij x^i y^j, then p(∂_x, ∂_y) = Σ c_ij ∂_x^i ∂_y^j is a linear combination of linear operators, which can be built up as compositions ∂_x^i ∘ ∂_y^j = ∂_x ∘ ··· ∘ ∂_x ∘ ∂_y ∘ ··· ∘ ∂_y of the basic first order linear partial differential operators.
♥ 7.1.47.
(a) Both L ∘ M and M ∘ L are linear by Lemma 7.11, and, since the linear transformations form a vector space, their difference L ∘ M − M ∘ L is also linear.
(b) L ∘ M = M ∘ L if and only if [L, M] = L ∘ M − M ∘ L = O.
(c) (i) [ 1 3 ; 0 −1 ], (ii) [ 0 −2 ; −2 0 ], (iii) [ 0 2 0 ; −2 0 −2 ; 0 2 0 ].
(d) [[L, M], N] = (L ∘ M − M ∘ L) ∘ N − N ∘ (L ∘ M − M ∘ L) = L ∘ M ∘ N − M ∘ L ∘ N − N ∘ L ∘ M + N ∘ M ∘ L,
[[N, L], M] = (N ∘ L − L ∘ N) ∘ M − M ∘ (N ∘ L − L ∘ N) = N ∘ L ∘ M − L ∘ N ∘ M − M ∘ N ∘ L + M ∘ L ∘ N,
[[M, N], L] = (M ∘ N − N ∘ M) ∘ L − L ∘ (M ∘ N − N ∘ M) = M ∘ N ∘ L − N ∘ M ∘ L − L ∘ M ∘ N + L ∘ N ∘ M,
which add up to O.
(e) [[L, M], N] = [ −3 2 ; 2 3 ], [[N, L], M] = [ 0 0 ; −2 0 ], [[M, N], L] = [ 3 −2 ; 0 −3 ], whose sum is [ −3 2 ; 2 3 ] + [ 0 0 ; −2 0 ] + [ 3 −2 ; 0 −3 ] = O.
(f) B(cL + c̃L̃, M) = [cL + c̃L̃, M] = (cL + c̃L̃) ∘ M − M ∘ (cL + c̃L̃) = c(L ∘ M − M ∘ L) + c̃(L̃ ∘ M − M ∘ L̃) = c[L, M] + c̃[L̃, M] = cB(L, M) + c̃B(L̃, M),
B(L, cM + c̃M̃) = −B(cM + c̃M̃, L) = −cB(M, L) − c̃B(M̃, L) = cB(L, M) + c̃B(L, M̃). (Or the latter property can be proved directly.)
♦ 7.1.48.
(a) [P, Q][f] = P ∘ Q[f] − Q ∘ P[f] = P[xf] − Q[f′] = (xf)′ − xf′ = f.
(b) According to Exercise 1.2.32, the trace of any matrix commutator is zero: tr [P, Q] = 0. On the other hand, tr I = n, the size of the matrix, not 0.
♥ 7.1.49.
(a) D^(1) is a subspace of the vector space of all linear operators acting on the space of polynomials, and so, by Proposition 2.9, one only needs to prove closure. If L = p(x)D + q(x) and M = r(x)D + s(x) are operators of the given form, so is cL + dM = [cp(x) + dr(x)]D + [cq(x) + ds(x)] for any scalars c, d ∈ R. It is an infinite-dimensional vector space, since the operators x^i D and x^j for i, j = 0, 1, 2, . . . are all linearly independent.
(b) If L = p(x)D + q(x) and M = r(x)D + s(x), then
L ∘ M = prD² + (pr′ + qr + ps)D + (ps′ + qs),
M ∘ L = prD² + (p′r + qr + ps)D + (q′r + qs),
hence [L, M] = (pr′ − p′r)D + (ps′ − q′r).
(c) [L, M] = L, [M, N] = N, [N, L] = −2M, and so
[[L, M], N] + [[N, L], M] + [[M, N], L] = [L, N] − 2[M, M] + [N, L] = O.
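The Jacobi identity verified symbolically in 7.1.47(d) holds for arbitrary matrices, so a randomized numerical check is a useful sanity test; a short NumPy sketch:

```python
import numpy as np

def bracket(L, M):
    """Commutator [L, M] = L M - M L."""
    return L @ M - M @ L

rng = np.random.default_rng(0)
L, M, N = (rng.standard_normal((3, 3)) for _ in range(3))

jacobi = (bracket(bracket(L, M), N)
          + bracket(bracket(N, L), M)
          + bracket(bracket(M, N), L))
print(np.allclose(jacobi, 0))  # True
```

Since the identity follows from the cancellation in (d), it holds identically; the random matrices merely exercise a generic instance.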
7.1.50. Yes, it is a vector space, but the commutator of two second order differential operators is, in general, a third order differential operator. For example, [xD², D²] = −2D³.
7.1.51. (a) The inverse is the scaling transformation that halves the length of each vector. (b) The inverse is counterclockwise rotation by 45°. (c) The inverse is reflection through the y axis. (d) No inverse. (e) The inverse is the shearing transformation [ 1 −2 ; 0 1 ].
7.1.52.
(a) Function: [ 2 0 ; 0 2 ]; inverse: [ 1/2 0 ; 0 1/2 ].
(b) Function: [ 1/√2 1/√2 ; −1/√2 1/√2 ]; inverse: [ 1/√2 −1/√2 ; 1/√2 1/√2 ].
(c) Function: [ −1 0 ; 0 1 ]; inverse: [ −1 0 ; 0 1 ].
(d) Function: [ 1/2 1/2 ; 1/2 1/2 ]; no inverse.
(e) Function: [ 1 2 ; 0 1 ]; inverse: [ 1 −2 ; 0 1 ].
7.1.53. Since L has matrix representative [ 1 3 ; −1 −2 ], its inverse has matrix representative [ −2 −3 ; 1 1 ], and so L^{−1}[e1] = (−2, 1)^T and L^{−1}[e2] = (−3, 1)^T.
7.1.54. Since L has matrix representative [ 2 1 −1 ; 1 2 2 ; −1 1 2 ], its inverse has matrix representative [ −2/3 1 −4/3 ; 4/3 −1 5/3 ; −1 1 −1 ], and so L^{−1}[e1] = (−2/3, 4/3, −1)^T, L^{−1}[e2] = (1, −1, 1)^T, L^{−1}[e3] = (−4/3, 5/3, −1)^T.
♦ 7.1.55. If L ∘ M = L ∘ N = I_W and M ∘ L = N ∘ L = I_V, then, by associativity, M = M ∘ I_W = M ∘ (L ∘ N) = (M ∘ L) ∘ N = I_V ∘ N = N.
♥ 7.1.56.
(a) Every vector in V can be uniquely written as a linear combination of the basis elements: v = c1 v1 + ··· + cn vn. Assuming linearity, we compute
L[v] = L[c1 v1 + ··· + cn vn] = c1 L[v1] + ··· + cn L[vn] = c1 w1 + ··· + cn wn.
Since the coefficients c1, . . . , cn of v are uniquely determined, this formula serves to uniquely define the function L: V → W. We must then check that the resulting function is linear. Given any two vectors v = c1 v1 + ··· + cn vn, w = d1 v1 + ··· + dn vn in V, we have
L[v] = c1 w1 + ··· + cn wn, L[w] = d1 w1 + ··· + dn wn.
Then, for any a, b ∈ R,
L[av + bw] = L[(ac1 + bd1)v1 + ··· + (acn + bdn)vn]
= (ac1 + bd1)w1 + ··· + (acn + bdn)wn
= a(c1 w1 + ··· + cn wn) + b(d1 w1 + ··· + dn wn) = aL[v] + bL[w],
proving linearity of L.
(b) The inverse is uniquely defined by the requirement that L^{−1}[w_i] = v_i, i = 1, . . . , n. Note that L ∘ L^{−1}[w_i] = L[v_i] = w_i, and hence L ∘ L^{−1} = I_W since w1, . . . , wn is a basis. Similarly, L^{−1} ∘ L[v_i] = L^{−1}[w_i] = v_i, and so L^{−1} ∘ L = I_V.
(c) If A = (v1 v2 . . . vn), B = (w1 w2 . . . wn), then L has matrix representative BA^{−1}, while L^{−1} has matrix representative AB^{−1}.
(d) (i) L = [ 3 5 ; 1 2 ], L^{−1} = [ 2 −5 ; −1 3 ]; (ii) L = [ 1/3 1/3 ; 1 −1 ], L^{−1} = [ 3/2 1/2 ; 3/2 −1/2 ]; (iii) L = [ −1/2 1/2 1/2 ; 1/2 −1/2 1/2 ; 1/2 1/2 −1/2 ], L^{−1} = [ 0 1 1 ; 1 0 1 ; 1 1 0 ].
7.1.57. Let m = dim V = dim W. As guaranteed by Exercise 2.4.24, we can choose bases v1, . . . , vn and w1, . . . , wn of R^n such that v1, . . . , vm is a basis of V and w1, . . . , wm is a basis of W. We then define the invertible linear map L: R^n → R^n such that L[v_i] = w_i, i = 1, . . . , n, as in Exercise 7.1.56. Moreover, since L[v_i] = w_i, i = 1, . . . , m, maps the basis of V to the basis of W, it defines an invertible linear function from V to W.
7.1.58. Any m × n matrix of rank n < m. The inverse is not unique. For example, if A = (1, 0)^T, then B = (1 a), for any scalar a, satisfies BA = I = (1).
♦ 7.1.59. Use associativity of composition: N = N ∘ I_W = N ∘ L ∘ M = I_V ∘ M = M.
7.1.60.
(a) L[ax² + bx + c] = ax² + (b + 2a)x + (c + b);
L^{−1}[ax² + bx + c] = ax² + (b − 2a)x + (c − b + 2a) = e^{−x} ∫_{−∞}^x e^y p(y) dy.
(b) Any of the functions J_c[p] = ∫_0^x p(y) dy + c, where c is any constant, is a right inverse: D ∘ J_c = I. There is no left inverse, since ker D ≠ {0} contains all constant functions.
♥ 7.1.61.
(a) It forms a three-dimensional subspace, since it is spanned by the linearly independent functions x²eˣ, xeˣ, eˣ.
(b) D[f] = (ax² + (b + 2a)x + (c + b))eˣ is invertible, with inverse
D^{−1}[f] = (ax² + (b − 2a)x + (c − b + 2a))eˣ = ∫_{−∞}^x f(y) dy.
(c) D[p(x)eˣ] = [p′(x) + p(x)]eˣ, while D^{−1}[p(x)eˣ] = ∫_{−∞}^x p(y)e^y dy.
7.2.1.
(a) [ 1/√2 −1/√2 ; 1/√2 1/√2 ]. (i) The line y = x; (ii) the rotated square 0 ≤ x + y, y − x ≤ √2; (iii) the unit disk.
(b) [ −1 0 ; 0 −1 ]. (i) The x axis; (ii) the square −1 ≤ x, y ≤ 0; (iii) the unit disk.
(c) [ −3/5 4/5 ; 4/5 3/5 ]. (i) The line 4x + 3y = 0; (ii) the rotated square with vertices (0, 0)^T, (1/√2, 1/√2)^T, (0, √2)^T, (−1/√2, 1/√2)^T; (iii) the unit disk.
(d) [ 1 0 ; 2 1 ]. (i) The line y = 4x; (ii) the parallelogram with vertices (0, 0)^T, (1, 2)^T, (1, 3)^T, (0, 1)^T; (iii) the elliptical domain 5x² − 4xy + y² ≤ 1.
(e) [ −1/2 3/2 ; −3/2 5/2 ] = [ 1/√2 −1/√2 ; 1/√2 1/√2 ] [ 1 3 ; 0 1 ] [ 1/√2 1/√2 ; −1/√2 1/√2 ].
(i) The line y = 3x; (ii) the parallelogram with vertices (0, 0)^T, (−1/2, −3/2)^T, (1, 1)^T, (3/2, 5/2)^T; (iii) the elliptical domain (17/2)x² − 9xy + (5/2)y² ≤ 1.
(f) [ 1/5 2/5 ; 2/5 4/5 ]. (i) The line y = 2x; (ii) the line segment { (x, 2x)^T | 0 ≤ x ≤ 3/5 }; (iii) the line segment { (x, 2x)^T | −1/√5 ≤ x ≤ 1/√5 }.
7.2.2. Parallelogram with vertices (0, 0)^T, (1, 2)^T, (4, 3)^T, (3, 1)^T.
(a) Parallelogram with vertices (0, 0)^T, (1, 1)^T, (4, −1)^T, (3, −2)^T.
(b) Parallelogram with vertices (0, 0)^T, (2, 1)^T, (3, 4)^T, (1, 3)^T.
(c) Parallelogram with vertices (0, 0)^T, (5, 7)^T, (10, 8)^T, (5, 1)^T.
(d) Parallelogram with vertices (0, 0)^T, (1/√2 − √2, 1/√2 + √2)^T, (−3/√2 + 2√2, 3/√2 + 2√2)^T, (√2, 2√2)^T.
(e) Parallelogram with vertices (0, 0)^T, (3, 1)^T, (2, −1)^T, (−1, −2)^T.
(f) Line segment between (−1/2, 1/2)^T and (1, −1)^T.
(g) Line segment between (−1, −2)^T and (3, 6)^T.
(The accompanying plots are omitted.)
7.2.3.
(a) L² = [ −1 0 ; 0 −1 ] represents a rotation by θ = π;
(b) L is clockwise rotation by 90°, or, equivalently, counterclockwise rotation by 270°.
7.2.4. [ 0 1 ; 1 0 ]² = [ 1 0 ; 0 1 ]. L represents a reflection through the line y = x. Reflecting twice brings you back to where you started.
7.2.5. Writing A = [ 1 0 ; 2 1 ] [ 1 0 ; 0 −1 ] = [ 1 0 ; 0 −1 ] [ 1 0 ; −2 1 ], we see that it is the composition of a reflection in the x axis followed by a shear along the y axis with shear factor 2, which is the same as first doing a shear along the y axis with shear factor −2 and then reflecting in the x axis. If we perform L twice, the shears cancel each other out, while reflecting twice brings us back to where we started.
7.2.6. Its image is the line that goes through the image points (−1, 2)^T and (−4, −1)^T.
185
7.2.7. Example:
2 00 3
!. It is not unique, because you can compose it with any rotation or
reflection, e.g.,
2 0
0 3
!0B@
1√2− 1√
21√2
1√2
1CA =
0@√
2 −√
23√2
3√2
1A.
7.2.8. Example: A =
0B@
1 0 00 2 00 0 4
1CA. More generally, bA = AQ where Q is any 3 × 3 orthogonal
matrix.
7.2.9. (a) True. (b) True. (c) False: in general, squares are mapped to parallelograms. (d) False:in general circles are mapped to ellipses. (e) True.
♦ 7.2.10.
(a) The reflection through the line takes e1 to (cos θ, sin θ)^T and e2 to (sin θ, −cos θ)^T, and hence has the indicated matrix representative.
(b) [ cos φ sin φ ; sin φ −cos φ ] [ cos θ sin θ ; sin θ −cos θ ] = [ cos(φ − θ) −sin(φ − θ) ; sin(φ − θ) cos(φ − θ) ] is rotation by angle φ − θ. Composing in the other order gives the opposite rotation, through angle θ − φ.
♦ 7.2.11. (a) Let z be a unit vector that is orthogonal to u, so u, z form an orthonormal basis of R². Then L[u] = u = Ru since u^T u = 1, while L[z] = −z = Rz since u · z = u^T z = 0. Thus, L[v] = Rv since they agree on a basis of R². (b) R = 2vv^T/‖v‖² − I.
(c) (i) [ 1 0 ; 0 −1 ]; (ii) [ −7/25 −24/25 ; −24/25 7/25 ]; (iii) [ 0 1 ; 1 0 ]; (iv) [ −5/13 −12/13 ; −12/13 5/13 ].
7.2.12.
(a) [ 0 2 ; −3 1 ] = [ 0 1 ; 1 0 ] [ −3 0 ; 0 1 ] [ 1 0 ; 0 2 ] [ 1 −1/3 ; 0 1 ]:
a shear of magnitude −1/3 along the x axis, followed by a scaling in the y direction by a factor of 2, followed by a scaling in the x direction by a factor of 3 coupled with a reflection in the y axis, followed by a reflection in the line y = x.
(b) [ 1 1 ; −1 1 ] = [ 1 0 ; −1 1 ] [ 1 0 ; 0 2 ] [ 1 1 ; 0 1 ]:
a shear of magnitude 1 along the x axis, followed by a scaling in the y direction by a factor of 2, followed by a shear of magnitude −1 along the y axis.
(c) [ 3 1 ; 1 2 ] = [ 1 0 ; 1/3 1 ] [ 3 0 ; 0 1 ] [ 1 0 ; 0 5/3 ] [ 1 1/3 ; 0 1 ]:
a shear of magnitude 1/3 along the x axis, followed by a scaling in the y direction by a factor of 5/3, followed by a scaling of magnitude 3 in the x direction, followed by a shear of magnitude 1/3 along the y axis.
(d) [ 1 1 0 ; 1 0 1 ; 0 1 1 ] = [ 1 0 0 ; 1 1 0 ; 0 0 1 ] [ 1 0 0 ; 0 1 0 ; 0 −1 1 ] [ 1 0 0 ; 0 1 0 ; 0 0 2 ] [ 1 0 0 ; 0 −1 0 ; 0 0 1 ] [ 1 0 0 ; 0 1 −1 ; 0 0 1 ] [ 1 1 0 ; 0 1 0 ; 0 0 1 ]:
a shear of magnitude 1 along the x axis that fixes the xz plane, followed by a shear of magnitude −1 along the y axis that fixes the xy plane, followed by a reflection in the xz plane, followed by a scaling in the z direction by a factor of 2, followed by a shear of magnitude −1 along the z axis that fixes the xz plane, followed by a shear of magnitude 1 along the y axis that fixes the yz plane.
(e) [ 1 2 0 ; 2 4 1 ; 2 1 1 ] = [ 1 0 0 ; 0 0 1 ; 0 1 0 ] [ 1 0 0 ; 2 1 0 ; 0 0 1 ] [ 1 0 0 ; 0 1 0 ; 2 0 1 ] [ 1 0 0 ; 0 −3 0 ; 0 0 1 ] [ 1 0 0 ; 0 1 −1/3 ; 0 0 1 ] [ 1 2 0 ; 0 1 0 ; 0 0 1 ]:
a shear of magnitude 2 along the x axis that fixes the xz plane, followed by a shear of magnitude −1/3 along the y axis that fixes the xy plane, followed by a scaling in the y direction by a factor of 3 coupled with a reflection in the xz plane, followed by a shear of magnitude 2 along the z axis that fixes the yz plane, followed by a shear of magnitude 2 along the y axis that fixes the yz plane, followed by a reflection in the plane y − z = 0.
7.2.13.
(a) [ 1 a ; 0 1 ] [ 1 0 ; b 1 ] [ 1 a ; 0 1 ] = [ 1 + ab  2a + a²b ; b  1 + ab ] = [ cos θ −sin θ ; sin θ cos θ ], since
1 + ab = 1 − tan(θ/2) sin θ = 1 − 2 sin²(θ/2) = cos θ, and
2a + a²b = −2 tan(θ/2) + tan²(θ/2) sin θ = −2 tan(θ/2)(1 − sin²(θ/2)) = −2 cos(θ/2) sin(θ/2) = −sin θ.
(b) The factorization is not valid when θ is an odd multiple of π, where the matrix [ −1 0 ; 0 −1 ] represents rotation by 180°.
(c) The first and third factors represent shears along the x axis with shear factor a, while the middle factor is a shear along the y axis with shear factor b.
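The three-shears factorization of a rotation in 7.2.13(a), with a = −tan(θ/2) and b = sin θ, can be checked numerically for any sample angle; a short NumPy sketch:

```python
import numpy as np

theta = 0.7   # any angle that is not an odd multiple of pi
a = -np.tan(theta / 2)
b = np.sin(theta)

Sx = np.array([[1, a], [0, 1]])   # shear along the x axis, factor a
Sy = np.array([[1, 0], [b, 1]])   # shear along the y axis, factor b
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(Sx @ Sy @ Sx, R))  # True
```

This decomposition is the basis of "shear rotation" algorithms for raster images, since each shear can be applied one row or column at a time.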
7.2.14. [ 1 0 0 ; 0 1/2 −√3/2 ; 0 √3/2 1/2 ] = [ 1 0 0 ; 0 1 0 ; 0 √3 1 ] [ 1 0 0 ; 0 1/2 0 ; 0 0 1 ] [ 1 0 0 ; 0 1 0 ; 0 0 2 ] [ 1 0 0 ; 0 1 −√3 ; 0 0 1 ]:
a shear of magnitude −√3 along the y axis that fixes the xy plane, followed by a scaling in the z direction by a factor of 2, followed by a scaling in the y direction by a factor of 1/2, followed by a shear of magnitude √3 along the z axis that fixes the xz plane.
7.2.15. (a) [ 1 0 ; 0 0 ], (b) [ 1/2 1/2 ; 1/2 1/2 ], (c) [ 4/13 −6/13 ; −6/13 9/13 ].
♦ 7.2.16.
(a) If A = [ a b ; c d ] has rank 1, then its rows are linearly dependent, and hence either (a, b) = λ(c, d) or c = d = 0. In the first case A = (λ, 1)^T (c, d); in the second, A = (1, 0)^T (a, b). (See Exercise 1.8.15 for the general case.)
(b) When u = v is a unit vector, or, more generally, when v = u/‖u‖².
7.2.17. [ 1 0 0 ; 0 1 0 ; 0 0 1 ] is the identity transformation; [ 0 1 0 ; 1 0 0 ; 0 0 1 ] is a reflection in the plane x = y; [ 0 0 1 ; 0 1 0 ; 1 0 0 ] is a reflection in the plane x = z; [ 1 0 0 ; 0 0 1 ; 0 1 0 ] is a reflection in the plane y = z; [ 0 1 0 ; 0 0 1 ; 1 0 0 ] is rotation by 120° around the line x = y = z; [ 0 0 1 ; 1 0 0 ; 0 1 0 ] is rotation by 240° around the line x = y = z.
7.2.18. X_ψ = [ 1 0 0 ; 0 cos ψ sin ψ ; 0 −sin ψ cos ψ ].
7.2.19. det [ −1 0 ; 0 −1 ] = +1, representing a 180° rotation, while det [ −1 0 0 ; 0 −1 0 ; 0 0 −1 ] = −1, and so is a reflection, but through the origin, not a plane, since it doesn't fix any nonzero vectors.
♦ 7.2.20. If v is orthogonal to u, then u^T v = 0, and so Q_π v = −v, while since ‖u‖² = u^T u = 1, we have Q_π u = u. Thus, u is fixed, while every vector in the plane orthogonal to it is rotated through an angle π. This suffices to show that Q_π represents the indicated rotation.
♦ 7.2.21.
(a) First, w = (u · v)u = (u^T v)u = uu^T v is the orthogonal projection of v onto the line in the direction of u. So the reflected vector is v − 2w = (I − 2uu^T)v.
(b) R^T R = (I − 2uu^T)² = I − 4uu^T + 4uu^T uu^T = I because u^T u = ‖u‖² = 1. It is improper because it reverses orientation.
(c) (i) [ 7/25 0 −24/25 ; 0 1 0 ; −24/25 0 −7/25 ], (ii) [ 151/169 −24/169 72/169 ; −24/169 137/169 96/169 ; 72/169 96/169 −119/169 ], (iii) [ 2/3 2/3 −1/3 ; 2/3 −1/3 2/3 ; −1/3 2/3 2/3 ].
(d) Because the reflected vector is minus the rotated vector.
♦ 7.2.22. (a) In the formula for R_θ, the first factor rotates to align a with the z axis, the second rotates around the z axis by angle θ, while the third factor Q^T = Q^{−1} rotates the z axis back to the line through a. The combined effect is a rotation through angle θ around the axis a. (b) Set Q = [ 1 0 0 ; 0 0 −1 ; 0 1 0 ] and multiply out to produce Q Z_θ Q^T = Y_θ.
♥ 7.2.23.
(a) (i) 3 + i + 2j, (ii) 3 − i + j + k, (iii) −10 + 2i − 2j − 6k, (iv) 18.
(b) q q̄ = (a + bi + cj + dk)(a − bi − cj − dk) = a² + b² + c² + d² = ‖q‖², since all other terms in the product cancel.
(c) This can be easily checked for all basic products, e.g., (1 i)j = k = 1(ij), (ij)k = −1 = i(jk), (ii)j = −j = i(ij), etc. The general case follows by using the distributive property (or by a direct computation).
(d) First note that if a ∈ R ⊂ H, then aq = qa for any q ∈ H. Thus, for any a, b ∈ R, we use the distributive property to compute L_q[ar + bs] = q(ar + bs) = a qr + b qs = aL_q[r] + bL_q[s], and R_q[ar + bs] = (ar + bs)q = a rq + b sq = aR_q[r] + bR_q[s]. The matrix representatives are
L_q = [ a −b −c −d ; b a −d c ; c d a −b ; d −c b a ], R_q = [ a −b −c −d ; b a d −c ; c −d a b ; d c −b a ].
(e) By direct computation: L_q^T L_q = (a² + b² + c² + d²) I = R_q^T R_q, and so L_q^T L_q = I = R_q^T R_q when ‖q‖² = a² + b² + c² + d² = 1.
(f) For q = (b, c, d)^T, r = (x, y, z)^T, we have qr = (bi + cj + dk)(xi + yj + zk) = −(bx + cy + dz) + (cz − dy)i + (dx − bz)j + (by − cx)k, while q · r = bx + cy + dz and q × r = (cz − dy, dx − bz, by − cx)^T. The associativity law (qr)s = q(rs) implies that (q × r) · s = q · (r × s), which defines the vector triple product, and the cross product identity (q × r) × s − (q · r)s = q × (r × s) − (r · s)q.
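The 4 × 4 matrix representative L_q of 7.2.23(d) gives a convenient way to multiply quaternions numerically, and part (e) implies the norm is multiplicative; a small NumPy sketch using the sample quaternions from (a):

```python
import numpy as np

def Lq(q):
    """Matrix representative of left multiplication r -> q r, for q = (a, b, c, d)."""
    a, b, c, d = q
    return np.array([[a, -b, -c, -d],
                     [b,  a, -d,  c],
                     [c,  d,  a, -b],
                     [d, -c,  b,  a]])

q = np.array([3.0, 1.0, 2.0, 0.0])    # 3 + i + 2j
r = np.array([0.0, 1.0, -1.0, 1.0])   # i - j + k
qr = Lq(q) @ r
print(qr)  # [ 1.  5. -4.  0.], i.e. q r = 1 + 5i - 4j

# Part (e): L_q^T L_q = ||q||^2 I, so ||q r|| = ||q|| ||r||
print(np.isclose(np.linalg.norm(qr),
                 np.linalg.norm(q) * np.linalg.norm(r)))  # True
```

The same identity L_q^T L_q = ‖q‖² I is why unit quaternions act as orthogonal (rotation) matrices.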
7.2.24. (a) [ 1 −4 ; −2 3 ], (b) [ 1 −6 ; −4/3 3 ], (c) [ −1 0 ; 2 5 ], (d) [ −1 0 ; 0 5 ], (e) [ −3 −8 ; 2 7 ].
7.2.25. (a) [ −3 −1 −2 ; 6 1 6 ; 1 1 0 ], (b) [ −1 0 0 ; 0 −2 0 ; 0 0 1 ], (c) [ 1/5 0 −12/5 ; 0 −2 0 ; −2/5 0 −1/5 ].
7.2.26.
(a) Bases: (1, 0)^T, (0, 1)^T, and (1, 2)^T, (2, 1)^T; canonical form: [ 1 0 ; 0 1 ];
(b) bases: (1, 0, 0)^T, (−4, 0, 1)^T, (0, 4, 3)^T, and (1, −2)^T, (2, 1)^T; canonical form: [ 1 0 0 ; 0 0 0 ];
(c) bases: (1, 0)^T, (0, 1)^T, and (2, 0, −1)^T, (3, 4, 1)^T, (4, −5, 8)^T; canonical form: [ 1 0 ; 0 1 ; 0 0 ];
(d) bases: (1, 0, 0)^T, (0, 1, 0)^T, (1, −2, 3)^T, and (1, 1, 2)^T, (2, −1, 1)^T, (−1, −1, 1)^T; canonical form: [ 1 0 0 ; 0 1 0 ; 0 0 0 ];
(e) bases: (1, 0, 0, 0)^T, (0, 0, 1, 0)^T, (−3, 1, 0, 0)^T, (−1, 0, 4, 1)^T, and (1, 2, −1, 0)^T, (0, 1, −1, −1)^T, (−1, 1, 1, 0)^T, (−2, 1, 0, 1)^T; canonical form: [ 1 0 0 0 ; 0 1 0 0 ; 0 0 0 0 ; 0 0 0 0 ].
7.2.27. (a) Let v1, . . . ,vn be any basis for the domain space and choose wi = L[vi ] for i =1, . . . , n. Invertibility implies that w1, . . . ,wn are linearly independent, and so form a ba-
sis for the target space. (b) Only the identity transformation, since A = S I S−1 = I .
(c) (i)
10
!,
01
!, and
20
!,
02
!. (ii)
10
!,
01
!, and
0B@
1√2
1√2
1CA ,
0B@− 1√
21√2
1CA.
(iii)
10
!,
01
!, and
1−2
!,
01
!.
♦ 7.2.28.
(a) Given any $v \in \mathbb{R}^n$, write $v = c_1 v_1 + \cdots + c_n v_n$; then $Av = c_1 A v_1 + \cdots + c_n A v_n = c_1 w_1 + \cdots + c_n w_n$, and hence $Av$ is uniquely defined. In particular, the value of $A e_i$ uniquely specifies the $i$th column of $A$.
(b) A = CB−1, where B = (v1 v2 . . . vn ), C = (w1 w2 . . . wn ).
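Part (b) is easy to exercise numerically. The following sketch (with an arbitrarily chosen invertible basis matrix $B$ and image matrix $C$, both hypothetical) checks that $A = C B^{-1}$ satisfies $A v_i = w_i$ for every basis vector:

```python
import numpy as np

# columns of B: a basis v_1, v_2, v_3 of R^3; columns of C: prescribed images w_i
B = np.array([[1., 0., 1.],
              [0., 1., 1.],
              [0., 0., 1.]])
C = np.array([[2., 1., 0.],
              [0., 3., 1.],
              [1., 1., 1.]])
A = C @ np.linalg.inv(B)            # the unique matrix with A v_i = w_i
for i in range(3):
    assert np.allclose(A @ B[:, i], C[:, i])
```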
♦ 7.2.29. (a) Let $Q$ have columns $u_1, \ldots, u_n$, so $Q$ is an orthogonal matrix. Then the matrix representative in the orthonormal basis is $B = Q^{-1} A Q = Q^T A Q$, and $B^T = Q^T A^T (Q^T)^T = Q^T A Q = B$. (b) Not necessarily. For example, if $A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$ and $S = \begin{pmatrix} 1 & -1 \\ 1 & 0 \end{pmatrix}$, then $S^{-1} A S = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}$ is not symmetric.
7.2.30. (a) Write $\langle x, y \rangle = x^T K y$ where $K > 0$. Using the Cholesky factorization (3.70), write $K = M M^T$ where $M$ is invertible. Let $M^{-T} = (v_1\; v_2\; \ldots\; v_n)$ define the basis. Then
$x = \sum_{i=1}^n c_i v_i = M^{-T} c$, $y = \sum_{i=1}^n d_i v_i = M^{-T} d$,
implies that $\langle x, y \rangle = x^T K y = c^T M^{-1} M M^T M^{-T} d = c^T d = c \cdot d$. The basis is not unique since one can right multiply by any orthogonal matrix $Q$ to produce another one: $M^{-T} Q = (\tilde v_1\; \tilde v_2\; \ldots\; \tilde v_n)$. (b) (i) $(\sqrt 2, 0)^T$, $(0, \sqrt 3)^T$; (ii) $(1, 0)^T$, $\left( \frac{1}{\sqrt 3}, \frac{2}{\sqrt 3} \right)^T$.
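A numerical sketch of the construction in part (a) (with an illustrative positive definite Gram matrix $K$, not one from the exercise): the columns of $M^{-T}$ are orthonormal with respect to $\langle x, y \rangle = x^T K y$, since their Gram matrix is the identity:

```python
import numpy as np

K = np.array([[2., 1.],
              [1., 3.]])            # sample positive definite inner product matrix
M = np.linalg.cholesky(K)           # K = M M^T with M lower triangular
V = np.linalg.inv(M.T)              # basis vectors v_i = columns of M^{-T}
G = V.T @ K @ V                     # Gram matrix <v_i, v_j> of the new basis
assert np.allclose(G, np.eye(2))    # orthonormal in the K inner product
```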
7.3.1.
(a) (i) The horizontal line $y = -1$; (ii) the disk $(x-2)^2 + (y+1)^2 \le 1$ of radius 1 centered at $(2, -1)^T$; (iii) the square $2 \le x \le 3$, $-1 \le y \le 0$.
(b) (i) The $x$-axis; (ii) the ellipse $\frac19 (x+1)^2 + \frac14 y^2 \le 1$; (iii) the rectangle $-1 \le x \le 2$, $0 \le y \le 2$.
(c) (i) The horizontal line $y = 2$; (ii) the elliptical domain $x^2 - 4xy + 5y^2 + 6x - 16y + 12 \le 0$; (iii) the parallelogram with vertices $(1, 2)^T$, $(2, 2)^T$, $(4, 3)^T$, $(3, 3)^T$.
(d) (i) The line $x = 1$; (ii) the disk $(x-1)^2 + y^2 \le 1$ of radius 1 centered at $(1, 0)^T$; (iii) the square $1 \le x \le 2$, $-1 \le y \le 0$.
(e) (i) The line $4x + 3y + 6 = 0$; (ii) the disk $(x+3)^2 + (y-2)^2 \le 1$ of radius 1 centered at $(-3, 2)^T$; (iii) the rotated square with corners $(-3, 2)$, $(-2.4, 1.2)$, $(-1.6, 1.8)$, $(-2.2, 2.6)$.
(f) (i) The line $y = x - 1$; (ii) the line segment from $\left( 1 - \frac{1}{\sqrt 2}, -\frac{1}{\sqrt 2} \right)^T$ to $\left( 1 + \frac{1}{\sqrt 2}, \frac{1}{\sqrt 2} \right)^T$; (iii) the line segment from $(1, 0)^T$ to $(2, 1)^T$.
(g) (i) The line $x + y + 1 = 0$; (ii) the disk $(x-2)^2 + (y+3)^2 \le 2$ of radius $\sqrt 2$ centered at $(2, -3)^T$; (iii) the rotated square with corners $(2, -3)$, $(3, -4)$, $(4, -3)$, $(3, -2)$.
(h) (i) The line $x + y = 2$; (ii) the line segment from $(1 - \sqrt 5, 1 + \sqrt 5)^T$ to $(1 + \sqrt 5, 1 - \sqrt 5)^T$; (iii) the line segment from $(1, 1)^T$ to $(4, -2)^T$.
7.3.2.
(a) $T_3 T_4[x] = \begin{pmatrix} -2 & 1 \\ -1 & 0 \end{pmatrix} x + \begin{pmatrix} 2 \\ 2 \end{pmatrix}$, with $\begin{pmatrix} -2 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $\begin{pmatrix} 2 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \end{pmatrix}$;
(b) $T_4 T_3[x] = \begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix} x + \begin{pmatrix} 3 \\ -1 \end{pmatrix}$, with $\begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 3 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix}$;
(c) $T_3 T_6[x] = \begin{pmatrix} \frac32 & \frac32 \\ \frac12 & \frac12 \end{pmatrix} x + \begin{pmatrix} 2 \\ 2 \end{pmatrix}$, with $\begin{pmatrix} \frac32 & \frac32 \\ \frac12 & \frac12 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$, $\begin{pmatrix} 2 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \end{pmatrix}$;
(d) $T_6 T_3[x] = \begin{pmatrix} \frac12 & \frac32 \\ \frac12 & \frac32 \end{pmatrix} x + \begin{pmatrix} \frac52 \\ \frac32 \end{pmatrix}$, with $\begin{pmatrix} \frac12 & \frac32 \\ \frac12 & \frac32 \end{pmatrix} = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} \frac52 \\ \frac32 \end{pmatrix} = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix}$;
(e) $T_7 T_8[x] = \begin{pmatrix} 0 & 0 \\ -4 & -2 \end{pmatrix} x + \begin{pmatrix} 4 \\ -3 \end{pmatrix}$, with $\begin{pmatrix} 0 & 0 \\ -4 & -2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ -2 & -1 \end{pmatrix}$, $\begin{pmatrix} 4 \\ -3 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 2 \\ -3 \end{pmatrix}$;
(f) $T_8 T_7[x] = \begin{pmatrix} 1 & 3 \\ -1 & -3 \end{pmatrix} x + \begin{pmatrix} 2 \\ 0 \end{pmatrix}$, with $\begin{pmatrix} 1 & 3 \\ -1 & -3 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ -2 & -1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$, $\begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ -2 & -1 \end{pmatrix} \begin{pmatrix} 2 \\ -3 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
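All six compositions follow the same rule: for $F[x] = Ax + a$ and $G[x] = Bx + b$, the composition is $F \circ G[x] = (AB)\,x + (Ab + a)$. A minimal sketch (NumPy assumed available) checking case (a):

```python
import numpy as np

def compose(A, a, B, b):
    # (F o G)[x] = A(Bx + b) + a = (A B) x + (A b + a)
    return A @ B, A @ b + a

A3, b3 = np.array([[1., 2.], [0., 1.]]), np.array([1., 2.])   # T3
A4, b4 = np.array([[0., 1.], [-1., 0.]]), np.array([1., 0.])  # T4

C, c = compose(A3, b3, A4, b4)                 # T3 T4
assert np.allclose(C, [[-2., 1.], [-1., 0.]])
assert np.allclose(c, [2., 2.])
```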
7.3.3. (a) True. (b) True. (c) False: in general, squares are mapped to parallelograms. (d) False: in general, circles are mapped to ellipses. (e) True.
7.3.4. The triangle with vertices (−1,−6), (7,−2), (1, 6).
7.3.5.
(a) if and only if their matrices are mutual inverses: $B = A^{-1}$;
(b) if and only if $c = Ba + b = 0$, as in (7.34), so $b = -Ba$.
7.3.6.
(a) $F[x] = Ax + b$ has an inverse if and only if $A$ is nonsingular.
(b) Yes: $F^{-1}[x] = A^{-1} x - A^{-1} b$.
(c) $T_1^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -2 \\ 1 \end{pmatrix}$, $T_2^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \frac13 & 0 \\ 0 & \frac12 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} \frac13 \\ 0 \end{pmatrix}$, $T_3^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 3 \\ -2 \end{pmatrix}$, $T_4^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0 \\ -1 \end{pmatrix}$, $T_5^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} .6 & -.8 \\ .8 & .6 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 3.4 \\ 1.2 \end{pmatrix}$, $T_6$ has no inverse, $T_7^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \frac12 & -\frac12 \\ \frac12 & \frac12 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -\frac52 \\ \frac12 \end{pmatrix}$, $T_8$ has no inverse.
♦ 7.3.7.(a) First b = w0 = F [0 ], while Avi = F [vi ] − b = wi − w0 for i = 1, . . . , n. Therefore,
knowing its action on the basis vectors uniquely prescribes the matrix A.(b) A = (w1 −w0 w2 −w0 . . . wn −w0 ) and b = w0.
(c) A = CB−1, where B = (v1 v2 . . . vn ), C = (w1 −w0 . . . wn −w0 ), while b = w0.
7.3.8. It can be regarded as a subspace of the vector space of all functions from $\mathbb{R}^n$ to $\mathbb{R}^n$, and so one only needs to prove closure. If $F[x] = Ax + b$ and $G[x] = Cx + d$, then $(F + G)[x] = (A + C)x + (b + d)$ and $(cF)[x] = (cA)x + (cb)$ are affine for all scalars $c$. The dimension is $n^2 + n$; a basis consists of the $n^2$ linear functions $L_{ij}[x] = E_{ij}\, x$, where $E_{ij}$ is the $n \times n$ matrix with a single 1 in the $(i, j)$ entry and zeros everywhere else, along with the $n$ translations $T_i[x] = x + e_i$, where $e_i$ is the $i$th standard basis vector.
♦ 7.3.9. (a) $\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix} = \begin{pmatrix} Ax + b \\ 1 \end{pmatrix}$, (b) $\begin{pmatrix} B & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} A & a \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} BA & Ba + b \\ 0 & 1 \end{pmatrix}$.
(c) The inverse of $F[x] = Ax + b$ is $F^{-1}[y] = A^{-1}(y - b) = A^{-1} y - A^{-1} b$. The inverse matrix is $\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & -A^{-1} b \\ 0 & 1 \end{pmatrix}$.
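The homogeneous-coordinate trick of parts (a)–(c) is easy to exercise numerically; this sketch (arbitrary sample $A$, $b$) checks both the action on augmented vectors and the inverse formula:

```python
import numpy as np

def homogeneous(A, b):
    # embed the affine map x -> Ax + b as an (n+1) x (n+1) matrix
    n = len(b)
    H = np.eye(n + 1)
    H[:n, :n], H[:n, n] = A, b
    return H

A, b = np.array([[1., 2.], [0., 1.]]), np.array([1., 2.])
x = np.array([3., -1.])
y = homogeneous(A, b) @ np.append(x, 1.0)
assert np.allclose(y[:2], A @ x + b) and y[2] == 1.0
# the inverse affine map corresponds to the inverse block matrix
Ainv = np.linalg.inv(A)
assert np.allclose(np.linalg.inv(homogeneous(A, b)),
                   homogeneous(Ainv, -Ainv @ b))
```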
7.3.10. (a), (b), (e) are isometries.
7.3.11. (a) True. (b) True if n is even; false if n is odd, since det(− I n) = (−1)n.
7.3.12. Write $y = F[x] = Q(x - a) + a$ where $Q = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ represents a rotation through an angle of $90^\circ$, and $a = \left( \frac32, -\frac12 \right)^T$. Thus, the vector $y - a = Q(x - a)$ is obtained by rotating the vector $x - a$ by $90^\circ$, and so the point $y$ is obtained by rotating $x$ by $90^\circ$ around the point $a$.
♦ 7.3.13. If $Q = I$, then $F[x] = x + a$ is a translation. Otherwise, since we are working in $\mathbb{R}^2$, by Exercise 1.5.7(c), the matrix $Q - I$ is invertible. Setting $c = (I - Q)^{-1} a$, so that $a = (I - Q)\,c$, we rewrite $y = F[x]$ as $y - c = Q(x - c)$, so the vector $y - c$ is obtained by rotating $x - c$ according to $Q$. We conclude that $F$ represents a rotation around the point $c$.
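A numerical check of the fixed-point construction (sample angle and translation, not from the exercise): the center $c$ solving $(I - Q)\,c = a$ is fixed by $F[x] = Qx + a$, so $F$ rotates about $c$:

```python
import numpy as np

theta = 0.7                                   # any angle with Q != I
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
a = np.array([3.0, -1.0])
c = np.linalg.solve(np.eye(2) - Q, a)         # center: (I - Q) c = a
assert np.allclose(Q @ c + a, c)              # F fixes the center
x = np.array([0.5, 2.0])
assert np.allclose(Q @ x + a, Q @ (x - c) + c)  # F[x] = Q(x - c) + c
```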
7.3.14.
(a) $F[x] = \begin{pmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix} x$; $G[x] = x + \begin{pmatrix} 1 \\ 0 \end{pmatrix}$;
$F\,G[x] = \begin{pmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix} x + \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix} \left[ x - \begin{pmatrix} -\frac12 \\ \frac{\sqrt 2 + 1}{2} \end{pmatrix} \right] + \begin{pmatrix} -\frac12 \\ \frac{\sqrt 2 + 1}{2} \end{pmatrix}$
is counterclockwise rotation around the point $\left( -\frac12, \frac{\sqrt 2 + 1}{2} \right)^T$ by $45^\circ$;
$G\,F[x] = \begin{pmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix} x + \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix} \left[ x - \begin{pmatrix} \frac12 \\ \frac{\sqrt 2 + 1}{2} \end{pmatrix} \right] + \begin{pmatrix} \frac12 \\ \frac{\sqrt 2 + 1}{2} \end{pmatrix}$
is counterclockwise rotation around the point $\left( \frac12, \frac{\sqrt 2 + 1}{2} \right)^T$ by $45^\circ$.
(b) $F[x] = \begin{pmatrix} \frac{\sqrt 3}{2} & -\frac12 \\ \frac12 & \frac{\sqrt 3}{2} \end{pmatrix} \left[ x - \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right] + \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{\sqrt 3}{2} & -\frac12 \\ \frac12 & \frac{\sqrt 3}{2} \end{pmatrix} x + \begin{pmatrix} \frac{3 - \sqrt 3}{2} \\ \frac{1 - \sqrt 3}{2} \end{pmatrix}$;
$G[x] = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \left[ x - \begin{pmatrix} -2 \\ 1 \end{pmatrix} \right] + \begin{pmatrix} -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} x + \begin{pmatrix} -1 \\ 3 \end{pmatrix}$;
$F\,G[x] = \begin{pmatrix} -\frac12 & -\frac{\sqrt 3}{2} \\ \frac{\sqrt 3}{2} & -\frac12 \end{pmatrix} x + \begin{pmatrix} -\sqrt 3 \\ \sqrt 3 \end{pmatrix} = \begin{pmatrix} -\frac12 & -\frac{\sqrt 3}{2} \\ \frac{\sqrt 3}{2} & -\frac12 \end{pmatrix} \left[ x - \begin{pmatrix} \frac{-1 - \sqrt 3}{2} \\ \frac{-1 + \sqrt 3}{2} \end{pmatrix} \right] + \begin{pmatrix} \frac{-1 - \sqrt 3}{2} \\ \frac{-1 + \sqrt 3}{2} \end{pmatrix}$
is counterclockwise rotation around the point $\left( \frac{-1 - \sqrt 3}{2}, \frac{-1 + \sqrt 3}{2} \right)^T$ by $120^\circ$;
$G\,F[x] = \begin{pmatrix} -\frac12 & -\frac{\sqrt 3}{2} \\ \frac{\sqrt 3}{2} & -\frac12 \end{pmatrix} x + \begin{pmatrix} \frac{-3 + \sqrt 3}{2} \\ \frac{9 - \sqrt 3}{2} \end{pmatrix} = \begin{pmatrix} -\frac12 & -\frac{\sqrt 3}{2} \\ \frac{\sqrt 3}{2} & -\frac12 \end{pmatrix} \left[ x - \begin{pmatrix} \frac{-1 - \sqrt 3}{2} \\ \frac{5 - \sqrt 3}{2} \end{pmatrix} \right] + \begin{pmatrix} \frac{-1 - \sqrt 3}{2} \\ \frac{5 - \sqrt 3}{2} \end{pmatrix}$
is counterclockwise rotation around the point $\left( \frac{-1 - \sqrt 3}{2}, \frac{5 - \sqrt 3}{2} \right)^T$ by $120^\circ$.
(c) $F[x] = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} x + \begin{pmatrix} -1 \\ 1 \end{pmatrix}$;
$G[x] = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \left[ x - \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right] + \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} x + \begin{pmatrix} 2 \\ 2 \end{pmatrix}$;
$F\,G[x] = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} x + \begin{pmatrix} 1 \\ 3 \end{pmatrix}$ is a glide reflection (see Exercise 7.3.16) along the line $x + y = 2$ by a distance $\sqrt 2$;
$G\,F[x] = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} x + \begin{pmatrix} 3 \\ 1 \end{pmatrix}$ is a glide reflection (see Exercise 7.3.16) along the same line $x + y = 2$, in the opposite direction, by a distance $\sqrt 2$.
♥ 7.3.15.
(a) If $F[x] = Qx + a$ and $G[x] = Rx + b$, then $G\,F[x] = RQ\,x + (Ra + b) = Sx + c$ is an isometry since $S = RQ$, the product of two orthogonal matrices, is also an orthogonal matrix.
(b) If $F[x] = x + a$ and $G[x] = x + b$, then $G\,F[x] = x + (a + b) = x + c$.
(c) Using Exercise 7.3.13, the rotation $F[x] = Qx + a$ has $Q \ne I$, while $G[x] = x + b$ is the translation. Then $G\,F[x] = Qx + (a + b) = Qx + c$ and $F\,G[x] = Qx + (a + Qb) = Qx + \tilde c$ are both rotations.
(d) From part (a), $G\,F[x] = RQ\,x + (Ra + b) = Sx + c$ is a translation if and only if $R = Q^{-1}$, i.e., the two rotations are by opposite angles.
(e) Write $x + c = G\,F[x]$ where $F[x] = Qx$ and $G[x] = Q^{-1}x + c$ for any $Q \ne I$.
♦ 7.3.16. (a) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} x + \begin{pmatrix} 2 \\ 0 \end{pmatrix} = \begin{pmatrix} x + 2 \\ -y \end{pmatrix}$, (b) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} x + \begin{pmatrix} \frac{3}{\sqrt 2} \\ \frac{3}{\sqrt 2} \end{pmatrix} = \begin{pmatrix} y + \frac{3}{\sqrt 2} \\ x + \frac{3}{\sqrt 2} \end{pmatrix}$,
(c) $\begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \left( x - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) + \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} \sqrt 2 \\ -\sqrt 2 \end{pmatrix} = \begin{pmatrix} -y + 1 + \sqrt 2 \\ -x + 1 - \sqrt 2 \end{pmatrix}$.
♦ 7.3.17.(a) F [x ] = R(x − a) + a where R = 2uuT − I is the elementary reflection matrix corre-
sponding to the line in the direction of u through the origin.
(b) G[x ] = R(x− a) + a + du, where R = 2uuT − I is the same reflection matrix.(c) Let F [x ] = Rx + b with R an improper orthogonal matrix. According to (5.31) and
Exercise 7.2.10, R represents a reflection through a line L = ker( I − R) that goesthrough the origin. The affine map is a reflection through a line ` parallel to L andpassing through a point a if it takes the form F [x ] = R(x − a) + a, which requires
b ∈ rng ( I −R) = L⊥. Otherwise, we decompose b = c+ e with c = ( I −R)a ∈ L⊥ ande ∈ L = ker( I − R) 6= 0. Then the affine map takes the form F [x ] = R(x − a) + a + e,which is a glide reflection along the line ` by a distance ‖ e ‖.
♥ 7.3.18.
(a) Let $A = \{\, x + b \mid x \in W \,\}$, where $W \subsetneq \mathbb{R}^n$ is a proper subspace, be the affine subspace. If $a_i = x_i + b \in A$ for all $i$, then the differences $a_i - a_j = x_i - x_j \in W$ all belong to a proper subspace of $\mathbb{R}^n$, and so they cannot span $\mathbb{R}^n$. Conversely, let $W$ be the span of all $a_i - a_j$. Then we can write $a_i = x_i + b$, where $x_i = a_i - a_1 \in W$ and $b = a_1$, and so all $a_i$ belong to the affine subspace $A$.
(b) Let $v_i = a_i - a_0$, $w_i = b_i - b_0$, for $i = 1, \ldots, n$. Then, by the assumption, $v_i \cdot v_i =$
‖vi ‖2 = ‖ai − a0 ‖2 = ‖bi − b0 ‖2 = ‖wi ‖2 = wi · wi for all i = 1, . . . , n, while
‖vi ‖2 − 2vi · vj + ‖vj ‖2 = ‖vi − vj ‖2 = ‖ai − aj ‖2 = ‖bi − bj ‖2 = ‖wi −wj ‖2 =
‖wi ‖2 − 2wi · wj + ‖wj ‖2, and hence vi · vj = wi · wj for all i 6= j. Thus, we have
verified the hypotheses of Exercise 5.3.20, and so there is an orthogonal matrix Q suchthat wi = Qvi for i = 1, . . . , n. Therefore, bi = wi + b0 = Qvi + b0 = Qai + (b0 −Qa0) = F [ai ] where F [x ] = Qx + c with c = b0 −Qa0 is the desired isometry.
♦ 7.3.19. First, ‖v + w ‖2 = ‖v ‖2 + 2 〈v ,w 〉+ ‖w ‖2,‖L[v + w ] ‖2 = ‖L[v ] + L[w ] ‖2 = ‖L[v ] ‖2 + 2 〈L[v ] , L[w ] 〉+ ‖L[w ] ‖2.
If L is an isometry, ‖L[v + w ] ‖ = ‖v + w ‖, ‖L[v ] ‖ = ‖v ‖, ‖L[w ] ‖ = ‖w ‖. Thus,equating the previous two formulas, we conclude that 〈L[v ] , L[w ] 〉 = 〈v ,w 〉.
♦ 7.3.20. First, if L is an isometry, and ‖u ‖ = 1 then ‖L[u ] ‖ = 1, proving that L[u ] ∈ S1.Conversely, if L preserves the unit sphere, and 0 6= v ∈ V , then u = v/r ∈ S1 wherer = ‖v ‖, so ‖L[v ] ‖ = ‖L[rv ] ‖ = ‖ r L[u ] ‖ = r ‖L[u ] ‖ = r = ‖v ‖, proving (7.37).
7.3.21.(a) All affine transformations F [x ] = Qx + b where b is arbitrary and Q is a symmetry
of the unit square, and so a rotation by 0, 90, 180 or 270 degrees, or a reflection in the xaxis, the y axis, the line y = x or the line y = −x.
(b) Same form, but where Q is one of 48 symmetries of the unit cube, consisting of 24 rota-tions and 24 reflections. The rotations are the identity, the 9 rotations by 90, 180 or 270degrees around the coordinate axes, the 6 rotations by 180 degrees around the 6 linesthrough opposite edges, e.g., x = ±y, z = 0, and the 8 rotations by 120 or 240 degreesaround the 4 diagonal lines x = ±y = ±z. The reflections are obtained by multiplyingeach of the rotations by − I , and represent either a reflection through the origin, in thecase of − I , or a reflection through the plane orthogonal to the axis of the rotation inthe other cases.
7.3.22. Same answer as previous exercise. Now the transformation must preserve the unit dia-mond/octahedron, which has the same (linear) symmetries as the unit square/cube.
♥ 7.3.23.
(a) $q(Hx) = (x \cosh\alpha + y \sinh\alpha)^2 - (x \sinh\alpha + y \cosh\alpha)^2 = (\cosh^2\alpha - \sinh^2\alpha)(x^2 - y^2) = x^2 - y^2 = q(x)$.
(b) $(ax + by + e)^2 - (cx + dy + f)^2 = x^2 - y^2$ if and only if $a^2 - c^2 = 1$, $d^2 - b^2 = 1$, $ab = cd$, $e = f = 0$. Thus, $a = \pm\cosh\alpha$, $c = \sinh\alpha$, $d = \pm\cosh\beta$, $b = \sinh\beta$, and $\sinh(\alpha - \beta) = 0$, and so $\alpha = \beta$. Thus, the complete collection of linear (and affine) transformations preserving $q(x)$ is
$\begin{pmatrix} \cosh\alpha & \sinh\alpha \\ \sinh\alpha & \cosh\alpha \end{pmatrix}$, $\begin{pmatrix} \cosh\alpha & \sinh\alpha \\ -\sinh\alpha & -\cosh\alpha \end{pmatrix}$, $\begin{pmatrix} -\cosh\alpha & -\sinh\alpha \\ \sinh\alpha & \cosh\alpha \end{pmatrix}$, $\begin{pmatrix} -\cosh\alpha & -\sinh\alpha \\ -\sinh\alpha & -\cosh\alpha \end{pmatrix}$.
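The invariance in part (a) can be spot-checked numerically for sample values of $\alpha$, $x$, $y$ (a sketch, not part of the original solution):

```python
import math

def H(alpha):
    # hyperbolic rotation matrix, as rows of a 2x2 tuple
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    return ((ch, sh), (sh, ch))

def q(x, y):
    return x * x - y * y

alpha, (x, y) = 0.8, (1.3, -0.4)
(a11, a12), (a21, a22) = H(alpha)
# q is preserved: q(Hx) = q(x)
assert abs(q(a11 * x + a12 * y, a21 * x + a22 * y) - q(x, y)) < 1e-12
```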
♥ 7.3.24. (a) $q = \left( \dfrac{x}{1 - y},\; 0 \right)^T$, (b) $q = \left( \dfrac{y}{1 + y - x},\; \dfrac{y}{1 + y - x} \right)^T$. The maps are nonlinear — not affine; they are not isometries because the distance between points is not preserved.
7.4.1.
(a) $L(x) = 3x$; domain $\mathbb{R}$; target $\mathbb{R}$; right hand side $-5$; inhomogeneous.
(b) $L(x, y, z) = x - y - z$; domain $\mathbb{R}^3$; target $\mathbb{R}$; right hand side $0$; homogeneous.
(c) $L(u, v, w) = \begin{pmatrix} u - 2v \\ v - w \end{pmatrix}$; domain $\mathbb{R}^3$; target $\mathbb{R}^2$; right hand side $\begin{pmatrix} -3 \\ -1 \end{pmatrix}$; inhomogeneous.
(d) $L(p, q) = \begin{pmatrix} 3p - 2q \\ p + q \end{pmatrix}$; domain $\mathbb{R}^2$; target $\mathbb{R}^2$; right hand side $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$; homogeneous.
(e) $L[u] = u'(x) + 3x\,u(x)$; domain $C^1(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side $0$; homogeneous.
(f) $L[u] = u'(x)$; domain $C^1(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side $-3x$; inhomogeneous.
(g) $L[u] = \begin{pmatrix} u'(x) - u(x) \\ u(0) \end{pmatrix}$; domain $C^1(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}$; right hand side $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$; inhomogeneous.
(h) $L[u] = \begin{pmatrix} u''(x) - u(x) \\ u(0) - 3u(1) \end{pmatrix}$; domain $C^2(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}$; right hand side $\begin{pmatrix} e^x \\ 0 \end{pmatrix}$; inhomogeneous.
(i) $L[u] = \begin{pmatrix} u''(x) + x^2 u(x) \\ u(0) \\ u'(0) \end{pmatrix}$; domain $C^2(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}^2$; right hand side $\begin{pmatrix} 3x \\ 1 \\ 0 \end{pmatrix}$; inhomogeneous.
(j) $L[u, v] = \begin{pmatrix} u'(x) - v(x) \\ -2u(x) + v'(x) \end{pmatrix}$; domain $C^1(\mathbb{R}) \times C^1(\mathbb{R})$; target $C^0(\mathbb{R}) \times C^0(\mathbb{R})$; right hand side $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$; homogeneous.
(k) $L[u, v] = \begin{pmatrix} u''(x) - v''(x) - 2u(x) + v(x) \\ u(0) - v(0) \\ u(1) - v(1) \end{pmatrix}$; domain $C^2(\mathbb{R}) \times C^2(\mathbb{R})$; target $C^0(\mathbb{R}) \times \mathbb{R}^2$; right hand side $\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$; homogeneous.
(l) $L[u] = u(x) + 3\int_0^x u(y)\,dy$; domain $C^0(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side the constant function 1; inhomogeneous.
(m) $L[u] = \int_0^\infty u(t)\,e^{-st}\,dt$; domain $C^0(\mathbb{R})$; target $C^0(\mathbb{R})$; right hand side $1 + s^2$; inhomogeneous.
(n) $L[u] = \int_0^1 u(x)\,dx - u\!\left(\frac12\right)$; domain $C^0(\mathbb{R})$; target $\mathbb{R}$; right hand side $0$; homogeneous.
(o) $L[u, v] = \int_0^1 u(y)\,dy - \int_0^1 y\,v(y)\,dy$; domain $C^0(\mathbb{R}) \times C^0(\mathbb{R})$; target $\mathbb{R}$; right hand side $0$; homogeneous.
(p) $L[u] = \dfrac{\partial u}{\partial t} + 2\,\dfrac{\partial u}{\partial x}$; domain $C^1(\mathbb{R}^2)$; target $C^0(\mathbb{R}^2)$; right hand side the constant function 1; inhomogeneous.
(q) $L[u] = \begin{pmatrix} \partial u/\partial x - \partial v/\partial y \\ \partial u/\partial y + \partial v/\partial x \end{pmatrix}$; domain $C^1(\mathbb{R}^2) \times C^1(\mathbb{R}^2)$; target $C^0(\mathbb{R}^2)$; right hand side the constant vector-valued function $0$; homogeneous.
(r) $L[u] = -\dfrac{\partial^2 u}{\partial x^2} - \dfrac{\partial^2 u}{\partial y^2}$; domain $C^2(\mathbb{R}^2)$; target $C^0(\mathbb{R}^2)$; right hand side $x^2 + y^2 - 1$; inhomogeneous.
7.4.2. $L[u] = u(x) + \int_a^b K(x, y)\,u(y)\,dy$. The domain is $C^0(\mathbb{R})$ and the target is $C^0(\mathbb{R})$. To show linearity, for constants $c, d$,
$L[cu + dv] = [c\,u(x) + d\,v(x)] + \int_a^b K(x, y)\,[c\,u(y) + d\,v(y)]\,dy = c \left( u(x) + \int_a^b K(x, y)\,u(y)\,dy \right) + d \left( v(x) + \int_a^b K(x, y)\,v(y)\,dy \right) = c\,L[u] + d\,L[v]$.
7.4.3. $L[u] = u(t) + \int_a^t K(t, s)\,u(s)\,ds$. The domain is $C^0(\mathbb{R})$ and the target is $C^0(\mathbb{R})$. To show linearity, for any constants $c, d$,
$L[cu + dv] = [c\,u(t) + d\,v(t)] + \int_a^t K(t, s)\,[c\,u(s) + d\,v(s)]\,ds = c \left( u(t) + \int_a^t K(t, s)\,u(s)\,ds \right) + d \left( v(t) + \int_a^t K(t, s)\,v(s)\,ds \right) = c\,L[u] + d\,L[v]$.
7.4.4.
(a) Since $a$ is constant, by the Fundamental Theorem of Calculus,
$\dfrac{du}{dt} = \dfrac{d}{dt} \left( a + \int_0^t k(s)\,u(s)\,ds \right) = k(t)\,u(t)$. Moreover, $u(0) = a + \int_0^0 k(s)\,u(s)\,ds = a$.
(b) (i) $u(t) = 2\,e^{-t}$, (ii) $u(t) = e^{t^2 - 1}$, (iii) $u(t) = 3\,e^{e^t - 1}$.
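Part (b)(i) can be confirmed numerically: with $k(s) \equiv -1$ and $a = 2$, the solution $u(t) = 2e^{-t}$ satisfies the integral equation $u(t) = a + \int_0^t k(s)\,u(s)\,ds$. A trapezoidal-quadrature sketch (not part of the original solution):

```python
import math

def u(t):
    # proposed solution of u(t) = 2 + \int_0^t (-1) u(s) ds
    return 2.0 * math.exp(-t)

t, n = 1.5, 200000
h = t / n
# trapezoidal rule for \int_0^t (-u(s)) ds
integral = h * (sum(-u(i * h) for i in range(1, n)) - 0.5 * (u(0.0) + u(t)))
assert abs(2.0 + integral - u(t)) < 1e-8   # integral equation holds
```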
7.4.5. True, since the equation can be written as L[x ] + b = c, or L[x ] = c− b.
7.4.6.
(a) $u(x) = c_1 e^{2x} + c_2 e^{-2x}$, dim = 2;
(b) $u(x) = c_1 e^{4x} + c_2 e^{2x}$, dim = 2;
(c) $u(x) = c_1 + c_2 e^{3x} + c_3 e^{-3x}$, dim = 3;
(d) $u(x) = c_1 e^{-3x} + c_2 e^{-2x} + c_3 e^{-x} + c_4 e^{2x}$, dim = 4.
7.4.7.(a) If y ∈ C2[a, b ], then y′′ ∈ C0[a, b ] and so L[y ] = y′′ + y ∈ C0[a, b ]. Further,
L[cy + dz ] = (cy + dz)′′ + (cy + dz) = c(y′′ + y) + d(z′′ + z) = cL[y ] + dL[z ].(b) kerL is the span of the basic solutions cosx, sinx.
7.4.8. (a) If y ∈ C2[a, b ], then y′ ∈ C1[a, b ], y′′ ∈ C0[a, b ] and so L[y ] = 3y′′ − 2y′ − 5y ∈C0[a, b ]. Further, L[cy + dz ] = 3(cy + dz)′′−2(cy + dz)−5(cy + dz) = c(3y′′ − 2y′ − 5y)+
d(3z′′ − 2z′ − 5z) = cL[y ]+dL[z ]. (b) kerL is the span of the basic solutions e−x, e5x/3.
7.4.9.
(a) $p(D) = D^3 + 5D^2 + 3D - 9$.
(b) $e^x$, $e^{-3x}$, $x e^{-3x}$. The general solution is $y(x) = c_1 e^x + c_2 e^{-3x} + c_3 x e^{-3x}$.
7.4.10. (a) Minimal order 2: $u'' + u' - 6u = 0$. (b) Minimal order 2: $u'' + u' = 0$. (c) Minimal order 2: $u'' - 2u' + u = 0$. (d) Minimal order 3: $u''' - 6u'' + 11u' - 6u = 0$.
7.4.11. (a) $u = c_1 x + c_2 x^5$, (b) $u = c_1 x^2 + c_2 \dfrac{1}{\sqrt{|x|}}$, (c) $u = c_1 |x|^{(1 + \sqrt 5)/2} + c_2 |x|^{(1 - \sqrt 5)/2}$, (d) $u = c_1 |x|^{\sqrt 3} + c_2 |x|^{-\sqrt 3}$, (e) $u = c_1 x^3 + c_2 x^{-1/3}$, (f) $v = c_1 + \dfrac{c_2}{x}$.
7.4.12. $u = c_1 x + c_2 |x|^{\sqrt 3} + c_3 |x|^{-\sqrt 3}$. There is a three-dimensional solution space for $x > 0$; only those in the two-dimensional subspace spanned by $x$, $|x|^{\sqrt 3}$ are continuously differentiable at $x = 0$.
7.4.13.
(i) Using the chain rule, $\dfrac{dv}{dt} = e^t \dfrac{du}{dx} = x \dfrac{du}{dx}$, $\dfrac{d^2 v}{dt^2} = e^{2t} \dfrac{d^2 u}{dx^2} + e^t \dfrac{du}{dx} = x^2 \dfrac{d^2 u}{dx^2} + x \dfrac{du}{dx}$, and so $v(t)$ solves $a\,\dfrac{d^2 v}{dt^2} + (b - a)\,\dfrac{dv}{dt} + c\,v = 0$.
(ii) In all cases, $u(x) = v(\log x)$ gives the solutions in Exercise 7.4.11.
(a) $v'' + 4v' - 5v = 0$, with solution $v(t) = c_1 e^t + c_2 e^{-5t}$.
(b) $2v'' - 3v' - 2v = 0$, with solution $v(t) = c_1 e^{2t} + c_2 e^{t/2}$.
(c) $v'' - v' - v = 0$, with solution $v(t) = c_1 e^{(1 + \sqrt 5)\,t/2} + c_2 e^{(1 - \sqrt 5)\,t/2}$.
(d) $v'' - 3v = 0$, with solution $v(t) = c_1 e^{\sqrt 3\,t} + c_2 e^{-\sqrt 3\,t}$.
(e) $3v'' - 8v' - 3v = 0$, with solution $v(t) = c_1 e^{3t} + c_2 e^{-t/3}$.
(f) $v'' + v' = 0$, with solution $v(t) = c_1 + c_2 e^{-t}$.
♦ 7.4.14.
(a) $v(t) = c_1 e^{rt} + c_2 t e^{rt}$, so $u(x) = c_1 |x|^r + c_2 |x|^r \log|x|$.
(b) (i) $u(x) = c_1 x + c_2 x \log|x|$, (ii) $u(x) = c_1 + c_2 \log|x|$.
7.4.15. $v'' - 4v = 0$, so $u(x) = c_1 \dfrac{e^{2x}}{x} + c_2 \dfrac{e^{-2x}}{x}$. The solutions with $c_1 + c_2 = 0$ are continuously differentiable at $x = 0$, but only the zero solution is twice continuously differentiable.
7.4.16. True if S is a connected interval. If S is disconnected, then D[u ] = 0 implies u is con-stant on each connected component. Thus, the dimension of kerD equals the number ofconnected components of S.
7.4.17. For $u = \log(x^2 + y^2)$, we compute $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{2y^2 - 2x^2}{(x^2 + y^2)^2} = -\dfrac{\partial^2 u}{\partial y^2}$. Similarly, when $v = \dfrac{x}{x^2 + y^2}$, then $\dfrac{\partial^2 v}{\partial x^2} = \dfrac{2x^3 - 6x y^2}{(x^2 + y^2)^3} = -\dfrac{\partial^2 v}{\partial y^2}$. Or simply notice that $v = \dfrac12 \dfrac{\partial u}{\partial x}$, and so if $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$, then $\dfrac{\partial^2 v}{\partial x^2} + \dfrac{\partial^2 v}{\partial y^2} = \dfrac12 \dfrac{\partial}{\partial x} \left( \dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} \right) = 0$.
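Both harmonicity claims can be spot-checked with a centered five-point finite-difference Laplacian (a numerical sketch, not part of the solution; tolerances account for discretization and rounding error):

```python
import math

def laplacian(f, x, y, h=1e-4):
    # centered five-point approximation of f_xx + f_yy
    return (f(x + h, y) + f(x - h, y)
            + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / h**2

u = lambda x, y: math.log(x * x + y * y)
v = lambda x, y: x / (x * x + y * y)
for f in (u, v):
    # away from the singular point (0,0), both functions are harmonic
    assert abs(laplacian(f, 1.3, -0.7)) < 1e-5
```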
7.4.18. u = c1 + c2 log r. The solutions form a two-dimensional vector space.
7.4.19. $\log\left( c\,(x - a)^2 + d\,(y - b)^2 \right)$. Not a vector space!
♥ 7.4.20.
(a) $\Delta[e^x \cos y] = \dfrac{\partial^2}{\partial x^2}\, e^x \cos y + \dfrac{\partial^2}{\partial y^2}\, e^x \cos y = e^x \cos y - e^x \cos y = 0$.
(b) $p_2(x, y) = 1 + x + \frac12 x^2 - \frac12 y^2$ satisfies $\Delta p_2 = 0$.
(c) Same for $p_3(x, y) = 1 + x + \frac12 x^2 - \frac12 y^2 + \frac16 x^3 - \frac12 x y^2$.
(d) If $u(x, y)$ is harmonic, then any of its Taylor polynomials are also harmonic. To prove this, we write $u(x, y) = p_n(x, y) + r_n(x, y)$, where $p_n(x, y)$ is the Taylor polynomial of degree $n$ and $r_n(x, y)$ is the remainder. Then $\Delta u(x, y) = \Delta p_n(x, y) + \Delta r_n(x, y)$, where $\Delta p_n(x, y)$ is a polynomial of degree $n - 2$, and hence the Taylor polynomial of degree $n - 2$ for $\Delta u$, while $\Delta r_n(x, y)$ is the remainder. If $\Delta u = 0$, then its Taylor polynomial $\Delta p_n = 0$ also, and hence $p_n$ is a harmonic polynomial.
(e) The Taylor polynomial of degree 4 is $p_4(x, y) = -2x - x^2 + y^2 - \frac23 x^3 + 2 x y^2 - \frac12 x^4 + 3 x^2 y^2 - \frac12 y^4$, which is harmonic: $\Delta p_4 = 0$.
7.4.21.(a) Basis: 1, x, y, z, x2 − y2, x2 − z2, xy, xz, y z; dimension = 9.
(b) Basis: x3−3xy2, x3−3xz2, y3−3x2y, y3−3y z2, z3−3x2z, z3−3y2z, xy z; dimension = 7.
7.4.22. u = c1 + c2/r. The solutions form a two-dimensional vector space.
7.4.23.
(a) If $x \in \ker M$, then $L \circ M[x] = L[M[x]] = L[0] = 0$, and so $x \in \ker(L \circ M)$.
(b) For example, if $L = O$ but $M \ne O$ is invertible, then $\ker(L \circ M) = \mathbb{R}^n \ne \{0\} = \ker M$. Other examples: $L = M = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, and $L = M = D$, the derivative operator.
7.4.24. (a) Not in the range. (b) $x = (0, 1, 0)^T + z\,\left( -\frac75, -\frac65, 1 \right)^T$. (c) $x = \left( \frac32, -\frac34, 0 \right)^T$; kernel element is $0$. (d) $x = (-2, 0, 2, 0)^T + \left[\, y\,(3, 1, 0, 0)^T + w\,(2, 0, -3, 1)^T \,\right]$. (e) Not in the range. (f) $x = (-2, 1, 0, 0)^T + \left[\, z\,(1, -1, 1, 0)^T + w\,(0, 1, 0, 1)^T \,\right]$.
7.4.25. (a) $x = 1$, $y = -3$, unique; (b) $x = -\frac17 + \frac37 z$, $y = \frac47 + \frac27 z$, not unique; (c) no solution; (d) $u = 2$, $v = -1$, $w = 0$, unique; (e) $x = 2 + 4w$, $y = -2w$, $z = -1 - 6w$, not unique.
7.4.26.
(a) $u(x) = \frac{11}{16} - \frac14 x + c\,e^{4x}$,
(b) $u(x) = \frac16 e^x \sin x + c_1 e^{2x/5} \cos\frac45 x + c_2 e^{2x/5} \sin\frac45 x$,
(c) $u(x) = \frac13 x e^{3x} - \frac19 e^{3x} + c_1 + c_2 e^{3x}$.
7.4.27.
(a) $u(x) = \frac14 e^x - \frac14 e^{4 - 3x}$,
(b) $u(x) = \frac14 - \frac14 \cos 2x$,
(c) $u(x) = \frac49 e^{2x} - \frac12 e^x + \frac1{18} e^{-x} - \frac13 x e^{-x}$,
(d) $u(x) = -\frac1{10} \cos x + \frac15 \sin x + \frac{11}{10} e^{-x} \cos 2x + \frac9{10} e^{-x} \sin 2x$,
(e) $u(x) = -x - 1 + \frac12 e^x + \frac12 \cos x + \frac32 \sin x$.
7.4.28. (a) Unique solution: $u(x) = x - \pi\, \dfrac{\sin \sqrt 2\,x}{\sin \sqrt 2\,\pi}$; (b) no solution; (c) unique solution: $u(x) = x + (x - 1)\,e^x$; (d) infinitely many solutions: $u(x) = \frac12 + c\,e^{-x} \sin x$; (e) unique solution: $u(x) = 2x + 3 - \dfrac{3e^2 - 5}{e^2 - e}\, e^x + \dfrac{3e - 5}{e^2 - e}\, e^{2x}$; (f) no solution; (g) unique solution: $u(x) = \dfrac{36}{31\,x^2} - \dfrac{5\,x^2}{31}$; (h) infinitely many solutions: $u(x) = c\,(x - x^2)$.
7.4.29. (a) $u(x) = \frac12 x \log x + c_1 x + \dfrac{c_2}{x}$, (b) $u(x) = \frac12 \log x + \frac34 + c_1 x + c_2 x^2$, (c) $u(x) = 1 - \frac38 x + c_1 x^5 + \dfrac{c_2}{x}$.
7.4.30. (a) If $b \in \operatorname{rng}(L \circ M)$, then $b = L \circ M[x]$ for some $x$, and so $b = L[M[x]] = L[y]$, with $y = M[x]$, belongs to $\operatorname{rng} L$. (b) If $M = O$, but $L \ne O$, then $\operatorname{rng}(L \circ M) = \{0\} \ne \operatorname{rng} L$.
♦ 7.4.31.(a) First, Y ⊂ rngL since every y ∈ Y can be written as y = L[w ] for some w ∈W ⊂ U ,
and so y ∈ rngL. If y1 = L[w1 ] and y2 = L[w2 ] are elements of Y , then so is cy1 +dy2 = L[cw1 + dw2 ] for any scalars c, d since cw1 + dw2 ∈ W , proving that Y is asubspace.
(b) Suppose w1, . . . ,wk form a basis for W , so dimW = k. Let y = L[w ] ∈ Y for w ∈ W .We can write w = c1w1 + · · · + ckwk, and so, by linearity, y = c1L[w1 ] + · · · +ckL[wk ]. Therefore, the k vectors y1 = L[w1 ], . . . ,yk = L[wk ] span Y , and hence, byProposition 2.33, dimY ≤ k.
♦ 7.4.32. If z ∈ kerL then L[z ] = 0 ∈ kerL, which proves invariance. If y = L[x ] ∈ rngL thenL[y ] ∈ rngL, which proves invariance.
7.4.33. (a) $\{0\}$, the $x$ axis, the $y$ axis, $\mathbb{R}^2$; (b) $\{0\}$, the $x$ axis, $\mathbb{R}^2$; (c) If $\theta \ne 0, \pi$, then the only invariant subspaces are $\{0\}$ and $\mathbb{R}^2$. On the other hand, $R_0 = I$, $R_\pi = -I$, and so in these cases every subspace is invariant.
♦ 7.4.34.
(a) If $L$ were invertible, then the solution to $L[x] = b$ would be unique, namely $x = L^{-1}[b]$. But according to Theorem 7.38, we can add in any element of the kernel to get another solution, which would violate uniqueness.
(b) If $b \notin \operatorname{rng} L$, then we cannot solve $L[x] = b$, and so the inverse cannot exist.
(c) On a finite-dimensional vector space, every linear function is equivalent to multiplication by a square matrix. If the kernel is trivial, then the matrix is nonsingular, and hence invertible. An example: the integral operator $I[f(x)] = \int_0^x f(y)\,dy$ on $V = C^0$ has trivial kernel, but is not invertible because any function $g$ with $g(0) \ne 0$ does not lie in the range of $I$, and hence $\operatorname{rng} I \ne V$.
7.4.35.
(a) $u(x) = \frac12 + \frac25 \cos x + \frac15 \sin x + c\,e^{-2x}$,
(b) $u(x) = -\frac19 x - \frac1{10} \sin x + c_1 e^{3x} + c_2 e^{-3x}$,
(c) $u(x) = \frac1{10} + \frac18 e^x \cos x + c_1 e^x \cos\frac13 x + c_2 e^x \sin\frac13 x$,
(d) $u(x) = \frac16 x e^x - \frac1{18} e^x + \frac14 e^{-x} + c_1 e^x + c_2 e^{-2x}$,
(e) $u(x) = \frac19 x + \frac1{54} e^{3x} + c_1 + c_2 \cos 3x + c_3 \sin 3x$.
7.4.36. (a) $u(x) = 5x + 5 - 7\,e^{x - 1}$, (b) $u(x) = c_1 (x + 1) + c_2 e^x$.
7.4.37. $u(x) = -7 \cos\sqrt x - 3 \sin\sqrt x$.
7.4.38. $u'' + x\,u = 2$, $u(0) = a$, $u(1) = b$, for any $a, b$.
7.4.39. (a) $u(x) = \frac19 x + \cos 3x + \frac1{27} \sin 3x$, (b) $u(x) = \frac12 (x^2 - 3x + 2)\,e^{4x}$, (c) $u(x) = 3 \cos 2x + \frac3{10} \sin 2x - \frac15 \sin 3x$, (d) $u(x) = 1 - \frac12 (x + 1)\,e^x + \frac12 (x^2 + 2x - 1)\,e^{x - 1}$.
7.4.40. $u(x, y) = \frac14 (x^2 + y^2) + \frac1{12} (x^4 + y^4)$.
♥ 7.4.41.
(a) If $u = v\,u_1$, then $u' = v' u_1 + v\,u_1'$, $u'' = v'' u_1 + 2 v' u_1' + v\,u_1''$, and so $0 = u'' + a u' + b u = u_1 v'' + (2 u_1' + a u_1)\,v' + (u_1'' + a u_1' + b u_1)\,v = u_1 w' + (2 u_1' + a u_1)\,w$, where $w = v'$, which is a first order ordinary differential equation for $w$.
(b) (i) $u(x) = c_1 e^x + c_2 x e^x$, (ii) $u(x) = c_1 (x - 1) + c_2 e^{-x}$, (iii) $u(x) = c_1 e^{-x^2} + c_2 x e^{-x^2}$, (iv) $u(x) = c_1 e^{x^2/2} + c_2 e^{x^2/2} \int e^{-x^2} dx$.
♦ 7.4.42. We use linearity to compute
$L[u^\star] = L[c_1 u_1^\star + \cdots + c_k u_k^\star] = c_1 L[u_1^\star] + \cdots + c_k L[u_k^\star] = c_1 f_1 + \cdots + c_k f_k$,
and hence $u^\star$ is a particular solution to the differential equation (7.66). The second part of the theorem then follows from Theorem 7.38.
7.4.43. An example: Let $A = \begin{pmatrix} 1 & 2 \\ \mathrm{i} & 2\,\mathrm{i} \end{pmatrix}$. Then $\ker A$ consists of all vectors $c \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ where $c = a + \mathrm{i}\,b$ is any complex number. Then its real part $a \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ and imaginary part $b \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ are also solutions to the homogeneous system.
7.4.44.
(a) $u(x) = c_1 \cos 2x + c_2 \sin 2x$,
(b) $u(x) = c_1 e^{-3x} \cos x + c_2 e^{-3x} \sin x$,
(c) $u(x) = c_1 e^x + c_2 e^{-x/2} \cos\frac32 x + c_3 e^{-x/2} \sin\frac32 x$,
(d) $u(x) = c_1 e^{x/\sqrt 2} \cos\frac{x}{\sqrt 2} + c_2 e^{-x/\sqrt 2} \cos\frac{x}{\sqrt 2} + c_3 e^{x/\sqrt 2} \sin\frac{x}{\sqrt 2} + c_4 e^{-x/\sqrt 2} \sin\frac{x}{\sqrt 2}$,
(e) $u(x) = c_1 \cos 2x + c_2 \sin 2x + c_3 \cos 3x + c_4 \sin 3x$,
(f) $u(x) = c_1 x \cos(\sqrt 2 \log|x|) + c_2 x \sin(\sqrt 2 \log|x|)$,
(g) $u(x) = c_1 x^2 + c_2 \cos(2 \log|x|) + c_3 \sin(2 \log|x|)$.
7.4.45.
(a) Minimal order 2: u′′ + 2u′ + 10u = 0;
(b) minimal order 4: u(iv) + 2u′′ + u = 0;
(c) minimal order 5: u(v) + 4u(iv) + 14u′′′ + 20u′′ + 25u′ = 0;
(d) minimal order 4: u(iv) + 5u′′ + 4u = 0.
(e) minimal order 6: u(vi) + 3u(iv) + 3u′′ + u = 0.
7.4.46.
(a) $u(x) = c\,e^{\mathrm{i} x} = c \cos x + \mathrm{i}\,c \sin x$.
(b) $u(x) = c_1 e^x + c_2 e^{(\mathrm{i} - 1)x} = \left( c_1 e^x + c_2 e^{-x} \cos x \right) + \mathrm{i}\,c_2 e^{-x} \sin x$,
(c) $u(x) = c_1 e^{(1 + \mathrm{i})x/\sqrt 2} + c_2 e^{-(1 + \mathrm{i})x/\sqrt 2} = \left[ c_1 e^{x/\sqrt 2} \cos\frac{x}{\sqrt 2} + c_2 e^{-x/\sqrt 2} \cos\frac{x}{\sqrt 2} \right] + \mathrm{i} \left[ c_1 e^{x/\sqrt 2} \sin\frac{x}{\sqrt 2} - c_2 e^{-x/\sqrt 2} \sin\frac{x}{\sqrt 2} \right]$.
7.4.47. (a) $x^4 - 6x^2 y^2 + y^4$, $4x^3 y - 4x y^3$. (b) The polynomial $u(x, y) = a x^4 + b x^3 y + c x^2 y^2 + d x y^3 + e y^4$ solves $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = (12a + 2c)\,x^2 + (6b + 6d)\,x y + (2c + 12e)\,y^2 = 0$ if and only if $12a + 2c = 6b + 6d = 2c + 12e = 0$. The general solution to this homogeneous linear system is $a = e$, $b = -d$, $c = -6e$, where $d, e$ are the free variables. Thus, $u(x, y) = e\,(x^4 - 6x^2 y^2 + y^4) + \frac14 d\,(4x^3 y - 4x y^3)$.
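Using the hand-computed second derivatives, the harmonicity of both quartic polynomials in part (a) is immediate to verify (a sketch, not part of the original solution):

```python
def lap_u(x, y):
    # u = x^4 - 6 x^2 y^2 + y^4:  u_xx = 12x^2 - 12y^2,  u_yy = -12x^2 + 12y^2
    return (12 * x**2 - 12 * y**2) + (-12 * x**2 + 12 * y**2)

def lap_v(x, y):
    # v = 4 x^3 y - 4 x y^3:  v_xx = 24 x y,  v_yy = -24 x y
    return 24 * x * y + (-24 * x * y)

# the Laplacian vanishes identically at arbitrary sample points
assert lap_u(2.0, -1.5) == 0.0 and lap_v(2.0, -1.5) == 0.0
```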
♥ 7.4.48. (a) $\dfrac{\partial u}{\partial t} = -k^2 e^{-k^2 t + \mathrm{i} k x} = \dfrac{\partial^2 u}{\partial x^2}$; (b) $e^{-k^2 t - \mathrm{i} k x}$; (c) $e^{-k^2 t} \cos kx$, $e^{-k^2 t} \sin kx$.
(d) Yes. When $k = a + \mathrm{i}\,b$ is complex, we obtain the real solutions $e^{(b^2 - a^2)t - bx} \cos(ax - 2abt)$, $e^{(b^2 - a^2)t - bx} \sin(ax - 2abt)$ from $e^{-k^2 t + \mathrm{i} k x}$, along with $e^{(b^2 - a^2)t + bx} \cos(ax + 2abt)$, $e^{(b^2 - a^2)t + bx} \sin(ax + 2abt)$ from $e^{-k^2 t - \mathrm{i} k x}$.
(e) All those in part (a), as well as those in part (b) for $|a| > |b|$ — which, when $b = 0$, include those in part (a).
7.4.49. $u(t, x) = e^{k^2 t + kx}$, $u(t, x) = e^{-k^2 t + kx}$, where $k$ is any real or complex number. When $k = a + \mathrm{i}\,b$ is complex, we obtain the four independent real solutions
$e^{(a^2 - b^2)t + ax} \cos(bx + 2abt)$, $e^{(a^2 - b^2)t + ax} \sin(bx + 2abt)$, $e^{(b^2 - a^2)t + ax} \cos(bx - 2abt)$, $e^{(b^2 - a^2)t + ax} \sin(bx - 2abt)$.
7.4.50. (a), (c), (e) are conjugated. Note: Case (e) is all of C3.
♦ 7.4.51. If $v = \operatorname{Re} u = \dfrac{u + \bar u}{2}$, then $\bar v = \overline{\left( \dfrac{u + \bar u}{2} \right)} = \dfrac{\bar u + u}{2} = v$. Similarly, if $w = \operatorname{Im} u = \dfrac{u - \bar u}{2\,\mathrm{i}}$, then $\bar w = \dfrac{\bar u - u}{-2\,\mathrm{i}} = \dfrac{u - \bar u}{2\,\mathrm{i}} = w$.
♦ 7.4.52. Let z1, . . . , zn be a complex basis of V . Then xj = Re zj ,yj = Im zj , j = 1, . . . , n, are
real vectors that span V since each zj = xj + iyj is a linear combination thereof, cf. Ex-
ercise 2.3.19. Thus, by Exercise 2.4.22, we can find a basis of V containing n of the vectorsx1, . . . ,xn,y1, . . . ,yn. Conversely, if v1, . . . ,vn is a real basis, and v = c1v1 + · · ·+ cnvn ∈V , then v = c1v1 + · · ·+ cnvn ∈ V also, so V is conjugated.
♦ 7.4.53. Every linear function from $\mathbb{C}^n$ to $\mathbb{C}^m$ has the form $L[u] = Au$ for some $m \times n$ complex matrix $A$. The reality condition $L[\bar u] = \overline{L[u]}$ implies $A \bar u = \overline{A u} = \bar A\, \bar u$ for all $u \in \mathbb{C}^n$. Therefore, $A = \bar A$ and so $A$ is a real matrix.
♦ 7.4.54.L[u ] = L[v ] + iL[w ] = f , and, since L is real, the real and imaginary parts of thisequation yield L[v ] = f , L[w ] = 0.
7.4.55. (a) $L[\bar u] = \overline{L[u]} = 0$. (b) $u = \left( \left( -\frac32 + \frac12 \mathrm{i} \right) y + \left( -\frac12 - \frac12 \mathrm{i} \right) z,\; y,\; z \right)^T$, where $y, z \in \mathbb{C}$ are the free variables, is the general solution to the first system, and so its complex conjugate $\bar u = \left( \left( -\frac32 - \frac12 \mathrm{i} \right) \bar y + \left( -\frac12 + \frac12 \mathrm{i} \right) \bar z,\; \bar y,\; \bar z \right)^T$, where $\bar y, \bar z \in \mathbb{C}$ are free variables, solves the conjugate system. (Note: Since $y, z$ are free, they could be renamed $y, z$ if desired.)
♦ 7.4.56. They are linearly independent if and only if u is not a complex scalar multiple of a realsolution. Indeed, if u = (a + i b)v where v is a real solution, then x = av,y = bv arelinearly dependent. Conversely, if y = 0, then u = x is already a real solution, while if
x = ay for a real†, then u = (1 + i a)y is a scalar multiple of the real solution y.
† $a$ can't be complex, as otherwise $x$ wouldn't be real.
7.5.1. (a) $\begin{pmatrix} 1 & -1 \\ 2 & 3 \end{pmatrix}$, (b) $\begin{pmatrix} 1 & -\frac32 \\ \frac43 & 3 \end{pmatrix}$, (c) $\begin{pmatrix} \frac{13}{7} & -\frac{10}{7} \\ \frac57 & \frac{15}{7} \end{pmatrix}$.
7.5.2. Domain (a), target (b): $\begin{pmatrix} 2 & -3 \\ 4 & 9 \end{pmatrix}$; domain (a), target (c): $\begin{pmatrix} 3 & -5 \\ 1 & 10 \end{pmatrix}$; domain (b), target (a): $\begin{pmatrix} \frac12 & -\frac12 \\ \frac23 & 1 \end{pmatrix}$; domain (b), target (c): $\begin{pmatrix} \frac32 & -\frac52 \\ \frac13 & \frac{10}{3} \end{pmatrix}$; domain (c), target (a): $\begin{pmatrix} \frac67 & -\frac17 \\ \frac57 & \frac57 \end{pmatrix}$; domain (c), target (b): $\begin{pmatrix} \frac{12}{7} & -\frac37 \\ \frac{10}{7} & \frac{15}{7} \end{pmatrix}$.
7.5.3. (a) $\begin{pmatrix} 1 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 2 \end{pmatrix}$, (b) $\begin{pmatrix} 1 & -2 & 0 \\ \frac12 & 0 & -\frac32 \\ 0 & \frac23 & 2 \end{pmatrix}$, (c) $\begin{pmatrix} 0 & \frac14 & \frac32 \\ 1 & -\frac32 & -4 \\ 0 & \frac{11}{4} & \frac92 \end{pmatrix}$.
7.5.4. Domain (a), target (b): $\begin{pmatrix} 1 & -2 & 0 \\ 1 & 0 & -3 \\ 0 & 2 & 6 \end{pmatrix}$; domain (a), target (c): $\begin{pmatrix} 1 & -1 & -1 \\ 2 & 0 & -2 \\ 1 & 4 & 5 \end{pmatrix}$; domain (b), target (a): $\begin{pmatrix} 1 & -1 & 0 \\ \frac12 & 0 & -\frac12 \\ 0 & \frac13 & \frac23 \end{pmatrix}$; domain (b), target (c): $\begin{pmatrix} 1 & -1 & -1 \\ 1 & 0 & -1 \\ \frac13 & \frac43 & \frac53 \end{pmatrix}$; domain (c), target (a): $\begin{pmatrix} \frac14 & -\frac12 & 1 \\ \frac12 & 0 & -2 \\ -\frac14 & \frac12 & 2 \end{pmatrix}$; domain (c), target (b): $\begin{pmatrix} \frac14 & -1 & 3 \\ \frac12 & 0 & -6 \\ -\frac14 & 1 & 6 \end{pmatrix}$.
7.5.5. Domain (a), target (a): $\begin{pmatrix} 1 & 0 & -1 \\ 3 & 2 & 1 \end{pmatrix}$; domain (a), target (b): $\begin{pmatrix} 1 & 0 & -3 \\ 3 & 4 & 3 \end{pmatrix}$; domain (a), target (c): $\begin{pmatrix} 2 & 0 & -2 \\ 8 & 8 & 4 \end{pmatrix}$; domain (b), target (a): $\begin{pmatrix} \frac12 & 0 & -\frac12 \\ 1 & \frac23 & \frac13 \end{pmatrix}$; domain (b), target (b): $\begin{pmatrix} \frac12 & 0 & -\frac32 \\ 1 & \frac43 & 1 \end{pmatrix}$; domain (b), target (c): $\begin{pmatrix} 1 & 0 & -1 \\ \frac83 & \frac83 & \frac43 \end{pmatrix}$; domain (c), target (a): $\begin{pmatrix} 1 & \frac27 & -\frac37 \\ 1 & \frac47 & \frac17 \end{pmatrix}$; domain (c), target (b): $\begin{pmatrix} 1 & \frac47 & -\frac97 \\ 1 & \frac87 & \frac37 \end{pmatrix}$; domain (c), target (c): $\begin{pmatrix} \frac{16}{7} & \frac87 & -\frac47 \\ \frac{18}{7} & \frac{16}{7} & \frac67 \end{pmatrix}$.
7.5.6. Using the monomial basis $1, x, x^2$, the operator $D$ has matrix representative
$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}$. The inner product is represented by the Hilbert matrix $K = \begin{pmatrix} 1 & \frac12 & \frac13 \\ \frac12 & \frac13 & \frac14 \\ \frac13 & \frac14 & \frac15 \end{pmatrix}$.
Thus, the adjoint is represented by $K^{-1} A^T K = \begin{pmatrix} -6 & 2 & 3 \\ 12 & -24 & -26 \\ 0 & 30 & 30 \end{pmatrix}$, so
$D^*[1] = -6 + 12x$, $D^*[x] = 2 - 24x + 30x^2$, $D^*[x^2] = 3 - 26x + 30x^2$.
One can check that
$\langle p, D[q] \rangle = \int_0^1 p(x)\,q'(x)\,dx = \int_0^1 D^*[p(x)]\,q(x)\,dx = \langle D^*[p], q \rangle$
for all quadratic polynomials by verifying it on the monomial basis elements.
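The adjoint matrix can be reproduced numerically from the formula $D^* = K^{-1} A^T K$ (a NumPy sketch, not part of the original solution):

```python
import numpy as np

# D on span{1, x, x^2} in the monomial basis; K is the 3x3 Hilbert matrix,
# i.e. the Gram matrix of the L^2[0,1] inner product on that basis.
A = np.array([[0., 1., 0.],
              [0., 0., 2.],
              [0., 0., 0.]])
K = np.array([[1.0 / (i + j + 1) for j in range(3)] for i in range(3)])
Dstar = np.linalg.inv(K) @ A.T @ K     # matrix representative of the adjoint D*
assert np.allclose(Dstar, [[-6., 2., 3.], [12., -24., -26.], [0., 30., 30.]])
```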
♦ 7.5.7. Suppose M,N :V → U both satisfy 〈u ,M [v ] 〉 = 〈〈L[u ] ,v 〉〉 = 〈u , N [v ] 〉 for all u ∈U,v ∈ V . Then 〈u , (M −N)[v ] 〉 = 0 for all u ∈ U , and so (M −N)[v ] = 0 for all v ∈ V ,which proves that M = N .
♦ 7.5.8.(a) For all u ∈ U,v ∈ V , we have 〈u , (L+M)∗[v ] 〉 = 〈〈 (L+M)[u ] ,v 〉〉 =〈〈L[u ] ,v 〉〉+ 〈〈M [u ] ,v 〉〉 = 〈u , L∗[v ] 〉+ 〈u ,M∗[v ] 〉 = 〈u , (L∗ +M∗)[v ] 〉. Since this
holds for all u ∈ U,v ∈ V , we conclude that (L+M)∗ = L∗ +M∗.(b) 〈u , (cL)∗[v ] 〉 = 〈〈 (cL)[u ] ,v 〉〉 = c 〈〈L[u ] ,v 〉〉 = c 〈u , L∗[v ] 〉 = 〈u , c L∗[v ] 〉.(c) 〈〈 (L∗)∗[u ] ,v 〉〉 = 〈u , L∗[v ] 〉 = 〈〈L[u ] ,v 〉〉.(d) (L−1)∗ L∗ = (L L−1)∗ = I ∗ = I , and L∗ (L−1)∗ = (L−1 L)∗ = I ∗ = I .
7.5.9. In all cases, $L = L^*$ if and only if its matrix representative $A$, with respect to the standard basis, is symmetric. (a) $A = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = A^T$, (b) $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = A^T$, (c) $A = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} = A^T$, (d) $A = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix} = A^T$.
♦ 7.5.10. According to (7.78), the adjoint A∗ = M−1ATM = A if and only if MA = ATM =
(MTAT )T = (MA)T since M = MT .
7.5.11. The inner product matrix is $M = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$, so $MA = \begin{pmatrix} 12 & 6 \\ 6 & 12 \end{pmatrix}$ is symmetric, and hence, by Exercise 7.5.10, $A$ is self-adjoint.
7.5.12. (a) $a_{12} = \frac12 a_{21}$, $a_{13} = \frac13 a_{31}$, $\frac12 a_{23} = \frac13 a_{32}$, (b) $\begin{pmatrix} 0 & 1 & 1 \\ 2 & 1 & 2 \\ 3 & 3 & 2 \end{pmatrix}$.
7.5.13.
(a) $2a_{12} - a_{22} = -a_{11} + 2a_{21} - a_{31}$, $2a_{13} - a_{23} = -a_{21} + 2a_{31}$, $-a_{13} + 2a_{23} - a_{33} = -a_{22} + 2a_{32}$,
(b) $\begin{pmatrix} 0 & 1 & 1 \\ 2 & 3 & -6 \\ 5 & 3 & -16 \end{pmatrix}$.
7.5.14. True. 〈 I [u ] ,v 〉 = 〈u ,v 〉 = 〈u , I [v ] 〉 for all u,v ∈ U , and so, according to (7.74),
I ∗ = I .
7.5.15. False. For example, $A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$ is not self-adjoint with respect to the inner product defined by $M = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$, since $MA = \begin{pmatrix} 4 & -1 \\ -2 & 2 \end{pmatrix}$ is not symmetric, and so fails the criterion of Exercise 7.5.10.
7.5.16.(a) (L+ L∗)∗ = L∗ + (L∗)∗ = L∗ + L.
(b) Since L L∗ = (L∗)∗ L∗, this follows from Theorem 7.60. (Or it can be proved directly.)
♦ 7.5.17.(a) Write the condition as 〈N [u ] ,u 〉 = 0 where N = K − M is also self-adjoint. Then,
for any u,v ∈ U , we have 0 = 〈N [u + v ] ,u + v 〉 = 〈N [u ] ,u 〉 + 〈N [u ] ,v 〉 +〈N [v ] ,u 〉 + 〈N [v ] ,v 〉 = 2 〈u , N [v ] 〉, where we used the self-adjointness of N tocombine 〈N [u ] ,v 〉 = 〈u , N [v ] 〉 = 〈N [v ] ,u 〉. Since 〈u , N [v ] 〉 = 0 for all u,v, weconclude that N = K −M = O.
(b) When we take U = Rn with the dot product, then K and M are represented by n × n
matrices, A,B, respectively, and the condition is (Au) · u = (Bu) · u for all u ∈ Rn,
which implies A = B provided A,B are symmetric matrices. In particular, if AT = −Ais any skew-symmetric matrix, then (Au) · u = 0 for all u.
7.5.18. (a) $\Rightarrow$ (b): Suppose $\langle L[u], L[v] \rangle = \langle u, v \rangle$ for all $u, v \in U$; then
$\| L[u] \| = \sqrt{\langle L[u], L[u] \rangle} = \sqrt{\langle u, u \rangle} = \| u \|$.
(b) $\Rightarrow$ (c): Suppose $\| L[u] \| = \| u \|$ for all $u \in U$. Then
$\langle L^* \circ L[u], u \rangle = \langle L[u], L[u] \rangle = \langle u, u \rangle$. Thus, by Exercise 7.5.17, $L^* \circ L = I$. Since $L$ is assumed to be invertible, this proves, cf. Exercise 7.1.59, that $L^* = L^{-1}$.
(c) $\Rightarrow$ (a): If $L^* \circ L = I$, then
$\langle L[u], L[v] \rangle = \langle u, L^* \circ L[v] \rangle = \langle u, v \rangle$ for all $u, v \in U$.
7.5.19.
(a) $\langle M_a[u], v \rangle = \int_a^b M_a[u(x)]\,v(x)\,dx = \int_a^b a(x)\,u(x)\,v(x)\,dx = \int_a^b u(x)\,M_a[v(x)]\,dx = \langle u, M_a[v] \rangle$, proving self-adjointness.
(b) Yes, by the same computation, $\langle\langle M_a[u], v \rangle\rangle = \int_a^b a(x)\,u(x)\,v(x)\,w(x)\,dx = \langle\langle u, M_a[v] \rangle\rangle$.
♥ 7.5.20.
(a) If A^T = −A, then (Au) · v = (Au)^T v = u^T A^T v = −u^T A v = −u · Av for all u, v ∈ R^n, and so A∗ = −A.
(b) When A^T M = −M A.
(c) 〈(L − L∗)[u], v〉 = 〈L[u], v〉 − 〈L∗[u], v〉 = 〈u, L∗[v]〉 − 〈u, L[v]〉 = 〈u, (L∗ − L)[v]〉. Thus, by the definition of adjoint, (L − L∗)∗ = L∗ − L = −(L − L∗).
(d) Write L = K + S, where K = (1/2)(L + L∗) is self-adjoint and S = (1/2)(L − L∗) is skew-adjoint.
♦ 7.5.21. Define L: U → V1 × V2 by L[u] = (L1[u], L2[u]). Using the induced inner product 〈〈〈(v1, v2), (w1, w2)〉〉〉 = 〈〈v1, w1〉〉1 + 〈〈v2, w2〉〉2 on the Cartesian product V1 × V2 given in Exercise 3.1.18, we find
〈u, L∗[v1, v2]〉 = 〈〈〈L[u], (v1, v2)〉〉〉 = 〈〈〈(L1[u], L2[u]), (v1, v2)〉〉〉 = 〈〈L1[u], v1〉〉1 + 〈〈L2[u], v2〉〉2 = 〈u, L∗1[v1]〉 + 〈u, L∗2[v2]〉 = 〈u, L∗1[v1] + L∗2[v2]〉,
and hence L∗[v1, v2] = L∗1[v1] + L∗2[v2]. As a result,
L∗L[u] = L∗1L1[u] + L∗2L2[u] = K1[u] + K2[u] = K[u].
7.5.22. Minimizer: ( 1/5, −1/5 )^T; minimum value: −1/5.
7.5.23. Minimizer: ( 14/13, 2/13, −3/13 )^T; minimum value: −31/26.
7.5.24. Minimizer: ( 2/3, 1/3 )^T; minimum value: −2.
7.5.25.
(a) Minimizer: ( 5/18, 1/18 )^T; minimum value: −5/6.
(b) Minimizer: ( 5/3, 4/3 )^T; minimum value: −5.
(c) Minimizer: ( 1/6, 1/12 )^T; minimum value: −1/2.
7.5.26.
(a) Minimizer: ( 7/13, 2/13 )^T; minimum value: −7/26.
(b) Minimizer: ( 11/39, 1/13 )^T; minimum value: −11/78.
(c) Minimizer: ( 12/13, 5/26 )^T; minimum value: −43/52.
(d) Minimizer: ( 19/39, 4/39 )^T; minimum value: −17/39.
7.5.27. (a) 1/3, (b) 6/11, (c) 3/5.
♦ 7.5.28. Suppose L: U → V is a linear map between inner product spaces with ker L ≠ {0} and adjoint map L∗: V → U. Let K = L∗L: U → U be the associated positive semi-definite operator. If f ∈ rng K, then any solution to the linear system K[u⋆] = f is a minimizer for the quadratic function p(u) = (1/2)‖L[u]‖² − 〈u, f〉. The minimum is not unique since if u⋆ is a minimizer, so is u = u⋆ + z for any z ∈ ker L.
Solutions — Chapter 8
8.1.1. (a) u(t) = −3e5 t, (b) u(t) = 3e2(t−1), (c) u(t) = e−3(t+1).
8.1.2. γ = log 2/100 ≈ .0069. After 10 years: 93.3033 grams; after 100 years: 50 grams; after 1000 years: .0977 grams.
8.1.3. Solve e^{−(log 2)t/5730} = .0624 for t = −5730 log .0624/log 2 = 22,933 years.
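The decay numbers in 8.1.2–8.1.3 are easy to confirm numerically. A minimal sketch (the initial mass of 100 grams in 8.1.2 is an assumption read off from the stated values):

```python
import math

# 8.1.2: u(t) = u(0) exp(-gamma t) with gamma = log 2 / 100,
# assuming u(0) = 100 grams so that u(100) = 50 grams.
gamma = math.log(2) / 100

def u(t):
    return 100 * math.exp(-gamma * t)

# 8.1.3: carbon dating -- solve exp(-(log 2) t / 5730) = .0624 for t.
t_age = -5730 * math.log(0.0624) / math.log(2)

print(round(gamma, 4), round(u(10), 4), round(u(1000), 4), round(t_age))
```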
8.1.4. By (8.6), u(t) = u(0) e^{−(log 2)t/t⋆} = u(0) (1/2)^{t/t⋆} = 2^{−n} u(0) when t = n t⋆. After every time period of duration t⋆, the amount of radioactive material is halved.
8.1.5. The solution is u(t) = u(0) e^{1.3t}. To double, we need e^{1.3t} = 2, so t = log 2/1.3 = .5332. To quadruple takes twice as long, t = 1.0664. To reach 2 million needs t = log 10^6/1.3 = 10.6273.
8.1.6. The solution is u(t) = u(0) e^{.27t}. For the given initial conditions, u(t) = 1,000,000 when t = log(1000000/5000)/.27 = 19.6234 years.
♦ 8.1.7.
(a) If u(t) ≡ u⋆ = −b/a, then du/dt = 0 = a u⋆ + b, hence it is a solution.
(b) v = u − u⋆ satisfies dv/dt = a v, so v(t) = c e^{at}, and u(t) = c e^{at} − b/a.
(c) The equilibrium solution is asymptotically stable if and only if a < 0, and is stable if a = 0.
8.1.8. (a) u(t) = 1/2 + (1/2) e^{2t}, (b) u(t) = −3, (c) u(t) = 2 − 3 e^{−3(t−2)}.
8.1.9. (a) du/dt = −(log 2/1000) u + 5 ≈ −.000693 u + 5. (b) Stabilizes at the equilibrium solution u⋆ = 5000/log 2 ≈ 7213 tons. (c) The solution is u(t) = (5000/log 2) [ 1 − exp( −(log 2/1000) t ) ], which equals 100 when t = −(1000/log 2) log( 1 − 100 log 2/5000 ) ≈ 20.14 years.
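The time reported in 8.1.9(c) follows directly from the displayed solution formula; a quick numerical check:

```python
import math

log2 = math.log(2)
# 8.1.9(c): time for u(t) = (5000/log 2)(1 - exp(-(log 2/1000) t))
# to reach 100 tons.
t = -(1000 / log2) * math.log(1 - 100 * log2 / 5000)
print(round(t, 2))
```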
♥ 8.1.10.
(a) The first term on the right hand side says that the rate of growth remains proportional to the population, while the second term reflects the fact that hunting decreases the population by a fixed amount. (This assumes hunting is done continually throughout the year, which is not what happens in real life.)
(b) The solution is u(t) = ( 5000 − 1000/.27 ) e^{.27t} + 1000/.27. Solving u(t) = 1,000,000 gives t = (1/.27) log[ (1000000 − 1000/.27)/(5000 − 1000/.27) ] = 24.6094 years.
(c) To avoid extinction, the equilibrium u⋆ = b/.27 must be less than the initial population, so b < 1350 deer.
♦ 8.1.11. (a) |u1(t)− u2(t) | = eat |u1(0)− u2(0) | → ∞ when a > 0, since u1(0) = u2(0) if andonly if the solutions are the same. (b) t = log(1000/.05)/.02 = 495.17.
8.1.12.
(a) u(t) = (1/3) e^{2t/7}.
(b) One unit: t = log[ 1/(1/3 − .3333) ]/(2/7) = 36.0813; 1000 units: t = log[ 1000/(1/3 − .3333) ]/(2/7) = 60.2585.
(c) One unit: t ≈ 30.2328 solves (1/3) e^{2t/7} − .3333 e^{.2857t} = 1; 1000 units: t ≈ 52.7548 solves (1/3) e^{2t/7} − .3333 e^{.2857t} = 1000.
Note: The solutions to these nonlinear equations are found by a numerical equation solver, e.g., the bisection method, or Newton's method, [10].
♦ 8.1.13. According to Exercise 3.6.24, du/dt = c a e^{at} = a u, and so u(t) is a valid solution. By Euler's formula (3.84), if Re a > 0, then u(t) → ∞ as t → ∞, and the origin is an unstable equilibrium. If Re a = 0, then u(t) remains bounded as t → ∞, and the origin is a stable equilibrium. If Re a < 0, then u(t) → 0 as t → ∞, and the origin is an asymptotically stable equilibrium.
8.2.1.
(a) Eigenvalues: 3, −1; eigenvectors: ( −1, 1 )^T, ( 1, 1 )^T.
(b) Eigenvalues: 1/2, 1/3; eigenvectors: ( 4, 3 )^T, ( 1, 1 )^T.
(c) Eigenvalue: 2; eigenvector: ( −1, 1 )^T.
(d) Eigenvalues: 1 + i√2, 1 − i√2; eigenvectors: ( −i√2, 1 )^T, ( i√2, 1 )^T.
(e) Eigenvalues: 4, 3, 1; eigenvectors: ( 1, −1, 1 )^T, ( −1, 0, 1 )^T, ( 1, 2, 1 )^T.
(f) Eigenvalues: 1, √6, −√6; eigenvectors: ( 2, 0, 1 )^T, ( (−1+√3)/√2, (2+√3)/√2, 1 )^T, ( (−1−√3)/√2, (2−√3)/√2, 1 )^T.
(g) Eigenvalues: 0, 1 + i, 1 − i; eigenvectors: ( 3, 1, 0 )^T, ( 3−2i, 3−i, 1 )^T, ( 3+2i, 3+i, 1 )^T.
(h) Eigenvalues: 2, 0; eigenvectors: ( 1, −1, 1 )^T, ( −1, −3, 1 )^T.
(i) −1 is a simple eigenvalue, with eigenvector ( 2, −1, 1 )^T; 2 is a double eigenvalue, with eigenvectors ( 1/3, 0, 1 )^T, ( −2/3, 1, 0 )^T.
(j) −1 is a double eigenvalue, with eigenvectors ( 1, −1, 0, 0 )^T, ( 0, 0, −3, 2 )^T; 7 is also a double eigenvalue, with eigenvectors ( 1, 1, 0, 0 )^T, ( 0, 0, 1, 2 )^T.
(k) Eigenvalues: 1, 2, 3, 4; eigenvectors: ( 0, 0, 0, 1 )^T, ( 0, 0, 1, 1 )^T, ( 0, 1, 1, 0 )^T, ( 1, 1, 0, 0 )^T.
8.2.2. (a) The eigenvalues are e^{±iθ} = cos θ ± i sin θ, with eigenvectors ( 1, ∓i )^T. They are real only for θ = 0 and π. (b) Because R_θ − a I has an inverse if and only if a is not an eigenvalue.
8.2.3. The eigenvalues are ±1 with eigenvectors ( sin θ,±1− cos θ )T .
8.2.4. (a) O, and (b) − I , are trivial examples.
8.2.5. (a) The characteristic equation is −λ³ + γλ² + βλ + α = 0. (b) For example, [ 0 1 0 ; 0 0 1 ; c b a ].
8.2.6. The eigenvalues are 0 and ±i√(a² + b² + c²). If a = b = c = 0 then A = O and all vectors are eigenvectors. Otherwise, the eigenvectors are
( a, b, c )^T,   ( b, −a, 0 )^T ∓ (i/√(a² + b² + c²)) ( −ac, −bc, a² + b² )^T.
8.2.7.
(a) Eigenvalues: i, −1+i; eigenvectors: ( 1, 0 )^T, ( −1, 1 )^T.
(b) Eigenvalues: ±√5; eigenvectors: ( i(2 ± √5), 1 )^T.
(c) Eigenvalues: −3, 2i; eigenvectors: ( −1, 1 )^T, ( 3/5 + (1/5)i, 1 )^T.
(d) −2 is a simple eigenvalue with eigenvector ( −1, −2, 1 )^T; i is a double eigenvalue with eigenvectors ( −1+i, 0, 1 )^T, ( 1+i, 1, 0 )^T.
8.2.8.
(a) Since Ov = 0 = 0v, we conclude that 0 is the only eigenvalue; all nonzero vectors v ≠ 0 are eigenvectors.
(b) Since Iv = v = 1v, we conclude that 1 is the only eigenvalue; all nonzero vectors v ≠ 0 are eigenvectors.
8.2.9. For n = 2, the eigenvalues are 0, 2, and the eigenvectors are ( −1, 1 )^T and ( 1, 1 )^T. For n = 3, the eigenvalues are 0, 0, 3, and the eigenvectors are ( −1, 0, 1 )^T, ( −1, 1, 0 )^T, and ( 1, 1, 1 )^T. In general, the eigenvalues are 0, with multiplicity n − 1, and n, which is simple. The eigenvectors corresponding to the eigenvalue 0 are all nonzero vectors of the form ( v1, v2, ..., vn )^T with v1 + ··· + vn = 0. The eigenvectors corresponding to the eigenvalue n are all nonzero vectors of the form ( v1, v2, ..., vn )^T with v1 = ··· = vn.
♦ 8.2.10.(a) If Av = λv, then A(cv) = cAv = cλv = λ(cv) and so cv satisfies the eigenvector
equation for the eigenvalue λ. Moreover, since v 6= 0, also cv 6= 0 for c 6= 0, and so cv isa bona fide eigenvector.
(b) If Av = λv, Aw = λw, then A(cv + dw) = cAv + dAw = cλv + dλw = λ(cv + dw).(c) Suppose Av = λv, Aw = µw. Then v and w must be linearly independent as other-
wise they would be scalar multiples of each other and hence have the same eigenvalue.Thus, A(cv + dw) = cAv + dAw = cλv + dµw = ν(cv + dw) if and only if cλ = cνand dµ = dν, which, when λ 6= µ, is only possible when either c = 0 or d = 0.
8.2.11. True — by the same computation as in Exercise 8.2.10(a), cv is an eigenvector for thesame (real) eigenvalue λ.
♦ 8.2.12. Write w = x + iy. Then, since λ is real, the real and imaginary parts of the eigenvectorequation Aw = λw are Ax = λx, Ay = λy, and hence x,y are real eigenvectors of A.Thus x = a1v1 + · · ·+ ak vk, y = b1v1 + · · ·+ bk vk for a1, . . . , ak, b1, . . . , bk ∈ R, and hencew = c1v1 + · · ·+ ck vk where cj = aj + i bj .
♦ 8.2.13.
(a) A is the n × n matrix with 1's on the superdiagonal, a single 1 in the bottom left corner, and all other entries 0:
A = [ 0 1 0 ··· 0 ; 0 0 1 ··· 0 ; ⋯ ; 0 0 ··· 0 1 ; 1 0 0 ··· 0 ].
(b) A^T A = I by direct computation, or, equivalently, note that the columns of A are the standard orthonormal basis vectors e_n, e_1, e_2, ..., e_{n−1}, written in a slightly different order.
(c) Since ω_k = ( 1, e^{2kπi/n}, e^{4kπi/n}, ..., e^{2(n−1)kπi/n} )^T,
S ω_k = ( e^{2kπi/n}, e^{4kπi/n}, ..., e^{2(n−1)kπi/n}, 1 )^T = e^{2kπi/n} ω_k,
so ω_k is an eigenvector with corresponding eigenvalue e^{2kπi/n} for each k = 0, ..., n − 1.
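The eigenvector relation in 8.2.13(c) can be verified numerically for a sample n; here S denotes the cyclic shift matrix of part (a):

```python
import numpy as np

n = 5
# Cyclic shift matrix: ones on the superdiagonal, one in the bottom-left corner.
S = np.zeros((n, n))
for i in range(n - 1):
    S[i, i + 1] = 1
S[n - 1, 0] = 1

# omega_k = (1, e^{2k pi i/n}, ..., e^{2(n-1)k pi i/n})^T with eigenvalue e^{2k pi i/n}:
omegas = [np.exp(2j * np.pi * k * np.arange(n) / n) for k in range(n)]
lams = [np.exp(2j * np.pi * k / n) for k in range(n)]
checks = [np.allclose(S @ w, lam * w) for w, lam in zip(omegas, lams)]
print(checks)
```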
8.2.14. (a) Eigenvalues: −3, 1, 5; eigenvectors: ( 2, −3, 1 )^T, ( −2/3, −1, 1 )^T, ( 2, 1, 1 )^T; (b) tr A = 3 = −3 + 1 + 5, (c) det A = −15 = (−3) · 1 · 5.
8.2.15.
(a) tr A = 2 = 3 + (−1); det A = −3 = 3 · (−1).
(b) tr A = 5/6 = 1/2 + 1/3; det A = 1/6 = (1/2) · (1/3).
(c) tr A = 4 = 2 + 2; det A = 4 = 2 · 2.
(d) tr A = 2 = (1 + i√2) + (1 − i√2); det A = 3 = (1 + i√2) · (1 − i√2).
(e) tr A = 8 = 4 + 3 + 1; det A = 12 = 4 · 3 · 1.
(f) tr A = 1 = 1 + √6 + (−√6); det A = −6 = 1 · √6 · (−√6).
(g) tr A = 2 = 0 + (1 + i) + (1 − i); det A = 0 = 0 · (1 + i) · (1 − i).
(h) tr A = 4 = 2 + 2 + 0; det A = 0 = 2 · 2 · 0.
(i) tr A = 3 = (−1) + 2 + 2; det A = −4 = (−1) · 2 · 2.
(j) tr A = 12 = (−1) + (−1) + 7 + 7; det A = 49 = (−1) · (−1) · 7 · 7.
(k) tr A = 10 = 1 + 2 + 3 + 4; det A = 24 = 1 · 2 · 3 · 4.
8.2.16.(a) a = a11 + a22 + a33 = tr A, b = a11 a22 − a12 a21 + a11 a33 − a13 a31 + a22 a33 − a23 a32,
c = a11 a22 a33 + a12 a23 a31 + a13 a21a32 − a11 a23 a32 − a12 a21 a33 − a13 a22 a31 = det A
(b) When the factored form of the characteristic polynomial is multiplied out, we obtain
− (λ−λ1)(λ−λ2)(λ−λ3) = −λ3 +(λ1 +λ2 +λ3)λ2− (λ1 λ2 +λ1 λ3 +λ2 λ3)λ+λ1 λ2 λ3,
giving the eigenvalue formulas for a, b, c.
8.2.17. If U is upper triangular, so is U − λI, and hence p(λ) = det(U − λI) is the product of the diagonal entries, so p(λ) = ∏ (uii − λ). Thus, the roots of the characteristic equation are u11, ..., unn — the diagonal entries of U.
♦ 8.2.18. Since Ja − λI is an upper triangular matrix with a − λ on the diagonal, its determinant is det(Ja − λI) = (a − λ)^n and hence its only eigenvalue is λ = a, of multiplicity n. (Or use Exercise 8.2.17.) Moreover, (Ja − aI)v = ( v2, v3, ..., vn, 0 )^T = 0 if and only if v = c e1.
♦ 8.2.19. Parts (a,b) are special cases of part (c):If Av = λv then Bv = (cA + d I )v = (c λ + d)v.
8.2.20. If Av = λv then A2v = λAv = λ2v, and hence v is also an eigenvector of A2 witheigenvalue λ2.
8.2.21. (a) False. For example, 0 is an eigenvalue of both [ 0 1 ; 0 0 ] and [ 0 0 ; 1 0 ], but the eigenvalues of A + B = [ 0 1 ; 1 0 ] are ±1. (b) True. If Av = λv and Bv = µv, then (A + B)v = (λ + µ)v, and so v is an eigenvector with eigenvalue λ + µ.
8.2.22. False in general, but true if the eigenvectors coincide: If Av = λv and Bv = µv, thenABv = (λµ)v, and so v is an eigenvector with eigenvalue λµ.
♦ 8.2.23. If ABv = λv, then BAw = λw, where w = Bv. Thus, as long as w ≠ 0, it is an eigenvector of BA with eigenvalue λ. However, if w = 0, then ABv = 0, and so the eigenvalue is λ = 0, which implies that AB is singular. But then so is BA, which also has 0 as an eigenvalue. Thus every eigenvalue of AB is an eigenvalue of BA. The converse follows by the same reasoning. Note: This does not imply that their null eigenspaces (kernels) have the same dimension; compare Exercise 1.8.18. In anticipation of Section 8.6, even though AB and BA have the same eigenvalues, they may have different Jordan canonical forms.
♦ 8.2.24. (a) Starting with Av = λv, multiply both sides by A−1 and divide by λ to obtain
A−1v = (1/λ)v. Therefore, v is an eigenvector of A−1 with eigenvalue 1/λ.(b) If 0 is an eigenvalue, then A is not invertible.
♦ 8.2.25. (a) If all |λj| ≤ 1, then so is their product: 1 ≥ |λ1 ··· λn| = |det A|, which is a contradiction. (b) False. A = [ 2 0 ; 0 1/3 ] has eigenvalues 2, 1/3 while det A = 2/3.
8.2.26. Recall that A is singular if and only if ker A ≠ {0}. Any v ∈ ker A satisfies Av = 0 = 0v. Thus ker A is nonzero if and only if A has a null eigenvector.
8.2.27. Let v,w be any two linearly independent vectors. Then Av = λv and Aw = µw forsome λ, µ. But v + w is an eigenvector if and only if A(v + w) = λv + µw = ν(v + w),which requires λ = µ = ν. Thus, Av = λv for every v, which implies A = λ I .
8.2.28. If λ is a simple real eigenvalue, then there are two real unit eigenvectors: u and −u.
For a complex eigenvalue, if u is a unit complex eigenvector, so is e i θu, and so there are
infinitely many complex unit eigenvectors. (The same holds for a real eigenvalue if we alsoallow complex eigenvectors.) If λ is a multiple real eigenvalue, with eigenspace of dimensiongreater than 1, then there are infinitely many unit real eigenvectors in the eigenspace.
8.2.29. All false. Simple 2 × 2 examples suffice to disprove them: Start with [ 0 −1 ; 1 0 ], which has eigenvalues i, −i; (a) [ 0 −1 ; 1 −2 ] has eigenvalue −1; (b) [ 1 0 ; 0 −1 ] has eigenvalues 1, −1; (c) [ 0 −4 ; 1 0 ] has eigenvalues 2i, −2i.
8.2.30. False. The eigenvalue equation Av = λv is not linear in the eigenvalue and eigenvector since A(v1 + v2) ≠ (λ1 + λ2)(v1 + v2) in general.
8.2.31.
(a) (i) Q = [ −1 0 ; 0 1 ]. Eigenvalues −1, 1; eigenvectors ( 1, 0 )^T, ( 0, 1 )^T.
(ii) Q = [ 7/25 −24/25 ; −24/25 −7/25 ]. Eigenvalues −1, 1; eigenvectors ( 3/5, 4/5 )^T, ( 4/5, −3/5 )^T.
(iii) Q = [ 1 0 0 ; 0 −1 0 ; 0 0 1 ]. Eigenvalue −1 has eigenvector ( 0, 1, 0 )^T; eigenvalue 1 has eigenvectors ( 1, 0, 0 )^T, ( 0, 0, 1 )^T.
(iv) Q = [ 0 0 1 ; 0 1 0 ; 1 0 0 ]. Eigenvalue −1 has eigenvector ( −1, 0, 1 )^T; eigenvalue 1 has eigenvectors ( 1, 0, 1 )^T, ( 0, 1, 0 )^T.
(b) u is an eigenvector with eigenvalue −1. All vectors orthogonal to u are eigenvectors with eigenvalue +1.
♦ 8.2.32.
(a) det(B − λI) = det(S⁻¹AS − λI) = det[ S⁻¹(A − λI)S ] = det S⁻¹ det(A − λI) det S = det(A − λI).
(b) The eigenvalues are the roots of the common characteristic equation.
(c) Not usually. If w is an eigenvector of B, then v = Sw is an eigenvector of A and conversely.
(d) Both have 2 as a double eigenvalue. Suppose [ 1 1 ; −1 3 ] = S⁻¹ [ 2 0 ; 0 2 ] S, or, equivalently, S [ 1 1 ; −1 3 ] = [ 2 0 ; 0 2 ] S for some S = [ x y ; z w ]. Then, equating entries, we must have x − y = 2x, x + 3y = 2y, z − w = 2z, z + 3w = 2w, which implies y = −x, w = −z, so det S = xw − yz = 0. Thus S is singular, hence not invertible, and so the two matrices cannot be similar.
8.2.33.
(a) p_{A⁻¹}(λ) = det(A⁻¹ − λI) = det[ λA⁻¹ ( (1/λ)I − A ) ] = ((−λ)^n/det A) p_A(1/λ).
Or, equivalently, if
p_A(λ) = (−1)^n λ^n + c_{n−1} λ^{n−1} + ··· + c1 λ + c0,
then, since c0 = det A ≠ 0,
p_{A⁻¹}(λ) = (−1)^n ( λ^n + (c1/c0) λ^{n−1} + ··· + (c_{n−1}/c0) λ ) + 1/c0 = ((−λ)^n/det A) p_A(1/λ).
(b) (i) A⁻¹ = [ −2 1 ; 3/2 −1/2 ]. Then p_A(λ) = λ² − 5λ − 2, while
p_{A⁻¹}(λ) = λ² + (5/2)λ − 1/2 = (λ²/(−2)) ( 1/λ² − 5/λ − 2 ) = ((−λ)²/det A) p_A(1/λ).
(ii) A⁻¹ = [ −3/5 −4/5 4/5 ; 6/5 3/5 −8/5 ; −4/5 −2/5 7/5 ]. Then p_A(λ) = −λ³ + 3λ² − 7λ + 5, while
p_{A⁻¹}(λ) = −λ³ + (7/5)λ² − (3/5)λ + 1/5 = ((−λ)³/5) ( −1/λ³ + 3/λ² − 7/λ + 5 ) = ((−λ)³/det A) p_A(1/λ).
♥ 8.2.34.
(a) If Av = λv then 0 = A^k v = λ^k v and hence λ^k = 0, so λ = 0.
(b) A = [ 0 1 ; 0 0 ] has A² = [ 0 0 ; 0 0 ];
A = [ 0 1 1 ; 0 0 1 ; 0 0 0 ] has A² = [ 0 0 1 ; 0 0 0 ; 0 0 0 ], A³ = O.
In general, A can be any upper triangular matrix with all zero entries on the diagonal, and all nonzero entries on the super-diagonal.
♥ 8.2.35.(a) det(AT − λ I ) = det(A − λ I )T = det(A − λ I ), and hence A and AT have the same
characteristic polynomial, which implies that they have the same eigenvalues.(b) No. See the examples.
(c) λv · w = (Av)T w = vT AT w = µv · w, so if µ 6= λ, v · w = 0 and the vectors areorthogonal.
(d) (i) The eigenvalues are 1, 2; the eigenvectors of A are v1 = ( −1, 1 )^T, v2 = ( −1/2, 1 )^T; the eigenvectors of A^T are w1 = ( 2, 1 )^T, w2 = ( 1, 1 )^T, and v1, w2 are orthogonal, as are v2, w1.
(ii) The eigenvalues are 1, −1, −2; the eigenvectors of A are v1 = ( 1, 1, 0 )^T, v2 = ( 1, 2, 1 )^T, v3 = ( 0, 1/2, 1 )^T; the eigenvectors of A^T are w1 = ( 3, −2, 1 )^T, w2 = ( 2, −2, 1 )^T, w3 = ( 1, −1, 1 )^T. Note that vi is orthogonal to wj whenever i ≠ j.
8.2.36.
(a) The characteristic equation of a 3 × 3 matrix is a real cubic polynomial, and hence has at least one real root. (b) [ 0 1 0 0 ; −1 0 0 0 ; 0 0 0 1 ; 0 0 −1 0 ] has eigenvalues ±i. (c) No, since the characteristic polynomial is degree 5 and hence has at least one real root.
8.2.37. (a) If Av = λv, then v = A⁴v = λ⁴v, and hence, since v ≠ 0, all its eigenvalues must satisfy λ⁴ = 1. (b) [ 1 0 0 0 ; 0 −1 0 0 ; 0 0 0 −1 ; 0 0 1 0 ].
8.2.38. If P v = λv then P 2v = λ2v. Since P v = P 2v, we find λv = λ2v. Since v 6= 0, itfollows that λ2 = λ, so the only eigenvalues are λ = 0, 1. All v ∈ rng P are eigenvectorswith eigenvalue 1 since if v = P u, then P v = P 2u = P u = v, whereas all w ∈ ker P arenull eigenvectors.
8.2.39. False. For example, [ 0 0 1 ; 1 0 0 ; 0 1 0 ] has eigenvalues 1, −1/2 ± (√3/2) i.
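The counterexample in 8.2.39 is quick to confirm numerically — the eigenvalues of this cyclic permutation matrix are the three cube roots of unity:

```python
import numpy as np

# Cyclic permutation matrix from 8.2.39.
A = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])
eigs = np.linalg.eigvals(A)
roots = [1, -0.5 + (np.sqrt(3) / 2) * 1j, -0.5 - (np.sqrt(3) / 2) * 1j]
match = all(any(np.isclose(e, r) for r in roots) for e in eigs)
print(match)
```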
8.2.40.
(a) According to Exercise 1.2.29, if z = ( 1, 1, ..., 1 )^T, then Az is the vector of row sums of A, and hence, by the assumption, Az = z. Thus, z is an eigenvector with eigenvalue 1.
(b) Yes, since the column sums of A are the row sums of AT , and Exercise 8.2.35 says that
A and AT have the same eigenvalues.
8.2.41.(a) If Qv = λv, then QT v = Q−1v = λ−1v and so λ−1 is an eigenvalue of QT . Further-
more, Exercise 8.2.35 says that a matrix and its transpose have the same eigenvalues.(b) If Qv = λv, then, by Exercise 5.3.16, ‖v ‖ = ‖Qv ‖ = |λ | ‖v ‖, and hence |λ | = 1.
Note that this proof also applies to complex eigenvalues/eigenvectors, with ‖·‖ denoting the Hermitian norm in C^n.
(c) Let λ = e^{iθ} be the eigenvalue. Then
e^{iθ} v^T v = (Qv)^T v = v^T Q^T v = v^T Q⁻¹ v = e^{−iθ} v^T v.
Thus, if e^{iθ} ≠ e^{−iθ}, which happens if and only if the eigenvalue is not real, then
0 = v^T v = ( ‖x‖² − ‖y‖² ) + 2i x · y,
and so the result follows from taking real and imaginary parts of this equation.
♦ 8.2.42.(a) According to Exercise 8.2.36, a 3 × 3 orthogonal matrix has at least one real eigenvalue,
which by Exercise 8.2.41 must be ±1. If the other two eigenvalues are complex conju-gate, µ± i ν, then the product of the eigenvalues is ±(µ2+ν2). Since this must equal thedeterminant of Q, which by assumption, is positive, we conclude that the real eigenvaluemust be +1. Otherwise, all the eigenvalues of Q are real, and they cannot all equal −1as otherwise its determinant would be negative.
(b) True. It must either have three real eigenvalues of ±1, of which at least one must be −1as otherwise its determinant would be +1, or a complex conjugate pair of eigenvaluesλ, λ, and its determinant is −1 = ±|λ |2, so its real eigenvalue must be −1 and its com-plex eigenvalues ± i .
♦ 8.2.43.
(a) The axis of the rotation is the eigenvector v corresponding to the eigenvalue +1. Since Qv = v, the rotation fixes the axis, and hence must rotate around it. Choose an orthonormal basis u1, u2, u3, where u1 is a unit eigenvector in the direction of the axis of rotation, while u2 + iu3 is a complex eigenvector for the eigenvalue e^{iθ}. In this basis, Q has matrix form [ 1 0 0 ; 0 cos θ −sin θ ; 0 sin θ cos θ ], where θ is the angle of rotation.
(b) The axis is the eigenvector ( 2, −5, 1 )^T for the eigenvalue 1. The complex eigenvalue is 7/13 + i (2√30)/13, and so the angle is θ = cos⁻¹(7/13) ≈ 1.00219.
8.2.44. In general, besides the trivial invariant subspaces 0 and R3, the axis of rotation and
its orthogonal complement plane are invariant. If the rotation is by 180°, then any line in the orthogonal complement plane, as well as any plane spanned by such a line and the axis of rotation, are also invariant. If R = I, then every subspace is invariant.
8.2.45.(a) (Q − I )T (Q − I ) = QT Q − Q − QT + I = 2 I − Q − QT = K and hence K is a Gram
matrix, which is positive semi-definite by Theorem 3.28.(b) The Gram matrix is positive definite if and only if ker(Q − I ) = 0, which means that
Q does not have an eigenvalue of 1.
♦ 8.2.46. If Q = I , then we have a translation. Otherwise, we decompose b = c + d, where
c ∈ rng (Q − I ) while d ∈ coker(Q − I ) = ker(QT − I ). Thus, c = (Q − I )a, while
QT d = d, and so d = Qd, so d belongs to the axis of the rotation represented by Q. Thus,referring to (7.41), F (x) = Q(x − a) + a + d represents either a rotation around the centerpoint a, when d = 0, or a screw around the line in the direction of the axis of Q passingthrough the point a, when d 6= 0.
♥ 8.2.47.
(a) M₂ = [ 0 1 ; 1 0 ]: eigenvalues 1, −1; eigenvectors ( 1, 1 )^T, ( −1, 1 )^T;
M₃ = [ 0 1 0 ; 1 0 1 ; 0 1 0 ]: eigenvalues −√2, 0, √2; eigenvectors ( 1, −√2, 1 )^T, ( −1, 0, 1 )^T, ( 1, √2, 1 )^T.
(b) The jth entry of the eigenvalue equation Mn vk = λk vk reads
sin( (j−1)kπ/(n+1) ) + sin( (j+1)kπ/(n+1) ) = 2 cos( kπ/(n+1) ) sin( jkπ/(n+1) ),
which follows from the trigonometric identity sin α + sin β = 2 cos((α − β)/2) sin((α + β)/2). These are all the eigenvalues because an n × n matrix has at most n distinct eigenvalues.
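The eigenvalue formula of 8.2.47 — λk = 2 cos(kπ/(n+1)) with eigenvector entries sin(jkπ/(n+1)) — checks out numerically for any sample size; Mn is the tridiagonal matrix with 1's on the sub- and superdiagonals:

```python
import numpy as np

n = 6
# M_n: zeros on the diagonal, ones on the sub- and superdiagonals.
M = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)

checks = []
for k in range(1, n + 1):
    lam = 2 * np.cos(k * np.pi / (n + 1))
    v = np.sin(np.arange(1, n + 1) * k * np.pi / (n + 1))
    checks.append(np.allclose(M @ v, lam * v))
print(checks)
```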
♦ 8.2.48. We have A = aI + bMn, so by Exercises 8.2.19 and 8.2.47 it has the same eigenvectors as Mn, while its corresponding eigenvalues are a + bλk = a + 2b cos( kπ/(n+1) ) for k = 1, ..., n.
♥ 8.2.49. For k = 1, ..., n,
λk = 2 cos(2kπ/n),  vk = ( cos(2kπ/n), cos(4kπ/n), cos(6kπ/n), ..., cos(2(n−1)kπ/n), 1 )^T.
8.2.50. Note first that if Av = λv, then D ( v ; 0 ) = ( Av ; 0 ) = λ ( v ; 0 ), and so ( v ; 0 ) is an eigenvector for D with eigenvalue λ. Similarly, each eigenvalue µ and eigenvector w of B gives an eigenvector ( 0 ; w ) of D. Finally, to check that D has no other eigenvalues, we compute D ( v ; w ) = ( Av ; Bw ) = ( λv ; λw ) = λ ( v ; w ), and hence, if v ≠ 0, then λ is an eigenvalue of A, while if w ≠ 0, then it must also be an eigenvalue for B.
♥ 8.2.51. (a) Follows by direct computation:
pA(A) = [ a²+bc  ab+bd ; ac+cd  bc+d² ] − (a + d) [ a b ; c d ] + (ad − bc) [ 1 0 ; 0 1 ] = [ 0 0 ; 0 0 ].
(b) By part (a), O = A⁻¹ pA(A) = A − (tr A) I + (det A) A⁻¹, and the formula follows upon solving for A⁻¹. (c) tr A = 4, det A = 7, and one checks A² − 4A + 7I = O.
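The 2 × 2 Cayley–Hamilton identity of 8.2.51(a) and the inverse formula of (b) can be spot-checked with any invertible sample matrix (the exercise's own matrix is in the text):

```python
import numpy as np

# Arbitrary invertible 2x2 sample matrix.
A = np.array([[2.0, 1.0],
              [3.0, 4.0]])
t, d = np.trace(A), np.linalg.det(A)

# (a) p_A(A) = A^2 - (tr A) A + (det A) I = O:
P = A @ A - t * A + d * np.eye(2)

# (b) A^{-1} = ((tr A) I - A) / det A:
Ainv = (t * np.eye(2) - A) / d
print(np.allclose(P, 0), np.allclose(Ainv, np.linalg.inv(A)))
```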
♥ 8.2.52.
(a) Bv = (A − βvb^T)v = Av − β(b · v)v = (λ − β)v, since b · v = 1.
(b) B(w + cv) = (A − βvb^T)(w + cv) = µw + ( c(λ − β) − β(b · w) )v = µ(w + cv) provided c = β(b · w)/(λ − β − µ).
(c) Set B = A − λ1 v1 b^T, where v1 is the first eigenvector of A and b is any vector such that b · v1 = 1. For example, we can set b = v1/‖v1‖². (Wielandt deflation, [10], chooses b = rj/(λ1 v1,j), where v1,j is any nonzero entry of v1 and rj is the corresponding row of A.)
(d) (i) The eigenvalues of A are 6, 2 and the eigenvectors ( 1, 1 )^T, ( −3, 1 )^T. The deflated matrix B = A − λ1 v1 v1^T/‖v1‖² = [ 0 0 ; −2 2 ] has eigenvalues 0, 2 and eigenvectors ( 1, 1 )^T, ( 0, 1 )^T.
(ii) The eigenvalues of A are 4, 3, 1 and the eigenvectors ( 1, −1, 1 )^T, ( −1, 0, 1 )^T, ( 1, 2, 1 )^T. The deflated matrix B = A − λ1 v1 v1^T/‖v1‖² = [ 5/3 1/3 −4/3 ; 1/3 2/3 1/3 ; −4/3 1/3 5/3 ] has eigenvalues 0, 3, 1 and eigenvectors ( 1, −1, 1 )^T, ( −1, 0, 1 )^T, ( 1, 2, 1 )^T.
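The deflation in 8.2.52(d)(i) can be replayed numerically. The matrix A itself is printed in the text; from the listed eigendata (eigenvalues 6, 2 with eigenvectors (1,1)^T, (−3,1)^T) it must be [ 3 3 ; 1 5 ], which is used below as a reconstruction:

```python
import numpy as np

# Reconstructed from the listed eigendata (an assumption; the exercise's
# matrix is printed in the text): eigenvalues 6, 2, eigenvectors (1,1), (-3,1).
A = np.array([[3.0, 3.0],
              [1.0, 5.0]])
lam1 = 6.0
v1 = np.array([1.0, 1.0])

# Deflation with b = v1 / ||v1||^2:
B = A - lam1 * np.outer(v1, v1) / (v1 @ v1)
print(B)                                    # expect [[0, 0], [-2, 2]]
print(np.sort(np.linalg.eigvals(B).real))   # expect [0, 2]
```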
8.3.1. (a) Complete; dim = 1 with basis ( 1, 1 )T . (b) Not complete; dim = 1 with basis
( 1, 0 )T . (c) Complete; dim = 1 with basis ( 0, 1, 0 )T . (d) Not an eigenvalue. (e) Complete;
dim = 2 with basis ( 1, 0, 0 )T , ( 0,−1, 1 )T . (f ) Complete; dim = 1 with basis ( i , 0, 1 )T .
(g) Not an eigenvalue. (h) Not complete; dim = 1 with basis ( 1, 0, 0, 0, 0 )T .
8.3.2.
(a) Eigenvalue: 2; eigenvector: ( 2, 1 )^T; not complete.
(b) Eigenvalues: 2, −2; eigenvectors: ( 2, 1 )^T, ( 1, 1 )^T; complete.
(c) Eigenvalues: 1 ± 2i; eigenvectors: ( 1 ± i, 2 )^T; complete.
(d) Eigenvalues: 0, 2i; eigenvectors: ( −i, 1 )^T, ( i, 1 )^T; complete.
(e) Eigenvalue 3 has eigenspace basis ( 1, 1, 0 )^T, ( 1, 0, 1 )^T; not complete.
(f) Eigenvalue 2 has eigenspace basis ( −1, 0, 1 )^T, ( 0, 1, 0 )^T; eigenvalue −2 has ( −2, −1, 1 )^T; complete.
(g) Eigenvalue 3 has eigenspace basis ( 0, 1, 1 )^T; eigenvalue −2 has ( −1, 1, 1 )^T; not complete.
(h) Eigenvalue 0 has ( 0, 0, 0, 1 )^T; eigenvalue −1 has ( 0, 0, 1, 1 )^T; eigenvalue 1 has ( 1, 3, 1, 0 )^T, ( 1, 1, 0, 1 )^T; complete.
(i) Eigenvalue 0 has eigenspace basis ( 1, 0, 1, 0 )^T, ( 2, −1, 0, 1 )^T; eigenvalue 2 has ( −1, 1, −5, 1 )^T; not complete.
8.3.3.
(a) Eigenvalues: −2, 4; the eigenvectors ( −1, 1 )^T, ( 1, 1 )^T form a basis for R².
(b) Eigenvalues: 1 − 3i, 1 + 3i; the eigenvectors ( i, 1 )^T, ( −i, 1 )^T are not real, so the dimension is 0.
(c) Eigenvalue: 1; there is only one eigenvector v1 = ( 1, 0 )^T, spanning a one-dimensional subspace of R².
(d) The eigenvalue 1 has eigenvector ( 1, 0, 2 )^T, while the eigenvalue −1 has eigenvectors ( 1, 1, 0 )^T, ( 0, −1, 0 )^T. The eigenvectors form a basis for R³.
(e) The eigenvalue 1 has eigenvector ( 1, 0, 0 )^T, while the eigenvalue −1 has eigenvector ( 0, 0, 1 )^T. The eigenvectors span a two-dimensional subspace of R³.
(f) The eigenvalues are −2, 0, 2. The eigenvectors are ( 0, −1, 1 )^T, ( 0, 1, 1 )^T, and ( 8, 5, 7 )^T, forming a basis for R³.
(g) The eigenvalues are i, −i, 1. The eigenvectors are ( −i, 0, 1 )^T, ( i, 0, 1 )^T and ( 0, 1, 0 )^T. The real eigenvectors span only a one-dimensional subspace of R³.
(h) The eigenvalues are −1, 1, −1−i, −1+i. The eigenvectors are ( 0, 1, 0, 0 )^T, ( 4, 3, 2, 6 )^T, ( −1, i, −i, 1 )^T, ( −1, −i, i, 1 )^T. The real eigenvectors span a two-dimensional subspace of R⁴.
8.3.4. Cases (a,b,d,f,g,h) have eigenvector bases of C^n.
8.3.5. Examples: (a) [ 1 1 0 ; 0 1 1 ; 0 0 1 ], (b) [ 1 0 0 ; 0 1 1 ; 0 0 1 ].
8.3.6.(a) True. The standard basis vectors are eigenvectors.
(b) False. The Jordan matrix [ 1 1 ; 0 1 ] is incomplete since e1 is the only eigenvector.
8.3.7. According to Exercise 8.2.19, every eigenvector of A is an eigenvector of cA + d I witheigenvalue cλ + d, and hence if A has a basis of eigenvectors, so does cA + d I .
8.3.8. (a) Every eigenvector of A is an eigenvector of A² with eigenvalue λ², and hence if A has a basis of eigenvectors, so does A². (b) A = [ 0 1 ; 0 0 ] with A² = O.
♦ 8.3.9. Suppose Av = λv. Write v = Σ_{i=1}^n ci vi. Then Av = Σ_{i=1}^n ci λi vi and hence, by linear independence, λi ci = λ ci. Thus, either λ = λi or ci = 0.
8.3.10. (a) If Av = λv, then, by induction, A^n v = λ^n v, and hence v is an eigenvector with eigenvalue λ^n. (b) Conversely, if A is complete and A^n has eigenvalue µ, then at least one of its complex nth roots λ = µ^{1/n} is an eigenvalue of A. Indeed, the eigenvector basis of A is an eigenvector basis of A^n, and hence, using Exercise 8.3.9, every eigenvalue of A^n is the nth power of an eigenvalue of A.
♦ 8.3.11. As in Exercise 8.2.32, if v is an eigenvector of A then S−1v is an eigenvector of B. More-over, if v1, . . . ,vn form a basis, so do S−1v1, . . . , S−1vn; see Exercise 2.4.21 for details.
8.3.12. According to Exercise 8.2.17, its only eigenvalue is λ, the common value of its diagonalentries, and so all eigenvectors belong to ker(U − λ I ). Thus U is complete if and only ifdim ker(U − λ I ) = n, which happens if and only if U − λ I = O.
8.3.13. Let V = ker(A− λ I ). If v ∈ V , then Av ∈ V since (A− λ I )Av = A(A− λ I )v = 0.
♦ 8.3.14.
(a) Let v = x + iy, w = v̄ = x − iy be the corresponding eigenvectors, so x = (1/2)v + (1/2)w, y = −(1/2)iv + (1/2)iw. Thus, if cx + dy = 0 where c, d are real scalars, then
( (1/2)c − (1/2)id )v + ( (1/2)c + (1/2)id )w = 0.
Since v, w are eigenvectors corresponding to distinct eigenvalues µ + iν ≠ µ − iν, Lemma 8.13 implies they are linearly independent, and hence (1/2)c − (1/2)id = (1/2)c + (1/2)id = 0, which implies c = d = 0.
(b) Same reasoning: xj = (1/2)vj + (1/2)wj, yj = −(1/2)ivj + (1/2)iwj, where vj, wj = v̄j are the complex eigenvectors corresponding to the eigenvalues µj + iνj, µj − iνj. Thus, if
0 = c1 x1 + d1 y1 + ··· + ck xk + dk yk
= ( (1/2)c1 − (1/2)id1 )v1 + ( (1/2)c1 + (1/2)id1 )w1 + ··· + ( (1/2)ck − (1/2)idk )vk + ( (1/2)ck + (1/2)idk )wk,
then, again by Lemma 8.13, the complex eigenvectors v1, ..., vk, w1, ..., wk are linearly independent, and so (1/2)c1 − (1/2)id1 = (1/2)c1 + (1/2)id1 = ··· = (1/2)ck − (1/2)idk = (1/2)ck + (1/2)idk = 0. This implies c1 = ··· = ck = d1 = ··· = dk = 0, proving linear independence of x1, ..., xk, y1, ..., yk.
8.3.15. In all cases, A = S ΛS−1.
(a) S = [ 3 3 ; 1 2 ], Λ = diag( 0, −3 ).
(b) S = [ 2 1 ; 1 1 ], Λ = diag( 3, 1 ).
(c) S = [ −3/5 + (1/5)i  −3/5 − (1/5)i ; 1  1 ], Λ = diag( −1+i, −1−i ).
(d) S = [ 1 1 −1/10 ; 0 1 −1/2 ; 0 0 1 ], Λ = diag( −2, 1, 3 ).
(e) S = [ 0 21 1 ; 1 −10 6 ; 0 7 3 ], Λ = diag( 0, 7, −1 ).
(f) S = [ −1/5 − (3/5)i  −1/5 + (3/5)i  −1 ; −1  −1  0 ; 1  1  1 ], Λ = diag( 2+3i, 2−3i, −2 ).
(g) S = [ 1 0 −1 ; 0 −1 0 ; 0 1 1 ], Λ = diag( 2, 2, −3 ).
(h) S = [ −4 3 1 0 ; −3 2 0 1 ; 0 6 0 0 ; 12 0 0 0 ], Λ = diag( −2, −1, 1, 2 ).
(i) S = [ 0 −1 0 1 ; −1 0 1 0 ; 0 1 0 1 ; 1 0 1 0 ], Λ = diag( −1, −1, 1, 1 ).
(j) S = [ −1 1 (3/2)i −(3/2)i ; 1 −3 −1/2−2i −1/2+2i ; 0 0 1+i 1−i ; 0 0 1 1 ], Λ = diag( 1, −1, i, −i ).
8.3.16. [ 1 1 ; 1 0 ] = S Λ S⁻¹, where
S = [ (1+√5)/2  (1−√5)/2 ; 1  1 ],  Λ = diag( (1+√5)/2, (1−√5)/2 ),  S⁻¹ = [ 1/√5  −(1−√5)/(2√5) ; −1/√5  (1+√5)/(2√5) ].
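The golden-ratio diagonalization in 8.3.16 can be verified directly:

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2   # eigenvalues of [[1, 1], [1, 0]]
psi = (1 - np.sqrt(5)) / 2
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])
S = np.array([[phi, psi],
              [1.0, 1.0]])
Lam = np.diag([phi, psi])
print(np.allclose(A, S @ Lam @ np.linalg.inv(S)))
```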
8.3.17. [ 0 −1 ; 1 0 ] = [ i −i ; 1 1 ] · diag( i, −i ) · [ −i/2 1/2 ; i/2 1/2 ]; a rotation does not stretch any real vectors, but somehow corresponds to two complex stretches.
8.3.18.
(a) [ 0 −1 0 ; 1 0 0 ; 0 0 1 ] = [ i −i 0 ; 1 1 0 ; 0 0 1 ] · diag( i, −i, 1 ) · [ −i/2 1/2 0 ; i/2 1/2 0 ; 0 0 1 ],
(b) [ 5/13 0 12/13 ; 0 1 0 ; −12/13 0 5/13 ] = [ −i i 0 ; 0 0 1 ; 1 1 0 ] · diag( (5+12i)/13, (5−12i)/13, 1 ) · [ i/2 0 1/2 ; −i/2 0 1/2 ; 0 1 0 ].
8.3.19.
(a) Yes: distinct real eigenvalues −3, 2.
(b) No: complex eigenvalues 1 ± i√6.
(c) No: complex eigenvalues 1, −1/2 ± (√5/2)i.
(d) No: incomplete eigenvalue 1 (and complete eigenvalue −2).
(e) Yes: distinct real eigenvalues 1, 2, 4.
(f) Yes: complete real eigenvalues 1, −1.
8.3.20. In all cases, A = S ΛS−1.
(a) S = [ 1 −1 ; 1 1 ], Λ = diag( 1+i, −1+i ).
(b) S = [ −2+i 0 ; 1 1 ], Λ = diag( 1−i, 2+i ).
(c) S = [ 1  −1/2 − (1/2)i ; 1  1 ], Λ = diag( 4, −1 ).
(d) S = [ 0 1 −1 ; 1 −1−i 3/5 + (1/5)i ; 0 1 1 ], Λ = diag( 1, 1−i, −1−i ).
8.3.21. Use the formula A = S Λ S⁻¹. For parts (e,f) you can choose any other eigenvalues and eigenvectors you want to fill in S and Λ.
(a) [ 7 4 ; −8 −5 ], (b) [ 6 6 −2 ; −2 −2 0 ; 6 6 −4 ], (c) [ 3 0 ; 0 3 ], (d) [ 1 −4/3 ; 6 −3 ], (e) example: [ 0 0 4 ; 0 1 0 ; 0 0 −2 ], (f) example: [ 3 12 0 ; −2 3 0 ; −2 −2 0 ].
8.3.22. (a) [ 11 −6 ; 18 −10 ], (b) [ −1 0 ; 0 2 ], (c) [ −4 −6 ; 3 5 ].
♦ 8.3.23. Let S1 be the eigenvector matrix for A and S2 the eigenvector matrix for B. Thus, by the hypothesis, S1⁻¹AS1 = Λ = S2⁻¹BS2, and hence B = S2S1⁻¹AS1S2⁻¹ = S⁻¹AS where S = S1S2⁻¹.
8.3.24. The hypothesis says B = P AP T = P AP−1 where P is the corresponding permutation
matrix, and we are using Exercise 1.6.14 to identify P T = P−1. Thus A and B are similarmatrices, and so, according to Exercise 8.2.32, have the same eigenvalues. If v is an eigen-vector fofr A, then the vector w = P v obtained by permuting its entries is an eigenvector
for B with the same eigenvalue, since Bw = P AP T P v = P Av = λP v = λw.
8.3.25. True. Let λj = ajj denote the jth diagonal entry of A, which is the same as the jth
eigenvalue. We will prove that the corresponding eigenvector is a linear combination ofe1, . . . , ej , which is equivalent to the eigenvector matrix S being upper triangular. We
use induction on the size n. Since A is upper triangular, it leaves the subspace V spannedby e1, . . . , en−1 invariant, and hence its restriction to the subspace is represented by an
(n−1)× (n−1) upper triangular matrix. Thus, by induction and completeness, A possessesn − 1 eigenvectors of the required form. The remaining eigenvector vn cannot belong to V(otherwise the eigenvectors would be linearly dependent) and hence must involve en.
8.3.26. The diagonal entries are all eigenvalues, and so are obtained from each other by permutation. If all eigenvalues are distinct, then there are n! different diagonal forms; otherwise, if it has distinct eigenvalues of multiplicities j_1, \dots, j_k, there are n!/(j_1! \cdots j_k!) distinct diagonal forms.
8.3.27. Let A = S \Lambda S^{-1}. Then A^2 = I if and only if \Lambda^2 = I, and so all its eigenvalues are \pm 1. Examples: A = \begin{pmatrix} 3 & -2 \\ 4 & -3 \end{pmatrix} with eigenvalues 1, -1 and eigenvectors \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}; or, even simpler, A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
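The criterion of 8.3.27 is easy to check numerically. A minimal NumPy sketch (not part of the original text) verifying that the example matrix squares to the identity and has eigenvalues \pm 1:

```python
import numpy as np

# Example matrix from 8.3.27: eigenvalues 1 and -1, so A^2 = I.
A = np.array([[3., -2.],
              [4., -3.]])

print(np.allclose(A @ A, np.eye(2)))                        # True: A^2 = I
print(np.allclose(np.sort(np.linalg.eigvals(A)), [-1, 1]))  # True: eigenvalues +-1
```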
♥ 8.3.28. (a) If A = S \Lambda S^{-1} and B = S D S^{-1}, where \Lambda, D are diagonal, then A B = S \Lambda D S^{-1} = S D \Lambda S^{-1} = B A, since diagonal matrices commute.
(b) According to Exercise 1.2.12(e), the only matrices that commute with an n \times n diagonal matrix with distinct entries are other diagonal matrices. Thus, if A B = B A, and A = S \Lambda S^{-1} where all entries of \Lambda are distinct, then D = S^{-1} B S commutes with \Lambda and hence is a diagonal matrix.
(c) No: the matrix \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} commutes with the identity matrix, but is not diagonalizable. See also Exercise 1.2.14.
8.4.1.
(a) Eigenvalues: 5, -10; eigenvectors: \frac{1}{\sqrt 5}\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \frac{1}{\sqrt 5}\begin{pmatrix} 1 \\ -2 \end{pmatrix}.
(b) Eigenvalues: 7, 3; eigenvectors: \frac{1}{\sqrt 2}\begin{pmatrix} -1 \\ 1 \end{pmatrix}, \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}.
(c) Eigenvalues: \frac{7+\sqrt{13}}{2}, \frac{7-\sqrt{13}}{2}; eigenvectors: \frac{2}{\sqrt{26-6\sqrt{13}}}\begin{pmatrix} \frac{3-\sqrt{13}}{2} \\ 1 \end{pmatrix}, \frac{2}{\sqrt{26+6\sqrt{13}}}\begin{pmatrix} \frac{3+\sqrt{13}}{2} \\ 1 \end{pmatrix}.
(d) Eigenvalues: 6, 1, -4; eigenvectors: \begin{pmatrix} \frac{4}{5\sqrt 2} \\ \frac{3}{5\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix}, \begin{pmatrix} -\frac35 \\ \frac45 \\ 0 \end{pmatrix}, \begin{pmatrix} -\frac{4}{5\sqrt 2} \\ -\frac{3}{5\sqrt 2} \\ \frac{1}{\sqrt 2} \end{pmatrix}.
(e) Eigenvalues: 12, 9, 2; eigenvectors: \begin{pmatrix} \frac{1}{\sqrt 6} \\ -\frac{1}{\sqrt 6} \\ \frac{2}{\sqrt 6} \end{pmatrix}, \begin{pmatrix} -\frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 3} \end{pmatrix}, \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \\ 0 \end{pmatrix}.
8.4.2. (a) Eigenvalues \frac52 \pm \frac12\sqrt{17}; positive definite. (b) Eigenvalues -3, 7; not positive definite. (c) Eigenvalues 0, 1, 3; positive semi-definite. (d) Eigenvalues 6, 3 \pm \sqrt 3; positive definite.
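The eigenvalue test for definiteness used in 8.4.2 can be sketched in NumPy; the test matrix below is a hypothetical example, not one of the exercise's matrices:

```python
import numpy as np

def definiteness(K, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    ev = np.linalg.eigvalsh(K)      # eigenvalues of a symmetric matrix, ascending
    if ev[0] > tol:
        return "positive definite"
    if ev[0] > -tol:
        return "positive semi-definite"
    return "indefinite or negative"

# Hypothetical symmetric test matrix (not from the exercise): eigenvalues 3, 1 > 0.
K = np.array([[2., 1.],
              [1., 2.]])
print(definiteness(K))              # positive definite
```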
8.4.3. Use the fact that K = -N is positive definite and so has all positive eigenvalues. The eigenvalues of N = -K are -\lambda_j, where \lambda_j are the eigenvalues of K. Alternatively, mimic the proof in the book for the positive definite case.
8.4.4. If all eigenvalues are distinct, there are 2^n different bases, governed by the choice of sign in each of the unit eigenvectors \pm u_k. If the eigenvalues are repeated, there are infinitely many, since any orthonormal basis of each eigenspace will contribute to an orthonormal eigenvector basis of the matrix.
8.4.5. (a) The characteristic equation p(\lambda) = \lambda^2 - (a+d)\lambda + (ad-bc) = 0 has real roots if and only if its discriminant is non-negative: 0 \le (a+d)^2 - 4(ad-bc) = (a-d)^2 + 4bc, which is the necessary and sufficient condition for real eigenvalues.
(b) If A is symmetric, then b = c and so the discriminant is (a-d)^2 + 4b^2 \ge 0.
(c) Example: \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}.
♥ 8.4.6. (a) If A v = \lambda v and v \ne 0 is real, then
\lambda \|v\|^2 = (A v) \cdot v = (A v)^T v = v^T A^T v = -v^T A v = -v \cdot (A v) = -\lambda \|v\|^2,
and hence \lambda = 0.
(b) Using the Hermitian dot product,
\lambda \|v\|^2 = (A v) \cdot v = v^T A^T \bar v = -v^T A \bar v = -v \cdot (A v) = -\bar\lambda \|v\|^2,
and hence \lambda = -\bar\lambda, so \lambda is purely imaginary.
(c) Since \det A = 0, cf. Exercise 1.9.10, at least one of the eigenvalues of A must be 0.
(d) The characteristic polynomial of A = \begin{pmatrix} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0 \end{pmatrix} is -\lambda^3 - \lambda(a^2 + b^2 + c^2), and hence the eigenvalues are 0, \pm i\sqrt{a^2 + b^2 + c^2}, and so are all zero if and only if A = O.
(e) The eigenvalues are: (i) \pm 2i, (ii) 0, \pm 5i, (iii) 0, \pm\sqrt 3\, i, (iv) \pm 2i, \pm 3i.
♥ 8.4.7. (a) Let A v = \lambda v. Using the Hermitian dot product,
\lambda \|v\|^2 = (A v) \cdot v = v^T A^T \bar v = v^T \bar A \bar v = v \cdot (A v) = \bar\lambda \|v\|^2,
and hence \lambda = \bar\lambda, which implies that the eigenvalue \lambda is real.
(b) Let A v = \lambda v, A w = \mu w. Then
\lambda\, v \cdot w = (A v) \cdot w = v^T A^T \bar w = v^T \bar A \bar w = v \cdot (A w) = \bar\mu\, v \cdot w = \mu\, v \cdot w,
since \mu is real. Thus, if \lambda \ne \mu, then v \cdot w = 0.
(c) (i) Eigenvalues \pm\sqrt 5; eigenvectors: \begin{pmatrix} (2-\sqrt 5)\, i \\ 1 \end{pmatrix}, \begin{pmatrix} (2+\sqrt 5)\, i \\ 1 \end{pmatrix}.
(ii) Eigenvalues 4, -2; eigenvectors: \begin{pmatrix} 2-i \\ 1 \end{pmatrix}, \begin{pmatrix} -2+i \\ 5 \end{pmatrix}.
(iii) Eigenvalues 0, \pm\sqrt 2; eigenvectors: \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ i\sqrt 2 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ -i\sqrt 2 \\ 1 \end{pmatrix}.
♥ 8.4.8. (a) Rewrite (8.31) as M^{-1} K v = \lambda v, and so v is an eigenvector for M^{-1}K with eigenvalue \lambda. The eigenvectors are the same.
(b) M^{-1}K is not necessarily symmetric, and so we can't use Theorem 8.20 directly. If v is a generalized eigenvector, then, since K, M are real matrices, K v = \lambda M v. Therefore,
\lambda \|v\|^2 = \lambda\, \bar v^T M v = \bar v^T (K v) = (K^T \bar v)^T v = (\overline{\lambda M v})^T v = \bar\lambda\, \bar v^T M v = \bar\lambda \|v\|^2,
and hence \lambda is real.
(c) If K v = \lambda M v, K w = \mu M w, with \lambda, \mu and v, w real, then
\lambda \langle v, w \rangle = (\lambda M v)^T w = (K v)^T w = v^T (K w) = \mu\, v^T M w = \mu \langle v, w \rangle,
and so if \lambda \ne \mu then \langle v, w \rangle = 0, proving orthogonality.
(d) If K > 0, then \lambda \langle v, v \rangle = v^T (\lambda M v) = v^T K v > 0, and so, since M is positive definite, \lambda > 0.
(e) Part (c) proves that the eigenvectors are orthogonal with respect to the inner product induced by M, and so the result follows immediately from Theorem 5.5.
8.4.9.
(a) Eigenvalues: \frac53, \frac12; eigenvectors: \begin{pmatrix} -3 \\ 1 \end{pmatrix}, \begin{pmatrix} \frac12 \\ 1 \end{pmatrix}.
(b) Eigenvalues: 2, \frac12; eigenvectors: \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} -\frac12 \\ 1 \end{pmatrix}.
(c) Eigenvalues: 7, 1; eigenvectors: \begin{pmatrix} \frac12 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
(d) Eigenvalues: 12, 9, 2; eigenvectors: \begin{pmatrix} 6 \\ -3 \\ 4 \end{pmatrix}, \begin{pmatrix} -6 \\ 3 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}.
(e) Eigenvalues: 3, 1, 0; eigenvectors: \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -\frac12 \\ 1 \end{pmatrix}.
(f) 2 is a double eigenvalue with eigenvector basis \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, while 1 is a simple eigenvalue with eigenvector \begin{pmatrix} 2 \\ -2 \\ 1 \end{pmatrix}. For orthogonality you need to select an M-orthogonal basis of the two-dimensional eigenspace, say by using Gram–Schmidt.
♦ 8.4.10. If L[v] = \lambda v, then, using the inner product,
\lambda \|v\|^2 = \langle L[v], v \rangle = \langle v, L[v] \rangle = \bar\lambda \|v\|^2,
which proves that the eigenvalue \lambda is real. Similarly, if L[w] = \mu w, then
\lambda \langle v, w \rangle = \langle L[v], w \rangle = \langle v, L[w] \rangle = \mu \langle v, w \rangle,
and so if \lambda \ne \mu, then \langle v, w \rangle = 0.
♦ 8.4.11. As shown in the text, since y_i \in V^\perp, its image A y_i \in V^\perp also, and hence A y_i is a linear combination of the basis vectors y_1, \dots, y_{n-1}, proving the first statement. Furthermore, since y_1, \dots, y_{n-1} form an orthonormal basis, by (8.30), b_{ij} = y_i \cdot A y_j = (A y_i) \cdot y_j = b_{ji}.
♥ 8.4.12.
(a) \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & & \vdots \\ \vdots & & \ddots & \ddots & \\ 0 & \cdots & 0 & -1 & 1 \\ 1 & 0 & \cdots & 0 & -1 \end{pmatrix}.
(b) Using Exercise 8.2.13(c),
\Delta \omega_k = (S - I)\omega_k = \left(1 - e^{2k\pi i/n}\right)\omega_k,
and so \omega_k is an eigenvector of \Delta with corresponding eigenvalue 1 - e^{2k\pi i/n}.
(c) Since S is an orthogonal matrix, S^T = S^{-1}, and so S^T\omega_k = e^{-2k\pi i/n}\omega_k. Therefore,
K\omega_k = (S^T - I)(S - I)\omega_k = (2I - S - S^T)\omega_k = \left(2 - e^{2k\pi i/n} - e^{-2k\pi i/n}\right)\omega_k = \left(2 - 2\cos\frac{2k\pi}{n}\right)\omega_k,
and hence \omega_k is an eigenvector of K with corresponding eigenvalue 2 - 2\cos\frac{2k\pi}{n}.
(d) Yes, K > 0 since its eigenvalues are all positive; or note that K = \Delta^T\Delta is a Gram matrix, with \ker\Delta = 0.
(e) Each eigenvalue 2 - 2\cos\frac{2k\pi}{n} = 2 - 2\cos\frac{2(n-k)\pi}{n} for k \ne \frac12 n is double, with a two-dimensional eigenspace spanned by \omega_k and \omega_{n-k} = \bar\omega_k. The corresponding real eigenvectors are \operatorname{Re}\omega_k = \frac12\omega_k + \frac12\omega_{n-k} and \operatorname{Im}\omega_k = \frac{1}{2i}\omega_k - \frac{1}{2i}\omega_{n-k}. On the other hand, if k = \frac12 n (which requires that n be even), the eigenvector \omega_{n/2} = (1, -1, 1, -1, \dots)^T is real.
♥ 8.4.13.
(a) The shift matrix has c_1 = 1, c_i = 0 for i \ne 1; the difference matrix has c_0 = -1, c_1 = 1, and c_i = 0 for i > 1; the symmetric product K has c_0 = 2, c_1 = c_{n-1} = -1, and c_i = 0 for 1 < i < n-1.
(b) The eigenvector equation
C\omega_k = \left(c_0 + c_1 e^{2k\pi i/n} + c_2 e^{4k\pi i/n} + \cdots + c_{n-1} e^{2(n-1)k\pi i/n}\right)\omega_k
can either be proved directly, or by noting that
C = c_0 I + c_1 S + c_2 S^2 + \cdots + c_{n-1} S^{n-1},
and using Exercise 8.2.13(c).
(c) This follows since the individual columns of F_n = (\omega_0, \dots, \omega_{n-1}) are the sampled exponential eigenvectors, and so the columns of the matrix equation C F_n = F_n \Lambda are the eigenvector equations C\omega_k = \lambda_k\omega_k for k = 0, \dots, n-1.
(d) (i) Eigenvalues 3, -1; eigenvectors \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
(ii) Eigenvalues 6, -\frac32 - \frac{\sqrt 3}{2} i, -\frac32 + \frac{\sqrt 3}{2} i; eigenvectors \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -\frac12 + \frac{\sqrt 3}{2} i \\ -\frac12 - \frac{\sqrt 3}{2} i \end{pmatrix}, \begin{pmatrix} 1 \\ -\frac12 - \frac{\sqrt 3}{2} i \\ -\frac12 + \frac{\sqrt 3}{2} i \end{pmatrix}.
(iii) Eigenvalues 0, 2-2i, 0, 2+2i; eigenvectors \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ i \\ -1 \\ -i \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ -i \\ -1 \\ i \end{pmatrix}.
(iv) Eigenvalues 0, 2, 4, 2; eigenvectors \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ i \\ -1 \\ -i \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ -i \\ -1 \\ i \end{pmatrix}.
(e) The eigenvalues are (i) 6, 3, 3; (ii) 6, 4, 4, 2; (iii) 6, \frac{7+\sqrt 5}{2}, \frac{7+\sqrt 5}{2}, \frac{7-\sqrt 5}{2}, \frac{7-\sqrt 5}{2}; (iv) in the n \times n case, they are 4 + 2\cos\frac{2k\pi}{n} for k = 0, \dots, n-1. The eigenvalues are real and positive because the matrices are positive definite.
(f) Cases (i,ii) in (d) and all matrices in part (e) are invertible. In general, an n \times n circulant matrix is invertible if and only if no nth root of unity x = e^{2k\pi i/n} is a root of the polynomial c_0 + c_1 x + \cdots + c_{n-1} x^{n-1}.
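The formula in 8.4.13(b) identifies the eigenvalues of a circulant matrix with the values of the polynomial c_0 + c_1 x + \cdots + c_{n-1}x^{n-1} at the nth roots of unity, i.e. (up to conjugation) the discrete Fourier transform of the coefficient vector. A NumPy sketch, using the coefficients c_0 = 2, c_1 = c_{n-1} = -1 of the matrix K from part (a), with the size n = 4 assumed for illustration:

```python
import numpy as np

# The circulant K of 8.4.13(a): c0 = 2, c1 = c_{n-1} = -1, here with n = 4.
n = 4
c = np.zeros(n); c[0] = 2; c[1] = -1; c[-1] = -1

# Build the circulant matrix whose k-th row is c cyclically shifted k places.
C = np.array([np.roll(c, k) for k in range(n)])

# The eigenvalues are the DFT of c (conjugation is immaterial: c is real and
# symmetric), and should equal 2 - 2cos(2k pi/n) as in 8.4.12(c).
eig_formula = np.sort(np.real(np.fft.fft(c)))
eig_direct = np.sort(np.linalg.eigvalsh(C))
print(np.allclose(eig_formula, eig_direct))               # True
print(np.allclose(eig_direct, [0, 2, 2, 4]))              # True
```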
8.4.14.
(a) \begin{pmatrix} -3 & 4 \\ 4 & 3 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 5} & -\frac{2}{\sqrt 5} \\ \frac{2}{\sqrt 5} & \frac{1}{\sqrt 5} \end{pmatrix} \begin{pmatrix} 5 & 0 \\ 0 & -5 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 5} & \frac{2}{\sqrt 5} \\ -\frac{2}{\sqrt 5} & \frac{1}{\sqrt 5} \end{pmatrix}.
(b) \begin{pmatrix} 2 & -1 \\ -1 & 4 \end{pmatrix} = \begin{pmatrix} \frac{1-\sqrt 2}{\sqrt{4-2\sqrt 2}} & \frac{1+\sqrt 2}{\sqrt{4+2\sqrt 2}} \\ \frac{1}{\sqrt{4-2\sqrt 2}} & \frac{1}{\sqrt{4+2\sqrt 2}} \end{pmatrix} \begin{pmatrix} 3+\sqrt 2 & 0 \\ 0 & 3-\sqrt 2 \end{pmatrix} \begin{pmatrix} \frac{1-\sqrt 2}{\sqrt{4-2\sqrt 2}} & \frac{1}{\sqrt{4-2\sqrt 2}} \\ \frac{1+\sqrt 2}{\sqrt{4+2\sqrt 2}} & \frac{1}{\sqrt{4+2\sqrt 2}} \end{pmatrix}.
(c) \begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} \\ \frac{2}{\sqrt 6} & 0 & -\frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 6} & \frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} \end{pmatrix} \begin{pmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 6} & \frac{2}{\sqrt 6} & \frac{1}{\sqrt 6} \\ -\frac{1}{\sqrt 2} & 0 & \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 3} & -\frac{1}{\sqrt 3} & \frac{1}{\sqrt 3} \end{pmatrix}.
(d) \begin{pmatrix} 3 & -1 & -1 \\ -1 & 2 & 0 \\ -1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} -\frac{2}{\sqrt 6} & 0 & \frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 6} & \frac{1}{\sqrt 2} & \frac{1}{\sqrt 3} \end{pmatrix} \begin{pmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -\frac{2}{\sqrt 6} & \frac{1}{\sqrt 6} & \frac{1}{\sqrt 6} \\ 0 & -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 3} & \frac{1}{\sqrt 3} & \frac{1}{\sqrt 3} \end{pmatrix}.
8.4.15.
(a) \begin{pmatrix} 2 & 6 \\ 6 & -7 \end{pmatrix} = \begin{pmatrix} \frac{2}{\sqrt 5} & \frac{1}{\sqrt 5} \\ \frac{1}{\sqrt 5} & -\frac{2}{\sqrt 5} \end{pmatrix} \begin{pmatrix} 5 & 0 \\ 0 & -10 \end{pmatrix} \begin{pmatrix} \frac{2}{\sqrt 5} & \frac{1}{\sqrt 5} \\ \frac{1}{\sqrt 5} & -\frac{2}{\sqrt 5} \end{pmatrix}.
(b) \begin{pmatrix} 5 & -2 \\ -2 & 5 \end{pmatrix} = \begin{pmatrix} -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix} \begin{pmatrix} 7 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix}.
(c) \begin{pmatrix} 2 & -1 \\ -1 & 5 \end{pmatrix} = \begin{pmatrix} \frac{3-\sqrt{13}}{\sqrt{26-6\sqrt{13}}} & \frac{3+\sqrt{13}}{\sqrt{26+6\sqrt{13}}} \\ \frac{2}{\sqrt{26-6\sqrt{13}}} & \frac{2}{\sqrt{26+6\sqrt{13}}} \end{pmatrix} \begin{pmatrix} \frac{7+\sqrt{13}}{2} & 0 \\ 0 & \frac{7-\sqrt{13}}{2} \end{pmatrix} \begin{pmatrix} \frac{3-\sqrt{13}}{\sqrt{26-6\sqrt{13}}} & \frac{2}{\sqrt{26-6\sqrt{13}}} \\ \frac{3+\sqrt{13}}{\sqrt{26+6\sqrt{13}}} & \frac{2}{\sqrt{26+6\sqrt{13}}} \end{pmatrix}.
(d) \begin{pmatrix} 1 & 0 & 4 \\ 0 & 1 & 3 \\ 4 & 3 & 1 \end{pmatrix} = \begin{pmatrix} \frac{4}{5\sqrt 2} & -\frac35 & -\frac{4}{5\sqrt 2} \\ \frac{3}{5\sqrt 2} & \frac45 & -\frac{3}{5\sqrt 2} \\ \frac{1}{\sqrt 2} & 0 & \frac{1}{\sqrt 2} \end{pmatrix} \begin{pmatrix} 6 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -4 \end{pmatrix} \begin{pmatrix} \frac{4}{5\sqrt 2} & \frac{3}{5\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac35 & \frac45 & 0 \\ -\frac{4}{5\sqrt 2} & -\frac{3}{5\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix}.
(e) \begin{pmatrix} 6 & -4 & 1 \\ -4 & 6 & -1 \\ 1 & -1 & 11 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 3} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 6} & \frac{1}{\sqrt 3} & \frac{1}{\sqrt 2} \\ \frac{2}{\sqrt 6} & \frac{1}{\sqrt 3} & 0 \end{pmatrix} \begin{pmatrix} 12 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 6} & \frac{2}{\sqrt 6} \\ -\frac{1}{\sqrt 3} & \frac{1}{\sqrt 3} & \frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} & 0 \end{pmatrix}.
8.4.16. (a) \begin{pmatrix} \frac{57}{25} & -\frac{24}{25} \\ -\frac{24}{25} & \frac{43}{25} \end{pmatrix}, (b) \begin{pmatrix} -\frac12 & \frac32 \\ \frac32 & -\frac12 \end{pmatrix}. (c) None, since the eigenvectors are not orthogonal. (d) \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}. Note: even though the given eigenvectors are not orthogonal, one can construct an orthogonal basis of the eigenspace.
8.4.17.
(a) \frac12\left(\frac{3}{\sqrt{10}}x + \frac{1}{\sqrt{10}}y\right)^2 + \frac{11}{2}\left(-\frac{1}{\sqrt{10}}x + \frac{3}{\sqrt{10}}y\right)^2 = \frac{1}{20}(3x+y)^2 + \frac{11}{20}(-x+3y)^2,
(b) 7\left(\frac{1}{\sqrt 5}x + \frac{2}{\sqrt 5}y\right)^2 + 2\left(-\frac{2}{\sqrt 5}x + \frac{1}{\sqrt 5}y\right)^2 = \frac75(x+2y)^2 + \frac25(-2x+y)^2,
(c) -4\left(\frac{4}{5\sqrt 2}x + \frac{3}{5\sqrt 2}y - \frac{1}{\sqrt 2}z\right)^2 + \left(-\frac35 x + \frac45 y\right)^2 + 6\left(\frac{4}{5\sqrt 2}x + \frac{3}{5\sqrt 2}y + \frac{1}{\sqrt 2}z\right)^2 = -\frac{2}{25}(4x+3y-5z)^2 + \frac{1}{25}(-3x+4y)^2 + \frac{3}{25}(4x+3y+5z)^2,
(d) \frac12\left(\frac{1}{\sqrt 3}x + \frac{1}{\sqrt 3}y + \frac{1}{\sqrt 3}z\right)^2 + \left(-\frac{1}{\sqrt 2}y + \frac{1}{\sqrt 2}z\right)^2 + 2\left(-\frac{2}{\sqrt 6}x + \frac{1}{\sqrt 6}y + \frac{1}{\sqrt 6}z\right)^2 = \frac16(x+y+z)^2 + \frac12(-y+z)^2 + \frac13(-2x+y+z)^2,
(e) 2\left(\frac{1}{\sqrt 2}x + \frac{1}{\sqrt 2}y\right)^2 + 9\left(-\frac{1}{\sqrt 3}x + \frac{1}{\sqrt 3}y + \frac{1}{\sqrt 3}z\right)^2 + 12\left(\frac{1}{\sqrt 6}x - \frac{1}{\sqrt 6}y + \frac{2}{\sqrt 6}z\right)^2 = (x+y)^2 + 3(-x+y+z)^2 + 2(x-y+2z)^2.
♥ 8.4.18.
(a) \lambda_1 = \lambda_2 = 3, v_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, v_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}; \lambda_3 = 0, v_3 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix};
(b) \det A = \lambda_1\lambda_2\lambda_3 = 0.
(c) A is positive semi-definite, but not positive definite, since it has a zero eigenvalue.
(d) u_1 = \begin{pmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \\ 0 \end{pmatrix}, u_2 = \begin{pmatrix} -\frac{1}{\sqrt 6} \\ \frac{1}{\sqrt 6} \\ \frac{2}{\sqrt 6} \end{pmatrix}, u_3 = \begin{pmatrix} \frac{1}{\sqrt 3} \\ -\frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 3} \end{pmatrix};
(e) \begin{pmatrix} 2 & 1 & -1 \\ 1 & 2 & 1 \\ -1 & 1 & 2 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 2} & -\frac{1}{\sqrt 6} & \frac{1}{\sqrt 3} \\ \frac{1}{\sqrt 2} & \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 3} \\ 0 & \frac{2}{\sqrt 6} & \frac{1}{\sqrt 3} \end{pmatrix} \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} & 0 \\ -\frac{1}{\sqrt 6} & \frac{1}{\sqrt 6} & \frac{2}{\sqrt 6} \\ \frac{1}{\sqrt 3} & -\frac{1}{\sqrt 3} & \frac{1}{\sqrt 3} \end{pmatrix};
(f) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{\sqrt 2}u_1 - \frac{1}{\sqrt 6}u_2 + \frac{1}{\sqrt 3}u_3.
♦ 8.4.19. The simplest is A = I . More generally, any matrix of the form A = ST ΛS, whereS = (u1 u2 . . . un ) and Λ is any real diagonal matrix.
8.4.20. True, assuming that the eigenvector basis is real. If Q is the orthogonal matrix formed by the eigenvector basis, then A Q = Q\Lambda, where \Lambda is the diagonal eigenvalue matrix. Thus, A = Q\Lambda Q^{-1} = Q\Lambda Q^T = A^T is symmetric. For complex eigenvector bases, the result is false, even for real matrices. For example, any 2 \times 2 rotation matrix \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} has orthonormal eigenvector basis \begin{pmatrix} i \\ 1 \end{pmatrix}, \begin{pmatrix} -i \\ 1 \end{pmatrix}. See Exercise 8.6.5 for details.
8.4.21. Using the spectral factorization, we have x^T A x = (Q^T x)^T \Lambda (Q^T x) = \sum_{i=1}^n \lambda_i y_i^2, where y_i = u_i \cdot x = \|x\|\cos\theta_i denotes the ith entry of Q^T x.
8.4.22. Principal stretches = eigenvalues: 4 + \sqrt 3, 4 - \sqrt 3, 1; principal directions = eigenvectors: (1, -1+\sqrt 3, 1)^T, (1, -1-\sqrt 3, 1)^T, (-1, 0, 1)^T.
♦ 8.4.23. Moments of inertia: 4, 2, 1; principal directions: ( 1, 2, 1 )T , (−1, 0, 1 )T , ( 1,−1, 1 )T .
♦ 8.4.24. (a) Let K = Q\Lambda Q^T be its spectral factorization. Then x^T K x = y^T\Lambda y, where x = Q y. The ellipse y^T\Lambda y = \lambda_1 y_1^2 + \lambda_2 y_2^2 = 1 has its principal axes aligned with the coordinate axes and semi-axes 1/\sqrt{\lambda_i}, i = 1, 2. The map x = Q y serves to rotate the coordinate axes to align with the columns of Q, i.e., the eigenvectors, while leaving the semi-axes unchanged.
(b) (i) Ellipse with semi-axes 1, \frac12 and principal axes \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}.
(ii) Ellipse with semi-axes \sqrt 2, \sqrt{\frac23}, and principal axes \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
(iii) Ellipse with semi-axes \frac{1}{\sqrt{2+\sqrt 2}}, \frac{1}{\sqrt{2-\sqrt 2}}, and principal axes \begin{pmatrix} 1+\sqrt 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1-\sqrt 2 \\ 1 \end{pmatrix}.
(c) If K is positive semi-definite it is a parabola; if K is symmetric and indefinite, a hyperbola; if negative (semi-)definite, the empty set. If K is not symmetric, replace K by \frac12(K + K^T) as in Exercise 3.4.20, and then apply the preceding classification.
♦ 8.4.25. (a) Same method as in Exercise 8.4.24. Its principal axes are the eigenvectors of K, and the semi-axes are the reciprocals of the square roots of the eigenvalues. (b) Ellipsoid with principal axes (1, 0, 1)^T, (-1, -1, 1)^T, (-1, 2, 1)^T and semi-axes \frac{1}{\sqrt 6}, \frac{1}{\sqrt{12}}, \frac{1}{\sqrt{24}}.
8.4.26. If \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n) = \operatorname{diag}(d_1, \dots, d_n), then the (i,j) entry of \Lambda M is d_i m_{ij}, whereas the (i,j) entry of M\Lambda is m_{ij} d_j. These are equal if and only if either m_{ij} = 0 or d_i = d_j. Thus, \Lambda M = M\Lambda with M having one or more non-zero off-diagonal entries, which includes the case of non-zero skew-symmetric matrices, if and only if \Lambda has one or more repeated diagonal entries. Next, suppose A = Q\Lambda Q^T is symmetric with diagonal form \Lambda. If A J = J A with J^T = -J \ne O, then \Lambda M = M\Lambda, where M = Q^T J Q is also nonzero, skew-symmetric, and hence A has repeated eigenvalues. Conversely, if \lambda_i = \lambda_j, choose M such that m_{ij} = 1 = -m_{ji}, and then A commutes with J = Q M Q^T.
♦ 8.4.27. (a) Set B = Q\sqrt\Lambda\, Q^T, where \sqrt\Lambda is the diagonal matrix with the square roots of the eigenvalues of A along the diagonal. Uniqueness follows from the fact that the eigenvectors and eigenvalues are uniquely determined. (Permuting them does not change the final form of B.)
(b) (i) \frac12\begin{pmatrix} \sqrt 3 + 1 & \sqrt 3 - 1 \\ \sqrt 3 - 1 & \sqrt 3 + 1 \end{pmatrix}; (ii) \frac{1}{(2-\sqrt 2)\sqrt{2+\sqrt 2}}\begin{pmatrix} 2\sqrt 2 - 1 & 1-\sqrt 2 \\ 1-\sqrt 2 & 1 \end{pmatrix};
(iii) \begin{pmatrix} \sqrt 2 & 0 & 0 \\ 0 & \sqrt 5 & 0 \\ 0 & 0 & 3 \end{pmatrix}; (iv) \begin{pmatrix} 1 + \frac{1}{\sqrt 2} + \frac{1}{\sqrt 3} & -1 + \frac{1}{\sqrt 2} - \frac{1}{\sqrt 3} & -1 + \frac{2}{\sqrt 3} \\ -1 + \frac{1}{\sqrt 2} - \frac{1}{\sqrt 3} & 1 + \frac{1}{\sqrt 2} + \frac{1}{\sqrt 3} & 1 - \frac{2}{\sqrt 3} \\ -1 + \frac{2}{\sqrt 3} & 1 - \frac{2}{\sqrt 3} & 1 + \frac{4}{\sqrt 3} \end{pmatrix}.
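The construction of 8.4.27(a) can be sketched numerically: form B = Q\sqrt\Lambda\,Q^T from the spectral factorization and check that B^2 = A. The test matrix below is a hypothetical example, not one of the exercise's matrices:

```python
import numpy as np

def sqrtm_spd(A):
    """Positive definite square root B = Q sqrt(Lambda) Q^T of an SPD matrix A."""
    lam, Q = np.linalg.eigh(A)              # spectral factorization A = Q diag(lam) Q^T
    return Q @ np.diag(np.sqrt(lam)) @ Q.T

# Hypothetical SPD test matrix (not from the exercise): eigenvalues 3, 1.
A = np.array([[2., 1.],
              [1., 2.]])
B = sqrtm_spd(A)
print(np.allclose(B @ B, A))                # True: B^2 = A
print(np.all(np.linalg.eigvalsh(B) > 0))    # True: B is positive definite
```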
8.4.28. Only the identity matrix is both orthogonal and positive definite. Indeed, if K = K^T > 0 is orthogonal, then K^2 = I, and so its eigenvalues are all \pm 1. Positive definiteness implies that all the eigenvalues must be +1, and hence its diagonal form is \Lambda = I. But then K = Q\,I\,Q^T = I also.
♦ 8.4.29. If A = Q B, then K = A^T A = B^T Q^T Q B = B^2, and hence B = \sqrt K is the positive definite square root of K. Moreover, Q = A B^{-1} then satisfies Q^T Q = B^{-T} A^T A B^{-1} = B^{-1} K B^{-1} = I, since K = B^2. Finally, \det A = \det Q \det B, and \det B > 0 since B > 0. So if \det A > 0, then \det Q = +1 > 0.
8.4.30. (a) \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, (b) \begin{pmatrix} 2 & -3 \\ 1 & 6 \end{pmatrix} = \begin{pmatrix} \frac{2}{\sqrt 5} & -\frac{1}{\sqrt 5} \\ \frac{1}{\sqrt 5} & \frac{2}{\sqrt 5} \end{pmatrix}\begin{pmatrix} \sqrt 5 & 0 \\ 0 & 3\sqrt 5 \end{pmatrix},
(c) \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix}\begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} & \frac{3}{\sqrt 2} \end{pmatrix},
(d) \begin{pmatrix} 0 & -3 & 8 \\ 1 & 0 & 0 \\ 0 & 4 & 6 \end{pmatrix} = \begin{pmatrix} 0 & -\frac35 & \frac45 \\ 1 & 0 & 0 \\ 0 & \frac45 & \frac35 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 10 \end{pmatrix},
(e) \begin{pmatrix} 1 & 0 & 1 \\ 1 & -2 & 0 \\ 1 & 1 & 0 \end{pmatrix} = \begin{pmatrix} .3897 & .0323 & .9204 \\ .5127 & -.8378 & -.1877 \\ .7650 & .5450 & -.3430 \end{pmatrix}\begin{pmatrix} 1.6674 & -.2604 & .3897 \\ -.2604 & 2.2206 & .0323 \\ .3897 & .0323 & .9204 \end{pmatrix}.
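The polar decomposition of 8.4.29 can be sketched directly from its proof: B = \sqrt{A^T A} via the spectral factorization, then Q = A B^{-1}. Checking it on the matrix of 8.4.30(c):

```python
import numpy as np

def polar(A):
    """Polar decomposition A = Q B as in 8.4.29: B = sqrt(A^T A), Q = A B^{-1}."""
    lam, V = np.linalg.eigh(A.T @ A)        # K = A^T A = V diag(lam) V^T
    B = V @ np.diag(np.sqrt(lam)) @ V.T     # positive definite square root of K
    Q = A @ np.linalg.inv(B)
    return Q, B

# The matrix of 8.4.30(c).
A = np.array([[1., 2.],
              [0., 1.]])
Q, B = polar(A)
print(np.allclose(Q @ B, A))                # True: A = Q B
print(np.allclose(Q.T @ Q, np.eye(2)))      # True: Q is orthogonal
```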
♥ 8.4.31. (i) This follows immediately from the spectral factorization. The rows of \Lambda Q^T are \lambda_1 u_1^T, \dots, \lambda_n u_n^T, and formula (8.34) follows from the alternative version of matrix multiplication given in Exercise 1.2.34.
(ii) (a) \begin{pmatrix} -3 & 4 \\ 4 & 3 \end{pmatrix} = 5\begin{pmatrix} \frac15 & \frac25 \\ \frac25 & \frac45 \end{pmatrix} - 5\begin{pmatrix} \frac45 & -\frac25 \\ -\frac25 & \frac15 \end{pmatrix}.
(b) \begin{pmatrix} 2 & -1 \\ -1 & 4 \end{pmatrix} = (3+\sqrt 2)\begin{pmatrix} \frac{3-2\sqrt 2}{4-2\sqrt 2} & \frac{1-\sqrt 2}{4-2\sqrt 2} \\ \frac{1-\sqrt 2}{4-2\sqrt 2} & \frac{1}{4-2\sqrt 2} \end{pmatrix} + (3-\sqrt 2)\begin{pmatrix} \frac{3+2\sqrt 2}{4+2\sqrt 2} & \frac{1+\sqrt 2}{4+2\sqrt 2} \\ \frac{1+\sqrt 2}{4+2\sqrt 2} & \frac{1}{4+2\sqrt 2} \end{pmatrix}.
(c) \begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{pmatrix} = 3\begin{pmatrix} \frac16 & \frac13 & \frac16 \\ \frac13 & \frac23 & \frac13 \\ \frac16 & \frac13 & \frac16 \end{pmatrix} + \begin{pmatrix} \frac12 & 0 & -\frac12 \\ 0 & 0 & 0 \\ -\frac12 & 0 & \frac12 \end{pmatrix}.
(d) \begin{pmatrix} 3 & -1 & -1 \\ -1 & 2 & 0 \\ -1 & 0 & 2 \end{pmatrix} = 4\begin{pmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac16 & \frac16 \\ -\frac13 & \frac16 & \frac16 \end{pmatrix} + 2\begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac12 & -\frac12 \\ 0 & -\frac12 & \frac12 \end{pmatrix} + \begin{pmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{pmatrix}.
♦ 8.4.32. According to Exercise 8.4.7, the eigenvalues of an n \times n Hermitian matrix are all real, and the eigenvectors corresponding to distinct eigenvalues are orthogonal with respect to the Hermitian inner product on \mathbb C^n. Moreover, every Hermitian matrix is complete and has an orthonormal eigenvector basis of \mathbb C^n; a proof follows along the same lines as the symmetric case in Theorem 8.20. Let U be the corresponding matrix whose columns are the orthonormal eigenvector basis. Orthonormality implies that U is a unitary matrix: U^\dagger U = I, and it satisfies H U = U\Lambda, where \Lambda is the real diagonal matrix with the eigenvalues of H on the diagonal. Therefore, H = U\Lambda U^\dagger.
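The Hermitian factorization H = U\Lambda U^\dagger can be checked numerically; a NumPy sketch using the matrix of 8.4.33(a), whose eigenvalues are 2 and 7:

```python
import numpy as np

# The Hermitian matrix of 8.4.33(a).
H = np.array([[3., 2.j],
              [-2.j, 6.]])

lam, U = np.linalg.eigh(H)      # real eigenvalues, unitary eigenvector matrix
print(np.allclose(np.sort(lam), [2, 7]))                 # True: eigenvalues 2, 7
print(np.allclose(U.conj().T @ U, np.eye(2)))            # True: U is unitary
print(np.allclose(U @ np.diag(lam) @ U.conj().T, H))     # True: H = U Lambda U^dagger
```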
8.4.33.
(a) \begin{pmatrix} 3 & 2i \\ -2i & 6 \end{pmatrix} = \begin{pmatrix} -\frac{2i}{\sqrt 5} & \frac{i}{\sqrt 5} \\ \frac{1}{\sqrt 5} & \frac{2}{\sqrt 5} \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 7 \end{pmatrix} \begin{pmatrix} \frac{2i}{\sqrt 5} & \frac{1}{\sqrt 5} \\ -\frac{i}{\sqrt 5} & \frac{2}{\sqrt 5} \end{pmatrix},
(b) \begin{pmatrix} 6 & 1-2i \\ 1+2i & 2 \end{pmatrix} = \begin{pmatrix} \frac{1-2i}{\sqrt 6} & \frac{-1+2i}{\sqrt{30}} \\ \frac{1}{\sqrt 6} & \frac{\sqrt 5}{\sqrt 6} \end{pmatrix} \begin{pmatrix} 7 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \frac{1+2i}{\sqrt 6} & \frac{1}{\sqrt 6} \\ \frac{-1-2i}{\sqrt{30}} & \frac{\sqrt 5}{\sqrt 6} \end{pmatrix},
(c) \begin{pmatrix} -1 & 5i & -4 \\ -5i & -1 & 4i \\ -4 & -4i & 8 \end{pmatrix} = \begin{pmatrix} -\frac{1}{\sqrt 6} & -\frac{i}{\sqrt 2} & \frac{1}{\sqrt 3} \\ \frac{i}{\sqrt 6} & \frac{1}{\sqrt 2} & -\frac{i}{\sqrt 3} \\ \frac{2}{\sqrt 6} & 0 & \frac{1}{\sqrt 3} \end{pmatrix} \begin{pmatrix} 12 & 0 & 0 \\ 0 & -6 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} -\frac{1}{\sqrt 6} & -\frac{i}{\sqrt 6} & \frac{2}{\sqrt 6} \\ \frac{i}{\sqrt 2} & \frac{1}{\sqrt 2} & 0 \\ \frac{1}{\sqrt 3} & \frac{i}{\sqrt 3} & \frac{1}{\sqrt 3} \end{pmatrix}.
8.4.34. Maximum: 7; minimum: 3.
8.4.35. Maximum: \frac{4+\sqrt 5}{2}; minimum: \frac{4-\sqrt 5}{2}.
8.4.36.
(a) \frac{5+\sqrt 5}{2} = \max\{2x^2 - 2xy + 3y^2 \mid x^2 + y^2 = 1\}, \frac{5-\sqrt 5}{2} = \min\{2x^2 - 2xy + 3y^2 \mid x^2 + y^2 = 1\};
(b) 5 = \max\{4x^2 + 2xy + 4y^2 \mid x^2 + y^2 = 1\}, 3 = \min\{4x^2 + 2xy + 4y^2 \mid x^2 + y^2 = 1\};
(c) 12 = \max\{6x^2 - 8xy + 2xz + 6y^2 - 2yz + 11z^2 \mid x^2 + y^2 + z^2 = 1\}, 2 = \min\{6x^2 - 8xy + 2xz + 6y^2 - 2yz + 11z^2 \mid x^2 + y^2 + z^2 = 1\};
(d) 6 = \max\{4x^2 - 2xy - 4xz + 4y^2 - 2yz + 4z^2 \mid x^2 + y^2 + z^2 = 1\}, 3-\sqrt 3 = \min\{4x^2 - 2xy - 4xz + 4y^2 - 2yz + 4z^2 \mid x^2 + y^2 + z^2 = 1\}.
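These constrained extremes are the extreme eigenvalues of the coefficient matrix of the quadratic form. A NumPy sketch for 8.4.36(b), whose form 4x^2 + 2xy + 4y^2 has coefficient matrix \begin{pmatrix} 4 & 1 \\ 1 & 4 \end{pmatrix}:

```python
import numpy as np

# 8.4.36(b): extremes of q(x,y) = 4x^2 + 2xy + 4y^2 on the unit circle
# are the extreme eigenvalues of the coefficient matrix K.
K = np.array([[4., 1.],
              [1., 4.]])
print(np.allclose(np.linalg.eigvalsh(K), [3, 5]))  # True: min 3, max 5

# Numerical spot check on a dense sampling of unit vectors.
t = np.linspace(0.0, 2*np.pi, 100000)
X = np.stack([np.cos(t), np.sin(t)])
q = np.einsum('ik,ij,jk->k', X, K, X)
print(abs(q.max() - 5) < 1e-6, abs(q.min() - 3) < 1e-6)  # True True
```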
8.4.37. (c) 9 = \max\{6x^2 - 8xy + 2xz + 6y^2 - 2yz + 11z^2 \mid x^2 + y^2 + z^2 = 1,\ x - y + 2z = 0\}; (d) 3+\sqrt 3 = \max\{4x^2 - 2xy - 4xz + 4y^2 - 2yz + 4z^2 \mid x^2 + y^2 + z^2 = 1,\ x - z = 0\}.
8.4.38. (a) Maximum: 3; minimum: -2. (b) Maximum: \frac52; minimum: -\frac12. (c) Maximum: \frac{8+\sqrt 5}{2} = 5.11803; minimum: \frac{8-\sqrt 5}{2} = 2.88197. (d) Maximum: \frac{4+\sqrt{10}}{2} = 3.58114; minimum: \frac{4-\sqrt{10}}{2} = .41886.
8.4.39. Maximum: \cos\frac{\pi}{n+1}; minimum: -\cos\frac{\pi}{n+1}.
8.4.40. Maximum: r2 λ1; minimum: r2 λn, where λ1, λn are, respectively, the maximum andminimum eigenvalues of K.
8.4.41. \max\{x^T K x \mid \|x\| = 1\} = \lambda_1 is the largest eigenvalue of K. On the other hand, K^{-1} is positive definite, cf. Exercise 3.4.10, and hence \min\{x^T K^{-1} x \mid \|x\| = 1\} = \mu_n is its smallest eigenvalue. But the eigenvalues of K^{-1} are the reciprocals of the eigenvalues of K, and hence its smallest eigenvalue is \mu_n = 1/\lambda_1, and so the product is \lambda_1\mu_n = 1.
♦ 8.4.42. According to the discussion preceding the statement of Theorem 8.30,
\lambda_j = \max\{y^T \Lambda y \mid \|y\| = 1,\ y \cdot e_1 = \cdots = y \cdot e_{j-1} = 0\}.
Moreover, using (8.33), setting x = Q y and using the fact that Q is an orthogonal matrix, so that (Qv) \cdot (Qw) = v \cdot w for any v, w \in \mathbb R^n, we have
x^T A x = y^T \Lambda y, \quad \|x\| = \|y\|, \quad y \cdot e_i = x \cdot v_i,
where v_i = Q e_i is the ith eigenvector of A. Therefore, by the preceding formula,
\lambda_j = \max\{x^T A x \mid \|x\| = 1,\ x \cdot v_1 = \cdots = x \cdot v_{j-1} = 0\}.
♦ 8.4.43. Let A be a symmetric matrix with eigenvalues \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n and corresponding orthogonal eigenvectors v_1, \dots, v_n. Then the minimal value of the quadratic form x^T A x over all unit vectors which are orthogonal to the last n-j eigenvectors is the jth eigenvalue:
\lambda_j = \min\{x^T A x \mid \|x\| = 1,\ x \cdot v_{j+1} = \cdots = x \cdot v_n = 0\}.
8.4.44. Note that \frac{v^T K v}{\|v\|^2} = u^T K u, where u = \frac{v}{\|v\|} is a unit vector. Moreover, if v is orthogonal to an eigenvector v_i, so is u. Therefore, by Theorem 8.30,
\max\left\{\frac{v^T K v}{\|v\|^2} \,\middle|\, v \ne 0,\ v \cdot v_1 = \cdots = v \cdot v_{j-1} = 0\right\} = \max\{u^T K u \mid \|u\| = 1,\ u \cdot v_1 = \cdots = u \cdot v_{j-1} = 0\} = \lambda_j.
♥ 8.4.45. (a) Let R = \sqrt M be the positive definite square root of M, and set \widetilde K = R^{-1} K R^{-1}. Then x^T K x = y^T \widetilde K y, x^T M x = y^T y = \|y\|^2, where y = R x. Thus,
\max\{x^T K x \mid x^T M x = 1\} = \max\{y^T \widetilde K y \mid \|y\|^2 = 1\} = \widetilde\lambda_1,
the largest eigenvalue of \widetilde K. But \widetilde K y = \lambda y implies K x = \lambda M x, and so the eigenvalues of \widetilde K coincide with the generalized eigenvalues of the pair K, M.
(b) Write y = \frac{x}{\sqrt{x^T M x}}, so that y^T M y = 1. Then, by part (a),
\max\left\{\frac{x^T K x}{x^T M x} \,\middle|\, x \ne 0\right\} = \max\{y^T K y \mid y^T M y = 1\} = \lambda_1.
(c) \lambda_n = \min\{x^T K x \mid x^T M x = 1\}.
(d) \lambda_j = \max\{x^T K x \mid x^T M x = 1,\ x^T M v_1 = \cdots = x^T M v_{j-1} = 0\}, where v_1, \dots, v_n are the generalized eigenvectors.
8.4.46. (a) Maximum: \frac34; minimum: \frac25. (b) Maximum: \frac{9+4\sqrt 2}{7}; minimum: \frac{9-4\sqrt 2}{7}. (c) Maximum: 2; minimum: \frac12. (d) Maximum: 4; minimum: 1.
8.4.47. No. For example, if A = \begin{pmatrix} 1 & b \\ 0 & 4 \end{pmatrix} has eigenvalues 1, 4, then q(x) > 0 for x \ne 0 if and only if |b| < 4.
8.5.1. (a) \sqrt{3 \pm \sqrt 5}; (b) 1, 1; (c) 5\sqrt 2; (d) 3, 2; (e) \sqrt 7, \sqrt 2; (f) 3, 1.
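The singular values and the full decomposition are easy to verify numerically; a NumPy sketch using the matrix of 8.5.2(a), whose singular values are \sqrt{3 \pm \sqrt 5}:

```python
import numpy as np

# Singular values of the matrix in 8.5.2(a): sqrt(3 +- sqrt(5)).
A = np.array([[1., 1.],
              [0., 2.]])
sigma = np.linalg.svd(A, compute_uv=False)
exact = np.sqrt([3 + np.sqrt(5), 3 - np.sqrt(5)])
print(np.allclose(sigma, exact))              # True

# Full decomposition A = P Sigma Q^T.
P, s, Qt = np.linalg.svd(A)
print(np.allclose(P @ np.diag(s) @ Qt, A))    # True
```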
8.5.2.
(a) \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} \frac{-1+\sqrt 5}{\sqrt{10-2\sqrt 5}} & \frac{-1-\sqrt 5}{\sqrt{10+2\sqrt 5}} \\ \frac{2}{\sqrt{10-2\sqrt 5}} & \frac{2}{\sqrt{10+2\sqrt 5}} \end{pmatrix} \begin{pmatrix} \sqrt{3+\sqrt 5} & 0 \\ 0 & \sqrt{3-\sqrt 5} \end{pmatrix} \begin{pmatrix} \frac{-2+\sqrt 5}{\sqrt{10-4\sqrt 5}} & \frac{1}{\sqrt{10-4\sqrt 5}} \\ \frac{-2-\sqrt 5}{\sqrt{10+4\sqrt 5}} & \frac{1}{\sqrt{10+4\sqrt 5}} \end{pmatrix},
(b) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
(c) \begin{pmatrix} 1 & -2 \\ -3 & 6 \end{pmatrix} = \begin{pmatrix} -\frac{1}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} \end{pmatrix}\left(5\sqrt 2\right)\begin{pmatrix} -\frac{1}{\sqrt 5} & \frac{2}{\sqrt 5} \end{pmatrix},
(d) \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix},
(e) \begin{pmatrix} 2 & 1 & 0 & -1 \\ 0 & -1 & 1 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{2}{\sqrt 5} & \frac{1}{\sqrt 5} \\ \frac{1}{\sqrt 5} & \frac{2}{\sqrt 5} \end{pmatrix}\begin{pmatrix} \sqrt 7 & 0 \\ 0 & \sqrt 2 \end{pmatrix}\begin{pmatrix} -\frac{4}{\sqrt{35}} & -\frac{3}{\sqrt{35}} & \frac{1}{\sqrt{35}} & \frac{3}{\sqrt{35}} \\ \frac{2}{\sqrt{10}} & -\frac{1}{\sqrt{10}} & \frac{2}{\sqrt{10}} & \frac{1}{\sqrt{10}} \end{pmatrix},
(f) \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt 6} & -\frac{1}{\sqrt 2} \\ -\frac{2}{\sqrt 6} & 0 \\ \frac{1}{\sqrt 6} & \frac{1}{\sqrt 2} \end{pmatrix}\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} \frac{1}{\sqrt 6} & -\frac{2}{\sqrt 6} & \frac{1}{\sqrt 6} \\ -\frac{1}{\sqrt 2} & 0 & \frac{1}{\sqrt 2} \end{pmatrix}.
8.5.3.
(a) \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1+\sqrt 5}{\sqrt{10+2\sqrt 5}} & \frac{1-\sqrt 5}{\sqrt{10-2\sqrt 5}} \\ \frac{2}{\sqrt{10+2\sqrt 5}} & \frac{2}{\sqrt{10-2\sqrt 5}} \end{pmatrix} \begin{pmatrix} \sqrt{\frac32 + \frac12\sqrt 5} & 0 \\ 0 & \sqrt{\frac32 - \frac12\sqrt 5} \end{pmatrix} \begin{pmatrix} \frac{-1+\sqrt 5}{\sqrt{10-2\sqrt 5}} & \frac{2}{\sqrt{10-2\sqrt 5}} \\ \frac{-1-\sqrt 5}{\sqrt{10+2\sqrt 5}} & \frac{2}{\sqrt{10+2\sqrt 5}} \end{pmatrix};
(b) The first and last matrices are proper orthogonal, and so represent rotations, while the middle matrix is a stretch along the coordinate directions, in proportion to the singular values. Matrix multiplication corresponds to composition of the corresponding linear transformations.
8.5.4. (a) The eigenvalues of K = A^T A are \frac{15}{2} \pm \frac{\sqrt{221}}{2} = 14.933, .0667. The square roots of these eigenvalues give the singular values of A, i.e., 3.8643, .2588. The condition number is 3.8643 / .2588 = 14.933.
(b) The singular values are 1.50528, .030739, and so the condition number is 1.50528 / .030739 = 48.9697.
(c) The singular values are 3.1624, .0007273, and so the condition number is 3.1624 / .0007273 = 4348.17; slightly ill-conditioned.
(d) The singular values are 12.6557, 4.34391, .98226, so the condition number is 12.6557 / .98226 = 12.88418.
(e) The singular values are 239.138, 3.17545, .00131688, so the condition number is 239.138 / .00131688 = 181594; ill-conditioned.
(f) The singular values are 30.2887, 3.85806, .843107, .01015, so the condition number is 30.2887 / .01015 = 2984.09; slightly ill-conditioned.
♠ 8.5.5. In all cases, the large condition number results in an inaccurate solution.
(a) The exact solution is x = 1, y = -1; with three-digit rounding, the computed solution is x = 1.56, y = -1.56. The singular values of the coefficient matrix are 1615.22, .274885, and the condition number is 5876.
(b) The exact solution is x = -1, y = -109, z = 231; with three-digit rounding, the computed solution is x = -2.06, y = -75.7, z = 162. The singular values of the coefficient matrix are 265.6, 1.66, .0023, and the condition number is 1.17 \times 10^5.
(c) The exact solution is x = -1165.01, y = 333.694, z = 499.292; with three-digit rounding, the computed solution is x = -467, y = 134, z = 200. The singular values of the coefficient matrix are 8.1777, 3.3364, .00088, and the condition number is 9293.
♠ 8.5.6. (a) The 2 \times 2 Hilbert matrix has singular values 1.2676, .0657 and condition number 19.2815. The 3 \times 3 Hilbert matrix has singular values 1.4083, .1223, .0027 and condition number 524.057. The 4 \times 4 Hilbert matrix has singular values 1.5002, .1691, .006738, .0000967 and condition number 15513.7.
(b) The 5 \times 5 Hilbert matrix has condition number 4.7661 \times 10^5; the 6 \times 6 Hilbert matrix has condition number 1.48294 \times 10^7.
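The Hilbert-matrix condition numbers of 8.5.6 can be reproduced directly, since the condition number is the ratio of largest to smallest singular value (8.5.4):

```python
import numpy as np

def hilbert(n):
    """The n x n Hilbert matrix, H_ij = 1/(i + j - 1) with 1-based indices."""
    i, j = np.indices((n, n))
    return 1.0 / (i + j + 1)

# Condition number = largest singular value / smallest singular value.
for n in (2, 3, 4):
    s = np.linalg.svd(hilbert(n), compute_uv=False)
    print(n, round(s[0] / s[-1], 1))
```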
8.5.7. Let A = v \in \mathbb R^n be the matrix (column vector) in question. (a) It has one singular value: \|v\|; (b) P = \frac{v}{\|v\|}, \Sigma = (\|v\|), a 1 \times 1 matrix, and Q = (1); (c) v^+ = \frac{v^T}{\|v\|^2}.
8.5.8. Let A = v^T, where v \in \mathbb R^n, be the matrix (row vector) in question. (a) It has one singular value: \|v\|; (b) P = (1), \Sigma = (\|v\|), Q = \frac{v}{\|v\|}; (c) v^+ = \frac{v}{\|v\|^2}.
8.5.9. Almost true, with one exception: the zero matrix.
8.5.10. Since S^2 = K = A^T A, the eigenvalues \lambda of K are the squares, \lambda = \sigma^2, of the eigenvalues \sigma of S. Moreover, since S \ge 0, its eigenvalues are all non-negative, so \sigma = +\sqrt\lambda, and, by definition, the nonzero \sigma > 0 are the singular values of A.
8.5.11. True. If A = P Σ QT is the singular value decomposition of A, then the transposed
equation AT = Q Σ PT gives the singular value decomposition of AT , and so the diagonal
entries of Σ are also the singular values of AT .
♦ 8.5.12. Since A is nonsingular, so is K = A^T A, and hence all its eigenvalues are nonzero. Thus Q, whose columns are the orthonormal eigenvector basis of K, is a square orthogonal matrix, as is P. Therefore, the singular value decomposition of the inverse matrix is A^{-1} = Q^{-T}\Sigma^{-1}P^{-1} = Q\Sigma^{-1}P^T. The diagonal entries of \Sigma^{-1}, which are the singular values of A^{-1}, are the reciprocals of the diagonal entries of \Sigma. Finally, \kappa(A^{-1}) = \sigma_1/\sigma_n = \kappa(A).
♦ 8.5.13. (a) When A is nonsingular, all matrices in its singular value decomposition (8.40) are square. Thus, we can compute \det A = \det P \det\Sigma \det Q^T = \pm\det\Sigma = \pm\sigma_1\sigma_2\cdots\sigma_n, since the determinant of an orthogonal matrix is \pm 1. The result follows upon taking absolute values of this equation and using the fact that the product of the singular values is non-negative.
(b) No; even simple nondiagonal examples show this is false.
(c) Numbering the singular values in decreasing order, so that \sigma_k \ge \sigma_n for all k, we conclude that 10^{-k} > |\det A| = \sigma_1\sigma_2\cdots\sigma_n \ge \sigma_n^n, and the result follows by taking the nth root.
(d) Not necessarily, since all the singular values could be very small but equal, and in this case the condition number would be 1.
(e) The diagonal matrix with entries 10^k and 10^{-k} for k \gg 0, or more generally, any 2 \times 2 matrix with singular values 10^k and 10^{-k}, has condition number 10^{2k}.
8.5.14. False. For example, the diagonal matrix with entries 2 \cdot 10^k and 10^{-k} for k \gg 0 has determinant 2 but condition number 2 \cdot 10^{2k}.
8.5.15. False: the singular values are the absolute values of the nonzero eigenvalues.
8.5.16. False. For example, U = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} has singular values \sqrt{3 \pm \sqrt 5}.
8.5.17. False, unless A is symmetric or, more generally, normal, meaning that A^T A = A A^T. For example, the singular values of A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} are \sqrt{\frac32 \pm \frac{\sqrt 5}{2}}, while the singular values of A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} are \sqrt{3 \pm 2\sqrt 2}.
8.5.18. False. This is only true if S is an orthogonal matrix.
♥ 8.5.19. (a) If \|x\| = 1, then y = A x satisfies the equation y^T B y = 1, where B = A^{-T}A^{-1} = P\Sigma^{-2}P^T. Thus, by Exercise 8.4.24, the principal axes of the ellipse are the columns of P, and the semi-axes are the reciprocals of the square roots of the diagonal entries of \Sigma^{-2}, which are precisely the singular values \sigma_i.
(b) If A is symmetric (and nonsingular), P = Q is the orthogonal eigenvector matrix, and so the columns of P coincide with the eigenvectors of A. Moreover, the singular values \sigma_i = |\lambda_i| are the absolute values of its eigenvalues.
(c) From elementary geometry, the area of an ellipse equals \pi times the product of its semi-axes. Thus, area E = \pi\sigma_1\sigma_2 = \pi|\det A|, using Exercise 8.5.13.
(d) (i) principal axes: \begin{pmatrix} 2 \\ 1+\sqrt 5 \end{pmatrix}, \begin{pmatrix} 2 \\ 1-\sqrt 5 \end{pmatrix}; semi-axes: \sqrt{\frac32 + \frac{\sqrt 5}{2}}, \sqrt{\frac32 - \frac{\sqrt 5}{2}}; area: \pi.
(ii) principal axes: \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ -1 \end{pmatrix} (but any orthogonal basis of \mathbb R^2 will also do); semi-axes: \sqrt 5, \sqrt 5 (it's a circle); area: 5\pi.
(iii) principal axes: \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -3 \end{pmatrix}; semi-axes: 3\sqrt 5, \sqrt 5; area: 15\pi.
(e) If A = O, then E = \{0\} is a point. Otherwise, \operatorname{rank} A = 1 and its singular value decomposition is A = \sigma_1 p_1 q_1^T, where A q_1 = \sigma_1 p_1. Then E is a line segment in the direction of p_1 of length 2\sigma_1.
8.5.20. \left(\frac{10}{11}u + \frac{3}{11}v\right)^2 + \left(\frac{3}{11}u + \frac{2}{11}v\right)^2 = 1, or 109u^2 + 72uv + 13v^2 = 121. Since A is symmetric, the semi-axes are the eigenvalues, which are the same as the singular values, namely 11, 1, so the ellipse is very long and thin; the principal axes are the eigenvectors, \begin{pmatrix} -1 \\ 3 \end{pmatrix}, \begin{pmatrix} 3 \\ 1 \end{pmatrix}; the area is 11\pi.
8.5.21. (a) In view of the singular value decomposition A = P\Sigma Q^T, the set E is obtained by first rotating the unit sphere according to Q^T, which doesn't change it, then stretching it along the coordinate axes into an ellipsoid according to \Sigma, and then rotating it according to P, which aligns its principal axes with the columns of P. The equation is
(65u + 43v - 2w)^2 + (43u + 65v + 2w)^2 + (-2u + 2v + 20w)^2 = 216^2, or
1013u^2 + 1862uv + 1013v^2 - 28uw + 28vw + 68w^2 = 7776.
(b) The semi-axes are the eigenvalues: 12, 9, 2; the principal axes are the eigenvectors: (1, -1, 2)^T, (-1, 1, 1)^T, (1, 1, 0)^T.
(c) Since the unit sphere has volume \frac43\pi, the volume of E is \frac43\pi \det A = 288\pi.
♦ 8.5.22. (a) \|A u\|^2 = (A u)^T A u = u^T K u, where K = A^T A. According to Theorem 8.28, \max\{u^T K u \mid \|u\| = 1\} is the largest eigenvalue \lambda_1 of K = A^T A, hence the maximum value of \|A u\| = \sqrt{u^T K u} is \sqrt{\lambda_1} = \sigma_1.
(b) This is true if \operatorname{rank} A = n by the same reasoning, but false if \ker A \ne 0, since then the minimum is 0, but, according to our definition, singular values are always nonzero.
(c) The kth singular value \sigma_k is obtained by maximizing \|A u\| over all unit vectors which are orthogonal to the first k-1 singular vectors.
♦ 8.5.23. Let \lambda_1 be the maximal eigenvalue, and let u_1 be a corresponding unit eigenvector. By Exercise 8.5.22, \sigma_1 \ge \|A u_1\| = |\lambda_1|.
8.5.24. By Exercise 8.5.22, the numerator is the largest singular value, while the denominator is the smallest, and so the ratio is the condition number.
8.5.25. (a) \begin{pmatrix} \frac{1}{20} & -\frac{3}{20} \\ -\frac{1}{20} & \frac{3}{20} \end{pmatrix}, (b) \begin{pmatrix} \frac15 & \frac25 \\ -\frac25 & \frac15 \end{pmatrix}, (c) \begin{pmatrix} \frac12 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}, (d) \begin{pmatrix} 0 & 0 & 0 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix},
(e) \begin{pmatrix} \frac{1}{15} & -\frac{2}{15} \\ -\frac{1}{15} & \frac{2}{15} \\ \frac{1}{15} & -\frac{2}{15} \end{pmatrix}, (f) \begin{pmatrix} \frac{1}{140} & \frac{1}{70} & \frac{3}{140} \\ \frac{3}{140} & \frac{3}{70} & \frac{9}{140} \end{pmatrix}, (g) \begin{pmatrix} \frac19 & -\frac19 & \frac29 \\ \frac{5}{18} & \frac29 & \frac{1}{18} \\ \frac{1}{18} & \frac49 & -\frac{7}{18} \end{pmatrix}.
8.5.26.
(a) A = \begin{pmatrix} 1 & 1 \\ 3 & 3 \end{pmatrix}, A^+ = \begin{pmatrix} \frac{1}{20} & \frac{3}{20} \\ \frac{1}{20} & \frac{3}{20} \end{pmatrix}, x^\star = A^+\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} -\frac14 \\ -\frac14 \end{pmatrix};
(b) A = \begin{pmatrix} 1 & -3 \\ 2 & 1 \\ 1 & 1 \end{pmatrix}, A^+ = \begin{pmatrix} \frac16 & \frac13 & \frac16 \\ -\frac{3}{11} & \frac{1}{11} & \frac{1}{11} \end{pmatrix}, x^\star = A^+\begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -\frac{7}{11} \end{pmatrix};
(c) A = \begin{pmatrix} 1 & 1 & 1 \\ 2 & -1 & 1 \end{pmatrix}, A^+ = \begin{pmatrix} \frac17 & \frac27 \\ \frac47 & -\frac{5}{14} \\ \frac27 & \frac{1}{14} \end{pmatrix}, x^\star = A^+\begin{pmatrix} 5 \\ 2 \end{pmatrix} = \begin{pmatrix} \frac97 \\ \frac{15}{7} \\ \frac{11}{7} \end{pmatrix}.
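These pseudoinverse computations can be checked with NumPy's built-in pseudoinverse; a sketch for 8.5.26(a):

```python
import numpy as np

# 8.5.26(a): minimum-norm least-squares solution via the pseudoinverse.
A = np.array([[1., 1.],
              [3., 3.]])
b = np.array([1., -2.])

Aplus = np.linalg.pinv(A)
x = Aplus @ b
print(np.allclose(Aplus, [[1/20, 3/20], [1/20, 3/20]]))  # True: A^+ as in the text
print(np.allclose(x, [-0.25, -0.25]))                    # True: x* = (-1/4, -1/4)
```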
♥ 8.5.27. We repeatedly use the fact that the columns of P, Q are orthonormal, and so P^T P = I, Q^T Q = I.
(a) Since A^+ = Q\Sigma^{-1}P^T is the singular value decomposition of A^+, we have (A^+)^+ = P(\Sigma^{-1})^{-1}Q^T = P\Sigma Q^T = A.
(b) A A^+ A = (P\Sigma Q^T)(Q\Sigma^{-1}P^T)(P\Sigma Q^T) = P\Sigma\Sigma^{-1}\Sigma Q^T = P\Sigma Q^T = A.
(c) A^+ A A^+ = (Q\Sigma^{-1}P^T)(P\Sigma Q^T)(Q\Sigma^{-1}P^T) = Q\Sigma^{-1}\Sigma\Sigma^{-1}P^T = Q\Sigma^{-1}P^T = A^+. Or, you can use the fact that (A^+)^+ = A.
(d) (A A^+)^T = (Q\Sigma^{-1}P^T)^T(P\Sigma Q^T)^T = P(\Sigma^{-1})^T Q^T Q\Sigma^T P^T = P(\Sigma^{-1})^T\Sigma^T P^T = P P^T = P\Sigma^{-1}\Sigma P^T = (P\Sigma Q^T)(Q\Sigma^{-1}P^T) = A A^+.
(e) This follows from part (d), since (A^+)^+ = A.
8.5.28. In general, we know that x^\star = A^+ b is the vector of minimum norm that minimizes the least squares error \|A x - b\|. In particular, if b \in \operatorname{rng} A, so that b = A x_0 for some x_0, then the minimum least squares error is 0 = \|A x_0 - b\|. If \ker A = 0, then the solution is unique, and so x^\star = x_0; otherwise, x^\star \in \operatorname{corng} A is the solution to A x = b of minimum norm.
8.6.1.
(a) U = \begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix}, \Delta = \begin{pmatrix} 2 & -2 \\ 0 & 2 \end{pmatrix};
(b) U = \begin{pmatrix} \frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \\ -\frac{1}{\sqrt 2} & \frac{1}{\sqrt 2} \end{pmatrix}, \Delta = \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix};
(c) U = \begin{pmatrix} \frac{3}{\sqrt{13}} & \frac{2}{\sqrt{13}} \\ -\frac{2}{\sqrt{13}} & \frac{3}{\sqrt{13}} \end{pmatrix}, \Delta = \begin{pmatrix} 2 & 15 \\ 0 & -1 \end{pmatrix};
(d) U = \begin{pmatrix} \frac{1+3i}{\sqrt{14}} & \frac{\sqrt 2}{\sqrt 7} \\ -\frac{\sqrt 2}{\sqrt 7} & \frac{1-3i}{\sqrt{14}} \end{pmatrix}, \Delta = \begin{pmatrix} 3i & -2-3i \\ 0 & -3i \end{pmatrix};
(e) U = \begin{pmatrix} -\frac{1}{\sqrt 5} & \frac{4}{3\sqrt 5} & \frac23 \\ 0 & \frac{\sqrt 5}{3} & -\frac23 \\ \frac{2}{\sqrt 5} & \frac{2}{3\sqrt 5} & \frac13 \end{pmatrix}, \Delta = \begin{pmatrix} -2 & -1 & \frac{22}{\sqrt 5} \\ 0 & 1 & -\frac{9}{\sqrt 5} \\ 0 & 0 & 1 \end{pmatrix};
(f) U = \begin{pmatrix} \frac{1}{\sqrt 2} & -\frac{1+i}{2\sqrt 2} & \frac12 \\ 0 & \frac{1}{\sqrt 2} & \frac{1-i}{2} \\ \frac{1}{\sqrt 2} & \frac{1+i}{2\sqrt 2} & -\frac12 \end{pmatrix}, \Delta = \begin{pmatrix} -1 & 1 & \frac{1-i}{\sqrt 2} \\ 0 & i & -\sqrt 2 + i\sqrt 2 \\ 0 & 0 & -i \end{pmatrix}.
8.6.2. If U is real, then U^\dagger = U^T is the same as its transpose, and so (8.45) reduces to U^T U = I, which is the condition that U be an orthogonal matrix.
♦ 8.6.3. If U_1^\dagger U_1 = I = U_2^\dagger U_2, then (U_1 U_2)^\dagger(U_1 U_2) = U_2^\dagger U_1^\dagger U_1 U_2 = U_2^\dagger U_2 = I, and so U_1 U_2 is also unitary.
♦ 8.6.4. If A is symmetric, its eigenvalues are real, and hence its Schur Decomposition is A = Q\Delta Q^T, where Q is an orthogonal matrix. But A^T = (Q\Delta Q^T)^T = Q\Delta^T Q^T = A implies \Delta^T = \Delta is a symmetric upper triangular matrix, which implies that \Delta = \Lambda is a diagonal matrix with the eigenvalues of A along its diagonal.
♥ 8.6.5.
(a) If A is real, A† = A^T, and so if A = A^T then A† A = A² = A A†.
(b) If A is unitary, then A† A = I = A A†.
(c) Every real orthogonal matrix is unitary, so this follows from part (b).
(d) When A is upper triangular, the i-th diagonal entry of the matrix equation A† A = A A† is |a_{ii}|² = Σ_{k=i}^{n} |a_{ik}|², and hence a_{ik} = 0 for all k > i. Therefore A is a diagonal matrix.
(e) Let U = ( u1 u2 … un ) be the corresponding unitary matrix, with U^{-1} = U†. Then A U = U Λ, where Λ is the diagonal eigenvalue matrix, and so A = U Λ U†. Then A A† = U Λ U† U Λ† U† = U Λ Λ† U† = A† A, since Λ Λ† = Λ† Λ as both are diagonal.
(f) Let A = U ∆ U† be its Schur Decomposition. Then, as in part (e), A A† = U ∆ ∆† U†, while A† A = U ∆† ∆ U†. Thus, A is normal if and only if ∆ is; but part (d) says this happens if and only if ∆ = Λ is diagonal, and hence A = U Λ U† satisfies the conditions of part (e).
(g) If and only if it is symmetric. Indeed, by the argument in part (f), A = Q Λ Q^T where Q is real and orthogonal, which is just the spectral factorization of A = A^T.
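The normality criterion A A† = A† A from this exercise can be illustrated numerically; a NumPy sketch using our own small examples (not from the text):

```python
import numpy as np

# Symmetric and orthogonal matrices are normal; a non-diagonal upper
# triangular matrix is not (consistent with part (d) above).
def is_normal(A):
    return np.allclose(A @ A.conj().T, A.conj().T @ A)

S = np.array([[2.0, 1.0], [1.0, 3.0]])   # symmetric
Q = np.array([[0.0, -1.0], [1.0, 0.0]])  # orthogonal (rotation)
T = np.array([[1.0, 1.0], [0.0, 2.0]])   # triangular, not diagonal

assert is_normal(S) and is_normal(Q) and not is_normal(T)
```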
8.6.6.
(a) One 2×2 Jordan block; eigenvalue 2; eigenvector e1.
(b) Two 1×1 Jordan blocks; eigenvalues −3, 6; eigenvectors e1, e2.
(c) One 1×1 and one 2×2 Jordan block; eigenvalue 1; eigenvectors e1, e2.
(d) One 3×3 Jordan block; eigenvalue 0; eigenvector e1.
(e) One 1×1, one 2×2, and one 1×1 Jordan block; eigenvalues 4, 3, 2; eigenvectors e1, e2, e4.
8.6.7.
[ 2, 0, 0, 0; 0, 2, 0, 0; 0, 0, 2, 0; 0, 0, 0, 2 ], [ 2, 1, 0, 0; 0, 2, 0, 0; 0, 0, 2, 0; 0, 0, 0, 2 ],
[ 2, 0, 0, 0; 0, 2, 1, 0; 0, 0, 2, 0; 0, 0, 0, 2 ], [ 2, 0, 0, 0; 0, 2, 0, 0; 0, 0, 2, 1; 0, 0, 0, 2 ],
[ 2, 1, 0, 0; 0, 2, 0, 0; 0, 0, 2, 1; 0, 0, 0, 2 ], [ 2, 1, 0, 0; 0, 2, 1, 0; 0, 0, 2, 0; 0, 0, 0, 2 ],
[ 2, 0, 0, 0; 0, 2, 1, 0; 0, 0, 2, 1; 0, 0, 0, 2 ], [ 2, 1, 0, 0; 0, 2, 1, 0; 0, 0, 2, 1; 0, 0, 0, 2 ].
8.6.8.
[ 2, 0, 0; 0, 2, 0; 0, 0, 5 ], [ 2, 0, 0; 0, 5, 0; 0, 0, 2 ], [ 5, 0, 0; 0, 2, 0; 0, 0, 2 ],
[ 2, 0, 0; 0, 5, 0; 0, 0, 5 ], [ 5, 0, 0; 0, 2, 0; 0, 0, 5 ], [ 5, 0, 0; 0, 5, 0; 0, 0, 2 ],
[ 2, 1, 0; 0, 2, 0; 0, 0, 5 ], [ 5, 0, 0; 0, 2, 1; 0, 0, 2 ], [ 2, 0, 0; 0, 5, 1; 0, 0, 5 ],
[ 5, 1, 0; 0, 5, 0; 0, 0, 2 ].
8.6.9.
(a) Eigenvalue: 2. Jordan basis: v1 = ( 1, 0 )^T, v2 = ( 0, 1/3 )^T. Jordan canonical form: [ 2, 1; 0, 2 ].
(b) Eigenvalue: −3. Jordan basis: v1 = ( 1, 2 )^T, v2 = ( 1/2, 0 )^T. Jordan canonical form: [ −3, 1; 0, −3 ].
(c) Eigenvalue: 1. Jordan basis: v1 = ( 1, 0, 0 )^T, v2 = ( 0, 1, 0 )^T, v3 = ( 0, −1, 1 )^T. Jordan canonical form: [ 1, 1, 0; 0, 1, 1; 0, 0, 1 ].
(d) Eigenvalue: −3. Jordan basis: v1 = ( 1, 0, 1 )^T, v2 = ( 0, 1, 0 )^T, v3 = ( 1, 0, 0 )^T. Jordan canonical form: [ −3, 1, 0; 0, −3, 1; 0, 0, −3 ].
(e) Eigenvalues: −2, 0. Jordan basis: v1 = ( −1, 0, 1 )^T, v2 = ( 0, −1, 0 )^T, v3 = ( 0, −1, 1 )^T. Jordan canonical form: [ −2, 1, 0; 0, −2, 0; 0, 0, 0 ].
(f) Eigenvalue: 2. Jordan basis: v1 = ( −2, 1, 1, 0 )^T, v2 = ( 1, 0, 0, 0 )^T, v3 = ( −1, −1/2, 1/2, 0 )^T, v4 = ( 0, 0, 0, −1/2 )^T. Jordan canonical form: [ 2, 0, 0, 0; 0, 2, 1, 0; 0, 0, 2, 1; 0, 0, 0, 2 ].
8.6.10. J^{-1}_{λ,n} = [ λ^{-1}, −λ^{-2}, λ^{-3}, −λ^{-4}, …, −(−λ)^{-n};
0, λ^{-1}, −λ^{-2}, λ^{-3}, …, −(−λ)^{-(n−1)};
0, 0, λ^{-1}, −λ^{-2}, …, −(−λ)^{-(n−2)};
0, 0, 0, λ^{-1}, …, −(−λ)^{-(n−3)};
…;
0, 0, 0, 0, …, λ^{-1} ].
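The alternating-sign pattern of the inverse Jordan block can be checked numerically; a NumPy sketch (the values λ = 2, n = 5 are our own example):

```python
import numpy as np

# The inverse of the Jordan block J_{lam,n} is upper triangular with
# entries (J^{-1})_{i,i+k} = (-1)^k * lam^{-(k+1)}, matching the
# matrix displayed above.
lam, n = 2.0, 5
J = lam * np.eye(n) + np.diag(np.ones(n - 1), 1)   # Jordan block J_{2,5}
Jinv = np.linalg.inv(J)

for i in range(n):
    for k in range(n - i):
        assert np.isclose(Jinv[i, i + k], (-1)**k * lam**(-(k + 1)))
```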
8.6.11. True. All Jordan chains have length one, and so consist only of eigenvectors.
♦ 8.6.12. Not in general. If an eigenvalue has multiplicity ≤ 3, then you can tell the size of its Jordan blocks by the number of linearly independent eigenvectors it has: if it has 3 linearly independent eigenvectors, then there are three 1×1 Jordan blocks; if it has 2 linearly independent eigenvectors, then there are two Jordan blocks, of sizes 1×1 and 2×2; while if it has only one linearly independent eigenvector, then it corresponds to a single 3×3 Jordan block. But if the multiplicity of the eigenvalue is 4, and there are only 2 linearly independent eigenvectors, then it could have two 2×2 blocks, or a 1×1 and a 3×3 block. Distinguishing between the two cases is a difficult computational problem.
8.6.13. True. If zj = cwj , then A zj = cAwj = c λwj + cwj−1 = λzj + zj−1.
8.6.14. False. Indeed, the square of a Jordan matrix is not necessarily a Jordan matrix, e.g., [ 1, 1, 0; 0, 1, 1; 0, 0, 1 ]² = [ 1, 2, 1; 0, 1, 2; 0, 0, 1 ].
♦ 8.6.15.
(a) Let A = [ 0, 1; 0, 0 ]. Then e2 is an eigenvector of A² = O, but is not an eigenvector of A.
(b) Suppose A = S J S^{-1}, where J is the Jordan canonical form of A. Then A² = S J² S^{-1}. Now, even though J² is not necessarily a Jordan matrix, cf. Exercise 8.6.14, since J is upper triangular with the eigenvalues on the diagonal, J² is also upper triangular, and its diagonal entries, which are its eigenvalues and the eigenvalues of A², are the squares of the diagonal entries of J.
8.6.16. Not necessarily. A simple example is A = [ 1, 1; 0, 0 ], B = [ 0, 1; 0, 0 ], so A B = [ 0, 1; 0, 0 ], whereas B A = [ 0, 0; 0, 0 ].
♦ 8.6.17. First, since Jλ,n is upper triangular, its eigenvalues are its diagonal entries, and hence λ
is the only eigenvalue. Moreover, v = ( v1, v2, . . . , vn )T is an eigenvector if and only if
(Jλ,n − λ I )v = ( v2, . . . , vn, 0 )T = 0. This requires v2 = · · · = vn = 0, and hence v must
be a scalar multiple of e1.
♦ 8.6.18.
(a) Observe that J^k_{0,n} is the matrix with 1's along the k-th upper diagonal, i.e., in positions (i, k + i). In particular, when k = n, all entries are 0, and so J^n_{0,n} = O.
(b) Since a Jordan matrix is upper triangular, the diagonal entries of J^k are the k-th powers of the diagonal entries of J, and hence J^m = O requires that all its diagonal entries be zero. Moreover, J^k is a block matrix whose blocks are the k-th powers of the original Jordan blocks, and hence J^m = O, where m is the maximal size of a Jordan block.
(c) If A = S J S^{-1}, then A^k = S J^k S^{-1}, and hence A^k = O if and only if J^k = O.
(d) This follows from parts (b–c).
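The shifting behavior of part (a) is easy to check numerically; a NumPy sketch for n = 4 (our own example size):

```python
import numpy as np

# Powers of the nilpotent Jordan block J_{0,n} push the superdiagonal
# of 1's upward one step at a time, and J^n = O.
n = 4
J = np.diag(np.ones(n - 1), 1)                     # J_{0,4}

for k in range(1, n):
    assert np.allclose(np.linalg.matrix_power(J, k),
                       np.diag(np.ones(n - k), k))
assert np.allclose(np.linalg.matrix_power(J, n), np.zeros((n, n)))
```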
8.6.19. (a) Since J^k is upper triangular, Exercise 8.3.12 says it is complete if and only if it is a diagonal matrix, which is the case if and only if J is diagonal, or J^k = O. (b) Write A = S J S^{-1} in Jordan canonical form. Then A^k = S J^k S^{-1} is complete if and only if J^k is complete, so either J is diagonal, whence A is complete, or J^k = O and so A^k = O.
♥ 8.6.20.
(a) If D = diag(d1, …, dn), then p_D(λ) = ∏_{i=1}^{n} (λ − d_i). Now D − d_i I is a diagonal matrix with 0 in its i-th diagonal position. The entries of the product p_D(D) = ∏_{i=1}^{n} (D − d_i I) of diagonal matrices are the products of the individual diagonal entries, but each such product has at least one zero factor, and so the result is a diagonal matrix with all 0 diagonal entries, i.e., the zero matrix: p_D(D) = O.
(b) First, according to Exercise 8.2.32, similar matrices have the same characteristic polynomials, and so if A = S D S^{-1} then p_A(λ) = p_D(λ). On the other hand, if p(λ) is any polynomial, then p(S D S^{-1}) = S p(D) S^{-1}. Therefore, if A is complete, we can diagonalize A = S D S^{-1}, and so, by part (a) and the preceding two facts,
p_A(A) = p_A(S D S^{-1}) = S p_A(D) S^{-1} = S p_D(D) S^{-1} = O.
(c) The characteristic polynomial of the upper triangular Jordan block matrix J = J_{µ,n} with eigenvalue µ is p_J(λ) = (λ − µ)^n. Thus, p_J(J) = (J − µ I)^n = J^n_{0,n} = O by Exercise 8.6.18.
(d) The determinant of a (Jordan) block matrix is the product of the determinants of the individual blocks. Moreover, by part (c), substituting J into the product of the characteristic polynomials of its Jordan blocks gives zero in each block, and so the product matrix vanishes.
(e) Same argument as in part (b), using the fact that a matrix and its Jordan canonical form have the same characteristic polynomial.
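This exercise is the Cayley–Hamilton theorem, and it can be spot-checked numerically; a NumPy sketch on a random matrix (our own example):

```python
import numpy as np

# Cayley-Hamilton: substituting A into its own characteristic
# polynomial gives the zero matrix.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
coeffs = np.poly(A)                   # characteristic polynomial coefficients

pA = np.zeros_like(A)
for c in coeffs:                      # Horner evaluation of p at the matrix A
    pA = pA @ A + c * np.eye(4)
assert np.allclose(pA, 0, atol=1e-8)  # p_A(A) = O, up to roundoff
```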
♦ 8.6.21. The n vectors are divided into non-null Jordan chains, say w_{1,k}, …, w_{i_k,k}, satisfying B w_{i,k} = λ_k w_{i,k} + w_{i−1,k} with λ_k ≠ 0 the eigenvalue (and w_{0,k} = 0 by convention), along with the null Jordan chains, say y_{1,l}, …, y_{i_l,l}, y_{i_l+1,l}, supplemented by one additional vector, satisfying B y_{i,l} = y_{i−1,l}, and, in addition, the null vectors z_1, …, z_{n−r−k} ∈ ker B \ rng B. Suppose some linear combination vanishes:
Σ_k [ a_{1,k} w_{1,k} + … + a_{i_k,k} w_{i_k,k} ] + Σ_l [ b_{1,l} y_{1,l} + … + b_{i_l,l} y_{i_l,l} + b_{i_l+1,l} y_{i_l+1,l} ] + ( c_1 z_1 + … + c_{r−k} z_{r−k} ) = 0.
Multiplying by B and using the Jordan chain equations, we find
Σ_k [ (λ_k a_{1,k} + a_{2,k}) w_{1,k} + … + (λ_k a_{i_k−1,k} + a_{i_k,k}) w_{i_k−1,k} + λ_k a_{i_k,k} w_{i_k,k} ] + Σ_l [ b_{2,l} y_{1,l} + … + b_{i_l+1,l} y_{i_l,l} ] = 0.
Since we started with a Jordan basis for W = rng B, by linear independence, their coefficients in the preceding equation must all vanish, which implies that a_{1,k} = … = a_{i_k,k} = b_{2,l} = … = b_{i_l+1,l} = 0. Substituting this result back into the original equation, we are left with
Σ_l b_{1,l} y_{1,l} + ( c_1 z_1 + … + c_{r−k} z_{r−k} ) = 0,
which implies all b_{1,l} = c_j = 0, since the remaining vectors are also linearly independent.
Solutions — Chapter 9
9.1.1.
(i) (a) u(t) = c1 cos 2t + c2 sin 2t. (b) du/dt = [ 0, 1; −4, 0 ] u. (c) u(t) = ( c1 cos 2t + c2 sin 2t, −2c1 sin 2t + 2c2 cos 2t )^T. (d), (e) [phase portrait and solution graphs]
(ii) (a) u(t) = c1 e^{−2t} + c2 e^{2t}. (b) du/dt = [ 0, 1; 4, 0 ] u. (c) u(t) = ( c1 e^{−2t} + c2 e^{2t}, −2c1 e^{−2t} + 2c2 e^{2t} )^T. (d), (e) [phase portrait and solution graphs]
(iii) (a) u(t) = c1 e^{−t} + c2 t e^{−t}. (b) du/dt = [ 0, 1; −1, −2 ] u. (c) u(t) = ( c1 e^{−t} + c2 t e^{−t}, (c2 − c1) e^{−t} − c2 t e^{−t} )^T. (d), (e) [phase portrait and solution graphs]
(iv) (a) u(t) = c1 e^{−t} + c2 e^{−3t}. (b) du/dt = [ 0, 1; −3, −4 ] u. (c) u(t) = ( c1 e^{−t} + c2 e^{−3t}, −c1 e^{−t} − 3c2 e^{−3t} )^T. (d), (e) [phase portrait and solution graphs]
(v) (a) u(t) = c1 e^{t} cos 3t + c2 e^{t} sin 3t. (b) du/dt = [ 0, 1; −10, 2 ] u. (c) u(t) = ( c1 e^{t} cos 3t + c2 e^{t} sin 3t, (c1 + 3c2) e^{t} cos 3t − (3c1 − c2) e^{t} sin 3t )^T. (d), (e) [phase portrait and solution graphs]
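The closed-form solution of part (i) can be checked against the matrix exponential; a NumPy sketch using a truncated power series for e^{tA} (the particular c1, c2, t values are arbitrary):

```python
import numpy as np

# For du/dt = A u with A = [[0, 1], [-4, 0]], the solution is
# u1 = c1 cos 2t + c2 sin 2t, u2 = -2 c1 sin 2t + 2 c2 cos 2t,
# which must equal e^{tA} u(0).
A = np.array([[0.0, 1.0], [-4.0, 0.0]])

def expm_series(M, terms=40):
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k            # accumulates M^k / k!
        out = out + term
    return out

c1, c2, t = 1.0, 0.5, 0.7
u0 = np.array([c1, 2 * c2])            # u(0) from the formula above
u_exact = np.array([c1 * np.cos(2*t) + c2 * np.sin(2*t),
                    -2*c1 * np.sin(2*t) + 2*c2 * np.cos(2*t)])
assert np.allclose(expm_series(t * A) @ u0, u_exact)
```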
9.1.2.
(a) du/dt = [ 0, 1, 0; 0, 0, 1; −12, −4, −3 ] u.
(b) u(t) = c1 e^{−3t} + c2 cos 2t + c3 sin 2t, u(t) = ( c1 e^{−3t} + c2 cos 2t + c3 sin 2t, −3c1 e^{−3t} − 2c2 sin 2t + 2c3 cos 2t, 9c1 e^{−3t} − 4c2 cos 2t − 4c3 sin 2t )^T,
(c) dimension = 3.
9.1.3. Set u1 = u, u2 = u', u3 = v, u4 = v', and u(t) = ( u1(t), u2(t), u3(t), u4(t) )^T. Then du/dt = [ 0, 1, 0, 0; c, a, d, b; 0, 0, 0, 1; r, p, s, q ] u.
9.1.4. False; by direct computation, we find that the functions u1(t), u2(t) satisfy a quadratic equation α u1² + β u1 u2 + γ u2² + δ u1 + ε u2 = c if and only if c1 = c2 = 0.
♦ 9.1.5.
(a) Use the chain rule to compute dv/dt = −(du/dt)(−t) = −A u(−t) = −A v.
(b) Since v(t) = u(−t) parametrizes the same curve as u(t), but in the reverse direction.
(c) (i) dv/dt = [ 0, −1; 4, 0 ] v; solution: v(t) = ( c1 cos 2t − c2 sin 2t, 2c1 sin 2t + 2c2 cos 2t )^T.
(ii) dv/dt = [ 0, −1; −4, 0 ] v; solution: v(t) = ( c1 e^{2t} + c2 e^{−2t}, −2c1 e^{2t} + 2c2 e^{−2t} )^T.
(iii) dv/dt = [ 0, −1; 1, 2 ] v; solution: v(t) = ( c1 e^{t} − c2 t e^{t}, (c2 − c1) e^{t} + c2 t e^{t} )^T.
(iv) dv/dt = [ 0, −1; 3, 4 ] v; solution: v(t) = ( c1 e^{t} + c2 e^{3t}, −c1 e^{t} − 3c2 e^{3t} )^T.
(v) dv/dt = [ 0, −1; 10, −2 ] v; solution: v(t) = ( c1 e^{−t} cos 3t − c2 e^{−t} sin 3t, (c1 + 3c2) e^{−t} cos 3t + (3c1 − c2) e^{−t} sin 3t )^T.
(d) Time reversal changes u1(t) = u(t) into v1(t) = u1(−t) = u(−t), and u2(t) = u'(t) into v2(t) = u2(−t) = u'(−t) = −v'(t). The net effect is to change the sign of the coefficient of the first derivative term, so u'' + a u' + b u = 0 becomes v'' − a v' + b v = 0.
9.1.6. (a) Use the chain rule to compute (d/dt) v(t) = 2 (du/dt)(2t) = 2 A u(2t) = 2 A v(t), and so the coefficient matrix is multiplied by 2. (b) The solution trajectories are the same, but the solution moves twice as fast (in the same direction) along them.
♦ 9.1.7.
(a) This is proved by induction, the case k = 0 being assumed. If d^{k+1}u/dt^{k+1} = (d/dt)( d^k u/dt^k ) = A d^k u/dt^k, then differentiating the equation with respect to t yields (d/dt)( d^{k+1}u/dt^{k+1} ) = (d/dt)( A d^k u/dt^k ) = A d^{k+1}u/dt^{k+1}, which proves the induction step.
(b) This is also proved by induction, the case k = 1 being assumed. If true for k, then d^{k+1}u/dt^{k+1} = (d/dt)( d^k u/dt^k ) = (d/dt)( A^k u ) = A^k du/dt = A^k A u = A^{k+1} u.
9.1.8. False. If u' = A u, then the speed along the trajectory at the point u(t) is ‖A u(t)‖. So the speed is constant only if ‖A u(t)‖ is constant. (Later, in Lemma 9.31, this will be shown to correspond to A being a skew-symmetric matrix.)
♠ 9.1.9. In all cases, the t axis is plotted vertically, and the three-dimensional solution curves ( u(t), u'(t), t )^T project to the phase plane trajectories ( u(t), u'(t) )^T.
(i) The solution curves are helices going around the t axis.
(ii) Hyperbolic curves going away from the t axis in both directions.
(iii) The solution curves converge on the t axis as t → ∞.
(iv) The solution curves converge on the t axis as t → ∞.
(v) The solution curves spiral away from the t axis as t → ∞.
♥ 9.1.10.
(a) Assuming b ≠ 0, we have
v = (1/b) u' − (a/b) u,  v' = ((bc − ad)/b) u + (d/b) u'. (∗)
Differentiating the first equation yields dv/dt = (1/b) u'' − (a/b) u'. Equating this to the right hand side of the second equation leads to the second order differential equation
u'' − (a + d) u' + (ad − bc) u = 0. (∗∗)
(b) If u(t) solves (∗∗), then defining v(t) by the first equation in (∗) yields a solution to the first order system. Vice versa, the first component of any solution ( u(t), v(t) )^T to the system gives a solution u(t) to the second order equation.
(c) (i) u'' + u = 0, hence u(t) = c1 cos t + c2 sin t, v(t) = −c1 sin t + c2 cos t.
(ii) u'' − 2u' + 5u = 0, hence u(t) = c1 e^{t} cos 2t + c2 e^{t} sin 2t, v(t) = (c1 + 2c2) e^{t} cos 2t + (−2c1 + c2) e^{t} sin 2t.
(iii) u'' − u' − 6u = 0, hence u(t) = c1 e^{3t} + c2 e^{−2t}, v(t) = c1 e^{3t} + 6c2 e^{−2t}.
(iv) u'' − 2u = 0, hence u(t) = c1 e^{√2 t} + c2 e^{−√2 t}, v(t) = (√2 − 1) c1 e^{√2 t} − (√2 + 1) c2 e^{−√2 t}.
(v) u'' = 0, hence u(t) = c1 t + c2, v(t) = c1.
(d) For c ≠ 0 we can solve for u = (1/c) v' − (d/c) v, u' = (a/c) v' − ((ad − bc)/c) v, leading to the same second order equation for v, namely, v'' − (a + d) v' + (ad − bc) v = 0.
(e) If b = 0, then u solves a first order linear equation; once we solve the equation, we can then recover v by solving an inhomogeneous first order equation. Although u continues to solve the same second order equation, it is no longer the most general solution, and so the one-to-one correspondence between solutions breaks down.
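The equivalence between the system and the scalar equation means both share the same characteristic roots; a NumPy sketch (the matrix entries are our own example, chosen to reproduce case (ii)):

```python
import numpy as np

# A first-order system with matrix [[a, b], [c, d]] corresponds to the
# scalar equation u'' - (a+d) u' + (ad - bc) u = 0, so the roots of
# that quadratic are exactly the eigenvalues of the matrix.
a, b, c, d = 1.0, 2.0, -2.0, 1.0      # gives u'' - 2u' + 5u = 0
M = np.array([[a, b], [c, d]])

roots = np.roots([1.0, -(a + d), a * d - b * c])
assert np.allclose(np.sort_complex(roots),
                   np.sort_complex(np.linalg.eigvals(M)))
```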
9.1.11. u(t) = (7/5) e^{−5t} + (8/5) e^{5t}, v(t) = −(14/5) e^{−5t} + (4/5) e^{5t}.
9.1.12.
(a) u1(t) = (√10 − 1) c1 e^{(2+√10)t} − (√10 + 1) c2 e^{(2−√10)t}, u2(t) = c1 e^{(2+√10)t} + c2 e^{(2−√10)t};
(b) x1(t) = −c1 e^{−5t} + 3c2 e^{5t}, x2(t) = 3c1 e^{−5t} + c2 e^{5t};
(c) y1(t) = e^{2t}[ c1 cos t − (c1 + c2) sin t ], y2(t) = e^{2t}[ c2 cos t + (2c1 + c2) sin t ];
(d) y1(t) = −c1 e^{−t} − c2 e^{t} − (2/3) c3, y2(t) = c1 e^{−t} − c2 e^{t}, y3(t) = c1 e^{−t} + c2 e^{t} + c3;
(e) x1(t) = 3c1 e^{t} + 2c2 e^{2t} + 2c3 e^{4t}, x2(t) = c1 e^{t} + (1/2) c2 e^{2t}, x3(t) = c1 e^{t} + c2 e^{2t} + c3 e^{4t}.
9.1.13.
(a) u(t) = ( (1/2) e^{2−2t} + (1/2) e^{−2+2t}, −(1/2) e^{2−2t} + (1/2) e^{−2+2t} )^T,
(b) u(t) = ( e^{−t} − 3 e^{3t}, e^{−t} + 3 e^{3t} )^T,
(c) u(t) = ( e^{t} cos √2 t, −(1/√2) e^{t} sin √2 t )^T,
(d) u(t) = ( e^{−t}( 2 − cos √6 t ), e^{−t}( 1 − cos √6 t + √(2/3) sin √6 t ), e^{−t}( 1 − cos √6 t ) )^T,
(e) u(t) = ( −4 − 6 cos t − 9 sin t, 2 + 3 cos t + 6 sin t, −1 − 3 sin t )^T,
(f) u(t) = ( (1/2) e^{2−t} + (1/2) e^{−2+t}, −(1/2) e^{4−2t} + (1/2) e^{−4+2t}, −(1/2) e^{2−t} + (1/2) e^{−2+t}, (1/2) e^{4−2t} + (1/2) e^{−4+2t} )^T,
(g) u(t) = ( −(1/2) e^{−t} + (3/2) cos t − (3/2) sin t, (3/2) e^{−t} − (5/2) cos t + (3/2) sin t, 2 cos t, cos t + sin t )^T.
9.1.14. (a) x(t) = e^{−t} cos t, y(t) = −e^{−t} sin t; (b) [plot]
9.1.15. x(t) = e^{t/2}( cos (√3/2)t − √3 sin (√3/2)t ), y(t) = e^{t/2}( cos (√3/2)t − (1/√3) sin (√3/2)t ), and so at time t = 1, the position is ( x(1), y(1) )^T = ( −1.10719, 0.343028 )^T.
9.1.16. The solution is x(t) = ( c1 cos 2t − c2 sin 2t, c1 sin 2t + c2 cos 2t, c3 e^{−t} )^T. The origin is a stable, but not asymptotically stable, equilibrium point. Fluid particles starting on the xy plane move counterclockwise, at constant speed with angular velocity 2, around circles centered at the origin. Particles starting on the z axis move in to the origin at an exponential rate. All other fluid particles spiral counterclockwise around the z axis, converging exponentially fast to a circle in the xy plane.
9.1.17. The coefficient matrix has eigenvalues λ1 = −5, λ2 = −7, and, since the coefficient matrix is symmetric, orthogonal eigenvectors v1 = ( 1, 1 )^T, v2 = ( −1, 1 )^T. The general solution is
u(t) = c1 e^{−5t} ( 1, 1 )^T + c2 e^{−7t} ( −1, 1 )^T.
For the initial conditions
u(0) = c1 ( 1, 1 )^T + c2 ( −1, 1 )^T = ( 1, 2 )^T,
we can use orthogonality to find
c1 = ⟨ u(0), v1 ⟩ / ‖v1‖² = 3/2,  c2 = ⟨ u(0), v2 ⟩ / ‖v2‖² = 1/2.
Therefore, the solution is
u(t) = (3/2) e^{−5t} ( 1, 1 )^T + (1/2) e^{−7t} ( −1, 1 )^T.
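The orthogonality computation can be checked numerically; a NumPy sketch in which the symmetric coefficient matrix is reconstructed as [[−6, 1], [1, −6]] (an assumption consistent with the stated eigenvalues −5, −7 and eigenvectors (1,1), (−1,1), since the matrix itself is not printed here):

```python
import numpy as np

# Hypothetical coefficient matrix with eigenvalues -5, -7 and
# eigenvectors (1,1), (-1,1); the expansion coefficients for
# u(0) = (1, 2) are c_k = <u0, v_k> / ||v_k||^2.
A = np.array([[-6.0, 1.0], [1.0, -6.0]])
v1, v2 = np.array([1.0, 1.0]), np.array([-1.0, 1.0])
u0 = np.array([1.0, 2.0])

assert np.allclose(A @ v1, -5 * v1) and np.allclose(A @ v2, -7 * v2)
c1 = u0 @ v1 / (v1 @ v1)
c2 = u0 @ v2 / (v2 @ v2)
assert np.isclose(c1, 1.5) and np.isclose(c2, 0.5)
```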
9.1.18.
(a) Eigenvalues: λ1 = 0, λ2 = 1, λ3 = 3; eigenvectors: v1 = ( 1, 1, 1 )^T, v2 = ( 1, 0, −1 )^T, v3 = ( 1, −2, 1 )^T.
(b) By direct computation, v1 · v2 = v1 · v3 = v2 · v3 = 0.
(c) The matrix is positive semi-definite since it has one zero eigenvalue and all the rest are positive.
(d) The general solution is
u(t) = c1 ( 1, 1, 1 )^T + c2 e^{t} ( 1, 0, −1 )^T + c3 e^{3t} ( 1, −2, 1 )^T.
For the given initial conditions
u(0) = c1 ( 1, 1, 1 )^T + c2 ( 1, 0, −1 )^T + c3 ( 1, −2, 1 )^T = ( 1, 2, −1 )^T = u0,
we can use orthogonality to find
c1 = ⟨ u0, v1 ⟩ / ‖v1‖² = 2/3,  c2 = ⟨ u0, v2 ⟩ / ‖v2‖² = 1,  c3 = ⟨ u0, v3 ⟩ / ‖v3‖² = −2/3.
Therefore, the solution is u(t) = ( 2/3 + e^{t} − (2/3) e^{3t}, 2/3 + (4/3) e^{3t}, 2/3 − e^{t} − (2/3) e^{3t} )^T.
9.1.19. The general complex solution to the system is
u(t) = c1 e^{−t} ( −1, 1, 1 )^T + c2 e^{(1+2i)t} ( 1, i, 1 )^T + c3 e^{(1−2i)t} ( 1, −i, 1 )^T.
Substituting into the initial conditions,
u(0) = ( −c1 + c2 + c3, c1 + i c2 − i c3, c1 + c2 + c3 )^T = ( 2, −1, −2 )^T,
we find c1 = −2, c2 = −(1/2) i, c3 = (1/2) i. Thus, we obtain the same solution:
u(t) = −2 e^{−t} ( −1, 1, 1 )^T − (1/2) i e^{(1+2i)t} ( 1, i, 1 )^T + (1/2) i e^{(1−2i)t} ( 1, −i, 1 )^T = ( 2 e^{−t} + e^{t} sin 2t, −2 e^{−t} + e^{t} cos 2t, −2 e^{−t} + e^{t} sin 2t )^T.
9.1.20. Only (d) and (g) are linearly dependent.
♦ 9.1.21. Using the chain rule, (d/dt) ũ(t) = (du/dt)(t − t0) = A u(t − t0) = A ũ(t), and hence ũ(t) solves the differential equation. Moreover, ũ(t0) = u(0) = b has the correct initial conditions. The trajectories are the same curves, but ũ(t) is always ahead of u(t) by an amount t0.
9.1.22. (a) This follows immediately from uniqueness, since they both solve the initial value problem u' = A u, u(t1) = a, which has a unique solution, and so u(t) = ũ(t) for all t; (b) ũ(t) = u(t + t2 − t1) for all t.
9.1.23. We compute du/dt = ( λ1 c1 e^{λ1 t}, …, λn cn e^{λn t} )^T = Λ u. Moreover, the solution is a linear combination of the n linearly independent solutions e^{λi t} e_i, i = 1, …, n.
9.1.24. dv/dt = S du/dt = S A u = S A S^{-1} v = B v.
♦ 9.1.25.
(i) This is an immediate consequence of the preceding two exercises.
(ii) (a) u(t) = [ −1, 1; 1, 1 ] ( c1 e^{−2t}, c2 e^{2t} )^T,
(b) u(t) = [ −1, 1; 1, 1 ] ( c1 e^{3t}, c2 e^{−t} )^T,
(c) u(t) = [ −√2 i, √2 i; 1, 1 ] ( c1 e^{(1+i√2)t}, c2 e^{(1−i√2)t} )^T,
(d) u(t) = [ 2, 1, 1; 1, 1 + i√(2/3), 1 − i√(2/3); 1, 1, 1 ] ( c1 e^{−t}, c2 e^{(−1+i√6)t}, c3 e^{(−1−i√6)t} )^T,
(e) u(t) = [ 4, 3+2i, 3−2i; −2, −2−i, −2+i; 1, 1, 1 ] ( c1, c2 e^{it}, c3 e^{−it} )^T,
(f) u(t) = [ 0, −1, 0, 1; −1, 0, 1, 0; 0, 1, 0, 1; 1, 0, 1, 0 ] ( c1 e^{−2t}, c2 e^{−t}, c3 e^{t}, c4 e^{2t} )^T,
(g) u(t) = [ −1, −1, (3/2)i, −(3/2)i; 1, 3, −1/2 − 2i, −1/2 + 2i; 0, 0, 1+i, 1−i; 0, 0, 1, 1 ] ( c1 e^{t}, c2 e^{−t}, c3 e^{it}, c4 e^{−it} )^T.
9.1.26.
(a) ( c1 e^{2t} + c2 t e^{2t}, c2 e^{2t} )^T,
(b) ( c1 e^{−t} + c2 (1/3 + t) e^{−t}, 3c1 e^{−t} + 3c2 t e^{−t} )^T,
(c) ( c1 e^{−3t} + c2 (1/2 + t) e^{−3t}, 2c1 e^{−3t} + 2c2 t e^{−3t} )^T,
(d) ( c1 e^{−t} + c2 e^{t} + c3 t e^{t}, −c1 e^{−t} − c3 e^{t}, 2c1 e^{−t} + c2 e^{t} + c3 t e^{t} )^T,
(e) ( c1 e^{−3t} + c2 t e^{−3t} + c3 (1 + t²/2) e^{−3t}, c2 e^{−3t} + c3 t e^{−3t}, c1 e^{−3t} + c2 t e^{−3t} + (1/2) c3 t² e^{−3t} )^T,
(f) ( c1 e^{−t} + c2 t e^{−t} + (1/2) c3 t² e^{−t}, c2 e^{−t} + c3 (t − 1) e^{−t}, c3 e^{−t} )^T,
(g) ( c1 e^{3t} + c2 t e^{3t} − (1/4) c3 e^{−t} − (1/4) c4 (t + 1) e^{−t}, c3 e^{−t} + c4 t e^{−t}, c2 e^{3t} − (1/4) c4 e^{−t}, c4 e^{−t} )^T,
(h) ( c1 cos t + c2 sin t + c3 t cos t + c4 t sin t, −c1 sin t + c2 cos t − c3 t sin t + c4 t cos t, c3 cos t + c4 sin t, −c3 sin t + c4 cos t )^T.
9.1.27.
(a) du/dt = [ 2, −1/2; 0, 1 ] u, (b) du/dt = [ −1, 1; −9, −1 ] u, (c) du/dt = [ 0, 0; 1, 0 ] u,
(d) du/dt = [ 1, 1; −5, −1 ] u, (e) du/dt = [ 2, 0, 0; 0, −3, 0; 2, 3, 0 ] u, (f) du/dt = [ 0, 1, 0; −1, 0, 0; 0, 0, 0 ] u,
(g) du/dt = [ 0, 1/2, 1/2; −2, 0, 0; −2, 0, 0 ] u, (h) du/dt = [ 1, 1/2, 0; −1, 1, −1; 0, 1/2, 1 ] u.
9.1.28. (a) No, since neither du_i/dt is a linear combination of u1, u2. Or note that the trajectories described by the solutions cross, violating uniqueness. (b) No, since polynomial solutions of a two-dimensional system can be at most first order in t. (c) No, since a two-dimensional system has at most 2 linearly independent solutions. (d) Yes: u' = [ −1, 0; 0, 1 ] u. (e) Yes: u' = [ 2, 3; −3, 2 ] u. (f) No, since neither du_i/dt is a linear combination of u1, u2. Or note that both solutions have the unit circle as their trajectory, but traverse it in opposite directions, violating uniqueness. (g) Yes: u' = [ 0, 0, 0; 1, 0, 0; −1, 0, 0 ] u. (h) Yes: u' = u. (i) No, since a three-dimensional system has at most 3 linearly independent solutions.
9.1.29. Setting u(t) = ( u(t), u'(t), u''(t) )^T, the first order system is du/dt = [ 0, 1, 0; 0, 0, 1; −12, −4, −3 ] u. The eigenvalues of the coefficient matrix are −3, ±2i, with eigenvectors ( 1, −3, 9 )^T, ( 1, ±2i, −4 )^T, and the resulting solution is u(t) = ( c1 e^{−3t} + c2 cos 2t + c3 sin 2t, −3c1 e^{−3t} − 2c2 sin 2t + 2c3 cos 2t, 9c1 e^{−3t} − 4c2 cos 2t − 4c3 sin 2t )^T, which is the same as that found in Exercise 9.1.2.
9.1.30. Setting u(t) = ( u(t), u'(t), v(t), v'(t) )^T, the first order system is du/dt = [ 0, 1, 0, 0; 1, 1, −1, 0; 0, 0, 0, 1; −1, 0, 1, 1 ] u. The coefficient matrix has eigenvalues −1, 0, 1, 2 and eigenvectors ( 1, −1, −1, 1 )^T, ( 1, 0, 1, 0 )^T, ( 1, 1, 1, 1 )^T, ( 1, 2, −1, −2 )^T. Thus u(t) = ( c1 e^{−t} + c2 + c3 e^{t} + c4 e^{2t}, −c1 e^{−t} + c3 e^{t} + 2c4 e^{2t}, −c1 e^{−t} + c2 + c3 e^{t} − c4 e^{2t}, c1 e^{−t} + c3 e^{t} − 2c4 e^{2t} )^T, whose first and third components give the general solution u(t) = c1 e^{−t} + c2 + c3 e^{t} + c4 e^{2t}, v(t) = −c1 e^{−t} + c2 + c3 e^{t} − c4 e^{2t} to the second order system.
9.1.31. The degree is at most n − 1, and this occurs if and only if A has only one Jordan chain in its Jordan basis.
♦ 9.1.32.
(a) By direct computation,
du_j/dt = λ e^{λt} Σ_{i=1}^{j} ( t^{j−i}/(j−i)! ) w_i + e^{λt} Σ_{i=1}^{j−1} ( t^{j−i−1}/(j−i−1)! ) w_i,
which equals
A u_j = e^{λt} Σ_{i=1}^{j} ( t^{j−i}/(j−i)! ) A w_i = e^{λt} [ ( t^{j−1}/(j−1)! ) w_1 + Σ_{i=2}^{j} ( t^{j−i}/(j−i)! ) ( λ w_i + w_{i−1} ) ].
(b) At t = 0, we have u_j(0) = w_j, and the Jordan chain vectors are linearly independent.
9.1.33.
(a) The equilibrium solution satisfies A u* = −b, and so v(t) = u(t) − u* satisfies v' = u' = A u + b = A( u − u* ) = A v, which is the homogeneous system.
(b) (i) u(t) = −3c1 e^{2t} + c2 e^{−2t} − 1/4, v(t) = c1 e^{2t} + c2 e^{−2t} + 1/4.
(ii) u(t) = −2c1 cos 2t + 2c2 sin 2t − 3, v(t) = c1 sin 2t + c2 cos 2t − 1/2.
9.2.1.
(a) Asymptotically stable: the eigenvalues are −2 ± i;
(b) unstable: the eigenvalues are 1/2 ± (√11/2) i;
(c) asymptotically stable: eigenvalue −3;
(d) stable: the eigenvalues are ±4i;
(e) stable: the eigenvalues are 0, −1, with 0 complete;
(f) unstable: the eigenvalues are 1, −1 ± 2i;
(g) asymptotically stable: the eigenvalues are −1, −2;
(h) unstable: the eigenvalues are −1, 0, with 0 incomplete.
9.2.2. u(t) = e^{−t}[ ( c1 + √(2/3) c2 ) cos √6 t + ( −√(2/3) c1 + c2 ) sin √6 t ] + (1/2) c3 e^{−2t},
v(t) = e^{−t}[ c1 cos √6 t + c2 sin √6 t ] + (1/2) c3 e^{−2t},
w(t) = e^{−t}[ c1 cos √6 t + c2 sin √6 t ] + c3 e^{−2t}.
The system is asymptotically stable because all terms in the solution are exponentially decreasing as t → ∞.
9.2.3.
(a) u' = −2u, v' = −2v, with solution u(t) = c1 e^{−2t}, v(t) = c2 e^{−2t}.
(b) u' = −v, v' = −u, with solution u(t) = c1 e^{t} + c2 e^{−t}, v(t) = −c1 e^{t} + c2 e^{−t}.
(c) u' = −8u + 2v, v' = 2u − 2v, with solution u(t) = −c1 ((√13 + 3)/2) e^{−(5+√13)t} + c2 ((√13 − 3)/2) e^{−(5−√13)t}, v(t) = c1 e^{−(5+√13)t} + c2 e^{−(5−√13)t}.
(d) u' = −4u + v + 2w, v' = u − 4v + w, w' = 2u + v − 4w, with solution u(t) = −c1 e^{−6t} + c2 e^{−(3+√3)t} + c3 e^{−(3−√3)t}, v(t) = −(√3 + 1) c2 e^{−(3+√3)t} + (√3 − 1) c3 e^{−(3−√3)t}, w(t) = c1 e^{−6t} + c2 e^{−(3+√3)t} + c3 e^{−(3−√3)t}.
9.2.4.
(a) u' = 2v, v' = −2u, with solution u(t) = c1 cos 2t + c2 sin 2t, v(t) = −c1 sin 2t + c2 cos 2t; stable.
(b) u' = u, v' = −v, with solution u(t) = c1 e^{t}, v(t) = c2 e^{−t}; unstable.
(c) u' = −2u + 2v, v' = −8u + 2v, with solution u(t) = (1/4)( c1 − √3 c2 ) cos 2√3 t + (1/4)( √3 c1 + c2 ) sin 2√3 t, v(t) = c1 cos 2√3 t + c2 sin 2√3 t; stable.
9.2.5. (a) Gradient flow; asymptotically stable. (b) Neither; unstable. (c) Hamiltonian flow;unstable. (d) Hamiltonian flow; stable. (e) Neither; unstable.
9.2.6. True. If K = [ a, b; b, c ], then we must have ∂H/∂v = a u + b v, ∂H/∂u = −b u − c v. Therefore, by equality of mixed partials, ∂²H/∂u∂v = a = −c. But if K > 0, both diagonal entries must be positive, a, c > 0, which is a contradiction.
9.2.7.
(a) The characteristic equation is λ⁴ + 2λ² + 1 = 0, and so ±i are double eigenvalues. However, each has only one linearly independent eigenvector, namely ( 1, ±i, 0, 0 )^T.
(b) The general solution is u(t) = ( c1 cos t + c2 sin t + c3 t cos t + c4 t sin t, −c1 sin t + c2 cos t − c3 t sin t + c4 t cos t, c3 cos t + c4 sin t, −c3 sin t + c4 cos t )^T.
(c) All solutions with c3² + c4² ≠ 0 spiral off to ∞ as t → ±∞, while if c3 = c4 = 0 but c1² + c2² ≠ 0, the solution goes periodically around a circle. Since the former solutions can start out arbitrarily close to 0, the zero solution is not stable.
9.2.8. Every solution to a real first order system of period P comes from complex conjugateeigenvalues ±2π i /P . A 3 × 3 real matrix has at least one real eigenvalue λ1. Therefore,if the system has a solution of period P , its eigenvalues are λ1 and ±2π i /P . If λ1 = 0,every solution has period P . Otherwise, the solutions with no component in the directionof the real eigenvector all have period P , and are the only periodic solutions, proving theresult. The system is stable (but never asymptotically stable) if and only if the real eigen-value λ1 ≤ 0.
9.2.9. No, since a 4 × 4 matrix could have two distinct complex conjugate pairs of purely imaginary eigenvalues, ±2πi/P1, ±2πi/P2, and would then have periodic solutions of periods P1 and P2. The general solution in such a case is quasi-periodic; see Section 9.5 for details.
9.2.10. The system is stable since ± i must be simple eigenvalues. Indeed, any 5× 5 matrix has5 eigenvalues, counting multiplicities, and the multiplicities of complex conjugate eigenval-ues are the same. A 6 × 6 matrix can have ± i as complex conjugate, incomplete doubleeigenvalues, in addition to the simple real eigenvalues −1,−2, and in such a situation theorigin would be unstable.
9.2.11. True, since Hn > 0 by Proposition 3.34.
9.2.12. True, because the eigenvalues of the coefficient matrix −K are real and non-positive, λ ≤ 0. Moreover, as it is symmetric, all its eigenvalues, including 0, are complete.
9.2.13. (a) v' = B v = −A v. (b) True, since the eigenvalues of B = −A are minus the eigenvalues of A, and so will all have positive real parts. (c) False. For example, a saddle point, with one positive and one negative eigenvalue, is still unstable when going backwards in time. (d) False, unless all the eigenvalues of A, and hence B, are complete and purely imaginary or zero.
9.2.14. The eigenvalues of −A2 are all of the form −λ2 ≤ 0, where λ is an eigenvalue of A.Thus, if A is nonsingular, the result is true, while if A is singular, then the equilibrium so-lutions are stable, since the 0 eigenvalue is complete, but not asymptotically stable.
9.2.15. (a) True, since the sum of the eigenvalues equals the trace, so at least one must be positive or have positive real part in order that the trace be positive. (b) False. A = [ −1, 0; 0, −2 ] gives an example of an asymptotically stable system with positive determinant.
9.2.16.
(a) Every v ∈ ker K gives an equilibrium solution u(t) ≡ v.
(b) Since K is complete, the general solution has the form
u(t) = c1 e^{−λ1 t} v1 + … + cr e^{−λr t} vr + c_{r+1} v_{r+1} + … + cn vn,
where λ1, …, λr > 0 are the positive eigenvalues of K with (orthogonal) eigenvectors v1, …, vr, while v_{r+1}, …, vn form a basis for the null eigenspace, i.e., ker K. Thus, as t → ∞, u(t) → c_{r+1} v_{r+1} + … + cn vn ∈ ker K, which is an equilibrium solution.
(c) The origin is asymptotically stable if K is positive definite, and stable if K is positive semi-definite.
(d) Note that a = u(0) = c1 v1 + … + cr vr + c_{r+1} v_{r+1} + … + cn vn. Since the eigenvectors are orthogonal, c_{r+1} v_{r+1} + … + cn vn is the orthogonal projection of a onto ker K.
9.2.17.
(a) The tangent to the Hamiltonian trajectory at a point ( u, v )^T is v = ( ∂H/∂v, −∂H/∂u )^T, while the tangent to the gradient flow trajectory is w = ( ∂H/∂u, ∂H/∂v )^T. Since v · w = 0, the tangents are orthogonal.
(b) (i), (ii) [plots]
9.2.18. False. Only positive definite Hamiltonian functions lead to stable gradient flows.
9.2.19.
(a) When q(u) = (1/2) u^T K u, then
(d/dt) q(u) = (1/2) u'^T K u + (1/2) u^T K u' = u'^T K u = −(K u)^T K u = −‖K u‖².
(b) Since K u ≠ 0 when u ∉ ker K, (d/dt) q(u) = −‖K u‖² < 0, and hence q(u) is a strictly decreasing function of t whenever u(t) is not an equilibrium solution. Moreover, u(t) → u* goes to equilibrium exponentially fast, and hence its energy decreases exponentially fast to its equilibrium value: q(u) → q(u*).
♥ 9.2.20.
(a) By the multivariable calculus chain rule,
(d/dt) H(u(t), v(t)) = (∂H/∂u)(du/dt) + (∂H/∂v)(dv/dt) = (∂H/∂u)(∂H/∂v) + (∂H/∂v)( −∂H/∂u ) ≡ 0.
Therefore H(u(t), v(t)) ≡ c is constant, with its value c = H(u0, v0) fixed by the initial conditions u(t0) = u0, v(t0) = v0.
(b) The solutions are u(t) = c1 cos 2t − c1 sin 2t + 2c2 sin 2t, v(t) = c2 cos 2t − c1 sin 2t + c2 sin 2t, and leave the Hamiltonian function constant:
H(u(t), v(t)) = u(t)² − 2 u(t) v(t) + 2 v(t)² = c1² − 2 c1 c2 + 2 c2² = c. [plot]
♦ 9.2.21. In both cases, |f(t)| = t^k e^{µt}. If µ > 0, then e^{µt} → ∞, while t^k ≥ 1 as t → ∞, and so |f(t)| ≥ e^{µt} → ∞. If µ = 0, then |f(t)| = 1 when k = 0, while |f(t)| = t^k → ∞ if k > 0. If µ < 0, then |f(t)| = e^{µt + k log t} → 0 as t → ∞, since µt + k log t → −∞.
♦ 9.2.22. An eigensolution u(t) = e^{λt} v with λ = µ + iν is bounded in norm by ‖u(t)‖ ≤ e^{µt} ‖v‖. Moreover, since exponentials grow faster than polynomials, any solution of the form u(t) = e^{λt} p(t), where p(t) is a vector of polynomials, can be bounded by C e^{at} for any a > µ = Re λ and some C > 0. Since every solution can be written as a linear combination of such solutions, every term is bounded by a multiple of e^{at} provided a > a* = max Re λ, and so, by the triangle inequality, is their sum. If the maximal eigenvalues are complete, then there are no polynomial terms, and we can use the eigensolution bound, so we can set a = a*.
9.3.1.
(i) A = [ 0, −1; 9, 0 ]; λ1 = 3i, v1 = ( i, 3 )^T, λ2 = −3i, v2 = ( −i, 3 )^T;
u1(t) = c1 cos 3t + c2 sin 3t, u2(t) = 3c1 sin 3t − 3c2 cos 3t;
center; stable.
(ii) A = [ −2, 3; −1, 1 ]; λ1 = −1/2 + i√3/2, v1 = ( 3/2 − i√3/2, 1 )^T, λ2 = −1/2 − i√3/2, v2 = ( 3/2 + i√3/2, 1 )^T;
u1(t) = e^{−t/2}[ ( (3/2) c1 − (√3/2) c2 ) cos (√3/2)t + ( (√3/2) c1 + (3/2) c2 ) sin (√3/2)t ],
u2(t) = e^{−t/2}[ c1 cos (√3/2)t + c2 sin (√3/2)t ];
stable focus; asymptotically stable.
(iii) A = [ 3, −2; 2, −2 ]; λ1 = −1, v1 = ( 1, 2 )^T, λ2 = 2, v2 = ( 2, 1 )^T;
u1(t) = c1 e^{−t} + 2c2 e^{2t}, u2(t) = 2c1 e^{−t} + c2 e^{2t};
saddle point; unstable.
9.3.2.
(i) u(t) = c1 e^{−t} ( 1, 3 )^T + c2 e^{t} ( 1, 1 )^T; saddle point; unstable.
(ii) u(t) = c1 e^{−t} ( 2 cos t − sin t, 5 cos t )^T + c2 e^{−t} ( 2 sin t + cos t, 5 sin t )^T; stable focus; asymptotically stable.
(iii) u(t) = c1 e^{−t/2} ( 1, 1 )^T + c2 e^{−t/2} ( t, t + 2/5 )^T; stable improper node; asymptotically stable.
9.3.3.
(a) For the matrix A = [ −1, 4; 1, −2 ]: tr A = −3 < 0, det A = −2 < 0, ∆ = 17 > 0, so this is an unstable saddle point.
(b) For the matrix A = [ −2, 1; 1, −4 ]: tr A = −6 < 0, det A = 7 > 0, ∆ = 8 > 0, so this is a stable node.
(c) For the matrix A = [ 5, 4; 1, 2 ]: tr A = 7 > 0, det A = 6 > 0, ∆ = 25 > 0, so this is an unstable node.
(d) For the matrix A = [ −3, −2; 3, 2 ]: tr A = −1 < 0, det A = 0, ∆ = 1 > 0, so this is a stable line.
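The trace-determinant test used in this exercise can be coded directly; a hedged NumPy sketch (it covers only the generic cases det ≠ 0, ∆ ≠ 0; the borderline stars, lines, and centers are not handled):

```python
import numpy as np

# Classify a 2x2 system via tr A, det A and the discriminant
# Delta = (tr A)^2 - 4 det A: det < 0 gives a saddle, det > 0 with
# Delta > 0 a node, Delta < 0 a focus; negative trace means stable.
def classify(A):
    tr, det = np.trace(A), np.linalg.det(A)
    disc = tr**2 - 4 * det
    if det < 0:
        return "saddle"
    kind = "node" if disc > 0 else "focus"
    return ("stable " if tr < 0 else "unstable ") + kind

assert classify(np.array([[-1.0, 4.0], [1.0, -2.0]])) == "saddle"        # (a)
assert classify(np.array([[-2.0, 1.0], [1.0, -4.0]])) == "stable node"   # (b)
assert classify(np.array([[5.0, 4.0], [1.0, 2.0]])) == "unstable node"   # (c)
```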
9.3.4. (a)–(e) [phase portraits]
9.3.5. (a) u(t) = ( e^{−2t} cos t + 7 e^{−2t} sin t, 3 e^{−2t} cos t − 4 e^{−2t} sin t )^T; (b) [plot]
(c) Asymptotically stable, since the coefficient matrix has tr A = −4 < 0, det A = 5 > 0, ∆ = −4 < 0, and hence it is a stable focus; equivalently, both eigenvalues −2 ± i have negative real part.
♦ 9.3.6. For (9.31), the complex solution
e^{λt} v = e^{(µ+iν)t}( w + i z ) = e^{µt}[ cos(νt) w − sin(νt) z ] + i e^{µt}[ sin(νt) w + cos(νt) z ]
leads to the general real solution
u(t) = c1 e^{µt}[ cos(νt) w − sin(νt) z ] + c2 e^{µt}[ sin(νt) w + cos(νt) z ]
= e^{µt}[ c1 cos(νt) + c2 sin(νt) ] w + e^{µt}[ −c1 sin(νt) + c2 cos(νt) ] z
= r e^{µt}[ cos(νt − σ) w − sin(νt − σ) z ],
where r = √(c1² + c2²) and tan σ = c2/c1.
To justify (9.32), we differentiate
du/dt = (d/dt)[ (c1 + c2 t) e^{λt} v + c2 e^{λt} w ] = λ[ (c1 + c2 t) e^{λt} v + c2 e^{λt} w ] + c2 e^{λt} v,
which is equal to
A u = (c1 + c2 t) e^{λt} A v + c2 e^{λt} A w = (c1 + c2 t) e^{λt} λ v + c2 e^{λt} ( λ w + v )
by the Jordan chain condition.
9.3.7. All except for cases IV(a–c), i.e., the stars and the trivial case.
9.4.1.
(a) ( 4/3 e^t − 1/3 e^{−2t}, −1/3 e^t + 1/3 e^{−2t} ; 4/3 e^t − 4/3 e^{−2t}, −1/3 e^t + 4/3 e^{−2t} );
(b) ( 1/2 e^t + 1/2 e^{−t}, 1/2 e^t − 1/2 e^{−t} ; 1/2 e^t − 1/2 e^{−t}, 1/2 e^t + 1/2 e^{−t} ) = ( cosh t, sinh t ; sinh t, cosh t );
(c) ( cos t, −sin t ; sin t, cos t ); (d) ( 1, t ; 0, 1 );
(e) ( e^{2t} cos t − 3 e^{2t} sin t, 2 e^{2t} sin t ; −5 e^{2t} sin t, e^{2t} cos t + 3 e^{2t} sin t );
(f) ( e^{−t} + 2t e^{−t}, 2t e^{−t} ; −2t e^{−t}, e^{−t} − 2t e^{−t} ).
9.4.2.
(a) ( 1, 0, 0 ; 2 sin t, cos t, sin t ; 2 cos t − 2, −sin t, cos t );
(b) the 3 × 3 matrix with rows
row 1: 1/6 e^t + 1/2 e^{3t} + 1/3 e^{4t}, 1/3 e^t − 1/3 e^{4t}, 1/6 e^t − 1/2 e^{3t} + 1/3 e^{4t},
row 2: 1/3 e^t − 1/3 e^{4t}, 2/3 e^t + 1/3 e^{4t}, 1/3 e^t − 1/3 e^{4t},
row 3: 1/6 e^t − 1/2 e^{3t} + 1/3 e^{4t}, 1/3 e^t − 1/3 e^{4t}, 1/6 e^t + 1/2 e^{3t} + 1/3 e^{4t};
(c) ( e^{−2t} + t e^{−2t}, t e^{−2t}, t e^{−2t} ; −1 + e^{−2t}, e^{−2t}, −1 + e^{−2t} ; 1 − e^{−2t} − t e^{−2t}, −t e^{−2t}, 1 − t e^{−2t} );
(d) with the abbreviations
c(t) = 1/3 e^t + 2/3 e^{−t/2} cos(√3 t/2),
p(t) = 1/3 e^t − 1/3 e^{−t/2} cos(√3 t/2) + 1/√3 e^{−t/2} sin(√3 t/2),
m(t) = 1/3 e^t − 1/3 e^{−t/2} cos(√3 t/2) − 1/√3 e^{−t/2} sin(√3 t/2),
the exponential is the circulant matrix ( c, m, p ; p, c, m ; m, p, c ).
9.4.3. 9.4.1: (a) det e^{tA} = e^{−t} = e^{t tr A}, (b) det e^{tA} = 1 = e^{t tr A}, (c) det e^{tA} = 1 = e^{t tr A}, (d) det e^{tA} = 1 = e^{t tr A}, (e) det e^{tA} = e^{4t} = e^{t tr A}, (f) det e^{tA} = e^{−2t} = e^{t tr A}.
9.4.2: (a) det e^{tA} = 1 = e^{t tr A}, (b) det e^{tA} = e^{8t} = e^{t tr A}, (c) det e^{tA} = e^{−4t} = e^{t tr A}, (d) det e^{tA} = 1 = e^{t tr A}.
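The identity det e^{tA} = e^{t tr A} underlying these checks holds for any square matrix; a quick numerical sketch, assuming NumPy and SciPy are available (the random test matrix is ours):

```python
import numpy as np
from scipy.linalg import expm

# Check det e^{tA} = e^{t tr A} for a random 4x4 matrix at one value of t.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
t = 0.7
lhs = np.linalg.det(expm(t * A))
rhs = np.exp(t * np.trace(A))
ok = bool(np.isclose(lhs, rhs))
```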
9.4.4. (a) ( 1/2 (e³ + e⁷), 1/2 (e³ − e⁷) ; 1/2 (e³ − e⁷), 1/2 (e³ + e⁷) );
(b) ( e cos √2, −√2 e sin √2 ; 1/√2 e sin √2, e cos √2 );
(c) ( 3, −1 ; 4, −1 );
(d) diag( e, e^{−2}, e^{−5} );
(e) ( 4/9 + 5/9 cos 3, 4/9 − 4/9 cos 3 + 1/3 sin 3, 2/9 − 2/9 cos 3 − 2/3 sin 3 ;
4/9 − 4/9 cos 3 − 1/3 sin 3, 4/9 + 5/9 cos 3, 2/9 − 2/9 cos 3 + 2/3 sin 3 ;
2/9 − 2/9 cos 3 + 2/3 sin 3, 2/9 − 2/9 cos 3 − 2/3 sin 3, 1/9 + 8/9 cos 3 ).
9.4.5.
(a) u(t) = ( cos t, −sin t ; sin t, cos t ) ( 1, −2 )^T = ( cos t + 2 sin t, sin t − 2 cos t )^T;
(b) u(t) = ( 3 e^{−t} − 2 e^{−3t}, −3 e^{−t} + 3 e^{−3t} ; 2 e^{−t} − 2 e^{−3t}, −2 e^{−t} + 3 e^{−3t} ) ( −1, 1 )^T = ( −6 e^{−t} + 5 e^{−3t}, −4 e^{−t} + 5 e^{−3t} )^T;
(c) u(t) = ( 3 e^{−t} − 2 cos 3t − 2 sin 3t, 3 e^{−t} − 3 cos 3t − sin 3t, 2 sin 3t ; −2 e^{−t} + 2 cos 3t + 2 sin 3t, −2 e^{−t} + 3 cos 3t + sin 3t, −2 sin 3t ; 2 e^{−t} − 2 cos 3t, 2 e^{−t} − 2 cos 3t + sin 3t, cos 3t + sin 3t ) ( 0, 1, 0 )^T
= ( 3 e^{−t} − 3 cos 3t − sin 3t, −2 e^{−t} + 3 cos 3t + sin 3t, 2 e^{−t} − 2 cos 3t + sin 3t )^T.
9.4.6. e^{tO} = I for all t.
9.4.7. There are none, since e^{tA} is always invertible.
9.4.8. e^{tA} = ( cos 2πt, −sin 2πt ; sin 2πt, cos 2πt ), and hence, when t = 1, e^A = ( cos 2π, −sin 2π ; sin 2π, cos 2π ) = ( 1, 0 ; 0, 1 ).
9.4.9. (a) According to Exercise 8.2.51, A² = −δ² I since tr A = 0, det A = δ². Thus, by induction, A^{2m} = (−1)^m δ^{2m} I, A^{2m+1} = (−1)^m δ^{2m} A. Hence
e^{tA} = Σ_{n=0}^∞ tⁿ/n! Aⁿ = Σ_{m=0}^∞ (−1)^m (δt)^{2m}/(2m)! I + Σ_{m=0}^∞ (−1)^m t^{2m+1} δ^{2m}/(2m+1)! A = (cos δt) I + (sin δt / δ) A.
Setting t = 1 proves the formula. (b) e^A = (cosh δ) I + (sinh δ / δ) A, where δ = √(−det A). (c) e^A = I + A, since A² = O by Exercise 8.2.51.
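The closed form in 9.4.9(a) can be verified against a general-purpose matrix exponential; a sketch, assuming NumPy and SciPy are available (the particular trace-zero matrix is ours):

```python
import numpy as np
from scipy.linalg import expm

# Trace-zero 2x2 matrix with positive determinant: A^2 = -(det A) I,
# so e^A = (cos delta) I + (sin delta / delta) A with delta = sqrt(det A).
A = np.array([[1., 2.],
              [-3., -1.]])          # tr A = 0, det A = 5
delta = np.sqrt(np.linalg.det(A))
formula = np.cos(delta) * np.eye(2) + (np.sin(delta) / delta) * A
ok = bool(np.allclose(expm(A), formula))
```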
♦ 9.4.10. Assuming A is an n × n matrix, since e^{tA} is a matrix solution, each of its n individual columns must be a solution. Moreover, the columns are linearly independent, since e^{0A} = I is nonsingular. Therefore, they form a basis for the n-dimensional solution space.
9.4.11. (a) False, unless A^{−1} = −A. (b) True, since A and A^{−1} commute.
♦ 9.4.12. Fix s and let U(t) = e^{(t+s)A}, V(t) = e^{tA} e^{sA}. Then, by the chain rule, dU/dt = A e^{(t+s)A} = A U, while, by the matrix Leibniz rule (9.40), dV/dt = A e^{tA} e^{sA} = A V. Moreover, U(0) = e^{sA} = V(0). Thus U(t) and V(t) solve the same initial value problem, and hence, by uniqueness, U(t) = V(t) for all t, proving the result.
9.4.13. Set U(t) = A e^{tA}, V(t) = e^{tA} A. Then, by the matrix Leibniz formula (9.40), dU/dt = A² e^{tA} = A U and dV/dt = A e^{tA} A = A V, while U(0) = A = V(0). Thus U(t) and V(t) solve the same initial value problem, and hence, by uniqueness, U(t) = V(t) for all t. Alternatively, one can use the power series formula (9.46): A e^{tA} = Σ_{n=0}^∞ tⁿ/n! A^{n+1} = e^{tA} A.
9.4.14. Set U(t) = e^{−tλ} e^{tA}. Then dU/dt = −λ e^{−tλ} e^{tA} + e^{−tλ} A e^{tA} = (A − λI) U. Moreover, U(0) = I. Therefore, by the definition of the matrix exponential, U(t) = e^{t(A − λI)}.
♦ 9.4.15.
(a) Let V(t) = (e^{tA})^T. Then dV/dt = ( d/dt e^{tA} )^T = (e^{tA} A)^T = A^T (e^{tA})^T = A^T V, and V(0) = I. Therefore, by the definition of the matrix exponential, V(t) = e^{tA^T}.
(b) The columns of e^{tA} form a basis for the solutions to du/dt = A u, while its rows form a basis for the solutions to dv/dt = A^T v. The stability properties of the two systems are the same, since the eigenvalues of A are the same as those of A^T.
♦ 9.4.16. First note that Aⁿ = S Bⁿ S^{−1}. Therefore, using (9.46),
e^{tA} = Σ_{n=0}^∞ tⁿ/n! Aⁿ = Σ_{n=0}^∞ tⁿ/n! S Bⁿ S^{−1} = S ( Σ_{n=0}^∞ tⁿ/n! Bⁿ ) S^{−1} = S e^{tB} S^{−1}.
An alternative proof relies on the fact that e^{tA} and S e^{tB} S^{−1} both satisfy the initial value problem dU/dt = A U = S B S^{−1} U, U(0) = I, and hence, by uniqueness, must be equal.
♦ 9.4.17. (a) d/dt diag(e^{t d1}, ..., e^{t dn}) = diag(d1 e^{t d1}, ..., dn e^{t dn}) = D diag(e^{t d1}, ..., e^{t dn}). Moreover, at t = 0, we have diag(e^{0 d1}, ..., e^{0 dn}) = I. Therefore, diag(e^{t d1}, ..., e^{t dn}) satisfies the defining properties of e^{tD}. (b) See Exercise 9.4.16.
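The similarity rule e^{tA} = S e^{tB} S^{−1} of 9.4.16 can be checked numerically; a sketch, assuming NumPy and SciPy are available (the eigenvector matrix is the one appearing in the 9.4.1(a) diagonalization below):

```python
import numpy as np
from scipy.linalg import expm

S = np.array([[1., 1.],
              [1., 4.]])       # eigenvector matrix
B = np.diag([1., -2.])         # eigenvalues 1 and -2
A = S @ B @ np.linalg.inv(S)
t = 0.5
ok = bool(np.allclose(expm(t * A), S @ expm(t * B) @ np.linalg.inv(S)))
```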
(c) 9.4.1:
(a) ( 1, 1 ; 1, 4 ) diag(e^t, e^{−2t}) ( 1, 1 ; 1, 4 )^{−1} = ( 4/3 e^t − 1/3 e^{−2t}, −1/3 e^t + 1/3 e^{−2t} ; 4/3 e^t − 4/3 e^{−2t}, −1/3 e^t + 4/3 e^{−2t} );
(b) ( 1, −1 ; 1, 1 ) diag(e^t, e^{−t}) ( 1, −1 ; 1, 1 )^{−1} = ( 1/2 e^t + 1/2 e^{−t}, 1/2 e^t − 1/2 e^{−t} ; 1/2 e^t − 1/2 e^{−t}, 1/2 e^t + 1/2 e^{−t} );
(c) ( i, −i ; 1, 1 ) diag(e^{it}, e^{−it}) ( i, −i ; 1, 1 )^{−1} = ( cos t, −sin t ; sin t, cos t );
(d) not diagonalizable;
(e) ( 3/5 − 1/5 i, 3/5 + 1/5 i ; 1, 1 ) diag(e^{(2+i)t}, e^{(2−i)t}) ( 3/5 − 1/5 i, 3/5 + 1/5 i ; 1, 1 )^{−1} = ( e^{2t} cos t − 3 e^{2t} sin t, 2 e^{2t} sin t ; −5 e^{2t} sin t, e^{2t} cos t + 3 e^{2t} sin t );
(f) not diagonalizable.
9.4.2:
(a) ( −1, 0, 0 ; 0, −i, i ; 2, 1, 1 ) diag(1, e^{it}, e^{−it}) ( −1, 0, 0 ; 0, −i, i ; 2, 1, 1 )^{−1} = ( 1, 0, 0 ; 2 sin t, cos t, sin t ; 2 cos t − 2, −sin t, cos t );
(b) ( 1, −1, 1 ; 2, 0, −1 ; 1, 1, 1 ) diag(e^t, e^{3t}, e^{4t}) ( 1, −1, 1 ; 2, 0, −1 ; 1, 1, 1 )^{−1}, which equals the matrix exponential found in 9.4.2(b);
(c) not diagonalizable;
(d) ( 1, −1/2 − i √3/2, −1/2 + i √3/2 ; 1, −1/2 + i √3/2, −1/2 − i √3/2 ; 1, 1, 1 ) diag( e^t, e^{(−1/2 + i √3/2)t}, e^{(−1/2 − i √3/2)t} ) times the inverse of the same eigenvector matrix, which equals the same matrix exponential as before.
♦ 9.4.18. Let M have size p × q and N have size q × r. The derivative of the (i, j) entry of the product matrix M(t) N(t) is
d/dt Σ_{k=1}^q m_{ik}(t) n_{kj}(t) = Σ_{k=1}^q (dm_{ik}/dt) n_{kj}(t) + Σ_{k=1}^q m_{ik}(t) (dn_{kj}/dt).
The first sum is the (i, j) entry of (dM/dt) N, while the second is the (i, j) entry of M (dN/dt).
♦ 9.4.19. (a) The exponential series is a sum of real terms. Alternatively, one can choose a real basis for the solution space to construct the real matrix solution U(t) before substituting into formula (9.42). (b) According to Lemma 9.28, det e^A = e^{tr A} > 0, since a real scalar exponential is always positive.
9.4.20. Lemma 9.28 implies det e^{tA} = e^{t tr A} = 1 for all t if and only if tr A = 0. (Even if tr A is allowed to be complex, by continuity, the only way this could hold for all t is if tr A = 0.)
♦ 9.4.21. Let u(t) = e^{tλ} v, where v is the corresponding eigenvector of A. Then du/dt = λ e^{tλ} v = λ u = A u, and hence, by (9.41), u(t) = e^{tA} u(0) = e^{tA} v. Therefore, equating the two formulas for u(t), we conclude that e^{tA} v = e^{tλ} v, which proves that v is an eigenvector of e^{tA} with eigenvalue e^{tλ}.
9.4.22. The origin is asymptotically stable if and only if all solutions tend to zero as t → ∞. Thus, all columns of e^{tA} tend to 0 as t → ∞, and hence lim_{t→∞} e^{tA} = O. Conversely, if lim_{t→∞} e^{tA} = O, then any solution has the form u(t) = e^{tA} c, and hence u(t) → 0 as t → ∞, proving asymptotic stability.
9.4.23. According to Exercise 9.4.21, the eigenvalues of e^A are e^λ = e^μ cos ν + i e^μ sin ν, where λ = μ + iν are the eigenvalues of A. For e^λ to have negative real part, we must have cos ν < 0, and so ν = Im λ must satisfy ( 2k + 1/2 ) π < ν < ( 2k + 3/2 ) π for some integer k.
9.4.24. Indeed, the columns of Ũ(t) are linear combinations of the columns of U(t), and hence automatically solutions to the linear system. Alternatively, we can prove this directly using the Leibniz rule (9.40): dŨ/dt = d/dt ( U(t) C ) = (dU/dt) C = A U C = A Ũ, since C is constant.
♦ 9.4.25.
(a) If U(t) = C e^{tB}, then dU/dt = C e^{tB} B = U B, and so U satisfies the differential equation. Moreover, C = U(0). Thus, U(t) is the unique solution to the initial value problem dU/dt = U B, U(0) = C, where the initial value C is arbitrary.
(b) By Exercise 9.4.16, U(t) = C e^{tB} = e^{tA} C, where A = C B C^{−1}. Thus, dU/dt = A U as claimed. Note that A = B if and only if A commutes with U(0) = C.
9.4.26. (a) Let U(t) = ( u1(t), ..., un(t) ) be the corresponding matrix-valued function with the indicated columns. Then duj/dt = Σ_{i=1}^n b_{ij} u_i for all j = 1, ..., n, if and only if dU/dt = U B, where B is the n × n matrix with entries b_{ij}. Therefore, by Exercise 9.4.25, dU/dt = A U, where A = C B C^{−1} with C = U(0).
(b) According to Exercise 9.1.7, every derivative d^k u/dt^k of a solution to du/dt = A u is also a solution. Since the solution space is an n-dimensional vector space, at most n of the derivatives are linearly independent, and so either
d^n u/dt^n = c0 u + c1 du/dt + ... + c_{n−1} d^{n−1}u/dt^{n−1},   (∗)
for some constants c0, ..., c_{n−1}, or, for some k < n, we have
d^k u/dt^k = a0 u + a1 du/dt + ... + a_{k−1} d^{k−1}u/dt^{k−1}.   (∗∗)
In the latter case,
d^n u/dt^n = d^{n−k}/dt^{n−k} ( a0 u + a1 du/dt + ... + a_{k−1} d^{k−1}u/dt^{k−1} )
= a0 d^{n−k}u/dt^{n−k} + a1 d^{n−k+1}u/dt^{n−k+1} + ... + a_{k−1} d^{n−1}u/dt^{n−1},
and so (∗) continues to hold, now with c0 = ... = c_{n−k−1} = 0, c_{n−k} = a0, ..., c_{n−1} = a_{k−1}.
9.4.27. Write the matrix solution to the initial value problem dU/dt = A U, U(0) = I, in block form U(t) = e^{tA} = ( V(t), W(t) ; Y(t), Z(t) ). Then the differential equation decouples into dV/dt = B V, dW/dt = B W, dY/dt = C Y, dZ/dt = C Z, with initial conditions V(0) = I, W(0) = O, Y(0) = O, Z(0) = I. Thus, by uniqueness of solutions to the initial value problems, V(t) = e^{tB}, W(t) = O, Y(t) = O, Z(t) = e^{tC}.
♦ 9.4.28. (a) Let U(t) be the upper triangular matrix whose (i, j) entry is t^{j−i}/(j−i)! for j ≥ i, and 0 for j < i:
U(t) = ( 1, t, t²/2, t³/6, ..., tⁿ/n! ; 0, 1, t, t²/2, ..., t^{n−1}/(n−1)! ; 0, 0, 1, t, ..., t^{n−2}/(n−2)! ; ... ; 0, 0, 0, ..., 1, t ; 0, 0, 0, ..., 0, 1 ).
Differentiating entrywise replaces t^{j−i}/(j−i)! by t^{j−i−1}/(j−i−1)!, which is exactly the (i, j) entry of the product J_{0,n} U(t), since multiplying by J_{0,n} on the left shifts the rows of U(t) up by one. Thus, U(t) satisfies the initial value problem dU/dt = J_{0,n} U, U(0) = I, that characterizes the matrix exponential, so U(t) = e^{tJ_{0,n}}.
(b) Since J_{λ,n} = λ I + J_{0,n}, by Exercise 9.4.14, e^{tJ_{λ,n}} = e^{tλ} e^{tJ_{0,n}}, i.e., you merely multiply all entries in the previous formula by e^{tλ}.
(c) According to Exercises 9.4.17, 27, if A = S J S^{−1}, where J is the Jordan canonical form, then e^{tA} = S e^{tJ} S^{−1}, and e^{tJ} is a block diagonal matrix given by the exponentials of its individual Jordan blocks, computed in part (b).
♦ 9.4.29. If J is a Jordan matrix, then, by the arguments in Exercise 9.4.28, e^{tJ} is upper triangular, with diagonal entries e^{tλ}, where λ is the eigenvalue appearing on the diagonal of the corresponding Jordan block of A. In particular, the multiplicity of λ, which is the number of times it appears on the diagonal of J, is the same as the multiplicity of e^{tλ} for e^{tJ}. Moreover, since e^{tA} is similar to e^{tJ}, its eigenvalues are the same, and of the same multiplicities.
♥ 9.4.30. (a) All matrix exponentials are nonsingular by the remark after (9.44). (b) Both A = O and A = ( 0, −2π ; 2π, 0 ) have the identity matrix as their exponential e^A = I. (c) If e^A = I and λ is an eigenvalue of A, then e^λ = 1, since 1 is the only eigenvalue of I. Therefore, the eigenvalues of A must be integer multiples of 2πi. Since A is real, the eigenvalues must be complex conjugates, and hence either both 0, or ±2nπi for some positive integer n. In the latter case, the characteristic equation of A is λ² + 4n²π² = 0, and hence A must have zero trace and determinant 4n²π². Thus, A = ( a, b ; c, −a ) with a² + bc = −4n²π². If A has both eigenvalues zero, it must be complete, and hence A = O, which is included in the previous formula.
9.4.31. Even though this formula is correct in the scalar case, it is false in general. Would that life were so simple!
9.4.32.
(a) u1(t) = 1/3 e^t − 1/12 e^{−2t} − 1/4 e^{2t}, u2(t) = 1/3 e^t − 1/3 e^{−2t};
(b) u1(t) = e^{t−1} − e^t + t e^t, u2(t) = e^{t−1} − e^t + t e^t;
(c) u1(t) = 1/3 cos 2t − 1/2 sin 2t − 1/3 cos t, u2(t) = cos 2t + 2/3 sin 2t − 1/3 sin t;
(d) u(t) = 13/16 e^{4t} + 3/16 − 1/4 t, v(t) = 13/16 e^{4t} − 29/16 + 3/4 t;
(e) p(t) = 1/2 t² + 1/3 t³, q(t) = 1/2 t² − 1/3 t³.
9.4.33.
(a) u1(t) = 1/2 cos 2t + 1/4 sin 2t + 1/2 − 1/2 t, u2(t) = 2 e^{−t} − 1/2 cos 2t − 1/4 sin 2t − 3/2 + 3/2 t, u3(t) = 2 e^{−t} − 1/4 cos 2t − 3/4 sin 2t − 7/4 + 3/2 t;
(b) u1(t) = −3/2 e^t + 1/2 e^{−t} + t e^{−t}, u2(t) = t e^{−t}, u3(t) = −3 e^t + 2 e^{−t} + 2t e^{−t}.
9.4.34. Since λ is not an eigenvalue, A − λI is nonsingular. Set w = (A − λI)^{−1} v. Then u⋆(t) = e^{λt} w is a solution. The general solution is u(t) = e^{λt} w + z(t) = e^{λt} w + e^{tA} b, where b is any vector.
9.4.35.
(a) u(t) = ∫₀ᵗ e^{(t−s)A} b ds.
(b) Yes, since if b = A c, then the integral can be evaluated as
u(t) = ∫₀ᵗ e^{(t−s)A} A c ds = −e^{(t−s)A} c |_{s=0}^{t} = e^{tA} c − c = v(t) − u⋆,
where v(t) = e^{tA} c solves the homogeneous system dv/dt = A v, while u⋆ = c is the equilibrium solution.
9.4.36.
(a) ( e^{2t}, 0 ; 0, 1 ) — scalings in the x direction, which expand when t > 0 and contract when t < 0. The trajectories are half-lines parallel to the x axis. Points on the y axis are left fixed.
(b) ( 1, 0 ; t, 1 ) — shear transformations in the y direction. The trajectories are lines parallel to the y axis. Points on the y axis are fixed.
(c) ( cos 3t, sin 3t ; −sin 3t, cos 3t ) — rotations around the origin, starting in a clockwise direction for t > 0. The trajectories are the circles x² + y² = c. The origin is fixed.
(d) ( cos 2t, −1/2 sin 2t ; 2 sin 2t, cos 2t ) — elliptical rotations around the origin. The trajectories are the ellipses x² + 1/4 y² = c. The origin is fixed.
(e) ( cosh t, sinh t ; sinh t, cosh t ) — hyperbolic rotations. Since the determinant is 1, these are area-preserving scalings: for t > 0, expanding by a factor of e^t in the direction x = y and contracting by the reciprocal factor e^{−t} in the direction x = −y; the reverse holds for t < 0. The trajectories are the semi-hyperbolas x² − y² = c and the four rays x = ±y. The origin is fixed.
9.4.37.
(a) diag( e^{2t}, e^t, 1 ) — scalings by a factor λ = e^t in the y direction and λ² = e^{2t} in the x direction. The trajectories are the semi-parabolas x = c y², z = d for c, d constant, and the half-lines x ≠ 0, y = 0, z = d and x = 0, y ≠ 0, z = d. Points on the z axis are left fixed.
(b) ( 1, 0, t ; 0, 1, 0 ; 0, 0, 1 ) — shear transformations in the x direction, with magnitude proportional to the z coordinate. The trajectories are lines parallel to the x axis. Points on the xy plane are fixed.
(c) ( cos 2t, 0, −sin 2t ; 0, 1, 0 ; sin 2t, 0, cos 2t ) — rotations around the y axis. The trajectories are the circles x² + z² = c, y = d. Points on the y axis are fixed.
(d) ( cos t, sin t, 0 ; −sin t, cos t, 0 ; 0, 0, e^t ) — spiral motions around the z axis. The trajectories are the positive and negative z axes, circles in the xy plane, and cylindrical spirals (helices) winding around the z axis while moving away from the xy plane at an exponentially increasing rate. The only fixed point is the origin.
(e) ( cosh t, 0, sinh t ; 0, 1, 0 ; sinh t, 0, cosh t ) — hyperbolic rotations in the xz plane, cf. Exercise 9.4.36(e). The trajectories are the semi-hyperbolas x² − z² = c, y = d, and the rays x = ±z, y = d. The points on the y axis are fixed.
9.4.38. (a) e^{tA} =
( 1/3 + 2/3 cos √3 t, 1/3 − 1/3 cos √3 t + 1/√3 sin √3 t, −1/3 + 1/3 cos √3 t + 1/√3 sin √3 t ;
1/3 − 1/3 cos √3 t − 1/√3 sin √3 t, 1/3 + 2/3 cos √3 t, −1/3 + 1/3 cos √3 t − 1/√3 sin √3 t ;
−1/3 + 1/3 cos √3 t − 1/√3 sin √3 t, −1/3 + 1/3 cos √3 t + 1/√3 sin √3 t, 1/3 + 2/3 cos √3 t ).
(b) The axis is the null eigenvector: ( 1, 1, −1 )^T.
♥ 9.4.39.
(a) Given c, d ∈ R and x, y ∈ R³, we have
L_v[cx + dy] = v × (cx + dy) = c v × x + d v × y = c L_v[x] + d L_v[y],
proving linearity.
(b) If v = ( a, b, c )^T, then A_v = ( 0, −c, b ; c, 0, −a ; −b, a, 0 ) = −A_v^T.
(c) Every 3 × 3 skew-symmetric matrix has the form A_v for some v = ( a, b, c )^T.
(d) x ∈ ker A_v if and only if 0 = A_v x = v × x, which happens if and only if x = cv is parallel to v.
(e) If v = r e3, then A_{r e3} = ( 0, −r, 0 ; r, 0, 0 ; 0, 0, 0 ), and hence e^{tA_{r e3}} = ( cos rt, −sin rt, 0 ; sin rt, cos rt, 0 ; 0, 0, 1 ) represents a rotation by angle rt around the z axis. More generally, given v with r = ‖v‖, let Q be any rotation matrix such that Qv = r e3. Then Q(v × x) = (Qv) × (Qx), since rotations preserve the orthogonality of the cross product and its right-handedness.
Thus, if dx/dt = v × x and we set y = Qx, then dy/dt = r e3 × y. We conclude that the solutions x(t) are obtained by rotating the solutions y(t), and so are given by rotations around the axis v.
♥ 9.4.40. (a) The solution is x(t) = ( x0 cos t − y0 sin t, x0 sin t + y0 cos t, z0 )^T, which is a rotation by angle t around the z axis. The trajectory of the point ( x0, y0, z0 )^T is the circle of radius r0 = √(x0² + y0²) at height z0 centered on the z axis. The points on the z axis, with r0 = 0, are fixed.
(b) For the inhomogeneous system, the solution is x(t) = ( x0 cos t − y0 sin t, x0 sin t + y0 cos t, z0 + t )^T, which is a screw motion. If r0 = 0, the trajectory is the z axis; otherwise it is a helix of radius r0, spiraling up the z axis.
(c) The solution to the linear system dx/dt = a × x is x(t) = R_t x0, where R_t is a rotation through angle t‖a‖ around the axis a. The solution to the inhomogeneous system is the screw motion x(t) = R_t x0 + t a.
♥ 9.4.41.
(a) Since A = ( 0, −c, b ; c, 0, −a ; −b, a, 0 ), we have A² = ( −b² − c², ab, ac ; ab, −a² − c², bc ; ac, bc, −a² − b² ), while A³ = −(a² + b² + c²) A = −A. Therefore, by induction, A^{2m+1} = (−1)^m A and A^{2m} = (−1)^{m−1} A² for m ≥ 1. Thus
e^{tA} = Σ_{n=0}^∞ tⁿ/n! Aⁿ = I + Σ_{m=0}^∞ (−1)^m t^{2m+1}/(2m+1)! A − Σ_{m=1}^∞ (−1)^m t^{2m}/(2m)! A².
The power series are, respectively, those of sin t and cos t − 1 (since the constant term doesn't appear), proving the formula.
(b) Since u ≠ 0, the matrices I, A, A² are linearly independent, and hence, by the Euler–Rodrigues formula, e^{tA} = I if and only if cos t = 1, sin t = 0, so t must be an integer multiple of 2π.
(c) If v = ru with r = ‖v‖, then
e^{tA_v} = e^{trA_u} = I + (sin tr) A_u + (1 − cos tr) A_u² = I + ( sin t‖v‖ / ‖v‖ ) A_v + ( (1 − cos t‖v‖) / ‖v‖² ) A_v²,
which equals the identity matrix if and only if t = 2kπ/‖v‖ for some integer k.
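The Euler–Rodrigues formula of part (a) is easy to confirm against a general matrix exponential; a sketch, assuming NumPy and SciPy are available (the particular unit vector is our test choice):

```python
import numpy as np
from scipy.linalg import expm

def cross_matrix(v):
    """Skew-symmetric matrix A_v with A_v x = v x (cross product)."""
    a, b, c = v
    return np.array([[0., -c, b],
                     [c, 0., -a],
                     [-b, a, 0.]])

u = np.array([2., -1., 2.]) / 3.0        # unit vector
A = cross_matrix(u)
t = 0.9
rodrigues = np.eye(3) + np.sin(t) * A + (1 - np.cos(t)) * (A @ A)
ok = bool(np.allclose(expm(t * A), rodrigues))
```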
9.4.42. None of them commute:
[ ( 2, 0 ; 0, 0 ), ( 0, 0 ; 1, 0 ) ] = ( 0, 0 ; −2, 0 ),
[ ( 2, 0 ; 0, 0 ), ( 0, 3 ; −3, 0 ) ] = ( 0, 6 ; 6, 0 ),
[ ( 2, 0 ; 0, 0 ), ( 0, −1 ; 4, 0 ) ] = ( 0, −2 ; −8, 0 ),
[ ( 2, 0 ; 0, 0 ), ( 0, 1 ; 1, 0 ) ] = ( 0, 2 ; −2, 0 ),
[ ( 0, 0 ; 1, 0 ), ( 0, 3 ; −3, 0 ) ] = ( −3, 0 ; 0, 3 ),
[ ( 0, 0 ; 1, 0 ), ( 0, −1 ; 4, 0 ) ] = ( 1, 0 ; 0, −1 ),
[ ( 0, 0 ; 1, 0 ), ( 0, 1 ; 1, 0 ) ] = ( −1, 0 ; 0, 1 ),
[ ( 0, 3 ; −3, 0 ), ( 0, −1 ; 4, 0 ) ] = ( 9, 0 ; 0, −9 ),
[ ( 0, 3 ; −3, 0 ), ( 0, 1 ; 1, 0 ) ] = ( 6, 0 ; 0, −6 ),
[ ( 0, −1 ; 4, 0 ), ( 0, 1 ; 1, 0 ) ] = ( −5, 0 ; 0, 5 ),
[ diag(2, 1, 0), ( 0, 0, 1 ; 0, 0, 0 ; 0, 0, 0 ) ] = ( 0, 0, 2 ; 0, 0, 0 ; 0, 0, 0 ),
[ diag(2, 1, 0), ( 0, 0, −2 ; 0, 0, 0 ; 2, 0, 0 ) ] = ( 0, 0, −4 ; 0, 0, 0 ; −4, 0, 0 ),
[ diag(2, 1, 0), ( 0, 1, 0 ; −1, 0, 0 ; 0, 0, 1 ) ] = ( 0, 1, 0 ; 1, 0, 0 ; 0, 0, 0 ),
[ diag(2, 1, 0), ( 0, 0, 1 ; 0, 0, 0 ; 1, 0, 0 ) ] = ( 0, 0, 2 ; 0, 0, 0 ; −2, 0, 0 ),
[ ( 0, 0, 1 ; 0, 0, 0 ; 0, 0, 0 ), ( 0, 0, −2 ; 0, 0, 0 ; 2, 0, 0 ) ] = ( 2, 0, 0 ; 0, 0, 0 ; 0, 0, −2 ),
[ ( 0, 0, 1 ; 0, 0, 0 ; 0, 0, 0 ), ( 0, 1, 0 ; −1, 0, 0 ; 0, 0, 1 ) ] = ( 0, 0, 1 ; 0, 0, 1 ; 0, 0, 0 ),
[ ( 0, 0, 1 ; 0, 0, 0 ; 0, 0, 0 ), ( 0, 0, 1 ; 0, 0, 0 ; 1, 0, 0 ) ] = ( 1, 0, 0 ; 0, 0, 0 ; 0, 0, −1 ),
[ ( 0, 0, −2 ; 0, 0, 0 ; 2, 0, 0 ), ( 0, 1, 0 ; −1, 0, 0 ; 0, 0, 1 ) ] = ( 0, 0, −2 ; 0, 0, −2 ; −2, 2, 0 ),
[ ( 0, 0, −2 ; 0, 0, 0 ; 2, 0, 0 ), ( 0, 0, 1 ; 0, 0, 0 ; 1, 0, 0 ) ] = ( −4, 0, 0 ; 0, 0, 0 ; 0, 0, 4 ),
[ ( 0, 1, 0 ; −1, 0, 0 ; 0, 0, 1 ), ( 0, 0, 1 ; 0, 0, 0 ; 1, 0, 0 ) ] = ( 0, 0, −1 ; 0, 0, −1 ; 1, −1, 0 ).
9.4.43. (a) If U, V are upper triangular, so are UV and VU, and hence so is [U, V] = UV − VU.
(b) If A^T = −A, B^T = −B, then
[A, B]^T = (AB − BA)^T = B^T A^T − A^T B^T = BA − AB = −[A, B].
(c) No.
♦ 9.4.44. The sum of
[ [A, B], C ] = (AB − BA)C − C(AB − BA) = ABC − BAC − CAB + CBA,
[ [C, A], B ] = (CA − AC)B − B(CA − AC) = CAB − ACB − BCA + BAC,
[ [B, C], A ] = (BC − CB)A − A(BC − CB) = BCA − CBA − ABC + ACB,
is clearly zero.
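The Jacobi identity of 9.4.44 can be spot-checked on random matrices; a sketch, assuming NumPy is available:

```python
import numpy as np

def comm(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

rng = np.random.default_rng(1)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))
jacobi = comm(comm(A, B), C) + comm(comm(C, A), B) + comm(comm(B, C), A)
ok = bool(np.allclose(jacobi, 0.0))
```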
9.4.45. In the matrix system dU/dt = A U, the equations in the last row are du_{nj}/dt = 0 for j = 1, ..., n, and hence the last row of U(t) is constant. In particular, for the exponential matrix solution U(t) = e^{tA}, the last row must equal the last row of the identity matrix U(0) = I, which is e_n^T.
9.4.46. Write the matrix solution as U(t) = ( V(t), f(t) ; g(t), w(t) ), where f(t) is a column vector, g(t) a row vector, and w(t) is a scalar function. Then the matrix system dU/dt = A U decouples into dV/dt = B V, df/dt = B f + c w, dg/dt = 0, dw/dt = 0, with initial conditions V(0) = I, f(0) = 0, g(0) = 0, w(0) = 1. Thus, g ≡ 0 and w ≡ 1 are constant, and V(t) = e^{tB}. The equation for f(t) becomes df/dt = B f + c, f(0) = 0, whose solution is f(t) = ∫₀ᵗ e^{(t−s)B} c ds, cf. Exercise 9.4.35.
♦ 9.4.47. (a) ( x + t, y )^T: translations in the x direction. (b) ( e^t x, e^{−2t} y )^T: scalings in the x and y directions by respective factors e^t, e^{−2t}. (c) ( (x + 1) cos t − y sin t − 1, (x + 1) sin t + y cos t )^T: rotations around the point ( −1, 0 )^T. (d) ( e^t (x + 1) − 1, e^{−t} (y + 2) − 2 )^T: scalings in the x and y directions, centered at the point ( −1, −2 )^T, by reciprocal factors e^t, e^{−t}.
9.5.1. The vibrational frequency is ω = √(21/6) ≈ 1.87083, and so the number of Hertz is ω/(2π) ≈ .297752.
9.5.2. We need ω/(2π) = 1/(2π√m) = 20, and so m = 1/(1600 π²) ≈ .0000633257.
9.5.3.
(a) Periodic of period π. (b) Periodic of period 2. (c) Periodic of period 12. (d) Quasi-periodic. (e) Periodic of period 120π. (f) Quasi-periodic. (g) sin t sin 3t = 1/2 cos 2t − 1/2 cos 4t, and so periodic of period π.
(Graphs not reproduced.)
9.5.4. The minimal period is πm/2^{k−1}, where m is the least common multiple of q and s, while 2^k is the largest power of 2 appearing in both p and r.
9.5.5. (a) √2, √7; (b) 4 — each eigenvalue gives two linearly independent solutions;
(c) u(t) = r1 cos(√2 t − δ1) ( 2, 1 )^T + r2 cos(√7 t − δ2) ( −1, 2 )^T; (d) The solution is periodic if only one frequency is excited, i.e., r1 = 0 or r2 = 0; all other solutions are quasiperiodic.
9.5.6. (a) 5, 10; (b) 4 — each eigenvalue gives two linearly independent solutions;
(c) u(t) = r1 cos(5t − δ1) ( −3, 4 )^T + r2 cos(10t − δ2) ( 4, 3 )^T; (d) All solutions are periodic; when r1 ≠ 0, the period is 2π/5, while when r1 = 0 the period is π/5.
9.5.7.
(a) u(t) = r1 cos(t − δ1) + r2 cos(√5 t − δ2), v(t) = r1 cos(t − δ1) − r2 cos(√5 t − δ2);
(b) u(t) = r1 cos(√10 t − δ1) − 2 r2 cos(√15 t − δ2), v(t) = 2 r1 cos(√10 t − δ1) + r2 cos(√15 t − δ2);
(c) u(t) = ( r1 cos(t − δ1), r2 cos(2t − δ2), r3 cos(3t − δ3) )^T;
(d) u(t) = r1 cos(√2 t − δ1) ( 1, 1, 0 )^T + r2 cos(3t − δ2) ( −1, 1, 1 )^T + r3 cos(√12 t − δ3) ( 1, −1, 2 )^T.
9.5.8. The system has stiffness matrix K = ( 1, −1 ) ( c1, 0 ; 0, c2 ) ( 1, −1 )^T = (c1 + c2), and so the dynamical equation is m d²u/dt² + (c1 + c2) u = 0, which is the same as a mass connected to a single spring with stiffness c = c1 + c2.
9.5.9. Yes. For example, c1 = 16, c2 = 36, c3 = 37 leads to K = ( 52, −36 ; −36, 73 ) with eigenvalues λ1 = 25, λ2 = 100, and hence natural frequencies ω1 = 5, ω2 = 10. Since ω2 is a rational multiple of ω1, every solution is periodic with period 2π/5 or π/5. Further examples can be constructed by solving the matrix equation K = ( c1 + c2, −c2 ; −c2, c2 + c3 ) = Q^T Λ Q for c1, c2, c3, where Λ is a diagonal matrix with entries ω², r²ω², where r is any rational number and Q is a suitable orthogonal matrix, making sure that the resulting stiffnesses are all positive: c1, c2, c3 > 0.
♠ 9.5.10. (a) The vibrations slow down. (b) The vibrational frequencies are ω1 = .44504, ω2 = 1.24698, ω3 = 1.80194, each of which is a bit smaller than in the fixed end case, which has frequencies ω1 = √(2 − √2) = .76537, ω2 = √2 = 1.41421, ω3 = √(2 + √2) = 1.84776. (c) Graphs of the motions of the three masses for 0 ≤ t ≤ 50, with and without bottom support (not reproduced).
♠ 9.5.11.
(a) The vibrational frequencies and eigenvectors are
ω1 = √(2 − √2) = .7654, ω2 = √2 = 1.4142, ω3 = √(2 + √2) = 1.8478,
v1 = ( 1, √2, 1 )^T, v2 = ( −1, 0, 1 )^T, v3 = ( 1, −√2, 1 )^T.
Thus, in the slowest mode, all three masses are moving in the same direction, with the middle mass moving √2 times farther; in the middle mode, the two outer masses are moving in opposing directions by equal amounts, while the middle mass remains still; in the fastest mode, the two outer masses are moving in tandem, while the middle mass is moving farther in an opposing direction.
(b) The vibrational frequencies and eigenvectors are
ω1 = .4450, ω2 = 1.2470, ω3 = 1.8019,
v1 = ( .3280, .5910, .7370 )^T, v2 = ( .7370, .3280, −.5910 )^T, v3 = ( −.5910, .7370, −.3280 )^T.
Thus, in the slowest mode, all three masses are moving in the same direction, each slightlyfarther than the one above it; in the middle mode, the top two masses are moving inthe same direction, while the bottom, free mass moves in the opposite direction; in thefastest mode, the top and bottom masses are moving in the same direction, while themiddle mass is moving in an opposing direction.
♥ 9.5.12. Let c be the common spring stiffness. The stiffness matrix K is tridiagonal, with all diagonal entries equal to 2c and all sub- and super-diagonal entries equal to −c. Thus, by Exercise 8.2.48, the vibrational frequencies are
ω_k = √( 2c ( 1 − cos( kπ/(n + 1) ) ) ) = 2 √c sin( kπ / (2(n + 1)) ) for k = 1, ..., n.
As n → ∞, the frequencies form a denser and denser set of points on the graph of 2√c sin θ for 0 ≤ θ ≤ π/2.
♣ 9.5.13. We take "fastest" to mean that the slowest vibrational frequency is as large as possible. Keep in mind that, for a chain between two fixed supports, completely reversing the order of the springs does not change the frequencies. For the indicated springs connecting 2 masses to fixed supports, the order 2, 1, 3, or its reverse, 3, 1, 2, is the fastest, with frequencies 2.14896, 1.54336. For the order 1, 2, 3, the frequencies are 2.49721, 1.32813, while for 1, 3, 2 the lowest frequency is the slowest, with frequencies 2.74616, 1.20773. Note that as the lower frequency slows down, the higher one speeds up. In general, placing the weakest spring in the middle leads to the fastest overall vibrations.
For a system of n springs with stiffnesses c1 > c2 > ... > cn, when the bottom mass is unattached, the fastest vibration, as measured by the minimal vibrational frequency, occurs when the springs are connected in the order c1, c2, ..., cn from stiffest to weakest, with the strongest attached to the support. For fixed supports, numerical computations show that the fastest vibrations occur when the springs are attached in the order cn, cn−3, cn−5, ..., c3, c1, c2, c4, ..., cn−1 when n is odd, and cn, cn−1, cn−4, cn−6, ..., c4, c2, c1, c3, ..., cn−5, cn−3, cn−2 when n is even. Finding analytic proofs of these observations appears to be a challenge.
♣ 9.5.14. (a) d²u/dt² + K u = 0, where K = ( 3/2, −1/2, −1, 0 ; −1/2, 3/2, 0, 0 ; −1, 0, 3/2, 1/2 ; 0, 0, 1/2, 3/2 ) and u(t) = ( u1(t), v1(t), u2(t), v2(t) )^T collects the horizontal and vertical displacements of the two free nodes. (b) 4; (c) ω1 = √(1 − √2/2) = .541196, ω2 = √(2 − √2/2) = 1.13705, ω3 = √(1 + √2/2) = 1.30656, ω4 = √(2 + √2/2) = 1.64533; (d) the corresponding eigenvectors are
v1 = ( −1 − √2, −1, −1 − √2, 1 )^T = ( −2.4142, −1, −2.4142, 1 )^T,
v2 = ( −1 + √2, 1, 1 − √2, 1 )^T = ( .4142, 1, −.4142, 1 )^T,
v3 = ( −1 + √2, −1, −1 + √2, 1 )^T = ( .4142, −1, .4142, 1 )^T,
v4 = ( −1 − √2, 1, 1 + √2, 1 )^T = ( −2.4142, 1, 2.4142, 1 )^T.
In the first mode, the left corner moves down and to the left, while the right corner moves up and to the left, and then they periodically reverse directions; the horizontal motion is proportionately 2.4 times the vertical. In the second mode, both corners periodically move up and towards the center line and then down and away; the vertical motion is proportionately 2.4 times the horizontal. In the third mode, the left corner first moves down and to the right, while the right corner moves up and to the right, periodically reversing their directions; the vertical motion is proportionately 2.4 times the horizontal. In the fourth mode, both corners periodically move up and away from the center line and then down and towards it; the horizontal motion is proportionately 2.4 times the vertical.
(e) u(t) = 1/(4√2) ( −cos(ω1 t) v1 + cos(ω2 t) v2 + cos(ω3 t) v3 − cos(ω4 t) v4 ), which is a quasiperiodic combination of all four normal modes.
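The four frequencies quoted in part (c) can be recomputed directly from the stiffness matrix of part (a); a sketch, assuming NumPy is available:

```python
import numpy as np

# Stiffness matrix from 9.5.14(a); its eigenvalues are 1 -+ sqrt(2)/2 and
# 2 -+ sqrt(2)/2, whose square roots are the vibrational frequencies.
K = np.array([[ 1.5, -0.5, -1.0,  0.0],
              [-0.5,  1.5,  0.0,  0.0],
              [-1.0,  0.0,  1.5,  0.5],
              [ 0.0,  0.0,  0.5,  1.5]])
omegas = np.sort(np.sqrt(np.linalg.eigvalsh(K)))
s = np.sqrt(2) / 2
expected = np.sort(np.sqrt([1 - s, 2 - s, 1 + s, 2 + s]))
ok = bool(np.allclose(omegas, expected))
```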
9.5.15. The system has periodic solutions whenever A has a complex conjugate pair of purelyimaginary eigenvalues. Thus, a quasi-periodic solution requires two such pairs, ± i ω1 and± i ω2, with the ratio ω1/ω2 an irrational number. The smallest dimension where this canoccur is 4.
9.5.16.
(a) u(t) = at + b + 2r cos(√5 t − δ), v(t) = −2at − 2b + r cos(√5 t − δ).
The unstable mode consists of the terms containing a; it will not be excited if the initial conditions satisfy u′(t0) − 2v′(t0) = 0.
(b) u(t) = −3at − 3b + r cos(√10 t − δ), v(t) = at + b + 3r cos(√10 t − δ).
The unstable mode consists of the terms containing a; it will not be excited if the initial conditions satisfy −3u′(t0) + v′(t0) = 0.
(c) u(t) = −2at − 2b − (1 − √13)/4 r1 cos( √((7 + √13)/2) t − δ1 ) − (1 + √13)/4 r2 cos( √((7 − √13)/2) t − δ2 ),
v(t) = −2at − 2b + (3 − √13)/4 r1 cos( √((7 + √13)/2) t − δ1 ) + (3 + √13)/4 r2 cos( √((7 − √13)/2) t − δ2 ),
w(t) = at + b + r1 cos( √((7 + √13)/2) t − δ1 ) + r2 cos( √((7 − √13)/2) t − δ2 ).
The unstable mode is the term containing a; it will not be excited if the initial conditions satisfy −2u′(t0) − 2v′(t0) + w′(t0) = 0.
(d) u(t) = (a1 − 2a2) t + b1 − 2b2 + r cos(√6 t − δ),
v(t) = a1 t + b1 − r cos(√6 t − δ),
w(t) = a2 t + b2 + 2r cos(√6 t − δ).
The unstable modes consist of the terms containing a1 and a2; they will not be excited if the initial conditions satisfy u′(t0) + v′(t0) = 0 and −2u′(t0) + w′(t0) = 0.
9.5.17.
(a) Q = ( −1/√2, 1/√2, 0 ; 0, 0, 1 ; 1/√2, 1/√2, 0 ), Λ = diag( 4, 2, 2 ).
(b) Yes, because K is symmetric and has all positive eigenvalues.
(c) u(t) = ( cos √2 t, (1/√2) sin √2 t, cos √2 t )^T.
(d) The solution u(t) is periodic with period √2 π.
(e) No — since the frequencies 2, √2 are not rational multiples of each other, the general solution is quasi-periodic.
9.5.18.
(a) Q = ( 1/√3, −1/√2, 1/√6 ; −1/√3, 0, 2/√6 ; 1/√3, 1/√2, 1/√6 ), Λ = diag( 3, 2, 0 ).
(b) No — K is only positive semi-definite.
(c) u(t) = ( 1/3 (t + 1) + 2/3 cos √3 t − 1/(3√3) sin √3 t, 2/3 (t + 1) − 2/3 cos √3 t + 1/(3√3) sin √3 t, 1/3 (t + 1) + 2/3 cos √3 t − 1/(3√3) sin √3 t )^T.
(d) The solution u(t) is unstable, and becomes unbounded as |t| → ∞.
(e) No — the general solution is also unbounded.
9.5.19. The solution to the initial value problem m d²u/dt² + ε u = 0, u(t0) = a, u̇(t0) = b, is
u_ε(t) = a cos( √(ε/m) (t − t0) ) + b √(m/ε) sin( √(ε/m) (t − t0) ).
In the limit as ε → 0, using the fact that lim_{h→0} sin(c h)/h = c, we find u_ε(t) → a + b(t − t0), which is the solution to the unrestrained initial value problem m ü = 0, u(t0) = a, u̇(t0) = b. Thus, as the spring stiffness goes to zero, the motion converges to the unrestrained motion. However, since the former solution is periodic, while the latter moves along a straight line, the convergence is non-uniform on all of R and the solutions are close only for a period of time: if you wait long enough they will diverge.
♠ 9.5.20.
(a) Frequencies: ω1 = √( 3/2 − (1/2)√5 ) = .61803, ω2 = 1, ω3 = √( 3/2 + (1/2)√5 ) = 1.618034;
stable eigenvectors: v1 = ( 2−√5, −1, −2+√5, 1 )^T, v2 = ( −1, −1, −1, 1 )^T, v3 = ( 2+√5, 1, −2−√5, 1 )^T; unstable eigenvector: v4 = ( 1, −1, 1, 1 )^T. In the lowest frequency mode, the nodes vibrate up and towards each other and then down and away, the horizontal motion being less pronounced than the vertical; in the next mode, the nodes vibrate in the directions of the diagonal bars, with one moving towards the support while the other moves away; in the highest frequency mode, they vibrate up and away from each other and then down and towards, with the horizontal motion significantly more than the vertical; in the unstable mode the left node moves down and to the right, while the right hand node moves at the same rate up and to the right.
(b) Frequencies: ω1 = .444569, ω2 = .758191, ω3 = 1.06792, ω4 = 1.757; eigenvectors:
v1 = ( .237270, −.117940, .498965, .825123 )^T, v2 = ( −.122385, .973375, −.028695, .191675 )^T,
v3 = ( .500054, .185046, .666846, −.520597 )^T, v4 = ( .823815, .066249, −.552745, .106830 )^T.
In the lowest frequency mode, the left node vibrates down and to the right, while the right hand node moves further up and to the right, then both reversing directions; in the second mode, the nodes vibrate up and to the right, and then down and to the left, the left node moving further; in the next mode, the left node vibrates up and to the right, while the right hand node moves further down and to the right, then both reversing directions; in the highest frequency mode, they move up and towards each other and then down and away, with the horizontal motion more than the vertical.
(c) Frequencies: ω1 = ω2 = √(2/11) = .4264, ω3 = √( 21/11 − (3/11)√5 ) = 1.1399, ω4 = √(20/11) = 1.3484, ω5 = √( 21/11 + (3/11)√5 ) = 1.5871; stable eigenvectors:
v1 = ( 0, 1, 0, 0, 0, 0 )^T, v2 = ( 0, 0, 0, 0, 1, 0 )^T, v3 = ( 1/2 − √5/2, 0, 1, −1/2 + √5/2, 0, 1 )^T,
v4 = ( −1/3, 0, −1, −1/3, 0, 1 )^T, v5 = ( 1/2 + √5/2, 0, 1, −1/2 − √5/2, 0, 1 )^T;
unstable eigenvector: v6 = ( 3, 0, −1, 3, 0, 1 )^T. In the two lowest frequency modes, the individual nodes vibrate horizontally and transverse to the swing; in the next lowest mode, the nodes vibrate together up and away from each other, and then down and towards each other; in the next mode, the nodes vibrate oppositely up and down, and towards and then away from each other; in the highest frequency mode, they also vibrate up and down in opposing motion, but in the same direction along the swing; in the unstable mode the left node moves down and in the direction of the bar, while the right hand node moves at the same rate up and in the same horizontal direction.
♥ 9.5.21. If the mass–spring molecule is allowed to move in space, then the vibrational modes and frequencies remain the same, while there are 14 independent solutions corresponding to the 7 modes of instability: 3 rigid translations, 3 (linearized) rotations, and 1 mechanism, which is the same as in the one-dimensional version. Thus, the general motion of the molecule in space is to vibrate quasi-periodically at frequencies √3 and 1, while simultaneously translating, rigidly rotating, and bending, all at a constant speed.
♠ 9.5.22.
(a) There are 3 linearly independent normal modes of vibration: one of frequency √3, in which the triangle expands and contracts, and two of frequency √(3/2), in which one of the edges expands and contracts while the opposite vertex moves out in the perpendicular direction as the edge contracts, and in as it expands. (Although there are three such modes, the third is a linear combination of the other two.) There are 3 unstable null eigenmodes, corresponding to the planar rigid motions of the triangle. To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel; thus, if vi is the initial velocity of the ith node, we require v1 + v2 + v3 = 0 and v1⊥ + v2⊥ + v3⊥ = 0, where vi⊥ denotes the angular component of the velocity vector with respect to the center of the triangle.
(b) There are 4 normal modes of vibration, all of frequency √2, in which one of the edges expands and contracts while the two vertices not on the edge stay fixed. There are 4 unstable modes: 3 rigid motions and one mechanism where two opposite corners move towards each other while the other two move away from each other. To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel; thus, if the vertices are at ( ±1, ±1 )^T and vi = ( vi, wi )^T is the initial velocity of the ith node, we require v1 + v2 = v3 + v4 = w1 + w4 = w2 + w3 = 0.
(c) There are 6 normal modes of vibration: one of frequency √3, in which three nonadjacent edges expand and then contract, while the other three edges simultaneously contract and then expand; two of frequency √(5/2), in which two opposite vertices move back and forth in the perpendicular direction to the line joining them (only two of these three modes are linearly independent); two of frequency √(3/2), in which two opposite vertices move back and forth towards each other (again, only two of these three modes are linearly independent); and one of frequency 1, in which the entire hexagon expands and contracts. There are 6 unstable modes: 3 rigid motions and 3 mechanisms where two opposite vertices move towards each other while the other four move away. As usual, to avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel; thus, if the vertices are at ( cos(kπ/3), sin(kπ/3) )^T, and vi = ( vi, wi )^T is the initial velocity of the ith node, we require
v1 + v2 + v3 + v4 + v5 + v6 = 0,  w1 + w2 + w3 + w4 + w5 + w6 = 0,
√3 v5 + w5 + √3 v6 + w6 = 0,  −√3 v1 + w1 + 2 w2 = 0,
√3 v1 + w1 + 2 w6 = 0,  2 w3 + √3 v4 + w4 = 0.
♠ 9.5.23. There are 6 linearly independent normal modes of vibration: one of frequency 2, in which the tetrahedron expands and contracts; four of frequency √2, in which one of the edges expands and contracts while the opposite vertex stays fixed; and two of frequency √2, in which two opposite edges move towards and away from each other. (There are three different pairs, but the third mode is a linear combination of the other two.) There are 6 unstable null eigenmodes, corresponding to the three-dimensional rigid motions of the tetrahedron.
To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel, and so, using the result of Exercise 6.3.13, if vi = ( ui, vi, wi )^T is the initial velocity of the ith node, we require
u1 + u2 + u3 + u4 = 0,  v1 + v2 + v3 + v4 = 0,  w1 + w2 + w3 + w4 = 0,
−√2 u1 + √6 v1 − w1 − 2√2 u2 + w2 = 0,
−2 v1 + √3 u2 + v2 − √3 u3 + v3 = 0,
−√2 u1 − √6 v1 − w1 − 2√2 u3 + w3 = 0.
♥ 9.5.24. (a) When C = I, then K = A^T A and so the frequencies ωi = √λi are the square roots of its positive eigenvalues, which, by definition, are the singular values of the reduced incidence matrix. (b) Thus, a structure with one or more very small frequencies ωi ≪ 1, and hence one or more very slow vibrational modes, is almost unstable in that a small perturbation might create a null eigenvalue corresponding to an instability.
9.5.25. Since corng A is the orthogonal complement to ker A = ker K, the initial velocity is orthogonal to all modes of instability, and hence by Theorem 9.38, the solution remains bounded, vibrating around the fixed point prescribed by the initial position.
9.5.26.
(a) u(t) = r1 cos( t/√2 − δ1 ) ( 1, 2 )^T + r2 cos( √(5/3) t − δ2 ) ( −3, 1 )^T,
(b) u(t) = r1 cos( t/√3 − δ1 ) ( 2, 3 )^T + r2 cos( √(8/5) t − δ2 ) ( −5, 2 )^T,
(c) u(t) = r1 cos( √((3−√3)/2) t − δ1 ) ( (1+√3)/2, 1 )^T + r2 cos( √((3+√3)/2) t − δ2 ) ( (1−√3)/2, 1 )^T,
(d) u(t) = r1 cos( t − δ1 ) ( 0, −1, 1 )^T + r2 cos( √2 t − δ2 ) ( 3, 2, 1 )^T + r3 cos( √3 t − δ3 ) ( −3, 2, 1 )^T,
(e) u(t) = r1 cos( √(2/3) t − δ1 ) ( 1, 1 )^T + r2 cos( 2 t − δ2 ) ( −1, 1 )^T,
(f) u(t) = (at + b) ( 2, −1, 1 )^T + r1 cos( t − δ1 ) ( −1, 0, 1 )^T + r2 cos( √3 t − δ2 ) ( 1, −2, 1 )^T.
9.5.27. u1(t) = ((√3 − 1)/(2√3)) cos( √((3−√3)/2) t ) + ((√3 + 1)/(2√3)) cos( √((3+√3)/2) t ),
u2(t) = (1/(2√3)) cos( √((3−√3)/2) t ) − (1/(2√3)) cos( √((3+√3)/2) t ).
9.5.28. u1(t) = ((√17 − 3)/(2√17)) cos( (1/2)√(5−√17) t ) + ((√17 + 3)/(2√17)) cos( (1/2)√(5+√17) t ),
u2(t) = (1/√17) cos( (1/2)√(5−√17) t ) − (1/√17) cos( (1/2)√(5+√17) t ).
♠ 9.5.29. The order does make a difference:
Mass order Frequencies
1, 3, 2 or 2, 3, 1 1.4943, 1.0867, .50281
1, 2, 3 or 3, 2, 1 1.5451, 1.0000, .52843
2, 1, 3 or 3, 1, 2 1.5848, .9158, .56259
Note that, from top to bottom in the table, the fastest and slowest frequencies speed up, but the middle frequency slows down.
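The table can be reproduced numerically. A sketch under the assumption (consistent with the tabulated values) that the chain is three masses joined by four unit springs with both ends fixed, so K is the tridiagonal matrix with 2's on the diagonal and −1's off it, and the frequencies ω solve det(K − ω² M) = 0:

```python
import math

def p(lam, m):
    # det(K - lam*M) for K = [2,-1,0; -1,2,-1; 0,-1,2], M = diag(m1, m2, m3)
    a, b, c = 2 - lam * m[0], 2 - lam * m[1], 2 - lam * m[2]
    return a * (b * c - 1) - c

def frequencies(m, hi=5.0, steps=5000):
    # scan for sign changes of the characteristic polynomial, then bisect
    roots, prev_lam, prev = [], 0.0, p(0.0, m)
    for i in range(1, steps + 1):
        lam = hi * i / steps
        cur = p(lam, m)
        if cur == 0.0:
            roots.append(lam)                  # landed exactly on a root
        elif prev * cur < 0:
            lo_, hi_ = prev_lam, lam
            for _ in range(60):
                mid = 0.5 * (lo_ + hi_)
                if p(lo_, m) * p(mid, m) <= 0:
                    hi_ = mid
                else:
                    lo_ = mid
            roots.append(0.5 * (lo_ + hi_))
        prev_lam, prev = lam, cur
    return sorted(math.sqrt(r) for r in roots)  # omega = sqrt(lambda)

freqs_123 = frequencies([1, 2, 3])
freqs_132 = frequencies([1, 3, 2])
```

The ascending outputs match the rows of the table (reversed order of masses gives the same spectrum, since it merely relabels the chain).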
♣ 9.5.30.
(a) We place the oxygen atom at the origin, one hydrogen at ( 1, 0 )^T and the other at ( cos θ, sin θ )^T = ( −0.2588, 0.9659 )^T with θ = (105/180) π = 1.8326 radians. There are two independent vibrational modes, whose fundamental frequencies are ω1 = 1.0386, ω2 = 1.0229, with corresponding eigenvectors v1 = ( .0555, −.0426, −.7054, 0., −.1826, .6813 )^T, v2 = ( −.0327, −.0426, .7061, 0., −.1827, .6820 )^T. Thus, the (very slightly) higher frequency mode has one hydrogen atom moving towards and the other away from the oxygen, which also slightly vibrates, and then reversing their motion, while in the lower frequency mode, they simultaneously move towards and then away from the oxygen atom.
(b) We place the carbon atom at the origin and the chlorine atoms at
( 2√2/3, 0, −1/3 )^T, ( −√2/3, √(2/3), −1/3 )^T, ( −√2/3, −√(2/3), −1/3 )^T, ( 0, 0, 1 )^T,
which are the vertices of a unit tetrahedron. There are four independent vibrational modes, whose fundamental frequencies are ω1 = ω2 = ω3 = .139683, ω4 = .028571, with corresponding eigenvectors
v1 = ( .2248, −.6510, .6668, .0025, 0., −.0009, −.1042, .1805, −.0737, .0246, .0427, .0174, 0., 0., −.1714 )^T,
v2 = ( −.5998, −.7314, −.1559, .1245, 0., −.0440, −.0318, .0551, −.0225, .1130, .1957, .0799, 0., 0., .0401 )^T,
v3 = ( .9586, 0., 0., −.2191, 0., .0775, −.0548, .0949, −.0387, −.0548, −.0949, −.0387, 0., 0., 0. )^T,
v4 = ( 0., 0., 0., .4714, 0., −.1667, −.2357, .4082, −.1667, −.2357, −.4082, −.1667, 0., 0., .5000 )^T.
The three high frequency modes are where two of the chlorine atoms remain fixed, while the other two vibrate in opposite directions into and away from the carbon atom, which slightly moves in the direction of the incoming atom. (Note: There are six possible pairs, but only three independent modes.) The low frequency mode is where the four chlorine atoms simultaneously move into and away from the carbon atom.
(c) There are six independent vibrational modes, whose fundamental frequencies are ω1 = 2.17533, ω2 = ω3 = 2.05542, ω4 = ω5 = 1.33239, ω6 = 1.12603. In all cases, the bonds periodically lengthen and shorten. In the first mode, adjacent bonds have the opposite behavior; in the next two modes, two diametrically opposite bonds shorten while the other four bonds lengthen, and then the reverse happens; in the next two modes, two opposing nodes move in tandem along the line joining them while the other four twist accordingly; in the lowest frequency mode, the entire molecule expands and contracts. Note: The best way to understand the behavior is to run a movie of the different motions.
9.5.31.
(a) d²/dt² ( x1, y1, x2, y2 )^T + [ 2, 0, −1, 0 ; 0, 0, 0, 0 ; −1, 0, 2, 0 ; 0, 0, 0, 0 ] ( x1, y1, x2, y2 )^T = 0.
Same vibrational frequencies: ω1 = 1, ω2 = √3, along with two unstable mechanisms corresponding to motions of either mass in the transverse direction.
(b) d²/dt² ( x1, y1, x2, y2 )^T + [ 2, 0, −1, 0 ; 0, 0, 0, 0 ; −1, 0, 1, 0 ; 0, 0, 0, 0 ] ( x1, y1, x2, y2 )^T = 0.
Same vibrational frequencies: ω1 = √((3−√5)/2), ω2 = √((3+√5)/2), along with two unstable mechanisms corresponding to motions of either mass in the transverse direction.
(c) For a mass-spring chain with n masses, the two-dimensional motions are a combination of the same n one-dimensional vibrational motions in the longitudinal direction, coupled with n unstable motions of each individual mass in the transverse direction.
9.5.32.
(a) d²/dt² ( x1, y1, z1, x2, y2, z2 )^T + [ 2, 0, 0, −1, 0, 0 ; 0, 0, 0, 0, 0, 0 ; 0, 0, 0, 0, 0, 0 ; −1, 0, 0, 2, 0, 0 ; 0, 0, 0, 0, 0, 0 ; 0, 0, 0, 0, 0, 0 ] ( x1, y1, z1, x2, y2, z2 )^T = 0.
Same vibrational frequencies: ω1 = 1, ω2 = √3, along with four unstable mechanisms corresponding to motions of either mass in the transverse directions.
(b) d²/dt² ( x1, y1, z1, x2, y2, z2 )^T + [ 2, 0, 0, −1, 0, 0 ; 0, 0, 0, 0, 0, 0 ; 0, 0, 0, 0, 0, 0 ; −1, 0, 0, 1, 0, 0 ; 0, 0, 0, 0, 0, 0 ; 0, 0, 0, 0, 0, 0 ] ( x1, y1, z1, x2, y2, z2 )^T = 0.
ω1 = √((3−√5)/2), ω2 = √((3+√5)/2), along with four unstable mechanisms corresponding to motions of either mass in the transverse directions.
(c) For a mass-spring chain with n masses, the three-dimensional motions are a combination of the same n one-dimensional vibrational motions in the longitudinal direction, coupled with 2n unstable motions of each individual mass in the transverse directions.
♦ 9.5.33. Kv = λMv if and only if M⁻¹Kv = λv, and so the eigenvectors and eigenvalues are the same. The characteristic equations are the same up to a multiple, since
det(K − λM) = det[ M(M⁻¹K − λI) ] = det M · det(P − λI), where P = M⁻¹K.
9.5.34.
(a) First, with M = N², d²ũ/dt² = N d²u/dt² = −N M⁻¹ K u = −N⁻¹ K u = −N⁻¹ K N⁻¹ ũ = −K̃ ũ.
Moreover, K̃ is symmetric since K̃^T = N^{−T} K^T N^{−T} = N⁻¹ K N⁻¹ = K̃, since both N and K are symmetric. Positive definiteness follows since
x̃^T K̃ x̃ = x̃^T N⁻¹ K N⁻¹ x̃ = x^T K x > 0 for all x̃ = N x ≠ 0.
(b) Each eigenvalue λ̃ = ω̃² and corresponding eigenvector ṽ of K̃ produces two solutions ũ(t) = cos(ω̃ t) ṽ and ũ(t) = sin(ω̃ t) ṽ to the modified system d²ũ/dt² = −K̃ ũ. The corresponding solutions to the original system are u(t) = N⁻¹ ũ(t) = { cos ω t, sin ω t } v, where ω = ω̃ and v = N⁻¹ ṽ. Finally, we observe that v is the generalized eigenvector for the generalized eigenvalue λ = ω² = λ̃ of the matrix pair K, M. Indeed, K̃ ṽ = λ̃ ṽ implies K v = K N⁻¹ ṽ = N K̃ ṽ = λ̃ N ṽ = λ̃ M v.
♦ 9.5.35. Let v1, …, vk be the eigenvectors corresponding to the non-zero eigenvalues λ1 = ω1², …, λk = ωk², and v_{k+1}, …, vn the null eigenvectors. Then the general solution to the vibrational system ü + K u = 0 is
u(t) = Σ_{i=1}^{k} [ ci cos( ωi (t − t0) ) + di sin( ωi (t − t0) ) ] vi + Σ_{j=k+1}^{n} ( pj + qj (t − t0) ) vj,
which represents a quasi-periodic vibration with frequencies ω1, …, ωk around a linear motion with velocity Σ_{j=k+1}^{n} qj vj. Substituting into the initial conditions u(t0) = a, u̇(t0) = b, and using orthogonality, we conclude that
ci = ⟨a, vi⟩ / ‖vi‖²,  di = ⟨b, vi⟩ / ( ωi ‖vi‖² ),  pj = ⟨a, vj⟩ / ‖vj‖²,  qj = ⟨b, vj⟩ / ‖vj‖².
In particular, the unstable modes are not excited if and only if all qj = 0, which requires that the initial velocity b be orthogonal to the null eigenvectors v_{k+1}, …, vn, which form a basis for the null eigenspace or kernel of K. This requires u̇(t0) = b ∈ (ker K)^⊥ = corng K = rng K, using the fundamental Theorem 2.49 and the symmetry of K.
9.5.36.
(a) u(t) = t e^{−3t}; critically damped.
(b) u(t) = e^{−t} ( cos 3t + (2/3) sin 3t ); underdamped.
(c) u(t) = (1/4) sin 4(t − 1); undamped.
(d) u(t) = (2√3/9) e^{−3t/2} sin( (3√3/2) t ); underdamped.
(e) u(t) = 4 e^{−t/2} − 2 e^{−t}; overdamped.
(f) u(t) = e^{−3t} ( 3 cos t + 7 sin t ); underdamped.
9.5.37. The solution is u(t) = (1/4)(v + 5) e^{−t} − (1/4)(v + 1) e^{−5t}, where v = u̇(0) is the initial velocity. This vanishes when e^{4t} = (v + 1)/(v + 5), which happens when t = t⋆ > 0 provided (v + 1)/(v + 5) > 1, and so the initial velocity must satisfy v < −5.
9.5.38.
(a) By Hooke's Law, the spring stiffness is k = 16/6.4 = 2.5. The mass is 16/32 = .5. The equation of motion is .5 ü + 2.5 u = 0. The natural frequency is ω = √5 = 2.23607.
(b) The solution to the initial value problem .5 ü + u̇ + 2.5 u = 0, u(0) = 2, u̇(0) = 0, is u(t) = e^{−t} ( 2 cos 2t + sin 2t ).
(c) The system is underdamped, and the vibrations are less rapid than the undamped system.
9.5.39. The undamped case corresponds to a center; the underdamped case to a stable focus;the critically damped case to a stable improper node; and the overdamped case to a stablenode.
♦ 9.5.40.
(a) The general solution has the form u(t) = c1 e^{−at} + c2 e^{−bt} for some 0 < a < b. If c1 = 0, c2 ≠ 0, the solution does not vanish. Otherwise, u(t) = 0 if and only if e^{(b−a)t} = −c2/c1, which, since e^{(b−a)t} is monotonic, happens for at most one time t = t⋆.
(b) Yes, since the solution is u(t) = (c1 + c2 t) e^{−at} for some a > 0, which, for c2 ≠ 0, only vanishes when t = −c1/c2.
9.5.41. The general solution to m d²u/dt² + β du/dt = 0 is u(t) = c1 + c2 e^{−βt/m}. Thus, the mass approaches its equilibrium position u⋆ = c1, which can be anywhere, at an exponentially fast rate.
9.6.1.
(a) cos 8t − cos 9t = 2 sin(t/2) sin(17t/2); fast frequency: 17/2, beat frequency: 1/2.
(b) cos 26t − cos 24t = −2 sin t sin 25t; fast frequency: 25, beat frequency: 1.
(c) cos 10t + cos 9.5t = 2 cos(.25t) cos(9.75t); fast frequency: 9.75, beat frequency: .25.
(d) cos 5t − sin 5.2t = 2 sin( .1t − π/4 ) sin( 5.1t − π/4 ); fast frequency: 5.1, beat frequency: .1.
(Each part is accompanied by a graph of the oscillation exhibiting the beats.)
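The beat factorizations can be spot-checked numerically; a minimal sketch for parts (a) and (d):

```python
import math

def lhs_a(t):
    return math.cos(8 * t) - math.cos(9 * t)

def rhs_a(t):
    # slow factor (beat envelope) times fast factor
    return 2 * math.sin(t / 2) * math.sin(17 * t / 2)

def lhs_d(t):
    return math.cos(5 * t) - math.sin(5.2 * t)

def rhs_d(t):
    return 2 * math.sin(0.1 * t - math.pi / 4) * math.sin(5.1 * t - math.pi / 4)

samples = [0.3 * k for k in range(50)]
max_err_a = max(abs(lhs_a(t) - rhs_a(t)) for t in samples)
max_err_d = max(abs(lhs_d(t) - rhs_d(t)) for t in samples)
```

Both maximum discrepancies sit at floating-point roundoff level, confirming the identities pointwise.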
9.6.2.
(a) u(t) = (1/27) cos 3t − (1/27) cos 6t,
(b) u(t) = (35/50) t e^{−3t} − (4/50) e^{−3t} + (4/50) cos t + (3/50) sin t,
(c) u(t) = (1/2) sin 2t + e^{−t/2} ( cos( (√15/2) t ) − (√15/5) sin( (√15/2) t ) ),
(d) u(t) = cos 3t − (1/2) t cos 3t − (1/6) sin 3t,
(e) u(t) = (1/5) cos(t/2) + (3/5) sin(t/2) + (9/5) e^{−t} + e^{−t/2},
(f) u(t) = −(1/10) cos t + (1/5) sin t + (1/4) e^{−t} − (3/20) e^{−t/3}.
9.6.3.
(a) u(t) = (1/3) cos 4t + (2/3) cos 5t + (1/5) sin 5t; undamped periodic motion with fast frequency 4.5 and beat frequency .5.
(b) u(t) = 3 cos 5t + 4 sin 5t − e^{−2t} ( 3 cos 6t + (13/3) sin 6t ); the transient is an underdamped motion; the persistent motion is periodic of frequency 5 and amplitude 5.
(c) u(t) = −(60/29) cos 2t + (5/29) sin 2t − (56/29) e^{−5t} + 8 e^{−t}; the transient is an overdamped motion; the persistent motion is periodic.
(d) u(t) = (1/32) sin 4t − (1/8) t cos 4t; resonant, unbounded motion.
(Each part is accompanied by a graph of the corresponding solution.)
9.6.4. In general, by (9.102), the maximal allowable amplitude is α = √( m²(ω² − η²)² + β²η² ) = √( 625 η⁴ − 49.9999 η² + 1 ), which, in the particular cases is (a) .0975, (b) .002, (c) .1025.
9.6.5. η ≤ .14142 or η ≥ .24495.
9.6.6. β ≥ 5 √( 2 − √3 ) = 2.58819.
9.6.7. The solution to .5 ü + u̇ + 2.5 u = 2 cos 2t, u(0) = 2, u̇(0) = 0, is
u(t) = (4/17) cos 2t + (16/17) sin 2t + e^{−t} ( (30/17) cos 2t − (1/17) sin 2t )
= .9701 cos(2t − 1.3258) + 1.7657 e^{−t} cos(2t + .0333).
The solution consists of a persistent periodic vibration at the forcing frequency of 2, with a phase lag of tan⁻¹ 4 = 1.32582 and amplitude 4/√17 = .97014, combined with a transient vibration at the same frequency with exponentially decreasing amplitude.
♠ 9.6.8.
(a) Yes, the same fast oscillations and beats can be observed graphically, for example in the graph of cos t − .5 cos 1.1t on the interval 0 ≤ t ≤ 300. To prove this observation, we invoke the trigonometric identity
a cos ηt − b cos ωt = (a − b) cos( ((ω + η)/2) t ) cos( ((ω − η)/2) t ) + (a + b) sin( ((ω + η)/2) t ) sin( ((ω − η)/2) t ) = R(t) cos( ((ω + η)/2) t − θ(t) ),
where R(t), θ(t) are the polar coordinates of the point
( (a − b) cos( ((ω − η)/2) t ), (a + b) sin( ((ω − η)/2) t ) )^T = ( R(t) cos θ(t), R(t) sin θ(t) )^T,
and represent, respectively, the envelope or slowly varying amplitude of the oscillations, i.e., the beats, and a periodically varying phase shift.
(b) Beats are still observed, but the larger | a − b | is — as prescribed by the initial conditions — the less pronounced the variation in the beat envelope. Also, when a ≠ b, the fast oscillations are no longer precisely periodic, but exhibit a slowly varying phase shift over the period of the beat envelope.
♦ 9.6.9. (a) u(t) = α ( cos ηt − cos ωt ) / ( m (ω² − η²) ), (b) u(t) = ( αt / (2mω) ) sin ωt.
(c) Use l'Hôpital's rule, differentiating with respect to η, to compute
lim_{η→ω} α ( cos ηt − cos ωt ) / ( m (ω² − η²) ) = lim_{η→ω} α t sin ηt / (2mη) = ( αt / (2mω) ) sin ωt.
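The limit in (c) can also be checked numerically, without l'Hôpital; a sketch with the illustrative values α = m = 1, ω = 2 chosen purely for the check:

```python
import math

alpha, m, omega = 1.0, 1.0, 2.0

def u_forced(eta, t):
    # near-resonant solution from part (a), valid for eta != omega
    return alpha * (math.cos(eta * t) - math.cos(omega * t)) / (m * (omega**2 - eta**2))

def u_resonant(t):
    # resonant solution from part (b); amplitude grows linearly in t
    return alpha * t * math.sin(omega * t) / (2 * m * omega)

t = 1.5
# as eta -> omega, the near-resonant solution approaches the resonant one
errors = [abs(u_forced(omega - d, t) - u_resonant(t)) for d in (1e-2, 1e-4, 1e-6)]
```

The discrepancies shrink roughly linearly with ω − η, as the l'Hôpital computation predicts.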
♦ 9.6.10. Using the method of undetermined coefficients, we set
u⋆(t) = A cos ηt + B sin ηt.
Substituting into the differential equation (9.101), and then equating coefficients of cos ηt, sin ηt, we find
m (ω² − η²) A + βη B = α,  −βη A + m (ω² − η²) B = 0,
where we replaced k = mω². Thus,
A = α m (ω² − η²) / ( m²(ω² − η²)² + β²η² ),  B = α βη / ( m²(ω² − η²)² + β²η² ).
We then put the resulting solution in phase-amplitude form
u⋆(t) = a cos(ηt − ε),
where, according to (2.7), A = a cos ε, B = a sin ε, which implies (9.102–103).
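The formulas for A and B can be verified by substituting u⋆ back into the left-hand side m ü + β u̇ + k u and checking that the residual against α cos ηt vanishes. A sketch with the illustrative parameters m = 1, β = 0.5, ω = 3 (so k = 9), η = 2, α = 1:

```python
import math

m, beta, omega, eta, alpha = 1.0, 0.5, 3.0, 2.0, 1.0
k = m * omega**2

den = m**2 * (omega**2 - eta**2)**2 + beta**2 * eta**2
A = alpha * m * (omega**2 - eta**2) / den
B = alpha * beta * eta / den

def residual(t):
    # m u'' + beta u' + k u - alpha cos(eta t) for u = A cos + B sin
    u = A * math.cos(eta * t) + B * math.sin(eta * t)
    du = -A * eta * math.sin(eta * t) + B * eta * math.cos(eta * t)
    ddu = -eta**2 * u
    return m * ddu + beta * du + k * u - alpha * math.cos(eta * t)

max_res = max(abs(residual(0.1 * j)) for j in range(100))
amplitude = alpha / math.sqrt(den)   # a = sqrt(A^2 + B^2), as in (9.102)
```

Note that a = √(A² + B²) simplifies to α/√( m²(ω² − η²)² + β²η² ), the amplitude formula used in Exercise 9.6.4.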
9.6.11.(a) Underdamped, (b) overdamped, (c) critically damped, (d) underdamped, (e) underdamped.
9.6.12. (a) u(t) = e^{−t/4} cos(t/4) + e^{−t/4} sin(t/4), (b) u(t) = (3/2) e^{−t/3} − (1/2) e^{−t},
(c) u(t) = e^{−t/3} + (1/3) t e^{−t/3}, (d) u(t) = e^{−t/5} cos(t/10) + 2 e^{−t/5} sin(t/10),
(e) u(t) = e^{−t/2} cos( (√3/2) t ) + √3 e^{−t/2} sin( (√3/2) t ).
9.6.13. u(t) = (165/41) e^{−t/4} cos(t/4) − (91/41) e^{−t/4} sin(t/4) − (124/41) cos 2t + (32/41) sin 2t
= 4.0244 e^{−.25t} cos .25t − 2.2195 e^{−.25t} sin .25t − 3.0244 cos 2t + .7805 sin 2t.
9.6.14. The natural vibrational frequency is ω = 1/√(RC). If η ≠ ω, then the circuit experiences a quasi-periodic vibration as a combination of the two frequencies. As η gets close to ω, the current amplitude becomes larger and larger, exhibiting beats. When η = ω, the circuit is in resonance, and the current amplitude grows without bound.
9.6.15. (a) .02, (b) 2.8126, (c) 26.25.
9.6.16. η ≤ .03577 or η ≥ .04382.
9.6.17. R ≥ .10051.
9.6.18.
(a) u(t) = cos t ( 3/14, 1/7 )^T + r1 cos( 2√2 t − δ1 ) ( −2, 1 )^T + r2 cos( √3 t − δ2 ) ( 1, 2 )^T,
(b) u(t) = sin 3t ( 1/2, −1 )^T + r1 cos( √(4+√5) t − δ1 ) ( −1−√5, 2 )^T + r2 cos( √(4−√5) t − δ2 ) ( −1+√5, 2 )^T,
(c) u(t) = ( (1/2) t sin 2t + (1/3) cos 2t, (3/4) t sin 2t )^T + r1 cos( √17 t − δ1 ) ( −3, 2 )^T + r2 cos( 2t − δ2 ) ( 2, 3 )^T,
(d) u(t) = cos(t/2) ( 2/17, −12/17 )^T + r1 cos( √(5/3) t − δ1 ) ( −3, 1 )^T + r2 cos( t/√2 − δ2 ) ( 1, 2 )^T,
(e) u(t) = cos t ( 1/3, −1/3 )^T + sin 2t ( 1/6, −2/3 )^T + r1 cos( √(8/5) t − δ1 ) ( −5, 2 )^T + r2 cos( t/√3 − δ2 ) ( 2, 3 )^T,
(f) u(t) = cos t ( 6/11, 5/11, 1/11 )^T + r1 cos( √12 t − δ1 ) ( 1, −1, 2 )^T + r2 cos( 3t − δ2 ) ( −1, 1, 1 )^T + r3 cos( √2 t − δ3 ) ( 1, 1, 0 )^T,
(g) u(t) = cos t ( 1/8, 3/8, 0 )^T + r1 cos( √3 t − δ1 ) ( −3, 2, 1 )^T + r2 cos( √2 t − δ2 ) ( 3, 2, 1 )^T + r3 cos( t − δ3 ) ( 0, −1, 1 )^T.
9.6.19.
(a) The resonant frequencies are √((3−√3)/2) = .796225 and √((3+√3)/2) = 1.53819.
(b) For example, a forcing function of the form cos( √((3+√3)/2) t ) w, where w = ( w1, w2 )^T is not orthogonal to the eigenvector ( −1−√3, 1 )^T, so w2 ≠ (1 + √3) w1, will excite resonance.
9.6.20. When the bottom support is removed, the resonant frequencies are (1/2)√(5−√17) = .468213 and (1/2)√(5+√17) = 1.51022. When the top support is removed, the resonant frequencies are √((2−√2)/2) = .541196 and √((2+√2)/2) = 1.30656. In both cases the vibrations are slower. The previous forcing function will not excite resonance.
♣ 9.6.21. In each case, you need to force the system by cos(ωt) a where ω² = λ is an eigenvalue and a is not orthogonal to the corresponding eigenvector. In order not to excite an instability, a needs to also be orthogonal to the kernel of the stiffness matrix spanned by the unstable mode vectors.
(a) Resonant frequencies: ω1 = .5412, ω2 = 1.1371, ω3 = 1.3066, ω4 = 1.6453;
eigenvectors: v1 = ( .6533, .2706, .6533, −.2706 )^T, v2 = ( .2706, .6533, −.2706, .6533 )^T, v3 = ( .2706, −.6533, .2706, .6533 )^T, v4 = ( −.6533, .2706, .6533, .2706 )^T;
no unstable modes.
(b) Resonant frequencies: ω1 = .4209, ω2 = 1 (double), ω3 = 1.2783, ω4 = 1.6801, ω5 = 1.8347; eigenvectors:
v1 = ( .6626, .1426, .6626, −.1426, .2852, 0 )^T, v2 = ( 0, −1, 0, 1, 1, 0 )^T, v̂2 = ( 0, 1, 0, 1, 0, 1 )^T, v3 = ( −.5000, −.2887, .5000, −.2887, 0, .5774 )^T, v4 = ( .2470, −.3825, .2470, .3825, −.7651, 0 )^T, v5 = ( .5000, −.2887, −.5000, −.2887, 0, .5774 )^T;
no unstable modes.
no unstable modes.(c) Resonant frequencies: ω1 = .3542, ω2 = .9727, ω3 = 1.0279, ω4 = 1.6894, ω5 = 1.7372;
eigenvectors:
v1 =
0BBBBBBB@
− .0989− .0706
0− .9851
.0989− .0706
1CCCCCCCA
, v2 =
0BBBBBBB@
− .1160.6780.2319
0− .1160− .6780
1CCCCCCCA
, v3 =
0BBBBBBB@
.1251− .6940
0.0744− .1251− .6940
1CCCCCCCA
, v4 =
0BBBBBBB@
.3914
.2009− .7829
0.3914− .2009
1CCCCCCCA
, v5 =
0BBBBBBB@
.6889
.11580
− .1549− .6889
.1158
1CCCCCCCA
;
unstable mode: z = ( 1, 0, 1, 0, 1, 0 )T . To avoid exciting the unstable mode, the initialvelocity must be orthogonal to the null eigenvector: z · ¦
u(t0) = 0, i.e., there is no nethorizontal velocity of the masses.
(d) Resonant frequencies: ω1 = √(3/2) = 1.22474 (double), ω2 = √3 = 1.73205;
eigenvectors: v1 = ( 1/2, −√3/2, 1/2, √3/2, 1, 0 )^T, v̂1 = ( −√3/2, −1/2, √3/2, −1/2, 0, 1 )^T, v2 = ( 0, −2, −√3, 1, √3, 1 )^T;
unstable modes: z1 = ( 1, 0, 1, 0, 1, 0 )^T, z2 = ( 0, 1, 0, 1, 0, 1 )^T, z3 = ( −√3/2, 1/2, 0, 1, 0, 0 )^T.
To avoid exciting the unstable modes, the initial velocity must be orthogonal to the null eigenvectors: z1 · u̇(t0) = z2 · u̇(t0) = z3 · u̇(t0) = 0, i.e., there is no net linear or angular velocity of the three masses.
(e) Resonant frequencies: ω1 = 1, ω2 = √3 = 1.73205;
eigenvectors: v1 = ( 1, 0, −1 )^T, v2 = ( 1, −2, 1 )^T; unstable mode: z = ( 1, 1, 1 )^T.
To avoid exciting the unstable mode, the initial velocity must be orthogonal to the null eigenvector: z · u̇(t0) = 0, i.e., there is no net horizontal velocity of the atoms.
(f) Resonant frequencies: ω1 = 1.0386, ω2 = 1.0229;
eigenvectors: v1 = ( .0555, −.0426, −.7054, 0, −.1826, .6813 )^T, v2 = ( −.0327, −.0426, .7061, 0, −.1827, .6820 )^T;
unstable modes: z1 = ( 1, 0, 1, 0, 1, 0 )^T, z2 = ( 0, 1, 0, 1, 0, 1 )^T, z3 = ( 0, 0, 0, 1, 0, 0 )^T, z4 = ( 0, 0, 0, 0, sin 105°, −cos 105° )^T = ( 0, 0, 0, 0, .9659, .2588 )^T.
To avoid exciting the unstable modes, the initial velocity must be orthogonal to the null eigenvectors: z1 · u̇(t0) = z2 · u̇(t0) = z3 · u̇(t0) = z4 · u̇(t0) = 0, i.e., there is no net linear velocity of the three atoms, and neither hydrogen atom has an angular velocity component around the oxygen.
Solutions — Chapter 10
10.1.1.(a) u(1) = 2, u(10) = 1024, u(20) = 1048576; unstable.
(b) u(1) = −.9, u(10) = .348678, u(20) = .121577; asymptotically stable.
(c) u(1) = i , u(10) = −1, u(20) = 1; stable.
(d) u(1) = 1− 2 i , u(10) = 237 + 3116 i , u(20) = −9653287 + 1476984 i ; unstable.
10.1.2.
(a) u(k+1) = 1.0325 u(k), u(0) = 100, where u(k) represents the balance after k years.
(b) u(10) = 1.0325^10 × 100 = 137.69 dollars.
(c) u(k+1) = (1 + .0325/12) u(k) = 1.002708 u(k), u(0) = 100, where u(k) represents the balance after k months. u(120) = (1 + .0325/12)^120 × 100 = 138.34 dollars.
10.1.3. If r is the yearly interest rate, then u(k+1) = (1 + r/12) u(k), where u(k) represents the balance after k months. Let v(m) denote the balance after m years, so v(m) = u(12m). Thus v(m) satisfies the iterative system v(m+1) = (1 + r/12)^12 v(m) = (1 + s) v(m), where s = (1 + r/12)^12 − 1 is the effective annual interest rate.
10.1.4. The balance after k years coming from compounding n times per year is (1 + r/n)^{nk} a → e^{rk} a as n → ∞, by a standard calculus limit, [2, 58].
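The figures in Exercises 10.1.2–10.1.4 are easy to confirm by direct iteration; a short sketch:

```python
import math

# yearly compounding at 3.25%: u(k+1) = 1.0325 u(k), u(0) = 100
u = 100.0
for _ in range(10):
    u *= 1.0325              # balance after 10 years

# monthly compounding: u(k+1) = (1 + .0325/12) u(k), over 120 months
v = 100.0
for _ in range(120):
    v *= 1 + 0.0325 / 12     # balance after the same 10 years

# effective annual rate from 10.1.3, and the continuous limit from 10.1.4
s = (1 + 0.0325 / 12) ** 12 - 1
cont = 100.0 * math.exp(0.0325 * 10)
```

Monthly compounding beats yearly, and continuous compounding e^{rk} a is the n → ∞ upper limit.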
10.1.5. Since u(t) = a e^{αt}, we have u(k+1) = u((k+1)h) = a e^{α(k+1)h} = e^{αh} ( a e^{αkh} ) = e^{αh} u(k), and so λ = e^{αh}. The stability properties are the same: |λ| < 1 for asymptotic stability; |λ| ≤ 1 for stability; |λ| > 1 for an unstable system.
10.1.6. The solution u(k) = λ^k u(0) is periodic of period m if and only if λ^m = 1, and hence λ is an mth root of unity. Thus, λ = e^{2ikπ/m} for some k = 0, 1, 2, …, m − 1. If k and m have a common factor, then the solution is of smaller period, and so the solutions of period exactly m are when k is relatively prime to m and λ is a primitive mth root of unity, as defined in Exercise 5.7.7.
♠ 10.1.7. Let λ = e^{iθ} where 0 ≤ θ < 2π. The solution is then u(k) = a λ^k = a e^{ikθ}. If θ is a rational multiple of π, the solution is periodic, as in Exercise 10.1.6. When θ/π is irrational, the iterates eventually fill up (i.e., are dense in) the circle of radius | a | in the complex plane.
10.1.8. |u(k)| = |λ|^k |a| > |v(k)| = |µ|^k |b| provided k > ( log|b| − log|a| ) / ( log|λ| − log|µ| ), where the inequality relies on the fact that log|λ| > log|µ|.
10.1.9. The equilibrium solution is u⋆ = c/(1 − λ). Then v(k) = u(k) − u⋆ satisfies the homogeneous system v(k+1) = λ v(k), and so v(k) = λ^k v(0) = λ^k (a − u⋆). Thus, the solution to (10.5) is u(k) = λ^k (a − u⋆) + u⋆. If |λ| < 1, then the equilibrium is asymptotically stable, with u(k) → u⋆ as k → ∞; if |λ| = 1, it is stable, and solutions that start near u⋆ stay nearby; if |λ| > 1, it is unstable, and all non-equilibrium solutions become unbounded: |u(k)| → ∞.
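The closed form λ^k (a − u⋆) + u⋆ can be checked against direct iteration; a sketch using the data of Exercise 10.1.10 (λ = 1.05, c = 120, a = 0):

```python
lam, c, a = 1.05, 120.0, 0.0
u_star = c / (1 - lam)            # equilibrium, here -2400

def closed_form(k):
    return lam**k * (a - u_star) + u_star

u, iterates = a, [a]
for _ in range(50):
    u = lam * u + c               # u(k+1) = lam u(k) + c
    iterates.append(u)
```

Since λ > 1 the equilibrium is unstable and the iterates grow without bound, exactly as the closed form predicts.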
10.1.10. Let u(k) represent the balance after k years. Then u(k+1) = 1.05 u(k) + 120, with u(0) = 0. The equilibrium solution is u⋆ = −120/.05 = −2400, and so after k years the balance is u(k) = (1.05^k − 1) · 2400. Then
u(10) = $1,509.35, u(50) = $25,121.76, u(200) = $4,149,979.40.
10.1.11. If u(k) represents the balance after k months, then u(k+1) = (1 + .05/12) u(k) + 10, u(0) = 0. The balance after k months is u(k) = (1.0041667^k − 1) · 2400. Then
u(120) = $1,552.82, u(600) = $26,686.52, u(2400) = $5,177,417.44.
♥ 10.1.12. The overall yearly growth rate is 1.2 − .05 = 1.15, and so the deer population satisfies
the iterative system u(k+1) = 1.15 u(k) − 3600, u(0) = 20000. The equilibrium is u? =3600/.15 = 24000. Since λ = 1.15 > 1, the equilibrium is unstable; if the initial number ofdeer is less than the equilibrium, the population will decrease to zero, while if it is greater,then the population will increase without limit. Two possible options: ban hunting for 2years until the deer population reaches equilibrium of 24, 000 and then permit hunting atthe current rate again. Or to keep the population at 20, 000 allow hunting of only 3, 000deer per year. In both cases, the instability of the equilibrium makes it unlikely that thepopulation will maintain a stable number, so constant monitoring of the deer populationis required. (More realistic models incorporate nonlinear terms, and are less prone to suchinstabilities.)
10.1.13.
(a) u(k) = ( 3^k + (−1)^k ) / 2, v(k) = ( −3^k + (−1)^k ) / 2.
(b) u(k) = −20/2^k + 18/3^k, v(k) = −15/2^k + 18/3^k.
(c) u(k) = ( (√5 + 2)(3 − √5)^k + (√5 − 2)(3 + √5)^k ) / (2√5), v(k) = ( (3 − √5)^k − (3 + √5)^k ) / (2√5).
(d) u(k) = −8 + 27/2^k − 18/3^k, v(k) = −4 + 1/3^{k−1}, w(k) = 1/3^k.
(e) u(k) = 1 − 2^k, v(k) = 1 + 2(−1)^k − 2^{k+1}, w(k) = 4(−1)^k − 2^k.
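Such closed forms are easy to check by iterating the underlying system. The coefficient matrices are not restated in this solution; for (a) the stated formulas are consistent with the (assumed) system u(k+1) = u(k) − 2 v(k), v(k+1) = −2 u(k) + v(k), u(0) = 1, v(0) = 0, whose matrix has eigenvalues 3 and −1. A sketch:

```python
# iterate the assumed system and compare with the closed-form solution of (a)
u, v = 1, 0
seq = [(u, v)]
for _ in range(8):
    u, v = u - 2 * v, -2 * u + v   # simultaneous update via tuple assignment
    seq.append((u, v))

def exact(k):
    # closed form: u(k) = (3^k + (-1)^k)/2, v(k) = (-3^k + (-1)^k)/2
    return ((3**k + (-1)**k) // 2, (-(3**k) + (-1)**k) // 2)
```

Integer arithmetic makes the comparison exact; any mismatch in the reconstructed system would show up immediately.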
10.1.14.
(a) u(k) = c1 (−1 − √2)^k ( −√2, 1 )^T + c2 (−1 + √2)^k ( √2, 1 )^T;
(b) u(k) = c1 ( 1/2 + (√3/2) i )^k ( (5 − i√3)/2, 1 )^T + c2 ( 1/2 − (√3/2) i )^k ( (5 + i√3)/2, 1 )^T
= a1 ( (5/2) cos(kπ/3) + (√3/2) sin(kπ/3), cos(kπ/3) )^T + a2 ( (5/2) sin(kπ/3) − (√3/2) cos(kπ/3), sin(kπ/3) )^T;
(c) u(k) = c1 ( 1, 2, 0 )^T + c2 (−2)^k ( 2, 3, 2 )^T + c3 (−3)^k ( 2, 3, 3 )^T;
(d) u(k) = c1 (−1/2)^k ( 1, 1, 0 )^T + c2 (−1/3)^k ( 1, 2, 1 )^T + c3 (1/6)^k ( 0, 1, 2 )^T.
279
10.1.15. (a) It suffices to note that the Lucas numbers are the general Fibonacci numbers (10.16)
when a = L(0) = 2, b = L(1) = 1. (b) 2, 1, 3, 4, 7, 11, 18. (c) Because the first two are in-
tegers and so, by induction, L(k+2) = L(k+1) + L(k) is an integer whenever L(k+1), L(k) areintegers.
10.1.16. The second summand satisfies | −(1/√5) ( (1 − √5)/2 )^k | < .448 × .62^k < .5 for all k ≥ 0. Since the sum is an integer, adding the second summand to the first is the same as rounding the first off to the nearest integer.
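This rounding observation yields the familiar shortcut for the Fibonacci numbers u(k) (taken here with u(0) = 0, u(1) = 1); a quick numerical check:

```python
import math

phi = (1 + math.sqrt(5)) / 2          # golden ratio, the dominant eigenvalue

def fib(k):
    # direct iteration of the Fibonacci recurrence
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

# dropping the second summand and rounding reproduces u(k) exactly
rounded = [round(phi**k / math.sqrt(5)) for k in range(26)]
```

The bound .448 × .62^k < .5 guarantees the rounding never lands on the wrong integer.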
10.1.17. u(−k) = (−1)^{k+1} u(k). Indeed, since 1 / ( (1 + √5)/2 ) = (−1 + √5)/2,
u(−k) = (1/√5) [ ( (1 + √5)/2 )^{−k} − ( (1 − √5)/2 )^{−k} ]
= (1/√5) [ ( (−1 + √5)/2 )^k − ( (−1 − √5)/2 )^k ]
= (1/√5) [ (−1)^k ( (1 − √5)/2 )^k − (−1)^k ( (1 + √5)/2 )^k ]
= ( (−1)^{k+1} / √5 ) [ ( (1 + √5)/2 )^k − ( (1 − √5)/2 )^k ] = (−1)^{k+1} u(k).
10.1.18.
(a) ( 5 2 ; 2 2 )^k = ( 2 −1 ; 1 2 ) ( 6^k 0 ; 0 1 ) ( 2/5 1/5 ; −1/5 2/5 ),
(b) ( 4 1 ; −2 1 )^k = ( −1 −1 ; 1 2 ) ( 3^k 0 ; 0 2^k ) ( −2 −1 ; 1 1 ),
(c) ( 1 1 ; −1 1 )^k = ( −i i ; 1 1 ) ( (1+i)^k 0 ; 0 (1−i)^k ) ( i/2 1/2 ; −i/2 1/2 ),
(d) ( 1 1 2 ; 1 2 1 ; 2 1 1 )^k = ( 1 1 −1 ; 1 −2 0 ; 1 1 1 ) ( 4^k 0 0 ; 0 1 0 ; 0 0 (−1)^k ) ( 1/3 1/3 1/3 ; 1/6 −1/3 1/6 ; −1/2 0 1/2 ),
(e) ( 0 1 0 ; 0 0 1 ; −1 0 2 )^k =
 ( (3−√5)/2 (3+√5)/2 1 ; (−1+√5)/2 (−1−√5)/2 1 ; 1 1 1 )
 × ( ((1+√5)/2)^k 0 0 ; 0 ((1−√5)/2)^k 0 ; 0 0 1 )
 × ( (−5−3√5)/10 (−5−√5)/10 (5+2√5)/5 ; (−5+3√5)/10 (−5+√5)/10 (5−2√5)/5 ; 1 1 −1 ).
10.1.19.
(a) ( u(k), v(k) )^T = ( 2 −1 ; 1 2 ) ( −(2/5) 6^k, −1/5 )^T,
(b) ( u(k), v(k) )^T = ( −1 −1 ; 1 2 ) ( 3^k, −2^{k+1} )^T,
(c) ( u(k), v(k) )^T = ( −i i ; 1 1 ) ( −i (1+i)^k, (1−i)^k )^T,
(d) ( u(k), v(k), w(k) )^T = ( 1 1 −1 ; 1 −2 0 ; 1 1 1 ) ( (2/3) 4^k, 1/3, 0 )^T,
(e) ( u(k), v(k), w(k) )^T = ( (3−√5)/2 (3+√5)/2 1 ; (−1+√5)/2 (−1−√5)/2 1 ; 1 1 1 ) ( ((−5+3√5)/10) ((1+√5)/2)^k, ((−5+√5)/10) ((1−√5)/2)^k, −1 )^T.
10.1.20. (a) Since the coefficient matrix T has all integer entries, its product T u with any vector with integer entries also has integer entries; (b) c1 = −2, c2 = 3, c3 = −3;
(c) u(1) = ( 4, −2, −2 )^T, u(2) = ( −26, 10, −2 )^T, u(3) = ( 76, −32, 16 )^T, u(4) = ( −164, 76, −44 )^T, u(5) = ( 304, −152, 88 )^T.
10.1.21. The vectors u(k) = ( u(k), u(k+1), . . . , u(k+j−1) )^T ∈ R^j satisfy u(k+1) = T u(k), where
T = ( 0 1 0 0 . . . 0 ; 0 0 1 0 . . . 0 ; 0 0 0 1 . . . 0 ; . . . ; 0 0 0 0 . . . 1 ; c_j c_{j−1} c_{j−2} c_{j−3} . . . c_1 ).
The initial conditions are u(0) = a = ( a_0, a_1, . . . , a_{j−1} )^T, and so u(0) = a_0, u(1) = a_1, . . . , u(j−1) = a_{j−1}.
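A sketch of this reduction for a concrete case, the Fibonacci recurrence u(k+2) = u(k+1) + u(k), i.e. j = 2, c1 = c2 = 1 (numpy assumed):

```python
import numpy as np

# Companion matrix: 1's on the superdiagonal, coefficients in the last row.
T = np.array([[0, 1],
              [1, 1]])        # u(k+2) = 1*u(k+1) + 1*u(k)
u = np.array([0, 1])          # (u(0), u(1)) = (a0, a1)

for _ in range(10):
    u = T @ u
print(u)   # [55 89] = (u(10), u(11))
```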
10.1.22. (a) u(k) = 4/3 − (1/3)(−2)^k, (b) u(k) = (1/3)^{k−1} + (−1/4)^{k−1},
(c) u(k) = [ (5 − 3√5)(2 + √5)^k + (5 + 3√5)(2 − √5)^k ] / 10,
(d) u(k) = ( 1/2 − i )(1 + i)^k + ( 1/2 + i )(1 − i)^k = 2^{k/2} ( cos(kπ/4) + 2 sin(kπ/4) ),
(e) u(k) = −1/2 − (1/2)(−1)^k + 2^k, (f) u(k) = −1 + ( 1 + (−1)^k ) 2^{k/2−1}.
♣ 10.1.23. (a) u(k+3) = u(k+2) + u(k+1) + u(k), (b) u(4) = 2, u(5) = 4, u(6) = 7, u(7) = 13,
(c) u(k) ≈ .183 × 1.839^k + 2 Re [ (−.0914018 + .340547 i)(−.419643 + .606291 i)^k ]
 = .183 × 1.839^k − .737353^k ( .182804 cos 2.17623 k + .681093 sin 2.17623 k ).
♣ 10.1.24. (a) u(k) = u(k−1) + u(k−2) − u(k−8). (b) 0, 1, 1, 2, 3, 5, 8, 13, 21, 33, 53, 84, 134, 213, 339, 539, 857, 1363, 2167, . . . .
(c) u(k) = ( u(k), u(k+1), . . . , u(k+7) )^T satisfies u(k+1) = A u(k), where the 8 × 8 coefficient matrix A has 1's on the superdiagonal, last row ( −1, 0, 0, 0, 0, 0, 1, 1 ), and all other entries 0.
(d) The growth rate is given by the largest eigenvalue in magnitude: λ1 = 1.59, with u(k) ∝ 1.59^k. For more details, see [34].
10.1.25. u_i^(k) = Σ_{j=1}^{n} c_j ( 2 cos( jπ/(n+1) ) )^k sin( i jπ/(n+1) ), i = 1, . . . , n.
10.1.26. The key observation is that the coefficient matrix T is symmetric. Then, according to Exercise 8.5.19, the principal axes of the ellipse E1 = { T x | ‖x‖ = 1 } are the orthogonal eigenvectors of T. Moreover, T^k is also symmetric and has the same eigenvectors. Hence, all the ellipses E_k have the same principal axes. The semi-axes are the absolute values of the eigenvalues, and hence E_k has semi-axes (.8)^k and (.4)^k.
♠ 10.1.27.
(a) E1: principal axes: ( −1, 1 )^T, ( 1, 1 )^T; semi-axes: 1, 1/3; area: (1/3)π.
E2: principal axes: ( −1, 1 )^T, ( 1, 1 )^T; semi-axes: 1, 1/9; area: (1/9)π.
E3: principal axes: ( −1, 1 )^T, ( 1, 1 )^T; semi-axes: 1, 1/27; area: (1/27)π.
E4: principal axes: ( −1, 1 )^T, ( 1, 1 )^T; semi-axes: 1, 1/81; area: (1/81)π.
(b) E1: principal axes: ( 1, 0 )^T, ( 0, 1 )^T; semi-axes: 1.2, .4; area: .48π = 1.5080.
E2: circle of radius .48; area: .2304π = .7238.
E3: principal axes: ( 1, 0 )^T, ( 0, 1 )^T; semi-axes: .576, .192; area: .1106π = .3474.
E4: circle of radius .2304; area: .0531π = .1168.
(c) E1: principal axes: ( .6407, .7678 )^T, ( −.7678, .6407 )^T; semi-axes: 1.0233, .3909; area: .4π = 1.2566.
E2: principal axes: ( .6765, .7365 )^T, ( −.7365, .6765 )^T; semi-axes: 1.0394, .1539; area: .16π = .5027.
E3: principal axes: ( .6941, .7199 )^T, ( −.7199, .6941 )^T; semi-axes: 1.0477, .0611; area: .064π = .2011.
E4: principal axes: ( .7018, .7124 )^T, ( −.7124, .7018 )^T; semi-axes: 1.0515, .0243; area: .0256π = .0804.
10.1.28. (a) This follows from Exercise 8.5.19(a), using the fact that K = T^n is also positive definite. (b) True — they are the eigenvectors of T. (c) True — r1, s1 are the eigenvalues of T. (d) True, since the area is π times the product of the semi-axes, so A1 = π r1 s1, so α = r1 s1 = | det T |. Then A_n = π r_n s_n = π r1^n s1^n = π | det T |^n = π α^n.
10.1.29. (a) This follows from Exercise 8.5.19(a) with A = T^n. (b) False; see Exercise 10.1.27(c) for a counterexample. (c) False — the singular values of T^n are not, in general, the nth powers of the singular values of T. (d) True, since the product of the singular values is the absolute value of the determinant, and so A_n = π | det T |^n.
10.1.30. v(k) = c1 (αλ1 + β)^k v1 + · · · + cn (αλn + β)^k vn.
♦ 10.1.31. If u(k) = x(k) + iy(k) is a complex solution, then the iterative equation becomes
x(k+1) + iy(k+1) = T x(k) + i T y(k). Separating the real and imaginary parts of this
complex vector equation and using the fact that T is real, we deduce x(k+1) = T x(k),
y(k+1) = T y(k). Therefore, x(k),y(k) are real solutions to the iterative system.
♦ 10.1.32. The formula uniquely specifies u(k+1) once u(k) is known. Thus, by induction, once the initial value u(0) is fixed, there is only one possible solution u(k) for k = 0, 1, 2, . . . . Existence and uniqueness also hold for k < 0 when T is nonsingular, since u(−k−1) = T^{−1} u(−k). If T is singular, the solution will not exist for k < 0 if any u(−k) ∉ rng T, or, if it exists, is not unique, since we can add any element of ker T to u(−k) without affecting u(−k+1), u(−k+2), . . . .
10.1.33. According to Theorem 8.20, the eigenvectors of T are real and form an orthogonal basis of R^n with respect to the Euclidean norm. The formula for the coefficients c_j thus follows directly from (5.8).
10.1.34. Since matrix multiplication acts column-wise, cf. (1.11), the jth column of the matrix equation T^{k+1} = T T^k is c_j^{(k+1)} = T c_j^{(k)}. Moreover, T^0 = I has jth column c_j^{(0)} = e_j.
10.1.35. Separating the equation into its real and imaginary parts, we find
( x(k+1), y(k+1) )^T = ( µ −ν ; ν µ ) ( x(k), y(k) )^T.
The eigenvalues of the coefficient matrix are µ ± iν, with eigenvectors ( 1, ∓i )^T, and so the solution is
( x(k), y(k) )^T = ((x(0) + i y(0))/2) (µ + iν)^k ( 1, −i )^T + ((x(0) − i y(0))/2) (µ − iν)^k ( 1, i )^T.
Therefore z(k) = x(k) + i y(k) = (x(0) + i y(0))(µ + iν)^k = λ^k z(0).
♦ 10.1.36. (a) Proof by induction:
T^{k+1} w_i = T ( λ^k w_i + k λ^{k−1} w_{i−1} + (k choose 2) λ^{k−2} w_{i−2} + · · · )
 = λ^k T w_i + k λ^{k−1} T w_{i−1} + (k choose 2) λ^{k−2} T w_{i−2} + · · ·
 = λ^k (λ w_i + w_{i−1}) + k λ^{k−1} (λ w_{i−1} + w_{i−2}) + (k choose 2) λ^{k−2} (λ w_{i−2} + w_{i−3}) + · · ·
 = λ^{k+1} w_i + (k+1) λ^k w_{i−1} + (k+1 choose 2) λ^{k−1} w_{i−2} + · · · .
(b) Each Jordan chain of length j is used to construct j linearly independent solutions by formula (10.23). Thus, for an n-dimensional system, the Jordan basis produces the required number of linearly independent (complex) solutions, and the general solution is obtained by taking linear combinations. Real solutions of a real iterative system are obtained by using the real and imaginary parts of the Jordan chain solutions corresponding to the complex conjugate pairs of eigenvalues.
10.1.37.
(a) u(k) = 2^k ( c1 + (1/2) k c2 ), v(k) = (1/3) 2^k c2;
(b) u(k) = 3^k ( c1 + ( (1/3) k − 1/2 ) c2 ), v(k) = 3^k ( 2 c1 + (2/3) k c2 );
(c) u(k) = (−1)^k ( c1 − k c2 + (1/2) k(k−1) c3 ), v(k) = (−1)^k ( c2 − (k+1) c3 ), w(k) = (−1)^k c3;
(d) u(k) = 3^k ( c1 + (1/3) k c2 + ( (1/18) k(k−1) + 1 ) c3 ), v(k) = −3^k ( c2 + (1/3) k c3 ), w(k) = 3^k ( c1 + (1/3) k c2 + (1/18) k(k−1) c3 );
(e) u(0) = −c2, v(0) = −c1 + c3, w(0) = c1 + c2, while, for k > 0, u(k) = −2^k ( c2 + (1/2) k c3 ), v(k) = 2^k c3, w(k) = 2^k ( c2 + (1/2) k c3 );
(f) u(k) = −i^{k+1} c1 − k i^k c2 − (−i)^{k+1} c3 − k (−i)^k c4, v(k) = i^k c1 + k i^{k−1} c2 + (−i)^k c3 + k (−i)^{k−1} c4, w(k) = −i^{k+1} c2 − (−i)^{k+1} c4, z(k) = i^k c2 + (−i)^k c4.
10.1.38. J_{λ,n}^k =
( λ^k kλ^{k−1} (k choose 2)λ^{k−2} (k choose 3)λ^{k−3} . . . (k choose n−1)λ^{k−n+1} ;
 0 λ^k kλ^{k−1} (k choose 2)λ^{k−2} . . . (k choose n−2)λ^{k−n+2} ;
 0 0 λ^k kλ^{k−1} . . . (k choose n−3)λ^{k−n+3} ;
 0 0 0 λ^k . . . (k choose n−4)λ^{k−n+4} ;
 . . . ;
 0 0 0 0 . . . λ^k ).
10.1.39. (a) Yes, if T is nonsingular. Indeed, in this case, the solution formula u(k) = T^{k−k0} u(k0) is valid even when k < k0. But if T is singular, then one can only assert that u(k) = ũ(k) for k ≥ k0. (b) If T is nonsingular, then u(k) = ũ(k − k0 + k1) for all k; if T is singular, then this only holds when k ≥ max{k0, k1}.
♥ 10.1.40. (a) The system has an equilibrium solution if and only if (T − I) u* = b. In particular, if 1 is not an eigenvalue of T, every b leads to an equilibrium solution.
(b) Since v(k+1) = T v(k), the general solution is
u(k) = u* + c1 λ1^k v1 + c2 λ2^k v2 + · · · + cn λn^k vn,
where v1, . . . , vn are the linearly independent eigenvectors and λ1, . . . , λn the corresponding eigenvalues of T.
(c) (i) u(k) = ( 2/3, −1 )^T − 5^k ( −3, 1 )^T − (−3)^k ( −1/3, 1 )^T;
(ii) u(k) = ( 1, 1 )^T + ((−1 − √2)^k/(2√2)) ( −√2, 1 )^T − ((−1 + √2)^k/(2√2)) ( √2, 1 )^T;
(iii) u(k) = ( −1, −3/2, −1 )^T − 3 ( 1, 2, 0 )^T + (15/2)(−2)^k ( 2, 3, 2 )^T − 5 (−3)^k ( 2, 3, 3 )^T;
(iv) u(k) = ( 1/6, 5/3, 3/2 )^T + (7/2)(−1/2)^k ( 1, 1, 0 )^T − (7/2)(−1/3)^k ( 1, 2, 1 )^T + (7/3)(1/6)^k ( 0, 1, 2 )^T.
(d) In general, using induction, the solution is
u(k) = T^k c + ( I + T + T^2 + · · · + T^{k−1} ) b.
If we write b = b1 v1 + · · · + bn vn, c = c1 v1 + · · · + cn vn, in terms of the eigenvectors, then
u(k) = Σ_{j=1}^{n} [ λ_j^k c_j + (1 + λ_j + λ_j^2 + · · · + λ_j^{k−1}) b_j ] v_j.
If λ_j ≠ 1, one can use the geometric sum formula 1 + λ_j + λ_j^2 + · · · + λ_j^{k−1} = (1 − λ_j^k)/(1 − λ_j), while if λ_j = 1, then 1 + λ_j + λ_j^2 + · · · + λ_j^{k−1} = k. Incidentally, when it exists, the equilibrium solution is u* = Σ_{λ_j ≠ 1} ( b_j/(1 − λ_j) ) v_j.
♣ 10.1.41. (a) The sequence is 3, 7, 0, 7, 7, 4, 1, 5, 6, 1, 7, 8, 5, 3, 8, 1, 9, 0, 9, 9, 8, 7, 5, 2, 7, 9, 6, 5, 1, 6, 7, 3, 0, 3, 3, 6, 9, 5, 4, 9, 3, 2, 5, 7, 2, 9, 1, 0, 1, 1, 2, 3, 5, 8, 3, 1, 4, 5, 9, 4, 3, 7, 0, and repeats when u(60) = u(0) = 3, u(61) = u(1) = 7.
(b) When n = 10, other choices for the initial values u(0), u(1) lead to sequences that also start repeating at u(60); if u(0), u(1) occur as successive integers in the preceding sequence, then one obtains a shifted version of it — otherwise one ends up with a disjoint pseudo-random sequence. Other values of n lead to longer or shorter sequences; e.g., n = 9 repeats at u(24), while n = 11 already repeats at u(10).
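The period-60 claim is easy to confirm; a minimal sketch of the mod-10 Fibonacci-type iteration u(k+1) = u(k) + u(k−1) mod n used in this exercise:

```python
def sequence(u0, u1, n=10, length=130):
    """Generate u(k+1) = (u(k) + u(k-1)) mod n."""
    seq = [u0, u1]
    for _ in range(length - 2):
        seq.append((seq[-1] + seq[-2]) % n)
    return seq

s = sequence(3, 7)
print(s[:8])                          # [3, 7, 0, 7, 7, 4, 1, 5]
print(s[60] == s[0], s[61] == s[1])   # True True: the period is 60
```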
10.2.1.
(a) Eigenvalues: (5 + √33)/2 ≈ 5.3723, (5 − √33)/2 ≈ −.3723; spectral radius: (5 + √33)/2 ≈ 5.3723.
(b) Eigenvalues: ± i/(6√2) ≈ ± .11785 i; spectral radius: 1/(6√2) ≈ .11785.
(c) Eigenvalues: 2, 1, −1; spectral radius: 2.
(d) Eigenvalues: 4, −1 ± 4 i; spectral radius: √17 ≈ 4.1231.
10.2.2.
(a) Eigenvalues: 2 ± 3 i; spectral radius: √13 ≈ 3.6056; not convergent.
(b) Eigenvalues: .95414, .34586; spectral radius: .95414; convergent.
(c) Eigenvalues: 4/5, 3/5, 0; spectral radius: 4/5; convergent.
(d) Eigenvalues: 1., .547214, −.347214; spectral radius: 1; not convergent.
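The convergence test ρ(T) < 1 is mechanical to apply; a sketch with numpy (the matrix here is an illustrative stand-in, since the exercise's own matrices are not reprinted in this manual):

```python
import numpy as np

def spectral_radius(T):
    """Largest eigenvalue magnitude of T."""
    return max(abs(np.linalg.eigvals(T)))

# Hypothetical example with rho < 1, so T^k -> O as k -> infinity.
T = np.array([[0.5, 0.2],
              [0.1, 0.4]])
rho = spectral_radius(T)                               # eigenvalues .6, .3
print(rho < 1)                                         # True
print(np.allclose(np.linalg.matrix_power(T, 200), 0))  # True
```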
10.2.3.
(a) Unstable: eigenvalues −1, −3;
(b) unstable: eigenvalues (5 + √73)/12 ≈ 1.12867, (5 − √73)/12 ≈ −.29533;
(c) asymptotically stable: eigenvalues (1 ± i)/2;
(d) stable: eigenvalues −1, ± i;
(e) unstable: eigenvalues 5/4, 1/4, 1/4;
(f) unstable: eigenvalues 2, 1, 1;
(g) asymptotically stable: eigenvalues 1/2, 1/3, 0.
10.2.4. (a) λ1 = 3, λ2 = 1 + 2 i, λ3 = 1 − 2 i, ρ(T) = 3.
(b) λ1 = 3/5, λ2 = 1/5 + (2/5) i, λ3 = 1/5 − (2/5) i, ρ(T̃) = 3/5.
(c) u(k) → 0 in all cases; generically, the initial data has a non-zero component, c1 ≠ 0, in the direction of the dominant eigenvector ( −1, −1, 1 )^T, and then u(k) ≈ c1 (3/5)^k ( −1, −1, 1 )^T.
10.2.5.
(a) T has a double eigenvalue of 1, so ρ(T) = 1.
(b) Set u(0) = ( a, b )^T. Then T^k = ( 1 k ; 0 1 ), and so u(k) = ( a + kb, b )^T → ∞ provided b ≠ 0.
(c) In this example, ‖u(k)‖ = √(b²k² + 2abk + a² + b²) ≈ bk → ∞ when b ≠ 0, while C ρ(T)^k = C is constant, so eventually ‖u(k)‖ > C ρ(T)^k no matter how large C is.
(d) For any σ > 1, we have bk ≤ C σ^k for k ≥ 0 provided C ≫ 0 is sufficiently large — more specifically, if C > b/log σ.
10.2.6. A solution u(k) → 0 if and only if the initial vector u(0) = c1v1 + · · · + cj vj is a linear
combination of the eigenvectors (or more generally, Jordan chain vectors) corresponding toeigenvalues satisfying |λi | < 1 for i = 1, . . . , j.
10.2.7. Since ρ(cA) = | c |ρ(A), then cA is convergent if and only if | c | < 1/ρ(A). So, techni-cally, there isn’t a largest c.
♦ 10.2.8.
(a) Let u1, . . . , un be a unit eigenvector basis for T, so ‖u_j‖ = 1. Let
m_j = max{ |c_j| : ‖c1 u1 + · · · + cn un‖ ≤ 1 },
which is finite since we are maximizing a continuous function over a closed, bounded set. Let m* = max{m1, . . . , mn}. Now, given ε > 0 small, if
‖u(0)‖ = ‖c1 u1 + · · · + cn un‖ < ε, then |c_j| < m* ε for j = 1, . . . , n.
Therefore, by (10.25), ‖u(k)‖ ≤ |c1| + · · · + |cn| ≤ n m* ε, and hence the solution remains close to 0.
(b) If any eigenvalue of modulus |λ| = 1 is incomplete, then, according to (10.23), the system has solutions of the form u(k) = λ^k w_i + k λ^{k−1} w_{i−1} + · · ·, which are unbounded as k → ∞. Thus, the origin is not stable in this case. On the other hand, if all eigenvalues of modulus 1 are complete, then the system is stable, even if there are incomplete eigenvalues of modulus < 1. The proof is a simple adaptation of that in part (a).
10.2.9. Assume u(0) = c1 v1 + · · · + cn vn with c1 ≠ 0. For k ≫ 0, u(k) ≈ c1 λ1^k v1, since |λ1^k| ≫ |λ_j^k| for all j > 1. Thus, the entries satisfy u_i^(k+1) ≈ λ1 u_i^(k), and so, if nonzero, are just multiplied by λ1. Thus, if λ1 > 0 we expect to see the signs of all the entries of u(k) not change once k is sufficiently large, whereas if λ1 < 0, again for sufficiently large k, the signs alternate at each step of the iteration.
♥ 10.2.10. Writing u(0) = c1 v1 + · · · + cn vn, then for k ≫ 0,
u(k) ≈ c1 λ1^k v1 + c2 λ2^k v2, (∗)
which applies even when v1, v2 are complex eigenvectors of a real matrix. Thus this happens if and only if the iterates eventually belong (modulo a small error) to a two-dimensional subspace V, namely that spanned by the eigenvectors v1 and v2. In particular, for k ≫ 0, the iterates u(k) and u(k+1) form a basis for V (again modulo a small error), since if they were linearly dependent, then there would only be one eigenvalue of largest modulus. Thus, we can write u(k+2) ≈ a u(k+1) + b u(k) for some scalars a, b. We claim that the dominant eigenvalues λ1, λ2 are the roots of the quadratic equation λ² = aλ + b, which gives an effective algorithm for determining them. To prove the claim, for k ≫ 0, by formula (∗),
u(k+2) ≈ c1 λ1^{k+2} v1 + c2 λ2^{k+2} v2,
a u(k+1) + b u(k) ≈ c1 λ1^k (aλ1 + b) v1 + c2 λ2^k (aλ2 + b) v2.
Thus, by linear independence of the eigenvectors v1, v2, we conclude that λ1² = aλ1 + b, λ2² = aλ2 + b, which proves the claim. With the eigenvalues in hand, the determination of the eigenvectors is straightforward, either by directly solving the linear eigenvalue system, or by using equation (∗) for k and k + 1 sufficiently large.
10.2.11. If T has eigenvalues λ_j, then cT + d I has eigenvalues cλ_j + d. However, it is not necessarily true that the dominant eigenvalue of cT + d I is cλ1 + d when λ1 is the dominant eigenvalue of T. For instance, if λ1 = 3, λ2 = −2, so ρ(T) = 3, then λ1 − 2 = 1, λ2 − 2 = −4, so ρ(T − 2 I) = 4 ≠ ρ(T) − 2. Thus, you need to know all the eigenvalues to predict ρ(cT + d I), or, more accurately, the extreme eigenvalues, i.e., those such that all other eigenvalues lie in their convex hull in the complex plane.
10.2.12. By definition, the eigenvalues of A^T A are λ_i = σ_i², and so the spectral radius of A^T A is ρ(A^T A) = max{σ1², . . . , σn²}. Thus ρ(A^T A) = λ1 < 1 if and only if σ1 = √λ1 < 1.
♥ 10.2.13. (a) ρ(M_n) = 2 cos( π/(n+1) ). (b) No, since its spectral radius is slightly less than 2.
(c) The entries of u(k) are u_i^(k) = Σ_{j=1}^{n} c_j ( 2 cos( jπ/(n+1) ) )^k sin( i jπ/(n+1) ), i = 1, . . . , n, where c1, . . . , cn are arbitrary constants.
♥ 10.2.14.
(a) The entries of u(k) are u_i^(k) = Σ_{j=1}^{n} c_j ( β + 2α cos( jπ/(n+1) ) )^k sin( i jπ/(n+1) ), i = 1, . . . , n.
(b) The system is asymptotically stable if and only if
ρ(T_{α,β}) = max{ | β + 2α cos( π/(n+1) ) |, | β − 2α cos( π/(n+1) ) | } < 1.
In particular, if |β ± 2α| < 1 the system is asymptotically stable for any n.
10.2.15. (a) According to Exercise 8.2.25, T has at least one eigenvalue with |λ| > 1. (b) No. For example, T = ( 2 0 ; 0 1/3 ) has det T = 2/3, but ρ(T) = 2, and so the iterative system is unstable.
10.2.16.
(a) False: ρ(cA) = |c| ρ(A).
(b) True, since the eigenvalues of A and S^{−1} A S are the same.
(c) True, since the eigenvalues of A² are the squares of the eigenvalues of A.
(d) False, since ρ(A) = max |λ| whereas ρ(A^{−1}) = max 1/|λ|.
(e) False in almost all cases; for instance, if A = ( 1 0 ; 0 0 ) and B = ( 0 0 ; 0 1 ), then ρ(A) = ρ(B) = ρ(A + B) = 1 ≠ 2 = ρ(A) + ρ(B).
(f) False: using the matrices in (e), AB = O and so ρ(AB) = 0 ≠ 1 = ρ(A) ρ(B).
10.2.17. (a) True, by part (c) of Exercise 10.2.16. (b) False. For example, A = ( 1/2 1 ; 0 1/2 ) has ρ(A) = 1/2, whereas A^T A = ( 1/4 1/2 ; 1/2 5/4 ) has ρ(A^T A) = 3/4 + (1/2)√2 = 1.45711.
10.2.18. False. The first requires its eigenvalues satisfy Re λj < 0; the second requires |λj | < 1.
10.2.19. (a) A² = ( lim_{k→∞} T^k )² = lim_{k→∞} T^{2k} = A. (b) The only eigenvalues of A are 1 and 0. Moreover, A must be complete, since if v1, v2 are the first two vectors in a Jordan chain, then A v1 = λ v1, A v2 = λ v2 + v1, with λ = 0 or 1, but A² v2 = λ² v2 + 2λ v1 ≠ A v2 = λ v2 + v1, so there are no Jordan chains except for the ordinary eigenvectors. Therefore, A = S diag(1, . . . , 1, 0, . . . , 0) S^{−1} for some nonsingular matrix S. (c) If λ is an eigenvalue of T, then either |λ| < 1, or λ = 1 and is a complete eigenvalue.
10.2.20. If v has integer entries, so does A^k v for any k, and so the only way in which A^k v → 0 is if A^k v = 0 for some k. Now consider the basis vectors e1, . . . , en. Let k_i be such that A^{k_i} e_i = 0. Let k = max{k1, . . . , kn}, so A^k e_i = 0 for all i = 1, . . . , n. Then A^k I = A^k = O, and hence A is nilpotent. The simplest example is ( 0 1 ; 0 0 ).
♥ 10.2.21. The equivalent first order system v(k+1) = C v(k) for v(k) = ( u(k), u(k+1) )^T has coefficient matrix C = ( O I ; B A ). To compute the eigenvalues of C we form
det(C − λ I) = det ( −λ I  I ; B  A − λ I ).
Now use row operations to subtract appropriate multiples of the first n rows from the last n, and then a series of row interchanges, to conclude that
det(C − λ I) = det ( −λ I  I ; B + λA − λ² I  O ) = ± det ( B + λA − λ² I  O ; −λ I  I ) = ± det(B + λA − λ² I).
Thus, the quadratic eigenvalues are the same as the ordinary eigenvalues of C, and hence stability requires they all satisfy |λ| < 1.
♦ 10.2.22. Set σ = µ/λ > 1. If p(x) = c_k x^k + · · · + c1 x + c0 has degree k, then p(n) ≤ a n^k for all n ≥ 1, where a = max |c_i|. To prove a n^k ≤ C σ^n it suffices to prove that k log n < n log σ + log C − log a. Now h(n) = n log σ − k log n has a minimum when h′(n) = log σ − k/n = 0, so n = k/log σ. The minimum value is h(k/log σ) = k(1 − log(k/log σ)). Thus, choosing log C > log a + k(log(k/log σ) − 1) will ensure the desired inequality.
♦ 10.2.23. According to Exercise 10.1.36, there is a polynomial p(x) such that
‖u(k)‖ ≤ Σ_i |λ_i|^k p_i(k) ≤ p(k) ρ(A)^k.
Thus, by Exercise 10.2.22, ‖u(k)‖ ≤ C σ^k for any σ > ρ(A).
♦ 10.2.24.(a) Rewriting the system as u(n+1) = M−1u(n), stability requires ρ(M−1) < 1. The eigen-
values of M−1 are the reciprocals of the eigenvalues of M , and hence ρ(M−1) < 1 if andonly if 1/|µi | < 1 for all i.
(b) Rewriting the system as u(n+1) = M−1Ku(n), stability requires ρ(M−1K) < 1. More-
over, the eigenvalues of M−1K coincide with the generalized eigenvalues of the pair; seeExercise 8.4.8 for details.
10.2.25. (a) All scalar multiples of ( 1, 1 )^T; (b) ( 0, 0 )^T; (c) all scalar multiples of ( −1, −2, 1 )^T; (d) all linear combinations of ( −1, 1, 0 )^T, ( 1, 0, 1 )^T.
10.2.26.
(a) The eigenvalues are 1, 1/2, so the fixed points are stable, while all other solutions go to a unique fixed point at rate (1/2)^k. When u(0) = ( 1, 0 )^T, then u(k) → ( 3/5, 3/5 )^T.
(b) The eigenvalues are −.9, .8, so the origin is a stable fixed point, and every nonzero solution goes to it, most at a rate of (.9)^k. When u(0) = ( 1, 0 )^T, then u(k) → 0 also.
(c) The eigenvalues are −2, 1, 0, so the fixed points are unstable. Most solutions, specifically those with a nonzero component in the dominant eigenvector direction, become unbounded. However, when u(0) = ( 1, 0, 0 )^T, then u(k) = ( −1, −2, 1 )^T for k ≥ 1, and the solution stays at a fixed point.
(d) The eigenvalues are 5 and 1, so the fixed points are unstable. Most solutions, specifically those with a nonzero component in the dominant eigenvector direction, become unbounded, including that with u(0) = ( 1, 0, 0 )^T.
10.2.27. Since T is symmetric, its eigenvectors v1, . . . , vn form an orthogonal basis of R^n. Writing u(0) = c1 v1 + · · · + cn vn, the coefficients are given by the usual orthogonality formula (5.7): c_i = u(0) · v_i / ‖v_i‖². Moreover, since λ1 = 1, while |λ_j| < 1 for j ≥ 2,
u(k) = c1 v1 + c2 λ2^k v2 + · · · + cn λn^k vn −→ c1 v1 = ( u(0) · v1 / ‖v1‖² ) v1.
10.2.28. False: T has an eigenvalue of 1, but convergence requires that all eigenvalues be lessthan 1 in modulus.
10.2.29. True. In this case T u = u for all u ∈ Rn and hence T = I .
♥ 10.2.30.
(a) The iterative system has a period 2 solution if and only if T has an eigenvalue of −1. Indeed, the condition u(k+2) = T² u(k) = u(k) implies that u(k) ≠ 0 is an eigenvector of T² with eigenvalue 1. Thus, u(k) is an eigenvector of T with eigenvalue −1, because if the eigenvalue were 1 then u(k) = u(k+1), and the solution would be a fixed point. For example, T = ( −1 0 ; 0 2 ) has the period 2 orbit u(k) = ( c(−1)^k, 0 )^T for any c.
(b) The period 2 solution is never unique, since any nonzero scalar multiple is also a period 2 solution.
(c) T must have an eigenvalue equal to a primitive mth root of unity; the non-primitive roots lead to solutions with smaller periods.
♦ 10.2.31.
(a) Let u* = v1 be a fixed point of T, i.e., an eigenvector with eigenvalue λ1 = 1. Assuming T is complete, we form an eigenvector basis v1, . . . , vn with corresponding eigenvalues λ1 = 1, λ2, . . . , λn. Assume λ1, . . . , λj have modulus 1, while the remaining eigenvalues satisfy |λ_i| < 1 for i = j+1, . . . , n. If the initial value u(0) = c1 v1 + · · · + cn vn is close to u* = v1, then |c1 − 1|, |c2|, . . . , |cn| < ε for ε small. Then the corresponding solution u(k) = c1 v1 + c2 λ2^k v2 + · · · + cn λn^k vn satisfies
‖u(k) − u*‖ ≤ |c1 − 1| ‖v1‖ + |c2| |λ2|^k ‖v2‖ + · · · + |cn| |λn|^k ‖vn‖ < ε ( ‖v1‖ + · · · + ‖vn‖ ) = C ε,
and hence any solution that starts near u* stays near.
(b) If A has an incomplete eigenvalue of modulus |λ| = 1, then, according to the solution formula (10.23), the iterative system admits unbounded solutions ũ(k) → ∞. Thus, for any ε > 0, there is an unbounded solution u* + ε ũ(k) → ∞ that starts out arbitrarily close to the fixed point u*. On the other hand, if all eigenvalues of modulus 1 are complete, then the preceding proof works in essentially the same manner. The first j terms are bounded as before, while the remainder go to 0 as k → ∞.
10.3.1. (a) 3/4, convergent; (b) 3, inconclusive; (c) 8/7, inconclusive; (d) 7/4, inconclusive; (e) 8/7, inconclusive; (f) .9, convergent; (g) 7/3, inconclusive; (h) 1, inconclusive.
10.3.2. (a) .671855, convergent; (b) 2.5704, inconclusive; (c) .9755, convergent; (d) 1.9571, inconclusive; (e) 1.1066, inconclusive; (f) .8124, convergent; (g) 2.03426, inconclusive; (h) .7691, convergent.
10.3.3. (a) 2/3, convergent; (b) 1/2, convergent; (c) .9755, convergent; (d) 1.0308, divergent; (e) .9437, convergent; (f) .8124, convergent; (g) 2/3, convergent; (h) 2/3, convergent.
10.3.4. (a) ‖A^k‖∞ = k² + k, (b) ‖A^k‖2 = k² + 1, (c) ρ(A^k) = 0. (d) Thus, a convergent matrix can have arbitrarily large norm. (e) Because the norm in the inequality will depend on k.
10.3.5. For example, when A = ( 1 1 ; 0 1 ), A² = ( 1 2 ; 0 1 ), and (a) ‖A‖∞ = 2, ‖A²‖∞ = 3; (b) ‖A‖2 = √( (3 + √5)/2 ) = 1.6180, ‖A²‖2 = √(3 + 2√2) = 2.4142.
10.3.6. Since ‖ cA ‖ = | c | ‖A ‖ < 1.
♦ 10.3.7. For example, if A = ( 0 1 ; 0 0 ), B = ( 0 1 ; 1 0 ), then ρ(A + B) = √2 > 0 + 1 = ρ(A) + ρ(B).
10.3.8. True: this implies ‖A ‖2 = max σi < 1.
10.3.9. For example, if A = ( 1/2 1 ; 0 1/2 ), then ρ(A) = 1/2. The singular values of A are σ1 = √(3 + 2√2)/2 = 1.2071 and σ2 = √(3 − 2√2)/2 = .2071.
10.3.10.
(a) False: For instance, if A = ( 0 1 ; 0 1 ), S = ( 1 2 ; 0 1 ), then B = S^{−1} A S = ( 0 −2 ; 0 1 ), and ‖B‖∞ = 2 ≠ 1 = ‖A‖∞.
(b) False: The same example has ‖B‖2 = √5 ≠ √2 = ‖A‖2.
(c) True, since A and B have the same eigenvalues.
10.3.11. By definition, κ(A) = σ1/σn. Now ‖A‖2 = σ1. On the other hand, by Exercise 8.5.12, the singular values of A^{−1} are the reciprocals 1/σ_i of the singular values of A, and so the largest one is ‖A^{−1}‖2 = 1/σn.
♦ 10.3.12. (i) The 1 matrix norm is the maximum absolute column sum:
‖A‖1 = max{ Σ_{i=1}^{n} |a_ij| : 1 ≤ j ≤ n }.
(ii) (a) 5/6, convergent; (b) 17/6, inconclusive; (c) 8/7, inconclusive; (d) 11/4, inconclusive; (e) 12/7, inconclusive; (f) .9, convergent; (g) 7/3, inconclusive; (h) 2/3, convergent.
10.3.13. If a1, . . . , an are the rows of A, then the formula (10.40) can be rewritten as ‖A‖∞ = max{ ‖a_i‖1 : i = 1, . . . , n }, i.e., the maximal 1 norm of the rows. Thus, by the properties of the 1-norm,
‖A + B‖∞ = max ‖a_i + b_i‖1 ≤ max ( ‖a_i‖1 + ‖b_i‖1 ) ≤ max ‖a_i‖1 + max ‖b_i‖1 = ‖A‖∞ + ‖B‖∞,
‖cA‖∞ = max ‖c a_i‖1 = max |c| ‖a_i‖1 = |c| max ‖a_i‖1 = |c| ‖A‖∞.
Finally, ‖A‖∞ ≥ 0 since we are maximizing non-negative quantities; moreover, ‖A‖∞ = 0 if and only if all its rows have ‖a_i‖1 = 0 and hence all a_i = 0, which means A = O.
♦ 10.3.14. ‖A‖ = max{σ1, . . . , σn} is the largest generalized singular value, meaning σ_i = √λ_i, where λ1, . . . , λn are the generalized eigenvalues of the positive definite matrix pair A^T K A and K, satisfying A^T K A v = λ K v for some v ≠ 0, or, equivalently, the eigenvalues of K^{−1} A^T K A.
10.3.15.
(a) ‖A‖ = 7/2. The "unit sphere" for this norm is the rectangle with corners ( ±1/2, ±1/3 )^T. It is mapped to the parallelogram with corners ±( 5/6, −1/6 )^T, ±( 1/6, 7/6 )^T, with respective norms 5/3 and 7/2, and so ‖A‖ = max{ ‖Av‖ : ‖v‖ = 1 } = 7/2.
(b) ‖A‖ = 8/3. The "unit sphere" for this norm is the diamond with corners ±( 1/2, 0 )^T, ±( 0, 1/3 )^T. It is mapped to the parallelogram with corners ±( 1/2, 1/2 )^T, ±( 1/3, −2/3 )^T, with respective norms 5/2 and 8/3, and so ‖A‖ = max{ ‖Av‖ : ‖v‖ = 1 } = 8/3.
(c) According to Exercise 10.3.14, ‖A‖ is the square root of the largest generalized eigenvalue of the matrix pair K = ( 2 0 ; 0 3 ), A^T K A = ( 5 −4 ; −4 14 ). Thus, ‖A‖ = √( (43 + √553)/12 ) = 2.35436.
(d) According to Exercise 10.3.14, ‖A‖ is the square root of the largest generalized eigenvalue of the matrix pair K = ( 2 −1 ; −1 2 ), A^T K A = ( 2 −1 ; −1 14 ). Thus, ‖A‖ = 3.
♥ 10.3.16. If we identify an n × n matrix with a vector in R^{n²}, then the Frobenius norm is the same as the ordinary Euclidean norm, and so the norm axioms are immediate. To check the multiplicative property, let r1^T, . . . , rn^T denote the rows of A and c1, . . . , cn the columns of B, so ‖A‖F = √( Σ_{i=1}^{n} ‖r_i‖² ), ‖B‖F = √( Σ_{j=1}^{n} ‖c_j‖² ). Then, setting C = AB, we have
‖C‖F = √( Σ_{i,j=1}^{n} c_ij² ) = √( Σ_{i,j=1}^{n} (r_i^T c_j)² ) ≤ √( Σ_{i,j=1}^{n} ‖r_i‖² ‖c_j‖² ) = ‖A‖F ‖B‖F,
where we used the Cauchy–Schwarz inequality in the middle.
10.3.17.
(a) This is a restatement of Proposition 10.28.
(b) ‖σ‖2² = Σ_{i=1}^{n} σ_i² = Σ_{i=1}^{n} λ_i = tr(A^T A) = Σ_{i,j=1}^{n} a_ij² = ‖A‖F².
10.3.18. If we identify a matrix A with a vector in R^{n²}, then this agrees with the ∞ norm on R^{n²} and hence satisfies the norm axioms. However, the multiplicative property fails: for example, when A = ( 1 1 ; 0 1 ), then A² = ( 1 2 ; 0 1 ), and so ‖A²‖ = 2 > 1 = ‖A‖².
10.3.19. First, if x = ( a, b )^T, y = ( c, d )^T are any two linearly independent vectors in R², then the curve (cos φ)x − (sin φ)y = ( a −c ; b −d ) ( cos φ, sin φ )^T is the image of the unit circle under the linear transformation defined by the nonsingular matrix ( a −c ; b −d ), and hence defines an ellipse. The same argument shows that the curve (cos φ)x − (sin φ)y describes an ellipse in the two-dimensional plane spanned by the vectors x, y.
♦ 10.3.20.
(a) This follows from the formula (10.40), since |a_ij| ≤ s_i ≤ ‖A‖∞, where s_i is the ith absolute row sum.
(b) Let a_{ij,n} denote the (i, j) entry of A^n. Then, by part (a), Σ_{n=0}^{∞} |a_{ij,n}| ≤ Σ_{n=0}^{∞} ‖A^n‖∞ < ∞, and hence Σ_{n=0}^{∞} a_{ij,n} = a*_ij is an absolutely convergent series, [2, 16]. Since each entry converges absolutely, the matrix series also converges.
(c) e^{tA} = Σ_{n=0}^{∞} (t^n/n!) A^n, and the series of norms Σ_{n=0}^{∞} (|t|^n/n!) ‖A^n‖∞ ≤ Σ_{n=0}^{∞} (|t|^n/n!) ‖A‖∞^n = e^{|t| ‖A‖∞} is bounded by the standard scalar exponential series, which converges for all t, [2, 58]. Thus, the convergence follows from part (b).
10.3.21. (a) Choosing a matrix norm such that a = ‖A‖ < 1, the norm series is bounded by a convergent geometric series:
Σ_{n=0}^{∞} ‖A^n‖ ≤ Σ_{n=0}^{∞} ‖A‖^n = Σ_{n=0}^{∞} a^n = 1/(1 − a).
Therefore, the matrix series converges. (b) Moreover,
( I − A ) Σ_{n=0}^{∞} A^n = Σ_{n=0}^{∞} A^n − Σ_{n=0}^{∞} A^{n+1} = I,
since all other terms cancel. I − A is invertible if and only if 1 is not an eigenvalue of A, and we are assuming all eigenvalues are less than 1 in magnitude.
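The geometric matrix series can be summed numerically and compared against (I − A)^{−1}; a sketch with numpy, for a hypothetical matrix with ρ(A) < 1:

```python
import numpy as np

A = np.array([[0.4, 0.2],
              [0.1, 0.3]])        # eigenvalues .5 and .2, so rho(A) < 1
S = np.zeros_like(A)
term = np.eye(2)
for _ in range(200):              # partial sum of I + A + A^2 + ...
    S += term
    term = term @ A

assert np.allclose(S, np.linalg.inv(np.eye(2) - A))
```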
10.3.22.
(a) Gerschgorin disk: |z − 1| ≤ 2; eigenvalues: 3, −1.
(b) Gerschgorin disks: |z − 1| ≤ 2/3, |z + 1/6| ≤ 1/2; eigenvalues: 1/2, 1/3.
(c) Gerschgorin disks: |z − 2| ≤ 3, |z| ≤ 1; eigenvalues: 1 ± i√2.
(d) Gerschgorin disks: |z − 3| ≤ 1, |z − 2| ≤ 2; eigenvalues: 4, 3, 1.
(e) Gerschgorin disks: |z + 1| ≤ 2, |z − 2| ≤ 3, |z + 4| ≤ 3; eigenvalues: −2.69805 ± .806289 i, 2.3961.
(f) Gerschgorin disks: z = 1/2, |z| ≤ 1/3, |z| ≤ 5/12; eigenvalues: 1/2, ±1/(3√2).
(g) Gerschgorin disks: |z| ≤ 1, |z − 1| ≤ 1; eigenvalues: 0, 1 ± i.
(h) Gerschgorin disks: |z − 3| ≤ 2, |z − 2| ≤ 1, |z| ≤ 1, |z − 1| ≤ 2; eigenvalues: 1/2 ± √5/2, 5/2 ± √5/2.
(The accompanying plots of the Gerschgorin domains and eigenvalues are omitted from this transcript.)
10.3.23. False. Almost any non-symmetric matrix, e.g., ( 2 1 ; 0 1 ), provides a counterexample.
♦ 10.3.24.
(i) Because A and its transpose A^T have the same eigenvalues, which must therefore belong to both D_A and D_{A^T}.
(ii)
(a) Gerschgorin disk: |z − 1| ≤ 2; eigenvalues: 3, −1.
(b) Gerschgorin disks: |z − 1| ≤ 1/2, |z + 1/6| ≤ 2/3; eigenvalues: 1/2, 1/3.
(c) Gerschgorin disks: |z − 2| ≤ 1, |z| ≤ 3; eigenvalues: 1 ± i√2.
(d) Gerschgorin disks: |z − 3| ≤ 1, |z − 2| ≤ 2; eigenvalues: 4, 3, 1.
(e) Gerschgorin disks: |z + 1| ≤ 2, |z − 2| ≤ 4, |z + 4| ≤ 2; eigenvalues: −2.69805 ± .806289 i, 2.3961.
(f) Gerschgorin disks: |z − 1/2| ≤ 1/4, |z| ≤ 1/6, |z| ≤ 1/3; eigenvalues: 1/2, ±1/(3√2).
(g) Gerschgorin disks: z = 0, |z − 1| ≤ 2, |z − 1| ≤ 1; eigenvalues: 0, 1 ± i.
(h) Gerschgorin disks: |z − 3| ≤ 1, |z − 2| ≤ 2, |z| ≤ 2, |z − 1| ≤ 1; eigenvalues: 1/2 ± √5/2, 5/2 ± √5/2.
(The accompanying plots are again omitted from this transcript.)
♦ 10.3.25. By elementary geometry, all points z in a closed disk of radius r centered at z = a satisfy max{0, |a| − r} ≤ |z| ≤ |a| + r. Thus, every point in the ith Gerschgorin disk satisfies max{0, |a_ii| − r_i} ≤ |z| ≤ |a_ii| + r_i = s_i. Since every eigenvalue lies in such a disk, they all satisfy max{0, t} ≤ |λ_i| ≤ s, and hence ρ(A) = max |λ_i| does too.
10.3.26. (a) The absolute row sums of A are bounded by s_i = Σ_{j=1}^{n} |a_ij| < 1, and so ρ(A) ≤ s = max s_i < 1 by Exercise 10.3.25. (b) A = ( 1/2 1/2 ; 1/2 1/2 ) has eigenvalues 0, 1 and hence ρ(A) = 1.
10.3.27. Using Exercise 10.3.25, we find ρ(A) ≤ s = max{s1, . . . , sn} ≤ n a*.
10.3.28. For instance, any diagonal matrix whose diagonal entries satisfy 0 < | aii | < 1.
10.3.29. Both false. ( 1 2 ; 2 5 ) is a counterexample to (a), while ( 1 0 ; 0 −1 ) is a counterexample to (b). However, see Exercise 10.3.30.
♦ 10.3.30. The eigenvalues of K are real by Theorem 8.20. The ith Gerschgorin disk is centered at kii > 0 and, by diagonal dominance, its radius is less than the distance from its center to the origin. Therefore, all eigenvalues of K must be positive and hence, by Theorem 8.23, K > 0.
10.3.31.
(a) For example, A = ( 0 1 ; 1 0 ) has Gerschgorin domain | z | ≤ 1.
(b) No; see the proof of Theorem 10.37.
10.3.32. The ith Gerschgorin disk is centered at aii < 0 and, by diagonal dominance, its radius is less than the distance from its center to the origin. Therefore, all eigenvalues of A lie in the left half plane: Re λ < 0, which, by Theorem 9.15, implies asymptotic stability of the differential equation.
10.4.1. (a) Not a transition matrix; (b) not a transition matrix; (c) regular transition matrix: ( 8/17, 9/17 )T; (d) regular transition matrix: ( 1/6, 5/6 )T; (e) not a regular transition matrix; (f) regular transition matrix: ( 1/3, 1/3, 1/3 )T; (g) regular transition matrix: ( .2415, .4348, .3237 )T; (h) not a transition matrix; (i) regular transition matrix: ( 6/13, 4/13, 3/13 )T = ( .4615, .3077, .2308 )T; (j) not a regular transition matrix; (k) not a transition matrix; (l) regular transition matrix (A4 has all positive entries): ( 251/1001, 225/1001, 235/1001, 290/1001 )T = ( .250749, .224775, .234765, .28971 )T; (m) regular transition matrix: ( .2509, .2914, .1977, .2600 )T.
10.4.2. (a) 20.5%; (b) 9.76% farmers, 26.83% laborers, 63.41% professionals
10.4.3. 2004: 37,000 city, 23,000 country; 2005: 38,600 city, 21,400 country; 2006: 39,880 city, 20,120 country; 2007: 40,904 city, 19,096 country; 2008: 41,723 city, 18,277 country. Eventually: 45,000 in the city and 15,000 in the country.
10.4.4. 58.33% of the nights.
10.4.5. When in Atlanta he always goes to Boston; when in Boston he has a 50% probability of going to either Atlanta or Chicago; when in Chicago he has a 50% probability of going to either Atlanta or Boston. The transition matrix is regular because
T^4 = ( .375 .3125 .3125 ; .25 .5625 .5 ; .375 .125 .1875 )
has all positive entries. On average he visits Atlanta 33.33%, Boston 44.44%, and Chicago 22.22% of the time.
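The limiting visit frequencies can be confirmed by iterating the chain. The matrix below is assembled from the verbal description above (columns are "from Atlanta", "from Boston", "from Chicago"; each column sums to 1):

```python
import numpy as np

T = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.5],
              [0.0, 0.5, 0.0]])

u = np.array([1.0, 0.0, 0.0])   # start in Atlanta
for _ in range(200):            # iterate u^(k+1) = T u^(k)
    u = T @ u

# Limiting probabilities: 1/3 Atlanta, 4/9 Boston, 2/9 Chicago.
assert np.allclose(u, [1/3, 4/9, 2/9], atol=1e-8)
```

Since the chain is regular, the same limit is reached from any initial probability vector.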
10.4.6. The transition matrix T = ( 0 2/3 2/3 ; 1 0 1/3 ; 0 1/3 0 ) is regular because
T^4 = ( 14/27 26/81 26/81 ; 2/9 49/81 16/27 ; 7/27 2/27 7/81 )
has all positive entries. She visits branch A 40% of the time, branch B 45%, and branch C 15%.
10.4.7. 25% red, 50% pink, 25% white.
10.4.8. If u(0) = ( a, b )T is the initial state vector, then the subsequent state vectors switch back and forth between ( b, a )T and ( a, b )T. At each step in the process, all of the population in state 1 goes to state 2 and vice versa, so the system never settles down.
10.4.9. This is not a regular transition matrix, so we need to analyze the iterative process directly. The eigenvalues of A are λ1 = λ2 = 1 and λ3 = 1/2, with corresponding eigenvectors v1 = ( 1, 0, 0 )T, v2 = ( 0, 0, 1 )T, and v3 = ( 1, −2, 1 )T. Thus, the solution with u(0) = ( p0, q0, r0 )T is
u(n) = ( p0 + (1/2) q0 ) ( 1, 0, 0 )T + ( (1/2) q0 + r0 ) ( 0, 0, 1 )T − ( q0 / 2^{n+1} ) ( 1, −2, 1 )T −→ ( p0 + (1/2) q0, 0, (1/2) q0 + r0 )T.
Therefore, this breeding process eventually results in a population with individuals of genotypes AA and aa only, the proportions of each depending upon the initial population.
10.4.10. Numbering the vertices from top to bottom and left to right, the transition matrix is

T = [ 0    1/4  1/4  0    0    0
      1/2  0    1/4  1/2  1/4  0
      1/2  1/4  0    0    1/4  1/2
      0    1/4  0    0    1/4  0
      0    1/4  1/4  1/2  0    1/2
      0    0    1/4  0    1/4  0 ].

The probability eigenvector is ( 1/9, 2/9, 2/9, 1/9, 2/9, 1/9 )T, and so the bug spends, on average, twice as much time at the edge vertices as at the corner vertices.
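For a random walk on a graph, the probability eigenvector is proportional to the vertex degrees, which explains the 2:1 edge-to-corner ratio. A sketch; the edge list is an assumed reconstruction consistent with the transition matrix above:

```python
import numpy as np

# Adjacency of the 6-vertex triangular graph (assumed reconstruction).
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (1, 4), (2, 4), (2, 5), (3, 4), (4, 5)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

deg = A.sum(axis=0)
T = A / deg                  # column j: equal probability to each neighbor of j
v = deg / deg.sum()          # stationary distribution, proportional to degree

assert np.allclose(T @ v, v)                               # T v = v
assert np.allclose(v, [1/9, 2/9, 2/9, 1/9, 2/9, 1/9])
```

The identity T v = v holds for any connected graph: (T v)_i = Σ_j A_ij / (2 · #edges) = deg_i / (2 · #edges).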
10.4.11. Numbering the vertices from top to bottom and left to right, the transition matrix is

T = [ 0    1/4  1/4  0    0    0    0    0    0    0
      1/2  0    1/4  1/4  1/6  0    0    0    0    0
      1/2  1/4  0    0    1/6  1/4  0    0    0    0
      0    1/4  0    0    1/6  0    1/2  1/4  0    0
      0    1/4  1/4  1/4  0    1/4  0    1/4  1/4  0
      0    0    1/4  0    1/6  0    0    0    1/4  1/2
      0    0    0    1/4  0    0    0    1/4  0    0
      0    0    0    1/4  1/6  0    1/2  0    1/4  0
      0    0    0    0    1/6  1/4  0    1/4  0    1/2
      0    0    0    0    0    1/4  0    0    1/4  0 ].

The probability eigenvector is ( 1/18, 1/9, 1/9, 1/9, 1/6, 1/9, 1/18, 1/9, 1/9, 1/18 )T, and so the bug spends, on average, twice as much time on the edge vertices and three times as much time at the center vertex as at the corner vertices.
10.4.12. The transition matrix

T = [ 0    1/3  0    1/3  0    0    0    0    0
      1/2  0    1/2  0    1/4  0    0    0    0
      0    1/3  0    0    0    1/3  0    0    0
      1/2  0    0    0    1/4  0    1/2  0    0
      0    1/3  0    1/3  0    1/3  0    1/3  0
      0    0    1/2  0    1/4  0    0    0    1/2
      0    0    0    1/3  0    0    0    1/3  0
      0    0    0    0    1/4  0    1/2  0    1/2
      0    0    0    0    0    1/3  0    1/3  0 ]

is not regular. Indeed, if, say, the bug starts out at a corner, then after an odd number of steps it can only be at one of the edge vertices, while after an even number of steps it will be either at a corner vertex or the center vertex. Thus, the iterates u(n) do not converge. If the bug starts at vertex i, so u(0) = ei, after a while the probability vectors u(n) end up switching back and forth between the probability vectors v = T w = ( 0, 1/4, 0, 1/4, 0, 1/4, 0, 1/4, 0 )T, where the bug has an equal probability of being at any edge vertex, and w = T v = ( 1/6, 0, 1/6, 0, 1/3, 0, 1/6, 0, 1/6 )T, where the bug is either at a corner vertex or, twice as likely, at the middle vertex.
♦ 10.4.13. The ith column of T^k is u(k) = T^k ei, which converges to v by Theorem 10.40.
10.4.14. In view of Exercise 10.4.13, the limit is ( 1/3 1/3 1/3 ; 1/3 1/3 1/3 ; 1/3 1/3 1/3 ).
10.4.15. First, v is a probability vector since the sum of its entries is q/(p + q) + p/(p + q) = 1. Moreover,
A v = ( ((1 − p) q + q p)/(p + q), (p q + (1 − q) p)/(p + q) )T = ( q/(p + q), p/(p + q) )T = v,
proving it is an eigenvector for eigenvalue 1.
10.4.16. All equal probabilities: z = ( 1/n, . . . , 1/n )T.
10.4.17. z = ( 1/n, . . . , 1/n )T.
10.4.18. False: ( .3 .5 .2 ; .3 .2 .5 ; .4 .3 .3 ) is a counterexample.
10.4.19. False. For instance, if T = ( 1/2 1/3 ; 1/2 2/3 ), then T^−1 = ( 4 −2 ; −3 3 ), while T = ( 1/2 1/2 ; 1/2 1/2 ) is not even invertible.
10.4.20. False. For instance, 0 is not a probability vector.
10.4.21. (a) The 1 norm.
10.4.22. True. If v = ( v1, v2, . . . , vn )T is a probability eigenvector, then Σ_{i=1}^n vi = 1 and Σ_{j=1}^n tij vj = λ vi for all i = 1, . . . , n. Summing the latter equations over i, we find
λ = λ Σ_{i=1}^n vi = Σ_{i=1}^n Σ_{j=1}^n tij vj = Σ_{j=1}^n vj = 1,
since the column sums of a transition matrix are all equal to 1.
10.4.23. (a) ( 0 1 ; 1 0 ); (b) ( 0 1/2 ; 1 1/2 ).
♦ 10.4.24. The ith entry of v is vi = Σ_{j=1}^n tij uj. Since each tij ≥ 0 and uj ≥ 0, the sum vi ≥ 0 also. Moreover, Σ_{i=1}^n vi = Σ_{i,j=1}^n tij uj = Σ_{j=1}^n uj = 1 because all the column sums of T are equal to 1, and u is a probability vector.
♦ 10.4.25.
(a) The columns of T S are obtained by multiplying T by the columns of S. Since S is a transition matrix, its columns are probability vectors. Exercise 10.4.24 shows that each column of T S is also a probability vector, and so the product is a transition matrix.
(b) This follows by induction from part (a), where we write T^{k+1} = T T^k.
10.5.1.
(a) The eigenvalues are −1/2, 1/3, so ρ(T) = 1/2.
(b) The iterates will converge to the fixed point ( −1/6, 1 )T at rate 1/2. Asymptotically, they come in to the fixed point along the direction of the dominant eigenvector ( −3, 2 )T.
10.5.2.
(a) ρ(T) = 2; the iterates diverge: ‖ u(k) ‖ → ∞ at a rate of 2.
(b) ρ(T) = 3/4; the iterates converge to the fixed point ( 1.6, .8, 7.2 )T at a rate 3/4, along the dominant eigenvector direction ( 1, 2, 6 )T.
(c) ρ(T) = 1/2; the iterates converge to the fixed point ( −1, .4, 2.6 )T at a rate 1/2, along the dominant eigenvector direction ( 0, −1, 1 )T.
10.5.3. (a,b,e,g) are diagonally dominant.
♠ 10.5.4. (a) x = 1/7 = .142857, y = −2/7 = −.285714; (b) x = −30, y = 48; (e) x = −1.9172, y = −.339703, z = −2.24204; (g) x = −.84507, y = −.464789, z = −.450704.
♠ 10.5.5. (c) Jacobi spectral radius = .547723, so Jacobi converges to the solution x = 8/7 = 1.142857, y = 19/7 = 2.71429; (d) Jacobi spectral radius = .5, so Jacobi converges to the solution x = −10/9 = −1.1111, y = −13/9 = −1.4444, z = 2/9 = .2222; (f) Jacobi spectral radius = 1.1180, so Jacobi does not converge.
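The Jacobi computations above can be reproduced with a few lines of code. A sketch; the system below is an illustrative diagonally dominant example, not one taken from the exercises:

```python
import numpy as np

def jacobi(A, b, iters=100):
    """Jacobi iteration u <- D^{-1}(b - (L+U) u); assumes a nonzero
    diagonal (diagonal dominance guarantees convergence)."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    d = np.diag(A)
    R = A - np.diag(d)            # off-diagonal part L + U
    u = np.zeros_like(b)
    for _ in range(iters):
        u = (b - R @ u) / d
    return u

# Illustrative diagonally dominant system.
A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
u = jacobi(A, b)
assert np.allclose(A @ u, b, atol=1e-8)
```

The spectral radius of the iteration matrix −D^{−1}(L + U) governs the convergence rate, exactly as in the exercises above.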
10.5.6. (a) u = ( .7857, .3571 )T; (b) u = ( −4, 5 )T; (c) u = ( .3333, −1.0000, 1.3333 )T; (d) u = ( .7273, −3.1818, .6364 )T; (e) u = ( .8750, −.1250, −.1250, −.1250 )T; (f) u = ( 0., .7143, −.1429, −.2857 )T.
♣ 10.5.7.
(a) | c | > 2.
(b) If c = 0, then D = c I = O, and Jacobi iteration isn't even defined. Otherwise, T = −D−1(L + U) is tridiagonal with diagonal entries all 0 and sub- and super-diagonal entries equal to −1/c. According to Exercise 8.2.48, the eigenvalues are −(2/c) cos( kπ/(n + 1) ) for k = 1, . . . , n, and so the spectral radius is ρ(T) = (2/| c |) cos( π/(n + 1) ). Thus, convergence requires | c | > 2 cos( π/(n + 1) ); in particular, | c | ≥ 2 will ensure convergence for any n.
(c) For n = 5, the solution is u = ( .8333, −.6667, .5000, −.3333, .1667 )T, with a convergence rate of ρ(T) = cos(π/6) = .8660. It takes 51 iterations to obtain 3 decimal place accuracy, while log(.5 × 10^−4)/log ρ(T) ≈ 53. For n = 10, the solution is u = ( .9091, −.8182, .7273, −.6364, .5455, −.4545, .3636, −.2727, .1818, −.0909 )T, with a convergence rate of cos(π/11) = .9595. It takes 173 iterations to obtain 3 decimal place accuracy, while log(.5 × 10^−4)/log ρ(T) ≈ 184. For n = 20, u = ( .9524, −.9048, .8571, −.8095, .7619, −.7143, .6667, −.6190, .5714, −.5238, .4762, −.4286, .3810, −.3333, .2857, −.2381, .1905, −.1429, .0952, −.0476 )T, with a convergence rate of cos(π/21) = .9888. It takes 637 iterations to obtain 3 decimal place accuracy, while log(.5 × 10^−4)/log ρ(T) ≈ 677.
10.5.8. If A u = 0, then D u = −(L + U) u, and hence T u = −D−1(L + U) u = u, proving that u is an eigenvector for T with eigenvalue 1. Therefore, ρ(T) ≥ 1, which implies that T is not a convergent matrix.
♦ 10.5.9. If A is nonsingular, then at least one of the terms in the general determinant expansion (1.85) is nonzero. If a1,π(1) a2,π(2) · · · an,π(n) ≠ 0, then each ai,π(i) ≠ 0. Applying the permutation π to the rows of A will produce a matrix whose diagonal entries are all nonzero.
10.5.10. Assume, for simplicity, that T is complete with a single dominant eigenvalue λ1, so that ρ(T) = | λ1 | and | λ1 | > | λj | for j > 1. We expand the initial error e(0) = c1 v1 + · · · + cn vn in terms of the eigenvectors. Then e(k) = T^k e(0) = c1 λ1^k v1 + · · · + cn λn^k vn, which, for k ≫ 0, is approximately e(k) ≈ c1 λ1^k v1. Thus, ‖ e(k+j) ‖ ≈ ρ(T)^j ‖ e(k) ‖. In particular, if at iteration number k we have m decimal places of accuracy, so ‖ e(k) ‖ ≤ .5 × 10^−m, then, approximately, ‖ e(k+j) ‖ ≤ .5 × 10^{−m + j log10 ρ(T)} = .5 × 10^{−m−1} provided j = −1/log10 ρ(T).
10.5.11. False for elementary row operations of types 1 & 2, but true for those of type 3.
♥ 10.5.12.
(a) x = ( 7/23, 6/23, 40/23 )T = ( .30435, .26087, 1.73913 )T;
(b) x(1) = ( −.5, −.25, 1.75 )T, x(2) = ( .4375, .0625, 1.8125 )T, x(3) = ( .390625, .3125, 1.65625 )T, with error e(3) = ( .0862772, .0516304, −.0828804 )T;
(c) x(k+1) = ( 0 −1/4 1/2 ; 1/4 0 −1/4 ; −1/4 −1/4 0 ) x(k) + ( −1/2, −1/4, 7/4 )T;
(d) x(1) = ( −.5, −.375, 1.78125 )T, x(2) = ( .484375, .316406, 1.70801 )T, x(3) = ( .274902, .245728, 1.74271 )T; the error at the third iteration is e(3) = ( −.029446, −.015142, .003576 )T, which is about 30% of the Jacobi error;
(e) x(k+1) = ( 0 −1/4 1/2 ; 0 −1/16 3/8 ; 0 3/64 −1/32 ) x(k) + ( −1/2, −3/8, 57/32 )T;
(f) ρ(TJ) = √3/4 = .433013;
(g) ρ(TGS) = (3 + √73)/64 = .180375, so Gauss–Seidel converges about log ρGS/log ρJ = 2.046 times as fast;
(h) approximately log(.5 × 10^−6)/log ρGS ≈ 8.5 iterations;
(i) under Gauss–Seidel, x(9) = ( .304347, .260869, 1.73913 )T, with error e(9) = 10^−6 · ( −1.0475, −.4649, .1456 )T.
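The Gauss–Seidel sweep used in these computations updates each entry with the newest available values. A sketch; the system is an illustrative diagonally dominant example, not the exercise's:

```python
import numpy as np

def gauss_seidel(A, b, iters=100):
    """One Gauss-Seidel sweep per outer iteration: each u[i] is updated
    in place, so later rows already use the new values."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    n = len(b)
    u = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            u[i] = (b[i] - A[i, :i] @ u[:i] - A[i, i + 1:] @ u[i + 1:]) / A[i, i]
    return u

A = np.array([[4.0, 1.0], [2.0, 5.0]])   # illustrative, diagonally dominant
b = np.array([1.0, 2.0])
u = gauss_seidel(A, b)
assert np.allclose(A @ u, b, atol=1e-10)
```

Compared with Jacobi, the only change is that the sweep reads the partially updated vector, which is what typically halves the number of iterations in the exercises above.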
♠ 10.5.13. (a) x = 1/7 = .142857, y = −2/7 = −.285714; (b) x = −30, y = 48; (e) x = −1.9172, y = −.339703, z = −2.24204; (g) x = −.84507, y = −.464789, z = −.450704.
10.5.14. (a) ρJ = .2582, ρGS = .0667; (b) ρJ = .7303, ρGS = .5333; (c) ρJ = .5477, ρGS = .3; (d) ρJ = .5, ρGS = .2887; (e) ρJ = .4541, ρGS = .2887; (f) ρJ = .3108, ρGS = .1667; (g) ρJ = 1.118, ρGS = .7071. Thus, all systems lead to convergent Gauss–Seidel schemes, with faster convergence than Jacobi (which doesn't even converge in case (g)).
10.5.15.
(a) Solution: u = ( .7857, .3571 )T; spectral radii: ρJ = 1/√15 = .2582, ρGS = 1/15 = .06667, so Gauss–Seidel converges exactly twice as fast.
(b) Solution: u = ( −4, 5 )T; spectral radii: ρJ = 1/√2 = .7071, ρGS = 1/2 = .5, so Gauss–Seidel converges exactly twice as fast.
(c) Solution: u = ( .3333, −1.0000, 1.3333 )T; spectral radii: ρJ = .7291, ρGS = .3104, so Gauss–Seidel converges log ρGS/log ρJ = 3.7019 times as fast.
(d) Solution: u = ( .7273, −3.1818, .6364 )T; spectral radii: ρJ = 2/√15 = .5164, ρGS = 4/15 = .2667, so Gauss–Seidel converges exactly twice as fast.
(e) Solution: u = ( .8750, −.1250, −.1250, −.1250 )T; spectral radii: ρJ = .6, ρGS = .1416, so Gauss–Seidel converges log ρGS/log ρJ = 3.8272 times as fast.
(f) Solution: u = ( 0., .7143, −.1429, −.2857 )T; spectral radii: ρJ = .4714, ρGS = .3105, so Gauss–Seidel converges log ρGS/log ρJ = 1.5552 times as fast.
♣ 10.5.16. (a) | c | > 2; (b) c > 1.61804; (c) same answer; (d) Gauss–Seidel converges exactly twice as fast since ρGS = ρJ² for all values of c.
♠ 10.5.17. The solution is x = .083799, y = .21648, z = 1.21508. The Jacobi spectral radius is .8166, and so it converges reasonably rapidly to the solution; indeed, after 50 iterations, x(50) = .0838107, y(50) = .216476, z(50) = 1.21514. On the other hand, the Gauss–Seidel spectral radius is 1.0994, and it slowly diverges; after 50 iterations, x(50) = −30.5295, y(50) = 9.07764, z(50) = −90.8959.
♠ 10.5.18. The solution is x = y = z = w = 1. Gauss–Seidel converges, but extremely slowly. Starting with the initial guess x(0) = y(0) = z(0) = w(0) = 0, after 2000 iterations the approximate solution x(2000) = 1.00281, y(2000) = .99831, z(2000) = .999286, w(2000) = 1.00042 is correct to 2 decimal places. The spectral radius is .9969 and so it takes, on average, 741 iterations per decimal place.
10.5.19. ρ(TJ) = 0, while ρ(TGS) = 2. Thus Jacobi converges extremely rapidly, whereas Gauss–Seidel diverges.
♠ 10.5.20. Jacobi doesn't converge because its spectral radius is 3.4441. Gauss–Seidel converges, but extremely slowly, since its spectral radius is .999958.
♦ 10.5.21. For a general matrix, both Jacobi and Gauss–Seidel require k n(n − 1) multiplications and k n(n − 1) additions to perform k iterations, along with n² divisions to set up the initial matrix T and vector c. They are more efficient than Gaussian Elimination provided the number of steps k < n/3 (approximately).
♣ 10.5.22. (a) Diagonal dominance requires | z | > 4. (b) The solution is u = ( .0115385, −.0294314, −.0755853, .0536789, .31505, .0541806, −.0767559, −.032107, .0140468, .0115385 )T. It takes 41 Jacobi iterations and 6 Gauss–Seidel iterations to compute the first three decimal places of the solution. (c) Computing the spectral radius, we conclude that the Jacobi scheme converges to the solution whenever | z | > 3.6387, while the Gauss–Seidel scheme converges for z < −3.6386 or z > 2.
♣ 10.5.23.
(a) If λ is an eigenvalue of T = I − A, then µ = 1 − λ is an eigenvalue of A, and hence the eigenvalues of A must satisfy | 1 − µ | < 1, i.e., they all lie within a distance 1 of 1.
(b) The Gerschgorin disks are D1 = { | z − .8 | ≤ .2 }, D2 = { | z − 1.5 | ≤ .3 }, D3 = { | z − 1 | ≤ .3 }, and hence all eigenvalues of A are within a distance 1 of 1. Indeed, we can explicitly compute the eigenvalues of A, which are µ1 = 1.5026, µ2 = .8987 + .1469 i, µ3 = .8987 − .1469 i. Hence, the spectral radius of T = I − A is ρ(T) = max | 1 − µj | = .5026. Starting the iterations with u(0) = 0, we arrive at the solution u⋆ = ( 1.36437, −.73836, 1.65329 )T to 4 decimal places after 13 iterations.
♥ 10.5.24.
(a) u = ( 1.4, .2 )T.
(b) The spectral radius is ρJ = .40825 and so it takes about −1/log10 ρJ ≈ 2.57 iterations to produce each additional decimal place of accuracy.
(c) The spectral radius is ρGS = .16667 and so it takes about −1/log10 ρGS ≈ 1.29 iterations to produce each additional decimal place of accuracy.
(d) u(n+1) = ( 1 − ω   −ω/2 ; −(1/3)(1 − ω) ω   ω²/6 − ω + 1 ) u(n) + ( (3/2) ω, (2/3) ω − (1/2) ω² )T.
(e) The SOR spectral radius is minimized when the two eigenvalues of Tω coincide, which occurs when ω⋆ = 1.04555, at which value ρ⋆ = ω⋆ − 1 = .04555, so the optimal SOR method is almost 3.5 times as fast as Jacobi, and about 1.7 times as fast as Gauss–Seidel.
(f) For Jacobi, about −5/log10 ρJ ≈ 13 iterations; for Gauss–Seidel, about −5/log10 ρGS = 7 iterations; for optimal SOR, about −5/log10 ρSOR ≈ 4 iterations.
(g) To obtain 5 decimal place accuracy, Jacobi requires 12 iterations, Gauss–Seidel requires 6 iterations, while optimal SOR requires 5 iterations.
♣ 10.5.25. (a) x = .5, y = .75, z = .25, w = .5. (b) To obtain 5 decimal place accuracy, Jacobi requires 14 iterations, Gauss–Seidel requires 8 iterations. One can get very good approximations of the spectral radii ρJ = .5, ρGS = .25, by taking ratios of entries of successive iterates, or the ratio of norms of successive error vectors. (c) The optimal SOR scheme has ω = 1.0718, and requires 6 iterations to get 5 decimal place accuracy. The SOR spectral radius is ρSOR = .0718.
♠ 10.5.26. (a) ρJ = (1 + √5)/4 = .809017, ρGS = (3 + √5)/8 = .654508; (b) no; (c) ω⋆ = 1.25962 and ρ⋆ = .25962; (d) the solution is x = ( .8, −.6, .4, −.2 )T. Jacobi: predicted 44 iterations, actual 45 iterations. Gauss–Seidel: predicted 22 iterations, actual 22 iterations. Optimal SOR: predicted 7 iterations, actual 9 iterations.
♠ 10.5.27. (a) ρJ = (1 + √5)/4 = .809017, ρGS = (3 + √5)/8 = .654508; (b) no; (c) ω⋆ = 1.25962 and ρ⋆ = 1.51315, so SOR with that value of ω doesn't converge! However, by numerically computing the spectral radius of Tω, the optimal value is found to be ω⋆ = .874785 (under-relaxation), with ρ⋆ = .125215. (d) The solution is x = ( .413793, −.172414, .0689655, −.0344828 )T. Jacobi: predicted 44 iterations, actual 38 iterations. Gauss–Seidel: predicted 22 iterations, actual 19 iterations. Optimal SOR: predicted 5 iterations, actual 5 iterations.
♠ 10.5.28. (a) The Jacobi iteration matrix TJ = −D−1(L + U) is tridiagonal with all 0's on the main diagonal and 1/2's on the sub- and super-diagonals. Thus, using Exercise 8.2.47, ρJ = cos(π/9) < 1, and so Jacobi converges. (b) ω⋆ = 2/(1 + sin(π/9)) = 1.49029. Since ρ⋆ = .490291, it takes log ρ⋆/log ρJ ≈ 11.5 Jacobi steps per SOR step. (c) The solution is u = ( 8/9, 7/9, 2/3, 5/9, 4/9, 1/3, 2/9, 1/9 )T = ( .8889, .7778, .6667, .5556, .4444, .3333, .2222, .1111 )T. Starting with u(0) = 0, it takes 116 Jacobi iterations versus 13 SOR iterations to achieve 3 place accuracy.
♠ 10.5.29. The optimal value for SOR is ω = 1.80063, with spectral radius ρSOR = .945621. Starting with x(0) = y(0) = z(0) = w(0) = 0, it takes 191 iterations to obtain 2 decimal place accuracy in the solution. Each additional decimal place requires about −1/log10 ρSOR ≈ 41 iterations, which is about 18 times as fast as Gauss–Seidel.
♠ 10.5.30. The Jacobi and Gauss–Seidel spectral radii are ρJ = √7/3 = .881917 and ρGS = 7/9 = .777778, respectively. It takes 99 Jacobi iterations versus 6 Gauss–Seidel iterations to obtain the solution with 5 decimal place accuracy. Using (10.86) fixes the optimal SOR parameter ω⋆ = 1.35925 with spectral radius ρ⋆ = .359246. However, it takes 16 iterations to obtain the solution with 5 decimal place accuracy, which is significantly slower than Gauss–Seidel, which converges much faster than it should, owing to the particular right hand side of the linear system.
♣ 10.5.31.
(a) u = ( .0625, .125, .0625, .125, .375, .125, .0625, .125, .0625 )T.
(b) It takes 11 Jacobi iterations to compute the first two decimal places of the solution, and 17 iterations for 3 place accuracy.
(c) It takes 6 Gauss–Seidel iterations to compute the first two decimal places of the solution, and 9 iterations for 3 place accuracy.
(d) ρJ = 1/√2, and so, by (10.86), the optimal SOR parameter is ω⋆ = 1.17157. It takes only 4 iterations for 2 decimal place accuracy, and 6 iterations for 3 places.
♣ 10.5.32. Using (10.86), the optimal SOR parameter is ω⋆ = 2/(1 + √(1 − ρJ²)) = 2/(1 + sin(π/(n + 1))). For the n = 5 system, ρJ = √3/2, and ω⋆ = 4/3 with ρ⋆ = 1/3, and the convergence is about 8 times as fast as Jacobi, and 4 times as fast as Gauss–Seidel. For the n = 25 system, ρJ = .992709, and ω⋆ = 1.78486 with ρ⋆ = .78486, and the convergence is about 33 times as fast as Jacobi, and 16.5 times as fast as Gauss–Seidel.
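The formula (10.86) for the optimal relaxation parameter can be exercised directly. The sketch below assumes the n = 5 system is the standard tridiagonal matrix with 2's on the diagonal and −1's off it, which is consistent with ρJ = cos(π/6) = √3/2:

```python
import numpy as np

def sor(A, b, omega, iters=200):
    """SOR sweep; omega = 1 reduces to Gauss-Seidel (a sketch)."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    n = len(b)
    u = np.zeros(n)
    for _ in range(iters):
        for i in range(n):
            gs = (b[i] - A[i, :i] @ u[:i] - A[i, i + 1:] @ u[i + 1:]) / A[i, i]
            u[i] = (1 - omega) * u[i] + omega * gs
    return u

n = 5                                       # assumed tridiagonal system
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
rho_J = np.cos(np.pi / (n + 1))             # Jacobi spectral radius = sqrt(3)/2
omega = 2 / (1 + np.sqrt(1 - rho_J**2))     # formula (10.86); equals 4/3 here
u = sor(A, b, omega)
assert np.allclose(A @ u, b, atol=1e-8)
```

With ρ⋆ = ω⋆ − 1 = 1/3, the 200 sweeps above are far more than needed; a few dozen already give machine precision.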
♣ 10.5.33. The Jacobi spectral radius is ρJ = .909657. Using (10.86) to fix the SOR parameter ω = 1.41307 actually slows down the convergence, since ρSOR = .509584 while ρGS = .32373. Computing the spectral radius directly, the optimal SOR parameter is ω⋆ = 1.17157 with ρ⋆ = .290435. Thus, optimal SOR is about 13 times as fast as Jacobi, but only marginally faster than Gauss–Seidel.
♦ 10.5.34. The two eigenvalues are
λ1 = (1/8) ( ω² − 8ω + 8 + ω √(ω² − 16ω + 16) ),  λ2 = (1/8) ( ω² − 8ω + 8 − ω √(ω² − 16ω + 16) ).
They are real for 0 ≤ ω ≤ 8 − 4√3. A graph of the modulus of the eigenvalues over the range 0 ≤ ω ≤ 2 reveals that, as ω increases, the smaller eigenvalue is increasing and the larger decreasing until they meet at 8 − 4√3; after this point, both eigenvalues are complex conjugates of the same modulus. To prove this analytically, we compute
dλ2/dω = (3 − ω)/4 + (2 − ω)/√(ω² − 16ω + 16) > 0
for 1 ≤ ω ≤ 8 − 4√3, and so the smaller eigenvalue is increasing. Furthermore,
dλ1/dω = (3 − ω)/4 − (2 − ω)/√(ω² − 16ω + 16) < 0
on the same interval, so the larger eigenvalue is decreasing. Once ω > 8 − 4√3, the eigenvalues are complex conjugates, of equal modulus | λ1 | = | λ2 | = ω − 1 > ω⋆ − 1.
♥ 10.5.35.
(a) u(k+1) = u(k) + D−1 r(k) = u(k) − D−1 A u(k) + D−1 b = u(k) − D−1(L + D + U) u(k) + D−1 b = −D−1(L + U) u(k) + D−1 b, which agrees with (10.65).
(b) u(k+1) = u(k) + (L + D)−1 r(k) = u(k) − (L + D)−1 A u(k) + (L + D)−1 b = u(k) − (L + D)−1(L + D + U) u(k) + (L + D)−1 b = −(L + D)−1 U u(k) + (L + D)−1 b, which agrees with (10.71).
(c) u(k+1) = u(k) + (ωL + D)−1 r(k) = u(k) − (ωL + D)−1 A u(k) + (ωL + D)−1 b = u(k) − (ωL + D)−1(L + D + U) u(k) + (ωL + D)−1 b = −(ωL + D)−1 ( (1 − ω) D + U ) u(k) + (ωL + D)−1 b, which agrees with (10.80).
(d) If u⋆ is the exact solution, so A u⋆ = b, then r(k) = A(u⋆ − u(k)) and so ‖ u(k) − u⋆ ‖ ≤ ‖ A−1 ‖ ‖ r(k) ‖. Thus, if ‖ r(k) ‖ is small, the iterate u(k) is close to the solution u⋆ provided ‖ A−1 ‖ is not too large. For instance, if A = ( 1 0 ; 0 .0001 ) and b = ( 1, 0 )T, then x = ( 1, 100 )T has residual r = b − A x = ( 0, −.01 )T, even though x is nowhere near the exact solution x⋆ = ( 1, 0 )T.
10.5.36. Note that the iteration matrix is T = I − ε A, which has eigenvalues 1 − ε λj. When 0 < ε < 2/λ1, the iterations converge. The optimal value is ε = 2/(λ1 + λn), with spectral radius ρ(T) = (λ1 − λn)/(λ1 + λn).
10.5.37. In each solution, the last uk is the actual solution, with residual rk = f − K uk = 0.
(a) r0 = ( 2, 1 )T, u1 = ( .76923, .38462 )T, r1 = ( .07692, −.15385 )T, u2 = ( .78571, .35714 )T;
(b) r0 = ( 1, 0, −2 )T, u1 = ( .5, 0, −1 )T, r1 = ( −1, −2, −.5 )T, u2 = ( .51814, −.72539, −1.94301 )T, r2 = ( 1.28497, −.80311, .64249 )T, u3 = ( 1., −1.4, −2.2 )T;
(c) r0 = ( −1, −2, 7 )T, u1 = ( −.13466, −.26933, .94264 )T, r1 = ( 2.36658, −4.01995, −.81047 )T, u2 = ( −.13466, −.26933, .94264 )T, r2 = ( .72321, .38287, .21271 )T, u3 = ( .33333, −1.00000, 1.33333 )T;
(d) r0 = ( 1, 2, 0, −1 )T, u1 = ( .2, .4, 0, −.2 )T, r1 = ( 1.2, −.8, −.8, −.4 )T, u2 = ( .90654, .46729, −.33645, −.57009 )T, r2 = ( −1.45794, −.59813, −.26168, −2.65421 )T, u3 = ( 4.56612, .40985, −2.92409, −5.50820 )T, r3 = ( −1.36993, 1.11307, −3.59606, .85621 )T, u4 = ( 9.50, 1.25, −10.25, −13.00 )T;
(e) r0 = ( 4, 0, 0, 0 )T, u1 = ( .8, 0, 0, 0 )T, r1 = ( 0., −.8, −.8, −.8 )T, u2 = ( .875, −.125, −.125, −.125 )T.
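The iterates above follow the standard conjugate gradient recurrences. A compact sketch on an illustrative positive definite system (not one of the exercise's matrices):

```python
import numpy as np

def conjugate_gradient(K, f, iters=None):
    """Plain CG for positive definite K; in exact arithmetic it reaches
    the exact solution in at most n steps."""
    n = len(f)
    u = np.zeros(n)
    r = f - K @ u                 # initial residual r0 = f
    v = r.copy()                  # first conjugate direction
    for _ in range(iters or n):
        Kv = K @ v
        t = (r @ r) / (v @ Kv)    # step length
        u = u + t * v
        r_new = r - t * Kv
        if np.linalg.norm(r_new) < 1e-14:
            break
        v = r_new + ((r_new @ r_new) / (r @ r)) * v   # next conjugate direction
        r = r_new
    return u

K = np.array([[4.0, 1.0], [1.0, 3.0]])   # illustrative SPD matrix
f = np.array([1.0, 2.0])
u = conjugate_gradient(K, f)
assert np.allclose(K @ u, f, atol=1e-10)
```

As Exercise 10.5.40 below illustrates, positive definiteness of K is essential; on an indefinite matrix the same recurrences can wander away from the solution.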
♣ 10.5.38. Remarkably, after only two iterations, the method finds the exact solution: u3 = u⋆ = ( .0625, .125, .0625, .125, .375, .125, .0625, .125, .0625 )T, and hence the convergence is dramatically faster than the other iterative methods.
♣ 10.5.39. (a) n = 5: b = ( 2.28333, 1.45, 1.09286, .884524, .745635 )T; n = 10: b = ( 2.92897, 2.01988, 1.60321, 1.3468, 1.16823, 1.0349, .930729, .846695, .777251, .718771 )T; n = 30: b = ( 3.99499, 3.02725, 2.5585, 2.25546, 2.03488, 1.86345, 1.72456, 1.60873, 1.51004, 1.42457, 1.34957, 1.28306, 1.22353, 1.16986, 1.12116, 1.07672, 1.03596, .998411, .963689, .931466, .901466, .873454, .847231, .82262, .799472, .777654, .75705, .737556, .719084, .70155 )T. (b) For regular Gaussian Elimination, using standard arithmetic in Mathematica, the maximal error, i.e., the ∞ norm of the difference between the computed solution and u⋆, is: n = 5: 1.96931 × 10^−12; n = 10: 5.31355 × 10^−4; n = 30: 457.413. (c) Pivoting has little effect for n = 5, 10, but for n = 30 the error is reduced to 5.96011 × 10^−4. (d) Using the conjugate gradient algorithm all the way to completion results in the following errors: n = 5: 3.56512 × 10^−3; n = 10: 5.99222 × 10^−4; n = 30: 1.83103 × 10^−4, and so, at least for moderate values of n, it outperforms Gaussian Elimination with pivoting.
10.5.40. r0 = ( −2, 1, 1 )T, u1 = ( .9231, −.4615, −.4615 )T, r1 = ( .3077, 2.3846, −1.7692 )T, u2 = ( 2.7377, −3.0988, −.2680 )T, r2 = ( 7.2033, 4.6348, −4.3823 )T, u3 = ( 5.5113, −9.1775, .7262 )T, but the solution is u = ( 1, −1, 1 )T. The problem is that the coefficient matrix is not positive definite, and so the fact that the solution is "orthogonal" to the conjugate vectors does not uniquely specify it.
10.5.41. False. For example, consider the homogeneous system K u = 0, where K = ( .0001 0 ; 0 1 ), with solution u⋆ = 0. The residual for u = ( 1, 0 )T is r = −K u = ( −.0001, 0 )T with ‖ r ‖ = .0001, yet not even the leading digit of u agrees with the true solution. In general, if u⋆ is the true solution to K u = f, then the residual is r = f − K u = K(u⋆ − u), and hence ‖ u⋆ − u ‖ ≤ ‖ K−1 ‖ ‖ r ‖, so the result is valid only when ‖ K−1 ‖ ≤ 1.
10.5.42. Referring to the pseudocode program in the text, at each step:
computing rk = f − K uk requires n² multiplications and n² additions;
computing ‖ rk ‖² requires n multiplications and n − 1 additions;
computing vk+1 = rk + ( ‖ rk ‖²/‖ rk−1 ‖² ) vk requires n + 1 multiplications and n additions, since ‖ rk−1 ‖² was already computed in the previous iteration (but when k = 0 this step requires no work);
computing vk+1T K vk+1 requires n² + n multiplications and n² − 1 additions;
computing uk+1 = uk + ( ‖ rk ‖²/(vk+1T K vk+1) ) vk+1 requires n² multiplications and n² additions;
for a grand total of 2(n + 1)² ≈ 2n² multiplications and 2n² + 3n − 2 ≈ 2n² additions. Thus, if the number of steps k < n/6 (approximately), the conjugate gradient method is more efficient than Gaussian Elimination, which requires (1/3) n³ operations of each type.
♦ 10.5.43. tk = ‖ rk ‖²/( rkT K rk ) = ( ukT K² uk − 2 fT K uk + ‖ f ‖² )/( ukT K³ uk − 2 fT K² uk + fT K f ).
♠ 10.6.1. In all cases, we use the normalized version (10.101) starting with u(0) = e1; the answers are correct to 4 decimal places.
(a) After 17 iterations, λ = 2.00002, u = (−.55470, .83205 )T ;
(b) after 26 iterations, λ = −3.00003, u = ( .70711, .70710 )T ;
(c) after 38 iterations, λ = 3.99996, u = ( .57737,−.57735, .57734 )T ;
(d) after 121 iterations, λ = −3.30282, u = ( .35356, .81416,−.46059 )T ;
(e) after 36 iterations, λ = 5.54911, u = (−.39488, .71005, .58300 )T ;
(f ) after 9 iterations, λ = 5.23607, u = ( .53241, .53241, .65810 )T ;
(g) after 36 iterations, λ = 3.61800, u = ( .37176,−.60151, .60150,−.37174 )T ;
(h) after 30 iterations, λ = 5.99997, u = ( .50001, .50000, .50000, .50000 )T .
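The normalized power iteration (10.101) used throughout these answers is a few lines of code. A sketch; the 2 × 2 matrix below is an assumed example, not one of the exercise's:

```python
import numpy as np

def power_method(A, iters=500):
    """Normalized power iteration: u <- A u / ||A u||, starting from e1.
    Returns |lambda_1| (as ||A u||) and the unit eigenvector."""
    u = np.zeros(len(A))
    u[0] = 1.0
    for _ in range(iters):
        v = A @ u
        u = v / np.linalg.norm(v)
    return np.linalg.norm(A @ u), u

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # assumed example: eigenvalues 3, 1
lam, u = power_method(A)
assert np.isclose(lam, 3.0)
```

When λ1 < 0, the iterates flip sign each step, which is the back-and-forth behavior analyzed in Exercise 10.6.4.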
♠ 10.6.2. For n = 10, it takes 159 iterations to obtain λ1 = 3.9189 = 2 + 2 cos(π/11) to 4 decimal places; for n = 20, it takes 510 iterations to obtain λ1 = 3.9776 = 2 + 2 cos(π/21) to 4 decimal places; for n = 50, it takes 2392 iterations to obtain λ1 = 3.9962 = 2 + 2 cos(π/51) to 4 decimal places.
♠ 10.6.3. In each case, to find the dominant singular value of a matrix A, we apply the power method to K = AT A and take the square root of its dominant eigenvalue to find the dominant singular value σ1 = √λ1 of A.
(a) K = ( 2 −1 ; −1 13 ); after 11 iterations, λ1 = 13.0902 and σ1 = 3.6180;
(b) K = ( 8 −4 −4 ; −4 10 2 ; −4 2 2 ); after 15 iterations, λ1 = 14.4721 and σ1 = 3.8042;
(c) K = ( 5 2 2 −1 ; 2 8 2 −4 ; 2 2 1 −1 ; −1 −4 −1 2 ); after 16 iterations, λ1 = 11.6055 and σ1 = 3.4067;
(d) K = ( 14 −1 1 ; −1 6 −6 ; 1 −6 6 ); after 39 iterations, λ1 = 14.7320 and σ1 = 3.8382.
10.6.4. Since v(k) → c1 λ1^k v1 as k → ∞,
u(k) = v(k)/‖ v(k) ‖ −→ c1 λ1^k v1/( | c1 | | λ1 |^k ‖ v1 ‖ ) = u1 if λ1 > 0, or (−1)^k u1 if λ1 < 0,
where u1 = (sign c1) v1/‖ v1 ‖ is one of the two real unit eigenvectors. Moreover, A u(k) → λ1 u1 when λ1 > 0, and A u(k) → (−1)^k λ1 u1 when λ1 < 0, so ‖ A u(k) ‖ → | λ1 |. If λ1 > 0, the iterates u(k) → u1 converge to one of the two dominant unit eigenvectors, whereas if λ1 < 0, the iterates u(k) → (−1)^k u1 switch back and forth between the two real unit eigenvectors.
♦ 10.6.5.
(a) If A v = λ v then A−1 v = (1/λ) v, and so v is also an eigenvector of A−1.
(b) If λ1, . . . , λn are the eigenvalues of A, with | λ1 | > | λ2 | > · · · > | λn | > 0 (recalling that 0 cannot be an eigenvalue if A is nonsingular), then 1/λ1, . . . , 1/λn are the eigenvalues of A−1, and 1/| λn | > 1/| λn−1 | > · · · > 1/| λ1 |, and so 1/λn is the dominant eigenvalue of A−1. Thus, applying the power method to A−1 will produce the reciprocal of the smallest (in absolute value) eigenvalue of A and its corresponding eigenvector.
(c) The rate of convergence of the algorithm is the ratio | λn/λn−1 | of the moduli of the smallest two eigenvalues.
(d) Once we factor P A = L U, we can solve the iteration equation A u(k+1) = u(k) by rewriting it in the form L U u(k+1) = P u(k), and then using Forward and Back Substitution to solve for u(k+1). As we know, this is much faster than computing A−1.
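Part (d)'s strategy, solving A u(k+1) = u(k) at each step rather than forming A−1, can be sketched as follows; `np.linalg.solve` stands in for the factor-once LU solve, and the matrix is an assumed example:

```python
import numpy as np

def inverse_power(A, iters=100):
    """Inverse power method: solve A v = u, normalize; converges to the
    eigenvector of the smallest-magnitude eigenvalue of A."""
    u = np.zeros(len(A))
    u[0] = 1.0
    for _ in range(iters):
        v = np.linalg.solve(A, u)    # in practice: reuse an LU factorization
        u = v / np.linalg.norm(v)
    nu = np.linalg.norm(np.linalg.solve(A, u))   # dominant eigenvalue of A^{-1}
    return nu, u

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # assumed example: eigenvalues 3, 1
nu, u = inverse_power(A)
assert np.isclose(1 / nu, 1.0)           # smallest eigenvalue of A
```

Factoring A once and back-substituting per step costs O(n²) per iteration instead of the O(n³) of forming A−1.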
♠ 10.6.6.(a) After 15 iterations, we obtain λ = .99998, u = ( .70711,−.70710 )T ;
(b) after 24 iterations, we obtain λ = −1.99991, u = (−.55469,−.83206 )T ;
(c) after 12 iterations, we obtain λ = 1.00001, u = ( .40825, .81650, .40825 )T ;
(d) after 6 iterations, we obtain λ = .30277, u = ( .35355,−.46060, .81415 )T ;
(e) after 7 iterations, we obtain λ = −.88536, u = (−.88751,−.29939, .35027 )T ;
(f ) after 7 iterations, we obtain λ = .76393, u = ( .32348, .25561,−.91106 )T ;
(g) after 11 iterations, we obtain λ = .38197, u = ( .37175, .60150, .60150, .37175 )T ;
(h) after 16 iterations, we obtain λ = 2.00006, u = ( .500015,−.50000, .499985,−.50000 )T .
♦ 10.6.7.
(a) According to Exercises 8.2.19 and 8.2.24, if A has eigenvalues λ1, . . . , λn, then (A − µ I)−1 has eigenvalues νi = 1/(λi − µ). Thus, applying the power method to (A − µ I)−1 will produce its dominant eigenvalue ν⋆, for which | λ⋆ − µ | is the smallest. We then recover the eigenvalue λ⋆ = µ + 1/ν⋆ of A which is closest to µ.
(b) The rate of convergence is the ratio | (λ⋆ − µ)/(λ⋆⋆ − µ) | of the moduli of the smallest two eigenvalues of the shifted matrix.
(c) µ is an eigenvalue of A if and only if A − µ I is a singular matrix, and hence one cannot implement the method. Also, choosing µ too close to an eigenvalue will result in an ill-conditioned matrix, and so the algorithm may not converge properly.
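The shifted scheme of part (a) can be sketched directly; the matrix and shift below are assumed examples:

```python
import numpy as np

def shifted_inverse_power(A, mu, iters=100):
    """Power method applied to (A - mu I)^{-1}; recovers the eigenvalue
    of A closest to the shift mu via lambda = mu + 1/nu."""
    n = len(A)
    B = A - mu * np.eye(n)
    u = np.zeros(n)
    u[0] = 1.0
    for _ in range(iters):
        v = np.linalg.solve(B, u)
        u = v / np.linalg.norm(v)
    nu = u @ np.linalg.solve(B, u)   # sign-correct estimate of nu*
    return mu + 1 / nu

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # assumed example: eigenvalues 3, 1
lam = shifted_inverse_power(A, 2.5)      # shift closest to 3
assert np.isclose(lam, 3.0)
```

Choosing the shift closer to the other eigenvalue, e.g., µ = 0.9, makes the same routine return 1.0 instead, illustrating part (a).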
♠ 10.6.8.
(a) After 11 iterations, we obtain ν⋆ = 2.00002, so λ⋆ = 1.0000, u = ( .70711, −.70710 )T;
(b) after 27 iterations, we obtain ν⋆ = −.40003, so λ⋆ = −1.9998, u = ( .55468, .83207 )T;
(c) after 10 iterations, we obtain ν⋆ = 2.00000, so λ⋆ = 1.00000, u = ( .40825, .81650, .40825 )T;
(d) after 7 iterations, we obtain ν⋆ = −5.07037, so λ⋆ = .30278, u = ( −.35355, .46060, −.81415 )T;
(e) after 8 iterations, we obtain ν⋆ = .72183, so λ⋆ = −.88537, u = ( .88753, .29937, −.35024 )T;
(f) after 6 iterations, we obtain ν⋆ = 3.78885, so λ⋆ = .76393, u = ( .28832, .27970, −.91577 )T;
(g) after 9 iterations, we obtain ν⋆ = −8.47213, so λ⋆ = .38197, u = ( −.37175, −.60150, −.60150, −.37175 )T;
(h) after 14 iterations, we obtain ν⋆ = .66665, so λ⋆ = 2.00003, u = ( .50001, −.50000, .49999, −.50000 )T.
♠ 10.6.9.
(i) First, compute the dominant eigenvalue λ1 and eigenvector v1 using the power method. Then set B = A − λ1 v1 b^T, where b is any vector such that b · v1 = 1, e.g., b = v1/‖v1‖². According to Exercise 8.2.52, B has eigenvalues 0, λ2, …, λn, and corresponding eigenvectors v1 and w_j = v_j − c_j v1, where c_j = λ1 (b · v_j)/λ_j for j ≥ 2. Thus, applying the power method to B will produce the subdominant eigenvalue λ2 and the eigenvector w2 of the deflated matrix B, from which v2 can be reconstructed using the preceding formula.
(ii) In all cases, we use the normalized version (10.101) starting with u^(0) = e1; the answers are correct to 4 decimal places.
(a) Using the computed values λ1 = 2., v1 = ( −.55470, .83205 )^T, the deflated matrix is B = [ −1.61538 −1.07692; 3.92308 2.61538 ]; it takes only 3 iterations to produce λ2 = 1.00000, v2 = ( −.38075, .924678 )^T.
(b) Using λ1 = −3., v1 = ( .70711, .70711 )^T, the deflated matrix is B = [ −3.5 3.5; −1.5 1.5 ]; it takes 3 iterations to produce λ2 = −2.00000.
(c) Using λ1 = 4., v1 = ( .57737, −.57735, .57735 )^T, the deflated matrix is B = [ 1.66667 .333333 −1.33333; .333333 .666667 .333333; −1.33333 .333333 1.66667 ]; it takes 11 iterations to produce λ2 = 2.99999.
(d) Using λ1 = −3.30278, v1 = ( −.35355, −.81415, .46060 )^T, the deflated matrix is B = [ −1.5872 .95069 .46215; −2.04931 .18924 −1.23854; −2.53785 3.76146 4.70069 ]; it takes 10 iterations to produce λ2 = 3.00000.
(e) Using λ1 = 5.54913, v1 = ( −.39488, .71006, .58300 )^T, the deflated matrix is B = [ −1.86527 −.44409 −.722519; 2.55591 −.797798 2.70287; .277481 1.70287 −1.88606 ]; it takes 13 iterations to produce λ2 = −3.66377.
(f) Using λ1 = 5.23607, v1 = ( .53241, .53241, .65809 )^T, the deflated matrix is B = [ .5158 .5158 −.8346; −.4842 1.5158 −.8346; .1654 .1654 −.2677 ]; it takes 36 iterations to produce λ2 = 1.00003.
(g) Using λ1 = 3.61803, v1 = ( .37176, −.60151, .60150, −.37174 )^T, the deflated matrix is B = [ 1.5 −.19098 −.80902 .5000; −.19098 .69098 .30902 −.80902; −.80902 .30902 .69098 −.19098; .5000 −.80902 −.19098 1.5000 ]; it takes 18 iterations to produce λ2 = 2.61801.
(h) Using λ1 = 6., v1 = ( .5, .5, .5, .5 )^T, the deflated matrix is B = [ 2.5 −.5 −1.5 −.5; −.5 2.5 −.5 −1.5; −1.5 −.5 2.5 −.5; −.5 −1.5 −.5 2.5 ]; it takes 17 iterations to produce λ2 = 3.99998.
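The deflation recipe of part (i) can be sketched in a few lines. The matrix A below is an assumption: it is reconstructed to be consistent with the numbers of part (h) (it equals B + 6 v1 v1^T for the deflated matrix printed there) and is not quoted from the exercise set; with it, the power method produces λ1 = 6 and, after deflating with b = v1, the subdominant eigenvalue λ2 = 4.

```python
import numpy as np

def power_method(A, steps=300):
    """Normalized power iteration; returns (eigenvalue, unit eigenvector)."""
    u = np.arange(1.0, A.shape[0] + 1.0)    # generic starting vector
    u /= np.linalg.norm(u)
    lam = 0.0
    for _ in range(steps):
        w = A @ u
        lam = u @ w                         # Rayleigh-quotient estimate
        u = w / np.linalg.norm(w)
    return lam, u

# Assumed sample matrix, reconstructed from part (h); eigenvalues 6, 4, 4, 2.
A = np.array([[4., 1., 0., 1.],
              [1., 4., 1., 0.],
              [0., 1., 4., 1.],
              [1., 0., 1., 4.]])

lam1, v1 = power_method(A)                  # dominant pair: 6, (.5,.5,.5,.5)
b = v1 / np.linalg.norm(v1)**2              # b with b . v1 = 1 (here b = v1)
B = A - lam1 * np.outer(v1, b)              # deflated: eigenvalues 0, 4, 4, 2
lam2, w2 = power_method(B)                  # subdominant eigenvalue of A
```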
10.6.10. That A is a singular matrix and 0 is an eigenvalue. The corresponding eigenvectors are the nonzero elements of ker A. In fact, assuming u^(k) ≠ 0, the iterates u^(0), …, u^(k) form a Jordan chain for the zero eigenvalue. To find other eigenvalues and eigenvectors, you need to try a different initial vector u^(0).
10.6.11.
(a) Eigenvalues: 6.7016, .2984; eigenvectors: ( .3310, .9436 )^T, ( .9436, −.3310 )^T.
(b) Eigenvalues: 5.4142, 2.5858; eigenvectors: ( −.3827, .9239 )^T, ( .9239, .3827 )^T.
(c) Eigenvalues: 4.7577, 1.9009, −1.6586; eigenvectors: ( .2726, .7519, .6003 )^T, ( .9454, −.0937, −.3120 )^T, ( −.1784, .6526, −.7364 )^T.
(d) Eigenvalues: 7.0988, 2.7191, −4.8180; eigenvectors: ( .6205, .6328, −.4632 )^T, ( −.5439, −.0782, −.8355 )^T, ( .5649, −.7704, −.2956 )^T.
(e) Eigenvalues: 4.6180, 3.6180, 2.3820, 1.3820; eigenvectors: ( −.3717, .6015, −.6015, .3717 )^T, ( −.6015, .3717, .3717, −.6015 )^T, ( −.6015, −.3717, .3717, .6015 )^T, ( .3717, .6015, .6015, .3717 )^T.
(f) Eigenvalues: 8.6091, 6.3083, 4.1793, 1.9033; eigenvectors: ( −.3182, −.9310, −.1008, .1480 )^T, ( .8294, −.2419, −.4976, −.0773 )^T, ( .4126, −.1093, .6419, .6370 )^T, ( −.2015, .2507, −.5746, .7526 )^T.
10.6.12. The iterates converge to the diagonal matrix A_n → [ 6 0 0; 0 9 0; 0 0 3 ]. The eigenvalues appear along the diagonal, but not in decreasing order, because, when the eigenvalues are listed in decreasing order, the corresponding eigenvector matrix S = [ 0 1 1; 1 −1 1; −2 −1 1 ] (or, rather, its transpose) is not regular, and so Theorem 10.57 does not apply.
10.6.13.
(a) Eigenvalues: 2, 1; eigenvectors: ( −2, 3 )^T, ( −1, 1 )^T.
(b) Eigenvalues: 1.2087, 5.7913; eigenvectors: ( −.9669, .2550 )^T, ( −.6205, −.7842 )^T.
(c) Eigenvalues: 3.5842, −2.2899, 1.7057; eigenvectors: ( −.4466, −.7076, .5476 )^T, ( .1953, −.8380, −.5094 )^T, ( .7491, −.2204, .6247 )^T.
(d) Eigenvalues: 7.7474, −.2995, −3.4479; eigenvectors: ( −.4697, −.3806, −.7966 )^T, ( −.7799, .2433, .5767 )^T, ( .6487, −.7413, .1724 )^T.
(e) Eigenvalues: 18.3344, 4.2737, 0, −1.6081; eigenvectors: ( .4136, .8289, .2588, .2734 )^T, ( −.4183, .9016, −.0957, .0545 )^T, ( −.5774, −.5774, .5774, 0 )^T, ( −.2057, .4632, −.6168, .6022 )^T.
10.6.14. Yes. After 10 iterations, one finds
R10 = [ 2.0011 1.4154 4.8983; 0 .9999 −.0004; 0 0 .9995 ],
S10 = [ −.5773 .4084 .7071; −.5774 .4082 −.7071; −.5774 −.8165 .0002 ],
so the diagonal entries of R10 give the eigenvalues correct to 3 decimal places, and the columns of S10 are similar approximations to the orthonormal eigenvector basis.
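The iteration used here can be reproduced in a few lines of code. The matrix A below is a sample chosen for illustration (an assumption; the matrix of the exercise is not reproduced in the solution): each step factors A_k = Q_k R_k, sets A_{k+1} = R_k Q_k, and accumulates S_k = Q_1 Q_2 ⋯ Q_k.

```python
import numpy as np

def qr_algorithm(A, steps=60):
    """Unshifted QR iteration: A_{k+1} = R_k Q_k where A_k = Q_k R_k."""
    Ak = np.array(A, dtype=float)
    S = np.eye(Ak.shape[0])
    for _ in range(steps):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q            # orthogonally similar to A: spectrum preserved
        S = S @ Q             # accumulated orthogonal factor
    return Ak, S

# Sample symmetric matrix with distinct eigenvalues (an assumption).
A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])
Ak, S = qr_algorithm(A)
```

For a symmetric matrix the iterates converge to a diagonal matrix whose entries are the eigenvalues, while the columns of S converge to an orthonormal eigenvector basis; at every step S^T A S = A_k holds exactly, up to rounding.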
10.6.15. It has eigenvalues ±1, which have the same magnitude. The QR factorization is trivial, with Q = A and R = I. Thus, RQ = A, and so nothing happens.
♦ 10.6.16. This follows directly from Exercise 8.2.23.
♦ 10.6.17.
(a) By induction, if A_k = Q_k R_k = R_k^T Q_k^T = A_k^T, then, since Q_k is orthogonal,
A_{k+1}^T = (R_k Q_k)^T = Q_k^T R_k^T = Q_k^T R_k^T Q_k^T Q_k = Q_k^T Q_k R_k Q_k = R_k Q_k = A_{k+1},
proving symmetry of A_{k+1}. Again proceeding by induction, if A_k = Q_k R_k is tridiagonal, then its jth column is a linear combination of the standard basis vectors e_{j−1}, e_j, e_{j+1}. By the Gram–Schmidt formulas (5.19), the jth column of Q_k is a linear combination of the first j columns of A_k, and hence is a linear combination of e_1, …, e_{j+1}. Thus, all entries below the sub-diagonal of Q_k are zero. Since R_k is upper triangular, this implies all entries of A_{k+1} = R_k Q_k lying below the sub-diagonal are also zero. But we already proved that A_{k+1} is symmetric, and hence all entries lying above the super-diagonal are also 0, which implies A_{k+1} is tridiagonal.
(b) The result does not hold if A is only tridiagonal and not symmetric. For example, when
A = [ 1 −1 0; 1 1 1; 0 1 1 ],
then
Q = [ 1/√2 −1/√3 1/√6; 1/√2 1/√3 −1/√6; 0 1/√3 √2/√3 ],
R = [ √2 0 1/√2; 0 √3 2/√3; 0 0 1/√6 ],
and
A1 = RQ = [ 1 −1/√6 2/√3; √3/√2 5/3 1/(3√2); 0 1/(3√2) 1/3 ],
which is not tridiagonal.
10.6.18.
(a) H = [ 1 0 0; 0 −.9615 .2747; 0 .2747 .9615 ],
T = H A H = [ 8.0000 7.2801 0; 7.2801 20.0189 3.5660; 0 3.5660 4.9811 ].
(b) H1 = [ 1 0 0 0; 0 −.4082 .8165 −.4082; 0 .8165 .5266 .2367; 0 −.4082 .2367 .8816 ],
T1 = H1 A H1 = [ 5.0000 −2.4495 0 0; −2.4495 3.8333 1.3865 .9397; 0 1.3865 6.2801 −.9566; 0 .9397 −.9566 6.8865 ],
H2 = [ 1 0 0 0; 0 1 0 0; 0 0 −.8278 −.5610; 0 0 −.5610 .8278 ],
T = H2 T1 H2 = [ 5.0000 −2.4495 0 0; −2.4495 3.8333 −1.6750 0; 0 −1.6750 5.5825 .0728; 0 0 .0728 7.5842 ].
(c) H1 = [ 1 0 0 0; 0 0 .7071 −.7071; 0 .7071 .5000 .5000; 0 −.7071 .5000 .5000 ],
T1 = H1 A H1 = [ 4.0000 −1.4142 0 0; −1.4142 2.5000 .1464 −.8536; 0 .1464 1.0429 .7500; 0 −.8536 .7500 2.4571 ],
H2 = [ 1 0 0 0; 0 1 0 0; 0 0 −.1691 .9856; 0 0 .9856 .1691 ],
T = H2 T1 H2 = [ 4.0000 −1.4142 0 0; −1.4142 2.5000 −.8660 0; 0 −.8660 2.1667 .9428; 0 0 .9428 1.3333 ].
♠ 10.6.19.
(a) Eigenvalues: 24, 6, 3; (b) eigenvalues: 7.6180, 7.5414, 5.3820, 1.4586; (c) eigenvalues: 4.9354, 3.0000, 1.5374, .5272.
♠ 10.6.20. The singular values are the square roots of the non-zero eigenvalues of
K = A^T A = [ 5 2 2 −1; 2 9 0 −6; 2 0 5 3; −1 −6 3 6 ].
Applying the tridiagonalization algorithm, we find
H1 = [ 1 0 0 0; 0 −.6667 −.6667 .3333; 0 −.6667 .7333 .1333; 0 .3333 .1333 .9333 ],
A1 = [ 5.0000 −3.0000 0 0; −3.0000 8.2222 4.1556 .7556; 0 4.1556 8.4489 4.8089; 0 .7556 4.8089 3.3289 ],
H2 = [ 1 0 0 0; 0 1 0 0; 0 0 −.9839 −.1789; 0 0 −.1789 .9839 ],
A2 = [ 5.0000 −3.0000 0 0; −3.0000 8.2222 −4.2237 0; 0 −4.2237 9.9778 −3.6000; 0 0 −3.6000 1.8000 ].
Applying the QR algorithm to the final tridiagonal matrix, the eigenvalues are found to be 14.4131, 7.66204, 2.92482, 0, and so the singular values of the original matrix are 3.79646, 2.76804, 1.71021.
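The final numbers can be confirmed directly from the Gram matrix K copied from above, since the singular values are the square roots of its nonzero eigenvalues; this check is a sketch using a standard symmetric eigensolver rather than the tridiagonalization-plus-QR route followed in the solution.

```python
import numpy as np

# Gram matrix K = A^T A from Exercise 10.6.20 (copied from the solution).
K = np.array([[ 5.,  2.,  2., -1.],
              [ 2.,  9.,  0., -6.],
              [ 2.,  0.,  5.,  3.],
              [-1., -6.,  3.,  6.]])

eigs = np.linalg.eigvalsh(K)                   # ascending eigenvalues; one is 0
singular_values = np.sqrt(eigs[eigs > 1e-8])   # sqrt of the nonzero eigenvalues
```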
10.6.21.
(a) H1 = [ 1 0 0; 0 −.4472 −.8944; 0 −.8944 .4472 ],
A1 = [ 3.0000 −1.3416 1.7889; −2.2361 −2.2000 1.6000; 0 −1.4000 4.2000 ];
(b) H1 = [ 1 0 0 0; 0 −.8944 0 −.4472; 0 0 1 0; 0 −.4472 0 .8944 ],
A1 = [ 3.0000 −2.2361 −1.0000 0; −2.2361 3.8000 2.2361 .4000; 0 1.7889 2.0000 −5.8138; 0 1.4000 −4.4721 1.2000 ],
H2 = [ 1 0 0 0; 0 1 0 0; 0 0 −.7875 −.6163; 0 0 −.6163 .7875 ],
A2 = [ 3.0000 −2.2361 .7875 .6163; −2.2361 3.8000 −2.0074 −1.0631; 0 −2.2716 −3.2961 2.2950; 0 0 .9534 6.4961 ];
(c) H1 = [ 1 0 0 0; 0 −.5345 .2673 −.8018; 0 .2673 .9535 .1396; 0 −.8018 .1396 .5811 ],
A1 = [ 1.0000 −1.0690 −.8138 .4414; −3.7417 1.0714 −1.2091 −1.4507; 0 −2.2316 1.7713 1.9190; 0 −2.1248 .0482 3.1573 ],
H2 = [ 1 0 0 0; 0 1 0 0; 0 0 −.7242 −.6897; 0 0 −.6896 .7242 ],
A2 = [ 1.0000 −1.0690 .2850 .8809; −3.7417 1.0714 1.876 −.2168; 0 3.0814 3.4127 −1.6758; 0 0 .1950 1.5159 ].
♠ 10.6.22.
(a) Eigenvalues: 4.51056, 2.74823, −2.25879; (b) eigenvalues: 7., 5.74606, −4.03877, 1.29271; (c) eigenvalues: 4.96894, 2.31549, −1.70869, 1.42426.
10.6.23. First, by Lemma 5.28, H1 x1 = y1. Furthermore, since the first entry of u1 is zero, u1^T e1 = 0, and so H1 e1 = (I − 2 u1 u1^T) e1 = e1. Thus, the first column of H1 A is
H1(a11 e1 + x1) = a11 e1 + y1 = ( a11, ±r, 0, …, 0 )^T.
Finally, again since the first entry of u1 is zero, the first column of H1 is e1, and so multiplying H1 A on the right by H1 doesn't affect its first column. We conclude that the first column of the symmetric matrix H1 A H1 has the form given in (10.109); symmetry implies that its first row is just the transpose of its first column, which completes the proof.
10.6.24. Since T = H^{−1}AH, where H = H1 H2 ⋯ Hn is the product of the Householder reflections, Av = λv if and only if Tw = λw, where w = H^{−1}v is the corresponding eigenvector of the tridiagonalized matrix. Thus, to recover the eigenvectors of A we need to multiply v = Hw = H1 H2 ⋯ Hn w.
♦ 10.6.25.
(a) Starting with a symmetric matrix A = A1, for each j = 1, …, n − 1, the tridiagonalization algorithm produces a symmetric matrix A_{j+1} from A_j as follows. We first extract x_j, which requires no arithmetic operations, and then determine v_j = x_j ± ‖x_j‖ e_{j+1}, which, since their first j entries are 0, requires n − j multiplications and additions and a square root. To compute A_{j+1}, we only need to work on the lower right (n − j) × (n − j) block of A_j, since its first j − 1 rows and columns are not affected, while the jth row and column entries of A_{j+1} are predetermined. Setting u_j = v_j/‖v_j‖,
A_{j+1} = H_j A_j H_j = (I − 2 u_j u_j^T) A_j (I − 2 u_j u_j^T) = A_j − 2( v_j z_j^T + z_j v_j^T )/‖v_j‖² + 4 (v_j^T z_j) v_j v_j^T/‖v_j‖⁴,
where z_j = A_j v_j. Thus, the updated entries â_ik of A_{j+1} are given by
â_ik = a_ik − α_j (v_i z_k + v_k z_i) + α_j² β_j v_i v_k, where α_j = 2/‖v_j‖², β_j = z_j^T v_j = z_j · v_j,
for i, k = j + 1, …, n, where a_ik are the entries of A_j. To compute z_j requires (n − j)² multiplications and (n − j)(n − j − 1) additions; to compute the different products v_i v_k and v_i z_k requires, respectively, (n − j)² and (1/2)(n − j)(n − j + 1) multiplications. Using these, to compute α_j and β_j requires 2(n − j − 1) additions and 1 division; finally, to compute the updated entries on and above the diagonal (the ones below the diagonal following from symmetry) requires 2(n − j)(n − j + 1) multiplications and (3/2)(n − j)(n − j + 1) additions. The total is (3/2)n³ − (1/2)n² − 1 ≈ (3/2)n³ multiplications, (5/6)n³ + (1/2)n² − (10/3)n + 2 ≈ (5/6)n³ additions, and n − 1 square roots to tridiagonalize A.
(b) First, to factor a tridiagonal A = QR using the pseudocode program on page 242, we note that at the beginning of the jth step, for j < n, the last n − j − 1 entries of the jth column and the last n − j − 2 entries of the (j + 1)st column are zero, while columns j + 2, …, n are still in tridiagonal form. Thus, to compute r_jj requires j + 1 multiplications, j additions and a square root; to compute the nonzero a_ij requires j + 1 multiplications. We only need compute r_jk for k = j + 1, which requires j + 1 multiplications and j additions, and, if j < n − 1, for k = j + 2, which requires just 1 multiplication. We update the entries in column j + 1, which requires j + 1 multiplications and j + 1 additions, and, when j < n − 1, column j + 2, which requires j + 1 multiplications and 1 addition. The final column only requires 2n multiplications, n − 1 additions and a square root to normalize. The totals are (5/2)n² + (9/2)n − 7 ≈ (5/2)n² multiplications, (3/2)n² + (3/2)n − 4 ≈ (3/2)n² additions, and n square roots.
(c) Much faster: once the matrix is tridiagonalized, each iteration requires (5/2)n² versus n³ multiplications and (3/2)n² versus n³ additions, as found in Exercise 5.3.31. Moreover, by part (a), the initial tridiagonalization only requires the effort of about 1½ QR steps.
10.6.26.
Tridiagonalization

start
set R = A
for j = 1 to n − 2
    for i = 1 to j set x_i = 0 next i
    for i = j + 1 to n set x_i = r_ij next i
    set v = x + (sign x_{j+1}) ‖x‖ e_{j+1}
    if v ≠ 0, set u_j = v/‖v‖, H_j = I − 2 u_j u_j^T, R = H_j R H_j
        else set u_j = 0, H_j = I endif
next j
end

The program also works as written for reducing a non-symmetric matrix A to upper Hessenberg form.
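A direct Python transcription of the pseudocode (a sketch: it forms each H_j as a full matrix for clarity rather than efficiency) can be tested on the Gram matrix K of Exercise 10.6.20:

```python
import numpy as np

def tridiagonalize(A):
    """Householder tridiagonalization of a symmetric matrix,
    following the pseudocode above."""
    R = np.array(A, dtype=float)
    n = R.shape[0]
    for j in range(n - 2):                         # j = 1, ..., n-2
        x = R[:, j].copy()
        x[:j + 1] = 0.0                            # x_i = 0 for i = 1, ..., j
        sign = 1.0 if x[j + 1] >= 0 else -1.0
        v = x
        v[j + 1] += sign * np.linalg.norm(x)       # v = x + (sign x_{j+1}) ||x|| e_{j+1}
        nv = np.linalg.norm(v)
        if nv > 0:
            u = v / nv
            H = np.eye(n) - 2.0 * np.outer(u, u)   # H_j = I - 2 u_j u_j^T
            R = H @ R @ H
    return R

# Test matrix: the Gram matrix from Exercise 10.6.20.
K = np.array([[ 5.,  2.,  2., -1.],
              [ 2.,  9.,  0., -6.],
              [ 2.,  0.,  5.,  3.],
              [-1., -6.,  3.,  6.]])
T = tridiagonalize(K)
```

The result T is tridiagonal (up to rounding) and has the same spectrum as K, since each H_j is an orthogonal similarity transformation.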
Solutions — Chapter 11
11.1.1. The greatest displacement is at x = d/α + 1/2, with u = α/8 + d/2 + d²/(2α), when d < α/2, and at x = 1, with u = d, when d ≥ α/2. The greatest stress and greatest strain are at x = 0, with v(0) = w(0) = α/2 + d.
11.1.2. u(x) = { (1/4)x − (1/2)x², 0 ≤ x ≤ 1/2; 1/4 − (3/4)x + (1/2)x², 1/2 ≤ x ≤ 1 },
v(x) = { 1/4 − x, 0 ≤ x ≤ 1/2; x − 3/4, 1/2 ≤ x ≤ 1 }.
11.1.3.
(a) u(x) = b + { −(1/2)x², 0 ≤ x ≤ (1/2)ℓ; (1/4)ℓ² − ℓx + (1/2)x², (1/2)ℓ ≤ x ≤ ℓ }, for any b.
(b) v(x) = { −x, 0 ≤ x ≤ (1/2)ℓ; x − ℓ, (1/2)ℓ ≤ x ≤ ℓ }, which is the same for all equilibria.
(c) [plots of u(x) and v(x) omitted]
11.1.4. u(x) = log(x + 1)/log 2 − x, v(x) = u′(x) = 1/(log 2 (x + 1)) − 1, w(x) = (1 + x)v(x) = 1/log 2 − 1 − x.
Maximum displacement where u′(x) = 0, so x = 1/log 2 − 1 ≈ .4427. The bar will break at the point of maximum strain, which is at x = 0.
11.1.5. u(x) = 2 log(x + 1) − x, v(x) = u′(x) = (1 − x)/(1 + x), w(x) = (1 + x)v(x) = 1 − x.
Maximum displacement at x = 1. Maximum strain at x = 0.
11.1.6. There is no equilibrium solution.
11.1.7. u(x) = (1/2)x³ − (3/2)x² + x, v(x) = (3/2)x² − 3x + 1.
The two points x = 1 − (1/3)√3 ≈ .42265 and x = 1 + (1/3)√3 ≈ 1.57735 have maximal (absolute) displacement, while the maximal stress is at x = 0 and x = 2, so either end is most likely to break.
11.1.8. u(x) = { (7/8)x − (1/2)x², 0 ≤ x ≤ 1; 1/4 + (3/8)x − (1/4)x², 1 ≤ x ≤ 2 },
v(x) = { 7/8 − x, 0 ≤ x ≤ 1; 3/8 − (1/2)x, 1 ≤ x ≤ 2 }.
The greatest strain is at x = 0, where v(0) = 7/8, and the greatest stress is at x = 2, where w(2) = 2v(2) = −5/4.
11.1.9. The bar will stretch farther if the stiffer half is on top. Both boundary value problems have the form
−d/dx( c(x) du/dx ) = 1, 0 < x < 2, u(0) = 0, u′(2) = 0.
When the stiffer bar is on the bottom, c(x) = { 1, 0 ≤ x ≤ 1; 2, 1 ≤ x ≤ 2 }, and the solution is
u(x) = { (3/2)x − (1/2)x², 0 ≤ x ≤ 1; 1/4 + x − (1/4)x², 1 ≤ x ≤ 2 }, with u(2) = 5/4.
When the stiffer bar is on top, c(x) = { 2, 0 ≤ x ≤ 1; 1, 1 ≤ x ≤ 2 }, and
u(x) = { (3/2)x − (1/4)x², 0 ≤ x ≤ 1; −1/4 + 2x − (1/2)x², 1 ≤ x ≤ 2 }, with u(2) = 7/4.
11.1.10. −((1 − x)u′)′ = 1, u(0) = 0, w(1) = lim_{x→1} (1 − x)u′(x) = 0.
The solution is u = x. Note that we still need the limiting strain at x = 1 to be zero, w(1) = 0, which requires u′(1) < ∞. Indeed, the general solution to the differential equation is u(x) = a + b log(1 − x) + x; the first boundary condition implies a = 0, while u′(1) < ∞ is required to eliminate the logarithmic term.
♥ 11.1.11. The boundary value problem is
−u″ = f(x), u(0) = u(2π), u′(0) = u′(2π).
Integrating the differential equation, we find
u(x) = ax + b − ∫_0^x ( ∫_0^y f(z) dz ) dy.
The first boundary condition implies that a = (1/2π) ∫_0^{2π} ( ∫_0^y f(z) dz ) dy. The second boundary condition requires
⟨ f, 1 ⟩ = ∫_0^{2π} f(z) dz = 0,   (∗)
which is required for the forcing function to maintain equilibrium. The condition (∗) is precisely the Fredholm alternative in this case, since any constant function solves the homogeneous boundary value problem. For example, if f(x) = sin x, then the displacement is u(x) = b + sin x, where b is an arbitrary constant, and the stress is u′(x) = cos x.
11.1.12. Displacement u and position x: meters. Strain v = u′: no units. Stiffness c(x): N/m = kg/sec². Stress w = c v: N/m = kg/sec². External force f(x): N/m² = kg/(m sec²).
11.2.1. (a) 1, (b) 0, (c) e, (d) log 2, (e) 1/9, (f) 0.
11.2.2.
(a) ϕ(x) = δ(x); ∫_a^b ϕ(x)u(x) dx = u(0) for a < 0 < b.
(b) ϕ(x) = δ(x − 1); ∫_a^b ϕ(x)u(x) dx = u(1) for a < 1 < b.
(c) ϕ(x) = 3 δ(x − 1) + 3 δ(x + 1); ∫_a^b ϕ(x)u(x) dx = 3u(1) + 3u(−1) for a < −1 < 1 < b.
(d) ϕ(x) = (1/2) δ(x − 1); ∫_a^b ϕ(x)u(x) dx = (1/2)u(1) for a < 1 < b.
(e) ϕ(x) = δ(x) − δ(x − π) − δ(x + π); ∫_a^b ϕ(x)u(x) dx = u(0) − u(π) − u(−π) for a < −π < π < b.
(f) ϕ(x) = (1/2) δ(x − 1) + (1/5) δ(x − 2); ∫_a^b ϕ(x)u(x) dx = (1/2)u(1) + (1/5)u(2) for a < 1 < 2 < b.
11.2.3.
(a) x δ(x) = lim_{n→∞} nx/(π(1 + n²x²)) = 0 for all x, including x = 0. Moreover, the functions are all bounded in absolute value by 1/2, and so the limit, although non-uniform, is to an ordinary function.
(b) ⟨ u(x), x δ(x) ⟩ = ∫_a^b u(x) x δ(x) dx = u(0) · 0 = 0 for all continuous functions u(x), and so x δ(x) has the same dual effect as the zero function: ⟨ u(x), 0 ⟩ = 0 for all u.
11.2.4.
(a) ϕ(x) = lim_{n→∞} [ n/(π(1 + n²x²)) − 3n/(π(1 + n²(x − 1)²)) ];
(b) ∫_a^b ϕ(x)u(x) dx = u(0) − 3u(1), for any interval with a < 0 < 1 < b.
♦ 11.2.5.
(a) Using the limiting sequence (11.31),
δ(2x) = lim_{n→∞} g_n(2x) = lim_{n→∞} n/(π(1 + n²(2x)²)) = lim_{n→∞} n/(π(1 + (2n)²x²)) = lim_{m→∞} (1/2)m/(π(1 + m²x²)) = (1/2) lim_{m→∞} g_m(x) = (1/2) δ(x),
where we set m = 2n in the middle step.
(b) Use the change of variables x̂ = 2x in the integral:
∫_{−a}^a δ(2x) f(x) dx = (1/2) ∫_{−2a}^{2a} δ(x̂) f(x̂/2) dx̂ = (1/2) f(0) = ∫_{−a}^a (1/2) δ(x) f(x) dx.
(c) δ(ax) = (1/|a|) δ(x).
11.2.6.
(a) f′(x) = −δ(x + 1) − 9 δ(x − 3) + { 2x, 0 < x < 3; 1, −1 < x < 0; 0, otherwise }.
(b) g′(x) = δ(x + (1/2)π) − δ(x − (1/2)π) + { −cos x, −(1/2)π < x < 0; cos x, 0 < x < (1/2)π; 0, otherwise }.
(c) h′(x) = −e^{−1} δ(x + 1) + { π cos πx, x > 1; −2x, −1 < x < 1; e^x, x < −1 }.
(d) k′(x) = (1 + π²) δ(x) + { cos x, x < −π; 2x, −π < x < 0; −e^{−x}, x > 0 }.
11.2.7.
(a) f′(x) = { 1, −1 < x < 0; −1, 0 < x < 1; 0, otherwise } = σ(x + 1) − 2σ(x) + σ(x − 1),
f″(x) = δ(x + 1) − 2 δ(x) + δ(x − 1).
(b) k′(x) = 2 δ(x + 2) − 2 δ(x − 2) + { −1, −2 < x < 0; 1, 0 < x < 2; 0, otherwise } = 2 δ(x + 2) − 2 δ(x − 2) − σ(x + 2) + 2σ(x) − σ(x − 2),
k″(x) = 2 δ′(x + 2) − 2 δ′(x − 2) − δ(x + 2) + 2 δ(x) − δ(x − 2).
(c) s′(x) = { −π sin πx, −1 < x < 1; 0, otherwise }, s″(x) = { −π² cos πx, −1 < x < 1; 0, otherwise }.
11.2.8.
(a) f′(x) = −sign x · e^{−|x|}, f″(x) = e^{−|x|} − 2 δ(x).
(b) f′(x) = { −1, x < 0; 3, 0 < x < 1; 1, x > 1 } = −1 + 4σ(x) − 2σ(x − 1), f″(x) = 4 δ(x) − 2 δ(x − 1).
(c) f′(x) = { 2x + 1, x > 0 or x < −1; −2x − 1, −1 < x < 0 },
f″(x) = 2 δ(x + 1) + 2 δ(x) + { 2, x > 0 or x < −1; −2, −1 < x < 0 }.
(d) f′(x) = 4 δ(x + 2) − 4 δ(x − 2) + { 1, |x| > 2; −1, |x| < 2 } = 4 δ(x + 2) − 4 δ(x − 2) + 1 − 2σ(x + 2) + 2σ(x − 2),
f″(x) = 4 δ′(x + 2) − 4 δ′(x − 2) − 2 δ(x + 2) + 2 δ(x − 2).
(e) f′(x) = sign x · cos x, f″(x) = 2 δ(x) − sin |x|.
(f) f′(x) = sign(sin x) cos x, f″(x) = −|sin x| + 2 Σ_{n=−∞}^{∞} δ(x − nπ).
(g) f′(x) = 2 Σ_{k=−∞}^{∞} δ(x − 2kπ) − 2 Σ_{k=−∞}^{∞} δ(x − (2k + 1)π),
f″(x) = 2 Σ_{k=−∞}^{∞} δ′(x − 2kπ) − 2 Σ_{k=−∞}^{∞} δ′(x − (2k + 1)π).
11.2.9.
(a) It suffices to note that, when λ > 0, the product λx has the same sign as x, and so σ(λx) = { 1, x > 0; 0, x < 0 } = σ(x).
(b) If λ < 0, then σ(λx) = { 1, x < 0; 0, x > 0 } = 1 − σ(x).
(c) Use the chain rule. If λ > 0, then δ(x) = σ′(x) = λ σ′(λx) = λ δ(λx), while if λ < 0, then δ(x) = σ′(x) = −λ σ′(λx) = −λ δ(λx).
♦ 11.2.10. lim_{n→∞} (n/√π) e^{−n²x²} = { 0, x ≠ 0; ∞, x = 0 }, while ∫_{−∞}^{∞} (n/√π) e^{−n²x²} dx = ∫_{−∞}^{∞} (1/√π) e^{−y²} dy = 1.
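Both defining properties of this limiting sequence (concentration at the origin while the total integral stays equal to 1) are easy to check numerically on a grid; this is a minimal sketch:

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 400_001)
dx = x[1] - x[0]

def g(n):
    """The Gaussian sequence g_n(x) = (n / sqrt(pi)) exp(-n^2 x^2)."""
    return n / np.sqrt(np.pi) * np.exp(-(n * x) ** 2)

ns = (1, 5, 25)
integrals = [float(np.sum(g(n)) * dx) for n in ns]   # each is ~1
peaks = [float(g(n).max()) for n in ns]              # peak n/sqrt(pi) grows with n
```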
♦ 11.2.11.
(a) First, by the definition of m_n, we have
∫_0^ℓ g̃_n(x) dx = (1/m_n) ∫_0^ℓ g_n(x − y) dx = 1.
Second, note that m_n = ∫_0^ℓ g_n(x − y) dx = [ tan^{−1} n(ℓ − y) − tan^{−1}(−ny) ]/π, and hence lim_{n→∞} m_n = 1. Therefore, using (11.32),
lim_{n→∞} g̃_n(x) = lim_{n→∞} g_n(x − y)/m_n = 0 whenever x ≠ y.
(b) It suffices to note that lim_{n→∞} ∫_0^ℓ g_n(x − y) dx = lim_{n→∞} m_n = 1, as shown in part (a).
♥ 11.2.12.
(a) The graph is a box of height (1/2)n over the interval [−1/n, 1/n]. [sketch omitted]
(b) First, lim_{n→∞} g_n(x) = 0 for any x ≠ 0, since g_n(x) = 0 whenever n > 1/|x|. Moreover, ∫_{−∞}^{∞} g_n(x) dx = 1, and hence the sequence satisfies (11.32–33), proving lim_{n→∞} g_n(x) = δ(x).
(c) f_n(x) = ∫_{−∞}^x g_n(y) dy = { 0, x < −1/n; (1/2)nx + 1/2, |x| < 1/n; 1, x > 1/n }. [sketch omitted]
Yes, since 1/n → 0 as n → ∞, the limiting function is lim_{n→∞} f_n(x) = σ(x) = { 0, x < 0; 1/2, x = 0; 1, x > 0 }.
(d) h_n(x) = (1/2)n δ(x + 1/n) − (1/2)n δ(x − 1/n).
(e) Yes. To compute the limit, we use the dual interpretation of the delta function. Given any C¹ function u(x),
⟨ h_n, u ⟩ = ∫_{−∞}^{∞} h_n(x)u(x) dx = ∫_{−∞}^{∞} [ (1/2)n δ(x + 1/n) − (1/2)n δ(x − 1/n) ] u(x) dx = [ u(−1/n) − u(1/n) ]/(2/n).
The n → ∞ limit is −u′(0), since the final expression is minus the standard centered difference formula u′(x) = lim_{h→0} [ u(x + h) − u(x − h) ]/(2h), with h = 1/n, at x = 0. (Alternatively, you can use l'Hôpital's rule to prove this.) Thus, lim_{n→∞} g′_n(x) = δ′(x).
♥ 11.2.13.
(a) The graph is a triangle of height n over the interval [−1/n, 1/n]. [sketch omitted]
(b) First, lim_{n→∞} g_n(x) = 0 for any x ≠ 0, since g_n(x) = 0 whenever n > 1/|x|. Moreover, ∫_{−∞}^{∞} g_n(x) dx = 1, since its graph is a triangle of base 2/n and height n. We conclude that the sequence satisfies (11.32–33), proving lim_{n→∞} g_n(x) = δ(x).
(c) f_n(x) = { 0, x < −1/n; 1/2 + nx + (1/2)n²x², −1/n < x < 0; 1/2 + nx − (1/2)n²x², 0 < x < 1/n; 1, x > 1/n }. [sketch omitted]
(d) Yes, since 1/n → 0 as n → ∞, the limiting function is lim_{n→∞} f_n(x) = σ(x) = { 0, x < 0; 1/2, x = 0; 1, x > 0 }.
(e) h_n(x) = { 0, |x| > 1/n; n², −1/n < x < 0; −n², 0 < x < 1/n }.
(f) Yes, using the same integration by parts argument as in (11.54).
11.2.14. By duality, for any f ∈ C¹,
lim_{n→∞} ⟨ n[ δ(x − 1/n) − δ(x + 1/n) ], f ⟩ = lim_{n→∞} n[ f(1/n) − f(−1/n) ] = 2 f′(0) = −2 ⟨ δ′, f ⟩,
where we used l'Hôpital's rule to evaluate the limit.
11.2.15. When y < a,
s(x) = ∫_a^x δ_y(t) dt = σ_y(x) − 1 = { 0, x > y; −1, x < y },
r(x) = ∫_a^x σ_y(z) dz = ρ_y(x) + y − a = { y − a, x < y; x − a, x > y }.
[sketches of s(x) and r(x) omitted]
11.2.16. ⟨ δ_y, u ⟩ = ∫_0^ℓ δ_y(x)u(x) dx = u(y), while ⟨ σ_y, u′ ⟩ = ∫_y^ℓ u′(x) dx = u(ℓ) − u(y) = −u(y), provided u(ℓ) = 0.
11.2.17. Use induction on the order k. Let 0 < y < ℓ. Integrating by parts,
⟨ δ_y^{(k+1)}, u ⟩ = ∫_0^ℓ δ_y^{(k+1)}(x)u(x) dx = δ_y^{(k)}(x)u(x) |_{x=0}^{ℓ} − ∫_0^ℓ δ_y^{(k)}(x)u′(x) dx = −∫_0^ℓ δ_y^{(k)}(x)u′(x) dx = −⟨ δ_y^{(k)}, u′ ⟩ = (−1)^{k+1} u^{(k+1)}(y),
by the induction hypothesis. The boundary terms at x = 0 and x = ℓ vanish since δ_y^{(k)}(x) = 0 for any x ≠ y.
11.2.18. x δ′(x) = −δ(x) because they both yield the same value on a test function:
⟨ x δ′, u ⟩ = ∫_{−∞}^{∞} x δ′(x)u(x) dx = −[ x u(x) ]′ |_{x=0} = −u(0) = −∫_{−∞}^{∞} δ(x)u(x) dx = ⟨ −δ, u ⟩.
See Exercise 11.2.20 for the general version.
11.2.19. The correct formulae are
(f(x) δ(x))′ = f′(x) δ(x) + f(x) δ′(x) = f(0) δ′(x).
Indeed, integrating by parts,
⟨ (f δ)′, u ⟩ = ∫ (f(x) δ(x))′ u(x) dx = −∫ f(x) δ(x) u′(x) dx = −f(0) u′(0),
⟨ f δ′, u ⟩ = ∫ f(x) δ′(x) u(x) dx = −(f(x)u(x))′ |_{x=0} = −f(0) u′(0) − f′(0) u(0),
⟨ f′ δ, u ⟩ = ∫ f′(x) δ(x) u(x) dx = f′(0) u(0).
Adding the last two produces the first. On the other hand,
⟨ f(0) δ′, u ⟩ = ∫ f(0) δ′(x) u(x) dx = −(f(0)u(x))′ |_{x=0} = −f(0) u′(0)
also gives the same result, proving the second formula. (See also the following exercise.)
♦ 11.2.20.
(a) For any test function,
⟨ f δ′, u ⟩ = ∫_{−∞}^{∞} u(x) f(x) δ′(x) dx = −(u(x) f(x))′ |_{x=0} = −u′(0) f(0) − u(0) f′(0) = ∫_{−∞}^{∞} [ u(x) f(0) δ′(x) − u(x) f′(0) δ(x) ] dx = ⟨ f(0) δ′ − f′(0) δ, u ⟩.
(b) f(x) δ^{(n)}(x) = Σ_{i=0}^{n} (−1)^i (n choose i) f^{(i)}(0) δ^{(n−i)}(x).
11.2.21.
(a) ϕ(x) = −2 δ′(x) − δ(x); ∫_{−∞}^{∞} ϕ(x)u(x) dx = 2u′(0) − u(0);
(b) ψ(x) = δ′(x); ∫_{−∞}^{∞} ψ(x)u(x) dx = −u′(0);
(c) χ(x) = δ(x − 1) − 4 δ′(x − 2) + 4 δ(x − 2); ∫_{−∞}^{∞} χ(x)u(x) dx = u(1) + 4u′(2) + 4u(2);
(d) γ(x) = e^{−1} δ″(x + 1) − 2e^{−1} δ′(x + 1) + e^{−1} δ(x + 1); ∫_{−∞}^{∞} γ(x)u(x) dx = [ u″(−1) + 2u′(−1) + u(−1) ]/e.
♦ 11.2.22. If f(x₀) > 0, then, by continuity, f(x) > 0 in some interval |x − x₀| < ε. But then the integral of f over this interval is positive: ∫_{x₀−ε}^{x₀+ε} f(x) dx > 0, which is a contradiction. An analogous argument shows that f(x₀) < 0 is also not possible. We conclude that f(x) ≡ 0 for all x.
♦ 11.2.23. Suppose there is. Let us show that δ_y(z) = 0 for all 0 < z ≠ y < ℓ. If δ_y(z) > 0 at some z ≠ y, then, by continuity, δ_y(x) > 0 in a small interval 0 < z − ε < x < z + ε < ℓ. We can further assume ε < |z − y|, so that y doesn't lie in the interval. Choose u(x) to be a continuous function with u(x) > 0 for z − ε < x < z + ε but u(x) = 0 for all 0 ≤ x ≤ ℓ such that |z − x| ≥ ε; for example, u(x) = { ε − |x − z|, |x − z| ≤ ε; 0, otherwise }. Note that, in particular, u(y) = 0. Then ∫_0^ℓ δ_y(x)u(x) dx = ∫_{z−ε}^{z+ε} δ_y(x)u(x) dx > 0, because we are integrating a positive continuous function. But this contradicts (11.39), since u(y) = 0. A similar argument shows that δ_y(z) < 0 also leads to a contradiction. Therefore, δ_y(x) = 0 for all 0 < x ≠ y < ℓ, and so, by continuity, δ_y(x) = 0 for all 0 ≤ x ≤ ℓ. But then ∫_0^ℓ δ_y(x)u(x) dx = 0 for all functions u(x), and so (11.39) doesn't hold if u(y) ≠ 0.
♦ 11.2.24. By definition of uniform convergence, for every δ > 0 there exists an n⋆ such that |f_n(x) − σ(x)| < δ for all n ≥ n⋆. However, if δ < 1/2, then there is no such n⋆, since each f_n(x) is continuous, but would have to satisfy f_n(x) < δ < 1/2 for x < 0 and f_n(x) > 1 − δ > 1/2 for x > 0, which is impossible for a continuous function, which, by the Intermediate Value Theorem, must assume every value between δ and 1 − δ, [2].
11.2.25. .5 mm — by linearity and symmetry of the Green’s function.
11.2.26. To determine the Green's function, we must solve the boundary value problem
−c u″ = δ(x − y), u(0) = 0, u′(1) = 0.
The general solution to the differential equation is
u(x) = −ρ(x − y)/c + ax + b, u′(x) = −σ(x − y)/c + a.
The integration constants a, b are fixed by the boundary conditions
u(0) = b = 0, u′(1) = −1/c + a = 0.
Therefore, the Green's function for this problem is
G(x, y) = { x/c, x ≤ y; y/c, x ≥ y }.
11.2.27. (a) G(x, y) = { (1/8)x(4 − y), x ≤ y; (1/8)y(4 − x), x ≥ y }.
(b) G(x, y) = { (1/2)x, x ≤ y; (1/2)y, x ≥ y }.
(c) The free boundary value problem is not positive definite, and so there is not a unique solution.
11.2.28.
(a) G(x, y) = { log(1 + x) ( 1 − log(1 + y)/log 2 ), x < y; log(1 + y) ( 1 − log(1 + x)/log 2 ), x > y }.
(b) u(x) = ∫_0^1 G(x, y) dy = ∫_0^x log(1 + y) ( 1 − log(1 + x)/log 2 ) dy + ∫_x^1 log(1 + x) ( 1 − log(1 + y)/log 2 ) dy = log(x + 1)/log 2 − x.
♥ 11.2.29.
(a) u(x) = (9/16)x − (1/2)x² + (3/16)x³ − (1/4)x⁴, w(x) = u′(x)/(1 + x²) = 9/16 − x;
(b) G(x, y) = { (1 − (3/4)y − (1/4)y³)(x + (1/3)x³), x < y; (1 − (3/4)x − (1/4)x³)(y + (1/3)y³), x > y }.
(c) u(x) = ∫_0^1 G(x, y) dy = ∫_0^x (1 − (3/4)x − (1/4)x³)(y + (1/3)y³) dy + ∫_x^1 (1 − (3/4)y − (1/4)y³)(x + (1/3)x³) dy = (9/16)x − (1/2)x² + (3/16)x³ − (1/4)x⁴.
(d) Under an impulse force at x = y, the maximal displacement is at the forcing point, namely g(x) = G(x, x) = x − (3/4)x² + (1/3)x³ − (1/2)x⁴ − (1/12)x⁶. The maximum value g(x⋆) = 1/3 occurs at the solution x⋆ = (1 + √2)^{1/3} − (1 + √2)^{−1/3} = .596072 of g′(x) = 1 − (3/2)x + x² − 2x³ − (1/2)x⁵ = 0.
11.2.30. (a) G(x, y) = { (1 + y)x, x < y; y(1 + x), x > y }; (b) all of them;
(c) u(x) = ∫_0^1 G(x, y)f(y) dy = ∫_0^x y(1 + x)f(y) dy + ∫_x^1 (1 + y)x f(y) dy,
u′(x) = x(1 + x)f(x) + ∫_0^x y f(y) dy − (1 + x)x f(x) + ∫_x^1 (1 + y)f(y) dy = ∫_0^x y f(y) dy + ∫_x^1 (1 + y)f(y) dy,
u″(x) = x f(x) − (1 + x)f(x) = −f(x).
(d) The boundary value problem is not positive definite — the function u(x) = x solves the homogeneous problem — and so there is no Green's function.
♥ 11.2.31.
(a) u_n(x) = { x(y − 1), 0 ≤ x ≤ y − 1/n; 1/(4n) − (1/2)x + (1/4)nx² − (1/2)y + (1 − (1/2)n)xy + (1/4)ny², |x − y| ≤ 1/n; y(x − 1), y + 1/n ≤ x ≤ 1 }.
(b) Since u_n(x) = G(x, y) for all |x − y| ≥ 1/n, we have lim_{n→∞} u_n(x) = G(x, y) for all x ≠ y, while lim_{n→∞} u_n(y) = lim_{n→∞} ( y² − y + 1/(4n) ) = y² − y = G(y, y). (Or one can appeal to continuity to infer this.) This limit reflects the fact that the external forces converge to the delta function: lim_{n→∞} f_n(x) = δ(x − y).
(c) [plot omitted]
11.2.32. Use formula (11.64) to compute
u(x) = (1/c) ∫_0^x y f(y) dy + (1/c) ∫_x^1 x f(y) dy,
u′(x) = (1/c)[ x f(x) − x f(x) ] + (1/c) ∫_x^1 f(y) dy = (1/c) ∫_x^1 f(y) dy,
u″(x) = −(1/c) f(x).
Moreover, u(0) = (1/c) ∫_0^0 y f(y) dy = 0 and u′(1) = (1/c) ∫_1^1 f(y) dy = 0.
♠ 11.2.33.
(a) The jth column of G is the solution to the linear system Ku = e_j/Δx, corresponding to a force of magnitude 1/Δx concentrated on the jth mass. The total force on the chain is 1, since the force only acts over a distance Δx, and so the forcing function represents a concentrated unit impulse, i.e., a delta function, at the sample point y_j = j/n. Thus, for n ≫ 0, the solution should approximate the sampled values of the Green's function of the limiting problem.
(b) In fact, in this case, the entries of G are exactly equal to the sample values of the Green's function, and, at least in this very simple case, no limiting procedure is required.
♠ 11.2.34. Yes. For a system of n masses connected to both top and bottom supports by n + 1 springs, the spring lengths are Δx = 1/(n + 1), and we rescale the incidence matrix A by dividing by Δx, and set K = A^T A. Again, the jth column of G = K^{−1}/Δx represents the response of the system to a concentrated unit force on the jth mass, and so its entries approximate G(x_i, y_j), where G(x, y) is the Green's function (11.59) for c = 1. Again, in this specific case, the matrix entries are, in fact, exactly equal to the sampled values of the continuum Green's function.
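This exactness is easy to observe in code. The sketch below assumes c = 1, n masses with fixed supports at both ends, and the continuum Green's function G(x, y) = min(x, y)(1 − max(x, y)) for −u″ = δ(x − y), u(0) = u(1) = 0; the discrete G = K^{−1}/Δx then matches its sampled values to rounding error.

```python
import numpy as np

n = 9
dx = 1.0 / (n + 1)
# K = A^T A with the incidence matrix scaled by 1/dx: the standard
# tridiagonal (2, -1) matrix divided by dx^2.
K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / dx**2

G_discrete = np.linalg.inv(K) / dx        # column j: response to impulse at y_j

xs = dx * np.arange(1, n + 1)             # node positions x_i = i/(n+1)
G_continuum = np.minimum.outer(xs, xs) * (1.0 - np.maximum.outer(xs, xs))
```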
♦ 11.2.35. Set I(x, z, w) = ∫_z^w F(x, y) dy, so that
∂I/∂x = ∫_z^w ∂F/∂x (x, y) dy, ∂I/∂z = −F(x, z), ∂I/∂w = F(x, w),
by the Fundamental Theorem of Calculus. Thus, by the multivariable chain rule,
d/dx ∫_{α(x)}^{β(x)} F(x, y) dy = d/dx I(x, α(x), β(x)) = ∂I/∂x + (∂I/∂z)(dα/dx) + (∂I/∂w)(dβ/dx)
= F(x, β(x)) dβ/dx − F(x, α(x)) dα/dx + ∫_{α(x)}^{β(x)} ∂F/∂x (x, y) dy.
11.3.1. (a) Solution: u⋆(x) = (5/2)x − (5/2)x². (b) P[u⋆] = −25/24 = −1.04167;
(i) P[x − x²] = −2/3 = −.66667, (ii) P[(3/2)x − (3/2)x³] = −39/40 = −.975,
(iii) P[(2/3) sin πx] = −20/(3π) + (1/9)π² = −1.02544, (iv) P[x² − x⁴] = −16/35 = −.45714;
all are larger than the minimum P[u⋆].
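These values can be reproduced numerically. The functional below is an assumption reconstructed from the stated minimizer: since u⋆ = (5/2)x − (5/2)x² solves −u″ = 5 with u(0) = u(1) = 0, we take P[u] = ∫_0^1 ( (1/2)(u′)² − 5u ) dx; the exercise statement itself is not reproduced here.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200_001)
dx = x[1] - x[0]

def P(u):
    """Trapezoid-rule evaluation of P[u] = int_0^1 ( (u')^2 / 2 - 5 u ) dx."""
    du = np.gradient(u, dx)                        # numerical derivative of u
    f = 0.5 * du**2 - 5.0 * u
    return float(np.sum(f) * dx - 0.5 * dx * (f[0] + f[-1]))

trials = {
    "u_star":     2.5 * x - 2.5 * x**2,
    "x - x^2":    x - x**2,
    "1.5(x-x^3)": 1.5 * (x - x**3),
    "x^2 - x^4":  x**2 - x**4,
}
values = {name: P(u) for name, u in trials.items()}   # u_star gives the minimum
```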
11.3.2. (i) u⋆(x) = (1/6)x − (1/6)x³; (ii) P[u] = ∫_0^1 [ (1/2)(u′)² − xu ] dx, u(0) = u(1) = 0;
(iii) P[u⋆] = −1/90 = −.01111; (iv) P[cx − cx³] = (2/5)c² − (2/15)c > −1/90 for c ≠ 1/6,
P[cx − cx²] = (1/6)c² − (1/12)c ≥ −1/96 = −.01042, P[c sin πx] = (π²/4)c² − (1/π)c ≥ −1/π⁴ = −.01027.
11.3.3.
(a) (i) u⋆(x) = (1/18)x⁶ + (1/12)x⁴ − 5/36,
(ii) P[u] = ∫_{−1}^1 [ (u′)²/(2(x² + 1)) + x²u ] dx, u(−1) = u(1) = 0,
(iii) P[u⋆] = −.0282187,
(iv) P[−(1/5)(1 − x²)] = −.018997, P[−(1/5) cos (1/2)πx] = −.0150593.
(b) (i) u⋆(x) = 1/2 − e^{−1} + e^{−x−1} − (1/2)e^{−2x},
(ii) P[u] = ∫_0^1 [ (1/2)e^x (u′)² − e^{−x}u ] dx, u(0) = u′(1) = 0,
(iii) P[u⋆] = −.0420967,
(iv) P[(2/5)x − (1/5)x²] = −.0386508, P[(1/5) sin (1/2)πx] = −.0354279.
(c) (i) u⋆(x) = 5/2 − x^{−1} − (1/2)x²,
(ii) P[u] = ∫_1^2 [ (1/2)x²(u′)² − 3x²u ] dx, u′(1) = u(2) = 0,
(iii) P[u⋆] = −37/20 = −1.85,
(iv) P[2x − x²] = −11/6 = −1.83333, P[cos (1/2)π(x − 1)] = −1.84534.
(d) (i) u⋆(x) = (1/3)x + 7/9 − (4/9)x^{−2},
(ii) P[u] = ∫_{−2}^{−1} [ −(1/2)x³(u′)² − x²u ] dx, u(−2) = u(−1) = 0.
325
(iii) P[u? ] = − 13216 = − .060185,
(iv) Ph− 1
4 (x+ 1)(x+ 2)i
= − .0536458, Ph− 3
20 x(x+ 1)(x+ 2)i
= − .0457634.11.3.4.
(a) Boundary value problem: −u″ = 3, u(0) = u(1) = 0; solution: u⋆(x) = −(3/2)x² + (3/2)x.
(b) Boundary value problem: −((x + 1)u′)′ = 5, u(0) = u(1) = 0; solution: u⋆(x) = (5/log 2) log(x + 1) − 5x.
(c) Boundary value problem: −2(xu′)′ = −2, u(1) = u(3) = 0; solution: u⋆(x) = x − 1 − (2/log 3) log x.
(d) Boundary value problem: −(eˣu′)′ = 1 + eˣ, u(0) = u(1) = 0; solution: u⋆(x) = (x − 1)e^{−x} − x + 1.
(e) Boundary value problem: −2 d/dx( (1/(1 + x²)) du/dx ) = −x/(x² + 1)², u(−1) = u(1) = 0; solution: u⋆(x) = −(1/16)x + (1/16)x³.
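The boundary value problems above can be verified by a finite-difference residual check; a sketch for case (d):

```python
import numpy as np

# Residual check for 11.3.4(d):
#   −(eˣ u′)′ = 1 + eˣ,  u(0) = u(1) = 0,  u*(x) = (x−1)e^{−x} − x + 1.
x = np.linspace(0.0, 1.0, 2001)
h = x[1] - x[0]
u = (x - 1)*np.exp(-x) - x + 1

# centered conservative difference for (c(x) u′)′ with c(x) = eˣ,
# using c evaluated at the half-grid points
c_mid = np.exp(0.5*(x[:-1] + x[1:]))
flux = c_mid * np.diff(u) / h                   # c u′ at midpoints
residual = -np.diff(flux)/h - (1 + np.exp(x[1:-1]))

print(abs(u[0]), abs(u[-1]), np.max(np.abs(residual)))   # all small
```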
11.3.5.
(a) Unique minimizer: u⋆(x) = ½x² − 2x + 3/2 + (log x)/(2 log 2).
(b) No minimizer, since x is not positive for all −π < x < π.
(c) Unique minimizer: u⋆(x) = ½(x − 1) − log( cos 1 (1 + sin x) / ((1 + sin 1) cos x) ) / log( (1 + sin 1)/(1 − sin 1) ).
(d) No minimizer, since 1 − x² is not positive for all −2 < x < 2.
(e) No unique minimizer, due to the Neumann boundary conditions.
11.3.6. Yes: any function of the form u(x) = a + ¼x² − (1/6)x³ satisfies the Neumann boundary value problem −u″ = x − ½, u′(0) = u′(1) = 0. Note that the right-hand side satisfies the Fredholm condition ⟨1, x − ½⟩ = ∫₀¹ (x − ½) dx = 0; otherwise the boundary value problem would have no solution.
11.3.7. Arguing as in Example 11.3, any minimum must satisfy the Neumann boundary value problem −(c(x)u′)′ = f(x), u′(0) = 0, u′(1) = 0. The general solution to the differential equation is u(x) = ax + b − ∫₀ˣ (1/c(y)) ( ∫₀ʸ f(z) dz ) dy. Since u′(x) = a − (1/c(x)) ∫₀ˣ f(z) dz, the first boundary condition requires a = 0. The second boundary condition then requires u′(1) = −(1/c(1)) ∫₀¹ f(x) dx = 0. The mean zero condition is both necessary and sufficient for the boundary value problem to have a (non-unique) solution, and hence for the functional to achieve its minimum value. Note that all solutions to the boundary value problem give the same value for the functional, since P[u + b] = P[u] − b ∫₀¹ f(x) dx = P[u].
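The two-quadrature construction above can be sketched directly; here c(x) = 1 + x² and the mean-zero forcing f(x) = cos 2πx are illustrative choices, not from the text:

```python
import numpy as np

# Neumann solution of 11.3.7 by two quadratures:
#   u(x) = b − ∫₀ˣ (1/c(y)) ∫₀ʸ f(z) dz dy,  with  u′(0) = u′(1) = 0.
x = np.linspace(0.0, 1.0, 100001)
c = 1 + x**2
f = np.cos(2*np.pi*x)            # satisfies the solvability condition ∫₀¹ f dx = 0

def cumtrap(y, x):
    # cumulative trapezoidal integral, equal to 0 at x[0]
    return np.concatenate(([0.0], np.cumsum(0.5*(y[1:] + y[:-1])*np.diff(x))))

F = cumtrap(f, x)                # ∫₀ˣ f dz ;  F[-1] ≈ 0 is the mean-zero condition
u = -cumtrap(F/c, x)             # one Neumann solution (a = 0, b = 0)
du = np.gradient(u, x)

print(F[-1], du[0], du[-1])      # all ≈ 0
```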
♦ 11.3.8. ½‖D[u]‖² = ½‖v‖² = ∫₀ℓ ½c(x)v(x)² dx = ∫₀ℓ ½v(x)w(x) dx = ½⟨v, w⟩.
11.3.9. According to (11.91–92) (or using a direct integration by parts), P[u] = ½⟨K[u], u⟩ − ⟨u, f⟩, so when u⋆ with K[u⋆] = f is the minimizer, P[u⋆] = −½⟨u⋆, f⟩ = −½⟨u⋆, K[u⋆]⟩ < 0, since K is positive definite and u⋆ is not the zero function when f ≢ 0.
11.3.10. Yes, by the same argument.
11.3.11. u(x) = x² satisfies ∫₀¹ u″(x)u(x) dx = 2/3. Positivity of ∫₀¹ −u″(x)u(x) dx holds only for functions that satisfy the boundary conditions u(0) = u(1) = 0.
11.3.12. No. The boundary terms still vanish, and so the integration by parts identity continues to hold, but now any constant function u(x) ≡ a makes ∫₀ℓ [ −u″(x)u(x) ] dx = 0. The integral is still ≥ 0 for all functions satisfying the Neumann boundary conditions.
11.3.13. ⟨⟨I[u], v⟩⟩ = ∫₀ℓ u(x)v(x)c(x) dx = ∫₀ℓ u(x) (c(x)/ρ(x)) v(x) ρ(x) dx = ⟨u, I*[v]⟩, provided I*[v] = (c(x)/ρ(x)) v(x) is a multiplication operator.
11.3.14. ⟨K[u], v⟩ = ∫₀ℓ c(x)u(x)v(x) dx = ∫₀ℓ u(x)c(x)v(x) dx = ⟨u, K[v]⟩. No boundary conditions are required.
11.3.15. Integrating by parts,
⟨⟨D[u], v⟩⟩ = ∫₀ℓ (du/dx) v(x)c(x) dx = [ u(ℓ)c(ℓ)v(ℓ) − u(0)c(0)v(0) ] − ∫₀ℓ u(x) (d/dx)[ v(x)c(x) ] dx
= ∫₀ℓ u(x) ( −(1/ρ(x)) (d/dx)[ v(x)c(x) ] ) ρ(x) dx = ⟨u, −(1/ρ) d(vc)/dx⟩ = ⟨u, D*[v]⟩,
provided the boundary terms vanish:
u(ℓ)c(ℓ)v(ℓ) − u(0)c(0)v(0) = 0.
Therefore,
D*[v(x)] = −(1/ρ(x)) (d/dx)[ c(x)v(x) ] = −(c(x)/ρ(x)) dv/dx − (c′(x)/ρ(x)) v(x).
Note that we have the same boundary terms as in (11.87), and so all of our self-adjoint boundary conditions continue to be valid. The self-adjoint boundary value problem K[u] = D*∘D[u] = f is now given by
K[u] = −(1/ρ(x)) (d/dx)( c(x) du/dx ) = f(x),
along with the selected boundary conditions, e.g., Dirichlet conditions u(0) = u(ℓ) = 0.
♥ 11.3.16.
(a) Integrating the first term by parts, we find
⟨⟨L[u], v⟩⟩ = ∫₀¹ [ u′(x)v(x) + 2x u(x)v(x) ] dx = [ u(1)v(1) − u(0)v(0) ] + ∫₀¹ [ −u(x)v′(x) + 2x u(x)v(x) ] dx = ⟨u, −v′ + 2xv⟩ = ⟨u, L*[v]⟩,
owing to the boundary conditions. Therefore, the adjoint operator is L*[v] = −v′ + 2xv.
(b) The operator K is positive definite because ker L = {0}. Indeed, the solution to L[u] = 0 is u(x) = c e^{−x²}, and either boundary condition implies c = 0.
(c) K[u] = (−D + 2x)(D + 2x)u = −u″ + (4x² − 2)u = f(x), u(0) = u(1) = 0.
(d) The general solution to the differential equation (−D + 2x)v = e^{x²} is v(x) = (b − x)e^{x²}, and so the solution to (−D + 2x)(D + 2x)u = e^{x²} is found by solving
(D + 2x)u = (b − x)e^{x²}, so u(x) = −¼e^{x²} + a e^{−x²} + b e^{−x²} ∫₀ˣ e^{2y²} dy.
Imposing the boundary conditions, we find
u(x) = (e^{−x²} − e^{x²})/4 + (e² − 1) e^{−x²} ∫₀ˣ e^{2y²} dy / ( 4 ∫₀¹ e^{2y²} dy ).
(e) Because the boundary terms u(1)v(1) − u(0)v(0) in the integration by parts argument are not zero. Indeed, we should identify v(x) = u′(x) + 2x u(x), and so the correct form of the free boundary conditions v(0) = v(1) = 0 requires u′(0) = u′(1) + 2u(1) = 0. On the other hand, although it is not self-adjoint and so doesn't admit a minimization principle, the free boundary value problem does have a unique solution:
u(x) = −¼( e^{x²} + e^{2−x²} ).
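The closed-form solution in part (d) can be checked numerically by evaluating Q(x) = ∫₀ˣ e^{2y²} dy with quadrature and computing the finite-difference residual of the differential equation; a sketch:

```python
import numpy as np

# Check of the solution in 11.3.16(d):
#   −u″ + (4x² − 2)u = e^{x²},  u(0) = u(1) = 0,  with
#   u(x) = ¼(e^{−x²} − e^{x²}) + (e² − 1) e^{−x²} Q(x) / (4 Q(1)),
# where Q(x) = ∫₀ˣ e^{2y²} dy.
x = np.linspace(0.0, 1.0, 4001)
h = x[1] - x[0]
Q = np.concatenate(([0.0],
      np.cumsum(0.5*(np.exp(2*x[1:]**2) + np.exp(2*x[:-1]**2))*h)))
u = 0.25*(np.exp(-x**2) - np.exp(x**2)) \
    + (np.e**2 - 1)*np.exp(-x**2)*Q / (4*Q[-1])

residual = -(u[2:] - 2*u[1:-1] + u[:-2])/h**2 \
           + (4*x[1:-1]**2 - 2)*u[1:-1] - np.exp(x[1:-1]**2)
print(abs(u[0]), abs(u[-1]), np.max(np.abs(residual)))   # all small
```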
♥ 11.3.17.
(a) a(x)u″ + b(x)u′ = −(c(x)u′)′ = −c(x)u″ − c′(x)u′ if and only if a = −c and b = −c′ = a′.
(b) We require ρb = (ρa)′ = ρ′a + ρa′, and so ρ′/ρ = (b − a′)/a. Hence
ρ = exp( ∫ (b − a′)/a dx )
is found by one integration.
(c) (i) No integrating factor needed: −(x²u′)′ = x − 1;
(ii) no integrating factor needed: −(eˣu′)′ = −e^{2x};
(iii) integrating factor ρ(x) = −e^{2x}, so −(e^{2x}u′)′ = −e^{2x};
(iv) integrating factor ρ(x) = x², so −(x³u′)′ = x³;
(v) integrating factor ρ(x) = −1/cos²x, so −(u′/cos x)′ = −1/cos x.
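The integrating-factor formula of part (b) can be sanity-checked numerically. In the sketch below, case (iv) is used: reading −(x³u′)′ = x³ backwards suggests (this is an inferred reading, not stated in the text) the original coefficients a(x) = −x, b(x) = −3, for which the formula should reproduce ρ(x) = x² up to a multiplicative constant:

```python
import math

# Numerical check of ρ(x) = exp( ∫ (b(x) − a'(x))/a(x) dx ), normalized
# so that ρ(1) = 1, for the inferred coefficients a(x) = −x, b(x) = −3.
a = lambda t: -t
da = lambda t: -1.0
b = lambda t: -3.0

def rho(x, x0=1.0, n=4000):
    # trapezoidal quadrature of (b − a')/a from x0 to x, then exponentiate
    h = (x - x0)/n
    g = lambda t: (b(t) - da(t))/a(t)
    s = h*(sum(g(x0 + i*h) for i in range(1, n)) + 0.5*(g(x0) + g(x)))
    return math.exp(s)

for xv in (0.5, 1.5, 2.0):
    print(xv, rho(xv), xv**2)    # ρ(x) ≈ x²
```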
11.3.18.
(a) Integrating the first term by parts and using the boundary conditions,
⟨⟨L[u], v⟩⟩ = ∫₀¹ [ u′v₁ + uv₂ ] dx = ∫₀¹ u [ −v₁′ + v₂ ] dx = ⟨u, L*[v]⟩,
and so the adjoint operator is L*[v] = −v₁′ + v₂.
(b) −u″ + u = x − 1, u(0) = u(1) = 0, with solution u(x) = x − 1 + (e^{2−x} − eˣ)/(e² − 1).
11.3.19. Use integration by parts:
⟨L[u], v⟩ = ∫₀^{2π} i u′(x) v(x)‾ dx = −∫₀^{2π} i u(x) v′(x)‾ dx = ∫₀^{2π} u(x) ( i v′(x) )‾ dx = ⟨u, L[v]⟩,
where the bar denotes complex conjugation, and the boundary terms vanish because both u and v are assumed to be 2π-periodic.
11.3.20. Quadratic polynomials do not, in general, satisfy any of the allowable boundary conditions, and so, in this situation, the boundary terms will contribute to the computation of the adjoint.
11.3.21. Solving the corresponding boundary value problem −(d/dx)( x du/dx ) = −x², u(1) = 0, u(2) = 1, gives u(x) = (x³ − 1)/9 + (2 log x)/(9 log 2).
11.3.22.
(a) (i) −u″ = −1, u(0) = 2, u(1) = 3; (ii) u⋆(x) = ½x² + ½x + 2.
(b) (i) −2(xu′)′ = 2x, u(1) = 1, u(e) = 1; (ii) u⋆(x) = 5/4 − ¼x² + ¼(e² − 1) log x.
(c) (i) −(d/dx)( (1/(1 + x²)) du/dx ) = 0, u(−1) = 1, u(1) = −1; (ii) u⋆(x) = −¾x − ¼x³.
(d) (i) −(eˣu′)′ = 1, u(0) = −1, u(1) = 0; (ii) u⋆(x) = (x − 1)e^{−x}.
11.3.23.
(a) (i) ∫₀^π [ ½(u′)² − (cos x)u ] dx, u(0) = 1, u(π) = −2; (ii) u⋆(x) = cos x − x/π.
(b) (i) ∫_{−1}^{1} [ (u′)²/(2(x² + 1)) − x²u ] dx, u(−1) = −1, u(1) = 1;
(ii) u⋆(x) = −(1/18)x⁶ − (1/12)x⁴ + ¼x³ + ¾x + 5/36.
(c) (i) ∫₀¹ [ ½eˣ(u′)² + u ] dx, u(0) = 1, u(1) = 0; (ii) u⋆(x) = e^{−x}(1 − x).
(d) (i) ∫₁³ [ ½x²(u′)² − (x² − x)u ] dx, u(1) = −1, u(3) = 2;
(ii) u⋆(x) = −(1/6)x² + ½x + 11/3 − 5/x.
11.3.24. The function ũ(x) = u(x) − [ α(1 − x) + βx ] satisfies the homogeneous boundary value problem −ũ″ = f, ũ(0) = ũ(1) = 0, and so is given by the superposition formula ũ(x) = ∫₀¹ f(y)G(x, y) dy. Therefore,
u(x) = α(1 − x) + βx + ∫₀¹ f(y)G(x, y) dy
= α(1 − x) + βx + ∫₀ˣ (1 − x)y f(y) dy + ∫ₓ¹ x(1 − y) f(y) dy.
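The superposition formula above is straightforward to evaluate with quadrature; in this sketch the choices f ≡ 1 and α = β = 0 are illustrative, and should reproduce the exact solution u(x) = x(1 − x)/2 of −u″ = 1:

```python
import numpy as np

# Green's function superposition for −u″ = f, u(0) = α, u(1) = β, with
# G(x,y) = y(1−x) for y < x and x(1−y) for y > x.
y = np.linspace(0.0, 1.0, 20001)
f = np.ones_like(y)              # illustrative forcing f ≡ 1

def u(x, alpha=0.0, beta=0.0):
    G = np.where(y < x, y*(1 - x), x*(1 - y))
    integrand = G*f
    return alpha*(1 - x) + beta*x + \
        float(np.sum(0.5*(integrand[1:] + integrand[:-1])*np.diff(y)))

for xv in (0.25, 0.5, 0.75):
    print(xv, u(xv), xv*(1 - xv)/2)   # the two values agree
```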
11.3.25.
(a) −(d/dx)[ (1 + x) du/dx ] = 1 − x, u(0) = 0, u(1) = .01; solution: u⋆(x) = .25x² − 1.5x + 1.8178 log(1 + x);
(b) P[u] = ∫₀¹ ( ½(1 + x)(du/dx)² − (1 − x)u ) dx;
(c) P[u⋆] = −.0102;
(d) When u(x) = .1x + α(x − x²), then P[u] = .25α² − .085α − .00159, with minimum value −.0070535 at α = .17. When u(x) = .1x + α sin πx, then P[u] = 3.7011α² − .3247α − .0087, with minimum value −.0087 at α = .0439.
♦ 11.3.26. We proceed as in the Dirichlet case, setting u(x) = ũ(x) + h(x), where h(x) is any function that satisfies the mixed boundary conditions h(0) = α, h′(ℓ) = β, e.g., h(x) = α + βx. The same calculation leads to (11.105), which now is
⟨⟨u′, h′⟩⟩ − ⟨u, K[h]⟩ = c(ℓ)h′(ℓ)u(ℓ) − c(0)h′(0)α = βc(ℓ)u(ℓ) − c(0)h′(0)α.
The second term, C₁ = −c(0)h′(0)α, doesn't depend on u. Therefore,
P[ũ] = −βc(ℓ)u(ℓ) + P[u] − C₁ + C₀,
and so ũ(x) minimizes P[ũ] if and only if u(x) = ũ(x) + h(x) minimizes the functional (11.106).
11.3.27. Solving the boundary value problem −(d/dx)( x du/dx ) = −½x² with the given boundary conditions gives u(x) = (1/18)x³ + 17/18 − (4/3) log x.
11.3.28. For any ε > 0, the functions
u(x) = { 0, 0 ≤ x ≤ 1 − ε;  (1/(2ε))(x − 1 + ε)², 1 − ε ≤ x ≤ 1, }
satisfy the boundary conditions, but J[u] = (1/3)ε, and hence J[u] can be made as close to 0 as desired. However, if u(x) is continuous, then J[u] = 0 if and only if u(x) = c is a constant function. But no constant function satisfies both boundary conditions, and hence J[u] has no minimum value when subject to the boundary conditions.
♦ 11.3.29. The extra boundary terms serve to cancel those arising during the integration by parts computation:
P̂[u] = ∫ₐᵇ [ ½(u′)² + u⋆″u ] dx − u⋆′(b)u(b) + u⋆′(a)u(a)
= ∫ₐᵇ [ ½(u′)² − u⋆′u′ ] dx = ∫ₐᵇ ½(u′ − u⋆′)² dx − ∫ₐᵇ ½(u⋆′)² dx.
Again, the minimum occurs when u = u⋆, but it is no longer unique, since we can add in any constant function, which solves the associated homogeneous boundary value problem.
11.4.1.
(a) u(x) = (1/24)x⁴ − (1/12)x³ + (1/24)x, w(x) = u″(x) = ½x² − ½x.
(b) No equilibrium solution.
(c) u(x) = (1/24)x⁴ − (1/6)x³ + (1/3)x, w(x) = u″(x) = ½x² − x.
(d) u(x) = (1/24)x⁴ − (1/6)x³ + (1/4)x², w(x) = u″(x) = ½x² − x + ½.
(e) u(x) = (1/24)x⁴ − (1/6)x³ + (1/6)x², w(x) = u″(x) = ½x² − x + 1/3.
♥ 11.4.2.
(a) Maximal displacement: u(½) = 5/384 = .01302; maximal stress: w(½) = −⅛ = −.125;
(b) no solution;
(c) maximal displacement: u(1) = 5/24 = .2083; maximal stress: w(1) = −½ = −.5;
(d) maximal displacement: u(1) = ⅛ = .125; maximal stress: w(0) = ½ = .5;
(e) maximal displacement: u(1) = 1/24 = .04167; maximal stress: w(0) = ⅓ = .3333.
Thus, case (c) has the largest displacement, while cases (c, d) have the largest stress.
11.4.3. Except in (b), which has no minimization principle, we minimize
P[u] = ∫₀¹ [ ½u″(x)² − u(x) ] dx
subject to the boundary conditions
(a) u(0) = u″(0) = u(1) = u″(1) = 0; (c) u(0) = u″(0) = u′(1) = u‴(1) = 0;
(d) u(0) = u′(0) = u″(1) = u‴(1) = 0; (e) u(0) = u′(0) = u′(1) = u‴(1) = 0.
11.4.4.
(a) (i) G(x, y) = { (1/3)xy − (1/6)x³ − ½xy² + (1/6)x³y + (1/6)xy³, x < y;  (1/3)xy − ½x²y − (1/6)y³ + (1/6)x³y + (1/6)xy³, x > y; }
(ii) [graph of G(x, ½)];
(iii) u(x) = ∫₀ˣ ( (1/3)xy − ½x²y − (1/6)y³ + (1/6)x³y + (1/6)xy³ ) f(y) dy + ∫ₓ¹ ( (1/3)xy − (1/6)x³ − ½xy² + (1/6)x³y + (1/6)xy³ ) f(y) dy;
(iv) maximal displacement at x = ½, with G(½, ½) = 1/48.
(b) No Green's function.
(c) (i) G(x, y) = { xy − (1/6)x³ − ½xy², x < y;  xy − ½x²y − (1/6)y³, x > y; }
(ii) [graph];
(iii) u(x) = ∫₀ˣ ( xy − ½x²y − (1/6)y³ ) f(y) dy + ∫ₓ¹ ( xy − (1/6)x³ − ½xy² ) f(y) dy;
(iv) maximal displacement at x = 1, with G(1, 1) = 1/3.
(d) (i) G(x, y) = { −(1/6)x³ + ½x²y, x < y;  ½xy² − (1/6)y³, x > y; }
(ii) [graph];
(iii) u(x) = ∫₀ˣ ( ½xy² − (1/6)y³ ) f(y) dy + ∫ₓ¹ ( −(1/6)x³ + ½x²y ) f(y) dy;
(iv) maximal displacement at x = 1, with G(1, 1) = 1/3.
(e) (i) G(x, y) = { −(1/6)x³ + ½x²y − ¼x²y², x < y;  ½xy² − (1/6)y³ − ¼x²y², x > y; }
(ii) [graph];
(iii) u(x) = ∫₀ˣ ( ½xy² − (1/6)y³ − ¼x²y² ) f(y) dy + ∫ₓ¹ ( −(1/6)x³ + ½x²y − ¼x²y² ) f(y) dy;
(iv) maximal displacement at x = 1, with G(1, 1) = 1/12.
♥ 11.4.5. The boundary value problem for the bar is
−u″ = f, u(0) = u(ℓ) = 0, with solution u(x) = ½f x(ℓ − x).
The maximal displacement is at the midpoint, with u(½ℓ) = ⅛fℓ². The boundary value problem for the beam is
u⁗ = f, u(0) = u′(0) = u(ℓ) = u′(ℓ) = 0, with solution u(x) = (1/24)f x²(ℓ − x)².
The maximal displacement is at the midpoint, with u(½ℓ) = (1/384)fℓ⁴. The beam displacement is greater than the bar displacement if and only if ℓ > 4√3.
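The comparison at the end is a one-line computation, since fℓ⁴/384 > fℓ²/8 exactly when ℓ² > 48; a minimal sketch:

```python
import math

# Bar versus beam midpoint displacements from 11.4.5:
#   bar:  f ℓ²/8,   beam:  f ℓ⁴/384;  the beam wins when ℓ > 4√3 ≈ 6.93.
bar = lambda f, l: f*l**2/8
beam = lambda f, l: f*l**4/384
threshold = math.sqrt(48)        # = 4√3

print(threshold)
print(beam(1.0, 7.0) > bar(1.0, 7.0))   # just above the threshold
print(beam(1.0, 6.0) > bar(1.0, 6.0))   # just below the threshold
```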
11.4.6. The strain is v(x) = (x² − x)/(2(1 + x²)), with stress w(x) = (1 + x²)v(x) = ½(x² − x), which is obtained by solving w″ = 1 subject to the boundary conditions w(0) = w(1) = 0. The problem is statically determinate because the boundary conditions uniquely determine w(x) and v(x) without having to first find the displacement u(x) (which is rather complicated).
11.4.7.
(a) The differential equation is u⁗ = f. Integrating once, the boundary conditions u‴(0) = u‴(1) = 0 imply
u‴(x) = ∫₀ˣ f(y) dy  with  ∫₀¹ f(y) dy = 0.
Integrating again, the other boundary conditions u″(0) = u″(1) = 0 imply that
u″(x) = ∫₀ˣ ( ∫₀ᶻ f(y) dy ) dz  with  ∫₀¹ ( ∫₀ᶻ f(y) dy ) dz = 0.
(b) The Fredholm alternative requires f(x) to be orthogonal to the functions in ker K = ker L = ker D², with basis 1 and x. (Note that both functions satisfy the free boundary conditions.) Then
⟨f, 1⟩ = ∫₀¹ f(x) dx = 0,  ⟨f, x⟩ = ∫₀¹ x f(x) dx = 0.
Comparing with the two previous constraints, the first are identical, while the second are equivalent via an integration by parts.
(c) f(x) = x² − x + 1/6 satisfies the constraints; the corresponding solution is
u(x) = (1/360)x⁶ − (1/120)x⁵ + (1/144)x⁴ + cx + d, where c, d are arbitrary constants.
11.4.8.
(a) The differential equation is u⁗ = f. Integrating once, the boundary condition u‴(0) = 0 implies u‴(x) = ∫₀ˣ f(y) dy. Integrating again, the boundary conditions u″(0) = u″(1) = 0 imply that
u″(x) = ∫₀ˣ ( ∫₀ᶻ f(y) dy ) dz  with  ∫₀¹ ( ∫₀ᶻ f(y) dy ) dz = 0.
Integrating twice more, one can arrange to satisfy the remaining boundary condition u(1) = 0, and so in this case there is only one constraint.
(b) The Fredholm alternative requires f(x) to be orthogonal to the functions in ker K = ker L = ker D², with basis x − 1, and so
⟨f, x − 1⟩ = ∫₀¹ (x − 1)f(x) dx = 0.
This is equivalent to the previous constraint through an integration by parts.
(c) f(x) = x − 1/3 satisfies the constraint; the corresponding solution is
u(x) = (1/120)x⁵ − (1/72)x⁴ + c(x − 1) + 1/180, where c is an arbitrary constant.
11.4.9. False. Any boundary conditions leading to self-adjointness result in a symmetric Green’sfunction.
11.4.10.
u = (1/6) ∫₀ˣ y²(1 − x)²(3x − y − 2xy) f(y) dy + (1/6) ∫ₓ¹ x²(1 − y)²(3y − x − 2xy) f(y) dy,
du/dx = (1/2) ∫₀ˣ y²(1 − x)(1 − 3x + 2xy) f(y) dy + (1/2) ∫ₓ¹ x(1 − y)²(2y − x − 2xy) f(y) dy,
d²u/dx² = ∫₀ˣ y²(−2 + 3x + y − 2xy) f(y) dy + ∫ₓ¹ (1 − y)²(y − x − 2xy) f(y) dy,
d³u/dx³ = ∫₀ˣ y²(3 − 2y) f(y) dy − ∫ₓ¹ (1 − y)²(1 + 2y) f(y) dy,
d⁴u/dx⁴ = f(x). Moreover, u(0) = u′(0) = u(1) = u′(1) = 0.
♥ 11.4.11. Same answer in all three parts: since, in the absence of boundary conditions, ker D² = {ax + b}, we must impose boundary conditions in such a way that no nonzero affine function can satisfy them. The complete list is: (i) fixed plus any other boundary condition; or (ii) simply supported plus any other except free. All other combinations have a nontrivial kernel, and so have non-unique equilibria, are not positive definite, and do not have a Green's function.
11.4.12.
(a) Simply supported ends with the right end raised 1 unit; solution u(x) = x.
(b) Clamped ends; the left is held horizontally and moved one unit down, the right is held at an angle tan⁻¹2; solution u(x) = x² − 1.
(c) Left end is clamped at a 45° angle, right end is free with an induced stress; solution u(x) = x + ½x².
(d) Left end is simply supported and raised 2 units up, right end is sliding and tilted at an angle −tan⁻¹2; solution u(x) = 2 − 2x.
11.4.13. (a) P[u] = ∫₀¹ ½u″(x)² dx, (b) P[u] = ∫₀¹ ½u″(x)² dx,
(c) P[u] = u′(1) + ∫₀¹ ½u″(x)² dx, (d) P[u] = ∫₀¹ ½u″(x)² dx.
11.4.14.
(a) u(x) = { −1.25(x + 1)³ + 4.25(x + 1) − 2, −1 ≤ x ≤ 0;  1.25x³ − 3.75x² + .5x − 1, 0 ≤ x ≤ 1. } [graph]
(b) u(x) = { −x³ + 2x + 1, 0 ≤ x ≤ 1;  2(x − 1)³ − 3(x − 1)² − (x − 1) + 2, 1 ≤ x ≤ 2;  −(x − 2)³ + 3(x − 2)² − (x − 2), 2 ≤ x ≤ 3. } [graph]
(c) u(x) = { (2/3)(x − 1)³ − (11/3)(x − 1) + 3, 1 ≤ x ≤ 2;  −(1/3)(x − 2)³ + 2(x − 2)² − (5/3)(x − 2), 2 ≤ x ≤ 4. } [graph]
(d) u(x) = { 1.53571(x + 2)³ − 4.53571(x + 2) + 5, −2 ≤ x ≤ −1;  −3.67857(x + 1)³ + 4.60714(x + 1)² + .07143(x + 1) + 2, −1 ≤ x ≤ 0;  4.17857x³ − 6.42857x² − 1.75x + 3, 0 ≤ x ≤ 1;  −2.03571(x − 1)³ + 6.10714(x − 1)² − 2.07143(x − 1) − 1, 1 ≤ x ≤ 2. } [graph]
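The spline in (b) can be reproduced by the standard natural-spline tridiagonal solve; a sketch for the data x = 0, 1, 2, 3 with values y = 1, 2, 0, 1 (the interpolation data implied by the printed pieces):

```python
import numpy as np

# Natural cubic spline: with unit spacing, the interior second derivatives
# M₁, M₂ solve  M_{j−1} + 4 M_j + M_{j+1} = 6(y_{j+1} − 2y_j + y_{j−1}),
# with M₀ = M₃ = 0; the coefficient of (x − x_j)³ on piece j is (M_{j+1} − M_j)/6
# and the coefficient of (x − x_j) is (y_{j+1} − y_j) − (2M_j + M_{j+1})/6.
y = np.array([1.0, 2.0, 0.0, 1.0])
rhs = 6*(y[2:] - 2*y[1:-1] + y[:-2])
A = np.array([[4.0, 1.0], [1.0, 4.0]])
M = np.zeros(4)
M[1:3] = np.linalg.solve(A, rhs)

cubic = (M[1:] - M[:-1])/6
linear = (y[1:] - y[:-1]) - (2*M[:-1] + M[1:])/6

print(cubic)    # matching −x³, 2(x−1)³, −(x−2)³
print(linear)   # matching the printed linear coefficients 2, −1, −1
```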
♥ 11.4.15. In general, the formulas for the homogeneous clamped spline coefficients a_j, b_j, c_j, d_j, for j = 0, …, n − 1, are
a_j = y_j, j = 0, …, n − 1,
d_j = (c_{j+1} − c_j)/(3h_j), j = 0, …, n − 2,  d_{n−1} = −b_{n−1}/(3h_{n−1}²) − 2c_{n−1}/(3h_{n−1}),
b₀ = 0,  b_j = (y_{j+1} − y_j)/h_j − (2c_j + c_{j+1})h_j/3, j = 1, …, n − 2,
b_{n−1} = 3(y_n − y_{n−1})/(2h_{n−1}) − ½c_{n−1}h_{n−1},
where c = (c₀, c₁, …, c_{n−1})ᵀ solves Ac = z = (z₀, z₁, …, z_{n−1})ᵀ. Here A is the tridiagonal matrix with diagonal entries 2h₀, 2(h₀ + h₁), 2(h₁ + h₂), …, 2(h_{n−3} + h_{n−2}), 2h_{n−2} + (3/2)h_{n−1}, and sub- and superdiagonal entries h₀, h₁, …, h_{n−2}, while
z₀ = 3(y₁ − y₀)/h₀,  z_j = 3( (y_{j+1} − y_j)/h_j − (y_j − y_{j−1})/h_{j−1} ), j = 1, …, n − 2.
The particular solutions are:
(a) u(x) = { −5.25(x + 1)³ + 8.25(x + 1)² − 2, −1 ≤ x ≤ 0;  4.75x³ − 7.5x² + .75x + 1, 0 ≤ x ≤ 1. } [graph]
(b) u(x) = { −2.6x³ + 3.6x² + 1, 0 ≤ x ≤ 1;  2.8(x − 1)³ − 4.2(x − 1)² − .6(x − 1) + 2, 1 ≤ x ≤ 2;  −2.6(x − 2)³ + 4.2(x − 2)² − .6(x − 2), 2 ≤ x ≤ 3. } [graph]
(c) u(x) = { 3.5(x − 1)³ − 6.5(x − 1)² + 3, 1 ≤ x ≤ 2;  −1.125(x − 2)³ + 4(x − 2)² − 2.5(x − 2), 2 ≤ x ≤ 4. } [graph]
(d) u(x) = { 4.92857(x + 2)³ − 7.92857(x + 2)² + 5, −2 ≤ x ≤ −1;  −4.78571(x + 1)³ + 6.85714(x + 1)² − 1.07143(x + 1) + 2, −1 ≤ x ≤ 0;  5.21429x³ − 7.5x² − 1.71429x + 3, 0 ≤ x ≤ 1;  −5.07143(x − 1)³ + 8.14286(x − 1)² − 1.07143(x − 1) − 1, 1 ≤ x ≤ 2. } [graph]
11.4.16.
(a) u(x) = { x³ − 2x² + 1, 0 ≤ x ≤ 1;  (x − 1)² − (x − 1), 1 ≤ x ≤ 2;  −(x − 2)³ + (x − 2)² + (x − 2), 2 ≤ x ≤ 3. } [graph]
(b) u(x) = { −x³ + 2x + 1, 0 ≤ x ≤ 1;  2(x − 1)³ − 3(x − 1)² − (x − 1) + 2, 1 ≤ x ≤ 2;  −(x − 2)³ + 3(x − 2)² − (x − 2), 2 ≤ x ≤ 3. } [graph]
(c) u(x) = { (5/4)x³ − (9/4)x² + 1, 0 ≤ x ≤ 1;  −(3/4)(x − 1)³ + (3/2)(x − 1)² − (3/4)(x − 1), 1 ≤ x ≤ 2;  (3/4)(x − 2)³ − (3/4)(x − 2)², 2 ≤ x ≤ 3;  −(5/4)(x − 3)³ + (3/2)(x − 3)² + (3/4)(x − 3), 3 ≤ x ≤ 4. } [graph]
(d) u(x) = { −2(x + 2)³ + (3/4)(x + 2)² + (9/4)(x + 2) + 1, −2 ≤ x ≤ −1;  (7/2)(x + 1)³ − (21/4)(x + 1)² − (9/4)(x + 1) + 2, −1 ≤ x ≤ 0;  −2x³ + (21/4)x² − (9/4)x − 2, 0 ≤ x ≤ 1;  (1/2)(x − 1)³ − (3/4)(x − 1)² + (9/4)(x − 1) − 1, 1 ≤ x ≤ 2. } [graph]
♠ 11.4.17.
(a) If we measure x in radians,
u(x) = { .9959x − .1495x³, 0 ≤ x ≤ π/6;  .5 + .8730(x − π/6) − .2348(x − π/6)² − .297(x − π/6)³, π/6 ≤ x ≤ π/4;  .7071 + .6888(x − π/4) − .4686(x − π/4)² − .5966(x − π/4)³, π/4 ≤ x ≤ π/3. }
We plot the spline, a comparison with the exact graph, and a graph of the error: [graphs]
(b) The maximal error in the spline is .002967, versus .000649 for the interpolating polynomial. The others all have larger error.
♣ 11.4.18.
(a) u(x) = { 2.2718x − 4.3490x³, 0 ≤ x ≤ 1/4;  .5 + 1.4564(x − 1/4) − 3.2618(x − 1/4)² + 3.7164(x − 1/4)³, 1/4 ≤ x ≤ 9/16;  .75 + .5066(x − 9/16) + .2224(x − 9/16)² − .1694(x − 9/16)³, 9/16 ≤ x ≤ 1. }
We plot the spline, a comparison with the exact graph, and a graph of the error: [graphs]
(b) The maximal error in the spline is .1106, versus .1617 for the interpolating polynomial.
(c) The least squares error for the spline is .0396, while the least squares cubic polynomial, p(x) = .88889x³ − 1.90476x² + 1.90476x + .12698, has larger maximal error .1270, but smaller least squares error .0112 (as it must!).
♠ 11.4.19. [graphs]
The cubic spline interpolants do not exhibit any of the pathology associated with the interpolating polynomials. Indeed, the maximal absolute errors are, respectively, .4139, .1001, .003816, and so increasing the number of nodes significantly reduces the overall error.
♣ 11.4.20. Sample letters:
♣ 11.4.21. Sample letters:
Clearly, interpolating polynomials are completely unsuitable for typography!
♥ 11.4.22.
(a) C₀(x) = { 1 − (19/15)x + (4/15)x³, 0 ≤ x ≤ 1;  −(7/15)(x − 1) + (4/5)(x − 1)² − (1/3)(x − 1)³, 1 ≤ x ≤ 2;  (2/15)(x − 2) − (1/5)(x − 2)² + (1/15)(x − 2)³, 2 ≤ x ≤ 3. } [graph]
C₁(x) = { (8/5)x − (3/5)x³, 0 ≤ x ≤ 1;  1 − (1/5)(x − 1) − (9/5)(x − 1)² + (x − 1)³, 1 ≤ x ≤ 2;  −(4/5)(x − 2) + (6/5)(x − 2)² − (2/5)(x − 2)³, 2 ≤ x ≤ 3. } [graph]
C₂(x) = { −(2/5)x + (2/5)x³, 0 ≤ x ≤ 1;  (4/5)(x − 1) + (6/5)(x − 1)² − (x − 1)³, 1 ≤ x ≤ 2;  1 + (1/5)(x − 2) − (9/5)(x − 2)² + (3/5)(x − 2)³, 2 ≤ x ≤ 3. } [graph]
C₃(x) = { (1/15)x − (1/15)x³, 0 ≤ x ≤ 1;  −(2/15)(x − 1) − (1/5)(x − 1)² + (1/3)(x − 1)³, 1 ≤ x ≤ 2;  (7/15)(x − 2) + (4/5)(x − 2)² − (4/15)(x − 2)³, 2 ≤ x ≤ 3. } [graph]
(b) It suffices to note that any linear combination of natural splines is a natural spline. Moreover, u(x_j) = y₀C₀(x_j) + y₁C₁(x_j) + ⋯ + y_nC_n(x_j) = y_j, as desired.
(c) The n + 1 cardinal splines C₀(x), …, C_n(x) form a basis of the space of natural splines. Part (b) shows that they span the space, since we can interpolate any data. Moreover, they are linearly independent since, again by part (b), the only spline that interpolates the zero data, u(x_j) = 0 for all j = 0, …, n, is the trivial one u(x) ≡ 0.
(d) The same ideas apply to clamped splines with homogeneous boundary conditions, and to periodic splines. In the periodic case, since C₀(x) = C_n(x), the vector space only has dimension n. The formulas for the clamped cardinal and periodic cardinal splines are different, though.
♥ 11.4.23.
(a) β(x) = { (x + 2)³, −2 ≤ x ≤ −1;  4 − 6x² − 3x³, −1 ≤ x ≤ 0;  4 − 6x² + 3x³, 0 ≤ x ≤ 1;  (2 − x)³, 1 ≤ x ≤ 2. } [graph]
(b) By direct computation, β′(−2) = 0 = β′(2).
(c) By the interpolation conditions, the natural boundary conditions, and part (b), β(−2) = β′(−2) = β″(−2) = 0 = β(2) = β′(2) = β″(2), and so periodicity is assured.
(d) Since β(x) is a spline, it is C² for −2 < x < 2, while the zero function is everywhere C². Thus, the only problematic points are x = ±2, and, by part (c), β⋆(−2⁻) = 0 = β⋆(−2⁺), β⋆′(−2⁻) = 0 = β⋆′(−2⁺), β⋆″(−2⁻) = 0 = β⋆″(−2⁺), and similarly at x = 2, proving the continuity of β⋆(x) and its first two derivatives at ±2.
♥ 11.4.24.
(a) According to Exercise 11.4.23, the functions B_j(x) are all periodic cubic splines. Moreover, their sample values
B_j(x_k) = { 4, j = k;  1, |j − k| = 1 mod n;  0, otherwise, }
form a linearly independent set of vectors, because the corresponding circulant tridiagonal n × n matrix B, with entries b_{jk} = B_j(x_k) for j, k = 0, …, n − 1, is diagonally dominant.
(b) [graphs of B₀(x), B₁(x), B₂(x), B₃(x), B₄(x)]
(c) Solve the linear system Bα = y, where B is the matrix constructed in part (a), for the B-spline coefficients.
(d) u(x) = (5/11)B₁(x) + (2/11)B₂(x) − (2/11)B₃(x) − (5/11)B₄(x). [graph]
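The circulant system of parts (a) and (c) is easy to solve directly; a sketch for n = 5, using the interpolation data implied by the printed coefficients of part (d):

```python
import numpy as np

# Periodic B-spline interpolation as in 11.4.24: B is circulant tridiagonal,
# with 4 on the diagonal and 1 on the two cyclic off-diagonals.
n = 5
B = 4*np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
B[0, -1] = B[-1, 0] = 1          # cyclic corner entries
y = np.array([0.0, 2.0, 1.0, -1.0, -2.0])   # data consistent with part (d)
alpha = np.linalg.solve(B, y)

print(alpha*11)   # the printed coefficients 0, 5/11, 2/11, −2/11, −5/11
```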
11.5.1. u(x) = (e^{3x/2} − e^{−3x/2})/(e³ − e^{−3}) = sinh(3x/2)/sinh 3; yes, the solution is unique.
11.5.2. True — the solution is u(x) = 1.
11.5.3. The Green's function superposition formula is u(x) = ∫₀^{1/2} G(x, y) dy − ∫_{1/2}^{1} G(x, y) dy. If x ≤ ½, then
u(x) = ∫₀ˣ (sinh ω(1 − x) sinh ωy)/(ω sinh ω) dy + ∫ₓ^{1/2} (sinh ωx sinh ω(1 − y))/(ω sinh ω) dy − ∫_{1/2}^{1} (sinh ωx sinh ω(1 − y))/(ω sinh ω) dy
= 1/ω² − (e^{ωx} + e^{ω/2}e^{−ωx})/(ω²(1 + e^{ω/2})),
while if x ≥ ½, then
u(x) = ∫₀^{1/2} (sinh ω(1 − x) sinh ωy)/(ω sinh ω) dy − ∫_{1/2}^{x} (sinh ω(1 − x) sinh ωy)/(ω sinh ω) dy − ∫ₓ¹ (sinh ωx sinh ω(1 − y))/(ω sinh ω) dy
= −1/ω² + (e^{−ω/2}e^{ωx} + e^{ω}e^{−ωx})/(ω²(1 + e^{ω/2})).
11.5.4.
(a) G(x, y) = { sinh ωx cosh ω(1 − y)/(ω cosh ω), x < y;  cosh ω(1 − x) sinh ωy/(ω cosh ω), x > y. }
(b) If x ≤ ½, then
u(x) = ∫₀ˣ (cosh ω(1 − x) sinh ωy)/(ω cosh ω) dy + ∫ₓ^{1/2} (sinh ωx cosh ω(1 − y))/(ω cosh ω) dy − ∫_{1/2}^{1} (sinh ωx cosh ω(1 − y))/(ω cosh ω) dy
= 1/ω² − ( (e^{ω/2} − e^{−ω/2} + e^{−ω})e^{ωx} + (e^{ω} − e^{ω/2} + e^{−ω/2})e^{−ωx} )/( ω²(e^{ω} + e^{−ω}) ),
while if x ≥ ½, then
u(x) = ∫₀^{1/2} (cosh ω(1 − x) sinh ωy)/(ω cosh ω) dy − ∫_{1/2}^{x} (cosh ω(1 − x) sinh ωy)/(ω cosh ω) dy − ∫ₓ¹ (sinh ωx cosh ω(1 − y))/(ω cosh ω) dy
= −1/ω² + ( (e^{−ω/2} − e^{−ω} + e^{−3ω/2})e^{ωx} + (e^{3ω/2} − e^{ω} + e^{ω/2})e^{−ωx} )/( ω²(e^{ω} + e^{−ω}) ).
11.5.5.
G(x, y) = { cosh ωx cosh ω(1 − y)/(ω sinh ω), x < y;  cosh ω(1 − x) cosh ωy/(ω sinh ω), x > y. }
If x ≤ ½, then
u(x) = ∫₀ˣ (cosh ω(1 − x) cosh ωy)/(ω sinh ω) dy + ∫ₓ^{1/2} (cosh ωx cosh ω(1 − y))/(ω sinh ω) dy − ∫_{1/2}^{1} (cosh ωx cosh ω(1 − y))/(ω sinh ω) dy
= 1/ω² − cosh ωx/(ω² cosh ½ω),
while if x ≥ ½, then
u(x) = ∫₀^{1/2} (cosh ω(1 − x) cosh ωy)/(ω sinh ω) dy − ∫_{1/2}^{x} (cosh ω(1 − x) cosh ωy)/(ω sinh ω) dy − ∫ₓ¹ (cosh ωx cosh ω(1 − y))/(ω sinh ω) dy
= −1/ω² + cosh ω(1 − x)/(ω² cosh ½ω).
This Neumann boundary value problem has a unique solution, since ker K = ker L = {0}. Therefore, the Neumann boundary value problem is positive definite, and hence has a Green's function.
♦ 11.5.6. Since L[u] = (u′, u)ᵀ, clearly ker L = {0}, irrespective of any boundary conditions. Thus, every set of self-adjoint boundary conditions, including the Neumann boundary value problem, leads to a positive definite boundary value problem. The solution u⋆(x) to the homogeneous Neumann boundary value problem minimizes the same functional (11.156) among all u ∈ C²[a, b] satisfying the Neumann boundary conditions u′(a) = u′(b) = 0.
♥ 11.5.7.
(a) The solution is unique provided the homogeneous boundary value problem z″ + λz = 0, z(0) = z(1) = 0, has only the zero solution z(x) ≡ 0, which occurs whenever λ ≠ n²π² for n = 1, 2, 3, …. If λ = n²π², then z(x) = c sin nπx, for any c ≠ 0, is a nonzero solution to the homogeneous boundary value problem, and so can be added to any solution of the inhomogeneous system.
(b) For λ = −ω² < 0,
G(x, y) = { −sinh ω(1 − y) sinh ωx/(ω sinh ω), x < y;  −sinh ω(1 − x) sinh ωy/(ω sinh ω), x > y; }
for λ = 0,
G(x, y) = { x(y − 1), x < y;  y(x − 1), x > y; }
for λ = ω² ≠ n²π² > 0,
G(x, y) = { sin ω(1 − y) sin ωx/(ω sin ω), x < y;  sin ω(1 − x) sin ωy/(ω sin ω), x > y. }
(c) The Fredholm alternative requires the forcing function to be orthogonal to the solutions of the homogeneous boundary value problem, and so
⟨h, sin nπx⟩ = ∫₀¹ h(x) sin nπx dx = 0.
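The λ = 0 Green's function in part (b) can be checked by quadrature; with the illustrative forcing f ≡ 1, superposition should reproduce the solution u(x) = (x² − x)/2 of u″ = 1, u(0) = u(1) = 0:

```python
import numpy as np

# Superposition with the λ = 0 Green's function of 11.5.7(b):
#   G(x,y) = x(y−1) for x < y,  y(x−1) for x > y.
yy = np.linspace(0.0, 1.0, 20001)

def u(x):
    G = np.where(yy > x, x*(yy - 1), yy*(x - 1))
    return float(np.sum(0.5*(G[1:] + G[:-1])*np.diff(yy)))

for xv in (0.25, 0.5, 0.75):
    print(xv, u(xv), (xv**2 - xv)/2)   # the two values agree
```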
♦ 11.5.8. The first term is written as −(d/dx)( p(x) du/dx ) = D*∘D[u], where D[u] = u′, and the adjoint is computed with respect to the weighted L² inner product
⟨⟨v, ṽ⟩⟩ = ∫ₐᵇ p(x)v(x)ṽ(x) dx.
The second term is written as q(x)u = I*∘I[u], where I[u] = u is the identity operator, and the adjoint is computed with respect to the weighted L² inner product
⟨⟨w, w̃⟩⟩ = ∫ₐᵇ q(x)w(x)w̃(x) dx.
Thus, by Exercise 7.5.21, the sum can be written in self-adjoint form
D*∘D[u] + I*∘I[u] = L*∘L[u],  with  L[u] = (D[u], I[u])ᵀ = (u′, u)ᵀ,
taking its values in the Cartesian product space, with inner product exactly given by (11.154).
♥ 11.5.9.
(a) μ(x)[ a(x)u″ + b(x)u′ + c(x)u ] = −[ p(x)u′ ]′ + q(x)u = −p(x)u″ − p′(x)u′ + q(x)u if and only if μa = −p, μb = −p′, μc = q. Thus, (μa)′ = μb, and so the formula for the integrating factor is
μ(x) = exp( ∫ (b(x) − a′(x))/a(x) dx ) = (1/a(x)) exp( ∫ b(x)/a(x) dx ).
(b) (i) μ(x) = e^{2x} yields −(d/dx)( e^{2x} du/dx ) + e^{2x}u = e^{3x};
(ii) μ(x) = x⁻⁴ yields −(d/dx)( (1/x²) du/dx ) + (3/x⁴)u = 1/x⁴;
(iii) μ(x) = −e^{−x} yields −(d/dx)( xe^{−x} du/dx ) − e^{−x}u = 0.
(c) (i) Minimize P[u] = ∫ₐᵇ [ ½e^{2x}u′(x)² + ½e^{2x}u(x)² − e^{3x}u(x) ] dx subject to u(1) = u(2) = 0;
(ii) minimize P[u] = ∫ₐᵇ [ u′(x)²/(2x²) + 3u(x)²/(2x⁴) − u(x)/x⁴ ] dx subject to u(1) = u(2) = 0;
(iii) minimize P[u] = ∫ₐᵇ [ ½xe^{−x}u′(x)² − ½e^{−x}u(x)² ] dx subject to u(1) = u(2) = 0.
11.5.10. Since λ > 0, the general solution to the ordinary differential equation is u(x) = e^{−x}( c₁ cos √λ x + c₂ sin √λ x ). The first boundary condition implies c₁ = 0, while the second implies c₂ = 0 unless sin 2√λ = 0, and so the desired values are λ = ¼n²π² for any positive integer n.
♦ 11.5.11. The solution is
u(x, ε) = 1/ε² − ( (1 − e^{−ε})e^{εx} + (e^{ε} − 1)e^{−εx} )/( ε²(e^{ε} − e^{−ε}) ).
Moreover, by l'Hôpital's Rule, or using Taylor expansions in ε,
lim_{ε→0⁺} u(x, ε) = lim_{ε→0⁺} ( e^{ε} − e^{εx} + e^{ε(x−1)} − e^{−ε} + e^{−εx} − e^{ε(1−x)} )/( ε²(e^{ε} − e^{−ε}) ) = ½x − ½x² = u⋆(x),
which is the solution to the limiting boundary value problem.
♦ 11.5.12. The solution is
u(x, ε) = 1 − ( (1 − e^{−1/ε})/(e^{1/ε} − e^{−1/ε}) ) e^{x/ε} − ( (e^{1/ε} − 1)/(e^{1/ε} − e^{−1/ε}) ) e^{−x/ε}.
To evaluate the limit, we rewrite
lim_{ε→0⁺} u(x, ε) = 1 − lim_{ε→0⁺} ( (1 − e^{−1/ε})/(1 − e^{−2/ε}) ) e^{(x−1)/ε} − lim_{ε→0⁺} ( (1 − e^{−1/ε})/(1 − e^{−2/ε}) ) e^{−x/ε} = { 1, 0 < x < 1;  0, x = 0 or x = 1, }
since lim_{ε→0⁺} e^{−x/ε} = 0 for all x > 0. The convergence is non-uniform, since the limiting function is discontinuous.
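The boundary-layer behavior just described is easy to observe numerically; a sketch using the rewritten form of the solution, which avoids overflow for small ε:

```python
import numpy as np

# u(x, ε) = 1 − A e^{(x−1)/ε} − A e^{−x/ε},  A = (1 − e^{−1/ε})/(1 − e^{−2/ε}),
# as in 11.5.12.
def u(x, eps):
    A = (1 - np.exp(-1/eps))/(1 - np.exp(-2/eps))
    return 1 - A*np.exp((x - 1)/eps) - A*np.exp(-x/eps)

for eps in (0.1, 0.01, 0.001):
    print(eps, u(0.5, eps), u(0.0, eps), u(1.0, eps))
# interior values approach 1, while u(0, ε) = u(1, ε) = 0 for every ε
```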
11.5.13. To prove the first bilinearity condition:
⟨⟨cv + dw, z⟩⟩ = ∫₀¹ [ p(x)(cv₁(x) + dw₁(x))z₁(x) + q(x)(cv₂(x) + dw₂(x))z₂(x) ] dx
= c ∫₀¹ [ p(x)v₁(x)z₁(x) + q(x)v₂(x)z₂(x) ] dx + d ∫₀¹ [ p(x)w₁(x)z₁(x) + q(x)w₂(x)z₂(x) ] dx
= c⟨⟨v, z⟩⟩ + d⟨⟨w, z⟩⟩.
The second has a similar proof, or follows from symmetry as in Exercise 3.1.9. To prove symmetry:
⟨⟨v, w⟩⟩ = ∫₀¹ [ p(x)v₁(x)w₁(x) + q(x)v₂(x)w₂(x) ] dx = ∫₀¹ [ p(x)w₁(x)v₁(x) + q(x)w₂(x)v₂(x) ] dx = ⟨⟨w, v⟩⟩.
As for positivity,
⟨⟨v, v⟩⟩ = ∫₀¹ [ p(x)v₁(x)² + q(x)v₂(x)² ] dx ≥ 0,
since the integrand is a non-negative function. Moreover, since p(x), q(x), v₁(x), v₂(x) are all continuous, so is p(x)v₁(x)² + q(x)v₂(x)², and hence ⟨⟨v, v⟩⟩ = 0 if and only if p(x)v₁(x)² + q(x)v₂(x)² ≡ 0. Since p(x) > 0, q(x) > 0, this implies v(x) = ( v₁(x), v₂(x) )ᵀ ≡ 0.
♦ 11.5.14.
(a) sinh α cosh β + cosh α sinh β = ¼(e^α − e^{−α})(e^β + e^{−β}) + ¼(e^α + e^{−α})(e^β − e^{−β}) = ½(e^{α+β} − e^{−α−β}) = sinh(α + β).
(b) cosh α cosh β + sinh α sinh β = ¼(e^α + e^{−α})(e^β + e^{−β}) + ¼(e^α − e^{−α})(e^β − e^{−β}) = ½(e^{α+β} + e^{−α−β}) = cosh(α + β).
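Both addition formulas can be spot-checked numerically; a minimal sketch:

```python
import math
import random

# Spot-check of the hyperbolic addition formulas verified in 11.5.14.
random.seed(1)
for _ in range(100):
    a, b = random.uniform(-3, 3), random.uniform(-3, 3)
    assert math.isclose(math.sinh(a + b),
                        math.sinh(a)*math.cosh(b) + math.cosh(a)*math.sinh(b),
                        rel_tol=1e-12, abs_tol=1e-12)
    assert math.isclose(math.cosh(a + b),
                        math.cosh(a)*math.cosh(b) + math.sinh(a)*math.sinh(b),
                        rel_tol=1e-12)
print("ok")
```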
♣ 11.6.1. Exact solution: u(x) = 2e²(eˣ − 1)/(e² − 1) − xeˣ. The graphs compare the exact and finite element approximations to the solution for 6, 11, and 21 nodes: [graphs]
The respective maximal overall errors are .1973, .05577, .01476. Thus, halving the nodal spacing apparently reduces the error by a factor of 4.
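A minimal piecewise-linear finite element solve, in the spirit of these experiments; the model problem −u″ = 1, u(0) = u(1) = 0, with exact solution u(x) = x(1 − x)/2, is an illustrative choice, not the boundary value problem of 11.6.1:

```python
import numpy as np

# Piecewise-linear (hat function) finite elements on a uniform mesh.
n = 10                           # number of subintervals
h = 1.0/n
K = (np.diag(2*np.ones(n-1)) - np.diag(np.ones(n-2), 1)
     - np.diag(np.ones(n-2), -1))/h     # stiffness matrix
b = h*np.ones(n-1)               # exact load: ∫ f φ_i dx = h for f ≡ 1
c = np.linalg.solve(K, b)        # interior nodal values

x = np.linspace(h, 1 - h, n-1)
print(np.max(np.abs(c - x*(1 - x)/2)))   # ≈ 0: nodal values are exact here
```

For this particular model problem, the finite element solution with exactly integrated load happens to interpolate the true solution at the nodes, which makes it a convenient correctness check.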
♣ 11.6.2.
(a) Solution:
u(x) = { ¼x, 0 ≤ x ≤ 1;  ¼x − ½(x − 1)², 1 ≤ x ≤ 2; }
finite element sample values: (0., .06, .12, .18, .24, .3, .32, .3, .24, .14, 0.);
maximal error at sample points: .05; maximal overall error: .05. [graph]
(b) Solution: u(x) = log(x + 1)/log 2 − x; [graph]
finite element sample values: (0., .03746, .06297, .07844, .08535, .08489, .07801, .06549, .04796, .02598, 0.);
maximal error at sample points: .00007531; maximal overall error: .001659.
(c) Solution: u(x) = ½x − 2 + (3/2)x⁻¹; [graph]
finite element sample values: (0., −.1482, −.2264, −.2604, −.2648, −.2485, −.2170, −.1742, −.1225, −.0640, 0.);
maximal error at sample points: .002175; maximal overall error: .01224.
(d) Solution: u(x) = (e² + 1 − 2e^{1−x})/(e² − 1) − x; [graph]
finite element sample values: (0., .2178, .3602, .4407, .4706, .4591, .4136, .3404, .2444, .1298, 0.);
maximal error at sample points: .003143; maximal overall error: .01120.
♣ 11.6.3.
(a) The finite element sample values are c = (0, .096, .168, .192, .144, 0)ᵀ.
(b) [graphs]
(c) (i) The maximal error at the mesh points is 2.7756 × 10⁻¹⁷, almost 0! (ii) The maximal error on the interval using the piecewise affine interpolant is .0135046.
(d) (i) The maximal error at the mesh points is the same 2.7756 × 10⁻¹⁷; (ii) the maximal error on the interval using the spline interpolant is 5.5511 × 10⁻¹⁷, making it, for all practical purposes, identical with the exact solution.
♣ 11.6.4. The only difference is the last basis function, which should be changed to
ϕ_{n−1}(x) = { (x − x_{n−2})/(x_{n−1} − x_{n−2}), x_{n−2} ≤ x ≤ x_{n−1};  1, x_{n−1} ≤ x ≤ b;  0, x ≤ x_{n−2}, }
in order to satisfy the free boundary condition at x_n = b. This only affects the bottom right entry
m_{n−1,n−1} = s_{n−2}/h²
of the finite element matrix (11.169) and the last entry
b_{n−1} = (1/h) ∫_{x_{n−2}}^{x_{n−1}} (x − x_{n−2}) f(x) dx + ∫_{x_{n−1}}^{x_n} f(x) dx ≈ h f(x_{n−1}) + (1/2)h f(x_n),
of the vector (11.172). For the particular boundary value problem, the exact solution is u(x) = 4 log(x + 1) − x. We graph the finite element approximation and then a comparison with the solution:
[graphs omitted]
They are almost identical; indeed, the maximal error on the interval is .003188.
♣ 11.6.5.
(a) u(x) = x + (π e^x)/(1 − e^{2π}) + (π e^{−x})/(1 − e^{−2π}).
(b) Minimize P[u] = ∫₀^{2π} [ (1/2)u′(x)² + (1/2)u(x)² − x u(x) ] dx over all C² functions u(x) that satisfy the boundary conditions u(0) = u(2π), u′(0) = u′(2π).
(c) dim W₅ = 4, since any piecewise affine function ϕ(x) that satisfies the two boundary conditions is uniquely determined by its 4 interior sample values c_1 = ϕ(x_1), …, c_4 = ϕ(x_4), with c_0 = ϕ(x_0) = (1/2)(c_1 + c_4) then determined so that ϕ(x_0) = ϕ(x_5) and ϕ′(x_0) = ϕ′(x_5). Thus, a basis consists of the following four functions with the listed interpolation values:
ϕ_1 : 1/2, 1, 0, 0, 0;  ϕ_2 : 0, 0, 1, 0, 0;  ϕ_3 : 0, 0, 0, 1, 0;  ϕ_4 : 1/2, 0, 0, 0, 1. [graphs omitted]
(d) n = 5: maximal error .9435 [graphs omitted]
(e) n = 10: maximal error .6219 [graphs omitted]
n = 20: maximal error .3792 [graphs omitted]
n = 40: maximal error .2144 [graphs omitted]
Each decrease in the step size by 1/2 decreases the maximal error by slightly less than 1/2.
♣ 11.6.6.
(c) dim W₅ = 5, since a piecewise affine function ϕ(x) that satisfies the two boundary conditions is uniquely determined by its values c_j = ϕ(x_j), j = 0, …, 4. A basis consists of the 5 functions interpolating the values ϕ_i(x_j) = { 1, i = j; 0, i ≠ j } for 0 ≤ i, j < 5, and ϕ_i(2π) = ϕ_i(x_5) = ϕ_i(x_0) = ϕ_i(0). The basis functions are graphed below:
[graphs of ϕ_0, …, ϕ_4 omitted]
(d) n = 5: maximal error 1.8598 [graphs omitted]
(e) n = 10: maximal error .9744 [graphs omitted]
n = 20: maximal error .4933 [graphs omitted]
n = 40: maximal error .2474 [graphs omitted]
Each decrease in the step size by 1/2 also decreases the maximal error by about 1/2. However, the approximations in Exercise 11.6.5 are slightly more accurate.
♣ 11.6.7. n = 5: maximum error .1117;  n = 10: maximum error .02944;  n = 20: maximum error .00704. [graphs omitted]
11.6.8.
(a) L is the lower bidiagonal matrix with 1's on the main diagonal and subdiagonal entries −1/2, −2/3, −3/4, …, while D = (1/h) diag( 2, 3/2, 4/3, 5/4, … ); since all pivots are positive, the matrix is positive definite.
(b) By Exercise 8.2.48, the eigenvalues of M are λ_k = 2 − 2 cos(kπ/(n + 1)) > 0 for k = 1, …, n. Since M is symmetric and its eigenvalues are all positive, the matrix is positive definite.
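The eigenvalue formula of Exercise 8.2.48 for the tridiagonal matrix with 2 on the diagonal and −1 on the off-diagonals is easy to confirm numerically; a sketch (the size n = 8 is arbitrary):

```python
import numpy as np

n = 8
# Tridiagonal matrix with 2 on the diagonal and -1 on the sub/super-diagonals.
M = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
computed = np.sort(np.linalg.eigvalsh(M))
predicted = np.sort([2 - 2 * np.cos(k * np.pi / (n + 1)) for k in range(1, n + 1)])
assert np.allclose(computed, predicted)
assert computed.min() > 0   # all eigenvalues positive, so M is positive definite
```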
11.6.9. (a) No. (b) Yes. The Jacobi iteration matrix T, cf. (10.65), is the tridiagonal matrix with 0's on the main diagonal and 1/2's on the sub- and super-diagonals, and hence, according to Exercise 8.2.48, has eigenvalues λ_k = cos(kπ/(n + 1)), k = 1, …, n. Thus, its spectral radius, and hence the rate of convergence of the iteration, is ρ(T) = cos(π/(n + 1)) < 1, proving convergence.
♣ 11.6.10.
(a) A basis for W_n consists of the n − 1 polynomials ϕ_k(x) = x^k(x − 1) = x^{k+1} − x^k for k = 1, …, n − 1.
(b) The matrix entries are
m_{ij} = ⟨⟨L[ϕ_i], L[ϕ_j]⟩⟩ = ⟨⟨ϕ′_i, ϕ′_j⟩⟩ = ∫₀¹ [ (i + 1)x^i − i x^{i−1} ][ (j + 1)x^j − j x^{j−1} ](x + 1) dx
= (4i²j + 4ij² + 4ij + j² + i² − i − j) / ((i + j − 1)(i + j)(i + j + 1)(i + j + 2)),  1 ≤ i, j ≤ n − 1,
while the right-hand side vector has entries
b_i = ⟨f, ϕ_i⟩ = ∫₀¹ ( x^{i+1} − x^i ) dx = −1/(i² + 3i + 2),  i = 1, …, n − 1.
Solving Mc = b, the computed solutions v(x) = Σ_{k=1}^{n−1} c_k ϕ_k(x) are almost identical to the exact solution: for n = 5, the maximal error is 2.00 × 10^−5, for n = 10, the maximal error is 1.55 × 10^−9, and for n = 20, the maximal error is 1.87 × 10^−10. Thus, the polynomial finite element method gives much closer approximations, although solving the linear system is (slightly) harder since the coefficient matrix is not sparse.
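The closed-form expression for m_{ij} can be verified against exact term-by-term integration of the product, since ∫₀¹ x^m dx = 1/(m + 1); a sketch using exact rational arithmetic:

```python
from fractions import Fraction

def m_exact(i, j):
    """Integrate [(i+1)x^i - i x^(i-1)][(j+1)x^j - j x^(j-1)](x+1) over [0,1]
    exactly, term by term."""
    total = Fraction(0)
    for a, p in [(i + 1, i), (-i, i - 1)]:
        for b, q in [(j + 1, j), (-j, j - 1)]:
            # multiplying by (x + 1) gives exponents p+q+1 and p+q
            total += a * b * (Fraction(1, p + q + 2) + Fraction(1, p + q + 1))
    return total

def m_formula(i, j):
    num = 4*i*i*j + 4*i*j*j + 4*i*j + j*j + i*i - i - j
    den = (i + j - 1) * (i + j) * (i + j + 1) * (i + j + 2)
    return Fraction(num, den)

for i in range(1, 6):
    for j in range(1, 6):
        assert m_exact(i, j) == m_formula(i, j)
```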
♣ 11.6.11.
(a) There is a unique solution provided λ ≠ −n² for n = 1, 2, 3, …, namely
u(x) = { x/λ − π sinh(√λ x)/(λ sinh(√λ π)), λ > 0;
         (1/6)π² x − (1/6)x³, λ = 0;
         x/λ − π sin(√−λ x)/(λ sin(√−λ π)), −n² ≠ λ < 0. }
When λ = −n², the boundary value problem has no solution.
(b) The minimization principle P[u] = ∫₀^π [ (1/2)u′(x)² + (1/2)λ u(x)² − x u(x) ] dx over all C² functions u(x) that satisfy the boundary conditions u(0) = 0, u(π) = 0, is only valid for λ > 0, in which case the minimizer is unique. Otherwise, the functional has no minimum.
(c) Let h = π/n. The finite element equations are Mc = b, where M is the (n − 1) × (n − 1) tridiagonal matrix whose diagonal entries are 2/h + (2/3)hλ and whose sub- and super-diagonal entries are −1/h + (1/6)hλ.
(d) According to Exercise 8.2.48, the eigenvalues of M are
2/h + (2/3)hλ + ( −2/h + (1/3)hλ ) cos kh,  k = 1, …, n − 1.
The finite element system has a solution if and only if the matrix is nonsingular; it is singular if and only if 0 is an eigenvalue, which occurs when λ = (6/h²)(cos kh − 1)/(cos kh + 2) ≈ −k², provided kh ≪ 1, and so the eigenvalues of the finite element matrix converge to the eigenvalues of the boundary value problem. Interestingly, the finite element solution converges to the actual solution, provided one exists, even when the boundary value problem is not positive definite.
(e–f ) Here are some sample plots comparing the finite element approximant with the actual solution. First, for λ > 0, the boundary value problem is positive definite, and we expect convergence to the unique solution.
λ = 1, n = 5, maximal error .1062;  λ = 1, n = 10, maximal error .0318;  λ = 100, n = 10, maximal error .00913;  λ = 100, n = 30, maximal error .00237. [graphs omitted]
Then, for negative λ not near an eigenvalue, the convergence rate is similar:
λ = −.5, n = 5, maximal error .2958;  λ = −.5, n = 10, maximal error .04757. [graphs omitted]
Then, for λ very near an eigenvalue (in this case −1), convergence is much slower, but still occurs. Note that the solution is very large:
λ = −.99, n = 10, maximal error 76.0308;  λ = −.99, n = 50, maximal error 5.333;  λ = −.99, n = 50, maximal error .4124. [graphs omitted]
On the other hand, when λ is an eigenvalue, the finite element solutions don't converge. Note the scales on the two graphs:
λ = −1, n = 10, maximum value 244.3;  λ = −1, n = 50, maximum value 6080.4. [graphs omitted]
The final example shows convergence even for large negative λ. The convergence is slow because λ = −50 is near the eigenvalue −49:
λ = −50, n = 10, maximal error .3804;  λ = −50, n = 50, maximal error 1.1292;  λ = −50, n = 200, maximal error .0153. [graphs omitted]
♥ 11.6.12.
(a) We define f(x) = u_i + ((u_{i+1} − u_i)/(x_{i+1} − x_i))(x − x_i) for x_i ≤ x ≤ x_{i+1}.
(b) Clearly, since each hat function is piecewise affine, any linear combination is also piecewise affine. Since ϕ_i(x_j) = { 1, i = j; 0, i ≠ j }, we have u(x_j) = Σ_{i=0}^n u_i ϕ_i(x_j) = u_j, and so u(x) has the correct interpolation values.
(c) u(x) = 2ϕ_0(x) + 3ϕ_1(x) + 6ϕ_2(x) + 11ϕ_3(x) = { x + 2, 0 ≤ x ≤ 1;  3x, 1 ≤ x ≤ 2;  5x − 4, 2 ≤ x ≤ 3. } [graph omitted]
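The hat-function combination of part (c) can be evaluated directly; a minimal sketch (the nodes 0, 1, 2, 3 and values 2, 3, 6, 11 are taken from the exercise):

```python
def hat(j, nodes, x):
    """Hat (piecewise affine) basis function phi_j for the given nodes."""
    if j > 0 and nodes[j - 1] <= x <= nodes[j]:
        return (x - nodes[j - 1]) / (nodes[j] - nodes[j - 1])
    if j < len(nodes) - 1 and nodes[j] <= x <= nodes[j + 1]:
        return (nodes[j + 1] - x) / (nodes[j + 1] - nodes[j])
    return 0.0

def interpolant(values, nodes, x):
    return sum(v * hat(j, nodes, x) for j, v in enumerate(values))

nodes, values = [0, 1, 2, 3], [2, 3, 6, 11]
# Agrees with the piecewise formula: x+2 on [0,1], 3x on [1,2], 5x-4 on [2,3].
assert interpolant(values, nodes, 0.5) == 2.5    # 0.5 + 2
assert interpolant(values, nodes, 1.5) == 4.5    # 3 * 1.5
assert interpolant(values, nodes, 2.5) == 8.5    # 5 * 2.5 - 4
```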
♦ 11.6.13.
(a) If α(x) is any function that is continuously differentiable at x_j, then α′(x_j^+) − α′(x_j^−) = 0. Now, each term in the sum is continuously differentiable at x = x_j except for α_j(x) = c_j |x − x_j|, and so f′(x_j^+) − f′(x_j^−) = α′_j(x_j^+) − α′_j(x_j^−) = 2c_j, proving the formula. Further, a = f′(x_0^−) + Σ_{i=1}^n c_i, and b = f(x_0) − a x_0 − Σ_{i=1}^n c_i |x_0 − x_i|.
(b) ϕ_j(x) = (1/2h)|x − x_{j−1}| − (1/h)|x − x_j| + (1/2h)|x − x_{j+1}|.
If u(x) = 0 for x < 0 or x > 3, then
u(x) = 13/2 + (1/2)|x| + |x − 1| + |x − 2| − (5/2)|x − 3|.
More generally, for any constants c, d, we can set
u(x) = (3 − c + d)x − (3d + 1) + c|x| + |x − 1| + |x − 2| + d|x − 3|.
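The general formula reproduces the nodal values 2, 3, 6, 11 of Exercise 11.6.12(c) at x = 0, 1, 2, 3 for any choice of c and d; a sketch (the particular test values of c and d are arbitrary — note that c = 1/2, d = −5/2 recovers the first formula above):

```python
def u(x, c, d):
    # General absolute-value representation of the interpolant of 11.6.12(c).
    return ((3 - c + d) * x - (3 * d + 1)
            + c * abs(x) + abs(x - 1) + abs(x - 2) + d * abs(x - 3))

# The nodal values are independent of c and d.
for c, d in [(0.5, -2.5), (0.0, 0.0), (7.0, -3.0)]:
    assert [u(x, c, d) for x in (0, 1, 2, 3)] == [2, 3, 6, 11]
```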