
Chapter 4

Eigenvalues

1 The Spectral Theorem

Hermitian Spaces

Given a C-vector space V, an Hermitian inner product in V is defined as a Hermitian symmetric sesquilinear form such that the corresponding Hermitian quadratic form is positive definite. A space V equipped with an Hermitian inner product 〈·, ·〉 is called a Hermitian space.1

The inner square 〈z, z〉 is interpreted as the square of the length |z| of the vector z. Respectively, the distance between two points z and w in an Hermitian space is defined as |z − w|. Since the Hermitian inner product is positive, distance is well-defined, symmetric, and positive (unless z = w). In fact it satisfies the triangle inequality2:

|z−w| ≤ |z|+ |w|.

This follows from the Cauchy – Schwarz inequality:

|〈z,w〉|2 ≤ 〈z, z〉 〈w,w〉,

where the equality holds if and only if z and w are linearly dependent. To derive the triangle inequality, write:

|z − w|2 = 〈z − w, z − w〉 = 〈z, z〉 − 〈z,w〉 − 〈w, z〉 + 〈w,w〉 ≤ |z|2 + 2|z||w| + |w|2 = (|z| + |w|)2.

1 Other terms used are unitary space and finite dimensional Hilbert space.
2 This makes a Hermitian space a metric space.


To prove the Cauchy–Schwarz inequality, note that it suffices to consider the case |w| = 1. Indeed, when w = 0, both sides vanish, and when w ≠ 0, both sides scale the same way when w is normalized to the unit length. So, assuming |w| = 1, we put λ := 〈w, z〉 and consider the projection λw of the vector z to the line spanned by w. The difference z − λw is orthogonal to w: 〈w, z − λw〉 = 〈w, z〉 − λ〈w,w〉 = 0. From positivity of inner squares, we have:

0 ≤ 〈z − λw, z − λw〉 = 〈z, z − λw〉 = 〈z, z〉 − λ〈z,w〉.

Since 〈z,w〉 = λ̄ (the complex conjugate of 〈w, z〉 = λ), we have λ〈z,w〉 = |〈z,w〉|2, and conclude that |z|2 ≥ |〈z,w〉|2 as required. Notice that the equality holds true only when z = λw.

All Hermitian spaces of the same dimension are isometric (or Hermitian isomorphic), i.e. isomorphic through isomorphisms respecting Hermitian inner products. Namely, as it follows from the Inertia Theorem for Hermitian forms, every Hermitian space has an orthonormal basis, i.e. a basis e1, . . . , en such that 〈ei, ej〉 = 0 for i ≠ j and = 1 for i = j. In the coordinate system corresponding to an orthonormal basis, the Hermitian inner product takes on the standard form:

〈z,w〉 = z̄1w1 + · · · + z̄nwn.

An orthonormal basis is not unique. Moreover, as it follows from the proof of Sylvester's rule, one can start with any basis f1, . . . , fn in V and then construct an orthonormal basis e1, . . . , en such that ek ∈ Span(f1, . . . , fk). This is done inductively; namely, when e1, . . . , ek−1 have already been constructed, one subtracts from fk its projection to the space Span(e1, . . . , ek−1):

f̃k = fk − 〈e1, fk〉e1 − · · · − 〈ek−1, fk〉ek−1.

The resulting vector f̃k lies in Span(f1, . . . , fk−1, fk) and is orthogonal to Span(f1, . . . , fk−1) = Span(e1, . . . , ek−1). Indeed,

〈ei, f̃k〉 = 〈ei, fk〉 − ∑_{j=1}^{k−1} 〈ej, fk〉〈ei, ej〉 = 0

for all i = 1, . . . , k − 1. To construct ek, one normalizes f̃k to the unit length:

ek := f̃k/|f̃k|.


The above algorithm of replacing a given basis with an orthonormal one is known as Gram–Schmidt orthogonalization.
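For readers who want to experiment, here is a minimal computational sketch of the algorithm in the coordinate Hermitian space Cn (assuming numpy is available; np.vdot conjugates its first argument, which matches the convention that 〈w, z〉 is anti-linear in w). It is applied below to the basis of Exercise 318.

    import numpy as np

    def gram_schmidt(f):
        """Turn rows f_1, ..., f_n (a basis of C^n) into an orthonormal basis
        e_1, ..., e_n with e_k in Span(f_1, ..., f_k)."""
        e = []
        for fk in f:
            # subtract from f_k its projection to Span(e_1, ..., e_{k-1})
            tilde = fk - sum(np.vdot(ei, fk) * ei for ei in e)
            e.append(tilde / np.linalg.norm(tilde))
        return np.array(e)

    # the basis of Exercise 318: f1 = e1 + 2i e2 + 2i e3, f2 = e1 + 2i e2, f3 = e1
    f = np.array([[1, 2j, 2j], [1, 2j, 0], [1, 0, 0]], dtype=complex)
    e = gram_schmidt(f)
    print(np.round(e.conj() @ e.T, 10))   # the Gram matrix <e_i, e_j>: the identity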

EXERCISES

316. Prove that if two vectors u and v in an Hermitian space are orthogonal, then |u|2 + |v|2 = |u − v|2. Is the converse true?

317. Prove that for any vectors u, v in an Hermitian space, |u + v|2 + |u − v|2 = 2|u|2 + 2|v|2. Find a geometric interpretation of this fact.

318. Apply Gram–Schmidt orthogonalization to the basis f1 = e1+2ie2+2ie3, f2 = e1 + 2ie2, f3 = e1 in the coordinate Hermitian space C3.

319. Apply Gram–Schmidt orthogonalization to the standard basis e1, e2 of C2 to construct an orthonormal basis of the Hermitian inner product 〈z,w〉 = z̄1w1 + 2z̄1w2 + 2z̄2w1 + 5z̄2w2.

320. Let v ∈ V be a vector in an Hermitian space, and e1, . . . , ek an orthonormal basis in a subspace W. Prove that u = ∑ 〈ei,v〉ei is the point of W closest to v, and that v − u is orthogonal to W. (The point u ∈ W is called the orthogonal projection of v to W.)

321.⋆ Let f1, . . . , fN be a finite sequence of vectors in an Hermitian space. The Hermitian N × N-matrix 〈fi, fj〉 is called the Gram matrix of the sequence. Show that two finite sequences of vectors are isometric, i.e. obtained from each other by a unitary transformation, if and only if their Gram matrices are the same.

Normal Operators

Our next point is that an Hermitian inner product on a complex vector space allows one to identify sesquilinear forms on it with linear transformations.

Let V be an Hermitian vector space, and T : V → V a C-linear transformation. Then the function V × V → C

T (w, z) := 〈w, Tz〉

is C-linear in z and anti-linear in w, i.e. it is sesquilinear.

In coordinates, w = ∑_i wiei, z = ∑_j zjej, and

〈w, Tz〉 = ∑_{i,j} w̄i 〈ei, Tej〉 zj,

i.e., the values T(ei, ej) = 〈ei, Tej〉 form the coefficient matrix of the sesquilinear form. On the other hand, Tej = ∑_i tijei, where [tij] is the matrix


of the linear transformation T with respect to the basis (e1, . . . , en). Note that if the basis is orthonormal, then 〈ei, Tej〉 = tij, i.e. the two matrices coincide.

Since tij could be arbitrary, it follows that every sesquilinear form on V is uniquely represented by a linear transformation.

Earlier we have associated with a sesquilinear form its Hermitian adjoint (by changing the order of the arguments and conjugating the value). When the sesquilinear form is obtained from a linear transformation T, the adjoint corresponds to another linear transformation denoted T† and called Hermitian adjoint to T. Thus, by definition,

〈w, Tz〉 = 〈T†w, z〉 for all z,w ∈ V.

Of course, we also have 〈Tw, z〉 = 〈w, T†z〉 (check this!), and either identity completely characterizes T† in terms of T.

The matrix of T† in an orthonormal basis is obtained from that of T by complex conjugation and transposition:

t†ij := 〈ei, T†ej〉 = 〈Tei, ej〉 = t̄ji.
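As a quick numerical illustration (a sketch assuming numpy; the matrix below is just a random example), one can verify the defining identity 〈w, Tz〉 = 〈T†w, z〉 with T† realized as the conjugate transpose in the standard orthonormal basis of C3:

    import numpy as np

    rng = np.random.default_rng(0)
    T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    T_dag = T.conj().T                      # conjugate transpose = Hermitian adjoint
    w = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    z = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    # np.vdot conjugates its first argument, i.e. computes <a, b>
    print(np.allclose(np.vdot(w, T @ z), np.vdot(T_dag @ w, z)))   # True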

Definition. A linear transformation on an Hermitian vector space is called normal (or a normal operator) if it commutes with its adjoint: T†T = TT†.

Example 1. A scalar operator is normal. Indeed, (λI)† = λ̄I, which is also scalar, and scalars commute.

Example 2. A linear transformation on an Hermitian space is called Hermitian if it coincides with its Hermitian adjoint: S† = S. A Hermitian operator3 is normal.

Example 3. A linear transformation is called anti-Hermitian if it is opposite to its adjoint: A† = −A. Multiplying an Hermitian operator by √−1 yields an anti-Hermitian one, and vice versa (because (√−1 I)† = −√−1 I). Anti-Hermitian operators are normal.

Example 4. Every linear transformation T : V → V can be uniquely written as the sum of Hermitian and anti-Hermitian operators: T = S + Q, where S = (T + T†)/2 = S†, and Q = (T − T†)/2 = −Q†. We claim that an operator is normal whenever its Hermitian and anti-Hermitian parts commute. Indeed, T† = S − Q, and

TT † − T †T = (S +Q)(S −Q)− (S −Q)(S +Q) = 2(QS − SQ).

3 The term operator in Hermitian geometry is synonymous to linear map.


Example 5. An invertible linear transformation U : V → V is called unitary if it preserves inner products:

〈Uw, Uz〉 = 〈w, z〉 for all w, z ∈ V.

Equivalently, 〈w, (U†U − I)z〉 = 0 for all w, z ∈ V. Taking w = (U†U − I)z, we conclude that (U†U − I)z = 0 for all z ∈ V, and hence U†U = I. Thus, for a unitary map U, U−1 = U†. The converse statement is also true (and easy to check by starting from U−1 = U† and reversing our computation). Since every invertible transformation commutes with its own inverse, we conclude that unitary transformations are normal.
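The decomposition of Example 4 and the normality criterion are easy to check numerically; here is a small sketch (assuming numpy) that splits a random matrix into its Hermitian and anti-Hermitian parts and compares TT† − T†T with the commutator of the parts:

    import numpy as np

    rng = np.random.default_rng(1)
    T = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    S = (T + T.conj().T) / 2                # Hermitian part:      S = S^dagger
    Q = (T - T.conj().T) / 2                # anti-Hermitian part: Q = -Q^dagger
    lhs = T @ T.conj().T - T.conj().T @ T   # T T^dagger - T^dagger T
    rhs = 2 * (Q @ S - S @ Q)               # the identity derived in Example 4
    print(np.allclose(lhs, rhs))            # True for every T
    print(np.allclose(lhs, 0))              # False: a random T is not normal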

EXERCISES

322. Generalize the construction of Hermitian adjoint operators to the case of operators A : V → W between two different Hermitian spaces. Namely, show that A† : W → V is uniquely determined by the identity 〈A†w,v〉V = 〈w, Av〉W for all v ∈ V and w ∈ W.

323. Show that the matrices of A : V → W and A† : W → V in orthonormal bases of V and W are obtained from each other by transposition and complex conjugation.

324. The trace of a square matrix A is defined as the sum of its diagonal entries, and is denoted trA. Prove that 〈A,B〉 := tr(A†B) defines an Hermitian inner product on the space Hom(Cn,Cm) of m × n-matrices.

325. Let A1, . . . , Ak : V → W be linear maps between Hermitian spaces. Prove that if ∑_i A†iAi = 0, then A1 = · · · = Ak = 0.

326. Let A : V → W be a linear map between Hermitian spaces. Show that B := A†A and C = AA† are Hermitian, and that the corresponding Hermitian forms B(x,x) := 〈x, Bx〉 in V and C(y,y) := 〈y, Cy〉 in W are non-negative. Under what hypothesis about A is the 1st of them positive? the 2nd one? both?

327. Let W ⊂ V be a subspace in an Hermitian space, and let P : V → V be the map that to each vector v ∈ V assigns its orthogonal projection to W. Prove that P is an Hermitian operator, that P 2 = P, and that KerP = W⊥. (It is called the orthogonal projector to W.)

328. Prove that an n × n-matrix is unitary if and only if its rows (or columns) form an orthonormal basis in the coordinate Hermitian space Cn.

329. Prove that the determinant of a unitary matrix is a complex number of absolute value 1.

330. Prove that the Cayley transform: C 7→ (I − C)/(I + C), well-defined for linear transformations C such that I + C is invertible, transforms unitary operators into anti-Hermitian and vice versa. Compute the square of the Cayley transform.

331. Prove that the commutator AB − BA of anti-Hermitian operators A and B is anti-Hermitian.

332. Give an example of a normal 2 × 2-matrix which is not Hermitian, anti-Hermitian, unitary, or diagonal.

333. Prove that for any n × n-matrix A and any complex numbers α, β of absolute value 1, the matrix αA + βA† is normal.

334. Prove that A : V → V is normal if and only if |Ax| = |A†x| for all x ∈ V.

The Spectral Theorem for Normal Operators

Let A : V → V be a linear transformation, v ∈ V a vector, and λ ∈ C a scalar. The vector v is called an eigenvector of A with the eigenvalue λ, if v ≠ 0, and Av = λv. In other words, A preserves the line spanned by the vector v and acts on this line as the multiplication by λ.

Theorem. A linear transformation A : V → V on a finite dimensional Hermitian vector space is normal if and only if V has an orthonormal basis of eigenvectors of A.

Proof. In one direction, the statement is almost obvious: If a basis consists of eigenvectors of A, then the matrix of A in this basis is diagonal. When the basis is orthonormal, the matrix of the Hermitian adjoint operator A† in this basis is Hermitian adjoint to the matrix of A and is also diagonal. Since all diagonal matrices commute, we conclude that A is normal. Thus, it remains to prove that, conversely, every normal operator has an orthonormal basis of eigenvectors. We will prove this in four steps.

Step 1. Existence of eigenvalues. We need to show that there exists a scalar λ ∈ C such that the system of linear equations Ax = λx has a non-trivial solution. Equivalently, this means that the linear transformation λI − A has a non-trivial kernel. Since V is finite dimensional, this can be re-stated in terms of the determinant of the matrix of A (in any basis) as

det(λI −A) = 0.

This relation, understood as an equation for λ, is called the characteristic equation of the operator A. When A = 0, it becomes λn = 0, where n = dimV. In general, it is a degree-n polynomial equation

λn + p1λn−1 + · · · + pn−1λ+ pn = 0,

where the coefficients p1, . . . , pn are certain algebraic expressions of matrix entries of A (and hence are complex numbers). According to the Fundamental Theorem of Algebra, this equation has a complex solution, say λ0. Then det(λ0I − A) = 0, and hence the system (λ0I − A)x = 0 has a non-trivial solution, v ≠ 0, which is therefore an eigenvector of A with the eigenvalue λ0.

Remark. Solutions to the system Ax = λ0x form a linear subspace W in V, namely the kernel of λ0I − A, and eigenvectors of A with the eigenvalue λ0 are exactly all non-zero vectors in W. Slightly abusing terminology, W is called the eigenspace of A corresponding to the eigenvalue λ0. Obviously, A(W) ⊂ W. Subspaces with such property are called A-invariant. Thus eigenspaces of a linear transformation A are A-invariant.

Step 2. A†-invariance of eigenspaces of A. Let W ≠ {0} be the eigenspace of a normal operator A, corresponding to an eigenvalue λ. Then for every w ∈ W,

A(A†w) = A†(Aw) = A†(λw) = λ(A†w).

Therefore A†w ∈ W, i.e. the eigenspace W is A†-invariant.

Step 3. Invariance of orthogonal complements. Let W ⊂ V be a linear subspace. Denote by W⊥ the orthogonal complement of the subspace W with respect to the Hermitian inner product:

W⊥ := {v ∈ V | 〈w,v〉 = 0 for all w ∈ W}.

Note that if e1, . . . , ek is a basis in W, then W⊥ is given by k linear equations 〈ei,v〉 = 0, i = 1, . . . , k, and thus has dimension ≥ n − k. On the other hand, W ∩ W⊥ = {0}, because no vector w ≠ 0 can be orthogonal to itself: 〈w,w〉 > 0. It follows from dimension counting formulas that dimW⊥ = n − k. Moreover, this implies that V = W ⊕ W⊥, i.e. the whole space is represented as the direct sum of two orthogonal subspaces.

We claim that if a subspace is both A- and A†-invariant, then its orthogonal complement is also A- and A†-invariant. Indeed, suppose that A†(W) ⊂ W, and v ∈ W⊥. Then for any w ∈ W, we have: 〈w, Av〉 = 〈A†w,v〉 = 0, since A†w ∈ W. Therefore Av ∈ W⊥, i.e. W⊥ is A-invariant. By the same token, if W is A-invariant, then W⊥ is A†-invariant.


Step 4. Induction on dimV. When dimV = 1, the theorem is obvious. Assume that the theorem is proved for normal operators in spaces of dimension < n, and prove it when dimV = n.

According to Step 1, a normal operator A has an eigenvalue λ. Let W ≠ {0} be the corresponding eigenspace. If W = V, then the operator is scalar, A = λI, and any orthonormal basis in V will consist of eigenvectors of A. If W ≠ V, then (by Steps 2 and 3) both W and W⊥ are A- and A†-invariant and have dimensions < n. The restrictions of the operators A and A† to each of these subspaces still satisfy AA† = A†A and 〈A†x,y〉 = 〈x, Ay〉 for all x,y. Therefore these restrictions remain adjoint to each other normal operators on W and W⊥. Applying the induction hypothesis, we can find orthonormal bases of eigenvectors of A in each of W and W⊥. The union of these bases forms an orthonormal basis of eigenvectors of A in V = W ⊕ W⊥. □

Remark. Note that Step 1 is based on the Fundamental Theorem of Algebra, but does not use normality of A and applies to any C-linear transformation. Thus, every linear transformation on a complex vector space has eigenvalues and eigenvectors. Furthermore, Step 2 actually applies to any commuting transformations and shows that if AB = BA then eigenspaces of A are B-invariant. The fact that B = A† is used in Step 3.

Corollary 1. A normal operator has a diagonal matrix in a suitable orthonormal basis.

Corollary 2. Let A : V → V be a normal operator, λi the distinct roots of its characteristic polynomial, mi their multiplicities, and Wi the corresponding eigenspaces. Then dimWi = mi, and ∑ dimWi = dimV.

Indeed, this is true for transformations defined by any diagonal matrices. For normal operators, in addition Wi ⊥ Wj when i ≠ j. In particular we have the following corollary.

Corollary 3. Eigenvectors of a normal operator corresponding to different eigenvalues are orthogonal.

Here is a matrix version of the Spectral Theorem.

Corollary 4. A square complex matrix A commuting with its Hermitian adjoint A† can be transformed to a diagonal form by transformations A 7→ U†AU defined by unitary matrices U.

Note that for unitary matrices, U† = U−1, and therefore the above transformations coincide with similarity transformations A 7→ U−1AU. This is how the matrix A of a linear transformation changes under a change of basis. When both the old and new bases are orthonormal, the transition matrix U must be unitary (because in both old and new coordinates the Hermitian inner product has the same standard form: 〈x,y〉 = x†y). The result follows.
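In computational terms, Corollary 4 can be illustrated with the complex Schur decomposition (a sketch assuming numpy and scipy): for a normal matrix A the Schur form U†AU is not merely upper triangular but diagonal. Circulant matrices, being polynomials in the unitary cyclic shift, are normal and make a convenient test case.

    import numpy as np
    from scipy.linalg import schur

    P = np.roll(np.eye(4), -1, axis=0)              # cyclic shift of coordinates: unitary
    A = 2 * np.eye(4) + (1 + 1j) * P + 3j * P @ P   # a circulant, hence normal, matrix
    assert np.allclose(A @ A.conj().T, A.conj().T @ A)

    T, U = schur(A, output='complex')               # A = U T U^dagger with U unitary
    print(np.round(T, 8))                           # diagonal; entries = eigenvalues of A
    print(np.allclose(U.conj().T @ U, np.eye(4)))   # True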

EXERCISES

335. Prove that the characteristic polynomial det(λI − A) of a square matrix A does not change under similarity transformations A 7→ C−1AC and thus depends only on the linear operator defined by the matrix.

336. Show that if λn + p1λn−1 + · · · + pn is the characteristic polynomial of a matrix A, then pn = (−1)n detA, and p1 = − trA, and conclude that the trace is invariant under similarity transformations.

337. Prove that trA = ∑λi, where λi are the roots of det(λI − A) = 0.

338. Prove that if A and B are normal and AB = 0, then BA = 0. Does this remain true without the normality assumption?

339.⋆ Let operator A be normal. Prove that the set of complex numbers {〈x, Ax〉 | |x| = 1} is a convex polygon whose vertices are the eigenvalues of A.

340. Prove that two (or several) commuting normal operators have a common orthonormal basis of eigenvectors.

341. Prove that if A is normal and AB = BA, then AB† = B†A, A†B = BA†, and A†B† = B†A†.

Unitary Transformations

Note that if λ is an eigenvalue of a unitary operator U then |λ| = 1. Indeed, if x ≠ 0 is a corresponding eigenvector, then 〈x,x〉 = 〈Ux, Ux〉 = λ̄λ〈x,x〉, and since 〈x,x〉 ≠ 0, it implies λ̄λ = 1.

Corollary 5. A transformation is unitary if and only if in some orthonormal basis its matrix is diagonal, and the diagonal entries are complex numbers of absolute value 1.

On the complex line C, multiplication by λ with |λ| = 1 and arg λ = θ defines the rotation through the angle θ. We will call this transformation on the complex line a unitary rotation. We arrive therefore at the following geometric characterization of unitary transformations.

Corollary 6. Unitary transformations in an Hermitian space of dimension n are exactly unitary rotations (through possibly different angles) in n mutually perpendicular complex directions.


Orthogonal Diagonalization

Corollary 7. A linear operator is Hermitian (respectively anti-Hermitian) if and only if in some orthonormal basis its matrix is diagonal with all real (respectively imaginary) diagonal entries.

Indeed, if Ax = λx and A† = ±A, we have:

λ〈x,x〉 = 〈x, Ax〉 = 〈A†x,x〉 = ±λ̄〈x,x〉.

Therefore λ = ±λ̄ provided that x ≠ 0, i.e. eigenvalues of an Hermitian operator are real, and of an anti-Hermitian one imaginary. Vice versa, a real diagonal matrix is obviously Hermitian, and an imaginary one anti-Hermitian.

Recall that (anti-)Hermitian operators correspond to (anti-)Hermitian forms A(x,y) := 〈x, Ay〉. Applying the Spectral Theorem and reordering the basis eigenvectors in the monotonic order of the corresponding eigenvalues, we obtain the following classification results for forms.

Corollary 8. In a Hermitian space of dimension n, an Hermitian form can be transformed by unitary changes of coordinates to exactly one of the normal forms

λ1|z1|2 + · · · + λn|zn|2, λ1 ≥ · · · ≥ λn.

Corollary 9. In a Hermitian space of dimension n, an anti-Hermitian form can be transformed by unitary changes of coordinates to exactly one of the normal forms

iω1|z1|2 + · · ·+ iωn|zn|2, ω1 ≥ · · · ≥ ωn.

Uniqueness follows from the fact that eigenvalues and dimensions of eigenspaces are determined by the operators in a coordinate-less fashion.

Corollary 10. In a complex vector space of dimension n, a pair of Hermitian forms, of which the first one is positive definite, can be transformed by a choice of a coordinate system to exactly one of the normal forms:

|z1|2 + · · ·+ |zn|2, λ1|z1|2 + · · ·+ λn|zn|2, λ1 ≥ · · · ≥ λn.


This is the Orthogonal Diagonalization Theorem for Hermitian forms. It is proved in two stages. First, applying the Inertia Theorem to the positive definite form one transforms it to the standard form; the 2nd Hermitian form changes accordingly but remains arbitrary at this stage. Then, applying Corollary 8 of the Spectral Theorem, one transforms the 2nd Hermitian form to its normal form by transformations preserving the 1st one.

Note that one can take the positive definite sesquilinear form corresponding to the 1st Hermitian form for the Hermitian inner product, and describe the 2nd form as 〈z, Az〉, where A is an operator Hermitian with respect to this inner product. The operator, its eigenvalues, and their multiplicities are thus defined by the given pair of forms in a coordinate-less fashion. This guarantees that pairs with different collections λ1 ≥ · · · ≥ λn of eigenvalues are non-equivalent to each other.

EXERCISES

342. Prove that all roots of characteristic polynomials of Hermitian matrices are real.

343. Find eigenspaces and eigenvalues of an orthogonal projector to a subspace W ⊂ V in an Hermitian space.

344. Prove that every Hermitian operator P satisfying P 2 = P is an orthogonal projector. Does this remain true if P is not Hermitian?

345. Prove directly, i.e. not referring to the Spectral Theorem, that every Hermitian operator has an orthonormal basis of eigenvectors.

346. Prove that if (λ − λ1) · · · (λ − λn) is the characteristic polynomial of a normal operator A, then ∑ |λi|2 = tr(A†A).

347. Classify up to linear changes of coordinates pairs (Q,A) of forms, where Q is positive definite Hermitian, and A anti-Hermitian.

348. An Hermitian operator S is called positive (written: S ≥ 0) if 〈x, Sx〉 ≥ 0 for all x. Prove that for every positive operator S there is a unique positive square root (denoted by √S), i.e. a positive operator whose square is S.

349.⋆ Prove the Singular Value Decomposition Theorem: For a rank r linear map A : V → W between Hermitian spaces, there exist orthonormal bases v1, . . . ,vn in V and w1, . . . ,wm in W, and reals µ1 ≥ · · · ≥ µr > 0, such that Av1 = µ1w1, . . . , Avr = µrwr, Avr+1 = · · · = Avn = 0.


350. Prove that for every complex m × n-matrix A of rank r, there exist unitary m × m- and n × n-matrices U and V, and a diagonal r × r-matrix M with positive diagonal entries, such that A = U† [ M 0; 0 0 ] V, where [ M 0; 0 0 ] denotes the m × n block matrix with M in the upper left corner and zeros elsewhere.

351. Using the Singular Value Decomposition Theorem with m = n, prove that every linear transformation A of an Hermitian space has a polar decomposition A = SU, where S is positive, and U is unitary.

352. Prove that the polar decomposition A = SU is unique when A is invertible; namely S = √(AA∗), and U = S−1A. What are polar decompositions of non-zero 1 × 1-matrices?


2 Euclidean Geometry

Euclidean Spaces

Let V be a real vector space. A Euclidean inner product (or Euclidean structure) on V is defined as a positive definite symmetric bilinear form 〈·, ·〉. A real vector space equipped with a Euclidean inner product is called a Euclidean space. A Euclidean inner product allows one to talk about distances between points and angles between directions:

|x − y| = √〈x − y, x − y〉,  cos θ(x,y) := 〈x,y〉/(|x| |y|).

It follows from the Inertia Theorem that every finite dimensional Euclidean vector space has an orthonormal basis. In coordinates corresponding to an orthonormal basis e1, . . . , en the inner product is given by the standard formula:

〈x,y〉 = ∑_{i,j=1}^{n} xiyj〈ei, ej〉 = x1y1 + · · · + xnyn.

Thus, every Euclidean space V of dimension n can be identified with the coordinate Euclidean space Rn by an isomorphism Rn → V respecting inner products. Such an isomorphism is not unique, but can be composed with any invertible linear transformation U : V → V preserving the Euclidean structure:

〈Ux, Uy〉 = 〈x,y〉 for all x,y ∈ V.

Such transformations are called orthogonal.

A Euclidean structure on a vector space V allows one to identify the space with its dual V∗ by the rule that assigns to a vector v ∈ V the linear function on V whose value at a point x ∈ V is equal to the inner product 〈v,x〉. Respectively, given a linear map A : V → W between Euclidean spaces, the adjoint map At : W∗ → V∗ can be considered as a map between the spaces themselves: At : W → V. The defining property of the adjoint map reads:

〈Atw,v〉 = 〈w, Av〉 for all v ∈ V and w ∈ W.

Consequently matrices of adjoint maps A and At with respect to orthonormal bases of the Euclidean spaces V and W are transposed to each other.


As in the case of Hermitian spaces, one easily derives that a linear transformation U : V → V is orthogonal if and only if U−1 = U t. In the matrix form, the relation U tU = I means that the columns of U form an orthonormal set in the coordinate Euclidean space.

Our goal here is to develop the spectral theory for real normal operators, i.e. linear transformations A : V → V on a Euclidean space commuting with their transposed operators: AtA = AAt. Symmetric (At = A), anti-symmetric (At = −A), and orthogonal transformations are examples of normal operators in Euclidean geometry.

The right way to proceed is to consider Euclidean geometry as Hermitian geometry, equipped with an additional, real structure, and apply the Spectral Theorem of Hermitian geometry to real normal operators extended to the complex space.

EXERCISES

353. Prove the Cauchy–Schwarz inequality for Euclidean inner products: 〈x,y〉2 ≤ 〈x,x〉〈y,y〉, strictly, unless x and y are proportional, and derive from this that the angle between non-zero vectors is well-defined.

354. For x,y ∈ Rn, put 〈x,x〉 = ∑_{i=1}^{n} 2xi² − 2∑_{i=1}^{n−1} xixi+1. Show that the corresponding symmetric bilinear form defines on Rn a Euclidean structure, and find the angles between the standard coordinate axes in Rn.

355. Prove that 〈x,x〉 := 2∑_{i≤j} xixj defines in Rn a Euclidean structure, find pairwise angles between the standard coordinate axes, and show that permutations of coordinates define orthogonal transformations.

356. In the standard Euclidean space Rn+1 with coordinates x0, . . . , xn, consider the hyperplane H given by the equation x0 + · · · + xn = 0. Find explicitly a basis {fi} in H, in which the Euclidean structure has the same form as in the previous exercise, and then yet another basis {hi} in which it has the same form as in the exercise preceding it.

357. Prove that if U is orthogonal, then detU = ±1.

358. Provide a geometric description of orthogonal transformations of the Euclidean plane. Which of them have determinant 1, and which −1?

359. Prove that an n × n-matrix U defines an orthogonal transformation in the standard Euclidean space Rn if and only if the columns of U form an orthonormal basis.

360. Show that rows of an orthogonal matrix form an orthonormal basis.


Complexification

Since R ⊂ C, every complex vector space can be considered as a real vector space simply by “forgetting” that one can multiply by non-real scalars. This operation is called realification; applied to a C-vector space V, it produces an R-vector space, denoted VR, of real dimension twice the complex dimension of V.

In the reverse direction, to a real vector space V one can associate a complex vector space, VC, called the complexification of V. As a real vector space, it is the direct sum of two copies of V:

VC := {(x,y) | x,y ∈ V}.

Thus the addition is performed componentwise, while the multiplication by complex scalars α + iβ is introduced with the thought in mind that (x,y) stands for x + iy:

(α+ iβ)(x,y) := (αx− βy, βx + αy).

This results in a C-vector space VC whose complex dimension equals the real dimension of V.
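As a toy illustration (a sketch assuming numpy), one can represent elements of VC as pairs (x, y) standing for x + iy and check that the multiplication rule above agrees with ordinary complex arithmetic:

    import numpy as np

    def scalar_mult(alpha, beta, x, y):
        """(alpha + i beta)(x, y) := (alpha x - beta y, beta x + alpha y)."""
        return alpha * x - beta * y, beta * x + alpha * y

    x, y = np.array([1.0, 2.0]), np.array([0.0, -1.0])
    u, v = scalar_mult(3.0, 4.0, x, y)
    print(np.allclose(u + 1j * v, (3 + 4j) * (x + 1j * y)))   # True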

Example. (Rn)C = Cn = {x + iy | x,y ∈ Rn}.

A productive point of view on complexification is that it is a complex vector space with an additional structure that “remembers” that the space was constructed from a real one. This additional structure is the operation of complex conjugation (x,y) 7→ (x,−y).

The operation in itself is a map σ : VC → VC, satisfying σ2 = id, which is anti-linear over C. The latter means that σ(λz) = λ̄σ(z) for all λ ∈ C and all z ∈ VC. In other words, σ is R-linear, but anti-commutes with multiplication by i: σ(iz) = −iσ(z).

Conversely, let W be a complex vector space equipped with an anti-linear operator whose square is the identity4:

σ : W → W, σ2 = id, σ(λz) = λ̄σ(z) for all λ ∈ C, z ∈ W.

Let V denote the real subspace in W that consists of all σ-invariant vectors. We claim that W is canonically identified with the complexification of V: W = VC. Indeed, every vector z ∈ W is uniquely written as the sum of σ-invariant and σ-anti-invariant vectors:

z = (1/2)(z + σz) + (1/2)(z − σz).

4 A transformation whose square is the identity is called an involution.


Since σi = −iσ, multiplication by i transforms σ-invariant vectors to σ-anti-invariant ones, and vice versa. Thus, W as a real space is the direct sum V ⊕ (iV) = {x + iy | x,y ∈ V}, where multiplication by i acts in the fashion required for the complexification: i(x + iy) = −y + ix.

The construction of complexification and its abstract description in terms of the complex conjugation operator σ are the tools that allow one to carry over results about complex vector spaces to real vector spaces. The idea is to consider real objects as complex ones invariant under the complex conjugation σ, and apply (or improve) theorems of complex linear algebra in a way that would respect σ.

Example. A real matrix can be considered as a complex one. This way an R-linear map defines a C-linear map (on the complexified space). More abstractly, given an R-linear map A : V → V, one can associate to it a C-linear map AC : VC → VC by AC(x,y) := (Ax, Ay). This map is real in the sense that it commutes with the complex conjugation: ACσ = σAC.

Vice versa, let B : VC → VC be a C-linear map that commutes with σ: σ(Bz) = Bσ(z) for all z ∈ VC. When σ(z) = ±z, we find σ(Bz) = ±Bz, i.e. the subspaces V and iV of real and imaginary vectors are B-invariant. Moreover, since B is C-linear, we find that for x,y ∈ V, B(x + iy) = Bx + iBy. Thus B = AC, where the linear operator A : V → V is obtained by restricting B to V.

Our nearest goal is to obtain real analogues of the Spectral Theorem and its corollaries. One way to do it is to combine corresponding complex results with complexification. Let V be a Euclidean space. We extend the inner product to the complexification VC in such a way that it becomes an Hermitian inner product. Namely, for all x,y,x′,y′ ∈ V, put

〈x+ iy,x′ + iy′〉 = 〈x,x′〉+ 〈y,y′〉+ i〈x,y′〉 − i〈y,x′〉.

It is straightforward to check that this form on VC is sesquilinear and Hermitian symmetric. It is also positive definite since 〈x + iy, x + iy〉 = |x|2 + |y|2. Note that changing the signs of y and y′ preserves the real part and reverses the imaginary part of the form. In other words, for all z,w ∈ VC, we have:

〈σ(z), σ(w)〉 = 〈w, z〉, the complex conjugate of 〈z,w〉.

This identity expresses the fact that the Hermitian structure of VC came from a Euclidean structure on V. When A : VC → VC is a real operator, i.e. σAσ = A, the Hermitian adjoint operator A† is also real.5 Indeed, since σ2 = id, we find that for all z,w ∈ VC

〈σA†σz,w〉 = 〈σw, A†σz〉 = 〈Aσw, σz〉 = 〈σAw, σz〉 = 〈z, Aw〉,

i.e. σA†σ = A†. In particular, complexifications of orthogonal (U−1 = U t), symmetric (At = A), anti-symmetric (At = −A), normal (AtA = AAt) operators in a Euclidean space are respectively unitary, Hermitian, anti-Hermitian, normal operators on the complexified space, commuting with the complex conjugation.

EXERCISES

361. Consider Cn as a real vector space, and describe its complexification.

362. Let σ be the complex conjugation operator on Cn. Consider Cn as a real vector space. Show that σ is symmetric and orthogonal.

363. On the complex line C1, find all involutions σ anti-commuting with the multiplication by i: σi = −iσ.

364. Let σ be an involution on a complex vector space W. Considering W as a real vector space, find eigenvalues of σ and describe the corresponding eigenspaces.

The Real Spectral Theorem

Theorem. Let V be a Euclidean space, and A : V → V a normal operator. Then in the complexification VC, there exists an orthonormal basis of eigenvectors of AC which is invariant under complex conjugation and such that the eigenvalues corresponding to conjugated eigenvectors are conjugated.

Proof. Applying the complex Spectral Theorem to the normal operator B = AC, we obtain a decomposition of the complexified space VC into a direct orthogonal sum of eigenspaces W1, . . . ,Wr of B corresponding to distinct complex eigenvalues λ1, . . . , λr. Note that if v is an eigenvector of B with an eigenvalue µ, then Bσv = σBv = σ(µv) = µ̄σv, i.e. σv is an eigenvector of B with the conjugate eigenvalue µ̄. This shows that if λi is a non-real eigenvalue, then its conjugate λ̄i is also one of the eigenvalues of B (say, λj), and the corresponding eigenspaces are conjugated: σ(Wi) = Wj. By the same token, if λk is real, then σ(Wk) = Wk. This last equality means that Wk itself is the complexification of a real space, namely of the σ-invariant part of Wk. It coincides with the space Ker(λkI − A) ⊂ V of real eigenvectors of A with the eigenvalue λk. Thus, to construct a required orthonormal basis, we take: for each real eigenspace Wk, a Euclidean orthonormal basis in the corresponding real eigenspace, and for each pair Wi,Wj of complex conjugate eigenspaces, an Hermitian orthonormal basis {fα} in Wi and the conjugate basis {σ(fα)} in Wj = σ(Wi). The vectors of all these bases altogether form an orthonormal basis of VC satisfying our requirements. □

5 This is obvious in the matrix form: In a real orthonormal basis of V (which is a complex orthonormal basis of VC) A has a real matrix, so that A† = At. Here we argue the “hard way” in order to illustrate how various aspects of σ-invariance fit together.

Example 1. Identify C with the Euclidean plane R2 in the usual way, and consider the operator (x + iy) 7→ (α + iβ)(x + iy) of multiplication by a given complex number α + iβ. In the basis 1, i, it has the matrix

A = [ α  −β ]
    [ β   α ].

Since At represents multiplication by α − iβ, it commutes with A. Therefore A is normal. It is straightforward to check that

z = (1/√2) (1, −i)t  and  z̄ = (1/√2) (1, i)t

are complex eigenvectors of A with the eigenvalues α + iβ and α − iβ respectively, and form an Hermitian orthonormal basis in (R2)C.
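These formulas are easy to confirm numerically (a sketch assuming numpy; α = 2, β = 3 are arbitrary sample values):

    import numpy as np

    alpha, beta = 2.0, 3.0
    A = np.array([[alpha, -beta], [beta, alpha]])
    U = np.array([[1, 1], [-1j, 1j]]) / np.sqrt(2)   # columns: the eigenvectors z and its conjugate
    print(np.round(U.conj().T @ A @ U, 10))          # diag(alpha + i beta, alpha - i beta)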

Example 2. If A is a linear transformation in Rn, and λ0 is a non-real root of its characteristic polynomial det(λI − A), then the system of linear equations Az = λ0z has non-trivial solutions, which cannot be real though. Let z = u + iv be a complex eigenvector of A with the eigenvalue λ0 = α + iβ. Then σz = u − iv is an eigenvector of A with the eigenvalue λ̄0 = α − iβ. Since λ0 ≠ λ̄0, the vectors z and σz are linearly independent over C, and hence the real vectors u and v must be linearly independent over R. Consider the plane Span(u,v) ⊂ Rn. Since

A(u − iv) = (α − iβ)(u − iv) = (αu − βv) − i(βu + αv),

we conclude that A preserves this plane and in the basis u,−v in it (note the sign change!) acts by the matrix

[ α  −β ]
[ β   α ].

If we assume in addition that A is normal (with respect to the standard Euclidean structure in Rn), then the eigenvectors z and σz must be Hermitian orthogonal, i.e.

〈u − iv, u + iv〉 = 〈u,u〉 − 〈v,v〉 + 2i〈u,v〉 = 0.

We conclude that 〈u,v〉 = 0 and |u|2 − |v|2 = 0, i.e. u and v are orthogonal and have the same length. Normalizing the length to 1, we obtain an orthonormal basis of the A-invariant plane, in which the transformation A acts as in Example 1. The geometry of this transformation is known to us from studying geometry of complex numbers: It is the composition of the rotation through the angle arg(λ0) with the expansion by the factor |λ0|. We will call such a transformation of the Euclidean plane a complex multiplication or multiplication by a complex scalar, λ0.

Corollary 1. Given a normal operator on a Euclidean space, the space can be represented as a direct orthogonal sum of invariant lines and planes, on each of which the transformation acts as multiplication by a real or complex scalar respectively.

Corollary 2. A transformation in a Euclidean space is orthogonal if and only if the space can be represented as the direct orthogonal sum of invariant lines and planes on each of which the transformation acts as multiplication by ±1 and rotation respectively.

Corollary 3. In a Euclidean space, every symmetric operator has an orthonormal basis of eigenvectors.

Corollary 4. Every quadratic form in a Euclidean space of dimension n can be transformed by an orthogonal change of coordinates to exactly one of the normal forms:

λ1x1² + · · · + λnxn²,  λ1 ≥ · · · ≥ λn.

Corollary 5. In a Euclidean space of dimension n, every anti-symmetric bilinear form can be transformed by an orthogonal change of coordinates to exactly one of the normal forms

〈x,y〉 = ∑_{i=1}^{r} ωi(x_{2i−1}y_{2i} − x_{2i}y_{2i−1}),  ω1 ≥ · · · ≥ ωr > 0, 2r ≤ n.


Corollary 6. Every real normal matrix A can be written in the form A = U tMU where U is an orthogonal matrix, and M is a block-diagonal matrix with each block either of size 1, or of size 2 of the form

[ α  −β ]
[ β   α ],

where α² + β² ≠ 0.

If A is symmetric, then only blocks of size 1 are present (i.e. M is diagonal).

If A is anti-symmetric, then blocks of size 1 are zero, and of size 2 are of the form

[ 0  −ω ]
[ ω   0 ],

where ω > 0.

If A is orthogonal, then all blocks of size 1 are equal to ±1, and blocks of size 2 have the form

[ cos θ  −sin θ ]
[ sin θ   cos θ ],

where 0 < θ < π.
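Corollary 6 can be sketched computationally (assuming numpy and scipy): the real Schur decomposition of a normal matrix produces exactly such a block-diagonal M with U orthogonal. An orthogonal example, the cyclic permutation of R5, yields one 1 × 1 block equal to 1 and two rotation blocks.

    import numpy as np
    from scipy.linalg import schur

    A = np.roll(np.eye(5), -1, axis=0)      # cyclic permutation: orthogonal, hence normal
    M, U = schur(A, output='real')          # A = U M U^t with U orthogonal
    print(np.round(M, 8))                   # block diagonal: [1] and two 2x2 rotation blocks
    print(np.allclose(U @ M @ U.T, A))      # True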

EXERCISES

365. Prove that an operator on a Euclidean vector space is normal if and only if it is the sum of commuting symmetric and anti-symmetric operators.

366. Prove that in the complexification (R2)C of a Euclidean plane, all rotations of R2 have a common basis of eigenvectors, and find these eigenvectors.

367. Prove that an orthogonal transformation in R3 is either the rotation through an angle θ, 0 ≤ θ ≤ π, about some axis, or the composition of such a rotation with the reflection about the plane perpendicular to the axis.

368. Find an orthonormal basis in Cn in which the transformation defined by the cyclic permutation of coordinates: (z1, z2, . . . , zn) 7→ (z2, . . . , zn, z1) is diagonal.

369. In the coordinate Euclidean space Rn with n ≤ 4, find real and complex normal forms of orthogonal transformations defined by various permutations of coordinates.

370. Transform to normal forms by orthogonal transformations: (a) x1x2 + x3x4, (b) 2x1² − 4x1x2 + x2² − 4x2x3, (c) 5x1² + 6x2² + 4x3² − 4x1x2 − 4x1x3.

371. Show that any anti-symmetric bilinear form on R2 is proportional to det[x,y] = x1y2 − x2y1, the determinant of the 2 × 2 matrix with rows x and y. Find the operator corresponding to this form, its complex eigenvalues and eigenvectors.


372. In Euclidean spaces, classify all operators which are both orthogonal and anti-symmetric.

373. Recall that a bilinear form on V is called non-degenerate if the corresponding linear map V → V∗ is an isomorphism, and degenerate otherwise. Prove that all non-degenerate anti-symmetric bilinear forms on R2n are equivalent to each other, and that all anti-symmetric bilinear forms on R2n+1 are degenerate.

374. Derive Corollaries 1 – 6 from the Real Spectral Theorem.

375. Let U and V be two subspaces of dimension 2 in the Euclidean 4-space. Consider the map T : V → V defined as the composition: V ⊂ R4 → U ⊂ R4 → V, where the arrows are the orthogonal projections to U and V respectively. Prove that T is positive, and that its eigenvalues have the form cosφ, cosψ where φ, ψ are certain angles, 0 ≤ φ, ψ ≤ π/2.

376. Solve Gelfand's problem: In the Euclidean 4-space, classify pairs of planes passing through the origin up to orthogonal transformations of the space.

Courant–Fischer’s Minimax Principle

One of the consequences (equivalent to Corollary 4) of the Spectral Theorem is that a pair (Q,S) of quadratic forms in Rn, of which the first one is positive definite, can be transformed by a linear change of coordinates to the normal form:

Q = x1² + · · · + xn²,  S = λ1x1² + · · · + λnxn²,  λ1 ≥ · · · ≥ λn.

The eigenvalues λ1 ≥ · · · ≥ λn form the spectrum of the pair (Q,S). The following result gives a coordinate-less, geometric description of the spectrum (and thus implies the Orthogonal Diagonalization Theorem as it was stated in the Introduction).

Theorem. The k-th greatest spectral number is given by

λk = max_{W : dimW = k}  min_{x ∈ W − 0}  S(x)/Q(x),

where the maximum is taken over all k-dimensional subspaces W ⊂ Rn, and the minimum over all non-zero vectors in the subspace.

Proof. When W is given by the equations xk+1 = · · · = xn = 0, the minimal ratio S(x)/Q(x) (achieved on vectors proportional to ek) is equal to λk because

λ1x1² + · · · + λkxk² ≥ λk(x1² + · · · + xk²) when λ1 ≥ · · · ≥ λk.


Therefore it suffices to prove that for every other k-dimensional subspace W the minimal ratio cannot be greater than λk. For this, denote by V the subspace of dimension n − k + 1 given by the equations x1 = · · · = xk−1 = 0. Since λk ≥ · · · ≥ λn, we have:

λkxk² + · · · + λnxn² ≤ λk(xk² + · · · + xn²),

i.e. for all non-zero vectors x in V the ratio S(x)/Q(x) ≤ λk. Now we invoke the dimension counting argument: dimW + dimV = k + (n − k + 1) = n + 1 > dim Rn, and conclude that W has a non-trivial intersection with V. Let x be a non-zero vector in W ∩ V. Then S(x)/Q(x) ≤ λk, and hence the minimum of the ratio S/Q on W − 0 cannot exceed λk. □

Applying the Theorem to the pair (Q,−S) we obtain yet another characterization of the spectrum:

λk = min_{W : dimW = n−k+1}  max_{x ∈ W − 0}  S(x)/Q(x).

Formulating some applications, we assume that the space Rn is Euclidean, and refer to the spectrum of the pair (Q,S), where Q = |x|2, simply as the spectrum of S.

Corollary 1. When a quadratic form increases, its spectral numbers do not decrease: If S ≤ S′ then λk ≤ λ′k for all k = 1, . . . , n.

Proof. Indeed, since S/Q ≤ S′/Q, the minimum of the ratio S/Q on every k-dimensional subspace W cannot exceed that of S′/Q, which in particular remains true for that W on which the maximum, equal to λk, is achieved.

The following result is called Cauchy’s interlacing theorem.

Corollary 2. Let λ1 ≥ · · · ≥ λn be the spectrum of a quadratic form S, and λ′1 ≥ · · · ≥ λ′n−1 be the spectrum of the quadratic form S′ obtained by restricting S to a given hyperplane Rn−1 ⊂ Rn passing through the origin. Then:

λ1 ≥ λ′1 ≥ λ2 ≥ λ′2 ≥ · · · ≥ λn−1 ≥ λ′n−1 ≥ λn.

Proof. The maximum over all k-dimensional subspaces W cannot be smaller than the maximum (of the same quantities) over subspaces lying inside the hyperplane. This proves that λk ≥ λ′k. Applying the same argument to −S and subspaces of dimension n − k − 1, we conclude that −λk+1 ≥ −λ′k. □
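Here is a quick numerical illustration of the interlacing theorem (a sketch assuming numpy; the random symmetric matrix S and the hyperplane x5 = 0 are arbitrary choices). Restricting S to the hyperplane xn = 0 amounts to deleting the last row and column of its matrix:

    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.standard_normal((5, 5))
    S = (B + B.T) / 2                                       # a symmetric matrix (quadratic form)
    lam = np.sort(np.linalg.eigvalsh(S))[::-1]              # lambda_1 >= ... >= lambda_5
    lam_p = np.sort(np.linalg.eigvalsh(S[:-1, :-1]))[::-1]  # spectrum of the restriction
    print(np.all(lam[:-1] >= lam_p) and np.all(lam_p >= lam[1:]))   # True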


An ellipsoid in a Euclidean space is defined as the level-1 set E = {x | S(x) = 1} of a positive definite quadratic form, S. It follows from the Spectral Theorem that every ellipsoid can be transformed by an orthogonal transformation to principal axes: a normal form

x1²/α1² + · · · + xn²/αn² = 1,  0 < α1 ≤ · · · ≤ αn.

The vectors x = ±αkek lie on the ellipsoid, and their lengths αk are called semiaxes of E. They are related to the spectral numbers λ1 ≥ · · · ≥ λn > 0 of the quadratic form by 1/αk = √λk. From Corollaries 1 and 2 respectively, we obtain:

Given two concentric ellipsoids enclosing one another, the semiaxes of the inner ellipsoid do not exceed the corresponding semiaxes of the outer:

If E′ ⊂ E, then α′k ≤ αk for all k = 1, . . . , n.

Semiaxes of a given ellipsoid are interlaced by semiaxes of any section of it by a hyperplane passing through the center:

If E′ = E ∩ Rn−1, then αk ≤ α′k ≤ αk+1 for k = 1, . . . , n − 1.

EXERCISES

377. Prove that every ellipsoid in Rn has n pairwise perpendicular hyperplanes of bilateral symmetry.

378. Given an ellipsoid E ⊂ R3, find a plane passing through its center and intersecting E in a circle.

379. Formulate and prove counterparts of Courant–Fischer's minimax principle and Cauchy's interlacing theorem for Hermitian forms.

380. Prove that semiaxes α1 ≤ α2 ≤ · · · of an ellipsoid in Rn and semiaxes α′1 ≤ α′2 ≤ · · · of its section by a linear subspace of codimension k are related by the inequalities: αi ≤ α′i ≤ αi+k, i = 1, . . . , n − k.

381. From the Real Spectral Theorem, derive the Orthogonal Diagonalization Theorem as it is formulated in the Introduction, i.e. for pairs of quadratic forms on Rn, one of which is positive definite.

Small Oscillations

Let us consider the system of n identical masses m positioned at the vertices of a regular n-gon, which are cyclically connected by n identical elastic springs, and can oscillate in the direction perpendicular to the plane of the n-gon.


Assuming that the amplitudes of the oscillation are small, we can describe the motion of the masses as solutions to the following system of n second-order Ordinary Differential Equations (ODE for short) expressing Newton's law of motion (mass × acceleration = force):

mẍ1 = −k(x1 − xn) − k(x1 − x2),
mẍ2 = −k(x2 − x1) − k(x2 − x3),
· · ·
mẍn−1 = −k(xn−1 − xn−2) − k(xn−1 − xn),
mẍn = −k(xn − xn−1) − k(xn − x1).

Here x1, . . . , xn are the displacements of the n masses in the direction perpendicular to the plane, and k characterizes the rigidity of the springs.6

Figure 43

In fact the above ODE system can be read off a pair of quadratic forms: the kinetic energy

K(ẋ) = mẋ1²/2 + mẋ2²/2 + · · · + mẋn²/2,

and the potential energy

P(x) = k(x1 − x2)²/2 + k(x2 − x3)²/2 + · · · + k(xn − x1)²/2.

6 More precisely (see Figure 43, where n = 4), we may assume that the springs are stretched, but the masses are confined on the vertical rods and can only slide along them without friction. When a spring of length L is horizontal (∆x = 0), the stretching force T is compensated by the reactions of the rods. When ∆x ≠ 0, the horizontal component of the stretching force is still compensated, but the vertical component contributes to the right hand side of Newton's equations. When ∆x is small, the contribution equals approximately −T(∆x)/L (so that k = T/L).


Namely, for any conservative mechanical system with quadratic kinetic and potential energy functions

K(ẋ) = (1/2)〈ẋ, Mẋ〉,  P(x) = (1/2)〈x, Qx〉

the equations of motion assume the form

Mẍ = −Qx.

A linear change of variables x = Cy transforms the kinetic and potential energy functions to a new form with the matrices M′ = CtMC and Q′ = CtQC. On the other hand, the same change of variables transforms the ODE system Mẍ = −Qx to MCÿ = −QCy. Multiplying by Ct we get M′ÿ = −Q′y and see that the relationship between K, P and the ODE system is preserved. The relationship is therefore intrinsic, i.e. independent of the choice of coordinates.

Since the kinetic energy is positive we can apply the Orthogonal Diagonalization Theorem in order to transform K and P simultaneously to

(1/2)(Ẋ1² + · · · + Ẋn²), and (1/2)(λ1X1² + · · · + λnXn²).
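In computational practice this simultaneous diagonalization is a generalized symmetric eigenvalue problem. The following sketch (assuming numpy and scipy; the matrices are sample values for n = 3 masses, m = 2 and k = 1) finds a change of variables x = Cy with CtMC = I and CtQC diagonal:

    import numpy as np
    from scipy.linalg import eigh

    M = 2.0 * np.eye(3)                  # kinetic-energy matrix (masses m = 2)
    Q = np.array([[ 2., -1., -1.],
                  [-1.,  2., -1.],
                  [-1., -1.,  2.]])      # potential-energy matrix of the cyclic chain, k = 1
    lam, C = eigh(Q, M)                  # generalized eigenproblem Q c = lambda M c
    print(np.round(C.T @ M @ C, 8))      # the identity matrix
    print(np.round(C.T @ Q @ C, 8))      # diag(lambda_1, ..., lambda_n)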

The corresponding ODE system splits into unlinked 2-nd order ODEs

Ẍ1 = −λ1X1, . . . , Ẍn = −λnXn.

When the potential energy is also positive, we obtain a system of n unlinked harmonic oscillators with frequencies ω = √λ1, . . . , √λn.

Example 1: Harmonic oscillators. The equation Ẍ = −ω²X has solutions

X(t) = A cos ωt + B sin ωt,

where A = X(0) and B = Ẋ(0)/ω are arbitrary real constants. It is convenient to plot the solutions on the phase plane with coordinates (X,Y) = (X, Ẋ/ω). In such coordinates, the equations of motion assume the form

Ẋ = ωY,  Ẏ = −ωX,

and the solutions

[ X(t) ]   [  cos ωt   sin ωt ] [ X(0) ]
[ Y(t) ] = [ −sin ωt   cos ωt ] [ Y(0) ].


In other words (see Figure 44), the motion on the phase plane is described as clockwise rotation with the angular velocity ω. Since there is one trajectory through each point of the phase plane, the general theory of Ordinary Differential Equations (namely, the theorem about uniqueness and existence of solutions with given initial conditions) guarantees that these are all the solutions to the ODE

Ẍ = −ω²X.

Figure 44 (the phase plane with coordinates X, Y)

Let us now examine the behavior of our system of n masses cyclically connected by the springs. To find the common orthogonal basis of the pair of quadratic forms K and P, we first note that, since K is proportional to the standard Euclidean structure, it suffices to find an orthogonal basis of eigenvectors of the symmetric matrix Q.

In order to give a concise description of the ODE system mẍ = Qx, introduce the operator T : Rn → Rn which cyclically shifts the coordinates: T(x1, x2, . . . , xn)t = (x2, . . . , xn, x1)t. Then Q = k(T + T−1 − 2I). Note that the operator T is obviously orthogonal, and hence unitary in the complexification Cn of the space Rn. We will now construct its basis of eigenvectors, which should be called the Fourier basis.7 Namely, let xk = ζk where ζn = 1. Then the sequence {xk} is repeating every n terms, and xk+1 = ζxk for all k ∈ Z. Thus Tx = ζx, where x = (ζ, ζ², . . . , ζⁿ)t. When ζ = e^{2π√−1 l/n}, l = 1, 2, . . . , n, runs over the various nth roots of unity, we obtain n eigenvectors of the operator T, which correspond to different eigenvalues, and hence are linearly independent. They are automatically pairwise Hermitian orthogonal (since T is unitary), and happen to have the same Hermitian inner square, equal to n. Thus, when divided by √n, these vectors form an orthonormal basis in Cn. Besides, this basis is invariant under complex conjugation (because replacing the eigenvalue ζ with ζ̄ also conjugates the corresponding eigenvector).

7 After French mathematician Joseph Fourier (1768–1830), and by analogy with the theory of Fourier series.

Now, applying this to Q = k(T + T−1 − 2I), we conclude that Q is diagonal in the Fourier basis with the eigenvalues

k(ζ + ζ−1 − 2) = 2k(cos(2πl/n) − 1) = −4k sin²(πl/n),  l = 1, 2, . . . , n.

When ζ ≠ ζ̄, this pair of roots of unity yields the same eigenvalue of Q, and the real and imaginary parts of the Fourier eigenvector x = (ζ, . . . , ζⁿ)t span in Rn the 2-dimensional eigenplane of the operator Q. When ζ = 1 or −1 (the latter happens only when n is even), the corresponding eigenvalue of Q is 0 and −4k respectively, and the eigenspace is 1-dimensional (spanned by the respective Fourier vectors (1, . . . , 1)t and (−1, 1, . . . ,−1, 1)t). The whole system decomposes into a superposition of independent “modes of oscillation” (patterns) described by the equations

Ẍl = −ωl²Xl, where ωl = 2√(k/m) sin(πl/n), l = 1, . . . , n.
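A quick check of these eigenvalues and frequencies (a sketch assuming numpy; n = 4, k = m = 1 match Example 2 below):

    import numpy as np

    n, k, m = 4, 1.0, 1.0
    T = np.roll(np.eye(n), -1, axis=0)                    # cyclic shift of coordinates
    Q = k * (T + T.T - 2 * np.eye(n))                     # T^{-1} = T^t for a permutation matrix
    eig = np.sort(np.linalg.eigvalsh(Q))
    formula = np.sort([-4 * k * np.sin(np.pi * l / n) ** 2 for l in range(1, n + 1)])
    print(np.allclose(eig, formula))                      # True
    print(np.round(np.sqrt(-eig / m), 6))                 # frequencies 2, sqrt(2), sqrt(2), 0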

Figure 45 (panels a–d)

Example 2: n = 4. Here ζ = 1, −1, ±i. The value ζ = 1 corresponds to the eigenvector (1, 1, 1, 1) and the eigenvalue 0. This “mode of oscillation” is described by the ODE Ẍ = 0, and actually corresponds to the steady translation of the chain as a whole with a constant speed (Figure 45a). The value ζ = −1 corresponds to the eigenvector (−1, 1, −1, 1) (Figure 45b) with the frequency of oscillation 2√(k/m). The values ζ = ±i correspond to the eigenvectors (±i, −1, ∓i, 1). Their real and imaginary parts (0, −1, 0, 1) and (1, 0, −1, 0) (Figures 45cd) span the plane of modes of oscillation with the same frequency √(2k/m). The general motion of the system is the superposition of these four patterns.


Remark. In fact the oscillatory system we've just studied can be considered as a model of sound propagation in a one-dimensional crystal. One can similarly analyze propagation of sound waves in 2-dimensional membranes of rectangular or periodic (toroidal) shape, or in similar 3-dimensional regions. Physicists often call the resulting picture, the superposition of independent sinusoidal waves, an ideal gas of phonons. Here "ideal gas" refers to the independence of the eigenmodes of oscillation (which therefore behave as noninteracting particles of a sparse gas), and "phonons" emphasizes that the "particles" are rather bells producing sound waves of various frequencies.

The mathematical aspect of this theory is even more general: the Orthogonal Diagonalization Theorem guarantees that small oscillations in any conservative mechanical system near a local minimum of potential energy are described as superpositions of independent harmonic oscillations.

EXERCISES

382. A mass m is suspended on a weightless rod of length l (as a clock pendulum), and is swinging without friction under the action of the force of gravity mg (where g is the gravitational acceleration). Show that the Newton equation of motion of the pendulum has the form lẍ = −g sin x, where x is the angle the rod makes with the downward vertical direction, and show that the frequency of small oscillations of the pendulum near the lower equilibrium (x = 0) is equal to √(g/l).

383. In the mass-spring chain (studied in the text) with n = 3, find the frequencies and describe explicitly the modes of oscillation.

384. The same, for 6 masses positioned at the vertices of a regular hexagon (like the 6 carbon atoms in benzene molecules).

385.⋆ Given n numbers C_1, ..., C_n (real or complex), we form from them an infinite periodic sequence {C_k}: ..., C_{−1}, C_0, C_1, ..., C_n, C_{n+1}, ..., where C_{k+n} = C_k. Let C denote the n×n-matrix whose entries are c_{ij} = C_{j−i}. Prove that all such matrices (corresponding to different n-periodic sequences) are normal, that they commute, and find their common eigenvectors.

386.⋆ Study small oscillations of a 2-dimensional crystal lattice of toroidal shape consisting of m×n identical masses (positioned in m circular "rows" and n circular "columns", each interacting only with its four neighbors).

387. Using Courant–Fischer's minimax principle, explain why a cracked bell sounds lower than the intact one.


3 Jordan Canonical Forms

Characteristic Polynomials and Root Spaces

Let V be a finite dimensional K-vector space. We do not assume that V is equipped with any structure in addition to the structure of a K-vector space. In this section, we study the geometry of linear operators on V. In other words, we study the problem of classification of linear operators A : V → V up to similarity transformations A ↦ C^{−1}AC, where C stands for arbitrary invertible linear transformations of V.

Let n = dim V, and let A be the matrix of a linear operator with respect to some basis of V. Recall that

det(λI − A) = λ^n + p_1 λ^{n−1} + ... + p_{n−1} λ + p_n

is called the characteristic polynomial of A. In fact it does not depend on the choice of a basis. Indeed, under a change x = Cx′ of coordinates, the matrix of a linear operator x ↦ Ax is transformed into the matrix C^{−1}AC similar to A. We have:

det(λI − C^{−1}AC) = det[C^{−1}(λI − A)C] = (det C^{−1}) det(λI − A)(det C) = det(λI − A).

Therefore, the characteristic polynomial of a linear operator is well-defined (by the geometry of A). In particular, coefficients of the characteristic polynomial do not change under similarity transformations.

Let λ_0 ∈ K be a root of the characteristic polynomial. Then det(λ_0 I − A) = 0, and hence the system of homogeneous linear equations Ax = λ_0 x has a non-trivial solution, x ≠ 0. As before, we call any such solution an eigenvector of A, and call λ_0 the corresponding eigenvalue. All solutions to Ax = λ_0 x (including x = 0) form a linear subspace in V, called the eigenspace of A corresponding to the eigenvalue λ_0.

Let us change slightly our point of view on the eigenspace. It is the null space of the operator A − λ_0 I. Consider powers of this operator and their null spaces. If (A − λ_0 I)^k x = 0 for some k > 0, then (A − λ_0 I)^l x = 0 for all l ≥ k. Thus the null spaces are nested:

Ker(A − λ_0 I) ⊂ Ker(A − λ_0 I)^2 ⊂ ... ⊂ Ker(A − λ_0 I)^k ⊂ ...

On the other hand, since dim V < ∞, nested subspaces must stabilize, i.e. starting from some m > 0, we have:

W_{λ_0} := Ker(A − λ_0 I)^m = Ker(A − λ_0 I)^{m+1} = ...


We call the subspace W_{λ_0} a root space of the operator A, namely, the root space corresponding to the root λ_0 of the characteristic polynomial.

Note that if x ∈ W_{λ_0}, then Ax ∈ W_{λ_0}, because (A − λ_0 I)^m Ax = A(A − λ_0 I)^m x = A0 = 0. Thus a root space is A-invariant. Denote by U_{λ_0} the range of (A − λ_0 I)^m. It is also A-invariant, since if x = (A − λ_0 I)^m y, then Ax = A(A − λ_0 I)^m y = (A − λ_0 I)^m (Ay).

Lemma. V = W_{λ_0} ⊕ U_{λ_0}.

Proof. Put B := (A − λ_0 I)^m, so that W_{λ_0} = Ker B, U_{λ_0} = B(V). Let x = By ∈ Ker B. Then Bx = 0, i.e. y ∈ Ker B^2. But Ker B^2 = Ker B by the assumption that Ker B = W_{λ_0} is the root space. Thus y ∈ Ker B, and hence x = By = 0. This proves that Ker B ∩ B(V) = {0}. Therefore the subspace in V spanned by Ker B and B(V) is their direct sum. On the other hand, for any operator, dim Ker B + dim B(V) = dim V. Thus, the subspace spanned by Ker B and B(V) is the whole space V.

Corollary 1. For any λ ≠ λ_0, the root space W_λ ⊂ U_{λ_0}.

Proof. Indeed, W_λ is invariant with respect to A − λ_0 I, but contains no eigenvectors of A with eigenvalue λ_0. Therefore A − λ_0 I and all powers of it are invertible on W_λ. Thus W_λ lies in the range U_{λ_0} of B = (A − λ_0 I)^m.

Corollary 2. Suppose that (λ − λ_1)^{m_1} ... (λ − λ_r)^{m_r} is the characteristic polynomial of A : V → V, where λ_1, ..., λ_r ∈ K are pairwise distinct roots. Then V is the direct sum of root spaces:

V = W_{λ_1} ⊕ ... ⊕ W_{λ_r}.

Proof. From Corollary 1, it follows by induction on r that V = W_{λ_1} ⊕ ... ⊕ W_{λ_r} ⊕ U, where U = U_{λ_1} ∩ ... ∩ U_{λ_r}. In particular, U is A-invariant as the intersection of A-invariant subspaces, but contains no eigenvectors of A with eigenvalues λ_1, ..., λ_r. Picking bases in each of the direct summands W_{λ_i} and in U, we obtain a basis in V, in which the matrix of A is block-diagonal. Therefore the characteristic polynomial of A is the product of the characteristic polynomials of A restricted to the summands. So far we haven't used the hypothesis that the characteristic polynomial of A factors into a product of the λ − λ_i. Invoking this hypothesis, we see that the factor of the characteristic polynomial corresponding to U must have degree 0, and hence dim U = 0.


Remarks. (1) We will see later that the dimensions of the root spaces coincide with the multiplicities of the roots: dim W_{λ_i} = m_i.

(2) The restriction of A to W_{λ_i} has the property that some power of A − λ_i I vanishes. A linear operator some power of which vanishes is called nilpotent. Our next task will be to study the geometry of nilpotent operators.

(3) Our assumption that the characteristic polynomial factors completely over K is automatically satisfied in the case K = C due to the Fundamental Theorem of Algebra. Thus, we have proved, for every linear operator on a finite dimensional complex vector space, that the space decomposes in a canonical fashion into the direct sum of invariant subspaces, on each of which the operator differs from a nilpotent one by a scalar summand.

EXERCISES

388. Let A, B : V → V be two commuting linear operators, and p and q two polynomials in one variable. Show that the operators p(A) and q(B) commute.

389. Prove that if A commutes with B, then root spaces of A are B-invariant.

390. Let λ_0 be a root of the characteristic polynomial of an operator A, and m its multiplicity. What are the possible values for the dimension of the eigenspace corresponding to λ_0?

391. Let v ∈ V be a non-zero vector, and a : V → K a non-zero linear function. Find the eigenvalues and eigenspaces of the operator x ↦ a(x)v.

Nilpotent Operators

Example. Introduce a nilpotent linear operator N : K^n → K^n by describing its action on vectors of the standard basis:

N e_n = e_{n−1}, N e_{n−1} = e_{n−2}, ..., N e_2 = e_1, N e_1 = 0.

Then N^n = 0 but N^{n−1} ≠ 0. We will call N, as well as any operator similar to it, a regular nilpotent operator. The matrix of N in the standard basis has the form

[ 0 1 0 ... 0 ]
[ 0 0 1 ... 0 ]
[     ...     ]
[ 0 0 ... 0 1 ]
[ 0 0 ... 0 0 ].


It has the range of dimension n − 1 spanned by e_1, ..., e_{n−1}, and the null space of dimension 1 spanned by e_1.

Proposition. Let N : V → V be a nilpotent operator on a K-vector space of finite dimension. Then the space can be decomposed into the direct sum of N-invariant subspaces, on each of which N is regular.

Proof. We use induction on dim V. When dim V = 0, there is nothing to prove. Now consider the case when dim V > 0.

The range N(V) is N-invariant, and dim N(V) < dim V (since otherwise N could not be nilpotent). By the induction hypothesis, the space N(V) can be decomposed into the direct sum of N-invariant subspaces, on each of which N is regular. Let l be the number of these subspaces, n_1, ..., n_l their dimensions, and e^{(i)}_1, ..., e^{(i)}_{n_i} a basis in the ith subspace such that N acts on the basis vectors as in Example:

e^{(i)}_{n_i} ↦ ... ↦ e^{(i)}_1 ↦ 0.

Since each e^{(i)}_{n_i} lies in the range of N, we can pick a vector e^{(i)}_{n_i+1} ∈ V such that N e^{(i)}_{n_i+1} = e^{(i)}_{n_i}. Note that e^{(1)}_1, ..., e^{(l)}_1 form a basis in (Ker N) ∩ N(V). We complete it to a basis

e^{(1)}_1, ..., e^{(l)}_1, e^{(l+1)}_1, ..., e^{(r)}_1

of the whole null space Ker N. We claim that all the vectors e^{(i)}_j form a basis in V, and therefore the r subspaces

Span(e^{(i)}_1, ..., e^{(i)}_{n_i}, e^{(i)}_{n_i+1}),   i = 1, ..., l, l + 1, ..., r

(of which the last r − l are 1-dimensional) form a decomposition of V into the direct sum with the required properties.

To justify the claim, notice that the subspace U ⊂ V spanned by the n_1 + ... + n_l = dim N(V) vectors e^{(i)}_j with j > 1 is mapped by N onto the space N(V). Therefore: (a) those vectors form a basis of U, (b) dim U = dim N(V), and (c) U ∩ Ker N = {0}. On the other hand, the vectors e^{(i)}_j with j = 1 form a basis of Ker N, and since dim Ker N + dim N(V) = dim V, together with the above basis of U, they form a basis of V. □

Corollary 1. The matrix of a nilpotent operator in a suitable basis is block diagonal with regular diagonal blocks (as in Example) of certain sizes n_1 ≥ ... ≥ n_r > 0.


The basis in which the matrix has this form, as well as the decomposition into the direct sum of invariant subspaces as described in Proposition, are not canonical, since choices are involved at each step of the induction. However, the dimensions n_1 ≥ ... ≥ n_r > 0 of the subspaces turn out to be uniquely determined by the geometry of the operator.

To see why, introduce the following Young tableau (Figure 46). It consists of r rows of identical square cells. The lengths of the rows represent the dimensions n_1 ≥ n_2 ≥ ... ≥ n_r > 0 of the invariant subspaces, and in the cells of each row we place the basis vectors of the corresponding subspace, so that the operator N sends each vector to its left neighbor (and those of the leftmost column to 0).

Figure 46 (the Young tableau filled with the basis vectors e^{(i)}_j)

The format of the tableau is determined by the partition of the total number n of cells (equal to dim V) into the sum n_1 + ... + n_r of positive integers. Reading the same format by columns, we obtain another partition n = m_1 + ... + m_d, called transposed to the first one, where m_1 ≥ ... ≥ m_d > 0 are the heights of the columns. Obviously, two transposed partitions determine each other.

It follows from the way the cells are filled with the vectors e^{(i)}_j that the vectors in the columns 1 through k form a basis of the space Ker N^k. Therefore

m_k = dim Ker N^k − dim Ker N^{k−1},   k = 1, ..., d.

Corollary 2. Consider the flag of subspaces defined by a nilpotent operator N : V → V:

Ker N ⊂ Ker N^2 ⊂ ... ⊂ Ker N^d = V,

and the partition of n = dim V into the summands m_k = dim Ker N^k − dim Ker N^{k−1}. The summands of the transposed partition n = n_1 + ... + n_r are the dimensions of the regular nilpotent blocks of N (described in Proposition and Corollary 1).

Corollary 3. The number of equivalence classes of nilpotent operators on a vector space of dimension n is equal to the number of partitions of n.
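The transposition recipe of Corollary 2 is easy to test numerically. Below is a small sketch (Python with NumPy/SciPy; the nilpotent matrix is a sample one, built from regular cells of sizes 4 and 2) that recovers the block sizes from the ranks of the powers of N.

    import numpy as np
    from scipy.linalg import block_diag
    from numpy.linalg import matrix_power, matrix_rank

    def regular_cell(size):
        # Regular nilpotent cell: ones on the superdiagonal.
        return np.eye(size, k=1)

    N = block_diag(regular_cell(4), regular_cell(2))
    n = N.shape[0]
    ranks = [matrix_rank(matrix_power(N, j)) for j in range(n + 1)]
    # m_k = dim Ker N^k - dim Ker N^{k-1} = rank N^{k-1} - rank N^k
    m = [ranks[j - 1] - ranks[j] for j in range(1, n + 1) if ranks[j - 1] > ranks[j]]
    # Transposing the partition m_1 >= m_2 >= ... recovers the block sizes.
    blocks = [sum(1 for mk in m if mk >= j) for j in range(1, max(m) + 1)]
    assert blocks == [4, 2]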

EXERCISES

392. Let V_n ⊂ K[x] be the space of all polynomials of degree < n. Prove that the differentiation d/dx : V_n → V_n is a regular nilpotent operator.

393. Find all matrices commuting with a regular nilpotent one.

394. Is there an n×n-matrix A such that A^2 ≠ 0 but A^3 = 0: (a) if n = 2? (b) if n = 3?

395. Classify similarity classes of nilpotent 4×4-matrices.

396. An operator is called unipotent if it differs from the identity by a nilpotent operator. Prove that the number of similarity classes of unipotent n×n-matrices is equal to the number of partitions of n, and find this number for n = 5.

The Jordan Canonical Form Theorem

We proceed to the classification of linear operators A : C^n → C^n up to similarity transformations.

Theorem. Every complex matrix is similar to a block-diagonal normal form with each diagonal block of the form:

[ λ_0  1   0  ...  0  ]
[ 0   λ_0  1  ...  0  ]
[         ...         ]
[ 0    0  ... λ_0  1  ]
[ 0    0  ...  0  λ_0 ],   λ_0 ∈ C,

and such a normal form is unique up to permutations of the blocks.

The block-diagonal matrices described in the theorem are called Jordan canonical forms (or Jordan normal forms). Their diagonal blocks are called Jordan cells.

It is instructive to analyze a Jordan canonical form before going into the proof of the theorem. The characteristic polynomial of a Jordan cell is (λ − λ_0)^m, where m is the size of the cell. The characteristic polynomial of a block-diagonal matrix is equal to the product of the characteristic polynomials of the diagonal blocks. Therefore the characteristic polynomial of the whole Jordan canonical form is the product of factors (λ − λ_i)^{m_i}, one per Jordan cell. Thus the diagonal entries of Jordan cells are roots of the characteristic polynomial. After subtracting the scalar matrix λ_0 I, Jordan cells with λ_i = λ_0 (and only these cells) become nilpotent. Therefore the root space W_{λ_0} is exactly the direct sum of those subspaces on which the Jordan cells with λ_i = λ_0 operate.

Proof of Theorem. Everything we need has already been established in the previous two subsections.

Thanks to the Fundamental Theorem of Algebra, the characteristic polynomial det(λI − A) of a complex n×n-matrix A factors into the product of powers of distinct linear factors: (λ − λ_1)^{n_1} ... (λ − λ_r)^{n_r}. According to Corollary 2 of Lemma, the space C^n is decomposed in a canonical fashion into the direct sum W_{λ_1} ⊕ ... ⊕ W_{λ_r} of A-invariant root subspaces. On each root subspace W_{λ_i}, the operator A − λ_i I is nilpotent. According to Proposition, the root space W_{λ_i} is represented (in a non-canonical fashion) as the direct sum of invariant subspaces on each of which A − λ_i I acts as a regular nilpotent operator. Since the scalar operator λ_i I leaves every subspace invariant, this means that W_{λ_i} is decomposed into the direct sum of A-invariant subspaces, on each of which A acts as a Jordan cell with the eigenvalue λ_0 = λ_i. Thus, the existence of a basis in which A is described by a Jordan normal form is established.

To prove uniqueness, note that the root spaces W_{λ_i} are intrinsically determined by the operator A, and the partition of dim W_{λ_i} into the sizes of the Jordan cells with the eigenvalue λ_i is uniquely determined, according to Corollary 2 of Proposition, by the geometry of the operator A − λ_i I, nilpotent on W_{λ_i}. Therefore the exact structure of the Jordan normal form of A (i.e. the numbers and sizes of Jordan cells for each of the eigenvalues λ_i) is uniquely determined by A, and only the ordering of the diagonal blocks remains ambiguous. □

Corollary 1. The dimensions of the root spaces W_{λ_i} coincide with the multiplicities of λ_i as roots of the characteristic polynomial.

Corollary 2. If the characteristic polynomial of a complex matrix has only simple roots, then the matrix is diagonalizable, i.e. is similar to a diagonal matrix.

Corollary 3. Every operator A : C^n → C^n in a suitable basis is described by the sum D + N of two commuting matrices, of which D is diagonal, and N strictly upper triangular.

Corollary 4. Every operator on a complex vector space of finite dimension can be represented as the sum D + N of two commuting operators, of which D is diagonalizable and N nilpotent.

Remark. We used that K = C only to factor the characteristic polynomial of the matrix A into linear factors. Therefore the same results hold true over any field K such that all non-constant polynomials from K[λ] factor into linear factors. Such fields are called algebraically closed. In fact (see [8]) every field K is contained in an algebraically closed field. Thus every linear operator A : K^n → K^n can be brought to a Jordan normal form by transformations A ↦ C^{−1}AC, where however the entries of C and the scalars λ_0 in the Jordan cells may belong to a larger field F ⊃ K.^8 We will see how this works when K = R and F = C.

EXERCISES

397. Find Jordan normal forms of the following matrices: �

(a)
[  1   2   0 ]
[  0   2   0 ]
[ −2  −2   1 ]

(b)
[  4   6   0 ]
[ −3  −5   0 ]
[ −3  −6   1 ]

(c)
[ 13  16  16 ]
[ −5  −7  −6 ]
[ −6  −8  −7 ]

(d)
[  3   0   8 ]
[  3  −1  −6 ]
[ −2   0  −5 ]

(e)
[ −4   2  10 ]
[ −4   3   7 ]
[ −3   1   7 ]

(f)
[  7 −12  −2 ]
[  3  −4   0 ]
[ −2   0   2 ]

(g)
[ −2   8   6 ]
[ −4  10   6 ]
[  4  −8  −4 ]

(h)
[  0   3   3 ]
[ −1   8   6 ]
[  2 −14 −10 ]

(i)
[  1   1  −1 ]
[ −3  −3   3 ]
[ −2  −2   2 ]

(j)
[  1  −1   2 ]
[  3  −3   6 ]
[  2  −2   4 ]

(k)
[ −1   1   1 ]
[ −5  21  17 ]
[  6 −26 −21 ]

(l)
[  3   7  −3 ]
[ −2  −5   2 ]
[ −4 −10   3 ]

(m)
[  8  30 −14 ]
[ −6 −19   9 ]
[ −6 −23  11 ]

(n)
[  9  22  −6 ]
[ −1  −4   1 ]
[  8  16  −5 ]

(o)
[  4   5  −2 ]
[ −2  −2   1 ]
[ −1  −1   1 ]

398. Compute powers of Jordan cells. �

399. Prove that if some power of a complex matrix is the identity, then the matrix is diagonalizable.

8 For this, F does not have to be algebraically closed, but only needs to contain all roots of det(λI − A).


400. Prove that every square matrix is similar to its transpose.

401. Prove that tr A = ∑ λ_i and det A = ∏ λ_i, where λ_1, ..., λ_n are all the roots of the characteristic polynomial (repeated according to their multiplicities).

402. Prove that a square matrix satisfies its own characteristic equation; namely, if p denotes the characteristic polynomial of a matrix A, then p(A) = 0. (This identity is called the Hamilton–Cayley equation.)

The Real Case

Let A : R^n → R^n be an R-linear operator. It acts^9 on the complexification C^n of the real space, and commutes with the complex conjugation operator σ : x + iy ↦ x − iy.

The characteristic polynomial det(λI − A) has real coefficients, but its roots λ_i can be either real or come in pairs of complex conjugate roots (of the same multiplicity). Consequently, the complex root spaces W_{λ_i}, which are defined as null spaces in C^n of sufficiently high powers of A − λ_i I, come in two types. If λ_i is real, then the root space is real in the sense that it is σ-invariant, and thus is the complexification of the real root space W_{λ_i} ∩ R^n. If λ_i is not real, and λ̄_i is its complex conjugate, then W_{λ_i} and W_{λ̄_i} are different root spaces of A, but they are transformed into each other by σ. Indeed, σA = Aσ, and σλ_i = λ̄_i σ. Therefore, if z ∈ W_{λ_i}, i.e. (A − λ_i I)^d z = 0 for some d, then 0 = σ(A − λ_i I)^d z = (A − λ̄_i I)^d σz, and hence σz ∈ W_{λ̄_i}.

This allows one to obtain the following improvement for the Jordan Canonical Form Theorem applied to real matrices.

Theorem. A real linear operator A : R^n → R^n can be represented by a matrix in Jordan normal form with respect to a basis of the complexified space C^n invariant under complex conjugation.

Proof. In the process of constructing bases in the W_{λ_i} in which A has a Jordan normal form, we can use the following procedure. When λ_i is real, we take the real root space W_{λ_i} ∩ R^n and take in it a real basis in which the matrix of A − λ_i I is block-diagonal with regular nilpotent blocks. This is possible due to Proposition applied to the case K = R. This real basis then serves as a σ-invariant complex basis in the complex root space W_{λ_i}. When W_{λ_i} and W_{λ̄_i} form a pair of complex conjugate root spaces, we take a required basis in one of them, and then apply σ to obtain such a basis in the other. Taken together, the bases form a σ-invariant set of vectors. □

9 Strictly speaking, it is the complexification A_C of A that acts on C^n = (R^n)_C, but we will denote it by the same letter A.

Of course, for each Jordan cell with a non-real eigenvalue λ_0, there is another Jordan cell of the same size with the eigenvalue λ̄_0. Moreover, if e_1, ..., e_m is the basis in the A-invariant subspace of the first cell, i.e. Ae_k = λ_0 e_k + e_{k−1}, k = 2, ..., m, and Ae_1 = λ_0 e_1, then the A-invariant subspace corresponding to the other cell comes with the complex conjugate basis ē_1, ..., ē_m, where ē_k = σ e_k. The direct sum U := Span(e_1, ..., e_m, ē_1, ..., ē_m) of the two subspaces is both A- and σ-invariant and thus is the complexification of the real A-invariant subspace U ∩ R^n. We use this to describe a real normal form for the action of A on this subspace.

Namely, let λ̄_0 = α − iβ, and write each basis vector e_k in terms of its real and imaginary parts: e_k = u_k − i v_k. Then the real vectors u_k and v_k form a real basis in the real part of the complex 2-dimensional space spanned by e_k and ē_k = u_k + i v_k. Thus, we obtain a basis u_1, v_1, u_2, v_2, ..., u_m, v_m in the subspace U ∩ R^n. The action of A on this basis is found from the formulas:

A u_1 + i A v_1 = (α − iβ)(u_1 + i v_1) = (α u_1 + β v_1) + i(−β u_1 + α v_1),
A u_k + i A v_k = (α u_k + β v_k + u_{k−1}) + i(−β u_k + α v_k + v_{k−1}),   k > 1.

Corollary 1. A linear operator A : R^n → R^n is represented in a suitable basis by a block-diagonal matrix with diagonal blocks that are either Jordan cells with real eigenvalues, or have the form (β ≠ 0):

[ α −β  1  0  0 ...  0  0 ]
[ β  α  0  1  0 ...  0  0 ]
[ 0  0  α −β  1  0 ...  0 ]
[ 0  0  β  α  0  1 ...  0 ]
[           ...           ]
[ 0  ... 0  α −β  1  0    ]
[ 0  ... 0  β  α  0  1    ]
[ 0  0 ...  0  0  α −β    ]
[ 0  0 ...  0  0  β  α    ].
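As a numerical illustration of how such a 2×2 block arises (a minimal sketch in Python with NumPy; the matrix A below is an arbitrary sample with non-real eigenvalues α ± iβ): taking an eigenvector e = u − iv for the eigenvalue α + iβ and passing to the real basis u, v produces exactly the block with rows (α, −β) and (β, α).

    import numpy as np

    A = np.array([[1.0, -2.0],
                  [5.0,  3.0]])                    # sample matrix, eigenvalues 2 +- 3i
    eigvals, eigvecs = np.linalg.eig(A)
    i = np.argmax(eigvals.imag)                    # pick alpha + i beta with beta > 0
    lam, e = eigvals[i], eigvecs[:, i]
    alpha, beta = lam.real, lam.imag
    u, v = e.real, -e.imag                         # e = u - i v
    C = np.column_stack([u, v])                    # change of basis to u, v
    block = np.array([[alpha, -beta],
                      [beta,  alpha]])
    assert np.allclose(np.linalg.inv(C) @ A @ C, block)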

Corollary 2. If two real matrices are related by a complex similarity transformation, then they are related by a real similarity transformation.

Remark. The proof meant here is that if two real matrices are similar over C, then they have the same Jordan normal form, and thus they are similar over R to the same real matrix, as described in Corollary 1. However, Corollary 2 can be proved directly, without a reference to the Jordan Canonical Form Theorem. Namely, if two real matrices, A and A′, are related by a complex similarity transformation A′ = C^{−1}AC, we can rewrite this as CA′ = AC, and taking C = B + iD, where B and D are real, obtain: BA′ = AB and DA′ = AD. The problem now is that neither B nor D is guaranteed to be invertible. Yet, there must exist an invertible linear combination E = λB + µD, for if the polynomial det(λB + µD) of λ and µ vanished identically, then det(B + iD) = 0 too. For invertible E, we have EA′ = AE and hence A′ = E^{−1}AE.

EXERCISES

403. Classify all linear operators in R2 up to linear changes of coordinates.

404. Consider real traceless 2×2-matrices

[ a  b ]
[ c −a ]

as points in the 3-dimensional space with coordinates a, b, c. Sketch the partition of this space into similarity classes.


4 Linear Dynamical Systems

We consider here applications of Jordan Canonical Forms to dynamical systems, i.e. models of evolutionary processes. An evolutionary system can be described mathematically as a set, X, of states of the system, and a collection of maps g^{t_2}_{t_1} : X → X, which describe the evolution of the states from the time moment t_1 to the time moment t_2, where the composition of g^{t_3}_{t_2} with g^{t_2}_{t_1} is required to coincide with g^{t_3}_{t_1}.

One says that the dynamical system is time-independent (or stationary) if the maps g^{t_2}_{t_1} depend only on the time increment t = t_2 − t_1, i.e. g^{t_2}_{t_1} = g^{t_2−t_1}_0. Omitting the subscript 0, we then have a family of maps g^t : X → X satisfying g^t g^u = g^{t+u}.

We will consider time-independent linear dynamical systems, i.e. systems where the states form a vector space, V, and the evolution maps are linear, and examine both cases: discrete time and continuous time. In the former case, time takes on integer values t ∈ Z, and the evolution is described by iterations of an invertible linear map G : V → V. Namely, if the state of the system at t = 0 is x(0) ∈ V, then the state of the system at the moment n ∈ Z is x(n) = G^n x(0). In the case of continuous time, a time-independent linear dynamical system is described by a system of constant coefficient linear ordinary differential equations (ODE for short): ẋ = Ax, and the evolution maps are found by solving the system.

We begin with the case of discrete time.

Iterations of Linear Maps

Example 1: Fibonacci numbers. The sequence of integers

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...,

formed by adding the previous two terms to obtain the next one, is well-known under the name of the Fibonacci sequence.^{10} It is the solution of the 2nd order linear recursion relation f_{n+1} = f_n + f_{n−1}, satisfying the initial conditions f_0 = 0, f_1 = 1. The sequence can be recast as a trajectory of a linear dynamical system on the plane as follows. Combine the pair f_n, f_{n+1} of consecutive terms of the sequence into a 2-column (x_n, y_n)^t and express the next pair as

10After Leonardo Fibonacci (c. 1170 – c. 1250).


a function of the previous one:

[ x_{n+1} ]   [ 0 1 ] [ x_n ]                 [ x_n ]    [ f_n     ]
[ y_{n+1} ] = [ 1 1 ] [ y_n ] ,   where       [ y_n ] := [ f_{n+1} ] .

We will now derive the general formula for the numbers f_n (and for all solutions to the recursion relation).

The characteristic polynomial of the matrix

G := [ 0 1 ]
     [ 1 1 ]

is λ^2 − λ − 1. It has two roots λ_± = (1 ± √5)/2. The corresponding eigenvectors can be taken in the form v_± = (1, λ_±)^t. Let v(0) be the vector of initial conditions (it is (0, 1)^t for the Fibonacci sequence), and let it be written as a linear combination of the eigenvectors: v(0) = C_+ v_+ + C_− v_−. Then v(n) := G^n v(0) = C_+ λ_+^n v_+ + C_− λ_−^n v_−.

Taking the first component of this vector equality, we obtain the general formula, depending on two arbitrary constants C_±, for solutions of the recursion relation x_{n+1} = x_n + x_{n−1}:

x_n = C_+ ((1 + √5)/2)^n + C_− ((1 − √5)/2)^n.

For the Fibonacci sequence per se, from λ_+ C_+ + λ_− C_− = 1, C_+ + C_− = 0, we find C_± = ±1/√5, and hence

f_n = (1/√5) [ ((1 + √5)/2)^n − ((1 − √5)/2)^n ].

Note that |λ_−| < 1 and |λ_+| > 1. Consequently, as n increases indefinitely, the second summand tends to 0, and the first to infinity. Thus asymptotically (for large n) f_n ≈ [(1 + √5)/2]^n / √5, while the consecutive ratios f_{n+1}/f_n tend to λ_+ = (1 + √5)/2, known as the golden ratio. In fact the fractions 3/2, 5/3, 8/5, 13/8, 21/13, 34/21, ..., i.e. f_{n+1}/f_n, are the best possible rational approximations to the golden ratio with denominators not exceeding f_n.^{11}

11 The golden ratio often occurs in Nature. For example, on a pine cone, or a pineapple fruit, count the numbers of spirals (formed by the scales near the butt) going in the clockwise and in the counter-clockwise directions. Most likely you'll find two consecutive Fibonacci numbers: 5 and 8, or 8 and 13. It is not surprising that such observations led our ancestors to mystical beliefs in magical properties of the golden ratio. In fact we shouldn't judge them too harshly: in spite of some plausible scientific models, the causes for the occurrences of Fibonacci numbers and the golden ratio in various plants are still not entirely clear.
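As a quick computational check of the closed formula (a minimal sketch in Python with NumPy; the range of n is arbitrary), one can compare the first component of G^n applied to the initial vector (0, 1)^t with the expression above:

    import numpy as np

    G = np.array([[0, 1],
                  [1, 1]])
    phi, psi = (1 + 5 ** 0.5) / 2, (1 - 5 ** 0.5) / 2
    for n in range(1, 30):
        f_n = (np.linalg.matrix_power(G, n) @ np.array([0, 1]))[0]   # f_n from G^n v(0)
        assert f_n == round((phi ** n - psi ** n) / 5 ** 0.5)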


In general, an (invertible) linear map G : R^m → R^m defines a linear dynamical system: given the initial state v(0), the state at the discrete time moment n ∈ Z is defined by v(n) = G^n v(0). In order to efficiently compute the state v(n), we therefore need to compute powers of a linear map.

According to the general theory, there exists an invertible complex matrix C such that G = CJC^{−1}, where J is one of the Jordan canonical forms. Therefore

G^n = (CJC^{−1})(CJC^{−1}) ... (CJC^{−1}) = CJ^nC^{−1},

and the problem reduces to that for J. Recall that J = D + N, the sum of a diagonal matrix D (whose diagonal entries are the eigenvalues of G) and a nilpotent Jordan matrix N (in particular, N^m = 0), commuting with D. Thus, by the binomial formula, we have a finite sum:

J^n = (D + N)^n = D^n + nD^{n−1}N + \binom{n}{2} D^{n−2}N^2 + \binom{n}{3} D^{n−3}N^3 + ...

Example 2: Powers of Jordan cells. In fact, to find explicitly the powers J^n it suffices to do this for each Jordan cell separately:

[ λ 1 0 0 ... ]^n     [ λ^n   nλ^{n−1}   \binom{n}{2}λ^{n−2}   \binom{n}{3}λ^{n−3}  ... ]
[ 0 λ 1 0 ... ]       [ 0     λ^n        nλ^{n−1}              \binom{n}{2}λ^{n−2}  ... ]
[ 0 0 λ 1 ... ]   =   [ 0     0          λ^n                   nλ^{n−1}             ... ]
[ 0 0 0 λ ... ]       [ 0     0          0                     λ^n                  ... ]
[     ...     ]       [                         ...                                     ]

Example 3. Let us find the general formula for solutions of the recursive sequence a_{n+1} = 4a_n − 6a_{n−1} + 4a_{n−2} − a_{n−3}. Put v(n) = (a_{n−3}, a_{n−2}, a_{n−1}, a_n)^t. Then v(n + 1) = Gv(n), where

G = [  0  1  0  0 ]
    [  0  0  1  0 ]
    [  0  0  0  1 ]
    [ −1  4 −6  4 ].

The reader is asked to check that

det(λI − G) = λ^4 − 4λ^3 + 6λ^2 − 4λ + 1 = (λ − 1)^4.

Therefore G has only one eigenvalue, λ = 1, of multiplicity 4. Besides, the rank of I − G is at least 3 due to the 3×3 identity matrix which occurs in the right upper corner of G. This shows that the eigenspace has dimension 1, and thus the Jordan canonical form of G has only one Jordan cell, J, of size 4, with the eigenvalue 1. Examining the matrix entries of J^n from Example 2 (with λ = 1, m = 4), we see that they all have the form \binom{n}{k}, where k = 0, 1, 2, 3. Note that as functions of n these binomial coefficients form a basis in the space of polynomials of the form C_0 + C_1 n + C_2 n^2 + C_3 n^3. We conclude that the matrix entries of G^n = CJ^nC^{−1} must be, as functions of n, polynomials of degree ≤ 3. In particular, a_n must be a polynomial of n of degree ≤ 3. Since solution sequences {a_n} must depend on 4 arbitrary parameters a_0, a_1, a_2, a_3, and thus form a 4-dimensional space (the same as the space of polynomials of degree ≤ 3), we finally conclude that the general solution to the recursion equation has the form a_n = C_0 + C_1 n + C_2 n^2 + C_3 n^3, where the C_i are arbitrary constants. Given specific values of a_0, a_1, a_2, a_3, one can find the corresponding values of C_0, C_1, C_2, C_3 by plugging in n = 0, 1, 2, 3, and solving the system of 4 linear equations.

As follows from Example 2 (and is illustrated by Examples 1 and 3), for any discrete dynamical system v(n + 1) = Gv(n), the components of the vector trajectories v(n) = G^n v(0), as functions of n, are linear combinations of the monomials λ_i^n n^{k−1}, where λ_i is a root of the characteristic polynomial of the operator G, and k does not exceed the multiplicity of this root. In particular, if all roots are simple, the general solution has the form v(n) = ∑_{i=1}^m C_i λ_i^n v_i, where v_i is an eigenvector corresponding to the eigenvalue λ_i, and the C_i are arbitrary constants.

EXERCISES

405. Find the general formula for the sequence a_n such that a_0 = 2, a_1 = 1, and a_{n+1} = a_n + a_{n−1} for all n ≥ 1.

406. Find all solutions to the recursion equation a_{n+1} = 5a_n + 6a_{n−1}.

407. For the recursion equation a_{n+1} = 6a_n − 9a_{n−1}, express the general solution as a function of n, a_0 and a_1.

408. Given two sequences {x_n}, {y_n} satisfying x_{n+1} = 4x_n + 3y_n, y_{n+1} = 3x_n + 2y_n, and such that x_0 = y_0 = 13, find the limit of the ratio x_n/y_n as n tends to: (a) +∞; (b) −∞.

409. Prove that for a given G : V → V, all trajectories {x(n)} of the dynamical system x(n + 1) = Gx(n) form a linear subspace of dimension dim V in the space of all vector-valued sequences.

410.⋆ For integer n > 0, prove that ((3 + √17)/2)^n + ((3 − √17)/2)^n is an odd integer.


Linear ODE Systems

Let

ẋ_1 = a_{11}x_1 + ... + a_{1n}x_n
          ...
ẋ_n = a_{n1}x_1 + ... + a_{nn}x_n

be a linear homogeneous system of ordinary differential equations with constant (possibly complex) coefficients a_{ij}. It can be written in the matrix form as ẋ = Ax.

Consider the infinite matrix series

e^{tA} := I + tA + t^2A^2/2 + t^3A^3/6 + ... + t^kA^k/k! + ...

If M is an upper bound for the absolute values of the entries of A, then the matrix entries of t^kA^k/k! are bounded by n^k t^k M^k/k!. It is easy to deduce from this that the series converges (at least as fast as the series for e^{ntM}).

Proposition. The solution to the system ẋ = Ax with the initial condition x(0) is given by the formula x(t) = e^{tA}x(0).

Proof. Differentiating the series ∑_{k=0}^∞ t^kA^k/k!, we find

(d/dt) e^{tA} = ∑_{k=1}^∞ t^{k−1}A^k/(k − 1)! = ∑_{k=0}^∞ t^kA^{k+1}/k! = Ae^{tA},

and hence (d/dt) e^{tA}x(0) = A(e^{tA}x(0)). Thus x(t) satisfies the ODE system. At t = 0 we have e^{0A}x(0) = Ix(0) = x(0), and therefore the initial condition is also satisfied. □

Remark. This Proposition defines the exponential function A ↦ e^A of a matrix (or of a linear operator). The exponential function satisfies e^A e^B = e^{A+B} provided that A and B commute. This can be proved the same way as is done in the supplement Complex Numbers for the exponential function of a complex argument.
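Here is a small numerical check of the Proposition (a minimal sketch in Python with NumPy/SciPy; the matrix, the initial vector, and the time are random or arbitrary samples): the derivative of e^{tA}x(0), estimated by a central difference, agrees with A e^{tA}x(0).

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3))
    x0 = rng.standard_normal(3)
    x = lambda s: expm(s * A) @ x0                 # candidate solution x(s)
    t, h = 0.8, 1e-5
    derivative = (x(t + h) - x(t - h)) / (2 * h)   # central difference in t
    assert np.allclose(derivative, A @ x(t), atol=1e-6)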

The Proposition reduces the problem of solving the ODE system ẋ = Ax to the computation of the exponential function e^{tA} of a matrix. Notice that if A = CBC^{−1}, then

A^k = CBC^{−1}CBC^{−1}CBC^{−1}... = CB B ... B C^{−1} = CB^kC^{−1},

and therefore exp(tA) = C exp(tB) C^{−1}. This observation reduces the computation of e^{tA} to that of e^{tB}, where the Jordan normal form of A can be taken in the role of B.

Example 4. Let Λ be a diagonal matrix with the diagonal entries λ_1, ..., λ_n. Then Λ^k is a diagonal matrix with the diagonal entries λ_1^k, ..., λ_n^k, and hence

e^{tΛ} = I + tΛ + (t^2/2)Λ^2 + ... = [ e^{λ_1 t}       0     ]
                                      [        ...            ]
                                      [     0       e^{λ_n t} ].

Example 5. Let N be a nilpotent Jordan cell of size m. We have (for m = 4):

N = [ 0 1 0 0 ]    N^2 = [ 0 0 1 0 ]    N^3 = [ 0 0 0 1 ]
    [ 0 0 1 0 ]          [ 0 0 0 1 ]          [ 0 0 0 0 ]
    [ 0 0 0 1 ]          [ 0 0 0 0 ]          [ 0 0 0 0 ]
    [ 0 0 0 0 ]          [ 0 0 0 0 ]          [ 0 0 0 0 ]

and N^4 = 0. Generalizing to arbitrary m, we find:

e^{tN} = I + tN + (t^2/2)N^2 + (t^3/6)N^3 + ... = [ 1  t  t^2/2  ...  t^{m−1}/(m−1)! ]
                                                   [ 0  1  t      t^2/2  ...         ]
                                                   [ 0  0  1      t      ...         ]
                                                   [            ...                  ]
                                                   [               ...   0      1    ].

Let λI + N be the Jordan cell of size m with the eigenvalue λ. Then e^{t(λI+N)} = e^{tλI}e^{tN} = e^{λt}e^{tN}. Here we use the multiplicative property of the matrix exponential function, valid since I and N commute.
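A quick check of this formula for a single Jordan cell (a minimal sketch in Python with NumPy/SciPy; the eigenvalue, cell size, and time are arbitrary sample values):

    import numpy as np
    from scipy.linalg import expm
    from math import factorial

    lam, m, t = -0.5, 4, 1.3                       # arbitrary sample values
    N = np.eye(m, k=1)                             # nilpotent Jordan cell
    closed_form = np.zeros((m, m))
    for i in range(m):
        for j in range(i, m):
            closed_form[i, j] = t ** (j - i) / factorial(j - i)   # entries of e^{tN}
    closed_form *= np.exp(lam * t)                 # e^{t(lam I + N)} = e^{lam t} e^{tN}
    assert np.allclose(expm(t * (lam * np.eye(m) + N)), closed_form)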

Finally, let

A = [ B 0 ]
    [ 0 D ]

be a block-diagonal square matrix. Then

A^k = [ B^k   0  ]          e^{tA} = [ e^{tB}    0    ]
      [  0   D^k ]    and            [   0    e^{tD}  ].

Together with Examples 4 and 5, this shows how to compute the exponential function e^{tJ} for any Jordan normal form J: each Jordan cell has the form λI + N and should be replaced by e^{λt}e^{tN}. Since any square matrix A can be reduced to one of the Jordan normal matrices J by similarity transformations, we conclude that e^{tA} = Ce^{tJ}C^{−1} with a suitable invertible C.

Applying Example 4 to ODE systems ẋ = Ax, we arrive at the following conclusion. Suppose that the characteristic polynomial of A has n distinct roots λ_1, ..., λ_n. Let C be the matrix whose columns are the corresponding n complex eigenvectors v_1, ..., v_n. Then the solution with the initial condition x(0) is given by the formula

x(t) = C [ e^{λ_1 t}       0     ]
         [        ...            ]  C^{−1} x(0).
         [     0       e^{λ_n t} ]

Notice that C^{−1}x(0) here (as well as x(0)) is a column c = (c_1, ..., c_n)^t of arbitrary constants, while the columns of Ce^{tΛ} are e^{λ_i t}v_i. We conclude that the general solution formula reads

x(t) = c_1 e^{λ_1 t} v_1 + ... + c_n e^{λ_n t} v_n.

This formula involves the eigenvalues λ_i of A, the corresponding eigenvectors v_i, and the arbitrary complex constants c_i, to be found from the system of linear equations x(0) = c_1 v_1 + ... + c_n v_n.

Example 6. The ODE system

ẋ_1 = 2x_1 + x_2
ẋ_2 = x_1 + 3x_2 − x_3
ẋ_3 = −x_1 + 2x_2 + 3x_3

has the matrix

A = [  2  1  0 ]
    [  1  3 −1 ]
    [ −1  2  3 ].

The characteristic polynomial of A is λ^3 − 8λ^2 + 22λ − 20. It has a real root λ_0 = 2, and factors as (λ − 2)(λ^2 − 6λ + 10). Thus it has two complex conjugate roots λ_± = 3 ± i. The eigenvectors v_0 = (1, 0, 1)^t and v_± = (1, 1 ± i, 2 ∓ i)^t are found from the systems of linear equations:

2x_1 + x_2 = 2x_1                          2x_1 + x_2 = (3 ± i)x_1
x_1 + 3x_2 − x_3 = 2x_2        and         x_1 + 3x_2 − x_3 = (3 ± i)x_2
−x_1 + 2x_2 + 3x_3 = 2x_3                  −x_1 + 2x_2 + 3x_3 = (3 ± i)x_3.

Thus the general complex solution is a linear combination of the solutions

e^{2t} (1, 0, 1)^t,   e^{(3+i)t} (1, 1 + i, 2 − i)^t,   e^{(3−i)t} (1, 1 − i, 2 + i)^t

with arbitrary complex coefficients. Real solutions can be extracted from the complex ones by taking their real and imaginary parts:

e^{3t} (cos t, cos t − sin t, 2 cos t + sin t)^t,   e^{3t} (sin t, cos t + sin t, 2 sin t − cos t)^t.

Thus the general real solution is described by the formulas

x_1(t) = c_1 e^{2t} + c_2 e^{3t} cos t + c_3 e^{3t} sin t
x_2(t) = c_2 e^{3t}(cos t − sin t) + c_3 e^{3t}(cos t + sin t)
x_3(t) = c_1 e^{2t} + c_2 e^{3t}(2 cos t + sin t) + c_3 e^{3t}(2 sin t − cos t),

where c_1, c_2, c_3 are arbitrary real constants. At t = 0 we have

x_1(0) = c_1 + c_2,   x_2(0) = c_2 + c_3,   x_3(0) = c_1 + 2c_2 − c_3.

Given the initial values (x_1(0), x_2(0), x_3(0)), the corresponding constants c_1, c_2, c_3 can be found from this system of linear algebraic equations.
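The eigenvalue and eigenvector computations of this example are easy to double-check numerically (a minimal sketch in Python with NumPy):

    import numpy as np

    A = np.array([[ 2, 1,  0],
                  [ 1, 3, -1],
                  [-1, 2,  3]], dtype=float)
    # The spectrum is {2, 3 - i, 3 + i}, and the eigenvectors from the text fit it.
    assert np.allclose(np.sort_complex(np.linalg.eigvals(A)), [2, 3 - 1j, 3 + 1j])
    v0 = np.array([1, 0, 1])
    vp = np.array([1, 1 + 1j, 2 - 1j])
    assert np.allclose(A @ v0, 2 * v0)
    assert np.allclose(A @ vp, (3 + 1j) * vp)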

In general, even if the characteristic polynomial of A has multiple roots, our theory (and Examples 4, 5) shows that all components of solutions to the ODE system ẋ = Ax are expressible as linear combinations of functions t^k e^{λt}, t^k e^{at} cos bt, t^k e^{at} sin bt, where λ are real eigenvalues, a ± ib are complex eigenvalues, and k = 0, 1, 2, ... is to be smaller than the multiplicity of the corresponding eigenvalue. This observation suggests approaching ODE systems with multiple eigenvalues in the following way, avoiding an explicit similarity transformation to the Jordan normal form: look for the general solution in the form of linear combinations of these functions with arbitrary coefficients by substituting them into the equations, and find the relations between the arbitrary constants from the resulting system of linear algebraic equations.

Example 7. The ODE system

ẋ_1 = 2x_1 + x_2 + x_3
ẋ_2 = −3x_1 − 2x_2 − 3x_3
ẋ_3 = 2x_1 + 2x_2 + 3x_3

has the matrix

A = [  2  1  1 ]
    [ −3 −2 −3 ]
    [  2  2  3 ]

with the characteristic polynomial λ^3 − 3λ^2 + 3λ − 1 = (λ − 1)^3. Thus we can look for solutions in the form

x_1 = e^t(a_1 + b_1 t + c_1 t^2),   x_2 = e^t(a_2 + b_2 t + c_2 t^2),   x_3 = e^t(a_3 + b_3 t + c_3 t^2).


Substituting into the ODE system, and introducing the notation ∑a_i =: A, ∑b_i =: B, ∑c_i =: C, we get (omitting the factor e^t):

(a_1 + b_1) + (b_1 + 2c_1)t + c_1 t^2 = (a_1 + A) + (b_1 + B)t + (c_1 + C)t^2,
(a_2 + b_2) + (b_2 + 2c_2)t + c_2 t^2 = −(3A − a_2) − (3B − b_2)t − (3C − c_2)t^2,
(a_3 + b_3) + (b_3 + 2c_3)t + c_3 t^2 = (2A + a_3) + (2B + b_3)t + (2C + c_3)t^2.

Introducing the notation P := A + Bt + Ct^2, we rewrite the system of 9 linear equations in 9 unknowns in the form

b_1 + 2c_1 t = P,   b_2 + 2c_2 t = −3P,   b_3 + 2c_3 t = 2P.

This yields b_1 = A, b_2 = −3A, b_3 = 2A, and (since B = A − 3A + 2A = 0) we find c_1 = c_2 = c_3 = 0. Thus, the general solution, depending on 3 arbitrary constants a_1, a_2, a_3, reads:

x_1 = e^t(a_1 + (a_1 + a_2 + a_3)t)
x_2 = e^t(a_2 − 3(a_1 + a_2 + a_3)t)
x_3 = e^t(a_3 + 2(a_1 + a_2 + a_3)t).

In particular, since t^2 does not occur in the formulas, we can conclude that the Jordan form of our matrix has only Jordan cells of size 1 or 2 (and hence one of each, for the total size must be 3):

J = [ 1 1 0 ]
    [ 0 1 0 ]
    [ 0 0 1 ].
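This conclusion about the Jordan form can also be verified directly (a minimal sketch in Python with NumPy): the matrix A − I has rank 1, so the eigenspace of the eigenvalue 1 is 2-dimensional (two cells), and (A − I)^2 = 0, so no cell is larger than 2×2.

    import numpy as np

    A = np.array([[ 2,  1,  1],
                  [-3, -2, -3],
                  [ 2,  2,  3]], dtype=float)
    assert np.allclose(np.poly(A), [1, -3, 3, -1])   # (lambda - 1)^3
    B = A - np.eye(3)
    assert np.linalg.matrix_rank(B) == 1             # two Jordan cells: 3 - 1 = 2
    assert np.allclose(B @ B, np.zeros((3, 3)))      # all cells of size <= 2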

EXERCISES

411.⋆ Give an example of (non-commuting) 2×2 matrices A and B for which e^A e^B ≠ e^{A+B}.

412. Solve the following ODE systems. Find the solution satisfying the initial condition x_1(0) = 1, x_2(0) = 0, x_3(0) = 0:

(a) λ_1 = 1:   ẋ_1 = 3x_1 − x_2 + x_3,   ẋ_2 = x_1 + x_2 + x_3,   ẋ_3 = 4x_1 − x_2 + 4x_3;

(b) λ_1 = 1:   ẋ_1 = −3x_1 + 4x_2 − 2x_3,   ẋ_2 = x_1 + x_3,   ẋ_3 = 6x_1 − 6x_2 + 5x_3;

(c) λ_1 = 1:   ẋ_1 = x_1 − x_2 − x_3,   ẋ_2 = x_1 + x_2,   ẋ_3 = 3x_1 + x_3;

(d) λ_1 = 2:   ẋ_1 = 4x_1 − x_2 − x_3,   ẋ_2 = x_1 + 2x_2 − x_3,   ẋ_3 = x_1 − x_2 + 2x_3;

(e) λ_1 = 1:   ẋ_1 = −x_1 + x_2 − 2x_3,   ẋ_2 = 4x_1 + x_2,   ẋ_3 = 2x_1 + x_2 − x_3;

(f) λ_1 = 2:   ẋ_1 = 4x_1 − x_2,   ẋ_2 = 3x_1 + x_2 − x_3,   ẋ_3 = x_1 + x_3;

(g) λ_1 = 1:   ẋ_1 = 2x_1 − x_2 − x_3,   ẋ_2 = 2x_1 − x_2 − 2x_3,   ẋ_3 = −x_1 + x_2 + 2x_3.

413. Find the Jordan canonical forms of the 3×3-matrices of the ODE systems (a–g) from the previous exercise.

414.⋆ Prove the following identity: det e^A = e^{tr A}.

Higher Order Linear ODE

A linear homogeneous constant coefficient n-th order ODE

d^n x/dt^n + a_1 d^{n−1}x/dt^{n−1} + ... + a_{n−1} dx/dt + a_n x = 0

can be rewritten, by introducing the notations

x_1 := x, x_2 := ẋ, x_3 := ẍ, ..., x_n := d^{n−1}x/dt^{n−1},

as a system ẋ = Ax of n first order ODEs with the matrix

A = [  0      1      0    ...    0  ]
    [  0      0      1     0    ... ]
    [              ...               ]
    [  0     ...     0     0      1  ]
    [ −a_n −a_{n−1}  ...  −a_2  −a_1 ].

Then our theory of linear ODE systems applies. There are however some simplifications due to the special form of the matrix A. First, computing the characteristic polynomial of A, we find

det(λI − A) = λ^n + a_1λ^{n−1} + ... + a_{n−1}λ + a_n.

Thus the polynomial can be easily read off the high order differential equation. Next, let λ_1, ..., λ_r be the roots of the characteristic polynomial, and m_1, ..., m_r their multiplicities (m_1 + ... + m_r = n). Then it is clear that the solutions x(t) = x_1(t) must have the form

e^{λ_1 t}P_1(t) + ... + e^{λ_r t}P_r(t),

where P_i = a_0 + a_1 t + ... + a_{m_i−1}t^{m_i−1} is a polynomial of degree < m_i. The total number of arbitrary coefficients in these polynomials equals m_1 + ... + m_r = n. On the other hand, the general solution to the n-th order ODE must depend on n arbitrary initial


values (x(0), ẋ(0), ..., x^{(n−1)}(0)). This can be justified by the general Existence and Uniqueness Theorem for solutions of ODE systems. We conclude that each of the functions

e^{λ_1 t}, e^{λ_1 t}t, ..., e^{λ_1 t}t^{m_1−1}, ..., e^{λ_r t}, e^{λ_r t}t, ..., e^{λ_r t}t^{m_r−1}

must satisfy our differential equation, and any (complex) solution is uniquely written as a linear combination of these functions with suitable (complex) coefficients. (In other words, these functions form a basis of the space of complex solutions to the differential equation.)
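A small numerical check that the characteristic polynomial of the matrix A above really reproduces the coefficients of the equation (a minimal sketch in Python with NumPy; the coefficients a_1, a_2, a_3 are arbitrary sample values):

    import numpy as np

    a = [6.0, 11.0, 6.0]                           # sample coefficients a_1, a_2, a_3
    n = len(a)
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)                     # ones above the diagonal
    A[-1, :] = [-a[n - 1 - j] for j in range(n)]   # last row: -a_n, ..., -a_1
    # det(lambda I - A) = lambda^3 + 6 lambda^2 + 11 lambda + 6
    assert np.allclose(np.poly(A), [1.0] + a)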

Example 8: x^{(xii)} − 3x^{(viii)} + 3x^{(iv)} − x = 0 has the characteristic polynomial λ^{12} − 3λ^8 + 3λ^4 − 1 = (λ − 1)^3(λ + 1)^3(λ − i)^3(λ + i)^3. The following 12 functions form therefore a complex basis of the solution space:

e^t, te^t, t^2e^t, e^{−t}, te^{−t}, t^2e^{−t}, e^{it}, te^{it}, t^2e^{it}, e^{−it}, te^{−it}, t^2e^{−it}.

Of course, a basis of the space of real solutions is obtained by taking real and imaginary parts of the complex basis:

e^t, te^t, t^2e^t, e^{−t}, te^{−t}, t^2e^{−t},
cos t, sin t, t cos t, t sin t, t^2 cos t, t^2 sin t.

Remark. The fact that the functions e^{λ_i t}t^k, k < m_i, i = 1, ..., r, form a basis of the space of solutions to the differential equation with the characteristic polynomial (λ − λ_1)^{m_1}...(λ − λ_r)^{m_r} is not hard to check directly, without reference to linear algebra and the Existence and Uniqueness Theorem. However, it is useful to understand how this property of the equation is related to the Jordan structure of the corresponding matrix A. In fact the Jordan normal form of the matrix A consists of exactly r Jordan cells: one cell of size m_i for each eigenvalue λ_i. This simplification can be explained as follows. For every λ, the matrix λI − A has rank n − 1 at least (due to the presence of the (n−1)×(n−1) identity submatrix in the right upper corner of A). This guarantees that the eigenspaces of A have dimension 1 at most, and hence that A cannot have more than one Jordan cell corresponding to the same root of the characteristic polynomial. Using this property, the reader can check now that the formulation of the Jordan Theorem in terms of differential equations given in the Introduction is indeed equivalent to the matrix formulation given in the previous Section.


EXERCISES

415. Solve the following ODEs, and find the solution satisfying the initial condition x(0) = 1, ẋ(0) = 0, ..., x^{(n−1)}(0) = 0: (a) x^{(3)} − 8x = 0,
(b) x^{(4)} + 4x = 0, (c) x^{(6)} + 64x = 0, (d) x^{(5)} − 10x^{(3)} + 9ẋ = 0,
(e) x^{(3)} − 3ẋ − 2x = 0, (f) x^{(5)} − 6x^{(4)} + x^{(3)} = 0,
(g) x^{(5)} + 8x^{(3)} + 16ẋ = 0, (h) x^{(4)} + 4ẍ + 3x = 0.

416. Rewrite the ODE system

x1 + 4x1 − 2x1 − 2x2 − x2 = 0
x1 − 4x1 − x2 + 2x2 + 2x2 = 0

of two 2nd order equations in the form of a linear ODE system ẋ = Ax of four 1st order equations and solve it.

417. Show that every polynomial λ^n + a_1λ^{n−1} + ... + a_n is the characteristic polynomial of an n×n-matrix.