

    Chapter 2

    Invariant Subspaces

Reminder: Unless explicitly stated, we are talking about finite dimensional vector spaces, and about linear transformations and operators between finite dimensional vector spaces. However, we will from time to time explicitly discuss the infinite dimensional case. Similarly, although much of the theory holds for vector spaces over some field, we focus (for practical purposes) on vector spaces over the real or complex field.

Definition 2.1. Let V be a vector space (real or complex) and L : V → V be a linear operator over V. We say a subspace W ⊆ V is an invariant subspace of L if for every w ∈ W, Lw ∈ W (we also write LW ⊆ W).

Note that V, {0} (the set containing only the zero vector in V), Null(L), and Range(L) are all invariant subspaces of L.

    Exercise 2.1. Prove the statement above.

Theorem 2.2. Let V and L be as before, and let W_1, W_2, W_3 be invariant subspaces of L. Then (1) W_1 + W_2 is an invariant subspace of L, (2) (W_1 + W_2) + W_3 = W_1 + (W_2 + W_3), (3) W_1 + {0} = {0} + W_1.

Exercise 2.2. Prove Theorem 2.2. (The set of all invariant subspaces of a linear operator, with the binary operation of the sum of two subspaces, is a semigroup and a monoid.)

    Exercise 2.3. Prove that the sum of invariant subspaces is commutative.

If an invariant subspace of a linear operator, L, is one-dimensional, we can say a bit more about it, and hence we have a special name for nonzero vectors in such a space.

Definition 2.3. We call a nonzero vector v ∈ V an eigenvector if Lv = λv for some scalar λ. The scalar λ is called an eigenvalue. The (ordered) pair (v, λ) is called an eigenpair. (Note: this is the generalization from Chapter 1 to cover any linear operator.)

Exercise 2.4. Given a one-dimensional invariant subspace, prove that any nonzero vector in that space is an eigenvector and all such eigenvectors have the same eigenvalue.

Vice versa, the span of an eigenvector is an invariant subspace. From Theorem 2.2 it then follows that the span of a set of eigenvectors, which is the sum of the invariant subspaces spanned by the individual eigenvectors, is an invariant subspace.

Example 2.1. As mentioned before, the matrix A ∈ R^{n×n} defines a linear operator over R^n. Consider the real matrix

A = [ 1  2  3 ]
    [ 4  5  6 ]
    [ 7  8  9 ]

and the vector v = (1, −2, 1)^T. Then Av = 0 = 0·v. Hence v is an eigenvector of A and 0 is an eigenvalue of A. The pair (v, 0) is an eigenpair. Note that a matrix with an eigenvalue 0 is singular.
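A quick numerical check of this example (a minimal sketch in NumPy; the variable names are ours, not from the notes):

```python
import numpy as np

# The matrix and the candidate eigenvector from Example 2.1.
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
v = np.array([1., -2., 1.])

print(A @ v)              # -> [0. 0. 0.], i.e., Av = 0 * v
print(np.linalg.det(A))   # -> 0 (up to rounding), so A is singular
```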

Example 2.2. The following is an example of an infinite-dimensional vector space. Let C^∞[a, b] be the set of all infinitely differentiable real functions on the (closed) interval [a, b]. We define the addition of two functions f, g ∈ C^∞[a, b] by: h = f + g is the function h(x) ≡ (f + g)(x) = f(x) + g(x) (for x ∈ [a, b]), and for all α ∈ R and f ∈ C^∞[a, b] we define h = αf by h(x) ≡ (αf)(x) = αf(x). Then C^∞[a, b] with this definition of scalar multiplication and vector addition is a vector space (show this).

Let L be defined by Lu = u_xx for u ∈ C^∞[a, b]. Then L is a linear operator over C^∞[a, b], and (sin(ωx), −ω²) is an eigenpair for any ω ∈ R.
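To see why this is an eigenpair (a short verification; the notes only state the result): for u(x) = sin(ωx) we have

Lu = u_xx = d/dx (ω cos(ωx)) = −ω² sin(ωx) = −ω² u,

so (sin(ωx), −ω²) is indeed an eigenpair of L.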

We have Lv = λv ⇔ Lv − λv = 0. We can rewrite the last expression as (L − λI)v = 0. Since v ≠ 0, the operator (L − λI) is singular (i.e., not invertible).

Although in some cases the eigenvalues and eigenvectors of a linear operator are clear by inspection, in general we need some procedure to find them. As all linear operators over finite dimensional spaces can be represented as matrices, all we need is a systematic procedure to find the eigenvalues and eigenvectors of matrices (we get our answer for the original operator by the corresponding linear combinations of basis vectors). Remember that the function that maps linear transformations between finite dimensional vector spaces (given bases for the spaces) to matrices is an isomorphism (i.e., an invertible linear transformation).

Given a basis B, let A = [L]_B. Using the linearity of the standard transformation from operator to matrix (given a basis), we also have A − λI = [L − λI]_B, and the matrix A − λI must be singular. Hence, det(A − λI) = 0.

Definition 2.4. The polynomial det(A − λI) is called the characteristic polynomial of A. The (polynomial) equation det(A − λI) = 0 in λ is called the characteristic equation for A (and for L!). The eigenvalues of A (and of L) are the roots of this characteristic equation. The multiplicity of an eigenvalue as a root of this equation is called the algebraic multiplicity of that eigenvalue.
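The characteristic polynomial and its roots are easy to inspect numerically (a sketch; NumPy's poly uses the convention det(λI − A), which differs from det(A − λI) only by the factor (−1)^n):

```python
import numpy as np

# Matrix from Example 2.1.
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

p = np.poly(A)                 # coefficients of det(lambda*I - A), highest degree first
print(p)                       # approximately [1., -15., -18., 0.]

print(np.roots(p))             # roots of the characteristic polynomial ...
print(np.linalg.eigvals(A))    # ... agree with a standard eigenvalue solver
```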

IMPORTANT!! It may seem from the above that eigenvalues depend on the choice of basis, but this is not the case! To show this, we need only pick two different bases for V, and show that the corresponding matrix of the transformation for one basis is similar to the matrix of the transformation for the other basis. Let A = [L]_B. Let C also be a basis for V and define B = [L]_C. Furthermore, let X = [I]_{CB}. Then we have A = X^{-1}BX (recall that this is called a similarity transformation between A and B).

    Theorem 2.5. A and B as defined above have the same eigenvalues.

Proof. From A = X^{-1}BX we see that A − λI = X^{-1}BX − λI = X^{-1}(B − λI)X. Hence det(A − λI) = det(X^{-1}) det(B − λI) det(X), and det(A − λI) = 0 ⇔ det(B − λI) = 0.

So, it is indeed fine to define the eigenvalues of an operator over a vector space using any basis for the vector space and its corresponding matrix. In fact, as the characteristic polynomial of an n × n matrix has leading term (−1)^n λ^n (check this), the characteristic polynomials of A and B are equal. So, we can call this polynomial the characteristic polynomial of L without confusion. Note that the proof above does not rely on the fact that A and B are representations of the (same) linear operator L, only that A is obtained from a similarity transformation of B. So, this is a general result for similar matrices.

We say that similarity transformations preserve eigenvalues. The standard methods for computing eigenvalues of any but the smallest matrices are in fact based on sequences of cleverly chosen similarity transformations, the limit of which is an upper triangular matrix (general case) or a diagonal matrix (Hermitian or real symmetric case). Take a course in Numerical Linear Algebra for more details!

Eigenvectors of matrices are not preserved under similarity transformations, but they change in a straightforward way. Let A and B be as above and let Av = λv. Then X^{-1}BXv = λv ⇔ B(Xv) = λ(Xv), so Xv is an eigenvector of B corresponding to λ.
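Both facts are easy to illustrate numerically (a small sketch; the matrices below are arbitrary choices, not taken from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[2., 1.],
              [0., 3.]])
X = rng.random((2, 2)) + np.eye(2)    # generically invertible
B = X @ A @ np.linalg.inv(X)          # then A = X^{-1} B X

# Similar matrices have the same eigenvalues ...
print(np.sort(np.linalg.eigvals(A)))  # [2. 3.]
print(np.sort(np.linalg.eigvals(B)))  # [2. 3.] up to rounding

# ... and if Av = lambda*v, then Xv is an eigenvector of B for the same lambda.
lam, V = np.linalg.eig(A)
v = V[:, 0]
print(np.allclose(B @ (X @ v), lam[0] * (X @ v)))  # True
```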

If we are interested in computing eigenvectors of an operator L, then again the choice of basis is irrelevant (at least in theory; in practice, it can matter a lot). Let A and B be representations of L with respect to the bases B = {v_1, v_2, . . . , v_n} and C = {w_1, w_2, . . . , w_n} as above, with the change of coordinate matrix X = [I]_{CB}. By some procedure we obtain Au = λu, which corresponds to B(Xu) = λ(Xu). Define y = ∑_{i=1}^{n} u_i v_i (so that u = [y]_B); then we have [Ly]_B = λ[y]_B, which implies (by the standard isomorphism) that Ly = λy. However, we also have [y]_C = [I]_{CB}[y]_B = Xu. This gives [Ly]_C = λ[y]_C ⇔ B(Xu) = λ(Xu). So, computing eigenvectors of B leads to the same eigenvectors for L as using A.

Theorem 2.6. A linear operator, L, is diagonalizable if and only if there is a basis for the vector space with each basis vector an eigenvector of L. A matrix A ∈ R^{n×n} (or C^{n×n}) is diagonalizable if and only if there is a basis for R^n (respectively C^n) that consists of eigenvectors^5 of A.

^5 Here again, if the matrix is real, we must be careful to specify whether or not we are considering the matrix transformation as a map on R^n or on C^n. If the former and the eigenpairs are not all real, then we are forced to conclude it is not diagonalizable with respect to R^n, even though it may be if we take it with respect to C^n.

We often say A is diagonalizable if there exists an invertible matrix U such that U^{-1}AU is diagonal. But clearly, if we set D = U^{-1}AU and rearrange, we get

AU = UD ⇔ [Au_1, . . . , Au_n] = [d_{11}u_1, d_{22}u_2, . . . , d_{nn}u_n] ⇔ Au_i = d_{ii}u_i, i = 1, . . . , n,

and since the u_i cannot be zero (since U was assumed invertible), the columns of U must be eigenvectors and the elements of D must be eigenvalues.
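A concrete check of this rearrangement (a minimal NumPy sketch; the matrix is an arbitrary diagonalizable example):

```python
import numpy as np

A = np.array([[4., 1.],
              [2., 3.]])

# Columns of U are eigenvectors, entries of d are the eigenvalues.
d, U = np.linalg.eig(A)
D = np.diag(d)

print(np.allclose(np.linalg.inv(U) @ A @ U, D))  # U^{-1} A U = D
print(np.allclose(A @ U, U @ D))                 # equivalently, A U = U D
```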

Furthermore, you should make sure you are able to show that if L(x) = Ax and A is diagonalizable, then if you use the eigenvectors for the (only) basis on the right and left of the linear transformation/matrix transformation picture, you find that the matrix of the transformation is precisely the diagonal matrix containing the eigenvalues.

As mentioned in the previous chapter, a matrix may not be diagonalizable. We now consider similarity transformations to block diagonal form as an alternative. (Refer also to the definition of block diagonal in Chapter 1.)

Definition 2.7. Let A be a complex or real n × n matrix and let the numbers m_1, m_2, . . . , m_s be given such that 1 ≤ m_i ≤ n for i = 1, . . . , s and ∑_{i=1}^{s} m_i = n. Furthermore, for i = 1, . . . , s let f_i = 1 + ∑_{j=1}^{i-1} m_j (where f_1 = 1), ℓ_i = ∑_{j=1}^{i} m_j, and P_i = {f_i, . . . , ℓ_i}. We say that A is a (complex or real) block diagonal matrix with s diagonal blocks of sizes m_1, m_2, . . . , m_s if its coefficients satisfy a_{r,c} = 0 whenever r and c are not elements of the same index set P_i. (See note below.)

Note that A is a block diagonal matrix if the coefficients outside the diagonal blocks are all zero. The first diagonal block is m_1 × m_1, the second block is m_2 × m_2, and so on. The first coefficient of block i has index f_i; the last coefficient of block i has index ℓ_i.
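To make the indexing concrete, here is a small sketch (the block sizes are an arbitrary example; SciPy's block_diag just assembles the matrix):

```python
import numpy as np
from scipy.linalg import block_diag

# Block sizes m_1, m_2, m_3 with m_1 + m_2 + m_3 = n = 5.
m = [2, 1, 2]

# A block diagonal matrix with these block sizes.
A = block_diag(np.ones((2, 2)), 2 * np.ones((1, 1)), 3 * np.ones((2, 2)))
print(A)

# Index sets P_i = {f_i, ..., l_i} (1-based, as in Definition 2.7).
f = [1 + sum(m[:i]) for i in range(len(m))]      # [1, 3, 4]
l = [sum(m[:i + 1]) for i in range(len(m))]      # [2, 3, 5]
print(list(zip(f, l)))                           # [(1, 2), (3, 3), (4, 5)]
```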

Theorem 2.8. Let L be a linear operator over V, with dim(V) = n, and with invariant subspaces V_1, . . . , V_s such that V_1 ⊕ V_2 ⊕ · · · ⊕ V_s = V. Further, let v_1^{(i)}, . . . , v_{m_i}^{(i)} be a basis for each V_i (for i = 1, . . . , s), and define the ordered set B = {v_1^{(1)}, . . . , v_{m_1}^{(1)}, v_1^{(2)}, . . . , v_{m_2}^{(2)}, . . . , v_{m_s}^{(s)}} (i.e., B is a basis for V). Then A = [L]_B is block-diagonal.

Proof. (This is a sketch of the proof.) As each V_i is an invariant subspace, Lv_j^{(i)} ∈ V_i. Hence, Lv_j^{(i)} = ∑_{k=1}^{m_i} α_k v_k^{(i)} for some scalars α_k. These coefficients occupy columns f_i, f_i + 1, . . . , ℓ_i of A and can be nonzero only in the same range of rows. So, only the diagonal blocks of A, namely A_{1,1}^{m_1×m_1}, . . . , A_{s,s}^{m_s×m_s}, can have nonzero coefficients.

    Exercise 2.5. Write the proof in detail.

Note that a block-diagonal matrix (as a linear operator over R^n or C^n) reveals invariant subspace information quite trivially. Vectors with zero coefficients except those corresponding to one diagonal block obviously lie within an invariant subspace. Moreover, bases for the invariant subspaces can be trivially found. Finally, finding eigenvalues and eigenvectors (generalized eigenvectors) for block diagonal matrices is greatly simplified. For this reason we proceed by working with matrices. However, the standard isomorphism between the n-dimensional vector space V and C^n (or R^n), given a basis B for V, guarantees that the invariant subspaces we find for the matrix [L]_B correspond to invariant subspaces for L, as we observe in the following theorem.

Theorem 2.9. Let A ∈ C^{n×n} be a block diagonal matrix with s diagonal blocks of sizes m_1, m_2, . . . , m_s. Define the integers f_i = 1 + ∑_{j=1}^{i-1} m_j (where f_1 = 1) and ℓ_i = ∑_{j=1}^{i} m_j for i = 1, . . . , s. Then the subspaces (of C^n)

V_i = {x ∈ C^n | x_j = 0 for all j < f_i and j > ℓ_i} = Span(e_{f_i}, e_{f_i+1}, . . . , e_{ℓ_i})

for i = 1, . . . , s are invariant subspaces of A.

    Proof. The proof is left as an exercise.

Exercise 2.6. Prove Theorem 2.9.
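A quick numerical sanity check of Theorem 2.9 (a sketch; the block sizes and entries are arbitrary):

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)

# Block diagonal A with block sizes m = [2, 3]; then V_1 = Span(e_1, e_2).
A = block_diag(rng.random((2, 2)), rng.random((3, 3)))

x = np.zeros(5)
x[:2] = rng.random(2)          # an arbitrary vector in V_1
y = A @ x

print(np.allclose(y[2:], 0))   # True: A x stays in V_1
```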

Using the previous theorem we can also make a statement about invariant subspaces of L if the matrix representing L with respect to a particular basis is block diagonal.

Theorem 2.10. Let L be a linear operator over V, with dim(V) = n, and let the ordered set B = {v_1, v_2, . . . , v_n} be a basis for V. Furthermore, let A = [L]_B be block-diagonal with block sizes m_1, m_2, . . . , m_s (in that order). Let f_i and ℓ_i for i = 1, . . . , s be as defined in Definition 2.7. Then the subspaces (of V) V_1, . . . , V_s, defined by V_i = Span(v_{f_i}, v_{f_i+1}, . . . , v_{ℓ_i}), are invariant subspaces of L and V_1 ⊕ V_2 ⊕ · · · ⊕ V_s = V.

    Proof. The proof is left as an exercise.


Exercise 2.7. Prove Theorem 2.10.

Procedure to find invariant subspaces for L

1. Get the matrix representation of L first. That is, pick a basis B for V and let A = [L]_B.

2. Find an invertible matrix S such that F := S^{-1}AS is block diagonal with blocks satisfying certain nice properties (nice is TBD).

3. The columns of S will span the invariant subspaces of A (group the columns according to the block sizes of F, as we've been doing in the preceding discussion).

4. Use S and B to compose the invariant subspaces (in V) for L.

Now, Step 2 is non-trivial, but we'll put this off and just assume it can be done. What remains is HOW do we finish Step 4? The following two theorems address this issue.
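The notes defer how to carry out Step 2. As an illustration only (not the block-diagonal form the notes have in mind), the Schur decomposition gives a unitary S and an upper triangular F = S^H A S, and it already exhibits the property used in Step 3: the leading columns of S span invariant subspaces of A.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
A = rng.random((4, 4))

# Complex Schur form: A = S F S^H with F upper triangular and S unitary.
F, S = schur(A, output='complex')

# For every k, the span of the first k columns of S is invariant under A,
# because F is upper triangular: A s_j lies in Span(s_1, ..., s_j).
k = 2
Sk = S[:, :k]
residual = (np.eye(4) - Sk @ Sk.conj().T) @ (A @ Sk)   # component outside the span
print(np.allclose(residual, 0))                        # True
```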

Theorem 2.11. Let L be a linear operator over a complex n-dimensional vector space V, and let B = {b_1, b_2, . . . , b_n} be an (arbitrary) basis for V. Let A = [L]_B and let F = S^{-1}AS, for any invertible matrix S ∈ C^{n×n}, be block diagonal with block sizes m_1, m_2, . . . , m_s. Furthermore, let f_i = 1 + ∑_{j=1}^{i-1} m_j (f_1 = 1) and ℓ_i = ∑_{j=1}^{i} m_j, and let the ordered basis C = {c_1, c_2, . . . , c_n} be defined by c_i = ∑_{j=1}^{n} b_j s_{j,i} for i = 1, . . . , n. Then the spaces V_i = Span(c_{f_i}, . . . , c_{ℓ_i}) are invariant subspaces of L and V = V_1 ⊕ V_2 ⊕ · · · ⊕ V_s.

Proof. The proof is left as an exercise. Hint: consider the composition I ∘ L ∘ I, and note that with this definition of the C basis, S = [I]_{BC} (the matrix that converts C-coordinates to B-coordinates).

We will now consider in some more detail a set of particularly useful and revealing invariant subspaces that span the vector space. Hence, we consider block-diagonal matrices of a fundamental type. Most of the following results hold only for complex vector spaces, which are, from a practical point of view, the most important ones.

Next we provide some links between the number of distinct eigenvalues, the number of eigenvectors, and the number of invariant subspaces of a matrix A ∈ C^{n×n} (working over the complex field).

Theorem 2.12. Let λ be an eigenvalue of A (of arbitrary algebraic multiplicity). There is at least one eigenvector v of A corresponding to λ.

Proof. Since A − λI is singular, there is at least one nonzero vector v such that (A − λI)v = 0.
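In computations this is also how one can recover an eigenvector once an eigenvalue is known (a sketch using SciPy's null_space; the matrix is the one from Example 2.1, with λ = 0):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
lam = 0.0

# A - lam*I is singular, so its null space contains an eigenvector.
V = null_space(A - lam * np.eye(3))
v = V[:, 0]
print(np.allclose(A @ v, lam * v))   # True
```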

Theorem 2.13. Let λ_1, λ_2, . . . , λ_k be distinct eigenvalues and let v_1^{(i)}, . . . , v_{m_i}^{(i)} be independent eigenvectors associated with λ_i, for i = 1, . . . , k. Then

{v_1^{(1)}, . . . , v_{m_1}^{(1)}, v_1^{(2)}, . . . , v_{m_2}^{(2)}, . . . , v_1^{(k)}, . . . , v_{m_k}^{(k)}}

is an independent set.

Proof. Left as an exercise for the reader (it's in most linear algebra textbooks).

    2.1 Toward a Direct Sum Decomposition

In general the above set of eigenvectors does not always give a direct sum decomposition for the vector space V. That is, it is not uncommon that we will not have a complete set of n linearly independent eigenvectors. So we need to think of another way to get what we're after. A complete set of independent eigenvectors for an n-dimensional vector space (n independent eigenvectors for an n-dimensional vector space) would give n one-dimensional invariant subspaces, each the span of a single eigenvector. These one-dimensional subspaces form a direct sum decomposition of the vector space V. Hence the representation of the linear operator in this basis of eigenvectors is a block diagonal matrix with each block size equal to one, that is, a diagonal matrix.

In the following we try to get as close as possible to such a block diagonal matrix and hence to such a direct sum decomposition. We will aim for the following two properties. First, we want the diagonal blocks to be as small as possible, and we want each diagonal block to correspond to a single eigenvector. The latter means that the invariant subspace corresponding to a diagonal block contains a single eigenvector. Second, we want to make the blocks as close to diagonal as possible. It turns out that bidiagonal, with a single nonzero diagonal right above (or below) the main diagonal (picture?), is the best we can do.

In the following discussion, polynomials of matrices or linear operators play an important role. Note that for a matrix A ∈ C^{n×n} the matrices A² ∈ C^{n×n}, A³ ∈ C^{n×n}, etc. are well-defined, and that C^{n×n} (over the complex field) is itself a vector space. Hence, polynomials (of finite degree) in A, expressed as α_0 I + α_1 A + · · · + α_n A^n, are well-defined as well and, for fixed A, are elements of C^{n×n}. Note there is a difference between a polynomial as a function of a free variable of a certain type and the evaluation of a polynomial for a particular choice of that variable.

Similarly, for linear operators over a vector space, L : V → V, composition of (or product of) the operator with itself, one or more (but finitely many) times, results in another linear operator over the same space: (L)^m ≡ L ∘ (L^{m-1}) and (L)^m : V → V. Indeed, the set of all linear operators over a vector space V (often expressed as L(V)) is itself a vector space (over the same field). (Exercise: Prove this!)

A nice property of polynomials in a fixed matrix (or linear operator) is that they commute (in contrast to two general matrices).
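A numerical illustration of the commuting property, using the linear polynomials of Exercise 2.8 below (a sketch; the matrix and scalars are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.random((3, 3))
I = np.eye(3)
lam, mu = 2.0, -1.5

# Polynomials in the same matrix A commute:
P = (A - lam * I) @ (A - mu * I)
Q = (A - mu * I) @ (A - lam * I)
print(np.allclose(P, Q))          # True

# ... whereas two general matrices usually do not:
B, C = rng.random((3, 3)), rng.random((3, 3))
print(np.allclose(B @ C, C @ B))  # almost surely False
```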

Exercise 2.8. Prove this for the linear matrix polynomials A − λI and A − µI and