
LINEAR ALGEBRA

Mohan C Joshi
IIT Bombay
[email protected], [email protected]

    1 Vector Spaces

    1.1 Vector Spaces and Subspaces, Linear Span and Direct Sum

Definition 1.1. A vector space $V$ is a set which is closed with respect to a binary operation (addition: $u \in V,\ v \in V \Rightarrow u + v \in V$) and scalar multiplication ($u \in V,\ \alpha \in K \Rightarrow \alpha u \in V$, where $K$ is either the set $\mathbb{R}$ of real numbers or the set $\mathbb{C}$ of complex numbers), and is such that the following hold:

$V$ is a commutative group with respect to the binary operation; $\alpha(u + v) = \alpha u + \alpha v$ and $(\alpha + \beta)u = \alpha u + \beta u$ for all $u, v \in V$, $\alpha, \beta \in K$; $\alpha(\beta u) = (\alpha\beta)u = \beta(\alpha u)$ for all $u \in V$, $\alpha, \beta \in K$; $1u = u$ for all $u \in V$.

Elements of the space $V$ are called vectors. $V$ is said to be a real vector space if $K = \mathbb{R}$ and a complex vector space if $K = \mathbb{C}$. The set $K$ will be referred to as the field of scalars and the elements of $K$ as scalars.

Example 1.1. Let $\mathbb{R}^n$ denote the set of all $n$-tuples of real numbers. We shall denote the elements of $\mathbb{R}^n$ as column vectors
$$u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}.$$
That is, the elements of $\mathbb{R}^n$ are $n \times 1$ matrices. Define the binary operation $u + v$ and the scalar multiplication $\alpha u$ as
$$\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} + \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{pmatrix} \qquad\text{and}\qquad \alpha\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} = \begin{pmatrix} \alpha u_1 \\ \alpha u_2 \\ \vdots \\ \alpha u_n \end{pmatrix}, \quad \alpha \in \mathbb{R}.$$


$\mathbb{R}^n$ is a real vector space. Similarly, one can show that the set $\mathbb{C}^n$ of all $n$-tuples of complex numbers is a complex vector space over the field of complex numbers as scalars.

Example 1.2. Let $C[a, b]$ denote the set of all continuous real valued functions on $[a, b]$. Define the binary operation $f + g$ and scalar multiplication $\alpha f$ as
$$[f + g](x) = f(x) + g(x), \qquad [\alpha f](x) = \alpha f(x), \qquad x \in [a, b],\ \alpha \in \mathbb{R}.$$
$C[a, b]$ is a real vector space; its elements are continuous functions on $[a, b]$.

Example 1.3. Proceeding as above, we shall denote by $\mathcal{P}$ the vector space of polynomials on $\mathbb{R}$, by $\mathcal{P}[a, b]$ the vector space of polynomials on $[a, b]$, and by $F[a, b]$ the vector space of all functions on $[a, b]$.

Example 1.4. In a similar way, $C^{(n)}[a, b]$ shall denote the real vector space of $n$-times continuously differentiable real valued functions on $[a, b]$, and $C^{\infty}[a, b]$ will denote the space of real valued functions on $[a, b]$ which are differentiable infinitely many times.

Theorem 1.2. In any vector space $V$, the following hold:

1. $\alpha 0 = 0$ for all $\alpha \in K$.
2. $0u = 0$ for all $u \in V$.
3. $(-1)u = -u$ for all $u \in V$.

Proof. 1. We have $\alpha 0 = \alpha(0 + 0) = \alpha 0 + \alpha 0$. This gives
$$(-\alpha 0) + (\alpha 0) = (-\alpha 0) + (\alpha 0 + \alpha 0) = \big((-\alpha 0) + \alpha 0\big) + \alpha 0,$$
and hence $0 = 0 + \alpha 0 = \alpha 0$, which proves the result.

2. We have $0u = (0 + 0)u = 0u + 0u$, which gives
$$(-0u) + (0u) = (-0u) + (0u + 0u) = \big((-0u) + 0u\big) + 0u = 0 + 0u = 0u,$$
and hence $0 = 0u$.

3. $(-1)u + u = (-1)u + (1)u = (-1 + 1)u = 0u = 0$. Hence $(-1)u = -u$.

Definition 1.3. (Subspace) A nonempty subset $U$ of $V$ is a subspace if $U$ is a vector space by itself with respect to the binary operation and scalar multiplication defined on $V$.

Theorem 1.4. $U$ is a subspace of $V$ iff
$$u \in U,\ v \in U \;\Rightarrow\; u + v \in U \qquad\text{and}\qquad u \in U,\ \alpha \in K \;\Rightarrow\; \alpha u \in U.$$

Remark 1.1. The space $\{0\}$, consisting of just the zero element, and the entire space $V$ are trivial subspaces of $V$.

Theorem 1.5. We have the following inclusions: $C^{(1)}[a, b]$ is a subspace of $C[a, b]$; $C^{(n)}[a, b]$ is a subspace of $C^{(n-1)}[a, b]$, $n \geq 1$; $\mathcal{P}[a, b]$ is a subspace of $C^{(n)}[a, b]$; $\mathcal{P}$ is a subspace of $\mathcal{P}[a, b]$.

Example 1.5. The following subsets of $C^{(1)}[a, b]$ are its subspaces:
$$\{f \in C^{(1)}[a, b] : f(c) = 0,\ c \in (a, b)\}, \quad \{f \in C^{(1)}[a, b] : f(a) = f(b)\}, \quad \Big\{f \in C^{(1)}[a, b] : \int_a^b f(x)\,dx = 0\Big\}.$$

Example 1.6. The following subsets of $C^{(1)}[a, b]$ are not its subspaces:
$$\{f \in C^{(1)}[a, b] : f(c) = 2,\ c \in (a, b)\}, \quad \Big\{f \in C^{(1)}[a, b] : \int_a^b f(x)\,dx = 1\Big\}.$$

Definition 1.6. (Linear Span) Let $S$ be a subset of $V$. Then the set of all finite linear combinations of elements of $S$ is called the linear span of $S$ and is denoted by $L[S]$. That is, $L[S] = \{\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n : \alpha_i \in K,\ u_i \in S,\ 1 \leq i \leq n\}$.

Theorem 1.7. Let $S \neq \emptyset$ be a subset of $V$. Then $L[S]$ is the smallest subspace of $V$ containing $S$.

Proof. Let $u = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n$ and $v = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n$, with $\alpha_i, \beta_i \in K$, $u_i, v_i \in S$, $1 \leq i \leq n$, be two elements of $L[S]$. Then $u + v = \alpha_1 u_1 + \beta_1 v_1 + \cdots + \alpha_n u_n + \beta_n v_n$ is again a finite linear combination of elements of $S$ and hence is in $L[S]$. Similarly, $\alpha u \in L[S]$ whenever $\alpha \in K$. This proves that $L[S]$ is a subspace of $V$.

We now show that it is the smallest subspace containing $S$. It obviously contains $S$. Let $W$ be any other subspace of $V$ containing $S$. We need to show that $L[S] \subseteq W$. Let $u = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n \in L[S]$, with $u_i \in S$, $\alpha_i \in K$, $1 \leq i \leq n$. Since $S \subseteq W$, we have $u_i \in W$, $1 \leq i \leq n$, and hence $u \in W$. Thus $L[S] \subseteq W$, so $L[S]$ is the smallest subspace of $V$ containing $S$.

Using the above theorem, we immediately get the following theorem.

Theorem 1.8. Let $S \neq \emptyset$ be a subset of $V$. Then (i) $L[S] = S$ iff $S$ is a subspace of $V$, and (ii) $L[L[S]] = L[S]$.

Let $U$ and $W$ be subspaces of $V$. Then it is clear that $U \cap W$ is also a subspace of $V$. Now consider the subspaces $U_1, U_2, \ldots, U_n$ of $V$. By induction, it follows that $U_1 \cap U_2 \cap \cdots \cap U_n$ is a subspace of $V$. However, the union of two subspaces need not be a subspace. For example, consider $U = x$-axis and $W = y$-axis. Then $U \cap W = \{0\}$, but $U \cup W$ is NOT a subspace of $\mathbb{R}^2$, because $(1, 0)$ and $(0, 1)$ are two elements of $U \cup W$ but $(1, 0) + (0, 1) = (1, 1) \notin U \cup W$. We know that $L[U \cup W]$ is the smallest subspace containing $U \cup W$. We can identify $L[U \cup W]$ completely if we define the sum $U + W$ of two subspaces $U$ and $W$ of $V$.

Definition 1.9. Let $U$ and $W$ be two subspaces of $V$. We define their sum $U + W$ as follows:
$$U + W = \{v \in V : v = u + w,\ u \in U,\ w \in W\}.$$
If $U \cap W = \{0\}$, then the sum of $U$ and $W$ is called a direct sum and is denoted by $U \oplus W$.


It is clear that $U + W$ is a subspace of $V$ containing $U \cup W$. Further, one can show that $U + W$ is the smallest subspace containing $U \cup W$.

Theorem 1.10. Let $U$ and $W$ be subspaces of $V$. Then $U + W = L[U \cup W]$.

Proof. It is clear that $U + W \subseteq L[U \cup W]$, as the space $U + W$ contains elements of the form $v = u + w$, $u \in U$, $w \in W$, which is a finite linear combination of elements of $U \cup W$. Conversely, assume that $v \in L[U \cup W]$. Then we have $v = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n + \beta_1 w_1 + \beta_2 w_2 + \cdots + \beta_m w_m$, with $\alpha_i, \beta_j \in K$, $u_i \in U$, $w_j \in W$, $1 \leq i \leq n$, $1 \leq j \leq m$. Thus $v = u + w$, $u \in U$, $w \in W$, and hence $v \in U + W$. This proves the theorem.

Theorem 1.11. Let $U$ and $W$ be two subspaces of $V$ and let $V = U + W$. Then $V = U \oplus W$ iff every $v \in V$ has a unique representation of the form $v = u + w$, $u \in U$, $w \in W$.

Proof. Let $V = U \oplus W$. Then we can write $v \in V$ as $v = u + w$, $u \in U$, $w \in W$. We claim that this representation is unique. If possible, let $v = u_1 + w_1$, $u_1 \in U$, $w_1 \in W$, be another representation of $v$. Then we get $u - u_1 = w_1 - w$, where $u - u_1 \in U$ and $w_1 - w \in W$. This implies that $u - u_1 = w_1 - w \in U \cap W$. As $U \cap W = \{0\}$, we have $u = u_1$ and $w = w_1$. Hence the representation of elements of $V$ is unique.

Conversely, let the representation of elements of $V$ be unique. Let $v \in U \cap W$. Then we have two representations $v = v + 0$, $v \in U$, $0 \in W$, and also $v = 0 + v$, $0 \in U$, $v \in W$. As the representation of elements of $V$ is unique, we have $v = 0$. Hence $U \cap W = \{0\}$. This implies that the sum $U + W$ is a direct sum.

Remark 1.2. The sum $V = U \oplus W$ is also referred to as the internal direct sum of the two subspaces $U$ and $W$ of $V$.

Example 1.7. Let $V = F[-a, a]$ be the vector space of all real valued functions on $[-a, a]$. Denote by $U = F_e[-a, a]$ and $W = F_o[-a, a]$ the vector subspaces of $V$ consisting of even and odd functions, respectively. Then it follows that $V = U \oplus W$. This follows from the fact that each $f \in V = F[-a, a]$ has a representation of the form
$$f(x) = \left[\frac{f(x) + f(-x)}{2}\right] + \left[\frac{f(x) - f(-x)}{2}\right],$$
where $f_e = \frac{f(x) + f(-x)}{2}$ is an even function and $f_o = \frac{f(x) - f(-x)}{2}$ is odd.

We claim that this representation is unique. For, if $f = g_e + g_o$ is another representation, with $g_e \in U$ and $g_o \in W$, then $h = f_e - g_e = g_o - f_o$. As $h$ is both even and odd, it is the zero function. Hence $f_e = g_e$ and $f_o = g_o$.
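As a quick numerical illustration of this decomposition (a sketch; the choice $f(x) = e^x$ on $[-1, 1]$ is arbitrary, not from the notes):

```python
import numpy as np

def even_odd_split(f, x):
    """Split f into its even and odd parts, so that f = fe + fo."""
    fe = (f(x) + f(-x)) / 2   # even part
    fo = (f(x) - f(-x)) / 2   # odd part
    return fe, fo

x = np.linspace(-1.0, 1.0, 201)   # symmetric grid on [-a, a] with a = 1
f = np.exp                        # any real valued function on [-a, a]
fe, fo = even_odd_split(f, x)

assert np.allclose(fe, fe[::-1])      # fe(-x) = fe(x): even
assert np.allclose(fo, -fo[::-1])     # fo(-x) = -fo(x): odd
assert np.allclose(fe + fo, f(x))     # f = fe + fo
```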

Let $U$ and $W$ be two vector spaces. We shall define the external direct sum of these two spaces. First, we define the sum $U \oplus W$ as follows:
$$U \oplus W = \{v : v = (u, w),\ u \in U,\ w \in W\}.$$

Theorem 1.12. The sum $U \oplus W$ is a vector space with respect to the following addition and scalar multiplication operations:
$$(u_1, w_1) + (u_2, w_2) = (u_1 + u_2, w_1 + w_2), \qquad \alpha(u, w) = (\alpha u, \alpha w).$$
The space $U \oplus W$ is called the external direct sum of the vector spaces $U$ and $W$.

Definition 1.13. Let $W$ be a subspace of the vector space $V$ and let $v \in V$. The set $v + W$ is called a linear variety, or translate of $W$, or parallel of $W$, or coset of $W$.


Remark 1.3. The linear variety $v + U$ is a subspace iff $v \in U$.

Let $W$ be a subspace of the vector space $V$. Then we denote by $V/W$ the set of all cosets of $W$, defined as follows:
$$V/W = \{v + W : v \in V\}.$$

Theorem 1.14. The space $V/W$ is a vector space with respect to the following addition and scalar multiplication operations:
$$(v_1 + W) + (v_2 + W) = (v_1 + v_2) + W, \qquad \alpha(v + W) = \alpha v + W.$$
The space $V/W$ is called the quotient space.

    1.2 Linear Independence and Dependence, Dimension and Basis

Definition 1.15. A set $S = \{u_1, u_2, \ldots, u_n\}$ of vectors is called linearly dependent (l.d.) if there exists a non-trivial linear combination of $u_1, u_2, \ldots, u_n$ that equals the zero vector. That is, there exist $\alpha_i \in K$, $1 \leq i \leq n$, not all zero, such that
$$\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n = 0.$$
Otherwise, $S$ is said to be linearly independent (l.i.).

Example 1.8. $S = \{e^x, e^{2x}, \ldots, e^{nx}\}$ is an l.i. set in $C(-\infty, \infty)$.

To prove the linear independence of $S$, we proceed as follows. Assume that
$$\alpha_1 e^x + \alpha_2 e^{2x} + \cdots + \alpha_n e^{nx} = 0 \quad \forall\, x \in \mathbb{R}.$$
Differentiating the above equation $(n-1)$ times, we get
$$\alpha_1 e^x + 2\alpha_2 e^{2x} + \cdots + n\alpha_n e^{nx} = 0$$
$$\alpha_1 e^x + 2^2\alpha_2 e^{2x} + \cdots + n^2\alpha_n e^{nx} = 0$$
$$\vdots$$
$$\alpha_1 e^x + 2^{n-1}\alpha_2 e^{2x} + \cdots + n^{n-1}\alpha_n e^{nx} = 0.$$
Evaluating these expressions at $x = 0$, we get
$$\alpha_1 + \alpha_2 + \cdots + \alpha_n = 0$$
$$\alpha_1 + 2\alpha_2 + \cdots + n\alpha_n = 0$$
$$\alpha_1 + 2^2\alpha_2 + \cdots + n^2\alpha_n = 0$$
$$\vdots$$
$$\alpha_1 + 2^{n-1}\alpha_2 + \cdots + n^{n-1}\alpha_n = 0.$$
As the determinant of this system (a Vandermonde determinant) is nonzero, it follows that $\alpha_i = 0$, $1 \leq i \leq n$. Hence $S = \{e^x, e^{2x}, \ldots, e^{nx}\}$ is an l.i. set in $C(-\infty, \infty)$.

Example 1.9. Let $S = \{v_1, v_2, \ldots, v_n\} \subseteq V$. If $v_1 = 0$, then $S$ is l.d.

Example 1.10. $S_1 = \{1, \sin^2 x, \cos^2 x\}$ is l.d., whereas $S_2 = \{1, \sin x, \cos x\}$ is l.i. in $C[-1, 1]$.


Theorem 1.16. In a vector space $V$:

Any subset of an l.i. set is l.i., and any superset of an l.d. set is l.d.

If the set $S = \{v_1, v_2, \ldots, v_n\}$ is an ordered set with $v_1 \neq 0$, then $S$ is l.d. iff one of the vectors $v_2, v_3, \ldots, v_n$, say $v_k$, belongs to the linear span of the preceding vectors $\{v_1, v_2, \ldots, v_{k-1}\}$.

Corollary 1.1. A finite set of vectors $S = \{v_1, v_2, \ldots, v_n\}$ in $V$ containing a non-zero vector has a linearly independent subset $A$ such that $L[A] = L[S]$.

Definition 1.17. An infinite set $S$ is l.i. if every finite subset of $S$ is l.i.

Example 1.11. Let $S = \{1, x, x^2, \ldots\}$ be an infinite subset of $\mathcal{P}$. To show that it is l.i., we need to show that every finite subset of it is l.i. Let $\{x^{k_1}, x^{k_2}, \ldots, x^{k_n}\}$ (where $k_1 < k_2 < \cdots < k_n$ are nonnegative integers) be any finite subset of $S$. Assume that
$$\alpha_{k_1} x^{k_1} + \alpha_{k_2} x^{k_2} + \cdots + \alpha_{k_n} x^{k_n} = 0, \quad \alpha_{k_i} \in K,\ 1 \leq i \leq n.$$
Differentiating the above equation $k_1$ times, we get
$$\alpha_{k_1} k_1! + \alpha_{k_2}\big[k_2(k_2 - 1)\cdots(k_2 - k_1 + 1)\big]x^{k_2 - k_1} + \cdots + \alpha_{k_n}\big[k_n(k_n - 1)\cdots(k_n - k_1 + 1)\big]x^{k_n - k_1} = 0.$$
Evaluating this equation at $x = 0$, we get $\alpha_{k_1} k_1! = 0$ and hence $\alpha_{k_1} = 0$. Similarly, differentiating the above equation $k_2, k_3, \ldots, k_n$ times and evaluating at $x = 0$, we get
$$0 = \alpha_{k_1} = \alpha_{k_2} = \cdots = \alpha_{k_n}.$$
This proves that $\{x^{k_1}, x^{k_2}, \ldots, x^{k_n}\}$ is l.i. and hence $S$ is l.i.

Example 1.12. $1$ and $i$ are l.i. in the real vector space $\mathbb{C}$ of complex numbers over the field of real numbers as scalars, whereas this set is l.d. in the complex vector space $\mathbb{C}$ of complex numbers over the field of complex numbers as scalars.

Example 1.13. $S = \{\sin x, \sin 2x, \ldots, \sin nx\}$ is an l.i. subset of $C[-\pi, \pi]$.

Let
$$\alpha_1 \sin x + \alpha_2 \sin 2x + \cdots + \alpha_n \sin nx = 0, \quad \alpha_i \in \mathbb{R},\ 1 \leq i \leq n.$$
Multiplying the above equation by $\sin kx$ and integrating from $-\pi$ to $\pi$, we get
$$\alpha_1 \int_{-\pi}^{\pi} \sin kx \sin x\,dx + \alpha_2 \int_{-\pi}^{\pi} \sin kx \sin 2x\,dx + \cdots + \alpha_n \int_{-\pi}^{\pi} \sin kx \sin nx\,dx = 0.$$
Using the fact that
$$\int_{-\pi}^{\pi} \sin kx \sin jx\,dx = \begin{cases} 0 & \text{if } k \neq j \\ \pi & \text{if } j = k \end{cases}$$
we get that $\alpha_j = 0$, $1 \leq j \leq n$. This proves the linear independence of the set $S$. Similarly, one can show that $\{1, \cos x, \cos 2x, \ldots, \cos nx\}$ is an l.i. subset of $C[-\pi, \pi]$. One can now extend the argument to claim the linear independence of the infinite sets $\{\sin x, \sin 2x, \ldots, \sin nx, \ldots\}$ and $\{1, \cos x, \cos 2x, \ldots, \cos nx, \ldots\}$.

Definition 1.18. A subset $B$ of $V$ is a basis if (i) $B$ is l.i. and (ii) $L[B] = V$.


If a set $B$ of $n$ elements generates $V$, then no l.i. set can have more than $n$ vectors. More precisely, we have the following theorem.

Theorem 1.19. In a vector space $V$, if $S = \{v_1, v_2, \ldots, v_n\}$ generates $V$ and if $B = \{w_1, w_2, \ldots, w_m\}$ is l.i., then $m \leq n$.

Proof. As $B$ is l.i., $w_m \neq 0$. Denote by $S_1$ the set $\{w_m\} \cup S = \{w_m, v_1, v_2, \ldots, v_n\}$. As $L[S] = V$, $L[S_1] = V$. Further, $w_m \in V$, and hence $S_1$ is l.d. with $w_m \neq 0$. By Theorem 1.16, we can discard an element $v_{i_1} \in S_1$ that is a linear combination of the preceding elements $\{w_m, v_1, v_2, \ldots, v_{i_1 - 1}\}$. Denote this deleted subset of $S_1$ by $S_1'$. Now consider the set $S_2 = \{w_{m-1}\} \cup S_1'$. $L[S_2] = V$, because $L[S_1'] = V$ and $S_2$ contains $0 \neq w_{m-1} \in V = L[S_1']$, and hence $S_2$ is l.d. Again by Theorem 1.16, we can discard an element $v_{i_2} \in S_2$ that is a linear combination of the preceding elements, and get $S_2'$. Construct $S_3 = \{w_{m-2}\} \cup S_2'$. Note that at every stage the discarded element is from $S$.

Proceeding inductively, we continue till all the elements of the set $B$ are used up. Then obviously $m \leq n$. If not (i.e. $m > n$), by construction we get a linearly dependent set $S_{n+1} = \{w_{m-n}\} \cup S_n'$. But $S_{n+1} = \{w_{m-n}, w_{m-n+1}, \ldots, w_m\}$, being a subset of $B$, is linearly independent, a contradiction. Hence $m > n$ is not possible.

Corollary 1.2. If $V$ has a basis of $n$ elements, then every set of $p$ vectors with $p > n$ is l.d.

Proof. Let $S = \{v_1, v_2, \ldots, v_n\}$ be a basis and let $B = \{w_1, w_2, \ldots, w_p\}$ be a set of $p$ vectors. If this set $B$ were l.i., it would follow by the above theorem that $p \leq n$.

Corollary 1.3. If $V$ has a basis of $n$ elements, then every other basis of $V$ also has $n$ elements.

Proof. Let $S_1 = \{v_1, v_2, \ldots, v_n\}$ and $S_2 = \{w_1, w_2, \ldots, w_m\}$ be two bases for $V$. Then both $S_1$ and $S_2$ are l.i. and, further, $L[S_1] = V = L[S_2]$. Hence, by the above theorem, $m \leq n$ and also $n \leq m$. This proves the corollary.

Definition 1.20. (Finite Dimensional Space) A vector space $V$ is said to be finite dimensional if $V$ has a basis consisting of a finite number of elements. It is clear that the number of elements in a basis is unique, and it is called the dimension of $V$.

Example 1.14. The space $\mathcal{P}_n$ of all polynomials of degree at most $n$ is of dimension $(n + 1)$, as $S = \{1, x, x^2, \ldots, x^n\}$ is a basis for $\mathcal{P}_n$. However, the space $\mathcal{P}$ of all polynomials is not finite dimensional.

Theorem 1.21. In an $n$-dimensional space $V$, any set of $n$ linearly independent vectors forms a basis for $V$.

Proof. Let $B = \{v_1, v_2, \ldots, v_n\}$ be an l.i. set. We need to show that $L[B] = V$. Let $v \in V$ be arbitrary and denote it by $v_{n+1}$. Then the set $B_1 = \{v_1, v_2, \ldots, v_{n+1}\}$ is a set of $(n+1)$ vectors in an $n$-dimensional space and hence is l.d. By Theorem 1.16, there exists a vector $v_i$ in $B_1$ which is a linear combination of the preceding $(i-1)$ vectors. This element $v_i \notin B$, for otherwise it would contradict the assumption that $B$ is l.i. Thus $v_i = v_{n+1}$, i.e. $v_{n+1} \in L[B]$. This proves that $L[B] = V$.

Theorem 1.22. Let $B = \{v_1, v_2, \ldots, v_n\}$ be a basis for a vector space $V$. Then every element $v \in V$ has a unique representation
$$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n, \quad \alpha_i \in K,\ 1 \leq i \leq n.$$

Definition 1.23. Let $B = \{v_1, v_2, \ldots, v_n\}$ be a basis for a finite dimensional space $V$. Then, by the previous theorem, every element $v \in V$ has a unique representation
$$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n, \quad \alpha_i \in K,\ 1 \leq i \leq n.$$
$(\alpha_1, \alpha_2, \ldots, \alpha_n) \in K^n$ is said to be the coordinate vector of $v$ with respect to the basis $B$.


Remark 1.4. If we have fewer than $n$ l.i. vectors in a vector space of dimension $n$, we can extend the set to get a basis.

Theorem 1.24. In an $n$-dimensional vector space, any set $B = \{v_1, v_2, \ldots, v_k\}$ of l.i. vectors can be extended to a basis $\{v_1, v_2, \ldots, v_k, v_{k+1}, \ldots, v_n\}$ of $V$.

Proof. If $k = n$, we are done. Let $k < n$; then $V \neq L[v_1, v_2, \ldots, v_k]$. Hence there exists $v_{k+1} \notin L[v_1, v_2, \ldots, v_k]$, and thus $\{v_1, v_2, \ldots, v_{k+1}\}$ is l.i. If $k + 1 = n$, we are done. Otherwise, repeat the process to get a set $\{v_1, v_2, \ldots, v_k, v_{k+1}, \ldots, v_n\}$ of $n$ l.i. vectors in $V$. This is a basis, as it is a collection of $n$ l.i. elements in a finite dimensional space of dimension $n$.

    We now prove two theorems regarding the dimensions of subspaces of a vector space.

Theorem 1.25. Let $W$ be a subspace of a finite dimensional space $V$. Then

(a) $\dim(W) \leq \dim(V)$, with equality iff $W = V$;

(b) $\dim(V/W) = \dim(V) - \dim(W)$.

Proof. (a) Let $\dim W = m$ and $\dim V = n$. Let $S$ be a basis for $W$. Since $S$ is also l.i. in $V$, $m \leq n$. This proves that $\dim W \leq \dim V$. Assume that $\dim W = \dim V = n$. As $S$ is a basis for $W$ and $W \subseteq V$, $S$ is l.i. in $V$. As $\dim V = n$, it follows that $S$ is also a basis for $V$. Thus $W = L[S] = V$, which implies that $W = V$.

(b) Extend the basis $S = \{w_1, w_2, \ldots, w_m\}$ of $W$ to a basis $B = \{w_1, w_2, \ldots, w_m, v_1, v_2, \ldots, v_r\}$ of $V$, where $m + r = n$. Let $\bar v_1 = v_1 + W$, $\bar v_2 = v_2 + W$, $\ldots$, $\bar v_r = v_r + W$ be elements of the quotient space $V/W$. We need to show that $\{\bar v_1, \bar v_2, \ldots, \bar v_r\}$ is a basis for $V/W$. Since $L[B] = V$, it follows that for any $v \in V$ we have
$$v = \alpha_1 w_1 + \alpha_2 w_2 + \cdots + \alpha_m w_m + \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_r v_r, \quad \alpha_i, \beta_i \in K.$$
This gives
$$\bar v = v + W = \beta_1 \bar v_1 + \beta_2 \bar v_2 + \cdots + \beta_r \bar v_r, \quad \beta_i \in K,$$
which proves that $L[\bar v_1, \bar v_2, \ldots, \bar v_r] = V/W$.

We claim that $\bar v_1, \bar v_2, \ldots, \bar v_r$ are l.i. Let $\beta_1 \bar v_1 + \beta_2 \bar v_2 + \cdots + \beta_r \bar v_r = \bar 0$. This implies that $w = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_r v_r \in W$, and hence $w = \alpha_1 w_1 + \alpha_2 w_2 + \cdots + \alpha_m w_m$ for some scalars $\alpha_i \in K$, $1 \leq i \leq m$. This gives us
$$0 = \alpha_1 w_1 + \alpha_2 w_2 + \cdots + \alpha_m w_m - \beta_1 v_1 - \beta_2 v_2 - \cdots - \beta_r v_r.$$
The linear independence of the basis $B$ implies that $0 = \alpha_i = \beta_j$, $1 \leq i \leq m$, $1 \leq j \leq r$, thereby proving the linear independence of $\{\bar v_1, \bar v_2, \ldots, \bar v_r\}$. This proves the theorem.

Theorem 1.26. If $U$ and $W$ are two subspaces of a finite dimensional space $V$, then
$$\dim(U + W) = \dim(U) + \dim(W) - \dim(U \cap W).$$

Proof. STEP 1

Let $n = \dim(V)$, $m = \dim(U)$, $p = \dim(W)$, $r = \dim(U \cap W)$. Assume that $B = \{v_1, v_2, \ldots, v_r\}$ is a basis for $U \cap W$. Extend $B$ to a basis $B_1 = \{v_1, v_2, \ldots, v_r, u_{r+1}, u_{r+2}, \ldots, u_m\}$ of $U$ and to a basis $B_2 = \{v_1, v_2, \ldots, v_r, w_{r+1}, w_{r+2}, \ldots, w_p\}$ of $W$.

STEP 2

We need to show that $S = \{v_1, v_2, \ldots, v_r, u_{r+1}, u_{r+2}, \ldots, u_m, w_{r+1}, w_{r+2}, \ldots, w_p\}$ is a basis for $U + W$.


$S$ is l.i.: Assume that there exist scalars $\alpha_i$, $1 \leq i \leq r$, $\beta_i$, $r+1 \leq i \leq m$, and $\gamma_i$, $r+1 \leq i \leq p$, such that
$$\sum_{i=1}^{r} \alpha_i v_i + \sum_{i=r+1}^{m} \beta_i u_i + \sum_{i=r+1}^{p} \gamma_i w_i = 0. \qquad (1.1)$$
This implies that
$$\sum_{i=1}^{r} \alpha_i v_i + \sum_{i=r+1}^{m} \beta_i u_i = -\sum_{i=r+1}^{p} \gamma_i w_i. \qquad (1.2)$$
Let $v = -\sum_{i=r+1}^{p} \gamma_i w_i = \sum_{i=1}^{r} \alpha_i v_i + \sum_{i=r+1}^{m} \beta_i u_i$. Since $v$ is a linear combination of the elements $\{v_1, v_2, \ldots, v_r, u_{r+1}, u_{r+2}, \ldots, u_m\}$ of $U$ and also of the elements $\{w_{r+1}, w_{r+2}, \ldots, w_p\}$ of $W$, it follows that $v \in U \cap W$. Therefore there exist scalars $\delta_i$, $1 \leq i \leq r$, such that
$$v = \sum_{i=1}^{r} \delta_i v_i \qquad (1.3)$$
and hence
$$\sum_{i=1}^{r} \delta_i v_i + \sum_{i=r+1}^{p} \gamma_i w_i = 0. \qquad (1.4)$$
Since $\{v_1, v_2, \ldots, v_r, w_{r+1}, w_{r+2}, \ldots, w_p\}$ is l.i., it follows that $\delta_i = 0 = \gamma_j$, $1 \leq i \leq r$, $r+1 \leq j \leq p$. Using $\gamma_{r+1} = \gamma_{r+2} = \cdots = \gamma_p = 0$ in (1.2), we get that
$$\sum_{i=1}^{r} \alpha_i v_i + \sum_{i=r+1}^{m} \beta_i u_i = 0. \qquad (1.5)$$
Using the fact that the elements $\{v_1, v_2, \ldots, v_r, u_{r+1}, u_{r+2}, \ldots, u_m\}$ of $U$ are l.i., we get that $\alpha_i = 0 = \beta_j$, $1 \leq i \leq r$, $r+1 \leq j \leq m$. This proves the linear independence of $S$.

STEP 3

$L[S] = U + W$: Let $z \in U + W$. Then $z = u + w$, $u \in U$, $w \in W$. This implies that there exist scalars $\alpha_i$, $1 \leq i \leq r$, $\beta_j$, $r+1 \leq j \leq m$, and scalars $\gamma_i$, $1 \leq i \leq r$, $\delta_j$, $r+1 \leq j \leq p$, such that
$$z = \sum_{i=1}^{r} \alpha_i v_i + \sum_{i=r+1}^{m} \beta_i u_i + \sum_{i=1}^{r} \gamma_i v_i + \sum_{i=r+1}^{p} \delta_i w_i. \qquad (1.6)$$
The RHS of the above equality implies that $z \in L[S]$, so $U + W \subseteq L[S]$. The reverse containment $L[S] \subseteq U + W$ is obvious. Hence $U + W = L[S]$. This proves the theorem.

Corollary 1.4. If $V = U \oplus W$, then
$$\dim(V) = \dim(U) + \dim(W).$$

Example 1.15. Let $V = \mathcal{P}_3$, and let $U = \{p \in \mathcal{P}_3 : p(1) = 0\}$ and $W = \{p \in \mathcal{P}_3 : p'(1) = 0\}$ be two subspaces of $V$. Then $\dim V = 4$, $\dim U = 3$, $\dim W = 3$. We wish to get explicit representations of $U$, $V$, $W$, $U \cap W$ and $U + W$.


$B = \{1, x, x^2, x^3\}$ is a basis for $V$, and hence
$$U = \{p \in V : p(1) = 0\} = \{\alpha + \beta x + \gamma x^2 + \delta x^3 : \alpha + \beta + \gamma + \delta = 0,\ \alpha, \beta, \gamma, \delta \in \mathbb{R}\}$$
$$= \{(-\beta - \gamma - \delta) + \beta x + \gamma x^2 + \delta x^3 : \beta, \gamma, \delta \in \mathbb{R}\} = \{\beta(x - 1) + \gamma(x^2 - 1) + \delta(x^3 - 1) : \beta, \gamma, \delta \in \mathbb{R}\}.$$
This shows that $B_1 = \{(x - 1), (x^2 - 1), (x^3 - 1)\}$ is a basis for $U$ and $\dim(U) = 3$.
$$W = \{p \in V : p'(1) = 0\} = \{\alpha + \beta x + \gamma x^2 + \delta x^3 : \beta + 2\gamma + 3\delta = 0,\ \alpha, \beta, \gamma, \delta \in \mathbb{R}\}$$
$$= \{\alpha + \gamma(x^2 - 2x) + \delta(x^3 - 3x) : \alpha, \gamma, \delta \in \mathbb{R}\}.$$
This shows that $B_2 = \{1, (x^2 - 2x), (x^3 - 3x)\}$ is a basis for $W$ and $\dim(W) = 3$.
$$U \cap W = \{p \in V : p(1) = p'(1) = 0\} = \{\alpha + \beta x + \gamma x^2 + \delta x^3 : \alpha + \beta + \gamma + \delta = 0 = \beta + 2\gamma + 3\delta\}$$
$$= \{\gamma(1 - 2x + x^2) + \delta(2 - 3x + x^3) : \gamma, \delta \in \mathbb{R}\}.$$
This shows that $B_3 = \{1 - 2x + x^2,\ 2 - 3x + x^3\}$ is a basis for $U \cap W$ and $\dim(U \cap W) = 2$. Hence, by the above theorem, it follows that $\dim(U + W) = 3 + 3 - 2 = 4 = \dim(V)$, and so $U + W = V$.
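A numerical cross-check of this example (a sketch, not part of the notes): representing a cubic $\alpha + \beta x + \gamma x^2 + \delta x^3$ by its coefficient vector in $\mathbb{R}^4$, the bases above become matrix columns whose ranks give the dimensions in Theorem 1.26:

```python
import numpy as np

# Coefficient vectors (alpha, beta, gamma, delta) of the basis polynomials above.
B1 = np.array([[-1, 1, 0, 0],     # x - 1
               [-1, 0, 1, 0],     # x^2 - 1
               [-1, 0, 0, 1]]).T  # x^3 - 1
B2 = np.array([[1, 0, 0, 0],      # 1
               [0, -2, 1, 0],     # x^2 - 2x
               [0, -3, 0, 1]]).T  # x^3 - 3x

dim_U = np.linalg.matrix_rank(B1)
dim_W = np.linalg.matrix_rank(B2)
dim_sum = np.linalg.matrix_rank(np.hstack([B1, B2]))   # dim(U + W)
dim_int = dim_U + dim_W - dim_sum                       # dim(U cap W), Theorem 1.26

print(dim_U, dim_W, dim_sum, dim_int)   # 3 3 4 2
```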

    2 Linear Transformations

Definition 2.1. Let $U$ and $V$ be two vector spaces over the same field of scalars. A function $T : U \to V$ is said to be a linear transformation or linear map or linear operator if

(a) $T(u + v) = Tu + Tv$ for all $u, v \in U$;
(b) $T(\alpha u) = \alpha Tu$ for all $u \in U$, $\alpha \in K$.

Remark 2.1. The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x + a$, with $a \in \mathbb{R}$ fixed, is customarily called a linear function, because its graph is a line. However, it is NOT a linear transformation as defined by us.

Theorem 2.2. Let $T : U \to V$ be linear. Then $T(0) = 0$; $T(-u) = -T(u)$; and $T(\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n) = \alpha_1 Tu_1 + \alpha_2 Tu_2 + \cdots + \alpha_n Tu_n$ for every finite linear combination $\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n \in U$.

Theorem 2.3. Let $U$ and $V$ be vector spaces with $U$ finite dimensional, and let $T : U \to V$ be a linear transformation. Then $T$ is completely defined by its action on the basis elements of $U$.

Proof. Let $\{u_1, u_2, \ldots, u_n\}$ be a basis for $U$ and let $Tu_1 = v_1$, $Tu_2 = v_2$, $\ldots$, $Tu_n = v_n$ be the action on the basis elements. Let $u \in U$ be arbitrary; we have $u = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n$. Then we define
$$Tu = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n.$$
This transformation $T$ is the required transformation.


(i) We first show that $T$ is unique. If there is any other linear map $S : U \to V$ with $Su_i = v_i$, $1 \leq i \leq n$, we need to show that $S = T$. By linearity of $S$, we have
$$S(\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n) = \alpha_1 Su_1 + \alpha_2 Su_2 + \cdots + \alpha_n Su_n = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n = Tu.$$
This proves the uniqueness of $T$.

(ii) Let $u = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n$ and $v = \beta_1 u_1 + \beta_2 u_2 + \cdots + \beta_n u_n$ be elements of $U$. Then
$$u + v = (\alpha_1 + \beta_1)u_1 + (\alpha_2 + \beta_2)u_2 + \cdots + (\alpha_n + \beta_n)u_n.$$
Hence
$$T(u + v) = (\alpha_1 + \beta_1)v_1 + (\alpha_2 + \beta_2)v_2 + \cdots + (\alpha_n + \beta_n)v_n = Tu + Tv$$
and
$$T(\alpha u) = \alpha\alpha_1 v_1 + \alpha\alpha_2 v_2 + \cdots + \alpha\alpha_n v_n = \alpha Tu.$$
This proves the linearity of $T$ and hence the theorem.

    2.1 Range Space, Null Space and Rank-Nullity Theorem

Definition 2.4. Let $T : U \to V$ be a linear operator. The null space $N(T)$ and range space $R(T)$ are defined as

(a) $N(T) = \{u \in U : T(u) = 0\}$;
(b) $R(T) = \{v \in V : v = T(u),\ u \in U\}$.

If $R(T)$ is finite dimensional, then $\dim(R(T))$ is called the rank of $T$ and is denoted by $r(T)$. Similarly, $\dim(N(T))$ is called the nullity of $T$ and is denoted by $n(T)$.

Theorem 2.5. Let $T : U \to V$ be linear. Then

1. $R(T)$ is a subspace of $V$.
2. $N(T)$ is a subspace of $U$.
3. $T$ is 1-1 iff $N(T) = \{0\}$.
4. If $L[u_1, u_2, \ldots, u_n] = U$, then $R(T) = L[T(u_1), T(u_2), \ldots, T(u_n)]$.
5. If $U$ is finite dimensional, then $\dim(R(T)) \leq \dim U$.

Proof. 1-3 are easy to prove.

4. Let $L[u_1, u_2, \ldots, u_n] = U$ and let $v \in R(T)$. Then we have $v = Tu$, $u \in U$, with $u = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n$, so
$$v = T(\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n) = \alpha_1 Tu_1 + \alpha_2 Tu_2 + \cdots + \alpha_n Tu_n \in L[T(u_1), T(u_2), \ldots, T(u_n)].$$
Hence $R(T) = L[T(u_1), T(u_2), \ldots, T(u_n)]$.

5. If $\dim(U) = n$, then by 4, $R(T)$ is spanned by the $n$ vectors $T(u_1), \ldots, T(u_n)$ and hence $\dim(R(T)) \leq n = \dim(U)$.


Theorem 2.6. Let $T : U \to V$ be linear.

If $T$ is 1-1 and $\{u_1, u_2, \ldots, u_n\}$ is l.i., then $\{T(u_1), T(u_2), \ldots, T(u_n)\}$ is l.i.

If $\{v_1, v_2, \ldots, v_n\}$ is an l.i. subset of $R(T)$ such that $v_i = T(u_i)$, $1 \leq i \leq n$, then $\{u_1, u_2, \ldots, u_n\}$ is l.i.

Theorem 2.7. (Rank-Nullity Theorem) Let $T : U \to V$ be linear with $\dim(U) = n$. Then
$$\dim(R(T)) + \dim(N(T)) = \dim(U),$$
or equivalently $r(T) + n(T) = n$.

    Proof. STEP 1

$N(T)$ is a finite dimensional subspace of $U$; let $B = \{u_1, u_2, \ldots, u_p\}$ be a basis for it. Extend $B$ to a basis $B_1 = \{u_1, u_2, \ldots, u_p, u_{p+1}, \ldots, u_n\}$ of $U$.

    STEP 2

Consider the set $S = \{Tu_{p+1}, Tu_{p+2}, \ldots, Tu_n\} \subseteq R(T)$. We show that (i) $L[S] = R(T)$ and (ii) $S$ is l.i.

(i) We have $L[B_1] = U$ and hence, since $Tu_1 = \cdots = Tu_p = 0$,
$$R(T) = L[Tu_1, Tu_2, \ldots, Tu_p, Tu_{p+1}, \ldots, Tu_n] = L[Tu_{p+1}, Tu_{p+2}, \ldots, Tu_n] = L[S].$$

(ii) Let $\alpha_{p+1} Tu_{p+1} + \alpha_{p+2} Tu_{p+2} + \cdots + \alpha_n Tu_n = 0$. By linearity of $T$ it follows that $T(\alpha_{p+1} u_{p+1} + \alpha_{p+2} u_{p+2} + \cdots + \alpha_n u_n) = 0$, and hence $\alpha_{p+1} u_{p+1} + \alpha_{p+2} u_{p+2} + \cdots + \alpha_n u_n \in N(T)$. Hence there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_p$ such that
$$\alpha_{p+1} u_{p+1} + \alpha_{p+2} u_{p+2} + \cdots + \alpha_n u_n = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_p u_p.$$
This gives
$$\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_p u_p - \alpha_{p+1} u_{p+1} - \alpha_{p+2} u_{p+2} - \cdots - \alpha_n u_n = 0.$$
Linear independence of $\{u_i\}_{i=1}^{n}$ implies that $\alpha_1 = \alpha_2 = \cdots = \alpha_p = \alpha_{p+1} = \cdots = \alpha_n = 0$. This proves that $S$ is l.i. Thus $S$ is a basis for $R(T)$ and hence $r(T) = n - p$, i.e. $r(T) + n(T) = n$.

Corollary 2.1. Let $T : \mathbb{R}^n \to \mathbb{R}^n$. Then $T$ is 1-1 iff $T$ is onto.

Corollary 2.2. Let $T : U \to V$ with $U$ and $V$ finite dimensional. If $T$ is 1-1, then $\dim(U) \leq \dim(V)$.

Corollary 2.3. Let $T : U \to V$ with $U$ and $V$ finite dimensional. If $T$ is onto, then $\dim(U) \geq \dim(V)$.
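As a small numerical illustration of the Rank-Nullity Theorem (a sketch, not from the notes; the $3 \times 4$ matrix below is an arbitrary example, and scipy.linalg.null_space is used to get a basis of $N(T)$):

```python
import numpy as np
from scipy.linalg import null_space

# An arbitrary 3x4 matrix, i.e. a linear map T : R^4 -> R^3.
A = np.array([[1., 2., 0., 1.],
              [0., 1., 1., 0.],
              [1., 3., 1., 1.]])      # third row = first + second, so rank 2

r = np.linalg.matrix_rank(A)          # dim R(T)
n_t = null_space(A).shape[1]          # dim N(T)
assert r + n_t == A.shape[1]          # r(T) + n(T) = dim(U) = 4
print(r, n_t)                         # 2 2
```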

    2.2 Invertible Linear Transformation

Definition 2.8. (Invertible Linear Transformation) A linear map (or linear operator or linear transformation) $T : U \to V$ is invertible or nonsingular if $T$ is 1-1 and onto. Such a transformation is also called an isomorphism.

Let $T : U \to V$ be nonsingular and let $v \in V$ be arbitrary. As $T$ is 1-1 and onto, it follows that there exists a unique $u \in U$ such that $Tu = v$. This helps us in defining a mapping $T^{-1} : V \to U$, called the inverse of $T : U \to V$, as follows:
$$T^{-1} v = u \iff Tu = v. \qquad (2.1)$$


We claim that $T^{-1} : V \to U$ is linear. Let
$$T^{-1}(v_1) = u_1 \quad\text{and}\quad T^{-1}(v_2) = u_2.$$
This implies that
$$T(u_1) = v_1 \quad\text{and}\quad T(u_2) = v_2.$$
Linearity of $T$ gives
$$T(\alpha_1 u_1 + \alpha_2 u_2) = \alpha_1 T(u_1) + \alpha_2 T(u_2) = \alpha_1 v_1 + \alpha_2 v_2,$$
and hence
$$T^{-1}(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 u_1 + \alpha_2 u_2. \qquad (2.2)$$
(2.1) and (2.2) together imply that
$$T^{-1}(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 T^{-1} v_1 + \alpha_2 T^{-1} v_2.$$

    We now use Rank-Nullity Theorem to get the following.

Theorem 2.9. Let $T : U \to V$ be a linear map with $\dim(U) = \dim(V) = n$. Then the following are equivalent:

$T$ is an isomorphism; $T$ is 1-1; $T$ maps l.i. sets into l.i. sets; $T$ transforms a basis to a basis; $T$ is onto; $r(T) = n$; $n(T) = 0$; $T^{-1}$ exists.

Definition 2.10. A vector space $U$ is said to be isomorphic to a vector space $V$ if there exists an isomorphism $T : U \to V$. We denote two isomorphic spaces by $U \cong V$.

Theorem 2.11. Every real (complex) vector space $V$ of dimension $n$ is isomorphic to $\mathbb{R}^n$ ($\mathbb{C}^n$).

    We now examine the dimension of the quotient space.

Theorem 2.12. Let $U$ and $W$ be subspaces of a vector space $V$ such that $V = U \oplus W$. Then

1. $W \cong (V/U)$.
2. If $V$ is finite dimensional with $\dim(V) = n$ and $\dim(U) = m$, then $\dim(V/U) = n - m$.

Proof. 1. We define an isomorphism from $W$ to $V/U$ as follows:
$$Tw = w + U, \quad w \in W.$$
It is clear that $T$ is linear and onto. It is also 1-1: if $w \in N(T)$, then $T(w) = \bar 0$, i.e. $w + U = U$, so $w \in U \cap W = \{0\}$ and hence $w = 0$. Thus $T$ is an isomorphism from $W$ to $V/U$, and hence $W \cong (V/U)$.

2. As $W \cong (V/U)$, it follows that $\dim(V/U) = \dim(W) = n - m$.


2.3 The Vector Space L(U, V) and Composition of Linear Maps

We now examine the set of all linear transformations from $U$ to $V$. Denote this set by $L(U, V)$ and define a binary operation $T + S$ and scalar multiplication $\alpha T$ on $L(U, V)$ as follows:
$$[T + S]u = Tu + Su, \quad u \in U; \qquad [\alpha T]u = \alpha[Tu], \quad u \in U,\ \alpha \in K.$$
We immediately get the following theorem concerning $L(U, V)$.

Theorem 2.13. The set $L(U, V)$ of all linear transformations from $U$ to $V$ is a vector space with respect to the above defined binary operation and scalar multiplication.

Definition 2.14. Let $U$, $V$, $W$ be vector spaces and let $T \in L(U, V)$ and $S \in L(V, W)$. Then the composition operation $(S \circ T)$ is defined as follows:
$$(S \circ T)(u) = S(T(u)), \quad u \in U.$$
Alternatively, we shall also write the composition as $ST$ instead of $(S \circ T)$. The composition operation is linear, as we see below:
$$(S \circ T)(u_1 + u_2) = S(T(u_1 + u_2)) = S(Tu_1 + Tu_2) = S(Tu_1) + S(Tu_2) = (S \circ T)(u_1) + (S \circ T)(u_2).$$
Similarly, we have
$$(S \circ T)(\alpha u) = S(T(\alpha u)) = S(\alpha Tu) = \alpha S(Tu) = \alpha(S \circ T)(u).$$
Thus $(S \circ T) \in L(U, W)$.

We have the following result concerning the binary operation and the composition operation on the vector space $L(V)$ of linear operators from $V$ to itself.

Theorem 2.15. For all $T, S, R \in L(V)$, we have
$$R(T + S) = RT + RS, \quad (T + S)R = TR + SR, \quad R(ST) = (RS)T, \quad (\alpha S)T = \alpha(ST) = S(\alpha T), \quad IT = TI = T.$$

This immediately gives us the following theorem.

Theorem 2.16. The vector space $L(V)$ of all linear transformations from $V$ to $V$ is an algebra with identity.


    We have the following two interesting theorems concerning invertible operators.

Theorem 2.17. Let $T : U \to V$ and $S : V \to W$ be two linear maps.

If $S$ and $T$ are nonsingular, then $ST$ is nonsingular and $(ST)^{-1} = T^{-1}S^{-1}$. If $ST$ is 1-1, then $T$ is 1-1. If $ST$ is onto, then $S$ is onto. If $ST$ is nonsingular, then $T$ is 1-1 and $S$ is onto. If $\dim(U) = \dim(V) = \dim(W)$ and $ST$ is nonsingular, then both $S$ and $T$ are nonsingular.
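A quick numerical sanity check of the first statement (a sketch; the $3 \times 3$ random matrices below stand in for the matrices of $S$ and $T$ and are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3))   # matrix of T : U -> V
S = rng.standard_normal((3, 3))   # matrix of S : V -> W
# Random Gaussian matrices are invertible with probability 1.

lhs = np.linalg.inv(S @ T)                   # (ST)^{-1}
rhs = np.linalg.inv(T) @ np.linalg.inv(S)    # T^{-1} S^{-1}
assert np.allclose(lhs, rhs)
```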

Theorem 2.18. $T : U \to V$ is nonsingular iff there exists $S : V \to U$ such that $TS = I_V$ and $ST = I_U$. In such a case
$$S = T^{-1} \quad\text{and}\quad T = S^{-1}. \qquad (2.3)$$

Proof. Let us assume that $T : U \to V$ is nonsingular. Denote by $S$ the inverse operator $T^{-1} : V \to U$. In view of (2.1), it follows that
$$S(v) = u \iff T(u) = v,$$
and hence
$$(S \circ T)(u) = S(Tu) = S(v) = u \quad\text{and}\quad (T \circ S)(v) = T(Sv) = T(u) = v,$$
which implies that
$$ST = I_U \quad\text{and}\quad TS = I_V.$$
Let us now assume that there exists $S : V \to U$ such that $TS = I_V$ and $ST = I_U$. As $ST = I_U$, it follows by Theorem 2.17 that $T$ is 1-1, and $TS = I_V$ implies that $T$ is onto. Hence $T$ is invertible with $T^{-1} = S$. Similarly, it follows that $S$ is invertible with $S^{-1} = T$, which proves the theorem.

    3 Geometry of Vector Spaces

    3.1 Inner Product Spaces and Orthogonality

Definition 3.1. An inner product $\langle u, v\rangle$ on a vector space $U$ is a function on $U \times U$ with values in $K$ such that the following hold:

1. $\langle u, u\rangle \geq 0$ for all $u \in U$, and equality holds iff $u = 0$.
2. $\langle u, v\rangle = \overline{\langle v, u\rangle}$ for all $u, v \in U$.
3. $\langle \alpha u + \beta v, w\rangle = \alpha\langle u, w\rangle + \beta\langle v, w\rangle$ for all $\alpha, \beta \in K$ and $u, v, w \in U$.

Definition 3.2. The vector space $U$ with an inner product defined on it is called an inner product space. In an inner product space $U$, $u$ is said to be orthogonal to $v$ if $\langle u, v\rangle = 0$; this is denoted by $u \perp v$. If $M$ is any subset of $U$, then $M^{\perp}$ denotes the set of all vectors perpendicular to $M$. That is,
$$M^{\perp} = \{w \in U : \langle u, w\rangle = 0 \ \ \forall\, u \in M\}.$$

Definition 3.3. A vector space $U$ is said to be a normed space if there exists a function $\|u\|$ from $U$ to $\mathbb{R}$ such that the following properties hold:

1. $\|u\| \geq 0$ for all $u \in U$, and $\|u\| = 0$ iff $u = 0$.
2. $\|\alpha u\| = |\alpha|\,\|u\|$ for all $u \in U$ and $\alpha \in K$.


3. $\|u + v\| \leq \|u\| + \|v\|$ for all $u, v \in U$.

In an inner product space $U$, the induced norm is defined by
$$\|u\|^2 = \langle u, u\rangle, \quad u \in U.$$

Definition 3.4. In a normed space $U$, the induced metric $d(u, v)$ (the distance between two vectors) is defined as
$$d(u, v) = \|u - v\|, \quad u, v \in U.$$
In view of Definition 3.3, this metric $d(u, v)$ satisfies the following properties:

1. $d(u, v) \geq 0$ for all $u, v \in U$, and $d(u, v) = 0$ iff $u = v$;
2. $d(u, v) = d(v, u)$ for all $u, v \in U$;
3. $d(u, w) \leq d(u, v) + d(v, w)$ for all $u, v, w \in U$.

Definition 3.5. It is possible to define a metric $d(u, v)$ satisfying properties 1-3 on any set, without a vector space structure. Such a set is called a metric space. In a metric space, without the vector space structure, we can easily define the concepts of convergence, Cauchy convergence, completeness etc., as we see below.

Definition 3.6. A sequence $\{x_k\}$ in a metric space $(U, d)$ is said to be convergent to $x \in U$ if $d(x_k, x) \to 0$ as $k \to \infty$. $\{x_k\}$ is said to be Cauchy if $d(x_k, x_l) \to 0$ as $k, l \to \infty$.

Definition 3.7. A metric space $(U, d)$ is said to be complete if every Cauchy sequence in $(U, d)$ converges. A complete normed space is called a Banach space, whereas a complete inner product space is called a Hilbert space. In a normed space $U$, it is possible to define the infinite sum $\sum_{i=1}^{\infty} u_i$: we say that $S = \sum_{i=1}^{\infty} u_i$ iff the sequence of partial sums $S_n = \sum_{i=1}^{n} u_i \in U$ converges to $S \in U$.

Example 3.1. In the space $\mathbb{R}^n$ of $n$-tuples, the Euclidean norm and inner product are defined as
$$\|u\|^2 = u_1^2 + u_2^2 + \cdots + u_n^2, \quad u = (u_1, u_2, \ldots, u_n), \qquad (3.1)$$
$$\langle u, v\rangle = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n, \quad u = (u_1, u_2, \ldots, u_n),\ v = (v_1, v_2, \ldots, v_n). \qquad (3.2)$$
In terms of matrix notation, we write $\langle u, v\rangle = v^{\top}u = u^{\top}v$ if we treat $u, v$ as column vectors, and $\langle u, v\rangle = uv^{\top} = vu^{\top}$ if we treat them as row vectors.

One can show that $\mathbb{R}^n$ is complete with respect to the norm induced by the inner product defined by Eq. (3.2), and hence it is a Hilbert space.
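A minimal numerical sketch of these formulas (the vectors are arbitrary choices); the last line checks the Schwartz inequality of Theorem 3.10 below:

```python
import numpy as np

u = np.array([1.0, 2.0, -1.0])
v = np.array([3.0, 0.0, 4.0])

ip = np.dot(u, v)                  # <u, v> = u1 v1 + ... + un vn
norm_u = np.sqrt(np.dot(u, u))     # ||u||^2 = <u, u>
norm_v = np.linalg.norm(v)

# Schwartz inequality: |<u, v>| <= ||u|| ||v||
assert abs(ip) <= norm_u * norm_v + 1e-12
print(ip, norm_u, norm_v)
```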

Remark 3.1. It is possible to define other norms on the space $\mathbb{R}^n$, for example
$$\|u\|_1 = |u_1| + |u_2| + \cdots + |u_n|, \qquad \|u\|_{\infty} = \max_{1 \leq i \leq n} |u_i|.$$

Definition 3.8. Two different norms $\|u\|_a$ and $\|u\|_b$ on a normed space $U$ are said to be equivalent if there exist positive constants $\alpha, \beta$ such that
$$\alpha\|u\|_a \leq \|u\|_b \leq \beta\|u\|_a.$$
One can show that all norms on $\mathbb{R}^n$ are equivalent, and hence $\mathbb{R}^n$ is also complete with respect to the norms $\|u\|_1$ and $\|u\|_{\infty}$ defined above.


Example 3.2. In the space $C[0, 1]$ of all real valued continuous functions, one can define a norm and inner product as
$$\|f\|_2^2 = \int_0^1 f^2(t)\,dt, \qquad \langle f, g\rangle = \int_0^1 f(t)g(t)\,dt.$$

Example 3.3. Let $L_2[a, b]$ denote the space of all square integrable functions on $[a, b]$:
$$L_2[a, b] = \Big\{f : [a, b] \to \mathbb{R} : \int_a^b f^2(t)\,dt < \infty\Big\}.$$


Example 3.4. $X = C[0, 1]$ with
$$\|f\| = \sup_{t \in [0, 1]} |f(t)|, \quad f \in X.$$
Take $f = t$, $g = 1$. One can check that the parallelogram law ($\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2$, which holds in any inner product space) is not satisfied, and hence the Banach space $C[0, 1]$ with respect to the above norm can never be made a Hilbert space.

Theorem 3.10. (Schwartz Inequality) Let $U$ be an inner product space. Then for all $x, y \in U$ we have the inequality
$$|\langle x, y\rangle| \leq \|x\|\,\|y\|. \qquad (3.4)$$
Equality holds iff $x$ and $y$ are l.d.

Proof. If either of the elements $x, y$ is zero, we are through. So assume that $x, y \neq 0$. By normalizing $y$ as $e = \frac{y}{\|y\|}$, the above inequality reduces to
$$|\langle x, e\rangle| \leq \|x\| \quad \text{for all } x \in X. \qquad (3.5)$$
So it suffices to show Eq. (3.5). We have
$$0 \leq \langle x - \langle x, e\rangle e,\ x - \langle x, e\rangle e\rangle = \langle x, x\rangle - |\langle x, e\rangle|^2.$$
This gives Eq. (3.5). Also, if equality holds in Eq. (3.5), we get
$$\langle x - \langle x, e\rangle e,\ x - \langle x, e\rangle e\rangle = 0.$$
This implies that
$$x - \langle x, e\rangle e = 0,$$
and hence $x$ and $y$ are l.d. On the other hand, if $x, y$ are l.d., then equality holds in Eq. (3.5). This proves the theorem.

Let $M$ be a closed subspace of a Hilbert space $U$. Given an element $x \in U$, we wish to obtain $u \in M$ which is closest to $x$. We have the following theorem in this direction.

Theorem 3.11. Suppose $x \in U$ and $M$ is a closed subspace of $U$. Then there exists a unique element $u \in M$ such that
$$\|x - u\| = \inf_{y \in M} \|x - y\|.$$

Proof. Let $d = \inf_{y \in M} \|x - y\|$. Then there exists a sequence $\{u_n\} \subseteq M$ such that
$$\|x - u_n\| \to d.$$
By the parallelogram law, we have
$$\|u_n - u_m\|^2 = \|(u_n - x) - (u_m - x)\|^2 = 2\|u_n - x\|^2 + 2\|u_m - x\|^2 - \|u_n + u_m - 2x\|^2$$
$$\leq 2\|u_n - x\|^2 + 2\|u_m - x\|^2 - 4d^2 \to 2d^2 + 2d^2 - 4d^2 = 0,$$
where we used $\frac{u_n + u_m}{2} \in M$, so that $\|u_n + u_m - 2x\|^2 = 4\|\frac{u_n + u_m}{2} - x\|^2 \geq 4d^2$. That is, $\{u_n\}$ is Cauchy, and since $M$ is a closed subspace of $U$, it is complete and hence $u_n \to u \in M$. It is clear that
$$\|x - u\| = \lim_{n\to\infty} \|x - u_n\| = d.$$
To prove the uniqueness of $u$, assume that there exists $v \in M$ which is also closest to $x$. As before, we have
$$\|u - v\|^2 \leq 2\|u - x\|^2 + 2\|v - x\|^2 - 4d^2 = 2d^2 + 2d^2 - 4d^2 = 0.$$
This gives $u = v$.

In the following theorem, $M^{\perp}$ denotes the space of all elements orthogonal to $M$.

Theorem 3.12. Let $M$ be a closed subspace of $U$. Then $U = M \oplus M^{\perp}$.

Proof. Suppose $x \in U$ and $M$ is a closed subspace of $U$. Then by the previous theorem there exists a unique element $u \in M$ which is closest to $x$. Define $v = x - u$; then clearly $x = u + v$. Let $w \in M$ and scalar $t$ be arbitrary, and let $d = \|x - u\| = \|v\|$. Then
$$d^2 \leq \|x - (u + tw)\|^2 = \|v - tw\|^2 = d^2 - 2t\,\mathrm{Re}\langle v, w\rangle + t^2\|w\|^2.$$
Thus $-2t\,\mathrm{Re}\langle v, w\rangle + t^2\|w\|^2 \geq 0$ for all $t$, thereby implying that $\mathrm{Re}\langle v, w\rangle = 0$. Repeating the same argument with $it$ instead of $t$, we get $\mathrm{Im}\langle v, w\rangle = 0$ and hence $\langle v, w\rangle = 0$. This implies that $v \in M^{\perp}$. Thus we have $x = u + v$, $u \in M$, $v \in M^{\perp}$. We claim that this representation $x = u + v$ is unique. If not, let
$$x = u_1 + v_1 = u_2 + v_2, \quad u_1, u_2 \in M,\ v_1, v_2 \in M^{\perp}.$$
This implies that
$$u_1 - u_2 = v_2 - v_1 \in M \cap M^{\perp},$$
and hence
$$u_1 - u_2 = v_2 - v_1 = 0.$$
This proves that $U$ is the direct sum of $M$ and $M^{\perp}$.
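A concrete finite dimensional illustration of this decomposition (a sketch; the subspace $M$ and the vector $x$ below are arbitrary choices): the orthogonal projection of $x$ onto $M$ gives $u \in M$, and $v = x - u$ lies in $M^{\perp}$.

```python
import numpy as np

# M = column space of B, a 2-dimensional subspace of R^3 (arbitrary example).
B = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
x = np.array([1., 2., 5.])

# Orthogonal projection of x onto M: u = B (B^T B)^{-1} B^T x.
u = B @ np.linalg.solve(B.T @ B, B.T @ x)
v = x - u                           # component in M-perp

assert np.allclose(B.T @ v, 0)      # v is orthogonal to every column of B
assert np.allclose(u + v, x)        # x = u + v, with u in M and v in M-perp
```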

Definition 3.13. A set $S \subseteq U$ is called an orthonormal set if $\langle e_\alpha, e_\beta\rangle = 0$ for all $e_\alpha, e_\beta \in S$ with $\alpha \neq \beta$, and $\|e_\alpha\| = 1$ for all $e_\alpha \in S$.

We have the following property for orthonormal sets in $U$.

Theorem 3.14. Let $\{e_\alpha\}_{\alpha \in A}$ be a collection of orthonormal elements in $U$. Then, for all $x \in U$, we have
$$\sum_{\alpha \in A} |\langle x, e_\alpha\rangle|^2 \leq \|x\|^2 \qquad (3.6)$$
and further
$$x - \sum_{\alpha \in A} \langle x, e_\alpha\rangle e_\alpha \;\perp\; e_\beta \quad \forall\, \beta \in A.$$
The inequality Eq. (3.6) is referred to as Bessel's inequality.

Definition 3.15. Let $S$ be an orthonormal set in a Hilbert space $U$. Then $S$ is called a basis for $U$ (or a complete orthonormal system) if no other orthonormal set contains $S$ as a proper subset.

The following theorem presents the most important property of a complete orthonormal set.

Theorem 3.16. Let $U$ be a Hilbert space and let $S = \{e_\alpha\}_{\alpha \in A}$ be an orthonormal set in $U$. Then the following are equivalent in $U$.


(1) $S = \{e_\alpha\}_{\alpha \in A}$ is complete.
(2) $0$ is the only vector which is orthogonal to every $e_\alpha \in S$; that is, if $x \perp e_\alpha$ for every $\alpha$, then $x = 0$.
(3) Every vector $x \in U$ has the Fourier series expansion $x = \sum_{\alpha \in A} \langle x, e_\alpha\rangle e_\alpha$.
(4) Every vector $x \in U$ satisfies Parseval's equality
$$\|x\|^2 = \sum_{\alpha \in A} |\langle x, e_\alpha\rangle|^2. \qquad (3.7)$$

Proof. (1) $\Rightarrow$ (2): Suppose (2) is not true; then there exists $x \neq 0$ such that $x \perp e_\alpha$ for all $\alpha \in A$. Defining $e = \frac{x}{\|x\|}$, we get an orthonormal set $S \cup \{e\}$ which properly contains $S$. This contradicts the completeness of $S$.

(2) $\Rightarrow$ (3): $x - \sum_{\alpha \in A} \langle x, e_\alpha\rangle e_\alpha$ is orthogonal to $e_\beta$ for every $\beta$. But by (2), it follows that it must be the zero vector $0$, and hence
$$x = \sum_{\alpha \in A} \langle x, e_\alpha\rangle e_\alpha.$$

(3) $\Rightarrow$ (4): We have
$$\|x\|^2 = \langle x, x\rangle = \Big\langle \sum_{\alpha \in A} \langle x, e_\alpha\rangle e_\alpha,\ \sum_{\beta \in A} \langle x, e_\beta\rangle e_\beta \Big\rangle = \sum_{\alpha \in A} |\langle x, e_\alpha\rangle|^2.$$

(4) $\Rightarrow$ (1): If $S = \{e_\alpha\}_{\alpha \in A}$ is not complete, it is properly contained in an orthonormal set $S'$; let $e \in S' \setminus S$. Then Parseval's equality Eq. (3.7) gives
$$\|e\|^2 = \sum_{\alpha \in A} |\langle e, e_\alpha\rangle|^2 = 0,$$
contradicting the fact that $e$ is a unit vector.

Example 3.5. In $U = L_2[-\pi, \pi]$, the collection of functions
$$S = \Big\{\frac{1}{\sqrt{2\pi}},\ \frac{\cos t}{\sqrt{\pi}},\ \frac{\cos 2t}{\sqrt{\pi}},\ \ldots,\ \frac{\sin t}{\sqrt{\pi}},\ \frac{\sin 2t}{\sqrt{\pi}},\ \ldots\Big\}$$
is a complete orthonormal set, and every $f \in L_2[-\pi, \pi]$ has a Fourier series expansion
$$f(t) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \big[a_k \cos kt + b_k \sin kt\big], \qquad (3.8)$$
where
$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos kt\,dt \quad\text{and}\quad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin kt\,dt, \quad k = 0, 1, 2, \ldots$$
The fact that the above set $S$ forms an orthonormal basis in $L_2[-\pi, \pi]$ will be proved elsewhere. We also note that for any $f \in L_2[-\pi, \pi]$ we have Parseval's relation
$$\int_{-\pi}^{\pi} f^2(t)\,dt = \frac{(f, 1)^2}{2\pi} + \sum_{n=1}^{\infty} \frac{1}{\pi}\big[(f, \cos nt)^2 + (f, \sin nt)^2\big].$$


The Fourier series representation given by Eq. (3.8) will be used while discussing the solution of boundary value problems.

For the piecewise continuous function $f(t)$ defined on $[-\pi, \pi]$ by
$$f(t) = \begin{cases} -1, & t < 0 \\ 1, & t \geq 0 \end{cases}$$
we have the following Fourier series representation:
$$f(t) = \frac{4}{\pi} \sum_{n=0}^{\infty} \frac{\sin(2n+1)t}{2n+1}.$$
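A short numerical check of this expansion (a sketch; the integrals are approximated by a simple Riemann sum): the coefficients $b_k$ of the square wave should be $4/(\pi k)$ for odd $k$ and $0$ for even $k$.

```python
import numpy as np

t = np.linspace(-np.pi, np.pi, 20001)
dt = t[1] - t[0]
f = np.where(t >= 0, 1.0, -1.0)                   # the square wave above

for k in range(1, 8):
    bk = np.sum(f * np.sin(k * t)) * dt / np.pi   # b_k = (1/pi) * integral of f(t) sin(kt) dt
    expected = 4.0 / (np.pi * k) if k % 2 == 1 else 0.0
    print(k, round(bk, 3), round(expected, 3))    # even k give 0, odd k give 4/(pi k)
```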

    3.3 Gram-Schmidt Orthonormalisation Procedure

We now give the Gram-Schmidt procedure, which produces an orthonormal collection $\{e_1, e_2, \ldots, e_n, \ldots\}$ out of an l.i. collection $\{x_1, x_2, \ldots, x_n, \ldots\}$. We first obtain orthogonal vectors $\{y_1, y_2, \ldots, y_n, \ldots\}$ as follows:
$$y_1 = x_1,$$
$$\vdots$$
$$y_i = x_i - \alpha_{i,1} y_1 - \alpha_{i,2} y_2 - \cdots - \alpha_{i,i-1} y_{i-1},$$
where
$$\alpha_{i,j} = \frac{\langle x_i, y_j\rangle}{\langle y_j, y_j\rangle}, \quad 1 \leq j \leq i - 1.$$
This is continued inductively for $i + 1, i + 2, \ldots, n$. It is clear that
$$L[x_1, x_2, \ldots, x_n] = L[y_1, y_2, \ldots, y_n]$$
for each $n$. Normalizing the $y_i$, we get the orthonormal collection $\{e_1, e_2, \ldots, e_n, \ldots\}$, $e_i = y_i / \|y_i\|$.

Example 3.6. In the space $L_2[-1, 1]$, the set of polynomials $\{P_0(t), P_1(t), \ldots\}$, called the Legendre polynomials, is obtained by the Gram-Schmidt orthonormalisation procedure. We first compute the polynomials $p_0, p_1, p_2, p_3$ and $p_4$.

$$p_0(t) = 1, \qquad p_1(t) = t,$$
$$p_2(t) = t^2 - \alpha_0 p_0(t) - \alpha_1 p_1(t), \quad \alpha_0 = \frac{(t^2, 1)}{(1, 1)} = \frac{1}{3}, \quad \alpha_1 = \frac{(t^2, t)}{(t, t)} = 0,$$
and hence
$$p_2(t) = t^2 - \frac{1}{3};$$
$$p_3(t) = t^3 - \alpha_0 p_0(t) - \alpha_1 p_1(t) - \alpha_2 p_2(t), \quad \alpha_0 = \frac{(t^3, 1)}{(1, 1)} = 0, \quad \alpha_1 = \frac{(t^3, t)}{(t, t)} = \frac{3}{5}, \quad \alpha_2 = 0,$$
and hence
$$p_3(t) = t^3 - \frac{3}{5}t;$$
$$p_4(t) = t^4 - \alpha_0 p_0(t) - \alpha_1 p_1(t) - \alpha_2 p_2(t) - \alpha_3 p_3(t),$$
$$\alpha_0 = \frac{(t^4, 1)}{(1, 1)} = \frac{1}{5}, \quad \alpha_1 = 0, \quad \alpha_2 = \frac{(t^4, p_2(t))}{(p_2(t), p_2(t))} = \frac{6}{7}, \quad \alpha_3 = 0,$$
and hence
$$p_4(t) = t^4 - \frac{6}{7}p_2(t) - \frac{1}{5} = t^4 - \frac{6}{7}t^2 + \frac{3}{35}.$$
Normalising these polynomials (so that $P_i(1) = 1$) we get the Legendre polynomials $P_i(t)$, $0 \leq i \leq 4$, given by
$$P_0(t) = 1, \quad P_1(t) = t, \quad P_2(t) = \frac{1}{2}(3t^2 - 1), \quad P_3(t) = \frac{1}{2}(5t^3 - 3t), \quad P_4(t) = \frac{1}{8}(35t^4 - 30t^2 + 3).$$
The higher degree Legendre polynomials are obtained in a similar way. For $f \in L_2[-1, 1]$, we have the Legendre series representation
$$f(t) = \sum_{i=0}^{\infty} a_i P_i(t), \qquad a_i = \frac{\langle f, P_i\rangle}{\langle P_i, P_i\rangle} = \frac{2i+1}{2}\int_{-1}^{1} f(t)P_i(t)\,dt, \quad i \geq 0.$$
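The same computation can be scripted (a minimal sketch using numpy's polynomial helpers; the inner product is the $L_2[-1, 1]$ integral used above), reproducing $p_0, \ldots, p_4$:

```python
import numpy as np
from numpy.polynomial import polynomial as P

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(t) q(t) dt, for coefficient arrays p, q."""
    anti = P.polyint(P.polymul(p, q))
    return P.polyval(1.0, anti) - P.polyval(-1.0, anti)

# Gram-Schmidt on the monomials 1, t, t^2, t^3, t^4 in L2[-1, 1].
monomials = [np.array([0.0] * k + [1.0]) for k in range(5)]
orth = []
for x in monomials:
    y = x.copy()
    for yj in orth:
        y = P.polysub(y, inner(x, yj) / inner(yj, yj) * yj)   # subtract projections
    orth.append(y)

for y in orth:
    print(np.round(y, 4))
# coefficients (increasing powers): [1], [0 1], [-1/3 0 1], [0 -3/5 0 1], [3/35 0 -6/7 0 1]
```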

    4 Linear Transformations on Finite Dimensional Spaces

    4.1 Linear Transformations and Matrices

Definition 4.1. Let $T : U \to V$ be a linear transformation with $\dim(U) = n$ and $\dim(V) = m$. Let $B_1 = \{u_1, u_2, \ldots, u_n\}$ be a basis for $U$ and $B_2 = \{v_1, v_2, \ldots, v_m\}$ be a basis for $V$. Then there exist scalars $\alpha_{ij} \in K$, $1 \leq i \leq m$, $1 \leq j \leq n$, such that
$$Tu_j = \sum_{i=1}^{m} \alpha_{ij} v_i, \quad 1 \leq j \leq n. \qquad (4.1)$$


Then $(\alpha_{ij})$ is called the matrix of $T$ with respect to the basis $B_1$ of $U$ and the basis $B_2$ of $V$, and is denoted by $m(T)$. That is,
$$m(T) = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & & & \vdots \\ \alpha_{m1} & \alpha_{m2} & \cdots & \alpha_{mn} \end{pmatrix}.$$
Conversely, let $A = (\alpha_{ij})$ be an $m \times n$ matrix, and let $T : U \to V$ be defined through Eq. (4.1) by its action on the elements of the basis $B_1 = \{u_1, u_2, \ldots, u_n\}$ of $U$. Since a linear operator is uniquely defined by its action on the basis elements, $T$ defined by (4.1) is such that $T \in L(U, V)$. Thus there is a 1-1 correspondence between $T \in L(U, V)$ and $m \times n$ matrices $A$.

Also, from the definition of the matrix associated with a linear operator, it follows that if $m(T) = (\alpha_{ij})$ and $m(S) = (\beta_{ij})$, then
$$m(\alpha T + \beta S) = \alpha(\alpha_{ij}) + \beta(\beta_{ij}), \quad \alpha, \beta \in K.$$
Thus, we immediately get the following theorem.

Theorem 4.2. The vector space $L(U, V)$ is isomorphic to the space $M_{mn}$ of all $m \times n$ matrices.

Corollary 4.1. $\dim L(U, V) = mn$.

We now focus on $L(V)$ and examine the composition operation on it. Let $T, S, R \in L(V)$ with $m(T) = (\alpha_{ij})$, $m(S) = (\beta_{ij})$ and $m(R) = (\gamma_{ij})$, and let $R = TS$. We will show that $\gamma_{ij} = \sum_k \alpha_{ik}\beta_{kj}$. We have
$$Rv_j = T(Sv_j) = T\Big(\sum_i \beta_{ij} v_i\Big) = \sum_i \beta_{ij}\, T(v_i) = \sum_i \beta_{ij}\Big(\sum_k \alpha_{ki} v_k\Big).$$
Interchanging the indices $i$ and $k$, we get
$$Rv_j = \sum_k \beta_{kj}\Big(\sum_i \alpha_{ik} v_i\Big) = \sum_i \Big(\sum_k \alpha_{ik}\beta_{kj}\Big) v_i = \sum_i \gamma_{ij} v_i.$$
This shows that $m(R) = (\gamma_{ij})$; that is, $m(TS) = m(T)m(S)$. Further, if $T$ is invertible, then $TT^{-1} = T^{-1}T = I$. This gives us $m(TT^{-1}) = m(T)m(T^{-1}) = I$ and hence $m(T^{-1}) = [m(T)]^{-1}$. This can be stated as a theorem.

Theorem 4.3. The composition operation in $L(V)$ corresponds to the multiplication operation in the space $M_{nn}$ of $n \times n$ matrices. Further, if $T$ is invertible, then $m(T^{-1}) = [m(T)]^{-1}$.

Let $T \in L(V)$, and let $m_1(T)$ be the matrix of $T$ with respect to a basis $B_1$ and $m_2(T)$ the matrix of $T$ with respect to a basis $B_2$. We wish to find the relation between $m_1(T)$ and $m_2(T)$.

Theorem 4.4. Let $T \in L(V)$, let $A = m_1(T)$ be the matrix of $T$ with respect to the basis $B_1 = \{v_1, v_2, \ldots, v_n\}$, and let $B = m_2(T)$ be the matrix of $T$ with respect to the basis $B_2 = \{w_1, w_2, \ldots, w_n\}$. Then there exists a nonsingular matrix $C$ such that
$$B = C^{-1}AC.$$


Proof. Let $S$ be the linear operator which maps the basis elements of $B_1$ to the basis elements of $B_2$, that is,
$$Sv_i = w_i, \quad 1 \leq i \leq n.$$
As $S$ maps a basis to a basis, by Theorem 2.9 we get that $S$ is invertible. Let $C = m(S)$ and $C^{-1} = m(S^{-1})$. Let $A = (\alpha_{ij})$, $B = (\beta_{ij})$. We have $Tw_i = \sum_j \beta_{ji} w_j$ and $w_i = Sv_i$. This gives us
$$TSv_i = \sum_j \beta_{ji}(Sv_j) = S\Big(\sum_j \beta_{ji} v_j\Big).$$
Since $S$ is invertible, we get $[S^{-1}TS]v_i = S^{-1}S\big(\sum_j \beta_{ji} v_j\big)$. So we get
$$[S^{-1}TS]v_i = \sum_j \beta_{ji} v_j,$$
$$m(S^{-1}TS) = (\beta_{ij}) = B.$$
By Theorem 4.3, $m(S^{-1}TS) = m(S^{-1})m(T)m(S) = [m(S)]^{-1}m(T)m(S)$, and hence $B = C^{-1}AC$.

Definition 4.5. The matrices $A$ and $B$ are said to be similar if there exists an invertible matrix $C$ such that $B = C^{-1}AC$.

Remark 4.1. The above theorem states that the matrices of $T$ corresponding to different bases are similar.

Example 4.1. Let $D : \mathcal{P}_n \to \mathcal{P}_n$ be the differential operator $D(p(x)) = p'(x)$. Then
$$m_1(D) = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & n \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
with respect to the basis $B_1 = \{1, x, x^2, \ldots, x^n\}$ for $\mathcal{P}_n$.

If we use $B_2 = \{1, 1 + x, 1 + x^2, \ldots, 1 + x^n\}$ as a basis for $\mathcal{P}_n$, then
$$m_2(D) = \begin{pmatrix} 0 & 1 & -2 & -3 & \cdots & -n \\ 0 & 0 & 2 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 3 & \cdots & 0 \\ \vdots & & & & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & n \\ 0 & 0 & 0 & 0 & \cdots & 0 \end{pmatrix}$$
is the matrix of $D$ with respect to the basis $B_2$. The linear transformation which maps $B_1$ to $B_2$ has the matrix representation
$$C = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix}.$$
One can show that $m_1(D)$ and $m_2(D)$ are similar through the invertible matrix $C$; that is, $m_2(D) = C^{-1}m_1(D)C$.
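A small numerical check of this example (a sketch for $n = 3$; both matrices are built directly from the definitions above rather than copied from the display):

```python
import numpy as np

n = 3
# m1(D): matrix of differentiation w.r.t. the basis {1, x, x^2, x^3}.
M1 = np.zeros((n + 1, n + 1))
for k in range(1, n + 1):
    M1[k - 1, k] = k                   # D(x^k) = k x^(k-1)

# C: change of basis, columns are the coordinates of 1, 1+x, ..., 1+x^n in B1.
C = np.eye(n + 1)
C[0, 1:] = 1.0

M2 = np.linalg.inv(C) @ M1 @ C         # matrix of D w.r.t. B2 = {1, 1+x, ..., 1+x^n}
print(M2)                              # first row (0, 1, -2, -3), as in m2(D) above
```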


    4.2 Eigenvalues and Eigenvectors

Definition 4.6. $\lambda \in K$ is called a characteristic root or eigenvalue of $T$ iff there exists $v \neq 0$ in $V$ such that $Tv = \lambda v$. This $v \neq 0$ in $V$ is called a characteristic vector or eigenvector of $T$ corresponding to the characteristic root or eigenvalue $\lambda$.

Remark 4.2. If $\lambda \in K$ is a characteristic root of $T$, then $\lambda^k$ is a characteristic root of $T^k$ for all positive integers $k$. Further, if $p(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n$ ($a_i \in K$, $0 \leq i \leq n$) is any polynomial, then
$$[a_0 T^n + a_1 T^{n-1} + \cdots + a_n I]v = [a_0 \lambda^n + a_1 \lambda^{n-1} + \cdots + a_n]v,$$
and hence $p(\lambda)$ is a characteristic root of $p(T)$.

We first observe that if $T \in L(V)$ with $\dim(V) = n$, then $T$ satisfies a nontrivial polynomial (with coefficients in $K$) of degree at most $n^2$. This follows from the fact that $I, T, T^2, \ldots, T^{n^2}$ are $(n^2 + 1)$ elements in the vector space $L(V)$ of dimension $n^2$.

Theorem 4.7. The eigenvectors corresponding to distinct eigenvalues of $T \in L(V)$ are linearly independent.

Proof. Let $v_1, v_2, \ldots, v_n$ be eigenvectors of $T \in L(V)$ with $\lambda_1, \lambda_2, \ldots, \lambda_n$ as the corresponding distinct eigenvalues. This means that
$$Tv_i = \lambda_i v_i, \quad v_i \neq 0, \quad 1 \leq i \leq n.$$
We proceed by induction on $n$. Obviously, the result is true for $n = 1$, as $v_1 \neq 0$. Assume that the result is true for all $i < k$. We wish to show that it is also true for $i = k$. So let
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k = 0. \qquad (4.2)$$
Operating on the above equation by $T$, we get
$$\alpha_1 Tv_1 + \alpha_2 Tv_2 + \cdots + \alpha_k Tv_k = 0. \qquad (4.3)$$
As $v_i$ is an eigenvector of $T$ corresponding to the eigenvalue $\lambda_i$, $1 \leq i \leq k$, it follows that
$$\alpha_1 \lambda_1 v_1 + \alpha_2 \lambda_2 v_2 + \cdots + \alpha_k \lambda_k v_k = 0. \qquad (4.4)$$
Multiplying Eq. (4.2) by $\lambda_k$ and subtracting Eq. (4.4) from it, we get
$$(\lambda_k - \lambda_1)\alpha_1 v_1 + (\lambda_k - \lambda_2)\alpha_2 v_2 + \cdots + (\lambda_k - \lambda_{k-1})\alpha_{k-1} v_{k-1} = 0. \qquad (4.5)$$
As $v_1, v_2, \ldots, v_{k-1}$ are assumed to be linearly independent, we get that
$$(\lambda_k - \lambda_i)\alpha_i = 0, \quad 1 \leq i \leq k - 1.$$
Since the eigenvalues of $T$ are distinct, it follows that $\alpha_i = 0$, $1 \leq i \leq k - 1$. This in turn implies that $\alpha_k = 0$, in view of Eq. (4.2). This proves the linear independence of the eigenvectors of $T$.

This gives us the following corollaries.

Corollary 4.2. $T \in L(V)$ can have at most $n$ distinct characteristic roots.

Corollary 4.3. If $T \in L(V)$ has $n$ distinct characteristic roots, then $V$ has a basis consisting of characteristic vectors of $T$.


We now show how to diagonalize a matrix when its eigenvectors are l.i. Let $v^{(1)}, v^{(2)}, \ldots, v^{(n)}$ be linearly independent eigenvectors of a matrix $A$ with $\lambda_1, \lambda_2, \ldots, \lambda_n$ as the corresponding eigenvalues. Define the matrix $C$ as
$$C = \big[v^{(1)}\ v^{(2)}\ \cdots\ v^{(n)}\big].$$
As the $v^{(i)}$ ($1 \leq i \leq n$) are linearly independent, $C$ is nonsingular, and further we have
$$AC = \big[Av^{(1)}\ Av^{(2)}\ \cdots\ Av^{(n)}\big] = \big[\lambda_1 v^{(1)}\ \lambda_2 v^{(2)}\ \cdots\ \lambda_n v^{(n)}\big] = CD,$$
where
$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.$$
This implies that
$$C^{-1}AC = D.$$
Thus $A$ is similar to a diagonal matrix.

Example 4.2. Let
$$A = \begin{pmatrix} -2 & 2 & -3 \\ 2 & 1 & -6 \\ -1 & -2 & 0 \end{pmatrix}.$$
$A$ has eigenvalues $-3, -3, 5$ with linearly independent eigenvectors
$$\begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}, \qquad \begin{pmatrix} 3 \\ 0 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}.$$
Therefore
$$C = \begin{pmatrix} -2 & 3 & 1 \\ 1 & 0 & 2 \\ 0 & 1 & -1 \end{pmatrix} \qquad\text{with}\qquad C^{-1} = \begin{pmatrix} -1/4 & 1/2 & 3/4 \\ 1/8 & 1/4 & 5/8 \\ 1/8 & 1/4 & -3/8 \end{pmatrix}.$$
It can be seen that $A$ can be diagonalized as
$$C^{-1}AC = \begin{pmatrix} -3 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 5 \end{pmatrix}.$$
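This diagonalization is easy to verify numerically (a minimal sketch):

```python
import numpy as np

A = np.array([[-2.,  2., -3.],
              [ 2.,  1., -6.],
              [-1., -2.,  0.]])
C = np.array([[-2., 3.,  1.],
              [ 1., 0.,  2.],
              [ 0., 1., -1.]])   # columns are the eigenvectors above

D = np.linalg.inv(C) @ A @ C
print(np.round(D, 10))            # diag(-3, -3, 5)
```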

    Symmetric Matrices

Theorem 4.8. Let $A$ be a symmetric matrix. Then (i) the eigenvalues of $A$ are real, and (ii) the eigenvectors corresponding to distinct eigenvalues are orthogonal.

Proof. Let $\lambda$ be an eigenvalue of $A$ with eigenvector $v$. This implies that
$$Av = \lambda v, \quad v \neq 0.$$
Taking the inner product of the above equation with $v$, we get
$$\langle Av, v\rangle = \lambda\langle v, v\rangle, \qquad \langle v, Av\rangle = \bar\lambda\langle v, v\rangle.$$
Subtracting the above equations, we get
$$\langle Av, v\rangle - \langle v, Av\rangle = (\lambda - \bar\lambda)\langle v, v\rangle,$$
which gives
$$\langle Av, v\rangle - \langle A^{\top}v, v\rangle = (\lambda - \bar\lambda)\langle v, v\rangle.$$
As $A^{\top} = A$, we get that $(\lambda - \bar\lambda) = 0$; that is, $\lambda$ is real.

Let $\mu\ (\neq \lambda) \in \mathbb{R}$ be another eigenvalue of $A$, with eigenvector $w \neq 0$. This implies that
$$Av = \lambda v, \qquad Aw = \mu w.$$
Taking the inner product of these equations with $w$ and $v$, respectively, we get
$$\langle Av, w\rangle = \lambda\langle v, w\rangle, \qquad \langle v, Aw\rangle = \mu\langle v, w\rangle.$$
Subtracting the above equations, we get
$$\langle Av, w\rangle - \langle v, Aw\rangle = (\lambda - \mu)\langle v, w\rangle,$$
which gives
$$\langle Av, w\rangle - \langle A^{\top}v, w\rangle = (\lambda - \mu)\langle v, w\rangle.$$
As $A^{\top} = A$ and $(\lambda - \mu) \neq 0$, we get that $\langle v, w\rangle = 0$. This proves that $v$ and $w$ are orthogonal.

Definition 4.9. A matrix $U$ is said to be orthogonal if
$$U^{\top}U = UU^{\top} = I.$$

Theorem 4.10. A matrix $U$ is an orthogonal matrix iff its column vectors form an orthonormal collection.

Proof. Let $U$ be a matrix with columns $u^{(1)}, u^{(2)}, \ldots, u^{(n)}$. Then
$$U = \big[u^{(1)}\ u^{(2)}\ \cdots\ u^{(n)}\big],$$
and $U^{\top}U$ is given by
$$U^{\top}U = \begin{pmatrix} u^{(1)\top} \\ u^{(2)\top} \\ \vdots \\ u^{(n)\top} \end{pmatrix}\big[u^{(1)}\ u^{(2)}\ \cdots\ u^{(n)}\big] = \begin{pmatrix} u^{(1)\top}u^{(1)} & u^{(1)\top}u^{(2)} & \cdots & u^{(1)\top}u^{(n)} \\ u^{(2)\top}u^{(1)} & u^{(2)\top}u^{(2)} & \cdots & u^{(2)\top}u^{(n)} \\ \vdots & & & \vdots \\ u^{(n)\top}u^{(1)} & u^{(n)\top}u^{(2)} & \cdots & u^{(n)\top}u^{(n)} \end{pmatrix}.$$
This equals $I$ iff $\{u^{(i)}\}$ is orthonormal.


Theorem 4.11. (Spectral Theorem) For a symmetric matrix $A$, we have the eigenvalue decomposition
$$A = \sum_{i=1}^{n} \lambda_i u_i u_i^{\top} = UDU^{\top}, \qquad D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n),$$
where the matrix $U = [u_1, u_2, \ldots, u_n]$ is orthogonal and contains the eigenvectors of $A$, while the diagonal matrix $D$ contains the eigenvalues of $A$.

Proof. We prove by induction on the size $n$ of the matrix $A$. By the fundamental theorem of algebra, the characteristic polynomial of the symmetric matrix $A$, giving the eigenvalues of $A$, has at least one root $\lambda_1$. Since $A$ is symmetric, this root is real. Correspondingly, there exists $u_1 \neq 0$ such that $Au_1 = \lambda_1 u_1$; we normalize this $u_1$. So the result is true for $n = 1$. Now assume that the result is true for matrices of size $n - 1$. Begin with the eigenvalue $\lambda_1$ and associated eigenvector $u_1$. Using the Gram-Schmidt orthogonalisation procedure, we get an $n \times (n-1)$ matrix $V_1$ such that $[u_1, V_1]$ is orthogonal. We define an $(n-1) \times (n-1)$ symmetric matrix $B$ as $V_1^{\top} A V_1$. By induction,
$$B = V_1^{\top} A V_1 = Q_1 D_1 Q_1^{\top}, \qquad D_1 = \mathrm{diag}(\lambda_2, \lambda_3, \ldots, \lambda_n),$$
where the matrix $Q_1 = [u_2, u_3, \ldots, u_n]$ is orthogonal and contains the eigenvectors of $B$, while the diagonal matrix $D_1$ contains the eigenvalues of $B$ and hence those of $A$. Define the $n \times (n-1)$ matrix $U_1$ as $U_1 = V_1 Q_1$ and the $n \times n$ matrix $U$ as $U = [u_1, U_1]$. By construction $U$ is orthogonal and
$$U^{\top}AU = \begin{pmatrix} u_1^{\top} \\ U_1^{\top} \end{pmatrix} A\,[u_1, U_1] = \begin{pmatrix} u_1^{\top}Au_1 & u_1^{\top}AU_1 \\ U_1^{\top}Au_1 & U_1^{\top}AU_1 \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & D_1 \end{pmatrix},$$
by using the fact that $U_1^{\top}Au_1 = \lambda_1 U_1^{\top}u_1 = 0$ and $U_1^{\top}AU_1 = D_1$. This completes the induction process.
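Numerically, this is exactly the decomposition that numpy's symmetric eigensolver returns (a minimal sketch; the matrix $A$ below is an arbitrary random symmetric example):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                 # an arbitrary symmetric matrix

lam, U = np.linalg.eigh(A)        # real eigenvalues and orthonormal eigenvectors
assert np.allclose(U.T @ U, np.eye(4))          # U is orthogonal
assert np.allclose(A, U @ np.diag(lam) @ U.T)   # A = U D U^T
assert np.allclose(A, sum(l * np.outer(u, u) for l, u in zip(lam, U.T)))  # A = sum lam_i u_i u_i^T
```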

Definition 4.12. A symmetric matrix $A$ is said to be positive semi-definite if
$$\langle Au, u\rangle \geq 0 \quad \forall\, u \in \mathbb{R}^n.$$
It is said to be positive definite if the above inequality is strict for $u \neq 0$. $f(u) = \langle Au, u\rangle$ is called the quadratic form in $u$. Accordingly, the quadratic form $\langle Au, u\rangle$ is called positive semi-definite (definite) if $A$ is positive semi-definite (definite).

Theorem 4.13. A symmetric matrix $A$ is positive semi-definite (definite) iff all eigenvalues of $A$ are non-negative (positive).

Proof. Let the eigenvalues of $A$ be non-negative. As the eigenvectors of a symmetric matrix form an orthonormal basis, every $u \in \mathbb{R}^n$ can be written as
$$u = a_1 u^{(1)} + \cdots + a_n u^{(n)}, \quad a_i \in \mathbb{R},$$
where $\langle u^{(i)}, u^{(j)}\rangle = \delta_{ij}$ and $Au^{(i)} = \lambda_i u^{(i)}$. This gives us
$$\langle Au, u\rangle = \Big\langle \sum_{i=1}^{n} a_i Au^{(i)},\ \sum_{j=1}^{n} a_j u^{(j)} \Big\rangle = \sum_{i=1}^{n} a_i^2 \langle Au^{(i)}, u^{(i)}\rangle = \sum_{i=1}^{n} \lambda_i a_i^2 \langle u^{(i)}, u^{(i)}\rangle = \sum_{i=1}^{n} \lambda_i a_i^2 \geq 0 \quad\text{as } \lambda_i \geq 0,$$
and hence $A$ is positive semi-definite.

Assume now that $A$ is positive semi-definite and let $\lambda$ be an eigenvalue of $A$ with eigenvector $u \neq 0$; that is, $Au = \lambda u$, $u \neq 0$. This gives us
$$\lambda = \frac{\langle Au, u\rangle}{\langle u, u\rangle} \geq 0.$$
The result concerning positive definiteness is obvious; here the $\lambda_i$ are the eigenvalues of $A$, not necessarily distinct.

This gives us the following corollary.

Corollary 4.4. If a symmetric matrix $A$ is positive definite, then we have the following Rayleigh inequality:
$$\lambda_{\min}\|u\|^2 \leq \langle Au, u\rangle \leq \lambda_{\max}\|u\|^2, \quad u \in \mathbb{R}^n, \qquad (4.6)$$
where $\lambda_{\min}, \lambda_{\max}$ are the minimum and maximum of the eigenvalues of $A$, respectively.
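A quick numerical check of (4.6) (a sketch; the positive definite matrix below is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = M.T @ M + np.eye(4)            # symmetric positive definite

lam = np.linalg.eigvalsh(A)
lam_min, lam_max = lam[0], lam[-1]

for _ in range(1000):
    u = rng.standard_normal(4)
    nrm2 = u @ u                   # ||u||^2
    q = u @ A @ u                  # <Au, u>
    assert lam_min * nrm2 - 1e-9 <= q <= lam_max * nrm2 + 1e-9
```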

    4.3 Linear Equations - Solvability Analysis

Basic Solvability Result. Let $V = L[v_1, v_2, \ldots, v_n]$ and $W = L[w_1, w_2, \ldots, w_m]$ be real vector spaces and let $T \in L(V, W)$. Let $A = (\alpha_{ij})_{m \times n}$ be the matrix of $T$ with respect to the given bases, and let $b = (b_1, b_2, \ldots, b_m) \in \mathbb{R}^m$. We wish to determine $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ which solves the nonhomogeneous system
$$\begin{aligned}
\alpha_{11}x_1 + \alpha_{12}x_2 + \cdots + \alpha_{1n}x_n &= b_1 \\
\alpha_{21}x_1 + \alpha_{22}x_2 + \cdots + \alpha_{2n}x_n &= b_2 \\
&\ \,\vdots \\
\alpha_{m1}x_1 + \alpha_{m2}x_2 + \cdots + \alpha_{mn}x_n &= b_m
\end{aligned} \qquad (4.7)$$
that is,
$$Ax = b. \qquad (4.8)$$
The solvability of this nonhomogeneous system is linked with the solvability of the homogeneous system
$$\begin{aligned}
\alpha_{11}x_1 + \alpha_{12}x_2 + \cdots + \alpha_{1n}x_n &= 0 \\
\alpha_{21}x_1 + \alpha_{22}x_2 + \cdots + \alpha_{2n}x_n &= 0 \\
&\ \,\vdots \\
\alpha_{m1}x_1 + \alpha_{m2}x_2 + \cdots + \alpha_{mn}x_n &= 0
\end{aligned} \qquad (4.9)$$


that is,
$$Ax = 0. \qquad (4.10)$$
The solvability of the nonhomogeneous system (4.8) is equivalent to the solvability of the following operator equation for $v \in V$:
$$Tv = b, \qquad (4.11)$$
where $v \in V$ has coordinates $(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ and $b \in W$ has coordinates $(b_1, b_2, \ldots, b_m) \in \mathbb{R}^m$. The following theorem is immediate.

Theorem 4.14. (a) If $b \notin R(T)$, then Eq. (4.11) has no solution.
(b) If $b \in R(T)$, then Eq. (4.11) has a solution. It is unique if $N(T) = \{0\}$.
(c) If $b \in R(T)$ and $N(T) \neq \{0\}$, then Eq. (4.11) has infinitely many solutions, and these solutions are given by the set $S = v_0 + N(T)$, where $v_0$ is a particular solution of Eq. (4.11).

We first state a theorem giving the relationship between the rank of $T$ and that of its adjoint $T^*$.

Theorem 4.15. Let $V$ and $W$ be as defined above and let $T \in L(V, W)$. Then $r(T) = r(T^*)$.

Proof. We have $[N(T)]^{\perp} = R(T^*)$. Hence $r(T^*) = \dim[N(T)]^{\perp} = n - n(T) = r(T)$.

We now derive a relationship between the rank of $T$ and the linearly independent rows and columns of the matrix $A = m(T)$. We have $R(T) = L[Tv_1, Tv_2, \ldots, Tv_n]$, and $r(T)$ is the maximal number of elements in a linearly independent subset of $\{Tv_1, Tv_2, \ldots, Tv_n\}$. Observe that $Tv_j = \sum_i \alpha_{ij} w_i$. Hence the coordinates of $Tv_j$ are $(\alpha_{1j}, \alpha_{2j}, \ldots, \alpha_{mj})$. Thus $Tv_1, Tv_2, \ldots, Tv_n$ correspond to the column vectors $(\alpha_{11}, \alpha_{21}, \ldots, \alpha_{m1})$, $(\alpha_{12}, \alpha_{22}, \ldots, \alpha_{m2})$, $\ldots$, $(\alpha_{1n}, \alpha_{2n}, \ldots, \alpha_{mn})$ of the matrix
$$A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & & & \vdots \\ \alpha_{m1} & \alpha_{m2} & \cdots & \alpha_{mn} \end{pmatrix}.$$
Hence $r(T)$ is the maximal number of linearly independent columns of $A = m(T)$. Similarly, $r(T^*)$ is the maximal number of linearly independent rows of $A = m(T)$. By the previous theorem, $r(T) = r(T^*)$, and hence the maximal number of linearly independent columns of $A$ equals the maximal number of linearly independent rows of $A$. This motivates us to define the rank of a matrix as follows.

Definition 4.16. Let $A$ be any $m \times n$ matrix. Its rank is defined as the maximal number of linearly independent columns of $A$, or equivalently the maximal number of linearly independent rows of $A$.

Denote $r(A)$ by $r$. We shall denote by $(A, b)$ the matrix $A$ augmented by $b$. As the solvability of the matrix equation (4.8) is equivalent to the solvability of the operator equation (4.11), it follows that (4.8) is solvable iff $b \in R(T)$. Note that $b \in R(T)$ iff $r(A) = r(A, b)$. This gives us the following solvability result for Eqs. (4.8)-(4.10).

Theorem 4.17. (a) The nonhomogeneous system (4.8) has a solution iff $r = r(A, b)$.
(b) The nonhomogeneous system (4.8) has a solution for all $b \in \mathbb{R}^m$ iff $r = m$.
(c) The nonhomogeneous system (4.8) has a unique solution for all $b \in \mathbb{R}^m$ iff $r = m = n$. In addition, the homogeneous system then has only the zero solution.


(d) Let $r = m$ and $m < n$. Then the nonhomogeneous system (4.8) as well as the homogeneous system (4.10) have infinitely many solutions for all $b \in \mathbb{R}^m$.
(e) Let $r < m$ and let $b \in R(A)$. Then in all three cases (Case 1: $r < m = n$, Case 2: $r < m < n$, Case 3: $r < n < m$), both the nonhomogeneous system (4.8) and the homogeneous system (4.10) have infinitely many solutions.
(f) Let $r = n < m$. Then the homogeneous system (4.10) has only the trivial solution. Further, if $b \in R(A)$, then the nonhomogeneous system (4.8) has a unique solution.
(g) If $m = n$, then the homogeneous system (4.10) has only the trivial solution iff $A$ is nonsingular.

Proof. (a) Obvious.

(b) $r = m$ implies that $\dim(R(A)) = m = \dim(W)$. Hence $R(A) = W$ and so $A$ is onto.

For (c)-(f), we use the Rank-Nullity Theorem: $n(A) + r = n$.

(c) $r = m = n \Rightarrow n(A) = 0$. This gives $N(A) = \{0\}$ and hence the uniqueness of solutions of both the homogeneous and nonhomogeneous equations (4.10) and (4.8), respectively.

(d) $r = m \Rightarrow n(A) = n - m > 0$. This gives $N(A) \neq \{0\}$, and hence both the homogeneous and nonhomogeneous equations (4.10) and (4.8) have infinitely many solutions.

(e) $b \in R(A)$ implies that the nonhomogeneous system (4.8) has a solution. Further, all three cases imply that $n(A) = n - r > 0$. Hence $N(A) \neq \{0\}$, and so the solution set of both the homogeneous and nonhomogeneous equations is infinite.

(f) $r = n < m \Rightarrow n(A) = 0$. Hence the homogeneous system (4.10) has only the trivial solution, and if $b \in R(A)$, then the nonhomogeneous system (4.8) has a solution and this solution is unique.

(g) $m = n \Rightarrow A$ is a square matrix, and hence the homogeneous system (4.10) has only the trivial solution iff $A$ is nonsingular.

It is known that a square matrix $A$ is nonsingular iff $\det A \neq 0$. Hence we immediately get the following corollary from (g) of the above theorem.

Corollary 4.5. Let $A$ be a square matrix. The homogeneous system (4.10) has only the trivial solution iff $\det A \neq 0$.
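The rank test of Theorem 4.17(a) is easy to apply numerically (a sketch; the system below is an arbitrary example):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 0., 1.]])            # rank 2 (second row = 2 x first row)
b_good = np.array([1., 2., 3.])         # lies in R(A)
b_bad = np.array([1., 1., 0.])          # does not lie in R(A)

def solvable(A, b):
    """Ax = b is solvable iff rank(A) = rank(A, b)."""
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(np.column_stack([A, b]))

print(solvable(A, b_good), solvable(A, b_bad))   # True False
```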

    5 Some Sample Texts and Notes

Published Texts

[1] Introduction to Differential Equations and Linear Algebra, Stephen W Goode, Prentice Hall, 1991

    [2] Finite Dimensional Vector Spaces, Paul R Halmos, Affiliated East-West Pvt. Ltd. 1965

    [3] Linear Algebra, Jim Hefferon, Virginia Commonwealth University Mathematics, 2009

[4] Introduction to Linear Algebra, Lee W Johnson, Jimmy Arnold and R Dean Riess, Pearson, 2001

    [5] Introduction to Linear Algebra: An Applied First Course, B Kolman and D Hill, Pearson, 2001

    [6] Linear Algebra: A Geometric Approach, S Kumaresan, Prentice Hall of India, 1999

[7] Introduction to Linear Algebra, Serge Lang, Springer, 1985


[8] Linear Algebra and Applications, David Lay, Addison-Wesley, 2011

    [9] Applied Linear Algebra, Ben Noble, Prentice Hall, 1969

    [10] Linear Algebra and Applications, Gilbert Strang, Thomson Brooks/Cole, 2005

    Lecture Notes

    [1] Notes on Linear Algebra (2008), Peter J Cameron, www.maths.qmul.ac.uk

    [2] Linear Algebra, Jim Hefferon (2009), www.mathematik.uni-muenchen.de

    [3] Linear Algebra, Theory and Applications (2012), Kenneth Kuttler, www.math.byu.edu

[4] Linear Algebra in Twenty Five Lectures, Tom Denton and Andrew Waldron (2012), www.math.ucdavis.edu