Linear Algebra Notes
Chapter 1
Systems of linear equations
1.1 Examples
1.1.1 Example
Consider the equation
x1 + 2x2 = 4
Let’s assume that x1 and x2 are real numbers; that will be the case for much of this course. We can solve for x1 in terms of x2 to get
x1 = 4− 2x2
There are infinitely many solutions to this equation:
x1 = 4− 2t, x2 = t, t ∈ R
1.1.2 Example
This time let’s consider the solution of two equations
x1 + 2x2 = 4
x1 − 4x2 = 7
We can multiply the first equation by -1 and add to the second equation to get
x1 + 2x2 = 4
− 6x2 = 3
We can now divide the second equation by -6 to solve for x2.
x1 + 2x2 = 4
x2 = − 1/2
We can now multiply the second equation by -2 and add to the first.
x1 = 5
x2 = − 1/2
Note that this time we get one unique solution and not infinitely many.
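The unique solution can be checked numerically; a minimal sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

# Coefficient matrix and right-hand side for the system
#   x1 + 2*x2 = 4
#   x1 - 4*x2 = 7
A = np.array([[1.0,  2.0],
              [1.0, -4.0]])
b = np.array([4.0, 7.0])

x = np.linalg.solve(A, b)  # unique solution since A is invertible
print(x)  # approximately [5.  -0.5]
```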
1.1.3 Example
x1 + 2x2 = 4
2x1 + 4x2 = 7
We can multiply the first equation by -2 and add to the second equation to get
x1 + 2x2 = 4
0 + 0 = − 1
This is a contradiction and we conclude that the system of equations has no solution.
1.2 Idea
Note how simple the idea is: we multiply an equation by a scalar and then add it to another equation. This simplifies the system. We keep simplifying until we get a solution or we get a contradiction.
1.3 Nonlinear equations
The situation with nonlinear equations is very different.
1.3.1 Example
How do we solve
tan x = x
Do we know that this nonlinear equation has a solution? How many solutions does it have? It actually has infinitely many solutions.
1.3.2 Example
tan x = x
x^2 + y^2 = 1
How many solutions does this system have?
1.4 Solving systems of linear equations
1.4.1 Basic operations
The basic operations used to solve a system of linear equations are
• scaling a row
• interchanging rows
• replacing a row by the sum of the row and a scalar multiple of another row
1.4.2 Matrix notation
Consider the system of linear equations
1x1 + 1x2 + 1x3 = 6
2x1 + 1x2 + 1x3 = 7
1x1 + 1x2 + 2x3 = 12

In matrix notation we would write this system

[ 1 1 1 ] [ x1 ]   [  6 ]
[ 2 1 1 ] [ x2 ] = [  7 ]
[ 1 1 2 ] [ x3 ]   [ 12 ]
1.4.3 Coefficient matrix

[ 1 1 1 ] [ x1 ]   [  6 ]
[ 2 1 1 ] [ x2 ] = [  7 ]
[ 1 1 2 ] [ x3 ]   [ 12 ]

The 3x3 matrix on the left is called the coefficient matrix.
1.4.4 Augmented matrix
We also often represent the system by an augmented matrix

[ 1 1 1 |  6 ]
[ 2 1 1 |  7 ]
[ 1 1 2 | 12 ]
1.4.5 Apply elementary row operations to solve the system
-2 * Row 1 + Row 2 replaces Row 2:

[ 1  1  1 |  6 ]
[ 0 -1 -1 | -5 ]
[ 1  1  2 | 12 ]

-1 * Row 1 + Row 3 replaces Row 3:

[ 1  1  1 |  6 ]
[ 0 -1 -1 | -5 ]
[ 0  0  1 |  6 ]

1 * Row 3 + Row 2 replaces Row 2:

[ 1  1  1 |  6 ]
[ 0 -1  0 |  1 ]
[ 0  0  1 |  6 ]

-1 * Row 3 + Row 1 replaces Row 1:

[ 1  1  0 |  0 ]
[ 0 -1  0 |  1 ]
[ 0  0  1 |  6 ]

-1 * Row 2 replaces Row 2:

[ 1  1  0 |  0 ]
[ 0  1  0 | -1 ]
[ 0  0  1 |  6 ]

-1 * Row 2 + Row 1 replaces Row 1:

[ 1  0  0 |  1 ]
[ 0  1  0 | -1 ]
[ 0  0  1 |  6 ]
1.4.6 Solution
We now read off the solution
x1 = 1, x2 = −1, x3 = 6
which we might also write as a vector

[  1 ]
[ -1 ]
[  6 ]
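The row operations in section 1.4.5 can be replayed numerically; a minimal sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

# Augmented matrix for the system in section 1.4.4
M = np.array([[1., 1., 1., 6.],
              [2., 1., 1., 7.],
              [1., 1., 2., 12.]])

# Replay the row operations from section 1.4.5
M[1] += -2 * M[0]   # -2*Row1 + Row2 replaces Row2
M[2] += -1 * M[0]   # -1*Row1 + Row3 replaces Row3
M[1] +=  1 * M[2]   #  1*Row3 + Row2 replaces Row2
M[0] += -1 * M[2]   # -1*Row3 + Row1 replaces Row1
M[1] *= -1          # -1*Row2 replaces Row2
M[0] += -1 * M[1]   # -1*Row2 + Row1 replaces Row1

print(M[:, 3])  # the solution column: [ 1. -1.  6.]
```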
1.5 Reduced row echelon form
1.5.1 Leading coefficient or pivot of a nonzero row
The first nonzero element from the left end of the row.
1.5.2 Example
Consider the matrix

[ 1 1 1  6 ]
[ 0 2 1  5 ]
[ 1 1 2 12 ]

The leading coefficient or pivot of the second row is 2.
1.5.3 Row echelon form
A matrix is in row echelon form if
• All rows with at least one nonzero element are above any rows which contain only zeros. Rows with all zeros should be at the bottom of the matrix.
• The leading coefficient (pivot) of a nonzero row is always strictly to the right of the leading coefficient of the row above it.
• All entries in a column below a leading coefficient are zeroes.
1.5.4 Example
Consider the matrix

[ 1 1 1  6 ]
[ 0 2 1  5 ]
[ 1 1 2 12 ]

This matrix is not in row echelon form because the leading coefficient 1 in the first row does not have all zeros below it; equivalently, the leading coefficient 1 in the third row is not strictly to the right of the leading coefficient 1 in the first row.
1.5.5 Example
Consider the matrix

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 0 0 ]
[ 0 0 2 6 ]

This matrix is not in row echelon form because there is a row of zeros above a nonzero row. If we interchange the last two rows

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 2 6 ]
[ 0 0 0 0 ]

the matrix is now in row echelon form.
1.5.6 Reduced row echelon form
A matrix is in reduced row echelon form if
• It is in row echelon form
• The leading coefficient of each nonzero row is 1 and there is no other nonzero entry in the same column
1.5.7 Example
Consider the matrix

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 2 6 ]
[ 0 0 0 0 ]
This matrix is in row echelon form but not in reduced row echelon form because there are nonzero rows whose pivots are not 1 and there are nonzero entries above some of the pivots.
1.5.8 Example contd
Consider the matrix

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 2 6 ]

This matrix is not in reduced row echelon form. We can scale the second and third rows to get

[ 1 1   1    6  ]
[ 0 1  1/2  5/2 ]
[ 0 0   1    3  ]

We can now multiply the third row by -1 and add to the first row, and multiply the third row by -1/2 and add to the second, to get:

[ 1 1 0 3 ]
[ 0 1 0 1 ]
[ 0 0 1 3 ]

We can now add -1 times the second row to the first to get:

[ 1 0 0 2 ]
[ 0 1 0 1 ]
[ 0 0 1 3 ]
This matrix is in reduced row echelon form.
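The same reduction can be checked with a computer algebra system; a sketch using SymPy's `Matrix.rref` (SymPy assumed available, not part of the original notes):

```python
from sympy import Matrix

# The matrix from section 1.5.8
M = Matrix([[1, 1, 1, 6],
            [0, 2, 1, 5],
            [0, 0, 2, 6]])

R, pivots = M.rref()  # reduced row echelon form and pivot column indices
print(R)       # Matrix([[1, 0, 0, 2], [0, 1, 0, 1], [0, 0, 1, 3]])
print(pivots)  # (0, 1, 2)
```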
1.6 Gauss-Jordan elimination
The process of using elementary row operations:
• scaling a row
• interchanging rows
• replacing a row by the sum of the row and a scalar multiple of another row
to put a matrix into row echelon form or reduced row echelon form is called Gauss-Jordan elimination or Gaussian elimination.
1.7 Geometric interpretations
1.7.1 Hyperplanes
An equation of the form
c1x1 + c2x2 + · · ·+ cnxn = d
with the ci and d constants defines an (n − 1)-dimensional hyperplane in R^n.
1.7.2 Example
The equation
x+ y + z = 7
defines a two dimensional plane in three (or perhaps higher) dimensional space.
1.7.3 Example
The equation
x+ y = 7
defines a one-dimensional line in two (or perhaps higher) dimensional space.
1.7.4 Geometry and solutions
Solutions of systems of linear equations can then be interpreted as intersections of hyperplanes.
1.7.5 Example
Consider the system of equations
x+ y = 3
x− y = − 1
If we draw the lines defined by these two equations then they intersect at exactly one point (x = 1, y = 2), which is the solution to the system.
At how many points can two lines intersect?
1.7.6 Example
Consider the system of equations
x+ y = 3
2x+ 2y = 7
If we draw the one-dimensional lines defined by these two equations then they run parallel without ever touching. There is no point of intersection on the graph and no solution to this system of equations.
1.7.7 Example
Consider the system of equations
x+ y + z = 1
x− y + z = 7
The two equations define two-dimensional planes which intersect in a line, and the solution of the system is
x = t, y = −3, z = 4− t, t ∈ R
How many ways can two planes intersect (or not intersect) in three dimensions? How about three planes?
1.8 Exercises
1.8.1 Exercise
Consider the system of linear equations
x1 + x2 + x3 = 5
2x1 + 3x2 + 5x3 = 8
4x1 + 5x3 = 2
Write the augmented matrix that represents this system. Put the augmented matrix in reduced row echelon form. Find the solution.
Answer: x1 = 3 , x2 = 4 , x3 = −2.
1.8.2 Exercise
Consider the system of linear equations
x1 + x2 + x3 = 5
2x1 + 3x2 + 5x3 = 8
Write the augmented matrix that represents this system. Put the augmented matrix in reduced row echelon form. Find the solution.
Answer: x1 = 7 + 2t , x2 = −2− 3t , x3 = t, t ∈ R
Chapter 2
Matrices
2.1 Example
2.1.1 Problem
We consider solving a 3x3 linear system:
x+ y + z = − 1
2x+ y + z = − 2
x+ 2y + 3z = − 3
2.1.2 Gaussian elimination
Gaussian elimination is the process of putting the augmented matrix in row echelon form. We will use this first:
The augmented matrix is

[ 1 1 1 | -1 ]
[ 2 1 1 | -2 ]
[ 1 2 3 | -3 ]
We get zeros below the first pivot of the first row by
-2*R1 + R2
-1*R1 + R3

[ 1  1  1 | -1 ]
[ 0 -1 -1 |  0 ]
[ 0  1  2 | -2 ]
We get zeros below the first pivot of the second row by
1*R2 + R3

[ 1  1  1 | -1 ]
[ 0 -1 -1 |  0 ]
[ 0  0  1 | -2 ]
2.1.3 Row echelon form
The augmented matrix is now in row echelon form after performing Gaussian elimination and the solutions can be found by scaling and back substitution:
The third row gives
z = − 2
The second row gives
−y = 0 + z ⇒ y = −z = −(−2) = 2
Then the first row gives
x = − 1− y − z = − 1− 2− (−2) = − 1
2.1.4 Gauss-Jordan elimination
Gauss-Jordan elimination puts the matrix in reduced row echelon form. We can think of it as continuing on with Gaussian elimination until we get to reduced row echelon form.
2.1.5 Continuing
We had

[ 1  1  1 | -1 ]
[ 0 -1 -1 |  0 ]
[ 0  0  1 | -2 ]

We scale so that the pivots are all 1:

[ 1 1 1 | -1 ]
[ 0 1 1 |  0 ]
[ 0 0 1 | -2 ]
We can get zeros above the pivot of the second row by
-1*R2 + R1

[ 1 0 0 | -1 ]
[ 0 1 1 |  0 ]
[ 0 0 1 | -2 ]
We can get zeros above the pivot of the third row by
-1*R3 + R2

[ 1 0 0 | -1 ]
[ 0 1 0 |  2 ]
[ 0 0 1 | -2 ]
The augmented matrix is now in reduced row echelon form and the results can be read off directly from the matrix:
x = − 1, y = 2, z = − 2
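The answer read off from the reduced row echelon form can be double-checked with a linear solver; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

# The system from section 2.1.1
A = np.array([[1., 1., 1.],
              [2., 1., 1.],
              [1., 2., 3.]])
b = np.array([-1., -2., -3.])

print(np.linalg.solve(A, b))  # approximately [-1.  2. -2.]
```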
2.2 Matrices
2.2.1 Definition
A matrix is a rectangular array of real numbers, as follows.
2.2.2 Example
A =

[ a11 a12 ... a1n ]
[ a21 a22 ... a2n ]
[  ...            ]
[ ai1 ai2 ... ain ]
[  ...            ]
[ am1 am2 ... amn ]

is an m × n matrix with m rows and n columns.
aij is the element of A that is in the ith row and the jth column.
2.2.3 Other examples
B =

[ 1 2 ]
[ 4 5 ]
is a 2x2 matrix and
C =

[ 1 ]
[ 2 ]
[ 3 ]
[ 4 ]
[ 5 ]

is a column matrix with five elements.
2.2.4 Equality of matrices
Two matrices Am×n and Bm′×n′ are equal if they have the same dimensions and each corresponding element is the same:
m = m′, n = n′
aij = bij
for all 1 ≤ i ≤ m, 1 ≤ j ≤ n.
2.2.5 Addition of matrices
If two matrices
A =

[ a11 a12 ... a1n ]
[ a21 a22 ... a2n ]
[  ...            ]
[ am1 am2 ... amn ]

and

B =

[ b11 b12 ... b1n ]
[ b21 b22 ... b2n ]
[  ...            ]
[ bm1 bm2 ... bmn ]

are of the same dimension then their sum C = A + B is defined to be a matrix of the same dimension whose elements are given by adding the corresponding elements of A and B:
C =

[ c11 c12 ... c1n ]
[ c21 c22 ... c2n ]
[  ...            ]
[ cm1 cm2 ... cmn ]
with
cij = aij + bij
2.2.6 Example

[ 1 1 1 ]   [ 1 1 1 ]   [ 2 2 2 ]
[ 2 2 2 ] + [ 1 1 1 ] = [ 3 3 3 ]
2.2.7 Subtraction of matrices
If two matrices
A =

[ a11 a12 ... a1n ]
[ a21 a22 ... a2n ]
[  ...            ]
[ am1 am2 ... amn ]

and

B =

[ b11 b12 ... b1n ]
[ b21 b22 ... b2n ]
[  ...            ]
[ bm1 bm2 ... bmn ]

are of the same dimension then their difference C = A − B is defined to be a matrix of the same dimension whose elements are given by subtracting the corresponding elements of A and B:
C =

[ c11 c12 ... c1n ]
[ c21 c22 ... c2n ]
[  ...            ]
[ cm1 cm2 ... cmn ]
with
cij = aij − bij
2.2.8 Example

[ 1 1 1 ]   [ 1 1 1 ]   [ 0 0 0 ]
[ 2 2 2 ] − [ 1 1 1 ] = [ 1 1 1 ]
2.2.9 Multiplication by a scalar multiple
If Am×n is a matrix
A =

[ a11 a12 ... a1n ]
[ a21 a22 ... a2n ]
[  ...            ]
[ am1 am2 ... amn ]
and c ∈ R then the scalar multiple (cA)m×n is given by multiplying each entry of A by c
(cA)ij = c · aij
cA =

[ ca11 ca12 ... ca1n ]
[ ca21 ca22 ... ca2n ]
[  ...               ]
[ cam1 cam2 ... camn ]
2.2.10 Example
If
A =

[ 1 1 1 ]
[ 2 2 2 ]
then 4A is
4A =

[ 4 4 4 ]
[ 8 8 8 ]
2.3 Products of matrices
2.3.1 Definition
The product of matrices Am×k and Bk×n (note that the number of columns of A must be the same as the number of rows of B) is defined by

(AB)ij = Σ (p = 1 to k) aip bpj = ai1 b1j + ai2 b2j + · · · + aik bkj
We can think of this as taking the dot product of the ith row of A with the jth column of B.
2.3.2 Example
Suppose
A =

[ 1 2 2 ]
[ 0 5 1 ]

and

B =

[ 3 ]
[ 1 ]
[ 4 ]

A has three columns and B has three rows so the product of A and B is defined. A has two rows and B has one column so the product will have two rows and one column.
(AB)11 = Σp a1p bp1 = 1 · 3 + 2 · 1 + 2 · 4 = 13

(AB)21 = Σp a2p bp1 = 0 · 3 + 5 · 1 + 1 · 4 = 9
Then
AB =

[ 13 ]
[  9 ]
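The definition of the product can be implemented directly as a triple loop; a sketch using NumPy (assumed available, `matmul` is an illustrative name, not part of the original notes):

```python
import numpy as np

def matmul(A, B):
    """Product via the definition (AB)_ij = sum over p of A[i,p] * B[p,j]."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "columns of A must match rows of B"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for p in range(k):
                C[i, j] += A[i, p] * B[p, j]
    return C

A = np.array([[1., 2., 2.],
              [0., 5., 1.]])
B = np.array([[3.], [1.], [4.]])

print(matmul(A, B))  # [[13.] [ 9.]]
print(A @ B)         # NumPy's built-in product agrees
```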
2.3.3 Lemma
If A and B are matrices and the product AB is defined then the jth column of AB is given by
A[jth col of B]
2.3.4 Lemma
If A and B are matrices and the product AB is defined then the ith row of AB is given by
[ith row of A]B
2.3.5 Multiplication is not commutative
Given two matrices A and B, we may have that AB is defined and BA is not defined. If both products are defined there is no requirement that BA = AB.
2.3.6 Exercise
Given
A =

[ -1 0 ]
[  2 3 ]

B =

[ 1 2 ]
[ 3 0 ]

Show that AB ≠ BA.
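Computing both products makes the noncommutativity concrete; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

A = np.array([[-1, 0],
              [ 2, 3]])
B = np.array([[1, 2],
              [3, 0]])

print(A @ B)  # [[-1 -2] [11  4]]
print(B @ A)  # [[ 3  6] [-3  0]]  -- a different matrix
```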
2.4 Transpose of a matrix
2.4.1 Definition
If A is a matrix
A =

[ a11 a12 ... a1n ]
[ a21 a22 ... a2n ]
[  ...            ]
[ am1 am2 ... amn ]
then the transpose At of A is
At =

[ a11 a21 ... am1 ]
[ a12 a22 ... am2 ]
[  ...            ]
[ a1n a2n ... amn ]
or
(At)ij = (A)ji = aji
2.4.2 Example
Consider the matrix
A =

[ 1 2 2 ]
[ 0 5 1 ]

The transpose At of A is

At =

[ 1 0 ]
[ 2 5 ]
[ 2 1 ]
2.4.3 Transpose of a transpose
If A is a matrix then
(At)t = A
2.4.4 Transpose of a sum
If A and B are matrices of the same size then
(A+B)t = At +Bt
2.4.5 Transpose of a scalar multiple
If A is a matrix and c is a real number then
(cA)t = cAt
2.4.6 Transpose of a product
If A and B are matrices such that the product AB is defined then
(AB)t = BtAt
2.5 Trace of a square matrix
2.5.1 Definition
If A is a square matrix
A =

[ a11 a12 ... a1n ]
[ a21 a22 ... a2n ]
[  ...            ]
[ an1 an2 ... ann ]

then the trace tr(A) of A is the sum

tr A = Σi aii = a11 + a22 + · · · + ann
2.5.2 Example
A =

[ 1 0 2 ]
[ 2 5 7 ]
[ 2 1 4 ]
tr A = 10.
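The trace is a one-liner in most numerical libraries; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

A = np.array([[1, 0, 2],
              [2, 5, 7],
              [2, 1, 4]])

# Trace: sum of the main-diagonal entries, 1 + 5 + 4
print(np.trace(A))  # 10
```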
2.6 Some properties of matrices
Suppose that in this section A, B, and C are matrices and a, b, and c are real numbers. Suppose that the operations discussed in this section are defined.
2.6.1 Addition is commutative
A+B = B + A
2.6.2 Addition is associative
(A+B) + C = A+ (B + C)
2.6.3 Multiplication is associative
(AB)C = A(BC)
2.6.4 Distribution
A(B + C) = AB + AC
(B + C)A = BA+ CA
2.6.5 Scalar multiplication
a(B + C) = aB + aC
a(bC) = (ab)C
a(BC) = (aB)C = B(aC)
2.7 Proofs of the properties
In this section the elements of A are denoted aij, the elements of B are denoted bij, and the elements of C are denoted cij.
2.7.1 Proof
A+B = B + A
Proof.
(A+B)ij = aij + bij
= bij + aij
= (B + A)ij
2.7.2 Proof
Am×n(Bn×p + Cn×p) = Am×nBn×p + Am×nCn×p
Proof.
(A[B + C])ij = Σ (k = 1 to n) Aik (B + C)kj
             = Σ (k = 1 to n) (Aik Bkj + Aik Ckj)
             = Σ (k = 1 to n) Aik Bkj + Σ (k = 1 to n) Aik Ckj
             = (AB)ij + (AC)ij
2.7.3 Exercise
Using the previous two proofs as a model, give the proofs for the other properties presented in the previous section.
2.8 Identity matrix
2.8.1 Definition
In×n is the square matrix with all zeros except for ones on the main diagonal.
2.8.2 Example
I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
2.8.3 Property
If An×n is a square matrix then
In×nAn×n = An×nIn×n = An×n
2.8.4 Exercise
I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]
Confirm that IA = AI = A.
2.9 Elementary matrices
2.9.1 Definition
Any matrix formed from the identity matrix by an elementary row operation is called an elementary matrix.
2.9.2 Example
From the identity matrix
I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

we can form the elementary matrix E by multiplying the second row by 3:

E =

[ 1 0 0 ]
[ 0 3 0 ]
[ 0 0 1 ]

Now if

A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]

then EA is

EA =

[  2 3 5 ]
[ 21 3 0 ]
[  4 9 1 ]

E multiplies the second row of A by 3.
2.9.3 Example
From the identity matrix
I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

we can form the elementary matrix E by interchanging the first and second rows:

E =

[ 0 1 0 ]
[ 1 0 0 ]
[ 0 0 1 ]

Now if

A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]

then EA is

EA =

[ 7 1 0 ]
[ 2 3 5 ]
[ 4 9 1 ]

E exchanges the first and second rows of A.
2.9.4 Example
From the identity matrix
I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

we can form the elementary matrix E by multiplying the first row by -2 and adding to the third row:

E =

[  1 0 0 ]
[  0 1 0 ]
[ -2 0 1 ]

Now if

A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]

then EA is

EA =

[ 2 3  5 ]
[ 7 1  0 ]
[ 0 3 -9 ]

E multiplies the first row of A by -2 and adds it to the third row of A, replacing the third row.
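The three kinds of elementary matrices above can be built from the identity and checked against left-multiplication; a sketch using NumPy (assumed available, the names E1, E2, E3 are illustrative):

```python
import numpy as np

I = np.eye(3)
A = np.array([[2., 3., 5.],
              [7., 1., 0.],
              [4., 9., 1.]])

# Scaling: multiply row 2 of the identity by 3
E1 = I.copy(); E1[1, 1] = 3.0
# Interchange: swap rows 1 and 2 of the identity
E2 = I.copy(); E2[[0, 1]] = E2[[1, 0]]
# Replacement: -2 * row 1 added to row 3 of the identity
E3 = I.copy(); E3[2, 0] = -2.0

print(E1 @ A)  # row 2 of A scaled by 3
print(E2 @ A)  # rows 1 and 2 of A interchanged
print(E3 @ A)  # -2*row1 + row3 replaces row 3
```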
2.10 Exercises
2.10.1 Exercise
Solve the system of linear equations or show that it does not have a solution.
2x1 + x2 = 7
x1 − x2 = 5
2.10.2 Exercise
Solve the system of linear equations or show that it does not have a solution.
2x1 − 5x2 + 4x3 = 8
2x1 + 2x3 = 4
− x1 − 2x2 + x3 = 2
2.10.3 Exercise
Solve the system of linear equations or show that it does not have a solution.
2x1 − 5x2 + 4x3 = 8
2x1 + 2x3 = 4
2.10.4 Exercise
If
A =

[ 2 1 0 ]
[ 0 1 5 ]

then 3A = ?
2.10.5 Exercise
If
A =

[ 2 1 0 ]
[ 0 1 5 ]

B =

[ 1 1 1 ]
[ 0 1 1 ]
[ 0 0 1 ]

then AB =
2.10.6 Exercise
If
A =

[ 1 1 1 ]
[ 0 1 1 ]
[ 0 0 1 ]
then tr A =
2.10.7 Exercise
Prove the properties of matrices given in this chapter.
2.10.8 Exercise
What does an elementary matrix do when it multiplies (from the left) a matrix A? Can you prove this?
2.10.9 Exercise
Consider the system of linear equations
x+ y + z = 4
x− y + z = 2
2x− z = 0
Write the augmented matrix for this problem. What elementary matrices would be used to reduce the augmented matrix to reduced row echelon form?
Chapter 3
Inverse of a matrix
3.1 Inverse of a matrix
3.1.1 Definition
If An×n is a square matrix, then A−1 is the n × n matrix (if such a matrix exists) so that
AA−1 = A−1A = In×n
3.1.2 Example
Consider the matrix
A =

[ 3 1 ]
[ 1 1 ]

Then A has an inverse and

A−1 = (1/2) ·

[  1 -1 ]
[ -1  3 ]

We can confirm that AA−1 = A−1A = I, the 2 × 2 identity matrix.
3.1.3 Invertible
A matrix A is said to be invertible if it has an inverse A−1.
3.2 Inverse of a two by two matrix
3.2.1 Inverses don’t always exist
As an example of a matrix with no inverse, consider the zero matrix
[ 0 0 ]
[ 0 0 ]

No matter what you multiply this matrix by, you would always get the zero matrix back and could never get the identity matrix.
3.2.2 Inverse of a two by two matrix
Suppose that
A =

[ a b ]
[ c d ]

is such that ad − bc ≠ 0. Then

A−1 = 1/(ad − bc) ·

[  d -b ]
[ -c  a ]
3.2.3 Proof
The result can be proven by direct calculation:
AA−1 = [ a b ] · 1/(ad − bc) · [  d -b ]
       [ c d ]                 [ -c  a ]

     = 1/(ad − bc) · [ ad − bc     0    ]
                     [    0     ad − bc ]

     = [ 1 0 ]
       [ 0 1 ]

and

A−1A = 1/(ad − bc) · [  d -b ] · [ a b ]
                     [ -c  a ]   [ c d ]

     = 1/(ad − bc) · [ ad − bc     0    ]
                     [    0     ad − bc ]

     = [ 1 0 ]
       [ 0 1 ]
3.2.4 Example
The inverse of
A =

[ 1 4 ]
[ 8 5 ]

is

A−1 = (-1/27) ·

[  5 -4 ]
[ -8  1 ]
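The 2x2 formula is short enough to implement directly; a sketch using NumPy (assumed available, `inv2x2` is an illustrative name):

```python
import numpy as np

def inv2x2(A):
    """Inverse of a 2x2 matrix via the ad - bc formula; None if singular."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    if det == 0:
        return None  # no inverse when ad - bc = 0
    return (1.0 / det) * np.array([[ d, -b],
                                   [-c,  a]])

A = np.array([[1., 4.],
              [8., 5.]])
print(inv2x2(A))                              # (-1/27) * [[5, -4], [-8, 1]]
print(np.allclose(inv2x2(A) @ A, np.eye(2)))  # True
```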
3.3 Inverse of the inverse
3.3.1 The inverse of the inverse is the original matrix
If A is an invertible matrix then
(A−1)−1 = A
3.3.2 Proof
Follows directly from the definition of the inverse.
3.4 Inverse of a product of invertible matrices
3.4.1 Inverse of a product of invertible matrices
Suppose that A and B are both invertible. Then
(AB)−1 = B−1A−1
3.4.2 Proof
(AB)(B−1A−1) = A(BB−1)A−1 = AIA−1 = AA−1 = I
(B−1A−1)(AB) = B−1(A−1A)B = B−1IB = B−1B = I
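The reversed order in (AB)−1 = B−1A−1 is easy to verify numerically; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

A = np.array([[3., 1.],
              [1., 1.]])
B = np.array([[1., 4.],
              [8., 5.]])

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)  # note the reversed order
print(np.allclose(lhs, rhs))  # True
```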
3.4.3 Extension
If A1 , A2 , ... , An are invertible matrices then what is the inverse of A1A2 · · ·An?
3.5 Powers of matrices
3.5.1 Definition
If A is a square matrix (invertible or otherwise) then
A^0 ≡ I

A^1 = A

A^2 = AA

A^n = AA · · · A (n times)
3.5.2 Inverse of a power of a matrix
Suppose that A is invertible. Then
(A^n)−1 = (A−1)^n
3.5.3 Proof
By induction.
3.5.4 Notation
For an invertible matrix A and n a positive integer
A^(-n) ≡ (A−1)^n
3.5.5 Exponent rules
If A is an invertible square matrix and r, s ∈ Z
A^r A^s = A^(r+s)

(A^r)^s = A^(rs)
3.5.6 Proof
By induction.
3.5.7 Polynomials of matrices
If A is a square matrix and p is a polynomial function defined by
p(x) = a0 + a1x + a2x^2 + · · · + anx^n

then

p(A) = a0I + a1A + a2A^2 + · · · + anA^n
3.5.8 Example
If
A =

[  0 1 ]
[ -1 0 ]

and
f(x) = x^3 + x^2 + x
then
f(A) = − I2×2
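Evaluating the polynomial at the matrix confirms the claim; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

A = np.array([[ 0., 1.],
              [-1., 0.]])

# f(x) = x^3 + x^2 + x applied to the matrix A
fA = np.linalg.matrix_power(A, 3) + np.linalg.matrix_power(A, 2) + A
print(fA)  # [[-1.  0.] [ 0. -1.]]  i.e. -I
```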
3.6 Inverse of a transpose
3.6.1 Inverse of a transpose of an invertible matrix
Suppose that A is an invertible matrix. Then At is also invertible and
(At)−1 = (A−1)t
3.6.2 Proof
(A−1)tAt = (AA−1)t = It = I

At(A−1)t = (A−1A)t = It = I
3.7 Inverses of elementary matrices
3.7.1 Inverse of scaling elementary matrix
One elementary row operation is to multiply the ith row by a real number c ≠ 0. The elementary matrix for this operation is found by multiplying the ith row of the identity matrix by c. The inverse of this elementary matrix is found by multiplying the ith row of the identity matrix by 1/c.
This can be confirmed by directly computing the product of the two matrices.
3.7.2 Inverse of an interchanging rows elementary matrix
One elementary row operation is to interchange the ith row and the jth row of a matrix. The elementary matrix for this operation is found by interchanging the ith row and jth row of the identity matrix. This matrix is its own inverse. This can be confirmed by directly computing the product of the matrix with itself.
3.7.3 Inverse of an "add a multiple of a row to another row" elementary matrix
One elementary row operation is to multiply the ith row of a matrix by c ≠ 0 and add it to the jth row of the matrix. The elementary matrix for this operation is found by doing the same operation to the identity matrix.

The inverse of this matrix is found by multiplying the ith row of the identity matrix by −c and adding it to the jth row of the identity matrix. This can be confirmed by directly computing the product of the two matrices.
3.7.4 All elementary matrices are invertible
We see that all elementary matrices are invertible and the inverses are also elementary matrices.
3.8 Inverses and solutions of systems
3.8.1 Example
Consider the system of equations
x1 + x2 = 3
x1 − x2 = − 1
If we let
A =

[ 1  1 ]
[ 1 -1 ]

Then we can write

A [ x1 ]  =  [  3 ]
  [ x2 ]     [ -1 ]

The inverse of A is

A−1 = (-1/2) ·

[ -1 -1 ]   =   [ 1/2  1/2 ]
[ -1  1 ]       [ 1/2 -1/2 ]

Then we can multiply from the left on both sides to get

A−1A [ x1 ]  =  A−1 [  3 ]
     [ x2 ]         [ -1 ]

And so

[ x1 ]  =  [ 1/2  1/2 ] [  3 ]  =  [ 1 ]
[ x2 ]     [ 1/2 -1/2 ] [ -1 ]     [ 2 ]
3.8.2 Generalizing
If An×n is an invertible square matrix and xn×1 is a column matrix of unknowns x1, x2, ..., xn and bn×1 is a column of real numbers then the equation
Ax = b
has the unique solution
x = A−1b
3.8.3 Proof
Multiply by A−1 on both sides.
3.9 Fundamental theorem of linear algebra
3.9.1 Name
It’s not actually called that. But it’s important enough that it should be. We will see at different times in this course quite a few statements that are all equivalent and are extremely important in linear algebra.
3.9.2 FTLA
The following are equivalent:
1) An×n is invertible.
2) The equation Ax = 0 has only the trivial solution. (All entries of x are zero.)
3) The reduced row echelon form of A is the identity matrix I.
4) A is a product of elementary matrices.
3.9.3 Example
We know that the equation

[ 1 2 ] [ x1 ]  =  [ 0 ]
[ 0 5 ] [ x2 ]     [ 0 ]

has only the trivial solution

[ x1 ]  =  [ 0 ]
[ x2 ]     [ 0 ]

because the matrix on the left of the equation is invertible.
3.10 Calculating the inverse by hand
3.10.1 Method
Suppose An×n is a square matrix and we want to find the inverse. We set up a matrix
(A | In×n)

and then we perform elementary row operations until we get

(I | A−1)

If we can't get I on the left it means that the matrix A was not invertible.
3.10.2 Example
If we want to invert
A =

[ 1 1 1 ]
[ 0 1 1 ]
[ 0 0 1 ]

we start with the matrix

[ 1 1 1 | 1 0 0 ]
[ 0 1 1 | 0 1 0 ]
[ 0 0 1 | 0 0 1 ]

and perform elementary row operations until we get

[ 1 0 0 | 1 -1  0 ]
[ 0 1 0 | 0  1 -1 ]
[ 0 0 1 | 0  0  1 ]

and so the inverse is

A−1 =

[ 1 -1  0 ]
[ 0  1 -1 ]
[ 0  0  1 ]
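The (A | I) method can be coded as row reduction of the augmented block; a minimal sketch using NumPy (assumed available, `inverse_via_rref` is an illustrative name, and numerical pivoting concerns are mostly ignored):

```python
import numpy as np

def inverse_via_rref(A):
    """Row-reduce [A | I]; when A reduces to I, the right half is A^-1."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for i in range(n):
        # Find a usable pivot in column i and swap it into row i
        p = i + np.argmax(np.abs(M[i:, i]))
        if M[p, i] == 0:
            raise ValueError("matrix is not invertible")
        M[[i, p]] = M[[p, i]]
        M[i] /= M[i, i]                 # scale the pivot row to make the pivot 1
        for r in range(n):
            if r != i:
                M[r] -= M[r, i] * M[i]  # clear the rest of column i
    return M[:, n:]                     # right half is the inverse

A = np.array([[1., 1., 1.],
              [0., 1., 1.],
              [0., 0., 1.]])
print(inverse_via_rref(A))
# [[ 1. -1.  0.]
#  [ 0.  1. -1.]
#  [ 0.  0.  1.]]
```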
3.11 Proof of the FTLA
3.11.1 Theorem
The following are equivalent:
a) An×n is invertible
b) The equation Ax = 0 has only the trivial solution x = 0.
c) The reduced row echelon form of A is the identity matrix I.
d) A is a product of elementary matrices.
3.11.2 Proof a) implies b)
If A is invertible then
Ax = 0 ⇒ A−1(Ax) = A−10 ⇒ (A−1A)x = 0 ⇒ Ix = 0 ⇒ x = 0
3.11.3 Proof b) implies c)
Suppose that the equation Ax = 0 has only the solution x = 0. Then the augmented matrix in rref looks like
[ 1 0 0 ··· 0 | 0 ]
[ 0 1 0 ··· 0 | 0 ]
[ 0 0 1 ··· 0 | 0 ]
[       ...       ]
[ 0 0 0 ··· 1 | 0 ]

and so the rref of A is I.
3.11.4 Proof c) implies d)
If the rref of A is I then A can be transformed to I by elementary row operations. Since the elementary row operations can be done by elementary matrices, we have

En En−1 · · · E2 E1 A = I

for some elementary matrices E1, E2, ..., En. Then inverting gives

A = E1⁻¹ E2⁻¹ · · · En⁻¹
3.11.5 Proof d) implies a)
If we can show that d) implies a) then we have completed the proof.

If A is a product of elementary matrices then A is a product of invertible matrices since all elementary matrices are invertible. The product of invertible matrices is invertible so A is invertible.
3.12 Even more FTLA
3.12.1 Theorem
The following are equivalent:
a) An×n is invertible
b) The equation Ax = 0 has only the trivial solution x = 0.
c) The reduced row echelon form of A is the identity matrix I.
d) A is a product of elementary matrices.
NEW
e) Ax = b is consistent for all n× 1 matrices b
f) Ax = b has exactly one solution for all n× 1 matrices b
3.12.2 Proof so far
We have a) implies b) implies c) implies d) implies a).
3.12.3 e) implies a)
Consider
e1 =

[ 1 ]
[ 0 ]
[ . ]
[ 0 ]

There is some column matrix v1 so that

Av1 = e1

Consider

e2 =

[ 0 ]
[ 1 ]
[ . ]
[ 0 ]

There is some column matrix v2 so that

Av2 = e2

If ej is the column vector of n rows which has jth entry 1 and otherwise zero, then there are column matrices vj so that

Avj = ej

for j = 1, 2, ..., n. If we collect the vj as columns of a square matrix and the ej as the columns of the identity matrix then we can rewrite the result as

A (v1 v2 · · · vn) = (e1 e2 · · · en)

A (v1 v2 · · · vn) = I
and so
A−1 = (v1 v2 · · · vn)

3.12.4 Proof a) implies f)
Exercise for the student.
3.13 Exercises
3.13.1 Exercise
Consider the matrix
A =

[ 1 1 1 ]
[ 1 0 1 ]
[ 0 1 1 ]
Find the inverse of this matrix.
3.13.2 Exercise
Consider the system of linear equations
x1 + x2 + x3 = 3
x1 + x3 = 4
x2 + x3 = 5
What is the solution of this system?
3.13.3 Exercise
We proved that for invertible matrices A1 and A2 of the same size that

(A1A2)⁻¹ = A2⁻¹A1⁻¹

Given this fact show that

(A1A2A3)⁻¹ = A3⁻¹A2⁻¹A1⁻¹

Assume that for some positive integer k and any k invertible matrices of the same size that

(A1A2A3 · · · Ak−1Ak)⁻¹ = Ak⁻¹Ak−1⁻¹ · · · A2⁻¹A1⁻¹

Prove that for another invertible matrix Ak+1 of the same size that

(A1A2A3 · · · Ak−1AkAk+1)⁻¹ = Ak+1⁻¹Ak⁻¹Ak−1⁻¹ · · · A2⁻¹A1⁻¹
What can be concluded by mathematical induction?
3.13.4 Exercise
Consider the matrix
A =

[ 0 1 1 ]
[ 0 0 1 ]
[ 0 0 0 ]
Show that A^3 = 0.
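The powers of this matrix can be checked directly; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])

print(np.linalg.matrix_power(A, 2))  # only one nonzero entry remains
print(np.linalg.matrix_power(A, 3))  # the zero matrix
```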
3.13.5 Exercise
Suppose that a square matrix A is such that A^4 = 0. Prove that A does not have an inverse. Hint: Assume that it does have an inverse and show that this assumption leads to a contradiction.
3.13.6 Exercise
Consider the matrix
A =

[ 1 1 1 ]
[ 1 0 1 ]
[ 0 1 1 ]
Write A as the product of elementary matrices. What does the FTLA tell you?
3.13.7 Exercise
A square matrix is said to be diagonal if all the entries of the matrix not on the main diagonal are zero. The entries on the main diagonal may or may not be zero.
Show that if a diagonal matrix has no entries on the main diagonal that are zero then the diagonal matrix is a product of elementary matrices. Find the inverse of such a matrix.
3.13.8 Exercise
What is the formula for a positive integer power of a diagonal matrix?
3.13.9 Exercise
Suppose that A is a square matrix. Consider the two equations
Ax = b, Ax = c
Is it possible for the first equation to have exactly one solution and for the second equation to have more than one solution?
3.13.10 Exercise
Suppose that there is a polynomial function p such that
p(x) = a0 + a1x + a2x^2
You are given that
p(0) = 1, p(1) = 3, p(−1) = 1
What is the polynomial function p?
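Each condition p(x0) = y0 is one linear equation in the coefficients, so the exercise reduces to a system of linear equations; a sketch of the setup using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

# Conditions p(0) = 1, p(1) = 3, p(-1) = 1, each giving one equation
#   a0 + a1*x0 + a2*x0^2 = y0
xs = np.array([0., 1., -1.])
ys = np.array([1., 3., 1.])

V = np.vander(xs, 3, increasing=True)  # rows are [1, x, x^2]
coeffs = np.linalg.solve(V, ys)
print(coeffs)  # [a0, a1, a2]
```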
3.13.11 Exercise
Consider the system of linear equations
x1 + x2 = 0
cx1 + x2 = 0
where c is some real number. This system obviously has the trivial solution. For what values of c does this system have only the trivial solution?
3.13.12 Exercise
Is this matrix A invertible?

[ 4 0 0 ]
[ 0 3 0 ]
[ 0 0 0 ]
3.13.13 Exercise
Go through the proof of the FTLA carefully.
Chapter 4
Some types of matrices
4.1 Diagonal matrices
4.1.1 Examples
The following square matrices are examples of diagonal matrices
I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

A =

[ 1 0 0 ]
[ 0 5 0 ]
[ 0 0 7 ]

B =

[ 0 0 0 ]
[ 0 1 0 ]
[ 0 0 0 ]
4.1.2 Definition
A square matrix which has the property that all entries not on the main diagonal are zero is said to be diagonal. The entries on the main diagonal can be zero or not zero.
D =

[ d1  0  0 ···  0 ]
[  0 d2  0 ···  0 ]
[  0  0 d3 ···  0 ]
[       ...       ]
[  0  0  0 ··· dn ]
4.2 Diagonal and elementary matrices
4.2.1 Elementary matrices that scale rows
Let’s use 3x3 examples:
E1(d1) =

[ d1 0 0 ]
[  0 1 0 ]
[  0 0 1 ]

This elementary matrix performs the elementary row operation of multiplying the first row of a matrix by d1. The inverse if d1 ≠ 0 is

E1(1/d1) =

[ 1/d1 0 0 ]
[    0 1 0 ]
[    0 0 1 ]

This elementary matrix performs the elementary row operation of multiplying the first row of a matrix by 1/d1.

Now consider

E2(d2) =

[ 1  0 0 ]
[ 0 d2 0 ]
[ 0  0 1 ]

This elementary matrix performs the elementary row operation of multiplying the second row of a matrix by d2. The inverse if d2 ≠ 0 is

E2(1/d2) =

[ 1    0 0 ]
[ 0 1/d2 0 ]
[ 0    0 1 ]

This elementary matrix performs the elementary row operation of multiplying the second row of a matrix by 1/d2.

Now consider

E3(d3) =

[ 1 0  0 ]
[ 0 1  0 ]
[ 0 0 d3 ]

This elementary matrix performs the elementary row operation of multiplying the third row of a matrix by d3. The inverse if d3 ≠ 0 is

E3(1/d3) =

[ 1 0    0 ]
[ 0 1    0 ]
[ 0 0 1/d3 ]

This elementary matrix performs the elementary row operation of multiplying the third row of a matrix by 1/d3.
4.2.2 3x3 diagonal matrix
Then we can get
D(d1, d2, d3) =

[ d1  0  0 ]
[  0 d2  0 ]
[  0  0 d3 ]

= E1(d1) E2(d2) E3(d3)
4.2.3 General result
Exercise.
4.3 Inverses of diagonal matrices
4.3.1 Inverses
If none of the di are zero then
D−1 =

[ 1/d1    0    0 ···    0 ]
[    0 1/d2    0 ···    0 ]
[    0    0 1/d3 ···    0 ]
[           ...           ]
[    0    0    0 ··· 1/dn ]
4.3.2 What if one of the di is zero?
Then the matrix is not invertible.
4.3.3 Example
If
A =

[ 1 0 0 ]
[ 0 5 0 ]
[ 0 0 7 ]

then

A−1 =

[ 1   0   0  ]
[ 0 1/5   0  ]
[ 0   0  1/7 ]
4.3.4 Product of elementary matrices
If a diagonal matrix has no zero entries on the main diagonal then it is the product of elementary matrices.
4.3.5 Proof
Exercise. Hint: Consider performing a set of row operations on the identity matrix that will eventually give a diagonal matrix with no zeros on the main diagonal.
4.3.6 Inverses
A diagonal matrix with no zeros on the main diagonal is invertible.
4.3.7 Proof
Previous result plus the FTLA.
4.4 Powers of diagonal matrices
4.4.1 Recall
When the diagonal matrix D(d1, d2, ..., dn) multiplies another matrix A from the left it multiplies the ith row of A by di.
4.4.2 Repeated multiplication by a diagonal matrix
If we apply D twice, the ith row of A will be multiplied by di twice, that is, by di^2. If we apply D to A a total of k times then the ith row of A will be multiplied by di^k. Then

(D(d1, d2, . . . , dn))^k = D(d1^k, d2^k, . . . , dn^k)
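This rule for powers of a diagonal matrix can be verified numerically; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

D = np.diag([1., 2., 3.])
k = 4

# Powering a diagonal matrix just powers its diagonal entries
print(np.linalg.matrix_power(D, k))  # diag(1, 16, 81)
print(np.diag(np.diag(D) ** k))      # the same matrix, built entrywise
```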
4.5 Multiplying from the right
4.5.1 Question
We said that if D is a diagonal matrix which multiplies A from the left, then it scales the rows of A. What if instead of DA we had AD?
4.5.2 Example
Consider
D =

[ 1 0 0 ]
[ 0 2 0 ]
[ 0 0 3 ]

and

A =

[ 1 0 1 ]
[ 1 1 0 ]
[ 1 0 1 ]

From what we have seen before, we can write

DA =

[ 1 0 1 ]
[ 2 2 0 ]
[ 3 0 3 ]

without even doing full matrix multiplication. We know that the rows of A will be scaled. Now consider that

AD =

[ 1 0 1 ] [ 1 0 0 ]   [ 1 0 3 ]
[ 1 1 0 ] [ 0 2 0 ] = [ 1 2 0 ]
[ 1 0 1 ] [ 0 0 3 ]   [ 1 0 3 ]
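The row-versus-column scaling contrast in this example can be checked directly; a sketch using NumPy (assumed available, not part of the original notes):

```python
import numpy as np

D = np.diag([1., 2., 3.])
A = np.array([[1., 0., 1.],
              [1., 1., 0.],
              [1., 0., 1.]])

print(D @ A)  # rows of A scaled by 1, 2, 3
print(A @ D)  # columns of A scaled by 1, 2, 3
```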
4.5.3 Exercise
Show that multiplication by a diagonal matrix from the right scales columns.
4.6 Importance of diagonal matrices
It is very easy to find the inverses and powers of a diagonal matrix. We will see later in the course that this simplifies many calculations.
4.7 Symmetric matrices
4.7.1 Definition
A square matrix A is said to be symmetric if At = A.
4.7.2 Alternatively
If its entries aij are such that
aij = aji
4.7.3 Example
The identity matrix is symmetric.
4.7.4 Example
A =

[ 1 2  3 ]
[ 2 0  4 ]
[ 3 4 -7 ]
is a symmetric matrix.
4.7.5 Transpose of a symmetric matrix
If A is symmetric then At is symmetric.
4.7.6 Proof
If A is symmetric then At = A. So

(At)t = A = At

and thus At is symmetric.
4.7.7 Sum of symmetric matrices
If A and B are symmetric matrices then their sum A+B is symmetric.
4.7.8 Proof
A is symmetric so At = A. B is symmetric so Bt = B. Then (A + B)t = At + Bt by the rules for transpose. Then (A + B)t = A + B.
4.7.9 Difference of symmetric matrices
If A and B are symmetric matrices then their difference A−B is symmetric.
4.7.10 Proof
Exercise.
4.7.11 Scalar multiple of a symmetric matrix
If A is a symmetric matrix and k is a real number then the matrix kA is symmetric.
4.7.12 Proof
If the elements aij of A have the property that aji = aij then the elements bij = kaij have the property that bji = kaji = kaij = bij. Thus kA is symmetric.
4.7.13 Inverse of a symmetric matrix
If A is a symmetric matrix and is invertible then its inverse is also symmetric.
4.7.14 Proof
Suppose that A is symmetric. That is, At = A. Suppose that A is invertible. Then
(A−1)tAt = (AA−1)t = It = I
and
At(A−1)t = (A−1A)t = It = I
But as A = At
(A−1)tA = I
and
A(A−1)t = I
And so
(A−1)t = A−1
4.7.15 Product of a matrix and its transpose
Suppose that A is a general m × n matrix, so that At is n × m. Then the product AtA is symmetric.
4.7.16 Proof
(AtA)t = At(At)t = AtA
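A quick numerical sanity check of this fact, written as a plain-Python sketch with helpers defined just for this note:

```python
def transpose(A):
    """Transpose a list-of-rows matrix."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Naive matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# A rectangular matrix chosen arbitrarily for the check.
A = [[1, 2, 3],
     [4, 5, 6]]

G = mat_mul(transpose(A), A)   # the 3x3 product A^t A
assert G == transpose(G)       # symmetric, as the proof shows
```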
4.8 Triangular matrices
4.8.1 Upper triangular matrices
A square matrix is upper triangular if all the entries below the main diagonal are zero. Entries on the main diagonal may be zero or not.
That is, A is upper triangular if its entries aij are such that aij = 0 whenever i > j.
4.8.2 Example
A =
[ 0  1   2 ]
[ 0  5   1 ]
[ 0  0  12 ]
is an example of an upper triangular matrix.
4.8.3 Lower triangular matrices
A square matrix is lower triangular if all the entries above the main diagonal are zero. Entries on the main diagonal may be zero or not.
That is, A is lower triangular if its entries aij are such that aij = 0 whenever i < j.
4.8.4 Example
A =
[ 1  0  0 ]
[ 2  3  0 ]
[ 4  5  6 ]
is an example of a lower triangular matrix.
4.8.5 Identity matrix
The identity matrix is both upper triangular and lower triangular. In fact, diagonal matrices are both lower triangular and upper triangular.
4.9 Properties of triangular matrices
4.9.1 Transpose of an upper triangular matrix
The transpose of an upper triangular matrix is lower triangular.
4.9.2 Proof
A is upper triangular if its entries aij are such that aij = 0 whenever i > j. Let the entries of At be called bij. Then bij = aji, which is zero whenever j > i. Thus At is lower triangular.
4.9.3 Transpose of a lower triangular matrix
The transpose of a lower triangular matrix is upper triangular.
4.9.4 Proof
Exercise.
4.9.5 The sum of upper triangular matrices
If A and B are upper triangular matrices of the same size then their sum is upper triangular.
4.9.6 Proof
A is upper triangular if its entries aij are such that aij = 0 whenever i > j. B is upper triangular if its entries bij are such that bij = 0 whenever i > j. The sum C = A + B has entries cij = aij + bij. If i > j then both aij and bij are zero and so cij = 0 + 0 = 0. Then C is upper triangular.
4.9.7 The sum of lower triangular matrices
If A and B are lower triangular matrices of the same size then their sum is lower triangular.
4.9.8 Proof
Exercise.
4.9.9 The difference of upper triangular matrices
If A and B are upper triangular matrices of the same size then their difference A − B is upper triangular.
4.9.10 Proof
Exercise.
4.9.11 The difference of lower triangular matrices
If A and B are lower triangular matrices of the same size then their difference A − B is lower triangular.
4.9.12 Proof
Exercise.
4.9.13 Invertibility of upper triangular matrices
This depends on the main diagonal.
If an upper triangular matrix has no zeros on the main diagonal then it is invertible.
4.9.14 Proof
We can use elementary matrices to scale the pivots of each row to 1. We can then use elementary matrices to produce zeros above the pivots. Thus the rref of the matrix is the identity matrix and then the matrix is invertible by the FTLA.
4.9.15 Invertibility of lower triangular matrices
A lower triangular matrix is invertible if and only if it has no zeros on the main diagonal.
4.9.16 Proof
Exercise.
4.9.17 Example
The matrix
A =
[ 1  1  1 ]
[ 0  2  2 ]
[ 0  0  3 ]

is invertible and the matrix

B =
[ 1  1  1  4 ]
[ 0  0  2  5 ]
[ 0  0  3  7 ]
[ 0  0  0  1 ]
is not invertible.
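The diagonal test illustrated by A and B above is easy to state in code. A minimal sketch (the function name is made up for this note):

```python
def triangular_invertible(T):
    """A triangular matrix is invertible iff every main-diagonal entry is nonzero."""
    return all(T[i][i] != 0 for i in range(len(T)))

assert triangular_invertible([[1, 1, 1],
                              [0, 2, 2],
                              [0, 0, 3]])          # A above: invertible

assert not triangular_invertible([[1, 1, 1, 4],
                                  [0, 0, 2, 5],
                                  [0, 0, 3, 7],
                                  [0, 0, 0, 1]])   # B above: zero in position (2,2)
```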
4.9.18 Form of inverses of triangular matrices
The inverse of an invertible upper triangular matrix is upper triangular and the inverse of an invertible lower triangular matrix is lower triangular.
4.9.19 The upper triangular case
Suppose that Un×n is an upper triangular matrix that is invertible. We consider how to find the inverse using
(U |I)
The matrix I is upper triangular. None of the row operations that we will perform changes the zeros below the main diagonal, and so the resulting matrix will be upper triangular.
4.9.20 The lower triangular case
Exercise.
4.10 LU decomposition
4.10.1 Idea
Suppose that A is an invertible matrix and we want to solve
Ax = b
We decompose A as the product of a lower triangular matrix L and an upper triangular matrix U.
A = LU
Now we want to solve
LUx = b
If we write
y = Ux
then we can solve
Ly = b
for y and then solve
Ux = y
for x.
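The two triangular solves are usually called forward and back substitution. A minimal plain-Python sketch, assuming nonzero diagonal entries:

```python
def forward_sub(L, b):
    """Solve L y = b for lower triangular L with nonzero diagonal."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    return y

def back_sub(U, y):
    """Solve U x = y for upper triangular U with nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

# Toy system: A = LU with L = [[1,0],[3,1]], U = [[2,1],[0,4]], b = (2, 10).
y = forward_sub([[1, 0], [3, 1]], [2, 10])
x = back_sub([[2, 1], [0, 4]], y)
assert x == [0.5, 1.0]
```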
4.10.2 Benefits
In some applications, A stays fixed while b keeps changing. One then factors A only once, at roughly the cost of a single Gaussian elimination, and solves the two cheap triangular systems for each new b.
4.10.3 Example
Suppose that we have the system

[  2  −1   3 ] [ x1 ]   [  1 ]
[ −4   5   0 ] [ x2 ] = [ −2 ]
[  4   2  18 ] [ x3 ]   [  0 ]
One LU decomposition could be of the form
L =
[ 1    0    0 ]
[ l21  1    0 ]
[ l31  l32  1 ]

U =
[ u11  u12  u13 ]
[ 0    u22  u23 ]
[ 0    0    u33 ]

Note that when we multiply L and U the first row of the result will be the first row of U, so the first row of U should be the first row of A:

L =
[ 1    0    0 ]
[ l21  1    0 ]
[ l31  l32  1 ]

U =
[ 2  −1   3   ]
[ 0  u22  u23 ]
[ 0  0    u33 ]
Consider that l21 · 2 = a21 = −4. Then l21 = −2. So we have
L =
[  1   0    0 ]
[ −2   1    0 ]
[ l31  l32  1 ]

U =
[ 2  −1   3   ]
[ 0  u22  u23 ]
[ 0  0    u33 ]
Also (−2)(−1) + (1)(u22) = 5. So u22 = 3.
L =
[  1   0    0 ]
[ −2   1    0 ]
[ l31  l32  1 ]

U =
[ 2  −1  3   ]
[ 0   3  u23 ]
[ 0   0  u33 ]
Also l31(2) = 4 so l31 = 2.
L =
[  1   0    0 ]
[ −2   1    0 ]
[  2   l32  1 ]

U =
[ 2  −1  3   ]
[ 0   3  u23 ]
[ 0   0  u33 ]
Also (2)(−1) + l32(3) = 2 so l32 = 4/3.
L =
[  1    0   0 ]
[ −2    1   0 ]
[  2  4/3   1 ]

U =
[ 2  −1  3   ]
[ 0   3  u23 ]
[ 0   0  u33 ]
Then (−2)(3) + (1)u23 = 0 so u23 = 6.
L =
[  1    0   0 ]
[ −2    1   0 ]
[  2  4/3   1 ]

U =
[ 2  −1  3   ]
[ 0   3  6   ]
[ 0   0  u33 ]
Finally (2)(3) + (4/3)(6) + u33 = 18 so u33 = 4.
L =
[  1    0   0 ]
[ −2    1   0 ]
[  2  4/3   1 ]

U =
[ 2  −1  3 ]
[ 0   3  6 ]
[ 0   0  4 ]
It is left to the reader to solve the system.
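The entry-by-entry computation carried out above is the Doolittle scheme. A plain-Python sketch (assuming no pivoting is needed, i.e. every pivot is nonzero; `lu_doolittle` is a name chosen for this note):

```python
def lu_doolittle(A):
    """Doolittle LU factorization with unit diagonal on L.
    Sketch only: assumes every pivot is nonzero, so no row exchanges."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):        # fill row i of U
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):    # fill column i of L
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U

L, U = lu_doolittle([[2, -1, 3], [-4, 5, 0], [4, 2, 18]])
# Should reproduce the entries derived above (up to rounding):
# L = [[1,0,0], [-2,1,0], [2,4/3,1]] and U = [[2,-1,3], [0,3,6], [0,0,4]]
```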
4.10.4 Pivoting
A pivoting matrix P that exchanges rows is often used with a decomposition given by
A = PLU
We don’t want to go too deeply into numerical linear algebra at this point and leave further discussion to later courses.
4.11 Exercises
4.11.1 Exercise
Find the inverse of the diagonal matrix or show that the inverse does not exist.
D =
[ 1  0  0 ]
[ 0  2  0 ]
[ 0  0  4 ]
D =
[ 1  0  0 ]
[ 0  0  0 ]
[ 0  0  4 ]
4.11.2 Exercise
The exponential function exp : R→ R is defined by
exp(x) ≡ 1 + x + x^2/2! + x^3/3! + · · · + x^k/k! + · · ·
The exponential function for square matrices A is defined by
exp(A) ≡ I + A + A^2/2! + A^3/3! + · · · + A^k/k! + · · ·
If
D =
[ 1  0  0 ]
[ 0  2  0 ]
[ 0  0  4 ]
then what is exp(D)?
4.11.3 Exercise
A square matrix A is said to be skew-symmetric if At = −A. Prove that a skew-symmetric matrix has all zeros on its main diagonal.
4.11.4 Exercise
Show that a square matrix A can always be written as the sum of a symmetric matrix and a skew-symmetric matrix.
4.11.5 Exercise
Consider the lower triangular matrix
L =
[ 1  0  0 ]
[ 2  1  0 ]
[ 2  2  1 ]
Find the inverse of this matrix.
4.11.6 Exercise
Consider the lower triangular matrix
L =
[ 1  0  0 ]
[ 2  1  0 ]
[ 2  2  1 ]
and the matrix
A =
[ 1  2   3 ]
[ 2  6   7 ]
[ 2  8  11 ]
Find an upper triangular matrix U so that A = LU .
Chapter 5
Determinants
5.1 Idea
The determinant is a measure of what a linear transformation represented by a square matrix does to a unit area. A determinant of two would mean that the area of the output of the transformation is twice the area of the input. The sign represents a change in orientation.
In this course, we will be interested in determinants for what they can tell us about matrices.
5.2 For 2x2 systems
5.2.1 Definition
Given
A2×2 =
[ a  b ]
[ c  d ]

we define
|A| ≡ ad− bc
or
det(A) ≡ ad− bc
5.2.2 Example
If
A =
[ 1  2 ]
[ 3  4 ]

then
|A| = 1 · 4 − 2 · 3 = − 2
5.3 For general square matrices
What if the matrix is bigger than a 2x2?
5.3.1 Minors
If the square matrix An×n has entries aij let mij be the determinant of the matrix given by crossing out the row and column of aij.
5.3.2 Example
Suppose that we have a matrix
A =
[ 1  2  3 ]
[ 4  5  6 ]
[ 7  8  9 ]
Then
m11 = 5 · 9 − 6 · 8 = − 3

m23 = 1 · 8 − 2 · 7 = − 6
5.3.3 Cofactors
If A is a square matrix with entries aij and the associated minors are mij then the cofactor cij is given by
cij = (−1)i+jmij
5.3.4 Example
We calculated for
A =
[ 1  2  3 ]
[ 4  5  6 ]
[ 7  8  9 ]
the minors
m11 = − 3
m23 = − 6
The associated cofactors are
c11 = (−1)1+1m11 = (1)(−3) = − 3
c23 = (−1)2+3m23 = (−1)(−6) = 6
5.3.5 Determinants
For a square matrix An×n we define
|A| = a11c11 + a12c12 + · · ·+ a1nc1n
5.3.6 Example
For
A =
[ 1  2  3 ]
[ 4  5  6 ]
[ 7  8  9 ]

|A| = (1)(5 · 9 − 6 · 8) + (−2)(4 · 9 − 6 · 7) + (3)(4 · 8 − 5 · 7)

5.3.7 General definition
You can expand along whichever row or column you like; just make sure to get the alternating signs right. For a square matrix An×n expanding along the ith row, we have
|A| = Σ_{j=1}^n (−1)^(i+j) aij |Aij|

where Aij is the matrix resulting from crossing out the ith row and jth column of A.
For a square matrix An×n expanding along the jth column, we have
|A| = Σ_{i=1}^n (−1)^(i+j) aij |Aij|
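The row-expansion definition translates directly into a recursive function. A plain-Python sketch, practical only for small matrices since the recursion takes exponential time:

```python
def det(A):
    """Determinant by cofactor expansion along the first row.
    Exponential time; for illustration on small matrices only."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # Sum over the first row: sign * entry * determinant of the minor.
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(n))

assert det([[1, 2], [3, 4]]) == -2                     # the 2x2 example above
assert det([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) == 0
```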
5.3.8 Example
For
A =
[ 1  0  3 ]
[ 4  0  6 ]
[ 7  0  9 ]
we can use
|A| = a12c12 + a22c22 + a32c32 = 0 + 0 + 0 = 0
5.3.9 Example
What is the determinant of
A =
[ 1  2  3  4 ]
[ 0  5  6  7 ]
[ 0  0  1  2 ]
[ 0  0  0  6 ]
The result is immediate if you make the right choice for the rows and columns to use.
5.4 Some properties of determinants
5.4.1 Determinants of matrices with rows/cols of zeros
The determinant of a matrix with a row of all zeros is zero. The determinant of a matrix with a column of all zeros is zero.
5.4.2 Examples
Consider a 2x2 matrix

[ a  b ]
[ 0  0 ]

The determinant is clearly zero.
Consider a 3x3 matrix
A =
[ a  b  c ]
[ 0  0  0 ]
[ d  e  f ]
If we do the calculation using the second row we get
|A| = 0 · (· · ·) + 0 · (· · ·) + 0 · (· · ·) = 0
Similarly, if we have a row of zeros in any n× n matrix A we would get
|A| = 0 · (· · ·) + 0 · (· · ·) + 0 · (· · ·) + · · · + 0 · (· · ·) = 0
The same reasoning holds for a column of zeros.
5.4.3 Multiplying a single row by a real
Suppose that An×n is a square matrix and B is the matrix given by multiplying each element of a row by k. Then
|B| = k|A|
5.4.4 Example

| 1  2  0 |
| 2  4  0 | = 0
| 3  5  0 |

and so

| 5  10  0 |
| 2   4  0 | = 5 · 0 = 0
| 3   5  0 |
5.4.5 Proof
Suppose that An×n has elements aij. Let Aij be the (n − 1) × (n − 1) matrix formed by crossing out the ith row and jth column of A.

Suppose that we multiply row s of A by a real number k to get B, which has elements bij. Let Bij be the (n − 1) × (n − 1) matrix formed by crossing out the ith row and jth column of B.
Note that
bsj = kasj
from the definition of B and
Bsj = Asj
since A and B only differ in row s which is being crossed out in Asj and Bsj.
Then calculating the determinant of B by expanding along the sth row of B:
|B| = (−1)^(s+1) bs1 |Bs1| + (−1)^(s+2) bs2 |Bs2| + · · · + (−1)^(s+n) bsn |Bsn|
    = (−1)^(s+1) k as1 |As1| + (−1)^(s+2) k as2 |As2| + · · · + (−1)^(s+n) k asn |Asn|
    = k { (−1)^(s+1) as1 |As1| + (−1)^(s+2) as2 |As2| + · · · + (−1)^(s+n) asn |Asn| }
    = k |A|
5.4.6 Multiplying all elements by a real
Suppose that An×n is a square matrix and B is the matrix given by multiplying each element of A by k. Then
|B| = |kA| = k^n |A|
5.4.7 Proof
Apply the proof for a single row n times.
5.4.8 Result for multiplying a column by a real
Suppose that An×n is a square matrix and B is the matrix given by multiplying a column of A by k. Then
|B| = k|A|
5.4.9 Proof
Exercise
5.4.10 Determinant after interchanging two adjacent rows
Suppose that An×n is a square matrix and the matrix Bn×n is obtained from A by swapping two ADJACENT rows of A. Then
|B| = − |A|
5.4.11 Example

| 1  2  3 |
| 0  4  5 | = 24
| 0  0  6 |

so

| 1  2  3 |
| 0  0  6 | = − 24
| 0  4  5 |
5.4.12 Proof
Let A be the original matrix with entries ai,j. Let Ai,j be the matrix obtained by crossing out the ith row of A and the jth column of A.

Let B be the matrix obtained by interchanging two adjacent rows of A, with entries bi,j. Let Bi,j be the matrix obtained by crossing out the ith row of B and the jth column of B.

Let’s suppose that B is obtained from A by interchanging rows i and i + 1.
Note that
bi+1,k = ai,k
and that
Bi+1,k = Ai,k
from the definition of B.
We calculate the determinant of B by expansion along row i+ 1
|B| = (−1)^((i+1)+1) bi+1,1 |Bi+1,1| + (−1)^((i+1)+2) bi+1,2 |Bi+1,2| + · · · + (−1)^((i+1)+n) bi+1,n |Bi+1,n|
    = (−1)^((i+1)+1) ai,1 |Ai,1| + (−1)^((i+1)+2) ai,2 |Ai,2| + · · · + (−1)^((i+1)+n) ai,n |Ai,n|
    = (−1) { (−1)^(i+1) ai,1 |Ai,1| + (−1)^(i+2) ai,2 |Ai,2| + · · · + (−1)^(i+n) ai,n |Ai,n| }
    = (−1)|A|
5.4.13 Determinant after swapping ANY two rows
Suppose that An×n is a square matrix and Bn×n is the matrix obtained from A by interchanging any two rows of A. Then
|B| = − |A|
5.4.14 Example

| 1  2  3 |
| 0  4  5 | = 24
| 0  0  6 |

so

| 0  0  6 |
| 0  4  5 | = − 24
| 1  2  3 |
5.4.15 Proof
Suppose that row r and row s are the two rows to be interchanged, with 1 ≤ r < s ≤ n. First interchange row r with the row below it repeatedly until it sits in position s of the new matrix. This takes s − r interchanges of adjacent rows.

The original row s now sits in position s − 1, directly above the row we just moved down. Interchange it with the row above it repeatedly until it sits in position r. This takes s − r − 1 interchanges of adjacent rows.
Then by the result for interchanges of adjacent rows
|B| = (−1)^((s−r)+(s−r−1)) |A| = (−1)^(2(s−r)−1) |A| = − |A|
5.4.16 Determinant after interchanging two columns
Suppose that An×n is a square matrix and Bn×n is the matrix obtained from A by interchanging any two columns of A. Then
|B| = − |A|
5.4.17 Proof
Exercise.
5.4.18 Determinant of a square matrix with two identical rows
The determinant of a square matrix with two identical rows is zero.
5.4.19 Example

| 1  1  1  1 |
| 1  2  3  4 |
| 4  3  2  1 | = 0
| 1  1  1  1 |
5.4.20 Proof
Suppose that A is a square matrix with two rows that are the same. From the result on row swaps, we can swap two rows and the determinant changes by a minus sign. Then if we swap the two identical rows
|A| = − |A| ⇒ |A| = 0
5.4.21 Determinant of a square matrix with two identical columns
The determinant of a square matrix with two identical columns is zero.
5.4.22 Proof
Exercise.
5.4.23 Determinants and the third row operation
Let An×n be a square matrix. Let k be a real number. Let B be the result of adding k timesrow r of A to another row s of A and replacing the original row s of A. Then
|B| = |A|
5.4.24 Example

| 1  1  1 |   | 1   1  1 |   | 1   1  1 |   | 1   1  1 |
| 2  1  3 | = | 0  −1  1 | = | 0  −1  1 | = | 0  −1  1 | = − 1
| 3  2  5 |   | 3   2  5 |   | 0  −1  2 |   | 0   0  1 |
5.4.25 Proof
Let An×n be a square matrix with entries aij. Let Aij be the matrix formed by crossing out the ith row and jth column of A.
Let k be a real number.
Let B be the result of adding k times row r of A to row s of A and replacing the original row s of A. Let the entries of B be bij. Let Bij be the matrix formed by crossing out the ith row and jth column of B.

Let C be the result of replacing row s of A by row r of A. Let the entries of C be cij. Let Cij be the matrix formed by crossing out the ith row and jth column of C.
Note that
Asj = Bsj = Csj
and
arj = csj
and as C has two identical rows
|C| = 0
We calculate the determinant of B by expanding along row s:
|B| = Σ_{j=1}^n (−1)^(s+j) bsj |Bsj|
    = Σ_{j=1}^n (−1)^(s+j) (asj + k arj) |Bsj|
    = k Σ_{j=1}^n (−1)^(s+j) arj |Bsj| + Σ_{j=1}^n (−1)^(s+j) asj |Bsj|
    = k Σ_{j=1}^n (−1)^(s+j) csj |Csj| + Σ_{j=1}^n (−1)^(s+j) asj |Asj|
    = k |C| + |A|
    = |A|
5.5 Determinants of matrices and their transposes
5.5.1 Example
Consider the 2x2 case

A =
[ a  b ]
[ c  d ]

At =
[ a  c ]
[ b  d ]

The determinants are

|A| = ad − bc

|At| = ad − cb

So in the 2x2 case the determinant of a matrix A and its transpose At are the same.
5.5.2 Example
Consider the 3x3 case

A =
[ a11  a12  a13 ]
[ a21  a22  a23 ]
[ a31  a32  a33 ]

At =
[ a11  a21  a31 ]
[ a12  a22  a32 ]
[ a13  a23  a33 ]

We can find the determinant of A by expanding along its first row to get

|A| = a11(a22 a33 − a23 a32) − a12(a21 a33 − a23 a31) + a13(a21 a32 − a22 a31)

Now to get the determinant of the transpose we can expand along the first column of At to get

|At| = a11(a22 a33 − a32 a23) − a12(a21 a33 − a31 a23) + a13(a21 a32 − a31 a22)

and we see that the result is exactly the same. Note that the 2x2 determinants in the second expansion are the determinants of the transposes of the 2x2 minors in the first, for which we have already established the result.
5.5.3 Theorem
If An×n is a square matrix then
|At| = |A|
5.5.4 Proof
We will prove this by induction.
Let Pn be the proposition that |At| = |A| for every n × n matrix A. We already have P2. Assume P2, . . . , Pk; that is, assume

|At| = |A|

for all square matrices of sizes 2 × 2, 3 × 3, . . . , k × k.
Now let A be a square (k + 1) × (k + 1) matrix. Then

|A| = Σ_{j=1}^{k+1} (−1)^(i+j) aij |Aij|

where Aij is the k × k matrix that results from crossing out the ith row and jth column of A.

But by the induction hypothesis |(Aij)t| = |Aij|, so

|A| = Σ_{j=1}^{k+1} (−1)^(i+j) aij |(Aij)t|

Note that the entries a′ji of At are given by

a′ji = aij

and that (Aij)t is exactly the matrix obtained by crossing out the jth row and ith column of At. So

|A| = Σ_{j=1}^{k+1} (−1)^(j+i) a′ji |(At)ji|

The right-hand side is the cofactor expansion of the determinant of At along its ith column, so

|A| = |At|

As P2, . . . , Pk together imply Pk+1, Pn holds for all n by mathematical induction.
5.6 Determinants, row operations, and elementary matrices
5.6.1 Identity
|I| = 1
5.6.2 Determinant of row swapping elementary matrix
We find the matrix for row swapping by swapping rows of the identity matrix. By a previous result the determinant of the elementary matrix will be -1.
5.6.3 Determinant of row multiplying elementary matrix
We find the matrix for row multiplying by multiplying a row of the identity matrix. By a previous result the determinant of the elementary matrix will be the multiple.
5.6.4 Determinant of elementary matrix that adds a multiple of a row to another row
We find the matrix for this operation by performing this operation on the identity matrix. The determinant of the new matrix must be the same as the determinant of the identity by a previous result, and so is 1.
5.6.5 Determinant of a product involving an elementary matrix and a square matrix
Suppose that E is an elementary matrix and A is a square matrix of the same size. Then
|EA| = |E||A|
5.6.6 Proof
We consider the three types of elementary matrices and show that the result holds for each type.
Suppose that E is an elementary matrix that interchanges two rows. Then
|EA| = − |A|
from a previous result. But |E| = −1 so
|EA| = − |A| = |E||A|
Suppose that E multiplies a row by a real number k. Then
|EA| = k|A| = |E||A|
Suppose that E adds a multiple of a row to another row. Then
|EA| = |A| = 1 · |A| = |E||A|
5.6.7 Determinant of an invertible matrix is nonzero
Suppose that a square matrix A is invertible. Then its determinant is non-zero.
5.6.8 Proof
By the FTLA, A is invertible if and only if it is a product of elementary matrices

A = E1E2 · · ·Es

Then

|A| = |E1E2 · · ·Es| = |E1||E2 · · ·Es| = |E1||E2| · · · |Es|

The determinants of elementary matrices are not zero by our previous discussion, so for A invertible |A| ≠ 0.
5.6.9 Determinant of a non-invertible (singular) matrix is zero
If a square matrix A is not invertible then its determinant is zero.
5.6.10 Proof
We do row reduction on A using elementary matrices to get its rref B. Note that since A is not invertible, B cannot be I, and so B has at least one row of zeros.
B = Ek · · ·E2E1A
Taking the determinant of both sides and using a previous result
|B| = |Ek · · ·E2E1A| = |Ek| · · · |E2||E1||A|
As B has at least one row of zeros by a previous result |B| = 0.
0 = |Ek| · · · |E2||E1||A|
As elementary matrices have nonzero determinant we must have that |A| = 0.
5.6.11 Product of matrices with singular factor
Suppose that A and B are square matrices of the same size and A is not invertible. Then the product AB has determinant zero.
5.6.12 Proof
We apply a set of elementary matrices to reduce A to its rref
C = EkEk−1 · · ·E2E1A
Note that as A is not invertible its rref has at least one row of zeros. Now multiply by B on the right.
CB = EkEk−1 · · ·E2E1AB
Since C has a row of zeros, so does CB. Then the determinant of CB is zero and
|CB| = |EkEk−1 · · ·E2E1AB| = |Ek||Ek−1| · · · |E2||E1||AB|
0 = |Ek||Ek−1| · · · |E2||E1||AB|
As elementary matrices have nonzero determinants then |AB| = 0.
5.6.13 Product of matrices with singular factor
Suppose that A and B are square matrices of the same size and B is not invertible. Then the product AB has determinant zero.
5.6.14 Proof
Exercise.
5.6.15 Determinant of a product of matrices
Suppose that A and B are square matrices of the same size. Then
|AB| = |A||B|
5.6.16 Proof
Suppose that A is invertible. Then A is a product of elementary matrices

A = E1E2 · · ·Ek

and we can write

AB = E1E2 · · ·EkB

and then

|AB| = |E1E2 · · ·EkB|
     = |E1||E2| · · · |Ek||B|
     = |E1E2 · · ·Ek||B|
     = |A||B|
The other possibility is that A is not invertible. Then by a previous result
|AB| = 0 = 0|B| = |A||B|
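A small numerical spot check of the product rule for 2x2 matrices, with helpers written just for this note:

```python
def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def mul2(A, B):
    """Product of two 2x2 matrices."""
    return [[A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[2, 0], [1, 5]]
assert det2(mul2(A, B)) == det2(A) * det2(B)   # both sides equal -20
```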
5.6.17 Corollary
If A is invertible then
|A−1| = 1/|A|
5.6.18 Proof
|I| = |AA−1| = |A||A−1|
and so
|A−1| = |I|/|A| = 1/|A|
5.6.19 Theorem
A square matrix A is invertible if and only if its determinant is not zero.
5.6.20 Proof
We saw that if A is invertible then its determinant is nonzero. Suppose that the determinant of A is nonzero. If we apply elementary matrices Ei to get the rref B then
B = EkEk−1 · · ·E2E1A
Taking the determinant of both sides
|B| = |Ek||Ek−1| · · · |E2||E1||A|
and so the determinant of the rref of A is not zero. If B had a row of zeros then |B| would be 0, so B must be I:
I = EkEk−1 · · ·E2E1A
and then A is invertible.
5.6.21 Theorem
A square matrix A is singular if and only if its determinant is zero.
5.6.22 Proof
Follows logically from previous theorem.
5.7 FTLA
5.7.1 Before
The following are equivalent:
a) An×n is invertible
b) The equation Ax = 0 has only the trivial solution x = 0.
c) The reduced row echelon form of A is I.
d) A is a product of elementary matrices.
e) Ax = b is consistent for all n × 1 matrices b.
5.7.2 Now add
f) |A| ≠ 0
5.8 Adjoint
5.8.1 Definition - matrix of cofactors
Suppose that A is a square matrix with entries aij and cofactors cij. The matrix C whose entries are the cofactors cij of A is called the matrix of cofactors of A.
5.8.2 Definition - adjoint
The transpose of the matrix of cofactors of A is called the adjoint of A and is often denoted adj(A).
5.8.3 Example
For
A =
[ 3   2  −1 ]
[ 1   6   3 ]
[ 2  −4   0 ]
c11 = 12, c12 = 6, c13 = − 16
c21 = 4, c22 = 2, c23 = 16
c31 = 12, c32 = − 10, c33 = 16
The matrix of cofactors is
C =
[ 12    6  −16 ]
[  4    2   16 ]
[ 12  −10   16 ]
and then taking the transpose of the matrix of cofactors gives us adj(A):
adj(A) =
[  12   4   12 ]
[   6   2  −10 ]
[ −16  16   16 ]
5.8.4 Adjoint and inverse
If the determinant of A is not zero then
A−1 = (1/|A|) adj(A)
5.8.5 Proof
Will do later in class.
5.8.6 Example
For
A =
[ 3   2  −1 ]
[ 1   6   3 ]
[ 2  −4   0 ]
we found the cofactors and can then calculate the determinant
|A| = 64
The adjoint matrix was
adj(A) =
[  12   4   12 ]
[   6   2  −10 ]
[ −16  16   16 ]
and so
A−1 = (1/64) ·
[  12   4   12 ]
[   6   2  −10 ]
[ −16  16   16 ]
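The whole adjoint construction can be sketched in a few lines of plain Python (function names made up for this note; assumes the determinant is nonzero):

```python
def minor(A, i, j):
    """Matrix A with row i and column j crossed out."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Cofactor expansion along the first row (small matrices only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse_by_adjoint(A):
    """Return adj(A)/|A|; assumes det(A) is nonzero."""
    n, d = len(A), det(A)
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
           for i in range(n)]
    # Transposing the cofactor matrix gives the adjoint; divide by |A|.
    return [[cof[j][i] / d for j in range(n)] for i in range(n)]

A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]
Ainv = inverse_by_adjoint(A)   # equals (1/64) adj(A), as computed above
```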
5.9 Exercises
5.9.1 Exercise
Evaluate the determinant

| 1  2 |
| 2  1 |

5.9.2 Exercise
Evaluate the minors of
A =
[ 1  2  1 ]
[ 2  1  2 ]
[ 3  0  5 ]
5.9.3 Exercise
Calculate the determinant of A.
A =
[ 1  2  1 ]
[ 2  1  2 ]
[ 3  0  5 ]
5.9.4 Exercise
Calculate the determinant of A.
A =
[ 1  0  0  0 ]
[ 0  1  2  1 ]
[ 0  2  1  2 ]
[ 0  3  0  5 ]
5.9.5 Exercise
Calculate the determinant of A.
A =
[ 1  6  7  12 ]
[ 0  0  0   0 ]
[ 0  2  1   2 ]
[ 0  3  0   5 ]
5.9.6 Exercise
Calculate the determinant of A.
A =
[ 1   6   7  12 ]
[ 2  12  14  24 ]
[ 0   2   1   2 ]
[ 0   3   0   5 ]
5.9.7 Exercise
Calculate the determinant of A.
A =
[ 1  6   7  12 ]
[ 0  1  14  24 ]
[ 0  0   1   2 ]
[ 0  0   0   5 ]
5.9.8 Exercise
Calculate the determinant of A.
A =
[ 1  6   7  12 ]
[ 0  1  14  24 ]
[ 0  0   0   5 ]
[ 0  0   1   2 ]
5.9.9 Exercise
Calculate the determinant of A.
A =
[ 1  6   7  12 ]
[ 0  1  14  24 ]
[ 0  0   0   5 ]
[ 0  0   5  10 ]
5.9.10 Exercise
Calculate the determinant of 2A.
A =
[ 1  6   7  12 ]
[ 0  1  14  24 ]
[ 0  0   0   5 ]
[ 0  0   5  10 ]
5.9.11 Exercise
Calculate the determinant of 2A.
A =
[ 1  6  12  12 ]
[ 0  1  24  24 ]
[ 0  0   5   5 ]
[ 0  0  10  10 ]
5.9.12 Exercise
Given
A =
[ 1  6   7  12 ]
[ 0  1  14  24 ]
[ 0  0   0   5 ]
[ 0  0   5  10 ]
how many solutions can the equation Ax = 0 have?
5.9.13 Exercise
Suppose that for a given square matrix A the equation Ax = 0 has a single unique solution. What can you say about the determinant of A?
5.9.14 Exercise
Suppose that for a given square matrix A there is a column matrix b so that the equation Ax = b does not have any solution.
What can you say about the determinant of A?
5.9.15 Exercise
Find the adjoint of A.
A =
[ 1  6   7  12 ]
[ 0  1  14  24 ]
[ 0  0   0   5 ]
[ 0  0   1   2 ]
Use the adjoint to calculate the inverse of A.
Chapter 6
Vectors
6.1 A vector space
6.1.1 Defining the vector space
For the purposes of this course, the vector space R2 consists of column matrices of the form

[ x1 ]
[ x2 ]

where x1 and x2 are real numbers.
6.1.2 Examples
Some examples are
u =
[ 1 ]
[ 5 ]

and

v =
[ 2 ]
[ 1 ]
6.1.3 Zero vector
When we use 0 for a vector in R2 we mean

[ 0 ]
[ 0 ]
6.1.4 Addition
R2 consists of column matrices with addition being defined as for matrices as previously discussed. So for
u =
[ u1 ]
[ u2 ]

and

v =
[ v1 ]
[ v2 ]

the sum u + v is

u + v =
[ u1 + v1 ]
[ u2 + v2 ]
6.1.5 Scalar multiplication
R2 consists of column matrices with scalar multiplication being defined as for matrices as previously discussed. So for

u =
[ u1 ]
[ u2 ]

and k ∈ R

ku =
[ ku1 ]
[ ku2 ]
6.1.6 Coordinate geometry
We can associate a vector
r =
[ x ]
[ y ]

with the point (x, y) in the xy-plane. Or we can think of r as a directed line segment from the origin to the point in the plane.
6.2 Properties of the vector space
6.2.1 Vectors are column matrices
We will think of the vectors of the vector space R2 as being a special class of matrices. They inherit the properties of matrices that we have already discussed.
6.2.2 Properties
Then the following properties are immediate:
a) For u and v vectors in R2
u+ v = v + u
b) For u, v, and w vectors in R2
(u+ v) + w = u+ (v + w)
c) For u in R2 and 0 the zero vector
u+ 0 = u = 0 + u
d) For u in R2 we have
u+ (−u) = 0
e) For u and v in R2 and k a real number
k(u+ v) = ku+ kv
f) For u in R2
1u = u
6.3 Another vector space
6.3.1 Defining the vector space
R3 consists of column matrices of the form

[ x1 ]
[ x2 ]
[ x3 ]

where x1, x2 and x3 are real numbers.
6.3.2 Examples
Some examples are
u =
[ 1 ]
[ 5 ]
[ 0 ]

and

v =
[ 2 ]
[ 1 ]
[ 3 ]
6.3.3 Zero vector
When we use 0 for a vector in R3 we mean

[ 0 ]
[ 0 ]
[ 0 ]
6.3.4 Addition
R3 consists of column matrices with addition being defined as for matrices as previously discussed. So for

u =
[ u1 ]
[ u2 ]
[ u3 ]

and

v =
[ v1 ]
[ v2 ]
[ v3 ]

the sum u + v is

u + v =
[ u1 + v1 ]
[ u2 + v2 ]
[ u3 + v3 ]
6.3.5 Scalar multiplication
R3 consists of column matrices with scalar multiplication being defined as for matrices as previously discussed. So for

u =
[ u1 ]
[ u2 ]
[ u3 ]

and k ∈ R

ku =
[ ku1 ]
[ ku2 ]
[ ku3 ]
6.3.6 Coordinate geometry
We can associate a vector
r =
[ x ]
[ y ]
[ z ]

with the point (x, y, z) in three-dimensional space. Or we can think of r as a directed line segment from the origin to the point in space.
6.4 Properties of the vector space
6.4.1 Vectors are column matrices
We will think of the vectors of the vector space R3 as being a special class of matrices. They inherit the properties of matrices that we have already discussed.
6.4.2 Properties
Then the following properties are immediate:
a) For u and v vectors in R3
u+ v = v + u
b) For u, v, and w vectors in R3
(u+ v) + w = u+ (v + w)
c) For u in R3 and 0 the zero vector
u+ 0 = u = 0 + u
d) For u in R3 we have
u+ (−u) = 0
e) For u and v in R3 and k a real number
k(u+ v) = ku+ kv
f) For u in R3
1u = u
6.5 More vector spaces
6.5.1 Defining the vector space
Rn for n = 2, 3, 4, ... consists of column matrices of the form

[ x1 ]
[ x2 ]
[ ⋮  ]
[ xn ]

where x1, x2, ..., xn are real numbers.
6.5.2 Zero vector
When we use 0 for a vector in Rn we mean

[ 0 ]
[ 0 ]
[ ⋮ ]
[ 0 ]
6.5.3 Addition
Rn consists of column matrices with addition being defined as for matrices as previously discussed. So for

u =
[ u1 ]
[ u2 ]
[ ⋮  ]
[ un ]

and

v =
[ v1 ]
[ v2 ]
[ ⋮  ]
[ vn ]

the sum u + v is

u + v =
[ u1 + v1 ]
[ u2 + v2 ]
[    ⋮    ]
[ un + vn ]
6.5.4 Scalar multiplication
Rn consists of column matrices with scalar multiplication being defined as for matrices as previously discussed. So for

u =
[ u1 ]
[ u2 ]
[ ⋮  ]
[ un ]

and k ∈ R

ku =
[ ku1 ]
[ ku2 ]
[  ⋮  ]
[ kun ]
6.6 Properties of the vector space
6.6.1 Vectors are column matrices
We will think of the vectors of the vector space Rn as being a special class of matrices. They inherit the properties of matrices that we have already discussed.
6.6.2 Properties
Then the following properties are immediate:
a) For u and v vectors in Rn
u+ v = v + u
b) For u, v, and w vectors in Rn
(u+ v) + w = u+ (v + w)
c) For u in Rn and 0 the zero vector
u+ 0 = u = 0 + u
d) For u in Rn we have
u+ (−u) = 0
e) For u and v in Rn and k a real number
k(u+ v) = ku+ kv
f) For u in Rn
1u = u
6.7 The Euclidean inner product
6.7.1 Definition
For u and v in Rn we define
u · v = Σ_{i=1}^n ui vi
where ui is the ith entry of u and vi is the ith entry of v.
6.7.2 Example
Given
u =
[ −1 ]
[  0 ]
[  1 ]

v =
[  1 ]
[  0 ]
[ −1 ]
The inner product of u and v is found by
u · v = (−1)(1) + (0)(0) + (1)(−1) = − 2
6.7.3 Magnitude of a vector
If we think of a vector u with entries x1, x2, ..., xn as a directed line segment starting at the origin and going to the associated point, then we might also ask what the length of the vector is.
In the two-dimensional case the magnitude squared of
u =
[ u1 ]
[ u2 ]

is

u · u = u1^2 + u2^2
In the three-dimensional case the magnitude squared of
u =
[ u1 ]
[ u2 ]
[ u3 ]

is

u · u = u1^2 + u2^2 + u3^2
In the n-dimensional case the magnitude squared of
u =
[ u1 ]
[ u2 ]
[ ⋮  ]
[ un ]

is

u · u = u1^2 + u2^2 + · · · + un^2
6.7.4 Norm
Often the term norm is used for the magnitude of the vector
‖u‖ = √(u · u)
6.7.5 Example
For
u =
[ 1/√2 ]
[ 1/√2 ]

‖u‖^2 = (1/√2)^2 + (1/√2)^2 = 1
and so
‖u‖ = 1
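The definitions of this section are one-liners in code. A plain-Python sketch using only the standard library:

```python
import math

def dot(u, v):
    """Euclidean inner product of two same-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Magnitude (Euclidean norm) of a vector."""
    return math.sqrt(dot(u, u))

assert dot([-1, 0, 1], [1, 0, -1]) == -2                            # example 6.7.2
assert abs(norm([1 / math.sqrt(2), 1 / math.sqrt(2)]) - 1) < 1e-12  # example 6.7.5
assert norm([3, 4]) == 5.0
```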
6.7.6 Distance between two two-dimensional vectors
If we identify a vector

r =
[ r1 ]
[ r2 ]

with the point (r1, r2) in the cartesian plane and another vector

s =
[ s1 ]
[ s2 ]

with the point (s1, s2) then we are used to using Pythagoras to find the distance between the two points:

√((s1 − r1)^2 + (s2 − r2)^2) = ‖s − r‖
We define the distance between the two vectors r and s by
‖s− r‖
6.7.7 Distance between two three-dimensional vectors
If we identify a vector

r =
[ r1 ]
[ r2 ]
[ r3 ]

with the point (r1, r2, r3) and another vector

s =
[ s1 ]
[ s2 ]
[ s3 ]

with the point (s1, s2, s3) then we are used to using Pythagoras to find the distance between the two points:

√((s1 − r1)^2 + (s2 − r2)^2 + (s3 − r3)^2) = ‖s − r‖
We define the distance between the two vectors r and s by
‖s− r‖
6.7.8 Distance between vectors
Given two vectors r and s in Rn we define the distance between the two vectors to be
‖s− r‖
6.7.9 Cauchy–Schwarz inequality
For u and v in Rn we have
|u · v| ≤ ‖u‖‖v‖
6.8 Properties of the Euclidean norm
For u in Rn
1)
‖u‖ ≥ 0
2)
‖u‖ = 0 iff u = 0
3) For k a real number
‖ku‖ = |k|‖u‖
4) Triangle inequality
‖u+ v‖ ≤ ‖u‖+ ‖v‖
6.8.1 Proof of triangle inequality
‖u + v‖^2 = (u + v) · (u + v)
          = u · u + 2u · v + v · v
          = ‖u‖^2 + 2u · v + ‖v‖^2
          ≤ ‖u‖^2 + 2|u · v| + ‖v‖^2
          ≤ ‖u‖^2 + 2‖u‖‖v‖ + ‖v‖^2   (by Cauchy–Schwarz)
          = (‖u‖ + ‖v‖)^2
6.9 Orthogonality
6.9.1 Definition
Two vectors u and v in Rn are said to be orthogonal if their dot product is zero
u · v = 0
6.9.2 Example
The vectors
u =
[ 1 ]
[ 0 ]

v =
[ 0 ]
[ 1 ]

are orthogonal.
6.10 Vectors and systems of linear equations
6.10.1 Example
Recall that we had systems that looked like

2x1 + x2 = 5
x1 + x2 = 7

In the language of vectors and matrices we would write

[2 1] [x1]   [5]
[1 1] [x2] = [7]

If we then perform row reduction we get

[1 0] [x1]   [−2]
[0 1] [x2] = [ 9]

So there is a unique solution to the matrix-vector equation, which is

x = (−2, 9)ᵀ
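For a 2 × 2 system with a unique solution, the row reduction above amounts to Cramer's rule. A Python sketch (the function name solve2 is our own, not standard):

```python
def solve2(a, b, c, d, e, f):
    # Solve  a*x1 + b*x2 = e,  c*x1 + d*x2 = f  by Cramer's rule,
    # assuming a unique solution exists (a*d - b*c != 0).
    det = a * d - b * c
    return ((e * d - b * f) / det, (a * f - e * c) / det)

# the system 2x1 + x2 = 5, x1 + x2 = 7 from the example
print(solve2(2, 1, 1, 1, 5, 7))  # (-2.0, 9.0)
```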
6.10.2 Example
Solve the equation Ax = b for

A = [1 1]
    [2 2]

b = (1, 1)ᵀ

Performing row reduction gives us

[1 1] [x1]   [0]
[0 0] [x2] = [1]

The bottom row is a contradiction, 0 = 1, so we conclude that this matrix-vector equation has no solution.
6.10.3 Example
Solve the equation Ax = b for

A = [1 1]
    [2 2]

b = (1, 2)ᵀ

Performing row reduction gives us

[1 1] [x1]   [1]
[0 0] [x2] = [0]

So we conclude that x2 could be any real number and

x1 = 1 − 1 · x2

So the vector solution to this matrix-vector equation is

x = (1 − x2, x2)ᵀ = (1, 0)ᵀ + x2 (−1, 1)ᵀ
6.10.4 Example
Consider the matrix-vector equation

[1 1 1] [x1]   [1]
[0 1 1] [x2] = [2]
[0 0 1] [x3]   [3]

Performing row reduction gives

[1 0 0] [x1]   [−1]
[0 1 0] [x2] = [−1]
[0 0 1] [x3]   [ 3]

The solution vector to the matrix-vector equation is

x = (−1, −1, 3)ᵀ
6.10.5 Example
Consider the matrix-vector equation

[1 1 1] [x1]   [1]
[0 1 1] [x2] = [2]
[0 2 2] [x3]   [4]

Performing row reduction gives

[1 0 0] [x1]   [−1]
[0 1 1] [x2] = [ 2]
[0 0 0] [x3]   [ 0]

We see that x3 could be any real number and

x2 = 2 − x3

and

x1 = −1

Then the solution vector to the matrix-vector equation is

x = (−1, 2 − x3, x3)ᵀ = (−1, 2, 0)ᵀ + x3 (0, −1, 1)ᵀ
6.10.6 Example
Consider the matrix-vector equation

[0 1 1] [x1]   [1]
[0 2 2] [x2] = [2]
[0 3 3] [x3]   [3]

Performing row reduction gives

[0 1 1] [x1]   [1]
[0 0 0] [x2] = [0]
[0 0 0] [x3]   [0]

We see that x1 and x3 could be any real numbers and

x2 = 1 − x3

Then the solution vector to the matrix-vector equation is

x = (x1, 1 − x3, x3)ᵀ = (0, 1, 0)ᵀ + x1 (1, 0, 0)ᵀ + x3 (0, −1, 1)ᵀ
6.11 Exercises
6.11.1 Exercise
Consider the vectors
u = (1, 5)ᵀ

v = (−1, 7)ᵀ

Calculate the vector w = 3u + 7v.
6.11.2 Exercise
Give examples of three vectors in R2 which are of magnitude 1.
6.11.3 Exercise
Give two vectors in R3 which are orthogonal to
u = (1, 1, 0)ᵀ
6.11.4 Exercise
What is the solution to the equation
(1 1 0 0) (x1, x2, x3, x4)ᵀ = 0
Chapter 7
Vector Spaces
7.1 The vector spaces already discussed
7.1.1 Examples of vector spaces
We have discussed some examples of vector spaces
R2, R3, Rn
which were column matrices with all real entries.
7.1.2 Zero vector
All of these vector spaces had a zero vector which is a column vector with all zero entries.
0 = (0, 0, ..., 0)ᵀ
7.1.3 Property of the zero vector
For any vector u ∈ Rn and the zero vector we have
u+ 0 = 0 + u = u
7.1.4 Addition
For Rn addition was defined entry-wise in such a way that for u and v vectors in Rn we have that
u+ v is defined in Rn
7.1.5 Scalar multiplication
A scalar multiplication was defined so that for any α ∈ R and any u ∈ Rn
αu ∈ Rn
7.2 Properties we have seen
From the way we defined addition and scalar multiplication for vectors in Rn we can show:
7.2.1 Properties
1. For u and v vectors in Rn
u+ v
is also a vector in Rn.
2. For u and v vectors in Rn
u+ v = v + u
3. For u, v, and w vectors in Rn
(u+ v) + w = u+ (v + w)
4. For u, in Rn there is a zero vector 0 such that
u+ 0 = u = 0 + u
5. For every vector u in Rn there is a vector −u in Rn such that
u+ (−u) = 0
6. For every vector u in Rn and α in R there is a vector
αu
in Rn.
7. For every u and v in Rn and α a real number
α(u+ v) = αu+ αv
8. For every u in Rn and α and β real numbers
(α + β)u = αu+ βu
9. For every u in Rn and α and β real numbers
α(βu) = (αβ)u
10. For u in Rn
1u = u
7.3 Properties of a vector space
7.3.1 Definition
A set V is said to be a real vector space if the properties that we just recalled for the Rn
vector spaces hold.
7.3.2 Properties
1. For u and v vectors in V
u+ v
is also a vector in V .
2. For u and v vectors in V
u+ v = v + u
3. For u, v, and w vectors in V
(u+ v) + w = u+ (v + w)
4. For u, in V there is a zero vector 0 such that
u+ 0 = u = 0 + u
5. For every vector u in V there is a vector −u in V such that
u+ (−u) = 0
6. For every vector u in V and α in R there is a vector
αu
in V .
7. For every u and v in V and α a real number
α(u+ v) = αu+ αv
8. For every u in V and α and β real numbers
(α + β)u = αu+ βu
9. For every u in V and α and β real numbers
α(βu) = (αβ)u
10. For u in V
1u = u
7.3.3 Example
We checked previously that Rn is a vector space. It has all the properties required of a real vector space.
7.3.4 Example
Consider the set of all polynomial functions of finite degree. If we define addition and scalar multiplication for functions in the usual way then this is a real vector space.
7.4 A longer example
7.4.1 Potential vector space
Consider the set V of all elements of R2 of the form

(0, α)ᵀ

where α could be any real number.
7.4.2 Question
Is this subset of R2 a vector space?
7.4.3 Check the required properties one by one
Checking 1.
Is the sum of two elements of V in V ? Consider

u = (0, α)ᵀ ∈ V

and

v = (0, β)ᵀ ∈ V

The sum

u + v = (0, α + β)ᵀ

is also in V .
Checking 2.
For vectors u and v in V is u+ v = v + u?
Yes, don’t really need to check this since it was already checked for R2.
Checking 3.
For u, v, and w vectors in V is addition associative?
Yes, don’t really need to check this since it was already checked for R2.
Checking 4.
Is the zero vector in V ? Yes. Don't really need to check that adding the zero vector to a vector doesn't change the vector since this is already known for all of R2.
Checking 5.
For every vector u in V there is a vector −u in V such that
u+ (−u) = 0
Consider

u = (0, α)ᵀ ∈ V

Then

−u = (0, −α)ᵀ ∈ V

and u + (−u) = 0.
Checking 6.
If u is in V is αu in V ? Consider

u = (0, b)ᵀ ∈ V

Then

αu = (0, αb)ᵀ ∈ V
Checking 7.
For every u and v in V and α a real number is α(u+ v) = αu+ αv?
Consider

u = (0, a)ᵀ ∈ V

and

v = (0, b)ᵀ ∈ V

Then

α(u + v) = α(0, a + b)ᵀ = (0, αa + αb)ᵀ = (0, αa)ᵀ + (0, αb)ᵀ = α(0, a)ᵀ + α(0, b)ᵀ = αu + αv
Checking 8.
For every u in V and α and β real numbers is it true that (α + β)u = αu+ βu?
Consider

u = (0, a)ᵀ ∈ V

Then for α and β real numbers

(α + β)u = (α + β)(0, a)ᵀ = (0, αa + βa)ᵀ = (0, αa)ᵀ + (0, βa)ᵀ = α(0, a)ᵀ + β(0, a)ᵀ = αu + βu
Checking 9.
For every u in V and α and β real numbers is it true that α(βu) = (αβ)u? Consider

u = (0, a)ᵀ ∈ V

Then for α and β real numbers

α(βu) = α(0, βa)ᵀ = (0, αβa)ᵀ = (αβ)(0, a)ᵀ = (αβ)u
Checking 10.
For u in V is 1u = u? Don't really need to check this since we already know that it is true for all vectors u in R2.
7.5 Subspaces
7.5.1 Definition
If V is a real vector space, and W is a non-empty subset of V which is also itself a vector space, then W is said to be a subspace of V .
7.5.2 Example
The V ⊂ R2 that we just discussed is a subspace of R2.
7.5.3 Theorem - Checking if a subset is a subspace
If W is a non-empty subset of vectors of a vector space V then W is a subspace of V if and only if
a) u, v ∈ W implies u+ v ∈ W
b) α ∈ R and u ∈ W implies αu ∈ W .
7.5.4 Proof
Exercise.
7.5.5 Example
Consider the set of vectors W ⊂ R2 which consists of all vectors whose first and second elements are the same. A typical element looks like

u = (α, α)ᵀ ∈ W

This is a non-empty subset of R2. If we add two typical elements

u = (α, α)ᵀ ∈ W

v = (β, β)ᵀ ∈ W

we get

u + v = (α + β, α + β)ᵀ

which has first and second elements the same, and so the sum of two elements of W is in W .

Next, consider

u = (α, α)ᵀ ∈ W

and a scalar k. We get

ku = (kα, kα)ᵀ

which has equal first and second entries. Then the scalar multiple of an element of W is in W .
Both required conditions hold and we conclude that W is a subspace of R2.
7.5.6 Example
Suppose that V is a real vector space. Consider the subset

W = {0}

which consists only of the zero vector. Then for two vectors u and v in W :

u + v = 0 + 0 = 0 ∈ W

For any real number α and any vector u in W :

αu = α0 = 0 ∈ W

Then the required conditions have been checked and we have found a trivial subspace of every vector space.
7.5.7 Example
Consider the vector space R2. Consider the subset
W = { (1, 1)ᵀ }

Is this subset W a subspace of R2?

Consider for an element u in W that

0u = 0

which is not in W . So some scalar multiples of elements of W are not in W and so W is not a subspace.
7.6 Subspaces and solutions of systems of equations
7.6.1 Theorem
If Am×n x = 0 is a system of m linear equations in n unknowns then the set of solutions is a subspace of Rn.
7.6.2 Proof
Let W be the set of all solutions of the equation Ax = 0. Let u and v be elements of W . As they are solutions of Ax = 0 we have Au = 0 and Av = 0.
Does u+ v belong to W?
A(u+ v) = Au+ Av = 0 + 0 = 0
So if u is a solution and v is a solution then the sum u+ v is a solution.
Suppose that u is a solution and α is a real number. Then is αu a solution?
A(αu) = α(Au) = α0 = 0
So if u is in W then αu is in W .
Then both conditions have been checked and W , the set of all solutions to the equation Ax = 0, is a subspace of Rn.
7.6.4 Example
Consider the equation

[1 −2 3] [x1]   [0]
[2 −4 6] [x2] = [0]
[3 −6 9] [x3]   [0]

If we form the augmented matrix and do row reduction we get

[1 −2 3 | 0]
[0  0 0 | 0]
[0  0 0 | 0]

This gives us that x2 and x3 can be any real numbers (call them s and t) and

x1 = 2x2 − 3x3

Then the solution vector is

x = (x1, x2, x3)ᵀ = (2x2 − 3x3, x2, x3)ᵀ = (2s − 3t, s, t)ᵀ = s(2, 1, 0)ᵀ + t(−3, 0, 1)ᵀ

with s and t in R.
According to the theorem, this set of solutions is in fact a subspace of R3.
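We can spot-check that every vector of this form really does solve the homogeneous equation. A Python sketch:

```python
def matvec(A, x):
    # multiply a matrix (stored as a list of rows) by a vector
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

A = [[1, -2, 3],
     [2, -4, 6],
     [3, -6, 9]]

# the general solution found above: x = s(2, 1, 0) + t(-3, 0, 1)
for s, t in [(1, 0), (0, 1), (2.5, -4.0)]:
    x = [2 * s - 3 * t, s, t]
    print(matvec(A, x))  # [0, 0, 0] each time (as floats in the last case)
```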
7.7 Linear combinations
7.7.1 Linear combination of vectors
A vector v in a real vector space V is said to be a linear combination of vectors v1, v2, ... , vn if there are real numbers α1, α2, ... , αn such that
v = α1v1 + α2v2 + · · ·+ αnvn
7.7.2 Example
In R2 the vector
v = (α, β)ᵀ

can be written as a linear combination of the vectors

e1 = (1, 0)ᵀ

e2 = (0, 1)ᵀ

by

v = αe1 + βe2
7.7.3 Example
In R2 the vector

v = (α, β)ᵀ

can be written as a linear combination of the vectors

v1 = (1, −1)ᵀ

v2 = (1, 1)ᵀ

by

v = ((α − β)/2) v1 + ((α + β)/2) v2
7.7.4 Example
In R3 all vectors can be written as linear combinations of the vectors
e1 = (1, 0, 0)ᵀ

e2 = (0, 1, 0)ᵀ

e3 = (0, 0, 1)ᵀ
7.7.5 Example
In Rn all vectors can be written as linear combinations of the vectors
e1 = (1, 0, ..., 0)ᵀ

e2 = (0, 1, ..., 0)ᵀ

...

en = (0, 0, ..., 1)ᵀ
7.7.6 Theorem - linear combinations form a vector space
Suppose v1, v2, ... , vk are vectors in a vector space V .
a) Then the set W of all linear combinations of v1, v2, ... , vk is a subspace of V .
b) It is the smallest subspace that contains all the vi.
7.7.7 Proof of a)
Suppose that p and q are elements of W . Then p and q are both linear combinations of the vi:
p = α1v1 + α2v2 + · · ·+ αkvk
q = β1v1 + β2v2 + · · ·+ βkvk
The sum is
p+ q = (α1 + β1)v1 + (α2 + β2)v2 + · · ·+ (αk + βk)vk
which is a linear combination of the vi and so is in W . So the sum of two vectors in W is also in W .
Now consider a scalar multiple of p:
γp = (γα1)v1 + (γα2)v2 + · · ·+ (γαk)vk
And then this is also a linear combination of the vi and so is in W . So a scalar multiple of a vector in W is in W . The two required conditions have been checked and so W is a subspace.
7.7.8 Proof of b)
Suppose a subspace contains vectors v1, v2 , ... , vk.
One of the conditions for a subspace is that all scalar multiples of vectors in the subspace must also be in the subspace. Then for all real numbers α1, α2, ... , αk we must have that α1v1, α2v2, ... , αkvk are also in the subspace.
The other condition for a subspace is that the sum of any two vectors in the subspace must also be in the subspace. Then
α1v1 + α2v2
is in the subspace. Then
(α1v1 + α2v2) + α3v3
is in the subspace. And so on till
α1v1 + α2v2 + · · ·+ αkvk
is also in the subspace.
Thus v1, v2, ... , vk in a subspace implies that all linear combinations of the vi are in the subspace.
7.7.9 Span
Suppose a vector space V contains vectors v1, v2, ... , vk. The subspace W consisting of all the linear combinations of the vi is called the space spanned by v1, v2, ... , vk, and v1, v2, ... , vk are said to span W .
7.7.10 Example
e1, e2, and e3 span R3.
7.7.11 Example
The vectors

(1, 0, 0)ᵀ, (1, 1, 0)ᵀ, (1, 1, 1)ᵀ

span R3.
7.7.12 Example
The vectors

(3, 2, 1)ᵀ, (2, 1, 0)ᵀ, (1, 1, 1)ᵀ

do NOT span R3.
7.8 Linear independence and dependence
7.8.1 Linear independence
Suppose that v1, v2, ... , vk are vectors in a real vector space V . The vi are said to be linearly dependent if we can find real numbers α1, α2, ... , αk, not all zero, so that
α1v1 + α2v2 + · · ·+ αkvk = 0
Otherwise the set is said to be linearly independent.
7.8.2 Alternatively
Another way of putting this is that the vi are linearly independent if
α1v1 + α2v2 + · · ·+ αkvk = 0
only when all the αi = 0. Then if the vectors are not linearly independent they are said to be linearly dependent.
7.8.3 Example
The vectors e1 and e2 in R2 are linearly independent because the only way to get
αe1 + βe2 = 0
is for both α and β to be zero.
7.8.4 Example
Are the vectors

u = (2, 1, 2)ᵀ

v = (2, 3, 6)ᵀ

w = (2, 2, 4)ᵀ

linearly independent?

We need to check whether or not the equation

xu + yv + zw = 0

has only the trivial solution x = y = z = 0 or whether there are other solutions. Let's write this out. We want to check if there is a solution other than x = y = z = 0 for

x(2, 1, 2)ᵀ + y(2, 3, 6)ᵀ + z(2, 2, 4)ᵀ = (0, 0, 0)ᵀ

We can rewrite this as

[2 2 2] [x]   [0]
[1 3 2] [y] = [0]
[2 6 4] [z]   [0]

If we form the augmented matrix then we get the rref

[1 0 1/2 | 0]
[0 1 1/2 | 0]
[0 0  0  | 0]

and so we get infinitely many solutions and conclude that the vectors considered are linearly dependent.
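The free variable z in the rref gives an explicit dependence: x = −z/2 and y = −z/2, so choosing z = 2 gives −u − v + 2w = 0, i.e. w = (1/2)u + (1/2)v. A quick Python check:

```python
u = [2, 1, 2]
v = [2, 3, 6]
w = [2, 2, 4]

# the rref says x = -z/2, y = -z/2 with z free; choosing z = 2 gives
# the nontrivial relation  -u - v + 2w = 0,  i.e.  w = (1/2)u + (1/2)v
combo = [0.5 * a + 0.5 * b for a, b in zip(u, v)]
print(combo)       # [2.0, 2.0, 4.0]
print(combo == w)  # True, so the three vectors are linearly dependent
```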
7.9 Back to the FTLA
7.9.1 Recall
The following are equivalent:

a) An×n is invertible.
b) The equation Ax = 0 has only the trivial solution x = 0.
c) The reduced row echelon form of A is I.
d) A is a product of elementary matrices.
e) Ax = b is consistent for all n× 1 matrices b.
f) |A| ≠ 0.
7.9.2 Add to the FTLA
g) The columns of A considered as vectors in Rn are linearly independent.
7.9.3 Proof
Suppose that A is invertible. Then the only solution to Ax = 0 is x = 0. Then the only linear combination of the columns of A that gives zero is one where the scalar coefficients of the sum are zero. Then the columns are linearly independent.
If the columns are linearly independent then the only linear combination that gives zero is the one with all scalar coefficients zero. Then Ax = 0 has only the trivial solution, and so by the FTLA A is invertible.
7.10 Dimension
7.10.1 Example
Consider the vectors
u = (2, 0)ᵀ

v = (1, 1)ᵀ

w = (−1, 1)ᵀ
in R2. We can consider the vector space W which is spanned by all of these vectors. The elements of W look like
p = αu+ βv + γw
where α, β and γ can be any real numbers. But the set of vectors {u, v, w} is not linearly independent, since
u− v + w = 0
So we can write
w = − u+ v
Then vectors in the space W can be written
p = αu+ βv + γw = αu+ βv + γ(−u+ v) = α′u+ β′v
We can write every vector in W as a linear combination of u, v, and w, but we can also just write the vector as a linear combination of u and v.
It is helpful to use as few vectors as are actually needed to span the space that one is interested in.
7.10.2 General idea
Suppose that we are interested in a space W that we know is spanned by a set of vectors
B = {v1, v2, . . . , vm}
If the set of vectors is linearly independent, then none of the vectors can be written as a linear combination of the other vectors. If the set is linearly dependent, then some vector can be written as a linear combination of the others. Say, vm. We can drop vm from the set and just use
B′ = {v1, v2, . . . , vm−1}
We keep going till we get a linearly independent set.
7.10.3 Basis
Such a minimal set is called a basis.
7.10.4 Minimality of the basis
For a given vector space W with two bases, the number of vectors in both bases is the same.
7.10.5 Example
The vectors
e1 = (1, 0)ᵀ

e2 = (0, 1)ᵀ

are a basis for R2. So is
B = {e1 , e1 + e2}
Both bases have two vectors.
7.10.6 Dimension
The dimension of a vector space V is the number of vectors in any basis for V .
7.10.7 Example
Rn has dimension n.
7.10.8 Example
The space of polynomials on R is of infinite dimension.
7.11 Exercises
7.11.1 Exercise
Show that the set of polynomial functions is a vector space.
7.11.2 Exercise
Give an example of a one-dimensional subspace in R3.
7.11.3 Exercise
Give an example of a two-dimensional subspace in R3.
7.11.4 Exercise
The standard basis for R3 is
B = {e1, e2, e3}
where
e1 = (1, 0, 0)ᵀ

e2 = (0, 1, 0)ᵀ

e3 = (0, 0, 1)ᵀ
Find another basis for R3 in which all the vectors have length 1.
7.11.5 Exercise
Can a set of five vectors in R4 be linearly independent?
Chapter 8
Euclidean vector spaces
8.1 Euclidean vector spaces
8.1.1 Definition
The vector spaces Rn with the inner/scalar/dot product that we have already discussed are called Euclidean vector spaces.
8.1.2 Review
If you don’t recall the dot product and its properties you should review them at this time.
8.2 Dot product and matrix multiplication
8.2.1 Transpose and dot product
Recall that for u and v in Rn we defined
u · v = u1v1 + u2v2 + · · · + unvn

Note that

u · v = vᵀu
by the rules of matrix multiplication.
8.2.2 Example
Consider the vectors
u = (1, 1, 1)ᵀ

and

v = (1, 2, 3)ᵀ

The dot product can be calculated by

u · v = vᵀu = (1 2 3)(1, 1, 1)ᵀ = 6
8.2.3 Matrix transpose and dot product
By the previous rule, for a square matrix An×n and vectors u and v in Rn
Au · v = vᵀ(Au) = (vᵀA)u = (Aᵀv)ᵀu = u · Aᵀv

and

u · Av = (Av)ᵀu = (vᵀAᵀ)u = vᵀ(Aᵀu) = Aᵀu · v
8.2.4 Example
Consider the vectors

u = (1, 1)ᵀ

and

v = (2, 5)ᵀ

and the matrix

A = [1 2]
    [3 4]

Then

Au · v = u · Aᵀv = (1, 1)ᵀ · (17, 24)ᵀ = 17 + 24 = 41
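The identity Au · v = u · Aᵀv can be confirmed on these numbers with a short Python sketch:

```python
def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2],
     [3, 4]]
u = [1, 1]
v = [2, 5]

lhs = dot(matvec(A, u), v)             # Au . v
rhs = dot(u, matvec(transpose(A), v))  # u . (A^T)v
print(lhs, rhs)  # 41 41
```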
8.2.5 Matrix multiplication and vectors
One way of looking at the rule for matrix multiplication is to think of the first matrix Am×k as being made up of row vectors

R1, R2, ..., Rm

and the second matrix Bk×n as being made up of columns

C1, C2, ..., Cn

and then the product looks like

(AB)m×n = [R1C1 R1C2 · · · R1Cn]
          [R2C1 R2C2 · · · R2Cn]
          [ ...            ... ]
          [RmC1 RmC2 · · · RmCn]
8.3 Functions/maps
8.3.1 Domain of functions
Suppose that f : U → V is a function that maps elements of the set U to elements of the set V . We call U the domain of f .
8.3.2 Example
Suppose that a function f : (0, 1)→ R is defined by
f(x) = 1/√(1 − x²)
The domain of this function is (0, 1).
8.3.3 Example
Suppose that a function f : R2 → R is defined by
f((x, y)ᵀ) = x² + y²
The domain of this function is R2.
8.3.4 Range of a function
Suppose that f : U → V is a function that maps elements of the set U to elements of the set V . We call the subset of V of all the elements that f maps to the range of f .
8.3.5 Example
Suppose that a function f : R2 → R is defined by
f((x, y)ᵀ) = x² + y²
The domain of this function is R2. The range is [0,∞).
8.3.6 Example
Suppose that a function f : U → R3 is defined by
U = { (x, y)ᵀ | x² + y² ≤ 1 } ⊂ R2

f((x, y)ᵀ) = (x, y, x² + y²)ᵀ

The domain of this function is the unit disk U and the range can be graphed as a paraboloid in 3-space.
8.4 Linear transformation
8.4.1 Linearity
Suppose that there is a function L : U → V that maps between real vector spaces U and V . Then L is said to be a linear transformation if the following conditions hold:
For any vectors x and y in U
L(x+ y) = L(x) + L(y)
and for α a real scalar
L(αx) = αL(x)
8.4.2 Question
What do these two conditions remind you of?
8.4.3 Exercise
Show that the range of L is a subspace of V .
8.4.4 Exercise
Consider a function f : R2 → R2 defined by

f((x1, x2)ᵀ) = (2x1 + x2, 3x1 + 3x2)ᵀ

Confirm that

f((1, 1)ᵀ) = (3, 6)ᵀ

Show that this f is a linear transformation by checking the two requirements for a linear transformation.
8.5 Example
8.5.1 Transformation
Suppose that we have a linear transformation L : R2 → R3 defined by

L((x1, x2)ᵀ) = (x1 + x2, x1 − x2, x1)ᵀ
8.5.2 Linearity
We confirm that the transformation L is linear because for vectors u and v in R2 and scalar α:
L(u+ v) = Lu+ Lv
L(αu) = αLu
8.5.3 First condition for linearity
L((x1, x2)ᵀ + (y1, y2)ᵀ) = L((x1 + y1, x2 + y2)ᵀ)
  = ((x1 + y1) + (x2 + y2), (x1 + y1) − (x2 + y2), x1 + y1)ᵀ
  = (x1 + x2, x1 − x2, x1)ᵀ + (y1 + y2, y1 − y2, y1)ᵀ
  = L((x1, x2)ᵀ) + L((y1, y2)ᵀ)
8.5.4 Second condition for linearity
L(α(x1, x2)ᵀ) = L((αx1, αx2)ᵀ)
  = (αx1 + αx2, αx1 − αx2, αx1)ᵀ
  = α(x1 + x2, x1 − x2, x1)ᵀ
  = αL((x1, x2)ᵀ)
8.5.5 Associated matrix
Note that

L((x, y)ᵀ) = [1  1] [x]
             [1 −1] [y]
             [1  0]

So we can calculate the result of the linear transformation by doing a matrix-vector multiplication.
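We can check that the matrix reproduces L on a sample input. A Python sketch:

```python
def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def L(x, y):
    # the transformation from the example: L(x, y) = (x + y, x - y, x)
    return [x + y, x - y, x]

A = [[1, 1],
     [1, -1],
     [1, 0]]

print(matvec(A, [2, 5]))  # [7, -3, 2]
print(L(2, 5))            # [7, -3, 2] -- the same result
```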
8.6 General linear transformation
8.6.1 Transformation
Suppose L : Rn → Rm is defined by

L((x1, x2, ..., xn)ᵀ) = (a11x1 + a12x2 + · · · + a1nxn,
                        a21x1 + a22x2 + · · · + a2nxn,
                        ...,
                        am1x1 + am2x2 + · · · + amnxn)ᵀ

We can make a connection between the linear transformation L and the matrix A

A = [a11 a12 · · · a1n]
    [a21 a22 · · · a2n]
    [ ...           ...]
    [am1 am2 · · · amn]

which yields the same result as L:

L(x) = Ax
8.6.2 Standard matrix
A is often called the standard matrix of L.
8.7 Some linear transformations
8.7.1 Zero transformation
This is the transformation that maps every vector in Rn to the zero vector in Rm.
8.7.2 Exercise
What is the standard matrix for this linear transformation?
8.7.3 Identity operator
The identity operator L : Rn → Rn is defined by
L(x) = x
8.7.4 Exercise
What is the standard matrix for the identity operator?
8.7.5 Reflection operators
2D reflection in the y axis
Consider the operator L : R2 → R2 defined by
L(x) = (−x1, x2)ᵀ
Example
L((1, 1)ᵀ) = (−1, 1)ᵀ
Exercise
What is the standard matrix for this reflection operator?
2D reflection about x axis
Consider the operator L : R2 → R2 defined by
L(x) = (x1, −x2)ᵀ
Example
L((2, 6)ᵀ) = (2, −6)ᵀ
Exercise
What is the standard matrix for this operator?
8.7.6 Generally
Operators that reflect vectors in lines in 2-space and 3-space are called reflection operators.
8.7.7 Projection operators
In lower dimensions
In lower dimensions the projection operators are
L((x, y)ᵀ) = (x, 0)ᵀ

L((x, y)ᵀ) = (0, y)ᵀ

L((x, y, z)ᵀ) = (x, y, 0)ᵀ

L((x, y, z)ᵀ) = (x, 0, z)ᵀ

L((x, y, z)ᵀ) = (0, y, z)ᵀ
Example
The projection of the vector

v = (2, 4, 7)ᵀ

onto the yz-plane is

Lv = (0, 4, 7)ᵀ
Exercise
What are the standard matrices for these operators?
8.8 Rotation operators
8.8.1 Rotations in 2-space
Suppose that one identifies the vector

r = (x, y)ᵀ

with the point (x, y) in the Cartesian plane, or with a directed line segment running from the origin to the point. Suppose that we want to keep the tail of the line segment at the origin and rotate the tip (without changing the magnitude of the line segment) through an angle θ in the counterclockwise direction.

Where does the tip of the line segment end up?
8.8.2 Rotation operators in 2-space
Rotations counterclockwise through an angle θ are defined by
Rθ((x, y)ᵀ) = (x cos θ − y sin θ, x sin θ + y cos θ)ᵀ
8.8.3 Example
Let
r = (1, 0)ᵀ

Then

Rπ/4(r) = (cos π/4, sin π/4)ᵀ = (1/√2, 1/√2)ᵀ
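A Python sketch of the rotation formula, applied to this example:

```python
import math

def rotate(x, y, theta):
    # counterclockwise rotation through the angle theta
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

x, y = rotate(1.0, 0.0, math.pi / 4)
print(x, y)  # both approximately 1/sqrt(2), about 0.70711

# the rotation does not change the distance from the origin
assert abs(math.hypot(x, y) - 1.0) < 1e-12
```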
8.8.4 Exercise
What is the standard matrix for a rotation operator in 2-space?
8.8.5 Exercise
Show that this operator does not change the distance from the origin. That is
‖r‖ = ‖Rθ(r)‖
8.8.6 Rotation operators in 3-space
Rotation about positive x axis
L((x, y, z)ᵀ) = (x, y cos θ − z sin θ, y sin θ + z cos θ)ᵀ

Rotation about positive y axis

L((x, y, z)ᵀ) = (x cos θ + z sin θ, y, −x sin θ + z cos θ)ᵀ

Rotation about positive z axis

L((x, y, z)ᵀ) = (x cos θ − y sin θ, x sin θ + y cos θ, z)ᵀ
8.8.7 Exercise
What are the standard matrices for the rotations about the axes in 3-space?
8.9 Dilation/contraction operators
8.9.1 Idea
We think of a vector

r = (x, y)ᵀ

as lying in the Cartesian plane between the origin and the point (x, y). We then consider stretching or contracting the vector but leaving its direction unchanged.
8.9.2 Example
Consider the vector
r =
(23
)The vector that has double the magnitude and points in the same direction as r is
2r =
(46
)
8.9.3 Operator
The operator that contracts or dilates by a factor k > 0 without changing the direction of the vector is

Lr = (kx, ky)ᵀ

in 2-space and

Lr = (kx, ky, kz)ᵀ

in 3-space.
8.9.4 Exercise
What is the standard matrix for this operator?
8.9.5 Exercise
What is the extension to n-space?
8.10 Composition of linear transformations
8.10.1 Composition of linear transformations
Suppose that there is a linear transformation P : Rn → Rk and another linear transformation Q : Rk → Rm. The composition is written
Q ◦ P
and is defined by
(Q ◦ P )(x) = Q(P (x))
8.10.2 The composition of linear functions is linear
Proof.
Q(P (x+ y)) = Q(P (x) + P (y)) = Q(P (x)) +Q(P (y))
Q(P (αx)) = Q(αP (x)) = αQ(P (x))
8.10.3 Exercise
What is the standard matrix for a composition of linear functions?
8.10.4 Exercise
Show that composition is not commutative.
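A Python sketch illustrating that composing the transformations corresponds to multiplying their matrices, and that the order matters. The matrices P and Q below are our own illustrative choices (a 90° rotation and a projection onto the x axis):

```python
def matmul(A, B):
    # matrix product: entry (i, j) is (row i of A) . (column j of B)
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

P = [[0, -1],   # rotation by 90 degrees counterclockwise
     [1, 0]]
Q = [[1, 0],    # projection onto the x axis
     [0, 0]]

print(matmul(Q, P))  # [[0, -1], [0, 0]]
print(matmul(P, Q))  # [[0, 0], [1, 0]] -- different, so the order matters
```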
Chapter 9
Linear transformations of Euclidean spaces
9.1 Recall
9.1.1 Linearity
The map L : Rn → Rm is said to be a linear transformation if for any u and v in the domain and α a real number we have
L(u+ v) = L(u) + L(v)
L(αu) = αL(u)
9.1.2 Significance
The range of a linear transformation is a vector space.
9.1.3 Standard matrix
A linear transformation L : Rn → Rm has a matrix Am×n associated with it such that for u in the domain of L
L(u) = Au
9.2 One-to-one functions
9.2.1 One-to-one
A function f : U → V is said to be one-to-one (injective) if for x and y in U
f(x) = f(y) =⇒ x = y
9.2.2 Example
Consider the function f : [−1, 1]→ R defined by
f(x) = x2
This function is NOT one-to-one (injective) since f(x) = f(y) does not necessarily imply that x = y. For example, if f(x) = 1 = f(y) it could be that x = 1 and y = −1.
9.2.3 Example
Consider the function f : [0, 1]→ R defined by
f(x) = x2
This function is one-to-one (injective) since f(x) = f(y) implies that x² = y², and x = y since all the possible values of x and y are non-negative.
9.3 Onto functions
9.3.1 Onto functions
A function f : U → V is said to be onto V (surjective) if for any y in V there is an x in U so that
y = f(x)
9.3.2 Example
Consider the function f : (−1, 1)→ R defined by
f(x) = x2
This function is not onto R because 12 is in R and there is no x in (−1, 1) so that x² = 12.
9.3.3 Example
Consider the function f : [0, 1]→ [0, 1] defined by
f(x) = x2
This function is onto [0, 1].
9.3.4 Exercise
Consider the identity operator I : R3 → R3. Show that this function is both one-to-one and onto.
9.4 Invertible functions
9.4.1 Condition
A function f : U → V has a well-defined inverse if f is both one-to-one and onto (injective and surjective). That is because for every v in V there is a u in U so that f(u) = v (and only one such u), so it makes sense to define the inverse function by
f−1(v) = u
9.5 FTLA
9.5.1 Recall
We had that the following are equivalent for a square matrix An×n:
a) A is invertible.
b) The equation Ax = 0 has only the trivial solution x = 0.
c) The rref of A is I.
d) A is a product of elementary matrices.
e) Ax = b is consistent for all n× 1 matrices b.
f) |A| ≠ 0.
g) The columns of A considered as vectors in Rn are linearly independent.
9.5.2 Linear transformation and standard matrix
Recall that a linear transformation L : Rn → Rn has a square matrix associated with it, and every square matrix can be thought of as the standard matrix for a linear transformation L : Rn → Rn.
What does the invertibility of A tell us about L and how does it relate to the FTLA?
9.5.3 Theorem
Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If A is invertible then the range of L is Rn.
9.5.4 Proof
Suppose that A is invertible. We want to show that for any y in Rn there is an x in Rn so that
Ax = y
Then
x = A−1y
gives us the desired x for any given y. Thus L is onto Rn.
9.5.5 Theorem
Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If the range of L is Rn then A is invertible.
9.5.6 Proof
Consider the n standard basis vectors for Rn. The ith one is ei, which has all zeros except for a 1 in the ith entry. As L is onto, for each ei there is some vector xi so that
Axi = ei
Now consider the matrix B whose columns are the xi:
B = ( x1 x2 · · · xn−1 xn )
Then
AB = A( x1 x2 · · · xn−1 xn ) = ( e1 e2 · · · en−1 en ) = I
and so B is the inverse of A.
9.5.7 Theorem
Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If A is invertible then L is one-to-one.
9.5.8 Proof
If x and y are vectors in Rn and
L(x) = L(y)
then
Ax = Ay
and multiplying by the inverse of A on both sides gives
x = y
Then
L(x) = L(y) =⇒ x = y
9.5.9 Theorem
Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If L is one-to-one then A is invertible.
9.5.10 Proof
Suppose that L is one-to-one. Then
L(x) = L(0) =⇒ x = 0

and since L(x) = Ax and L(0) = 0, this says

Ax = 0 =⇒ x = 0
So the equation Ax = 0 has only the trivial solution and A must be invertible by the FTLA.
9.5.11 Additions to the FTLA
h) The range of the linear transformation which is multiplication by A is Rn.
i) The linear transformation which is multiplication by A is one-to-one.
9.5.12 Example
A projection operator P : R2 → R2 which projects onto the x-axis has standard matrix

A = [1 0]
    [0 0]

A is not invertible, so this operator P is not one-to-one and the range of P is not R2.
9.6 Inverse of matrix and inverse of function
9.6.1 Recall
If An×n is invertible then the operator L : Rn → Rn which is multiplication by A is one-to-one and onto, and so L has an inverse.
9.6.2 Standard matrix for the inverse linear transformation
The standard matrix for the inverse linear transformation is A−1.
9.7 Standard matrix for a linear transformation
9.7.1 Question
Given a linear transformation L : Rn → Rm how do we find its standard matrix?
9.7.2 Consider how linear transformation acts on basis vectors
Suppose that L acts on the basis vectors of Rn as

L(ei) = xi

Once we have that, we know what happens to any vector in the domain, since it is a linear combination of the ei:

L(Σi αi ei) = Σi αi xi

Now construct Am×n as the matrix whose columns are the xi:

A = ( x1 x2 · · · xn )

Note that

Aei = xi = L(ei)

so multiplication by A has the same result as L on the basis vectors, and so the same is true for any vector in the domain of L. Then A is the standard matrix for L.
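This recipe, apply L to each standard basis vector and use the images as columns, translates directly into code. A Python sketch (standard_matrix is our own helper name, and the operator used to demonstrate it is an illustrative choice):

```python
def standard_matrix(L, n):
    # apply L to each standard basis vector of R^n; the images are the
    # columns of the standard matrix
    cols = []
    for i in range(n):
        e = [0] * n
        e[i] = 1
        cols.append(L(e))
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

# an illustrative operator: L(x, y) = (x + y, x - y)
L = lambda v: [v[0] + v[1], v[0] - v[1]]
print(standard_matrix(L, 2))  # [[1, 1], [1, -1]]
```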
9.7.3 Example
Consider the linear operator L : R2 → R2 which is defined by

L((x, y)ᵀ) = (x + y, x − y)ᵀ

Then

L((1, 0)ᵀ) = (1 + 0, 1 − 0)ᵀ = (1, 1)ᵀ

and

L((0, 1)ᵀ) = (0 + 1, 0 − 1)ᵀ = (1, −1)ᵀ

Then the standard matrix for L is

A = [1  1]
    [1 −1]
Chapter 10
Least squares
10.1 Orthogonal bases
10.1.1 Given a subspace
Suppose that W is an m-dimensional subspace of Rn with a basis
S = { w1, w2, ..., wm }

Every vector in W can be expressed as a linear combination of the basis vectors:

w = α1w1 + α2w2 + · · · + αmwm
What are the coefficients αi?
10.1.2 Orthogonal basis
A basis is said to be orthogonal if each basis vector is orthogonal to all of the other basis vectors.
10.1.3 Example
The standard basis for Rn is an example of an orthogonal basis.
10.1.4 Linear combinations with an orthogonal basis
Suppose that a subspace W of a Euclidean vector space has an orthogonal basis. Then any vector w in W can be written as a linear combination of the basis vectors. Now if we take the dot product with the ith basis vector, all the cross terms vanish and

w · wi = αi (wi · wi)

and so

αi = (w · wi)/(wi · wi)

Then any vector in W can be written

w = ((w · w1)/(w1 · w1)) w1 + ((w · w2)/(w2 · w2)) w2 + · · · + ((w · wm)/(wm · wm)) wm
10.2 Example
10.2.1 Problem
Consider the subspace W spanned by the vectors
p = ( 1, 0, 0 )^t

and

q = ( 1, 1, 1 )^t
Any vector in W can be written as a linear combination of these vectors
w = αp+ βq
Note that
p · q = 1 ≠ 0
10.2.2 Finding an orthogonal basis
We will find a new orthogonal basis
O = { r, s }
for W . Let
r = p
be the first basis vector. Now let’s find a new basis vector in W
s = q + αr
that is orthogonal to r.
r · s = r · q + αr · r = 0
Then
α = − (r · q)/(r · r)
Then the new orthogonal basis vectors for W are
r = p = ( 1, 0, 0 )^t

and

s = q + αr = q − ((r · q)/(r · r)) r = ( 1, 1, 1 )^t − (1/1)( 1, 0, 0 )^t = ( 0, 1, 1 )^t
10.3 Gram-Schmidt orthogonalization process
10.3.1 Problem
Given a basis
S = { w1, w2, . . . , wm}
of a subspace W of Rn we would like to find an orthogonal basis
O = { u1, u2, . . . , um}
for W .
10.3.2 Gram-Schmidt orthogonalization process
1) Let

u1 = w1

2) Let

u2 = w2 − ((w2 · u1)/(u1 · u1)) u1

3) Let

u3 = w3 − ((w3 · u1)/(u1 · u1)) u1 − ((w3 · u2)/(u2 · u2)) u2

...

m) Let

um = wm − ((wm · u1)/(u1 · u1)) u1 − ((wm · u2)/(u2 · u2)) u2 − · · · − ((wm · um−1)/(um−1 · um−1)) um−1
This results in an orthogonal basis for W .
10.3.3 Example again using Gram-Schmidt
Consider the subspace W of R3 spanned by the vectors
w1 = ( 1, 0, 0 )^t

and

w2 = ( 1, 1, 1 )^t
These vectors are not orthogonal. We will use the Gram-Schmidt process to create an orthogonal basis for the same subspace.
10.3.4 Applying Gram-Schmidt
Let
u1 = w1 = ( 1, 0, 0 )^t

u2 = w2 − ((w2 · u1)/(u1 · u1)) u1 = ( 1, 1, 1 )^t − (1/1)( 1, 0, 0 )^t = ( 0, 1, 1 )^t
Then an orthogonal basis for the subspace W is
O = { u1 = ( 1, 0, 0 )^t , u2 = ( 0, 1, 1 )^t }
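The process can be sketched in Python with NumPy (the helper `gram_schmidt` is my own, not from the notes), applied to the example above:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of linearly independent vectors."""
    ortho = []
    for w in vectors:
        u = np.array(w, dtype=float)
        for ui in ortho:
            # subtract the projection of w onto each earlier u_i
            u -= (np.dot(w, ui) / np.dot(ui, ui)) * ui
        ortho.append(u)
    return ortho

u1, u2 = gram_schmidt([np.array([1., 0., 0.]), np.array([1., 1., 1.])])
```

The result matches the hand computation: u1 = (1, 0, 0) and u2 = (0, 1, 1), and their dot product is zero.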
10.4 Using a basis to approximate a vector
10.4.1 A vector as a linear combination of basis vectors
Suppose that we have a vector v ∈ R3
v = ( 1, 2, 5 )^t
We can construct this vector ( and every other vector in R3 ) as a linear combination of the standard basis vectors
v = 1e1 + 2e2 + 5e3
10.4.2 Question
Can the vector v be constructed as a linear combination of only two of the standard basis vectors, e1 and e2? No.
10.4.3 What is the best that you could do?
Suppose that you wanted to find a vector
w∗ = α1e1 + α2e2
in the span of e1 and e2 closest ( in some sense ) to v.
Note that the basis vectors for this span are already orthogonal.
The distance squared between v and w∗ is given by
‖w∗ − v‖^2 = (α1 − 1)^2 + (α2 − 2)^2 + (0− 5)^2
Differentiating with respect to α1 and α2 and setting the partial derivatives to zero gives
2(α1 − 1) = 0
2(α2 − 2) = 0
implying that the closest vector to v in the span of e1 and e2 is
1e1 + 2e2 = ( 1, 2, 0 )^t
10.4.4 What about a different span of vectors?
Suppose that we have vectors
w1 = ( 1, 0, 0 )^t

w2 = ( 1, 1, 1 )^t
and we want the closest vector w∗ to v.
We previously saw that the Gram-Schmidt orthogonalization process gives new basis vectors for the same subspace W :
u1 = ( 1, 0, 0 )^t

u2 = ( 0, 1, 1 )^t
Now we express the approximation w∗ for the vector v in terms of the new basis vectors
w∗ = α1u1 + α2u2 = ( α1, α2, α2 )^t
The distance squared to v is
‖w∗ − v‖^2 = (α1 − 1)^2 + (α2 − 2)^2 + (α2 − 5)^2
If we seek to minimize this distance then we get two equations from the differentiation
2(α1 − 1) = 0
2(α2 − 2) + 2(α2 − 5) = 0
and a bit of algebra gives
α1 = 1, α2 = 7/2
Then the closest vector in the span of w1 and w2 to the vector v is
w∗ = 1u1 + (7/2) u2 = ( 1, 7/2, 7/2 )^t
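The same α1 = 1 and α2 = 7/2 come out of the orthogonal-basis coefficient formula from section 10.1, avoiding the calculus entirely; a short numerical sketch (illustration only):

```python
import numpy as np

v = np.array([1., 2., 5.])
u1 = np.array([1., 0., 0.])
u2 = np.array([0., 1., 1.])

# alpha_i = (v . u_i) / (u_i . u_i), valid because u1 and u2 are orthogonal
a1 = np.dot(v, u1) / np.dot(u1, u1)
a2 = np.dot(v, u2) / np.dot(u2, u2)
w_star = a1 * u1 + a2 * u2
```

This reproduces w∗ = (1, 7/2, 7/2), matching the minimization above.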
10.5 Geometric viewpoint
10.5.1 Geometric form of problem
We have a vector in R3
v = ( 1, 2, 5 )^t
which we identify with the point (1, 2, 5) in Cartesian 3-space.
We also have the span of the vectors
w1 = ( 1, 0, 0 )^t

and

w2 = ( 1, 1, 1 )^t
which we view geometrically as a plane in Cartesian 3-space. We want to project v onto this plane.
Life is a little easier when working with an orthogonal basis, so we use the orthogonal basis vectors
u1 = ( 1, 0, 0 )^t

u2 = ( 0, 1, 1 )^t
instead. After projecting onto the plane we get the ’shadow’ of v in W
w∗ = ( 1, 7/2, 7/2 )^t
which we identify with the point in the plane (1,7/2,7/2).
10.5.2 Orthogonality to the plane
We can imagine drawing line segments from the original point v to different points in the plane. Intuitively, the shortest line segment will be the one that is perpendicular to the plane. This corresponds to the difference between v and w∗.
We confirm that w∗ − v is orthogonal to any vector in W :
(w∗ − v) · (α1u1 + α2u2) = ( 0, 3/2, −3/2 )^t · ( α1, α2, α2 )^t = 0
10.6 The least squares problem
10.6.1 The least squares problem in Rn
Let W be an m-dimensional subspace of Rn. Given a vector v in Rn, find a vector w∗ in W so that
‖w∗ − v‖ ≤ ‖w − v‖ for all w ∈ W
10.6.2 Best least squares approximation
The vector w∗ in W is called the best least squares approximation to v.
10.6.3 Example
We have already done an example, with v ∈ R3 and w∗ in a vector space which is the span of two vectors w1 and w2. The vector that we found is the best least squares approximation to v.
10.7 Orthogonality result
10.7.1 Theorem
Suppose that v is a vector in Rn. Suppose that W is an m-dimensional subspace of Rn and w∗ in W is such that
(w∗ − v)tw = 0
for all w in W . Then w∗ is the best least-squares approximation to v.
10.7.2 In other words ..
The vector connecting v to w∗ is orthogonal to the subspace W .
10.7.3 Proof
Given the conditions of the theorem
‖w − v‖2 = ‖w − w∗ + w∗ − v‖2
= [(w − w∗) + (w∗ − v)]t[(w − w∗) + (w∗ − v)]
= [(w − w∗)t + (w∗ − v)t][(w − w∗) + (w∗ − v)]
= (w − w∗)t(w − w∗) + (w − w∗)t(w∗ − v) + (w∗ − v)t(w − w∗) + (w∗ − v)t(w∗ − v)
= (w − w∗)t(w − w∗) + 2(w∗ − v)t(w − w∗) + (w∗ − v)t(w∗ − v)
= ‖w − w∗‖2 + 2(w∗ − v)t(w − w∗) + ‖w∗ − v‖2
Now as w and w∗ are in W by the assumption of the theorem
(w∗ − v)tw = 0
and
(w∗ − v)tw∗ = 0
Then
‖w − v‖2 = ‖w − w∗‖2 + ‖w∗ − v‖2
Now as the magnitude of a vector is greater than or equal to zero we have
‖w − v‖2 ≥ ‖w∗ − v‖2
So
‖w∗ − v‖ ≤ ‖w − v‖
for all vectors w in W .
10.8 Testing for orthogonality
10.8.1 Lemma
Suppose that we have an m-dimensional subspace W of Rn and we want to find out if a given vector n of Rn is orthogonal to every vector in W . Then n is orthogonal to every vector w in W if and only if it is orthogonal to every basis vector of W .
10.8.2 Proof
Suppose that W has a set of basis vectors
S = {w1, w2, . . . wm}
and
n · wi = 0
for all the wi in S. Then as S is a set of basis vectors for W , any vector w in W can be written as a linear combination of the vectors in S:
w = α1w1 + α2w2 + · · ·+ αmwm
Taking the dot product with n gives
n · w = n · (α1w1 + α2w2 + · · ·+ αmwm)
= n · (α1w1) + n · (α2w2) + · · ·+ n · (αmwm)
= α1 n · w1 + α2 n · w2 + · · ·+ αm n · wm
= 0 + 0 + · · ·+ 0
= 0
Conversely, if n is orthogonal to every vector in W then it is orthogonal in particular to thebasis vectors in W .
10.9 Existence and uniqueness of best approximations
10.9.1 Recall
Any best approximation w∗ in a subspace W for v in Rn will have the property that for anyw in W
(w∗ − v)tw = 0
and to find such a w∗ it is sufficient to find one for which w∗ − v is orthogonal to every basis vector of W .
10.9.2 Resulting equations
If W has a set of basis vectors
S = {w1, w2, . . . , wm}

then the best approximation w∗ for v in Rn has the property that the following equations hold
(w∗ − v)tw1 = 0
(w∗ − v)tw2 = 0
...
(w∗ − v)twm = 0
10.9.3 Uniqueness
If these equations have a unique solution then the best approximation exists and must be unique.
10.9.4 Solving the system of equations
Suppose that
S = {w1, w2, . . . wm}
is a set of orthogonal basis vectors, i.e. each basis vector is orthogonal to the others. We can always find such a basis for any subspace of Rn.
We write w∗ as a linear combination of the orthogonal basis vectors for W
w∗ = α1w1 + α2w2 + · · ·+ αmwm
Then the system of equations becomes
(α1w1 + α2w2 + · · ·+ αmwm − v)tw1 = 0
(α1w1 + α2w2 + · · ·+ αmwm − v)tw2 = 0
...
(α1w1 + α2w2 + · · ·+ αmwm − v)twm = 0
and using orthogonality gives
α1‖w1‖2 − vtw1 = 0
α2‖w2‖2 − vtw2 = 0
...
αm‖wm‖2 − vtwm = 0
and then
αi = (vtwi)/‖wi‖2
for i = 1, 2, . . . ,m.
10.9.5 A best approximation
The best approximation for v then is
w∗ = ((vtw1)/‖w1‖2) w1 + ((vtw2)/‖w2‖2) w2 + · · ·+ ((vtwm)/‖wm‖2) wm
10.9.6 Uniqueness
The best approximation w∗ is unique: no other vector in W can be a better least squares approximation.
10.9.7 Proof
Suppose that w∗ is the best approximation as we have just constructed. Let w+ be some other best approximation in W . Then
‖w+ − v‖2 = ‖w+ − w∗ + w∗ − v‖2
= [(w+ − w∗) + (w∗ − v)]t[(w+ − w∗) + (w∗ − v)]
= (w+ − w∗)t(w+ − w∗) + (w+ − w∗)t(w∗ − v) + (w∗ − v)t(w+ − w∗) + (w∗ − v)t(w∗ − v)
= ‖w+ − w∗‖2 + 2(w+ − w∗)t(w∗ − v) + ‖w∗ − v‖2
As w∗ − v was orthogonal to any vector in W
‖w+ − v‖2 = ‖w+ − w∗‖2 + ‖w∗ − v‖2
But w∗ and w+ are both best approximations so
‖w+ − v‖2 = ‖w∗ − v‖2
Then
‖w+ − w∗‖2 = 0
and these best approximations are in fact the same vector.
10.10 Example
10.10.1 Problem
Consider the subspace W spanned by the vectors
w1 = ( −1, 1, 0 )^t
and
w2 = ( 2, 0, 1 )^t
We want to find the best least squares approximation in W to the vector
v = ( 1, −2, 4 )^t
The spanning vectors are linearly independent so
S = {w1, w2}

is a basis. As the basis is not orthogonal, we construct an orthogonal basis for W using the Gram-Schmidt process:
Let
u1 = w1 = ( −1, 1, 0 )^t

u2 = w2 − ((w2 · u1)/(u1 · u1)) u1 = ( 2, 0, 1 )^t − (−2/2)( −1, 1, 0 )^t = ( 1, 1, 1 )^t
So an orthogonal basis for W is
O = { u1 = ( −1, 1, 0 )^t , u2 = ( 1, 1, 1 )^t }
Then the best least squares approximation in W is given by
w∗ = ((v · u1)/(u1 · u1)) u1 + ((v · u2)/(u2 · u2)) u2
= (−3/2) u1 + (3/3) u2
= (−3/2)( −1, 1, 0 )^t + ( 1, 1, 1 )^t
= ( 5/2, −1/2, 1 )^t
10.10.2 Check
Recall that w∗ − v needs to be orthogonal to the basis vectors of W . We have
w∗ − v = ( 5/2, −1/2, 1 )^t − ( 1, −2, 4 )^t = ( 3/2, 3/2, −3 )^t
We see that
(w∗ − v)tw1 = 0
(w∗ − v)tw2 = 0
10.11 Finding a line that best fits data points
10.11.1 Experiment
I believe that a quantity y varies with time according to some law
y = mt+ b
and now wish to conduct an experiment to find m and b.
10.11.2 Data points
I measure y at various times t and the measurements are recorded as
t    y
1    1
2    5
4    7
5   11
10.11.3 Trying to find the line
Using my model of y = mt + b, and assuming that if I plotted these points they would lie on the graph of this equation, I get the equations
m · 1 + b = 1

m · 2 + b = 5

m · 4 + b = 7

m · 5 + b = 11
This can be written

[ 1  1 ]           [  1 ]
[ 2  1 ] [ m ]  =  [  5 ]
[ 4  1 ] [ b ]     [  7 ]
[ 5  1 ]           [ 11 ]
10.11.4 Inconsistency
If I form the augmented matrix and do row reduction it turns out that the system is inconsistent. Now what?
10.11.5 Strategy
We will have some measurement error in any experiment. So, let's try to find an m and a b that will give a best fit of the line to the data points.
10.11.6 Seek to minimize square errors
We then seek an m and a b so as to minimize the sum, over each t, of the squared differences between mt + b and the actual measured value:
I = (1/2)[ (m+ b− 1)^2 + (2m+ b− 5)^2 + (4m+ b− 7)^2 + (5m+ b− 11)^2 ]

Taking partial derivatives
Im = (m+ b− 1) + 2(2m+ b− 5) + 4(4m+ b− 7) + 5(5m+ b− 11) = 0
Ib = (m+ b− 1) + (2m+ b− 5) + (4m+ b− 7) + (5m+ b− 11) = 0
or
(m+ b) + 2(2m+ b) + 4(4m+ b) + 5(5m+ b) = 1 + 2 · 5 + 4 · 7 + 5 · 11
and
(m+ b) + (2m+ b) + (4m+ b) + (5m+ b) = 1 + 5 + 7 + 11
This can be rewritten in terms of matrices as

[ 1 2 4 5 ] [ 1  1 ]           [ 1 2 4 5 ] [  1 ]
[ 1 1 1 1 ] [ 2  1 ] [ m ]  =  [ 1 1 1 1 ] [  5 ]
            [ 4  1 ] [ b ]                 [  7 ]
            [ 5  1 ]                       [ 11 ]
Note that we now have the original system multiplied on both sides by the transpose of the original matrix. Then we get
[ 46  12 ] [ m ]   [ 94 ]
[ 12   4 ] [ b ] = [ 24 ]

and so

[ m ]   [ 46  12 ]^(−1) [ 94 ]   [ 11/5 ]
[ b ] = [ 12   4 ]      [ 24 ] = [ −3/5 ]
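The computation above can be sketched in NumPy (illustration only): build the matrix from the data, then solve the 2 × 2 normal-equations system.

```python
import numpy as np

t = np.array([1., 2., 4., 5.])
y = np.array([1., 5., 7., 11.])

# rows of A are (t_i, 1), matching the model y = m t + b
A = np.column_stack([t, np.ones_like(t)])

# normal equations: (A^T A) x = A^T y
m, b = np.linalg.solve(A.T @ A, A.T @ y)
```

This reproduces m = 11/5 and b = −3/5 from the hand computation.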
10.12 Least squares solutions to inconsistent systems
10.12.1 Problem
Suppose that the equation
Ax = b
is inconsistent. We cannot find an x which satisfies the equation, so we will try to find one that is the best fit. That is, we want to minimize the magnitude of the vector
r = Ax− b
10.12.2 Solution
We seek to minimize
F (x) = (Ax− b)t(Ax− b)

Let h be a small deviation about x:
F (x+ h) = [A(x+ h)− b]t[A(x+ h)− b]
= [(Ax− b) + Ah]t[(Ax− b) + Ah]
= (Ax− b)t(Ax− b) + 2(Ah)t(Ax− b) +O(h2)
= F (x) + 2htAt(Ax− b) +O(h2)
The term At(Ax− b) is a sort of derivative, which we will set equal to zero to get a minimum. So, we want to solve
At(Ax− b) = 0 =⇒ AtAx = Atb
10.12.3 Normal equations
The equations
AtAx = Atb
are called the normal equations.
10.12.4 Properties
a) The normal equations are always consistent.
b) The solutions of the normal equations are the least-squares solutions of Ax = b.
c) If A is m × n, the solutions of the normal equations are unique if and only if A has rank n.
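A small check of properties a) and b) on an inconsistent system (the matrix here is an arbitrary illustration): the normal equations have a solution even though Ax = b does not, and that solution agrees with a library least-squares solver.

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
b = np.array([1., 1., 1.])   # Ax = b itself is inconsistent

# the normal equations A^T A x = A^T b are consistent anyway
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# compare against NumPy's least-squares routine
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Both approaches give the same least-squares solution, as property b) predicts.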
10.13 Exercises
10.13.1 Exercise
Fit a line with an equation of the form y = mx+ b to the following data:

t  y
1  1
2  3
4  3
5  6
10.13.2 Exercise
Fit a curve with an equation of the form y = ax^2 + bx+ c to the following data:

t   y
1   3
2   6
3  14
5  30
10.13.3 Exercise
Consider the vector v in R4
v = ( 1, 2, 3, 4 )^t
Let W ⊂ R4 be the span of the vector
w = ( 4, 3, 2, 1 )^t
Find the best least-squares approximation to v in the subspace W .
Chapter 11
Vector Spaces
11.1 Examples
We will illustrate the properties of vector spaces using three example spaces.
11.1.1 Euclidean spaces
Note that linear combinations of two vectors in Rn give vectors in Rn.
11.1.2 Finite degree polynomials
Note that linear combinations of polynomials result in a polynomial.
11.1.3 Solutions to some differential equations
Consider the solutions to the differential equation
y′(x) = 0, x ∈ [0, 1]
Note that a linear combination of two solutions is another solution.
11.2 Zero vector
11.2.1 Zero vector in euclidean spaces
The Rn vector spaces have a zero vector which is a column vector with all zero entries.
0 = ( 0, 0, . . . , 0 )^t
11.2.2 Zero vector in the polynomial vector space
What would be the zero vector in the polynomial vector space?
11.2.3 Zero vector in the solutions of the differential equation
What would be the zero vector in the vector space of the solutions of
y′(x) = 0, x ∈ [0, 1]
11.2.4 Property of the zero vector
For any vector u in the vector space
u+ 0 = 0 + u = u
11.2.5 Exercise
Confirm that the zero vector in each of the three example spaces has the desired property.
11.3 Addition
11.3.1 Addition
If u and v are vectors in a vector space V then

u+ v is also in V
11.3.2 Example
Euclidean spaces.
11.3.3 Example
Adding two polynomials gives another polynomial.
11.3.4 Example
If u and v are solutions to
y′(x) = 0 x ∈ [0, 1]
then
u′(x) = 0 x ∈ [0, 1]
v′(x) = 0 x ∈ [0, 1]
and
(u+ v)′(x) = u′(x) + v′(x) = 0 + 0 = 0 x ∈ [0, 1]
11.4 Scalar multiplication
11.4.1 Scalar multiplication
If u is a vector in a real vector space then for any real number α we have that αu is a member of the vector space.
11.4.2 Example
A scalar multiplication was defined so that for any α ∈ R and any u ∈ Rn
αu ∈ Rn
11.4.3 Example
The scalar multiplication of polynomials is defined as
α(c0 + c1x+ c2x^2 + · · ·+ cnx^n) = αc0 + αc1x+ αc2x^2 + · · ·+ αcnx^n
which is a vector in the space of polynomials.
11.4.4 Example
If u is a solution to
y′(x) = 0 x ∈ [0, 1]
then
u′(x) = 0 x ∈ [0, 1]
and
(αu)′(x) = α(u)′(x) = α · 0 = 0 x ∈ [0, 1]
Then αu is also a solution and a vector in the vector space of solutions of the differential equation
y′(x) = 0 x ∈ [0, 1]
11.5 Properties we have seen for Euclidean spaces
From the way we defined addition and scalar multiplication for vectors in Rn we can show:
11.5.1 Properties
1. For u and v vectors in Rn
u+ v
is also a vector in Rn.
2. For u and v vectors in Rn
u+ v = v + u
3. For u, v, and w vectors in Rn
(u+ v) + w = u+ (v + w)
4. For u in Rn there is a zero vector 0 such that
u+ 0 = u = 0 + u
5. For every vector u in Rn there is a vector −u in Rn such that
u+ (−u) = 0
6. For every vector u in Rn and α in R there is a vector
αu
in Rn.
7. For every u and v in Rn and α a real number
α(u+ v) = αu+ αv
8. For every u in Rn and α and β real numbers
(α + β)u = αu+ βu
9. For every u in Rn and α and β real numbers
α(βu) = (αβ)u
10. For u in Rn
1u = u
11.6 Properties of a vector space
11.6.1 Definition
A set V is said to be a real vector space if the properties that we just recalled for the Rn
vector spaces hold.
11.6.2 Property 1
1. For u and v vectors in V
u+ v
is also a vector in V .
11.6.3 Example
We checked this previously for the space of polynomials and the space of solutions to the differential equation.
11.6.4 Property 2
2. For u and v vectors in V
u+ v = v + u
11.6.5 Example
Addition of polynomial functions is commutative, as is addition of functions in general.
11.6.6 Property 3
3. For u, v, and w vectors in V
(u+ v) + w = u+ (v + w)
11.6.7 Example
Addition of functions in general is associative, so this property holds in particular for polynomial functions and solutions of the differential equation.
11.6.8 Property 4
4. For u in V there is a zero vector 0 such that
u+ 0 = u = 0 + u
11.6.9 Example
We already checked this for the example spaces.
11.6.10 Property 5
5. For every vector u in V there is a vector −u in V such that
u+ (−u) = 0
11.6.11 Example
For a polynomial p defined by
p(x) = c0 + c1x+ c2x^2 + · · ·+ cnx^n
the polynomial −p is defined by
(−p)(x) = −c0 − c1x− c2x^2 − · · · − cnx^n
which is another polynomial, and p+ (−p) is such that
(p+ (−p))(x) = (c0 − c0) + (c1 − c1)x+ (c2 − c2)x^2 + · · ·+ (cn − cn)x^n ≡ 0
11.6.12 Example
If u is a solution to the differential equation then
u′(x) = 0
and
(−u)′(x) = −u′(x) = −0 = 0
so −u is also a vector in the space of solutions.
11.6.13 Property 6
6. For every vector u in V and α in R there is a vector
αu
in V .
11.6.14 Example
Already checked these for the example spaces.
11.6.15 Property 7
7. For every u and v in V and α a real number
α(u+ v) = αu+ αv
11.6.16 Example
True for the Euclidean spaces because of the properties of the real numbers. True in general for functions, so specifically for the polynomial functions and the functions that are members of the solution space.
11.6.17 Property 8
8. For every u in V and α and β real numbers
(α + β)u = αu+ βu
11.6.18 Examples
True for Euclidean spaces by the properties of real numbers. True in general for functions and so true in particular for the polynomial functions and the functions that are solutions to the differential equation.
11.6.19 Property 9
9. For every u in V and α and β real numbers
α(βu) = (αβ)u
11.6.20 Examples
True for the Euclidean spaces by the properties of real numbers. True for real-valued functions in general and hence true in particular for the polynomial functions and the functions that are solutions to the differential equation.
11.6.21 Property 10
10. For u in V
1u = u
11.6.22 Examples
True for Euclidean spaces by the properties of real numbers. True in general for real-valued functions by the definition of scalar multiplication.
11.7 Subspaces
11.7.1 Definition
If V is a real vector space, and W is a non-empty subset of V which is itself a vector space, then W is said to be a subspace of V .
11.7.2 Theorem - Checking if a subset is a subspace
If W is a non-empty subset of vectors of a vector space V then W is a subspace of V if and only if
a) u, v ∈ W implies u+ v ∈ W
b) α ∈ R and u ∈ W implies αu ∈ W .
11.7.3 Proof
Exercise.
11.7.4 Example
Consider the subset W of R2 which consists of all of the vectors of the form

( x, x^2 )^t

Is this a subspace?

Consider the addition of two vectors:

( 1, 1 )^t + ( 2, 4 )^t = ( 3, 5 )^t ∉ W
W is not a subspace since the sum of two vectors in W is not necessarily in W .
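The failure of closure can be checked mechanically; a tiny sketch (the helper name `in_W` is my own):

```python
import numpy as np

def in_W(v):
    """Membership test for W = { (x, x^2) }."""
    return v[1] == v[0] ** 2

u = np.array([1, 1])
v = np.array([2, 4])
s = u + v            # the sum (3, 5)
```

Both u and v pass the membership test, but their sum does not, so W fails closure under addition.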
11.7.5 Example
Consider the subset W of the space of all polynomials which consists of all polynomials of degree 3 or less. This is a subspace because: 1) the sum of two polynomials of degree three or less is a polynomial of degree three or less; 2) a scalar multiple of a polynomial of degree three or less is a polynomial of degree three or less.
11.7.6 Example
Every vector space has the trivial subspace { 0 }, which is not very interesting.
11.7.7 Example
The solution space of the homogeneous equation Ax = 0 is a subspace of Rn.
11.8 Linear combinations
11.8.1 Linear combination of vectors
A vector v in a real vector space V is said to be a linear combination of vectors v1, v2 , ... ,vn if there are real numbers α1, α2, ... , αn such that
v = α1v1 + α2v2 + · · ·+ αnvn
11.8.2 Example
In R2 the vector
v = ( α, β )^t

can be written as a linear combination of the vectors

e1 = ( 1, 0 )^t

e2 = ( 0, 1 )^t

by

v = αe1 + βe2
11.8.3 Example
In R2 the vector
v = ( α, β )^t

can be written as a linear combination of the vectors

v1 = ( 1, −1 )^t

v2 = ( 1, 1 )^t

by

v = ((α− β)/2) v1 + ((α + β)/2) v2
11.8.4 Example
The polynomial function
p(x) = 2 + 5x+ 7x^2
is a linear combination of the vectors
1, x, x^2
11.8.5 Example
In Rn all vectors can be written as linear combinations of the vectors
e1 = ( 1, 0, . . . , 0 )^t

e2 = ( 0, 1, . . . , 0 )^t

...

en = ( 0, 0, . . . , 1 )^t
11.8.6 Theorem - linear combinations form a subspace
Suppose v1, v2, ... , vk are vectors in a vector space V .
a) Then the set W of all linear combinations of v1, v2, ... , vk is a subspace of V .
b) It is the smallest subspace that contains all the vi.
11.8.7 Proof of a)
Suppose that p and q are elements of W . Then p and q are both linear combinations of the vi:
p = α1v1 + α2v2 + · · ·+ αkvk
q = β1v1 + β2v2 + · · ·+ βkvk
The sum is
p+ q = (α1 + β1)v1 + (α2 + β2)v2 + · · ·+ (αk + βk)vk
which is a linear combination of the vi and so is in W . So the sum of two vectors in W is also in W .
Now consider a scalar multiple of p:
γp = (γα1)v1 + (γα2)v2 + · · ·+ (γαk)vk
And then this is also a linear combination of the vi and so is in W . So a scalar multiple of a vector in W is in W . The two required conditions have been checked and so W is a subspace.
11.8.8 Proof of b)
Suppose a subspace contains vectors v1, v2 , ... , vk.
One of the conditions for a subspace is that all scalar multiples of vectors in the subspace must also be in the subspace. Then for all real numbers α1, α2, ... , αk we must have that α1v1, α2v2, ... , αkvk are also in the subspace.
The other condition for a subspace is that the sum of any two vectors in the subspace must also be in the subspace. Then
α1v1 + α2v2
is in the subspace. Then
(α1v1 + α2v2) + α3v3
is in the subspace. And so on till
α1v1 + α2v2 + · · ·+ αkvk
is also in the subspace.
Thus v1, v2, ... , vk in a subspace implies that all linear combinations of the vi are in the subspace.
11.8.9 Example
The set of all of the linear combinations of
u = ( 1, 1 )^t

and

v = ( −1, 1 )^t

is a subspace. In fact, it is all of R2.
11.8.10 Span
Suppose a vector space V contains vectors v1, v2, ... , vk. The subspace W consisting of all the linear combinations of the vi is called the space spanned by v1, v2, ... , vk, and v1, v2, ... , vk are said to span W .
11.8.11 Example
The span of the polynomial functions p, q, and r defined
p(x) = 1, q(x) = 1 + x, r(x) = 1 + x+ x^2
is a subspace of the space of polynomial functions.
11.9 Linear independence and dependence
11.9.1 Linear independence
Suppose that v1, v2, ... , vk are vectors in a real vector space V . The vi are said to be linearly dependent if we can find real numbers α1, α2, ... , αk, not all zero, so that
α1v1 + α2v2 + · · ·+ αkvk = 0
Otherwise the set is said to be linearly independent.
11.9.2 Alternatively
Another way of putting this is that the vi are linearly independent if
α1v1 + α2v2 + · · ·+ αkvk = 0
only when all the αi = 0. If the vectors are not linearly independent then they are said to be linearly dependent.
11.9.3 Example
The standard basis vectors in Rn are linearly independent.
11.9.4 Example
The vectors p , q , r in the space of polynomial functions defined by
p(x) = 1 + x
q(x) = 1− x
r(x) = x
are linearly dependent because
p− q − 2r = 0
11.10 Back to the FTLA
11.10.1 Recall
The following are equivalent:

a) An×n is invertible.
b) The equation Ax = 0 has only the trivial solution x = 0.
c) The reduced row echelon form of A is I.
d) A is a product of elementary matrices.
e) Ax = b is consistent for all n× 1 matrices b.
f) |A| ≠ 0.
11.10.2 Add to the FTLA
g) The columns of A considered as vectors in Rn are linearly independent.
11.10.3 Proof
Suppose that A is invertible. Then the only solution to Ax = 0 is x = 0. Then the only linear combination of the columns of A that gives zero is one where the scalar coefficients of the sum are zero. Then the columns are linearly independent.
If the columns are linearly independent then the only linear combination that gives zero is the one with all scalar coefficients zero. Then Ax = 0 has only the trivial solution and so by the FTLA A is invertible.
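By this equivalence, a determinant (or rank) computation tests linear independence of the columns of a square matrix; a numeric sketch with two illustrative matrices:

```python
import numpy as np

A_indep = np.array([[1., 1.],
                    [1., -1.]])   # columns are linearly independent
A_dep = np.array([[1., 2.],
                  [2., 4.]])      # second column = 2 * first column

det_indep = np.linalg.det(A_indep)   # nonzero, so A_indep is invertible
det_dep = np.linalg.det(A_dep)       # zero, so A_dep is not invertible
```

A nonzero determinant certifies independent columns; a zero determinant certifies a nontrivial dependence.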
11.11 Dimension
11.11.1 Example
We saw that the vectors p , q , r in the space of polynomial functions defined by
p(x) = 1 + x
q(x) = 1− x
r(x) = x
are linearly dependent because
p− q − 2r = 0
Then
r = (1/2)p− (1/2)q
Then the span of p, q, and r is the same as the span of just p and q.
11.11.2 General idea
Suppose that we are interested in a space W that we know is spanned by a set of vectors
B = {v1, v2, . . . , vm}
If the set of vectors is linearly independent, then none of the vectors can be written as a linear combination of the other vectors. If the set is linearly dependent, then at least one vector can be written as a linear combination of the others. Say, vm. We can drop vm from the set and just use
B′ = {v1, v2, . . . , vm−1}
We keep going till we get a linearly independent set.
11.11.3 Basis
Such a minimal set is called a basis.
11.11.4 Minimality of the basis
For a given vector space W with two bases, the number of vectors in both bases is the same.
11.11.5 Example
B = {p, q}
and
B′ = {p, r}
are both bases for the same subspace of the space of polynomial functions.
11.11.6 Dimension
The dimension of a vector space V is the number of vectors in a basis for V .
11.11.7 Example
Rn has dimension n.
11.11.8 Example
The space of polynomial functions defined on R is of infinite dimension.
11.12 Exercises
11.12.1 Exercise
Give an example of a two-dimensional subspace in the space of polynomial functions.
11.12.2 Exercise
How many different bases are there for R3?
11.12.3 Exercise
Can a set of n+ 1 vectors in Rn be linearly independent?
Chapter 12
Some particular vector spaces
12.1 Domain and range
12.1.1 Recall
Recall that if f is a map from a set U to a set V then U is called the domain and the set of values that f takes is called the range of f .
12.1.2 Example
Suppose f : R→ R is defined by
f(x) = x2
then the domain of f is R and the range of f is all the non-negative reals.
12.2 Example
12.2.1 Question
Consider the matrix A3×3
A =
[ 1  4  7 ]
[ 2  5  8 ]
[ 3  6  9 ]
A acts on vectors in R3 and sends them to R3. Let L : R3 → R3 be the linear operator defined by
L(u) = Au
What is the range of L?
12.2.2 What does the matrix do to the standard basis vectors?
L(e1) = Ae1 = ( 1, 2, 3 )^t

L(e2) = Ae2 = ( 4, 5, 6 )^t

L(e3) = Ae3 = ( 7, 8, 9 )^t
12.2.3 What does the matrix do to a general vector?
Where does A send a general vector? A typical vector in R3 looks like
v = αe1 + βe2 + γe3
so
L(v) = Av = α( 1, 2, 3 )^t + β( 4, 5, 6 )^t + γ( 7, 8, 9 )^t
12.2.4 Answer
The range of L is the set of vectors that look like
α( 1, 2, 3 )^t + β( 4, 5, 6 )^t + γ( 7, 8, 9 )^t
where α, β, and γ are any real numbers.
12.2.5 Note
The range of L is a vector space spanned by the column vectors of A. Is this true in general?
12.3 Column space
12.3.1 Matrix and column vectors
Suppose that we have a matrix Am×n. Let L : Rn → Rm be the linear transformation defined by
L(u) = Au
We write
A =
[ a11  a12  · · ·  a1n ]
[ a21  a22  · · ·  a2n ]
[ ...  ...         ... ]
[ am1  am2  · · ·  amn ]
We can think of the matrix as being made up of n column vectors
v1 = ( a11, a21, . . . , am1 )^t

v2 = ( a12, a22, . . . , am2 )^t

...

vn = ( a1n, a2n, . . . , amn )^t
12.3.2 Action of matrix on basis vectors
The standard basis vectors for Rn are
e1 = ( 1, 0, . . . , 0 )^t

e2 = ( 0, 1, . . . , 0 )^t

...

en = ( 0, 0, . . . , 1 )^t
Now we consider the products
L(e1) = Ae1 = v1
L(e2) = Ae2 = v2
...
L(en) = Aen = vn
12.3.3 Matrix acting on a typical vector
Any vector u ∈ Rn can be written as
u = α1e1 + α2e2 + · · ·+ αnen
and so
L(u) = Au = α1v1 + α2v2 + · · ·+ αnvn
12.3.4 Column space
The range of L is a vector space, namely the span of the column vectors of A. This space is called the column space of A.
12.3.5 Column space and solutions
For a matrix Am×n and column vectors x and b, the matrix-vector equation
Ax = b
is consistent if and only if the vector b is in the column space of A.
12.3.6 Proof
Suppose that the columns of A are
{v1, v2, v3, · · · , vn}
If b is in the column space of A then b is a linear combination of the columns of A
b = α1v1 + α2v2 + · · ·+ αnvn
Let
x = ( α1, α2, . . . , αn )^t
Then
Ax =
[ v1(1)  v2(1)  · · ·  vn(1) ] [ α1 ]
[ v1(2)  v2(2)  · · ·  vn(2) ] [ α2 ]
[  ...    ...          ...  ] [ ... ]
[ v1(m)  v2(m)  · · ·  vn(m) ] [ αn ]

= ( α1v1(1) + α2v2(1) + · · ·+ αnvn(1) , . . . , α1v1(m) + α2v2(m) + · · ·+ αnvn(m) )^t

= α1( v1(1), . . . , v1(m) )^t + α2( v2(1), . . . , v2(m) )^t + · · ·+ αn( vn(1), . . . , vn(m) )^t

= α1v1 + α2v2 + · · ·+ αnvn

= b

where vj(i) denotes the ith component of the column vector vj.
So b in the column space of A implies that Ax = b has a solution.
Now, suppose that Ax = b is consistent. Then there is a vector
x0 = ( α1, α2, . . . , αn )^t
so that Ax0 = b. But
Ax0 = α1v1 + α2v2 + · · ·+ αnvn = b
so b can be written as a linear combination of the vectors that span the column space of A. So b is in the column space of A.
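In practice, membership of b in the column space can be tested by comparing ranks: appending b as an extra column leaves the rank unchanged exactly when b is a combination of the columns. A sketch (the matrices are chosen for illustration):

```python
import numpy as np

def is_consistent(A, b):
    """Ax = b is consistent iff rank [A | b] == rank A."""
    return (np.linalg.matrix_rank(np.column_stack([A, b]))
            == np.linalg.matrix_rank(A))

A = np.array([[1., 2.],
              [2., 4.]])          # rank 1
b_in = np.array([3., 6.])        # 3 times the first column: in the column space
b_out = np.array([1., 0.])       # not in the column space
```

The rank test reports consistency exactly when b lies in the column space of A.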
12.4 Null space
12.4.1 Null space
Given a matrix Am×n, the subspace of all vectors x ∈ Rn so that
Ax = 0
is called the null space of A.
12.4.2 The null space is a subspace
The null space is a subspace of Rn. If there are two vectors u and v such that
Au = 0
and
Av = 0
then for any linear combination
w = αu+ βv
of u and v we have that
Aw = A(αu+ βv) = αAu+ βAv = 0 + 0 = 0
so any linear combination of two vectors in the null space is also in the null space. Then the null space is a subspace.
12.4.3 Example
Consider the matrix
A =
[ 1  2 ]
[ 3  4 ]

The null space of A is only the zero vector and nothing else. Why?
12.4.4 Example
Consider the matrix
A =
[ 1  2 ]
[ 2  4 ]

What is the null space of A?

The null space of A consists of all vectors of the form

( −2s, s )^t = s ( −2, 1 )^t

for any real number s. Then a basis for the null space of A is

{ ( −2, 1 )^t }
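Numerically, a null space basis can be read off from the singular value decomposition: the right singular vectors belonging to (numerically) zero singular values span the null space. A sketch for the matrix above:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.]])

U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))   # count the nonzero singular values
null_basis = Vt[rank:].T        # columns span the null space of A
```

Here the single basis column is a scalar multiple of (−2, 1)^t, matching the hand computation.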
12.5 Solutions of homogeneous and inhomogeneous systems
12.5.1 Homogeneous linear system
Given matrix Am×n, the linear system
Ax = 0
is called homogeneous.
12.5.2 Inhomogeneous linear system
Given matrix Am×n and vector b ≠ 0, the linear system
Ax = b
is called inhomogeneous.
12.5.3 Theorem - solutions of the inhomogeneous system
If the equation Ax = b is consistent then the solutions can be written in the form
x = xp + α1v1 + α2v2 + · · ·+ αkvk
where
B = {v1 , v2 , · · · , vk}
is a basis for the null space of A and xp is a particular solution to the equation Ax = b.
12.5.4 In other words
The general solution to Ax = b is any particular solution plus the null space of A.
12.5.5 Proof in one direction
Suppose that
B = {v1 , v2 , · · · , vk}
is a basis for the null space of A and xp is a solution of Ax = b and x is any solution. Then
Axp = b
and
Ax = b
and so subtracting gives
A(x− xp) = 0
Then as x− xp is in the null space of A we must be able to write it as some linear combination of the basis vectors of the null space:
x− xp = α1v1 + α2v2 + · · ·+ αkvk
This implies that every solution of the equation can be written in the form
x = xp + α1v1 + α2v2 + · · ·+ αkvk
12.5.6 Proof in the other direction
Suppose that xp is a solution of Ax = b and that
B = {v1 , v2 , · · · , vk}
is a basis for the null space of A. Let
x = xp + α1v1 + α2v2 + · · ·+ αkvk
Then
Ax = Axp + α1Av1 + α2Av2 + · · ·+ αkAvk = b+ α10 + α20 + · · ·+ αk0 = b
so a vector of the form
x = xp + α1v1 + α2v2 + · · ·+ αkvk
is a solution of the equation.
12.5.7 Particular solution
We call xp a particular solution of Ax = b.
12.5.8 General solution
The vector
x = xp + α1v1 + α2v2 + · · ·+ αkvk
is called the general solution of Ax = b.
12.5.9 General solution of the homogeneous equation
The linear combination
α1v1 + α2v2 + · · ·+ αkvk
is called the general solution to Ax = 0.
12.5.10 Example
Consider the solution of
( 1 1 1 1 ) ( x1 )   ( 10 )
( 0 1 1 1 ) ( x2 ) = (  9 )
( 0 0 1 1 ) ( x3 )   (  7 )
            ( x4 )
The rref of the augmented matrix is

( 1 0 0 0 | 1 )
( 0 1 0 0 | 2 )
( 0 0 1 1 | 7 )
which gives a solution

    ( x1 )   ( 1      )   ( 1 )   (  0  )   ( 1 )     (  0 )
x = ( x2 ) = ( 2      ) = ( 2 ) + (  0  ) = ( 2 ) + t (  0 )
    ( x3 )   ( 7 − x4 )   ( 7 )   ( −x4 )   ( 7 )     ( −1 )
    ( x4 )   ( x4     )   ( 0 )   (  x4 )   ( 0 )     (  1 )
where t could be any real number. In the notation that we have been using, the particular solution of the inhomogeneous problem is

     ( 1 )
xp = ( 2 )
     ( 7 )
     ( 0 )

and the general solution of the homogeneous problem is

       (  0 )
xh = t (  0 )
       ( −1 )
       (  1 )
and the general solution of the inhomogeneous problem is
x = xp + xh
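The structure of this solution is easy to verify numerically (a sketch using numpy, not part of the notes): xp solves the inhomogeneous system, and adding any multiple of the null space vector still solves it.

```python
import numpy as np

# The system from the example above.
A = np.array([[1.0, 1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])
b = np.array([10.0, 9.0, 7.0])

xp = np.array([1.0, 2.0, 7.0, 0.0])   # particular solution
vh = np.array([0.0, 0.0, -1.0, 1.0])  # basis vector of the null space

# xp solves Ax = b, and A @ vh = 0, so xp + t*vh solves Ax = b for every t.
assert np.allclose(A @ xp, b)
assert np.allclose(A @ vh, 0)
for t in [-2.0, 0.0, 1.0, 3.5]:
    assert np.allclose(A @ (xp + t * vh), b)
print("xp + t*vh solves A x = b for every t tried")
```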
12.6 Null space and elementary row operations
12.6.1 Question
We might use elementary row operations on a matrix to find the null space of the matrix. There would be a problem if the null space of the original matrix and the null space of the row reduced matrix were not the same.
It turns out that the null spaces of a matrix and the matrix multiplied by an elementary matrix from the left are the same.
12.6.2 Theorem
If E is an elementary matrix and A is a general matrix so that their product EA is defined, then the null space of EA and the null space of A are the same.
12.6.3 Proof
Suppose that u is in the null space of A. That is, Au = 0. Then
(EA)u = E(Au) = E · 0 = 0
So, every vector in the null space of A is in the null space of EA. So, the null space of A is a subset of the null space of EA.
Now, suppose that there is a vector v in the null space of EA. That is, (EA)v = 0. Then
(EA)v = E(Av) = 0
The elementary matrices are all invertible so
Av = E−1 · 0 = 0
So, every vector in the null space of EA is a vector in the null space of A. Then the null space of EA is a subset of the null space of A.
Since the null space of A is a subset of the null space of EA and the null space of EA is a subset of the null space of A, the two spaces are the same.
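The theorem can be illustrated numerically (a sketch using numpy, not part of the notes), reusing the matrix from the earlier null space example:

```python
import numpy as np

# A has null space spanned by (-2, 1), as found earlier.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
v = np.array([-2.0, 1.0])

# An elementary matrix: add -2 times row 1 to row 2.
E = np.array([[1.0, 0.0],
              [-2.0, 1.0]])

# v is annihilated by both A and EA, illustrating that left
# multiplication by an (invertible) elementary matrix preserves the null space.
assert np.allclose(A @ v, 0)
assert np.allclose((E @ A) @ v, 0)
print("null vector of A is also a null vector of EA")
```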
12.7 Column space and elementary operations
12.7.1 Linear independence of column spaces
Suppose that we have a matrix Am×n and B is obtained from A through elementary row operations. Then the column vectors of A are linearly independent if and only if the column vectors of B are linearly independent.
12.7.2 Proof
Sufficient to show this for
B = EA
with E an elementary matrix.
Suppose that the column vectors of A are linearly independent. Consider
u =
( α1 )
( α2 )
( ... )
( αn )
Then the vector Au is
Au = α1c1 + α2c2 + · · ·+ αncn
where the ci are the column vectors of A.
If the column vectors of A are linearly independent then Au is not zero for any nonzero u. No elementary matrix can make a nonzero vector zero (elementary matrices are invertible), so

(EA)u = E(Au) ≠ 0

for any nonzero u, and so the column vectors of EA are linearly independent.
Suppose that A has linearly dependent column vectors. Then there is a nonzero u so that Au = 0. Then
(EA)u = E(Au) = 0
and so the column vectors of EA are linearly dependent.
12.7.3 Example
Consider the matrix
A =
( 1 2 5 )
( 3 8 6 )
( 2 1 9 )
( 5 3 1 )

Its rref is

( 1 0 0 )
( 0 1 0 )
( 0 0 1 )
( 0 0 0 )
As the column vectors of the rref are linearly independent, the column vectors of the original matrix A are linearly independent.
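A numerical check (a sketch using numpy, not part of the notes): the rank of A equals the number of its columns exactly when the columns are linearly independent.

```python
import numpy as np

# The 4x3 matrix from the example.
A = np.array([[1.0, 2.0, 5.0],
              [3.0, 8.0, 6.0],
              [2.0, 1.0, 9.0],
              [5.0, 3.0, 1.0]])

# Rank 3 = number of columns, so the three columns are linearly
# independent, matching what the rref of A shows.
assert np.linalg.matrix_rank(A) == 3
print("columns of A are linearly independent")
```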
12.7.4 Related theorem
A given set of column vectors of A is linearly independent if and only if the corresponding column vectors of B are linearly independent.
12.7.5 Proof
Work with A modified by dropping any columns not in the set.
12.7.6 Basis of the column space
Suppose that A is a matrix and B is given by performing elementary row operations on A.
A given set of column vectors of A forms a basis for the column space of A if and only if the corresponding column vectors of B form a basis for the column space of B.
12.7.7 Proof
Sufficient to show this for B = EA where E is an elementary matrix.
Suppose that a given set of column vectors is a basis for the column space of A. Then that set of column vectors is linearly independent, by the definition of a basis. Then the corresponding column vectors in B are linearly independent by the previous theorem. Now, we would like to show that the corresponding column vectors are a basis.
We need to show that every vector in the column space of EA can be written as a linear combination of the corresponding vectors.
Suppose, without loss of generality, that the first k vectors are a basis for the column space of A and the remaining n − k can be written as linear combinations of the first k.
Then any vector in the column space is the image of a vector u mapped by the associated linear transformation:
Au = α1c1 + α2c2 + · · · + αkck + αk+1ck+1 + · · · + αncn
   = α1c1 + α2c2 + · · · + αkck + αk+1(βk+1,1c1 + · · · + βk+1,kck) + · · · + αn(βn,1c1 + · · · + βn,kck)
for some scalars βj,i. Then a typical vector in the column space of EA is
(EA)u = E(Au) = α1Ec1 + α2Ec2 + · · · + αkEck + αk+1(βk+1,1Ec1 + · · · + βk+1,kEck) + · · · + αn(βn,1Ec1 + · · · + βn,kEck)
so the images of the original basis vectors also span the column space of EA.
For the proof in the opposite direction, B = EA means E−1B = A, and note that the inverse of an elementary matrix is an elementary matrix. Then the same proof can be used again.
12.7.8 RREF and column space
If a matrix A is in rref, then the columns that contain the leading ones of the rows (the pivot columns) form a basis for the column space of A.
12.7.9 Proof
Exercise.
12.7.10 Example
Consider the matrix
A =
( 1 1 1 1 1 )
( 2 2 3 4 4 )
( 3 3 1 3 4 )

The rref is

( 1 1 0 0  1/4 )
( 0 0 1 0 −1/2 )
( 0 0 0 1  5/4 )
So the leading ones are in columns 1, 3 and 4, and by the theorem above the corresponding columns of A form a basis of the column space of A:

     ( 1 )        ( 1 )        ( 1 )
c1 = ( 2 )   c3 = ( 3 )   c4 = ( 4 )
     ( 3 )        ( 1 )        ( 3 )

Since this basis consists of three linearly independent vectors in R3, the column space of A is all of R3, and so the range of A is all of R3.
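The rref also records how the non-pivot columns depend on the pivot columns, which we can verify numerically (a sketch using numpy, not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
              [2.0, 2.0, 3.0, 4.0, 4.0],
              [3.0, 3.0, 1.0, 3.0, 4.0]])

c1, c2, c3, c4, c5 = A.T  # the five columns of A

# The rref says the non-pivot columns are combinations of the pivot ones:
# c2 = 1*c1 and c5 = (1/4)c1 - (1/2)c3 + (5/4)c4.
assert np.allclose(c2, c1)
assert np.allclose(c5, 0.25 * c1 - 0.5 * c3 + 1.25 * c4)

# The three pivot columns are independent, so the column space is all of R^3.
assert np.linalg.matrix_rank(np.column_stack([c1, c3, c4])) == 3
print("pivot columns of A form a basis of R^3")
```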
12.8 Row space
12.8.1 Row space
The row space of a matrix A is the space spanned by the row vectors of A.
12.8.2 Example
Consider the matrix
A =
( 1 1 1 1 )
( 0 0 1 1 )
( 0 0 0 2 )

The row vectors of A are

r1 = ( 1 1 1 1 )
r2 = ( 0 0 1 1 )
r3 = ( 0 0 0 2 )

The row space consists of all the linear combinations
αr1 + βr2 + γr3
12.9 Elementary operations and the row space
12.9.1 Elementary operations do not change the row space
If A is a matrix and B is obtained from A by multiplication by elementary matrices, then the row space of A and the row space of B are the same.
12.9.2 Proof
Sufficient to show for B = EA where E is a single elementary matrix.
Suppose that E interchanges two rows. Then the rows of A and B are the same as sets, so the row spaces do not change.
Suppose that E multiplies one row of A by a non-zero scalar. Then again, the row spacedoes not change.
Now, suppose that E multiplies the ith row of A by a non-zero scalar α and adds it to the jth row of A, replacing the jth row.
Then if the row space of A is spanned by
{r1, r2, · · · , ri, · · · , rj, · · · , rm}
the row space of B is spanned by
{r1, r2, · · · , ri, · · · , rj + αri, · · · , rm}
which will span the same space.
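The argument can be illustrated numerically (a sketch using numpy, not part of the notes): if a row operation leaves the row space unchanged, stacking the rows of A and B together adds no new directions, so the rank does not grow.

```python
import numpy as np

# The matrix from the row space example.
A = np.array([[1.0, 1.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 2.0]])

# Elementary row replacement: add 5 times row 1 to row 3.
E = np.eye(3)
E[2, 0] = 5.0
B = E @ A

# Rows of A and B together span the same space as the rows of A alone.
assert np.linalg.matrix_rank(np.vstack([A, B])) == np.linalg.matrix_rank(A)
print("row space unchanged by the elementary row operation")
```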
12.9.3 Row space and rref
If a matrix A is in rref, then the row vectors with the leading ones form a basis for the row space of A.
12.9.4 Example
Consider the matrix
A =
( 1 2 5 )
( 3 8 6 )
( 2 1 9 )
( 5 3 1 )

Its rref is

( 1 0 0 )
( 0 1 0 )
( 0 0 1 )
( 0 0 0 )

and so the row space of A is spanned by the basis vectors

r1 = ( 1 0 0 )
r2 = ( 0 1 0 )
r3 = ( 0 0 1 )
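Since row operations preserve the row space, the rows of A must span the same space as these standard basis vectors, which we can confirm numerically (a sketch using numpy, not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 2.0, 5.0],
              [3.0, 8.0, 6.0],
              [2.0, 1.0, 9.0],
              [5.0, 3.0, 1.0]])

# The rref has three nonzero rows, the standard basis of R^3, so the
# row space of A should be all of R^3: rank 3, and appending the
# standard basis rows to A does not increase the rank.
assert np.linalg.matrix_rank(A) == 3
assert np.linalg.matrix_rank(np.vstack([A, np.eye(3)])) == 3
print("row space of A is all of R^3")
```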