matrices and gaussian elimination -- sage
TRANSCRIPT
Matrices and Gaussian elimination
REFERENCE: Chapter 1, Linear Algebra and its Applications, by G. Strang.
The Geometry of systems of linear equations
The book "Linear Algebra and its Applications" begins with systems of linear equations. The simplest case is when the number of unknowns equals the number of equations. We have $n$ equations in $n$ unknowns, starting with $n = 2$, with an example:
$$x + y = 0, \qquad x + 3y = 2.$$
We have two equations and two unknowns, $x$ and $y$. The next question is not how to solve it but what we are solving.
In order to understand what's going on, we can look at that system by rows or by columns.
The row picture
By rows, we want to compute the point of intersection that lies on both lines,
var('x,y')
show(implicit_plot(x+y,(x,-2,2),(y,-2,2)) + implicit_plot(x+3*y-2,(x,-2,2),(y,-2,2),color='red'))
The plot shows that this point is (-1,1), so the system has a unique solution.
solve([x+y,x+3*y-2],x,y)
[[x == -1, y == 1]]
The column picture
The second approach looks at the columns of the linear system. The two separate equations are really one vector equation:
$$x \begin{pmatrix} 1 \\ 1 \end{pmatrix} + y \begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \end{pmatrix}.$$
That is, we want to know whether the vector $(0, 2)$ is a linear combination of the vectors $(1, 1)$ and $(1, 3)$. We already know the answer: yes! Let's draw it.
u=vector((1,1))
v=vector((1,3))
s=vector((0,2))
-1*u+1*v ## here we have the combination
(0, 2)
show(plot(-u,color='green') + arrow(-u,-u+v) + plot(s,color='red'))
It is well known that a system can have no solutions, that is, the system is inconsistent:
$$x + y = 1, \qquad x + y = 2.$$
That means that the two lines have no intersection point and that the vector $(1, 2)$ is not a multiple of $(1, 1)$.
A system can also have infinitely many solutions. In general (not always) a system with fewer equations than unknowns has infinitely many solutions; for example, the intersection of two planes in $\mathbb{R}^3$ is usually a line,
$$x + y + z = 0, \qquad x + 3y - z = 2.$$
A linear system can be overdetermined (having more equations than unknowns), underdetermined (having fewer equations than unknowns), or exactly determined.
var('z')
show(implicit_plot3d(x+y+z,(x,-2,2),(y,-2,2),(z,-2,2)) + implicit_plot3d(x+3*y-z-2,(x,-2,2),(y,-2,2),(z,-2,2),color='red'))
That also implies that there are infinitely many different combinations of the vectors (1,1), (1,3) and (1,-1) that produce (0,2),
$$x \begin{pmatrix} 1 \\ 1 \end{pmatrix} + y \begin{pmatrix} 1 \\ 3 \end{pmatrix} + z \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \end{pmatrix}.$$
What does it mean? Take a look at the solution,
solve([x+y+z,x+3*y-z-2],x,y,z)
[[x == -2*r1 - 1, y == r1 + 1, z == r1]]
Let's "study" the solution. For every r1, we have the combination:
var('r1')
w=vector((1,-1))
(-2*r1 - 1)*u+(r1+1)*v+r1*w
(0, 2)
If r1=0:
(-2*0 - 1)*u+(0+1)*v+0*w
(0, 2)
So the part that depends on r1 must add up to zero: $-2 r_1 u + r_1 v + r_1 w = \;$????
-2*r1*u+r1*v+r1*w
(0, 0)
That means that the vectors $u, v, w$ are linearly dependent.
DEFINITIONS
A linear combination of a set of vectors $v_1, \dots, v_n$ is another vector of the form $c_1 v_1 + \cdots + c_n v_n$ with $c_i \in \mathbb{R}$. The numbers $c_1, \dots, c_n$ are called the coefficients of the combination.
Suppose $c_1 v_1 + \cdots + c_n v_n = 0$ only happens when $c_1 = \cdots = c_n = 0$. Then the vectors $v_1, \dots, v_n$ are linearly independent. If any $c$'s are nonzero, the $v$'s are linearly dependent. That means that one vector is a combination of the others.
PROBLEMS
Given a set of vectors $v_1, \dots, v_n$, how do you know if they are linearly independent/dependent?
Given a set of vectors $v_1, \dots, v_n$ and a vector $w$, how do you know if $w$ is a linear combination of $v_1, \dots, v_n$?
It's easy: you just have to discuss the corresponding linear system of equations. Recall that given a matrix $A$ and a vector $x$, the multiplication of $A$ and $x$, $Ax$, is a combination of the columns of $A$. The coefficients are the components of the vector $x$.
LET'S SEE EXAMPLES ON THE BLACKBOARD
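The statement that $Ax$ is a combination of the columns of $A$ can be checked with a few lines of plain Python. This is only a sketch: the helper name `combine_columns` is illustrative and is not a Sage function, and the matrix is the coefficient matrix of the system $x + y + z = 0$, $x + 3y - z = 2$ used earlier.

```python
from fractions import Fraction

def combine_columns(A, x):
    """Compute A*x as x[0]*(column 0) + x[1]*(column 1) + ... (A is a list of rows)."""
    n_rows, n_cols = len(A), len(A[0])
    result = [Fraction(0)] * n_rows
    for j in range(n_cols):          # one column at a time
        for i in range(n_rows):
            result[i] += Fraction(x[j]) * A[i][j]
    return result

# Columns are u = (1,1), v = (1,3), w = (1,-1)
A = [[1, 1, 1],
     [1, 3, -1]]

particular = combine_columns(A, [-1, 1, 0])   # the solution with r1 = 0
null_comb = combine_columns(A, [-2, 1, 1])    # the combination -2u + v + w
print(particular, null_comb)
```

The first call reproduces the right-hand side of the system, and the second one is the null combination that shows the columns are linearly dependent.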
Gaussian Elimination
Obviously, it is easy to solve triangular systems by back-substitution, for example:
$$2u + v + w = 5, \qquad -8v - 2w = -12, \qquad w = 2,$$
$$w = 2 \;\rightarrow\; v = 1 \;\rightarrow\; u = 1.$$
So, given a linear system, the goal of Gaussian elimination is to get a triangular system, equivalent to the original (with the same solutions) and easy to solve. In order to get equivalent systems, we can
- swap equations (exchange of equations)
- multiply an equation by a number different from 0
- add a multiple of an equation to another.
If we consider the (well-known) matrix form to describe the original system, $Ax = b$ ($A$: the coefficient matrix, $[A\ b]$: the augmented matrix), these operations are row operations of elimination:
- exchange two rows
- multiply a row by a number different from 0
- add a multiple of a row to another.
By row operations, we can always reach the echelon form $U$ of the matrix $[A\ b]$, with zeros below the pivots. The echelon form $U$ of a matrix is a matrix that verifies:
1. The pivots are the first nonzero entries in their rows.
2. Below each pivot is a column of zeros, obtained by elimination.
3. Each pivot lies to the right of the pivot in the row above. This produces the staircase pattern, and zero rows come last.
The matrix $U$ defines an equivalent system that is easier to solve than the original.
We can go further than $U$, to make the matrix even simpler, with the reduced echelon form $R$:
4. Zeros above the pivots.
5. All pivots are equal to 1.
LET'S SEE EXAMPLES ON THE BLACKBOARD
Now let's see how SAGE gets reduced echelon forms.
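The row operations and the five conditions above can be sketched in plain Python with exact fractions. This is a minimal illustration under stated assumptions, not the routine Sage uses internally: the function name `rref` is hypothetical and the pivot choice is simply the first nonzero entry in each column.

```python
from fractions import Fraction

def rref(M):
    """Return the reduced echelon form of M (a list of rows), using exact fractions."""
    A = [[Fraction(v) for v in row] for row in M]
    n_rows, n_cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # rows at or below pivot_row with a nonzero entry in this column
        candidates = [r for r in range(pivot_row, n_rows) if A[r][col] != 0]
        if not candidates:
            continue                                  # no pivot in this column
        r = candidates[0]
        A[pivot_row], A[r] = A[r], A[pivot_row]       # row exchange
        p = A[pivot_row][col]
        A[pivot_row] = [v / p for v in A[pivot_row]]  # make the pivot equal to 1
        for i in range(n_rows):                       # zeros below and above the pivot
            if i != pivot_row and A[i][col] != 0:
                f = A[i][col]
                A[i] = [a - f * b for a, b in zip(A[i], A[pivot_row])]
        pivot_row += 1
    return A

# Augmented matrix of the triangular example: 2u + v + w = 5, -8v - 2w = -12, w = 2
R = rref([[2, 1, 1, 5], [0, -8, -2, -12], [0, 0, 1, 2]])
print(R)
```

The last column of `R` holds the values of $u$, $v$, $w$ found above by back-substitution.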
a) $x + y = 0, \quad x + 2y = 1$
var('y')
implicit_plot(x+y,(x,-2,1),(y,-1,2),color='red')+implicit_plot(x+2*y-1,(x,-2,1),(y,-1,2))
A=matrix(QQ,[[1,1],[1,2]]); #coefficient matrix
b=column_matrix(QQ,[[0,1]]); #free vector
Ab=A.augment(b,subdivide=True); #augmented matrix
Ab
[1 1|0]
[1 2|1]
Ab.echelon_form() #echelon form
[ 1 0|-1]
[ 0 1| 1]
Solution: $x = -1$, $y = 1$. We check it:
A*column_matrix(QQ,[[-1,1]])
[0]
[1]
b) $x + 2y = 1, \quad x + y = 2, \quad 2x + y = 1$
implicit_plot(x+2*y-1,(x,-2,2),(y,-2,2),color="blue")+implicit_plot(x+y-2,(x,-2,2),(y,-2,2),color='red')+implicit_plot(2*x+y-1,(x,-2,2),(y,-2,2),color='pink')
A=matrix(QQ,[[1,2],[1,1],[2,1]]);
b=column_matrix(QQ,[[1,2,1]]);
Ab=A.augment(b,subdivide=True);
Ab
[1 2|1]
[1 1|2]
[2 1|1]
Ab.echelon_form() # compute its echelon form
[1 0|0]
[0 1|0]
[0 0|1]
The system is inconsistent, that is, it has no solutions: the last row of the echelon form reads $0 = 1$.
c) $2x + 4z = 0, \quad -x - z = 3, \quad 2x + y + 5z = -4$
var('z')
implicit_plot3d(2*x+4*z,(x,-7,7),(y,-7,7),(z,-7,7),color='blue')+implicit_plot3d(-x-z-3,(x,-7,7),(y,-7,7),(z,-7,7),color='red')+implicit_plot3d(2*x+y+5*z+4,(x,-7,7),(y,-7,7),(z,-7,7),color='pink',viewer='threejs')
A=matrix(QQ,[[2,0,4],[-1,0,-1],[2,1,5]]);
b=column_matrix(QQ,[[0,3,-4]]);
Ab=A.augment(b,subdivide=True);
Ab
[ 2 0 4| 0]
[-1 0 -1| 3]
[ 2 1 5|-4]
Ab.echelon_form()
[ 1 0 0|-6]
[ 0 1 0|-7]
[ 0 0 1| 3]
Solution: $x = -6$, $y = -7$, $z = 3$. We check it:
A*column_matrix(QQ,[[-6,-7,3]]);
[ 0]
[ 3]
[-4]
d) $x + 2z + 5t = 2, \quad y - 3z = 1$
A=matrix(QQ,[[1,0,2,5],[0,1,-3,0]]);
b=column_matrix(QQ,[[2,1]]);
Ab=A.augment(b,subdivide=True);
Ab
[ 1 0 2 5| 2]
[ 0 1 -3 0| 1]
It is already in echelon form!
Ab.echelon_form()
[ 1 0 2 5| 2]
[ 0 1 -3 0| 1]
The system is consistent but two variables are free, $z$ and $t$. So, there are infinitely many solutions depending on the real values of $z$ and $t$:
$$x = 2 - 5t - 2z, \qquad y = 1 + 3z.$$
var('z,t')
(z, t)
A*column_matrix(QQ[z,t],[[2-5*t-2*z,1+3*z,z,t]]); # QQ[z,t]: we have to add z,t to the rational numbers.
[2]
[1]
Which null combination is "hidden" in the solution?
The next one is up to you:
e) $2x + y + z + t = 4, \quad z + 4t = 2$
SOME CONCLUSIONS:
If the system is consistent and the number of pivots is equal to the number of unknowns, the system has just one solution.
If the system is consistent and the number of pivots is less than the number of unknowns, the system has infinitely many solutions. Then the solutions depend on "free variables" or "parameters", and the columns of the coefficient matrix are linearly dependent.
INVERSES
The inverse of an $n$ by $n$ matrix $A$ is another $n$ by $n$ matrix, written $A^{-1}$ and pronounced "$A$ inverse", that verifies
$$A^{-1} A = A A^{-1} = I.$$
$A$ is called invertible if $A^{-1}$ exists; otherwise, singular.
Notes from the book of G. Strang:
Note 1. The inverse exists if and only if elimination produces $n$ pivots (if and only if the reduced echelon form is equal to the identity matrix).
Note 2. The matrix $A$ cannot have two different inverses.
Note 3. If $A$ is invertible, the one and only solution to $Ax = b$ is $x = A^{-1} b$.
Note 4. Suppose there is a nonzero vector $x$ such that $Ax = 0$. Then $A$ cannot have an inverse.
Note 6. A diagonal matrix has an inverse provided no diagonal entries are zero:
d=diagonal_matrix(4,[1,2,3,4])
show("d =",d)
show("d^-1 =",d.inverse())
$$d = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix}, \qquad d^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{2} & 0 & 0 \\ 0 & 0 & \frac{1}{3} & 0 \\ 0 & 0 & 0 & \frac{1}{4} \end{pmatrix}$$
Other points:
A product $AB$ of invertible matrices is inverted by $B^{-1} A^{-1}$, that is,
$$(AB)^{-1} = B^{-1} A^{-1}.$$
Given an invertible matrix $A$, the inverse of the transpose of $A$ is the transpose of the inverse of $A$:
$$(A^t)^{-1} = (A^{-1})^t.$$
Calculation of $A^{-1}$: The Gauss-Jordan Method
Given an $n$ by $n$ matrix $A$, we have to solve $n$ systems of equations in order to get the inverse. By the Gauss-Jordan Method, we solve all systems simultaneously.
LET'S SEE SOME EXAMPLES ON THE BLACKBOARD
Inverses - Ranks
Given a matrix $A$, suppose elimination reduces $A$ to $U$ (echelon form) and $R$ (reduced echelon form), with $r$ pivots. This important number $r$ is given a name: it is the rank of the matrix $A$.
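The Gauss-Jordan idea of solving all $n$ systems at once amounts to reducing the block matrix $[A\ I]$ until the left block is the identity; the right block is then $A^{-1}$. The following plain-Python sketch illustrates this under simple assumptions: the function name `gauss_jordan_inverse` is hypothetical (in Sage one would call `A.inverse()`), and the pivot is the first nonzero entry in each column.

```python
from fractions import Fraction

def gauss_jordan_inverse(M):
    """Reduce [M | I] to [I | M^-1]; return None when M has fewer than n pivots."""
    n = len(M)
    # build the augmented matrix [M | I]
    A = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return None                      # rank < n: the matrix is singular
        A[col], A[pivot] = A[pivot], A[col]  # row exchange if needed
        p = A[col][col]
        A[col] = [v / p for v in A[col]]     # pivot equal to 1
        for r in range(n):                   # zeros above and below the pivot
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

d_inv = gauss_jordan_inverse([[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]])
singular = gauss_jordan_inverse([[1, 1], [1, 1]])
print(d_inv, singular)
```

The second call returns `None` because elimination finds only one pivot, so the rank is 1 and no inverse exists.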
The rows and columns in $U$ and $R$ that contain the pivots are called pivot rows and pivot columns, respectively.
Thus, an $n$ by $n$ matrix is invertible if and only if its rank is equal to $n$.
Elementary Matrices and LU decomposition
Row operations in the Gaussian elimination process can be seen as multiplications of matrices. Let's compute the echelon form $U$ of the following matrix (Chapter 1, Section 1.4),
$$A = \begin{pmatrix} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{pmatrix} \rightarrow \begin{pmatrix} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 8 & 3 \end{pmatrix} \rightarrow U = \begin{pmatrix} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 0 & 1 \end{pmatrix}$$
A=matrix(QQ,[[2,1,1],[4,-6,0],[-2,7,2]]);
show("A= ",A)
The first step: add -2 times the first row to the second.
The same result is achieved if we multiply this elementary matrix $E$ by $A$,
E=elementary_matrix(3, row1=1, row2=0, scale=-2) # The matrix which multiplies row 0 by -2 and adds it to row 1.
show("E= ",E)
$$E = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
A1=E*A
A1
[ 2 1 1]
[ 0 -8 -2]
[-2 7 2]
The second step: add 1 times the first row to the third.
The same result is achieved if we multiply this elementary matrix $F$ by $A_1$,
F=elementary_matrix(3, row1=2, row2=0, scale=1) # The matrix which multiplies row 0 by 1 and adds it to row 2.
show("F= ",F)
$$F = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$$
A2=F*A1
A2
[ 2 1 1]
[ 0 -8 -2]
[ 0 8 3]
The third (and last) step: add 1 times the second row to the third.
The same result is achieved if we multiply this elementary matrix $G$ by $A_2$,
G=elementary_matrix(3, row1=2, row2=1, scale=1) # The matrix which multiplies row 1 by 1 and adds it to row 2.
show("G= ",G)
$$G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$
U=G*A2
U
[ 2 1 1]
[ 0 -8 -2]
[ 0 0 1]
Thus, $G \cdot F \cdot E \cdot A = U$,
G*F*E*A==U
True
elementary_matrix?
File: /Applications/SageMath/local/lib/python2.7/site-packages/sage/matrix/special.py
Type: <type ‘function’>
Definition: elementary_matrix(arg0, arg1=None, **kwds)
Docstring:
This function is available as elementary_matrix(…) and matrix.elementary(…).
Creates a square matrix that corresponds to a row operation or a column operation.
FORMATS:
In each case, R is the base ring, and is optional. n is the size of the square matrix created. Any call may include the sparse keyword to determine the representation used. The default is False which leads to a dense representation. We describe the matrices by their associated row operation, see the output description for more.
elementary_matrix(R, n, row1=i, row2=j)
The matrix which swaps rows i and j.
elementary_matrix(R, n, row1=i, scale=s)
The matrix which multiplies row i by s.
elementary_matrix(R, n, row1=i, row2=j, scale=s)
The matrix which multiplies row j by s and adds it to row i.
Elementary matrices representing column operations are created in an entirely analogous way, replacing row1 by col1 and replacing row2 by col2.
Specifying the ring for entries of the matrix is optional. If it is not given, and a scale parameter is provided, then a ring containing the value of scale will be used. Otherwise, the ring defaults to the integers.
OUTPUT:
An elementary matrix is a square matrix that is very close to being an identity matrix. If E is an elementary matrix and A is any matrix with the same number of rows, then E*A is the result of applying a row operation to A. This is how the three types created by this function are described. Similarly, an elementary matrix can be associated with a column operation, so if E has the same number of columns as A then A*E is the result of performing a column operation on A.
An elementary matrix representing a row operation is created if row1 is specified, while an elementary matrix representing a column operation is created if col1 is specified.
EXAMPLES:
Over the integers, creating row operations. Recall that row and column numbering begins at zero.
sage: A = matrix(ZZ, 4, 10, range(40)); A
[ 0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 23 24 25 26 27 28 29]
[30 31 32 33 34 35 36 37 38 39]
sage: E = elementary_matrix(4, row1=1, row2=3); E
[1 0 0 0]
[0 0 0 1]
[0 0 1 0]
[0 1 0 0]
sage: E*A
[ 0 1 2 3 4 5 6 7 8 9]
[30 31 32 33 34 35 36 37 38 39]
[20 21 22 23 24 25 26 27 28 29]
[10 11 12 13 14 15 16 17 18 19]
sage: E = elementary_matrix(4, row1=2, scale=10); E
[ 1 0 0 0]
[ 0 1 0 0]
[ 0 0 10 0]
[ 0 0 0 1]
sage: E*A
[ 0 1 2 3 4 5 6 7 8 9]
[ 10 11 12 13 14 15 16 17 18 19]
[200 210 220 230 240 250 260 270 280 290]
[ 30 31 32 33 34 35 36 37 38 39]
sage: E = elementary_matrix(4, row1=2, row2=1, scale=10); E
[ 1 0 0 0]
[ 0 1 0 0]
[ 0 10 1 0]
[ 0 0 0 1]
sage: E*A
[ 0 1 2 3 4 5 6 7 8 9]
[ 10 11 12 13 14 15 16 17 18 19]
[120 131 142 153 164 175 186 197 208 219]
[ 30 31 32 33 34 35 36 37 38 39]
Over the rationals, now as column operations. Recall that row and column numbering begins at zero. Checks now have the elementary matrix on the right.
sage: A = matrix(QQ, 5, 4, range(20)); A
[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]
[16 17 18 19]
sage: E = elementary_matrix(QQ, 4, col1=1, col2=3); E
[1 0 0 0]
[0 0 0 1]
[0 0 1 0]
[0 1 0 0]
sage: A*E
[ 0 3 2 1]
[ 4 7 6 5]
[ 8 11 10 9]
[12 15 14 13]
[16 19 18 17]
sage: E = elementary_matrix(QQ, 4, col1=2, scale=1/2); E
[ 1 0 0 0]
[ 0 1 0 0]
[ 0 0 1/2 0]
[ 0 0 0 1]
sage: A*E
[ 0 1 1 3]
[ 4 5 3 7]
[ 8 9 5 11]
[12 13 7 15]
[16 17 9 19]
sage: E = elementary_matrix(QQ, 4, col1=2, col2=1, scale=10); E
[ 1 0 0 0]
[ 0 1 10 0]
[ 0 0 1 0]
[ 0 0 0 1]
sage: A*E
[ 0 1 12 3]
[ 4 5 56 7]
[ 8 9 100 11]
[ 12 13 144 15]
[ 16 17 188 19]
An elementary matrix is always nonsingular. Then repeated row operations can be represented by products of elementary matrices, and this product is again nonsingular. If row operations are to preserve fundamental properties of a matrix (like rank), we do not allow scaling a row by zero. Similarly, the corresponding elementary matrix is not constructed. Also, we do not allow adding a multiple of a row to itself, since this could also lead to a new zero row.
sage: A = matrix(QQ, 4, 10, range(40)); A
[ 0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 23 24 25 26 27 28 29]
[30 31 32 33 34 35 36 37 38 39]
sage: E1 = elementary_matrix(QQ, 4, row1=0, row2=1)
sage: E2 = elementary_matrix(QQ, 4, row1=3, row2=0, scale=100)
sage: E = E2*E1
sage: E.is_singular()
False
sage: E*A
[ 10 11 12 13 14 15 16 17 18 19]
[ 0 1 2 3 4 5 6 7 8 9]
[ 20 21 22 23 24 25 26 27 28 29]
[1030 1131 1232 1333 1434 1535 1636 1737 1838 1939]
sage: E3 = elementary_matrix(QQ, 4, row1=3, scale=0)
Traceback (click to the left of this block for traceback)
...
Now, we can "get back" to the matrix $A$ because elementary matrices are invertible,
$$G \cdot F \cdot E \cdot A = U \;\rightarrow\; A = E^{-1} \cdot F^{-1} \cdot G^{-1} \cdot U,$$
show("E^-1 =",E.inverse())
show("F^-1 =",F.inverse())
show("G^-1=",G.inverse())
$$E^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad F^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}, \qquad G^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}$$
E.inverse()*F.inverse()*G.inverse()*U==A
True
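The same chain of elementary matrices can be reproduced without Sage. The sketch below uses plain Python lists; the helper names `identity`, `add_multiple` and `matmul` are illustrative, not Sage functions. `add_multiple(n, target, source, scale)` plays the role of `elementary_matrix(n, row1=target, row2=source, scale=scale)`, and its inverse is the same matrix with the sign of `scale` changed.

```python
def identity(n):
    """n by n identity matrix as a list of rows."""
    return [[int(i == j) for j in range(n)] for i in range(n)]

def add_multiple(n, target, source, scale):
    """Elementary matrix that adds scale*(row source) to row target (rows numbered from 0)."""
    E = identity(n)
    E[target][source] = scale
    return E

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[2, 1, 1], [4, -6, 0], [-2, 7, 2]]
E = add_multiple(3, 1, 0, -2)   # row 1 <- row 1 - 2*row 0
F = add_multiple(3, 2, 0, 1)    # row 2 <- row 2 + row 0
G = add_multiple(3, 2, 1, 1)    # row 2 <- row 2 + row 1
U = matmul(G, matmul(F, matmul(E, A)))

# Undo each step by changing the sign of the multiplier
E_inv = add_multiple(3, 1, 0, 2)
F_inv = add_multiple(3, 2, 0, -1)
G_inv = add_multiple(3, 2, 1, -1)
A_back = matmul(E_inv, matmul(F_inv, matmul(G_inv, U)))
print(U, A_back == A)
```

Note the order: the inverses are applied in reverse, so the last row operation is the first one to be undone.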
Moreover, $E^{-1} \cdot F^{-1} \cdot G^{-1}$ is a lower triangular matrix,
L=E.inverse()*F.inverse()*G.inverse()
show(L)
$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -1 & 1 \end{pmatrix}$$
So, $A$ is equal to a lower triangular matrix times an upper triangular matrix... actually, this is the $LU$ decomposition of $A$, $A = L \cdot U$,
$$\begin{pmatrix} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 & 1 \\ 0 & -8 & -2 \\ 0 & 0 & 1 \end{pmatrix}$$
Observe that in the example:
- The diagonal entries of $U$ are the pivots of the Gaussian process.
- $L$ is lower triangular, with 1s on the diagonal.
First conclusions:
Assume that $A$ is an INVERTIBLE MATRIX that can be row reduced to echelon form with no exchanges of rows. Then $A$ can be written in the form
$$A = L U,$$
where $L$ is a lower triangular matrix with 1s on the diagonal, and $U$ is the echelon form of $A$. The diagonal entries of $U$ are the pivots of the Gaussian process. This factorization is UNIQUE and is called the LU factorization of $A$.
The matrix $L^{-1}$ includes all row operations required to reduce $A$ to its echelon form: $L^{-1} A = U$.
Let's see LU decomposition with SAGE:
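As a point of comparison for the Sage output, here is a minimal plain-Python sketch of the factorization without row exchanges described in the first conclusions. It is not Sage's `LU` method: the function name `lu_no_exchanges` is hypothetical, and the code simply stops when it meets a zero pivot.

```python
from fractions import Fraction

def lu_no_exchanges(M):
    """Return (L, U) with M = L*U, assuming no row exchange is ever needed."""
    n = len(M)
    U = [[Fraction(v) for v in row] for row in M]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if U[k][k] == 0:
            raise ValueError("zero pivot: a row exchange would be needed")
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]      # multiplier of the row operation
            L[i][k] = m                # L records the multipliers below the diagonal
            U[i] = [a - m * b for a, b in zip(U[i], U[k])]
    return L, U

L, U = lu_no_exchanges([[2, 1, 1], [4, -6, 0], [-2, 7, 2]])
print(L, U)
```

The entries of `L` below the diagonal are the multipliers 2, -1 and -1, that is, the opposites of the scales -2, 1 and 1 used in the elementary matrices $E$, $F$ and $G$.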
p,l,u=A.LU()
show(p,l,u) ## SURPRISE: LU factorization with partial pivoting. p is a "permutation matrix": a matrix that produces row exchanges.
$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{2} & 1 & 0 \\ -\frac{1}{2} & 1 & 1 \end{pmatrix} \quad \begin{pmatrix} 4 & -6 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$
ATTENTION, in SAGE:
A=PLU !! and not PA=LU
A.LU?
File: /Applications/SageMath/src/sage/matrix/matrix2.pyx
Type: <type ‘builtin_function_or_method’>
Definition: A.LU(pivot=None, format=’plu’)
Docstring:
Finds a decomposition into a lower-triangular matrix and an upper-triangular matrix.
INPUT:
pivot - pivoting strategy
‘auto’ (default) - see if the matrix entries are ordered (i.e. if they have an absolute value method), and if so, use the partial pivoting strategy. Otherwise, fall back to the nonzero strategy. This is the best choice for general routines that may call this for matrix entries of a variety of types.
‘partial’ - each column is examined for the element with the largest absolute value and the row containing this element is swapped into place.
‘nonzero’ - the first nonzero element in a column is located and the row with this element is used.
format - contents of output, see more discussion below about output.
‘plu’ (default) - a triple; matrices P, L and U such that A = P*L*U.
‘compact’ - a pair; row permutation as a tuple, and the matrices L and U combined into one matrix.
OUTPUT:
Suppose that $A$ is an $m \times n$ matrix, then an $LU$ decomposition is a lower-triangular $m \times m$ matrix $L$ with every diagonal element equal to 1, and an upper-triangular $m \times n$ matrix, $U$ such that the product $LU$, after a permutation of the rows, is then equal to $A$. For the ‘plu’ format the permutation is returned as an $m \times m$ permutation matrix $P$ such that
$$A = PLU$$
It is more common to place the permutation matrix just to the left of $A$. If you desire this version, then use the inverse of $P$ which is computed most efficiently as its transpose.
If the ‘partial’ pivoting strategy is used, then the non-diagonal entries of $L$ will be less than or equal to 1 in absolute value. The ‘nonzero’ pivot strategy may be faster, but the growth of data structures for elements of the decomposition might counteract the advantage.
By necessity, returned matrices have a base ring equal to the fraction field of the base ring of the original matrix.
In the ‘compact’ format, the first returned value is a tuple that is a permutation of the rows of $LU$ that yields $A$. See the doctest for how you might employ this permutation. Then the matrices $L$ and $U$ are merged into one matrix: remove the diagonal of ones in $L$ and the remaining nonzero entries can replace the entries of $U$ beneath the diagonal.
The results are cached, only in the compact format, separately for each pivot strategy called. Repeated requests for the ‘plu’ format will require just a small amount of overhead in each call to bust out the compact format to the three matrices. Since only the compact format is cached, the components of the compact format are immutable, while the components of the ‘plu’ format are regenerated, and hence are mutable.
Notice that while $U$ is similar to row-echelon form and the rows of $U$ span the row space of $A$, the rows of $U$ are not generally linearly independent. Nor are the pivot columns (or rank) immediately obvious. However for rings without specialized echelon form routines, this method is about twice as fast as the generic echelon form routine since it only acts “below the diagonal”, as would be predicted from a theoretical analysis of the algorithms.
Note
This is an exact computation, so limited to exact rings. If you need numerical results, convert the base ring to the field of real double numbers, RDF or the field of complex double numbers, CDF, which will use a faster routine that is careful about numerical subtleties.
ALGORITHM:
“Gaussian Elimination with Partial Pivoting,” Algorithm 21.1 of [TB1997].
EXAMPLES:
Notice the difference in the $L$ matrix as a result of different pivoting strategies. With partial pivoting, every entry of $L$ has absolute value 1 or less.
sage: A = matrix(QQ, [[1, -1, 0, 2, 4, 7, -1],
....: [2, -1, 0, 6, 4, 8, -2],
....: [2, 0, 1, 4, 2, 6, 0],
....: [1, 0, -1, 8, -1, -1, -3],
....: [1, 1, 2, -2, -1, 1, 3]])
sage: P, L, U = A.LU(pivot='partial')
sage: P
[0 0 0 0 1]
[1 0 0 0 0]
[0 0 0 1 0]
[0 0 1 0 0]
[0 1 0 0 0]
sage: L
[ 1 0 0 0 0]
[ 1/2 1 0 0 0]
[ 1/2 1/3 1 0 0]
[ 1 2/3 1/5 1 0]
[ 1/2 -1/3 -2/5 0 1]
sage: U
[ 2 -1 0 6 4 8 -2]
[ 0 3/2 2 -5 -3 -3 4]
[ 0 0 -5/3 20/3 -2 -4 -10/3]
[ 0 0 0 0 2/5 4/5 0]
[ 0 0 0 0 1/5 2/5 0]
sage: A == P*L*U
True
sage: P, L, U = A.LU(pivot='nonzero')
sage: P
[1 0 0 0 0]
[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 0]
[0 0 0 0 1]
sage: L
[ 1 0 0 0 0]
[ 2 1 0 0 0]
[ 2 2 1 0 0]
[ 1 1 -1 1 0]
[ 1 2 2 0 1]
sage: U
[ 1 -1 0 2 4 7 -1]
[ 0 1 0 2 -4 -6 0]
[ 0 0 1 -4 2 4 2]
[ 0 0 0 0 1 2 0]
[ 0 0 0 0 -1 -2 0]
sage: A == P*L*U
True
An example of the compact format.
sage: B = matrix(QQ, [[ 1, 3, 5, 5],
....: [ 1, 4, 7, 8],
....: [-1, -4, -6, -6],
....: [ 0, -2, -5, -8],
....: [-2, -6, -6, -2]])
sage: perm, M = B.LU(format='compact')
sage: perm
(4, 3, 0, 1, 2)
sage: M
[ -2 -6 -6 -2]
[ 0 -2 -5 -8]
[-1/2 0 2 4]
[-1/2 -1/2 3/4 0]
[ 1/2 1/2 -1/4 0]
We can easily illustrate the relationships between the two formats with a square matrix.
sage: C = matrix(QQ, [[-2, 3, -2, -5],
....: [ 1, -2, 1, 3],
....: [-4, 7, -3, -8],
....: [-3, 8, -1, -5]])
sage: P, L, U = C.LU(format='plu')
sage: perm, M = C.LU(format='compact')
sage: (L - identity_matrix(4)) + U == M
True
sage: p = [perm[i]+1 for i in range(len(perm))]
sage: PP = Permutation(p).to_matrix()
sage: PP == P
True
For a nonsingular matrix, and the ‘nonzero’ pivot strategy there is no need to permute rows, so the permutation matrix will be the identity. Furthermore, it can be shown that then the $L$ and $U$ matrices are uniquely determined by requiring $L$ to have ones on the diagonal.
sage: D = matrix(QQ, [[ 1, 0, 2, 0, -2, -1],
....: [ 3, -2, 3, -1, 0, 6],
....: [-4, 2, -3, 1, -1, -8],
....: [-2, 2, -3, 2, 1, 0],
....: [ 0, -1, -1, 0, 2, 5],
....: [-1, 2, -4, -1, 5, -3]])
sage: P, L, U = D.LU(pivot='nonzero')
sage: P
[1 0 0 0 0 0]
[0 1 0 0 0 0]
[0 0 1 0 0 0]
[0 0 0 1 0 0]
[0 0 0 0 1 0]
[0 0 0 0 0 1]
sage: L
[ 1 0 0 0 0 0]
[ 3 1 0 0 0 0]
[ -4 -1 1 0 0 0]
[ -2 -1 -1 1 0 0]
[ 0 1/2 1/4 1/2 1 0]
[ -1 -1 -5/2 -2 -6 1]
sage: U
[ 1 0 2 0 -2 -1]
[ 0 -2 -3 -1 6 9]
[ 0 0 2 0 -3 -3]
[ 0 0 0 1 0 4]
[ 0 0 0 0 -1/4 -3/4]
[ 0 0 0 0 0 1]
sage: D == L*U
True
The base ring of the matrix may be any field, or a ring which has a fraction field implemented in Sage. The ring needs to be exact (there is a numerical LU decomposition for matrices over RDF and CDF). Matrices returned are over the original field, or the fraction field of the ring. If the field is not ordered (i.e. the absolute value function is not implemented), then the pivot strategy needs to be ‘nonzero’.
sage: A = matrix(RealField(100), 3, 3, range(9))
sage: P, L, U = A.LU()
Traceback (click to the left of this block for traceback)
...
We can find elimination with partial pivoting in Section 1.7 of the book:
A small pivot forces a practical change in elimination. Normally we compare each pivot with all possible pivots in the same column. Exchanging rows to obtain the largest possible pivot is called partial pivoting.
p.inverse()*A==l*u
True
show("p_inv . A =",p.inverse()*A,",A=",A)
$$p^{-1} \cdot A = \begin{pmatrix} 4 & -6 & 0 \\ 2 & 1 & 1 \\ -2 & 7 & 2 \end{pmatrix}, \qquad A = \begin{pmatrix} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{pmatrix}$$
A.LU(pivot='nonzero')
(
[1 0 0]  [ 1  0  0]  [ 2  1  1]
[0 1 0]  [ 2  1  0]  [ 0 -8 -2]
[0 0 1], [-1 -1  1], [ 0  0  1]
)
EXERCISE: compute the echelon form $U$ of the following matrix $A$ by multiplying by the required elementary matrices. Afterwards, compute its $LU$ decomposition.
A=matrix(QQ,[[1,1,1],[1,2,2],[1,2,3]]) #example 3, section 1.5
show(A)
$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{pmatrix}$$
LU factorization with row exchanges and permutation matrices.
Consider now Example 7 of Section 1.5,
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 3 \\ 2 & 5 & 8 \end{pmatrix}:$$
A=matrix(QQ,[[1,1,1],[1,1,3],[2,5,8]])
A
[1 1 1]
[1 1 3]
[2 5 8]
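In the Gaussian process,
E=elementary_matrix(QQ, 3, row1=1, row2=0,scale=-1)
show(E)
$$\begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
A2=E*A
show("E.A=",A2)
$$E \cdot A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 2 \\ 2 & 5 & 8 \end{pmatrix}$$
P=elementary_matrix(QQ, 3, row1=1, row2=2)
A3=P*E*A
show("P.E.A=",A3)
$$P \cdot E \cdot A = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 5 & 8 \\ 0 & 0 & 2 \end{pmatrix}$$
F=elementary_matrix(QQ, 3, row1=1, row2=0,scale=-2)
show(F)
$$\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
A4=F*P*E*A
show("F.P.E.A=",A4)
$$F \cdot P \cdot E \cdot A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 3 & 6 \\ 0 & 0 & 2 \end{pmatrix}$$
BUT F*P*E IS NOT A LOWER TRIANGULAR MATRIX, and neither is its inverse:
show(F*P*E)
$$\begin{pmatrix} 1 & 0 & 0 \\ -2 & 0 & 1 \\ -1 & 1 & 0 \end{pmatrix}$$
(F*P*E).inverse()
[1 0 0]
[1 0 1]
[2 1 0]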
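But with the rows reordered in advance, there exists a permutation matrix $P$ such that $PA$ can be factored into $LU$.
In this example, we first multiply by P and then compute the LU factorization.
newA=P*A
show(newA)
$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 5 & 8 \\ 1 & 1 & 3 \end{pmatrix}$$
E=elementary_matrix(QQ, 3, row1=1, row2=0,scale=-2)
E*newA
[1 1 1]
[0 3 6]
[1 1 3]
F=elementary_matrix(QQ, 3, row1=2, row2=0,scale=-1)
F*E*newA
[1 1 1]
[0 3 6]
[0 0 2]
U=F*E*newA
show("U=",U)
L=(F*E).inverse()
show("L=",L)
$$U = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 3 & 6 \\ 0 & 0 & 2 \end{pmatrix}, \qquad L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$$
show("LU=",L*U)
show(newA)
$$LU = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 5 & 8 \\ 1 & 1 & 3 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 & 1 \\ 2 & 5 & 8 \\ 1 & 1 & 3 \end{pmatrix}$$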
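The reordering can also be discovered during elimination instead of in advance. The sketch below is a plain-Python illustration of the 'nonzero' pivoting idea (take the first nonzero entry in the column and swap it into place); it is not Sage's implementation. The name `plu_nonzero` is hypothetical, and the permutation is returned as a list `perm` of row indices, so that the rows of the input taken in the order `perm` equal $LU$ (the $PA = LU$ convention).

```python
from fractions import Fraction

def plu_nonzero(M):
    """Return (perm, L, U) such that the rows of M in the order perm equal L*U."""
    n = len(M)
    U = [[Fraction(v) for v in row] for row in M]
    L = [[Fraction(0)] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        pivot = next((r for r in range(k, n) if U[r][k] != 0), None)
        if pivot is None:
            raise ValueError("no pivot in column %d: the matrix is singular" % k)
        if pivot != k:
            # exchange rows in U, in the multipliers already stored, and in perm
            U[k], U[pivot] = U[pivot], U[k]
            L[k], L[pivot] = L[pivot], L[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]
            L[i][k] = m
            U[i] = [a - m * b for a, b in zip(U[i], U[k])]
    for i in range(n):
        L[i][i] = Fraction(1)       # ones on the diagonal of L
    return perm, L, U

A = [[1, 1, 1], [1, 1, 3], [2, 5, 8]]
perm, L, U = plu_nonzero(A)
PA = [A[i] for i in perm]           # the rows of A reordered, i.e. P*A
print(perm, L, U)
```

For this matrix the permutation exchanges the last two rows, and `L` and `U` coincide with the matrices obtained above after multiplying by $P$ first.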
Conclusion: Assume that $A$ is an INVERTIBLE MATRIX that can be row reduced to echelon form with exchanges of rows. Then there exists a permutation matrix $P$ such that $PA$ can be written in the form
$$P A = L U.$$
What happens if $A$ is square but singular?
By applying Gauss, if $r$ is the rank, the last $n - r$ rows of the echelon form of $A$ will be zero rows. So, we can find a permutation matrix $P$ such that $PA$ can be row reduced to echelon form with no exchanges of rows,
$$P A = L U,$$
but now $U$ is singular.
A=matrix(QQ,[[2,5,8],[2,2,2],[1,1,1]])
p,l,u=A.LU()
show(l,u)
$$\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ \frac{1}{2} & \frac{1}{2} & 1 \end{pmatrix} \quad \begin{pmatrix} 2 & 5 & 8 \\ 0 & -3 & -6 \\ 0 & 0 & 0 \end{pmatrix}$$
show(l*u,A)
$$\begin{pmatrix} 2 & 5 & 8 \\ 2 & 2 & 2 \\ 1 & 1 & 1 \end{pmatrix} \quad \begin{pmatrix} 2 & 5 & 8 \\ 2 & 2 & 2 \\ 1 & 1 & 1 \end{pmatrix}$$
A=matrix(QQ,[[2,2,2],[1,1,1],[2,5,8]])
A.LU()
(
[1 0 0]  [  1   0   0]  [2 2 2]
[0 0 1]  [  1   1   0]  [0 3 6]
[0 1 0], [1/2   0   1], [0 0 0]
)
What happens if $A$ is not square, $m \times n$?
We can find a permutation matrix $P$ such that $PA$ can be row reduced to echelon form with no exchanges of rows,
$$P A = L U,$$
but now $U$ is not square, and $L$ is square, invertible, $m$ by $m$.
A=matrix(QQ,[[2,5,8,1],[2,2,2,1],[1,1,1,1]])
p,l,u=A.LU()
show(p,l,u)
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ \frac{1}{2} & \frac{1}{2} & 1 \end{pmatrix} \quad \begin{pmatrix} 2 & 5 & 8 & 1 \\ 0 & -3 & -6 & 0 \\ 0 & 0 & 0 & \frac{1}{2} \end{pmatrix}$$
Application: One Linear System = Two Triangular Systems
If $A = LU$, then the system $Ax = b$ can be written as $L(Ux) = b$. So, we first solve the following triangular system
$$L y = b,$$
and then the other one:
$$U x = y.$$
If $PA = LU$, then we apply the same idea to $PAx = Pb$ (we only permute the equations).
EXAMPLE:
$$2x + y + 5z = -4, \qquad 2x + 4z = 0, \qquad -x - z = 3$$
A=matrix(QQ,[[2,1,5],[2,0,4],[-1,0,-1]]);
b=column_matrix(QQ,[[-4,0,3]]);
show(A,b)
$$\begin{pmatrix} 2 & 1 & 5 \\ 2 & 0 & 4 \\ -1 & 0 & -1 \end{pmatrix} \quad \begin{pmatrix} -4 \\ 0 \\ 3 \end{pmatrix}$$
p,l,u=A.LU(pivot='nonzero')
show(p,l,u)
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ -\frac{1}{2} & -\frac{1}{2} & 1 \end{pmatrix} \quad \begin{pmatrix} 2 & 1 & 5 \\ 0 & -1 & -1 \\ 0 & 0 & 1 \end{pmatrix}$$
A==l*u
True
First, we solve $Ly = b$
l.augment(b,subdivide=True)
[ 1 0 0| -4]
[ 1 1 0| 0]
[-1/2 -1/2 1| 3]
l.augment(b,subdivide=True).echelon_form()
[ 1 0 0|-4]
[ 0 1 0| 4]
[ 0 0 1| 3]
so, $y = (-4, 4, 3)$. Next, we solve $Ux = y$
u.augment(column_matrix((-4,4,3)),subdivide=True)
[ 2 1 5|-4]
[ 0 -1 -1| 4]
[ 0 0 1| 3]
u.augment(column_matrix((-4,4,3)),subdivide=True).echelon_form()
[ 1 0 0|-6]
[ 0 1 0|-7]
[ 0 0 1| 3]
So, the solution of the system is x=(-6,-7,3):
solve([2*x+y+5*z+4,2*x+4*z,-x-z-3],x,y,z)
[[x == -6, y == -7, z == 3]]
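The two triangular solves do not need a general echelon-form routine: $Ly = b$ can be solved from the top down (forward substitution) and $Ux = y$ from the bottom up (back substitution). The sketch below is a plain-Python illustration with hypothetical helper names `forward_substitution` and `back_substitution`; it assumes the triangular matrices have nonzero diagonal entries and reuses the $L$, $U$ and $b$ of the example.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L*y = b for a lower triangular L with nonzero diagonal."""
    y = []
    for i, row in enumerate(L):
        known = sum(row[j] * y[j] for j in range(i))   # part already determined
        y.append((Fraction(b[i]) - known) / row[i])
    return y

def back_substitution(U, y):
    """Solve U*x = y for an upper triangular U with nonzero diagonal."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - known) / U[i][i]
    return x

L = [[1, 0, 0], [1, 1, 0], [Fraction(-1, 2), Fraction(-1, 2), 1]]
U = [[2, 1, 5], [0, -1, -1], [0, 0, 1]]
b = [-4, 0, 3]

y = forward_substitution(L, b)
x = back_substitution(U, y)
print(y, x)
```

Once $L$ and $U$ are known, each new right-hand side $b$ only costs these two substitutions, which is the practical reason for factoring $A$ first.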