Scholars' Mine
Masters Theses, Student Theses and Dissertations
1964
The effect of matrix condition in the solution of a system of linear algebraic equations.
Herbert R. Alcorn
Follow this and additional works at: https://scholarsmine.mst.edu/masters_theses
Part of the Applied Mathematics Commons
Recommended Citation
Alcorn, Herbert R., "The effect of matrix condition in the solution of a system of linear algebraic equations." (1964). Masters Theses. 5642. https://scholarsmine.mst.edu/masters_theses/5642
This thesis is brought to you by Scholars' Mine, a service of the Missouri S&T Library and Learning Resources. This work is protected by U. S. Copyright Law. Unauthorized use including reproduction for redistribution requires the permission of the copyright holder.
THE EFFECT OF MATRIX CONDITION IN THE SOLUTION OF A SYSTEM OF
LINEAR ALGEBRAIC EQUATIONS
BY
HERBERT RICHARD ALCORN
A
THESIS
submitted to the faculty of the
UNIVERSITY OF MISSOURI AT ROLLA
in partial fulfillment of the requirements for the
Degree of
MASTER OF SCIENCE, APPLIED MATHEMATICS
Rolla, Missouri
1964
Approved by
(advisor)
ABSTRACT
The solution of a system of linear non-homogeneous
equations may contain errors which originate from many
sources. A system of linear equations in which small
changes in the coefficients cause large changes in the
solution is unstable and the coefficient matrix is
ill-conditioned.
The purpose of this study is to define several measures
of matrix condition and to test them by correlation with a
measure of the actual errors introduced into a system of
equations.
The study indicates that three of the five measures
of condition tested were reliable indices of the magnitude
of error to expect in the solution of a system of linear
equations.
ACKNOWLEDGMENT
The author wishes to express his sincere appreciation
to Professor Ralph E. Lee, Director of the Computer Science
Center, for his help in the selection of this subject and
for guidance and supervision during the investigation.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGMENT
I. INTRODUCTION
II. REVIEW OF LITERATURE
III. DISCUSSION
IV. CONCLUSIONS
APPENDIX
BIBLIOGRAPHY
VITA
I. INTRODUCTION
Systems of linear non-homogeneous equations arise from
many sources: physical problems, numerical solution of
ordinary and partial differential equations, curve fitting,
data reduction, solution of the eigenvalue problem, and many
others.
There are two categories of numerical solutions for
systems of linear algebraic equations: exact and iterative
methods. The exact method is one which will complete the
solution in a known, finite number of basic arithmetic
operations. An iterative solution is a means of determining
an approximate solution to the system. Many of the
conditions which affect the solution of the system of
equations by the exact method also affect the solution by an
iterative technique; however, only the exact method will be
used or considered in this investigation.
Errors in the solution of a system of linear equations
may arise from several sources. The need to round off
numbers during the computation and the disappearance of
significant figures due to the subtraction of two nearly
equal quantities both contribute to error in the solution.
Also, due to physical limitations the coefficients of the
equations may only be known to some degree of accuracy.
A system of linear equations in which small changes in
the coefficient matrix cause large changes in the solution
is unstable and shall be defined as ill-conditioned. It is
the purpose of this study to define several measures of
matrix condition and to test them by correlation with the
effect of actual errors introduced into a system of
equations.
II. REVIEW OF LITERATURE
There is extensive literature pertaining to the subject
of simultaneous linear equations and to the difficulties in
solving them. Consequently, this survey of the literature
will be presented in three parts.
A. Sources of error in the solution
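D. K. Faddeev and V. N. Faddeeva (1)* have shown the
error in an element of the inverse of a matrix to be a
function of the magnitude of the elements of the inverse
matrix and the errors in the original matrix. From the
identity

$$A A^{-1} = I ,$$

upon taking the partial derivative with respect to the
element of A in the i-th row and the j-th column, it follows
that

$$\frac{\partial A}{\partial a_{ij}} A^{-1} + A \frac{\partial A^{-1}}{\partial a_{ij}} = 0 ,$$

from which

$$\frac{\partial A^{-1}}{\partial a_{ij}} = -A^{-1} \frac{\partial A}{\partial a_{ij}} A^{-1} . \tag{2.01}$$

*All numbers (x) refer to the bibliography while the numbers (x.y) refer to equations.

Here

$$\frac{\partial A}{\partial a_{ij}} = e_{ij} ,$$

where $e_{ij}$ is a zero matrix except for the element in the
i-th row and j-th column, which is equal to unity. Using this
definition,

$$\frac{\partial A^{-1}}{\partial a_{ij}} = -A^{-1} e_{ij} A^{-1} ; \tag{2.02}$$

however, $e_{ij}$ may be expressed as the product of two vectors,

$$e_{ij} = e_i e_j^T ,$$

where $e_i$ is the column vector whose i-th element is unity
and whose other elements are zero, and $e_j^T$ is the row
vector whose j-th element is unity and whose other elements
are zero. Consequently

$$\frac{\partial A^{-1}}{\partial a_{ij}} = -A^{-1} e_i e_j^T A^{-1} ,$$

and letting

$$A^{-1} = [\alpha_{kr}] ,$$

then

$$\frac{\partial A^{-1}}{\partial a_{ij}}
= -\begin{bmatrix} \alpha_{1i} \\ \alpha_{2i} \\ \vdots \\ \alpha_{ni} \end{bmatrix}
\begin{bmatrix} \alpha_{j1} & \alpha_{j2} & \cdots & \alpha_{jn} \end{bmatrix}
= -\begin{bmatrix}
\alpha_{1i}\alpha_{j1} & \alpha_{1i}\alpha_{j2} & \cdots & \alpha_{1i}\alpha_{jn} \\
\alpha_{2i}\alpha_{j1} & \alpha_{2i}\alpha_{j2} & \cdots & \alpha_{2i}\alpha_{jn} \\
\vdots & \vdots & & \vdots \\
\alpha_{ni}\alpha_{j1} & \alpha_{ni}\alpha_{j2} & \cdots & \alpha_{ni}\alpha_{jn}
\end{bmatrix}$$

and therefore

$$\frac{\partial \alpha_{kr}}{\partial a_{ij}} = -\alpha_{ki}\alpha_{jr} . \tag{2.03}$$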
Equation (2.03) shows that the change in each element of the
inverse of A produced by a change in an element of A is the
product of this change and two elements of the inverse. Thus
if the inverse contains some large elements, a small or
insignificant change in an element of the matrix can result
in large deviations in certain elements of the inverse.
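Equation (2.03) can be checked numerically. The following is a minimal sketch, not part of the original study: it compares a finite-difference quotient of the inverse with the product of two inverse elements predicted by (2.03). The 3 x 3 test matrix, the step size h, and the helper names `inverse` and `max_fd_error` are all assumptions chosen for illustration; exact rational arithmetic is used so that round-off does not obscure the comparison.

```python
from fractions import Fraction

def inverse(a):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(a)
    # Augment A with the identity matrix, using exact rational numbers.
    m = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(a)]
    for col in range(n):
        # Any non-zero pivot will do because the arithmetic is exact.
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

def max_fd_error(a, i, j, h):
    """Largest gap, over all (k, r), between the finite-difference quotient
    of the inverse with respect to a[i][j] and -alpha[k][i] * alpha[j][r]."""
    n = len(a)
    alpha = inverse(a)
    bumped = [list(row) for row in a]
    bumped[i][j] += h
    alpha_h = inverse(bumped)
    return max(abs((alpha_h[k][r] - alpha[k][r]) / h + alpha[k][i] * alpha[j][r])
               for k in range(n) for r in range(n))

# Hypothetical test matrix (not taken from the thesis); its determinant is 43.
A = [[4, 1, 2],
     [1, 3, 0],
     [2, 0, 5]]
h = Fraction(1, 10**6)
print(float(max_fd_error(A, 0, 2, h)))
```

The printed gap is of the order of h itself, which is what a first-order relation such as (2.03) would lead one to expect.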
Now taking into account all of the elements of the
matrix A in which changes will affect the element in the
k-th row and r-th column of the inverse:

$$d\alpha_{kr} = -\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_{ki}\,\alpha_{jr}\, da_{ij} . \tag{2.04}$$
From this relationship (2.04) it can be observed that the
k,r-th element of the inverse is affected by each error in A,
by the magnitude of the elements of the k-th row of $A^{-1}$,
and by the magnitude of the elements of the r-th column of
$A^{-1}$. Of course, there are cases when errors due to changes
in different elements of the matrix may combine so as to
compensate for each other.
A system of linear equations with an unstable or
ill-conditioned coefficient matrix would be unstable, since the
solution would be greatly affected by changes in the constant
vector as well as in the coefficient matrix. The extent of
this instability has been noted by Hildebrand (2), Faddeev
and Faddeeva (1), and others. Let

$$Ax = b \tag{2.05}$$

be a system of linear equations; then

$$x = A^{-1} b \tag{2.06}$$

and as before, using equations (2.01), (2.02), and (2.06),

$$\frac{\partial x}{\partial a_{ij}}
= \frac{\partial A^{-1}}{\partial a_{ij}}\, b
= -A^{-1} \frac{\partial A}{\partial a_{ij}} A^{-1} b
= -A^{-1} e_{ij}\, x
= -A^{-1} e_i e_j^T x
= -\begin{bmatrix} \alpha_{1i} \\ \alpha_{2i} \\ \vdots \\ \alpha_{ni} \end{bmatrix} x_j$$

and

$$\frac{\partial x}{\partial a_{ij}}
= -\begin{bmatrix} \alpha_{1i} x_j \\ \alpha_{2i} x_j \\ \vdots \\ \alpha_{ni} x_j \end{bmatrix}$$

from which

$$\frac{\partial x_k}{\partial a_{ij}} = -\alpha_{ki}\, x_j . \tag{2.07}$$
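Similarly from equation (2.06)

$$\frac{\partial x}{\partial b_i} = A^{-1} \frac{\partial b}{\partial b_i} = A^{-1} e_i
= \begin{bmatrix} \alpha_{1i} \\ \alpha_{2i} \\ \vdots \\ \alpha_{ni} \end{bmatrix}$$

from which it follows that

$$\frac{\partial x_k}{\partial b_i} = \alpha_{ki} . \tag{2.08}$$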
From equations (2.07) and (2.08) it can be seen that if the
inverse has large elements, then a small change in either the
coefficients or in the constants can cause significant errors
in the result.
From equations (2.07) and (2.08) an expression can be
obtained which takes into account all of the sources of
error in x:

$$dx_k = -\sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_{ki}\, x_j\, da_{ij} + \sum_{i=1}^{n} \alpha_{ki}\, db_i ,$$

and by rearranging,

$$dx_k = \sum_{i=1}^{n} \left[ \alpha_{ki}\, db_i - \sum_{j=1}^{n} \alpha_{ki}\, x_j\, da_{ij} \right]
= \sum_{i=1}^{n} \alpha_{ki} \left[ db_i - \sum_{j=1}^{n} x_j\, da_{ij} \right] .$$

Hildebrand (2) writes this as

$$\delta x_k = \sum_{i=1}^{n} \alpha_{ki}\, \eta_i \tag{2.09}$$

where

$$\eta_i = \delta b_i - (x_1\, \delta a_{i1} + x_2\, \delta a_{i2} + \cdots + x_n\, \delta a_{in}) . \tag{2.10}$$

Equation (2.09) may be written as

$$\delta x = A^{-1} \eta , \quad \text{where } \eta = [\eta_i] ,$$

or

$$A\, \delta x = \eta \tag{2.11}$$

which may be solved simultaneously with equation (2.05) by
augmenting matrix x with matrix $\delta x$ and augmenting matrix b
with matrix $\eta$, or shown in partitioned form:

$$A\, [\, x \mid \delta x \,] = [\, b \mid \eta \,] .$$
In practice, it is usually known only that the errors
$\delta a_{ij}$ and $\delta b_i$ do not exceed some known magnitude,
$\epsilon$; thus

$$-\epsilon \le \delta a_{ij} \le \epsilon \quad \text{and} \quad -\epsilon \le \delta b_i \le \epsilon . \tag{2.12}$$

From equation (2.10) it is certain that

$$|\eta_i| \le E \tag{2.13}$$

where

$$E = (1 + |x_1| + |x_2| + \cdots + |x_n|)\,\epsilon ; \tag{2.14}$$

and from equation (2.09), it follows that

$$|\delta x_k| \le E \sum_{i=1}^{n} |\alpha_{ki}| . \tag{2.15}$$

Thus the error in $x_k$ is related to the sum of the absolute
values of the k-th row of the inverse of the coefficient
matrix and the quantity E (2.14).
Scarborough (3) illustrates this error analysis by
considering the system of equations

$$Ax = b$$

where

$$A = \begin{bmatrix} 1.22 & -1.32 & 3.96 \\ 2.12 & -3.52 & 1.62 \\ 4.23 & -1.21 & 1.09 \end{bmatrix}$$

and

$$b = \begin{bmatrix} 2.12 \\ -1.26 \\ 3.22 \end{bmatrix}$$

in which all elements have been rounded to the number of
digits given; hence $\epsilon = 0.005$. The solution of the system
is

$$x = \begin{bmatrix} 0.94385 \\ 1.22724 \\ 0.65365 \end{bmatrix}$$

with

$$A^{-1} = \begin{bmatrix} -0.04631 & -0.08274 & 0.29123 \\ 0.11209 & -0.38058 & 0.15841 \\ 0.30416 & -0.10137 & -0.03692 \end{bmatrix} .$$

Using equation (2.14)

$$E = \left(1 + \sum_{i=1}^{3} |x_i|\right)\epsilon = 0.0191237$$

and from equation (2.15)

$$|\delta x_1| \le \sum_{i=1}^{3} |\alpha_{1i}|\, E = 0.0080$$

$$|\delta x_2| \le \sum_{i=1}^{3} |\alpha_{2i}|\, E = 0.0125$$

$$|\delta x_3| \le \sum_{i=1}^{3} |\alpha_{3i}|\, E = 0.0085 .$$

Thus it is evident that the solution of the true system of
equations is such that

$$0.936 \le x_1 \le 0.952$$
$$1.215 \le x_2 \le 1.240$$
$$0.645 \le x_3 \le 0.662$$

which could be written as

$$x_1 = 0.94 , \quad x_2 = 1.23 , \quad x_3 = 0.65$$

with the last digit in doubt by 1 unit in each case.
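The arithmetic of this example can be reproduced with a short program. The sketch below is an illustration only, assuming ordinary double-precision arithmetic and a hand-written Gauss-Jordan routine (the name `gauss_jordan` is hypothetical and is not the thesis's subroutine); it recomputes the solution, the inverse, the quantity E of (2.14), and the bounds of (2.15).

```python
def gauss_jordan(a, rhs):
    """Solve A X = RHS by Gauss-Jordan elimination with partial pivoting.
    `a` is n x n and `rhs` is n x m; X is returned as an n x m list of lists."""
    n = len(a)
    m = [list(map(float, a[i])) + list(map(float, rhs[i])) for i in range(n)]
    for col in range(n):
        # Pivot on the element of largest magnitude in the current column.
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

A = [[1.22, -1.32, 3.96],
     [2.12, -3.52, 1.62],
     [4.23, -1.21, 1.09]]
b = [2.12, -1.26, 3.22]
eps = 0.005                      # rounding bound on every coefficient

n = len(A)
identity = [[float(i == j) for j in range(n)] for i in range(n)]
A_inv = gauss_jordan(A, identity)
x = [row[0] for row in gauss_jordan(A, [[v] for v in b])]

E = (1.0 + sum(abs(v) for v in x)) * eps                      # equation (2.14)
bounds = [E * sum(abs(v) for v in row) for row in A_inv]      # equation (2.15)

for k in range(n):
    print(f"x{k + 1} = {x[k]:.5f}, bound on error = {bounds[k]:.4f}")
```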
B. Some examples of ill-conditioning
There are numerous examples of ill-conditioned systems
of linear equations in the literature; the following were
selected to show the effects of this instability.
Turing (4) develops an ill-conditioned system of
equations from a well conditioned one in the following
manner. Consider the system of equations

$$\begin{aligned} 1.4x + 0.9y &= 2.7 \\ -0.8x + 1.7y &= -1.2 \end{aligned} \tag{2.16}$$

and form from them another set by adding one-hundredth of
the first to the second, to give a new equation to replace
the first:

$$\begin{aligned} -0.786x + 1.709y &= -1.173 \\ -0.800x + 1.700y &= -1.200 . \end{aligned} \tag{2.17}$$

The second set is fully equivalent to the first (2.16), but
a numerical solution of the second set involving round-off
errors is quite certain to yield a less accurate solution.
The solution to either set of equations is

$$x = 1.82903 , \quad y = 0.15484 .$$

Now, modify each set of equations slightly by adding 0.001
to the coefficient of y in the second equation; from
equations (2.16)

$$\begin{aligned} 1.400x + 0.900y &= 2.700 \\ -0.800x + 1.701y &= -1.200 , \end{aligned}$$

and from (2.17)

$$\begin{aligned} -0.786x + 1.709y &= -1.173 \\ -0.800x + 1.701y &= -1.200 . \end{aligned}$$

The solutions to these two sets of equations are respectively:

$$x = 1.82908 , \quad y = 0.15477$$

and

$$x = 1.83779 , \quad y = 0.15887 .$$

Compared with the solution to the original set of equations,
the first digit to differ is the fifth decimal of x and the
fourth decimal of y for the first set, but the second decimal
of x and the third decimal of y for the second set; it is
clear that the second set was more sensitive to a small change
in the coefficient matrix than was the first. The set of
equations (2.17) was described as ill-conditioned, or at any
rate, ill-conditioned with respect to the first system (2.16).
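The four 2 x 2 systems above are small enough to solve by Cramer's rule. The following sketch, which assumes double-precision arithmetic and uses a hypothetical helper `solve2`, reproduces the quoted solutions and compares how far x moves in each case.

```python
def solve2(a11, a12, a21, a22, b1, b2):
    """Solve a 2 x 2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

# Original systems (2.16) and (2.17); both have the same solution.
original = solve2(1.4, 0.9, -0.8, 1.7, 2.7, -1.2)
equivalent = solve2(-0.786, 1.709, -0.8, 1.7, -1.173, -1.2)

# The same systems after adding 0.001 to the coefficient of y
# in the second equation.
perturbed_216 = solve2(1.4, 0.9, -0.8, 1.701, 2.7, -1.2)
perturbed_217 = solve2(-0.786, 1.709, -0.8, 1.701, -1.173, -1.2)

shift_216 = abs(perturbed_216[0] - original[0])
shift_217 = abs(perturbed_217[0] - original[0])

print("original      :", original)
print("perturbed 2.16:", perturbed_216, "shift in x:", shift_216)
print("perturbed 2.17:", perturbed_217, "shift in x:", shift_217)
```

The shift in x for the perturbed form of (2.17) is more than one hundred times that for the perturbed form of (2.16), although the two unperturbed systems are algebraically equivalent.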
Bodewig (5), Todd (6), and Faddeev and Faddeeva (1) all
borrow T. S. Wilson's well known example in integer numbers:

$$Ax = b$$

where

$$A = \begin{bmatrix} 5 & 7 & 6 & 5 \\ 7 & 10 & 8 & 7 \\ 6 & 8 & 10 & 9 \\ 5 & 7 & 9 & 10 \end{bmatrix} \tag{2.18}$$

and

$$b = \begin{bmatrix} 23 \\ 32 \\ 33 \\ 31 \end{bmatrix}$$

whose solution is obviously

$$x = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$

and whose determinant is det(A) = 1. The ill-condition in
this system of equations is apparent from the inverse

$$A^{-1} = \begin{bmatrix} 68 & -41 & -17 & 10 \\ -41 & 25 & 10 & -6 \\ -17 & 10 & 5 & -3 \\ 10 & -6 & -3 & 2 \end{bmatrix} . \tag{2.19}$$

Now, perturb this system by adding to the first element of
the first row of (2.18) an amount $\epsilon$; then it can be
observed how the determinant of the matrix A is affected. Let

$$A(\epsilon) = \begin{bmatrix} 5 + \epsilon & 7 & 6 & 5 \\ 7 & 10 & 8 & 7 \\ 6 & 8 & 10 & 9 \\ 5 & 7 & 9 & 10 \end{bmatrix} ,$$

then

$$\det A(\epsilon) = 1 + 68\,\epsilon$$

from which it follows that for

$$\epsilon = -\frac{1}{68} \approx -0.015$$

the matrix $A(\epsilon)$ will be singular. Unless the elements of A,
(2.18), are known within 0.02, then for practical purposes
the matrix must be considered singular.
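The dependence of the determinant on the perturbation can be confirmed exactly. This is a minimal sketch using rational arithmetic from the Python standard library; the function names `det` and `perturbed` are illustrative assumptions, not routines from the thesis.

```python
from fractions import Fraction

def det(a):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in a]
    n = len(m)
    d = Fraction(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if m[r][col] != 0), None)
        if piv is None:
            return Fraction(0)      # no pivot available: singular matrix
        if piv != col:
            m[col], m[piv] = m[piv], m[col]
            d = -d                  # a row interchange flips the sign
        d *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return d

WILSON = [[5, 7, 6, 5],
          [7, 10, 8, 7],
          [6, 8, 10, 9],
          [5, 7, 9, 10]]

def perturbed(eps):
    """Wilson's matrix (2.18) with eps added to the first element of row one."""
    a = [list(row) for row in WILSON]
    a[0][0] += eps
    return a

for eps in (Fraction(0), Fraction(1, 100), Fraction(-1, 100), Fraction(-1, 68)):
    # The last two columns printed should agree for every eps.
    print(eps, det(perturbed(eps)), 1 + 68 * eps)
```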
C. Some measures of matrix condition
A condition number of a matrix is a measure of the
stability or condition of that matrix. A large value for a
condition number usually indicates an ill-conditioned matrix.
Several formulas for calculating a measure of condition of
a non-singular matrix have been proposed in the literature.
F. R. Moulton (7) discussed the solution of a system of
linear algebraic equations possessing a small determinant in
1913. He illustrated in his paper a system whose solution
changed appreciably with small changes in the coefficient
matrix; however, he did not describe this system as
ill-conditioned, nor did he propose any measure of
ill-conditioning.
John von Neumann and H. H. Goldstine (8) suggest that a
possible measure of matrix condition is

$$P(A) = \frac{|\lambda(A)|}{|\mu(A)|} \tag{2.20}$$

where $\lambda(A)$ and $\mu(A)$ are the maximum and minimum eigenvalues
respectively of A. John Todd (6), (9), (10), (11), and (12)
formalized this suggested measure of condition and
published a series of papers describing various properties
of P(A). He also applied P(A) to a nonsymmetric matrix B by
letting

$$A = B^T B .$$

Todd (6) uses (2.18) to show the effectiveness of P(A); one
would expect this condition number to be large in view of
the preceding analysis, and in the case of (2.18)

$$P(A) \approx 3000 .$$
A. M. Turing (4) has proposed two additional measures
of condition:

1) $$\text{N-number} = \frac{1}{n} N(A)\, N(A^{-1}) \tag{2.21}$$

where

$$N(A) = \left[ \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2 \right]^{1/2} \tag{2.22}$$

and

2) $$\text{M-number} = n\, M(A)\, M(A^{-1}) \tag{2.23}$$

where

$$M(A) = \max_{i,j} |a_{ij}| . \tag{2.24}$$

Turing adds that there is substantial agreement between
these two measures, although the M-number tends to yield a
larger result, especially with diagonal matrices or
matrices with diagonal dominance.
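As an illustration of (2.21) through (2.24), the two measures can be evaluated for Wilson's matrix (2.18), whose inverse (2.19) is known exactly. The sketch below is not from the original thesis; the helper names `n_norm` and `m_norm` are hypothetical.

```python
import math

A = [[5, 7, 6, 5],
     [7, 10, 8, 7],
     [6, 8, 10, 9],
     [5, 7, 9, 10]]

A_INV = [[68, -41, -17, 10],
         [-41, 25, 10, -6],
         [-17, 10, 5, -3],
         [10, -6, -3, 2]]

def n_norm(m):
    """N(A) of equation (2.22): square root of the sum of squared elements."""
    return math.sqrt(sum(v * v for row in m for v in row))

def m_norm(m):
    """M(A) of equation (2.24): largest element in absolute value."""
    return max(abs(v) for row in m for v in row)

n = len(A)
n_number = n_norm(A) * n_norm(A_INV) / n      # equation (2.21)
m_number = n * m_norm(A) * m_norm(A_INV)      # equation (2.23)

print("N-number =", round(n_number, 1))
print("M-number =", m_number)
```

For this matrix the M-number (2720) is several times the N-number (about 752), in line with Turing's remark that the M-number tends to be the larger of the two.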
E. Bodewig (5) defines the condition number

$$\frac{\prod_{i=1}^{n} a_{ii}}{\det(A)} . \tag{2.25}$$

This should be an effective measure of matrix condition
providing there is diagonal dominance. This condition number
is simpler to calculate than some of the other measures in
that neither the eigenvalues nor the inverse of the matrix
is required.
Andrew D. Booth (13) agrees with von Neumann and
Goldstine that

$$P(A) = \frac{|\lambda(A)|_{\max}}{|\lambda(A)|_{\min}}$$

is an effective measure of the condition of the matrix A.
Unfortunately, the calculation of the eigenvalues is usually
a task of at least the complexity of the solution of the
system of equations themselves, so this measure is not too
practical. He adds that if each equation is normalized by
dividing that equation by

$$\left[ \sum_{j=1}^{n} a_{ij}^2 \right]^{1/2} , \quad i = 1, 2, \cdots, n , \tag{2.26}$$

then

$$\frac{1}{\det(A_n)} = \frac{\prod_{i=1}^{n} \left[ \sum_{j=1}^{n} a_{ij}^2 \right]^{1/2}}{\det(A)} , \tag{2.27}$$

where $A_n$ is the normalized matrix A, should be an effective
measure of the matrix's condition. As an example of the
practical value of his measure, Booth uses the much quoted
(2.18) set of equations, in which no ill-conditioning is
evident from observation of the determinant, as in this case

$$\det(A) = 1 .$$

However, upon calculation of (2.27) we have

$$\frac{1}{\det(A_n)} \approx 50{,}000$$

which clearly indicates the degree of ill-condition.
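It has been noticed that by writing (2.27) as

$$\frac{1}{|\det(A_n)|} = \frac{\prod_{i=1}^{n} \left[ \sum_{j=1}^{n} a_{ij}^2 \right]^{1/2}}{|\det(A)|} \tag{2.28}$$

this measure of condition is the right-hand side of
Hadamard's inequality

$$|\det(A)| \le \prod_{i=1}^{n} \left[ \sum_{j=1}^{n} a_{ij}^2 \right]^{1/2}$$

divided by the actual value of the determinant.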
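Booth's measure can be evaluated for Wilson's matrix (2.18) in two equivalent ways: from the row norms and the determinant, as in (2.27), or from the determinant of the row-normalized matrix. The sketch below assumes double-precision arithmetic; the function names `det`, `row_norms`, `booth_measure`, and `normalized` are illustrative assumptions.

```python
import math

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in a]
    n = len(m)
    d = 1.0
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0.0:
            return 0.0
        if piv != col:
            m[col], m[piv] = m[piv], m[col]
            d = -d
        d *= m[col][col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return d

def row_norms(a):
    """Euclidean length of each row, the divisor of equation (2.26)."""
    return [math.sqrt(sum(v * v for v in row)) for row in a]

def booth_measure(a):
    """Right-hand side of (2.27), with the determinant taken in absolute value."""
    return math.prod(row_norms(a)) / abs(det(a))

def normalized(a):
    """Divide every row of the matrix by its Euclidean length."""
    return [[v / s for v in row] for row, s in zip(a, row_norms(a))]

WILSON = [[5, 7, 6, 5],
          [7, 10, 8, 7],
          [6, 8, 10, 9],
          [5, 7, 9, 10]]

print(booth_measure(WILSON))                 # from row norms and det(A)
print(1.0 / abs(det(normalized(WILSON))))    # from det of the normalized matrix
```

Both values come out near 50,343, which agrees with the figure of approximately 50,000 quoted from Booth. For the unit matrix the measure is exactly 1, the smallest value Hadamard's inequality allows.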
In a discussion on the effect of noise on the solution
of large linear systems of equations, C. Lanczos (14)
defines

$$\frac{\max_i \lambda_i(A^T A)}{\min_i \lambda_i(A^T A)} \tag{2.29}$$

to be a "critical ratio", and he states that any linear
system whose critical ratio surpasses $10^4$ can hardly be
considered adequate for full determination of the unknowns
of the problem. This condition number (2.29) has the
advantage that the matrix $A^T A$ is symmetric positive definite;
therefore all of the eigenvalues are real and positive. It
is also well known (15) that if A is a symmetric matrix
and has an eigenvalue $\lambda$, then $\lambda^2$ must be an eigenvalue of
$A^T A$.
J. H. Wilkinson (16), (17) uses the matrix norm to
develop some condition numbers. A short digression will
be made at this point to define the vector and matrix norms
and to state some of their properties.
A norm is an overall assessment of the magnitude of
a vector or a matrix and possesses some useful properties.
A. The vector norm.
The norm of a vector x will be denoted by $\|x\|$ and
will satisfy all three of the following conditions:

$$\|x\| > 0 \text{ unless } x = 0 \tag{2.30}$$
$$\|kx\| = |k|\, \|x\| , \quad k \text{ is a complex scalar} \tag{2.31}$$
$$\|x + y\| \le \|x\| + \|y\| \tag{2.32}$$

From the second two of these conditions, it can be shown
that

$$\|x - y\| \ge \big|\, \|x\| - \|y\| \,\big| . \tag{2.33}$$

The three vector norms in common use are defined by:

$$\|x\|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p \right)^{1/p} \quad (p = 1, 2, \infty) \tag{2.34}$$

$$\|x\|_1 = \sum_{i=1}^{n} |x_i| \tag{2.35}$$

$$\|x\|_2 = \left[ \sum_{i=1}^{n} |x_i|^2 \right]^{1/2} \tag{2.36}$$

$$\|x\|_\infty = \max_i |x_i| . \tag{2.37}$$
B. The matrix norm
The norm of a square matrix A will be denoted by $\|A\|$
and will satisfy

$$\|A\| > 0 \text{ unless } A = 0 \tag{2.38}$$
$$\|kA\| = |k|\, \|A\| , \quad k \text{ is a complex scalar} \tag{2.39}$$
$$\|A + B\| \le \|A\| + \|B\| \tag{2.40}$$

and

$$\|AB\| \le \|A\|\, \|B\| . \tag{2.41}$$

There are several matrix norms in popular use. Corresponding
to the three vector norms, there are:

$$\|A\|_1 = \max_j \sum_{i=1}^{n} |a_{ij}| \tag{2.42}$$

$$\|A\|_\infty = \max_i \sum_{j=1}^{n} |a_{ij}| \tag{2.43}$$

and

$$\|A\|_2 = \left[ \max_i \lambda_i(A^T A) \right]^{1/2} . \tag{2.44}$$

The last of these, $\|A\|_2$, is known as the spectral norm, and
it can be shown that if A is the unit matrix, then $\|A\| = 1$.
There is one additional important norm, the Euclidean or
Schur norm, which is consistent with $\|x\|_2$ and is defined as

$$\|A\|_E = \left[ \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2 \right]^{1/2} . \tag{2.45}$$
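The norms that need no eigenvalue calculation, (2.42), (2.43), and (2.45), are simple to compute. The following sketch shows one way to do so; the function names and the 2 x 2 example matrix are assumptions made for illustration. The spectral norm (2.44) is omitted because it requires the largest eigenvalue of the product of the transpose of A with A.

```python
import math

def norm_1(a):
    """Equation (2.42): largest column sum of absolute values."""
    n = len(a)
    return max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))

def norm_inf(a):
    """Equation (2.43): largest row sum of absolute values."""
    return max(sum(abs(v) for v in row) for row in a)

def norm_e(a):
    """Equation (2.45): Euclidean (Schur) norm."""
    return math.sqrt(sum(abs(v) ** 2 for row in a for v in row))

# Hypothetical non-symmetric example, chosen so the three norms differ.
B = [[1, -2],
     [3, 4]]

print(norm_1(B), norm_inf(B), norm_e(B))
```

For this example the column-sum norm is 6, the row-sum norm is 7, and the Euclidean norm is the square root of 30.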
Consider the sensitivity of the solution of the set of
equations
Ax = b (2.46)
to variations in b. If
$$A(x + h) = b + k$$

then

$$Ah = k$$

and

$$h = A^{-1} k$$

which after taking the norms yields

$$\|h\| = \|A^{-1} k\| \le \|A^{-1}\|\, \|k\| \tag{2.47}$$

which is consistent with (2.08).
Now consider the relative change $\|h\| / \|x\|$; from (2.46)

$$\|b\| = \|Ax\| \le \|A\|\, \|x\| \tag{2.48}$$

and

$$\|x\| \ge \|b\|\, \|A\|^{-1} ; \tag{2.49}$$

using (2.47)

$$\frac{\|h\|}{\|x\|} \le \frac{\|A^{-1}\|\, \|k\|}{\|b\|\, \|A\|^{-1}}$$

and finally

$$\frac{\|h\|}{\|x\|} \le \left[\, \|A\|\, \|A^{-1}\| \,\right] \frac{\|k\|}{\|b\|} , \tag{2.50}$$

in which $\|A\|\, \|A^{-1}\|$ is the decisive quantity and may be
regarded as a condition number. Using (2.44), the third
matrix norm (in the notation of Faddeev and Faddeeva (1)),

$$\text{H-number} = \|A\|_2\, \|A^{-1}\|_2 = \left[ \frac{\max_i \lambda_i(A^T A)}{\min_i \lambda_i(A^T A)} \right]^{1/2} . \tag{2.51}$$

Using the Euclidean or Schur norm,

$$\|A\|_E\, \|A^{-1}\|_E = N(A)\, N(A^{-1})$$

and from (2.21)

$$\text{N-number} = \frac{1}{n} \|A\|_E\, \|A^{-1}\|_E . \tag{2.52}$$
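The bound (2.50) can be illustrated with Wilson's system (2.18). The sketch below assumes the row-sum norm (2.43) for matrices and the maximum norm (2.37) for vectors, and uses a hypothetical perturbation k of the constant vector; neither choice is taken from the thesis. Exact rational arithmetic is used so that both sides of the inequality can be compared without round-off.

```python
from fractions import Fraction

A = [[5, 7, 6, 5],
     [7, 10, 8, 7],
     [6, 8, 10, 9],
     [5, 7, 9, 10]]

A_INV = [[68, -41, -17, 10],
         [-41, 25, 10, -6],
         [-17, 10, 5, -3],
         [10, -6, -3, 2]]

def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def vec_norm(v):
    """Maximum norm of a vector, equation (2.37)."""
    return max(abs(c) for c in v)

def mat_norm(m):
    """Row-sum norm of a matrix, equation (2.43)."""
    return max(sum(abs(c) for c in row) for row in m)

x = [1, 1, 1, 1]
b = mat_vec(A, x)

# Hypothetical perturbation of the constant vector (an assumption).
tenth = Fraction(1, 10)
k = [tenth, -tenth, tenth, -tenth]
h = mat_vec(A_INV, k)                 # resulting change in the solution

cond = mat_norm(A) * mat_norm(A_INV)  # the decisive quantity in (2.50)
lhs = Fraction(vec_norm(h)) / vec_norm(x)
rhs = cond * Fraction(vec_norm(k)) / vec_norm(b)

print("relative change in x:", float(lhs))
print("bound from (2.50)   :", float(rhs), "with condition number", cond)
```

A change of 0.1 in each constant, about 0.3 per cent of the largest element of b, moves one solution element by 8.2; the bound (2.50) allows up to 13.6.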
Richard S. Varga (18) defines the formula

$$\Lambda = \|S\|\, \|S^{-1}\| \tag{2.53}$$

in which S is defined by

$$A = S^{-1} J S , \tag{2.54}$$

where J is the Jordan normal form (or Jordan canonical form)
of A. The major drawback to the use of this condition
number is the extreme difficulty of calculating the
similarity transformation matrix S.
There exists considerable additional literature on the
subject of condition numbers and their application; several
other articles on this subject are mentioned below.
The investigation of various measures of condition was
the subject of a paper by J. D. Copeland (19).
The establishment of a confidence region for the
solution of a system of linear equations in which the error
could be considered multinormally distributed was the subject
of a paper by G. E. P. Box and J. S. Hunter (20).
J. D. Calton (21) investigated various measures of
condition for small matrices and proposed a measure which
he thoroughly studied and reported in his paper.
There exist some interesting and informative
interrelationships and inequalities between several of the
condition numbers mentioned herein:
1. H-number = $[P(A^T A)]^{1/2}$
2. If A is symmetric,
P(A) = H-number
3. N-number $\le$ M-number $\le$ $n^2$ N-number
4. N-number $\le$ H-number $\le$ $n$ N-number
5. P(A) $\le$ H-number
6. If A is orthogonal, then
N-number = H-number = P(A) = 1 .
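Relations 3 and 4 are stated without proof. One possible derivation, offered here only as a sketch, uses nothing beyond the definitions (2.21) through (2.24), (2.44), (2.45), (2.51), and (2.52), together with the standard comparison between the spectral and Euclidean norms:

```latex
% Relation 3: compare M(A) and N(A) element by element.
M(A) = \max_{i,j}|a_{ij}|
     \le \Big[\sum_{i,j} a_{ij}^2\Big]^{1/2} = N(A)
     \le \big[n^2 M(A)^2\big]^{1/2} = n\,M(A) .
% Applying this to both A and A^{-1}:
\text{N-number} = \tfrac{1}{n} N(A)N(A^{-1})
     \le \tfrac{1}{n}\,\big(nM(A)\big)\big(nM(A^{-1})\big)
     = n\,M(A)M(A^{-1}) = \text{M-number} ,
\qquad
\text{M-number} = n\,M(A)M(A^{-1})
     \le n\,N(A)N(A^{-1}) = n^2\,\text{N-number} .

% Relation 4: the spectral and Euclidean norms satisfy
\|A\|_2 \le \|A\|_E \le \sqrt{n}\,\|A\|_2 ,
% so that
\text{N-number} = \tfrac{1}{n}\|A\|_E\|A^{-1}\|_E
     \le \tfrac{1}{n}\,\sqrt{n}\,\sqrt{n}\;\|A\|_2\|A^{-1}\|_2
     = \text{H-number} ,
\qquad
\text{H-number} = \|A\|_2\|A^{-1}\|_2
     \le \|A\|_E\|A^{-1}\|_E = n\,\text{N-number} .
```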
III. DISCUSSION
A computer program was written to determine the
correlation of the errors in the solution of a system of
linear algebraic equations with the condition of the
coefficient matrix. The program generated numerous test
matrices and for each of them several measures of condition
were calculated. The matrix was then randomly perturbed
and the resulting system of equations was solved; a measure
of error in the solution was calculated; and finally, the
correlations between the experimental measure of error and
the condition numbers were computed. A detailed description
of the program, which was written for the IBM 1620 Model II
digital computer using the Fortran II language, follows.
The system of n linear algebraic equations being considered
is

$$Ax = b .$$

The elements of the coefficient matrix A and the
constant vector b were generated by using uniformly
distributed pseudo random numbers in the range

$$0 \le a_{ij} \le 1 , \qquad 0 \le b_i \le 1 \qquad \text{for } i,j = 1, 2, \cdots, n .$$
Thus the matrices A and b are arbitrary and there is the
possibility that no unique solution would exist. However,
the program was written so as to reject a singular A matrix.
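The following measures of matrix condition were
included in the experimentation.

$$\text{N-number} = \frac{1}{n} N(A)\, N(A^{-1}) = C_1 \tag{3.01}$$

$$\text{M-number} = n\, M(A)\, M(A^{-1}) = C_2 \tag{3.02}$$

$$\frac{\prod_{i=1}^{n} a_{ii}}{\det(A)} = C_3 \tag{3.03}$$

$$\frac{1}{|\det(A_{norm})|} = \frac{\prod_{i=1}^{n} \left[ \sum_{j=1}^{n} a_{ij}^2 \right]^{1/2}}{|\det(A)|} = C_4 \tag{3.04}$$

$$\text{H-number} = \left[ \frac{\max_i \lambda_i(A^T A)}{\min_i \lambda_i(A^T A)} \right]^{1/2} = C_5 \tag{3.05}$$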
Three of the condition number formulas mentioned in the
previous chapter were not included in the program. Equation
(2.29), Lanczos' "critical ratio", was not used because of its
similarity to the H-number (3.05). Since the program is not
restricted to symmetric coefficient matrices, P(A), equation
(2.20), was not used due to the difficulty in obtaining the
eigenvalues of non-symmetric matrices; and as noted previously,
in the case where the coefficient matrix A is symmetric,

P(A) = H-number .

The condition number formula (2.53) mentioned by Varga was
not considered due to the difficulty in computing the
transformation matrices required.
Each element of the matrix A was perturbed by adding to
it some quantity $\delta a_{ij}$ to form

$$A_p = [\, a_{ij} + \delta a_{ij} \,] . \tag{3.06}$$

The values of $\delta a_{ij}$ were randomly selected from a uniform
distribution in the interval

$$-\epsilon \le \delta a_{ij} \le \epsilon ,$$

where the value of $\epsilon$ was specified as data input to the
program. In the same manner, each element of the dependent
variable vector b was perturbed by adding to it some random
quantity $\delta b_i$, selected in the same interval as $\delta a_{ij}$, to form

$$b_p = [\, b_i + \delta b_i \,] . \tag{3.07}$$

The perturbed system of linear equations

$$A_p x_p = b_p \tag{3.08}$$

was solved in order to obtain the error in each element of
the solution

$$\delta x_i = x_{pi} - x_i , \quad \text{for } i = 1, 2, \cdots, n . \tag{3.09}$$

The maximum of these errors was then used to compute an
experimental measure of condition for the coefficient matrix

$$\Delta x = \frac{\max_i |\delta x_i|}{\dfrac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} |\delta a_{ij}|} \tag{3.10}$$

which is a measure of the relative change in the solution.
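The perturbation experiment can be sketched in a few lines. The code below is an illustration in Python rather than the original Fortran II program, and it rests on one stated assumption: that (3.10) divides the largest solution error by the mean absolute perturbation of the coefficients. The function names, the seed, and the use of Wilson's matrix as test input are likewise assumptions.

```python
import random

def solve(a, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]

def random_perturbation(n, eps, rng):
    """Draw perturbations of A and b uniformly from [-eps, eps]."""
    da = [[rng.uniform(-eps, eps) for _ in range(n)] for _ in range(n)]
    db = [rng.uniform(-eps, eps) for _ in range(n)]
    return da, db

def experimental_measure(a, b, da, db):
    """Largest solution error divided by the mean absolute perturbation
    of the coefficients (the assumed reading of equation (3.10))."""
    n = len(a)
    x = solve(a, b)
    a_p = [[a[i][j] + da[i][j] for j in range(n)] for i in range(n)]
    b_p = [b[i] + db[i] for i in range(n)]
    x_p = solve(a_p, b_p)
    max_dx = max(abs(xp - xi) for xp, xi in zip(x_p, x))
    mean_da = sum(abs(v) for row in da for v in row) / n ** 2
    return max_dx / mean_da

# Example run on Wilson's system (2.18) with an arbitrary seed.
A = [[5, 7, 6, 5], [7, 10, 8, 7], [6, 8, 10, 9], [5, 7, 9, 10]]
b = [23, 32, 33, 31]
rng = random.Random(1964)
da, db = random_perturbation(len(A), 1e-6, rng)
print(experimental_measure(A, b, da, db))
```

Passing the perturbations in explicitly, rather than drawing them inside `experimental_measure`, keeps the measure reproducible for a given pair of perturbations.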
Numerous subroutines written by the author and
contained in the Computer Science Center subroutine library
were utilized by the program.
RAND
This subroutine computes one of a set of uniformly
distributed pseudo random numbers between zero and one.
ZERO
Each element of a matrix or a vector may be set to
equal zero by the use of this subroutine.
EIGVAL
This subroutine determines the largest eigenvalue and
the corresponding eigenvector of the given symmetric matrix
by an iterative technique using the well known dominant
eigenvalue method (15).
Evaluation of the H-number (3.05) required both the
largest and the smallest eigenvalues of $A^T A$; the first of
these was obtained by a direct application of the subroutine
to the matrix $A^T A$. Since the inverse of the matrix A was
available, having been required for other purposes, the
computation of the smallest eigenvalue of $A^T A$ was performed
using the results which follow.
It is well known that the reciprocal of the maximum
eigenvalue of the inverse of the matrix is the minimum
eigenvalue of the matrix itself. Instead of inverting the
matrix $A^T A$, the following is done: let

$$B = A^T A \tag{3.11}$$

then

$$B^{-1} = (A^T A)^{-1} = A^{-1} (A^T)^{-1} = A^{-1} (A^{-1})^T . \tag{3.12}$$

Using equation (3.12) and the basic equation

$$Bx = \lambda x \tag{3.13}$$

it follows that

$$A^{-1} (A^{-1})^T x = kx \tag{3.14}$$

where

$$k = \frac{1}{\lambda} . \tag{3.15}$$

Thus the application of the subroutine EIGVAL to the matrix
$A^{-1} (A^{-1})^T$ would yield the reciprocal of the smallest
eigenvalue of $A^T A$.
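The procedure of (3.11) through (3.15) can be sketched as follows. The code is an illustration only: a plain power iteration with a Rayleigh quotient stands in for the EIGVAL subroutine, whose exact algorithm is not given in the text, and Wilson's matrix (2.18) with its known inverse (2.19) is used as test data.

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def mat_mul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def dominant_eigenvalue(m, iterations=200):
    """Largest eigenvalue of a symmetric matrix by power iteration
    (a stand-in for EIGVAL, not the original routine)."""
    n = len(m)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Rayleigh quotient with the converged unit vector.
    mv = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, mv))

A = [[5, 7, 6, 5], [7, 10, 8, 7], [6, 8, 10, 9], [5, 7, 9, 10]]
A_INV = [[68, -41, -17, 10], [-41, 25, 10, -6],
         [-17, 10, 5, -3], [10, -6, -3, 2]]

ata = mat_mul(transpose(A), A)                 # B of equation (3.11)
inv_prod = mat_mul(A_INV, transpose(A_INV))    # B^{-1} by equation (3.12)

lam_max = dominant_eigenvalue(ata)
lam_min = 1.0 / dominant_eigenvalue(inv_prod)  # equations (3.14), (3.15)
h_number = math.sqrt(lam_max / lam_min)        # equation (3.05)

print(lam_max, lam_min, h_number)
```

For this symmetric matrix the H-number comes out near 2984, consistent with the value of approximately 3000 quoted for P(A) in the case of (2.18).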
INVRT
This procedure uses the method of Gauss-Jordan
elimination, pivoting on the maximum element in the matrix, to
compute the inverse and the determinant of the given matrix.
MTXMPY
The multiplication of two matrices, a matrix and a
vector, or two vectors is performed with this subroutine.
AMAX
This subprogram finds the maximum value of a given set
of numbers; either the element of greatest algebraic value
or the element of maximum absolute value may be found.
GAUJOR
This subroutine solves a given set of n linear algebraic
equations by the method of Gauss-Jordan reduction; pivoting
on the element of maximum magnitude by columns is employed
to help retain accuracy in near singular systems. This
routine does not solve a set of equations whose coefficient
matrix is singular.
Experimentation using the previously described program
was performed in the following manner. The input variables
to the program consisted of n, the size of the system of
equations; $\epsilon$, the bound upon the random perturbation of the
system; and m, the number of times this particular system
is to be perturbed and solved. For each perturbation and
solution, the quantity $\Delta x$ is calculated; these are averaged
at the conclusion of the m trials, and define the quantity

$$\Delta = \frac{1}{m} \sum_{k=1}^{m} \Delta x_k . \tag{3.16}$$

The size of the system, n, was allowed to vary from 2
to 35; for each size three values of $\epsilon$ were used, $10^{-6}$,
$10^{-4}$, and $10^{-2}$; for each value of $\epsilon$ five trials were
conducted. The output of the program for each separate
value of $\epsilon$ was:
1. n, the size of the system
2. $\epsilon$, the bound on the perturbation
3. $\Delta$, the experimental measure of error
4. $C_i$, i = 1,2,...,5, the five condition numbers
After extensive experimentation with the program,
coefficients indicating the degree of correlation between
the experimental measure of error and the five condition
numbers were calculated as follows. Let $r_i$ be the simple
product-moment correlation coefficient (22) between the
experimental measure of error and the condition number $C_i$,
then

$$r_i = \frac{m \sum \Delta_t C_{it} - \sum \Delta_t \sum C_{it}}
{\sqrt{\left[ m \sum \Delta_t^2 - \left( \sum \Delta_t \right)^2 \right] \left[ m \sum C_{it}^2 - \left( \sum C_{it} \right)^2 \right]}} \tag{3.17}$$

where each summation extends from t=1 to t=m, in which m is
the number of observations being correlated. The value
obtained for $r_i$ is a measure of the fit of the observations
to an equation of the form

$$\Delta_c = a\, C_i + b \tag{3.18}$$

where the coefficients a and b are to be determined by the
method of least squares. The range of values for $r_i$ is
-1 to +1; $r_i = 0$ indicates a total lack of correlation
between the variables, and $r_i = +1$ or $-1$ indicates perfect
positive or negative linear correlation.
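The product-moment coefficient is straightforward to compute from the sums that appear in (3.17). The following is a minimal sketch; the function name `product_moment` and the small data sets are hypothetical and are not results from the thesis.

```python
import math

def product_moment(delta, c):
    """Simple product-moment correlation coefficient between two
    equally long sequences of observations."""
    m = len(delta)
    s_d = sum(delta)
    s_c = sum(c)
    s_dc = sum(d * ci for d, ci in zip(delta, c))
    s_dd = sum(d * d for d in delta)
    s_cc = sum(ci * ci for ci in c)
    numerator = m * s_dc - s_d * s_c
    denominator = math.sqrt((m * s_dd - s_d ** 2) * (m * s_cc - s_c ** 2))
    return numerator / denominator

# Hypothetical observations, used only to exercise the formula.
print(product_moment([1, 2, 3, 4], [1, 3, 2, 4]))   # partly correlated
print(product_moment([1, 2, 3], [2, 4, 6]))         # perfectly linear
print(product_moment([1, 2, 3], [6, 4, 2]))         # perfectly inverse
```

Note that the coefficient is undefined when either sequence is constant, since the denominator is then zero; a set of matrices of identical condition would therefore yield no usable coefficient.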
IV. CONCLUSIONS
From the results of the experimentation and the
correlation of these results, several conclusions were drawn.
The most important is that all of the condition formulas
tested, with the exception of two, seem to be reliable
indicators of the magnitude of error to expect in the
solution of a system of linear non-homogeneous equations.
Three of the formulas investigated yield condition
numbers which appear to be well correlated* with the
experimental measure of condition (3.16); they are:

$$\text{N-number} = \frac{1}{n} N(A)\, N(A^{-1}) = C_1$$

$$\text{M-number} = n\, M(A)\, M(A^{-1}) = C_2$$

$$\text{H-number} = \left[ \frac{\max_i \lambda_i(A^T A)}{\min_i \lambda_i(A^T A)} \right]^{1/2} = C_5$$
Of these three formulas, $C_1$ and $C_2$ are slightly better
correlated with $\Delta$ (3.16) than is $C_5$; also, the correlation
coefficients obtained for $C_1$ and $C_2$ with $\Delta$ are nearly equal
in all cases. This is a desirable conclusion in that a value for
$C_1$ or $C_2$ is considerably easier to compute than is a value
for $C_5$.
*The appendix contains tabulations of the correlation coefficients obtained.
The two remaining condition formulas studied,

$$\frac{\prod_{i=1}^{n} a_{ii}}{\det(A)} = C_3$$

and

$$\frac{1}{|\det(A_{norm})|} = \frac{\prod_{i=1}^{n} \left[ \sum_{j=1}^{n} a_{ij}^2 \right]^{1/2}}{|\det(A)|} = C_4 ,$$

seem to be poor measures of a matrix's condition in that
there is little correlation between them and the experimental
measure of condition $\Delta$. However, when the size of the system
of equations is held constant, or nearly so, $C_3$ and $C_4$ appear
to be better indicators of condition than when the size is
allowed to vary. It was noticed that as the size of the
system of equations was increased, the values obtained for
$C_3$ became smaller and those obtained for $C_4$ increased; this
is an undesirable characteristic for a condition number to
possess.
The values observed for the correlation coefficients
for $C_1$, $C_2$, and $C_5$ versus $\Delta$ when $\epsilon = 10^{-6}$ were nearly
identical to those observed for the same $C_i$ when $\epsilon = 10^{-4}$;
however, when $\epsilon = 10^{-2}$ (approximately one per cent of the
value of the elements of the matrix), there was less
correlation for the same $C_i$. With additional experimentation
this observation might lead to the conclusion that as the
error in an element of the matrix approaches the magnitude
of that element, the condition numbers $C_1$, $C_2$, and $C_5$
become less indicative of the errors in the solution vector x.
It was also observed that small correlation
coefficients were obtained for some values of n; the following is
a possible explanation of this phenomenon. If a set of
observations of $C_i$ and $\Delta$ were to be plotted, and the
matrices considered were of approximately the same condition,
most of the points plotted would lie in a small cluster
and could not define a straight line as well as a more
widely scattered set of points.
In summary, it could be stated that for small values
of $\epsilon$, the condition number formulas $C_1$, $C_2$, and $C_5$ should
provide reliable indices of the condition of the coefficient
matrix and the errors in the solution of the system of equations.
APPENDIX
The following three tables contain tabulations of the
correlation coefficients obtained between the condition
numbers $C_i$, i = 1,2,...,5 and the experimental measure of
matrix condition $\Delta$ for various arrangements of the data.
TABLE 1.
Correlation between condition numbers and experimental
measure of condition for all systems of equations tested.

Size of   Bound on               Number of     Correlation Coefficients
System    Perturbation           Systems Run
n         ε                      m             r1      r2      r3      r4      r5
2-35      10^-6, 10^-4, 10^-2    981           .819    .815    .010    -.003   .770
TABLE 2.
Correlation between condition numbers and experimental
measure of condition for the three perturbation bounds.

n         ε         m       r1      r2      r3      r4      r5
2-35      10^-6     327     .919    .914    .006    -.004   .860
2-35      10^-4     327     .927    .923    .006    -.004   .867
2-35      10^-2     327     .557    .555    .025    -.007   .535
TABLE 3.
Correlation between condition numbers and experimental
measure of condition for the various sizes of systems
tested, ε = 10^-6, 10^-4, and 10^-2.

n       r1      r2      r3      r4      r5
2       .946    .925    .964    .945    .945
3       .853    .869    .568    .754    .855
4       .913    .960    .858    .828    .914
5       .661    .668    .621    .710    .666
6       .797    .704    -.066   .253    .779
7       .987    .992    .611    .676    .986
8       .701    .732    .416    .243    .699
9       .864    .864    .142    .849    .864
10      .917    .980    .544    .591    .914
11      .774    .756    .080    .788    .773
12      .347    .372    -.013   .150    .341
13      .828    .820    .835    .499    .827
14      .554    .547    .520    .584    .553
15      .894    .887    .345    .949    .893
16      .959    .966    .964    .908    .960
17      .919    .917    .884    .917    .919
18      .506    .787    .366    .162    .544
19      .921    .924    .951    .950    .920
20-     .978    .912    -.320   -.303   .923
        .806    .820    .530    .635    .804
BIBLIOGRAPHY
1. Faddeev, D. K. and Faddeeva, V. N., (1963) Computational Methods of Linear Algebra, translated by Robert C. Williams, W. H. Freeman and Co., San Francisco and London, p. 119-128.
2. Hildebrand, F. B., (1956) Introduction to Numerical Analysis, McGraw-Hill, New York, p. 436-439.
3. Scarborough, J. B., (1962) Numerical Mathematical Analysis, 5th Ed., The Johns Hopkins Press, Baltimore, p. 301-305.
4. Turing, A. M., (1948) Rounding-Off Errors in Matrix Processes, Quarterly Journal of Mechanics and Applied Mathematics, 1: p. 287-308.
5. Bodewig, E., (1959) Matrix Calculus, 2nd Ed., Interscience Publishers, Inc., New York, p. 133-137.
6. Todd, John, (1962) Survey of Numerical Analysis, McGraw-Hill, New York, p. 239-243.
7. Moulton, F. R., (1913) On the Solutions of Linear Equations Having Small Determinants, American Mathematical Monthly, 20: p. 242-249.
8. Von Neumann, John and H. H. Goldstine, (1947) Numerical Inverting of Matrices of High Order, Bulletin of the American Mathematical Society, 53: p. 1021-1099.
9. Todd, John, (1954) The Condition of the Finite Segments of the Hilbert Matrix, National Bureau of Standards Applied Mathematics Series, 39: p. 109-116.
10. Todd, John, (1949) The Condition of Certain Matrices, I, Quarterly Journal of Mechanics and Applied Mathematics, 2: p. 469-472.
11. Todd, John, (1950) The Condition of a Certain Matrix, Proc. of the Cambridge Philosophical Society, 46: p. 116-118.
12. Todd, John, (1958) The Condition of Certain Matrices, III, Journal of Research National Bureau of Standards, 60: p. 1-7.
13. Booth, Andrew D., (1957) Numerical Methods, 2nd Ed., Academic Press Inc., New York, p. 80-85.
14. Lanczos, Cornelius, (1956) Applied Analysis, Prentice-Hall, Inc., Englewood Cliffs, N. J., p. 167-170.
15. Hildebrand, F. B., (1952) Methods of Applied Mathematics, Prentice-Hall, Inc., Englewood Cliffs, N. J., p. 1-95.
16. Wilkinson, J. H., (1963) Rounding Errors in Algebraic Processes, Prentice-Hall, Inc., Englewood Cliffs, N. J., p. 79-93.
17. Wilkinson, J. H., (1961) Error Analysis of Direct Methods of Matrix Inversion, Journal of the Association of Computing Machinery, 8: p. 281-330.
18. Varga, Richard S., (1962) Matrix Iterative Analysis, Prentice-Hall, Inc., Englewood Cliffs, N. J., p. 66, 95.
19. Copeland, J. D., (1963) On Condition Numbers for Matrices. Thesis, University of Texas, 50 p.
20. Box, G. E. P. and J. S. Hunter, (1954) A Confidence Region for the Solution of a Set of Simultaneous Equations with an Application to Experimental Design, Biometrika, 41: p. 190-199.
21. Calton, T. D., (1963) Investigation of Measures of Ill-Conditioning. Thesis, Missouri School of Mines and Metallurgy, 40 p.
22. Ralston, Anthony and Herbert S. Wilf, (1960) Mathematical Methods for Digital Computers, John Wiley and Sons, New York, p. 213-220.
VITA
The author was born October 24, 1933, at St. Louis,
Missouri. His primary education was received there; he
attended high school in Kirkwood, Missouri, graduating in
June 1953. He received a Bachelor of Science degree in
Mechanical Engineering from the University of Missouri
School of Mines and Metallurgy at Rolla, Missouri, in June
1962. In September 1962, work was started toward the
Master of Science degree in Applied Mathematics.
Since July 1962, the author has been employed as a
Computer Analyst and Instructor in Computer Science by the
University of Missouri School of Mines and Metallurgy
Computer Science Center.