-
Lecture Notes in Mathematics. A collection of informal reports and seminars. Edited by A. Dold, Heidelberg and B. Eckmann, Zürich
193
Symposium on the Theory of Numerical Analysis Held in Dundee/Scotland, September 15-23, 1970
Edited by John Ll. Morris, University of Dundee, Dundee/Scotland
Springer-Verlag Berlin Heidelberg New York 1971
-
AMS Subject Classifications (1970): 65M05, 65M10, 65M15, 65M30, 65N05, 65N10, 65N15, 65N20, 65N25
ISBN 3-540-05422-7 Springer-Verlag Berlin Heidelberg New York ISBN 0-387-05422-7 Springer-Verlag New York Heidelberg Berlin
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks.
Under § 54 of the German Copyright Law, where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.
© by Springer-Verlag Berlin Heidelberg 1971. Library of Congress Catalog Card Number 70-155916. Printed in Germany.
Offsetdruck: Julius Beltz, Hemsbach
-
Foreword
This publication by Springer Verlag represents the proceedings of a series
of lectures given by four eminent Numerical Analysts, namely Professors Golub,
Thomée, Wachspress and Widlund, at the University of Dundee between September
15th and September 23rd, 1970.
The lectures marked the beginning of the British Science Research Council's
sponsored Numerical Analysis Year which is being held at the University of Dundee
from September 1970 to August 1971. The aim of this year is to promote the theory
of numerical methods and in particular to upgrade the study of Numerical Analysis
in British universities and technical colleges. This is being effected by the
arranging of lecture courses and seminars which are being held in Dundee through-
out the Year. In addition to lecture courses research conferences are being
held to allow workers in touch with modern developments in the field of Numerical
Analysis to hear and discuss the most recent research work in their field. To
achieve these aims, some thirty four Numerical Analysts of international repute
are visiting the University of Dundee during the Numerical Analysis Year. The
complete project is financed by the Science Research Council, and we acknowledge
with gratitude their generous support. The present proceedings contain a great
deal of theoretical work which has been developed over recent years. There are
however new results contained within the notes. In particular the lectures pre-
sented by Professor Golub represent results recently obtained by him and his co-
workers. Consequently a detailed account of the methods outlined in Professor
Golub's lectures will appear in a forthcoming issue of the SIAM Journal on
Numerical Analysis, written jointly by Golub, Buzbee and Nielson.
In the main the lecture notes have been provided by the authors and the
proceedings have been produced from these original manuscripts. The exception
is the course of lectures given by Professor Golub. These notes were taken at
the lectures by members of the staff and research students of the Department of
Mathematics, the University of Dundee. In this context it is a pleasure to ack-
nowledge the invaluable assistance provided to the editor by Dr. A. Watson, Mr.
-
R. Wait, Mr. K. Brodlie and Mr. G. McGuire.
Finally we owe thanks to Misses Y. Nedelec and F. Duncan, Secretaries in
the Mathematics Department, for their patient typing and retyping of the manu-
scripts and notes.
J. Ll. Morris
Dundee, January 1971
-
Contents
G. Golub: Direct Methods for Solving Elliptic Difference Equations .... 1
  1. Introduction .... 2
  2. Matrix Decomposition .... 2
  3. Block Cyclic Reduction .... 6
  4. Applications .... 10
  5. The Buneman Algorithm and Variants .... 12
  6. Accuracy of the Buneman Algorithms .... 14
  7. Non-Rectangular Regions .... 15
  8. Conclusion .... 18
  9. References .... 18

G. Golub: Matrix Methods in Mathematical Programming .... 21
  1. Introduction .... 22
  2. Linear Programming .... 22
  3. A Stable Implementation of the Simplex Algorithm .... 24
  4. Iterative Refinement of the Solution .... 28
  5. Householder Triangularization .... 28
  6. Projections .... 31
  7. Linear Least-Squares Problem .... 33
  8. Least-Squares Problem with Linear Constraints .... 35
  Bibliography .... 37

V. Thomée: Topics in Stability Theory for Partial Difference Operators .... 41
  Preface .... 42
  1. Introduction .... 43
  2. Initial-Value Problems in L² with Constant Coefficients .... 51
  3. Difference Approximations in L² to Initial-Value Problems with Constant Coefficients .... 59
  4. Estimates in the Maximum-Norm .... 70
  5. On the Rate of Convergence of Difference Schemes .... 79
  References .... 89

E. L. Wachspress: Iteration Parameters in the Numerical Solution of Elliptic Problems .... 93
  1. A Concise Review of the General Topic and Background Theory .... 95
  2. Successive Overrelaxation: Theory .... 98
  3. Successive Overrelaxation: Practice .... 100
  4. Residual Polynomials: Chebyshev Extrapolation: Theory .... 102
  5. Residual Polynomials: Practice .... 103
  6. Alternating-Direction-Implicit Iteration .... 106
  7. Parameters for the Peaceman-Rachford Variant of ADI .... 107

O. Widlund: Introduction to Finite Difference Approximations to Initial Value Problems for Partial Differential Equations .... 111
  1. Introduction .... 112
  2. The Form of the Partial Differential Equations .... 114
  3. The Form of the Finite Difference Schemes .... 117
  4. An Example of Divergence. The Maximum Principle .... 121
  5. The Choice of Norms and Stability Definitions .... 124
  6. Stability, Error Bounds and a Perturbation Theorem .... 133
  7. The von Neumann Condition, Dissipative and Multistep Schemes .... 138
  8. Semibounded Operators .... 142
  9. Some Applications of the Energy Method .... 145
  10. Maximum Norm Convergence for L² Stable Schemes .... 149
  References .... 151
-
Direct Methods for Solving Elliptic Difference Equations
GENE GOLUB
Stanford University
-
1. Introduction
General methods exist for solving elliptic partial differential equations of general type
in general regions. However, it is often the case that physical problems such as
those of plasma physics give rise to several elliptic equations which must be
solved many times. It is not uncommon that the elliptic equations which arise re-
duce to Poisson's equation with differing right hand sides. For this reason it is
judicious to use direct methods which take advantage of this structure and which
thereby yield fast and accurate techniques for solving the associated linear
equations.
Direct methods for solving such equations are attractive since in theory they
yield the exact solution to the difference equation, whereas commonly used methods
seek to approximate the solution by iterative procedures [12]. Hockney [8] has
devised an efficient direct method which uses the cyclic reduction process. Also, Buneman
[2] recently developed an efficient direct method for solving the reduced system
of equations. Since these methods offer considerable economy over older tech-
niques [5], the purpose of this paper is to present a unified mathematical deve-
lopment and generalization of them. Additional generalizations are given by
George [6].
2. Matrix Decomposition
Consider the system of equations

M x = y ,   (2.1)

where M is an N x N real symmetric matrix of block tridiagonal form,

    [ A  T             ]
    [ T  A  T          ]
M = [    .  .  .       ]   (2.2)
    [       T  A  T    ]
    [          T  A    ]

The matrices A and T are p x p symmetric matrices and we assume that

A T = T A .
-
This situation arises in many systems. However, other direct methods which are
applicable for more general systems are less efficient to implement in this case.
Moreover the classical methods require more computer storage than the methods to be
discussed here, which will require only the storage of the vector y. Since A and T
commute and are symmetric, it is well known [1] that there exists an orthogonal
matrix Q such that

Q^T A Q = Λ ,   Q^T T Q = Ω ,   (2.3)

where Λ and Ω are real diagonal matrices. The columns of Q are the eigenvectors of
A and T, and Λ and Ω are the diagonal matrices of the p eigenvalues of A
and T, respectively.
To conform with the matrix M, we write the vectors x and y in partitioned form,

x = [ x_1, x_2, ..., x_q ]^T ,   y = [ y_1, y_2, ..., y_q ]^T .

Furthermore, it is quite natural to write

x_j = [ x_{1j}, x_{2j}, ..., x_{pj} ]^T ,   y_j = [ y_{1j}, y_{2j}, ..., y_{pj} ]^T .   (2.4)

System (2.1) may be written

A x_1 + T x_2 = y_1 ,                                        (2.5a)
T x_{j-1} + A x_j + T x_{j+1} = y_j ,   j = 2,3,...,q-1 ,    (2.5b)
T x_{q-1} + A x_q = y_q .                                    (2.5c)
From Eq. (2.3) we have

A = Q Λ Q^T   and   T = Q Ω Q^T .

Substituting A and T into Eq. (2.5) and pre-multiplying by Q^T we obtain

Λ x̄_1 + Ω x̄_2 = ȳ_1 ,
Ω x̄_{j-1} + Λ x̄_j + Ω x̄_{j+1} = ȳ_j ,   (j = 2,3,...,q-1)   (2.6)
Ω x̄_{q-1} + Λ x̄_q = ȳ_q ,

where

x̄_j = Q^T x_j ,   ȳ_j = Q^T y_j ,   j = 1,2,...,q.

If x̄_j and ȳ_j are partitioned as before, then the i-th components of Eq. (2.6) may be
rewritten as

λ_i x̄_{i1} + ω_i x̄_{i2} = ȳ_{i1} ,
ω_i x̄_{i,j-1} + λ_i x̄_{ij} + ω_i x̄_{i,j+1} = ȳ_{ij} ,   (j = 2,...,q-1) ,   (2.7)
ω_i x̄_{i,q-1} + λ_i x̄_{iq} = ȳ_{iq} ,

for i = 1,2,...,p.
If we rewrite the equations by reversing the roles of i and j, we may write

      [ λ_i  ω_i              ]
      [ ω_i  λ_i  ω_i         ]
Γ_i = [      .    .    .      ]   (q x q) ,
      [           ω_i  λ_i    ]

x̂_i = [ x̄_{i1}, x̄_{i2}, ..., x̄_{iq} ]^T ,   ŷ_i = [ ȳ_{i1}, ȳ_{i2}, ..., ȳ_{iq} ]^T ,

so that Eq. (2.7) is equivalent to the block diagonal system of equations,

Γ_i x̂_i = ŷ_i ,   (i = 1,2,...,p) .   (2.8)

Thus, the vector x̂_i satisfies a symmetric tridiagonal system of equations that has a
constant diagonal element and a constant super- and sub-diagonal element.
After Eq. (2.8) has been solved block by block it is possible to solve for x_j = Q x̄_j.
Thus we have:

Algorithm 1
1. Compute or determine the eigensystem of A and T.
2. Compute ȳ_j = Q^T y_j   (j = 1,2,...,q).
3. Solve Γ_i x̂_i = ŷ_i   (i = 1,2,...,p).
4. Compute x_j = Q x̄_j   (j = 1,2,...,q).
For our system, the eigenvalues of Γ_i may be written down as

ν_r^(i) = λ_i + 2 ω_i cos( rπ/(q+1) ) ,   r = 1,2,...,q .
It should be noted that only Q and the y_j, j = 1,2,...,q, have to be stored,
since the ȳ_j can overwrite the y_j, the x̂_i can overwrite the ȳ_j, and the x_j can overwrite
the x̄_j. A simple calculation will show that approximately 2p²q + 5pq arithmetic opera-
tions are required for the algorithm when step 3 is solved using Gaussian elimina-
tion for a tridiagonal matrix when the Γ_i are positive definite. The arithmetic opera-
tions are dominated by the 2p²q multiplications arising from the matrix multiplica-
tions of steps 2 and 4. It is not easy to reduce this number unless the matrix Q has
special properties (as in Poisson's equation) when the fast Fourier transform can be
used (see Hockney [8]).
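The four steps above can be put into code directly. The following is a minimal NumPy sketch of Algorithm 1, not the original implementation; the function name and the convention of storing the vectors y_j as columns of an array are assumptions for illustration.

```python
import numpy as np

def matrix_decomposition_solve(A, T, Y):
    """Sketch of Algorithm 1 for the block tridiagonal system (2.1)-(2.2).

    A, T : (p, p) commuting symmetric matrices (the blocks of M).
    Y    : (p, q) array whose columns are the right-hand sides y_j.
    Returns X whose columns are the solution vectors x_j.
    Assumes the eigenvalues of A are distinct, so one orthogonal Q
    diagonalizes both A and T simultaneously.
    """
    p, q = Y.shape
    # Step 1: eigensystem of A; the same Q serves T since AT = TA.
    lam, Q = np.linalg.eigh(A)            # A = Q diag(lam) Q^T
    omega = np.diag(Q.T @ T @ Q).copy()   # T = Q diag(omega) Q^T
    # Step 2: transform the right-hand sides, ybar_j = Q^T y_j.
    Ybar = Q.T @ Y
    # Step 3: for each i solve the tridiagonal Toeplitz system
    # Gamma_i xhat_i = yhat_i (diagonal lam[i], off-diagonal omega[i]).
    Xbar = np.empty_like(Ybar)
    for i in range(p):
        Gamma = lam[i] * np.eye(q) + omega[i] * (np.eye(q, k=1) + np.eye(q, k=-1))
        Xbar[i, :] = np.linalg.solve(Gamma, Ybar[i, :])
    # Step 4: transform back, x_j = Q xbar_j.
    return Q @ Xbar
```

A dense solve is used in step 3 for brevity; the operation count quoted in the text assumes the Γ_i are solved by tridiagonal Gaussian elimination instead.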
-
Further, it may be observed that

Γ_i = Z V_i Z^T ,

with V_i the diagonal matrix of eigenvalues of Γ_i and Z_{rs} = σ_s sin( rsπ/(q+1) ). Since the Γ_i
have the same set of eigenvectors,

Γ_i Γ_j = Γ_j Γ_i .

Because of this decomposition, step 3 can be solved by computing

x̂_i = Z V_i^{-1} Z^T ŷ_i ,

where the same Z is stored for each Γ_i. This therefore requires of the order of 2pq²
multiplications, and this approximately doubles the computing time for the algorithm.
Thus performing the fast Fourier transform method in step 3 as well as steps 2 and 4
is not advisable.
3. Block Cyclic Reduction
In Section 2, we gave a method for which one had to know the eigenvalues and
eigenvectors of some matrix. We now give a more direct method for solving the
system of Eq. (2.1).
We assume again that A and T are symmetric and that A and T commute. Further-
more, we assume that q = m-1 and

m = 2^{k+1} ,

where k is some positive integer. Let us rewrite Eq. (2.5b) as follows:

T x_{j-2} + A x_{j-1} + T x_j = y_{j-1} ,
T x_{j-1} + A x_j + T x_{j+1} = y_j ,
T x_j + A x_{j+1} + T x_{j+2} = y_{j+1} .

Multiplying the first and third equations by T, the second equation by -A, and adding,
we have

T² x_{j-2} + ( 2T² - A² ) x_j + T² x_{j+2} = T y_{j-1} - A y_j + T y_{j+1} .

Thus if j is even, the new system of equations involves x_j's with even indices.
Similar equations hold for x_2 and x_{m-2}. The process of reducing the equations in
-
this fashion is known as cyclic reduction. Then Eq. (2.1) may be written as the
following equivalent system:

[ (2T²-A²)   T²                      ] [ x_2     ]   [ T y_1 - A y_2 + T y_3           ]
[    T²   (2T²-A²)   T²             ] [ x_4     ]   [ T y_3 - A y_4 + T y_5           ]
[            .     .     .          ] [  ...    ] = [ ...                              ]   (3.1)
[                T²   (2T²-A²)      ] [ x_{m-2} ]   [ T y_{m-3} - A y_{m-2} + T y_{m-1} ]

and

A x_j = y_j - T ( x_{j-1} + x_{j+1} ) ,   j = 3,5,...,m-3 ,   (3.2)

together with the corresponding equations for x_1 and x_{m-1}.
Since m = 2^{k+1} and the new system of Eq. (3.1) involves x_j's with even indices, the
block dimension of the new system of equations is 2^k - 1. Note that once Eq. (3.1) is
solved, it is easy to solve for the x_j's with odd indices as evidenced by Eq. (3.2).
We shall refer to the system of Eq. (3.2) as the eliminated equations.
Also, note that Algorithm 1 may be applied to System (3.1). Since A and T
commute, the matrix (2T² - A²) has the same set of eigenvectors as A and T. Also, if
λ_i(A) = λ_i and λ_i(T) = ω_i for i = 1,2,...,p, then

λ_i( 2T² - A² ) = 2ω_i² - λ_i² .

Hockney [8] has advocated this procedure.
Since System (3.1) is block tridiagonal and of the form of Eq. (2.2), we can
apply the reduction repeatedly until we have one block. However, as noted above, we
can stop the process after any step and use the method of Section 2 to solve the
-
resulting equations.
To define the procedure recursively, let

A^(0) = A ,   T^(0) = T ;   y_j^(0) = y_j   (j = 1,2,...,m-1) .   (3.3)

Then for r = 0,1,...,k-1,

A^(r+1) = 2 (T^(r))² - (A^(r))² ,
T^(r+1) = (T^(r))² ,                                                  (3.4)
y_j^(r+1) = T^(r) ( y_{j-2^r}^(r) + y_{j+2^r}^(r) ) - A^(r) y_j^(r) .

The eliminated equations at each stage are the solution of the block diagonal system

A^(r-1) x_{2^{r-1}} = y_{2^{r-1}}^(r-1) - T^(r-1) x_{2^r} ,

A^(r-1) x_{j·2^r - 2^{r-1}} = y_{j·2^r - 2^{r-1}}^(r-1) - T^(r-1) ( x_{(j-1)·2^r} + x_{j·2^r} ) ,   (3.5)
        j = 2,3,...,2^{k-r+1} - 1 ,

A^(r-1) x_{2^{k+1} - 2^{r-1}} = y_{2^{k+1} - 2^{r-1}}^(r-1) - T^(r-1) x_{2^{k+1} - 2^r} .
After all of the k steps, we must solve the system of equations

A^(k) x_{2^k} = y_{2^k}^(k) .   (3.6)

In either case, we must solve Eq. (3.5) to find the eliminated unknowns, just as in
Eq. (3.2). If it is done by direct solution, an ill-conditioned system may arise.
Furthermore, A = A^(0) is tridiagonal, A^(1) is quindiagonal, and so on, destroying the
simple structure of the original system. Alternatively, polynomial factorization
retains the simple structure of A.
From Eq. (3.1), we note that A^(1) is a polynomial of degree 2 in A and T. By
induction, it is easy to show that A^(r) is a polynomial of degree 2^r in the matrices
A and T, so that

A^(r) = Σ_{j=0}^{2^{r-1}} c_{2j}^(r) A^{2j} T^{2^r - 2j} ≡ p_{2^r}(A,T) .

We shall proceed to determine the linear factors of p_{2^r}(A,T). Let

p_{2^r}(a,t) = Σ_{j=0}^{2^{r-1}} c_{2j}^(r) a^{2j} t^{2^r - 2j} .

For t ≠ 0, we make the substitution

a/t = -2 cos θ .   (3.7)

From Eq. (3.4), we note that

p_{2^{r+1}}(a,t) = 2 t^{2^{r+1}} - ( p_{2^r}(a,t) )² .   (3.8)

It is then easy to verify, using Eqs. (3.7) and (3.8), that

p_{2^r}(a,t) = -2 t^{2^r} cos 2^r θ ,

and, consequently,

p_{2^r}(a,t) = - Π_{j=1}^{2^r} ( a + 2t cos( (2j-1)π / 2^{r+1} ) ) ,

and, hence,

A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) T ) ,   (3.9)

where θ_j^(r) = (2j-1)π / 2^{r+1}.
Thus to solve the original system it is only necessary to solve the factored system
recursively. For example, when r = 1, we obtain

A^(1) = 2T² - A² = ( √2 T - A )( √2 T + A ) ,

whence the simple tridiagonal systems

( √2 T - A ) z = y ,
( √2 T + A ) x = z

are used to solve the system

A^(1) x = y .

We call this method the cyclic odd-even reduction and factorization (CORF) algorithm.
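The reduction (3.3)-(3.4) and the back-solution of the eliminated equations can be illustrated on the simplest case p = 1, where A and T are scalars and each block solve is a division. This toy sketch (names hypothetical) follows the recursion directly rather than the CORF factorization:

```python
import numpy as np

def cyclic_reduction_scalar(a, t, y):
    """Sketch of block cyclic reduction for p = 1 (A, T scalars).

    Solves the tridiagonal system with diagonal a and off-diagonal t,
    of order m-1 where m = 2**(k+1); y[j-1] holds y_j.
    Illustrative only; for matrix blocks the powers of A^(r) fill in
    and the polynomial factorization (3.9) would be used instead.
    """
    m = len(y) + 1
    k = int(np.log2(m)) - 1
    assert m == 2 ** (k + 1)
    Y = np.zeros(m + 1); Y[1:m] = y     # pad so x_0 = x_m = 0 index cleanly
    X = np.zeros(m + 1)
    levels = [(a, t, Y.copy())]         # (A^(r), T^(r), y^(r))
    # Forward reduction (3.4): after step r only multiples of 2**(r+1) remain.
    for r in range(k):
        h = 2 ** r
        A_r, T_r, Y_r = levels[-1]
        Y_n = Y_r.copy()
        for j in range(2 * h, m, 2 * h):
            Y_n[j] = T_r * (Y_r[j - h] + Y_r[j + h]) - A_r * Y_r[j]
        levels.append((2 * T_r**2 - A_r**2, T_r**2, Y_n))
    # Solve the single remaining equation (3.6): A^(k) x_{2^k} = y^(k)_{2^k}.
    A_k, _, Y_k = levels[k]
    X[m // 2] = Y_k[m // 2] / A_k
    # Back-substitution through the eliminated equations (3.5).
    for r in range(k - 1, -1, -1):
        A_r, T_r, Y_r = levels[r]
        h = 2 ** r
        for j in range(h, m, 2 * h):
            X[j] = (Y_r[j] - T_r * (X[j - h] + X[j + h])) / A_r
    return X[1:m]
```

Note how the magnitude of A^(r) grows rapidly with r; this growth is what motivates the Buneman variant of Section 5.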
-
4. Applications
Example I. Poisson's Equation with Dirichlet Boundary Conditions.
It is instructive to apply the results of Section 3 to the solution of the
finite-difference approximation to Poisson's equation on a rectangle, R, with speci-
fied boundary values. Consider the equation

u_xx + u_yy = f(x,y)   for (x,y) ∈ R ,   (4.1)
u(x,y) = g(x,y)   for (x,y) ∈ ∂R .

(Here ∂R indicates the boundary of R.) We assume that the reader is familiar with
the general technique of imposing a mesh of discrete points onto R and approximating
Eq. (4.1). The equation u_xx + u_yy = f(x,y) is approximated at (x_i, y_j) by

( v_{i-1,j} - 2v_{i,j} + v_{i+1,j} ) / (Δx)²  +  ( v_{i,j-1} - 2v_{i,j} + v_{i,j+1} ) / (Δy)²  =  f_{i,j}
        (1 ≤ i ≤ n-1 ,  1 ≤ j ≤ m-1) ,

with appropriate values taken on the boundary,

v_{0,j} = g_{0,j} ,   v_{n,j} = g_{n,j}   (1 ≤ j ≤ m-1) ,

and

v_{i,0} = g_{i,0} ,   v_{i,m} = g_{i,m}   (1 ≤ i ≤ n-1) .

Then v_{i,j} is an approximation to u(x_i, y_j), and f_{i,j} = f(x_i, y_j), g_{i,j} = g(x_i, y_j).
Hereafter, we assume that

m = 2^{k+1} .

When u(x,y) is specified on the boundary, we have the Dirichlet boundary con-
dition. For simplicity, we shall assume hereafter that Δx = Δy. Then

    [ -4   1              ]
    [  1  -4   1          ]
A = [      .    .    .    ]        and   T = I_{n-1} .
    [           1  -4     ]
        (n-1) x (n-1)
-
The matrix I_{n-1} indicates the identity matrix of order (n-1). A and T are symmetric
and commute, and thus the results of Sections 2 and 3 are applicable. In addition,
since A is tridiagonal, the use of the factorization (3.9) is greatly simplified.
The nine-point difference formula for the same Poisson equation can be treated
similarly with

    [ -20   4              ]          [ 4  1              ]
    [   4  -20   4         ]          [ 1  4  1           ]
A = [       .    .    .    ]  ,   T = [    .  .  .        ] ,
    [            4  -20    ]          [       1  4        ]
        (n-1) x (n-1)                     (n-1) x (n-1)
Example II
The method can also be used for Poisson's equation in rectangular regions
under natural boundary conditions, provided one uses

∂u/∂x ≈ ( u(x + h, y) - u(x - h, y) ) / 2h ,

and similarly for ∂u/∂y, at the boundaries.
Example III
Poisson's equation in a rectangle with doubly periodic boundary conditions is
an additional example where the algorithm can be applied.
Example IV
The method can be extended successfully to three dimensions for Poisson's
equation.
For all the above examples the eigensystems are known and the fast Fourier
transform can be applied.
Example V
An equation of the form

( K(x) u_x )_x + ( K(y) u_y )_y + u(x,y) = q(x,y)

on a rectangular region can be solved by the CORF algorithm provided the eigensystem
is calculated, since this is not generally known.
-
The counterparts in cylindrical polar co-ordinates can also be solved using
CORF on the rectangle in the appropriate co-ordinates.
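For Example I the eigensystem of A is known explicitly (a discrete sine basis), so Algorithm 1 reduces to two sine transforms and a diagonal division. The following is a hedged NumPy sketch, assuming Δx = Δy = 1 and zero boundary data; the explicit dense sine matrices stand in for the fast Fourier transform the text refers to:

```python
import numpy as np

def poisson_dirichlet_solve(F):
    """Solve the 5-point Poisson system of Example I on a rectangle.

    F : (n-1, m-1) array of right-hand sides f_{i,j}, with zero Dirichlet
    boundary data and unit mesh spacing.  Uses the known eigensystem
    Q_{rs} = sqrt(2/n) sin(rs pi / n); illustrative only (an FFT-based
    sine transform would replace the dense products in practice).
    """
    n1, m1 = F.shape
    n, m = n1 + 1, m1 + 1
    i = np.arange(1, n)
    j = np.arange(1, m)
    # Symmetric orthogonal eigenvector matrices of tridiag(1, -2, 1).
    Qx = np.sqrt(2.0 / n) * np.sin(np.outer(i, i) * np.pi / n)
    Qy = np.sqrt(2.0 / m) * np.sin(np.outer(j, j) * np.pi / m)
    # Eigenvalues of the one-dimensional second-difference operators.
    lx = -4.0 * np.sin(i * np.pi / (2 * n)) ** 2
    ly = -4.0 * np.sin(j * np.pi / (2 * m)) ** 2
    Fhat = Qx @ F @ Qy                          # transform right-hand side
    Vhat = Fhat / (lx[:, None] + ly[None, :])   # divide by eigenvalues
    return Qx @ Vhat @ Qy                       # transform back
```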
5. The Buneman Algorithm and Variants
In this section, we shall describe in detail the Buneman algorithm [2] and a
variation of it. The difference between the Buneman algorithm and the CORF algo-
rithm lies in the way the right hand side is calculated at each stage of the reduc-
tion. Henceforth, we shall assume that in the system of Eqs. (2.5), T = I_p, the
identity matrix of order p.
Again consider the system of equations as given by Eqs. (2.5) with q = 2^{k+1} - 1.
After one stage of cyclic reduction, we have

x_{j-2} + ( 2I_p - A² ) x_j + x_{j+2} = y_{j-1} + y_{j+1} - A y_j   (5.1)

for j = 2,4,...,q-1, with x_0 = x_{q+1} = 0, the null vector. Note that the right hand
side of Eq. (5.1) may be written as

y_j^(1) = y_{j-1} + y_{j+1} - A y_j = A^(1) A^{-1} y_j + y_{j-1} + y_{j+1} - 2 A^{-1} y_j ,   (5.2)

where A^(1) = ( 2I_p - A² ). Let us define

p_j^(1) = A^{-1} y_j ;   q_j^(1) = y_{j-1} + y_{j+1} - 2 p_j^(1) .

(These are easily calculated since A is a tridiagonal matrix.) Then

y_j^(1) = A^(1) p_j^(1) + q_j^(1) .   (5.3)

After r reductions, we have by Eq. (3.4)

y_j^(r+1) = y_{j-2^r}^(r) + y_{j+2^r}^(r) - A^(r) y_j^(r) .   (5.4)

Let us write

y_j^(r+1) = A^(r+1) p_j^(r+1) + q_j^(r+1)   (5.5)

in a fashion similar to Eq. (5.3). Substituting Eq. (5.5) into Eq. (5.4) and making
use of the identity (A^(r))² = 2I_p - A^(r+1) from Eq. (3.4), we have the following
relationships:

p_j^(r+1) = p_j^(r) - (A^(r))^{-1} ( p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ) ,   (5.6a)
-
q_j^(r+1) = q_{j-2^r}^(r) + q_{j+2^r}^(r) - 2 p_j^(r+1)   (5.6b)

for j = i·2^{r+1} (i = 1,2,...,2^{k-r} - 1), with

p_0^(r) = p_{2^{k+1}}^(r) = q_0^(r) = q_{2^{k+1}}^(r) = 0 .

Because the number of vectors q_j^(r) is reduced by a factor of two for each
successive r, the computer storage requirement becomes equal to almost twice the number of
data points.
To compute p_j^(r+1), we solve the system of equations

A^(r) ( p_j^(r) - p_j^(r+1) ) = p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ,

where A^(r) is given by the factorization Eq. (3.9); namely,

A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) I_p ) ,   θ_j^(r) = (2j-1)π / 2^{r+1} .

After k reductions, one has the equation

A^(k) x_{2^k} = y_{2^k}^(k) = A^(k) p_{2^k}^(k) + q_{2^k}^(k) ,

and hence we solve the system

x_{2^k} = p_{2^k}^(k) + ( A^(k) )^{-1} q_{2^k}^(k) .

Again one uses the factorization of A^(k) for computing ( A^(k) )^{-1} q_{2^k}^(k). To back-
solve, we use the relationship

x_{j-2^r} + A^(r) x_j + x_{j+2^r} = A^(r) p_j^(r) + q_j^(r)

for j = i·2^r (i = 1,2,...,2^{k+1-r} - 1), with x_0 = x_{2^{k+1}} = 0.
For j = 2^r, 3·2^r, ..., 2^{k+1} - 2^r, we solve the system of equations

A^(r) ( x_j - p_j^(r) ) = q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ,   (5.7)
-
using the factorization of A^(r); hence

x_j = p_j^(r) + ( x_j - p_j^(r) ) .   (5.8)

Thus to summarise, the Buneman algorithm proceeds as follows:
1. Compute the sequence { p_j^(r), q_j^(r) } by Eq. (5.6) for r = 1,...,k, with
   p_j^(0) = 0 for j = 0,...,2^{k+1}, and q_j^(0) = y_j for j = 1,2,...,2^{k+1} - 1.
2. Back-solve for x_j using Eqs. (5.7) and (5.8).

The use of the p_j^(r) and q_j^(r) produces a stable algorithm. Numerical experi-
ments by the author and his colleagues have shown that computationally the Buneman
algorithm requires approximately 30% less time than the fast Fourier transform
method of Hockney.
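The two phases just summarised can be sketched for the scalar case p = 1 (A a number, T = 1), tracking the p and q sequences of Eqs. (5.6) and back-solving via (5.7)-(5.8). A toy illustration with hypothetical names, not production code:

```python
import numpy as np

def buneman_scalar(a, y):
    """Sketch of the Buneman algorithm for p = 1 (A = scalar a, T = 1).

    Solves the tridiagonal system with diagonal a and off-diagonals 1,
    of order q = 2**(k+1) - 1.  Each block solve with A^(r) is here a
    single division; for matrix blocks the factorization (3.9) is used.
    """
    q = len(y)
    k = int(np.log2(q + 1)) - 1
    m = q + 1                            # m = 2**(k+1)
    P = np.zeros(m + 1)                  # p_j^(0) = 0
    Q = np.zeros(m + 1); Q[1:m] = y      # q_j^(0) = y_j
    levels = [(a, P, Q)]                 # (A^(r), p^(r), q^(r))
    for r in range(k):                   # reductions, Eqs. (5.6a)-(5.6b)
        h = 2 ** r
        A_r, P_r, Q_r = levels[-1]
        P_n, Q_n = P_r.copy(), Q_r.copy()
        for j in range(2 * h, m, 2 * h):
            P_n[j] = P_r[j] - (P_r[j - h] + P_r[j + h] - Q_r[j]) / A_r
            Q_n[j] = Q_r[j - h] + Q_r[j + h] - 2 * P_n[j]
        levels.append((2 - A_r * A_r, P_n, Q_n))  # A^(r+1) = 2 - (A^(r))^2
    A_k, P_k, Q_k = levels[k]
    X = np.zeros(m + 1)
    X[m // 2] = P_k[m // 2] + Q_k[m // 2] / A_k   # x_{2^k}
    for r in range(k - 1, -1, -1):                # back-solve, (5.7)-(5.8)
        A_r, P_r, Q_r = levels[r]
        h = 2 ** r
        for j in range(h, m, 2 * h):
            X[j] = P_r[j] + (Q_r[j] - (X[j - h] + X[j + h])) / A_r
    return X[1:m]
```

Note that the p_j^(r) stay of the size of the solution while the growth of A^(r) is confined to quantities that are divided by it, which is the source of the stability discussed in Section 6.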
6. Accuracy of the Buneman Algorithms
As was shown in Section 5, the Buneman algorithms consist of generating the
sequence of vectors { p_j^(r), q_j^(r) }. Let us write, using Eqs. (5.6),

p_j^(r) = x_j + d_j^(r) ,   (6.1a)
q_j^(r) = x_{j-2^r} + x_{j+2^r} - A^(r) d_j^(r) ,   (6.1b)

where

|| d_j^(r) ||_2 ≤ || S^(r) ||_2 ||y||' ,   (6.2)

and

S^(r) = ( A^(r-1) ··· A^(0) )^{-1} .   (6.3)

Then

|| p_j^(r) - x_j ||_2 ≤ || S^(r) ||_2 ||y||'   (6.4)

and

|| q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ||_2 ≤ || S^(r) A^(r) ||_2 ||y||' ,   (6.5)

where ||v||_2 indicates the Euclidean norm of a vector v, ||C||_2 indicates the
spectral norm of a matrix C, and

( ||y||' )² = Σ_{j=1}^{q} || y_j ||_2² .

Thus for A = A^T,

|| S^(r) ||_2 ≤ Π_{j=0}^{r-1} || ( A^(j) )^{-1} ||_2 ,

and since the A^(j) are polynomials of degree 2^j in A we have

|| S^(r) ||_2 ≤ max_{λ_i} Π_{j=0}^{r-1} | p_{2^j}(λ_i) |^{-1} ,

where the p_{2^j}(λ_i) are polynomials in λ_i, the eigenvalues of A.
For Poisson's equation it may be shown that

|| S^(r) ||_2 < e^{-c σ_r} ,   (6.6)

where σ_r = 2^r - 1 and c > 0. Thus || S^(r) ||_2 → 0 and hence

|| p_j^(r) - x_j ||_2 → 0 .

That is, p_j^(r) tends to the exact solution with increasing r. Since it can be shown
that || q_j^(r) ||_2 remains bounded throughout the calculation, the Buneman algorithm
leads to numerically stable results.
7. Non-Rectangular Regions
In many situations, one wishes to solve an elliptic equation over a region R
composed of two overlapping subregions R_1 and R_2 (figure omitted), where there
are n_1 data points in R_1, n_2 data points in R_2 and n_0 data points in
R_1 ∩ R_2. We shall assume that Dirichlet boundary conditions are given. When Δx is
-
the same throughout the region, one has a matrix equation of the form

[ G    P ] [ x^(1) ]   [ y^(1) ]
[ P^T  H ] [ x^(2) ] = [ y^(2) ] ,   (7.1)

where

    [ A  T          ]            [ B  S          ]
    [ T  A  T       ]            [ S  B  S       ]
G = [    .  .  .    ] ,      H = [    .  .  .    ]   (7.2)
    [       T  A    ]            [       S  B    ]
      (n_1 x n_1)                  (n_2 x n_2)

and P is (n_0 x n_0), embedded in the appropriate corner. Also, we write

x^(1) = [ x_1^(1), x_2^(1), ..., x_r^(1) ]^T ,   x^(2) = [ x_1^(2), x_2^(2), ..., x_s^(2) ]^T .   (7.3)

We assume again that AT = TA and BS = SB.
From Eq. (7.1), we see that

                              [ 0 ]
x^(1) = G^{-1} y^(1) - G^{-1} [ 0 ] x^(2)   (7.4)
                              [ P ]
and
-
                              [ P^T ]
x^(2) = H^{-1} y^(2) - H^{-1} [ 0   ] x^(1) .   (7.5)
                              [ 0   ]

Now let us write

G z^(1) = y^(1) ,   H z^(2) = y^(2) ,   (7.6)

          [ 0 ]              [ P^T ]
G W^(1) = [ 0 ] ,   H W^(2) = [ 0   ] .   (7.7)
          [ P ]              [ 0   ]

Then if we partition the vectors z^(1), z^(2) and the matrices W^(1) and W^(2) as
in Eq. (7.3), Eqs. (7.4) and (7.5) become

x_j^(1) = z_j^(1) - W_j^(1) x^(2) ,   (j = 1,2,...,r) ,
x_j^(2) = z_j^(2) - W_j^(2) x^(1) ,   (j = 1,2,...,s) .   (7.8)
For the unknown components in the overlap, Eq. (7.8) gives

[ I         W_r^(1) ] [ x^(2) ]   [ z_r^(1) ]
[ W_1^(2)   I       ] [ x^(1) ] = [ z_1^(2) ] .   (7.9)

It can be noted that W^(1) and W^(2) are dependent only on the given region, and hence
the algorithm becomes useful if many problems on the same region are to be conside-
red.
Thus, the algorithm proceeds as follows:
1. Solve for z^(1) and z^(2) using the methods of Section 2 or 3.
2. Solve for W^(1) and W^(2) using the methods of Section 2 or 3.
3. Solve Eq. (7.9) using Gaussian elimination. Save the LU decomposition of
   Eq. (7.9).
4. Solve for the unknown components of x^(1) and x^(2).
8. Conclusion
Numerous applications require the repeated solution of a Poisson equation.
The operation counts given by Dorr [5] indicate that the methods we have discussed
should offer significant economies over older techniques, and this has been verified
in practice by many users. Computational experiments comparing the Buneman
algorithm, the MD algorithm, the Peaceman-Rachford alternating direction algorithm,
and the point successive over-relaxation algorithm are given by Buzbee, et al. [3].
We conclude that the method of matrix decomposition, the Buneman algorithm, and
Hockney's algorithm (when used with care) are valuable methods.
This paper has benefited greatly from the comments of Dr. F. Dorr,
Mr. J. Alan George, Dr. R. Hockney and Professor 0. Widlund.
9. References

1. Richard Bellman, Introduction to Matrix Analysis, McGraw-Hill, New York, 1960.
2. Oscar Buneman, Stanford University Institute for Plasma Research, Report No. 294, 1969.
3. B.L. Buzbee, G.H. Golub and C.W. Nielson, "The Method of Odd/Even Reduction and Factorization with Application to Poisson's Equation, Part II," LA-4288, Los Alamos Scientific Laboratory. (To appear, SIAM J. Num. Anal.)
4. J.W. Cooley and J.W. Tukey, "An algorithm for the machine calculation of complex Fourier series," Math. Comp., Vol. 19, No. 90 (1965), pp. 297-301.
5. F.W. Dorr, "The direct solution of the discrete Poisson equation on a rectangle," to appear in SIAM Review.
6. J.A. George, "An Embedding Approach to the Solution of Poisson's Equation on an Arbitrary Bounded Region," to appear as a Stanford Report.
7. G.H. Golub, R. Underwood and J. Wilkinson, "Solution of Ax = λBx when B is positive definite," (to be published).
8. R.W. Hockney, "A fast direct solution of Poisson's equation using Fourier analysis," J. ACM, Vol. 12, No. 1 (1965), pp. 95-113.
9. R.W. Hockney, in Methods in Computational Physics (B. Adler, S. Fernbach and M. Rotenberg, Eds.), Vol. 9, Academic Press, New York and London, 1969.
10. R.E. Lynch, J.R. Rice and D.H. Thomas, "Direct solution of partial difference equations by tensor product methods," Num. Math., Vol. 6 (1964), pp. 185-199.
11. R.S. Varga, Matrix Iterative Analysis, Prentice Hall, New York, 1962.
-
Matrix Methods in Mathematical Programming
GENE GOLUB
Stanford University
-
1. Introduction
With the advent of modern computers, there has been a great development in
matrix algorithms. A major contributor to this advance is J. H. Wilkinson [30].
Simultaneously, a considerable growth has occurred in the field of mathematical
programming. However, in this field, until recently, very little analysis has been
carried out for the matrix algorithms involved.
In the following lectures, matrix algorithms will be developed which can be
efficiently applied in certain areas of mathematical programming and which give
rise to stable processes.
We consider problems of the following types:

maximize φ(x) ,   where x = ( x_1, x_2, ..., x_n )^T ,
subject to A x = b ,
           G x ≥ h ,

where the objective function φ(x) is linear or quadratic.
2. Linear Programming
The linear programming problem can be posed as follows:

maximize φ(x) = c^T x
subject to A x = b ,   (2.1)
           x ≥ 0 .     (2.2)

We assume that A is an m x n matrix, with m < n, which satisfies the Haar
condition (that is, every m x m submatrix of A is non-singular). The vector x is
said to be feasible if it satisfies the constraints (2.1) and (2.2).
Let I = { i_1, i_2, ..., i_m } be a set of m indices such that, on setting x_j = 0,
j ∉ I, we can solve the remaining m equations in (2.1) and obtain a solution such
that

x_{i_j} > 0 ,   j = 1, 2, ..., m .

This vector x is said to be a basic feasible solution. It is well-known that
the vector x which maximizes φ(x) = c^T x is a basic feasible solution, and this
suggests a possible algorithm for obtaining the optimum solution, namely, examine
all possible basic feasible solutions.
-
Such a process is generally inefficient. A more systematic procedure, due to
Dantzig, is the Simplex Algorithm. In this algorithm, a series of basic feasible
solutions is generated by changing one variable at a time in such a way that the
value of the objective function is increased at each step. There seems to be no
way of determining the rate of convergence of the simplex method; however, it works
well in practice.
The steps involved may be given as follows:

(i) Assume that we can determine a set of m indices I = { i_1, i_2, ..., i_m } such that
the corresponding x_{i_j} are the non-zero variables in a basic feasible solution.
Define the basis matrix

B = [ a_{i_1}, a_{i_2}, ..., a_{i_m} ] ,

where the a_{i_j} are columns of A corresponding to the basic variables.

(ii) Solve the system of equations:

B x̂ = b ,

where x̂^T = [ x_{i_1}, x_{i_2}, ..., x_{i_m} ].

(iii) Solve the system of equations:

B^T w = ĉ ,

where ĉ^T = [ c_{i_1}, c_{i_2}, ..., c_{i_m} ] are the coefficients of the basic variables in the
objective function.

(iv) Calculate

max_{j ∉ I} ( c_j - a_j^T w ) = c_r - a_r^T w , say.

If c_r - a_r^T w ≤ 0, then the optimum solution has been reached. Otherwise, a_r is to
be introduced into the basis.

(v) Solve the system of equations:

B t_r = -a_r .

If t_{rk} ≥ 0 , k = 1, 2, ..., m, then this indicates that the optimum solution is un-
bounded. Otherwise determine the component s for which

- x_{i_s} / t_{rs} = min_{1 ≤ k ≤ m} { - x_{i_k} / t_{rk} : t_{rk} < 0 } .
-
Eliminate the column a_{i_s} from the basis matrix and introduce column a_r.
This process is continued from step (ii) until an optimum solution is obtained (or
shown to be unbounded).
We have defined the complete algorithm explicitly, provided a termination rule,
and indicated how to detect an unbounded solution. We now show how the simplex
algorithm can be implemented in a stable numerical fashion.
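Steps (i)-(v) can be sketched directly. The following toy version performs the three solves with generic dense factorizations at every iteration (the stable update of the factors is the subject of Section 3); its names and tolerances are assumptions for illustration:

```python
import numpy as np

def simplex_iterations(A, b, c, basis, max_iter=100):
    """Sketch of the simplex steps (i)-(v) described above.

    A (m, n), b (m,), c (n,): maximize c @ x subject to A @ x = b, x >= 0.
    basis: list of m column indices giving an initial basic feasible solution.
    Teaching code only; factorizations are recomputed from scratch each step.
    """
    m, n = A.shape
    basis = list(basis)
    for _ in range(max_iter):
        B = A[:, basis]
        xhat = np.linalg.solve(B, b)         # (ii) basic variable values
        w = np.linalg.solve(B.T, c[basis])   # (iii) simplex multipliers
        reduced = c - A.T @ w                # (iv) c_j - a_j^T w
        reduced[basis] = 0.0
        r = int(np.argmax(reduced))
        if reduced[r] <= 1e-12:              # optimum reached
            x = np.zeros(n)
            x[basis] = xhat
            return x
        t = -np.linalg.solve(B, A[:, r])     # (v) B t_r = -a_r
        if np.all(t >= -1e-12):
            raise ValueError("objective is unbounded")
        # Ratio test: largest step keeping xhat + theta * t nonnegative.
        ratios = np.where(t < -1e-12, -xhat / t, np.inf)
        s = int(np.argmin(ratios))
        basis[s] = r                         # exchange a_{i_s} for a_r
    raise RuntimeError("iteration limit reached")
```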
3. A Stable Implementation of the Simplex Algorithm
Throughout the algorithm, there are three systems of linear equations to be
solved at each iteration. These are:
B x̂ = b ,
B^T w = ĉ ,
B t_r = -a_r .
Assuming Gaussian elimination is used, this requires about m3/3 multiplica-
tions for each system. However, if it is assumed that the triangular factors of B
are available, then only O(m²) multiplications are needed. An important considera-
tion is that only one column of B is changed in one iteration, and it seems reasonable
to assume that the number of multiplications can be reduced if use is made of this.
We would hope to reduce the m3/3 multiplications to O(m 2) multiplications per step.
This is the basis of the classical simplex method. The disadvantage of this method
is that the pivoting strategy which is generally used does not take numerical
stability into consideration. We now show that it is possible to implement the
simplex algorithm in a more stable manner, the cost being that more storage is re-
quired.
Consider methods for the solution of a set of linear equations. It is well-
known that there exists a permutation matrix Π such that

Π B = L U ,

where L is a lower triangular matrix, and U is an upper triangular matrix.
If Gaussian elimination with partial (row) pivoting is used, then we proceed
as follows:
Choose a permutation matrix Π_1 such that the maximum modulus element of the
first column of B becomes the (1, 1)-element of Π_1 B.
Define an elementary lower triangular matrix Γ_k as

        [ 1                              ]
        [    .                           ]
        [       1                        ]
Γ_k  =  [       γ_{k+1,k}   1            ]
        [       .                .       ]
        [       γ_{m,k}             1    ]

Now Γ_1 can be chosen so that

Γ_1 Π_1 B

has all elements below the diagonal in the first column set equal to zero.
Now choose Π_2 so that

Π_2 Γ_1 Π_1 B

has the maximum modulus element in the second column in position (2, 2), and
choose Γ_2 so that

Γ_2 Π_2 Γ_1 Π_1 B

has all elements below the diagonal in the second column set equal to zero. This
can be done without affecting the zeros already computed in the first column.
Continuing in this way we obtain:

Γ_{m-1} Π_{m-1} ... Γ_2 Π_2 Γ_1 Π_1 B = U ,

where U is an upper triangular matrix.
where U is an upper triangular matrix.
Note that permuting the rows of the matrix B merely implies a re-ordering of
the right-hand-side elements. Thus, no actual permutation need be performed,
merely a record kept. Further, any product of elementary lower triangular matrices
is a lower triangular matrix, as may easily be shown. Thus on the left-hand side
we have essentially a lower triangular matrix, and hence the required factorization.
The relevant elements of the successive matrices Γ_k can be stored in the
lower triangle of B, in the space where zeros have been introduced. Thus the
method is economical in storage.
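A minimal in-place sketch of this scheme (helper names are ours): the multipliers overwrite the zeros they introduce, and the row interchanges are merely recorded in a vector rather than performed on any right-hand side:

```python
import numpy as np

def lu_inplace(B):
    """Gaussian elimination with partial (row) pivoting.  Multipliers
    overwrite the lower triangle, and only a record `piv` of the row
    interchanges is kept, as described in the text."""
    A = B.astype(float).copy()
    m = A.shape[0]
    piv = np.arange(m)
    for k in range(m - 1):
        p = k + np.argmax(np.abs(A[k:, k]))   # max-modulus pivot in column k
        A[[k, p]] = A[[p, k]]                 # swap (a real code would just record)
        piv[[k, p]] = piv[[p, k]]
        A[k+1:, k] /= A[k, k]                 # multipliers fill the new zeros
        A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
    return A, piv

B = np.array([[1., 4., 2.], [3., 1., 1.], [2., 5., 3.]])
A, piv = lu_inplace(B)
L = np.tril(A, -1) + np.eye(3)   # unit lower triangle from the stored multipliers
U = np.triu(A)
# L @ U reproduces B with its rows reordered according to piv
```

The single working array holds both factors, which is the storage economy claimed above.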
To return to the linear programming problem, we require to solve a system of
equations of the form
B^(i) x = v                    (3.1)
where B^(i) and B^(i-1) differ in only one column (although the columns may be
re-ordered).
Consider the first iteration of the algorithm. Suppose that we have obtained
the factorization:
B^(0) = L^(0) U^(0) ,
where the right-hand-side vector has been re-ordered to take account of the permuta-
tions.
The solution to (3.1) with i = 0 is obtained by computing y = (L^(0))^{-1} v
and solving the triangular system
U^(0) x = y ,
each of which requires m^2/2 + O(m) multiplications.
Suppose that the column b_{s_0}^(0) is eliminated from B^(0) and the column g^(0) is
introduced as the last column; then
B^(1) = [ b_1^(0), ..., b_{s_0 - 1}^(0), b_{s_0 + 1}^(0), ..., b_m^(0), g^(0) ] .
Therefore,
(L^(0))^{-1} B^(1) = H^(1) ,
where H^(1) is upper triangular in its first s_0 - 1 columns and has a single
subdiagonal element in each of the remaining columns.
Such a matrix is called an upper Hessenberg matrix. Only the last column need be
computed, as all others are available from the previous step. We require to apply
a sequence of transformations to restore the upper triangular form. It is clear
that we have a particularly simple case of the LU factorization procedure as
previously described, where Γ_j^(1) is the identity matrix with a single
multiplier in position (j+1, j),
only one element requiring to be calculated. On applying a sequence of transforma-
tion matrices and permutation matrices as before, we obtain
Γ_{m-1}^(1) Π_{m-1}^(1) ... Γ_{s_0}^(1) Π_{s_0}^(1) H^(1) = U^(1)
where U^(1) is upper triangular.
Note that in this case to obtain Π_j^(1) it is only necessary to compare two
elements. Thus the storage required is very small: (m - s_0) multipliers γ_j^(1) and
(m - s_0) bits to indicate whether or not interchanges are necessary.
All elements in the computation are bounded, and so we have good numerical
accuracy throughout. The whole procedure compares favourably with standard forms,
for example, the product form of the inverse where no account of numerical accuracy
is taken. Further, this procedure requires fewer operations than the method which
uses the product form of the inverse. If we consider the steps involved, forward
and backward substitution with L^(0) and U^(i) require a total of m^2 multiplications,
and the application of the remaining transformations in (L^(i))^{-1} requires at most
i(m - 1) multiplications. (If we assume that on the average the middle column of
the basis matrix is eliminated, then this will be closer to (i/2)(m - 1).) Thus
a total of m^2 + i(m - 1) multiplications are required to solve the system at each
stage, assuming an initial factorization is available. Note that if the matrix A
is sparse, then the algorithm can make use of this structure as is done in the
method using the product form of the inverse.
4. Iterative refinement of the solution
Consider the set of equations
B x = v ,
and suppose that x̄ is a computed approximation to x. Let
x = x̄ + c .
Therefore,
B (x̄ + c) = v ,
that is,
B c = v - B x̄ .
We can now solve for c very efficiently, since the LU decomposition of B is
available. This process can be repeated until x is obtained to the required accur-
acy. The algorithm can be outlined as follows:
(i) Compute r_j = v - B x_j ,
(ii) Solve B c_j = r_j ,
(iii) Compute x_{j+1} = x_j + c_j .
It is necessary for r_j to be computed in double precision and then rounded to
single precision. Note that step (ii) requires O(m^2) operations, since the LU
decomposition of B is available. This procedure can be used in the following sections.
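The three steps above can be sketched directly (a minimal illustration: np.linalg.solve stands in for the O(m^2) triangular solves with the stored factors, and the residual is accumulated in extended precision where the platform provides it, as the text requires):

```python
import numpy as np

def refine(B, v, x0, steps=3):
    """Iterative refinement: (i) residual, (ii) correction via the cached
    factorization, (iii) update."""
    x = x0.copy()
    for _ in range(steps):
        r = v.astype(np.longdouble) - B.astype(np.longdouble) @ x  # (i) long residual
        c = np.linalg.solve(B, r.astype(float))                    # (ii) B c = r
        x = x + c                                                  # (iii)
    return x

B = np.array([[10., 7., 8.], [7., 5., 6.], [8., 6., 10.]])
v = np.array([25., 18., 24.])
x0 = np.linalg.solve(B, v) + 1e-4   # a slightly perturbed first approximation
x = refine(B, v, x0)                # converges to the solution (1, 1, 1)
```

Each sweep costs only the residual computation plus the two cached triangular solves.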
5. Householder Triangularization
Householder transformations have been widely discussed in the literature. In
this section we are concerned with their use in reducing a matrix A to upper-
triangular form, and in particular we wish to show how to update the decomposition
of A when its columns are changed one by one. This will open the way to implemen-
tation of efficient and stable algorithms for solving problems involving linear
constraints.
Householder transformations are symmetric orthogonal matrices of the form
P_k = I - β_k u_k u_k^T , where u_k is a vector and β_k = 2/(u_k^T u_k). Their utility in this
context is due to the fact that for any non-zero vector a it is possible to choose
u_k in such a way that the transformed vector P_k a is zero except for its first
element. Householder [15] used this property to construct a sequence of transfor-
mations to reduce a matrix to upper-triangular form. In [29], Wilkinson describes
the process and his error analysis shows it to be very stable.
Given any A, we can construct a sequence of transformations such that A is
reduced to upper triangular form. Premultiplying by P_0 annihilates (m - 1)
elements in the first column. Similarly, premultiplying by P_1 eliminates (m - 2)
elements in the second column, and so on.
Therefore,
P_{m-1} P_{m-2} ... P_1 P_0 A = [ R ; 0 ] ,          (5.1)
(R stacked above a zero block), where R is an upper triangular matrix.
Since the product of orthogonal matrices is an orthogonal matrix, we can
write (5.1) as
Q A = [ R ; 0 ] ,
so that
A = Q^T [ R ; 0 ] .
The above process is close to the Gram-Schmidt process in that it produces
a set of orthogonal vectors spanning E_n. In addition, the Householder transforma-
tion produces a complementary set of vectors which is often useful. Since this
process has been shown to be numerically stable, it does produce an orthogonal
matrix, in contrast to the Gram-Schmidt process.
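A compact sketch of the triangularization (function names are ours; a full-rank matrix is assumed so that no column ever vanishes), with Q kept in the product form that the text discusses below:

```python
import numpy as np

def house(a):
    """Return (u, beta) with (I - beta u u^T) a zero below its first entry."""
    u = a.astype(float).copy()
    u[0] += np.copysign(np.linalg.norm(a), a[0])   # sign choice avoids cancellation
    beta = 2.0 / (u @ u)
    return u, beta

def householder_qr(A):
    """Reduce A (m x n, m >= n) to [R; 0] by P_k = I - beta_k u_k u_k^T."""
    A = A.astype(float).copy()
    us = []
    for k in range(A.shape[1]):
        u, beta = house(A[k:, k])
        A[k:, k:] -= beta * np.outer(u, u @ A[k:, k:])
        us.append((u, beta))
    return A, us          # A now holds [R; 0]; us defines Q in product form

def apply_q(us, x):
    """Compute Q x = P_{r-1} ... P_0 x from the stored vectors u_k."""
    x = x.astype(float).copy()
    for k, (u, beta) in enumerate(us):
        x[k:] -= beta * (u @ x[k:]) * u
    return x

A0 = np.array([[1., 2.], [3., 4.], [5., 6.]])
R, us = householder_qr(A0)
```

Applying the stored product to an original column of A reproduces the corresponding column of [R; 0], which is exactly the relation Q A = [R; 0].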
If A = (a_1, ..., a_n) is an m x n matrix of rank r, then at the k-th stage of the
triangularization (k < r) we have
A^(k) = P_{k-1} P_{k-2} ... P_0 A = [ R_k  S_k ; 0  T_k ] ,
where R_k is an upper-triangular matrix of order k.
The next step is to compute
A^(k+1) = P_k A^(k), where P_k is chosen to reduce the first column of T_k to zero
except for the first component. This component becomes the last diagonal element
of R_{k+1}, and since its modulus is equal to the Euclidean length of the first column
of T_k, it should in general be maximized by a suitable interchange of the columns
of T_k. After r steps, T_r will be effectively zero (the length of each of its
columns will be smaller than some tolerance) and the process stops.
Hence we conclude that if rank(A) = r then for some permutation matrix Π the
Householder decomposition (or "QR decomposition") of A is
Q A Π = P_{r-1} P_{r-2} ... P_0 A Π = [ R  S ; 0  0 ] ,
where Q = P_{r-1} P_{r-2} ... P_0 is an m x m orthogonal matrix and R is upper-triangular
and non-singular.
We are now concerned with the manner in which Q should be stored and the
means by which Q, R, S may be updated if the columns of A are changed. We will
suppose that a column a_p is deleted from A and that a column a_q is added. It will
be clear what is to be done if only one or the other takes place.
Since the Householder transformations P_k are defined by the vectors u_k, the
usual method is to store the u_k's in the area beneath R, with a few extra words of
memory being used to store the β_k's and the diagonal elements of R. The product
Q x for some vector x is then easily computed in the form P_{r-1} P_{r-2} ... P_0 x where,
for example, P_0 x = (I - β_0 u_0 u_0^T) x = x - β_0 (u_0^T x) u_0. The updating is best
accomplished as follows. The first p-1 columns of the new R are the same as before;
the other columns p through n are simply overwritten by columns a_{p+1}, ..., a_n, a_q
and transformed by the product P_{p-1} P_{p-2} ... P_0 to obtain a new
[ S_{p-1} ; T_{p-1} ] ; then T_{p-1} is triangularized as usual.
This method allows Q to be kept in product form always, and there is no accumula-
tion of errors. Of course, if p = 1 the complete decomposition must be re-done,
and since with m ≥ n the work is roughly proportional to (m - n/3)n^2 this can mean
a lot of work. But if p ≈ n/2 on the average, then only about 1/8 of the original
work must be repeated at each updating.
Assume that we have a matrix A which is to be replaced by a matrix Ā formed
from A by eliminating column a_p and inserting a new vector g as the last column.
As in the simplex method, we can produce an updating procedure using Householder
transformations. If Ā is premultiplied by Q, the resulting matrix Q Ā has upper
Hessenberg form as before.
As before, this can be reduced to an upper triangular matrix in O(m^2) multiplica-
tions.
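This update can be sketched with a library QR as a stand-in for the Householder sweep (random data and index choices are ours; note that numpy returns A = Q R, so the text's Q corresponds to the transpose of numpy's factor):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 6, 4, 1
A = rng.standard_normal((m, n))
Q, R = np.linalg.qr(A, mode='complete')   # A = Q R, i.e. Q^T A = [R_n; 0]

g = rng.standard_normal(m)                # entering column
A_bar = np.column_stack([np.delete(A, p, axis=1), g])

# Upper Hessenberg in columns p..n-2; only the new column Q^T g is fresh work.
H = Q.T @ A_bar
assert np.allclose(np.tril(H, -2)[:, :n-1], 0, atol=1e-10)

# Re-triangularize H (a sweep of short Householder or Givens rotations would
# do this in O(m^2); a full QR is used here only for brevity).
Q2, R2 = np.linalg.qr(H, mode='complete')
```

The updated orthogonal factor is the product Q Q2, so Q stays in product form and no accumulation of the factorization error occurs.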
6. Projections
In optimization problems involving linear constraints it is often necessary
to compute the projections of some vector either into or orthogonal to the space
defined by a subset of the constraints (usually the current "basis"). In this
section we show how Householder transformations may be used to compute such pro-
jections. As we have shown, it is possible to update the Householder decomposi-
tion of a matrix when the number of columns in the matrix is changed, and thus we
will have an efficient and stable means of orthogonalizing vectors with respect to
basis sets whose component vectors are changing one by one.
Let the basis set of vectors a_1, a_2, ..., a_n form the columns of an m x n
matrix A, and let S_r be the sub-space spanned by {a_i}. We shall assume that the
first r vectors are linearly independent and that rank(A) = r. In general,
m ≥ n ≥ r, although the following is true even if m < n.
Given an arbitrary vector z we wish to compute the projections
u = P z ,   v = (I - P) z
for some projection matrix P, such that
(a) z = u + v ,
(b) u^T v = 0 ,
(c) u ∈ S_r (i.e., there exists x such that u = A x) ,
(d) v is orthogonal to S_r (i.e., A^T v = 0) .
One method is to write P as A A^+ where A^+ is the n x m generalized inverse of A,
and in [7] Fletcher shows how A^+ may be updated upon changes of basis. In contrast,
the method based on Householder transformations does not deal with A^+ explicitly
but instead keeps A A^+ in factorized form and simply updates the orthogonal matrix
required to produce this form. Apart from being more stable and just as efficient,
the method has the added advantage that there are always two orthonormal sets of
vectors available, one spanning S_r and the other spanning its complement.
As already shown, we can construct an m x m orthogonal matrix Q such that
Q A = [ R  S ; 0  0 ]
(with r and n - r columns in the two blocks), where R is an r x r upper-triangular
matrix. Let
w = Q z = [ w_1 ; w_2 ]   (w_1 of length r, w_2 of length m - r)          (6.1)
and define
u = Q^T [ w_1 ; 0 ] ,   v = Q^T [ 0 ; w_2 ] .                             (6.2)
Then it is easily verified that u, v are the required projections of z, which is to
say they satisfy the above four properties. Also, the x in (c) is readily shown
to be x = ( R^{-1} w_1 , 0 )^T .
In effect, we are representing the projection matrices in the form
P = Q^T [ I_r ; 0 ] ( I_r  0 ) Q                                          (6.3)
and
I - P = Q^T [ 0 ; I_{m-r} ] ( 0  I_{m-r} ) Q                              (6.4)
and we are computing u = P z, v = (I - P) z by means of (6.1), (6.2). The first r
columns of Q^T span S_r and the remaining m - r span its complement. Since Q and R may
be updated accurately and efficiently if they are computed using Householder
transformations, we have as claimed the means of orthogonalizing vectors with re-
spect to varying bases.
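The recipe (6.1)-(6.2) and the four properties can be checked numerically (random data; the text's Q is the transpose of numpy's orthogonal factor, since the text requires Q A = [R; 0]):

```python
import numpy as np

rng = np.random.default_rng(2)
m, r = 6, 3
A = rng.standard_normal((m, r))            # basis vectors, full rank assumed
Qnp, _ = np.linalg.qr(A, mode='complete')
Q = Qnp.T                                  # text's Q: Q A = [R; 0]

z = rng.standard_normal(m)
w = Q @ z                                               # (6.1)
u = Q.T @ np.concatenate([w[:r], np.zeros(m - r)])      # (6.2)
v = Q.T @ np.concatenate([np.zeros(r), w[r:]])

assert np.allclose(z, u + v)                            # (a)
assert abs(u @ v) < 1e-10                               # (b)
x = np.linalg.lstsq(A, u, rcond=None)[0]
assert np.allclose(A @ x, u)                            # (c): u lies in S_r
assert np.allclose(A.T @ v, 0)                          # (d)
```

Updating Q when a basis column changes then gives the projections for the new basis at O(m^2) cost per change.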
As an example of the use of the projection (6.4), consider the problem of
finding the stationary values of x^T A x subject to x^T x = 1 and C^T x = 0, where A is a
real symmetric matrix of order n and C is an n x p matrix of rank r, with r ≤ p.
Consider now the least-squares problem
min_x || b - A x ||_2 ,
where we assume that the rank of A is n.
Since length is invariant under an orthogonal transformation we have
|| b - A x ||_2^2 = || Q b - Q A x ||_2^2 ,
where
Q A = [ R ; 0 ] .
Let
Q b = c = [ c_1 ; c_2 ]   (c_1 of length n, c_2 of length m - n).
Then,
|| b - A x ||_2^2 = || c_1 - R x ||^2 + || c_2 ||^2 ,
and the solution to the least-squares problem is given by
x = R^{-1} c_1 .
Thus it is easy to solve the least-squares problem using orthogonal transformations.
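The steps above translate directly into code (a small line-fitting example of our own making):

```python
import numpy as np

A = np.array([[1., 1.], [1., 2.], [1., 3.], [1., 4.]])   # fit y = x1 + x2*t
b = np.array([3., 5., 7., 10.])
n = A.shape[1]

Qnp, _ = np.linalg.qr(A, mode='complete')
c = Qnp.T @ b                    # Q b = (c1, c2), c1 of length n
R = (Qnp.T @ A)[:n]              # the n x n upper-triangular factor
x = np.linalg.solve(R, c[:n])    # x = R^{-1} c1

# the minimum of ||b - Ax|| is ||c2||, attained at this x
assert np.allclose(np.linalg.norm(b - A @ x), np.linalg.norm(c[n:]))
```

The residual norm is read off as || c_2 || without ever forming the normal equations that the next paragraph warns about.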
Alternatively, the least-squares problem can be solved by constructing the
normal equations
A^T A x = A^T b .
However these are well known to be ill-conditioned.
Nevertheless the normal equations can be used in the following way.
Let the residual vector r be defined by:
r = b - A x .
Then,
A^T r = A^T b - A^T A x = 0 .
These equations can be written:
[ I    A ] ( r )   ( b )
[ A^T  0 ] ( x ) = ( 0 ) .          (7.1)
Transforming by Q, with Q A = [ R ; 0 ], and multiplying out, this becomes
[ I          [ R ; 0 ] ] ( s )   ( c )
[ [ R^T 0 ]      0     ] ( x ) = ( 0 ) ,
where s = Q r and c = Q b .
This system can easily be solved for x and r. The method of iterative refine-
ment may be applied to obtain a very accurate solution.
This method has been analysed by Björck [2].
8. Least-squares problem with linear constraints
Here we consider the problem
minimize || b - A x ||_2^2
subject to G x = h .
Using Lagrange multipliers λ, we may incorporate the constraints into
equation (7.1) and obtain
[ 0    0    G ] ( λ )   ( h )
[ 0    I    A ] ( r ) = ( b )
[ G^T  A^T  0 ] ( x )   ( 0 ) .
The methods of the previous sections can be applied to obtain the solution of this
system of equations, without actually constructing the above matrix. The problem
simplifies and a very accurate solution may be obtained.
Now we consider the problem
minimize || b - A x ||_2^2
subject to G x ≥ h .
Such a problem might arise in the following manner. Suppose we wish to approximate
given data by the polynomial
y(t) = α t^3 + β t^2 + γ t + δ
such that y(t) is convex. This implies
y''(t) = 6 α t + 2 β ≥ 0 .
Thus, we require
6 α t_i + 2 β ≥ 0 ,
where the t_i are the data points. (This does not necessarily guarantee that the poly-
nomial will be convex throughout the interval.) Introduce slack variables w such
that
G x - w = h ,
where w ≥ 0 .
Introducing Lagrange multipliers as before, we may write the system as:
[ 0    0    G   -I ] ( λ )   ( h )
[ 0    I    A    0 ] ( r ) = ( b )
[ G^T  A^T  0    0 ] ( x )   ( 0 )
                     ( w )
At the solution, we must have
λ ≥ 0 ,   w ≥ 0 ,   λ^T w = 0 .
This implies that when a Lagrange multiplier is non-zero then the corresponding
constraint holds with equality.
Conversely, corresponding to a non-zero w_i the Lagrange multiplier must be
zero. Therefore, if we knew which constraints held with equality at the solution,
we could treat the problem as a linear least-squares problem with linear equality
constraints. A technique, due to Cottle and Dantzig [5], exists for solving the
problem in this way.
Bibliography
[1] Beale, E. M. L., "Numerical Methods", in Nonlinear Programming, J. Abadie (ed.),
John Wiley, New York, 1967; pp. 133-205.
[2] Björck, Å., "Iterative Refinement of Linear Least Squares Solutions II", BIT 8
(1968), pp. 8-30.
[3] Björck, Å., and G. H. Golub, "Iterative Refinement of Linear Least Squares
Solutions by Householder Transformations", BIT 7 (1967), pp. 322-37.
[4] Björck, Å., and V. Pereyra, "Solution of Vandermonde Systems of Equations",
Publication 70-02, Universidad Central de Venezuela, Caracas, Venezuela, 1970.
[5] Cottle, R. W., and G. B. Dantzig, "Complementary Pivot Theory of Mathematical
Programming", in Mathematics of the Decision Sciences, Part 1, G. B. Dantzig and
A. F. Veinott (eds.), American Mathematical Society (1968), pp. 115-136.
[6] Dantzig, G. B., R. P. Harvey, R. D. McKnight, and S. S. Smith, "Sparse Matrix
Techniques in Two Mathematical Programming Codes", Proceedings of the Symposium
on Sparse Matrices and Their Applications, T. J. Watson Research Publication
RA1, no. 11707, 1969.
[7] Fletcher, R., "A Technique for Orthogonalization", J. Inst. Maths. Applics. 5
(1969), pp. 162-66.
[8] Forsythe, G. E., and G. H. Golub, "On the Stationary Values of a Second-Degree
Polynomial on the Unit Sphere", J. SIAM, 13 (1965), pp. 1050-68.
[9] Forsythe, G. E., and C. B. Moler, Computer Solution of Linear Algebraic Systems,
Prentice-Hall, Englewood Cliffs, New Jersey, 1967.
[10] Francis, J., "The QR Transformation. A Unitary Analogue to the LR Transforma-
tion", Comput. J. 4 (1961-62), pp. 265-71.
[11] Golub, G. H., and C. Reinsch, "Singular Value Decomposition and Least Squares
Solutions", Numer. Math., 14 (1970), pp. 403-20.
[12] Golub, G. H., and R. Underwood, "Stationary Values of the Ratio of Quadratic
Forms Subject to Linear Constraints", Technical Report No. CS 142, Computer
Science Department, Stanford University, 1969.
[13] Hanson, R. J., "Computing Quadratic Programming Problems: Linear Inequality
and Equality Constraints", Technical Memorandum No. 240, Jet Propulsion
Laboratory, Pasadena, California, 1970.
[14] Hanson, R. J., and C. L. Lawson, "Extensions and Applications of the House-
holder Algorithm for Solving Linear Least Squares Problems", Math. Comp., 23
(1969), pp. 787-812.
[15] Householder, A. S., "Unitary Triangularization of a Nonsymmetric Matrix",
J. Assoc. Comp. Mach., 5 (1958), pp. 339-42.
[16] Lanczos, C., Linear Differential Operators, Van Nostrand, London, 1961,
Chapter 3.
[17] Leringe, Ö., and P. Wedin, "A Comparison Between Different Methods to Compute
a Vector x Which Minimizes ||Ax - b||_2 When Gx = h", Technical Report, Depart-
ment of Computer Science, Lund University, Sweden.
[18] Levenberg, K., "A Method for the Solution of Certain Non-Linear Problems in
Least Squares", Quart. Appl. Math., 2 (1944), pp. 164-68.
[19] Marquardt, D. W., "An Algorithm for Least-Squares Estimation of Non-Linear
Parameters", J. SIAM, 11 (1963), pp. 431-41.
[20] Meyer, R. R., "Theoretical and Computational Aspects of Nonlinear Regression",
P-1819, Shell Development Company, Emeryville, California.
[21] Penrose, R., "A Generalized Inverse for Matrices", Proceedings of the
Cambridge Philosophical Society, 51 (1955), pp. 406-13.
[22] Peters, G., and J. H. Wilkinson, "Eigenvalues of Ax = λBx with Band Symmetric
A and B", Comput. J., 12 (1969), pp. 398-404.
[23] Powell, M.J.D., "Rank One Methods for Unconstrained Optimization", T. P. 372,
Atomic Energy Research Establishment, Harwell, England, (1969).
[24] Rosen, J. B., "Gradient Projection Method for Non-linear Programming. Part
I. Linear Constraints", J. SIAM, 8 (1960), pp. 181-217.
[25] Shanno, D. C., "Parameter Selection for Modified Newton Methods for Function
Minimization", J. SIAM Numer. Anal., Ser. B, 7 (1970).
[26] Stoer, J., "On the Numerical Solution of Constrained Least Squares Problems",
(private communication), 1970.
[27] Tewarson, R. P., "The Gaussian Elimination and Sparse Systems", Proceedings
of the Symposium on Sparse Matrices and Their Applications, T. J. Watson
Research Publication RA1, no. 11707, 1969.
[28] Wilkinson, J. H., "Error Analysis of Direct Methods of Matrix Inversion",
J. Assoc. Comp. Mach., 8 (1961), pp. 281-330.
[29] Wilkinson, J. H., "Error Analysis of Transformations Based on the Use of
Matrices of the Form I - 2ww^H", in Error in Digital Computation, Vol. II, L.
B. Rall (ed.), John Wiley and Sons, Inc., New York, 1965, pp. 77-101.
[30] Wilkinson, J. H., The Algebraic Eigenvalue Problem, Clarendon Press, Oxford,
1965.
[31] Zoutendijk, G., Methods of Feasible Directions, Elsevier Publishing Company,
Amsterdam (1960), pp. 80-90.
Topics in Stability Theory for Partial Difference Operators
VIDAR THOMÉE
University of Gothenburg
PREFACE
The purpose of these lectures is to present a short introduction to some aspects
of the theory of difference schemes for the solution of initial value problems for
linear systems of partial differential equations. In particular, we shall discuss
various stability concepts for finite difference operators and the related question
of convergence of the solution of the discrete problem to the solution of the con-
tinuous problem. Special emphasis will be given to the strong relationship between
stability of difference schemes and correctness of initial value problems.
In practice, most important applications deal with mixed initial boundary value
problems for non-linear equations. It will not be possible in this short course to
develop the theory in such a general context. However, the results in the particular
cases we shall treat have intuitive implications for the more complicated situations.
The two most important methods in stability theory for difference operators have been
the Fourier method and the energy method. The former applies in its pure form only
to equations with constant coefficients whereas the latter is more directly appli-
cable to variable coefficients and even to non-linear situations. Often different
methods have to be combined so that for instance Fourier methods are first used to
analyse the linearized equations with coefficients fixed at some point and then the
energy method, or some other method, is applied to appraise the error committed by
treating the simplified case. We have chosen in these lectures to concentrate on
Fourier techniques.
These notes were developed from material used previously by the author for a
similar course held in the summer of 1968 in a University of Michigan engineering
summer conference on numerical analysis and also used for the author's survey paper
[36]. Some of the relevant literature is collected in the list of references. A
thorough account of the theory can be obtained by combining the book by Richtmyer
and Morton [28] with the above-mentioned survey paper [36]. Both these sources
contain extensive lists of further references.
1. Introduction
Let 𝒞 be the set of uniformly continuous, bounded functions of x, and let 𝒞^k
be the set of functions v with (d/dx)^j v in 𝒞 for j ≤ k. For v ∈ 𝒞 set
    ‖v‖ = sup_x |v(x)| .
For any v ∈ 𝒞, any k, and ε > 0 we can find ṽ ∈ 𝒞^k such that
    ‖v - ṽ‖ ≤ ε .
Consider the initial-value problem for the heat equation,
    ∂u/∂t = ∂²u/∂x² ,   t > 0 ,      (1)
    u(x,0) = v(x) .                  (2)
If v ∈ 𝒞² this problem admits one and only one solution in 𝒞, namely
    u(x,t) = (4πt)^{-1/2} ∫ exp(-(x-y)²/4t) v(y) dy ,   t > 0 .      (3)
It is clear that the solution u depends for fixed t linearly on v; we define a
linear operator E₀(t) by
    E₀(t)v = u(·,t) ,
where u is defined by (3) and where v ∈ 𝒞². The solution operator E₀(t) has the
properties
    ‖E₀(t)v‖ ≤ ‖v‖
and
    ‖E₀(t)v - v‖ → 0   as t → 0+ ,
and can therefore, since 𝒞² is dense in 𝒞, be extended to a bounded linear
operator E(t) defined on all of 𝒞, the generalized solution operator.
The operator E(t) still has the properties
    E(t)v → v   as t → 0+ ,      (4)
    ‖E(t)v‖ ≤ ‖v‖ ,              (5)
and is continuous in t for t ≥ 0. For this particular equation we actually get a
classical solution for t > 0 even if v is only in 𝒞; we have E(t)v ∈ ⋂_{k=0}^∞ 𝒞^k
for t > 0.
Consider now the initial-value problem
    ∂u/∂t = ∂u/∂x ,   t > 0 ,      (6)
    u(x,0) = v(x) .                (7)
For v ∈ 𝒞¹ this problem admits one and only one genuine solution, namely
    u(x,t) = v(x + t) .
Clearly
    ‖u(·,t)‖ ≤ ‖v‖
(actually we have equality) and it is again natural to define a generalized solution
operator, continuous in t, by
    E(t)v = v(· + t) .
This has again the properties (4), (5). In this case, the solution is as irregular
for t > 0 as it is for t = 0.
Both these problems are thus "correctly posed" in 𝒞; they can be uniquely
solved for a dense subset of 𝒞 and the solution operator is bounded.
We could instead of 𝒞 also have considered other basic classes of functions.
Thus let L² be the set of square integrable functions with
    ‖v‖ = ( ∫ |v(x)|² dx )^{1/2} .
Consider again the initial-value problem (1), (2) and assume that u(x,t) is a classi-
cal solution and that u(x,t) tends to zero as fast as necessary when |x| → ∞ for
the following to hold. Assume for simplicity that u is real-valued. We then have
    (d/dt) ∫ u² dx = 2 ∫ u (∂²u/∂x²) dx = -2 ∫ (∂u/∂x)² dx ≤ 0 ,      (8)
so that for t ≥ 0,
    ‖u(·,t)‖ ≤ ‖v‖ .      (9)
Relative to the present framework it is also possible to define genuine and gene-
ralized solution operators; the latter is defined on the whole of L² and satisfies
(4), (5).
For the problem (6), (7) the calculation corresponding to (8) goes similarly:
    (d/dt) ∫ u² dx = 2 ∫ u (∂u/∂x) dx = ∫ (∂/∂x)(u²) dx = 0 .
One other way of looking at this is to introduce the Fourier transform; for
integrable v, set
    v̂(ξ) = ∫ e^{-iξx} v(x) dx .      (10)
Notice the Parseval relation: for v in addition in L² we have v̂ ∈ L² and
    ‖v̂‖ = (2π)^{1/2} ‖v‖ .
For the Fourier transform û(ξ,t) with respect to x of the solution u(x,t) we then
get initial-value problems for ordinary differential equations, namely
    dû/dt = -ξ² û ,   û(ξ,0) = v̂(ξ) ,
for (1), (2), and
    dû/dt = iξ û ,   û(ξ,0) = v̂(ξ) ,
for (6), (7). These have the solutions
    û(ξ,t) = e^{-ξ²t} v̂(ξ)      (11)
and
    û(ξ,t) = e^{iξt} v̂(ξ) ,     (12)
respectively, and the actual solutions can be obtained, under certain conditions,
by the inverse Fourier transform. Also by Parseval's formula we have for both (11)
and (12),
    ‖u(·,t)‖ ≤ ‖v‖ ,
which is again (9).
For the purpose of approximate solution of the initial-value problem (1), (2),
we replace the derivatives by difference quotients,
    (u(x,t+k) - u(x,t))/k = (u(x+h,t) - 2u(x,t) + u(x-h,t))/h² ,
where h, k are small positive numbers which we shall later make tend to zero in such
a fashion that λ = k/h² is kept constant. Solving for u(x,t+k), we get
    u(x,t+k) = λ u(x-h,t) + (1 - 2λ) u(x,t) + λ u(x+h,t) .      (13)
This suggests that for the exact (generalized) solution to (1), (2),
    E(k)v ≈ E_k v ,   where   E_k v(x) = λ v(x-h) + (1 - 2λ) v(x) + λ v(x+h) ,
or after n steps
    E(nk)v ≈ E_k^n v .
We shall prove that this is essentially correct for any v ∈ 𝒞 if, but only if, λ ≤ 1/2.
Thus, let us first notice that if λ ≤ 1/2, then the coefficients of E_k are all non-
negative and add up to 1 so that (the norm is again the sup-norm)
    ‖E_k v‖ ≤ ‖v‖ ,
or generally
    ‖E_k^n v‖ ≤ ‖v‖ .
The boundedness of the powers of E_k is referred to as stability of E_k.
Assume now that v ∈ 𝒞⁴. We then know that the classical solution of (1), (2)
exists, and if u(x,t) = E(t)v = E₀(t)v, then u(·,t) ∈ 𝒞⁴ for t ≥ 0 with derivatives
bounded in terms of those of v. We shall prove that, if nk = t, then
    ‖E_k^n v - E(nk)v‖ ≤ C t h² ‖v⁽⁴⁾‖ .
To see this let us consider E_k^n v - E(nk)v. Notice now that we can write
    E_k^n v - E(nk)v = Σ_{j=0}^{n-1} E_k^{n-1-j} ( E_k - E(k) ) E(jk) v ,
and Taylor expansion shows, for the smooth solution, that
    ‖( E_k - E(k) ) E(jk) v‖ ≤ C k h² ‖v⁽⁴⁾‖ .
Therefore, using the stability of E_k established above,
    ‖E_k^n v - E(nk)v‖ ≤ n C k h² ‖v⁽⁴⁾‖ = C t h² ‖v⁽⁴⁾‖ ,
which we wanted to prove.
We shall now prove that for v not necessarily in 𝒞⁴, but only in 𝒞, we still
have for nk = t,
    ‖E_k^n v - E(t)v‖ → 0   when k → 0 .
To see this, let ε > 0 be arbitrary, and choose ṽ ∈ 𝒞⁴ such that
    ‖v - ṽ‖ ≤ ε .
We then have
    ‖E_k^n v - E(t)v‖ ≤ ‖E_k^n (v - ṽ)‖ + ‖E_k^n ṽ - E(t)ṽ‖ + ‖E(t)(ṽ - v)‖
                      ≤ 2ε + C t h² ‖ṽ⁽⁴⁾‖ .
Therefore, choosing h so small that C t h² ‖ṽ⁽⁴⁾‖ ≤ ε, we have
    ‖E_k^n v - E(t)v‖ ≤ 3ε ,
which concludes the proof.
Consider now the case λ > 1/2. The middle coefficient in E_k is then negative.
Taking
    v(x) = e^{iπx/h} ,
we get
    E_k v(x) = ( λ e^{-iπ} + (1 - 2λ) + λ e^{iπ} ) v(x) = (1 - 4λ) v(x) ,
so that the effect of E_k is multiplication by (1 - 4λ). We generally get
    E_k^n v = (1 - 4λ)^n v .
Since λ > 1/2 we have 1 - 4λ < -1 and it follows that it is not possible to have
an inequality of the form
    ‖E_k^n v‖ ≤ C ‖v‖ .
This can also be interpreted to mean that small errors in the initial data are blown
up to an extent where they overshadow the real solution. This phenomenon is called
instability.
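The threshold λ = 1/2 is easy to observe numerically. The sketch below (periodic boundary conditions and the particular data are our choices) applies E_k from (13) repeatedly to rough initial data:

```python
import numpy as np

def heat_step(u, lam):
    """One application of E_k from (13), with periodic boundaries."""
    return lam * np.roll(u, 1) + (1 - 2 * lam) * u + lam * np.roll(u, -1)

x = np.linspace(0.0, 1.0, 50, endpoint=False)
v = np.where(x < 0.5, 1.0, 0.0)          # rough initial data

results = {}
for lam in (0.4, 0.6):
    u = v.copy()
    for _ in range(200):
        u = heat_step(u, lam)
    results[lam] = np.max(np.abs(u))
# lam = 0.4: the coefficients are non-negative and sum to 1, so the
#            sup-norm never exceeds ||v|| = 1 (stability);
# lam = 0.6: the (-1)^j mode is amplified by |1 - 4*lam| = 1.4 per step
```

After 200 steps the unstable run has grown by many orders of magnitude, exactly the blow-up of small initial errors described above.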
Instead of the simple difference scheme (13) we could study a more general
type of operator, e.g.
    E_k v(x) = Σ_j a_j v(x + jh) .      (14)
If we want this to be "consistent" with the equation (1) we have to demand that E_k
approximates E(k), or if u(x,t) is a solution, then
    k^{-1} ( E_k u(·,t) - u(·,t+k) ) → 0   as k → 0 .
Taylor series development gives for smooth u the consistency relations
    Σ_j a_j = 1 ,   Σ_j j a_j = 0 ,   Σ_j j² a_j = 2λ .
Assuming these consistency relations to hold and assuming that all the a_j are ≥ 0,
we get as above
    ‖E_k^n v‖ ≤ ‖v‖ ,      (15)
and the convergence analysis above can be carried over to this more general case
with few changes.
However, the reason for choosing an operator of the form (14) which is not our
old operator (13) would be to obtain higher accuracy in the approximation, and it
will then turn out that in general not all the coefficients are non-negative. We
cannot have (15) then, but we may still have
    ‖E_k^n v‖ ≤ C ‖v‖   for nk ≤ T ,
for some C depending on T.
When we work with the L²-norm rather than the maximum norm, Fourier transforms
are again helpful; indeed in most of the subsequent lectures, Fourier analysis will
be the foremost tool.
Thus, let v̂ be the Fourier transform of v defined by (10). We then have
    (E_k v)^(ξ) = Σ_j a_j e^{ijhξ} v̂(ξ) ,
or, introducing the characteristic (trigonometric) polynomial of the operator E_k,
    a(ξ) = Σ_j a_j e^{ijξ} ,
we find that the effect of E_k on the Fourier transform side is multiplication by
a(hξ). One easily finds that similarly, the effect of E_k^n is multiplication by
a(hξ)^n. Using Parseval's relation, one then easily finds (the norm is now the L²-
norm)
    ‖E_k^n v‖ ≤ sup_ξ |a(ξ)|^n ‖v‖ ,
and that this inequality is the best possible. It follows that we have stability if
and only if |a(ξ)| ≤ 1 for all real ξ. We then actually have (15) in the L²-norm.
Consider again the special operator (13). We have in this case
    a(ξ) = λ e^{-iξ} + (1 - 2λ) + λ e^{iξ} = 1 - 4λ sin²(ξ/2) ,
and a(ξ) takes all values in the interval [1 - 4λ, 1]. We therefore find that also in
L² we have stability if and only if 1 - 4λ ≥ -1, that is λ ≤ 1/2.
Difference approximations to the initial value problem (6), (7) can be analysed
similarly.
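The L² criterion for (13) can be checked by tabulating the characteristic polynomial directly (a small numerical illustration of our own):

```python
import numpy as np

def a(xi, lam):
    """Characteristic trigonometric polynomial of the operator (13)."""
    return 1.0 - 4.0 * lam * np.sin(xi / 2.0) ** 2

xi = np.linspace(-np.pi, np.pi, 10001)
sup = {lam: float(np.max(np.abs(a(xi, lam)))) for lam in (0.25, 0.5, 0.75)}
# sup |a| = max(1, |1 - 4*lam|): it equals 1 precisely when lam <= 1/2,
# which is the L^2 stability condition derived above
```

For λ = 0.75 the supremum is |1 - 4λ| = 2, attained at ξ = π, the same sawtooth mode used in the sup-norm instability argument.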
We shall put the above considerations in a more general setting and discuss an
initial-value problem in a Banach space B. Thus let A be a linear operator with
domain D(A) and let v ∈ B. Consider then the problem of finding u(t) ∈ B, t ≥ 0,
such that
    du/dt = A u(t) ,   t > 0 ,      (16)
    u(0) = v .                      (17)
More precisely, we shall say that u(t), t ≥ 0, is a genuine solution of (16), (17)
if (17) holds and
    (i) u(t) ∈ D(A) for t > 0 ,
    (ii) ‖k^{-1}( u(t+k) - u(t) ) - A u(t)‖ → 0 as k → 0, for t > 0 .
We shall now study the approximation of a solution u(t) = E(t)v of a correctly
posed initial-value problem (16), (17). We will then for small k, k ≤ k₀, consider
an approximation E_k of E(k), where E_k is a bounded linear operator with D(E_k) = B
which depends continuously on k for 0 < k ≤ k₀. The thought is then that E_k^n v
is going to approximate E(nk)v = E(k)^n v.
We say that the operator E_k is consistent with the initial-value problem (16),
(17) if there is a set 𝒟 of genuine solutions of (16), (17) such that
    (i) the set of initial values { u(0) : u ∈ 𝒟 } is dense in B ,
    (ii) for u ∈ 𝒟, k^{-1} ‖E_k u(t) - u(t+k)‖ → 0 as k → 0, uniformly in t
         for 0 ≤ t ≤ T, for any T > 0.
If the operator E_k is consistent with (16), (17), we say that it is convergent
(in B) if for any v ∈ B and any t ≥ 0, and any pair of sequences {n_j}, {k_j}
with k_j → 0, n_j k_j → t for j → ∞, we have
    ‖E_{k_j}^{n_j} v - E(t)v‖ → 0   when j → ∞ .
We say that the operator E_k is stable (in B) if for any T > 0 there is a con-
stant C such that
    ‖E_k^n‖ ≤ C   for nk ≤ T ,  0 < k ≤ k₀ .
It turns out that consistency alone does not guarantee convergence; we have the
following theorem which is referred to as Lax's equivalence theorem [22].
Theorem. Assume that (16), (17) is correctly posed and that E_k is a consistent
approximation operator. Then stability is necessary and sufficient for convergence.
The proof of the sufficiency of stability for convergence is similar to the
proof in the particular case treated above; the proof of the necessity depends on
the Banach-Steinhaus theorem.
2. Initial-value problems in L² with constant coefficients
We begin with some notation. We shall work here with the Banach space
L² = L²(Rᵈ) with the norm
    ‖v‖ = ( ∫_{Rᵈ} |v(x)|² dx )^{1/2} .
We define for a multi-index α = (α₁, ..., α_d) with |α| = Σ_j α_j the operator
    Dᵅ = (∂/∂x₁)^{α₁} ··· (∂/∂x_d)^{α_d} ,
and double bars will indicate norms with respect to L², so that for the N-vector
u(x) ∈ L²,
    ‖u‖ = ( ∫_{Rᵈ} Σ_{j=1}^N |u_j(x)|² dx )^{1/2} .
For later use we need the following
Lemma 1. Let 𝒮 be a dense subset of L² and let a(ξ) be a continuous N×N
matrix-valued function. Then the operator A taking v into the inverse Fourier
transform of a(ξ)v̂(ξ) satisfies
    sup_{v ∈ 𝒮, v ≠ 0} ‖A v‖ / ‖v‖ = sup_ξ |a(ξ)| .
Let u(x,t) be an N-vector-function defined for x ∈ Rᵈ and t ≥ 0. Consider the
initial-value problem
    ∂u/∂t = P u = Σ_{|α| ≤ M} P_α Dᵅ u ,   t > 0 ,      (1)
    u(x,0) = v(x) ,                                     (2)
where the P_α are constant N×N matrices and where we can consider Pu to be defined
for u ∈ 𝒮. Let
    P̂(ξ) = Σ_{|α| ≤ M} P_α (iξ)ᵅ .
We have:
Theorem 1. The initial-value problem (1), (2) is correctly posed in L² if and only
if, for any T ≥ 0, there is a C such that
    |e^{t P̂(ξ)}| ≤ C   for ξ ∈ Rᵈ ,  0 ≤ t ≤ T .      (3)
Proof. Assume that (3) holds. Let v ∈ 𝒮 and consider
    u(x,t) = (2π)^{-d} ∫ e^{i⟨x,ξ⟩} e^{t P̂(ξ)} v̂(ξ) dξ .      (4)
By differentiation under the integral sign we find that u(x,t) satisfies (1), and so
is a solution to (1), (2). Since for t ≥ 0, u(x,t) ∈ 𝒮, it is a genuine solution
in the sense of Lecture 1 and is also unique. Thus E₀(t)v = u(·,t). By
Fourier's inversion formula and Parseval's theorem
    ‖E₀(t)v‖ ≤ sup_ξ |e^{t P̂(ξ)}| ‖v‖ ≤ C ‖v‖ .
Since 𝒮 is dense in L² it follows that the initial-value problem is correctly
posed.
We now want to prove the necessity of (3) for correctness. Let now v ∈ 𝒮 and
define u(x,t) by (4). We find at once that u(x,t) satisfies the initial-value
problem (1), (2) and so u(x,t) = E(t)v. Again, by Fourier's inversion formula and
Parseval's theorem, the norm of E(t) on 𝒮 is governed by the multiplier e^{t P̂(ξ)},
so that by Lemma 1,
    sup_ξ |e^{t P̂(ξ)}| = sup_{v ∈ 𝒮, v ≠ 0} ‖E(t)v‖ / ‖v‖ ≤ C ,
which proves the necessity of (3) since 𝒮 is dense in L².
Ex. 1. Consider the symmetric hyperbolic system
    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j ,   A_j symmetric .      (5)
Then the initial-value problem for (5) is correctly posed in L², for
    e^{t P̂(ξ)} = exp( i t Σ_j ξ_j A_j ) ,
since this is a unitary matrix.
Before proceeding to the next example we state a lemma. For an arbitrary N×N
matrix A with eigenvalues λ_j, j = 1, ..., N, we introduce
    Λ(A) = max_j Re λ_j .
We then have
Lemma 2. If A is an N×N matrix we have for t ≥ 0
    |e^{tA}| ≤ e^{t Λ(A)} Σ_{j=0}^{N-1} (2t|A|)^j / j! .
Proof. See [9].
Ex. 2  Consider the system (1) and consider also the principal part P̃ of P, which corresponds to the polynomial

P̃(ξ) = Σ_{|α| = M} P_α (iξ)^α.

We say that the system (1) is parabolic in Petrovskii's sense if there is a δ > 0 such that

Λ(P̃(ξ)) ≤ −δ  for |ξ| = 1.

By homogeneity this is equivalent to the existence of a δ > 0 and a C such that

Λ(P(ξ)) ≤ −δ|ξ|^M + C,  ξ ∈ R^d.

We then have that if (1) is parabolic in Petrovskii's sense, the corresponding initial-value problem is correctly posed in L². For by Lemma 2 we have for 0 ≤ t ≤ T,

|e^{tP(ξ)}| ≤ e^{t(−δ|ξ|^M + C)} Σ_{j=0}^{N−1} (2t|P(ξ)|)^j / j!,

which is clearly bounded. In particular, the heat equation

∂u/∂t = Δu

clearly falls into this category.

Solutions of parabolic systems are smooth for t > 0; we have:

Theorem 2  Assume that (1) is parabolic in Petrovskii's sense. Then for t > 0, D^α E(t)v ∈ L² for any α, and for any T > 0 and any α there is a C such that

(6)  ‖D^α E(t)v‖ ≤ C t^{−|α|/M} ‖v‖,  0 < t ≤ T.
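For the heat equation (N = 1, M = 2, P(ξ) = −|ξ|²) the estimate of Theorem 2 can be seen directly: the multiplier acting on v̂ in D^α E(t)v is (iξ)^α e^{−t|ξ|²}, whose supremum over ξ scales like t^{−|α|/2}. A numerical confirmation of this scaling (our own illustration, not from the lectures; numpy assumed):

```python
import numpy as np

def sup_factor(m, t):
    # sup over xi >= 0 of xi^m * exp(-t*xi^2), found on a fine grid
    x = np.linspace(0.0, 50.0, 200001)
    return np.max(x**m * np.exp(-t * x**2))

m = 3
for t in [0.5, 1.0, 2.0]:
    exact = (m / (2 * np.e * t)) ** (m / 2)   # maximum at xi = sqrt(m/(2t))
    assert abs(sup_factor(m, t) - exact) < 1e-5 * exact
# the supremum times t^{m/2} is constant in t, i.e. sup = C * t^(-m/2)
assert abs(sup_factor(m, 0.5) * 0.5**1.5 - sup_factor(m, 2.0) * 2.0**1.5) < 1e-5
```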
[matrix illegible]

and a simple calculation yields

[expression illegible]

which is not bounded for any t > 0 as |ξ| → ∞.
Ex. 5  Although our theory only deals with systems which are first-order with respect to t, it is actually possible to consider also higher-order systems by reducing them to first-order systems. We shall only exemplify this in one particular case. Consider the initial-value problem (d = 1)

(7)  ∂²w/∂t² = ∂²w/∂x²,  t ≥ 0,  with w(x,0) and (∂w/∂t)(x,0) given.

Introducing

(8)  u = (u₁, u₂),  u₁ = ∂w/∂t,  u₂ = ∂w/∂x,

we have for u the initial-value problem

(9)  ∂u/∂t = [[0, 1], [1, 0]] ∂u/∂x,  u(x,0) = v(x).

Here

P(ξ) = iξ [[0, 1], [1, 0]],

so that e^{tP(ξ)} is unitary, and by Ex. 1 the initial-value problem (9) obtained by the transformation (8) from (7) is correctly posed in L².
In order that an initial-value problem of the type (1), (2) be correctly posed in L², it is necessary that it be correctly posed in the sense of Petrovskii; more precisely:

Theorem 3  If (1), (2) is correctly posed in L² then there is a constant C such that

(10)  Λ(P(ξ)) ≤ C,  ξ ∈ R^d.
Proof  Follows at once by

e^{tΛ(P(ξ))} = ρ(e^{tP(ξ)}) ≤ |e^{tP(ξ)}| ≤ C,  0 ≤ t ≤ 1.

We shall see at once by the following example that (10) is not sufficient for correctness in L².

Ex. 6  Take the initial-value problem corresponding to (d = 1)

P(ξ) = [[0, iξ], [0, 0]],

that is, ∂u₁/∂t = ∂u₂/∂x, ∂u₂/∂t = 0. We get then Λ(P(ξ)) = 0, so that (10) holds. However, a simple calculation yields

e^{tP(ξ)} = [[1, itξ], [0, 1]],

which is easily seen to be unbounded for 0 ≤ t ≤ 1 (take tξ = 1).
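This computation is easy to reproduce (our own illustration, not part of the text; numpy assumed). The exponential is exact because P(ξ) is nilpotent, and the factor i only rotates the entry without changing norms, so we work with the real form:

```python
import numpy as np

def exp_tP(t, xi):
    # exp(t * [[0, xi], [0, 0]]) is exact: the matrix is nilpotent
    return np.array([[1.0, t * xi], [0.0, 1.0]])

# Lambda(P(xi)) = 0 for every xi, so (10) holds with C = 0 ...
for xi in [10.0, 100.0, 1000.0]:
    assert np.allclose(exp_tP(1.0 / xi, xi), [[1.0, 1.0], [0.0, 1.0]])  # t*xi = 1

# ... yet |exp(tP(xi))| grows like t*|xi|: the problem is not correctly posed
norms = [np.linalg.norm(exp_tP(1.0, xi), 2) for xi in (1.0, 10.0, 100.0)]
assert norms[0] < norms[1] < norms[2] and norms[2] > 100.0
```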
Necessary and sufficient conditions for correctness have been given by Kreiss [19]. The main contents of Kreiss' result are concentrated in the following lemma. Here, for an N×N matrix A we denote by Re A the matrix

Re A = ½(A + A*).

Also recall that for hermitian matrices A and B, A ≤ B means

(Av,v) ≤ (Bv,v)

for all N-vectors v. We denote the resolvent of A by R(A;z);

R(A;z) = (zI − A)^{−1}.

It will be implicitly assumed, when we write down R(A;z), that z is not an eigenvalue of A.

Lemma 3  Let F be a family of N×N matrices. Then the following four conditions are equivalent:

(i) There is a constant C₀ such that for A ∈ F and t ≥ 0,

|e^{tA}| ≤ C₀.

(ii) There is a constant C₁ such that for A ∈ F, Λ(A) ≤ 0 and

|R(A;z)| ≤ C₁ / Re z  for Re z > 0.

(iii) For A ∈ F, Λ(A) ≤ 0, and there are two constants C₂ and C₃ and for each A ∈ F a matrix S = S(A) with

|S|, |S⁻¹| ≤ C₂,

and such that

B = S A S⁻¹

is an upper triangular matrix with off-diagonal elements satisfying

|b_{ij}| ≤ C₃ min(|Re b_{ii}|, |Re b_{jj}|).

(iv) There is a constant C₄ > 0 such that for each A ∈ F there is a hermitian matrix H = H(A) with

C₄⁻¹ I ≤ H ≤ C₄ I  and  Re(HA) ≤ 0.
One commonly used criterion is:

Theorem 5  Let P(ξ) be a normal matrix. Then (1), (2) is correctly posed if and only if (10) holds.

Proof  By Theorem 3 we only have to prove the sufficiency. Since P(ξ) is normal we can find a unitary U(ξ) such that

U(ξ) P(ξ) U(ξ)*

is diagonal. Hence

|e^{tP(ξ)}| = max_j e^{t Re λ_j(ξ)} = e^{tΛ(P(ξ))} ≤ e^{tC},

which proves the result.
For later use we state:

Theorem 6  If (1), (2) is correctly posed in L² then (10) holds and there are positive constants C₁ and C₂ and for each ξ ∈ R^d a positive definite hermitian matrix H(ξ) such that

C₁⁻¹ I ≤ H(ξ) ≤ C₁ I

and

(13)  Re(H(ξ) P(ξ)) ≤ C₂ H(ξ).

Proof  By Theorem 4 there is a constant γ such that the family F in (11) satisfies condition (iv) of Lemma 3 with C₄ = C₁. Thus for each ξ ∈ R^d there is a positive definite H(ξ) with C₁⁻¹ I ≤ H(ξ) ≤ C₁ I such that

Re(H(ξ)(P(ξ) − γI)) ≤ 0.

But by (12) this implies (13).
3. Difference approximations in L² to initial-value problems with constant coefficients

Consider again the initial-value problem

(1)  ∂u/∂t = P(∂/∂x)u = Σ_{|α| ≤ M} P_α ∂^α u / ∂x^α,  t > 0,

(2)  u(x,0) = v(x).
For the approximate solution of (1), (2) we consider explicit difference operators of the form

E_h v(x) = Σ_β a_β(h) v(x − βh),

where h is a small positive parameter, β = (β₁,...,β_d) with β_j integers, the a_β(h) are N×N matrices which are polynomials in h, and the summation is over a finite set of β.

We introduce the symbol of the operator E_h,

Ê_h(ξ) = Σ_β a_β(h) e^{−i⟨βh,ξ⟩},

which is periodic with period 2π/h in each component of ξ, and notice that for v ∈ L² the Fourier transform of E_h v is

(E_h v)ˆ(ξ) = Ê_h(ξ) v̂(ξ).

Assume that the initial-value problem (1), (2) is correctly posed. We then want to choose E_h so that it approximates the solution operator E(k), where k is a positive parameter tied to h by the relation

k/h^M = λ = constant;

we actually want to approximate u(x,nk) = E(nk)v = E(k)^n v by E_h^n v. In the future we shall emphasise the dependence on k rather than h and write E_k as in Lecture 1.

To accomplish this, we shall assume that E_k satisfies the condition in the following definition. We say that E_k is consistent with (1) if for any genuine solution u(x,t) of (1) with u(·,t) ∈ C₀^∞,

‖E_k u(·,t) − u(·,t+k)‖ = o(k)  as k → 0.

If o(k) can be replaced by O(k h^μ), we say that E_k is accurate of order μ. Clearly any consistent scheme is accurate of order at least 1.

We can express consistency and accuracy in terms of the symbol (cf. [35]):

Lemma 1  The operator E_k is consistent with (1) if and only if

(3)  Ê_k(ξ) = e^{kP(ξ)} + o(k)  as k → 0, for fixed ξ.

The operator E_k is accurate of order μ if and only if

Ê_k(ξ) = e^{kP(ξ)} + O(k h^μ).
The proof of (3), say, consists in proving, as in the special case in Lecture 1, that consistency is equivalent to a number of algebraic conditions on the coefficients, which turn out to be equivalent to the analytic functions exp(kP(h⁻¹ξ)) and Ê_k(h⁻¹ξ) having the same coefficients for h^j ξ^α up to a certain order.

Using Lemma 1 it is easy to deduce that if E_k is consistent with (1) in the present sense then we also have consistency in the sense of Lecture 1. For the set of genuine solutions in the previous definition we can for instance take the ones corresponding to v ∈ C₀^∞. From Lax's equivalence theorem it is clear that we want to discuss the stability of operators E_k of the form described. We have:
Theorem 1  The operator E_k is stable if and only if for any T > 0 there is a C such that

|Ê_k(ξ)^n| ≤ C,  0 ≤ nk ≤ T,  ξ ∈ R^d.

Proof  We notice that Ê_k(ξ)^n is the symbol of E_k^n. It follows in the same way as in Lecture 2 that

‖E_k^n‖ = sup_ξ |Ê_k(ξ)^n|,

which proves the theorem.

We now turn to the algebraic characterization of stability. We first prove the necessity of the von Neumann condition. For any N×N matrix A we denote by ρ(A) its spectral radius, the maximum of the moduli of the eigenvalues of A.

Theorem 2  If E_k is stable in L², there exists a constant γ such that

(4)  ρ(Ê_k(ξ)) ≤ e^{γk},  ξ ∈ R^d,  k ≤ 1.

Proof  We have for nk ≤ 1,

ρ(Ê_k(ξ))^n = ρ(Ê_k(ξ)^n) ≤ |Ê_k(ξ)^n| ≤ C,

and so, choosing n as large as possible with nk ≤ 1, ρ(Ê_k(ξ)) ≤ C^{1/n} ≤ e^{γk} for a suitable γ.

It is easy to prove by counter-examples that (4) is not sufficient for stability. Necessary and sufficient conditions for stability have been given by Kreiss [18] and Buchanan [5]; we quote here Kreiss' result. The main content of Kreiss' theorem is concentrated in the following lemma. Here we have introduced the following notation: for H hermitian and positive definite, we set

|u|_H = (Hu,u)^{1/2},  |A|_H = sup_{u ≠ 0} |Au|_H / |u|_H.

Recall again that for hermitian matrices, A ≤ B means (Au,u) ≤ (Bu,u).
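A standard counter-example showing that (4) alone does not give stability uses a Jordan block as amplification matrix. A quick numerical confirmation (our own illustration, not from the text; numpy assumed):

```python
import numpy as np

# Jordan block: spectral radius 1, so the von Neumann condition (4) holds
A = np.array([[1.0, 1.0], [0.0, 1.0]])

assert max(abs(np.linalg.eigvals(A))) <= 1.0 + 1e-12
# ... but A^n = [[1, n], [0, 1]], so |A^n| grows like n: no stability
norms = [np.linalg.norm(np.linalg.matrix_power(A, n), 2) for n in (1, 10, 100)]
assert norms[2] > 100.0
```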
Lemma 2  Let F be a family of N×N matrices. Then the following four conditions are equivalent.

(i) There is a constant C₀ such that for A ∈ F and n = 0, 1, 2, ...,

|A^n| ≤ C₀.

(ii) There is a constant C₁ such that for A ∈ F, ρ(A) ≤ 1 and

|R(A;z)| ≤ C₁ / (|z| − 1)  for |z| > 1.

(iii) For A ∈ F, ρ(A) ≤ 1, and there are two constants C₂ and C₃ and for each A ∈ F a matrix S = S(A) such that

|S|, |S⁻¹| ≤ C₂,

and such that

B = S A S⁻¹

is an upper triangular matrix with off-diagonal elements satisfying

|b_{ij}| ≤ C₃ min(1 − |b_{ii}|, 1 − |b_{jj}|).

(iv) There is a constant C₄ > 0 such that for each A ∈ F there is a hermitian matrix H = H(A) with

C₄⁻¹ I ≤ H ≤ C₄ I  and  |A|_H ≤ 1.

Proof  See [28].
To be able to apply this lemma to our problem we need the following analogue of Lemma 2.4.

Lemma 3  Assume that E_k is stable in L². Then there exists a constant γ such that for

F_k(ξ) = e^{−γk} Ê_k(ξ)  (k ≤ 1)

one has

|F_k(ξ)^n| ≤ C,  n ≥ 0,  ξ ∈ R^d.

An alternative way of expressing this result is that for some γ ≥ 0, any k ≤ 1, and any n we have

|Ê_k(ξ)^n| ≤ C e^{γnk}.
Combining Lemmas 2 and 3 we have at once:

Theorem 3  If the operator E_k is stable in L², then there is a γ such that the family

F = { e^{−γk} Ê_k(ξ) ; k ≤ 1, ξ ∈ R^d }

satisfies the conditions of Lemma 2. On the other hand, if there is a constant γ such that this family satisfies at least one of the conditions of Lemma 2, then E_k is stable in L².
One commonly used criterion is:

Theorem 4  Let E_k be such that Ê_k(ξ) is a normal matrix. Then von Neumann's condition is necessary and sufficient for stability.

Proof  By Theorem 2 we only have to prove the sufficiency. Since Ê_k(ξ) is normal there is for each k ≤ 1 and ξ ∈ R^d a unitary matrix U_k(ξ) such that

U_k(ξ) Ê_k(ξ) U_k(ξ)*

is diagonal. Hence

|Ê_k(ξ)^n| = ρ(Ê_k(ξ))^n ≤ e^{γnk} ≤ e^{γT},  nk ≤ T,

which proves the result. To see the relation with Lemmas 2 and 3, we could also have formulated this as follows. We have with the same γ as in (4) for F_k(ξ) = e^{−γk} Ê_k(ξ) that

U_k(ξ) F_k(ξ) U_k(ξ)*

is diagonal with eigenvalues of modulus ≤ 1. Thus, a fortiori, it is triangular, and the estimates in condition (iii) of Lemma 2 hold.
As for existence of stable operators, we have (cf. [17]):

Theorem 5  There exist L²-stable operators consistent with (1), (2) if and only if (1), (2) is correctly posed in L².

Proof  We first prove that the correctness is necessary. It follows by Lemma 1 and the stability that, with nk = t fixed,

|e^{tP(ξ)}| = lim_{n → ∞} |Ê_k(ξ)^n| ≤ C,

which implies correctness.

On the other hand, if (1), (2) is correctly posed one can construct a consistent difference operator, or, which is equivalent, its symbol, by setting

(5)  [formula illegible].

Using Kreiss' stability theorems one can prove that this E_k is stable for small λ = k/h^M. The part of this operator corresponding to the second term in (5) is referred to as an artificial viscosity.
We shall consider some examples. Consider the initial-value problem for a symmetric hyperbolic system

(6)  ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j,  A_j hermitian.

We know from Lecture 2 that this problem is correctly posed in L². Consider as before a difference operator

(7)  E_k v(x) = Σ_β a_β v(x − βh),

where for simplicity we assume the a_β independent of h. We have the following result by Friedrichs [8].

Theorem 6  If the a_β are hermitian, positive semi-definite and Σ_β a_β = I, then

|Ê_k(ξ)| ≤ 1,

and thus E_k is stable.

Proof  We have the generalized Cauchy-Schwarz inequality

|Σ_β (a_β u_β, v_β)| ≤ (Σ_β (a_β u_β, u_β))^{1/2} (Σ_β (a_β v_β, v_β))^{1/2},

where (u,v) = Σ_j u_j v̄_j. Therefore
and hence with w = Ê_k(ξ)v,

|w|² = |Σ_β (a_β e^{−i⟨βh,ξ⟩} v, w)| ≤ (Σ_β (a_β v, v))^{1/2} (Σ_β (a_β w, w))^{1/2} = |v| |w|,

which proves the theorem.
As an application, take

a_{±e_j} = (1/(2d)) I ± (λ/2) A_j,  j = 1,...,d.

We have Σ_β a_β = I, so that the resulting operator

E_k v(x) = Σ_{j=1}^d { (1/(2d)) (v(x + he_j) + v(x − he_j)) + (λ/2) A_j (v(x + he_j) − v(x − he_j)) }

is consistent with (6) and accurate of order 1. It is clear that if

0 < λ ≤ (d max_j |A_j|)^{−1},

the coefficients are positive semi-definite and so the operator E_k is stable.

The operator E_k can be considered as obtained from replacing (6) by

∂u/∂t = Σ_j A_j ∂u/∂x_j + (h²/(2dk)) Σ_j ∂²u/∂x_j²

and discretizing by forward differences in t and central differences in x. Consider for a moment the perhaps more natural equation, (6) itself so discretized, which gives the consistent operator

(8)  E_k v(x) = v(x) + (λ/2) Σ_{j=1}^d A_j (v(x + he_j) − v(x − he_j)),

with symbol

Ê(ξ) = I + iλ Σ_{j=1}^d A_j sin(hξ_j).
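For d = 1 the Friedrichs result is easy to test numerically (our own illustration, not from the text; the matrix A and the values of λ are arbitrary choices, numpy assumed). The symbol is Ê(θ) = cos(θ) I + iλ sin(θ) A with coefficients a_{±1} = ½(I ± λA), which are positive semi-definite precisely when λ|A| ≤ 1:

```python
import numpy as np

A = np.array([[0.0, 2.0], [2.0, 0.0]])    # hermitian, eigenvalues +/-2

def symbol(theta, lam):
    # Friedrichs symbol in d = 1: cos(theta) I + i*lam*sin(theta) A
    return np.cos(theta) * np.eye(2) + 1j * lam * np.sin(theta) * A

thetas = np.linspace(-np.pi, np.pi, 721)
# lam = 0.5 gives lam * max|mu_j| = 1: psd coefficients, |E(theta)| <= 1
assert max(np.linalg.norm(symbol(t, 0.5), 2) for t in thetas) <= 1 + 1e-12
# lam = 0.8 violates the condition and the symbol norm exceeds 1
assert max(np.linalg.norm(symbol(t, 0.8), 2) for t in thetas) > 1
```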
We shall prove that this operator is not stable in L² if any of the A_j is non-zero. Assume e.g. A₁ ≠ 0 and set ξ_j = 0 for j ≠ 1, ξ₁h = π/2. With this choice,

Ê(ξ) = I + iλA₁,

which has the eigenvalues

1 + iλμ_ν,

where the real numbers μ_ν are the eigenvalues of A₁. Since |1 + iλμ_ν| = (1 + λ²μ_ν²)^{1/2} > 1 whenever μ_ν ≠ 0, the von Neumann condition is not satisfied and the operator is unstable for any λ.
It can be shown that in general the operator E_k defined in (8) is accurate of order exactly 1. We shall now look at an operator which is accurate of order 2 in the case of one space dimension (d = 1). We thus have the system

(9)  ∂u/∂t = A ∂u/∂x.

Consider the difference operator

(10)  E_k v(x) = v(x) + (λA/2)(v(x+h) − v(x−h)) + (λ²A²/2)(v(x+h) − 2v(x) + v(x−h)),

with symbol

Ê(ξ) = I + iλA sin(hξ) − λ²A²(1 − cos(hξ)).

This operator is often referred to as the Lax-Wendroff operator. We have

Ê(ξ) = I + ikξA − (k²ξ²/2)A² + O(kh²) = e^{kP(ξ)} + O(kh²),

and so E_k is consistent with (9), and in general accurate of order 2. We shall prove:

Theorem 7  Let μ_j, j = 1,...,N, be the eigenvalues of A. Then the operator E_k in (10) is stable in L² if and only if

(11)  λ max_j |μ_j| ≤ 1.

Proof  It is easy to see that the eigenvalues of Ê(h⁻¹ξ) are

1 + iλμ_j sin ξ − λ²μ_j²(1 − cos ξ),

and we obtain after a simple calculation

|1 + iλμ_j sin ξ − λ²μ_j²(1 − cos ξ)|² = 1 − 4λ²μ_j²(1 − λ²μ_j²) sin⁴(ξ/2) ≤ 1

if and only if (11) holds. Since Ê(h⁻¹ξ) is clearly normal, this proves the theorem.
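The scalar calculation in the proof can be checked numerically (our own illustration, not from the text; numpy assumed), writing ν = λμ_j and θ = hξ for the amplification factor:

```python
import numpy as np

def g(nu, theta):
    # Lax-Wendroff amplification factor for a single eigenvalue, nu = lam*mu
    return 1 + 1j * nu * np.sin(theta) + nu**2 * (np.cos(theta) - 1)

theta = np.linspace(-np.pi, np.pi, 1001)
for nu in [0.3, 0.9, 1.0]:
    lhs = np.abs(g(nu, theta)) ** 2
    rhs = 1 - 4 * nu**2 * (1 - nu**2) * np.sin(theta / 2) ** 4
    assert np.allclose(lhs, rhs)                        # the identity above
    assert np.max(np.abs(g(nu, theta))) <= 1 + 1e-12    # stable for nu <= 1
# the von Neumann condition fails for nu > 1
assert np.max(np.abs(g(1.2, theta))) > 1
```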
For an N×N matrix A consider the numerical range

W(A) = { (Au,u) : |u| = 1 }.

We have:

Theorem 8  If F is a family of N×N matrices such that

W(A) ⊆ { z : |z| ≤ 1 }  for A ∈ F,

then F is a stable family, that is, there is a constant C such that

|A^n| ≤ C,  A ∈ F,  n ≥ 0.

Proof  We shall prove that condition (ii) in Kreiss' theorem is satisfied. Clearly we have ρ(A) ≤ 1, since the eigenvalues of A belong to W(A), so that R(A;z) exists for |z| > 1. For |z| > 1 and v = R(A;z)w we have

(Av,v) = z|v|² − (w,v),

or

|z||v|² ≤ |(Av,v)| + |(w,v)| ≤ |v|² + |w||v|.

Therefore, if w is arbitrary,

|R(A;z)w| ≤ |w| / (|z| − 1),

which proves the result.
Remark  One can actually prove that |A^n| ≤ 2 for A ∈ F, n ≥ 0.
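The theorem is easy to probe numerically (our own illustration, not from the text; numpy assumed). The matrix A below has numerical range equal to the closed unit disc, and both the resolvent bound of condition (ii) and the remark's bound can be observed:

```python
import numpy as np

A = np.array([[0.0, 2.0], [0.0, 0.0]])   # W(A) = closed unit disc

rng = np.random.default_rng(1)
for _ in range(1000):
    u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    u /= np.linalg.norm(u)
    assert abs(np.vdot(u, A @ u)) <= 1 + 1e-12       # (Au,u) in the unit disc

for r in [1.2, 1.5, 2.0, 5.0]:
    for phi in np.linspace(0.0, 2 * np.pi, 37):
        z = r * np.exp(1j * phi)
        R = np.linalg.inv(z * np.eye(2) - A)
        assert np.linalg.norm(R, 2) <= 1 / (r - 1) + 1e-9   # condition (ii)

# the remark |A^n| <= 2: here A^2 = 0, so only n = 1 is nontrivial
assert np.linalg.norm(A, 2) <= 2.0
```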
This result can be used to prove the stability of certain generalizations of the Lax-Wendroff operator to two dimensions (see [2~]).

Consider again the symmetric hyperbolic system (6) and a difference operator of the form (7), consistent with (6). Then A(ξ) = Ê_k(h⁻¹ξ) is independent of h. We say with Kreiss that E_k is dissipative of order ν (ν even) if there is a δ > 0 such that

ρ(A(ξ)) ≤ 1 − δ|ξ|^ν,  |ξ_j| ≤ π.

We shall prove:

Theorem 9  Under the above assumptions, if E_k is accurate of order ν − 1 and dissipative of order ν, it is stable in L².
Proof  By the definition of accuracy, we have

A(ξ) = exp(iλ Σ_j A_j ξ_j) + O(|ξ|^ν)  as ξ → 0.

Let U = U(ξ) be a unitary matrix which triangulates A(ξ), so that

B(ξ) = U A(ξ) U*

is upper triangular. Since B(ξ) is upper triangular it follows that the below-diagonal elements in U exp(iλ Σ_j A_j ξ_j) U* are O(|ξ|^ν). Since this matrix is unitary, the same can easily be proved to hold for its above-diagonal elements, and thus the same holds for the above-diagonal elements of B(ξ), so that

B(ξ) = D(ξ) + O(|ξ|^ν),  D(ξ) diagonal with |d_{ii}| ≤ ρ(A(ξ)) ≤ 1 − δ|ξ|^ν,

and the stability follows by condition (iii) in Kreiss' theorem.

Consider now the initial-value problem for a Petrovskii parabolic system

(12)  ∂u/∂t = P(∂/∂x)u,  u(x,0) = v(x),

so that

Λ(P(ξ)) ≤ −δ|ξ|^M + C.

We know from Lecture 2 that this problem is correctly posed in L². Consider a
difference operator E_k of the form (7), consistent with (12), where now k = λh^M. We say, following John [15] and Widlund [38], that E_k is a parabolic difference operator if there are constants δ > 0 and C such that

ρ(Ê(h⁻¹ξ)) ≤ 1 − δ|ξ|^M + Ck,  |ξ_j| ≤ π.

Notice the close analogy with the concept of a dissipative operator.

Theorem 10  Let E_k be consistent with (12) and parabolic. Then it is stable in L².

We shall base a proof on the following lemma, which we shall also need later for other purposes.

Lemma 4  There exists a constant C_N depending only on N such that for any N×N matrix A with spectral radius ρ we have for n ≥ N,

|A^n| ≤ C_N n^{N−1} |A|^{N−1} ρ^{n−N+1}.

Combining Lemma 4 with the parabolicity one obtains, for some c > 0 and nk ≤ T,

(13)  |Ê(h⁻¹ξ)^n| ≤ C e^{−cn|ξ|^M},  |ξ_j| ≤ π,

which in particular proves Theorem 10. We also have the following discrete analogue of Theorem 2.2:

Theorem 11  Assume that E_k is consistent with (12) and parabolic. Then for any α and any T > 0 there is a C such that

‖∂_h^α E_k^n v‖ ≤ C (nk)^{−|α|/M} ‖v‖,  0 < nk ≤ T,

where ∂_h^α denotes the difference quotient corresponding to D^α.
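Lemma 4 can be illustrated numerically (our own sketch, not from the text; numpy assumed): for a 2×2 Jordan-type matrix the stated ratio stays bounded, while the factor n^{N−1} in the estimate is genuinely needed:

```python
import numpy as np

rho = 0.9
A = np.array([[rho, 1.0], [0.0, rho]])     # Jordan-type, spectral radius rho
normA = np.linalg.norm(A, 2)

# the ratio |A^n| / (n^{N-1} |A|^{N-1} rho^{n-N+1}) stays bounded (N = 2) ...
ratios = [np.linalg.norm(np.linalg.matrix_power(A, n), 2)
          / (n * normA * rho ** (n - 1))
          for n in range(2, 200)]
assert max(ratios) < 2.0
# ... while |A^n| / rho^{n-1} alone grows like n: the factor n is needed
assert np.linalg.norm(np.linalg.matrix_power(A, 50), 2) / rho ** 49 > 10.0
```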
Proof  By Fourier transformation this reduces to proving

|(∂_h^α)ˆ(ξ)| · |Ê(hξ)^n| ≤ C (nk)^{−|α|/M},  ξ ∈ R^d,

and the result therefore easily follows by (13).

We know by Lax's equivalence theorem that the stability of the parabolic difference operators considered above implies convergence. We shall now see that the difference quotients also converge to the corresponding derivatives, which we know to exist for t > 0 since the systems are parabolic.

Theorem 12  Assume that (12) is parabolic and that E_k is consistent with (12) and parabolic. Then for any t > 0, any α, and any v ∈ L² we have for nk = t,

(14)  ‖∂_h^α E_k^n v − D^α E(t)v‖ → 0  as k → 0.

Proof  By Theorems 2.2 and 11 one finds that it is sufficient to prove (14) for v in the dense subset C₀^∞. But then, by Parseval's relation,

‖∂_h^α E_k^n v − D^α E(t)v‖² = (2π)^{−d} ∫ |(∂_h^α)ˆ(ξ) Ê(hξ)^n − (iξ)^α e^{tP(ξ)}|² |v̂(ξ)|² dξ.

The result therefore follows by the following lemma, which is a simple consequence