
  • Lecture Notes in Mathematics A collection of informal reports and seminars Edited by A. Dold, Heidelberg and B. Eckmann, Zürich

    193

    Symposium on the Theory of Numerical Analysis Held in Dundee/Scotland, September 15-23, 1970

    Edited by John Ll. Morris, University of Dundee, Dundee/Scotland

    Springer-Verlag Berlin · Heidelberg · New York 1971

  • AMS Subject Classifications (1970): 65M05, 65M10, 65M15, 65M30, 65N05, 65N10, 65N15, 65N20, 65N25

    ISBN 3-540-05422-7 Springer-Verlag Berlin Heidelberg New York ISBN 0-387-05422-7 Springer-Verlag New York Heidelberg Berlin

    This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks.

    Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.

    © by Springer-Verlag Berlin Heidelberg 1971. Library of Congress Catalog Card Number 70-155916. Printed in Germany.

    Offsetdruck: Julius Beltz, Hemsbach

  • Foreword

    This publication by Springer Verlag represents the proceedings of a series

    of lectures given by four eminent Numerical Analysts, namely Professors Golub,

    Thomee, Wachspress and Widlund, at the University of Dundee between September

    15th and September 23rd, 1970.

    The lectures marked the beginning of the British Science Research Council's

    sponsored Numerical Analysis Year which is being held at the University of Dundee

    from September 1970 to August 1971. The aim of this year is to promote the theory

    of numerical methods and in particular to upgrade the study of Numerical Analysis

    in British universities and technical colleges. This is being effected by the

    arranging of lecture courses and seminars which are being held in Dundee through-

    out the Year. In addition to lecture courses research conferences are being

    held to allow workers in touch with modern developments in the field of Numerical

    Analysis to hear and discuss the most recent research work in their field. To

    achieve these aims, some thirty four Numerical Analysts of international repute

    are visiting the University of Dundee during the Numerical Analysis Year. The

    complete project is financed by the Science Research Council, and we acknowledge

    with gratitude their generous support. The present proceedings contain a great

    deal of theoretical work which has been developed over recent years. There are

    however new results contained within the notes. In particular the lectures pre-

    sented by Professor Golub represent results recently obtained by him and his co-

    workers. Consequently a detailed account of the methods outlined in Professor

    Golub's lectures will appear in a forthcoming issue of the Journal of the Society

    for Industrial and Applied Mathematics (SIAM) Numerical Analysis, written

    jointly by Golub, Buzbee and Nielson.

    In the main the lecture notes have been provided by the authors and the

    proceedings have been produced from these original manuscripts. The exception

    is the course of lectures given by Professor Golub. These notes were taken at

    the lectures by members of the staff and research students of the Department of

    Mathematics, the University of Dundee. In this context it is a pleasure to ack-

    nowledge the invaluable assistance provided to the editor by Dr. A. Watson, Mr.

    R. Wait, Mr. K. Brodlie and Mr. G. McGuire.

    Finally we owe thanks to Misses Y. Nedelec and F. Duncan, Secretaries in

    the Mathematics Department for their patient typing and retyping of the manu-

    scripts and notes.

    J. Ll. Morris

    Dundee, January 1971

  • Contents

    G. Golub: Direct Methods for Solving Elliptic Difference Equations ... 1
      1. Introduction ... 2
      2. Matrix Decomposition ... 2
      3. Block Cyclic Reduction ... 6
      4. Applications ... 10
      5. The Buneman Algorithm and Variants ... 12
      6. Accuracy of the Buneman Algorithms ... 14
      7. Non-Rectangular Regions ... 15
      8. Conclusion ... 18
      9. References ... 18

    G. Golub: Matrix Methods in Mathematical Programming ... 21
      1. Introduction ... 22
      2. Linear Programming ... 22
      3. A Stable Implementation of the Simplex Algorithm ... 24
      4. Iterative Refinement of the Solution ... 28
      5. Householder Triangularization ... 28
      6. Projections ... 31
      7. Linear Least-Squares Problem ... 33
      8. Least-Squares Problem with Linear Constraints ... 35
      Bibliography ... 37

    V. Thomée: Topics in Stability Theory for Partial Difference Operators ... 41
      Preface ... 42
      1. Introduction ... 43
      2. Initial-Value Problems in L² with Constant Coefficients ... 51
      3. Difference Approximations in L² to Initial-Value Problems with Constant Coefficients ... 59
      4. Estimates in the Maximum-Norm ... 70
      5. On the Rate of Convergence of Difference Schemes ... 79
      References ... 89

    E. L. Wachspress: Iteration Parameters in the Numerical Solution of Elliptic Problems ... 93
      1. A Concise Review of the General Topic and Background Theory ... 95
      2. Successive Overrelaxation: Theory ... 98
      3. Successive Overrelaxation: Practice ... 100
      4. Residual Polynomials: Chebyshev Extrapolation: Theory ... 102
      5. Residual Polynomials: Practice ... 103
      6. Alternating-Direction-Implicit Iteration ... 106
      7. Parameters for the Peaceman-Rachford Variant of ADI ... 107

    O. Widlund: Introduction to Finite Difference Approximations to Initial Value Problems for Partial Differential Equations ... 111
      1. Introduction ... 112
      2. The Form of the Partial Differential Equations ... 114
      3. The Form of the Finite Difference Schemes ... 117
      4. An Example of Divergence. The Maximum Principle ... 121
      5. The Choice of Norms and Stability Definitions ... 124
      6. Stability, Error Bounds and a Perturbation Theorem ... 133
      7. The von Neumann Condition, Dissipative and Multistep Schemes ... 138
      8. Semibounded Operators ... 142
      9. Some Applications of the Energy Method ... 145
      10. Maximum Norm Convergence for L² Stable Schemes ... 149
      References ... 151

  • Direct Methods for Solving Elliptic Difference Equations

    GENE GOLUB

    Stanford University

  • 1. Introduction

    General methods exist for solving elliptic partial differential equations of
    general type in general regions. However, it is often the case that physical
    problems such as those of plasma physics give rise to several elliptic
    equations which require to be solved many times. It is not uncommon that the
    elliptic equations which arise reduce to Poisson's equation with differing
    right hand sides. For this reason it is judicious to use direct methods which
    take advantage of this structure and which thereby yield fast and accurate
    techniques for solving the associated linear equations.

    Direct methods for solving such equations are attractive since in theory they
    yield the exact solution to the difference equation, whereas commonly used
    methods seek to approximate the solution by iterative procedures [12].
    Hockney [8] has devised an efficient direct method which uses the cyclic
    reduction process. Also, Buneman [2] recently developed an efficient direct
    method for solving the reduced system of equations. Since these methods offer
    considerable economy over older techniques [5], the purpose of this paper is
    to present a unified mathematical development and generalization of them.
    Additional generalizations are given by George [6].

    2. Matrix Decomposition

    Consider the system of equations

        M x = y ,                                                        (2.1)

    where M is an N×N real symmetric matrix of block tridiagonal form,

        M = | A  T              |
            | T  A  T           |
            |    .  .  .        |                                        (2.2)
            |       T  A  T     |
            |          T  A     | .

    The matrices A and T are p×p symmetric matrices and we assume that AT = TA.
    This situation arises in many systems. However, other direct methods which
    are applicable for more general systems are less efficient to implement in
    this case. Moreover the classical methods require more computer storage than
    the methods to be discussed here, which will require only the storage of the
    vector y. Since A and T commute and are symmetric, it is well known [1] that
    there exists an orthogonal matrix Q such that

        Q^T A Q = Λ ,   Q^T T Q = Ω ,                                    (2.3)

    where Λ and Ω are real diagonal matrices. The columns of Q are the common
    eigenvectors of A and T, and Λ and Ω are the diagonal matrices of the p
    eigenvalues of A and T, respectively.

    To conform with the matrix M, we write the vectors x and y in partitioned
    form,

        x = (x_1, x_2, ..., x_q)^T ,   y = (y_1, y_2, ..., y_q)^T ,      (2.4)

    where

        x_j = (x_{1j}, x_{2j}, ..., x_{pj})^T ,   y_j = (y_{1j}, ..., y_{pj})^T .

    System (2.1) may be written

        A x_1 + T x_2 = y_1 ,                                            (2.5a)
        T x_{j-1} + A x_j + T x_{j+1} = y_j ,   j = 2,3,...,q-1 ,        (2.5b)
        T x_{q-1} + A x_q = y_q .                                        (2.5c)

    From Eq. (2.3) we have

        A = Q Λ Q^T   and   T = Q Ω Q^T .

    Substituting A and T into Eq. (2.5) and pre-multiplying by Q^T we obtain

        Λ x̂_1 + Ω x̂_2 = ŷ_1 ,
        Ω x̂_{j-1} + Λ x̂_j + Ω x̂_{j+1} = ŷ_j ,   j = 2,3,...,q-1 ,       (2.6)
        Ω x̂_{q-1} + Λ x̂_q = ŷ_q ,

    where

        x̂_j = Q^T x_j ,   ŷ_j = Q^T y_j ,   j = 1,2,...,q .

    If x̂_j and ŷ_j are partitioned as before, then the i-th components of
    Eq. (2.6) may be rewritten as

        λ_i x̂_{i1} + ω_i x̂_{i2} = ŷ_{i1} ,
        ω_i x̂_{i,j-1} + λ_i x̂_{ij} + ω_i x̂_{i,j+1} = ŷ_{ij} ,  j = 2,...,q-1 ,  (2.7)
        ω_i x̂_{i,q-1} + λ_i x̂_{iq} = ŷ_{iq} ,

    for i = 1,2,...,p. If we rewrite the equations by reversing the roles of i
    and j, and define

        x̂_i = (x̂_{i1}, ..., x̂_{iq})^T ,   ŷ_i = (ŷ_{i1}, ..., ŷ_{iq})^T ,

    together with the q×q tridiagonal matrices

        Γ_i = | λ_i  ω_i           |
              | ω_i  λ_i  ω_i      |
              |      .    .    .   |
              |           ω_i  λ_i | ,

    then Eq. (2.7) is equivalent to the block diagonal system of equations

        Γ_i x̂_i = ŷ_i ,   i = 1,2,...,p .                                (2.8)

    Thus, the vector x̂_i satisfies a symmetric tridiagonal system of equations
    that has a constant diagonal element and a constant super- and sub-diagonal
    element. After Eq. (2.8) has been solved block by block it is possible to
    solve for x_j = Q x̂_j. Thus we have:

    Algorithm 1

    1. Compute or determine the eigensystem of A and T.
    2. Compute ŷ_j = Q^T y_j   (j = 1,2,...,q).
    3. Solve Γ_i x̂_i = ŷ_i   (i = 1,2,...,p).
    4. Compute x_j = Q x̂_j   (j = 1,2,...,q).

    For our system the eigenvalues of Γ_i may be written down explicitly as

        ν_r^(i) = λ_i + 2 ω_i cos( rπ/(q+1) ) ,   r = 1,2,...,q .

    It should be noted that only Q and the y_j, j = 1,2,...,q, have to be
    stored, since the ŷ_j can overwrite the y_j, the x̂_j can overwrite the ŷ_j,
    and the x_j can overwrite the x̂_j. A simple calculation will show that
    approximately 2p²q + 5pq arithmetic operations are required for the
    algorithm when step 3 is solved using Gaussian elimination for a tridiagonal
    matrix, the Γ_i being positive definite. The arithmetic operations are
    dominated by the 2p²q multiplications arising from the matrix
    multiplications of steps 2 and 4. It is not easy to reduce this number
    unless the matrix Q has special properties (as in Poisson's equation) when
    the fast Fourier transform can be used (see Hockney [8]).

    One can also note that

        Γ_i = Z V_i Z^T ,

    with V_i the diagonal matrix of eigenvalues of Γ_i and

        Z_{rs} = σ sin( rsπ/(q+1) ) ,   σ = (2/(q+1))^{1/2} .

    Since the Γ_i have the same set of eigenvectors,

        Γ_i Γ_j = Γ_j Γ_i .

    Because of this decomposition, step 3 can be solved by computing

        x̂_i = Z V_i^{-1} Z^T ŷ_i ,

    where the same Z serves for every Γ_i. This, however, requires of the order
    of 2pq² multiplications, which approximately doubles the computing time for
    the algorithm. Thus performing the fast Fourier transform method in step 3
    as well as in steps 2 and 4 is not advisable.
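
    To make the flow of Algorithm 1 concrete, the following sketch (our own
    illustration; the function name and the use of numpy's dense eigensolver
    are our choices) implements the four steps. It assumes the eigenvalues of
    A are distinct, so that the eigenvectors of A also diagonalize T; a
    production code would use tridiagonal solvers in step 3 and, where
    possible, the fast Fourier transform in steps 2 and 4. The result can be
    checked against a dense solve of the assembled matrix M.

        import numpy as np

        def matrix_decomposition_solve(A, T, Y):
            # Solve M x = y for M as in (2.2), A and T symmetric with AT = TA.
            # Y is p x q: column j holds the block y_j.
            p, q = Y.shape
            lam, Q = np.linalg.eigh(A)        # step 1: eigensystem of A ...
            omega = np.diag(Q.T @ T @ Q)      # ... and of T (Q diagonalizes both)
            Yhat = Q.T @ Y                    # step 2: yhat_j = Q^T y_j
            Xhat = np.empty_like(Yhat)
            for i in range(p):                # step 3: Gamma_i xhat_i = yhat_i
                Gamma = (np.diag(np.full(q, lam[i]))
                         + np.diag(np.full(q - 1, omega[i]), 1)
                         + np.diag(np.full(q - 1, omega[i]), -1))
                Xhat[i, :] = np.linalg.solve(Gamma, Yhat[i, :])
            return Q @ Xhat                   # step 4: x_j = Q xhat_j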

    3. Block Cyclic Reduction

    In Section 2, we gave a method for which one had to know the eigenvalues
    and eigenvectors of some matrix. We now give a more direct method for
    solving the system of Eq. (2.1).

    We assume again that A and T are symmetric and that A and T commute.
    Furthermore, we assume that q = m-1 and

        m = 2^{k+1} ,

    where k is some positive integer. Let us rewrite Eq. (2.5b) as follows:

        T x_{j-2} + A x_{j-1} + T x_j     = y_{j-1} ,
        T x_{j-1} + A x_j     + T x_{j+1} = y_j ,
        T x_j     + A x_{j+1} + T x_{j+2} = y_{j+1} .

    Multiplying the first and third equations by T, the second equation by -A,
    and adding, we have

        T² x_{j-2} + (2T² - A²) x_j + T² x_{j+2} = T y_{j-1} - A y_j + T y_{j+1} .

    Thus if j is even, the new system of equations involves x_j's with even
    indices only. Similar equations hold for x_2 and x_{m-2}.

    The process of reducing the equations in this fashion is known as cyclic
    reduction. Eq. (2.1) may then be written as the following equivalent
    system:

        | 2T²-A²    T²                  | | x_2     |   | y_2^(1)     |
        |   T²    2T²-A²    T²          | | x_4     |   | y_4^(1)     |
        |           .        .      .   | |   .     | = |    .        |   (3.1)
        |                  T²    2T²-A² | | x_{m-2} |   | y_{m-2}^(1) |

    with y_j^(1) = T( y_{j-1} + y_{j+1} ) - A y_j , and

        A x_j = y_j - T( x_{j-1} + x_{j+1} ) ,   j = 1,3,5,...,m-1 ,      (3.2)

    where x_0 = x_m = 0. Since m = 2^{k+1} and the new system of Eq. (3.1)
    involves x_j's with even indices, the block dimension of the new system of
    equations is 2^k - 1. Note that once Eq. (3.1) is solved, it is easy to
    solve for the x_j's with odd indices as evidenced by Eq. (3.2). We shall
    refer to the system of Eq. (3.2) as the eliminated equations.

    Also, note that Algorithm 1 may be applied to System (3.1). Since A and T
    commute, the matrix (2T² - A²) has the same set of eigenvectors as A and T.
    Also, if λ_i(A) = λ_i and λ_i(T) = ω_i for i = 1,2,...,p, then

        λ_i( 2T² - A² ) = 2ω_i² - λ_i² .

    Hockney [8] has advocated this procedure.

    Since System (3.1) is block tridiagonal and of the form of Eq. (2.2), we
    can apply the reduction repeatedly until we have one block. However, as
    noted above, we can stop the process after any step and use the method of
    Section 2 to solve the resulting equations.

    To define the procedure recursively, let

        A^(0) = A ,   T^(0) = T ;   y_j^(0) = y_j   (j = 1,2,...,m-1) .   (3.3)

    Then for r = 0,1,...,k-1,

        A^(r+1) = 2( T^(r) )² - ( A^(r) )² ,
        T^(r+1) = ( T^(r) )² ,                                            (3.4)
        y_j^(r+1) = T^(r) ( y_{j-2^r}^(r) + y_{j+2^r}^(r) ) - A^(r) y_j^(r) .

    The eliminated equations at the r-th stage form the block diagonal system

        A^(r-1) x_j = y_j^(r-1) - T^(r-1) ( x_{j-2^{r-1}} + x_{j+2^{r-1}} ) ,  (3.5)

    for j an odd multiple of 2^{r-1}, that is j = 2^{r-1}, 3·2^{r-1}, ...,
    2^{k+1} - 2^{r-1}, with x_0 = x_{2^{k+1}} = 0. After all of the k steps,
    we must solve the system of equations

        A^(k) x_{2^k} = y_{2^k}^(k) .                                     (3.6)

    In either case, we must solve Eq. (3.5) to find the eliminated unknowns,
    just as in Eq. (3.2). If it is done by direct solution, an ill-conditioned
    system may arise. Furthermore, A = A^(0) is tridiagonal, A^(1) is
    quindiagonal, and so on, so that the reduction destroys the simple
    structure of the original system. Alternatively, polynomial factorization
    retains the simple structure of A.

    From Eq. (3.4), we note that A^(1) is a polynomial of degree 2 in A and T.
    By induction, it is easy to show that A^(r) is a polynomial of degree 2^r
    in the matrices A and T, so that

        A^(r) = Σ_{j=0}^{2^{r-1}} c_{2j}^(r) A^{2j} T^{2^r - 2j} ≡ p_{2^r}(A,T) .

    We shall proceed to determine the linear factors of p_{2^r}(A,T).

    Let

        p_{2^r}(a,t) = Σ_{j=0}^{2^{r-1}} c_{2j}^(r) a^{2j} t^{2^r - 2j} .   (3.7)

    For t ≠ 0, we make the substitution

        a/t = -2 cos θ .                                                    (3.8)

    From Eq. (3.4), we note that

        p_{2^{r+1}}(a,t) = 2 t^{2^{r+1}} - ( p_{2^r}(a,t) )² .

    It is then easy to verify, using Eqs. (3.7) and (3.8), that

        p_{2^r}(a,t) = -2 t^{2^r} cos 2^r θ ,

    and, consequently,

        p_{2^r}(a,t) = - Π_{j=1}^{2^r} ( a + 2t cos θ_j^(r) ) ,

    and, hence,

        A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) T ) ,                   (3.9)

    where θ_j^(r) = (2j-1)π / 2^{r+1}.

    Thus to solve the original system it is only necessary to solve the
    factored system recursively. For example, when r = 1, we obtain

        A^(1) = 2T² - A² = ( √2 T - A )( √2 T + A ) ,

    whence the simple tridiagonal systems

        ( √2 T - A ) w = y ,
        ( √2 T + A ) x = w

    are used to solve the system

        A^(1) x = y .

    We call this method the cyclic odd-even reduction and factorization (CORF)
    algorithm.
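
    As a concrete illustration of Eq. (3.9), the following sketch (our own;
    the function name solve_Ar is our choice) solves A^(r) z = b for the case
    T = I by sweeping through the 2^r linear factors. Each factor
    A + 2 cos θ_j^(r) I is tridiagonal whenever A is, so in practice each
    solve would use a tridiagonal elimination rather than the dense solver
    used here. For r = 1 one can check that the result agrees with a direct
    solve of (2I - A²) z = b.

        import numpy as np

        def solve_Ar(A, r, b):
            # Solve A^(r) z = b using the factorization (3.9) with T = I.
            # A^(0) = A is handled directly, since (3.9) applies for r >= 1.
            if r == 0:
                return np.linalg.solve(A, b)
            p = A.shape[0]
            z = b.copy()
            for j in range(1, 2**r + 1):
                theta = (2*j - 1) * np.pi / 2**(r + 1)
                z = np.linalg.solve(A + 2.0*np.cos(theta)*np.eye(p), z)
            return -z   # the leading minus sign in (3.9)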

  • 10

    4. Applications

    Exampie I Poissen's Equation wit h Dirichlet Boundar~ Conditions,

    It is instructive to apply the results of Section 3 to the solution of the

    finite-difference approximation to Poisson's equation on a rectangle, R, with speci-

    fied boundary values Consider the equation

    u + u : f(x,y) for (x,y)ER, ~x yy (~.l)

    u(x,y) : g(x,y) for (x,y)aR .

    (Here aR indicates the boundary of R.) We assume that the reader is familiar with

    the general technique of imposing a mesh of discrete points onto R and approximating

    ~q. (4.Z). The eq~tion u + Uyy : f(x,y) is approximated at (xl,Yj) by

    Vi-l.j - 2vi,j + Vi+l.j vi,j-1 - 2vi. j + vi.j+l C~)" + (Ay)"

    = fi,J (i < i < n-l, I < J < m-i) ,

    with appropriate values taken on the boundary

    VO,J = gC,~' Vm, j = gm,J ( 1 g J g m-l ) ,

    and

    Vi,@ = gi,o' vi,m : gi,J (i < i ~ n-l).

    Then vii is an approximation to u(xi,Yj) , and fi,j = f(xi'Yj)' gi,j : g(xl,Yj)-

    Hereafter, we assume that

    2k+l m -~

    When u(x,y) is specified on the boundary, we have the Dirichlet boundary con-

    dition. For simplicity, we shall assume hereafter that Ax = Ay. Then

    1

    l -4 I

    (~ . 1

    and T = I . . l

    1

    -4 (n - l ) x (n - l )

  • 11

    The matrix In_ I indicates the identity matrix of order (n-l). A and T are symmetric

    and co~ute, and, thus the results of Sections 2 and 3 are applicable In addition,

    since A is tridlagcnal, the use of the facterization (3.10) is greatly simplified.

    The nine-polnt difference formula for the same Poisson's equation can be treated

    similarly when m

    -20 4

    4 -20

    A =

    0

    O

    & -20

    , T=

    (n-l)~n-ll

    "~ z 0 1 4 1

    (~ . . I

    1 &
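
    Since the eigensystem of A is known explicitly for Example I, step 3 of
    Algorithm 1 can be combined with fast sine transforms in the manner of
    Hockney [8]. The sketch below is our own illustration using scipy's
    discrete sine transform; it assumes zero boundary values (g = 0) and
    Δx = Δy = h, and solves the five-point system in O(nm log nm) operations:

        import numpy as np
        from scipy.fft import dstn, idstn

        def poisson_dirichlet(f, h):
            # f: (n-1) x (m-1) values f(x_i, y_j) at the interior mesh points.
            # Returns v with the five-point operator applied to v equal to f
            # and v = 0 on the boundary.
            n1, m1 = f.shape
            fhat = dstn(f, type=1)          # diagonalize both difference operators
            i = np.arange(1, n1 + 1)
            j = np.arange(1, m1 + 1)
            lam = -4.0 * np.sin(i * np.pi / (2 * (n1 + 1)))**2   # eigenvalues in x
            mu  = -4.0 * np.sin(j * np.pi / (2 * (m1 + 1)))**2   # eigenvalues in y
            vhat = h * h * fhat / (lam[:, None] + mu[None, :])
            return idstn(vhat, type=1)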

    Example II

    The method can also be used for Poisson's equation in rectangular regions
    under natural boundary conditions, provided one uses

        ∂u/∂x ≈ ( u(x+h, y) - u(x-h, y) ) / 2h

    and similarly ∂u/∂y at the boundaries.

    Example III

    Poisson's equation in a rectangle with doubly periodic boundary conditions
    is an additional example where the algorithm can be applied.

    Example IV

    The method can be extended successfully to three dimensions for Poisson's
    equation.

    For all the above examples the eigensystems are known and the fast Fourier
    transform can be applied.

    Example V

    An equation of the form

        ( K(x) u_x )_x + ( K(y) u_y )_y + u(x,y) = q(x,y)

    on a rectangular region can be solved by the CORF algorithm provided the
    eigensystem is calculated, since it is not generally known.

    The counterparts in cylindrical polar co-ordinates can also be solved
    using CORF on the rectangle in the appropriate co-ordinates.

    5. The Buneman Algorithm and Variants

    In this section, we shall describe in detail the Buneman algorithm [2] and
    a variation of it. The difference between the Buneman algorithm and the
    CORF algorithm lies in the way the right hand side is calculated at each
    stage of the reduction. Henceforth, we shall assume that in the system of
    Eqs. (2.5), T = I_p, the identity matrix of order p.

    Again consider the system of equations as given by Eqs. (2.5) with
    q = 2^{k+1} - 1. After one stage of cyclic reduction, we have

        x_{j-2} + ( 2I_p - A² ) x_j + x_{j+2} = y_{j-1} + y_{j+1} - A y_j   (5.1)

    for j = 2,4,...,q-1, with x_0 = x_{q+1} = 0, the null vector. Note that
    the right hand side of Eq. (5.1) may be written as

        y_j^(1) = y_{j-1} + y_{j+1} - A y_j
                = A^(1) A^{-1} y_j + y_{j-1} + y_{j+1} - 2 A^{-1} y_j ,     (5.2)

    where A^(1) = 2I_p - A². Let us define

        p_j^(1) = A^{-1} y_j ;   q_j^(1) = y_{j-1} + y_{j+1} - 2 p_j^(1) .

    (These are easily calculated since A is a tridiagonal matrix.) Then

        y_j^(1) = A^(1) p_j^(1) + q_j^(1) .                                 (5.3)

    After r reductions, we have by Eq. (3.4)

        y_j^(r+1) = y_{j-2^r}^(r) + y_{j+2^r}^(r) - A^(r) y_j^(r) .         (5.4)

    Let us write

        y_j^(r) = A^(r) p_j^(r) + q_j^(r)                                   (5.5)

    in a fashion similar to Eq. (5.3). Substituting Eq. (5.5) into Eq. (5.4)
    and making use of the identity (A^(r))² = 2I_p - A^(r+1) from Eq. (3.4),
    we have the following relationships:

        p_j^(r+1) = p_j^(r) - ( A^(r) )^{-1} ( p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ) ,  (5.6a)
        q_j^(r+1) = q_{j-2^r}^(r) + q_{j+2^r}^(r) - 2 p_j^(r+1) ,                           (5.6b)

    for j = i·2^{r+1} (i = 1,2,...,2^{k-r} - 1), with

        p_0^(r) = q_0^(r) = p_{2^{k+1}}^(r) = q_{2^{k+1}}^(r) = 0 .

    Because the number of vectors p_j^(r), q_j^(r) that must be carried is
    reduced by a factor of two for each successive r, the computer storage
    requirement becomes equal to almost twice the number of data points.

    To compute p_j^(r+1) by Eq. (5.6a), we solve the system of equations

        A^(r) ( p_j^(r) - p_j^(r+1) ) = p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ,

    where A^(r) is given by the factorization Eq. (3.9); namely,

        A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) I_p ) ,
        θ_j^(r) = (2j-1)π / 2^{r+1} .

    After k reductions, one has the equation

        A^(k) x_{2^k} = y_{2^k}^(k) = A^(k) p_{2^k}^(k) + q_{2^k}^(k) ,

    and hence

        x_{2^k} = p_{2^k}^(k) + ( A^(k) )^{-1} q_{2^k}^(k) .

    Again one uses the factorization of A^(k) for computing
    (A^(k))^{-1} q_{2^k}^(k). To back solve, we use the relationship

        x_{j-2^r} + A^(r) x_j + x_{j+2^r} = A^(r) p_j^(r) + q_j^(r)

    for j = i·2^r (i = 1,2,...,2^{k+1-r} - 1), with x_0 = x_{2^{k+1}} = 0.
    For j = 2^r, 3·2^r, ..., 2^{k+1} - 2^r, we solve the system of equations

        A^(r) ( x_j - p_j^(r) ) = q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ,     (5.7)

    using the factorization of A^(r); hence

        x_j = p_j^(r) + ( x_j - p_j^(r) ) .                                 (5.8)

    Thus, to summarise, the Buneman algorithm proceeds as follows:

    1. Compute the sequence { p_j^(r), q_j^(r) } by Eq. (5.6) for r = 1,...,k,
       with p_j^(0) = 0 for j = 0,...,2^{k+1}, and q_j^(0) = y_j for
       j = 1,2,...,2^{k+1} - 1.

    2. Back-solve for x_j using Eqs. (5.7) and (5.8); a sketch of the whole
       process is given below.

    The use of the p_j^(r) and q_j^(r) produces a stable algorithm. Numerical
    experiments by the author and his colleagues have shown that
    computationally the Buneman algorithm requires approximately 30% less time
    than the fast Fourier transform method of Hockney.
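
    The following sketch is our own illustration, reusing the solve_Ar
    function from Section 3 and dense numpy solves in place of tridiagonal
    eliminations; it carries out the reduction (5.6) and the
    back-substitution (5.7)-(5.8) for T = I_p. The result can be checked
    against a dense solve of the assembled block tridiagonal matrix M of
    Eq. (2.2).

        import numpy as np

        def buneman(A, y, k):
            # Solve (2.5) with T = I_p, q = 2^(k+1)-1; y[1], ..., y[q] are the
            # right-hand-side blocks (vectors of length p); y[0] is unused.
            p = A.shape[0]
            q = 2**(k + 1) - 1
            zero = np.zeros(p)
            P = {j: zero for j in range(q + 2)}            # p_j^(0) = 0
            Q = {j: np.array(y[j], float) for j in range(1, q + 1)}
            Q[0] = Q[q + 1] = zero
            levels = [(P, Q)]
            for r in range(k):                             # reduction, Eq. (5.6)
                s = 2**r
                Pn = {0: zero, q + 1: zero}
                Qn = {0: zero, q + 1: zero}
                for j in range(2 * s, q + 1, 2 * s):
                    Pn[j] = P[j] - solve_Ar(A, r, P[j - s] + P[j + s] - Q[j])
                    Qn[j] = Q[j - s] + Q[j + s] - 2.0 * Pn[j]
                P, Q = Pn, Qn
                levels.append((P, Q))
            x = {0: zero, q + 1: zero}
            for r in range(k, -1, -1):                     # back-solve, (5.7)-(5.8)
                Pr, Qr = levels[r]
                s = 2**r
                for j in range(s, q + 1, 2 * s):           # odd multiples of 2^r
                    x[j] = Pr[j] + solve_Ar(A, r, Qr[j] - (x[j - s] + x[j + s]))
            return [x[j] for j in range(1, q + 1)]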

    6. Accuracy of the Buneman Algorithms

    As was shown in Section 5, the Buneman algorithms consist of generating
    the sequence of vectors { p_j^(r), q_j^(r) }. Using Eqs. (5.6a) and
    (5.6b), let us write

        p_j^(r) = x_j + e_j^(r) ,                                           (6.1a)
        q_j^(r) = x_{j-2^r} + x_{j+2^r} - A^(r) e_j^(r) ,                   (6.1b)

    where

        e_j^(r) = p_j^(r) - x_j                                             (6.2)

    is the error in p_j^(r), and let

        S^(r) = ( A^(r-1) ··· A^(0) )^{-1} .                                (6.3)

    Then

        || p_j^(r) - x_j ||_2 ≤ || S^(r) ||_2 · |||y|||                     (6.4)

    and

        || q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ||_2 ≤ || S^(r) A^(r) ||_2 · |||y||| ,  (6.5)

    where ||v||_2 indicates the Euclidean norm of a vector v, ||C||_2
    indicates the spectral norm of a matrix C, and

        |||y|||² = Σ_{j=1}^{q} || y_j ||_2² .

    Thus for A = A^T,

        || S^(r) ||_2 ≤ Π_{j=0}^{r-1} || ( A^(j) )^{-1} ||_2 ,

    and since the A^(j) are polynomials of degree 2^j in A we have

        || S^(r) ||_2 ≤ Π_{j=0}^{r-1} [ min_i | p_{2^j}(λ_i) | ]^{-1} ,

    where the p_{2^j}(λ_i) are polynomials in the λ_i, the eigenvalues of A.
    For Poisson's equation it may be shown that

        || S^(r) ||_2 < e^{-c σ_r} ,   where σ_r = 2^{r-1} and c > 0 .      (6.6)

    Thus ||S^(r)||_2 → 0 and hence

        || p_j^(r) - x_j ||_2 → 0 .

    That is, p_j^(r) tends to the exact solution with increasing r. Since it
    can be shown that ||q_j^(r)||_2 remains bounded throughout the
    calculation, the Buneman algorithm leads to numerically stable results.

    7. Non-Rectangular Regions

    In many situations, one wishes to solve an elliptic equation over a region
    R which is the union of two rectangles R_1 and R_2 (a figure illustrating
    such a region appears in the original), where there are n_1 data points in
    R_1, n_2 data points in R_2, and n_0 data points in R_1 ∩ R_2. We shall
    assume that Dirichlet boundary conditions are given. When Δx is

    the same throughout the region, one has a matrix equation of the form

        | G    P | | x^(1) |   | y^(1) |
        |        | |       | = |       | ,                                  (7.1)
        | P^T  H | | x^(2) |   | y^(2) |

    where

        G = | A  T        |                 H = | B  S        |
            | T  A  T     |                     | S  B  S     |
            |    .  .  .  |        and          |    .  .  .  |             (7.2)
            |       T  A  | (n_1×n_1)           |       S  B  | (n_2×n_2)

    and P, which couples the two rectangles, is non-zero only in a block of
    order n_0. Also, we write

        x^(1) = ( x_1^(1), ..., x_r^(1) )^T ,   x^(2) = ( x_1^(2), ..., x_s^(2) )^T ,  (7.3)

    partitioned as in Section 2. We assume again that AT = TA and BS = SB.

    From Eq. (7.1), we see that

        x^(1) = G^{-1} y^(1) - G^{-1} P x^(2)                               (7.4)

    and

        x^(2) = H^{-1} y^(2) - H^{-1} P^T x^(1) .                           (7.5)

    Now let us write

        G z^(1) = y^(1) ,   H z^(2) = y^(2) ,                               (7.6)

    and

        G W^(1) = P ,   H W^(2) = P^T .                                     (7.7)

    Then, if we partition the vectors z^(1), z^(2) and the matrices W^(1) and
    W^(2) as in Eq. (7.3), Eqs. (7.4) and (7.5) become

        x_j^(1) = z_j^(1) - W_j^(1) x^(2)   (j = 1,2,...,r) ,
        x_j^(2) = z_j^(2) - W_j^(2) x^(1)   (j = 1,2,...,s) .               (7.8)

    From Eq. (7.8), we have

        | I      W^(1) | | x^(1) |   | z^(1) |
        |              | |       | = |       | ,                            (7.9)
        | W^(2)  I     | | x^(2) |   | z^(2) |

    a system which need only be solved for the unknown components coupled
    through P. It can be noted that W^(1) and W^(2) depend only on the given
    region, and hence the algorithm becomes useful if many problems on the
    same region are to be considered.

    Thus, the algorithm proceeds as follows (a sketch follows the list):

    1. Solve for z^(1) and z^(2) using the methods of Section 2 or 3.

    2. Solve for W^(1) and W^(2) using the methods of Section 2 or 3.

    3. Solve Eq. (7.9) using Gaussian elimination. Save the LU decomposition
       of Eq. (7.9).

    4. Solve for the unknown components of x^(1) and x^(2).
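
    A minimal dense sketch of steps 1-4 (our own illustration; it solves the
    full coupled system (7.9) rather than only the n_0 components coupled
    through P, and uses dense solves where the text would use the methods of
    Section 2 or 3):

        import numpy as np

        def two_region_solve(G, H, P, y1, y2):
            z1 = np.linalg.solve(G, y1)            # step 1
            z2 = np.linalg.solve(H, y2)
            W1 = np.linalg.solve(G, P)             # step 2
            W2 = np.linalg.solve(H, P.T)
            n1, n2 = len(y1), len(y2)
            K = np.block([[np.eye(n1), W1],        # step 3: Eq. (7.9)
                          [W2, np.eye(n2)]])
            xz = np.linalg.solve(K, np.concatenate([z1, z2]))
            return xz[:n1], xz[n1:]                # step 4: x^(1), x^(2)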

    8. Conclusion

    Numerous applications require the repeated solution of a Poisson equation.
    The operation counts given by Dorr [5] indicate that the methods we have
    discussed should offer significant economies over older techniques; and
    this has been verified in practice by many users. Computational
    experiments comparing the Buneman algorithm, the matrix decomposition (MD)
    algorithm, the Peaceman-Rachford alternating direction algorithm, and the
    point successive over-relaxation algorithm are given by Buzbee et al. [3].
    We conclude that the method of matrix decomposition, the Buneman
    algorithm, and Hockney's algorithm (when used with care) are valuable
    methods.

    This paper has benefited greatly from the comments of Dr. F. Dorr,
    Mr. J. Alan George, Dr. R. Hockney and Professor O. Widlund.

    9. References

    1. Richard Bellman, Introduction to Matrix Analysis, McGraw-Hill, New
       York, 1960.

    2. Oscar Buneman, Stanford University Institute for Plasma Research,
       Report No. 294, 1969.

    3. B. L. Buzbee, G. H. Golub and C. W. Nielson, "The Method of Odd/Even
       Reduction and Factorization with Application to Poisson's Equation,
       Part II," LA-4288, Los Alamos Scientific Laboratory. (To appear,
       SIAM J. Numer. Anal.)

    4. J. W. Cooley and J. W. Tukey, "An algorithm for the machine calculation
       of complex Fourier series," Math. Comp., Vol. 19, No. 90 (1965),
       pp. 297-301.

    5. F. W. Dorr, "The direct solution of the discrete Poisson equation on a
       rectangle," to appear in SIAM Review.

    6. J. A. George, "An Embedding Approach to the Solution of Poisson's
       Equation on an Arbitrary Bounded Region," to appear as a Stanford
       Report.

    7. G. H. Golub, R. Underwood and J. Wilkinson, "Solution of Ax = λBx when
       B is positive definite," (to be published).

    8. R. W. Hockney, "A fast direct solution of Poisson's equation using
       Fourier analysis," J. ACM, Vol. 12, No. 1 (1965), pp. 95-113.

    9. R. W. Hockney, in Methods in Computational Physics (B. Alder,
       S. Fernbach and M. Rotenberg, Eds.), Vol. 9, Academic Press, New York
       and London, 1969.

    10. R. E. Lynch, J. R. Rice and D. H. Thomas, "Direct solution of partial
        difference equations by tensor product methods," Numer. Math., Vol. 6
        (1964), pp. 185-199.

    11. R. S. Varga, Matrix Iterative Analysis, Prentice-Hall, Englewood
        Cliffs, New Jersey, 1962.

  • Matrix Methods in Mathematical Programming

    GENE GOLUB

    Stanford University

    1. Introduction

    With the advent of modern computers, there has been a great development in
    matrix algorithms. A major contributor to this advance is J. H. Wilkinson
    [30]. Simultaneously, a considerable growth has occurred in the field of
    mathematical programming. However, in this field, until recently, very
    little analysis has been carried out for the matrix algorithms involved.

    In the following lectures, matrix algorithms will be developed which can
    be efficiently applied in certain areas of mathematical programming and
    which give rise to stable processes.

    We consider problems of the following types:

        maximize φ(x) ,   where x = (x_1, x_2, ..., x_n)^T ,
        subject to A x = b ,
                   G x ≥ h ,

    where the objective function φ(x) is linear or quadratic.

    2. Linear Programming

    The linear programming problem can be posed as follows:

        maximize φ(x) = c^T x
        subject to A x = b ,                                                (2.1)
                   x ≥ 0 .                                                  (2.2)

    We assume that A is an m × n matrix, with m < n, which satisfies the Haar
    condition (that is, every m × m submatrix of A is non-singular). The
    vector x is said to be feasible if it satisfies the constraints (2.1) and
    (2.2).

    Let I = {i_1, i_2, ..., i_m} be a set of m indices such that, on setting
    x_j = 0 for j ∉ I, we can solve the remaining m equations in (2.1) and
    obtain a solution such that

        x_{i_j} > 0 ,   j = 1, 2, ..., m .

    This vector x is said to be a basic feasible solution. It is well-known
    that the vector x which maximizes φ(x) = c^T x is a basic feasible
    solution, and this suggests a possible algorithm for obtaining the optimum
    solution, namely, examine all possible basic feasible solutions.

    Such a process is generally inefficient. A more systematic procedure, due
    to Dantzig, is the Simplex Algorithm. In this algorithm, a series of basic
    feasible solutions is generated by changing one variable at a time in such
    a way that the value of the objective function is increased at each step.
    There seems to be no way of determining the rate of convergence of the
    simplex method; however, it works well in practice.

    The steps involved may be given as follows (a sketch in code is given
    after the description):

    (i) Assume that we can determine a set of m indices I = {i_1, i_2, ...,
    i_m} such that the corresponding x_{i_j} are the non-zero variables in a
    basic feasible solution. Define the basis matrix

        B = [ a_{i_1}, a_{i_2}, ..., a_{i_m} ] ,

    where the a_{i_j} are columns of A corresponding to the basic variables.

    (ii) Solve the system of equations

        B x̂ = b ,   where x̂^T = [ x_{i_1}, x_{i_2}, ..., x_{i_m} ] .

    (iii) Solve the system of equations

        B^T w = ĉ ,

    where ĉ^T = [ c_{i_1}, c_{i_2}, ..., c_{i_m} ] are the coefficients of the
    basic variables in the objective function.

    (iv) Calculate

        max_j ( c_j - a_j^T w ) = c_r - a_r^T w ,   say.

    If c_r - a_r^T w ≤ 0, then the optimum solution has been reached.
    Otherwise, a_r is to be introduced into the basis.

    (v) Solve the system of equations

        B t_r = -a_r .

    If t_{rk} ≥ 0, k = 1, 2, ..., m, then this indicates that the optimum
    solution is unbounded. Otherwise determine the component s for which

        x_{i_s} / (-t_{rs}) = min_{1≤k≤m} { -x_{i_k} / t_{rk} : t_{rk} < 0 } .

    Eliminate the column a_{i_s} from the basis matrix and introduce column
    a_r. This process is continued from step (ii) until an optimum solution is
    obtained (or shown to be unbounded).

    We have defined the complete algorithm explicitly, provided a termination
    rule, and indicated how to detect an unbounded solution. We now show how
    the simplex algorithm can be implemented in a stable numerical fashion.
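
    The following sketch (our own; function and variable names are ours, and
    it refactorizes B at every call, where the point of the next section is
    precisely to avoid this) expresses one pass through steps (ii)-(v). For
    simplicity it uses the sign convention B t = a_r, so the ratio test takes
    the usual form with t_k > 0:

        import numpy as np
        from scipy.linalg import lu_factor, lu_solve

        def simplex_step(A, b, c, basis):
            B = A[:, basis]
            fac = lu_factor(B)                     # LU factors of the basis
            xhat = lu_solve(fac, b)                # (ii)  B xhat = b
            w = lu_solve(fac, c[basis], trans=1)   # (iii) B^T w = chat
            red = c - A.T @ w                      # (iv)  reduced costs
            red[basis] = 0.0
            r = int(np.argmax(red))
            if red[r] <= 1e-12:
                return basis, xhat, True           # optimum reached
            t = lu_solve(fac, A[:, r])             # (v)   B t = a_r
            if not (t > 1e-12).any():
                raise ValueError("optimum solution is unbounded")
            ratios = np.where(t > 1e-12, xhat / np.where(t > 1e-12, t, 1.0), np.inf)
            s = int(np.argmin(ratios))
            basis = list(basis)
            basis[s] = r                           # exchange a_{i_s} for a_r
            return basis, xhat, False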

    3. A Stable Implementation of the Simplex Algorithm

    Throughout the algorithm, there are three systems of linear equations to
    be solved at each iteration. These are:

        B x̂ = b ,   B^T w = ĉ ,   B t_r = -a_r .

    Assuming Gaussian elimination is used, this requires about m³/3
    multiplications for each system. However, if it is assumed that the
    triangular factors of B are available, then only O(m²) multiplications are
    needed. An important consideration is that only one column of B is changed
    in one iteration, and it seems reasonable to assume that the number of
    multiplications can be reduced if use is made of this. We would hope to
    reduce the m³/3 multiplications to O(m²) multiplications per step. This is
    the basis of the classical simplex method. The disadvantage of this method
    is that the pivoting strategy which is generally used does not take
    numerical stability into consideration. We now show that it is possible to
    implement the simplex algorithm in a more stable manner, the cost being
    that more storage is required.

    Consider methods for the solution of a set of linear equations. It is
    well-known that there exists a permutation matrix Π such that

        Π B = L U ,

    where L is a lower triangular matrix and U is an upper triangular matrix.

    If Gaussian elimination with partial (row) pivoting is used, then we
    proceed as follows:

    Choose a permutation matrix Π_1 such that the maximum modulus element of
    the

    first column of B becomes the (1,1)-element of Π_1 B.

    Define an elementary lower triangular matrix Γ_k as a matrix which differs
    from the identity matrix only in the elements below the diagonal of its
    k-th column.

    Now Γ_1 can be chosen so that

        Γ_1 Π_1 B

    has all elements below the diagonal in the first column set equal to zero.
    Now choose Π_2 so that

        Π_2 Γ_1 Π_1 B

    has the maximum modulus element in the second column in position (2,2),
    and choose Γ_2 so that

        Γ_2 Π_2 Γ_1 Π_1 B

    has all elements below the diagonal in the second column set equal to
    zero. This can be done without affecting the zeros already computed in the
    first column. Continuing in this way we obtain:

        Γ_{m-1} Π_{m-1} ··· Γ_2 Π_2 Γ_1 Π_1 B = U ,

    where U is an upper triangular matrix.

    Note that permuting the rows of the matrix B merely implies a re-ordering
    of the right-hand-side elements. Thus, no actual permutation need be
    performed, merely a record kept. Further, any product of elementary lower
    triangular matrices is a lower triangular matrix, as may easily be shown.
    Thus on the left-hand side we have essentially a lower triangular matrix,
    and thus the required factorization.

    The relevant elements of the successive matrices Γ_k can be stored in the
    lower triangle of B, in the space where zeros have been introduced. Thus
    the method is economical in storage.

    To return to the linear programming problem, we require to solve a system
    of equations of the form

        B^(i) x = v ,                                                       (3.1)

    where B^(i) and B^(i-1) differ in only one column (although the columns
    may be re-ordered).

    Consider the first iteration of the algorithm. Suppose that we have
    obtained the factorization

        B^(0) = L^(0) U^(0) ,

    where the right-hand-side vector has been re-ordered to take account of
    the permutations. The solution to (3.1) with i = 0 is obtained by
    computing

        ṽ = ( L^(0) )^{-1} v

    and solving the triangular system

        U^(0) x = ṽ ,

    each of which requires m²/2 + O(m) multiplications.

    Suppose that the column b_{s_0}^(0) is eliminated from B^(0) and the
    column g^(0) is introduced as the last column; then

        B^(1) = [ b_1^(0), ..., b_{s_0-1}^(0), b_{s_0+1}^(0), ..., b_m^(0), g^(0) ] .

    Therefore,

        ( L^(0) )^{-1} B^(1) = H^(1) ,

    where H^(1) is upper triangular in its first s_0 - 1 columns and has one
    non-zero element below the diagonal in each of the remaining columns.

    Such a matrix is called an upper Hessenberg matrix. Only the last column
    need be computed, as all others are available from the previous step. We
    require to apply a sequence of transformations to restore the upper
    triangular form. It is clear that we have a particularly simple case of
    the LU factorization procedure as previously described, where Γ_j^(1)
    differs from the identity matrix in a single subdiagonal element, only one
    element requiring to be calculated. On applying a sequence of
    transformation matrices and permutation matrices as before, we obtain

        Γ_{m-1}^(1) Π_{m-1}^(1) ··· Γ_{s_0}^(1) Π_{s_0}^(1) H^(1) = U^(1) ,

    where U^(1) is upper triangular.

    Note that in this case, to obtain Π_j^(1) it is only necessary to compare
    two elements. Thus the storage required is very small: (m - s_0)
    multipliers g_i^(1) and (m - s_0) bits to indicate whether or not
    interchanges are necessary.

    All elements in the computation are bounded, and so we have good numerical
    accuracy throughout. The whole procedure compares favourably with standard
    forms, for example, the product form of the inverse, where no account of
    numerical accuracy is taken. Further, this procedure requires fewer
    operations than the method which uses the product form of the inverse. If
    we consider the steps involved, forward and backward substitution with
    L^(0) and U^(i) require a total of m² multiplications, and the application
    of the remaining transformations in ( L^(i) )^{-1} requires at most
    i(m - 1) multiplications. (If we assume that on the average the middle
    column of the basis matrix is eliminated, then this will be closer to
    i(m - 1)/2.) Thus a total of m² + i(m - 1) multiplications are required to
    solve the system at each stage, assuming an initial factorization is
    available. Note that if the matrix A is sparse, then the algorithm can
    make use of this structure, as is done in the method using the product
    form of the inverse.
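
    A sketch of the column-replacement update (our own illustration of the
    procedure just described; the names are ours, and the list of recorded
    operations plays the role of the stored multipliers and interchange
    bits):

        import numpy as np

        def update_factorization(L, U, s0, g, ops):
            # B(1) = B(0) with column s0 deleted and g appended as last column.
            # H = L^{-1} B(1) is upper Hessenberg from column s0 onwards.
            m = U.shape[0]
            H = np.column_stack([np.delete(U, s0, axis=1),
                                 np.linalg.solve(L, g)])
            for j in range(s0, m - 1):
                # Pi_j^(1): compare only the two candidate pivots
                if abs(H[j + 1, j]) > abs(H[j, j]):
                    H[[j, j + 1]] = H[[j + 1, j]]
                    ops.append(("swap", j, 0.0))
                mult = H[j + 1, j] / H[j, j]       # Gamma_j^(1): one multiplier
                H[j + 1, j:] -= mult * H[j, j:]
                ops.append(("elim", j, mult))
            return H                               # the new U(1)

        def apply_ops(ops, v):
            # Apply the recorded transformations to a right-hand side, so that
            # a solve is: w = L^{-1} v; apply_ops; back-substitute the new U.
            v = v.copy()
            for kind, j, mult in ops:
                if kind == "swap":
                    v[j], v[j + 1] = v[j + 1], v[j]
                else:
                    v[j + 1] -= mult * v[j]
            return v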

    4" Iterative refinement of the.solution

    Consider the set of equations

    B~ = X

    and suppose that ~ is a computed approximation to ~ . Let

    -- ~+

    Therefore,

    that is,

    B(~ + 2) : v ,

    Be_ -- v -B~

    We can now solve for c very efficiently, since the LU decomposition of B is

    available. This process can be repeated until ~ is obtained to the required accur-

    acy. The algorithm can be outlined as follows:

    (i) Compute ~j = ~ - B~_j

    (ii) Solve B_cj = r -j

    (iii) Compute ~j+1 = ~J + ~J

    It is necessary for r to be computed in double precision and then rounded to --j

    single precision. Note that step (ii) requires 0(m 2) operations, since the LU de-

    composition of B is available. This procedure can be used in the following sections.
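
    A minimal sketch of steps (i)-(iii) (our own illustration; numpy's
    longdouble stands in for the double-length accumulation of the residual):

        import numpy as np
        from scipy.linalg import lu_factor, lu_solve

        def refine(B, v, iterations=5):
            fac = lu_factor(B)                       # factor once, reuse in (ii)
            x = lu_solve(fac, v)
            Bx = B.astype(np.longdouble)
            vx = v.astype(np.longdouble)
            for _ in range(iterations):
                r = vx - Bx @ x                      # (i)   extended-precision residual
                c = lu_solve(fac, r.astype(B.dtype)) # (ii)  O(m^2) with the LU factors
                x = x + c                            # (iii)
            return x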

    5. Householder Triangularization

    Householder transformations have been widely discussed in the literature.
    In this section we are concerned with their use in reducing a matrix A to
    upper-triangular form, and in particular we wish to show how to update the
    decomposition of A when its columns are changed one by one. This will open
    the way to implementation of efficient and stable algorithms for solving
    problems involving linear constraints.

    Householder transformations are symmetric orthogonal matrices of the form

        P_k = I - β_k u_k u_k^T ,

    where u_k is a vector and β_k = 2 / ( u_k^T u_k ). Their utility in this

    context is due to the fact that for any non-zero vector a it is possible
    to choose u_k in such a way that the transformed vector P_k a is zero
    except for its first element. Householder [15] used this property to
    construct a sequence of transformations to reduce a matrix to
    upper-triangular form. In [29], Wilkinson describes the process and his
    error analysis shows it to be very stable.

    Given any A, we can construct a sequence of transformations such that A is
    reduced to upper triangular form. Premultiplying by P_0 annihilates
    (m - 1) elements in the first column. Similarly, premultiplying by P_1
    eliminates (m - 2) elements in the second column, and so on. Therefore,

        P_{m-1} P_{m-2} ··· P_1 P_0 A = | R |
                                        | 0 | ,                             (5.1)

    where R is an upper triangular matrix.

    Since the product of orthogonal matrices is an orthogonal matrix, we can
    write (5.1) as

        Q A = | R |              A = Q^T | R |
              | 0 | ,                    | 0 | .

    The above process is close to the Gram-Schmidt process in that it produces
    a set of orthonormal vectors spanning E_n. In addition, the Householder
    transformation produces a complementary set of vectors which is often
    useful. Since this process has been shown to be numerically stable, it
    does produce an orthogonal matrix, in contrast to the Gram-Schmidt
    process.

    If A = ( a_1, ..., a_n ) is an m×n matrix of rank r, then at the k-th
    stage of the triangularization (k < r) we have

        A^(k) = P_{k-1} P_{k-2} ··· P_0 A = | R_k  S_k |
                                            | 0    T_k | ,

    where R_k is an upper-triangular matrix of order k. The next step is to
    compute A^(k+1) = P_k A^(k), where P_k is chosen to reduce the first
    column of T_k to zero except for the first component. This component
    becomes the last diagonal element of R_{k+1}, and since its modulus is
    equal to the Euclidean length of the first column of T_k, it should in
    general be maximized by a suitable interchange of the columns of T_k.
    After r steps, T_r will be effectively zero (the length of each of its
    columns will be smaller than some tolerance) and the process stops.

    Hence we conclude that if rank(A) = r, then for some permutation matrix Π
    the Householder decomposition (or "QR decomposition") of A is

        Q A Π = P_{r-1} P_{r-2} ··· P_0 A Π = | R  S |
                                              | 0  0 | ,

    where Q = P_{r-1} P_{r-2} ··· P_0 is an m × m orthogonal matrix and R is
    upper-triangular and non-singular.
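
    A compact sketch of the column-pivoted triangularization (our own
    illustration; it stores the u_k and β_k explicitly rather than beneath R
    as described in the next paragraph):

        import numpy as np

        def householder_qr(A, tol=1e-12):
            A = A.astype(float).copy()
            m, n = A.shape
            piv = np.arange(n)
            us, betas = [], []
            for k in range(min(m, n)):
                norms = np.linalg.norm(A[k:, k:], axis=0)
                j = k + int(np.argmax(norms))     # bring the longest column forward
                if norms[j - k] <= tol:
                    break                         # T_k effectively zero: rank found
                A[:, [k, j]] = A[:, [j, k]]
                piv[[k, j]] = piv[[j, k]]
                u = A[k:, k].copy()
                u[0] += (1.0 if u[0] >= 0 else -1.0) * np.linalg.norm(u)
                beta = 2.0 / (u @ u)
                A[k:, k:] -= beta * np.outer(u, u @ A[k:, k:])   # apply P_k
                us.append(u)
                betas.append(beta)
            return us, betas, np.triu(A), piv

        def apply_Q(us, betas, v):
            # Q v = P_{r-1} ... P_0 v, each factor applied as v - beta (u^T v) u
            v = v.astype(float).copy()
            for k, (u, beta) in enumerate(zip(us, betas)):
                v[k:] -= beta * (u @ v[k:]) * u
            return v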

    We are now concerned with the manner in which Q should be stored and the
    means by which Q, R, S may be updated if the columns of A are changed. We
    will suppose that a column a_p is deleted from A and that a column a_q is
    added. It will be clear what is to be done if only one or the other takes
    place.

    Since the Householder transformations P_k are defined by the vectors u_k,
    the usual method is to store the u_k's in the area beneath R, with a few
    extra words of memory being used to store the β_k's and the diagonal
    elements of R. The product Q v for some vector v is then easily computed
    in the form P_{r-1} P_{r-2} ··· P_0 v where, for example,

        P_0 v = ( I - β_0 u_0 u_0^T ) v = v - β_0 ( u_0^T v ) u_0 .

    The updating is best accomplished as follows. The first p-1 columns of the
    new R are the same as before; the other columns p through n are simply
    overwritten by columns a_{p+1}, ..., a_n, a_q and transformed by the
    product P_{p-1} P_{p-2} ··· P_0 to obtain a new

        | S_{p-1} |
        | T_{p-1} | ;

    then T_{p-1} is triangularized as usual. This method allows Q to be kept
    in product form always, and there is no accumulation of errors. Of course,
    if p = 1 the complete decomposition must be re-done, and since with m ≥ n
    the work is roughly proportional to (m - n/3)n², this can mean a lot of
    work. But if p ≈ n/2 on the average, then only about 1/8 of the original
    work must be repeated each updating.

    Assume that we have a matrix A which is to be replaced by a matrix Ā
    formed from A by eliminating column a_p and inserting a new vector g as
    the last column. As in the simplex method, we can produce an updating
    procedure using Householder transformations. If Ā is premultiplied by Q,
    the resulting matrix has upper Hessenberg form, as before. As before, it
    can be reduced to an upper triangular matrix in O(m²) multiplications.

    6. Projections

    In optimization problems involving linear constraints it is often
    necessary to compute the projections of some vector either into or
    orthogonal to the space defined by a subset of the constraints (usually
    the current "basis"). In this section we show how Householder
    transformations may be used to compute such projections. As we have shown,
    it is possible to update the Householder decomposition of a matrix when
    the number of columns in the matrix is changed, and thus we will have an
    efficient and stable means of orthogonalizing vectors with respect to
    basis sets whose component vectors are changing one by one.

    Let the basis set of vectors a_1, a_2, ..., a_n form the columns of an
    m × n matrix A, and let S_r be the sub-space spanned by {a_i}. We shall
    assume that the first r vectors are linearly independent and that
    rank(A) = r. In general, m ≥ n ≥ r, although the following is true even if
    m < n.

    Given an arbitrary vector z we wish to compute the projections

        u = P z ,   v = ( I - P ) z

    for some projection matrix P, such that

    (a) z = u + v ,

    (b) u^T v = 0 ,

    (c) u ∈ S_r   (i.e., ∃ x such that u = A x) ,

    (d) v is orthogonal to S_r   (i.e., A^T v = 0) .

    One method is to write P as A A⁺, where A⁺ is the n × m generalized
    inverse of A, and in [7] Fletcher shows how A⁺ may be updated upon changes
    of basis. In contrast, the method based on Householder transformations
    does not deal with A⁺ explicitly but instead keeps A A⁺ in factorized form
    and simply updates the orthogonal matrix required to produce this form.
    Apart from being more stable and just as efficient, the method has the
    added advantage that there are always two orthonormal sets of vectors
    available, one spanning S_r and the other spanning its complement.

    As already shown, we can construct an m × m orthogonal matrix Q such that

        Q A = | R  S |     r rows
              | 0  0 | ,

    where R is an r × r upper-triangular matrix. Let

        w = Q z = | w_1 |    r
                  | w_2 |    m-r                                            (6.1)

    and define

        u = Q^T | w_1 |        v = Q^T | 0   |
                | 0   | ,              | w_2 | .                            (6.2)

    Then it is easily verified that u, v are the required projections of z,
    which is to say they satisfy the above four properties. Also, the x in (c)
    is readily shown to be

        x = | R^{-1} w_1 |
            | 0          | .

    In effect, we are representing the projection matrices in the form

        P = Q^T | I_r  0 | Q                                                (6.3)
                | 0    0 |

    and

        I - P = Q^T | 0  0       | Q ,                                      (6.4)
                    | 0  I_{m-r} |

    and we are computing u = P z, v = (I - P) z by means of (6.1), (6.2). The
    first r columns of Q^T span S_r and the remaining m-r span its complement.
    Since Q and R may be updated accurately and efficiently if they are
    computed using Householder transformations, we have, as claimed, the means
    of orthogonalizing vectors with respect to varying bases.

    As an example of the use of the projection (6.4), consider the problem of
    finding the stationary values of x^T A x subject to x^T x = 1 and
    C^T x = 0, where A is a real symmetric matrix of order n and C is an n × p
    matrix of rank r, with r ≤ p.
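
    A short sketch of (6.1)-(6.2) (our own illustration via numpy's QR
    factorization; numpy returns A = Qt·R with Qt having orthonormal columns,
    so Qt corresponds to Q^T in the text, and the sketch assumes the first r
    columns of A are linearly independent):

        import numpy as np

        def projections(A, z, r):
            Qt, _ = np.linalg.qr(A, mode='complete')   # A = Qt (R; 0)
            w = Qt.T @ z                               # w = Q z           (6.1)
            u = Qt[:, :r] @ w[:r]                      # u = Q^T (w_1, 0)  (6.2)
            v = Qt[:, r:] @ w[r:]                      # v = Q^T (0, w_2)
            return u, v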

    7. Linear Least-Squares Problem

    Consider the problem

        min_x || b - A x ||_2² ,

    where we assume that the rank of A is n. Since length is invariant under
    an orthogonal transformation we have

        || b - A x ||_2² = || Q b - Q A x ||_2² ,

    where

        Q A = | R |
              | 0 | .

    Let

        Q b = c = | c_1 |    n
                  | c_2 |    m-n .

    Then,

        || b - A x ||_2² = || c_1 - R x ||_2² + || c_2 ||_2² ,

    and the solution to the least-squares problem is given by

        x̂ = R^{-1} c_1 .

    Thus it is easy to solve the least-squares problem using orthogonal
    transformations.
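
    In code this is only a few lines (our own sketch, for A of full rank n):

        import numpy as np
        from scipy.linalg import solve_triangular

        def lstsq_qr(A, b):
            Q, R = np.linalg.qr(A)            # reduced factorization A = Q R
            c1 = Q.T @ b                      # the first n components of Qb
            return solve_triangular(R, c1)    # x = R^{-1} c_1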

    Alternatively, the least-squares problem can be solved by constructing the
    normal equations

        A^T A x = A^T b .

    However these are well-known to be ill-conditioned. Nevertheless the
    normal equations can be used in the following way. Let the residual vector
    r be defined by

        r = b - A x .

    Then,

        A^T r = A^T b - A^T A x = 0 .

    These equations can be written:

        | I    A | | r |   | b |
        |        | |   | = |   | .                                          (7.1)
        | A^T  0 | | x |   | 0 |

    Applying the orthogonal transformation Q and multiplying out, we obtain

        | I    0  R | | r̃_1 |   | c_1 |
        | 0    I  0 | | r̃_2 | = | c_2 | ,
        | R^T  0  0 | | x   |   | 0   |

    where r̃ = Q r and c = Q b. This system can easily be solved for x and r.
    The method of iterative refinement may be applied to obtain a very
    accurate solution. This method has been analysed by Björck [2].

    8. Least-Squares Problem with Linear Constraints

    Here we consider the problem

        minimize || b - A x ||_2²
        subject to G x = h .

    Using Lagrange multipliers λ, we may incorporate the constraints into
    equation (7.1) and obtain

        | 0    0    G | | λ |   | h |
        | 0    I    A | | r | = | b |
        | G^T  A^T  0 | | x |   | 0 | .

    The methods of the previous sections can be applied to obtain the solution
    of this system of equations, without actually constructing the above
    matrix. The problem simplifies, and a very accurate solution may be
    obtained.
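
    For illustration only, the following naive sketch (our own) forms and
    solves the augmented system directly, which the text explicitly avoids;
    it serves as a reference implementation for small problems:

        import numpy as np

        def constrained_lstsq(A, b, G, h):
            # minimize ||b - Ax||_2 subject to Gx = h, via the augmented system
            m, n = A.shape
            p = G.shape[0]
            K = np.block([
                [np.zeros((p, p)), np.zeros((p, m)), G],
                [np.zeros((m, p)), np.eye(m),        A],
                [G.T,              A.T,              np.zeros((n, n))],
            ])
            sol = np.linalg.solve(K, np.concatenate([h, b, np.zeros(n)]))
            lam, r, x = sol[:p], sol[p:p + m], sol[p + m:]
            return x, r, lam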

    Now we consider the problem

        minimize || b - A x ||_2²
        subject to G x ≥ h .

    Such a problem might arise in the following manner. Suppose we wish to
    approximate given data by the polynomial

        y(t) = αt³ + βt² + γt + δ

    such that y(t) is convex. This implies

        y''(t) = 6αt + 2β ≥ 0 .

    Thus, we require

        6αt_i + 2β ≥ 0 ,

    where the t_i are the data points. (This does not necessarily guarantee
    that the polynomial will be convex throughout the interval.) Introduce
    slack variables w such that

        G x - w = h ,   where w ≥ 0 .

    Introducing Lagrange multipliers as before, we may write the system as:

        | 0    0    G  -I | | λ |   | h |
        | 0    I    A   0 | | r | = | b |
        | G^T  A^T  0   0 | | x |   | 0 | .
                            | w |

    At the solution, we must have

        λ ≥ 0 ,   w ≥ 0 ,   λ^T w = 0 .

    This implies that when a Lagrange multiplier is non-zero, then the
    corresponding constraint holds with equality. Conversely, corresponding to
    a non-zero w_i the Lagrange multiplier must be zero. Therefore, if we knew
    which constraints held with equality at the solution, we could treat the
    problem as a linear least-squares problem with linear equality
    constraints. A technique, due to Cottle and Dantzig [5], exists for
    solving the problem in this way.

  • Bibliography

    [1] Beale, E. M. L., "Numerical Methods", in Nonlinear Programming,
        J. Abadie (ed.), John Wiley, New York, 1967, pp. 133-205.

    [2] Björck, Å., "Iterative Refinement of Linear Least Squares Solutions
        II", BIT 8 (1968), pp. 8-30.

    [3] Björck, Å., and G. H. Golub, "Iterative Refinement of Linear Least
        Squares Solutions by Householder Transformations", BIT 7 (1967),
        pp. 322-37.

    [4] Björck, Å., and V. Pereyra, "Solution of Vandermonde Systems of
        Equations", Publication 70-02, Universidad Central de Venezuela,
        Caracas, Venezuela, 1970.

    [5] Cottle, R. W., and G. B. Dantzig, "Complementary Pivot Theory of
        Mathematical Programming", Mathematics of the Decision Sciences,
        Part 1, G. B. Dantzig and A. F. Veinott (eds.), American Mathematical
        Society (1968), pp. 115-136.

    [6] Dantzig, G. B., R. P. Harvey, R. D. McKnight, and S. S. Smith, "Sparse
        Matrix Techniques in Two Mathematical Programming Codes", Proceedings
        of the Symposium on Sparse Matrices and Their Applications,
        T. J. Watson Research Publication RA1, no. 11707, 1969.

    [7] Fletcher, R., "A Technique for Orthogonalization", J. Inst. Maths.
        Applics. 5 (1969), pp. 162-66.

    [8] Forsythe, G. E., and G. H. Golub, "On the Stationary Values of a
        Second-Degree Polynomial on the Unit Sphere", J. SIAM, 13 (1965),
        pp. 1050-68.

    [9] Forsythe, G. E., and C. B. Moler, Computer Solution of Linear
        Algebraic Systems, Prentice-Hall, Englewood Cliffs, New Jersey, 1967.

    [10] Francis, J., "The QR Transformation. A Unitary Analogue to the LR
         Transformation", Comput. J. 4 (1961-62), pp. 265-71.

    [11] Golub, G. H., and C. Reinsch, "Singular Value Decomposition and Least
         Squares Solutions", Numer. Math., 14 (1970), pp. 403-20.

    [12] Golub, G. H., and R. Underwood, "Stationary Values of the Ratio of
         Quadratic Forms Subject to Linear Constraints", Technical Report
         No. CS 142, Computer Science Department, Stanford University, 1969.

    [13] Hanson, R. J., "Computing Quadratic Programming Problems: Linear
         Inequality and Equality Constraints", Technical Memorandum No. 240,
         Jet Propulsion Laboratory, Pasadena, California, 1970.

    [14] Hanson, R. J., and C. L. Lawson, "Extensions and Applications of the
         Householder Algorithm for Solving Linear Least Squares Problems",
         Math. Comp., 23 (1969), pp. 787-812.

    [15] Householder, A. S., "Unitary Triangularization of a Nonsymmetric
         Matrix", J. Assoc. Comp. Mach., 5 (1958), pp. 339-42.

    [16] Lanczos, C., Linear Differential Operators, Van Nostrand, London,
         1961, Chapter 3.

    [17] Leringe, Ö., and P. Wedin, "A Comparison Between Different Methods to
         Compute a Vector x Which Minimizes ||Ax - b||_2 When Gx = h",
         Technical Report, Department of Computer Sciences, Lund University,
         Sweden.

    [18] Levenberg, K., "A Method for the Solution of Certain Non-Linear
         Problems in Least Squares", Quart. Appl. Math., 2 (1944), pp. 164-68.

    [19] Marquardt, D. W., "An Algorithm for Least-Squares Estimation of
         Non-Linear Parameters", J. SIAM, 11 (1963), pp. 431-41.

    [20] Meyer, R. R., "Theoretical and Computational Aspects of Nonlinear
         Regression", P-1819, Shell Development Company, Emeryville,
         California.

    [21] Penrose, R., "A Generalized Inverse for Matrices", Proceedings of the
         Cambridge Philosophical Society, 51 (1955), pp. 406-13.

    [22] Peters, G., and J. H. Wilkinson, "Eigenvalues of Ax = λBx with Band
         Symmetric A and B", Comput. J., 12 (1969), pp. 398-404.

    [23] Powell, M. J. D., "Rank One Methods for Unconstrained Optimization",
         T. P. 372, Atomic Energy Research Establishment, Harwell, England,
         1969.

    [24] Rosen, J. B., "Gradient Projection Method for Non-linear Programming.
         Part I. Linear Constraints", J. SIAM, 8 (1960), pp. 181-217.

    [25] Shanno, D. C., "Parameter Selection for Modified Newton Methods for
         Function Minimization", J. SIAM, Numer. Anal., Ser. B, 7 (1970).

    [26] Stoer, J., "On the Numerical Solution of Constrained Least Squares
         Problems", (private communication), 1970.

    [27] Tewarson, R. P., "The Gaussian Elimination and Sparse Systems",
         Proceedings of the Symposium on Sparse Matrices and Their
         Applications, T. J. Watson Research Publication RA1, no. 11707, 1969.

    [28] Wilkinson, J. H., "Error Analysis of Direct Methods of Matrix
         Inversion", J. Assoc. Comp. Mach., 8 (1961), pp. 281-330.

    [29] Wilkinson, J. H., "Error Analysis of Transformations Based on the Use
         of Matrices of the Form I - 2ww^H", in Error in Digital Computation,
         Vol. II, L. B. Rall (ed.), John Wiley and Sons, Inc., New York, 1965,
         pp. 77-101.

    [30] Wilkinson, J. H., The Algebraic Eigenvalue Problem, Clarendon Press,
         Oxford, 1965.

    [31] Zoutendijk, G., Methods of Feasible Directions, Elsevier Publishing
         Company, Amsterdam (1960), pp. 80-90.

  • Topics in Stability Theory for Partial Difference Operators

    VIDAR THOMÉE

    University of Gothenburg

    PREFACE

    The purpose of these lectures is to present a short introduction to some
    aspects of the theory of difference schemes for the solution of initial
    value problems for linear systems of partial differential equations. In
    particular, we shall discuss various stability concepts for finite
    difference operators and the related question of convergence of the
    solution of the discrete problem to the solution of the continuous
    problem. Special emphasis will be given to the strong relationship between
    stability of difference schemes and correctness of initial value problems.

    In practice, most important applications deal with mixed initial boundary
    value problems for non-linear equations. It will not be possible in this
    short course to develop the theory in such a general context. However, the
    results in the particular cases we shall treat have intuitive implications
    for the more complicated situations. The two most important methods in
    stability theory for difference operators have been the Fourier method and
    the energy method. The former applies in its pure form only to equations
    with constant coefficients, whereas the latter is more directly applicable
    to variable coefficients and even to non-linear situations. Often
    different methods have to be combined, so that for instance Fourier
    methods are first used to analyse the linearized equations with
    coefficients fixed at some point, and then the energy method, or some
    other method, is applied to appraise the error committed by treating the
    simplified case. We have chosen in these lectures to concentrate on
    Fourier techniques.

    These notes were developed from material used previously by the author for
    a similar course held in the summer of 1968 in a University of Michigan
    engineering summer conference on numerical analysis, and also used for the
    author's survey paper [36]. Some of the relevant literature is collected
    in the list of references. A thorough account of the theory can be
    obtained by combining the book by Richtmyer and Morton [28] with the above
    mentioned survey paper [36]. Both these sources contain extensive lists of
    further references.


1. Introduction

Let 𝓒 be the set of uniformly continuous, bounded functions of x ∈ R, and let 𝓒^k be the set of functions v with (d/dx)^j v in 𝓒 for j ≤ k. For v ∈ 𝓒 set

    ‖v‖ = sup_x |v(x)|.

For any v ∈ 𝓒, any k, and any ε > 0 we can find ṽ ∈ 𝓒^k such that

    ‖v − ṽ‖ ≤ ε.

Consider the initial-value problem

    ∂u/∂t = ∂²u/∂x², t > 0,  (1)

    u(x,0) = v(x).  (2)

If v ∈ 𝓒², this problem admits one and only one solution with u(·,t) ∈ 𝓒² for t ≥ 0, namely

    u(x,t) = (4πt)^{-1/2} ∫_{-∞}^{∞} e^{-(x-y)²/4t} v(y) dy, t > 0.  (3)

It is clear that the solution u depends for fixed t linearly on v; we define a linear operator E₀(t) by

    E₀(t)v = u(·,t),

where u is defined by (3) and where v ∈ 𝓒². The solution operator E₀(t) has the properties

    ‖E₀(t)v‖ ≤ ‖v‖

and

    ‖E₀(t)v − v‖ → 0, t → 0+.

Since 𝓒² is dense in 𝓒, the operator E₀(t) may therefore be extended by continuity to a bounded linear operator E(t) defined on all of 𝓒, the generalized solution operator. The operator E(t) still has the properties

    E(t)v → v, t → 0+,  (4)

    ‖E(t)v‖ ≤ ‖v‖,  (5)

and is continuous in t for t ≥ 0. For this particular equation we actually get a classical solution for t > 0 even if v is only in 𝓒; we have E(t)v given by the integral in (3) for t > 0.

Consider now the initial-value problem

    ∂u/∂t = ∂u/∂x, t > 0,  (6)

    u(x,0) = v(x).  (7)

For v ∈ 𝓒¹ this problem admits one and only one genuine solution, namely

    u(x,t) = v(x + t).

Clearly ‖u(·,t)‖ ≤ ‖v‖ (actually we have equality), and it is again natural to define a generalized solution operator, continuous in t, by

    E(t)v(x) = v(x + t), v ∈ 𝓒.

This has again the properties (4), (5). In this case, the solution is as irregular for t > 0 as it is for t = 0.

Both these problems are thus "correctly posed" in 𝓒; they can be uniquely solved for a dense subset of 𝓒 and the solution operator is bounded.

We could instead of 𝓒 also have considered other basic classes of functions. Thus let L² be the set of square integrable functions with

    ‖v‖ = ( ∫_{-∞}^{∞} |v(x)|² dx )^{1/2}.

Consider again the initial-value problem (1), (2) and assume that u(x,t) is a classical solution and that u(x,t) tends to zero as fast as necessary when |x| → ∞ for the following to hold. Assume for simplicity that u is real-valued. We then have

    (d/dt) ∫ u² dx = 2 ∫ u (∂u/∂t) dx = 2 ∫ u (∂²u/∂x²) dx = −2 ∫ (∂u/∂x)² dx ≤ 0,  (8)

so that for t ≥ 0,

    ‖u(·,t)‖ ≤ ‖v‖.  (9)

Relative to the present framework it is also possible to define genuine and generalized solution operators; the latter is defined on the whole of L² and satisfies (4), (5).

For the problem (6), (7) the calculation corresponding to (8) goes similarly:

    (d/dt) ∫ u² dx = 2 ∫ u (∂u/∂x) dx = ∫ (∂/∂x)(u²) dx = 0.

One other way of looking at this is to introduce the Fourier transform; for integrable v, set

    v̂(ξ) = (2π)^{-1/2} ∫_{-∞}^{∞} e^{-iξx} v(x) dx.  (10)

Notice the Parseval relation: for v in addition in L² we have v̂ ∈ L² and

    ‖v̂‖ = ‖v‖.

For the Fourier transform û(ξ,t) with respect to x of the solution u(x,t) we then get initial-value problems for ordinary differential equations, namely

    dû/dt = −ξ²û, û(ξ,0) = v̂(ξ)

for (1), (2), and

    dû/dt = iξû, û(ξ,0) = v̂(ξ)

for (6), (7). These have the solutions

    û(ξ,t) = e^{-ξ²t} v̂(ξ)  (11)

and

    û(ξ,t) = e^{iξt} v̂(ξ),  (12)

respectively, and the actual solutions can be obtained, under certain conditions, by the inverse Fourier transform. Also, by Parseval's formula we have for both (11) and (12)

    ‖u(·,t)‖ ≤ ‖v‖,

which is again (9).
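The diagonal form (11), (12) of the solution operators lends itself directly to computation. The following sketch (an editorial illustration, not part of the original notes; it assumes Python with NumPy and uses a large periodic interval as a surrogate for the whole line) solves the heat problem (1), (2) by multiplying the discrete Fourier transform of v by e^{-ξ²t}; in accordance with (9), the L²-norm does not increase.

    import numpy as np

    # Solve u_t = u_xx via (11): the Fourier transform of the solution is
    # exp(-xi^2 t) * vhat(xi).  A long periodic interval stands in for the line.
    L, n = 40.0, 1024
    x = np.linspace(-L / 2, L / 2, n, endpoint=False)
    xi = 2 * np.pi * np.fft.fftfreq(n, d=L / n)   # discrete frequencies

    v = np.exp(-x**2)                             # initial data
    t = 0.5
    u = np.fft.ifft(np.exp(-xi**2 * t) * np.fft.fft(v)).real

    h = L / n                                     # Parseval: norms via grid sums
    print(np.sqrt(h * (v**2).sum()), np.sqrt(h * (u**2).sum()))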

For the purpose of approximate solution of the initial-value problem (1), (2), we replace the derivatives by difference quotients,

    (u(x,t+k) − u(x,t))/k = (u(x+h,t) − 2u(x,t) + u(x−h,t))/h²,

where h, k are small positive numbers which we shall later make tend to zero in such a fashion that λ = k/h² is kept constant. Solving for u(x,t+k), we get

    u(x,t+k) = λu(x+h,t) + (1 − 2λ)u(x,t) + λu(x−h,t).

This suggests that for the exact (generalized) solution to (1), (2),

    E(k)v(x) ≈ E_k v(x) = λv(x+h) + (1 − 2λ)v(x) + λv(x−h),  (13)

or after n steps,

    E(nk)v ≈ E_k^n v.

We shall prove that this is essentially correct for any v ∈ 𝓒 if, but only if, λ ≤ 1/2.

Thus, let us first notice that if λ ≤ 1/2, then the coefficients of E_k are all non-negative and add up to 1, so that (the norm is again the sup-norm)

    ‖E_k v‖ ≤ ‖v‖,

or generally

    ‖E_k^n v‖ ≤ ‖v‖.

The boundedness of the powers of E_k is referred to as stability of E_k.

Assume now that v ∈ 𝓒⁴. We then know that the classical solution of (1), (2) exists, and if u(·,t) = E(t)v = E₀(t)v, then u(·,t) ∈ 𝓒⁴ for t ≥ 0 and

    ‖∂⁴u(·,t)/∂x⁴‖ ≤ ‖d⁴v/dx⁴‖.

We shall prove that, if nk = t and λ ≤ 1/2, then

    ‖E_k^n v − E(t)v‖ ≤ Ctk‖d⁴v/dx⁴‖.

To see this, let us consider E_k u(·,t) − u(·,t+k). Taylor development around (x,t), using the differential equation, shows that

    |E_k u(x,t) − u(x,t+k)| ≤ Ck²‖∂⁴u(·,t)/∂x⁴‖ ≤ Ck²‖d⁴v/dx⁴‖.

Notice now that we can write

    E_k^n v − E(nk)v = Σ_{j=0}^{n-1} E_k^{n-1-j}(E_k − E(k))E(jk)v.

Therefore, by the stability of E_k,

    ‖E_k^n v − E(t)v‖ ≤ Σ_{j=0}^{n-1} ‖(E_k − E(k))E(jk)v‖ ≤ nCk²‖d⁴v/dx⁴‖ = Ctk‖d⁴v/dx⁴‖,

which we wanted to prove.

We shall now prove that for v not necessarily in 𝓒⁴, but only in 𝓒, we still have for nk = t

    ‖E_k^n v − E(t)v‖ → 0 when k → 0.

To see this, let ε > 0 be arbitrary, and choose ṽ ∈ 𝓒⁴ such that

    ‖v − ṽ‖ ≤ ε.

We then have

    ‖E_k^n v − E(t)v‖ ≤ ‖E_k^n(v − ṽ)‖ + ‖E_k^n ṽ − E(t)ṽ‖ + ‖E(t)(ṽ − v)‖ ≤ 2ε + Ctk‖d⁴ṽ/dx⁴‖.

Therefore, choosing k ≤ ε(Ct‖d⁴ṽ/dx⁴‖)^{-1}, we have

    ‖E_k^n v − E(t)v‖ ≤ 3ε,

which concludes the proof.

Consider now the case λ > 1/2. The middle coefficient 1 − 2λ in E_k is then negative. Taking

    v(x) = cos(πx/h),

so that v(x ± h) = −v(x), we get

    E_k v = (1 − 4λ)v,

so that the effect of E_k is multiplication by (1 − 4λ). We generally get

    E_k^n v = (1 − 4λ)^n v.

Since λ > 1/2 we have 1 − 4λ < −1, and it follows that it is not possible to have an inequality of the form

    ‖E_k^n v‖ ≤ C‖v‖, nk ≤ T.

This can also be interpreted to mean that small errors in the initial data are blown up to an extent where they overshadow the real solution. This phenomenon is called instability.
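This blow-up is easy to observe numerically. The sketch below (an editorial addition in Python with NumPy; the grid, data, and step counts are arbitrary choices) applies the operator (13) on a periodic grid with λ = 0.5 and λ = 0.55.

    import numpy as np

    def heat_step(u, lam):
        # One application of (13): E_k v(x) = lam v(x+h) + (1-2 lam) v(x) + lam v(x-h),
        # with periodic boundary conditions standing in for the whole line.
        return lam * np.roll(u, -1) + (1 - 2 * lam) * u + lam * np.roll(u, 1)

    n = 200
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    v = np.where(np.abs(x - 0.5) < 0.1, 1.0, 0.0)   # rough initial data

    for lam in (0.5, 0.55):
        u = v.copy()
        for _ in range(200):
            u = heat_step(u, lam)
        print(f"lambda = {lam}: max |u| after 200 steps = {np.abs(u).max():.3e}")
    # lambda = 0.5 keeps max|u| <= 1 = max|v|; lambda = 0.55 blows up.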

Instead of the simple difference scheme (13) we could study a more general type of operator, e.g.

    E_k v(x) = Σ_j a_j v(x + jh),  (14)

where the sum is finite. If we want this to be "consistent" with the equation (1) we have to demand that E_k approximates E(k), or, if u(x,t) is a smooth solution, that

    E_k u(·,t) − u(·,t+k) = o(k), k → 0.

Taylor series development gives for smooth u, with k = λh²,

    E_k u(x,t) − u(x,t+k) = (Σ_j a_j − 1)u + h(Σ_j j a_j)∂u/∂x + h²(½Σ_j j²a_j − λ)∂²u/∂x² + O(h³),

or the consistency relations

    Σ_j a_j = 1, Σ_j j a_j = 0, Σ_j j² a_j = 2λ.

Assuming these consistency relations to hold and assuming that all the a_j are ≥ 0, we get as above

    ‖E_k^n v‖ ≤ ‖v‖,  (15)

and the convergence analysis above can be carried over to this more general case with few changes.

However, the reason for choosing an operator of the form (14) which is not our old operator (13) would be to obtain higher accuracy in the approximation, and it will then turn out that in general not all the coefficients are non-negative. We cannot have (15) then, but we may still have

    ‖E_k^n v‖ ≤ C‖v‖, nk ≤ T,

for some C depending on T.

    When we work with the L2-norm rather than the maximum norm, Fourier transforms

    are again helpful; indeed in most of the subsequent lectures, Fourier analysis will

    be the foremost tool.

Thus, let v̂ be the Fourier transform of v defined by (10). We then have

    (E_k v)^(ξ) = Σ_j a_j e^{ijhξ} v̂(ξ),

or, introducing the characteristic (trigonometric) polynomial of the operator E_k,

    a(ξ) = Σ_j a_j e^{ijξ},

we find that the effect of E_k on the Fourier transform side is multiplication by a(hξ). One easily finds that, similarly, the effect of E_k^n is multiplication by a(hξ)^n. Using Parseval's relation, one then easily finds (the norm is now the L²-norm)

    ‖E_k^n v‖ ≤ sup_ξ |a(ξ)|^n ‖v‖,

and that this inequality is the best possible. It follows that we have stability if and only if |a(ξ)| ≤ 1 for all real ξ. We then actually have (15) in the L²-norm.

Consider again the special operator (13). We have in this case

    a(ξ) = 1 − 2λ(1 − cos ξ) = 1 − 4λ sin²(ξ/2),

and a(ξ) takes all values in the interval [1 − 4λ, 1]. We therefore find that also in L² we have stability if and only if 1 − 4λ ≥ −1, that is λ ≤ 1/2.

Difference approximations to the initial-value problem (6), (7) can be analysed similarly.
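For the record, the stability criterion just derived is immediate to check numerically; the following sketch (an editorial addition in Python with NumPy) samples a(ξ) = 1 − 4λ sin²(ξ/2) over a period.

    import numpy as np

    # Characteristic polynomial of (13): a(xi) = 1 - 4 lam sin^2(xi/2);
    # L2-stability holds precisely when sup |a(xi)| <= 1.
    xi = np.linspace(-np.pi, np.pi, 10001)
    for lam in (0.25, 0.5, 0.51):
        s = np.abs(1 - 4 * lam * np.sin(xi / 2) ** 2).max()
        print(f"lambda = {lam}: sup |a| = {s:.4f}",
              "(stable)" if s <= 1 + 1e-12 else "(unstable)")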


We shall put the above considerations in a more general setting and discuss an initial-value problem in a Banach space B. Thus let A be a linear operator with domain D(A) ⊆ B and let v ∈ B. Consider then the problem of finding u(t) ∈ B, t ≥ 0, such that

    du/dt = Au(t), t > 0,  (16)

    u(0) = v.  (17)

More precisely, we shall say that u(t), t ≥ 0, is a genuine solution of (16), (17) if (17) holds, u(t) is continuous for t ≥ 0, and

    (i) u(t) ∈ D(A), t > 0,

    (ii) ‖(Δt)^{-1}(u(t + Δt) − u(t)) − Au(t)‖ → 0 as Δt → 0, t > 0.

As before, we say that the problem (16), (17) is correctly posed in B if genuine solutions u(t) = E(t)v exist for a dense set of initial data v, and if the solution operator E(t) so defined is bounded for t in any compact interval; it can then be extended to all of B, defining the generalized solution.

We shall now study the approximation of a solution u(t) = E(t)v of a correctly posed initial-value problem (16), (17). We will then, for small k, k ≤ k₀, consider an approximation E_k of E(k), where E_k is a bounded linear operator with D(E_k) = B which depends continuously on k for 0 < k ≤ k₀. The idea is then that E_k^n v is going to approximate E(nk)v = E(k)^n v.

We say that the operator E_k is consistent with the initial-value problem (16), (17) if there is a set 𝒟 of genuine solutions of (16), (17) such that

    (i) the set of initial values u(0) for u ∈ 𝒟 is dense in B,

    (ii) for u ∈ 𝒟, ‖k^{-1}(E_k u(t) − u(t+k))‖ → 0 as k → 0, uniformly for 0 ≤ t ≤ T,

for any T > 0.

If the operator E_k is consistent with (16), (17), we say that it is convergent (in B) if for any v ∈ B, any t ≥ 0, and any pair of sequences {k_j}, {n_j} with k_j → 0, n_j k_j → t for j → ∞, we have

    ‖E_{k_j}^{n_j} v − E(t)v‖ → 0 when j → ∞.

We say that the operator E_k is stable (in B) if for any T > 0 there is a constant C such that

    ‖E_k^n‖ ≤ C, nk ≤ T, k ≤ k₀.

It turns out that consistency alone does not guarantee convergence; we have the following theorem, which is referred to as Lax's equivalence theorem [22].

Theorem Assume that (16), (17) is correctly posed and that E_k is a consistent approximation operator. Then stability is necessary and sufficient for convergence.

The proof of the sufficiency of stability for convergence is similar to the proof in the particular case treated above; the proof of the necessity depends on the Banach–Steinhaus theorem.

2. Initial-value problems in L² with constant coefficients

We begin with some notation. We shall work here with the Banach space L² = L²(R^d) with the norm

    ‖v‖ = ( ∫_{R^d} |v(x)|² dx )^{1/2},

and v̂ will denote the Fourier transform of v, defined as in Lecture 1 with the obvious modifications for d dimensions. We define for a multi-index α = (α₁,…,α_d), with |α| = Σ_j α_j, the differential operator

    D^α = (∂/∂x₁)^{α₁} ⋯ (∂/∂x_d)^{α_d}.

Single bars will denote absolute values and the corresponding vector and matrix norms, and double bars will indicate norms with respect to L², so that for the N-vector function u(x) ∈ L²,

    ‖u‖ = ( ∫_{R^d} Σ_{j=1}^N |u_j(x)|² dx )^{1/2}.

For later use we need the following

Lemma 1 Let 𝒟 be a dense subset of L² and let a(ξ) be a continuous N×N matrix function of ξ ∈ R^d. Then, for the operator defined by (T_a v)^(ξ) = a(ξ)v̂(ξ),

    sup_{v ∈ 𝒟, v ≠ 0} ‖T_a v‖/‖v‖ = sup_{ξ ∈ R^d} |a(ξ)|.

Let u(x,t) be an N-vector function defined for x ∈ R^d and t ≥ 0. Consider the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α| ≤ M} P_α D^α u, t > 0,  (1)

    u(x,0) = v(x),  (2)

where the P_α are constant N×N matrices and where we can consider P(D)u to be defined for u ∈ 𝒮, the set of rapidly decreasing smooth functions. Let

    P(iξ) = Σ_{|α| ≤ M} P_α (iξ)^α, ξ ∈ R^d.

We have:

Theorem 1 The initial-value problem (1), (2) is correctly posed in L² if and only if, for any T ≥ 0, there is a C such that

    |e^{tP(iξ)}| ≤ C, ξ ∈ R^d, 0 ≤ t ≤ T.  (3)

Proof Assume that (3) holds. Let v ∈ 𝒮 and consider

    u(x,t) = (2π)^{-d/2} ∫_{R^d} e^{ix·ξ} e^{tP(iξ)} v̂(ξ) dξ.  (4)

By differentiation under the integral sign we find that u(x,t) satisfies (1), and so is a solution to (1), (2). Since u(·,t) ∈ 𝒮 for t ≥ 0, it is a genuine solution in the sense of Lecture 1 and is also unique. Thus E₀(t)v = u(·,t) with 𝒟 = 𝒮. By Fourier's inversion formula and Parseval's theorem

    ‖E₀(t)v‖ = ‖e^{tP(i·)} v̂‖ ≤ C‖v̂‖ = C‖v‖.

Since 𝒮 is dense in L², it follows that the initial-value problem is correctly posed.

We now want to prove the necessity of (3) for correctness. Let now v ∈ C₀^∞ and define u(x,t) by (4). We find at once that u(x,t) satisfies the initial-value problem (1), (2), and so u(x,t) = E(t)v. Again, by Fourier's inversion formula and Parseval's theorem,

    ‖E(t)v‖ = ‖e^{tP(i·)} v̂‖,

so that by Lemma 1,

    sup_ξ |e^{tP(iξ)}| ≤ ‖E(t)‖ ≤ C, 0 ≤ t ≤ T,

which proves the necessity of (3), since C₀^∞ is dense in L².

Ex. 1 Consider the symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j, A_j hermitian.  (5)

Then the initial-value problem for (5) is correctly posed in L², for

    e^{tP(iξ)} = exp( it Σ_{j=1}^d ξ_j A_j ),

since this is a unitary matrix.

Before proceeding to the next example we state a lemma. For an arbitrary N×N matrix A with eigenvalues λ_j, j = 1,…,N, we introduce

    Λ(A) = max_j Re λ_j.

We then have

Lemma 2 If A is an N×N matrix, we have for t ≥ 0

    |e^{tA}| ≤ e^{tΛ(A)} Σ_{j=0}^{N-1} (2t|A|)^j / j!.

Proof See [9].

Ex. 2 Consider the system (1) and consider also the principal part P₀ of P, which corresponds to the polynomial

    P₀(ξ) = Σ_{|α| = M} P_α ξ^α.

We say that the system (1) is parabolic in Petrovskii's sense if there is a δ > 0 such that

    Λ(P₀(iξ)) ≤ −δ, |ξ| = 1.

By homogeneity this is equivalent to the existence of a δ > 0 and a C such that

    Λ(P(iξ)) ≤ −δ|ξ|^M + C, ξ ∈ R^d.

We then have that if (1) is parabolic in Petrovskii's sense, the corresponding initial-value problem is correctly posed in L². For by Lemma 2 we have for 0 ≤ t ≤ T

    |e^{tP(iξ)}| ≤ e^{t(−δ|ξ|^M + C)} Σ_{j=0}^{N-1} (2t|P(iξ)|)^j / j!,

which is clearly bounded. In particular, the heat equation

    ∂u/∂t = Σ_{j=1}^d ∂²u/∂x_j²

clearly falls into this category.

Solutions of parabolic systems are smooth for t > 0; we have

Theorem 2 Assume that (1) is parabolic in Petrovskii's sense. Then for t > 0, D^α E(t)v ∈ L² for any α, and for any T > 0 and any α there is a C such that

    ‖D^α E(t)v‖ ≤ C t^{-|α|/M} ‖v‖, 0 < t ≤ T.

Ex. 4 For the Cauchy–Riemann system (d = 1)

    ∂u/∂t = ( 0 −1; 1 0 ) ∂u/∂x, P(iξ) = iξ ( 0 −1; 1 0 ),

the matrix P(iξ) is hermitian with the real eigenvalues ±ξ, and a simple calculation yields

    |e^{tP(iξ)}| = e^{t|ξ|},

which is not bounded for any t > 0 when |ξ| → ∞; this initial-value problem is thus not correctly posed in L².

Ex. 5 Although our theory only deals with systems which are of first order with respect to t, it is actually possible to consider also higher-order systems by reducing them to first-order systems. We shall only exemplify this in one particular case. Consider the initial-value problem (d = 1)

    ∂²w/∂t² = ∂²w/∂x², t > 0.  (7)

Introducing

    u = (u₁, u₂) = (∂w/∂t, ∂w/∂x),  (8)

we have for u the initial-value problem

    ∂u/∂t = ( 0 1; 1 0 ) ∂u/∂x, t > 0, u(x,0) = v(x).  (9)

Here

    e^{tP(iξ)} = exp( itξ ( 0 1; 1 0 ) ) = ( cos tξ   i sin tξ; i sin tξ   cos tξ )

is unitary, so that we have that the initial-value problem (9), obtained by the transformation (8) from (7), is correctly posed in L².

In order that an initial-value problem of the type (1), (2) be correctly posed in L², it is necessary that it be correctly posed in the sense of Petrovskii; more precisely:

Theorem 3 If (1), (2) is correctly posed in L², then there is a constant C such that

    Λ(P(iξ)) ≤ C, ξ ∈ R^d.  (10)

Proof Follows at once by

    e^{tΛ(P(iξ))} = ρ(e^{tP(iξ)}) ≤ |e^{tP(iξ)}| ≤ C′, 0 ≤ t ≤ 1.

We shall see at once by the following example that (10) is not sufficient for correctness in L².

Ex. 6 Take the initial-value problem corresponding to (d = 1)

    P(iξ) = ( 0 1; −ξ² 0 ),

whose eigenvalues ±iξ are purely imaginary, so that (10) holds. We get then

    e^{tP(iξ)} = ( cos tξ   ξ^{-1} sin tξ; −ξ sin tξ   cos tξ ).

However, a simple calculation yields

    |e^{tP(iξ)}| ≥ |ξ| |sin tξ|,

which is easily seen to be unbounded for 0 ≤ t ≤ 1 (take tξ = 1).
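The contrast between the two reductions is easy to see numerically. The following sketch (an editorial addition assuming Python with NumPy and SciPy) evaluates |e^{tP(iξ)}| for the symbols of Ex. 5 and Ex. 6 as given above.

    import numpy as np
    from scipy.linalg import expm

    # Ex. 5: P(i xi) = i xi (0 1; 1 0) has unitary exponential.
    # Ex. 6: P(i xi) = (0 1; -xi^2 0) satisfies (10) (eigenvalues +- i xi),
    #        but |exp(t P(i xi))| grows like |xi| |sin(t xi)|.
    t = 0.5
    for xi in (1.0, 10.0, 100.0):
        good = expm(1j * t * xi * np.array([[0.0, 1.0], [1.0, 0.0]]))
        bad = expm(t * np.array([[0.0, 1.0], [-xi**2, 0.0]]))
        print(f"xi = {xi:6.1f}:  good |exp| = {np.linalg.norm(good, 2):.3f},"
              f"  bad |exp| = {np.linalg.norm(bad, 2):.3f}")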

Necessary and sufficient conditions for correctness have been given by Kreiss [19]. The main contents of Kreiss' result are concentrated in the following lemma. Here, for an N×N matrix A, we denote by Re A the matrix

    Re A = ½(A + A*).

Also recall that for hermitian matrices A and B, A ≤ B means

    (Au,u) ≤ (Bu,u)

for all N-vectors u. We denote the resolvent of A by R(A;z),

    R(A;z) = (zI − A)^{-1}.

It will be implicitly assumed, when we write down R(A;z), that z is not an eigenvalue of A.

Lemma 3 Let 𝓕 be a family of N×N matrices. Then the following four conditions are equivalent:

    (i) There is a constant C such that |e^{tA}| ≤ C for A ∈ 𝓕, t ≥ 0.

    (ii) For A ∈ 𝓕 we have Λ(A) ≤ 0, and there is a constant C such that

        |R(A;z)| ≤ C(Re z)^{-1}, Re z > 0.

    (iii) For A ∈ 𝓕 we have Λ(A) ≤ 0, and there are two constants C₁ and C₂ and for each A ∈ 𝓕 a matrix S = S(A) with

        |S|, |S^{-1}| ≤ C₁,

    and such that SAS^{-1} is an upper triangular matrix whose diagonal elements are the eigenvalues λ_j of A and whose off-diagonal elements b_ij satisfy

        |b_ij| ≤ C₂ min(|Re λ_i|, |Re λ_j|).

    (iv) There is a constant C > 0 such that for each A ∈ 𝓕 there is a hermitian matrix H = H(A) with

        C^{-1}I ≤ H ≤ CI and Re(HA) ≤ 0.

Proof See [19].

Applied to our problem, Kreiss' result takes the following form:

Theorem 4 The initial-value problem (1), (2) is correctly posed in L² if and only if there is a constant γ such that the family

    𝓕 = { P(iξ) − γI : ξ ∈ R^d }  (11)

satisfies the equivalent conditions of Lemma 3. Notice that for hermitian H,

    Re(H(P(iξ) − γI)) = Re(HP(iξ)) − γH.  (12)

One commonly used criterion is:

Theorem 5 Let P(iξ) be a normal matrix for each ξ. Then (1), (2) is correctly posed if and only if (10) holds.

Proof By Theorem 3 we only have to prove the sufficiency. Since P(iξ) is normal we can find a unitary U(ξ) such that

    U(ξ)P(iξ)U(ξ)* = D(ξ)

is diagonal. Hence

    |e^{tP(iξ)}| = |e^{tD(ξ)}| = max_j e^{t Re λ_j(ξ)} ≤ e^{tC},

which proves the result.

For later use we state:

Theorem 6 If (1), (2) is correctly posed in L², then (10) holds and there are positive constants C₁ and C₂ and for each ξ ∈ R^d a positive definite hermitian matrix H(ξ) such that

    C₁^{-1}I ≤ H(ξ) ≤ C₁I

and

    Re(H(ξ)P(iξ)) ≤ C₂H(ξ).  (13)

Proof By Theorem 4 there is a constant γ such that the family 𝓕 in (11) satisfies condition (iv) of Lemma 3 with C = C₁. Thus for each ξ ∈ R^d there is a positive definite hermitian H(ξ) with C₁^{-1}I ≤ H(ξ) ≤ C₁I such that

    Re(H(ξ)(P(iξ) − γI)) ≤ 0.

But by (12) this implies (13).

3. Difference approximations in L² to initial-value problems with constant coefficients

Consider again the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α| ≤ M} P_α D^α u, t > 0,  (1)

    u(x,0) = v(x).  (2)


For the approximate solution of (1), (2) we consider explicit difference operators of the form

    E_h v(x) = Σ_β a_β(h) v(x + βh),

where h is a small positive parameter, β = (β₁,…,β_d) with β_j integers, the a_β(h) are N×N matrices which are polynomials in h, and the summation is over a finite set of β. We introduce the symbol of the operator E_h,

    Ê_h(ξ) = Σ_β a_β(h) e^{ihβ·ξ},

which is periodic with period 2π/h in ξ_j, and notice that for v ∈ L² the Fourier transform of E_h v is

    (E_h v)^(ξ) = Ê_h(ξ)v̂(ξ).

Assume that the initial-value problem (1), (2) is correctly posed. We then want to choose E_h so that it approximates the solution operator E(k), where k is a positive parameter tied to h by the relation

    k/h^M = λ = constant;

we actually want to approximate u(x,nk) = E(nk)v = E(k)^n v by E_k^n v. In what follows we shall emphasise the dependence on k rather than h and write E_k as in Lecture 1.

To accomplish this, we shall assume that E_k satisfies the condition in the following definition. We say that E_k is consistent with (1) if for any genuine solution u of (1) with initial data in C₀^∞,

    ‖E_k u(·,t) − u(·,t+k)‖ = o(k), k → 0.

If o(k) can be replaced by O(kh^μ), we say that E_k is accurate of order μ. Clearly any consistent scheme is accurate of order at least 1.

We can express consistency and accuracy in terms of the symbol (cf. [35]):

Lemma 1 The operator E_k is consistent with (1) if and only if, for each fixed ξ,

    Ê_k(ξ) − e^{kP(iξ)} = o(k), k → 0.  (3)

The operator E_k is accurate of order μ if and only if, for each fixed ξ,

    Ê_k(ξ) − e^{kP(iξ)} = O(kh^μ), k → 0.

The proof of (3), say, consists in proving, as in the special case in Lecture 1, that consistency is equivalent to a number of algebraic conditions on the coefficients, which turn out to be equivalent to the analytic functions e^{kP(ih^{-1}ξ)} and Ê_k(h^{-1}ξ) having the same coefficients for h^j ξ^α up to a certain order.

Using Lemma 1 it is easy to deduce that if E_k is consistent with (1) in the present sense, then we also have consistency in the sense of Lecture 1. For the set 𝒟 of genuine solutions in the previous definition we can for instance take the ones corresponding to v ∈ 𝒮. From Lax's equivalence theorem it is clear that we want to discuss the stability of operators E_k of the form described. We have

Theorem 1 The operator E_k is stable in L² if and only if for any T > 0 there is a constant C such that

    |Ê_k(ξ)^n| ≤ C, ξ ∈ R^d, nk ≤ T, k ≤ k₀.

Proof We notice that Ê_k(ξ)^n is the symbol of E_k^n. It follows in the same way as in Lecture 2 that

    ‖E_k^n‖ = sup_ξ |Ê_k(ξ)^n|,

which proves the theorem.

We now turn to the algebraic characterization of stability. We first prove the necessity of the von Neumann condition. For any N×N matrix A we denote by ρ(A) its spectral radius, the maximum of the moduli of the eigenvalues of A.

Theorem 2 If E_k is stable in L², there exists a constant γ such that

    ρ(Ê_k(ξ)) ≤ 1 + γk, ξ ∈ R^d, k ≤ k₀.  (4)

Proof We have for nk ≤ 1,

    ρ(Ê_k(ξ))^n = ρ(Ê_k(ξ)^n) ≤ |Ê_k(ξ)^n| ≤ C.

Choosing n so that 1/2 ≤ nk ≤ 1, we get ρ(Ê_k(ξ)) ≤ C^{1/n} ≤ C^{2k} ≤ 1 + γk with a suitable γ.

It is easy to prove by counter-examples that (4) is not sufficient for stability. Necessary and sufficient conditions for stability have been given by Kreiss [18] and Buchanan [5]; we quote here Kreiss' result. The main content of Kreiss' theorem is concentrated in the following lemma. Here we have introduced the following notation: for H hermitian and positive definite, we introduce

    |u|_H = (Hu,u)^{1/2}, |A|_H = sup_{u ≠ 0} |Au|_H / |u|_H.

Recall again that for hermitian matrices, A ≤ B means (Au,u) ≤ (Bu,u).

Lemma 2 Let 𝓕 be a family of N×N matrices. Then the following four conditions are equivalent:

    (i) There is a constant C such that |A^n| ≤ C for A ∈ 𝓕, n = 1, 2, ….

    (ii) For A ∈ 𝓕 we have ρ(A) ≤ 1, and there is a constant C such that

        |R(A;z)| ≤ C(|z| − 1)^{-1}, |z| > 1.

    (iii) For A ∈ 𝓕 we have ρ(A) ≤ 1, and there are two constants C₁ and C₂ and for each A ∈ 𝓕 a matrix S = S(A) with

        |S|, |S^{-1}| ≤ C₁,

    and such that SAS^{-1} is an upper triangular matrix whose diagonal elements are the eigenvalues κ_j of A and whose off-diagonal elements b_ij satisfy

        |b_ij| ≤ C₂ min(1 − |κ_i|, 1 − |κ_j|).

    (iv) There is a constant C > 0 such that for each A ∈ 𝓕 there is a hermitian matrix H = H(A) with

        C^{-1}I ≤ H ≤ CI and |A|_H ≤ 1.

Proof See [28].

To be able to apply this lemma to our problem we need the following analogue of Lemma 2.4.

Lemma 3 Assume that E_k is stable in L². Then there exists a constant γ such that for

    F_k(ξ) = e^{-γk} Ê_k(ξ), ξ ∈ R^d, k ≤ k₀,

one has

    |F_k(ξ)^n| ≤ C, n = 1, 2, ….

An alternative way of expressing this result is that for some γ, all k ≤ k₀, and any n we have

    |Ê_k(ξ)^n| ≤ C e^{γnk}.

Combining Lemmas 2 and 3 we have at once:

Theorem 3 If the operator E_k is stable in L², then there is a γ such that the family

    𝓕 = { e^{-γk} Ê_k(ξ) : ξ ∈ R^d, k ≤ k₀ }

satisfies the conditions of Lemma 2. On the other hand, if there is a constant γ such that this family satisfies at least one of the conditions of Lemma 2, then E_k is stable in L².

One commonly used criterion is:

Theorem 4 Let E_k be such that Ê_k(ξ) is a normal matrix. Then von Neumann's condition is necessary and sufficient for stability.

Proof By Theorem 2 we only have to prove the sufficiency. Since Ê_k(ξ) is normal, there is for each k ≤ k₀ and ξ ∈ R^d a unitary matrix U_k(ξ) such that

    U_k(ξ) Ê_k(ξ) U_k(ξ)* = D_k(ξ)

is diagonal. Hence

    |Ê_k(ξ)^n| = |D_k(ξ)^n| = ρ(Ê_k(ξ))^n ≤ (1 + γk)^n ≤ e^{γT}, nk ≤ T,

which proves the result. To see the relation with Lemmas 2 and 3, we could also have formulated this as follows. We have, with the same γ as in (4), for F_k(ξ) = e^{-γk} Ê_k(ξ), that

    U_k(ξ) F_k(ξ) U_k(ξ)* = e^{-γk} D_k(ξ),

which is diagonal with eigenvalues of modulus ≤ 1. Thus, a fortiori, it is triangular, and the estimates in condition (iii) of Lemma 2 hold.
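In the normal case the whole stability analysis thus reduces to a scalar check of (4). The sketch below (an editorial addition in Python with NumPy; the function name and the sample operator are illustrative only) estimates sup_ξ ρ(Ê_k(ξ)) by sampling one period of the symbol.

    import numpy as np

    def spectral_radius_sup(symbol, nxi=2001):
        # Sample rho(E_k(xi)) over one period (d = 1) for the von Neumann check (4).
        xis = np.linspace(-np.pi, np.pi, nxi)
        return max(np.abs(np.linalg.eigvals(symbol(xi))).max() for xi in xis)

    # Sample symbol of Lax-Wendroff type (cf. (10) below), A = diag(1, -2), lambda = 0.4.
    A = np.diag([1.0, -2.0])
    lam = 0.4
    symbol = lambda xi: (np.eye(2) + 1j * lam * np.sin(xi) * A
                         + lam**2 * (np.cos(xi) - 1.0) * (A @ A))
    rho = spectral_radius_sup(symbol)
    print(f"sup rho = {rho:.6f} -> von Neumann condition {'holds' if rho <= 1 else 'fails'}")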

As for the existence of stable operators, we have (cf. [17]):

Theorem 5 There exist L²-stable operators consistent with (1), (2) if and only if (1), (2) is correctly posed in L².

Proof We first prove that the correctness is necessary. It follows by Lemma 1 and the stability that

    |e^{tP(iξ)}| = lim_{k→0, nk→t} |Ê_k(ξ)^n| ≤ C, 0 ≤ t ≤ T,

which implies correctness.

On the other hand, if (1), (2) is correctly posed, one can construct a consistent stable difference operator, or, what is equivalent, its symbol, by setting

    Ê_k(ξ) = Q_k(ξ) + R_k(ξ),  (5)

where Q_k(ξ) is a trigonometric polynomial approximation of e^{kP(iξ)} and R_k(ξ) is a suitable dissipative term. Using Kreiss' stability theorems one can prove that this E_k is stable for small λ = k/h^M. The part of this operator corresponding to the second term in (5) is referred to as an artificial viscosity.

We shall consider some examples. Consider the initial-value problem for a symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j, t > 0, A_j hermitian, u(x,0) = v(x).  (6)

We know from Lecture 2 that this problem is correctly posed in L². Consider as before a difference operator

    E_k v(x) = Σ_β a_β v(x + βh), k = λh,  (7)

where for simplicity we assume the a_β independent of h. We have the following result by Friedrichs [8].

Theorem 6 If the a_β are hermitian and positive semi-definite and Σ_β a_β = I, then

    |Ê_k(ξ)| ≤ 1,

and thus E_k is stable.

Proof We have the generalized Cauchy–Schwarz inequality

    |(Σ_β a_β u_β, v)| ≤ (Σ_β (a_β u_β, u_β))^{1/2} (Σ_β (a_β v, v))^{1/2},

where (u,v) = Σ_j u_j v̄_j. Therefore, with w = Ê_k(ξ)v,

    |w|² = |(Σ_β a_β e^{ihβ·ξ} v, w)| ≤ (Σ_β (a_β v, v))^{1/2} (Σ_β (a_β w, w))^{1/2} = |v| |w|,

since Σ_β a_β = I, and hence |Ê_k(ξ)v| ≤ |v|, which proves the theorem.

As an application, take (with e_j the unit vector in the x_j-direction)

    E_k v(x) = (2d)^{-1} Σ_{j=1}^d [v(x + he_j) + v(x − he_j)] + (λ/2) Σ_{j=1}^d A_j [v(x + he_j) − v(x − he_j)],

which is consistent with (6) and accurate of order 1. It is clear that if

    0 < λ ≤ ( d max_j |A_j| )^{-1},

the coefficients

    a_{±e_j} = (2d)^{-1} I ± (λ/2) A_j

are positive semi-definite, and so the operator E_k is stable.
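The following sketch (an editorial addition in Python with NumPy; the particular matrix A and grid are arbitrary) checks the positive semi-definiteness of the coefficients of Friedrichs' scheme for d = 1 and verifies that the discrete L²-norm does not grow.

    import numpy as np

    # Friedrichs' scheme for u_t = A u_x in d = 1: coefficients
    # a_{+1} = I/2 + (lam/2) A and a_{-1} = I/2 - (lam/2) A, hermitian and
    # positive semi-definite precisely when lam |A| <= 1.
    A = np.array([[0.0, 1.0], [1.0, 0.0]])      # hermitian, eigenvalues +-1
    lam = 0.8                                   # lam |A| = 0.8 <= 1

    a_p = 0.5 * np.eye(2) + 0.5 * lam * A
    a_m = 0.5 * np.eye(2) - 0.5 * lam * A
    print(np.linalg.eigvalsh(a_p).min() >= 0, np.linalg.eigvalsh(a_m).min() >= 0)

    def step(u):
        # One Friedrichs step on a periodic grid; u has shape (2, n).
        up, um = np.roll(u, -1, axis=1), np.roll(u, 1, axis=1)
        return 0.5 * (up + um) + 0.5 * lam * (A @ (up - um))

    n = 400
    u = np.zeros((2, n))
    u[0] = np.exp(-100.0 * (np.linspace(0, 1, n, endpoint=False) - 0.5) ** 2)
    norm0 = np.sqrt((u**2).sum() / n)
    for _ in range(400):
        u = step(u)
    print(np.sqrt((u**2).sum() / n) <= norm0 + 1e-12)   # L2-norm did not grow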

The operator E_k can be considered as obtained by replacing (6) by

    (u(x,t+k) − (2d)^{-1} Σ_j [u(x+he_j,t) + u(x−he_j,t)]) / k = Σ_j A_j (u(x+he_j,t) − u(x−he_j,t)) / (2h).

Consider for a moment the perhaps more natural equation

    (u(x,t+k) − u(x,t)) / k = Σ_j A_j (u(x+he_j,t) − u(x−he_j,t)) / (2h),

which gives the consistent operator

    E_k v(x) = v(x) + (λ/2) Σ_{j=1}^d A_j [v(x + he_j) − v(x − he_j)],  (8)

with symbol

    Ê_k(h^{-1}ξ) = I + iλ Σ_{j=1}^d A_j sin ξ_j.

We shall prove that this operator is not stable in L² if any of the A_j is non-zero. Assume e.g. A₁ ≠ 0 and set ξ_j = 0 for j ≠ 1, ξ₁h = π/2. With this choice,

    Ê_k(ξ) = I + iλA₁,

which has the eigenvalues 1 + iλμ_ν with

    |1 + iλμ_ν| = (1 + λ²μ_ν²)^{1/2},

where the real numbers μ_ν are the eigenvalues of A₁. Since some μ_ν ≠ 0, the spectral radius of Ê_k(ξ) exceeds 1 by a fixed amount independent of k. Thus the von Neumann condition is not satisfied and the operator is unstable for any λ.
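Numerically this is visible already in the scalar case; the lines below (an editorial addition in Python) evaluate the modulus of the eigenvalue 1 + iλμ of the symbol of (8) at hξ₁ = π/2.

    # For (8), at h xi_1 = pi/2 the symbol has eigenvalues 1 + i lam mu with
    # |1 + i lam mu| = (1 + lam^2 mu^2)^(1/2) > 1 whenever mu != 0.
    mu = 1.0
    for lam in (0.1, 0.5, 1.0):
        print(lam, abs(1 + 1j * lam * mu))   # > 1 for every lam: (4) fails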

It can be shown that in general the operator E_k defined in (8) is accurate of order exactly 1. We shall now look at an operator which is accurate of order 2 in the case of one space dimension (d = 1). We write A = A₁ and thus have the system

    ∂u/∂t = A ∂u/∂x.  (9)

Consider the difference operator

    E_k v(x) = v(x) + (λ/2) A [v(x+h) − v(x−h)] + (λ²/2) A² [v(x+h) − 2v(x) + v(x−h)],  (10)

with symbol

    Ê_k(h^{-1}ξ) = I + iλA sin ξ + λ²A² (cos ξ − 1).

This operator is often referred to as the Lax–Wendroff operator. We have

    Ê_k(ξ) − e^{kP(iξ)} = O(kh²), k → 0,

and so E_k is consistent with (9), and in general accurate of order 2. We shall prove:

Theorem 7 Let μ_j, j = 1,…,N, be the eigenvalues of A. Then the operator E_k in (10) is stable in L² if and only if

    λ max_j |μ_j| ≤ 1.  (11)

Proof It is easy to see that the eigenvalues of Ê_k(h^{-1}ξ) are

    a_j(ξ) = 1 + iλμ_j sin ξ + λ²μ_j² (cos ξ − 1),

and we obtain after a simple calculation

    |a_j(ξ)|² = 1 − 4λ²μ_j² (1 − λ²μ_j²) sin⁴(ξ/2),

which is ≤ 1 for all ξ and j if and only if (11) holds. Since Ê_k is clearly normal, this proves the theorem.
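The "simple calculation" in the proof can be spot-checked as follows (an editorial addition in Python with NumPy; the random sampling is merely a convenience).

    import numpy as np

    # Check |a_j|^2 = 1 - 4 (lam mu)^2 (1 - (lam mu)^2) sin^4(xi/2) for the
    # Lax-Wendroff eigenvalues a_j = 1 + i lam mu sin(xi) + (lam mu)^2 (cos(xi) - 1).
    rng = np.random.default_rng(0)
    for _ in range(1000):
        lm, xi = rng.uniform(0.0, 1.5), rng.uniform(-np.pi, np.pi)
        a = 1 + 1j * lm * np.sin(xi) + lm**2 * (np.cos(xi) - 1.0)
        rhs = 1 - 4 * lm**2 * (1 - lm**2) * np.sin(xi / 2) ** 4
        assert abs(abs(a) ** 2 - rhs) < 1e-12
    print("identity verified; |a_j| <= 1 for all xi exactly when lam |mu_j| <= 1")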

For an N×N matrix A consider the numerical range

    W(A) = { (Au,u) : |u| = 1 }.

We have:

Theorem 8 If 𝓕 is a family of N×N matrices such that

    W(A) ⊆ { z : |z| ≤ 1 }, A ∈ 𝓕,

then 𝓕 is a stable family; that is, there is a constant C such that

    |A^n| ≤ C, A ∈ 𝓕, n = 1, 2, ….

Proof We shall prove that condition (ii) in Kreiss' theorem is satisfied. Clearly we have ρ(A) ≤ 1, so that R(A;z) exists for |z| > 1. Since |(Av,v)| ≤ |v|², we have with v = R(A;z)w

    (|z| − 1)|v|² ≤ |z||v|² − |(Av,v)| ≤ |((zI − A)v, v)| = |(w,v)| ≤ |w||v|,

or

    |v| ≤ (|z| − 1)^{-1} |w|.

Therefore, if w is arbitrary,

    |R(A;z)w| ≤ (|z| − 1)^{-1} |w|,

which proves the result.

Remark One can actually prove that |A^n| ≤ 2 for A ∈ 𝓕, n = 1, 2, ….

This result can be used to prove the stability of certain generalizations of the Lax–Wendroff operator to two dimensions (see [24]).
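Theorem 8 and the Remark are also easy to observe experimentally. The sketch below (an editorial addition in Python with NumPy; the scaling trick and the random test matrix are illustrative) computes the numerical radius as the support function of W(A) and then the norms of the powers.

    import numpy as np

    def numerical_radius(A, m=720):
        # r(A) = max over theta of the largest eigenvalue of Re(e^{-i theta} A),
        # i.e. the support function of the numerical range W(A).
        r = 0.0
        for th in np.linspace(0.0, 2 * np.pi, m, endpoint=False):
            B = np.exp(-1j * th) * A
            r = max(r, np.linalg.eigvalsh(0.5 * (B + B.conj().T))[-1])
        return r

    rng = np.random.default_rng(1)
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    A = B / (1.001 * numerical_radius(B))        # now W(A) lies in |z| <= 1
    pows = [np.linalg.norm(np.linalg.matrix_power(A, n), 2) for n in range(1, 100)]
    print(f"max_n |A^n| = {max(pows):.3f}  (bounded, and indeed <= 2)")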

Consider again the symmetric hyperbolic system (6) and a difference operator of the form (7), consistent with (6). Then A(ξ) = Ê_k(h^{-1}ξ) is independent of h. We say with Kreiss that E_k is dissipative of order δ (δ even) if there is a c > 0 such that the eigenvalues a_j(ξ) of A(ξ) satisfy

    |a_j(ξ)| ≤ 1 − c|ξ|^δ, |ξ_j| ≤ π.

We shall prove

Theorem 9 Under the above assumptions, if E_k is accurate of order δ − 1 and dissipative of order δ, it is stable in L².

Proof By the definition of accuracy we have

    A(ξ) = e^{λP(iξ)} + O(|ξ|^δ), ξ → 0,

where e^{λP(iξ)} = exp(iλ Σ_j ξ_j A_j) is unitary. Let U = U(ξ) be a unitary matrix which triangulates A(ξ), so that

    U A(ξ) U* = B(ξ)

is upper triangular with the eigenvalues a_j(ξ) on the diagonal. Since B(ξ) is upper triangular, it follows that the below-diagonal elements of U e^{λP(iξ)} U* are O(|ξ|^δ). Since this matrix is unitary, the same can easily be proved to hold for its above-diagonal elements, and thus the same holds for the above-diagonal elements of B(ξ), so that

    B(ξ) = diag(a₁(ξ),…,a_N(ξ)) + O(|ξ|^δ).

Since by dissipativity 1 − |a_j(ξ)| ≥ c|ξ|^δ, the off-diagonal elements of B(ξ) are bounded by a constant times min_j (1 − |a_j(ξ)|), and the stability follows by condition (iii) in Kreiss' theorem (Lemma 2).

Consider now the initial-value problem for a Petrovskii parabolic system,

    ∂u/∂t = P(D)u, t > 0, u(x,0) = v(x),  (12)

so that for some δ₀ > 0 and C,

    Λ(P(iξ)) ≤ −δ₀|ξ|^M + C.

We know from Lecture 2 that this problem is correctly posed in L². Consider a difference operator E_k of the form considered above, with k = λh^M. We say, following John [15] and Widlund [38], that E_k is a parabolic difference operator if there are constants δ > 0 and C such that the eigenvalues a_j(ξ) of Ê_k(h^{-1}ξ) satisfy

    |a_j(ξ)| ≤ 1 − δ|ξ|^M + Ck, |ξ_j| ≤ π.  (13)

Notice the close analogy with the concept of a dissipative operator.

Theorem 10 Let E_k be consistent with (12) and parabolic. Then it is stable in L².

We shall base a proof on the following lemma, which we shall also need later for other purposes.

Lemma 4 There exists a constant C_N depending only on N such that for any N×N matrix A with spectral radius ρ(A) we have for n ≥ N

    |A^n| ≤ C_N ρ(A)^{n-N+1} |A|^{N-1}.

Proof of Theorem 10 The coefficients of E_k are bounded for k ≤ k₀, so that |Ê_k(h^{-1}ξ)| ≤ C₀, and by (13) and Lemma 4, for nk ≤ T and n ≥ N,

    |Ê_k(h^{-1}ξ)^n| ≤ C_N (1 − δ|ξ|^M + Ck)^{n-N+1} C₀^{N-1} ≤ C_N C₀^{N-1} e^{CT},

so that E_k is stable by Theorem 1.

For parabolic difference operators we also have the following analogue of Theorem 2.2, in which ∂_h^α denotes the forward difference quotient corresponding to D^α:

Theorem 11 Assume that E_k is parabolic. Then for any α and any T > 0 there is a C such that

    ‖∂_h^α E_k^n v‖ ≤ C (nk)^{-|α|/M} ‖v‖, 0 < nk ≤ T.

Proof By Fourier transformation this reduces to proving

    | Π_{j=1}^d ((e^{ihξ_j} − 1)/h)^{α_j} Ê_k(ξ)^n | ≤ C (nk)^{-|α|/M},

and the result therefore easily follows by (13).
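For the simplest example, the operator (13) of Lecture 1 (M = 2, N = 1), the parabolicity condition can be checked directly; the sketch below (an editorial addition in Python with NumPy) computes the best constant δ in |a(ξ)| ≤ 1 − δξ² on |ξ| ≤ π.

    import numpy as np

    # a(xi) = 1 - 4 lam sin^2(xi/2); parabolicity asks |a(xi)| <= 1 - delta xi^2.
    xi = np.linspace(1e-6, np.pi, 100001)
    for lam in (0.25, 0.4, 0.5):
        a = 1 - 4 * lam * np.sin(xi / 2) ** 2
        delta = ((1 - np.abs(a)) / xi**2).min()
        print(f"lambda = {lam}: delta = {delta:.4f}")
    # delta > 0 for lam < 1/2; for lam = 1/2, a(pi) = -1 and delta = 0, so the
    # scheme is then stable but no longer parabolic (no smoothing).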

    We know by Lax's equivalence theorem that the stability of the parabolic

    difference operators considered above implies convergence. We shall now see

    that the difference quotients also converge to the corresponding derivatives,

    which we know to exist for t > 0 since the systems are parabolic.

Theorem 12 Assume that (12) is parabolic and that E_k is consistent with (12) and parabolic. Then for any t > 0, any α, and any v ∈ L² we have for nk = t

    ‖∂_h^α E_k^n v − D^α E(t)v‖ → 0 as k → 0.  (14)

Proof By Theorems 2.2 and 11 one finds that it is sufficient to prove (14) for v in the dense subset C₀^∞. But then, by Parseval's relation,

    ‖∂_h^α E_k^n v − D^α E(t)v‖ = ‖ Π_j ((e^{ihξ_j} − 1)/h)^{α_j} Ê_k(ξ)^n v̂ − (iξ)^α e^{tP(iξ)} v̂ ‖.

The result therefore follows by the following lemma, which is a simple consequence