
  • Lecture Notes in Mathematics A collection of informal reports and seminars Edited by A. Dold, Heidelberg and B. Eckmann, Zürich

    193

    Symposium on the Theory of Numerical Analysis Held in Dundee/Scotland, September 15-23, 1970

    Edited by John Ll. Morris, University of Dundee, Dundee/Scotland

    Springer-Verlag Berlin · Heidelberg · New York 1971

  • AMS Subject Classifications (1970): 65M05, 65M10, 65M15, 65M30, 65N05, 65N10, 65N15, 65N20, 65N25

    ISBN 3-540-05422-7 Springer-Verlag Berlin Heidelberg New York ISBN 0-387-05422-7 Springer-Verlag New York Heidelberg Berlin

    This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks.

    Under § 54 of the German Copyright Law where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.

    © by Springer-Verlag Berlin Heidelberg 1971. Library of Congress Catalog Card Number 70-155916. Printed in Germany.

    Offsetdruck: Julius Beltz, Hemsbach

  • Foreword

    This publication by Springer Verlag represents the proceedings of a series

    of lectures given by four eminent Numerical Analysts, namely Professors Golub,

    Thomee, Wachspress and Widlund, at the University of Dundee between September

    15th and September 23rd, 1970.

    The lectures marked the beginning of the British Science Research Council's

    sponsored Numerical Analysis Year which is being held at the University of Dundee

    from September 1970 to August 1971. The aim of this year is to promote the theory

    of numerical methods and in particular to upgrade the study of Numerical Analysis

    in British universities and technical colleges. This is being effected by the

    arranging of lecture courses and seminars which are being held in Dundee through-

    out the Year. In addition to lecture courses research conferences are being

    held to allow workers in touch with modern developments in the field of Numerical

    Analysis to hear and discuss the most recent research work in their field. To

    achieve these aims, some thirty four Numerical Analysts of international repute

    are visiting the University of Dundee during the Numerical Analysis Year. The

    complete project is financed by the Science Research Council, and we acknowledge

    with gratitude their generous support. The present proceedings contain a great

    deal of theoretical work which has been developed over recent years. There are

    however new results contained within the notes. In particular the lectures pre-

    sented by Professor Golub represent results recently obtained by him and his co-

    workers. Consequently a detailed account of the methods outlined in Professor

    Golub's lectures will appear in a forthcoming issue of the Journal of the Society

    for Industrial and Applied Mathematics (SIAM) Numerical Analysis, written

    jointly by Golub, Buzbee and Nielson.

    In the main the lecture notes have been provided by the authors and the

    proceedings have been produced from these original manuscripts. The exception

    is the course of lectures given by Professor Golub. These notes were taken at

    the lectures by members of the staff and research students of the Department of

    Mathematics, the University of Dundee. In this context it is a pleasure to ack-

    nowledge the invaluable assistance provided to the editor by Dr. A. Watson, Mr.

    R. Wait, Mr. K. Brodlie and Mr. G. McGuire.

    Finally we owe thanks to Misses Y. Nedelec and F. Duncan, Secretaries in

    the Mathematics Department for their patient typing and retyping of the manu-

    scripts and notes.

    J. Ll. Morris

    Dundee, January 1971

  • Contents

    G. Golub: Direct Methods for Solving Elliptic Difference Equations ... 1
      1. Introduction ... 2
      2. Matrix Decomposition ... 2
      3. Block Cyclic Reduction ... 6
      4. Applications ... 10
      5. The Buneman Algorithm and Variants ... 12
      6. Accuracy of the Buneman Algorithms ... 14
      7. Non-Rectangular Regions ... 15
      8. Conclusion ... 18
      9. References ... 18

    G. Golub: Matrix Methods in Mathematical Programming ... 21
      1. Introduction ... 22
      2. Linear Programming ... 22
      3. A Stable Implementation of the Simplex Algorithm ... 24
      4. Iterative Refinement of the Solution ... 28
      5. Householder Triangularization ... 28
      6. Projections ... 31
      7. Linear Least-Squares Problem ... 33
      8. Least-Squares Problem with Linear Constraints ... 35
      Bibliography ... 37

    V. Thomée: Topics in Stability Theory for Partial Difference Operators ... 41
      Preface ... 42
      1. Introduction ... 43
      2. Initial-Value Problems in L² with Constant Coefficients ... 51
      3. Difference Approximations in L² to Initial-Value Problems with Constant Coefficients ... 59
      4. Estimates in the Maximum-Norm ... 70
      5. On the Rate of Convergence of Difference Schemes ... 79
      References ... 89

    E. L. Wachspress: Iteration Parameters in the Numerical Solution of Elliptic Problems ... 93
      1. A Concise Review of the General Topic and Background Theory ... 95
      2. Successive Overrelaxation: Theory ... 98
      3. Successive Overrelaxation: Practice ... 100
      4. Residual Polynomials: Chebyshev Extrapolation: Theory ... 102
      5. Residual Polynomials: Practice ... 103
      6. Alternating-Direction-Implicit Iteration ... 106
      7. Parameters for the Peaceman-Rachford Variant of ADI ... 107

    O. Widlund: Introduction to Finite Difference Approximations to Initial Value Problems for Partial Differential Equations ... 111
      1. Introduction ... 112
      2. The Form of the Partial Differential Equations ... 114
      3. The Form of the Finite Difference Schemes ... 117
      4. An Example of Divergence. The Maximum Principle ... 121
      5. The Choice of Norms and Stability Definitions ... 124
      6. Stability, Error Bounds and a Perturbation Theorem ... 133
      7. The von Neumann Condition, Dissipative and Multistep Schemes ... 138
      8. Semibounded Operators ... 142
      9. Some Applications of the Energy Method ... 145
      10. Maximum Norm Convergence for L² Stable Schemes ... 149
      References ... 151

  • Direct Methods for Solving Elliptic Difference Equations

    GENE GOLUB

    Stanford University

  • 1. Introduction

    General methods exist for solving elliptic partial differential equations of
    general type in general regions. However, it is often the case that physical
    problems such as those of plasma physics give rise to several elliptic
    equations which require to be solved many times. It is not uncommon that the
    elliptic equations which arise reduce to Poisson's equation with differing
    right hand sides. For this reason it is judicious to use direct methods which
    take advantage of this structure and which thereby yield fast and accurate
    techniques for solving the associated linear equations.

    Direct methods for solving such equations are attractive since in theory they
    yield the exact solution to the difference equation, whereas commonly used
    methods seek to approximate the solution by iterative procedures [12].
    Hockney [8] has devised an efficient direct method which uses the cyclic
    reduction process. Also, Buneman [2] recently developed an efficient direct
    method for solving the reduced system of equations. Since these methods offer
    considerable economy over older techniques [5], the purpose of this paper is
    to present a unified mathematical development and generalization of them.
    Additional generalizations are given by George [6].

    2. Matrix Decomposition

    Consider the system of equations

        M x = y ,                                                        (2.1)

    where M is an N×N real symmetric matrix of block tridiagonal form,

        M = | A  T              |
            | T  A  T           |
            |    .  .  .        |                                        (2.2)
            |       T  A  T     |
            |          T  A     | .

    The matrices A and T are p×p symmetric matrices and we assume that AT = TA.
    This situation arises in many systems. However, other direct methods which
    are applicable for more general systems are less efficient to implement in
    this case. Moreover the classical methods require more computer storage than
    the methods to be discussed here, which will require only the storage of the
    vector y. Since A and T commute and are symmetric, it is well known [1] that
    there exists an orthogonal matrix Q such that

        Q^T A Q = Λ ,   Q^T T Q = Ω ,                                    (2.3)

    where Λ and Ω are real diagonal matrices. The columns of Q are the common
    eigenvectors of A and T, and Λ and Ω are the diagonal matrices of the p
    eigenvalues of A and T, respectively.

    To conform with the matrix M, we write the vectors x and y in partitioned
    form,

        x = (x_1, x_2, ..., x_q)^T ,   y = (y_1, y_2, ..., y_q)^T ,      (2.4)

    where

        x_j = (x_{1j}, x_{2j}, ..., x_{pj})^T ,   y_j = (y_{1j}, ..., y_{pj})^T .

    System (2.1) may be written

        A x_1 + T x_2 = y_1 ,                                            (2.5a)
        T x_{j-1} + A x_j + T x_{j+1} = y_j ,   j = 2,3,...,q-1 ,        (2.5b)
        T x_{q-1} + A x_q = y_q .                                        (2.5c)

    From Eq. (2.3) we have

        A = Q Λ Q^T   and   T = Q Ω Q^T .

    Substituting A and T into Eq. (2.5) and pre-multiplying by Q^T we obtain

        Λ x̂_1 + Ω x̂_2 = ŷ_1 ,
        Ω x̂_{j-1} + Λ x̂_j + Ω x̂_{j+1} = ŷ_j ,   j = 2,3,...,q-1 ,       (2.6)
        Ω x̂_{q-1} + Λ x̂_q = ŷ_q ,

    where

        x̂_j = Q^T x_j ,   ŷ_j = Q^T y_j ,   j = 1,2,...,q .

    If x̂_j and ŷ_j are partitioned as before, then the i-th components of
    Eq. (2.6) may be rewritten as

        λ_i x̂_{i1} + ω_i x̂_{i2} = ŷ_{i1} ,
        ω_i x̂_{i,j-1} + λ_i x̂_{ij} + ω_i x̂_{i,j+1} = ŷ_{ij} ,  j = 2,...,q-1 ,  (2.7)
        ω_i x̂_{i,q-1} + λ_i x̂_{iq} = ŷ_{iq} ,

    for i = 1,2,...,p. If we rewrite the equations by reversing the roles of i
    and j, and define

        x̂_i = (x̂_{i1}, ..., x̂_{iq})^T ,   ŷ_i = (ŷ_{i1}, ..., ŷ_{iq})^T ,

    together with the q×q tridiagonal matrices

        Γ_i = | λ_i  ω_i           |
              | ω_i  λ_i  ω_i      |
              |      .    .    .   |
              |           ω_i  λ_i | ,

    then Eq. (2.7) is equivalent to the block diagonal system of equations

        Γ_i x̂_i = ŷ_i ,   i = 1,2,...,p .                                (2.8)

    Thus, the vector x̂_i satisfies a symmetric tridiagonal system of equations
    that has a constant diagonal element and a constant super- and sub-diagonal
    element. After Eq. (2.8) has been solved block by block it is possible to
    solve for x_j = Q x̂_j. Thus we have:

    Algorithm 1

    1. Compute or determine the eigensystem of A and T.
    2. Compute ŷ_j = Q^T y_j   (j = 1,2,...,q).
    3. Solve Γ_i x̂_i = ŷ_i   (i = 1,2,...,p).
    4. Compute x_j = Q x̂_j   (j = 1,2,...,q).

    For our system the eigenvalues of Γ_i may be written down explicitly as

        ν_r^(i) = λ_i + 2 ω_i cos( rπ/(q+1) ) ,   r = 1,2,...,q .

    It should be noted that only Q and the y_j, j = 1,2,...,q, have to be
    stored, since the ŷ_j can overwrite the y_j, the x̂_j can overwrite the ŷ_j,
    and the x_j can overwrite the x̂_j. A simple calculation will show that
    approximately 2p²q + 5pq arithmetic operations are required for the
    algorithm when step 3 is solved using Gaussian elimination for a tridiagonal
    matrix, the Γ_i being positive definite. The arithmetic operations are
    dominated by the 2p²q multiplications arising from the matrix
    multiplications of steps 2 and 4. It is not easy to reduce this number
    unless the matrix Q has special properties (as in Poisson's equation) when
    the fast Fourier transform can be used (see Hockney [8]).

    One can also note that

        Γ_i = Z V_i Z^T ,

    with V_i the diagonal matrix of eigenvalues of Γ_i and

        Z_{rs} = σ sin( rsπ/(q+1) ) ,   σ = (2/(q+1))^{1/2} .

    Since the Γ_i have the same set of eigenvectors,

        Γ_i Γ_j = Γ_j Γ_i .

    Because of this decomposition, step 3 can be solved by computing

        x̂_i = Z V_i^{-1} Z^T ŷ_i ,

    where the same Z serves for every Γ_i. This, however, requires of the order
    of 2pq² multiplications, which approximately doubles the computing time for
    the algorithm. Thus performing the fast Fourier transform method in step 3
    as well as in steps 2 and 4 is not advisable.
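
    To make the flow of Algorithm 1 concrete, the following sketch (our own
    illustration; the function name and the use of numpy's dense eigensolver
    are our choices) implements the four steps. It assumes the eigenvalues of
    A are distinct, so that the eigenvectors of A also diagonalize T; a
    production code would use tridiagonal solvers in step 3 and, where
    possible, the fast Fourier transform in steps 2 and 4. The result can be
    checked against a dense solve of the assembled matrix M.

        import numpy as np

        def matrix_decomposition_solve(A, T, Y):
            # Solve M x = y for M as in (2.2), A and T symmetric with AT = TA.
            # Y is p x q: column j holds the block y_j.
            p, q = Y.shape
            lam, Q = np.linalg.eigh(A)        # step 1: eigensystem of A ...
            omega = np.diag(Q.T @ T @ Q)      # ... and of T (Q diagonalizes both)
            Yhat = Q.T @ Y                    # step 2: yhat_j = Q^T y_j
            Xhat = np.empty_like(Yhat)
            for i in range(p):                # step 3: Gamma_i xhat_i = yhat_i
                Gamma = (np.diag(np.full(q, lam[i]))
                         + np.diag(np.full(q - 1, omega[i]), 1)
                         + np.diag(np.full(q - 1, omega[i]), -1))
                Xhat[i, :] = np.linalg.solve(Gamma, Yhat[i, :])
            return Q @ Xhat                   # step 4: x_j = Q xhat_j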

    3. Block Cyclic Reduction

    In Section 2, we gave a method for which one had to know the eigenvalues
    and eigenvectors of some matrix. We now give a more direct method for
    solving the system of Eq. (2.1).

    We assume again that A and T are symmetric and that A and T commute.
    Furthermore, we assume that q = m-1 and

        m = 2^{k+1} ,

    where k is some positive integer. Let us rewrite Eq. (2.5b) as follows:

        T x_{j-2} + A x_{j-1} + T x_j     = y_{j-1} ,
        T x_{j-1} + A x_j     + T x_{j+1} = y_j ,
        T x_j     + A x_{j+1} + T x_{j+2} = y_{j+1} .

    Multiplying the first and third equations by T, the second equation by -A,
    and adding, we have

        T² x_{j-2} + (2T² - A²) x_j + T² x_{j+2} = T y_{j-1} - A y_j + T y_{j+1} .

    Thus if j is even, the new system of equations involves x_j's with even
    indices only. Similar equations hold for x_2 and x_{m-2}.

    The process of reducing the equations in this fashion is known as cyclic
    reduction. Eq. (2.1) may then be written as the following equivalent
    system:

        | 2T²-A²    T²                  | | x_2     |   | y_2^(1)     |
        |   T²    2T²-A²    T²          | | x_4     |   | y_4^(1)     |
        |           .        .      .   | |   .     | = |    .        |   (3.1)
        |                  T²    2T²-A² | | x_{m-2} |   | y_{m-2}^(1) |

    with y_j^(1) = T( y_{j-1} + y_{j+1} ) - A y_j , and

        A x_j = y_j - T( x_{j-1} + x_{j+1} ) ,   j = 1,3,5,...,m-1 ,      (3.2)

    where x_0 = x_m = 0. Since m = 2^{k+1} and the new system of Eq. (3.1)
    involves x_j's with even indices, the block dimension of the new system of
    equations is 2^k - 1. Note that once Eq. (3.1) is solved, it is easy to
    solve for the x_j's with odd indices as evidenced by Eq. (3.2). We shall
    refer to the system of Eq. (3.2) as the eliminated equations.

    Also, note that Algorithm 1 may be applied to System (3.1). Since A and T
    commute, the matrix (2T² - A²) has the same set of eigenvectors as A and T.
    Also, if λ_i(A) = λ_i and λ_i(T) = ω_i for i = 1,2,...,p, then

        λ_i( 2T² - A² ) = 2ω_i² - λ_i² .

    Hockney [8] has advocated this procedure.

    Since System (3.1) is block tridiagonal and of the form of Eq. (2.2), we
    can apply the reduction repeatedly until we have one block. However, as
    noted above, we can stop the process after any step and use the method of
    Section 2 to solve the resulting equations.

    To define the procedure recursively, let

        A^(0) = A ,   T^(0) = T ;   y_j^(0) = y_j   (j = 1,2,...,m-1) .   (3.3)

    Then for r = 0,1,...,k-1,

        A^(r+1) = 2( T^(r) )² - ( A^(r) )² ,
        T^(r+1) = ( T^(r) )² ,                                            (3.4)
        y_j^(r+1) = T^(r) ( y_{j-2^r}^(r) + y_{j+2^r}^(r) ) - A^(r) y_j^(r) .

    The eliminated equations at the r-th stage form the block diagonal system

        A^(r-1) x_j = y_j^(r-1) - T^(r-1) ( x_{j-2^{r-1}} + x_{j+2^{r-1}} ) ,  (3.5)

    for j an odd multiple of 2^{r-1}, that is j = 2^{r-1}, 3·2^{r-1}, ...,
    2^{k+1} - 2^{r-1}, with x_0 = x_{2^{k+1}} = 0. After all of the k steps,
    we must solve the system of equations

        A^(k) x_{2^k} = y_{2^k}^(k) .                                     (3.6)

    In either case, we must solve Eq. (3.5) to find the eliminated unknowns,
    just as in Eq. (3.2). If it is done by direct solution, an ill-conditioned
    system may arise. Furthermore, A = A^(0) is tridiagonal, A^(1) is
    quindiagonal, and so on, so that the reduction destroys the simple
    structure of the original system. Alternatively, polynomial factorization
    retains the simple structure of A.

    From Eq. (3.4), we note that A^(1) is a polynomial of degree 2 in A and T.
    By induction, it is easy to show that A^(r) is a polynomial of degree 2^r
    in the matrices A and T, so that

        A^(r) = Σ_{j=0}^{2^{r-1}} c_{2j}^(r) A^{2j} T^{2^r - 2j} ≡ p_{2^r}(A,T) .

    We shall proceed to determine the linear factors of p_{2^r}(A,T).

    Let

        p_{2^r}(a,t) = Σ_{j=0}^{2^{r-1}} c_{2j}^(r) a^{2j} t^{2^r - 2j} .   (3.7)

    For t ≠ 0, we make the substitution

        a/t = -2 cos θ .                                                    (3.8)

    From Eq. (3.4), we note that

        p_{2^{r+1}}(a,t) = 2 t^{2^{r+1}} - ( p_{2^r}(a,t) )² .

    It is then easy to verify, using Eqs. (3.7) and (3.8), that

        p_{2^r}(a,t) = -2 t^{2^r} cos 2^r θ ,

    and, consequently,

        p_{2^r}(a,t) = - Π_{j=1}^{2^r} ( a + 2t cos θ_j^(r) ) ,

    and, hence,

        A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) T ) ,                   (3.9)

    where θ_j^(r) = (2j-1)π / 2^{r+1}.

    Thus to solve the original system it is only necessary to solve the
    factored system recursively. For example, when r = 1, we obtain

        A^(1) = 2T² - A² = ( √2 T - A )( √2 T + A ) ,

    whence the simple tridiagonal systems

        ( √2 T - A ) w = y ,
        ( √2 T + A ) x = w

    are used to solve the system

        A^(1) x = y .

    We call this method the cyclic odd-even reduction and factorization (CORF)
    algorithm.
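
    As a concrete illustration of Eq. (3.9), the following sketch (our own;
    the function name solve_Ar is our choice) solves A^(r) z = b for the case
    T = I by sweeping through the 2^r linear factors. Each factor
    A + 2 cos θ_j^(r) I is tridiagonal whenever A is, so in practice each
    solve would use a tridiagonal elimination rather than the dense solver
    used here. For r = 1 one can check that the result agrees with a direct
    solve of (2I - A²) z = b.

        import numpy as np

        def solve_Ar(A, r, b):
            # Solve A^(r) z = b using the factorization (3.9) with T = I.
            # A^(0) = A is handled directly, since (3.9) applies for r >= 1.
            if r == 0:
                return np.linalg.solve(A, b)
            p = A.shape[0]
            z = b.copy()
            for j in range(1, 2**r + 1):
                theta = (2*j - 1) * np.pi / 2**(r + 1)
                z = np.linalg.solve(A + 2.0*np.cos(theta)*np.eye(p), z)
            return -z   # the leading minus sign in (3.9)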

  • 10

    4. Applications

    Exampie I Poissen's Equation wit h Dirichlet Boundar~ Conditions,

    It is instructive to apply the results of Section 3 to the solution of the

    finite-difference approximation to Poisson's equation on a rectangle, R, with speci-

    fied boundary values Consider the equation

    u + u : f(x,y) for (x,y)ER, ~x yy (~.l)

    u(x,y) : g(x,y) for (x,y)aR .

    (Here aR indicates the boundary of R.) We assume that the reader is familiar with

    the general technique of imposing a mesh of discrete points onto R and approximating

    ~q. (4.Z). The eq~tion u + Uyy : f(x,y) is approximated at (xl,Yj) by

    Vi-l.j - 2vi,j + Vi+l.j vi,j-1 - 2vi. j + vi.j+l C~)" + (Ay)"

    = fi,J (i < i < n-l, I < J < m-i) ,

    with appropriate values taken on the boundary

    VO,J = gC,~' Vm, j = gm,J ( 1 g J g m-l ) ,

    and

    Vi,@ = gi,o' vi,m : gi,J (i < i ~ n-l).

    Then vii is an approximation to u(xi,Yj) , and fi,j = f(xi'Yj)' gi,j : g(xl,Yj)-

    Hereafter, we assume that

    2k+l m -~

    When u(x,y) is specified on the boundary, we have the Dirichlet boundary con-

    dition. For simplicity, we shall assume hereafter that Ax = Ay. Then

    1

    l -4 I

    (~ . 1

    and T = I . . l

    1

    -4 (n - l ) x (n - l )

  • 11

    The matrix In_ I indicates the identity matrix of order (n-l). A and T are symmetric

    and co~ute, and, thus the results of Sections 2 and 3 are applicable In addition,

    since A is tridlagcnal, the use of the facterization (3.10) is greatly simplified.

    The nine-polnt difference formula for the same Poisson's equation can be treated

    similarly when m

    -20 4

    4 -20

    A =

    0

    O

    & -20

    , T=

    (n-l)~n-ll

    "~ z 0 1 4 1

    (~ . . I

    1 &
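
    Since the eigensystem of A is known explicitly for Example I, step 3 of
    Algorithm 1 can be combined with fast sine transforms in the manner of
    Hockney [8]. The sketch below is our own illustration using scipy's
    discrete sine transform; it assumes zero boundary values (g = 0) and
    Δx = Δy = h, and solves the five-point system in O(nm log nm) operations:

        import numpy as np
        from scipy.fft import dstn, idstn

        def poisson_dirichlet(f, h):
            # f: (n-1) x (m-1) values f(x_i, y_j) at the interior mesh points.
            # Returns v with the five-point operator applied to v equal to f
            # and v = 0 on the boundary.
            n1, m1 = f.shape
            fhat = dstn(f, type=1)          # diagonalize both difference operators
            i = np.arange(1, n1 + 1)
            j = np.arange(1, m1 + 1)
            lam = -4.0 * np.sin(i * np.pi / (2 * (n1 + 1)))**2   # eigenvalues in x
            mu  = -4.0 * np.sin(j * np.pi / (2 * (m1 + 1)))**2   # eigenvalues in y
            vhat = h * h * fhat / (lam[:, None] + mu[None, :])
            return idstn(vhat, type=1)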

    Example II

    The method can also be used for Poisson's equation in rectangular regions
    under natural boundary conditions, provided one uses

        ∂u/∂x ≈ ( u(x+h, y) - u(x-h, y) ) / 2h

    and similarly ∂u/∂y at the boundaries.

    Example III

    Poisson's equation in a rectangle with doubly periodic boundary conditions
    is an additional example where the algorithm can be applied.

    Example IV

    The method can be extended successfully to three dimensions for Poisson's
    equation.

    For all the above examples the eigensystems are known and the fast Fourier
    transform can be applied.

    Example V

    An equation of the form

        ( K(x) u_x )_x + ( K(y) u_y )_y + u(x,y) = q(x,y)

    on a rectangular region can be solved by the CORF algorithm provided the
    eigensystem is calculated, since it is not generally known.

    The counterparts in cylindrical polar co-ordinates can also be solved
    using CORF on the rectangle in the appropriate co-ordinates.

    5. The Buneman Algorithm and Variants

    In this section, we shall describe in detail the Buneman algorithm [2] and
    a variation of it. The difference between the Buneman algorithm and the
    CORF algorithm lies in the way the right hand side is calculated at each
    stage of the reduction. Henceforth, we shall assume that in the system of
    Eqs. (2.5), T = I_p, the identity matrix of order p.

    Again consider the system of equations as given by Eqs. (2.5) with
    q = 2^{k+1} - 1. After one stage of cyclic reduction, we have

        x_{j-2} + ( 2I_p - A² ) x_j + x_{j+2} = y_{j-1} + y_{j+1} - A y_j   (5.1)

    for j = 2,4,...,q-1, with x_0 = x_{q+1} = 0, the null vector. Note that
    the right hand side of Eq. (5.1) may be written as

        y_j^(1) = y_{j-1} + y_{j+1} - A y_j
                = A^(1) A^{-1} y_j + y_{j-1} + y_{j+1} - 2 A^{-1} y_j ,     (5.2)

    where A^(1) = 2I_p - A². Let us define

        p_j^(1) = A^{-1} y_j ;   q_j^(1) = y_{j-1} + y_{j+1} - 2 p_j^(1) .

    (These are easily calculated since A is a tridiagonal matrix.) Then

        y_j^(1) = A^(1) p_j^(1) + q_j^(1) .                                 (5.3)

    After r reductions, we have by Eq. (3.4)

        y_j^(r+1) = y_{j-2^r}^(r) + y_{j+2^r}^(r) - A^(r) y_j^(r) .         (5.4)

    Let us write

        y_j^(r) = A^(r) p_j^(r) + q_j^(r)                                   (5.5)

    in a fashion similar to Eq. (5.3). Substituting Eq. (5.5) into Eq. (5.4)
    and making use of the identity (A^(r))² = 2I_p - A^(r+1) from Eq. (3.4),
    we have the following relationships:

        p_j^(r+1) = p_j^(r) - ( A^(r) )^{-1} ( p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ) ,  (5.6a)
        q_j^(r+1) = q_{j-2^r}^(r) + q_{j+2^r}^(r) - 2 p_j^(r+1) ,                           (5.6b)

    for j = i·2^{r+1} (i = 1,2,...,2^{k-r} - 1), with

        p_0^(r) = q_0^(r) = p_{2^{k+1}}^(r) = q_{2^{k+1}}^(r) = 0 .

    Because the number of vectors p_j^(r), q_j^(r) that must be carried is
    reduced by a factor of two for each successive r, the computer storage
    requirement becomes equal to almost twice the number of data points.

    To compute p_j^(r+1) by Eq. (5.6a), we solve the system of equations

        A^(r) ( p_j^(r) - p_j^(r+1) ) = p_{j-2^r}^(r) + p_{j+2^r}^(r) - q_j^(r) ,

    where A^(r) is given by the factorization Eq. (3.9); namely,

        A^(r) = - Π_{j=1}^{2^r} ( A + 2 cos θ_j^(r) I_p ) ,
        θ_j^(r) = (2j-1)π / 2^{r+1} .

    After k reductions, one has the equation

        A^(k) x_{2^k} = y_{2^k}^(k) = A^(k) p_{2^k}^(k) + q_{2^k}^(k) ,

    and hence

        x_{2^k} = p_{2^k}^(k) + ( A^(k) )^{-1} q_{2^k}^(k) .

    Again one uses the factorization of A^(k) for computing
    (A^(k))^{-1} q_{2^k}^(k). To back solve, we use the relationship

        x_{j-2^r} + A^(r) x_j + x_{j+2^r} = A^(r) p_j^(r) + q_j^(r)

    for j = i·2^r (i = 1,2,...,2^{k+1-r} - 1), with x_0 = x_{2^{k+1}} = 0.
    For j = 2^r, 3·2^r, ..., 2^{k+1} - 2^r, we solve the system of equations

        A^(r) ( x_j - p_j^(r) ) = q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ,     (5.7)

    using the factorization of A^(r); hence

        x_j = p_j^(r) + ( x_j - p_j^(r) ) .                                 (5.8)

    Thus, to summarise, the Buneman algorithm proceeds as follows:

    1. Compute the sequence { p_j^(r), q_j^(r) } by Eq. (5.6) for r = 1,...,k,
       with p_j^(0) = 0 for j = 0,...,2^{k+1}, and q_j^(0) = y_j for
       j = 1,2,...,2^{k+1} - 1.

    2. Back-solve for x_j using Eqs. (5.7) and (5.8); a sketch of the whole
       process is given below.

    The use of the p_j^(r) and q_j^(r) produces a stable algorithm. Numerical
    experiments by the author and his colleagues have shown that
    computationally the Buneman algorithm requires approximately 30% less time
    than the fast Fourier transform method of Hockney.
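
    The following sketch is our own illustration, reusing the solve_Ar
    function from Section 3 and dense numpy solves in place of tridiagonal
    eliminations; it carries out the reduction (5.6) and the
    back-substitution (5.7)-(5.8) for T = I_p. The result can be checked
    against a dense solve of the assembled block tridiagonal matrix M of
    Eq. (2.2).

        import numpy as np

        def buneman(A, y, k):
            # Solve (2.5) with T = I_p, q = 2^(k+1)-1; y[1], ..., y[q] are the
            # right-hand-side blocks (vectors of length p); y[0] is unused.
            p = A.shape[0]
            q = 2**(k + 1) - 1
            zero = np.zeros(p)
            P = {j: zero for j in range(q + 2)}            # p_j^(0) = 0
            Q = {j: np.array(y[j], float) for j in range(1, q + 1)}
            Q[0] = Q[q + 1] = zero
            levels = [(P, Q)]
            for r in range(k):                             # reduction, Eq. (5.6)
                s = 2**r
                Pn = {0: zero, q + 1: zero}
                Qn = {0: zero, q + 1: zero}
                for j in range(2 * s, q + 1, 2 * s):
                    Pn[j] = P[j] - solve_Ar(A, r, P[j - s] + P[j + s] - Q[j])
                    Qn[j] = Q[j - s] + Q[j + s] - 2.0 * Pn[j]
                P, Q = Pn, Qn
                levels.append((P, Q))
            x = {0: zero, q + 1: zero}
            for r in range(k, -1, -1):                     # back-solve, (5.7)-(5.8)
                Pr, Qr = levels[r]
                s = 2**r
                for j in range(s, q + 1, 2 * s):           # odd multiples of 2^r
                    x[j] = Pr[j] + solve_Ar(A, r, Qr[j] - (x[j - s] + x[j + s]))
            return [x[j] for j in range(1, q + 1)]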

    6. Accuracy of the Buneman Algorithms

    As was shown in Section 5, the Buneman algorithms consist of generating
    the sequence of vectors { p_j^(r), q_j^(r) }. Using Eqs. (5.6a) and
    (5.6b), let us write

        p_j^(r) = x_j + e_j^(r) ,                                           (6.1a)
        q_j^(r) = x_{j-2^r} + x_{j+2^r} - A^(r) e_j^(r) ,                   (6.1b)

    where

        e_j^(r) = p_j^(r) - x_j                                             (6.2)

    is the error in p_j^(r), and let

        S^(r) = ( A^(r-1) ··· A^(0) )^{-1} .                                (6.3)

    Then

        || p_j^(r) - x_j ||_2 ≤ || S^(r) ||_2 · |||y|||                     (6.4)

    and

        || q_j^(r) - ( x_{j-2^r} + x_{j+2^r} ) ||_2 ≤ || S^(r) A^(r) ||_2 · |||y||| ,  (6.5)

    where ||v||_2 indicates the Euclidean norm of a vector v, ||C||_2
    indicates the spectral norm of a matrix C, and

        |||y|||² = Σ_{j=1}^{q} || y_j ||_2² .

    Thus for A = A^T,

        || S^(r) ||_2 ≤ Π_{j=0}^{r-1} || ( A^(j) )^{-1} ||_2 ,

    and since the A^(j) are polynomials of degree 2^j in A we have

        || S^(r) ||_2 ≤ Π_{j=0}^{r-1} [ min_i | p_{2^j}(λ_i) | ]^{-1} ,

    where the p_{2^j}(λ_i) are polynomials in the λ_i, the eigenvalues of A.
    For Poisson's equation it may be shown that

        || S^(r) ||_2 < e^{-c σ_r} ,   where σ_r = 2^{r-1} and c > 0 .      (6.6)

    Thus ||S^(r)||_2 → 0 and hence

        || p_j^(r) - x_j ||_2 → 0 .

    That is, p_j^(r) tends to the exact solution with increasing r. Since it
    can be shown that ||q_j^(r)||_2 remains bounded throughout the
    calculation, the Buneman algorithm leads to numerically stable results.

    7. Non-Rectangular Regions

    In many situations, one wishes to solve an elliptic equation over a region
    R which is the union of two rectangles R_1 and R_2 (a figure illustrating
    such a region appears in the original), where there are n_1 data points in
    R_1, n_2 data points in R_2, and n_0 data points in R_1 ∩ R_2. We shall
    assume that Dirichlet boundary conditions are given. When Δx is

    the same throughout the region, one has a matrix equation of the form

        | G    P | | x^(1) |   | y^(1) |
        |        | |       | = |       | ,                                  (7.1)
        | P^T  H | | x^(2) |   | y^(2) |

    where

        G = | A  T        |                 H = | B  S        |
            | T  A  T     |                     | S  B  S     |
            |    .  .  .  |        and          |    .  .  .  |             (7.2)
            |       T  A  | (n_1×n_1)           |       S  B  | (n_2×n_2)

    and P, which couples the two rectangles, is non-zero only in a block of
    order n_0. Also, we write

        x^(1) = ( x_1^(1), ..., x_r^(1) )^T ,   x^(2) = ( x_1^(2), ..., x_s^(2) )^T ,  (7.3)

    partitioned as in Section 2. We assume again that AT = TA and BS = SB.

    From Eq. (7.1), we see that

        x^(1) = G^{-1} y^(1) - G^{-1} P x^(2)                               (7.4)

    and

        x^(2) = H^{-1} y^(2) - H^{-1} P^T x^(1) .                           (7.5)

    Now let us write

        G z^(1) = y^(1) ,   H z^(2) = y^(2) ,                               (7.6)

    and

        G W^(1) = P ,   H W^(2) = P^T .                                     (7.7)

    Then, if we partition the vectors z^(1), z^(2) and the matrices W^(1) and
    W^(2) as in Eq. (7.3), Eqs. (7.4) and (7.5) become

        x_j^(1) = z_j^(1) - W_j^(1) x^(2)   (j = 1,2,...,r) ,
        x_j^(2) = z_j^(2) - W_j^(2) x^(1)   (j = 1,2,...,s) .               (7.8)

    From Eq. (7.8), we have

        | I      W^(1) | | x^(1) |   | z^(1) |
        |              | |       | = |       | ,                            (7.9)
        | W^(2)  I     | | x^(2) |   | z^(2) |

    a system which need only be solved for the unknown components coupled
    through P. It can be noted that W^(1) and W^(2) depend only on the given
    region, and hence the algorithm becomes useful if many problems on the
    same region are to be considered.

    Thus, the algorithm proceeds as follows (a sketch follows the list):

    1. Solve for z^(1) and z^(2) using the methods of Section 2 or 3.

    2. Solve for W^(1) and W^(2) using the methods of Section 2 or 3.

    3. Solve Eq. (7.9) using Gaussian elimination. Save the LU decomposition
       of Eq. (7.9).

    4. Solve for the unknown components of x^(1) and x^(2).
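
    A minimal dense sketch of steps 1-4 (our own illustration; it solves the
    full coupled system (7.9) rather than only the n_0 components coupled
    through P, and uses dense solves where the text would use the methods of
    Section 2 or 3):

        import numpy as np

        def two_region_solve(G, H, P, y1, y2):
            z1 = np.linalg.solve(G, y1)            # step 1
            z2 = np.linalg.solve(H, y2)
            W1 = np.linalg.solve(G, P)             # step 2
            W2 = np.linalg.solve(H, P.T)
            n1, n2 = len(y1), len(y2)
            K = np.block([[np.eye(n1), W1],        # step 3: Eq. (7.9)
                          [W2, np.eye(n2)]])
            xz = np.linalg.solve(K, np.concatenate([z1, z2]))
            return xz[:n1], xz[n1:]                # step 4: x^(1), x^(2)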

    8. Conclusion

    Numerous applications require the repeated solution of a Poisson equation.
    The operation counts given by Dorr [5] indicate that the methods we have
    discussed should offer significant economies over older techniques; and
    this has been verified in practice by many users. Computational
    experiments comparing the Buneman algorithm, the matrix decomposition (MD)
    algorithm, the Peaceman-Rachford alternating direction algorithm, and the
    point successive over-relaxation algorithm are given by Buzbee et al. [3].
    We conclude that the method of matrix decomposition, the Buneman
    algorithm, and Hockney's algorithm (when used with care) are valuable
    methods.

    This paper has benefited greatly from the comments of Dr. F. Dorr,
    Mr. J. Alan George, Dr. R. Hockney and Professor O. Widlund.

    9. References

    1. Richard Bellman, Introduction to Matrix Analysis, McGraw-Hill, New
       York, 1960.

    2. Oscar Buneman, Stanford University Institute for Plasma Research,
       Report No. 294, 1969.

    3. B. L. Buzbee, G. H. Golub and C. W. Nielson, "The Method of Odd/Even
       Reduction and Factorization with Application to Poisson's Equation,
       Part II," LA-4288, Los Alamos Scientific Laboratory. (To appear,
       SIAM J. Numer. Anal.)

    4. J. W. Cooley and J. W. Tukey, "An algorithm for the machine calculation
       of complex Fourier series," Math. Comp., Vol. 19, No. 90 (1965),
       pp. 297-301.

    5. F. W. Dorr, "The direct solution of the discrete Poisson equation on a
       rectangle," to appear in SIAM Review.

    6. J. A. George, "An Embedding Approach to the Solution of Poisson's
       Equation on an Arbitrary Bounded Region," to appear as a Stanford
       Report.

    7. G. H. Golub, R. Underwood and J. Wilkinson, "Solution of Ax = λBx when
       B is positive definite," (to be published).

    8. R. W. Hockney, "A fast direct solution of Poisson's equation using
       Fourier analysis," J. ACM, Vol. 12, No. 1 (1965), pp. 95-113.

    9. R. W. Hockney, in Methods in Computational Physics (B. Alder,
       S. Fernbach and M. Rotenberg, Eds.), Vol. 9, Academic Press, New York
       and London, 1969.

    10. R. E. Lynch, J. R. Rice and D. H. Thomas, "Direct solution of partial
        difference equations by tensor product methods," Numer. Math., Vol. 6
        (1964), pp. 185-199.

    11. R. S. Varga, Matrix Iterative Analysis, Prentice-Hall, Englewood
        Cliffs, New Jersey, 1962.

  • Matrix Methods in Mathematical Programming

    GENE GOLUB

    Stanford University

    1. Introduction

    With the advent of modern computers, there has been a great development in
    matrix algorithms. A major contributor to this advance is J. H. Wilkinson
    [30]. Simultaneously, a considerable growth has occurred in the field of
    mathematical programming. However, in this field, until recently, very
    little analysis has been carried out for the matrix algorithms involved.

    In the following lectures, matrix algorithms will be developed which can
    be efficiently applied in certain areas of mathematical programming and
    which give rise to stable processes.

    We consider problems of the following types:

        maximize φ(x) ,   where x = (x_1, x_2, ..., x_n)^T ,
        subject to A x = b ,
                   G x ≥ h ,

    where the objective function φ(x) is linear or quadratic.

    2. Linear Programming

    The linear programming problem can be posed as follows:

        maximize φ(x) = c^T x
        subject to A x = b ,                                                (2.1)
                   x ≥ 0 .                                                  (2.2)

    We assume that A is an m × n matrix, with m < n, which satisfies the Haar
    condition (that is, every m × m submatrix of A is non-singular). The
    vector x is said to be feasible if it satisfies the constraints (2.1) and
    (2.2).

    Let I = {i_1, i_2, ..., i_m} be a set of m indices such that, on setting
    x_j = 0 for j ∉ I, we can solve the remaining m equations in (2.1) and
    obtain a solution such that

        x_{i_j} > 0 ,   j = 1, 2, ..., m .

    This vector x is said to be a basic feasible solution. It is well-known
    that the vector x which maximizes φ(x) = c^T x is a basic feasible
    solution, and this suggests a possible algorithm for obtaining the optimum
    solution, namely, examine all possible basic feasible solutions.

    Such a process is generally inefficient. A more systematic procedure, due
    to Dantzig, is the Simplex Algorithm. In this algorithm, a series of basic
    feasible solutions is generated by changing one variable at a time in such
    a way that the value of the objective function is increased at each step.
    There seems to be no way of determining the rate of convergence of the
    simplex method; however, it works well in practice.

    The steps involved may be given as follows (a sketch in code is given
    after the description):

    (i) Assume that we can determine a set of m indices I = {i_1, i_2, ...,
    i_m} such that the corresponding x_{i_j} are the non-zero variables in a
    basic feasible solution. Define the basis matrix

        B = [ a_{i_1}, a_{i_2}, ..., a_{i_m} ] ,

    where the a_{i_j} are columns of A corresponding to the basic variables.

    (ii) Solve the system of equations

        B x̂ = b ,   where x̂^T = [ x_{i_1}, x_{i_2}, ..., x_{i_m} ] .

    (iii) Solve the system of equations

        B^T w = ĉ ,

    where ĉ^T = [ c_{i_1}, c_{i_2}, ..., c_{i_m} ] are the coefficients of the
    basic variables in the objective function.

    (iv) Calculate

        max_j ( c_j - a_j^T w ) = c_r - a_r^T w ,   say.

    If c_r - a_r^T w ≤ 0, then the optimum solution has been reached.
    Otherwise, a_r is to be introduced into the basis.

    (v) Solve the system of equations

        B t_r = -a_r .

    If t_{rk} ≥ 0, k = 1, 2, ..., m, then this indicates that the optimum
    solution is unbounded. Otherwise determine the component s for which

        x_{i_s} / (-t_{rs}) = min_{1≤k≤m} { -x_{i_k} / t_{rk} : t_{rk} < 0 } .

    Eliminate the column a_{i_s} from the basis matrix and introduce column
    a_r. This process is continued from step (ii) until an optimum solution is
    obtained (or shown to be unbounded).

    We have defined the complete algorithm explicitly, provided a termination
    rule, and indicated how to detect an unbounded solution. We now show how
    the simplex algorithm can be implemented in a stable numerical fashion.
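
    The following sketch (our own; function and variable names are ours, and
    it refactorizes B at every call, where the point of the next section is
    precisely to avoid this) expresses one pass through steps (ii)-(v). For
    simplicity it uses the sign convention B t = a_r, so the ratio test takes
    the usual form with t_k > 0:

        import numpy as np
        from scipy.linalg import lu_factor, lu_solve

        def simplex_step(A, b, c, basis):
            B = A[:, basis]
            fac = lu_factor(B)                     # LU factors of the basis
            xhat = lu_solve(fac, b)                # (ii)  B xhat = b
            w = lu_solve(fac, c[basis], trans=1)   # (iii) B^T w = chat
            red = c - A.T @ w                      # (iv)  reduced costs
            red[basis] = 0.0
            r = int(np.argmax(red))
            if red[r] <= 1e-12:
                return basis, xhat, True           # optimum reached
            t = lu_solve(fac, A[:, r])             # (v)   B t = a_r
            if not (t > 1e-12).any():
                raise ValueError("optimum solution is unbounded")
            ratios = np.where(t > 1e-12, xhat / np.where(t > 1e-12, t, 1.0), np.inf)
            s = int(np.argmin(ratios))
            basis = list(basis)
            basis[s] = r                           # exchange a_{i_s} for a_r
            return basis, xhat, False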

    3. A Stable Implementation of the Simplex Algorithm

    Throughout the algorithm, there are three systems of linear equations to
    be solved at each iteration. These are:

        B x̂ = b ,   B^T w = ĉ ,   B t_r = -a_r .

    Assuming Gaussian elimination is used, this requires about m³/3
    multiplications for each system. However, if it is assumed that the
    triangular factors of B are available, then only O(m²) multiplications are
    needed. An important consideration is that only one column of B is changed
    in one iteration, and it seems reasonable to assume that the number of
    multiplications can be reduced if use is made of this. We would hope to
    reduce the m³/3 multiplications to O(m²) multiplications per step. This is
    the basis of the classical simplex method. The disadvantage of this method
    is that the pivoting strategy which is generally used does not take
    numerical stability into consideration. We now show that it is possible to
    implement the simplex algorithm in a more stable manner, the cost being
    that more storage is required.

    Consider methods for the solution of a set of linear equations. It is
    well-known that there exists a permutation matrix Π such that

        Π B = L U ,

    where L is a lower triangular matrix and U is an upper triangular matrix.

    If Gaussian elimination with partial (row) pivoting is used, then we
    proceed as follows:

    Choose a permutation matrix Π_1 such that the maximum modulus element of
    the

    first column of B becomes the (1,1)-element of Π_1 B.

    Define an elementary lower triangular matrix Γ_k as a matrix which differs
    from the identity matrix only in the elements below the diagonal of its
    k-th column.

    Now Γ_1 can be chosen so that

        Γ_1 Π_1 B

    has all elements below the diagonal in the first column set equal to zero.
    Now choose Π_2 so that

        Π_2 Γ_1 Π_1 B

    has the maximum modulus element in the second column in position (2,2),
    and choose Γ_2 so that

        Γ_2 Π_2 Γ_1 Π_1 B

    has all elements below the diagonal in the second column set equal to
    zero. This can be done without affecting the zeros already computed in the
    first column. Continuing in this way we obtain:

        Γ_{m-1} Π_{m-1} ··· Γ_2 Π_2 Γ_1 Π_1 B = U ,

    where U is an upper triangular matrix.

    Note that permuting the rows of the matrix B merely implies a re-ordering
    of the right-hand-side elements. Thus, no actual permutation need be
    performed, merely a record kept. Further, any product of elementary lower
    triangular matrices is a lower triangular matrix, as may easily be shown.
    Thus on the left-hand side we have essentially a lower triangular matrix,
    and thus the required factorization.

    The relevant elements of the successive matrices Γ_k can be stored in the
    lower triangle of B, in the space where zeros have been introduced. Thus
    the method is economical in storage.

    To return to the linear programming problem, we require to solve a system
    of equations of the form

        B^(i) x = v ,                                                       (3.1)

    where B^(i) and B^(i-1) differ in only one column (although the columns
    may be re-ordered).

    Consider the first iteration of the algorithm. Suppose that we have
    obtained the factorization

        B^(0) = L^(0) U^(0) ,

    where the right-hand-side vector has been re-ordered to take account of
    the permutations. The solution to (3.1) with i = 0 is obtained by
    computing

        ṽ = ( L^(0) )^{-1} v

    and solving the triangular system

        U^(0) x = ṽ ,

    each of which requires m²/2 + O(m) multiplications.

    Suppose that the column b_{s_0}^(0) is eliminated from B^(0) and the
    column g^(0) is introduced as the last column; then

        B^(1) = [ b_1^(0), ..., b_{s_0-1}^(0), b_{s_0+1}^(0), ..., b_m^(0), g^(0) ] .

    Therefore,

        ( L^(0) )^{-1} B^(1) = H^(1) ,

    where H^(1) is upper triangular in its first s_0 - 1 columns and has one
    non-zero element below the diagonal in each of the remaining columns.

    Such a matrix is called an upper Hessenberg matrix. Only the last column
    need be computed, as all others are available from the previous step. We
    require to apply a sequence of transformations to restore the upper
    triangular form. It is clear that we have a particularly simple case of
    the LU factorization procedure as previously described, where Γ_j^(1)
    differs from the identity matrix in a single subdiagonal element, only one
    element requiring to be calculated. On applying a sequence of
    transformation matrices and permutation matrices as before, we obtain

        Γ_{m-1}^(1) Π_{m-1}^(1) ··· Γ_{s_0}^(1) Π_{s_0}^(1) H^(1) = U^(1) ,

    where U^(1) is upper triangular.

    Note that in this case, to obtain Π_j^(1) it is only necessary to compare
    two elements. Thus the storage required is very small: (m - s_0)
    multipliers g_i^(1) and (m - s_0) bits to indicate whether or not
    interchanges are necessary.

    All elements in the computation are bounded, and so we have good numerical
    accuracy throughout. The whole procedure compares favourably with standard
    forms, for example, the product form of the inverse, where no account of
    numerical accuracy is taken. Further, this procedure requires fewer
    operations than the method which uses the product form of the inverse. If
    we consider the steps involved, forward and backward substitution with
    L^(0) and U^(i) require a total of m² multiplications, and the application
    of the remaining transformations in ( L^(i) )^{-1} requires at most
    i(m - 1) multiplications. (If we assume that on the average the middle
    column of the basis matrix is eliminated, then this will be closer to
    i(m - 1)/2.) Thus a total of m² + i(m - 1) multiplications are required to
    solve the system at each stage, assuming an initial factorization is
    available. Note that if the matrix A is sparse, then the algorithm can
    make use of this structure, as is done in the method using the product
    form of the inverse.
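
    A sketch of the column-replacement update (our own illustration of the
    procedure just described; the names are ours, and the list of recorded
    operations plays the role of the stored multipliers and interchange
    bits):

        import numpy as np

        def update_factorization(L, U, s0, g, ops):
            # B(1) = B(0) with column s0 deleted and g appended as last column.
            # H = L^{-1} B(1) is upper Hessenberg from column s0 onwards.
            m = U.shape[0]
            H = np.column_stack([np.delete(U, s0, axis=1),
                                 np.linalg.solve(L, g)])
            for j in range(s0, m - 1):
                # Pi_j^(1): compare only the two candidate pivots
                if abs(H[j + 1, j]) > abs(H[j, j]):
                    H[[j, j + 1]] = H[[j + 1, j]]
                    ops.append(("swap", j, 0.0))
                mult = H[j + 1, j] / H[j, j]       # Gamma_j^(1): one multiplier
                H[j + 1, j:] -= mult * H[j, j:]
                ops.append(("elim", j, mult))
            return H                               # the new U(1)

        def apply_ops(ops, v):
            # Apply the recorded transformations to a right-hand side, so that
            # a solve is: w = L^{-1} v; apply_ops; back-substitute the new U.
            v = v.copy()
            for kind, j, mult in ops:
                if kind == "swap":
                    v[j], v[j + 1] = v[j + 1], v[j]
                else:
                    v[j + 1] -= mult * v[j]
            return v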

    4" Iterative refinement of the.solution

    Consider the set of equations

    B~ = X

    and suppose that ~ is a computed approximation to ~ . Let

    -- ~+

    Therefore,

    that is,

    B(~ + 2) : v ,

    Be_ -- v -B~

    We can now solve for c very efficiently, since the LU decomposition of B is

    available. This process can be repeated until ~ is obtained to the required accur-

    acy. The algorithm can be outlined as follows:

    (i) Compute ~j = ~ - B~_j

    (ii) Solve B_cj = r -j

    (iii) Compute ~j+1 = ~J + ~J

    It is necessary for r to be computed in double precision and then rounded to --j

    single precision. Note that step (ii) requires 0(m 2) operations, since the LU de-

    composition of B is available. This procedure can be used in the following sections.
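
    A minimal sketch of steps (i)-(iii) (our own illustration; numpy's
    longdouble stands in for the double-length accumulation of the residual):

        import numpy as np
        from scipy.linalg import lu_factor, lu_solve

        def refine(B, v, iterations=5):
            fac = lu_factor(B)                       # factor once, reuse in (ii)
            x = lu_solve(fac, v)
            Bx = B.astype(np.longdouble)
            vx = v.astype(np.longdouble)
            for _ in range(iterations):
                r = vx - Bx @ x                      # (i)   extended-precision residual
                c = lu_solve(fac, r.astype(B.dtype)) # (ii)  O(m^2) with the LU factors
                x = x + c                            # (iii)
            return x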

    5. Householder Triangularization

    Householder transformations have been widely discussed in the literature.
    In this section we are concerned with their use in reducing a matrix A to
    upper-triangular form, and in particular we wish to show how to update the
    decomposition of A when its columns are changed one by one. This will open
    the way to implementation of efficient and stable algorithms for solving
    problems involving linear constraints.

    Householder transformations are symmetric orthogonal matrices of the form

        P_k = I - β_k u_k u_k^T ,

    where u_k is a vector and β_k = 2 / ( u_k^T u_k ). Their utility in this

    context is due to the fact that for any non-zero vector a it is possible
    to choose u_k in such a way that the transformed vector P_k a is zero
    except for its first element. Householder [15] used this property to
    construct a sequence of transformations to reduce a matrix to
    upper-triangular form. In [29], Wilkinson describes the process and his
    error analysis shows it to be very stable.

    Given any A, we can construct a sequence of transformations such that A is
    reduced to upper triangular form. Premultiplying by P_0 annihilates
    (m - 1) elements in the first column. Similarly, premultiplying by P_1
    eliminates (m - 2) elements in the second column, and so on. Therefore,

        P_{m-1} P_{m-2} ··· P_1 P_0 A = | R |
                                        | 0 | ,                             (5.1)

    where R is an upper triangular matrix.

    Since the product of orthogonal matrices is an orthogonal matrix, we can
    write (5.1) as

        Q A = | R |              A = Q^T | R |
              | 0 | ,                    | 0 | .

    The above process is close to the Gram-Schmidt process in that it produces
    a set of orthonormal vectors spanning E_n. In addition, the Householder
    transformation produces a complementary set of vectors which is often
    useful. Since this process has been shown to be numerically stable, it
    does produce an orthogonal matrix, in contrast to the Gram-Schmidt
    process.

    If A = ( a_1, ..., a_n ) is an m×n matrix of rank r, then at the k-th
    stage of the triangularization (k < r) we have

        A^(k) = P_{k-1} P_{k-2} ··· P_0 A = | R_k  S_k |
                                            | 0    T_k | ,

    where R_k is an upper-triangular matrix of order k. The next step is to
    compute A^(k+1) = P_k A^(k), where P_k is chosen to reduce the first
    column of T_k to zero except for the first component. This component
    becomes the last diagonal element of R_{k+1}, and since its modulus is
    equal to the Euclidean length of the first column of T_k, it should in
    general be maximized by a suitable interchange of the columns of T_k.
    After r steps, T_r will be effectively zero (the length of each of its
    columns will be smaller than some tolerance) and the process stops.

    Hence we conclude that if rank(A) = r, then for some permutation matrix Π
    the Householder decomposition (or "QR decomposition") of A is

        Q A Π = P_{r-1} P_{r-2} ··· P_0 A Π = | R  S |
                                              | 0  0 | ,

    where Q = P_{r-1} P_{r-2} ··· P_0 is an m × m orthogonal matrix and R is
    upper-triangular and non-singular.
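
    A compact sketch of the column-pivoted triangularization (our own
    illustration; it stores the u_k and β_k explicitly rather than beneath R
    as described in the next paragraph):

        import numpy as np

        def householder_qr(A, tol=1e-12):
            A = A.astype(float).copy()
            m, n = A.shape
            piv = np.arange(n)
            us, betas = [], []
            for k in range(min(m, n)):
                norms = np.linalg.norm(A[k:, k:], axis=0)
                j = k + int(np.argmax(norms))     # bring the longest column forward
                if norms[j - k] <= tol:
                    break                         # T_k effectively zero: rank found
                A[:, [k, j]] = A[:, [j, k]]
                piv[[k, j]] = piv[[j, k]]
                u = A[k:, k].copy()
                u[0] += (1.0 if u[0] >= 0 else -1.0) * np.linalg.norm(u)
                beta = 2.0 / (u @ u)
                A[k:, k:] -= beta * np.outer(u, u @ A[k:, k:])   # apply P_k
                us.append(u)
                betas.append(beta)
            return us, betas, np.triu(A), piv

        def apply_Q(us, betas, v):
            # Q v = P_{r-1} ... P_0 v, each factor applied as v - beta (u^T v) u
            v = v.astype(float).copy()
            for k, (u, beta) in enumerate(zip(us, betas)):
                v[k:] -= beta * (u @ v[k:]) * u
            return v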

    We are now concerned with the manner in which Q should be stored and the
    means by which Q, R, S may be updated if the columns of A are changed. We
    will suppose that a column a_p is deleted from A and that a column a_q is
    added. It will be clear what is to be done if only one or the other takes
    place.

    Since the Householder transformations P_k are defined by the vectors u_k,
    the usual method is to store the u_k's in the area beneath R, with a few
    extra words of memory being used to store the β_k's and the diagonal
    elements of R. The product Q v for some vector v is then easily computed
    in the form P_{r-1} P_{r-2} ··· P_0 v where, for example,

        P_0 v = ( I - β_0 u_0 u_0^T ) v = v - β_0 ( u_0^T v ) u_0 .

    The updating is best accomplished as follows. The first p-1 columns of the
    new R are the same as before; the other columns p through n are simply
    overwritten by columns a_{p+1}, ..., a_n, a_q and transformed by the
    product P_{p-1} P_{p-2} ··· P_0 to obtain a new

        | S_{p-1} |
        | T_{p-1} | ;

    then T_{p-1} is triangularized as usual. This method allows Q to be kept
    in product form always, and there is no accumulation of errors. Of course,
    if p = 1 the complete decomposition must be re-done, and since with m ≥ n
    the work is roughly proportional to (m - n/3)n², this can mean a lot of
    work. But if p ≈ n/2 on the average, then only about 1/8 of the original
    work must be repeated each updating.

    Assume that we have a matrix A which is to be replaced by a matrix Ā
    formed from A by eliminating column a_p and inserting a new vector g as
    the last column. As in the simplex method, we can produce an updating
    procedure using Householder transformations. If Ā is premultiplied by Q,
    the resulting matrix has upper Hessenberg form, as before. As before, it
    can be reduced to an upper triangular matrix in O(m²) multiplications.

    6. Projections

    In optimization problems involving linear constraints it is often
    necessary to compute the projections of some vector either into or
    orthogonal to the space defined by a subset of the constraints (usually
    the current "basis"). In this section we show how Householder
    transformations may be used to compute such projections. As we have shown,
    it is possible to update the Householder decomposition of a matrix when
    the number of columns in the matrix is changed, and thus we will have an
    efficient and stable means of orthogonalizing vectors with respect to
    basis sets whose component vectors are changing one by one.

    Let the basis set of vectors a_1, a_2, ..., a_n form the columns of an
    m × n matrix A, and let S_r be the sub-space spanned by {a_i}. We shall
    assume that the first r vectors are linearly independent and that
    rank(A) = r. In general, m ≥ n ≥ r, although the following is true even if
    m < n.

    Given an arbitrary vector z we wish to compute the projections

        u = P z ,   v = ( I - P ) z

    for some projection matrix P, such that

    (a) z = u + v ,

    (b) u^T v = 0 ,

    (c) u ∈ S_r   (i.e., ∃ x such that u = A x) ,

    (d) v is orthogonal to S_r   (i.e., A^T v = 0) .

    One method is to write P as A A⁺, where A⁺ is the n × m generalized
    inverse of A, and in [7] Fletcher shows how A⁺ may be updated upon changes
    of basis. In contrast, the method based on Householder transformations
    does not deal with A⁺ explicitly but instead keeps A A⁺ in factorized form
    and simply updates the orthogonal matrix required to produce this form.
    Apart from being more stable and just as efficient, the method has the
    added advantage that there are always two orthonormal sets of vectors
    available, one spanning S_r and the other spanning its complement.

    As already shown, we can construct an m × m orthogonal matrix Q such that

        Q A = | R  S |     r rows
              | 0  0 | ,

    where R is an r × r upper-triangular matrix. Let

        w = Q z = | w_1 |    r
                  | w_2 |    m-r                                            (6.1)

    and define

        u = Q^T | w_1 |        v = Q^T | 0   |
                | 0   | ,              | w_2 | .                            (6.2)

    Then it is easily verified that u, v are the required projections of z,
    which is to say they satisfy the above four properties. Also, the x in (c)
    is readily shown to be

        x = | R^{-1} w_1 |
            | 0          | .

    In effect, we are representing the projection matrices in the form

        P = Q^T | I_r  0 | Q                                                (6.3)
                | 0    0 |

    and

        I - P = Q^T | 0  0       | Q ,                                      (6.4)
                    | 0  I_{m-r} |

    and we are computing u = P z, v = (I - P) z by means of (6.1), (6.2). The
    first r columns of Q^T span S_r and the remaining m-r span its complement.
    Since Q and R may be updated accurately and efficiently if they are
    computed using Householder transformations, we have, as claimed, the means
    of orthogonalizing vectors with respect to varying bases.

    As an example of the use of the projection (6.4), consider the problem of
    finding the stationary values of x^T A x subject to x^T x = 1 and
    C^T x = 0, where A is a real symmetric matrix of order n and C is an n × p
    matrix of rank r, with r ≤ p.
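
    A short sketch of (6.1)-(6.2) (our own illustration via numpy's QR
    factorization; numpy returns A = Qt·R with Qt having orthonormal columns,
    so Qt corresponds to Q^T in the text, and the sketch assumes the first r
    columns of A are linearly independent):

        import numpy as np

        def projections(A, z, r):
            Qt, _ = np.linalg.qr(A, mode='complete')   # A = Qt (R; 0)
            w = Qt.T @ z                               # w = Q z           (6.1)
            u = Qt[:, :r] @ w[:r]                      # u = Q^T (w_1, 0)  (6.2)
            v = Qt[:, r:] @ w[r:]                      # v = Q^T (0, w_2)
            return u, v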

    7. Linear Least-Squares Problem

    Consider the problem

        min_x || b - A x ||_2² ,

    where we assume that the rank of A is n. Since length is invariant under
    an orthogonal transformation we have

        || b - A x ||_2² = || Q b - Q A x ||_2² ,

    where

        Q A = | R |
              | 0 | .

    Let

        Q b = c = | c_1 |    n
                  | c_2 |    m-n .

    Then,

        || b - A x ||_2² = || c_1 - R x ||_2² + || c_2 ||_2² ,

    and the solution to the least-squares problem is given by

        x̂ = R^{-1} c_1 .

    Thus it is easy to solve the least-squares problem using orthogonal
    transformations.
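
    In code this is only a few lines (our own sketch, for A of full rank n):

        import numpy as np
        from scipy.linalg import solve_triangular

        def lstsq_qr(A, b):
            Q, R = np.linalg.qr(A)            # reduced factorization A = Q R
            c1 = Q.T @ b                      # the first n components of Qb
            return solve_triangular(R, c1)    # x = R^{-1} c_1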

    Alternatively, the least-squares problem can be solved by constructing the
    normal equations

        A^T A x = A^T b .

    However these are well-known to be ill-conditioned. Nevertheless the
    normal equations can be used in the following way. Let the residual vector
    r be defined by

        r = b - A x .

    Then,

        A^T r = A^T b - A^T A x = 0 .

    These equations can be written:

        | I    A | | r |   | b |
        |        | |   | = |   | .                                          (7.1)
        | A^T  0 | | x |   | 0 |

    Applying the orthogonal transformation Q and multiplying out, we obtain

        | I    0  R | | r̃_1 |   | c_1 |
        | 0    I  0 | | r̃_2 | = | c_2 | ,
        | R^T  0  0 | | x   |   | 0   |

    where r̃ = Q r and c = Q b. This system can easily be solved for x and r.
    The method of iterative refinement may be applied to obtain a very
    accurate solution. This method has been analysed by Björck [2].

    8. Least-Squares Problem with Linear Constraints

    Here we consider the problem

        minimize || b - A x ||_2²
        subject to G x = h .

    Using Lagrange multipliers λ, we may incorporate the constraints into
    equation (7.1) and obtain

        | 0    0    G | | λ |   | h |
        | 0    I    A | | r | = | b |
        | G^T  A^T  0 | | x |   | 0 | .

    The methods of the previous sections can be applied to obtain the solution
    of this system of equations, without actually constructing the above
    matrix. The problem simplifies, and a very accurate solution may be
    obtained.
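
    For illustration only, the following naive sketch (our own) forms and
    solves the augmented system directly, which the text explicitly avoids;
    it serves as a reference implementation for small problems:

        import numpy as np

        def constrained_lstsq(A, b, G, h):
            # minimize ||b - Ax||_2 subject to Gx = h, via the augmented system
            m, n = A.shape
            p = G.shape[0]
            K = np.block([
                [np.zeros((p, p)), np.zeros((p, m)), G],
                [np.zeros((m, p)), np.eye(m),        A],
                [G.T,              A.T,              np.zeros((n, n))],
            ])
            sol = np.linalg.solve(K, np.concatenate([h, b, np.zeros(n)]))
            lam, r, x = sol[:p], sol[p:p + m], sol[p + m:]
            return x, r, lam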

    Now we consider the problem

        minimize || b - A x ||_2²
        subject to G x ≥ h .

    Such a problem might arise in the following manner. Suppose we wish to
    approximate given data by the polynomial

        y(t) = αt³ + βt² + γt + δ

    such that y(t) is convex. This implies

        y''(t) = 6αt + 2β ≥ 0 .

    Thus, we require

        6αt_i + 2β ≥ 0 ,

    where the t_i are the data points. (This does not necessarily guarantee
    that the polynomial will be convex throughout the interval.) Introduce
    slack variables w such that

        G x - w = h ,   where w ≥ 0 .

    Introducing Lagrange multipliers as before, we may write the system as:

        | 0    0    G  -I | | λ |   | h |
        | 0    I    A   0 | | r | = | b |
        | G^T  A^T  0   0 | | x |   | 0 | .
                            | w |

    At the solution, we must have

        λ ≥ 0 ,   w ≥ 0 ,   λ^T w = 0 .

    This implies that when a Lagrange multiplier is non-zero, then the
    corresponding constraint holds with equality. Conversely, corresponding to
    a non-zero w_i the Lagrange multiplier must be zero. Therefore, if we knew
    which constraints held with equality at the solution, we could treat the
    problem as a linear least-squares problem with linear equality
    constraints. A technique, due to Cottle and Dantzig [5], exists for
    solving the problem in this way.

  • Bibliography

    [1] Beale, E. M. L., "Numerical Methods", in Nonlinear Programming,
        J. Abadie (ed.), John Wiley, New York, 1967, pp. 133-205.

    [2] Björck, Å., "Iterative Refinement of Linear Least Squares Solutions
        II", BIT 8 (1968), pp. 8-30.

    [3] Björck, Å., and G. H. Golub, "Iterative Refinement of Linear Least
        Squares Solutions by Householder Transformations", BIT 7 (1967),
        pp. 322-37.

    [4] Björck, Å., and V. Pereyra, "Solution of Vandermonde Systems of
        Equations", Publication 70-02, Universidad Central de Venezuela,
        Caracas, Venezuela, 1970.

    [5] Cottle, R. W., and G. B. Dantzig, "Complementary Pivot Theory of
        Mathematical Programming", Mathematics of the Decision Sciences,
        Part 1, G. B. Dantzig and A. F. Veinott (eds.), American Mathematical
        Society (1968), pp. 115-136.

    [6] Dantzig, G. B., R. P. Harvey, R. D. McKnight, and S. S. Smith, "Sparse
        Matrix Techniques in Two Mathematical Programming Codes", Proceedings
        of the Symposium on Sparse Matrices and Their Applications,
        T. J. Watson Research Publication RA1, no. 11707, 1969.

    [7] Fletcher, R., "A Technique for Orthogonalization", J. Inst. Maths.
        Applics. 5 (1969), pp. 162-66.

    [8] Forsythe, G. E., and G. H. Golub, "On the Stationary Values of a
        Second-Degree Polynomial on the Unit Sphere", J. SIAM, 13 (1965),
        pp. 1050-68.

    [9] Forsythe, G. E., and C. B. Moler, Computer Solution of Linear
        Algebraic Systems, Prentice-Hall, Englewood Cliffs, New Jersey, 1967.

    [10] Francis, J., "The QR Transformation. A Unitary Analogue to the LR
         Transformation", Comput. J. 4 (1961-62), pp. 265-71.

    [11] Golub, G. H., and C. Reinsch, "Singular Value Decomposition and Least
         Squares Solutions", Numer. Math., 14 (1970), pp. 403-20.

    [12] Golub, G. H., and R. Underwood, "Stationary Values of the Ratio of
         Quadratic Forms Subject to Linear Constraints", Technical Report
         No. CS 142, Computer Science Department, Stanford University, 1969.

    [13] Hanson, R. J., "Computing Quadratic Programming Problems: Linear
         Inequality and Equality Constraints", Technical Memorandum No. 240,
         Jet Propulsion Laboratory, Pasadena, California, 1970.

    [14] Hanson, R. J., and C. L. Lawson, "Extensions and Applications of the
         Householder Algorithm for Solving Linear Least Squares Problems",
         Math. Comp., 23 (1969), pp. 787-812.

    [15] Householder, A. S., "Unitary Triangularization of a Nonsymmetric
         Matrix", J. Assoc. Comp. Mach., 5 (1958), pp. 339-42.

    [16] Lanczos, C., Linear Differential Operators, Van Nostrand, London,
         1961, Chapter 3.

    [17] Leringe, Ö., and P. Wedin, "A Comparison Between Different Methods to
         Compute a Vector x Which Minimizes ||Ax - b||_2 When Gx = h",
         Technical Report, Department of Computer Sciences, Lund University,
         Sweden.

    [18] Levenberg, K., "A Method for the Solution of Certain Non-Linear
         Problems in Least Squares", Quart. Appl. Math., 2 (1944), pp. 164-68.

    [19] Marquardt, D. W., "An Algorithm for Least-Squares Estimation of
         Non-Linear Parameters", J. SIAM, 11 (1963), pp. 431-41.

    [20] Meyer, R. R., "Theoretical and Computational Aspects of Nonlinear
         Regression", P-1819, Shell Development Company, Emeryville,
         California.

    [21] Penrose, R., "A Generalized Inverse for Matrices", Proceedings of the
         Cambridge Philosophical Society, 51 (1955), pp. 406-13.

    [22] Peters, G., and J. H. Wilkinson, "Eigenvalues of Ax = λBx with Band
         Symmetric A and B", Comput. J., 12 (1969), pp. 398-404.

    [23] Powell, M. J. D., "Rank One Methods for Unconstrained Optimization",
         T. P. 372, Atomic Energy Research Establishment, Harwell, England,
         1969.

    [24] Rosen, J. B., "Gradient Projection Method for Non-linear Programming.
         Part I. Linear Constraints", J. SIAM, 8 (1960), pp. 181-217.

    [25] Shanno, D. C., "Parameter Selection for Modified Newton Methods for
         Function Minimization", J. SIAM, Numer. Anal., Ser. B, 7 (1970).

    [26] Stoer, J., "On the Numerical Solution of Constrained Least Squares
         Problems", (private communication), 1970.

    [27] Tewarson, R. P., "The Gaussian Elimination and Sparse Systems",
         Proceedings of the Symposium on Sparse Matrices and Their
         Applications, T. J. Watson Research Publication RA1, no. 11707, 1969.

    [28] Wilkinson, J. H., "Error Analysis of Direct Methods of Matrix
         Inversion", J. Assoc. Comp. Mach., 8 (1961), pp. 281-330.

    [29] Wilkinson, J. H., "Error Analysis of Transformations Based on the Use
         of Matrices of the Form I - 2ww^H", in Error in Digital Computation,
         Vol. II, L. B. Rall (ed.), John Wiley and Sons, Inc., New York, 1965,
         pp. 77-101.

    [30] Wilkinson, J. H., The Algebraic Eigenvalue Problem, Clarendon Press,
         Oxford, 1965.

    [31] Zoutendijk, G., Methods of Feasible Directions, Elsevier Publishing
         Company, Amsterdam (1960), pp. 80-90.

  • Topics in Stability Theory for Partial Difference Operators

    VIDAR THOMÉE

    University of Gothenburg

    PREFACE

    The purpose of these lectures is to present a short introduction to some
    aspects of the theory of difference schemes for the solution of initial
    value problems for linear systems of partial differential equations. In
    particular, we shall discuss various stability concepts for finite
    difference operators and the related question of convergence of the
    solution of the discrete problem to the solution of the continuous
    problem. Special emphasis will be given to the strong relationship between
    stability of difference schemes and correctness of initial value problems.

    In practice, most important applications deal with mixed initial boundary
    value problems for non-linear equations. It will not be possible in this
    short course to develop the theory in such a general context. However, the
    results in the particular cases we shall treat have intuitive implications
    for the more complicated situations. The two most important methods in
    stability theory for difference operators have been the Fourier method and
    the energy method. The former applies in its pure form only to equations
    with constant coefficients, whereas the latter is more directly applicable
    to variable coefficients and even to non-linear situations. Often
    different methods have to be combined, so that for instance Fourier
    methods are first used to analyse the linearized equations with
    coefficients fixed at some point, and then the energy method, or some
    other method, is applied to appraise the error committed by treating the
    simplified case. We have chosen in these lectures to concentrate on
    Fourier techniques.

    These notes were developed from material used previously by the author for
    a similar course held in the summer of 1968 in a University of Michigan
    engineering summer conference on numerical analysis, and also used for the
    author's survey paper [36]. Some of the relevant literature is collected
    in the list of references. A thorough account of the theory can be
    obtained by combining the book by Richtmyer and Morton [28] with the above
    mentioned survey paper [36]. Both these sources contain extensive lists of
    further references.


1. Introduction

Let 𝓒 be the set of uniformly continuous, bounded functions of x ∈ R, and let 𝓒^k be the set of functions v with (d/dx)^j v in 𝓒 for j ≤ k. For v ∈ 𝓒 set

    ‖v‖ = sup_x |v(x)|.

For any v ∈ 𝓒, any k, and any ε > 0 we can find ṽ ∈ 𝓒^k such that

    ‖v − ṽ‖ ≤ ε.

Consider the initial-value problem

    ∂u/∂t = ∂²u/∂x², t > 0,  (1)

    u(x,0) = v(x).  (2)

If v ∈ 𝓒², this problem admits one and only one solution with u(·,t) ∈ 𝓒² for t ≥ 0, namely

    u(x,t) = (4πt)^{-1/2} ∫_{-∞}^{∞} e^{-(x-y)²/4t} v(y) dy, t > 0.  (3)

It is clear that the solution u depends for fixed t linearly on v; we define a linear operator E₀(t) by

    E₀(t)v = u(·,t),

where u is defined by (3) and where v ∈ 𝓒². The solution operator E₀(t) has the properties

    ‖E₀(t)v‖ ≤ ‖v‖

and

    ‖E₀(t)v − v‖ → 0, t → 0+.

Since 𝓒² is dense in 𝓒, the operator E₀(t) may therefore be extended by continuity to a bounded linear operator E(t) defined on all of 𝓒, the generalized solution operator. The operator E(t) still has the properties

    E(t)v → v, t → 0+,  (4)

    ‖E(t)v‖ ≤ ‖v‖,  (5)

and is continuous in t for t ≥ 0. For this particular equation we actually get a classical solution for t > 0 even if v is only in 𝓒; we have E(t)v given by the integral in (3) for t > 0.

Consider now the initial-value problem

    ∂u/∂t = ∂u/∂x, t > 0,  (6)

    u(x,0) = v(x).  (7)

For v ∈ 𝓒¹ this problem admits one and only one genuine solution, namely

    u(x,t) = v(x + t).

Clearly ‖u(·,t)‖ ≤ ‖v‖ (actually we have equality), and it is again natural to define a generalized solution operator, continuous in t, by

    E(t)v(x) = v(x + t), v ∈ 𝓒.

This has again the properties (4), (5). In this case, the solution is as irregular for t > 0 as it is for t = 0.

Both these problems are thus "correctly posed" in 𝓒; they can be uniquely solved for a dense subset of 𝓒 and the solution operator is bounded.

We could instead of 𝓒 also have considered other basic classes of functions. Thus let L² be the set of square integrable functions with

    ‖v‖ = ( ∫_{-∞}^{∞} |v(x)|² dx )^{1/2}.

Consider again the initial-value problem (1), (2) and assume that u(x,t) is a classical solution and that u(x,t) tends to zero as fast as necessary when |x| → ∞ for the following to hold. Assume for simplicity that u is real-valued. We then have

    (d/dt) ∫ u² dx = 2 ∫ u (∂u/∂t) dx = 2 ∫ u (∂²u/∂x²) dx = −2 ∫ (∂u/∂x)² dx ≤ 0,  (8)

so that for t ≥ 0,

    ‖u(·,t)‖ ≤ ‖v‖.  (9)

Relative to the present framework it is also possible to define genuine and generalized solution operators; the latter is defined on the whole of L² and satisfies (4), (5).

For the problem (6), (7) the calculation corresponding to (8) goes similarly:

    (d/dt) ∫ u² dx = 2 ∫ u (∂u/∂x) dx = ∫ (∂/∂x)(u²) dx = 0.

One other way of looking at this is to introduce the Fourier transform; for integrable v, set

    v̂(ξ) = (2π)^{-1/2} ∫_{-∞}^{∞} e^{-iξx} v(x) dx.  (10)

Notice the Parseval relation: for v in addition in L² we have v̂ ∈ L² and

    ‖v̂‖ = ‖v‖.

For the Fourier transform û(ξ,t) with respect to x of the solution u(x,t) we then get initial-value problems for ordinary differential equations, namely

    dû/dt = −ξ²û, û(ξ,0) = v̂(ξ)

for (1), (2), and

    dû/dt = iξû, û(ξ,0) = v̂(ξ)

for (6), (7). These have the solutions

    û(ξ,t) = e^{-ξ²t} v̂(ξ)  (11)

and

    û(ξ,t) = e^{iξt} v̂(ξ),  (12)

respectively, and the actual solutions can be obtained, under certain conditions, by the inverse Fourier transform. Also, by Parseval's formula we have for both (11) and (12)

    ‖u(·,t)‖ ≤ ‖v‖,

which is again (9).
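The diagonal form (11), (12) of the solution operators lends itself directly to computation. The following sketch (an editorial illustration, not part of the original notes; it assumes Python with NumPy and uses a large periodic interval as a surrogate for the whole line) solves the heat problem (1), (2) by multiplying the discrete Fourier transform of v by e^{-ξ²t}; in accordance with (9), the L²-norm does not increase.

    import numpy as np

    # Solve u_t = u_xx via (11): the Fourier transform of the solution is
    # exp(-xi^2 t) * vhat(xi).  A long periodic interval stands in for the line.
    L, n = 40.0, 1024
    x = np.linspace(-L / 2, L / 2, n, endpoint=False)
    xi = 2 * np.pi * np.fft.fftfreq(n, d=L / n)   # discrete frequencies

    v = np.exp(-x**2)                             # initial data
    t = 0.5
    u = np.fft.ifft(np.exp(-xi**2 * t) * np.fft.fft(v)).real

    h = L / n                                     # Parseval: norms via grid sums
    print(np.sqrt(h * (v**2).sum()), np.sqrt(h * (u**2).sum()))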

For the purpose of approximate solution of the initial-value problem (1), (2), we replace the derivatives by difference quotients,

    (u(x,t+k) − u(x,t))/k = (u(x+h,t) − 2u(x,t) + u(x−h,t))/h²,

where h, k are small positive numbers which we shall later make tend to zero in such a fashion that λ = k/h² is kept constant. Solving for u(x,t+k), we get

    u(x,t+k) = λu(x+h,t) + (1 − 2λ)u(x,t) + λu(x−h,t).

This suggests that for the exact (generalized) solution to (1), (2),

    E(k)v(x) ≈ E_k v(x) = λv(x+h) + (1 − 2λ)v(x) + λv(x−h),  (13)

or after n steps,

    E(nk)v ≈ E_k^n v.

We shall prove that this is essentially correct for any v ∈ 𝓒 if, but only if, λ ≤ 1/2.

Thus, let us first notice that if λ ≤ 1/2, then the coefficients of E_k are all non-negative and add up to 1, so that (the norm is again the sup-norm)

    ‖E_k v‖ ≤ ‖v‖,

or generally

    ‖E_k^n v‖ ≤ ‖v‖.

The boundedness of the powers of E_k is referred to as stability of E_k.

Assume now that v ∈ 𝓒⁴. We then know that the classical solution of (1), (2) exists, and if u(·,t) = E(t)v = E₀(t)v, then u(·,t) ∈ 𝓒⁴ for t ≥ 0 and

    ‖∂⁴u(·,t)/∂x⁴‖ ≤ ‖d⁴v/dx⁴‖.

We shall prove that, if nk = t and λ ≤ 1/2, then

    ‖E_k^n v − E(t)v‖ ≤ Ctk‖d⁴v/dx⁴‖.

To see this, let us consider E_k u(·,t) − u(·,t+k). Taylor development around (x,t), using the differential equation, shows that

    |E_k u(x,t) − u(x,t+k)| ≤ Ck²‖∂⁴u(·,t)/∂x⁴‖ ≤ Ck²‖d⁴v/dx⁴‖.

Notice now that we can write

    E_k^n v − E(nk)v = Σ_{j=0}^{n-1} E_k^{n-1-j}(E_k − E(k))E(jk)v.

Therefore, by the stability of E_k,

    ‖E_k^n v − E(t)v‖ ≤ Σ_{j=0}^{n-1} ‖(E_k − E(k))E(jk)v‖ ≤ nCk²‖d⁴v/dx⁴‖ = Ctk‖d⁴v/dx⁴‖,

which we wanted to prove.

We shall now prove that for v not necessarily in 𝓒⁴, but only in 𝓒, we still have for nk = t

    ‖E_k^n v − E(t)v‖ → 0 when k → 0.

To see this, let ε > 0 be arbitrary, and choose ṽ ∈ 𝓒⁴ such that

    ‖v − ṽ‖ ≤ ε.

We then have

    ‖E_k^n v − E(t)v‖ ≤ ‖E_k^n(v − ṽ)‖ + ‖E_k^n ṽ − E(t)ṽ‖ + ‖E(t)(ṽ − v)‖ ≤ 2ε + Ctk‖d⁴ṽ/dx⁴‖.

Therefore, choosing k ≤ ε(Ct‖d⁴ṽ/dx⁴‖)^{-1}, we have

    ‖E_k^n v − E(t)v‖ ≤ 3ε,

which concludes the proof.

Consider now the case λ > 1/2. The middle coefficient 1 − 2λ in E_k is then negative. Taking

    v(x) = cos(πx/h),

so that v(x ± h) = −v(x), we get

    E_k v = (1 − 4λ)v,

so that the effect of E_k is multiplication by (1 − 4λ). We generally get

    E_k^n v = (1 − 4λ)^n v.

Since λ > 1/2 we have 1 − 4λ < −1, and it follows that it is not possible to have an inequality of the form

    ‖E_k^n v‖ ≤ C‖v‖, nk ≤ T.

This can also be interpreted to mean that small errors in the initial data are blown up to an extent where they overshadow the real solution. This phenomenon is called instability.
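This blow-up is easy to observe numerically. The sketch below (an editorial addition in Python with NumPy; the grid, data, and step counts are arbitrary choices) applies the operator (13) on a periodic grid with λ = 0.5 and λ = 0.55.

    import numpy as np

    def heat_step(u, lam):
        # One application of (13): E_k v(x) = lam v(x+h) + (1-2 lam) v(x) + lam v(x-h),
        # with periodic boundary conditions standing in for the whole line.
        return lam * np.roll(u, -1) + (1 - 2 * lam) * u + lam * np.roll(u, 1)

    n = 200
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    v = np.where(np.abs(x - 0.5) < 0.1, 1.0, 0.0)   # rough initial data

    for lam in (0.5, 0.55):
        u = v.copy()
        for _ in range(200):
            u = heat_step(u, lam)
        print(f"lambda = {lam}: max |u| after 200 steps = {np.abs(u).max():.3e}")
    # lambda = 0.5 keeps max|u| <= 1 = max|v|; lambda = 0.55 blows up.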

Instead of the simple difference scheme (13) we could study a more general type of operator, e.g.

    E_k v(x) = Σ_j a_j v(x + jh),  (14)

where the sum is finite. If we want this to be "consistent" with the equation (1) we have to demand that E_k approximates E(k), or, if u(x,t) is a smooth solution, that

    E_k u(·,t) − u(·,t+k) = o(k), k → 0.

Taylor series development gives for smooth u, with k = λh²,

    E_k u(x,t) − u(x,t+k) = (Σ_j a_j − 1)u + h(Σ_j j a_j)∂u/∂x + h²(½Σ_j j²a_j − λ)∂²u/∂x² + O(h³),

or the consistency relations

    Σ_j a_j = 1, Σ_j j a_j = 0, Σ_j j² a_j = 2λ.

Assuming these consistency relations to hold and assuming that all the a_j are ≥ 0, we get as above

    ‖E_k^n v‖ ≤ ‖v‖,  (15)

and the convergence analysis above can be carried over to this more general case with few changes.

However, the reason for choosing an operator of the form (14) which is not our old operator (13) would be to obtain higher accuracy in the approximation, and it will then turn out that in general not all the coefficients are non-negative. We cannot have (15) then, but we may still have

    ‖E_k^n v‖ ≤ C‖v‖, nk ≤ T,

for some C depending on T.

    When we work with the L2-norm rather than the maximum norm, Fourier transforms

    are again helpful; indeed in most of the subsequent lectures, Fourier analysis will

    be the foremost tool.

Thus, let v̂ be the Fourier transform of v defined by (10). We then have

    (E_k v)^(ξ) = Σ_j a_j e^{ijhξ} v̂(ξ),

or, introducing the characteristic (trigonometric) polynomial of the operator E_k,

    a(ξ) = Σ_j a_j e^{ijξ},

we find that the effect of E_k on the Fourier transform side is multiplication by a(hξ). One easily finds that, similarly, the effect of E_k^n is multiplication by a(hξ)^n. Using Parseval's relation, one then easily finds (the norm is now the L²-norm)

    ‖E_k^n v‖ ≤ sup_ξ |a(ξ)|^n ‖v‖,

and that this inequality is the best possible. It follows that we have stability if and only if |a(ξ)| ≤ 1 for all real ξ. We then actually have (15) in the L²-norm.

Consider again the special operator (13). We have in this case

    a(ξ) = 1 − 2λ(1 − cos ξ) = 1 − 4λ sin²(ξ/2),

and a(ξ) takes all values in the interval [1 − 4λ, 1]. We therefore find that also in L² we have stability if and only if 1 − 4λ ≥ −1, that is λ ≤ 1/2.

Difference approximations to the initial-value problem (6), (7) can be analysed similarly.
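For the record, the stability criterion just derived is immediate to check numerically; the following sketch (an editorial addition in Python with NumPy) samples a(ξ) = 1 − 4λ sin²(ξ/2) over a period.

    import numpy as np

    # Characteristic polynomial of (13): a(xi) = 1 - 4 lam sin^2(xi/2);
    # L2-stability holds precisely when sup |a(xi)| <= 1.
    xi = np.linspace(-np.pi, np.pi, 10001)
    for lam in (0.25, 0.5, 0.51):
        s = np.abs(1 - 4 * lam * np.sin(xi / 2) ** 2).max()
        print(f"lambda = {lam}: sup |a| = {s:.4f}",
              "(stable)" if s <= 1 + 1e-12 else "(unstable)")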


We shall put the above considerations in a more general setting and discuss an initial-value problem in a Banach space B. Thus let A be a linear operator with domain D(A) ⊆ B and let v ∈ B. Consider then the problem of finding u(t) ∈ B, t ≥ 0, such that

    du/dt = Au(t), t > 0,  (16)

    u(0) = v.  (17)

More precisely, we shall say that u(t), t ≥ 0, is a genuine solution of (16), (17) if (17) holds, u(t) is continuous for t ≥ 0, and

    (i) u(t) ∈ D(A), t > 0,

    (ii) ‖(Δt)^{-1}(u(t + Δt) − u(t)) − Au(t)‖ → 0 as Δt → 0, t > 0.

As before, we say that the problem (16), (17) is correctly posed in B if genuine solutions u(t) = E(t)v exist for a dense set of initial data v, and if the solution operator E(t) so defined is bounded for t in any compact interval; it can then be extended to all of B, defining the generalized solution.

We shall now study the approximation of a solution u(t) = E(t)v of a correctly posed initial-value problem (16), (17). We will then, for small k, k ≤ k₀, consider an approximation E_k of E(k), where E_k is a bounded linear operator with D(E_k) = B which depends continuously on k for 0 < k ≤ k₀. The idea is then that E_k^n v is going to approximate E(nk)v = E(k)^n v.

We say that the operator E_k is consistent with the initial-value problem (16), (17) if there is a set 𝒟 of genuine solutions of (16), (17) such that

    (i) the set of initial values u(0) for u ∈ 𝒟 is dense in B,

    (ii) for u ∈ 𝒟, ‖k^{-1}(E_k u(t) − u(t+k))‖ → 0 as k → 0, uniformly for 0 ≤ t ≤ T,

for any T > 0.

If the operator E_k is consistent with (16), (17), we say that it is convergent (in B) if for any v ∈ B, any t ≥ 0, and any pair of sequences {k_j}, {n_j} with k_j → 0, n_j k_j → t for j → ∞, we have

    ‖E_{k_j}^{n_j} v − E(t)v‖ → 0 when j → ∞.

We say that the operator E_k is stable (in B) if for any T > 0 there is a constant C such that

    ‖E_k^n‖ ≤ C, nk ≤ T, k ≤ k₀.

It turns out that consistency alone does not guarantee convergence; we have the following theorem, which is referred to as Lax's equivalence theorem [22].

Theorem Assume that (16), (17) is correctly posed and that E_k is a consistent approximation operator. Then stability is necessary and sufficient for convergence.

The proof of the sufficiency of stability for convergence is similar to the proof in the particular case treated above; the proof of the necessity depends on the Banach–Steinhaus theorem.

2. Initial-value problems in L² with constant coefficients

We begin with some notation. We shall work here with the Banach space L² = L²(R^d) with the norm

    ‖v‖ = ( ∫_{R^d} |v(x)|² dx )^{1/2},

and v̂ will denote the Fourier transform of v, defined as in Lecture 1 with the obvious modifications for d dimensions. We define for a multi-index α = (α₁,…,α_d), with |α| = Σ_j α_j, the differential operator

    D^α = (∂/∂x₁)^{α₁} ⋯ (∂/∂x_d)^{α_d}.

Single bars will denote absolute values and the corresponding vector and matrix norms, and double bars will indicate norms with respect to L², so that for the N-vector function u(x) ∈ L²,

    ‖u‖ = ( ∫_{R^d} Σ_{j=1}^N |u_j(x)|² dx )^{1/2}.

For later use we need the following

Lemma 1 Let 𝒟 be a dense subset of L² and let a(ξ) be a continuous N×N matrix function of ξ ∈ R^d. Then, for the operator defined by (T_a v)^(ξ) = a(ξ)v̂(ξ),

    sup_{v ∈ 𝒟, v ≠ 0} ‖T_a v‖/‖v‖ = sup_{ξ ∈ R^d} |a(ξ)|.

Let u(x,t) be an N-vector function defined for x ∈ R^d and t ≥ 0. Consider the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α| ≤ M} P_α D^α u, t > 0,  (1)

    u(x,0) = v(x),  (2)

where the P_α are constant N×N matrices and where we can consider P(D)u to be defined for u ∈ 𝒮, the set of rapidly decreasing smooth functions. Let

    P(iξ) = Σ_{|α| ≤ M} P_α (iξ)^α, ξ ∈ R^d.

We have:

Theorem 1 The initial-value problem (1), (2) is correctly posed in L² if and only if, for any T ≥ 0, there is a C such that

    |e^{tP(iξ)}| ≤ C, ξ ∈ R^d, 0 ≤ t ≤ T.  (3)

Proof Assume that (3) holds. Let v ∈ 𝒮 and consider

    u(x,t) = (2π)^{-d/2} ∫_{R^d} e^{ix·ξ} e^{tP(iξ)} v̂(ξ) dξ.  (4)

By differentiation under the integral sign we find that u(x,t) satisfies (1), and so is a solution to (1), (2). Since u(·,t) ∈ 𝒮 for t ≥ 0, it is a genuine solution in the sense of Lecture 1 and is also unique. Thus E₀(t)v = u(·,t) with 𝒟 = 𝒮. By Fourier's inversion formula and Parseval's theorem

    ‖E₀(t)v‖ = ‖e^{tP(i·)} v̂‖ ≤ C‖v̂‖ = C‖v‖.

Since 𝒮 is dense in L², it follows that the initial-value problem is correctly posed.

We now want to prove the necessity of (3) for correctness. Let now v ∈ C₀^∞ and define u(x,t) by (4). We find at once that u(x,t) satisfies the initial-value problem (1), (2), and so u(x,t) = E(t)v. Again, by Fourier's inversion formula and Parseval's theorem,

    ‖E(t)v‖ = ‖e^{tP(i·)} v̂‖,

so that by Lemma 1,

    sup_ξ |e^{tP(iξ)}| ≤ ‖E(t)‖ ≤ C, 0 ≤ t ≤ T,

which proves the necessity of (3), since C₀^∞ is dense in L².

Ex. 1 Consider the symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j, A_j hermitian.  (5)

Then the initial-value problem for (5) is correctly posed in L², for

    e^{tP(iξ)} = exp( it Σ_{j=1}^d ξ_j A_j ),

since this is a unitary matrix.

Before proceeding to the next example we state a lemma. For an arbitrary N×N matrix A with eigenvalues λ_j, j = 1,…,N, we introduce

    Λ(A) = max_j Re λ_j.

We then have

Lemma 2 If A is an N×N matrix, we have for t ≥ 0

    |e^{tA}| ≤ e^{tΛ(A)} Σ_{j=0}^{N-1} (2t|A|)^j / j!.

Proof See [9].

Ex. 2 Consider the system (1) and consider also the principal part P₀ of P, which corresponds to the polynomial

    P₀(ξ) = Σ_{|α| = M} P_α ξ^α.

We say that the system (1) is parabolic in Petrovskii's sense if there is a δ > 0 such that

    Λ(P₀(iξ)) ≤ −δ, |ξ| = 1.

By homogeneity this is equivalent to the existence of a δ > 0 and a C such that

    Λ(P(iξ)) ≤ −δ|ξ|^M + C, ξ ∈ R^d.

We then have that if (1) is parabolic in Petrovskii's sense, the corresponding initial-value problem is correctly posed in L². For by Lemma 2 we have for 0 ≤ t ≤ T

    |e^{tP(iξ)}| ≤ e^{t(−δ|ξ|^M + C)} Σ_{j=0}^{N-1} (2t|P(iξ)|)^j / j!,

which is clearly bounded. In particular, the heat equation

    ∂u/∂t = Σ_{j=1}^d ∂²u/∂x_j²

clearly falls into this category.

Solutions of parabolic systems are smooth for t > 0; we have

Theorem 2 Assume that (1) is parabolic in Petrovskii's sense. Then for t > 0, D^α E(t)v ∈ L² for any α, and for any T > 0 and any α there is a C such that

    ‖D^α E(t)v‖ ≤ C t^{-|α|/M} ‖v‖, 0 < t ≤ T.

Ex. 4 For the Cauchy–Riemann system (d = 1)

    ∂u/∂t = ( 0 −1; 1 0 ) ∂u/∂x, P(iξ) = iξ ( 0 −1; 1 0 ),

the matrix P(iξ) is hermitian with the real eigenvalues ±ξ, and a simple calculation yields

    |e^{tP(iξ)}| = e^{t|ξ|},

which is not bounded for any t > 0 when |ξ| → ∞; this initial-value problem is thus not correctly posed in L².

Ex. 5 Although our theory only deals with systems which are of first order with respect to t, it is actually possible to consider also higher-order systems by reducing them to first-order systems. We shall only exemplify this in one particular case. Consider the initial-value problem (d = 1)

    ∂²w/∂t² = ∂²w/∂x², t > 0.  (7)

Introducing

    u = (u₁, u₂) = (∂w/∂t, ∂w/∂x),  (8)

we have for u the initial-value problem

    ∂u/∂t = ( 0 1; 1 0 ) ∂u/∂x, t > 0, u(x,0) = v(x).  (9)

Here

    e^{tP(iξ)} = exp( itξ ( 0 1; 1 0 ) ) = ( cos tξ   i sin tξ; i sin tξ   cos tξ )

is unitary, so that we have that the initial-value problem (9), obtained by the transformation (8) from (7), is correctly posed in L².

In order that an initial-value problem of the type (1), (2) be correctly posed in L², it is necessary that it be correctly posed in the sense of Petrovskii; more precisely:

Theorem 3 If (1), (2) is correctly posed in L², then there is a constant C such that

    Λ(P(iξ)) ≤ C, ξ ∈ R^d.  (10)

Proof Follows at once by

    e^{tΛ(P(iξ))} = ρ(e^{tP(iξ)}) ≤ |e^{tP(iξ)}| ≤ C′, 0 ≤ t ≤ 1.

We shall see at once by the following example that (10) is not sufficient for correctness in L².

Ex. 6 Take the initial-value problem corresponding to (d = 1)

    P(iξ) = ( 0 1; −ξ² 0 ),

whose eigenvalues ±iξ are purely imaginary, so that (10) holds. We get then

    e^{tP(iξ)} = ( cos tξ   ξ^{-1} sin tξ; −ξ sin tξ   cos tξ ).

However, a simple calculation yields

    |e^{tP(iξ)}| ≥ |ξ| |sin tξ|,

which is easily seen to be unbounded for 0 ≤ t ≤ 1 (take tξ = 1).
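The contrast between the two reductions is easy to see numerically. The following sketch (an editorial addition assuming Python with NumPy and SciPy) evaluates |e^{tP(iξ)}| for the symbols of Ex. 5 and Ex. 6 as given above.

    import numpy as np
    from scipy.linalg import expm

    # Ex. 5: P(i xi) = i xi (0 1; 1 0) has unitary exponential.
    # Ex. 6: P(i xi) = (0 1; -xi^2 0) satisfies (10) (eigenvalues +- i xi),
    #        but |exp(t P(i xi))| grows like |xi| |sin(t xi)|.
    t = 0.5
    for xi in (1.0, 10.0, 100.0):
        good = expm(1j * t * xi * np.array([[0.0, 1.0], [1.0, 0.0]]))
        bad = expm(t * np.array([[0.0, 1.0], [-xi**2, 0.0]]))
        print(f"xi = {xi:6.1f}:  good |exp| = {np.linalg.norm(good, 2):.3f},"
              f"  bad |exp| = {np.linalg.norm(bad, 2):.3f}")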

Necessary and sufficient conditions for correctness have been given by Kreiss [19]. The main contents of Kreiss' result are concentrated in the following lemma. Here, for an N×N matrix A, we denote by Re A the matrix

    Re A = ½(A + A*).

Also recall that for hermitian matrices A and B, A ≤ B means

    (Au,u) ≤ (Bu,u)

for all N-vectors u. We denote the resolvent of A by R(A;z),

    R(A;z) = (zI − A)^{-1}.

It will be implicitly assumed, when we write down R(A;z), that z is not an eigenvalue of A.

Lemma 3 Let 𝓕 be a family of N×N matrices. Then the following four conditions are equivalent:

    (i) There is a constant C such that |e^{tA}| ≤ C for A ∈ 𝓕, t ≥ 0.

    (ii) For A ∈ 𝓕 we have Λ(A) ≤ 0, and there is a constant C such that

        |R(A;z)| ≤ C(Re z)^{-1}, Re z > 0.

    (iii) For A ∈ 𝓕 we have Λ(A) ≤ 0, and there are two constants C₁ and C₂ and for each A ∈ 𝓕 a matrix S = S(A) with

        |S|, |S^{-1}| ≤ C₁,

    and such that SAS^{-1} is an upper triangular matrix whose diagonal elements are the eigenvalues λ_j of A and whose off-diagonal elements b_ij satisfy

        |b_ij| ≤ C₂ min(|Re λ_i|, |Re λ_j|).

    (iv) There is a constant C > 0 such that for each A ∈ 𝓕 there is a hermitian matrix H = H(A) with

        C^{-1}I ≤ H ≤ CI and Re(HA) ≤ 0.

Proof See [19].

Applied to our problem, Kreiss' result takes the following form:

Theorem 4 The initial-value problem (1), (2) is correctly posed in L² if and only if there is a constant γ such that the family

    𝓕 = { P(iξ) − γI : ξ ∈ R^d }  (11)

satisfies the equivalent conditions of Lemma 3. Notice that for hermitian H,

    Re(H(P(iξ) − γI)) = Re(HP(iξ)) − γH.  (12)

One commonly used criterion is:

Theorem 5 Let P(iξ) be a normal matrix for each ξ. Then (1), (2) is correctly posed if and only if (10) holds.

Proof By Theorem 3 we only have to prove the sufficiency. Since P(iξ) is normal we can find a unitary U(ξ) such that

    U(ξ)P(iξ)U(ξ)* = D(ξ)

is diagonal. Hence

    |e^{tP(iξ)}| = |e^{tD(ξ)}| = max_j e^{t Re λ_j(ξ)} ≤ e^{tC},

which proves the result.

For later use we state:

Theorem 6 If (1), (2) is correctly posed in L², then (10) holds and there are positive constants C₁ and C₂ and for each ξ ∈ R^d a positive definite hermitian matrix H(ξ) such that

    C₁^{-1}I ≤ H(ξ) ≤ C₁I

and

    Re(H(ξ)P(iξ)) ≤ C₂H(ξ).  (13)

Proof By Theorem 4 there is a constant γ such that the family 𝓕 in (11) satisfies condition (iv) of Lemma 3 with C = C₁. Thus for each ξ ∈ R^d there is a positive definite hermitian H(ξ) with C₁^{-1}I ≤ H(ξ) ≤ C₁I such that

    Re(H(ξ)(P(iξ) − γI)) ≤ 0.

But by (12) this implies (13).

3. Difference approximations in L² to initial-value problems with constant coefficients

Consider again the initial-value problem

    ∂u/∂t = P(D)u = Σ_{|α| ≤ M} P_α D^α u, t > 0,  (1)

    u(x,0) = v(x).  (2)


For the approximate solution of (1), (2) we consider explicit difference operators of the form

    E_h v(x) = Σ_β a_β(h) v(x + βh),

where h is a small positive parameter, β = (β₁,…,β_d) with β_j integers, the a_β(h) are N×N matrices which are polynomials in h, and the summation is over a finite set of β. We introduce the symbol of the operator E_h,

    Ê_h(ξ) = Σ_β a_β(h) e^{ihβ·ξ},

which is periodic with period 2π/h in ξ_j, and notice that for v ∈ L² the Fourier transform of E_h v is

    (E_h v)^(ξ) = Ê_h(ξ)v̂(ξ).

Assume that the initial-value problem (1), (2) is correctly posed. We then want to choose E_h so that it approximates the solution operator E(k), where k is a positive parameter tied to h by the relation

    k/h^M = λ = constant;

we actually want to approximate u(x,nk) = E(nk)v = E(k)^n v by E_k^n v. In what follows we shall emphasise the dependence on k rather than h and write E_k as in Lecture 1.

To accomplish this, we shall assume that E_k satisfies the condition in the following definition. We say that E_k is consistent with (1) if for any genuine solution u of (1) with initial data in C₀^∞,

    ‖E_k u(·,t) − u(·,t+k)‖ = o(k), k → 0.

If o(k) can be replaced by O(kh^μ), we say that E_k is accurate of order μ. Clearly any consistent scheme is accurate of order at least 1.

We can express consistency and accuracy in terms of the symbol (cf. [35]):

Lemma 1 The operator E_k is consistent with (1) if and only if, for each fixed ξ,

    Ê_k(ξ) − e^{kP(iξ)} = o(k), k → 0.  (3)

The operator E_k is accurate of order μ if and only if, for each fixed ξ,

    Ê_k(ξ) − e^{kP(iξ)} = O(kh^μ), k → 0.

The proof of (3), say, consists in proving, as in the special case in Lecture 1, that consistency is equivalent to a number of algebraic conditions on the coefficients, which turn out to be equivalent to the analytic functions e^{kP(ih^{-1}ξ)} and Ê_k(h^{-1}ξ) having the same coefficients for h^j ξ^α up to a certain order.

Using Lemma 1 it is easy to deduce that if E_k is consistent with (1) in the present sense, then we also have consistency in the sense of Lecture 1. For the set 𝒟 of genuine solutions in the previous definition we can for instance take the ones corresponding to v ∈ 𝒮. From Lax's equivalence theorem it is clear that we want to discuss the stability of operators E_k of the form described. We have

Theorem 1 The operator E_k is stable in L² if and only if for any T > 0 there is a constant C such that

    |Ê_k(ξ)^n| ≤ C, ξ ∈ R^d, nk ≤ T, k ≤ k₀.

Proof We notice that Ê_k(ξ)^n is the symbol of E_k^n. It follows in the same way as in Lecture 2 that

    ‖E_k^n‖ = sup_ξ |Ê_k(ξ)^n|,

which proves the theorem.

We now turn to the algebraic characterization of stability. We first prove the necessity of the von Neumann condition. For any N×N matrix A we denote by ρ(A) its spectral radius, the maximum of the moduli of the eigenvalues of A.

Theorem 2 If E_k is stable in L², there exists a constant γ such that

    ρ(Ê_k(ξ)) ≤ 1 + γk, ξ ∈ R^d, k ≤ k₀.  (4)

Proof We have for nk ≤ 1,

    ρ(Ê_k(ξ))^n = ρ(Ê_k(ξ)^n) ≤ |Ê_k(ξ)^n| ≤ C.

Choosing n so that 1/2 ≤ nk ≤ 1, we get ρ(Ê_k(ξ)) ≤ C^{1/n} ≤ C^{2k} ≤ 1 + γk with a suitable γ.

It is easy to prove by counter-examples that (4) is not sufficient for stability. Necessary and sufficient conditions for stability have been given by Kreiss [18] and Buchanan [5]; we quote here Kreiss' result. The main content of Kreiss' theorem is concentrated in the following lemma. Here we have introduced the following notation: for H hermitian and positive definite, we introduce

    |u|_H = (Hu,u)^{1/2}, |A|_H = sup_{u ≠ 0} |Au|_H / |u|_H.

Recall again that for hermitian matrices, A ≤ B means (Au,u) ≤ (Bu,u).

Lemma 2 Let 𝓕 be a family of N×N matrices. Then the following four conditions are equivalent:

    (i) There is a constant C such that |A^n| ≤ C for A ∈ 𝓕, n = 1, 2, ….

    (ii) For A ∈ 𝓕 we have ρ(A) ≤ 1, and there is a constant C such that

        |R(A;z)| ≤ C(|z| − 1)^{-1}, |z| > 1.

    (iii) For A ∈ 𝓕 we have ρ(A) ≤ 1, and there are two constants C₁ and C₂ and for each A ∈ 𝓕 a matrix S = S(A) with

        |S|, |S^{-1}| ≤ C₁,

    and such that SAS^{-1} is an upper triangular matrix whose diagonal elements are the eigenvalues κ_j of A and whose off-diagonal elements b_ij satisfy

        |b_ij| ≤ C₂ min(1 − |κ_i|, 1 − |κ_j|).

    (iv) There is a constant C > 0 such that for each A ∈ 𝓕 there is a hermitian matrix H = H(A) with

        C^{-1}I ≤ H ≤ CI and |A|_H ≤ 1.

Proof See [28].

To be able to apply this lemma to our problem we need the following analogue of Lemma 2.4.

Lemma 3 Assume that E_k is stable in L². Then there exists a constant γ such that for

    F_k(ξ) = e^{-γk} Ê_k(ξ), ξ ∈ R^d, k ≤ k₀,

one has

    |F_k(ξ)^n| ≤ C, n = 1, 2, ….

An alternative way of expressing this result is that for some γ, all k ≤ k₀, and any n we have

    |Ê_k(ξ)^n| ≤ C e^{γnk}.

Combining Lemmas 2 and 3 we have at once:

Theorem 3 If the operator E_k is stable in L², then there is a γ such that the family

    𝓕 = { e^{-γk} Ê_k(ξ) : ξ ∈ R^d, k ≤ k₀ }

satisfies the conditions of Lemma 2. On the other hand, if there is a constant γ such that this family satisfies at least one of the conditions of Lemma 2, then E_k is stable in L².

One commonly used criterion is:

Theorem 4 Let E_k be such that Ê_k(ξ) is a normal matrix. Then von Neumann's condition is necessary and sufficient for stability.

Proof By Theorem 2 we only have to prove the sufficiency. Since Ê_k(ξ) is normal, there is for each k ≤ k₀ and ξ ∈ R^d a unitary matrix U_k(ξ) such that

    U_k(ξ) Ê_k(ξ) U_k(ξ)* = D_k(ξ)

is diagonal. Hence

    |Ê_k(ξ)^n| = |D_k(ξ)^n| = ρ(Ê_k(ξ))^n ≤ (1 + γk)^n ≤ e^{γT}, nk ≤ T,

which proves the result. To see the relation with Lemmas 2 and 3, we could also have formulated this as follows. We have, with the same γ as in (4), for F_k(ξ) = e^{-γk} Ê_k(ξ), that

    U_k(ξ) F_k(ξ) U_k(ξ)* = e^{-γk} D_k(ξ),

which is diagonal with eigenvalues of modulus ≤ 1. Thus, a fortiori, it is triangular, and the estimates in condition (iii) of Lemma 2 hold.
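In the normal case the whole stability analysis thus reduces to a scalar check of (4). The sketch below (an editorial addition in Python with NumPy; the function name and the sample operator are illustrative only) estimates sup_ξ ρ(Ê_k(ξ)) by sampling one period of the symbol.

    import numpy as np

    def spectral_radius_sup(symbol, nxi=2001):
        # Sample rho(E_k(xi)) over one period (d = 1) for the von Neumann check (4).
        xis = np.linspace(-np.pi, np.pi, nxi)
        return max(np.abs(np.linalg.eigvals(symbol(xi))).max() for xi in xis)

    # Sample symbol of Lax-Wendroff type (cf. (10) below), A = diag(1, -2), lambda = 0.4.
    A = np.diag([1.0, -2.0])
    lam = 0.4
    symbol = lambda xi: (np.eye(2) + 1j * lam * np.sin(xi) * A
                         + lam**2 * (np.cos(xi) - 1.0) * (A @ A))
    rho = spectral_radius_sup(symbol)
    print(f"sup rho = {rho:.6f} -> von Neumann condition {'holds' if rho <= 1 else 'fails'}")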

As for the existence of stable operators, we have (cf. [17]):

Theorem 5 There exist L²-stable operators consistent with (1), (2) if and only if (1), (2) is correctly posed in L².

Proof We first prove that the correctness is necessary. It follows by Lemma 1 and the stability that

    |e^{tP(iξ)}| = lim_{k→0, nk→t} |Ê_k(ξ)^n| ≤ C, 0 ≤ t ≤ T,

which implies correctness.

On the other hand, if (1), (2) is correctly posed, one can construct a consistent stable difference operator, or, what is equivalent, its symbol, by setting

    Ê_k(ξ) = Q_k(ξ) + R_k(ξ),  (5)

where Q_k(ξ) is a trigonometric polynomial approximation of e^{kP(iξ)} and R_k(ξ) is a suitable dissipative term. Using Kreiss' stability theorems one can prove that this E_k is stable for small λ = k/h^M. The part of this operator corresponding to the second term in (5) is referred to as an artificial viscosity.

We shall consider some examples. Consider the initial-value problem for a symmetric hyperbolic system

    ∂u/∂t = Σ_{j=1}^d A_j ∂u/∂x_j, t > 0, A_j hermitian, u(x,0) = v(x).  (6)

We know from Lecture 2 that this problem is correctly posed in L². Consider as before a difference operator

    E_k v(x) = Σ_β a_β v(x + βh), k = λh,  (7)

where for simplicity we assume the a_β independent of h. We have the following result by Friedrichs [8].

Theorem 6 If the a_β are hermitian and positive semi-definite and Σ_β a_β = I, then

    |Ê_k(ξ)| ≤ 1,

and thus E_k is stable.

Proof We have the generalized Cauchy–Schwarz inequality

    |(Σ_β a_β u_β, v)| ≤ (Σ_β (a_β u_β, u_β))^{1/2} (Σ_β (a_β v, v))^{1/2},

where (u,v) = Σ_j u_j v̄_j. Therefore, with w = Ê_k(ξ)v,

    |w|² = |(Σ_β a_β e^{ihβ·ξ} v, w)| ≤ (Σ_β (a_β v, v))^{1/2} (Σ_β (a_β w, w))^{1/2} = |v| |w|,

since Σ_β a_β = I, and hence |Ê_k(ξ)v| ≤ |v|, which proves the theorem.

As an application, take (with e_j the unit vector in the x_j-direction)

    E_k v(x) = (2d)^{-1} Σ_{j=1}^d [v(x + he_j) + v(x − he_j)] + (λ/2) Σ_{j=1}^d A_j [v(x + he_j) − v(x − he_j)],

which is consistent with (6) and accurate of order 1. It is clear that if

    0 < λ ≤ ( d max_j |A_j| )^{-1},

the coefficients

    a_{±e_j} = (2d)^{-1} I ± (λ/2) A_j

are positive semi-definite, and so the operator E_k is stable.
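The following sketch (an editorial addition in Python with NumPy; the particular matrix A and grid are arbitrary) checks the positive semi-definiteness of the coefficients of Friedrichs' scheme for d = 1 and verifies that the discrete L²-norm does not grow.

    import numpy as np

    # Friedrichs' scheme for u_t = A u_x in d = 1: coefficients
    # a_{+1} = I/2 + (lam/2) A and a_{-1} = I/2 - (lam/2) A, hermitian and
    # positive semi-definite precisely when lam |A| <= 1.
    A = np.array([[0.0, 1.0], [1.0, 0.0]])      # hermitian, eigenvalues +-1
    lam = 0.8                                   # lam |A| = 0.8 <= 1

    a_p = 0.5 * np.eye(2) + 0.5 * lam * A
    a_m = 0.5 * np.eye(2) - 0.5 * lam * A
    print(np.linalg.eigvalsh(a_p).min() >= 0, np.linalg.eigvalsh(a_m).min() >= 0)

    def step(u):
        # One Friedrichs step on a periodic grid; u has shape (2, n).
        up, um = np.roll(u, -1, axis=1), np.roll(u, 1, axis=1)
        return 0.5 * (up + um) + 0.5 * lam * (A @ (up - um))

    n = 400
    u = np.zeros((2, n))
    u[0] = np.exp(-100.0 * (np.linspace(0, 1, n, endpoint=False) - 0.5) ** 2)
    norm0 = np.sqrt((u**2).sum() / n)
    for _ in range(400):
        u = step(u)
    print(np.sqrt((u**2).sum() / n) <= norm0 + 1e-12)   # L2-norm did not grow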

The operator E_k can be considered as obtained by replacing (6) by

    (u(x,t+k) − (2d)^{-1} Σ_j [u(x+he_j,t) + u(x−he_j,t)]) / k = Σ_j A_j (u(x+he_j,t) − u(x−he_j,t)) / (2h).

Consider for a moment the perhaps more natural equation

    (u(x,t+k) − u(x,t)) / k = Σ_j A_j (u(x+he_j,t) − u(x−he_j,t)) / (2h),

which gives the consistent operator

    E_k v(x) = v(x) + (λ/2) Σ_{j=1}^d A_j [v(x + he_j) − v(x − he_j)],  (8)

with symbol

    Ê_k(h^{-1}ξ) = I + iλ Σ_{j=1}^d A_j sin ξ_j.

We shall prove that this operator is not stable in L² if any of the A_j is non-zero. Assume e.g. A₁ ≠ 0 and set ξ_j = 0 for j ≠ 1, ξ₁h = π/2. With this choice,

    Ê_k(ξ) = I + iλA₁,

which has the eigenvalues 1 + iλμ_ν with

    |1 + iλμ_ν| = (1 + λ²μ_ν²)^{1/2},

where the real numbers μ_ν are the eigenvalues of A₁. Since some μ_ν ≠ 0, the spectral radius of Ê_k(ξ) exceeds 1 by a fixed amount independent of k. Thus the von Neumann condition is not satisfied and the operator is unstable for any λ.
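Numerically this is visible already in the scalar case; the lines below (an editorial addition in Python) evaluate the modulus of the eigenvalue 1 + iλμ of the symbol of (8) at hξ₁ = π/2.

    # For (8), at h xi_1 = pi/2 the symbol has eigenvalues 1 + i lam mu with
    # |1 + i lam mu| = (1 + lam^2 mu^2)^(1/2) > 1 whenever mu != 0.
    mu = 1.0
    for lam in (0.1, 0.5, 1.0):
        print(lam, abs(1 + 1j * lam * mu))   # > 1 for every lam: (4) fails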

It can be shown that in general the operator E_k defined in (8) is accurate of order exactly 1. We shall now look at an operator which is accurate of order 2 in the case of one space dimension (d = 1). We write A = A₁ and thus have the system

    ∂u/∂t = A ∂u/∂x.  (9)

Consider the difference operator

    E_k v(x) = v(x) + (λ/2) A [v(x+h) − v(x−h)] + (λ²/2) A² [v(x+h) − 2v(x) + v(x−h)],  (10)

with symbol

    Ê_k(h^{-1}ξ) = I + iλA sin ξ + λ²A² (cos ξ − 1).

This operator is often referred to as the Lax–Wendroff operator. We have

    Ê_k(ξ) − e^{kP(iξ)} = O(kh²), k → 0,

and so E_k is consistent with (9), and in general accurate of order 2. We shall prove:

Theorem 7 Let μ_j, j = 1,…,N, be the eigenvalues of A. Then the operator E_k in (10) is stable in L² if and only if

    λ max_j |μ_j| ≤ 1.  (11)

Proof It is easy to see that the eigenvalues of Ê_k(h^{-1}ξ) are

    a_j(ξ) = 1 + iλμ_j sin ξ + λ²μ_j² (cos ξ − 1),

and we obtain after a simple calculation

    |a_j(ξ)|² = 1 − 4λ²μ_j² (1 − λ²μ_j²) sin⁴(ξ/2),

which is ≤ 1 for all ξ and j if and only if (11) holds. Since Ê_k is clearly normal, this proves the theorem.
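The "simple calculation" in the proof can be spot-checked as follows (an editorial addition in Python with NumPy; the random sampling is merely a convenience).

    import numpy as np

    # Check |a_j|^2 = 1 - 4 (lam mu)^2 (1 - (lam mu)^2) sin^4(xi/2) for the
    # Lax-Wendroff eigenvalues a_j = 1 + i lam mu sin(xi) + (lam mu)^2 (cos(xi) - 1).
    rng = np.random.default_rng(0)
    for _ in range(1000):
        lm, xi = rng.uniform(0.0, 1.5), rng.uniform(-np.pi, np.pi)
        a = 1 + 1j * lm * np.sin(xi) + lm**2 * (np.cos(xi) - 1.0)
        rhs = 1 - 4 * lm**2 * (1 - lm**2) * np.sin(xi / 2) ** 4
        assert abs(abs(a) ** 2 - rhs) < 1e-12
    print("identity verified; |a_j| <= 1 for all xi exactly when lam |mu_j| <= 1")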

For an N×N matrix A consider the numerical range

    W(A) = { (Au,u) : |u| = 1 }.

We have:

Theorem 8 If 𝓕 is a family of N×N matrices such that

    W(A) ⊆ { z : |z| ≤ 1 }, A ∈ 𝓕,

then 𝓕 is a stable family; that is, there is a constant C such that

    |A^n| ≤ C, A ∈ 𝓕, n = 1, 2, ….

Proof We shall prove that condition (ii) in Kreiss' theorem is satisfied. Clearly we have ρ(A) ≤ 1, so that R(A;z) exists for |z| > 1. Since |(Av,v)| ≤ |v|², we have with v = R(A;z)w

    (|z| − 1)|v|² ≤ |z||v|² − |(Av,v)| ≤ |((zI − A)v, v)| = |(w,v)| ≤ |w||v|,

or

    |v| ≤ (|z| − 1)^{-1} |w|.

Therefore, if w is arbitrary,

    |R(A;z)w| ≤ (|z| − 1)^{-1} |w|,

which proves the result.

Remark One can actually prove that |A^n| ≤ 2 for A ∈ 𝓕, n = 1, 2, ….

This result can be used to prove the stability of certain generalizations of the Lax–Wendroff operator to two dimensions (see [24]).
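Theorem 8 and the Remark are also easy to observe experimentally. The sketch below (an editorial addition in Python with NumPy; the scaling trick and the random test matrix are illustrative) computes the numerical radius as the support function of W(A) and then the norms of the powers.

    import numpy as np

    def numerical_radius(A, m=720):
        # r(A) = max over theta of the largest eigenvalue of Re(e^{-i theta} A),
        # i.e. the support function of the numerical range W(A).
        r = 0.0
        for th in np.linspace(0.0, 2 * np.pi, m, endpoint=False):
            B = np.exp(-1j * th) * A
            r = max(r, np.linalg.eigvalsh(0.5 * (B + B.conj().T))[-1])
        return r

    rng = np.random.default_rng(1)
    B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    A = B / (1.001 * numerical_radius(B))        # now W(A) lies in |z| <= 1
    pows = [np.linalg.norm(np.linalg.matrix_power(A, n), 2) for n in range(1, 100)]
    print(f"max_n |A^n| = {max(pows):.3f}  (bounded, and indeed <= 2)")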

Consider again the symmetric hyperbolic system (6) and a difference operator of the form (7), consistent with (6). Then A(ξ) = Ê_k(h^{-1}ξ) is independent of h. We say with Kreiss that E_k is dissipative of order δ (δ even) if there is a c > 0 such that the eigenvalues a_j(ξ) of A(ξ) satisfy

    |a_j(ξ)| ≤ 1 − c|ξ|^δ, |ξ_j| ≤ π.

We shall prove

Theorem 9 Under the above assumptions, if E_k is accurate of order δ − 1 and dissipative of order δ, it is stable in L².

Proof By the definition of accuracy we have

    A(ξ) = e^{λP(iξ)} + O(|ξ|^δ), ξ → 0,

where e^{λP(iξ)} = exp(iλ Σ_j ξ_j A_j) is unitary. Let U = U(ξ) be a unitary matrix which triangulates A(ξ), so that

    U A(ξ) U* = B(ξ)

is upper triangular with the eigenvalues a_j(ξ) on the diagonal. Since B(ξ) is upper triangular, it follows that the below-diagonal elements of U e^{λP(iξ)} U* are O(|ξ|^δ). Since this matrix is unitary, the same can easily be proved to hold for its above-diagonal elements, and thus the same holds for the above-diagonal elements of B(ξ), so that

    B(ξ) = diag(a₁(ξ),…,a_N(ξ)) + O(|ξ|^δ).

Since by dissipativity 1 − |a_j(ξ)| ≥ c|ξ|^δ, the off-diagonal elements of B(ξ) are bounded by a constant times min_j (1 − |a_j(ξ)|), and the stability follows by condition (iii) in Kreiss' theorem (Lemma 2).

Consider now the initial-value problem for a Petrovskii parabolic system,

    ∂u/∂t = P(D)u, t > 0, u(x,0) = v(x),  (12)

so that for some δ₀ > 0 and C,

    Λ(P(iξ)) ≤ −δ₀|ξ|^M + C.

We know from Lecture 2 that this problem is correctly posed in L². Consider a difference operator E_k of the form considered above, with k = λh^M. We say, following John [15] and Widlund [38], that E_k is a parabolic difference operator if there are constants δ > 0 and C such that the eigenvalues a_j(ξ) of Ê_k(h^{-1}ξ) satisfy

    |a_j(ξ)| ≤ 1 − δ|ξ|^M + Ck, |ξ_j| ≤ π.  (13)

Notice the close analogy with the concept of a dissipative operator.

Theorem 10 Let E_k be consistent with (12) and parabolic. Then it is stable in L².

We shall base a proof on the following lemma, which we shall also need later for other purposes.

Lemma 4 There exists a constant C_N depending only on N such that for any N×N matrix A with spectral radius ρ(A) we have for n ≥ N

    |A^n| ≤ C_N ρ(A)^{n-N+1} |A|^{N-1}.

Proof of Theorem 10 The coefficients of E_k are bounded for k ≤ k₀, so that |Ê_k(h^{-1}ξ)| ≤ C₀, and by (13) and Lemma 4, for nk ≤ T and n ≥ N,

    |Ê_k(h^{-1}ξ)^n| ≤ C_N (1 − δ|ξ|^M + Ck)^{n-N+1} C₀^{N-1} ≤ C_N C₀^{N-1} e^{CT},

so that E_k is stable by Theorem 1.

For parabolic difference operators we also have the following analogue of Theorem 2.2, in which ∂_h^α denotes the forward difference quotient corresponding to D^α:

Theorem 11 Assume that E_k is parabolic. Then for any α and any T > 0 there is a C such that

    ‖∂_h^α E_k^n v‖ ≤ C (nk)^{-|α|/M} ‖v‖, 0 < nk ≤ T.

Proof By Fourier transformation this reduces to proving

    | Π_{j=1}^d ((e^{ihξ_j} − 1)/h)^{α_j} Ê_k(ξ)^n | ≤ C (nk)^{-|α|/M},

and the result therefore easily follows by (13).
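For the simplest example, the operator (13) of Lecture 1 (M = 2, N = 1), the parabolicity condition can be checked directly; the sketch below (an editorial addition in Python with NumPy) computes the best constant δ in |a(ξ)| ≤ 1 − δξ² on |ξ| ≤ π.

    import numpy as np

    # a(xi) = 1 - 4 lam sin^2(xi/2); parabolicity asks |a(xi)| <= 1 - delta xi^2.
    xi = np.linspace(1e-6, np.pi, 100001)
    for lam in (0.25, 0.4, 0.5):
        a = 1 - 4 * lam * np.sin(xi / 2) ** 2
        delta = ((1 - np.abs(a)) / xi**2).min()
        print(f"lambda = {lam}: delta = {delta:.4f}")
    # delta > 0 for lam < 1/2; for lam = 1/2, a(pi) = -1 and delta = 0, so the
    # scheme is then stable but no longer parabolic (no smoothing).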

    We know by Lax's equivalence theorem that the stability of the parabolic

    difference operators considered above implies convergence. We shall now see

    that the difference quotients also converge to the corresponding derivatives,

    which we know to exist for t > 0 since the systems are parabolic.

Theorem 12 Assume that (12) is parabolic and that E_k is consistent with (12) and parabolic. Then for any t > 0, any α, and any v ∈ L² we have for nk = t

    ‖∂_h^α E_k^n v − D^α E(t)v‖ → 0 as k → 0.  (14)

Proof By Theorems 2.2 and 11 one finds that it is sufficient to prove (14) for v in the dense subset C₀^∞. But then, by Parseval's relation,

    ‖∂_h^α E_k^n v − D^α E(t)v‖ = ‖ Π_j ((e^{ihξ_j} − 1)/h)^{α_j} Ê_k(ξ)^n v̂ − (iξ)^α e^{tP(iξ)} v̂ ‖.

The result therefore follows by the following lemma, which is a simple consequence